[phenixbb] Dummy atoms

Wed Jul 28 16:28:02 PDT 2010

I was thinking of the case (which we have had) where we could place the peptide plausibly (e.g. in a helix) but not identify the side chain. Maybe there should be different UNK-like names: one each for an unknown amino acid, an unknown nucleotide, and an unknown thing.

Phil

On 28 Jul 2010, at 23:34, Pavel Afonine wrote:

> I agree with Phil about UNK: it does seem better to label an unknown (undefined) residue as it appears in the map than to call something ALA that is in fact TYR, and then later get confused by the mismatch between the actual sequence and the one derived from the PDB file. This is what confuses me all the time when I look at the results of model-building programs, because the first thing I always do is compare the actual sequence with the one derived from the PDB file, just to validate the result of model building.
> 
> However, I also agree with Tom about losing identity in cases where we really do know what to expect: polypeptide or RNA/DNA.
> 
> Hm... interesting situation -:)
> 
> I guess UNK may still be better, but ONLY IF you go one level deeper and look at the atom names (and make sure you do that consistently). Say you name a "residue" UNK and name the corresponding atoms within it CA, N, C, O (a peptide-like pattern): then you have a chance to guess what it is. Of course, the question then is how you know where to place those CA, N, C and O...
> 
> Pavel.
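
Pavel's idea of going "one level deeper" and reading the atom names inside an UNK residue can be sketched roughly as below. This is only an illustration, not how any PHENIX tool does it: the function name `guess_polymer_type` and the two atom-name sets are assumptions chosen for the example (peptide backbone N, CA, C, O; nucleotide sugar ring atoms), and the check works only if the model-building program names the atoms consistently.

```python
# Hypothetical helper (not part of PHENIX): guess what an UNK residue is
# from the atom names it contains.

# Assumed signatures: peptide backbone atoms and nucleotide sugar ring atoms.
PEPTIDE_BACKBONE = {"N", "CA", "C", "O"}
NUCLEOTIDE_SUGAR = {"C1'", "C2'", "C3'", "C4'", "O4'"}


def guess_polymer_type(atom_names):
    """Return 'peptide', 'nucleotide' or 'unknown' for one residue."""
    # Strip padding and accept the older '*' spelling of primed atoms (C5*).
    names = {name.strip().upper().replace("*", "'") for name in atom_names}
    if PEPTIDE_BACKBONE <= names:
        return "peptide"
    if NUCLEOTIDE_SUGAR <= names:
        return "nucleotide"
    return "unknown"
```

A lone dummy atom named O falls into the "unknown" bin, which matches the "unknown thing" case Phil describes.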
> 
> 
> 
> On 7/28/10 3:19 PM, Tom Terwilliger wrote:
>> One disadvantage of using UNK is that it is often a loss of information. For example, in the case Phil mentions, we do think that we have a polypeptide. By labelling protein residues UNK we no longer distinguish them from DNA or, depending on HETATM vs ATOM identification, from ligands.
>> -Tom T
>> 
>> On Jul 28, 2010, at 4:01 PM, Phil Evans wrote:
>> 
>>> UNK residues have another valid use: where you can see peptide but cannot assign a sequence register. A poly-Ala model in that case is better labelled UNK than ALA, since it isn't ALA.
>>> 
>>> Phil 
>>> 
>>> 
>>> On 28 Jul 2010, at 19:12, Pavel Afonine wrote:
>>> 
>>>> Dear Ed,
>>>> 
>>>>> I think it is very important to be able to include unknown atoms
>>>>> in a deposited pdb file (with echoing the caveat about flooding
>>>>> the structure with UNK's to lower the R-factor).
>>>> 
>>>> Yes, as I wrote in my original reply, including these atoms may improve the map, which in turn may reveal or improve other biologically important regions of it. The only point is: please define these dummy atoms properly, providing all the information, such as the scattering element type that you or your program used for the approximation.
>>>> 
>>>>> For one thing, these structures are produced not just for structure-factor
>>>>> calculation and validation. Many of the end users will never even
>>>>> bother to do a structure factor calculation.
>>>> 
>>>> The ability to reproduce the R-factor is not just for someone's pleasure; it is needed for validation at the very least. If I have a PDB file for which I can't compute the R-factors (or, by the way, even the map), then I don't need the deposited Fobs either, unless I'm going to re-determine the structure from scratch.
>>>> 
>>>>> It is important for the
>>>>> depositor to be able to refer to an unknown but likely significant
>>>>> ligand and for the reader to be able to go and look at that position
>>>>> (ideally surrounded by electron density).
>>>> 
>>>> Sure, it is important.
>>>> 
>>>>> For another thing, the structure factor calculation will give exactly
>>>>> the same result whether the dummy atoms are omitted or are flagged
>>>>> with zero occupancy or atom-type X to be ignored in sf calculation.
>>>> 
>>>> If you look in the PDB you will find that very often the occupancies are not set to 1. Plus, as I mentioned, the B-factors for these atoms are often set to some funny numbers (it looks like they were refined).
>>>> Are we sure that these programs were ignoring these dummies in the Fcalc calculations? If so, how were the B-factors refined, or were they made up?
>>>> 
>>>> Again, if it is defined properly, for example, like this:
>>>> 
>>>> ATOM   1959  O   DUM A   1      -8.762   8.060  25.324  1.00 31.23           O
>>>> 
>>>> or
>>>> 
>>>> ATOM   1959  O   UNK A   1      -8.762   8.060  25.324  1.00 31.23           O
>>>> 
>>>> then it is absolutely OK to have such entries, because the atom is completely defined and can be used in any calculation without unnecessary guesswork. But if you start masking things with X or blanks, then I (and the software I write) will start asking all these nasty questions...
>>>> 
>>>> All the best!
>>>> Pavel.
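
As a rough illustration of what "completely defined" could mean in practice, the sketch below reads one ATOM/HETATM record by the fixed columns of the PDB format and reports what is missing or masked. The function names `parse_atom_record` and `definition_problems` are hypothetical, and the rules applied (element present and not X, occupancy and B-factor parseable) are an assumption based on the message above, not the checks any particular program performs. It also assumes the record is still column-aligned, which a mail archive does not guarantee.

```python
# Hypothetical sketch: read one fixed-column PDB ATOM/HETATM record and list
# what is missing for it to count as a fully defined scatterer.

def _to_float(field):
    """Return the float value of a column slice, or None if blank/invalid."""
    try:
        return float(field)
    except ValueError:
        return None


def parse_atom_record(line):
    """Slice the standard PDB columns (1-based columns in the comments)."""
    return {
        "record": line[0:6].strip(),        # columns 1-6
        "name": line[12:16].strip(),        # columns 13-16
        "res_name": line[17:20].strip(),    # columns 18-20
        "chain": line[21:22].strip(),       # column 22
        "occupancy": _to_float(line[54:60]),  # columns 55-60
        "b_factor": _to_float(line[60:66]),   # columns 61-66
        "element": line[76:78].strip().upper(),  # columns 77-78
    }


def definition_problems(atom):
    """Return human-readable reasons why the atom is under-defined."""
    problems = []
    if atom["element"] in ("", "X"):
        problems.append("element type missing or masked")
    if atom["occupancy"] is None:
        problems.append("occupancy missing")
    if atom["b_factor"] is None:
        problems.append("B-factor missing")
    return problems
```

For a record like the UNK example above, with element O, occupancy 1.00 and B-factor 31.23 in their proper columns, `definition_problems` should return an empty list; replacing the element with X or leaving it blank is what triggers the "nasty questions".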
>>>> 
>>>> _______________________________________________
>>>> phenixbb mailing list
>>>> phenixbb at phenix-online.org
>>>> http://phenix-online.org/mailman/listinfo/phenixbb
>>> 
>> 
>> 
>> Thomas C. Terwilliger
>> Mail Stop M888
>> Los Alamos National Laboratory
>> Los Alamos, NM 87545
>> 
>> Tel:  505-667-0072                 email: terwilliger at LANL.gov
>> Fax: 505-665-3024                 SOLVE web site: http://solve.lanl.gov
>> PHENIX web site: http://www.phenix-online.org
>> ISFI Integrated Center for Structure and Function Innovation web site: http://techcenter.mbi.ucla.edu
>> TB Structural Genomics Consortium web site: http://www.doe-mbi.ucla.edu/TB
>> CBSS Center for Bio-Security Science web site: http://www.lanl.gov/cbss
> 