[phenixbb] CaBLAM fixing and interpretation in comprehensive validation for cryo-EM

Fri Jun 28 15:20:43 PDT 2019

Hi Christopher,

> [I forgot to copy my reply to the bulletin board, so here it is, 
> reproduced for the record.]
>
> The official publication for CaBLAM is the 2017 Molprobity paper in 
> Protein Science, here: https://doi.org/10.1002/pro.3330 Further 
> technical documentation on CaBLAM can be found in Phenix's 
> Computational Crystallography Newsletter, here: 
> https://www.phenix-online.org/newsletter/CCN_2018_07.pdf#page=7 I 
> recommend the newsletter article as a fast read.
>
> To identify outliers, CaBLAM looks at a structure's CA trace, which is 
> generally well-modeled.  For each residue, it compares the local 
> peptide plane orientations of the model to the observed distribution 
> of peptide plane orientations for high quality residues with matching 
> CA trace geometry.  The CaBLAM score is a percentile score that rates 
> how well the model matches with the expected distribution.  The lower 
> the score, the rarer the observed conformation is in our database of 
> quality structures.  A conformation falling in the bottom 5% of 
> observed behavior is potentially suspicious ("Disfavored") and a 
> conformation falling in the bottom 1% is considered an outlier.
>
> This percentile-based scoring is fundamentally the same scoring used 
> in MolProbity's description of Ramachandran and rotamer outliers, 
> though of course CaBLAM puts its cutoffs in different places.
>
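For readers who want the cutoffs spelled out, here is a minimal Python
sketch of the percentile thresholds described above. It is only an
illustration, not CaBLAM's actual code: the function name, the "Allowed"
label, the representation of the score as a fraction between 0 and 1,
and the use of strict "less than" at each cutoff are all assumptions.

```python
def classify_cablam(percentile, disfavored_cutoff=0.05, outlier_cutoff=0.01):
    """Label a residue from its percentile score (0.0 to 1.0).

    Hypothetical helper: the 5% and 1% cutoffs come from the description
    above; the label "Allowed" and the strict comparisons are assumptions.
    """
    if not 0.0 <= percentile <= 1.0:
        raise ValueError("percentile must be between 0.0 and 1.0")
    if percentile < outlier_cutoff:
        return "Outlier"      # bottom 1% of observed behavior
    if percentile < disfavored_cutoff:
        return "Disfavored"   # bottom 5%, potentially suspicious
    return "Allowed"


if __name__ == "__main__":
    for score in (0.005, 0.03, 0.5):
        print(score, classify_cablam(score))
```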
> As a matter of interpretation, loop/coil regions tend to be highly 
> varied.  CaBLAM "Disfavored" conformations in loops can largely be 
> ignored.  However, disfavored conformations in regions expected to be 
> highly regular (repeating secondary structure) should be taken 
> seriously.  CaBLAM outliers should be inspected wherever they occur.
>
> The CA geometry score looks at just the CA trace, and takes the CA 
> virtual angle into account (defined by CAi-1, CAi, CAi+1).  Outliers 
> in this space reflect some sort of severe problem with CA geometry, 
> often involving an over-extended or over-compressed CA virtual angle.
>
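The CA virtual angle mentioned above is simply the angle at CA(i)
between the directions to CA(i-1) and CA(i+1). A minimal Python sketch
of that geometry follows. The function name and the plain (x, y, z)
tuple inputs are assumptions for illustration; this is not how CaBLAM
itself is implemented, and no particular cutoff for "over-extended" or
"over-compressed" is implied.

```python
import math


def ca_virtual_angle(ca_prev, ca_curr, ca_next):
    """Return the CA(i-1), CA(i), CA(i+1) virtual angle in degrees.

    Hypothetical helper: each argument is an (x, y, z) coordinate tuple.
    """
    # Vectors pointing from the central CA to each neighbor.
    v1 = [a - b for a, b in zip(ca_prev, ca_curr)]
    v2 = [a - b for a, b in zip(ca_next, ca_curr)]
    dot = sum(x * y for x, y in zip(v1, v2))
    n1 = math.sqrt(sum(x * x for x in v1))
    n2 = math.sqrt(sum(x * x for x in v2))
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("neighboring CA atoms must not coincide")
    # Clamp to [-1, 1] to guard against floating-point round-off.
    cos_angle = max(-1.0, min(1.0, dot / (n1 * n2)))
    return math.degrees(math.acos(cos_angle))


if __name__ == "__main__":
    print(ca_virtual_angle((1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```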
> The secondary structure scores are based on how well a residue's local 
> CA trace matches the expected CA trace of each major secondary 
> structure type, alpha, 3-10, and beta.  You can see the contours used 
> for this assessment in Figure 3 of the newsletter article.  Each 
> residue receives individual secondary structure scores.  Then regions 
> of residues that all pass a scoring threshold are assembled into 
> probable secondary structure elements.  This is where the "try beta 
> sheet" recommendations come from.  That recommendation indicates that 
> the residue in question /and its neighboring residues/ all have CA 
> traces that look like beta sheet.
>
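The step of assembling residues that all pass a scoring threshold into
probable secondary structure elements can be pictured as finding runs
of consecutive passing residues. The Python sketch below shows that
idea only. The function name, the threshold value, the "score at or
above threshold" convention, and the minimum run length are all
assumptions; CaBLAM's real criteria are described in the newsletter
article.

```python
def assemble_segments(scores, threshold, min_length=3):
    """Group consecutive passing residues into (start, end) index pairs.

    Hypothetical helper: `scores` is one per-residue score list for a
    single secondary structure type; indices are 0-based and inclusive.
    """
    segments = []
    start = None  # index where the current passing run began
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i
        else:
            # The run (if any) ended at i - 1; keep it if long enough.
            if start is not None and i - start >= min_length:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(scores) - start >= min_length:
        segments.append((start, len(scores) - 1))
    return segments


if __name__ == "__main__":
    beta_scores = [0.1, 0.6, 0.7, 0.8, 0.2, 0.9, 0.9]
    print(assemble_segments(beta_scores, threshold=0.5))
```

In this toy input the last two residues pass individually but are
dropped because the run is shorter than the assumed minimum length,
which mirrors the point above that a recommendation reflects a residue
and its neighbors together.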
> I wish I had a simple recommendation for you, but fixing CaBLAM 
> outliers systematically has proven to be a challenge. Take a look at 
> your structure and see if you believe that the outlier residues really 
> are intended to be part of beta sheets.  If so, beta sheets have 
> distinctive hydrogen bonding patterns that tend to be disrupted by the 
> kind of problems that CaBLAM identifies.  Ideally, you will be able to 
> use Coot's tools to restore the proper hydrogen bonding.  Then, 
> applying hydrogen bonding restraints during refinement may help keep 
> your work in place.
>
> If you have large regions of outliers, it may instead be more 
> practical to strip out the existing model and replace it with 
> idealized beta sheet structure, then rerefine.  Again, hydrogen 
> bonding restraints may be helpful.
>
> As a general rule, CaBLAM outliers usually indicate a problem with the 
> orientations for one or more peptide planes. Look for a way to 
> reorient the peptide either to remove clashes or establish hydrogen 
> bonds.  Make sure you build good regular secondary structure, don't 
> sweat about the loops too much, and trust your judgement and 
> experience to identify the real and justified outliers.

This is a great summary, thanks!
Personally, I have a lot of trouble interpreting CaBLAM output. I've 
seen many CaBLAM outliers and disfavored residues that looked just fine 
to me, leaving me confused as to what I should do. Misplaced carbonyl 
groups are among the rare cases where the fix is obvious: rotate the 
group to satisfy H bonding. So I'd say a set of concrete and clear 
fixing instructions would be very helpful to have. And if these 
instructions can be encoded in software -- that's even better!

> We of the Richardson Lab generally dislike torsion-based Ramachandran 
> restraints/secondary structure restraints.

I can see your point. But the fact of the matter is that these 
restraints are absolutely necessary to make low-resolution refinement 
possible. A helix unfolding in a 5 Å resolution map as a result of 
refinement without these restraints is among my favorite examples, one 
that I've been showing for years now at workshops.
The key point here is that one should not use these restraints to fix 
outliers (because of the limited convergence radius of refinement) but 
only to keep a good model good during refinement.
And, surely, we count on validation to stay on the safe side!

All the best,
Pavel
