[phenixbb] geometry_minimization makes molprobity score worse
Pavel Afonine
pafonine at lbl.gov
Wed Jul 7 23:27:25 PDT 2021
Hi James,
thanks for the email and for sharing your observations!
> Greetings all, and I hope this little observation helps improve things
> somehow.
>
> I did not expect this result, but there it is. My MolProbity score
> goes from 0.7 to 1.9 after a run of phenix.geometry_minimization
>
> I started with an AMBER-minimized model (based on 1aho), and that got
> me my best MolProbity score so far (0.7). But, even with hydrogens and
> waters removed the geometry_minimization run increases the clashscore
> from 0 to 3.1 and Ramachandran favored drops from 98% to 88% with one
> residue reaching the outlier level.
It is no secret that the 'standard geometry restraints' used in Phenix
and similar programs (Refmac, etc.) are very simplistic. They are not
aware of preferred main-chain conformations (Ramachandran plot) or
favorable side-chain rotamer conformations. They don't even have any
electrostatic/attraction terms, only anti-bumping repulsion! Standard
geometry restraints won't like any NCI (non-covalent interaction) and
will likely push interacting atoms apart rather than keep them close
together.
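To illustrate the point about repulsion-only terms, here is a minimal
sketch comparing a purely repulsive penalty with a Lennard-Jones 12-6
potential. The functional forms, the constants k and eps, and the
contact distance r0 are assumptions chosen for illustration; this is
not the actual nonbonded function used in Phenix.

```python
def repulsion_only(r, r0, k=1.0):
    """Penalty only when atoms are closer than r0; exactly zero beyond it."""
    overlap = max(0.0, r0 - r)
    return k * overlap ** 4


def lennard_jones(r, r0, eps=1.0):
    """12-6 potential with a minimum of depth -eps at r = r0."""
    x = (r0 / r) ** 6
    return eps * (x * x - 2.0 * x)


def force(energy, r, r0, h=1e-6):
    """Radial force -dE/dr by central finite difference.

    Positive values push the atoms apart, negative values pull them together.
    """
    return -(energy(r + h, r0) - energy(r - h, r0)) / (2.0 * h)


r0 = 3.0  # hypothetical contact distance in Angstrom
for r in (2.8, 3.0, 3.5):
    print(r, force(repulsion_only, r, r0), force(lennard_jones, r, r0))
```

Beyond r0 the repulsion-only term contributes no force at all, while
the 12-6 term pulls the atoms back toward r0. A restraint set with only
the first kind of term has nothing to hold a favorable contact in place
once other restraints start moving the atoms.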
With this in mind, any high-quality (high-resolution) atomic model, or
one optimized using sufficiently high-level QM, is going to have a more
realistic geometry than the result of geometry regularization against a
very simplistic restraints target. An example:
https://journals.iucr.org/d/issues/2020/12/00/lp5048/lp5048.pdf
and previous papers on the topic.
> Just for comparison, with refmac5 in "refi type ideal" mode I see the
> MolProbity rise to 1.13, but Clashscore remains zero, some Ramas go
> from favored to allowed, but none rise to the level of outliers.
I believe this is because of the nature of the minimizer used. Refmac
uses a second-derivative-based one, which in a nutshell means it can
move the model much less (just a bit, in the vicinity of a local
minimum) than any program that uses gradients only (like Phenix).
> Files and logs here:
> https://bl831.als.lbl.gov/~jamesh/bugreports/phenixmin_070721.tgz
>
> I suspect this might have something to do with library values for
> main-chain bonds and angles? They do seem to vary between programs.
> Phenix having the shortest CA-CA distance by up to 0.08 A. After
> running thorough minimization on a poly-A peptide I get:
> bond    amber   refmac  phenix  shelxl  Stryer
> C-N     1.330   1.339   1.331   1.325   1.32
> N-CA    1.462   1.482   1.455   1.454   1.47
> CA-C    1.542   1.534   1.521   1.546   1.53
> CA-CA   3.862   3.874   3.794   3.854
>
> So, which one is "right" ?
I'd say they are all the same, within their 'sigmas', which are, from
memory, about 0.02 A:
elbow.where_is_that_cif_file phe
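As a rough check of that statement, the sketch below expresses the
spread of each bond length across the five sources in units of sigma.
The values are copied from the quoted table; the sigma of 0.02 A is the
from-memory figure above, not a value read from the library file.

```python
SIGMA = 0.02  # Angstrom; the from-memory estimate, treated here as an assumption

# Bond lengths in Angstrom, copied from the poly-A table in the quoted message.
BONDS = {
    "C-N":  {"amber": 1.330, "refmac": 1.339, "phenix": 1.331,
             "shelxl": 1.325, "Stryer": 1.32},
    "N-CA": {"amber": 1.462, "refmac": 1.482, "phenix": 1.455,
             "shelxl": 1.454, "Stryer": 1.47},
    "CA-C": {"amber": 1.542, "refmac": 1.534, "phenix": 1.521,
             "shelxl": 1.546, "Stryer": 1.53},
}


def spread_in_sigma(bond, sigma=SIGMA):
    """Largest minus smallest value for one bond, divided by sigma."""
    values = BONDS[bond].values()
    return (max(values) - min(values)) / sigma


for bond in BONDS:
    print(f"{bond:5s} spread = {spread_in_sigma(bond):.2f} sigma")
```

With these numbers the spreads fall between roughly 1 and 1.4 sigma,
the largest being N-CA. CA-CA is left out because it is not a single
covalent bond with its own sigma.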
All the best!
Pavel