[phenixbb] adding hydrogens -> increasing clash score - SUMMARY
Pavel Afonine
pafonine at lbl.gov
Thu Aug 4 22:44:38 PDT 2011
Hi Tanya,
thanks for sending the data and model files. Here are some results.
1) Starting values corresponding to the model you sent me:
r_work = 0.2891 r_free = 0.3267
MOLPROBITY STATISTICS.
ALL-ATOM CLASHSCORE : 67.04
RAMACHANDRAN PLOT:
OUTLIERS : 6.04 %
ALLOWED : 16.58 %
FAVORED : 77.38 %
ROTAMER OUTLIERS : 14.80 %
CBETA DEVIATIONS : 13
2) I ran six refinement jobs using the input model with and without H
atoms, and for each model I tried three ways of using NCS in refinement:
the new option that defines NCS in torsion angle space; letting
phenix.refine define NCS automatically and apply it in Cartesian space;
and Cartesian NCS using the NCS selections that you sent me. So 2 (model
with and w/o H) * 3 (NCS options) = 6 refinements in total.
Here are the results:
MODEL with H:
- torsion NCS:
r_work = 0.2951 r_free = 0.3319
REMARK 3 MOLPROBITY STATISTICS.
REMARK 3 ALL-ATOM CLASHSCORE : 19.91
REMARK 3 RAMACHANDRAN PLOT:
REMARK 3 OUTLIERS : 1.99 %
REMARK 3 ALLOWED : 6.99 %
REMARK 3 FAVORED : 91.02 %
REMARK 3 ROTAMER OUTLIERS : 0.45 %
REMARK 3 CBETA DEVIATIONS : 0
- Cartesian NCS defined automatically:
r_work = 0.3260 r_free = 0.3427
REMARK 3 MOLPROBITY STATISTICS.
REMARK 3 ALL-ATOM CLASHSCORE : 12.79
REMARK 3 RAMACHANDRAN PLOT:
REMARK 3 OUTLIERS : 1.02 %
REMARK 3 ALLOWED : 5.97 %
REMARK 3 FAVORED : 93.01 %
REMARK 3 ROTAMER OUTLIERS : 14.07 %
REMARK 3 CBETA DEVIATIONS : 45
- Cartesian NCS using your selections for NCS groups:
r_work = 0.2792 r_free = 0.3245
REMARK 3 MOLPROBITY STATISTICS.
REMARK 3 ALL-ATOM CLASHSCORE : 15.29
REMARK 3 RAMACHANDRAN PLOT:
REMARK 3 OUTLIERS : 0.90 %
REMARK 3 ALLOWED : 7.77 %
REMARK 3 FAVORED : 91.33 %
REMARK 3 ROTAMER OUTLIERS : 12.87 %
REMARK 3 CBETA DEVIATIONS : 36
MODEL without H:
- torsion NCS:
r_work = 0.2938 r_free = 0.3301
REMARK 3 MOLPROBITY STATISTICS.
REMARK 3 ALL-ATOM CLASHSCORE : 30.12
REMARK 3 RAMACHANDRAN PLOT:
REMARK 3 OUTLIERS : 1.89 %
REMARK 3 ALLOWED : 7.01 %
REMARK 3 FAVORED : 91.11 %
REMARK 3 ROTAMER OUTLIERS : 0.22 %
REMARK 3 CBETA DEVIATIONS : 2
- Cartesian NCS defined automatically:
r_work = 0.3174 r_free = 0.3354
REMARK 3 MOLPROBITY STATISTICS.
REMARK 3 ALL-ATOM CLASHSCORE : 33.76
REMARK 3 RAMACHANDRAN PLOT:
REMARK 3 OUTLIERS : 0.63 %
REMARK 3 ALLOWED : 6.31 %
REMARK 3 FAVORED : 93.06 %
REMARK 3 ROTAMER OUTLIERS : 12.60 %
REMARK 3 CBETA DEVIATIONS : 142
- Cartesian NCS using your selections for NCS groups:
r_work = 0.2762 r_free = 0.3233
REMARK 3 MOLPROBITY STATISTICS.
REMARK 3 ALL-ATOM CLASHSCORE : 41.83
REMARK 3 RAMACHANDRAN PLOT:
REMARK 3 OUTLIERS : 0.73 %
REMARK 3 ALLOWED : 8.32 %
REMARK 3 FAVORED : 90.95 %
REMARK 3 ROTAMER OUTLIERS : 18.36 %
REMARK 3 CBETA DEVIATIONS : 118
3) Looking at the results above, I would say we have two overall equally
good results (in my interpretation):
MODEL with H (torsion NCS):
r_work = 0.2951 r_free = 0.3319
REMARK 3 MOLPROBITY STATISTICS.
REMARK 3 ALL-ATOM CLASHSCORE : 19.91
REMARK 3 RAMACHANDRAN PLOT:
REMARK 3 OUTLIERS : 1.99 %
REMARK 3 ALLOWED : 6.99 %
REMARK 3 FAVORED : 91.02 %
REMARK 3 ROTAMER OUTLIERS : 0.45 %
REMARK 3 CBETA DEVIATIONS : 0
and
MODEL without H (torsion NCS):
r_work = 0.2938 r_free = 0.3301
REMARK 3 MOLPROBITY STATISTICS.
REMARK 3 ALL-ATOM CLASHSCORE : 30.12
REMARK 3 RAMACHANDRAN PLOT:
REMARK 3 OUTLIERS : 1.89 %
REMARK 3 ALLOWED : 7.01 %
REMARK 3 FAVORED : 91.11 %
REMARK 3 ROTAMER OUTLIERS : 0.22 %
REMARK 3 CBETA DEVIATIONS : 2
Compare these to your original structure:
r_work = 0.2891 r_free = 0.3267
MOLPROBITY STATISTICS.
ALL-ATOM CLASHSCORE : 67.04
RAMACHANDRAN PLOT:
OUTLIERS : 6.04 %
ALLOWED : 16.58 %
FAVORED : 77.38 %
ROTAMER OUTLIERS : 14.80 %
CBETA DEVIATIONS : 13
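To make the comparison above easier to read, here is a minimal Python sketch that computes the change in each statistic relative to the original structure. The numbers are copied from the listings above; the dictionary keys and the `deltas` helper are illustrative names, not part of PHENIX.

```python
# Values copied from the statistics quoted above; key names are illustrative.
ORIGINAL = {"r_free": 0.3267, "clashscore": 67.04, "rama_outliers": 6.04,
            "rotamer_outliers": 14.80, "cbeta_deviations": 13}
WITH_H_TORSION = {"r_free": 0.3319, "clashscore": 19.91, "rama_outliers": 1.99,
                  "rotamer_outliers": 0.45, "cbeta_deviations": 0}
WITHOUT_H_TORSION = {"r_free": 0.3301, "clashscore": 30.12, "rama_outliers": 1.89,
                     "rotamer_outliers": 0.22, "cbeta_deviations": 2}


def deltas(result, reference=ORIGINAL):
    """Return result minus reference for every statistic in the reference."""
    return {key: round(result[key] - reference[key], 4) for key in reference}


if __name__ == "__main__":
    for name, stats in (("with H", WITH_H_TORSION), ("without H", WITHOUT_H_TORSION)):
        print(name, deltas(stats))
```

A negative value means the statistic went down after refinement; for everything except r_free the drop is large, while r_free moves by well under 0.01.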
4) To be able to reproduce these numbers you need to use the most recent
PHENIX version from the nightly builds:
http://www.phenix-online.org/download/nightly_builds.cgi
use dev-838 and up.
5) I'm sending the relevant files off-list.
6) The six commands I used are:
phenix.refine data.mtz model_H.pdb ramachandran_restraints=true
main.number_of_mac=5 optimize_xyz_weight=true optimize_adp_weight=true
--overwrite main.ncs=true ncs.type=torsion output.prefix=H
secondary_structure_restraints=true *cif xray_data.high_res=3.6 > & zlogH &
phenix.refine data.mtz model.pdb ramachandran_restraints=true
main.number_of_mac=5 optimize_xyz_weight=true optimize_adp_weight=true
--overwrite main.ncs=true ncs.type=torsion output.prefix=noH
secondary_structure_restraints=true *cif xray_data.high_res=3.6 > &
zlognoH &
phenix.refine data.mtz model_H.pdb excessive_distance_limit=None
ramachandran_restraints=true main.number_of_mac=5
optimize_xyz_weight=true optimize_adp_weight=true --overwrite
main.ncs=true output.prefix=H_ncsC secondary_structure_restraints=true
*cif xray_data.high_res=3.6 > & zlogH_ncsC &
phenix.refine data.mtz model.pdb excessive_distance_limit=None
ramachandran_restraints=true main.number_of_mac=5
optimize_xyz_weight=true optimize_adp_weight=true --overwrite
main.ncs=true output.prefix=noH_ncsC secondary_structure_restraints=true
*cif xray_data.high_res=3.6 > & zlognoH_ncsC &
phenix.refine data.mtz model_H.pdb excessive_distance_limit=None
ncs_groups_H.params ramachandran_restraints=true main.number_of_mac=5
optimize_xyz_weight=true optimize_adp_weight=true --overwrite
main.ncs=true output.prefix=H_ncsC_cust
secondary_structure_restraints=true *cif xray_data.high_res=3.6 > &
zlogH_ncsC_cust &
phenix.refine data.mtz model.pdb excessive_distance_limit=None
ncs_groups_H.params ramachandran_restraints=true main.number_of_mac=5
optimize_xyz_weight=true optimize_adp_weight=true --overwrite
main.ncs=true output.prefix=noH_ncsC_cust
secondary_structure_restraints=true *cif xray_data.high_res=3.6 > &
zlognoH_ncsC_cust &
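The six commands above differ only in the input model, the NCS-related arguments and the output prefix. As a rough sketch (not how the runs were actually launched), they could be generated with a few lines of Python. The argument strings are copied from the commands above; the names `COMMON`, `MODELS`, `NCS_MODES` and `build_commands` are made up for this example, and the sketch assumes that the order of command-line arguments does not matter to phenix.refine.

```python
from itertools import product

# Arguments shared by all six runs, copied from the commands above.
COMMON = [
    "ramachandran_restraints=true",
    "main.number_of_mac=5",
    "optimize_xyz_weight=true",
    "optimize_adp_weight=true",
    "--overwrite",
    "main.ncs=true",
    "secondary_structure_restraints=true",
    "*cif",
    "xray_data.high_res=3.6",
]

# Output-prefix tag -> input model file.
MODELS = {"H": "model_H.pdb", "noH": "model.pdb"}

# Output-prefix suffix -> extra NCS arguments for that mode (as in the email).
NCS_MODES = {
    "": ["ncs.type=torsion"],
    "_ncsC": ["excessive_distance_limit=None"],
    "_ncsC_cust": ["excessive_distance_limit=None", "ncs_groups_H.params"],
}


def build_commands():
    """Return a dict mapping output prefix to the full command string."""
    commands = {}
    for (tag, model), (suffix, ncs_args) in product(MODELS.items(), NCS_MODES.items()):
        prefix = tag + suffix
        args = ["phenix.refine", "data.mtz", model, *ncs_args, *COMMON,
                f"output.prefix={prefix}"]
        commands[prefix] = " ".join(args)
    return commands


if __name__ == "__main__":
    for prefix, command in build_commands().items():
        # csh-style redirection of stdout and stderr, as in the commands above.
        print(f"{command} >& zlog{prefix} &")
```

The script only prints the command lines; it does not run anything.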
Weight optimization may not be necessary, although I did not try without
it. I recommend you try a run without it: if it turns out not to be
necessary, then skipping it may save you many hours.
Also, the number of CB outliers and Ramachandran outliers in your
original model probably indicates that it is far from final and requires
some careful manual analysis.
As Nat mentioned before, fit_rotamers will not work at resolutions lower
than 2.8-3.0A, as it relies on "reasonably good" density. I plan to
extend it to lower resolutions, but it will take some time.
You don't need to run tools like phenix.ramalyze, etc. separately after
refinement, since most of the MolProbity statistics are reported in the
REMARK 3 records by phenix.refine, as in the examples above.
Finally, you don't really have to type all these long commands. Using
parameter files is easy, and can reduce the amount of typing to
something like:
phenix.refine params.eff
See
https://www.phenix-online.org/presentations/latest/pavel_phenix_refine.pdf
for details and examples.
7) Finally finally, at 3.6A resolution or so, R-factors within ~2% of
each other can be considered the same. For example, I would not say that
r_free = 0.3245 is better than 0.3319.
For discussion see
http://phenix-online.org/newsletter/CCN_2011_07.pdf
article "Improved target weight optimization in phenix.refine".
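The rule of thumb in item 7 can be written as a small Python check. This is only a sketch: it assumes that "~2%" means a difference of 0.02 in fractional R-factor units, and the names `R_FREE`, `r_factors_equivalent` and `indistinguishable_from_best` are invented for the example. The r_free values are copied from the six results listed above.

```python
# r_free for the six runs, copied from the results above; keys are illustrative.
R_FREE = {
    "H_torsion": 0.3319,
    "H_cartesian_auto": 0.3427,
    "H_cartesian_custom": 0.3245,
    "noH_torsion": 0.3301,
    "noH_cartesian_auto": 0.3354,
    "noH_cartesian_custom": 0.3233,
}


def r_factors_equivalent(a, b, tolerance=0.02):
    """Assumption: '~2%' is read as 0.02 in fractional R-factor units."""
    return abs(a - b) <= tolerance


def indistinguishable_from_best(r_free, tolerance=0.02):
    """Names of the runs whose r_free is within tolerance of the lowest one."""
    best = min(r_free.values())
    return sorted(name for name, value in r_free.items()
                  if r_factors_equivalent(value, best, tolerance))


if __name__ == "__main__":
    print(r_factors_equivalent(0.3245, 0.3319))
    print(indistinguishable_from_best(R_FREE))
```

Under this reading, r_free alone does not separate the runs, which is why the geometry statistics matter when choosing between them.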
Please let me know if you have any questions or need any help with this.
Pavel.
On 7/30/11 11:35 AM, Tatyana Sysoeva wrote:
> I ran the same commands with the build dev-833.
> Below are the results.
> These parameters were obtained by running phenix.ramalyze, cbetadev, and
> clashscore.
> I am almost done repeating the fix_rotamers run. The first time it gave
> much worse validation results than the run without
> fix_rotamers=true.
> I don't understand what I am doing wrong in the runs and would
> appreciate your help!
>
> Thanks,
> Tanya
>
>
> I used phenix-dev-833 to test the riding hydrogens refinement at 3.6A.
>
> I ran two identical commands:
>
> Command line arguments: "model.pdb" "data.mtz" "main.ncs=true"
> "ncs.find_automatically=false" "ncs_groups.params"
> "refinement.input.xray_data.high_resolution=3.6" "mgadpbef.cif"
> "refinement.ncs.excessive_distance_limit=None"
> "strategy=individual_sites+individual_sites_real_space+group_adp+occupancies"
>
> with the only difference in the input files – ncs.params and model.pdb
>
> The model PDB file contained a model identical to the control run but
> with added H atoms. I did not add H to the ADP molecules since I did not
> know how to write a correct CIF file for it. NCS definitions were
> changed by adding “and not (element H)” to each line. This was done
> to exclude H from the NCS groups.
>
>                        no hydrogens            with hydrogens
>
> Date (2011-07-29)      19:09:58 EDT -0400      20:14:20 EDT -0400
>                        (1311980998.46 s)       (1311984860.16 s)
> wall clock time        2646.75 s               6517.60 s
> Start R-work, R-free   0.2891, 0.3267          0.2910, 0.3273
> Final R-work, R-free   0.2868, 0.3304          0.2720, 0.3353
> clashscore             31.77                   96.12
> cbeta                  0                       151
> rama outliers          12.62%                  19.67%
>
>
> Interestingly, in the previous release version the same refinement runs
> produced:
>
> without H: clashscore = 77.580195, cbeta = 8
>
> with H: clashscore = 119.729960, cbeta = 330