| Python-based Hierarchical ENvironment for Integrated Xtallography |
Validation tools in Phenix
Overview

This document covers the use of the validation software in the Phenix GUI, which is run both as a standalone program (phenix.validate) and as part of phenix.refine. Much of this software is derived from the Molprobity web server. Some programs are available as command-line tools; their use is covered in the last section.

There are two versions of validation in the GUI: one which performs only geometry validation (phenix.validate_model), and one which also compares the model to an electron-density map and calculates R-factors (phenix.validate). Since the geometry results are identical, this document covers the latter program.

The analyses performed by Phenix are described in the sections that follow. Any non-protein molecule will be included in the real-space fit evaluation, and if CIF files are provided, geometric outliers will be detected as well. Comprehensive validation of nucleic acid structures is planned for a future release.

All of the outliers listed in the GUI link to supported molecular graphics programs (currently Coot, PyMOL, and the internal viewer) if these are launched from within Phenix. Clicking an outlier will recenter the graphics window on that atom or residue, and in some cases will also highlight the relevant selection. Certain outlier lists are also automatically displayed from within Coot in a separate window, with similar behavior. When the real-space correlation or refinement is performed, buttons in the new tabs will open the resulting maps and model in Coot.

Running validation

If you are performing refinement with the phenix.refine GUI, validation will be performed automatically as soon as the final model is ready. To validate a model with diffraction data, launch phenix.validate from the command line or the main Phenix GUI. At a minimum, you will need a PDB file and a reflections file containing intensities or amplitudes and R-free flags. Most parameters should be extracted automatically.
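To illustrate why the reflections file must carry R-free flags, the following is a minimal sketch of how R-work and R-free are conventionally computed from observed and calculated amplitudes. It is not the Phenix implementation: the function names, the plain-list inputs, and the single linear scale factor (with no bulk-solvent correction) are simplifying assumptions made for this example.

```python
def r_factor(f_obs, f_calc, scale):
    """Standard R-factor: sum(| |Fobs| - k*|Fcalc| |) / sum(|Fobs|)."""
    denominator = sum(abs(o) for o in f_obs)
    if denominator == 0:
        raise ValueError("observed amplitudes sum to zero")
    numerator = sum(abs(abs(o) - scale * abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / denominator


def r_work_and_r_free(f_obs, f_calc, free_flags):
    """Split reflections by the free flag and return (R-work, R-free).

    Assumption for this sketch: one linear scale factor k is derived from
    the working set only and then applied to both sets.
    """
    if not (len(f_obs) == len(f_calc) == len(free_flags)):
        raise ValueError("all three inputs must have the same length")

    work = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if not free]
    test = [(o, c) for o, c, free in zip(f_obs, f_calc, free_flags) if free]
    if not work or not test:
        raise ValueError("both working and free reflections are required")

    sum_calc_work = sum(abs(c) for _, c in work)
    if sum_calc_work == 0:
        raise ValueError("calculated amplitudes in the working set sum to zero")
    scale = sum(abs(o) for o, _ in work) / sum_calc_work

    r_work = r_factor([o for o, _ in work], [c for _, c in work], scale)
    r_free = r_factor([o for o, _ in test], [c for _, c in test], scale)
    return r_work, r_free
```

Calling `r_work_and_r_free(f_obs, f_calc, free_flags)` with three equal-length lists returns both statistics. Because the free reflections are excluded from refinement, R-free gives a less biased measure of agreement than R-work.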
Geometry restraints outliers

Phenix, like most crystallographic software, uses the Engh and Huber (1991) restraints for proteins, nucleic acids, and other common molecules, here in the form of the CCP4 monomer library. Other restraints come from CIF files provided by the user; these can be generated by phenix.elbow and associated programs. These restraints are used in refinement to prevent distortions of model geometry and to increase the observation-to-parameter ratio. The default restraints are for bond lengths, bond angles, dihedral (torsion) angles, chiral centers, planar groups (such as aromatic rings), and nonbonded (VDW) interactions. All of these are analyzed here except for the nonbonded restraint outliers, which are made redundant by the much more thorough all-atom contact analysis (see below).

Validation of protein geometry

The geometry of protein chains is restricted by additional empirical observations that are not part of the standard restraints. The classic example of this is the Ramachandran plot, which sets limits on the combination of Phi (C-N-CA-C) and Psi (N-CA-C-N) dihedral angles for any residue. In Phenix and Molprobity, the standard distribution of values is taken from the Top500 database of high-resolution protein structures. The main window contains a list of scored outliers (the lower the score, the worse the residue), and all residues are plotted graphically in a separate window. The scoring depends on the residue type and relative position; Gly, Pro, and adjacent residues depend on different distributions, all of which can be viewed in the plot window.

All-atom contacts

Nonbonded restraints used in refinement can function with or without explicit hydrogen atoms; however, if hydrogens are absent, the atomic radii of other atoms will be increased to compensate. This approximation works reasonably well at the global structural level, but often leaves chemically impossible geometries in place.
Therefore, for this step, hydrogens are added to the model (proteins and nucleic acids only at this time) by phenix.reduce; this will first strip off any existing hydrogens. Reduce will flag residues whose sidechains require flipping based on hydrogen-bonding geometry and clashes caused by newly added hydrogens. These include Asn, Gln, and His, which are easily mis-fit due to the apparent symmetry of the sidechain without hydrogens.

Real-space correlation

This is the only part of validation that requires the underlying diffraction data. Phenix will perform bulk-solvent correction and scaling on the data and calculate a likelihood-weighted 2mFo-DFc map. This is compared to a map calculated from the model alone, and correlation coefficients for each residue are obtained. At resolutions better than 2.5 Angstroms, the values for individual atoms will also be displayed. In the GUI, these lists can be filtered and/or sorted by CC.

POLYGON

The program POLYGON (Urzhumtseva et al. 2009) has been ported to the Phenix GUI for comparing model quality indicators to similar structures in the PDB. Pre-computed values for a selection of 1000 structures determined at a similar resolution are plotted radially as one-dimensional histograms. The score for the model for each of these statistics is also plotted on the histograms, and the lines connecting these points form a complete polygon. For a high-quality, well-refined structure, the shape should be approximately symmetric and relatively small.

Suggestions for improving models

Except for the pseudo-symmetric sidechain flips performed by Reduce, there are currently no fully automatic corrections for problems identified in validation. phenix.autobuild can be used to rebuild problem regions of the structure, but this becomes more difficult as the quality of the map decreases or the resolution gets worse.
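One way to shortlist problem regions is to rank residues by the real-space correlation coefficient described earlier. The following is a minimal sketch of that idea, assuming the density values of the experimental map and the model-calculated map have already been sampled at the same points around each residue. The function names, the dictionary layout, and the default cutoff of 0.8 are hypothetical choices for this example and are not taken from Phenix.

```python
from math import sqrt


def correlation(map_values, model_values):
    """Pearson correlation coefficient between two equally sampled densities."""
    n = len(map_values)
    if n != len(model_values) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_map = sum(map_values) / n
    mean_model = sum(model_values) / n
    cov = sum((a - mean_map) * (b - mean_model)
              for a, b in zip(map_values, model_values))
    var_map = sum((a - mean_map) ** 2 for a in map_values)
    var_model = sum((b - mean_model) ** 2 for b in model_values)
    if var_map == 0 or var_model == 0:
        raise ValueError("constant density; correlation is undefined")
    return cov / sqrt(var_map * var_model)


def flag_poorly_fit(residue_densities, cc_cutoff=0.8):
    """Return (cc, residue_id) pairs below the cutoff, worst fit first.

    residue_densities maps a residue label to a pair
    (map_values, model_values); the 0.8 default is an assumed cutoff.
    """
    scored = [(correlation(obs, calc), residue_id)
              for residue_id, (obs, calc) in residue_densities.items()]
    return sorted(pair for pair in scored if pair[0] < cc_cutoff)
```

The returned list is sorted by ascending CC, mirroring the sort-and-filter behavior of the GUI lists, so the first entries are the residues whose density agrees least with the model.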
Re-refinement with a slightly different protocol is often helpful; in particular, explicit hydrogens can help constrain the model. The weight applied to X-ray terms during refinement may need to be reduced in favor of geometric restraints; this can be done automatically by phenix.refine.

References