Overview of molecular replacement in PHENIX
Molecular replacement (MR) is a phasing method that uses prior information in the form of a related or homologous structure. The procedure is roughly divided into two steps: a rotation function (RF), which determines the orientation of the search model(s), and a translation function (TF), which determines the absolute position(s) in the unit cell. Because it requires no additional experimental procedures or data, and also simplifies model-building, MR is usually the method of choice for structure determination when a suitable search model is available. (See the Limitations section below for advice on search models.)
In PHENIX, MR is performed by the program Phaser, written by Randy Read's group at the University of Cambridge. Although Phaser may be run on the command line with CCP4-style inputs, we recommend using either the AutoMR wizard (GUI or command line), or the Phaser-MR GUI. AutoMR presents a user-friendly and automated frontend to Phaser, and interfaces directly with the AutoBuild wizard for model-building. We recommend that new users, and anyone who expects MR to work for their specific structure, start with the AutoMR GUI. The Phaser-MR GUI is more complex, but enables finer control over parameters and multi-step searches, which may be necessary for difficult structures. The Sculptor and Ensembler utilities are available for preparing search models.
Input files and mandatory parameters
All of the MR programs in PHENIX require a single reflection file containing experimental data (with sigmas); AutoMR and the GUI will accept any file format or data type, including intensities. The procedure traditionally uses all reflections, so R-free flags are not required. (Unlike in refinement, including the test-set reflections does not severely bias the final R-free value, since the placement of molecules will be approximate and some conformational changes usually occur.)
At least one search model is required. In most cases this will be a PDB file containing a partial structure. For more variable structures, an ensemble model may be used instead: either a single PDB file with multiple MODEL records, or multiple similar PDB files. When using an ensemble, all models must be superposed in the same orientation; the Ensembler utility is used for this. There are no limitations on the size or number of search models. For large complexes, if the relative positions of individual subunits do not change, the entire assembly may be used as a search model instead of placing each component separately. All heteroatoms (ligands and waters) should be removed from the PDB file(s) before use.
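As an illustration of the last point, the following is a minimal sketch of removing heteroatoms from a PDB file in plain Python. The function name strip_heteroatoms and the example records are hypothetical and not part of PHENIX. Note that this blunt filter also drops modified residues stored as HETATM records (for example selenomethionine, MSE), which you may prefer to keep or convert.

```python
def strip_heteroatoms(pdb_lines):
    """Return the PDB lines with HETATM records removed.

    ANISOU records that immediately belong to a removed HETATM are
    dropped as well. This is an illustrative helper, not a PHENIX API.
    """
    kept = []
    last_atom_was_dropped = False
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "HETATM":
            last_atom_was_dropped = True
            continue
        if record == "ANISOU" and last_atom_was_dropped:
            continue
        if record == "ATOM":
            last_atom_was_dropped = False
        kept.append(line)
    return kept


# Made-up example records: one protein atom, one water, one END record.
example = [
    "ATOM      1  CA  ALA A   1      11.104   6.134  -6.504  1.00 20.00           C",
    "HETATM    2  O   HOH A 101      15.000  10.000   5.000  1.00 30.00           O",
    "END",
]
cleaned = strip_heteroatoms(example)
```

For a file on disk, the same function can be applied to the result of reading the file with readlines() and the output written to a new file, leaving the original untouched.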
Another option, only available in Phaser itself (not AutoMR), is to search using a map (or rather, an MTZ file containing pre-weighted map coefficients), often solved at low resolution. This requires additional information about the center and extent of the map section to search with.
The maximum likelihood phasing methods used in Phaser require prior knowledge about the deviation (or variance) of the search model(s) from the real structure, and the expected composition or scattering mass of the crystal. To specify the model variance, either an RMSD value or percent sequence identity may be used (these will be converted internally, and a sequence identity of 100% is assumed to mean an approximate RMSD of 1.0 Å). It is important to minimize the variance if possible (see Limitations below for guidelines), which often requires eliminating atoms or modifying B-factors. The Sculptor utility will perform this step, given a PDB file and a sequence alignment. (This is usually unnecessary for search models with high sequence identity to the target molecule.)
For composition, you may supply a sequence file (protein or nucleic acid), or simply enter the molecular weight. The standalone versions of Phaser also accept the fractional composition of each search ensemble, if known. Note that the composition data do not necessarily have a 1:1 correspondence with the search ensembles (see the FAQ list below for details). Even if you are only searching for a single ensemble out of several (e.g. the protein in a protein-DNA complex), you must still supply the expected composition of the entire crystal.
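If you prefer to enter a molecular weight rather than a sequence file, a rough value can be estimated from the sequence. The sketch below is not how Phaser computes composition internally; it simply sums standard average residue masses and adds one water for the chain termini. The helper names (sequence_from_fasta, protein_mass, total_mass) are hypothetical, and only the 20 standard amino acids are handled.

```python
# Average residue masses in daltons (amino acid minus one water).
AVERAGE_RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.0153


def sequence_from_fasta(text):
    """Join all non-header, non-blank lines of a single-entry FASTA string."""
    return "".join(
        line.strip()
        for line in text.splitlines()
        if line.strip() and not line.startswith(">")
    )


def protein_mass(sequence):
    """Approximate mass (Da) of one polypeptide chain."""
    sequence = sequence.upper()
    unknown = sorted(set(sequence) - set(AVERAGE_RESIDUE_MASS))
    if unknown:
        raise ValueError("unsupported residue codes: " + ", ".join(unknown))
    if not sequence:
        return 0.0
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in sequence) + WATER_MASS


def total_mass(components):
    """Sum mass * copies over every component expected in the crystal,
    not only the ones being searched for."""
    return sum(mass * copies for mass, copies in components)


toy_sequence = sequence_from_fasta(">toy\nGG\nA\n")
toy_mass = protein_mass(toy_sequence)
```

For a protein-DNA complex where only the protein is used as a search model, the list passed to total_mass would still include an entry for the DNA, in line with the requirement to describe the entire crystal contents.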
Outline of MR procedure
If multiple search models are used, these steps will be performed sequentially for each model. Although each step may be run individually in the Phaser-MR GUI, this is necessary only in exceptionally difficult cases.
Limitations
The main restriction on the use of molecular replacement is the requirement for a suitably similar search model. Although there is no exact rule for this, the relationship between sequence identity and MR success is roughly as follows:
Structures which undergo large conformational changes may need to be split into separate domains for searching, regardless of sequence identity. Where multiple similar search models are available, combining these into an ensemble may improve the likelihood of success. Processing models with the Sculptor utility is highly recommended, especially at lower sequence identity.
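Splitting a model into domains can be done with any PDB editing tool. As a minimal sketch, the plain-Python helper below selects ATOM records by chain and residue range using the fixed-column PDB layout. The function names, the domain names and the residue boundaries are all hypothetical; real domain boundaries have to come from inspection of the structure.

```python
def make_ca_line(serial, resname, chain, resseq):
    """Build a made-up CA ATOM record with correct PDB column alignment."""
    return (
        f"ATOM  {serial:5d}  CA  {resname:>3s} {chain}{resseq:4d}    "
        f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{20.0:6.2f}           C"
    )


def split_into_domains(pdb_lines, domains):
    """Group ATOM records by domain.

    domains maps a domain name to (chain_id, first_residue, last_residue),
    with both residue numbers inclusive. Records outside every range
    (and all non-ATOM records) are ignored.
    """
    result = {name: [] for name in domains}
    for line in pdb_lines:
        if line[:6] != "ATOM  " or len(line) < 26:
            continue
        chain_id = line[21]          # PDB column 22
        resseq = int(line[22:26])    # PDB columns 23-26
        for name, (chain, first, last) in domains.items():
            if chain_id == chain and first <= resseq <= last:
                result[name].append(line)
    return result


# Toy four-residue chain split into two hypothetical domains.
toy_model = [make_ca_line(i, "ALA", "A", i) for i in range(1, 5)]
parts = split_into_domains(
    toy_model,
    {"N_domain": ("A", 1, 2), "C_domain": ("A", 3, 4)},
)
```

Each list in the result can then be written to its own PDB file and supplied as a separate search model.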
For cases where anomalous data from a SAD experiment are available, a poor (but genuine) MR solution may be used to identify heavy-atom sites, and the MR phases may then be combined with the SAD phases, a technique known as MR-SAD. This may provide a decent-quality map where neither technique is independently sufficient. The AutoMR and AutoSol manuals have details on running this in Phenix.
Frequently Asked Questions
The Phaser home page has a general FAQ list; look there first to see if it answers your question. Note that most of the questions below apply mainly to the "automated molecular replacement" (MR_AUTO) mode of Phaser.
See the Phaser-MR GUI manual for additional details specific to that interface.