Python-based Hierarchical ENvironment for Integrated Xtallography

Structure refinement in PHENIX

Graphical interface
Available features
Current limitations
phenix.refine organization
Running phenix.refine
Giving parameters on the command line or in files
Refinement scenarios
Refinement with all default parameters
Refinement of coordinates
Refinement of atomic displacement parameters (commonly named as ADP or B-factors)
Occupancy refinement
f' and f'' refinement
Using NCS restraints in refinement
Using secondary structure restraints
Using a reference model in refinement
Automatic Asn/Gln/His corrections
Water picking
Hydrogens in refinement
Refinement using twinned data
Neutron and joint X-ray and neutron refinement
Optimizing target weights
Refinement at high resolution (higher than approx. 1.0 Angstrom)
Examples of frequently used refinement protocols, common problems
Useful options
Changing the number of refinement cycles and minimizer iterations
Parallelizing for multi-core systems
Creating R-free flags (if not present in the input reflection files)
Specify the name for output files
Reflection output
Setting the resolution range for the refinement
Bulk solvent correction and anisotropic scaling
Default refinement with user specified X-ray target function
Modifying the initial model before refinement starts
Refinement using FFT or direct structure factor calculation algorithm
Ignoring test (free) flags in refinement
Using phenix.refine to calculate structure factors
Scattering factors
Suppressing the output of certain files
Random seed
Electron density maps
Refining with anomalous data (or what phenix.refine does with Fobs+ and Fobs-).
Rejecting reflections by sigma
Developer's tools
CIF modifications and links
Definition of custom bonds and angles
Atom selection examples
Depositing refined structure with PDB
Referencing phenix.refine
Relevant reading
Feedback, more information
List of all refinement keywords

phenix.refine is the general-purpose crystallographic structure refinement program of PHENIX.

Graphical interface

A complete graphical interface for phenix.refine is available; it includes integration with several refinement-related utilities such as phenix.ready_set, phenix.simple_ncs_from_pdb, and phenix.find_tls_groups. Essentially all of the program details described in this document should apply to the GUI as well.

Available features

  • Coordinate refinement:
  1. Restrained or completely unrestrained individual
  2. Grouped (rigid body)
  3. LBFGS minimization, Cartesian or torsion Simulated Annealing
  4. Selective removal of stereochemistry restraints
  5. Adding custom (user-defined) bonds and angles
  6. Fixing coordinates of any selected part of the structure during refinement
  7. NCS and secondary structure restraints
  • Atomic Displacement Parameters (ADP) refinement:
  1. Restrained individual isotropic, anisotropic, mixed
  2. Group isotropic (one isotropic B per selected model part)
  3. TLS
  4. Comprehensive mode: combined TLS + individual or group ADP
  • Occupancy refinement (any: individual, group, constrained for alternative conformations)
  • Anomalous f' and f'' refinement
  • Bulk solvent correction (flat model using a mask) and anisotropic scaling
  • Multiple refinement and scale target functions: least-squares (ls), maximum-likelihood (ml), phased maximum-likelihood (mlhl)
  • FFT and direct summation based refinement
  • Various electron density map calculations (including likelihood-weighted, kick maps)
  • Simple structure factor calculation (with or without bulk solvent and scaling)
  • Combined automatic ordered solvent building, update and refinement
  • Complete model and data statistics (including twinning analysis, Wilson B calculation, stereo-chemistry statistics and much more)
  • Automatic detection of NCS related copies and building NCS restraints
  • Refinement using X-ray, neutron or both experimental data
  • Complex refinement strategies in one run
  • Refinement at subatomic resolution (approx. < 1.0 A) with IAS model
  • Refinement with twinned data
  • Automatic fixing of amino-acid side chains by local real-space refinement combined with a torsion-angle grid search to fit the density.

Current limitations

  • No omit map calculation (use PHENIX wizards for this)
  • TLS and individual anisotropic ADP cannot be refined at once for the same group
  • Certain refinement strategies are not available for joint X-ray/neutron refinement
  • No NCS constraints (restraints only)
  • Atoms with anisotropic ADP in NCS groups
  • No Simulated Annealing for selected fragments
  • Oxidation state of metals is ignored: for example, "+2" in Fe+2 will be ignored

Remark on using amplitudes (Fobs) vs intensities (Iobs)

Although phenix.refine can read both data types, intensities or amplitudes, internally it uses amplitudes in nearly all calculations. Both ways of doing refinement, with Iobs or Fobs, have their own slight advantages and disadvantages. To our knowledge there are no strong arguments for preferring one data type over the other.

phenix.refine organization

A refinement run in phenix.refine always consists of three main steps: reading in and processing the data (model in PDB format, reflections in most known formats, parameters and, optionally, CIF files with stereochemistry definitions), performing the requested refinement protocols (bulk solvent and scaling, refinement of coordinates and B-factors, water picking, etc.) and finally writing out the refined model, complete refinement statistics and electron density maps in various formats. The figure below illustrates these steps:

images/phenix_refine_flowchart.png

The second, central step, which runs from bulk solvent correction and scaling to the refinement of particular model parameters, is called a macro-cycle and is repeated several times (3 by default). Multiple refinement scenarios can be realized at this step and applied to any selected part of a model, as illustrated in the figure below:

images/phenix_refine_flexibility.png

Running phenix.refine

phenix.refine can be run from the command line:

% phenix.refine <pdb-file(s)> <reflection-file(s)> <monomer-library-file(s)>

or from PHENIX GUI.

When you do this a number of things happen:

  • The program automatically generates a ".eff" file which contains all of the parameters for the job (for example if you provided lysozyme.pdb the file lysozyme_refine_001.eff will be generated). This is the set of input parameters for this run.
  • The program automatically interprets the reflection file(s). If there is an unambiguous choice of data arrays these will be used for the refinement. If there is a choice, you're given a message telling you how to select the arrays. Several reflection files can be provided, for example: one containing Fobs and another one with R-free flags.
  • Once the data arrays are chosen, the program writes all of the data it will be using in the refinement to a new MTZ file, for example, lysozyme_refine_data.mtz. This makes it very easy to keep track of what you actually used in the refinement (instead of having the arrays spread across multiple files).
  • At the end of refinement the program generates:
  1. a new PDB file, with the refined model, called for example lysozyme_refine_001.pdb;

  2. a reflection file with map coefficients for use in Coot or XtalView (e.g. lysozyme_refine_001_map_coeffs.mtz);

  3. Optionally, two maps: likelihood weighted mFo-DFc and 2mFo-DFc can be requested. These are in binary CCP4 format.

  4. a new defaults file to run the next cycle of refinement, e.g. lysozyme_refine_002.def. This means you can run the next cycle of refinement by typing:

    % phenix.refine lysozyme_refine_002.def
    

To get information about command line options type:

% phenix.refine --help

To have the program generate the default input parameters without running the refinement job (e.g. if you want to modify the parameters prior to running the job):

% phenix.refine --dry_run <pdb-file> <reflection-file(s)>

If you know the parameter that you want to change you can override it from the command line:

% phenix.refine data.hkl model.pdb xray_data.low_resolution=8.0 \
  simulated_annealing.start_temperature=5000

Note that you don't have to specify the full parameter name. What you specify on the command line is matched against all known parameter names, and the best substring match is used if it is unique.
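
The matching rule described above can be illustrated with a minimal Python sketch. It is not the code used by phenix.refine: "best match" is simplified here to plain substring containment, the function name resolve_parameter is hypothetical, and the list of full names is a small hand-picked subset of names that appear in this document.

```python
def resolve_parameter(fragment, known_names):
    """Return the single full parameter name that contains `fragment`.

    Raises ValueError when no name, or more than one name, matches.
    """
    matches = [name for name in known_names if fragment in name]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown parameter: {fragment}")
    raise ValueError(f"ambiguous parameter {fragment!r}: {', '.join(matches)}")


# Hand-picked subset of full parameter names (illustration only).
KNOWN = [
    "refinement.input.xray_data.low_resolution",
    "refinement.input.xray_data.high_resolution",
    "refinement.simulated_annealing.start_temperature",
    "refinement.target_weights.wxc_scale",
]

print(resolve_parameter("xray_data.low_resolution", KNOWN))
print(resolve_parameter("start_temperature", KNOWN))
```

In this sketch a fragment such as "resolution" is rejected as ambiguous because it occurs in two names, which mirrors the requirement that the match be unique.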

To rerun a job that was previously run:

% phenix.refine --overwrite lysozyme_refine_001.def

The --overwrite option allows the program to overwrite existing files. By default the program will not overwrite existing files - just in case this would remove the results of a refinement job that took a long time to finish.

To see all default parameters:

% phenix.refine --show-defaults=all

Giving parameters on the command line or in files

In phenix.refine parameters to control refinement can be given by the user on the command line:

% phenix.refine data.hkl model.pdb simulated_annealing=true

However, sometimes the number of parameters is large enough to make it difficult to type them all on the command line, for example:

% phenix.refine data.hkl model.pdb refine.adp.tls="chain A" \
  refine.adp.tls="chain B" main.number_of_macro_cycles=4 \
  xray_data.high_resolution=2.5 wxc_scale=3 wxu_scale=5 \
  output.prefix=my_best_model strategy=tls+individual_sites+individual_adp \
  simulated_annealing.start_temperature=5000

The same result can be achieved by using:

% phenix.refine data.hkl model.pdb custom_par_1.params

where the custom_par_1.params file contains the following lines:

refinement.refine.strategy=tls+individual_sites+individual_adp
refinement.refine.adp.tls="chain A"
refinement.refine.adp.tls="chain B"
refinement.main.number_of_macro_cycles=4
refinement.input.xray_data.high_resolution=2.5
refinement.target_weights.wxc_scale=3
refinement.target_weights.wxu_scale=5
refinement.output.prefix=my_best_model
refinement.simulated_annealing.start_temperature=5000

which can also be formatted by grouping the parameters under the relevant scopes (custom_par_2.params):

refinement.main {
   number_of_macro_cycles=4
}
refinement.input.xray_data.high_resolution=2.5
refinement.refine {
  strategy = *individual_sites \
              rigid_body \
             *individual_adp \
              group_adp \
             *tls \
              occupancies \
              group_anomalous \
              none
  adp {
    tls = "chain A"
    tls = "chain B"
  }
}
refinement.target_weights {
  wxc_scale=3
  wxu_scale=5
}
refinement.output.prefix=my_best_model
refinement.simulated_annealing.start_temperature=5000

and the refinement run will be:

% phenix.refine data.hkl model.pdb custom_par_2.params

The easiest way to create a file like the custom_par_2.params file is to generate a template file containing all parameters by using the command phenix.refine --show-defaults=all and then take the parameters that you want to use (and remove the rest).

Comments in parameter files

Use # for comments:

% phenix.refine data.hkl model.pdb comments_in_params_file.params

where comments_in_params_file.params file contains the lines:

refinement {
  refine {
    #strategy =  individual_sites rigid_body  individual_adp group_adp tls \
    #           occupancies group_anomalous *none
  }
  #main {
  #  number_of_macro_cycles = 1
  #}
}
refinement.target_weights.wxc_scale = 1.5
#refinement.input.xray_data.low_resolution=5.0

In this example the only parameter used to override the defaults is target_weights.wxc_scale; the rest are commented out.

Refinement scenarios

The refinement of atomic parameters is controlled by the strategy keyword. Its possible values are:

- individual_sites (refinement of individual atomic coordinates)
- individual_adp   (refinement of individual atomic B-factors)
- group_adp        (group B-factors refinement)
- group_anomalous  (refinement of f' and f" values)
- tls              (TLS refinement = refinement of ADP through TLS parameters)
- rigid_body       (rigid body refinement)
- occupancies      (occupancy refinement: individual, group, group constrained)
- none             (bulk solvent and anisotropic scaling only)
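
Several strategy values are joined with "+" on the command line (for example strategy=tls+individual_sites+individual_adp). The following Python sketch only illustrates that syntax; it is not how phenix.refine parses its input, and the function name parse_strategy is hypothetical.

```python
# Strategy values as listed above.
STRATEGIES = (
    "individual_sites", "individual_adp", "group_adp", "group_anomalous",
    "tls", "rigid_body", "occupancies", "none",
)


def parse_strategy(value):
    """Split a '+'-joined strategy string and check every component."""
    parts = [part.strip() for part in value.split("+") if part.strip()]
    unknown = [part for part in parts if part not in STRATEGIES]
    if unknown:
        raise ValueError(f"unknown strategy: {', '.join(unknown)}")
    return parts


print(parse_strategy("tls+individual_sites+individual_adp"))
# prints ['tls', 'individual_sites', 'individual_adp']
```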

Below are examples to illustrate the use of the strategy keyword as well as a few others.

Refinement with all default parameters

% phenix.refine data.hkl model.pdb

This will perform coordinate refinement and restrained ADP refinement. Three macrocycles will be executed, each consisting of bulk solvent correction, anisotropic scaling of the data, coordinate refinement (25 iterations of the LBFGS minimizer) and ADP refinement (25 iterations of the LBFGS minimizer). At the end the updated coordinates, maps, map coefficients, and statistics are written to files.
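
The order of operations in a default run can be written out schematically. The Python sketch below only enumerates the steps named in the paragraph above; it performs no refinement, and the function and argument names are hypothetical.

```python
def default_protocol(number_of_macro_cycles=3, minimizer_iterations=25):
    """List the steps of a default run in the order described in the text."""
    steps = []
    for cycle in range(1, number_of_macro_cycles + 1):
        steps.append((cycle, "bulk solvent correction"))
        steps.append((cycle, "anisotropic scaling"))
        steps.append(
            (cycle, f"coordinate refinement ({minimizer_iterations} LBFGS iterations)")
        )
        steps.append(
            (cycle, f"ADP refinement ({minimizer_iterations} LBFGS iterations)")
        )
    return steps


for cycle, step in default_protocol():
    print(f"macro-cycle {cycle}: {step}")
```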

Refinement of coordinates

phenix.refine offers three ways of coordinate refinement:

  • individual coordinate refinement using gradient-driven (LBFGS) minimization;
  • individual coordinate refinement using simulated annealing (SA refinement);
  • grouped coordinate refinement (rigid body refinement).

The types of coordinate refinement listed above can be used separately or in any combination, and can be applied to any selected part of a model. For example, if a model contains three chains A, B and C, then a single refinement run is enough to perform SA refinement and minimization for atoms in chain A, rigid body refinement with two rigid groups A and B, and no refinement for chain C. Below we illustrate this with several examples.

The default refinement includes a standard set of stereochemical restraints (covalent bonds, angles, dihedrals, planarities, chiralities, non-bonded). NCS restraints can be added as well. Completely unrestrained refinement is also possible.

The total refinement target is defined as:

Etotal = wxc_scale * wxc * Exray + wc * Egeom

where Exray is the crystallographic refinement target (least-squares, maximum-likelihood, or any other), Egeom is the sum of the geometry restraints (including NCS if requested), wc is 1.0 by default and can be set to 0 to turn the restraints off, wxc is approximately the ratio of the gradient norms of the geometry and X-ray targets as defined in (Adams et al, 1997, PNAS, Vol. 94, p. 5018), and wxc_scale is an ad hoc scale factor found empirically to work well in most cases.
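
As a worked illustration of the formula, the Python sketch below evaluates Etotal for made-up numbers. It assumes that wxc is taken as the norm of the geometry gradient divided by the norm of the X-ray gradient, which is one reading of the ratio mentioned above; the function names and all numeric values are hypothetical and are not phenix.refine defaults.

```python
import math


def gradient_norm(gradient):
    """Euclidean norm of a gradient given as a flat list of numbers."""
    return math.sqrt(sum(g * g for g in gradient))


def estimate_wxc(grad_xray, grad_geom):
    """Assumed form of the weight: |grad Egeom| / |grad Exray|."""
    return gradient_norm(grad_geom) / gradient_norm(grad_xray)


def total_target(e_xray, e_geom, wxc, wxc_scale, wc=1.0):
    """Etotal = wxc_scale * wxc * Exray + wc * Egeom."""
    return wxc_scale * wxc * e_xray + wc * e_geom


# Made-up gradients and target values, for illustration only.
wxc = estimate_wxc(grad_xray=[3.0, 4.0], grad_geom=[6.0, 8.0])
print(wxc)                                                   # 2.0
print(total_target(10.0, 4.0, wxc, wxc_scale=0.5))           # 14.0
print(total_target(10.0, 4.0, wxc, wxc_scale=0.5, wc=0.0))   # 10.0
```

Setting wc=0 removes the restraint term from the target, which corresponds to the unrestrained refinement described later in this section.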

Important to note:

When a refinement of coordinates (individual or rigid body) is run without using selections, the coordinates of all atoms are refined. If selections are used, only the coordinates of the selected atoms are refined and the rest are kept fixed.

Using strategy=rigid_body or strategy=individual_sites will ask phenix.refine to refine only coordinates while other parameters (ADP, occupancies) will be fixed.

phenix.refine will stop if an atom at a special position is included in a rigid body group. The solution is to make a new rigid body group selection containing no atoms at special positions.

  • Rigid body refinement

    The phenix.refine implementation of rigid body refinement is designed to be robust and efficient (large convergence radius, a single run, no need to cut off high-resolution data). We call this the MZ (multiple zones) protocol. The essence of the MZ protocol is that the refinement starts with a few reflections selected in the lowest resolution zone and proceeds by gradually adding higher resolution reflections. It also updates the mask and bulk solvent model parameters almost constantly; this is crucial because the bulk solvent affects the low resolution reflections, exactly those most important for the success of rigid body refinement. The default set of rigid body parameters is good for most cases and normally does not need to be changed.

    1. One rigid body group per chain (default behavior):

      % phenix.refine data.hkl model.pdb strategy=rigid_body
      
    2. Multiple groups (requires a basic knowledge of the PHENIX atom selection language, see below):

      % phenix.refine data.hkl model.pdb strategy=rigid_body \
        sites.rigid_body="chain A" sites.rigid_body="chain B"
      

      This will refine the chain A and chain B as two rigid bodies. The rest of the model will be kept fixed.

    3. If one has many rigid groups, typing them all on the command line may not be convenient, so it may be a good idea to create a parameter file rigid_body_selections.params containing the following lines:

      refinement.refine.sites {
        rigid_body = chain A
        rigid_body = chain B
      }
      

      The command line will then be:

      % phenix.refine data.hkl model.pdb strategy=rigid_body \
        rigid_body_selections.params
      

      Files like this can be created, for example, by copy-and-paste from the complete list of parameters (phenix.refine --show-defaults=all).

    4. To switch from MZ protocol to traditional way of doing rigid body refinement (not recommended!):

      % phenix.refine data.hkl model.pdb strategy=rigid_body rigid_body.number_of_zones=1 \
        rigid_body.high_resolution=4.0
      

      Note that when doing one-zone refinement one needs to cut off the high-resolution data at some arbitrary point around 3-5 A (depending on model size and data quality).

    5. By default rigid body refinement is run only at the first macro-cycle. To run it at every macro-cycle instead:

      % phenix.refine data.hkl model.pdb strategy=rigid_body rigid_body.mode=every_macro_cycle
      
    6. To change the default number of lowest resolution reflections that define the first resolution zone for rigid body refinement (for MZ protocol only):

      % phenix.refine data.hkl model.pdb strategy=rigid_body \
        rigid_body.min_number_of_reflections=250
      

      Decreasing this number may increase the convergence radius of rigid body refinement but small numbers may lead to refinement instability.

    7. To change the number of zones for MZ protocol:

      % phenix.refine data.hkl model.pdb strategy=rigid_body \
        rigid_body.number_of_zones=7
      

      Increasing this number may increase the convergence radius of rigid body refinement at the cost of much longer run time.

    8. Rigid body refinement can be combined with individual coordinates refinement in a smart way:

      % phenix.refine data.hkl model.pdb strategy=rigid_body+individual_sites
      

      This will perform 3 macro-cycles of individual coordinate refinement, with rigid body refinement performed only once, at the first macro-cycle. A more powerful combination for coordinate refinement is:

      % phenix.refine data.hkl model.pdb strategy=rigid_body+individual_sites \
        simulated_annealing=true
      

      This will do the same refinement as above plus Simulated Annealing at the second macro-cycle (see more options/examples for running SA in this document).
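
The multiple-zones idea behind the MZ protocol (start from the lowest resolution reflections, then gradually include higher resolution ones) can be sketched in Python as follows. This is only an illustration under stated assumptions: the document does not specify how phenix.refine chooses the zone boundaries, so the linear growth of the reflection count is an assumption of this sketch, the function name resolution_zones is hypothetical, and the d-spacings are synthetic. The two arguments are named after the rigid_body.number_of_zones and rigid_body.min_number_of_reflections keywords used above.

```python
def resolution_zones(d_spacings, number_of_zones, min_number_of_reflections):
    """Return the high-resolution limit (in Angstrom) of each zone.

    The first zone holds the `min_number_of_reflections` lowest resolution
    reflections; the last zone holds all reflections.
    """
    d_sorted = sorted(d_spacings, reverse=True)  # lowest resolution first
    n_total = len(d_sorted)
    n_first = min(min_number_of_reflections, n_total)
    if number_of_zones <= 1 or n_first == n_total:
        return [d_sorted[-1]]  # one zone: all data at once
    limits = []
    for zone in range(number_of_zones):
        # Linear growth of the reflection count (assumption of this sketch).
        count = n_first + (n_total - n_first) * zone // (number_of_zones - 1)
        limits.append(d_sorted[count - 1])
    return limits


# Synthetic d-spacings between 2 and 12 Angstrom, for illustration only.
d_spacings = [2.0 + 0.01 * i for i in range(1000)]
limits = resolution_zones(d_spacings, number_of_zones=4,
                          min_number_of_reflections=100)
for index, limit in enumerate(limits, start=1):
    # In a real run each zone would be followed by a mask and bulk solvent
    # update and a rigid body refinement against reflections up to `limit`.
    print(f"zone {index}: reflections with d >= {limit:.2f} A")
```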

  • Refinement of individual coordinates

    1. Refinement with Simulated Annealing:

      % phenix.refine data.hkl model.pdb simulated_annealing=true \
        strategy=individual_sites
      

      This will perform the Simulated Annealing refinement and LBFGS minimization for the whole model.

      To change the start SA temperature:

      % phenix.refine data.hkl model.pdb simulated_annealing=true \
        strategy=individual_sites simulated_annealing.start_temperature=10000
      

      Since an SA run may take some time, there are several options that define how many times SA is performed per refinement run. To run it only at the first macro-cycle:

      % phenix.refine data.hkl model.pdb simulated_annealing=true \
        strategy=individual_sites simulated_annealing.mode=first
      

      or every macro-cycle:

      % phenix.refine data.hkl model.pdb simulated_annealing=true \
        strategy=individual_sites simulated_annealing.mode=every_macro_cycle
      

      or second and before the last macro-cycle:

      % phenix.refine data.hkl model.pdb simulated_annealing=true \
        strategy=individual_sites simulated_annealing.mode=second_and_before_last
      
    2. Refinement with minimization (whole model):

      % phenix.refine data.hkl model.pdb strategy=individual_sites
      
    3. Refinement with minimization (selected part of model):

      % phenix.refine data.hkl model.pdb strategy=individual_sites \
        sites.individual="chain A"
      

      This will refine the coordinates of atoms in chain A while keeping the coordinates of all other atoms (for example, those in chain B) fixed.

    4. To perform unrestrained refinement of coordinates (usually at ultra-high resolutions):

      % phenix.refine data.hkl model.pdb strategy=individual_sites wc=0
      

      This assigns the contribution of the geometry restraints target to zero. However, it is still calculated for statistics output.

    5. Removing selected geometry restraints

      In the example below:

      % phenix.refine data.hkl model.pdb remove_restraints_selections.params
      

      where remove_restraints_selections.params contains:

      refinement {
        geometry_restraints.remove {
          angles = chain B
          dihedrals = name CA
          chiralities = all
          planarities = None
        }
      }
      

      the following restraints will be removed: angle restraints for all atoms in chain B, dihedral restraints involving CA atoms, and all chirality restraints. All planarity restraints will be preserved.

Refinement of atomic displacement parameters (commonly named as ADP or B-factors)

An ADP in phenix.refine is defined as a sum of three contributions:

Utotal = Ulocal + Utls + Ucryst

where Utotal is the total ADP, Ulocal reflects the local atomic vibration (also called the residual B), Utls reflects the rigid-body displacements of groups of atoms described by TLS parameters, and Ucryst reflects global lattice vibrations. Ucryst is determined and refined at the anisotropic scaling stage.

phenix.refine offers multiple choices for ADP refinement:

  • individual isotropic, anisotropic or mixed ADP;
  • grouped with one isotropic ADP per selected group;
  • TLS.

The types of ADP refinement listed above can be used separately or in any combination (except TLS+individual anisotropic) and can be applied to any selected part of a model. For example, if a model contains six chains A, B, C, D, E and F, then a single refinement run is enough to perform refinement of:

- individual isotropic ADP for atoms in chain A,
- individual anisotropic ADP for atoms in chain B,
- grouped B with one B per all atoms in chain C,
- TLS refinement for chain D,
- TLS and individual isotropic refinement for chain E,
- TLS and grouped B refinement for chain F.

Below we will illustrate this with several examples.

Restraints are used for default ADP refinement of isotropic and anisotropic atoms. Completely unrestrained refinement is possible.

The total refinement target is defined as:

Etotal = wxu_scale * wxu * Exray + wu * Eadp

where: Exray is crystallographic refinement target (least-squares, maximum-likelihood, ...), Eadp is the ADP restraints term, wu is 1.0 by default and used to turn the restraints off, wxu and wxu_scale are defined similarly to coordinates refinement (see Refinement of Coordinates paragraph).

It is important to keep in mind:

If a model was previously refined using TLS, all atoms participating in TLS groups are reported in the output PDB file as anisotropic (they have ANISOU records). If such a PDB file is then submitted for default refinement, all atoms with ANISOU records will be refined as individual anisotropic, which is most likely not desired.

When performing TLS refinement along with individual isotropic refinement of Ulocal, the restraints are applied to Ulocal and not to the total ADP (Ulocal+Utls).

When performing group B or TLS refinement only, no ADP restraints are used.

When ADP refinement is run without using selections, the ADP of all atoms are refined. If selections are used, only the ADP of the selected atoms are refined and the ADP of the rest are unchanged.

If a TLS parametrization is used for a model previously refined with individual anisotropic ADP then normally an increase of R-factors is expected.

phenix.refine will stop if an atom at a special position is included in a TLS group. The solution is to make a new TLS group selection containing no atoms at special positions.

When refining TLS, the output PDB file always has ANISOU records for the atoms involved in TLS groups. The anisotropic B-factor in the ANISOU records is the total B-factor (B_tls + B_individual). The isotropic equivalent B-factor in the ATOM records is one third of the trace of the ANISOU matrix (the mean of its diagonal elements), divided by 10000 and multiplied by 8*pi^2; it represents the isotropic equivalent of the total B-factor (B_tls + B_individual). To obtain the individual B-factors, one needs to compute the TLS component (B_tls) using the TLS records in the PDB file header and then subtract it from the total B-factors (in the ANISOU records).
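
The conversion from ANISOU values to the isotropic equivalent B-factor described above can be written as a short Python sketch. The function names are hypothetical and the ANISOU integers in the example are made up; computing B_tls from the TLS records is not shown, so it is taken here as a given number.

```python
import math


def b_iso_from_anisou(u11, u22, u33):
    """Isotropic equivalent B from the diagonal ANISOU integers (U * 10^4)."""
    mean_u = (u11 + u22 + u33) / 3.0 / 10000.0  # mean diagonal U in A^2
    return 8.0 * math.pi ** 2 * mean_u


def individual_b(b_total, b_tls):
    """Individual (residual) B once the TLS contribution is known."""
    return b_total - b_tls


# Made-up ANISOU diagonal values, for illustration only.
b_total = b_iso_from_anisou(2406, 1892, 1614)
print(round(b_total, 2))
print(round(individual_b(b_total, b_tls=10.0), 2))
```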

  • Refining group isotropic B-factors

    1. One B-factor per residue:

      % phenix.refine data.hkl model.pdb strategy=group_adp
      

      Two B-factors per residue:

      % phenix.refine data.hkl model.pdb strategy=group_adp \
        group_adp_refinement_mode=two_adp_groups_per_residue
      
    2. One isotropic B per selected group of atoms:

      % phenix.refine data.hkl model.pdb strategy=group_adp \
        group_adp_refinement_mode=group_selection \
        adp.group="chain A" adp.group="chain B"
      

      This will refine one isotropic B for chain A and one B for chain B.

    The refinement of group isotropic B-factors in phenix.refine does not change the original distribution of B-factors within the group: the differences between the B-factors of atoms within the group remain constant, and only a single shift added to all atoms of the given group is varied. Atoms with anisotropic ADP are allowed within the group.
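
The effect of a group B refinement on the atoms of one group can be illustrated with a minimal Python sketch: one common shift is added to every B-factor, so the differences between atoms are preserved. The function name and the numbers are hypothetical, and the sketch does not show how the shift itself is determined.

```python
def apply_group_shift(b_factors, shift):
    """Add one common shift to every B-factor of a group."""
    return [b + shift for b in b_factors]


before = [10.0, 12.5, 20.0]          # made-up starting B-factors
after = apply_group_shift(before, 3.0)
print(after)                          # [13.0, 15.5, 23.0]
# Differences within the group are unchanged by the shift.
print(after[1] - after[0] == before[1] - before[0])  # True
```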

  • Refinement of individual ADP (isotropic, anisotropic)

    By default atoms in a PDB file with ANISOU records are refined as anisotropic and atoms without ANISOU records are refined as isotropic. This behavior can be changed with appropriate keywords.

    1. Default refinement of individual ADP:

      % phenix.refine data.hkl model.pdb strategy=individual_adp
      

      Note, atoms in input PDB file with ANISOU records will be refined as anisotropic and those without ANISOU - as isotropic.

    2. Refinement of individual isotropic ADP for a model previously refined as anisotropic or TLS:

      % phenix.refine data.hkl model.pdb strategy=individual_adp \
        adp.individual.isotropic=all
      

      or equivalently:

      % phenix.refine data.hkl model.pdb strategy=individual_adp \
        convert_to_isotropic=true
      

      All anisotropic atoms in input PDB file will be converted to isotropic before the refinement starts. Obviously, this may raise the R-factors.

    3. Refinement of individual anisotropic ADP for a model previously refined as isotropic:

      % phenix.refine data.hkl model.pdb strategy=individual_adp \
        adp.individual.anisotropic="not element H"
      

      This will refine all atoms as anisotropic except hydrogens.

    4. Refinement of mixed model (some atoms are isotropic, some are anisotropic):

      % phenix.refine data.hkl model.pdb strategy=individual_adp \
        adp.individual.anisotropic="chain A and not element H" \
        adp.individual.isotropic="chain B or element H"
      

      In this example the atoms (except hydrogens if any) in chain A will be refined as anisotropic and the atoms in chain B (and hydrogens if any) will be refined as isotropic. Often, the ADP of water and hydrogens are desired to be refined as isotropic while the other atoms - as anisotropic:

      % phenix.refine data.hkl model.pdb strategy=individual_adp \
        adp.individual.anisotropic="not water and not element H" \
        adp.individual.isotropic="water or element H"
      

      Exactly the same command using slightly shorter selection syntax:

      % phenix.refine data.hkl model.pdb strategy=individual_adp \
        adp.individual.anisotropic="not (water or element H)" \
        adp.individual.isotropic="water or element H"
      
    5. To perform unrestrained individual ADP refinement (usually at ultra-high resolutions):

      % phenix.refine data.hkl model.pdb strategy=individual_adp wu=0
      

      This assigns the contribution of the ADP restraints target to zero. However, it is still calculated for statistics output.

  • TLS refinement

    1. Refinement of TLS parameters only (whole model as one TLS group):

      % phenix.refine data.hkl model.pdb strategy=tls
      
    2. Refinement of TLS parameters only (multiple TLS group):

      % phenix.refine data.hkl model.pdb strategy=tls tls_group_selections.params
      

      where, similar to the rigid body or group B-factor refinement, the selection for TLS groups has been made in a user-created parameter file (tls_group_selections.params) as follows:

      refinement.refine.adp {
        tls = chain A
        tls = chain B
      }
      

      Alternatively, the selection for the TLS groups can be made from the command line (see rigid body refinement for an example).

      Note: TLS parameters will be refined only for selected fragments. This makes it possible, for example, to exclude the solvent molecules from the TLS groups.

    3. More complete is to perform combined TLS and individual or grouped isotropic ADP refinement:

      % phenix.refine data.hkl model.pdb strategy=tls+individual_adp
      

      or:

      % phenix.refine data.hkl model.pdb strategy=tls+group_adp
      

      This makes it possible to model the global (TLS) and local (individual) components of the total ADP and also to compensate for the model parts where the TLS parametrization does not fit well.

Occupancy refinement

List of facts about occupancy refinement in phenix.refine:

  • phenix.refine can perform the following types of occupancy refinement:

    • individual occupancy refinement - refinement of one occupancy per atom. In this case the refined occupancy value will be constrained between main.occupancy_min and main.occupancy_max, which is 0 and 1 by default.

    • group (constrained) occupancy refinement. There are two typical uses of this option:

      • refine one occupancy per selected set of atoms, such as a partially occupied ligand or ion. The refined occupancy will be constrained between 0 and 1.
      • refine occupancies of atoms in alternative conformations. For example, if a residue has two alternative conformations, A and B, then there will be one refinable occupancy parameter. All occupancies within conformer A will be equal to each other, and the same holds for conformer B. The occupancies of A and B will add up to one. In general, for a constrained group containing N coupled conformers (for example, N=2 for alternative conformations A and B) there will be (N-1) refinable occupancies. Currently N has to be less than or equal to 4.
  • Occupancy refinement is ON by default. This does not mean that the occupancies of all atoms will be refined. Based on the input PDB file, phenix.refine automatically determines which occupancies to refine. If no user-defined selections are provided, phenix.refine will refine:

    • individual occupancies for all atoms that have partial occupancy values in input PDB file (0<occupancy<1), for example:

      ATOM   1001 AU   AU    500      14.333   3.856  26.301  0.23  7.97
      

      NOTE: occupancies of atoms with starting 0 or 1 occupancy value will not be refined.

    • occupancies of atoms in alternative conformations. Atoms in alternative conformations are determined automatically from the altLoc identifiers (a one-letter code in front of the three-letter residue name in the ATOM record) in the input PDB file, and group constrained occupancy refinement is performed for these atoms. For example:

      ATOM   5085  N  AALA   270      19.772  -6.267  40.250  0.75  5.17
      ATOM   5086  CA AALA   270      19.927  -5.299  41.342  0.75  5.15
      ATOM   5087  CB AALA   270      20.132  -6.108  42.617  0.75  6.92
      ATOM   5088  C  AALA   270      21.058  -4.290  41.124  0.75  5.06
      ATOM   5089  O  AALA   270      20.831  -3.090  41.384  0.75  5.54
      ATOM   5090  N  BALA   270      19.733  -6.282  40.242  0.25  5.04
      ATOM   5091  CA BALA   270      19.592  -5.512  41.492  0.25  5.03
      ATOM   5092  CB BALA   270      19.702  -6.389  42.726  0.25  6.62
      ATOM   5093  C  BALA   270      20.673  -4.426  41.454  0.25  4.77
      ATOM   5094  O  BALA   270      20.381  -3.268  41.761  0.25  5.70
      

      NOTE 1: The starting occupancy values can be arbitrary: they can already be set to reasonable values, like 0.75 and 0.25 above, or they can all be 0 or 1, or any random value. The refined occupancy values will look like those in the above example: identical for all atoms within each conformer (0.75 for conformer A and 0.25 for conformer B), and the sum of the unique occupancies across all conformers will be exactly 1 (=0.25+0.75).

      NOTE 2: If there are two or more consecutive residues in the same chain that have alternative conformations, then their occupancies will be automatically grouped and refined. For example:

      ATOM      0  N  AGLY A   1       2.650   4.221   1.463  0.70 18.35
      ATOM      1  CA AGLY A   1       2.206   4.688   2.763  0.70 17.27
      ATOM      2  C  AGLY A   1       3.296   4.604   3.813  0.70  4.90
      ATOM      3  O  AGLY A   1       4.143   3.711   3.772  0.70 10.35
      ATOM      4  N  BGLY A   1       3.650   6.221   4.463  0.30 18.35
      ATOM      5  CA BGLY A   1       3.206   6.688   5.763  0.30 17.27
      ATOM      6  C  BGLY A   1       4.296   6.604   6.813  0.30  4.90
      ATOM      7  O  BGLY A   1       5.143   5.711   6.772  0.30 10.35
      ATOM      8  N  AALA A   2       3.276   5.538   4.758  0.70  8.03
      ATOM      9  CA AALA A   2       2.260   6.584   4.782  0.70  4.94
      ATOM     10  C  AALA A   2       2.886   7.964   4.606  0.70  8.07
      ATOM     11  O  AALA A   2       3.307   8.594   5.576  0.70  2.01
      ATOM     12  CB AALA A   2       1.465   6.522   6.077  0.70  4.52
      ATOM     13  N  BALA A   2       4.276   7.538   7.758  0.30  8.03
      ATOM     14  CA BALA A   2       3.260   8.584   7.782  0.30  4.94
      ATOM     15  C  BALA A   2       3.886   9.964   7.606  0.30  8.07
      ATOM     16  O  BALA A   2       4.307  10.594   8.576  0.30  2.01
      ATOM     17  CB BALA A   2       2.465   8.522   9.077  0.30  4.52
      

      where the occupancies of conformer A (in residue numbers 1 and 2) are all equal to each other (0.7), the occupancies of conformer B are all equal to each other as well (0.3), and their sum is 1 (0.7+0.3).

      NOTE 3: It is possible to have multiple different (having different residue name) alternative conformers within the same residue number of the same chain:

      ATOM      0  N  APRO A  22       4.915  12.683  -3.102  0.25 11.83
      ATOM      1  CA APRO A  22       6.042  13.429  -2.601  0.25 11.82
      ATOM      2  C  APRO A  22       6.387  13.122  -1.160  0.25 11.66
      ATOM      3  O  APRO A  22       5.480  13.006  -0.345  0.25 12.09
      ATOM      4  CB APRO A  22       5.655  14.896  -2.744  0.25 12.86
      ATOM      5  CG APRO A  22       4.661  14.854  -4.058  0.25 12.66
      ATOM      6  CD APRO A  22       3.957  13.505  -3.910  0.25 12.27
      ATOM      7  CA BSER A  22       6.034  13.399  -2.687  0.30 11.55
      ATOM      8  C  BSER A  22       6.367  13.062  -1.223  0.30 12.92
      ATOM      9  O  BSER A  22       5.412  13.050  -0.345  0.30 11.87
      ATOM     10  CB BSER A  22       5.409  14.835  -2.876  0.30 12.60
      ATOM     11  OG BSER A  22       4.760  15.243  -1.635  0.30 12.11
      ATOM     12  CA CSER A  22       6.112  13.653  -2.656  0.45 12.31
      ATOM     13  C  CSER A  22       6.354  13.275  -1.187  0.45 11.92
      ATOM     14  O  CSER A  22       5.636  12.705  -0.270  0.45 11.77
      ATOM     15  CB CSER A  22       5.605  15.097  -2.687  0.45 13.56
      ATOM     16  OG CSER A  22       6.750  15.771  -2.280  0.45 16.38
      

      If a structure contains a residue or ligand with all equal non-zero occupancies, for example:

      ATOM      6  S   SO4     1       1.302   1.419   1.560  0.70 13.00
      ATOM      7  O1  SO4     1       1.497   1.295   0.118  0.70 11.00
      ATOM      8  O2  SO4     1       1.098   0.095   2.140  0.70 10.00
      ATOM      9  O3  SO4     1       2.481   2.037   2.159  0.70 14.00
      ATOM     10  O4  SO4     1       0.131   2.251   1.823  0.70 12.00
      

      in this case one occupancy per whole ligand will be refined automatically and it will be constrained between 0 and 1. If at least one occupancy is different from the rest:

      ATOM      6  S   SO4     1       1.302   1.419   1.560  0.70 13.00
      ATOM      7  O1  SO4     1       1.497   1.295   0.118  0.21 11.00
      ATOM      8  O2  SO4     1       1.098   0.095   2.140  0.70 10.00
      ATOM      9  O3  SO4     1       2.481   2.037   2.159  0.70 14.00
      ATOM     10  O4  SO4     1       0.131   2.251   1.823  0.70 12.00
      

      then the occupancies of all atoms in the above SO4 ion will be refined individually. And in the example below:

      ATOM      6  S   SO4     1       1.302   1.419   1.560  0.70 13.00
      ATOM      7  O1  SO4     1       1.497   1.295   0.118  0.70 11.00
      ATOM      8  O2  SO4     1       1.098   0.095   2.140  0.00 10.00
      ATOM      9  O3  SO4     1       2.481   2.037   2.159  0.70 14.00
      ATOM     10  O4  SO4     1       0.131   2.251   1.823  0.70 12.00
      

      all occupancies will be refined individually except for atom O2, whose occupancy will stay at zero.

  • A special case is refinement of a partially deuterated structure against neutron data. Such a structure contains exchangeable H/D sites that in phenix.refine are modeled as alternative conformations, for example:

    ATOM     54  N   GLY L   2       2.908 -22.755  25.168  1.00 12.24           N
    ATOM     55  CA  GLY L   2       2.957 -24.115  25.675  1.00 13.12           C
    ATOM     56  C   GLY L   2       2.218 -24.358  26.958  1.00 15.55           C
    ATOM     57  O   GLY L   2       2.343 -25.435  27.503  1.00 13.47           O
    ATOM     58  HA2 GLY L   2       2.590 -24.711  25.004  1.00 15.29           H
    ATOM     59  HA3 GLY L   2       3.885 -24.361  25.816  1.00 16.65           H
    ATOM     60  H  AGLY L   2       2.349 -22.638  24.525  0.25 18.77           H
    ATOM     61  D  BGLY L   2       2.349 -22.638  24.525  0.75 18.77           D
    

    This situation is detected automatically and the occupancies of H and D atoms are refined in such a way so their sum is one.

  • Turning OFF the occupancy refinement can be done by removing the star (*) from the corresponding keyword in strategy = ... *occupancies ....

  • If selections are provided by the user (see examples below) then the occupancy refinement for selected atoms will be performed as well as for those selected automatically (as described above).

  • User defined selections will override those defined by phenix.refine automatically. For example, if an atom is automatically selected for individual occupancy refinement, but the user defined a group of atoms for which one occupancy factor will be refined (group occupancy refinement), and this particular atom belongs to user defined group, then the individual occupancy will not be refined for this atom.

  • The user can withhold occupancy refinement for any atoms that were automatically selected by phenix.refine for occupancy refinement (see the remove_selection keyword in the examples below).

  • The presence of user defined selections for occupancies to be refined is not enough to engage the occupancy refinement. It is important that the occupancy refinement is turned ON using the strategy = keyword.
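The automatic selection rules described above for a residue or ligand without altLoc identifiers can be summarized in a short Python sketch. This is only an illustration of the documented behavior, not the phenix.refine implementation: the function name default_occupancy_mode and its return labels are hypothetical, and it assumes that a group is formed only when all occupancies are equal and strictly between 0 and 1.

```python
def default_occupancy_mode(occupancies):
    """Return (mode, refined_indices) for one residue without altLocs.

    mode is "none", "group" or "individual" (hypothetical labels).
    """
    # Atoms that start at exactly 0 or 1 are never selected automatically.
    partial = [i for i, q in enumerate(occupancies) if 0.0 < q < 1.0]
    if not partial:
        return "none", []
    all_partial = len(partial) == len(occupancies)
    if all_partial and len(occupancies) > 1 and len(set(occupancies)) == 1:
        # Every atom shares one partial value: one occupancy per residue.
        return "group", partial
    # Mixed values: each partial atom is refined on its own.
    return "individual", partial


# The three SO4 examples from the text (atoms S, O1, O2, O3, O4):
print(default_occupancy_mode([0.70, 0.70, 0.70, 0.70, 0.70]))
print(default_occupancy_mode([0.70, 0.21, 0.70, 0.70, 0.70]))
print(default_occupancy_mode([0.70, 0.70, 0.00, 0.70, 0.70]))
```

In the third call the index of atom O2 is absent from the returned list, which mirrors the statement that its occupancy stays at zero.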

Examples:

  1. Running with all default parameters:

    % phenix.refine data.hkl model.pdb
    

    This will refine individual coordinates, individual B-factors (isotropic or anisotropic) and occupancies for atoms in alternative conformations or for atoms having partial occupancies. If there are no such atoms in the input PDB file, then no occupancies will be refined.

  2. Refinement of occupancies only:

    % phenix.refine data.hkl model.pdb strategy=occupancies
    

    This will only refine occupancies for atoms in alternative conformations or for atoms having partial occupancies. If there are no such atoms in the input PDB file, then no occupancies will be refined. Other model parameters, such as B-factors or coordinates, will not be refined (this is the only difference between this run and the one above).

  3. Refine individual occupancies of water molecules (in addition to atoms with partial occupancies and those in alternative conformations, if any):

    % phenix.refine data.hkl model.pdb refine.occupancies.individual="water"
    

    Similar refinement as above where all Zn atoms in chain X will be refined as well:

    % phenix.refine data.hkl model.pdb occupancies.individual="water" \
      occupancies.individual="chain X and element Zn"
    
  4. Complex occupancy refinement strategy (combination of various available occupancy refinement types):

    % phenix.refine data.hkl model.pdb strategy=occupancies occ.params
    

The number of atom selections makes it inconvenient to type them all on the command line. This is why the parameter file occ.params is used; it contains the following lines:

refinement {
  refine {
    occupancies {
      individual = element BR or water
      individual = element Zn
      constrained_group {
        selection = chain A and resseq 1
      }
      constrained_group {
        selection = chain A and resseq 2
        selection = chain A and resseq 3
      }
      constrained_group {
        selection = chain X and resname MAN 
        selection = chain X and resseq 42
        selection = chain X and resseq 121
      }
      remove_selection = chain B and resseq 1 and name O
      remove_selection = chain B and resseq 3 and name O
    }
  }
}

which defines:

  • individual occupancy refinement for all BR, Zn and water atoms;
  • group occupancy refinement for residue number 1 in chain A (as selected with chain A and resseq 1). One occupancy for all atoms in this residue will be refined and it will be constrained between main.occupancy_min and main.occupancy_max, which by default are 0 and 1, respectively.
  • another constrained occupancy group, where the occupancies of atoms in chain A and resseq 2 and chain A and resseq 3 will be coupled. That is all occupancies within chain A and resseq 2 will have the exact same value between 0 and 1, and same for chain A and resseq 3. The sum of occupancies of chain A and resseq 2 and chain A and resseq 3 will be 1.0, making it one constrained group.
  • another constrained group contains three residues (number 42 and 121, and MAN) and their occupancies will be refined similarly as described above.
  • occupancies of atoms O in residues 1 and 3 of chain B will not be refined as requested using remove_selection keyword (even though these atoms have partial occupancies in input PDB file and so they would normally be refined by default).
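The bookkeeping behind a coupled constrained group (N conformers, N-1 refinable occupancies, sum equal to 1) can be illustrated with a small Python sketch. It is not code from phenix.refine: the function name conformer_occupancies is hypothetical, and q_min and q_max merely stand in for main.occupancy_min and main.occupancy_max. It covers only coupled groups with two or more selections, not a single-selection group.

```python
import math


def conformer_occupancies(free_params, q_min=0.0, q_max=1.0):
    """Expand N-1 refinable occupancies into N coupled conformer occupancies."""
    if not free_params:
        raise ValueError("a coupled group needs at least two conformers")
    n_conformers = len(free_params) + 1
    if n_conformers > 4:
        # The text states that N must currently be less than or equal to 4.
        raise ValueError("at most 4 coupled conformers are supported")
    # The last conformer is not refined; it takes whatever is left to reach 1.
    occupancies = list(free_params) + [1.0 - sum(free_params)]
    if any(q < q_min or q > q_max for q in occupancies):
        raise ValueError("occupancy outside the allowed range")
    return occupancies


# Residue 22 from NOTE 3 above: conformers A and B are free, C follows.
occ = conformer_occupancies([0.25, 0.30])
assert math.isclose(occ[2], 0.45)
assert math.isclose(sum(occ), 1.0)
```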

f' and f'' refinement

If the structure contains anomalous scatterers (e.g. Se in a SAD or MAD experiment), and if anomalous data are available, it is possible to refine the dispersive (f') and anomalous (f") scattering contributions (see e.g. Ethan Merritt's tutorial for more information). In phenix.refine, each group of scatterers with common f' and f" values is defined via an anomalous_scatterers scope, e.g.:

refinement.refine.anomalous_scatterers {
  group {
    selection = name BR
    f_prime = 0
    f_double_prime = 0
    refine = *f_prime *f_double_prime
  }
}

NOTE: The refinement of the f' and f" values is carried out only if group_anomalous is included under refine.strategy! Otherwise the values are simply used as specified but not refined. With the parameters above saved in group_anomalous_1.params, the refinement run is:

% phenix.refine model.pdb data_anom.hkl group_anomalous_1.params \
  strategy=individual_sites+individual_adp+group_anomalous

If required, multiple scopes can be specified, one for each unique pair of f' and f" values. These values are assigned to all selected atoms (see below for atom selection details). Often it is possible to start the refinement from zero. If the refinement is not stable, it may be necessary to start from better estimates, or even to fix some values. For example (file group_anomalous_2.params):

refinement.refine.anomalous_scatterers {
  group {
    selection = name BR
    f_prime = -5
    f_double_prime = 2
    refine = f_prime *f_double_prime
  }
}

% phenix.refine model.pdb data_anom.hkl group_anomalous_2.params \
  strategy=individual_sites+individual_adp+group_anomalous

Here f' is fixed at -5 (note the missing * in front of f_prime in the refine definition), and the refinement of f" is initialized at 2.
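When several groups are needed, the parameter file can be generated rather than typed by hand. The Python sketch below only formats text in the syntax shown above; the helper name anomalous_group is hypothetical and is not part of PHENIX.

```python
def anomalous_group(selection, f_prime=0.0, f_double_prime=0.0,
                    refine_f_prime=True, refine_f_double_prime=True):
    """Format one anomalous_scatterers group in the syntax shown above."""
    # A leading star marks a value as refinable; no star keeps it fixed.
    flags = " ".join([
        ("*" if refine_f_prime else "") + "f_prime",
        ("*" if refine_f_double_prime else "") + "f_double_prime",
    ])
    return "\n".join([
        "refinement.refine.anomalous_scatterers {",
        "  group {",
        f"    selection = {selection}",
        f"    f_prime = {f_prime:g}",
        f"    f_double_prime = {f_double_prime:g}",
        f"    refine = {flags}",
        "  }",
        "}",
    ])


# Reproduces the content of group_anomalous_2.params: f' fixed, f" refined.
print(anomalous_group("name BR", -5, 2, refine_f_prime=False))
```

The printed text can be redirected into a .params file and passed to phenix.refine as in the commands above.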

The phenix.form_factor_query command is available for obtaining estimates of f' and f" given an element type and a wavelength, e.g.:

% phenix.form_factor_query element=Br wavelength=0.8

Information from Sasaki table about Br (Z = 35) at 0.8 A
fp:  -1.0333
fdp: 2.9928

Run without arguments for usage information:

% phenix.form_factor_query

Note that if you perform anomalous refinement, you may also want to include a log-likelihood gradient anomalous map (map_type=llg) in the output, as this will show any unmodeled anomalous scattering with greater sensitivity than the conventional anomalous difference map.

Using NCS restraints in refinement

phenix.refine has both torsion-based and Cartesian-based NCS implementations. NCS-related atoms can be identified automatically or be defined by the user.

The default NCS implementation in phenix.refine restrains NCS-related chains in torsion space.

Torsion NCS (default)

Torsion-based NCS restraints use a flexible target function that is smoothly shut off as the difference between related torsions increases, allowing for local differences between NCS-related chains. The default behavior identifies related chains automatically, but users may also specify NCS groups.

Automatic rotamer outlier correction and rotamer consistency checks between NCS-related sidechains are carried out for refinements against data at 3.0 A and better.

  1. Refinement with automatic group determination:

    % phenix.refine data.hkl model.pdb main.ncs=True
    
  2. Refinement with user provided NCS selections:

    Create a torsion_ncs.params file with selections like:

    refinement.ncs.torsion.restraint_group {
      selection = chain A
      selection = chain B
      selection = chain C
    }
    

    Specify torsion_ncs.params as an additional input when running phenix.refine:

    % phenix.refine data.hkl model.pdb main.ncs=True torsion_ncs.params
    

Cartesian NCS (optional)

Cartesian-based NCS restraints are also available in phenix.refine. Atoms in NCS-related chains are restrained to the average xyz position.

Gaps in selected sequences are allowed - a sequence alignment is performed to detect insertions or deletions. We recommend checking the automatically detected or adjusted NCS groups.

  1. Refinement with user provided NCS selections:

    Create a ncs_groups.params file with the NCS selections:

    refinement.ncs.restraint_group {
      reference = chain A and resid 1:4
      selection = chain B and resid 1:3
      selection = chain C
    }
    refinement.ncs.restraint_group {
      reference = chain E
      selection = chain F
    }
    
  2. Specify ncs_groups.params as an additional input when running phenix.refine:

    % phenix.refine data.hkl model.pdb ncs_groups.params \
      main.ncs=True ncs.type=cartesian
    

    This will perform the default refinement round (individual coordinates and B-factors) using NCS restraints on coordinates and B-factors.

    Note: user specified NCS restraints in ncs_groups.params can be modified automatically if a better selection is found. To disable this potential automatic adjustment:

    % phenix.refine data.hkl model.pdb ncs_groups.params main.ncs=True \
      ncs.type=cartesian ncs.find_automatically=False
    

    Automatic detection of NCS groups:

    % phenix.refine data.hkl model.pdb main.ncs=True ncs.type=cartesian
    

    This will perform the default refinement round (individual coordinates and B-factors) using NCS restraints automatically created based on input PDB file.

Using secondary structure restraints

At low resolutions it is often beneficial to restrain hydrogen bonding distances in helices, sheets, and nucleic acid base pairs. These can be used with or without explicit hydrogen atoms. Appropriate atom selections will be detected automatically if none are provided by the user, but in most cases careful manual annotation will probably yield better results, especially if the starting model is of low quality. To turn on the additional restraints a single extra parameter is sufficient:

% phenix.refine data.hkl model.pdb main.secondary_structure_restraints=True

You can also generate starting parameters for secondary structure restraints using a standalone utility:

% phenix.secondary_structure_restraints model.pdb

This will print a set of parameters suitable for use in phenix.refine, which may be edited to correct errors or add undetected groups.

Like other restraints, the hydrogen bond distances have adjustable sigma and target values; these are defined in the hydrogen_bonding scope. The default potential is labeled 'simple', and mimics the covalent bond restraints. The sigma defaults to 0.05 Angstrom; the targets will be different depending on whether explicit hydrogens are used or not (defaults are 1.975A or 2.9A):

% phenix.refine data.hkl model.pdb main.secondary_structure_restraints=True \
  hydrogen_bonding.distance_ideal_h_o=2.0 \
  hydrogen_bonding.simple.sigma=0.04

% phenix.refine data.hkl model.pdb main.secondary_structure_restraints=True \
  hydrogen_bonding.distance_ideal_n_o=3.0 \
  hydrogen_bonding.simple.slack=0.1

A relatively strict outlier cutoff is applied by default, to avoid restraining incorrectly annotated residues. Any bonds longer than the outlier cutoffs (2.5A for H-O, or 3.5A for N-O) will be weighted down to zero during refinement (they may contribute later if the bond length decreases). If you are certain of your annotations, you can increase or remove the cutoff:

% phenix.refine data.hkl model.pdb main.secondary_structure_restraints=True \
  hydrogen_bonding.distance_cut_h_o=3.0

% phenix.refine data.hkl model.pdb main.secondary_structure_restraints=True \
  h_bond_restraints.remove_outliers=False
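To make the roles of the target distance, sigma, slack and outlier cutoff concrete, here is a minimal Python sketch of a bond-like harmonic restraint with a hard cutoff. It is an assumption-laden illustration, not the PHENIX implementation: the function name hbond_residual is hypothetical, the slack default of 0 is assumed, and the real weighting scheme may differ. The other defaults are the N-O values quoted in the text (target 2.9A, sigma 0.05A, cutoff 3.5A).

```python
def hbond_residual(distance, ideal=2.9, sigma=0.05, slack=0.0, cutoff=3.5):
    """Harmonic penalty for one hydrogen bond; zero beyond the outlier cutoff."""
    if distance > cutoff:
        # Outlier: the restraint is weighted down to zero.
        return 0.0
    # Deviations smaller than the slack are not penalized at all.
    deviation = max(0.0, abs(distance - ideal) - slack)
    return (deviation / sigma) ** 2


print(hbond_residual(3.0))             # 0.1A from target, two sigmas away
print(hbond_residual(3.0, slack=0.2))  # inside the slack: no penalty
print(hbond_residual(3.6))             # beyond the 3.5A cutoff: ignored
```

A smaller sigma makes the penalty grow faster with the deviation, which is why lowering hydrogen_bonding.simple.sigma tightens the restraints.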

Using a reference model in refinement

phenix.refine can be given a reference model that is used to steer refinement of the working model. This technique is advantageous in cases where the working data set is low resolution, but there is a known related structure solved at higher resolution. The higher resolution reference model is used to generate a set of dihedral restraints that are applied to each matching dihedral in the working model.

Reference chains are matched to working chains automatically, and sequences need not match exactly.

The default parameters are a good starting point:

% phenix.refine data.hkl model.pdb main.reference_model_restraints=True \
  reference_model.file=reference.pdb

The default sigma value for these reference dihedral restraints is 1.0 degrees. To increase the strength of these restraints, select a smaller sigma:

% phenix.refine data.hkl model.pdb main.reference_model_restraints=True \
  reference_model.file=reference.pdb reference_model.sigma=0.5

To decrease the strength of the restraints, select a larger sigma:

% phenix.refine data.hkl model.pdb main.reference_model_restraints=True \
  reference_model.file=reference.pdb reference_model.sigma=2.0

The reference restraints have a limit parameter which turns them off when the angle in the working model differs from the reference by an amount greater than limit. The default value is 15 degrees, but may be user-defined:

% phenix.refine data.hkl model.pdb main.reference_model_restraints=True \
  reference_model.file=reference.pdb reference_model.limit=10
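The interplay of sigma and limit can be illustrated with a small Python sketch. It assumes a plain harmonic penalty with a hard cutoff at limit; the target function actually used by phenix.refine is not described here and may be smoother. The function names are hypothetical, and the defaults are the values quoted above (sigma 1.0 degrees, limit 15 degrees).

```python
def dihedral_difference(model_deg, reference_deg):
    """Signed difference between two angles, wrapped into [-180, 180) degrees."""
    return (model_deg - reference_deg + 180.0) % 360.0 - 180.0


def reference_residual(model_deg, reference_deg, sigma=1.0, limit=15.0):
    """Harmonic penalty for one dihedral; switched off beyond the limit."""
    delta = dihedral_difference(model_deg, reference_deg)
    if abs(delta) > limit:
        # The working model differs too much: the restraint is turned off.
        return 0.0
    return (delta / sigma) ** 2


print(reference_residual(62.0, 60.0))               # default sigma
print(reference_residual(62.0, 60.0, sigma=0.5))    # stronger restraint
print(reference_residual(72.0, 60.0, limit=10.0))   # beyond limit: zero
print(dihedral_difference(179.0, -179.0))           # wraps to -2, not 358
```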

For an optimal set of protein restraints, rotamer outliers in the working model that have rotameric counterparts in the reference model are automatically corrected to the rotamer from the reference model prior to refinement. In practice this step almost always improves the final model, but can be turned off if desired:

% phenix.refine data.hkl model.pdb main.reference_model_restraints=True \
  reference_model.file=reference.pdb reference_model.fix_outliers=False

Selections may also be used with reference_model restraints. Selections are useful in cases where multiple chains in the working model should be restrained to the same reference chain, where the model or reference has insertions that change the register, where only part of a chain should be restrained, etc.

To specify selections, create a reference.params file with selections like:

refinement.reference_model.reference_group {
     reference = chain A and resseq 2:119
     selection = chain A and resseq 2:119
   }
refinement.reference_model.reference_group {
     reference = chain A and resseq 130:134
     selection = chain A and resseq 120:124
   }
refinement.reference_model.reference_group {
     reference = chain A
     selection = chain B
   }

Specify reference.params as an additional input when running phenix.refine:

% phenix.refine data.hkl model.pdb main.reference_model_restraints=True \
  reference_model.file=reference.pdb reference.params

Each selection (both reference and selection entries as above) may only specify one chainID and/or one resseq range.

Multiple reference models may also be specified in cases where a working complex has reference structures from different coordinate files. To specify multiple reference model input files, the command line or parameter file should contain:

refinement.reference_model.file = reference_A.pdb
refinement.reference_model.file = reference_B.pdb

Reference chain/model chain pairs are determined automatically, and details are written to the .log file and .eff file. If you want to specify your own matching, include a parameter file that contains:

refinement.reference_model.file = reference_A.pdb
refinement.reference_model.file = reference_B.pdb
refinement.reference_model.reference_group {
     reference = chain A
     selection = chain A
     file_name = reference_A.pdb
   }
refinement.reference_model.reference_group {
     reference = chain A
     selection = chain B
     file_name = reference_B.pdb
   }

The refinement.reference_model.reference_group.file_name parameter is only required when more than one reference file is used. This parameter allows the reference model restraint generation to disambiguate between reference files that contain chains with the same chainID.

Automatic Asn/Gln/His corrections

Asn, Gln, and His residues can often be fit favorably to the data in two orientations, related by a 180 degree rotation. In many cases, however, only one of these orientations is sterically and electrostatically favorable. phenix.refine uses Reduce to identify Asn, Gln, and His residues that should be flipped, and then flips them automatically.

To turn this feature on, run:

% phenix.refine data.hkl model.pdb main.nqh_flips=True

Water picking

phenix.refine has a very efficient and fully automated protocol for water picking and refinement. A single run of phenix.refine is normally enough to locate waters, refine them, select the good ones, add new ones and refine again, repeating the whole process multiple times.

Normally, the default parameter settings are good for most cases:

% phenix.refine data.hkl model.pdb ordered_solvent=true

This will perform new water picking, analysis of existing waters and refinement of individual coordinates and B-factors for both the macromolecule and the waters. Several cycles will be performed, allowing spurious waters to be sorted out and well placed ones to be refined.

Water picking can be combined with all other protocols, like simulated annealing, TLS refinement, etc. Some useful commands are:

  1. Perform water picking every macro-cycle.

    By default, water picking starts after half of the macro-cycles are done. To pick waters in every macro-cycle instead:

    % phenix.refine data.hkl model.pdb ordered_solvent=true \
      ordered_solvent.mode=every_macro_cycle
    
  2. Remove water only (based on specified criteria):

    % phenix.refine data.hkl model.pdb ordered_solvent=true \
      ordered_solvent.mode=filter_only
    
  3. The following run illustrates the use of some important parameters:

    % phenix.refine data.hkl model.pdb ordered_solvent=true solvent.params
    

    where the parameter file solvent.params contains:

    refinement {
      ordered_solvent {
        low_resolution = 2.8
        b_iso_min = 1.0
        b_iso_max = 50.0
        b_iso = 25.0
        primary_map_type = mFobs-DFmodel
        primary_map_cutoff = 3.0
        secondary_map_and_map_cc_filter
        {
          cc_map_2_type = 2mFobs-DFmodel
        }
      }
      peak_search {
        map_next_to_model {
          min_model_peak_dist = 1.8
          max_model_peak_dist = 6.0
          min_peak_peak_dist = 1.8
        }
      }
    }
    

    This will skip water picking if the resolution of the data is lower than 2.8A. It will remove waters with B < 1.0 or B > 50.0 A**2, with occupancy different from 1, or with peak height in the mFo-DFc map lower than 3 sigma. It will not select a new water, or will remove an existing one, if the water-water or water-macromolecule distance is less than 1.8A or the water-macromolecule distance is greater than 6.0A. The initial occupancies and B-factors of newly placed waters will be 1.0 and 25.0, respectively. If b_iso = None, then b_iso will be the mean atomic B-factor.
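The filtering criteria listed above can be written out as a single predicate. The Python sketch below restates those criteria only; it is not the phenix.refine code, and the function name keep_water and its argument names are hypothetical. The keyword defaults copy the values from solvent.params.

```python
def keep_water(b_iso, occupancy, peak_height_sigma, nearest_model_dist,
               nearest_water_dist=None,
               b_iso_min=1.0, b_iso_max=50.0, primary_map_cutoff=3.0,
               min_model_peak_dist=1.8, max_model_peak_dist=6.0,
               min_peak_peak_dist=1.8):
    """Return True if a water passes all the filters described above."""
    if not (b_iso_min <= b_iso <= b_iso_max):
        return False          # B-factor outside the allowed range
    if occupancy != 1.0:
        return False          # occupancy different from 1
    if peak_height_sigma < primary_map_cutoff:
        return False          # mFo-DFc peak too weak
    if not (min_model_peak_dist <= nearest_model_dist <= max_model_peak_dist):
        return False          # too close to, or too far from, the macromolecule
    if nearest_water_dist is not None and nearest_water_dist < min_peak_peak_dist:
        return False          # too close to another water
    return True


print(keep_water(b_iso=22.0, occupancy=1.0, peak_height_sigma=4.1,
                 nearest_model_dist=2.9))   # passes every filter
print(keep_water(b_iso=61.0, occupancy=1.0, peak_height_sigma=4.1,
                 nearest_model_dist=2.9))   # rejected: B > b_iso_max
```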

Hydrogens in refinement

Depending on data type (X-ray or neutron), data quality (resolution, completeness) phenix.refine offers different options for parametrization of hydrogen atoms:

  • riding model.
  • complete refinement of H (H atoms will be refined as other atoms in the model).

Using the riding model does not add refinable parameters, since the position of a hydrogen atom H in an X-H bond is recalculated from the current position of atom X. Also, the H atom inherits the occupancy of atom X and its B-factor. Sometimes the B-factor of the H atom is the product of the B-factor of atom X and a scale from 1 to 1.5. The riding model should be used to parametrize H atoms at almost all resolutions in X-ray refinement. An exception can be subatomic resolution (~0.7A and higher), where the hydrogen parameters can be refined individually.

Although the contribution of hydrogen atoms to X-ray scattering is weak, the H atoms are still present in real structures irrespective of the data quality. Including them as a riding model at any resolution makes other model atoms aware of their positions and hence prevents non-physical (bad) contacts at no cost in terms of refinable parameters (= no risk of overfitting).

The scattering contribution of hydrogens is accounted for by default; however, there is a parameter to turn it off:

% phenix.refine model.pdb data.hkl hydrogens.contribute_to_f_calc=false

If neutron data are used then the parameters of H atoms should always be refined individually, except in cases where data resolution and/or completeness are poor. In that case the riding model can be used. If a partially deuterated structure is used in refinement then the constrained occupancies of exchangeable H/D sites are refined so that they add up to 1.

Currently, phenix.refine does not add H atoms (except in a few cases mentioned below), so it is necessary to use a different program from the PHENIX suite (ReadySet!) to add H, D or H/D atoms to your structure. Internally ReadySet! uses the Reduce program to add hydrogens to macromolecules (protein, DNA/RNA) and it uses its own resources to add hydrogens to ligands or water. Hydrogens are added at their ideal geometrical positions.

If a structure contains a ligand unknown to phenix.refine, ReadySet! will create a library CIF file which will include the definitions for all newly added hydrogens.

phenix.refine can build H or D atoms for water molecules only. To do so it uses the residual density map, mFo-DFc. This option is normally used with neutron data at relatively high resolution (~2.0...2.5A and higher) or at subatomic X-ray resolution:

% phenix.refine model.pdb data.hkl main.find_and_add_hydrogens=true

Although this is not thoroughly studied yet, it seems that hydrogens should not be included into NCS or TLS groups. This is why they are automatically excluded from them. However, if NCS selections are created manually and the structure contains H atoms, it might be a good idea to add and not (element H or element D) to all selection strings.

Below are some useful commands:

  1. Add hydrogens:

    % phenix.ready_set model.pdb
    
  2. Add deuteriums:

    % phenix.ready_set model.pdb perdeuterate=true
    
  3. Add H and exchangeable H/D:

    % phenix.ready_set model.pdb neutron_exchange_hydrogens=true
    
  4. Add H to water:

    % phenix.ready_set model.pdb add_h_to_water=true
    
  5. Once hydrogens are added to a model, by default they will be refined using the riding model:

    % phenix.refine model.pdb data.hkl
    

    It is possible to refine individual parameters for H atoms (if neutron data is used or at ultra-high resolution):

    % phenix.refine model.pdb data.hkl hydrogens.refine=individual
    
  6. To refine individual coordinates and ADP of H atoms:

    % phenix.refine model.pdb data.hkl hydrogens.refine=individual
    
  7. To remove hydrogens from a model:

    % phenix.pdbtools model.pdb remove="element H or element D"
    

    Alternatively, the Reduce program can be used for this:

    % phenix.reduce model_h.pdb -trim > model_noH.pdb
    

    We strongly recommend not removing hydrogen atoms after refinement, since doing so makes the refinement statistics (R-factors, etc.) impossible to reproduce without repeating exactly the same refinement protocol.

  8. Yet another option to add hydrogens (rarely used in practice):

    % phenix.elbow --final-geometry=model.pdb --residue=MAN --output=model_h
    

    Output PDB file called model_h.pdb will contain the original ligand MAN with all hydrogen atoms added.

Refinement using twinned data

phenix.refine can handle the refinement of hemihedrally twinned data (two twin domains). Least-squares twin refinement can be carried out using the following command-line instruction:

% phenix.refine data.hkl model.pdb twin_law="-k,-h,-l"

The twin law (in this case -k,-h,-l) can be obtained from phenix.xtriage. If more than one twin law is possible for the given unit cell and space group, phenix.twin_map_utils might give clues as to which twin law is the most likely candidate to be used in refinement.

Correcting maps for anisotropy might be useful:

% phenix.refine data.hkl model.pdb twin_law="-k,-h,-l" \
  detwin.map_types.aniso_correct=true

The detwinning mode is auto by default: it will perform algebraic detwinning for twin fraction below 40%, and detwinning using proportionality rules (SHELXL style) for fractions above 40%.
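For orientation, algebraic detwinning of a hemihedral twin inverts the standard two-domain mixing relation I1_obs = (1-a)*I1 + a*I2, I2_obs = a*I1 + (1-a)*I2, where a is the twin fraction. The Python sketch below shows that inversion and the mode switch at 40%. It is a textbook illustration rather than the phenix.refine code; the function names are hypothetical, and the behavior at exactly 40% is an assumption since the text only describes fractions below and above that value.

```python
import math


def twin_intensities(i1, i2, alpha):
    """Mix two twin-related true intensities with twin fraction alpha."""
    return (1 - alpha) * i1 + alpha * i2, alpha * i1 + (1 - alpha) * i2


def detwin_algebraic(i1_obs, i2_obs, alpha):
    """Recover the true intensities from a twin-related observed pair."""
    if not 0.0 <= alpha < 0.5:
        raise ValueError("algebraic detwinning needs 0 <= alpha < 0.5")
    # The denominator goes to zero as alpha approaches 0.5, which is why
    # this approach becomes unstable for nearly perfect twins.
    denom = 1 - 2 * alpha
    i1 = ((1 - alpha) * i1_obs - alpha * i2_obs) / denom
    i2 = ((1 - alpha) * i2_obs - alpha * i1_obs) / denom
    return i1, i2


def detwin_mode(alpha, threshold=0.40):
    """Mode chosen in auto mode according to the text above."""
    return "algebraic" if alpha < threshold else "proportional"


obs1, obs2 = twin_intensities(100.0, 20.0, alpha=0.2)
i1, i2 = detwin_algebraic(obs1, obs2, alpha=0.2)
assert math.isclose(i1, 100.0) and math.isclose(i2, 20.0)
```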

An important point to stress is that phenix.refine will only deal properly with twinning that involves two twin domains.

Neutron and joint X-ray and neutron refinement

Refinement using neutron data requires H and/or D atoms to be present in the model. Use the ReadySet! program to add all H, D or H/D atoms. See the "Hydrogens in refinement" section for details.

  1. Running refinement with neutron data only:

    % phenix.refine data.hkl model.pdb main.scattering_table=neutron
    

    This tells phenix.refine that the data in the data.hkl file come from a neutron scattering experiment, so the appropriate scattering factors will be used in all calculations. All the examples and phenix.refine functionality presented in this document are valid and compatible with neutron data.

  2. Using X-ray and neutron data simultaneously (joint X/N refinement).

    phenix.refine allows simultaneous use of both data sets, X-ray and neutron. The data sets may have different numbers of reflections and may be collected at different resolutions.

    The only requirement (not enforced by the program; it is the user's responsibility) is that both data sets be collected at the same temperature from the same crystal, or from crystals grown in identical conditions and having identical space groups and unit cell parameters:

    % phenix.refine model.pdb data_xray.hkl neutron_data.file_name=data_neutron.hkl \
      input.xray_data.labels=FOBSx input.neutron_data.labels=FOBSn
    

Optimizing target weights

phenix.refine uses an automatic procedure to determine the weights between the X-ray target and the stereochemistry or ADP restraints. To optimize these weights (that is, to find those resulting in the lowest R-free):

% phenix.refine data.hkl model.pdb optimize_xyz_weight=true optimize_adp_weight=true

where optimize_xyz_weight turns on the optimization of the X-ray/stereochemistry weight and optimize_adp_weight turns on the optimization of the X-ray/ADP weight. Note that this can be very slow, since the procedure involves a grid search over an array of candidate weights. It may be a good idea to run this overnight for a final model tune-up.
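The idea of a grid search over candidate weights can be sketched as follows. This is a simplified illustration only: refine_and_score is a hypothetical callable standing in for a full trial refinement, the toy scoring function is made up, and the actual procedure in phenix.refine may apply additional selection criteria.

```python
def pick_best_weight(candidates, refine_and_score):
    """Return (weight, r_free) for the candidate giving the lowest R-free.

    refine_and_score is a hypothetical callable: it runs a trial refinement
    with the given weight and returns the resulting R-free.
    """
    best = None
    for weight in candidates:
        r_free = refine_and_score(weight)
        if best is None or r_free < best[1]:
            best = (weight, r_free)
    return best


# Toy stand-in for a real refinement: R-free is lowest near weight = 2.0.
def toy_r_free(weight):
    return 0.25 + 0.01 * (weight - 2.0) ** 2


best_weight, best_r_free = pick_best_weight([0.5, 1.0, 2.0, 4.0, 8.0], toy_r_free)
```

Each candidate costs one trial refinement, which is why the search is slow and why it benefits from being run on several processors.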

Refinement at high resolution (higher than approx. 1.0 Angstrom)

Guidelines for structure refinement at high resolution:

  • make sure the model contains hydrogen atoms. If not, phenix.reduce can be used to add them:

    % phenix.reduce model.pdb > model_h.pdb
    

    By default, phenix.refine refines the positions of H atoms using a riding model (each H atom exactly follows the atom it is attached to). Note that phenix.refine can also refine individual coordinates of H atoms (useful for small molecules at ultra-high resolution or for refinement against neutron data). This is governed by the hydrogens.refine = individual *riding keyword, and the default is the riding model. The hydrogens.refine_adp keyword defines how the B-factors of hydrogens are refined (the default is to refine one group B for all H atoms). At high resolution one should definitely try the one_b_per_molecule or even the individual choice (resolution permitting). A similar strategy should be used for refinement of H occupancies, using the hydrogens.refine_occupancies keyword.

  • most of the atoms should be refined with anisotropic ADP. Exceptions could be model parts with high B-factors, atoms in alternative conformations, hydrogens and solvent molecules. However, at resolutions higher than 1.0A it is worth trying to refine solvent with anisotropic ADP.

  • it is a good idea to constantly monitor the existing solvent molecules and check for new ones by using ordered_solvent=true keyword. If it's decided to refine waters with anisotropic ADP then make sure that the newly added ones are also anisotropic; use ordered_solvent.new_solvent=anisotropic (default is isotropic). One can also ask phenix.refine to refine occupancies of water: ordered_solvent.refine_occupancies=true (default is False).

  • at high resolution, alternative conformations can be visible for more than 20% of residues. phenix.refine automatically recognizes atoms in alternative conformations (based on PDB records) and by default does constrained refinement of occupancies for these atoms. Please note that phenix.refine does not build or create fragments in alternative conformations; if any are found in the structure, they must be properly defined in the input PDB file (using conformer identifiers).

  • the default weights for stereochemical and ADP restraints are most likely too tight at this resolution, so the corresponding values probably need to be relaxed. Use wxc_scale and wxu_scale for this; lower values, like 1/2, 1/3, 1/4, etc. of the default ones should be tried. phenix.refine can optimize these values automatically (optimize_xyz_weight=True and optimize_adp_weight=True); however, this is a very slow task, so it may be considered for an overnight run or even longer. At ultra-high resolutions (approx. 0.8A or higher) a completely unrestrained refinement should definitely be tried for well ordered parts of the model (single conformations, low B-factors).

  • at ultra-high resolution the residual maps show the electron density redistribution due to bond formation as density peaks at interatomic bonds. phenix.refine has specific tools to model this density, called IAS models (Afonine et al, Acta Cryst. (2007). D63, 1194-1197).

This example illustrates most of the above points:

% phenix.refine model_h.pdb data.hkl high_res.params

where the file high_res.params contains the following lines (for more parameters under each scope, see the complete list of parameters):

refinement.main {
  number_of_macro_cycles = 5
  ordered_solvent=true
}
refinement.refine {
  adp {
    individual {
      isotropic = element H
      anisotropic = not element H
    }
  }
}
refinement.target_weights {
  wxc_scale = 0.25
  wxu_scale = 0.3
}
refinement {
  ordered_solvent {
    mode = auto filter_only *every_macro_cycle
    new_solvent =  isotropic *anisotropic
    refine_occupancies = True
  }
}

In the example above phenix.refine will perform 5 macro-cycles with an ordered solvent update (add/remove) at every macro-cycle; all atoms including newly added waters will be refined with anisotropic B-factors (except hydrogens); the riding model will be used for positional refinement of H atoms; one occupancy and one isotropic B-factor will be refined for all hydrogens within a residue; occupancies of waters will be refined as well; and the default stereochemistry and ADP restraint weights are scaled down by factors of 0.25 and 0.3, respectively. If the starting model is far from the "final" one, more macro-cycles than the 5 used in this example may be required.

Examples of frequently used refinement protocols, common problems

  1. Starting refinement from high R-factors:

    % phenix.refine data.hkl model.pdb ordered_solvent=true main.number_of_macro_cycles=10 \
      simulated_annealing=true strategy=rigid_body+individual_sites+individual_adp
    

Depending on data resolution, refinement of individual ADP may be replaced with grouped B refinement:

% phenix.refine data.hkl model.pdb ordered_solvent=true simulated_annealing=true \
  strategy=rigid_body+individual_sites+group_adp main.number_of_macro_cycles=10

Adding TLS refinement may be a good idea. Note that, unlike other programs, phenix.refine does not require a "good model" for TLS refinement; TLS refinement is always stable in phenix.refine (please report if you notice otherwise):

% phenix.refine data.hkl model.pdb ordered_solvent=true simulated_annealing=true \
  strategy=rigid_body+individual_sites+individual_adp+tls main.number_of_macro_cycles=10

If NCS is present, one can use it:

% phenix.refine data.hkl model.pdb ordered_solvent=true simulated_annealing=true \
  strategy=rigid_body+individual_sites+individual_adp+tls main.ncs=true \
  main.number_of_macro_cycles=10 tls_group_selections.params \
  rigid_body_selections.params

where tls_group_selections.params and rigid_body_selections.params are the files containing the TLS and rigid body group selections; NCS will be determined automatically from the input PDB file. See this document for details on how to specify these selections.

Note: in the four examples above we changed the number of refinement macro-cycles from the default 3 to 10, since a starting model with high R-factors most likely requires more cycles to become a good one. Also in these examples, rigid body refinement will be run only once, at the first macro-cycle; water picking will start after half of the macro-cycles are done (after the 5th); and SA will be done only twice, at the first macro-cycle and before the last one. Even though it is requested, water picking may not be performed if the resolution is too low. All these default behaviors can be changed: see the parameter help for more details.

The last command is too long to type comfortably on the command line. See this document for an example of how to put the parameters in a file so that the command becomes:

% phenix.refine data.hkl model.pdb custom_par_1.params

  2. Refinement at "higher than medium" resolution - getting anisotropic.

When refining at higher resolution, one may consider:

  • At resolutions around 1.8 ... 1.7 A or higher it is a good idea to try refinement of anisotropic ADP for atoms in well ordered parts of the model. Well ordered parts can be identified by relatively small isotropic B-factors, ~5-20 A**2 or so.
  • The riding model for H atoms should be used.
  • Loosening stereochemistry and ADP restraints.
  • Re-think using NCS (if present): there may be enough data to make NCS restraints unnecessary. Try both, with and without NCS, and decide the strategy based on the R-free values.

Supposing the H atoms were added to the model, below is an example of what one may want to do at higher resolution:

% phenix.refine data.hkl model.pdb adp.individual.anisotropic="resid 1:2 and not element H" \
  adp.individual.isotropic="not (resid 1:2 and not element H)" wxc_scale=2 wxu_scale=2

In the command above phenix.refine will refine the ADP of atoms in residues 1 to 2 as anisotropic and the rest (including all H atoms) as isotropic; the X-ray target contribution is increased for both coordinate and ADP refinement. IMPORTANT: note the selection used in the above command: when selecting atoms in residues 1 and 2 to be refined as anisotropic, one needs to exclude hydrogens, which should be refined as isotropic.

  3. Stereochemistry looks too tightly / loosely restrained, or gap between R-free and R-work seems too big: playing with restraints contribution.

    Although the automatic calculation of the weight between the X-ray and stereochemistry or ADP restraint targets is good for most cases, it may happen that the rms deviations from ideal bond lengths or angles look too tight or too loose (depending on resolution), or that the difference between R-work and R-free is too big (significantly bigger than approx. 5%). In such cases one definitely needs to try loosening or tightening the restraints. Here is how to do this for coordinate refinement:

    % phenix.refine data.hkl model.pdb wxc_scale=5
    

    The default value for wxc_scale is 0.5. Increasing wxc_scale will make the X-ray target contribution greater and restraints looser. Note: wxc_scale=0 will completely exclude the experimental data from the refinement resulting in idealization of the stereochemistry. For stereochemistry idealization use the separate command:

    % phenix.geometry_minimization model.pdb
    

    To see the options type:

    % phenix.geometry_minimization --help
    

    To play with ADP restraints contribution:

    % phenix.refine data.hkl model.pdb wxu_scale=3
    

    The default value for wxu_scale is 1.0. Increasing wxu_scale will make the X-ray target contribution greater and therefore the B-factors restraints weaker.

    Also, one can completely ignore the automatically determined weights (for both, coordinates and ADP refinement) and use specific values instead:

    % phenix.refine data.hkl model.pdb fix_wxc=15.0
    

    The refinement target will be: Etotal = 15.0 * Exray + Egeom

    Similarly for ADP refinement:

    % phenix.refine data.hkl model.pdb fix_wxu=25.0
    

    The refinement target will be: Etotal = 25.0 * Exray + Eadp

  4. Having an item unknown to phenix.refine in the PDB file (novel ligand, etc.).

    phenix.refine uses the CCP4 Monomer Library as the source of stereochemical information for building geometry restraints and reporting statistics.

    If phenix.refine is unable to match an item in the input PDB file against the Monomer Library, it will stop with a "Sorry" message explaining what to do and listing the problem atoms. If this happens, it is necessary to obtain a cif file (a parameter file describing the unknown molecule), either by making it manually or by having the eLBOW program generate it:

    % phenix.elbow model.pdb --do-all --output=all_ligands
    

    This will ask eLBOW to inspect the model.pdb file, find all unknown items in it and create one cif file for them, all_ligands.cif. Alternatively, one can specify a three-letter name for the unknown residue:

    % phenix.elbow model.pdb --residue=MAN --output=man
    

    Once the cif file is created, the new run of phenix.refine will be:

    % phenix.refine model.pdb data.hkl man.cif
    

    Consult eLBOW documentation for more details.

Useful options

Changing the number of refinement cycles and minimizer iterations

% phenix.refine data.hkl model.pdb main.number_of_macro_cycles=5 \
  main.max_number_of_iterations=20

Parallelizing for multi-core systems

% phenix.refine data.hkl model.pdb optimize_xyz_weight=True nproc=4

The nproc parameter instructs phenix.refine to use multiple processors for several highly parallel routines. Currently this applies to the following procedures:

  • Automatic TLS identification (tls.find_automatically=True)
  • Bulk solvent mask optimization (optimize_mask=True, on by default)
  • XYZ restraints weight optimization (optimize_xyz_weight=True)
  • ADP restraints weight optimization (optimize_adp_weight=True)

When used with the default settings, nproc will have a minimal effect on overall runtime, but when the optimization grid searches are enabled, a speedup of 4-5x is possible. Values of nproc above 18 are unlikely to yield further speed improvement.

Note: this parallelization method is not compatible with OpenMP, and is limited to Mac and Linux systems. (It is, however, available in the Phenix GUI.)

Creating R-free flags (if not present in the input reflection files)

% phenix.refine data.hkl model.pdb xray_data.r_free_flags.generate=True

It is important to understand that reflections selected for the test set must never be used in any refinement of any parameters. If the newly selected test reflections were used in refinement before, then the corresponding R-free statistics will be wrong. In such a case a "refinement memory" removal procedure must be applied to recover proper statistics.

To change the default maximal number of test flags to be generated and the fraction:

% phenix.refine data.hkl model.pdb xray_data.r_free_flags.generate=True \
  xray_data.r_free_flags.fraction=0.05 xray_data.r_free_flags.max_free=500
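How a fraction and a maximal count might interact can be sketched as below. This is an assumption made for illustration: the test-set size is taken as the smaller of fraction times the number of reflections and max_free, and flags are assigned purely at random. The function name is hypothetical, and the selection actually performed by phenix.refine may differ (for example, it may take resolution or symmetry into account).

```python
import random


def pick_test_set(n_reflections, fraction=0.05, max_free=500, seed=0):
    """Return a list of booleans; True marks a test (free) reflection.

    Assumption: the test-set size is capped by max_free.
    """
    n_test = min(max_free, int(round(fraction * n_reflections)))
    rng = random.Random(seed)
    test_indices = set(rng.sample(range(n_reflections), n_test))
    return [i in test_indices for i in range(n_reflections)]


# 5% of 20000 would be 1000 reflections, but the cap limits it to 500.
flags_large = pick_test_set(20000, fraction=0.05, max_free=500)
# 5% of 2000 is 100 reflections, below the cap.
flags_small = pick_test_set(2000, fraction=0.05, max_free=500)
```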

Specify the name for output files

% phenix.refine data.hkl model.pdb output.prefix=lysozyme

Reflection output

At the end of refinement a file with Fobs, Fmodel, Fcalc, Fmask, FOM, R-free_flags can be written out (in MTZ format):

% phenix.refine data.hkl model.pdb export_final_f_model=true

Note: Fmodel is the total model structure factor including all scales:

Fmodel = scale_k1 * exp(-h*U_overall*ht) * (Fcalc + k_sol * exp(-B_sol*s^2) * Fmask)
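As a worked illustration, the formula can be evaluated for a single reflection. The sketch below follows the expression exactly as written above. All names and numerical values are hypothetical, and conventions not spelled out in the formula (for example, how s^2 is defined and whether any numerical factor is absorbed into U_overall) are left to the caller:

```python
import math


def f_model(f_calc, f_mask, h, u_overall, s_sq, scale_k1, k_sol, b_sol):
    """Total model structure factor for one reflection.

    f_calc, f_mask : complex structure factors of the atomic model and the mask
    h              : Miller index (h, k, l)
    u_overall      : 3x3 overall anisotropic scale matrix (nested lists)
    s_sq           : the s^2 term of the formula, supplied by the caller
    """
    # Quadratic form h * U_overall * h^T
    huh = sum(h[i] * u_overall[i][j] * h[j] for i in range(3) for j in range(3))
    anisotropic_scale = math.exp(-huh)
    bulk_solvent = k_sol * math.exp(-b_sol * s_sq) * f_mask
    return scale_k1 * anisotropic_scale * (f_calc + bulk_solvent)


# Hypothetical values for one reflection.
u = [[0.001, 0.0, 0.0], [0.0, 0.002, 0.0], [0.0, 0.0, 0.003]]
value = f_model(
    f_calc=complex(10.0, 5.0),
    f_mask=complex(-2.0, 1.0),
    h=(1, 2, 3),
    u_overall=u,
    s_sq=0.01,
    scale_k1=1.5,
    k_sol=0.35,
    b_sol=45.0,
)
amplitude = abs(value)
```

With U_overall set to zero and k_sol set to zero, the expression reduces to scale_k1 * Fcalc, which is a convenient sanity check.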

Setting the resolution range for the refinement

% phenix.refine data.hkl model.pdb xray_data.low_resolution=15.0 xray_data.high_resolution=2.0

Bulk solvent correction and anisotropic scaling

By default phenix.refine always starts with bulk solvent modeling and anisotropic scaling. Here is a list of commands that may be of use in some cases:

  1. Perform bulk-solvent modeling and anisotropic scaling only:

    % phenix.refine data.hkl model.pdb strategy=none
    
  2. Bulk-solvent modeling only (no anisotropic scaling):

    % phenix.refine data.hkl model.pdb strategy=none bulk_solvent_and_scale.anisotropic_scaling=false
    
  3. Anisotropic scaling only (no bulk-solvent modeling):

    % phenix.refine data.hkl model.pdb strategy=none bulk_solvent_and_scale.bulk_solvent=false
    
  4. Turn off bulk-solvent modeling and anisotropic scaling:

    % phenix.refine data.hkl model.pdb main.bulk_solvent_and_scale=false
    
  5. Fixing bulk-solvent and anisotropic scale parameters to user defined values:

    % phenix.refine data.hkl model.pdb bulk_solvent_and_scale.params
    

    where bulk_solvent_and_scale.params is the file containing these lines:

    refinement {
      bulk_solvent_and_scale {
        k_sol_b_sol_grid_search = False
        minimization_k_sol_b_sol = False
        minimization_b_cart = False
        fix_k_sol = 0.45
        fix_b_sol = 56.0
        fix_b_cart {
          b11 = 1.2
          b22 = 2.3
          b33 = 3.6
          b12 = 0.0
          b13 = 0.0
          b23 = 0.0
        }
      }
    }
    
  6. Mask parameters:

    Bulk solvent modeling involves the mask calculation. There are three principal parameters controlling it: solvent_radius, shrink_truncation_radius and grid_step_factor. Normally, these parameters are not supposed to be changed but can be changed:

    % phenix.refine data.hkl model.pdb refinement.mask.solvent_radius=1.0 \
      refinement.mask.shrink_truncation_radius=1.0 refinement.mask.grid_step_factor=3
    

    To gain a further drop in R-factors (somewhere between 0.0 and 1.0%), it is possible to run the mask parameter optimization procedure, which is fairly time consuming (depending on structure size and resolution):

    % phenix.refine data.hkl model.pdb optimize_mask=true
    

    This will perform the grid search for solvent_radius and shrink_truncation_radius and select the values giving the best R-factor.

By default phenix.refine adds the isotropic component of the overall anisotropic scale matrix to the atomic B-factors, leaving the trace of the overall anisotropic scale matrix equal to zero. This is why one can observe changed ADP even though only anisotropic scaling was done and no ADP refinement was performed.
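The arithmetic behind this can be sketched as follows, using the diagonal of the fix_b_cart example above (b11=1.2, b22=2.3, b33=3.6). The function name and the atomic B-factors are hypothetical, and the sign convention of the shift is an assumption made for illustration:

```python
def split_isotropic_part(b_cart_diagonal):
    """Split the diagonal of the scale matrix into its mean and a traceless rest."""
    b_iso = sum(b_cart_diagonal) / 3.0
    traceless = [b - b_iso for b in b_cart_diagonal]
    return b_iso, traceless


b_iso, traceless = split_isotropic_part([1.2, 2.3, 3.6])

# Hypothetical atomic B-factors before the shift is applied.
atomic_b = [20.0, 35.5]
shifted_b = [b + b_iso for b in atomic_b]
```

Here the isotropic part is one third of the trace (7.1 / 3), the remaining diagonal sums to zero, and every atomic B-factor moves by the same amount.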

Default refinement with user specified X-ray target function

  1. Refinement with least-squares target:

    % phenix.refine data.hkl model.pdb main.target=ls
    
  2. Refinement with maximum-likelihood target (default):

    % phenix.refine data.hkl model.pdb main.target=ml
    
  3. Refinement with phased maximum-likelihood target:

    % phenix.refine data.hkl model.pdb main.target=mlhl
    

    If phenix.refine finds Hendrickson-Lattman coefficients in input reflection file, it will automatically switch to mlhl target. To disable this:

    % phenix.refine data.hkl model.pdb main.use_experimental_phases=false
    

Modifying the initial model before refinement starts

phenix.refine offers several options to modify input model before refinement starts:

  1. shaking of coordinates (adding a random shift to coordinates):

    % phenix.refine data.hkl model.pdb sites.shake=0.3
    
  2. rotation-translation shift of coordinates:

    % phenix.refine data.hkl model.pdb sites.rotate="1 2 3" sites.translate="4 5 6"
    
  3. shaking of occupancies:

    % phenix.refine data.hkl model.pdb occupancies.randomize=true
    
  4. shaking of ADP:

    % phenix.refine data.hkl model.pdb adp.randomize=true
    
  5. shifting of ADP (adding a constant value):

    % phenix.refine data.hkl model.pdb adp.shift_b_iso=10.0
    
  6. scaling of ADP (multiplying by a constant value):

    % phenix.refine data.hkl model.pdb adp.scale_adp=0.5
    
  7. setting a value to ADP:

    % phenix.refine data.hkl model.pdb adp.set_b_iso=25
    
  8. converting to isotropic:

    % phenix.refine data.hkl model.pdb adp.convert_to_isotropic=true
    
  9. converting to anisotropic:

    % phenix.refine data.hkl model.pdb adp.convert_to_anisotropic=true \
      modify_start_model.selection="not element H"
    

    When converting atoms into anisotropic, it is important to make sure that hydrogens (if present in the model) are not converted into anisotropic.

By default, the specified manipulations will be applied to all atoms. However, it is possible to apply them to only selected atoms:

% phenix.refine data.hkl model.pdb adp.set_b_iso=25 modify_start_model.selection="chain A"

To write out the modified model (without any refinement), add: main.number_of_macro_cycles=0, e.g.:

% phenix.refine data.hkl model.pdb adp.set_b_iso=25 \
  main.number_of_macro_cycles=0

All the commands listed above plus some more are available from phenix.pdbtools utility which in fact is used internally in phenix.refine to perform these manipulations. For more information on phenix.pdbtools type:

% phenix.pdbtools --help

Documentation on phenix.pdbtools is also available.

Refinement using FFT or direct structure factor calculation algorithm

% phenix.refine data.hkl model.pdb \
  structure_factors_and_gradients_accuracy.algorithm=fft

or:

% phenix.refine data.hkl model.pdb \
  structure_factors_and_gradients_accuracy.algorithm=direct

Ignoring test (free) flags in refinement

Sometimes one needs to use all reflections ("work" and "test") in the refinement; for example, at very low resolution where each single reflection counts, or at subatomic resolution where the risk of overfitting is very low. In the example below all the reflections are used in the refinement:

% phenix.refine data.hkl model.pdb xray_data.r_free_flags.ignore_r_free_flags=true

Note: 1) the corresponding statistics (R-factors, ...) will be identical for "work" and "test" sets; 2) it is still necessary to have test flags present in the input reflection file (or automatically generated by phenix.refine).

Using phenix.refine to calculate structure factors

The total structure factor used in nearly all calculations in phenix.refine is defined as:

Fmodel = scale_k1 * exp(-h*U_overall*ht) * (Fcalc + k_sol * exp(-B_sol*s^2) * Fmask)

  1. Calculate Fcalc from atomic model and output in MTZ file (no solvent modeling or scaling):

    % phenix.refine data.hkl model.pdb main.number_of_macro_cycles=0 \
      main.bulk_solvent_and_scale=false export_final_f_model=true
    
  2. Calculate Fcalc from atomic model including bulk solvent and all scales:

    % phenix.refine data.hkl model.pdb main.number_of_macro_cycles=1 \
      strategy=none export_final_f_model=true
    
  3. Resolution limits can be applied:

    % phenix.refine data.hkl model.pdb main.number_of_macro_cycles=1 \
      strategy=none xray_data.low_resolution=15.0 xray_data.high_resolution=2.0
    

Note:

  • The number of calculated structure factors will be the same as the number of observed data (Fobs) provided in the input reflection files, or less, since resolution and sigma cutoffs may be applied to Fobs, or some Fobs may be automatically removed by the outlier detection procedure.
  • The set of calculated structure factors has the same completeness as the set of provided Fobs.

Scattering factors

There are four choices for the scattering table to be used in phenix.refine:

  • wk1995: Waasmaier & Kirfel table;
  • it1992: International Tables for Crystallography (1992)
  • n_gaussian: dynamic n-gaussian approximation
  • neutron: table for neutron scattering

The default is n_gaussian. To switch to different table:

% phenix.refine data.hkl model.pdb main.scattering_table=neutron

Suppressing the output of certain files

The following command will tell phenix.refine to not write .eff, .geo, .def, maps and map coefficients files:

% phenix.refine data.hkl model.pdb write_eff_file=false write_geo_file=false \
  write_def_file=false write_maps=false write_map_coefficients=false

The only output will be: .log and .pdb files.

Random seed

To change random seed:

% phenix.refine data.hkl model.pdb main.random_seed=7112384

The results of certain refinement protocols, such as restrained refinement of coordinates (with SA or LBFGS minimization), are sensitive to the random seed. This is because: 1) for SA the refinement starts with random assignment of velocities to atoms; 2) the X-ray/geometry target weight calculation involves model shaking with some Cartesian dynamics. As a result, running such refinement jobs with exactly the same parameters but different random seeds will produce different refinement statistics. The author's experience includes a case where the difference in R-factors was about 2.0% between two SA runs.

Also, this opens the possibility of performing multi-start SA refinement to create an ensemble of models that are slightly different on average but sometimes contain significant variations in certain parts.

Electron density maps

By default phenix.refine outputs two likelihood-weighted maps: 2mFo-DFc and mFo-DFc. These are the map coefficients generated for use in Coot. The user can also choose between likelihood-weighted or regular maps with any specified coefficients, for example: 2mFo-DFc, 2.7mFo-1.3DFc, Fo-Fc, 3Fo-2Fc. Any number of maps can be created. Optionally, the result can be output as binary CCP4 format. The example below illustrates the main options:

% phenix.refine data.hkl model.pdb map.params write_maps=true

where map.params contains:

refinement {
  electron_density_maps {
    map_coefficients {
      mtz_label_amplitudes = 2FOFCWT
      mtz_label_phases = PH2FOFCWT
      map_type = 2mFo-DFc
    }
    map_coefficients {
      mtz_label_amplitudes = FOFCWT
      mtz_label_phases = PHFOFCWT
      map_type = mFo-DFc
    }
    map_coefficients {
      mtz_label_amplitudes = 3FO2FCWT
      mtz_label_phases = PH3FO2FCWT
      map_type = 3Fo-2Fc
    }
    map {
      map_type = 2mFo-DFc
      grid_resolution_factor = 1/4.
      region = *selection cell
      atom_selection = chain A and resseq 1
    }
  }
}

This will output one file with map coefficients for the 2mFo-DFc, mFo-DFc and 3Fo-2Fc maps, and one X-PLOR formatted file containing the 2mFo-DFc map computed around residue 1 in chain A. The map grid spacing will be (data resolution)*grid_resolution_factor. If atom_selection is set to None or all, the map will be computed for all atoms.
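As a small worked example of the grid relation (data resolution)*grid_resolution_factor, the sketch below computes the grid spacing and a rough number of grid points along one cell edge. The function names, the 2.0 A resolution and the 60 A cell edge are hypothetical, and the actual gridding may round to values convenient for the FFT:

```python
import math


def map_grid_step(d_min, grid_resolution_factor=1 / 4.0):
    """Grid spacing in Angstrom: (data resolution) * grid_resolution_factor."""
    return d_min * grid_resolution_factor


def grid_points_along_edge(cell_edge, step):
    """Smallest number of grid points that gives a spacing no coarser than step."""
    return int(math.ceil(cell_edge / step))


step = map_grid_step(2.0)                      # data to 2.0 A, factor 1/4
n_points = grid_points_along_edge(60.0, step)  # hypothetical 60 A cell edge
```

With these numbers the spacing is 0.5 A, so a 60 A cell edge needs at least 120 grid points.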

Refining with anomalous data (or what phenix.refine does with Fobs+ and Fobs-).

The way phenix.refine uses Fobs+ and Fobs- is controlled by xray_data.force_anomalous_flag_to_be_equal_to parameter.

Here are 3 possibilities:

  1. Default behavior: phenix.refine will use all Fobs: Fobs+ and Fobs- as independent reflections:

    % phenix.refine model.pdb data_anom.hkl
    
  2. phenix.refine will generate missing Bijvoet mates and use all Fobs+ and Fobs- as independent reflections if:

    % phenix.refine model.pdb data_anom.hkl xray_data.force_anomalous_flag_to_be_equal_to=true
    
  3. phenix.refine will merge Fobs+ and Fobs-, that is instead of two separate Fobs+ and Fobs- it will use one value F_mean = (Fobs+ + Fobs-)/2 if:

    % phenix.refine model.pdb data_anom.hkl xray_data.force_anomalous_flag_to_be_equal_to=false
    

See this documentation for how to use and refine f' and f''.
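The merging used in the third possibility, F_mean = (Fobs+ + Fobs-)/2, can be sketched as follows. The data layout (dictionaries keyed by the Miller index of the Fobs+ member) and the function name are hypothetical, and keeping an unpaired reflection unchanged is an assumption made for this illustration:

```python
def merge_bijvoet_mates(f_plus, f_minus):
    """Merge Fobs+ and Fobs- into F_mean = (Fobs+ + Fobs-) / 2.

    f_plus, f_minus : dicts mapping a Miller index tuple to an amplitude,
                      both keyed by the index of the Fobs+ member of the pair.
    Assumption: a reflection with only one mate present keeps its single value.
    """
    merged = {}
    for hkl in set(f_plus) | set(f_minus):
        values = [data[hkl] for data in (f_plus, f_minus) if hkl in data]
        merged[hkl] = sum(values) / len(values)
    return merged


f_plus = {(1, 2, 3): 100.0, (2, 0, 0): 50.0}
f_minus = {(1, 2, 3): 90.0}
f_mean = merge_bijvoet_mates(f_plus, f_minus)
```

In this example the (1, 2, 3) pair is averaged to 95.0, while (2, 0, 0), which has no mate, stays at 50.0.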

Rejecting reflections by sigma

Reflections can be rejected by sigma cutoff criterion applied to amplitudes Fobs <= sigma_fobs_rejection_criterion * sigma(Fobs):

% phenix.refine model.pdb data_anom.hkl xray_data.sigma_fobs_rejection_criterion=2

or/and intensities Iobs <= sigma_iobs_rejection_criterion * sigma(Iobs):

% phenix.refine model.pdb data_anom.hkl xray_data.sigma_iobs_rejection_criterion=2

Internally, phenix.refine uses amplitudes. If both sigma_fobs_rejection_criterion and sigma_iobs_rejection_criterion are given as non-zero values, then both criteria will be applied: first to Iobs, then to Fobs (after the remaining Iobs are converted to Fobs):

% phenix.refine model.pdb data_anom.hkl xray_data.sigma_fobs_rejection_criterion=2 \
  xray_data.sigma_iobs_rejection_criterion=2

By default, both sigma_fobs_rejection_criterion and sigma_iobs_rejection_criterion are set to zero (no reflections rejected) and, unless there is a strong reason, we encourage you not to change these values. If amplitudes are provided at input, then sigma_iobs_rejection_criterion is ignored, since there are no intensities to apply it to.
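The rejection rule can be sketched as a simple filter. The tuple layout and the function name are hypothetical; a criterion of zero is treated as "no rejection", in line with the default behavior described above:

```python
def reject_by_sigma(reflections, criterion):
    """Keep reflections whose value exceeds criterion * sigma.

    reflections : list of (hkl, value, sigma) tuples, where value is either
                  Fobs or Iobs depending on which criterion is being applied
    criterion   : sigma_fobs_rejection_criterion or sigma_iobs_rejection_criterion
    """
    if criterion == 0:
        # Zero means the cutoff is switched off: nothing is rejected.
        return list(reflections)
    return [
        (hkl, value, sigma)
        for hkl, value, sigma in reflections
        if value > criterion * sigma
    ]


data = [
    ((1, 0, 0), 10.0, 2.0),  # 10.0 > 2 * 2.0, kept
    ((0, 1, 0), 3.0, 2.0),   # 3.0 <= 2 * 2.0, rejected
    ((0, 0, 1), 4.0, 2.0),   # 4.0 <= 2 * 2.0, rejected (boundary case)
]
kept = reject_by_sigma(data, criterion=2)
```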

Developer's tools

phenix.refine offers a broad functionality for experimenting that may not be useful in everyday practice but handy for testing ideas.

Substitute input Fobs with calculated Fcalc, shake model and refine it

Instead of using Fobs from the input data file, one can ask phenix.refine to use the structure factors Fcalc calculated from the input model. Obviously, the R-factors will be zero throughout the refinement. One can also shake various model parameters (see this document for details); refinement will then start with poor statistics (large R-factors at least) and will hopefully converge to the unmodified start model (if not shaken too much).

Also it's possible to simulate the flat bulk solvent model contribution and anisotropic scaling:

% phenix.refine model.pdb data.hkl experiment.params

where experiment.params contains the following:

refinement {
  main {
    fake_f_obs = True
  }
  modify_start_model {
    selection = "chain A"
    sites {
      shake = 0.5
    }
  }
  fake_f_obs {
    fmodel {
      k_sol = 0.35
      b_sol = 45.0
      b_cart = 1.25 3.78 1.25 0.0 0.0 0.0
      scale = 358.0
    }
  }
}

In this example, the input Fobs will be substituted with the same number of Fcalc (absolute values of Fcalc), then the coordinates of the structure will be shaken to achieve rmsd=0.5, and finally the default run of refinement will be done. The bulk solvent, anisotropic scale and overall scalar scale are also applied to the Fcalc thus obtained, in accordance with the Fmodel definition (see this document for the definition of the total structure factor, Fmodel). Expected refinement behavior: R-factors will drop from something big to zero.

CIF modifications and links

phenix.refine uses the CCP4 monomer library to build geometry restraints (bond, angle, dihedral, chirality and planarity restraints). The CCP4 monomer library comes with a set of "modifications" and "links" which are defined in the file mon_lib_list.cif. Some of these are used automatically when phenix.refine builds the geometry restraints (e.g. the peptide and RNA/DNA chain links). Other links and modifications have to be applied manually, e.g. (cif_modification.params file):

refinement.pdb_interpretation.apply_cif_modification {
  data_mod = 5pho
  residue_selection = resname GUA and name O5T
}

Here a custom 5pho modification is applied to all GUA residues with an O5T atom. I.e. the modification can be applied to multiple residues with a single apply_cif_modification block. The CIF modification is supplied as a separate file on the phenix.refine command line, e.g. (data_mod_5pho.cif file):

data_mod_5pho
#
loop_
_chem_mod_atom.mod_id
_chem_mod_atom.function
_chem_mod_atom.atom_id
_chem_mod_atom.new_atom_id
_chem_mod_atom.new_type_symbol
_chem_mod_atom.new_type_energy
_chem_mod_atom.new_partial_charge
 5pho     add      .      O5T    O    OH      .
loop_
_chem_mod_bond.mod_id
_chem_mod_bond.function
_chem_mod_bond.atom_id_1
_chem_mod_bond.atom_id_2
_chem_mod_bond.new_type
_chem_mod_bond.new_value_dist
_chem_mod_bond.new_value_dist_esd
 5pho     add      O5T     P         coval        1.520    0.020

The whole command will be:

% phenix.refine model_o5t.pdb data.hkl data_mod_5pho.cif cif_modification.params

Similarly, a link can be applied like this (cif_link.params file):

refinement.pdb_interpretation.apply_cif_link {
  data_link = MAN-THR
  residue_selection_1 = chain X and resname MAN and resid 900
  residue_selection_2 = chain X and resname THR and resid 42
}

% phenix.refine model.pdb data.hkl cif_link.params

The residue selections for links must select exactly one residue each. The MAN-THR link is pre-defined in mon_lib_list.cif. Custom links can be supplied as additional files on the phenix.refine command line. See mon_lib_list.cif for examples. The full path to this file can be obtained with the command:

% phenix.where_mon_lib_list_cif

All apply_cif_modification and apply_cif_link definitions will be included into the .def files. I.e. it is not necessary to specify the definitions again if further refinement runs are started with .def files.

Note that all LINK, SSBOND, HYDBND, SLTBRG and CISPEP records in the input PDB files are ignored.

Definition of custom bonds and angles

Most geometry restraints (bonds, angles, etc.) are generated automatically based on the CCP4 monomer library. Additional custom bond and angle restraints, e.g. between protein and a ligand or ion, can be specified in this way:

refinement.geometry_restraints.edits {
  zn_selection = chain X and resname ZN and resid 200 and name ZN
  his117_selection = chain X and resname HIS and resid 117 and name NE2
  asp130_selection = chain X and resname ASP and resid 130 and name OD1
  bond {
    action = *add
    atom_selection_1 = $zn_selection
    atom_selection_2 = $his117_selection
    symmetry_operation = None
    distance_ideal = 2.1
    sigma = 0.02
    slack = None
  }
  bond {
    action = *add
    atom_selection_1 = $zn_selection
    atom_selection_2 = $asp130_selection
    symmetry_operation = None
    distance_ideal = 2.1
    sigma = 0.02
    slack = None
  }
  angle {
    action = *add
    atom_selection_1 = $his117_selection
    atom_selection_2 = $zn_selection
    atom_selection_3 = $asp130_selection
    angle_ideal = 109.47
    sigma = 5
  }
}

The atom selections must uniquely select a single atom. Save the geometry_restraints.edits to a file and specify the file name as an additional argument when running phenix.refine for the first time. For example:

% phenix.refine model.pdb data.hkl restraints_edits.params

The edits will be included in the .def files, so it is not necessary to specify them manually again if further refinement runs are started with .def files.
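When many similar restraints are needed, the parameter file can also be generated by a script. The following is a minimal Python sketch that only formats text using the parameter names shown above; the helper names bond_edit and edits_scope are hypothetical and are not part of PHENIX. Unlike the example above, it writes the atom selections inline instead of using $ variables.

```python
def bond_edit(selection_1, selection_2, distance_ideal, sigma,
              symmetry_operation=None, slack=None):
    """Format one bond block for refinement.geometry_restraints.edits.

    Hypothetical helper: it only builds text, it does not check that
    each selection matches exactly one atom.
    """
    lines = [
        "  bond {",
        "    action = *add",
        "    atom_selection_1 = %s" % selection_1,
        "    atom_selection_2 = %s" % selection_2,
        "    symmetry_operation = %s" % symmetry_operation,
        "    distance_ideal = %s" % distance_ideal,
        "    sigma = %s" % sigma,
        "    slack = %s" % slack,
        "  }",
    ]
    return "\n".join(lines)


def edits_scope(blocks):
    """Wrap already formatted blocks in the edits scope."""
    body = "\n".join(blocks)
    return "refinement.geometry_restraints.edits {\n%s\n}\n" % body


# Example: the two zinc bonds from the parameter file above.
zn = "chain X and resname ZN and resid 200 and name ZN"
partners = [
    "chain X and resname HIS and resid 117 and name NE2",
    "chain X and resname ASP and resid 130 and name OD1",
]
params_text = edits_scope([bond_edit(zn, p, 2.1, 0.02) for p in partners])
```

The resulting params_text can be written to a file such as restraints_edits.params and passed to phenix.refine as shown above.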

For bonds to symmetry copies, specify the symmetry operation in xyz notation, for example:

symmetry_operation = -x-1/2,y-1/2,-z+1/2

To obtain the symmetry_operation, either use Coot (turn on drawing of symmetry copies, then click on the copy and look for the symmetry operation in the status bar), or run this command:

iotbx.show_distances your.pdb > all_distances

This will produce a potentially long all_distances file, but if you search for sym= there will probably only be a few matches from which it is easy to pick the one you are interested in, based on the pdb atom labels.
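The search can also be scripted. The sketch below is plain Python and makes no assumption about the layout of the all_distances file beyond the presence of the sym= marker mentioned above; the function name symmetry_lines and its must_contain argument are made up for this example.

```python
def symmetry_lines(lines, must_contain=()):
    """Return the lines that contain 'sym=' and every extra substring.

    lines        -- any iterable of text lines, e.g. an open file
    must_contain -- optional substrings (such as atom labels) used to
                    narrow down the matches
    """
    matches = []
    for line in lines:
        text = line.rstrip("\n")
        if "sym=" not in text:
            continue
        if all(part in text for part in must_contain):
            matches.append(text)
    return matches


# Typical use (file name taken from the command above):
#
#   with open("all_distances") as handle:
#       for hit in symmetry_lines(handle, must_contain=("ZN",)):
#           print(hit)
```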

The bond.slack parameter above can be used to disable a bond restraint within the slack tolerance around distance_ideal. This is useful for hydrogen bond restraints, or when refining with very high-resolution data (e.g. better than 1 A). The bond restraint is activated only if the discrepancy between the model bond distance and distance_ideal is greater than the slack value. The slack is subtracted from the discrepancy. The resulting potential is called a "square-well potential" by some authors. The formula for the contribution to the refinement target function is:

weight * delta_slack**2

with:

delta_slack = sign(delta) * max(0, (abs(delta) - slack))
delta = distance_ideal - distance_model
weight = 1 / sigma**2

The slack value must be greater than or equal to zero (it can also be None, which is equivalent to zero in this case).
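As an illustration, the formula above can be written in a few lines of Python. This is only a sketch of the arithmetic, not the phenix.refine implementation; the function name slack_bond_residual is made up for this example.

```python
import math


def slack_bond_residual(distance_model, distance_ideal, sigma, slack=None):
    """Return (delta_slack, contribution) for a single bond restraint.

    Follows the formulas in the text:
      delta       = distance_ideal - distance_model
      delta_slack = sign(delta) * max(0, abs(delta) - slack)
      weight      = 1 / sigma**2
    """
    if slack is None:
        slack = 0.0  # None is equivalent to zero
    if slack < 0:
        raise ValueError("slack must be greater than or equal to zero")
    delta = distance_ideal - distance_model
    # copysign applies the sign of delta to the reduced discrepancy
    delta_slack = math.copysign(max(0.0, abs(delta) - slack), delta)
    weight = 1.0 / sigma ** 2
    return delta_slack, weight * delta_slack ** 2
```

With distance_ideal = 2.1, sigma = 0.02 and slack = 0.1, a model distance of 2.15 gives a zero contribution, while a model distance of 2.3 is penalized as if it deviated by 0.1 A only.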

Atom selection examples

All atoms

all

All C-alpha atoms (not case sensitive)

name ca

All atoms with ``H`` in the name (``*`` is a wildcard character)

name *H*

Atom names with ``*`` (backslash disables wildcard function)

name o2\*

Atom names with spaces

name 'O 1'

Atom names with primes don't necessarily have to be quoted

name o2'

Boolean ``and``, ``or`` and ``not``

resname ALA and (name ca or name c or name n or name o)
chain a and not altid b
resid 120 and icode c and model 2
segid a and element c and charge 2+ and anisou

Residue 188

resseq 188

resid is a synonym for resseq:

resid 188

Note that if several chains contain residue number 188, all of them will be selected. To select residue 188 in one particular chain only:

chain A and resid 188

This selects residue 188 only in chain A.

Residues 2 through 10 (including 2 and 10)

resseq 2:10

"Smart" selections

resname ALA and backbone
resname ALA and sidechain
peptide backbone
rna backbone or dna backbone
water or nucleotide
dna and not (phosphate or ribose)
within(5, (nucleotide or peptide) backbone)

Depositing refined structure with PDB

phenix.refine reports comprehensive statistics in the PDB file header of the refined model. These statistics consist of two parts. The first (upper) part, formatted with REMARK records, is specific to the current refinement run and contains information about the input data and model files, a time stamp, start and final R-factors, refinement statistics from macro-cycle to macro-cycle, etc. The second (lower) part, formatted with REMARK 3 records, is independent of the particular refinement run (no intermediate statistics, times or file names). The second part is intended for deposition with the PDB; the first part should be removed manually.
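Removing the run-specific part of the header can be done in any text editor. The Python sketch below shows one possible way to script it. It assumes that the run-specific lines are REMARK records without a remark number, while REMARK 3 and any other numbered REMARK records are kept; this assumption, and the function name strip_unnumbered_remarks, are not taken from phenix.refine, so check the result before depositing.

```python
def strip_unnumbered_remarks(pdb_lines):
    """Drop REMARK lines that carry no remark number; keep all other lines.

    Assumption: in PDB format the remark number sits in columns 8-10,
    so 'REMARK   3 ...' is kept and 'REMARK some free text' is dropped.
    """
    kept = []
    for line in pdb_lines:
        if line.startswith("REMARK"):
            number_field = line[6:10].strip()
            if not number_field.isdigit():
                continue  # unnumbered, run-specific remark
        kept.append(line)
    return kept


# Typical use (file names are placeholders):
#
#   with open("model_refine_001.pdb") as handle:
#       cleaned = strip_unnumbered_remarks(handle)
#   with open("model_for_deposition.pdb", "w") as handle:
#       handle.writelines(cleaned)
```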

Referencing phenix.refine

Afonine PV, Grosse-Kunstleve RW, Echols N, Headd JJ, Moriarty NW, Mustyakimov M, Terwilliger TC, Urzhumtsev A, Zwart PH, Adams PD. (2012) Towards automated crystallographic structure refinement with phenix.refine. Acta Crystallogr D Biol Crystallogr. 68:352-67. PMID: 22505256

Relevant reading

Below is the list of papers either published in connection with phenix.refine or used to implement specific features in phenix.refine:

  1. Maximum-likelihood in structure refinement:
    • V.Yu. Lunin & T.P. Skovoroda. Acta Cryst. (1995). A51, 880-887. "R-free likelihood-based estimates of errors for phases calculated from atomic models"
    • Pannu, N.S., Murshudov, G.N., Dodson, E.J. & Read, R.J. (1998). Acta Cryst. D54, 1285-1294. "Incorporation of Prior Phase Information Strengthens Maximum-Likelihood Structure Refinement"
    • V.Y. Lunin, P.V. Afonine & A.G. Urzhumtsev. Acta Cryst. (2002). A58, 270-282. "Likelihood-based refinement. I. Irremovable model errors"
    • P. Afonine, V.Y. Lunin & A. Urzhumtsev. J. Appl. Cryst. (2003). 36, 158-159. "MLMF: least-squares approximation of likelihood-based refinement criteria"
  2. ADP:
    • V. Schomaker & K.N. Trueblood. Acta Cryst. (1968). B24, 63-76. "On the rigid-body motion of molecules in crystals"
    • F.L. Hirshfeld. Acta Cryst. (1976). A32, 239-244. "Can X-ray data distinguish bonding effects from vibrational smearing?"
    • T.R. Schneider. Proceedings of the CCP4 Study Weekend (E. Dodson, M. Moore, A. Ralph, and S. Bailey, eds.), SERC Daresbury Laboratory, Daresbury, U.K., pp. 133-144 (1996). "What can we Learn from Anisotropic Temperature Factors?"
    • M.D. Winn, M.N. Isupov & G.N. Murshudov. Acta Cryst. (2001). D57, 122-133. "Use of TLS parameters to model anisotropic displacements in macromolecular refinement"
    • R.W. Grosse-Kunstleve & P.D. Adams. J. Appl. Cryst. (2002). 35, 477-480. "On the handling of atomic anisotropic displacement parameters"
    • P. Afonine & A. Urzhumtsev. (2007). CCP4 Newsletter on Protein Crystallography. 45. Contribution 6. "On determination of T matrix in TLS modeling"
  3. Rigid body refinement:
    • Afonine PV, Grosse-Kunstleve RW, Adams PD & Urzhumtsev AG. "Methods for optimal rigid body refinement of models with large displacements". (in preparation for Acta Cryst. D).
  4. Bulk-solvent modeling and anisotropic scaling:
    • S. Sheriff & W.A. Hendrickson. Acta Cryst. (1987). A43, 118-121. "Description of overall anisotropy in diffraction from macromolecular crystals"
    • Jiang, J.-S. & Brunger, A. T. (1994). J. Mol. Biol. 243, 100-115. "Protein hydration observed by X-ray diffraction. Solvation properties of penicillopepsin and neuraminidase crystal structures."
    • A. Fokine & A. Urzhumtsev. Acta Cryst. (2002). D58, 1387-1392. "Flat bulk-solvent model: obtaining optimal parameters"
    • P.V. Afonine, R.W. Grosse-Kunstleve & P.D. Adams. Acta Cryst. (2005). D61, 850-855. "A robust bulk-solvent correction and anisotropic scaling procedure"
  5. Refinement at low resolution:
    • Headd, J.J., Echols, N., Afonine, P.V., Grosse-Kunstleve, R.W., Chen, V.B., Moriarty, N.W., Richardson, D.C., Richardson, J.S., Adams, P.D. (2012). Acta Cryst. D68:381-90. "Use of knowledge-based restraints in phenix.refine to improve macromolecular refinement at low resolution."
  6. Refinement at subatomic resolution:
    • Afonine, P.V., Pichon-Pesme, V., Muzet, N., Jelsch, C., Lecomte, C. & Urzhumtsev, A. (2002). CCP4 Newsletter on Protein Crystallography. 41. "Modeling of bond electron density"
    • Afonine P.V., Lunin, V., Muzet, N. & Urzhumtsev, A. (2004). Acta Cryst., D60, 260-274. "On the possibility of observation of valence electron density for individual bonds in proteins in conventional difference maps"
    • P.V. Afonine, R.W. Grosse-Kunstleve, P.D. Adams, V.Y. Lunin, A. Urzhumtsev. "On macromolecular refinement at subatomic resolution with interatomic scatterers" (submitted to Acta Cryst. D).
  7. LBFGS minimization:
    • Liu, D.C. & Nocedal, J. (1989). Mathematical Programming, 45, 503-528. "On the limited memory BFGS method for large scale optimization"
  8. Dynamics, simulated annealing:
    • Brunger, A.T., Kuriyan, J., Karplus, M. (1987). Science. 235, 458-460. "Crystallographic R factor refinement by molecular dynamics"
    • Adams, P.D., Pannu, N.S., Read, R.J. & Brunger, A.T. (1997). Proc. Natl. Acad. Sci. 94, 5018-5023. "Cross-validated maximum likelihood enhances crystallographic simulated annealing refinement"
    • L.M. Rice, Y. Shamoo & A.T. Brunger. J. Appl. Cryst. (1998). 31, 798-805. "Phase Improvement by Multi-Start Simulated Annealing Refinement and Structure-Factor Averaging"
    • Brunger, A.T & Adams, P.D. (2002). Acc. Chem. Res. 35, 404-412. "Molecular dynamics applied to X-ray structure refinement"
  9. Target weights calculation:
    • Brunger, A.T., Karplus, M. & Petsko, G.A. (1989). Acta Cryst. A45, 50-61. "Crystallographic refinement by simulated annealing: application to crambin"
    • Brunger, A.T. (1992). Nature (London), 355, 472-474. "The free R value: a novel statistical quantity for assessing the accuracy of crystal structures"
    • Adams, P.D., Pannu, N.S., Read, R.J. & Brunger, A.T. (1997). Proc. Natl. Acad. Sci. 94, 5018-5023. "Cross-validated maximum likelihood enhances crystallographic simulated annealing refinement"
  10. Electron density maps (Fourier syntheses) calculation:
    • A.G. Urzhumtsev, T.P. Skovoroda & V.Y. Lunin. J. Appl. Cryst. (1996). 29, 741-744. "A procedure compatible with X-PLOR for the calculation of electron-density maps weighted using an R-free-likelihood approach"
  11. Monomer Library:
    • Vagin, A.A., Steiner, R.A., Lebedev, A.A, Potterton, L., McNicholas, S., Long, F. & Murshudov, G.N. (2004). Acta Cryst. D60, 2184-2195. "REFMAC5 dictionary: organization of prior chemical knowledge and guidelines for its use"
  12. Scattering factors:
    • D. Waasmaier & A. Kirfel. Acta Cryst. (1995). A51, 416-431. "New analytical scattering-factor functions for free atoms and ions"
    • International Tables for Crystallography (1992)
    • Neutron News, Vol. 3, No. 3, 1992, pp. 29-37. http://www.ncnr.nist.gov/resources/n-lengths/list.html
    • Grosse-Kunstleve RW, Sauter NK & Adams PD. Newsletter of the IUCr Commission on Crystallographic Computing 2004, 3:22-31. "cctbx news"
  13. Neutron and joint X-ray/neutron refinement:
    • A. Wlodawer & W.A. Hendrickson. Acta Cryst. (1982). A38, 239-247. "A procedure for joint refinement of macromolecular structures with X-ray and neutron diffraction data from single crystals"
    • A. Wlodawer, H. Savage & G. Dodson. Acta Cryst. (1989). B45, 99-107. "Structure of insulin: results of joint neutron and X-ray refinement"
  14. Stereochemical restraints:
    • Grosse-Kunstleve, R.W., Afonine, P.V., Adams, P.D. (2004). Newsletter of the IUCr Commission on Crystallographic Computing, 4, 19-36. "cctbx news: Geometry restraints and other new features"
  15. Parameters parsing and interpretation:
    • Grosse-Kunstleve RW, Afonine PV, Sauter NK, Adams PD. Newsletter of the IUCr Commission on Crystallographic Computing 2005, 5:69-91. "cctbx news: Phil and friends"

Feedback, more information

List of all refinement keywords

------------------------------------------------------------------------------- 
Legend: black bold - scope names
        black - parameter names
        red - parameter values
        blue - parameter help
        blue bold - scope help
        Parameter values:
          * means selected parameter (where multiple choices are available)
          False is No
          True is Yes
          None means not provided, not predefined, or left up to the program
          "%3d" is a Python style formatting descriptor
------------------------------------------------------------------------------- 
refinement Scope of parameters for structure refinement with phenix.refine
   crystal_symmetry Scope of space group and unit cell parameters
      unit_cell= None
      space_group= None
   input Scope of input file names, labels, processing directions
      symmetry_safety_check= *error warning Check for consistency of crystal
                             symmetry from model and data files
      pdb
         file_name= None Model file(s) name (PDB)
      neutron_data Scope of neutron data and neutron free-R flags
         ignore_xn_free_r_mismatch= False
         file_name= None
         labels= None
         high_resolution= None
         low_resolution= None
         outliers_rejection= True Remove basic Wilson outliers, extreme
                             Wilson outliers, and beamstop shadow outliers
         french_wilson_scale= True
         sigma_fobs_rejection_criterion= None
         sigma_iobs_rejection_criterion= None
         ignore_all_zeros= True
         force_anomalous_flag_to_be_equal_to= None
         french_wilson
            max_bins= 60 Maximum number of resolution bins
            min_bin_size= 40 Minimum number of reflections per bin
         r_free_flags
            file_name= None This is normally the same as the file containing
                       Fobs and is usually selected automatically.
            label= None
            test_flag_value= None This value is usually selected automatically
                             - do not change unless you really know what
                             you're doing!
            ignore_r_free_flags= False Use all reflections in refinement (work
                                 and test)
            disable_suitability_test= False
            ignore_pdb_hexdigest= False If True, disables safety check based
                                  on MD5 hexdigests stored in PDB files
                                  produced by previous runs.
            generate= False Generate R-free flags (if not available in input
                      files)
            fraction= 0.1
            max_free= 2000
            lattice_symmetry_max_delta= 5
            use_lattice_symmetry= True
            use_dataman_shells= False Used to avoid biasing of the test set by
                                certain types of non-crystallographic
                                symmetry.
            n_shells= 20
      xray_data Scope of X-ray data and free-R flags
         file_name= None
         labels= None
         high_resolution= None
         low_resolution= None
         outliers_rejection= True Remove basic Wilson outliers, extreme
                             Wilson outliers, and beamstop shadow outliers
         french_wilson_scale= True
         sigma_fobs_rejection_criterion= None
         sigma_iobs_rejection_criterion= None
         ignore_all_zeros= True
         force_anomalous_flag_to_be_equal_to= None
         french_wilson
            max_bins= 60 Maximum number of resolution bins
            min_bin_size= 40 Minimum number of reflections per bin
         r_free_flags
            file_name= None This is normally the same as the file containing
                       Fobs and is usually selected automatically.
            label= None
            test_flag_value= None This value is usually selected automatically
                             - do not change unless you really know what
                             you're doing!
            ignore_r_free_flags= False Use all reflections in refinement (work
                                 and test)
            disable_suitability_test= False
            ignore_pdb_hexdigest= False If True, disables safety check based
                                  on MD5 hexdigests stored in PDB files
                                  produced by previous runs.
            generate= False Generate R-free flags (if not available in input
                      files)
            fraction= 0.1
            max_free= 2000
            lattice_symmetry_max_delta= 5
            use_lattice_symmetry= True
            use_dataman_shells= False Used to avoid biasing of the test set by
                                certain types of non-crystallographic
                                symmetry.
            n_shells= 20
      experimental_phases Scope of experimental phase information (HL
                          coefficients)
         file_name= None
         labels= None
      monomers Scope of monomers information (CIF files)
         file_name= None Monomer file(s) name (CIF)
      sequence
         file_name= None Sequence data in a text file (supported formats
                    include FASTA, PIR, and raw text). Currently this is only
                    used by the PHENIX GUI for validation.
   output Scope for output files
      prefix= None Prefix for all output files
      serial= None Serial number for consecutive refinement runs
      serial_format= "%03d" Format serial number in output file name
      title= None Brief string describing run
      write_eff_file= True
      write_geo_file= True
      write_final_geo_file= False
      write_def_file= True
      write_model_cif_file= False
      write_reflection_cif_file= False
      export_final_f_model= False Write Fobs, Fmodel, various scales and more
                            to MTZ file
      write_maps= False
      write_map_coefficients= True
      pickle_fmodel= False Dump final fmodel object into a pickle file.
      pickle_stats_by_cycle= False Dump monitored refinement statistics into a
                             pickle file.
      n_resolution_bins= None Sets the number of bins used for
                         resolution-dependent statistics in output files and
                         the Phenix GUI. If None, the binning will be
                         determined automatically.
   electron_density_maps Electron density maps calculation parameters
      apply_default_maps= None
      map_coefficients
         map_type= None
         format= *mtz phs
         mtz_label_amplitudes= None
         mtz_label_phases= None
         kicked= False
         fill_missing_f_obs= False
         acentrics_scale= 2.0 Scale terms corresponding to acentric
                          reflections (residual maps only: k==n)
         centrics_pre_scale= 1.0 Centric reflections, k!=n and k*n != 0:
                             max(k-centrics_pre_scale,0)*Fo
                             -max(n-centrics_pre_scale,0)*Fc
         sharpening= False Apply B-factor sharpening
         sharpening_b_factor= None Optional sharpening B-factor value
         exclude_free_r_reflections= False Exclude free-R selected reflections
                                     from output map coefficients
         isotropize= True
         ncs_average= False Perform NCS averaging on map using RESOLVE
                      (without density modification). Will be ignored if NCS
                      is not present.
         dev
            complete_set_up_to_d_min= False
            aply_same_incompleteness_to_complete_set_at= randomly low high
      map
         map_type= None
         format= xplor *ccp4
         file_name= None
         kicked= False
         fill_missing_f_obs= False
         grid_resolution_factor= 1/4.
         scale= *sigma volume
         region= *selection cell
         atom_selection= None
         atom_selection_buffer= 3
         acentrics_scale= 2.0 Scale terms corresponding to acentric
                          reflections (residual maps only: k==n)
         centrics_pre_scale= 1.0 Centric reflections, k!=n and k*n != 0:
                             max(k-centrics_pre_scale,0)*Fo
                             -max(n-centrics_pre_scale,0)*Fc
         sharpening= False Apply B-factor sharpening
         sharpening_b_factor= None Optional sharpening B-factor value
         exclude_free_r_reflections= False Exclude free-R selected reflections
                                     from map calculation
         isotropize= True
         ncs_average= False Perform NCS averaging on map using RESOLVE
                      (without density modification). Will be ignored if NCS
                      is not present.
   refine Scope of refinement flags (=flags defining what to refine) and atom
          selections (=atoms to be refined)
      strategy= *individual_sites *individual_sites_real_space rigid_body
                *individual_adp group_adp tls *occupancies group_anomalous 
               Atomic parameters to be refined
      sites Scope of atom selections for coordinates refinement
         individual= None Atom selections for individual atoms
         torsion_angles= None Atom selections for Torsion Angle Refinement and
                         Dynamics
         rigid_body= None Atom selections for rigid groups
      adp Scope of atom selections for ADP (Atomic Displacement Parameters)
          refinement
         group_adp_refinement_mode= *one_adp_group_per_residue
                                    two_adp_groups_per_residue group_selection
                                    Select one of three available modes for
                                    group B-factors refinement. For two groups
                                    per residue, the groups will be main-chain
                                    and side-chain atoms. Provide selections
                                    for groups if group_selection is chosen.
         group= None One isotropic ADP will be refined for the group of
                atoms selected here
         tls= None Selection(s) for TLS group(s)
         individual Scope of atom selections for refinement of individual ADP
            isotropic= None Selections for atoms to be refined with
                       isotropic ADP
            anisotropic= None Selections for atoms to be refined with
                         anisotropic ADP
      occupancies Scope of atom selections for occupancy refinement
         individual= None Selection(s) for individual atoms. None is default
                     which is to refine the individual occupancies for atoms
                     in alternative conformations or for atoms with partial
                     occupancies only.
         remove_selection= None Occupancies of selected atoms will not be
                           refined (even though they might satisfy the default
                           criteria for occupancy refinement).
         constrained_group Selections to define constrained occupancies. If
                           only one selection is provided then one occupancy
                           factor per selected atoms will be refined and it
                           will be constrained between predefined max and min
                           values.
            selection= None Atom selection string.
      anomalous_scatterers
         group
            selection= None
            f_prime= 0
            f_double_prime= 0
            refine= *f_prime *f_double_prime
   main Scope for most common and frequently used parameters
      bulk_solvent_and_scale= True Do bulk solvent correction and anisotropic
                              scaling
      apply_overall_isotropic_scale_to_adp= True
      fix_rotamers= False
      flip_peptides= False
      nqh_flips= True
      use_molprobity= True
      simulated_annealing= False Do simulated annealing
      simulated_annealing_torsion= False Do simulated annealing in torsion
                                   angle space
      ordered_solvent= False Add (or/and remove) and refine ordered solvent
                       molecules (water)
      ncs= False Use NCS restraints in refinement (can be determined
           automatically)
      ias= False Build and use IAS (interatomic scatterers) model (at
           resolutions higher than approx. 0.9 A)
      number_of_macro_cycles= 3 Number of macro-cycles to be performed
      max_number_of_iterations= 25
      use_form_factor_weights= False
      tan_u_iso= False Use tan() reparameterization in ADP refinement
                 (currently disabled)
      use_geometry_restraints= True
      secondary_structure_restraints= False Adds distance restraints for
                                      hydrogen bonds involved in secondary
                                      structure. Annotation will be done
                                      automatically if no helix or sheet
                                      records are specified, but this depends
                                      on having a good starting structure.
                                      Nucleic acid base pairs (Watson-Crick
                                      and G-U only) will also be restrained if
                                      present.
      hydrogen_bonds= False
      reference_model_restraints= False Restrains the dihedral angles to a
                                  high-resolution reference structure to
                                  reduce overfitting at low resolution. You
                                  will need to specify a reference PDB file
                                  (in the input list in the main window) to
                                  use this option.
      use_convergence_test= False Determine if refinement converged and stop
                            then
      target= *ml mlhl ml_sad ls Choices for refinement target
      min_number_of_test_set_reflections_for_max_likelihood_target= 50 minimum
                                                                    number of
                                                                    test
                                                                    reflections
                                                                    required
                                                                    for use of
                                                                    ML target
      max_number_of_resolution_bins= 30
      reference_xray_structure= None
      show_map_histogram= None
      use_experimental_phases= None Use experimental phases if available. If
                               true, the target function must be set to mlhl,
                               and a file containing Hendrickson-Lattman
                               coefficients must be supplied.
      random_seed= 2679941 Random seed
      scattering_table= wk1995 it1992 *n_gaussian electron neutron Choices of
                        scattering table for structure factors calculations
      wavelength= None X-ray wavelength, currently for testing only
      use_normalized_geometry_target= True
      target_weights_only= False Calculate target weights only and exit
                           refinement
      use_f_model_scaled= False Use Fmodel structure factors multiplied by
                          overall scale factor scale_k1
      max_d_min= 0.25 Highest allowable resolution limit for refinement
      fake_f_obs= False Substitute real experimental Fobs with those
                  calculated from input model (scales and solvent can be
                  added)
      optimize_mask= False Refine mask parameters (solvent_radius and
                     shrink_truncation_radius)
      occupancy_max= 1.0 Maximum allowable occupancy of an atom
      occupancy_min= 0.0 Minimum allowable occupancy of an atom
      stir= None Stepwise increase of resolution: start refinement at lower
            resolution and gradually proceed with higher resolution
      rigid_bond_test= False Compute Hirshfeld's rigid bond test value (RBT)
      show_residual_map_peaks_and_holes= False Show highest peaks and deepest
                                         holes in residual_map.
      fft_vs_direct= False Check accuracy of approximations used in Fcalc
                     calculations
      switch_to_isotropic_high_res_limit= 1.5 If the resolution is lower than
                                          this limit, all atoms selected for
                                          individual ADP refinement and not
                                          participating in TLS groups will be
                                          automatically converted to
                                          isotropic, whether or not ANISOU
                                          records are present in the input PDB
                                          file.
      find_and_add_hydrogens= False Find H or D atoms using difference map and
                              add them to the model. This option should be
                              used if ultra-high resolution data is available
                              or when refining against neutron data.
      process_pdb_file_reference= False
      correct_special_position_tolerance= 1.0
      use_statistical_model_for_missing_atoms= False
      nproc= 1 Determines number of processor cores to use in parallel
             routines. Currently, this only applies to automatic TLS group
             identification.
   statistical_model_for_missing_atoms
      solvent_content= 0.5
      map_type= *2mFo-DFc
      resolution_factor= 0.25
      probability_mask= True
      diff_map_cutoff= 1.5
      output_all_masks= False
      use_dm_map= False
   modify_start_model Scope of parameters to modify initial model before
                      refinement
      selection= None Selection for atoms to be modified
      renumber_residues= False Re-number residues
      truncate_to_polyala= False Truncate a model to poly-Ala. If True, other
                           options will be ignored.
      remove_alt_confs= False Deletes atoms whose altloc identifier is not
                        blank or A, and resets the occupancies of the
                        remaining atoms to 1.0. If True, other options will be
                        ignored.
      set_chemical_element_simple_if_necessary= None Make a simple guess about
                                                what the chemical element is
                                                (based on atom name and the
                                                way how it is formatted) and
                                                write it into output file.
      set_seg_id_to_chain_id= False Sets the segID field to the chain ID
                              (padded with spaces).
      clear_seg_id= False Erases the segID field.
      convert_semet_to_met= False
      remove_first_n_atoms_fraction= None
      random_seed= None Random seed
      adp Scope of options to modify ADP of selected atoms
         atom_selection= None Selection for atoms to be modified. Overrides
                         parent-level selection.
         randomize= False Randomize ADP within a certain range
         set_b_iso= None Set ADP of atoms to set_b_iso
         convert_to_isotropic= False Convert atoms to isotropic
         convert_to_anisotropic= False Convert atoms to anisotropic
         shift_b_iso= None Add shift_b_iso value to ADP
         scale_adp= None Multiply ADP by scale_adp
      sites Scope of options to modify coordinates of selected atoms
         atom_selection= None Selection for atoms to be modified. Overrides
                         parent-level selection.
         shake= None Randomize coordinates with mean error value equal to shake
         max_rotomer_distortion= None Switch to a rotamer maximally distant
                                 from the current one
         min_rotomer_distortion= None Switch to a rotamer minimally distant
                                 from the current one
         translate= 0 0 0 Translational shift
         rotate= 0 0 0 Rotational shift
         euler_angle_convention= *xyz zyz Euler angles convention to be used
                                 for rotation
      occupancies Scope of options to modify occupancies of selected atoms
         randomize= False Randomize occupancies within a certain range
         set= None Set all or selected occupancies to given value
      rotate_about_axis
         axis= None
         angle= None
         atom_selection= None
      rename_chain_id Rename chains
         old_id= None
         new_id= None
      set_charge
         charge_selection= None
         charge= None
      output Write out PDB file with modified model (file name is defined in
             write_modified)
         file_name= None Default is the original file name with the file
                    extension replaced by _modified.pdb .
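The default naming rule for the modified model can be sketched in a few lines of Python. This is only an illustration of the rule stated above, not Phenix code; the helper name default_modified_file_name is hypothetical.

```python
import os


def default_modified_file_name(input_file_name):
    """Replace the file extension with '_modified.pdb' (hypothetical helper)."""
    # splitext separates "model.pdb" into ("model", ".pdb")
    stem, _extension = os.path.splitext(input_file_name)
    return stem + "_modified.pdb"


print(default_modified_file_name("model.pdb"))
```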
   fake_f_obs Scope of parameters to simulate Fobs
      r_free_flags_fraction= None
      scattering_table= wk1995 it1992 *n_gaussian neutron Choices of
                        scattering table for structure factors calculations
      fmodel
         k_sol= 0.0 Bulk solvent k_sol values
         b_sol= 0.0 Bulk solvent b_sol values
         b_cart= 0 0 0 0 0 0 Anisotropic scale matrix
         scale= 1.0 Overall scale factor
      structure_factors_accuracy
         algorithm= *fft direct
         cos_sin_table= False
         grid_resolution_factor= 1/3.
         quality_factor= None
         u_base= None
         b_base= None
         wing_cutoff= None
         exp_table_one_over_step_size= None
      mask
         use_asu_masks= True
         solvent_radius= 1.11
         shrink_truncation_radius= 0.9
         grid_step_factor= 4.0 The grid step for the mask calculation is
                           determined as highest_resolution divided by
                           grid_step_factor. This is considered as suggested
                           value and may be adjusted internally based on the
                           resolution.
         verbose= 1
         mean_shift_for_mask_update= 0.1 Overall model shift during
                                     refinement that triggers a mask update.
         ignore_zero_occupancy_atoms= True Ignore atoms with zero occupancy
                                      in mask calculation
         ignore_hydrogens= True Ignore H or D atoms in mask calculation
         n_radial_shells= 1 Number of shells in a radial shell bulk solvent
                          model
         radial_shell_width= 1.3 Radial shell width
   hydrogens Scope of parameters for H atoms refinement
      refine= individual riding *Auto Choice for refinement: riding model or
              individual (H is refined like other atoms; useful at very high
              resolution only)
      optimize_scattering_contribution= True
      contribute_to_f_calc= True Add H contribution to Xray (Fcalc)
                            calculations
      high_resolution_limit_to_include_scattering_from_h= 1.6
      real_space_optimize_x_h_orientation= True
      xh_bond_distance_deviation_limit= 0.0 Idealize XH bond distances if
                                        deviation from ideal is greater than
                                        xh_bond_distance_deviation_limit
      local_real_space_fit_angular_step= 0.5
      build
         map_type= mFobs-DFmodel Map type to be used to find hydrogens
         map_cutoff= 2.0 Map cutoff
         angular_step= 3.0 Step in degrees for 6D rigid body search for best
                       fit
         dod_and_od= False Build DOD/OD/O types of waters for neutron models
         filter_dod= False Filter DOD/OD/O by correlation
         use_sigma_scaled_maps= True Default is sigma scaled map, map in
                                absolute scale is used otherwise.
         resolution_factor= 1./4.
         max_number_of_peaks= None
         map_next_to_model
            min_model_peak_dist= 0.7
            max_model_peak_dist= 1.05
            min_peak_peak_dist= 1.0
            use_hydrogens= False
         peak_search
            peak_search_level= 1
            max_peaks= 0
            interpolate= True
            min_distance_sym_equiv= None
            general_positions_only= False
            min_cross_distance= 1.0
            min_cubicle_edge= 5
   group_b_iso
      number_of_macro_cycles= 3
      max_number_of_iterations= 25
      convergence_test= False
      run_finite_differences_test= False
   adp
      iso
         max_number_of_iterations= 25
         automatic_randomization_if_all_equal= True
         scaling
            scale_max= 3.0
            scale_min= 10.0
   tls
      find_automatically= None
      one_residue_one_group= None
      refine_T= True
      refine_L= True
      refine_S= True
      number_of_macro_cycles= 2
      max_number_of_iterations= 25
      start_tls_value= None
      run_finite_differences_test= False
      eps= 1.e-6
      min_tls_group_size= 5 min number of atoms allowed per TLS group
      verbose= True
   adp_restraints
      iso
         use_u_local_only= False
         sphere_radius= 5.0
         distance_power= 1.69
         average_power= 1.03
         wilson_b_weight_auto= False
         wilson_b_weight= None
         plain_pairs_radius= 5.0
         refine_ap_and_dp= False
   group_occupancy
      number_of_macro_cycles= 3
      max_number_of_iterations= 25
      convergence_test= False
      run_finite_differences_test= False
   group_anomalous
      number_of_minimizer_cycles= 3
      lbfgs_max_iterations= 20
      number_of_finite_difference_tests= 0
      find_automatically= False
   rigid_body Scope of parameters for rigid body refinement
      mode= *first_macro_cycle_only every_macro_cycle Defines how many times
            the rigid body refinement is performed during refinement run.
            first_macro_cycle_only to run only once at first macrocycle,
            every_macro_cycle to do rigid body refinement
            main.number_of_macro_cycles times
      target= ls_wunit_k1 ml *auto Rigid body refinement target function:
              least-squares or maximum-likelihood
      target_auto_switch_resolution= 6.0 Used if target=auto, use optimal
                                     target for given working resolution.
      disable_final_r_factor_check= False If True, the R-factor check after
                                    refinement will not revert to the previous
                                    model, even if the R-factors have
                                    increased.
      refine_rotation= True Only rotation is refined (translation is fixed).
      refine_translation= True Only translation is refined (rotation is fixed).
      max_iterations= 25 Number of LBFGS minimization iterations
      bulk_solvent_and_scale= True Bulk-solvent and scaling within rigid body
                              refinement (needed since large rigid body shifts
                              invalidate the mask).
      euler_angle_convention= *xyz zyz Euler angles convention
      lbfgs_line_search_max_function_evaluations= 10
      min_number_of_reflections= 200 Number of reflections that defines the
                                 first lowest resolution zone for the
                                 multiple_zones protocol. If very large
                                 displacements are expected, decreasing this
                                 parameter to 100 may lead to a larger
                                 convergence radius.
      multi_body_factor= 1
      zone_exponent= 3.0
      high_resolution= 3.0 High resolution cutoff (used for rigid body
                       refinement only)
      max_low_high_res_limit= None Maximum value for high resolution cutoff
                              for the first lowest resolution zone
      number_of_zones= 5 Number of resolution zones for MZ protocol
   ncs
      find_automatically= True If enabled, Phenix will ignore existing
                          restraint groups and attempt to define appropriate
                          selections by comparing chains. This only applies to
                          global NCS restraints - if torsion restraints are
                          used, the restraint groups will always be defined
                          automatically unless the user provides custom
                          selections.
      type= *torsion cartesian
      coordinate_sigma= None
      restrain_b_factors= False If enabled, b-factors will be restrained for
                          NCS-related atoms. Otherwise, atomic b-factors will
                          be refined independently, and b_factor_weight will
                          be set to 0.0
      b_factor_weight= None
      excessive_distance_limit= 1.5
      special_position_warnings_only= False
      simple_ncs_from_pdb
         pdb_in= None Input PDB file to be used to identify ncs
         temp_dir= "" temporary directory (ncs_domain_pdb will be written
                   there)
         min_length= 10 minimum number of matching residues in a segment
         min_fraction_represented= 0.10 Minimum fraction of residues
                                   represented by NCS to keep. If less, skip
                                   NCS entirely
         njump= 1 Take every njumpth residue instead of each 1
         njump_recursion= 10 Take every njump_recursion residue instead of
                          each 1 on recursive call
         min_length_recursion= 50 minimum number of matching residues in a
                               segment for recursive call
         min_percent= 95. min percent identity of matching residues
         max_rmsd= 2. max rmsd of 2 chains. If 0, then only search for domains
         quick= True If quick is set and all chains match, just look for 1 NCS
                group
         max_rmsd_user= 3. max rmsd of chains suggested by user (i.e., if
                        called from phenix.refine with suggested ncs groups)
         maximize_size_of_groups= True You can request that the scoring be set
                                  up to maximize the number of members in NCS
                                  groups (maximize_size_of_groups=True) or
                                  that scoring is set up to maximize the
                                  length of the matching segments in the NCS
                                  group (maximize_size_of_groups=False)
         require_equal_start_match= True You can require that all matching
                                    segments start at the same relative
                                    residue number for all members of an NCS
                                    group, trimming the matching region as
                                    necessary. This is required if residue
                                    numbers in different chains are not the
                                    same, but not otherwise
         ncs_domain_pdb_stem= None NCS domains will be written to
                              ncs_domain_pdb_stem+"group_"+nn
         write_ncs_domain_pdb= False You can write out PDB files representing
                               NCS domains for density modification if you
                               want
         verbose= False Verbose output
         raise_sorry= False Raise sorry if problems
         debug= False Debugging output
         dry_run= False Just read in and check parameter names
         domain_finding_parameters
            find_invariant_domains= True Find the parts of a set of chains
                                    that follow NCS
            initial_rms= 0.5 Guess of RMS among chains
            match_radius= 2.0 Keep atoms that are within match_radius of
                          NCS-related atoms
            similarity_threshold= 0.75 Threshold for similarity between
                                  segments
            smooth_length= 0 two segments separated by smooth_length or less
                           get connected
            min_contig_length= 3 segments < min_contig_length rejected
            min_fraction_domain= 0.2 domain must be this fraction of a chain
            max_rmsd_domain= 2. max rmsd of domains
      restraint_group
         reference= None
         selection= None
         coordinate_sigma= 0.05
         b_factor_weight= 10
      torsion
         sigma= 2.5
         limit= 15.0
         similarity= .80
         fix_outliers= Auto
         check_rotamer_consistency= Auto Check for rotamer differences between
                                    NCS matched sidechains and search for best
                                    fit amongst candidate rotamers
         target_damping= False
         damping_limit= 10.0
         verbose= True
         filter_phi_psi_outliers= True
         remove_conflicting_torsion_restraints= False
         restrain_to_master_chain= False
         silence_warnings= False
         restraint_group
            selection= None
            b_factor_weight= 10
            coordinate_sigma= 0.5
      map_averaging
         resolution_factor= 0.25
         use_molecule_mask= False
         averaging_radius= 5.0
         solvent_content= 0.5
         exclude_hd= True
         skip_difference_map= Auto
   modify_f_obs
      remove= random strong weak strong_and_weak low other
      remove_fraction= 0.1
      fill_mode= fobs_mean_mixed_with_dfmodel random fobs_mean *dfmodel
   pdb_interpretation
      cdl= False Use Conformation Dependent Library (CDL) for geometry
           minimization restraints
      correct_hydrogens= False
      link_distance_cutoff= 3
      disulfide_distance_cutoff= 3
      peptide_nucleotide_distance_cutoff= 3
      dihedral_function_type= *determined_by_sign_of_periodicity
                              all_sinusoidal all_harmonic
      chir_volume_esd= 0.2
      max_reasonable_bond_distance= 50.0
      nonbonded_distance_cutoff= None
      default_vdw_distance= 1
      min_vdw_distance= 1
      nonbonded_buffer= 1 **EXPERIMENTAL, developers only**
      nonbonded_weight= None Weighting of nonbonded restraints term. By
                        default, this will be set to 16 if explicit hydrogens
                        are used (this was the default in earlier versions of
                        Phenix), or 100 if hydrogens are missing.
      const_shrink_donor_acceptor= 0.6 **EXPERIMENTAL, developers only**
      vdw_1_4_factor= 0.8
      min_distance_sym_equiv= 0.5
      custom_nonbonded_symmetry_exclusions= None
      translate_cns_dna_rna_residue_names= None
      proceed_with_excessive_length_bonds= False
      stop_for_unknowns= True Stop if any nonbonded parameters are unknown.
      altloc_weighting
         weight= False
         bonds= True
         angles= True
         factor= 1
         sqrt= True
      automatic_linking
         intra_chain= False
         amino_acid_bond_cutoff= 1.9
         rna_dna_bond_cutoff= 3.5
         intra_residue_bond_cutoff= 1.99
      apply_cif_modification
         data_mod= None
         residue_selection= None
      apply_cif_link
         data_link= None
         residue_selection_1= None
         residue_selection_2= None
      peptide_link
         ramachandran_restraints= False Restrains peptide backbone to fall
                                  within allowed regions of Ramachandran plot.
                                  Although it does not eliminate outliers, it
                                  can significantly improve the percent
                                  favored and percent outliers at low
                                  resolution. Probably not useful (and maybe
                                  even harmful) at resolutions much higher
                                  than 3.5A.
         cis_threshold= 45
         discard_omega= False
         discard_psi_phi= True
         omega_esd_override_value= None
         rama_weight= 1.0
         scale_allowed= 1.0
         rama_potential= *oldfield emsley
         rama_selection= None
         rama_exclude_sec_str= False
         oldfield
            esd= 10.0
            weight_scale= 1.0
            dist_weight_max= 10.0
            weight= None
      rna_sugar_pucker_analysis
         bond_min_distance= 1.2
         bond_max_distance= 1.8
         epsilon_range_min= 155.0
         epsilon_range_max= 310.0
         delta_range_2p_min= 129.0
         delta_range_2p_max= 162.0
         delta_range_3p_min= 65.0
         delta_range_3p_max= 104.0
         p_distance_c1p_outbound_line_2p_max= 2.9
         o3p_distance_c1p_outbound_line_2p_max= 2.4
         bond_detection_distance_tolerance= 0.5
      show_histogram_slots
         bond_lengths= 5
         nonbonded_interaction_distances= 5
         bond_angle_deviations_from_ideal= 5
         dihedral_angle_deviations_from_ideal= 5
         chiral_volume_deviations_from_ideal= 5
      show_max_items
         not_linked= 5
         bond_restraints_sorted_by_residual= 5
         nonbonded_interactions_sorted_by_model_distance= 5
         bond_angle_restraints_sorted_by_residual= 5
         dihedral_angle_restraints_sorted_by_residual= 3
         chirality_restraints_sorted_by_residual= 3
         planarity_restraints_sorted_by_residual= 3
         residues_with_excluded_nonbonded_symmetry_interactions= 12
         fatal_problem_max_lines= 10
      clash_guard
         nonbonded_distance_threshold= 0.5
         max_number_of_distances_below_threshold= 100
         max_fraction_of_distances_below_threshold= 0.1
   geometry_restraints
      edits
         excessive_bond_distance_limit= 10
         bond
            action= *add delete change
            atom_selection_1= None
            atom_selection_2= None
            symmetry_operation= None The bond is between atom_1 and
                                symmetry_operation * atom_2, with atom_1 and
                                atom_2 given in fractional coordinates.
                                Example: symmetry_operation = -x-1,-y,z
            distance_ideal= None
            sigma= None
            slack= None
         angle
            action= *add delete change
            atom_selection_1= None
            atom_selection_2= None
            atom_selection_3= None
            angle_ideal= None
            sigma= None
         planarity
            action= *add delete change
            atom_selection= None
            sigma= None
         scale_restraints Apply a scale factor to restraints for specific atom
                          selections, to tighten geometry without changing the
                          overall scale of the geometry target.
            atom_selection= None
            scale= 1.0
            apply_to= *bond *angle *dihedral *chirality
   geometry_restraints
      remove
         angles= None
         dihedrals= None
         chiralities= None
         planarities= None
   ordered_solvent
      low_resolution= 2.8 Low resolution limit for water picking (at lower
                      resolution water will not be picked even if requested)
      mode= *auto filter_only every_macro_cycle Choices for water picking
            strategy: auto - start water picking after first few macro-cycles,
            filter_only - remove water only, every_macro_cycle - do water
            update every macro-cycle
      n_cycles= 1
      output_residue_name= HOH
      output_chain_id= S
      output_atom_name= O
      scattering_type= O Defines scattering factors for newly added waters
      primary_map_type= mFobs-DFmodel
      primary_map_cutoff= 3.0
      h_bond_min_mac= 1.8
      h_bond_min_sol= 1.8
      h_bond_max= 3.2
      refine_adp= True Refine ADP for newly placed solvent.
      refine_occupancies= False Refine solvent occupancies.
      new_solvent= *isotropic anisotropic Based on the choice, added solvent
                   will have isotropic or anisotropic b-factors
      b_iso_min= 1.0 Minimum B-factor value, waters with smaller value will be
                 rejected
      b_iso_max= 80.0 Maximum B-factor value, waters with bigger value will be
                 rejected
      anisotropy_min= 0.1 For solvent refined as anisotropic: remove if
                      anisotropy is less than this value
      b_iso= None Initial B-factor value for newly added water
      occupancy_min= 0.1 Minimum occupancy value, waters with smaller value
                     will be rejected
      occupancy_max= 1.0 Maximum occupancy value, waters with bigger value
                     will be rejected
      occupancy= 1.0 Initial occupancy value for newly added water
      filter_at_start= True
      ignore_final_filtering_step= False
      correct_drifted_waters= True
      use_kick_maps= False Use Dusan Turk's kick maps for peak picking
      secondary_map_and_map_cc_filter
         cc_map_1_type= "Fc"
         cc_map_2_type= 2mFo-DFmodel
         poor_cc_threshold= 0.7
         poor_map_value_threshold= 1.0
      kick_map parameters for kick maps
         kick_size= 0.5
         number_of_kicks= 100
   peak_search
      use_sigma_scaled_maps= True Default is sigma scaled map, map in absolute
                             scale is used otherwise.
      resolution_factor= 1./4.
      max_number_of_peaks= None
      map_next_to_model
         min_model_peak_dist= 1.8
         max_model_peak_dist= 6.0
         min_peak_peak_dist= 1.8
         use_hydrogens= False
      peak_search
         peak_search_level= 1
         max_peaks= 0
         interpolate= True
         min_distance_sym_equiv= None
         general_positions_only= False
         min_cross_distance= 1.8
         min_cubicle_edge= 5
   bulk_solvent_and_scale
      mode= slow *fast
      bulk_solvent= True
      anisotropic_scaling= True
      k_sol_b_sol_grid_search= True
      minimization_k_sol_b_sol= True
      minimization_b_cart= True
      target= ls_wunit_k1 *ml
      symmetry_constraints_on_b_cart= True
      k_sol_max= 0.6
      k_sol_min= 0.0
      b_sol_max= 300.0
      b_sol_min= 0.0
      k_sol_grid_search_max= 0.6
      k_sol_grid_search_min= 0.0
      b_sol_grid_search_max= 80.0
      b_sol_grid_search_min= 20.0
      k_sol_step= 0.2
      b_sol_step= 20.0
      number_of_macro_cycles= 2
      max_iterations= 25
      min_iterations= 25
      fix_k_sol= None
      fix_b_sol= None
      fix_b_cart
         b11= None
         b22= None
         b33= None
         b12= None
         b13= None
         b23= None
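The grid-search limits and steps listed above imply a small set of trial (k_sol, b_sol) pairs. The following Python sketch enumerates them under the assumption that both end points are included; the function name grid_values is hypothetical and this is not the Phenix implementation.

```python
def grid_values(lower, upper, step):
    """Evenly spaced values from lower to upper, end points included (assumed)."""
    n_intervals = int(round((upper - lower) / step))
    # Round to suppress floating-point noise such as 0.6000000000000001
    return [round(lower + i * step, 6) for i in range(n_intervals + 1)]


# Defaults taken from the listing above
k_sol_grid = grid_values(0.0, 0.6, 0.2)     # k_sol_grid_search_min/max, k_sol_step
b_sol_grid = grid_values(20.0, 80.0, 20.0)  # b_sol_grid_search_min/max, b_sol_step

# Every combination that a plain grid search would evaluate
trial_pairs = [(k_sol, b_sol) for k_sol in k_sol_grid for b_sol in b_sol_grid]
print(len(trial_pairs), "trial pairs")
```

With the default values this gives four k_sol values and four b_sol values.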
   alpha_beta
      free_reflections_per_bin= 140
      number_of_macromolecule_atoms_absent= 225
      n_atoms_included= 0
      bf_atoms_absent= 15.0
      final_error= 0.0
      absent_atom_type= "O"
      method= *est calc
      estimation_algorithm= *analytical iterative
      verbose= -1
      interpolation= True
      number_of_waters_absent= 613
      sigmaa_estimator
         kernel_width_free_reflections= 100
         kernel_on_chebyshev_nodes= True
         number_of_sampling_points= 20
         number_of_chebyshev_terms= 10
         use_sampling_sum_weights= True
   mask
      use_asu_masks= True
      solvent_radius= 1.11
      shrink_truncation_radius= 0.9
      grid_step_factor= 4.0 The grid step for the mask calculation is
                        determined as highest_resolution divided by
                        grid_step_factor. This is considered as suggested
                        value and may be adjusted internally based on the
                        resolution.
      verbose= 1
      mean_shift_for_mask_update= 0.1 Overall model shift during
                                  refinement that triggers a mask update.
      ignore_zero_occupancy_atoms= True Ignore atoms with zero occupancy in
                                   mask calculation
      ignore_hydrogens= True Ignore H or D atoms in mask calculation
      n_radial_shells= 1 Number of shells in a radial shell bulk solvent model
      radial_shell_width= 1.3 Radial shell width
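The grid_step_factor rule described above is simple arithmetic. A minimal Python sketch follows; the function name suggested_mask_grid_step is hypothetical, and the internal adjustment mentioned in the help text is not modelled.

```python
def suggested_mask_grid_step(highest_resolution, grid_step_factor=4.0):
    """Suggested mask grid step in Angstrom: highest_resolution / grid_step_factor."""
    if grid_step_factor <= 0:
        raise ValueError("grid_step_factor must be positive")
    return highest_resolution / grid_step_factor


# Data extending to 2.0 Angstrom with the default factor of 4.0
print(suggested_mask_grid_step(2.0))
```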
   tardy Under development
      mode= every_macro_cycle *second_and_before_last once first first_half
      xray_weight_factor= 10
      start_temperature_kelvin= 2500
      final_temperature_kelvin= 300
      velocity_scaling= True
      temperature_cap_factor= 1.5
      excessive_temperature_factor= 5
      number_of_cooling_steps= 500
      number_of_time_steps= 1
      time_step_pico_seconds= 0.001
      temperature_degrees_of_freedom= *cartesian constrained
      minimization_max_iterations= 0
      omit_bonds_with_slack_greater_than= 0
      constrain_dihedrals_with_sigma_less_than= 10
      near_singular_hinges_angular_tolerance_deg= 5
      emulate_cartesian= False
      trajectory_directory= None
      prolsq_repulsion_function_changes energy(delta) = c_rep *
                                        (max(0, (k_rep*vdw_distance)**irexp
                                        - delta**irexp))**rexp
         c_rep= None Usual value: 16
         k_rep= 0.75 Usual value: 1. Smaller values reduce the distance
                threshold at which the repulsive force becomes active.
         irexp= None Usual value: 1
         rexp= None Usual value: 4
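The repulsion term given for prolsq_repulsion_function_changes can be evaluated directly. The Python sketch below is a transcription of that formula; the function name is hypothetical, and the defaults are the "usual values" quoted in the listing (note that the listed default for k_rep in this scope is 0.75, not 1).

```python
def prolsq_repulsion_energy(delta, vdw_distance,
                            c_rep=16.0, k_rep=1.0, irexp=1, rexp=4):
    """energy(delta) = c_rep*(max(0,(k_rep*vdw_distance)**irexp-delta**irexp))**rexp"""
    overlap = max(0.0, (k_rep * vdw_distance) ** irexp - delta ** irexp)
    return c_rep * overlap ** rexp


# Atoms 2.0 A apart with a 3.0 A van der Waals distance
print(prolsq_repulsion_energy(2.0, 3.0))              # usual k_rep = 1
print(prolsq_repulsion_energy(2.0, 3.0, k_rep=0.75))  # k_rep as listed above
```

A smaller k_rep shrinks the distance at which the term becomes non-zero, which matches the description of k_rep above.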
   cartesian_dynamics
      temperature= 300
      number_of_steps= 200
      time_step= 0.0005
      initial_velocities_zero_fraction= 0
      n_print= 100
      verbose= -1
   simulated_annealing
      start_temperature= 5000
      final_temperature= 300
      cool_rate= 100
      number_of_steps= 50
      time_step= 0.0005
      initial_velocities_zero_fraction= 0
      interleave_minimization= False
      n_print= 100
      update_grads_shift= 0.3
      refine_sites= True
      refine_adp= False
      max_number_of_iterations= 25
      mode= every_macro_cycle *second_and_before_last once first first_half
      verbose= -1
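To get a feel for the simulated_annealing defaults, the temperature schedule can be tabulated. The Python sketch below assumes the temperature is lowered linearly by cool_rate from start_temperature down to final_temperature; that schedule is an assumption made for illustration, and the function name is hypothetical.

```python
def annealing_temperatures(start_temperature=5000.0,
                           final_temperature=300.0,
                           cool_rate=100.0):
    """List of temperatures for a linear cooling schedule (assumed form)."""
    if cool_rate <= 0:
        raise ValueError("cool_rate must be positive")
    temperatures = []
    temperature = start_temperature
    while temperature >= final_temperature:
        temperatures.append(temperature)
        temperature -= cool_rate
    return temperatures


schedule = annealing_temperatures()
print(len(schedule), "temperature stages from", schedule[0], "to", schedule[-1])
```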
   target_weights
      optimize_xyz_weight= False
      optimize_adp_weight= False
      wxc_scale= 0.5
      wxu_scale= 1.0
      wc= 1.0
      wu= 1.0
      fix_wxc= None
      fix_wxu= None
      shake_sites= True
      shake_adp= 10.0
      regularize_ncycles= 50
      verbose= 1
      wnc_scale= 0.5
      wnu_scale= 1.0
      rmsd_cutoff_for_gradient_filtering= 3.0
      force_optimize_weights= False
      weight_selection_criteria
         bonds_rmsd= None
         angles_rmsd= None
         r_free_minus_r_work= None
         r_free_range_width= None
         mean_diff_b_iso_bonded_fraction= None
         min_diff_b_iso_bonded= None
   ias
      b_iso_max= 100.0
      occupancy_min= -1.0
      occupancy_max= 1.5
      ias_b_iso_max= 100.0
      ias_b_iso_min= 0.0
      ias_occupancy_min= 0.01
      ias_occupancy_max= 3.0
      initial_ias_occupancy= 1.0
      build_ias_types= L R B BH
      ring_atoms= None
      use_map= True
      build_only= False
      file_prefix= None
      lone_pair
         atom_x= CA
         atom_xo= C
         atom_o= O
      peak_search_map
         map_type= *Fobs-Fmodel mFobs-DFmodel
         grid_step= 0.25
         scaling= *volume sigma
   ls_target_names
      target_name= *ls_wunit_k1 ls_wunit_k2 ls_wunit_kunit ls_wunit_k1_fixed
                   ls_wunit_k1ask3_fixed ls_wexp_k1 ls_wexp_k2 ls_wexp_kunit
                   ls_wff_k1 ls_wff_k2 ls_wff_kunit ls_wff_k1_fixed
                   ls_wff_k1ask3_fixed lsm_kunit lsm_k1 lsm_k2 lsm_k1_fixed
                   lsm_k1ask3_fixed
   twinning
      twin_law= None
      twin_target= *twin_lsq_f
      detwin
         mode= algebraic proportional *auto
         map_types
            twofofc= *two_m_dtfo_d_fc two_dtfo_fc
            fofc= *m_dtfo_d_fc gradient m_gradient
            aniso_correct= False
   structure_factors_and_gradients_accuracy
      algorithm= *fft direct
      cos_sin_table= False
      grid_resolution_factor= 1/3.
      quality_factor= None
      u_base= None
      b_base= None
      wing_cutoff= None
      exp_table_one_over_step_size= None
   r_free_flags
      fraction= 0.1
      max_free= 2000
      lattice_symmetry_max_delta= 5.0 Tolerance used in the determination of
                                  the highest lattice symmetry. Can be thought
                                  of as angle between lattice vectors that
                                  should line up perfectly if the symmetry is
                                  ideal. A typical value is 3 degrees.
      use_lattice_symmetry= True When generating Rfree flags, do so in the
                            asymmetric unit of the highest lattice symmetry.
                            The result is an Rfree set suitable for twin
                            refinement.
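The interplay of fraction and max_free can be illustrated with a short Python sketch. It assumes that max_free acts as an upper limit on the size of the test set selected by fraction; this reading is an assumption, and the function name number_of_free_reflections is hypothetical.

```python
def number_of_free_reflections(n_reflections, fraction=0.1, max_free=2000):
    """Size of the test set: fraction of all reflections, capped at max_free."""
    n_free = int(round(n_reflections * fraction))
    if max_free is not None:
        n_free = min(n_free, max_free)
    return n_free


print(number_of_free_reflections(10000))  # below the cap
print(number_of_free_reflections(50000))  # limited by max_free
```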
   fit_side_chains
      number_of_macro_cycles= 1
      real_space_refine_overall= False
      validate_change= True
      exclude_hydrogens= True
      filter_residual_map_value= 2.0
      filter_2fofc_map= None
      target_map= 2mFo-DFc
      residual_map= mFo-DFc
      model_map= Fc
      exclude_free_r_reflections= False
      use_dihedral_restraints= False
      ignore_water_when_move_sidechains= True
      residue_iteration
         poor_cc_threshold= 0.9
         real_space_refine_rotamer= True
         real_space_refine_max_iterations= 25
         real_space_refine_target_weight= 100.
         use_rotamer_iterator= True
         torsion_grid_search= True
         ignore_alt_conformers= True
         torsion_search
            min_angle_between_solutions= 5
            range_start= -40
            range_stop= 40
            step= 2
   flip_peptides
      number_of_macro_cycles= 1
      real_space_refine_overall= False
      validate_change= True
      exclude_hydrogens= True
      filter_residual_map_value= 2.0
      filter_2fofc_map= None
      target_map= 2mFo-DFc
      residual_map= mFo-DFc
      model_map= Fc
      exclude_free_r_reflections= False
      ignore_water_when_flipping= True
      skip_approximate_helices= True
      residue_iteration
         poor_cc_threshold= 0.8
         real_space_refine_peptide= True
         real_space_refine_window= 1
         real_space_refine_max_iterations= 25
         real_space_refine_target_weight= 100.
         torsion_grid_search= True
         ignore_alt_conformers= True
         torsion_search
            min_angle_between_solutions= 5
            range_start= -40
            range_stop= 40
            step= 2
   secondary_structure
      input
         file_name= None
         use_hydrogens= True
         include_helices= True
         include_sheets= True
         find_automatically= None
         helices_from_phi_psi= False
         force_nucleic_acids= False This will ignore the automatic chain type
                              detection and run the base pair detection using
                              PROBE even if no nucleic acids are found. Useful
                              for tRNAs which have a large number of modified
                              bases.
         use_ksdssp= True Use KSDSSP program to annotate secondary structure.
                     If False, a built-in DSSP method will be used instead.
      h_bond_restraints
         verbose= False
         substitute_n_for_h= None
         restrain_helices= True
         alpha_only= False
         restrain_sheets= True
         restrain_base_pairs= True
         remove_outliers= None
         distance_ideal_n_o= 2.9
         distance_cut_n_o= 3.5
         distance_ideal_h_o= 1.975
         distance_cut_h_o= 2.5
         sigma= 0.05
         slack= 0.0
         top_out= False
      helix
         selection= None
         helix_type= *alpha pi 3_10 unknown Type of helix, defaults to alpha.
                     Only alpha, pi, and 3_10 helices are used for
                     hydrogen-bond restraints.
         restraint_sigma= None
         restraint_slack= None
         backbone_only= False Only applies to rigid-body groupings, and not
                        H-bond restraints which are already backbone-only.
      sheet
         first_strand= None
         restraint_sigma= None
         restraint_slack= None
         backbone_only= False Only applies to rigid-body groupings, and not
                        H-bond restraints which are already backbone-only.
         strand
            selection= None
            sense= parallel antiparallel *unknown
            bond_start_current= None
            bond_start_previous= None
      nucleic_acids
         sigma= None Defaults to global setting
         slack= None Defaults to global setting
         use_db_values= True
         base_pair
            base1= None
            base2= None
            saenger_class= None reference
            leontis_westhof_class= *Auto wwt reference
   hydrogen_bonding
      restraint_type= *Auto simple lennard_jones implicit
      include_side_chains= True
      optimize_hbonds= False
      optimize_hbonds_thorough= False
      optimize_mode= *first last every_macro_cycle
      restraints_weight= 1.0
      falloff_distance= 0.05
      exclude_nonbonded= True
      distance_ideal_h_o= 1.975
      distance_cut_h_o= 2.5
      distance_ideal_n_o= 2.9
      distance_cut_n_o= 3.5
      implicit Based on H-bond potential for CNS by Chapman lab
         theta_high= 155
         theta_low= 115
         theta_cut= 90
      explicit Similar to Rosetta H-bond energy (Kortemme & Baker)
         theta_ideal= 180
         theta_sigma= 5
         psi_ideal= 155
         psi_sigma= 5
         relative_weights= 1.0 1.0 1.0
      lennard_jones
         potential= *4_6 6_12
      simple Pseudo-bond restraints
         sigma= 0.05
         slack= 0.0
   reference_model
      use_distance_based_target= False
      file= None
      use_starting_model_as_reference= False
      sigma= 1.0
      limit= 15.0
      hydrogens= False
      main_chain= True
      side_chain= True
      fix_outliers= True
      strict_rotamer_matching= False
      auto_shutoff_for_ncs= False
      SSM_alignment= True
      similarity= .80
      secondary_structure_only= False
      reference_group
         reference= None
         selection= None
         file_name= None This is used internally to disambiguate cases
                    where multiple reference models contain the same chain
                    ID. It normally does not need to be set by the user.
   gui Miscellaneous parameters for phenix.refine GUI
      base_output_dir= None
      tmp_dir= None
      send_notification= False
      notify_email= None
      add_hydrogens= False Runs phenix.ready_set to add hydrogens prior to
                     refinement.
      skip_rsr= False
      skip_kinemage= False
      phil_file= None
      ready_set_hydrogens
         neutron_option= *all_h all_d hd_and_h hd_and_d all_hd
         add_h_to_water= False
         add_d_to_water= False
         neutron_exchange_hydrogens= False Add deuteriums in exchangeable sites
         perdeuterate= False Add deuteriums in all possible sites
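
The indentation in the listing above encodes scope nesting: a line without "=" opens a scope, and the "name= value" lines indented beneath it belong to that scope. As a minimal sketch of how that nesting maps onto dotted parameter paths of the kind used in PHIL-style parameter files, the Python below flattens such a listing. The function name dotted_names and the SAMPLE text are hypothetical and are not part of PHENIX. The paths it produces are relative to the outermost scope present in the text it is given, so any enclosing scope not shown in the listing is not included. It also assumes that wrapped help text only ever follows a parameter line, which holds for the entries above.

```python
def dotted_names(listing):
    """Flatten an indented 'name= value' listing into dotted parameter paths.

    Hypothetical helper for illustration only; not part of PHENIX.
    """
    scopes = []               # stack of (indent, scope_name) for open scopes
    params = []
    last_param_indent = None  # indent of the most recent parameter line
    for raw in listing.splitlines():
        if not raw.strip():
            continue
        indent = len(raw) - len(raw.lstrip())
        first = raw.split()[0]
        if first.endswith("="):
            # Parameter line: close scopes that are not shallower than it.
            while scopes and scopes[-1][0] >= indent:
                scopes.pop()
            names = [name for _, name in scopes] + [first[:-1]]
            params.append(".".join(names))
            last_param_indent = indent
        elif last_param_indent is not None and indent > last_param_indent:
            # Wrapped help text belonging to the previous parameter: skip it.
            continue
        else:
            # Scope line: the first token is the scope name, the rest is help.
            while scopes and scopes[-1][0] >= indent:
                scopes.pop()
            scopes.append((indent, first))
            last_param_indent = None
    return params


SAMPLE = """\
reference_model
   sigma= 1.0
   reference_group
      selection= None
      file_name= None This is used internally to disambiguate cases
                 where multiple reference models contain the same chain ID.
gui Miscellaneous parameters for phenix.refine GUI
   skip_rsr= False
"""

for name in dotted_names(SAMPLE):
    print(name)
```

Applied to SAMPLE, the nested entry selection under reference_group is reported as reference_model.reference_group.selection, and the wrapped description after file_name is ignored rather than mistaken for a scope.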