Python-based Hierarchical ENvironment for Integrated Xtallography

Automated molecular replacement with AutoMR

Author(s)
Purpose
Purpose of the AutoMR Wizard
Usage
Summary of inputs and outputs for AutoMR
Output files from AutoMR
How to run the AutoMR Wizard
Components, copies, search models, and ensembles
What the AutoMR wizard needs to run
Specifying which columns of data to use from input data files
Examples
Standard AutoMR run with coords.pdb native.sca
Specifying data columns
Specifying a refinement file for AutoBuild
Passing any commands to AutoBuild
AutoMR searching for 2 components
Specifying molecular masses of 2 components
AutoMR searching for 2 components, but specifying the orientation of one of them
Possible Problems
Specific limitations and problems
Literature
Additional information
List of all AutoMR keywords

Author(s)

  • Phaser: Randy J. Read, Airlie J. McCoy and Laurent C. Storoni
  • AutoMR Wizard: Tom Terwilliger, Laurent Storoni, Randy Read, and Airlie McCoy
  • PHENIX GUI and PDS Server: Nigel W. Moriarty
  • phenix.xtriage: Peter Zwart

Purpose

Purpose of the AutoMR Wizard

The AutoMR Wizard provides a convenient interface to Phaser molecular replacement and feeds the results of molecular replacement directly into the AutoBuild Wizard for automated model rebuilding.

The AutoMR Wizard starts from data files containing structure factor amplitudes and uncertainties, together with one or more search models, and identifies placements of the search models that are compatible with the data.

Usage

The AutoMR Wizard can be run from the PHENIX GUI, from the command-line, and from keyworded script files. All three versions are identical except in the way that they take commands from the user. See Running a Wizard from a GUI, the command-line, or a script for details of how to run a Wizard. The command-line version will be described here.

NOTE: You may find it easiest to run the GUI version of AutoMR when you are learning how to use it, and then to move to the command-line or script versions later, as the GUI version will take you through all the necessary steps of organizing your data.

Summary of inputs and outputs for AutoMR

Input data file. This file can be in almost any format, and must contain either amplitudes or intensities and sigmas. You can specify what resolution to use for molecular replacement and, separately, what resolution to use for model rebuilding. If you specify "0.0" for resolution (recommended), then defaults will be used for molecular replacement (i.e., data to 2.5 A will be used, if available, to solve the structure, and rigid-body refinement of the final solution will then be carried out with all data), and all the data will be used for model rebuilding.
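
The resolution defaults described above can be summarized in a few lines of Python. This is a minimal sketch, not AutoMR's own code: the function name mr_search_resolution and its arguments are hypothetical, and it assumes that "2.5 A if available" means the search falls back to the limit of the data when the data do not extend to 2.5 A.

```python
def mr_search_resolution(requested, data_limit, default_cutoff=2.5):
    """Return the high-resolution limit (in Angstrom) for the MR search.

    requested      -- value given as resolution= (0.0 means "use defaults")
    data_limit     -- highest resolution present in the data file
    default_cutoff -- the 2.5 A default quoted in the text above
    """
    if requested == 0.0:
        # Smaller numbers mean higher resolution, so max() picks the coarser limit.
        return max(default_cutoff, data_limit)
    # Assumption: an explicit request cannot go beyond what the data contain.
    return max(requested, data_limit)


print(mr_search_resolution(0.0, 1.8))  # data extend beyond 2.5 A
print(mr_search_resolution(0.0, 3.2))  # data stop short of 2.5 A
```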

Composition of the asymmetric unit. PHASER needs to know what the total mass in the asymmetric unit is (i.e. not just the mass of the search models). You can define this either by specifying one or more protein or nucleic acid sequence files, or by specifying protein or nucleic acid molecular masses, and telling the Wizard how many copies of each are present.
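
The total mass that has to be described is the sum over all components of the mass of one copy times the number of copies. The sketch below only illustrates this arithmetic; the function name is hypothetical, and the masses (taken to be in Da) are the ones used in the two-component example later on this page.

```python
def total_asu_mass(components):
    """Sum mass * copies over all components in the asymmetric unit.

    components -- iterable of (mass, copies) pairs, one pair per component
    """
    return sum(mass * copies for mass, copies in components)


# Two components with one copy each, as in the two-component example below.
components = [(30000, 1), (20000, 1)]
print(total_asu_mass(components))
```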

Space groups to search. You can request that all space groups with the same point group as the one you start out with be searched, and the best one be chosen. If you select this option then the best space group will be used for model rebuilding in AutoBuild.

Ensembles to search for. AutoMR builds up a model by finding a set of good positions and orientations of one "ensemble", and then using each of those placements as starting points for finding the next ensemble, until all the contents of the asymmetric unit are found and a consistent solution is obtained. You can specify any number of different ensembles to search for, and you can search for any number of copies of each ensemble. The order of searching for ensembles does make a difference. If possible, you want to search for the biggest, best-ordered, most accurate ensemble first. You specify the order when you list the ensembles to search for on the last main window of the AutoMR wizard.

Each ensemble can be specified by a single PDB file or a set of PDB files. All the PDB files in one ensemble must be oriented in the same way, because they are combined and always used as a single group in the molecular replacement process.

You will need to specify how similar you think each input PDB file that is part of an ensemble is to the structure that is in your crystal. You can specify either sequence identity, or expected rmsd. Note that if you use a homology model, you should give the sequence identity of the template from which the model was constructed, not the 100% identity of the model!
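
To illustrate how a sequence identity can stand in for an expected rmsd, the sketch below uses the relation attributed to Chothia and Lesk (1986), rmsd = max(0.8, 0.4 exp(1.87 (1 - f))), where f is the fractional sequence identity. Whether the Phaser version behind AutoMR applies exactly this relation is an assumption here, not something stated on this page, and the function name is hypothetical.

```python
import math


def estimated_rms_from_identity(identity_percent):
    """Estimate the expected rmsd (in Angstrom) from percent sequence identity.

    Assumed relation: max(0.8, 0.4 * exp(1.87 * (1 - f))), f = identity / 100.
    """
    if not 0.0 <= identity_percent <= 100.0:
        raise ValueError("identity must be between 0 and 100 percent")
    fraction = identity_percent / 100.0
    return max(0.8, 0.4 * math.exp(1.87 * (1.0 - fraction)))


for identity in (100, 50, 30):
    print(identity, round(estimated_rms_from_identity(identity), 2))
```

Under this assumed relation, entering 100% identity for a homology model would claim the smallest possible error, which is why the identity of the template should be given instead.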

Output of AutoMR

Output files from AutoMR

When you run AutoMR the output files will be in a subdirectory with your run number:

AutoMR_run_1_/   # subdirectory with results

  • A summary file listing the results of the run and the other files produced:
    AutoMR_summary.dat  # overall summary
    

  • A warnings file listing any warnings about the run
    AutoMR_warnings.dat  # any warnings
    

  • A file that lists all parameters and knowledge accumulated by the Wizard during the run (some parts are binary and are not printed)
    AutoMR_Facts.dat   # all Facts about the run
    

  • Molecular replacement model, structure factors, and map coefficients:
    MR.1.pdb
    MR.1.mtz
    MR.MAP_COEFFS.1.mtz
    
    The AutoMR wizard writes out MR.1.pdb, MR.1.mtz, and MR.MAP_COEFFS.1.mtz, as well as output log files. The MR.1.pdb file will contain all the components of your MR solution. If there are multiple PDB files in an ensemble, the model with the lowest estimated rmsd is chosen to represent the whole ensemble and is written to MR.1.pdb. If there are multiple copies of a model, the chains are lettered sequentially A B C... The MR.1.mtz file contains the data from your input file to the full resolution available. The MR.MAP_COEFFS.1.mtz file contains sigmaA-weighted 2Fo-Fc map coefficients based on the rigid-body-refined model.
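
If you drive AutoMR from a script, you may want to check that the files listed above were written. The following is a minimal sketch: the helper names are hypothetical, and it assumes that run N always writes to a subdirectory named AutoMR_run_N_ in the working directory, following the AutoMR_run_1_ pattern shown above.

```python
from pathlib import Path

# File names exactly as listed in the section above.
OUTPUT_NAMES = [
    "AutoMR_summary.dat",
    "AutoMR_warnings.dat",
    "AutoMR_Facts.dat",
    "MR.1.pdb",
    "MR.1.mtz",
    "MR.MAP_COEFFS.1.mtz",
]


def automr_output_files(run_number=1, base_dir="."):
    """Return the expected output paths for one AutoMR run."""
    run_dir = Path(base_dir) / f"AutoMR_run_{run_number}_"
    return [run_dir / name for name in OUTPUT_NAMES]


def missing_outputs(run_number=1, base_dir="."):
    """Return the expected files that do not exist on disk."""
    return [p for p in automr_output_files(run_number, base_dir) if not p.exists()]


for path in automr_output_files(1):
    print(path)
```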

Model rebuilding. After PHASER molecular replacement the AutoMR Wizard loads the AutoBuild Wizard and sets the defaults based on the MR solution that has just been found. You can use the default values, or you may choose to use 2Fo-Fc maps instead of density-modified maps for rebuilding, or you may choose to start the model-rebuilding with the map coefficients from MR.MAP_COEFFS.1.mtz.

How to run the AutoMR Wizard

Running the AutoMR Wizard is easy. For example, from the command-line you can type:

phenix.automr native.sca search.pdb RMS=0.8 mass=23000 copies=1

The AutoMR Wizard will find the best location and orientation of the search model search.pdb in the unit cell based on the data in native.sca, assuming that the RMSD between the correct model and search.pdb is about 0.8 A, that the molecular mass of the true model is 23000 and that there is 1 copy of this model in the asymmetric unit. Once the AutoMR Wizard has found a solution, it will automatically call the AutoBuild Wizard and rebuild the model.
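
The same command can be assembled in a Python script. This is a sketch only: automr_command is a hypothetical helper, it uses just the keywords from the command above, and the final subprocess call is left commented out because it needs a PHENIX installation on your PATH.

```python
import shlex


def automr_command(data, coords, rms, mass, copies):
    """Build the argument list for a simple one-model AutoMR run."""
    return [
        "phenix.automr",
        data,
        coords,
        f"RMS={rms}",
        f"mass={mass}",
        f"copies={copies}",
    ]


cmd = automr_command("native.sca", "search.pdb", 0.8, 23000, 1)
print(shlex.join(cmd))

# To launch the run (requires PHENIX):
# import subprocess
# subprocess.run(cmd, check=True)
```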

Components, copies, search models, and ensembles

  • Your structure is composed of one or more components, such as a 20 kDa subunit with sequence seq-of-20Kd-subunit.

  • There may be one or more copies of each component in your structure.

  • You can search for the location(s) of a component with a search model that consists of a single structure or an ensemble of structures.

What the AutoMR wizard needs to run

In a simple case where you have one search model and are looking for N copies of this model in your structure, you need:

  • (1) a datafile name (native.sca or data=native.sca)

  • (2) a search model (search_model.pdb or coords=search_model.pdb)

  • (3) how similar the search model is to your structure (RMS=0.8 or identity=75)

  • (4) information about the contents of the asymmetric unit: (mass=23000 or seq_file=seq.dat) and (copies=1)

It may be advantageous to search using an ensemble of similar structures, rather than a single structure. If you have an ensemble of search models to search for, then specify it as

coords='model_1.pdb model_2.pdb model_3.pdb'

In this case you need to give the RMS or identity for each model: identity='45 40 35'. Each of the models in the ensemble must be in the same orientation as the others, so that the ensemble of models can be placed as a group in the unit cell.
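
The requirement of one RMS or identity value per model is easy to check before a run is started. The sketch below is an illustration with a hypothetical helper name. It returns arguments suitable for a subprocess argument list, where each element is passed as a single argument, so the quotes needed on a shell command line are not required.

```python
def ensemble_args(models, identities=None, rms=None):
    """Build coords= and identity= (or RMS=) arguments for one ensemble."""
    if (identities is None) == (rms is None):
        raise ValueError("give exactly one of identities or rms")
    key, values = ("identity", identities) if identities is not None else ("RMS", rms)
    if len(values) != len(models):
        raise ValueError("need one identity or RMS value per model")
    return [
        "coords=" + " ".join(models),
        key + "=" + " ".join(str(v) for v in values),
    ]


args = ensemble_args(
    ["model_1.pdb", "model_2.pdb", "model_3.pdb"], identities=[45, 40, 35]
)
print(args)
```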

If you are searching for more than one ensemble, or if there is more than one component in the a.u., then use the full syntax and specify them as (NOTE copies becomes copies_to_find or component_copies):

ensemble_1.coords=s1.pdb ensemble_1.RMS=0.8 ensemble_1.copies_to_find=1 \
   component_1.mass=23000 component_1.component_copies=1
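
When there are several ensembles and components, the numbered keywords can be generated rather than typed by hand. This is a minimal sketch with a hypothetical helper name. It simply prefixes each keyword with ensemble_N or component_N in the order given and does not check that the keyword names are valid.

```python
def full_syntax_args(ensembles, components):
    """Turn lists of keyword dictionaries into scoped AutoMR arguments."""
    args = []
    for prefix, items in (("ensemble", ensembles), ("component", components)):
        for number, keywords in enumerate(items, start=1):
            for key, value in keywords.items():
                args.append(f"{prefix}_{number}.{key}={value}")
    return args


args = full_syntax_args(
    ensembles=[{"coords": "s1.pdb", "RMS": 0.8, "copies_to_find": 1}],
    components=[{"mass": 23000, "component_copies": 1}],
)
print(" ".join(args))
```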

Specifying which columns of data to use from input data files

If one or more of your data files has column names that the Wizard cannot identify automatically, you can specify them yourself. You will need to provide one column "name" for each expected column of data, with "None" for anything that is missing.

For example, if your data file data.mtz has columns F SIGF then you might specify

data=data.mtz
input_label_string="F SIGF"

You can find out all the possible label strings in a data file that you might use by typing:

phenix.automr display_labels=data.mtz  # display all labels for data.mtz
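
The rule of one name per expected column, with "None" for anything missing, can be written as a small helper. This sketch is illustrative only. The function name is hypothetical, and which columns are expected depends on your data file.

```python
def label_string(columns):
    """Join column names into a label string, writing None for missing ones."""
    return " ".join("None" if name is None else name for name in columns)


print("input_label_string=" + label_string(["F", "SIGF"]))
print("input_label_string=" + label_string(["F", None]))
```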

You can specify many more parameters as well. See the list of keywords, defaults and descriptions at the end of this page and also general information about running Wizards at Running a Wizard from a GUI, the command-line, or a script for how to do this. Some of the most common parameters are:

data=w1.sca       # data file
coords=coords.pdb # search model
seq_file=seq.dat  # sequence file

Examples

Standard AutoMR run with coords.pdb native.sca

Run AutoMR using coords.pdb as the search model and native.sca as the data, assuming that the RMS between coords.pdb and the true model is about 0.85 A, that the sequence of the true model is in seq.dat, and that there is 1 copy in the asymmetric unit:

phenix.automr coords.pdb native.sca RMS=0.85 seq.dat copies=1  \
    n_cycle_rebuild_max=2 n_cycle_build_max=2

Specifying data columns

Run AutoMR as above, but specify the data columns explicitly:

phenix.automr coords.pdb RMS=0.85 seq.dat copies=1  \
    data=data.mtz input_label_string="F SIGF"  \
    n_cycle_rebuild_max=2 n_cycle_build_max=2

Note that the data columns are specified by a string that includes both F and SIGF: "F SIGF". The string must match some set of data labels that can be extracted automatically from your data file. You can find the possible values of this string as described above with

phenix.automr display_labels=data.mtz

Specifying a refinement file for AutoBuild

Run AutoMR as above, but specify a refinement file that is different from the file used for the MR search:

phenix.automr coords.pdb RMS=0.85 seq.dat copies=1  \
    data=data.mtz input_label_string="F SIGF"  \
    input_refinement_file=refinement.mtz \
    input_refinement_labels="FP SIGFP FreeR_flag"  \
    n_cycle_rebuild_max=2 n_cycle_build_max=2

Note that the commands input_refinement_file and input_refinement_labels are in the scope "autobuild_variables". These commands and the others in this scope are passed on to AutoBuild.

Passing any commands to AutoBuild

You can pass any AutoBuild commands on to AutoBuild, even if they are not already defined for you in AutoMR. Use the command autobuild_input_list_add to add a command, and then apply that command by adding "autobuild_" to the beginning of the command name. For example, to add the commands semet=True and refine=False:

phenix.automr coords.pdb RMS=0.85 seq.dat copies=1  \
    data=data.mtz input_label_string="F SIGF"  \
    autobuild_input_list_add='semet refine'  \
    semet=True \
    refine=False

Notes: This applies only to command-line operation of AutoMR. Any keywords that are used in both AutoBuild and AutoMR will apply to both if you specify them in autobuild_input_list_add. For example, if you set the resolution in AutoBuild with autobuild_input_list_add=resolution and resolution=2.6, then this resolution will apply to both AutoMR and AutoBuild.

AutoMR searching for 2 components

Run AutoMR on a structure with 2 components. Define the components of the asymmetric unit with sequence files (beta.seq and blip.seq) and number of copies of each component (1). Define the search models with PDB files and estimated RMS from true structures.

phenix.automr data=beta_blip_P3221.mtz input_label_string="Fobs Sigma"  \
 resolution=0.0 resolution_build=3.0                               \
 component_1.component_type=protein component_1.seq_file=beta.seq  \
 component_1.component_copies=1                                    \
 component_2.component_type=protein component_2.seq_file=blip.seq  \
 component_2.component_copies=1                                    \
 ensemble_1.coords=beta.pdb ensemble_1.RMS=0.85 ensemble_1.copies_to_find=1 \
 ensemble_2.coords=blip.pdb ensemble_2.RMS=0.90 ensemble_2.copies_to_find=1 \
 n_cycle_rebuild_max=1

Specifying molecular masses of 2 components

Run AutoMR as in the previous example, except specify the components of the asymmetric unit with molecular masses (30000 and 20000), and define the search models with PDB files and percent sequence identity with the true structures (50% and 60%).

phenix.automr data=beta_blip_P3221.mtz input_label_string="Fobs Sigma"  \
 resolution=0.0 resolution_build=3.0                               \
 component_1.component_type=protein component_1.mass=30000  \
 component_1.component_copies=1                                    \
 component_2.component_type=protein component_2.mass=20000 \
 component_2.component_copies=1                                    \
 ensemble_1.coords=beta.pdb ensemble_1.identity=50 ensemble_1.copies_to_find=1 \
 ensemble_2.coords=blip.pdb ensemble_2.identity=60 ensemble_2.copies_to_find=1 \
 n_cycle_rebuild_max=1

AutoMR searching for 2 components, but specifying the orientation of one of them

Run AutoMR on a structure with 2 components. Define the components of the asymmetric unit with sequence files (beta.seq and blip.seq) and number of copies of each component (1). Define the search models with PDB files and estimated RMS from true structures. Define the orientation and position of one component. Define the number of copies to find for each component (0 for beta, which is fixed, 1 for blip).

phenix.automr data=beta_blip_P3221.mtz input_label_string="Fobs Sigma"  \
 resolution=0.0 resolution_build=3.0                               \
 component_1.component_type=protein component_1.seq_file=beta.seq  \
 component_1.component_copies=1                                    \
 component_2.component_type=protein component_2.seq_file=blip.seq  \
 component_2.component_copies=1                                    \
 ensemble_1.coords=beta.pdb ensemble_1.RMS=0.85 ensemble_1.copies_to_find=0 \
 ensemble_1.ensembleID="beta" \
 ensemble_2.coords=blip.pdb ensemble_2.RMS=0.90 ensemble_2.copies_to_find=1 \
 ensemble_2.ensembleID="blip" \
 n_cycle_rebuild_max=1 \
 fixed_ensembleID_list="beta" \
 fixed_euler_list="199.84,41.535,184.15" \
 fixed_frac_list="-0.49736,-0.15895,-0.28067"

Note: you have to define an ensemble for the fixed molecule (beta in this example).
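
The three fixed_* keywords take the ensemble ID and two comma-separated triplets. A minimal sketch of how they might be formatted from numeric values follows. The helper name is hypothetical, and it handles only a single fixed ensemble, as in the example above.

```python
def fixed_placement_args(ensemble_id, euler, frac):
    """Format the fixed orientation and position of one ensemble."""
    if len(euler) != 3 or len(frac) != 3:
        raise ValueError("need three Euler angles and three fractional coordinates")

    def triplet(values):
        # Comma-separated, no spaces, matching the example command above.
        return ",".join(str(v) for v in values)

    return [
        f"fixed_ensembleID_list={ensemble_id}",
        f"fixed_euler_list={triplet(euler)}",
        f"fixed_frac_list={triplet(frac)}",
    ]


args = fixed_placement_args(
    "beta", (199.84, 41.535, 184.15), (-0.49736, -0.15895, -0.28067)
)
print(" ".join(args))
```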

Possible Problems

Specific limitations and problems

  • The AutoBuild Wizard can build PROTEIN, RNA, or DNA, but it can only build one at a time. If your MR model contains more than one type of chain, then you will need to run AutoBuild separately from AutoMR and when you run AutoBuild, specify one of them with input_lig_file_list and the type of chain to build with chain_type:

     
    input_lig_file_list=ProteinPartofMRmodel.pdb
    chain_type=DNA
    

  • If you use an ensemble as a search model, the output structure will contain just the first member of the ensemble, so you may wish to put the member that is likely to be the most similar to the true structure as the first one in your ensemble.

  • If you run AutoMR from the GUI and continue on to AutoBuild, and then select "Start run over (delete everything for this run)" it will delete your AutoBuild and your AutoMR run and start your AutoMR run all over.

  • The AutoMR Wizard can take most settings of most space groups. However, it can only use the hexagonal setting of rhombohedral space groups (e.g., #146 R3:H or #155 R32:H), and it cannot use space groups 114-119 (not found in macromolecular crystallography), even in the standard setting. Both restrictions are due to difficulties with the use of asuset in the version of the ccp4 libraries used in PHENIX for these settings and space groups.
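
The space group restrictions in the last item can be expressed as a quick pre-flight check. This is a sketch of the rule as stated above, not a function provided by PHENIX. The helper name is hypothetical, and the rhombohedral space group numbers are taken from the International Tables.

```python
# Rhombohedral (R) space groups by International Tables number.
RHOMBOHEDRAL = {146, 148, 155, 160, 161, 166, 167}
# Space groups 114-119, which the text above says cannot be used.
UNSUPPORTED = set(range(114, 120))


def automr_can_use(sg_number, hexagonal_setting=True):
    """Return True if the space group and setting fit the stated limitations."""
    if sg_number in UNSUPPORTED:
        return False
    if sg_number in RHOMBOHEDRAL and not hexagonal_setting:
        return False
    return True


print(automr_can_use(146))                           # R3:H
print(automr_can_use(146, hexagonal_setting=False))  # R3 in the rhombohedral setting
print(automr_can_use(115))
```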

Literature

Phaser crystallographic software. A. J. McCoy, R. W. Grosse-Kunstleve, P. D. Adams, M. D. Winn, L. C. Storoni and R. J. Read. J. Appl. Cryst. 40, 658-674 (2007)
Likelihood-enhanced fast translation functions. A. J. McCoy, R. W. Grosse-Kunstleve, L. C. Storoni and R. J. Read. Acta Cryst. D61, 458-464 (2005)
Likelihood-enhanced fast rotation functions. L. C. Storoni, A. J. McCoy and R. J. Read. Acta Cryst. D60, 432-438 (2004)

Additional information

List of all AutoMR keywords

------------------------------------------------------------------------------- 
Legend: black bold - scope names
        black - parameter names
        red - parameter values
        blue - parameter help
        blue bold - scope help
        Parameter values:
          * means selected parameter (where multiple choices are available)
          False is No
          True is Yes
          None means not provided, not predefined, or left up to the program
          "%3d" is a Python style formatting descriptor
------------------------------------------------------------------------------- 
automr
   build= True Run AutoBuild immediately after AutoMR (Command-line only)
   data= None Datafile (any standard format) (Command-line only)
   copies= None Set both copies_to_find and component_copies with copies. This
           is the number of copies of this search model to find, and also the
           number of copies of this sequence or mass in the asymmetric unit.
           (Command-line only)
   ensembleID= ensemble_1 ID for this ensemble. (Command-line only)
   copies_to_find= None Number of copies of this ensemble to find in a.u.
                   (Command-line only)
   coords= None model(s) for this ensemble. (Command-line only)
   identity= None percent identity(ies) of model(s) in this ensemble to
             structure (alternative is RMS). (Command-line only)
   RMS= None RMSD(s) of model(s) to structure (alternative is identity).
        (Command-line only)
   seq_file= None protein seq_file for this component. (Command-line only)
   component_type= *protein nucleic_acid protein or nucleic acid.
                   (Command-line only)
   mass= None molecular mass (Da) of this component. (Command-line only)
   component_copies= None Number of copies of this component in the a.u.
                     (required). (Command-line only)
   special_keywords
      write_run_directory_to_file= None Writes the full name of a run
                                   directory to the specified file. This can
                                   be used as a call-back to tell a script
                                   where the output is going to go.
                                   (Command-line only)
   run_control
      coot= None Set coot to True and optionally run=[run-number] to run Coot
            with the current model and map for run run-number. In some wizards
            (AutoBuild) you can edit the model and give it back to PHENIX to
            use as part of the model-building process. If you just say coot
            then the facts for the highest-numbered existing run will be
            shown. (Command-line only)
      ignore_blanks= None ignore_blanks allows you to have a command-line
                     keyword with a blank value like "input_lig_file_list="
      stop= None You can stop the current wizard with "stopwizard" or "stop".
            If you type "phenix.autobuild run=3 stop" then this will stop run
            3 of autobuild. (Command-line only)
      display_facts= None Set display_facts to True and optionally
                     run=[run-number] to display the facts for run run-number.
                     If you just say display_facts then the facts for the
                     highest-numbered existing run will be shown.
                     (Command-line only)
      display_summary= None Set display_summary to True and optionally
                       run=[run-number] to show the summary for run
                       run-number. If you just say display_summary then the
                       summary for the highest-numbered existing run will be
                       shown. (Command-line only)
      carry_on= None Set carry_on to True to carry on with highest-numbered
                run from where you left off. (Command-line only)
      run= None Set run to n to continue with run n where you left off.
           (Command-line only)
      copy_run= None Set copy_run to n to copy run n to a new run and continue
                where you left off. (Command-line only)
      display_runs= None List all runs for this wizard. (Command-line only)
      delete_runs= None List runs to delete: 1 2 3-5 9:12 (Command-line only)
      display_labels= None display_labels=test.mtz will list all the labels
                      that identify data in test.mtz. You can use the label
                      strings that are produced in AutoSol to identify which
                      data to use from a datafile like this: peak.data="F+
                      SIGF+ F- SIGF-" # the entire string in quotes counts
                      here You can use the individual labels from these
                      strings as identifiers for data columns in AutoSol and
                      AutoBuild like this: input_refinement_labels="FP SIGFP
                      FreeR_flags" # each individual label counts
      dry_run= False Just read in and check parameter names
      params_only= False Just read in and return parameter defaults
      display_all= False Just read in and display parameter defaults
   autobuild_variables
      two_fofc_in_rebuild= None Actively sets two_fofc_in_rebuild in
                           AutoBuild. NOTE: value is not checked
      include_input_model= None Actively sets include_input_model in
                           AutoBuild. NOTE: value is not checked
      n_cycle_rebuild_min= None Actively sets n_cycle_rebuild_min in
                           AutoBuild. NOTE: value is not checked
      n_cycle_rebuild_max= None Actively sets n_cycle_rebuild_max in
                           AutoBuild. NOTE: value is not checked
      n_cycle_build_min= None Actively sets n_cycle_build_min in AutoBuild.
                         NOTE: value is not checked
      n_cycle_build_max= None Actively sets n_cycle_build_max in AutoBuild.
                         NOTE: value is not checked
      rebuild_in_place= None Actively sets rebuild_in_place in AutoBuild.
                        NOTE: value is not checked
      thorough_denmod= None Actively sets thorough_denmod in AutoBuild. NOTE:
                       value is not checked
      i_ran_seed= None Actively sets i_ran_seed in AutoBuild. NOTE: value is
                  not checked
      start_chains_list= None Actively sets start_chains_list in AutoBuild.
                         NOTE: value is not checked
      input_refinement_file= None Actively sets input_refinement_file in
                             AutoBuild. NOTE: value is not checked
      input_refinement_labels= None Actively sets input_refinement_labels in
                               AutoBuild. NOTE: value is not checked
      input_labels= None Actively sets input_labels in AutoBuild. NOTE: value
                    is not checked
      resolve_command_list= None Actively sets resolve_command_list in
                            AutoBuild. NOTE: value is not checked
      resolve_pattern_command_list= None Actively sets
                                    resolve_pattern_command_list in AutoBuild.
                                    NOTE: value is not checked
      morph= None Actively sets morph in AutoBuild. NOTE: value is not checked
      morph_rad= None Actively sets morph_rad in AutoBuild. NOTE: value is not
                 checked
   ensemble_1
      ensembleID= ensemble_1 ID for this ensemble. (Command-line only)
      copies_to_find= None Number of copies of this ensemble to find in a.u.
                      (Command-line only)
      coords= None model(s) for this ensemble. (Command-line only)
      identity= None percent identity(ies) of model(s) in this ensemble to
                structure (alternative is RMS). (Command-line only)
      RMS= None RMSD(s) of model(s) to structure (alternative is identity).
           (Command-line only)
   ensemble_2
      ensembleID= ensemble_2 ID for this ensemble. (Command-line only)
      copies_to_find= None Number of copies of this ensemble to find in a.u.
                      (Command-line only)
      coords= None model(s) for this ensemble. (Command-line only)
      identity= None percent identity(ies) of model(s) in this ensemble to
                structure (alternative is RMS). (Command-line only)
      RMS= None RMSD(s) of model(s) to structure (alternative is identity).
           (Command-line only)
   ensemble_3
      ensembleID= ensemble_3 ID for this ensemble. (Command-line only)
      copies_to_find= None Number of copies of this ensemble to find in a.u.
                      (Command-line only)
      coords= None model(s) for this ensemble. (Command-line only)
      identity= None percent identity(ies) of model(s) in this ensemble to
                structure (alternative is RMS). (Command-line only)
      RMS= None RMSD(s) of model(s) to structure (alternative is identity).
           (Command-line only)
   ensemble_4
      ensembleID= ensemble_4 ID for this ensemble. (Command-line only)
      copies_to_find= None Number of copies of this ensemble to find in a.u.
                      (Command-line only)
      coords= None model(s) for this ensemble. (Command-line only)
      identity= None percent identity(ies) of model(s) in this ensemble to
                structure (alternative is RMS). (Command-line only)
      RMS= None RMSD(s) of model(s) to structure (alternative is identity).
           (Command-line only)
   ensemble_5
      ensembleID= ensemble_5 ID for this ensemble. (Command-line only)
      copies_to_find= None Number of copies of this ensemble to find in a.u.
                      (Command-line only)
      coords= None model(s) for this ensemble. (Command-line only)
      identity= None percent identity(ies) of model(s) in this ensemble to
                structure (alternative is RMS). (Command-line only)
      RMS= None RMSD(s) of model(s) to structure (alternative is identity).
           (Command-line only)
   component_1
      seq_file= None protein seq_file for this component. (Command-line only)
      component_type= *protein nucleic_acid protein or nucleic acid.
                      (Command-line only)
      mass= None molecular mass (Da) of this component. (Command-line only)
      component_copies= None Number of copies of this component in the a.u.
                        (required). (Command-line only)
   component_2
      seq_file= None protein seq_file for this component. (Command-line only)
      component_type= *protein nucleic_acid protein or nucleic acid.
                      (Command-line only)
      mass= None molecular mass (Da) of this component. (Command-line only)
      component_copies= None Number of copies of this component in the a.u.
                        (required). (Command-line only)
   component_3
      seq_file= None protein seq_file for this component. (Command-line only)
      component_type= *protein nucleic_acid protein or nucleic acid.
                      (Command-line only)
      mass= None molecular mass (Da) of this component. (Command-line only)
      component_copies= None Number of copies of this component in the a.u.
                        (required). (Command-line only)
   component_4
      seq_file= None protein seq_file for this component. (Command-line only)
      component_type= *protein nucleic_acid protein or nucleic acid.
                      (Command-line only)
      mass= None molecular mass (Da) of this component. (Command-line only)
      component_copies= None Number of copies of this component in the a.u.
                        (required). (Command-line only)
   component_5
      seq_file= None protein seq_file for this component. (Command-line only)
      component_type= *protein nucleic_acid protein or nucleic acid.
                      (Command-line only)
      mass= None molecular mass (Da) of this component. (Command-line only)
      component_copies= None Number of copies of this component in the a.u.
                        (required). (Command-line only)
   crystal_info
      cell= 0.0 0.0 0.0 0.0 0.0 0.0 Enter cell parameter a b c alpha beta
            gamma
      chain_type= *Auto PROTEIN DNA RNA  You can specify whether to build
                  protein, DNA, or RNA chains. At present you can only build
                  one of these in a single run. If you have both DNA and
                  protein, build one first, then run AutoBuild again,
                  supplying the prebuilt model in the "input_lig_file_list"
                  and build the other. NOTE: default for this keyword is Auto,
                  which means "carry out normal process to guess this
                  keyword". The process is to look at the sequence file and/or
                  input pdb file to see what the chain type is. If there is
                  more than one type, the type with the larger number of
                  residues is guessed. If you want to force the chain_type,
                  then set it to PROTEIN RNA or DNA.
      resolution= 0.0 Enter the high-resolution limit for MR search. All the
                  data input will be written out regardless of your choice. By
                  default, the final rigid-body refinement will use all data. 
      sg= None Space Group symbol (e.g., C2221 or C 2 2 21)
   decision_making
      min_seq_identity_percent= 50.0  The sequence in your input PDB file will
                                be adjusted to match the sequence in your
                                sequence file (if any).  If there are
                                insertions/deletions in your model and the
                                wizard does not seem to identify them, you can
                                split up your PDB file by adding records like
                                this:  BREAK  You can specify the minimum
                                sequence identity between your sequence file
                                and a segment from your input PDB file to
                                consider the sequences to be matched. Default
                                is 50.0%. You might want a higher number to
                                make sure that deletions in the sequence are
                                noticed.
      overlap_allowed= None Solutions with no C-alpha clashes will be
                       accepted. If the best packing has some clashes,
                       solutions with that number of clashes will be accepted,
                       as long as this does not exceed the maximum allowed.
                       You can choose to increase the maximum if the packing
                       is tight and your search molecule is not exactly the
                       same as the molecule in the cell. If you leave it blank
                       then Phaser will decide for you.
      selection_criteria_rot= *Percent_of_best Number_of_solutions Z_score All
                              Choose a criterion for keeping rotation
                              solutions at each stage. The choices are:
                              Percent of Best Score: AutoMR looks down the list
                              of LLG scores and keeps only those whose
                              difference from the mean is at least the chosen
                              percentage of the top solution's difference from
                              the mean. Set the percentage with
                              selection_criteria_rot_value (default=75%).
                              Number of Solutions: Keep the N top solutions
                              (you can set N; default=1).
                              Z-score: Keep all the solutions with a Z-score
                              greater than X (you can set X; default=6).
                              All: Keep everything and go on holiday while
                              Phaser crunches through it all (definitely not
                              recommended!)
      selection_criteria_rot_value= 75 Choose a value for your criterion for
                                    keeping rotation solutions at each stage.
                                    Percent of Best Score: AutoMR looks down
                                    the list of LLG scores and keeps only
                                    those whose difference from the mean is
                                    at least the chosen percentage of the top
                                    solution's difference from the mean
                                    (default=75%).
                                    Number of Solutions: Keep the N top
                                    solutions (you can set N; default=1).
                                    Z-score: Keep all the solutions with a
                                    Z-score greater than X (you can set X;
                                    default=6).
                                    All: Keep everything and go on holiday
                                    while Phaser crunches through it all
                                    (definitely not recommended!)
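The four rotation-solution criteria described above can be illustrated with a small sketch. This is not Phaser's or AutoMR's actual code: the function name `keep_rotation_solutions` and its arguments are hypothetical, and the mean is taken over the scores passed in, which is an assumption about how the mean is defined.

```python
from statistics import mean


def keep_rotation_solutions(scores, criterion="Percent_of_best", value=75.0,
                            z_scores=None):
    """Return indices of the solutions to keep, best score first.

    Hypothetical helper illustrating the documented criteria; the real
    selection is carried out inside Phaser.
    """
    # Rank solutions from highest to lowest LLG score.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    if not ranked:
        return []
    if criterion == "All":
        return ranked
    if criterion == "Number_of_solutions":
        return ranked[:int(value)]
    if criterion == "Z_score":
        if z_scores is None:
            raise ValueError("Z_score selection needs a Z-score per solution")
        return [i for i in ranked if z_scores[i] > value]
    if criterion == "Percent_of_best":
        # Assumption: the mean is computed over the supplied scores.
        avg = mean(scores)
        top = scores[ranked[0]]
        cutoff = avg + (value / 100.0) * (top - avg)
        return [i for i in ranked if scores[i] >= cutoff]
    raise ValueError("unknown criterion: %s" % criterion)


if __name__ == "__main__":
    llg = [100.0, 90.0, 40.0, 10.0]  # made-up scores for illustration
    print(keep_rotation_solutions(llg, "Percent_of_best", 75.0))
    print(keep_rotation_solutions(llg, "Number_of_solutions", 1))
```

With the made-up scores shown, the mean is 60 and the top score is 100, so a 75% cutoff corresponds to a score of 90.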
   fixed_ensembles
      fixed_ensembleID_list= None Enter the ID (set with ensemble_1.ensembleID
                             or equivalent) of the component that is to be
                             fixed. NOTE 1: Each ensemble in
                             fixed_ensembleID_list must be defined. NOTE 2:
                             you can enter more than one fixed component if
                             you want. If you do, then enter fixed_euler_list
                             in multiples of 3 numbers and also
                             fixed_frac_list in multiples of 3 numbers.
      fixed_euler_list= 0.0 0.0 0.0 Enter Euler angles (from AutoMR or Phaser)
                        for the fixed component defined with
                        fixed_ensembleID_list. NOTE: you can enter more than
                        one fixed component if you want. If you do, then enter
                        fixed_euler_list in multiples of 3 numbers and also
                        fixed_frac_list in multiples of 3 numbers.
      fixed_frac_list= 0.0 0.0 0.0 Enter the fractional offset (location, from
                       AutoMR or Phaser) for the fixed component defined with
                       fixed_ensembleID_list. NOTE: you can enter more than
                       one fixed component if you want. If you do, then enter
                       fixed_euler_list in multiples of 3 numbers and also
                       fixed_frac_list in multiples of 3 numbers.
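The "multiples of 3 numbers" rule for fixed components can be checked with a short sketch. The helper `group_fixed_components` and the example ensemble IDs and values are hypothetical; the sketch only shows how flat `fixed_euler_list` and `fixed_frac_list` values would pair up with the entries of `fixed_ensembleID_list`, assuming they are given in the same order.

```python
def group_fixed_components(ensemble_ids, euler_list, frac_list):
    """Pair each fixed ensemble ID with its 3 Euler angles and 3 fractions.

    Hypothetical helper: assumes the flat lists are ordered to match
    the order of ensemble_ids.
    """
    n = len(ensemble_ids)
    if len(euler_list) != 3 * n:
        raise ValueError("fixed_euler_list needs 3 numbers per fixed component")
    if len(frac_list) != 3 * n:
        raise ValueError("fixed_frac_list needs 3 numbers per fixed component")
    return [
        {
            "ensembleID": ensemble_ids[k],
            "euler": tuple(euler_list[3 * k:3 * k + 3]),
            "frac": tuple(frac_list[3 * k:3 * k + 3]),
        }
        for k in range(n)
    ]


if __name__ == "__main__":
    # Made-up IDs and values, for illustration only.
    fixed = group_fixed_components(
        ["mol_a", "mol_b"],
        [10.0, 20.0, 30.0, 40.0, 50.0, 60.0],
        [0.1, 0.2, 0.3, 0.4, 0.5, 0.6],
    )
    for component in fixed:
        print(component)
```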
   general
      all_plausible_sg_list= None Choose which space groups to search
      autobuild_input_list_add= None You can add keywords to those that AutoMR
                                passes on to AutoBuild (command-line only).
                                The format for this command is:
                                autobuild_input_list_add='semet refine'
                                Then you can set any of the variables you
                                specify by adding the prefix "autobuild_" to
                                the name of your variable:
                                autobuild_semet=False autobuild_refine=True
                                This will now set semet=False and refine=True
                                in AutoBuild.
      background= True When you specify nproc=nn, you can run the jobs in
                  background (default if nproc is greater than 1) or
                  foreground (default if nproc=1).  If you set
                  run_command=qsub (or otherwise submit to a batch queue),
                  then you should set background=False, so that the batch
                  queue can keep track of your runs. There is no need to use
                  background=True in this case because all the runs go as
                  controlled by your batch system. If you use run_command=csh
                  (or similar, csh is default) then normally you will use
                  background=True so that all the jobs run simultaneously.
      base_path= None You can specify the base path for files (default is
                 current working directory)
      clean_up= False At the end of the entire run the TEMP directories will
                be removed if clean_up is True. The default is No, keep these
                directories. If you want to remove them after your run is
                finished use a command like "phenix.autobuild run=1
                clean_up=True"
      coot_name= coot If your version of coot is called something else, then
                 you can specify that here.
      debug= False  You can have the wizard stop with error messages about the
             code if you use debug. NOTE: you cannot use Pause with debug.
      do_anisotropy_correction= True Choose whether you want to apply
                                anisotropy correction
      extra_verbose= False Facts and possible commands will be printed every
                     cycle if Yes
      max_wait_time= 100.0 You can specify the length of time (seconds) to
                     wait when testing the run_command. If you have a cluster
                     where jobs do not start right away you may need a longer
                     time to wait.
      nbatch= 1 You can specify the number of processors to use (nproc) and
              the number of batches to divide the data into for parallel jobs.
              Normally you will set nproc to the number of processors
              available and leave nbatch alone. If you leave nbatch as None it
              will be set automatically, with a value depending on the Wizard.
              This is recommended. The value of nbatch can affect the results
              that you get, as the jobs are not split into exact replicates,
              but are rather run with different random numbers. If you want to
              get the same results, keep the same value of nbatch.
      nproc= 1 You can specify the number of processors to use (nproc) and the
             number of batches to divide the data into for parallel jobs.
             Normally you will set nproc to the number of processors available
             and leave nbatch alone. If you leave nbatch as None it will be
             set automatically, with a value depending on the Wizard. This is
             recommended. The value of nbatch can affect the results that you
             get, as the jobs are not split into exact replicates, but are
             rather run with different random numbers. If you want to get the
             same results, keep the same value of nbatch.
      run_command= csh When you specify nproc=nn, you can run the subprocesses
                   as jobs in background with csh (default) or submit them to
                   a queue with the command of your choice (e.g., qsub). If
                   you have a multi-processor machine, use csh. If you have a
                   cluster, use qsub or the equivalent command for your
                   system.  NOTE: If you set run_command=qsub (or otherwise
                   submit to a batch queue), then you should set
                   background=False, so that the batch queue can keep track of
                   your runs. There is no need to use background=True in this
                   case because all the runs go as controlled by your batch
                   system. If you use run_command=csh (or similar, csh is
                   default) then normally you will use background=True so that
                   all the jobs run simultaneously.
      skip_xtriage= False You can bypass xtriage if you want. This will
                    prevent you from applying anisotropy corrections, however.
      temp_dir= None Define a temporary directory (it must exist)
      title= Run 1 AutoMR Sun Dec 7 17:46:24 2008  Enter any text you like to
             help identify what you did in this run
      top_output_dir= None This is used in subprocess calls of wizards and to
                      tell the Wizard where to look for the STOPWIZARD file. 
      use_all_plausible_sg= False Normally you will want to search all space
                            groups with the same point group as you may not
                            know which is correct from your data. You can
                            select which of these to choose using 'Choose
                            variable to set' and selecting
                            'all_plausible_sg_list'
      verbose= False Command files and other verbose output will be printed
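The naming convention described under autobuild_input_list_add (list the AutoBuild keywords, then set each one with an "autobuild_" prefix) can be sketched as follows. The helper `autobuild_passthrough_args` is hypothetical and only formats the keyword fragment in the form given in that entry; it does not run any PHENIX program.

```python
def autobuild_passthrough_args(settings):
    """Format keywords that ask AutoMR to pass settings on to AutoBuild.

    Hypothetical helper. `settings` maps AutoBuild keyword names to values,
    e.g. {"semet": False, "refine": True}.
    """
    names = " ".join(settings)
    # First declare which AutoBuild keywords are being added ...
    args = ["autobuild_input_list_add='%s'" % names]
    # ... then set each one using the "autobuild_" prefix.
    for name, value in settings.items():
        args.append("autobuild_%s=%s" % (name, value))
    return args


if __name__ == "__main__":
    print(" ".join(autobuild_passthrough_args({"semet": False, "refine": True})))
```

The printed fragment would be appended to the rest of an AutoMR command line.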
   input_files
      input_data_file= None Enter a file with input structure factor data.
                       For structure factor data only (e.g., FP SIGFP) any
                       format is ok. If you have free R flags, phase
                       information or HL coefficients that you want to use
                       then an mtz file is required. If this file contains
                       phase information, this phase information should be
                       experimental (i.e., MAD/SAD/MIR etc), and should not be
                       density-modified phases (enter any files with
                       density-modified phases as input_map_file instead). 
                       NOTE: If you supply HL coefficients they will be used
                       in phase recombination. If you supply PHIB or PHIB and
                       FOM and not HL coefficients, then HL coefficients will
                       be derived from your PHIB and FOM and used in phase
                       recombination.  If you also specify a hires data file,
                       then FP and SIGFP will come from that data file (and
                       not this one)  If an input_refinement_file is
                       specified, then F, Sigma, FreeR_flag (if present) from
                       that file will be used for refinement instead of this
                       one.
      input_label_string= None Choose the set of labels that represent the
                          data and sigma columns for your data. NOTE: Applies
                          to input data file for AutoMR. See also
                          'input_labels', which applies to input data file for
                          AutoBuild.
      input_pdb_file= None You can enter a PDB file containing a starting
                      model of your structure NOTE: If you enter a PDB file
                      then the AutoBuild wizard will start right in with
                      rebuild steps, skipping the build process. If the model
                      is very poor then it may be better to leave it out as
                      the build process (which includes pattern recognition
                      and recognition of helical and strand fragments) is
                      optimized for improving poor maps, while the rebuild
                      process is optimized for better maps that can be
                      produced by having a partial model.
      input_seq_file= None Enter name of file with 1-letter code of protein
                      sequence NOTES: 1. lines starting with > are ignored
                      and separate chains  2. FASTA format is fine  3. If
                      there are multiple copies of a chain, just enter one
                      copy.  4. If you enter a PDB file for rebuilding and it
                      has the sequence you want, then the sequence file is not
                      necessary.   NOTE: You can also enter the name of a PDB
                      file that contains SEQRES records, and the sequence from
                      the SEQRES records will be read, written to
                      seq_from_seqres_records.dat, and used as your input
                      sequence.  NOTE: for AutoBuild you can specify
                      start_chains_list on the first line of your sequence
                      file: >> start_chains_list 23 11 5 NOTE: default
                      for this keyword is Auto, which means "carry out normal
                      process to guess this keyword". This means if you
                      specify "after_autosol" in AutoBuild, AutoBuild will
                      automatically take the value from AutoSol. If you do not
                      want this to happen, you can specify None which means
                      "No file"
      input_seq_file_list= None The keyword input_seq_file_list is used in
                           AutoMR to specify the molecular masses of the
                           components of the unit cell using a set of sequence
                           files. Usually you should input the sequences of
                           the actual components of the unit cell here (one
                           sequence file for each component).  NOTE: If no
                           input_seq_file is specified, then the sequences
                           from input_seq_file_list are used to create a new
                           file "composite_seq.dat" with all their sequences
                           and this is used as the input_seq_file. NOTE: the
                           format of each file in input_seq_file_list is the
                           1-letter code of the protein sequence (separate
                           chains with >>>>) 
   model_building
      build_type= *RESOLVE_AND_TEXTAL RESOLVE TEXTAL You can choose to build
                  models with RESOLVE and TEXTAL or either one, and how many
                  different models to build with RESOLVE. The more you build,
                  the more likely to get a complete model.  Note that
                  rebuild_in_place can only be carried out with RESOLVE
                  model-building
      rebuild_after_mr= True You can choose to go right on to the AutoBuild
                        wizard with the rebuild-in-place option after running
                        molecular replacement.
      resolution_build= 0.0 Enter the high-resolution limit for
                        model-building. If 0.0, the value of resolution is
                        used as a default. 
      semet= False You can specify that the dataset that is used for
             refinement is a selenomethionine dataset, and that the model
             should be the SeMet version of the protein, with all SD of MET
             replaced with Se of MSE.
   non_user_parameters
      composition_num_list= 1 Enter number of copies of this component
      weight_list= 0.0 Molecular weight of component (Da; e.g. 30000)
      weight_seq_list= None Choose whether to define composition through
                       molecular weight or sequence
   refinement
      link_distance_cutoff= 3.0 You can specify the maximum bond distance for
                            linking residues in phenix.refine called from the
                            wizards.
      r_free_flags_fraction= 0.1 Maximum fraction of reflections in the free R
                             set. You can choose the maximum fraction of
                             reflections in the free R set and the maximum
                             number of reflections in the free R set. The
                             number of reflections in the free R set will be
                             up to the lower of the values defined by these two
                             parameters.
      r_free_flags_lattice_symmetry_max_delta= 5.0 You can set the maximum
                                               deviation of distances in the
                                               lattice that are to be
                                               considered the same for
                                               purposes of generating a
                                               lattice-symmetry-unique set of
                                               free R flags.
      r_free_flags_max_free= 2000 Maximum number of reflections in the free R
                             set. You can choose the maximum fraction of
                             reflections in the free R set and the maximum
                             number of reflections in the free R set. The
                             number of reflections in the free R set will be
                             up to the lower of the values defined by these two
                             parameters.
      r_free_flags_use_lattice_symmetry= True When generating r_free_flags you
                                         can decide whether to include lattice
                                         symmetry (good in general, necessary
                                         if there is twinning).
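The way r_free_flags_fraction and r_free_flags_max_free combine can be shown with a short sketch. The function `free_r_set_limit` is hypothetical and only computes the upper bound described in those two entries (the lower of the two limits); the actual flag assignment is done by the PHENIX tools and may choose fewer reflections.

```python
def free_r_set_limit(n_reflections, fraction=0.1, max_free=2000):
    """Upper bound on the number of free R reflections.

    Hypothetical helper: takes the lower of (fraction * total reflections)
    and max_free, using the documented defaults of 0.1 and 2000.
    """
    by_fraction = int(n_reflections * fraction)
    return min(by_fraction, max_free)


if __name__ == "__main__":
    for n in (10000, 50000):  # made-up dataset sizes
        print(n, free_r_set_limit(n))
```

For a small dataset the fraction is the limiting value; for a large one the cap set by r_free_flags_max_free takes over.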