Python-based Hierarchical ENvironment for Integrated Xtallography

Using the PHENIX Wizards

Purpose
Overview of Structure Determination with the PHENIX Wizards
Usage
Wizard data directories, sub-directories, Facts, and the PDS (Project Data Storage)
Running a Wizard using a multiprocessor machine or on a cluster
Running a Wizard from a GUI
Basic operation of a Wizard from the GUI
Keeping track of multiple runs of a Wizard from the GUI
Setting parameters of a Wizard from the GUI
Navigating steps in a Wizard from the GUI
Running a Wizard from the command-line
Basic operation of a Wizard from the command-line
Keeping track of multiple runs of a Wizard from the command-line
Setting parameters of a Wizard from the command-line
Running a Wizard from a script
Differences between running from the command line and running a script
Basic operation of a Wizard from a script
Keeping track of multiple runs of a Wizard from a script
Setting parameters of a Wizard from a script
Useful script commands
Specific limitations and problems:
Literature
Additional information

Purpose

Any Wizard can be run from the PHENIX GUI, from the command-line, and from keyworded script files. All three versions are identical except in the way that they take commands and keywords from the user.

This page describes how to run a Wizard and what a Wizard does in general. The specific Wizard help pages describe the details of each PHENIX Wizard.

Overview of Structure Determination with the PHENIX Wizards

You can use the AutoSol Wizard to solve structures by SAD, MAD, SIR/SIRAS, and MIR/MIRAS. The AutoMR Wizard can solve a structure by MR. The AutoMR and AutoSol Wizards together can carry out MRSAD. The AutoSol Wizard can also combine SAD, MAD, SIR, and MIR datasets and solve the structure using all available data.

Once you have experimental or MR phases, you can carry out iterative model-building, density modification, and refinement with the AutoBuild Wizard to improve your model. Finally you can use the rebuild_in_place feature of the AutoBuild Wizard to make one very good final model.

If your structure contains ligands, you can place them using the LigandFit Wizard.

This help page describes how to run the Wizards from a GUI, the command-line, or a script. The individual Wizard documentation pages describe the strategies and commands for each Wizard.

Usage

Wizard data directories, sub-directories, Facts, and the PDS (Project Data Storage)

  • The directory that you are in when you start up PHENIX is your working directory.

  • Each run of a Wizard will have all output data in a subdirectory of your working directory named like this (for AutoSol run 3):
    AutoSol_run_3_/
    

  • This subdirectory will have one or more temporary directories:
    AutoSol_run_3_/TEMP0/
    
    which contain intermediate files. These temporary directories will be deleted when the Wizard is finished (unless you set the parameter clean_up to False).

  • For OMIT and MULTIPLE-MODEL runs, the final OMIT maps and multiple models will be in a subdirectory of your run directory:
    AutoSol_run_3_/OMIT/
    AutoSol_run_3_/MULTIPLE_MODELS/
    

  • All the parameter values, as well as any other information that a Wizard generates during its run, are stored in the PDS (Project Data Storage) and/or the Wizard Facts. The Facts are values of parameters and pointers to files in the PDS. The Facts keep track of the current knowledge available to the Wizard. Each time a step is completed by a Wizard, the new Facts are saved (overwriting old ones for that run). As the Facts define the state of the Wizard, the Wizard can be restarted any time by loading the appropriate set of Facts.

  • The PDS (Project Data Storage) will be in your working directory:
    ./PDS/
    
    The PDS contains the output of each of your runs for all Wizards and a record of all the Facts (parameters and data) for each run. If you delete a run using the PHENIX Wizard GUI or with a command like "phenix.autosol delete_runs=2", the corresponding entries in the PDS are also deleted. You can copy the PDS from one place to another. Note that if you delete directories such as "AutoSol_run_1_" by hand then the corresponding information remains in the PDS. For this reason it is best to use the GUI or specific commands to delete runs.
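The directory layout described above can be inspected with a few lines of Python. The following is only a minimal sketch: the helper name `find_run_directories` is hypothetical (it is not part of PHENIX), and the pattern assumes that every run directory follows the `<Wizard>_run_<N>_` naming shown in the examples. It looks only at directory names, so it will not notice runs whose directories were deleted by hand but which remain in the PDS; the Wizard's own run listing is the authoritative record.

```python
import re
from pathlib import Path

# Assumed naming convention, taken from the examples above: <Wizard>_run_<N>_
RUN_DIR_PATTERN = re.compile(r"^(?P<wizard>[A-Za-z]+)_run_(?P<number>\d+)_$")


def find_run_directories(working_dir):
    """Return {wizard_name: sorted run numbers} for run directories in working_dir.

    Hypothetical helper: it only looks at directory names and does not read the PDS.
    """
    runs = {}
    for entry in Path(working_dir).iterdir():
        match = RUN_DIR_PATTERN.match(entry.name)
        if entry.is_dir() and match:
            runs.setdefault(match.group("wizard"), []).append(int(match.group("number")))
    return {wizard: sorted(numbers) for wizard, numbers in runs.items()}
```

Calling `find_run_directories(".")` from your working directory would return a dictionary keyed by Wizard name, with the PDS directory itself ignored because it does not match the pattern.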

Running a Wizard using a multiprocessor machine or on a cluster

You can take advantage of a multiprocessor machine or a cluster when running the Wizards (currently this applies to the LigandFit and AutoBuild Wizards). For example, adding

    nproc=4

to a command-line command for a Wizard will use 4 processors to run the Wizard (if possible). Normally you will run the parallel processes in the background, which is the default:

    background=True

If you have a cluster with a batch queue, you can send subprocesses to the batch queue with

    run_command=qsub

(or whatever your batch command is). In this case you will use

    background=False

so that the batch queue can keep track of your jobs.

The Wizards divide the data into nbatch batches during processing. The value of nbatch, set with a keyword such as

    nbatch=3

is from 3 to 5 by default (depending on the Wizard) and is appropriate if you have up to nbatch processors. If you have more, then you may wish to increase nbatch to match the number of processors. It is done this way because the value of nbatch can affect the results that you get: the jobs are not split into exact replicates, but are rather run with different random numbers. If you want to get the same results, keep the same value of nbatch.
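The choices described in this section can be summarized as a small decision rule. The sketch below is illustrative only: `parallel_keywords` is a hypothetical helper, not a PHENIX function, and the `default_nbatch` argument is an assumption standing in for whatever default your Wizard actually uses (3 to 5). It only assembles the `keyword=value` strings named above; it does not run anything.

```python
def parallel_keywords(nproc, batch_command=None, default_nbatch=3):
    """Build keyword=value strings for a parallel Wizard run (hypothetical helper).

    batch_command: e.g. "qsub" to send subprocesses to a batch queue, or None
        to run them in the background on the local machine.
    default_nbatch: assumed default nbatch of the Wizard (3 to 5 in practice).
    """
    keywords = ["nproc=%d" % nproc]
    if batch_command is None:
        # Local multiprocessor machine: run the subprocesses in the background.
        keywords.append("background=True")
    else:
        # Batch queue: let the queue keep track of the jobs.
        keywords.append("run_command=%s" % batch_command)
        keywords.append("background=False")
    # Only raise nbatch when there are more processors than batches; changing
    # nbatch changes how the work is split and so can change the results.
    if nproc > default_nbatch:
        keywords.append("nbatch=%d" % nproc)
    return keywords
```

The returned strings would be appended to your usual Wizard command. Leaving nbatch untouched whenever possible is a deliberate choice here, since keeping the same value is what makes results reproducible.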

Running a Wizard from a GUI

Basic operation of a Wizard from the GUI

  • Start up the PHENIX GUI in your working directory by typing "phenix"

  • Answer "yes" to the question "Do you want to make it a project directory?".

  • Launch a Wizard from the PHENIX GUI by double-clicking on the name of the Wizard ("AutoSol") under "Wizards" in the Strategy Interface of the main GUI.

  • The Wizard will come up in a blue window and will open a grey Parameters window asking you for information on what files to use and what to do.

  • Enter the file names and make choices as necessary (NOTE: to select a file click on the yellow box to the right of the file entry field. To add a new file entry field click on the "Parameter group options" tab if present).

  • Proceed to the next window by clicking "Continue" in the upper left corner of the grey Parameters window.

  • The Wizard will guide you through the necessary inputs, then it will continue on its own until it is finished.

  • When the Wizard is done, you can double-click on the Display icon (the little magnifying glass on the upper left of the blue Wizard window) to show a list of files and maps that can be displayed. (NOTE: The Display Options window is updated when you open it. Once this window is open you cannot open it again until you close it. Sometimes this window may be behind other windows and this will prevent you from opening it again.)

  • You can open the Parameters window any time the Wizard is stopped by clicking on the Parameters icon (4 little lines in the upper left corner of the blue Wizard window). This allows you to carry out some of the more advanced options below.

  • Your output log file will be in a file called "AutoSol.1.output" for an AutoSol run. You can also see the same file by clicking on the "LOG" button at the lower right of the blue or green window.

Keeping track of multiple runs of a Wizard from the GUI

  • You can run more than one Wizard job at a time if you want. Each run of a Wizard is put in a separate sub-directory (e.g., "AutoSol_run_1_").

  • When you start a Wizard, it will start a new run of that Wizard.

  • If you want to continue on with the highest-numbered run of a Wizard, you can start the Wizard with the continue button for that Wizard (for example the continue_AutoSol button).

  • If you want to go back to a previous run, you can use the Run Control and Run Number selections near the bottom of any Parameters window (NOTE: to open the parameters window click on the lines at the upper left of the blue Wizard window). Select goto_run and choose a run number to go to.

  • If you want to copy a previous run and go on, use the Run Control and Run Number selections and select copy_run and choose a run number to copy. The Wizard will create a new run (with number equal to the highest previous number plus one) and carry on with it.

  • To see what runs are available, select View or Delete Runs in the Navigate tab at the lower left of any Parameters window.

  • If you want to stop the Wizard, hit the PAUSE button on the green Wizard window (the Wizard is green when running, blue or purple when stopped). NOTE: this may take a little time, particularly if Phaser, HYSS or phenix.refine is running. In those cases, if you really want to stop the Wizard right away, go to "Strategy" and then select "Stop Strategy" and it will be stopped.

Setting parameters of a Wizard from the GUI

  • You can set any parameter in a Wizard by selecting the variable in the Choose Variable to Set tab. The next time you click Continue, the Wizard will save all the current inputs as usual, and then instead of going on to the next step, it will open a window asking you for the new value of that variable. When you enter it and press Continue, the Wizard will continue on with what it was doing, but with this new value.

  • NOTE that some parameters (e.g., resolution) may affect many steps. If a prior step is affected by a parameter that is changed, the Wizard does not go back and change it. If you want the parameter change to affect something that has already been done, you need to re-run the corresponding step.

  • NOTE that you can set any SOLVE, RESOLVE or RESOLVE_PATTERN keyword when you are running a Wizard using the "resolve_command", "solve_command" or "resolve_pattern_command" keywords. These can be set in the GUI from the Choose Variable pull-down menu. Type the command into the entry form like this (for resolve_command):
    res_start 4.0
    
    telling resolve in this case to start out density modification at a resolution of 4 A. This allows you to control what solve, resolve and resolve_pattern do more finely than you otherwise can in the Wizards.

Navigating steps in a Wizard from the GUI

  • When the Wizard is done or Paused, you can select any available step in the Navigate tab at the middle bottom of any Parameter window. This tells the Wizard to get any necessary inputs for that step and to then carry it out.

  • The Wizards normally start out in Manual mode (one step at a time, asking the user for inputs). Once the necessary inputs are entered, the Wizard enters Automatic mode (no more asking for inputs until something required is missing). You can control this by specifying Manual or Automatic in the Auto/Manual tab at the bottom right of any Wizard.

Running a Wizard from the command-line

Basic operation of a Wizard from the command-line

  • You can run a wizard from the command line like this (autosol is the AutoSol wizard):
    phenix.autosol data=w1.sca seq_file=seq.dat 2 Se
    

  • The command-line interpreter will try to interpret obvious information (2 means sites=2, Se means atom_type=Se) and will run the wizard.

  • To see all the information about this wizard and the keywords that you can set for this wizard, type:
    phenix.autosol --help all
    

  • Any wizard keyword can be entered at the command line (not just the ones labelled "command-line only"). The documentation for each wizard lists all the keywords that apply to that wizard.

  • If you want to stop a Wizard, you can create a file "STOPWIZARD" and put it in the subdirectory (e.g., AutoSol_run_2_/) where the Wizard is running. This is like hitting the PAUSE button on the GUI and stops the wizard cleanly.
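Creating the stop file can be scripted. This is a minimal sketch under two assumptions that the text above does not spell out: that an empty file is sufficient, and that the file name must be exactly `STOPWIZARD`. The helper name `request_stop` is hypothetical.

```python
from pathlib import Path


def request_stop(run_dir):
    """Create an empty STOPWIZARD file in a Wizard run directory and return its path.

    Hypothetical helper; assumes an empty file is enough to signal the Wizard.
    """
    run_path = Path(run_dir)
    if not run_path.is_dir():
        raise FileNotFoundError("Run directory not found: %s" % run_path)
    stop_file = run_path / "STOPWIZARD"
    stop_file.touch()  # creates the file if it does not exist yet
    return stop_file
```

Checking that the run directory exists first guards against silently creating the file in the wrong place, where the running Wizard would never see it.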

Keeping track of multiple runs of a Wizard from the command-line

  • When you start a Wizard from the command line, the default is to start a new run of that Wizard.

  • To see all the available runs of this Wizard, type:
    phenix.autosol show_runs
    

  • To delete runs 1,2 and 4-7 of this Wizard, type something like this:
    phenix.autosol delete_runs="1 2 4-7"
    
    Note that the group of numbers is enclosed in quotes ("). This tells the input parser (iotbx.phil) that all these numbers go with the one keyword of delete_runs. Note also that there are no spaces around the "=" sign!

  • To go back to run 2 and carry on (remembering all previous inputs and possibly adding new ones, in this case setting the resolution) type something like:
    phenix.autosol run=2 resolution=3.0
    

  • To carry on with the current highest-numbered run (remembering all previous inputs and possibly adding new ones, in this case setting the resolution) type something like:
    phenix.autosol carry_on resolution=3.0
    

  • To copy run 2 to a new run and carry on from there (remembering all previous inputs and possibly adding new ones, in this case setting the resolution) type something like:
    phenix.autosol copy_run=2 resolution=3.0
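If you drive the Wizards from your own Python code, the run-control commands above can be assembled as argument lists. The sketch below is only an illustration: `wizard_command` is a hypothetical helper, and it builds the arguments without running them. It relies on the rule stated at the start of the script section that the command is `phenix.` plus the Wizard name in lower case.

```python
import shlex


def wizard_command(wizard, action=None, run_number=None, **keywords):
    """Return the argument list for starting or continuing a Wizard run.

    action: None (no run-control argument), "run", "carry_on" or "copy_run".
    Hypothetical helper; it only assembles arguments and does not execute them.
    """
    args = ["phenix." + wizard.lower()]
    if action in ("run", "copy_run"):
        if run_number is None:
            raise ValueError("%s requires a run number" % action)
        args.append("%s=%d" % (action, run_number))
    elif action == "carry_on":
        args.append("carry_on")
    elif action is not None:
        raise ValueError("Unknown action: %r" % (action,))
    # Each keyword=value pair is a single argument, so values that contain
    # blanks stay grouped without any extra quoting.
    args.extend("%s=%s" % (name, value) for name, value in keywords.items())
    return args


# Shell-ready text for the delete_runs example; shlex.join adds the quoting.
print(shlex.join(wizard_command("AutoSol", delete_runs="1 2 4-7")))
```

The printed command quotes the whole argument with single quotes rather than quoting only the value with double quotes. A POSIX shell treats both forms identically, because in either case the program receives one argument with no spaces around the "=" sign.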
    

Setting parameters of a Wizard from the command-line

When you run a Wizard from the command-line, two files are produced and put in the subdirectory of the Wizard (e.g., AutoBuild_run_3_/).

  • A parameters (".eff") file will be produced that you can edit to rerun the Wizard:
    phenix.autosol autosol.eff
    
    This autosol.eff file (for AutoSol) contains the values of all the AutoSol parameters at the time of starting the Wizard.

    Note that the syntax in the autosol.eff file is slightly different from the syntax on the command line. From the command line, if a value has several parts, you enclose them in quotes and there are no spaces around the "=" sign:

    phenix.autosol ... input_phase_labels="FP PHIM FOMM"
    
    In the .eff file, you MUST leave off the quotes or the three values will be treated as one, and you should leave blanks around the "=" sign:
     input_phase_labels = FP PHIM FOMM
    
    The reason these are different is that in the .eff file, the structure of the file and the brackets tell the PHIL parser what is grouped together, while from the command line, the quotes tell the parser what is to be grouped together.

  • A script file (".inp") with inputs in the format for running from a script is produced that you can edit and use like this:
    phenix.runWizard AutoSol AutoSol.inp
    

  • To get keyword help on a specific keyword you can type:
    phenix.autosol --help data  # get help on the keyword data for autosol
    

  • To show current Facts (values of all parameters) for the highest-numbered run:
    phenix.autosol show_facts
    

  • To show current Facts (values of all parameters) for run 3:
    phenix.autosol run=3 show_facts
    

  • To show current summary:
    phenix.autosol show_summary 
    

  • When you use a keyword like data= you need to give enough information to specify this keyword uniquely. You can see all the keywords for each PHENIX Wizard or tool at the end of the documentation for that Wizard or tool. This will have entries like this (for AutoSol):
    autosol
           sites= None Number of heavy-atom sites. (Command-line only)
    
    which describes the keyword sites in the scope defined by autosol. You can explicitly specify this on the command line with:
    autosol.sites=3 
    which in this case is entirely the same as
    sites=3 

  • NOTE that you can set any SOLVE, RESOLVE or RESOLVE_PATTERN keyword in PHENIX using the "resolve_command", "solve_command" or "resolve_pattern_command" keywords from the command line. The format is a little tricky: you have to put two sets of quotes around the command like this:
    resolve_command="'ligand_start start.pdb'"    # NOTE ' and " quotes
    
    This will put the text
    ligand_start start.pdb
    
    at the end of every temporary command file created to run resolve.
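The difference between the command-line and .eff syntax described in this section comes down to how a POSIX shell handles quotes. The sketch below uses Python's standard `shlex` module to show it. `command_line_to_eff` is a hypothetical helper that produces only the bare `keyword = value` assignment lines; a real .eff file also has the surrounding scope structure, which this sketch does not attempt to write.

```python
import shlex


def command_line_to_eff(command_line):
    """Convert command-line 'keyword="v1 v2"' settings to .eff-style assignment lines.

    Hypothetical helper; it does not produce the scopes and brackets of a real .eff file.
    """
    eff_lines = []
    # shlex.split removes the shell quotes but keeps each quoted group together.
    for argument in shlex.split(command_line):
        keyword, separator, value = argument.partition("=")
        if not separator:
            continue  # skip bare words such as the program name
        eff_lines.append("%s = %s" % (keyword, value))
    return eff_lines


for line in command_line_to_eff('phenix.autosol input_phase_labels="FP PHIM FOMM"'):
    print(line)
```

The same mechanism explains the two sets of quotes needed for resolve_command: the shell removes the outer double quotes, so the inner single quotes are still present in the argument that the program receives.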

Running a Wizard from a script

Differences between running from the command line and running a script

Command-line

The command-line is an easy way to run a Wizard and is recommended for all users. The command starts with phenix. plus the name of the Wizard in lower-case letters (phenix.autosol). Following this, all of the keywords are on the same line (or on continuation lines) and values are assigned with an "=" sign. The order of keywords makes no difference when running from the command line. A simple command is:

phenix.autosol data=w1.sca seq_file=seq.dat  sites=2 atom_type=Se

Scripts

Normally scripts are for advanced users only (for running MIR or multiple datasets, you have to use the GUI or a script, however). A script can contain both commands and keywords. Keywords are read in until a command is found, then the command is executed, then additional keywords are read in until another command is found, and so on. If the script file contains only keywords and no commands, then the keywords are read in and used as input to the Wizard, in just the same way as running from the command line. In a script file, each line can contain a command or keyword and optional values for the command or keyword, separated by spaces.

The keywords for scripts are a subset of keywords for the command-line. This is because the command-line interpreter has a number of special keywords (essentially shortcuts) to make typing at the command-line easier.

In a script file, a value is assigned to a keyword by placing it on the same line after the keyword, without any "=" sign.

A sample script file "autosol.inp" that contains the same information as the command-line command shown above (but with the full keyword names, not the command-line shortcuts) is:

# autosol.inp
# script file with inputs for AutoSol Wizard.
# run with: phenix.runWizard AutoSol autosol.inp
#
input_file_list w1.sca   # script keyword is input_file_list not data
input_seq_file seq.dat   # script keyword is input_seq_file not seq_file
mad_ha_n 2               # script keyword is mad_ha_n not sites
mad_ha_type Se           # script keyword is mad_ha_type not atom_type
#
# end of autosol.inp
which you can run with:
phenix.runWizard AutoSol autosol.inp

NOTE: The script interpreter will accept any keywords and values. If the keyword is not recognized, then it will write a warning to the log file, but it will not stop. This means that if you use the wrong name for a keyword, you will only find this out by looking at the beginning of the log file. The utility of this feature is that keywords set the value of the corresponding variable in the Wizard. If you know what you are doing, you can set any variable in the Wizard in this way, whether or not it is a keyword.
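Because the script interpreter does not stop on an unrecognized keyword, it can help to translate command-line shortcuts to script keywords mechanically. The following is a minimal sketch: `to_script_lines` is a hypothetical helper, and its table contains only the four AutoSol pairs shown in the sample autosol.inp above. Other keywords would have to be looked up in the Wizard documentation and added by hand.

```python
# Command-line shortcut -> script keyword, as listed in the sample autosol.inp.
SCRIPT_KEYWORDS = {
    "data": "input_file_list",
    "seq_file": "input_seq_file",
    "sites": "mad_ha_n",
    "atom_type": "mad_ha_type",
}


def to_script_lines(command_line_keywords):
    """Translate {shortcut: value} pairs into 'keyword value' script lines.

    Hypothetical helper; raises KeyError for shortcuts that are not in the table.
    """
    lines = []
    for shortcut, value in command_line_keywords.items():
        if shortcut not in SCRIPT_KEYWORDS:
            raise KeyError("No script keyword known for %r" % shortcut)
        lines.append("%s %s" % (SCRIPT_KEYWORDS[shortcut], value))
    return lines


settings = {"data": "w1.sca", "seq_file": "seq.dat", "sites": 2, "atom_type": "Se"}
print("\n".join(to_script_lines(settings)))
```

Raising an error for an unknown shortcut is a deliberate choice: a misspelled keyword fails immediately instead of producing only a warning at the top of the log file.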

Basic operation of a Wizard from a script

  • You can run a wizard from a script like this (AutoSol wizard):
    phenix.runWizard AutoSol autosol.inp
    
    The script file (autosol.inp) should contain keyword entries telling the Wizard what to do. The output will be written to the log file (e.g., AutoSol_run_1_/AutoSol_run_1_1.log).

  • The keywords that can be set in a script file include most of the keywords for command-line running, plus a set of control commands for running from a script. To see all the basic keywords for a wizard, make a script (e.g., keywords.inp) that says:
    list_keywords
    
    and then type:
    phenix.runWizard AutoSol keywords.inp
    
    The keywords will be written to the log file (e.g., AutoSol_run_1_/AutoSol_run_1_1.log).

  • For help on a Wizard, your script file should say:
    help
    

  • Unlike running from the command-line, the order of entries in a script file can make a difference. For example you can specify a group of inputs for one dataset and then start a new dataset.

  • If you want to stop a Wizard, you can create a file "STOPWIZARD" and put it in the subdirectory (e.g., AutoSol_run_2_/) where the Wizard is running. This is like hitting the PAUSE button on the GUI and stops the wizard cleanly.

Keeping track of multiple runs of a Wizard from a script

  • When you start a Wizard from a script, the default is to start a new run of that Wizard.

  • To see all the available runs of this Wizard, delete some runs, carry on with run 3, or copy run 4 into a new run, your script should say one of the following:
    show_runs
    delete_run_list 1 2 3-5
    run 3
    copy_run 4
    

Setting parameters of a Wizard from a script

  • You can set nearly any parameter using keywords from a script. For example:
    resolution 2.5
    
    will set the overall high-resolution cutoff to 2.5 A.

Useful script commands

With the exception of show_runs and delete_runs, the output for each of these commands is written to the log file (e.g., AutoSol_run_1_/AutoSol_run_1_1.log).

    help  # print out this help message 
    
    show_runs # list all the runs that are saved
    
    delete_runs 1 2 3-5 9:12 # delete runs 1 2 3-5 9-12
    
    carry_on # continue on with the highest-numbered run
    
    run 5 # continue with run 5
    
    copy_run 5 # make a new copy of run 5 (with number equal
       # to highest existing run number +1) and continue
       # with this new copy.
    
    run 2
    run_only DumpFacts # list current values of all parameters in run 2 and stop
    
    run_only nothing # do nothing and stop
    
    list_keywords # list all the keywords and their possible values
    
    run_list method_1 method_2 # run these methods and anything
        # that follows automatically
    
    run_only method_1 method_2 # run just these methods and stop
    user_command method_1 
    
    list_methods # list all methods that can be run with run_list
    
    

These are a good way to run Wizards initially, and also a good way to change some parameters after stopping a run.
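The run lists accepted by delete_runs mix single numbers with ranges written as "3-5" or "9:12". The sketch below shows one way to expand such a list before deleting anything, for example to double-check which runs would be affected. It assumes that both range forms are inclusive at both ends, as the comment on the delete_runs example suggests; `expand_run_list` is a hypothetical helper and is not how PHENIX itself parses the list.

```python
def expand_run_list(spec):
    """Expand a run specification such as '1 2 3-5 9:12' into sorted run numbers.

    Hypothetical helper; assumes '-' and ':' both denote inclusive ranges.
    """
    runs = set()
    for item in spec.split():
        for separator in ("-", ":"):
            if separator in item:
                first, last = (int(part) for part in item.split(separator, 1))
                if first > last:
                    raise ValueError("Range runs backwards: %r" % item)
                runs.update(range(first, last + 1))
                break
        else:
            # No range separator found: a single run number.
            runs.add(int(item))
    return sorted(runs)
```

For the specification used in the delete_runs example, the expansion would contain runs 1 to 5 and 9 to 12.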

Note: these all have the form:

    keyword parameter

where the parameter must be enclosed in quotes if it is a string containing blanks. If the keyword contains the text "list" or the words "dataset_", "cell" or "input_labels" then the parameter can be a list of items, separated by blanks:

    cell 40 50 40 90 90 90

An empty list is indicated by "[]".
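The rule above for deciding whether a parameter is a single value or a list can be sketched in a few lines. This is an approximation for checking your own script files, not the interpreter used by the Wizards: `parse_script_line` is a hypothetical helper, and it assumes that shell-like quoting and "#" comments (as handled by Python's `shlex` module) are close enough to the script syntax.

```python
import shlex

# Keyword fragments that mark a list-valued parameter, as described above.
LIST_MARKERS = ("list", "dataset_", "cell", "input_labels")


def parse_script_line(line):
    """Split a script line into (keyword, parameter).

    Returns None for blank lines and comment-only lines. Hypothetical helper
    that approximates the script syntax with shell-like tokenizing.
    """
    tokens = shlex.split(line, comments=True)  # drops '# ...' and honours quotes
    if not tokens:
        return None
    keyword, values = tokens[0], tokens[1:]
    if any(marker in keyword for marker in LIST_MARKERS):
        if values == ["[]"]:
            return keyword, []  # "[]" denotes an empty list
        return keyword, values
    if not values:
        return keyword, None  # a bare command such as help
    return keyword, " ".join(values)
```

Running each line of a script file through a function like this before starting a Wizard is one way to catch a value that was meant to be a list but is attached to a keyword that will be read as a single string.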

  • NOTE that you can set any SOLVE, RESOLVE or RESOLVE_PATTERN keyword in PHENIX using the "resolve_command", "solve_command" or "resolve_pattern_command" keywords from a script. The format is different from the command-line format: you don't have to put quotes around the command:
    resolve_command ligand_start start.pdb # NOTE: quotes not necessary for script
    

    This will put the text

    ligand_start start.pdb
    
    at the end of every temporary command file created to run resolve.

Specific limitations and problems:

  • In the GUI version of the Wizards, the Display Options window is updated only when you open it. Further, once this window is open you cannot open it again until you close it. Sometimes this window may be hidden behind other windows, and this will prevent you from opening it again until you find and close it.

  • The Wizards use file names based on the names of your input files, but they do not differentiate between files with the same name coming from different directories. Consequently you should not use two files with different contents but with the same file name as inputs to a Wizard, even if they come from separate starting directories.

  • The command-line version of AutoSol cannot be used for MIR or for combining multiple datasets. The script and GUI versions can be used instead for these cases.

  • If you stop a Wizard and continue on with a command such as phenix.autobuild run=2 then you can change most parameters with keywords just as if you were starting from scratch, but if you had previously changed a keyword away from the default, you cannot set it back to the default in this way (the Wizard ignores keywords that are the same as the default).

  • You should not work on the same run in two ways at the same time. This can lead to unpredictable results because the two runs will really be the same run and the data and databases for the two runs will be overwriting each other. This means you need to be careful that if you goto_run 1 of a Wizard in one window that you do not also goto_run 1 of the same Wizard in another window. On the other hand, it is perfectly fine to work on run 1 of a Wizard in one window and run 2 of the same Wizard in another window.

  • The PHENIX Wizards can take most settings of most space groups. However, they can only use the hexagonal setting of rhombohedral space groups (e.g., #146 R3:H or #155 R32:H), and they cannot use space groups 114-119 (not found in macromolecular crystallography), even in the standard setting. This is due to difficulties with the use of asuset in the version of the ccp4 libraries used in PHENIX for these settings and space groups.

Literature

Additional information