[phenixbb] Few questions about the model building and refinement of cryo-EM data

Wang, Bing bingwang at ou.edu
Mon Aug 28 09:30:38 PDT 2017

Hi guys,
Sorry for the detailed questions from a beginner! I am starting model building and refinement for a cryo-EM data set.

Basic information about the data: phage (roughly ball-shaped, P1 space group), map size ~870 MB, resolution ~6 A, two X-ray structures available for model building.

General strategy: I use COOT for model building, phenix.real_space_refine for real-space refinement with secondary structure restraints, and REFMAC for reciprocal-space refinement.

A few questions are listed below.
1. The map was too big to open in COOT, so I used phenix.map_to_structure_factors to obtain an MTZ file of ~120 MB (still a little big for my computer). I manually built the whole ball-shaped phage by rigid-body fitting in COOT (from two X-ray structures to 120 chains). My first question: in this case, should I crop the map in Chimera or other software and focus on only a small asymmetric unit for model building and the subsequent refinement?

2. I would like to run a real-space refinement after model building.
Input files:
    The PDB file I just built in COOT
    The original map file
    The converted MTZ file
    Two restraint files generated by ProSMART from the two X-ray structures (TXT format)
The refinement parameters I would like to select in the GUI:
    minimization_global, rigid_body, local_grid_search, adp
    Use secondary structure restraints
    Reference model restraints: use starting model as reference, main chain, side chain, fix outliers, secondary structure only
    Rotamer restraints
    Ramachandran restraints
    Show per residue

My second set of questions: Should I use the map file (~870 MB) or the MTZ file (~120 MB) as input? Is it necessary to add the two ProSMART restraint files if I use the starting model as the reference? Are the refinement parameters selected appropriately?

3. I tried running phenix.real_space_refine. A first error showed up:
Number of atoms with unknown nonbonded energy type symbols: 6840
    "ATOM    184  HG1 SER 1  12 .*.     H  "
    "ATOM    458  HG1 SER 1  30 .*.     H  "
    "ATOM    699  HG1 SER 1  45 .*.     H  "
    "ATOM    720  HG1 SER 1  47 .*.     H  "
    "ATOM    762  HG1 SER 1  50 .*.     H  "
    "ATOM   1241  HG1 SER 1  81 .*.     H  "
    "ATOM   1465  HG1 SER 1  95 .*.     H  "
    "ATOM   1747  HG1 SER 1 113 .*.     H  "
    "ATOM   1758  HG1 SER 1 114 .*.     H  "
    "ATOM   2173  HG1 SER 1 141 .*.     H  "
    ... (remaining 6830 not shown)
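
In case it helps with diagnosis, here is a minimal Python sketch for tallying which residue/atom names are flagged. It assumes the quoted lines in the log keep the standard PDB fixed-column layout (atom name in columns 13-16, residue name in columns 18-20), which appears to be the case in the excerpt above. The function name tally_atoms is only illustrative and is not part of Phenix.

```python
from collections import Counter


def tally_atoms(lines):
    """Count (residue name, atom name) pairs in PDB-style ATOM/HETATM lines.

    Assumption: each line follows PDB fixed columns, optionally wrapped in
    double quotes and indented, as in the error listing above.
    """
    counts = Counter()
    for line in lines:
        # Drop indentation and the surrounding quotes used in the log.
        line = line.strip().strip('"')
        if not line.startswith(("ATOM", "HETATM")):
            continue
        atom_name = line[12:16].strip()  # PDB columns 13-16
        res_name = line[17:20].strip()   # PDB columns 18-20
        counts[(res_name, atom_name)] += 1
    return counts


# Three lines copied from the error listing above.
log_excerpt = '''
    "ATOM    184  HG1 SER 1  12 .*.     H  "
    "ATOM    458  HG1 SER 1  30 .*.     H  "
    "ATOM   1241  HG1 SER 1  81 .*.     H  "
'''

for (res_name, atom_name), n in tally_atoms(log_excerpt.splitlines()).items():
    print(res_name, atom_name, n)
```

The same function can be run on the lines of the PDB file itself to see whether atom names other than HG1 on SER are involved.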

I tried phenix.ready_set to fix this problem, following a previous discussion, but it gave me another error: ENDMDL record missing at end of input.
So my third question is: how do I fix the first error?
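
A quick check I could run on the model file for the second error is sketched below. It only counts MODEL and ENDMDL records, on the assumption (not confirmed) that the message is caused by a MODEL record that has no matching ENDMDL. The function name and the sample lines are hypothetical.

```python
def model_record_counts(lines):
    """Return (number of MODEL records, number of ENDMDL records).

    The record name occupies columns 1-6 of each PDB line.
    """
    n_model = 0
    n_endmdl = 0
    for line in lines:
        record = line[:6].strip()
        if record == "MODEL":
            n_model += 1
        elif record == "ENDMDL":
            n_endmdl += 1
    return n_model, n_endmdl


# Hypothetical sample: a MODEL record is opened but never closed.
sample = [
    "MODEL        1",
    "ATOM    184  HG1 SER 1  12 ",
    "END",
]

n_model, n_endmdl = model_record_counts(sample)
print("MODEL:", n_model, "ENDMDL:", n_endmdl)
if n_model != n_endmdl:
    print("MODEL/ENDMDL records are unbalanced")
```

For a real file, the lines would come from reading the PDB file written by COOT instead of the sample list.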

Thank you for your patience! I would really appreciate your help!

