[phenixbb] Detecting and dealing with Pseudotranslation and/or twinning

Hamaoka, Brent bhamaoka at ucsd.edu
Thu May 12 14:19:57 PDT 2011

I’ve been working on a troublesome protein structure.  The native protein forms crystals that diffract to 2.75 Å and belong to P212121 (55.179 64.316 233.748 90.000 90.000 90.000) with 4 molecules in the ASU.  I have 3 versions of the same protein in which selenomethionine mutations are incorporated at different positions.  Interestingly, these mutations cause the protein to form crystals belonging to C2221 (56.130 64.665 240.854 90.000 90.000 90.000).  Looking back at the native datasets, Xtriage indicates that the largest Patterson peak is at (0.5, 0.486, 0), its height is 11.8% of the origin peak, and p_value(height) is 0.08549, which is just outside the threshold for being flagged as pseudotranslation.  Datasets from a couple of the SeMet-incorporated crystals yield diffraction to ~3.5 Å and anomalous signal to ~6.7 Å.  Some of the datasets give a solution with reasonable maps, but the best maps come from combining the MAD/SAD SeMet datasets with one from a mercury-derivatized crystal using the ‘group’ command in Phenix:
STEP: finished
Top solution: # 39 Dataset #0
BAYES-CC: 69.2 +/- 13.4   FOM: 0.6
Built: 219 Side-chains: 44 Chains: 9   CC: 0.74

 Score type:       SKEW    CORR_RMS    NCS_OVERLAP
 Raw scores:        0.41      0.82      0.00  
 100x EST OF CC:   69.17     43.34     31.25  

Maps from this solution show connected electron density that looks like helices, consistent with the predicted secondary structure.  Strangely, there is absolutely no side-chain density, only C-beta at most.  I can build a poly-Ala model into the map, and the distances between the heavy-atom sites appear correct based on the known positions of the selenomethionines and the single cysteine in the protein sequence.  However, the model does not refine: R-free starts and remains near 0.45.  I’ve tried indexing in lower-symmetry space groups (P2, C2, P1) and re-solving by molecular replacement, but the refinement still fails.

Xtriage does not indicate twinning.

Twinning and intensity statistics summary (acentric data):
Statistics independent of twin laws
  <I^2>/<I>^2 : 2.305
  <F>^2/<F^2> : 0.758
  <|E^2-1|>   : 0.788
  <|L|>, <L^2>: 0.501, 0.333
  Multivariate Z score L-test: 1.643

 The multivariate Z score is a quality measure of the given
 spread in intensities. Good to reasonable data are expected
 to have a Z score lower than 3.5.
 Large values can indicate twinning, but small values do not
 necessarily exclude it.
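
For reference, the reported statistics can be set against the standard theoretical values for acentric data from an untwinned crystal and from a perfect twin. The snippet below is only a minimal sketch of that comparison: the reported numbers are copied from the Xtriage summary above, the expected values are the usual textbook ones, and the names EXPECTED, REPORTED, closer_to and summarize are illustrative, not part of Phenix.

```python
# Theoretical acentric values: (untwinned, perfect twin).
EXPECTED = {
    "<I^2>/<I>^2": (2.000, 1.500),
    "<F>^2/<F^2>": (0.785, 0.885),
    "<|E^2-1|>": (0.736, 0.541),
    "<|L|>": (0.500, 0.375),
    "<L^2>": (0.333, 0.200),
}

# Values copied from the Xtriage summary in this post.
REPORTED = {
    "<I^2>/<I>^2": 2.305,
    "<F>^2/<F^2>": 0.758,
    "<|E^2-1|>": 0.788,
    "<|L|>": 0.501,
    "<L^2>": 0.333,
}


def closer_to(stat, value):
    """Return which theoretical value the observed statistic lies nearer to."""
    untwinned, twinned = EXPECTED[stat]
    if abs(value - untwinned) <= abs(value - twinned):
        return "untwinned"
    return "perfect twin"


def summarize(reported):
    """Classify every reported statistic by its nearest theoretical value."""
    return {stat: closer_to(stat, value) for stat, value in reported.items()}


if __name__ == "__main__":
    for stat, verdict in summarize(REPORTED).items():
        untwinned, twinned = EXPECTED[stat]
        print(f"{stat:12s} observed {REPORTED[stat]:.3f}  "
              f"untwinned {untwinned:.3f}  twin {twinned:.3f}  -> {verdict}")
```

Every statistic falls on the untwinned side, and the intensity moments (<I^2>/<I>^2 above 2.0, <F>^2/<F^2> below 0.785) lie beyond the untwinned values in the direction away from twinning. A translational pseudo-symmetry tends to push these moments that way, so it could in principle mask partial twinning in them; the L-test values are generally less affected.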

One possible clue as to what is going on comes from analysis of the SOLVE results.  I was checking whether the SOLVE/PHENIX solutions were related to one another by various origin shifts and came across one particular SOLVE run from a SeMet SAD dataset in C2221 that gave good statistics for a solution (FOM=0.57, Z-score=20.26, peak heights between 7.1 and 10.8 for 4 SeMet sites, to 3.8 Å).  The maps, however, looked poor.  What was interesting, though, was that 2 of the Se sites matched well with where I was expecting the Se sites in one molecule in the asymmetric unit.  The other two matched where I would expect the sites in the other molecule when the model is shifted by one half of the unit cell along the ‘a’ or ‘b’ axis.
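
To illustrate what a translation like the native Patterson peak at (0.5, 0.486, 0) would do to the intensities, here is a minimal sketch. It assumes the idealized case of two identical, identically oriented copies related by a pure translation t, for which the intensity of reflection hkl is scaled by |1 + exp(2*pi*i*h.t)|^2 = 4*cos^2(pi*h.t). The function name modulation and the choice of reflections are illustrative only; real data also contain orientation differences and other molecules, so the effect would be weaker.

```python
import math

# Translation vectors in fractional coordinates.
NATIVE_PEAK = (0.5, 0.486, 0.0)   # largest native Patterson peak reported by Xtriage
C_CENTERING = (0.5, 0.5, 0.0)     # exact C-centering, as in the C2221 SeMet crystals


def modulation(hkl, t):
    """Intensity scale factor for two identical copies related by translation t.

    Idealized model: |1 + exp(2*pi*i*h.t)|^2 = 4*cos^2(pi*h.t).
    """
    phase = sum(index * shift for index, shift in zip(hkl, t))
    return 4.0 * math.cos(math.pi * phase) ** 2


if __name__ == "__main__":
    print("   hkl        exact C   native peak")
    for hkl in [(1, 0, 0), (0, 1, 0), (1, 1, 0), (2, 0, 0), (1, 10, 3), (1, 20, 3)]:
        exact = modulation(hkl, C_CENTERING)
        pseudo = modulation(hkl, NATIVE_PEAK)
        print(f"{str(hkl):12s} {exact:8.3f} {pseudo:10.3f}")
```

With exact C-centering the reflections with h+k odd vanish and the rest are strengthened; with the native vector the same parity pattern holds approximately at low k and drifts as k increases, because 0.486 is not exactly 0.5. In a C-centered cell a shift of a/2 differs from a shift of b/2 only by the centering vector (1/2, 1/2, 0), so the two shifts mentioned above are equivalent in C2221.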

I’d appreciate any advice as to what might be happening, how I might go about detecting the problem, and how to deal with the data.

Brent Hamaoka
UC San Diego
9500 Gilman Drive 0375
La Jolla, CA 92093
