[phenixbb] phenix.map_to_model input mtz file failure --caution on using map_to_model with X-ray data
Edward A. Berry
BerryE at upstate.edu
Tue Jun 13 17:56:41 PDT 2017
On 06/13/2017 07:44 PM, Dale Tronrud wrote:
>
> First we have to agree on exactly what we are talking about. I
> presumed we were talking about real space refinement against an
> experimental map such as one gets in cryo-EM. In that case there are no
> Fobs. Reciprocal space is a fiction and should be avoided.
>
> If you are working, instead, with a 2Fo-Fc map then just do
> reciprocal space refinement. I don't know of any reason to do
> whole-molecule real space refinement when you are working with crystal
> diffraction data. Reciprocal space is where the experiment lives and
> the analysis should be done there.
>
> In cases of model building it is computationally quicker to do a
> local real space refinement to touch up a model just so you can see if
> it looks reasonable before going back into reciprocal space. This real
> space refinement is quick-and-dirty and any flaws will be erased by the
> proper reciprocal space refinement that follows.
I have no argument with that! And I do realize how significant cryo-EM has
become to structural biology with the advent of direct electron detectors.
I was still stuck on the perhaps not so important theoretical question of
"to exclude or not to exclude" the free-R set in real-space refinement
prior to reciprocal space refinement.
> As for "neighborhood correlation" I was thinking of cryo-EM maps.
> Since the individual measurements (pictures) are real space in nature, I
> can't imagine an experimental error in the voltage of one voxel wouldn't
> tend to show up similarly in its neighbors. The whole group of voxels
> will be illuminated by electrons who all had very similar histories
> passing through the microscope.
>
> We have similar situations with diffraction data. A reflection whose
> neighbor is in a shadow has a much higher chance of being shadowed
> itself. Our spots, however, are much further apart on the detector than
> the voxels of an EM image.
>
> There is another type of correlation that is probably more important.
> Our diffraction spots are separated enough that you cannot predict the
> intensity of a reflection based on its neighbors. You can make a very
> good prediction of the darkness of a voxel based on its neighbors. If
> you leave out one voxel, as a test set member, you could easily deduce
> its hidden value without even building a molecular model - just
> interpolate. You can't do that with diffraction data.
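The contrast drawn above can be illustrated with a toy one-dimensional example. This is only a sketch under stated assumptions: the "map" is a finely sampled sum of eight low-frequency cosine waves whose amplitudes are drawn independently at random (a stand-in for uncorrelated reflections), and every name in it (density, voxel_error, amp_error) is invented for the illustration. It does not model a real microscope or detector.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D "map": eight low-frequency waves, finely sampled (assumption).
n = 256
x = np.arange(n)
density = np.zeros(n)
for k in range(1, 9):
    amp = rng.normal()                  # independent amplitude per "reflection"
    phase = rng.uniform(0, 2 * np.pi)
    density += amp * np.cos(2 * np.pi * k * x / n + phase)

# Hide one voxel and guess it by averaging its two neighbours.
i = 100
voxel_guess = 0.5 * (density[i - 1] + density[i + 1])
voxel_error = abs(voxel_guess - density[i]) / np.ptp(density)

# Try the same trick on the Fourier amplitudes of the eight waves.
amps = np.abs(np.fft.rfft(density))[1:9]
j = 4
amp_guess = 0.5 * (amps[j - 1] + amps[j + 1])
amp_error = abs(amp_guess - amps[j]) / np.ptp(amps)

print("relative error, interpolated voxel:    ", voxel_error)
print("relative error, interpolated amplitude:", amp_error)
```

Because the map is smooth on the scale of one voxel, the hidden voxel is recovered almost exactly from its neighbours. The amplitudes were drawn independently, so there is no reason for the same interpolation to succeed in reciprocal space.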
>
> This means if you want to leave out a chunk of map data for a test
> set you have to pull out a big enough piece (many contiguous voxels)
> that you can't deduce anything about their opaqueness from the remaining
> image. To do this you have to know something about how the microscope
> works.
>
> Dale Tronrud
>
> On 6/13/2017 4:08 PM, Edward A. Berry wrote:
>>> To unbias you would have to calculate a
>>> new map with current Fcalc's for every iteration of the model, but this
>>> method would not take into account the neighborhood correlation present
>>> in experimental maps.)
>>
>> Thanks, Dale,
>> Could you explain this "neighborhood correlation"?
>>
>> My very simple (maybe too simple) understanding of how real space would
>> bias reflections is as follows:
>>
>> You make a map using phases (and Fc?) from the current model and Fobs.
>> But you omit the free set.
>>
>> Now if you take the Fourier transform of that unmodified map, you would
>> get back exactly the coefficients you put in: 2Fo-Fc (?) for the working
>> reflections, and zero for the free set.
>>
>> Then you make modifications to the model to make its density match as
>> nearly as possible the density of the map. If you were able to make the
>> density of the model exactly match that of the map, then the Fc for the
>> model would be that of the map.
>>
>> Of course you can never make the density of the model exactly match that of
>> the map - modelization is the severest form of density modification.
>> But, to the extent that you make the model's density more nearly like that
>> of the map, the Fourier transform of the new model will be more
>> like that of the map.
>>
>> That means for the working reflections, Fc will get closer to 2Fo-Fc
>> which brings them closer to Fo; and the R-work improves.
>> (If there is error in the Fobs, that will be reflected in the map,
>> and the Fcalc of the model will tend toward these erroneous Fobs
>> (fitting the error) and Rwork will get better than it should (bias).)
>> Free reflections will move closer to zero, and most likely Rfree
>> will get worse.
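The argument above can be checked numerically on a toy one-dimensional "crystal". This is a minimal sketch, assuming random complex numbers stand in for the 2Fo-Fc coefficients and for the starting model's Fc; the helper toy_coeffs and all variable names are hypothetical and have nothing to do with Phenix internals.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 64                      # grid points in the toy 1-D unit cell
m = n // 2 + 1              # number of unique Fourier terms for a real map


def toy_coeffs(rng, m):
    """Random complex coefficients with real F(0) and real Nyquist term."""
    c = rng.normal(size=m) + 1j * rng.normal(size=m)
    c[0] = c[0].real
    c[-1] = c[-1].real
    return c


map_coeffs = toy_coeffs(rng, m)     # stand-in for 2Fo-Fc (assumption)
model_fc = toy_coeffs(rng, m)       # stand-in for the starting model's Fc

# Flag a few terms as the free set and omit them, which means setting them to zero.
free = np.zeros(m, dtype=bool)
free[rng.choice(np.arange(1, m - 1), size=3, replace=False)] = True
work_only = np.where(free, 0.0, map_coeffs)

# Map -> Fourier transform gives back what went in, zeros included.
density = np.fft.irfft(work_only, n=n)
recovered = np.fft.rfft(density)

# Move the model density halfway toward the map and transform again.
model_density = np.fft.irfft(model_fc, n=n)
blended = 0.5 * model_density + 0.5 * density
fc_blended = np.fft.rfft(blended)

print("free terms of map:      ", np.abs(recovered[free]))
print("free Fc before blending:", np.abs(model_fc[free]))
print("free Fc after blending: ", np.abs(fc_blended[free]))
```

Since the Fourier transform is linear, any step that moves the model density toward the map pulls the free-set Fc toward the map's value for those terms, which is zero when they were omitted.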
>>
>> I think that's all consistent with what you wrote, but then
>> I had the impression that the bias could be prevented by making the
>> map with Fc for the test set (proposed in an old paper by Ivan
>> Rayment). That way the free reflections follow the
>> process through their coupling to neighboring reflections in reciprocal
>> space (neighborhood correlation?), the same way they do in reciprocal
>> space refinement, rather than the Fobs being used. The information
>> in these free Fcalc is coming from the neighboring working reflections
>> due to redundancy of information in a finely sampled molecular transform.
>>
>> Ed
>>
>>
>>
>> On 06/13/2017 05:40 PM, Dale Tronrud wrote:
>>>
>>>
>>> On 6/13/2017 12:30 PM, Edward A. Berry wrote:
>>>> Thanks, Pavel,
>>>> I really appreciate your taking the time to generate the example.
>>>>
>>>> While I agree with Tim and Ian that refinement to convergence should
>>>> remove the bias making it perhaps not a serious problem, my question was
>>>> in fact whether there is any bias immediately after the refinement.
>>>>
>>>> I will need to study this example a bit, but one thing I notice is
>>>> that you are doing exactly what I was guessing, comparing Rfree
>>>> after real-space refinement with and without using the free set.
>>>> Then, I still think, we have to think about how much of that
>>>> difference results from bias towards the observed values (when the
>>>> reflections are included) and how much is from bias towards zero
>>>> (when the free set is excluded).
>>>>
>>> Of course the model is refined as though the test set Fourier
>>> components were equal to zero. In reciprocal space refinement when you
>>> leave a reflection out of the "sum over all reflections" when
>>> calculating the difference map you are saying that you have no opinion
>>> about the amplitude of that reflection. When you calculate a real space
>>> map from Fourier coefficients you can't not have an opinion, i.e. you
>>> can't leave a term out of the sum; you can only set that term to zero.
>>> If your model produces a prediction for that term which is not equal to
>>> zero it will be penalized. (If you set that term to Fcalc you tie your
>>> model to its starting point. To unbias you would have to calculate a
>>> new map with current Fcalc's for every iteration of the model, but this
>>> method would not take into account the neighborhood correlation present
>>> in experimental maps.)
>>>
>>> What this means is that Rfree is not a meaningful stat for assessing
>>> overfitting of real space refinement. This is hardly a surprise. A
>>> test of a refinement protocol has to be based on the mathematics of that
>>> protocol, not the protocol you happened to have used yesterday. If you
>>> want an unbiased estimate of the quality of a real space refinement you
>>> have to leave out a region of the map and then see how well the model
>>> fits that region. This is harder to do in an automated fashion and
>>> there will be a lot of caveats about your results (e.g. you know about
>>> the ability to fit one region but does that generalize to other areas?).
>>> If you recall there are a lot of caveats about Rfree too - we have just
>>> stopped worrying about them. (e.g. low resolution vs. high resolution
>>> reflections, choosing based on shells or randomly, what to do about
>>> ncs...)
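One way to picture the "leave out a region of the map" test described above is a correlation coefficient computed only inside a held-out block of voxels. The following is a rough sketch under loud assumptions: the maps are random arrays, the "model" map is just the observed map plus noise, the block size is arbitrary, and region_cc is a hypothetical helper rather than any existing Phenix function. In a real test the refinement would have to be run with the held-out block masked, which is not shown here.

```python
import numpy as np


def region_cc(map_obs, map_model, region):
    """Pearson correlation between two maps, restricted to a boolean mask."""
    a = map_obs[region] - map_obs[region].mean()
    b = map_model[region] - map_model[region].mean()
    return float(np.sum(a * b) / np.sqrt(np.sum(a * a) * np.sum(b * b)))


rng = np.random.default_rng(2)

# Toy "experimental" map and a "model" map that fits it imperfectly.
map_obs = rng.normal(size=(16, 16, 16))
map_model = map_obs + 0.5 * rng.normal(size=map_obs.shape)

# Hold out one contiguous block of voxels, not scattered single voxels,
# so that the block cannot be guessed by interpolating from its neighbours.
held_out = np.zeros(map_obs.shape, dtype=bool)
held_out[4:10, 4:10, 4:10] = True

cc_held_out = region_cc(map_obs, map_model, held_out)
cc_rest = region_cc(map_obs, map_model, ~held_out)
print("CC in held-out block:", cc_held_out)
print("CC in rest of map:   ", cc_rest)
```

How large the block must be depends on the correlation length of the map, which is the part that requires knowing something about the experiment.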
>>>
>>> I think you should consider yourself on the wrong track if you come
>>> up with a statistical test, but haven't given any thought to the actual
>>> experiment that produced your map.
>>>
>>> Dale Tronrud
>>>
>>>> Things I need to look at:
>>>> What are R and R-free for the original refined model?
>>>> What are R and R-free after shaking (did RSR lower R but not Rfree, or
>>>> did it raise Rfree)?
>>>> What if RSR is done using a map made with fill-in strategy?
>>>>
>>>> Ed
>>>>
>>>> On 06/13/2017 02:15 PM, Pavel Afonine wrote:
>>>>> Hi Ed,
>>>>>
>>>>> Including free-r reflections into map calculation and then using such
>>>>> map in real-space refinement of entire model will affect Rfree. Here
>>>>> is a simple example that illustrates my statement, step-by-step:
>>>>>
>>>>> 1) Get data and model from PDB:
>>>>>
>>>>> phenix.fetch_pdb 1f8t --mtz
>>>>>
>>>>> 2) Compute two 2mFo-DFc maps: one includes all reflections, the other
>>>>> one has no free-r terms:
>>>>>
>>>>> phenix.python run.py 1f8t.{pdb,mtz}
>>>>>
>>>>> This will create an MTZ file (map_coeffs.mtz) that contains Fourier
>>>>> map coefficients for both maps.
>>>>>
>>>>> 3) Shake model a bit:
>>>>>
>>>>> phenix.dynamics 1f8t.pdb number_of_steps=500
>>>>>
>>>>> 4) Run real-space refinement using two maps:
>>>>>
>>>>> phenix.real_space_refine map_coeffs.mtz 1f8t_shaken.pdb
>>>>> label="work,PHIwork" ncs_constraints=false output.file_name_prefix=work
>>>>>
>>>>> phenix.real_space_refine map_coeffs.mtz 1f8t_shaken.pdb
>>>>> label="all,PHIall" ncs_constraints=false output.file_name_prefix=all
>>>>>
>>>>> 5) Compute R-factors using data and real-space refined models:
>>>>>
>>>>> phenix.model_vs_data 1f8t.mtz all_real_space_refined.pdb
>>>>> r_work(re-computed) : 0.2419
>>>>> r_free(re-computed) : 0.2441
>>>>>
>>>>> phenix.model_vs_data 1f8t.mtz work_real_space_refined.pdb
>>>>> r_work(re-computed) : 0.2444
>>>>> r_free(re-computed) : 0.2756
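For readers unfamiliar with how the two numbers in step 5 differ, here is a simplified sketch of the conventional R-factor, R = sum(| |Fobs| - k|Fcalc| |) / sum(|Fobs|), evaluated separately over the working and free reflections. It is an illustration only: the amplitudes are made up, the single least-squares scale k is an assumption, and phenix.model_vs_data does considerably more (bulk-solvent and anisotropic scaling, for instance) than this helper.

```python
import numpy as np


def lsq_scale(f_obs, f_calc):
    """Least-squares scale k minimising sum (f_obs - k * f_calc)^2."""
    return float(np.sum(f_obs * f_calc) / np.sum(f_calc ** 2))


def r_factor(f_obs, f_calc, select, scale):
    """R = sum| Fobs - k*Fcalc | / sum Fobs over the selected reflections."""
    fo = f_obs[select]
    fc = f_calc[select]
    return float(np.sum(np.abs(fo - scale * fc)) / np.sum(fo))


# Made-up amplitudes and free-R flags, purely for illustration.
f_obs = np.array([100.0, 50.0, 80.0, 20.0, 65.0, 40.0])
f_calc = np.array([90.0, 55.0, 80.0, 25.0, 60.0, 47.0])
free = np.array([False, False, False, True, False, True])

# Scale determined on the working set only, then applied to both sets.
k = lsq_scale(f_obs[~free], f_calc[~free])
r_work = r_factor(f_obs, f_calc, ~free, k)
r_free = r_factor(f_obs, f_calc, free, k)
print("r_work:", r_work)
print("r_free:", r_free)
```

The only difference between r_work and r_free is which reflections enter the sums, which is why anything that lets the free reflections influence the model, directly or through the map, affects the gap between them.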
>>>>>
>>>>> The result is self-explanatory and is in line with Tom's reply to Wei.
>>>>>
>>>>> All files necessary to reproduce calculations above are here:
>>>>> http://cci.lbl.gov/~afonine/tmp/
>>>>>
>>>>> All the best,
>>>>> Pavel
>>>>>
>>>>>
>>>>> On 6/8/17 10:05, Tim Gruene wrote:
>>>>>> Hi Ed,
>>>>>>
>>>>>> including the 'free' reflections in the map for modelling does not
>>>>>> taint the value of Rfree. That is a misconception that is very
>>>>>> persistent (as prejudices usually are). I believe it was Ian Tickle
>>>>>> who formulated that when you simply refine long enough towards
>>>>>> convergence, all reflections excluded from refinement will become
>>>>>> independent, i.e. you can assign a new set for Rfree every time you
>>>>>> refine, if you wish so.
>>>>>>
>>>>>> This concept is the reason why Rcomplete (the "better" equivalent to
>>>>>> Rfree for small data sets with < 10,000 unique reflections),
>>>>>> introduced by Axel Brunger, works, as we could demonstrate in
>>>>>> doi: 10.1073/pnas.1502136112
>>>>>>
>>>>>> So nothing to worry about when including all reflections in map
>>>>>> calculations.
>>>>>>
>>>>>> Cheers,
>>>>>> Tim
>>>>>>
>>>>>> On Thursday, June 8, 2017 12:42:53 PM CEST Edward A. Berry wrote:
>>>>>>> Hi, Tom,
>>>>>>> Please forgive what may be a silly question from an outsider who
>>>>>>> hasn't really kept up with the crystallography literature or even
>>>>>>> all the Phenix newsletters: What is the evidence that including the
>>>>>>> free set in real space refinement biases R-free of the resulting
>>>>>>> model? Is this Rfree also biased when map coefficients use "fill-in"
>>>>>>> for the excluded free reflections (and is that what
>>>>>>> phenix.remove_free_from_map does)?
>>>>>>>
>>>>>>> My point is that literally excluding the free reflections, as opposed
>>>>>>> to substituting their values with Fc, will bias the free set toward
>>>>>>> grossly incorrect values (namely zero) and therefore greatly worsen
>>>>>>> R-free. Thus if the evidence for bias is that you get worse R-free
>>>>>>> when you exclude the free set, you have to think about how much of
>>>>>>> that difference results from bias towards the observed values (when
>>>>>>> the reflections are included) and how much is from bias towards zero
>>>>>>> (when the free set is excluded).
>>>>>>> (Again, I realize this may be all very well understood by the
>>>>>>> crystallography community and properly taken care of in phenix; I'm
>>>>>>> just asking for my own information) eab
>>>>>>>
>>>>>>> On 06/08/2017 07:28 AM, Terwilliger, Thomas Charles wrote:
>>>>>>>> Hi Wei,
>>>>>>>>
>>>>>>>>
>>>>>>>> I want to give a word of caution about how to use
>>>>>>>> phenix.map_to_model on crystallographic data... The bottom line is
>>>>>>>> you should remove the test set from your map coefficients before
>>>>>>>> running phenix.map_to_model on X-ray data. Here is why:
>>>>>>>>
>>>>>>>>
>>>>>>>> phenix.map_to_model uses real-space refinement, which is refinement
>>>>>>>> against the map. If you supply map coefficients that include your
>>>>>>>> test reflections, then you will be refining against data that is in
>>>>>>>> your test set. This will make your Rfree invalid when you go back
>>>>>>>> and refine your model against the original crystallographic data.
>>>>>>>>
>>>>>>>>
>>>>>>>> To remove the test set from your map coefficients you can use:
>>>>>>>>
>>>>>>>>
>>>>>>>> phenix.remove_free_from_map map_coeffs=my_map_coeffs.mtz
>>>>>>>> free_in=my_data_file_with_freeR_flags.mtz
>>>>>>>> mtz_out=my_map_coeffs_no_free.mtz
>>>>>>>>
>>>>>>>>
>>>>>>>> Also note that phenix.map_to_model uses a fixed map (it does not do
>>>>>>>> density modification). Consequently for most crystallographic data
>>>>>>>> at moderate resolution or higher phenix.autobuild is going to do
>>>>>>>> much better than phenix.map_to_model.
>>>>>>>>
>>>>>>>>
>>>>>>>> All the best,
>>>>>>>>
>>>>>>>> Tom T
>>>>>>>>
>>>>>>>>
>>>>>>>> --------------------------------------------------------------------------
>>>>>>>> *From:* dingding830106 at 163.com <dingding830106 at 163.com>
>>>>>>>> on behalf of dancingdream at 163.com <dancingdream at 163.com>
>>>>>>>> *Sent:* Tuesday, June 6, 2017 9:16 PM
>>>>>>>> *To:* Terwilliger, Thomas Charles
>>>>>>>> *Cc:* phenixbb at phenix-online.org
>>>>>>>> *Subject:* Re:Re: [phenixbb] phenix.map_to_model input mtz file
>>>>>>>> failure
>>>>>>>> Dear Thomas,
>>>>>>>> I used CAD to convert the labels from FDM->FWT, PHIDM->PHFWT, then
>>>>>>>> submitted this job again (without map_coeffs_labels=... ), and
>>>>>>>> everything seems OK.
>>>>>>>> Thank you very much for your help.
>>>>>>>> Best!
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Wei Ding
>>>>>>>> P.O.Box 603
>>>>>>>> The Institute of Physics,Chinese Academy of Sciences
>>>>>>>> Beijing,China
>>>>>>>> 100190
>>>>>>>> Tel: +86-10-82649083
>>>>>>>>
>>>>>>>> E-mail:dingwei at iphy.ac.cn <mailto:wangli at moon.ibp.ac.cn>
>>>>>>>>
>>>>>>>> At 2017-06-07 10:32:14, "Terwilliger, Thomas Charles"
>>>>>>>> <terwilliger at lanl.gov> wrote:
>>>>>>>> Hi Wei,
>>>>>>>>
>>>>>>>>
>>>>>>>> I'm sorry for the trouble!
>>>>>>>>
>>>>>>>>
>>>>>>>> If you supply an MTZ file that has FWT,PHFWT or similar labels,
>>>>>>>> then you can skip the "labels=...." statement and it should run.
>>>>>>>>
>>>>>>>>
>>>>>>>> Let me know if that does not work!
>>>>>>>> All the best,
>>>>>>>>
>>>>>>>> Tom T
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> ----------------------------------------------------------------------
>>>>>>>> *From:* phenixbb-bounces at phenix-online.org on behalf of
>>>>>>>> dancingdream at 163.com <dancingdream at 163.com>
>>>>>>>> *Sent:* Tuesday, June 6, 2017 8:19 PM
>>>>>>>> *To:* phenixbb at phenix-online.org
>>>>>>>> *Subject:* [phenixbb] phenix.map_to_model input mtz file failure
>>>>>>>> Dear Phenix bb,
>>>>>>>> I intend to build an initial model with phenix.map_to_model.
>>>>>>>> The command line is as follows:
>>>>>>>>
>>>>>>>> phenix.map_to_model_1.12rc0-2787 map_coeffs_file=../rep_dm.mtz
>>>>>>>> map_coeffs_labels="'FP,SIGFP' 'PHIDM' 'FOMDM'"
>>>>>>>> seq_file=../resolve.seq is_crystal=True use_sg_symmetry=True
>>>>>>>> density_select=False truncate_at_d_min=True
>>>>>>>>
>>>>>>>> and the feedback is like this:
>>>>>>>>
>>>>>>>> Sorry: No initial assignment made for map_coeffs. Labels used:
>>>>>>>> FP,SIGFP PHIDM FOMDM. Available labels: ['PHIB', 'FOM',
>>>>>>>> 'HLA,HLB,HLC,HLD', 'FP,SIGFP', 'PHIDM', 'FOMDM', 'FDM',
>>>>>>>> 'HLADM,HLBDM,HLCDM,HLDDM'] NOTE: grouped labels like 'FP,SIGFP'
>>>>>>>> must stay together, have commas, and have no spaces. If they come
>>>>>>>> from an MTZ file, they must be in adjacent columns as well.
>>>>>>>> Suggested labels to use: PHIDM FOMDM
>>>>>>>>
>>>>>>>> I tried many other input formats for map_coeffs_labels, such as
>>>>>>>> map_coeffs_labels="FP,SIGFP PHIDM FOMDM"
>>>>>>>> map_coeffs_labels=["FP,SIGFP PHIDM FOMDM"]
>>>>>>>> ... ...
>>>>>>>> but the result is the same. Can anyone tell me how to fix this
>>>>>>>> problem? Thanks a lot.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Wei Ding
>>>>>>>> P.O.Box 603
>>>>>>>> The Institute of Physics,Chinese Academy of Sciences
>>>>>>>> Beijing,China
>>>>>>>> 100190
>>>>>>>> Tel: +86-10-82649083
>>>>>>>> E-mail:dingwei at iphy.ac.cn <mailto:wangli at moon.ibp.ac.cn>
>>>>>>>>
>>>>>>>> _______________________________________________
>>>>>>>> phenixbb mailing list
>>>>>>>> phenixbb at phenix-online.org
>>>>>>>> http://phenix-online.org/mailman/listinfo/phenixbb
>>>>>>>> Unsubscribe:phenixbb-leave at phenix-online.org