[phenixbb] sculptor : "Wrong alignment format:"

Mon Nov 29 06:42:51 PST 2010

Hi Bryan,

On second thought, I have put a quick pre-filtering step into the alignment 
search, and now the 190-sequence alignment takes only 2 s instead of 5 min. 
Could you try this version and let me know whether the speed issue has gone away?

This will not speed up the calculation if all sequences in the alignment 
are almost identical.
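
The message does not say how the pre-filter works, so the following is only a minimal sketch of one common approach, not sculptor's actual code: rank the alignment rows with a cheap shared k-mer count, then run the expensive comparison only on a short list. All names here (`kmers`, `strip_gaps`, `expensive_score`, `best_match`) are hypothetical, and `difflib.SequenceMatcher` merely stands in for a real pairwise alignment.

```python
from difflib import SequenceMatcher


def kmers(seq, k=3):
    """Return the set of overlapping k-letter words in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def strip_gaps(row):
    """Remove alignment gap characters from one alignment row."""
    return row.replace("-", "").replace(".", "")


def expensive_score(a, b):
    # Assumption: stand-in for a full pairwise alignment score.
    return SequenceMatcher(None, a, b, autojunk=False).ratio()


def best_match(chain_seq, alignment_rows, keep=5, k=3):
    """Return the index of the alignment row that best matches chain_seq."""
    chain_kmers = kmers(chain_seq, k)
    candidates = [strip_gaps(row) for row in alignment_rows]
    # Cheap pass: rank rows by the number of k-mers shared with the chain.
    ranked = sorted(
        range(len(candidates)),
        key=lambda i: len(chain_kmers & kmers(candidates[i], k)),
        reverse=True,
    )
    # Expensive pass: only the top `keep` rows are compared in full.
    shortlist = ranked[:keep]
    return max(shortlist, key=lambda i: expensive_score(chain_seq, candidates[i]))
```

In a sketch like this, nearly identical rows all share about the same number of k-mers with the chain, so the cheap ranking cannot separate them.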

BW, Gabor

>Hi Bryan,
>
> yes, it could be. Sculptor has to find the sequence corresponding to the 
> protein model, and it will first align the chain sequence with all 
> sequences in the alignment, and it picks the best one. This can take some 
> time. On my machine, searching a 190-sequence alignment takes about 5 
> mins. However, if you have several chains in your model and you want all 
> of them to be processed, the total time will be 5 mins multiplied by 
> the number of protein chains.
>
> Now, I am wondering what you are trying to achieve by using such a 
> large alignment. If this is something you consider routine, I will spend 
> some time speeding up the calculation.
>
> Obviously, you must be trying to extract as much information from the 
> sequence alignment as possible, and I am not sure the sequence similarity 
> calculation as implemented in sculptor is optimal for this (right now, 
> sculptor will just take the minimum of all pairwise substitution scores 
> for a certain position). This works well for a pairwise sequence 
> alignment, but for a 190-sequence alignment it just results in gap scores 
> everywhere. Could you also give some advice on how this is best 
> calculated? Would it be better to calculate the average?
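
To make the minimum-versus-average question concrete, here is a small sketch under stated assumptions: `toy_score` is a hypothetical scoring function (match 1, mismatch 0, gap -1), not sculptor's substitution matrix, and the pairwise scores are assumed to be taken between the target residue and each other row in the column.

```python
from statistics import mean

GAP = "-"


def toy_score(a, b):
    # Hypothetical scores for illustration only: gap -1, match 1, mismatch 0.
    if a == GAP or b == GAP:
        return -1.0
    return 1.0 if a == b else 0.0


def column_scores(target_row, other_rows, combine):
    """Combine the target-vs-row scores at each alignment position."""
    scores = []
    for pos, target_res in enumerate(target_row):
        pairwise = [toy_score(target_res, row[pos]) for row in other_rows]
        scores.append(combine(pairwise))
    return scores


target = "MKTAY"
others = ["MKTAY", "MK-AY", "MRTA-", "-KTAY"]

by_min = column_scores(target, others, min)    # one gap dominates a column
by_mean = column_scores(target, others, mean)  # gaps are diluted by matches
```

With the minimum, a single gapped row is enough to pull a column down to the gap score, and in a large alignment most columns are likely to contain at least one gap. The mean keeps such columns distinguishable, at the cost of hiding rare but severe mismatches.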
>
>Best wishes, Gabor
>
>On Nov 26 2010, Bryan Lepore wrote:
>
>>Hi,
>>
>>finally got back around to this one, but it's about speed now, not format :
>>
>>On Mon, Nov 22, 2010 at 5:07 AM, Dr G. Bunkoczi <gb360 at cam.ac.uk> wrote:
>>>>> Is this what you are running (0.3.0)?
>>
>>yes (via dev-590)
>>
>>> Could you point me to an example that takes very long? I can have another 
>>> go at finding the bottleneck.
>>
>>i could - but if i told you i have 190 sequences or 189529 characters
>>(via `wc`) in the alignment, does that indicate anything?
>>
>>-Bryan
>>_______________________________________________
>>phenixbb mailing list
>>phenixbb at phenix-online.org
>>http://phenix-online.org/mailman/listinfo/phenixbb
>>
>