[phenixbb] parallel phenix.phaser on SGE cluster

Jon Schuermann schuerjp at anl.gov
Wed Jul 10 11:17:44 PDT 2013


Lionel,

It's not quite as easy as one would hope, and it has nothing to do with 
Phenix or Phaser. We make use of extensive multiprocessing across our 
cluster in our RAPD software used at the beamline to speed up the 
results for the user.

I set this up a while ago, but from what I remember, you first have to
set up a 'parallel environment' (PE) in SGE. The options may be
different depending on your version of SGE. We use 6.2u4. There are
probably default PEs set up already, but they might not have the correct
parameters. I created a new one called 'smp' with the number of 'slots'
set to the number of cores in the cluster; you could also use some lower
limit. (If you set 'slots' to 12 and you submit 5 jobs requiring 4 slots
each, only three will run until one has finished and its slots are
free.) The allocation rule is set to '$pe_slots' so that the job can use
only the cores on a single node. There are other rules that might be
better for what you want to do. In our case, I set up different queues
with different priorities that have access to specific PEs depending on
the jobs that are getting submitted at the beamline. Your setup may not
need this complexity. I would read through the huge manual for SGE for
details or do a search on Oracle's website.
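
To make the slot arithmetic in that example concrete, here is a small
Python sketch. The function name is made up for illustration, and it
only models the PE-wide slot total; it ignores the per-node limit that
'$pe_slots' also imposes.

```python
def concurrent_jobs(pe_slots, slots_per_job, submitted):
    """Return (running, waiting) job counts for a PE with a fixed slot total."""
    if slots_per_job <= 0:
        raise ValueError("slots_per_job must be positive")
    # A job only starts once all of the slots it requested are free.
    capacity = pe_slots // slots_per_job
    running = min(submitted, capacity)
    return running, submitted - running


# 12 slots in the PE, 5 jobs submitted, 4 slots each.
print(concurrent_jobs(12, 4, 5))  # (3, 2)
```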

When you submit the job, make sure you add the PE request to the qsub
command ('qsub ... -pe smp 1-4 ...'), which tells SGE that your job
needs 1-4 cores on a single node. You could also just specify a single
integer (4 instead of 1-4) to request exactly 4 slots. Obviously, you
can modify these to your needs. After you submit the job, run 'qstat'
and look at the last column labeled 'slots' to see how many slots are
reserved for the job.
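
If you submit from a Python script, the request can be assembled like
this. It is only a sketch: the helper name and the job script
'run_phaser.sh' are hypothetical, and the PE name 'smp' is the one from
our setup, so substitute whatever your cluster uses.

```python
import shlex


def qsub_command(script, pe="smp", min_slots=1, max_slots=4):
    """Build a qsub argument list that requests slots from a parallel environment."""
    if not 1 <= min_slots <= max_slots:
        raise ValueError("need 1 <= min_slots <= max_slots")
    # A single integer requests exactly that many slots; 'a-b' requests a range.
    if min_slots == max_slots:
        slots = str(max_slots)
    else:
        slots = f"{min_slots}-{max_slots}"
    return ["qsub", "-pe", pe, slots, script]


# 'run_phaser.sh' is a placeholder for your own job script.
print(shlex.join(qsub_command("run_phaser.sh")))  # qsub -pe smp 1-4 run_phaser.sh
```

On a submit host the list can be passed straight to
subprocess.run(cmd, check=True).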

In your Phaser command include 'JOBS 4' to match your requested number 
of slots. I am not sure how much this speeds up a single Phaser job 
because there isn't a whole lot of code to parallelize in MR. Randy Read 
mentioned (either to me or the BB, I don't remember) that everything 
that could be parallelized in Phaser is done.
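
If you request a range such as 1-4, the number of slots you actually
get can vary. SGE sets the NSLOTS environment variable inside a job that
runs in a parallel environment, so one option is to read it at run time
and write the matching JOBS line. A sketch, assuming the rest of the
Phaser keyword input is written elsewhere; the helper name is
hypothetical.

```python
import os


def phaser_jobs_keyword(environ=None, default=1):
    """Return a Phaser 'JOBS n' line matching the slots granted by SGE."""
    environ = os.environ if environ is None else environ
    try:
        slots = int(environ.get("NSLOTS", default))
    except ValueError:
        # NSLOTS is missing or malformed: fall back to a single job.
        slots = default
    return f"JOBS {max(1, slots)}"


# Simulate a job that was granted 4 slots.
print(phaser_jobs_keyword({"NSLOTS": "4"}))  # JOBS 4
```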

On a side note, if you write code in Python and you start a new
multiprocessing.Process(), it runs as a separate process on the same
node and will normally end up on another core. You have to account for
this when you request a specific number of slots during job submission,
otherwise you could overload your cluster pretty quickly. Many programs
have an optimal number of slots to request, and requesting more slots
will not make them run any faster, but it will limit the resources
available for other jobs on the cluster. I assume Phaser is one of these
programs.
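
One way to keep a Python job inside its allocation is to size the worker
pool from the slots SGE granted rather than from the core count of the
node. This is a minimal sketch, not how RAPD does it; the function names
are made up, and it assumes NSLOTS is set because the job was submitted
with '-pe'.

```python
import multiprocessing
import os


def worker_count(n_tasks, environ=None):
    """Number of worker processes to start without exceeding the granted slots."""
    environ = os.environ if environ is None else environ
    slots = int(environ.get("NSLOTS", "1"))  # 1 if not running under a PE
    return max(1, min(slots, n_tasks))


def square(x):
    return x * x


def run_within_slots(func, items):
    """Map func over items using no more worker processes than slots."""
    with multiprocessing.Pool(processes=worker_count(len(items))) as pool:
        return pool.map(func, items)


if __name__ == "__main__":
    print(run_within_slots(square, [1, 2, 3, 4]))
```

The parent process mostly sits idle while the pool works, so I count
only the workers here; if the parent does real work as well, request one
more slot or start one fewer worker.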

Jon

-- 
Jonathan P. Schuermann, Ph. D.
Beamline Scientist, NE-CAT
Argonne National Laboratory, 436E
9700 S. Cass Ave.
Argonne, IL 60439

Email: schuerjp at anl.gov
Tel: (630) 252-0682



On 07/10/2013 09:16 AM, L. Costenaro (IBB) wrote:
> Hello,
>
> I am trying to run phaser on an SGE cluster (qsub) using multiple
> processors (either phenix.phaser from the command line or phaser-MR
> from the GUI), but the jobs do not parallelize. When I run phaser-MR
> locally (same executable) it does parallelize (multiple python threads).
>
> Any help or advice would be welcome.
>
> Best regards,
> Lionel
>
>
> _______________________________________________
> phenixbb mailing list
> phenixbb at phenix-online.org
> http://phenix-online.org/mailman/listinfo/phenixbb

