Getting MEME on the cluster

This program was recently requested by one of our users. Since it’s publicly available open-source software, it’s something we can readily do, though that was a bit more easily said than done. I won’t be covering how I set up the web interface as, well, I forgot how I did most of it. I can say it was a lot of work, and I really hate setting up multiple Perl dependencies. I really wish more Perl modules were available as RPMs. It just makes it easier, but of course not 😛

But back on topic. The web interface was built on another server that gets a lot of usage, so much so that we try to encourage users to do more of their work on the cluster rather than on that machine, which is more useful for large-memory jobs as it has 128GB of RAM. So onto the cluster MEME goes. The software is readily available on their website with no hoops to go through, and there is some nice, but not the best, documentation on how to set up the software.

So, following their instructions: download, untar, and apply any patches. In this case there were two, but only one of them worked. I didn’t bother with the second one as it only patched web-related files. Then run the standard configure script and set the prefix; I also included options for MPICH2 and to build the included libxml. During compiling and linking, the build was having problems with the system version of libxml, and not wanting to change anything on the cluster, I opted for the included version.
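A minimal sketch of what that configure step might look like. The install prefix and MPICH2 path are the ones used in the job script further down; the flag names (--with-mpidir, --with-mpicc, --enable-build-libxml2) are what I believe MEME 4.x offers, so treat them as assumptions and check them against ./configure --help for your version.

```shell
#!/bin/bash
# Sketch only: paths come from the job script in this post; flag names are
# assumed from MEME 4.x and should be verified with ./configure --help.
PREFIX="/share/apps/meme"        # where MEME gets installed
MPICH2_ROOT="/opt/mpich2/gnu"    # existing MPICH2 installation

CONFIGURE_ARGS=(
    "--prefix=$PREFIX"
    "--with-mpidir=$MPICH2_ROOT"           # MPI installation directory
    "--with-mpicc=$MPICH2_ROOT/bin/mpicc"  # MPI compiler wrapper
    "--enable-build-libxml2"               # use the bundled libxml2
)

# Only build when run from inside the unpacked (and patched) source tree;
# otherwise just show the command that would be run.
if [ -x ./configure ]; then
    ./configure "${CONFIGURE_ARGS[@]}" && make && make test
    # Run "make install" once the tests pass.
else
    echo "Would run: ./configure ${CONFIGURE_ARGS[*]}"
fi
```

Keeping the arguments in an array makes it easy to review them before committing to a long build, and to add or drop an option later.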

So it builds and I run the tests. Second issue: Perl dependencies. Since I wasn’t dealing with the web interface, the list of dependencies isn’t very long, and the tests tell you which module is needed. Luckily I was able to find an RPM for it, which lists its own dependencies, which again I was able to find RPMs for. With the tests running successfully, MEME gets installed.

Next was testing the parallel execution. Our Rocks-based cluster uses OpenMPI by default and it’s well integrated into the cluster, but MEME doesn’t support it. It is either LAM or MPICH2. Since MPICH2 is already installed and working, I went with that (on the big memory machine, LAM was used). At the configure stage, you specify the MPICH2 directory and binaries and it takes care of the rest. It should be noted that one of the setup requirements for using MPICH2 is to create .mpd.conf in your home directory and specify a password (secret word) within. This secures communication between mpd daemons, so there is no cross-talk between different users’ jobs.
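As a rough sketch, creating that file can look like the following. The write_mpd_conf helper and the secret value are made up for illustration; the MPD_SECRETWORD key and the owner-only permissions are what mpd expects.

```shell
#!/bin/bash
# Sketch: create the config file that MPICH2's mpd daemons read at startup.
# write_mpd_conf is a hypothetical helper; the secret is a placeholder.
write_mpd_conf() {
    local conf="$1" secret="$2"
    # Write with a restrictive umask so a new file is never world-readable.
    ( umask 077 && printf 'MPD_SECRETWORD=%s\n' "$secret" > "$conf" )
    # mpd refuses to start if anyone other than the owner can read the file.
    chmod 600 "$conf"
}

# Typical use (uncomment and pick your own secret word):
# write_mpd_conf "$HOME/.mpd.conf" "change-me"
```

Each user who submits MPICH2 jobs needs their own copy of this file in their home directory.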

So to run MEME, the following is an example job script for submission to the scheduler (Grid Engine):

#!/bin/bash
#$ -pe mpich2 5
#$ -N meme_test
#$ -j y
#$ -cwd
#$ -S /bin/bash
export MPICH2_ROOT="/opt/mpich2/gnu"
export PATH="$MPICH2_ROOT/bin:$PATH"
export MPD_CON_EXT="sge_$JOB_ID.$SGE_TASK_ID"
time /share/apps/meme/bin/meme -p $NSLOTS INO_up800.s -dna \
-mod anr -revcomp -bfile yeast.nc.6.freq
exit 0

*Note: the "\" is just there to show that this really should be a single line.

That’s about it. Luckily it wasn’t as painful as the web interface, and even nicer is the configure option to specify the use of a particular web server, so the output generated will have links back to the database information hosted on the big memory machine.
