How do I get an account for Memex?

To request an account you should contact your departmental IT Manager and get approval from your PI:

DGE/DPB - Garret Huntress or request HPC Access

DTM - Michael Acierno

EMB - Ed Hirschmugl or Fred Tan

GL - Gabor Szilagyi

OBS - Chris Burns or Andrew Benson

HQ - Floyd Fayton

Everyone with a valid Memex login account will be automatically added to our Memex-Announce Google Group (memex-announce@carnegiescience.edu).

How do I log in to Memex?

The basic command to log in from PuTTY or any terminal:

 ssh username@memex.carnegiescience.edu
Password:
Last login: Fri Jul 28 09:15:05 2017 from 192.70.249.30

Rocks 6.2 (SideWinder)
Profile built 02:15 24-Nov-2015
Kickstarted 02:20 24-Nov-2015
Login Node
(... CarnegieScience banner and machine information ...)
[username@memex ~]$


where your username/password is the same as your username/password for Gmail.  For example, if your email is bush@carnegiescience.edu, your login for Memex is bush@memex.carnegiescience.edu.

What partition or queue should I use?

Most partitions correspond to a Department (DGE, DPB, DTM, EMB, GL, HQ, and OBS); the exceptions are the GPU, SHARED, and PREEMPTION partitions.  Typing "sinfo -a -s" will give you a summary of all the partitions.  The Department partitions impose no time limit, and memory on their nodes is limited only by the hardware (up to 128GB).  The GPU nodes have the same specifications except they also have an NVIDIA K80 GPU.  Department nodes are also shared in SHARED and PREEMPTION, which have limits to make sure Department nodes remain generally available to their Departments.

Nodes in the SHARED partition have a two-hour limit for all jobs.  This ensures no Department user waits more than two hours to use their Department's nodes.

Nodes in the PREEMPTION partition have a 7-day time limit, are limited to 40GB of memory per node, and can be suspended at any time if a Department user requests any of the nodes.

Please use your own discretion as to which shared partition, SHARED or PREEMPTION, suits your needs.

Most nodes (memex-g[01-02], memex-c[001-100]) have 128GB of memory, 24 cores, and 1.6TB of local storage (~30GB /tmp).  Nodes memex-c[101-108] have 128GB and 28 cores but only 250GB of local storage (~30GB /tmp).
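When deciding on an sbatch --mem-per-cpu value, a quick bit of arithmetic gives the per-core memory share on a standard node (numbers taken from the specs above; this is just a back-of-the-envelope sketch):

```shell
# Per-core memory share on a standard 128GB / 24-core Memex node, in MB.
# Useful when choosing a --mem-per-cpu value (the SLURM default here is 1000M).
MEM_MB_PER_NODE=128000
CORES_PER_NODE=24
echo $(( MEM_MB_PER_NODE / CORES_PER_NODE ))   # -> 5333
```

Requesting more than this per core means some cores on the node will sit idle while your job holds their memory.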


Memex SLURM Partitions

Partition Name | Wallclock Hours | Total Cores  | Number of Nodes | MemLimit per Node (GB)
SHARED         | 2               | 2160         | 78              | 128
PREEMPTION     | 168             | 696          | 30              | 40
DGE            | infinite        | 240          | 10              | 128
DPB            | infinite        | 240          | 10              | 128
DTM            | infinite        | 960          | 36              | 128 (memex-c[109-116] have 256GB)
EMB            | infinite        | 120          | 5               | 128
GL             | infinite        | 240          | 10              | 128
HQ             | infinite        | 120          | 4               | 128
OBS            | infinite        | 960          | 40              | 128
GPU            | infinite        | 48 (+2 K80s) | 2               | 128


How do I run a job on Memex?

For an interactive job use SLURM’s “salloc” or "srun" from the command line.  Here’s an example to grab 4 nodes in a bash shell,

$ salloc -N 4 bash 

or

$ srun -N4 --pty bash -i 

Then run your application with 96 CPUs (24 CPUs per node),

$ mpirun -n 96 a.out < input_file > output_file

For a batch/non-interactive job, here is a SLURM batch script,

$ cat batch_script.sh
#!/bin/bash
#SBATCH --nodes=1 # only grab one node on Memex
#SBATCH --ntasks-per-node=24  # 24 is the max for most nodes
#SBATCH --time=02:00:00     # two-hour limit for SHARED only, 7d for PREEMPTION
#SBATCH -p SHARED,PREEMPTION # if SHARED isn't available, 2nd choice is PREEMPTION
#SBATCH --mem-per-cpu=2000  # default is 1000M, and PREEMPTION has a 40000M total limit per node
#SBATCH --output=slurm-%j.out # %j expands to the job ID; you can specify the directory and file name of SLURM's output log
#SBATCH --error=slurm-%j.out # you can specify the directory and file name of SLURM's error log
#SBATCH --mail-user=username@carnegiescience.edu # Use your email for notifications
#SBATCH --mail-type=FAIL # only send email if the job fails (BEGIN,END,SUSPEND are options as well and can be used in combination)
echo "SLURM_JOBID: " $SLURM_JOBID
echo "SLURM_ARRAY_TASK_ID: " $SLURM_ARRAY_TASK_ID
echo "SLURM_ARRAY_JOB_ID: " $SLURM_ARRAY_JOB_ID
echo "$SLURM_JOB_NODELIST"
module load Intel/2018

mpirun -n $SLURM_NTASKS a.out >> slurm-${SLURM_JOBID}.out 2>&1  # redirect your application's output to append to SLURM's output log
Submit the script to SLURM with

$ sbatch < batch_script.sh

Monitor all user jobs with "squeue" (or just your own with "squeue -u $USER").
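Commonly used SLURM monitoring commands (all standard SLURM; the job ID 12345 is just a placeholder):

```shell
# All jobs currently queued or running
squeue
# Only your own jobs
squeue -u $USER
# Full details for one job, including its node list and resource request
scontrol show job 12345
```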

For more information, please send an email to memexsupport@memex.carnegiescience.edu.

How do I use job dependencies in SLURM?

Here is an example of a regular shell script to submit jobs with dependencies:

#!/bin/bash
# Launch first job
JOB=`sbatch job.sh | egrep -o -e "\b[0-9]+$"`
# Launch a job that should run if the first is successful
sbatch --dependency=afterok:${JOB} after_success.sh
# Launch a job that should run if the first job is unsuccessful
sbatch --dependency=afternotok:${JOB} after_fail.sh


where the contents of job.sh, after_success.sh, and after_fail.sh are SLURM batch scripts.
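The egrep step in the script above extracts the trailing job ID from sbatch's usual confirmation message, "Submitted batch job <id>".  You can see what it captures by running the same extraction on a hard-coded example message:

```shell
# Same extraction as in the dependency script, with sbatch's
# confirmation message hard-coded as an example
MSG="Submitted batch job 12345"
JOB=$(echo "$MSG" | egrep -o "[0-9]+$")
echo "$JOB"   # -> 12345
```

Recent SLURM versions also support "sbatch --parsable", which prints only the job ID and makes the egrep step unnecessary.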

How do I monitor my job on Memex?

There are several ways to monitor your job on Memex.  If you want to monitor the progress of your application, you can “tail -f output_file.txt”, where the filename is the one you redirect to in your batch or interactive SLURM submission.  For example,

mpirun -n $SLURM_NTASKS a.out > output_file.txt 2>&1

If you’re interested in monitoring resource usage (CPU/memory/network), you can go to our Ganglia page, or use the following command for running jobs only:

sstat --format=AveCPU,AveCPUFreq,MaxDiskRead,MaxDiskWrite,AvePages,MaxRSS,MaxVMSize,JobID,NTasks -j XXXXX.batch

where XXXXX is a SLURM job ID.  Otherwise, please email memexsupport@carnegiescience.edu for other options.
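sstat only works while a job is running; once a job has finished, its resource usage can be queried from the accounting database with sacct instead (a standard SLURM command, assuming job accounting is enabled on Memex):

```shell
# Resource usage for a completed job (replace XXXXX with a SLURM job ID)
sacct -j XXXXX --format=JobID,JobName,Partition,Elapsed,MaxRSS,State,ExitCode
```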

How do I backup my Memex data?

We do NOT have a central backup in place for user data.  However, you can back up your critical data to Google Drive using the “rclone” tool on Memex.

Here are examples of how to use Rclone commands:

$ rclone ls remote:path
$ rclone copy /local/path remote:path # copies /local/path to the remote 
$ rclone sync /local/path remote:path # syncs /local/path to the remote
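Two more standard rclone options are worth knowing before trusting a backup: --dry-run previews what a sync would transfer without copying anything, and "rclone check" verifies that source and destination match afterwards:

```shell
# Show what sync WOULD do, without transferring anything
rclone sync --dry-run /local/path remote:path
# Compare source and destination after a copy/sync
rclone check /local/path remote:path
```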

Also, please don't use spaces in the names of your Google Drive backups!  For example, if you're backing up a directory or file, please make a path with no spaces on GDrive (i.e. rclone sync /home/user/bio GDrive:biobackup).  If you have issues, please submit a ticket to memexsupport@carnegiescience.edu.

How can I request software?

Please email memexsupport@carnegiescience.edu to request software to be added to Memex.  Normally such a request can take up to one week to fulfill, but please be advised that some applications may not work on Memex.  If the application can be installed, a module will be made available as well.

How do I monitor Memex's hardware?

Visit our Ganglia page by browsing to http://10.15.176.1/ganglia after you've logged into Memex.  Make sure you log into Memex using X11 forwarding ("ssh -X username@memex.carnegiescience.edu" or "ssh -XY username@memex.carnegiescience.edu").

For slower connections, setting up VNC (from Mac or Windows, using memex.carnegiescience.edu not calc.d**.carnegiescience.edu) would be a better option.
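If X11 or VNC is too slow, an SSH tunnel can forward the Ganglia page to your local browser.  This is a sketch that assumes Ganglia is served on port 80 of 10.15.176.1, as the URL above suggests:

```shell
# Forward local port 8080 to the Ganglia web server via the Memex login node
ssh -L 8080:10.15.176.1:80 username@memex.carnegiescience.edu
# then browse to http://localhost:8080/ganglia on your own machine
```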

Where do I find Memex news and events?

Subscribe to Carnegie's HPC calendar for important updates, and if you have a Memex account, you'll automatically receive emails from the official mailing list (memex-announce@carnegiescience.edu).  For general questions that are not affecting your work, please join the Memex Discuss mailing list or follow our Slack Channel for Carnegie HPC.

How do I use Lustre?

Here is general information on Lustre,

https://www.nics.tennessee.edu/computing-resources/file-systems/io-lustr...

Linux Basics for HPC....

Here's a straightforward Linux tutorial from the National Institute for Computational Sciences.  Whether it's Memex or your local desktop, the video covers commands and topics that every Linux user should know.  I have broken down the hour-long video by topic (each segment typically under 2-3 minutes), so you can jump to the topic you're most interested in.



CyberDuck is the recommended SFTP transfer program due to the way it handles session sharing and its ease of use.  Download CyberDuck from https://cyberduck.io/?l=en and install it.  Once installed, be sure to set the File Transfer setting to "Use browser connection" to avoid having to authenticate each time you want to transfer a file.
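For quick one-off transfers from a terminal, scp works with the same credentials as your SSH login (the file names here are only illustrative):

```shell
# Upload a local file to your Memex home directory
scp myfile.txt username@memex.carnegiescience.edu:~/
# Download a file from Memex to the current local directory
scp username@memex.carnegiescience.edu:~/results.tgz .
```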