See also the instructions for the New cluster (October 2018)
Please Contact Us with any question you don't find answered here, or if you have suggestions or corrections.
Hpc64 is a centrally administered computing cluster dedicated to research computing, composed of servers owned by various labs in the Division of Science.
The system is accessible at hpc64.brandeis.edu to users with accounts. Upon login, users land on the "login node"; from there they can organize their files, compile software, prepare their runs, and interact with the job scheduler to submit calculations to the compute nodes.
NOTE: the login node must not be used to run calculations directly, only to submit them to the scheduler.
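As a minimal sketch of the submit-from-the-login-node workflow described above, the following creates a Grid Engine batch script and shows how it would be handed to the scheduler with qsub. The script name, program name, and resource requests are illustrative placeholders, not cluster defaults.

```shell
# Create a minimal Grid Engine job script (all names and resource
# requests below are illustrative placeholders, not cluster defaults).
cat > myjob.sh <<'EOF'
#!/bin/bash
#$ -N myjob          # job name
#$ -cwd              # run the job from the current working directory
#$ -j y              # merge stdout and stderr into one output file
./my_program input.dat   # hypothetical executable and input file
EOF

# Submit the script to the scheduler from the login node:
# qsub myjob.sh
```

Keeping the actual computation inside the script ensures it runs on a compute node chosen by the scheduler rather than on the login node.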
If you have a technical question or problem that is not addressed by the documentation below, or you need HPC-related advice, please open a ticket using the form at this link: Open a Ticket
The cluster runs 64-bit Rocks/RHEL Linux and is composed of a heterogeneous collection of Intel Xeon based hardware, ranging from two-socket (8 cores/node) Dell PowerEdge R410 servers to more recent two-socket (16 cores/node) Supermicro servers and four-socket (32 cores/node) M820Blade servers.
The system currently comprises 1900 physical CPU cores and 168 GPUs across 145 computational nodes, each with 8 to 32 physical cores and clock speeds ranging from 2.20 GHz to 2.80 GHz.
Four nodes of the cluster are connected to 12 NVIDIA Tesla M2050 GPUs, 11 nodes are connected to 52 NVIDIA GeForce GTX 780 Ti GPUs and 17 nodes are connected to 104 NVIDIA GeForce TitanX GPUs.
The RAM on the nodes ranges from 8 GB (1 GB/CPU core) for many of the older Dell R410 servers to 128 GB (4 GB/CPU core) for the recent Dell M820Blade servers. The storage amounts to about 34 TB.
The system also features one node with 512 GB RAM and 32 Haswell cores, dedicated to large-memory jobs, and one node with two NVIDIA GPUs dedicated to remote visualization, which allows accelerated remote rendering using VirtualGL.
The system is structured as a 'condo model', and access to resources is connected to hardware ownership as described in the section Policy and queues.
- Connecting from Off Campus
- Displaying graphics : X2go and X11 forwarding
- Transferring files
- Home File System
- Scratch Space
- Work File System
- Policy and queues
- Managing Jobs
- Submitting a batch job : qsub
- Interactive sessions : qrsh
- Monitor your jobs : qstat
- qstat typical use cases
- How to check my jobs?
- How to see all the jobs in the queueing system?
- How to see the general status of the queues to see if there are slots available?
- How can I get a list with the status of the nodes associated with my queue?
- How can I see who is running in my queue?
- How can I see who is running on my node?
- How to get additional information on memory and cpu usage of my job?
- How to get additional information on a node?
- Deleting your jobs : qdel
- Modify a pending job : qalter
- Suspending jobs : qmod
- Tips on how to run some popular applications