Job Resource Planning
The Slurm resource scheduler is in charge of allocating resources on the HPC cluster for your HPC jobs. It does not actually use the resources (that is the responsibility of your job); rather, it just reserves them.
Things to consider#
- How long your job needs to run - Choose this value carefully. If you underestimate this value, the scheduler will kill your job before it completes. If you overestimate this value too much, your job may wait in-queue for longer than necessary. Generally, it is better to overestimate than underestimate.
- How many compute cores and nodes your job will need - This is the level of parallelization. Most jobs that run on the HPC take advantage of multiple processors/cores. You will need to instruct the Slurm scheduler regarding how many cores your job will need, and how those cores should be distributed over physical processors and compute nodes.
- How much memory your job will need - By default, the scheduler allocates 3.9GB of RAM for each CPU you allocate.¹ This value is enough for most jobs, but sometimes jobs are more memory-intensive than CPU-intensive (e.g., loading large datasets into memory). In these cases, you need to explicitly instruct Slurm to allocate extra memory.
- If your job needs access to special hardware or features - Your job may need access to one or more GPU nodes or a specific-model CPU. You can specify these as constraints when you submit your job.
Nodes, processors, and cores#
Nodes are physical servers in the HPC cluster. Each node contains multiple CPUs, and each CPU contains multiple cores. Your job can request resources by adding parameters to your submit script. If your submit script does not contain any specific instructions for allocating CPU resources, Slurm will allocate a single CPU on a single node.
To request more than one CPU, use the `--ntasks` parameter in your submit script:
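A minimal submit script along these lines might look like the following (the job name and program are placeholders; your site may require additional directives such as an account or partition):

```shell
#!/bin/bash
#SBATCH --job-name="my_parallel_job"   # placeholder job name
#SBATCH --ntasks=8                     # request 8 CPUs (tasks)

srun my_program                        # placeholder: your parallel program
```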
The above example is a script that requests eight CPUs. There is no guarantee that these CPUs will be allocated on a single node. Instead, Slurm will find eight processors somewhere in the cluster as efficiently as possible and allocate them.
If you need all eight processors to be on a single node, you should also use the `--nodes` parameter in your submit script:
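A sketch of such a script (program name is a placeholder):

```shell
#!/bin/bash
#SBATCH --ntasks=8     # request 8 CPUs...
#SBATCH --nodes=1      # ...all allocated on a single node

srun my_program
```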
Note that this won't guarantee that your job has exclusive access to the node while it is running. Other jobs may be running on that node concurrently while your job is running. Rather, it instructs Slurm that all the resources you request should be allocated on a single node.
Reserving entire nodes for your job#
If you need to ensure that your job is the only job running on a node, Slurm provides the `--exclusive` parameter:
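For example (program name is a placeholder):

```shell
#!/bin/bash
#SBATCH --ntasks=8
#SBATCH --nodes=1
#SBATCH --exclusive    # no other jobs may run on the allocated node

srun my_program
```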
Requesting exclusive access will greatly increase the amount of time that your job waits in-queue, since it takes longer for entire nodes to become available.
Taking control over multi-core processors#
All nodes in the HPC have multi-core processors. The number of cores per physical processor varies across the cluster: the oldest nodes have as few as 2, and some have as many as 40. You can instruct Slurm to allocate processors with a minimum number of cores if your job will benefit:
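One way to express this is Slurm's `--cores-per-socket` parameter, which restricts scheduling to nodes whose processors have at least the given number of cores; a sketch:

```shell
#!/bin/bash
#SBATCH --ntasks=8
#SBATCH --cores-per-socket=8   # only use nodes with >= 8 cores per processor

srun my_program
```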
The above code will instruct Slurm to select only nodes with a minimum of 8 cores per processor for your job.
Additionally, you can change the meaning of `-n` to mean cores (instead of entire processors) by adding a task-placement parameter. For example, if you wish to have eight parallel processes run on a single CPU, you can do the following:
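One way to achieve this packing (assuming the intended mechanism is Slurm's per-socket task limit, `--ntasks-per-socket`) is:

```shell
#!/bin/bash
#SBATCH --ntasks=8             # 8 parallel processes...
#SBATCH --ntasks-per-socket=8  # ...all placed on one physical processor

srun my_program
```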
In the above example, Slurm allocates a single 8-core processor for your job.
There are many other ways to fine-tune your job submission scripts; refer to the Slurm documentation for a complete reference.
Memory resource planning#
By default, Slurm automatically allocates a fixed amount of memory (RAM) for each processor:

- 3.9GB per processor in most Slurm Accounts
- 1.9GB per processor in the backfill and backfill2 queues
If your job needs more memory, one way to ensure this is to simply instruct Slurm to request more than one processor:
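For example (program name is a placeholder):

```shell
#!/bin/bash
#SBATCH --ntasks=16   # 16 processors x 3.9GB = 62.4GB of RAM

srun my_program
```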
Since you asked for 16 processors in the above example, your job will be allocated 16 × 3.9GB = 62.4GB of RAM.
Alternatively, if your job is memory-intensive but does not involve heavy parallel processing (i.e., you do not need many CPU resources), you can use the `--mem` parameter to ask for a specific amount of memory per node. For example:
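A sketch of such a script (program name is a placeholder):

```shell
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --mem=64G     # request 64GB of RAM on the node

srun my_program
```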
Different units can be specified by using the suffixes `K`, `M`, `G`, or `T`. If you do not specify a suffix, Slurm defaults to megabytes (`M`).
The `--mem` parameter specifies the amount of memory your job needs per node. So, if you request multiple nodes, Slurm will allocate the requested amount of memory on each node:
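For example (program name is a placeholder):

```shell
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --mem=64G     # 64GB per node, on each of 2 nodes

srun my_program
```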
In the above example, your job will be allocated a total of 2 nodes × 64GB = 128GB of RAM. Note that because you did not specify the `-n` parameter, your job will be allocated a single processor on each node.
If you request more memory per node than is available in the Slurm Account you are submitting to, Slurm will inform you that your job cannot be scheduled.
Why 3.9GB per processor and not 4GB?#
Each node in the cluster requires a small amount of memory for operating system overhead. By allocating 3.9GB per processor instead of 4GB, we ensure that compute jobs have access to as much memory as possible while wasting as little as possible.
If compute jobs were given access to all the memory on a node, they could run the node out of memory and cause it to crash. This happens occasionally.
Time resource planning#
The `--time` parameter allows you to specify exactly how long Slurm allocates resources for your job:
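For example (Slurm accepts several time formats, including `minutes`, `hours:minutes:seconds`, and `days-hours`; the program name is a placeholder):

```shell
#!/bin/bash
#SBATCH --ntasks=1
#SBATCH --time=04:30:00   # hours:minutes:seconds (here, 4.5 hours)

srun my_program
```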
Each Slurm Account (queue) in the cluster is configured with a default and a maximum time limit for jobs. When you submit your job, you can specify exactly how long you expect it to run. Jobs that request less time tend to have shorter wait times in-queue than jobs that request longer times.
Slurm will kill your job when your time limit is reached regardless of whether the job has completed or not.
Choosing a Slurm account#
You can see which Slurm accounts you have access to by running the following command in the terminal:
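If your site does not provide its own wrapper tool, one generic way to list your account associations (an assumption, not necessarily the exact command your cluster documents) is Slurm's `sacctmgr` command:

```shell
# List the accounts, partitions, and QOS levels associated with your user
sacctmgr show associations user=$USER format=Account,Partition,QOS
```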
Each Slurm Account specifies processor, memory, and time limits. Owner-based accounts provide priority access for research groups that have purchased resources on the cluster; general access accounts are available to all users.
After your job ends#
Details about completed jobs (sacct)#
If your job has completed within the last year, regardless of whether it succeeded or failed, you can extract useful information about it using the `sacct` command:
| Command | Description |
|---------|-------------|
| `sacct -b` | Print only Job ID, status, and exit code |
| `sacct -e` | List all available output fields to use with `--format` |
| `sacct -j <job_id>` | Display information about a specific job, or job step |
| `sacct -u <username>` | Display information about a specific user |
| `sacct -A <account>` | Display information for a specific Slurm Account (queue) |
| `sacct --format=<fields>` | Customize the fields displayed (use `sacct -e` to show all possible display fields for the `--format` option) |
You can also use the environment variable `SACCT_FORMAT` to set the default format for the `sacct` command. Example:
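A sketch (the field list shown is one reasonable choice, not a required set):

```shell
# Set a default field list for sacct; subsequent sacct runs use these fields
export SACCT_FORMAT="JobID,JobName,Partition,State,ExitCode,Elapsed,MaxRSS"
sacct
```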
Maximum memory used is reported in the `MaxRSS` format field.
For more information about `sacct`, refer to the official Slurm documentation.
Efficiency statistics for completed jobs (seff)#
If your job has completed within the last year, regardless of whether it succeeded or failed, you can use the `seff` command to see how efficient your resource usage was. This will help you better tune your submit script to maximize resource usage and minimize wait times.
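For example, pass `seff` a completed job's ID (the ID below is a placeholder); it reports CPU and memory efficiency for the job:

```shell
seff 123456   # replace 123456 with your completed job's ID
```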
¹ An exception to this is the backfill and backfill2 queues, which allocate 1.9GB of RAM per core by default. This can be overridden by using the `--mem-per-cpu` Slurm parameter. ↩