Job Resource Planning
The Slurm resource scheduler is in charge of allocating resources on the HPC cluster for your HPC jobs. It does not actually use the resources (that is the responsibility of your job); rather, it reserves them.
Things to consider#
- How long your job needs to run - Choose this value carefully. If you underestimate this value, the scheduler will kill your job before it completes. If you overestimate this value too much, your job may wait in-queue for longer than necessary. Generally, it is better to overestimate than underestimate.
- How many compute cores and nodes your job will need - This is the level of parallelization. Most jobs that run on the HPC take advantage of multiple processors/cores. You will need to instruct the Slurm scheduler regarding how many cores your job will need, and how those cores should be distributed among physical processors and compute nodes.
- How much memory your job will need - By default, the scheduler allocates 3.9GB of RAM for each CPU you allocate1. This value is enough for most jobs, but some jobs are more memory-intensive than CPU-intensive (e.g., loading large datasets into memory). In these cases, you need to explicitly instruct Slurm to allocate extra memory.
- If your job needs access to special hardware or features - Your job may need access to one or more GPU nodes or a specific-model CPU. You can specify these as constraints when you submit your job.
Nodes, processors, and cores#
Nodes are physical servers in the HPC cluster. Each node contains multiple CPUs, and each CPU contains multiple cores. Your job can request resources by adding parameters to your submit script. If your submit script does not contain any specific instructions for allocating CPU resources, Slurm will allocate a single CPU on a single node.
To request more than one CPU, use the `-n` (or `--ntasks`) parameter in your submit script:
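A minimal sketch of such a submit script, with the program name as a placeholder:

```bash
#!/bin/bash
#SBATCH -n 8          # request eight CPUs (tasks); equivalent to --ntasks=8

srun ./my_program     # placeholder program
```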
The above example is a script that requests eight CPUs. There is no guarantee that these CPUs will be allocated on a single node. Instead, Slurm will find eight processors somewhere in the cluster as efficiently as possible and allocate them.
If you need all eight processors to be on a single node, you should use the `-N` (or `--nodes`) parameter in your submit script:
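A sketch of the same request pinned to one node, again with a placeholder program:

```bash
#!/bin/bash
#SBATCH -n 8          # eight CPUs...
#SBATCH -N 1          # ...all allocated on a single node

srun ./my_program     # placeholder program
```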
Note that this won't guarantee that your job has exclusive access to the node while it is running. Other jobs may be running on that node concurrently while your job is running. Rather, it instructs Slurm that all the resources you request should be allocated on a single node.
Memory resource planning#
By default, Slurm automatically allocates a fixed amount of memory (or RAM) for each processor:
- 3.9GB per processor in most Slurm Accounts
- 1.9GB per processor in the `genacc_q`, `condor`, and `quicktest` Slurm Accounts
If your job needs more memory, the recommended way to ensure this is to instruct Slurm to increase the number of cores for your job:
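For example, a sketch that requests 16 CPUs to obtain roughly 62GB of memory (program name is a placeholder):

```bash
#!/bin/bash
#SBATCH -n 16         # 16 CPUs x 3.9GB per CPU = 62.4GB of RAM

srun ./my_program     # placeholder program
```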
Since we asked for 16 processors in the above example, our job will be allocated 16 × 3.9GB = 62.4GB of RAM.
Alternatively, if your job is memory-intensive but does not involve heavy parallel processing (i.e., you do not need many CPU resources), you can use the `--mem-per-cpu` parameter to ask for a specific amount of memory per CPU. For example:
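A sketch assuming eight CPUs on a single node with 8GB of memory each (program name is a placeholder):

```bash
#!/bin/bash
#SBATCH -N 1                 # a single node
#SBATCH -n 8                 # eight CPUs
#SBATCH --mem-per-cpu=8G     # 8GB per CPU -> 8 x 8GB = 64GB total

srun ./my_program            # placeholder program
```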
In this example, our job will be allocated 8 CPUs × 8GB per CPU = 64GB of RAM on a single node.
Tip
Different units can be specified by using the suffixes:

- `K` = kilobytes
- `M` = megabytes
- `G` = gigabytes
- `T` = terabytes
Warning
If you do not specify a suffix, Slurm defaults to using `M` (megabytes).
The `--mem` parameter specifies the amount of memory your job needs per node. So, if you request multiple nodes, Slurm will allocate the amount of memory you request on each node:
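A sketch assuming two nodes with 64GB requested on each (program name is a placeholder):

```bash
#!/bin/bash
#SBATCH -N 2             # two nodes
#SBATCH --mem=64G        # 64GB per node -> 2 x 64GB = 128GB total

srun ./my_program        # placeholder program
```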
In the above example, your job will be allocated a total of 2 nodes × 64GB = 128GB of RAM. Note that because you did not specify the `-n` parameter, your job will be allocated a single processor on each node.
If you request more memory per node than is available in the Slurm Account you are submitting to, Slurm will inform you that your job cannot be scheduled.
Why 3.9GB per processor and not 4GB?#
Each node in the cluster requires a small amount of memory for operating system overhead. By allocating 3.9GB per processor instead of 4GB, we leave room for that overhead while giving compute jobs access to as much memory as possible, with as little wasted as possible.
If compute jobs are given access to all the memory on a node, they can run the node out of memory and cause the node to crash. This happens occasionally.
Recommended processor and memory parameters#
When it comes to Slurm submit scripts, less is more. In other words, the fewer memory and processor constraints you specify, the sooner your job is likely to run. This is especially true for our general access resources.
Try to avoid using the `--mem` and `--mem-per-cpu` options. If your job needs more memory, we recommend instead using `-n` to request a higher number of CPUs. For example, suppose our job needs at least 60GB of RAM. We can accomplish this by requesting 16 CPUs (16 CPUs × 3.9GB per CPU = 62.4GB):
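A sketch of that request, with the program name as a placeholder:

```bash
#!/bin/bash
#SBATCH -n 16         # 16 CPUs x 3.9GB per CPU = 62.4GB of RAM, covering the 60GB requirement

srun ./my_program     # placeholder program
```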
One exception to this rule-of-thumb is when the amount of memory you need exceeds the number of cores × 3.9GB. In this case, the `--mem-per-cpu` option is preferred over `--mem`. For example, suppose our job needs 64 CPUs but 500GB of RAM. Since 64 CPUs × 3.9GB per CPU = 249.6GB of RAM, we need to increase the amount of memory assigned to each CPU:
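A sketch assuming 8GB per CPU, which brings the total to 512GB (program name is a placeholder):

```bash
#!/bin/bash
#SBATCH -n 64                # 64 CPUs
#SBATCH --mem-per-cpu=8G     # 64 x 8GB = 512GB, covering the 500GB requirement

srun ./my_program            # placeholder program
```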
Note
For Open OnDemand interactive jobs, the `--mem` option is preferred over `--mem-per-cpu`. You will notice that the `--mem-per-cpu` option does not appear in any Interactive Apps forms.
Time resource planning#
The `-t` (or `--time`) parameter allows you to specify exactly how long Slurm allocates resources for your job.
Each Slurm Account (queue) in the cluster is configured with a default and a maximum time limit for jobs. When you submit your job, you can specify exactly how long you expect it to run. Jobs that request less time tend to have shorter wait times in-queue than jobs that request longer times.
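For example, a sketch requesting one day and twelve hours of run time (the value and program name are illustrative; the format is days-hours:minutes:seconds):

```bash
#!/bin/bash
#SBATCH -t 1-12:00:00     # request 1 day and 12 hours of wall time

srun ./my_program         # placeholder program
```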
Warning
Slurm will kill your job when your time limit is reached regardless of whether the job has completed or not.
Choosing a Slurm account#
You can see which Slurm accounts you have access to by running the following command on the terminal:
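A sketch using standard Slurm tooling; your site may provide its own wrapper command for this:

```bash
# List the Slurm accounts your username is associated with
sacctmgr show associations user=$USER format=Account,User
```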
Each Slurm Account specifies processor, memory, and time limits. Owner-based accounts get priority access for research groups that have purchased resources on the cluster. All users can use general access accounts.
After your job ends#
Details about completed jobs (sacct)#
If your job has completed within the last year, regardless of whether it succeeded or failed, you can extract useful information about it using the `sacct` command:
| Option | Explanation |
|---|---|
| `-b`, `--brief` | Print only Job ID, status, and exit code |
| `-e` | List all available output fields to use with the `--format` option |
| `-j [job_id][.step]` | Display information about a specific job, or job step |
| `-u [username]` | Display information about a specific user |
| `-A [slurm_account_name]` | Display information for a specific Slurm Account (queue) |
| `--format` | Customize the fields displayed (use `sacct -e` to list available fields) |
Example:
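A sketch of two typical invocations (the job ID is a placeholder):

```bash
# Print only Job ID, status, and exit code for a specific job
sacct -b -j 1234567

# Show selected fields for your own recent jobs
sacct -u $USER --format=JobID,JobName,Account,Elapsed,State,ExitCode
```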
- Use `sacct -e` to show all possible display fields for the `--format` option
Tip
You can also use the environment variable `SACCT_FORMAT` to set the default format for the `sacct` command; for example, by adding a line to your `~/.bashrc` file:
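A sketch of such a line, with an assumed field list:

```bash
# In ~/.bashrc: set a default field list for sacct
export SACCT_FORMAT="JobID,JobName,Account,AllocCPUS,Elapsed,State,ExitCode"
```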
Tip
Maximum memory used is reported by the `MaxRSS` format field; for example:
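A sketch, with a placeholder job ID:

```bash
# Report peak memory use (MaxRSS) for a completed job
sacct -j 1234567 --format=JobID,MaxRSS,Elapsed,State
```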
For more information about `sacct`, refer to the official Slurm documentation.
Efficiency statistics for completed jobs (seff)#
If your job has completed within the last year, regardless of whether it succeeded or failed, you can use the `seff` command to see how efficient your resource usage was. This can help you tune your submit script to maximize resource use and minimize wait times.
Example:
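A sketch, with a placeholder job ID; the output includes CPU and memory efficiency figures for the job:

```bash
# Summarize how efficiently a completed job used its allocated CPUs and memory
seff 1234567
```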
Less common examples#
Reserving entire nodes for your job#
If you need to ensure that your job is the only job running on a node, Slurm provides the `--exclusive` parameter:
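A sketch combining `--exclusive` with a placeholder CPU request and program:

```bash
#!/bin/bash
#SBATCH -n 8              # placeholder CPU request
#SBATCH --exclusive       # do not share the allocated node(s) with other jobs

srun ./my_program         # placeholder program
```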
Warning
Requesting exclusive access for your jobs will likely increase the amount of time that your job will wait in-queue since it takes longer for entire nodes to become available.
Taking control over multi-core processors#
All nodes in the HPC have multi-core processors. The number of cores per physical processor varies across the cluster: the lowest (on older nodes) is 2 cores per processor, and some have as many as 64 cores. You can instruct Slurm to allocate processors with a minimum number of cores if your job will benefit:
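One way to express this constraint, sketched here with Slurm's `--cores-per-socket` option (the CPU count and program name are placeholders):

```bash
#!/bin/bash
#SBATCH -n 8                      # placeholder CPU request
#SBATCH --cores-per-socket=8      # only use nodes with at least 8 cores per processor

srun ./my_program                 # placeholder program
```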
The above code will instruct Slurm to select only nodes with a minimum of 8 cores per processor for your job.
Additionally, you can change the meaning of `-n` to mean cores (instead of entire processors) by using the `--ntasks-per-core` parameter. For example, if you wish to have eight parallel processes run on a single CPU, you can do the following:
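A sketch of one such combination, assuming an additional `--cores-per-socket` constraint so that all eight cores can come from one processor (program name is a placeholder):

```bash
#!/bin/bash
#SBATCH -n 8                      # eight parallel tasks
#SBATCH --ntasks-per-core=1       # one task per core, so -n now counts cores
#SBATCH --cores-per-socket=8      # (assumption) ensures a processor with at least 8 cores

srun ./my_program                 # placeholder program
```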
In the above example, Slurm will allocate a single 8-core processor for your job.
There are many other ways to fine-tune your job submission scripts; refer to the Slurm documentation for a complete reference.
Further reading#
1. Exceptions to this are the `genacc_q`, `condor`, and `quicktest` queues, which allocate 1.9GB of RAM per core by default. This can be overridden by using the `--mem-per-cpu` Slurm parameter. Our self-service portal includes a list of all queues.