Resources
Our cluster provides the following hardware resources.
CPUs
If you do not load any hardware modules, your job will run on the first available node, which may have either AMD or Intel CPUs. If you need a specific CPU type, load the corresponding module, e.g. `module load amd`.
GPUs
Several NVIDIA GPU types are available: H200, H100, A100 and L4. The H200, H100 and A100 nodes have NVLink, a high-bandwidth interconnect between GPUs, which makes them suitable for multi-GPU workloads. L4 GPUs are meant for single-GPU jobs only. If your job requires a lot of GPU memory, you can request a GPU flavor by VRAM, for example `module load gpumem80gb`.
For multi-GPU and multi-node multi-GPU jobs, request A100, H100 or H200 nodes specifically. Alternatively, load the `multigpu` module: `module load multigpu`.
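The hardware selection step for a multi-GPU job might be sketched as below. The module name `multigpu` comes from the text above; the `load_hw` wrapper is a hypothetical helper, added only so the same script can be dry-run on a machine without the `module` command. Where exactly the module should be loaded relative to job submission is not stated here, so treat the placement as an assumption.

```shell
#!/usr/bin/env bash
# Sketch only: `load_hw` is a hypothetical helper, not part of the cluster tooling.

load_hw() {
    # On the cluster, load the requested hardware module.
    # Elsewhere (no `module` command), print what would have been loaded.
    if command -v module >/dev/null 2>&1; then
        module load "$1"
    else
        echo "would run: module load $1"
    fi
}

# Multi-GPU job: restrict it to the NVLink nodes (A100, H100 or H200).
load_hw multigpu

# For a job that needs a large-memory GPU, the text's example would be:
# load_hw gpumem80gb
```

Swapping `multigpu` for `amd` gives the CPU-type selection described in the CPUs section.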
Hardware
| CPUs per Node | CPU Mem per Node (GB) | GPU Type | GPU Mem (GB) | GPUs per Node | Chipset | Total Nodes |
|---|---|---|---|---|---|---|
| 8 | 30 | | | | AMD | 10 |
| 30 | 116 | | | | AMD | 30 |
| 120 | 472 | | | | AMD | 9 |
| 92 | 708 | | | | AMD | 12 |
| 256 | 4030 | | | | Intel | 2 |
| 48 | 1510 | A100 | 80 | 8 | AMD | 5 |
| 192 | 1510 | H100 | 80 | 8 | Intel | 2 |
| 288 | 1510 | H100 | 96 | 2 | AMD | 4 |
| 288 | 2014 | H200 | 140 | 8 | AMD | 1 |
| 15 | 58 | L4 | 24 | 1 | AMD | 15 |
| 80 | 754 | V100 | 32 | 8 | Intel | 6 |
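As a reading aid for the table, the following sketch lists which GPU flavors offer at least a given amount of memory per GPU. The data is copied from the GPU rows above; the `flavors_with_vram` function is a hypothetical helper and says nothing about which nodes a module such as `gpumem80gb` actually selects.

```shell
#!/usr/bin/env bash
# GPU rows of the hardware table, as "type:vram_gb:gpus_per_node".
gpu_flavors="A100:80:8
H100:80:8
H100:96:2
H200:140:8
L4:24:1
V100:32:8"

# Hypothetical helper: print every GPU flavor whose memory per GPU (GB)
# is at least the value given as the first argument.
flavors_with_vram() {
    echo "$gpu_flavors" | awk -F: -v min="$1" \
        '$2 >= min { print $1 " (" $2 " GB, " $3 " per node)" }'
}

# Which flavors have at least 80 GB per GPU?
flavors_with_vram 80
```

Calling `flavors_with_vram 100` narrows the list to the H200 nodes, the only flavor in the table above 96 GB.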