Research HPC Cluster (Aoraki)

Shared computing resources available to Otago researchers include high-performance computing, fast storage, GPUs, and virtual servers.

Otago Resources

The RTIS research cluster provides researchers with access to shared resources such as CPUs, GPUs, and high-speed storage. Also available are specialised software and libraries optimised for scientific and data-science computing.

If you need special software or configurations, please ask the RTIS team at rtis.support@otago.ac.nz.

Cluster Overview

Figure 1: Photo of the cluster

We offer a variety of SLURM partitions for different resource needs. The default partition (aoraki) provides balanced compute and memory capability; additional partitions are optimised for GPU workloads or expanded memory capacity. On every cluster node, 2 cores are reserved for the OS and WEKA storage, reducing the available compute cores by 2.
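For example, a minimal batch script targeting the default partition might look like the following sketch (the job name, resource figures, and payload command are illustrative only, not recommendations):

```bash
#!/bin/bash
#SBATCH --job-name=example      # illustrative name
#SBATCH --partition=aoraki      # default partition; omitting this line has the same effect
#SBATCH --cpus-per-task=4       # small illustrative request
#SBATCH --mem=16G
#SBATCH --time=0-01:00:00       # 1 hour, well under the 7-day partition limit

srun hostname                   # replace with your actual workload
```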

Individual Job Limitations

This table lists the default maximum resources that can be requested for individual jobs (a worked submission example follows the term definitions below). If your work requires alterations to these limits, please contact rtis.support@otago.ac.nz to discuss how they can be accommodated.

| Partition            | Time Limit (days) | MaxRunningJobs | MaxCPU | MaxMem | MaxGPUs | NumNodes | NodeList                            |
|----------------------|-------------------|----------------|--------|--------|---------|----------|-------------------------------------|
| aoraki*              | 7                 | 100            | 126    | 1000G  | -       | 27       | aoraki[01-09,14-15,17,20-26,34-41]  |
| aoraki_bigcpu        | 14                | 50             | 252    | 1500G  | -       | 10       | aoraki[15,20-23,34-38]              |
| aoraki_bigmem        | 14                | 10             | 126    | 2000G  | -       | 5        | aoraki[14,17,24-26]                 |
| aoraki_fastcore      | 14                | 50             | 94     | 1500G  | -       | 5        | aoraki[39-43]                       |
| aoraki_long          | 30                | 25             | 252    | 2000G  | -       | 10       | aoraki[20-26,34-36]                 |
| aoraki_short         | 1                 | 250            | 32     | 256G   | -       | 3        | aoraki[11,12,16]                    |
| aoraki_small         | 7                 | 30             | 8      | 32G    | -       | 7        | aoraki[18,19,27,28,31-33]           |
| aoraki_gpu           | 7                 | 2              | 16     | 150G   | 2       | 10       | aoraki[11,12,16,18,19,27,28,31-33]  |
| aoraki_gpu_h100      | 7                 | 2              | 16     | 150G   | 2       | 2        | aoraki[16,30]                       |
| aoraki_gpu_l40       | 7                 | 2              | 16     | 150G   | 2       | 5        | aoraki[18,19,31-33]                 |
| aoraki_gpu_a100_80gb | 7                 | 2              | 16     | 150G   | 2       | 2        | aoraki[11,12]                       |
| aoraki_gpu_a100_40gb | 7                 | 2              | 16     | 150G   | 2       | 2        | aoraki[27,28]                       |
| aoraki_gpu_l4_24gb   | 7                 | 2              | 8      | 60G    | 2       | 1        | aoraki[29]                          |
| aoraki_gpu_rtx3090   | 7                 | 2              | 8      | 60G    | 2       | 4        | aoraki-g[01,02,04,05]               |
  • Partition: Name of the partition; an asterisk (*) denotes the default partition. aoraki_small and aoraki_short are specialised partitions that use typically idle CPU cores on GPU nodes, designed to handle small or short-duration jobs efficiently.
  • Time Limit: Maximum time a job can run in that partition. The time limit for a running job can be extended on request; in such cases, the extended limit may exceed the partition's standard wall time.
  • MaxRunningJobs: The maximum number of simultaneously running jobs; subsequent jobs wait in the queue.
  • MaxCPU: The maximum number of CPU cores that can be requested on a node.
  • MaxMem: The maximum amount of memory (in GB) that can be requested on each node in the partition.
  • MaxGPUs: The maximum number of GPUs that can be requested on a node for a job.
  • NumNodes: The number of nodes in the partition.
  • NodeList: The specific nodes allocated to the partition.
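As a concrete illustration of these limits, the sketch below requests one GPU on the aoraki_gpu_l40 partition while staying within its MaxCPU, MaxMem, and wall-time limits (the resource figures and payload command are illustrative):

```bash
#!/bin/bash
#SBATCH --partition=aoraki_gpu_l40   # L40 GPU partition from the table above
#SBATCH --gres=gpu:1                 # within the MaxGPUs=2 per-job limit
#SBATCH --cpus-per-task=8            # within the MaxCPU=16 limit
#SBATCH --mem=64G                    # within the MaxMem=150G limit
#SBATCH --time=2-00:00:00            # 2 days, within the 7-day wall time

srun nvidia-smi                      # replace with your actual GPU workload
```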

Additional limits

  • Maximum of 5000 submitted jobs per user (OnDemand jobs are not counted in this limit).

  • Jobs requesting GPUs or running through OnDemand are limited to a single node.

  • OnDemand is limited to 10 running jobs per user.

  • Users are limited to 2 simultaneously running GPU jobs per GPU partition; any additional GPU jobs remain queued until resources become available.
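You can verify the current partition limits yourself with standard SLURM commands from a login node, for example:

```bash
# One line per partition: name, wall-time limit, node count, node list
sinfo -o "%P %l %D %N"

# Full configuration of a single partition, including MaxTime and MaxNodes
scontrol show partition aoraki_gpu
```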

Individual Node Specifications

Within the cluster there are different hardware configurations to accommodate a wide range of use cases. Some jobs require specific hardware or may benefit from running on a particular node type; see the example script after the table for one way to target a specific node.

| Number | Node type              | CPU                                        | RAM                  | Extra                                                                 |
|--------|------------------------|--------------------------------------------|----------------------|-----------------------------------------------------------------------|
| 1      | aoraki-login           | 2x 64-core AMD EPYC 7763                   | 1TB DDR4 3200 MT/s   | CPU 2.4GHz                                                            |
| 9      | aoraki[01-09]          | 2x 64-core AMD EPYC 7763                   | 1TB DDR4 3200 MT/s   | CPU 2.4GHz                                                            |
| 2      | aoraki[11,12]          | 2x 64-core AMD EPYC 7763                   | 1TB DDR4 3200 MT/s   | 2x A100 80GB PCIe GPU per node, CUDA 12.5, NVLink 20.55GB/s           |
| 2      | aoraki[27,28]          | 2x 32-core AMD EPYC 7543                   | 1TB DDR4 3200 MT/s   | 2x A100 40GB PCIe GPU per node, CUDA 12.5, NVLink 16.21GB/s           |
| 5      | aoraki[14,17,24-26]    | 2x 64-core AMD EPYC 7763                   | 2TB DDR4 2933 MT/s   | CPU 2.4GHz                                                            |
| 10     | aoraki[15,20-23,34-38] | 2x 128-core AMD EPYC 9754                  | 1.5TB DDR5 4800 MT/s | CPU 2.2GHz                                                            |
| 5      | aoraki[39-43]          | 2x 48-core AMD EPYC 9474F                  | 1.5TB DDR5 4800 MT/s | CPU 3.6GHz                                                            |
| 1      | aoraki16               | 2x 56-core Intel Xeon 8480+                | 1TB DDR5 4800 MT/s   | 4x H100 80GB HBM3 GPU per node, CUDA 12.4, NVLink 121.29GB/s          |
| 1      | aoraki30               | 2x 32-core Intel Xeon 8562Y+               | 1TB DDR5 4800 MT/s   | 4x H100 96GB NVL GPU per node, CUDA 12.5, NVLink 237.16GB/s           |
| 2      | aoraki[18,19]          | 2x 32-core AMD EPYC 7543                   | 1TB DDR4 3200 MT/s   | 3x L40 48GB PCIe GPU per node, CUDA 12.5, NVLink 24.37GB/s            |
| 3      | aoraki[31-33]          | 1x 64-core AMD EPYC 9554P                  | 768GB DDR5 4800 MT/s | 3x L40S 48GB PCIe GPU per node, CUDA 12.5, NVLink 24.48GB/s           |
| 1      | aoraki29               | 2x 32-core Intel Xeon 8562Y+               | 1TB DDR5 4800 MT/s   | 7x L4 24GB GPU per node, CUDA 12.5, NVLink 21.05GB/s                  |
| 2      | standalone             | 32-core AMD Ryzen Threadripper PRO 3975WX  | 128GB DDR4 3200 MT/s | 1x RTX 3090 24GB, CUDA 12.5                                           |
| 3      | standalone             | 16-core AMD Ryzen 9 5950X                  | 64GB DDR4 3200 MT/s  | 1x RTX 3090 24GB, CUDA 12.5                                           |
| 2      | standalone             | 2x 6-core Intel Xeon E5-2620 v3            | 256GB DDR4 3200 MT/s | 2x RTX A6000 48GB GPU per node                                        |
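As a sketch of targeting specific hardware, the script below uses the aoraki_gpu_h100 partition and optionally pins the job to one of its nodes from the table above (the resource figures and payload command are illustrative; pinning a specific node is rarely necessary and may increase queue time):

```bash
#!/bin/bash
#SBATCH --partition=aoraki_gpu_h100   # partition containing the H100 nodes
#SBATCH --nodelist=aoraki16           # optional: pin the 4x H100 80GB HBM3 node
#SBATCH --gres=gpu:1
#SBATCH --mem=100G
#SBATCH --time=1-00:00:00

srun nvidia-smi                       # replace with your actual GPU workload
```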