Gordon Bell runs 2026¶
Info
Gordon Bell runs will take place on Clariden from Tuesday, April 7, 2026 to Monday, April 13, 2026.
During this period, Clariden will be temporarily expanded to 2300 GH200 nodes, Daint will operate with reduced compute capacity, and Santis will be unavailable.
All times on this page are local CSCS time (Europe/Zurich, CEST).
Warning
During the daily reserved window, Clariden will be dedicated to Gordon Bell teams and unavailable for regular user jobs.
Regular user jobs can still be submitted at any time and will be scheduled automatically during the open-access windows.
Clariden¶
Connecting¶
Connecting to Clariden via SSH works the same way as for Daint and Santis; see the SSH guide for more information.
Add the following to your SSH configuration to connect directly to Clariden with `ssh clariden`:
```
Host clariden
    HostName clariden.alps.cscs.ch
    ProxyJump ela
    # change cscsusername to your CSCS username
    User cscsusername
    IdentityFile ~/.ssh/cscs-key
    IdentitiesOnly yes
```
Reservations¶
During the Gordon Bell period, Clariden will operate with a different schedule than usual.
| Date | Time | Status |
|---|---|---|
| April 7, 2026 | 07:00 to approximately 12:00 | Cluster reconfiguration to the Gordon Bell layout |
| April 7-12, 2026 | 09:00-19:00 daily | Reserved for Gordon Bell teams |
| April 7-12, 2026 | 19:00-09:00 daily | Open access for regular users |
| April 13, 2026 | until 07:00 | Final overnight open-access period |
| April 13, 2026 | from 07:00 | Cluster restored to its standard configuration |
On April 7, the cluster reconfiguration starts at 07:00 and is expected to complete by 12:00. The reservation starts once the reconfiguration is complete.
On April 13, the special Gordon Bell configuration ends at 07:00, when the cluster will be restored to its standard layout.
The temporary Gordon Bell layout provides 2300 nodes on Clariden. In practice, around 2050 nodes are expected to be available for production runs.
For regular users¶
Please plan your workloads with the following temporary changes in mind:
- Clariden is unavailable to regular user jobs during the reserved window 09:00-19:00.
- Jobs can still be submitted at any time and will be queued until an open-access window becomes available.
- The maximum job length on Clariden is temporarily reduced from 12 hours to 6 hours during this period.
- The Apertus training reservation (600 nodes) remains unchanged and continues during the overnight window.
- Over the weekend of April 11-12, 2026, Clariden is expected to remain accessible. However, an extension of the reserved period may be required depending on progress during the week; any change will be communicated as early as possible.
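As a sketch, a regular-user batch script during this period would cap its runtime at the temporary 6-hour limit; the job name, node count, project, and executable below are placeholders:

```shell
#!/bin/bash
#SBATCH --job-name=my-run          # placeholder job name
#SBATCH --time=06:00:00            # temporary maximum during the Gordon Bell period (normally 12 hours)
#SBATCH --nodes=4                  # illustrative node count
#SBATCH --account=<project>        # placeholder: your CSCS project

# The job can be submitted at any time; Slurm will start it only
# during an open-access window (19:00-09:00).
srun ./my_app                      # placeholder executable
```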
For Gordon Bell teams¶
- The reserved period on Clariden is intended for large-scale Gordon Bell runs.
- Reservation details, including any reservation names and operational instructions, will be communicated directly to the participating teams.
- The maximum reservation window is 10 hours per day.
- The CSCS Gordon Bell support team will be available to help teams prepare and execute successful runs.
Software environment¶
Clariden and Santis use the same software image during this period:
- USS 1.3.1
- NVIDIA driver 590
- No CPE software stack
Daint uses a different image:
- USS 1.1.0
- NVIDIA driver 550
- CPE software stack available
Warning
Clariden does not provide the CPE software stack. Gordon Bell teams should therefore prepare and validate their software environment on Clariden (or Santis), rather than on Daint. For most users, this means using a uenv, a container-based workflow, or a self-managed software stack.
Storage¶
The same shared filesystems are available across Daint, Clariden, and Santis:
- capstor, iopsstor, and VAST are mounted on Clariden during the Gordon Bell runs
- Home is shared between Daint, Clariden, and Santis
- Scratch spaces are shared between Daint, Clariden, and Santis
- Store/Project filesystems are mounted
For most large-scale run data, staging, and scratch-like workloads, /capstor/scratch/cscs/${USER} is the recommended choice.
Lustre striping¶
Uenvs¶
uenv images can be striped across multiple OSTs on the Lustre filesystem, which can significantly improve I/O performance for large files.
Striping is applied automatically to all uenv images in repositories that were created in the last few months or that have since been updated.
If your uenv images were created before this change, you can update your repository to apply striping to your existing images.
Disabling core-dumps¶
If a large job crashes and tries to write core-dump files from thousands of processes, it can overwhelm the filesystem. We therefore strongly recommend disabling core dumps.
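A minimal way to do this is with the shell's standard `ulimit` built-in, typically placed near the top of the batch script so that launched processes inherit the limit:

```shell
# Disable core-dump files for this shell and all child processes
ulimit -c 0
```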
MPI¶
MPI jobs on Clariden must be started with Slurm's Shasta MPI integration.
MPI may need longer than the default timeout to initialize in large-scale runs. As a precaution, we recommend increasing the timeout from 180 seconds to 300 seconds.
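A hedged sketch of the launch step in a batch script: `--mpi=cray_shasta` is Slurm's Shasta PMI plugin; the timeout variable name below is a placeholder, not a real setting, since the exact variable depends on the MPI stack in use:

```shell
# PMI_TIMEOUT_SECONDS is a placeholder name, not a real variable;
# use the timeout setting CSCS communicates for your MPI stack.
export PMI_TIMEOUT_SECONDS=300        # raise init timeout from 180 s to 300 s (placeholder)

srun --mpi=cray_shasta ./my_mpi_app   # Slurm's Shasta MPI integration; app name illustrative
```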
NCCL¶
See the container engine documentation for information on using NCCL in containers. The NCCL documentation contains general information on configuring NCCL. This is especially important when using uenvs, as the relevant environment variables are not set automatically. Because Clariden and Santis do not provide CPE, Gordon Bell teams are strongly encouraged to validate their NCCL and MPI configuration in the exact runtime environment they plan to use for production.
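As a sketch, two commonly used NCCL environment variables one might set explicitly when working from a uenv; the values shown are illustrative starting points to validate, not CSCS-prescribed settings:

```shell
# Illustrative NCCL settings; validate in your actual runtime environment.
export NCCL_DEBUG=INFO            # print NCCL's transport and plugin selection at init
export NCCL_NET_GDR_LEVEL=PHB    # allow GPUDirect RDMA when GPU and NIC share a PCIe host bridge
```

Running with `NCCL_DEBUG=INFO` once at small scale is a cheap way to confirm which network plugin NCCL actually picked before committing to a large run.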