Machine learning applications and frameworks

CSCS supports a wide range of machine learning (ML) applications and frameworks on its systems. Most ML workloads are containerized to ensure portability, reproducibility, and ease of use across systems.

Users can choose between running containers, using provided uenv software stacks, or building custom Python environments tailored to their needs.

First-time users should consult the LLM tutorials, a series of hands-on examples that introduce the concepts of the Machine Learning platform.

Containerization is the recommended approach for ML workloads on Alps, as it simplifies software management and maximizes compatibility with other systems.

Users are encouraged to build their own containers, starting from popular sources such as the NVIDIA NGC Catalog, which offers a variety of pre-built images optimized for HPC and ML workloads. Examples include:

Documented best practices are available for:

Extending a container with a virtual environment

For Python dependencies that change frequently during development, consider creating a virtual environment (venv) on top of the packages in the container (see this example).
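As a rough illustration of what "a venv on top of the packages in the container" means, the sketch below uses Python's standard `venv` module with `system_site_packages=True`, which is the programmatic equivalent of `python -m venv --system-site-packages`. The function name `create_layered_venv` and any target path are hypothetical; this is a minimal sketch, not the procedure from the linked CSCS example.

```python
import venv
from pathlib import Path


def create_layered_venv(target, with_pip=True):
    """Create a venv that can still import packages from the base interpreter.

    `target` is any writable directory path (hypothetical, chosen by the user).
    """
    builder = venv.EnvBuilder(
        system_site_packages=True,  # keep the base environment's packages importable
        with_pip=with_pip,          # bootstrap pip so extra packages can be installed
    )
    builder.create(target)
    return Path(target)
```

Calling `create_layered_venv("./my-venv")` inside the container and then activating the result should leave the container's packages importable, while anything installed with `pip` afterwards lands in the venv directory. The generated `pyvenv.cfg` file records the setting as `include-system-site-packages = true`.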

Helpful references:

Using provided uenv software stacks

Alternatively, CSCS provides pre-configured software stacks (uenvs) that can serve as a starting point for machine learning projects. These environments provide optimized compilers, libraries, and selected ML frameworks.

Available ML-related uenvs:

Extending a uenv with a virtual environment

To extend these environments with additional Python packages, we recommend creating a Python virtual environment (venv) layered on top of the packages in the uenv. See this PyTorch venv example for details.
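Once a venv is layered on top of a base environment, it can be useful to confirm where a given package is actually imported from, since a later `pip install` may shadow the version shipped with the uenv. The helper names below (`package_origin`, `in_virtualenv`) are hypothetical, and the sketch relies only on the Python standard library.

```python
import importlib.util
import sys


def package_origin(name):
    """Return the file a top-level package would be imported from.

    Returns None if the package is not importable (namespace packages
    also report None, because they have no single origin file).
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None
    return spec.origin


def in_virtualenv():
    """Return True when the interpreter runs inside a venv."""
    # Inside a venv, sys.prefix points at the venv and differs from sys.base_prefix.
    return sys.prefix != sys.base_prefix
```

For example, with the venv activated, `package_origin("torch")` returning a path under the venv directory would suggest that PyTorch was reinstalled there rather than picked up from the underlying software stack.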

Building custom Python environments

Users may also choose to build entirely custom software stacks using Python package managers such as uv or conda. Most ML libraries are available via the Python Package Index (PyPI).

Note

While many Python packages provide pre-built binaries for common architectures, some may require building from source.
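Whether a pre-built binary exists for a package depends largely on the CPU architecture, platform, and Python version of the target interpreter. The following sketch, using only the standard library, gathers those details so they can be compared against the wheels a project publishes; the function name `wheel_platform_summary` is hypothetical.

```python
import platform
import sysconfig


def wheel_platform_summary():
    """Collect interpreter details that influence which pre-built wheels can match."""
    return {
        "machine": platform.machine(),         # CPU architecture, e.g. x86_64 or aarch64
        "platform": sysconfig.get_platform(),  # platform identifier of this Python build
        "python": platform.python_version(),   # interpreter version string
    }
```

If no published wheel matches these values, `pip` will typically fall back to the source distribution, which is when compilers and libraries in the underlying environment become necessary.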

To ensure optimal performance on CSCS systems, we recommend starting from an environment that already includes:

  • CUDA, cuDNN
  • MPI, NCCL
  • C/C++ compilers

This can be achieved by starting from either a container image or a provided uenv, and extending it with a virtual environment.
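To get a quick impression of whether the current environment already provides the components listed above, a check along the following lines may help. The executable and library names (`nvcc`, `mpicc`, `gcc`, `g++`, `cudnn`, `nccl`) are common defaults and are assumptions here; an installation may use different names or non-standard locations, so a `False` entry is a hint to investigate rather than proof that the component is missing.

```python
import ctypes.util
import shutil

# Assumed names: adjust to match the toolchain in your environment.
EXECUTABLES = {
    "CUDA compiler": "nvcc",
    "MPI compiler wrapper": "mpicc",
    "C compiler": "gcc",
    "C++ compiler": "g++",
}
LIBRARIES = {
    "cuDNN": "cudnn",
    "NCCL": "nccl",
}


def check_toolchain():
    """Report which build tools and shared libraries can be located."""
    report = {}
    for label, exe in EXECUTABLES.items():
        # shutil.which searches the directories listed in PATH.
        report[label] = shutil.which(exe) is not None
    for label, lib in LIBRARIES.items():
        # find_library uses the platform's usual lookup mechanisms.
        report[label] = ctypes.util.find_library(lib) is not None
    return report
```

Running `check_toolchain()` inside a container or uenv session returns a dictionary mapping each component to `True` or `False`, which gives a first indication of whether source builds of Python packages are likely to find what they need.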