Using the container engine
Running containerized environments¶
Specifying the `--environment` option to a Slurm command (e.g., `srun` or `salloc`) makes the command run inside the environment defined by the EDF.
There are three ways to do so:

- Through an absolute path: an absolute path to the EDF.
- Through a relative path: a path relative to the current working directory (i.e., where the Slurm command is executed), which must be prefixed with `./`.
- From the EDF search path: the name of an EDF located in the EDF search path.

`--environment` also accepts the EDF filename without the `.toml` extension:
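A minimal sketch of the three forms, assuming a hypothetical EDF named `pytorch.toml`:

```
# Absolute path to the EDF
srun --environment=${HOME}/.edf/pytorch.toml echo "hello"

# Relative path from the current working directory (note the leading ./)
srun --environment=./pytorch.toml echo "hello"

# Name from the EDF search path, with or without the .toml extension
srun --environment=pytorch.toml echo "hello"
srun --environment=pytorch echo "hello"
```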
Use from batch scripts¶
Use `--environment` with the Slurm commands (e.g., `srun` or `salloc`) invoked inside the batch script:

`srun` inside a batch script with EDF
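A minimal sketch of such a batch script, reusing the hypothetical `pytorch` EDF from above:

```
#!/bin/bash
#SBATCH --job-name=ce-example
#SBATCH --nodes=1

# The script itself runs on the host; only the srun step below
# executes inside the containerized environment.
srun --environment=pytorch cat /etc/os-release
```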
Specifying the `--environment` option as an `#SBATCH` option is experimental.
Such usage is discouraged as it may result in unexpected behavior.

Note

Specifying `--environment` with `#SBATCH` will put the entire batch script inside the containerized environment, so using any Slurm command within the batch script (e.g., `srun` or `scontrol`) requires the Slurm hook.
The hook is controlled by the `ENROOT_SLURM_HOOK` environment variable and is activated by default on most vClusters.
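For illustration only, the experimental pattern described in the note looks roughly like this (the EDF name is hypothetical):

```
#!/bin/bash
#SBATCH --job-name=ce-sbatch-example
#SBATCH --environment=pytorch

# The entire script now runs inside the container, so the srun call
# below depends on the Slurm hook (ENROOT_SLURM_HOOK) being active.
srun cat /etc/os-release
```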
EDF search path¶
By default, the EDFs for each user are looked up in `${HOME}/.edf`.
The default EDF search path can be changed through the `EDF_PATH` environment variable.
`EDF_PATH` must be a colon-separated list of absolute paths to directories, which the CE searches in order.
If an EDF is located in the search path, its name can be used in the `--environment` option without the `.toml` extension.

Using `EDF_PATH` to control the default search path
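A minimal sketch, using two hypothetical directories:

```
# Directories are searched in order; both paths are illustrative
export EDF_PATH=${HOME}/my-edfs:/capstor/store/cscs/mygroup/edfs

# An EDF at /capstor/store/cscs/mygroup/edfs/pytorch.toml can now be
# referenced by name alone
srun --environment=pytorch echo "hello"
```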
Using container images¶
By default, images defined in the EDF as remote registry references (e.g., a Docker reference) are automatically pulled and cached locally. On subsequent uses, the cached image is preferred over pulling the image again.
An image cache is automatically created at `.edf_imagestore` in the user's scratch folder (i.e., `${SCRATCH}/.edf_imagestore`), under which cached images are stored with the corresponding CPU architecture suffix (e.g., `x86` and `aarch64`).
Cached images may be subject to the automatic cleaning policy of the scratch folder.
Should users want to re-pull a cached image, they have to remove the corresponding image from the cache.
To choose an alternative image store path (e.g., to use a directory owned by a group rather than by an individual user), users can specify an image cache path explicitly by defining the environment variable `EDF_IMAGESTORE`.
`EDF_IMAGESTORE` must be an absolute path to an existing folder.
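A sketch of both operations; the cached image filename and the group directory below are illustrative:

```
# Force a re-pull by deleting the cached image
rm ${SCRATCH}/.edf_imagestore/x86/<image>.sqsh

# Point the CE at a group-owned cache instead; the directory must
# already exist and EDF_IMAGESTORE must be an absolute path
export EDF_IMAGESTORE=/capstor/store/cscs/mygroup/.edf_imagestore
```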
Note
- If the CE cannot create a directory for the image cache, it operates in cache-free mode, meaning that it pulls an ephemeral image before every container launch and discards it upon termination.
- Local container images are not cached. See the section below on how to use local images in EDF.
Pulling images manually¶
To bypass any caching behavior, users can manually pull an image and reference it directly in their EDF.
To do so, users may execute `enroot import docker://[REGISTRY#]IMAGE[:TAG]` to pull container images from OCI registries to the current directory.
After the import is complete, images are available in Squashfs format in the current directory and can be used in EDFs:
Manually pulling an `nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04` image

1. Pull the image.
2. Create an EDF referencing the pulled image.
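A sketch of both steps; the generated Squashfs filename follows enroot's usual `+`-separated naming but should be verified, and the EDF path is illustrative:

```
cd ${SCRATCH}

# Step 1: pull the image; enroot writes a Squashfs file into the
# current directory (verify the exact filename with ls)
enroot import docker://nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04

# Step 2: create an EDF pointing at the pulled image by absolute path
cat > ${HOME}/.edf/cuda11.8.toml <<EOF
image = "${SCRATCH}/nvidia+cuda+11.8.0-cudnn8-devel-ubuntu22.04.sqsh"
EOF
```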
Note
It is recommended to save images in `${SCRATCH}` or its subdirectories before using them.
Third-party and private registries¶
Docker Hub is the default registry from which remote images are imported.

Registry rate limits

Some registries rate-limit image pulls by IP address. Since public IPs are a shared resource, we recommend authenticating even for publicly available images. For example, Docker Hub applies its rate limits per user when authenticated.

To use an image from a different registry, the corresponding registry URL has to be prepended to the image reference, using a hash character (`#`) as a separator:
Using a third-party registry
- Within an EDF
- On the command line
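A sketch of both forms, using NVIDIA's NGC registry `nvcr.io` and an illustrative image reference:

```
# Within an EDF: prepend the registry URL, with '#' as the separator
cat > ${HOME}/.edf/ngc-pytorch.toml <<'EOF'
image = "nvcr.io#nvidia/pytorch:22.12-py3"
EOF

# On the command line
enroot import docker://nvcr.io#nvidia/pytorch:22.12-py3
```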
To import images from private repositories, access credentials should be configured by individual users in the `$HOME/.config/enroot/.credentials` file, following the netrc file format.
Using the `enroot import` documentation page as a reference:

netrc example
```
# NVIDIA NGC catalog (both endpoints are required)
machine nvcr.io login $oauthtoken password <token>
machine authn.nvidia.com login $oauthtoken password <token>

# DockerHub
machine auth.docker.io login <login> password <password>

# Google Container Registry with OAuth
machine gcr.io login oauth2accesstoken password $(gcloud auth print-access-token)

# Google Container Registry with JSON
machine gcr.io login _json_key password $(jq -c '.' $GOOGLE_APPLICATION_CREDENTIALS | sed 's/ /\\u0020/g')

# Amazon Elastic Container Registry
machine 12345.dkr.ecr.eu-west-2.amazonaws.com login AWS password $(aws ecr get-login-password --region eu-west-2)

# Azure Container Registry with ACR refresh token
machine myregistry.azurecr.io login 00000000-0000-0000-0000-000000000000 password $(az acr login --name myregistry --expose-token --query accessToken | tr -d '"')

# Azure Container Registry with ACR admin user
machine myregistry.azurecr.io login myregistry password $(az acr credential show --name myregistry --subscription mysub --query passwords[0].value | tr -d '"')

# GitHub Container Registry (GITHUB_TOKEN needs read:packages scope)
machine ghcr.io login <username> password <GITHUB_TOKEN>

# GitLab Container Registry (GITLAB_TOKEN needs a scope with read access to the container registry)
# GitLab instances often use different domains for the registry and the authentication service, respectively
# Two separate credential entries are required in such cases, for example:

# GitLab.com
machine registry.gitlab.com login <username> password <GITLAB_TOKEN>
machine gitlab.com login <username> password <GITLAB_TOKEN>

# ETH Zurich GitLab registry
machine registry.ethz.ch login <username> password <GITLAB_TOKEN>
machine gitlab.ethz.ch login <username> password <GITLAB_TOKEN>
```