Build and Run Environments

Setting up your environment is crucial for running and building tests, and Pavilion gives you several options for doing so.

Environment Variables
Modules
Module Wrappers
Spack Packages

Assumptions

Pavilion assumes that it runs under a relatively clean, default login environment; that is, the environment a new user gets when they first log into the machine, including any default modules and environment variables. A clean environment is not required, but it helps ensure that Pavilion behaves the same for you as it does for your co-workers.

That aside, most basic changes to the environment won’t have a significant impact on Pavilion’s behavior. However, a few things will:

Changing the default Python3 or PYTHONPATH
Modifying LD_LIBRARY_PATH or similar variables that affect compilation

Lastly, Pavilion writes and runs BASH scripts. It assumes that, whatever your native shell is, the module system will work under BASH just as well as it does there.

Environment Variables

The env attribute allows you to set environment variables in either the run or build scripts. They are configured as a YAML mapping/dict and, unlike the rest of Pavilion, can have upper-case keys (but no dashes). As with the run/build commands, the values can contain any bash shell syntax.

env_example:
  run:
    env:
      PYTHONPATH: $(pwd)/libs
      TEST_PARAM1: 37
      # The starting { means this has to be quoted.
      AN_ARRAY: "{hello world}"

    cmds:
      - for value in ${AN_ARRAY[@]}; do echo $value; done
      - python3 mytest.py

Each variable is set (and exported) in the order given:

#!/bin/bash

export PYTHONPATH=$(pwd)/libs
export TEST_PARAM1=37
export AN_ARRAY={hello world}

for value in ${AN_ARRAY[@]}; do echo $value; done
python3 mytest.py
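
Because the variables are exported in order, a later value can refer to an earlier one. The following is a minimal sketch of that behavior in plain bash. The names BASE_DIR and LIB_DIR and the path are made up for illustration and are not part of Pavilion.

```shell
#!/bin/bash

# Hypothetical variables, exported in the order they would be listed under 'env'.
export BASE_DIR=/tmp/pav_demo
# BASE_DIR is already set at this point, so it expands as expected.
export LIB_DIR=$BASE_DIR/libs

echo "$LIB_DIR"
```

If the two lines were swapped, LIB_DIR would be built from an empty BASE_DIR, so list dependent variables after the ones they use.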

Escaping

Values are not quoted when they are written to the script. If a value needs quotes, you’ll have to quote it twice: once for YAML, and once for the quotes you actually want in bash.

quote_example:
  run:
    env:
      DQUOTED: '"This will be in double quotes. It is a literal string as far
               as YAML is concerned."'
      SQUOTED: "'This $VAR will not be resolved in bash, because this is single
               quoted.'"
      DDQUOTED: "\"Backslashes escape double quotes inside double quotes.\""
      SSQUOTED: '"Doubling escapes single quotes '' inside single quotes."'
      NO_QUOTES: $(echo "YAML only quotes things if the first character
        is a quote. These are safe.")

#!/bin/bash

export DQUOTED="This will be in double quotes. It is a literal string as far as YAML is concerned."
export SQUOTED='This $VAR will not be resolved in bash, because this is single quoted.'
export DDQUOTED="Backslashes escape double quotes inside double quotes."
export SSQUOTED="Doubling escapes single quotes ' inside single quotes."
export NO_QUOTES=$(echo "YAML only quotes things if the first character is a quote. These are safe.")
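
The reason the inner quotes matter is that bash treats single and double quotes differently once the value reaches the script. This small sketch shows the difference. The variable VAR and the strings are illustrative only.

```shell
#!/bin/bash

VAR="resolved"

# Double quotes: $VAR is expanded when the line runs.
DQUOTED="This $VAR value came from bash."
# Single quotes: the text is kept literally, including the dollar sign.
SQUOTED='This $VAR is left alone.'

echo "$DQUOTED"
echo "$SQUOTED"
```

Choose the inner quote style in your YAML value according to whether you want bash to expand variables in it.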

Modules

Many clusters employ module systems to allow for easy switching between build environments. Pavilion supports both the Environment Modules (Tcl) and Lmod module systems; other module systems can be supported by overriding the base Module Wrapper Plugins.

Loading modules

In either run or build configs, you can have Pavilion import modules by listing them (in the order needed) under the modules attribute.

module_example:
  build:
    modules: [gcc, openmpi/2.1.2]

In the generated build script, each of these modules will be first loaded, then checked to verify that it was loaded successfully.

#!/bin/bash

TEST_ID=$1

module load gcc
# This checks to make sure the module was loaded. If it wasn't, the script
# exits and updates the test status.
verify_module_loaded gcc $TEST_ID

module load openmpi/2.1.2
verify_module_loaded openmpi/2.1.2 $TEST_ID
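
The verify_module_loaded function is provided by Pavilion, and its implementation is not shown here. As a rough idea of what such a check involves, the sketch below tests whether a module appears in LOADEDMODULES, the colon-separated list that both Environment Modules and Lmod maintain. The helper name is_module_loaded is hypothetical, and this is not Pavilion’s code.

```shell
#!/bin/bash

# Hypothetical helper: succeed if the named module (with or without a
# version) is listed in the colon-separated LOADEDMODULES variable.
is_module_loaded() {
    local wanted="$1"
    local loaded
    local -a loaded_list
    IFS=':' read -ra loaded_list <<< "${LOADEDMODULES:-}"
    for loaded in "${loaded_list[@]}"; do
        # Accept an exact match, or any version when only a name was given.
        if [[ "$loaded" == "$wanted" || "$loaded" == "$wanted"/* ]]; then
            return 0
        fi
    done
    return 1
}

# Pretend value for demonstration; a real module system sets this itself.
LOADEDMODULES="gcc/9.3.0:openmpi/2.1.2"

if is_module_loaded gcc; then echo "gcc is loaded"; fi
if ! is_module_loaded intel; then echo "intel is not loaded"; fi
```

A real check would also need to report the failure and record the test status, which this sketch leaves out.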

Other Module Manipulations

You can also swap modules by using the arrow (->) syntax and unload modules by prefixing their names with a dash (-):

module_example2:
  build:
    source_location: test_code.xz
  run:
    # This assumes gcc and openmpi are already loaded by default.
    modules: [gcc->intel/18.0.4, -openmpi, intel-mpi]
    cmds:
      - $MPICC -o test_code test_code.c

It is sometimes useful to start a build or run with a fresh module environment, that is, with no modules loaded. This helps ensure consistency across machines that may have different default modules. Pavilion offers the option to purge modules at the beginning of a build or run, prior to loading explicitly requested modules:

module_example3:
  build:
    purge_modules: True
    modules:
      - gcc
  run:
    purge_modules: True
    modules:
      - gcc

By default, modules are not purged.
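
To make the modules syntax concrete, here is a sketch that translates module specs into the module commands one might type by hand. The function name render_module_cmds is hypothetical, and the mapping to module purge, module swap, module unload and module load is an assumption about the equivalent manual commands, not a copy of what Pavilion writes into its scripts (which also include the verification steps shown earlier).

```shell
#!/bin/bash

# Hypothetical helper: print the manual module commands for a purge flag
# ("True" or "False") followed by a list of Pavilion-style module specs.
render_module_cmds() {
    local purge="$1"
    shift
    local spec
    if [[ "$purge" == "True" ]]; then
        echo "module purge"
    fi
    for spec in "$@"; do
        if [[ "$spec" == *"->"* ]]; then
            # old->new : swap the old module for the new one.
            echo "module swap ${spec%%->*} ${spec#*->}"
        elif [[ "$spec" == -* ]]; then
            # -name : unload the module.
            echo "module unload ${spec#-}"
        else
            echo "module load $spec"
        fi
    done
}

# The list from module_example2, without purging.
render_module_cmds False 'gcc->intel/18.0.4' '-openmpi' 'intel-mpi'
# The build section of module_example3, with purging.
render_module_cmds True gcc
```

Note that a name such as intel-mpi contains a dash but does not start with one, so it is treated as a plain load.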

Module Wrappers

Module wrappers allow you to change how Pavilion loads specific modules, module versions, and even modules in general. The default module wrapper provides support for lmod and tmod, generates the source to load modules within run and build scripts, and checks to see if they’ve been successfully loaded (or unloaded).

Module wrappers are typically added in Host Configs, via the module_wrappers section:

# This would typically be in a 'host' file.
module_wrappers:
    gcc:
      # When gcc is asked for on this system, load these modules instead.
      modules:
          # This assumes PrgEnv-cray is the default on this machine.
          - PrgEnv-cray->PrgEnv-gnu
          # Swap the default gcc (that comes with the PrgEnv) for the requested one
          # You shouldn't specify versions (unless you want to force a version), Pavilion
          # will automatically ask for the version requested in the test config.
          - gcc->gcc
      env:
          # You can also add environment variables to automatically be exported after
          # the module is loaded.
          PAV_CC: '$CC'
          PATH: '$PATH:$(dirname $(which gcc))'

So now, we can write our tests to generically ask for ‘gcc’:

mytest:
    run:
        modules: 'gcc'

Wildcards

The modules specified can be written as file system globs, to match a wider range of modules and to support module naming (particularly under Lmod) that is less generic.

module_wrappers:
    # On this system, the modules for MPI layers have the compiler as part of the name.
    # This will match 'openmpi-gcc', 'openmpi-intel', etc.
    openmpi-*:
        modules:
            # This will be auto-converted into the mpi requested.
            - 'openmpi-*'

        env:
            PAV_MPICC: '$(which mpicc)'

Wildcards work on the left side of module swaps (modA->modB) as well. Pavilion will look for a loaded package that matches the left side, and swap it for the right side.
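
As an illustration of glob matching against loaded modules, the sketch below searches the colon-separated LOADEDMODULES variable for the first entry that matches a pattern. The helper name find_loaded_match is hypothetical, and the matching rules (full entry, or name with the version stripped) are assumptions for this example, not Pavilion’s actual lookup.

```shell
#!/bin/bash

# Hypothetical helper: print the first loaded module matching a glob pattern.
find_loaded_match() {
    local pattern="$1"
    local loaded
    local -a loaded_list
    IFS=':' read -ra loaded_list <<< "${LOADEDMODULES:-}"
    for loaded in "${loaded_list[@]}"; do
        # The unquoted right-hand side is treated as a glob by [[ ]].
        # Try the full 'name/version' entry, then the bare name.
        if [[ "$loaded" == $pattern || "${loaded%%/*}" == $pattern ]]; then
            echo "$loaded"
            return 0
        fi
    done
    return 1
}

# Pretend value for demonstration; a real module system sets this itself.
LOADEDMODULES="PrgEnv-cray/6.0.5:openmpi-gcc/2.1.2"

find_loaded_match 'openmpi-*'
find_loaded_match 'PrgEnv-*'
```

The matched entry is what would appear on the left side of a swap, with the requested module on the right.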

Version Variable

If you need the version of the loaded module, it’s available in the <mod_name>_VERSION environment variable. If the mod_name contains wildcards, ‘*’ is replaced with ‘any’, and other wildcard characters are replaced with underscores. So gcc-[f]-?-* gets a gcc-_-_-any_VERSION environment variable.
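
The naming rule can be sketched with sed. This is inferred only from the single example above: treating a whole bracket expression such as [f] as one underscore is an assumption, and the helper name version_var_name is hypothetical rather than anything Pavilion provides.

```shell
#!/bin/bash

# Hypothetical helper: derive the version variable name for a module name
# that may contain glob wildcards.
version_var_name() {
    local mod_name="$1"
    local cleaned
    # '*' becomes 'any'; a bracket expression or '?' becomes '_'.
    cleaned="$(printf '%s' "$mod_name" \
        | sed -e 's/\*/any/g' -e 's/\[[^]]*]/_/g' -e 's/?/_/g')"
    printf '%s_VERSION\n' "$cleaned"
}

version_var_name 'gcc-[f]-?-*'
version_var_name 'gcc'
```

For the documented example this prints gcc-_-_-any_VERSION, and for a plain name such as gcc it prints gcc_VERSION.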

Module Wrapper Plugins

For more complicated use cases, you can also write module wrapper plugins. For more information on writing these, see Module Wrapper Plugins.

Spack Packages

Pavilion supports both the installation and loading of Spack packages inside of test scripts. This is not enabled by default as it requires an external Spack instance.

Once configured, Spack packages can be installed and loaded in Pavilion test scripts using the spack section inside both the build and run sections of a test config. This section has two keys, install and load, that take a list of package names with optional spec and dependency options.

build:
    spack:
        install:
            - ember
            - mpich@3.0.4
            - mpileaks @1.2:1.4 %gcc@4.7.5 +debug
        load:
            - gcc
run:
    spack:
        load:
            - ember
            - mpich
            - mpileaks

Pavilion will also allow for Spack-specific configuration changes to be added inside test configs under the spack section. The following Spack-specific options are currently supported:

build_jobs - The max number of jobs to use when running make in parallel.
repos - Paths to package repositories.
mirrors - URLs that point to directories containing Spack packages.
upstreams - Other Spack instances.

These options are inserted directly into the Spack build environment’s spack.yaml file. Refer to the Spack documentation for details on their usage.

base:
    spack:
        build_jobs: 4
        mirrors:
            MIRROR1: https://a_spack_mirror.com
        repos:
            - /a/path/to/package/repo
            - /a/different/path/to/package/repo
        upstreams:
            Upstream1:
                install_tree: /path/to/other/spack/instance

Enabling Spack Features

Spack features can be enabled by providing a Spack instance’s path under the spack_path key in the Pavilion config file (pavilion.yaml). For more Pavilion configuration information, see Configuring Pavilion.

Once Spack is enabled globally for Pavilion, it can be enabled for individual tests simply by including a spack.load or spack.install key under the run or build sections of a test config. Trying to use Spack in a test without first enabling it globally results in an error.

How Pavilion Uses Spack

When Spack is enabled inside of a test config, Pavilion generates an anonymous Spack environment file that is activated at the beginning of both the build and run scripts. The generated environment file, spack.yaml, is placed in the respective build directory so that it can be reactivated when a build is reused.

The Spack environment file is modified so that Spack packages are installed inside their respective build directory in a directory named spack_installs, as seen below:

# SPACK: Spack environment configuration file.
spack:
    config:
        install_tree: ~/.pavilion/builds/7a3986a56e7c04a7/spack_installs

This means any installs that are not in the global Spack instance will only be in the scope of this build.
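
To show the shape of such a file, the sketch below writes a spack.yaml with a per-build install_tree into a temporary directory. The temporary directory stands in for a real build directory, and the script is an illustration only; Pavilion generates its own file, and the exact contents it writes may differ.

```shell
#!/bin/bash

# A temporary directory stands in for a Pavilion build directory here.
build_dir="$(mktemp -d)"

# Write an environment file whose install tree lives inside the build directory.
cat > "$build_dir/spack.yaml" <<EOF
# SPACK: Spack environment configuration file.
spack:
    config:
        install_tree: $build_dir/spack_installs
EOF

cat "$build_dir/spack.yaml"

# Assumption: with a Spack instance available, an environment directory like
# this could be activated by hand with: spack env activate -d "$build_dir"
```

Keeping the install tree under the build directory is what limits those installs to the scope of that one build.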

Global Spack packages or packages in upstreams will still need to be listed under the install section for both the build and run sections of a test config so that those packages can be added to the Spack environment correctly.