Scheduler Plugins
=================

Scheduler plugins take care of the scheduling part of testing. They provide
tests with a set of variables that can be used in the test, and handle passing
test runs off to the control of the scheduler.

Everything in :ref:`plugins.basics` applies here, so you should read that first.

This may seem quite daunting at first. The hard part, however, is typically
in parsing the information you get back from the scheduler itself. Interfacing
that with Pavilion is fairly easy.

.. contents::

Scheduler Requirements
----------------------

For a scheduler to work with Pavilion, it must:

- Produce jobs with a unique (for the moment), trackable job id
- Produce jobs that can be cancelled
- Allow a job to be started asynchronously.

The Pavilion Scheduler plugin system was designed to be flexible
in order to support as many schedulers as possible.

Pavilion also provides an advanced scheduler class that offers several additional features:

- Allows tests to auto-size relative to available/up nodes.
- Will automatically break the system into discrete 'chunks' of nodes, allowing for
  tests that run over the whole system in a piecemeal fashion.

Advanced schedulers must be able to get an accurate inventory of nodes, including:

- Whether each node is currently 'up' or 'allocated'.
- System information about each node (CPUs, memory, etc.).
- The scheduler 'groups' that the node belongs to, such as reservations and partitions. Pavilion
  must be able to filter nodes according to the allocation parameters the same way the scheduler would.

Advanced schedulers must also be able to dictate to the scheduler exactly which nodes to use.

Scheduler Plugins
-----------------

The Scheduler Plugin
~~~~~~~~~~~~~~~~~~~~

This inherits from the 'pavilion.schedulers.SchedulerPluginBasic' or
'pavilion.schedulers.SchedulerPluginAdvanced' class.  The methods described here are fully
documented in the 'pavilion.schedulers.scheduler.SchedulerPlugin' class.

All scheduler plugins require that you extend the base class by providing:

1. A ``_kickoff()`` method - a means to acquire an allocation given the scheduler parameters
   and run a script on it. Also needs to return a 'serializable' job id, to uniquely
   identify a scheduler job.
2. A ``_job_status()`` method, which asks the scheduler whether a given job id is
   scheduled, had a scheduling error, was cancelled, or is running.
3. A ``cancel()`` method, to cancel a given job id.
4. A ``_get_alloc_nodes()`` method, to get the list of nodes in an allocation that
   Pavilion is currently running under.
5. An ``available()`` method, to tell Pavilion if your scheduler can be used at all.


Advanced schedulers must also override the following. They are fully documented
in the 'pavilion.schedulers.advanced.SchedulerPluginAdvanced' class.

1. ``_get_raw_node_data()`` - Should fetch and return a list of information about each node.
   This is the per-node information mentioned above.
2. ``_transform_raw_node_data()`` - Converts that data into a '{node: info_dict}' dictionary.

   Each node's info_dict must contain several required keys; see the method
   documentation for details on the required and optional keys.

Basic scheduler plugins don't require any extra methods, but are limited in functionality.
See :ref:`tests.scheduling.types` for more info.

Scheduler Variables
~~~~~~~~~~~~~~~~~~~

Every scheduler should also include a scheduler variables class, assigned to your
class's 'VAR_CLASS' class variable. This provides information from the scheduler
for each test to use in its configuration, such as ``sched.test_nodes`` (the
number of nodes in the test's allocation). The base class uses information given
by the scheduler plugin and the test's configuration to figure out nearly all of these
on its own. You'll only need to override a few.

Writing a Scheduler Plugin Class
--------------------------------

Handling Errors
~~~~~~~~~~~~~~~

Your scheduler class should catch any errors it reasonably expects to occur.
This includes OSError when making system calls, ValueError when manipulating
values (like converting strings to ints), etc. Once caught, raise a Pavilion-specific
error in its place; for scheduler plugins this should always be SchedulerPluginError. Pavilion exceptions
take a message about the local context as their first argument, and the prior exception
as the second (optional) argument.


.. code-block:: python

    from pavilion.schedulers import SchedulerPluginError

    try:
        int(foo)
    except ValueError as exc:
        raise SchedulerPluginError("Invalid value for foo.", exc)

This allows Pavilion to catch and handle predictable errors, and pass them
directly to the user.

Init
~~~~

Scheduler plugins initialize much like other Pavilion plugins:

.. code-block:: python

    from pavilion import schedulers

    class Slurm(schedulers.SchedulerPluginAdvanced):

        def __init__(self):
            super().__init__(
                name='slurm',
                description='Schedules tests via the Slurm scheduler.'
            )

Most customization is through method overrides and a few class variables that
we'll cover later.  There is also a ``SchedulerPluginBasic`` which allows for working
with schedulers with a much reduced feature set.


.. _Yaml Config: https://yaml-config.readthedocs.io/en/latest/

Configuration
~~~~~~~~~~~~~

Pavilion has unified scheduler plugin configuration into the 'schedule' section. Not all keys from
this section will apply to your scheduler, and that's ok. Most keys are handled automatically given
the information gathered on nodes.

You can also, optionally, add a scheduler specific configuration section. To do this, you'll need
to override the ``_get_config_elems()`` method. This method returns three items:

  1. A list of YamlConfig Elements.
  2. A dictionary of validation/normalization functions. These will be called to
     transform the data for each key to a standard format.
  3. A dictionary of default values for each key.

Pavilion uses the `Yaml Config`_ library to manage its configuration format.
Yaml Config uses 'config elements' to describe each component of the
configuration and their relationships.

The Slurm scheduler plugin provides a solid example of this, but in general:

  - You should only use yaml_config StrElem, ListElem, KeyedElem (a dict with specific key
    and value formats), and CategoryElem (a dict with mostly unlimited keys, and a shared
    value format).
  - Validators for individual keys are optional, but you should do str to int conversion and value
    range checking. Validators can take several forms; see the ``SchedulerPlugin._get_config_elems()``
    method documentation.
  - Don't use the built-in validation and default options for the yaml_config objects.
    Use the validation callbacks/objects and defaults dictionary returned by
    ``_get_config_elems()`` instead.
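
As a minimal sketch of the second and third return items, the following assumes a
validator may be a plain callable that takes the raw string value and returns the
normalized one (check the method documentation for the forms actually accepted). The
``int_in_range`` helper and the ``tasks_per_node`` key are hypothetical names used only
for illustration.

.. code-block:: python

    def int_in_range(key, min_val, max_val):
        """Return a validator that converts a string to an int and checks its range.

        Hypothetical helper for illustration; not part of Pavilion."""

        def validator(value):
            try:
                number = int(value)
            except (TypeError, ValueError) as err:
                raise ValueError(
                    "Value for '{}' must be an integer, got '{}'."
                    .format(key, value)) from err

            if not min_val <= number <= max_val:
                raise ValueError(
                    "Value for '{}' must be between {} and {}, got {}."
                    .format(key, min_val, max_val, number))

            return number

        return validator

    # Illustrative validators and defaults dictionaries, keyed by config key.
    validators = {'tasks_per_node': int_in_range('tasks_per_node', 1, 1024)}
    defaults = {'tasks_per_node': '1'}

Keeping the conversion and range check in one callable means every consumer of the
key sees an already-normalized integer.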

Kicking Off Tests
~~~~~~~~~~~~~~~~~

Pavilion scheduler plugins generate a kickoff script for each job - a script that will
be handed to the scheduler to be run within the allocation. That script will run Pavilion
one or more times within that allocation, starting a ``run.sh`` script for each test. It's
the responsibility of the ``run.sh`` script to actually run applications under MPI, either
with ``mpirun``, ``srun``, or similar.

Many schedulers rely on header information in that ``kickoff`` script to relay to
the scheduler what the settings for the allocation should be. This header is optional - the
default header adds nothing to the file except a ``#!/bin/bash`` line. If you need to
define header lines, you'll need to create a class that inherits from
``pavilion.schedulers.scheduler.KickoffScriptHeader``, and override the
``_kickoff_lines()`` method. This method simply returns a list of header lines
to add.
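
The kind of list such a method returns can be sketched as follows. This is a standalone
function rather than a real ``_kickoff_lines()`` override, since the attributes available
to that method are described in its own documentation and not here. The function name and
its parameters are hypothetical; the ``#SBATCH`` options shown are standard Slurm batch
options.

.. code-block:: python

    def sbatch_header_lines(job_name, nodes, time_limit, partition=None):
        """Build Slurm-style header lines for a kickoff script.

        Hypothetical helper for illustration; not part of Pavilion.

        :param job_name: Name to give the scheduler job.
        :param nodes: Number of nodes to request.
        :param time_limit: Time limit in minutes.
        :param partition: Optional partition name.
        """

        lines = [
            '#SBATCH --job-name={}'.format(job_name),
            '#SBATCH --nodes={}'.format(nodes),
            '#SBATCH --time={}'.format(time_limit),
        ]

        # Only add optional settings when they were actually given.
        if partition is not None:
            lines.append('#SBATCH --partition={}'.format(partition))

        return lines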

Alternatively, when writing your ``_kickoff`` method, you can simply pass any relevant
information about the job to the scheduler directly through the command line
or API calls.

Either way, there is a set of parameters that must be passed on to the scheduler. These
are described in the ``SchedulerPlugin._kickoff`` docstring. You can safely ignore parameters
that aren't supported by your scheduler.


Composing Commands
~~~~~~~~~~~~~~~~~~

Your scheduler plugin will most likely require that you run commands in a subshell. This
section provides guidance on how to do so reliably under Pavilion.

.. code-block:: python

    # These should be at the top of the file, as standard
    import shutil
    import subprocess

    from pavilion.schedulers import SchedulerPluginError

    # Use shutil.which to find the path to your executable, if needed
    srun_cmd = shutil.which('srun')
    if srun_cmd is None:
        raise SchedulerPluginError("Could not find srun command path.")

    my_cmd = [srun_cmd]

    # Building your commands with a list is simple and flexible.
    # ('config' is assumed to be a dict of scheduler settings.)
    if config['account']:
        my_cmd.extend(['-A', config['account']])

    # subprocess.check_output will run your command to completion and simultaneously redirect
    # and gather the output.
    try:
        # You should also redirect stderr, as is appropriate for your command.
        run_output = subprocess.check_output(my_cmd, stderr=subprocess.STDOUT)
    # A CalledProcessError will be raised if the command returns an error code.
    except subprocess.CalledProcessError as err:
        raise SchedulerPluginError("Error calling srun. Return code '{}', msg:\n{}"
                                   .format(err.returncode, err.output))

    # The output will be binary, and will need to be decoded
    run_output = run_output.decode()


To find commands on a system, 'shutil.which' is essentially an in-python version
of 'which'. (The older 'distutils.spawn.find_executable' was removed along with
distutils in Python 3.12.)

Environment Variables
^^^^^^^^^^^^^^^^^^^^^

You can also add to the environment through the ``env`` argument, though you
need to make sure to include the base environment in most cases.

.. code-block:: python

    import os
    import subprocess

    myenv = dict(os.environ)
    myenv['MY_ENV_VAR'] = 'Hiya!'
    myenv['PATH'] = '{}:/opt/share/something/bin'.format(os.environ['PATH'])

    subprocess.run(my_cmd, env=myenv)

Job IDs
^^^^^^^

Regardless of how you kick off a test, you must capture a job id for it, and return it
as part of a JobInfo object (which is really just a dict). All scheduler commands that act on a
job, like cancel, will have access to this object either directly or through an attached test.

The JobInfo dict can contain any keys and values you like, as long as they're all strings. It's
useful to include the 'sys_name' of the machine you're on (via
'sys_vars.get_vars(True)["sys_name"]') so that you can also check whether the system that
started the job is the same as the one that's manipulating it.
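
A minimal sketch of that idea follows, using a plain dict to stand in for the JobInfo
object. The helper names and the ``'id'`` key are hypothetical, and the system name is
passed in directly rather than looked up through 'sys_vars'.

.. code-block:: python

    def make_job_info(job_id, sys_name):
        """Build a JobInfo-style dict. All values are stored as strings.

        Hypothetical helper for illustration; not part of Pavilion."""

        return {
            'id': str(job_id),
            'sys_name': str(sys_name),
        }

    def same_system(job_info, current_sys_name):
        """Return True if the job was started from the system we're on now."""

        return job_info.get('sys_name') == current_sys_name

    job_info = make_job_info(12345, 'mycluster')

Checking ``same_system()`` before acting on a job avoids, for instance, asking one
cluster's scheduler to cancel a job id that belongs to another.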

Job Status
~~~~~~~~~~

The '_job_status()' method takes the Pavilion base config (Pavilion's configuration, rather than
a test configuration), and the JobInfo of the job whose status is needed. It returns a
'TestStatusInfo' object, describing the job state returned by the scheduler.

Its job is to translate all the complicated potential job states for any particular scheduler
into one of four more basic states understood by Pavilion:

- SCHED_ERROR - There was an error in scheduling the job
- SCHED_CANCELLED - The job was cancelled (usually externally to Pavilion)
- SCHED_RUNNING - The job is running (but not necessarily the particular test).
- SCHEDULED - The job is simply waiting for an allocation.

Note that this will only be called if the cached job status in the plugin's internal
'_job_statuses' dictionary is out of date. In fact, you can (as the Slurm plugin does) simply
use the first call of this function to update the status of all the jobs on the system at once
in that dictionary.
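
The state translation described above might be sketched like this. The mapping is
illustrative only and is not the Slurm plugin's actual table. The keys are Slurm job state
names, and the values are the names of the four Pavilion states as plain strings, where a
real plugin would use the matching ``STATES`` attribute. Treating unknown states as
errors is an assumption made for this example.

.. code-block:: python

    # Illustrative mapping from raw scheduler states to Pavilion state names.
    SLURM_STATE_MAP = {
        'PENDING': 'SCHEDULED',
        'CONFIGURING': 'SCHEDULED',
        'RUNNING': 'SCHED_RUNNING',
        'COMPLETING': 'SCHED_RUNNING',
        'CANCELLED': 'SCHED_CANCELLED',
        'FAILED': 'SCHED_ERROR',
        'NODE_FAIL': 'SCHED_ERROR',
        'BOOT_FAIL': 'SCHED_ERROR',
    }

    def simplify_state(raw_state):
        """Reduce a raw scheduler state string to one of the four basic states.

        Hypothetical helper for illustration; not part of Pavilion."""

        # Some tools append detail after the state name, so only the
        # first word is used for the lookup.
        parts = raw_state.strip().upper().split()
        key = parts[0] if parts else ''

        # Assumption: anything unrecognized is reported as an error.
        return SLURM_STATE_MAP.get(key, 'SCHED_ERROR')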

.. code-block:: python

    # The STATES object has attributes for each valid Pavilion test state,
    # but you'll only be using those with the 'SCHED_' prefix.
    from pavilion.status_file import STATES
    from pavilion.status_file import TestStatusInfo

    my_status = TestStatusInfo(
        STATES.SCHED_ERROR,     # Simply pass one of the valid scheduler state constants.
        "Cthulhu ate my test.")  # Along with a longer message describing the state.

Cancelling Runs
~~~~~~~~~~~~~~~

To write the 'cancel()' method, all you need to do is use the job id you saved when you
kicked a test off. If there's an error doing so, return a message why, otherwise simply
return 'None' to denote success.

All the more complicated parts of cancelling are handled by functions that will wrap your method,
so there really isn't too much to worry about here.  The Slurm plugin cancel command is a good
example of how simple this can be.
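
The return convention described above (a message on failure, 'None' on success) can be
sketched with a standalone function. This is not the real ``cancel()`` signature; the
function name and its ``cancel_cmd`` parameter are hypothetical. It assumes a
scheduler whose cancel command takes the job id as its only argument, as Slurm's
``scancel`` does.

.. code-block:: python

    import shutil
    import subprocess

    def cancel_job(job_id, cancel_cmd='scancel'):
        """Try to cancel a job. Returns None on success, or an error message.

        Hypothetical helper for illustration; not part of Pavilion."""

        cmd_path = shutil.which(cancel_cmd)
        if cmd_path is None:
            return "Could not find the '{}' command.".format(cancel_cmd)

        try:
            subprocess.check_output([cmd_path, str(job_id)],
                                    stderr=subprocess.STDOUT)
        except (OSError, subprocess.CalledProcessError) as err:
            return "Error cancelling job '{}': {}".format(job_id, err)

        return None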

Finding the Allocation Nodes
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The ``_get_alloc_nodes()`` method needs to be overridden to find the list of nodes for
a test's allocation. This is only ever called from within the allocation - typically
the scheduler sets an environment variable with this information.

Note that this may not always be called. If chunking is used, the scheduler plugin will know
the exact list of allocation nodes before the test is kicked off.
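
As a sketch of reading such an environment variable, the example below expands a
Slurm-style compressed node list taken from ``SLURM_JOB_NODELIST``. The function
names are hypothetical, and the parser is deliberately simple: it assumes at most one
bracketed group per entry. Where it is available, asking the scheduler to expand the list
(for Slurm, ``scontrol show hostnames``) is likely to be more robust.

.. code-block:: python

    import os
    import re

    def split_outside_brackets(text):
        """Split on commas that are not inside square brackets."""

        parts, depth, current = [], 0, ''
        for char in text:
            if char == '[':
                depth += 1
            elif char == ']':
                depth -= 1

            if char == ',' and depth == 0:
                parts.append(current)
                current = ''
            else:
                current += char

        if current:
            parts.append(current)
        return parts

    def expand_hostlist(hostlist):
        """Expand 'node[01-03,07],login1' style lists into node names."""

        nodes = []
        for part in split_outside_brackets(hostlist):
            match = re.fullmatch(r'([^\[\]]*)\[([^\[\]]+)\]([^\[\]]*)', part)
            if match is None:
                # No bracket group, so this is already a plain node name.
                nodes.append(part)
                continue

            prefix, ranges, suffix = match.groups()
            for item in ranges.split(','):
                start, _, end = item.partition('-')
                end = end or start
                # Preserve zero padding, so '01-03' yields '01', '02', '03'.
                width = len(start)
                for num in range(int(start), int(end) + 1):
                    nodes.append('{}{:0{}d}{}'.format(prefix, num, width, suffix))

        return nodes

    def get_alloc_nodes(env=os.environ):
        """Return the allocation's node list, or an empty list if unset."""

        return expand_hostlist(env.get('SLURM_JOB_NODELIST', ''))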


Scheduler Availability
~~~~~~~~~~~~~~~~~~~~~~

The 'available()' method simply tells Pavilion if the scheduler is available to run jobs
on the given system. It's not a measure of operability, simply a True/False value saying
whether the basic commands (or API modules) needed to use the plugin exist.
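
A check of this kind can be as small as the sketch below, which only looks for the
commands on the ``PATH``. The function name is hypothetical, and the command names in
the usage line are just an example for a Slurm-like scheduler.

.. code-block:: python

    import shutil

    def commands_available(commands):
        """Return True if every named command can be found on the PATH.

        Hypothetical helper for illustration; not part of Pavilion."""

        return all(shutil.which(cmd) is not None for cmd in commands)

    # For example, a Slurm-like plugin might check for these commands.
    slurm_usable = commands_available(['sbatch', 'scancel', 'squeue'])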

.. _decoratored: https://www.programiz.com/python-programming/decorator

Advanced Scheduler Methods
--------------------------

If you're trying to write an advanced scheduler plugin using the 'SchedulerPluginAdvanced'
parent class, there are a couple more methods to override.  These are:

- ``_get_raw_node_data()`` - A method to gather raw information on the cluster's nodes.
- ``_transform_raw_node_data`` - A method that translates that same data into a dictionary of
  information about each node.

For information on overriding each of these, refer to the doc strings for each as defined
in the 'pavilion.schedulers.advanced.SchedulerPluginAdvanced' class. They will tell you
everything you need to know about how to write those methods.
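
To give a feel for the transformation step only, here is a sketch under heavy
assumptions. The raw ``key=value`` line format is invented for this example, and the
info_dict keys shown (``up``, ``allocated``, ``cpus``, ``mem``, ``partitions``) are
illustrative; the actual required and optional keys are the ones listed in the method
doc strings.

.. code-block:: python

    def transform_raw_node_data(raw_lines):
        """Convert raw per-node lines into a {node: info_dict} dictionary.

        Hypothetical standalone function for illustration; the input format
        and the output keys are assumptions, not Pavilion's real ones."""

        nodes = {}
        for line in raw_lines:
            # Each line looks like: 'name=n01 state=idle cpus=36 ...'
            fields = dict(item.split('=', 1) for item in line.split())
            state = fields.get('state', 'down')
            partitions = fields.get('partitions', '')

            nodes[fields['name']] = {
                'up': state in ('idle', 'allocated'),
                'allocated': state == 'allocated',
                'cpus': int(fields.get('cpus', 0)),
                'mem': int(fields.get('mem', 0)),
                'partitions': [part for part in partitions.split(',') if part],
            }

        return nodes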

The purpose of these methods is to provide Pavilion with the information it needs to make
decisions about what nodes to schedule on itself, rather than relying on the scheduler to do
so. This allows Pavilion to partition the system in ways that the scheduler might not support
on its own. These include the ability to specify 'all' as the number of nodes requested,
and the ability to perform :ref:`tests.scheduling.chunking` of the system into multiple, evenly sized
pieces.

The downside is that the per-node information must be perfectly accurate or jobs may be rejected by
the scheduler (such as when improperly requesting nodes not in the selected partition) or simply
wait in the queue forever (such as when selecting nodes that are down).
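
The simplest form of the partitioning mentioned above, splitting a node list into
contiguous pieces, might look like the following. This is only an illustration of the
idea with a hypothetical function name. It does not reproduce Pavilion's own chunking
logic or options, and the final chunk will be smaller when the node count doesn't divide
evenly.

.. code-block:: python

    def chunk_nodes(nodes, chunk_size):
        """Split a collection of node names into lists of up to chunk_size.

        Hypothetical helper for illustration; not part of Pavilion."""

        if chunk_size < 1:
            raise ValueError("chunk_size must be at least 1.")

        # Sort so that chunks are stable from one call to the next.
        ordered = sorted(nodes)
        return [ordered[i:i + chunk_size]
                for i in range(0, len(ordered), chunk_size)]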

Scheduler Variables
-------------------

The second part of creating a scheduler plugin is adding a set of variables that
test configs can use to manipulate their test. The vast majority of these are automatically
derived from the information you gathered about the nodes for Advanced scheduler plugins or
via the ``schedule.cluster_info`` test configuration information for Basic scheduler plugins.

Pavilion provides a framework for creating these variables, the
``pavilion.schedulers.vars.SchedulerVariables`` class. By inheriting from this
class, you can define scheduler variables simply by adding `decorated <decoratored_>`_
methods to your child class. The decorators do most of the hard work; you
simply have to create and return the value. The class itself provides good documentation
on how to do this.

The most important of these is the ``test_cmd`` variable, which is probably the
only variable that will need to be customized for your scheduler plugin. It provides
tests with an MPI startup command, such as ``mpirun``, with arguments automatically set
according to the test's settings. Pavilion tests generally use this variable to prefix
their MPI runs when writing their run scripts:

.. code-block:: yaml

    test_cmd_example:

      scheduler: slurm
      schedule:
        nodes: 32

      run:
        cmds:
          - '{{test_cmd}} ./my_mpi_cmd'

How to write a ``test_cmd`` variable is documented in the ``SchedulerVariables.test_cmd()`` method's
doc string.
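
The string such a variable produces could be composed along these lines. This is a
standalone sketch with a hypothetical function name and parameters, not the decorated
method itself, and it assumes an ``srun``-based launcher where ``-N`` sets the node
count and ``-n`` the total task count.

.. code-block:: python

    import shlex

    def srun_test_cmd(nodes, tasks_per_node, extra_args=()):
        """Compose an srun command prefix from basic test settings.

        Hypothetical helper for illustration; not part of Pavilion."""

        cmd = ['srun', '-N', str(nodes), '-n', str(nodes * tasks_per_node)]
        cmd.extend(extra_args)

        # Quote each part so the result is safe to paste into a run script.
        return ' '.join(shlex.quote(part) for part in cmd)

Building the command as a list and quoting at the end keeps any extra arguments
containing spaces intact.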


Adding the Scheduler Vars to the Scheduler Plugin
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

To add your scheduler variable class to your scheduler plugin, simply
set the variable class as the ``VAR_CLASS`` attribute on your scheduler.

.. code-block:: python

    from pavilion import schedulers

    class MyVarClass(schedulers.SchedulerVariables):
        """Your scheduler variable class."""

    class MySchedPlugin(schedulers.SchedulerPlugin):
        VAR_CLASS = MyVarClass