Pavilion Command Plugins
Every Pavilion command is actually a plugin. The command plugin system provides easy ways to set up command arguments and the functions that actually perform the command.
Contents
Command Plugin Init
Every command plugin inherits from the pavilion.commands.Command
plugin
base class. Like all other Pavilion plugin classes, the __init__
method must
take no arguments, and it must call the parent class’s __init__
to name
and document the plugin.
from pavilion import commands
class CancelCommand(commands.Command):
"""A basic command class."""
def __init__(self):
super().__init__(
name='cancel',
description='Cancel tests or series.',
short_help='Cancel a test, tests, or series.",
aliases=['kill', 'stop']
)
Name and Aliases
The name
attribute is both the name of this plugin and that users will
use to call this command. The aliases
attribute takes a list of alternate
names for the command. Given the above, all of the following are valid ways
to call our example command.
$ pav cancel
$ pav kill
$ pav stop
Help
The command description
is printed when running pav <cmd> --help
, along
with the rest of the documentation in the command’s arguments. The following is
what is printed for the real pav cancel
command.
$ pav cancel --help
usage: pav.py cancel [-h] [-s] [-j] [tests [tests ...]]
Cancel a test, tests, or test series.
positional arguments:
tests The name(s) of the tests to cancel. These may be any mix of
test IDs and series IDs. If no value is provided, the most
recent series submitted by the user is cancelled.
optional arguments:
-h, --help show this help message and exit
-s, --status Prints status of cancelled jobs.
-j, --json Prints status of cancelled jobs in json format.
The command short_help
is printed when running pav --help
when the
command is listed. If no short help is given, the command will be hidden (but
still usable). Hidden commands are generally useful to call back into pavilion
in generated scripts.
$ pav --help
usage: pav.py [-h] [-v]
{run,show,status,view,results,result,log,clean,_run,set_status,status_set,wait,cancel}
...
Pavilion is a framework for running tests on supercomputers.
positional arguments:
{run,show,status,view,results,result,log,clean,_run,set_status,status_set,wait,cancel}
run Setup and run a set of tests.
show Show pavilion plugin/config info.
status Get status of tests.
view Show the resolved config for a test.
results (result) Displays results from the given tests.
log Displays log for the given test id.
clean Clean up Pavilion working diretory.
set_status (status_set)
Set status of tests.
wait Wait for statuses of tests.
cancel Cancel a test, tests, or test series.
optional arguments:
-h, --help show this help message and exit
-v, --verbose Log all levels of messages to stderr.
_setup_arguments(self, parser)
Every Pavilion command plugin must provide a
_setup_arguments(self, parser) method. It will
get passed a python argparse
parser that you can use to add arguments to
your command. This parser will already be configured to include the basics
of the command as specified when you called the parent’s __init__
method.
def _setup_arguments(self, parser):
parser.add_argument(
'-s', '--status', action='store_true', default=False,
help='Prints status of cancelled jobs.'
)
parser.add_argument(
'-j', '--json', action='store_true', default=False,
help='Prints status of cancelled jobs in json format.'
)
parser.add_argument(
'tests', nargs='*', action='store',
help='The name(s) of the tests to cancel. These may be any mix of '
'test IDs and series IDs. If no value is provided, the most '
'recent series submitted by the user is cancelled. '
)
# No need to return anything.
See the official documentation on argparse for more information on defining arguements.
run(self, pav_cfg, args)
The run command is what actually executes your command. Other than a few Pavilion conventions you should follow, each command is free to do anything it needs to.
Arguments
It will be given a couple of useful parameters.
pav_cfg
- The pavilion configuration object. A dictionary like object that contains all of the base pavilion configuration attributes, as well as useful things like the path to the general working directory.pav_cfg.working_dir
The
args
object containing all of the pavilion command arguments. It will contain all the arguments specific to your command, as well as the general pavilion arguments (like –verbose).
Return Values
Pavilion’s return value is the return value of whatever command was run. So
if your command succeeds, you should return 0
. If your command fails,
you return an appropriate error code from the errno
library.
import errno
from pavilion import utils
class KnownRuns(commands.Command):
...
def run(self, pav_cfg, args):
"""Print the number test runs in the working_dir."""
runs_dir = pav_cfg.working_dir/'test_runs'
try:
runs = list(runs_dir.iterdir())
except PermissionError as err:
utils.fprint(
"Could not access run dir at {}: {}"
.format(str(runs_dir), err),
color=utils.YELLOW,
file=sys.stderr)
return errno.EACCESS
utils.fprint(len(runs))
return 0
Exceptions
Pavilion commands should never raise exceptions, or let the exceptions of anything they call to go uncaught. Uncaught exceptions are always considered to be a bug in Pavilion.
When dealing with non-Pavilion libraries, you’ll have to work out how to handle any exceptions they raise yourself. Each Pavilion library, however, comes with one or more custom exceptions that should be the only exception type raised by that library. These exceptions should contain information about what went wrong, so you’ll probably want to print that information for the user like in the example above.
Output via utils.fprint()
Most command output should be given through either the utils.fprint
function (or utils.draw_table
).
utils.fprint
is just like the standard python command, except that it
allows for ANSI color sequences through the color
argument.
from pavilion import utils
utils.fprint("hello world", color=utils.YELLOW)
The core output of your command should be given via stdout.
By default, output is meant for human readability, so colorization is encouraged.
Error output should be
Color Scheme
The utils
module provides names that map to the standard ANSI 3/4 bit
foreground colors and a few special format codes. While fprint can take any
ANSI sequence as the color (including 8 and 24 bit color codes), only the basic
colors are typically mapped by user color schemes to ensure readability.
Color |
Code |
Usage |
---|---|---|
BLACK |
30 |
Default |
RED |
31 |
Use for fatal errors |
GREEN |
32 |
Use for ‘success’ messages |
YELLOW |
33 |
Non-Fatal Errors (Warnings) |
BLUE |
34 |
Discouraged (contrast issues) |
CYAN |
35 |
Info messages |
GREY/WHITE |
37 |
|
BOLD |
1 |
|
FAINT |
2 |
|
UNDERLINE |
4 |
|
Output via utils.draw_table()
The utils draw_table()
function provides an easy yet feature-rich way to
draw dynamic output tables to screen. The table’s contents will be automatically
wrapped to the terminal size, the text can be colorized, and more.
from pavilion import utils
import sys
# The table data is expected as a list of dictionaries with identical keys.
# Not all dictionary fields will necessarily be used. Commands will
# typically generate the rows dynamically...
rows = [
{'color': 'BLACK', 'code': 30, 'usage': 'Default'},
{'color': 'RED', 'code': 31, 'usage': 'Fatal Errors'},
{'color': 'GREEN', 'code': 32, 'usage': 'Warnings'},
{'color': 'YELLOW', 'code': 33, 'usage': 'Discouraged'},
{'color': 'BLUE', 'code': 34, 'usage': 'Info'}
]
# The data columns to print (and their default column labels).
columns = ['color', 'usage']
utils.draw_table(
outfile=sys.stdout,
field_info={},
fields=columns,
rows=rows)
The run
method should only take three arguments: self, pav_cfg,
args. Pav_cfg is the pavilion configuration file which holds all of the
information about pavilion, and args is the list of arguments given when
the command was run.
Args is the parsed argument object from argparse, and will contain all
your arguments as attributes args.myarg.
For example, if you wanted
to get the list of tests provided when the command was run you would
reference the list by args.tests
. Note, for flags like -s
you
get the name of the argument from the long name, i.e. --status
specifies that args.status
will hold the information required for
the status argument (in this case it will be a bool value).
When working with tests (I believe just about every command will be),
you need to remember that the arguments are strings. Because of this you
will need to import additional libraries, most importantly
from pavilion.pav_test import TestRun
, to allow yourself the ability
to access the actual test object. If you anticipate using series as well
it will also be important to add from pavilion import series
. Below
is some sample code used to generate the lists of tests provided by
args.tests
including the ability to extract those in a test series.
for test_id in args.tests:
if test_id.startswith('s'):
test_list.extend(series.TestSeries.from_id(pav_cfg,int(test_id[1:])).tests)
else:
test_list.append(test_id)
This code will populate a list of all test IDs, but they are still strings so you will need to do one of the following to get each test object.
# Using Imported Series Module
test_object_list, test_failed_list = series.test_obj_from_id(pav_cfg, test_list)
Note, when using series.test_obj_from_id
error handling is handled
for you, as it will return a tuple made up of a list of test objects,
and a list of test IDs that couldn’t be found. Because of this,
series.test_obj_from_id
is the preferred way of accessing test
objects.
Now that you have a test object you can get valuable information out of it. A test object has quite a few attributes, here are some of the more important ones:
You can access the most recent status object of the test by calling
status = test.status.current()
. This also has a few attributes that
allow you to extract relevant status information, like:
Attribute |
Detail |
---|---|
|
Returns the state of the test object. |
|
Returns the time stamp of this status object. |
|
Returns any additional collected information on the test. |
An example of using the different attributes and methods can be seen below, this is a simplified version of what is being done in the pavilion cancel command.
test_object_list, test_failed_list = series.test_obj_from_id(pav_Cfg, test_list)
for test in test_object_list:
# Requires that the schedulers module be loaded
scheduler = schedulers.get_scheduler_plugin(test.scheduler)
status = test.status.current()
if status.state != STATES.COMPLETE:
sched.cancel_job(test)