Pavilion Result Parser Plugins

This is an overview of how to write Pavilion Result Parser plugins. It assumes you’ve already read Plugin Basics. You should also read up on how to use Result Parsers.

Writing Result Parser Plugins

If you’re familiar with using result parsers, you’ll know that they take additional arguments like ‘per_file’ and ‘action’, and can accept multiple files. None of these, nor even the key under which the result will be stored, is exposed to the result parser itself.

A result parser is essentially a function that takes a pre-opened file object (automatically advanced to points of interest), plus any arguments it specifically needs, processes that file, and returns some result data or structure.
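As a rough illustration of that shape, independent of Pavilion's plugin machinery, the sketch below is a bare function that takes a file object plus one argument of its own. The names first_word and strip_chars are hypothetical, and io.StringIO stands in for the pre-opened file that Pavilion would supply.

```python
import io


def first_word(file, strip_chars=None):
    """Return the first word on the next line of 'file', or None."""
    words = file.readline().split()
    if not words:
        return None
    # str.strip(None) strips whitespace, so the default is harmless.
    return words[0].strip(strip_chars)


# io.StringIO stands in for the file object Pavilion would supply.
log = io.StringIO("status: ok\nruntime: 12.5\n")
first = first_word(log, strip_chars=':')   # 'status'
second = first_word(log)                   # 'runtime:' (reads the next line)
```

Each call picks up where the previous read left off, which is why the second call sees the second line.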

They also have to provide a way to validate their arguments (to catch errors early) and define what those arguments are.

Result Parser Class

While the result parsing functionality is just a function, you still have to define the result parser as a Yapsy plugin class as detailed in Plugin Basics. You must give your parser a name, should give it a description, and can give it a priority. We’ll use a modified version of the split parser as an example:

import yaml_config as yc
# Remember to not import ResultParser directly to avoid yapsy confusion.
from pavilion.result import parsers, ResultError

class Split(parsers.ResultParser):
    """Split a line by some substring, and return the list of parts."""

    def __init__(self):
        super().__init__(
            name='split',
            description="Split by a substring, and return the whitespace "
                        "stripped parts.",
            # This adds a 'sep' configuration option to the test_config
            # format.
            config_elems=[
                yc.StrElem(
                    'sep',
                    help_text="The substring to split by. Default is "
                              "to split by whitespace.")],
            # Set the default value for each argument (optional)
            # (The real 'split' parser doesn't do this)
            defaults={
                'sep': '',
            },
            # Set a validator for sep. In this case only allow these three
            # strings. (The real 'split' allows any string)
            validators={
                'sep': (',', '', ':')
            }

        )

    def __call__(self, file, sep=None):
        """Simply use the split string method to split"""

        sep = None if sep == '' else sep

        line = file.readline().strip()

        return [part.strip() for part in line.split(sep)]

Additional Arguments

Result parsers use a few additional properties to tell Pavilion how to work with them.

Arguments (config_elems)

The arguments to your parser are actually configuration items within the Pavilion test config format. By adding a result parser, you add a new section that can appear under result_parse in your test configs. Dynamically adding to a config like this can be complicated, but Pavilion takes care of all of the difficult bits for you.

Every result parser gets ‘action’, ‘per_file’, ‘files’, ‘match_select’, ‘preceded_by’, and ‘for_lines_matching’ added as arguments automatically, so you won’t have to add those. You also don’t have to handle them, as they’re not passed to your result parser anyway.

Configuration items are added using the yaml_config library. Each config item (or element in yaml_config speak) is defined using a yaml_config instance. There are a few rules to adding such elements that apply to Pavilion.

  • All values should be StrElem or a ListElem of StrElem instances. Pavilion expects every config value to be a string so that Pavilion variables and expressions can be used.

  • Don’t do any validation (or type conversions) here, even though yaml_config supports it.

  • Don’t set choices with yaml_config.

  • Do give the ‘help_text’ for each element.

  • Do set required elements as such with ‘required=True’.

  • The order of your arguments doesn’t matter.

Multi-Valued Config Elements

To add a config item that can take one or more values, use ListElem:

def __init__(self):
    super().__init__(
        name="example",
        description="Look for the given tokens, and set this as true if "
                    "any are found.",
        config_elems=[
            yc.ListElem(
                'tokens', sub_elem=yc.StrElem(),
                help_text="One or more tokens to look for."
            )
        ]
    )

Argument Defaults (defaults)

The ‘defaults’ __init__() argument takes a dictionary of default values for each of the result parser arguments. Always give these as strings compatible with your argument validation.

Argument Validators (validators)

The ‘validators’ __init__() argument takes a dictionary of validators for each of the result parser arguments. Each validator can be either a tuple of valid choices (all strings) or a function that takes a single argument and returns the validated value.

Type conversion functions, like int or float, are all valid here.

ValueError exceptions are caught during validation and reported in the results as errors; other exceptions are not. If your validation function raises other exceptions, make sure to catch and convert them into ValueErrors.
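A minimal sketch of a validator function that follows this rule is shown below. The positive_int function and the ‘count’ argument are hypothetical; only the ‘sep’ choices come from the example above. A bad string such as 'abc' already raises ValueError from int(), while the TypeError raised for a non-string value is converted explicitly.

```python
def positive_int(value):
    """Validate that a config string is a positive integer and convert it."""
    try:
        number = int(value)
    except TypeError as err:
        # int(None) raises TypeError, which is not a ValueError,
        # so convert it into one.
        raise ValueError(
            "Expected an integer string, got {!r}".format(value)) from err
    if number <= 0:
        raise ValueError("Expected a positive integer, got {}".format(number))
    return number


# The shape of a 'validators' dictionary: a tuple of allowed strings,
# or a callable such as the function above. 'count' is a made-up argument.
validators = {
    'sep': (',', '', ':'),
    'count': positive_int,
}
```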

File Handling (open_mode)

By default, your result parser function will be handed a file object that has already been opened in text (unicode) read mode. It will also be advanced to a position dictated by the ‘preceded_by’ and ‘for_lines_matching’ options.

As a result, your result parser generally only needs to read the next line of the file using file.readline(), but it is free to read more, less, or seek to other positions in the file as needed.
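For instance, a parser that reads more than one line from the current position might look like the sketch below. The next_lines function and its count argument are hypothetical, and the explicit readline() call on the io.StringIO object only simulates the file having been advanced before the parser sees it.

```python
import io


def next_lines(file, count=2):
    """Return up to 'count' stripped lines from the current file position."""
    lines = []
    for _ in range(count):
        line = file.readline()
        if not line:      # readline() returns '' at end of file
            break
        lines.append(line.strip())
    return lines


output = io.StringIO("header\nvalue 1.5\nvalue 2.5\n")
output.readline()   # Simulate the file having been advanced past 'header'.
values = next_lines(output, count=3)   # Only two lines remain.
```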

Further Validating Arguments

You can also provide a _check_args method to validate the arguments your result parser accepts. This is in addition to the ‘validators’ you passed to __init__().

  • Catch any expected exceptions (let bug related exceptions through).

    • On type conversions, catch ValueError.

    • Catch OSError on system calls or file manipulation.

    • Catch library specific errors as needed.

  • After catching those exceptions, raise a Pavilion ResultError that contains a helpful message and the erroneous value and/or the original error message.

    • Formatting works best when the error messages are included directly from the exception object, rather than simply formatting the exception itself. Mostly, this means inserting err.args[0].

    • Pavilion will extend that information so that the user can easily find where in their config the error occurred.

  • The _check_args method should take the expected arguments as keyword arguments.

  • The _check_args method should return a dictionary of the arguments with any defaults or formatting changes applied. These will be passed directly to your result parser function.

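# Module-level imports this method relies on.
import re
import pavilion.result.base

# The _check_args method for the regex parser.
def _check_args(self, **kwargs):

    try:
        re.compile(kwargs.get('regex'))
    # re.error is the exception the re module raises for a bad pattern
    # (the same class as sre_constants.error).
    except (ValueError, re.error) as err:
        raise pavilion.result.base.ResultError(
            "Invalid regular expression: {}".format(err.args[0]))

    return kwargs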

Result Parsing Function

Result parsers use the special __call__() method to define the result parser function (this lets Python call an instance of the class like a function, but that doesn’t matter here).

It must accept a file object as the first positional argument. The arguments you defined in the __init__ will be passed as keyword arguments. You can accept them using either **kwargs or by just defining them normally. If you provided a validation function, the value passed will be the value returned from that function.

def __call__(self, file, sep=None):

    line = file.readline()

    return line.split(sep)
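To show how validated values reach __call__ as keyword arguments, here is a simplified sketch. The NthField class and its ‘index’ argument are hypothetical, the class deliberately does not inherit from Pavilion's ResultParser, and the dictionary comprehension is only a stand-in for Pavilion's validation step, not its actual code.

```python
import io


class NthField:
    """Stand-in for a result parser class; not a real Pavilion plugin."""

    def __call__(self, file, index=0, sep=None):
        fields = file.readline().split(sep)
        if index >= len(fields):
            return None
        return fields[index].strip()


validators = {'index': int}
raw_args = {'index': '1', 'sep': ','}   # Config values arrive as strings.

# Simplified stand-in for validation: apply each validator, if there is one.
args = {key: validators.get(key, str)(value)
        for key, value in raw_args.items()}

parser = NthField()
result = parser(io.StringIO("a, b, c\n"), **args)
```

Because ‘index’ went through int, the parser receives the integer 1 rather than the string '1'.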

Return Value

Your result parser should return None or an empty list if nothing was found. Pavilion will evaluate this to False when using store_true.

Other than that consideration, it can return any JSON compatible structure, though you should generally keep it simple.
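As a sketch of these conventions, the hypothetical find_floats function below returns a plain list of floats when it finds any and None otherwise. It is written as a bare function rather than a full plugin class, with io.StringIO standing in for the file Pavilion would supply.

```python
import io
import json


def find_floats(file):
    """Return the floats on the next line, or None if there are none."""
    found = []
    for token in file.readline().split():
        try:
            found.append(float(token))
        except ValueError:
            continue   # Not a number; skip it.
    # An empty list is falsy, so this returns None when nothing was found.
    return found or None


hit = find_floats(io.StringIO("took 1.5 s of 10\n"))
miss = find_floats(io.StringIO("no numbers here\n"))
as_json = json.dumps(hit)   # A list of floats serializes to JSON directly.
```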