Test Results¶
Every successful test run generates a set of results in JSON. These are
saved with the test, but are also logged to a central results.log
file that is formatted in a Splunk compatible manner.
This page covers how results are gathered for each test run, as well as how to use the results system to pull results from test output.
Gathering Results¶
Result gathering steps:
- Generate the base result field values.
- Use result parser plugins to parse values from result files.
- Use result evaluations to modify results or generate additional values.
- Convert the ‘result’ key from True/False to ‘PASS’/’FAIL’.
All results are stored in a json mapping that looks like this:
{"avg_speed": 3.7999,
"command_test": "PASS",
"created": "2020-07-06 15:28:08.528899",
"duration": "0:00:06.130584",
"finished": "2020-07-06 15:28:08.528617",
"id": 16176,
"job_id": "123",
"name": "mytests.base",
"pav_result_errors": [],
"result": "PASS",
"return_value": 0,
"sched": {"avail_mem": "80445",
"cpus": "36",
"free_mem": "76261",
"min_cpus": "36",
"min_mem": "131933.144",
"total_mem": "125821"},
"started": "2020-07-06 15:28:02.398033",
"sys_name": "pav-test",
"n": {
"node01": {"raw_speed": 33},
"node03": {"raw_speed": 39},
"node05": {"raw_speed": 42}
}
"user": "bob"}
- Most values are stored at the top level of the mapping.
- Most scheduler variables are included, to improve tracking of how exactly the test ran.
pav_result_errors
stores a list of errors encountered when gathering results.- The ‘n’ and ‘fn’ sections store per_file information. This can be used to gather separate per-node information in cases where you have an output file for each node when using result parsers.
- Average speed in this case is a value calculated from each of the node speeds using result evaluations.
The Results Log¶
The results log records every step of the results gathering process in great
detail. It is stored in each test run directory in results.log
, and is
available via the pav log results <test_id>
command.
The general results log keeps a record of the results of every test run under
your instance of Pavilion. It’s location is configured via the general
pavilion.yaml
config, but defaults to residing in the working directory.
It’s format is designed to be easily read by Splunk and similar tools.
Basic Result Keys¶
These keys are present in the results for every test, whether the test
passed or failed. To see the latest list of base result values, run
pav show result_base
. All of these keys, as well as ‘result’, are reserved.
$ pav show result_base
Name | Doc
-------------------+-------------------------------------------------
name | The test run name
id | The test run id
created | When the test was created.
started | When the test run itself started.
finished | When the test run finished.
duration | Duration of the test run (finished - started)
user | The user that started the test.
job_id | The scheduler plugin's jobid for the test.
sched | Most of the scheduler variables.
sys_name | The system name '{{sys.sys_name}}'
pav_result_errors | Errors from processing results.
n | Per file results (the filename sans extension).
fn | Per filename results.
return_value | The return value of run.sh
All time fields are in ISO8601 format.
Basic Result Parsing¶
The result_parse
section of the config lets you configure result parsers
to pull useful information out of files generated by your test run.
Result parsers are plugins. Pavilion comes with several that are useful in
common scenarios, but tests with more complex results may warrant the
development of a custom result parser to specially handle that test’s output.
To see the currently available result parsers, use pav show parsers
.
$ pav show result_parsers
Available Result Parsers
-----------+----------------------------------------------------------
Name | Description
-----------+----------------------------------------------------------
command | Runs a given command.
filecheck | Checks the working directory for a given file. The
| parser will tell the user if the filename exists or not.
constant | Set a constant as result.
table | Parses tables.
regex | Find matches to the given regex in the given file. The
| matched string or strings are returned as the result.
Each of these is configured using its own configuration section under
result parse
. To see the full configuration documentation for a parser,
use pav show result_parsers --config <parser_name>
.
mytest:
scheduler: raw
run:
cmds:
- ping -c 10 google.com
result_parse:
# The results.parse section is comprised of configs for result parsers,
# identified by name. Each parser type section is further comprised
# of a mapping of key to
regex:
# Each result parser can have multiple configs.
# The value matched will be stored in this key
loss:
# This tells the regex parser what regular expression to use.
# Single quotes are recommended, as they are literal in yaml.
regex: '\d+% packet loss'
- Each result parser type can define multiple keys.
- The found value is stored at the top level of the result JSON (by default).
- All result parser keys must be unique.
- For more information on result parsers, see Result Parsers
Result Evaluations¶
Result evaluations allow you to perform math operations or call various functions on your results to further process them. This allows you to perform normalization, set constants, calculate averages, and other useful operations. The syntax is identical to Expressions (the part inside the double curly braces), except variables are looked up from the results rather than Pavilion variables.
eval_example:
run:
cmds:
- time wget google.com
# This will produce results that include the line:
# real 0m3.256s
result_parse:
regex:
# Parse out the minutes and seconds
real_m:
regex: 'real\s+(\d+)m'
real_s:
regex: 'real\s\d+m([0-9.]+)s'
result_evaluate:
# combine the minutes and seconds that we parsed out
# into one number of seconds.
real: 'real_m * 60 + real_s'
# These can reference values set by other evaluations.
real_hours: 'real/60/60'
- Result evaluations may reference each other. Order doesn’t matter.
- Values are automatically treated as the type that most resemble. This
conversion is applied both to constants and the contents of variables.
- 3.59 -> float
- 3 -> int
- True -> bool
- “hello” -> string
- hello -> variable
- You can refer to deeply nested values using dot notation:
- “table.col.row” - Would refer to the ‘row’ key of the ‘col’ dict which is itself a key in the ‘table’ dict.
- “sched.test_node_list.5” - Would refer to the fifth item in the node list.
Complex Variable References¶
The result JSON isn’t necessarily a flat dictionary. It may contain lists, other dictionaries, lists of dicts containing lists, and so on. To get to deeper keys, you simply provide each step in the path in a dot-seperated name.
For example, given a result structure that looks like:
{
"name": "example_results",
"table":
"node1": {"speed": 30.0, "items": 5},
"node2": {"speed": 50.0, "items": 4},
"node3": {"speed": 70.0, "items": 3}
"sched": {
"test_nodes": 3,
"test_node_list": ["node1", "node2", "node3"],
}
- “table.node1.speed” - Would refer to the ‘speed’ key of the ‘node1’ dict which
is itself a key in the ‘table’ dict. (It has value
30.0
). - “sched.test_node_list.1” - Would refer to the second item in the node list. (‘node2’ in this case)
Pulling Lists of Values¶
You can also use a single ‘*’ in a variable name to return a list of every matching value.
- “table.*.speed” -> [30.0, 50.0, 70.0]
- “test_node_list.*” -> [“node1”, “node2”, “node3”]
- Which is the same as “test_node_list” by itself, actually.
To get the keys of a dictionary, use the keys
function. The keys are
guaranteed to be in the same order as the values produced when using a ‘*’.
A Complex Example¶
Given a test that produces values per node and gathers them using the result parser per_file: Manipulating Multiple File Results feature, you may want to use all of those values to calculate an average or outliers.
multi_val:
slurm:
num_nodes: all
run:
# Get the root filesystem usage per node.
cmds: '{{test_cmd}} -o "%N.out" df /'
result_parse:
regex:
used:
regex: '^rootfs\s+\d+\s+(\d+)'
files: '*.out'
per_file: 'name'
- This will give us a
<node_name>.out
file for each node with the command output. - The result parser will parse out the 3rd column from the ‘rootfs’ line from each of these files.
- The ‘per_file’ option of ‘name’ will store these results in the ‘n’ dictionary by the root filename.
The results will look like:
{
"name": "examples.multi_val",
"n": {
"node01": {"used": "709248"},
"node03": {"used": "802218"},
"node04": {"used": "699320"},
"node05": {"used": "2030531"},
},
"etc": "...",
}
We could then add the following to our test config to perform calculations on these values.
multi_val:
# ...
result_evaluate:
mean_used: 'avg(n.*.used)'
outlier_data: 'outliers(n.*.used, keys(n), 1.5)'
outliers: 'keys(outlier_data)'
Would give us results like:
{
"name": "examples.multi_val",
"n": {
"node01": {"used": "709248"},
"node03": {"used": "802218"},
"node04": {"used": "699320"},
"node05": {"used": "2030531"},
},
"mean_used": 1060329.25,
"outlier_data": {"node05": 1.7276},
"outliers": {"node05"},
}
String Expressions in Result Evaluations¶
Pavilion variables and string expressions (‘{{ stuff }}’) can be used, carefully, in result evaluations. Keep in mind that they are evaluated as a separate step (using pavilion variables), and will become the string that is later evaluated in result evaluations.
expr_eval:
variables:
num_var: 'num'
num: 8
base: 2
message: "hello world"
result_evaluate:
# Set the result 'num' key to 52
num: '50'
half_num: '{{num_var}}/2'
# After strings are resolved, this will become:
# 'num/2'
# This will then be evaluated, and the 'num' result value will be
# used (50).
# The result 'half_num' key will thus be 25. (ie: ``50/2`` )
# If we want to include a Pavilion variable as a string, it must
# be put in quotes.
msg_len: 'len("{{message}}")'
# This will become: 'len("hello world")'
# Which will evaluate to ``11``.
# You can actually include more complex expressions in both
# the string expression and the evaluation, but this should be
# avoided
complicated: '(num * {{ base^10 }})/100'
# The value string resolves to: '(num * 1024)/100'
# Which evaluates to: ``(50 * 1024)/100`` -> ``512.0``
The Test Result¶
The ‘result’ key denotes the final test result, and will always be either ‘PASS’, ‘FAIL’ or ‘ERROR’. ERROR in this case means the test had a non-recoverable error when checking whether the test passed or failed.
You can set the ‘result’ using either result parsers or result evaluations. It must be set as a single True or False value.
- For result parsers, this means you should use an ‘action’ of ‘store_true’ (the default) or ‘store_false’ (See Actions). You will also need to use a ‘per_file’ setting that produces a single value, like ‘first’ or ‘all’ (See per_file: Manipulating Multiple File Results).
- For result evaluations this simply means ensuring that the evaluation returns a boolean, typically by way of a comparison operator.
If you don’t set the ‘result’ key yourself, Pavilion defaults to adding the
evaluation: result: 'return_value == 0'
. This is why, by default,
Pavilion test runs PASS if the run script returns 0.
Result Parsers¶
The result_parse
section of each test config lets us configure additional
result parsers that can pull data out of test output files. By default
each parser reads from the run log, which contains the stdout and stderr
from the run script and your test.
mytest:
scheduler: raw
run:
cmds:
- ping -c 10 google.com
result_parse:
# The results.parse section is comprised of configs for result parsers,
# identified by name. Each parser can have a list of one or more
# configs, each of which will parse a different result value from
# the test output.
result_parse:
regex:
# Each result parser can have multiple configs.
# The value matched will be stored in this key
loss:
# This tells the regex parser what regular expression to use.
# Single quotes are recommended, as they are literal in yaml.
regex: '\d+% packet loss'
# We're storing this value in the result key. If it's found
# (and has a value of 'True', then the test will 'PASS'.
result:
regex: '10 received'
# The action denotes how to handle the parser's data. In this case
# a successful match will give a 'True' value.
action: store_true
The results for this test run might look like:
{
"name": "mytest",
"id": 51,
"created": "2019-06-18 16:00:35.692878-06:00",
"started": "2019-06-18 16:00:36.744221-06:00",
"finished": "2019-06-18 16:01:39.997299-06:00",
"duration": "0:01:04.304421",
"result": "PASS",
"loss": "0% packet loss"
}
Result Value Types¶
Result parsers can return any sort of json compatible value. This can be a string, number (int or float), boolean, or a complex structure that includes lists and dictionaries. Pavilion, in handling result values, groups these into a few internal categories.
- empty - An empty result is a json
null
, or an empty list. Everything else is non-empty. - match - A match is a non-empty result that is also not json
false
. - false - False is special, in that it is neither empty nor a match.
The actions and per_file sections below work with these categories when deciding how to handle result parser values.
Actions¶
We saw in the above example that we can use an action to change how the results are stored. There are several additional actions that can be selected:
- store - (Default) Simply store the result parser’s output.
- store_true - Store
true
if the result is a match (non-empty and not false) - store_false - Stores
true
if the result is not a match. - count - Count the length of list matches, regardless of contents. Non-list matches are 1 if a match, 0 otherwise.
Files¶
By default, each result parser reads through the test’s run.log
file. You can specify a different file, a file glob, or even multiple
file globs to match an assortment of files. The files are parsed in the
order given.
If you need to reference the run log in addition to other files, it is
one directory up from the test’s run directory, in ../run.log
.
This test runs across a bunch of nodes, and produces an output file for
each. The regex parser runs across each of these, and (because it
defaults to returning the first found item only) returns that item or
null
for each of the files found. What it does with those values
depends on the per_file attribute for the result parser.
hugetlb_check:
scheduler: slurm
slurm:
num_nodes: 4
run:
cmds:
# Use the srun --output option to specify that results are
# to be written to separate files.
- {{sched.test_cmd}} --output="%N.out" env
result_parse:
regex:
# The matched values will be stored under the 'huge_size' key,
# but that will vary based on the 'per_file' value.
huge_size:
regex: 'HUGETLB_DEFAULT_PAGE_SIZE=(.+)'
# Run the parser against all files that end in .out
files: '*.out'
per_file: # We'll demonstrate these settings below
per_file: Manipulating Multiple File Results¶
The per_file option lets you manipulate how results are stored on a file-by-file basis. Since the choice here will have a drastic effect on your results, we’ll demonstrate each from the standpoint of the test config above.
Let’s say the test ran on four nodes (node1, node2, node3, and node4), but only node2 and node3 found a match. The results would be:
- node1 -
<null>
- node2 -
2M
- node3 -
4K
- node4 -
<null>
first - Keep the first result (Default)¶
result_parse:
regex:
huge_size:
regex: 'HUGETLB_DEFAULT_PAGE_SIZE=(.+)'
files: '*.out'
per_file: first
Only the result from the first file with a match is kept. In this case, the value from node1 would be ignored in favor of that of node2. The results would contain:
{
"huge_size": "2M"
}
In the simple case of only specifying one file, the ‘first’ result is the only result. That’s why this is the default; the first is all you normally need.
last - Keep the last result¶
result_parse:
regex:
huge_size:
regex: 'HUGETLB_DEFAULT_PAGE_SIZE=(.+)'
files: '*.out'
per_file: last
Just like ‘first’, except we work backwards through the files and get the last match value. In this case, that means ignoring node4’s result (because it is null) and taking node3’s:
{
"huge_size": "4K",
}
all - True if each file returned a True result¶
result_parse:
regex:
huge_size:
regex: 'HUGETLB_DEFAULT_PAGE_SIZE=(.+)'
files: '*.out'
per_file: all
By itself, ‘all’ sets the key to True if the result values for all
the files evaluate to True. Setting action: store_true
produces more
predictable results.
value | t/f value | action: store_true | |
---|---|---|---|
No result | <null> |
false | false |
Non-empty strings | '2M' |
true | true |
Empty strings | '' |
false | true |
Non-zero numbers | 5 |
true | true |
Zero | 0 |
false | true |
Literal true | true |
true | true |
Literal false | false |
false | false |
In our example, the result is false
because some of our files had no matches
(a <null>
result).
{
"huge_size": false,
}
any - True if any file returned a True result¶
result_parse:
regex:
huge_size:
regex: 'HUGETLB_DEFAULT_PAGE_SIZE=(.+)'
files: '*.out'
per_file: any
Like ‘all’, but is true
if any of the results evaluates to True. In
the case of our example, since at least one file matched, the key will be
set to ‘true’
{
"huge_size": true,
}
list - Merge the file results into a single list¶
result_parse:
regex:
huge_size:
regex: 'HUGETLB_DEFAULT_PAGE_SIZE=(.+)'
files: '*.out'
per_file: list
For each result from each file, add them into a single list. empty
values are not added, but false
is. If the result value is a list
already, then each of the values in the list is added.
{
"huge_size": ["2M", "4K"],
}
fullname - Stores in a filename based dict.¶
result_parse:
regex:
huge_size:
regex: 'HUGETLB_DEFAULT_PAGE_SIZE=(.+)'
files: '*.out'
per_file: fullname
Put the result under the key, but in a dictionary specific to that file. All
the file specific dictionaries are stored under the fn
key by filename.
{
"fn": {
"node1.out": {"huge_size": null},
"node2.out": {"huge_size": "2M"},
"node3.out": {"huge_size": "4K"},
"node4.out": {"huge_size": null}
}
}
- When using the fullname per_file setting, the key cannot be
result
. - The rest of the file’s path is ignored, so there is potential for
file name collisions, as the same filename could exist in multiple
places. Pavilion will report such collisions in the results under the
error
key.
name - Stores in a filename (without extension) based dict.¶
result_parse:
regex:
huge_size:
regex: 'HUGETLB_DEFAULT_PAGE_SIZE=(.+)'
files: '*.out'
per_file: fullname
Just like fullname, but instead the file name with the file extension
removed. These are stored under the n
key in the results.
{
"n": {
"node1": {"huge_size": null},
"node2": {"huge_size": "2M"},
"node3": {"huge_size": "4K"},
"node4": {"huge_size": null}
}
}
fullname_list - Stores the name of the files that matched.¶
result_parse:
regex:
huge_size:
regex: 'HUGETLB_DEFAULT_PAGE_SIZE=(.+)'
files: '*.out'
per_file: fullname_list
Stores a list of the names of the files that matched. The actual matched values aren’t saved.
{
"huge_size": ["node2.out", "node3.out"],
}
name_list - Stores the name of the files that matched.¶
result_parse:
regex:
huge_size:
regex: 'HUGETLB_DEFAULT_PAGE_SIZE=(.+)'
files: '*.out'
per_file: name_list
Stores a list of the names of the files that matched, minus extension. The actual matched values aren’t saved.
{
"huge_size": ["node2", "node3"],
}
Errors¶
If an error occurs when parsing results that can be recovered from, a
description of the error is recorded under the error
key. Each of
these is a dictionary with some useful values:
{
"errors": [{
# The error happened under this parser.
"result_parser": "regex",
# The file being processed.
"file": "node3.out",
# The key being processed
"key": "hugetlb",
"msg": "Error reading file 'node3.out': Permission error"
}]
}