Unit Test Framework
Note: Available in v0.11.0 and later.
Overview
You can write test cases for your tool by defining the tests() function in your Tool class. Test cases defined by this function are picked up by the janisdk run-test command.
This unit test framework provides several predefined preprocessors that transform Janis execution output into a form that can be tested. For example, there is a preprocessor that reads the md5 checksum of an output file. In addition to the predefined preprocessors, the framework also allows users to define and pass their own preprocessors.
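To make the idea of a preprocessor concrete, here is a minimal sketch of what a checksum-style preprocessor might do: it takes the file path produced by a run and reduces it to a value that is easy to compare. The function name file_md5 and its chunk_size parameter are hypothetical and are not taken from the Janis source.

```python
import hashlib
import os
import tempfile


def file_md5(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of the file at `path` (hypothetical helper)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large output files are not loaded into memory at once
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage: write a small file, then reduce it to a checksum string
with tempfile.NamedTemporaryFile("wb", suffix=".txt", delete=False) as tmp:
    tmp.write(b"hello\n")
checksum = file_md5(tmp.name)
os.remove(tmp.name)
```

The resulting string can then be compared against an expected checksum with a plain equality operator.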
Define test cases
You can define multiple test cases per tool, and each test case can declare multiple expected outputs. In most cases, you only need additional test cases when they require different input data.
The classes used here, janis_core.tool.test_classes.TTestCase, janis_core.tool.test_classes.TTestExpectedOutput and janis_core.tool.test_classes.TTestPreprocessor, are declared at the bottom of this document.
import operator

from janis_core.tool.test_classes import (
    TTestCase,
    TTestExpectedOutput,
    TTestPreprocessor,
)


class BwaAligner(BioinformaticsWorkflow):
    def id(self):
        return "BwaAligner"

    ...

    def tests(self):
        return [
            TTestCase(
                name="basic",
                input={
                    "bam": "https://some-public-container/directory/small.bam"
                },
                # Each expected output applies its own check to the "out" field
                output=[
                    TTestExpectedOutput(
                        tag="out",
                        preprocessor=TTestPreprocessor.FileMd5,
                        operator=operator.eq,
                        expected_value="dc58fe92a9bb0c897c85804758dfadbf",
                    ),
                    TTestExpectedOutput(
                        tag="out",
                        preprocessor=TTestPreprocessor.FileContent,
                        operator=operator.contains,
                        expected_value="19384 + 0 in total (QC-passed reads + QC-failed reads)",
                    ),
                    TTestExpectedOutput(
                        tag="out",
                        preprocessor=TTestPreprocessor.LineCount,
                        operator=operator.eq,
                        expected_value=13,
                    ),
                ],
            )
        ]
Run the tests
# Run a specific test case
janisdk run-test --test-case [TEST CASE NAME] [TOOL ID]
# Run ALL test cases of one tool
janisdk run-test [TOOL ID]
To run the example test case shown above:
# Run a specific test case
janisdk run-test --test-case=basic BwaAligner
# Run all test cases
janisdk run-test BwaAligner
Test Files
There are two ways to store your test files (input and expected output files):
Remote HTTP files:
You can use a publicly accessible HTTP link, for example https://some-public-container/directory/small.bam.
- Input files are downloaded to the cache folder ~/.janis/remote_file_cache. This is the same directory where files are cached when you run janis run.
- Expected output files, however, are cached in the test directory [WORKING DIRECTORY WHERE janisdk run-test is run]/tests_output/cached_test_files/.
If the same URL is found in the cache directory, the files are not re-downloaded unless the Last-Modified HTTP header has changed. To force the files to be re-downloaded, remove them from the cache directories.
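The caching rule described above can be sketched as a small decision function. This is only an illustration under stated assumptions: the function name needs_download and the dictionary that maps each URL to its stored Last-Modified value are hypothetical, and the real cache layout used by Janis may differ.

```python
from typing import Dict, Optional


def needs_download(
    url: str,
    remote_last_modified: Optional[str],
    cache: Dict[str, Optional[str]],
) -> bool:
    """Decide whether a remote test file should be fetched again.

    `cache` is a hypothetical mapping from URL to the Last-Modified header
    value that was recorded when the file was last downloaded.
    """
    if url not in cache:
        # Nothing cached for this URL yet
        return True
    # Re-download only when the header value differs from the recorded one
    return cache[url] != remote_last_modified


url = "https://some-public-container/directory/small.bam"
cache = {url: "Wed, 21 Oct 2015 07:28:00 GMT"}

unchanged = needs_download(url, "Wed, 21 Oct 2015 07:28:00 GMT", cache)
changed = needs_download(url, "Thu, 22 Oct 2015 09:00:00 GMT", cache)
```

Deleting the entry for a URL (or, in practice, the cached file) makes the first branch apply, which corresponds to forcing a fresh download.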
Local test files:
You can store your files in a local directory named test_data. There are a few different places where you can put this directory.
Examples from the janis-bioinformatics project:
- A test_data folder that contains files shared by multiple tools can be located at janis_bioinformatics/tools/test_data. To access files in this directory, you can call os.path.join(BioinformaticsTool.test_data_path(), "small.bam").
- A test_data folder that contains files used by flagstat can be located at janis_bioinformatics/tools/samtools/flagstat/test_data. To access files in this directory from within the SamToolsFlagstatBase class, you can call os.path.join(self.test_data_path(), "small.bam").
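As a rough illustration of how such a path is composed, the sketch below uses a hypothetical stand-in class. ExampleTool, its module_dir attribute, and the body of test_data_path are assumptions made for this example; they do not show how Janis resolves the directory internally.

```python
import os


class ExampleTool:
    """Hypothetical stand-in for a tool class; not the Janis implementation."""

    # Assumed directory of the module that defines the tool
    module_dir = os.path.join("janis_bioinformatics", "tools", "samtools", "flagstat")

    @classmethod
    def test_data_path(cls) -> str:
        # Assumption: test data lives in a `test_data` folder beside the tool module
        return os.path.join(cls.module_dir, "test_data")


# Build the path of one test file, as you would for a test case input
small_bam = os.path.join(ExampleTool.test_data_path(), "small.bam")
```

The resulting string can be used as an input value in a test case in the same way as a remote URL.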
Preprocessors and Comparison Operators
TTestExpectedOutput.preprocessor is used to reformat the Tool output.
TTestExpectedOutput.operator is used to compare the preprocessed output value with the expected output.
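The two steps can be pictured as a small pipeline: the preprocessor transforms the raw output value, and the operator then compares the result with the expected value. The sketch below is not the framework's own code; check_output and line_count are hypothetical names, and the argument order (output value first, expected value second) is inferred from the use of operator.contains in the example test case.

```python
import operator
from typing import Any, Callable


def check_output(
    value: Any,
    preprocessor: Callable[[Any], Any],
    compare: Callable[[Any, Any], bool],
    expected: Any,
) -> bool:
    """Apply the preprocessor, then compare the result with the expected value."""
    processed = preprocessor(value)
    return compare(processed, expected)


def line_count(text: str) -> int:
    """Hypothetical preprocessor: count the lines in a string."""
    return len(text.splitlines())


report = "19384 + 0 in total\n0 + 0 secondary\n"

# Same output value, two different checks
count_ok = check_output(report, line_count, operator.eq, 2)
contains_ok = check_output(report, lambda v: v, operator.contains, "19384 + 0 in total")
```

Keeping the transformation and the comparison separate is what lets one output field be tested in several ways within a single test case.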
Predefined Preprocessors
- Value: No preprocessing; the value as output by Janis, e.g. an integer, a string, or a file path for a File type output.
- FileDiff: The differences between two files as output by difflib.unified_diff. This can only be applied to File type output. If this preprocessor is used, TTestExpectedOutput.file_diff_source must be provided. file_diff_source must contain the path of the file to compare the output file with.
- LinesDiff: Number of different lines between two files as a tuple (additions, deletions), as computed by the FileDiff preprocessor. If this preprocessor is used, TTestExpectedOutput.file_diff_source must be provided. file_diff_source must contain the path of the file to compare the output file with.
- FileContent: Extract the file content. This can only be applied to File type output.
- FileExists: Check if a file exists. It returns a True/False value. This can only be applied to File type output.
- FileSize: File size in bytes. This can only be applied to File type output.
- FileMd5: Md5 checksum of a file. This can only be applied to File type output.
- LineCount: Count the number of lines in a string or in a file.
- ListSize: Count the number of items in a list. This can only be applied to Array type output.
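To illustrate what a LinesDiff-style result looks like, here is a minimal sketch that counts added and removed lines in the output of difflib.unified_diff. The function name lines_diff and the choice of which input counts as the "from" side are assumptions; the framework's own implementation may differ in both.

```python
import difflib
from itertools import islice
from typing import List, Tuple


def lines_diff(source_lines: List[str], output_lines: List[str]) -> Tuple[int, int]:
    """Return (additions, deletions) between two lists of lines (hypothetical helper)."""
    diff = difflib.unified_diff(source_lines, output_lines, lineterm="")
    additions = 0
    deletions = 0
    # The first two lines are the "---" / "+++" file headers, not content
    for line in islice(diff, 2, None):
        if line.startswith("+"):
            additions += 1
        elif line.startswith("-"):
            deletions += 1
    return additions, deletions


# "b" is replaced by "x" and "d" is appended: two additions, one deletion
result = lines_diff(["a", "b", "c"], ["a", "x", "c", "d"])
```

A tuple like this pairs naturally with operator.eq and an expected value of (0, 0) when the two files should be identical.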
Custom preprocessor example:
In the example below, we are testing a tool that has an output field named out. The output value of this field is a file path that points to a BAM file, and we want to test the flagstat value of this BAM file.
Here, we define a custom preprocessor function that takes a file path as input and returns a string containing the flagstat value of the BAM file.
TTestExpectedOutput.expected_file simply points to a file that contains the expected output value. You can also replace it with TTestExpectedOutput.expected_value="19384 + 0 in total (QC-passed reads + QC-failed reads)\n ..."
TTestExpectedOutput(
    tag="out",
    preprocessor=Bam.flagstat,
    operator=operator.eq,
    expected_file="https://some-public-container/directory/flagstat.txt",
)

import subprocess


class Bam(File):
    ...

    @classmethod
    def flagstat(cls, file_path: str):
        # Run `samtools flagstat` on the BAM file and capture its output as text
        command = ["samtools", "flagstat", file_path]
        result = subprocess.run(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )

        # Any output on stderr is treated as a failure
        if result.stderr:
            raise Exception(result.stderr)

        return result.stdout
Custom operator example:
In this example, we also want to test the flagstat value of the BAM file returned by the out output field.
Here, instead of writing a custom preprocessor, we write a custom operator that takes two file paths and compares the flagstat output of the two files.
TTestExpectedOutput(
    tag="out",
    preprocessor=TTestPreprocessor.Value,
    operator=Bam.equal,
    expected_value="https://some-public-container/directory/small.bam",
)

class Bam(File):
    ...

    @classmethod
    def equal(cls, file_path_1: str, file_path_2: str):
        flagstat1 = cls.flagstat(file_path_1)
        flagstat2 = cls.flagstat(file_path_2)

        return flagstat1 == flagstat2
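A custom operator does not have to wrap an external tool. Any callable that takes two values and returns a boolean fits the declared operator type. As a further sketch, the hypothetical factory below builds a tolerance-based comparison that could be paired with a numeric preprocessor such as FileSize. The name approx_equal is an assumption, and so is the argument order (output value first, expected value second), which is inferred from the use of operator.contains earlier in this document.

```python
import math
from typing import Callable


def approx_equal(rel_tol: float) -> Callable[[float, float], bool]:
    """Build a two-argument comparison that accepts values within `rel_tol`."""

    def compare(actual: float, expected: float) -> bool:
        # math.isclose scales the tolerance by the larger of the two magnitudes
        return math.isclose(actual, expected, rel_tol=rel_tol)

    return compare


within_5_percent = approx_equal(0.05)

close_enough = within_5_percent(1020, 1000)
too_far = within_5_percent(1100, 1000)
```

Using a factory keeps the operator itself at two arguments while still letting each expected output choose its own tolerance.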
Declaration
- class janis_core.tool.test_classes.TTestCase(name: str, input: Dict[str, Any], output: List[janis_core.tool.test_classes.TTestExpectedOutput])
  A test case requires a workflow or tool to be run once (per engine). However, it can have multiple outputs, each applying different test logic.
- class janis_core.tool.test_classes.TTestExpectedOutput(tag: str, preprocessor: Union[janis_core.tool.test_classes.TTestPreprocessor, Callable[[Any], Any]], operator: Callable[[Any, Any], bool], expected_value: Optional[Any] = None, expected_file: Optional[str] = None, file_diff_source: Optional[str] = None, array_index: Optional[int] = None, suffix_secondary_file: Optional[str] = None, preprocessor_params: Optional[Dict[KT, VT]] = {})
  Describes the logic used to test the expected output of a test case. A test case can have multiple instances of this class to test different outputs, or different logic on the same output.