Tutorial 0 - Introduction to Janis¶
Janis is workflow framework that uses Python to construct a declarative workflow. It has a simple workflow API within Python that you use to build your workflow. Janis converts your pipeline to the Common Workflow Language (CWL) and Workflow Description Language (WDL) for execution, and it’s also great for publishing and archiving.
Janis was designed with a few points in mind:
- Workflows should be easy to build,
- Workflows and tools must be easily shared (portable),
- Workflows should be able to execute on HPCs and cloud environments.
- Workflows should be reproducible and re-runnable.
Janis uses an abstracted execution environment, which removes the shared file system in favour of you specifiying all the files you need up front and passing them around as a File object. This allows the same workflow to be executable on your local machine, HPCs and cloud, and we let the execution engine
handle moving our files. This also means that we can use file systems like S3
, GCS
, FTP
and more without any changes to our workflow.
Instructions for setting up Janis on a compute cluster are under construction.
Requirements¶
- Local environment
- Python 3.6+
- Docker
- Python virtualenv (
pip3 install virtualenv
)
NB: This tutorial requires Docker to be installed and available.
Installing Janis¶
We’ll install Janis in a virtual environment as it preserves versioning of Janis in a reproducible way.
Create and activate the virtualenv:
# create virtual env virtualenv -p python3 ~/janis/env # source the virtual env source ~/janis/env/bin/activate
Install Janis through PIP:
pip install -q janis-pipelines
Test that janis was installed:
janis -v # -------------------- ------ # janis-core v0.9.7 # janis-assistant v0.9.9 # janis-unix v0.9.0 # janis-bioinformatics v0.9.5 # janis-pipelines v0.9.2 # janis-templates v0.9.4 # -------------------- ------
Installing CWLTool¶
CWLTool is a reference workflow engine for the Common Workflow Language. Janis can run your workflow using CWLTool and collect the results. For more information about which engines Janis supports, visit the Engine Support page.
pip install cwltool
Test that CWLTool has installed correctly with:
cwltool --version
# .../bin/cwltool 1.0.20190906054215
Running an example workflow with Janis¶
First off, let’s create a directory to store our janis workflows. This could be anywhere you want, but for now we’ll put it at $HOME/janis/
mkdir ~/janis
cd ~/janis
You can test run an example workflow with Janis and CWLTool with the following command:
janis run --engine cwltool -o tutorial0 hello
You’ll see the INFO
statements from CWLTool in terminal.
To see all logs, add
-d
to become:janis -d run --engine cwltool -o tutorial0 hello
At the start, we see the two lines in our output:
2020-03-16T18:49:08 [INFO]: Starting task with id = 'd909df'
d909df
This is our workflow ID (wid) and is one way we can refer to our workflow.
After the workflow has completed (or in a different window), you can see the progress of this workflow with:
janis watch d909df
# WID: d909df
# EngId: d909df
# Name: hello
# Engine: cwltool
#
# Task Dir: $HOME/janis/tutorial0
# Exec Dir: None
#
# Status: Completed
# Duration: 4s
# Start: 2020-03-16T07:49:08.367981+00:00
# Finish: 2020-03-16T07:49:11.881006+00:00
# Updated: 3h:51m:54s ago (2020-03-16T07:49:11+00:00)
#
# Jobs:
# [✓] hello (1s)
#
# Outputs:
# - out: $HOME/janis/tutorial0/out
There is a single output out
from the workflow, cat-ing this result we get:
cat $HOME/janis/tutorial0/out
# Hello, World
Overriding an input¶
The workflow hello
has one input inp
. We can override this input by passing --inp $value
onto the end of our run statement. Note the structure for workflow parameters and parameter overriding:
janis run <run options> worklowname <workflow inputs>
We can run the following command:
janis run --engine cwltool -o tutorial0-override hello --inp "Hello, $(whoami)"
# out: Hello, mfranklin
Running Janis in the background¶
You may want to run Janis in the background as it’s own process. You could do this with nohup [command] &
, however we can also run Janis with the --background
flag and capture the workflow ID to watch, eg:
wid=$(janis run \
--background --engine cwltool -o tutorial0-background \
hello \
--inp "Run in background")
janis watch $wid