Scattering

Improving workflow performance with embarrassingly parallel tasks

Janis supports scattering by field when constructing a step with janis.Workflow.step(), through the scatter parameter, which accepts either a field name or a janis.ScatterDescription (scatter=Union[str, janis.ScatterDescription]).

class janis.ScatterDescription(fields: List[str], method: janis_core.utils.scatter.ScatterMethod = None, labels: Union[janis_core.operators.selectors.Selector, List[str]] = None)

Class for keeping track of scatter information

__init__(fields: List[str], method: janis_core.utils.scatter.ScatterMethod = None, labels: Union[janis_core.operators.selectors.Selector, List[str]] = None)
Parameters:
  • fields – The fields of the tool that should be scattered on.
  • method (ScatterMethod) – The method used to combine the arrays when scattering on more than one field (for example, ScatterMethods.dot).
  • labels – (JANIS ONLY) -
janis.ScatterMethods

alias of janis_core.utils.scatter.ScatterMethod

Simple scatter

To scatter by a single field, you can simply provide the scatter="fieldname" parameter to the janis.Workflow.step() method.

For example, let's presume you have the tool MyTool, which accepts a single string input on the inp1 field. Connecting an array of strings to inp1 and scattering on that field runs MyTool once per element.

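# import location assumed; MyTool is your own tool definition
from janis_core import Workflow, Array, String, ScatterDescription

w = Workflow("mywf")
w.input("arrayInp", Array(String))
# scatter on the "inp1" field: one MyTool job per element of arrayInp
w.step("stp", MyTool(inp1=w.arrayInp), scatter="inp1")
# equivalent to (use one form or the other, not both)
w.step("stp", MyTool(inp1=w.arrayInp), scatter=ScatterDescription(fields=["inp1"]))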
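
To build intuition for what a single-field scatter does, here is a minimal plain-Python sketch. It does not use Janis and is not how Janis executes a workflow, since the actual scheduling is left to whichever engine runs the workflow. The function my_tool is a hypothetical stand-in for MyTool, and scatter_single is an illustrative helper, not a Janis function.

```python
from concurrent.futures import ThreadPoolExecutor


def my_tool(inp1: str) -> str:
    """Hypothetical stand-in for MyTool: one string in, one string out."""
    return inp1.upper()


def scatter_single(tool, values):
    """Run `tool` once per element of `values`, keeping input order.

    Each call is independent of the others, which is what makes the
    task embarrassingly parallel.
    """
    with ThreadPoolExecutor() as pool:
        return list(pool.map(tool, values))


array_inp = ["a", "b", "c"]
results = scatter_single(my_tool, array_inp)
print(results)  # ['A', 'B', 'C']
```

The step's output becomes an array with one result per input element, in the same order as the scattered input.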

Scattering by more than one field

Janis supports scattering by multiple fields using the dot and cross methods. To do so, pass a janis.ScatterDescription that lists the fields and selects a method from janis.ScatterMethods:

Example:

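from janis import ScatterDescription, ScatterMethods
# OR
from janis_core import ScatterDescription, ScatterMethods
# import location assumed for the workflow and type classes used below
from janis_core import Workflow, Array, String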

w = Workflow("mywf")
w.input("arrayInp1", Array(String))
w.input("arrayInp2", Array(String))
w.step(
  "stp",
  MyTool(inp1=w.arrayInp1, inp2=w.arrayInp2),
  scatter=ScatterDescription(fields=["inp1", "inp2"], method=ScatterMethods.dot)
)
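
The difference between the two multi-field methods can be sketched in plain Python. This is an illustration of the usual dot-product and cross-product semantics, not Janis code: dot_pairs and cross_pairs are hypothetical helpers, and the equal-length check for the dot method is an assumption based on how dot-product scattering is commonly defined.

```python
from itertools import product


def dot_pairs(xs, ys):
    """Dot scatter: pair elements position by position.

    Assumes both arrays have the same length.
    """
    if len(xs) != len(ys):
        raise ValueError("dot scatter expects arrays of equal length")
    return list(zip(xs, ys))


def cross_pairs(xs, ys):
    """Cross scatter: every element of xs with every element of ys."""
    return list(product(xs, ys))


array_inp1 = ["a", "b"]
array_inp2 = ["x", "y"]

print(dot_pairs(array_inp1, array_inp2))
# [('a', 'x'), ('b', 'y')]
print(cross_pairs(array_inp1, array_inp2))
# [('a', 'x'), ('a', 'y'), ('b', 'x'), ('b', 'y')]
```

With two arrays of length n, a dot scatter yields n jobs while a cross scatter yields n × n, so the choice of method has a direct effect on how much work the step schedules.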