Scattering
Improving workflow performance with embarrassingly parallel tasks
Janis supports scattering by field when you construct a step with janis.Workflow.step(), through the scatter=Union[str, janis.ScatterDescription] parameter.
class janis.ScatterDescription(fields: List[str], method: janis_core.utils.scatter.ScatterMethod = None, labels: Union[janis_core.operators.selectors.Selector, List[str]] = None)
    Class for keeping track of scatter information.

    __init__(fields: List[str], method: janis_core.utils.scatter.ScatterMethod = None, labels: Union[janis_core.operators.selectors.Selector, List[str]] = None)
        Parameters:
        - fields – The fields of the tool that should be scattered on.
        - method (ScatterMethod) – The method that should be used to scatter the two arrays.
        - labels – (JANIS ONLY)

janis.ScatterMethods
    alias of janis_core.utils.scatter.ScatterMethod
Simple scatter
To scatter by a single field, provide the scatter="fieldname" parameter to the janis.Workflow.step() method.
For example, presume you have the tool MyTool, which accepts a single string input on the inp1 field.
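from janis_core import Workflow, Array, String, ScatterDescription

# MyTool is assumed to be defined or imported elsewhere
w = Workflow("mywf")
w.input("arrayInp", Array(String))
# Scatter the step over each element of arrayInp
w.step("stp", MyTool(inp1=w.arrayInp), scatter="inp1")
# equivalent to
w.step("stp", MyTool(inp1=w.arrayInp), scatter=ScatterDescription(fields=["inp1"]))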
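Conceptually, scattering on a single field runs the tool once per array element and gathers the results into an array in the same order. The plain-Python sketch below illustrates only that behaviour. The names my_tool and scatter_single are hypothetical stand-ins, not part of Janis, and a real execution engine may run the individual jobs in parallel.

```python
from typing import Callable, List


def my_tool(inp1: str) -> str:
    """Hypothetical stand-in for MyTool: one string in, one string out."""
    return inp1.upper()


def scatter_single(tool: Callable[..., str], field: str, values: List[str]) -> List[str]:
    """Call `tool` once per element of `values`, passing each element as `field`."""
    return [tool(**{field: value}) for value in values]


array_inp = ["a", "b", "c"]
results = scatter_single(my_tool, "inp1", array_inp)
print(results)
```

The output array has one entry per input element, which is why a step scattered on a String input is connected to an Array(String) source.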
Scattering by more than one field
Janis supports scattering by multiple fields using the dot and cross methods. To do this, you will need to use a janis.ScatterDescription and janis.ScatterMethods:
Example:
from janis import ScatterDescription, ScatterMethods
# OR
from janis_core import ScatterDescription, ScatterMethods
w = Workflow("mywf")
w.input("arrayInp1", Array(String))
w.input("arrayInp2", Array(String))
w.step(
    "stp",
    MyTool(inp1=w.arrayInp1, inp2=w.arrayInp2),
    scatter=ScatterDescription(fields=["inp1", "inp2"], method=ScatterMethods.dot)
)
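The difference between the two multi-field methods can be illustrated in plain Python. This is a sketch under the assumption that dot pairs elements by position (like CWL's dotproduct) and cross takes every combination (like a flat cross product). The helpers scatter_dot and scatter_cross are hypothetical and not part of Janis, and the way Janis nests cross-product outputs may differ.

```python
from itertools import product
from typing import List, Tuple


def scatter_dot(a: List[str], b: List[str]) -> List[Tuple[str, str]]:
    """Pair elements by position; the arrays are assumed to have equal length."""
    if len(a) != len(b):
        raise ValueError("dot scatter needs arrays of equal length")
    return list(zip(a, b))


def scatter_cross(a: List[str], b: List[str]) -> List[Tuple[str, str]]:
    """Pair every element of `a` with every element of `b`."""
    return list(product(a, b))


inp1 = ["a", "b"]
inp2 = ["x", "y"]
dot_jobs = scatter_dot(inp1, inp2)      # one job per position
cross_jobs = scatter_cross(inp1, inp2)  # one job per combination
print(len(dot_jobs), len(cross_jobs))
```

With two arrays of length n, the dot sketch yields n jobs while the cross sketch yields n * n, so the choice of method directly affects how many tasks the workflow launches.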