Scattering
Improving workflow performance with embarrassingly parallel tasks
Janis supports scattering by field when constructing a janis.Workflow.step(), through the scatter=Union[str, janis.ScatterDescription] parameter.
class janis.ScatterDescription(fields: List[str], method: janis_core.utils.scatter.ScatterMethod = None, labels: Union[janis_core.operators.selectors.Selector, List[str]] = None)
Class for keeping track of scatter information.
__init__(fields: List[str], method: janis_core.utils.scatter.ScatterMethod = None, labels: Union[janis_core.operators.selectors.Selector, List[str]] = None)
Parameters:
- fields – The fields of the tool that should be scattered on.
- method (ScatterMethod) – The method that should be used to scatter the two arrays.
- labels – (JANIS ONLY) -
janis.ScatterMethods
alias of janis_core.utils.scatter.ScatterMethod
Simple scatter
To scatter by a single field, pass the scatter="fieldname" parameter to the janis.Workflow.step() method.
For example, let's presume you have the tool MyTool, which accepts a single string input on the inp1 field. Passing an array of strings to that field and scattering on it runs the tool once per element:
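from janis import Workflow, Array, String, ScatterDescription
# MyTool is the tool described above; it is assumed to be defined or imported elsewhere.
w = Workflow("mywf")
w.input("arrayInp", Array(String))
w.step("stp", MyTool(inp1=w.arrayInp), scatter="inp1")
# equivalent to
w.step("stp", MyTool(inp1=w.arrayInp), scatter=ScatterDescription(fields=["inp1"]))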
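As a rough mental model (plain Python, not Janis code), scattering on a single field behaves like mapping the tool over the input array and gathering the results into an array. In the sketch below, my_tool is a hypothetical stand-in for MyTool and scatter_single is a hypothetical helper; neither is part of the Janis API.

```python
from typing import Callable, List


def my_tool(inp1: str) -> str:
    """Hypothetical stand-in for MyTool: one string in, one string out."""
    return inp1.upper()


def scatter_single(tool: Callable[[str], str], values: List[str]) -> List[str]:
    """Run the tool once per element and gather the outputs into a list."""
    return [tool(value) for value in values]


array_inp = ["a", "b", "c"]
results = scatter_single(my_tool, array_inp)
print(results)  # ['A', 'B', 'C']
```

Each invocation is independent of the others, which is why a workflow engine can run them in parallel.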
Scattering by more than one field
Janis supports scattering by multiple fields using the dot and cross methods. To do this, pass a janis.ScatterDescription that lists the fields and selects a method from janis.ScatterMethods:
Example:
from janis import Workflow, Array, String
from janis import ScatterDescription, ScatterMethods
# OR
from janis_core import ScatterDescription, ScatterMethods

# MyTool is assumed to accept string inputs on the inp1 and inp2 fields.
w = Workflow("mywf")
w.input("arrayInp1", Array(String))
w.input("arrayInp2", Array(String))
w.step(
    "stp",
    MyTool(inp1=w.arrayInp1, inp2=w.arrayInp2),
    scatter=ScatterDescription(fields=["inp1", "inp2"], method=ScatterMethods.dot)
)
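The difference between the dot and cross methods can be illustrated in plain Python. This sketch assumes that dot pairs elements position by position (so the arrays must have equal length) and that cross runs every combination of elements, which is the usual dot-product and flat cross-product scatter behaviour in CWL. The names scatter_dot, scatter_cross and my_tool are hypothetical helpers for illustration, not Janis functions.

```python
from itertools import product
from typing import Callable, List


def my_tool(inp1: str, inp2: str) -> str:
    """Hypothetical stand-in for a two-input tool."""
    return f"{inp1}-{inp2}"


def scatter_dot(tool: Callable[[str, str], str], xs: List[str], ys: List[str]) -> List[str]:
    """Pair elements by position: one run per index."""
    if len(xs) != len(ys):
        raise ValueError("dot scatter requires arrays of equal length")
    return [tool(x, y) for x, y in zip(xs, ys)]


def scatter_cross(tool: Callable[[str, str], str], xs: List[str], ys: List[str]) -> List[str]:
    """Run every combination: len(xs) * len(ys) runs, flattened into one list."""
    return [tool(x, y) for x, y in product(xs, ys)]


dot_results = scatter_dot(my_tool, ["a", "b"], ["1", "2"])
cross_results = scatter_cross(my_tool, ["a", "b"], ["1", "2"])
print(dot_results)    # ['a-1', 'b-2']
print(cross_results)  # ['a-1', 'a-2', 'b-1', 'b-2']
```

With two arrays of length n, dot produces n runs while cross produces n * n, so cross can grow the number of jobs quickly.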