VcfTools: VcfConcat
VcfToolsVcfConcat
· 1 contributor · 1 version
Concatenates VCF files (for example, files split by chromosome). The input and output VCFs have the same number of columns; the script does not merge VCFs by position (see also vcf-merge).
In basic mode it does nothing beyond a sanity check that all files have the same columns. When run with the -s option (long form --merge-sort, exposed here as the mergeSort input), it performs a partial merge sort, keeping only a limited number of files open at once.
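To make the basic mode concrete, here is a minimal Python sketch of the behaviour described above: it checks that every input has the same `#CHROM` column line, keeps the header of the first input, and appends the records of the remaining inputs in the order given. It is an illustration only and not the vcf-concat implementation; the function name `concat_vcf_text` and the use of in-memory strings instead of gzipped files are assumptions made for the example.

```python
def concat_vcf_text(vcf_texts):
    """Concatenate VCF documents given as strings (hypothetical helper).

    Keeps the header of the first input, verifies that every input has the
    same #CHROM column line, and appends records in input order.
    """
    if not vcf_texts:
        raise ValueError("at least one VCF is required")

    output = []
    expected_columns = None
    for index, text in enumerate(vcf_texts):
        for line in text.splitlines():
            if line.startswith("##"):
                # Meta-information lines: keep only those from the first input.
                if index == 0:
                    output.append(line)
            elif line.startswith("#CHROM"):
                columns = line.split("\t")
                if expected_columns is None:
                    expected_columns = columns
                    output.append(line)
                elif columns != expected_columns:
                    raise ValueError(f"input {index} has different columns")
            elif line:
                # Data record: copied through unchanged, no merging by position.
                output.append(line)
    return "\n".join(output) + "\n"


chr1 = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\nchr1\t100\t.\tA\tG\n"
chr2 = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\nchr2\t50\t.\tC\tT\n"
print(concat_vcf_text([chr1, chr2]), end="")
```

Because records are appended in input order, the caller is responsible for passing the files in the desired order (for example chr1 before chr2).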
Quickstart
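from janis_core import WorkflowBuilder
from janis_bioinformatics.tools.vcftools.vcfconcat.versions import VcfToolsVcfConcat_0_1_16

wf = WorkflowBuilder("myworkflow")

wf.step(
    "vcftoolsvcfconcat_step",
    VcfToolsVcfConcat_0_1_16(
        vcfTabix=None,  # placeholder: supply an array of gzipped VCFs
    )
)
# Step outputs are referenced through the workflow object.
wf.output("out", source=wf.vcftoolsvcfconcat_step.out)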
OR
- Install Janis
- Ensure Janis is configured to work with Docker or Singularity.
- Ensure all reference files are available:
Note
More information about these inputs is available below.
- Generate user input files for VcfToolsVcfConcat:
# user inputs
janis inputs VcfToolsVcfConcat > inputs.yaml
inputs.yaml
vcfTabix:
- vcfTabix_0.vcf.gz
- vcfTabix_1.vcf.gz
- Run VcfToolsVcfConcat with:
janis run [...run options] \
--inputs inputs.yaml \
VcfToolsVcfConcat
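If the list of VCFs is long, the `vcfTabix` section of inputs.yaml can be generated rather than typed by hand. The following is a small sketch that writes the same layout as the example above; `render_inputs_yaml` is a hypothetical helper, not part of Janis, and it assumes the paths contain no characters that need YAML quoting.

```python
def render_inputs_yaml(vcf_paths):
    """Render the vcfTabix input list in the layout shown above (hypothetical helper)."""
    lines = ["vcfTabix:"]
    # One YAML sequence entry per input file, in the order they should be concatenated.
    lines.extend(f"- {path}" for path in vcf_paths)
    return "\n".join(lines) + "\n"


print(render_inputs_yaml(["vcfTabix_0.vcf.gz", "vcfTabix_1.vcf.gz"]), end="")
```

The returned string can be written to inputs.yaml and passed to `janis run` with `--inputs`.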
Information
ID: | VcfToolsVcfConcat |
---|---|
URL: | http://vcftools.sourceforge.net/perl_module.html#vcf-concat |
Versions: | 0.1.16 |
Container: | biocontainers/vcftools:v0.1.16-1-deb_cv1 |
Authors: | Jiaan Yu |
Citations: | None |
Created: | 2020-05-21 |
Updated: | 2020-05-21 |
Outputs
name | type | documentation |
---|---|---|
out | stdout<VCF> | |
Additional configuration (inputs)
name | type | prefix | position | documentation |
---|---|---|---|---|
vcfTabix | Array<Gzipped<VCF>> | | 10 | |
checkColumns | Optional<Boolean> | -c | | Do not concatenate, only check if the columns agree. |
padMissing | Optional<Boolean> | -p | | Write '.' in place of missing columns. Useful for joining chrY with the rest. |
mergeSort | Optional<Integer> | --merge-sort | | Allow small overlaps in N consecutive files. |
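The prefixes and positions in the table determine the command line that is eventually run. As a rough sketch of that mapping (mirroring the generated WDL command rather than any Janis internals), the hypothetical function below builds the argument list: boolean inputs contribute their flag only when true, `mergeSort` contributes `--merge-sort N`, and the input files come last because of their position of 10.

```python
def build_vcf_concat_command(vcf_tabix, check_columns=False, pad_missing=False, merge_sort=None):
    """Assemble a vcf-concat argument list from the inputs in the table above."""
    command = ["vcf-concat"]
    if check_columns:
        command.append("-c")
    if pad_missing:
        command.append("-p")
    if merge_sort is not None:
        command.extend(["--merge-sort", str(merge_sort)])
    # The input files have position 10, so they follow every flag.
    command.extend(vcf_tabix)
    return command


print(build_vcf_concat_command(["a.vcf.gz", "b.vcf.gz"], pad_missing=True, merge_sort=2))
# ['vcf-concat', '-p', '--merge-sort', '2', 'a.vcf.gz', 'b.vcf.gz']
```

Since the tool writes the concatenated VCF to standard output, a caller running this list directly (for example with `subprocess.run`) would need to redirect stdout to a file; that assumes vcf-concat is available on the PATH.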
Workflow Description Language
version development

task VcfToolsVcfConcat {
  input {
    Int? runtime_cpu
    Int? runtime_memory
    Int? runtime_seconds
    Int? runtime_disks
    Boolean? checkColumns
    Boolean? padMissing
    Int? mergeSort
    Array[File] vcfTabix
    Array[File] vcfTabix_tbi
  }
  command <<<
    set -e
    vcf-concat \
      ~{if (defined(checkColumns) && select_first([checkColumns])) then "-c" else ""} \
      ~{if (defined(padMissing) && select_first([padMissing])) then "-p" else ""} \
      ~{if defined(mergeSort) then ("--merge-sort " + mergeSort) else ''} \
      ~{if length(vcfTabix) > 0 then "'" + sep("' '", vcfTabix) + "'" else ""}
  >>>
  runtime {
    cpu: select_first([runtime_cpu, 1])
    disks: "local-disk ~{select_first([runtime_disks, 20])} SSD"
    docker: "biocontainers/vcftools:v0.1.16-1-deb_cv1"
    duration: select_first([runtime_seconds, 86400])
    memory: "~{select_first([runtime_memory, 4])}G"
    preemptible: 2
  }
  output {
    File out = stdout()
  }
}
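The runtime block above relies on `select_first` to fall back to a default whenever an optional `runtime_*` input is not supplied. The following Python sketch imitates that fallback so the resulting values are easy to see; `select_first` here is a stand-in written for the example, and `resolve_runtime` is a hypothetical helper that covers only the four overridable settings.

```python
def select_first(values):
    """Return the first value that is not None, in the spirit of WDL's select_first."""
    for value in values:
        if value is not None:
            return value
    raise ValueError("all values are None")


def resolve_runtime(runtime_cpu=None, runtime_memory=None, runtime_seconds=None, runtime_disks=None):
    """Apply the defaults from the WDL runtime block (hypothetical helper)."""
    return {
        "cpu": select_first([runtime_cpu, 1]),
        "memory": f"{select_first([runtime_memory, 4])}G",
        "duration": select_first([runtime_seconds, 86400]),
        "disks": f"local-disk {select_first([runtime_disks, 20])} SSD",
    }


print(resolve_runtime(runtime_memory=8))
# {'cpu': 1, 'memory': '8G', 'duration': 86400, 'disks': 'local-disk 20 SSD'}
```

With no overrides the task therefore requests 1 CPU, 4G of memory, a 20 GB local SSD, and a 86400-second (24-hour) duration.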
Common Workflow Language
#!/usr/bin/env cwl-runner
class: CommandLineTool
cwlVersion: v1.2
label: 'VcfTools: VcfConcat'
doc: |-
  Concatenates VCF files (for example split by chromosome). Note that the input and output VCFs will have the same number of columns, the script does not merge VCFs by position (see also vcf-merge).
  In the basic mode it does not do anything fancy except for a sanity check that all files have the same columns. When run with the -s option, it will perform a partial merge sort, looking at limited number of open files simultaneously.

requirements:
- class: ShellCommandRequirement
- class: InlineJavascriptRequirement
- class: DockerRequirement
  dockerPull: biocontainers/vcftools:v0.1.16-1-deb_cv1

inputs:
- id: checkColumns
  label: checkColumns
  doc: Do not concatenate, only check if the columns agree.
  type:
  - boolean
  - 'null'
  inputBinding:
    prefix: -c
- id: padMissing
  label: padMissing
  doc: Write '.' in place of missing columns. Useful for joining chrY with the rest.
  type:
  - boolean
  - 'null'
  inputBinding:
    prefix: -p
- id: mergeSort
  label: mergeSort
  doc: Allow small overlaps in N consecutive files.
  type:
  - int
  - 'null'
  inputBinding:
    prefix: --merge-sort
- id: vcfTabix
  label: vcfTabix
  type:
    type: array
    items: File
  inputBinding:
    position: 10

outputs:
- id: out
  label: out
  type: stdout
stdout: _stdout
stderr: _stderr

baseCommand:
- ''
- vcf-concat
arguments: []

hints:
- class: ToolTimeLimit
  timelimit: |-
    $([inputs.runtime_seconds, 86400].filter(function (inner) { return inner != null })[0])
id: VcfToolsVcfConcat