GATK4: GetFilterMutectCalls
Gatk4FilterMutectCalls
· 1 contributor · 6 versions
Filter variants in a Mutect2 VCF callset.
FilterMutectCalls applies filters to the raw output of Mutect2. Parameters are contained in M2FiltersArgumentCollection and described in https://github.com/broadinstitute/gatk/tree/master/docs/mutect/mutect.pdf. To filter based on sequence context artifacts, specify the --orientation-bias-artifact-priors [artifact priors tar.gz file] argument one or more times. This input is generated by LearnReadOrientationModel.
If given a --contamination-table file, e.g. results from CalculateContamination, the tool will additionally filter on contamination fractions. This argument may be specified with a table for one or more tumor samples. Alternatively, provide a numerical fraction to filter with the --contamination argument. FilterMutectCalls can also be given one or more --tumor-segmentation files, which are also output by CalculateContamination.
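As a rough illustration of how the optional filtering inputs described above map onto a command line, the following sketch assembles an argument list in Python. The helper name `build_filter_command` and its parameter names are hypothetical and are not part of GATK or Janis; only the flags themselves come from this page. Note that the Janis wrapper documented here accepts a single file per optional input, whereas the sketch allows the repeatable form described above.

```python
from typing import List, Optional, Sequence


def build_filter_command(
    vcf: str,
    reference: str,
    output: str,
    contamination_tables: Sequence[str] = (),
    segmentation_files: Sequence[str] = (),
    orientation_priors: Sequence[str] = (),
    stats: Optional[str] = None,
) -> List[str]:
    """Assemble an argument list for `gatk FilterMutectCalls` (hypothetical helper)."""
    cmd = ["gatk", "FilterMutectCalls", "-V", vcf, "-R", reference, "-O", output]
    # Repeatable arguments: the flag is emitted once per file.
    for table in contamination_tables:
        cmd += ["--contamination-table", table]
    for segments in segmentation_files:
        cmd += ["--tumor-segmentation", segments]
    for priors in orientation_priors:
        cmd += ["--orientation-bias-artifact-priors", priors]
    # The stats file is written by Mutect2 alongside its raw VCF.
    if stats is not None:
        cmd += ["--stats", stats]
    return cmd
```

The resulting list can be passed to `subprocess.run` unchanged; building a list rather than a single string avoids shell quoting issues with file paths.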
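Quickstart
from janis_core import WorkflowBuilder
from janis_bioinformatics.tools.gatk4.filtermutectcalls.versions import Gatk4FilterMutectCalls_4_1_8

wf = WorkflowBuilder("myworkflow")

wf.step(
    "gatk4filtermutectcalls_step",
    Gatk4FilterMutectCalls_4_1_8(
        vcf=None,        # placeholder: the Mutect2 VCF to filter (Gzipped<VCF>)
        reference=None,  # placeholder: the reference (FastaWithIndexes)
    )
)
# Step outputs are referenced through the workflow object.
wf.output("out", source=wf.gatk4filtermutectcalls_step.out)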
OR
- Install Janis
- Ensure Janis is configured to work with Docker or Singularity.
- Ensure all reference files are available:
Note
More information about these inputs is available below.
- Generate user input files for Gatk4FilterMutectCalls:
# user inputs
janis inputs Gatk4FilterMutectCalls > inputs.yaml
inputs.yaml
reference: reference.fasta
vcf: vcf.vcf.gz
- Run Gatk4FilterMutectCalls with:
janis run [...run options] \
--inputs inputs.yaml \
Gatk4FilterMutectCalls
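The inputs.yaml shown in the steps above is a flat mapping of input names to values. As a minimal sketch, assuming only simple scalar values such as file paths, it could be generated from Python as follows. The function `render_inputs` is a hypothetical helper, not a Janis API; for anything beyond plain strings and numbers a YAML library would be the safer choice.

```python
from typing import Mapping, Union

Scalar = Union[str, int, float]


def render_inputs(inputs: Mapping[str, Scalar]) -> str:
    """Render a flat mapping as 'key: value' lines (hypothetical helper).

    Only suitable for simple values that need no YAML quoting.
    """
    lines = [f"{name}: {value}" for name, value in inputs.items()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = render_inputs({"reference": "reference.fasta", "vcf": "vcf.vcf.gz"})
    with open("inputs.yaml", "w") as handle:
        handle.write(text)
```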
Information
ID: | Gatk4FilterMutectCalls |
---|---|
URL: | https://software.broadinstitute.org/gatk/documentation/tooldocs/4.1.2.0/org_broadinstitute_hellbender_tools_walkers_mutect_Mutect2.php |
Versions: | 4.1.8.1, 4.1.7.0, 4.1.6.0, 4.1.4.0, 4.1.3.0, 4.1.2.0 |
Container: | broadinstitute/gatk:4.1.8.1 |
Authors: | Hollizeck Sebastian |
Citations: | TBD |
Created: | 2019-09-09 |
Updated: | 2019-09-09 |
Outputs
name | type | documentation |
---|---|---|
out | Gzipped<VCF> | vcf containing filtered calls |
Additional configuration (inputs)
name | type | prefix | position | documentation |
---|---|---|---|---|
vcf | Gzipped<VCF> | -V | | vcf to be filtered |
reference | FastaWithIndexes | -R | | Reference sequence file |
javaOptions | Optional<Array<String>> | | | |
compression_level | Optional<Integer> | | | Compression level for all compressed files created (e.g. BAM and VCF). Default value: 2. |
contaminationTable | Optional<File> | --contamination-table | | Tables containing contamination information. |
segmentationFile | Optional<File> | --tumor-segmentation | | Tables containing tumor segments’ minor allele fractions for germline hets emitted by CalculateContamination |
statsFile | Optional<File> | --stats | | The Mutect stats file output by Mutect2 |
readOrientationModel | Optional<File> | --orientation-bias-artifact-priors | | One or more .tar.gz files containing tables of prior artifact probabilities for the read orientation filter model, one table per tumor sample |
outputFilename | Optional<Filename> | -O | 2 | |
Workflow Description Language
version development

task Gatk4FilterMutectCalls {
  input {
    Int? runtime_cpu
    Int? runtime_memory
    Int? runtime_seconds
    Int? runtime_disks
    Array[String]? javaOptions
    Int? compression_level
    File? contaminationTable
    File? segmentationFile
    File? statsFile
    File? readOrientationModel
    File vcf
    File vcf_tbi
    File reference
    File reference_fai
    File reference_amb
    File reference_ann
    File reference_bwt
    File reference_pac
    File reference_sa
    File reference_dict
    String? outputFilename
  }
  command <<<
    set -e
    gatk FilterMutectCalls \
      --java-options '-Xmx~{((select_first([runtime_memory, 16, 4]) * 3) / 4)}G ~{if (defined(compression_level)) then ("-Dsamjdk.compress_level=" + compression_level) else ""} ~{sep(" ", select_first([javaOptions, []]))}' \
      ~{if defined(contaminationTable) then ("--contamination-table '" + contaminationTable + "'") else ""} \
      ~{if defined(segmentationFile) then ("--tumor-segmentation '" + segmentationFile + "'") else ""} \
      ~{if defined(statsFile) then ("--stats '" + statsFile + "'") else ""} \
      ~{if defined(readOrientationModel) then ("--orientation-bias-artifact-priors '" + readOrientationModel + "'") else ""} \
      -V '~{vcf}' \
      -R '~{reference}' \
      -O '~{select_first([outputFilename, "~{basename(vcf, ".vcf.gz")}.vcf.gz"])}'
  >>>
  runtime {
    cpu: select_first([runtime_cpu, 1, 1])
    disks: "local-disk ~{select_first([runtime_disks, 20])} SSD"
    docker: "broadinstitute/gatk:4.1.8.1"
    duration: select_first([runtime_seconds, 86400])
    memory: "~{select_first([runtime_memory, 16, 4])}G"
    preemptible: 2
  }
  output {
    File out = select_first([outputFilename, "~{basename(vcf, ".vcf.gz")}.vcf.gz"])
    File out_tbi = select_first([outputFilename, "~{basename(vcf, ".vcf.gz")}.vcf.gz"]) + ".tbi"
  }
}
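The `--java-options` string in the task above sets the Java heap to three quarters of the requested memory, which falls back to 16 GB when `runtime_memory` is not given. The following Python sketch mirrors that arithmetic for illustration. The functions `select_first` and `java_options` are hypothetical re-implementations written for this example, and the use of integer division is an assumption based on both WDL operands being of type Int.

```python
from typing import List, Optional, Sequence


def select_first(values: Sequence):
    """Return the first value that is not None, similar to WDL's select_first."""
    for value in values:
        if value is not None:
            return value
    raise ValueError("all values are None")


def java_options(
    runtime_memory: Optional[int] = None,
    compression_level: Optional[int] = None,
    extra: Optional[List[str]] = None,
) -> str:
    """Build the --java-options value the way the task template does."""
    memory_gb = select_first([runtime_memory, 16, 4])
    heap_gb = (memory_gb * 3) // 4  # assumed integer division (Int operands)
    compression = (
        f"-Dsamjdk.compress_level={compression_level}"
        if compression_level is not None
        else ""
    )
    other = " ".join(select_first([extra, []]))
    # Empty placeholders leave extra spaces, exactly as in the template.
    return f"-Xmx{heap_gb}G {compression} {other}"
```

With no arguments the heap comes out as 12 GB of the default 16 GB, leaving the remainder for JVM overhead outside the heap. The CWL version of this tool performs the same calculation in JavaScript, where `/ 4` is floating-point division, so the two definitions may differ when the memory value is not a multiple of four.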
Common Workflow Language
#!/usr/bin/env cwl-runner
class: CommandLineTool
cwlVersion: v1.2
label: 'GATK4: GetFilterMutectCalls'
doc: |-
  Filter variants in a Mutect2 VCF callset.
  FilterMutectCalls applies filters to the raw output of Mutect2. Parameters are contained in M2FiltersArgumentCollection and described in https://github.com/broadinstitute/gatk/tree/master/docs/mutect/mutect.pdf. To filter based on sequence context artifacts, specify the --orientation-bias-artifact-priors [artifact priors tar.gz file] argument one or more times. This input is generated by LearnReadOrientationModel.
  If given a --contamination-table file, e.g. results from CalculateContamination, the tool will additionally filter on contamination fractions. This argument may be specified with a table for one or more tumor sample. Alternatively, provide a numerical fraction to filter with the --contamination argument. FilterMutectCalls can also be given one or more --tumor-segmentation files, which are also output by CalculateContamination.
requirements:
- class: ShellCommandRequirement
- class: InlineJavascriptRequirement
- class: DockerRequirement
  dockerPull: broadinstitute/gatk:4.1.8.1
inputs:
- id: javaOptions
  label: javaOptions
  type:
  - type: array
    items: string
  - 'null'
- id: compression_level
  label: compression_level
  doc: |-
    Compression level for all compressed files created (e.g. BAM and VCF). Default value: 2.
  type:
  - int
  - 'null'
- id: contaminationTable
  label: contaminationTable
  doc: Tables containing contamination information.
  type:
  - File
  - 'null'
  inputBinding:
    prefix: --contamination-table
- id: segmentationFile
  label: segmentationFile
  doc: |-
    Tables containing tumor segments' minor allele fractions for germline hets emitted by CalculateContamination
  type:
  - File
  - 'null'
  inputBinding:
    prefix: --tumor-segmentation
- id: statsFile
  label: statsFile
  doc: The Mutect stats file output by Mutect2
  type:
  - File
  - 'null'
  inputBinding:
    prefix: --stats
- id: readOrientationModel
  label: readOrientationModel
  doc: |-
    One or more .tar.gz files containing tables of prior artifact probabilities for the read orientation filter model, one table per tumor sample
  type:
  - File
  - 'null'
  inputBinding:
    prefix: --orientation-bias-artifact-priors
- id: vcf
  label: vcf
  doc: vcf to be filtered
  type: File
  secondaryFiles:
  - pattern: .tbi
  inputBinding:
    prefix: -V
- id: reference
  label: reference
  doc: Reference sequence file
  type: File
  secondaryFiles:
  - pattern: .fai
  - pattern: .amb
  - pattern: .ann
  - pattern: .bwt
  - pattern: .pac
  - pattern: .sa
  - pattern: ^.dict
  inputBinding:
    prefix: -R
- id: outputFilename
  label: outputFilename
  type:
  - string
  - 'null'
  default: generated.vcf.gz
  inputBinding:
    prefix: -O
    position: 2
    valueFrom: $(inputs.vcf.basename.replace(/.vcf.gz$/, "")).vcf.gz
outputs:
- id: out
  label: out
  doc: vcf containing filtered calls
  type: File
  secondaryFiles:
  - pattern: .tbi
  outputBinding:
    glob: $(inputs.vcf.basename.replace(/.vcf.gz$/, "")).vcf.gz
    loadContents: false
stdout: _stdout
stderr: _stderr
baseCommand:
- gatk
- FilterMutectCalls
arguments:
- prefix: --java-options
  position: -1
  valueFrom: |-
    $("-Xmx{memory}G {compression} {otherargs}".replace(/\{memory\}/g, (([inputs.runtime_memory, 16, 4].filter(function (inner) { return inner != null })[0] * 3) / 4)).replace(/\{compression\}/g, (inputs.compression_level != null) ? ("-Dsamjdk.compress_level=" + inputs.compression_level) : "").replace(/\{otherargs\}/g, [inputs.javaOptions, []].filter(function (inner) { return inner != null })[0].join(" ")))
hints:
- class: ToolTimeLimit
  timelimit: |-
    $([inputs.runtime_seconds, 86400].filter(function (inner) { return inner != null })[0])
id: Gatk4FilterMutectCalls
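Both tool definitions derive the default output name from the input VCF: the `.vcf.gz` suffix is stripped from the basename and then appended again, and the index is expected next to it with a `.tbi` suffix. The sketch below illustrates that naming rule in Python. It follows the WDL `select_first` logic, in which an explicit `outputFilename` takes precedence; the CWL `valueFrom` expression as listed refers only to `inputs.vcf`. The function names are hypothetical and exist only for this example.

```python
import posixpath
from typing import Optional

SUFFIX = ".vcf.gz"


def output_name(vcf_path: str, output_filename: Optional[str] = None) -> str:
    """Return the output VCF name, preferring an explicit filename if given."""
    if output_filename is not None:
        return output_filename
    name = posixpath.basename(vcf_path)
    # Strip the suffix only when present, like WDL's basename(vcf, ".vcf.gz").
    if name.endswith(SUFFIX):
        name = name[: -len(SUFFIX)]
    return name + SUFFIX


def index_name(vcf_path: str, output_filename: Optional[str] = None) -> str:
    """Return the name of the tabix index expected next to the output."""
    return output_name(vcf_path, output_filename) + ".tbi"
```

One consequence of this rule is that, by default, the filtered VCF has the same file name as the unfiltered input, so setting `outputFilename` may be worthwhile when both files are later collected into one directory.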