Arriba

Arriba · 1 contributor · 2 versions

Arriba gene fusion detector

Version: 1.2.0

Arriba is a fast tool to search for aberrant transcripts such as gene fusions. It is based on chimeric alignments found by the STAR RNA-Seq aligner.

Arriba is a command-line tool for the detection of gene fusions from RNA-Seq data. It was developed for use in a clinical research setting, so short runtimes and high sensitivity were important design criteria. It is based on the ultrafast STAR aligner, and the post-alignment runtime is typically just ~2 minutes. In contrast to many other fusion detection tools that build on STAR, Arriba does not require reducing the alignIntronMax parameter of STAR to detect fusions arising from focal deletions.

Apart from gene fusions, Arriba can detect other structural rearrangements with potential clinical relevance, such as exon duplications or truncations of genes (i.e., breakpoints in introns and intergenic regions).

Quickstart

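from janis_core import WorkflowBuilder
from janis_bioinformatics.tools.suhrig.arriba.versions import Arriba_1_2_0

wf = WorkflowBuilder("myworkflow")

wf.step(
    "arriba_step",
    Arriba_1_2_0(
        aligned_inp=None,  # placeholder: connect a BAM input or an upstream step output here
    )
)
# Step outputs are referenced through the workflow object.
wf.output("out", source=wf.arriba_step.out)
wf.output("out_discarded", source=wf.arriba_step.out_discarded)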

OR

  1. Install Janis
  2. Ensure Janis is configured to work with Docker or Singularity.
  3. Ensure all reference files are available:

Note

More information about these inputs is available below.

  4. Generate user input files for Arriba:
# user inputs
janis inputs Arriba > inputs.yaml

inputs.yaml

aligned_inp: aligned_inp.bam
  5. Run Arriba with:
janis run [...run options] \
    --inputs inputs.yaml \
    Arriba
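
The two command-line steps above can also be scripted. The following is a minimal sketch, assuming only what the steps show: an inputs file with the single required key `aligned_inp`, and the `janis run --inputs <file> Arriba` invocation. The helper names `write_inputs_yaml` and `janis_run_command` are hypothetical, and the run options are left empty because the page does not list them.

```python
import tempfile
from pathlib import Path


def write_inputs_yaml(path, aligned_inp):
    """Write an inputs file that sets only the required `aligned_inp` key."""
    Path(path).write_text(f"aligned_inp: {aligned_inp}\n")


def janis_run_command(inputs_path, run_options=()):
    """Build the argv for `janis run`; any run options are supplied by the caller."""
    return ["janis", "run", *run_options, "--inputs", str(inputs_path), "Arriba"]


# Demonstration in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    inputs_path = Path(tmp) / "inputs.yaml"
    write_inputs_yaml(inputs_path, "aligned_inp.bam")
    inputs_text = inputs_path.read_text()
    command = janis_run_command(inputs_path)

print(inputs_text, end="")
print(" ".join(command))
```

The command is only assembled here, not executed; passing the list to `subprocess.run` would launch Janis, provided it is installed and configured as described in the steps.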

Information

ID: Arriba
URL: No URL to the documentation was provided
Versions: 1.2.0, 1.1.0
Container: quay.io/biocontainers/arriba:1.2.0--hd2e4403_2
Authors: Michael Franklin
Citations: None
Created: 2020-09-02
Updated: 2020-09-02

Outputs

name type documentation
out tsv  
out_discarded tsv  
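
Both outputs are tab-separated files, so they can be loaded with the standard library. The sketch below assumes a single header row, possibly prefixed with `#`, followed by one row per event; it does not assume any particular column names. The function name `read_arriba_tsv` and the sample columns are hypothetical and serve only to illustrate the parsing.

```python
import csv
import io


def read_arriba_tsv(handle):
    """Yield one dict per data row, keyed by the header names found in the file."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None:
        return  # empty file: nothing to yield
    # Drop a leading '#' from the first header cell, if present.
    header[0] = header[0].lstrip("#")
    for row in reader:
        if row:  # skip blank lines
            yield dict(zip(header, row))


# Hypothetical sample content, for illustration only.
sample = "#col_a\tcol_b\tcol_c\nGENE1\tGENE2\t5\nGENE3\tGENE4\t2\n"
rows = list(read_arriba_tsv(io.StringIO(sample)))
for row in rows:
    print(row)
```

For a real run, open the file named by `out` or `out_discarded` and pass the handle to the same function.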

Additional configuration (inputs)

name type prefix position documentation
aligned_inp BAM -x   File in SAM/BAM/CRAM format with main alignments as generated by STAR (Aligned.out.sam). Arriba extracts candidate reads from this file. This is sometimes /dev/stdin
inp_chimeric Optional<BAM> -c   File in SAM/BAM/CRAM format with chimeric alignments as generated by STAR (Chimeric.out.sam). This parameter is only required, if STAR was run with the parameter ‘--chimOutType SeparateSAMold’. When STAR was run with the parameter ‘--chimOutType WithinBAM’, it suffices to pass the parameter -x to Arriba and -c can be omitted.
gtf_file Optional<File> -g   GTF file with gene annotation. The file may be gzip-compressed.
gtf_features Optional<csv> -G   Comma-/space-separated list of names of GTF features. Default: gene_name=gene_name|gene_id gene_id=gene_id transcript_id=transcript_id feature_exon=exon feature_CDS=CDS
reference Optional<Fasta> -a   FastA file with genome sequence (assembly). The file may be gzip-compressed. An index with the file extension .fai must exist only if CRAM files are processed.
blacklist Optional<File> -b   File containing blacklisted events (recurrent artifacts and transcripts observed in healthy tissue).
known_fusions Optional<tsv> -k   File containing known/recurrent fusions. Some cancer entities are often characterized by fusions between the same pair of genes. In order to boost sensitivity, a list of known fusions can be supplied using this parameter. The list must contain two columns with the names of the fused genes, separated by tabs.
output_filename Optional<Filename> -o   Output file with fusions that have passed all filters.
discarded_output_filename Optional<Filename> -O   Output file with fusions that were discarded due to filtering.
structural_variants_coordinates Optional<tsv> -d   Tab-separated file with coordinates of structural variants found using whole-genome sequencing data. These coordinates serve to increase sensitivity towards weakly expressed fusions and to eliminate fusions with low evidence.
max_genomic_breakpoint_distance Optional<Integer> -D   When a file with genomic breakpoints obtained via whole-genome sequencing is supplied via the -d parameter, this parameter determines how far a genomic breakpoint may be away from a transcriptomic breakpoint to consider it as a related event. For events inside genes, the distance is added to the end of the gene; for intergenic events, the distance threshold is applied as is. Default: 100000
strandedness Optional<String> -s   Whether a strand-specific protocol was used for library preparation, and if so, the type of strandedness (auto/yes/no/reverse). When unstranded data is processed, the strand can sometimes be inferred from splice-patterns. But in unclear situations, stranded data helps resolve ambiguities. Default: auto
contigs Optional<Array<String>> -i   Comma-/space-separated list of interesting contigs. Fusions between genes on other contigs are ignored. Contigs can be specified with or without the prefix ‘chr’. Default: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y
filters Optional<Array<String>> -f   Comma-/space-separated list of filters to disable. By default all filters are enabled. Valid values: homopolymer, same_gene, inconsistently_clipped, duplicates, low_entropy, no_genomic_support, short_anchor, homologs, blacklist, pcr_fusions, isoforms, intronic, uninteresting_contigs, read_through, genomic_support, mismatches, no_coverage, spliced, mismappers, merge_adjacent, select_best, many_spliced, long_gap, min_support, relative_support, end_to_end, known_fusions, non_coding_neighbors, intragenic_exonic, hairpin, small_insert_size
max_e_value Optional<Float> -E   Arriba estimates the number of fusions with a given number of supporting reads which one would expect to see by random chance. If the expected number of fusions (e-value) is higher than this threshold, the fusion is discarded by the ‘relative_support’ filter. Note: Increasing this threshold can dramatically increase the number of false positives and may increase the runtime of resource-intensive steps. Fractional values are possible. Default: 0.300000
min_supporting_reads Optional<Integer> -S   The ‘min_support’ filter discards all fusions with fewer than this many supporting reads (split reads and discordant mates combined). Default: 2
max_mismappers Optional<Float> -m   When more than this fraction of supporting reads turns out to be mismappers, the ‘mismappers’ filter discards the fusion. Default: 0.800000
max_homolog_identity Optional<Float> -L   Genes with more than the given fraction of sequence identity are considered homologs and removed by the ‘homologs’ filter. Default: 0.300000
homopolymer_length Optional<Integer> -H   The ‘homopolymer’ filter removes breakpoints adjacent to homopolymers of the given length or more. Default: 6
read_through_distance Optional<Integer> -R   The ‘read_through’ filter removes read-through fusions where the breakpoints are less than the given distance away from each other. Default: 10000
min_anchor_length Optional<Integer> -A   Alignment artifacts are often characterized by split reads coming from only one gene and no discordant mates. Moreover, the split reads only align to a short stretch in one of the genes. The ‘short_anchor’ filter removes these fusions. This parameter sets the threshold in bp for what the filter considers short. Default: 23
many_spliced_events Optional<Integer> -M   The ‘many_spliced’ filter recovers fusions between genes that have at least this many spliced breakpoints. Default: 4
max_kmer_content Optional<Float> -K   The ‘low_entropy’ filter removes reads with repetitive 3-mers. If the 3-mers make up more than the given fraction of the sequence, then the read is discarded. Default: 0.600000
max_mismatch_pvalue Optional<Float> -V   The ‘mismatches’ filter uses a binomial model to calculate a p-value for observing a given number of mismatches in a read. If the number of mismatches is too high, the read is discarded. Default: 0.010000
fragment_length Optional<Integer> -F   When paired-end data is given, the fragment length is estimated automatically and this parameter has no effect. But when single-end data is given, the mean fragment length should be specified to effectively filter fusions that arise from hairpin structures. Default: 200
max_reads Optional<Integer> -U   Subsample fusions with more than the given number of supporting reads. This improves performance without compromising sensitivity, as long as the threshold is high. Counting of supporting reads beyond the threshold is inaccurate, obviously. Default: 300
quantile Optional<Float> -Q   Highly expressed genes are prone to produce artifacts during library preparation. Genes with an expression above the given quantile are eligible for filtering by the ‘pcr_fusions’ filter. Default: 0.998000
exonic_fraction Optional<Float> -e   The breakpoints of false-positive predictions of intragenic events are often both in exons. True predictions are more likely to have at least one breakpoint in an intron, because introns are larger. If the fraction of exonic sequence between two breakpoints is smaller than the given fraction, the ‘intragenic_exonic’ filter discards the event. Default: 0.200000
fusion_transcript Optional<Boolean> -T   When set, the column ‘fusion_transcript’ is populated with the sequence of the fused genes as assembled from the supporting reads. Specify the flag twice to also print the fusion transcripts to the file containing discarded fusions (-O). Default: off
peptide_sequence Optional<Boolean> -P   When set, the column ‘peptide_sequence’ is populated with the sequence of the fused proteins as assembled from the supporting reads. Specify the flag twice to also print the peptide sequence to the file containing discarded fusions (-O). Default: off
read_identifiers Optional<Boolean> -I   When set, the column ‘read_identifiers’ is populated with identifiers of the reads which support the fusion. The identifiers are separated by commas. Specify the flag twice to also print the read identifiers to the file containing discarded fusions (-O). Default: off
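
To make the relationship between input names and command-line prefixes concrete, here is a minimal sketch that turns a set of inputs into an `arriba` argument list using the prefixes from the table above. The function name `arriba_argv` is hypothetical, and this is not how Janis itself assembles the command. List inputs are joined with commas, which the table describes as an accepted separator.

```python
# Inputs that take a single value, mapped to the prefixes listed in the table.
VALUE_FLAGS = {
    "aligned_inp": "-x", "inp_chimeric": "-c", "gtf_file": "-g",
    "gtf_features": "-G", "reference": "-a", "blacklist": "-b",
    "known_fusions": "-k", "output_filename": "-o",
    "discarded_output_filename": "-O",
    "structural_variants_coordinates": "-d",
    "max_genomic_breakpoint_distance": "-D", "strandedness": "-s",
    "max_e_value": "-E", "min_supporting_reads": "-S",
    "max_mismappers": "-m", "max_homolog_identity": "-L",
    "homopolymer_length": "-H", "read_through_distance": "-R",
    "min_anchor_length": "-A", "many_spliced_events": "-M",
    "max_kmer_content": "-K", "max_mismatch_pvalue": "-V",
    "fragment_length": "-F", "max_reads": "-U", "quantile": "-Q",
    "exonic_fraction": "-e",
}
# Inputs that take a list of strings.
LIST_FLAGS = {"contigs": "-i", "filters": "-f"}
# Boolean inputs that only add a switch when set.
SWITCHES = {"fusion_transcript": "-T", "peptide_sequence": "-P", "read_identifiers": "-I"}


def arriba_argv(**options):
    """Build an argument list for arriba from keyword options (illustrative only)."""
    if options.get("aligned_inp") is None:
        raise ValueError("aligned_inp is the only required input")
    argv = ["arriba"]
    for name, value in options.items():
        if value is None or value is False:
            continue  # unset optional inputs contribute nothing
        if name in VALUE_FLAGS:
            argv += [VALUE_FLAGS[name], str(value)]
        elif name in LIST_FLAGS:
            if value:  # an empty list is treated like an unset input
                argv += [LIST_FLAGS[name], ",".join(value)]
        elif name in SWITCHES:
            argv.append(SWITCHES[name])
        else:
            raise KeyError(f"unknown input: {name}")
    return argv


example = arriba_argv(
    aligned_inp="aligned_inp.bam",
    filters=["homologs", "blacklist"],
    min_supporting_reads=2,
    fusion_transcript=True,
)
print(" ".join(example))
```

Returning a list rather than a shell string avoids quoting problems if the result is later handed to `subprocess.run`.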

Workflow Description Language

version development

task Arriba {
  input {
    Int? runtime_cpu
    Int? runtime_memory
    Int? runtime_seconds
    Int? runtime_disks
    File aligned_inp
    File? inp_chimeric
    File? gtf_file
    File? gtf_features
    File? reference
    File? blacklist
    File? known_fusions
    String? output_filename
    String? discarded_output_filename
    File? structural_variants_coordinates
    Int? max_genomic_breakpoint_distance
    String? strandedness
    Array[String]? contigs
    Array[String]? filters
    Float? max_e_value
    Int? min_supporting_reads
    Float? max_mismappers
    Float? max_homolog_identity
    Int? homopolymer_length
    Int? read_through_distance
    Int? min_anchor_length
    Int? many_spliced_events
    Float? max_kmer_content
    Float? max_mismatch_pvalue
    Int? fragment_length
    Int? max_reads
    Float? quantile
    Float? exonic_fraction
    Boolean? fusion_transcript
    Boolean? peptide_sequence
    Boolean? read_identifiers
  }
  command <<<
    set -e
    arriba \
      -x '~{aligned_inp}' \
      ~{if defined(inp_chimeric) then ("-c '" + inp_chimeric + "'") else ""} \
      ~{if defined(gtf_file) then ("-g '" + gtf_file + "'") else ""} \
      ~{if defined(gtf_features) then ("-G '" + gtf_features + "'") else ""} \
      ~{if defined(reference) then ("-a '" + reference + "'") else ""} \
      ~{if defined(blacklist) then ("-b '" + blacklist + "'") else ""} \
      ~{if defined(known_fusions) then ("-k '" + known_fusions + "'") else ""} \
      -o '~{select_first([output_filename, "generated.tsv"])}' \
      -O '~{select_first([discarded_output_filename, "generated.discarded.tsv"])}' \
      ~{if defined(structural_variants_coordinates) then ("-d '" + structural_variants_coordinates + "'") else ""} \
      ~{if defined(max_genomic_breakpoint_distance) then ("-D " + max_genomic_breakpoint_distance) else ''} \
      ~{if defined(strandedness) then ("-s '" + strandedness + "'") else ""} \
      ~{if (defined(contigs) && length(select_first([contigs])) > 0) then "-i '" + sep("' '", select_first([contigs])) + "'" else ""} \
      ~{if (defined(filters) && length(select_first([filters])) > 0) then "-f '" + sep("' '", select_first([filters])) + "'" else ""} \
      ~{if defined(max_e_value) then ("-E " + max_e_value) else ''} \
      ~{if defined(min_supporting_reads) then ("-S " + min_supporting_reads) else ''} \
      ~{if defined(max_mismappers) then ("-m " + max_mismappers) else ''} \
      ~{if defined(max_homolog_identity) then ("-L " + max_homolog_identity) else ''} \
      ~{if defined(homopolymer_length) then ("-H " + homopolymer_length) else ''} \
      ~{if defined(read_through_distance) then ("-R " + read_through_distance) else ''} \
      ~{if defined(min_anchor_length) then ("-A " + min_anchor_length) else ''} \
      ~{if defined(many_spliced_events) then ("-M " + many_spliced_events) else ''} \
      ~{if defined(max_kmer_content) then ("-K " + max_kmer_content) else ''} \
      ~{if defined(max_mismatch_pvalue) then ("-V " + max_mismatch_pvalue) else ''} \
      ~{if defined(fragment_length) then ("-F " + fragment_length) else ''} \
      ~{if defined(max_reads) then ("-U " + max_reads) else ''} \
      ~{if defined(quantile) then ("-Q " + quantile) else ''} \
      ~{if defined(exonic_fraction) then ("-e " + exonic_fraction) else ''} \
      ~{if (defined(fusion_transcript) && select_first([fusion_transcript])) then "-T" else ""} \
      ~{if (defined(peptide_sequence) && select_first([peptide_sequence])) then "-P" else ""} \
      ~{if (defined(read_identifiers) && select_first([read_identifiers])) then "-I" else ""}
  >>>
  runtime {
    cpu: select_first([runtime_cpu, 1])
    disks: "local-disk ~{select_first([runtime_disks, 20])} SSD"
    docker: "quay.io/biocontainers/arriba:1.2.0--hd2e4403_2"
    duration: select_first([runtime_seconds, 86400])
    memory: "~{select_first([runtime_memory, 4])}G"
    preemptible: 2
  }
  output {
    File out = select_first([output_filename, "generated.tsv"])
    File out_discarded = select_first([discarded_output_filename, "generated.discarded.tsv"])
  }
}
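
The WDL task relies on `select_first` to fall back to defaults for the runtime settings and output file names. The Python sketch below mirrors that behaviour using the default values that appear in the task; the function names `resolve_runtime` and `resolve_outputs` are hypothetical, and the code is an emulation for illustration, not part of any WDL engine.

```python
def select_first(values):
    """Return the first value that is not None, like WDL's select_first."""
    for value in values:
        if value is not None:
            return value
    raise ValueError("all values are undefined")


def resolve_runtime(runtime_cpu=None, runtime_memory=None,
                    runtime_seconds=None, runtime_disks=None):
    """Apply the same defaults as the task's runtime section."""
    return {
        "cpu": select_first([runtime_cpu, 1]),
        "disks": f"local-disk {select_first([runtime_disks, 20])} SSD",
        "duration": select_first([runtime_seconds, 86400]),
        "memory": f"{select_first([runtime_memory, 4])}G",
    }


def resolve_outputs(output_filename=None, discarded_output_filename=None):
    """Return the file names the task passes to -o and -O."""
    return (
        select_first([output_filename, "generated.tsv"]),
        select_first([discarded_output_filename, "generated.discarded.tsv"]),
    )


defaults = resolve_runtime()
custom = resolve_runtime(runtime_cpu=4, runtime_memory=16)
outputs = resolve_outputs()
print(defaults)
print(custom)
print(outputs)
```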

Common Workflow Language

#!/usr/bin/env cwl-runner
class: CommandLineTool
cwlVersion: v1.2
label: Arriba
doc: |2

  Arriba gene fusion detector
  ---------------------------
  Version: 1.2.0
  Arriba is a fast tool to search for aberrant transcripts such as gene fusions.
  It is based on chimeric alignments found by the STAR RNA-Seq aligner.

  Arriba is a command-line tool for the detection of gene fusions from RNA-Seq data. It was developed for the use in a
  clinical research setting. Therefore, short runtimes and high sensitivity were important design criteria. It is based
  on the ultrafast STAR aligner and the post-alignment runtime is typically just ~2 minutes. In contrast to many other
  fusion detection tools which build on STAR, Arriba does not require to reduce the alignIntronMax parameter of STAR
  to detect fusions arising from focal deletions.

  Apart from gene fusions, Arriba can detect other structural rearrangements with potential clinical relevance, such
  as exon duplications or truncations of genes (i.e., breakpoints in introns and intergenic regions).

requirements:
- class: ShellCommandRequirement
- class: InlineJavascriptRequirement
- class: DockerRequirement
  dockerPull: quay.io/biocontainers/arriba:1.2.0--hd2e4403_2

inputs:
- id: aligned_inp
  label: aligned_inp
  doc: |-
    File in SAM/BAM/CRAM format with main alignments as generated by STAR (Aligned.out.sam). Arriba extracts candidate reads from this file. This is sometimes /dev/stdin
  type: File
  inputBinding:
    prefix: -x
    separate: true
- id: inp_chimeric
  label: inp_chimeric
  doc: |-
    File in SAM/BAM/CRAM format with chimeric alignments as generated by STAR (Chimeric.out.sam). This parameter is only required, if STAR was run with the parameter '--chimOutType SeparateSAMold'. When STAR was run with the parameter '--chimOutType WithinBAM', it suffices to pass the parameter -x to Arriba and -c can be omitted.
  type:
  - File
  - 'null'
  inputBinding:
    prefix: -c
    separate: true
- id: gtf_file
  label: gtf_file
  doc: GTF file with gene annotation. The file may be gzip-compressed.
  type:
  - File
  - 'null'
  inputBinding:
    prefix: -g
    separate: true
- id: gtf_features
  label: gtf_features
  doc: |-
    Comma-/space-separated list of names of GTF features. Default: gene_name=gene_name|gene_id gene_id=gene_id transcript_id=transcript_id feature_exon=exon feature_CDS=CDS
  type:
  - File
  - 'null'
  inputBinding:
    prefix: -G
    separate: true
- id: reference
  label: reference
  doc: |-
    FastA file with genome sequence (assembly). The file may be gzip-compressed. An index with the file extension .fai must exist only if CRAM files are processed.
  type:
  - File
  - 'null'
  inputBinding:
    prefix: -a
    separate: true
- id: blacklist
  label: blacklist
  doc: |-
    File containing blacklisted events (recurrent artifacts and transcripts observed in healthy tissue).
  type:
  - File
  - 'null'
  inputBinding:
    prefix: -b
    separate: true
- id: known_fusions
  label: known_fusions
  doc: |-
    File containing known/recurrent fusions. Some cancer entities are often characterized by fusions between the same pair of genes. In order to boost sensitivity, a list of known fusions can be supplied using this parameter. The list must contain two columns with the names of the fused genes, separated by tabs.
  type:
  - File
  - 'null'
  inputBinding:
    prefix: -k
    separate: true
- id: output_filename
  label: output_filename
  doc: Output file with fusions that have passed all filters.
  type:
  - string
  - 'null'
  default: generated.tsv
  inputBinding:
    prefix: -o
    separate: true
- id: discarded_output_filename
  label: discarded_output_filename
  doc: Output file with fusions that were discarded due to filtering.
  type:
  - string
  - 'null'
  default: generated.discarded.tsv
  inputBinding:
    prefix: -O
    separate: true
- id: structural_variants_coordinates
  label: structural_variants_coordinates
  doc: |-
    Tab-separated file with coordinates of structural variants found using whole-genome sequencing data. These coordinates serve to increase sensitivity towards weakly expressed fusions and to eliminate fusions with low evidence.
  type:
  - File
  - 'null'
  inputBinding:
    prefix: -d
    separate: true
- id: max_genomic_breakpoint_distance
  label: max_genomic_breakpoint_distance
  doc: |-
    When a file with genomic breakpoints obtained via whole-genome sequencing is supplied via the -d parameter, this parameter determines how far a genomic breakpoint may be away from a transcriptomic breakpoint to consider it as a related event. For events inside genes, the distance is added to the end of the gene; for intergenic events, the distance threshold is applied as is. Default: 100000
  type:
  - int
  - 'null'
  inputBinding:
    prefix: -D
    separate: true
- id: strandedness
  label: strandedness
  doc: |-
    Whether a strand-specific protocol was used for library preparation, and if so, the type of strandedness (auto/yes/no/reverse). When unstranded data is processed, the strand can sometimes be inferred from splice-patterns. But in unclear situations, stranded data helps resolve ambiguities. Default: auto
  type:
  - string
  - 'null'
  inputBinding:
    prefix: -s
    separate: true
- id: contigs
  label: contigs
  doc: |-
    Comma-/space-separated list of interesting contigs. Fusions between genes on other contigs are ignored. Contigs can be specified with or without the prefix 'chr'. Default: 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 X Y
  type:
  - type: array
    items: string
  - 'null'
  inputBinding:
    prefix: -i
- id: filters
  label: filters
  doc: |-
    Comma-/space-separated list of filters to disable. By default all filters are enabled. Valid values: homopolymer, same_gene, inconsistently_clipped, duplicates, low_entropy, no_genomic_support, short_anchor, homologs, blacklist, pcr_fusions, isoforms, intronic, uninteresting_contigs, read_through, genomic_support, mismatches, no_coverage, spliced, mismappers, merge_adjacent, select_best, many_spliced, long_gap, min_support, relative_support, end_to_end, known_fusions, non_coding_neighbors, intragenic_exonic, hairpin, small_insert_size
  type:
  - type: array
    items: string
  - 'null'
  inputBinding:
    prefix: -f
    itemSeparator: ' '
- id: max_e_value
  label: max_e_value
  doc: |-
    Arriba estimates the number of fusions with a given number of supporting reads which one would expect to see by random chance. If the expected number of fusions (e-value) is higher than this threshold, the fusion is discarded by the 'relative_support' filter. Note: Increasing this threshold can dramatically increase the number of false positives and may increase the runtime of resource-intensive steps. Fractional values are possible. Default: 0.300000
  type:
  - float
  - 'null'
  inputBinding:
    prefix: -E
    separate: true
- id: min_supporting_reads
  label: min_supporting_reads
  doc: |-
    The 'min_support' filter discards all fusions with fewer than this many supporting reads (split reads and discordant mates combined). Default: 2
  type:
  - int
  - 'null'
  inputBinding:
    prefix: -S
    separate: true
- id: max_mismappers
  label: max_mismappers
  doc: |-
    When more than this fraction of supporting reads turns out to be mismappers, the 'mismappers' filter discards the fusion. Default: 0.800000
  type:
  - float
  - 'null'
  inputBinding:
    prefix: -m
    separate: true
- id: max_homolog_identity
  label: max_homolog_identity
  doc: |-
    Genes with more than the given fraction of sequence identity are considered homologs and removed by the 'homologs' filter. Default: 0.300000
  type:
  - float
  - 'null'
  inputBinding:
    prefix: -L
    separate: true
- id: homopolymer_length
  label: homopolymer_length
  doc: |-
    The 'homopolymer' filter removes breakpoints adjacent to homopolymers of the given length or more. Default: 6
  type:
  - int
  - 'null'
  inputBinding:
    prefix: -H
    separate: true
- id: read_through_distance
  label: read_through_distance
  doc: |-
    The 'read_through' filter removes read-through fusions where the breakpoints are less than the given distance away from each other. Default: 10000
  type:
  - int
  - 'null'
  inputBinding:
    prefix: -R
    separate: true
- id: min_anchor_length
  label: min_anchor_length
  doc: |-
    Alignment artifacts are often characterized by split reads coming from only one gene and no discordant mates. Moreover, the split reads only align to a short stretch in one of the genes. The 'short_anchor' filter removes these fusions. This parameter sets the threshold in bp for what the filter considers short. Default: 23
  type:
  - int
  - 'null'
  inputBinding:
    prefix: -A
    separate: true
- id: many_spliced_events
  label: many_spliced_events
  doc: |-
    The 'many_spliced' filter recovers fusions between genes that have at least this many spliced breakpoints. Default: 4
  type:
  - int
  - 'null'
  inputBinding:
    prefix: -M
    separate: true
- id: max_kmer_content
  label: max_kmer_content
  doc: |-
    The 'low_entropy' filter removes reads with repetitive 3-mers. If the 3-mers make up more than the given fraction of the sequence, then the read is discarded. Default: 0.600000
  type:
  - float
  - 'null'
  inputBinding:
    prefix: -K
    separate: true
- id: max_mismatch_pvalue
  label: max_mismatch_pvalue
  doc: |-
    The 'mismatches' filter uses a binomial model to calculate a p-value for observing a given number of mismatches in a read. If the number of mismatches is too high, the read is discarded. Default: 0.010000
  type:
  - float
  - 'null'
  inputBinding:
    prefix: -V
    separate: true
- id: fragment_length
  label: fragment_length
  doc: |-
    When paired-end data is given, the fragment length is estimated automatically and this parameter has no effect. But when single-end data is given, the mean fragment length should be specified to effectively filter fusions that arise from hairpin structures. Default: 200
  type:
  - int
  - 'null'
  inputBinding:
    prefix: -F
    separate: true
- id: max_reads
  label: max_reads
  doc: |-
    Subsample fusions with more than the given number of supporting reads. This improves performance without compromising sensitivity, as long as the threshold is high. Counting of supporting reads beyond the threshold is inaccurate, obviously. Default: 300
  type:
  - int
  - 'null'
  inputBinding:
    prefix: -U
    separate: true
- id: quantile
  label: quantile
  doc: |-
    Highly expressed genes are prone to produce artifacts during library preparation. Genes with an expression above the given quantile are eligible for filtering by the 'pcr_fusions' filter. Default: 0.998000
  type:
  - float
  - 'null'
  inputBinding:
    prefix: -Q
    separate: true
- id: exonic_fraction
  label: exonic_fraction
  doc: |-
    The breakpoints of false-positive predictions of intragenic events are often both in exons. True predictions are more likely to have at least one breakpoint in an intron, because introns are larger. If the fraction of exonic sequence between two breakpoints is smaller than the given fraction, the 'intragenic_exonic' filter discards the event. Default: 0.200000
  type:
  - float
  - 'null'
  inputBinding:
    prefix: -e
    separate: true
- id: fusion_transcript
  label: fusion_transcript
  doc: |-
    When set, the column 'fusion_transcript' is populated with the sequence of the fused genes as assembled from the supporting reads. Specify the flag twice to also print the fusion transcripts to the file containing discarded fusions (-O). Default: off
  type:
  - boolean
  - 'null'
  inputBinding:
    prefix: -T
    separate: true
- id: peptide_sequence
  label: peptide_sequence
  doc: |-
    When set, the column 'peptide_sequence' is populated with the sequence of the fused proteins as assembled from the supporting reads. Specify the flag twice to also print the peptide sequence to the file containing discarded fusions (-O). Default: off
  type:
  - boolean
  - 'null'
  inputBinding:
    prefix: -P
    separate: true
- id: read_identifiers
  label: read_identifiers
  doc: |-
    When set, the column 'read_identifiers' is populated with identifiers of the reads which support the fusion. The identifiers are separated by commas. Specify the flag twice to also print the read identifiers to the file containing discarded fusions (-O). Default: off
  type:
  - boolean
  - 'null'
  inputBinding:
    prefix: -I
    separate: true

outputs:
- id: out
  label: out
  type: File
  outputBinding:
    glob: generated.tsv
    loadContents: false
- id: out_discarded
  label: out_discarded
  type: File
  outputBinding:
    glob: generated.discarded.tsv
    loadContents: false
stdout: _stdout
stderr: _stderr

baseCommand:
- arriba
arguments: []

hints:
- class: ToolTimeLimit
  timelimit: |-
    $([inputs.runtime_seconds, 86400].filter(function (inner) { return inner != null })[0])
id: Arriba
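
A CWL runner needs a job object that supplies values for the inputs declared above. The following sketch builds one in Python and serialises it as JSON, assuming the standard CWL convention that `File` inputs are given as objects with `class: File` and a `path`. The helper names and the example file names are hypothetical. The outputs section globs for the default names `generated.tsv` and `generated.discarded.tsv`, so the sketch leaves `output_filename` and `discarded_output_filename` unset.

```python
import json

# Inputs declared with type File in the tool description above.
FILE_INPUTS = {
    "inp_chimeric", "gtf_file", "gtf_features", "reference",
    "blacklist", "known_fusions", "structural_variants_coordinates",
}


def cwl_file(path):
    """Wrap a path in the object form CWL uses for File inputs."""
    return {"class": "File", "path": str(path)}


def arriba_job(aligned_inp, **optional):
    """Build a job object; unset (None) optional inputs are left out."""
    job = {"aligned_inp": cwl_file(aligned_inp)}
    for name, value in optional.items():
        if value is None:
            continue
        job[name] = cwl_file(value) if name in FILE_INPUTS else value
    return job


job = arriba_job(
    "aligned_inp.bam",
    gtf_file="annotation.gtf",
    strandedness="auto",
    filters=["homologs"],
)
job_json = json.dumps(job, indent=2)
print(job_json)
```

Written to a file, the JSON can be passed to a CWL runner together with the tool description.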