mag: Parameters

Define where the pipeline should find input data and save output data.

Input FastQ files or CSV samplesheet file containing information about the samples in the experiment.

required

type: string

Specifies that the input is single-end reads.

type: boolean

The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.

required

type: string

Email address for completion summary.

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
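As a rough illustration of what this pattern accepts, the following Python sketch applies it to a few addresses. The regular expression is copied from the parameter definition above; the helper name `is_valid_email` and the sample addresses are assumptions made for the example and are not part of the pipeline.

```python
import re

# Pattern copied verbatim from the parameter definition above.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_email(address: str) -> bool:
    """Hypothetical helper: True if the address matches the schema pattern."""
    return EMAIL_PATTERN.match(address) is not None


print(is_valid_email("user@example.org"))         # True
print(is_valid_email("user@example"))             # False: no top-level domain
print(is_valid_email("user@example.technology"))  # False: suffix longer than 5 letters
```

Note that the final group only allows 2 to 5 letters, so addresses with longer top-level domains would not match this pattern.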

MultiQC report title. Printed as page header, used for filename if not otherwise specified.

type: string

Reference genome related files and options required for the workflow.

Directory / URL base for iGenomes references.

hidden

type: string

default: s3://ngi-igenomes/igenomes

Do not load the iGenomes reference config.

hidden

type: boolean

Parameters used to describe centralised config profiles. These should not be edited.

Git commit id for Institutional configs.

hidden

type: string

default: master

Base directory for Institutional configs.

hidden

type: string

default: https://raw.githubusercontent.com/nf-core/configs/master

Institutional config name.

hidden

type: string

Institutional config description.

hidden

type: string

Institutional config contact information.

hidden

type: string

Institutional config URL link.

hidden

type: string

Set the top limit for requested resources for any single job.

Maximum number of CPUs that can be requested for any single job.

hidden

type: integer

default: 16

Maximum amount of memory that can be requested for any single job.

hidden

type: string

default: 128.GB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
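To see which memory strings this pattern accepts, a minimal Python sketch is shown here. The regular expression is taken from the definition above; the function name `is_valid_memory` and the candidate values are illustrative assumptions.

```python
import re

# Pattern copied verbatim from the parameter definition above.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_memory(value: str) -> bool:
    """Hypothetical helper: True if the value matches the schema pattern."""
    return MEMORY_PATTERN.match(value) is not None


# The unit prefix is case-sensitive and the trailing "B" is mandatory.
for candidate in ["128.GB", "25.MB", "8 GB", "1.5GB", "128", "128.gb"]:
    print(f"{candidate!r}: {is_valid_memory(candidate)}")
```

Under this pattern the first four candidates match, while `128` (no unit) and `128.gb` (lower-case unit) do not.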

Maximum amount of time that can be requested for any single job.

hidden

type: string

default: 240.h

pattern: ^(\d+\.?\s*(s|m|h|day)\s*)+$
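The time pattern allows one or more number-and-unit pairs. The following Python sketch checks a few values against it; the regular expression comes from the definition above, while the helper name `is_valid_time` and the sample values are assumptions for illustration.

```python
import re

# Pattern copied verbatim from the parameter definition above.
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|day)\s*)+$")


def is_valid_time(value: str) -> bool:
    """Hypothetical helper: True if the value matches the schema pattern."""
    return TIME_PATTERN.match(value) is not None


# Only the units s, m, h and day are recognised by the pattern.
for candidate in ["240.h", "1day 2h", "1h30m", "2d", "h"]:
    print(f"{candidate!r}: {is_valid_time(candidate)}")
```

Here `240.h`, `1day 2h` and `1h30m` match; `2d` fails because `d` is not one of the listed units, and `h` fails because the number is missing.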

Less common options for the pipeline, typically set in a config file.

Display help text.

hidden

type: boolean

Method used to save pipeline results to output directory.

hidden

type: string

Email address for completion summary, only when pipeline fails.

hidden

type: string

pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$

Send plain-text email instead of HTML.

hidden

type: boolean

File size limit when attaching MultiQC reports to summary emails.

hidden

type: string

default: 25.MB

pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$

Do not use coloured log outputs.

hidden

type: boolean

Custom config file to supply to MultiQC.

hidden

type: string

Directory to keep pipeline Nextflow logs and reports.

hidden

type: string

default: ${params.outdir}/pipeline_info

Whether to validate parameters against the schema at runtime.

hidden

type: boolean

default: true

Show all params when using --help

hidden

type: boolean

Run this workflow with Conda. You can also use ‘-profile conda’ instead of providing this parameter.

hidden

type: boolean

Use these parameters to also enable reproducible results from the individual assembly and binning tools.

Fix number of CPUs for MEGAHIT to 1. Not increased with retries.

type: boolean

Fix number of CPUs used by SPAdes. Not increased with retries.

type: integer

default: -1

Fix number of CPUs used by SPAdes hybrid. Not increased with retries.

type: integer

default: -1

RNG seed for MetaBAT2.

type: integer

default: 1

Specify which adapter clipping tool to use. Options: ‘fastp’, ‘adapterremoval’

type: string

The minimum length reads must have to be retained for downstream analysis.

type: integer

default: 15

Minimum phred quality value of a base to be qualified in fastp.

type: integer

default: 15

The mean quality requirement used for per read sliding window cutting by fastp.

type: integer

default: 15

Save reads that fail fastp filtering in a separate file. Not used downstream.

type: boolean

The minimum base quality for low-quality base trimming by AdapterRemoval.

type: integer

default: 2

Turn on quality trimming by consecutive stretches of low-quality bases, rather than by window.

type: boolean

Forward read adapter to be trimmed by AdapterRemoval.

type: string

default: AGATCGGAAGAGCACACGTCTGAACTCCAGTCACNNNNNNATCTCGTATGCCGTCTTCTGCTTG

Reverse read adapter to be trimmed by AdapterRemoval for paired end data.

type: string

default: AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGTAGATCTCGGTGGTCGCCGTATCATT

Name of iGenomes reference for host contamination removal.

type: string

Fasta reference file for host contamination removal.

type: string

Use the --very-sensitive instead of the --sensitive setting for Bowtie 2 to map reads against the host genome.

type: boolean

Save the read IDs of removed host reads.

type: boolean

Keep reads similar to the Illumina internal standard PhiX genome.

type: boolean

Genome reference used to remove Illumina PhiX contaminant reads.

hidden

type: string

default: ${baseDir}/assets/data/GCA_002596845.1_ASM259684v1_genomic.fna.gz

Skip removing adapter sequences from long reads.

type: boolean

Discard any read which is shorter than this value.

type: integer

default: 1000

Keep this percent of bases.

type: integer

default: 90

The higher the value, the more important read length is when choosing the best reads.

type: integer

default: 10

Keep reads similar to the ONT internal standard Escherichia virus Lambda genome.

type: boolean

Genome reference used to remove ONT Lambda contaminant reads.

hidden

type: string

default: ${baseDir}/assets/data/GCA_000840245.1_ViralProj14204_genomic.fna.gz

Taxonomic classification is disabled by default. You have to specify one of the options below to activate it.

Database for taxonomic binning with centrifuge.

type: string

Database for taxonomic binning with kraken2.

type: string

Skip creating a krona plot for taxonomic binning.

type: boolean

Database for taxonomic classification of metagenome assembled genomes.

type: string

Generate CAT database.

type: boolean

Save the CAT database generated when specified by --cat_db_generate.

type: boolean

GTDB database for taxonomic classification of bins with GTDB-Tk.

type: string

default: https://data.gtdb.ecogenomic.org/releases/release202/202.0/auxillary_files/gtdbtk_r202_data.tar.gz

Min. bin completeness (in %) required to apply GTDB-Tk classification.

type: number

default: 50

Max. bin contamination (in %) allowed to apply GTDB-Tk classification.

type: number

default: 10
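The completeness and contamination thresholds above act together as a pre-filter on bins. A minimal Python sketch of that idea follows. The `BinQuality` class, the function name, the sample bins, and the choice of inclusive comparisons are all assumptions made for illustration; the pipeline's own implementation may differ.

```python
from dataclasses import dataclass


@dataclass
class BinQuality:
    """Hypothetical record holding quality estimates for one bin."""
    name: str
    completeness: float   # in percent
    contamination: float  # in percent


def passes_classification_filter(
    bin_quality: BinQuality,
    min_completeness: float = 50.0,   # default listed above
    max_contamination: float = 10.0,  # default listed above
) -> bool:
    """Assumed rule: keep bins that are complete enough and clean enough."""
    return (
        bin_quality.completeness >= min_completeness
        and bin_quality.contamination <= max_contamination
    )


bins = [
    BinQuality("bin.1", completeness=92.3, contamination=1.8),
    BinQuality("bin.2", completeness=41.0, contamination=0.5),
    BinQuality("bin.3", completeness=75.0, contamination=14.2),
]
selected = [b.name for b in bins if passes_classification_filter(b)]
print(selected)  # ['bin.1']
```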

Min. fraction of AA (in %) in the MSA for bins to be kept.

type: number

default: 10

Min. alignment fraction to consider closest genome.

type: number

default: 0.65

Number of CPUs used by pplacer, the tool run by GTDB-Tk.

type: number

default: 1

Reduce GTDB-Tk memory consumption by running pplacer in a setting writing to disk.

type: boolean

default: true

Co-assemble samples within one group, instead of assembling each sample separately.

type: boolean

Additional custom options for SPAdes.

type: string

Additional custom options for MEGAHIT.

type: string

Skip Illumina-only SPAdes assembly.

type: boolean

Skip SPAdes hybrid assembly.

type: boolean

Skip MEGAHIT assembly.

type: boolean

Skip metaQUAST.

type: boolean

Skip Prodigal gene prediction

type: boolean

Defines mapping strategy to compute co-abundances for binning, i.e. which samples will be mapped against the assembly.

type: string

default: group

Skip metagenome binning entirely

type: boolean

Skip MetaBAT2 Binning

type: boolean

Skip MaxBin2 Binning

type: boolean

Minimum contig size to be considered for binning and for bin quality check.

type: integer

default: 1500

Minimal length of contigs that are not part of any bin but are treated as individual genomes.

type: integer

default: 1000000

Maximal number of contigs that are not part of any bin but are treated as individual genomes.

type: integer

default: 100
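One plausible reading of how the two unbinned-contig settings combine is sketched below in Python: contigs shorter than the length threshold are dropped, and of the remainder at most the configured number are kept. The function name, the input mapping, and the choice to prefer the longest contigs are assumptions for illustration; the pipeline's actual selection logic is not described here.

```python
def select_standalone_contigs(
    unbinned_lengths: dict[str, int],
    min_length: int = 1_000_000,  # default listed above
    max_count: int = 100,         # default listed above
) -> list[str]:
    """Assumed rule: keep the longest unbinned contigs above the length cut-off."""
    long_enough = {
        name: length
        for name, length in unbinned_lengths.items()
        if length >= min_length
    }
    # Rank by length, longest first, then cap the number of contigs kept.
    ranked = sorted(long_enough, key=long_enough.get, reverse=True)
    return ranked[:max_count]


lengths = {"contig_1": 2_500_000, "contig_2": 900_000, "contig_3": 1_000_000}
print(select_standalone_contigs(lengths))               # ['contig_1', 'contig_3']
print(select_standalone_contigs(lengths, max_count=1))  # ['contig_1']
```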

Bowtie2 alignment mode

type: string

Skip Prokka genome annotation.

type: boolean

Disable bin QC with BUSCO.

type: boolean

Download path for BUSCO lineage dataset, instead of using automated lineage selection.

type: string

Path to local folder containing already downloaded and unpacked lineage datasets.

type: string

Run BUSCO with automated lineage selection, but ignoring eukaryotes (saves runtime).

type: boolean

Save the used BUSCO lineage datasets provided via --busco_reference or downloaded when not using --busco_reference or --busco_download_path.

type: boolean

Turn on bin refinement using DAS Tool.

type: boolean

Specify single-copy gene score threshold for bin refinement.

type: number

default: 0.5

Specify which binning output is sent for downstream annotation, taxonomic classification, bin quality control etc.

type: string

Performs ancient DNA assembly validation and contig consensus sequence recalling.

Turn on/off the ancient DNA subworkflow.

type: boolean

Ploidy for variant calling

type: integer

default: 1

Minimum base quality required for variant calling

type: integer

default: 20

Minimum minor allele frequency for considering variants

type: number

default: 0.33

Minimum genotype quality for considering a variant high quality

type: integer

default: 30

Minimum genotype quality for considering a variant medium quality

type: integer

default: 20
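The high and medium genotype quality thresholds suggest a three-tier labelling of variants. The Python sketch below shows one way such tiers could be assigned. The function name, the "low" label for everything below the medium threshold, and the inclusive comparisons are assumptions for illustration, not a description of the pipeline's code.

```python
def classify_genotype_quality(
    genotype_quality: int,
    high_threshold: int = 30,    # default listed above for high quality
    medium_threshold: int = 20,  # default listed above for medium quality
) -> str:
    """Assumed rule: label a variant by comparing its GQ to the two thresholds."""
    if genotype_quality >= high_threshold:
        return "high"
    if genotype_quality >= medium_threshold:
        return "medium"
    return "low"


for gq in (35, 25, 10):
    print(gq, classify_genotype_quality(gq))
```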

Minimum number of bases supporting the alternative allele

type: integer

default: 3

PyDamage accuracy threshold

type: number

default: 0.5

nf-core/mag