nf-core/taxprofiler
Highly parallelised multi-taxonomic profiling of shotgun short- and long-read metagenomic data
classification, illumina, long-reads, metagenomics, microbiome, nanopore, pathogen, profiling, shotgun, taxonomic-classification, taxonomic-profiling
Version history
Added
- #417 Added reference-free metagenome complexity/coverage estimation with Nonpareil (added by @jfy133)
- #466 Input database sheets can specify a `db_type` column to distinguish between short- and long-read databases (added by @LilyAnderssonLee)
- #505 Add small files to the file `tower.yml` (added by @LilyAnderssonLee)
- #508 Add `nanoq` as a filtering tool for nanopore reads (added by @LilyAnderssonLee)
- #511 Add `porechop_abi` as an alternative adapter removal tool for long reads nanopore data (added by @LilyAnderssonLee)
- #512 Update all tools to the latest version and include nf-test (updated by @LilyAnderssonLee & @jfy133)
- #537 Update the module `motus/merge` to the latest version (updated by @sofstam & @LilyAnderssonLee)
Fixed
- #518 Fixed a bug where Oxford Nanopore FASTA input files would not be processed (❤️ to @ikarls for reporting, fixed by @jfy133)
- #523 Removed hardcoded `-m lca` from GANON_CLASSIFY because the new version of ganon offers more options (fixed by @LilyAnderssonLee & @jfy133)
- #531 Fix FASTA input validation in the schema, which wrongly allowed FASTQ extensions, and expand the allowed FASTA extensions (fixed by @jfy133)
- #512 Minor formatting and ordering improvements in MultiQC report (by @jfy133)
- #532 - Added missing documentation behind the ‘ignore’ BRACKEN_BRACKEN error strategy (❤️ to @Mavti for reporting, fixed by @jfy133)
- #536 - Redefine `contents_re` for filtlong to fix it being missing from the MultiQC report (fixed by @LilyAnderssonLee)
Dependencies
Tool | Previous version | New version |
---|---|---|
bbmap | 39.01 | 39.06 |
bowtie2 | 2.4.4 | 2.5.2 |
bracken | 2.7 | 2.9 |
diamond | 2.0.15 | 2.1.8 |
ganon | 1.5.1 | 2.0.0 |
kraken2 | 2.1.2 | 2.1.3 |
krona | 2.8 | 2.8.1 |
megan | 6.24.20 | 6.25.9 |
metaphlan | 4.0.6 | 4.1.1 |
minimap2 | 2.24 | 2.28 |
motus | 3.0.3 | 3.1.0 |
multiqc | 1.21 | 1.25 |
samtools | 1.17 | 1.20 |
Deprecated
Added
Fixed
- #484 Improved input validation to immediately fail if run accession IDs within a given sample ID are not unique (❤️ to @sofstam for reporting, fixed by @jfy133)
- #491 Added flag to publish intermediate bracken files (❤️ to @ewissel for reporting, fixed by @sofstam and @jfy133)
- #489 Fix KrakenUniq classified reads output format mismatch (❤️ to @SannaAb for reporting, fixed by @jfy133)
- #495 Stop TAXPASTA failures when profiles do not have exact compositionality (fixes by @Midnighter, @jfy133)
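
The uniqueness rule from #484 can be illustrated with a minimal Python sketch. This is not the pipeline's own validation code; it only shows the idea of rejecting a samplesheet in which the same run accession appears twice under one sample ID. The column names `sample` and `run_accession` and both function names are assumptions made for this illustration.

```python
import csv
import io
from collections import Counter


def find_duplicate_runs(samplesheet_text):
    """Return (sample, run_accession) pairs that occur more than once.

    Column names are assumed for illustration only.
    """
    reader = csv.DictReader(io.StringIO(samplesheet_text))
    counts = Counter((row["sample"], row["run_accession"]) for row in reader)
    return sorted(pair for pair, n in counts.items() if n > 1)


def validate_samplesheet(samplesheet_text):
    """Fail immediately if a run accession is repeated within a sample."""
    duplicates = find_duplicate_runs(samplesheet_text)
    if duplicates:
        raise ValueError(
            f"Run accession IDs must be unique within a sample: {duplicates}"
        )
```

The same run accession may still appear under different sample IDs; only repeats within one sample are treated as an error in this sketch.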
Dependencies
Tool | Previous version | New version |
---|---|---|
KMCP | 0.9.1 | 0.9.4 |
TAXPASTA | 0.6.1 | 0.7.0 |
Deprecated
Added
- #477 Provide more emphasis and links to tutorials on how to retrieve and supply reference databases (❤️ to @vmkalbskopf for reporting, added by @jfy133)
Fixed
- #476 Fixed bug in validating Bracken/Kraken/KMCP split database parameters (fixed by @LilyAnderssonLee)
Dependencies
Deprecated
Added
- #454 Updated to nf-core pipeline template v2.13.1 (added by @LilyAnderssonLee & @sofstam)
- #461 Turned on ‘strict’ Nextflow evaluation runs (added by @jfy133)
- #461 Optimised database compression so each compressed input database is untarred once, and shared amongst each run with different parameters (added by @jfy133)
- #461 Added new parameter to optionally save uncompressed databases (added by @jfy133)
- #471 Removed `-stub` run in the `download_pipeline.yml` because the pipeline does not support stub runs on dev (fixed by @LilyAnderssonLee)
Fixed
- #336 Replaced samplesheet check with nf-validation for both sample and database input sheets (fix by @LilyAnderssonLee)
- #460 Corrected the channel transformations to combine Kaiju and mOTUs reports with their reference databases (fix by @Midnighter)
Added
- #439 Read deduplication with fastp (added by @maxibor)
- #440 Include mention of pre-built kaiju databases in tutorial.md (added by @Joon-Klaps)
- #442 Updated to nf-core pipeline template v2.12 (added by @sofstam)
Fixed
- #444 Centrifuge now uses dedicated tmp directory to hopefully prevent mkfifo clashes (❤️ to @erinyoung for reporting, fix by @jfy133)
Dependencies
Tool | Previous version | New version |
---|---|---|
Centrifuge | 1.0.4_beta | 1.0.4.1 |
Fixed
- #431 Updated kaiju2table module to report taxon names (fix by @Joon-Klaps)
- #430 Fix the fastq output in the module LONGREAD_HOSTREMOVAL (fix by @LilyAnderssonLee)
Dependencies
Tool | Previous version | New version |
---|---|---|
kaiju | 1.8.2 | 1.10.0 |
Added
- #424 Updated to nf-core pipeline template v2.11.1 (added by @LilyAnderssonLee & @sofstam)
Fixed
- #419 Added improved syntax highlighting for tables in documentation (fix by @mashehu)
- #421 Updated the krakenuniq/preloadedkrakenuniq module that contained a fix for saving the output reads (❤️ to @SannaAb for reporting, fix by @Midnighter)
- #427 Fixed preprint information in the recommended methods text (fix by @jfy133)
Dependencies
Tool | Previous version | New version |
---|---|---|
multiqc | 1.15 | 1.19 |
fastqc | 11.9 | 12.1 |
nf-validation | unpinned | 1.1.3 |
Added
Fixed
- #405 Fix database to tool mismatching in KAIJU2KRONA input (❤️ to @MajoroMask for reporting, fix by @jfy133)
- #406 Fix overwriting of bracken-derived kraken2 outputs when the database name is shared between Bracken/Kraken2 (❤️ to @MajoroMask for reporting, fix by @jfy133)
- #409 Fix a NullPointerException error occurring occasionally in older versions of MEGAN’s rma2info (❤️ to @MajoroMask for reporting, fix by @jfy133)
Dependencies
Tool | Previous version | New version |
---|---|---|
megan/rma2info | 6.21.7 | 6.24.20 |
Added
- #379 Added support for previously missing Bracken-corrected Kraken2 report as output (added by @hkaspersen & @jfy133)
- #380 Updated to nf-core pipeline template v2.10 (added by @LilyAnderssonLee & @sofstam)
- #393 Add validation check for a taxpasta taxonomy directory if `--taxpasta_add_*` parameters requested (♥️ to @alimalrashed for reporting, added by @jfy133)
Fixed
- #383 Update the module of KrakenUniq to the latest to account for edge case bugs where FASTQ input was mis-detected as wrong format (❤️ to @asafpr for reporting and solution, fixed by @LilyAnderssonLee)
- #392 Update the module of Taxpasta to support adding taxa information to results (❤️ to @SannaAb for reporting, fixed by @Midnighter)
Dependencies
Tool | Previous version | New version |
---|---|---|
KrakenUniq | 1.0.2 | 1.0.4 |
taxpasta | 0.6.0 | 0.6.1 |
Deprecated
Added
- #298 New classifier ganon (added by @jfy133)
- #312 New classifier KMCP (added by @sofstam)
- #318 New classifier MetaPhlAn4 (MetaPhlAn3 support remains) (added by @LilyAnderssonLee)
- #276 Implemented batching in the KrakenUniq samples processing (added by @Midnighter)
- #272 Add saving of final ‘analysis-ready-reads’ to dedicated directory (❤️ to @alexhbnr for request, added by @jfy133)
- #303 Add support for taxpasta profile standardisation in single sample pipeline runs (❤️ to @artur-matysik for request, added by @jfy133)
- #308 Add citations and bibliographic information to the MultiQC methods text of tools used in a given pipeline run (added by @jfy133)
- #315 Updated to nf-core pipeline template v2.9 (added by @sofstam & @jfy133)
- #319 Added support for virus hit expansion in Kaiju (❤️ to @dnlrxn for requesting, added by @jfy133)
- #323 Add ability to skip sequencing quality control tools (❤️ to @vinisalazar for requesting, added by @jfy133)
- #345 Add simple tutorial to explain how to get up and running with an nf-core/taxprofiler run (added by @jfy133)
- #355 Add support for TAXPASTA’s `--add-rank-lineage` to output (❤️ to @MajoroMask for request, added by @Midnighter, @sofstam, @jfy133)
- #368 Add the ability to ignore profile errors caused by empty profiles and other validation errors when merging multiple profiles using TAXPASTA (added by @Midnighter and @LilyAnderssonLee)
Fixed
- #271 Improved standardised table generation documentation for mOTUs manual database download tutorial (♥ to @prototaxites for reporting, fix by @jfy133)
- #269 Reduced output files in AWS full test output due to very large files (fix by @jfy133)
- #270 Fixed warning for host removal index parameter, and improved index checks (♥ to @prototaxites for reporting, fix by @jfy133)
- #274 Substituted the samtools/bam2fq module with samtools/fastq module (fix by @sofstam)
- #275 Replaced function used for error reporting to more Nextflow friendly method (fix by @jfy133)
- #285 Fixed overly large log files in Kraken2 output (♥ to @prototaxites for reporting, fix by @Midnighter & @jfy133)
- #286 Runtime optimisation of MultiQC step via improved log file processing (fix by @Midnighter & @jfy133)
- #289 Pipeline updated to nf-core template 2.8 (fix by @Midnighter & @jfy133)
- #290 Minor database input documentation improvements (♥ to @alneberg for reporting, fix by @jfy133)
- #305 Fix docker/podman registry definition for tower compatibility (fix by @adamrtalbot, @jfy133)
- #304 Correct mistake in kaiju2table documentation, only single rank can be supplied (♥ to @artur-matysik for reporting, fix by @jfy133)
- #307 Fix databases being sometimes associated with the wrong tool (e.g. Kaiju) (fix by @jfy133, @Midnighter and @LilyAnderssonLee)
- #313 Fix pipeline not providing error when database sheet does not have a header (♥ to @noah472 for reporting, fix by @jfy133)
- #330 Added better tagging to allow disambiguation of Kraken2 steps of Kraken2 vs Bracken (♥ to @MajoroMask for requesting, added by @jfy133)
- #334 Increase the memory of the FALCO process to 4GB (fix by @LilyAnderssonLee)
- #332 Improved meta map stability for more robust pipeline resuming (fix by @jfy133)
- #338 Fixed the wrong ‘out’ file going to the `centrifuge kreport` module (♥ to @LilyAnderssonLee for reporting, fix by @jfy133)
- #342 Fixed docs/usage to correctly list the required database files for Bracken and tips to obtain Kraken2 databases (fix by @husensofteng)
- #350 Reorganize the CI tests into separate profiles in preparation for implementation of nf-test (fix by @LilyAnderssonLee)
- #364 Add autoMounts to apptainer profile in nextflow.config (♥ to @hkaspersen for reporting, fix by @LilyAnderssonLee)
- #372 Update modules to use quay.io nf-core mirrored containers (♥ to @maxulysse for pointing out, fix by @LilyAnderssonLee and @jfy133)
Dependencies
Tool | Previous version | New version |
---|---|---|
MultiQC | 1.13 | 1.15 |
TAXPASTA | 0.2.3 | 0.6.0 |
MetaPhlAn | 3.0.12 | 4.0.6 |
fastp | 0.23.2 | 0.23.4 |
samtools | 1.16.1 | 1.17 |
Deprecated
- #338 Updated Centrifuge module to not generate (undocumented) SAM alignments by default if `--save_centrifuge_reads` is supplied, due to a Centrifuge bug modifying the profile header. SAM alignments can still be generated if `--out-fmt` is supplied in `database.csv` (♥ to @LilyAnderssonLee for reporting, fix by @jfy133)
Fixed
v1.0.0 - Dodgy Dachshund [2023-03-13]
Added
- Add read quality control (sequencing QC, adapter removal and merging)
- Add read complexity filtering
- Add host-reads removal step
- Add run merging
- Add taxonomic classification
- Add taxon table standardisation
- Add post-classification visualisation
Contributed by: @jfy133 @sofstam @Midnighter @ljmesi @MillironX @jianhong @mjamy @rafalstepien @maxibor @talnor