Add CAT database creation #196

Merged
skrakau merged 13 commits into nf-core:dev from skrakau:add_cat_db_creation
May 25, 2021

@skrakau skrakau commented May 23, 2021

Add CAT database creation as an alternative to using pre-built databases.

Pre-built CAT databases have two problems: they do not stay accessible at https://tbb.bio.uu.nl/bastiaan/CAT_prepare/, and databases built with a different DIAMOND version than the one used for running CAT classification are not always compatible (see #188 and #90).

Adding this step ensures that the same DIAMOND version is used both for building the database and for running the actual classification. The parameter --save_cat_db can be used to save the generated database for future reproducibility.

The process requires up to 200 GB of memory, up to 300 GB of disk space for the generated database in the work dir, and > 100 GB for the saved database.
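As a rough illustration of how the new parameters might be combined, the sketch below only prints a possible command line and does not run the pipeline. The parameter names --cat_db_generate and --save_cat_db are taken from this PR; the helper name build_mag_cmd, the docker profile choice, the input glob, and the output directory are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: prints (does not execute) a possible nf-core/mag invocation.
# --cat_db_generate and --save_cat_db are the parameter names used in this PR;
# the input pattern and output directory below are placeholders.
build_mag_cmd() {
    input="$1"    # placeholder: input reads pattern
    outdir="$2"   # placeholder: results directory
    printf '%s\n' "nextflow run nf-core/mag -profile docker --input '$input' --outdir '$outdir' --cat_db_generate --save_cat_db"
}

# Print the command so it can be reviewed before launching it on a machine
# with enough memory and disk space for the database build.
build_mag_cmd 'reads/*_R{1,2}.fastq.gz' 'results'
```

Printing the command first makes it easy to check the flags before committing a node to a build of this size.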

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - add to the software_versions process and a regex to scrape_software_versions.py
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • If necessary, also make a PR on the nf-core/mag branch on the nf-core/test-datasets repository.
  • Make sure your code lints (nf-core lint .).
  • Ensure the test suite passes (nextflow run . -profile test,docker).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

github-actions Bot commented May 23, 2021

nf-core lint overall result: Passed ✅ ⚠️

Posted for pipeline commit 50ec0b8

+| ✅ 123 tests passed       |+
#| ❔   5 tests were ignored |#
!| ❗  75 tests had warnings |!
Details

❗ Test warnings:

  • files_exist - File not found: environment.yml
  • files_exist - File not found: Dockerfile
  • nextflow_config - Config variable not found: process.container
  • params_used - Config variable not found in main.nf: params.input
  • params_used - Config variable not found in main.nf: params.single_end
  • params_used - Config variable not found in main.nf: params.save_trimmed_fail
  • params_used - Config variable not found in main.nf: params.mean_quality
  • params_used - Config variable not found in main.nf: params.trimming_quality
  • params_used - Config variable not found in main.nf: params.keep_phix
  • params_used - Config variable not found in main.nf: params.phix_reference
  • params_used - Config variable not found in main.nf: params.host_fasta
  • params_used - Config variable not found in main.nf: params.host_genome
  • params_used - Config variable not found in main.nf: params.host_removal_verysensitive
  • params_used - Config variable not found in main.nf: params.host_removal_save_ids
  • params_used - Config variable not found in main.nf: params.binning_map_mode
  • params_used - Config variable not found in main.nf: params.skip_binning
  • params_used - Config variable not found in main.nf: params.min_contig_size
  • params_used - Config variable not found in main.nf: params.min_length_unbinned_contigs
  • params_used - Config variable not found in main.nf: params.max_unbinned_contigs
  • params_used - Config variable not found in main.nf: params.coassemble_group
  • params_used - Config variable not found in main.nf: params.spades_options
  • params_used - Config variable not found in main.nf: params.megahit_options
  • params_used - Config variable not found in main.nf: params.skip_spades
  • params_used - Config variable not found in main.nf: params.skip_spadeshybrid
  • params_used - Config variable not found in main.nf: params.skip_megahit
  • params_used - Config variable not found in main.nf: params.skip_quast
  • params_used - Config variable not found in main.nf: params.centrifuge_db
  • params_used - Config variable not found in main.nf: params.kraken2_db
  • params_used - Config variable not found in main.nf: params.skip_krona
  • params_used - Config variable not found in main.nf: params.cat_db
  • params_used - Config variable not found in main.nf: params.cat_db_generate
  • params_used - Config variable not found in main.nf: params.save_cat_db
  • params_used - Config variable not found in main.nf: params.gtdb
  • params_used - Config variable not found in main.nf: params.gtdbtk_min_completeness
  • params_used - Config variable not found in main.nf: params.gtdbtk_max_contamination
  • params_used - Config variable not found in main.nf: params.gtdbtk_min_perc_aa
  • params_used - Config variable not found in main.nf: params.gtdbtk_min_af
  • params_used - Config variable not found in main.nf: params.gtdbtk_pplacer_cpus
  • params_used - Config variable not found in main.nf: params.gtdbtk_pplacer_scratch
  • params_used - Config variable not found in main.nf: params.skip_adapter_trimming
  • params_used - Config variable not found in main.nf: params.keep_lambda
  • params_used - Config variable not found in main.nf: params.longreads_min_length
  • params_used - Config variable not found in main.nf: params.longreads_keep_percent
  • params_used - Config variable not found in main.nf: params.longreads_length_weight
  • params_used - Config variable not found in main.nf: params.lambda_reference
  • params_used - Config variable not found in main.nf: params.skip_busco
  • params_used - Config variable not found in main.nf: params.busco_reference
  • params_used - Config variable not found in main.nf: params.busco_download_path
  • params_used - Config variable not found in main.nf: params.busco_auto_lineage_prok
  • params_used - Config variable not found in main.nf: params.save_busco_reference
  • params_used - Config variable not found in main.nf: params.megahit_fix_cpu_1
  • params_used - Config variable not found in main.nf: params.spades_fix_cpus
  • params_used - Config variable not found in main.nf: params.spadeshybrid_fix_cpus
  • params_used - Config variable not found in main.nf: params.metabat_rng_seed
  • params_used - Config variable not found in main.nf: params.multiqc_config
  • params_used - Config variable not found in main.nf: params.multiqc_title
  • params_used - Config variable not found in main.nf: params.max_multiqc_email_size
  • params_used - Config variable not found in main.nf: params.skip_multiqc
  • params_used - Config variable not found in main.nf: params.outdir
  • params_used - Config variable not found in main.nf: params.tracedir
  • params_used - Config variable not found in main.nf: params.publish_dir_mode
  • params_used - Config variable not found in main.nf: params.email
  • params_used - Config variable not found in main.nf: params.email_on_fail
  • params_used - Config variable not found in main.nf: params.plaintext_email
  • params_used - Config variable not found in main.nf: params.enable_conda
  • params_used - Config variable not found in main.nf: params.singularity_pull_docker_container
  • params_used - Config variable not found in main.nf: params.validate_params
  • params_used - Config variable not found in main.nf: params.hostnames
  • params_used - Config variable not found in main.nf: params.config_profile_description
  • params_used - Config variable not found in main.nf: params.config_profile_contact
  • params_used - Config variable not found in main.nf: params.config_profile_url
  • params_used - Config variable not found in main.nf: params.max_memory
  • params_used - Config variable not found in main.nf: params.max_cpus
  • params_used - Config variable not found in main.nf: params.max_time
  • schema_description - No description provided in schema for parameter: skip_multiqc

❔ Tests ignored:

  • files_unchanged - File ignored due to lint config: lib/NfcoreSchema.groovy
  • files_unchanged - File does not exist: .github/workflows/push_dockerhub_dev.yml
  • files_unchanged - File does not exist: .github/workflows/push_dockerhub_release.yml
  • conda_env_yaml - No environment.yml file found - skipping conda_env_yaml test
  • conda_dockerfile - No environment.yml / Dockerfile file found - skipping conda_dockerfile test

✅ Tests passed:

Run details

  • nf-core/tools version 1.14
  • Run at 2021-05-25 10:03:38

@skrakau skrakau requested a review from d4straub May 23, 2021 16:23
@skrakau skrakau force-pushed the add_cat_db_creation branch from 68e65d9 to 7ad4025 Compare May 23, 2021 16:27

@d4straub d4straub left a comment

Looks good to me!