Update GTDB-Tk QC documentation to include CheckM and CheckM2#984

Merged
dialvarezs merged 5 commits into dev from claude/fix-issue-983-DcUiG
Mar 2, 2026

@dialvarezs dialvarezs commented Mar 2, 2026

Description

This PR updates the documentation and schema to clarify that GTDB-Tk bin quality filtering now supports multiple QC tools beyond BUSCO, including CheckM and CheckM2.

PR Checklist

  • This comment contains a description of changes (with reason)
  • Output Documentation in docs/output.md is updated
  • Schema documentation in nextflow_schema.json is updated

https://claude.ai/code/session_01GFjgJHCGv7Jn9ZEdsVMv8p

Closes #983.

jfy133 and others added 2 commits February 2, 2026 15:07
Release 5.4.0 (Yellow Frog)
The `gtdbtk_min_completeness` and `gtdbtk_max_contamination` parameters
previously described filtering as applying only to BUSCO quality metrics.
However, since v2.5 the pipeline supports multiple bin QC tools (BUSCO,
CheckM, CheckM2) and the filter applies to results from any/all of them.

Update help_text in nextflow_schema.json and descriptions in docs/output.md
to accurately reflect that completeness and contamination filtering works
across all available bin QC tools, not exclusively BUSCO.

Fixes #983

https://claude.ai/code/session_01GFjgJHCGv7Jn9ZEdsVMv8p
@dialvarezs dialvarezs changed the base branch from main to dev March 2, 2026 18:52
@nf-core nf-core deleted a comment from github-actions Bot Mar 2, 2026
@nf-core-bot
Member

Warning

Newer version of the nf-core template is available.

Your pipeline is using an old version of the nf-core template: 3.5.1.
Please update your pipeline to the latest version.

For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.


github-actions Bot commented Mar 2, 2026

nf-core pipelines lint overall result: Passed ✅ ⚠️

Posted for pipeline commit f88f8ff

  • ✅ 383 tests passed
  • ❔ 1 test was ignored
  • ❗ 6 tests had warnings

❗ Test warnings:

  • pipeline_todos - TODO string in main.nf: Remove this line if you don't need a FASTA file [TODO: try and test using for --host_fasta and --host_genome]
  • pipeline_todos - TODO string in methods_description_template.yml: #Update the HTML below to your preferred methods description, e.g. add publication citation for this pipeline
  • pipeline_todos - TODO string in nextflow.config: Specify any additional parameters here
  • pipeline_todos - TODO string in main.nf: Optionally add in-text citation tools to this list.
  • pipeline_todos - TODO string in main.nf: Optionally add bibliographic entries to this list.
  • pipeline_todos - TODO string in main.nf: Only uncomment below if logic in toolCitationText/toolBibliographyText has been filled!

❔ Tests ignored:

  • files_unchanged - File ignored due to lint config: .github/PULL_REQUEST_TEMPLATE.md

Run details

  • nf-core/tools version 3.5.1
  • Run at 2026-03-02 20:15:46

@github-actions github-actions Bot added the size/s label Mar 2, 2026
@dialvarezs dialvarezs linked an issue Mar 2, 2026 that may be closed by this pull request
Comment thread nextflow_schema.json Outdated
"default": 10,
"description": "Max. bin contamination (in %) allowed to apply GTDB-tk classification.",
- "help_text": "Contamination approximated based on BUSCO analysis (%Complete and duplicated). If too high, GTDB-tk classification results can be impaired due to contamination!",
+ "help_text": "Contamination assessed with any of the bin QC tools (BUSCO, CheckM, CheckM2). The minimum contamination value across all available QC tools is used for filtering. If too high, GTDB-tk classification results can be impaired due to contamination!",
Contributor

@prototaxites prototaxites Mar 2, 2026

Just want to double-check that this accurately represents the code. I think the code applies the filter to all value pairs from all tools and passes the bin if any pair passes. Is that equivalent to these descriptions?

Member Author

@dialvarezs dialvarezs Mar 2, 2026

Yes, you're right, I'm going to rephrase it. But I guess the effect is actually the same.

passed: (
    completeness.any { bin_completeness -> bin_completeness != -1 } &&
    completeness.any { bin_completeness -> bin_completeness >= params.gtdbtk_min_completeness } &&
    contamination.any { bin_contamination -> bin_contamination != -1 } &&
    contamination.any { bin_contamination -> bin_contamination <= params.gtdbtk_max_contamination }
)
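
The four `any` checks above can be mirrored in a few lines of Python to see how they relate to the "best value across tools" wording. This is only an illustrative sketch, not pipeline code: the function names, the list-per-metric inputs, and the assumption that index i of both lists comes from the same QC tool are all hypothetical. The 50 % completeness threshold in the examples is an arbitrary illustration; the 10 % contamination threshold matches the schema default quoted earlier.

```python
from typing import Sequence

MISSING = -1  # placeholder value that the quoted snippet compares against


def passes_any(completeness: Sequence[float], contamination: Sequence[float],
               min_completeness: float, max_contamination: float) -> bool:
    """Transliteration of the four Groovy `any` checks quoted above."""
    return (
        any(c != MISSING for c in completeness)
        and any(c >= min_completeness for c in completeness)
        and any(c != MISSING for c in contamination)
        and any(c <= max_contamination for c in contamination)
    )


def passes_extremes(completeness: Sequence[float], contamination: Sequence[float],
                    min_completeness: float, max_contamination: float) -> bool:
    """Same decision phrased as max completeness / min contamination across tools."""
    return (
        any(c != MISSING for c in completeness)
        and max(completeness, default=float("-inf")) >= min_completeness
        and any(c != MISSING for c in contamination)
        and min(contamination, default=float("inf")) <= max_contamination
    )


def passes_pairwise(completeness: Sequence[float], contamination: Sequence[float],
                    min_completeness: float, max_contamination: float) -> bool:
    """Hypothetical stricter variant: one tool must satisfy both thresholds.

    Assumes completeness[i] and contamination[i] come from the same tool.
    """
    return any(
        comp != MISSING and cont != MISSING
        and comp >= min_completeness and cont <= max_contamination
        for comp, cont in zip(completeness, contamination)
    )


if __name__ == "__main__":
    # Tool A: complete but contaminated; tool B: clean but incomplete.
    comp, cont = [95.0, 40.0], [12.0, 3.0]
    print(passes_any(comp, cont, 50, 10))       # True
    print(passes_extremes(comp, cont, 50, 10))  # True
    print(passes_pairwise(comp, cont, 50, 10))  # False

    # One tool reports no contamination value (-1), the other reports 50 %.
    print(passes_any([90.0], [MISSING, 50.0], 50, 10))  # True
```

Under these assumptions the `any` checks behave like taking the maximum completeness and the minimum contamination across tools. A strictly pairwise check would be stricter, because the passing completeness and the passing contamination may come from different tools. The last example also suggests that a `-1` contamination placeholder would satisfy the `<=` comparison whenever another tool reports a real value; whether that combination can actually occur in the pipeline is not shown in this thread.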

@dialvarezs dialvarezs force-pushed the claude/fix-issue-983-DcUiG branch from a7f000d to f88f8ff Compare March 2, 2026 20:12
@dialvarezs dialvarezs merged commit 608db5c into dev Mar 2, 2026
8 checks passed
@dialvarezs dialvarezs mentioned this pull request Mar 7, 2026
Development

Successfully merging this pull request may close these issues.

GTDBtk completeness/contamination Param docs out of date

5 participants