Add parameter to exclude unbinned data from post-binning steps#708
Add parameter to exclude unbinned data from post-binning steps#708jfy133 merged 8 commits intonf-core:devfrom
Conversation
jfy133
left a comment
There was a problem hiding this comment.
This is a great idea!
I've tweaked the phrasing of the parameter and given more guidance to the parameter name.
Please update the CHANGELOG and I think then we might be good to go.
The only thing check is that the bin_summary steps don't fail for some reason because some contigs are missing for some reason
Co-authored-by: James A. Fellows Yates <[email protected]>
|
Thank you for the review! I believe everything should be good to go now. As for a possible bin summary failure, I don’t think it will case issues, because DEPTHS and every other bin summary input uses the same channel. I did a test with a few samples and didn't have any problem. On that note, I think it should be worth considering making bin summary more flexible, allowing it to handle cases where not all inputs have identical bins. So, if a QC bin fails, we could generate the table with a warning rather than an error. This should prevent issues like #603, but obviously, if a QC process fails the user should have to look into it. However, this could be a discussion to have in another PR. |
jfy133
left a comment
There was a problem hiding this comment.
Thank you very much @dialvarezs !
This pr adds a new parameter
exclude_unbins_from_postbinning, with default valuefalse.The reasoning behind this is that as stated in #649, Prokka can take a really long time with unbinned data, and also qc metrics and taxonomic classification normally are not meaningful for non-bins.
With the default set to
false, this parameter doesn't introduce any breaking change to the current workflow, but adds flexibility to focus on binned data on post-binning steps.This PR also fixes one minor issue I found in line 827 of
mag.nf, where the channel passed toGUNCwasch_input_bins_for_qcinstead ofch_input_bins_for_gunc(the same, but without eukarya domain).Closes #649 .
PR checklist
nf-core pipelines lint).nextflow run . -profile test,docker --outdir <OUTDIR>).nextflow run . -profile debug,test,docker --outdir <OUTDIR>).docs/usage.mdis updated.docs/output.mdis updated.CHANGELOG.mdis updated.README.mdis updated (including new tool citations and authors/contributors).