Fixed metabat input channel #55
Merged
skrakau merged 2 commits into nf-core:dev on Jun 29, 2020
d4straub (Collaborator) approved these changes on Jun 29, 2020 and left a comment:
Thanks for your efforts!
Fixed the problem described in #32: the input channel for metabat joined assembly mappings with an assembly from the wrong sample.
The input channel is created by joining the following two channels:
assembly_mapping_for_metabat.groupTuple(by:[0,1]): [MEGAHIT, SAMPLE_ID1, [MEGAHIT-SAMPLE_ID1-SAMPLE_ID1.bam, MEGAHIT-SAMPLE_ID1-SAMPLE_ID2.bam], [MEGAHIT-SAMPLE_ID1-SAMPLE_ID1.bam.bai, MEGAHIT-SAMPLE_ID1-SAMPLE_ID2.bam.bai]]
...
assembly_all_to_metabat_copy: [MEGAHIT, SAMPLE_ID1, SAMPLE_ID1.contigs.fa]
...
These two channels are currently joined with default parameters, that is, by index 0 only, which is the assembler name and not the sample name. This caused no problem for single-sample analyses, or when the emitted assemblies and assembly mappings happened to keep the same order during the run. Otherwise, assembly mappings and an assembly from different samples could be joined, which caused the error observed in PR #53 because the wrong contigs were used.
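To illustrate the difference between joining on index 0 and joining on indices [0,1], here is a minimal plain-Python sketch. It is not Nextflow code: the `join` function is a hypothetical, simplified model that pairs items with equal keys in arrival order, and the real operator may handle duplicate keys differently. The sample tuples are made up for illustration, and the `.bai` lists are left out for brevity.

```python
from collections import defaultdict, deque


def join(left, right, by=(0,)):
    """Simplified model of a keyed join: pair items whose key fields match,
    in arrival order. This is an assumption, not Nextflow's implementation."""
    def key(item):
        return tuple(item[i] for i in by)

    def rest(item):
        return [value for i, value in enumerate(item) if i not in by]

    # Queue the right-hand items per key, preserving emission order.
    waiting = defaultdict(deque)
    for item in right:
        waiting[key(item)].append(item)

    joined = []
    for item in left:
        k = key(item)
        if waiting[k]:
            match = waiting[k].popleft()
            joined.append(list(k) + rest(item) + rest(match))
    return joined


# Grouped mappings: [assembler, sample, [bam files]] (illustrative values).
mappings = [
    ["MEGAHIT", "SAMPLE_ID1", ["MEGAHIT-SAMPLE_ID1-SAMPLE_ID1.bam",
                               "MEGAHIT-SAMPLE_ID1-SAMPLE_ID2.bam"]],
    ["MEGAHIT", "SAMPLE_ID2", ["MEGAHIT-SAMPLE_ID2-SAMPLE_ID1.bam",
                               "MEGAHIT-SAMPLE_ID2-SAMPLE_ID2.bam"]],
]

# Assemblies: [assembler, sample, contigs], emitted in a different order.
assemblies = [
    ["MEGAHIT", "SAMPLE_ID2", "SAMPLE_ID2.contigs.fa"],
    ["MEGAHIT", "SAMPLE_ID1", "SAMPLE_ID1.contigs.fa"],
]

# Key = assembler only: mappings of one sample can meet contigs of another.
by_assembler = join(mappings, assemblies, by=(0,))

# Key = assembler and sample: each mapping group meets its own assembly.
by_both = join(mappings, assemblies, by=(0, 1))

for row in by_assembler:
    print("by 0:    ", row[1], "joined with", row[-1])
for row in by_both:
    print("by [0,1]:", row[1], "joined with", row[-1])
```

In this model, the join on index 0 depends entirely on emission order, while the join on [0,1] gives the same pairing regardless of the order in which the two channels emit.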
Fixed this by using join with parameter by:[0,1] to join by assembler name and sample name.
PR checklist
(nextflow run . -profile test,docker).
(nf-core lint .).
docs is updated
CHANGELOG.md is updated
README.md is updated
Learn more about contributing: https://github.com/nf-core/mag/tree/master/.github/CONTRIBUTING.md