Adding ancient subworkflow#247
Conversation
d4straub
left a comment
There was a problem hiding this comment.
Nice work!
The remaining test failures seem only linting and master/dev PR confusion.
Just to make sure I understand, this only evaluates whether contigs are likely of ancient origin? Wouldn't it be good when there is some kind of consequence to it, e.g. that those that are of no ancient origin are filtered out, or that the ones or ancient origin are separated (sorry for my ignorance, I have no idea how this usually works).
I am not sure whether the -profile is the best place to add those configurations, why do you prefer that over a parameter which is than documented in nextflow_schema.json? The documentation of the profile seems not clear this way. In any case, I'd expect aDNA also to be described in the usage.md at a more prominent place, such as before Running the pipeline.
I suggest to use the |
Do you mean |
Could also do like this :) |
|
|
Hey @d4straub and @jfy133 , 'bowtie2_assembly_align' {
args = params.bowtie2_mode ? params.bowtie2_mode : params.ancient_dna ? '--very-sensitive-local -N 1' : ''
publish_files = ['log':'']
publish_by_meta = ['assembler', 'QC', 'id']
publish_dir = "Assembly"
}With $ cat /work/c1/c49e900dd4f90f1d88e47f02516a40/.command.sh
#!/bin/bash -euo pipefail
INDEX=`find -L ./ -name "*.rev.1.bt2l" -o -name "*.rev.1.bt2" | sed 's/.rev.1.bt2l//' | sed 's/.rev.1.bt2//'`
bowtie2 \
-p "2" \
-x $INDEX \
\ # here should be --very-sensitive-local -N 1
-1 "test_minigut.phix_removed.unmapped_1.fastq.gz" -2 "test_minigut.phix_removed.unmapped_2.fastq.gz" \
2> "SPAdes-test_minigut-test_minigut.bowtie2.log" |
samtools view -@ "2" -bS | samtools sort -@ "2" -o "SPAdes-test_minigut-test_minigut.bam"
samtools index "SPAdes-test_minigut-test_minigut.bam"
if [ SPAdes-test_minigut-test_minigut = "SPAdes-test_minigut-test_minigut" ] ; then
mv "SPAdes-test_minigut-test_minigut.bowtie2.log" "SPAdes-test_minigut.bowtie2.log"
fi
echo $(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*$//' > bowtie2_assembly.version.txtMy guess is that a switch to the nf-core modules DSL2 v2 syntax should fix it (like for the aDNA modules of this PR) |
jfy133
left a comment
There was a problem hiding this comment.
It would still be good to reference that aDNA mode still works in the USAGE docs.
Might as well also mention in in the main README that this is now possible with MAG
But otherwise the code looks good!
I agree, would be nice to update the README, however, I suck at Inkscape, @d4straub Do you want to add it ? That's what I did |
|
@maxibor I can also do it if you send me the SVG |
There was a problem hiding this comment.
Looks quite good to me.
However, I think it would be really good to mention it in README (at least a small note about aDNA) and docs/usage.
I also agree that it should be in the overview figure (but I disagree to stretch the figure [i.e. rather minimize white space] and it should be obvious that its an on-demand or at least optional step, imho).
Edit: Just saw in the failed linting (expected, I know) the unexpected line
freebayes | Local copy of module does not match remote
thats a bit worrisome if its a relevant change. Should be local then.
That was just one |
jfy133
left a comment
There was a problem hiding this comment.
Only think I find missing would be a short description in USAGE of exactly what the purposes of the aDNA workflow is (the README describes what, not why), but otherwise wise looks good 👍
Co-authored-by: James A. Fellows Yates <[email protected]>
d4straub
left a comment
There was a problem hiding this comment.
Looks good to me as well. I agree and still think (previous comments) that there should be a mention of the profile and usage in the docs, i.e. usage.md. I approve it anyway.
This pull request adds a subworkflow specific to ancient DNA de novo assembly validation.
This consists of mainly two steps:
The proposed way of activating this aDNA subworkflow is through a specific profile
-profile ancient_dna, as a few general parameters of MAG need to be tuned to make them sensible for aDNA data (the bowtie2 alignment mode).Would be cool when you have time if you could add the ancient DNA subworfklow to the general MAG workflow schema (on the right of the assembly step) 🙂 .
To Do
options.argsPR checklist
nf-core lint) (won't lint because of outdated multiqc config @skrakau @d4straub )nextflow run . -profile test,docker).docs/usage.mdis updated.docs/output.mdis updated.CHANGELOG.mdis updated.README.mdis updated (including new tool citations and authors/contributors).