Release v5.1.0
Release: v5.2.0 Puce Pangolin
@nf-core-bot fix linting
Please let me know what I can improve. Thank you!
d4straub
left a comment
CI tests fail with various errors. https://github.com/nf-core/mag/actions/runs/19404507607/job/55538896302?pr=931 fails with:
> ERROR ~ Error executing process > 'NFCORE_MAG:MAG:ALE (minigut)'
>
> Caused by:
> Process `NFCORE_MAG:MAG:ALE (minigut)` terminated with an error exit status (134)
>
>
> Command executed:
>
> ALE \
> \
> SPAdesHybrid-minigut-minigut.bam \
> SPAdesHybrid-minigut.scaffolds.fa \
> minigut_ALEoutput.txt
>
> cat <<-END_VERSIONS > versions.yml
> "NFCORE_MAG:MAG:ALE":
> ale: 20180904
> END_VERSIONS
>
> Command exit status:
> 134
>
> Command output:
> BAM file: SPAdesHybrid-minigut-minigut.bam
> Assembly fasta file: SPAdesHybrid-minigut.scaffolds.fa
> ALE Output file: minigut_ALEoutput.txt
> Reading in assembly...
> Reading in the map and computing statistics...
> Insert length and std not given, will be calculated from input map.
> Found FR sample avg insert length to be 383.864169 from 28344 mapped reads
> Found FR sample insert length std to be 69.336488
> Found NOT_PROPER_FR sample avg insert length to be 892.122675 from 66297 mapped reads
> Found NOT_PROPER_FR sample insert length std to be 221.969163
> There were 99620 total reads, 99620 paired (97898 properly mated), 763 proper singles, 959 improper reads (818 chimeric). (83 reads were unmapped)
> Saved library parameters to minigut_ALEoutput.txt.param
> Computing read placements and depths
>
> Command error:
> WARNING: The following read and its mate do not agree on the contigs and/or positions of their mappings:read1: NC_006347.1_4981 81: 0 0 106315 105875 read2: NC_006347.1_4981 161: 0 0 105578 106537 l: 1.000000 li: 1.000000, s1: 106315, s2: 105875, e1: 106441, e2: -1, c1: 0, c2: 0, NC_006347.1_4981, NOT_PROPER_FR, 0, b1: 34e7c540, b2: 0
> ALE: ALElike.c:1892: validateAlignmentMates: Assertion `thisAlignment->start2 == thisReadMate->core.pos' failed.
> BAM file: SPAdesHybrid-minigut-minigut.bam
> Assembly fasta file: SPAdesHybrid-minigut.scaffolds.fa
> ALE Output file: minigut_ALEoutput.txt
> Reading in assembly...
> Reading in the map and computing statistics...
> Insert length and std not given, will be calculated from input map.
> Found FR sample avg insert length to be 383.864169 from 28344 mapped reads
> Found FR sample insert length std to be 69.336488
> Found NOT_PROPER_FR sample avg insert length to be 892.122675 from 66297 mapped reads
> Found NOT_PROPER_FR sample insert length std to be 221.969163
> There were 99620 total reads, 99620 paired (97898 properly mated), 763 proper singles, 959 improper reads (818 chimeric). (83 reads were unmapped)
> Saved library parameters to minigut_ALEoutput.txt.param
> Computing read placements and depths
> .command.sh: line 6: 34 Aborted (core dumped) ALE SPAdesHybrid-minigut-minigut.bam SPAdesHybrid-minigut.scaffolds.fa minigut_ALEoutput.txt
>
> Work dir:
> /home/runner/_work/mag/mag/~/tests/b1878932db1a90503becf8394b4ddfd4/work/f4/88d2098735b2dd029b42b1a840ced7
>
> Container:
> quay.io/biocontainers/ale:20180904--py27ha92aebf_0
>
> Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
>
> -- Check '/home/runner/_work/mag/mag/~/tests/b1878932db1a90503becf8394b4ddfd4/meta/nextflow.log' file for details
> ERROR ~ Could not find which method load() to invoke from this list:
> public java.lang.Object org.yaml.snakeyaml.Yaml#load(java.io.InputStream)
> public java.lang.Object org.yaml.snakeyaml.Yaml#load(java.io.Reader)
> public java.lang.Object org.yaml.snakeyaml.Yaml#load(java.lang.String)
> public java.lang.Object org.yaml.snakeyaml.Yaml#load(java.io.File)
> public java.lang.Object org.yaml.snakeyaml.Yaml#load(java.nio.file.Path)
>
> -- Check script '/home/runner/_work/mag/mag/subworkflows/nf-core/utils_nfcore_pipeline/main.nf' at line: 82 or see '/home/runner/_work/mag/mag/~/tests/b1878932db1a90503becf8394b4ddfd4/meta/nextflow.log' file for more details
> ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting
>
> -- Check '/home/runner/_work/mag/mag/~/tests/b1878932db1a90503becf8394b4ddfd4/meta/nextflow.log' file for details
> -[nf-core/mag] Pipeline completed with errors-
> WARN: Killing running tasks (1)
FAILED (481.488s)
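For context on the exit status in the log above: by shell convention, a status above 128 usually means the process was terminated by signal number (status - 128). Status 134 therefore corresponds to signal 6, SIGABRT, which is what a failed C `assert` raises, consistent with the `validateAlignmentMates` assertion message. A minimal Python sketch of that decoding follows; the helper name `describe_exit_status` is hypothetical and is not part of Nextflow or ALE.

```python
import signal


def describe_exit_status(status):
    """Describe a shell-style exit status (hypothetical helper for illustration)."""
    if status == 0:
        return "success"
    if status > 128:
        # Shells report "killed by signal N" as 128 + N.
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Not a signal number known on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with error code {status}"


print(describe_exit_status(134))
```

This only classifies the status; it does not say why the assertion fired.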
Additionally, test https://github.com/nf-core/mag/actions/runs/19404507607/job/55538896283?pr=931 indicates that ALE is run but output files are not published to the results folder:
  {
    "ADJUST_MAXBIN2_EXT": {
      "coreutils": 9.5
+   },
+   "ALE": {
+     "ale": 20180904
    },
    "BIN_SUMMARY": {
      "pandas": "1.4.3",
Co-authored-by: Daniel Straub <[email protected]>
prototaxites
left a comment
Hi @PetcuBogdan, a few thoughts from me!
Hi! Thanks a lot for pointing this out, and apologies for the slow reply; I needed some time to investigate it properly. This issue is actually a known ALE bug that appears when running on the small synthetic BAM/FASTA files used in the nf-core test datasets. The error does not occur with real metagenomic data (e.g., the CAPES samples I used to validate the module locally), which is why the process completes normally outside the test environment. One possible path forward would be to update the ALE module to catch this specific failure mode and surface it as a warning or a logged message rather than causing the process to fail. Please let me know if there is something else that I can do. Thank you.
@PetcuBogdan do you know exactly what the bug is, though? We can theoretically try to work around it, but it would be good to know what the cause is before we do that, as workarounds can sometimes be dangerous (they can hide a bigger, different bug).
Hi! The ALE crash appears to come from its strict validateAlignmentMates check: some read pairs in the test dataset have inconsistent mate coordinates or orientations, and ALE aborts when it encounters them. This issue shows up only on the test data; on the real CAPES dataset ALE runs normally, so the pipeline itself is fine. To investigate this further, I could also use samtools to filter the input BAM and remove the problematic pairs before running ALE, so that we can confirm whether the error is strictly caused by these inconsistent alignments. Let me know if you want me to try that.
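To make the filtering idea concrete: in the warning quoted earlier in this thread, the values 81 and 161 look like the SAM flags of the two mates (an assumption, since ALE's message does not label them). The sketch below decodes them with the bit definitions from the SAM specification and shows that neither read carries the proper-pair bit (0x2), which is the bit `samtools view -f 2` selects on. The helper names are hypothetical.

```python
# Bit definitions of the SAM FLAG field, as given in the SAM specification.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",
    0x20: "mate_reverse",
    0x40: "read1",
    0x80: "read2",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the names of all bits set in a SAM flag (hypothetical helper)."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


def is_proper_pair(flag):
    """True if the aligner marked the read as part of a proper pair."""
    return bool(flag & 0x2)


# Values taken from the ALE warning; treating them as SAM flags is an assumption.
for flag in (81, 161):
    print(flag, decode_flag(flag), "proper pair:", is_proper_pair(flag))
```

Whether keeping only properly paired reads actually avoids the assertion would still need to be confirmed on the test BAM.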
I also noticed that there is a newer ale-core module available, which seems to be a more up-to-date and actively maintained version of ALE. It might handle mate-pair inconsistencies better, or at least fail more gracefully. If updating ALE in modules/nf-core is possible, switching to ale-core could potentially avoid this crash. Link to the ale-core package: https://anaconda.org/channels/bioconda/packages/ale-core/overview
@jfy133 I got a similar error using the newer
Warning: Newer version of the nf-core template is available. Your pipeline is using an old version of the nf-core template: 3.4.1. For more documentation on how to update your pipeline, please see the nf-core documentation and Synchronisation documentation.
Release: v5.3.0 Rainbow Rattlesnake
jfy133
left a comment
Approving, as my comments are not necessarily blocking (the second one is worrying, though).
Missing README.md, otherwise one question and one suggestion
Thanks @dialvarezs, and sorry that it took so long, and thanks for contributing @PetcuBogdan (and @amizeranschi for coordinating!)
PR checklist
`nextflow run . -profile test,docker --outdir <OUTDIR>`
`nextflow run . -profile debug,test,docker --outdir <OUTDIR>`
`docs/usage.md` is updated.
`docs/output.md` is updated.
`CHANGELOG.md` is updated.
`README.md` is updated (including new tool citations and authors/contributors).
Description
This PR adds ALE (Assembly Likelihood Estimator) for assembly quality control in nf-core/mag.
ALE is a probabilistic framework that evaluates assembly quality by computing the likelihood of sequencing reads given an assembly. It provides per-contig quality scores useful for identifying misassemblies, comparing assemblies, and validating quality before binning.
Changes made
Workflow:
References