Allow local databases to be used for kraken2, centrifuge, and busco #504
jfy133 merged 27 commits into nf-core:dev
@gregorysprenger I have a little bit of spare time so just had a look: I wonder if the error is coming from the fact that |
Yeah, I got it figured out now! Once I finish running the test profiles locally (and they pass), I'll get everything pushed over. |
| } | ||
| if (params.busco_download_path) { | ||
| Nextflow.error('Both --skip_binqc and --busco_download_path are specified! Invalid combination, please specify either --skip_binqc or --binqc_tool \'busco\' with --busco_download_path.') | ||
| if (params.busco_db) { |
Outwith the scope of this PR for sure, but I think cases like this where you skip a step but specify a database should probably just print a warning rather than quitting. Makes debugging a little easier if you just need to quickly turn something off.
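The warn-instead-of-quit behaviour suggested in this comment could look roughly like the following. This is a Python sketch rather than the pipeline's Groovy, and the function name, the `strict` switch, and the message text are all hypothetical; only the parameter names `skip_binqc` and `busco_db` come from the diff above.

```python
import warnings
from typing import Optional


def check_skipped_step_db(skip_binqc: bool, busco_db: Optional[str], strict: bool = False) -> None:
    """Flag the case where bin QC is skipped but a BUSCO database is still supplied.

    `strict` is a hypothetical switch: True reproduces the current hard stop,
    False only prints a warning so the run can continue.
    """
    if not (skip_binqc and busco_db):
        return  # nothing contradictory was requested

    # Illustrative wording, not the pipeline's actual message.
    message = "--skip_binqc is set, so the database given with --busco_db will be ignored."
    if strict:
        raise ValueError(message)
    warnings.warn(message)
```

With `strict=False`, quickly turning a step off while leaving the database parameter in a config file would not abort the run.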
| "description": "Run BUSCO with automated lineage selection, but ignoring eukaryotes (saves runtime)." | ||
| }, | ||
| "save_busco_reference": { | ||
| "save_busco_db": { |
I'm wondering if this should keep its name (or be renamed to "save_busco_references")? As far as I can see, the use case for this parameter is to save the lineage files downloaded when using online auto lineage detection. On their own, these don't actually comprise a busco database, as there are other index files you need to download in order to pass a directory to busco_db.
|
Couple of small comments, but otherwise LGTM I think 👍 |
jfy133
left a comment
Needs a changelog update! Also, given we are changing a parameter, it needs to be loudly flagged (I think this will require a 2.5 in fact...)
I'm starting my test runs now, hope to report back soon!
|
Manual Testing results: Busco
Centrifuge
Kraken2
|
Co-authored-by: James A. Fellows Yates <[email protected]>
|
@jfy133 Thanks for the suggestions and testing edge cases I overlooked! I'll try to get all of these fixed over the weekend. |
|
I've still got 2 to do, I hope by tomorrow! |
|
OK, it's just the BUSCO uncompressed tar/dir input which is problematic, and after that we are good to go 👍 |
|
Thanks @gregorysprenger ! I will test again at the weekend 👍 |
Co-authored-by: James A. Fellows Yates <[email protected]>
Give me a couple more hours and I'll get the centrifuge and kraken database parsing into groovy to avoid the possible symlink issues. :) |
|
Alright I think I'm done this time lol. @jfy133 I ran all of your manual tests and they appeared to work fine, but double checking is always best! Quick updates:
[ it's weird that PythonBlack failed O.o ] |
jfy133
left a comment
LGTM, could always use a bit of refinement here and there but this works 👍
One last thing: please pull in latest changes in dev, and move your CHANGELOG entry accordingly, then we can merge!
|
And then I'll do the first patch release :) Thanks @gregorysprenger ! |
|
Done! Thanks @prototaxites and @jfy133 for reviewing! |
|
Tweaked the versions further to make the change clearer. Once tests pass, I will merge and make the release PR :D |
|
@nf-core-bot fix linting |
This PR is a follow up to issue #498.
This adds the ability to check if a database is compressed or decompressed. BUSCO now only needs --busco_db to be specified, whether the database is compressed or decompressed. It determines if an input is a lineage dataset by looking for odb10 in the input filename (e.g. bacteria_odb10). Since I dropped --busco_reference, I decided to rename --save_busco_reference to --save_busco_db.
PR checklist
nf-core lint).
nextflow run . -profile test,docker --outdir <OUTDIR>).
docs/usage.md is updated.
docs/output.md is updated.
CHANGELOG.md is updated.
README.md is updated (including new tool citations and authors/contributors).
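The two checks described in the PR description (compressed versus decompressed, and lineage dataset detection via odb10 in the filename) can be sketched as follows. This is a Python illustration, not the Groovy code in the pipeline; the function names and the list of archive suffixes are assumptions, since the description does not say which extensions are checked.

```python
from pathlib import Path

# Hypothetical suffix list; the PR description does not name the extensions it checks.
COMPRESSED_SUFFIXES = (".tar.gz", ".tgz", ".gz")


def is_compressed(db_path: str) -> bool:
    """Guess from the filename alone whether the database is an archive."""
    return Path(db_path).name.lower().endswith(COMPRESSED_SUFFIXES)


def is_lineage_dataset(db_path: str) -> bool:
    """Follow the PR description: a lineage dataset has 'odb10' in its filename."""
    return "odb10" in Path(db_path).name


def classify_busco_db(db_path: str) -> dict:
    """Combine both checks for a single --busco_db value."""
    return {
        "compressed": is_compressed(db_path),
        "lineage_dataset": is_lineage_dataset(db_path),
    }
```

For example, `classify_busco_db("/db/bacteria_odb10.tar.gz")` reports a compressed lineage dataset, while a plain directory without odb10 in its name is treated as a decompressed, non-lineage input.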