Add separate Nanopore input option #275
Comments
Totally agree. That would be a major enhancement, though also a significant amount of work. It requires a Nanopore-only assembler, e.g. Flye, and mapping processes for contig/MAG quantification. If you have favorite programs, let us know.
Thank you for your feedback @d4straub! I'm wondering whether these parts could be included initially:
Oh, there might be a slight misunderstanding: it is already possible to use Nanopore data, but only in addition to Illumina data, not on its own (i.e. Nanopore-only). The pipeline does not (yet) support Nanopore-only assembly, and this is what I was referring to. If there are no Illumina reads, Filtlong doesn't work (because the settings require Illumina reads), and no Nanopore-only assembler (such as Flye) is implemented in the pipeline yet. Additionally, Nanopore reads are currently not used in centrifuge/kraken (that might be relatively easy to add, actually).
Thank you @d4straub for the clarification! So if I understand correctly, there should be a standalone way of providing Nanopore FASTQ files as input. I work for Genomic Medicine Sweden, and we've been hoping for Nanopore-only read classification directly, without assembly (if that seems suitable for this pipeline, possibly using centrifuge/kraken2). Would these steps be acceptable additions to the pipeline? At least with kraken2, we have experience using it with Nanopore reads for classification, with seemingly good results.
Yes, there could be a way of providing Nanopore FASTQ files as input without Illumina data, and that would be desirable in this pipeline. But it would be important that those Nanopore reads are used not only for Kraken2 but also for assembly, because this is an assembly-focused pipeline. And as far as I understand, assembly is not your primary objective (please correct me if I am wrong). There is a new pipeline in the making, see https://nf-co.re/taxprofiler, that focuses only on taxonomic profiling. However, it might not allow Nanopore input yet, and it is under construction. So if you are not interested in assembly and are considering implementing it yourself, I'd recommend participating in nf-core/taxprofiler.
Thank you for your response @d4straub, and thank you especially for the recommendation about taxprofiler! It looks like taxprofiler matches what we need at Genomic Medicine Sweden more closely, so I will contribute by adding the feature to taxprofiler instead. I will remove myself as an assignee but will not close the issue, in case someone else would like to contribute Nanopore assembly-based classification.
I thought my questions fit here.
Thanks for your attention.
That question would be better asked in the "mag" channel of the nf-core Slack (see https://nf-co.re/join). But because I am already here, short answers:
Thanks,
Agree. It is important to support long-read-only input data, as long-read sequencing is becoming more and more popular.
Hi, any updates or fresh thoughts on adding a pure long-read track to the pipeline? I was checking out a few other nf-core pipelines and noticed that some, like viralrecon, have already embraced this idea. I would like to help set up a dedicated Nanopore/long-read track for this pipeline. Should this discussion be moved to the Slack channel instead? Thanks!
As far as I know, there are no new thoughts, except that the pipeline is getting huge, so additions should be kept to a minimum. I still think that Nanopore-only assembly should be possible within the nf-core/mag pipeline.
Hi again, we're a group of people involved in Clinical Genomics in Sweden, and we're eager to introduce a dedicated long-read track for metagenomic genome assembly. After chatting with @jfy133, we've decided to first get together to figure out what features and functionality we want to include; then we'll dive into the how and where of adding this new track. We're well aware that there's an ongoing discussion about the existing code base, and it might be a bit tricky to shoehorn something new into the current metagenomic assembly process, especially with the potential need for significant changes and rebuilds. So one idea would be to start fresh with a completely new pipeline for the long-read implementation. Perhaps we can keep the discussion going here, so others can contribute their thoughts on architecture and functionality. Thanks!
Small comment for now: I don't think we need an entire rewrite of the pipeline per se, but the purely long-read functionality could be a separate, fresh workflow (like viralrecon with Illumina vs. Nanopore data).
Is your feature request related to a problem? Please describe
At the moment there is an input option to use only short reads.
Describe the solution you'd like
An additional option to use Nanopore long-read FASTQ files as input would be good to have.