Add separate Nanopore input option #275

ljmesi · 2022-02-22T13:13:02Z

Is your feature request related to a problem? Please describe

At the moment there is an input option to use only short reads.

Describe the solution you'd like

An additional option to use Nanopore long reads fastq files as input would be good to have.

d4straub · 2022-02-22T13:40:48Z

Totally agree. That would be a major enhancement, though also a significant amount of work. It requires a Nanopore-only assembler, e.g. flye, and mapping processes for contig/MAG quantification. If you have favorite programs, let us know.
This is definitely on my wish-list as well, but it might take a while. You are welcome to add it if you feel like it!

ljmesi · 2022-03-07T09:14:26Z

Thank you for your feedback @d4straub! I'm wondering whether these parts could be included initially:

Add Nanopore reads as input to the pipeline
Use Porechop for adapter/quality trimming
Remove host reads with minimap2
Classify the reads with centrifuge/kraken
Do these sound like good steps for breaking this broader task into smaller subtasks?
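
As a rough illustration of the steps listed above, the sketch below only builds the command lines and does not run anything. File names, the host reference, the database path and the thread count are hypothetical placeholders. The flags are the ones these tools document for standalone use; they are not taken from the pipeline and say nothing about how it would wire the steps together.

```python
from typing import List


def nanopore_preprocessing_commands(
    reads: str, host_fasta: str, kraken_db: str, threads: int = 4
) -> List[str]:
    """Return shell command strings for trimming, host removal and classification.

    All paths and output names are placeholders; nothing is executed here.
    """
    trimmed = "trimmed.fastq.gz"   # hypothetical intermediate file
    host_free = "host_free.fastq"  # hypothetical intermediate file
    return [
        # Adapter trimming with Porechop
        f"porechop -i {reads} -o {trimmed} --threads {threads}",
        # Map against the host with minimap2 (map-ont preset) and keep
        # only unmapped reads (SAM flag 4) via samtools
        f"minimap2 -t {threads} -ax map-ont {host_fasta} {trimmed} "
        f"| samtools fastq -f 4 - > {host_free}",
        # Taxonomic classification of the remaining reads with Kraken2
        f"kraken2 --db {kraken_db} --threads {threads} "
        f"--report kraken2_report.txt --output kraken2_output.txt {host_free}",
    ]


if __name__ == "__main__":
    for command in nanopore_preprocessing_commands(
        "reads.fastq.gz", "host.fa", "/path/to/kraken2_db"
    ):
        print(command)
```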

d4straub · 2022-03-07T11:31:13Z

Oh, there might be a slight misunderstanding:

Currently, there is already the possibility to use Nanopore data, but only in addition to Illumina data, not on its own (i.e. Nanopore-only).
Also, adapter trimming (Porechop), quality visualisation (NanoPlot), quality filtering (Filtlong) and Lambda read removal (NanoLyse) are implemented. Direct host read removal is not yet implemented; it happens only indirectly via Filtlong, which depends on Illumina reads (i.e. Nanopore reads that are not covered by Illumina reads are discarded, so when the Illumina reads contain no host data, the filtered Nanopore reads will not either). Hybrid (Illumina & Nanopore data) assembly is already realized with hybridSPAdes.

Having said that, the pipeline does not (yet) support Nanopore-only assembly, and this is what I was referring to. In case there are no Illumina reads, Filtlong doesn't work (because the settings require Illumina reads) and no Nanopore-only assembler (such as flye) is implemented in the pipeline yet. Additionally, Nanopore reads are currently not used in centrifuge/kraken (that might be relatively easy to add, actually).
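
To make the gap concrete, here is a minimal sketch of how the two missing pieces could look on the command line. It is not the pipeline's code. It assumes Filtlong's documented options (`-1`/`-2` for an Illumina reference, `--trim` and `--split`, which need such a reference, plus `--min_length` and `--keep_percent`) and Flye's `--nano-raw`, `--meta`, `--out-dir` and `--threads`. The thresholds and file names are arbitrary placeholders.

```python
from typing import List, Optional


def filtlong_command(
    long_reads: str,
    illumina_r1: Optional[str] = None,
    illumina_r2: Optional[str] = None,
    min_length: int = 1000,   # placeholder threshold
    keep_percent: int = 90,   # placeholder threshold
) -> List[str]:
    """Build a Filtlong call that works with or without Illumina reads."""
    cmd = [
        "filtlong",
        "--min_length", str(min_length),
        "--keep_percent", str(keep_percent),
    ]
    if illumina_r1 and illumina_r2:
        # Reference-based options are only valid when Illumina reads exist
        cmd += ["-1", illumina_r1, "-2", illumina_r2, "--trim", "--split", "500"]
    cmd.append(long_reads)  # Filtlong writes the filtered reads to stdout
    return cmd


def flye_meta_command(long_reads: str, out_dir: str, threads: int = 8) -> List[str]:
    """Build a Flye call for a Nanopore-only metagenome assembly."""
    return [
        "flye", "--nano-raw", long_reads,
        "--meta", "--out-dir", out_dir,
        "--threads", str(threads),
    ]


if __name__ == "__main__":
    print(" ".join(filtlong_command("nanopore.fastq.gz")))
    print(" ".join(flye_meta_command("filtered.fastq.gz", "flye_out")))
```

The only design choice shown is the branch on Illumina availability: length and quality filtering still works without a reference, while the reference-based trimming is skipped.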

ljmesi · 2022-03-10T10:00:28Z

Thank you @d4straub for the clarification! So if I understand correctly there should be a standalone way of having Nanopore reads fastq files as input. I'm working for Genomic Medicine Sweden and we've been hoping to have a Nanopore-only reads classification directly without assembly (if that seems suitable for this pipeline, possibly using centrifuge/kraken2). Would these steps seem okay additions to the pipeline? At least with kraken2, we have experience of using it with Nanopore reads for classification with seemingly good results.

d4straub · 2022-03-10T10:28:18Z

Yes, there could be a way of having Nanopore reads fastq files without Illumina data as input. And that would be desirable in this pipeline. But it would be important that those Nanopore reads are not taken only for Kraken2 but also for assembly, because this is an assembly focused pipeline. And as far as I understand, assembly is not your primary objective (please correct me if I am wrong).

There is a new pipeline in the making, see https://nf-co.re/taxprofiler, that focuses only on taxonomic profiling. However, it might not allow Nanopore input yet, and it is under construction. So if you are not interested in assembly, and you consider implementing it yourself, I'd recommend participating in nf-core/taxprofiler.

ljmesi · 2022-03-17T17:16:36Z

Thank you for your response @d4straub and thank you especially for the recommendation about taxprofiler! It looks like taxprofiler matches more accurately what we need in Genomic Medicine Sweden, so I will contribute by adding the feature to taxprofiler instead. I will remove myself as an assignee but will not close the issue, in case someone else would like to contribute by adding Nanopore assembly-based classification.

abu85 · 2023-04-12T09:23:02Z

I thought my questions fit here.
I have a question about adding a pooled Nanopore sample to the pipeline, and a question about a subsequent analysis based on it.
I want a comprehensive hybrid assembly from both short reads (Illumina) and long reads (Nanopore). Unfortunately, I had to pool samples before Nanopore sequencing, so I have fifty individual samples as short reads but one (combined) sample for Nanopore. So my questions are:

1. How should I add the Nanopore sample to the samplesheet (my plan is to add it beside one of the short-read samples in the samplesheet)?
2. I would like to do group-wise binning on this hybrid assembly based on the short-read samples; will this samplesheet setup cause a problem later on?
3. How can I classify Nanopore reads in this pipeline (there is no Kraken2 classification option for long reads)? Any suggestions?
4. I have very many fastq files in the Nanopore sample; should I combine them all into one before running?

Thanks for your attention.

d4straub · 2023-04-12T10:32:15Z

That question would be better asked via nf-core slack (see https://nf-co.re/join) channel "mag". But because I am already here, short answers:

1. Once per row, i.e. once per Illumina sample, is the only way, but that would generate huge overhead in the pipeline. I am not sure I got it right, but of course you cannot make a co-assembly that way (using --coassemble_group).
2. Binning group-wise is no problem, because it depends only on the short reads; it does not use the long reads.
3. Use nf-core/taxprofiler, now released.
4. Yes, but your data is non-optimal (Nanopore not separated into samples; are you sure that your "many fastq files" are not separated by sample? After all, Nanopore also allows [de]multiplexing).
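
The "once per row" answer can be sketched as follows. The column names (`sample`, `group`, `short_reads_1`, `short_reads_2`, `long_reads`) are an assumption about the samplesheet layout and should be checked against the pipeline documentation for the version in use. All sample and file names are hypothetical.

```python
import csv
import io
from typing import Iterable, Tuple

# Assumed samplesheet header; verify against the pipeline docs
HEADER = ["sample", "group", "short_reads_1", "short_reads_2", "long_reads"]


def build_samplesheet(
    short_read_samples: Iterable[Tuple[str, str, str, str]],
    pooled_long_reads: str,
) -> str:
    """Return CSV text that repeats the pooled long-read file on every row.

    Each input tuple is (sample, group, short_reads_1, short_reads_2).
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(HEADER)
    for sample, group, r1, r2 in short_read_samples:
        # The same pooled Nanopore file is attached to every Illumina sample,
        # which is why the long reads are reprocessed once per row
        writer.writerow([sample, group, r1, r2, pooled_long_reads])
    return buffer.getvalue()


if __name__ == "__main__":
    samples = [
        ("sample1", "groupA", "sample1_R1.fastq.gz", "sample1_R2.fastq.gz"),
        ("sample2", "groupA", "sample2_R1.fastq.gz", "sample2_R2.fastq.gz"),
    ]
    print(build_samplesheet(samples, "pooled_nanopore.fastq.gz"), end="")
```

With fifty Illumina samples this yields fifty rows that all point at the same long-read file, which is the overhead mentioned in the first answer.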

abu85 · 2023-04-12T12:25:27Z

Thanks,

I want to utilize this pooled long-read sample (where a bit of every sample was merged into one). I thought I could include it to improve the analysis, but from your answer it seems that I cannot do so, or did I misunderstand? Do you have any suggestions here?
4. No, they are not separated by sample.
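
For the fourth point (combining many fastq files into one before running), a minimal sketch is given below. It assumes that every input file is gzip-compressed. Concatenated gzip members form a valid gzip stream, so the files can be appended byte for byte without recompression. The function name and file-name pattern are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Iterable


def concatenate_fastq_gz(parts: Iterable[Path], merged: Path) -> None:
    """Append gzip-compressed FASTQ files into one file, in sorted name order."""
    with open(merged, "wb") as out:
        for part in sorted(parts):
            with open(part, "rb") as src:
                # Raw byte copy; no decompression or recompression is needed
                shutil.copyfileobj(src, out)


if __name__ == "__main__":
    run_dir = Path("fastq_pass")  # hypothetical directory of per-chunk files
    concatenate_fastq_gz(run_dir.glob("*.fastq.gz"), Path("pooled_nanopore.fastq.gz"))
```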

dawnmy · 2023-06-19T22:56:51Z

Agree. It is important to support long-read-only input data, as long-read sequencing is becoming more and more popular.

willros · 2023-08-30T09:15:56Z

Hi,

Any updates or fresh thoughts on adding a pure long-read track to the pipeline? I was checking out a few other nf-core pipelines and noticed that some, like viralrecon, have already embraced this idea. I would like to help set up a dedicated nanopore/long read track for this pipeline.

Should this discussion be moved to the Slack channel instead?

Thanks!
William

d4straub · 2023-08-30T10:44:17Z

As far as I know there are no new thoughts, except that the pipeline is getting huge and additions should be kept to a minimum. I still think that Nanopore-only assembly should be possible within the nf-core/mag pipeline.
General planning/updates should be here I think; more interactive discussions are more convenient in Slack imho.

willros · 2023-09-07T09:15:51Z

Hi again,

We're a group of people involved in Clinical Genomics in Sweden, and we're eager to introduce a dedicated long read track for metagenomic genome assembly. After chatting with @jfy133 , we've decided to first get together to figure out what features and functionality we want to include, then we'll dive into the how and where of adding this new track.

We're well aware that there's an ongoing discussion about the existing code base, and it might be a bit tricky to shoehorn something new into the current metagenomic assembly process, especially with the potential need for significant changes and rebuilds. So, one idea would be to start fresh with a completely new pipeline for long read implementation.

Perhaps we can keep the discussion going here, so others can participate with their thoughts on architecture and functionality.

Thanks!
William

jfy133 · 2023-09-07T10:05:05Z

Small comment for now: I don't think we need an entire re-write of the pipeline per se, but the purely long-read functionality could be a separate fresh workflow (like viralrecon with Illumina vs Nanopore data).

ljmesi added the enhancement New feature or request label Feb 22, 2022

ljmesi self-assigned this Feb 22, 2022

sofstam mentioned this issue Feb 25, 2022

Add separate input for sample/control + samplesheet #277

Closed

ljmesi removed their assignment Mar 17, 2022

jfy133 mentioned this issue Sep 5, 2024

Add (meta)Flye for long-read only assembly #659

Open

jfy133 assigned muabnezor Sep 19, 2024

muabnezor linked a pull request Dec 12, 2024 that will close this issue

Longread only functionality #718

Open

11 tasks

jfy133 linked a pull request Jan 20, 2025 that will close this issue

Longread only functionality #718

Open

11 tasks
