Pangenome from draft assemblies #1552

Open
maggs-x opened this issue Dec 2, 2024 · 3 comments

Comments

@maggs-x

maggs-x commented Dec 2, 2024

Hi, is either progressive cactus or minigraph-cactus appropriate for creating a pangenome graph from assemblies that are just short of chromosome level? We'd like to generate a pangenome graph that can later be used to genotype structural variants across more individuals.

I'm inclined to use minigraph-cactus instead of progressive cactus because I imagine relying on the highest-quality assembly as a reference would be helpful. We do not have a chromosome-level reference in this case. Based on the paper and the GitHub documentation, it sounds like sufficiently long scaffolds will still yield accurate results. If you have any feedback on how best to troubleshoot, let me know. I've debated going with standard read-mapping approaches to call structural variants, but that wouldn't be as useful downstream.

Thanks for your help,

Maggs

@maggs-x
Author

maggs-x commented Dec 2, 2024

One quick additional comment. My understanding of progressive cactus is that it works well even when lower-quality assemblies are included in the dataset. But I'm cautious about interpreting this to mean that a dataset composed entirely of 'fragmented' assemblies will give a good result. The VCF output from minigraph-cactus is also useful because it makes it easy to weed out any SVs that don't have high alignment scores. Curious to hear your thoughts. Thanks again,

Maggs

@glennhickey
Collaborator

I think you've outlined all the points:

  • progressive cactus doesn't require a reference-quality assembly, but you can't use it for genotyping downstream.
  • minigraph-cactus does require a reference assembly, and you can use the results for genotyping.

In both cases, the alignment quality will only be as good as the input data. I guess you can try the --noSplit option of cactus-pangenome with your data, but I can't guarantee the results will be useful.
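For anyone trying the --noSplit suggestion, here is a minimal Python sketch of how the inputs might be put together. It only builds the two-column seqfile text (sample name, then FASTA path) and the cactus-pangenome command line; it does not run anything. The sample names, file paths, output names, and the choice of the most contiguous assembly as the reference are all hypothetical placeholders, and the extra output flags (--vcf, --gfa, --gbz) are optional choices that are not part of the advice above.

```python
import shlex


def seqfile_text(samples):
    """Return seqfile text: one 'name<TAB>fasta_path' line per sample."""
    return "".join(f"{name}\t{path}\n" for name, path in samples.items())


def pangenome_command(seqfile, reference, out_dir, out_name, no_split=True):
    """Build (but do not run) a cactus-pangenome command as an argument list."""
    cmd = [
        "cactus-pangenome", "./js", str(seqfile),  # "./js" is the job store path
        "--outDir", str(out_dir),
        "--outName", out_name,
        "--reference", reference,  # must match a sample name in the seqfile
        "--vcf", "--gfa", "--gbz",  # optional outputs, chosen here for illustration
    ]
    if no_split:
        # Skip splitting by reference chromosome, as suggested for draft assemblies.
        cmd.append("--noSplit")
    return cmd


# Hypothetical samples: names and paths are placeholders, not real data.
samples = {
    "sampleA": "assemblies/sampleA.fa",  # assumed to be the most contiguous assembly
    "sampleB": "assemblies/sampleB.fa",
    "sampleC": "assemblies/sampleC.fa",
}

print(seqfile_text(samples), end="")
print(shlex.join(pangenome_command("seqfile.txt", "sampleA", "pg-out", "draft-pg")))
```

Printing the command first, rather than running it directly, makes it easy to review the reference choice and flags before committing compute time, since --noSplit is expected to need considerably more memory.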

@maggs-x
Author

maggs-x commented Dec 3, 2024 via email
