Recommendations for performing GSEA on topics? #58

Open
AmosFong1 opened this issue Jan 11, 2025 · 3 comments

Comments

@AmosFong1

I was wondering if one of the maintainers could provide some recommendations on how to perform GSEA on fastTopics-derived topics. Would it be sensible to rank genes by -log10(lfsr) * LFC (assuming lfsr is analogous to a q-value)?
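
For concreteness, the proposed ranking statistic can be sketched as below. This is a minimal Python illustration on made-up values, not fastTopics code: the function name `signed_rank_score`, the `floor` clamp (added so that an lfsr of exactly 0 does not produce an infinite score), and the gene names are all assumptions introduced here.

```python
import math

def signed_rank_score(lfc, lfsr, floor=1e-300):
    """Return -log10(lfsr) * lfc, clamping lfsr away from zero (assumed safeguard)."""
    return -math.log10(max(lfsr, floor)) * lfc

# Hypothetical per-gene values for a single topic (not real data).
genes = {
    "geneA": {"lfc": 2.0, "lfsr": 0.001},
    "geneB": {"lfc": -1.5, "lfsr": 0.01},
    "geneC": {"lfc": 0.2, "lfsr": 0.5},
}

# Score every gene, then order from most positive to most negative score.
scores = {g: signed_rank_score(v["lfc"], v["lfsr"]) for g, v in genes.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

The sign of the score comes from the LFC and its magnitude is scaled by the evidence, so strongly supported increases sort to the top and strongly supported decreases to the bottom.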

@pcarbo
Member

pcarbo commented Jan 13, 2025

@AmosFong1 Any of the approaches that have been used for differential expression analyses could be used here as well; I do not have enough experience to say which approach is best (and it may differ between studies). In the Genome Biology paper, we performed a GSEA using the posterior mean LFC estimates, which tests for enrichment based on the size of the changes. Alternatively, you could use the lfsrs. Another common approach is to set a threshold (e.g., on lfsr or LFC) and test for enrichment of gene sets among the genes that pass that threshold. My intuition is that the approach you are suggesting is similar to using the posterior mean LFC estimates (assuming you are shrinking the LFC estimates with shrink.method = "ash" in de_analysis()).
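
The threshold-based option can be sketched as an over-representation test. The Python snippet below is a minimal illustration that assumes a one-sided hypergeometric test, which is one common choice and not necessarily what was used in the paper. The cutoffs (lfsr < 0.05, |LFC| > 1), the gene names, the gene set, and the helper `hypergeom_upper_tail` are all hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K belong to the gene set."""
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)) / total

# Hypothetical (LFC, lfsr) pairs for one topic (not real data).
de = {
    "g1": (2.1, 0.001), "g2": (1.8, 0.01), "g3": (-1.4, 0.02), "g4": (0.3, 0.4),
    "g5": (0.1, 0.8), "g6": (-0.2, 0.6), "g7": (1.2, 0.2), "g8": (0.5, 0.03),
}

# Assumed cutoffs: keep genes with lfsr < 0.05 and |LFC| > 1.
selected = {g for g, (lfc, lfsr) in de.items() if lfsr < 0.05 and abs(lfc) > 1}

# Hypothetical gene set, restricted to the genes that were actually tested.
gene_set = {"g1", "g2", "g4"} & set(de)

overlap = len(selected & gene_set)
p_value = hypergeom_upper_tail(overlap, len(gene_set), len(selected), len(de))
print(sorted(selected), overlap, p_value)
```

Unlike a rank-based GSEA, this discards the ordering within the selected genes, so the result depends on where the cutoffs are placed.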

@AmosFong1
Author

AmosFong1 commented Jan 13, 2025

Thanks @pcarbo, could you comment on why there are NA values for some lfsr? If I wanted to rank genes for GSEA based on lfsr, how should I handle NAs?

@pcarbo
Member

pcarbo commented Jan 14, 2025

@AmosFong1 This might happen if there is little to no variance in the LFC estimates; check de_analysis outputs such as "lower" and "upper". Increasing control$ns can give you more accurate estimates and sometimes eliminates some of the NAs.
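
One possible way to act on this before ranking is sketched below in Python on made-up values. It lists the genes whose lfsr is missing together with the width of their interval (upper minus lower), then ranks only the remaining genes. Dropping the NA genes is an assumption made for this sketch, not a recommendation from the thread; the field names mirror the de_analysis outputs mentioned above, while `is_missing` and the gene names are hypothetical.

```python
import math

# Hypothetical per-gene values for one topic (not real de_analysis output).
de = {
    "geneA": {"postmean": 2.0, "lower": 1.5, "upper": 2.5, "lfsr": 0.001},
    "geneB": {"postmean": 0.0, "lower": 0.0, "upper": 0.0, "lfsr": float("nan")},
    "geneC": {"postmean": -1.2, "lower": -1.9, "upper": -0.4, "lfsr": 0.02},
}

def is_missing(x):
    """Treat None and NaN as missing values."""
    return x is None or math.isnan(x)

# Interval width for genes with a missing lfsr; a width near zero is
# consistent with little to no variance in the LFC estimate.
missing = {g: v["upper"] - v["lower"] for g, v in de.items() if is_missing(v["lfsr"])}

# Assumed policy: exclude those genes, then rank the rest by ascending lfsr.
usable = {g: v for g, v in de.items() if not is_missing(v["lfsr"])}
ranked = sorted(usable, key=lambda g: usable[g]["lfsr"])
print(missing, ranked)
```

Whichever policy is used, the same set of retained genes should define the background for the enrichment test.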
