Clarification in input samples #19

Open
DustinSokolowski opened this issue Apr 26, 2022 · 3 comments


@DustinSokolowski

Hello!

Thank you for the exciting tool and paper. I am interested in applying BayesPrism to a non-cancer dataset with a small sample size compared to TCGA.

Specifically, I have 50 mice with 5 timepoints and 2 conditions at each timepoint. My paired scRNA-seq data is N=2: one sample from each condition at the middle timepoint. I was hoping to look at cell-type proportions, and even differences in cell-type-specific expression between conditions across timepoints. Have you tested BayesPrism on sample sizes this small and on non-tumour tissue? If so, are there any potential roadblocks to consider?

Best,
Dustin

@tinyi
Collaborator

tinyi commented May 10, 2022

Hi Dustin,

I am not sure whether my previous email reply went through. In case it didn't, I am pasting it here.

"Hi Dustin,

Thank you for your interest in our work. The sample size of the bulk RNA-seq should generally not be an issue, because in the first (initial) Gibbs sampling step each bulk sample is treated independently. Only the updated sampling step might be affected. We benchmarked the human peripheral whole blood sample with N=12 (see Figures 1e and 1f of our paper) and found both the initial and the updated Gibbs sampling to be accurate. That being said, we are updating our method to make the updated sampling potentially more robust to rare cell types and small numbers of bulk samples (in ~one week). You are also welcome to try the updated package.

A few suggestions are as follows.

  1. As for the recommended setup, if gene expression is expected to change across time points, you may consider sub-clustering each cell type in your scRNA-seq data and labelling the sub-clusters as cell states. Hopefully the scRNA-seq collected at the midpoint can capture the transcriptional heterogeneity of the early and late time points. Doing it this way may make the inferred posterior more accurate.

  2. I would recommend starting by deconvolving samples from each condition using the scRNA-seq from the same condition.

Let me know if there are any questions.

Best,

Tinyi"
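
For anyone following along, here is a minimal Python sketch of the bookkeeping behind the two suggestions above. It does not call BayesPrism and is not the maintainers' code. It only shows one way to turn sub-cluster IDs into cell-state labels and to pair bulk samples with reference cells from the same condition. All function names, sample IDs, and labels below are hypothetical, and the sub-cluster IDs are assumed to come from whatever clustering tool you already use.

```python
from collections import defaultdict


def make_cell_state_labels(cell_types, subclusters):
    """Join each cell-type label with its within-type sub-cluster ID."""
    if len(cell_types) != len(subclusters):
        raise ValueError("cell_types and subclusters must have the same length")
    return [f"{ct}_{sc}" for ct, sc in zip(cell_types, subclusters)]


def group_by_condition(ids, conditions):
    """Map each condition to the IDs (bulk samples or cells) that belong to it."""
    if len(ids) != len(conditions):
        raise ValueError("ids and conditions must have the same length")
    groups = defaultdict(list)
    for item_id, cond in zip(ids, conditions):
        groups[cond].append(item_id)
    return dict(groups)


def pair_bulk_with_reference(bulk_ids, bulk_conditions, cell_ids, cell_conditions):
    """For each condition, list its bulk samples and the reference cells to use."""
    bulk = group_by_condition(bulk_ids, bulk_conditions)
    ref = group_by_condition(cell_ids, cell_conditions)
    missing = sorted(set(bulk) - set(ref))
    if missing:
        raise ValueError(f"no scRNA-seq reference for conditions: {missing}")
    return {c: {"bulk": bulk[c], "reference_cells": ref[c]} for c in bulk}


# Toy, made-up annotations (assumptions for illustration only).
cell_ids = ["c1", "c2", "c3", "c4"]
cell_types = ["Tcell", "Tcell", "Bcell", "Bcell"]
subclusters = [0, 1, 0, 0]  # assumed output of a separate sub-clustering step
cell_conditions = ["control", "treated", "control", "treated"]

bulk_ids = ["m1_t1", "m2_t1", "m3_t5"]
bulk_conditions = ["control", "treated", "control"]

# Suggestion 1: cell-state labels, one per cell.
cell_states = make_cell_state_labels(cell_types, subclusters)

# Suggestion 2: deconvolve each condition's bulk samples against the
# scRNA-seq cells from the same condition.
pairs = pair_bulk_with_reference(bulk_ids, bulk_conditions, cell_ids, cell_conditions)
```

Each entry of `pairs` could then be passed as a separate deconvolution run, with `cell_states` supplying the finer-grained labels alongside the original cell types.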

@DustinSokolowski
Author

DustinSokolowski commented Oct 11, 2022 via email

@tinyi
Collaborator

tinyi commented Oct 11, 2022 via email
