Pre-Processing of SICILIAN Output #4

Open
apdavid opened this issue Sep 15, 2021 · 4 comments

Comments

@apdavid

apdavid commented Sep 15, 2021

Thank you again for all the help so far, @juliaolivieri. I have the SICILIAN output file (in my case, runname_with_postprocessing.txt), and I have a few questions about preparing the data for SpliZ:

  1. I merged the output file with a file that maps cell names to cluster names (in our case '0-14', stored as free_annotation), cross-referenced by cell name. Is that correct? Is it okay to pass numeric labels, or do they have to be strings? For example, 'S9' instead of '9'.
  2. Do I need to filter on any of the following fields before running SpliZ: postprocess_passed, both_ann, or called?
  3. Should I filter chrR1A to only those entries whose names start with chr? Some are formatted like this: NT_039192.2
  4. Are there any other processing steps required?
@juliaolivieri
Owner

Great questions! I think Kaitlin is helping you with 1 on the other GitHub page. For 2, there is no need to perform any filtering yourself; the SpliZ pipeline filters on the called column automatically. For 3, there is no need to filter out "non-standard" chromosomes; they won't affect the results for the "standard" chromosomes. And for 4, there shouldn't be any other processing steps necessary. Let me know if you run into more issues!
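
For question 1, the merge by cell name could look roughly like the sketch below. It is not part of SICILIAN or SpliZ. The column names cell and cluster, the tab delimiter, and the 'S' prefix are assumptions for illustration; only free_annotation and chrR1A come from this thread. Whether SpliZ requires string labels isn't settled here, so prefixing is just the conservative choice.

```python
import csv


def load_cluster_map(path, cell_col="cell", cluster_col="cluster"):
    """Read a tab-separated metadata file into {cell name: cluster id}.

    The column names and the delimiter are assumptions, not SICILIAN conventions.
    """
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        return {row[cell_col]: row[cluster_col] for row in reader}


def add_cluster_labels(rows, cell_to_cluster, cell_col="cell",
                       label_col="free_annotation", prefix="S"):
    """Yield copies of the rows with a string cluster label attached.

    Rows whose cell has no cluster assignment are skipped, like an inner join.
    Nothing else is filtered: non-standard contigs are passed through unchanged.
    """
    for row in rows:
        cluster = cell_to_cluster.get(row[cell_col])
        if cluster is None:
            continue
        labelled = dict(row)  # copy so the input row is not modified
        labelled[label_col] = f"{prefix}{cluster}"  # e.g. 9 becomes 'S9'
        yield labelled


# Small in-memory example with made-up cells.
example_rows = [
    {"cell": "cellA", "chrR1A": "chr1"},
    {"cell": "cellB", "chrR1A": "NT_039192.2"},
]
for labelled_row in add_cluster_labels(example_rows, {"cellA": 9, "cellB": 0}):
    print(labelled_row)
```

For a real run, the rows would come from csv.DictReader over the SICILIAN output file (again assuming it is tab-separated), and the labelled rows would be written back out with csv.DictWriter.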

@apdavid
Author

apdavid commented Sep 15, 2021

Thank you, this is super helpful. Do I need to filter by postprocess_passed, since that variable isn't passed to SpliZ?

@juliaolivieri
Owner

No need to filter. The called column is the only one from SICILIAN that should be used to filter, and that's already incorporated into the pipeline.
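
If you just want to see how the called values are distributed before handing the file to the pipeline, a small inspection-only sketch like the following works. It does not filter anything and is not part of SpliZ; the helper name tally_column and the example values are hypothetical, and how called is actually encoded in your file is something to check rather than assume.

```python
from collections import Counter


def tally_column(rows, column="called"):
    """Count how often each value of `column` occurs, without filtering any rows."""
    return Counter(row[column] for row in rows)


# Made-up rows; in practice they would come from csv.DictReader over the output file.
example_rows = [{"called": "1"}, {"called": "0"}, {"called": "1"}]
print(tally_column(example_rows))
```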

@apdavid
Author

apdavid commented Sep 15, 2021

Okay, got it, thank you!
