Importing new whitelist for parsebio protocols #1777
-
@NBurnaevskiy, is there another question? From your message and the original updates, I got the impression that you have figured everything out. Feel free to reach out if you need any assistance.
-
Hi. I am reopening this issue because I still have concerns about providing a custom whitelist of barcodes. I examined the source file for the Parse Biosciences presets.

I first tried to run mixcr with the preset "parsebio-sc-3gex-evercode-wt" and a custom list of CELL1 barcodes (96 barcodes in a plain text file, one 8-nt barcode per line). However, instead of converting the 96 barcodes into numbers 1 through 96, mixcr converted the first half of the list into numbers 1 through 48 and then converted the second half into numbers 1 through 48 as well. We need these 96 barcodes to be treated as different wells and converted into numbers 1 through 96, matching the number of wells in the microwell plate we used in the assay.

I am now trying the preset "parsebio-sc-3gex-evercode-wt-mega", which according to the preset file allows CELL1 conversion beyond 48, but this protocol assumes two sets of 96 barcodes for CELL1. What happens if I provide a list of only 96 custom barcodes? Will mixcr try to use the second half of the CELL1 barcodes from the built-in "parsebio-sc-3gex-evercode-wt-mega" preset, or will it use only my list of 96 barcodes?

I am running mixcr now, and the results look a little confusing: I see wells that shouldn't be there. This may be partly due to PCR and sequencing errors, but I am also not sure whether mixcr uses only my 96 barcodes for CELL1 or falls back on the built-in preset to look up the remaining 96 barcodes. Could you please clarify?
-
In this case, I would recommend creating a custom preset. I can help by creating a YAML file specifically tailored to your data. Could you please share the whitelists for all CELL barcodes here or at [email protected]? Additionally, do you need the conversion to numbers, as done in Parsebio, or would you prefer to have the barcode sequence itself in the output?
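If the barcode sequence itself is kept in the output, the well number can also be assigned afterwards. The sketch below is one possible way to do that outside mixcr: it numbers the barcodes 1..N by their line order in the whitelist file. `barcode_well_table` is a hypothetical helper name, the assumption that line order equals well order is mine, and the three barcodes in the demo are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of mixcr): build a barcode-to-well table
# by numbering the whitelist lines in order. Assumes line order = well order.
barcode_well_table() {
    # $1: whitelist file, one barcode per line
    # Drop carriage returns, skip blank lines, upper-case, then number what is left.
    tr -d '\r' < "$1" | awk 'NF { n++; print toupper($1) "\t" n }'
}

# Demo with three made-up barcodes (placeholders, not a real whitelist).
demo=$(mktemp)
printf 'AACGTGAT\nAAACATCG\nATGCCTAA\n' > "$demo"
barcode_well_table "$demo"
rm -f "$demo"
```

The resulting tab-separated table can be joined to the barcode column of an exported table, so a list of 96 barcodes ends up as wells 1 through 96 regardless of how the preset numbers them.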
-
Mizraelson, I tried to tinker with my local YAML file, but then mixcr failed to run.
-
In fact, when I run mixcr with the preset "parsebio-sc-3gex-evercode-wt-mega" and a custom list of only 96 barcodes, the run crashes at the "extending alignments" stage with the error
-
Hm,
-
enriched.
-
I used a 160 GB system. Is more needed for enriched libraries?
-
Hello,
I am trying to use mixcr to analyze scRNA-seq data produced with the split-seq method (the commercial version is called Parse Bio Evercode WT).
We use adapters of the same length and structure as the commercial evercode-wt kit, but our CELL1 barcodes are different, so the built-in mixcr barcodes don't work for us. I am trying to supply our barcodes through --set-whitelist, but it doesn't work (no cells are detected). Currently, the file I am trying to import is just a text file with 96 barcodes of 8 nt each. What is the proper file format for importing new barcodes?
My current command looks like this:
mixcr analyze parsebio-sc-3gex-evercode-wt \
    --species hsa \
    file_R1.fastq.gz \
    file_R2.fastq.gz \
    ouput/mixcr_output/ \
    --set-whitelist CELL1=file:barcodes_r1.txt
Thank you.
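Before passing a barcode file to --set-whitelist, it can help to rule out formatting problems in the file itself. The sketch below checks only what is described in this thread: one barcode per line, 8 nt each, no duplicates. `check_whitelist` is a hypothetical helper, not a mixcr command, and the assumption that barcodes are upper-case A/C/G/T is mine; it says nothing about which format mixcr actually requires.

```shell
#!/usr/bin/env bash
# Hypothetical sanity check for a plain-text barcode list (not part of mixcr).
check_whitelist() {
    # $1: whitelist file; $2: expected barcode length (default 8)
    local file=$1 len=${2:-8} total bad dups
    # Count all lines, including a last line without a trailing newline.
    total=$(grep -c '' "$file" || true)
    # Lines that are not exactly $len upper-case A/C/G/T characters.
    # This also flags Windows line endings, stray spaces and header lines.
    bad=$(grep -cvE "^[ACGT]{$len}\$" "$file" || true)
    # Barcodes that occur more than once.
    dups=$(sort "$file" | uniq -d | grep -c . || true)
    echo "total=$total malformed=$bad duplicated=$dups"
    # Succeed only if nothing is malformed or duplicated.
    [ "$bad" -eq 0 ] && [ "$dups" -eq 0 ]
}

# Usage (file name taken from the command above):
#   check_whitelist barcodes_r1.txt 8
```

For a list like the one described here, the summary line should report total=96 with no malformed or duplicated entries.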
Update. I realized that the barcode structure in this sample is a little different, so I changed the tag pattern. Now cells are found, but mixcr has stopped producing output. It prints progress in the terminal and claims that the output files are written, but the output folder remains empty. Here is the updated command:
mixcr analyze parsebio-sc-3gex-evercode-wt \
    --species hsa \
    sample_R1.fastq.gz \
    sample_R2.fastq.gz \
    TCR/output/ \
    --tag-pattern "^(R1:*)^(UMI:N{10})(CELL3:N{8})gtggccgatgN{20}(CELL2:N{8})atccacgtgcN{20}(CELL1:N{8})" \
    --set-whitelist CELL1=file:barcodes_r1.txt
When I start mixcr again with the same output folder, it warns me that the file "alignments.vdjca" already exists ("Use -f / --force-overwrite option to overwrite it."), but the output folder is still empty.
Please let me know what other information is needed to troubleshoot the issue.
Update:
It appears that mixcr is very sensitive to command syntax. Spaces in the command line were treated as part of the paths, which directed the output to the wrong place. It now works.
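One common way for stray spaces to end up in a multi-line command is a backslash followed by whitespace at the end of a line: the shell then treats the backslash as escaping the space instead of continuing the line. Whether that was the exact cause here is an assumption. The sketch below, with the hypothetical helper `find_broken_continuations`, lists such lines in a script file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report lines where a backslash is followed by
# trailing whitespace (spaces, tabs or a carriage return) instead of
# being the very last character on the line.
find_broken_continuations() {
    # $1: shell script containing the multi-line command
    # Prints "line_number:line" for each offending line and
    # returns non-zero when none are found.
    grep -nE '\\[[:space:]]+$' "$1"
}

# Usage (run_mixcr.sh is a placeholder file name):
#   find_broken_continuations run_mixcr.sh
```

Quoting paths and keeping the backslash as the last character on each continued line avoids the problem.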