-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Dbit #95
Dbit #95
Conversation
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #95 +/- ##
==========================================
- Coverage 36.83% 35.73% -1.10%
==========================================
Files 15 16 +1
Lines 1075 1192 +117
==========================================
+ Hits 396 426 +30
- Misses 679 766 +87
|
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @lillux , thank you for this PR! and sorry for getting back to you so late. It looks good, I made some few comments that should just be about a little refactoring to strive for simplicity, let me know if anything is unclear, looking forward to get this through the finish line!
"""Keys for DBiT formatted dataset.""" | ||
|
||
# files and directories | ||
COUNTS_FILE = ".h5ad" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just out of curiosity, but which pipeline/preprocessing output an h5ad directly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing is that DBiT seq can be used for both scATAC and scRNA data. We currently use a simple script to process fastq so that they can be feeded to any scRNA/scATAC processing pipeline. From that point on, it's just SOTA processing producing h5ad
files. The obs_name
in the h5ad
will be then used to map texels on the surface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great, thank you for the clarification @dawe !
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dear @giovp, we have reviewed the code as suggested. Thanks for the useful comments, let us know if there is something else to fix.
Best,
Lillo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you @lillux and sorry for late reply. So, to be totally honest, I don't have experience with this data, and also am not really able to test it. We also don't really have a triage/experimental section where we could put this yet. However, if it enables yours and other users analysis, and considering that it seems like a widespread enough technology, I would be positive to include this as is in the main module. What @LucaMarconato @melonora @kevinyamauchi think?
If we get couple positive feedback from other core devs, I'd be happy to merge this!
ah one more thing, if you could add a line in the changelog about this would be very helpful! Thank you!
"""Keys for DBiT formatted dataset.""" | ||
|
||
# files and directories | ||
COUNTS_FILE = ".h5ad" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great, thank you for the clarification @dawe !
I am not against it. Maybe one thing is that with multiple tables soon fully supported in |
I haven't had time to review, but I would be okay with an optimistic merge if it allows people to try out this type of data. I would vote to merge once @giovp 's comments are addressed. We can circle back to the multiple tables later (as suggested by @melonora) necessary. I think it's better to get it in and iterate rather than wait for the multiple tables to land. Thanks, @lillux ! |
Thanks @lillux; very in favor of merging as it could be helpful to other people working with this technology. On the practical side, if you have some data to suggest/share so that we can test this, please let us know. Otherwise gonna also give a pass to the code before merge. Let us know please 😊 |
This support DBiT with grids other than 50 x 50.
for more information, see https://pre-commit.ci
for more information, see https://pre-commit.ci
Thanks all for your comments! Here are some data to test the reader. We are available to maintain the reader to keep up with future updates of both the DBiT technology and @giovp I just added an entry for the DBiT reader in the Changelog, let me know if that is what you meant. |
That's great to hear, @lillux ! Thanks for the updates to the PR. I'm not sure how the changelogs work, so I'll let @giovp or @LucaMarconato give the final review and merge. |
@lillux I have just checked and tried the reader, great work! Merging now 😊 We plan to make a release by the end of the month (it's going to take a while because at the moment we are working on some large PRs in spatialdata), but looking forward to have this in PyPI! |
Nice work, everyone! Super cool to see another reader added. |
Thanks @giovp , @LucaMarconato , @kevinyamauchi , @melonora for your support and advices in bringing this code to production! I'm looking forward to further contribute to the scverse ecosystem! All the best, |
This pull-request proposes a reader for
DBiT-seq
experiments.The reader handles DBiT-seq experiments with or without histological images. It expects the histological image to be cropped and transformed correctly by the user, to match the area covered by the chip.
The reader accepts one experiment per object.
The
barcode_position
file is supposed to be a tab separated text file, headerless, with 2 columns: the first for thebarcode coordinate
(A and B, 50 barcodes each), the second for thebarcode sequence
.Here a
barcode_position
file as a reference:barcode_list.txt
We are happy to have feedback from you!
Thanks for your support!