Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dbit #95

Merged
merged 99 commits into from
Feb 9, 2024
Merged

Dbit #95

merged 99 commits into from
Feb 9, 2024

Conversation

lillux
Copy link
Contributor

@lillux lillux commented Dec 4, 2023

This pull-request proposes a reader for DBiT-seq experiments.
The reader handles DBiT-seq experiments with or without histological images. It expects the histological image to be cropped and transformed correctly by the user, to match the area covered by the chip.
The reader accepts one experiment per object.

The barcode_position file is supposed to be a tab separated text file, headerless, with 2 columns: the first for the barcode coordinate (A and B, 50 barcodes each), the second for the barcode sequence.

A1    AACGTGAT
A2    AAACATCG
A3    ATGCCTAA
A4    AGTGGTCA
A5    ACCACTGT
A6    ACATTGGC
A7    CAGATCTG
A8    CATCAAGT
A9    CGCTGATC
A10  ACAAGCTA
...

Here a barcode_position file as a reference:
barcode_list.txt

We are happy to have feedback from you!
Thanks for your support!

@codecov-commenter
Copy link

codecov-commenter commented Dec 5, 2023

Codecov Report

Attention: 87 lines in your changes are missing coverage. Please review.

Comparison is base (449f4b8) 36.83% compared to head (955a52c) 35.73%.
Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #95      +/-   ##
==========================================
- Coverage   36.83%   35.73%   -1.10%     
==========================================
  Files          15       16       +1     
  Lines        1075     1192     +117     
==========================================
+ Hits          396      426      +30     
- Misses        679      766      +87     
Files Coverage Δ
src/spatialdata_io/__init__.py 100.00% <100.00%> (ø)
src/spatialdata_io/_constants/_constants.py 100.00% <100.00%> (ø)
src/spatialdata_io/readers/dbit.py 21.62% <21.62%> (ø)

Copy link
Member

@giovp giovp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @lillux , thank you for this PR! and sorry for getting back to you so late. It looks good, I made some few comments that should just be about a little refactoring to strive for simplicity, let me know if anything is unclear, looking forward to get this through the finish line!

src/spatialdata_io/__init__.py Outdated Show resolved Hide resolved
"""Keys for DBiT formatted dataset."""

# files and directories
COUNTS_FILE = ".h5ad"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just out of curiosity, but which pipeline/preprocessing output an h5ad directly?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing is that DBiT seq can be used for both scATAC and scRNA data. We currently use a simple script to process fastq so that they can be feeded to any scRNA/scATAC processing pipeline. From that point on, it's just SOTA processing producing h5ad files. The obs_name in the h5ad will be then used to map texels on the surface.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great, thank you for the clarification @dawe !

src/spatialdata_io/_constants/_constants.py Outdated Show resolved Hide resolved
src/spatialdata_io/readers/DBiT.py Outdated Show resolved Hide resolved
src/spatialdata_io/readers/DBiT.py Outdated Show resolved Hide resolved
src/spatialdata_io/readers/DBiT.py Outdated Show resolved Hide resolved
src/spatialdata_io/readers/DBiT.py Outdated Show resolved Hide resolved
src/spatialdata_io/readers/DBiT.py Outdated Show resolved Hide resolved
src/spatialdata_io/readers/DBiT.py Outdated Show resolved Hide resolved
src/spatialdata_io/readers/DBiT.py Outdated Show resolved Hide resolved
Copy link
Contributor Author

@lillux lillux left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dear @giovp, we have reviewed the code as suggested. Thanks for the useful comments, let us know if there is something else to fix.

Best,
Lillo

@lillux lillux requested a review from giovp January 29, 2024 09:22
Copy link
Member

@giovp giovp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you @lillux and sorry for late reply. So, to be totally honest, I don't have experience with this data, and also am not really able to test it. We also don't really have a triage/experimental section where we could put this yet. However, if it enables yours and other users analysis, and considering that it seems like a widespread enough technology, I would be positive to include this as is in the main module. What @LucaMarconato @melonora @kevinyamauchi think?

If we get couple positive feedback from other core devs, I'd be happy to merge this!

ah one more thing, if you could add a line in the changelog about this would be very helpful! Thank you!

"""Keys for DBiT formatted dataset."""

# files and directories
COUNTS_FILE = ".h5ad"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great, thank you for the clarification @dawe !

src/spatialdata_io/readers/dbit.py Outdated Show resolved Hide resolved
src/spatialdata_io/readers/dbit.py Outdated Show resolved Hide resolved
@melonora
Copy link
Collaborator

melonora commented Feb 5, 2024

thank you @lillux and sorry for late reply. So, to be totally honest, I don't have experience with this data, and also am not really able to test it. We also don't really have a triage/experimental section where we could put this yet. However, if it enables yours and other users analysis, and considering that it seems like a widespread enough technology, I would be positive to include this as is in the main module. What @LucaMarconato @melonora @kevinyamauchi think?

If we get couple positive feedback from other core devs, I'd be happy to merge this!

ah one more thing, if you could add a line in the changelog about this would be very helpful! Thank you!

I am not against it. Maybe one thing is that with multiple tables soon fully supported in SpatialData and tables not necessarily having to annotate a SpatialElement we could include the barcode list. We could either wait until it is ready or merge and afterwards open a PR for this.

@kevinyamauchi
Copy link
Collaborator

I haven't had time to review, but I would be okay with an optimistic merge if it allows people to try out this type of data.

I would vote to merge once @giovp 's comments are addressed. We can circle back to the multiple tables later (as suggested by @melonora) necessary. I think it's better to get it in and iterate rather than wait for the multiple tables to land.

Thanks, @lillux !

@LucaMarconato
Copy link
Member

Thanks @lillux; very in favor of merging as it could be helpful to other people working with this technology.

On the practical side, if you have some data to suggest/share so that we can test this, please let us know. Otherwise gonna also give a pass to the code before merge. Let us know please 😊

@lillux
Copy link
Contributor Author

lillux commented Feb 6, 2024

Thanks all for your comments!

Here are some data to test the reader.

We are available to maintain the reader to keep up with future updates of both the DBiT technology and SpatialData development.

@giovp I just added an entry for the DBiT reader in the Changelog, let me know if that is what you meant.

@kevinyamauchi
Copy link
Collaborator

We are available to maintain the reader to keep up with future updates of both the DBiT technology and SpatialData development.

That's great to hear, @lillux ! Thanks for the updates to the PR.

I'm not sure how the changelogs work, so I'll let @giovp or @LucaMarconato give the final review and merge.

@LucaMarconato
Copy link
Member

@lillux I have just checked and tried the reader, great work! Merging now 😊

We plan to make a release by the end of the month (it's going to take a while because at the moment we are working on some large PRs in spatialdata), but looking forward to have this in PyPI!

@LucaMarconato LucaMarconato merged commit 94d856f into scverse:main Feb 9, 2024
6 checks passed
@kevinyamauchi
Copy link
Collaborator

Nice work, everyone! Super cool to see another reader added.

@lillux
Copy link
Contributor Author

lillux commented Feb 9, 2024

Thanks @giovp , @LucaMarconato , @kevinyamauchi , @melonora for your support and advices in bringing this code to production! I'm looking forward to further contribute to the scverse ecosystem!

All the best,
Lillo

lucas-diedrich pushed a commit to lucas-diedrich/spatialdata-io that referenced this pull request Nov 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants