Add persistent implementation of dataset::MutableDataset #22

Open
pchampin opened this issue Dec 10, 2019 · 5 comments
Labels
help wanted Extra attention is needed

@pchampin
Owner

In addition to the dataset::inmem module, it would be nice to have a disk-based persistent implementation of dataset::MutableDataset.

@pchampin pchampin added the help wanted Extra attention is needed label Dec 10, 2019
@pchampin
Owner Author

One way to do it could be to use RocksDB. We could even try to use the same layout as Oxigraph, making it possible to share the same storage across both crates. @Tpt, what do you think? Is that layout documented somewhere?

@Tpt
Contributor

Tpt commented Dec 10, 2019

It would be great to make Sophia work with Oxigraph storage! The Oxigraph RocksDB layout is not stable yet: I am currently tweaking it to make it a bit more compact and to allow efficient range queries. I hope to have time to finish a 0.1 Oxigraph release with a stable RocksDB layout in late December or (more realistically) January.

The basic storage approach should not change: I store rotations of quads of EncodedTerm as RocksDB keys, plus a string store. RocksDB prefix searches are then used to solve triple patterns. The string store is used as an inverse hash lookup, since the strings are hashed inside EncodedTerm. Hashing strings is very useful for heavy SPARQL query evaluation with a lot of joins, but it might not be the best approach for just storing data and doing simple triple pattern evaluation.
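To make the idea above concrete, here is a minimal in-memory sketch of "key rotations plus a string store". It is not Oxigraph's actual layout: a `BTreeSet` stands in for RocksDB's sorted keys, a plain 64-bit hash stands in for EncodedTerm, and the names `ToyStore`, `TermId`, `encode`, `with_subject` and `with_predicate`, as well as the choice of the two rotations (SPOG and POSG), are assumptions made for illustration only.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::{BTreeSet, HashMap};
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for an encoded term: a 64-bit hash of the string.
type TermId = u64;

/// Hash a term string (toy encoding; hash collisions are ignored here).
fn encode(term: &str) -> TermId {
    let mut hasher = DefaultHasher::new();
    term.hash(&mut hasher);
    hasher.finish()
}

/// Toy store: two sorted key rotations and an inverse hash lookup.
#[derive(Default)]
struct ToyStore {
    spog: BTreeSet<[TermId; 4]>, // keys ordered subject, predicate, object, graph
    posg: BTreeSet<[TermId; 4]>, // keys ordered predicate, object, subject, graph
    strings: HashMap<TermId, String>, // string store: hash -> original string
}

impl ToyStore {
    /// Record the string behind a hash so it can be decoded later.
    fn intern(&mut self, term: &str) -> TermId {
        let id = encode(term);
        self.strings.entry(id).or_insert_with(|| term.to_string());
        id
    }

    /// Insert one quad into every rotation.
    fn insert(&mut self, s: &str, p: &str, o: &str, g: &str) {
        let s = self.intern(s);
        let p = self.intern(p);
        let o = self.intern(o);
        let g = self.intern(g);
        self.spog.insert([s, p, o, g]);
        self.posg.insert([p, o, s, g]);
    }

    /// Prefix search: seek to the smallest key with this prefix, then read
    /// keys in order until the prefix no longer matches.
    fn scan(index: &BTreeSet<[TermId; 4]>, prefix: &[TermId]) -> Vec<[TermId; 4]> {
        let mut lower = [0u64; 4];
        lower[..prefix.len()].copy_from_slice(prefix);
        index
            .range(lower..)
            .take_while(|key| key.starts_with(prefix))
            .copied()
            .collect()
    }

    /// Turn hashed terms back into strings through the string store.
    fn decode(&self, ids: [TermId; 4]) -> [String; 4] {
        ids.map(|id| self.strings[&id].clone())
    }

    /// Pattern (s, ?, ?, ?): a prefix search on the SPOG rotation.
    fn with_subject(&self, s: &str) -> Vec<[String; 4]> {
        Self::scan(&self.spog, &[encode(s)])
            .into_iter()
            .map(|k| self.decode(k))
            .collect()
    }

    /// Pattern (?, p, ?, ?): a prefix search on the POSG rotation,
    /// with each key rotated back to (s, p, o, g) order.
    fn with_predicate(&self, p: &str) -> Vec<[String; 4]> {
        Self::scan(&self.posg, &[encode(p)])
            .into_iter()
            .map(|k| self.decode([k[2], k[0], k[1], k[3]]))
            .collect()
    }
}
```

Each pattern is answered by picking the rotation whose key order starts with the bound terms, which is why several rotations of the same quad are stored. Results come back in hash order, not in string order, so callers that need a stable order would have to sort after decoding.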

But I'm not sure that reimplementing Oxigraph storage in Sophia is the best way to do it. I fear that very quickly you might also want to be able to run SPARQL queries on top of it in Sophia, completely duplicating the work already done in Oxigraph. A better way to go would probably be to make Oxigraph usable with Sophia.

@pchampin
Owner Author

A better way to go would probably be to make Oxigraph usable with Sophia.

Yes, implementing Sophia's traits on top of Oxigraph is also a way to go, and probably the fastest one. I would be concerned, though, that converting from Oxigraph's model to Sophia's would introduce some overhead, hence my initial proposal... But it is definitely worth a try anyway.
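As a rough illustration of the adapter idea and of the conversion-overhead concern, the sketch below wraps a backend store behind a generic dataset trait and exposes borrowed views instead of converting each quad into a new owned value. Everything here is hypothetical: `BackendQuad`, `BackendStore`, `SimpleDataset`, `QuadRef`, `Adapter` and `count_with_predicate` are invented for the example and are neither Sophia's nor Oxigraph's actual API.

```rust
/// Hypothetical native quad type of a backend store.
#[derive(Debug, Clone, PartialEq)]
pub struct BackendQuad {
    pub subject: String,
    pub predicate: String,
    pub object: String,
    pub graph: Option<String>,
}

/// Hypothetical backend store (a Vec stands in for real storage).
#[derive(Default)]
pub struct BackendStore {
    quads: Vec<BackendQuad>,
}

impl BackendStore {
    pub fn insert(&mut self, quad: BackendQuad) {
        self.quads.push(quad);
    }

    pub fn iter(&self) -> impl Iterator<Item = &BackendQuad> {
        self.quads.iter()
    }
}

/// Borrowed view of a quad: (subject, predicate, object, graph name).
pub type QuadRef<'a> = (&'a str, &'a str, &'a str, Option<&'a str>);

/// Hypothetical, heavily simplified dataset trait (not Sophia's real trait).
pub trait SimpleDataset {
    fn quads<'a>(&'a self) -> Box<dyn Iterator<Item = QuadRef<'a>> + 'a>;
}

/// Adapter: implements the generic trait on top of the backend store.
pub struct Adapter(pub BackendStore);

impl SimpleDataset for Adapter {
    fn quads<'a>(&'a self) -> Box<dyn Iterator<Item = QuadRef<'a>> + 'a> {
        // Borrow the backend's strings rather than cloning them, so that
        // adapting a quad does not allocate.
        Box::new(self.0.iter().map(|q| {
            (
                q.subject.as_str(),
                q.predicate.as_str(),
                q.object.as_str(),
                q.graph.as_deref(),
            )
        }))
    }
}

/// Generic code written against the trait only, unaware of the backend.
pub fn count_with_predicate<D: SimpleDataset>(dataset: &D, predicate: &str) -> usize {
    dataset.quads().filter(|q| q.1 == predicate).count()
}
```

Whether a real adapter can stay this cheap depends on how close the two term models are; where the models differ, each quad would need an actual conversion, which is the overhead mentioned above.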

@pchampin
Owner Author

Not published on crates.io yet, but an adapter for Oxigraph is now available at https://github.com/pchampin/sophia_oxigraph.

@pchampin pchampin added this to the 0.9 milestone Dec 14, 2023
@pchampin
Owner Author

Just pushed a PR on Oxigraph to make it implement the relevant Sophia traits (behind a feature gate). Oxigraph could therefore serve as a reference implementation of a persistent dataset.

@pchampin pchampin modified the milestones: 0.9, 0.10 Nov 14, 2024