Adds JSONL connection #88
…erialization of descriptors and retrievables
The logic itself looks good. Surely could be useful. I have a few fundamental comments though:
- I don't see why this is part of the vitrivr-engine-index module. This means that I cannot use it for retrieval only (unless I include this module). Why not just have a separate module for this database connector?
- Having a few tests that make sure the basic functionality works as intended would go a long way to keep this stable. I'd suggest implementing at least those that I've created for the other database connectors.
- Personally, I'm not a huge fan of the old Java File API and I'd probably use java.nio for everything new (however, this is not a must, just a personal preference).
Other than that, the PR looks good to me.
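To illustrate the java.nio suggestion, here is a minimal sketch of appending and reading line-wise JSON with java.nio.file.Files instead of java.io.File. The function names appendJsonLines and readJsonLines are hypothetical and are not taken from the PR; the lines are assumed to be already serialized JSON strings.

```kotlin
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.StandardOpenOption

/** Appends one pre-serialized JSON document per line; creates the file if it does not exist. */
fun appendJsonLines(file: Path, jsonLines: List<String>) {
    Files.newBufferedWriter(
        file,
        StandardCharsets.UTF_8,
        StandardOpenOption.CREATE,
        StandardOpenOption.APPEND
    ).use { writer ->
        for (line in jsonLines) {
            writer.write(line)
            writer.newLine()
        }
    }
}

/** Reads all non-blank lines back and closes the underlying reader afterwards. */
fun readJsonLines(file: Path): List<String> =
    Files.newBufferedReader(file, StandardCharsets.UTF_8).useLines { lines ->
        lines.filter { it.isNotBlank() }.toList()
    }

fun main() {
    // Hypothetical file name, used only for this example.
    val file = Files.createTempFile("retrievables", ".jsonl")
    appendJsonLines(file, listOf("""{"id":"a"}""", """{"id":"b"}"""))
    println(readJsonLines(file))
    Files.deleteIfExists(file)
}
```

Opening the writer with APPEND keeps existing lines intact, which matches the append-only nature of a line-wise JSON file.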
...src/main/kotlin/org/vitrivr/engine/database/jsonl/retrievable/JsonlRetrievableInitializer.kt
...ndex/src/main/kotlin/org/vitrivr/engine/database/jsonl/retrievable/JsonlRetrievableReader.kt
import org.vitrivr.engine.core.model.types.Value
import java.util.*

/** A typealias to identify the [UUID] identifying a [Descriptor]. */
- typealias DescriptorId = UUID
+ typealias DescriptorId = @Serializable(UUIDSerializer::class) UUID
This works? Amazing!
Yeah, I was surprised too.
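For readers wondering how an annotated typealias like this can work, here is a minimal sketch using kotlinx.serialization. The body of UUIDSerializer below is an assumption (the PR's actual implementation is not shown in this thread), and ExampleId and ExampleRecord are hypothetical names introduced only for illustration. It needs the kotlinx-serialization-json dependency and the serialization compiler plugin.

```kotlin
import kotlinx.serialization.KSerializer
import kotlinx.serialization.Serializable
import kotlinx.serialization.descriptors.PrimitiveKind
import kotlinx.serialization.descriptors.PrimitiveSerialDescriptor
import kotlinx.serialization.descriptors.SerialDescriptor
import kotlinx.serialization.encoding.Decoder
import kotlinx.serialization.encoding.Encoder
import kotlinx.serialization.json.Json
import java.util.UUID

/** Assumed implementation: encodes a UUID as its canonical string form. */
object UUIDSerializer : KSerializer<UUID> {
    override val descriptor: SerialDescriptor =
        PrimitiveSerialDescriptor("UUID", PrimitiveKind.STRING)

    override fun serialize(encoder: Encoder, value: UUID) =
        encoder.encodeString(value.toString())

    override fun deserialize(decoder: Decoder): UUID =
        UUID.fromString(decoder.decodeString())
}

/** Hypothetical alias: every use of ExampleId carries the serializer annotation with it. */
typealias ExampleId = @Serializable(UUIDSerializer::class) UUID

/** Hypothetical record type; no per-property annotation is needed for the id. */
@Serializable
data class ExampleRecord(val id: ExampleId, val label: String)

fun main() {
    val record = ExampleRecord(UUID.randomUUID(), "example")
    val json = Json.encodeToString(ExampleRecord.serializer(), record)
    println(json)
    // Round trip: decoding yields an equal record.
    println(Json.decodeFromString(ExampleRecord.serializer(), json) == record)
}
```

Because the annotation travels with the alias, the serializer only has to be named once rather than on every property that holds an ID.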
 * A [Type] that represents a [UUID] value.
 */
@Serializable
data object UUID : Type {
Just so I can understand why this is needed: UUIDs (thus far) are only used internally by the engine as IDs. Adding this as a type basically means that we want to be able to actually use UUIDs as fields. Is that the intent?
Since I had to touch the type system anyway to make everything serializable, I thought adding this would be nice for completeness' sake. This way, the type can be used, for example, in a struct descriptor that needs to deal with external identifiers.
}

override fun update(item: D): Boolean {
    LOGGER.warn { "JsonlWriter.update is not supported" }
I understand why one cannot support deletes in this case. But why are updates not supported if the entry is stored on a single line anyway? Just update the line.
Technically correct, but this would still require rewriting the entire file. I'm not sure the added complexity, especially since it would then require additional synchronization, is worth the effort.
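To make the cost concrete, here is a rough sketch of what an in-place update of a line-wise JSON file could look like. It is not code from the PR: updateLine and its matches predicate are hypothetical, and the sketch deliberately omits the synchronization that concurrent writers would need.

```kotlin
import java.nio.charset.StandardCharsets
import java.nio.file.Files
import java.nio.file.Path
import java.nio.file.StandardCopyOption

/**
 * Replaces every line for which [matches] returns true with [replacement].
 * The whole file is copied to a temporary sibling and then swapped in, so the
 * cost grows with the file size even if only one line changes.
 * Assumption: no other process writes to [file] while this runs.
 */
fun updateLine(file: Path, replacement: String, matches: (String) -> Boolean): Boolean {
    val tmp = Files.createTempFile(file.toAbsolutePath().parent, "update", ".tmp")
    var updated = false
    Files.newBufferedReader(file, StandardCharsets.UTF_8).use { reader ->
        Files.newBufferedWriter(tmp, StandardCharsets.UTF_8).use { writer ->
            reader.lineSequence().forEach { line ->
                if (matches(line)) {
                    writer.write(replacement)
                    updated = true
                } else {
                    writer.write(line)
                }
                writer.newLine()
            }
        }
    }
    // Swap the rewritten copy in only if something actually changed.
    if (updated) {
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING)
    } else {
        Files.deleteIfExists(tmp)
    }
    return updated
}

fun main() {
    val file = Files.createTempFile("descriptors", ".jsonl")
    Files.write(file, listOf("""{"id":"a","value":1}""", """{"id":"b","value":2}"""))
    // Naive string match on the id, standing in for real JSON parsing.
    updateLine(file, """{"id":"b","value":3}""") { it.contains("\"id\":\"b\"") }
    println(Files.readAllLines(file))
    Files.deleteIfExists(file)
}
```

Even in this simple form, every update touches every line, and a second writer appending during the copy would lose its data at the final move. That is the synchronization concern raised above.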
Looks good to me.
Great, thanks!
A connection that reads and writes data to line-wise JSON files. Can be useful for larger-scale distributed extractions on shared collections or just for small-scale testing purposes.