-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add support to pgvector #267
base: main
Are you sure you want to change the base?
Conversation
Code Coverage Summary
Diff against main
Results for commit: 204b4cb Minimum allowed coverage is ♻️ This comment has been updated with latest results |
Trivy scanning results. .venv/lib/python3.10/site-packages/PyJWT-2.9.0.dist-info/METADATA (secrets)Total: 1 (MEDIUM: 1, HIGH: 0, CRITICAL: 0) MEDIUM: JWT (jwt-token) .venv/lib/python3.10/site-packages/litellm/llms/huggingface/huggingface_llms_metadata/hf_text_generation_models.txt (secrets)Total: 1 (MEDIUM: 0, HIGH: 0, CRITICAL: 1) CRITICAL: HuggingFace (hugging-face-access-token) .venv/lib/python3.10/site-packages/litellm/proxy/_types.py (secrets)Total: 1 (MEDIUM: 1, HIGH: 0, CRITICAL: 0) MEDIUM: Slack (slack-web-hook) |
packages/ragbits-core/src/ragbits/core/vector_stores/pgvector.py
Outdated
Show resolved
Hide resolved
metadata_store: The metadata store to use. If None, the metadata will be stored in pgVector db. | ||
""" | ||
super().__init__(default_options=default_options, metadata_store=metadata_store) | ||
conf = PgVectorConfig() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This part is confusing to me. If you want to provide default values for some of the __init__
arguments why not just provide the default value in the method signature (e.g., vector_size: int = 512,
)?
It also seems to me that the db
argument should not have a default value and instead be required.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
create_command = self._create_table_command() | ||
await conn.execute(create_command) | ||
hnsw_name = self.table_name + "_hnsw_idx" | ||
query = create_index_query.format( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should never format SQL queries using string operations - this creates a SQL injection and other problems connected with unexpected characters. You should always pass values the way you did on line 105 - as arguments to the relevant function of the SQL library.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added the check on table name in init
""" | ||
distance_operator = PgVectorDistance.DISTANCE_OPS[self.distance_method][1] | ||
|
||
query = f"SELECT * FROM {self.table_name}" # noqa S608 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, all values added to SQL query via f-string create possibilities for sql injection / bugs connected to characters that has not been escaped.
No description provided.