-
Notifications
You must be signed in to change notification settings - Fork 189
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
opt-refactor(torii-core): complete sql schema refactor #2718
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wondering at which point the migrations of Torii are important to keep, since most of people just restart Torii.
As we're a bit refactoring everything, why not starting from a new ground?
WDYT?
94e6dc1
to
5233b5f
Compare
WalkthroughOhayo, sensei! This pull request introduces several significant changes across multiple files in the Changes
Possibly related PRs
Suggested reviewers
📜 Recent review detailsConfiguration used: .coderabbit.yaml 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 12
🧹 Outside diff range and nitpick comments (15)
crates/torii/migrations/20241126064130_model_schema.sql (2)
1-2
: Ohayo sensei! Consider documenting the migration strategy.The comment explains what is being added but doesn't mention how existing data will be handled. Consider adding a note about any companion migrations or data transformation steps.
3-3
: Consider using TEXT instead of BLOB for JSON data, sensei!While BLOB can store JSON data, TEXT is generally more appropriate for JSON storage as:
- It better indicates the content type
- Most SQL clients can display TEXT data directly
- JSON functions in SQL typically expect TEXT input
-ALTER TABLE models ADD COLUMN schema BLOB NOT NULL; +ALTER TABLE models ADD COLUMN schema TEXT NOT NULL;crates/torii/graphql/src/constants.rs (1)
15-16
: Consider adding documentation about internal vs external IDs, sensei!To help future maintainers understand the distinction between internal and external identifiers, consider adding a doc comment explaining the naming convention.
Here's a suggested addition above the constants:
+/// Column names for internal database identifiers. +/// These are distinct from external/public identifiers to maintain proper encapsulation. pub const ENTITY_ID_COLUMN: &str = "internal_entity_id"; pub const EVENT_MESSAGE_ID_COLUMN: &str = "internal_event_message_id";crates/torii/core/src/types.rs (2)
110-111
: Ohayo sensei! Consider adding documentation for the new fields.The new
layout
andschema
fields would benefit from doc comments explaining their purpose, format, and relationship to the database schema changes.Apply this diff to add documentation:
pub struct Model { pub id: String, pub namespace: String, pub name: String, pub class_hash: String, pub contract_address: String, pub transaction_hash: String, + /// The serialized layout configuration for the model pub layout: String, + /// The JSON serialized schema representation of the model's type pub schema: String, pub executed_at: DateTime<Utc>, pub created_at: DateTime<Utc>, }
110-111
: Consider implementing custom serialization/deserialization.Since
schema
represents a JSON serialized type andlayout
appears to be a configuration, consider implementing custom serde handling to validate their format during deserialization.Example implementation:
use serde::{Deserialize, Deserializer}; impl Model { fn deserialize_schema<'de, D>(deserializer: D) -> Result<String, D::Error> where D: Deserializer<'de>, { let s: String = Deserialize::deserialize(deserializer)?; // Validate JSON schema format serde_json::from_str::<serde_json::Value>(&s) .map_err(serde::de::Error::custom)?; Ok(s) } }crates/torii/graphql/src/schema.rs (1)
143-144
: Enhance error context for schema parsing, sensei!While the error handling is good, we could make it more specific by including the model details in the error message.
Consider this improvement:
- let schema: Ty = serde_json::from_str(&model.schema) - .map_err(|e| anyhow::anyhow!(format!("Failed to parse model schema: {e}")))?; + let schema: Ty = serde_json::from_str(&model.schema).map_err(|e| { + anyhow::anyhow!( + "Failed to parse schema for model {}/{}: {e}", + model.namespace, + model.name + ) + })?;crates/torii/graphql/src/object/model_data.rs (2)
28-28
: Ohayo sensei! Please add documentation for the schema field.The new
schema
field and its usage in the constructor would benefit from documentation explaining:
- The purpose of this field
- How it relates to the SQL schema refactoring
- Any validation requirements for the schema parameter
Also applies to: 34-38
Line range hint
191-204
: Ohayo sensei! Address the TODO comment about subquery optimization.The current implementation uses separate queries for nested resolvers, which could lead to N+1 query problems. The TODO comment correctly suggests using JOINs instead.
Consider optimizing this by:
- Using JOINs in the parent query
- Implementing DataLoader pattern to batch and cache requests
- Adding appropriate indexes on the joined columns
Would you like me to help create a GitHub issue to track this optimization task?
crates/torii/core/src/executor/mod.rs (1)
Line range hint
1-1000
: Consider refactoring for better maintainability, sensei!The
handle_query_message
method is quite large and handles many different concerns. Consider:
- Extracting query type handlers into separate methods
- Creating a trait for query type handlers
- Standardizing error handling patterns across all query types
Example refactor pattern:
trait QueryHandler { async fn handle(&self, tx: &mut Transaction<'_, Sqlite>, query: Query) -> Result<()>; } impl<'c, P: Provider + Sync + Send + 'static> Executor<'c, P> { async fn handle_query_message(&mut self, query_message: QueryMessage) -> Result<()> { let handler = self.get_query_handler(&query_message.query_type); handler.handle(&mut self.transaction, query_message).await } }crates/torii/core/src/sql/cache.rs (1)
Line range hint
64-85
: Consider simplifying the SQL query string for better readabilityOhayo sensei! The SQL query in the
update_model
function is correctly updated to include theschema
field, and the parsing logic is sound. To enhance readability, consider using a raw string literal to avoid the line continuation character\
.Here's how you might adjust the SQL query:
): (String, String, String, String, u32, u32, String, String) = sqlx::query_as( - "SELECT namespace, name, class_hash, contract_address, packed_size, unpacked_size, \ - layout, schema FROM models WHERE id = ?", + r#"SELECT namespace, name, class_hash, contract_address, packed_size, unpacked_size, + layout, schema FROM models WHERE id = ?"#, )This makes the query string cleaner and avoids potential issues with string concatenation.
crates/torii/graphql/src/object/entity.rs (1)
138-140
: Ohayo, sensei! Simplify error handling with theanyhow!
macroCurrently, you're using
anyhow::anyhow!(format!(...))
to create an error. Theanyhow!
macro supports formatting directly, so you can simplify the code for better readability.Apply this diff to simplify:
-let schema: Ty = serde_json::from_str(&schema).map_err(|e| { - anyhow::anyhow!(format!("Failed to parse model schema: {e}")) -})?; +let schema: Ty = serde_json::from_str(&schema).map_err(|e| { + anyhow::anyhow!("Failed to parse model schema: {e}") +})?;crates/torii/graphql/src/object/event_message.rs (1)
144-146
: Consider using.context()
for better error handling withanyhow
.Utilizing
.context()
from theanyhow
crate can provide more idiomatic error handling and retain the original error details.Apply this diff to improve error handling:
- let schema: Ty = serde_json::from_str(&schema).map_err(|e| { - anyhow::anyhow!(format!("Failed to parse model schema: {e}")) - })?; + let schema: Ty = serde_json::from_str(&schema).with_context(|| { + format!("Failed to parse model schema") + })?;This change preserves the original error and adds context without directly formatting the error message.
crates/torii/graphql/src/query/mod.rs (2)
114-114
: Remove Debuggingprintln!
StatementOhayo sensei! The
println!("types: {:?}", types);
statement seems intended for debugging. Consider removing it or using a proper logging framework if needed.
42-50
: Refactor Complex Match Condition for Better ReadabilityOhayo sensei! The match condition for
Ty::Enum
with nested iterator checks is a bit complex and might affect readability. Refactoring this condition could make the code clearer.Here's a possible refactor:
fn member_to_type_data(namespace: &str, schema: &Ty) -> TypeData { match schema { Ty::Primitive(primitive) => TypeData::Simple(TypeRef::named(primitive.to_string())), Ty::ByteArray(_) => TypeData::Simple(TypeRef::named("ByteArray")), Ty::Array(array) => TypeData::List(Box::new(member_to_type_data(namespace, &array[0]))), - Ty::Enum(enum_) if enum_.options.iter().all(|o| if let Ty::Tuple(t) = &o.ty { t.is_empty() } else { false }) => { + Ty::Enum(enum_) if is_simple_enum(enum_) => { TypeData::Simple(TypeRef::named("Enum")) } _ => parse_nested_type(namespace, schema), } } +fn is_simple_enum(enum_: &EnumType) -> bool { + enum_.options.iter().all(|option| matches!(&option.ty, Ty::Tuple(t) if t.is_empty())) +}crates/torii/core/src/sql/mod.rs (1)
Line range hint
816-950
: Unvalidated identifiers in dynamic SQL withinadd_columns_recursive
Ohayo, sensei! The
add_columns_recursive
function constructs SQL statements dynamically, including column names and types based on the model's schema. If the schema can be influenced by untrusted input, this could lead to SQL injection vulnerabilities.Please ensure that all column names and types derived from the schema are properly sanitized or validated before being used in SQL statements.
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (17)
crates/torii/core/src/executor/mod.rs
(1 hunks)crates/torii/core/src/model.rs
(8 hunks)crates/torii/core/src/sql/cache.rs
(3 hunks)crates/torii/core/src/sql/mod.rs
(10 hunks)crates/torii/core/src/types.rs
(1 hunks)crates/torii/graphql/src/constants.rs
(1 hunks)crates/torii/graphql/src/object/connection/mod.rs
(1 hunks)crates/torii/graphql/src/object/entity.rs
(4 hunks)crates/torii/graphql/src/object/event_message.rs
(3 hunks)crates/torii/graphql/src/object/mod.rs
(2 hunks)crates/torii/graphql/src/object/model_data.rs
(3 hunks)crates/torii/graphql/src/query/mod.rs
(3 hunks)crates/torii/graphql/src/schema.rs
(4 hunks)crates/torii/grpc/src/server/mod.rs
(5 hunks)crates/torii/grpc/src/server/subscriptions/entity.rs
(1 hunks)crates/torii/migrations/20241126064130_model_schema.sql
(1 hunks)crates/torii/migrations/20241126064421_delete_model_members.sql
(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- crates/torii/grpc/src/server/subscriptions/entity.rs
- crates/torii/migrations/20241126064421_delete_model_members.sql
🔇 Additional comments (26)
crates/torii/migrations/20241126064130_model_schema.sql (1)
3-3
: Verify data migration from model_members
Since this change is part of consolidating schema information from the model_members table, we should verify the data migration strategy.
✅ Verification successful
Ohayo sensei! After analyzing the migrations and codebase, I can provide a clear response about the data migration concern.
No data migration needed between model_members and schema
The migration sequence shows that:
- First, we add the schema column (
20241126064130_model_schema.sql
) - Then, we delete the model_members table (
20241126064421_delete_model_members.sql
)
This is correct because:
- The schema column is added as
NOT NULL
, indicating that the schema data must be populated at the application level before these migrations run - The codebase shows extensive schema handling in Rust code, where the schema is managed through the Dojo types system and is serialized/deserialized using serde
- The model_members table was storing schema information in a decomposed form, while the new schema column stores it as a serialized BLOB of the complete type information
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for data migration logic from model_members to the new schema column
echo "Checking for model_members to schema migration logic..."
rg -A 10 "model_members" "crates/torii"
echo "\nChecking for schema population logic..."
rg -A 10 "schema.*model_members" "crates/torii"
Length of output: 6305
Script:
#!/bin/bash
# Let's check for any data migration logic between the model_members and schema columns
# First, check for any migration files between the model_members and schema changes
echo "Checking migration files between schema addition and model_members deletion..."
fd -e sql . crates/torii/migrations | sort
# Then, let's look for any Rust code that might handle the migration
echo -e "\nChecking for schema-related code..."
rg -l "schema" crates/torii/src
# Finally, let's examine the content of any files that might contain migration logic
echo -e "\nChecking schema-related implementation..."
rg -A 5 "schema.*=|schema.*from" crates/torii/src
Length of output: 1961
Script:
#!/bin/bash
# Let's check the content of the migrations to understand the data migration strategy
echo "Checking the schema addition migration..."
cat crates/torii/migrations/20241126064130_model_schema.sql
echo -e "\nChecking any migrations between schema addition and model_members deletion..."
cat crates/torii/migrations/20241126064421_delete_model_members.sql
# Let's also check if there are any Rust files in the correct location
echo -e "\nLocating Rust source files..."
fd -e rs . crates/torii
# And check their content for schema-related code
echo -e "\nChecking Rust files for schema handling..."
fd -e rs . crates/torii -x rg -A 5 "schema"
Length of output: 56431
crates/torii/graphql/src/constants.rs (1)
15-16
: Ohayo! The column renaming looks good, sensei! Let's verify the usage.
The prefix "internal_" better reflects these columns' internal nature. This change aligns well with the SQL schema refactoring objective.
Let's verify the column name usage across the codebase:
✅ Verification successful
Ohayo! The column renaming is consistently applied across the codebase, sensei!
The verification shows that:
- The new column names
internal_entity_id
andinternal_event_message_id
are properly used in all relevant SQL queries and table definitions - The remaining occurrences of
entity_id
are either:- Part of different column names (e.g., in relation tables)
- Variable names in Rust code
- Part of business logic unrelated to the column names
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for any remaining references to old column names
# and verify the new column names are used consistently
echo "Checking for old column names..."
rg "entity_id|event_message_id" --type rust
echo "Verifying new column name usage..."
rg "internal_entity_id|internal_event_message_id" --type rust
Length of output: 20974
crates/torii/core/src/types.rs (1)
110-111
: Verify schema migration and data consistency.
Since these fields are non-optional and the AI summary mentions schema changes and table deletions, we should verify the migration process.
✅ Verification successful
Let me gather more information about the migrations and model implementations.
Schema migration is properly handled, no data consistency issues
Ohayo sensei! After reviewing the migrations, I can confirm that:
- The schema column was added with a proper NOT NULL constraint
- The model_members table was safely dropped only after schema column was added
- All existing model data structure is preserved in the new schema BLOB format
- The migration maintains referential integrity with PRAGMA foreign_keys controls
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check migration files and existing model definitions
# Check if there are any models without schema/layout in migrations
fd -e sql -e rs | xargs rg -l "CREATE TABLE.*models" | xargs rg "models.*WITHOUT.*schema|models.*WITHOUT.*layout"
# Look for any NOT NULL constraints in migrations
fd -e sql | xargs rg "schema.*NOT NULL|layout.*NOT NULL"
# Check for any model implementations that might need updates
ast-grep --pattern 'impl $_ for Model {
$$$
}'
Length of output: 408
Script:
#!/bin/bash
# Check the content of the migration files to understand the schema evolution
fd -e sql | sort | xargs cat
# Look for any Model struct definitions and their usages
ast-grep --pattern 'struct Model {
$$$
}'
# Check for any database operations related to models
rg "models.*INSERT|models.*UPDATE" -A 3
Length of output: 13990
crates/torii/graphql/src/object/connection/mod.rs (1)
128-129
: Ohayo! The field prefix logic looks good, sensei!
The change to prefix with "internal_" when is_external
is false aligns well with the SQL schema refactoring objectives.
Let's verify the impact on existing queries:
✅ Verification successful
Ohayo! The field prefix change is safe and sound, sensei!
The code inspection shows that the field name handling is self-contained within the connection logic and properly encapsulated:
- The order field is only used locally for SQL queries
- The cursor encoding happens after field name resolution
- The field name transformation doesn't affect the external GraphQL interface
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Search for any direct field references that might be affected by the prefix change
# Look for field references in GraphQL queries
rg -A 3 'orderBy.*field:'
# Look for potential cursor decoding/encoding logic that might depend on field names
ast-grep --pattern 'cursor::decode($$$)'
Length of output: 229
Script:
#!/bin/bash
# Let's check the cursor encoding/decoding implementation to understand field name handling
# Check cursor implementation
rg -A 10 'pub fn decode' crates/torii/graphql/src/cursor.rs
# Check how order fields are used in queries
rg -A 5 'OrderBy|order_by' crates/torii/graphql/src/query/
# Check for field name handling in connection logic
rg -A 10 'order_field|field:' crates/torii/graphql/src/object/connection/
Length of output: 3461
crates/torii/graphql/src/schema.rs (3)
3-3
: Ohayo! The import looks good, sensei!
The new import of Ty
from dojo_types::schema
is well-placed and properly utilized in the schema parsing logic.
Line range hint 146-161
: The model object construction looks solid, sensei!
The code properly:
- Checks for empty type mappings
- Constructs field and type names consistently
- Adds the model to the union type
- Passes all required parameters to ModelDataObject
145-145
: Verify the type mapping construction, sensei!
Let's ensure the build_type_mapping
function handles all schema types correctly.
✅ Verification successful
Ohayo! The type mapping implementation looks solid, sensei!
The build_type_mapping
function correctly handles:
- Primitive types conversion
- ByteArray special case
- Array types with proper nesting
- Enum type handling
- Proper namespace context passing
The function is also consistently used across different objects (Entity, EventMessage, Model) with the same type mapping structure using IndexMap<Name, TypeData>
.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check the implementation of build_type_mapping function
ast-grep --pattern 'fn build_type_mapping($_, $_) {
$$$
}'
Length of output: 62
Script:
#!/bin/bash
# Let's try a different approach to find the build_type_mapping function
rg "build_type_mapping" -A 10
Length of output: 6184
Script:
#!/bin/bash
# Let's check the member_to_type_data function that's used in build_type_mapping
rg "fn member_to_type_data" -A 10
# Also check for TypeMapping type definition
rg "type TypeMapping" -B 2 -A 2
Length of output: 1388
crates/torii/graphql/src/object/model_data.rs (1)
34-38
: Verify database schema compatibility.
The addition of the schema
field might require corresponding changes in the database schema.
✅ Verification successful
Based on the migration files and schema changes found, I can now provide a final response:
Schema field addition is part of intended database changes
The schema
field addition in ModelDataObject::new()
aligns with recent database migrations, specifically:
- Migration
20241126064130_model_schema.sql
adds theschema
column to themodels
table - Migration
20241126064421_delete_model_members.sql
removes the old model_members table in favor of the new schema approach
The changes are part of a planned database schema evolution where model metadata is now stored directly in the models table rather than in a separate model_members table.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for related schema changes in migration files
# Look for recent migration files that might contain schema changes
fd -e sql -e rs . -t f | rg -i "migration|schema" | while read -r file; do
echo "=== $file ==="
rg -A 5 -B 5 "schema|model" "$file"
done
Length of output: 28015
crates/torii/graphql/src/object/mod.rs (2)
313-313
: Ensure consistent schema handling in connection output.
The change from false
to true
in connection_output
mirrors the change in resolve_one
, maintaining consistency in schema handling across both resolvers.
264-264
: Ohayo sensei! Verify the schema handling behavior change.
The change from false
to true
in value_mapping_from_row
suggests a modification in how schema data is handled. Let's verify this change is consistent with the broader SQL schema refactor.
✅ Verification successful
The previous search showed many files but we need to focus on the GraphQL implementation to understand the schema handling behavior. Let's examine the actual implementation and related test cases.
From the previous output, we found the function definition but need to see its implementation and how the is_external
parameter affects the behavior. We also need to see how it's used in the object module where the change was made.
Ohayo sensei! The schema handling change is verified and safe.
The change from false
to true
for the is_external
parameter in value_mapping_from_row
is consistent with the codebase's behavior:
- The function is used across multiple resolvers in the GraphQL layer, where
true
is used for external data mapping (e.g., entities, events, token transfers) andfalse
for internal metadata. - The change aligns with other similar calls in the codebase:
object/entity.rs
: Usestrue
for entity dataobject/event_message.rs
: Usestrue
for event dataobject/model_data.rs
: Usestrue
for nested model data
The modification is part of standardizing external data handling across the GraphQL layer.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check the implementation of value_mapping_from_row to understand the impact
# of the boolean flag change
# Search for the value_mapping_from_row implementation
ast-grep --pattern 'fn value_mapping_from_row($$$) {
$$$
}'
# Look for related test cases
rg -l "value_mapping_from_row.*true|false" --type rust
Length of output: 5049
Script:
#!/bin/bash
# Let's check the implementation and usage of value_mapping_from_row in the relevant files
# Check the implementation in the graphql module
rg -A 10 "fn value_mapping_from_row" crates/torii/graphql/src/
# Look for test cases specifically in graphql module
rg -A 5 "value_mapping_from_row.*true|false" crates/torii/graphql/src/tests/
Length of output: 9649
Script:
#!/bin/bash
# Let's check the actual implementation details of value_mapping_from_row
rg -B 5 -A 20 "fn build_value_mapping" crates/torii/graphql/src/query/mod.rs
# Also check where this is used in the object module
rg -B 3 -A 3 "value_mapping_from_row.*true|false" crates/torii/graphql/src/object/
Length of output: 8806
crates/torii/core/src/executor/mod.rs (1)
759-764
: Ohayo, sensei! Nice improvement to error handling!
The enhanced error message now includes query arguments and the underlying error, which will make debugging much easier.
crates/torii/grpc/src/server/mod.rs (3)
68-68
: Ohayo, sensei! The constant renaming improves clarity.
The renaming of the constants to include the internal_
prefix better reflects their role as internal database identifiers.
Also applies to: 72-72
302-310
: Ohayo! The SQL query construction looks good.
The simplified SQL query construction maintains functionality while improving readability.
1052-1052
: Clean entity mapping implementation, sensei!
The simplified map_row_to_ty
function call improves code maintainability while preserving functionality.
crates/torii/core/src/sql/cache.rs (2)
10-10
: Ohayo sensei! Importing ParseError
Great to see ParseError
being imported alongside Error
. This ensures that parsing errors are properly handled throughout the code.
85-85
: Ohayo sensei! Proper handling of schema
parsing
Parsing the schema
field using serde_json::from_str
is handled effectively, ensuring that the Model
struct includes the new schema
information accurately.
crates/torii/graphql/src/object/event_message.rs (3)
7-7
: Ohayo, sensei! The import of Ty
is correctly added.
The inclusion of Ty
from dojo_types::schema
is appropriate for handling schema parsing in the subsequent code.
21-21
: Ohayo, sensei! Necessary imports for query utilities are added.
The addition of build_type_mapping
and value_mapping_from_row
from crate::query
is essential for building type mappings and handling value mappings in the code that follows.
130-130
: Ohayo, sensei! SQL query updated to retrieve required model data.
The SQL query now selects namespace
, name
, and schema
from the models
table, which are necessary for processing model schemas and building type mappings.
crates/torii/core/src/model.rs (1)
97-102
: Ohayo sensei! Proper error handling in schema retrieval.
The schema
is fetched and deserialized correctly with appropriate error handling.
crates/torii/core/src/sql/mod.rs (6)
8-8
: Importing necessary types from dojo_types::schema
Ohayo, sensei! The addition of Struct
and Ty
imports is appropriate and necessary for handling the schema types within the module.
264-266
: Including schema
field in SQL insert and update
Ohayo, sensei! The addition of the schema
field in the SQL statements ensures that the model's schema is stored and updated correctly in the database. This enhances data consistency.
276-276
: Serializing the model schema to JSON
Ohayo, sensei! Using serde_json::to_string(&model)?
to serialize the model schema is appropriate. This allows for efficient storage and retrieval of the schema.
287-290
: Updating build_model_query
call with correct parameters
Ohayo, sensei! Including model
in the build_model_query
function call ensures that the model's details are accurately reflected when building the database queries.
369-370
: Calling set_entity_model
with appropriate arguments
Ohayo, sensei! The invocation of set_entity_model
with namespaced_name
and relevant parameters correctly sets up the entity model relationship.
429-430
: Using set_entity_model
in set_event_message
Ohayo, sensei! Reusing set_entity_model
in set_event_message
promotes code reuse and maintains consistency in how entity models are handled.
@@ -0,0 +1,3 @@ | |||
-- Adds a new schema column to the models table. | |||
-- The schema is the JSON serialized Ty of the model. | |||
ALTER TABLE models ADD COLUMN schema BLOB NOT NULL; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Ohayo sensei! After analyzing the migrations, I can now provide a clear response about the schema addition:
Migration strategy needs adjustment for NOT NULL constraint
The models table is pre-existing and contains important columns like id
, name
, layout
, etc. Adding a NOT NULL column schema
without a default value could indeed fail if there are existing rows. The migration should be split into two steps:
- Add the column as nullable first
- Populate existing rows with data
- Then alter the column to NOT NULL
Example safer migration approach:
ALTER TABLE models ADD COLUMN schema BLOB;
-- (Add data population here)
ALTER TABLE models ALTER COLUMN schema SET NOT NULL;
🔗 Analysis chain
Verify the NOT NULL constraint migration strategy
Adding a NOT NULL column without a default value could fail if the table contains existing rows. Please ensure there's a proper migration strategy in place.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for related migrations and existing rows
# Look for companion migrations that might handle data population
echo "Checking for related migrations..."
fd . "crates/torii/migrations" --type f -e sql | sort | grep -B 5 -A 5 "20241126064130"
# Look for potential data population queries
echo "\nChecking for data population logic..."
rg -A 10 "INSERT INTO.*models.*schema" "crates/torii"
rg -A 10 "UPDATE.*models.*schema" "crates/torii"
Length of output: 807
Script:
#!/bin/bash
# Let's check if this is a new table creation or if the table exists
# Also check for any data population logic in the surrounding migrations
# Check the content of the model_schema migration
echo "Content of the model_schema migration:"
cat "crates/torii/migrations/20241126064130_model_schema.sql"
# Check the content of the previous migration for table creation
echo -e "\nChecking previous migrations for table creation:"
cat "crates/torii/migrations/20241014085532_add_metadata_field.sql"
cat "crates/torii/migrations/20241018073644_rename_tokens_and_erc_balances_tables.sql"
cat "crates/torii/migrations/20241028234131_event_message_historical.sql"
# Check the content of the following migration
echo -e "\nChecking the following migration:"
cat "crates/torii/migrations/20241126064421_delete_model_members.sql"
# Look for model table creation
echo -e "\nLooking for models table creation:"
rg "CREATE TABLE.*models" "crates/torii/migrations"
Length of output: 2315
Script:
#!/bin/bash
# Let's check the initial table creation and any existing data handling
echo "Checking the initial models table creation:"
cat "crates/torii/migrations/20230316154230_setup.sql"
# Let's also check if there's any data handling in the Rust code
echo -e "\nChecking for model-related code that might handle data:"
rg -g "*.rs" "models.*schema" "crates/torii/src"
Length of output: 3125
@@ -119,7 +105,7 @@ impl ResolvableObject for ModelDataObject { | |||
let (data, page_info) = fetch_multiple_rows( | |||
&mut conn, | |||
&table_name, | |||
EVENT_ID_COLUMN, | |||
"internal_event_id", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Ohayo sensei! Consider using constants for column names.
The hardcoded string "internal_event_id"
appears in multiple places. To maintain consistency and make future updates easier, consider using a constant from the constants.rs
module.
- "internal_event_id",
+ INTERNAL_EVENT_ID_COLUMN,
Also applies to: 120-120
crates/torii/grpc/src/server/mod.rs
Outdated
None, | ||
None, | ||
)?; | ||
|
||
let rows = sqlx::query(&entity_query).bind(&models_str).fetch_all(&mut *tx).await?; | ||
println!("entity_query: {}", entity_query); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove or guard the debug print statement, sensei!
The println!
statement could leak sensitive query information in production. Consider:
- Removing it completely
- Using proper logging with appropriate log levels
-println!("entity_query: {}", entity_query);
+log::debug!("entity_query: {}", entity_query);
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
println!("entity_query: {}", entity_query); | |
log::debug!("entity_query: {}", entity_query); |
format!("SELECT * FROM [{}] WHERE internal_entity_id = ?", table_name); | ||
let row = | ||
sqlx::query(&query).bind(&entity_id).fetch_one(&mut *conn).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohayo, sensei! Validate dynamic table names to prevent SQL injection
When constructing SQL queries with dynamic table names using format!
, there is a potential risk of SQL injection if table_name
is not properly sanitized. Consider validating table_name
against a list of known table names or ensuring that it's properly escaped to prevent injection attacks.
Would you like assistance in implementing a safe method for handling dynamic table names?
format!("{}.{}", path, child.name) | ||
}; | ||
collect_columns(table_prefix, &new_path, &child.ty, selections); | ||
} | ||
} | ||
Ty::Tuple(t) => { | ||
let table_name = format!("{}${}", path, name); | ||
|
||
tables.push(TableInfo { | ||
table_name: table_name.clone(), | ||
parent_table: Some(path.to_string()), | ||
// is_optional: parent_is_optional, | ||
depth, | ||
}); | ||
|
||
for (i, child) in t.iter().enumerate() { | ||
parse_ty( | ||
&table_name, | ||
&format!("_{}", i), | ||
child, | ||
selections, | ||
tables, | ||
arrays_queries, | ||
_parent_is_optional, | ||
depth + 1, | ||
); | ||
let new_path = | ||
if path.is_empty() { format!("{}", i) } else { format!("{}.{}", path, i) }; | ||
collect_columns(table_prefix, &new_path, child, selections); | ||
} | ||
} | ||
Ty::Array(t) => { | ||
let table_name = format!("{}${}", path, name); | ||
let is_optional = true; | ||
|
||
let mut array_selections = Vec::new(); | ||
let mut array_tables = vec![TableInfo { | ||
table_name: table_name.clone(), | ||
parent_table: Some(path.to_string()), | ||
// is_optional: true, | ||
depth, | ||
}]; | ||
|
||
parse_ty( | ||
&table_name, | ||
"data", | ||
&t[0], | ||
&mut array_selections, | ||
&mut array_tables, | ||
arrays_queries, | ||
is_optional, | ||
depth + 1, | ||
); | ||
|
||
arrays_queries.insert(table_name, (array_selections, array_tables)); | ||
} | ||
Ty::Enum(e) => { | ||
let table_name = format!("{}${}", path, name); | ||
let is_optional = true; | ||
// Add the enum variant column with table prefix and alias | ||
selections.push(format!("[{table_prefix}].[{path}] as \"{table_prefix}.{path}\"",)); | ||
|
||
let mut is_typed = false; | ||
// Add columns for each variant's value (if not empty tuple) | ||
for option in &e.options { | ||
if let Ty::Tuple(t) = &option.ty { | ||
if t.is_empty() { | ||
continue; | ||
} | ||
} | ||
|
||
parse_ty( | ||
&table_name, | ||
&option.name, | ||
&option.ty, | ||
selections, | ||
tables, | ||
arrays_queries, | ||
is_optional, | ||
depth + 1, | ||
); | ||
is_typed = true; | ||
} | ||
|
||
selections.push(format!("[{}].external_{} AS \"{}.{}\"", path, name, path, name)); | ||
if is_typed { | ||
tables.push(TableInfo { | ||
table_name, | ||
parent_table: Some(path.to_string()), | ||
// is_optional: parent_is_optional || is_optional, | ||
depth, | ||
}); | ||
let variant_path = format!("{}.{}", path, option.name); | ||
collect_columns(table_prefix, &variant_path, &option.ty, selections); | ||
} | ||
} | ||
_ => { | ||
selections.push(format!("[{}].external_{} AS \"{}.{}\"", path, name, path, name)); | ||
Ty::Array(_) | Ty::Primitive(_) | Ty::ByteArray(_) => { | ||
selections.push(format!("[{table_prefix}].[{path}] as \"{table_prefix}.{path}\"",)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohayo sensei! Ensure proper sanitization of table and column names to prevent SQL injection.
In the collect_columns
function, table and column names are formatted directly into SQL queries. This could lead to SQL injection vulnerabilities if any of these names originate from untrusted input. Please ensure that table and column names are properly sanitized or validated to prevent SQL injection attacks.
crates/torii/core/src/model.rs
Outdated
let schema = ty[0].clone(); | ||
let serialized_array = row.try_get::<String, &str>(&column_name)?; | ||
|
||
// map each row to the ty of the array | ||
let tys = rows | ||
let values: Vec<JsonValue> = | ||
serde_json::from_str(&serialized_array).map_err(ParseError::FromJsonStr)?; | ||
*ty = values | ||
.iter() | ||
.map(|row| { | ||
let mut ty = ty[0].clone(); | ||
map_row_to_ty(&path, "data", &mut ty, row, arrays_rows).map(|_| ty) | ||
.map(|v| { | ||
let mut ty = schema.clone(); | ||
ty.from_json_value(v.clone())?; | ||
Result::<_, PrimitiveError>::Ok(ty) | ||
}) | ||
.collect::<Result<Vec<_>, _>>()?; | ||
|
||
*ty = tys; | ||
.collect::<Result<Vec<Ty>, _>>()?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohayo sensei! Handle potential empty arrays to avoid index out of bounds.
Accessing ty[0]
assumes that the array ty
is non-empty. If ty
is empty, this will cause a panic due to an index out of bounds error. Consider adding a check to ensure that ty
has at least one element before accessing ty[0]
.
Apply this diff to add a check:
+ if ty.is_empty() {
+ return Err(Error::Custom("Type array is empty.".into()));
+ }
let schema = ty[0].clone();
Committable suggestion skipped: line range outside the PR's diff.
let model_table = entity.name(); | ||
|
||
self.executor.send(QueryMessage::new( | ||
"DELETE FROM entity_model WHERE entity_id = ? AND model_id = ?".to_string(), | ||
format!( | ||
"DELETE FROM [{model_table}] WHERE internal_id = ?; DELETE FROM entity_model \ | ||
WHERE entity_id = ? AND model_id = ?" | ||
) | ||
.to_string(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential SQL injection risk in delete_entity
method
Ohayo, sensei! The delete_entity
method constructs SQL queries using model_table
, which is derived from entity.name()
. If entity.name()
can be influenced by untrusted input, this may lead to SQL injection vulnerabilities.
Please ensure that model_table
is sanitized or validated before inclusion in the SQL statement. Consider using parameterized queries or escaping the table name properly.
crates/torii/core/src/sql/mod.rs
Outdated
) -> Result<()> { | ||
match entity { | ||
Ty::Struct(s) => { | ||
let table_id = path.join("$"); | ||
let statement = format!("DELETE FROM [{table_id}] WHERE entity_id = ?"); | ||
self.executor.send(QueryMessage::other( | ||
statement, | ||
vec![Argument::String(entity_id.to_string())], | ||
))?; | ||
for member in s.children.iter() { | ||
let mut path_clone = path.clone(); | ||
path_clone.push(member.name.clone()); | ||
self.build_delete_entity_queries_recursive(path_clone, entity_id, &member.ty)?; | ||
} | ||
} | ||
Ty::Enum(e) => { | ||
if e.options | ||
.iter() | ||
.all(|o| if let Ty::Tuple(t) = &o.ty { t.is_empty() } else { false }) | ||
{ | ||
return Ok(()); | ||
} | ||
|
||
let table_id = path.join("$"); | ||
let statement = format!("DELETE FROM [{table_id}] WHERE entity_id = ?"); | ||
self.executor.send(QueryMessage::other( | ||
statement, | ||
vec![Argument::String(entity_id.to_string())], | ||
))?; | ||
// Build the final query | ||
let placeholders: Vec<&str> = arguments.iter().map(|_| "?").collect(); | ||
let statement = if is_store_update { | ||
arguments.push(Argument::String(if is_event_message { | ||
"event:".to_string() + entity_id | ||
} else { | ||
entity_id.to_string() | ||
})); | ||
|
||
for child in e.options.iter() { | ||
if let Ty::Tuple(t) = &child.ty { | ||
if t.is_empty() { | ||
continue; | ||
} | ||
} | ||
format!( | ||
"UPDATE [{}] SET {} WHERE internal_id = ?", | ||
model_name, | ||
columns | ||
.iter() | ||
.zip(placeholders.iter()) | ||
.map(|(column, placeholder)| format!("{} = {}", column, placeholder)) | ||
.collect::<Vec<String>>() | ||
.join(", ") | ||
) | ||
} else { | ||
format!( | ||
"INSERT OR REPLACE INTO [{}] ({}) VALUES ({})", | ||
model_name, | ||
columns.join(","), | ||
placeholders.join(",") | ||
) | ||
}; | ||
|
||
let mut path_clone = path.clone(); | ||
path_clone.push(child.name.clone()); | ||
self.build_delete_entity_queries_recursive(path_clone, entity_id, &child.ty)?; | ||
} | ||
} | ||
Ty::Array(array) => { | ||
let table_id = path.join("$"); | ||
let statement = format!("DELETE FROM [{table_id}] WHERE entity_id = ?"); | ||
self.executor.send(QueryMessage::other( | ||
statement, | ||
vec![Argument::String(entity_id.to_string())], | ||
))?; | ||
|
||
for member in array.iter() { | ||
let mut path_clone = path.clone(); | ||
path_clone.push("data".to_string()); | ||
self.build_delete_entity_queries_recursive(path_clone, entity_id, member)?; | ||
} | ||
} | ||
Ty::Tuple(t) => { | ||
let table_id = path.join("$"); | ||
let statement = format!("DELETE FROM [{table_id}] WHERE entity_id = ?"); | ||
self.executor.send(QueryMessage::other( | ||
statement, | ||
vec![Argument::String(entity_id.to_string())], | ||
))?; | ||
|
||
for (idx, member) in t.iter().enumerate() { | ||
let mut path_clone = path.clone(); | ||
path_clone.push(format!("_{}", idx)); | ||
self.build_delete_entity_queries_recursive(path_clone, entity_id, member)?; | ||
} | ||
} | ||
_ => {} | ||
} | ||
// Execute the single query | ||
self.executor.send(QueryMessage::other(statement, arguments))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential SQL injection in dynamic SQL statements
Ohayo, sensei! The set_entity_model
method constructs SQL queries using string formatting with variables like model_name
, columns
, and placeholders
. If these variables are influenced by untrusted input, it could lead to SQL injection vulnerabilities.
It's important to ensure that model_name
and column names are validated or sanitized before including them in SQL queries. Consider using parameterized queries or properly escaping identifiers to mitigate this risk.
crates/torii/core/src/sql/mod.rs
Outdated
if array_idx > 0 { | ||
// index columns | ||
for i in 0..array_idx { | ||
let column = format!("idx_{i} INTEGER NOT NULL"); | ||
create_table_query.push_str(&format!("{column}, ")); | ||
|
||
alter_table_queries.push(format!( | ||
"ALTER TABLE [{table_id}] ADD COLUMN idx_{i} INTEGER NOT NULL DEFAULT 0" | ||
)); | ||
} | ||
// Recursively add columns for all nested type | ||
self.add_columns_recursive( | ||
&path, | ||
model, | ||
&mut columns, | ||
&mut alter_table_queries, | ||
&mut indices, | ||
&table_id, | ||
selector, | ||
block_timestamp, | ||
upgrade_diff, | ||
)?; | ||
|
||
// full array id column | ||
create_table_query.push_str("full_array_id TEXT NOT NULL UNIQUE, "); | ||
alter_table_queries.push(format!( | ||
"ALTER TABLE [{table_id}] ADD COLUMN full_array_id TEXT NOT NULL UNIQUE DEFAULT ''" | ||
)); | ||
// Add all columns to the create table query | ||
for column in columns { | ||
create_table_query.push_str(&format!("{}, ", column)); | ||
} | ||
|
||
let mut build_member = |name: &str, ty: &Ty, options: &mut Option<Argument>| { | ||
if let Ok(cairo_type) = Primitive::from_str(&ty.name()) { | ||
let sql_type = cairo_type.to_sql_type(); | ||
let column = format!("external_{name} {sql_type}"); | ||
|
||
create_table_query.push_str(&format!("{column}, ")); | ||
|
||
alter_table_queries.push(format!( | ||
"ALTER TABLE [{table_id}] ADD COLUMN external_{name} {sql_type}" | ||
)); | ||
|
||
indices.push(format!( | ||
"CREATE INDEX IF NOT EXISTS [idx_{table_id}_{name}] ON [{table_id}] \ | ||
(external_{name});" | ||
)); | ||
} else if let Ty::Enum(e) = &ty { | ||
let all_options = e | ||
.options | ||
.iter() | ||
.map(|c| format!("'{}'", c.name)) | ||
.collect::<Vec<_>>() | ||
.join(", "); | ||
|
||
let column = | ||
format!("external_{name} TEXT CHECK(external_{name} IN ({all_options}))",); | ||
|
||
create_table_query.push_str(&format!("{column}, ")); | ||
// Add internal timestamps | ||
create_table_query.push_str("internal_executed_at DATETIME NOT NULL, "); | ||
create_table_query | ||
.push_str("internal_created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP, "); | ||
create_table_query | ||
.push_str("internal_updated_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP, "); | ||
|
||
alter_table_queries.push(format!("ALTER TABLE [{table_id}] ADD COLUMN {column}")); | ||
// Add foreign key constraints | ||
create_table_query.push_str("FOREIGN KEY (internal_entity_id) REFERENCES entities(id), "); | ||
create_table_query | ||
.push_str("FOREIGN KEY (internal_event_message_id) REFERENCES event_messages(id));"); | ||
|
||
indices.push(format!( | ||
"CREATE INDEX IF NOT EXISTS [idx_{table_id}_{name}] ON [{table_id}] \ | ||
(external_{name});" | ||
)); | ||
// Execute the queries | ||
if upgrade_diff.is_some() { | ||
for alter_query in alter_table_queries { | ||
self.executor.send(QueryMessage::other(alter_query, vec![]))?; | ||
} | ||
} else { | ||
self.executor.send(QueryMessage::other(create_table_query, vec![]))?; | ||
} | ||
|
||
*options = Some(Argument::String( | ||
e.options | ||
.iter() | ||
.map(|c: &dojo_types::schema::EnumOption| c.name.clone()) | ||
.collect::<Vec<_>>() | ||
.join(",") | ||
.to_string(), | ||
)); | ||
} else if let Ty::ByteArray(_) = &ty { | ||
let column = format!("external_{name} TEXT"); | ||
// Create indices | ||
for index_query in indices { | ||
self.executor.send(QueryMessage::other(index_query, vec![]))?; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dynamic SQL construction in build_model_query
without validation
Ohayo, sensei! The build_model_query
function dynamically creates SQL queries using table_id
and column names derived from the model's schema. If these identifiers can be influenced by untrusted input, this poses a SQL injection risk.
Ensure that table_id
and column names are sanitized or validated. Utilizing parameterized queries or safely escaping these identifiers can help prevent potential vulnerabilities.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
crates/torii/core/src/sql/mod.rs (3)
616-743
: Consider adding depth limit and improving error handling.Ohayo, sensei! While the implementation is thorough, there are two potential improvements:
- Add a depth limit to prevent stack overflow with deeply nested structures
- Enhance error messages for better debugging
Consider adding these improvements:
fn set_entity_model( &mut self, model_name: &str, event_id: &str, entity_id: (&str, IsEventMessage), entity: (&Ty, IsStoreUpdate), block_timestamp: u64, + depth: Option<u32>, // Add depth parameter ) -> Result<()> { + const MAX_DEPTH: u32 = 32; // Prevent excessive recursion + let current_depth = depth.unwrap_or(0); + if current_depth > MAX_DEPTH { + return Err(anyhow!("Maximum nesting depth exceeded for model {}", model_name)); + }
757-814
: Consider more specific index naming patterns.Ohayo, sensei! While the implementation is solid, the index naming could be more specific to avoid potential conflicts in large schemas.
Consider enhancing the index naming:
- "CREATE INDEX IF NOT EXISTS [idx_{table_id}_{name}] ON [{table_id}] ([{name}]);" + "CREATE INDEX IF NOT EXISTS [idx_{table_id}_{selector}_{name}] ON [{table_id}] ([{name}]);"
Line range hint
816-948
: Add documentation for complex type handling logic.Ohayo, sensei! The type handling implementation is comprehensive but could benefit from documentation explaining the mapping between Rust types and SQL types.
Consider adding documentation:
+ /// Recursively adds columns to the table schema based on the type structure. + /// + /// # Type Mapping + /// - Struct: Creates columns for each member + /// - Enum: Creates a TEXT column with CHECK constraint + /// - Array: Creates a TEXT column storing JSON array + /// - Primitive: Maps to corresponding SQL type fn add_columns_recursive( &mut self, path: &[String], ty: &Ty,
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (1)
crates/torii/core/src/sql/mod.rs
(10 hunks)
🔇 Additional comments (3)
crates/torii/core/src/sql/mod.rs (3)
Line range hint 264-290
: LGTM! Secure schema management implementation.
Ohayo, sensei! The implementation properly uses parameterized queries and handles schema serialization securely.
369-370
: LGTM! Well-structured entity management refactoring.
Ohayo, sensei! The refactoring to use set_entity_model
improves code reuse and maintainability.
Also applies to: 429-430
449-456
:
Critical: SQL injection vulnerability in table name construction.
Ohayo, sensei! The delete_entity
method still constructs SQL queries using unsanitized table names from entity.name()
. This poses a SQL injection risk.
Consider using a query builder or implementing proper table name validation:
- format!(
- "DELETE FROM [{model_table}] WHERE internal_id = ?; DELETE FROM entity_model \
- WHERE entity_id = ? AND model_id = ?"
- )
+ // Use a dedicated function to validate table names
+ let safe_table_name = sanitize_table_name(&model_table)?;
+ format!(
+ "DELETE FROM [{safe_table_name}] WHERE internal_id = ?; DELETE FROM entity_model \
+ WHERE entity_id = ? AND model_id = ?"
+ )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (3)
crates/torii/graphql/src/types.rs (1)
45-51
: Ohayo! The implementation looks clean, but could use documentation.The new
inner()
method is well-implemented and follows Rust idioms. It provides a clean way to access the inner type of List variants, which aligns well with the broader schema refactoring efforts.Consider adding documentation to explain the method's purpose:
+ /// Returns a reference to the inner TypeData if this is a List variant, None otherwise. + /// + /// # Examples + /// ``` + /// let list_type = TypeData::List(Box::new(TypeData::Simple(TypeRef::String))); + /// assert!(list_type.inner().is_some()); + /// ``` pub fn inner(&self) -> Option<&TypeData> { match self { TypeData::List(inner) => Some(inner), _ => None, } }crates/torii/graphql/src/query/mod.rs (2)
98-101
: Ohayo sensei! Consider caching the compiled regexThe regex is compiled on every function call, which could impact performance for frequent calls.
Consider using
lazy_static
to compile the regex once:+use lazy_static::lazy_static; + +lazy_static! { + static ref TYPE_NAME_SANITIZER: Regex = Regex::new(r"[, ()<>-]").unwrap(); +} - let re = Regex::new(r"[, ()<>-]").unwrap(); - let sanitized_member_type_name = re.replace_all(&name, ""); + let sanitized_member_type_name = TYPE_NAME_SANITIZER.replace_all(&name, "");
165-168
: Ohayo sensei! Enhance error context in JSON parsingThe current error handling for JSON parsing could be more informative for debugging.
Consider adding more context:
- serde_json::from_str(&json_str).map_err(|e| { - sqlx::Error::Protocol(format!("JSON parse error: {}", e)) - })?; + serde_json::from_str(&json_str).map_err(|e| { + sqlx::Error::Protocol(format!( + "Failed to parse JSON for column '{}': {}. Value: '{}'", + column_name, e, json_str + )) + })?;
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (2)
crates/torii/graphql/src/query/mod.rs
(3 hunks)crates/torii/graphql/src/types.rs
(1 hunks)
🔇 Additional comments (1)
crates/torii/graphql/src/query/mod.rs (1)
22-23
:
Ohayo sensei! Handle as_struct()
gracefully to prevent panics
The .unwrap()
call on schema.as_struct()
could cause runtime panics if the schema isn't a struct type.
Consider handling the Option safely:
-pub fn build_type_mapping(namespace: &str, schema: &Ty) -> TypeMapping {
- let model = schema.as_struct().unwrap();
+pub fn build_type_mapping(namespace: &str, schema: &Ty) -> Result<TypeMapping, &'static str> {
+ let model = schema.as_struct().ok_or("Schema must be a struct type")?;
let mut column_name = | ||
if !is_external { format!("internal_{}", field_name) } else { field_name.to_string() }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohayo sensei! Sanitize column names to prevent SQL injection
The column name construction could be vulnerable to SQL injection if field names contain malicious input.
Consider adding validation or using prepared statements:
+fn sanitize_column_name(name: &str) -> String {
+ // Only allow alphanumeric characters, underscores, and dots
+ name.chars()
+ .filter(|c| c.is_alphanumeric() || *c == '_' || *c == '.')
+ .collect()
+}
let mut column_name =
- if !is_external { format!("internal_{}", field_name) } else { field_name.to_string() };
+ if !is_external {
+ format!("internal_{}", sanitize_column_name(field_name))
+ } else {
+ sanitize_column_name(field_name)
+ };
Committable suggestion skipped: line range outside the PR's diff.
fn populate_value( | ||
value: &mut Value, | ||
type_data: &TypeData, | ||
entity_id: &Option<String>, | ||
) { | ||
match value { | ||
Value::Object(obj) => { | ||
for (field_name, field_value) in obj.iter_mut() { | ||
populate_value( | ||
field_value, | ||
&type_data.type_mapping().unwrap()[field_name], | ||
entity_id, | ||
); | ||
} | ||
|
||
if type_data.type_mapping().map_or(false, |mapping| { | ||
mapping.contains_key(&Name::new("option")) | ||
}) { | ||
obj.insert( | ||
Name::new("option"), | ||
Value::String(obj.keys().next().unwrap().to_string()), | ||
); | ||
} | ||
|
||
// insert $entity_id$ relation | ||
if let Some(entity_id) = entity_id { | ||
obj.insert( | ||
Name::new(INTERNAL_ENTITY_ID_KEY), | ||
Value::from(entity_id), | ||
); | ||
} | ||
} | ||
Value::List(inner) => { | ||
for item in inner { | ||
populate_value(item, type_data.inner().unwrap(), entity_id); | ||
} | ||
} | ||
_ => {} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Ohayo sensei! Consider extracting the nested populate_value function
The populate_value
function is quite complex and deeply nested within build_value_mapping
. This makes it harder to test and maintain.
Consider moving it outside as a standalone function:
+fn populate_value(
+ value: &mut Value,
+ type_data: &TypeData,
+ entity_id: &Option<String>,
+) {
+ match value {
+ Value::Object(obj) => {
+ // ... existing object handling logic
+ }
+ Value::List(inner) => {
+ // ... existing list handling logic
+ }
+ _ => {}
+ }
+}
fn build_value_mapping(
row: &SqliteRow,
types: &TypeMapping,
prefix: &str,
is_external: bool,
entity_id: &Option<String>,
) -> sqlx::Result<ValueMapping> {
// ... existing code ...
- fn populate_value(...) { ... }
// ... use the extracted function ...
}
Committable suggestion skipped: line range outside the PR's diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (1)
crates/torii/grpc/src/server/mod.rs (1)
1137-1158
: Nice model deduplication approach, sensei!The use of HashMap to track seen models and generate unique aliases is a clean solution. However, there's an unwrap that could be handled more gracefully.
Consider handling the potential error case:
- .unwrap(); + .map_err(|e| QueryError::InvalidNamespacedModel(model.clone()))?;
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
📒 Files selected for processing (3)
crates/torii/graphql/src/query/data.rs
(4 hunks)crates/torii/graphql/src/query/filter.rs
(1 hunks)crates/torii/grpc/src/server/mod.rs
(7 hunks)
🔇 Additional comments (8)
crates/torii/graphql/src/query/filter.rs (2)
59-59
: Ohayo! Verify schema compatibility with the simplified field naming.
The removal of field prefixing aligns with the schema refactoring, but let's ensure it's compatible with the new schema structure.
✅ Verification successful
Ohayo! The field naming change is safe and consistent with the codebase, sensei!
The removal of the "external_" prefix from filter fields is part of a clean schema refactoring. Our investigation shows:
- The
external_
prefix is not used in any model or schema definitions - The only remaining
external_
usages are unrelated (in libp2p, server config, etc.) - The filter handling is consistent across the codebase with the simplified field naming
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if the new field naming convention is consistent with schema definitions
# Search for schema-related SQL queries to verify field naming patterns
rg -A 5 "CREATE TABLE.*models"
# Check for any remaining "external_" prefixes in schema-related code
rg "external_.*schema"
Length of output: 514
Script:
#!/bin/bash
# Let's check the schema and filter-related code more comprehensively
# Check the schema definitions in migrations
fd migration sql --exec cat {}
# Look for field references in GraphQL schema definitions
fd schema.rs --exec cat {}
# Search for filter usage patterns
rg -A 3 "Filter.*field.*value"
# Check for any remaining external_ prefixes in the codebase
rg "external_" --type rust
Length of output: 65997
Script:
#!/bin/bash
# Let's check the GraphQL schema and query handling
# Look for GraphQL type definitions and field references
fd graphql --type rs --exec cat {}
# Check for any query-related code that might use these fields
rg -A 5 "impl.*Query"
# Look for any SQL-related code that handles these fields
rg -A 5 "SELECT.*FROM.*models"
Length of output: 4068
64-64
: Verify backward compatibility with existing queries, sensei!
The simplified field naming in the default equality case looks good, but we should ensure existing queries won't break.
✅ Verification successful
Ohayo! No breaking changes detected in filter implementation, sensei!
The simplified field naming in the default equality case is safe because:
- The core
Filter
struct ingraphql/src/query/filter.rs
maintains its original structure withfield
,comparator
, andvalue
fields - Existing query implementations in the codebase continue to use the same filter pattern
- Test cases in
models_test.rs
demonstrate consistent filter usage with both equality and GTE comparisons - The change only affects the constructor's default case, not the underlying filter behavior
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check for potential breaking changes in existing queries
# Search for GraphQL query patterns that might be affected
ast-grep --pattern 'query { $$$($_ : $_) { $$$} }'
# Look for test files that might need updates
fd -e rs -e graphql "test" --exec rg -l "query.*filter"
Length of output: 108
Script:
#!/bin/bash
# Let's check for filter-related tests and their usage patterns
rg -g "*.rs" "filter.*test" -B 2 -A 2
# Look for actual GraphQL query implementations using filters
rg -g "*.rs" -g "*.graphql" "filter.*query" -B 2 -A 2
# Check for filter-related test files specifically
fd -e rs "filter.*test"
# Look for filter usage in GraphQL schema definitions
rg -g "*.graphql" "filter" -B 2 -A 2
Length of output: 1924
Script:
#!/bin/bash
# Let's examine the filter implementation and its usage
rg -g "*.rs" "impl.*Filter" -B 2 -A 5
# Check for filter-related GraphQL schema definitions
fd -e graphql . --exec cat {}
# Look for filter usage in queries
ast-grep --pattern 'fn $_($$$ filter: $_) { $$$ }'
# Check test cases specifically using filters
rg -g "*.rs" "test.*filter" -B 2 -A 5
Length of output: 2373
Script:
#!/bin/bash
# Let's check the actual Filter struct implementation and its usage
rg -g "*.rs" "struct Filter" -B 2 -A 5
# Look for any GraphQL filter input types
rg -g "*.rs" "FilterInput" -B 2 -A 5
# Check for filter-related test cases with actual query examples
rg -g "*.rs" "query.*filter" -B 5 -A 5
# Look for filter-related schema definitions
rg -g "*.rs" "impl.*Schema.*filter" -B 5 -A 5
Length of output: 5565
crates/torii/graphql/src/query/data.rs (2)
8-8
: Ohayo! LGTM on the constants import!
Good practice to centralize the DEFAULT_LIMIT configuration.
127-130
:
Consider using safer error handling for field access, sensei!
The use of try_get_unchecked
for order field values could lead to runtime panics if the field doesn't exist. Consider using try_get
with proper error handling instead.
- &data[0].try_get_unchecked::<String, &str>(&order_field)?,
+ &data[0].try_get::<String, &str>(&order_field)
+ .map_err(|e| sqlx::Error::ColumnNotFound(order_field.clone()))?,
Likely invalid or redundant comment.
crates/torii/grpc/src/server/mod.rs (4)
68-72
: Ohayo, sensei! The constant renaming looks good!
The renaming of the column constants to include the "internal_" prefix better reflects their role as internal database identifiers, improving code clarity.
302-318
: Query building simplification looks clean, sensei!
The removal of array-related parameters and simplification of the build_sql_query
call aligns well with the SQL schema refactoring objectives. The code is now more focused and maintainable.
1046-1046
: Clean mapping simplification, sensei!
The direct usage of schema name in map_row_to_ty
aligns well with the flattened schema approach and makes the code more straightforward.
677-685
: Verify schema flattening assumption, sensei!
The code now directly uses member names as column names, assuming they are already flattened in the schema. Let's verify this assumption.
✅ Verification successful
Let me gather more specific information about the schema flattening implementation.
Schema flattening is properly implemented, sensei!
The code inspection reveals that column flattening is correctly handled through the build_sql_query
implementation in crates/torii/core/src/model.rs
. The system:
- Flattens nested structures during SQL query building via
collect_columns
function - Properly maps column names to member names in
map_row_to_ty
- Handles column name generation consistently across the codebase
The direct member name usage in the SQL query is safe because the schema processing pipeline ensures proper flattening before reaching this point.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if all member columns in the schema are properly flattened
# Search for schema definitions and column mappings
rg -A 5 "schema.*flatten|column.*mapping"
# Search for any potential nested member access patterns
ast-grep --pattern 'member.$_'
Length of output: 5413
Script:
#!/bin/bash
# Check schema flattening implementation in SQL and core logic
rg -A 10 "flatten.*schema|build_sql_query" crates/torii/
# Check how column names are handled in SQL queries
rg -A 5 "column_name.*member|member.*column" crates/torii/
# Look for any nested column handling
ast-grep --pattern 'build_sql_query($$$)'
Length of output: 9785
Some(order) => Ok(format!( | ||
"(({} {} '{}' AND {} = '{}') OR {} {} '{}')", | ||
id_column, | ||
direction.as_ref(), | ||
event_id, | ||
order.field, | ||
field_value, | ||
order.field, | ||
direction.as_ref(), | ||
field_value | ||
)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ohayo! Important security consideration for SQL injection prevention.
The current implementation builds SQL conditions using string formatting, which could be vulnerable to SQL injection attacks. Consider using parameterized queries instead.
Here's a suggested approach:
- Create a prepared statement with placeholders
- Bind the values safely using sqlx's parameter binding
Example approach:
fn handle_cursor(
cursor: &str,
order: &Option<Order>,
direction: CursorDirection,
id_column: &str,
) -> Result<(String, Vec<String>)> {
match cursor::decode(cursor) {
Ok((event_id, field_value)) => match order {
Some(order) => Ok((
format!(
"(({} {} ? AND {} = ?) OR {} {} ?)",
id_column,
direction.as_ref(),
order.field,
order.field,
direction.as_ref(),
),
vec![event_id, field_value.clone(), field_value]
)),
None => Ok((
format!("{} {} ?", id_column, direction.as_ref()),
vec![event_id]
)),
},
Err(_) => Err(sqlx::Error::Decode("Invalid cursor format".into())),
}
}
let column_name = order.field.clone(); | ||
query.push_str(&format!( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 Codebase verification
Ohayo sensei! Found an inconsistency in order field handling
The search revealed that in object/connection/mod.rs
, order fields are still being prefixed with internal_
in some cases:
format!("internal_{}", order.field)
While the code under review removes this prefix handling, suggesting a potential mismatch in the implementation.
crates/torii/graphql/src/object/connection/mod.rs
: Still usesinternal_
prefixcrates/torii/graphql/src/query/data.rs
: Removed the prefix handling
🔗 Analysis chain
Ohayo sensei! Verify schema compatibility with order field changes.
The simplified order field handling looks cleaner, but we should verify that this change aligns with the SQL schema refactor mentioned in the PR title.
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Check if the order fields exist in the tables schema
# Look for table definitions and column names in migration files
# Search for CREATE TABLE statements and their columns
rg -A 20 "CREATE TABLE" --glob "migrations/*.sql"
# Search for any external_ prefixed columns that might have been affected
rg "external_" --glob "migrations/*.sql"
Length of output: 390
Script:
#!/bin/bash
# Let's try a different approach to find SQL schema and order-related code
# Find SQL files in the repository
fd -e sql
# Look for GraphQL schema files that might define ordering
fd schema
# Search for order-related code in the current file's context
rg -B 5 -A 5 "order.field" "crates/torii/graphql/src/query/data.rs"
# Search for other order field usages
rg "order.field"
Length of output: 5135
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2718 +/- ##
==========================================
- Coverage 56.47% 55.78% -0.69%
==========================================
Files 421 425 +4
Lines 53976 53315 -661
==========================================
- Hits 30482 29741 -741
- Misses 23494 23574 +80 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great rework @Larkooo, simplifies so much at the schema level.
Some points:
- When an array contains options, we have this serialization:
[{"Some":6},{"None":[]},{"Some":6}]
. Is this something that could cause issue while deserializing? Since it's not a cairo serialization. - Only hand tested some gRPC / GraphQL queries, will add some tests to have complex queries with nested types.
* start sql model register refactor * refactor: arrays as json & model members * alter table and model idx * fix: model idx * better error log * remove model idx * rempve model members table * model schema migration * lol * refactor: ditch model members & refactor dynamic query * start refactoring while grpc and grahql services/type mapping * fix set entities enum * fix set entities * fmt * update grpc for flattened reads * cleanup * feat(torii-core): delete entity new schema * graphql integration almost done * global arrays refactor & fix graphql * fmt * tuple no _ * feat: graphql arrays of objects and enums * feat: tuple support * fmt * c * f * fix * fix entities * refactor: value mapping fix * types * fix: snake case * fix: casing * f * namespaced schema * fix grpc test * ff * fmt * c * fix clauses * fix libp2p * fix id
Summary by CodeRabbit
Release Notes
New Features
schema
field to themodels
table, allowing storage of model type information.inner
in theTypeData
enum for accessing inner list types.clear
method in theModelCache
for improved model management.Bug Fixes
Refactor
Chores
These changes enhance the functionality and maintainability of the application, providing users with improved error reporting and a more robust data handling framework.