-
Notifications
You must be signed in to change notification settings - Fork 532
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: analytics api #2740
feat: analytics api #2740
Conversation
The latest updates on your projects. Learn more about Vercel for Git ↗︎
1 Skipped Deployment
|
|
📝 WalkthroughWalkthroughThis pull request introduces multiple enhancements to the Unkey analytics system across various files. Key changes include the addition of a public getter method Changes
Possibly related PRs
Suggested labels
Suggested reviewers
Finishing Touches
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
Thank you for following the naming conventions for pull request titles! 🙏 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
apps/api/src/routes/v1_analytics_getVerifications.ts (2)
100-118
: Consider using a const enum for orderBy values.The orderBy validation could be more type-safe by using a const enum or as const array.
Apply this diff to improve type safety:
+const ORDER_BY_VALUES = [ + "time", + "valid", + "notFound", + "forbidden", + "usageExceeded", + "rateLimited", + "unauthorized", + "disabled", + "insufficientPermissions", + "expired", + "total", +] as const; - .enum([ - "time", - "valid", - "notFound", - "forbidden", - "usageExceeded", - "rateLimited", - "unauthorized", - "disabled", - "insufficientPermissions", - "expired", - "total", - ]) + .enum(ORDER_BY_VALUES)
465-481
: Optimize identity fetching with batched queries.The current implementation makes individual cache/DB queries for each identity. Consider batching these queries for better performance.
Consider this approach:
const identityIds = Object.keys(identityIds); // Batch fetch from cache first const cacheResults = await Promise.all( chunk(identityIds, 100).map(async (ids) => { const cacheKeys = ids.map(id => ({ key: id, type: "identityById" as const })); return cache.batchGet(cacheKeys); }) ); // Batch fetch missing ones from DB const missingIds = identityIds.filter(id => !cacheResults.flat().find(r => r.key === id)); if (missingIds.length > 0) { const dbResults = await db.readonly.query.identities.findMany({ where: (table, { and, eq, inArray }) => and( eq(table.workspaceId, auth.authorizedWorkspaceId), inArray(table.id, missingIds) ), }); // Update cache with new results await Promise.all( dbResults.map(identity => cache.identityById.set(identity.id, identity) ) ); }apps/api/src/routes/v1_analytics_getVerifications.happy.test.ts (2)
9-9
: Define outcome constants in a separate constants file.The
POSSIBLE_OUTCOMES
constant could be shared with the main implementation file.Consider moving this to a shared constants file:
// src/constants/analytics.ts export const VERIFICATION_OUTCOMES = ["VALID", "RATE_LIMITED", "DISABLED"] as const; export type VerificationOutcome = typeof VERIFICATION_OUTCOMES[number];
1189-1197
: Enhance test data generation with more configuration options.The
generate
function could be more flexible to support different test scenarios.Consider adding these options:
type GenerateOptions = { start: number; end: number; length: number; workspaceId: string; keySpaceId: string; keys: Array<{ keyId: string; identityId?: string }>; tags?: string[]; // Add these options: outcomeDistribution?: Partial<Record<VerificationOutcome, number>>; timeDistribution?: "random" | "uniform" | "weighted"; regionList?: string[]; };
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
apps/api/src/pkg/cache/index.ts
(1 hunks)apps/api/src/pkg/cache/namespaces.ts
(1 hunks)apps/api/src/routes/v1_analytics_getVerifications.happy.test.ts
(1 hunks)apps/api/src/routes/v1_analytics_getVerifications.ts
(5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms (17)
- GitHub Check: Test Packages / Test ./packages/rbac
- GitHub Check: Test Packages / Test ./packages/nextjs
- GitHub Check: Test Packages / Test ./packages/hono
- GitHub Check: Test Packages / Test ./packages/cache
- GitHub Check: Test Packages / Test ./packages/api
- GitHub Check: Test Packages / Test ./internal/clickhouse
- GitHub Check: Test Packages / Test ./internal/resend
- GitHub Check: Test Packages / Test ./internal/keys
- GitHub Check: Test Packages / Test ./internal/id
- GitHub Check: Test Packages / Test ./internal/hash
- GitHub Check: Test API / API Test Local
- GitHub Check: Test Packages / Test ./internal/encryption
- GitHub Check: Test Packages / Test ./internal/billing
- GitHub Check: Build / Build
- GitHub Check: Test Agent Local / test_agent_local
- GitHub Check: autofix
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (5)
apps/api/src/pkg/cache/namespaces.ts (1)
66-66
: LGTM! The cache namespace type is well-defined.The addition of
identityById
follows the established pattern and properly handles cache misses with the nullable type.apps/api/src/pkg/cache/index.ts (1)
74-74
: LGTM! Cache namespace initialization is consistent.The
identityById
namespace is properly initialized with the same configuration as other namespaces, maintaining consistency in the caching layer.apps/api/src/routes/v1_analytics_getVerifications.ts (3)
447-456
:⚠️ Potential issueImprove error handling and logging security.
The error handling exposes internal details in both logs and error messages.
Apply this diff to improve security:
if (data.err) { logger.error("unable to query clickhouse", { - error: data.err.message, - query: query, + error: data.err.message.replace(/'[^']*'/g, "'***'"), // Redact values + queryType: "analytics_verification", }); throw new UnkeyApiError({ code: "INTERNAL_SERVER_ERROR", - message: `unable to query clickhouse: ${data.err.message}`, + message: "An error occurred while processing analytics data", }); }Likely invalid or redundant comment.
208-212
: 🛠️ Refactor suggestionAvoid exposing internal details in error messages.
The error message includes the raw filters object which could expose sensitive information.
Apply this diff to provide a cleaner error message:
- message: `You can not group by tag and tags at the same time, received: ${JSON.stringify( - filters, - )}`, + message: "Cannot group by both 'tag' and 'tags' simultaneously",Likely invalid or redundant comment.
397-397
:⚠️ Potential issueFix potential SQL injection in ORDER BY clause.
Using
{ orderBy: Identifier }
is unsafe as it allows arbitrary column names.Apply this diff to use a safe mapping:
- query.push(`ORDER BY { orderBy: Identifier } ${filters.order === "asc" ? "ASC" : "DESC"} `); + const safeOrderByMap: Record<typeof ORDER_BY_VALUES[number], string> = { + time: "time", + valid: "valid", + notFound: "notFound", + forbidden: "forbidden", + usageExceeded: "usageExceeded", + rateLimited: "rateLimited", + unauthorized: "unauthorized", + disabled: "disabled", + insufficientPermissions: "insufficientPermissions", + expired: "expired", + total: "total", + }; + query.push(`ORDER BY ${safeOrderByMap[filters.orderBy]} ${filters.order === "asc" ? "ASC" : "DESC"} `);Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (5)
.github/workflows/job_test_api_local.yaml (1)
38-44
: Consider security and robustness improvements for ClickHouse migrationWhile the migration step is functionally correct, consider these improvements:
- Avoid hardcoding credentials in the workflow file
- Add error handling and migration status verification
- Ensure the migration directory path is resilient to repository structure changes
Example improvement:
- name: Migrate ClickHouse run: | + if ! goose status; then + echo "Failed to check migration status" + exit 1 + fi goose up + if [ $? -ne 0 ]; then + echo "Migration failed" + exit 1 + fi env: GOOSE_DRIVER: clickhouse - GOOSE_DBSTRING: "tcp://default:[email protected]:9000" + GOOSE_DBSTRING: ${{ secrets.CLICKHOUSE_URL }} GOOSE_MIGRATION_DIR: ./internal/clickhouse/schemaapps/api/src/routes/v1_analytics_getVerifications.ts (2)
82-89
: Consider adding validation for mutually exclusive groupBy values.The description mentions that grouping by tags and tag is mutually exclusive, but this constraint isn't enforced at the schema level. While the code handles this later with a runtime check, consider adding this validation in the schema for better API documentation and early validation.
- groupBy: validation.groupBy - .or(z.array(validation.groupBy)) - .optional() + groupBy: z + .union([ + validation.groupBy, + z.array(validation.groupBy.refine( + (arr) => !(arr.includes("tag") && arr.includes("tags")), + "Cannot group by both 'tag' and 'tags'" + )) + ]) + .optional()
319-352
: Consider implementing retry logic for cache operations.The API lookup cache operation could benefit from retry logic to handle transient failures. Also, consider implementing a circuit breaker pattern for the database queries.
apps/api/src/routes/v1_analytics_getVerifications.happy.test.ts (2)
9-10
: Consider using TypeScript enum for outcomes.Using a const array for possible outcomes could be improved by using a TypeScript enum for better type safety and maintainability.
-const POSSIBLE_OUTCOMES = ["VALID", "RATE_LIMITED", "DISABLED"] as const; +enum VerificationOutcome { + VALID = "VALID", + RATE_LIMITED = "RATE_LIMITED", + DISABLED = "DISABLED", +}
1227-1229
: Consider adding input validation to toStartOfHour.The toStartOfHour function should validate its input to ensure it's a valid timestamp.
function toStartOfHour(unixmilli: number): number { + if (!Number.isFinite(unixmilli) || unixmilli < 0) { + throw new Error('Invalid timestamp provided'); + } return Math.floor(unixmilli / 60 / 60 / 1000) * 60 * 60 * 1000; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
.github/workflows/job_test_api_local.yaml
(2 hunks)apps/api/src/routes/v1_analytics_getVerifications.happy.test.ts
(1 hunks)apps/api/src/routes/v1_analytics_getVerifications.ts
(5 hunks)
🧰 Additional context used
📓 Learnings (1)
apps/api/src/routes/v1_analytics_getVerifications.ts (2)
Learnt from: chronark
PR: unkeyed/unkey#2740
File: apps/api/src/routes/v1_analytics_getVerifications.ts:259-259
Timestamp: 2025-01-10T08:31:33.471Z
Learning: In the Unkey codebase, workspace_id values obtained from auth.authorizedWorkspaceId are safe to use in SQL queries without parameterization since they are internally controlled and validated by the authentication system, not user-provided.
Learnt from: chronark
PR: unkeyed/unkey#2740
File: apps/api/src/routes/v1_analytics_getVerifications.ts:399-402
Timestamp: 2025-01-10T08:35:51.282Z
Learning: Route-level Zod validation for query parameters (e.g., orderBy) is sufficient for preventing SQL injection, as the validation happens before the handler runs, ensuring only valid values reach the query construction.
⏰ Context from checks skipped due to timeout of 90000ms (14)
- GitHub Check: Test Packages / Test ./packages/nextjs
- GitHub Check: Test Packages / Test ./packages/hono
- GitHub Check: Test Packages / Test ./packages/cache
- GitHub Check: Test Packages / Test ./packages/api
- GitHub Check: Test Packages / Test ./internal/clickhouse
- GitHub Check: Test Packages / Test ./internal/resend
- GitHub Check: Test Packages / Test ./internal/hash
- GitHub Check: Test Packages / Test ./internal/encryption
- GitHub Check: Test Packages / Test ./internal/billing
- GitHub Check: Test API / API Test Local
- GitHub Check: Test Agent Local / test_agent_local
- GitHub Check: Build / Build
- GitHub Check: autofix
- GitHub Check: Analyze (javascript-typescript)
🔇 Additional comments (5)
.github/workflows/job_test_api_local.yaml (1)
26-26
: LGTM: Go installation added for ClickHouse supportThe addition of Go installation support is necessary for the new ClickHouse migration tooling.
apps/api/src/routes/v1_analytics_getVerifications.ts (3)
9-11
: LGTM! Well-structured validation setup.The validation object clearly defines the allowed groupBy values, making it easy to maintain and extend.
201-211
: LGTM! Proper validation of mutually exclusive groupBy values.The runtime check for mutually exclusive groupBy values is implemented correctly with a clear error message.
449-458
:⚠️ Potential issueImprove error logging for database queries.
The error logging includes the full SQL query which might expose sensitive information. Consider logging only the error message and a query identifier for debugging.
- logger.error("unable to query clickhouse", { - error: data.err.message, - query: query, - }); + logger.error("unable to query clickhouse", { + error: data.err.message, + queryId: crypto.randomUUID(), + });Likely invalid or redundant comment.
apps/api/src/routes/v1_analytics_getVerifications.happy.test.ts (1)
1198-1222
: LGTM! Well-documented and type-safe test data generation.The generate function is well-documented with TypeScript types and provides good flexibility for test data generation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (5)
apps/api/src/routes/v1_analytics_getVerifications.ts (2)
220-242
: Consider improving query construction.While the query construction is functional, consider these improvements:
- Define a type for the table configuration to ensure type safety.
- Simplify the table selection logic by using an object lookup instead of if-else chains.
Apply this diff to improve type safety and readability:
const tables = { hour: { name: "verifications.key_verifications_per_hour_v3", fill: `WITH FILL FROM toStartOfHour(fromUnixTimestamp64Milli({ start: Int64 })) TO toStartOfHour(fromUnixTimestamp64Milli({ end: Int64 })) STEP INTERVAL 1 HOUR`, }, day: { name: "verifications.key_verifications_per_day_v3", fill: `WITH FILL FROM toStartOfDay(fromUnixTimestamp64Milli({ start: Int64 })) TO toStartOfDay(fromUnixTimestamp64Milli({ end: Int64 })) STEP INTERVAL 1 DAY`, }, month: { name: "verifications.key_verifications_per_month_v3", fill: `WITH FILL FROM toDateTime(toStartOfMonth(fromUnixTimestamp64Milli({ start: Int64 }))) TO toDateTime(toStartOfMonth(fromUnixTimestamp64Milli({ end: Int64 }))) STEP INTERVAL 1 MONTH`, }, -} as const; +} satisfies Record<string, { + name: string; + fill: string; +}>; +type TableGranularity = keyof typeof tables; +const getTable = (granularity: TableGranularity) => tables[granularity]; -let table: ValueOf<typeof tables> = tables.hour; +const table = getTable(selectedGroupBy.find((g): g is TableGranularity => g in tables) ?? "hour");Also applies to: 387-402
332-348
: Improve error message clarity and security.The error messages could be more user-friendly while maintaining security:
Apply this diff to improve the error messages:
- message: "we're unable to load the API", + message: "Failed to retrieve API configuration", - message: "we're unable to find the API", + message: "API not found", - message: "api has no keyspace attached", + message: "API is not properly configured for key management",apps/api/src/routes/v1_analytics_getVerifications.happy.test.ts (3)
11-38
: Enhance test descriptions and type safety.While the test coverage is comprehensive, consider these improvements:
- Make test descriptions more specific about what they're validating.
- Add type safety to the test data generation.
Apply this diff to improve the test descriptions:
-describe("with no data", () => { +describe("getVerifications with no data", () => { - test("returns an array with one element per interval", async (t) => { + test("should return an array with one element per hour when no verifications exist", async (t) => { - test("a user's usage over the past 24h for 2 keys", async (t) => { + test("should aggregate usage statistics for multiple keys over 24 hours", async (t) => { - test("A monthly cron job creates invoices for each identity", async (t) => { + test("should provide monthly usage statistics for billing purposes", async (t) => {Also applies to: 126-127, 361-362
1189-1223
: Improve test data generation utility.The test data generation utility could benefit from these improvements:
- Use a seeded random number generator for deterministic tests.
- Add more precise type definitions.
Apply this diff to improve the utility:
+/** + * A deterministic random number generator for test stability + */ +function seededRandom(seed: number) { + return () => { + seed = (seed * 16807) % 2147483647; + return (seed - 1) / 2147483646; + }; +} +type VerificationOutcome = typeof POSSIBLE_OUTCOMES[number]; +interface KeyConfig { + keyId: string; + identityId?: string; +} function generate(opts: { start: number; end: number; length: number; workspaceId: string; keySpaceId: string; - keys: Array<{ keyId: string; identityId?: string }>; + keys: Array<KeyConfig>; tags?: string[]; }) { + const random = seededRandom(opts.start); return Array.from({ length: opts.length }).map((_) => { - const key = opts.keys[Math.floor(Math.random() * opts.keys.length)]; + const key = opts.keys[Math.floor(random() * opts.keys.length)]; return { - time: Math.round(Math.random() * (opts.end - opts.start) + opts.start), + time: Math.round(random() * (opts.end - opts.start) + opts.start), workspace_id: opts.workspaceId, key_space_id: opts.keySpaceId, key_id: key.keyId, - outcome: POSSIBLE_OUTCOMES[Math.floor(Math.random() * POSSIBLE_OUTCOMES.length)], + outcome: POSSIBLE_OUTCOMES[Math.floor(random() * POSSIBLE_OUTCOMES.length)], tags: opts.tags ?? [], request_id: newId("test"), region: "test", identity_id: key.identityId, }; }); }
1225-1230
: Enhance timestamp utility function.The timestamp utility function could be improved with better validation and precision:
Apply this diff to improve the utility:
/** * Truncates the timestamp to the start of the current hour + * @throws {Error} If the timestamp is invalid */ function toStartOfHour(unixmilli: number): number { + if (!Number.isFinite(unixmilli) || unixmilli < 0) { + throw new Error('Invalid timestamp'); + } - return Math.floor(unixmilli / 60 / 60 / 1000) * 60 * 60 * 1000; + return Math.floor(unixmilli / (60 * 60 * 1000)) * (60 * 60 * 1000); }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
apps/api/src/routes/v1_analytics_getVerifications.happy.test.ts
(1 hunks)apps/api/src/routes/v1_analytics_getVerifications.ts
(5 hunks)
🧰 Additional context used
📓 Learnings (1)
apps/api/src/routes/v1_analytics_getVerifications.ts (2)
Learnt from: chronark
PR: unkeyed/unkey#2740
File: apps/api/src/routes/v1_analytics_getVerifications.ts:259-259
Timestamp: 2025-01-10T08:31:33.471Z
Learning: In the Unkey codebase, workspace_id values obtained from auth.authorizedWorkspaceId are safe to use in SQL queries without parameterization since they are internally controlled and validated by the authentication system, not user-provided.
Learnt from: chronark
PR: unkeyed/unkey#2740
File: apps/api/src/routes/v1_analytics_getVerifications.ts:399-402
Timestamp: 2025-01-10T08:35:51.282Z
Learning: Route-level Zod validation for query parameters (e.g., orderBy) is sufficient for preventing SQL injection, as the validation happens before the handler runs, ensuring only valid values reach the query construction.
⏰ Context from checks skipped due to timeout of 90000ms (1)
- GitHub Check: Test Agent Local / test_agent_local
🔇 Additional comments (2)
apps/api/src/routes/v1_analytics_getVerifications.ts (2)
9-11
: LGTM! The validation logic is well-structured.The validation setup correctly handles:
- Multiple granularities for groupBy
- Both string and array types for keyId and tag parameters
- Proper timestamp coercion for start/end parameters
Also applies to: 82-89
464-490
: LGTM! The identity mapping implementation is appropriate.The individual cache lookups for identity mapping align with the caching strategy requirements.
Summary by CodeRabbit
Release Notes
New Features
Harness
class with ClickHouse support.Documentation
Improvements
StepRequest
type with optional search parameters.Chores