feat: gen ai tuning and eval sample #1628
Hi @willisc7 and reviewers,
I'm currently reviewing this pull request and will post my detailed review in a few minutes. In the meantime, I'm providing this summary of the pull request's content to help you and other reviewers quickly understand the changes and intent.
This pull request, titled "feat: gen ai tuning and eval sample", aims to add a sample Vertex AI Pipeline for tuning a Gemini model and evaluating the tuned model against a previously tuned one. The description mentions using this for an LLM that summarizes glucose values for diabetes patients.
Here's a breakdown of the changes:
- .github/CODEOWNERS (lines 81-84): Added code ownership for the new sample app to @willisc7 and @GoogleCloudPlatform/generative-ai-devrel.
- gemini/sample-apps/genai-mlops-tune-and-eval/.gitignore (lines 1-5): Added standard gitignore entries for files generated during the pipeline's execution.
- gemini/sample-apps/genai-mlops-tune-and-eval/README.md (lines 1-91): A comprehensive README was added, detailing how to prepare data (optional), set up IAM permissions, run the Vertex AI pipeline, and optionally run it locally using Kubeflow Pipelines. This includes detailed instructions and code snippets for each step.
- gemini/sample-apps/genai-mlops-tune-and-eval/local/Dockerfile (lines 1-6): A Dockerfile was added to build a Docker image for running the pipeline locally. It includes setting up Google Cloud authentication.
- gemini/sample-apps/genai-mlops-tune-and-eval/local/pipeline.py (lines 1-242): This file contains the Kubeflow Pipelines definition for the entire process. It defines two components: gemini_tuning_component (for tuning the model) and model_comparison_component (for comparing the tuned model to a baseline). The pipeline orchestrates these components to perform the tuning and evaluation.
- gemini/sample-apps/genai-mlops-tune-and-eval/patient_1_glucose_examples.jsonl (line 1): Sample data in JSONL format for tuning and evaluation was added.
- gemini/sample-apps/genai-mlops-tune-and-eval/submit_pipeline_job.py (lines 1-17): A script to submit the pipeline job to Vertex AI Pipelines was added. This script handles initialization, parameter setting, and job submission.
I'll be looking at the specifics of the code in the next review. Let me know if you have any questions in the meantime!
And now, a little poem to lighten the mood:
Code flows like a river,
Through branches, a gentle stream,
Reviews make it clear.
This pull request adds a valuable sample for GenAI model tuning and evaluation using Vertex AI Pipelines. The documentation in the README is comprehensive and well-structured. Here's a summary of the Google Python Style Guide elements I referenced:
- Naming: Used Google Python Style Guide naming conventions for variables and functions (e.g., snake_case).
- Docstrings: Ensured docstrings adhere to the recommended format and provide clear explanations of function purpose, arguments, and return values.
- Imports: Organized imports according to the style guide, grouping standard library imports, third-party imports, and local imports.
- Line Length: Adhered to the 79-character line length limit.
- Comments: Added comments to clarify complex logic or non-obvious code segments.
Overall, the code is well-written and easy to follow. However, there are a few minor improvements that could enhance readability and maintainability. I've provided specific suggestions in the review comments below.
Co-authored-by: code-review-assist[bot] <182814678+code-review-assist[bot]@users.noreply.github.com>
Co-authored-by: Holt Skinner <[email protected]>
@willisc7 Could you resolve the remaining lint errors? Thanks! https://github.com/GoogleCloudPlatform/generative-ai/actions/runs/12893354426/job/35949591448
I believe the rest of the linting errors come from pylint not understanding the layout of Kubeflow components, but let me know if I'm wrong.
@willisc7 I made a few adjustments for the formatting and linting, please make sure my changes didn't break your code. The remaining lint error is due to typing mismatches. I'm not familiar enough with Kubeflow pipelines to know what they should be, could you please adjust the parameters/return types?
Also, could you add a Thanks!
@willisc7 Thanks for addressing the linting and formatting issues! I see you've chosen to return
Ok I made some changes to get the code to work with your linting and formatting changes. It turns out
@willisc7 Thanks for the clarification and for addressing the linting issues! I understand the constraint with
@willisc7 Thanks for making the adjustments! There's only one remaining lint issue which is the unused variable
@holtskinner Thanks for catching the unused variable
The change to I think we need the longer return type hint
@willisc7 Thanks for the clarification and for addressing the remaining lint issue! The error you're encountering,
Description
Adds a sample Vertex AI Pipeline that tunes a Gemini model and then evaluates the tuned model against a previously-tuned model.
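The tune-then-compare flow described above can be sketched in plain Python. This is only an illustration of the control flow, not the code in pipeline.py: the names tune_model, compare_models, run_pipeline, and ComparisonResult, the scoring callback, and the bucket URI are all hypothetical, and no Vertex AI or Kubeflow calls are made. The NamedTuple return type is an assumption, chosen because it is a common way to return several named outputs from one step.

```python
from typing import Callable, NamedTuple


class ComparisonResult(NamedTuple):
    """Hypothetical output of the comparison step."""

    best_model: str
    tuned_score: float
    baseline_score: float


def tune_model(base_model: str, training_data_uri: str) -> str:
    """Stand-in for the tuning step; returns a name for the tuned model."""
    # A real component would launch a tuning job on training_data_uri.
    # This stub only derives a placeholder name so the flow can be followed.
    return f"{base_model}-tuned"


def compare_models(
    tuned_model: str,
    baseline_model: str,
    score_fn: Callable[[str], float],
) -> ComparisonResult:
    """Scores both models with score_fn and reports the higher-scoring one."""
    tuned_score = score_fn(tuned_model)
    baseline_score = score_fn(baseline_model)
    # Ties go to the newly tuned model (an arbitrary choice for this sketch).
    best = tuned_model if tuned_score >= baseline_score else baseline_model
    return ComparisonResult(best, tuned_score, baseline_score)


def run_pipeline(
    base_model: str,
    training_data_uri: str,
    baseline_model: str,
    score_fn: Callable[[str], float],
) -> ComparisonResult:
    """Chains the two steps: tune first, then compare against the baseline."""
    tuned = tune_model(base_model, training_data_uri)
    return compare_models(tuned, baseline_model, score_fn)
```

In this sketch the evaluation metric is passed in as score_fn so the comparison logic stays independent of how a model is actually scored; a real pipeline would replace it with an evaluation job.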