feat: Create Spatial Understanding Notebook for Gemini 2.0 #1651

Open
wants to merge 2 commits into base: main

Conversation

holtskinner
Collaborator

holtskinner requested a review from a team as a code owner on January 23, 2025, 21:48
Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Contributor

@code-review-assist code-review-assist bot left a comment

Hi @holtskinner and reviewers,

I'm currently reviewing this pull request and will post my feedback in a few minutes. In the meantime, I'm providing this summary of the pull request's content to help you and other reviewers quickly understand the changes and intent.

This pull request, titled "feat: Create Spatial Understanding Notebook for Gemini 2.0", aims to create a Jupyter Notebook demonstrating spatial understanding capabilities with the Gemini 2.0 API in Vertex AI. The notebook leverages the Google Gen AI SDK and the gemini-2.0-flash-exp model.

Here's a summary of the changes:

  • A new Jupyter Notebook (gemini/use-cases/spatial-understanding/spatial_understanding.ipynb) was added. This notebook is approximately 675 lines long and contains comprehensive examples of spatial understanding tasks. The notebook includes sections on:
    • Installation and authentication
    • Setting up the Google Cloud project
    • Importing necessary libraries
    • Loading the Gemini model
    • Defining helper functions for plotting bounding boxes
    • Several examples demonstrating object detection and spatial reasoning with various images and prompts.
    • Sharing options for the notebook (Colab, Colab Enterprise, Vertex AI Workbench, GitHub).

The notebook uses a BoundingBox Pydantic model to structure the output and includes safety settings to mitigate potential risks. The code effectively demonstrates how to use the Gemini API for tasks such as:

  • Simply overlaying information on images
  • Searching within images for specific objects
  • Leveraging Gemini's reasoning capabilities for tasks like finding the shadow of an object

The notebook also includes a YouTube video link demonstrating spatial understanding with Gemini 2.0.
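
For readers who have not opened the notebook, here is a rough sketch of what structured bounding-box output and a plotting helper's coordinate conversion might look like. It is not the notebook's code. The field names `box_2d` and `label`, the `[y_min, x_min, y_max, x_max]` ordering, and the 0-1000 normalization are assumptions based on the convention commonly described for Gemini bounding boxes. A standard-library dataclass stands in for the Pydantic model so the example has no dependencies, and `parse_boxes` and `to_pixels` are hypothetical helper names.

```python
import json
from dataclasses import dataclass


@dataclass
class BoundingBox:
    """Hypothetical stand-in for the notebook's Pydantic BoundingBox model."""

    # Assumed order: [y_min, x_min, y_max, x_max], each scaled to 0-1000.
    box_2d: list[int]
    label: str


def parse_boxes(response_text: str) -> list[BoundingBox]:
    """Parse a JSON array of box objects into BoundingBox instances."""
    return [BoundingBox(**item) for item in json.loads(response_text)]


def to_pixels(box: BoundingBox, width: int, height: int) -> tuple[int, int, int, int]:
    """Convert normalized coordinates to (left, top, right, bottom) pixels."""
    y_min, x_min, y_max, x_max = box.box_2d
    # Integer arithmetic avoids floating-point rounding surprises.
    return (
        x_min * width // 1000,
        y_min * height // 1000,
        x_max * width // 1000,
        y_max * height // 1000,
    )


# Made-up response text for illustration only; not real model output.
sample = '[{"box_2d": [100, 250, 500, 750], "label": "cupcake"}]'
boxes = parse_boxes(sample)
pixel_box = to_pixels(boxes[0], width=800, height=600)
print(boxes[0].label, pixel_box)
```

The resulting `(left, top, right, bottom)` tuple is in the order that drawing libraries such as Pillow expect for rectangles, which is why the conversion swaps the assumed y-first ordering.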

I'll provide a more detailed review shortly, but this summary should give you a good overview of the changes.


Here's a little haiku to lighten the mood:

Boxes on the screen,
Gemini's sight, sharp and keen,
Code makes pictures speak.

Contributor

@code-review-assist code-review-assist bot left a comment

This pull request introduces a new Jupyter Notebook demonstrating spatial understanding with Gemini 2.0. The notebook is well-structured and provides clear explanations and examples. The code is generally clean and easy to follow. However, there are a few minor suggestions for improvement, primarily related to documentation and code style.

I've also noticed a few minor issues with the prompt text and some inconsistencies in the code. I've added comments and suggestions to address these issues and improve the overall quality of the notebook.

Overall, this is a valuable addition to the repository. Addressing the minor issues I've raised would further enhance its clarity and usefulness.

2 participants