
Issue with streaming with Gemini #547

Open
eltoob opened this issue Nov 21, 2024 · 2 comments
Comments

eltoob commented Nov 21, 2024

Describe the bug
Gemini just announced support for the OpenAI library.
See here: https://ai.google.dev/gemini-api/docs/openai
For some reason, the Ruby library doesn't stream (or, to be more precise, it delivers the entire response at once).
I tried the exact same request with the Python library and it streams properly.

To Reproduce
Steps to reproduce the behavior:

  1. Go to https://ai.google.dev/
  2. Generate a key
  3. Run the code below
  4. There is no streaming
require 'openai'

client = OpenAI::Client.new(
  access_token: "API_KEY",
  uri_base: "https://generativelanguage.googleapis.com/v1beta/openai/"
)
start_time = Time.now
puts start_time
response = client.chat(
  parameters: {
    model: "gemini-1.5-flash",
    messages: [
      { role: "system", content: "You are a helpful assistant." },
      { role: "user", content: "Hello! write a poem about the moon make it 2000 words" }
    ],
    stream: proc do |chunk|
      current_time = Time.now
      elapsed = current_time - start_time
      puts "#{current_time}: chunk (#{elapsed.round(2)}s elapsed)"
    end
  }
)

You can run the equivalent code in Python and see that the stream works properly:

from openai import OpenAI
import time

start_time = time.time()

client = OpenAI(
    api_key="API_KEY",
    base_url="https://generativelanguage.googleapis.com/v1beta/openai/"
)

response = client.chat.completions.create(
  model="gemini-1.5-flash",
  messages=[
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello! write a poem about the moon"}
  ],
  stream=True
)

for chunk in response:
    current_time = time.time()
    elapsed = current_time - start_time
    print(f"{current_time}: chunk ({elapsed:.2f}s elapsed)")

Expected behavior
The Ruby client should yield chunks to the stream proc incrementally as they arrive, matching the Python client's behavior.

Screenshots
Here I logged the timestamps. With Ruby, all the chunks arrive at once:
[screenshot: Ruby output, every chunk logged at the same elapsed time]

With Python, the response actually streams:
[screenshot: Python output, chunks logged incrementally]


eltoob commented Nov 26, 2024

Quick update:
I tried to replicate the exact same headers the Python library sends.
When I pass "Accept-Encoding" => "gzip, deflate" as a header, it's partially working (i.e. I do see the proc being called), but there are issues with the event stream parser.


eltoob commented Nov 27, 2024

Ok I finally fixed the issue.

OpenAI.configure do |config|
  config.extra_headers = {
    "Accept-Encoding" => ""
  }
end

Not sure why this works, though.
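For what it's worth, my guess is that with gzip enabled, the HTTP middleware buffers the whole compressed body before decompressing it, so the server-sent-event chunks only surface once the response is complete; sending an empty Accept-Encoding disables compression, so each text/event-stream chunk can be parsed as it arrives. If you'd rather not change the global configuration, recent ruby-openai versions appear to accept the same header per client via extra_headers (treat this as a sketch, not a confirmed API; "API_KEY" is a placeholder):

```ruby
require "openai"

# Sketch: scope the workaround to a single client instead of
# OpenAI.configure. The extra_headers option is an assumption based on
# recent ruby-openai versions.
client = OpenAI::Client.new(
  access_token: "API_KEY",
  uri_base: "https://generativelanguage.googleapis.com/v1beta/openai/",
  extra_headers: { "Accept-Encoding" => "" } # disable gzip so chunks stream
)
```

This keeps other clients in the same process using the default headers.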
