Commit

'responseaudio' to 'outputaudio', doc comments and cleanup
trrwilson committed Jan 17, 2025
1 parent 20a0f9c commit 6a1a7ee
Showing 49 changed files with 374 additions and 365 deletions.
4 changes: 2 additions & 2 deletions .dotnet/CHANGELOG.md
@@ -7,8 +7,8 @@
 - Chat completion now supports audio input and output!
   - To configure a chat completion to request audio output using the `gpt-4o-audio-preview` model, create a `ChatAudioOptions` instance and provide it on `ChatCompletionOptions.AudioOptions`.
   - Input chat audio is provided to `UserChatMessage` instances using `ChatContentPart.CreateInputAudioPart()`
-  - Output chat audio is provided on the `ResponseAudio` property of `ChatCompletion`
-  - References to prior assistant audio are provided via `ResponseAudioReference` instances on the `AudioReference` property of `AssistantChatMessage`; `AssistantChatMessage(chatCompletion)` will automatically handle this, too
+  - Output chat audio is provided on the `OutputAudio` property of `ChatCompletion`
+  - References to prior assistant audio are provided via `OutputAudioReference` instances on the `AudioReference` property of `AssistantChatMessage`; `AssistantChatMessage(chatCompletion)` will automatically handle this, too
   - For more information, see the example in the README

 ## 2.1.0 (2024-12-04)
18 changes: 9 additions & 9 deletions .dotnet/README.md
@@ -382,30 +382,30 @@ List<ChatMessage> messages =
 // Output audio is requested by configuring AudioOptions on ChatCompletionOptions
 ChatCompletionOptions options = new()
 {
-    AudioOptions = new(ChatResponseVoice.Alloy, ChatOutputAudioFormat.Mp3),
+    AudioOptions = new(ChatOutputAudioVoice.Alloy, ChatOutputAudioFormat.Mp3),
 };

 ChatCompletion completion = client.CompleteChat(messages, options);

 void PrintAudioContent()
 {
-    if (completion.ResponseAudio is ChatResponseAudio responseAudio)
+    if (completion.OutputAudio is ChatOutputAudio outputAudio)
     {
-        Console.WriteLine($"Response audio transcript: {responseAudio.Transcript}");
-        string outputFilePath = $"{responseAudio.Id}.mp3";
+        Console.WriteLine($"Response audio transcript: {outputAudio.Transcript}");
+        string outputFilePath = $"{outputAudio.Id}.mp3";
         using (FileStream outputFileStream = File.OpenWrite(outputFilePath))
         {
-            outputFileStream.Write(responseAudio.Data);
+            outputFileStream.Write(outputAudio.Data);
         }
         Console.WriteLine($"Response audio written to file: {outputFilePath}");
-        Console.WriteLine($"Valid on followup requests until: {responseAudio.ExpiresAt}");
+        Console.WriteLine($"Valid on followup requests until: {outputAudio.ExpiresAt}");
     }
 }

 PrintAudioContent();

 // To refer to past audio output, create an assistant message from the earlier ChatCompletion or instantiate a
-// ChatResponseAudioReference(string) from the .Id of the completion's .ResponseAudio property.
+// ChatOutputAudioReference(string) from the .Id of the completion's .OutputAudio property.
 messages.Add(new AssistantChatMessage(completion));
 messages.Add("Can you say that like a pirate?");

@@ -414,11 +414,11 @@ completion = client.CompleteChat(messages, options);
 PrintAudioContent();
 ```

-Streaming is highly parallel: `StreamingChatCompletionUpdate` instances can include a `ResponseAudioUpdate` that may
+Streaming is highly parallel: `StreamingChatCompletionUpdate` instances can include an `OutputAudioUpdate` that may
 contain any of:

 - The `Id` of the streamed audio content, which can be referenced by subsequent `AssistantChatMessage` instances via `ChatAudioReference` once the streaming response is complete; this may appear across multiple `StreamingChatCompletionUpdate` instances but will always be the same value when present
-- The `ExpiresAt` value that describes when the `Id` will no longer be valid for use with `ChatAudioReference` in subsequent requests; this typically appears once and only once, in the final `StreamingResponseAudioUpdate`
+- The `ExpiresAt` value that describes when the `Id` will no longer be valid for use with `ChatAudioReference` in subsequent requests; this typically appears once and only once, in the final `StreamingOutputAudioUpdate`
 - Incremental `TranscriptUpdate` and/or `DataUpdate` values, which can be incrementally consumed and, when concatenated, form the complete audio transcript and audio output for the overall response; many of these typically appear

 ## How to generate text embeddings
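The streaming behavior the README section describes can be sketched as follows. This is a hedged sketch, not verified against a released SDK build: the `OutputAudioUpdate` accessor and its `Id`/`ExpiresAt`/`TranscriptUpdate`/`DataUpdate` members are inferred from the prose and this commit's renames, and `client`, `messages`, and `options` are assumed to come from the earlier README example.

```csharp
using System;
using System.IO;
using System.Text;
using OpenAI.Chat;

// Accumulate the incremental audio pieces from a streaming chat completion.
// Member names on the update (OutputAudioUpdate, TranscriptUpdate, DataUpdate)
// are hypothetical, mirroring the README's description.
string audioId = null;
DateTimeOffset? expiresAt = null;
StringBuilder transcript = new();
using MemoryStream audioData = new();

foreach (StreamingChatCompletionUpdate update in client.CompleteChatStreaming(messages, options))
{
    if (update.OutputAudioUpdate is { } audioUpdate)
    {
        audioId ??= audioUpdate.Id;          // same Id whenever present
        expiresAt ??= audioUpdate.ExpiresAt; // typically arrives in the final update
        if (audioUpdate.TranscriptUpdate is string transcriptPart)
        {
            transcript.Append(transcriptPart);
        }
        if (audioUpdate.DataUpdate is BinaryData dataPart)
        {
            byte[] bytes = dataPart.ToArray();
            audioData.Write(bytes, 0, bytes.Length); // concatenated pieces form the full audio
        }
    }
}
```

Once the stream completes, `audioId` can be fed back into a follow-up request the same way the non-streaming example uses the completion's audio `Id`.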
94 changes: 47 additions & 47 deletions .dotnet/api/OpenAI.netstandard2.0.cs

Large diffs are not rendered by default.

12 changes: 6 additions & 6 deletions .dotnet/examples/Chat/Example09_ChatWithAudio.cs
@@ -26,23 +26,23 @@ public void Example09_ChatWithAudio()
 // Output audio is requested by configuring AudioOptions on ChatCompletionOptions
 ChatCompletionOptions options = new()
 {
-    AudioOptions = new(ChatResponseVoice.Alloy, ChatOutputAudioFormat.Mp3),
+    AudioOptions = new(ChatOutputAudioVoice.Alloy, ChatOutputAudioFormat.Mp3),
 };

 ChatCompletion completion = client.CompleteChat(messages, options);

 void PrintAudioContent()
 {
-    if (completion.ResponseAudio is ChatResponseAudio responseAudio)
+    if (completion.OutputAudio is ChatOutputAudio outputAudio)
     {
-        Console.WriteLine($"Response audio transcript: {responseAudio.Transcript}");
-        string outputFilePath = $"{responseAudio.Id}.mp3";
+        Console.WriteLine($"Response audio transcript: {outputAudio.Transcript}");
+        string outputFilePath = $"{outputAudio.Id}.mp3";
         using (FileStream outputFileStream = File.OpenWrite(outputFilePath))
         {
-            outputFileStream.Write(responseAudio.Data);
+            outputFileStream.Write(outputAudio.Data);
         }
         Console.WriteLine($"Response audio written to file: {outputFilePath}");
-        Console.WriteLine($"Valid on followup requests until: {responseAudio.ExpiresAt}");
+        Console.WriteLine($"Valid on followup requests until: {outputAudio.ExpiresAt}");
     }
 }

12 changes: 6 additions & 6 deletions .dotnet/examples/Chat/Example10_ChatWithAudioAsync.cs
@@ -27,23 +27,23 @@ public async Task Example09_ChatWithAudioAsync()
 // Output audio is requested by configuring AudioOptions on ChatCompletionOptions
 ChatCompletionOptions options = new()
 {
-    AudioOptions = new(ChatResponseVoice.Alloy, ChatOutputAudioFormat.Mp3),
+    AudioOptions = new(ChatOutputAudioVoice.Alloy, ChatOutputAudioFormat.Mp3),
 };

 ChatCompletion completion = await client.CompleteChatAsync(messages, options);

 async Task PrintAudioContentAsync()
 {
-    if (completion.ResponseAudio is ChatResponseAudio responseAudio)
+    if (completion.OutputAudio is ChatOutputAudio outputAudio)
     {
-        Console.WriteLine($"Response audio transcript: {responseAudio.Transcript}");
-        string outputFilePath = $"{responseAudio.Id}.mp3";
+        Console.WriteLine($"Response audio transcript: {outputAudio.Transcript}");
+        string outputFilePath = $"{outputAudio.Id}.mp3";
         using (FileStream outputFileStream = File.OpenWrite(outputFilePath))
         {
-            await outputFileStream.WriteAsync(responseAudio.Data);
+            await outputFileStream.WriteAsync(outputAudio.Data);
         }
        Console.WriteLine($"Response audio written to file: {outputFilePath}");
-        Console.WriteLine($"Valid on followup requests until: {responseAudio.ExpiresAt}");
+        Console.WriteLine($"Valid on followup requests until: {outputAudio.ExpiresAt}");
     }
 }

@@ -44,7 +44,7 @@ internal override void WriteCore(Utf8JsonWriter writer, ModelReaderWriterOptions
     writer.WriteOptionalProperty("name"u8, ParticipantName, options);
     writer.WriteOptionalCollection("tool_calls"u8, ToolCalls, options);
     writer.WriteOptionalProperty("function_call"u8, FunctionCall, options);
-    writer.WriteOptionalProperty("audio"u8, ResponseAudioReference, options);
+    writer.WriteOptionalProperty("audio"u8, OutputAudioReference, options);
     writer.WriteSerializedAdditionalRawData(_additionalBinaryDataProperties, options);
     writer.WriteEndObject();
 }
14 changes: 7 additions & 7 deletions .dotnet/src/Custom/Chat/AssistantChatMessage.cs
@@ -88,12 +88,12 @@ public AssistantChatMessage(ChatFunctionCall functionCall)
 /// Creates a new instance of <see cref="AssistantChatMessage"/> that represents a prior response from the model
 /// that included audio with a correlation ID.
 /// </summary>
-/// <param name="responseAudioReference"> The <c>audio</c> reference with an <c>id</c>, produced by the model. </param>
-public AssistantChatMessage(ChatResponseAudioReference responseAudioReference)
+/// <param name="outputAudioReference"> The <c>audio</c> reference with an <c>id</c>, produced by the model. </param>
+public AssistantChatMessage(ChatOutputAudioReference outputAudioReference)
 {
-    Argument.AssertNotNull(responseAudioReference, nameof(responseAudioReference));
+    Argument.AssertNotNull(outputAudioReference, nameof(outputAudioReference));

-    ResponseAudioReference = responseAudioReference;
+    OutputAudioReference = outputAudioReference;
 }

/// <summary>
@@ -122,9 +122,9 @@ public AssistantChatMessage(ChatCompletion chatCompletion)

     Refusal = chatCompletion.Refusal;
     FunctionCall = chatCompletion.FunctionCall;
-    if (chatCompletion.ResponseAudio is not null)
+    if (chatCompletion.OutputAudio is not null)
     {
-        ResponseAudioReference = new(chatCompletion.ResponseAudio.Id);
+        OutputAudioReference = new(chatCompletion.OutputAudio.Id);
     }
     foreach (ChatToolCall toolCall in chatCompletion.ToolCalls ?? [])
     {
@@ -150,5 +150,5 @@ public AssistantChatMessage(ChatCompletion chatCompletion)

 // CUSTOM: Made internal for reprojected representation within the content collection.
 [CodeGenMember("Audio")]
-public ChatResponseAudioReference ResponseAudioReference { get; set; }
+public ChatOutputAudioReference OutputAudioReference { get; set; }
 }
2 changes: 1 addition & 1 deletion .dotnet/src/Custom/Chat/ChatAudioOptions.cs
@@ -18,7 +18,7 @@ public partial class ChatAudioOptions
 /// Gets or sets the voice model that the response should use to synthesize audio.
 /// </summary>
 [CodeGenMember("Voice")]
-public ChatResponseVoice ResponseVoice { get; set; }
+public ChatOutputAudioVoice OutputAudioVoice { get; set; }

 /// <summary>
 /// Specifies the output format desired for synthesized audio.
2 changes: 1 addition & 1 deletion .dotnet/src/Custom/Chat/ChatCompletion.cs
@@ -86,5 +86,5 @@ public partial class ChatCompletion
 public ChatFunctionCall FunctionCall => Choices[0].Message.FunctionCall;

 /// <summary> The audio response generated by the model. </summary>
-public ChatResponseAudio ResponseAudio => Choices[0].Message.Audio;
+public ChatOutputAudio OutputAudio => Choices[0].Message.Audio;
 }
4 changes: 2 additions & 2 deletions .dotnet/src/Custom/Chat/ChatMessage.cs
@@ -135,8 +135,8 @@ internal ChatMessage(ChatMessageRole role, string content = null) : this(role)
 /// <inheritdoc cref="AssistantChatMessage(ChatCompletion)"/>
 public static AssistantChatMessage CreateAssistantMessage(ChatCompletion chatCompletion) => new(chatCompletion);

-/// <inheritdoc cref="AssistantChatMessage(ChatResponseAudioReference)"/>
-public static AssistantChatMessage CreateAssistantMessage(ChatResponseAudioReference audioReference) => new(audioReference);
+/// <inheritdoc cref="AssistantChatMessage(ChatOutputAudioReference)"/>
+public static AssistantChatMessage CreateAssistantMessage(ChatOutputAudioReference audioReference) => new(audioReference);

 #endregion

@@ -33,7 +33,7 @@ internal static void WriteCoreContentPart(ChatMessageContentPart instance, Utf8J
     writer.WritePropertyName("image_url"u8);
     writer.WriteObjectValue(instance._imageUri, options);
 }
-else if (instance._kind == ChatMessageContentPartKind.Audio)
+else if (instance._kind == ChatMessageContentPartKind.InputAudio)
 {
     writer.WritePropertyName("input_audio"u8);
     writer.WriteObjectValue(instance._inputAudio, options);
8 changes: 4 additions & 4 deletions .dotnet/src/Custom/Chat/ChatMessageContentPart.cs
@@ -84,7 +84,7 @@ internal ChatMessageContentPart(
 /// The encoded binary audio payload associated with the content part.
 /// </summary>
 /// <remarks>
-/// Present when <see cref="Kind"/> is <see cref="ChatMessageContentPartKind.Audio"/> and the content part
+/// Present when <see cref="Kind"/> is <see cref="ChatMessageContentPartKind.InputAudio"/>. The content part
 /// represents user role audio input.
 /// </remarks>
 public BinaryData AudioBytes => _inputAudio?.Data;
@@ -93,7 +93,7 @@ internal ChatMessageContentPart(
 /// The encoding format that the audio data provided in <see cref="AudioBytes"/> should be interpreted with.
 /// </summary>
 /// <remarks>
-/// Present when <see cref="Kind"/> is <see cref="ChatMessageContentPartKind.Audio"/> and the content part
+/// Present when <see cref="Kind"/> is <see cref="ChatMessageContentPartKind.InputAudio"/>. The content part
 /// represents user role audio input.
 /// </remarks>
 public ChatInputAudioFormat? AudioInputFormat => _inputAudio?.Format;
@@ -171,7 +171,7 @@ public static ChatMessageContentPart CreateRefusalPart(string refusal)
 /// <summary> Creates a new <see cref="ChatMessageContentPart"/> that encapsulates user role input audio in a known format. </summary>
 /// <remarks>
 /// Binary audio content parts may only be used with <see cref="UserChatMessage"/> instances to represent user audio input. When referring to
-/// past audio output from the model, use <see cref="ChatResponseAudioReference(string)"/> instead.
+/// past audio output from the model, use <see cref="ChatOutputAudioReference(string)"/> instead.
 /// </remarks>
 /// <param name="audioBytes"> The audio data. </param>
 /// <param name="audioFormat"> The format of the audio data. </param>
@@ -180,7 +180,7 @@ public static ChatMessageContentPart CreateInputAudioPart(BinaryData audioBytes,
     Argument.AssertNotNull(audioBytes, nameof(audioBytes));

     return new ChatMessageContentPart(
-        kind: ChatMessageContentPartKind.Audio,
+        kind: ChatMessageContentPartKind.InputAudio,
         inputAudio: new(audioBytes, audioFormat));
 }

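The renamed input-audio plumbing in `ChatMessageContentPart` can be exercised roughly as follows. This is a hedged sketch: the file name and the `ChatInputAudioFormat.Mp3` value are illustrative assumptions, not taken from this commit.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using OpenAI.Chat;

// Hypothetical usage of CreateInputAudioPart with the renamed
// ChatMessageContentPartKind.InputAudio kind shown above.
BinaryData audioBytes = BinaryData.FromBytes(File.ReadAllBytes("question.mp3")); // assumed input file
ChatMessageContentPart audioPart = ChatMessageContentPart.CreateInputAudioPart(
    audioBytes,
    ChatInputAudioFormat.Mp3); // format value assumed

// Binary audio parts are only valid on user messages, per the remarks above.
List<ChatMessage> messages = [new UserChatMessage(audioPart)];
```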
@@ -13,7 +13,7 @@ internal static partial class ChatMessageContentPartKindExtensions
     ChatMessageContentPartKind.Text => "text",
     ChatMessageContentPartKind.Refusal => "refusal",
     ChatMessageContentPartKind.Image => "image_url",
-    ChatMessageContentPartKind.Audio => "input_audio",
+    ChatMessageContentPartKind.InputAudio => "input_audio",
     _ => throw new ArgumentOutOfRangeException(nameof(value), value, "Unknown ChatMessageContentPartKind value.")
 };

@@ -22,7 +22,7 @@ public static ChatMessageContentPartKind ToChatMessageContentPartKind(this strin
     if (StringComparer.OrdinalIgnoreCase.Equals(value, "text")) return ChatMessageContentPartKind.Text;
     if (StringComparer.OrdinalIgnoreCase.Equals(value, "refusal")) return ChatMessageContentPartKind.Refusal;
     if (StringComparer.OrdinalIgnoreCase.Equals(value, "image_url")) return ChatMessageContentPartKind.Image;
-    if (StringComparer.OrdinalIgnoreCase.Equals(value, "input_audio")) return ChatMessageContentPartKind.Audio;
+    if (StringComparer.OrdinalIgnoreCase.Equals(value, "input_audio")) return ChatMessageContentPartKind.InputAudio;
     throw new ArgumentOutOfRangeException(nameof(value), value, "Unknown ChatMessageContentPartKind value.");
 }
 }
2 changes: 1 addition & 1 deletion .dotnet/src/Custom/Chat/ChatMessageContentPartKind.cs
@@ -11,5 +11,5 @@ public enum ChatMessageContentPartKind

     Image,

-    Audio,
+    InputAudio,
 }
@@ -4,8 +4,11 @@

 namespace OpenAI.Chat;

+/// <summary>
+/// Represents the audio output generated by the model as part of a chat completion response.
+/// </summary>
 [CodeGenModel("ChatCompletionResponseMessageAudio")]
-public partial class ChatResponseAudio
+public partial class ChatOutputAudio
 {

 }
4 changes: 4 additions & 0 deletions .dotnet/src/Custom/Chat/ChatOutputAudioFormat.cs
@@ -4,6 +4,10 @@

 namespace OpenAI.Chat;

+/// <summary>
+/// Specifies the audio format the model should use when generating output audio as part of a chat completion
+/// response.
+/// </summary>
 [CodeGenModel("CreateChatCompletionRequestAudioFormat")]
 public readonly partial struct ChatOutputAudioFormat
 {
20 changes: 20 additions & 0 deletions .dotnet/src/Custom/Chat/ChatOutputAudioReference.cs
@@ -0,0 +1,20 @@
+using System;
+using System.Collections.Generic;
+using System.Text.RegularExpressions;
+
+namespace OpenAI.Chat;
+
+/// <summary>
+/// Represents an ID-based reference to a past audio output as received from a prior chat completion response, as
+/// provided when creating an <see cref="AssistantChatMessage"/> instance for use in a conversation history.
+/// </summary>
+/// <remarks>
+/// This value is obtained from the <see cref="ChatCompletion.OutputAudio.Id"/> or
+/// <see cref="StreamingChatCompletionUpdate.OutputAudioUpdate.Id"/> properties for non-streaming and streaming
+/// responses, respectively. The <see cref="AssistantChatMessage(ChatCompletion)"/> constructor overload can also be
+/// used to automatically populate the appropriate properties from a <see cref="ChatCompletion"/> instance.
+/// </remarks>
+[CodeGenModel("ChatCompletionRequestAssistantMessageAudio")]
+public partial class ChatOutputAudioReference
+{
+}
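As the remarks in this new file describe, a prior response's audio can be wired back into the conversation either automatically or by hand. A hedged sketch, assuming `messages` and `completion` from the README example earlier in this commit:

```csharp
using OpenAI.Chat;

// Option 1: the overload copies the audio reference from the completion automatically.
messages.Add(new AssistantChatMessage(completion));

// Option 2: construct the reference manually from the completion's audio Id,
// mirroring what the overload above does internally.
ChatOutputAudioReference audioReference = new(completion.OutputAudio.Id);
messages.Add(new AssistantChatMessage(audioReference));
```

Only one of the two options should be used for a given completion; adding both would duplicate the assistant turn in the history.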
14 changes: 14 additions & 0 deletions .dotnet/src/Custom/Chat/ChatOutputAudioVoice.cs
@@ -0,0 +1,14 @@
+using System;
+using System.Collections.Generic;
+using System.Diagnostics.CodeAnalysis;
+
+namespace OpenAI.Chat;
+
+/// <summary>
+/// Specifies the available voices that the model can use when generating output audio as part of a chat completion.
+/// </summary>
+[CodeGenModel("CreateChatCompletionRequestAudioVoice")]
+public readonly partial struct ChatOutputAudioVoice
+{
+
+}
10 changes: 0 additions & 10 deletions .dotnet/src/Custom/Chat/ChatResponseAudioReference.cs

This file was deleted.

11 changes: 0 additions & 11 deletions .dotnet/src/Custom/Chat/ChatResponseVoice.cs

This file was deleted.

5 changes: 5 additions & 0 deletions .dotnet/src/Custom/Chat/Internal/GeneratorStubs.cs
@@ -102,3 +102,8 @@ internal readonly partial struct InternalCreateChatCompletionRequestModality { }
 [CodeGenModel("ChatCompletionRequestMessageContentPartAudioType")]
 internal readonly partial struct InternalChatCompletionRequestMessageContentPartAudioType { }

+[CodeGenModel("ChatCompletionRequestMessageContentPartAudio")]
+internal partial class InternalChatCompletionRequestMessageContentPartAudio { }
+
+[CodeGenModel("ChatCompletionRequestMessageContentPartAudioInputAudio")]
+internal partial class InternalChatCompletionRequestMessageContentPartAudioInputAudio { }

This file was deleted.

This file was deleted.

