
[draft] avatar plugin based on v1.0 #1391

Open · wants to merge 14 commits into base: dev-1.0

Conversation

@longcw (Collaborator) commented Jan 20, 2025

  1. AudioSink based on DataStream (Add data stream support python-sdks#347)
  2. Avatar worker example with video generation and av sync


changeset-bot bot commented Jan 20, 2025

⚠️ No Changeset found

Latest commit: 2164483

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets

When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types


@longcw longcw changed the base branch from main to dev-1.0 January 20, 2025 03:45
@davidzhao (Member) left a comment:

this looks great! just a few comments.

We'll also need some error handling in various parts: how do both sides handle the case where the other side disconnects? If the avatar participant is gone for longer than a reasonable timeout, the agent would likely need to report that error and shut itself down.

Similarly, if the controller is gone, the service on the other side might want to stop consuming resources and exit as well.
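A minimal sketch of the timeout behavior being asked for, with hypothetical names (`is_connected` and `on_timeout` stand in for whatever the plugin actually wires up):

```python
import asyncio

async def watch_remote(is_connected, on_timeout, timeout=10.0, poll=0.5):
    """Poll is_connected(); call on_timeout() once the remote side has
    been gone for longer than `timeout` seconds, then stop watching."""
    gone_since = None
    while True:
        if is_connected():
            gone_since = None  # remote came back; reset the clock
        else:
            now = asyncio.get_running_loop().time()
            if gone_since is None:
                gone_since = now
            elif now - gone_since >= timeout:
                on_timeout()
                return
        await asyncio.sleep(poll)
```

Either side could run this as a background task, with `on_timeout` reporting the error and triggering shutdown.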


AUDIO_SENDER_ATTR = "__livekit_avatar_audio_sender"
AUDIO_RECEIVER_ATTR = "__livekit_avatar_audio_receiver"
RPC_INTERRUPT_PLAYBACK = "__livekit_avatar_interrupt_playback"
Member:

nit: for consistency, use the lk. namespace to identify LiveKit-specific actions

Suggested change:
- RPC_INTERRUPT_PLAYBACK = "__livekit_avatar_interrupt_playback"
+ RPC_INTERRUPT_PLAYBACK = "lk.interrupt_playback"

async def start(self) -> None:
"""Wait for worker participant to join and start streaming"""
# mark self as sender
await self._room.local_participant.set_attributes({AUDIO_SENDER_ATTR: "true"})
Member:

was thinking we can simplify this step. instead the receiver could just wait for an audio stream of a particular name?

"""Wait for worker participant to join and start streaming"""
# mark self as sender
await self._room.local_participant.set_attributes({AUDIO_SENDER_ATTR: "true"})
self._remote_participant = await wait_for_participant(
Member:

What if, instead of waiting for an attribute, we could:

  • take avatar_identity as a param in the sink (with a sane default)
  • create a token for that identity and send it to the other side as part of initial handshake
  • here we can just wait for that agreed-upon identity
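A sketch of that handshake, with hypothetical names (`mint_token` stands in for real token creation, e.g. LiveKit's AccessToken; the default identity is assumed):

```python
DEFAULT_AVATAR_IDENTITY = "avatar-worker"  # assumed default name

def build_handshake(mint_token, avatar_identity=DEFAULT_AVATAR_IDENTITY):
    """Build the payload the controller sends to the avatar service:
    the agreed-upon identity plus a join token minted for it.
    After sending this, the controller just waits for that identity
    to join the room."""
    return {
        "identity": avatar_identity,
        "token": mint_token(avatar_identity),
    }
```

This removes the need for attribute-based discovery: both sides agree on the identity up front.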

Collaborator Author:

Sounds good! On the avatar side, it waits for the audio stream with a particular name from the participant with kind == 'agent'.

Collaborator Author:

updated

# start new stream
# TODO: any better option to send the metadata?
name = f"audio_{frame.sample_rate}_{frame.num_channels}"
self._stream_writer = await self._room.local_participant.stream_file(
Member:

this is a good use of stream extensions:

writer = await room.local_participant.stream_file("audio",
    extensions={"sample_rate": "48000", "channels": "1"})

or

writer = await room.local_participant.stream_file("audio",
    extensions={"audio_settings": json.dumps({"sample_rate": 48000, "channels": 1})})
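On the receiving side, a hypothetical helper could recover the settings from either shape of the extensions map (helper name is assumed, not part of the SDK):

```python
import json

def parse_audio_settings(extensions: dict) -> tuple[int, int]:
    """Return (sample_rate, num_channels) from a stream's extensions map,
    accepting either the flat-string shape or the JSON-encoded shape
    suggested above."""
    if "audio_settings" in extensions:
        settings = json.loads(extensions["audio_settings"])
        return int(settings["sample_rate"]), int(settings["channels"])
    return int(extensions["sample_rate"]), int(extensions["channels"])
```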

Collaborator Author:

Oh I see, so extensions is a kind of metadata? Then why is it named extensions?

Member:

yeah.. I think attributes is probably a better name

# mark self as receiver
await self._room.local_participant.set_attributes({AUDIO_RECEIVER_ATTR: "true"})

self._remote_participant = await wait_for_participant(
Member:

it seems here we can just wait for participant.kind == agent?

if we wanted to handle multiple avatars in the room, then the integration should take in the controller's identity.

reader: rtc.FileStreamReader, remote_participant_id: str
) -> None:
if remote_participant_id != self._remote_participant.identity:
logger.warning(
Contributor:

would we really want to warn on any other incoming file stream? that seems like a rather narrow use case for this plugin

Collaborator Author:

Oh I see, I'll filter for the audio stream first, so other data streams can still be processed by other handlers.

Btw, what is the use case of the file_name in a data stream? Can I pass a tag and metadata like sample_rate and num_channels using the file name, or is there a better option for this?

Contributor:

see @davidzhao's comment above, the best option is the extensions map on the stream.

reader = self._stream_readers.pop(0)
async for data in reader.stream_reader:
yield rtc.AudioFrame(
data=data,
Contributor:

This pattern assumes a single audio frame never exceeds STREAM_CHUNK_SIZE (~15 KB).

Collaborator Author:

For audio bytes, splitting large frames into smaller chunks before sending is fine, and each chunk should stay under 15 KB. But for other use cases, receiving a different number of chunks than were sent may not be good behavior. Maybe add a size limit on the send side?
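The sender-side split suggested here could look like this sketch (the constant value is assumed from the ~15 KB figure above):

```python
STREAM_CHUNK_SIZE = 15 * 1024  # assumed ~15 KB limit from the discussion

def split_chunks(data: bytes, limit: int = STREAM_CHUNK_SIZE) -> list[bytes]:
    """Split a frame payload into chunks no larger than `limit`, so a
    single oversized audio frame can never violate the stream's chunk
    size limit."""
    return [data[i : i + limit] for i in range(0, len(data), limit)]
```

Note this only suits byte streams like raw audio, where chunk boundaries carry no meaning; as discussed, payloads whose chunk count must be preserved end to end would need a different approach.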

Contributor:

I originally had a size limit in there; @theomonnom's wish was that we wouldn't enforce such a limit. But I agree, things might get trickier without a sender-side size limit.
