feat: add LLM function calling support (#724)
Let's implement LLM function calling with `sfn-currency-converter`:

### Step 1. Install CLI

```bash
curl -fsSL https://get.yomo.run | sh
```

Verify that the CLI was installed successfully:

```bash
yomo version
```

### Step 2. Start the server

Prepare the configuration as `my-agent.yaml`:

```yaml
name: ai-zipper
host: 0.0.0.0
port: 9000

auth:
  type: token
  token: SECRET_TOKEN

bridge:
  ai:
    server:
      addr: 0.0.0.0:8000 ## Restful API endpoint
      provider: azopenai ## LLM API Service we will use

    providers:
      azopenai:
        api_key: <YOUR_AZURE_OPENAI_API_KEY>
        api_endpoint: <YOUR_AZURE_OPENAI_ENDPOINT>

      openai:
        api_key: <OPENAI_API_KEY>
        model: <OPENAI_MODEL>

      gemini:
        api_key: <GEMINI_API_KEY>

      huggingface:
        model:
```

Start the server:

```sh
YOMO_LOG_LEVEL=debug yomo serve -c my-agent.yaml
```

### Step 3. Write the function

First, define what the function does and which parameters it requires. Both are combined into the prompt when the LLM is invoked.

```golang
func Description() string {
	return "Get the current exchange rates"
}

type Parameter struct {
	SourceCurrency string  `json:"source" jsonschema:"description=The source currency to be queried in 3-letter ISO 4217 format"`
	TargetCurrency string  `json:"target" jsonschema:"description=The target currency to be queried in 3-letter ISO 4217 format"`
	Amount         float64 `json:"amount" jsonschema:"description=The amount of the USD currency to be converted to the target currency"`
}

func InputSchema() any {
	return &Parameter{}
}
```
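
To make the role of the `json` tags concrete, here is a minimal sketch of how an arguments string produced by the LLM maps onto `Parameter`. The helper `decodeArguments` is hypothetical and not part of YoMo; the arguments string is copied from the sample response later in this walkthrough.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Parameter mirrors the struct defined above. The jsonschema tags are left
// out here because only the json tags matter for decoding.
type Parameter struct {
	SourceCurrency string  `json:"source"`
	TargetCurrency string  `json:"target"`
	Amount         float64 `json:"amount"`
}

// decodeArguments is a hypothetical helper (not a YoMo API) that decodes the
// JSON arguments string emitted by the LLM into a Parameter.
func decodeArguments(raw string) (Parameter, error) {
	var p Parameter
	err := json.Unmarshal([]byte(raw), &p)
	return p, err
}

func main() {
	p, err := decodeArguments(`{"amount": 100, "source": "USD", "target": "GBP"}`)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%s -> %s, amount %.2f\n", p.SourceCurrency, p.TargetCurrency, p.Amount)
}
```

The field names in the JSON (`source`, `target`, `amount`) must match the `json` tags exactly, otherwise the corresponding struct fields stay at their zero values.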

Retrieve the real-time exchange rate by calling the openexchangerates.org API:

```golang
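type Rates struct {
	Rates map[string]float64 `json:"rates"`
}

// fetchRate returns how many units of targetCurrency one unit of sourceCurrency buys.
func fetchRate(sourceCurrency string, targetCurrency string, amount float64) (float64, error) {
	resp, err := http.Get(fmt.Sprintf("https://openexchangerates.org/api/latest.json?app_id=%s&base=%s&symbols=%s", os.Getenv("API_KEY"), sourceCurrency, targetCurrency))
	if err != nil {
		return 0, err
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return 0, err
	}

	// Decode into a value (not a nil pointer) and read from the same variable.
	var rt Rates
	if err := json.Unmarshal(body, &rt); err != nil {
		return 0, err
	}

	return rt.Rates[targetCurrency], nil
}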
```
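
Looking up a missing key in a Go map returns `0`, so a currency that is absent from the response would silently yield a rate of zero. The sketch below shows one way to surface that case. `parseRate` is a hypothetical helper, and the sample body is a trimmed illustration containing only the `rates` field that the `Rates` struct above decodes, not a captured API response.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Rates matches the struct used by fetchRate above.
type Rates struct {
	Rates map[string]float64 `json:"rates"`
}

// parseRate is a hypothetical helper: it decodes a response body and returns
// the rate for target, or an error if the body is invalid or the currency is
// missing from the "rates" object.
func parseRate(body []byte, target string) (float64, error) {
	var rt Rates
	if err := json.Unmarshal(body, &rt); err != nil {
		return 0, fmt.Errorf("decode rates: %w", err)
	}
	rate, ok := rt.Rates[target]
	if !ok {
		return 0, fmt.Errorf("no rate for %q in response", target)
	}
	return rate, nil
}

func main() {
	// Illustrative body only; a real response carries more fields.
	sample := []byte(`{"rates":{"GBP":0.789206}}`)

	rate, err := parseRate(sample, "GBP")
	fmt.Println(rate, err)

	_, err = parseRate(sample, "KRW")
	fmt.Println(err)
}
```

Returning an error for a missing currency keeps the function from reporting a conversion result of `0` as if it were a real rate.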

Wrap it in a Stateful Serverless Function:

```golang
func handler(ctx serverless.Context) {
	fcCtx, err := ai.ParseFunctionCallContext(ctx)
	if err != nil {
		// Not a function call invocation; nothing to do.
		return
	}

	var msg Parameter
	fcCtx.UnmarshalArguments(&msg)

	rate, err := fetchRate(msg.SourceCurrency, msg.TargetCurrency, msg.Amount)
	if err != nil {
		// Do not report a conversion when the rate could not be fetched.
		return
	}
	result := fmt.Sprintf("%f", msg.Amount*rate)

	fcCtx.SetRetrievalResult(fmt.Sprintf("based on today's exchange rate: %f, %f %s is equivalent to approximately %f %s", rate, msg.Amount, msg.SourceCurrency, msg.Amount*rate, msg.TargetCurrency))
	fcCtx.Write(result)
}
```
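
The handler produces two strings: the bare `result` and a human-readable retrieval sentence. As a sketch, that formatting can be isolated into a pure function that runs without a zipper connection. `formatConversion` is a hypothetical helper, not part of YoMo; the rate and amount are taken from the sample response later in this walkthrough.

```go
package main

import "fmt"

// formatConversion is a hypothetical helper that reproduces the two strings
// the handler above builds: the converted amount and the retrieval sentence.
func formatConversion(rate, amount float64, source, target string) (result, retrieval string) {
	converted := amount * rate
	result = fmt.Sprintf("%f", converted)
	retrieval = fmt.Sprintf(
		"based on today's exchange rate: %f, %f %s is equivalent to approximately %f %s",
		rate, amount, source, converted, target,
	)
	return result, retrieval
}

func main() {
	result, retrieval := formatConversion(0.789206, 100, "USD", "GBP")
	fmt.Println(result)    // 78.920600
	fmt.Println(retrieval)
}
```

Keeping the formatting separate from the network call makes it straightforward to unit test the text that is handed back to the LLM.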

Finally, let's run it:

```bash
$ API_KEY=<get_from_openexchangerates.org> go run main.go

time=2024-02-26T17:29:52.868+08:00 level=INFO msg="connected to zipper" component=StreamFunction sfn_id=GqfKopi2ECx7GIlzw6ZL3 sfn_name=fn-exchange-rates zipper_addr=localhost:9000
time=2024-02-26T17:29:52.869+08:00 level=INFO msg="register ai function success" component=StreamFunction sfn_id=GqfKopi2ECx7GIlzw6ZL3 sfn_name=fn-exchange-rates zipper_addr=localhost:9000 name=fn-exchange-rates tag=16
```

### Done, let's give it a try

```sh
$ curl -i -X POST -H "Content-Type: application/json" -d '{"prompt":"How much is 100 dollar in Korea and UK currency"}' http://127.0.0.1:8000/invoke

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Connection: keep-alive
Content-Type: text/event-stream
Date: Mon, 26 Feb 2024 09:30:35 GMT
Keep-Alive: timeout=4
Proxy-Connection: keep-alive

event:result
data: {"req_id":"7YU0SY","result":"78.920600","retrieval_result":"based on today's exchange rate: 0.789206, 100.000000 USD is equivalent to approximately 78.920600 GBP","tool_call_id":"call_mgGM9fqGHTtUueokUa7uwYHT","function_name":"fn-exchange-rates","arguments":"{\"amount\": 100, \"source\": \"USD\", \"target\": \"GBP\"}"}

event:result
data: {"req_id":"7YU0SY","result":"133139.226800","retrieval_result":"based on today's exchange rate: 1331.392268, 100.000000 USD is equivalent to approximately 133139.226800 KRW","tool_call_id":"call_1IFlbtKNC5CEN13tBSM0Nson","function_name":"fn-exchange-rates","arguments":"{\"amount\": 100, \"source\": \"USD\", \"target\": \"KRW\"}"}
```
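
The response is a `text/event-stream` body in which each `data:` line carries one JSON object. A minimal sketch of reading such a stream in Go follows. `InvokeResult` and `parseResults` are hypothetical names; the struct fields are inferred from the sample output above, and the sample stream is a shortened copy of it.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strings"
)

// InvokeResult mirrors the fields visible in the sample response above.
// The type name is an assumption for illustration.
type InvokeResult struct {
	ReqID           string `json:"req_id"`
	Result          string `json:"result"`
	RetrievalResult string `json:"retrieval_result"`
	ToolCallID      string `json:"tool_call_id"`
	FunctionName    string `json:"function_name"`
	Arguments       string `json:"arguments"`
}

// parseResults is a hypothetical helper: it scans event-stream text line by
// line and decodes the JSON payload of every "data:" line.
func parseResults(stream string) ([]InvokeResult, error) {
	var out []InvokeResult
	sc := bufio.NewScanner(strings.NewReader(stream))
	for sc.Scan() {
		line := sc.Text()
		if !strings.HasPrefix(line, "data:") {
			continue // skip "event:" lines and blank separators
		}
		payload := strings.TrimSpace(strings.TrimPrefix(line, "data:"))
		var r InvokeResult
		if err := json.Unmarshal([]byte(payload), &r); err != nil {
			return nil, fmt.Errorf("decode data line: %w", err)
		}
		out = append(out, r)
	}
	return out, sc.Err()
}

// sample is a shortened copy of the two events shown above.
const sample = `event:result
data: {"req_id":"7YU0SY","result":"78.920600","function_name":"fn-exchange-rates","arguments":"{\"amount\": 100, \"source\": \"USD\", \"target\": \"GBP\"}"}

event:result
data: {"req_id":"7YU0SY","result":"133139.226800","function_name":"fn-exchange-rates","arguments":"{\"amount\": 100, \"source\": \"USD\", \"target\": \"KRW\"}"}
`

func main() {
	results, err := parseResults(sample)
	if err != nil {
		fmt.Println("parse failed:", err)
		return
	}
	for _, r := range results {
		fmt.Printf("%s(%s) = %s\n", r.FunctionName, r.Arguments, r.Result)
	}
}
```

Note that `arguments` is itself a JSON document encoded as a string, so it needs a second `json.Unmarshal` if the individual values are required. A single prompt can also trigger several function calls, which is why the helper returns a slice.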

### Full Example Code

[Full LLM Function Calling Code](./example/10-ai/)
venjiang authored Feb 26, 2024
1 parent c28f5a4 commit 65bd27b
Showing 36 changed files with 2,491 additions and 214 deletions.
1 change: 1 addition & 0 deletions .gitignore
@@ -37,3 +37,4 @@ target
coverage.txt
*.o
build/
.env
2 changes: 1 addition & 1 deletion Makefile
@@ -17,7 +17,7 @@ lint:

.PHONY: build
build:
$(GO) build -tags "$(TAGS)" -o bin/yomo -trimpath -ldflags "-s -w" ./cmd/yomo/main.go
$(GO) build -race -tags "$(TAGS)" -o bin/yomo -trimpath -ldflags "-s -w" ./cmd/yomo/main.go

.PHONY: test
test:
262 changes: 116 additions & 146 deletions README.md
@@ -4,43 +4,25 @@

# YoMo ![Go](https://github.com/yomorun/yomo/workflows/Go/badge.svg) [![codecov](https://codecov.io/gh/yomorun/yomo/branch/master/graph/badge.svg?token=MHCE5TZWKM)](https://codecov.io/gh/yomorun/yomo) [![Discord](https://img.shields.io/discord/770589787404369930.svg?label=discord&logo=discord&logoColor=ffffff&color=7389D8&labelColor=6A7EC2)](https://discord.gg/RMtNhx7vds)

YoMo is an open-source Streaming Serverless Framework for building Low-latency
Geo-Distributed System. Built atop QUIC Transport Protocol and Functional
Reactive Programming interface, it makes real-time collaborative applications
reliable, secure, and easy.
YoMo is an open-source LLM Function Calling Framework for building Geo-distributed AI applications.
Built atop QUIC Transport Protocol and Stateful Serverless architecture, makes your AI application
low-latency, reliable, secure, and easy.

Read the docs: 🦖[https://yomo.run](https://yomo.run/docs)

💚 We care about: **The Demand For Real-Time Digital User Experiences**

It’s no secret that today’s users want instant gratification, every productivity
application is more powerful when it's collaborative. But, currently, when we
talk about `distribution`, it represents **distribution in data center**. API is
far away from their users from all over the world.

If an application can be deployed anywhere close to their end users, solve the
problem, this is **Geo-distributed System Architecture**:

<img width="580" alt="yomo geo-distributed system" src="https://user-images.githubusercontent.com/65603/162367572-5a0417fa-e2b2-4d35-8c92-2c95d461706d.png">
💚 We care about: **Customer Experience in the Age of AI**

## 🌶 Features

| | **Features** |
| -- | ------------------------------------------------------------------------------------------------------------ |
| ⚡️ | **Low-latency** Guaranteed by implementing atop QUIC [QUIC](https://datatracker.ietf.org/wg/quic/documents/) |
| 🔐 | **Security** TLS v1.3 on every data packet by design |
| 📱 | **5G/WiFi-6** Reliable networking in Cellular/Wireless |
| 🌎 | **Geo-Distributed Edge Mesh** Edge-Mesh Native architecture makes your services close to end users |
| 📸 | **Event-First** Architecture leverages serverless service to be event driven and elastic |
| 🦖 | **Streaming Serverless** Write only a few lines of code to build applications and microservices |
| 🔐 | **Security** TLS v1.3 on every data packet by design |
| 📸 | **Stateful Serverless** Make your GPU serverless 10x faster |
| 🌎 | **Geo-Distributed Architecture** Brings AI inference closer to end users |
| 🚀 | **Y3** a [faster than real-time codec](https://github.com/yomorun/y3-codec-golang) |
| 📨 | **Reactive** stream processing based on [Rx](http://reactivex.io/documentation/operators.html) |

## 🚀 Getting Started

### Prerequisite

[Install Go](https://golang.org/doc/install)
Let's implement a function calling with `sfn-currency-converter`:

### Step 1. Install CLI

@@ -54,157 +36,156 @@ Verify if the CLI was installed successfully
yomo version
```

### Step 2. Init your first stream function, in WebAssembly way
### Step 2. Start the server

In this demo, we will create a go project observing a data stream and count
bytes received.
Prepare the configuration as `my-agent.yaml`

```bash
yomo init try-yomo
```
```yaml
name: ai-zipper
host: 0.0.0.0
port: 9000

The yomo CLI will generate codes in folder `try-yomo`.
auth:
type: token
token: SECRET_TOKEN

### Step 3. Build
bridge:
ai:
server:
addr: 0.0.0.0:8000 ## Restful API endpoint
provider: azopenai ## LLM API Service we will use

This Stream Function is written in Go, before compiling to WebAssembly, you need
to install [tinygo](https://tinygo.org/getting-started/install/) first.
providers:
azopenai:
api_key: <YOUR_AZURE_OPENAI_API_KEY>
api_endpoint: <YOUR_AZURE_OPENAI_ENDPOINT>

```bash
$ yomo build app.go
openai:
api_key: <OPENAI_API_KEY>
model: <OPENAI_MODEL>

gemini:
api_key: <GEMINI_API_KEY>

ℹ️ YoMo Stream Function file: app.go
⌛ YoMo Stream Function building...
✅ Success! YoMo Stream Function build.
huggingface:
model:
```
Now, we get the `sfn.wasm` file, only 190K bytes.
Start the server:
```bash
$ exa -l
.rw-r--r-- 359 fanweixiao 14 Apr 01:02 app.go
.rwxr-xr-x 190k fanweixiao 14 Apr 01:08 sfn.wasm
```sh
YOMO_LOG_LEVEL=debug yomo serve -c my-agent.yaml
```

> Note: you can implement Stream Function in Rust, Zig, C or other languages can
> be compiled to WebAssembly, more examples can be found at
> [example/7-wasm/sfn](example/7-wasm/sfn).
### Step 3. Write the function

### Step 4. Run
First, let's define what this function do and how's the parameters required, these will be combined to prompt when invoking LLM.

There is a public test Zipper service `tap.yomo.dev:9140` which is provided by
our community, you can test your StreamFunction quickly by connecting to it.
```golang
func Description() string {
return "Get the current exchange rates"
}

```bash
$ yomo dev sfn.wasm

ℹ️ YoMo Stream Function file: sfn.wasm
⌛ Create YoMo Stream Function instance...
ℹ️ Starting YoMo Stream Function instance with executable file: sfn.wasm. Zipper: [tap.yomo.dev:9140].
ℹ️ YoMo Stream Function is running...
time=2023-04-14T00:05:25.073+08:00 level=INFO msg="use credential" component="Stream Function" client_id=7IwpRofCpPp-AcVV2qUFc client_name=yomo-app-demo credential_name=none
time=2023-04-14T00:05:26.297+08:00 level=INFO msg="connected to zipper" component="Stream Function" client_id=7IwpRofCpPp-AcVV2qUFc client_name=yomo-app-demo zipper_addr=tap.yomo.dev:9140
sfn received 57 bytes
sfn received 59 bytes
sfn received 59 bytes
sfn received 59 bytes
sfn received 58 bytes
sfn received 59 bytes
sfn received 58 bytes
sfn received 59 bytes
sfn received 58 bytes
^C
type Parameter struct {
SourceCurrency string `json:"source" jsonschema:"description=The source currency to be queried in 3-letter ISO 4217 format"`
TargetCurrency string `json:"target" jsonschema:"description=The target currency to be queried in 3-letter ISO 4217 format"`
Amount float64 `json:"amount" jsonschema:"description=The amount of the USD currency to be converted to the target currency"`
}

func InputSchema() any {
return &Parameter{}
}
```

It works!

> Note: `yomo dev sfn.wasm` is more convinient for development, it will connect
> to `tap.yomo.dev:9140` automatically. It's a shortcut of
> `yomo run -z tap.yomo.dev:9140 -n yomo-app-demo`.
There are many other examples that can help reduce the learning curve:

- [0-basic](./example/0-basic/): Write Stream Function in pure golang.
- [1-pipeline](./example/1-pipeline/): Unix Pipeline over Cloud.
- [2-iopipe](./example/2-iopipe/): Unix Pipeline over Cloud.
- [3-multi-sfn](./example/3-multi-sfn/): Write programs that do one thing and do
it well. Write programs to work together. --
[Doug Mcllroy](https://en.wikipedia.org/wiki/Unix_philosophy)
- [4-cascading-zipper](./example/4-cascading-zipper/): Flexible adjustment of
sfn deployment and run locations.
- [5-backflow](./example/5-backflow/)
- [6-mesh](./example/6-mesh/): Demonstrate how to put your serverless closer to
end-user.
- [7-wasm](./example/7-wasm/): Implement Stream Function by WebAssembly in `c`,
`go`, `rust` and even [zig](https://ziglang.org).
- [8-deno](./example/8-deno/): Demonstrate how to write Stream Function with
TypeScript and [deno](https://deno.com).
- [9-cli](./example/9-cli/): Implement Stream Function in
[Rx](https://reactivex.io/) way.
Retrieve the real-time exchange rate by calling the openexchangerates.org API.:

Read more about YoMo at [yomo.run/docs](https://yomo.run/docs).
```golang
type Rates struct {
Rates map[string]float64 `json:"rates"`
}

## 🧩 Interop
func fetchRate(sourceCurrency string, targetCurrency string, amount float64) (float64, error) {
resp, _ := http.Get(fmt.Sprintf("https://openexchangerates.org/api/latest.json?app_id=%s&base=%s&symbols=%s", os.Getenv("API_KEY"), sourceCurrency, targetCurrency))
defer resp.Body.Close()
body, _ := io.ReadAll(resp.Body)
var rt *Rates
_ = json.Unmarshal(body, &rt)

### Metaverse Workplace (Virtual Office) with YoMo
return rates.Rates[targetCurrency], nil
}
```

- [Frontend](https://github.com/yomorun/yomo-metaverse-workplace-nextjs)
- [Backend](https://github.com/yomorun/yomo-vhq-backend)
Wrap to a Stateful Serverless Function:

### Sources
```golang
func handler(ctx serverless.Context) {
fcCtx, _ := ai.ParseFunctionCallContext(ctx)

- [Connect EMQ X Broker to YoMo](https://github.com/yomorun/yomo-source-emqx-starter)
- [Connect MQTT to YoMo](https://github.com/yomorun/yomo-source-mqtt-broker-starter)
var msg Parameter
fcCtx.UnmarshalArguments(&msg)

### Stream Functions
rate, _ := fetchRate(msg.SourceCurrency, msg.TargetCurrency, msg.Amount)
result = fmt.Sprintf("%f", msg.Amount*rate)

- [Write a Stream Function with WebAssembly by WasmEdge](https://github.com/yomorun/yomo-wasmedge-tensorflow)
fcCtx.SetRetrievalResult(fmt.Sprintf("based on today's exchange rate: %f, %f %s is equivalent to approximately %f %s",rate, msg.Amount, msg.SourceCurrency, msg.Amount*rate, msg.TargetCurrency))
fcCtx.Write(result)
}
```

### Output Connectors
Finally, let's run it

- [Connect to FaunaDB to store post-processed result the serverless way](https://github.com/yomorun/yomo-sink-faunadb-example)
- Connect to InfluxDB to store post-processed result
- [Connect to TDEngine to store post-processed result](https://github.com/yomorun/yomo-sink-tdengine-example)
```bash
$ API_KEY=<get_from_openexchangerates.org> go run main.go

time=2024-02-26T17:29:52.868+08:00 level=INFO msg="connected to zipper" component=StreamFunction sfn_id=GqfKopi2ECx7GIlzw6ZL3 sfn_name=fn-exchange-rates zipper_addr=localhost:9000
time=2024-02-26T17:29:52.869+08:00 level=INFO msg="register ai function success" component=StreamFunction sfn_id=GqfKopi2ECx7GIlzw6ZL3 sfn_name=fn-exchange-rates zipper_addr=localhost:9000 name=fn-exchange-rates tag=16
```

### Done, let's have a try

```sh
$ curl -i -X POST -H "Content-Type: application/json" -d '{"prompt":"How much is 100 dollar in Korea and UK currency"}' http://127.0.0.1:8000/invoke

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Connection: keep-alive
Content-Type: text/event-stream
Date: Mon, 26 Feb 2024 09:30:35 GMT
Keep-Alive: timeout=4
Proxy-Connection: keep-alive

event:result
data: {"req_id":"7YU0SY","result":"78.920600","retrieval_result":"based on today's exchange rate: 0.789206, 100.000000 USD is equivalent to approximately 78.920600 GBP","tool_call_id":"call_mgGM9fqGHTtUueokUa7uwYHT","function_name":"fn-exchange-rates","arguments":"{\"amount\": 100, \"source\": \"USD\", \"target\": \"GBP\"}"}

event:result
data: {"req_id":"7YU0SY","result":"133139.226800","retrieval_result":"based on today's exchange rate: 1331.392268, 100.000000 USD is equivalent to approximately 133139.226800 KRW","tool_call_id":"call_1IFlbtKNC5CEN13tBSM0Nson","function_name":"fn-exchange-rates","arguments":"{\"amount\": 100, \"source\": \"USD\", \"target\": \"KRW\"}"}
```

## 🗺 Location Insensitive Deployment
### Full Example Code

![yomo-flow-arch](https://yomo.run/yomo-flow-arch.jpg)
[Full LLM Function Calling Codes](./example/10-ai/)

## 📚 Documentation

- `YoMo-Source`: [docs.yomo.run/source](https://yomo.run/docs/api/source)
- `YoMo-Stream-Function`:
[docs.yomo.run/stream-function](https://yomo.run/docs/api/sfn)
- `YoMo-Zipper`: [docs.yomo.run/zipper](https://yomo.run/docs/cli/zipper)
- `Faster than real-time codec`: [Y3](https://github.com/yomorun/y3-codec)
Read more about YoMo at [yomo.run/docs](https://yomo.run/docs).

[YoMo](https://yomo.run) ❤️
[Vercel](https://vercel.com/?utm_source=yomorun&utm_campaign=oss), our
documentation website is

[![Vercel Logo](https://yomo.run/vercel.svg)](https://vercel.com/?utm_source=yomorun&utm_campaign=oss)

## 🎯 Focuses on computings out of data center
## 🎯 Focuses on Geo-distributed AI Inference Infra

- IoT/IIoT/AIoT
- Latency-sensitive applications.
- Networking situation with packet loss or high latency.
- Handling continuous high frequency generated data with stream-processing.
- Building Complex systems with Streaming-Serverless architecture.
It’s no secret that today’s users want instant AI inference, every AI
application is more powerful when it response quickly. But, currently, when we
talk about `distribution`, it represents **distribution in data center**. The AI model is
far away from their users from all over the world.

## 🌟 Why YoMo
If an application can be deployed anywhere close to their end users, solve the
problem, this is **Geo-distributed System Architecture**:

- Based on QUIC (Quick UDP Internet Connection) protocol for data transmission,
which uses the User Datagram Protocol (UDP) as its basis instead of the
Transmission Control Protocol (TCP); significantly improves the stability and
throughput of data transmission. Especially for cellular networks like 5G.
- A self-developed `y3-codec` optimizes decoding performance. For more
information, visit [its own repository](https://github.com/yomorun/y3-codec)
on GitHub.
- Based on stream computing, which improves speed and accuracy when dealing with
data handling and analysis; simplifies the complexity of stream-oriented
programming.
- Secure-by-default from transport protocol.
<img width="580" alt="yomo geo-distributed system" src="https://user-images.githubusercontent.com/65603/162367572-5a0417fa-e2b2-4d35-8c92-2c95d461706d.png">

## 🦸 Contributing

@@ -224,17 +205,6 @@ project, for example:
[code of conduct](https://github.com/yomorun/yomo/blob/master/CODE_OF_CONDUCT.md)
that we expect project participants to adhere to.

## 🤹🏻‍♀️ Feedback

Any questions or good ideas, please feel free to come to our
[Discussion](https://github.com/yomorun/yomo/discussions). Any feedback would be
greatly appreciated!

## 🏄‍♂️ Best Practice in Production

[Discussion #314](https://github.com/yomorun/yomo/discussions/314) Tips:
YoMo/QUIC Server Performance Tuning

## License

[Apache License 2.0](http://www.apache.org/licenses/LICENSE-2.0.html)