doc: improve documentation (#133)
jxnl authored Mar 9, 2024
1 parent f365a41 commit c8bc9ad
Showing 9 changed files with 343 additions and 148 deletions.
57 changes: 18 additions & 39 deletions README.md
@@ -12,29 +12,19 @@ _Structured extraction in Typescript, powered by llms, designed for simplicity,

Dive into the world of TypeScript-based structured extraction, powered by OpenAI's function calling API and Zod, a TypeScript-first schema validation library with static type inference. Instructor stands out for its simplicity, transparency, and user-centric design. Whether you're a seasoned developer or just starting out, you'll find Instructor's approach intuitive and steerable.

> ℹ️ **Tip:** Support in other languages
> Check us out in [Python](https://jxnl.github.io/instructor/), [Elixir](https://github.com/thmsmlr/instructor_ex/) and [PHP](https://github.com/cognesy/instructor-php/).

Check out ports to other languages below:

- [Python](https://www.github.com/jxnl/instructor)
- [Elixir](https://github.com/thmsmlr/instructor_ex/)

If you want to port Instructor to another language, please reach out to us on [Twitter](https://twitter.com/jxnlco); we'd love to help you get started!

## Usage

For all the tips and tricks on prompting and extracting data, check out the [documentation](https://instructor-ai.github.io/instructor-js/tips/prompting/).

```ts
import Instructor from "@instructor-ai/instructor";
import OpenAI from "openai"
import { z } from "zod"

const oai = new OpenAI({
apiKey: process.env.OPENAI_API_KEY ?? undefined,
organization: process.env.OPENAI_ORG_ID ?? undefined
@@ -45,10 +35,21 @@ const client = Instructor({
mode: "FUNCTIONS"
})

const UserSchema = z.object({
// Description will be used in the prompt
age: z.number().describe("The age of the user"),
name: z.string()
})


// User will be of type z.infer<typeof UserSchema>
const user = await client.chat.completions.create({
messages: [{ role: "user", content: "Jason Liu is 30 years old" }],
model: "gpt-3.5-turbo",
response_model: { schema: UserSchema, name: "User" }
response_model: {
schema: UserSchema,
name: "User"
}
})

console.log(user)
@@ -59,7 +60,7 @@ console.log(user)

The question of whether to use Instructor is fundamentally a question of why to use Zod.

1. **Powered by the OpenAI SDK** — Instructor is built on the OpenAI SDK, so you can use the same API for both prompting and structured extraction across any provider that supports the OpenAI API.

2. **Customizable** — Zod is highly customizable. You can define your own validators, custom error messages, and more (see the sketch below the list).

@@ -71,34 +72,12 @@ The question of using Instructor is fundamentally a question of why to use zod.

If you'd like to see more, check out our [cookbook](examples/index.md).
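
To illustrate the customizability point above, here is a small sketch; the schema, field names, and error message are illustrative, and it reuses the `client` from the Usage example:

```ts
import { z } from "zod"

// Custom validation with a custom error message on the age field;
// the .describe() text is used in the prompt, as in the Usage example above.
const ProfileSchema = z.object({
  name: z.string().describe("The user's full name"),
  age: z.number().min(0, { message: "Age must be a non-negative number" })
})

const profile = await client.chat.completions.create({
  messages: [{ role: "user", content: "Jason Liu is 30 years old" }],
  model: "gpt-3.5-turbo",
  response_model: { schema: ProfileSchema, name: "Profile" }
})
```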

[Installing Instructor](docs/installation.md) is a breeze.

## Contributing

If you want to help out, check out the issues marked `good-first-issue` or `help-wanted` [here](https://github.com/instructor-ai/instructor-js/labels/good%20first%20issue). They could be anything from code improvements, a guest blog post, or a new cookbook.

Check out the [contribution guide]() for details on setup, testing, changesets, and guidelines.

## License

This project is licensed under the terms of the MIT License.

## TODO
- [ ] Add `llm_validator`
- [ ] Logging for Distillation / Finetuning
- [x] Support Streaming
- [x] Optional/Maybe types
- [ ] Add Tutorials, include in docs
- [x] Text Classification
- [ ] Search Queries
- [x] Query Decomposition
- [ ] Citations
- [x] Knowledge Graph
- [ ] Self Critique
- [ ] Image Extracting Tables
- [ ] Moderation
- [ ] Entity Resolution
- [ ] Action Item and Dependency Mapping


7 changes: 4 additions & 3 deletions docs/blog/posts/anyscale.md
@@ -19,7 +19,7 @@ By the end of this blog post, you will learn how to effectively utilize instruct

## Understanding Modes

Instructor's patch enhances the OpenAI client with the features listed below; you can learn more about them [here](../../concepts/patching.md). Anyscale supports the `JSON_SCHEMA` and `TOOLS` modes, and with Instructor we'll be able to use:

- `response_model` in `create` calls, which validates the response against a Zod schema
- `max_retries` in `create` calls, which retries failed calls using a backoff strategy (see the sketch after this list)
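
A compact sketch of how these two options appear in a call, reusing the patched `client` and the `z` import from the full example below; the schema, model name, and retry count here are illustrative:

```js
// response_model validates the completion against the Zod schema;
// max_retries re-asks the model with a backoff strategy if validation fails.
const UserSchema = z.object({
  name: z.string(),
  age: z.number()
})

const user = await client.chat.completions.create({
  model: "mistralai/Mistral-7B-Instruct-v0.1",
  messages: [{ role: "user", content: "Jason is 30 years old" }],
  response_model: { schema: UserSchema, name: "User" },
  max_retries: 3
})
```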
@@ -34,7 +34,7 @@ The good news is that Anyscale employs the same OpenAI client, and its models su

Let's explore one of the models available in Anyscale's extensive collection!

```js
import Instructor from "@/instructor"
import OpenAI from "openai"
import { z } from "zod"
@@ -84,4 +84,5 @@ console.log(user)
}
*/
```
You can find more information about Anyscale's output mode support [here](https://docs.endpoints.anyscale.com/).
196 changes: 196 additions & 0 deletions docs/blog/posts/oss.md
@@ -0,0 +1,196 @@
---
draft: False
date: 2024-03-07
slug: open-source-local-structured-output-zod-json-openai
tags:
- llms
- opensource
- together
- llama-cpp-python
- anyscale
- groq
- mistral
- ollama
authors:
- jxnl
---

# Structured Output for Open Source and Local LLMs

Originally, Instructor supported API interactions solely through the OpenAI SDK, with an emphasis on function calling, using [Zod](https://zod.dev/) for structured data validation and serialization.

As the year progressed, we expanded our toolkit by integrating [JSON mode](../../concepts/patching.md#json-mode), thus enhancing our adaptability to vision models and open source models. This advancement now enables us to support an extensive range of models, from [GPT](https://openai.com/api/) and [Mistral](https://mistral.ai) to virtually any model accessible through [Ollama](https://ollama.ai) and [Hugging Face](https://huggingface.co/models). For more insights into leveraging JSON mode with various models, refer back to our detailed guide on [Patching](../../concepts/patching.md).

<!-- more -->

## Exploring Different OpenAI Clients with Instructor

Below, we explore some of the notable clients integrated with Instructor, providing structured outputs and enhanced capabilities, complete with examples of how to initialize and patch each client.
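
Each client below follows the same pattern: build an OpenAI-compatible client pointed at the provider's endpoint, then wrap it with `Instructor` in a mode that provider supports. A provider-agnostic sketch, where the key, base URL, and mode are placeholders to be filled in from the sections that follow:

```js
import Instructor from "@instructor-ai/instructor"
import OpenAI from "openai"

// Placeholder endpoint and key: substitute the provider-specific values
// shown below (Ollama, Anyscale, Groq, Together, ...).
const oai = new OpenAI({
  apiKey: process.env.PROVIDER_API_KEY,
  baseURL: "https://example-provider.invalid/v1"
})

const client = Instructor({
  client: oai,
  mode: "TOOLS" // pick a mode the provider supports, e.g. "JSON_SCHEMA" for Anyscale
})
```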

## Local Models

### Ollama: A New Frontier for Local Models

For an in-depth exploration of Ollama, including setup and advanced features, refer to the documentation. The [Ollama official website](https://ollama.ai/download) also provides essential resources, model downloads, and community support for newcomers.

```bash
ollama run llama2
```

```js
import Instructor from "@instructor-ai/instructor"
import OpenAI from "openai"
import { z } from "zod"

const UserExtractSchema = z.object({
age: z.number(),
name: z.string()
})

const oai = new OpenAI({
  apiKey: "ollama", // required by the client, but unused by Ollama
  baseURL: "http://localhost:11434/v1" // Ollama's OpenAI-compatible endpoint
})

const client = Instructor({
client: oai,
mode: "FUNCTIONS"
})

const user = await client.chat.completions.create({
model: "llama2",
messages: [{ role: "user", content: "Jason is 30 years old" }],
response_model: { schema: UserExtractSchema, name: "UserExtractSchema" }
})

console.log(user)
// { age: 30, name: "Jason" }
```

## Alternative Providers

### Anyscale

```bash
export ANYSCALE_API_KEY="your-api-key"
```

```js
import { z } from "zod";
import Instructor from "@instructor-ai/instructor";
import OpenAI from "openai";

// Define the schema using Zod
const UserExtractSchema = z.object({
name: z.string(),
age: z.number(),
});

// Initialize OpenAI client
const oai = new OpenAI({
apiKey: process.env.ANYSCALE_API_KEY,
baseURL: "https://api.endpoints.anyscale.com/v1",
});

// Patch the OpenAI client with Instructor-js
const client = Instructor({
client: oai,
mode: "JSON_SCHEMA"
});

// Use the patched client to create a chat completion
const resp = await client.chat.completions.create({
model: "mistralai/Mixtral-8x7B-Instruct-v0.1",
messages: [
{ role: "system", content: "You are a world class extractor" },
{ role: "user", content: 'Extract the following entities: "Jason is 20"' },
],
response_model: { schema: UserExtractSchema, name: "UserExtractSchema" },
});

console.log(resp);
// Expected output: { name: 'Jason', age: 20 }
```

### Groq

Groq takes a unique approach to inference with its tensor architecture, which significantly speeds up structured output processing; see [Groq's official documentation](https://groq.com/) for details.

```bash
export GROQ_API_KEY="your-api-key"
```

```js
import { z } from "zod";
import Instructor from "@instructor-ai/instructor";
import Groq from "groq-sdk";

// Initialize Groq client
const groqClient = new Groq({
apiKey: process.env.GROQ_API_KEY,
});

// Define the schema using Zod
const UserExtractSchema = z.object({
name: z.string(),
age: z.number(),
});

// Patch the Groq client with Instructor-js
const client = Instructor({
client: groqClient,
mode: "FUNCTIONS",
});

// Use the patched client to create a chat completion
const user = await client.chat.completions.create({
model: "mixtral-8x7b-32768",
response_model: { schema: UserExtractSchema, name: "UserExtract" },
messages: [
{ role: "user", content: "Extract jason is 25 years old" },
],
});

console.log(user);
// { name: 'jason', age: 25 }
```

### Together AI

Together AI, when combined with Instructor, offers a seamless experience for developers looking to leverage structured outputs in their applications.

```bash
export TOGETHER_API_KEY="your-api-key"
```

```js
import { z } from "zod";
import Instructor from "@instructor-ai/instructor";
import OpenAI from "openai";


const client = Instructor({
client: new OpenAI({
apiKey: process.env.TOGETHER_API_KEY,
baseURL: "https://api.together.xyz/v1",
}),
mode: "TOOLS",
});

const UserExtractSchema = z.object({
name: z.string(),
age: z.number(),
});

const user = await client.chat.completions.create({
model: "mistralai/Mixtral-8x7B-Instruct-v0.1",
response_model: { schema: UserExtractSchema, name: "UserExtract" },
messages: [
{ role: "user", content: "Extract jason is 25 years old" },
],
});

// user has already been validated against UserExtractSchema by Instructor
console.log(user);
// { name: "jason", age: 25 }
```
3 changes: 2 additions & 1 deletion docs/blog/posts/together.md
@@ -19,7 +19,7 @@ By the end of this blog post, you will learn how to effectively utilize instruct

## Understanding Modes

Instructor's patch enhances the OpenAI client with the features listed below; you can learn more about them [here](../../concepts/patching.md). Together supports the `JSON_SCHEMA` and `TOOLS` modes, and with Instructor we'll be able to use:

- `response_model` in `create` calls, which validates the response against a Zod schema
- `max_retries` in `create` calls, which retries failed calls using a backoff strategy
@@ -84,4 +84,5 @@ console.log(user)
}
*/
```

You can find more information about Together's output mode support [here](https://docs.together.ai/docs/json-mode/).