doc: improve documentation (#133)

instructor-ai · Mar 9, 2024 · c8bc9ad
1 parent f365a41
commit c8bc9ad
Showing 9 changed files with 343 additions and 148 deletions.
diff --git a/README.md b/README.md
@@ -12,29 +12,19 @@ _Structured extraction in Typescript, powered by llms, designed for simplicity,
 
 Dive into the world of TypeScript-based structured extraction, powered by OpenAI's function calling API and Zod, a TypeScript-first schema validation library with static type inference. Instructor stands out for its simplicity, transparency, and user-centric design. Whether you're a seasoned developer or just starting out, you'll find Instructor's approach intuitive and steerable.
 
-> ℹ️ **Tip:**  Support in other languages
+Check us out in [Python](https://jxnl.github.io/instructor/), [Elixir](https://github.com/thmsmlr/instructor_ex/) and [PHP](https://github.com/cognesy/instructor-php/).
 
-    Check out ports to other languages below:
-
-    - [Python](https://www.github.com/jxnl/instructor)
-    - [Elixir](https://github.com/thmsmlr/instructor_ex/)
-
-    If you want to port Instructor to another language, please reach out to us on [Twitter](https://twitter.com/jxnlco) we'd love to help you get started!
+If you want to port Instructor to another language, please reach out to us on [Twitter](https://twitter.com/jxnlco). We'd love to help you get started!
 
 ## Usage
 
+For tips and tricks on prompting and extracting data, see the [documentation](https://instructor-ai.github.io/instructor-js/tips/prompting/).
+
 ```ts
 import Instructor from "@instructor-ai/instructor";
 import OpenAI from "openai"
 import { z } from "zod"
 
-const UserSchema = z.object({
-  age: z.number(),
-  name: z.string()
-})
-
-type User = z.infer<typeof UserSchema>
-
 const oai = new OpenAI({
   apiKey: process.env.OPENAI_API_KEY ?? undefined,
   organization: process.env.OPENAI_ORG_ID ?? undefined
@@ -45,10 +35,21 @@ const client = Instructor({
   mode: "FUNCTIONS"
 })
 
+const UserSchema = z.object({
+  // Description will be used in the prompt
+  age: z.number().describe("The age of the user"), 
+  name: z.string()
+})
+
+
+// User will be of type z.infer<typeof UserSchema>
 const user = await client.chat.completions.create({
   messages: [{ role: "user", content: "Jason Liu is 30 years old" }],
   model: "gpt-3.5-turbo",
-  response_model: { schema: UserSchema, name: "User" }
+  response_model: { 
+    schema: UserSchema, 
+    name: "User"
+  }
 })
 
 console.log(user)
@@ -59,7 +60,7 @@ console.log(user)
 
 The question of using Instructor is fundamentally a question of why to use zod.
 
-1. **Powered by OpenAI** — Instructor is powered by OpenAI's function calling API. This means you can use the same API for both prompting and extraction.
+1. **Powered by OpenAI SDK** — Instructor is powered by OpenAI's API. This means you can use the same API for both prompting and extraction across multiple providers that support the OpenAI API.
 
 2. **Customizable** — Zod is highly customizable. You can define your own validators, custom error messages, and more.
 
@@ -71,34 +72,12 @@ The question of using Instructor is fundamentally a question of why to use zod.
 
 If you'd like to see more check out our [cookbook](examples/index.md).
 
-[Installing Instructor](docs/installation.md) is a breeze. 
+[Installing Instructor](docs/installation.md) is a breeze.
 
 ## Contributing
 
 If you want to help out, check out the issues marked as `good-first-issue` or `help-wanted`, found [here](https://github.com/instructor-ai/instructor-js/labels/good%20first%20issue). They could be anything from code improvements to a guest blog post or a new cookbook.
 
-Checkout the [contribution guide]() for details on how to set things up, testing, changesets and guidelines.
-
 ## License
 
 This project is licensed under the terms of the MIT License.
-
-## TODO
-- [ ] Add `llm_validator`
-- [ ] Logging for Distillation / Finetuning
-- [x] Support Streaming
-- [x] Optional/Maybe types
-- [ ] Add Tutorials, include in docs
-    - [x] Text Classification
-    - [ ] Search Queries
-    - [x] Query Decomposition
-    - [ ] Citations
-    - [x] Knowledge Graph
-    - [ ] Self Critique
-    - [ ] Image Extracting Tables
-    - [ ] Moderation
-    - [ ] Entity Resolution
-    - [ ] Action Item and Dependency Mapping
-
-These translations provide a structured approach to creating TypeScript schemas with Zod, mirroring the functionality and intent of the original Python examples.
-
diff --git a/docs/blog/posts/anyscale.md b/docs/blog/posts/anyscale.md
@@ -19,7 +19,7 @@ By the end of this blog post, you will learn how to effectively utilize instruct
 
 ## Understanding Modes
 
-Instructor's patch enhances a openai api it with the following features, you can learn more about them [here](../../concepts/modes.md), for anyscale they support `JSON_SCHEMA` and `TOOLS` modes. and with instructor we'll be able to use the following features:
+Instructor's patch enhances the OpenAI API client with the features listed below; you can learn more about them [here](../../concepts/patching.md). Anyscale supports the `JSON_SCHEMA` and `TOOLS` modes, and with Instructor we'll be able to use the following features:
 
 - `response_model` in `create` calls that returns a Zod schema
 - `max_retries` in `create` calls that retries the call if it fails by using a backoff strategy
@@ -34,7 +34,7 @@ The good news is that Anyscale employs the same OpenAI client, and its models su
 
 Let's explore one of the models available in Anyscale's extensive collection!
 
-```ts
+```js
 import Instructor from "@/instructor"
 import OpenAI from "openai"
 import { z } from "zod"
@@ -84,4 +84,5 @@ console.log(user)
 }
  */
 ```
-You can find more information about Anyscale's output mode support [here](https://docs.endpoints.anyscale.com/).
+
+You can find more information about Anyscale's output mode support [here](https://docs.endpoints.anyscale.com/).
diff --git a/docs/blog/posts/oss.md b/docs/blog/posts/oss.md
@@ -0,0 +1,196 @@
+---
+draft: False
+date: 2024-03-07
+slug: open-source-local-structured-output-zod-json-openai
+tags:
+  - llms
+  - opensource
+  - together
+  - llama-cpp-python
+  - anyscale
+  - groq
+  - mistral
+  - ollama
+authors:
+  - jxnl
+---
+
+# Structured Output for Open Source and Local LLMs
+
+Originally, Instructor facilitated API interactions solely via the OpenAI SDK, with an emphasis on function calling by incorporating [Zod](https://zod.dev/) for structured data validation and serialization.
+
+As the year progressed, we expanded our toolkit by integrating [JSON mode](../../concepts/patching.md#json-mode), thus enhancing our adaptability to vision models and open source models. This advancement now enables us to support an extensive range of models, from [GPT](https://openai.com/api/) and [Mistral](https://mistral.ai) to virtually any model accessible through [Ollama](https://ollama.ai) and [Hugging Face](https://huggingface.co/models). For more insights into leveraging JSON mode with various models, refer back to our detailed guide on [Patching](../../concepts/patching.md).
+
+<!-- more -->
+
+## Exploring Different OpenAI Clients with Instructor
+
+Below, we explore some of the notable clients integrated with Instructor, providing structured outputs and enhanced capabilities, complete with examples of how to initialize and patch each client.
+
+## Local Models
+
+### Ollama: A New Frontier for Local Models
+
+For an in-depth exploration of Ollama, including setup and advanced features, refer to the documentation. The [Ollama official website](https://ollama.ai/download) also provides essential resources, model downloads, and community support for newcomers.
+
+```
+ollama run llama2
+```
+
+```js
+import Instructor from "@instructor-ai/instructor"
+import OpenAI from "openai"
+import { z } from "zod"
+
+const UserExtractSchema = z.object({
+  age: z.number(),
+  name: z.string()
+})
+
+const oai = new OpenAI({
+  apiKey: "ollama",  // required, but unused
+  baseURL: "http://localhost:11434/v1", // Ollama's local OpenAI-compatible endpoint
+})
+
+const client = Instructor({
+  client: oai,
+  mode: "FUNCTIONS"
+})
+
+const user = await client.chat.completions.create({
+  model: "llama2",
+  messages: [{ role: "user", content: "Jason is 30 years old" }],
+  response_model: { schema: UserExtractSchema, name: "UserExtractSchema" }
+})
+
+console.log(user)
+// { age: 30, name: "Jason" }
+```
+
+## Alternative Providers
+
+### Anyscale
+
+```bash
+export ANYSCALE_API_KEY="your-api-key"
+```
+
+```js
+import { z } from "zod";
+import Instructor from "@instructor-ai/instructor";
+import OpenAI from "openai";
+
+// Define the schema using Zod
+const UserExtractSchema = z.object({
+  name: z.string(),
+  age: z.number(),
+});
+
+// Initialize OpenAI client
+const oai = new OpenAI({
+  apiKey: process.env.ANYSCALE_API_KEY,
+  baseURL: "https://api.endpoints.anyscale.com/v1",
+});
+
+// Patch the OpenAI client with Instructor-js
+const client = Instructor({
+  client: oai,
+  mode: "JSON_SCHEMA"
+});
+
+// Use the patched client to create a chat completion
+const resp = await client.chat.completions.create({
+  model: "mistralai/Mixtral-8x7B-Instruct-v0.1",
+  messages: [
+    { role: "system", content: "You are a world class extractor" },
+    { role: "user", content: 'Extract the following entities: "Jason is 20"' },
+  ],
+  response_model: { schema: UserExtractSchema, name: "UserExtractSchema" },
+});
+
+console.log(resp);
+// Expected output: { name: 'Jason', age: 20 }
+```
+
+### Groq
+
+Groq (see the [official documentation](https://groq.com/)) offers a unique approach to processing with its tensor architecture, which is designed to speed up inference, including structured output generation.
+
+```bash
+export GROQ_API_KEY="your-api-key"
+```
+
+```js
+import { z } from "zod";
+import Instructor from "@instructor-ai/instructor";
+import Groq from "groq-sdk";
+
+// Initialize Groq client
+const groqClient = new Groq({
+  apiKey: process.env.GROQ_API_KEY,
+});
+
+// Define the schema using Zod
+const UserExtractSchema = z.object({
+  name: z.string(),
+  age: z.number(),
+});
+
+// Patch the Groq client with Instructor-js
+const client = Instructor({
+  client: groqClient,
+  mode: "FUNCTIONS",
+});
+
+// Use the patched client to create a chat completion
+const user = await client.chat.completions.create({
+  model: "mixtral-8x7b-32768",
+  response_model: { schema: UserExtractSchema, name: "UserExtract" },
+  messages: [
+    { role: "user", content: "Extract jason is 25 years old" },
+  ],
+});
+
+console.log(user);
+// { name: 'jason', age: 25 }
+```
+
+### Together AI
+
+Together AI, when combined with Instructor, offers a seamless experience for developers looking to leverage structured outputs in their applications.
+
+```bash
+export TOGETHER_API_KEY="your-api-key"
+```
+
+```js
+import { z } from "zod";
+import Instructor from "@instructor-ai/instructor";
+import OpenAI from "openai";
+
+
+const client = Instructor({
+  client: new OpenAI({
+    apiKey: process.env.TOGETHER_API_KEY,
+    baseURL: "https://api.together.xyz/v1",
+  }),
+  mode: "TOOLS",
+});
+
+const UserExtractSchema = z.object({
+  name: z.string(),
+  age: z.number(),
+});
+
+const user = await client.chat.completions.create({
+  model: "mistralai/Mixtral-8x7B-Instruct-v0.1",
+  response_model: { schema: UserExtractSchema, name: "UserExtract" },
+  messages: [
+    { role: "user", content: "Extract jason is 25 years old" },
+  ],
+});
+
+// A Zod schema is not a class, so validate with safeParse instead of instanceof
+console.assert(UserExtractSchema.safeParse(user).success, "Should match UserExtractSchema");
+console.log(user);
+// { name: 'jason', age: 25 }
+```
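
The provider examples in `docs/blog/posts/oss.md` differ only in API key, base URL, and mode. As a minimal sketch (not part of this commit or of Instructor), the table below collects those values from the post and builds the options object that `new OpenAI(...)` expects. The names `PROVIDERS` and `clientOptions` are hypothetical; `baseURL` is the option name used by the OpenAI Node SDK.

```typescript
// Hypothetical helper for illustration only; not part of Instructor or this commit.
type Mode = "FUNCTIONS" | "TOOLS" | "JSON_SCHEMA"
type ProviderName = "ollama" | "anyscale" | "together"

interface ProviderConfig {
  baseURL: string
  apiKeyEnv?: string // name of the environment variable holding the key, if any
  mode: Mode // mode used for this provider in the blog post
}

// Base URLs, environment variable names, and modes are copied from the post.
const PROVIDERS: Record<ProviderName, ProviderConfig> = {
  ollama: { baseURL: "http://localhost:11434/v1", mode: "FUNCTIONS" },
  anyscale: {
    baseURL: "https://api.endpoints.anyscale.com/v1",
    apiKeyEnv: "ANYSCALE_API_KEY",
    mode: "JSON_SCHEMA"
  },
  together: {
    baseURL: "https://api.together.xyz/v1",
    apiKeyEnv: "TOGETHER_API_KEY",
    mode: "TOOLS"
  }
}

// Build the { apiKey, baseURL } object to pass to `new OpenAI(...)`.
function clientOptions(
  name: ProviderName,
  env: Record<string, string | undefined>
): { apiKey: string; baseURL: string } {
  const cfg = PROVIDERS[name]
  // Ollama requires a key to be set but does not use it, so a placeholder is enough.
  const apiKey = cfg.apiKeyEnv ? env[cfg.apiKeyEnv] : "ollama"
  if (!apiKey) {
    throw new Error(`Missing ${cfg.apiKeyEnv} for provider ${name}`)
  }
  return { apiKey, baseURL: cfg.baseURL }
}
```

A call such as `new OpenAI(clientOptions("together", process.env))` followed by `Instructor({ client, mode: PROVIDERS.together.mode })` would then mirror the Together AI example, assuming the environment variable is set.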
diff --git a/docs/blog/posts/together.md b/docs/blog/posts/together.md
@@ -19,7 +19,7 @@ By the end of this blog post, you will learn how to effectively utilize instruct
 
 ## Understanding Modes
 
-Instructor's patch enhances a openai api it with the following features, you can learn more about them [here](../../concepts/modes.md), for Togethers they support `JSON_SCHEMA` and `TOOLS` modes. and with instructor we'll be able to use the following features:
+Instructor's patch enhances the OpenAI API client with the features listed below; you can learn more about them [here](../../concepts/patching.md). Together supports the `JSON_SCHEMA` and `TOOLS` modes, and with Instructor we'll be able to use the following features:
 
 - `response_model` in `create` calls that returns a Zod schema
 - `max_retries` in `create` calls that retries the call if it fails by using a backoff strategy
@@ -84,4 +84,5 @@ console.log(user)
 }
  */
 ```
+
 You can find more information about Together's output mode support [here](https://docs.together.ai/docs/json-mode/).
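
The Anyscale and Together posts describe `max_retries` as retrying a failed call "by using a backoff strategy" without saying which one. The sketch below illustrates one common choice, exponential backoff, as an assumption rather than a description of Instructor's internals. `backoffDelays` and `withRetries` are hypothetical names.

```typescript
// Hypothetical illustration of retrying with exponential backoff.
// This is NOT Instructor's implementation; the doubling schedule is an assumption.

// Delay (in ms) to wait before each retry: base, 2x base, 4x base, ...
function backoffDelays(maxRetries: number, baseDelayMs: number): number[] {
  return Array.from({ length: maxRetries }, (_, i) => baseDelayMs * 2 ** i)
}

function sleep(ms: number): Promise<void> {
  return new Promise(resolve => setTimeout(resolve, ms))
}

// Run `fn` once, then retry up to `maxRetries` times, waiting longer before each retry.
async function withRetries<T>(
  fn: () => Promise<T>,
  maxRetries: number,
  baseDelayMs = 100
): Promise<T> {
  // First attempt is immediate; each later attempt waits for its backoff delay.
  const waits = [0, ...backoffDelays(maxRetries, baseDelayMs)]
  let lastError: unknown
  for (const wait of waits) {
    if (wait > 0) await sleep(wait)
    try {
      return await fn()
    } catch (err) {
      lastError = err // remember the failure and try again if attempts remain
    }
  }
  throw lastError
}
```

With Instructor, passing `max_retries` to `create` is all that is needed; a wrapper like `withRetries(() => client.chat.completions.create({ ... }), 3)` only shows what such a strategy might look like from the outside.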