update evaluation tutorial #614

Merged: 7 commits from isaac/updateevaluationtutorial into main on Jan 16, 2025
Conversation

@isahers1 (Contributor) commented Jan 8, 2025

No description provided.


vercel bot commented Jan 8, 2025

The latest updates on your projects. Learn more about Vercel for Git.

| Name | Status | Preview | Comments | Updated (UTC) |
| --- | --- | --- | --- | --- |
| langsmith-docs | ✅ Ready (Inspect) | Visit Preview | 💬 Add feedback | Jan 16, 2025 3:00am |

docs/evaluation/tutorials/evaluation.mdx: 3 resolved review threads (outdated)
```diff
-def evaluate_length(run: Run, example: Example) -> dict:
-    prediction = run.outputs.get("output") or ""
-    required = example.outputs.get("answer") or ""
+def evaluate_length(outputs: dict, reference_outputs: dict) -> dict:
```
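For context, a minimal sketch of what the updated evaluator could look like in full. Only the signatures appear in this diff, so the body, the "response"/"answer" keys, and the "length" feedback key are assumptions drawn from the rest of this PR:

```python
def evaluate_length(outputs: dict, reference_outputs: dict) -> dict:
    # Assumed keys: the app writes "response", the dataset stores "answer".
    prediction = outputs.get("response") or ""
    required = reference_outputs.get("answer") or ""
    # Pass if the response is under twice the length of the reference answer.
    score = int(len(prediction) < 2 * len(required))
    return {"key": "length", "score": score}
```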
Contributor commented:

could we just call this something like is_concise() and return the bool directly

@isahers1 (author) replied:

yeah going to call it "length" to match the key
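Presumably the renamed evaluator then looks something like the following sketch (body assumed; the LangSmith SDK uses the function name as the feedback key when an evaluator returns a bare primitive):

```python
def length(outputs: dict, reference_outputs: dict) -> bool:
    # Returning the bool directly, per the review suggestion above;
    # the function name "length" then doubles as the feedback key.
    return len(outputs["response"]) < 2 * len(reference_outputs["answer"])
```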

docs/evaluation/tutorials/evaluation.mdx: 5 resolved review threads (outdated)
Comment on lines 207 to 215
```diff
 # Note: If your system is async, you can use the asynchronous `aevaluate` function
 # import asyncio
 # from langsmith import aevaluate
 #
-# experiment_results = asyncio.run(aevaluate(
+# experiment_results = asyncio.run(client.aevaluate(
 #     my_async_langsmith_app, # Your AI system
 #     data=dataset_name, # The data to predict and grade over
-#     evaluators=[evaluate_length, qa_evaluator], # The evaluators to score the results
+#     evaluators=[length, correctness], # The evaluators to score the results
 #     experiment_prefix="openai-3.5", # A prefix for your experiment names to easily identify them
```
Contributor commented:

can we delete this comment? don't think it's needed in the tutorial

@isahers1 (author) replied:

deleted, but would it be valuable to add a note about async compatibility somewhere? wouldn't want people turned away because they think we only do sync stuff.
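For reference, a runnable sketch of the async variant that was removed, assuming `length` and `correctness` evaluators and a `dataset_name` are defined as elsewhere in the tutorial; the target app here is a placeholder:

```python
import asyncio

from langsmith import Client

client = Client()
dataset_name = "..."  # The dataset to predict and grade over

async def my_async_langsmith_app(inputs: dict) -> dict:
    # Placeholder async target; a real app would await an LLM call here.
    return {"response": "..."}

experiment_results = asyncio.run(
    client.aevaluate(
        my_async_langsmith_app,            # Your AI system
        data=dataset_name,
        evaluators=[length, correctness],  # Assumed defined as in the tutorial
        experiment_prefix="openai-3.5",    # Prefix to identify experiment runs
    )
)
```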

docs/evaluation/tutorials/evaluation.mdx: 1 resolved review thread (outdated)
Comment on lines +158 to +159
```python
def concision(outputs: dict, reference_outputs: dict) -> bool:
    return int(len(outputs["response"]) < 2 * len(reference_outputs["answer"]))
```
@isahers1 (author) commented:

The screenshots all say length so we would need to retake all of them if we want to use concision as the name.
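For illustration, a sketch of how the `concision` evaluator above plugs into a synchronous run; `my_app` and `dataset_name` are assumptions, standing in for the app and dataset defined in the tutorial:

```python
from langsmith import Client

client = Client()

experiment_results = client.evaluate(
    my_app,                   # Your AI system, assumed defined as in the tutorial
    data=dataset_name,        # The dataset to predict and grade over
    evaluators=[concision],   # The evaluator quoted above
    experiment_prefix="openai-3.5",
)
```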

@baskaryan merged commit 98815a1 into main on Jan 16, 2025
6 checks passed
@baskaryan deleted the isaac/updateevaluationtutorial branch on January 16, 2025 at 03:11