fullscreen/realtime interface for samples #865

Merged
merged 225 commits into from
Nov 21, 2024

jjallaire
Collaborator

This PR introduces a new fullscreen display UI that combines the traditional Inspect task view with additional panes for running samples and log/console output. Each running sample has a live transcript, along with the ability to cancel and score it (or, for evals with fail-on-error policies, to cancel by raising an error):

[Screenshots: fullscreen UI, 2024-11-19 at 6:22:05 PM and 6:24:25 PM]

The new UI is not yet enabled by default; we'd like people to experiment with it and provide feedback before we make it the default (in about two weeks). You can enable fullscreen mode using the --display option or the INSPECT_DISPLAY environment variable:

# enable globally in .env
INSPECT_DISPLAY=full

# enable for a single eval
inspect eval ctf.py --display full

The available values for the display option are:

  • full - Fullscreen UI as shown above
  • rich - Classic progress UI (currently the default)
  • plain - No progress UI; prints a task summary at the end without ANSI colors/formatting
  • none - No display at all
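
To make the two ways of setting the option concrete, here is a minimal Python sketch of how a display mode could be resolved. It is not Inspect's actual implementation: `resolve_display` is a hypothetical helper, and the precedence (an explicit --display value first, then INSPECT_DISPLAY, then the default) is an assumption that this PR does not confirm. Only the four mode names, the environment variable name, and the current default come from the description above.

```python
import os
from typing import Mapping, Optional

# The four values listed in the PR description.
DISPLAY_MODES = ("full", "rich", "plain", "none")
# Stated in the description as the current default.
DEFAULT_DISPLAY = "rich"


def resolve_display(
    cli_value: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> str:
    """Hypothetical helper (not part of Inspect): choose a display mode.

    Assumed precedence: explicit --display value, then the
    INSPECT_DISPLAY environment variable, then the default.
    """
    env = os.environ if env is None else env
    value = cli_value or env.get("INSPECT_DISPLAY") or DEFAULT_DISPLAY
    if value not in DISPLAY_MODES:
        raise ValueError(
            f"unknown display mode {value!r}; expected one of {DISPLAY_MODES}"
        )
    return value


if __name__ == "__main__":
    # Uses the real process environment when no mapping is passed.
    print(resolve_display())
```

Passing the environment as a mapping keeps the helper easy to exercise without touching the real process environment.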

We'll be working on several enhancements to fullscreen mode in the near future:

  1. Realtime display of scores/metrics in the task view
  2. Human approver will run in a new "Approval" tab and be async (so other samples/tasks keep running while approval is pending)
  3. Ability to add custom panels (much like the current input_screen(), but async and with richer UI constructs available via Textual)
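
As a rough illustration of what "async" approval means in item 2, the sketch below uses plain asyncio to show one sample waiting for approval while another runs to completion. Everything here is hypothetical (`run_sample`, the `asyncio.Event` standing in for the human approver, the sample names); it describes the general pattern only, not how Inspect does or will implement approval.

```python
import asyncio
from typing import List, Optional


async def run_sample(
    name: str, log: List[str], approval: Optional[asyncio.Event] = None
) -> None:
    """Hypothetical sample that may need human approval before finishing."""
    log.append(f"{name}: started")
    if approval is not None:
        # Suspends only this sample; the event loop keeps running others.
        await approval.wait()
        log.append(f"{name}: approved")
    log.append(f"{name}: finished")


async def main() -> List[str]:
    log: List[str] = []
    approval = asyncio.Event()  # stand-in for the human approver's decision
    pending = asyncio.create_task(run_sample("needs-approval", log, approval))
    other = asyncio.create_task(run_sample("independent", log))

    await other          # finishes while approval is still pending
    approval.set()       # the human approves
    await pending
    return log


if __name__ == "__main__":
    for line in asyncio.run(main()):
        print(line)
```

The independent sample finishes before the approval is granted, which is the behaviour the enhancement is aiming for.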

@jjallaire jjallaire merged commit 047de72 into main Nov 21, 2024
10 checks passed
@jjallaire jjallaire deleted the feature/realtime branch November 21, 2024 03:28
@rusheb-apollo
HUGE!

jjallaire added a commit that referenced this pull request Nov 21, 2024