Add Headful browsing to Agent Web Tooling #1080

cmathw · 2025-01-06T04:01:18Z

This PR contains:

What is the current behavior? (You can also link to an open issue here)

The web_browser tooling only uses a headless browser, so it is not possible to view the browser graphically in real time.

What is the new behavior?

This PR implements headful browsing, allowing users to view an agent's interactions with the browser in real time via a VNC viewer.

Tests for this new behaviour have been added in test_playwright_crawler. They parameterise Playwright's headless flag so that both True and False are tested, where previously only True was.

Does this PR introduce a breaking change? (What changes might users need to make in their application due to this PR?)

This PR should be accompanied by an update to the image (aisiuk/inspect-web-browser-tool) hosted on Docker Hub for web_browser tooling. The implementation defaults to headless mode, so users of the web_browser tooling who are not aware of this PR should see no downstream impact.

Other information:

This PR's implementation makes use of Docker build targets: the user selects a build target in their task's docker-compose file. For example, in the examples/browser/compose.yaml file, the user can choose headful browsing with:

services:
  default:
    build:
      context: ../../src/inspect_ai/tool/_tools/_web_browser/_resources
      dockerfile: Dockerfile
      target: headful
    ports:
      - "127.0.0.1:5900:5900"

If no target is specified, or the target is specified as headless, the Dockerfile will build the headless image. In that case the 127.0.0.1:5900:5900 port mapping is not needed either (it is only required for VNC viewing).

This PR achieves this broadly by:

Replacing the hardcoded Playwright headless flag (previously True) with the "HEADLESS" environment variable.
Setting the "HEADLESS" environment variable in each Docker build according to the target chosen by the user. Docker targets are used so that only the layers necessary for each respective build are included.
Copying the same entrypoint.sh file into the web_browser container irrespective of target; it uses the "HEADLESS" environment variable to decide which actions to take at container start.
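
As a rough illustration of the first step, the flag could be derived from the environment along these lines. This is a minimal sketch, not the PR's actual code: the helper name `headless_from_env` and the set of strings treated as false are assumptions. Only the "HEADLESS" variable name and the default of True come from the description above.

```python
import os
from typing import Mapping, Optional

# Assumption: these spellings all mean "run headful"; the PR itself only
# mentions HEADLESS=False.
_FALSE_VALUES = {"0", "false", "no", "off"}


def headless_from_env(env: Optional[Mapping[str, str]] = None) -> bool:
    """Return the headless flag, defaulting to True when HEADLESS is unset."""
    env = os.environ if env is None else env
    raw = env.get("HEADLESS")
    if raw is None or raw.strip() == "":
        # Unset or empty: keep the original headless behaviour.
        return True
    return raw.strip().lower() not in _FALSE_VALUES


# The result would then be handed to Playwright when launching the browser,
# for example: playwright.chromium.launch(headless=headless_from_env())
```

Defaulting to True when the variable is missing keeps existing users on the headless path unless they opt in.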

Note: When running the test file, a Chromium browser will open and close for the headful tests. If necessary, I'm open to adding decorators to these tests so they are only run when specified. These tests are not located in the tests/ dir though, and won't run when calling make tests.

jjallaire · 2025-01-06T09:33:26Z

Thank you! This looks like an excellent addition. Some questions:

After we publish the new image, am I correct in thinking that target: headful (plus the VNC proxying) in compose.yaml is all a user will need to enable this mode? (i.e. they don't need to actually rebuild the image locally, or do they?) In any case, making sure this works with configuration only seems desirable (but if that's problematic in some fashion, do LMK).
Docker compose can do interpolation of environment variables. I wonder if there is a way we could enable this behavior in the container AND have Inspect make the right call off of a single environment variable (e.g. INSPECT_WEB_BROWSER_TOOL_TARGET=headful). Note that this can be used to forward into the HEADLESS environment variable.
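
The single-variable idea could be sketched roughly as follows. This is an illustration under stated assumptions, not an existing Inspect API: the function name `browser_settings` and the shape of the returned dictionary are hypothetical, while INSPECT_WEB_BROWSER_TOOL_TARGET, HEADLESS, the two target names and the VNC port mapping are taken from this thread.

```python
from typing import Dict, Mapping

VALID_TARGETS = ("headless", "headful")


def browser_settings(env: Mapping[str, str]) -> Dict[str, object]:
    """Derive the build target, HEADLESS value and port mappings from one variable."""
    # Assumption: an unset variable means the default headless target.
    target = env.get("INSPECT_WEB_BROWSER_TOOL_TARGET", "headless").strip().lower()
    if target not in VALID_TARGETS:
        raise ValueError(f"unknown web browser tool target: {target!r}")
    headful = target == "headful"
    return {
        "target": target,
        # Forwarded into the container's HEADLESS environment variable.
        "HEADLESS": "False" if headful else "True",
        # The VNC port mapping is only needed when the browser is headful.
        "ports": ["127.0.0.1:5900:5900"] if headful else [],
    }
```

For example, `browser_settings({"INSPECT_WEB_BROWSER_TOOL_TARGET": "headful"})` would select the headful target and include the VNC port mapping, while an empty environment falls back to headless with no ports.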

Take the suggestions above as really me probing at what the workflow will be for end users. I'm open to anything that makes things as transparent and straightforward as possible.

cc'ing @epatey as well here who is working on implementing a full desktop computer tool (as discussed w/ @jmsdao back in August/September). @epatey The Harmony intelligence folks have also been working on desktop computer tools (although I believe possibly using GCP rather than in a container?).

It would be good to compare notes on how we are setting up and running VNC so it's consistent across our images (@cmathw our current work is based on the Anthropic example so definitely fungible if there is a better way!).

cmathw added 10 commits January 6, 2025 14:28

update playwright headless flag to use env variable

20a96c2

update browser example to use updated Dockerfile

355517d

update Dockerfile in web_browser dir

2967194

add entrypoint script to run virtual display if HEADLESS=False

efc7c4a

add comments

1660aec

remove VNC password requirement

f6f8caf

adopt a layered approach

76370eb

remove redundant env variables and update to layered approach

36e18df

parameterize to include both headless and headful tests

be344a4

ensure playwright headless default to True

619ebda

cmathw mentioned this pull request Jan 6, 2025

Add Headful browsing to Inspect's Web Tooling harmony-intelligence/inspect_ai#1

Closed

5 tasks
