Visible Browser Skill (Model Context Protocol)
Restores the headed "visible browser" capability in Google Antigravity 2.0. Primary method uses direct Python/Playwright scripts for speed and reliability. MCP server tools available as backup.
This skill restores the headed "visible browser" capability that was deprecated and removed in the transition to Antigravity 2.0. It supports two control methods (direct Python scripts as primary, MCP server tools as backup), allowing the agent to launch Chrome, navigate pages, click elements, capture screenshots, and shut down the process tree.
I. Architecture & Setup
- Launch Trigger: The MCP server launches the browser by programmatically connecting to the IDE's fixed Electron debugging port 9000 via CDP, finding the Chrome button a.codicon-chrome in the workbench DOM (workbench.html), and clicking it. This starts headed Chrome on debug port 9222.
- Steering & Control: Playwright connects over CDP to port 9222 to steer the browser pages.
- Direct Python Method (Primary): For maximum speed and reliability, launch the browser by running open_browser_cdp9000.py directly via run_command, then write inline Playwright scripts to steer it. This avoids the MCP server's frequent EOF crashes and preserves persistent cookies correctly.
- MCP Server Method (Backup): If the direct method fails, fall back to the call_mcp_tool wrapper targeting ServerName: "visible_browser" (see Section II).
- Redundancy / Fallbacks: For deeper troubleshooting or alternative architectures, refer to the detailed development and fallback guide in the history archive: visible-browser_skill.md
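The launch trigger described above can be sketched with Playwright roughly as follows. This is an illustration only, not the contents of open_browser_cdp9000.py: the function names and the 127.0.0.1 host are assumptions, and whether Playwright attaches cleanly to the IDE's Electron port depends on the IDE build. Only the port numbers and the a.codicon-chrome selector come from the description above.

```python
import asyncio

IDE_DEBUG_PORT = 9000               # IDE's Electron debugging port (from the text above)
CHROME_BUTTON = "a.codicon-chrome"  # Chrome button in the workbench DOM (from the text above)


def cdp_endpoint(port: int, host: str = "127.0.0.1") -> str:
    """Build the HTTP endpoint URL passed to Playwright's connect_over_cdp."""
    return f"http://{host}:{port}"


async def click_chrome_button() -> bool:
    """Click the IDE's Chrome button; return True if a matching button was found."""
    # Imported lazily so cdp_endpoint() stays usable without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        ide = await p.chromium.connect_over_cdp(cdp_endpoint(IDE_DEBUG_PORT))
        # The workbench may live in any attached page, so check each one.
        for context in ide.contexts:
            for page in context.pages:
                button = page.locator(CHROME_BUTTON)
                if await button.count() > 0:
                    await button.first.click()
                    return True
    return False


# To run against a live IDE: asyncio.run(click_chrome_button())
```

After a successful click, steering moves to the second endpoint, cdp_endpoint(9222).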
II. Available Tools
All tools are called via the call_mcp_tool wrapper targeting ServerName: "visible_browser":
- launch_browser
  - Purpose: Programmatically clicks the IDE Chrome button to launch Chrome, waits for CDP port 9222 to start, and navigates to the starting URL.
  - Arguments: {"url": "<URL>"} (optional, defaults to about:blank).
- navigate
  - Purpose: Navigates the active page to a new URL.
  - Arguments: {"url": "<URL>"}.
- click_element
  - Purpose: Clicks a visible DOM element using Playwright selector logic.
  - Arguments: {"selector": "<SELECTOR>"} (e.g., text="I understand" or button.submit).
- capture_screenshot
  - Purpose: Captures a screenshot of the active page and saves it directly to the project directory.
  - Arguments: {"filename": "<FILENAME>"} (e.g., proof_screenshot.png).
- shutdown_browser
  - Purpose: Closes the Playwright connections and forcefully kills all running chrome.exe process trees.
  - Arguments: {}.
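The five tools share one payload shape, so a small builder can catch typos before a call is issued. The sketch below is hypothetical: build_tool_call and the TOOL_ARGUMENTS table are not part of the skill, and treating every argument except launch_browser's url as required is an assumption, since the list above only marks that one as optional.

```python
import json

SERVER_NAME = "visible_browser"

# Tool name -> (required argument names, optional argument names).
# Assumption: only launch_browser's "url" is optional, as the list above states.
TOOL_ARGUMENTS = {
    "launch_browser": ((), ("url",)),
    "navigate": (("url",), ()),
    "click_element": (("selector",), ()),
    "capture_screenshot": (("filename",), ()),
    "shutdown_browser": ((), ()),
}


def build_tool_call(tool_name: str, **arguments: str) -> dict:
    """Return a call_mcp_tool payload, rejecting unknown tools or arguments."""
    if tool_name not in TOOL_ARGUMENTS:
        raise ValueError(f"unknown tool: {tool_name}")
    required, optional = TOOL_ARGUMENTS[tool_name]
    missing = [name for name in required if name not in arguments]
    unexpected = [name for name in arguments if name not in required + optional]
    if missing or unexpected:
        raise ValueError(f"{tool_name}: missing={missing} unexpected={unexpected}")
    return {"ServerName": SERVER_NAME, "ToolName": tool_name, "Arguments": arguments}


# Prints a payload in the same shape as the examples in Section III.
print(json.dumps(build_tool_call("navigate", url="https://www.wikipedia.org"), indent=2))
```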
III. Standard Usage Workflow
Whenever you need to perform headed web automation or visual validation, follow this sequence.
Step 1: Launch the Browser
Primary method (direct script): Run the launch script directly via run_command:
python open_browser_cdp9000.py
Then write a small inline Python script to navigate, and execute it via run_command:
import asyncio
from playwright.async_api import async_playwright


async def main():
    async with async_playwright() as p:
        # Attach to the headed Chrome instance on its CDP debug port
        browser = await p.chromium.connect_over_cdp("http://127.0.0.1:9222")
        context = browser.contexts[0] if browser.contexts else await browser.new_context()
        # Use the most recent tab, or open one if the context has none
        page = context.pages[-1] if context.pages else await context.new_page()
        await page.goto("https://www.wikipedia.org")
        print(await page.title())


asyncio.run(main())
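Because the launch script and the steering script run as separate commands, the steering script can fail if it connects before Chrome is listening. A minimal readiness check, assuming Chrome's standard /json/version HTTP endpoint on the debug port, might look like this; the function names, attempt count, and delay are illustrative choices, not part of the skill.

```python
import http.client
import json
import time
import urllib.error
import urllib.request

# Bypass any system proxy so requests to 127.0.0.1 go straight to Chrome.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def cdp_version(port: int = 9222, timeout: float = 1.0):
    """Return the parsed /json/version document, or None if the port is not answering."""
    url = f"http://127.0.0.1:{port}/json/version"
    try:
        with _OPENER.open(url, timeout=timeout) as response:
            return json.load(response)
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        return None


def wait_for_cdp(port: int = 9222, attempts: int = 20, delay: float = 0.5) -> bool:
    """Poll the debug port until it answers or the attempts run out."""
    for _ in range(attempts):
        if cdp_version(port) is not None:
            return True
        time.sleep(delay)
    return False
```

Calling wait_for_cdp() before connect_over_cdp turns a race into an explicit True/False result that can be reported or retried.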
Backup method (MCP server): If the direct method fails, use the MCP tool:
{
  "ServerName": "visible_browser",
  "ToolName": "launch_browser",
  "Arguments": {
    "url": "https://www.wikipedia.org"
  }
}
Step 2: Handle Cookie Consent & Pop-ups
Identify any obstructing overlays. Always target them by their specific element or text content rather than clicking generically:
{
  "ServerName": "visible_browser",
  "ToolName": "click_element",
  "Arguments": {
    "selector": "button:has-text('Accept All')"
  }
}
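With the direct Python method, the same step can be written against a Playwright page. The sketch below is one possible shape, not the skill's own code: dismiss_overlay is a hypothetical helper that clicks only selectors you list explicitly and reports which one matched, in keeping with the rule against generic clicks.

```python
import asyncio


async def dismiss_overlay(page, selectors, timeout_ms: int = 2000):
    """Click the first listed selector that is currently visible.

    Returns the selector that was clicked, or None if none were visible.
    """
    for selector in selectors:
        target = page.locator(selector).first
        if await target.is_visible():
            await target.click(timeout=timeout_ms)
            return selector
    return None


async def main():
    # Imported lazily so dismiss_overlay can be exercised without Playwright.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.connect_over_cdp("http://127.0.0.1:9222")
        page = browser.contexts[0].pages[-1]
        clicked = await dismiss_overlay(page, ["button:has-text('Accept All')"])
        print("clicked:", clicked)


# To run against the live browser: asyncio.run(main())
```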
Step 3: Visual Verification (Screenshot Proofs)
Per the strict anti-cheating policy, always capture a screenshot after major actions or transitions to confirm success:
{
  "ServerName": "visible_browser",
  "ToolName": "capture_screenshot",
  "Arguments": {
    "filename": "wikipedia_homepage.png"
  }
}
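A screenshot only counts as proof if the file was actually written. A small check, assuming the tool writes PNG files as the .png filenames suggest, can confirm that before the result is reported; is_valid_screenshot and the 1 KiB minimum size are illustrative assumptions.

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first eight bytes of every PNG file


def is_valid_screenshot(path: str, min_bytes: int = 1024) -> bool:
    """Return True if path exists, is at least min_bytes long, and starts like a PNG."""
    file = Path(path)
    if not file.is_file() or file.stat().st_size < min_bytes:
        return False
    with file.open("rb") as handle:
        return handle.read(len(PNG_SIGNATURE)) == PNG_SIGNATURE
```

For example, is_valid_screenshot("wikipedia_homepage.png") returning False is a signal to retry the capture rather than claim success.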
Step 3.5: Full-Page ("Long") Screenshots
If you need to capture a "long picture of the entire website" (a full-page top-to-bottom screenshot) because the content extends beyond the fold, you MUST bypass the basic MCP tool and instead write a short Python script using Playwright to connect to the browser and capture it with full_page=True.
import asyncio
from playwright.async_api import async_playwright


async def main():
    async with async_playwright() as p:
        # Connect to the visible browser CDP port
        browser = await p.chromium.connect_over_cdp("http://127.0.0.1:9222")
        context = browser.contexts[0] if browser.contexts else await browser.new_context()
        # Target the active tab, or open one if the context has none
        page = context.pages[-1] if context.pages else await context.new_page()
        # Capture the entire document, top to bottom
        await page.screenshot(path="full_website_capture.png", full_page=True)


asyncio.run(main())
(Note: If using raw Chrome DevTools Protocol commands instead of Playwright, use Page.captureScreenshot with the "captureBeyondViewport": true parameter.)
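The raw-CDP route mentioned in the note can be sketched through a Playwright CDP session. Two points here are assumptions rather than statements from this skill: the helper names are hypothetical, and the sketch passes an explicit clip built from Page.getLayoutMetrics, on the understanding that captureBeyondViewport permits capture outside the viewport but does not by itself set the captured region to the whole document.

```python
import asyncio
import base64
from pathlib import Path


def full_page_clip(metrics: dict) -> dict:
    """Build a clip rectangle covering the whole document from Page.getLayoutMetrics."""
    size = metrics["cssContentSize"]
    return {"x": 0, "y": 0, "width": size["width"], "height": size["height"], "scale": 1}


def save_cdp_screenshot(result: dict, path: str) -> int:
    """Decode the base64 "data" field of a capture result, write it, return the byte count."""
    image = base64.b64decode(result["data"])
    Path(path).write_bytes(image)
    return len(image)


async def capture_full_page(path: str = "full_website_capture.png") -> int:
    # Imported lazily so the two helpers above work without Playwright.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.connect_over_cdp("http://127.0.0.1:9222")
        context = browser.contexts[0]
        page = context.pages[-1]
        session = await context.new_cdp_session(page)
        metrics = await session.send("Page.getLayoutMetrics")
        result = await session.send(
            "Page.captureScreenshot",
            {"format": "png", "captureBeyondViewport": True, "clip": full_page_clip(metrics)},
        )
        return save_cdp_screenshot(result, path)


# To run against the live browser: asyncio.run(capture_full_page())
```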
Step 4: Page Navigation
Change URLs on the active browser:
{
  "ServerName": "visible_browser",
  "ToolName": "navigate",
  "Arguments": {
    "url": "https://en.wikipedia.org/wiki/Main_Page"
  }
}
Step 5: Clean Shutdown
Once all tasks are completed, or if you encounter a critical automation error, always shut down the browser to free ports and resources:
{
  "ServerName": "visible_browser",
  "ToolName": "shutdown_browser",
  "Arguments": {}
}
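Step 5 asks for a shutdown on both the success path and the error path. In a script, one way to guarantee that is a context manager. In this sketch, launch and shutdown are hypothetical callables standing in for whichever mechanism issues launch_browser and shutdown_browser.

```python
from contextlib import contextmanager


@contextmanager
def browser_session(launch, shutdown):
    """Run launch() on entry and always run shutdown() on exit."""
    try:
        launch()
        yield
    finally:
        # Runs after normal completion, after an error in the body,
        # and after a failed launch.
        shutdown()


# Illustration with stand-in callables that only record what happened.
events = []
with browser_session(lambda: events.append("launch"), lambda: events.append("shutdown")):
    events.append("work")
print(events)  # prints ['launch', 'work', 'shutdown']
```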
IV. Environmental Troubleshooting
- WinNat Port Conflicts: If the browser launches but the tool times out attempting to connect to CDP port 9222, check for Windows NAT port exclusions. Run net stop winnat in an elevated shell to release port reservations.
- Zombie Locks: If the browser fails to start and you need to clear locks, NEVER run taskkill /IM chrome.exe, as this will kill the user's personal browser. Instead, selectively kill only the automated instance using PowerShell:
  Get-CimInstance Win32_Process -Filter "Name = 'chrome.exe'" | Where-Object CommandLine -match "remote-debugging-port=9222" | Invoke-CimMethod -MethodName Terminate
- Crash Screen Freeze ("Restore Pages"): If the browser launches but fails to navigate because it's frozen on a Chrome crash popup, Playwright cannot attach to it. To bypass this, manually inject a new blank page via the CDP endpoint to give Playwright a valid target:
  python -c "import urllib.request; urllib.request.urlopen(urllib.request.Request('http://127.0.0.1:9222/json/new', method='PUT'))"
  (Note: The MCP server's launch_browser tool now does this automatically.)
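Before stopping WinNat, it can help to confirm that port 9222 really falls inside a reserved range. Windows lists the reservations with netsh interface ipv4 show excludedportrange protocol=tcp. The sketch below parses that command's text output; the helper names are hypothetical and the two-column table layout it expects is an assumption about the output format.

```python
import re


def parse_excluded_ranges(netsh_output: str) -> list:
    """Extract (start, end) port pairs from the excluded-port-range table."""
    ranges = []
    for line in netsh_output.splitlines():
        # Data rows begin with two integers; header and separator rows do not.
        match = re.match(r"^\s*(\d+)\s+(\d+)\b", line)
        if match:
            ranges.append((int(match.group(1)), int(match.group(2))))
    return ranges


def is_port_excluded(port: int, netsh_output: str) -> bool:
    """Return True if port lies inside any excluded range in the output."""
    return any(start <= port <= end for start, end in parse_excluded_ranges(netsh_output))


# On Windows the text could be obtained with, for example:
#   subprocess.run(["netsh", "interface", "ipv4", "show", "excludedportrange",
#                   "protocol=tcp"], capture_output=True, text=True).stdout
```

If is_port_excluded(9222, output) is False, the timeout probably has another cause and stopping WinNat is unlikely to help.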
V. Behavioral Guardrails
- Do not prematurely declare the browser broken: NEVER tell the user that the visible browser is broken or instruct them to manually perform actions (like copy-pasting URLs) just because a single command failed. You must first make a genuine, exhaustive attempt to debug the issue (e.g., verifying you actually called launch_browser, checking for Zombie Locks, and actually trying to reopen the browser). Do not take lazy shortcuts (like running Start-Process) to bypass this tool without actually trying to fix it properly.