Web Browsing Policy

This skill defines the browsing methods and their fallback chain, the directory of website-specific skills, and the general rules for accessing web content in Antigravity.

I. Browsing Methods

Always prefer the simplest, fastest tool that works. The six methods, in order of preference:

Internal browser: Fast, invisible, text-only HTTP fetches (no JavaScript) via the read_url_content tool. Always try this first. Even when your end goal is a binary file (video, image, etc.), use this first to fetch the page HTML — it often contains direct download links you can then grab with Invoke-WebRequest.
Dev Tools browser: Advanced background Chromium instance controlled via the chrome_devtools MCP server. Use this for deep debugging, intercepting network requests, reading console logs, running Lighthouse audits, or evaluating raw JavaScript. NOTE: Cannot be used to log into sites (e.g., Google) due to bot detection blocks. Refer to dev_tools_browser skill.
Subagent Browser: A headed, visible Chromium instance with persistent cookies and full bot-detection bypass. Use this when the Dev Tools browser is blocked by anti-bot walls, when you need to maintain logged-in sessions across invocations, or when you need autonomous multi-step browsing with automatic video recording. You give it a task prompt (not individual tool calls) and it operates autonomously. Refer to subagent_browser skill.
Playwright browser: A headed Chromium instance controlled via local MCP tools (launch_browser, navigate, click_element, capture_screenshot, shutdown_browser). Use this when you need granular step-by-step MCP tool control, or when you need to navigate chrome-extension:// URLs (which the Subagent Browser cannot do). Refer to playwright_browser skill.
Edge DOM browser: Direct DOM interaction with the user's personal Edge browser via the Edge DOM Bridge extension. Use this if you need to inherit the user's personal active session cookies and logins. Permission required: Either the user explicitly requests Edge DOM, or you must ask the user for permission before using it. Refer to edge_dom skill.
Point and click browsing: Vision-based coordinate clicking using Win32 Python scripts. Use this as a last resort when you have NO DOM access. Refer to take_picture_and_click skill.

II. Fallback Chain & Troubleshooting

1. Fallback Chain

Try the Internal Browser first for static HTML.
If you need JS execution, network interception, or deep inspection, use the Dev Tools Browser.
If the Dev Tools Browser is blocked by anti-bot walls (Cloudflare/DataDome), or you need persistent cookies/logins, switch to the Subagent Browser.
If the Subagent Browser is insufficient (e.g., you need granular MCP tool control or chrome-extension:// URL access), use the Playwright Browser.
If you need the user's personal logged-in sessions, use the Edge DOM Browser (after getting user permission).
If programmatic DOM access fails entirely, fall back to Point and Click Browsing.
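
The chain above can be read as a small decision function. The following is a minimal sketch of one such reading, not part of any Antigravity tool: the Needs fields, the choose_method name, and the returned labels are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Needs:
    """Hypothetical description of what a browsing task requires."""
    javascript: bool = False               # JS execution, network interception, deep inspection
    blocked_by_antibot: bool = False       # Dev Tools browser hit Cloudflare/DataDome
    persistent_login: bool = False         # cookies/logins must survive across invocations
    granular_tool_control: bool = False    # step-by-step MCP tool calls are required
    extension_url: bool = False            # must open a chrome-extension:// URL
    user_personal_session: bool = False    # needs the user's own Edge cookies/logins
    user_permission_granted: bool = False  # user has approved Edge DOM use
    dom_unavailable: bool = False          # programmatic DOM access failed entirely


def choose_method(needs: Needs) -> str:
    """Return the least heavyweight method that satisfies the stated needs."""
    if needs.dom_unavailable:
        return "point_and_click"
    if needs.user_personal_session:
        # Edge DOM always requires explicit permission from the user first.
        if not needs.user_permission_granted:
            return "ask_user_permission"
        return "edge_dom"
    if needs.granular_tool_control or needs.extension_url:
        return "playwright"
    if needs.blocked_by_antibot or needs.persistent_login:
        return "subagent"
    if needs.javascript:
        return "dev_tools"
    return "internal"  # static HTML: always the first thing to try
```

The checks run from the most specific requirement to the least, so a task with no special needs falls through to the Internal Browser, matching the "simplest tool that works" rule.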

2. Browser Troubleshooting

For browser-specific connection issues, crashes, or timeouts, refer directly to the relevant browser's skill file (named in Section I).

3. Hosts File Bypass

If a domain is blocked by the hosts file:

Delete it: Start-Process powershell -Verb RunAs -ArgumentList '-Command "Remove-Item C:\Windows\System32\drivers\etc\hosts -Force"' -Wait -WindowStyle Hidden
Browse.
Remind the user: "🚨🚨🚨 HOSTS FILE DISABLED — Run C:\Dropbox\spreads\spr-other\copy hosts.bat to restore blocks. 🚨🚨🚨"
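
Before applying the bypass, it may help to confirm that the hosts file is really what blocks the domain. The sketch below parses hosts-file text and reports whether a name is mapped to a loopback or null address. It assumes blocks are written that way (for example 0.0.0.0 or 127.0.0.1); the function names are hypothetical and not part of any existing script.

```python
# Addresses assumed to indicate a block entry (an assumption, not a rule from this policy).
BLOCK_ADDRESSES = {"0.0.0.0", "127.0.0.1", "::", "::1"}


def blocked_domains(hosts_text: str) -> set[str]:
    """Collect host names that the given hosts-file text maps to a block address."""
    blocked: set[str] = set()
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into address + host names.
        entry = line.split("#", 1)[0].split()
        if len(entry) < 2:
            continue
        address, names = entry[0], entry[1:]
        if address in BLOCK_ADDRESSES:
            blocked.update(name.lower() for name in names)
    blocked.discard("localhost")  # the standard loopback entry is not a block
    return blocked


def is_blocked(domain: str, hosts_text: str) -> bool:
    """Return True if the exact domain appears as a blocked entry."""
    return domain.lower() in blocked_domains(hosts_text)
```

To use it, read C:\Windows\System32\drivers\etc\hosts as text and pass the contents to is_blocked together with the target domain. Matching is exact, so www.example.com and example.com are treated as separate entries, as they are in the hosts file itself.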

III. Website-Specific Skills

[!IMPORTANT] EXPLICIT OVERRIDES: If a target website has a corresponding skill listed below, the instructions within that website-specific skill override this general web browsing policy in the event of any conflict. If there is no conflict, you must follow both the general policy and the specific skill.

[!CAUTION] WEBSITE LAYOUT MISMATCHES & AUDIO ALERTS: If you open a website-specific skill and find that the described selectors, elements, or flows do not match what is actually visible on the website (e.g., the layout has changed, buttons have been renamed or restructured, or options have moved into nested submenus), the website has been updated since the skill was written.

You MUST:

Immediately stop all execution.

Run the local audio alert script: python global_workflows/audio_alert.py "The website layout has changed."

Notify the user in bold text that the site has updated and you need to investigate from scratch.

Investigate the DOM from scratch using search scripts to map out the new selectors, update the website's skill file, and proceed.

Website-specific skill files contain detailed steps for common tasks on complex sites:
- gemini.google.com (Deep Think / Deep Research): gemini_google_com skill
- notebooklm.google.com: notebooklm_google_com_browsing skill
- search.google.com/search-console: search_console_google_com_browsing skill
- westlaw.com: westlaw_com_browsing skill
- x.com: x_com_browsing skill
- youtube.com: youtube_com_browsing skill
If you learn something new when browsing one of the above sites, update the site's skill file.
When browsing a new complex website, prepare a skill file as you learn and add it to the list above.
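
Checking whether a target URL has a website-specific skill can be sketched as a simple lookup. The table below mirrors the list above; the SITE_SKILLS and skill_for_url names are hypothetical, and matching subdomains by suffix is an assumption made for this sketch rather than something the policy specifies.

```python
from __future__ import annotations

from urllib.parse import urlsplit

# Site (optionally with a path prefix) -> skill name, copied from the list above.
SITE_SKILLS = {
    "gemini.google.com": "gemini_google_com",
    "notebooklm.google.com": "notebooklm_google_com_browsing",
    "search.google.com/search-console": "search_console_google_com_browsing",
    "westlaw.com": "westlaw_com_browsing",
    "x.com": "x_com_browsing",
    "youtube.com": "youtube_com_browsing",
}


def skill_for_url(url: str) -> str | None:
    """Return the website-specific skill for a URL, or None if there is none."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    for site, skill in SITE_SKILLS.items():
        domain, _, path_prefix = site.partition("/")
        # Match the domain itself or any subdomain (e.g. www.youtube.com).
        host_matches = host == domain or host.endswith("." + domain)
        # Entries with a path (search-console) also require that path prefix.
        path_matches = parts.path.startswith("/" + path_prefix) if path_prefix else True
        if host_matches and path_matches:
            return skill
    return None
```

A return value of None means only the general policy applies. Requiring a leading dot on subdomain matches keeps a host such as netflix.com from matching the x.com entry.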

IV. General Rules & Workarounds

Visual Verification: To visually check a website's layout or confirm that an action was accepted, you MUST use the subagent browser, playwright browser, Edge DOM browser, or point and click browsing. Do not use the internal browser for visual verification.
Handling Native File Upload Pop-ups: The Playwright browser and the Edge DOM Bridge are DOM-based and cannot see native Windows file dialogs. If clicking "Upload" hangs (because the browser is waiting on the OS-level file explorer pop-up), use the take_picture_and_click skill to capture the screen, locate the file explorer window, and type or click to select the file.
URL Safety: NEVER parrot URLs generated by internal subagents — they hallucinate. Only pull URLs directly from the browser.
Security & Login: NEVER type, guess, or handle the user's passwords or logins. If a website requires a new login, STOP immediately and ask the user to log in manually.
Chrome Extension Popups (Playwright Browser Only): To click inside a Chrome Extension popup, navigate to chrome-extension://EXTENSION_ID/index.html to force the popup to open as a full tab. This works ONLY in the Playwright Browser; non-HTTPS URLs crash the Edge DOM Bridge.

V. Multi-Tab & Cross-Browser Discovery

When performing browser automation or troubleshooting, you may need to scan all open browser windows (including the automated visible Chrome, personal Chrome, Opera, Edge, or Firefox sessions) to find specific tabs, active URLs, or check their contents.

Always try the fastest, least intrusive method first:

1. HTTP CDP Targets Query (Fastest - Automated Chrome Only)

If you only need to check the tabs open inside the automated Chrome instance (playwright browser):

Run the Python script to fetch from the local CDP JSON endpoint: python skills/web_browsing_policy/scripts/list_automated_tabs.py
This returns all open pages/tabs in the automated Chrome instance immediately, with no caching delay and without disrupting the user.
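
For orientation, a CDP targets query of this kind can be sketched as follows. This is not the contents of list_automated_tabs.py. It assumes the automated Chrome was started with remote debugging enabled, and the port 9222 and the function names are assumptions made for illustration. The /json/list HTTP endpoint returns a JSON array of targets, each with fields such as type, title, and url.

```python
import json
from urllib.request import urlopen


def page_targets(targets_json: str) -> list[dict[str, str]]:
    """Keep only real tabs (type == "page") from a CDP /json/list response body."""
    targets = json.loads(targets_json)
    return [
        {"title": target.get("title", ""), "url": target.get("url", "")}
        for target in targets
        if target.get("type") == "page"
    ]


def fetch_page_targets(port: int = 9222, timeout: float = 2.0) -> list[dict[str, str]]:
    """Query the local debugging endpoint; the port is an assumed default."""
    with urlopen(f"http://127.0.0.1:{port}/json/list", timeout=timeout) as response:
        return page_targets(response.read().decode("utf-8"))
```

Filtering on type drops service workers, extension background pages, and similar targets, so the result lists only tabs a user would recognize.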

2. Programmatic Tab Survey (Non-Intrusive - Cross-Browser)

If you need to search for tabs across all browser windows (personal Chrome, Opera, Edge, Firefox, etc.) without bringing them to the foreground or taking screenshots:

Run the Windows UI Automation script: powershell -ExecutionPolicy Bypass -File skills/web_browsing_policy/scripts/list_all_tabs_uia.ps1
This uses the .NET UI Automation API to silently enumerate and list all TabItem objects inside any active browser window.

3. Visual Tab Cycling & Screenshots (Headed Windows - Foolproof Fallback)

If the above methods fail (e.g., the debugging port is locked or UI Automation elements are not fully exposed), or if you need to visually read page content when DOM access is blocked:

Run the Python screenshot cycler script: python skills/web_browsing_policy/scripts/screenshot_all_tabs.py
Behavior: This script programmatically restores each browser window, brings it to the foreground, captures a full-screen screenshot (saved to %TEMP%), simulates Ctrl + Tab to cycle tabs, and returns focus to the Antigravity IDE when done.
Important: This script will actively move windows and capture the screen, so notify the user before running it.