Web Browsing Policy
Defines the fallback chain, central directory, and rules for accessing web content in Antigravity.
I. Browsing Methods
Always prefer the simplest, fastest tool that works. The five methods, in order of preference:
- Internal browser: Fast, invisible, text-only HTTP fetches (no JavaScript) via the read_url_content tool. Always try this first. Even when your end goal is a binary file (video, image, etc.), use this first to fetch the page HTML, which often contains direct download links you can then grab with Invoke-WebRequest.
- Dev Tools browser: Advanced background Chromium instance controlled via the chrome_devtools MCP server. Use this for deep debugging, intercepting network requests, reading console logs, running Lighthouse audits, or evaluating raw JavaScript. Refer to dev_tools_browser skill.
- Visible browser: A headed Chromium instance controlled via local MCP tools (launch_browser, navigate, click_element, capture_screenshot, shutdown_browser). Use this when the Dev Tools browser is blocked by bot detection (like Cloudflare), or when you need to physically click elements to bypass anti-bot challenges. Refer to visible_browser skill.
- Edge DOM browser: Direct DOM interaction with the user's personal Edge browser via the Edge DOM Bridge extension. Use this if you need to inherit the user's personal active session cookies and logins. Permission required: either the user explicitly requests Edge DOM, or you must ask the user for permission before using it. Refer to edge_dom skill.
- Point and click browsing: Vision-based coordinate clicking using Win32 Python scripts. Use this as a last resort when you have NO DOM access. Refer to take_picture_and_click skill.
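As an illustration of the "fetch the page HTML first" advice for the internal browser, the following is a minimal sketch of pulling direct download links out of fetched HTML. The names find_media_links, MediaLinkParser, and MEDIA_EXTENSIONS are hypothetical and not part of any Antigravity tool; the extension list is an assumption to adjust per task.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse

# Assumption: file extensions that count as "direct download" targets.
MEDIA_EXTENSIONS = (".mp4", ".webm", ".mp3", ".jpg", ".jpeg", ".png", ".gif", ".pdf", ".zip")


class MediaLinkParser(HTMLParser):
    """Collects href/src values whose URL path ends in a media-like extension."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                # Resolve relative links against the page URL.
                absolute = urljoin(self.base_url, value)
                # Compare on the path only, so query strings do not hide the extension.
                path = urlparse(absolute).path.lower()
                if path.endswith(MEDIA_EXTENSIONS) and absolute not in self.links:
                    self.links.append(absolute)


def find_media_links(html, base_url):
    """Return absolute media URLs found in the HTML, in document order."""
    parser = MediaLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

The returned URLs can then be passed to a download command such as Invoke-WebRequest. Pages that build their links with JavaScript will not yield anything here, which is a signal to move on to the Dev Tools browser.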
II. Fallback Chain & Troubleshooting
1. Fallback Chain
- Try the Internal Browser first for static HTML.
- If you need JS execution, network interception, or deep inspection, use the Dev Tools Browser.
- If the Dev Tools Browser is blocked by anti-bot walls (Cloudflare/DataDome), switch to the Visible Browser for stealth physical clicks.
- If you need the user's personal logged-in sessions, use the Edge DOM Browser (after getting user permission).
- If programmatic DOM access fails entirely, fall back to Point and Click Browsing.
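The chain above can be read as a small decision function. The sketch below is only one way to encode it; the Needs fields and the returned labels are hypothetical names chosen for illustration, not identifiers used by any real tool.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    """Hypothetical description of what a browsing task requires."""
    needs_javascript: bool = False    # JS execution, network interception, deep inspection
    bot_wall_detected: bool = False   # Dev Tools browser blocked by Cloudflare/DataDome
    needs_user_session: bool = False  # the user's personal logged-in sessions are required
    user_permission: bool = False     # the user requested or approved Edge DOM
    dom_access_failed: bool = False   # programmatic DOM access fails entirely


def choose_browser(needs: Needs) -> str:
    """Pick the simplest method that satisfies the task, following the fallback chain."""
    if needs.dom_access_failed:
        return "point_and_click"
    if needs.needs_user_session:
        # Edge DOM always requires permission first.
        return "edge_dom" if needs.user_permission else "ask_user_for_permission"
    if needs.bot_wall_detected:
        return "visible_browser"
    if needs.needs_javascript:
        return "dev_tools_browser"
    return "internal_browser"
```

For example, choose_browser(Needs(needs_javascript=True, bot_wall_detected=True)) selects the visible browser, since a bot wall makes the Dev Tools browser unusable even though JavaScript is needed.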
2. Browser Troubleshooting
For specific browser connection issues, crashes, or timeouts, refer directly to the respective skill files: dev_tools_browser, visible_browser, edge_dom, and take_picture_and_click.
3. Hosts File Bypass
If a domain is blocked by the hosts file:
- Delete it:
  Start-Process powershell -Verb RunAs -ArgumentList '-Command "Remove-Item C:\Windows\System32\drivers\etc\hosts -Force"' -Wait -WindowStyle Hidden
- Browse.
- Remind the user: "🚨🚨🚨 HOSTS FILE DISABLED: Run C:\Dropbox\spreads\spr-other\copy hosts.bat to restore blocks. 🚨🚨🚨"
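Before deleting anything, it may be worth confirming that the hosts file is really what blocks the domain. The sketch below assumes that blocks are written as entries mapping a hostname to a loopback or null address, which is the common convention but is not stated in this policy; blocked_domains and is_blocked are hypothetical helper names.

```python
def blocked_domains(hosts_text):
    """Return hostnames that the hosts file maps to a loopback or null address."""
    # Assumption: these addresses are what the blocklist uses as a sink.
    sink_addresses = {"0.0.0.0", "127.0.0.1", "::1", "::"}
    blocked = set()
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split "address name1 name2 ..." on whitespace.
        entry = line.split("#", 1)[0].split()
        if len(entry) >= 2 and entry[0] in sink_addresses:
            blocked.update(name.lower() for name in entry[1:])
    # The standard localhost entry is not a block.
    blocked.discard("localhost")
    return blocked


def is_blocked(hosts_text, domain):
    """Case-insensitive check for an exact hostname match."""
    return domain.lower() in blocked_domains(hosts_text)
```

The text would come from reading C:\Windows\System32\drivers\etc\hosts. The match is exact, so x.com and www.x.com are treated as separate entries, just as the hosts file itself treats them.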
III. Website-Specific Skills
[!CAUTION] WEBSITE LAYOUT MISMATCHES & AUDIO ALERTS: If you open a website-specific skill and find that the described selectors, elements, or flows do not match what is actually visible on the website (e.g., the layout has changed, buttons have been renamed/restructured, or options have moved into nested submenus), this means the website has updated.
You MUST:
- Immediately stop all execution.
- Run the local audio alert script: python global_workflows/audio_alert.py "The website layout has changed."
- Notify the user in bold text that the site has updated and you need to investigate from scratch.
- Investigate the DOM from scratch using search scripts to map out the new selectors, update the website's skill file, and proceed.
- Website-specific skill files contain detailed steps for common tasks on complex sites:
- gemini.google.com: gemini_google_com_browsing skill
- notebooklm.google.com: notebooklm_google_com_browsing skill
- search.google.com/search-console: search_console_google_com_browsing skill
- westlaw.com: westlaw_com_browsing skill
- x.com: x_com_browsing skill
- youtube.com: youtube_com_browsing skill
- If you learn something new when browsing one of the above sites, update the site's skill file.
- When browsing a new complex website, prepare a skill file as you learn and add it to the list above.
- LIBERALLY TAKE SCREENSHOTS: When using the visible browser, Edge DOM browser, or point and click browsing, you MUST liberally take pictures of the screen. Never feel like you are taking too many screenshots. Take a picture whenever you perform an action or need to verify page state. For Edge DOM: if Edge is not in the foreground when capturing, run python global_workflows/audio_alert.py "Can you look at this?" to alert the user.
IV. General Rules & Workarounds
- Visual Verification: To visually check a website's layout or confirm that an action was accepted, you MUST use the visible browser, Edge DOM browser, or point and click browsing. Do not use the internal browser for visual verification.
- Handling Native File Upload Pop-ups: The visible browser and Edge DOM Bridge are DOM-based and blind to native Windows file dialogs. When clicking "Upload" hangs (because the browser is waiting on the OS-level file explorer pop-up):
- Use the take_picture_and_click skill to take a picture of the screen, locate the file explorer window, and type/click to select the file.
- URL Safety: NEVER parrot URLs generated by internal subagents — they hallucinate. Only pull URLs directly from the browser.
- Security & Login: NEVER type, guess, or handle the user's passwords or logins. If a website requires a new login, STOP immediately and ask the user to log in manually.
- Chrome Extensions popup (Visible Browser Only): To click inside a Chrome Extension popup, navigate to chrome-extension://EXTENSION_ID/index.html to force a full tab. This works ONLY in the Visible Browser; non-HTTPS URLs crash the Edge DOM bridge.
V. Multi-Tab & Cross-Browser Discovery
When performing browser automation or troubleshooting, you may need to scan all open browser windows (including the automated visible Chrome, personal Chrome, Opera, Edge, or Firefox sessions) to find specific tabs, active URLs, or check their contents.
Always prioritize the fastest, least intrusive method first:
1. HTTP CDP Targets Query (Fastest - Automated Chrome Only)
If you only need to check the tabs open inside the automated Chrome instance (visible browser):
- Run the Python script to fetch from the local CDP JSON endpoint: python skills/web_browsing_policy/scripts/list_automated_tabs.py
- This instantly returns all open pages/tabs in the automated Chrome instance without any cached delay and without disrupting the user.
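For orientation, a query against the CDP JSON endpoint might look like the sketch below. This is not the contents of list_automated_tabs.py, which is not shown here. It assumes the automated Chrome was started with --remote-debugging-port=9222; the port actually used by the visible browser may differ.

```python
import json
from urllib.request import urlopen

# Assumption: the debugging port is 9222; change this to match the real launch flags.
CDP_TARGETS_URL = "http://127.0.0.1:9222/json"


def parse_page_targets(raw_json):
    """Keep only targets of type 'page' and return (title, url) pairs."""
    targets = json.loads(raw_json)
    return [(t.get("title", ""), t.get("url", "")) for t in targets if t.get("type") == "page"]


def list_automated_tabs(endpoint=CDP_TARGETS_URL, timeout=2.0):
    """Fetch the target list from the local CDP HTTP endpoint and return open tabs."""
    with urlopen(endpoint, timeout=timeout) as response:
        return parse_page_targets(response.read().decode("utf-8"))
```

Filtering on type "page" leaves out service workers, extension background pages, and other non-tab targets that the endpoint also reports. If the request fails with a connection error, the debugging port is probably closed or locked, and the next method applies.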
2. Programmatic Tab Survey (Non-Intrusive - Cross-Browser)
If you need to search for tabs across all browser windows (personal Chrome, Opera, Edge, Firefox, etc.) without bringing them to the foreground or taking screenshots:
- Run the Windows UI Automation script: powershell -ExecutionPolicy Bypass -File skills/web_browsing_policy/scripts/list_all_tabs_uia.ps1
- This uses the .NET UI Automation API to silently enumerate and list all TabItem objects inside any active browser window.
3. Visual Tab Cycling & Screenshots (Headed Windows - Foolproof Fallback)
If the above methods fail (e.g., debugging port is locked or UI Automation elements are not fully exposed) or if you need to visually read page content when DOM access is blocked:
- Run the Python screenshot cycler script: python skills/web_browsing_policy/scripts/screenshot_all_tabs.py
- Behavior: This script programmatically restores each browser window, brings it to the foreground, captures a full-screen screenshot (saved to %TEMP%), simulates Ctrl + Tab to cycle tabs, and returns focus to the Antigravity IDE when done.
- Important: This script will actively move windows and capture the screen, so notify the user before running it.