External Tools Skill

How to open, read, or process file types and media that you can't natively handle. Check here first before searching for new tools.

External Tools for Unsupported File Types & Media

When the user asks you to open or process something you can't handle natively, check this workflow first. If the file type or media isn't listed below, search the web for a tool, test it, and add it to this file for future use.


PDF Files

Your native view_file tool does NOT support PDFs (it rejects application/pdf mime type). Your browser also cannot access local file:/// URLs. Do NOT waste time attempting either of these.

  1. Check that Python is available:
python -c "print('available')"
  2. Install PyMuPDF (if not already installed):
pip install pymupdf
  3. Extract the text from the PDF and save it as a .txt file in the scratch folder:
python -c "import fitz; doc=fitz.open(r'<PDF_PATH>'); text=''.join([p.get_text() for p in doc]); f=open(r'C:\Users\user\.gemini\antigravity\scratch\<FILENAME>.txt','w',encoding='utf-8'); f.write(text); f.close(); print(f'Extracted {len(text)} characters from {len(doc)} pages')"

Replace <PDF_PATH> with the actual path to the PDF and <FILENAME> with a descriptive name.

  4. Read the resulting .txt file using view_file. Now you can work with the full content.

Notes:

  • This works because PyMuPDF (fitz) extracts all text layers from the PDF.
  • The text file is saved to scratch/ since it's a temporary working file.
  • If PyMuPDF is already installed, skip step 2.
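
The one-liner above reports only a total character count. As a minimal sketch, the same extraction can be written as a short script that also lists which pages produced no text, which helps spot PDFs that mix text pages with scanned pages. The function names summarize_pages and extract_pdf are hypothetical, chosen for illustration; only fitz.open and get_text come from the steps above.

```python
from pathlib import Path


def summarize_pages(page_texts):
    """Return (total_chars, empty_page_numbers) for a list of per-page strings."""
    # Page numbers are 1-based; a page counts as empty if it holds only whitespace.
    empty = [i + 1 for i, text in enumerate(page_texts) if not text.strip()]
    return sum(len(text) for text in page_texts), empty


def extract_pdf(pdf_path, out_path):
    """Extract text with PyMuPDF, write it to out_path, and return the summary."""
    import fitz  # PyMuPDF; imported lazily so summarize_pages works without it

    with fitz.open(pdf_path) as doc:
        page_texts = [page.get_text() for page in doc]
    Path(out_path).write_text("".join(page_texts), encoding="utf-8")
    return summarize_pages(page_texts)
```

A call such as extract_pdf(r'<PDF_PATH>', r'<OUTPUT_PATH>') returns the total character count and the list of empty pages. A total of 0 corresponds to the "Extracted 0 characters" case.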

Fallback for Scanned / Image-Only PDFs (0 Characters Extracted): If the command above prints "Extracted 0 characters", the PDF has no text layer and is most likely a scanned image. Do not attempt to use Tesseract, as it may not be installed. Instead, write and run a Python script that uses fitz to render each page as an image and easyocr (pip install easyocr; it runs on the CPU when no GPU is available) to extract the text.

Example script to create in the scratch directory:

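import fitz      # PyMuPDF: renders PDF pages to images
import easyocr   # OCR engine
import sys

# Avoid console encoding errors if any recognized text is printed
sys.stdout.reconfigure(encoding='utf-8')
# gpu=False forces CPU mode; the first run downloads the English model
reader = easyocr.Reader(['en'], gpu=False)
doc = fitz.open(r'<PDF_PATH>')

with open(r'C:\Users\user\.gemini\antigravity\scratch\ocr_output.txt', 'w', encoding='utf-8') as f:
    for page_num in range(len(doc)):
        page = doc.load_page(page_num)
        # Render the page at 150 DPI; raise this if small text is misread
        pix = page.get_pixmap(dpi=150)
        # readtext accepts PNG bytes; detail=0 returns plain strings, grouped into paragraphs
        results = reader.readtext(pix.tobytes("png"), detail=0, paragraph=True)
        f.write(f"\n--- PAGE {page_num + 1} ---\n" + "\n".join(results))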

Instagram Reel Transcription

Use OutBlogAI to transcribe Instagram Reels. Navigate to their transcription tool, paste the reel URL, and it will generate a text transcript of the audio. (Discovered 2026-03-17.)

Backup options that may also work: VideoToTextAI, GetTranscribe, Dictationer, Choppity.


YouTube Video Transcripts

To download or extract YouTube video transcripts without triggering PowerShell encoding corruption, you MUST follow the yt-dlp guide inside skills/youtube_com_browsing/SKILL.md.

Notes:

  • Do NOT use youtube-transcript-api via CLI piping.

Visible Browser Downloads

The Antigravity visible browser (Playwright Chrome) does NOT save downloads to C:\Users\user\Downloads. It saves them to a temporary Playwright artifacts directory:

C:\Users\user\AppData\Local\Temp\playwright-artifacts-*\

The folder suffix changes between sessions, and files are saved without extensions (just a UUID filename). To find a specific download, query a copy of the Chrome History database:

Copy-Item "C:\Users\user\.gemini\antigravity-browser-profile\Default\History" "$env:TEMP\history_copy.db" -Force
python -c "import sqlite3; c=sqlite3.connect(r'C:\Users\user\AppData\Local\Temp\history_copy.db').cursor(); c.execute('SELECT target_path,tab_url FROM downloads ORDER BY start_time DESC LIMIT 5'); [print(r) for r in c.fetchall()]"
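
The two commands above can also be combined into a single Python helper, sketched here under a few assumptions. The function name recent_downloads is hypothetical. The query is the one shown above, and the copy step mirrors the Copy-Item command (presumably needed because the live database may be locked while the browser is running).

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def recent_downloads(history_path, limit=5):
    """Return (target_path, tab_url) rows for the most recent downloads."""
    # Work on a temporary copy rather than the live History file.
    with tempfile.TemporaryDirectory() as tmp:
        copy_path = Path(tmp) / "history_copy.db"
        shutil.copy2(history_path, copy_path)
        conn = sqlite3.connect(copy_path)
        try:
            rows = conn.execute(
                "SELECT target_path, tab_url FROM downloads "
                "ORDER BY start_time DESC LIMIT ?",
                (limit,),
            ).fetchall()
        finally:
            # Close before the temporary directory is cleaned up.
            conn.close()
    return rows
```

Called with the History path from the Copy-Item command, it returns the newest downloads first, so the target_path of the first row is usually the file just downloaded.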

Google Drive (Read/Write via rclone)

The user's Google Drive is accessible via rclone, configured with a remote named gdrive. The executable is at C:\rclone\rclone-v1.74.1-windows-amd64\rclone.exe.

Common commands:

  1. List folders in Drive root:
C:\rclone\rclone-v1.74.1-windows-amd64\rclone.exe lsd gdrive:
  2. List files in a specific Drive folder:
C:\rclone\rclone-v1.74.1-windows-amd64\rclone.exe ls gdrive:"Some Folder/Subfolder"
  3. Copy a file FROM Drive to local:
C:\rclone\rclone-v1.74.1-windows-amd64\rclone.exe copy gdrive:"Some Folder/file.md" C:\local\destination\
  4. Copy a file TO Drive from local:
C:\rclone\rclone-v1.74.1-windows-amd64\rclone.exe copy C:\local\file.md gdrive:"Some Folder/"
  5. Copy an entire folder (recursive):
C:\rclone\rclone-v1.74.1-windows-amd64\rclone.exe copy gdrive:"Some Folder" C:\local\destination\ --progress
  6. Read a file's contents directly (pipe to stdout):
C:\rclone\rclone-v1.74.1-windows-amd64\rclone.exe cat gdrive:"Some Folder/file.md"

Notes:

  • Use quotes around Drive paths that contain spaces.
  • The remote name is gdrive (configured 2026-05-13).
  • Auth tokens refresh automatically; no user interaction needed after initial setup.
  • To create a folder on Drive, just copy a file into it — rclone creates intermediate directories automatically.
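
When rclone is called from a Python script rather than typed into a shell, passing the arguments as a list sidesteps the quoting issue for paths with spaces. The following is a minimal sketch. The helper names rclone_args, drive_path, and run_rclone are hypothetical, while the executable path, remote name, and subcommands are the ones listed above.

```python
import subprocess

RCLONE_EXE = r"C:\rclone\rclone-v1.74.1-windows-amd64\rclone.exe"
REMOTE = "gdrive"


def drive_path(path=""):
    """Prefix a Drive-relative path with the remote name, e.g. gdrive:Some Folder."""
    return f"{REMOTE}:{path}"


def rclone_args(command, *paths, flags=()):
    """Build the argument list for an rclone call (nothing is executed here)."""
    return [RCLONE_EXE, command, *paths, *flags]


def run_rclone(command, *paths, flags=()):
    """Run rclone and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(
        rclone_args(command, *paths, flags=flags),
        capture_output=True, text=True, encoding="utf-8", check=True,
    )
    return result.stdout
```

For example, run_rclone("cat", drive_path("Some Folder/file.md")) returns the file's contents as a string, and run_rclone("lsd", drive_path()) lists the folders in the Drive root.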

File Type Not Listed?

  1. Search the web for a tool or library that can handle the file type or media.
  2. Test it to confirm it works.
  3. Add a new section to this file with the steps, so you have it for next time.

DOCX / Word Files

Your native tools cannot read .docx files. Use Python to extract the text.

  1. Install python-docx (if not already installed):
pip install python-docx
  2. Extract the text and save it as a .txt file in the scratch folder:
python -c "import docx; doc=docx.Document(r'<DOCX_PATH>'); text='\n'.join([p.text for p in doc.paragraphs]); f=open(r'C:\Users\user\.gemini\antigravity\scratch\<FILENAME>.txt','w',encoding='utf-8'); f.write(text); f.close(); print(f'Extracted {len(text)} characters')"

Replace <DOCX_PATH> with the absolute path to the .docx file, and <FILENAME> with a descriptive name. Read the resulting text file using view_file.
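
The one-liner above reads only doc.paragraphs, so text inside tables is not included. If the document contains tables, a sketch along these lines may help. The function names are hypothetical, and table text is appended after the paragraphs rather than interleaved in document order.

```python
def table_rows_to_lines(rows):
    """Join each row's cell texts with ' | ' so tables survive as plain text."""
    return [" | ".join(cell.strip() for cell in row) for row in rows]


def extract_docx_text(docx_path):
    """Return paragraph text followed by table text from a .docx file."""
    import docx  # python-docx; imported lazily so the helper above needs no install

    document = docx.Document(docx_path)
    lines = [p.text for p in document.paragraphs]
    for table in document.tables:
        rows = [[cell.text for cell in row.cells] for row in table.rows]
        lines.extend(table_rows_to_lines(rows))
    return "\n".join(lines)
```

The returned string can be written to the scratch folder in the same way as the one-liner's output.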


Financial & Market Data (Stocks, Gold, etc.)

When you need historical prices or market data for stocks, commodities (like gold), or other financial instruments, use the yfinance Python library instead of relying solely on web searches.

  1. Install yfinance (if not already installed):
pip install yfinance
  2. Write a Python script to fetch the data. For example:
import yfinance as yf
data = yf.download('GC=F', start='2008-01-01', end='2008-12-31', interval='1mo')
print(data['Close'])

Run the script to extract the precise historical figures for the user.


High-Framerate Video Processing (10+ FPS)

Natively, the AI can only process exactly 1 frame per second (1 FPS) when a video file is supplied directly. If the user requests a higher framerate analysis (e.g., watching at 10 FPS to catch subliminal or fast action):

  1. For Online Videos (e.g., YouTube): Do not ask the user to download it. Instead, write a Python or shell script that uses yt-dlp to fetch the video and ffmpeg to extract the frames at the requested framerate (e.g., via the fps=10 filter).
  2. For Local Videos: Build an ffmpeg script to extract the desired frames into a temporary local folder.
  3. Review Process: Once the script dumps the high-frequency still images into a scratch folder, point the AI to analyze those still images instead of passing the raw video file.
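
The steps above can be sketched as a small Python script. This is an illustration under stated assumptions, not a fixed recipe. The function names and the source_video.mp4 and frame_%06d.png file names are hypothetical, yt-dlp and ffmpeg are assumed to be on PATH, and the "-f mp4" choice trades resolution for a predictable output file name.

```python
import subprocess
from pathlib import Path


def ytdlp_args(url, video_path):
    """Arguments for downloading one video to a fixed local path with yt-dlp."""
    # "-f mp4" requests a single-file MP4 so the output name is predictable;
    # this may select a lower resolution than the best available.
    return ["yt-dlp", "-f", "mp4", "-o", str(video_path), url]


def ffmpeg_frame_args(video_path, frames_dir, fps=10):
    """Arguments for dumping numbered PNG stills at the requested framerate."""
    pattern = str(Path(frames_dir) / "frame_%06d.png")
    return ["ffmpeg", "-i", str(video_path), "-vf", f"fps={fps}", pattern]


def extract_frames(source, frames_dir, fps=10):
    """Download `source` first if it is a URL, then extract frames with ffmpeg."""
    frames_dir = Path(frames_dir)
    frames_dir.mkdir(parents=True, exist_ok=True)
    video_path = Path(source)
    if str(source).startswith(("http://", "https://")):
        video_path = frames_dir / "source_video.mp4"
        subprocess.run(ytdlp_args(source, video_path), check=True)
    subprocess.run(ffmpeg_frame_args(video_path, frames_dir, fps), check=True)
    # Return the stills in frame order for the review step.
    return sorted(frames_dir.glob("frame_*.png"))
```

extract_frames returns the still images in order, which can then be reviewed one by one instead of passing the raw video file.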

Excel Files (.xlsx)

Your native text-editing tools cannot handle .xlsx files because they are zip-based binaries. There are two approaches — COM automation (preferred) and openpyxl (lightweight fallback).

Preferred: COM Automation (requires Excel installed)

Use this when you need to read calculated formula results, create workbooks with formulas, do bulk operations, or interact with Excel's full engine. See skills/com_automation/SKILL.md for full code templates.

Quick read example:

import win32com.client

excel = win32com.client.Dispatch("Excel.Application")
excel.Visible = False
try:
    wb = excel.Workbooks.Open(r'<YOUR_PATH>')
    try:
        ws = wb.Sheets(1)
        # Read the CALCULATED value of a formula cell
        print(ws.Range("C5").Value)
        # Read the formula itself
        print(ws.Range("C5").Formula)
        # Read a block of data
        data = ws.Range("A1:D100").Value  # tuple of tuples
    finally:
        # Close without saving, even if a read above fails
        wb.Close(SaveChanges=False)
finally:
    # Always quit so no hidden Excel process is left running
    excel.Quit()

Fallback: openpyxl (no Excel needed)

Use this only for simple read/write when Excel is not installed, or when you just need raw cell values without formula evaluation.

  1. Install openpyxl (if not already installed):
pip install openpyxl
  2. To read data:
python -c "import openpyxl; wb=openpyxl.load_workbook(r'<YOUR_PATH>', data_only=True); ws=wb.active; [print([cell.value for cell in row]) for row in ws.iter_rows(min_row=1, max_row=10)]"
  3. To modify a specific cell (e.g. Row 5, Column 2):
python -c "import openpyxl; wb=openpyxl.load_workbook(r'<YOUR_PATH>'); ws=wb.active; ws.cell(row=5, column=2).value = 'New Value'; wb.save(r'<YOUR_PATH>')"

Note: openpyxl with data_only=True reads cached values from the last time the file was opened in Excel. It does NOT evaluate formulas. If you need live formula results, use COM.


Word Files via COM (.docx — Full Control)

For full Word document manipulation (formatting, find/replace, mail merge, PDF export), use COM automation instead of python-docx. See skills/com_automation/SKILL.md for code templates.

Note: The python-docx section above is still valid for simple text extraction when Word is not installed.


Outlook, PowerPoint, Access (via COM)

All three are available on this machine via COM automation. Use cases:

  • Outlook: Send emails, read inbox, manage calendar
  • PowerPoint: Create/edit presentations, export to PDF
  • Access: Query databases, run SQL

See skills/com_automation/SKILL.md for full code templates for each.
