Safari only ever lets you pick five extensions

Morning. I am reading SCMP on my iPhone, Safari, one hand, coffee in the other. There is an article about the new MTR line that I want to remember. There is a small blue dot in the bottom-right corner of the screen. I tap it. A sheet slides up: Save page · Summarize · Send to Brain · Fill form. I tap Send to Brain. A toast appears: "Saved. 4 wikilinks matched." The article is now in my Obsidian vault, on my Mac, linked into the right project notes, with a one-paragraph summary above the body.

I did not open another app. I did not pick a target. I did not configure Zapier. I tapped the blue dot.

That blue dot is a userscript. There is one userscript installed on my Safari. It is a 93.7 KB bundle compiled from TypeScript I wrote myself. It does ten things. It works on iPhone and Mac. It syncs over iCloud. It costs $99 a year for the Apple Developer account I already had. Everything else is $0.

This post is the build.

Safari extensions are a joke

Apple killed the Safari Extension Gallery in 2018. After that, every Safari extension had to ship through the App Store, which means App Store review, which means waiting weeks for someone in Cupertino to decide whether your one-line tweak to YouTube's homepage is acceptable.

Walk into Chrome's extension store: 200,000 extensions, most of them junk, some of them brilliant, all of them one click away.

Walk into Safari's: a handful of password managers, a few ad blockers, four note-taking clippers, two translators, a thing for Reddit, and a calculator nobody asked for. If you want to modify how a website behaves, you are not the customer Apple was thinking of.

I spent years muttering about this. Then I realized I had been looking at the wrong problem. The bottleneck wasn't the extension marketplace. It was the idea that one extension equals one job. Chrome's 200k extensions are 200k people shipping 200k different jobs as 200k separate packages.

I am one person. I write all my own jobs. Why would I distribute them?

The trick: one extension, unlimited scripts

There is a free Safari extension called Userscripts.app by a developer named Quoid. It loads any number of .user.js files and injects them into the pages I tell it to. One extension, free, in the App Store, signed by someone who is not me, approved once by Apple, then it loads whatever code I put in front of it.

Greasemonkey, but for Safari, but it works.

So I installed it. I wrote one userscript that does ten things. I bundled it into a single .user.js file. The Userscripts.app extension reads that file. Safari calls into the extension on every page load. My script runs.
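A userscript manager decides where to inject a file by reading the metadata comment block at the top of the .user.js. Below is a minimal sketch of a build helper that prepends such a block to a bundle. The helper names, the script name, and the match pattern are illustrative assumptions, not taken from the actual script; `@name`, `@version`, `@match`, and `@grant` are standard userscript metadata keys.

```typescript
// Hypothetical build-step helper: prepend a userscript metadata block to bundled JS.
interface ScriptMeta {
  name: string;
  version: string;
  match: string[]; // URL patterns the script should be injected into
  grant: string[]; // GM APIs the script asks the manager to expose
}

function buildMetadataBlock(meta: ScriptMeta): string {
  const lines: string[] = ["// ==UserScript=="];
  lines.push(`// @name ${meta.name}`);
  lines.push(`// @version ${meta.version}`);
  for (const pattern of meta.match) lines.push(`// @match ${pattern}`);
  for (const api of meta.grant) lines.push(`// @grant ${api}`);
  lines.push("// ==/UserScript==");
  return lines.join("\n");
}

// Glue the header and the bundler output into one .user.js payload.
function withHeader(meta: ScriptMeta, bundledJs: string): string {
  return `${buildMetadataBlock(meta)}\n\n${bundledJs}`;
}

// Illustrative values only.
const exampleHeader = buildMetadataBlock({
  name: "Brain Local Clipper",
  version: "0.0.0",
  match: ["*://*/*"],
  grant: ["GM.xmlHttpRequest"],
});
console.log(exampleHeader);
```

Because the header is plain comments, changing which sites the script runs on is a text edit, not a rebuild of the extension.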

I did not need to go back through App Store review for any of this. The extension is one slot in the Safari toolbar. Behind that slot lives whatever JavaScript I want, and I can rewrite the JavaScript any time. No re-sign. No re-submit. No waiting.

That is the magic. Userscripts is the meta-extension that breaks the App Store bottleneck.

The Mac → iPhone sync

The other half of the trick: I want the same script on my phone. Rebuilding a Safari extension for iOS is the worst developer experience Apple has ever shipped — code signing, provisioning profiles, App Store Connect, six menus deep in Xcode.

So I don't. I write the userscript on my Mac in any text editor. I save the file to a folder that iCloud Drive watches. Five seconds later, the same file shows up on my iPhone. Userscripts.app on iPhone picks it up. The next time I load a page in Safari, the script is live.

No rebuild. No re-sign. No Xcode. The file is the deploy.

Source lives at ~/Tools/userscripts-ios-safari-ext/scripts/dist/brain-local-clipper.user.js. iCloud mirrors it. Phone sees it. That is the whole pipeline.

What my one script does

| Action | What happens | Provider |
| --- | --- | --- |
| Save page | Extracts title, URL, and readable text (max 120K chars), then writes a Brain source note | local Python server |
| Save selection | Saves the most recent text selection (made within the last 10 minutes), so I can capture a quote on its own | local Python server |
| Summarize | Streams a summary as NDJSON and renders it in place; replaces the article body with the summary plus a "Restore article" button | `qwen3.6:35b` (Ollama) / `gpt-5.3-codex-spark` / Claude Sonnet |
| Send to Brain | Reads the pre-generated summary plus a 260-title vault index, emits real `[[wikilinks]]` to existing notes, returns a `wikilinksMatched` count | Claude Sonnet via local `claude` CLI |
| Fill form | Multi-round LLM agent drafts values for form fields, safety-filtered, never auto-submits | Ollama / Spark / Claude (configurable) |
| Drag-to-snap | Grab the button, drop it in any of 4 screen corners; position persists per origin | local |

Ten things in one extension slot. If I had built each as a separate Safari extension I would have spent six months in App Store review hell. As one userscript: one weekend.

The button itself

A 52×52 px rounded squircle. Apple liquid-glass styling — backdrop-filter blur and saturate, iOS blue #007aff (#0a84ff in dark mode). Sits in the bottom-right by default, drag to relocate. Respects iPhone safe-area insets so it doesn't sit under the home indicator.
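The drag-to-relocate behaviour can be sketched as two small pieces: pick the corner for a drop point, and remember it per origin. This is a sketch under assumptions, not the actual script: the quadrant rule, the storage key format, and the `CornerStore` interface are all hypothetical. Only the four corners and the bottom-right default come from the description above.

```typescript
type Corner = "top-left" | "top-right" | "bottom-left" | "bottom-right";
const CORNERS: Corner[] = ["top-left", "top-right", "bottom-left", "bottom-right"];

interface Viewport { width: number; height: number; }

// Assumed rule: snap to the corner of whichever screen quadrant the drop lands in.
function nearestCorner(x: number, y: number, vp: Viewport): Corner {
  const vertical = y < vp.height / 2 ? "top" : "bottom";
  const horizontal = x < vp.width / 2 ? "left" : "right";
  return `${vertical}-${horizontal}` as Corner;
}

// Minimal storage shape (localStorage satisfies it); hypothetical abstraction.
interface CornerStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

// Hypothetical key format: one entry per site origin.
function cornerKey(siteOrigin: string): string {
  return `brain-clipper:corner:${siteOrigin}`;
}

function saveCorner(store: CornerStore, siteOrigin: string, corner: Corner): void {
  store.setItem(cornerKey(siteOrigin), corner);
}

// Falls back to bottom-right when nothing valid is stored.
function loadCorner(store: CornerStore, siteOrigin: string): Corner {
  const raw = store.getItem(cornerKey(siteOrigin));
  const known = CORNERS.find((c) => c === raw);
  return known ?? "bottom-right";
}
```

Keying by origin means the button can sit top-left on one site where it would cover a menu, and stay bottom-right everywhere else.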

Single tap opens a sheet:

Save (green dot — local server)
Summarize (color-coded per provider: green Ollama, orange Codex Spark, purple Claude)
Send to Brain (purple — Claude)
Fill form (orange — Codex Spark)

A long press of 460 ms triggers the default action (Save page) without opening the sheet. Holding and dragging relocates the button. Pressing and releasing quickly without moving opens the sheet.
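One reading of these gesture rules, as a sketch: classify each press from how long it was held and how far the finger moved. The 460 ms threshold is from the text; the 8 px movement tolerance and all names are assumptions made for illustration.

```typescript
type Gesture = "tap" | "long-press" | "drag";

const LONG_PRESS_MS = 460;   // from the post
const MOVE_TOLERANCE_PX = 8; // assumption: small jitter should not count as a drag

interface PressSample {
  heldMs: number;  // time between pointer down and pointer up
  movedPx: number; // distance travelled while held
}

function classifyPress(sample: PressSample): Gesture {
  // Any real movement means the user is relocating the button.
  if (sample.movedPx > MOVE_TOLERANCE_PX) return "drag";
  // Held still long enough: run the default action, skip the sheet.
  if (sample.heldMs >= LONG_PRESS_MS) return "long-press";
  // Quick release without moving: open the sheet.
  return "tap";
}

console.log(classifyPress({ heldMs: 120, movedPx: 2 }));
```

Checking movement first keeps a slow drag from being mistaken for a long press.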

Haptics for everything: 8 ms selection on text-detect, 15 ms light on sheet item, 25 ms medium on sheet open, 30/60/40 ms success pattern when an action lands, 40 ms × 5 error buzz on failure. A toast pill appears above the button for 2.6 seconds, longer on errors or busy states.

It looks like an iOS system control. It is a <div> mounted in a shadow root by 1350 lines of JavaScript.

The form-fill agent

This is the part that takes the most explaining and the part I'm most proud of.

I open a form I don't want to fill out by hand (a tax declaration, a visa application, a vendor onboarding). I tap Fill form. The script scans every fillable field, sends each field's label, type, and autocomplete hint to a local LLM, gets back a draft value per field, fills them in, and stops.

Up to 15 rounds. Stops after 2 stalls. Never auto-submits — I click the submit button myself after reviewing.
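The round and stall limits can be sketched as a bounded loop. This is a sketch, not the real agent: it assumes a "stall" is a round that fills no new field, that stalls are counted consecutively, and that the model call is synchronous (a real one would be async). The 15 and 2 are from the text; the type and function names are hypothetical.

```typescript
const MAX_ROUNDS = 15; // from the post
const MAX_STALLS = 2;  // from the post

// One round: given the fields still empty, return the names it managed to fill.
type FillRound = (emptyFields: string[]) => string[];

interface FillResult {
  rounds: number;
  filled: string[];
  stoppedBy: "done" | "stalled" | "round-limit";
}

function runFillLoop(fields: string[], askModel: FillRound): FillResult {
  const remaining = new Set(fields);
  const filled: string[] = [];
  let rounds = 0;
  let stalls = 0;
  while (remaining.size > 0 && rounds < MAX_ROUNDS) {
    rounds++;
    let progress = 0;
    for (const field of askModel(Array.from(remaining))) {
      // Only count fields that were actually still empty.
      if (remaining.delete(field)) {
        filled.push(field);
        progress++;
      }
    }
    if (progress > 0) {
      stalls = 0; // assumption: only consecutive stalls count
      continue;
    }
    stalls++;
    if (stalls >= MAX_STALLS) return { rounds, filled, stoppedBy: "stalled" };
  }
  // Note what is absent: nothing here ever submits the form.
  return { rounds, filled, stoppedBy: remaining.size === 0 ? "done" : "round-limit" };
}
```

The loop ends in one of three ways and reports which, so the toast can say whether the draft is complete or needs a look.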

The safety filter is hard-coded:

Refused input types: password, hidden, file, submit, button, reset, image
Refused field-name regex: /pass(word)?|otp|2fa|mfa|captcha|card|credit|cc-|cvv|cvc|iban|swift|routing|bank|pin|secret/i
Match runs against type, label, autocomplete, inputmode, and field name
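The filter rules above translate fairly directly into a predicate. The type list and the regex are copied from the text; the `FieldDescriptor` shape and the function name are assumptions made for this sketch.

```typescript
const REFUSED_TYPES = new Set(["password", "hidden", "file", "submit", "button", "reset", "image"]);

// Regex as given in the post.
const REFUSED_PATTERN =
  /pass(word)?|otp|2fa|mfa|captcha|card|credit|cc-|cvv|cvc|iban|swift|routing|bank|pin|secret/i;

// Hypothetical descriptor built from an <input> before anything is sent to the model.
interface FieldDescriptor {
  type: string;
  label?: string;
  autocomplete?: string;
  inputmode?: string;
  name?: string;
}

function isRefusedField(field: FieldDescriptor): boolean {
  if (REFUSED_TYPES.has(field.type.toLowerCase())) return true;
  // The pattern is checked against every attribute, not only the name.
  const haystacks = [field.type, field.label, field.autocomplete, field.inputmode, field.name];
  return haystacks.some((value) => value !== undefined && REFUSED_PATTERN.test(value));
}

console.log(isRefusedField({ type: "text", name: "card_number" }));   // true
console.log(isRefusedField({ type: "text", name: "first_name" }));    // false
console.log(isRefusedField({ type: "text", name: "shipping_city" })); // true: "shipping" contains "pin"
```

The last call shows the cost of substring matching: harmless fields such as a shipping address get refused too. That over-blocking is consistent with a filter that is paranoid on purpose.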

The LLM refuses to fill anything that even looks like a credential. The script refuses to attempt it. Belt and suspenders. If a payment form sneaks through with a non-standard field name, the LLM still won't write a credit card number it doesn't have.

I am not going to let a model invoked with claude --model sonnet paste my bank details into a stranger's form. The filter is paranoid on purpose.

The backend

The userscript is the front half. It needs a backend to do anything real. Cross-origin AJAX from a userscript only works because Userscripts.app exposes GM.xmlHttpRequest, which bypasses CORS.

The backend is a Python server at ~/Tools/scripts/brain/brain-web-clipper-server.py, listening on localhost:17861. In front of it sits a Caddy reverse proxy on :18443 with a Tailscale TLS cert. The hostname is macbook.tail71cfd.ts.net. The phone reaches the Mac through the Tailscale mesh — no public domain, no Cloudflare tunnel, no exposed port.

Authentication is one HTTP header: X-Brain-Clipper: userscript. The server 403s without it. Cheap gate; the real perimeter is the Tailscale ACL.
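On the userscript side, the gate amounts to attaching that header to every request. Below is a sketch of building the request details for the /clip endpoint in the shape that `GM.xmlHttpRequest` accepts (`method`, `url`, `headers`, `data`). The payload field names and the helper name are assumptions; the base URL is inferred from the hostname and Caddy port given above, and the 120,000-character cap is the "max 120K chars" limit from the action table.

```typescript
// Inferred from the post: Tailscale hostname plus the Caddy TLS port.
const BASE_URL = "https://macbook.tail71cfd.ts.net:18443";
const MAX_TEXT_CHARS = 120000;

// Hypothetical payload shape; the real server's field names are not given in the post.
interface ClipPayload {
  title: string;
  url: string;
  text: string;
}

interface RequestDetails {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  data: string;
}

type ClipEndpoint = "/clip" | "/clip-deep" | "/summary-stream";

function buildClipRequest(path: ClipEndpoint, payload: ClipPayload): RequestDetails {
  // Cap the extracted text before it leaves the page.
  const body: ClipPayload = { ...payload, text: payload.text.slice(0, MAX_TEXT_CHARS) };
  return {
    method: "POST",
    url: `${BASE_URL}${path}`,
    headers: {
      "Content-Type": "application/json",
      "X-Brain-Clipper": "userscript", // the server answers 403 without this
    },
    data: JSON.stringify(body),
  };
}
```

Keeping this as a pure function makes the header hard to forget: every call site goes through one builder rather than assembling headers by hand.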

Three endpoints:

POST /clip — save page to Brain
POST /clip-deep — save page + run Claude wikilink-matching pass
POST /summary-stream — NDJSON streaming summary
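NDJSON means one JSON object per line, which lets the page render a summary as it arrives. Network chunks do not respect line boundaries, so the client has to buffer partial lines. A minimal decoder sketch follows; the class name and the `{ delta: string }` event shape are assumptions, since the post does not show the stream's schema.

```typescript
// Incremental NDJSON decoder: feed it raw text chunks, get back completed objects.
class NdjsonDecoder<T> {
  private buffer = "";

  push(chunk: string): T[] {
    this.buffer += chunk;
    const lines = this.buffer.split("\n");
    // The last piece is either "" or an unfinished line; keep it for the next chunk.
    this.buffer = lines.pop() ?? "";
    const events: T[] = [];
    for (const line of lines) {
      const trimmed = line.trim();
      if (trimmed !== "") events.push(JSON.parse(trimmed) as T);
    }
    return events;
  }

  // Call once the stream closes, in case the final line had no trailing newline.
  end(): T[] {
    const rest = this.buffer.trim();
    this.buffer = "";
    return rest === "" ? [] : [JSON.parse(rest) as T];
  }
}

// Hypothetical event shape for a streamed summary.
interface SummaryDelta { delta: string; }

const decoder = new NdjsonDecoder<SummaryDelta>();
let summary = "";
for (const chunk of ['{"delta":"Hel', 'lo"}\n{"delta":" world"}\n']) {
  for (const event of decoder.push(chunk)) summary += event.delta;
}
console.log(summary); // Hello world
```

The first chunk ends mid-object, so nothing is emitted until the second chunk completes the line.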

Three LLM providers, switched at runtime:

Ollama — qwen3.6:35b running locally, ~170 tokens/sec on M4
Codex Spark — gpt-5.3-codex-spark via the local codex CLI
Claude Sonnet — via the local claude CLI

All three are on the Mac. The phone never talks to OpenAI, Anthropic, or anybody else. It talks to the Mac. The Mac talks to whichever LLM I picked.
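Runtime switching can be as simple as a lookup table plus a validated setting. In this sketch the provider ids, the registry shape, and the `resolveProvider` helper are hypothetical; the model names and colors are the ones mentioned in this post, with `sonnet` being the value passed to `claude --model`.

```typescript
type ProviderId = "ollama" | "spark" | "claude";

interface ProviderInfo {
  label: string;
  model: string;
  color: "green" | "orange" | "purple"; // dot color shown in the sheet
}

const PROVIDERS: Record<ProviderId, ProviderInfo> = {
  ollama: { label: "Ollama", model: "qwen3.6:35b", color: "green" },
  spark: { label: "Codex Spark", model: "gpt-5.3-codex-spark", color: "orange" },
  claude: { label: "Claude Sonnet", model: "sonnet", color: "purple" },
};

// Accept a stored or user-supplied setting only if it names a known provider.
function resolveProvider(requested: string | undefined, fallback: ProviderId): ProviderId {
  if (requested !== undefined && Object.prototype.hasOwnProperty.call(PROVIDERS, requested)) {
    return requested as ProviderId;
  }
  return fallback;
}

const active = resolveProvider("spark", "ollama");
console.log(PROVIDERS[active].label); // Codex Spark
```

Validating against the table means a stale or mistyped setting falls back to a working provider instead of failing mid-request.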

Cost

| Component | Cost |
| --- | --- |
| Userscripts.app (Safari extension) | $0 (App Store, free) |
| Apple Developer Program (only if I want to install Userscripts on iOS too) | $99/year (already paid for other reasons) |
| Tailscale mesh (Mac + iPhone) | $0, personal tier |
| Caddy reverse proxy | $0, open source |
| Python `brain-web-clipper-server.py` | $0, local |
| Ollama + `qwen3.6:35b` | $0, local compute |
| Codex CLI | $0 incremental (OpenAI subscription I have) |
| Claude CLI | $0 incremental (Claude subscription I have) |
| Total incremental | $0/month |

The whole stack costs nothing on top of what I already pay for AI subscriptions and an Apple Developer account I keep around anyway. The userscript is free. The Tailscale mesh is free. The proxy is free. The models are local.

Surface vs Bridge

Same two terms as the earlier posts in this catalog.

Surface — Safari. The blue dot. The sheet. Whatever page I happen to be reading.

Bridge — Userscripts.app, the 93 KB compiled .user.js, the iCloud sync to iPhone, the Tailscale mesh, the Caddy proxy, the Python server, the three local LLMs.

The Surface is one app I was already using, for free, on a device I was already holding. The Bridge is one weekend of TypeScript plus a Tailscale-aware reverse proxy.

The pattern across the catalog so far:

Habitica — children's habit app → my morning standup
Zello — walkie-talkie radio → voice-mode AI
Telegram — chat app → multi-session dashboard
Safari — web browser → one-tap clip-to-vault, summarize-in-place, fill-form-with-LLM

None of these surfaces were built for AI agents. All of them were already in my hand. The Bridge meets the agent where the human already is.

What this enables

Most of my reading happens on the phone. Train, walk, café, kitchen. The chat box on claude.ai is useless in those contexts — I am not going to open a tab and paste a URL and type "please summarize this." I would not even open the tab.

The blue dot makes the action one tap. The action is whatever I taught it to be. The teaching is one TypeScript file I can rewrite tonight.

What I avoided building: a native iOS app, an iCloud sync engine, a notification layer, a round of Q&A with an extension reviewer, a public landing page, an App Store screenshot session. What I built: one userscript. The browser does the rest.

Safari only ever let me pick five extensions. So I wrote one extension that contains an infinity of them.