UNDERSTUDY
WHEN YOUR MODEL CAN'T GO ON,
THE SHOW DOES.
A self-hosted gateway that keeps your AI agents running when a model taps out. Rate limit, quota, outage: the understudy steps in mid-scene, and your agent never knows the lead changed.
curl -fsSL https://understudy.cc/install.sh | bash
That puts it on your machine. The first launch runs a short setup - provider keys, a fallback chain, and auto-wiring the agent tools you already use. After that, starting the gateway is one word.
RAISE THE CURTAIN
understudy
Point any agent at the understudy endpoint and it never gets interrupted mid-session again - no harness restart, no config change, no worries.
Take your seat.
2 A.M. THE LEAD FORGETS ITS LINES.
Your agent is deep in an overnight run - the refactor is finally going somewhere. Then the 429. Without understudy, the curtain falls. With it:
Same session. Same tools. No restart. What your agent notices: nothing.
THE QUICK CHANGE, IN TWO ENVIRONMENT VARIABLES.
One variable on the gateway, one on your harness. That is the entire wiring.
FALLBACK_CHAIN=openai/gpt-5.5 understudy # the understudy waits in the wings
ANTHROPIC_BASE_URL=http://localhost:42986 claude # business as usual - until it isn't
- The understudy steps in. On a 429, a 5xx, or an outage, the failing model is benched, the next one in the chain takes over mid-run, and the lead retakes the stage the moment its cooldown expires.
- Fluent in every dialect. Requests arrive in OpenAI, Anthropic, or Responses format; whichever model answers, the reply comes back in the format the client expects. Tool calls, vision, and live token streams are translated event by event, so the harness never notices the swap.
- The off switch always works. understudy disable restores every harness to exactly how it connected before - even when the gateway itself is down.
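To make the benching behaviour concrete, here is a minimal sketch of a fallback chain with cooldowns. It is an illustration only, not understudy's actual code: the `FallbackChain` class, its method names, and the 60-second default cooldown are all assumptions.

```typescript
// Hypothetical sketch of "bench on failure, return after cooldown".
// Class name, method names, and the 60 s default are assumptions.
type Model = string;

class FallbackChain {
  private readonly chain: Model[];
  private readonly cooldownMs: number;
  // Timestamp (ms) until which each benched model must sit out.
  private readonly benchedUntil = new Map<Model, number>();

  constructor(chain: Model[], cooldownMs = 60_000) {
    this.chain = chain; // lead first, understudies after
    this.cooldownMs = cooldownMs;
  }

  /** First model in chain order whose cooldown has expired, if any. */
  pick(now: number): Model | undefined {
    return this.chain.find((m) => (this.benchedUntil.get(m) ?? 0) <= now);
  }

  /** Bench a model after a 429 or any 5xx; other statuses change nothing. */
  report(model: Model, status: number, now: number): void {
    if (status === 429 || status >= 500) {
      this.benchedUntil.set(model, now + this.cooldownMs);
    }
  }
}
```

Because `pick` always scans from the front of the chain, the lead is chosen again as soon as its cooldown has passed, without any separate "promote" step.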
THE SHOW THAT NEVER CLOSES.
Surviving one bad night is the easy part. Here is what makes understudy the gateway you leave running.
SEASON TICKETS
API keys aren't the only way to pay. understudy login seats a ChatGPT, Claude, or Copilot subscription you already own as a link in the chain - no per-token bill when it covers a run.
REHEARSALS ARE FREE
Identical requests replay from cache - streamed or not, ~0 ms, zero tokens billed. Crash-and-retry loops stop charging you twice for the same lines.
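One way such a replay cache could recognise "identical" requests is to key responses on a canonical form of the request body. The sketch below assumes an in-memory map and a sorted-key serialization; `ReplayCache` and `canonical` are hypothetical names, and the real gateway may key, store, and expire entries differently.

```typescript
// Hypothetical sketch: cache responses under a canonical form of the request.
// The key scheme and the in-memory Map are assumptions.
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

/** Serialize with sorted object keys so equivalent bodies get the same key. */
function canonical(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(canonical).join(",")}]`;
  const entries = Object.keys(value)
    .sort()
    .map((k) => `${JSON.stringify(k)}:${canonical(value[k] as Json)}`);
  return `{${entries.join(",")}}`;
}

class ReplayCache {
  private readonly responses = new Map<string, string>();

  /** Cached response for an identical request body, or undefined on a miss. */
  get(body: Json): string | undefined {
    return this.responses.get(canonical(body));
  }

  /** Remember the response so a retry of the same body costs no tokens. */
  set(body: Json, response: string): void {
    this.responses.set(canonical(body), response);
  }
}
```

Sorting keys means a retry that serializes the same fields in a different order still hits the cache; a production version would likely hash the canonical string instead of storing it whole.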
THE BOX OFFICE
Every request is logged with tokens, latency, cost, and who served it. curl the gateway's /v1/usage endpoint and finally know what the overnight run actually cost.
NO DRAMA OFFSTAGE
Strict TypeScript, four runtime dependencies, no database. Self-hosted, so your provider keys never leave the building.
THE CAST
THE COMPANY
in order of appearance
The marked role is the lead. Recast any role by changing one string.
PLAYS WELL WITH
The five most popular harnesses have been 100% validated by the backstage crew. LangChain, other agents, and your own code will also work when connected to understudy through the standard OpenAI endpoint pattern.
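For your own code, the "standard OpenAI endpoint pattern" could look roughly like the sketch below. The port comes from the quick-change example above; the `/v1/chat/completions` path and the request body follow the usual OpenAI Chat Completions convention and are assumptions here, as are the helper names `buildChatRequest` and `ask`. Check the gateway's own documentation for the exact routes and any auth header it expects.

```typescript
// Hypothetical sketch of calling the gateway the way one would call OpenAI.
// Path and payload shape are assumed, not confirmed for understudy.
interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

/** Build the URL and fetch options for one chat completion request. */
function buildChatRequest(baseUrl: string, model: string, messages: ChatMessage[]) {
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/v1/chat/completions`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages }),
    },
  };
}

/** Send a single user prompt through the local gateway. */
async function ask(prompt: string): Promise<unknown> {
  const { url, init } = buildChatRequest("http://localhost:42986", "openai/gpt-5.5", [
    { role: "user", content: prompt },
  ]);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`gateway returned ${res.status}`);
  return res.json();
}
```

The only gateway-specific piece is the base URL, so an existing OpenAI-style client can usually be redirected by changing that one value.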
THE PROMPT BOOK
One binary runs the whole production. Here is the full set of cues.
- understudy
Start the gateway. The first run walks you through setup - keys, a fallback chain, and wiring up whatever harnesses it finds. Every run after just raises the curtain.
- understudy setup
Re-run the wizard any time to change provider keys, the fallback chain, or which harnesses route through the gateway.
- understudy login <chatgpt | anthropic | copilot>
Seat a subscription you already pay for as an understudy, through a one-time OAuth flow. It then stands in like any other link in the chain.
- understudy status
Who's on stage right now: gateway health, live providers, who's benched and for how long, and which harnesses are routed through the gateway versus talking direct.
- understudy enable · understudy disable
Route every harness through the gateway, or restore each one to exactly how it connected before. Both work even when the gateway is down - so a single point of failure never traps you.
TAKE A BOW
That's the show. If the understudy earned the part, leave a star on GitHub on your way out - the source is there too.