PROOF OF CONCEPT: This system is actively in development. The pipeline works end-to-end but is still being hardened for reliability.
Shipping Code While I Sleep
How I built an autonomous AI development pipeline that picks up GitHub issues, writes code, tests in Unity, and creates pull requests—without me touching the keyboard.
────────────────────────────────────────────────────────────────────
[ The Problem ]
"On the left is how most teams use AI today. Ask, copy, paste, fix, repeat. The human does all the work."
Even with AI as a development partner, every feature still started the same way: I opened a chat, described what I wanted, reviewed the output, pasted it into Unity, fixed the errors, and went back and forth until it worked. I was the bottleneck in my own pipeline.
My backlog was growing faster than I could work through it. I had the ideas. I had the AI. But the glue between intent and execution was still me, sitting at the keyboard, for every single issue.
What if I could go to sleep and wake up to a pull request?
────────────────────────────────────────────────────────────────────
[ The Concept ]
BIFROST is the bridge between intent and execution. In Norse mythology, it connects Midgard to Asgard—the mortal world to the realm of the gods. Here, it connects a GitHub issue to a finished pull request.
The idea is simple:
💡 Idea → 📋 GitHub Issue → 📐 PRISM Spec → ✅ Human Approval → ⚡ BIFROST → 🔀 Pull Request
I write the what and why. BIFROST handles the how. The human stays in the loop for intent and approval—the AI handles execution.
────────────────────────────────────────────────────────────────────
[ How It Works ]
💡 Idea
It starts with something I want to build. A feature, a bug fix, a new system. I write it down as a plain-language description of what I want and why it matters. No code yet—just intent.
↓
📋 GitHub Issue
The idea becomes a GitHub issue with acceptance criteria, context, and any reference materials. This is the contract—everything BIFROST needs to understand the scope. Labels and priority scores determine queue order.
↓
📐 PRISM Spec Intent
An AI triage system called PRISM reads the issue and auto-generates a technical specification. It identifies which skill agent to route to (UdonSharp, Terminal UI, Game Dev), estimates story points, and writes acceptance criteria. This is the blueprint the AI will follow.
↓
✅ Ready-for-Dev Human Approval
This is the gate. I review the PRISM spec, confirm the approach makes sense, and add the ready-for-dev label. BIFROST detects it and sends a notification to my phone. I reply APPROVE from anywhere—phone, laptop, watch. Nothing runs without this step.
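As a rough illustration of the detection step (not BIFROST's actual code), label polling could be sketched with the GitHub REST issues endpoint. The function names, the `seen` set, and the owner/repo values are assumptions made for this example; only the `ready-for-dev` label comes from the text above.

```python
import json
import urllib.parse
import urllib.request

API_ROOT = "https://api.github.com"


def build_issues_url(owner: str, repo: str, label: str = "ready-for-dev") -> str:
    """Build the REST URL that lists open issues carrying `label`."""
    query = urllib.parse.urlencode({"labels": label, "state": "open"})
    return f"{API_ROOT}/repos/{owner}/{repo}/issues?{query}"


def fetch_issues(url: str, token: str) -> list[dict]:
    """Perform one authenticated GET and decode the JSON list of issues."""
    request = urllib.request.Request(
        url,
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
        },
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def select_new_issues(issues: list[dict], seen: set[int]) -> list[dict]:
    """Return issues not handled yet, and remember their numbers in `seen`."""
    fresh = []
    for issue in issues:
        # The issues endpoint also returns pull requests; skip those.
        if "pull_request" in issue:
            continue
        if issue["number"] in seen:
            continue
        seen.add(issue["number"])
        fresh.append(issue)
    return fresh
```

A poller would call `fetch_issues(build_issues_url(owner, repo), token)` on a timer and pass the result through `select_new_issues`, so each labeled issue triggers exactly one approval request.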
↓
⚡ BIFROST Autonomous Run
The pipeline takes over. BIFROST spins up a Claude Code session with the PRISM spec, routes to the correct skill agent, and the AI starts building. It writes code, triggers Unity compilation via MCP, reads the console for errors, fixes them, and loops until clean. Then it tests in Play mode, validates the results, and moves to delivery.
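The compile, read, fix loop described above can be sketched as a bounded retry. This is a simplified illustration under stated assumptions: `compile_project` and `apply_fix` are hypothetical callables standing in for the Unity MCP compile trigger and the AI's edit step, and the attempt cap is an added safeguard that the text does not mention.

```python
from typing import Callable, List


def fix_until_clean(
    compile_project: Callable[[], List[str]],
    apply_fix: Callable[[List[str]], None],
    max_attempts: int = 5,
) -> bool:
    """Compile, hand any errors to the fixer, and repeat.

    Returns True once a compile reports no errors, or False after
    `max_attempts` compiles that all reported errors.
    """
    for _ in range(max_attempts):
        errors = compile_project()  # hypothetical: returns console error lines
        if not errors:
            return True
        apply_fix(errors)  # hypothetical: lets the AI edit the code
    return False
```

Capping the attempts keeps an unattended session from looping all night on an error it cannot resolve; a `False` result would be the signal to stop and report instead of opening a pull request.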
↓
🔀 Pull Request
A PR appears on GitHub with a full summary, self-review checklist, and test evidence. A structured event log provides a complete audit trail of every action the AI took. I review it in the morning—merge, request changes, or close. The human gets the final word.
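One common way to keep a structured audit trail is an append-only JSON Lines file, one event per line. The sketch below assumes that format; the field names (`ts`, `issue`, `action`, `details`) and the event names in the usage note are illustrative and not taken from BIFROST.

```python
import json
import time
from pathlib import Path


def log_event(log_path: Path, issue: int, action: str, **details) -> dict:
    """Append one event as a JSON line and return the record written."""
    record = {
        "ts": time.time(),
        "issue": issue,
        "action": action,
        "details": details,
    }
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, sort_keys=True) + "\n")
    return record


def read_events(log_path: Path) -> list[dict]:
    """Load every event back, in the order it was written."""
    with log_path.open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

For example, `log_event(path, 296, "pr_created", number=297)` appends a single line, and `read_events(path)` replays the whole run in order during the morning review.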
AI executes after human intent, never before.
────────────────────────────────────────────────────────────────────
[ The Transformation ]
"I stopped being the middleware between the AI and my codebase."
────────────────────────────────────────────────────────────────────
[ The Infrastructure ]
BIFROST isn't one script—it's an orchestration system built from five components that work together:
GitHub Bridge: Polls for ready-for-dev issues, orchestrates the full workflow
Queue Manager: FIFO with BBP priority scoring, persistence across restarts
Session Manager: Launches Claude Code + Unity Editor, manages lifecycle and timeouts
Approval Manager: Human gate via Home Assistant notifications or email from phone
Email System: A dedicated Bifrost mailbox for approving issues from anywhere
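To make the Queue Manager's behavior concrete, here is a minimal sketch of a priority queue that falls back to FIFO order on ties and survives restarts. It is an illustration, not the real module: the `IssueQueue` class, the JSON state file, and the plain integer priority (standing in for the BBP score, whose formula is not described here) are all assumptions.

```python
import heapq
import json
from pathlib import Path


class IssueQueue:
    """Highest priority first; equal priorities leave in arrival (FIFO) order."""

    def __init__(self, state_file: Path):
        self.state_file = state_file
        self._heap = []      # entries are [-priority, sequence, issue_number]
        self._sequence = 0   # increases on every push, so it breaks ties FIFO
        if state_file.exists():
            saved = json.loads(state_file.read_text(encoding="utf-8"))
            self._heap = saved["heap"]
            self._sequence = saved["sequence"]
            heapq.heapify(self._heap)

    def _save(self) -> None:
        """Write the full queue state to disk after every change."""
        payload = {"heap": self._heap, "sequence": self._sequence}
        self.state_file.write_text(json.dumps(payload), encoding="utf-8")

    def push(self, issue_number: int, priority: int) -> None:
        heapq.heappush(self._heap, [-priority, self._sequence, issue_number])
        self._sequence += 1
        self._save()

    def pop(self) -> int:
        _, _, issue_number = heapq.heappop(self._heap)
        self._save()
        return issue_number

    def __len__(self) -> int:
        return len(self._heap)
```

Saving after every push and pop is slower than batching writes, but for a queue of a few issues it means a crash or reboot never loses more than the operation in flight.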
────────────────────────────────────────────────────────────────────
[ The First Test ]
January 21, 2026. The first end-to-end autonomous run.
Issue #296—"Add Boot Screen + Sounds to Terminal"—was labeled ready-for-dev. BIFROST picked it up, queued it, requested approval via Home Assistant, spun up Claude Code with the /terminal-ui skill, and produced PR #297.
PASS GitHub polling detected the label
PASS Queue system with persistence
PASS Dry-run approval via HA notification
PASS Claude invoked with correct skill routing
PASS Code written, compiled, PR created
PASS 10+ HA notification events fired
PASS Full structured audit trail logged
FAIL Unity MCP scene navigation—couldn't wire GameObject references
"The infrastructure works. The failure was at the execution layer, not the orchestration layer."
The PR was closed because Unity wiring wasn't completed. But the pipeline—from issue detection to PR creation—worked flawlessly. The gap was solvable.
────────────────────────────────────────────────────────────────────
[ Remote Approval ]
The first test used Home Assistant notifications for approval—functional, but limited. I wanted to approve issues from my phone, my laptop, even my watch. So I built an email approval system.
BIFROST has a dedicated mailbox that only accepts emails from one authorized address. When BIFROST needs approval, it sends a request. I reply APPROVE from any device. That's it.
Security Model: ProtonMail with sender whitelist, DKIM/SPF/DMARC validation, Message-ID replay prevention
Architecture: ProtonMail Split Mode → Proton Bridge (local IMAP/SMTP) → BIFROST polling
Problem solved along the way: ProtonMail marks emails as "read" at the conversation level, not per-mailbox, so read/unread state could not reliably tell BIFROST which replies it had already processed. I switched from read/unread tracking to Message-ID based deduplication.
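The sender check and Message-ID deduplication could look roughly like the sketch below, which uses Python's standard `email` package. The function name, the in-memory `seen_ids` set, and the rule that the first non-empty body line must be APPROVE are assumptions for illustration. The sketch also does not perform the DKIM/SPF/DMARC validation listed above, and a `From` header on its own can be spoofed.

```python
from email import message_from_string
from email.utils import parseaddr


def is_new_approval(raw_email: str, allowed_sender: str, seen_ids: set) -> bool:
    """Return True only for a first-time APPROVE reply from the allowed sender."""
    message = message_from_string(raw_email)

    # Sender allowlist: compare the bare address, case-insensitively.
    _, sender = parseaddr(message.get("From", ""))
    if sender.lower() != allowed_sender.lower():
        return False

    # Replay prevention: a Message-ID is accepted at most once.
    message_id = (message.get("Message-ID") or "").strip()
    if not message_id or message_id in seen_ids:
        return False
    seen_ids.add(message_id)

    # Approve only when the first non-empty body line is exactly APPROVE.
    body = message.get_payload()
    if not isinstance(body, str):
        return False  # multipart mail is out of scope for this sketch
    lines = [line.strip() for line in body.splitlines() if line.strip()]
    return bool(lines) and lines[0].upper() == "APPROVE"
```

Because the decision depends on the Message-ID rather than the mailbox's read flag, the same reply can be fetched any number of times without triggering a second run. A real deployment would persist `seen_ids` across restarts.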
────────────────────────────────────────────────────────────────────
[ By The Numbers ]
1,500+ Lines of Python
6 Orchestration Modules
10+ Notification Events
2 Weeks to Build
────────────────────────────────────────────────────────────────────
[ What I Learned ]
🔐 Human-in-the-Loop: AI should execute after human intent, never autonomously decide what to build. The approval gate is the feature.
🏗️ Orchestration > Execution: The hard part isn't getting AI to write code. It's building the pipeline around it: queue, approval, session, cleanup.
🔄 Fail at the Right Layer: The first test "failed" but proved the architecture. Infrastructure worked; only the last-mile execution needed fixing.
────────────────────────────────────────────────────────────────────
[ What's Next ]
The bridge is built. Now I'm scaling what crosses it.
BIFROST is running. Issues go in, pull requests come out. The overnight agent is real.