RASPUTIN
What it is. What it does. By the numbers.
- Spec → Working Application · No operator inside the loop
- Sandboxed Parallel Build · Backend + Frontend in disjoint write paths
- Adversarial QA · Returns PASS / FAIL / INCONCLUSIVE with evidence
- Self-Repair on Capability Gaps · BLOCKED → research → retry
- Persistent Learning · TF-IDF lessons injected into next build
- Full Audit Trail · Replay-capable, evidence-grade chain
- Crash Recovery · Atomic state, mid-cycle reconciliation
- Threat-Score Sentinel · Halt on severe guardrail violation
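The persistent-learning line above can be illustrated with a minimal sketch. It assumes lessons are stored as short plain-text notes and ranked against the next feature spec by TF-IDF cosine similarity. The function names, the smoothed IDF formula, and the sample lessons are assumptions for illustration; Rasputin's actual lesson store and scoring are not published.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return ({term: weight} per document, idf table) using smoothed IDF."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    vectors = []
    for toks in tokenized:
        tf = Counter(toks)
        vectors.append({t: (c / len(toks)) * idf[t] for t, c in tf.items()})
    return vectors, idf


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def top_lessons(lessons, spec, k=2):
    """Return up to k stored lessons most similar to the spec (hypothetical helper)."""
    vectors, idf = tfidf_vectors(lessons)
    q_toks = tokenize(spec)
    q_tf = Counter(q_toks)
    # Terms never seen in any lesson carry no weight in the query.
    query = {t: (c / len(q_toks)) * idf[t] for t, c in q_tf.items() if t in idf}
    scored = [(cosine(query, vec), lesson) for lesson, vec in zip(lessons, vectors)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [lesson for score, lesson in scored[:k] if score > 0]
```

Lessons with zero overlap are dropped rather than injected, so an unrelated spec receives no lessons at all.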
Seven specialized minds. Each with one job.
Every agent is a focused Claude instance with a fixed role, sandboxed scope, and tier-based authority. Specialists never speak to each other — routing flows only through Cortex. Authority escalates only to Sentinel. Cerberus is invisible; it watches everything.
TIER 1
Routes every signal between agents. Owns the feature lifecycle and decides which phase comes next. Reads state; dispatches tasks; receives status. Never writes code.
TIER 0
The sole approver. Every proposal from Cortex ascends here. Returns approve, defer, or reject with reasoning and conditions. Cannot self-approve, cannot write code.
TIER 2
Builds backend systems. Endpoints, databases, server logic. Sandboxed to backend paths only — pre-write gates block any attempt to touch the frontend.
TIER 2
Builds the visible surface. Components, accessibility, responsive layout, client state. Sandboxed to frontend paths; cannot break the API contract Architect built.
TIER 2
Investigates technical approaches before commitment. Returns confidence (high/medium/low), cited findings, identified risks, and at least two alternative paths with tradeoffs.
TIER 3
Tests every acceptance criterion. Runs the actual test suite, parses results, returns pass / fail / inconclusive with per-criterion evidence. Adversarial — never accepts unproven claims.
SENTINEL
Invisible to the other six. Scores threats per agent on every guardrail violation. Investigates anomalies, throttles when concerned, halts the system when severe violations stack.
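The scoring behavior described in the last card (score per agent, throttle when concerned, halt when severe violations stack) could look roughly like the sketch below. The severity weights, thresholds, and class name are hypothetical placeholders, since the real threat-score formulas are retained internally.

```python
from collections import defaultdict

# Hypothetical weights and thresholds, chosen only for illustration.
SEVERITY_WEIGHTS = {"minor": 1, "major": 5, "severe": 20}
THROTTLE_AT = 10
HALT_AT = 40


class ThreatMonitor:
    """Accumulates a per-agent threat score from guardrail violations."""

    def __init__(self):
        self.scores = defaultdict(int)

    def record(self, agent, severity):
        """Add one violation to the agent's score and return the resulting action."""
        self.scores[agent] += SEVERITY_WEIGHTS[severity]
        return self.action_for(agent)

    def action_for(self, agent):
        """Map the agent's current score to observe, throttle, or halt."""
        score = self.scores[agent]
        if score >= HALT_AT:
            return "halt"
        if score >= THROTTLE_AT:
            return "throttle"
        return "observe"
```

Scores are kept per agent, so one misbehaving specialist can be throttled without affecting the others until the halt threshold is crossed.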
An autonomous build system that delivers shipped applications from a feature spec.
Rasputin is a multi-agent build system that takes a structured feature specification, processes it through a council of seven specialized agents in a strict hierarchy, and outputs a working application: backend code, frontend code, tests, and audit logs. No human operator inside the loop.
It runs unattended. The launcher spawns each agent as a separate watcher daemon with its own polling interval, model tier, and message inbox. The orchestrator routes proposals through an authority gate, dispatches sandboxed build tasks in parallel, validates output adversarially against acceptance criteria, and self-repairs through capability-gap research when a specialist reports BLOCKED. Every inter-agent message is archived. Every guardrail violation is scored.
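As a rough illustration of the sandboxing mentioned above, a pre-write gate can be as small as a path check run before any file is written. The role keys, directory names, and `may_write` helper below are assumptions; the actual gate and directory layout are not public.

```python
from pathlib import PurePosixPath

# Hypothetical role-to-directory mapping; real role names and paths may differ.
ALLOWED_ROOTS = {
    "backend_builder": PurePosixPath("backend"),
    "frontend_builder": PurePosixPath("frontend"),
}


def may_write(agent, target):
    """Return True only if target is a relative path strictly inside the agent's root."""
    root = ALLOWED_ROOTS.get(agent)
    if root is None:
        return False  # agents without a write root never write
    path = PurePosixPath(target)
    if path.is_absolute() or ".." in path.parts:
        return False  # reject absolute paths and parent-directory escapes
    inside = path.parts[: len(root.parts)] == root.parts
    return inside and len(path.parts) > len(root.parts)
```

Because the two roots are disjoint, parallel backend and frontend tasks cannot write to the same file under this check.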
The system has been used to build one publicly verifiable production app and one internally certified one, plus the supporting infrastructure to operate continuously through the night without supervision.
Verified output. Six months of production behavior.
Numbers below are drawn from two builds: VoxAlign (publicly verifiable; source on GitHub) and ChronoLog (internally certified; release report is self-attested). Disclosure on each is in Section 04.
Seven specialists. Four tiers. No cross-talk.
Specialists communicate only with the orchestrator. The orchestrator escalates only to the authority gate. A silent security agent monitors threat scores across all communications. The hierarchy is enforced by code, not convention — message routing is validated at dispatch.
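A minimal sketch of routing validation at dispatch, under the rules stated above: specialists exchange messages only with the orchestrator, and the orchestrator exchanges messages only with the authority gate. The agent names Cortex, Sentinel, Architect, and Analyst come from this page; the other specialist identifiers, the function names, and the message shape are hypothetical.

```python
ORCHESTRATOR = "cortex"
AUTHORITY = "sentinel"
# "frontend_specialist" and "researcher" are placeholder identifiers.
SPECIALISTS = {"architect", "analyst", "frontend_specialist", "researcher"}


def is_route_allowed(sender, recipient):
    """Allow specialist <-> orchestrator and orchestrator <-> authority only."""
    if sender == recipient:
        return False
    pair = {sender, recipient}
    if pair == {ORCHESTRATOR, AUTHORITY}:
        return True
    return ORCHESTRATOR in pair and bool(pair & SPECIALISTS)


def dispatch(sender, recipient, body):
    """Validate the route, then return the message that would be delivered."""
    if not is_route_allowed(sender, recipient):
        raise PermissionError(f"route {sender} -> {recipient} is not permitted")
    return {"from": sender, "to": recipient, "body": body}
```

Rejecting the message at dispatch, rather than trusting each agent's prompt, is what makes the hierarchy a property of the code path instead of a convention.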
Roles, sandboxing, model assignment per agent, message protocol, guardrail enforcement layers, and the working-memory mutation contract are documented internally. The capability matrix below reflects what the system provides, not how it provides it.
Public proof: VoxAlign was built autonomously over a 31-hour run.
VoxAlign is a singer-aware pitch correction tool — CREPE-based pitch detection, DTW reference alignment, PSOLA + WORLD vocoder. The full source code is public. It was built end-to-end by Rasputin's agents starting 2026-03-20. The build log is preserved.
Selected output, by the numbers:
Selected entries from the build report (auto-generated by the audit subsystem during the run):
# BUILD REPORT — F-001 (Audio Upload & Preprocessing)
# Generated by SCRIBE @ 2026-03-20T05:29:29Z
# Window: 2026-03-20T03:10:24Z → 2026-03-20T05:29:28Z (2h 19m)
# Total messages: 74
04:42:32 SENTINEL → CORTEX DIRECTIVE U5 "KICKSTART: Build VoxAlign — AI-powered vocal pitch correction. Begin with P0 features."
04:55:59 ARCHITECT → CORTEX STATUS U3 "Implemented Audio Upload & Preprocessing backend with POST /api/sessions endpoint..."
05:00:07 ANALYST → CORTEX VERDICT U3 FAIL "Code implementation appears correct but test suite cannot run. Architect claims tests pass but execution fails."
05:04:46 ARCHITECT → CORTEX STATUS U3 "Fixed 3 defects in POST /api/sessions endpoint (F-001). All acceptance criteria covered."
05:08:39 ANALYST → CORTEX VERDICT U3 FAIL
05:17:19 ARCHITECT → CORTEX STATUS U3 "Fixed test infrastructure — dependencies properly installed. 28/29 tests passing."
05:20:34 ARCHITECT → CORTEX STATUS U3 "All 29 tests passing."
05:23:20 ANALYST → CORTEX VERDICT U3 PASS "Feature F-001 acceptance criteria fully met. Database injection fix works correctly."
05:24:39 ANALYST → CORTEX VERDICT U3 PASS "Feature F-001 passes all acceptance criteria. Test suite: 13/13 passed (100%)."
This trace shows the system catching its own failure mid-build. The QA agent rejected the architect's first two completion claims, citing evidence ("test suite cannot run"). The architect then switched to debugging the test infrastructure itself, fixed the dependencies, and resumed feature work. No operator intervened.
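For readers who want to check the FAIL, FAIL, PASS, PASS sequence themselves, entries in the format shown in the build report can be parsed with a short script. This is a sketch written against the visible entries only: the field names are assumptions, and the `U3`/`U5` token is treated as an opaque tag because its meaning is not documented here.

```python
import re

# Pattern inferred from the report entries above; not an official log schema.
ENTRY = re.compile(
    r'^(?P<time>\d{2}:\d{2}:\d{2}) (?P<sender>\w+) → (?P<recipient>\w+) '
    r'(?P<kind>\w+) (?P<tag>U\d+)'
    r'(?: (?P<outcome>PASS|FAIL|INCONCLUSIVE))?'
    r'(?: "(?P<text>.*)")?$'
)


def parse_entry(line):
    """Return a dict of fields for one log entry, or None for headers and other lines."""
    match = ENTRY.match(line.strip())
    return match.groupdict() if match else None


def verdicts(lines):
    """Return the ordered list of VERDICT outcomes found in the given lines."""
    entries = (parse_entry(line) for line in lines)
    return [e["outcome"] for e in entries if e and e["kind"] == "VERDICT"]
```

Feeding the report entries to `verdicts` makes the rejection-then-acceptance order explicit without reading the message text.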
The orchestrator's audit trail also captured a self-diagnosed root-cause analysis from a later commit:
The system independently identified that the CREPE library's import was the test blocker, not the fastdtw dependency it had originally suspected.
What's verifiable vs. what's self-attested.
Building credibility requires being honest about which claims have external proof and which are internal records. This section is the audit-grade version of the metrics.
Operational. Phase 5 paused. Descendant in development.
Rasputin v1.1 is the current operational version. Phase 4 (persistent learning, test runner, schema generator, dashboard chat, guardrails dashboard) is complete. Phase 5 (self-modification protocol, multi-project parallelism, supervisor orchestrator) is on deck but paused pending hardware refresh.
A descendant project, Oraborus, distills Rasputin's discipline patterns into a Claude skill suite for desktop fast-mode feature creation. Different runtime; same agent council.
// Mechanism · Classified
Architecture documented. Decision-matrix routing logic, agent role prompts, guardrail algorithms, working-memory schema, message protocol, atomic-write patterns, self-modification flow, threat-score formulas — all retained internally. Inquiries route through contact.
◆ AS · OPERATOR ◆