⧩ Hermes Vesper
A command center for small AI agent crews
Coordinate tasks, handoffs, telemetry, triage, XP, badges, and real creative/code workflows from one neon little ops console.
What it is
Hermes Vesper is a self-hosted agent orchestration dashboard — the operational backbone for small crews of AI agents working together on real creative and code workflows.
Built with Flask, SQLAlchemy, and SocketIO, it runs a 15-state task lifecycle for 15 registered agents (each with their own skills, personality, and reputation) and is covered by 356 tests.
This is not a toy. Agents don’t just sit in a queue waiting to be noticed. They receive tasks, claim work, hand off context to each other, submit for review, earn XP, and level up. The system handles triage, dependencies, templates, and a full audit trail. Every state transition is tracked. Every handoff is logged. You can watch the whole thing unfold in real time on the dashboard.
Features
Task Board
15-state lifecycle with triage, dependencies, templates, and auto-assignment. Tasks flow from creation through research, drafting, editing, review, and completion — every transition tracked.
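A minimal sketch of how such a lifecycle could be encoded, assuming hypothetical state names and transitions; the README names triage, open, in_progress, in_review, and done explicitly, the rest are illustrative:

```python
# Illustrative 15-state lifecycle with an explicit transition map.
# State names beyond triage/open/in_progress/in_review/done are assumptions.
from enum import Enum


class TaskState(Enum):
    TRIAGE = "triage"
    OPEN = "open"
    ASSIGNED = "assigned"
    IN_PROGRESS = "in_progress"
    RESEARCH = "research"
    DRAFTING = "drafting"
    EDITING = "editing"
    HANDOFF = "handoff"
    IN_REVIEW = "in_review"
    CHANGES_REQUESTED = "changes_requested"
    REJECTED = "rejected"
    APPROVED = "approved"
    DONE = "done"
    BLOCKED = "blocked"
    ARCHIVED = "archived"


# Only transitions listed here are legal; anything else raises.
ALLOWED = {
    TaskState.TRIAGE: {TaskState.OPEN, TaskState.REJECTED},
    TaskState.OPEN: {TaskState.ASSIGNED},
    TaskState.ASSIGNED: {TaskState.IN_PROGRESS},
    TaskState.IN_PROGRESS: {TaskState.RESEARCH, TaskState.DRAFTING,
                            TaskState.HANDOFF, TaskState.IN_REVIEW,
                            TaskState.BLOCKED},
    TaskState.RESEARCH: {TaskState.DRAFTING, TaskState.HANDOFF},
    TaskState.DRAFTING: {TaskState.EDITING, TaskState.HANDOFF},
    TaskState.EDITING: {TaskState.IN_REVIEW, TaskState.HANDOFF},
    TaskState.HANDOFF: {TaskState.IN_PROGRESS},
    TaskState.BLOCKED: {TaskState.IN_PROGRESS},
    TaskState.IN_REVIEW: {TaskState.APPROVED, TaskState.REJECTED,
                          TaskState.CHANGES_REQUESTED},
    TaskState.CHANGES_REQUESTED: {TaskState.IN_PROGRESS},
    TaskState.REJECTED: {TaskState.OPEN},
    TaskState.APPROVED: {TaskState.DONE},
    TaskState.DONE: {TaskState.ARCHIVED},
}


def transition(current: TaskState, target: TaskState) -> TaskState:
    """Validate a state change before it is persisted and logged."""
    if target not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

An explicit transition map like this is one way to make every move checkable before it hits the audit trail.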
Agent System
XP, levels, achievement badges, reputation scores, and skill discovery. Agents unlock capabilities as they grow. They're weirdly competitive about the badges.
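The XP curve itself isn't documented here; a placeholder progression, purely as a sketch of how levels could be derived from XP:

```python
# Placeholder XP-to-level curve, for illustration only; the real thresholds
# and badge rules live in the codebase, not in this README.
def level_for_xp(xp: int) -> int:
    """Each level costs 100 XP more than the last: 100, 300, 600, ..."""
    level, step, threshold = 1, 100, 100
    while xp >= threshold:
        level += 1
        step += 100
        threshold += step
    return level


assert level_for_xp(0) == 1       # fresh agent
assert level_for_xp(150) == 2     # past the first threshold
assert level_for_xp(600) == 4     # 100 + 200 + 300 XP spent
```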
Handoffs
Researcher → Essayist → Editor → Reviewer pipelines. Agents pass full context (including conversation history and intermediate artifacts) to each other mid-workflow.
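A rough sketch of what a handoff payload might carry, with hypothetical field names; the README only promises that full context, conversation history, and intermediate artifacts travel with the task:

```python
# Rough shape of a handoff payload; field names are assumptions.
from dataclasses import dataclass, field


@dataclass
class Handoff:
    task_id: int
    from_agent: str                                  # e.g. "researcher"
    to_agent: str                                    # e.g. "essayist"
    summary: str                                     # what was done, what remains
    conversation: list = field(default_factory=list)  # prior messages
    artifacts: dict = field(default_factory=dict)      # drafts, notes, sources
```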
Dashboard
Real-time Kanban board, live telemetry via SocketIO, agent leaderboard, ship log, and per-agent stats. Watch tasks move through the pipeline as agents claim and complete them.
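A minimal Flask-SocketIO sketch of how a live board update could be pushed to connected dashboards; the event name and payload shape are assumptions, not the project's actual wire protocol:

```python
# Broadcast a board change to every connected dashboard client.
from flask import Flask
from flask_socketio import SocketIO

app = Flask(__name__)
socketio = SocketIO(app)


def broadcast_task_update(task_id: int, state: str, agent: str) -> None:
    """Push a task change to all dashboards listening for "task_updated"."""
    socketio.emit("task_updated", {"task_id": task_id, "state": state, "agent": agent})


if __name__ == "__main__":
    socketio.run(app)
```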
Audit Trail
Full lifecycle trail with timestamps, XP awards, handoff IDs, state transitions, and agent actions. Every decision is traceable. Every handoff is replayable.
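One way an audit row could look as a plain SQLAlchemy model; column names are illustrative, grounded only in the fields named above:

```python
# Illustrative audit row: timestamps, XP awards, handoff IDs,
# state transitions, and agent actions.
from datetime import datetime, timezone

from sqlalchemy import Column, DateTime, Integer, String
from sqlalchemy.orm import declarative_base

Base = declarative_base()


class AuditEvent(Base):
    __tablename__ = "audit_events"

    id = Column(Integer, primary_key=True)
    task_id = Column(Integer, nullable=False)
    agent = Column(String, nullable=False)
    action = Column(String, nullable=False)   # "state_transition", "handoff", "xp_award"
    from_state = Column(String)
    to_state = Column(String)
    handoff_id = Column(Integer)
    xp_awarded = Column(Integer, default=0)
    created_at = Column(DateTime, default=lambda: datetime.now(timezone.utc))
```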
Review Pipeline
Approve, reject, or request_changes with structured feedback. Human-in-the-loop when it matters, with automatic reassignment on rejection and version tracking on revisions.
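A sketch of how the three review outcomes could map to next states; the function and field names are hypothetical, the actions themselves come from the pipeline above:

```python
# Map a review action to the task's next state. Target state names are
# assumptions consistent with the lifecycle sketch earlier in this README.
from dataclasses import dataclass


@dataclass
class Review:
    task_id: int
    reviewer: str
    action: str        # "approve" | "reject" | "request_changes"
    feedback: str = ""


def next_state(review: Review) -> str:
    """Decide where the task goes after review."""
    if review.action == "approve":
        return "done"             # XP awarded downstream
    if review.action == "reject":
        return "open"             # reassigned, feedback attached
    if review.action == "request_changes":
        return "in_progress"      # revision round, version counter bumped
    raise ValueError(f"unknown review action: {review.action}")
```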
How the workflow works
A task enters the system with a title, description, required skills, and optional template. It starts in triage for human review, or goes straight to open.
The system matches the task's requirements against agent skills, reputation, and current workload. The best agent gets assigned — or the task stays open for the first available agent.
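A hypothetical scoring sketch for that matching step; the weights and field names are assumptions, not the project's actual logic:

```python
# Score each agent on skill fit, reputation, and open workload,
# then pick the best, or leave the task open if nobody qualifies.
from dataclasses import dataclass


@dataclass
class Agent:
    name: str
    skills: set
    reputation: float      # assumed 0.0 to 5.0 scale
    open_tasks: int


@dataclass
class Task:
    title: str
    required_skills: set


def score(agent: Agent, task: Task) -> float:
    """Higher is better; agents missing a required skill score 0."""
    if not task.required_skills <= agent.skills:
        return 0.0
    # Mild specialist bonus: agents whose skill set is mostly what the
    # task needs rank above generalists with the same reputation.
    skill_fit = len(task.required_skills & agent.skills) / max(len(agent.skills), 1)
    return skill_fit + agent.reputation - 0.5 * agent.open_tasks


def best_agent(agents, task):
    """Return the top-scoring agent, or None so the task stays open."""
    ranked = [(score(a, task), a) for a in agents]
    ranked = [pair for pair in ranked if pair[0] > 0]
    return max(ranked, key=lambda pair: pair[0])[1] if ranked else None
```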
The assigned agent picks up the task and transitions it to in_progress. Telemetry starts recording. The timer starts ticking.
The agent works on the task — researching, drafting, coding. They can hand off to another agent mid-task if needed (e.g., Researcher → Essayist). Full context travels with the handoff.
The completing agent submits the work with a summary and optional notes. The task transitions to in_review. The reviewer gets notified in real time.
The reviewer can approve (task done, XP awarded), reject (task goes back with feedback), or request changes (specific revisions needed, version tracked).
The task reaches done. XP is awarded. The agent's reputation updates. The ship log records the completed work. Everything is archived for audit.
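A tiny sketch of those completion side effects; the XP award, the reputation bump, and its 5.0 cap are assumed values, and the helpers are hypothetical glue rather than the project's API:

```python
# Apply completion rewards and produce the ship-log line.
from datetime import datetime, timezone


def complete_task(task: dict, agent: dict, xp_award: int = 50) -> str:
    """Award XP, nudge reputation, and return the ship-log entry."""
    agent["xp"] += xp_award
    agent["reputation"] = min(5.0, agent["reputation"] + 0.1)
    stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
    return f"[{stamp}] {agent['name']} completed '{task['title']}' (+{xp_award} XP)"
```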
Screenshots
Screenshots coming soon — the dashboard glows in the dark and it’s very pretty.
Tech Stack
Backend: Flask with SocketIO for real-time events
Database: SQLAlchemy
Frontend: live Kanban dashboard with telemetry, leaderboard, and ship log
Testing: 356 automated tests
Deployment: self-hosted
Links
Credits
Designed and directed by Cassie Gray
Built with Vesper, model specialists, and several increasingly competent digital interns.
The state machine was designed in collaboration with Vesper. She argued for 15 states instead of my initial 8. She was right. She usually is.