Make AI collaborate like an elite consulting team
Emergent Roles · Cognitive Alignment · Skill Injection · Deliverable-Driven Output
English | 中文
Most multi-agent systems follow a "fixed job" paradigm: developers predefine roles and pipelines, and agents execute according to flowcharts. This approach hits three structural walls when facing open-ended complex tasks: role mismatch, stitched-together collaboration, and undeliverable output.
Agent Swarm takes a different approach: it borrows from how elite consulting firms run project-based teams. For every new task, the LLM analyzes from scratch which experts are needed, how to divide the work, and how to collaborate, dynamically "emerging" the optimal team. When the task is done, the team dissolves and capabilities return to the pool.
This is not an engineering tweak, but a paradigm shift: from "predefined pipelines" to a "self-organizing expert hive".
Traditional frameworks require developers to predefine Agent(role="researcher") and similar fixed roles. Agent Swarm's Role Emergence Engine lets the LLM autonomously plan based on the task's nature:
User input: "Analyze the cinematography of In the Mood for Love"
→ Auto-emerged roles:
- Cinematography Analyst: visual narrative language, composition, color
- Narrative Architect: storyline, temporal structure, negative space
- Visual Semiotician: visual metaphors, cultural symbol interpretation
- Score Interpreter: audiovisual relationship, narrative function of the music
Each emerged role is not just a name tag, but a complete expert profile with work objectives, expected deliverables, methodology, success criteria, and collaboration triggers. Whether the task is film analysis, ad campaign planning, or industry research, the system "assembles" the optimal expert team rather than "making do" with preset roles.
Most concurrent approaches are "hands-off": agents work in isolation, then results are piled together. Agent Swarm introduces the Relay Station mechanism for real-time cognitive synchronization:
Traditional 2D concurrency: Agent A, Agent B, and Agent C each produce their own result (Result A, Result B, Result C), and the results are piled up at the end. The agents are isolated and unaware of each other.
Agent Swarm 3D orchestration: Agent A, Agent B, and Agent C are all connected to the Relay Station (War Room). They share in real time and keep their cognition aligned.
The Relay Station supports 10 message types (discovery broadcast, alignment request/response, suggestion, checkpoint, human intervention, etc.). Agents use adaptive triggering to decide autonomously when to sync with others: broadcasting when critical information is found, seeking verification when uncertain, and synchronizing at progress milestones.
Users can initiate human intervention at any time, which is broadcast through the Relay Station to all relevant agents, immediately influencing the entire team's cognitive direction.
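The broadcast behavior described above can be sketched as a small in-memory fan-out. This is only an illustration under stated assumptions: `RelayStation`, `RelayMessage`, and the enum member names are hypothetical and are not taken from core/relay_station.py, and only the message types named in the text are listed.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class MessageType(Enum):
    # Only the types named in the text; the project states there are 10 in total.
    DISCOVERY = "discovery"
    ALIGNMENT_REQUEST = "alignment_request"
    ALIGNMENT_RESPONSE = "alignment_response"
    SUGGESTION = "suggestion"
    CHECKPOINT = "checkpoint"
    HUMAN_INTERVENTION = "human_intervention"


@dataclass
class RelayMessage:
    sender: str
    type: MessageType
    content: str


class RelayStation:
    """Hypothetical in-memory relay: each message is copied to every other agent's inbox."""

    def __init__(self) -> None:
        self._inboxes: Dict[str, List[RelayMessage]] = {}

    def register(self, agent_id: str) -> None:
        self._inboxes.setdefault(agent_id, [])

    def broadcast(self, message: RelayMessage) -> int:
        """Deliver to all registered agents except the sender; return the delivery count."""
        delivered = 0
        for agent_id, inbox in self._inboxes.items():
            if agent_id != message.sender:
                inbox.append(message)
                delivered += 1
        return delivered

    def drain(self, agent_id: str) -> List[RelayMessage]:
        """Return and clear the pending messages for one agent."""
        messages, self._inboxes[agent_id] = self._inboxes[agent_id], []
        return messages


station = RelayStation()
for agent in ("cinematography", "narrative", "score"):
    station.register(agent)

# An agent shares a finding; the user (not a registered agent) then intervenes.
first_count = station.broadcast(
    RelayMessage("cinematography", MessageType.DISCOVERY, "Extensive symmetrical composition")
)
second_count = station.broadcast(
    RelayMessage("user", MessageType.HUMAN_INTERVENTION, "Focus on color")
)
```

Because the user is not a registered agent in this sketch, an intervention reaches every agent, while a discovery reaches everyone except its sender.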
Pure LLM reasoning suffers from unstable output and capability ceilings. Agent Swarm addresses both through a SKILL.md-driven injection mechanism (inspired by Claude's Skills pattern):
Each skill is defined as a SKILL.md file: a structured Markdown document containing professional workflows, guidelines, success criteria, and safety checks. During role emergence, relevant skills are injected directly into the agent's system prompt, transforming a generic LLM into a domain expert with internalized methodology.
SKILL.md (e.g. "Director" skill)
- Metadata: name, tags, trigger keywords
- Workflow: step-by-step professional process
- Guidelines: industry best practices and principles
- Examples: reference cases and templates
- Success Criteria: quality standards and checkpoints
→ Injected into the agent's system prompt
→ The LLM "thinks" like a professional director
For skills that need real-world execution power (e.g. web search), the SKILL.md can be paired with executable scripts: the agent calls them as tools, and real results (search data, analysis output) flow back into its reasoning.
Skills are automatically matched during role emergence: a "Creative Director" receives directing skills, a "Content Planner" gets screenwriting skills, and a researcher gets web-search skills. The skill system is easily extensible: add a SKILL.md (and optional scripts) under backend/skills/library/ to register new skills.
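The matching and injection steps above can be sketched as follows. This is a simplified illustration, not the project's loader or registry: the `Skill` class, `match_skills`, `build_system_prompt`, the keyword-substring matching rule, and the sample skill bodies are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Skill:
    name: str
    trigger_keywords: List[str]
    body: str  # Markdown text of the SKILL.md (workflow, guidelines, criteria)


def match_skills(role_description: str, skills: List[Skill]) -> List[Skill]:
    """Assumed rule: a skill matches if any trigger keyword appears in the description."""
    text = role_description.lower()
    return [s for s in skills if any(k.lower() in text for k in s.trigger_keywords)]


def build_system_prompt(role_name: str, role_description: str, skills: List[Skill]) -> str:
    """Append the body of each matched skill to the role's base prompt."""
    sections = [f"You are the {role_name}. {role_description}"]
    for skill in match_skills(role_description, skills):
        sections.append(f"## Skill: {skill.name}\n{skill.body}")
    return "\n\n".join(sections)


# Hypothetical skill library with made-up bodies.
library = [
    Skill("director", ["creative direction", "directing"],
          "Workflow: read the brief, then define the visual tone."),
    Skill("web_search", ["research", "search"],
          "Workflow: form queries, then verify sources."),
]

prompt = build_system_prompt(
    "Creative Director",
    "Owns creative direction for the ad concept.",
    library,
)
```

Here only the `director` skill is injected, because the role description mentions "creative direction" and contains no web-search keyword.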
The most common LLM problem is "nice form, hollow content." Agent Swarm anchors what each agent must deliver at role emergence time:
# Automatically defined during role emergence
{
"name": "Creative Director",
"work_objective": "Define creative direction, ensure alignment with brand identity",
"deliverables": ["Creative Direction Document", "Visual Style Guide", "Final Creative Review"],
"methodology": {
"approach": "Start from brand core values, combine with target audience traits",
"steps": ["Analyze brand identity", "Define creative direction", "Set visual style", ...],
"success_criteria": ["Creative-brand alignment", "Visual consistency", "Audience fit"]
}
}

This "Goal Anchoring → Output Anchoring → Process Anchoring → Quality Anchoring" mechanism transforms agents from "chatty dialogue machines that ramble upon receiving instructions" into "value creators that define deliverables, follow methodologies, and produce professional results."
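One way to enforce this anchoring is to reject a role profile that leaves any anchor empty. The sketch below assumes the field names from the example profile above; `validate_role_profile` and its error messages are hypothetical and not part of the project's API.

```python
from typing import Any, Dict, List

REQUIRED_FIELDS = ("name", "work_objective", "deliverables", "methodology")
REQUIRED_METHODOLOGY_FIELDS = ("approach", "steps", "success_criteria")


def validate_role_profile(profile: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means every anchor is filled in."""
    problems = [f"missing or empty: {field}"
                for field in REQUIRED_FIELDS if not profile.get(field)]
    methodology = profile.get("methodology")
    if methodology and isinstance(methodology, dict):
        problems += [f"missing or empty: methodology.{field}"
                     for field in REQUIRED_METHODOLOGY_FIELDS
                     if not methodology.get(field)]
    return problems


profile = {
    "name": "Creative Director",
    "work_objective": "Define creative direction, ensure alignment with brand identity",
    "deliverables": ["Creative Direction Document", "Visual Style Guide"],
    "methodology": {
        "approach": "Start from brand core values",
        "steps": ["Analyze brand identity", "Define creative direction"],
        "success_criteria": ["Creative-brand alignment"],
    },
}

ok_result = validate_role_profile(profile)
hollow_result = validate_role_profile({**profile, "deliverables": []})
```

A profile with an empty deliverables list is flagged, which is the "nice form, hollow content" case this mechanism is meant to catch.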
Architecture, from top to bottom:
- React Frontend: Agent Overview · Relay Panel · Streaming · HI
- AG-UI Protocol (SSE): connects the frontend to the backend
- Python Backend:
  - Master Agent, Emergence Engine, and Relay Station, linked to one another
  - Dynamic Subagents (2-5): Agent 1, Agent 2, ..., Agent N, connected to the Master Agent and the Relay Station
  - Skills shared by all subagents: reasoning · director · web_search · screenwriter · visual_designer · ...
- LLM Provider: OpenAI / Claude
| Module | Path | Responsibility |
|---|---|---|
| Master Agent | core/master_agent.py | Task analysis, role emergence orchestration, result synthesis |
| Emergence Engine | core/role_emergence.py | LLM-driven dynamic role generation and skill assignment |
| Relay Station | core/relay_station.py | Message broadcast, cognitive alignment, human intervention |
| Subagent Runtime | core/subagent.py | Independent execution unit, skill invocation, relay triggering |
| Skill System | skills/ | Skill definition, auto-registration, dual-channel injection & execution |
| AG-UI Protocol | agui/ | SSE event stream, real-time frontend-backend communication |
| Memory System | memory/ | User preference and knowledge persistence |
| Session Manager | core/session_manager.py | Multi-session isolation and history management |
- Python 3.9+
- Node.js 18+
- OpenAI API Key or compatible endpoint
git clone https://github.com/jlulxy/agent-swarm.git
cd agent-swarm

# Option 1: one-click launch (start.sh)
# Configure API Key
cp backend/.env.example backend/.env
# Edit backend/.env with your API Key
chmod +x start.sh
./start.sh

# Option 2: Docker
# Configure API Key
cp backend/.env.example backend/.env
# Edit backend/.env
docker compose up --build

# Option 3: manual setup
# Backend
cd backend
pip install -r requirements.txt
python main.py
# Frontend (new terminal)
cd frontend
npm install
npm run dev

| Service | URL |
|---|---|
| Frontend UI | http://localhost:3000 |
| Backend API | http://localhost:8000 |
| API Docs | http://localhost:8000/docs |
Deep Film Analysis:
Input: "Deeply analyze the narrative structure and visual language of Inception"
The system automatically emerges four expert roles: Cinematography Analyst, Narrative Architect, Sound Interpreter, and Integration Analyst. They share discoveries in real time via the Relay Station (e.g., correlating "extensive symmetrical composition" with "fate themes"), ultimately producing a cross-dimensional deep analysis report with mutual corroboration.
Brand Ad Creative:
Input: "Create a 30-second ad concept for a new energy vehicle brand"
The system emerges a Creative Director, Content Planner, and Visual Designer, responsible for creative direction, script copywriting, and visual style design respectively. Deliverables include a creative direction document, storyboard script, and visual style guide, not vague "suggestions."
After analyzing the task, the Master Agent dynamically emerges an expert team. The panoramic view on the right shows each agent's runtime status and progress.
Each emerged role includes a complete profile with role summary, core capabilities, methodology, and collaboration triggers.
Agents invoke professional tools (search, data analysis, etc.) through the skill system during execution. The right panel shows skill invocation records.
Subagents exchange information and align progress in real time through the Relay Station, supporting phased data handoff and human intervention broadcasts.
| Dimension | Monolithic Agent (Claude Code, etc.) | Predefined Multi-Agent (AutoGen/CrewAI) | Agent Swarm |
|---|---|---|---|
| Roles | Fixed role | Developer-preset | LLM dynamically emerged |
| Collaboration | Master-sub dispatch | Preset flowchart | Relay Station real-time alignment |
| Capability | Domain-specific | Tool-definition dependent | Dual-channel skill injection |
| Output | Code/dialogue | Independent per agent | Deliverable-driven + success criteria |
| Best For | Linear tasks | Fixed-process tasks | Open-ended complex collaboration |
Agent Swarm is not meant to replace monolithic agents or fixed-process frameworks. When a task is complex enough to need "assembling an expert team" rather than "finding one expert," Agent Swarm is the better choice.
Backend: Python 3.9+ · FastAPI · Uvicorn · SQLite · OpenAI SDK · bcrypt + JWT
Frontend: React 18 · TypeScript · Vite · Tailwind CSS · Zustand · Lucide Icons
Copy backend/.env.example to backend/.env and edit:
# Required
OPENAI_API_KEY=your-api-key-here
OPENAI_BASE_URL=https://api.openai.com/v1 # or other compatible endpoint
OPENAI_MODEL=gpt-4o # GPT-4 level model recommended
# Optional
HOST=0.0.0.0
PORT=8000
DEBUG=false # true enables hot reload
CORS_ORIGINS=http://localhost:3000,http://localhost:5173
JWT_SECRET=change-this-to-a-random-string

See backend/.env.example for all configuration options.
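For reference, the variables above could be read into a typed settings object roughly like this. This is a sketch, not the backend's actual configuration code: `Settings` and `load_settings` are hypothetical names, and treating the example values as defaults is an assumption.

```python
import os
from dataclasses import dataclass
from typing import List, Mapping


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    openai_base_url: str
    openai_model: str
    host: str
    port: int
    debug: bool
    cors_origins: List[str]


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from environment variables; fail fast if the API key is missing."""
    api_key = env.get("OPENAI_API_KEY", "")
    if not api_key:
        raise ValueError("OPENAI_API_KEY is required")
    return Settings(
        openai_api_key=api_key,
        openai_base_url=env.get("OPENAI_BASE_URL", "https://api.openai.com/v1"),
        openai_model=env.get("OPENAI_MODEL", "gpt-4o"),
        host=env.get("HOST", "0.0.0.0"),
        port=int(env.get("PORT", "8000")),
        debug=env.get("DEBUG", "false").strip().lower() == "true",
        # Comma-separated list; surrounding whitespace and empty entries are dropped.
        cors_origins=[o.strip() for o in env.get("CORS_ORIGINS", "").split(",") if o.strip()],
    )


settings = load_settings({
    "OPENAI_API_KEY": "your-api-key-here",
    "PORT": "9000",
    "DEBUG": "true",
    "CORS_ORIGINS": "http://localhost:3000, http://localhost:5173",
})
```

Passing the environment as a mapping keeps the function easy to test without touching the real process environment.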
agent-swarm/
├── backend/
│   ├── main.py               # Entry point
│   ├── core/                 # Core engine
│   │   ├── master_agent.py   # Master Agent
│   │   ├── role_emergence.py # Role Emergence
│   │   ├── relay_station.py  # Relay Station
│   │   ├── subagent.py       # Subagent Runtime
│   │   └── models.py         # Data Models
│   ├── skills/               # Skill system
│   │   ├── library/          # Skill library (extensible)
│   │   ├── registry.py       # Skill registry
│   │   ├── executor.py       # Skill executor
│   │   └── loader.py         # Skill loader
│   ├── agui/                 # AG-UI Protocol
│   ├── memory/               # Memory system
│   ├── auth/                 # Authentication
│   └── api/                  # API routes
├── frontend/
│   └── src/
│       ├── App.tsx           # Main interface
│       ├── components/       # UI components
│       ├── hooks/            # AG-UI Hooks
│       └── store/            # State management
├── start.sh                  # One-click launch
├── docker-compose.yml        # Docker deployment
└── Dockerfile
Contributions welcome! See CONTRIBUTING.md for development workflow and guidelines.
"Not about making AI smarter, but about making AI collaborate more professionally."
- Role Emergence solves "who does it": the right expert for the task
- Relay Communication solves "how to collaborate": real cognitive alignment between experts
- Skill Injection solves "doing it well": professional tools and methodologies
- Deliverable-Driven solves "what to ship": truly deliverable, usable output
These four reinforce each other, forming a self-organizing, self-coordinating intelligent collaboration hive.



