Forge Architecture

Overview

Forge is a 6-layer AI development orchestration system designed for scalability, modularity, and extensibility. Each layer has a specific responsibility and communicates through well-defined interfaces.

Design Principles

Separation of Concerns - Each layer handles one aspect of development
Progressive Enhancement - Layers build upon previous layers
Fail-Fast - Errors caught early prevent wasted resources
Parallel Execution - Independent tasks run concurrently
Pattern-Driven - Knowledge patterns guide generation
Test-Driven - Testing integrated at every step

Layer Architecture

User Input (Natural Language)
         ↓
┌────────────────────────────────────┐
│   Layer 1: Decomposition           │  ← KnowledgeForge Patterns
│   - Conversational planning        │
│   - Task breakdown                 │
│   - Dependency analysis            │
└────────────────────────────────────┘
         ↓ TaskPlan
┌────────────────────────────────────┐
│   Layer 2: Planning                │  ← Tech Stack Selection
│   - File structure design          │
│   - Technology selection           │
│   - Module organization            │
└────────────────────────────────────┘
         ↓ ProjectPlan
┌────────────────────────────────────┐
│   Layer 3: Generation              │  ← Multi-Agent Generation
│   - Parallel code generation       │
│   - Pattern application            │
│   - Quality assurance              │
└────────────────────────────────────┘
         ↓ GeneratedCode
┌────────────────────────────────────┐
│   Layer 4: Testing                 │  ← Docker Isolation
│   - Unit tests                     │
│   - Integration tests              │
│   - Security scanning              │
│   - Performance benchmarking       │
└────────────────────────────────────┘
         ↓ TestResults
┌────────────────────────────────────┐
│   Layer 5: Review                  │  ← Iterative Refinement
│   - Failure analysis               │
│   - Fix generation                 │
│   - Learning database              │
└────────────────────────────────────┘
         ↓ Fixes (if needed, loop to Layer 4)
┌────────────────────────────────────┐
│   Layer 6: Deployment              │  ← Git & Deployment
│   - Git workflows                  │
│   - PR creation                    │
│   - Deployment configs             │
└────────────────────────────────────┘
         ↓
Production-Ready Code

Layer Details

Layer 1: Decomposition

Purpose: Transform user requirements into actionable tasks

Components:

DecompositionLayer - Main orchestrator
ConversationalPlanner - Interactive requirement gathering
TaskDecomposer - Break down into tasks
DependencyAnalyzer - Build dependency graph

Key Data Structures:

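from dataclasses import dataclass
from typing import List

@dataclass
class Task:
    id: str
    title: str
    description: str
    dependencies: List[str]  # ids of tasks that must finish first
    complexity: Complexity   # Complexity type is defined elsewhere in Forge
    estimated_time: int  # minutes
    tech_stack: List[str]
    file_outputs: List[str]  # paths this task is expected to produce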

Pattern Usage:

Loads KB3 patterns for best practices
Uses pattern library to identify task types
Applies complexity estimation patterns

Flow:

User provides natural language description
System asks clarifying questions
Decomposes into tasks with KF patterns
Analyzes dependencies
Estimates complexity
Returns TaskPlan
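
The dependency-analysis step in the flow above can be sketched with the standard library's graphlib module. This is a minimal sketch and not Forge's actual DependencyAnalyzer: the function name order_tasks and the dict-based input are assumptions made for illustration, mirroring Task.id and Task.dependencies.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List


def order_tasks(dependencies: Dict[str, List[str]]) -> List[str]:
    """Return task ids in an order that respects their dependencies.

    `dependencies` maps a task id to the ids it depends on
    (hypothetical input shape, not Forge's real TaskPlan).
    """
    known = set(dependencies)
    for task_id, deps in dependencies.items():
        missing = [d for d in deps if d not in known]
        if missing:
            raise ValueError(f"{task_id} depends on unknown tasks: {missing}")
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # Fail fast: a cyclic plan can never be scheduled.
        raise ValueError(f"dependency cycle detected: {exc.args[1]}") from exc
```

Rejecting unknown ids and cycles at this point follows the Fail-Fast principle: a plan that cannot be scheduled is caught before any generation cost is incurred.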

Layer 2: Planning

Purpose: Design project structure and select technologies

Components:

PlanningLayer - Structure designer
TechStackSelector - Technology selection
FileStructureGenerator - Directory layout
DependencyResolver - Package management

Key Data Structures:

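from dataclasses import dataclass
from typing import Dict, List

@dataclass
class ProjectPlan:
    tasks: List[Task]
    file_structure: Dict[str, FileSpec]  # FileSpec is defined elsewhere in Forge
    tech_stack: TechStack                # TechStack is defined elsewhere in Forge
    dependencies: List[str]
    entry_points: List[str]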

Pattern Usage:

Project structure patterns
Tech stack best practices
Naming conventions

Flow:

Receives TaskPlan
Selects appropriate tech stack
Designs file structure
Plans module organization
Returns ProjectPlan

Layer 3: Generation

Purpose: Generate code using AI and patterns

Components:

GenerationLayer - Generation orchestrator
CodeGenAPI - API-based generation
ClaudeCodeGenerator - Claude-based generation
GeneratorFactory - Provider abstraction
QualityChecker - Code validation

Providers:

Anthropic Claude (primary)
OpenAI GPT-4 (fallback)

Key Features:

Parallel Generation - Multiple tasks simultaneously
Pattern Integration - KF patterns in prompts
Context Management - Cross-file dependencies
Quality Checks - Syntax validation, best practices

Flow:

Receives ProjectPlan
Groups tasks by dependencies
Generates code in parallel waves
Validates each output
Returns GeneratedCode
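
The primary/fallback provider arrangement listed above could look roughly like the following. This is a sketch under stated assumptions: generate_with_fallback and the (name, callable) provider shape are hypothetical and stand in for whatever GeneratorFactory actually returns; no real provider SDK is called.

```python
from typing import Callable, List, Tuple

# A provider is modeled as a (name, callable) pair; the callable takes a
# prompt and returns generated code. Both are hypothetical stand-ins.
Provider = Tuple[str, Callable[[str], str]]


def generate_with_fallback(prompt: str, providers: List[Provider]) -> Tuple[str, str]:
    """Try providers in order and return (provider_name, code)."""
    errors = []
    for name, generate in providers:
        try:
            return name, generate(prompt)
        except Exception as exc:  # a real system would catch narrower errors
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

Passing the providers as an ordered list keeps the primary/fallback policy in configuration rather than in the generation code itself.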

Layer 4: Testing

Purpose: Comprehensive testing and validation

Components:

TestingOrchestrator - Test coordination
DockerRunner - Isolated test execution
TestGenerator - Test code generation
SecurityScanner - Vulnerability detection
PerformanceBenchmark - Performance testing

Test Types:

Unit Tests - Individual function/class testing
Integration Tests - Component interaction testing
Security Scans - Vulnerability detection
Performance Tests - Latency/throughput benchmarks

Docker Isolation:

Each test suite runs in isolated container
Clean environment per test
Reproducible results
Resource limits
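
To make the isolation properties concrete, here is a sketch of how a docker run command line with resource limits might be assembled. It only builds the argument list and does not start a container. The function name, the image name forge-test-runner, and the default limits are assumptions for illustration; the image is assumed to have pytest installed.

```python
from pathlib import Path
from typing import List


def docker_test_command(
    code_dir: Path,
    image: str = "forge-test-runner",  # hypothetical image with pytest installed
    memory: str = "512m",
    cpus: str = "1.0",
    network: bool = False,
) -> List[str]:
    """Build a `docker run` argv for one isolated test suite."""
    cmd = [
        "docker", "run",
        "--rm",                                 # fresh container, removed afterwards
        "--memory", memory,                     # resource limits
        "--cpus", cpus,
        "-v", f"{code_dir.resolve()}:/app:ro",  # mount generated code read-only
        "-w", "/app",
    ]
    if not network:
        cmd += ["--network", "none"]            # network isolation
    # Disable pytest's cache plugin because the mount is read-only.
    cmd += [image, "python", "-m", "pytest", "-q", "-p", "no:cacheprovider"]
    return cmd
```

The resulting list can be handed to subprocess.run; building it as a list rather than a shell string avoids quoting problems with paths.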

Flow:

Receives GeneratedCode
Generates test code
Builds Docker environment
Runs test suites
Scans for vulnerabilities
Benchmarks performance
Returns ComprehensiveTestReport

Layer 5: Review

Purpose: Iterative refinement until tests pass

Components:

ReviewLayer - Iteration controller
FailureAnalyzer - Root cause detection
FixGenerator - AI-powered fix generation
LearningDatabase - Success pattern storage

Iteration Process:

Run tests
Analyze failures (14 failure types)
Generate fixes (top 3 per iteration)
Apply fixes
Repeat (max 5 iterations)
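
The iteration process above can be sketched as a bounded loop. This is an illustration, not Forge's ReviewLayer: review_loop, run_tests, and apply_fix are hypothetical names, and failures are modeled as plain strings already sorted by priority.

```python
from typing import Callable, List

MAX_ITERATIONS = 5        # "max 5 iterations" from the text
FIXES_PER_ITERATION = 3   # "top 3 per iteration" from the text


def review_loop(
    run_tests: Callable[[], List[str]],
    apply_fix: Callable[[str], None],
) -> int:
    """Return how many fix iterations were needed for the tests to pass."""
    for iteration in range(MAX_ITERATIONS + 1):
        failures = run_tests()
        if not failures:
            return iteration
        if iteration == MAX_ITERATIONS:
            break  # budget exhausted, do not apply further fixes
        for failure in failures[:FIXES_PER_ITERATION]:
            apply_fix(failure)
    raise RuntimeError(f"tests still failing after {MAX_ITERATIONS} iterations")
```

Capping both the number of rounds and the fixes per round bounds the API cost of a build that cannot be repaired automatically.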

Failure Types (representative; the analyzer distinguishes 14 in total):

Syntax errors
Import errors
Type errors
Logic errors (assertions)
Security vulnerabilities
Performance degradation

Learning System:

Stores successful fix patterns
Tracks fix success rate
Calculates average iterations
Improves over time
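
The two statistics named above, fix success rate and average iterations, could be tracked along these lines. FixStats is a hypothetical in-memory stand-in, not the real LearningDatabase, and the failure-type strings are illustrative.

```python
from collections import defaultdict
from typing import Dict, List


class FixStats:
    """Hypothetical in-memory bookkeeping for fix outcomes."""

    def __init__(self) -> None:
        self._outcomes: Dict[str, List[bool]] = defaultdict(list)
        self._iterations: List[int] = []

    def record_fix(self, failure_type: str, succeeded: bool) -> None:
        self._outcomes[failure_type].append(succeeded)

    def record_build(self, iterations: int) -> None:
        self._iterations.append(iterations)

    def success_rate(self, failure_type: str) -> float:
        """Fraction of recorded fixes for this failure type that succeeded."""
        outcomes = self._outcomes.get(failure_type, [])
        return sum(outcomes) / len(outcomes) if outcomes else 0.0

    def average_iterations(self) -> float:
        """Mean number of review iterations per recorded build."""
        if not self._iterations:
            return 0.0
        return sum(self._iterations) / len(self._iterations)
```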

Flow:

Receives test failures
Categorizes and analyzes
Generates targeted fixes
Applies fixes
Re-runs tests
Updates learning database

Layer 6: Deployment

Purpose: Git workflows and deployment automation

Components:

ForgeRepository - Git operations
GitHubClient - PR management
DeploymentGenerator - Platform configs
ConventionalCommit - Commit formatting

Features:

Branch Management - forge/* naming
Conventional Commits - Structured messages
PR Creation - With checklists
Multi-Platform - 5 deployment targets

Platforms:

fly.io
Vercel
AWS Lambda
Docker/Docker Compose
Kubernetes

Flow:

Create feature branch
Generate deployment configs
Commit with conventional format
Push to remote
Create PR with checklist
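
Branch naming and commit formatting from the flow above can be sketched as two small helpers. The function names and the set of allowed commit types are assumptions for illustration (the Conventional Commits specification itself only defines feat and fix; the others are common conventions), and this is not the actual ConventionalCommit component.

```python
import re


def branch_name(title: str) -> str:
    """Derive a forge/* branch name from a task title."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"forge/{slug}"


def conventional_commit(type_: str, description: str, scope: str = "") -> str:
    """Format a commit header as `type(scope): description`."""
    allowed = {"feat", "fix", "docs", "test", "refactor", "chore"}  # assumed subset
    if type_ not in allowed:
        raise ValueError(f"unknown commit type: {type_}")
    prefix = f"{type_}({scope})" if scope else type_
    return f"{prefix}: {description}"
```

For example, a task titled "Add User Auth!" maps to the branch forge/add-user-auth, and a feature commit scoped to api reads feat(api): add login endpoint.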

Data Flow

Complete Build Flow

User Description
    ↓
┌─────────────────────┐
│ 1. Decomposition    │
└─────────────────────┘
    ↓ TaskPlan
┌─────────────────────┐
│ 2. Planning         │
└─────────────────────┘
    ↓ ProjectPlan
┌─────────────────────┐
│ 3. Generation       │  ← Parallel execution
└─────────────────────┘
    ↓ GeneratedCode
┌─────────────────────┐
│ 4. Testing          │  ← Docker isolation
└─────────────────────┘
    ↓ TestResults
    │
    ├─ Tests Pass ─────────→ ┌─────────────────────┐
    │                        │ 6. Deployment       │
    │                        └─────────────────────┘
    │                             ↓
    │                          Production
    └─ Tests Fail
        ↓
   ┌─────────────────────┐
   │ 5. Review & Fix     │
   └─────────────────────┘
        ↓
   (Loop to Testing)

State Management

Forge maintains state across layers using StateManager:

class StateManager:
    def save_task_plan(self, project_id: str, plan: TaskPlan) -> None: ...
    def load_task_plan(self, project_id: str) -> TaskPlan: ...

    def save_project_plan(self, project_id: str, plan: ProjectPlan) -> None: ...
    def load_project_plan(self, project_id: str) -> ProjectPlan: ...

    def save_generated_code(self, project_id: str, code: GeneratedCode) -> None: ...
    def load_generated_code(self, project_id: str) -> GeneratedCode: ...

    def save_test_results(self, project_id: str, results: TestResults) -> None: ...
    def load_test_results(self, project_id: str) -> TestResults: ...

State stored in .forge/state/<project_id>/:

task_plan.json
project_plan.json
generated_code/
test_results.json
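
A minimal sketch of how JSON state could be written to and read from this layout follows. JsonStateStore is a hypothetical stand-in for StateManager that stores plain dicts; the real class works with TaskPlan and the other typed objects, whose serialization is not shown in this document.

```python
import json
from pathlib import Path
from typing import Any, Dict


class JsonStateStore:
    """Hypothetical stand-in for StateManager, storing plain dicts as JSON."""

    def __init__(self, root: Path = Path(".forge/state")) -> None:
        self.root = root

    def _path(self, project_id: str, name: str) -> Path:
        return self.root / project_id / f"{name}.json"

    def save(self, project_id: str, name: str, data: Dict[str, Any]) -> None:
        path = self._path(project_id, name)
        path.parent.mkdir(parents=True, exist_ok=True)
        # Write to a temp file first so a crash cannot leave a half-written file.
        tmp = path.with_suffix(".json.tmp")
        tmp.write_text(json.dumps(data, indent=2), encoding="utf-8")
        tmp.replace(path)

    def load(self, project_id: str, name: str) -> Dict[str, Any]:
        return json.loads(self._path(project_id, name).read_text(encoding="utf-8"))
```

Writing through a temporary file and renaming it is a design choice of this sketch: it keeps a layer that resumes after a crash from reading truncated state.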

Cross-Cutting Concerns

Knowledge Management

PatternStore - Centralized pattern access

class PatternStore:
    def search_patterns(self, query: str) -> List[Pattern]: ...
    def get_pattern_by_id(self, id: str) -> Pattern: ...
    def get_similar_patterns(self, pattern: Pattern) -> List[Pattern]: ...

Uses semantic search with embeddings for relevance.
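
Embedding-based relevance usually comes down to ranking stored vectors by similarity to a query vector. The sketch below uses cosine similarity over plain Python lists; it assumes the embeddings have already been computed, and rank_patterns and the pattern ids are hypothetical. The similarity measure PatternStore actually uses is not stated in this document.

```python
import math
from typing import Dict, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_patterns(
    query_vec: Sequence[float],
    pattern_vecs: Dict[str, Sequence[float]],
    top_k: int = 3,
) -> List[str]:
    """Return the ids of the top_k patterns most similar to the query."""
    ranked = sorted(
        pattern_vecs,
        key=lambda pid: cosine(query_vec, pattern_vecs[pid]),
        reverse=True,
    )
    return ranked[:top_k]
```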

Error Handling

ForgeError - Base exception class

class ForgeError(Exception):
    """Base exception for Forge"""

class DecompositionError(ForgeError):
    """Task decomposition errors"""

class GenerationError(ForgeError):
    """Code generation errors"""

class TestingError(ForgeError):
    """Testing errors"""

All errors include:

Clear error message
Fix suggestions
Documentation links
Example solutions
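
One way the base class could carry these four pieces of information is shown below. This is a sketch, not the actual Forge implementation: the keyword arguments suggestions, docs_url, and example are assumed names, and the documentation path in the usage note is a placeholder.

```python
from typing import List, Optional


class ForgeError(Exception):
    """Base exception carrying a message plus optional remediation hints."""

    def __init__(
        self,
        message: str,
        *,
        suggestions: Optional[List[str]] = None,  # assumed field name
        docs_url: Optional[str] = None,           # assumed field name
        example: Optional[str] = None,            # assumed field name
    ) -> None:
        super().__init__(message)
        self.message = message
        self.suggestions = list(suggestions or [])
        self.docs_url = docs_url
        self.example = example

    def __str__(self) -> str:
        lines = [self.message]
        lines += [f"  Suggestion: {s}" for s in self.suggestions]
        if self.docs_url:
            lines.append(f"  Docs: {self.docs_url}")
        if self.example:
            lines.append(f"  Example: {self.example}")
        return "\n".join(lines)


class GenerationError(ForgeError):
    """Code generation errors"""
```

A caller might raise GenerationError("Provider returned no code", suggestions=["Retry with the fallback provider"], docs_url="docs/troubleshooting.md"), and the hints then appear in the traceback without extra formatting code at each call site.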

Logging

Structured logging throughout:

logger.info("Starting code generation", extra={
    "project_id": project_id,
    "task_count": len(tasks),
    "provider": "anthropic"
})

Log levels:

DEBUG - Detailed diagnostic info
INFO - Normal operation milestones
WARNING - Unexpected but handled
ERROR - Operation failures
CRITICAL - System failures
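
Fields passed through extra become attributes on the log record, so a formatter has to pick them out for them to appear in the output. The following sketch shows one way to emit them as JSON using only the standard logging module; JsonFormatter is a hypothetical name and Forge's actual log format is not specified here.

```python
import json
import logging

# Attribute names every LogRecord has; anything else came from `extra`.
_STANDARD = set(vars(logging.LogRecord("", 0, "", 0, "", (), None))) | {
    "message",
    "asctime",
}


class JsonFormatter(logging.Formatter):
    """Render a record as one JSON object including its `extra` fields."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {"level": record.levelname, "message": record.getMessage()}
        payload.update(
            {k: v for k, v in vars(record).items() if k not in _STANDARD}
        )
        # default=str keeps non-JSON values (paths, datetimes) from raising.
        return json.dumps(payload, default=str)
```

Attaching it with handler.setFormatter(JsonFormatter()) is enough; existing logger.info(..., extra={...}) calls do not need to change.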

Performance Optimizations

Parallel Execution

Tasks executed in dependency waves:

Wave 1: [Task A, Task B, Task C]  ← No dependencies
   ↓
Wave 2: [Task D, Task E]           ← Depend on Wave 1
   ↓
Wave 3: [Task F]                   ← Depends on Wave 2

Max parallelism: 4 workers (configurable)
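
The wave scheme can be sketched as follows: group tasks whose dependencies are all satisfied, run each group on a thread pool, and move on only when the whole group has finished. The function names and the dict-based input are assumptions for illustration, and work stands in for whatever a generation worker actually does.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def dependency_waves(dependencies: Dict[str, List[str]]) -> List[List[str]]:
    """Group task ids into waves; each wave depends only on earlier waves."""
    remaining = {task: set(deps) for task, deps in dependencies.items()}
    waves: List[List[str]] = []
    while remaining:
        ready = sorted(task for task, deps in remaining.items() if not deps)
        if not ready:
            # Nothing can run: a cycle or a dependency on an unknown task.
            raise ValueError("unresolvable dependencies: " + ", ".join(sorted(remaining)))
        waves.append(ready)
        for task in ready:
            del remaining[task]
        for deps in remaining.values():
            deps.difference_update(ready)
    return waves


def run_in_waves(
    dependencies: Dict[str, List[str]],
    work: Callable[[str], str],
    max_workers: int = 4,  # default from the text
) -> Dict[str, str]:
    """Run `work` for every task, one wave at a time."""
    results: Dict[str, str] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        for wave in dependency_waves(dependencies):
            # Consuming pool.map blocks until the whole wave has finished.
            for task_id, output in zip(wave, pool.map(work, wave)):
                results[task_id] = output
    return results
```

Threads suit this workload under the assumption that generation is dominated by waiting on API calls rather than by CPU work.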

Caching

Pattern Embeddings - Cached to avoid recomputation

~/.forge/cache/
  embeddings/
    patterns.pkl
    last_updated.txt

Test Results - Cached until code changes

.forge/cache/<project_id>/
  test_results_<hash>.json
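
"Cached until code changes" implies the hash in the file name is derived from the code itself. A minimal sketch of such a cache key follows, assuming a SHA-256 over relative paths and file contents truncated to 16 hex characters; the actual hashing scheme Forge uses is not documented here, and both function names are hypothetical.

```python
import hashlib
from pathlib import Path


def code_hash(code_dir: Path) -> str:
    """Hash every file under code_dir (relative path and contents)."""
    digest = hashlib.sha256()
    for path in sorted(p for p in code_dir.rglob("*") if p.is_file()):
        digest.update(path.relative_to(code_dir).as_posix().encode("utf-8"))
        digest.update(b"\0")  # separate the path from the contents
        digest.update(path.read_bytes())
    return digest.hexdigest()[:16]


def cache_path(
    project_id: str,
    code_dir: Path,
    root: Path = Path(".forge/cache"),
) -> Path:
    """Location of cached test results for the current state of the code."""
    return root / project_id / f"test_results_{code_hash(code_dir)}.json"
```

Because any edit changes the hash, stale results are never looked up; they simply stop matching and can be cleaned up separately.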

Memory Management

Streaming - Large files processed incrementally
Batching - Tasks grouped to reduce API calls
Cleanup - Temporary files deleted after use

Extension Points

Custom Generators

class CustomGenerator(BaseGenerator):
    def generate_code(
        self,
        task: Task,
        context: GenerationContext
    ) -> GeneratedCode:
        # Custom generation logic
        pass

# Register
GeneratorFactory.register("custom", CustomGenerator)
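
For readers unfamiliar with the registry pattern behind GeneratorFactory.register, here is a self-contained sketch. Only register appears in this document; the create method, the simplified generate_code signature (a plain string instead of Task and GenerationContext), and EchoGenerator are assumptions for illustration.

```python
from typing import Dict, Type


class BaseGenerator:
    """Simplified base class; the real one takes a Task and a context."""

    def generate_code(self, task: str) -> str:
        raise NotImplementedError


class GeneratorFactory:
    _registry: Dict[str, Type[BaseGenerator]] = {}

    @classmethod
    def register(cls, name: str, generator_cls: Type[BaseGenerator]) -> None:
        if not issubclass(generator_cls, BaseGenerator):
            raise TypeError(f"{generator_cls!r} must subclass BaseGenerator")
        cls._registry[name] = generator_cls

    @classmethod
    def create(cls, name: str) -> BaseGenerator:  # hypothetical method
        if name not in cls._registry:
            raise ValueError(f"unknown generator: {name!r}")
        return cls._registry[name]()


class EchoGenerator(BaseGenerator):
    """Toy generator that turns a task description into a TODO comment."""

    def generate_code(self, task: str) -> str:
        return f"# TODO: {task}\n"
```

After GeneratorFactory.register("echo", EchoGenerator), calling GeneratorFactory.create("echo") returns a fresh EchoGenerator, so callers select a provider by name without importing its class.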

Custom Test Frameworks

class CustomTestRunner(BaseTestRunner):
    def run_tests(
        self,
        code_dir: Path,
        test_dir: Path
    ) -> TestResult:
        # Custom test execution
        pass

Custom Deployment Platforms

class CustomPlatform(DeploymentGenerator):
    def generate_configs(
        self,
        config: DeploymentConfig
    ) -> List[Path]:
        # Generate platform configs
        pass

Security Considerations

API Key Management

Never logged or stored in files
Environment variables only
Encrypted in memory if possible

Code Execution

All tests run in isolated Docker containers
Resource limits enforced
Network isolation optional

Generated Code

Security scanning mandatory
Vulnerability database checks
Best practice validation

Scalability

Horizontal Scaling

Stateless generation workers
Task queue for distribution
Shared state storage

Vertical Scaling

Configurable worker count
Memory limits per task
Timeout controls

Monitoring

Metrics Collected

Task completion time
API call count/latency
Test pass rate
Fix success rate
Memory usage
Error rates

Health Checks

Pattern store connectivity
API provider status
Docker daemon status
Disk space available
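
Two of these checks need nothing beyond the standard library and can be sketched as below. health_report and the 1 GB threshold are assumptions for illustration. The Docker check here only confirms that a docker executable is on PATH; verifying that the daemon responds, or that the pattern store and API providers are reachable, would need calls that are outside this sketch.

```python
import shutil
from pathlib import Path
from typing import Dict


def health_report(path: Path = Path("."), min_free_gb: float = 1.0) -> Dict[str, bool]:
    """Run local health checks and return a name -> passed mapping."""
    free_gb = shutil.disk_usage(path).free / 1024**3
    return {
        "disk_space": free_gb >= min_free_gb,
        # Only checks that a docker CLI is on PATH, not that the daemon answers.
        "docker_cli": shutil.which("docker") is not None,
    }
```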

Future Architecture

Planned Enhancements

Web UI - Visual project planning
Cloud Execution - Serverless generation
Team Features - Shared projects
Plugin System - Third-party extensions
IDE Integration - VSCode/PyCharm plugins

Architecture Evolution

Microservices for layers
Message queue between layers
Distributed state management
Multi-tenancy support
