reqtrace

Requirement traceability and compliance gap detection for industrial engineering documents.

What it does

ReqTrace ingests industrial documents (PDF, DOCX, spreadsheets), builds a hybrid semantic-lexical index, and exposes a local API and CLI for requirement extraction, cross-document traceability linking, compliance gap detection, and supplier risk scoring. Following the source paper (arXiv:2603.20534), it combines two ideas: multi-provider LLM orchestration, which routes each query to the cheapest adequate model, and a persistent traceability graph that tracks how requirements evolve over time and flags drift, deletions, and emerging focus areas such as IT security.
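To make the drift and deletion flags concrete, here is a minimal sketch of comparing two versions of a requirement set. It is not taken from the ReqTrace source: the `Drift` type, the `diffRequirements` function, and the ID-to-text map representation are all assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// Drift summarises how a requirement set changed between two document
// versions. Hypothetical type, not part of the ReqTrace API.
type Drift struct {
	Added, Deleted, Changed []string // requirement IDs, sorted
}

// diffRequirements compares two versions, each a map of requirement ID to text.
func diffRequirements(oldReqs, newReqs map[string]string) Drift {
	var d Drift
	for id, oldText := range oldReqs {
		newText, ok := newReqs[id]
		switch {
		case !ok:
			d.Deleted = append(d.Deleted, id)
		case newText != oldText:
			d.Changed = append(d.Changed, id)
		}
	}
	for id := range newReqs {
		if _, ok := oldReqs[id]; !ok {
			d.Added = append(d.Added, id)
		}
	}
	// Map iteration order is random, so sort for stable output.
	sort.Strings(d.Added)
	sort.Strings(d.Deleted)
	sort.Strings(d.Changed)
	return d
}

func main() {
	v1 := map[string]string{
		"REQ-1": "The unit shall operate at 24 V.",
		"REQ-2": "The housing shall be IP65 rated.",
	}
	v2 := map[string]string{
		"REQ-1": "The unit shall operate at 48 V.",
		"REQ-3": "The firmware shall support signed updates.",
	}
	d := diffRequirements(v1, v2)
	fmt.Println("added:", d.Added, "deleted:", d.Deleted, "changed:", d.Changed)
	// added: [REQ-3] deleted: [REQ-2] changed: [REQ-1]
}
```

An exact text comparison is the simplest possible change test; a real pipeline would likely compare requirements semantically rather than byte for byte.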

Install

go install github.com/timholm/reqtrace@latest

Or build from source:

git clone https://github.com/timholm/reqtrace.git
cd reqtrace
make build

Usage

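Open a store and inspect statistics (the packages below live under internal/, so Go only allows importing them from code inside the reqtrace module):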

import (
    "fmt"
    "log"

    "github.com/timholm/reqtrace/internal/store"
)

s, err := store.Open("~/.reqtrace")
if err != nil {
    log.Fatal(err)
}
fmt.Println(s.Stats())
// map[chunks:0 documents:0 gaps:0 requirements:0 risks:0 traces:0]

Route an LLM extraction request through the cost-aware router:

import (
    "context"
    "fmt"
    "log"

    "github.com/timholm/reqtrace/internal/config"
    "github.com/timholm/reqtrace/internal/llm"
)

cfg, err := config.Load("")   // loads ~/.reqtrace/config.json, falls back to defaults
if err != nil {
    log.Fatal(err)
}
router := llm.NewRouter(cfg)

// chunkText is a string holding the text of one document chunk.
resp, err := router.Complete(context.Background(), llm.Request{
    Task:   "extraction",
    System: "Extract all 'shall' requirements from the following text.",
    Messages: []llm.Message{
        {Role: "user", Content: chunkText},
    },
})
if err != nil {
    log.Fatal(err)
}
// resp.Content holds extracted requirements; resp.Cost is estimated USD spend
fmt.Printf("model=%s cost=$%.4f\n", resp.Model, resp.Cost)
// model=ollama/llama3.2 cost=$0.0000

API

internal/config

// Load reads ~/.reqtrace/config.json (or path) and merges with defaults.
// Env vars OPENAI_API_KEY, ANTHROPIC_API_KEY, REQTRACE_DATA_DIR override file values.
func Load(path string) (*Config, error)

// Save writes config to path (default: ~/.reqtrace/config.json).
func Save(cfg *Config, path string) error

// Default returns the built-in defaults.
func Default() *Config

Default values:

| Field | Default |
| --- | --- |
| data_dir | ~/.reqtrace |
| llm.default_provider | ollama |
| llm.ollama_url | http://localhost:11434 |
| llm.max_cost_per_query_usd | 0.10 |
| api.addr | 127.0.0.1:8765 |
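The layering described for Load (built-in defaults, then file values, then environment overrides) can be sketched as follows. This is an illustration under assumptions, not the package's actual code: the struct layout and JSON keys are inferred from the dotted field names in the table, `loadConfig` and `defaults` are hypothetical names, and only the REQTRACE_DATA_DIR override is shown.

```go
package main

import (
	"encoding/json"
	"fmt"
	"os"
)

// Hypothetical config layout, inferred from the dotted names in the defaults table.
type LLMConfig struct {
	DefaultProvider    string  `json:"default_provider"`
	OllamaURL          string  `json:"ollama_url"`
	MaxCostPerQueryUSD float64 `json:"max_cost_per_query_usd"`
}

type APIConfig struct {
	Addr string `json:"addr"`
}

type Config struct {
	DataDir string    `json:"data_dir"`
	LLM     LLMConfig `json:"llm"`
	API     APIConfig `json:"api"`
}

// defaults mirrors the values in the defaults table.
func defaults() Config {
	return Config{
		DataDir: "~/.reqtrace",
		LLM: LLMConfig{
			DefaultProvider:    "ollama",
			OllamaURL:          "http://localhost:11434",
			MaxCostPerQueryUSD: 0.10,
		},
		API: APIConfig{Addr: "127.0.0.1:8765"},
	}
}

// loadConfig layers three sources: defaults, then file JSON, then environment.
// Unmarshalling into a pre-populated struct leaves fields absent from the
// JSON at their default values.
func loadConfig(fileJSON []byte, getenv func(string) string) (Config, error) {
	cfg := defaults()
	if len(fileJSON) > 0 {
		if err := json.Unmarshal(fileJSON, &cfg); err != nil {
			return Config{}, fmt.Errorf("parse config: %w", err)
		}
	}
	if dir := getenv("REQTRACE_DATA_DIR"); dir != "" {
		cfg.DataDir = dir
	}
	return cfg, nil
}

func main() {
	file := []byte(`{"llm": {"default_provider": "openai"}}`)
	cfg, err := loadConfig(file, os.Getenv)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("provider=%s ollama_url=%s data_dir=%s\n",
		cfg.LLM.DefaultProvider, cfg.LLM.OllamaURL, cfg.DataDir)
}
```

Passing the environment lookup as a function keeps the merge logic testable without mutating the process environment.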

internal/store

func Open(dir string) (*Store, error)

// Documents
func (s *Store) PutDocument(d *Document) error
func (s *Store) GetDocument(id string) (*Document, bool)
func (s *Store) ListDocuments() []*Document
func (s *Store) DeleteDocument(id string) error

// Chunks
func (s *Store) PutChunk(c *Chunk) error
func (s *Store) ChunksByDocument(docID string) []*Chunk
func (s *Store) AllChunks() []*Chunk

// Requirements
func (s *Store) PutRequirement(r *Requirement) error
func (s *Store) GetRequirement(id string) (*Requirement, bool)
func (s *Store) RequirementsByDocument(docID string) []*Requirement
func (s *Store) AllRequirements() []*Requirement

// Traces
func (s *Store) PutTrace(t *Trace) error
func (s *Store) TracesFrom(reqID string) []*Trace
func (s *Store) AllTraces() []*Trace

// Compliance gaps
func (s *Store) PutGap(g *ComplianceGap) error
func (s *Store) AllGaps() []*ComplianceGap

// Supplier risk
func (s *Store) PutRisk(r *SupplierRisk) error
func (s *Store) AllRisks() []*SupplierRisk

func (s *Store) Stats() map[string]int

  • Requirement types: functional, performance, safety, security, interface
  • Requirement priorities: shall, should, may
  • Trace types: satisfies, refines, conflicts, derived_from
  • Gap severities: critical, major, minor
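As an illustration of how these enumerations could fit together, the sketch below flags requirements that no "satisfies" trace points at. Everything in it is hypothetical: the struct fields, the trace direction (from the satisfying item to the requirement), and the priority-to-severity mapping are assumptions, not the store's actual schema or the tool's gap-scoring logic.

```go
package main

import "fmt"

// Hypothetical, simplified shapes; the real store types may differ.
type Requirement struct {
	ID, Priority, Text string
}

type Trace struct {
	FromID, ToID, Type string // assumed direction: FromID satisfies ToID
}

type Gap struct {
	RequirementID, Severity string
}

// severityFor is an assumed mapping from requirement priority to gap severity.
var severityFor = map[string]string{
	"shall":  "critical",
	"should": "major",
	"may":    "minor",
}

// findGaps returns one gap per requirement that no "satisfies" trace targets.
func findGaps(reqs []Requirement, traces []Trace) []Gap {
	satisfied := make(map[string]bool)
	for _, t := range traces {
		if t.Type == "satisfies" {
			satisfied[t.ToID] = true
		}
	}
	var gaps []Gap
	for _, r := range reqs {
		if satisfied[r.ID] {
			continue
		}
		sev, ok := severityFor[r.Priority]
		if !ok {
			sev = "minor" // unknown priority: fall back to the lowest severity
		}
		gaps = append(gaps, Gap{RequirementID: r.ID, Severity: sev})
	}
	return gaps
}

func main() {
	reqs := []Requirement{
		{ID: "REQ-1", Priority: "shall", Text: "The unit shall log all access attempts."},
		{ID: "REQ-2", Priority: "should", Text: "The unit should support remote diagnostics."},
	}
	traces := []Trace{
		{FromID: "SUP-7", ToID: "REQ-2", Type: "satisfies"},
		{FromID: "SUP-9", ToID: "REQ-1", Type: "refines"}, // does not count as coverage
	}
	for _, g := range findGaps(reqs, traces) {
		fmt.Printf("%s: %s\n", g.RequirementID, g.Severity)
	}
	// REQ-1: critical
}
```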

internal/llm

// NewRouter builds a cost-aware router from config.
// Provider selection order: Ollama (free/local) → OpenAI → Anthropic → mock fallback.
func NewRouter(cfg *config.Config) *Router

func (r *Router) Complete(ctx context.Context, req Request) (*Response, error)
func (r *Router) Providers() []Provider

// Individual providers
func NewOpenAI(apiKey, model string) Provider      // default model: gpt-4o-mini
func NewAnthropic(apiKey, model string) Provider   // default model: claude-haiku-4-5-20251001
func NewOllama(baseURL, model string) Provider     // default model: llama3.2

Task routing profiles:

| Task | Prefers local | Max cost |
| --- | --- | --- |
| classification | yes | $0.01 |
| extraction | yes | $0.05 |
| summary | yes | $0.05 |
| audit | no | $0.10 |
| gap_analysis | no | $0.10 |
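One way to read these profiles is as a two-step selection: filter providers by the task's cost ceiling, then prefer the cheapest one whose locality matches the profile. The sketch below implements that reading. It is an assumption about the routing policy, not the actual router code; `Provider`, `Profile`, `pick`, and the per-request cost estimates are invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// Provider is a hypothetical description of an LLM backend.
type Provider struct {
	Name    string
	Local   bool
	EstCost float64 // estimated USD per request (illustrative values only)
}

// Profile mirrors one row of the task routing table.
type Profile struct {
	PrefersLocal bool
	MaxCost      float64
}

var profiles = map[string]Profile{
	"classification": {PrefersLocal: true, MaxCost: 0.01},
	"extraction":     {PrefersLocal: true, MaxCost: 0.05},
	"summary":        {PrefersLocal: true, MaxCost: 0.05},
	"audit":          {PrefersLocal: false, MaxCost: 0.10},
	"gap_analysis":   {PrefersLocal: false, MaxCost: 0.10},
}

var errNoProvider = errors.New("no provider within budget")

// cheapest returns the lowest-cost provider accepted by keep.
func cheapest(ps []Provider, keep func(Provider) bool) (Provider, bool) {
	var best Provider
	found := false
	for _, p := range ps {
		if !keep(p) {
			continue
		}
		if !found || p.EstCost < best.EstCost {
			best, found = p, true
		}
	}
	return best, found
}

// pick chooses a provider for task: first the cheapest in-budget provider
// matching the locality preference, otherwise the cheapest in-budget provider.
func pick(task string, ps []Provider) (Provider, error) {
	prof, ok := profiles[task]
	if !ok {
		return Provider{}, fmt.Errorf("unknown task %q", task)
	}
	inBudget := func(p Provider) bool { return p.EstCost <= prof.MaxCost }
	matches := func(p Provider) bool { return inBudget(p) && p.Local == prof.PrefersLocal }
	if p, ok := cheapest(ps, matches); ok {
		return p, nil
	}
	if p, ok := cheapest(ps, inBudget); ok {
		return p, nil
	}
	return Provider{}, errNoProvider
}

func main() {
	providers := []Provider{
		{Name: "ollama/llama3.2", Local: true, EstCost: 0},
		{Name: "openai/gpt-4o-mini", Local: false, EstCost: 0.02},
		{Name: "anthropic/claude-haiku-4-5-20251001", Local: false, EstCost: 0.03},
	}
	for _, task := range []string{"extraction", "audit"} {
		p, err := pick(task, providers)
		if err != nil {
			fmt.Println(task, "->", err)
			continue
		}
		fmt.Println(task, "->", p.Name)
	}
	// extraction -> ollama/llama3.2
	// audit -> openai/gpt-4o-mini
}
```

Treating "prefers local: no" as "prefer a hosted model" is the main judgment call here; the table alone does not say whether a local model is ever considered adequate for audit or gap analysis.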

Architecture

| File | Purpose |
| --- | --- |
| internal/config/config.go | JSON config file loading/saving with env-var overrides |
| internal/store/store.go | Thread-safe JSON-file-backed store for all domain entities |
| internal/llm/provider.go | OpenAI, Anthropic, Ollama, and mock provider implementations |
| internal/llm/router.go | Cost-aware router that selects cheapest adequate provider per task |

Data flows from documents → chunks → requirements → traces/gaps/risks, all persisted atomically in ~/.reqtrace/store.json. LLM calls are dispatched through the router using per-task cost profiles; local Ollama inference is preferred for cheap classification and extraction tasks to minimise API spend.

References

Research Papers

  • arXiv:2603.20534 — Longitudinal analysis of requirement evolution in industrial specifications, providing the drift detection and traceability methodology implemented by this tool, including the 83% time-reduction and $2.3M contract-penalty-avoidance benchmarks used as ROI justification.

Market Analysis

The buyer is the requirements engineering or quality/compliance team lead at automotive OEMs, aerospace firms, and Tier-1 suppliers — companies already spending $50–200K/year on manual requirement management tools like IBM DOORS or Jama Connect. ReqTrace sells as an on-prem enterprise license ($2K–8K/seat/year) with a free single-user CLI tier for adoption, targeting the 83% time reduction and contract-penalty avoidance ($2.3M in the paper's case) as ROI justification. The moat is the traceability graph and longitudinal drift detection — general-purpose RAG platforms like Dify and RAGFlow lack domain-specific requirement linking, compliance gap scoring, and supplier risk models, while incumbent RE tools lack LLM-powered extraction entirely.

License

MIT

About

Manufacturing and regulated-industry engineering teams waste weeks manually extracting, cross-referencing, and auditing requirements across heterogeneous specification documents, supplier qualifications, and compliance standards — with no traceability and high error rates.
