Skip to content

A comprehensive Rust library for processing source code using tree-sitter. This library provides high-level abstractions for parsing, navigating, and querying syntax trees across multiple programming languages.

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE-APACHE
MIT
LICENSE-MIT
Notifications You must be signed in to change notification settings

njfio/rust-treesitter-agent-code-utility

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

98 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Rust Tree-sitter Agent Code Utility

A Rust library for parsing and analyzing source code using tree-sitter. Provides abstractions for parsing, navigating, and querying syntax trees across multiple programming languages with analysis capabilities for security, performance, and code quality.

Built for developers and AI systems that need code analysis tools and insights into code structure and quality.

Table of Contents

Features

Core Language Support

  • 7 Programming Languages: Rust, JavaScript, TypeScript, Python, C, C++, Go
  • Language Detection: Automatic detection from file extensions and content analysis
  • Symbol Extraction: Functions, classes, structs, methods, types, interfaces, implementations
  • Language Features: Language-specific construct detection and analysis

Analysis Capabilities

  • Codebase Analysis: Directory analysis with file metrics, symbol extraction, and statistics
  • Security Scanning: Pattern-based vulnerability detection with OWASP categorization and semantic context tracking
  • Complexity Analysis: Comprehensive code complexity metrics including McCabe, cognitive, NPATH, and Halstead metrics
  • Performance Analysis: Optimization recommendations and performance hotspot detection
  • Dependency Analysis: Package manager file parsing (package.json, requirements.txt, Cargo.toml, go.mod)
  • Code Quality Analysis: Code smell detection and refactoring suggestions
  • Intent Mapping: Requirements to implementation mapping for development workflow
  • Semantic Context Tracking: Advanced false positive reduction through contextual analysis

Advanced Features

  • Semantic Context Tracking: Multi-phase semantic analysis for 50% false positive reduction
  • Symbol Table Analysis: Hierarchical scope management with comprehensive symbol tracking
  • Data Flow Analysis: Reaching definitions, use-def chains, and taint flow tracking
  • Security Context Analysis: Validation/sanitization point detection with trust level tracking
  • Semantic Knowledge Graphs: Build and query relationships between code elements
  • Automated Reasoning: Logic-based code analysis and inference capabilities
  • Smart Refactoring Engine: Code improvement suggestions and automated refactoring

CLI Interface

  • Available Commands: analyze, security, refactor, dependencies, symbols, query, find, map, explain, insights, interactive
  • Output Formats: JSON, table, markdown, summary
  • Progress Tracking: Real-time progress indicators
  • Filtering: Severity levels, file types, symbol types
  • Interactive Mode: Real-time code exploration

CLI Commands

analyze - Codebase Analysis

Analyze directory structure, extract symbols, and generate statistics.

tree-sitter-cli analyze <PATH> [OPTIONS]

Options:
  -f, --format <FORMAT>    Output format: table, json, summary [default: table]
  -d, --detailed          Show detailed analysis
  --max-depth <DEPTH>     Maximum directory depth to analyze

Example:

tree-sitter-cli analyze ./src --format json

security - Security Vulnerability Scanning

Pattern-based security analysis with vulnerability detection and compliance assessment.

tree-sitter-cli security <PATH> [OPTIONS]

Options:
  -f, --format <FORMAT>         Output format: table, json, markdown [default: table]
  --min-severity <SEVERITY>     Minimum severity: critical, high, medium, low, info [default: medium]
  --output <FILE>               Save detailed report to file
  --summary-only                Show summary only
  --compliance                  Include compliance assessment
  --depth <LEVEL>               Analysis depth: basic, deep, full [default: full]

Example:

tree-sitter-cli security ./src --min-severity high --format json

Detection Capabilities:

  • OWASP Patterns: SQL injection, XSS, insecure deserialization, broken authentication
  • Code Injection: Command injection, code execution vulnerabilities
  • Input Validation: Missing validation patterns and sanitization
  • Authorization: Missing access controls and privilege escalation
  • Cryptographic Issues: Weak algorithms and insecure practices
  • Compliance Assessment: OWASP and CWE compliance scoring

complexity - Code Complexity Analysis

Comprehensive code complexity analysis with multiple metrics.

tree-sitter-cli complexity <PATH> [OPTIONS]

Options:
  -f, --format <FORMAT>    Output format: table, json, markdown [default: table]
  --metric <METRIC>        Specific metric: mccabe, cognitive, npath, halstead, all [default: all]
  --threshold <VALUE>      Complexity threshold for warnings
  --detailed               Show detailed per-function analysis

Example:

tree-sitter-cli complexity ./src --metric all --format json

Metrics Calculated:

  • McCabe Complexity: Cyclomatic complexity based on control flow paths
  • Cognitive Complexity: Human-perceived complexity with nesting penalties
  • NPATH Complexity: Number of execution paths through functions
  • Halstead Metrics: Volume, difficulty, and effort based on operators/operands
  • Lines of Code: Physical and logical line counts
  • Nesting Depth: Maximum nesting level in functions

symbols - Symbol Extraction

Extract and display code symbols (functions, classes, structs, etc.).

tree-sitter-cli symbols <PATH> [OPTIONS]

Options:
  -f, --format <FORMAT>    Output format: table, json [default: table]

Example:

tree-sitter-cli symbols ./src --format json

Extracts:

  • Functions and methods
  • Classes and structs
  • Interfaces and traits
  • Implementations
  • Types and enums
  • Visibility information
  • Line numbers and locations

refactor - Smart Refactoring Engine

Code improvement suggestions with refactoring capabilities.

tree-sitter-cli refactor <PATH> [OPTIONS]

Options:
  -f, --format <FORMAT>    Output format: table, json, markdown [default: table]
  --category <CATEGORY>    Focus category: all, code_smells, patterns, performance
  --quick-wins             Show only quick wins
  --major-only             Show only major improvements
  --min-priority <LEVEL>   Minimum priority: low, medium, high, critical
  --output <FILE>          Save detailed report to file

Capabilities:

  • Code Smell Detection: Identify anti-patterns and code quality issues
  • Design Pattern Recommendations: Suggest appropriate design patterns
  • Modernization Suggestions: Update code to use modern language features
  • Performance Optimization: Identify and suggest performance improvements
  • Complexity Reduction: Simplify overly complex code structures

dependencies - Dependency Analysis

Dependency analysis with package manager integration.

tree-sitter-cli dependencies <PATH> [OPTIONS]

Options:
  -f, --format <FORMAT>    Output format: table, json, markdown [default: table]
  --include-dev            Include development dependencies
  --vulnerabilities        Enable vulnerability scanning
  --licenses               Enable license compliance checking
  --outdated               Show outdated dependencies
  --graph                  Show dependency graph analysis
  --output <FILE>          Save detailed report to file

Features:

  • Multi-Language Support: package.json (Node.js), requirements.txt (Python), Cargo.toml (Rust), go.mod (Go)
  • Dependency Tree: Visualize dependency relationships
  • License Analysis: Identify license information
  • Update Information: Show outdated dependencies

query - Code Querying

Code search and analysis using tree-sitter queries.

tree-sitter-cli query <PATH> [OPTIONS]

Options:
  -p, --pattern <PATTERN>  Tree-sitter query pattern
  -l, --language <LANG>    Target specific language
  -c, --context <LINES>    Show context lines around matches [default: 2]
  -f, --format <FORMAT>    Output format: table, json [default: table]

find - Symbol Search

Find symbols and patterns across the codebase.

tree-sitter-cli find <PATH> [OPTIONS]

Options:
  --name <PATTERN>         Symbol name pattern
  --symbol-type <TYPE>     Symbol type filter
  --language <LANG>        Target specific language
  --public-only            Show only public symbols

map - Code Structure Mapping

Generate visual maps of code structure and relationships.

tree-sitter-cli map <PATH> [OPTIONS]

Options:
  --map-type <TYPE>        Map type: overview, tree, symbols, dependencies
  -f, --format <FORMAT>    Output format: unicode, ascii, json, mermaid
  --max-depth <DEPTH>      Maximum depth to show
  --show-sizes             Show file sizes
  --show-symbols           Show symbol counts
  --languages <LANGS>      Filter by languages
  --collapse-empty         Collapse empty directories

explain - Code Explanation

Generate explanations of code functionality and architecture.

tree-sitter-cli explain <PATH> [OPTIONS]

Options:
  --file <FILE>            Specific file to explain
  --symbol <SYMBOL>        Specific symbol to explain
  -f, --format <FORMAT>    Output format: markdown, json [default: markdown]
  --detailed               Include detailed analysis
  --learning               Include learning recommendations

insights - Codebase Insights

Generate insights and recommendations for the codebase.

tree-sitter-cli insights <PATH> [OPTIONS]

Options:
  --focus <AREA>           Focus area: all, architecture, quality, complexity
  -f, --format <FORMAT>    Output format: markdown, json, text [default: markdown]

interactive - Interactive Mode

Enter interactive mode for real-time code exploration.

tree-sitter-cli interactive <PATH>

stats - Codebase Statistics

Show comprehensive statistics about the codebase.

tree-sitter-cli stats <PATH> [OPTIONS]

Options:
  --top <N>                Show top N files by various metrics [default: 10]

Quick Start

CLI Installation

# Clone the repository
git clone https://github.com/njfio/rust-treesitter-agent-code-utility.git
cd rust-treesitter-agent-code-utility

# Build the CLI tool
cargo build --release --bin tree-sitter-cli

# Run analysis on your code
./target/release/tree-sitter-cli analyze ./src

Library Usage

Add this to your Cargo.toml:

[dependencies]
rust_tree_sitter = { git = "https://github.com/njfio/rust-treesitter-agent-code-utility.git" }

Basic Parsing

use rust_tree_sitter::{Parser, Language};

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create a parser for Rust
    let mut parser = Parser::new(Language::Rust)?;

    // Parse some code
    let source = "fn main() { println!(\"Hello, world!\"); }";
    let tree = parser.parse(source, None)?;

    // Navigate the syntax tree
    let root = tree.root_node();
    println!("Root node: {}", root.kind());

    Ok(())
}

Language Detection

use rust_tree_sitter::detect_language_from_extension;

// Detect language from extension
if let Some(lang) = detect_language_from_extension("py") {
    println!("Detected language: {}", lang.name());
}

Codebase Analysis

use rust_tree_sitter::CodebaseAnalyzer;
use std::path::PathBuf;

// Create analyzer with error handling
let mut analyzer = CodebaseAnalyzer::new()?;

// Analyze directory
let result = analyzer.analyze_directory(&PathBuf::from("./src"))?;

// Access results
println!("Found {} files", result.files.len());
for file_info in &result.files {
    println!("πŸ“ {} ({} symbols)", file_info.path.display(), file_info.symbols.len());
}

Security Analysis

use rust_tree_sitter::{CodebaseAnalyzer, AdvancedSecurityAnalyzer};
use std::path::PathBuf;

// Analyze codebase
let mut analyzer = CodebaseAnalyzer::new()?;
let analysis = analyzer.analyze_directory(&PathBuf::from("./src"))?;

// Run security scan
let security_analyzer = AdvancedSecurityAnalyzer::new();
let security_result = security_analyzer.scan_analysis_result(&analysis)?;

println!("Security Score: {}/100", security_result.security_score);
println!("Found {} vulnerabilities", security_result.vulnerabilities.len());

// Display vulnerabilities
for vuln in &security_result.vulnerabilities {
    println!("πŸ”’ {}: {} (line {})",
             vuln.severity, vuln.title, vuln.location.line);
}

Intent Mapping

use rust_tree_sitter::{CodebaseAnalyzer, IntentMappingSystem, Requirement, RequirementType, Priority};
use std::path::PathBuf;

// Analyze codebase
let mut analyzer = CodebaseAnalyzer::new()?;
let analysis = analyzer.analyze_directory(&PathBuf::from("./src"))?;

// Create intent mapping system
let mut mapping_system = IntentMappingSystem::new();

// Add requirements
let requirement = Requirement {
    id: "REQ-001".to_string(),
    requirement_type: RequirementType::UserStory,
    description: "As a user, I want to authenticate securely".to_string(),
    priority: Priority::High,
    acceptance_criteria: vec![
        "User can enter credentials".to_string(),
        "System validates credentials".to_string(),
    ],
    stakeholders: vec!["Product Owner".to_string()],
    tags: vec!["authentication".to_string(), "security".to_string()],
};

mapping_system.add_requirement(requirement);

// Generate mappings
let mappings = mapping_system.generate_mappings(&analysis)?;
println!("Generated {} mappings", mappings.len());

Complexity Analysis

use rust_tree_sitter::{Parser, Language, ComplexityAnalyzer};

// Create parser and analyzer
let mut parser = Parser::new(Language::Rust)?;
let analyzer = ComplexityAnalyzer::new("rust");

// Parse code
let source = r#"
    fn complex_function(x: i32, y: i32) -> i32 {
        if x > 0 {
            for i in 0..x {
                if i % 2 == 0 {
                    return i * y;
                }
            }
        }
        match y {
            0..=10 => y * 2,
            11..=100 => y + 50,
            _ => y - 25,
        }
    }
"#;

let tree = parser.parse(source, None)?;

// Analyze complexity
let metrics = analyzer.analyze_complexity(&tree)?;

println!("McCabe Complexity: {}", metrics.cyclomatic_complexity);
println!("Cognitive Complexity: {}", metrics.cognitive_complexity);
println!("NPATH Complexity: {}", metrics.npath_complexity);
println!("Halstead Volume: {:.2}", metrics.halstead_volume);
println!("Halstead Difficulty: {:.2}", metrics.halstead_difficulty);
println!("Halstead Effort: {:.2}", metrics.halstead_effort);
println!("Max Nesting Depth: {}", metrics.max_nesting_depth);
println!("Lines of Code: {}", metrics.lines_of_code);

Performance Analysis

use rust_tree_sitter::{CodebaseAnalyzer, PerformanceAnalyzer};
use std::path::PathBuf;

// Analyze codebase
let mut analyzer = CodebaseAnalyzer::new()?;
let analysis = analyzer.analyze_directory(&PathBuf::from("./src"))?;

// Run performance analysis
let perf_analyzer = PerformanceAnalyzer::new();
let perf_result = perf_analyzer.analyze(&analysis)?;

println!("Performance Score: {}/100", perf_result.performance_score);
println!("Found {} hotspots", perf_result.hotspots.len());

for hotspot in &perf_result.hotspots {
    println!("⚑ {}: {} (severity: {:?})",
             hotspot.category, hotspot.location.file.display(), hotspot.severity);
}

Semantic Context Analysis

use rust_tree_sitter::{SemanticContextAnalyzer, Language};
use std::path::PathBuf;

// Create semantic context analyzer
let mut semantic_analyzer = SemanticContextAnalyzer::new(Language::Rust)?;

// Parse and analyze code
let source = std::fs::read_to_string("src/main.rs")?;
let mut parser = Parser::new(Language::Rust)?;
let tree = parser.parse(&source, None)?;

// Perform comprehensive semantic analysis
let semantic_context = semantic_analyzer.analyze(&tree, &source)?;

// Access symbol table with scope information
println!("Found {} scopes", semantic_context.symbol_table.scopes.len());
println!("Found {} symbols", semantic_context.symbol_table.symbols.len());

// Access data flow analysis
println!("Reaching definitions: {}", semantic_context.data_flow.reaching_definitions.len());
println!("Use-def chains: {}", semantic_context.data_flow.use_def_chains.len());
println!("Taint flows: {}", semantic_context.data_flow.taint_flows.len());

// Access security context
let security_ctx = &semantic_context.security_context;
println!("Validation points: {}", security_ctx.validation_points.len());
println!("Sanitization points: {}", security_ctx.sanitization_points.len());
println!("Trust levels tracked: {}", security_ctx.trust_levels.len());

// Access call graph analysis
println!("Function calls: {}", semantic_context.call_graph.calls.len());
println!("Function definitions: {}", semantic_context.call_graph.functions.len());

// Access pattern detection
println!("Code patterns: {}", semantic_context.pattern_context.patterns.len());
println!("Anti-patterns: {}", semantic_context.pattern_context.anti_patterns.len());

Supported Languages

Language Extensions Symbol Extraction Security Analysis Status
Rust .rs βœ… Functions, structs, impls, traits βœ… Pattern-based 🟒 Working
JavaScript .js, .mjs, .jsx βœ… Functions, classes, methods βœ… Pattern-based 🟒 Working
TypeScript .ts, .tsx βœ… Functions, classes, interfaces, types βœ… Pattern-based 🟒 Working
Go .go βœ… Functions, structs, methods, interfaces βœ… Pattern-based 🟒 Working
Python .py, .pyi βœ… Functions, classes, methods βœ… Pattern-based 🟒 Working
C .c, .h βœ… Functions, structs, typedefs, macros βœ… Pattern-based 🟒 Working
C++ .cpp, .hpp, etc βœ… Functions, classes, namespaces, templates βœ… Pattern-based 🟒 Working

Symbol Types Extracted

  • Functions: Regular functions, methods, constructors
  • Classes/Structs: Class definitions, struct definitions, implementations
  • Types: Interfaces, type aliases, enums, traits
  • Visibility: Public, private, protected (language-dependent)
  • Location: Line numbers, column positions
  • Documentation: Extracted where available

Security Vulnerability Detection

Pattern-based detection for:

  • SQL Injection: Unsafe query construction
  • Command Injection: Unsafe command execution
  • XSS: Cross-site scripting patterns
  • Hardcoded Secrets: API keys, passwords, tokens
  • Cryptographic Issues: Weak algorithms, insecure practices
  • Input Validation: Missing validation patterns
  • Authorization: Missing access controls

Test Coverage

Current Test Status

  • 330+ Total Tests Passing: Comprehensive test suite covering all functionality
  • Core Parsing: All parsing functionality working across 7 languages
  • Symbol Extraction: Working for all supported languages with symbol detection
  • Security Analysis: Pattern-based security scanning with OWASP categorization
  • Complexity Analysis: 21 tests covering McCabe, cognitive, NPATH, and Halstead metrics
  • Semantic Context Tracking: 17 tests covering symbol tables, data flow, and security context
  • Performance Analysis: Optimization recommendations and hotspot detection
  • Intent Mapping: Requirements-to-implementation mapping with validation
  • Advanced Features: Semantic analysis, automated reasoning, and code explanation
  • CLI Commands: All commands working with comprehensive option support
  • Output Formats: JSON, table, markdown, summary formats
  • Error Handling: Robust Result<T,E> patterns throughout
  • Constants Management: Centralized configuration with validation

Test Categories

  • Unit Tests: 313 tests covering individual components and functions
  • Integration Tests: End-to-end testing of CLI commands and workflows
  • Error Handling Tests: Comprehensive error condition and edge case testing
  • Configuration Tests: Validation of all configuration options and defaults
  • Security Tests: Vulnerability detection and pattern matching
  • Performance Tests: Analysis accuracy and recommendation validation

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

License

This project is licensed under the MIT License - see the LICENSE file for details.

About

A comprehensive Rust library for processing source code using tree-sitter. This library provides high-level abstractions for parsing, navigating, and querying syntax trees across multiple programming languages.

Resources

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE-APACHE
MIT
LICENSE-MIT

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages