Purpose: Graph and matrix visualization generation for GNN models
Pipeline Step: Step 8: Visualization (8_visualization.py)
Category: Visualization / Graph Analysis
Status: ✅ Production Ready
Version: 1.6.0
Last Updated: 2026-04-16
- Generate graph visualizations from GNN models
- Create matrix heatmaps and plots
- Visualize model structure and connections
- Generate network topology diagrams
- Provide visualization data for advanced analysis
- Network graph generation and layout
- Matrix visualization and heatmap creation
- Interactive visualization support
- Multiple output formats (PNG, SVG, HTML)
- Model structure visualization
Description: Main visualization processing function called by orchestrator (8_visualization.py). Implementation: core/process.py.
Parameters:
- target_dir (Path): Directory containing GNN files
- output_dir (Path): Output directory for visualizations
- verbose (bool): Enable verbose logging (default: False)
- **kwargs: Additional visualization options
Returns: True if at least one artifact was generated
Data loading: core/parsed_model.py load_visualization_model prefers {model}_parsed.json from step 3; fallback is parse/markdown.py parse_gnn_content.
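The loading preference described above can be sketched as follows. This is a minimal illustration, not the code in core/parsed_model.py: the helper name `pick_model_source` and the assumption that the step-3 JSON files sit directly in one directory are hypothetical.

```python
from pathlib import Path
from typing import Tuple


def pick_model_source(markdown_file: Path, parsed_dir: Path) -> Tuple[str, Path]:
    """Return ("json", path) when a step-3 parsed file exists, else ("markdown", path).

    Hypothetical helper: the real loader is load_visualization_model in
    core/parsed_model.py, and its directory layout may differ.
    """
    model = markdown_file.stem
    parsed = parsed_dir / f"{model}_parsed.json"
    if parsed.is_file():
        return "json", parsed
    # Fallback: the markdown source would be parsed directly.
    return "markdown", markdown_file
```

Returning the source kind alongside the path makes it easy to record which route was taken, similar in spirit to the JSON vs markdown source noted in the manifest.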
Example:
from pathlib import Path
from visualization import process_visualization
success = process_visualization(
    target_dir=Path("input/gnn_files"),
    output_dir=Path("output/8_visualization_output"),
    verbose=True
)
Description: generate_graph_visualization is a module-level helper; it delegates to GNNVisualizer.
Parameters:
- graph_data: Graph data dictionary
- output_dir: Optional output directory
Returns: List of generated visualization file paths
Description: generate_matrix_visualization is a module-level helper; it delegates to GNNVisualizer.
Parameters:
- matrix_data: Matrix data dictionary
- output_dir: Optional output directory
Returns: List of generated visualization file paths
Description: Instance method on GNNVisualizer, not a package-level function. Use GNNVisualizer(...).create_network_diagram(graph_data).
Returns: Dictionary with visualization metadata / paths
- matplotlib - Plotting and visualization
- networkx - Network graph algorithms
- numpy - Numerical computations
- plotly - Interactive visualizations
- graphviz - Graph layout and rendering
- utils.pipeline_template - Pipeline utilities
VISUALIZATION_CONFIG = {
    'output_format': 'png',
    'dpi': 300,
    'figsize': (10, 8),
    'colormap': 'viridis',
    'layout_algorithm': 'spring'
}
GRAPH_CONFIG = {
    'node_size': 100,
    'edge_width': 1,
    'node_color': 'lightblue',
    'edge_color': 'gray',
    'layout': 'force_directed'
}
from visualization import process_visualization
success = process_visualization(
    target_dir="input/gnn_files",
    output_dir="output/8_visualization_output"
)
from visualization import generate_graph_visualization
files = generate_graph_visualization(graph_data)
for file_path in files:
    print(f"Generated: {file_path}")
from visualization import generate_matrix_visualization
files = generate_matrix_visualization(matrix_data)
for file_path in files:
    print(f"Generated: {file_path}")
- {model}_network_graph.png: Network layout (directed vs undirected edges, ontology labels)
- {model}_network_stats.json: Counts, gnn_edge_orientation, optional network_properties
- {model}_variable_parameter_bipartite.png: Variables vs parameter tensors (name matches)
- {model}_*_heatmap.png / *_tensor.png / *_analysis.png: Matrix / POMDP outputs
- {model}_combined_analysis.png, {model}_generative_model.png, standalone panels
- {model}_viz_manifest.json: Artifact paths, _viz_meta (JSON vs markdown source), counts
- {model}_viz_source_note.txt: Written when the step-3 JSON is older than the source .md
- visualization_summary.json: Run-level summary (all models)
output/8_visualization_output/
├── visualization_summary.json
└── {model}/
├── {model}_network_graph.png
├── {model}_network_stats.json
├── {model}_viz_manifest.json
├── {model}_combined_analysis.png
└── …
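A quick way to confirm that a run produced the files shown in this layout is to check for them by name. The sketch below is illustrative only: `check_model_outputs` is a hypothetical helper, and because the set of artifacts depends on the model's content, a missing file is not necessarily an error.

```python
from pathlib import Path
from typing import Dict

# File-name suffixes taken from the layout above; the check itself is illustrative.
EXPECTED_SUFFIXES = (
    "_network_graph.png",
    "_network_stats.json",
    "_viz_manifest.json",
    "_combined_analysis.png",
)


def check_model_outputs(output_dir: Path, model: str) -> Dict[str, bool]:
    """Map each expected file name to whether it exists under output_dir/model/."""
    model_dir = output_dir / model
    return {
        f"{model}{suffix}": (model_dir / f"{model}{suffix}").is_file()
        for suffix in EXPECTED_SUFFIXES
    }
```

For example, `check_model_outputs(Path("output/8_visualization_output"), "my_model")` returns a dictionary that can be printed or asserted on in a smoke test; `my_model` is a placeholder name.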
- Duration: ~2-5 seconds per model
- Memory: ~50-150MB
- Status: ✅ Production Ready
- Graph Generation: 1-3 seconds
- Matrix Visualization: 1-2 seconds
- Structure Analysis: 2-4 seconds
- Combined Visualization: 3-6 seconds
- Graph Layout: Graph layout algorithm failures
- Matrix Size: Matrix too large for visualization
- File I/O: Visualization file writing failures
- Dependency: Missing visualization dependencies
- Layout Recovery: Use simpler layout algorithms
- Matrix Sampling: Sample large matrices
- Format Recovery: Try alternative output formats
- Dependency Skip: Skip advanced visualizations
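The layout-recovery strategy above amounts to trying layouts in order and ending on one that cannot fail. The following is a minimal sketch under that assumption; the function names are hypothetical, and a plain circular layout stands in for whatever simpler algorithm the module actually falls back to.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Position = Tuple[float, float]
LayoutFn = Callable[[Sequence[str]], Dict[str, Position]]


def circular_layout(nodes: Sequence[str]) -> Dict[str, Position]:
    """Place nodes evenly on the unit circle; works for any node list."""
    n = max(len(nodes), 1)
    return {
        node: (math.cos(2 * math.pi * i / n), math.sin(2 * math.pi * i / n))
        for i, node in enumerate(nodes)
    }


def layout_with_fallback(
    nodes: Sequence[str], layouts: List[LayoutFn]
) -> Dict[str, Position]:
    """Try each layout in order and return the first result that succeeds."""
    for layout in layouts:
        try:
            return layout(nodes)
        except Exception:  # broad on purpose: any layout failure triggers fallback
            continue
    # Last resort: a layout with no failure modes.
    return circular_layout(nodes)
```

Ordering the list from most to least sophisticated keeps the best-looking result when it is available while still guaranteeing some output.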
- Script: 8_visualization.py (Step 8)
- Function: process_visualization() (core/process.py)
- utils.pipeline_template - Pipeline utilities
- advanced_visualization - Advanced visualization module
- tests.test_visualization_* - Visualization tests
GNN Files → Graph Extraction → Layout Calculation → Visualization Generation → Output Files
- src/tests/test_visualization_matrices.py - Matrix visualization tests
- src/tests/test_visualization_comprehensive.py - Comprehensive real-data tests
- src/tests/test_visualization_overall.py - Module-level tests
- src/tests/test_visualization_ontology.py - Ontology visualization tests
- src/tests/test_visualization_artifacts.py - Artifact / manifest tests
- Measurement: uv run pytest src/tests/test_visualization_*.py --cov=src.visualization --cov-report=term-missing (do not treat a fixed percentage in this file as canonical).
- Graph visualization with various layouts
- Matrix heatmap generation
- Model structure visualization
- Error handling and recovery
- Matplotlib backend configuration
- Headless environment support
- Progress tracking validation
Registration lives in mcp.py via register_tools(mcp_instance) (GNN MCP server register_tool API).
| Tool name | Python handler | Purpose |
|---|---|---|
| process_visualization | process_visualization_mcp | Run full step-8 batch for a directory |
| get_visualization_options | get_visualization_options_mcp | Return get_visualization_options() dict |
| list_visualization_artifacts | list_visualization_artifacts_mcp | List PNG/SVG/HTML/PDF under an output dir |
| get_visualization_module_info | get_visualization_module_info_mcp | Return get_module_info() metadata |
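To make the artifact-listing tool concrete, here is a minimal sketch of the kind of scan it describes (PNG/SVG/HTML/PDF under an output directory). It is not the handler in mcp.py; `list_artifacts` and its return shape are assumptions made for illustration.

```python
from pathlib import Path
from typing import List

ARTIFACT_SUFFIXES = {".png", ".svg", ".html", ".pdf"}


def list_artifacts(output_dir: Path) -> List[str]:
    """Return sorted relative paths of PNG/SVG/HTML/PDF files under output_dir."""
    return sorted(
        path.relative_to(output_dir).as_posix()
        for path in output_dir.rglob("*")
        if path.is_file() and path.suffix.lower() in ARTIFACT_SUFFIXES
    )
```

Returning relative POSIX-style paths keeps the result stable across platforms and easy to serialize as JSON.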
Symptom: Warnings about matplotlib backend or "no DISPLAY" errors
Solution:
- ✅ Automatic Fix: The module now automatically detects headless environments and configures the Agg backend
- Environment variable: Set MPLBACKEND=Agg before running
- Manual fix: Add to your script:
    import matplotlib
    matplotlib.use('Agg')
Prevention: Run in environments with display support or ensure Agg backend is used
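For scripts that run outside the pipeline, headless detection can be approximated with a small check before matplotlib is imported. This is a sketch of one common heuristic, not the module's actual detection logic; `headless_backend` is a hypothetical name.

```python
import os
import sys
from typing import Mapping, Optional


def headless_backend(env: Mapping[str, str], platform: str) -> Optional[str]:
    """Return "Agg" when no display appears to be available, else None.

    Heuristic (an assumption, not the module's check): on Linux, a missing
    DISPLAY and WAYLAND_DISPLAY suggests a headless session.
    """
    if env.get("MPLBACKEND"):
        return None  # respect an explicit user choice
    if platform.startswith("linux") and not (
        env.get("DISPLAY") or env.get("WAYLAND_DISPLAY")
    ):
        return "Agg"
    return None


backend = headless_backend(os.environ, sys.platform)
if backend:
    # Matplotlib reads MPLBACKEND when it is first imported,
    # so this must run before `import matplotlib`.
    os.environ["MPLBACKEND"] = backend
```

Passing the environment and platform in as arguments keeps the decision testable without touching the real process environment.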
Symptom: ImportError for matplotlib, networkx, or numpy
Solution:
# Using UV (recommended)
uv pip install "matplotlib>=3.5.0" "networkx>=2.8.0" "numpy>=1.21.0"
# Or install all dependencies via pyproject.toml
uv sync
Alternative: Install visualization optional group:
uv sync --extra visualization
Symptom: Visualization fails or hangs with large models (>100 nodes)
Solution:
- ✅ Automatic: Module samples large models automatically
- Manual override: Set sampling parameters in config
- Alternative: Visualize model subsets
Prevention: Use --sample-large-models flag when processing
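When visualizing a subset by hand, node sampling can be done before plotting. The sketch below assumes a graph given as a node list and an edge list and uses uniform random sampling with a fixed seed; the module's automatic sampling may use a different strategy, and `sample_graph` is a hypothetical helper.

```python
import random
from typing import List, Sequence, Tuple

Edge = Tuple[str, str]


def sample_graph(
    nodes: Sequence[str],
    edges: Sequence[Edge],
    max_nodes: int = 100,
    seed: int = 0,
) -> Tuple[List[str], List[Edge]]:
    """Keep at most max_nodes nodes and only the edges between kept nodes."""
    if len(nodes) <= max_nodes:
        return list(nodes), list(edges)
    rng = random.Random(seed)  # fixed seed keeps the sample reproducible
    kept = set(rng.sample(list(nodes), max_nodes))
    kept_nodes = [n for n in nodes if n in kept]  # preserve original order
    kept_edges = [(u, v) for u, v in edges if u in kept and v in kept]
    return kept_nodes, kept_edges
```

Uniform sampling can disconnect the graph; for structure-preserving previews, selecting a connected neighborhood around a node of interest may read better.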
Symptom: Out of memory errors or system slowdown
Solution:
- Reduce visualization DPI: Set DPI=150 (default: 300)
- Process files individually instead of batch
- Increase system memory or use sampling
Prevention: Monitor memory usage with --verbose flag
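The effect of DPI on memory follows from simple arithmetic: pixel dimensions are figure size in inches times DPI, and an RGBA raster needs 4 bytes per pixel. The sketch below computes that estimate for the default 10 x 8 inch figure; it covers only the pixel buffer, so actual process memory will be higher.

```python
from typing import Tuple


def raster_footprint(figsize: Tuple[float, float], dpi: int) -> Tuple[int, int, float]:
    """Return (width_px, height_px, approx_megabytes) for an RGBA raster."""
    width_px = round(figsize[0] * dpi)
    height_px = round(figsize[1] * dpi)
    megabytes = width_px * height_px * 4 / 1_000_000  # 4 bytes per RGBA pixel
    return width_px, height_px, megabytes


print(raster_footprint((10, 8), 300))  # (3000, 2400, 28.8)
print(raster_footprint((10, 8), 150))  # (1500, 1200, 7.2)
```

Halving the DPI quarters the pixel buffer, which is why dropping from 300 to 150 is an effective first step when memory is tight.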
Symptom: Step completes successfully but no images created
Diagnostic:
# Check if GNN processing (step 3) completed successfully
ls output/3_gnn_output/
# Run visualization with verbose logging
python src/8_visualization.py --verbose --target-dir input/gnn_files --output-dir output
Common Causes:
- GNN processing (step 3) not run first
- Empty or invalid GNN files
- Missing parsed model files
Solution:
# Run complete pipeline in order
python src/main.py --only-steps "3,8" --verbose
Symptom: Blurry or pixelated visualizations
Solution:
- Increase DPI in configuration (default: 300)
- Use vector formats (SVG) instead of PNG
- Adjust figure size in config
Configuration:
VISUALIZATION_CONFIG = {
'dpi': 600, # Higher quality
'output_format': 'svg', # Vector format
'figsize': (12, 10) # Larger canvas
}
Symptom: No progress updates during long-running visualizations
Solution:
# Enable verbose mode for detailed progress
python src/8_visualization.py --verbose --target-dir input/gnn_files
Features:
- ✅ File-by-file progress indicators: [1/5], [2/5], etc.
- ✅ Visualization type completion: Matrix ✅, Network ✅, Combined ✅
- ✅ Detailed step logging with emoji indicators 📊
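The file-by-file indicators listed above follow a simple index/total pattern. As a minimal sketch of that pattern (the helper name `with_progress` is hypothetical and the module's own logging may differ):

```python
from typing import Iterable, Iterator, Tuple


def with_progress(names: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (prefix, name) pairs such as ("[1/5]", "model_a.md")."""
    items = list(names)
    total = len(items)
    for index, name in enumerate(items, start=1):
        yield f"[{index}/{total}]", name
```

A caller would typically log `f"{prefix} {name}"` before processing each file, so long runs show steady output even when individual visualizations are slow.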
- Use appropriate DPI: 150 for preview, 300 for publication
- Sample large models: Automatic sampling for >100 nodes
- Parallel processing: Process multiple files independently
- Cache results: Reuse visualizations when possible
- Memory: ~50-150MB per model (typical)
- CPU: 1-2 cores per visualization process
- Disk: ~1-5MB per visualization set
- Time: 1-5 seconds per model (typical)
- Always run GNN processing (step 3) first:
    python src/3_gnn.py --target-dir input/gnn_files
    python src/8_visualization.py --target-dir input/gnn_files
- Use verbose mode for debugging:
    python src/8_visualization.py --verbose
- Check output directory structure:
    output/8_visualization_output/
    ├── model_name/
    │   ├── matrix_analysis.png
    │   ├── matrix_statistics.png
    │   └── model_name_combined_analysis.png
    └── visualization_results.json
- Monitor for warnings:
  - Backend configuration warnings
  - Dependency availability warnings
  - Sampling notifications for large models
Features:
- Graph visualization generation
- Matrix heatmap creation
- Network topology diagrams
- Model structure visualization
- Automatic headless environment detection
- Progress tracking with visual indicators
Known Issues:
- None currently
- Next Version: Interactive visualizations (plotly/HTML where optional deps exist)
- Future: Streaming or incremental updates for large models
Last Updated: 2026-04-16
Maintainer: GNN Pipeline Team
Status: ✅ Production Ready
Version: 1.6.0
Architecture Compliance: ✅ 100% Thin Orchestrator Pattern