METAINFORMANT Documentation Guide

A comprehensive guide to navigating and understanding the METAINFORMANT documentation system.

Quick Start

If you're new to METAINFORMANT, follow this path:

  1. QUICKSTART.md - Installation and basic usage examples
  2. Architecture - System design and module relationships
  3. CLI Reference - Command-line interface for all modules
  4. Domain-specific documentation - See below for your area of interest

Documentation Organization

METAINFORMANT documentation is organized hierarchically by domain and purpose:

Repository Root

Core Documentation (docs/)

  • index.md - Complete documentation index with navigation
  • architecture.md - System design, module dependencies, data flow
  • cli.md - Unified CLI reference for all modules
  • setup.md - Installation and environment configuration
  • testing.md - Test suite documentation and guidelines
  • UV_SETUP.md - Package management with uv

Domain Documentation

Each biological domain has its own subdirectory with consistent structure:

| Domain | Location | Primary Focus |
|--------|----------|---------------|
| Core | docs/core/ | Shared utilities (I/O, config, logging, paths) |
| DNA | docs/dna/ | Sequence analysis, alignment, phylogeny, population genetics |
| RNA | docs/rna/ | RNA-seq workflows, amalgkit integration |
| GWAS | docs/gwas/ | Genome-wide association studies |
| eQTL | docs/eqtl/ | Expression QTL integration pipeline |
| Protein | docs/protein/ | Protein sequences, structures, databases |
| Single-Cell | docs/singlecell/ | scRNA-seq preprocessing, clustering, trajectory |
| Networks | docs/networks/ | Biological networks, community detection |
| Multi-Omics | docs/multiomics/ | Cross-omics data integration |
| Math | docs/math/ | Population genetics theory, epidemiology |
| ML | docs/ml/ | Machine learning methods |
| Visualization | docs/visualization/ | Plots, animations, trees |
| Simulation | docs/simulation/ | Synthetic data generation |
| Quality | docs/quality/ | Data quality assessment |
| Information | docs/information/ | Information theory methods |
| Life Events | docs/life_events/ | Life course event analysis |
| Ontology | docs/ontology/ | Gene Ontology, functional annotation |
| Phenotype | docs/phenotype/ | Phenotypic trait analysis |
| Epigenome | docs/epigenome/ | Methylation, ChIP-seq, ATAC-seq |
| Ecology | docs/ecology/ | Community analysis, diversity metrics |
| Long-read | docs/longread/ | PacBio/Nanopore sequencing, assembly |
| Metagenomics | docs/metagenomics/ | Amplicon, shotgun, functional annotation |
| Structural Variants | docs/structural_variants/ | CNV/SV detection, annotation, visualization |
| Spatial | docs/spatial/ | Spatial transcriptomics (Visium, MERFISH, Xenium) |
| Pharmacogenomics | docs/pharmacogenomics/ | Clinical variants, CPIC, PharmGKB |
| Metabolomics | docs/metabolomics/ | Mass spectrometry, pathway mapping |
| Cloud | docs/cloud/ | GCP deployment, Docker pipelines, VM lifecycle |
| Agents | docs/agents/ | Agent-based modeling, ecosystem simulation |

Domain Documentation Structure

Each domain directory follows a consistent pattern:

docs/<domain>/
├── README.md       # Domain overview and quick start
├── index.md        # Detailed index of all submodules
└── <topic>.md      # Topic-specific documentation
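
The layout above lends itself to a quick consistency check. The following is a minimal sketch, not part of METAINFORMANT itself: the function name `missing_domain_files` is hypothetical, and it assumes every subdirectory directly under the docs root is a domain directory that should contain both `README.md` and `index.md`. Topic files are not checked because their names vary by domain.

```python
from pathlib import Path

# Files that every domain directory is expected to contain, per the layout above.
REQUIRED_FILES = ("README.md", "index.md")


def missing_domain_files(docs_root):
    """Map each domain directory under docs_root to the required files it lacks.

    Hypothetical helper: assumes every subdirectory of docs_root is a domain.
    """
    missing = {}
    for domain_dir in sorted(Path(docs_root).iterdir()):
        if not domain_dir.is_dir():
            continue  # skip top-level files such as index.md or cli.md
        absent = [name for name in REQUIRED_FILES
                  if not (domain_dir / name).is_file()]
        if absent:
            missing[domain_dir.name] = absent
    return missing
```

Calling `missing_domain_files("docs")` from the repository root would return an empty dictionary when every domain directory follows the pattern.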

Finding What You Need

By Task

| Task | Documentation |
|------|---------------|
| Install METAINFORMANT | QUICKSTART.md, setup.md |
| Run CLI commands | cli.md |
| Analyze DNA sequences | docs/dna/ |
| Run RNA-seq workflow | docs/rna/workflow.md |
| Perform GWAS | docs/gwas/workflow.md |
| Analyze single-cell data | docs/singlecell/ |
| Visualize results | docs/visualization/ |
| Run tests | testing.md |
| Understand architecture | architecture.md |

By Module

Source code documentation mirrors the src/metainformant/ structure:

| Source Module | Source README | User Documentation |
|---------------|---------------|--------------------|
| core/ | src/metainformant/core/README.md | docs/core/ |
| dna/ | src/metainformant/dna/README.md | docs/dna/ |
| rna/ | src/metainformant/rna/README.md | docs/rna/ |
| ... | ... | ... |
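
The mirroring shown in the table can be expressed as a small path mapping. This is an illustrative sketch only: `doc_locations` is a hypothetical helper, and it assumes that the elided rows follow the same pattern as the three rows listed, which the table does not confirm for every module.

```python
from pathlib import PurePosixPath


def doc_locations(module):
    """Return the source README and user-docs paths for a source module name.

    Hypothetical helper: assumes every module mirrors the core/dna/rna rows.
    """
    name = module.strip("/")  # accept both "dna" and "dna/"
    return {
        "source_readme": str(PurePosixPath("src/metainformant") / name / "README.md"),
        "user_docs": str(PurePosixPath("docs") / name) + "/",
    }
```

For example, `doc_locations("dna/")` yields the two paths in the `dna/` row of the table.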

Documentation Types

README Files

  • Located in each module directory
  • Provide quick overview and basic usage
  • Link to detailed documentation

AGENTS.md Files

  • Document AI contributions to the project
  • Located at repository root and in src/metainformant/ module directories
  • Track development history and design decisions
  • Cursor project skills under .cursor/skills/ are generated from each AGENTS.md; regenerate with uv run python scripts/package/generate_cursor_skills.py (see .cursor/skills/README.md)

Topic Documentation

  • Detailed coverage of specific functionality
  • Code examples and API references
  • Performance considerations

Configuration Documentation

  • YAML configuration file format
  • Environment variable overrides
  • Species-specific configurations

Best Practices

Reading Documentation

  1. Start with the domain's README.md for an overview
  2. Check index.md for available topics
  3. Read topic-specific files for details
  4. Reference source code READMEs for API specifics

Contributing Documentation

  1. Follow existing structure and formatting
  2. Include practical code examples
  3. Update cross-references when adding new docs
  4. Place new docs in the appropriate docs/<domain>/ directory
  5. Never create documentation in the output/ directory
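
The last rule can be checked mechanically before a commit. The sketch below is an assumption-laden illustration, not a project script: `misplaced_docs` is a hypothetical name, and it treats any Markdown file under `output/` as misplaced documentation, which may be stricter or looser than the project intends.

```python
from pathlib import Path


def misplaced_docs(repo_root):
    """List Markdown files under <repo_root>/output/, relative to repo_root.

    Hypothetical helper: assumes "documentation" means files ending in .md.
    """
    root = Path(repo_root)
    output_dir = root / "output"
    if not output_dir.is_dir():
        return []  # nothing to check when output/ does not exist
    return sorted(path.relative_to(root).as_posix()
                  for path in output_dir.rglob("*.md"))
```

An empty list means no Markdown files were found under `output/`.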

Documentation Standards

  • Clear, technical writing style
  • Consistent markdown formatting
  • Runnable code examples
  • Cross-references between related topics
  • Regular updates with code changes
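
Cross-references are easier to keep current when broken relative links are detected automatically. The following is a minimal sketch under stated assumptions: `broken_relative_links` is a hypothetical helper, it recognizes only inline Markdown links of the form `[text](target)`, and it ignores external URLs and in-page anchors rather than validating them.

```python
import re
from pathlib import Path

# Matches inline Markdown links and images: [text](target)
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_relative_links(markdown_file):
    """Return relative link targets in markdown_file that do not exist on disk."""
    source = Path(markdown_file)
    broken = []
    for target in LINK_PATTERN.findall(source.read_text(encoding="utf-8")):
        if "://" in target or target.startswith(("#", "mailto:")):
            continue  # external URLs and in-page anchors are not checked
        path_part = target.split("#", 1)[0]  # drop any trailing #fragment
        if not (source.parent / path_part).exists():
            broken.append(target)
    return broken
```

Running this over every file matched by `Path("docs").rglob("*.md")` gives a rough report of links that need updating; reference-style links and HTML anchors would need additional handling.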

Cross-Reference Navigation

Key Entry Points

Related Resources

Getting Help

  • Documentation Issues: Check for broken links or outdated content
  • Code Questions: Reference source code READMEs and docstrings
  • Workflow Help: See domain-specific workflow documentation
  • Bug Reports: https://github.com/docxology/metainformant/issues

This guide provides navigation for METAINFORMANT's bioinformatics documentation, organized across the 28 domains listed in the Domain Documentation table above.