1 change: 1 addition & 0 deletions .gitignore
@@ -214,3 +214,4 @@ competitors/wandb/

# Misc
*playground*.ipynb
.codex/
6 changes: 2 additions & 4 deletions README.md
@@ -7,7 +7,7 @@ Made for biomedical data, Agentomics outperformed human experts and created new


How it works
1) Input is a CSV training dataset + optional data description
1) Input is a folder-based dataset split + optional data description
2) Agentomics autonomously experiments with various ML models and strategies
3) Output is a trained model ready for inference and a detailed PDF report summarizing the development process and achieved metrics

@@ -49,7 +49,7 @@ Agentomics can be run either:
For more details visit **https://biogemt.github.io/agentomics-ml/**

## Key Features
- Generic: Agentomics can crunch any classification and regression datasets in CSV format.
- Generic: Agentomics can use folder-based inputs for classification and regression tasks.
- Secure: Agents execute code securely in Docker with read-only mounts to your file system and are only allowed to write in a Docker Volume.
- Reproducible: Outputs include models, scripts, and conda environments needed to run inference or re-train models with one bash command.
- Trustworthy: If you provide a test set, Agentomics fully abstracts LLMs from accessing it, allowing you to rely on programmatically computed and reported test set metrics.
@@ -61,7 +61,6 @@ For more details visit **https://biogemt.github.io/agentomics-ml/**
Agentomics is in active development. We welcome any raised Issues and suggestions. You can also [Email Us](mailto:martinekvlastimil95@gmail.com).

Features coming soon:
- Support for any data type (currently only CSV datasets)
- Run forking and continuing
- Better local model support and configuration
- Remote GPU support for GCP
@@ -81,4 +80,3 @@ bioRxiv (preprint) https://www.biorxiv.org/content/10.64898/2026.01.27.702049v1

MIT. See `LICENSE`.


16 changes: 11 additions & 5 deletions docs/getting-started/quick-start.md
@@ -45,12 +45,18 @@ The agent will prompt you to:

Place your data in `datasets/<your_dataset_name>/`:

```
```text
datasets/my_dataset/
├── train.csv # Required: training data
├── validation.csv # Optional: validation data
├── test.csv # Optional: hidden test set
└── dataset_description.md # Optional: domain context
├── train/
│ ├── input/ # Required: model input files
│ └── labels.csv # Required: id,numeric_label
├── validation/ # Optional
│ ├── input/
│ └── labels.csv
├── test/ # Optional hidden test set
│ ├── input/
│ └── labels.csv
└── dataset_description.md
```

See [Preparing Datasets](../user-guide/datasets.md) for details.
3 changes: 2 additions & 1 deletion docs/how-it-works/evaluation.md
@@ -32,7 +32,8 @@ At the end of the run:
4. Results saved to final report

!!! note
Test evaluation only occurs if you provide a `test.csv` file.
Test evaluation only occurs if you provide a `test/` split with `input/`
and `labels.csv`.
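
The condition in the note can be checked before starting a run. The sketch below is a hypothetical helper, not part of Agentomics-ML: the function name `has_test_split` is an assumption, and it only tests for the presence of `test/input/` and `test/labels.csv` as described above.

```python
from pathlib import Path


def has_test_split(dataset_dir):
    """Return True if dataset_dir contains both test/input/ and test/labels.csv.

    Hypothetical helper: it mirrors the condition stated in the note and does
    not inspect the contents of either path.
    """
    test_dir = Path(dataset_dir) / "test"
    return (test_dir / "input").is_dir() and (test_dir / "labels.csv").is_file()
```

For example, `has_test_split("datasets/my_dataset")` would return `False` for a dataset that only ships `train/` and `validation/`, in which case no test evaluation should be expected in the final report.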

## Classification Metrics

2 changes: 1 addition & 1 deletion docs/index.md
@@ -31,7 +31,7 @@ Agentomics-ML works like an ML engineer:
| Feature | Description |
|---------|-------------|
| **Any LLM** | Works with OpenAI, OpenRouter, or local models via Ollama |
| **Any Dataset** | Supports classification or regression datasets in CSV format |
| **Any Dataset** | Supports folder-based inputs for classification or regression tasks |
| **Secure Execution** | Docker containers with read-only access to code and isolated execution |
| **Reproducible** | Outputs include trained models, scripts, and conda environments |

109 changes: 71 additions & 38 deletions docs/reference/workspace-structure.md
@@ -4,52 +4,73 @@ How Agentomics-ML organizes files during and after execution.

## Directory Overview

```
```text
agentomics-ml/
├── datasets/ # Raw input datasets
├── prepared_datasets/ # Prepared training data
├── prepared_test_sets/ # Prepared test data (hidden)
├── prepared_datasets/ # Prepared public train/validation data
├── prepared_test_sets/ # Prepared hidden test data
├── workspace/ # Active execution workspace
│ ├── run/ # Current run files
│ ├── best_iteration_snapshot/ # Best iteration snapshot
│ ├── reports/ # Iteration reports
│ ├── extras/ # Logs and extra artifacts
│ └── fallbacks/ # Backup for recovery
│ └── extras/ # Logs and extra artifacts
└── outputs/ # Final results
```

## datasets/

Your raw input datasets:
Raw datasets use split folders:

```
```text
datasets/my_dataset/
├── train.csv # Training data (required)
├── validation.csv # Validation data (optional)
├── test.csv # Test data (optional)
└── dataset_description.md # Domain info (optional)
```
├── train/
│ ├── input/
│ ├── extras/ # Optional: supplementary training files
│ └── labels.csv
├── validation/ # Optional
│ ├── input/
│ ├── extras/ # Optional: supplementary validation files
│ └── labels.csv
├── test/ # Optional hidden test set
│ ├── input/
│ └── labels.csv
├── supplementary/ # Optional: supporting/supplementary materials
├── metadata.json # Optional if task type is supplied at preparation
└── dataset_description.md # Optional domain information
```

Each `labels.csv` must include `id` and `numeric_label` columns. Only `train`,
`validation`, and `test` are supported split names. The `input/` structure is
recorded at preparation time and must match across all splits.
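
A raw dataset can be sanity-checked against these rules before preparation. The following is a minimal sketch under stated assumptions, not the project's own validation code: the function name `check_dataset_layout` and the constants are hypothetical, `supplementary/` is treated as a non-split folder based on the tree above, and the `input/` structure comparison is left out because its exact definition is not given here.

```python
import csv
from pathlib import Path

ALLOWED_SPLITS = {"train", "validation", "test"}
REQUIRED_COLUMNS = {"id", "numeric_label"}
# Assumption: top-level folders that are not splits, taken from the tree above.
NON_SPLIT_DIRS = {"supplementary"}


def check_dataset_layout(dataset_dir):
    """Return a list of problems found; an empty list means none were found."""
    root = Path(dataset_dir)
    problems = []
    if not (root / "train").is_dir():
        problems.append("missing required split: train/")
    split_dirs = [
        p for p in sorted(root.iterdir())
        if p.is_dir() and p.name not in NON_SPLIT_DIRS
    ]
    for split in split_dirs:
        if split.name not in ALLOWED_SPLITS:
            problems.append(f"unsupported split name: {split.name}/")
            continue
        if not (split / "input").is_dir():
            problems.append(f"{split.name}/ is missing input/")
        labels = split / "labels.csv"
        if not labels.is_file():
            problems.append(f"{split.name}/ is missing labels.csv")
            continue
        # Only the header row is read; label values are not validated here.
        with labels.open(newline="", encoding="utf-8-sig") as handle:
            header = next(csv.reader(handle), [])
        missing = REQUIRED_COLUMNS - {column.strip() for column in header}
        if missing:
            problems.append(
                f"{split.name}/labels.csv is missing columns: {sorted(missing)}"
            )
    return problems
```

Calling `check_dataset_layout("datasets/my_dataset")` and printing each returned string gives a quick report; an empty list only means the checks in this sketch passed, not that preparation is guaranteed to succeed.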

## prepared_datasets/

After preparation, datasets are formatted for the agent:
After preparation, public splits are formatted for the agent:

```
```text
prepared_datasets/my_dataset/
├── train.csv # Processed training data
├── validation.csv # Processed validation data
├── dataset_description.md # Copied/created description
└── metadata.json # Task info (type, classes, etc.)
├── train/
│ ├── input/
│ ├── extras/ # If provided
│ └── labels.csv
├── validation/
│ ├── input/
│ ├── extras/ # If provided
│ └── labels.csv
├── supplementary/ # If provided
├── dataset_description.md
└── metadata.json
```

## prepared_test_sets/

Test data is separated to ensure it stays hidden:
Test data is separated to keep it hidden:

```
```text
prepared_test_sets/my_dataset/
├── test.csv # Test data with labels
└── test.no_label.csv # Test data without labels
└── test/
├── input/
└── labels.csv
```

The agent never sees files in this directory during training.
@@ -95,18 +116,27 @@ workspace/best_iteration_snapshot/

Updated whenever a new best iteration is achieved.

### workspace/fallbacks/
### workspace/run/shared/splits/

Recovery backup for split changes:
Versioned train/validation split folders:

```
workspace/fallbacks/<agent_id>/
├── train.csv
├── validation.csv
└── split_fingerprint.json
```text
workspace/run/shared/splits/
└── split_0/
├── train/
│ ├── input/
│ ├── extras/ # Optional
│ └── labels.csv
└── validation/
├── input/
├── extras/ # Optional
└── labels.csv
```

Used to restore data if a split change causes issues.
Each time the agent changes the train/validation split, a new `split_<n>/`
folder is created. Iteration outputs record which split version they used.
The `input/` structure must match the original recorded structure across all
splits. The `extras/` subfolder may be created or modified by the agent.
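
The versioning scheme can be illustrated with a short sketch. It assumes that split folders are numbered sequentially and that a new version takes the highest existing index plus one; the text above only shows `split_0` and the `split_<n>/` pattern, so that numbering rule and the helper name `next_split_dir` are assumptions rather than the project's implementation.

```python
import re
from pathlib import Path

SPLIT_DIR_PATTERN = re.compile(r"^split_(\d+)$")


def next_split_dir(splits_root):
    """Return the path of the next split_<n>/ folder under splits_root.

    Assumption: the next version is the highest existing index plus one,
    starting at split_0 when no versions exist yet. Nothing is created on disk.
    """
    root = Path(splits_root)
    indices = []
    if root.is_dir():
        for child in root.iterdir():
            match = SPLIT_DIR_PATTERN.match(child.name)
            if child.is_dir() and match:
                indices.append(int(match.group(1)))
    next_index = max(indices) + 1 if indices else 0
    return root / f"split_{next_index}"
```

Indices are compared numerically rather than as strings, so `split_10` sorts after `split_9`. A caller would still need to create `train/` and `validation/` inside the returned folder and record the chosen version with the iteration outputs.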

### workspace/reports/

@@ -115,14 +145,14 @@ Iteration reports are written here during runs. These are copied to

### workspace/extras/

Logs and auxiliary artifacts (metrics, run logs) are stored here and copied to
Logs and auxiliary artifacts are stored here and copied to
`outputs/<agent_id>/extras/`.

## outputs/

Final results after run completion:

```
```text
outputs/<agent_id>/
├── best_iteration_snapshot/ # Best iteration artifacts
│ ├── model_training/
@@ -138,8 +168,12 @@ outputs/<agent_id>/
│ │ ├── config.json
│ │ └── splits/
│ │ └── split_0/
│ │ ├── train.csv
│ │ └── validation.csv
│ │ ├── train/
│ │ │ ├── input/
│ │ │ └── labels.csv
│ │ └── validation/
│ │ ├── input/
│ │ └── labels.csv
│ ├── iteration_0/
│ ├── iteration_1/
│ └── ...
@@ -176,7 +210,6 @@ rm -rf outputs/<agent_id>
```bash
rm -rf workspace/run/*
rm -rf workspace/best_iteration_snapshot/*
rm -rf workspace/fallbacks/*
```

### Clean Everything
@@ -192,9 +225,9 @@ rm -rf prepared_test_sets/*

In Docker mode, workspace is mounted as a volume:

- Code repository: Read-only
- Workspace: Read-write
- Outputs: Read-write
- Code repository: read-only
- Workspace: read-write
- Outputs: read-write

This isolates agent execution from the host system.
