BioGeMT · MartinekV · May 24, 2026 · May 26, 2026 · May 26, 2026 · May 26, 2026
diff --git a/.gitignore b/.gitignore
@@ -214,3 +214,4 @@ competitors/wandb/
 
 # Misc
 *playground*.ipynb
+.codex/
diff --git a/README.md b/README.md
@@ -7,7 +7,7 @@ Made for biomedical data, Agentomics outperformed human experts and created new
 
 
 How it works
-1) Input is a CSV training dataset + optional data description
+1) Input is a folder-based dataset split + optional data description
 2) Agentomics autonomously experments with various ML models and strategies
 3) Output is a trained model ready for inference and a detailed PDF report summarizing the development process and achieved metrics
 
@@ -49,7 +49,7 @@ Agentomics can be run either:
 For more details visit **https://biogemt.github.io/agentomics-ml/**
 
 ## Key Features
-- Generic: Agentomics can crunch any classification and regression datasets in CSV format.
+- Generic: Agentomics can use folder-based inputs for classification and regression tasks.
 - Secure: Agents execute code securely in Docker with read-only mounts to your file system and are only allowed to write in a Docker Volume.
 - Reproducible: Outputs include models, scripts, and conda environments needed to run inference or re-train models with one bash command.
 - Trustworthy: If you provide a test set, Agentomics fully abstracts LLMs from accessing it, allowing you to rely on programmaticly computed and reported test set metrics.
@@ -61,7 +61,6 @@ For more details visit **https://biogemt.github.io/agentomics-ml/**
 Agentomics is in active development. We welcome any raised Issues and suggestions. You can also [Email Us](mailto:martinekvlastimil95@gmail.com).
 
 Features coming soon:
-- Support for any data type (currently only CSV datasets)
 - Run forking and continuing
 - Better local model support and configuration
 - Remote GPU support for GCP
@@ -81,4 +80,3 @@ bioRxiv (preprint) https://www.biorxiv.org/content/10.64898/2026.01.27.702049v1
 
 MIT. See `LICENSE`.
 
-
diff --git a/docs/getting-started/quick-start.md b/docs/getting-started/quick-start.md
@@ -45,12 +45,18 @@ The agent will prompt you to:
 
 Place your data in `datasets/<your_dataset_name>/`:
 
-```
+```text
 datasets/my_dataset/
-├── train.csv           # Required: training data
-├── validation.csv      # Optional: validation data
-├── test.csv            # Optional: hidden test set
-└── dataset_description.md  # Optional: domain context
+├── train/
+│   ├── input/          # Required: model input files
+│   └── labels.csv      # Required: id,numeric_label
+├── validation/         # Optional
+│   ├── input/
+│   └── labels.csv
+├── test/               # Optional hidden test set
+│   ├── input/
+│   └── labels.csv
+└── dataset_description.md
 ```
 
 See [Preparing Datasets](../user-guide/datasets.md) for details.
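
Reviewer sketch (not part of the patch): the folder layout introduced in this hunk can be checked before a run with a few lines of Python. This is a minimal sketch that assumes only what the updated docs state: `train/` is required, `validation/` and `test/` are optional, and each present split needs an `input/` directory and a `labels.csv` with `id` and `numeric_label` columns. The function name `check_dataset` and the message wording are hypothetical and are not taken from the Agentomics code base.

```python
import csv
from pathlib import Path

# Split names and label columns as described in the updated docs.
SPLITS = ("train", "validation", "test")
REQUIRED_COLUMNS = {"id", "numeric_label"}


def check_dataset(root):
    """Return a list of human-readable layout problems (hypothetical helper)."""
    root = Path(root)
    problems = []
    for split in SPLITS:
        split_dir = root / split
        if not split_dir.is_dir():
            # Only train/ is required; the other splits are optional.
            if split == "train":
                problems.append("train/: required split is missing")
            continue
        if not (split_dir / "input").is_dir():
            problems.append(f"{split}/input/: directory is missing")
        labels = split_dir / "labels.csv"
        if not labels.is_file():
            problems.append(f"{split}/labels.csv: file is missing")
            continue
        # Read only the header row to check the required columns.
        with labels.open(newline="") as handle:
            header = next(csv.reader(handle), [])
        missing = REQUIRED_COLUMNS - set(header)
        if missing:
            problems.append(f"{split}/labels.csv: missing columns {sorted(missing)}")
    return problems
```

Calling `check_dataset("datasets/my_dataset")` returns an empty list when the layout matches the tree above. It does not inspect the files inside `input/`, since the docs leave their format open.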

diff --git a/docs/how-it-works/evaluation.md b/docs/how-it-works/evaluation.md
@@ -32,7 +32,8 @@ At the end of the run:
 4. Results saved to final report
 
 !!! note
-    Test evaluation only occurs if you provide a `test.csv` file.
+    Test evaluation only occurs if you provide a `test/` split with `input/`
+    and `labels.csv`.
 
 ## Classification Metrics
 

diff --git a/docs/index.md b/docs/index.md
@@ -31,7 +31,7 @@ Agentomics-ML works like an ML engineer:
 | Feature | Description |
 |---------|-------------|
 | **Any LLM** | Works with OpenAI, OpenRouter, or local models via Ollama |
-| **Any Dataset** | Supports classification or regression datasets in CSV format |
+| **Any Dataset** | Supports folder-based inputs for classification or regression tasks |
 | **Secure Execution** | Docker containers with read-only access to code and isolated execution |
 | **Reproducible** | Outputs include trained models, scripts, and conda environments |
 

diff --git a/docs/reference/workspace-structure.md b/docs/reference/workspace-structure.md
@@ -4,52 +4,73 @@ How Agentomics-ML organizes files during and after execution.
 
 ## Directory Overview
 
-```
+```text
 agentomics-ml/
 ├── datasets/                 # Raw input datasets
-├── prepared_datasets/        # Prepared training data
-├── prepared_test_sets/       # Prepared test data (hidden)
+├── prepared_datasets/        # Prepared public train/validation data
+├── prepared_test_sets/       # Prepared hidden test data
 ├── workspace/                # Active execution workspace
 │   ├── run/                  # Current run files
 │   ├── best_iteration_snapshot/ # Best iteration snapshot
 │   ├── reports/              # Iteration reports
-│   ├── extras/               # Logs and extra artifacts
-│   └── fallbacks/            # Backup for recovery
+│   └── extras/               # Logs and extra artifacts
 └── outputs/                  # Final results
 ```
 
 ## datasets/
 
-Your raw input datasets:
+Raw datasets use split folders:
 
-```
+```text
 datasets/my_dataset/
-├── train.csv              # Training data (required)
-├── validation.csv         # Validation data (optional)
-├── test.csv               # Test data (optional)
-└── dataset_description.md # Domain info (optional)
-```
+├── train/
+│   ├── input/
+│   ├── extras/             # Optional: supplementary training files
+│   └── labels.csv
+├── validation/             # Optional
+│   ├── input/
+│   ├── extras/             # Optional: supplementary validation files
+│   └── labels.csv
+├── test/                   # Optional hidden test set
+│   ├── input/
+│   └── labels.csv
+├── supplementary/          # Optional: supporting/supplementary materials
+├── metadata.json           # Optional if task type is supplied at preparation
+└── dataset_description.md  # Optional domain information
+```
+
+Each `labels.csv` must include `id` and `numeric_label` columns. Only `train`,
+`validation`, and `test` are supported split names. The `input/` structure is
+recorded at preparation time and must match across all splits.
 
 ## prepared_datasets/
 
-After preparation, datasets are formatted for the agent:
+After preparation, public splits are formatted for the agent:
 
-```
+```text
 prepared_datasets/my_dataset/
-├── train.csv              # Processed training data
-├── validation.csv         # Processed validation data
-├── dataset_description.md # Copied/created description
-└── metadata.json          # Task info (type, classes, etc.)
+├── train/
+│   ├── input/
+│   ├── extras/             # If provided
+│   └── labels.csv
+├── validation/
+│   ├── input/
+│   ├── extras/             # If provided
+│   └── labels.csv
+├── supplementary/          # If provided
+├── dataset_description.md
+└── metadata.json
 ```
 
 ## prepared_test_sets/
 
-Test data is separated to ensure it stays hidden:
+Test data is separated to keep it hidden:
 
-```
+```text
 prepared_test_sets/my_dataset/
-├── test.csv               # Test data with labels
-└── test.no_label.csv      # Test data without labels
+└── test/
+    ├── input/
+    └── labels.csv
 ```
 
 The agent never sees files in this directory during training.
@@ -95,18 +116,27 @@ workspace/best_iteration_snapshot/
 
 Updated whenever a new best iteration is achieved.
 
-### workspace/fallbacks/
+### workspace/run/shared/splits/
 
-Recovery backup for split changes:
+Versioned train/validation split folders:
 
-```
-workspace/fallbacks/<agent_id>/
-├── train.csv
-├── validation.csv
-└── split_fingerprint.json
+```text
+workspace/run/shared/splits/
+└── split_0/
+    ├── train/
+    │   ├── input/
+    │   ├── extras/         # Optional
+    │   └── labels.csv
+    └── validation/
+        ├── input/
+        ├── extras/         # Optional
+        └── labels.csv
 ```
 
-Used to restore data if a split change causes issues.
+Each time the agent changes the train/validation split, a new `split_<n>/`
+folder is created. Iteration outputs record which split version they used.
+The `input/` structure must match the original recorded structure across all
+splits. The `extras/` subfolder may be created or modified by the agent.
 
 ### workspace/reports/
 
@@ -115,14 +145,14 @@ Iteration reports are written here during runs. These are copied to
 
 ### workspace/extras/
 
-Logs and auxiliary artifacts (metrics, run logs) are stored here and copied to
+Logs and auxiliary artifacts are stored here and copied to
 `outputs/<agent_id>/extras/`.
 
 ## outputs/
 
 Final results after run completion:
 
-```
+```text
 outputs/<agent_id>/
 ├── best_iteration_snapshot/           # Best iteration artifacts
 │   ├── model_training/
@@ -138,8 +168,12 @@ outputs/<agent_id>/
 │   │   ├── config.json
 │   │   └── splits/
 │   │       └── split_0/
-│   │           ├── train.csv
-│   │           └── validation.csv
+│   │           ├── train/
+│   │           │   ├── input/
+│   │           │   └── labels.csv
+│   │           └── validation/
+│   │               ├── input/
+│   │               └── labels.csv
 │   ├── iteration_0/
 │   ├── iteration_1/
 │   └── ...
@@ -176,7 +210,6 @@ rm -rf outputs/<agent_id>
 ```bash
 rm -rf workspace/run/*
 rm -rf workspace/best_iteration_snapshot/*
-rm -rf workspace/fallbacks/*
 ```
 
 ### Clean Everything
@@ -192,9 +225,9 @@ rm -rf prepared_test_sets/*
 
 In Docker mode, workspace is mounted as a volume:
 
-- Code repository: Read-only
-- Workspace: Read-write
-- Outputs: Read-write
+- Code repository: read-only
+- Workspace: read-write
+- Outputs: read-write
 
 This isolates agent execution from the host system.
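
Reviewer sketch (not part of the patch): the `workspace/run/shared/splits/` section above says a new `split_<n>/` folder is created each time the agent changes the train/validation split. A small helper along these lines could locate the most recent version when inspecting a run. It assumes only the `split_<n>` naming shown in the docs; the function names `split_versions` and `latest_split` are hypothetical and do not come from the Agentomics code base.

```python
import re
from pathlib import Path

# Matches folder names such as split_0, split_1, split_12.
SPLIT_DIR = re.compile(r"split_(\d+)")


def split_versions(splits_root):
    """Return the sorted version numbers of split_<n>/ folders (hypothetical helper)."""
    root = Path(splits_root)
    if not root.is_dir():
        return []
    versions = []
    for child in root.iterdir():
        match = SPLIT_DIR.fullmatch(child.name)
        if child.is_dir() and match:
            versions.append(int(match.group(1)))
    return sorted(versions)


def latest_split(splits_root):
    """Return the path of the highest-numbered split folder, or None if there is none."""
    versions = split_versions(splits_root)
    if not versions:
        return None
    return Path(splits_root) / f"split_{versions[-1]}"
```

Sorting on the parsed integer rather than on the folder name keeps `split_10` after `split_2`, which a plain string sort would get wrong.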