Migration guide for workflow outputs #6162

Merged
merged 13 commits into from
Jul 8, 2025
7 changes: 4 additions & 3 deletions docs/index.md
@@ -161,9 +161,10 @@ developer/plugins
:caption: Tutorials
:maxdepth: 1

data-lineage
metrics
flux
tutorials/data-lineage
tutorials/workflow-outputs
tutorials/metrics
tutorials/flux
```

```{toctree}
1 change: 1 addition & 0 deletions docs/migrations/24-04.md
@@ -1,3 +1,4 @@
(migrating-24-04-page)=

# Migrating to 24.04

4 changes: 2 additions & 2 deletions docs/migrations/25-04.md
@@ -32,7 +32,7 @@ The third preview of workflow outputs introduces the following breaking changes

- The `mapper` index directive has been removed. Use a `map` operator in the workflow body instead.

See {ref}`workflow-output-def` to learn more about the workflow output definition.
See {ref}`migrating-workflow-outputs` to get started.

<h3>Topic channels (out of preview)</h3>

@@ -44,7 +44,7 @@ This release introduces built-in provenance tracking, also known as *data lineag

You can explore this lineage from the command line using the {ref}`cli-lineage` command. Additionally, you can refer to files in the lineage store from a Nextflow script using the `lid://` path prefix as well as the {ref}`channel-from-lineage` channel factory.

See the {ref}`data-lineage-page` guide to get started.
See {ref}`data-lineage-page` to get started.

## Enhancements

2 changes: 1 addition & 1 deletion docs/data-lineage.md → docs/tutorials/data-lineage.md
@@ -1,6 +1,6 @@
(data-lineage-page)=

# Data lineage
# Getting started with data lineage

Data lineage in Nextflow provides comprehensive tracking of workflow runs, task executions, and output files. This feature helps you verify the integrity and reproducibility of your pipeline results by maintaining a complete history of computations and intermediate data.

42 changes: 8 additions & 34 deletions docs/flux.md → docs/tutorials/flux.md
@@ -5,9 +5,9 @@
:::{versionadded} 22.11.0-edge
:::

The [Flux Framework](https://flux-framework.org/) is a modern resource manager that can span the space between cloud and HPC. If your center does not provide Flux for you, you can [build Flux on your own](https://flux-framework.readthedocs.io/en/latest/quickstart.html#building-the-code) and launch it as a job with your resource manager of choice (e.g. SLURM or a cloud provider).
## Overview

## Tutorial
The [Flux Framework](https://flux-framework.org/) is a modern resource manager that can span the space between cloud and HPC. If your center does not provide Flux, you can [build Flux yourself](https://flux-framework.readthedocs.io/en/latest/quickstart.html#building-the-code) and launch it as a job using your resource manager of choice (e.g. SLURM or a cloud provider).

In the [`docker/flux`](https://github.com/nextflow-io/nextflow/tree/master/docker/flux) directory we provide a [Dockerfile for interacting with Flux](https://github.com/nextflow-io/nextflow/tree/master/docker/flux/.devcontainer/Dockerfile) along with a [VSCode Developer Container](https://code.visualstudio.com/docs/devcontainers/containers) environment that you can put at the root of the project to be provided with a Flux agent and the dependencies needed to build Nextflow. There are two ways to use this:

@@ -16,7 +16,7 @@ In the [`docker/flux`](https://github.com/nextflow-io/nextflow/tree/master/docke

Both strategies are described below. For this tutorial, you will generally want to prepare a pipeline to use the `flux` executor, create an environment with Flux, start a Flux instance, and interact with it.

### Prepare your pipeline
## Prepare your pipeline

To run your pipeline with Flux, you'll want to specify it in your config. Here is an example `nextflow.config`:

@@ -53,7 +53,7 @@ process haveMeal {
}
```

### Container Environment
## Prepare your environment

You can either build the Docker image from the root of the Nextflow repository:

@@ -81,19 +81,15 @@ $ code .

Then you should be able to open a terminal (**Terminal** -> **New Terminal**) to interact with the command line. Try running `make` again! Whichever of these two approaches you take, you should be in a container environment with the `flux` command available.

### Start a Flux Instance
## Start a Flux instance

Once in your container, you can start an interactive Flux instance (from which you can submit jobs on the command line to test with Nextflow) as follows:

```console
$ flux start --test-size=4
```

#### Getting Familiar with Flux

:::{note}
This step is optional!
:::
### Getting familiar with Flux

Here is an example of submitting a job and getting the log for it.

@@ -125,7 +121,7 @@ $ flux jobs
ƒ4tkMUAAT root sleep R 1 1 2.546s ab6634a491bb
```

### Submitting with Nextflow
## Submitting with Nextflow

Prepare your `nextflow.config` and `demo.nf` in the same directory.

@@ -134,27 +130,7 @@ $ ls .
demo.nf nextflow.config
```

If you've installed Nextflow already, you are good to go! If you are working with development code and need to build Nextflow:

```console
$ make assemble
```

Make sure `nextflow` is on your PATH (here we are in the root of the Nextflow repository):

```console
$ export PATH=$PWD:$PATH
$ which nextflow
/workspaces/nextflow/nextflow
```

Then change to the directory with your config and demo file:

```console
$ cd docker/flux
```

And then run the pipeline with Flux!
Finally, run the pipeline with Flux:

```console
$ nextflow -c nextflow.config run demo.nf
@@ -169,5 +145,3 @@ executor > flux (5)
🥑️ for breakfast!
🥧️ for breakfast!
```

And that's it! You've just run a pipeline using nextflow and Flux.
14 changes: 7 additions & 7 deletions docs/metrics.md → docs/tutorials/metrics.md
@@ -8,7 +8,7 @@ This tutorial explains how resource usage metrics are computed from execution re

CPU Usage plots report how CPU resources are used by each process.

```{image} _static/report-resource-cpu-noheader.png
```{image} ../_static/report-resource-cpu-noheader.png
```

**Raw Usage** tabs are expected to show 100% core usage if processes perform one task of pure computation. If tasks are distributed over 2, 3, or 4 CPUs, the raw usage will be 200%, 300%, or 400%, respectively. **% Allocated** tabs rescale raw usage values relative to the number of CPUs that are set with the `cpus` directive. If the `cpus` directive is not set, CPUs are set to `1` and **% Allocated** tabs will show the same values as **Raw Usage** tabs.
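
The rescaling described above can be sketched as a small calculation. This is only an illustration of the arithmetic, not the code Nextflow uses to build the report; the function name `pct_allocated` and its parameters are hypothetical.

```python
def pct_allocated(raw_usage_pct: float, cpus: int = 1) -> float:
    """Rescale raw CPU usage (100% = one fully used core) by the allocated CPUs.

    Hypothetical helper: `cpus` stands in for the value of the `cpus`
    directive and defaults to 1, matching the case where it is not set.
    """
    if cpus < 1:
        raise ValueError("cpus must be at least 1")
    return raw_usage_pct / cpus


# Two busy cores with two CPUs allocated: the allocation is fully used.
print(pct_allocated(200.0, cpus=2))  # 100.0
# One and a half busy cores with four CPUs allocated: mostly idle.
print(pct_allocated(150.0, cpus=4))  # 37.5
# `cpus` not set: % Allocated equals Raw Usage.
print(pct_allocated(100.0))  # 100.0
```

A **% Allocated** value well below 100% under this reading suggests that the task requested more CPUs than it kept busy.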
@@ -253,17 +253,17 @@ workflow{

The **Virtual (RAM + Disk swap)** tab shows that both `malloc` and `malloc_fill` use the same amount of virtual memory (~1 GiB):

```{image} _static/report-resource-memory-vmem.png
```{image} ../_static/report-resource-memory-vmem.png
```

However, the **Physical (RAM)** tab shows that `malloc_fill` uses ~1 GiB of RAM while `malloc` uses ~0 GiB of RAM:

```{image} _static/report-resource-memory-ram.png
```{image} ../_static/report-resource-memory-ram.png
```

The **% RAM Allocated** tab shows that `malloc` and `malloc_fill` used 0% and 67% of resources set in the `memory` directive, respectively:

```{image} _static/report-resource-memory-pctram.png
```{image} ../_static/report-resource-memory-pctram.png
```
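
As a rough check of the percentages above, the following sketch divides physical memory use by the requested memory. The 1.5 GiB request is an assumption chosen only because it reproduces the 67% figure; the actual `memory` value belongs to the part of the tutorial not shown in this hunk. The helper name `pct_ram_allocated` is hypothetical.

```python
GIB = 1024 ** 3  # binary unit: 1 GiB = 1024^3 bytes


def pct_ram_allocated(rss_bytes: int, memory_bytes: int) -> float:
    """Physical RAM used as a percentage of the requested memory (hypothetical helper)."""
    if memory_bytes <= 0:
        raise ValueError("memory_bytes must be positive")
    return 100.0 * rss_bytes / memory_bytes


requested = int(1.5 * GIB)  # assumed `memory` request, for illustration only

# `malloc_fill` touches ~1 GiB of RAM.
print(round(pct_ram_allocated(1 * GIB, requested)))  # 67
# `malloc` reserves virtual memory but uses ~0 GiB of RAM.
print(round(pct_ram_allocated(0, requested)))  # 0
```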

:::{warning}
@@ -274,7 +274,7 @@ Memory and storage metrics are reported in bytes. For example, 1 KB = $1024$ byt

**Job Duration** plots report how long each process took to run. They have two tabs. The **Raw Usage** tab shows the job duration and the **% Allocated** tab shows the job duration relative to the time requested using the `time` directive. Job duration is sometimes known as elapsed real time, real time, or wall time.

```{image} _static/report-resource-job-duration.png
```{image} ../_static/report-resource-job-duration.png
```

## I/O Usage
@@ -306,10 +306,10 @@ workflow{

The **Read** tab shows that ~1 GiB and ~256 MB are read:

```{image} _static/report-resource-io-read.png
```{image} ../_static/report-resource-io-read.png
```

The **Write** tab shows that ~1 GiB and ~256 MB are written:

```{image} _static/report-resource-io-write.png
```{image} ../_static/report-resource-io-write.png
```