Reporting

Flowcept can generate summarized reports from provenance records.

Current report implementations:

report_type="provenance_card" with format="markdown" (default)
report_type="provenance_report" with format="pdf" (executive PDF with plots)

API

Use:

from flowcept import Flowcept

# Default path: markdown provenance card.
# my_records is a list of dictionaries already loaded in memory.
Flowcept.generate_report(
    report_type="provenance_card",
    format="markdown",
    output_path="PROVENANCE_CARD.md",
    records=my_records,  # alternatively input_jsonl_path=..., or workflow_id=... / campaign_id=...
)

Markdown Provenance Cards (Default)

Markdown provenance cards are the default reporting mode.

from flowcept import Flowcept

# 1) Generate from workflow_id (DB-backed mode)
Flowcept.generate_report(
    report_type="provenance_card",
    format="markdown",
    workflow_id="20c5939f-f3ee-4031-9303-a9e68a5a8092",
    output_path="PROVENANCE_CARD.md",
)

# 2) Generate from in-memory records
Flowcept.generate_report(
    report_type="provenance_card",
    format="markdown",
    records=my_records,
    output_path="PROVENANCE_CARD_FROM_RECORDS.md",
)

# 3) Generate from Flowcept JSONL buffer
Flowcept.generate_report(
    report_type="provenance_card",
    format="markdown",
    input_jsonl_path="/tmp/flowcept_buffer.jsonl",
    output_path="PROVENANCE_CARD_FROM_JSONL.md",
)
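
If the JSONL buffer needs to be inspected or filtered before reporting, one option is to load it yourself and pass the result through records instead. The following is a minimal sketch, assuming the buffer holds one JSON object per line; load_jsonl_records is a hypothetical helper and not part of Flowcept:

```python
import json
from pathlib import Path


def load_jsonl_records(path):
    """Read a JSONL file into a list of dicts, skipping blank lines.

    Hypothetical helper: assumes one JSON object per non-empty line.
    """
    records = []
    with Path(path).open("r", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records
```

The returned list could then be passed as records=... in place of input_jsonl_path=....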

Render Markdown Directly in Terminal (Rich)

Set print_markdown=True to also render the generated markdown report in the terminal using Rich:

from flowcept import Flowcept

Flowcept.generate_report(
    report_type="provenance_card",
    format="markdown",
    records=my_records,
    output_path="PROVENANCE_CARD.md",
    print_markdown=True,
)

If Rich is not installed and print_markdown=True, Flowcept raises an error. Install Rich via:

pip install "flowcept[extras]"
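
To avoid that error in scripts that may run with or without Rich, a caller could enable terminal rendering only when the package is importable. This is a sketch using only the standard library; module_available is a hypothetical helper name, not a Flowcept function:

```python
import importlib.util


def module_available(name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(name) is not None


# Request terminal rendering only when Rich is present.
print_markdown = module_available("rich")
```

The resulting boolean can be passed as print_markdown=print_markdown to Flowcept.generate_report.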

Input Modes

Exactly one input mode must be provided:

input_jsonl_path: read from a Flowcept JSONL buffer file.
records: list of dictionaries already loaded in memory.
workflow_id or campaign_id: query workflow, task, and object documents from the database.
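
The "exactly one input mode" rule can be illustrated with a small check. This is a sketch of the rule as stated above, not Flowcept's actual validation code; resolve_input_mode is a hypothetical name, and treating workflow_id and campaign_id as a single DB-backed mode is an assumption:

```python
def resolve_input_mode(input_jsonl_path=None, records=None,
                       workflow_id=None, campaign_id=None):
    """Return the name of the single input mode in use, or raise ValueError."""
    modes = {
        "input_jsonl_path": input_jsonl_path is not None,
        "records": records is not None,
        # Assumption: workflow_id and campaign_id both select the DB-backed mode.
        "db": workflow_id is not None or campaign_id is not None,
    }
    chosen = [name for name, used in modes.items() if used]
    if len(chosen) != 1:
        raise ValueError(f"Expected exactly one input mode, got: {chosen or 'none'}")
    return chosen[0]
```

Calling it with only records=[...] returns "records", while passing both records and workflow_id, or nothing at all, raises ValueError.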

Aggregation

The provenance card is summarized, not raw-dump oriented.

Grouping key: activity_id.
Per-group summary includes:
- number of task records aggregated (n_tasks)
- status counts
- timing aggregates (median/summary fields)

The generated output documents this aggregation method in its Aggregation Method section.
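
The grouping described in this section can be sketched in a few lines. This is an illustration rather than Flowcept's implementation: the record fields status, started_at, and ended_at (as numeric timestamps) and the output key median_elapsed_s are assumptions made for the example:

```python
from collections import Counter, defaultdict
from statistics import median


def summarize_by_activity(tasks):
    """Group task records by activity_id and compute per-group summaries."""
    groups = defaultdict(list)
    for task in tasks:
        groups[task["activity_id"]].append(task)

    summary = {}
    for activity_id, items in groups.items():
        # Assumed fields: numeric start/end timestamps in seconds.
        elapsed = [t["ended_at"] - t["started_at"] for t in items]
        summary[activity_id] = {
            "n_tasks": len(items),
            "status_counts": dict(Counter(t["status"] for t in items)),
            "median_elapsed_s": median(elapsed),
        }
    return summary


# Made-up records for illustration only.
tasks = [
    {"activity_id": "get_dataset", "status": "FINISHED", "started_at": 0.0, "ended_at": 0.5},
    {"activity_id": "train_and_validate", "status": "FINISHED", "started_at": 1.0, "ended_at": 2.0},
    {"activity_id": "train_and_validate", "status": "FINISHED", "started_at": 1.0, "ended_at": 4.0},
    {"activity_id": "train_and_validate", "status": "ERROR", "started_at": 1.0, "ended_at": 3.0},
]
summary = summarize_by_activity(tasks)
```

Here the three train_and_validate records collapse into one row with n_tasks equal to 3, which is why the card stays compact even for large parameter sweeps.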

Object Metadata Summary

When objects are present, reports include metadata-only summaries:

counts by type
counts by storage mode (in_object vs gridfs)
linkage counts (task/workflow-linked)
object version and size summaries

Blob payload bytes are excluded from report rendering.
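
A metadata-only summary of this kind could look like the sketch below. The field names type, storage, version, size_bytes, and task_id are assumptions chosen for the example and may not match Flowcept's object schema; note that the function never touches payload bytes:

```python
from collections import Counter


def summarize_objects(objects):
    """Summarize object metadata without reading any payload bytes."""
    sizes = [obj["size_bytes"] for obj in objects]
    return {
        "total_objects": len(objects),
        "by_type": dict(Counter(obj["type"] for obj in objects)),
        "by_storage": dict(Counter(obj["storage"] for obj in objects)),
        # An object counts as task-linked when it carries a non-empty task_id.
        "task_linked": sum(1 for obj in objects if obj.get("task_id")),
        "max_version": max((obj["version"] for obj in objects), default=None),
        "total_size_bytes": sum(sizes),
        "max_size_bytes": max(sizes, default=None),
    }


# Made-up metadata for illustration only.
objects = [
    {"type": "dataset", "storage": "in_object", "version": 0, "size_bytes": 4096, "task_id": "t1"},
    {"type": "ml_model", "storage": "gridfs", "version": 2, "size_bytes": 2048, "task_id": "t2"},
    {"type": "ml_model", "storage": "gridfs", "version": 1, "size_bytes": 1024, "task_id": None},
]
object_summary = summarize_objects(objects)
```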

Real Example (Rendered in RST)

The example below reproduces the content of a generated markdown card titled "Workflow Provenance Card: Perceptron GridSearch".

Summary

Workflow Name: Perceptron GridSearch
Workflow ID: 20c5939f-f3ee-4031-9303-a9e68a5a8092
Campaign ID: 661344de-ddf4-497d-a5ba-0d01c67cfb79
Execution Start (UTC): 2026-02-19 05:05:10
Execution End (UTC): 2026-02-19 05:05:12
Total Elapsed (s): 1.501
User: rsr
System Name: Darwin
Environment ID: laptop
Workflow Subtype: ml_workflow
Code Repository: branch=skills, short_sha=f3df676, dirty=dirty
Git Remote: git@github.com:ORNL/flowcept.git
Workflow args:
- python_random_seeded: True
- seed: 42
- torch_cuda_manual_seeded: False
- torch_cudnn_benchmark: False
- torch_cudnn_deterministic: True
- torch_deterministic_algorithms: True
- torch_manual_seeded: True

Workflow-level Summary

Total Activities: 3
Status Counts: {'FINISHED': 7}
Total Elapsed Workflow Time (s): 1.501
- train_and_validate: 0.088 s
- get_dataset: 0.056 s
- select_best_model: 0.041 s
Resource Totals:
- Memory Used: 7.78 MB
- Average CPU (%): 54.1%
- IO:
  - Read: 38.49 MB
  - Write: 55.11 MB
  - Read Ops: 1,454
  - Write Ops: 155
Key Observations:
- Slowest Activity: train_and_validate at 0.088 s
- Largest IO Activity: train_and_validate with Read 31.74 MB and Write 52.10 MB

Workflow Structure

input data
        │
        ▼
 get_dataset
        │
 train_and_validate
        │
 select_best_model
        ▼
 output data

Timing Report

Rows are sorted by First Started At (ascending).

Activity	Status Counts	First Started At	Last Ended At	Median Elapsed (s)
get_dataset	{'FINISHED': 1}	2026-02-19 05:05:10	2026-02-19 05:05:10	0.056
train_and_validate	{'FINISHED': 5}	2026-02-19 05:05:10	2026-02-19 05:05:12	0.088
select_best_model	{'FINISHED': 1}	2026-02-19 05:05:12	2026-02-19 05:05:12	0.041

Per Activity Details

get_dataset (subtype=``dataprep``)
- Used:
  - n_samples: 120
  - split_ratio: 0.8
- Generated:
  - dataset_id: f1e918cc-a3eb-4dd8-8036-5f6e4fc140d1
  - x_train_shape: [96, 2]
  - x_val_shape: [24, 2]
  - y_train_shape: [96, 1]
  - y_val_shape: [24, 1]
train_and_validate (n=5, subtype=``learning``)
- Used (aggregated): includes epochs, learning_rate, n_input_neurons, config_id, and other fields.
- Generated (aggregated): includes best_val_loss, val_loss, val_accuracy, and model object ids.
select_best_model (subtype=``model_selection``)
- Generated:
  - selected_config_id: cfg_5
  - selected_loss: 0.0490574836730957
  - selected_model_object_id: ae18a739-1ffe-45a8-ae64-827a079579a6

Workflow-level Resource Usage

Metric	Value
Telemetry Samples (task start/end pairs)	7
CPU User Time Delta	7.380
CPU System Time Delta	1.940
Average CPU (%) Delta	54.1%
Average CPU Frequency	3,228
Memory Used Delta	7.78 MB
Average Memory (%)	73.7%
Average Swap (%)	90.0%
Disk Read Time Delta (ms)	224.000
Disk Write Time Delta (ms)	14.000
Disk Busy Time Delta (ms)	0.000

Object Artifacts Summary

Metric	Value
Total Objects	6
By Type	{'dataset': 1, 'ml_model': 5}
By Storage	{'in_object': 1, 'gridfs': 5}
Task-linked Objects	6
Workflow-linked Objects	6
Max Version	7
Total Size	13.66 KB
Average Size	2.28 KB
Max Size	4.10 KB

Object Details by Type

Datasets
- f1e918cc-a3eb-4dd8-8036-5f6e4fc140d1
  - version: 0
  - storage: in_object
  - size: 4.10 KB
  - task_id: 1771477510.9383209
  - workflow_id: 20c5939f-f3ee-4031-9303-a9e68a5a8092
  - timestamp: 2026-02-19 05:05:10
  - sha256: 7d7b4be35ea11f66e9a785d1b39cfb8fc31f8fd23020bc74918872ab5855253c
Models
- ae18a739-1ffe-45a8-ae64-827a079579a6
  - version: 7
  - storage: gridfs
  - size: 1.91 KB
  - tags: best
  - custom_metadata includes checkpoint_epoch, class, config_id, learning_rate, loss, and model_profile.

Aggregation Method

Grouping key: activity_id.
Each grouped row may aggregate multiple task records (n_tasks).
Aggregated metrics currently include count/status/timing.

Generator footer example:

Provenance card generated by Flowcept | GitHub | Version: 0.9.14 on Feb 19, 2026 at 12:05 AM EST

PDF Reports (Optional)

PDF reports are intended for executive-friendly rendering and include plots. They require the optional report_pdf dependencies:

pip install "flowcept[report_pdf]"

from flowcept import Flowcept

# 1) Generate PDF from workflow_id (DB-backed mode)
stats = Flowcept.generate_report(
    report_type="provenance_report",
    format="pdf",
    workflow_id="5def1173-d417-420b-a7ed-61ada01772cd",
    output_path="PROVENANCE_REPORT.pdf",
)
print(stats["output"])

# 2) Generate PDF from in-memory records
Flowcept.generate_report(
    report_type="provenance_report",
    format="pdf",
    records=my_records,
    output_path="PROVENANCE_REPORT_FROM_RECORDS.pdf",
)

# 3) Generate PDF from a Flowcept JSONL file
Flowcept.generate_report(
    report_type="provenance_report",
    format="pdf",
    input_jsonl_path="/tmp/flowcept_buffer.jsonl",
    output_path="PROVENANCE_REPORT_FROM_JSONL.pdf",
)

PDF report plots include:

Top slowest activities
Top fastest activities
Most resource-demanding activities (IO)
Telemetry-aware charts when telemetry fields are available
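
The slowest and fastest rankings behind such plots can be sketched as a simple sort. How Flowcept selects and draws these series is not described here, so the helper below is hypothetical; the timings are the median elapsed values from the example card above:

```python
def rank_activities(median_elapsed, top_n=3):
    """Return (slowest, fastest) lists of (activity, seconds) pairs."""
    ordered = sorted(median_elapsed.items(), key=lambda kv: kv[1], reverse=True)
    slowest = ordered[:top_n]
    fastest = ordered[::-1][:top_n]
    return slowest, fastest


timings = {
    "get_dataset": 0.056,
    "train_and_validate": 0.088,
    "select_best_model": 0.041,
}
slowest, fastest = rank_activities(timings, top_n=2)
```

With these values, train_and_validate leads the slowest list and select_best_model leads the fastest list.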
