Update process results workflow for new website #919
Open: lazappi wants to merge 57 commits into main from feature/no-ref/update-process-results
57 commits
e734440 Add combine_output reporting component (lazappi)
45abf32 Update process_task_results to combine outputs (lazappi)
37042df Update get_task_info to new schema (lazappi)
a747b9b Update get_method_info to new schema (lazappi)
e3439c9 Update get_metric_info to new schema (lazappi)
fdbdff8 Update get_dataset_info to new schema (lazappi)
00be327 Update get_results to new schema (lazappi)
d318832 Add metric resources to results component/schema (lazappi)
d5d1b8f Update generate_qc to match new schema (lazappi)
11d6d9d Update combine_output to new schema (lazappi)
6a91673 Update viash version and reference (lazappi)
5f70ab9 Update process_results workflow to new components (lazappi)
99a4a50 Add render_report component (lazappi)
264790e Add render_report to process results workflow (lazappi)
272bba8 Handle missing values in generate_qc() (lazappi)
8ab4f60 Handle missing controls in results report (lazappi)
d8fdfd9 update common submodule (rcannood)
15a2d18 Merge remote-tracking branch 'origin/main' into feature/no-ref/update… (rcannood)
e9e3ef1 Merge remote-tracking branch 'origin' into feature/no-ref/update-proc… (lazappi)
cbf0e50 Merge branch 'feature/no-ref/update-process-results' of github.com:op… (lazappi)
c1198e5 Strip quotes from descriptions/summaries (lazappi)
a7b9c6d Add roles to author details (lazappi)
19ecc9d Add QC check for number of successful controls (lazappi)
dc75604 Handle missing exit codes in report (lazappi)
382a35d Add schema validation to process_results workflow (lazappi)
caad282 Fix combine_output image version (lazappi)
4b3b585 Handle alternative field names in get_dataset_info (lazappi)
8d6a370 Handle v1 slots in get_method_info (lazappi)
186133f Handle null author fields in report (lazappi)
8ba2b4c Add missing information in control QC checks (lazappi)
cd2eef7 Handle old doc URL location in get_method_info (lazappi)
c44f13e Prefix component additional info in get_metric_info (lazappi)
bca05d1 Cleanup removed argument in get_results (lazappi)
22361be Fix test script for generate_qc (lazappi)
14861b4 Add authors to datasets, methods, metrics (lazappi)
2f7943a schemas were moved to the common_resources repo (rcannood)
de9e0b2 fix schema paths (rcannood)
01acc49 set common submodule to different branch for testing (rcannood)
a40114e Fix resource (rcannood)
3098da9 fix schema paths in the script (rcannood)
7044804 authors and references were moved into core (rcannood)
05a33b2 add a params placeholder for ease of use (rcannood)
5ab5c3f show number of passed checks as well (rcannood)
8d14bc8 fix result schema path (rcannood)
3fbbbf6 Add bibliography file (lazappi)
5276ba0 Add shared util functions (lazappi)
072addf Use shared functions for authors and references (lazappi)
664eeb5 update submodule (#934) (rcannood)
ada1876 Add scripts/create_resources/task_results_v4 (lazappi)
55b9e23 Update main reference (lazappi)
2c4ce88 Use temporary directory in render-report (lazappi)
6572871 Style reporting R scripts (lazappi)
18cd13d add auto wf (rcannood)
52d7f17 add script to reprocess task results (rcannood)
2239144 Handle missing scaled scores in generate_qc (lazappi)
9c1cd26 Set unknown error in get_results (lazappi)
5a18f35 fix script (rcannood)
Submodule common
updated
8 files
@@ -0,0 +1,83 @@
#!/bin/bash

# exit on the first failing command, including a failed repository lookup
set -e

# get the root of the repository
REPO_ROOT=$(git rev-parse --show-toplevel)

# ensure that the commands below are run from the root of the repository
cd "$REPO_ROOT"

OUT_DIR="resources"

echo ">>> Fetching raw results..."
aws s3 sync --profile op \
  s3://openproblems-data/resources/ \
  "$OUT_DIR/" \
  --exclude "*" \
  --include "**/results/run_*/*" \
  --delete

echo ">>> Patch state.yaml files..."
# fix state.yaml id and output_trace
# (the heredoc delimiter is quoted so the shell leaves the Python code untouched)
python <<'HERE'
import re
import glob

def update_state_file(file_path, new_id):
    with open(file_path, 'r') as file:
        content = file.read()

    # if output_trace is missing, add it
    if 'output_trace:' not in content:
        content += "\noutput_trace: !file trace.txt\n"

    # replace the top-level id with the run path taken from the file location;
    # the pattern is anchored so keys that merely end in "id" are left alone
    content = re.sub(r'^id: .+$', f'id: {new_id}/processed', content, flags=re.MULTILINE)

    with open(file_path, 'w') as file:
        file.write(content)

# find all state.yaml files
state_files = glob.glob('resources/**/state.yaml', recursive=True)
for state_file in state_files:
    # extract the id from the path
    match = re.search(r'resources/(.+?)/state\.yaml', state_file)
    if match:
        new_id = match.group(1)
        update_state_file(state_file, new_id)
        print(f"Updated {state_file} with id: {new_id}")
    else:
        print(f"Could not extract id from {state_file}, skipping.")
HERE

echo ">>> Creating params.yaml..."
# this heredoc is unquoted on purpose: $OUT_DIR is expanded by the shell,
# while \$id is escaped so that a literal $id reaches the workflow
cat > /tmp/params.yaml << HERE
input_states: resources/*/results/run_*/state.yaml
rename_keys: 'input_task_info:output_task_info;input_dataset_info:output_dataset_info;input_method_configs:output_method_configs;input_metric_configs:output_metric_configs;input_scores:output_scores;input_trace:output_trace'
output_state: '\$id/state.yaml'
settings: '{"output_combined": "\$id/output_combined.json", "output_report": "\$id/output_report.html", "output_task_info": "\$id/output_task_info.json", "output_dataset_info": "\$id/output_dataset_info.json", "output_method_info": "\$id/output_method_info.json", "output_metric_info": "\$id/output_metric_info.json", "output_results": "\$id/output_results.json", "output_scores": "\$id/output_quality_control.json"}'
publish_dir: "$OUT_DIR"
HERE

echo ">>> Processing results..."
nextflow run target/nextflow/reporting/process_task_results/main.nf \
  -profile docker \
  -params-file /tmp/params.yaml \
  -c common/nextflow_helpers/labels_ci.config \
  -entry auto \
  -resume

# find all files in $OUT_DIR named output_report.html
echo ">>> List reports..."
find "$OUT_DIR" -name "output_report.html"

# echo ">>> Uploading processed results to S3..."
# aws s3 sync --profile op \
#   "resources_test/openproblems/task_results_v4/" \
#   "s3://openproblems-data/resources_test/openproblems/task_results_v4/" \
#   --delete --dryrun

# echo
# echo ">>> Done!"
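The patching step embedded in the script above can be pulled out into two small pure functions, which makes it easy to check on sample text before touching any files. This is only a sketch: the function names `extract_id` and `patch_state_text` and the sample paths (`task_a`, `run_1`) are hypothetical, and it assumes the `id:` key sits at the start of a line in `state.yaml`.

```python
import re
from typing import Optional

# Matches only a line that starts with `id:`, so keys such as
# `dataset_id:` are left untouched (assumption: id is a top-level key)
ID_LINE = re.compile(r"^id: .+$", flags=re.MULTILINE)
# Captures the path between `resources/` and `/state.yaml`
STATE_PATH = re.compile(r"resources/(.+?)/state\.yaml")


def extract_id(state_path: str) -> Optional[str]:
    """Return the run path embedded in a state.yaml location, or None."""
    match = STATE_PATH.search(state_path)
    return match.group(1) if match else None


def patch_state_text(content: str, new_id: str) -> str:
    """Add a missing output_trace entry and rewrite the id line."""
    if "output_trace:" not in content:
        content += "\noutput_trace: !file trace.txt\n"
    return ID_LINE.sub(f"id: {new_id}/processed", content, count=1)


if __name__ == "__main__":
    # Hypothetical sample path and file content
    path = "resources/task_a/results/run_1/state.yaml"
    original = "id: run_1\ndataset_id: keep_me\n"
    print(patch_state_text(original, extract_id(path)))
```

Keeping the text transformation separate from file I/O means the same function can be exercised on strings in a test and then applied to each file found by `glob`.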
@@ -0,0 +1,40 @@
#!/bin/bash

# exit on the first failing command, including a failed repository lookup
set -e

# get the root of the repository
REPO_ROOT=$(git rev-parse --show-toplevel)

# ensure that the commands below are run from the root of the repository
cd "$REPO_ROOT"

OUT_DIR="resources_test/openproblems/task_results_v4"

echo ">>> Fetching raw results..."
aws s3 sync --profile op \
  s3://openproblems-data/resources/task_batch_integration/results/run_2025-01-23_18-03-16/ \
  "$OUT_DIR/raw/" \
  --delete

echo
echo ">>> Processing results..."
# start from a clean output directory
if [ -d "$OUT_DIR/processed" ]; then rm -rf "$OUT_DIR/processed"; fi
nextflow run target/nextflow/reporting/process_task_results/main.nf \
  -profile docker \
  --input_task_info "$OUT_DIR/raw/task_info.yaml" \
  --input_dataset_info "$OUT_DIR/raw/dataset_uns.yaml" \
  --input_method_configs "$OUT_DIR/raw/method_configs.yaml" \
  --input_metric_configs "$OUT_DIR/raw/metric_configs.yaml" \
  --input_scores "$OUT_DIR/raw/score_uns.yaml" \
  --input_trace "$OUT_DIR/raw/trace.txt" \
  --output_state state.yaml \
  --publishDir "$OUT_DIR/processed"

echo ">>> Uploading processed results to S3..."
# --dryrun only lists what would change; remove it to perform the upload
aws s3 sync --profile op \
  "resources_test/openproblems/task_results_v4/" \
  "s3://openproblems-data/resources_test/openproblems/task_results_v4/" \
  --delete --dryrun

echo
echo ">>> Done!"
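Because the Nextflow run above takes several minutes to start, it can help to confirm that the sync step actually delivered every raw file the workflow is given. The following is a minimal sketch of such a check; the helper name `missing_raw_inputs` is hypothetical, and the file list is simply copied from the `--input_*` flags in the script.

```python
from pathlib import Path
from typing import List

# Raw files passed to the workflow through the --input_* flags
RAW_INPUTS = [
    "task_info.yaml",
    "dataset_uns.yaml",
    "method_configs.yaml",
    "metric_configs.yaml",
    "score_uns.yaml",
    "trace.txt",
]


def missing_raw_inputs(raw_dir: str) -> List[str]:
    """Return the expected raw result files that are absent from raw_dir."""
    root = Path(raw_dir)
    return [name for name in RAW_INPUTS if not (root / name).is_file()]


if __name__ == "__main__":
    raw_dir = "resources_test/openproblems/task_results_v4/raw"
    print("Missing raw inputs:", missing_raw_inputs(raw_dir))
```

An empty list means all six files are present; anything else names the files to investigate before launching the workflow.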
@@ -0,0 +1,102 @@
name: combine_output
namespace: reporting
description: Combine task outputs into a single JSON

argument_groups:
  - name: Inputs
    arguments:
      - name: --input_task_info
        type: file
        description: Task info file
        info:
          format:
            type: json
            schema: /common/schemas/results_v4/task_info.json
        required: true
        example: resources_test/openproblems/task_results_v4/processed/task_info.json
      - name: --input_dataset_info
        type: file
        description: Dataset info file
        info:
          format:
            type: json
            schema: /common/schemas/results_v4/dataset_info.json
        required: true
        example: resources_test/openproblems/task_results_v4/processed/dataset_info.json
      - name: --input_method_info
        type: file
        description: Method info file
        info:
          format:
            type: json
            schema: /common/schemas/results_v4/method_info.json
        required: true
        example: resources_test/openproblems/task_results_v4/processed/method_info.json
      - name: --input_metric_info
        type: file
        description: Metric info file
        info:
          format:
            type: json
            schema: /common/schemas/results_v4/metric_info.json
        required: true
        example: resources_test/openproblems/task_results_v4/processed/metric_info.json
      - name: --input_results
        type: file
        description: Results file
        info:
          format:
            type: json
            schema: /common/schemas/results_v4/results.json
        required: true
        example: resources_test/openproblems/task_results_v4/processed/results.json
      - name: --input_quality_control
        type: file
        description: Quality control file
        info:
          format:
            type: json
            schema: /common/schemas/results_v4/quality_control.json
        required: true
        example: resources_test/openproblems/task_results_v4/processed/quality_control.json

  - name: Outputs
    arguments:
      - name: --output
        type: file
        direction: output
        description: Combined output JSON
        default: combined_output.json
        info:
          format:
            type: json
            schema: /common/schemas/results_v4/combined_output.json

resources:
  - type: r_script
    path: script.R
  - path: /common/schemas
    dest: schemas

test_resources:
  - type: python_script
    path: /common/component_tests/run_and_check_output.py
  - path: /resources_test/openproblems/task_results_v4
    dest: resources_test/openproblems/task_results_v4

engines:
  - type: docker
    image: openproblems/base_r:1
    setup:
      - type: apt
        packages:
          - nodejs
          - npm
      - type: docker
        run: npm install -g ajv-cli

runners:
  - type: executable
  - type: nextflow
    directives:
      label: [lowmem, lowtime, lowcpu]
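The configuration above declares six JSON inputs and one combined JSON output. As a rough illustration of that contract (not the component's actual R implementation), the sketch below nests each input under a key named after its argument. The helper names `combine_outputs` and `write_combined` are hypothetical, and the `<section>.json` file naming is assumed from the `example` paths in the configuration.

```python
import json
from pathlib import Path
from typing import Any, Dict

# Section names follow the component's --input_* arguments
SECTIONS = [
    "task_info",
    "dataset_info",
    "method_info",
    "metric_info",
    "results",
    "quality_control",
]


def combine_outputs(processed_dir: str) -> Dict[str, Any]:
    """Read <section>.json for every section and nest it under its name."""
    root = Path(processed_dir)
    return {
        name: json.loads((root / f"{name}.json").read_text(encoding="utf-8"))
        for name in SECTIONS
    }


def write_combined(processed_dir: str, output: str) -> None:
    """Write the combined dictionary as pretty-printed JSON."""
    combined = combine_outputs(processed_dir)
    Path(output).write_text(json.dumps(combined, indent=2), encoding="utf-8")
```

A call such as `write_combined("resources_test/openproblems/task_results_v4/processed", "combined_output.json")` would mirror the defaults in the configuration. The sketch does no schema validation.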
@@ -0,0 +1,105 @@
## VIASH START
processed_dir <- "resources_test/openproblems/task_results_v4/processed"

par <- list(
  # Inputs
  input_task_info = paste0(processed_dir, "/task_info.json"),
  input_quality_control = paste0(processed_dir, "/quality_control.json"),
  input_metric_info = paste0(processed_dir, "/metric_info.json"),
  input_method_info = paste0(processed_dir, "/method_info.json"),
  input_dataset_info = paste0(processed_dir, "/dataset_info.json"),
  input_results = paste0(processed_dir, "/results.json"),
  # Outputs
  output = "task_results.json"
)

# Placeholder for interactive runs only (assumes the working directory is the
# repository root, where the schemas live under common/schemas)
meta <- list(
  resources_dir = "common"
)
## VIASH END

################################################################################
#                                MAIN SCRIPT
################################################################################

cat("====== Combine output ======\n")

cat("\n>>> Reading input files...\n")
cat("Reading task info from '", par$input_task_info, "'...\n", sep = "")
task_info <- jsonlite::read_json(par$input_task_info)

cat(
  "Reading quality control from '",
  par$input_quality_control,
  "'...\n",
  sep = ""
)
quality_control <- jsonlite::read_json(par$input_quality_control)

cat("Reading metric info from '", par$input_metric_info, "'...\n", sep = "")
metric_info <- jsonlite::read_json(par$input_metric_info)

cat("Reading method info from '", par$input_method_info, "'...\n", sep = "")
method_info <- jsonlite::read_json(par$input_method_info)

cat("Reading dataset info from '", par$input_dataset_info, "'...\n", sep = "")
dataset_info <- jsonlite::read_json(par$input_dataset_info)

cat("Reading results from '", par$input_results, "'...\n", sep = "")
results <- jsonlite::read_json(par$input_results)

cat("\n>>> Combining outputs...\n")
# Create combined output according to the combined_output.json schema
combined_output <- list(
  task_info = task_info,
  dataset_info = dataset_info,
  method_info = method_info,
  metric_info = metric_info,
  results = results,
  quality_control = quality_control
)

cat("\n>>> Writing output file...\n")
cat("Writing combined output to '", par$output, "'...\n", sep = "")
jsonlite::write_json(
  combined_output,
  par$output,
  pretty = TRUE,
  null = "null",
  na = "null",
  auto_unbox = TRUE
)

cat("\n>>> Validating output against schema...\n")
results_schemas <- file.path(meta$resources_dir, "schemas", "results_v4")
# Schemas referenced by combined_output.json, each passed to ajv with -r
referenced_schemas <- c(
  "task_info.json", "dataset_info.json", "method_info.json",
  "metric_info.json", "results.json", "quality_control.json", "core.json"
)
# Build the arguments as a vector and quote every path, so that spaces in a
# directory name cannot split an argument when system2() invokes the shell
ajv_args <- c(
  "validate",
  "--spec", "draft2020",
  "-s", shQuote(file.path(results_schemas, "combined_output.json")),
  # rbind() interleaves "-r" with each referenced schema path
  as.vector(rbind("-r", shQuote(file.path(results_schemas, referenced_schemas)))),
  "-d", shQuote(par$output)
)

cat("Running validation command:", "ajv", ajv_args, "\n")
cat("Output:\n")
validation_result <- system2("ajv", ajv_args)

if (validation_result == 0) {
  cat("JSON validation passed successfully!\n")
} else {
  cat("JSON validation failed!\n")
  stop("Output JSON does not conform to schema")
}

cat("\n>>> Done!\n")
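The validation step above assembles an `ajv` command from one main schema, seven referenced schemas and the data file. A sketch of the same assembly as an argument list may make the structure easier to see. The helper name `ajv_validate_command` is hypothetical, the flags are the ones the R script passes, and the sketch only builds the command without running `ajv`.

```python
import shlex
from pathlib import PurePosixPath
from typing import List

# Schemas passed with -r, as in the R script
REFERENCED = [
    "task_info.json",
    "dataset_info.json",
    "method_info.json",
    "metric_info.json",
    "results.json",
    "quality_control.json",
    "core.json",
]


def ajv_validate_command(schema_dir: str, data_file: str) -> List[str]:
    """Build the ajv argument list: main schema, references, then data."""
    root = PurePosixPath(schema_dir)
    cmd = ["ajv", "validate", "--spec", "draft2020"]
    cmd += ["-s", str(root / "combined_output.json")]
    for name in REFERENCED:
        cmd += ["-r", str(root / name)]
    cmd += ["-d", data_file]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths, shown as a shell-quoted command line
    command = ajv_validate_command("schemas/results_v4", "combined_output.json")
    print(shlex.join(command))
```

Passing such a list to `subprocess.run(command, check=True)` avoids shell quoting issues altogether, since each element reaches the program as a single argument.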