
Commit 025fbd8

ci: Add CI workflow for lint, typecheck, and CPU tests (#189)
## Summary

- New `ci.yml` workflow with 3 jobs on `ubuntu-latest`, no approval gates:
  - **lint**: `ruff check` + `ruff format --check`
  - **typecheck**: `ty check` across boltz-dev, protenix-dev, rf3-dev (matrix)
  - **cpu-tests**: `pytest -m 'not gpu'` across all 3 envs (412 tests, matrix)
- Switch self-hosted GPU runners from GitHub Actions cache to NFS-backed caching

## Context

Addresses feedback from Karson and Marcus:

- Non-GPU tests now run automatically on PRs without approval
- Formatting/linting enforced in CI matching pre-commit hooks (ruff, ty)
- GPU test approval preserved for pausing during sampleworks machine runs

## Test plan

- [x] Workflow YAML validated
- [x] CPU tests pass on ubuntu-latest
- [x] GPU tests work with NFS cache (pixi install: 11min → 11s, boltz/rf3: 14min → 3.5min)

<!-- This is an auto-generated comment: release notes by coderabbit.ai -->

## Summary by CodeRabbit

* **Chores**
  * Added comprehensive CI workflows with lint, type-check, and multi-environment test runs (including manual trigger) and adjusted GPU install caching.
  * Tightened type-checking rules for stricter diagnostics.
  * Improved runtime robustness and fallback determinism for GPU extension loading.
  * Minor formatting and argument/help string cleanups across utilities.
* **Tests**
  * Made tests more deterministic by adding targeted mocking and cache resets.
  * Improved test reliability and maintainability.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->
1 parent bcfb45c commit 025fbd8

14 files changed

Lines changed: 211 additions & 60 deletions


.github/workflows/ci.yml

Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
+name: CI
+
+on:
+  push:
+    branches: [main]
+    paths:
+      - 'src/**'
+      - 'tests/**'
+      - 'pyproject.toml'
+      - 'pixi.lock'
+      - '.github/workflows/ci.yml'
+      - '.pre-commit-config.yaml'
+  pull_request:
+    branches: [main]
+    paths:
+      - 'src/**'
+      - 'tests/**'
+      - 'pyproject.toml'
+      - 'pixi.lock'
+      - '.github/workflows/ci.yml'
+      - '.pre-commit-config.yaml'
+  workflow_dispatch:
+
+concurrency:
+  group: ci-${{ github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  lint:
+    runs-on: ubuntu-latest
+    timeout-minutes: 10
+    permissions:
+      contents: read
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Install pixi
+        uses: prefix-dev/setup-pixi@v0.8.8
+        with:
+          environments: boltz-dev
+
+      - name: Ruff lint
+        run: pixi run -e boltz-dev ruff check .
+
+      - name: Ruff format check
+        run: pixi run -e boltz-dev ruff format --check .
+
+  typecheck:
+    runs-on: ubuntu-latest
+    timeout-minutes: 15
+    permissions:
+      contents: read
+    strategy:
+      fail-fast: false
+      matrix:
+        environment: [boltz-dev, protenix-dev, rf3-dev]
+
+    name: typecheck (${{ matrix.environment }})
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Install pixi
+        uses: prefix-dev/setup-pixi@v0.8.8
+        with:
+          environments: ${{ matrix.environment }}
+
+      - name: Run ty
+        run: pixi run -e ${{ matrix.environment }} ty check
+
+  cpu-tests:
+    runs-on: ubuntu-latest
+    timeout-minutes: 20
+    permissions:
+      contents: read
+    strategy:
+      fail-fast: false
+      matrix:
+        environment: [boltz-dev, protenix-dev, rf3-dev]
+
+    name: tests (${{ matrix.environment }})
+
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v4
+
+      - name: Install pixi
+        uses: prefix-dev/setup-pixi@v0.8.8
+        with:
+          environments: ${{ matrix.environment }}
+
+      - name: Run CPU tests
+        run: pixi run -e ${{ matrix.environment }} cpu-tests
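
To make the trigger rules concrete: with a `paths` filter, GitHub runs the workflow when at least one changed file matches one of the listed patterns. The sketch below approximates that check in Python. The helper name `workflow_would_run` is hypothetical, and `fnmatch` is looser than GitHub's own glob matcher (its `*` also crosses `/`), so this is only an illustration for the specific patterns used in `ci.yml`, not a reimplementation of GitHub's matching.

```python
from fnmatch import fnmatchcase

# Patterns copied from the `paths` filter in ci.yml.
CI_PATHS = [
    "src/**",
    "tests/**",
    "pyproject.toml",
    "pixi.lock",
    ".github/workflows/ci.yml",
    ".pre-commit-config.yaml",
]


def workflow_would_run(changed_files, patterns=CI_PATHS):
    """Return True if any changed file matches any pattern (hypothetical helper).

    Assumption: fnmatch-style matching is close enough for literal file names
    and `dir/**` patterns; GitHub's real matcher has stricter `*` semantics.
    """
    return any(
        fnmatchcase(path, pattern)
        for path in changed_files
        for pattern in patterns
    )


if __name__ == "__main__":
    print(workflow_would_run(["README.md"]))
    print(workflow_would_run(["README.md", "src/sampleworks/utils/msa.py"]))
```

Under this approximation, a change that only touches files outside the listed paths (for example something under `scripts/`) would not start the workflow on its own.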

.github/workflows/gpu-tests.yml

Lines changed: 2 additions & 0 deletions
@@ -43,6 +43,8 @@ jobs:
 
       - name: Install pixi
         uses: prefix-dev/setup-pixi@19eac09b398e3d0c747adc7921926a6d802df4da # v0.8.8
+        with:
+          cache: false # NFS-backed cache on self-hosted runner handles this
 
       - name: Build CUDA extensions
         run: pixi run -e ${{ matrix.environment }} python3 -c "from sampleworks.core.forward_models.xray.real_space_density_deps.ops.csrc import dilate_points_cuda"

pixi.lock

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default.

pyproject.toml

Lines changed: 15 additions & 0 deletions
@@ -177,4 +177,19 @@ include = ["src/sampleworks/eval/bond_angle_and_length_outlier_eval_script.py"]
 possibly-missing-attribute = "ignore"
 
 [tool.ty.rules]
+# Pre-existing type issues across the codebase; warn instead of error
+# so ty runs in CI without blocking PRs while the team fixes them.
 unresolved-import = "ignore"
+unknown-argument = "warn"
+unresolved-attribute = "warn"
+invalid-argument-type = "warn"
+invalid-assignment = "warn"
+invalid-method-override = "warn"
+invalid-parameter-default = "warn"
+no-matching-overload = "warn"
+not-iterable = "warn"
+not-subscriptable = "warn"
+too-many-positional-arguments = "warn"
+unsupported-operator = "warn"
+unused-ignore-comment = "warn"
+unused-type-ignore-comment = "warn"

scripts/eval/bond_geometry_eval.py

Lines changed: 2 additions & 3 deletions
@@ -40,7 +40,7 @@ def bond_length_violations(pose: AtomArray, tolerance: float = 0.1) -> tuple[flo
     """
     try:
         bounds = check_pose_and_get_bounds(pose)
-    except (ValueError, BadStructureError) as e:
+    except (ValueError, BadStructureError):
         return np.nan, pd.DataFrame()
 
     bond_indices = np.sort(pose.bonds.as_array()[:, :2], axis=1)
@@ -97,13 +97,12 @@ def check_pose_and_get_bounds(pose: AtomArray):
             "`biotite.structure.io.pdbx.get_structure(..., include_bonds=True)`"
         )
         raise ValueError("The structure does not have bonds.")
-
+
     # this fetches values from RDKit, raises BadStructureError if the structure is bad
     bounds = get_distance_bounds(pose)
     return bounds
 
 
-
 def bond_angle_violations(pose: AtomArray, tolerance: float = 0.1) -> tuple[float, pd.DataFrame]:
     """
     Calculate the percentage of bonds that are outside acceptable ranges.

scripts/eval/run_and_process_phenix_clashscore.py

Lines changed: 1 addition & 3 deletions
@@ -37,9 +37,7 @@ def main(args) -> None:
         return
 
     clashscore_df = pd.concat(clashscore_metrics, ignore_index=True)
-    clashscore_df.to_csv(
-        args.grid_search_results_path / "clashscore_metrics.csv", index=False
-    )
+    clashscore_df.to_csv(args.grid_search_results_path / "clashscore_metrics.csv", index=False)
 
 
 def process_one_trial(trial: Trial) -> pd.DataFrame:

scripts/eval/run_and_process_tortoize.py

Lines changed: 10 additions & 11 deletions
@@ -8,7 +8,6 @@
 import pandas as pd
 from loguru import logger
 from pandas import DataFrame
-
 from sampleworks.eval.grid_search_eval_utils import parse_eval_args, setup_evaluation_parameters
 
 
@@ -27,9 +26,7 @@ def main(args: argparse.Namespace) -> None:
     try:
         subprocess.call("tortoize", stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
     except FileNotFoundError:
-        raise RuntimeError(
-            "tortoize is not available, make sure you have installed it."
-        ) from None
+        raise RuntimeError("tortoize is not available, make sure you have installed it.") from None
     # The dropped variable is a list of ProteinConfigs, not used yet in this script
     all_trials, _ = setup_evaluation_parameters(args)
 
@@ -122,13 +119,15 @@ def get_protein_level_z_scores(tortoize_json: dict[str, Any]) -> pd.DataFrame:
     out: list[dict[str, Any]] = []
     model_block = tortoize_json.get("model", {})
     for model_id, model_data in model_block.items():
-        out.append({
-            "model": str(model_id),
-            "ramachandran_z_score": model_data.get("ramachandran-z", None),
-            "ramachandran_jackknife_sd": model_data.get("ramachandran-jackknife-sd", None),
-            "torsion_z_score": model_data.get("torsion-z", None),
-            "torsion_jackknife_sd": model_data.get("torsion-jackknife-sd", None)
-        })
+        out.append(
+            {
+                "model": str(model_id),
+                "ramachandran_z_score": model_data.get("ramachandran-z", None),
+                "ramachandran_jackknife_sd": model_data.get("ramachandran-jackknife-sd", None),
+                "torsion_z_score": model_data.get("torsion-z", None),
+                "torsion_jackknife_sd": model_data.get("torsion-jackknife-sd", None),
+            }
+        )
     return pd.DataFrame(out)
 
 
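
For readers unfamiliar with the shape of the data being flattened in `get_protein_level_z_scores`, here is a minimal standalone sketch of the same loop. It assumes only what the hunk shows: a top-level `"model"` mapping whose values carry the four hyphenated keys. The sample values are made up, the function name `protein_level_rows` is hypothetical, and it returns a plain list of dicts instead of a `pd.DataFrame` to stay dependency-free.

```python
from typing import Any


def protein_level_rows(tortoize_json: dict[str, Any]) -> list[dict[str, Any]]:
    """Flatten the per-model block into one row per model (illustrative only)."""
    rows: list[dict[str, Any]] = []
    for model_id, model_data in tortoize_json.get("model", {}).items():
        rows.append(
            {
                "model": str(model_id),
                # .get() yields None for keys missing from the JSON.
                "ramachandran_z_score": model_data.get("ramachandran-z"),
                "ramachandran_jackknife_sd": model_data.get("ramachandran-jackknife-sd"),
                "torsion_z_score": model_data.get("torsion-z"),
                "torsion_jackknife_sd": model_data.get("torsion-jackknife-sd"),
            }
        )
    return rows


# Made-up input; only the key names are taken from the diff above.
EXAMPLE = {"model": {"1": {"ramachandran-z": -1.2, "torsion-z": 0.4}}}

if __name__ == "__main__":
    print(protein_level_rows(EXAMPLE))
```

Missing keys come through as `None`, and an input without a `"model"` block produces no rows.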

src/sampleworks/core/forward_models/xray/real_space_density_deps/ops/csrc/__init__.py

Lines changed: 1 addition & 0 deletions
@@ -45,4 +45,5 @@ def _ensure_toolchain_env() -> None:
     CUDA_AVAILABLE = True
 except Exception as e:
     print(f"CUDA extension loading failed: {e}")
+    dilate_points_cuda = None
     CUDA_AVAILABLE = False
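
The added line binds `dilate_points_cuda` even when loading fails, so `from ... import dilate_points_cuda` no longer raises `ImportError` on machines without the extension; callers get `None` and can branch on it. A minimal sketch of that pattern follows. The module name `_hypothetical_cuda_ext`, the wrapper `dilate_points`, and its CPU placeholder are assumptions for illustration and do not reflect the real extension or algorithm.

```python
import importlib

try:
    # Hypothetical module name standing in for the compiled CUDA extension.
    _ext = importlib.import_module("_hypothetical_cuda_ext")
    dilate_points_cuda = _ext.dilate_points_cuda
    CUDA_AVAILABLE = True
except Exception as e:
    print(f"CUDA extension loading failed: {e}")
    # Bind the name anyway so importers receive None instead of an ImportError.
    dilate_points_cuda = None
    CUDA_AVAILABLE = False


def dilate_points(points, radius):
    """Use the CUDA kernel when it loaded, otherwise a CPU path (illustrative)."""
    if CUDA_AVAILABLE and dilate_points_cuda is not None:
        return dilate_points_cuda(points, radius)
    # Placeholder arithmetic only; not the real dilation algorithm.
    return [p * radius for p in points]


if __name__ == "__main__":
    print(CUDA_AVAILABLE, dilate_points([1.0, 2.0], 2.0))
```

This also makes the fallback easy to mock in CPU-only tests, since the name always exists on the module.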

src/sampleworks/eval/grid_search_eval_utils.py

Lines changed: 6 additions & 8 deletions
@@ -11,7 +11,7 @@
 
 from loguru import logger
 from sampleworks.eval.constants import OCCUPANCY_LEVELS
-from sampleworks.eval.eval_dataclasses import Trial, TrialList, ProteinConfig
+from sampleworks.eval.eval_dataclasses import ProteinConfig, Trial, TrialList
 from sampleworks.eval.occupancy_utils import extract_protein_and_occupancy
 from sampleworks.utils.guidance_constants import StructurePredictor
 
@@ -175,22 +175,22 @@ def parse_eval_args(description: str | None = None):
         type=Path,
         required=True,
         help="Path to the top-level grid search results directory, usu. called "
-             "``grid_search_results``",
+        "``grid_search_results``",
     )
     # not technically used everywhere yet, but requiring it future-proofs.
     parser.add_argument(
         "--grid-search-inputs-path",
         type=Path,
         required=True,
         help="Path to the directory containing the grid search inputs, in particular "
-             "the protein configuration CSV file, maps, and reference structures.",
+        "the protein configuration CSV file, maps, and reference structures.",
         default=None,
     )
     parser.add_argument(
         "--protein-configs-csv",
         type=Path,
         help="Path to the CSV file containing protein configurations, like "
-             "``${HOME}/configs.csv``. Defaults to sampleworks/data/protein_configs.csv",
+        "``${HOME}/configs.csv``. Defaults to sampleworks/data/protein_configs.csv",
         default=files("sampleworks.data") / "protein_configs.csv",
     )
     parser.add_argument(
@@ -215,7 +215,7 @@ def parse_eval_args(description: str | None = None):
 
 
 def setup_evaluation_parameters(
-    args: argparse.Namespace
+    args: argparse.Namespace,
 ) -> tuple[TrialList, dict[str, ProteinConfig]]:
     grid_search_dir = Path(args.grid_search_results_path)
 
@@ -227,9 +227,7 @@ def setup_evaluation_parameters(
     logger.info(f"Proteins configured: {list(protein_configs.keys())}")
 
     # Scan for experiments (look for refined.cif files)
-    all_trials = scan_grid_search_results(
-        grid_search_dir, target_filename=args.target_filename
-    )
+    all_trials = scan_grid_search_results(grid_search_dir, target_filename=args.target_filename)
     logger.info(f"Found {len(all_trials)} experiments with refined.cif files")
 
     if all_trials:

src/sampleworks/utils/msa.py

Lines changed: 5 additions & 5 deletions
@@ -52,8 +52,8 @@ def _validate_msa_cache_contents(msa_hash: str, msa_dir: Path) -> None:
         raise FileNotFoundError(f"No A3M files found for hash {msa_hash} in {msa_dir}")
 
     # Validate that we have matching pairs
-    csv_indices = {int(f.stem.split('_')[-1]) for f in csv_files}
-    a3m_indices = {int(f.stem.split('_')[-1]) for f in a3m_files}
+    csv_indices = {int(f.stem.split("_")[-1]) for f in csv_files}
+    a3m_indices = {int(f.stem.split("_")[-1]) for f in a3m_files}
 
     if csv_indices != a3m_indices:
         raise ValueError(
@@ -67,16 +67,16 @@ def _validate_msa_cache_contents(msa_hash: str, msa_dir: Path) -> None:
         a3m_path = msa_dir / f"{msa_hash}_{idx}.a3m"
 
         # Read CSV sequences (skip header, take second column)
-        with csv_path.open('r') as f:
+        with csv_path.open("r") as f:
             csv_lines = f.readlines()
 
         if not csv_lines or csv_lines[0].strip() != "key,sequence":
             raise ValueError(f"Invalid CSV header in {csv_path}")
 
-        csv_sequences = [line.strip().split(',', 1)[1] for line in csv_lines[1:] if line.strip()]
+        csv_sequences = [line.strip().split(",", 1)[1] for line in csv_lines[1:] if line.strip()]
 
         # Read A3M sequences (every other line, skipping headers)
-        with a3m_path.open('r') as f:
+        with a3m_path.open("r") as f:
             a3m_lines = f.readlines()
 
         # A3M format: header lines start with '>', sequences on alternating lines
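
The pairing check in `_validate_msa_cache_contents` relies on file names of the form `<hash>_<idx>.csv` and `<hash>_<idx>.a3m`, taking the trailing integer of each stem and comparing the two sets. A small standalone sketch of that idea is below. The helper names and sample file names are hypothetical, and it uses `PurePosixPath` so nothing has to exist on disk.

```python
from pathlib import PurePosixPath


def file_indices(paths):
    """Extract the trailing integer index from stems like '<hash>_<idx>'."""
    return {int(p.stem.split("_")[-1]) for p in paths}


def check_matching_pairs(csv_files, a3m_files):
    """Raise ValueError unless every CSV index has a matching A3M index."""
    csv_indices = file_indices(csv_files)
    a3m_indices = file_indices(a3m_files)
    if csv_indices != a3m_indices:
        raise ValueError(
            f"Mismatched MSA files: csv={sorted(csv_indices)}, a3m={sorted(a3m_indices)}"
        )
    return csv_indices


if __name__ == "__main__":
    # Made-up file names following the '<hash>_<idx>' convention.
    csvs = [PurePosixPath("abc123_0.csv"), PurePosixPath("abc123_1.csv")]
    a3ms = [PurePosixPath("abc123_0.a3m"), PurePosixPath("abc123_1.a3m")]
    print(check_matching_pairs(csvs, a3ms))
```

Comparing sets rather than lists means file ordering from a directory listing does not affect the result.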
