feat: translate remaining SAS analysis pipeline to R #8

Open
Copilot wants to merge 36 commits into main from copilot/translate-remaining-sas-files

Conversation

Contributor

Copilot AI commented May 4, 2026

Translates the remaining SAS files from the García-Albéniz et al. (2020) analysis pipeline into exported R functions, completing the core statistical methodology for target trial emulation. Includes correctness fixes applied in response to code review feedback.

New functions

| Function(s) | SAS source | Purpose |
| --- | --- | --- |
| compute_ipw_weights() | cann17b + cann18 | Stabilized cumulative IPW weights for STOPBASE/CONTINUE arms; separate per-arm 99th-percentile truncation; single-arm input guard |
| predict_survival_unadjusted() | cann15 | Pooled logistic regression + g-computation, unadjusted |
| predict_survival_baseline_adjusted() | cann16 | Same, with baseline covariate adjustment |
| predict_survival_ipw() | cann21 | Same, IPW-weighted |
| fit_outcome_hr() | cann20 | IPW-weighted logistic regression; returns baseline-time OR + consistent Wald 95% CI as HR approximation; single-arm guard |
| bootstrap_ci() | cann23 + cann24 | Nonparametric bootstrap percentile CIs for survival difference; failure tracking with cli::cli_warn(); RNG state restored on exit; n_boot validated to be ≥ 1 |
| false_positives() | cann30 | False positive rate by arm and screening round |
| extract_screening_mammograms() / extract_any_mammograms() / extract_diagnostic_mammograms() | b02_* | Medicare claims extraction templates with standard HCPCS codes |

Time in all survival models is modeled with restricted cubic splines via splines::ns() (knots at months 6, 48, 72, matching the SAS %RCSPLINE macro). splines and cli added to Imports.
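
As a rough illustration of the spline setup described above, the sketch below builds a natural-spline basis with `splines::ns()`, treating the first and last entries of the knot vector as boundary knots and the remaining entries as interior knots. The helper name `ns_basis_sketch()` and the 0 to 95 month grid are assumptions made for this example; the package's own `compute_ns_basis()` may differ in detail.

```r
library(splines)

# Hypothetical stand-in for the helper described in this PR (not the package code).
# Assumes rcs_knots is sorted in increasing order.
ns_basis_sketch <- function(month, rcs_knots = c(6, 48, 72)) {
  stopifnot(is.numeric(rcs_knots), length(rcs_knots) >= 3)
  k <- length(rcs_knots)
  basis <- splines::ns(
    month,
    knots = rcs_knots[-c(1, k)],          # interior knots
    Boundary.knots = rcs_knots[c(1, k)]   # first and last entries
  )
  # Name the columns ns1, ns2, ..., nsK so model formulas can refer to them.
  colnames(basis) <- paste0("ns", seq_len(ncol(basis)))
  basis
}

basis <- ns_basis_sketch(0:95)
dim(basis)       # one row per month, length(rcs_knots) - 1 columns
colnames(basis)
```

With the default three knots there is one interior knot, so the basis has two columns, which agrees with the `length(rcs_knots) - 1` column count documented for the helper.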

Correctness fixes (from code review)

  • predict_survival_*:
      • max_month filter is now applied before fitting the pooled logistic model (not only at prediction time), matching the SAS cann15/16/21 cutoff behaviour.
      • Added id_col parameter; fixed hardcoded "id" column in standardize_survival().
      • check_both_arms() is now called on both the raw input and the filtered fit_data (after max_month/NA outcome filtering) to catch cases where filtering removes all rows for one arm.
      • check_both_arms() now also rejects NA arm values and any arm values other than "STOPBASE" or "CONTINUE", erroring instead of silently misclassifying unexpected inputs.
      • standardize_survival() now errors with a clear message when no baseline (month_col == 0) rows are found in fit_data, preventing silent return of a spurious perfect-survival curve.
      • weight_col is validated to exist in fit_data and to be numeric before being passed to glm(), preventing silent unweighted fitting when the column is missing.
      • id_col is validated to exist in fit_data before use in standardize_survival(), replacing a low-level error with a clear cli_abort() message.
      • The spline basis is now computed dynamically via a compute_ns_basis() helper that renames its output columns to ns1, ns2, ..., nsK: all columns are generated from the full rcs_knots vector (not just the first two), so no spline terms are silently dropped when more than one interior knot is specified.
      • rcs_knots is validated to have at least three entries (two boundary + at least one interior knot) and its documentation now correctly states it accepts a vector of at least 3 elements; compute_ns_basis() @return docs corrected to state the basis has length(rcs_knots) - 1 columns (not length(rcs_knots) - 2).
      • standardize_survival() reuses compute_ns_basis() at prediction time, keeping fit and prediction bases consistent.
      • covariate_cols in predict_survival_ipw() is now documented to require baseline-measured covariates, matching the standardization logic that operates over baseline rows only.
  • compute_ipw_weights:
      • 99th-percentile truncation is now computed separately within each arm (STOPBASE and CONTINUE), reproducing the two-step cann18 approach. Arm-specific truncation guards against single-arm input.
      • The anymammo_col parameter has been removed; tslm_lag already measures months since the last mammogram of any kind (screening or diagnostic) and implicitly captures compliance-window resets.
      • CONTINUE-arm weight logic corrected: (a) weight now updates at every month in the compliance window (tslm_lag 11–13), not only when scrmammo == 1; (b) numerator now uses conditional probabilities under the discrete uniform distribution (1/3 at month 11, 1/2 at month 12, 1 at month 13), matching SAS cann17b; (c) weight updates now stop after a breast-cancer diagnosis (bc_month_col and month2_col passed to internal helper); (d) NA predicted probabilities in the CONTINUE-arm helper are now treated as 0.0 (consistent with the STOPBASE branch) rather than the previous arbitrary 0.5.
      • Arm dispatch now guards against NA arm values explicitly (via is.na(arm) check before %in%) to avoid a base-R "missing value where TRUE/FALSE needed" error, then aborts for any value other than "STOPBASE" or "CONTINUE".
      • The @examples section no longer references the removed anymammo column.
      • pred_prob_col is now validated to exist in long_data, be numeric, and fall within [0, 1] before any row-level processing begins.
      • The internal temporary row-index column now uses a collision-resistant name (..ettbc_row_idx..) to avoid silently overwriting any existing user column with the same name.
  • false_positives:
      • Evaluations are filtered to within each arm's observed follow-up before counting.
      • Period classification now uses the 0-indexed month2 follow-up column (parameter renamed hist_month_col → hist_month2_col).
      • Duplicate evaluations within window_months per participant-arm are deduplicated.
      • An empty-result data frame is returned safely when no IDs match the cohort or when long_data has 0 rows (even if hist_data is non-empty).
      • @return docs clarified: arm-period combinations with no histological evaluations are omitted from the result.
      • arm_col values are now validated up front: NA or any value other than "STOPBASE" / "CONTINUE" raises a cli_abort() error before any processing occurs.
      • NA values in hist_month2_col now abort with a clear error message rather than causing a base-R "missing value where TRUE/FALSE needed" error in the deduplication loop.
  • fit_outcome_hr:
      • Added cluster_id_col parameter; confidence intervals are now computed as Wald CIs in both the sandwich and fallback branches (consistent named numeric vector output).
      • When sandwich is available, cluster-robust variance via sandwich::vcovCL() is used, matching the SAS PROC SURVEYLOGISTIC approach. sandwich added to Suggests.
      • @details formula generalized to use ns1 + … + nsK (instead of hard-coded ns1/ns2) to reflect dynamic spline column generation from rcs_knots.
      • The STOPBASE main-effect OR is documented as the baseline-time ratio from a model with arm-by-time interactions.
      • Added single-arm guard on both input and filtered model data.
      • weight_col is validated to exist in fit_data and to be numeric before use; cluster_id_col is validated to exist in fit_data and to contain no missing values before being passed to sandwich::vcovCL(), replacing a low-level NULL-access error with a clear cli_abort() message.
      • Now uses compute_ns_basis() and dynamic ns_col_names for consistency with the survival prediction functions.
      • rcs_knots documentation updated to reflect that the parameter accepts a vector of at least 3 elements (two boundary + at least one interior knot).
  • bootstrap_ci:
      • Returns an NA-filled data frame for empty input instead of erroring with mismatched column lengths.
      • Failed iterations are now counted; cli::cli_warn() is issued when the failure rate exceeds the new fail_threshold argument (default 10%); warning message formatting corrected (removed extraneous backslash).
      • set.seed() now saves and restores the caller's RNG state on exit so seeding is confined to the function call.
      • col_quantile() passes names = FALSE to stats::quantile() so apply() always returns a plain numeric vector (not a named 1×N matrix).
      • id_col is now forwarded to both the point-estimate and bootstrap-iteration calls to predict_survival_ipw(), so non-default participant ID column names are handled correctly.
      • n_boot is now validated to be ≥ 1, with a clear cli_abort() error for zero or negative values.
      • rcs_knots documentation updated to reflect that vectors with at least 3 elements are supported.
      • long_data is now pre-split by id_col once before the bootstrap loop (instead of calling merge() on every iteration), substantially reducing overhead on large datasets.
  • extract_mammograms_impl: Empty-result id column now preserves the type of claims[[id_col]] instead of defaulting to integer(0).
  • NEWS.md: Updated dependency entry to record that both splines and cli were added to Imports.
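
The conditional probabilities quoted for the CONTINUE-arm numerator (1/3, 1/2, 1) follow from a discrete uniform distribution over the three compliance-window months. A small worked check, with variable names that are illustrative only and not taken from the package:

```r
# Months since last mammogram that make up the compliance window.
window <- 11:13

# Discrete uniform: each month is equally likely to be the screening month.
pmf <- rep(1 / length(window), length(window))

# Probability of not having been screened before each month in the window.
still_unscreened <- 1 - c(0, cumsum(pmf)[-length(pmf)])

# Conditional probability of screening at a month, given no screening so far.
cond_prob <- pmf / still_unscreened
names(cond_prob) <- window
cond_prob
```

Up to floating-point rounding, this reproduces 1/3 at month 11, 1/2 at month 12, and 1 at month 13, since the last month in the window is the only remaining option.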

Tests added

  • test-bootstrap_ci.R: output structure, empty-input guard, seed reproducibility
  • test-extract_mammograms.R: HCPCS code filtering, month conversion, empty result, diagnostic/any variants
  • test-false_positives.R: structure, empty hist data, empty long data (with non-empty hist data), no-match cohort, censoring filter, deduplication, period classification
  • test-predict_survival.R: added coverage for predict_survival_baseline_adjusted() and predict_survival_ipw() (output structure, empty-data handling, weight handling)
  • test-compute_ipw_weights.R: added deterministic fixtures for STOPBASE grace-period behaviour, CONTINUE-arm compliance-window weight updates (tslm_lag 11–13), and per-arm 99th-percentile truncation
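
The seed-reproducibility test for bootstrap_ci() depends on seeding being confined to the call. One common base-R way to do that is to save `.Random.seed` and restore it in `on.exit()`. The helper `with_local_seed()` below is a hypothetical sketch of that pattern, not the package's code:

```r
# Hypothetical helper: evaluate `expr` under a fixed seed, then put the
# caller's RNG state back exactly as it was.
with_local_seed <- function(seed, expr) {
  genv <- globalenv()
  had_seed <- exists(".Random.seed", envir = genv, inherits = FALSE)
  old_seed <- if (had_seed) get(".Random.seed", envir = genv) else NULL
  on.exit({
    if (had_seed) {
      assign(".Random.seed", old_seed, envir = genv)
    } else if (exists(".Random.seed", envir = genv, inherits = FALSE)) {
      rm(list = ".Random.seed", envir = genv)
    }
  }, add = TRUE)
  set.seed(seed)
  force(expr)  # lazy argument is evaluated here, after the seed is set
}

# The caller's random stream is unaffected by the seeded call.
set.seed(42)
expected_next <- runif(1)

set.seed(42)
inside <- with_local_seed(1, runif(1))
actual_next <- runif(1)
identical(expected_next, actual_next)
```

The design choice here is that repeated calls with the same seed give the same draws, while code outside the call sees the same random stream it would have seen had the call never happened.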

Checklist

  • The title of your PR should briefly describe the change.
  • The body of your PR should contain Fixes #issue-number (if relevant).
  • Commit/merge messages to be included in NEWS.md should begin with -.
  • Code should follow the tidyverse style guide.
  • Documentation should use roxygen2, with Markdown syntax.
  • Contributions should include unit tests (using testthat).

Copilot AI requested review from Copilot and removed request for Copilot May 4, 2026 08:33
Copilot AI linked an issue May 4, 2026 that may be closed by this pull request
…tives, and mammogram extraction functions

- Add predict_survival_unadjusted(), predict_survival_baseline_adjusted(),
  predict_survival_ipw() with g-computation via pooled logistic regression
  and RCS splines (splines::ns)
- Add compute_ipw_weights() for stabilised IPW weights truncated at p99
- Add fit_outcome_hr() for weighted pooled logistic regression outcome model
- Add bootstrap_ci() for percentile bootstrap confidence intervals
- Add false_positives() for false-positive rate by arm and screening round
- Add extract_screening_mammograms(), extract_any_mammograms(),
  extract_diagnostic_mammograms() as Medicare claims extraction templates
- Add tests for predict_survival, compute_ipw_weights, fit_outcome_model
- Bump version to 0.0.0.9015; add splines to Imports
- Update NEWS.md with new function entries

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot May 4, 2026 09:21
Copilot AI changed the title from "[WIP] Translate remaining SAS code into R" to "feat: translate remaining SAS analysis pipeline to R" May 4, 2026
Copilot AI requested a review from d-morrison May 4, 2026 09:22
@d-morrison d-morrison marked this pull request as ready for review May 4, 2026 17:46
Copilot AI review requested due to automatic review settings May 4, 2026 17:46
Copilot AI left a comment

Pull request overview

Completes most of the remaining SAS-to-R translation for the package’s target-trial-emulation workflow by adding exported analysis helpers for weighting, survival estimation, bootstrap CIs, false-positive summaries, and mammogram extraction templates.

Changes:

  • Added new core analysis functions for IPW construction, survival prediction, hazard-ratio estimation, bootstrap confidence intervals, false-positive summaries, and Medicare claims extraction.
  • Exported and documented the new functions, and updated package metadata/description to reflect the broader analysis pipeline.
  • Added initial unit tests for survival prediction, IPW weights, and outcome-model structure.

Reviewed changes

Copilot reviewed 23 out of 23 changed files in this pull request and generated 15 comments.

Show a summary per file
| File | Description |
| --- | --- |
| tests/testthat/test-predict_survival.R | Adds basic tests for unadjusted survival output and empty input handling. |
| tests/testthat/test-fit_outcome_model.R | Adds basic structure/error tests for fit_outcome_hr(). |
| tests/testthat/test-compute_ipw_weights.R | Adds basic tests for returned weight columns and empty input handling. |
| R/predict_survival.R | Introduces unadjusted, baseline-adjusted, and IPW survival prediction helpers plus shared modeling utilities. |
| R/fit_outcome_model.R | Adds pooled-logistic outcome model fitting and OR/CI extraction. |
| R/false_positives.R | Adds false-positive rate summarisation by arm and screening round. |
| R/extract_mammograms.R | Adds screening/diagnostic/combined mammogram extraction helpers for claims data. |
| R/compute_ipw_weights.R | Adds cumulative IPW calculation and 99th-percentile truncation. |
| R/bootstrap_ci.R | Adds bootstrap percentile CI computation for survival differences. |
| NEWS.md | Records the new exported analysis functionality. |
| NAMESPACE | Exports the newly added public functions. |
| man/predict_survival_unadjusted.Rd | Generated documentation for unadjusted survival prediction. |
| man/predict_survival_ipw.Rd | Generated documentation for IPW survival prediction. |
| man/predict_survival_baseline_adjusted.Rd | Generated documentation for baseline-adjusted survival prediction. |
| man/fit_outcome_hr.Rd | Generated documentation for outcome HR fitting. |
| man/false_positives.Rd | Generated documentation for false-positive summaries. |
| man/extract_screening_mammograms.Rd | Generated documentation for screening mammogram extraction. |
| man/extract_diagnostic_mammograms.Rd | Generated documentation for diagnostic mammogram extraction. |
| man/extract_any_mammograms.Rd | Generated documentation for combined mammogram extraction. |
| man/ettbc-package.Rd | Updates package-level description to include the new analysis scope. |
| man/compute_ipw_weights.Rd | Generated documentation for IPW weight computation. |
| man/bootstrap_ci.Rd | Generated documentation for bootstrap CI estimation. |
| DESCRIPTION | Bumps development version, broadens package description, and adds splines to Imports. |

Comment thread R/predict_survival.R Outdated
Comment thread R/predict_survival.R Outdated
Comment thread R/predict_survival.R Outdated
Comment thread R/fit_outcome_model.R Outdated
Comment thread R/false_positives.R Outdated
Comment thread R/predict_survival.R Outdated
Comment thread R/bootstrap_ci.R
Comment thread R/bootstrap_ci.R Outdated
Comment thread R/false_positives.R
Comment thread R/extract_mammograms.R
@d-morrison
Collaborator

@copilot apply changes based on the comments in this thread and fix spellcheck

@d-morrison d-morrison removed their request for review May 7, 2026 06:29
Copilot AI and others added 2 commits May 7, 2026 07:02
Copilot AI requested review from Copilot and removed request for Copilot May 7, 2026 07:04
Contributor Author

Copilot AI commented May 7, 2026

@copilot apply changes based on the comments in this thread and fix spellcheck

Done in 84f1910. Here's a summary of what was addressed from the review thread:

  • predict_survival_*: max_month filter is now applied before fitting (not just at prediction time); added id_col parameter and fixed hardcoded "id" column in standardize_survival().
  • compute_ipw_weights: Per-arm 99th-percentile truncation (separate cutoffs for STOPBASE and CONTINUE); CONTINUE arm helper now receives and checks anymammo_col to correctly detect compliance-window resets.
  • false_positives: Evaluations filtered to within each arm's observed follow-up; period now classified using 0-indexed month2 (parameter renamed hist_month_col → hist_month2_col); duplicate evaluations within window_months per participant-arm are dropped; empty-result guard prevents NULL crash when no IDs match the cohort.
  • fit_outcome_hr: Added cluster_id_col parameter; uses sandwich::vcovCL() for cluster-robust CIs when sandwich is available (added to Suggests), matching the SAS PROC SURVEYLOGISTIC variance approach.
  • bootstrap_ci: Returns an NA-filled data frame on empty input instead of erroring.
  • Tests: Added test-bootstrap_ci.R (structure, empty-input, seed reproducibility), test-extract_mammograms.R, and test-false_positives.R.
  • Spellcheck: Fixed stabilised→stabilized; added FQHC, HCPCS, HC, histological, deduplicate, deduplication, th, unbootstrapped to WORDLIST.
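
To make the per-arm truncation item concrete, here is a minimal sketch of capping weights at an arm-specific 99th percentile. The function name `truncate_by_arm()` and the toy data are assumptions for illustration; the package's compute_ipw_weights() may organise this step differently.

```r
# Hypothetical sketch: cap each weight at the 99th percentile of its own arm.
truncate_by_arm <- function(weights, arm, p = 0.99) {
  stopifnot(is.numeric(weights), length(weights) == length(arm), !anyNA(arm))
  # One cutoff per arm, computed only from that arm's weights.
  cutoffs <- tapply(weights, arm, stats::quantile, probs = p, names = FALSE)
  pmin(weights, as.numeric(cutoffs[as.character(arm)]))
}

w <- c(1, 2, 100, 1, 1, 50)
arm <- rep(c("STOPBASE", "CONTINUE"), each = 3)
w_trunc <- truncate_by_arm(w, arm)
w_trunc
```

A single pooled cutoff would be driven by whichever arm has the heavier upper tail; separate cutoffs keep the extreme weight in each arm from being judged against the other arm's distribution.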

Copilot AI left a comment

Pull request overview

Copilot reviewed 27 out of 27 changed files in this pull request and generated 7 comments.

Comment thread R/fit_outcome_model.R Outdated
Comment thread R/fit_outcome_model.R
Comment thread R/fit_outcome_model.R Outdated
Comment thread R/compute_ipw_weights.R Outdated
Comment thread R/extract_mammograms.R Outdated
Comment thread R/bootstrap_ci.R
Comment thread R/predict_survival.R
Contributor Author

Copilot AI commented May 12, 2026

@copilot apply changes based on the comments in this thread and iterate with parallel_validation until review doesn't give any m...

Done in b472312. Three issues from review 4269647539 addressed:

  • compute_ipw_weights(): Added early cli_abort() validating pred_prob_col exists in long_data, is numeric, and is within [0, 1].
  • standardize_survival(): Added cli_abort() check that id_col exists in fit_data before it is used for duplicated().
  • false_positives(): Added cli_abort() check for NA values in hist_month2_col before the deduplication loop (where an NA m would cause a base-R error).
  • .github/copilot-instructions.md: Added the parallel_validation iteration policy.

parallel_validation passed with no additional feedback.

Copilot AI left a comment

Pull request overview

Copilot reviewed 17 out of 28 changed files in this pull request and generated 3 comments.

Files not reviewed (11)
  • man/bootstrap_ci.Rd: Language not supported
  • man/compute_ipw_weights.Rd: Language not supported
  • man/ettbc-package.Rd: Language not supported
  • man/extract_any_mammograms.Rd: Language not supported
  • man/extract_diagnostic_mammograms.Rd: Language not supported
  • man/extract_screening_mammograms.Rd: Language not supported
  • man/false_positives.Rd: Language not supported
  • man/fit_outcome_hr.Rd: Language not supported
  • man/predict_survival_baseline_adjusted.Rd: Language not supported
  • man/predict_survival_ipw.Rd: Language not supported
  • man/predict_survival_unadjusted.Rd: Language not supported

Comment thread R/predict_survival.R Outdated
Comment thread R/predict_survival.R Outdated
Comment thread .github/copilot-instructions.md Outdated
@d-morrison
Collaborator

@copilot apply changes based on the comments in this thread and iterate with copilot review before sending this PR back to me

Copilot AI and others added 10 commits May 12, 2026 17:04
…ming note

Agent-Logs-Url: https://github.com/ucdavis/ettbc/sessions/dd7e51ed-6843-440b-88b7-70c96d94cf77

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
… with @nord pattern)

Agent-Logs-Url: https://github.com/ucdavis/ettbc/sessions/dd7e51ed-6843-440b-88b7-70c96d94cf77

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Agent-Logs-Url: https://github.com/ucdavis/ettbc/sessions/dd7e51ed-6843-440b-88b7-70c96d94cf77

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
…_basis()

Agent-Logs-Url: https://github.com/ucdavis/ettbc/sessions/dd7e51ed-6843-440b-88b7-70c96d94cf77

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Contributor Author

Copilot AI commented May 12, 2026

@copilot apply changes based on the comments in this thread and iterate with copilot review before sending this PR back to me

Done in f352817. Three issues from review 4270685928 addressed, iterated through code review until no substantive issues remained:

  • predict_survival.R spline basis: Added compute_ns_basis() helper that computes the natural spline basis from all knots (not just rcs_knots[2L]); build_model_data() now dynamically generates ns1, ns2, ..., nsK columns for all basis columns returned; build_model_formula() now accepts ns_col_names and generates all interaction and main effect terms dynamically; standardize_survival() now recomputes all spline columns per time point via compute_ns_basis(). This prevents silent term dropping if rcs_knots has more than one interior knot.
  • fit_outcome_model.R: Updated to unpack the new list from build_model_data() and pass ns_col_names to build_model_formula().
  • .github/copilot-instructions.md: Updated the parallel_validation policy line to clarify it refers to calling the parallel_validation agent tool (not a shell command).

Copilot AI left a comment

Pull request overview

Copilot reviewed 17 out of 28 changed files in this pull request and generated 5 comments.

Comment thread .github/copilot-instructions.md Outdated
Comment thread R/predict_survival.R Outdated
Comment thread R/fit_outcome_model.R Outdated
Comment thread R/bootstrap_ci.R Outdated
Comment thread R/compute_ipw_weights.R Outdated
@d-morrison
Collaborator

@copilot apply changes based on the comments in this thread

…copilot-instructions clarification

Agent-Logs-Url: https://github.com/ucdavis/ettbc/sessions/1704c56c-dd99-4d2f-bd9a-3e1d91281935

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Contributor Author

Copilot AI commented May 12, 2026

@copilot apply changes based on the comments in this thread

Done in cbed3ac. Four issues from review 4274879388 addressed:

  • rcs_knots docs (predict_survival.R, fit_outcome_model.R, bootstrap_ci.R): Changed "length 3" to "at least 3 elements"; now explicitly documents that the first element is the left boundary knot, the last is the right boundary knot, and any middle elements are interior knots.
  • .row_idx collision (compute_ipw_weights.R): Replaced the temporary .row_idx column name with "..ettbc_row_idx.." — a collision-resistant name that won't silently overwrite user data.
  • copilot-instructions.md: Clarified that parallel_validation is a Copilot agent tool (not a shell command/script in the repo).

parallel_validation passed with no comments.

Copilot AI left a comment

Pull request overview

Copilot reviewed 17 out of 28 changed files in this pull request and generated 4 comments.

Comment thread R/predict_survival.R Outdated
Comment thread R/predict_survival.R Outdated
Comment thread R/fit_outcome_model.R Outdated
Comment thread NEWS.md Outdated
@d-morrison
Collaborator

@copilot apply changes based on the comments in this thread

…generalization, NEWS cli import

Agent-Logs-Url: https://github.com/ucdavis/ettbc/sessions/da8a9b08-d9f9-45b3-addb-6c48acef64f3

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Contributor Author

Copilot AI commented May 13, 2026

@copilot apply changes based on the comments in this thread

Done in 8efa3fc. Four documentation issues from review 4278124228 addressed:

  • compute_ns_basis() @return: Corrected column count from length(rcs_knots) - 2 to length(rcs_knots) - 1 (i.e., length(interior_knots) + 1).
  • predict_survival_ipw() covariate_cols: Added note that these covariates are used during g-computation standardization over baseline rows (month = 0) and must be measured at or before baseline.
  • fit_outcome_hr() @details formula: Replaced hard-coded ns1/ns2 with generic ns1 + ... + nsK and K = length(rcs_knots) - 1.
  • NEWS.md: Updated the Imports entry to mention both splines and cli.

Copilot AI left a comment

Pull request overview

Copilot reviewed 17 out of 28 changed files in this pull request and generated 3 comments.

Comment thread man/fit_outcome_hr.Rd Outdated
Comment thread R/predict_survival.R
Comment thread R/bootstrap_ci.R Outdated
@d-morrison
Collaborator

@copilot apply changes based on the comments in this thread

Copilot AI and others added 2 commits May 13, 2026 05:28
…g_data for bootstrap

Agent-Logs-Url: https://github.com/ucdavis/ettbc/sessions/81178e8c-a33e-457b-839e-f1568fab67c1

Co-authored-by: d-morrison <2474437+d-morrison@users.noreply.github.com>
Contributor Author

Copilot AI commented May 13, 2026

@copilot apply changes based on the comments in this thread

Done in 7e5a072 (with a491163 staging the pre-existing uncommitted doc changes). Three issues from review 4278478682 addressed:

  • man/fit_outcome_hr.Rd: Committed the already-regenerated .Rd (from devtools::document()) with the generic ns1 + ... + nsK formula (also committed updated man/predict_survival_ipw.Rd with the covariate_cols baseline note).
  • compute_ns_basis() @return docs: Added colnames(basis) <- paste0("ns", seq_len(ncol(basis))) inside compute_ns_basis() so the function now actually returns a matrix with ns1, ns2, ..., nsK column names — matching what the @return docs say. Simplified build_model_data() to use colnames(ns_basis) instead of regenerating names independently.
  • Bootstrap loop performance: Pre-split long_data by id_col once before the loop using split(), then build each bootstrap sample by concatenating the sampled groups via lapply + do.call(rbind, ...), avoiding a merge() on every iteration.
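
The pre-split resampling described in the last item can be sketched roughly as follows. The toy data frame and the function name `draw_boot_sample()` are assumptions for this example; only the overall shape (split once, sample IDs with replacement, relabel, row-bind) is taken from the description above.

```r
# Toy person-month data: three participants, two months each.
long_data <- data.frame(
  id = rep(c("a", "b", "c"), each = 2),
  month = rep(0:1, times = 3),
  dead_t1 = 0
)

# Split once, outside the bootstrap loop.
long_data_split <- split(long_data, long_data[["id"]])

# Hypothetical helper: draw one bootstrap sample of whole participants.
draw_boot_sample <- function(groups, id_col) {
  boot_ids <- sample(names(groups), size = length(groups), replace = TRUE)
  boot_groups <- lapply(seq_along(boot_ids), function(i) {
    grp <- groups[[boot_ids[i]]]
    grp[[id_col]] <- i  # fresh ID so a participant drawn twice stays distinct
    grp
  })
  do.call(rbind, boot_groups)
}

set.seed(2026)
boot_sample <- draw_boot_sample(long_data_split, id_col = "id")
nrow(boot_sample)
```

Relabelling the IDs matters because downstream steps that identify baseline rows or clusters by ID would otherwise merge repeated draws of the same participant into one.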

parallel_validation passed with no comments.

@github-actions

PR Preview Action v1.8.1

🚀 View preview at
https://ucdavis.github.io/ettbc/pr-preview/pr-8/

Built to branch gh-pages at 2026-05-13 05:33 UTC.
Preview will be ready when the GitHub Pages deployment is complete.


These commands must be run in this order and all must pass without errors before pushing changes or requesting code review. This ensures that CI/CD workflows will pass and prevents wasting reviewer time on fixable issues.

**POLICY: After completing all code changes, call the `parallel_validation` tool and iterate — applying any feedback raised by the code review or CodeQL scan — until `parallel_validation` reports no new issues. Only then consider the task done.** Note: `parallel_validation` is a Copilot coding agent tool (invokable from within a cloud agent session); it is not a shell command or script in this repository.
Collaborator

parallel_validation doesn't seem to be real; instead, just iteratively request copilot reviews from inside the cloud agent session until the reviews come back clean

Copilot AI left a comment

Pull request overview

Copilot reviewed 17 out of 28 changed files in this pull request and generated 2 comments.

Comment thread R/predict_survival.R
Comment on lines +298 to +307
build_model_formula <- function(ns_col_names, covariate_cols = NULL) {
  ns_interaction_terms <- paste0("STOPBASE:", ns_col_names)
  base_terms <- c(
    "STOPBASE", "STOPBASE:month3",
    ns_interaction_terms,
    "month3", ns_col_names
  )
  all_terms <- c(base_terms, covariate_cols)
  stats::as.formula(paste("dead_t1 ~", paste(all_terms, collapse = " + ")))
}
Comment thread R/bootstrap_ci.R
Comment on lines +163 to +166
# assigning new integer IDs to handle duplicates
boot_groups <- lapply(seq_along(boot_ids), function(i) {
  grp <- long_data_split[[as.character(boot_ids[i])]]
  grp[[id_col]] <- i

Development

Successfully merging this pull request may close these issues.

translate the rest of the sas code

4 participants