🤖 Written by Claude.
Overview
A new admin-only analytics page mining ClassificationModification history to show how classifications change over time. Initially restricted to superusers with lab/org filter controls. Will be tested on Shariant; team feedback will determine whether to expose per-lab views later.
The page answers questions like: "how often do we reclassify?", "which genes have the most VUSes?", and "is our database trending towards benign over time?"
Data Model
A materialised ReclassificationEvent table — one row per event where clinical significance changed between consecutive ClassificationModification records:
- classification FK, lab FK, allele_origin (germline/somatic, hard split throughout)
- from_clinical_significance / to_clinical_significance
- reclassified_date (date of the newer modification)
- FKs to both ClassificationModification records (for traceability)
Updated incrementally via signal on classification publish; daily Celery beat as a safety net; full recompute only on schema change.
For variants reclassified multiple times (expected to be rare): expose each individual step as well as the overall first→last transition.
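A minimal sketch of how events could be derived from one classification's modification history, in plain Python rather than Django ORM code. The `Modification` and `Event` classes and both function names are hypothetical stand-ins, not existing code; the sketch assumes a classification that returns to its starting value has no first→last transition.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Modification:
    """Hypothetical stand-in for one published ClassificationModification."""
    published: date
    clinical_significance: str  # e.g. "B", "LB", "VUS", "LP", "P"


@dataclass(frozen=True)
class Event:
    """Hypothetical stand-in for one ReclassificationEvent row."""
    from_clinical_significance: str
    to_clinical_significance: str
    reclassified_date: date


def reclassification_events(history: List[Modification]) -> List[Event]:
    """Emit one event per consecutive pair whose clinical significance differs."""
    ordered = sorted(history, key=lambda m: m.published)
    events = []
    for older, newer in zip(ordered, ordered[1:]):
        if older.clinical_significance != newer.clinical_significance:
            # reclassified_date is the date of the newer modification
            events.append(Event(older.clinical_significance,
                                newer.clinical_significance,
                                newer.published))
    return events


def first_to_last(events: List[Event]) -> Optional[Tuple[str, str]]:
    """Overall transition across all steps, or None if there is no net change."""
    if not events:
        return None
    start = events[0].from_clinical_significance
    end = events[-1].to_clinical_significance
    return (start, end) if start != end else None
```

Modifications that leave clinical significance unchanged produce no row, so the table stays small even for heavily edited classifications.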
Charts / Features
1. Sankey / Alluvial Chart
Two ranked columns (B / LB / VUS / LP / P), lines between them proportional to reclassification count. Germline and somatic on separate axes. Immediately shows the pattern of how classifications are moving.
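The link data for such a chart could be aggregated roughly as follows. This is a sketch under assumptions: events arrive as `(allele_origin, from, to)` tuples, and the `source`/`target`/`value` keys are illustrative rather than tied to any particular charting library.

```python
from collections import Counter

# Column order used for both sides of the chart
RANK_ORDER = ["B", "LB", "VUS", "LP", "P"]


def sankey_links(events):
    """Count transitions per allele origin.

    events: iterable of (allele_origin, from_cs, to_cs) tuples.
    Returns {allele_origin: [{"source", "target", "value"}, ...]} with links
    sorted by source rank, then target rank.
    """
    counts = Counter(events)
    links = {}
    for (origin, src, dst), n in counts.items():
        links.setdefault(origin, []).append(
            {"source": src, "target": dst, "value": n}
        )
    for origin_links in links.values():
        origin_links.sort(
            key=lambda link: (RANK_ORDER.index(link["source"]),
                              RANK_ORDER.index(link["target"]))
        )
    return links
```

Keeping germline and somatic as separate keys preserves the hard split from the data model.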
2. Overall Trend Chart
Each reclassification event plotted as a dot over time — green for benign-direction, red for pathogenic-direction — with a rolling average line. Expected to show a net benign trend as population databases (gnomAD etc.) resolve VUSes.
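One possible way to score direction and compute the rolling line, assuming the B < LB < VUS < LP < P ordering and a trailing window counted in events (a time-based window would be an equally valid choice). Function names are hypothetical.

```python
RANK = {"B": 0, "LB": 1, "VUS": 2, "LP": 3, "P": 4}


def direction(from_cs, to_cs):
    """Return +1 for pathogenic-direction, -1 for benign-direction, 0 if equal."""
    delta = RANK[to_cs] - RANK[from_cs]
    return (delta > 0) - (delta < 0)


def rolling_mean(values, window):
    """Trailing mean over the most recent `window` values (fewer at the start)."""
    out = []
    running = 0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just fell out of the window
            running -= values[i - window]
        out.append(running / min(i + 1, window))
    return out
```

With this scoring, a rolling mean below zero corresponds to the expected net benign trend.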
3. VUS Reclassification Rate Over Time
Line chart: of all VUSes existing at the start of each period, what % were reclassified by the end? Default: per year. Toggle: per quarter (may have insufficient data). This is the headline metric for database quality — "we resolve X% of VUSes per year."
Note: requires tracking the VUS count at the start of each period as a denominator, not just reclassification events.
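The denominator requirement can be sketched as below. It assumes each classification's time as a VUS is available as a `(became_vus, left_vus)` date pair, with `left_vus` set to `None` while still a VUS; that representation and the function name are assumptions, not part of the proposed schema.

```python
from datetime import date


def vus_reclassification_rate(vus_intervals, period_start, period_end):
    """Fraction of VUSes present at period_start that were reclassified
    by period_end. Returns None when there were no VUSes at period_start.

    vus_intervals: iterable of (became_vus, left_vus) dates; left_vus is
    None for classifications that are still VUS.
    """
    # Denominator: VUS on or before the period start and not yet reclassified
    at_start = [
        (became, left) for became, left in vus_intervals
        if became <= period_start and (left is None or left > period_start)
    ]
    if not at_start:
        return None
    # Numerator: members of that cohort reclassified within the period
    resolved = sum(
        1 for _, left in at_start if left is not None and left <= period_end
    )
    return resolved / len(at_start)
```

VUSes created during the period are excluded from both numerator and denominator, which keeps the rate between 0 and 1.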
4. Time-to-Reclassification Distribution
Histogram of elapsed time from initial classification to reclassification, with a separate histogram per starting clinical significance. The key question: "what is the median VUS half-life?" Derivable from the event table as reclassified_date - initial_classification_date, provided the initial classification date is reachable through the classification FK (it is not one of the event columns listed under Data Model).
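A rough sketch of the median calculation, grouped by starting clinical significance. It assumes each row carries an initial classification date alongside the event's reclassified_date; the row shape and function name are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def median_days_to_reclassification(rows):
    """Median elapsed days per starting clinical significance.

    rows: iterable of (from_cs, initial_classification_date, reclassified_date).
    """
    days_by_start = defaultdict(list)
    for from_cs, initial, reclassified in rows:
        days_by_start[from_cs].append((reclassified - initial).days)
    return {cs: median(days) for cs, days in days_by_start.items()}
```

This median only covers classifications that were eventually reclassified. VUSes that have never been reclassified are absent from the event table, so the figure understates a true half-life unless those are accounted for separately.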
5. VUS Burden by Gene
Sortable table / bar chart: genes with the most outstanding VUSes right now, normalised by total classification count for that gene. Actionable for curation prioritisation — lab managers can see which genes need attention.
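The normalisation could look something like this, assuming the input is one `(gene_symbol, current_clinical_significance)` pair per classification. The function name and the tie-breaking order are illustrative choices.

```python
from collections import Counter


def vus_burden(classifications):
    """Rank genes by the share of their classifications that are VUS.

    classifications: iterable of (gene_symbol, current_cs).
    Returns rows of (gene, vus_count, total_count, vus_fraction), sorted by
    fraction descending, then VUS count descending, then gene symbol.
    """
    totals = Counter()
    vus = Counter()
    for gene, cs in classifications:
        totals[gene] += 1
        if cs == "VUS":
            vus[gene] += 1
    rows = [
        (gene, vus[gene], totals[gene], vus[gene] / totals[gene])
        for gene in totals
    ]
    rows.sort(key=lambda row: (-row[3], -row[1], row[0]))
    return rows
```

Returning both the raw count and the fraction lets the table sort by either, since a gene with few classifications can show a high fraction from only one or two VUSes.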
6. Evidence Key Changes Driving Reclassification
Diff the evidence JSON between from_modification and to_modification for each event; show a bar chart of the most commonly changed criteria. Answers "is it gnomAD or functional data that is resolving our VUSes?" Most complex of the six — may be phased if needed.
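A simple version of the diff, assuming the evidence JSON has already been loaded into a flat dict keyed by evidence key. The real evidence structure may be nested, in which case values would need normalising first; both function names are hypothetical.

```python
from collections import Counter


def changed_keys(before, after):
    """Evidence keys that were added, removed, or given a different value."""
    keys = before.keys() | after.keys()
    return sorted(k for k in keys if before.get(k) != after.get(k))


def most_changed(event_pairs):
    """Count how often each evidence key changed across events.

    event_pairs: iterable of (from_evidence, to_evidence) dicts, one pair
    per reclassification event. Returns (key, count) pairs, most common first.
    """
    counts = Counter()
    for before, after in event_pairs:
        counts.update(changed_keys(before, after))
    return counts.most_common()
```

A key counted here changed at the same time as the reclassification, which does not by itself establish that it caused the reclassification.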
Permissions / Access
- Admin-only initially (no share-level filtering needed — simplifies queries significantly)
- Filter UI: organisation → lab, date range
- Whole-system view is the most valuable for reporting and grants
- If a per-lab view is added later: scope it to classifications the lab owns, not just ones visible to them, to avoid leaking unpublished changes from other labs
Deferred (separate issues)
- Lab follow-on convergence: when lab A reclassifies a variant, do other labs follow and how quickly? Needs multi-lab pairing on the same allele — too large for initial scope.
- Stale VUS list: sortable table of VUSes not reviewed in >N years — too large for initial scope.