Speed up dot plots: integrate preprocessed data on frontend (SCP-5992) #2324
…ortal_core into ew-fast-dot-plots
Codecov Report
@@ Coverage Diff @@
## development #2324 +/- ##
===============================================
- Coverage 72.62% 72.61% -0.01%
===============================================
Files 333 335 +2
Lines 27745 27834 +89
Branches 2651 2565 -86
===============================================
+ Hits 20151 20213 +62
- Misses 7453 7477 +24
- Partials 141 144 +3
bistline left a comment
I'm very glad to see this didn't look too terribly complicated in the end, though I'm sure it took quite a while to figure out due to the lack of documentation. Very nice work! The speedup locally is incredible.
CellMetadatum.where(name: 'organism_age').map(&:set_minmax_by_units!)
# Only process CellMetadatum records that have a valid study association
CellMetadatum.where(name: 'organism_age').each do |cell_metadatum|
  next if cell_metadatum.study.nil?
This shouldn't ever be the case - all records are destroyed on study deletion, though clearly at some point something went sideways locally and you ended up with orphaned data. I did a quick check on staging & production and we don't have any of these (good), but leaving this check in place is harmless.
This dramatically speeds up dot plots for datasets where the preprocessing pipeline has run.
Previously, dot plots loaded much more data from the backend than they needed. Dot plots display only two metrics for a given annotation label: scaled mean expression and percent of cells expressing. They don't need each cell's expression data, but they loaded it anyway.
Now, dot plots can use preprocessed data that only provides those two summary metrics. This makes dot plots much faster, and also (in principle) enables them to scale to many more genes than the current 50-gene cutoff.
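To illustrate why the summary data is so much smaller than per-cell data, here is a minimal sketch of reducing one gene's per-cell expression to the two metrics per annotation label. The function name `summarizeGene`, the `{ label, expression }` input shape, and the min-max scaling of means across labels are assumptions for illustration only; the actual preprocessing pipeline may compute and scale these values differently.

```javascript
// Hypothetical helper (not from this PR): reduce per-cell expression for one
// gene to one summary row per annotation label.
// `cells` is assumed to be an array of { label, expression } objects.
function summarizeGene(cells) {
  const byLabel = new Map();
  for (const { label, expression } of cells) {
    const stats = byLabel.get(label) ?? { sum: 0, expressing: 0, count: 0 };
    stats.sum += expression;
    if (expression > 0) stats.expressing += 1; // cell counts as "expressing"
    stats.count += 1;
    byLabel.set(label, stats);
  }

  const summaries = [...byLabel].map(([label, s]) => ({
    label,
    mean: s.sum / s.count,
    percentExpressing: (100 * s.expressing) / s.count
  }));

  // Assumed scaling: min-max across labels, so scaled means fall in [0, 1].
  const means = summaries.map(s => s.mean);
  const min = Math.min(...means);
  const range = Math.max(...means) - min;

  return summaries.map(s => ({
    label: s.label,
    scaledMean: range === 0 ? 0 : (s.mean - min) / range,
    percentExpressing: s.percentExpressing
  }));
}

const exampleCells = [
  { label: 'A', expression: 0 },
  { label: 'A', expression: 2 },
  { label: 'B', expression: 4 },
  { label: 'B', expression: 4 }
];
const exampleSummary = summarizeGene(exampleCells);
```

The output has one row per label regardless of how many cells the study contains, which is why the payload stays small as studies grow.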
The implementation monkey-patches the underlying JavaScript of Morpheus. The fast path is enabled only behind a feature flag, because at least two known issues need to be resolved before it is made generally available:
Demo video:
Fast_dot_plots_frontend__SCP__2025-11-24.mov
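For readers unfamiliar with the pattern described above, the following is a generic sketch of a monkey patch gated by a feature flag. It does not use the real Morpheus API: `thirdPartyLib`, `loadDataset`, `patchLoader`, and the flag check are all hypothetical stand-ins, and the actual patch in this PR may be structured quite differently.

```javascript
// Stand-in for a third-party library object. This is NOT the Morpheus API.
const thirdPartyLib = {
  loadDataset(source) {
    return { kind: 'per-cell', source };
  }
};

// Hypothetical helper: replace `lib.loadDataset` with a wrapper that uses the
// preprocessed path only while the feature flag is on, and otherwise defers
// to the original method. Returns a function that undoes the patch.
function patchLoader(lib, isFlagEnabled, loadPreprocessed) {
  const original = lib.loadDataset;
  lib.loadDataset = function patchedLoadDataset(source) {
    if (isFlagEnabled()) {
      return loadPreprocessed(source);
    }
    return original.call(this, source);
  };
  return () => {
    lib.loadDataset = original;
  };
}

let flagOn = false; // assumed flag state, for illustration
const restore = patchLoader(
  thirdPartyLib,
  () => flagOn,
  source => ({ kind: 'preprocessed', source })
);
```

Keeping a reference to the original method means behavior is unchanged whenever the flag is off, and the patch can be removed cleanly once the library supports the fast path natively.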