Merged
40 commits
57dbc1b
Adding support for mongo-based SearchFacet entries
bistline Apr 30, 2025
f4298fc
Adding migration, updating code to use bq_cols for metadata and sourc…
bistline May 1, 2025
da9dbe9
fixing regressions from refactor
bistline May 1, 2025
6f59a60
fixing typo
bistline May 1, 2025
cf98799
updating tests
bistline May 7, 2025
449b94b
adding search_controller test
bistline May 7, 2025
38bb247
fixing search_controller test
bistline May 7, 2025
530e46f
adding new facets to addt'l list
bistline May 7, 2025
6f97952
trying cleanup step re: CI test failure
bistline May 7, 2025
7d68318
Bump rack from 2.2.13 to 2.2.14
dependabot[bot] May 8, 2025
2d37358
Upgrading to sentry-ruby
bistline May 12, 2025
ca36b6c
Merge pull request #2251 from broadinstitute/jb-new-search-facets
bistline May 19, 2025
ca3279b
Merge pull request #2254 from broadinstitute/dependabot/bundler/rack-…
bistline May 19, 2025
9043c5a
Bump vite from 4.5.13 to 4.5.14
dependabot[bot] May 19, 2025
23a2845
Dealing with type param error
bistline May 19, 2025
603a367
Merge pull request #2255 from broadinstitute/dependabot/npm_and_yarn/…
eweitz May 19, 2025
1cd202d
Making data retention test more tolerant of ops order
bistline May 19, 2025
c205b32
Fix filter handling for edge case in annotationFacets
eweitz May 19, 2025
36d2560
Merge branch 'development' into jb-sentry-ruby
bistline May 19, 2025
0b7b337
Fixing test regression re: filter choice
bistline May 19, 2025
c274d89
Merge pull request #2256 from broadinstitute/jb-sentry-ruby
bistline May 19, 2025
69daabd
Disabling external NeMO tests
bistline May 19, 2025
639fd2b
Improve fix for null filter edge case
eweitz May 19, 2025
269d970
Test cell filtering handling for null filters
eweitz May 20, 2025
7965f33
Merge branch 'development' of github.com:broadinstitute/single_cell_p…
eweitz May 20, 2025
455761e
Merge pull request #2257 from broadinstitute/jb-nemo-test-disable
bistline May 20, 2025
fbb7a97
Omit CELLxGENE IDs from cell facets
eweitz May 20, 2025
39c1c31
Merge branch 'development' of github.com:broadinstitute/single_cell_p…
eweitz May 20, 2025
4977579
Revert inferior fix
eweitz May 20, 2025
fccf1f7
Fix search via DE genes click
eweitz May 20, 2025
3a0cc48
Merge pull request #2258 from broadinstitute/ew-patch-null-filter
eweitz May 20, 2025
0389ba3
Merge pull request #2259 from broadinstitute/ew-fix-gene-search-de
eweitz May 20, 2025
6461a72
Fix mouse ideogram gene popover, add more human pathways
eweitz May 20, 2025
88a4fef
Add "Parse Evercode Whole Transcriptome v3" to library preparation pr…
eweitz May 20, 2025
32f53e1
Merge pull request #2260 from broadinstitute/ew-update-ideogram-data
eweitz May 20, 2025
0dd33cd
Add "Parse Evercode WT Mini v3" to library preparation protocol dropdown
eweitz May 20, 2025
925f27d
Merge pull request #2262 from broadinstitute/ew-parse-evercode-wt3
eweitz May 20, 2025
c84d8c6
Robustify autocomplete to issues in pathway data
eweitz May 21, 2025
66de532
Test pathway handling robustness, omit problematic pathways
eweitz May 21, 2025
a42dab6
Merge pull request #2264 from broadinstitute/ew-robust-pathway-autoco…
eweitz May 21, 2025
3 changes: 2 additions & 1 deletion Gemfile
Original file line number Diff line number Diff line change
@@ -65,7 +65,8 @@ gem 'ruby_native_statistics'
gem 'mongoid_rails_migrations'
gem 'secure_headers'
gem 'swagger-blocks'
gem 'sentry-raven'
gem 'sentry-ruby'
gem "sentry-rails"
gem 'rubyzip'
gem 'rack-brotli'
gem 'time_difference'
13 changes: 9 additions & 4 deletions Gemfile.lock
@@ -380,7 +380,7 @@ GEM
puma (5.6.9)
nio4r (~> 2.0)
racc (1.8.1)
rack (2.2.13)
rack (2.2.14)
rack-brotli (1.1.0)
brotli (>= 0.1.7)
rack (>= 1.4)
@@ -479,8 +479,12 @@ GEM
sdoc (2.1.0)
rdoc (>= 5.0)
secure_headers (6.3.2)
sentry-raven (3.1.2)
faraday (>= 1.0)
sentry-rails (5.23.0)
railties (>= 5.0)
sentry-ruby (~> 5.23.0)
sentry-ruby (5.23.0)
bigdecimal
concurrent-ruby (~> 1.0, >= 1.0.2)
signet (0.17.0)
addressable (~> 2.8)
faraday (>= 0.17.5, < 3.a)
@@ -626,7 +630,8 @@ DEPENDENCIES
sass-rails (>= 6)
sdoc
secure_headers
sentry-raven
sentry-rails
sentry-ruby
simplecov
simplecov-lcov
stackprof
27 changes: 21 additions & 6 deletions app/controllers/api/v1/search_controller.rb
@@ -170,7 +170,7 @@ class SearchController < ApiBaseController

def index
@viewable = Study.viewable(current_api_user)
@search_type = params[:type].to_sym
@search_type = params[:type]&.to_sym || :study # handle empty type

# filter results by branding group, if specified
if @selected_branding_group.present?
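The `&.` change in this hunk guards against a missing `type` parameter. Below is a minimal sketch of that fallback, assuming a plain Hash as a stand-in for Rails `params`; the helper name `search_type_for` is hypothetical and not part of the PR.

```ruby
# Hypothetical helper mirroring the fallback used in `index`.
# A plain Hash stands in for the Rails `params` object.
def search_type_for(params)
  # `&.` skips `to_sym` when the value is nil, so `||` can supply the default
  params[:type]&.to_sym || :study
end

search_type_for({ type: 'gene' }) # a present type is converted to a symbol
search_type_for({})               # a missing type falls back to :study
```

In this sketch an empty string is not nil, so `''.to_sym` yields the truthy symbol `:""` and the `:study` default is not applied; only an absent parameter triggers the fallback.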
@@ -238,10 +238,20 @@ def index
if @studies.count > 0 && @facets.any?
sort_type = :facet
@studies_by_facet = {}
@big_query_search = self.class.generate_bq_query_string(@facets)
logger.info "Searching BigQuery using facet-based query: #{@big_query_search}"
query_results = ApplicationController.big_query_client.dataset(CellMetadatum::BIGQUERY_DATASET).query @big_query_search
job_id = query_results.job_gapi.job_reference.job_id
mongo_facets, bq_facets = self.class.divide_facets_by_source(@facets)
if bq_facets.any?
@big_query_search = self.class.generate_bq_query_string(bq_facets)
query_results = ApplicationController.big_query_client.dataset(CellMetadatum::BIGQUERY_DATASET).query @big_query_search
else
query_results = []
end
# run a query for any mongo-based facets
mongo_facets.map do |facet|
db_facet = facet[:db_facet]
mongo_results = StudySearchService.perform_mongo_facet_search(db_facet, facet[:filters])
query_results += mongo_results
end

# build up map of study matches by facet & filter value (for adding labels in UI)
@studies_by_facet = self.class.match_studies_by_facet(query_results, @facets)
# uniquify result list as one study may match multiple facets/filters
@@ -254,7 +264,7 @@ def index
existing_total_matches = @match_by_data['numResults:scp'].to_i
@match_by_data['numResults:scp:metadata'] = total_metadata_matches.size
@match_by_data['numResults:scp'] = existing_total_matches + total_metadata_matches.size
logger.info "Found #{@convention_accessions.count} matching studies from BQ job #{job_id}: #{@convention_accessions}"
logger.info "Found #{@convention_accessions.count} matching studies from query: #{@convention_accessions}"
@studies = @studies.where(:accession.in => @convention_accessions)
end

@@ -765,6 +775,11 @@ def self.match_results_by_filter(search_result:, result_key:, facets:)
end
end

# divide facets into mongo- and bigquery-based
def self.divide_facets_by_source(facets)
facets.partition { |facet| facet[:db_facet].is_mongo_based }
end

# properly escape any single quotes in a filter value (double quotes are correctly handled already)
def self.sanitize_filter_value(filter)
filter.gsub(/'/) { "\\'" }
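The new `divide_facets_by_source` relies on `Enumerable#partition`, which returns two arrays: the elements for which the block is truthy, then the rest. The sketch below shows that split in isolation. `FakeFacet` and the sample facet hashes are hypothetical stand-ins; the real `db_facet` objects are `SearchFacet` records that this diff does not show.

```ruby
# Hypothetical stand-in modelling only the `is_mongo_based` flag
# that `divide_facets_by_source` reads from each facet's `db_facet`.
FakeFacet = Struct.new(:identifier, :is_mongo_based)

# Same body as the method added in the diff, written as a top-level method
def divide_facets_by_source(facets)
  # partition returns [matching elements, non-matching elements]
  facets.partition { |facet| facet[:db_facet].is_mongo_based }
end

# Illustrative input only; the keys other than :db_facet are assumptions
facets = [
  { id: 'disease', db_facet: FakeFacet.new('disease', false), filters: [] },
  { id: 'has_morphology', db_facet: FakeFacet.new('has_morphology', true), filters: [] }
]

# Mongo-based facets come first because the block tests `is_mongo_based`
mongo_facets, bq_facets = divide_facets_by_source(facets)
```

Because the destructuring order follows the block's truthiness, swapping the two variable names would silently route facets to the wrong backend, which is why the controller names them `mongo_facets, bq_facets` in that order.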
@@ -203,7 +203,7 @@ function getExpressionSort(exploreInfo, exploreParams) {
* */
export default function ExploreDisplayPanelManager({
studyAccession, exploreInfo, setExploreInfo, exploreParams, updateExploreParams, clearExploreParams,
exploreParamsWithDefaults, routerLocation, searchGenes, countsByLabelForDe, setShowUpstreamDifferentialExpressionPanel,
exploreParamsWithDefaults, routerLocation, queryFn, countsByLabelForDe, setShowUpstreamDifferentialExpressionPanel,
setShowDifferentialExpressionPanel, showUpstreamDifferentialExpressionPanel, togglePanel, shownTab,
showDifferentialExpressionPanel, setIsCellSelecting, currentPointsSelected, isCellSelecting, deGenes,
setDeGenes, setShowDeGroupPicker,
@@ -572,7 +572,7 @@ export default function ExploreDisplayPanelManager({
<DifferentialExpressionPanel
deGroup={deGroup}
deGenes={deGenes}
searchGenes={searchGenes}
searchGenes={queryFn}
exploreParamsWithDefaults={exploreParamsWithDefaults}
exploreInfo={exploreInfo}
clusterName={exploreParamsWithDefaults.cluster}
4 changes: 2 additions & 2 deletions app/javascript/components/explore/StudyGeneField.jsx
@@ -362,9 +362,9 @@ export default function StudyGeneField({
}

/** Last filtering applied before showing selectable autocomplete options */
function finalFilterOptions(option, rawInput) {
export function finalFilterOptions(option, rawInput) {
const input = rawInput.toLowerCase()
const label = 'label' in option ? option.label.toLowerCase() : option.toLowerCase()
const label = 'label' in option ? option.label?.toLowerCase() : option.toLowerCase()
const isPathway = option.data.isGene === false
return isPathway || label.includes(input) // partial match
}
2 changes: 1 addition & 1 deletion app/javascript/components/search/controls/FacetsPanel.jsx
@@ -11,7 +11,7 @@ import { SearchFacetContext } from '~/providers/SearchFacetProvider'

const defaultFacetIds = ['disease', 'species']
const moreFacetIds = [
'sex', 'race', 'library_preparation_protocol', 'organism_age'
'sex', 'race', 'library_preparation_protocol', 'organism_age', 'has_morphology', 'has_electrophysiology'
]

/**
45 changes: 39 additions & 6 deletions app/javascript/lib/cell-faceting.js
@@ -350,7 +350,7 @@ export function getMinMaxValues(filters) {
/** Omit any filters that match 0 cells in the current clustering */
function trimNullFilters(cellFaceting) {
const filterCountsByFacet = cellFaceting.filterCounts
const annotationFacets = cellFaceting.facets.map(facet => facet.annotation)
const facets = cellFaceting.facets.map(facet => facet.annotation)
const nonzeroFiltersByFacet = {} // filters to remove, as they match no cells
const nonzeroFilterCountsByFacet = {}
const originalFacets = cellFaceting.rawFacets.facets
@@ -359,14 +359,15 @@ function trimNullFilters(cellFaceting) {

const filterableCells = cellFaceting.filterableCells

for (let i = 0; i < annotationFacets.length; i++) {
const facet = annotationFacets[i]
for (let i = 0; i < facets.length; i++) {
const facet = facets[i]
const sourceFacet = originalFacets.find(f => f.annotation === facet)
let facetHasNullFilter = false
const isGroupFacet = facet.includes('--group--')
let nullFilterIndex

const countsByFilter = filterCountsByFacet[facet]

const nonzeroFilters = []
let defaultSelection = []
const nonzeroFilterCounts = {}
@@ -424,7 +425,7 @@ function trimNullFilters(cellFaceting) {

if (!hasAnyNullFilters) {return cellFaceting}

cellFaceting.cellsByFacet = getCellsByFacet(filterableCells, annotationFacets)
cellFaceting.cellsByFacet = getCellsByFacet(filterableCells, facets)
cellFaceting.filterableCells = filterableCells
cellFaceting.filterCounts = nonzeroFilterCountsByFacet

@@ -438,6 +439,7 @@ function getFilterCounts(annotationFacets, cellsByFacet, facets, selection) {
for (let i = 0; i < annotationFacets.length; i++) {
const facet = annotationFacets[i]
const facetCrossfilter = cellsByFacet[facet]

// Set counts for each filter in facet
const rawFilterCounts = facetCrossfilter.group().top(Infinity)
let countsByFilter
@@ -489,6 +491,7 @@ function getFilterCounts(annotationFacets, cellsByFacet, facets, selection) {
return filterCounts
}


/** Get crossfilter-initialized cells by facet */
function getCellsByFacet(filterableCells, annotationFacets) {
const cellCrossfilter = crossfilter(filterableCells)
@@ -545,9 +548,11 @@ function getFacetsToFetch(allRelevanceSortedFacets, prevCellFaceting) {
}
})

return allRelevanceSortedFacets
const facetsToFetch = allRelevanceSortedFacets
.map(annot => annot.annotation)
.slice(fetchOffset, fetchOffset + 5)

return facetsToFetch
}

/** Log metrics to Mixpanel if fully loaded, return next perfTime object to pass in chain */
@@ -593,6 +598,22 @@ function getFilterableAnnotationsForClusterAndStudy(annotations, clusterName) {
return annots
}

/** Omit annotations that are CELLxGENE term IDs */
function getIsCellxGeneTermId(annotName) {
const isCellxGeneTermId = [
'disease_ontology_term_id',
'cell_type_ontology_term_id',
'library_preparation_protocol_term_id',
'sex_ontology_term_id',
'protocol_URL',
'tissue_ontology_term_id',
'assay_ontology_term_id',
'development_stage_ontology_term_id'
].includes(annotName)

return isCellxGeneTermId
}

/** Get 5 default annotation facets: 1 for selected, and 4 others */
export async function initCellFaceting(
selectedCluster, selectedAnnot, studyAccession, allAnnots, prevCellFaceting, subsample=null
@@ -616,12 +637,23 @@ export async function initCellFaceting(
!(annot.type === 'group' && annot.values.length <= 1) &&
!annot.identifier.endsWith('invalid') &&
!annot.identifier.endsWith('user') &&
!(annot.type === 'numeric' && shouldHideNumericCellFiltering)
!(annot.type === 'numeric' && shouldHideNumericCellFiltering) &&
!(getIsCellxGeneTermId(annot.name))
)
})

let allRelevanceSortedFacets =
sortAnnotationsByRelevance(eligibleAnnots)
.filter(annot => {
if (!prevCellFaceting) {
return true
}

const prevAnnotFacets = prevCellFaceting.facets.map(f => f.annotation)

// Omit null facets detected in prior calls of `initCellFaceting`
return (prevAnnotFacets.includes(annot.identifier))
})
.map(annot => {
const facet = { annotation: annot.identifier, type: annot.type }
if (annot.type) {
@@ -683,6 +715,7 @@ export async function initCellFaceting(

// Below line is worth keeping, but only uncomment to debug in development
// window.SCP.cellFaceting = cellFaceting

return cellFaceting
}

7 changes: 5 additions & 2 deletions app/javascript/lib/search-utils.js
@@ -39,8 +39,11 @@ function getPathwayIdsByName() {

const pathwayCache = window.Ideogram.interactionCache

// Lower-quality pathways
const omittedPathways = ['WP1984', 'WP615', 'WP5096']
// Lower-quality or buggy pathways
const omittedPathways = [
'WP1984', 'WP615', 'WP5096',
'WP5520', 'WP5522', 'WP5523'
]

const pathwayIdsByName = {}
const pathwayEntries = Object.entries(pathwayCache)
24 changes: 24 additions & 0 deletions app/lib/study_search_service.rb
@@ -158,6 +158,30 @@ def self.get_studies_from_term_conversion(terms)
accessions_to_filters.with_indifferent_access
end

# search Mongo for facets that don't use BigQuery to source data
# also accounts for presence-based facets like has_morphology
def self.perform_mongo_facet_search(facet, filter_values)
results = []
values = filter_values.map { |entry| [convert_id_format(entry[:id]), entry[:name]] }.flatten
matches = facet.associated_metadata(values:)
matches.each do |metadata|
next if metadata.study.queued_for_deletion

accession = metadata.study.accession
matched_values = facet.is_presence_facet ? [facet.identifier] : values & metadata.values
matched_values.map do |val|
results << { study_accession: accession, facet.identifier.to_sym => val }
end
end
results
end

# deal with ontology id formatting inconsistencies
def self.convert_id_format(id)
parts = id.split(/[_:]/)
[parts.join('_'), parts.join(':')]
end

# take a term and match to a possible search facet/filter
# will return a single hash with keys as facet names, and values as an array of filter matches
def self.match_facet_filters_from_terms(term_list)
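The two helpers added in this hunk can be exercised outside Rails. The sketch below copies `convert_id_format` from the diff and pairs it with a simplified version of the non-presence branch of `perform_mongo_facet_search`. `FakeMetadata`, `match_rows`, the accession `SCP1`, and the sample ontology ID and label are all hypothetical; the real matches come from `facet.associated_metadata`, which this diff does not show, and the real method also skips studies queued for deletion and handles presence facets.

```ruby
# Copied from the diff: return both the underscore and colon form of an ontology ID
def convert_id_format(id)
  parts = id.split(/[_:]/)
  [parts.join('_'), parts.join(':')]
end

# Hypothetical stand-in for one metadata match (study accession plus stored values)
FakeMetadata = Struct.new(:accession, :values)

# Simplified, assumption-laden version of the value-matching branch:
# expand each filter into both ID forms plus its name, then intersect
# with the values stored for each study.
def match_rows(identifier, filter_values, metadata_list)
  values = filter_values.map { |entry| [convert_id_format(entry[:id]), entry[:name]] }.flatten
  metadata_list.flat_map do |metadata|
    (values & metadata.values).map do |val|
      { study_accession: metadata.accession, identifier.to_sym => val }
    end
  end
end

# Illustrative data: the filter uses the colon form, the stored value uses the underscore form
filters = [{ id: 'MONDO:0005109', name: 'example disease label' }]
stored = [FakeMetadata.new('SCP1', ['MONDO_0005109'])]
rows = match_rows('disease', filters, stored)
```

Emitting both ID forms before the `&` intersection is what lets a filter written as `MONDO:0005109` match a stored `MONDO_0005109`, and the reverse.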
2 changes: 2 additions & 0 deletions app/models/expression_file_info.rb
@@ -58,6 +58,8 @@ class ExpressionFileInfo
'MERFISH', # spatial transcriptomic
'NanoString CosMx', #spatial transcriptomic
'osmFISH', # spatial transcriptomic
'Parse Evercode WT v3', # Similar to SPLiT-seq; https://broadinstitute.zendesk.com/agent/tickets/328801
'Parse Evercode WT Mini v3', # Similar to SPLiT-seq; https://broadinstitute.zendesk.com/agent/tickets/328801
'Patch-seq', # multimodal: scRNAseq, electrophys, morphology
'PIP-seq', # scRNA-seq, https://www.nature.com/articles/s41587-023-01685-z
'scATAC-seq/Fluidigm', # scATAC-seq