Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
34 commits
Select commit Hold shift + click to select a range
a1895c4
Add upstream Batch API check before launching new DE job
bistline Jan 27, 2025
7b018f1
Adding tests
bistline Jan 27, 2025
d3542b9
better matching on command line
bistline Jan 29, 2025
41d3d03
add DE job check after raw counts extraction
bistline Jan 29, 2025
8187214
Merge pull request #2196 from broadinstitute/jb-de-job-poller
bistline Feb 13, 2025
9ba901e
Bump nokogiri from 1.18.2 to 1.18.3
dependabot[bot] Feb 19, 2025
5fcff1f
Use BED as default sequence file type for upload
eweitz Feb 20, 2025
2044349
Support CSI files as BED index: like TBI, for big chromosomes
eweitz Feb 20, 2025
0c80bd2
Add CSI to expected message in validation test
eweitz Feb 20, 2025
c39c970
Robustify creation date in embargo enforcement
eweitz Feb 21, 2025
30b72fc
Fix client-side validation of embargo date
eweitz Feb 21, 2025
123362b
Extract validateEmbargo to JS module
eweitz Feb 21, 2025
a792990
Write error message below relevant field
eweitz Feb 21, 2025
321cd17
Refine client-side error state in study creation
eweitz Feb 21, 2025
25ff238
Client-side validate study name, billing project, workspace
eweitz Feb 22, 2025
8fafe44
Scroll to top of study form on validation failure
eweitz Feb 22, 2025
c65e423
Revert incidental update: use Fastq as default, like before
eweitz Feb 24, 2025
087b7aa
Merge pull request #2202 from broadinstitute/ew-csi-genome-upload
eweitz Feb 24, 2025
4a612b9
Allow sequence file upload for AnnData UX
bistline Feb 24, 2025
3871818
Create test for native DOM study validation
eweitz Feb 24, 2025
3205a11
Add more tests for client-side study validation
eweitz Feb 24, 2025
2b02459
Slightly economize descriptive text
eweitz Feb 24, 2025
e7bd59f
Log new study-validation event to Mixpanel
eweitz Feb 24, 2025
1559bd2
Remove nightly storage integrity check
bistline Feb 25, 2025
83ec08a
Fix test regression
eweitz Feb 25, 2025
e886016
Updating to scp-ingest-pipeline:1.40.1
bistline Feb 25, 2025
58820e7
Merge pull request #2201 from broadinstitute/dependabot/bundler/nokog…
eweitz Feb 25, 2025
27e2fc7
Bump rack from 2.2.10 to 2.2.11
dependabot[bot] Feb 25, 2025
cc9e67c
no-op change to force CI
bistline Feb 25, 2025
57bcbef
Merge pull request #2208 from broadinstitute/dependabot/bundler/rack-…
bistline Feb 26, 2025
9cba589
Merge pull request #2205 from broadinstitute/jb-storage-check-disable
bistline Feb 26, 2025
ffb73e5
Merge pull request #2203 from broadinstitute/jb-anndata-ux-sequence-f…
bistline Feb 26, 2025
0e84ac1
Merge pull request #2207 from broadinstitute/jb-ingest-1.40.1
bistline Feb 26, 2025
09b1cd0
Merge pull request #2204 from broadinstitute/ew-improve-study-validation
eweitz Feb 26, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 4 additions & 4 deletions Gemfile.lock
Original file line number Diff line number Diff line change
Expand Up @@ -329,11 +329,11 @@ GEM
net-protocol
netrc (0.11.0)
nio4r (2.7.4)
nokogiri (1.18.2-arm64-darwin)
nokogiri (1.18.3-arm64-darwin)
racc (~> 1.4)
nokogiri (1.18.2-x86_64-darwin)
nokogiri (1.18.3-x86_64-darwin)
racc (~> 1.4)
nokogiri (1.18.2-x86_64-linux-gnu)
nokogiri (1.18.3-x86_64-linux-gnu)
racc (~> 1.4)
oauth2 (1.4.7)
faraday (>= 0.8, < 2.0)
Expand Down Expand Up @@ -368,7 +368,7 @@ GEM
puma (5.6.9)
nio4r (~> 2.0)
racc (1.8.1)
rack (2.2.10)
rack (2.2.11)
rack-brotli (1.1.0)
brotli (>= 0.1.7)
rack (>= 1.4)
Expand Down
4 changes: 2 additions & 2 deletions app/javascript/components/upload/UploadWizard.jsx
Original file line number Diff line number Diff line change
Expand Up @@ -55,8 +55,8 @@ const ALL_POSSIBLE_STEPS = [
AnnDataUploadStep,
DifferentialExpressionStep,
SpatialStep,
CoordinateLabelStep,
SequenceFileStep,
CoordinateLabelStep,
GeneListStep,
MiscellaneousStep,
SeuratStep,
Expand Down Expand Up @@ -101,7 +101,7 @@ export function RawUploadWizard({ studyAccession, name }) {
// set the additional steps to display, based on classic or AnnData experience
if (isAnnDataExperience) {
MAIN_STEPS = MAIN_STEPS_ANNDATA
SUPPLEMENTAL_STEPS = ALL_POSSIBLE_STEPS.slice(6, 7)
SUPPLEMENTAL_STEPS = ALL_POSSIBLE_STEPS.slice(6, 8)
// SUPPLEMENTAL_STEPS.splice(1, 0, DifferentialExpressionStep)
// TODO enable after raw counts are sorted for AnnData (SCP-5110)
NON_VISUALIZABLE_STEPS = ALL_POSSIBLE_STEPS.slice(10, 12)
Expand Down
3 changes: 2 additions & 1 deletion app/javascript/components/upload/upload-utils.js
Original file line number Diff line number Diff line change
Expand Up @@ -332,6 +332,7 @@ const sequenceExtensions = [
]
const baiExtensions = ['.bai']
const tbiExtensions = ['.tbi']
const csiExtensions = ['.csi'] // Like TBI, but supports larger chromosomes
const annDataExtensions = ['.h5', '.h5ad', '.hdf5']
const seuratExtensions = ['.Rds', '.rds', '.RDS', '.seuratdata', '.h5seurat', '.h5Seurat', '.seuratdisk', '.Rda', '.rda']
const miscExtensions = baseMiscExtensions.concat(mtxExtensions, annDataExtensions, seuratExtensions)
Expand All @@ -342,7 +343,7 @@ export const FileTypeExtensions = {
misc: miscExtensions.concat(miscExtensions.map(ext => `${ext}.gz`)),
sequence: sequenceExtensions,
bai: baiExtensions,
tbi: tbiExtensions,
tbi: tbiExtensions.concat(csiExtensions),
annData: annDataExtensions,
seurat: seuratExtensions
}
Expand Down
37 changes: 23 additions & 14 deletions app/javascript/lib/validation/log-validation.js
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,25 @@ function getTrimmedIssueMessages(issues) {
}).slice(0, 20) // Show <= 20 messages
}

/** Get shared issue props for file-validation and study-validation */
export function getWarningAndErrorProps(errors, warnings) {
const errorMessages = getTrimmedIssueMessages(errors)
const warningMessages = getTrimmedIssueMessages(warnings)

const errorTypes = Array.from(new Set(errors.map(columns => columns[1])))
const warningTypes = Array.from(new Set(warnings.map(columns => columns[1])))

return {
errors: errorMessages,
warnings: warningMessages,
errorTypes, warningTypes,
numErrors: errors.length,
numWarnings: warnings.length,
numErrorTypes: errorTypes.length,
numWarningTypes: warningTypes.length
}
}

/** Get properties about this validation run to log to Mixpanel */
export function getLogProps(fileInfo, issueObj, perfTimes) {
const { errors, warnings, summary } = issueObj
Expand Down Expand Up @@ -45,23 +64,13 @@ export function getLogProps(fileInfo, issueObj, perfTimes) {
}
return Object.assign({ status: 'success' }, defaultProps)
} else {
const errorMessages = getTrimmedIssueMessages(errors)
const warningMessages = getTrimmedIssueMessages(warnings)
const issueProps = getWarningAndErrorProps(errors, warnings)

const errorTypes = Array.from(new Set(errors.map(columns => columns[1])))
const warningTypes = Array.from(new Set(warnings.map(columns => columns[1])))
const withIssueProps = Object.assign(defaultProps, issueProps)

return Object.assign(defaultProps, {
return Object.assign(withIssueProps, {
status: 'failure',
summary,
numErrors: errors.length,
numWarnings: warnings.length,
errors: errorMessages,
warnings: warningMessages,
numErrorTypes: errorTypes.length,
numWarningTypes: warningTypes.length,
errorTypes,
warningTypes
summary
})
}
}
Expand Down
2 changes: 1 addition & 1 deletion app/javascript/lib/validation/validate-file-content.js
Original file line number Diff line number Diff line change
Expand Up @@ -35,7 +35,7 @@ import { parseAnnDataFile } from './validate-anndata'
const MAX_GZIP_FILESIZE = 50 * oneMiB

/** File extensions / suffixes that indicate content must be gzipped */
const EXTENSIONS_MUST_GZIP = ['gz', 'bam', 'tbi']
const EXTENSIONS_MUST_GZIP = ['gz', 'bam', 'tbi', 'csi']


/**
Expand Down
178 changes: 178 additions & 0 deletions app/javascript/lib/validation/validate-study.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,178 @@
import { getWarningAndErrorProps } from '~/lib/validation/log-validation'
import { log } from '~/lib/metrics-api'

/** Ensure a name is provided for the study */
function validateName(input) {
const issues = []

const name = input.value
if (name === '') {
const msg = 'Enter a name for your study.'
issues.push(['error', 'missing-name', msg])
}

return issues
}

/** Convert Date string to string format in "Data release date" UI */
function dateToMMDDYYYY(dateString) {
const date = new Date(dateString)
const rawMonth = date.getMonth() + 1; // Months are zero-indexed, so add 1
const rawDay = date.getDate();
const year = date.getFullYear();

const month = rawMonth.toString().padStart(2, '0');
const day = rawDay.toString().padStart(2, '0');

const MMDDYYYY = `${month}/${day}/${year}`;

return MMDDYYYY
}

/** Ensure any embargo date is between tomorrow and max embargo date */
export function validateEmbargo(embargoInput) {
const issues = []

const rawEmbargoDate = embargoInput.value
const embargoDate = new Date(embargoInput.value)
const maxDate = new Date(embargoInput.max)

const tomorrow = new Date();
tomorrow.setDate(tomorrow.getDate() + 1);

if (
rawEmbargoDate !== '' &&
(embargoDate > maxDate || embargoDate < tomorrow)
) {
const tomorrowFormatted = dateToMMDDYYYY(tomorrow)
const maxDateFormatted = dateToMMDDYYYY(maxDate)

const msg =
`If embargoed, date must be between ` +
`tomorrow (${tomorrowFormatted}) and ${maxDateFormatted}.`

issues.push(['error', 'invalid-embargo', msg])
}

return issues
}

/** Ensure a billing project is selected */
function validateBillingProject(input) {
const issues = []

const billingProject = input.value
if (billingProject === '') {
const msg = 'Pick a billing project from above menu.'
issues.push(['error', 'missing-billing-project', msg])
}

return issues
}

/** Ensure a workspace is selected, if using existing workspace */
function validateWorkspace(input, studyForm) {
const issues = []

const workspace = input.value
const useExistingWorkspace =
studyForm.querySelector('#study_use_existing_workspace').value === '1'

if (useExistingWorkspace && workspace === '') {
const msg = 'Enter a workspace name, or set "Use existing workspace?" to "No".'
issues.push(['error', 'missing-workspace', msg])
}

return issues
}

/** Add or remove error classes from field elements around given element */
function updateErrorState(element, addOrRemove) {
const fieldDiv = element.closest('[class^="col-md"]')

if (addOrRemove === 'add') {
fieldDiv.querySelector('label').classList.add('text-danger')
fieldDiv.classList.add('has-error', 'has-feedback')
} else {
fieldDiv.querySelector('label').classList.remove('text-danger')
fieldDiv.classList.remove('has-error', 'has-feedback')
}
}

/** Render messages for any validation issue for given input element */
function writeValidationMessage(input, issues) {
if (issues.length === 0) {return}

// Consider below if we need to deal with multiple errors per field
// See ValidationMessage.jsx for model to follow
// const messageList = '<ul>' + issues.map(issue => {
// const msg = issue[2]
// return `<li className="validation-error">${msg}</li>`
// }).join('') + '</ul>

const message = issues[0][2]

const messageHtml = `<div class="validation-error">${message}</div>`

updateErrorState(input, 'add')
input.insertAdjacentHTML('afterend', messageHtml)
}

/** Validate given field, get issues, write any messages */
function checkField(studyForm, field, validateFns, issues) {
const input = studyForm.querySelector(`#study_${field}`)
let fieldIssues
if (field === 'firecloud_workspace') {
fieldIssues = validateFns[field](input, studyForm)
} else {
fieldIssues = validateFns[field](input)
}
issues = issues.concat(fieldIssues)
writeValidationMessage(input, fieldIssues)

return issues
}

/** Get event data to log to Bard / Mixpanel */
function getLogProps(issues) {
const warnings = issues.filter(issue => issue[0] === 'warn')
const errors = issues.filter(issue => issue[0] === 'error')

const issueProps = getWarningAndErrorProps(errors, warnings)
const status = errors.length === 0 ? 'success' : 'failure'

const logProps = Object.assign(issueProps, {
status
})

return logProps
}

/** Validation form in "Create study" page */
export function validateStudy(studyForm) {
let issues = []

// Clear any prior error messages
document.querySelectorAll('.validation-error').forEach(error => {
updateErrorState(error, 'remove')
error.remove()
})

const validateFns = {
'name': validateName,
'firecloud_project': validateBillingProject, // "Terra billing project"
'embargo': validateEmbargo, // "Data release date"
'firecloud_workspace': validateWorkspace // "Existing Terra workspace"
}

const fields = Object.keys(validateFns)
fields.forEach(field => {
issues = checkField(studyForm, field, validateFns, issues)
})

const logProps = getLogProps(issues)

log('study-validation', logProps)

return issues
}
2 changes: 2 additions & 0 deletions app/javascript/vite/application.js
Original file line number Diff line number Diff line change
Expand Up @@ -15,6 +15,7 @@ import ClusterAssociationSelect from '~/components/upload/ClusterAssociationSele
import RawAssociationSelect from '~/components/upload/RawAssociationSelect'
import { getFeatureFlagsWithDefaults } from '~/providers/UserProvider'
import ValidateFile from '~/lib/validation/validate-file'
import { validateStudy } from '~/lib/validation/validate-study'
import { setupSentry } from '~/lib/sentry-logging'
import { adjustGlobalHeader, mitigateStudyOverviewTitleTruncation } from '~/lib/layout-utils'
import { clearOldServiceWorkerCaches } from '~/lib/service-worker-cache'
Expand Down Expand Up @@ -132,6 +133,7 @@ window.SCP.eventsToLog.forEach(eventToLog => {

window.SCP.getFeatureFlagsWithDefaults = getFeatureFlagsWithDefaults
window.SCP.validateRemoteFile = validateRemoteFile
window.SCP.validateStudy = validateStudy
window.SCP.API = ScpApi

window.Spinner = Spinner
Expand Down
14 changes: 13 additions & 1 deletion app/lib/differential_expression_service.rb
Original file line number Diff line number Diff line change
Expand Up @@ -163,7 +163,19 @@ def self.run_differential_expression_job(cluster_group, study, user, annotation_
params_object.machine_type = machine_type if machine_type.present? # override :machine_type if specified
return true if dry_run # exit before submission if specified as annotation was already validated

if params_object.valid?
# check if there's already a job running using these parameters and exit if so
job_params = ApplicationController.batch_api_client.format_command_line(
study_file:, action: :differential_expression, user_metrics_uuid: user.metrics_uuid, params_object:
)
running = ApplicationController.batch_api_client.find_matching_jobs(
params: job_params, job_states: BatchApiClient::RUNNING_STATES
)
if running.any?
log_message "Found #{running.count} running DE jobs: #{running.map(&:name).join(', ')}"
log_message "Matching these parameters: #{job_params.join(' ')}"
log_message "Exiting without queuing new job"
false
elsif params_object.valid?
# launch DE job
job = IngestJob.new(study:, study_file:, user:, action: :differential_expression, params_object:)
job.delay.push_remote_and_launch_ingest
Expand Down
13 changes: 0 additions & 13 deletions app/mailers/single_cell_mailer.rb
Original file line number Diff line number Diff line change
Expand Up @@ -282,16 +282,6 @@ def user_notification(user, subject, message)
end
end

# nightly sanity check email looking for missing files
def sanity_check(missing_files)
@missing_files = missing_files
@admins = User.where(admin: true).map(&:email)

mail(to: @admins, subject: "[Single Cell Portal Admin Notification #{Rails.env != 'production' ? " (#{Rails.env})" : nil}]: Sanity check results: #{@missing_files.size} files missing") do |format|
format.html
end
end

# collect usage statistics for the given day and email admins
def nightly_admin_report
@admins = User.where(admin: true).map(&:email)
Expand Down Expand Up @@ -324,9 +314,6 @@ def nightly_admin_report
# disk usage
@disk_stats = SummaryStatsUtils.disk_usage

# storage sanity check
@missing_files = SummaryStatsUtils.storage_sanity_check

mail(to: @admins, subject: "[Single Cell Portal Admin Notification#{Rails.env != 'production' ? " (#{Rails.env})" : nil}]: Nightly Server Report for #{@today}") do |format|
format.html { render layout: 'nightly_admin_report' }
end
Expand Down
16 changes: 16 additions & 0 deletions app/models/batch_api_client.rb
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,22 @@ def list_jobs(page_token: nil)
service.list_project_location_jobs(project_location, page_token:)
end

# find any existing jobs using command line parameters & job states
#
# * *params*
# - +params+ (Array<String>) => array of params sent to job to match on
# - +job_state+ (Array<String>) => optional states to filter on (defaults to completed jobs)
#
# * *returns*
# - (Array<Google::Apis::BatchV1::Job>)
def find_matching_jobs(params: [], job_states: COMPLETED_STATES)
jobs = list_jobs.jobs
jobs.select do |job|
commands = get_job_command_line(job:)
(params & commands).sort == params.sort && job_states.include?(job.status.state)
end
end

# main handler to create and run a Batch API job
#
# * *params*
Expand Down
1 change: 1 addition & 0 deletions app/models/ingest_job.rb
Original file line number Diff line number Diff line change
Expand Up @@ -516,6 +516,7 @@ def set_study_state_after_ingest
set_has_image_cache
when :ingest_anndata
launch_anndata_subparse_jobs if study_file.is_viz_anndata?
launch_differential_expression_jobs if study_file.is_viz_anndata?
set_anndata_file_info
end
set_study_initialized
Expand Down
Loading
Loading