@hyi hyi commented Oct 12, 2025

Added a slurm folder to support running Babel in a distributed fashion on Hatteras. Reference: https://snakemake.github.io/snakemake-plugin-catalog/plugins/executor/slurm.html

  • Adds a slurm/run-babel-on-slurm.sh script that can be used to run Babel on Slurm (just set the BABEL_VERSION environment variable to something that distinguishes the run, like babel-1.15-run-1, to get distinct log files).
    • This will create a low-memory, low-CPU sbatch job that can then start new Slurm jobs for each Snakemake step. This is not recommended: the recommended way to run Snakemake on Slurm is to run it on the login node (as it doesn't take up significant resources) and have it spawn jobs to run, and they warn that running it inside a job might have unpredictable behavior. I like the simplicity of running everything within a single setup instead of managing Screen sessions and logs separately. So far so good, however.
    • This job will use the Snakemake profile in slurm/config.yaml, which configures the use of the Slurm runtime and sets up default resources for all jobs.
  • @hyi created two jobs that can be used to run Babel on Hatteras: I've kept them in slurm/job if anybody needs them, but slurm/run-babel-on-slurm.sh will automatically set up output/error/default timeout for you.
  • Added additional resources to: chemical_unichem_concordia, untyped_chemical_compendia, chemical_compendia, get_ensembl, chembl_labels_and_smiles, drugchemical_conflated_synonyms, export_compendia_to_duckdb, check_for_identically_labeled_cliques, check_for_duplicate_curies, check_for_duplicate_clique_leaders, generate_kgx, generate_sapbert_training_data, gene_compendia, geneprotein_conflated_synonyms, get_protein_pr_uniprotkb_relationships, protein_compendia, protein, generate_pubmed_concords, generate_pubmed_compendia.
  • Set up jobs with retries: get_anatomy_obo_relationships, chemical_chebi_ids, get_obo_labels, get_obo_synonyms, get_obo_descriptions, get_icrdf, get_unichem
  • Started improving some DuckDB reports (work completed in PR #624, "Improved reports with tables")
    • Split job generate_prefix_report into generate_curie_report and generate_by_clique_report
    • Gzipped identically_labeled_cliques.tsv to improve storage.
  • DuckDB was running out of memory doing some large loads, so I added some code for importing the data in batches.
  • Simplified a bunch of the DuckDB reports: they were running out of memory, so I simplified them until I could run them with less than 50G of memory.
    • For some queries, I was able to get them to finish by giving them up to the maximum memory on Hatteras (1500G). To do this, I had to set up a duckdb_config dictionary for sending configuration information to DuckDB.
  • Some minor fixes:
    • Added explicit owlfile arguments to write_efo_ids(), build_disease_efo_relationships() and class EFOgraph.
    • Reduced BIOMART_MAX_ATTRIBUTE_COUNT from 8 to 6 to get it to run.
    • Added a verify_gzip option to pull_via_urllib() to allow downloaded gzipped files to be verified.
    • Deleted src/cluster_config.yml, which doesn't appear to do anything.
    • Sleep for 5 seconds between UberGraph requests.
    • Noted that this doesn't work on Python 3.14 (I think because of a prerequisite?)
    • Copy babel-config.yaml into the babel-outputs once all the other jobs are complete, so we keep a record of what the configuration was.
    • Switch rules all and all_outputs to use write_done() instead of creating the files with shell.

Should be merged after PR #620, "Babel 1.15".
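
For reviewers who want a picture of the batched import mentioned above, here is a minimal sketch of the pattern. It is not the code in this PR: `batched`, `load_in_batches`, the table name and the column names are all hypothetical, and it only assumes a connection object with a DB-API style `executemany()` (which DuckDB's Python connections provide). A real DuckDB load would more likely bulk-read one file at a time; this only illustrates keeping the amount of data in flight bounded.

```python
from itertools import islice
from typing import Iterable, Iterator, List, Sequence


def batched(rows: Iterable, batch_size: int) -> Iterator[List]:
    """Yield successive lists of at most batch_size items from rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def load_in_batches(con, table: str, columns: Sequence[str],
                    rows: Iterable[Sequence], batch_size: int = 100_000) -> int:
    """Insert rows into table one batch at a time; return the row count.

    `con` is assumed to be a DB-API style connection. `table` and `columns`
    are interpolated into the SQL, so they must come from trusted code.
    """
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    total = 0
    for batch in batched(rows, batch_size):
        # Only one batch is materialized in Python at any time.
        con.executemany(sql, batch)
        total += len(batch)
    return total
```

With DuckDB, the connection could be opened with a configuration dictionary along the lines of `duckdb.connect("reports.duckdb", config={"memory_limit": "50GB"})`; the file name and the limit here are placeholders, not the values used in this PR.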
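
The verify_gzip option above is about catching truncated or corrupt downloads. One way such a check could work, as a sketch only: `gzip_is_valid` is a hypothetical helper, not the function added to pull_via_urllib() in this PR. It reads the file to the end, since the gzip trailer (CRC-32 and length) is only checked once the whole stream has been decompressed.

```python
import gzip
import zlib


def gzip_is_valid(path: str, chunk_size: int = 1024 * 1024) -> bool:
    """Return True if the file at path decompresses cleanly to the end."""
    try:
        with gzip.open(path, "rb") as handle:
            # Read in chunks so large downloads are never held in memory.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers gzip.BadGzipFile (bad header or CRC mismatch);
        # EOFError is raised when the stream is truncated.
        return False
    return True
```

A caller could delete the file and retry the download when this returns False.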
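
The 5-second pause between UberGraph requests can be pictured as a small throttling loop. This is an illustrative sketch with hypothetical names (`run_throttled`, `fetch`), not the code in this PR; the `sleep` parameter exists only so the delay can be swapped out in tests.

```python
import time
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
R = TypeVar("R")


def run_throttled(items: Iterable[T], fetch: Callable[[T], R],
                  delay_seconds: float = 5.0,
                  sleep: Callable[[float], object] = time.sleep) -> Iterator[R]:
    """Yield fetch(item) for each item, pausing between consecutive calls."""
    for index, item in enumerate(items):
        if index > 0:
            # Pause before every request except the first one.
            sleep(delay_seconds)
        yield fetch(item)
```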

@hyi hyi requested a review from gaurav October 12, 2025 20:41
@gaurav gaurav changed the base branch from master to babel-1.14 November 14, 2025 23:11
@gaurav gaurav force-pushed the 35-run-distribute-hatteras branch from abcc15d to e80b57f November 14, 2025 23:36
@gaurav gaurav force-pushed the 35-run-distribute-hatteras branch from 33a8381 to 393209e December 14, 2025 05:17
@gaurav gaurav changed the base branch from babel-1.14 to babel-1.15 December 14, 2025 05:21