NCATSTranslator · gaurav · Dec 15, 2025 · Oct 29, 2025
diff --git a/.github/workflows/release.yaml b/.github/workflows/release.yaml
@@ -24,7 +24,7 @@ jobs:
         # with you. This also makes it possible to fetch additional branches from
         # GitHub if you need to.
         with:
-          fetch-depth: 0
+          persist-credentials: false
       - name: Get the version
         id: get_version
         run: echo ::set-output name=VERSION::${GITHUB_REF/refs\/tags\//}

diff --git a/config.yaml b/config.yaml
@@ -1,16 +1,21 @@
+# Build information. Currently unstructured -- you can use this to write down any notes on what is going on with
+# this specific build of Babel.
+build:
+  branch: babel-1.14
+
+# Versions that need to be updated on every release.
+biolink_version: "4.3.2"
+umls_version: "2025AA"
+rxnorm_version: "10062025"
+drugbank_version: "5-1-13"
+
 # Overall inputs and outputs.
 input_directory: input_data
 download_directory: babel_downloads
 intermediate_directory: babel_outputs/intermediate
 output_directory: babel_outputs
 tmp_directory: babel_downloads/tmp
 
-# Versions that need to be updated on every release.
-biolink_version: "4.2.6-rc5"
-umls_version: "2025AA"
-rxnorm_version: "07072025"
-drugbank_version: "5-1-13"
-
 #
 # UMLS
 #

diff --git a/docs/README.md b/docs/README.md
@@ -2,8 +2,8 @@
 
 This directory contains several pieces of Babel documentation.
 
-Both [Node Normalization (NodeNorm)](https://github.com/TranslatorSRI/NodeNormalization) and
-[Name Resolution (NameRes or NameLookup)](https://github.com/TranslatorSRI/NameResolution) have their own GitHub repositories
+Both [Node Normalization (NodeNorm)](https://github.com/NCATSTranslator/NodeNormalization) and
+[Name Resolution (NameRes or NameLookup)](https://github.com/NCATSTranslator/NameResolution) have their own GitHub repositories
 with their own documentation, but this directory is intended to include all the basic instructions
 needed to work with Babel and its tools.
 
@@ -18,7 +18,7 @@ _cliques_ of identifiers that refer to the same concept. Each clique is assigned
 type from the [Biolink Model](https://github.com/biolink/biolink-model), which determines which identifier prefixes are
 allowed and the order in which the identifiers are presented. One of these identifiers
 is chosen to be the _preferred identifier_ for the clique. Within Translator, this
-information is made available through the [Node Normalization service](https://github.com/TranslatorSRI/NodeNormalization).
+information is made available through the [Node Normalization service](https://github.com/NCATSTranslator/NodeNormalization).
 
 In certain contexts, differentiating between some related cliques doesn't make sense:
 for example, you might not want to differentiate between a gene and the product of that
@@ -27,7 +27,7 @@ on the basis of various criteria: for example, the GeneProtein conflation combin
 gene with the protein that that gene encodes.
 
 While generating these cliques, Babel also collects all the synonyms for every clique,
-which can then be used by tools like [Name Resolution (NameRes)](https://github.com/TranslatorSRI/NameResolution) to provide
+which can then be used by tools like [Name Resolution (NameRes)](https://github.com/NCATSTranslator/NameResolution) to provide
 name-based lookup of concepts.
 
 ## How can I access Babel cliques?
@@ -41,17 +41,17 @@ There are several ways of accessing Babel cliques:
   "normalize" identifiers -- any member of a particular clique will be normalized
   to the same preferred identifier, and the API will return all the secondary
   identifiers, Biolink type, description and other useful information.
-  You can find out more about this frontend on [its GitHub repository](https://github.com/TranslatorSRI/NodeNormalization).
+  You can find out more about this frontend on [its GitHub repository](https://github.com/NCATSTranslator/NodeNormalization).
 * The NCATS Translator project also provides the [Name Lookup (Name Resolution)](https://name-lookup.transltr.io/)
   frontends for searching for concepts by labels or synonyms. You can find out more
-  about this frontend at [its GitHub repository](https://github.com/TranslatorSRI/NameResolution).
+  about this frontend at [its GitHub repository](https://github.com/NCATSTranslator/NameResolution).
 * Members of the Translator consortium can also request access to the [Babel outputs](./BabelOutputs.md)
   (in a [custom format](./DataFormats.md)),
   which are currently available in JSONL, [Apache Parquet](https://parquet.apache.org/) or [KGX](https://github.com/biolink/kgx) formats.
 
 ## What is the Node Normalization service (NodeNorm)?
 
-The Node Normalization service, Node Normalizer or [NodeNorm](https://github.com/TranslatorSRI/NodeNormalization) is an
+The Node Normalization service, Node Normalizer or [NodeNorm](https://github.com/NCATSTranslator/NodeNormalization) is an
 NCATS Translator web service to normalize identifiers by returning a single preferred identifier for any identifier
 provided.
 
@@ -63,17 +63,17 @@ It also includes some endpoints for normalizing an entire TRAPI message and othe
 Translator users.
 
 You can find out more about NodeNorm at its [Swagger interface](https://nodenormalization-sri.renci.org/docs)
-or [in this Jupyter Notebook](https://github.com/TranslatorSRI/NodeNormalization/blob/master/documentation/NodeNormalization.ipynb).
+or [in this Jupyter Notebook](https://github.com/NCATSTranslator/NodeNormalization/blob/master/documentation/NodeNormalization.ipynb).
 
 ## What is the Name Resolution service (NameRes)?
 
-The Name Resolution service, Name Lookup or [NameRes](https://github.com/TranslatorSRI/NameResolution) is an
+The Name Resolution service, Name Lookup or [NameRes](https://github.com/NCATSTranslator/NameResolution) is an
 NCATS Translator web service for looking up preferred identifiers by search text. Although it is primarily
 designed to be used to power NCATS Translator's autocomplete text fields, it has also been used for
 named-entity linkage.
 
 You can find out more about NameRes at its [Swagger interface](https://name-resolution-sri.renci.org/docs)
-or [in this Jupyter Notebook](https://github.com/TranslatorSRI/NameResolution/blob/master/documentation/NameResolution.ipynb).
+or [in this Jupyter Notebook](https://github.com/NCATSTranslator/NameResolution/blob/master/documentation/NameResolution.ipynb).
 
 ## What are "information content" values?
 
@@ -84,7 +84,7 @@ that range from 0.0 (high-level broad term with many subclasses) to 100.0 (very
 
 ## I've found a "split" clique: two identifiers that should be considered identical are in separate cliques.
 
-Please report this as an issue to the [Babel GitHub repository](https://github.com/TranslatorSRI/Babel/issues).
+Please report this as an issue to the [Babel GitHub repository](https://github.com/NCATSTranslator/Babel/issues).
 At a minimum, please include the identifiers (CURIEs) for the identifiers that should be combined. Links to
 a NodeNorm instance showing the two cliques are very helpful. Evidence supporting the lumping, such as a link to an
 external database that makes it clear that these identifiers refer to the same concept, are also very helpful: while we
@@ -93,7 +93,7 @@ mappings that would combine the two identifiers, allowing us to improve cliquing
 
 ## I've found a "lumped" clique: two identifiers that are combined in a single clique refer to different concepts.
 
-Please report this as an issue to the [Babel GitHub repository](https://github.com/TranslatorSRI/Babel/issues).
+Please report this as an issue to the [Babel GitHub repository](https://github.com/NCATSTranslator/Babel/issues).
 At a minimum, please include the identifiers (CURIEs) for the identifiers that should be split. Links to
 a NodeNorm instance showing the lumped clique is very helpful. Evidence, such as a link to an external database
 that makes it clear that these identifiers refer to the same concept, are also very helpful: while we have some
@@ -117,6 +117,6 @@ into any problems or would like some assistance.
 
 ## Who should I contact for more information about Babel?
 
-You can find out more about Babel by [opening an issue on this repository](https://github.com/TranslatorSRI/Babel/issues),
+You can find out more about Babel by [opening an issue on this repository](https://github.com/NCATSTranslator/Babel/issues),
 contacting one of the [Translator SRI PIs](https://ncats.nih.gov/research/research-activities/translator/projects) or
-contacting the [NCATS Translator team](https://ncats.nih.gov/research/research-activities/translator/about).
+contacting the [NCATS Translator team](https://ncats.nih.gov/research/research-activities/translator/about).
diff --git a/kubernetes/babel.k8s.yaml b/kubernetes/babel.k8s.yaml
@@ -19,7 +19,7 @@ spec:
   restartPolicy: Never
   containers:
   - name: babel
-    image: ghcr.io/translatorsri/babel:latest
+    image: ghcr.io/ncatstranslator/babel:latest
     # I just need something to run while I figure out how to make this work
     command: [ "/bin/bash", "-c", "--" ]
     args: [ "while true; echo Running; do sleep 30; done;" ]

diff --git a/pyproject.toml b/pyproject.toml
@@ -34,8 +34,8 @@ dependencies = [
 ]
 
 [project.urls]
-Homepage = "https://github.com/TranslatorSRI/Babel"
-Repository = "https://github.com/TranslatorSRI/Babel"
+Homepage = "https://github.com/NCATSTranslator/Babel"
+Repository = "https://github.com/NCATSTranslator/Babel"
 Issues = "https://github.com/NCATSTranslator/Babel/issues"
 
 [tool.uv.sources]
@@ -56,4 +56,4 @@ line-length = 160
 
 [tool.snakefmt]
 line_length = 160
-include = '\.snakefile$|^Snakefile'
+include = '\.snakefile$|^Snakefile'
diff --git a/src/babel_utils.py b/src/babel_utils.py
@@ -141,7 +141,7 @@
        self.delta = timedelta(milliseconds=delta_ms)

    def get(self, url):
        now = dt.now()
        throttled = False
        if self.last_time is not None:
            cdelta = now - self.last_time
@@ -149,7 +149,7 @@
                waittime = self.delta - cdelta
                time.sleep(waittime.microseconds / 1e6)
                throttled = True
        self.last_time = dt.now()
        response = requests.get(url)
        return response, throttled
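
Review note on the throttling context above: `timedelta.microseconds` holds only the sub-second component of an interval, so `waittime.microseconds / 1e6` would drop whole seconds whenever the configured delta is one second or longer. A minimal sketch of the wait computation using `total_seconds()` follows; the helper name `seconds_to_wait` is hypothetical and not part of Babel.

```python
from datetime import timedelta


def seconds_to_wait(min_gap: timedelta, elapsed: timedelta) -> float:
    """Return how long to sleep so that at least min_gap separates two requests.

    Hypothetical helper for illustration; it is not a function in babel_utils.py.
    """
    remaining = min_gap - elapsed
    # total_seconds() combines days, seconds and microseconds; the
    # .microseconds attribute alone is only the sub-second part.
    return max(remaining.total_seconds(), 0.0)


gap = timedelta(milliseconds=1500)
elapsed = timedelta(milliseconds=200)
print(seconds_to_wait(gap, elapsed))        # 1.3
print((gap - elapsed).microseconds / 1e6)   # 0.3 (the whole second is lost)
```

With sub-second deltas the two expressions agree, so the difference only matters if a throttler is constructed with `delta_ms` of 1000 or more.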

@@ -175,7 +175,7 @@
    """
    # Everything goes in downloads
    download_dir = get_config()["download_directory"]
    working_dir = download_dir

    # get the (local) download file name, derived from the input file name
    if subpath is None:
@@ -307,13 +307,13 @@
     # Decompress the downloaded file if needed.
     uncompressed_filename = None
     if decompress:
-        if dl_file_name.endswith(".gz"):
+        if dl_file_name.lower().endswith(".gz"):
             uncompressed_filename = dl_file_name[:-3]
             process = subprocess.run(["gunzip", dl_file_name])
             if process.returncode != 0:
                 raise RuntimeError(f"Could not execute gunzip ['gunzip', {dl_file_name}]: {process.stderr}")
         else:
-            raise RuntimeError(f"Don't know how to decompress {in_file_name}")
+            raise RuntimeError(f"Don't know how to decompress {in_file_name}, which was downloaded as '{dl_file_name}'.")
 
         if os.path.isfile(uncompressed_filename):
             file_size = os.path.getsize(uncompressed_filename)
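
Review note on the suffix check above: the change makes the `.gz` test case-insensitive while still slicing three characters off the original name. A minimal sketch of that behaviour in isolation, assuming a hypothetical helper `strip_gz_suffix` that is not part of Babel:

```python
from typing import Optional


def strip_gz_suffix(filename: str) -> Optional[str]:
    """Return filename without a trailing .gz (any letter case), or None.

    Hypothetical helper for illustration; it mirrors the lower().endswith()
    check and the [:-3] slice shown in the hunk above.
    """
    if filename.lower().endswith(".gz"):
        # Slice the original string so the rest of the name keeps its case.
        return filename[:-3]
    return None


print(strip_gz_suffix("chebi.sdf.gz"))   # chebi.sdf
print(strip_gz_suffix("DATA.TSV.GZ"))    # DATA.TSV
print(strip_gz_suffix("notes.txt"))      # None
```

Returning `None` for an unrecognised suffix lets the caller raise its own error with both the input and downloaded file names, as the updated `RuntimeError` message does.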
@@ -538,11 +538,11 @@
                    possible_labels = map(lambda identifier: identifier.get("label", ""), node["identifiers"])

                # Step 2. Filter out any suspicious labels.
                filtered_possible_labels = [l for l in possible_labels if l]  # Ignore blank or empty names.

                # Step 3. Filter out labels longer than config['demote_labels_longer_than'], but only if there is at
                # least one label shorter than this limit.
                labels_shorter_than_limit = [l for l in filtered_possible_labels if l and len(l) <= config["demote_labels_longer_than"]]
                if labels_shorter_than_limit:
                    filtered_possible_labels = labels_shorter_than_limit
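
Review note on Steps 2 and 3 above: the filtering can be read as "drop empty labels, then prefer labels within the length limit, falling back to all non-empty labels if none qualify". A minimal standalone sketch, assuming a hypothetical function `demote_long_labels` whose `max_length` argument stands in for `config["demote_labels_longer_than"]`:

```python
from typing import Iterable, List


def demote_long_labels(labels: Iterable[str], max_length: int) -> List[str]:
    """Prefer labels of at most max_length characters, if any exist.

    Hypothetical function for illustration; it is not part of babel_utils.py.
    """
    # Step 2: ignore blank or empty names.
    non_empty = [label for label in labels if label]
    # Step 3: keep only the short labels, but only when at least one exists.
    short_enough = [label for label in non_empty if len(label) <= max_length]
    return short_enough if short_enough else non_empty


print(demote_long_labels(["", "aspirin", "2-acetoxybenzoic acid"], 10))
# ['aspirin']
print(demote_long_labels(["2-acetoxybenzoic acid", ""], 10))
# ['2-acetoxybenzoic acid']
```

The fallback means a clique whose only labels exceed the limit still gets a label rather than none.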

@@ -731,7 +731,7 @@
    shit_prefixes = set(["KEGG", "PUBCHEM"])
    test_id = "xUBERON:0002262"
    debugit = False
    excised = set()
    for xgroup in newgroups:
        if isinstance(xgroup, frozenset):
            group = set(xgroup)
@@ -751,7 +751,7 @@
        existing_sets_w_x = [(conc_set[x], x) for x in group if x in conc_set]
        # All of these sets are now going to be combined through the equivalence of our new set.
        existing_sets = [es[0] for es in existing_sets_w_x]
        x = [es[1] for es in existing_sets_w_x]
        newset = set().union(*existing_sets)
        if debugit:
            print("merges:", existing_sets)
@@ -779,7 +779,7 @@
        for up in unique_prefixes:
            if test_id in group:
                print("up?", up)
            idents = [e if type(e) == str else e.identifier for e in newset]
            if len(set([e for e in idents if (e.split(":")[0] == up)])) > 1:
                bad += 1
                setok = False

diff --git a/src/createcompendia/leftover_umls.py b/src/createcompendia/leftover_umls.py
@@ -221,6 +221,7 @@ def umls_type_to_biolink_type(umls_tui):
                     "names": synonyms_list,
                     "clique_identifier_count": 1,
                     "taxa": [],
+                    "taxon_specific": False,
                     "types": [t[8:] for t in node_factory.get_ancestors(umls_type_by_id[id])],
                 }
 

diff --git a/src/datahandlers/chebi.py b/src/datahandlers/chebi.py
@@ -2,8 +2,8 @@
 
 
 def pull_chebi():
-    pull_via_ftp("ftp.ebi.ac.uk", "/pub/databases/chebi/SDF/", "ChEBI_complete.sdf.gz", decompress_data=True, outfilename="CHEBI/ChEBI_complete.sdf")
-    pull_via_ftp("ftp.ebi.ac.uk", "/pub/databases/chebi/Flat_file_tab_delimited/", "database_accession.tsv", outfilename="CHEBI/database_accession.tsv")
+    pull_via_ftp("ftp.ebi.ac.uk", "/pub/databases/chebi/SDF", "chebi.sdf.gz", decompress_data=True, outfilename="CHEBI/ChEBI_complete.sdf")
+    pull_via_ftp("ftp.ebi.ac.uk", "/pub/databases/chebi/flat_files", "database_accession.tsv.gz", decompress_data=True, outfilename="CHEBI/database_accession.tsv")
 
 
 def x(inputfile, labelfile, synfile):

diff --git a/src/snakefiles/diseasephenotype.snakefile b/src/snakefiles/diseasephenotype.snakefile
@@ -171,10 +171,10 @@ rule disease_manual_concord:
             sources=[
                 {
                     "name": "Babel repository",
-                    "url": "https://github.com/TranslatorSRI/Babel",
+                    "url": "https://github.com/NCATSTranslator/Babel",
                 }
             ],
-            url="https://github.com/TranslatorSRI/Babel/blob/master/input_data/manual_concords/disease.txt",
+            url="https://github.com/NCATSTranslator/Babel/blob/master/input_data/manual_concords/disease.txt",
             concord_filename=output.outfile,
         )