City-of-Helsinki/unified-search

Common unified search

Common unified search provides multi-domain search over multiple services.

Applications

Unified search consists of the following applications:

Elasticsearch

  • Search engine for indexing the data
  • All environments use Elasticsearch

Data collector

  • Python Django application for fetching data from multiple sources and storing it in Elasticsearch.
  • Django management commands for importing data are triggered daily by Azure DevOps cron jobs.
  • The /sources/ dir in this repository

See Data collector README for more info.

GraphQL search API

  • GraphQL search API on top of Elasticsearch, providing a high-level interface for frontend (end) users
  • The /graphql/ dir in this repository

See GraphQL search API README for more info.

Architecture

---
title: Architecture diagram of Unified Search
---
flowchart BT
  subgraph DevOps["Azure DevOps"]
    CronJobs["Cron Jobs"]
  end
  subgraph Sources["Data Sources"]
    subgraph DjangoMunigeoGroup[" "]
      Makasiini["makasiini.hel.ninja"]
      HelsinkiWFS["kartta.hel.fi"]
      HsyWFS["kartta.hsy.fi"]
    end
    subgraph OtherAPIsGroup[" "]
      PalvelukarttaWS["hel.fi/palvelukarttaws"]
      ServiceMap["api.hel.fi/servicemap"]
      Hauki["hauki.api.hel.fi"]
      LinkedEvents["api.hel.fi/linkedevents"]
    end
  end
  subgraph DataCollector["Data Collector"]
    IngestData["ingest_data mgmt cmd"]
  end
  subgraph Frontends["Frontend Applications"]
    Kultus["kultus.hel.fi"]
    Liikunta["liikunta.hel.fi"]
    LiikuntaHeadlessCMS["liikunta2.content.api.hel.fi"]
  end
  Elasticsearch
  GraphQL["GraphQL search API"]

  DevOps -- calls daily --> DataCollector
  DataCollector -- reads --> Sources -- are mapped to --> Elasticsearch
  GraphQL -- queries --> Elasticsearch
  Frontends -- query --> GraphQL

  style DjangoMunigeoGroup stroke-width:0
  style OtherAPIsGroup stroke-width:0

Known users of unified search

| User's URL | Type | Used GraphQL queries | Purpose |
| --- | --- | --- | --- |
| https://kultus.hel.fi/ | Frontend | administrativeDivisions query | Select areas for search on search page |
| https://liikunta.hel.fi/ | Frontend | unifiedSearch query with location index | Search for venues (i.e. locations/units) and show them in list or on map |
| https://liikunta2.content.api.hel.fi/ | Headless CMS | unifiedSearch query with location index | Select venues (i.e. locations/units) to be shown as CMS content on liikunta.hel.fi |

kultus.hel.fi

liikunta.hel.fi

liikunta2.content.api.hel.fi

Used GraphQL query:

{
  unifiedSearch(
    index: location
    ontologyTreeIdOrSets: [551]
    text: "%s"
    first: 50
  ) {
    edges {
      node {
        venue {
          meta { id }
          name { fi sv en }
        }
      }
    }
  }
}

where:

  • 551 is the ontology tree ID for "Sports and physical exercise" (i.e. only show sports venues)
  • %s is replaced with the search term
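
The substitution described above can be sketched in Python. This is only an illustration, not the CMS's actual implementation: the helper names `build_query`, `extract_venues`, and `search` are hypothetical, and the default URL is the local development endpoint listed under "Running with Docker & Docker compose". The search term is escaped before substitution so that quotes in user input cannot break the query string.

```python
import json
import urllib.request

# Query template copied from the README; %s marks where the search term goes.
QUERY_TEMPLATE = """
{
  unifiedSearch(
    index: location
    ontologyTreeIdOrSets: [551]
    text: "%s"
    first: 50
  ) {
    edges {
      node {
        venue {
          meta { id }
          name { fi sv en }
        }
      }
    }
  }
}
"""


def build_query(search_term: str) -> str:
    """Substitute the search term, escaping quotes, backslashes and control characters."""
    # json.dumps yields a double-quoted string literal; drop the outer quotes.
    escaped = json.dumps(search_term, ensure_ascii=False)[1:-1]
    return QUERY_TEMPLATE % escaped


def extract_venues(response: dict) -> list:
    """Return the venue objects from a response shaped like the query's selection set."""
    data = response.get("data") or {}
    edges = (data.get("unifiedSearch") or {}).get("edges") or []
    return [edge["node"]["venue"] for edge in edges]


def search(term: str, url: str = "http://localhost:4000/search") -> list:
    """POST the query as JSON (hypothetical helper; assumes a locally running API)."""
    body = json.dumps({"query": build_query(term)}).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as resp:
        return extract_venues(json.load(resp))
```

With the local stack running, `search("uimahalli")` would return a list of venue dictionaries, each carrying `meta.id` and the `fi`, `sv`, and `en` names.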

NOTE:

  • For historical reasons, the Liikunta application's headless CMS has two separate production instances:
  • The instances could be merged into one by combining their data and updating all links to point to the merged instance

Development

Running with Docker & Docker compose

  1. First copy .env.example to .env
  2. Then read the file's contents and set environment variables according to your environment
  3. Configure your Docker to use at least 4 GB RAM so all services can be run simultaneously
  4. Run docker compose up to start all services locally
  5. Wait until all services are up and running (it takes a while)

Services can now be locally accessed at:

| Service | Local URL |
| --- | --- |
| GraphQL search API | http://localhost:4000/search |
| Elastic Stack home | http://localhost:5601 |
| Elasticsearch Dev Tools | http://localhost:5601/app/dev_tools#/console |
| Elasticsearch | http://localhost:9200 |
| Data collector | http://localhost:5001/readiness |
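
Since the services take a while to start, a small polling script can tell you when they respond. The following is a minimal sketch, not part of the repository: `is_up` and `wait_for_services` are hypothetical helpers, the URLs are the local ones listed above, and treating any HTTP status below 500 as "up" is an assumption made for illustration.

```python
import time
import urllib.error
import urllib.request

# Local URLs taken from the table above.
LOCAL_SERVICES = {
    "GraphQL search API": "http://localhost:4000/search",
    "Elastic Stack home": "http://localhost:5601",
    "Elasticsearch": "http://localhost:9200",
    "Data collector": "http://localhost:5001/readiness",
}


def is_up(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers over HTTP with a non-5xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError as err:
        # The server answered; assume client errors (4xx) still mean "running".
        return err.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar.
        return False


def wait_for_services(services, check=is_up, attempts=30, delay=5.0, sleep=time.sleep):
    """Poll every service until all respond or the attempts run out.

    Returns a dict mapping each service name to whether it came up.
    """
    pending = dict(services)
    for attempt in range(attempts):
        for name in [n for n, url in pending.items() if check(url)]:
            del pending[name]
        if not pending:
            break
        if attempt < attempts - 1:
            sleep(delay)
    return {name: name not in pending for name in services}


if __name__ == "__main__":
    for name, ok in wait_for_services(LOCAL_SERVICES).items():
        print(f"{name}: {'up' if ok else 'not responding'}")
```

The `check` and `sleep` parameters are injectable so the polling logic can be exercised without real network calls or delays.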

Running without Docker

Caveat emptor:

Running without Docker is not fully supported, so this setup may prove difficult. You have been warned.
  1. First copy .env.example to .env
  2. Then read the file's contents and set environment variables according to your environment
  3. See the app-specific READMEs for more info:

Running tests

Sources tests, with docker compose:

docker compose exec sources pytest

GraphQL tests, run in the graphql folder (install dependencies with yarn first):

yarn test:ci

Running data importers

For more info, see Data Importers README, but here are a few examples of importing data into unified search.

Import administrative division data:

docker compose exec sources python manage.py ingest_data administrative_division

Import location data:

docker compose exec sources python manage.py ingest_data location
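
To run both imports in one go, the two commands above can be wrapped in a short script. This is a sketch under assumptions: `importer_command` and `run_importers` are hypothetical helpers, only the two importer names shown above are used, and the script assumes the `sources` container is already running via docker compose.

```python
import subprocess

# Importer names taken from the examples above, in the order they appear.
IMPORTERS = ("administrative_division", "location")


def importer_command(importer: str) -> list:
    """Build the docker compose command that runs one ingest_data importer."""
    return [
        "docker", "compose", "exec", "sources",
        "python", "manage.py", "ingest_data", importer,
    ]


def run_importers(importers=IMPORTERS, run=subprocess.run) -> None:
    """Run each importer in turn, stopping at the first failure."""
    for importer in importers:
        # check=True makes subprocess.run raise CalledProcessError on a non-zero exit.
        run(importer_command(importer), check=True)


if __name__ == "__main__":
    run_importers()
```

Passing the command as a list avoids shell quoting issues, and the `run` parameter lets the sequencing be tested without invoking Docker.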

Setting up pre-commit hooks

You can use pre-commit to lint and format your code before committing:

  1. Install pre-commit (there are many ways to do that, but let's use pip as an example):
    • pip install pre-commit
  2. Set up git hooks from .pre-commit-config.yaml by running these commands from project root:
    • pre-commit install to enable pre-commit code formatting & linting
    • pre-commit install --hook-type commit-msg to enable pre-commit commit message linting
  3. To be able to successfully run the pre-commit hooks for the graphql app, you need to install its dependencies:
    • yarn --cwd graphql (Installs dependencies in the graphql folder)

After that, linting and formatting hooks will run against all changed files before committing.

Git commit message linting is configured in .gitlint.
