Skip to content

Conversation

@alecgibson
Copy link
Collaborator

@alecgibson alecgibson commented Feb 6, 2025

This is a BREAKING CHANGE that aims to improve database performance by changing the queries and indexes used to perform operations. There will be no changes visible to consumers of the JavaScript API.

The break will:

  • require consumers to run a migration query before upgrading
  • add new indexes
  • allow dropping of an old index

More details on migration are at the bottom of this commit message.

Motivation

This library seems to have been written with the assumption that its collections are small. However, we have hundreds of thousands of jobs in various queues on Production, which causes some slow queries because of the design choices made in the schema of this library.

In particular, we aim to address two issues:

  1. {deleted: {$exists: boolean}} calls are inefficient, but used by practically every query in this library
  2. counting inflight jobs has awful performance when there are many available jobs

The result is that no further filtering is needed beyond the index on any of the issued queries.

$exists

MongoDB's $exists operator has tricky index performance. In the best cases, $exists: true can use the index, but only if it's sparse, and $exists: false can never just use the index: it always needs to fetch documents.

In order to avoid the constant use of $exists in this library, we rely on a logical paradigm shift: we $unset visible when acking the job, so that visible and deleted are mutually exclusive fields. Therefore we can:

  • add a sparse index for both of these fields
  • query on the field we care about, and know that it implies the absence of the other field, allowing the removal of $exists assertions

It should be noted that in local testing, I observed strange behaviour when trying to use this partial index: we have to use ack: {$gt: ''} instead of ack: {$exists: true} to get MongoDB to leverage this index for some reason.

inFlight()

The existing inFlight() query has particularly bad performance in cases where there are many (hundreds of thousands) of jobs available to pick up.

This is because the existing query uses the deleted_1_visible_1 index, but even after filtering by deleted and visible with the index, the database will need to fetch every single job that could be picked up, and check for ack, which is very slow.

We improve the performance here by:

  • the removal of the $exists query (see above)
  • the addition of a partial index that only contains unacked jobs that have been retrieved at some point by get(). We can then filter these by the current time to find in-flight jobs

Migration path

This performance improvement is built upon a shift in the assumptions made about underlying job structure: namely, that deleted and visible are now mutually exclusive properties (which was not true before).

  1. Bump patch version to 7.1.1: this will start removing the visible property from acked jobs in a non-breaking way

  2. Deploy the patch to Production

  3. Update any existing documents to match this new schema:

    db.collection.updateMany( {deleted: {$exists: true}}, {$unset: {visible: 1}})
  4. Bump major version to 8.0.0 and deploy

  5. Drop old index delted_1_visible_1, which is no longer used

Depends on:

alecgibson added a commit that referenced this pull request Feb 6, 2025
This is a non-breaking change that forms part of the migration path to
improving our performance in #15

When acking a job, we now unset the `visible` property, which makes
`visible` and `deleted` mutually exclusive properties, so that the
presence of one implies the absence of the other.

We can do this, because the only time we query on `visible` is when we
*also* query for `deleted: {$exists: false}`. Similarly, the only times
we query for `deleted: {$exists: true}` are times when we don't query
`visible`.
@alecgibson alecgibson force-pushed the performance-improvement branch 2 times, most recently from 3331a93 to 53b6b09 Compare February 6, 2025 14:53
@alecgibson alecgibson changed the base branch from main to remove-visible February 6, 2025 14:53
Base automatically changed from remove-visible to main February 7, 2025 09:30
This is a **BREAKING CHANGE** that aims to improve database performance
by changing the queries and indexes used to perform operations. There
will be no changes visible to consumers of the JavaScript API.

The break will:

 - require consumers to run a migration query before upgrading
 - add new indexes
 - allow dropping of an old index

More details on migration are at the bottom of this commit message.

Motivation
----------
This library seems to have been written with the assumption that its
collections are small. However, we have hundreds of thousands of jobs in
various queues on Production, which causes some slow queries because of
the design choices made in the schema of this library.

In particular, we aim to address two issues:

 1. `{deleted: {$exists: boolean}}` calls are inefficient, but used by
    practically every query in this library
 2. counting inflight jobs has awful performance when there are many
    available jobs

The result is that no further filtering is needed beyond the index on
any of the issued queries.

`$exists`
---------
MongoDB's `$exists` operator has [tricky index performance][1]. In the
best cases, `$exists: true` can use the index, but only if it's sparse,
and `$exists: false` can **never** just use the index: it always needs
to fetch documents.

In order to avoid the constant use of `$exists` in this library, we rely
on a logical paradigm shift: we `$unset` `visible` when acking the job,
so that `visible` and `deleted` are mutually exclusive fields. Therefore
we can:

 - add a sparse index for both of these fields
 - query on the field we care about, and **know** that it implies the
   absence of the other field, allowing the removal of `$exists`
   assertions

It should be noted that in local testing, I observed
[strange behaviour][2] when trying to use this partial index: we have to
use `ack: {$gt: ''}` instead of `ack: {$exists: true}` to get MongoDB to
leverage this index for some reason.

`inFlight()`
------------
The existing `inFlight()` query has particularly bad performance in
cases where there are many (hundreds of thousands) of jobs available to
pick up.

This is because the existing query uses the `deleted_1_visible_1` index,
but even after filtering by `deleted` and `visible` with the index, the
database will need to fetch every single job that could be picked up,
and check for `ack`, which is very slow.

We improve the performance here by:

 - the removal of the `$exists` query (see above)
 - the addition of a partial index that only contains unacked jobs that
   have been retrieved at some point by `get()`. We can then filter
   these by the current time to find in-flight jobs

Migration path
--------------
This performance improvement is built upon a shift in the assumptions
made about underlying job structure: namely, that `deleted` and
`visible` are now mutually exclusive properties (which was not true
before).

 1. Bump patch version to [`7.1.1`][3]: this will start removing the
    `visible` property from acked jobs in a non-breaking way
 2. Deploy the patch to Production
 3. Update any existing documents to match this new schema:

    ```js
    db.collection.updateMany(
      {deleted: {$exists: true}},
      {$unset: {visible: 1}},
    )
    ```

 4. Bump major version to `8.0.0` and deploy
 5. Drop old index `delted_1_visible_1`, which is no longer used

[1]: https://www.mongodb.com/docs/manual/reference/operator/query/exists/#use-a-sparse-index-to-improve--exists-performance
[2]: https://www.mongodb.com/community/forums/t/partial-index-is-not-used-during-search/290507/2
[3]: #16
@alecgibson alecgibson force-pushed the performance-improvement branch from 53b6b09 to 795a8d9 Compare February 10, 2025 11:59
@alecgibson
Copy link
Collaborator Author

alecgibson commented Feb 10, 2025

Migration checklist:

  • remove-user-pending
    • run migration query
    • add new indexes
  • pdf-pending
    • run migration query
    • add new indexes
  • mail-check-in-pending
    • run migration query
    • add new indexes
  • remove-book-pending
    • run migration query
    • add new indexes
  • epub-pending
    • run migration query
    • add new indexes
  • docx-pending
    • run migration query
    • add new indexes
  • import-upload-pending
    • run migration query
    • add new indexes
  • mail-goals-pending
    • run migration query
    • add new indexes
  • image-conversions-pending
    • run migration query
    • add new indexes
  • mail-book-history-deletion-pending
    • run migration query
    • add new indexes
  • remove-bookHistory-pending
    • run migration query
    • add new indexes

@alecgibson alecgibson marked this pull request as ready for review February 10, 2025 15:30
@alecgibson alecgibson merged commit 8393845 into main Feb 10, 2025
2 checks passed
@alecgibson alecgibson deleted the performance-improvement branch February 10, 2025 15:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants