Missing Indexes in SQLite Database Causing Performance Bottlenecks

I've noticed that certain queries on the SQLite database used by beets—particularly album.items()—are quite slow on my machine.

After some investigation, I found that the database does not currently have any indexes defined. Specifically, the items table lacks an index on the album_id column, which seems to be the root cause of the slowdown when querying items by album.

I manually created an index on album_id, and the performance improvement was significant—query times dropped by approx a factor of 4 in my benchmarks.

## Suggestion

Adding indexes for frequently queried fields (like `album_id`) could provide a substantial performance boost for many users. However, I'm not very familiar with the dbcore layer and I'm unsure how or where to properly define these indexes within the beets codebase. I dont think we have a mechanism in place for applying any kind of database migrations.

## Temporary Workaround

For anyone else running into similar performance issues, or if you just want to test this, here's a small script you can use to manually add the index and run a benchmark on your library.

```python
import sqlite3
import time
from pathlib import Path

from beets.library import Library

# Configure your db path
db_path = Path("./library.db")
idx_name = "items_album_id_idx"


def run_sql_exc(path, query):
    conn = sqlite3.connect(path)
    cursor = conn.cursor()
    cursor.execute(query)
    conn.commit()
    conn.close()


def run_benchmark(lib, indexed=True):
    time_start = time.perf_counter()
    albums = lib.albums()
    items = []
    for album in albums:
        items.extend(album.items())
    duration = time.perf_counter() - time_start
    nA = len(albums)
    nI = len(items)

    print(
        f"Benchmark with{'out' if indexed else None} index took {duration:.2f} seconds"
    )
    print(f"\tnAlbums: {nA} (avgtime {duration * 1000 / nA} ms per matchquery)")
    print(f"\tnItems:  {nI} (avgtime {duration * 1000 / nI:.2f} ms per item)")


if __name__ == "__main__":
    run_sql_exc(db_path, 'DROP INDEX IF EXISTS "{}"'.format(idx_name))
    print("Benchmark without index...")
    run_benchmark(Library(db_path.name, str(db_path.resolve().parent)))
    print("Creating index on items.album_id...")
    run_sql_exc(
        db_path, 'CREATE INDEX IF NOT EXISTS "{}" ON items (album_id)'.format(idx_name)
    )
    print("Benchmark with index...")
    run_benchmark(Library(db_path.name, str(db_path.resolve().parent)))
    print("Done.")
```

For me adding a singular index on items yields the following improvements:
```shell
Benchmark without index...
Benchmark without index took 2.27 seconds
	nAlbums: 521 (avgtime 4.364256262957434 ms per matchquery)
	nItems:  3977 (avgtime 0.57 ms per item)
Creating index on items.album_id...
Benchmark with index...
Benchmark with index took 0.66 seconds
	nAlbums: 521 (avgtime 1.2675718119012231 ms per matchquery)
	nItems:  3977 (avgtime 0.17 ms per item)
Done.
```


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Missing Indexes in SQLite Database Causing Performance Bottlenecks #5809

Suggestion

Temporary Workaround

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Missing Indexes in SQLite Database Causing Performance Bottlenecks #5809

Description

Suggestion

Temporary Workaround

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions