Improve embedding visualization color spread using UMAP

## Problem

Article visualization colors cluster into only 2-3 color variants (purple/red and green/orange) instead of spreading across the full color spectrum.

## Current Implementation

The `compute_content_hue()` function in `embeddings/embeddings.py` derives hue by:
1. Computing the mean embedding vector for the article
2. Hashing it with SHA-256
3. Using `hash_int % 360` to get a hue

This produces deterministic but poorly distributed colors when articles are semantically similar.
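For reference, the three steps above can be sketched roughly as follows. This is an illustration, not the actual code in `embeddings/embeddings.py`: the function name `hash_based_hue`, the `float32` cast, and the byte order used to turn the digest into an integer are all assumptions.

```python
import hashlib

import numpy as np


def hash_based_hue(embeddings: np.ndarray) -> int:
    """Hypothetical sketch of the current hash-based hue derivation."""
    # 1. Mean embedding vector for the article (rows are chunk embeddings).
    mean_vec = np.asarray(embeddings, dtype=np.float32).mean(axis=0)
    # 2. SHA-256 over the raw bytes of the mean vector (serialization is assumed).
    digest = hashlib.sha256(mean_vec.tobytes()).digest()
    # 3. Interpret the digest as an integer and fold it into 0-359.
    hash_int = int.from_bytes(digest, "big")
    return hash_int % 360
```

The hue depends only on the hash of the vector, so it carries no information about how close two articles are in embedding space.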

## Proposed Solution

Use UMAP to reduce each article's mean embedding to 1D, then map that value to a hue (0-360). This approach:

- Spreads articles along their primary semantic axis
- Gives similar articles nearby (but distinct) colors
- Gives very different articles contrasting colors
- Scales well as more articles are added

## Implementation

Modify `compute_content_hue()` to:
1. Run UMAP with `n_components=1` on the mean embeddings
2. Normalize the resulting value to the 0-360 range
3. Return the normalized value as the hue

Because UMAP is fitted on a set of points rather than a single vector, this may require processing all articles together for optimal spread, or using a pre-fitted UMAP model.
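A minimal sketch of the batch variant, assuming the `umap-learn` package and one mean embedding per article stacked into a 2D array. The names `compute_content_hues` and `hues_from_projection`, the `reducer` parameter, the `n_neighbors` cap, and the choice to scale the top hue to just below 360 are illustrative assumptions, not part of the existing code.

```python
import numpy as np


def hues_from_projection(values) -> np.ndarray:
    """Min-max scale 1D projection values to hues in [0, 360)."""
    values = np.asarray(values, dtype=np.float64).ravel()
    span = values.max() - values.min()
    if span == 0:
        # All articles project to the same point; give them the same hue.
        return np.zeros_like(values)
    n = len(values)
    # Hue wraps around at 360, so cap the top at 360 * (n - 1) / n to keep
    # the two extreme articles from landing on the same color (assumption).
    return (values - values.min()) / span * 360.0 * (n - 1) / n


def compute_content_hues(mean_embeddings, reducer=None, random_state=42):
    """Fit a 1D reducer on all mean embeddings and map the result to hues."""
    X = np.asarray(mean_embeddings, dtype=np.float32)
    if reducer is None:
        import umap  # provided by the umap-learn package

        # UMAP's default n_neighbors is 15; keep it below the sample count.
        n_neighbors = max(2, min(15, len(X) - 1))
        reducer = umap.UMAP(
            n_components=1, n_neighbors=n_neighbors, random_state=random_state
        )
    projection = reducer.fit_transform(X)  # shape (n_articles, 1)
    return hues_from_projection(projection[:, 0]), reducer
```

Returning the fitted reducer leaves room for the pre-fitted option: new articles could be projected with `reducer.transform(...)`, although the minimum and maximum used for normalization would then need to be stored as well. UMAP may also behave poorly with only a handful of articles, so a fallback for very small collections is probably needed.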
