
Conversation

@Artemii-Shlychkov

Issue

T-distributed Stochastic Neighbor Embedding (t-SNE) is a widely used tool for dimensionality reduction and visualization of high-dimensional datasets. By replacing the Gaussian kernel that SNE uses in the embedding space with a Cauchy kernel (a Student t-distribution with the degrees-of-freedom (dof) parameter alpha set to 1), t-SNE alleviates the “crowding problem” in low-dimensional embeddings. Varying the dof parameter alpha affects t-SNE embeddings on both toy and real-world datasets (e.g., MNIST and single-cell RNA sequencing data). Moreover, alpha can be treated as a trainable parameter and adjusted during embedding optimization with a gradient-based method. Alpha values different from 1 can yield superior embeddings, reflected in lower Kullback-Leibler (KL) divergence and higher k-nearest-neighbors (kNN) recall, at least on some datasets. Overall, this suggests that optimizing alpha can produce more faithful low-dimensional representations of high-dimensional data.
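For reference, the low-dimensional similarities with a general dof can be written with the heavy-tailed kernel w_ij = (1 + ||y_i - y_j||^2 / alpha)^(-(alpha + 1)/2), one common parametrization that reduces to the standard Cauchy kernel at alpha = 1 and is assumed in the sketches below. A minimal NumPy sketch of the exact (non-approximated) similarities, with hypothetical names:

```python
import numpy as np

def heavy_tailed_affinities(Y, alpha=1.0):
    """Exact low-dimensional similarities q_ij for an embedding Y of shape (n, 2).

    Kernel: w_ij = (1 + ||y_i - y_j||^2 / alpha) ** (-(alpha + 1) / 2);
    alpha = 1 recovers the usual Cauchy (Student t, dof = 1) kernel of t-SNE.
    """
    sq_dists = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    W = (1 + sq_dists / alpha) ** (-(alpha + 1) / 2)
    np.fill_diagonal(W, 0)          # exclude self-similarities
    return W / W.sum()              # normalize so the q_ij sum to 1
```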

Description of changes

_tsne.pyx
Main changes with regard to dof optimization (an exact-gradient reference sketch follows this list):

  • estimate_positive_gradient_nn function:
    • added computation of the alpha gradient positive term
  • _estimate_negative_gradient_single function:
    • added computation of the alpha gradient negative term
  • estimate_negative_gradient_bh function:
    • added normalization of the alpha gradient negative term by sum_Q
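For concreteness, here is a brute-force (no Barnes-Hut, no kNN sparsity) reference for the alpha gradient that the changes above approximate, assuming the same kernel parametrization as the sketch in the issue text and a dense, symmetric P normalized to sum to 1; names are hypothetical, not the PR's Cython code:

```python
import numpy as np

def exact_alpha_gradient(P, Y, alpha):
    """Exact dKL/dalpha for w_ij = (1 + d_ij^2 / alpha) ** (-(alpha + 1) / 2).

    The BH code splits this into a positive (attractive) term accumulated over
    the kNN pairs of P and a negative (repulsive) term estimated with the tree
    and normalized by sum_Q; here both are computed exactly over all pairs.
    """
    d2 = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
    W = (1 + d2 / alpha) ** (-(alpha + 1) / 2)
    np.fill_diagonal(W, 0)                 # exclude self-pairs
    sum_Q = W.sum()                        # normalization constant

    # d log w_ij / d alpha (zero on the diagonal since d_ii = 0)
    dlogw = (-0.5 * np.log1p(d2 / alpha)
             + (alpha + 1) * d2 / (2 * alpha * (alpha + d2)))

    positive = np.sum(P * dlogw)           # attractive part (sparse kNN graph in practice)
    negative = np.sum(W * dlogw) / sum_Q   # repulsive part, normalized by sum_Q
    return negative - positive             # = sum_ij (q_ij - p_ij) * d log w_ij / d alpha
```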

tsne.py

  • added dataclass OptimizationStats to track KL divergence, dof values, alpha gradient values and embeddings at every iteration
  • kl_divergence_bh function:
    • added dof gradient computation (alpha_grad), based on the outputs of the _tsne module
  • gradient_descent class (a toy end-to-end sketch follows this list):
    • added optional optimize_for_alpha bool argument to trigger dof optimization
    • added optional dof_lr argument (learning rate for the dof updates)
    • added optional dof update using the current alpha_grad value and the dof_lr learning rate
    • added optional eval_error_every_iter argument to make tracking of the KL divergence more flexible
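To illustrate the update scheme described above (plain gradient steps on dof with learning rate dof_lr, alongside the usual position updates), here is a toy exact-gradient loop. It reuses exact_alpha_gradient from the sketch above, uses hypothetical names and defaults, and is not the PR's Barnes-Hut code path:

```python
def toy_joint_descent(P, Y, n_iter=250, lr=100.0, dof_lr=0.5, dof=1.0):
    """Toy gradient descent that updates both the embedding Y and the dof.

    The embedding gets the usual KL gradient step for the heavy-tailed kernel;
    dof gets a plain `dof -= dof_lr * alpha_grad` step each iteration.
    """
    Y = Y.copy()
    for _ in range(n_iter):
        d2 = np.sum((Y[:, None, :] - Y[None, :, :]) ** 2, axis=-1)
        W = (1 + d2 / dof) ** (-(dof + 1) / 2)
        np.fill_diagonal(W, 0)
        Q = W / W.sum()
        # position gradient: dKL/dy_i = 2(dof+1) * sum_j (p_ij - q_ij)(y_i - y_j) / (dof + d_ij^2)
        coef = (P - Q) / (dof + d2)
        grad = 2 * (dof + 1) * (coef.sum(axis=1)[:, None] * Y - coef @ Y)
        Y -= lr * grad
        dof -= dof_lr * exact_alpha_gradient(P, Y, dof)   # the trainable-dof step
    return Y, dof
```
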
Includes
  • Code changes
  • Tests
  • Documentation

exaggeration=None,
dof=1,
optimize_for_alpha=False,
dof_lr=0.5,
Owner

Can we connect this to the t-SNE learning rate somehow? Don't like the extra parameter

openTSNE/tsne.py Outdated
momentum=0.8,
exaggeration=None,
dof=1,
optimize_for_alpha=False,
Owner

activate with dof="auto" for optimization
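A minimal sketch of how such a sentinel could be handled (hypothetical; the parsing below is not code from this PR):

```python
def resolve_dof(dof):
    """Map the suggested dof="auto" sentinel to (initial_dof, optimize_for_alpha)."""
    if dof == "auto":
        return 1.0, True      # start at the Cauchy kernel and optimize dof from there
    return float(dof), False  # fixed, user-supplied degrees of freedom

dof, optimize_for_alpha = resolve_dof("auto")
```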

openTSNE/tsne.py Outdated
)

if optimize_for_alpha:
    dof -= dof_lr * alpha_grad
Owner

This is a new optimizer. Can we optimize with the existing delta-bar-delta optimizer? Let's look at some loss curves.
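For context, this is the per-parameter delta-bar-delta update typically used for the embedding coordinates in t-SNE gradient descent; reusing it for the scalar dof, as suggested, would look roughly like this (a sketch with hypothetical names, not the repository's actual code):

```python
import numpy as np

def delta_bar_delta_step(param, grad, update, gains,
                         learning_rate, momentum=0.8, min_gain=0.01):
    """One delta-bar-delta step: gains grow additively while the gradient keeps
    pointing against the running update (steady progress) and shrink
    multiplicatively when the signs agree (oscillation)."""
    increase = np.sign(grad) != np.sign(update)
    gains = np.where(increase, gains + 0.2, gains * 0.8)
    gains = np.maximum(gains, min_gain)
    update = momentum * update - learning_rate * gains * grad
    return param + update, update, gains

# Applied to the scalar dof (state starts at sensible defaults):
dof, dof_update, dof_gains = 1.0, 0.0, 1.0
alpha_grad = 0.05  # placeholder value; in practice returned by kl_divergence_bh
dof, dof_update, dof_gains = delta_bar_delta_step(
    dof, alpha_grad, dof_update, dof_gains, learning_rate=0.5
)
```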
