Learnable dof feature (#275)
Conversation
    exaggeration=None,
    dof=1,
    optimize_for_alpha=False,
    dof_lr=0.5,
Can we connect this to the t-SNE learning rate somehow? Don't like the extra parameter
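One hypothetical way to drop the extra parameter, purely as an illustrative sketch (the coupling and the scaling constant below are assumptions, not part of this PR):

    # Hypothetical (not in this PR): reuse the main t-SNE learning rate
    # for the dof update instead of exposing a separate dof_lr knob.
    # The scaling constant is an arbitrary illustrative choice; the dof
    # gradient lives on a different scale than the embedding gradient.
    DOF_LR_SCALE = 1e-3  # assumed constant, would need tuning
    dof -= DOF_LR_SCALE * learning_rate * alpha_grad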
openTSNE/tsne.py
Outdated
    momentum=0.8,
    exaggeration=None,
    dof=1,
    optimize_for_alpha=False,
activate with dof="auto" for optimization
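A usage sketch of the suggested interface (hypothetical; it assumes the dof="auto" convention from this comment is adopted in place of the separate flag):

    from openTSNE import TSNE

    # Hypothetical interface per the review suggestion: passing
    # dof="auto" would enable gradient-based dof optimization instead
    # of the separate optimize_for_alpha flag.
    embedding = TSNE(dof="auto").fit(X)  # X: (n_samples, n_features) array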
openTSNE/tsne.py
Outdated
    )

    if optimize_for_alpha:
        dof -= dof_lr * alpha_grad
This is a new optimizer. Can we optimize with the existing delta-bar-delta optimizer? Let's look at some loss curves.
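For reference, a minimal sketch of what reusing the delta-bar-delta gain scheme for the scalar dof could look like (the function and state names are hypothetical, and the constants mirror common t-SNE defaults; this is not code from the PR):

    import numpy as np

    def delta_bar_delta_dof_step(dof, alpha_grad, state,
                                 dof_lr=0.5, momentum=0.8, min_gain=0.01):
        # Sketch only: applies the same gains heuristic t-SNE uses for
        # embedding coordinates to the scalar dof. `state` carries the
        # running gain and update between iterations.
        if np.sign(alpha_grad) != np.sign(state["update"]):
            state["gain"] += 0.2   # gradient sign unchanged: grow the gain
        else:
            state["gain"] *= 0.8   # gradient flipped sign: shrink the gain
        state["gain"] = max(state["gain"], min_gain)
        state["update"] = momentum * state["update"] \
            - dof_lr * state["gain"] * alpha_grad
        return dof + state["update"]

    state = {"gain": 1.0, "update": 0.0}
    # per iteration: dof = delta_bar_delta_dof_step(dof, alpha_grad, state)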
Issue
T-distributed Stochastic Neighbor Embedding (t-SNE) is a widely used tool for dimensionality reduction and visualization of high-dimensional datasets. By replacing the Gaussian kernel in SNE with a Cauchy kernel (Student t-distribution with the degree-of-freedom (dof) parameter alpha set to 1), it alleviates the “crowding problem” in low-dimensional embeddings. Varying the degree-of-freedom parameter alpha affects t-SNE embeddings on both toy and real-world datasets (e.g., MNIST and single-cell RNA sequencing). Moreover, alpha can be regarded as a trainable parameter, allowing it to be adjusted during embedding optimization via a gradient-based method. Alpha values different from 1 can yield superior embeddings, reflected by reduced Kullback-Leibler (KL) divergence and higher k-Nearest Neighbors (kNN) recall scores, at least on some datasets. Overall, this suggests that alpha optimization can lead to more faithful low-dimensional representations of high-dimensional data.
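For concreteness, the heavy-tailed similarity kernel with dof parameter alpha in the Student-t parameterization (the PR's exact convention may differ slightly; alpha = 1 recovers the Cauchy kernel of standard t-SNE), together with the gradient step on alpha described above:

    q_{ij} \propto \Bigl(1 + \frac{\lVert y_i - y_j \rVert^2}{\alpha}\Bigr)^{-\frac{\alpha + 1}{2}},
    \qquad
    \alpha \leftarrow \alpha - \eta \, \frac{\partial \,\mathrm{KL}(P \,\|\, Q)}{\partial \alpha}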
Description of changes
_tsne.pyx — main changes with regard to dof optimization:
- estimate_positive_gradient_nn function
- _estimate_negative_gradient_single function
- estimate_negative_gradient_bh function (sum_Q)

tsne.py:
- OptimizationStats to track changes in KL-divergence, dof values, alpha gradient values, and embeddings at every iteration
- kl_divergence_bh function (alpha_grad), based on the outputs of the _tsne module
- gradient_descent class (a condensed sketch follows below):
  - optimize_for_alpha bool argument to trigger dof optimization
  - dof_lr argument
  - dof update with the current value of alpha_grad and the dof_lr learning rate
  - eval_error_every_iter argument to make tracking of KL-divergence more flexible

Includes
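As a reading aid, a condensed sketch of how the pieces listed above fit together inside the optimization loop; this is not the PR's actual code, and the signature of kl_divergence_bh (here assumed to also return alpha_grad) and the stats.log call are assumptions based on the description:

    import numpy as np

    def gradient_descent_sketch(embedding, P, kl_divergence_bh, stats,
                                n_iter=750, learning_rate=200.0, momentum=0.8,
                                dof=1.0, optimize_for_alpha=False, dof_lr=0.5,
                                eval_error_every_iter=50):
        # `kl_divergence_bh` stands in for the BH objective that, per
        # this PR, also returns the gradient of KL w.r.t. the dof.
        update = np.zeros_like(embedding)
        for it in range(n_iter):
            error, grad, alpha_grad = kl_divergence_bh(
                embedding, P, dof=dof,
                should_eval_error=(it % eval_error_every_iter == 0),
            )
            update = momentum * update - learning_rate * grad  # gains omitted
            embedding += update
            if optimize_for_alpha:
                dof -= dof_lr * alpha_grad  # plain gradient step on dof
            # OptimizationStats records KL, dof, alpha_grad, embedding
            stats.log(it, error, dof, alpha_grad, embedding)
        return embedding, dof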