Hi, I've been using kmodes (https://www.rdocumentation.org/packages/klaR/versions/0.6-12/topics/kmodes) from klaR, an R package, to cluster my data set. I wanted to try kmodes in Python to see whether I get similar results. However, I don't see how to determine the optimal number of clusters in the Python version of kmodes.
In the klaR package, I can use the $withindiff component of the result to get the within-cluster simple-matching distance for each cluster. This lets me calculate the sum of error for k = 2, 3, 4, etc., and select the optimal number of clusters based on the largest difference in the sum of error between successive values of k.
In kmodes for Python, how do you determine the optimal k?
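For reference, the selection procedure described above can be sketched in plain Python. This is only an illustration of the idea, not the klaR or kmodes implementation: the helper names (`simple_matching`, `within_cluster_diff`, `pick_k_by_largest_drop`) are made up for this example, and each cluster's mode is recomputed from its members rather than taken from a fitted model.

```python
from collections import Counter


def simple_matching(a, b):
    """Number of attributes on which two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))


def column_modes(rows):
    """Most frequent category in each column of a non-empty list of records."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*rows)]


def within_cluster_diff(data, labels):
    """Per-cluster sum of simple-matching distances to the cluster's mode.

    Intended to mirror what the question describes for klaR's $withindiff.
    """
    clusters = {}
    for row, label in zip(data, labels):
        clusters.setdefault(label, []).append(row)
    result = {}
    for label, rows in clusters.items():
        modes = column_modes(rows)  # computed once per cluster
        result[label] = sum(simple_matching(row, modes) for row in rows)
    return result


def pick_k_by_largest_drop(costs):
    """Given {k: total cost}, return the k reached by the largest cost decrease."""
    ks = sorted(costs)
    drops = {k: costs[prev] - costs[k] for prev, k in zip(ks, ks[1:])}
    return max(drops, key=drops.get)


# Hypothetical usage with the Python kmodes package (an assumption, not run here):
# from kmodes.kmodes import KModes
# costs = {k: KModes(n_clusters=k, n_init=5).fit(X).cost_ for k in range(1, 8)}
# best_k = pick_k_by_largest_drop(costs)
```

The commented usage assumes that a fitted `KModes` model exposes the total clustering cost as `cost_`, which is worth checking against the installed version. Summing the values returned by `within_cluster_diff` for a given labeling gives a comparable total when only the labels are available.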