Description
I am trying to implement Hamming distance for categorical data, but I am getting an error:
    C:\Users\Mukul.Sharma\AppData\Local\Continuum\Anaconda3\lib\site-packages\kmodes\kmodes.py in init_huang(X, n_clusters, dissim)
         39     # so set centroid to closest point in X.
         40     for ik in range(n_clusters):
    ---> 41         ndx = np.argsort(dissim(X, centroids[ik]))
         42         # We want the centroid to be unique, if possible.
         43         while np.all(X[ndx[0]] == centroids, axis=1).any() and ndx.shape[0] > 1:
My Hamming distance is:

    def hamming_distance(s1, s2):
        """Return the Hamming distance between equal-length sequences"""
        if len(s1) != len(s2):
            raise ValueError("Undefined for sequences of unequal length")
        return sum(el1 != el2 for el1, el2 in zip(s1, s2))
I am having issues with SciPy's Hamming distance as well (scipy.spatial.distance.hamming).
Here the error says:
ValueError: Input vector should be 1-D
Can you please help me?
Could you also give me an idea of how to write a custom distance metric, for example by explaining the internal workings of this algorithm (K-prototypes)?
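
For reference, here is a minimal sketch of what a compatible custom dissimilarity might look like. Judging from the traceback, `dissim(X, centroids[ik])` is called with the whole data matrix and a single centroid, so the function probably needs to return one distance per row of `X` rather than a single number. That would explain why a function comparing two equal-length 1-D sequences fails here. The name `hamming_dissim`, the `**_` catch-all for extra keyword arguments, and the toy data are assumptions for illustration, not part of the library.

```python
import numpy as np


def hamming_dissim(a, b, **_):
    """Count mismatched attributes between each row of `a` and the vector `b`.

    `a` is expected to be a 2-D array (n_points, n_features) and `b` a 1-D
    array (n_features,). Extra keyword arguments are accepted and ignored
    (assumption: the caller may pass additional context).
    """
    a = np.atleast_2d(a)
    # Broadcasting compares every row of `a` against `b` element-wise;
    # summing over axis 1 gives one mismatch count per row.
    return np.sum(a != b, axis=1)


# Hypothetical toy data: three points with three categorical attributes.
X = np.array([
    ["red", "small", "round"],
    ["red", "large", "round"],
    ["blue", "large", "square"],
])
centroid = np.array(["red", "large", "square"])

distances = hamming_dissim(X, centroid)
print(distances)  # [2 1 1]
```

If I read the library correctly, a function of this shape could be passed through the `cat_dissim` argument of `KModes` or `KPrototypes`, but I have not confirmed that, so please treat it as an assumption.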