Added pair counting fmeasure metric #220
If it's a direct translation, you'll have to include the license here (or ask the scikit-learn folks whether a translation of their code can be MIT licensed, but I'm guessing that would be difficult). I don't do much with this package, but I can review this. However, we'll need to figure out the license stuff first (i.e., do we really want to include BSD-licensed code here?). If |
|
Given how short and simple the code is, it probably won't have to be considered as derived from NumPy if you adapt it to make it more Julian and more efficient, as in the end the only thing that will remain from NumPy is the algorithm. For example, |
|
Thanks, |
|
I've just re-implemented this functionality in #227 to fix ARI calculations. |
|
Great! I will wait for it to get pushed and update this commit accordingly. |
I often use this metric, and I think it's worth having.
refs:
https://nlp.stanford.edu/IR-book/html/htmledition/evaluation-of-clustering-1.html
https://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.214.7233&rep=rep1&type=pdf
Also included are precision and recall for clustering. I was not sure about the proper names (e.g. precision is already in use by Julia Base).
The _pair_confusion_matrix is translated from sklearn's https://github.com/scikit-learn/scikit-learn/blob/2beed55847ee70d363bdbfe14ee4401438fba057/sklearn/metrics/cluster/_supervised.py#L154
There is a small duplication with the Rand index, which also requires this matrix, but I did not want to modify it to use my new function right now; that can happen in a separate PR (if at all).
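
For readers unfamiliar with the metric, here is a minimal Python sketch of the pair-counting idea. It is not the code in this PR: the function names pair_counts and pair_precision_recall_f are made up for illustration, and it counts unordered pairs (the convention in the IR book linked above), whereas sklearn's pair confusion matrix counts each pair twice.

```python
from collections import Counter
from math import comb


def pair_counts(labels_true, labels_pred):
    """Return (tp, fp, fn, tn) counted over unordered pairs of points.

    Hypothetical helper for illustration; not part of any library.
    """
    if len(labels_true) != len(labels_pred):
        raise ValueError("label vectors must have the same length")
    n = len(labels_true)
    # Pairs that share both the true class and the predicted cluster.
    joint = Counter(zip(labels_true, labels_pred))
    same_both = sum(comb(c, 2) for c in joint.values())
    # Pairs that share the true class / the predicted cluster.
    same_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    same_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())
    tp = same_both
    fp = same_pred - same_both  # same cluster, different class
    fn = same_true - same_both  # same class, different cluster
    tn = comb(n, 2) - tp - fp - fn
    return tp, fp, fn, tn


def pair_precision_recall_f(labels_true, labels_pred, beta=1.0):
    """Pair-counting precision, recall and F_beta (0.0 when undefined)."""
    tp, fp, fn, _ = pair_counts(labels_true, labels_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    b2 = beta * beta
    f = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f


# The 17-point example from the IR book: three clusters of sizes 6, 6, 5.
truth = list("xxxxxo" + "xooood" + "xxddd")
pred = [1] * 6 + [2] * 6 + [3] * 5
print(pair_counts(truth, pred))  # (20, 20, 24, 72)
```

Working from the contingency counts avoids looping over all n(n-1)/2 pairs; returning 0.0 when a denominator is zero is a choice made for this sketch, not something the PR specifies.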