
Commit 2e794ab

doc: delete ROUGE/METEOR score from QAAccuracy documentation (#325)
* fix: pinned nltk version to address build failure
* fix: delete ROUGE/METEOR score from QAAccuracy documentation

Co-authored-by: Xiaoyi Cheng <[email protected]>
1 parent 4096bf0 commit 2e794ab

File tree

1 file changed (+1, -1)


src/fmeval/eval_algorithms/qa_accuracy.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -285,7 +285,7 @@ class QAAccuracy(EvalAlgorithmInterface):
     3. Precision over Words: The fraction of words in the prediction that are also found in the target answer. The text is normalized as before.
     4. Recall over Words: The fraction of words in the target answer that are also found in the prediction.
     5. F1 over Words: The harmonic mean of precision and recall, over words (normalized).
-    6. [BERTScore](https://arxiv.org/pdf/1904.09675.pdf) uses a second ML model (from the BERT family) to compute sentence embeddings and compare their cosine similarity. This score may account for additional linguistic flexibility over ROUGE and METEOR since semantically similar sentences should be embedded closer to each other.
+    6. [BERTScore](https://arxiv.org/pdf/1904.09675.pdf) uses a second ML model (from the BERT family) to compute sentence embeddings and compare their cosine similarity. This score may account for additional linguistic flexibility over the other QAAccuracy metrics since semantically similar sentences should be embedded closer to each other.



```
Precision, Recall and F1 over Words are more flexible as they assign non-zero scores to
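The word-overlap metrics described in the docstring can be sketched as follows. This is a minimal illustration, not fmeval's implementation: the `normalize` helper here is hypothetical (lowercase plus whitespace split), whereas fmeval's actual normalization may also strip punctuation and articles.

```python
from collections import Counter


def normalize(text: str) -> list[str]:
    # Hypothetical normalization: lowercase and split on whitespace.
    # fmeval's real normalization may differ (e.g. punctuation removal).
    return text.lower().split()


def f1_over_words(prediction: str, target: str) -> float:
    """F1 over Words: harmonic mean of word-level precision and recall."""
    pred, tgt = normalize(prediction), normalize(target)
    # Multiset intersection counts each shared word at most as often
    # as it appears in both strings.
    overlap = sum((Counter(pred) & Counter(tgt)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)  # fraction of predicted words found in target
    recall = overlap / len(tgt)      # fraction of target words found in prediction
    return 2 * precision * recall / (precision + recall)
```

Because partial overlaps score above zero (e.g. a verbose but correct answer still earns full recall), these metrics are more forgiving than exact match, which is the flexibility the truncated sentence above refers to.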
