Open
Description
In example/rater/generated_answer.ipynb, for an input whose true label is equivalent, the model sometimes generates accept or reject instead. When the sampled votes disagree, the majority vote can therefore return the wrong label.
Input:
("Vitamin C (also known as ascorbic acid and ascorbate) is a water-soluble vitamin found in citrus and other fruits, berries and vegetables, also sold as a dietary supplement and as a topical serum ingredient to treat melasma (dark pigment spots) and wrinkles on the face.",
"Is Vitamin C water-soluble?",
"Yes, Vitamin C is a very water-soluble vitamin.",
"Yes, Vitamin C can be dissolved in water well."), # Equally good
Run:
config2 = RaterForGeneratedAnswerOpenAIGPT3p5Config()
config2.model_config.num_call = 3
config2.model_config.temperature = 0.9
with OpScope(name="TextFlow"):
    client2 = RaterClient(config2)
    output = client2.run(data)
pprint.pprint(output)
Output:
{'output': [{'average_score': 0.0,
'error': 'No errors.',
'majority_vote': 'reject',
'response': ['explanation: The grounding answer is better '
'because it directly states that Vitamin C is "very '
'water-soluble," while the generated answer is more '
'vague in saying that it "can be dissolved in water '
'well."\n'
'label: reject',
'explanation: Both the grounding answer and the '
'generated answer correctly state that Vitamin C is '
'water-soluble, so they are equivalent.\n'
'label: equivalent',
'explanation: The generated answer is better '
'because it accurately states that Vitamin C is '
'water-soluble, which aligns with the information '
'provided in the context.\n'
'label: accept'],
'scores': [-1.0, 0.0, 1.0],
'votes': ['reject', 'equivalent', 'accept']}],
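Here 'majority_vote': 'reject' is wrong. The three votes are split one each (reject, equivalent, accept), so there is no real majority, and the expected label for this input is equivalent, which also matches the average_score of 0.0.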
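One possible way to handle this case is a tie-aware vote. The sketch below is not uniflow's actual implementation: the function name `tie_aware_majority_vote` and the `LABEL_TO_SCORE` table are hypothetical, and the label-to-score mapping is only inferred from the 'scores' and 'votes' lists in the output above. When a single label has the most votes it is returned as usual; on a tie, the sketch falls back to the label whose score is closest to the average score.

```python
from collections import Counter

# Assumed mapping, inferred from the 'scores' and 'votes' lists in the output above.
LABEL_TO_SCORE = {"reject": -1.0, "equivalent": 0.0, "accept": 1.0}


def tie_aware_majority_vote(votes):
    """Return the most common label, breaking ties with the average score."""
    if not votes:
        raise ValueError("votes must not be empty")

    counts = Counter(votes)
    top_count = max(counts.values())
    winners = [label for label, n in counts.items() if n == top_count]

    # A single label has the most votes: ordinary majority vote.
    if len(winners) == 1:
        return winners[0]

    # Tie: pick the label whose score is closest to the average score.
    average = sum(LABEL_TO_SCORE[v] for v in votes) / len(votes)
    return min(LABEL_TO_SCORE, key=lambda label: abs(LABEL_TO_SCORE[label] - average))


votes = ["reject", "equivalent", "accept"]
print(tie_aware_majority_vote(votes))
```

With the votes from this issue the average score is 0.0, so the tie resolves to equivalent rather than to whichever label happened to be counted first. Whether a tie should fall back to the average score or be reported as "no majority" is a design choice for the maintainers.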