Majority vote gives wrong label due to model inconsistency. #106

In `example/rater/generated_answer.ipynb`, for an input whose true label is `equivalent`, the model sometimes generates `accept` or `reject`, so the majority vote can return the wrong label.

Input: 
```
("Vitamin C (also known as ascorbic acid and ascorbate) is a water-soluble vitamin found in citrus and other fruits, berries and vegetables, also sold as a dietary supplement and as a topical serum ingredient to treat melasma (dark pigment spots) and wrinkles on the face.",
"Is Vitamin C water-soluble?",
"Yes, Vitamin C is a very water-soluble vitamin.",
"Yes, Vitamin C can be dissolved in water well."), # Equally good
```

Run:
```
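import pprint

# Sample the model three times at a high temperature so the votes can disagree.
config2 = RaterForGeneratedAnswerOpenAIGPT3p5Config()
config2.model_config.num_call = 3
config2.model_config.temperature = 0.9

with OpScope(name="TextFlow"):
    client2 = RaterClient(config2)

# `data` is assumed to be the list holding the input tuple shown above.
output = client2.run(data)
pprint.pprint(output)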
```

Output:
```
{'output': [{'average_score': 0.0,
              'error': 'No errors.',
              'majority_vote': 'reject',
              'response': ['explanation: The grounding answer is better '
                           'because it directly states that Vitamin C is "very '
                           'water-soluble," while the generated answer is more '
                           'vague in saying that it "can be dissolved in water '
                           'well."\n'
                           'label: reject',
                           'explanation: Both the grounding answer and the '
                           'generated answer correctly state that Vitamin C is '
                           'water-soluble, so they are equivalent.\n'
                           'label: equivalent',
                           'explanation: The generated answer is better '
                           'because it accurately states that Vitamin C is '
                           'water-soluble, which aligns with the information '
                           'provided in the context.\n'
                           'label: accept'],
              'scores': [-1.0, 0.0, 1.0],
              'votes': ['reject', 'equivalent', 'accept']}],
```

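Here `'majority_vote': 'reject'` is wrong: the three votes are split one each (`reject`, `equivalent`, `accept`), so there is no majority, and the true label is `equivalent`, which also matches the `average_score` of 0.0.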
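
To illustrate the problem, here is a minimal sketch of how a three-way tie can end up as `reject`, and one possible way to break ties. This is not the library's actual implementation, which I have not checked. `naive_majority_vote` and `tie_aware_majority_vote` are hypothetical helpers, and the label-to-score mapping is inferred from the `votes` and `scores` fields in the output above. The naive version relies on `Counter.most_common`, which returns equally frequent labels in the order first seen, so the first vote wins a tie.

```python
from collections import Counter
from statistics import mean

# Mapping inferred from the 'votes' and 'scores' fields in the output above.
LABEL_TO_SCORE = {"reject": -1.0, "equivalent": 0.0, "accept": 1.0}


def naive_majority_vote(votes):
    """Return the most common label; on a tie, the first label seen wins."""
    return Counter(votes).most_common(1)[0][0]


def tie_aware_majority_vote(votes):
    """Return the strict majority label, or the label nearest the mean score on a tie."""
    counts = Counter(votes).most_common()
    top_count = counts[0][1]
    winners = [label for label, count in counts if count == top_count]
    if len(winners) == 1:
        return winners[0]
    # Tie: fall back to the label whose score is closest to the average score.
    avg = mean(LABEL_TO_SCORE[v] for v in votes)
    return min(LABEL_TO_SCORE, key=lambda label: abs(LABEL_TO_SCORE[label] - avg))


votes = ["reject", "equivalent", "accept"]
print(naive_majority_vote(votes))      # reject
print(tie_aware_majority_vote(votes))  # equivalent
```

Falling back to the average score is only one option. Returning an explicit "no majority" result, or using an odd `num_call` with a lower temperature, might be equally reasonable.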
