What is MT-Bench Conversation Benchmark?
MT-Bench evaluates the ability of language models to maintain context and coherence across multiple conversation turns. This benchmark tests how effectively models can follow conversation threads and provide consistent responses.
Resources: MT-Bench dataset: GitHub, MT-Bench Paper
