How to evaluate a translation: BLEU scores
Media Type | audio
Categories Via RSS | Technology
Publication Date | Jan 16, 2017
Episode Duration | 00:17:06
As anyone who's encountered a badly translated text can tell you, not all translations are created equal. Some translations are smooth and fluent, and sound like a poet wrote them; others are jerky, ungrammatical, and awkward. When a machine is doing the translating, it's awfully easy to end up with a robotic-sounding text; as the state of the art in machine translation improves, though, a natural question to ask is: according to what measure? How do we quantify a "good" translation? Enter the BLEU score, the standard metric for quantifying the quality of a machine translation. BLEU rewards translations that have large n-gram overlap with human reference translations, with some extra heuristics thrown in to guard against weird pathologies (like full sentences getting translated as one word, redundancies, and repetition). Nowadays, if a machine translation is being evaluated or a new state-of-the-art system is announced (like the Google neural machine translation we've discussed on this podcast before), chances are there's a BLEU score going into that assessment.
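To make those mechanics concrete, here is a minimal Python sketch of sentence-level BLEU for a single candidate against a single reference. It is not the code behind any particular evaluation discussed in the episode, just an illustration of the two ideas in the description: clipped n-gram precision (so repeating a matching word doesn't earn extra credit) and a brevity penalty (so a one-word "translation" can't score well). The function names and toy sentences are made up for the example.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU for one candidate against one reference.

    candidate, reference: lists of tokens.
    Returns the geometric mean of clipped n-gram precisions (n = 1..max_n)
    scaled by a brevity penalty that punishes overly short candidates.
    """
    precisions = []
    for n in range(1, max_n + 1):
        cand_counts = Counter(ngrams(candidate, n))
        ref_counts = Counter(ngrams(reference, n))
        # Clipped counts: a repeated n-gram can't earn more credit than
        # the number of times it appears in the reference.
        overlap = sum(min(count, ref_counts[gram])
                      for gram, count in cand_counts.items())
        total = max(sum(cand_counts.values()), 1)
        precisions.append(overlap / total)

    if min(precisions) == 0:
        return 0.0

    # Brevity penalty: guards against gaming the metric with a very short
    # translation whose few words all happen to appear in the reference.
    bp = 1.0 if len(candidate) >= len(reference) else math.exp(
        1 - len(reference) / len(candidate))
    geo_mean = math.exp(sum(math.log(p) for p in precisions) / max_n)
    return bp * geo_mean

reference = "the cat is on the mat".split()
print(bleu("the cat is on the mat".split(), reference))    # 1.0: exact match
print(bleu("the the the the the the".split(), reference))  # 0.0: clipping punishes repetition
print(bleu("mat".split(), reference))                      # 0.0: too short, no higher-order n-grams
```

Real toolkits add smoothing for short sentences and support multiple references per sentence, but the core score is exactly this overlap-times-brevity-penalty product.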
