Measuring sentence similarity

metrics

BLEU (Bilingual Evaluation Understudy)

BLEU computes a score based on the n-gram overlap between the generated text and the reference text, as well as the brevity penalty to handle cases where the generated text is too short. The score ranges from 0 to 1, where 1 indicates a perfect match with the reference translations.

ROUGE (Recall-Oriented Understudy for Gisting Evaluation)

ROUGE score measures the similarity between the machine-generated summary and the reference summaries using overlapping n-grams, word sequences that appear in both the machine-generated summary and the reference summaries. ROUGE score ranges from 0 to 1, with higher values indicating better summary quality.

ROUGE scores are branched into ROUGE-N,ROUGE-L, and ROUGE-S.
ROUGE-N measures the overlap of n-grams (contiguous sequences of n words) between the candidate text and the reference text. It computes the precision, recall, and F1-score based on the n-gram overlap.
ROUGE-L measures the longest common subsequence (LCS) between the candidate text and the reference text. It computes the precision, recall, and F1-score based on the length of the LCS.
ROUGE-S measures the skip-bigram (bi-gram with at most one intervening word) overlap between the candidate text and the reference text. It computes the precision, recall, and F1-score based on the skip-bigram overlap.

references

https://medium.com/@sthanikamsanthosh1994/understanding-bleu-and-rouge-score-for-nlp-evaluation-1ab334ecadcb

Measuring sentence similarity

http://yoursite.com/2024/02/21/llm0/

Author

s-serenity

Posted on

2024-02-21

Updated on

2024-03-15

Licensed under

You need to set install_url to use ShareThis. Please set it in _config.yml.
You forgot to set the business or currency_code for Paypal. Please set it in _config.yml.

Comments

You forgot to set the shortname for Disqus. Please set it in _config.yml.