Evaluation Metrics

Character Error Rate (CER)

CER measures the number of character-level edits required:

  • insertions

  • deletions

  • substitutions

Formula: CER = Levenshtein_distance / number_of_characters

Word Error Rate (WER)

WER measures errors at word level:

WER = Levenshtein_distance_words / number_of_words_in_reference

These metrics are used to evaluate OCR accuracy before and after correction.

Normalisation Before Evaluation

Before computing either metric, both the gold standard and the prediction are normalised using the same function:

import re, unicodedata

def normalize(text):
    text = unicodedata.normalize("NFKD", text)
    text = text.lower()
    text = text.replace("ſ", "s")     # long s → regular s
    text = text.replace("œ", "oe")
    text = text.replace("æ", "ae")
    text = re.sub(r"[.,;:!?()\[\]\"'&]", " ", text)
    text = re.sub(r"\s+", " ", text)
    return text.strip()

This ensures that differences in punctuation or Unicode encoding do not artificially inflate error rates.