Feature Engineering: Grammatical Features
Shakleen Ishfar edited this page Jun 9, 2024 · 1 revision
TL;DR: Levenshtein distance is the most useful of the distance features.
These features measure how much each raw essay sentence differs from its corrected version. The corrections come from a T5 model, T5-GEC-small, which is run on every sentence of the essay. A distance is then computed between each raw sentence and its corrected counterpart; a larger distance means the model changed more of the sentence.
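As a rough illustration, the four distances compared on this page can be sketched in plain Python. The page does not say how each one was computed, so everything here is an assumption: Levenshtein and Hamming work on characters, Hamming counts the length difference as mismatches, and Jaccard and cosine work on whitespace tokens. The sentence pair is made up, not model output.

```python
from collections import Counter
import math


def levenshtein(a: str, b: str) -> int:
    """Character-level edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def hamming(a: str, b: str) -> int:
    """Position-wise mismatches; extra trailing characters count as mismatches (assumption)."""
    return sum(x != y for x, y in zip(a, b)) + abs(len(a) - len(b))


def jaccard_distance(a: str, b: str) -> float:
    """1 - |A & B| / |A | B| over sets of whitespace tokens (assumption)."""
    sa, sb = set(a.split()), set(b.split())
    if not sa and not sb:
        return 0.0
    return 1.0 - len(sa & sb) / len(sa | sb)


def cosine_distance(a: str, b: str) -> float:
    """1 - cosine similarity of bag-of-words count vectors (assumption)."""
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    if norm == 0:
        return 0.0 if ca == cb else 1.0
    return 1.0 - dot / norm


# Hypothetical raw sentence and its corrected form.
raw = "She go to school"
corrected = "She goes to school"

print(levenshtein(raw, corrected))       # 2
print(hamming(raw, corrected))           # 12
print(jaccard_distance(raw, corrected))  # approximately 0.4
print(cosine_distance(raw, corrected))   # 0.25
```

The example hints at why Hamming may be a weaker signal: a two-character insertion shifts every later character, so the position-wise count jumps to 12 while the edit distance stays at 2.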
In terms of feature importance, Levenshtein distance is more effective than Hamming, cosine, and Jaccard distance.
In terms of aggregation, the sum and max operations are more important than the others.
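A minimal sketch of the aggregation step, assuming the per-sentence distances of one essay are collapsed into essay-level features. Only sum and max are named on this page; min, mean, and std stand in for the unnamed "others", and the `aggregate` helper and its feature names are hypothetical.

```python
from statistics import mean, pstdev


def aggregate(distances: list[float], prefix: str) -> dict[str, float]:
    """Collapse per-sentence distances into essay-level features."""
    ops = ("sum", "max", "min", "mean", "std")
    if not distances:
        # An essay with no sentences gets all-zero features (assumption).
        return {f"{prefix}_{op}": 0.0 for op in ops}
    values = (
        sum(distances),
        max(distances),
        min(distances),
        mean(distances),
        pstdev(distances),
    )
    return {f"{prefix}_{op}": float(v) for op, v in zip(ops, values)}


# Hypothetical Levenshtein distances for the three sentences of one essay.
features = aggregate([2, 0, 5], prefix="levenshtein")
print(features["levenshtein_sum"])  # 7.0
print(features["levenshtein_max"])  # 5.0
```

Read this way, the sum tracks the total amount of correction across the essay and the max tracks the single most heavily corrected sentence.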