An n-gram model uses (n-1) words of context to predict the next word in a sentence.
E.g., let’s say we are using a trigram model (which uses 2 words of context).
If you type “I am”, the messaging app might suggest “going”, “happy”, “fine”, etc.
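As a sketch of how such a suggestion could be looked up, here is a minimal Python example. The trigram counts below are invented for illustration, not taken from any real corpus:

```python
from collections import Counter

# Hypothetical trigram counts: (two context words) -> Counter of next words.
# These numbers are made up purely to illustrate the lookup.
trigram_counts = {
    ("i", "am"): Counter({"going": 50, "happy": 30, "fine": 20}),
}

def suggest(context, k=3):
    """Return the k most frequent next words seen after a two-word context."""
    next_words = trigram_counts.get(context, Counter())
    return [word for word, _ in next_words.most_common(k)]

print(suggest(("i", "am")))  # ['going', 'happy', 'fine']
```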
Real Example with Bigrams:
Given a sentence: "I love to eat ice cream and I love to eat pizza."
The bigrams are:
I love
love to
to eat
eat ice
ice cream
cream and
and I
I love
love to
to eat
eat pizza
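A minimal sketch of extracting these bigrams in Python. The tokenization here is just a lowercase split with the period stripped; real systems tokenize more carefully:

```python
sentence = "I love to eat ice cream and I love to eat pizza."

# Simple tokenization: lowercase and drop the trailing period.
tokens = sentence.lower().replace(".", "").split()

# A bigram is every pair of adjacent tokens.
bigrams = list(zip(tokens, tokens[1:]))

for first, second in bigrams:
    print(first, second)
```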
From this, the model learns:
After “I” (which appears twice), “love” comes next both times → 100% probability
After “eat” (which appears twice), “ice” and “pizza” each follow once → 50% probability each
Using these probabilities, the app predicts the next word.
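Putting the two steps together, here is a minimal sketch of estimating these conditional probabilities from bigram counts over the example sentence:

```python
from collections import Counter, defaultdict

sentence = "I love to eat ice cream and I love to eat pizza."
tokens = sentence.lower().replace(".", "").split()

# Count how often each word follows each preceding word.
following = defaultdict(Counter)
for first, second in zip(tokens, tokens[1:]):
    following[first][second] += 1

# P(next | word) = count(word, next) / count(word, anything)
def next_word_probs(word):
    counts = following[word]
    total = sum(counts.values())
    return {nxt: c / total for nxt, c in counts.items()}

print(next_word_probs("i"))    # {'love': 1.0}
print(next_word_probs("eat"))  # {'ice': 0.5, 'pizza': 0.5}
```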
Applications:
Spelling or grammar correction: Needs only a small context → a bigram model is often enough.
Plagiarism detection: Needs a larger context → higher-order n-grams (like 5-grams or 6-grams) work better.
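For completeness, a sketch of a generic n-gram extractor, so the same code covers bigrams, 5-grams, or 6-grams just by changing n:

```python
def ngrams(tokens, n):
    """Return all n-grams (as tuples) from a list of tokens."""
    return list(zip(*(tokens[i:] for i in range(n))))

tokens = "i love to eat ice cream and i love to eat pizza".split()
print(ngrams(tokens, 2)[:3])  # [('i', 'love'), ('love', 'to'), ('to', 'eat')]
print(ngrams(tokens, 5)[:1])  # [('i', 'love', 'to', 'eat', 'ice')]
```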