PT Trigram Frequency
Shows typical frequencies for the most common letter trigrams in a Brazilian Portuguese corpus.
Trigram Frequency in Brazilian Portuguese
A trigram is just three letters in a row pulled from a piece of text. To get how often each triple shows up, you divide its count by the total: f(xyz) = count(xyz) / total_trigrams. The denominator here counts the overlapping triples across the whole corpus. A string of length n (with n at least 3) contains exactly n - 2 of them.
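The formula above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation: the function name trigram_frequencies is made up for the example, and it assumes the input is one plain string with no normalization (spaces, accents and punctuation are counted as they come).

```python
from collections import Counter


def trigram_frequencies(text: str) -> dict[str, float]:
    """Return f(xyz) = count(xyz) / total_trigrams for every trigram in text."""
    # Slide a window of width 3 over the text; a string of length n
    # yields n - 2 overlapping trigrams.
    trigrams = [text[i:i + 3] for i in range(len(text) - 2)]
    total = len(trigrams)
    if total == 0:
        # Strings shorter than three characters contain no trigrams.
        return {}
    counts = Counter(trigrams)
    return {tri: count / total for tri, count in counts.items()}


# "quequeque" has 9 characters, so 7 overlapping trigrams, 3 of them "que".
freqs = trigram_frequencies("quequeque")
print(freqs["que"])
```

Because every trigram is divided by the same total, the frequencies always sum to 1.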
Look at Brazilian Portuguese and the most frequent trigrams trace out the language's common morphemes and inflections. You'll usually find “que”, “ent”, “des”, “con”, “est”, “men”, “par” and “ada” near the top, a direct echo of common prefixes like des- and con- and of verb endings.
Applications
Statistical language-modeling tools such as KenLM and Stanford NLP work with n-grams, trigrams among them. Frequency statistics of this kind also supported predictive text on keypads such as T9, and OCR error correction still leans on trigrams, ranking candidate substitutions by how likely each trigram is. The same idea helps with spelling correction and with identifying which language a text is written in.
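Ranking OCR candidates by trigram likelihood could look roughly like the sketch below. Everything in it is an assumption made for illustration: the tiny sample sentence stands in for a real corpus, the helper names are invented, and the floor count of 0.5 for unseen trigrams is an arbitrary stand-in for proper smoothing.

```python
import math
from collections import Counter


def trigram_counts(text: str) -> Counter:
    """Count overlapping character trigrams in text."""
    return Counter(text[i:i + 3] for i in range(len(text) - 2))


def log_score(word: str, counts: Counter, total: int, floor: float = 0.5) -> float:
    """Sum of log relative frequencies of the word's trigrams.

    Unseen trigrams get a small floor count (an arbitrary choice here)
    so that the logarithm stays defined.
    """
    score = 0.0
    for i in range(len(word) - 2):
        count = counts.get(word[i:i + 3], 0)
        score += math.log((count if count > 0 else floor) / total)
    return score


# Toy sample standing in for a real Brazilian Portuguese corpus.
sample = "o presente contente sente que a gente entende"
counts = trigram_counts(sample)
total = sum(counts.values())

# Two readings an OCR engine might propose for the same word image.
candidates = ["gente", "gcnte"]
best = max(candidates, key=lambda w: log_score(w, counts, total))
print(best)
```

The candidates have the same length, so their summed log scores are directly comparable. For candidates of different lengths you would want to normalize, for example by the number of trigrams.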
FAQ
Why use trigrams instead of bigrams? The extra letter buys you more context. A trigram model sees “ent” as a single unit, as in “dente” or “sente”, while a bigram model sees only the pairs “en” and “nt” and can't tell whether they occurred together.
What is the data sparsity problem? The number of possible triples grows with the cube of the alphabet size (the 26 unaccented letters alone give 26 × 26 × 26 = 17,576 combinations), so plenty of perfectly valid ones never show up in the training data. Smoothing methods like Kneser-Ney handle this by shifting some probability mass over to the trigrams that were never seen.
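To make the idea of shifting probability mass concrete, here is a sketch using add-one (Laplace) smoothing. Note that this is a much simpler technique than Kneser-Ney, swapped in only because it fits in a few lines. The function name and the 26-letter alphabet are assumptions for the example. Real Portuguese text also contains accented letters, spaces and punctuation.

```python
from collections import Counter


def laplace_trigram_prob(trigram: str, counts: Counter, total: int,
                         alphabet_size: int) -> float:
    """Add-one smoothed probability of a trigram.

    Every possible trigram gets one pseudo-count, so the denominator
    grows by alphabet_size ** 3 and unseen trigrams get a nonzero share.
    """
    possible = alphabet_size ** 3
    return (counts.get(trigram, 0) + 1) / (total + possible)


text = "quequeque"
counts = Counter(text[i:i + 3] for i in range(len(text) - 2))
total = sum(counts.values())  # 7 overlapping trigrams

# Assumed alphabet: the 26 unaccented lowercase letters.
p_seen = laplace_trigram_prob("que", counts, total, 26)
p_unseen = laplace_trigram_prob("xyz", counts, total, 26)
print(p_seen, p_unseen)
```

With such a tiny sample the pseudo-counts dominate. On a real corpus the seen trigrams keep most of the mass, and more refined methods such as Kneser-Ney distribute the remainder less uniformly.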
Are trigrams still relevant in the era of transformers? They are. As a fast, lightweight baseline they remain a good fit for keyboards, embedded OCR engines and lightweight language ID, the kinds of places where a neural model would be too heavy.