ISO 639 Language Code Validator
Validate ISO 639-1 / ISO 639-2 language codes and resolve the language name.
ISO 639: the language code family
ISO 639 is the international standard that catalogs identifiers for human languages. It underpins most internationalization pipelines: HTML lang attributes, browser Accept-Language headers, app store localized listings, gettext catalogs, Unicode CLDR locale data, machine-translation routing and search engine hreflang tags all start from an ISO 639 subtag. Compared with ISO 3166 (countries) or ISO 4217 (currencies), ISO 639 is split into more parts, with overlapping code sets, because the linguistic landscape is far more granular than the political one.
The three main variants
- ISO 639-1: two lowercase letters (pt, en, es, de, fr, ja, zh). Covers about 184 of the world's most widely spoken languages. This is what you almost always want in web contexts.
- ISO 639-2: three letters (por, eng, spa, deu/ger, fra/fre, jpn, zho/chi). Comes in two flavors: bibliographic (used by libraries, MARC records) and terminologic (used by linguists). Most languages have a single code; a few have both forms.
- ISO 639-3: three letters (por, eng, cmn for Mandarin, yue for Cantonese, bzs for Brazilian Sign Language). Designed to cover every known human language, living or extinct (currently around 7,800 entries). Its macrolanguage concept lets zho act as an umbrella for cmn, yue, wuu and dozens of other Chinese languages.
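The relationship between the two-letter and three-letter forms can be sketched as a small lookup. This is a minimal illustration, not a full validator: the `SAMPLE_CODES` table and the `lookupLanguage` helper are hypothetical names, and the table holds only four of the languages mentioned above.

```javascript
// Illustrative subset only: a real validator needs the full ISO 639 tables.
// part2T = ISO 639-2 terminologic code, part2B = ISO 639-2 bibliographic code.
const SAMPLE_CODES = [
  { part1: 'pt', part2T: 'por', part2B: 'por', name: 'Portuguese' },
  { part1: 'de', part2T: 'deu', part2B: 'ger', name: 'German' },
  { part1: 'fr', part2T: 'fra', part2B: 'fre', name: 'French' },
  { part1: 'zh', part2T: 'zho', part2B: 'chi', name: 'Chinese' },
];

// Return the matching entry for a two- or three-letter code, or null.
function lookupLanguage(code) {
  const normalized = String(code).trim().toLowerCase();
  // Reject anything that is not two or three ASCII letters.
  if (!/^[a-z]{2,3}$/.test(normalized)) return null;
  return (
    SAMPLE_CODES.find(
      (e) => e.part1 === normalized || e.part2T === normalized || e.part2B === normalized
    ) || null
  );
}

console.log(lookupLanguage('ger').part1); // de
console.log(lookupLanguage('FRA').name);  // French
console.log(lookupLanguage('xx'));        // null
```

Storing both ISO 639-2 forms per entry lets a bibliographic code such as ger and a terminologic code such as deu resolve to the same language.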
There is also ISO 639-5 for language families (e.g. sla Slavic, roa Romance) and ISO 639-4 documenting the principles, but they rarely show up in application code.
BCP 47: the practical web format
On the web you almost never see a bare ISO 639 code. You see a BCP 47 language tag (RFC 5646), which composes ISO 639 with ISO 15924 (scripts), ISO 3166-1 (regions) and variant / extension subtags. Examples:
- pt-BR: Brazilian Portuguese (language + region).
- pt-PT: European Portuguese.
- zh-Hans-CN: Simplified Chinese as written in China (language + script + region).
- zh-Hant-TW: Traditional Chinese as written in Taiwan.
- sr-Cyrl / sr-Latn: Serbian in Cyrillic or Latin script.
- en-US-x-private: private-use subtag.
Subtag order: language - script - region - variant - extension - private use. Case is a convention only (lowercase language, Titlecase script, UPPERCASE region); tag matching is case-insensitive.
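The subtag structure and case conventions can be explored with the built-in `Intl.Locale` class, which needs no npm dependency. The following is a rough sketch; `parseLanguageTag` is a hypothetical helper name, and it checks only that a tag is well-formed, not that every subtag is registered.

```javascript
// Split a BCP 47 tag into its main subtags using the built-in Intl.Locale.
// Returns null when the tag is not structurally valid.
function parseLanguageTag(tag) {
  try {
    const locale = new Intl.Locale(tag);
    return {
      canonical: locale.toString(), // case-normalized form of the tag
      language: locale.language,
      script: locale.script,        // undefined when the tag has no script
      region: locale.region,        // undefined when the tag has no region
    };
  } catch (err) {
    // Intl.Locale signals a malformed tag with a RangeError.
    if (err instanceof RangeError) return null;
    throw err;
  }
}

console.log(parseLanguageTag('ZH-hans-cn'));
console.log(parseLanguageTag('pt_BR')); // null: underscore is not a valid separator
```

Note that POSIX-style names such as pt_BR are rejected, so they need converting to hyphenated form before parsing.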
Where ISO 639 shows up in the stack
- HTML: <html lang="pt-BR"> declares document language for screen readers, browsers and crawlers.
- HTTP: Accept-Language: pt-BR,pt;q=0.9,en;q=0.8 negotiates content language with quality values.
- SEO: <link rel="alternate" hreflang="pt-BR" href="..."> tells Google which version to serve.
- i18n libraries: react-intl, FormatJS, i18next, vue-i18n all key on BCP 47 tags.
- Unicode CLDR: the locale data behind Intl uses BCP 47 internally.
- App stores: localized descriptions are uploaded per BCP 47 locale.
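To make the HTTP case concrete, here is a minimal sketch of reading an Accept-Language header and ordering its tags by quality value. `parseAcceptLanguage` is a hypothetical helper, not part of any library named here, and it skips edge cases (wildcards, strict q-value validation) that a production parser would handle.

```javascript
// Parse an Accept-Language header into tags ordered by descending quality.
function parseAcceptLanguage(header) {
  return header
    .split(',')
    .map((part, index) => {
      const [tag, ...params] = part.trim().split(';');
      const qParam = params.map((p) => p.trim()).find((p) => p.startsWith('q='));
      const q = qParam ? Number(qParam.slice(2)) : 1; // q defaults to 1 when omitted
      return { tag: tag.trim(), q, index };
    })
    // Drop empty entries and entries whose q value is not a number.
    .filter((entry) => entry.tag !== '' && !Number.isNaN(entry.q))
    // Highest quality first; header order breaks ties.
    .sort((a, b) => b.q - a.q || a.index - b.index)
    .map(({ tag, q }) => ({ tag, q }));
}

console.log(parseAcceptLanguage('pt-BR,pt;q=0.9,en;q=0.8'));
// [ { tag: 'pt-BR', q: 1 }, { tag: 'pt', q: 0.9 }, { tag: 'en', q: 0.8 } ]
```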
Sign and constructed languages
ISO 639-3 covers sign languages and even constructed languages. A few you can validate:
- bzs: Língua Brasileira de Sinais (Libras).
- ase: American Sign Language (ASL).
- bfi: British Sign Language (BSL).
- eo: Esperanto (the ISO 639-1 code; epo in ISO 639-3).
- tlh: Klingon, registered in ISO 639-3 (yes, the Star Trek language).
- sjn: Sindarin, Tolkien's Elvish language.
- qaa-qtz: range reserved for local / private use.
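The reserved qaa-qtz range can be checked with a plain string comparison, since the codes sort alphabetically. A small sketch, with `isPrivateUseCode` as a hypothetical helper name:

```javascript
// True when a three-letter code falls in the qaa..qtz local-use range.
function isPrivateUseCode(code) {
  const c = String(code).toLowerCase();
  // Lowercase ASCII letters compare in alphabetical order as strings.
  return /^[a-z]{3}$/.test(c) && c >= 'qaa' && c <= 'qtz';
}

console.log(isPrivateUseCode('qaa')); // true
console.log(isPrivateUseCode('qtz')); // true
console.log(isPrivateUseCode('qya')); // false: Quenya sits just outside the range
console.log(isPrivateUseCode('tlh')); // false
```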
Libraries and language detection
- JavaScript: iso-639-1, langs, bcp-47 on npm.
- Python: pycountry, babel, langcodes.
- Detection: franc (npm), Google CLD3 (cld3 Python bindings), Microsoft Recognizers, fastText language models.
- Glibc / ICU: system locales like pt_BR.UTF-8 blend ISO 639-1 + ISO 3166-1 + encoding.
Example with iso-639-1:
const ISO6391 = require('iso-639-1')
ISO6391.getName('pt') // "Portuguese"
ISO6391.getNativeName('pt') // "Português"
ISO6391.validate('xx') // false
ISO6391.getAllCodes().length // 184
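Where adding a dependency is not an option, the runtime's built-in `Intl.DisplayNames` can resolve names too. This is a sketch under the assumption that the runtime ships with ICU locale data (recent Node.js and browsers do); `resolveLanguageName` is a hypothetical wrapper, and the set of codes it recognizes depends on that data rather than on the ISO 639 tables directly.

```javascript
// Resolve a language name with the runtime's built-in locale data.
// Returns undefined for malformed input or codes the runtime has no name for.
function resolveLanguageName(code, displayLocale = 'en') {
  try {
    const names = new Intl.DisplayNames([displayLocale], {
      type: 'language',
      fallback: 'none', // return undefined instead of echoing the code back
    });
    return names.of(code);
  } catch (err) {
    // of() throws a RangeError when the input is not shaped like a language tag.
    if (err instanceof RangeError) return undefined;
    throw err;
  }
}

console.log(resolveLanguageName('pt'));         // Portuguese
console.log(resolveLanguageName('pt-BR'));      // region-specific name, if available
console.log(resolveLanguageName('not a code')); // undefined
```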
Brazilian Portuguese vs European Portuguese
Both share the ISO 639-1 code pt, but BCP 47 distinguishes them by region: pt-BR and pt-PT. The 1990 Orthographic Agreement (Acordo Ortográfico) reduced spelling differences, but vocabulary, grammar and pronunciation still diverge. For localization, always use the region-tagged form: pt alone is ambiguous and Google may serve the wrong variant.
FAQ
Should I use pt or pt-BR?
For a Brazilian site, use pt-BR. The bare pt is generic and search engines or screen readers may apply European Portuguese defaults. For an hreflang matrix, declare each regional variant.
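An hreflang matrix of this kind could be generated along the following lines. The helper name `buildHreflangLinks` and the example.com URLs are placeholders for illustration, and the sketch assumes the URLs are trusted, since it does no HTML escaping.

```javascript
// Build one <link rel="alternate"> tag per regional variant.
// Tags are case-normalized with the built-in Intl.getCanonicalLocales.
function buildHreflangLinks(variants) {
  return Object.entries(variants).map(([tag, href]) => {
    // "x-default" is an hreflang keyword, not a locale, so it is passed through.
    const canonicalTag = tag === 'x-default' ? tag : Intl.getCanonicalLocales(tag)[0];
    return `<link rel="alternate" hreflang="${canonicalTag}" href="${href}">`;
  });
}

// Hypothetical URLs for a site with Brazilian and European Portuguese versions.
const links = buildHreflangLinks({
  'pt-br': 'https://example.com/br/',
  'pt-pt': 'https://example.com/pt/',
  'x-default': 'https://example.com/',
});
console.log(links.join('\n'));
```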
ISO 639-1, -2, or -3: which do I pick?
ISO 639-1 for everyday web work: BCP 47 uses the two-letter code whenever one exists. ISO 639-2 for libraries, MARC and government archives. ISO 639-3 when you need to identify rare, regional or sign languages that 639-1 does not cover.
Do sign languages have ISO codes?
Yes. ISO 639-3 covers them: bzs for Libras, ase for ASL, bfi for BSL and many others. ISO 639-1 does not, since it has only about 184 codes, reserved for the most widespread spoken languages.
Is Klingon really an ISO 639-3 code?
Yes: tlh. The ISO 639-3 registry includes constructed languages with documented vocabularies. Sindarin (sjn) and Quenya (qya), both from Tolkien, are also listed alongside Esperanto (eo / epo) and Volapük (vo / vol).