DOI Validator
Validate DOI format (10.PREFIX/SUFFIX) per Crossref. Format only — does not check existence.
DOI validation: the persistent identifier of scholarly content
The DOI (Digital Object Identifier) is the persistent identifier of choice for academic articles, books, datasets, preprints, conference papers and any other "digital object" that needs a stable, citable link. It is governed by the DOI Foundation, a nonprofit, and standardised as ISO 26324. Unlike a plain URL, a DOI resolves through https://doi.org/ and is designed to keep pointing to the current location of the resource after publisher migrations, journal name changes or domain expirations, as long as the registrant keeps the registered target up to date.
Validating a DOI is a syntactic exercise — there is no check digit. A DOI must (1) start with the literal prefix 10., (2) be followed by a registrant code of four to nine digits, (3) include a forward slash, and (4) end with an opaque, case-insensitive suffix made of letters, digits and a small set of punctuation marks. Whether the DOI is actually registered is a separate question that requires a network resolution.
Anatomy of a DOI
A DOI has the form 10.{registrant}/{suffix}. Examples:
- 10.1038/nature12373: Nature article.
- 10.1093/ajae/aaq063: American Journal of Agricultural Economics.
- 10.1109/5.771073: IEEE archive paper.
- 10.1000/182: the DOI Handbook itself (sample 10.1000 registrant).
The prefix 10. is fixed by the standard and never changes. The registrant (four to nine digits after the dot) is allocated to a registration agency, which in turn assigns it to a publisher. The suffix is opaque: the publisher chooses any scheme it likes (sequential numbers, article codes, mnemonic strings) and validators must treat it as a black box. The pattern used on this page accepts ASCII letters, digits and the punctuation - . _ ; ( ) / : in the suffix.
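The anatomy described above can be sketched in Python as a split at the first slash. This is a minimal illustration, not part of the validator: `DoiParts` and `split_doi` are hypothetical names, and the function deliberately does not enforce the four-to-nine digit length, which is left to the regex check.

```python
from typing import NamedTuple


class DoiParts(NamedTuple):
    directory: str   # the fixed "10"
    registrant: str  # digits after the dot
    suffix: str      # opaque, publisher-defined


def split_doi(doi: str) -> DoiParts:
    """Split a DOI at the FIRST slash; later slashes belong to the suffix."""
    prefix, slash, suffix = doi.partition("/")
    if not slash or not suffix:
        raise ValueError(f"no suffix found in {doi!r}")
    directory, dot, registrant = prefix.partition(".")
    digits_only = registrant.isascii() and registrant.isdigit()
    if directory != "10" or not dot or not digits_only:
        raise ValueError(f"unexpected prefix in {doi!r}")
    return DoiParts(directory, registrant, suffix)


parts = split_doi("10.1093/ajae/aaq063")
print(parts.registrant, parts.suffix)
```

Splitting with `partition` rather than `split("/")` keeps a suffix such as `ajae/aaq063` in one piece.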
Reasonable validation regex
A practical pattern, after normalising the input to uppercase, is:
^10\.\d{4,9}/[-._;()/:A-Z0-9]+$
This regex catches the vast majority of real-world DOIs while still permitting the punctuation publishers actually use. A small share of legacy DOIs include characters outside this set, so a validator that must accept every registered DOI may prefer a more permissive suffix check that allows any printable character. Either way, a regex check is the floor, not the ceiling: confirming actual registration is the next step.
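As a rough sketch, the pattern above could be applied in Python as follows. The function name `is_valid_doi_format` is hypothetical. Two choices here are assumptions rather than part of the original pattern: a case-insensitive, ASCII-only flag replaces the uppercasing step, and `fullmatch` replaces the `^...$` anchors so that a trailing newline is not silently accepted.

```python
import re

# Same character classes as the pattern above. re.ASCII keeps \d and the
# case-insensitive match limited to ASCII characters.
DOI_PATTERN = re.compile(
    r"10\.\d{4,9}/[-._;()/:A-Z0-9]+",
    re.IGNORECASE | re.ASCII,
)


def is_valid_doi_format(candidate: str) -> bool:
    """Return True if the string looks like a DOI. Format only, no lookup."""
    return DOI_PATTERN.fullmatch(candidate) is not None


for sample in ("10.1038/nature12373", "10,1038/nature12373", "10.1000/"):
    print(sample, is_valid_doi_format(sample))
```

The check is purely local; it says nothing about whether the DOI is registered.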
Registration agencies
Several agencies issue DOIs under licence from the DOI Foundation:
- Crossref — the largest, covering most academic papers worldwide.
- DataCite — datasets, research software, theses and grey literature (used by Zenodo, Figshare, OSF).
- mEDRA — European DOI agency, common for Italian and EU publishers.
- Airiti — Taiwan / East Asian publishers.
- KISTI — Korean Institute of Science and Technology Information.
- JaLC — Japan Link Center, Japanese journals and datasets.
- SciELO (Brazil) is not an agency itself; it registers DOIs through its Crossref membership.
Resolution and content negotiation
Beyond validation, you usually want to resolve the DOI: send an HTTP HEAD or GET to https://doi.org/{doi} and observe the 30x redirect to the publisher landing page. Tools that need bibliographic metadata can use DOI Content Negotiation:
curl -LH "Accept: application/x-bibtex" https://doi.org/10.1038/nature12373
curl -LH "Accept: application/vnd.citationstyles.csl+json" https://doi.org/10.1038/nature12373
The server returns BibTeX, CSL-JSON, RIS or RDF depending on the Accept header — ideal for reference managers (Mendeley, Zotero, Papers) and citation engines.
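The same content negotiation can be sketched with Python's standard library. The helper names `build_negotiation_request` and `fetch_metadata` are hypothetical, and the choice of which characters to leave unescaped in the URL is an assumption. `urllib` follows the redirects on its own, much like `curl -L`.

```python
import urllib.request
from urllib.parse import quote


def build_negotiation_request(doi: str, accept: str) -> urllib.request.Request:
    """Build a GET request for doi.org carrying the desired Accept header."""
    # Keep the punctuation commonly found in DOIs unescaped (an assumption).
    url = "https://doi.org/" + quote(doi, safe="/:;()")
    return urllib.request.Request(url, headers={"Accept": accept})


def fetch_metadata(doi: str, accept: str = "application/x-bibtex",
                   timeout: float = 10.0) -> str:
    """Fetch metadata in the requested format. Requires network access."""
    request = build_negotiation_request(doi, accept)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset)


# Example call (performs a real HTTP request, so it is left commented out):
# print(fetch_metadata("10.1038/nature12373"))
```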
DOI vs ORCID vs ISSN vs ISBN
- DOI identifies the work (article, dataset, chapter).
- ORCID identifies the researcher (16-character identifier, compatible with ISNI, whose last character is an ISO 7064 MOD 11-2 check character: a digit or X).
- ISSN identifies the journal in which the article was published.
- ISBN identifies the book edition.
- arXiv ID and PMID are alternative identifiers; arXiv preprints today carry a DOI from DataCite as well.
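For contrast with the DOI, which has no check digit, the ORCID check character can be computed locally. The following is a minimal sketch of the ISO 7064 MOD 11-2 calculation; the function names are hypothetical.

```python
def orcid_check_character(base_digits: str) -> str:
    """Compute the MOD 11-2 check character for the first 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid(orcid: str) -> bool:
    """Check length, digits and the final check character of an ORCID iD."""
    compact = orcid.replace("-", "").upper()
    base = compact[:15]
    if len(compact) != 16 or not (base.isascii() and base.isdigit()):
        return False
    return orcid_check_character(base) == compact[15]


print(is_valid_orcid("0000-0002-1825-0097"))
```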
Pitfalls
- Case-sensitivity confusion: a DOI is case-insensitive, but the URL of the destination publisher may not be. Compare DOIs only after normalising both to the same case, for example uppercase.
- Mistaking URL for DOI: https://doi.org/10.1000/182 is the URL; 10.1000/182 is the DOI itself. Strip the host before validating.
- Mistyping the prefix: 10,1038/... (comma) and 10.1038\... (backslash) are common copy-paste bugs.
- Trailing punctuation: citation text often ends a sentence with the DOI, as in "10.1038/nature12373." with a final dot. Strip that dot before validating.
- Suffix with slashes: 10.1093/ajae/aaq063 is valid. Only the first slash separates prefix from suffix; some validators wrongly cut the suffix at the next slash.
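Several of these pitfalls can be handled by a small normalisation step before validating. The sketch below uses a hypothetical `normalise_doi` helper. Accepting the legacy dx.doi.org host and a leading "doi:" label are assumptions beyond what is listed above. Typos such as a comma in the prefix are deliberately not repaired, so that the format check can still reject them.

```python
import re

# Resolver host or "doi:" label in front of the DOI itself.
_LEADING_NOISE = re.compile(
    r"^(?:https?://(?:dx\.)?doi\.org/|doi:\s*)",
    re.IGNORECASE,
)


def normalise_doi(raw: str) -> str:
    """Strip resolver host and trailing dots, then uppercase for comparison."""
    doi = raw.strip()
    doi = _LEADING_NOISE.sub("", doi)
    doi = doi.rstrip(".")  # sentence-final dot picked up from citation text
    return doi.upper()


print(normalise_doi("https://doi.org/10.1038/nature12373."))
```

Two inputs refer to the same DOI when their normalised forms are equal.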
FAQ
Can a DOI start with anything other than 10.? No. Every DOI in existence begins with the literal prefix 10. — it is the directory indicator of the Handle System.
Can I confirm a DOI by checking whether it resolves? Yes — send a HEAD request to https://doi.org/{doi}. A 302 response means it is registered; a 404 means it does not exist. This page only does the syntactic check, locally.
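A minimal sketch of that HEAD check in Python, using hypothetical helper names. `http.client` does not follow redirects, so the status returned is the one doi.org itself sends. Treating any other status as "unknown" is an assumption made here to stay conservative.

```python
import http.client
from urllib.parse import quote


def interpret_status(status: int) -> str:
    """Map a doi.org status code to a coarse verdict."""
    if 300 <= status < 400:
        return "registered"
    if status == 404:
        return "not found"
    return "unknown"  # rate limiting, server errors and so on


def head_status(doi: str, timeout: float = 10.0) -> int:
    """Send a HEAD request to doi.org. Requires network access."""
    connection = http.client.HTTPSConnection("doi.org", timeout=timeout)
    try:
        connection.request("HEAD", "/" + quote(doi, safe="/:;()"))
        return connection.getresponse().status
    finally:
        connection.close()


# Example call (performs a real HTTP request, so it is left commented out):
# print(interpret_status(head_status("10.1000/182")))
```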
Is the DOI case-sensitive? No. The string is case-insensitive by standard. Compare after normalising to a common case.
How much does a DOI cost? Publishers pay around US$ 1 per DOI through Crossref, plus an annual membership fee starting at US$ 275. DataCite has similar tiered pricing.
Does the suffix have meaning? Only to the publisher who chose it. Some encode the article number, others use a UUID or a slug. Validators must treat it as an opaque string.