Base64 Validator
Check whether a string is valid Base64 (with or without padding). Shows decoded size and whether content looks like UTF-8 or binary.
Base64: the binary-to-text encoding behind data URIs, JWTs and email attachments
Base64 is a binary-to-text encoding scheme that turns arbitrary bytes into a printable ASCII string using a 64-character alphabet. Standardized today by RFC 4648 (2006, replacing RFC 3548) and popularized by the MIME email standard in RFC 2045 (1996), it remains the workhorse for shipping binary payloads through systems that only understand text: SMTP headers, JSON, XML, HTTP cookies, JWT segments and HTML attributes.
The trade-off is size: every 3 input bytes become 4 output characters, giving roughly 33% overhead. The benefit is universality: a Base64 string survives any 7-bit ASCII pipeline intact, with no escaping and no corruption.
Standard alphabet vs base64url
The standard alphabet defined by RFC 4648 uses A-Z, a-z, 0-9, + and /, for 64 characters in total. The = sign is reserved for padding at the end: one or two characters when the input length is not a multiple of 3, none otherwise.
The base64url variant (also RFC 4648, section 5) swaps the two unsafe characters: + becomes - and / becomes _, making the string safe to drop into URLs, filenames and HTTP headers without percent-encoding. Padding is also optional in base64url, which is why JWTs never carry trailing =.
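The character swap described above can be sketched in a few lines of JavaScript. The function names toBase64Url and fromBase64Url are hypothetical helpers chosen for illustration, not part of any library.

```javascript
// Convert standard Base64 to unpadded base64url.
function toBase64Url(standard) {
  return standard
    .replace(/\+/g, "-")
    .replace(/\//g, "_")
    .replace(/=+$/, "");
}

// Convert base64url (padded or not) back to padded standard Base64.
function fromBase64Url(urlSafe) {
  const swapped = urlSafe
    .replace(/-/g, "+")
    .replace(/_/g, "/")
    .replace(/=+$/, "");
  // A remainder of 1 can never come out of a Base64 encoder.
  if (swapped.length % 4 === 1) {
    throw new RangeError("invalid base64url length");
  }
  // Restore padding so the length is a multiple of 4 again.
  const padLength = (4 - (swapped.length % 4)) % 4;
  return swapped + "=".repeat(padLength);
}
```

For example, the two bytes 0xFB 0xFF encode to +/8= in the standard alphabet and to -_8 in unpadded base64url.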
Validation regex and the modulo-4 rule
A well-formed standard Base64 string matches ^[A-Za-z0-9+/]*={0,2}$ and its length must be a multiple of 4. For base64url the regex is ^[A-Za-z0-9_-]*={0,2}$ and the length constraint relaxes when padding is omitted: length mod 4 may then be 0, 2 or 3, but never 1.
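A minimal sketch of these syntax checks, assuming the two regexes quoted above. The names isStandardBase64 and isBase64Url are hypothetical, and the checks cover syntax only, not whether the decoded bytes are meaningful.

```javascript
const STANDARD_RE = /^[A-Za-z0-9+\/]*={0,2}$/;
const URL_SAFE_RE = /^[A-Za-z0-9_-]*={0,2}$/;

// Strict standard Base64: alphabet check plus length multiple of 4.
function isStandardBase64(s) {
  return STANDARD_RE.test(s) && s.length % 4 === 0;
}

// base64url: padding is optional, but a remainder of 1 is never valid.
function isBase64Url(s) {
  if (!URL_SAFE_RE.test(s)) return false;
  if (s.endsWith("=")) return s.length % 4 === 0;
  return s.length % 4 !== 1;
}
```

A caller that does not know which alphabet to expect can try isStandardBase64 first and fall back to isBase64Url before reporting the input as invalid.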
Encoded output size from n input bytes is ceil(n / 3) * 4 characters. Decoding reverses this: a 100-character standard string carries 75 bytes with no padding, 74 with one = and 73 with two.
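The size arithmetic can be checked with two small helpers. This is only a sketch: encodedLength and decodedLength are hypothetical names, and decodedLength assumes its input has already passed a syntax check.

```javascript
// Number of Base64 characters (including padding) for n input bytes.
function encodedLength(byteCount) {
  return Math.ceil(byteCount / 3) * 4;
}

// Number of bytes a syntactically valid Base64 string decodes to.
function decodedLength(b64) {
  const padding = b64.endsWith("==") ? 2 : b64.endsWith("=") ? 1 : 0;
  const dataChars = b64.length - padding;
  // Each character carries 6 bits; leftover bits are discarded.
  return Math.floor((dataChars * 6) / 8);
}
```

With these definitions, a 100-character string with no padding yields 75 bytes, one ending in = yields 74, and one ending in == yields 73.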
Common pitfalls when validating Base64
- Mixed alphabets: a JWT segment containing - or _ will fail the standard regex. Try both alphabets before declaring it invalid.
- Missing padding: legal in base64url, illegal in strict standard Base64. Some parsers tolerate it, others don't.
- MIME line wrapping: RFC 2045 inserts a CRLF every 76 characters. Strip whitespace before validating.
- Unicode source: btoa() in browsers only handles Latin-1. For UTF-8 text you must first encode with TextEncoder, then base64 the bytes.
- Truncated payload: a valid syntax doesn't guarantee a valid decode. Attempt the decode to confirm semantic integrity.
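Two of the pitfalls above, MIME line wrapping and non-Latin-1 text, can be handled as in the following sketch. The helper names stripWhitespace and utf8ToBase64 are hypothetical; btoa and TextEncoder are the standard globals available in browsers and recent Node.js versions.

```javascript
// Remove CRLF line breaks and any other whitespace before validating.
function stripWhitespace(s) {
  return s.replace(/\s+/g, "");
}

// Encode a JavaScript string as Base64 of its UTF-8 bytes.
function utf8ToBase64(text) {
  const bytes = new TextEncoder().encode(text);
  // btoa() expects one character per byte (code points 0-255).
  let binary = "";
  for (const byte of bytes) {
    binary += String.fromCharCode(byte);
  }
  return btoa(binary);
}
```

Calling btoa("€") directly throws because the code point is above U+00FF, while utf8ToBase64("€") encodes the three UTF-8 bytes of the character.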
Real-world use cases
- Data URIs: data:image/png;base64,iVBORw0KGgo... embeds images directly in HTML/CSS.
- JWT (JSON Web Tokens): header, payload and signature are base64url segments separated by dots.
- HTTP Basic Auth: Authorization: Basic dXNlcjpwYXNz is base64(user:pass).
- Email attachments (MIME): base64 wraps PDFs, images and binaries inside text-only SMTP.
- Embedded SVG, fonts and Web Workers: any binary asset inlined into a textual file.
- Cookies and localStorage: small binary blobs (sometimes gzip-then-base64) stored client-side.
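Two of the use cases above, HTTP Basic Auth and JWT segments, can be illustrated with Node.js Buffer. This is a sketch under assumptions: basicAuthHeader and decodeJwtPayload are hypothetical helper names, the "base64url" Buffer encoding requires a reasonably recent Node.js release, and decodeJwtPayload does not verify the signature.

```javascript
// Build the value of an HTTP Basic Authorization header.
function basicAuthHeader(user, password) {
  const token = Buffer.from(`${user}:${password}`, "utf8").toString("base64");
  return `Basic ${token}`;
}

// Decode the payload (second segment) of a JWT WITHOUT verifying it.
function decodeJwtPayload(jwt) {
  const segments = jwt.split(".");
  if (segments.length !== 3) {
    throw new Error("expected three dot-separated segments");
  }
  const json = Buffer.from(segments[1], "base64url").toString("utf8");
  return JSON.parse(json);
}
```

Here basicAuthHeader("user", "pass") produces the header value shown in the list above. Decoding a JWT payload this way is useful for inspection only; trusting its contents requires checking the signature with a proper JWT library.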
Performance and libraries
Browsers ship native btoa() and atob(), but they only operate on Latin-1 strings (one character per byte); other Unicode text requires TextEncoder/TextDecoder. Node.js uses Buffer.from(str, 'base64') and buf.toString('base64'), which handle bytes directly and are highly optimized. Popular libraries include base64-js and js-base64 for cross-environment Unicode-aware encoding.
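One caveat when validating with Node.js: Buffer.from(str, 'base64') is lenient and returns a Buffer rather than throwing on malformed input. A common workaround, sketched below with the hypothetical name isCanonicalBase64, is to decode, re-encode and compare with the original string.

```javascript
// True only if s is exactly what Node would emit for the decoded bytes:
// standard alphabet, correct padding, no whitespace, no stray bits.
function isCanonicalBase64(s) {
  return Buffer.from(s, "base64").toString("base64") === s;
}
```

This check is strict by design: unpadded or base64url input is reported as non-canonical even though Buffer can decode it, so normalize the input first if those forms should be accepted.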
FAQ
Is validating Base64 easy?
Syntax validation is straightforward: alphabet plus length-mod-4 plus padding check. Semantic validation (does it decode to something meaningful?) requires actually attempting the decode and inspecting the bytes.
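As a rough illustration of the semantic step, the sketch below decodes the input and reports the decoded size and whether the bytes form valid UTF-8. The name classifyDecoded is hypothetical, the input is assumed to have passed a syntax check already, and the UTF-8 test is a heuristic: it relies on TextDecoder with the fatal option, which assumes a runtime built with full encoding support.

```javascript
// Decode Base64 and classify the result as UTF-8 text or binary.
function classifyDecoded(b64) {
  const bytes = Buffer.from(b64, "base64");
  try {
    // With fatal: true, decode() throws on any invalid UTF-8 sequence.
    const text = new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return { size: bytes.length, kind: "utf-8", text };
  } catch {
    return { size: bytes.length, kind: "binary" };
  }
}
```

For instance, dXNlcjpwYXNz is classified as 9 bytes of UTF-8 text, while /9j/4A== (the first four bytes of a typical JPEG file) is classified as 4 bytes of binary.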
What is the URL-safe variant?
It is base64url, defined in RFC 4648 section 5. It swaps + for - and / for _, so the string can be used in URLs, filenames and HTTP headers without percent-encoding.
Can a Base64 string be valid without padding?
In strict RFC 4648 Base64, no: padding is mandatory. In base64url (and many practical implementations), yes: padding is optional. JWTs are the canonical example of unpadded base64url.
Why is the overhead 33%?
Because each base64 character carries 6 bits of information (2^6 = 64), while each byte carries 8 bits. To encode 3 bytes (24 bits) you need 4 characters (24 bits), so 4/3 = 1.33x the original size.
Should I gzip before Base64?
If the payload is large and compressible (JSON, XML, text), compressing with gzip or DEFLATE and then Base64-encoding can dramatically shrink the transported string, a pattern common in cookies and HTTP caching. For already-compressed data (PNG, JPEG, ZIP) it just adds CPU cost.
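A minimal sketch of the compress-then-encode pattern using the zlib module bundled with Node.js. The names packPayload and unpackPayload are hypothetical, and the synchronous zlib calls are used only to keep the example short.

```javascript
const zlib = require("zlib");

// gzip the UTF-8 bytes of the text, then Base64-encode the result.
function packPayload(text) {
  return zlib.gzipSync(Buffer.from(text, "utf8")).toString("base64");
}

// Reverse the steps: Base64-decode, gunzip, then read as UTF-8.
function unpackPayload(b64) {
  return zlib.gunzipSync(Buffer.from(b64, "base64")).toString("utf8");
}
```

The gain depends entirely on how repetitive the input is. For short strings the gzip header and trailer can make the packed form longer than plain Base64, so it is worth measuring on representative payloads.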