Subtitle Converter
Converts subtitle files between SRT and VTT formats. Useful for reusing subtitles across platforms.
SRT vs VTT differences
SRT (SubRip) is the simplest format: numbered cues with a comma before the milliseconds in each timestamp (00:00:01,500).
VTT (WebVTT) is the web standard. It opens with a WEBVTT header, puts a dot in the timestamps (00:00:01.500) and supports CSS styling.
During conversion the comma becomes a dot and the header is added or stripped as needed.
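The conversion described above can be sketched in a few lines of Python. This is a minimal illustration, not the tool's actual implementation; the function names srt_to_vtt and vtt_to_srt are assumptions chosen for this example. The sketch keeps cue text untouched, so VTT-only markup such as voice tags is passed through as-is.

```python
import re

# SRT timestamps always carry hours; WebVTT may omit them (MM:SS.mmm).
_SRT_TIME = re.compile(r"(\d{2}:\d{2}:\d{2}),(\d{3})")
_VTT_TIME = re.compile(r"(?:(\d{2,}):)?(\d{2}):(\d{2})\.(\d{3})")


def srt_to_vtt(srt_text: str) -> str:
    """Add the WEBVTT header and turn the comma into a dot on timing lines."""
    out = []
    for line in srt_text.replace("\r\n", "\n").strip().split("\n"):
        if "-->" in line:
            line = _SRT_TIME.sub(r"\1.\2", line)
        out.append(line)  # SRT index lines remain as optional VTT cue identifiers
    return "WEBVTT\n\n" + "\n".join(out) + "\n"


def _vtt_time_to_srt(match: re.Match) -> str:
    hours = match.group(1) or "00"  # SRT requires the hours field
    return f"{hours}:{match.group(2)}:{match.group(3)},{match.group(4)}"


def vtt_to_srt(vtt_text: str) -> str:
    """Drop the header, NOTE/STYLE blocks and cue settings; renumber the cues."""
    blocks = re.split(r"\n{2,}", vtt_text.replace("\r\n", "\n").strip())
    cues = []
    for block in blocks:
        lines = block.split("\n")
        timing_idx = next((i for i, l in enumerate(lines) if "-->" in l), None)
        if timing_idx is None:
            continue  # header, NOTE, STYLE or REGION block
        times = [_vtt_time_to_srt(m) for m in _VTT_TIME.finditer(lines[timing_idx])]
        if len(times) < 2:
            continue  # malformed timing line
        text = "\n".join(lines[timing_idx + 1:])
        cues.append(f"{len(cues) + 1}\n{times[0]} --> {times[1]}\n{text}")
    return "\n\n".join(cues) + "\n"
```

Cue settings such as line:90% are discarded on the way to SRT because SRT has no place to store them.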
Subtitle formats: SRT, VTT, ASS/SSA, SUB
Subtitle files are plain-text containers that pair a piece of text with a time range during which it should be displayed over video. Despite serving the same conceptual purpose, the formats in circulation differ enormously in expressive power. The four families you will encounter most often are SRT (SubRip), VTT (WebVTT), ASS/SSA (Advanced SubStation Alpha), and SUB (MicroDVD or SubViewer, depending on the dialect). Choosing well or converting cleanly between them is a recurring task for editors, translators, anime fansubbers, language teachers, accessibility specialists, and anyone delivering localized video to multiple platforms.
SRT and VTT are nearly identical in shape and account for the vast majority of subtitle files in the wild. ASS/SSA is the powerhouse format used when typography, positioning, karaoke effects, and on-screen signage matter as much as the dialogue itself. SUB is older and less expressive, but still appears alongside legacy DivX/XviD releases and on a handful of standalone players. The sections below walk through each format, then cover the timing and accessibility concerns that determine whether a conversion actually plays correctly on the target device.
SRT: the simplest, most universal format
SubRip Text was popularized by the SubRip extraction tool in the late 1990s and quickly became the de facto interchange format for fan-made subtitles. Its grammar is trivially simple: a sequence of cues separated by blank lines, where each cue has three parts. First a sequential index, then a timestamp range using the syntax HH:MM:SS,mmm --> HH:MM:SS,mmm (note the comma as the decimal separator for milliseconds), then one or more lines of text. The text payload may contain very limited inline HTML-style tags (typically <i>, <b>, <u>, and occasionally <font color="...">), but support varies player by player and nothing about styling is standardized.
1
00:00:01,000 --> 00:00:04,500
Welcome to the show.

2
00:00:05,200 --> 00:00:08,000
<i>Today's topic is subtitle formats.</i>
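To make the three-part cue grammar concrete, here is a small Python sketch of an SRT parser. The Cue dataclass and the parse_srt function are hypothetical names introduced for this example, and the parser assumes well-formed input (index line, timing line, then text), silently skipping any block that does not fit.

```python
import re
from dataclasses import dataclass


@dataclass
class Cue:
    index: int
    start_ms: int
    end_ms: int
    text: str


# Accept a comma or a dot before the milliseconds, since some tools emit either.
_TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def _to_ms(h: str, m: str, s: str, ms: str) -> int:
    return ((int(h) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def parse_srt(srt_text: str) -> list[Cue]:
    """Split on blank lines, then read index, timing line and text from each block."""
    cues = []
    blocks = re.split(r"\n\s*\n", srt_text.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        if len(lines) < 2:
            continue
        match = _TIMING.search(lines[1])
        if not lines[0].strip().isdigit() or match is None:
            continue  # skip blocks that do not follow the grammar
        g = match.groups()
        cues.append(
            Cue(int(lines[0]), _to_ms(*g[:4]), _to_ms(*g[4:]), "\n".join(lines[2:]))
        )
    return cues
```

Storing times as integer milliseconds makes later shifting and rescaling simple arithmetic.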
Because the format is so minimal, virtually every video player, NLE, transcription service, and streaming uploader accepts SRT. That universality is the single biggest reason translators still hand it back as the final deliverable even when richer formats were available upstream. The trade-off is that SRT cannot express positioning, font choice, multiple speakers as distinct visual entities, fade animations, or anything beyond rudimentary italics. If you need any of that, you must move to VTT or ASS.
VTT (WebVTT): the W3C standard for HTML5
WebVTT was created by the W3C specifically to back the HTML5 <track> element on <video> and <audio>. Files use the .vtt extension, must be encoded as UTF-8, and are served with MIME type text/vtt. Every file begins with the magic string WEBVTT followed by two line terminators. The timestamp syntax replaces SRT's comma with a period (HH:MM:SS.mmm), the cue identifier becomes optional, and a rich grammar of cue settings, region definitions, comments, and embedded CSS is layered on top.
WEBVTT

STYLE
::cue(v[voice="Narrator"]) { color: lime; }

NOTE Intro sequence

intro
00:00:01.000 --> 00:00:04.500 line:90% align:center
<v Narrator>Welcome to the show.</v>
Cue settings such as line, position, size, align, and vertical let authors place captions anywhere on screen, including for vertical Japanese text. <v Speaker> voice tags identify who is talking; <c.class> spans hook into CSS selectors via the ::cue pseudo-element; karaoke-style mid-cue timestamps allow word-by-word highlight effects. All major browsers implement at least the core feature set, which is why VTT is the right answer for any captions that ship as part of a website.
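Cue settings live on the timing line, after the end timestamp, as space-separated name:value pairs. The Python sketch below separates them from the timestamps; parse_cue_timing is a hypothetical helper name, and the code does not validate setting names or values against the WebVTT grammar.

```python
def parse_cue_timing(timing_line: str) -> tuple[str, str, dict[str, str]]:
    """Split 'start --> end name:value ...' into (start, end, settings)."""
    start, arrow, rest = timing_line.partition("-->")
    parts = rest.split()
    if not arrow or not parts:
        raise ValueError(f"not a cue timing line: {timing_line!r}")
    end, raw_settings = parts[0], parts[1:]
    settings = {}
    for item in raw_settings:
        name, sep, value = item.partition(":")
        if sep and name and value:
            settings[name] = value  # e.g. {"line": "90%", "align": "center"}
    return start.strip(), end, settings
```

A converter targeting SRT would keep only the two timestamps and discard the settings dictionary.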
ASS/SSA: advanced styling and anime fansubs
SubStation Alpha (SSA) and its successor Advanced SubStation Alpha (ASS) were developed for the SubStation Alpha editor and later embraced by the anime fansubbing community, where they remain dominant. The format is INI-shaped: a [Script Info] header, a [V4+ Styles] table that defines named styles with font, color, outline, shadow, and alignment, and an [Events] table with one Dialogue: line per cue.
[Script Info]
Title: Demo
ScriptType: v4.00+
[V4+ Styles]
Format: Name, Fontname, Fontsize, PrimaryColour, Outline, Alignment
Style: Default,Arial,24,&H00FFFFFF,2,2
[Events]
Format: Layer, Start, End, Style, Name, Text
Dialogue: 0,0:00:01.00,0:00:04.50,Default,,{\an8\fad(200,200)}Welcome to the show.
Inline override tags inside curly braces ({\an8}, {\fad(200,200)}, {\move(...)}, {\t(...)}) give you alignment, fades, motion, color transitions, blur, rotation, and clipping. Editors like Aegisub render these in real time and are the standard authoring tool for sign typesetting in anime releases. The downside is that ASS playback requires libass or a compatible renderer; web browsers do not natively understand it, and conversions to SRT or VTT necessarily drop every effect except the dialogue itself.
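As an illustration of what an ASS-to-SRT conversion keeps and drops, the Python sketch below reads the [Events] section, uses its Format: line to locate the Start, End, and Text fields, strips every {...} override block, and emits SRT cues. The function names are assumptions for this example; the sketch ignores styles, layers, and drawing commands entirely.

```python
import re

_OVERRIDE = re.compile(r"\{[^}]*\}")  # any {...} override block


def ass_time_to_srt(ts: str) -> str:
    """ASS uses H:MM:SS.cc (centiseconds); SRT wants HH:MM:SS,mmm."""
    hms, _, cs = ts.strip().partition(".")
    h, m, s = hms.split(":")
    ms = int(cs.ljust(2, "0")[:2]) * 10
    return f"{int(h):02d}:{int(m):02d}:{int(s):02d},{ms:03d}"


def ass_to_srt(ass_text: str) -> str:
    fields = None
    cues = []
    in_events = False
    for raw in ass_text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            in_events = line.lower() == "[events]"
        elif in_events and line.startswith("Format:"):
            fields = [f.strip() for f in line[len("Format:"):].split(",")]
        elif in_events and fields and line.startswith("Dialogue:"):
            # Text is the last field and may itself contain commas.
            values = line[len("Dialogue:"):].strip().split(",", len(fields) - 1)
            row = dict(zip(fields, values))
            if len(values) != len(fields) or not {"Start", "End", "Text"} <= row.keys():
                continue
            text = _OVERRIDE.sub("", row["Text"])
            # \N and \n are ASS line breaks, \h is a hard space.
            text = text.replace("\\N", "\n").replace("\\n", "\n").replace("\\h", " ")
            start, end = ass_time_to_srt(row["Start"]), ass_time_to_srt(row["End"])
            cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}")
    return "\n\n".join(cues) + ("\n" if cues else "")
```

Everything inside the braces (alignment, fades, motion) is lost, which matches the limitation described above.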
SUB: legacy MicroDVD and SubViewer
The .sub extension covers two different formats. MicroDVD uses curly-brace frame numbers, as in {1}{50}First subtitle, which ties every cue to the framerate of the source video. SubViewer uses bracketed timestamps and supports a simple INI-shaped header. Both are legacy formats, and you will mostly encounter them shipped next to older AVI files. Converting MicroDVD to anything timestamp-based requires you to know the framerate that was assumed when the file was authored, or every cue will drift.
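The frame-to-timestamp step can be sketched as follows in Python. The framerate is passed in explicitly because the file does not state it reliably; frame_to_timestamp and microdvd_to_srt are hypothetical names, and the sketch handles only plain {start}{end}text lines with | as the line-break character.

```python
import re

_MICRODVD = re.compile(r"^\{(\d+)\}\{(\d+)\}(.*)$")


def frame_to_timestamp(frame: int, fps: float) -> str:
    """Convert a frame number to an SRT timestamp at the assumed framerate."""
    total_ms = round(frame * 1000 / fps)
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def microdvd_to_srt(sub_text: str, fps: float) -> str:
    cues = []
    for line in sub_text.splitlines():
        match = _MICRODVD.match(line.strip())
        if match is None:
            continue  # not a {start}{end}text line
        start = frame_to_timestamp(int(match[1]), fps)
        end = frame_to_timestamp(int(match[2]), fps)
        text = match[3].replace("|", "\n")  # MicroDVD separates lines with |
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}")
    return "\n\n".join(cues) + ("\n" if cues else "")
```

Running the same file through this function with 23.976 and with 25 produces different timestamps for every cue, which is why a wrong framerate guess shows up as drift.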
Timing, framerate, and synchronization drift
The single most common subtitle problem is drift: captions appear in sync at the start of a video and gradually fall behind (or run ahead) as it plays. The cause is almost always a framerate mismatch. When subtitles authored against 23.976 fps (NTSC film) are played over a 25 fps (PAL) re-encode of the same content, the video runs about 4.27% faster, so the cues are roughly 5 minutes late by the end of a 2-hour movie. Common framerates you will see in the wild include 23.976, 24.000, 25.000, 29.970, 30.000, 50.000, and 59.940.
- Constant offset: the whole file is a fixed number of milliseconds early or late. Fix by shifting all timestamps by the same delta.
- Linear drift: the offset grows over time. Fix by multiplying every timestamp by the authored framerate divided by the actual framerate.
- Non-linear drift: caused by removed commercials, a skipped intro, or a re-cut. Requires anchor-point re-synchronization with at least two known good cues.
- Reading speed: even perfectly synced cues fail if shown for too short a time. Industry guidelines recommend 15-20 characters per second and minimum cue durations of around one second.
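The fixes in the list above reduce to simple arithmetic on millisecond timestamps. The Python sketch below shows one function per item; all names are illustrative assumptions. The two-anchor function maps a subtitle time to a video time using two cues whose correct positions are known, and for a re-cut it would be applied separately to each segment between anchors.

```python
def shift_ms(ms: int, delta_ms: int) -> int:
    """Constant offset: move a timestamp by a fixed delta, clamped at zero."""
    return max(0, ms + delta_ms)


def rescale_ms(ms: int, authored_fps: float, actual_fps: float) -> int:
    """Linear drift: scale by authored framerate divided by actual framerate."""
    return round(ms * authored_fps / actual_fps)


def resync_ms(ms: int, anchor_a: tuple[int, int], anchor_b: tuple[int, int]) -> int:
    """Two-anchor resync; each anchor is (time in subtitle file, correct time in video)."""
    (s1, v1), (s2, v2) = anchor_a, anchor_b
    if s1 == s2:
        raise ValueError("anchors must have different subtitle times")
    scale = (v2 - v1) / (s2 - s1)
    return round(v1 + (ms - s1) * scale)


def chars_per_second(text: str, start_ms: int, end_ms: int) -> float:
    """Reading speed of a cue, ignoring line breaks."""
    duration_s = (end_ms - start_ms) / 1000
    if duration_s <= 0:
        raise ValueError("cue must end after it starts")
    return len(text.replace("\n", "")) / duration_s
```

A cue whose chars_per_second value is well above 20 is a candidate for a longer display time or a shorter translation.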
Accessibility: captions, SDH, and WCAG
Captions and subtitles are distinct under accessibility law even though the file formats are the same. Subtitles assume the viewer can hear and only needs translation. Captions (or SDH, subtitles for the deaf and hard of hearing) include non-dialogue information: speaker labels, sound effects in brackets such as [door slams], and music descriptions. WCAG 2.1 requires captions for all prerecorded audio content in synchronized media (Success Criterion 1.2.2, Level A) and captions for live audio content in synchronized media (1.2.4, Level AA). Audio descriptions, delivered as a separate <track kind="descriptions">, narrate visual information for blind viewers.
Platform support
- YouTube: accepts SRT, VTT, SBV, SCC, and TTML uploads; converts internally and serves VTT-ish JSON to the player.
- Netflix: production deliverables use TTML/IMSC1; upstream translators usually work in SRT.
- Vimeo: accepts SRT, VTT, SCC, and DFXP.
- HTML5 <track>: WebVTT only; convert SRT files before serving them.
- VLC / mpv / MPC-HC: accept everything, including ASS with libass for full styling.
- Apple TV / iOS: prefers WebVTT or CEA-608/708 sidecar; SRT works via the Files app.
FAQ
Can I convert ASS to SRT without losing anything? No. SRT cannot express positioning, fonts, colors, or motion. You lose every override tag and keep only timing plus stripped text.
Why does VTT use a period and SRT a comma? SRT was authored in Europe, where the comma is the standard decimal separator. WebVTT uses the period to match the decimal notation of CSS and other web formats.
How do I detect the original framerate of a subtitle file? If the file uses MicroDVD frame numbers, you must guess (typical guesses: 23.976, 25, 29.97). For timestamp-based formats, compare the last cue's start time to the actual video length; large discrepancies suggest a framerate mismatch.
Is encoding important? Yes. UTF-8 is mandatory for WebVTT and strongly recommended for SRT. Files saved as Windows-1252 or ISO-8859-1 will render accented characters incorrectly on most modern players.
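One common way to handle legacy encodings before conversion is to try UTF-8 first and fall back to Windows-1252. The Python sketch below assumes that fallback is appropriate for the file at hand (it is a guess suited to Western European text, not a general detector); decode_subtitle_bytes is a hypothetical name.

```python
def decode_subtitle_bytes(data: bytes) -> str:
    """Decode as UTF-8 (dropping a BOM if present), else fall back to Windows-1252."""
    try:
        return data.decode("utf-8-sig")  # utf-8-sig also strips a leading BOM
    except UnicodeDecodeError:
        # Assumption: non-UTF-8 input is Windows-1252; undefined bytes become U+FFFD.
        return data.decode("cp1252", errors="replace")
```

The decoded string can then be written back out as UTF-8, which is what WebVTT requires.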
Will the converter preserve italics and line breaks? Yes. Inline <i> and <b> tags survive a round-trip between SRT and VTT, as do hard line breaks within a cue.
Convert subtitles between SRT and VTT
SRT and VTT are the two most common subtitle formats, but not every player accepts both. Converting by hand, editing each timecode, is tedious. This tool converts subtitle files between SRT and VTT while keeping the timings and the text.
It comes in handy for reusing a subtitle on a platform that only takes one of the formats, prepping subtitles for a web video (which usually wants VTT) or converting a downloaded file to the format your player understands. Rather than editing manually, you paste or load the subtitle and get the finished version back.
The whole process runs in the browser, and no file is sent to any server. That keeps it quick for anyone working with video and accessibility.