Closed Captioning (CC), Captions, Subtitles

Introduction – Closed Captions, Open, Subtitles, …


“Closed captioning is the American term for closed subtitles specifically intended for people who are deaf or hard-of-hearing. These are a transcription rather than a translation, and usually contain descriptions of important non-dialogue audio as well, such as “(SIGHS)”, “(WIND HOWLING)”, “(“SONG TITLE” PLAYING)”, “(KISSES)”, “(THUNDER RUMBLING)” or “(DOOR CREAKING)” and lyrics. From the expression “closed captions”, the word “caption” has in recent years come to mean a subtitle intended for the deaf or hard-of-hearing, be it “open” or “closed”. In British English, “subtitles” usually refers to subtitles for the deaf or hard-of-hearing (SDH); however, the term “SDH” is sometimes used when there is a need to make a distinction between the two.”


General Resources


Different Kinds

  1. Intent with texts
    1. Hearing-impaired support: transcription of dialogue plus descriptions of sounds, relevant musical cues, etc.
      • This is what is commonly called CC (Closed Captioning) in the US
    2. Translation of foreign language
    3. More … (e.g. using text to provide comments to media, from very technical like frames info, to higher level commentary)
  2. Format, how saved (‘INPUTS’ to player / converter app)
    1. Bitmap-based (image-based)
    2. Text-based, with or without more complex controls including styling capabilities
  3. Open or Closed, including How presented (‘OUTPUTS’ from player / converter app)
    1. Open: Always present, cannot be turned on or off
      • A.k.a. Hard, from HandBrake: “Hard Burn: This means the subtitles are written on top of the image permanently. They cannot be turned on or off like on the DVD.”
    2. Closed: *Can* turn on / off, often select between different ‘tracks’ (English, English for Hearing Impaired, Spanish, …)
      • A.k.a. Soft, from HandBrake: “Soft Subtitles: This means the subtitles will appear as separate selectable tracks in your output file. With the correct playback software, you’ll be able to enable / disable these subtitles as required.”


Notes of Process Phases

  1. Creating subtitles
    • more research to do, on tools, best practices, …
  2. Encoding into video container (mkv, mp4, mov…)
    • more research to do, ffmpeg?
  3. Play, render
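
As a starting point for the "ffmpeg?" question in phase 2, here is a minimal sketch of how the two presentation modes described earlier (soft / closed versus hard / open) could map onto ffmpeg command lines. It only builds the argument lists and does not run anything. The file names are hypothetical, the helper names are made up for this note, and the hard-burn variant assumes an ffmpeg build that includes the `subtitles` filter (it depends on libass).

```python
import shlex


def soft_subtitle_cmd(video, subs, output):
    """Build an ffmpeg command that adds `subs` as a selectable (closed) track."""
    # MP4/MOV containers expect the mov_text subtitle codec;
    # Matroska (.mkv) can carry SubRip text as-is.
    is_mp4_like = output.lower().endswith((".mp4", ".m4v", ".mov"))
    sub_codec = "mov_text" if is_mp4_like else "srt"
    return ["ffmpeg", "-i", video, "-i", subs,
            "-c:v", "copy", "-c:a", "copy", "-c:s", sub_codec, output]


def hard_subtitle_cmd(video, subs, output):
    """Build an ffmpeg command that burns `subs` into the picture (open)."""
    # The video must be re-encoded because the text becomes part of the image.
    # Assumption: `subs` is a simple file name that needs no filter escaping.
    return ["ffmpeg", "-i", video, "-vf", f"subtitles={subs}",
            "-c:a", "copy", output]


if __name__ == "__main__":
    print(shlex.join(soft_subtitle_cmd("movie.mp4", "movie.srt", "movie_cc.mkv")))
    print(shlex.join(hard_subtitle_cmd("movie.mp4", "movie.srt", "movie_open.mp4")))
```

The soft variant copies the audio and video streams untouched, so it is fast. The hard variant re-encodes the video, so it is slower and the subtitles can no longer be switched off.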


Overview of Formats – Technologies

Overview, List

Safest bets:

  1. SubRip Subtitle (SRT) – supported by most of the players listed in Wikipedia's "Comparison of video player software" (section "Subtitle ability")
  2. SubStation Alpha (SSA), Advanced SubStation Alpha (ASS) – also supported by a larger number of players
  3. For web – Web Video Text Tracks (WebVTT)

General searches:

NOTE: the following rather long list does NOT include, for example, SubStation Alpha (SSA) or Advanced SubStation Alpha (ASS):

  1. SubRip Subtitle (SRT)
  2. Web Video Text Tracks (WebVTT)
  3. Scenarist Closed Captions (SCC)
  4. Spruce Subtitle File (STL)
  5. Distribution Format Exchange Profile (DFXP)
  6. Timed Text Markup Language (TTML)
  7. Society of Motion Picture and Television Engineers – Timed Text (SMPTE-TT)
  8. CAP
  9. Captionate XML (CPT.XML)
  10. Powerpoint XML (PPT.XML)
  11. European Broadcasting Union subtitles (EBU.STL)
  12. RealText (RT)
  13. Synchronized Accessible Media Interchange (SAMI or SMI)
  14. SubViewer (SBV or SUB)
  15. Adobe (ADBE)
  16. Apple XML Interchange Format (Apple XML)
  17. Avid (AFF or Avid DS)
  18. MacCaption (CCA, MCC, or MCC V2)
  19. ONL – CPC 715
  20. Crackle Timed Text (Crackle TT -variant of SMPTE-TT)
  21. DECE CFF (Variant of SMPTE-TT with auxiliary PNG files)
  22. Evertz ProCAP
  23. iTunes Timed Text (ITT)
  24. Matrox for MX02 (Matrox4VANC)
  25. Multiple CC (Multiplexed SCC)
  26. XML file (Rhozet)
  27. Sony Pictures Timed Text XML (SonyPictures TT)
  28. Texas Instruments DLP Cinema XML (TIDLP Cinema)
  29. Windows Media timed text file (WMP.TXT)
  30. LRC (.lrc) – No styling, but enhanced format supported.
  31. Videotron Lambda (.cap) – Primarily used for Japanese subtitles.


Authoring, Editing


Players Support

General on Support for Subtitles Technologies by Players


Elmedia Player

From….. (emphasis in bold here):

Embedded subtitles support
MKV files frequently have embedded multiple subtitle files in different languages. If your local video has included subtitles then Elmedia Player will have no problem playing them whatsoever.

This is an automatic feature that Elmedia Player has. It can identify the encoding of subtitles without effort. In the case though that it has difficulties reading the encoding then a message will appear on your screen requesting to choose the encoding manually. The big advantage of Elmedia Player is that it supports multiple types of formats like .srt, .smil, .ass and more. So you don’t have to worry at all. You won’t encounter any issues playing such subtitle formats.

More Players


Time Code, Time Specifications

| Standard | Smallest unit | Formatting, notes |
|---|---|---|
| SRT | millisecond | [hours]:[minutes]:[seconds],[milliseconds]. NOTE: a COMMA, not a dot, separates the milliseconds (French origin). |
| SSA, ASS | hundredth of a second | Hrs:Mins:Secs:hundredths. 1) Hours in ONE digit. 2) Colon separator before the hundredths. 3) This standard uses hundredths, not milliseconds as SRT does. |
| (none found) | microsecond | …. |
| (a few) | As frames | |

The table below includes a ‘Timing precision’ column: no standard is listed with ‘microseconds’, but a few are listed with ‘As frames’.

Table extracted from 2021-01-09:

| Name | Extension | Type | Text styling | Metadata | Timings | Timing precision |
|---|---|---|---|---|---|---|
| EBU-TT-D[22] | N/A | XML | Yes | Yes | Elapsed time | Unlimited |
| Spruce subtitle format[25] | .stl | Text | Yes | Yes | Sequential time+frames | Sequential time+frames |
| Ogg Writ | N/A (embedded in Ogg container) | Text | Yes | Yes | Sequential granules | Dependent on bitstream |
| AQTitle | .aqt | Text | Yes | Yes | Framings | As frames |
| JACOSub[23] | .jss | Text with markup | Yes | No | Elapsed time | As frames |
| MicroDVD | .sub | Text | No | No | Framings | As frames |
| Phoenix Subtitle | .pjs | Text | No | No | Framings | As frames |
| SAMI | .smi | HTML | Yes | Yes | Framings | As frames |
| Gloss Subtitle | .gsub | HTML/XML | Yes | Yes | Elapsed time | 10 milliseconds |
| MPSub | .sub | Text | No | Yes | Sequential time | 10 milliseconds |
| RealText[24] | .rt | HTML | Yes | No | Elapsed time | 10 milliseconds |
| (Advanced) SubStation Alpha | .ssa or .ass (advanced) | Text | Yes | Yes | Elapsed time | 10 milliseconds |
| SubViewer | .sub | Text | No | Yes | Elapsed time | 10 milliseconds |
| PowerDivX | .psb | Text | No | No | Elapsed time | 1 second |
| MPEG-4 Timed Text | .ttxt (or mixed with A/V stream) | XML | Yes | No | Elapsed time | 1 millisecond |
| Structured Subtitle Format | .ssf | XML | Yes | Yes | Elapsed time | 1 millisecond |
| SubRip | .srt | Text | Yes | No | Elapsed time | 1 millisecond |
| Universal Subtitle Format | .usf | XML | Yes | Yes | Elapsed time | 1 millisecond |
| VobSub | .sub + .idx | Image | N/A | N/A | Elapsed time | 1 millisecond |
| WebVTT | .vtt | HTML | Yes | Yes | Elapsed time | 1 millisecond |
| XSUB | N/A (embedded in .divx container) | Image | N/A | N/A | Elapsed time | 1 millisecond |

There are still many more uncommon formats. Most of them are text-based and have the extension .txt.


As frames – Timing Precision

Spruce .stl
AQTitle .aqt
JACOSub .jss
MicroDVD .sub
Phoenix Subtitle .pjs
SAMI .smi
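
Formats with ‘As frames’ precision store positions as frame counts, so turning them into clock time needs a frame rate that the subtitle file itself may not carry. The sketch below shows that arithmetic, using a MicroDVD-style line of the form `{start}{end}text` as the input. The function names and the default of passing the frame rate in explicitly are assumptions for illustration, not part of any of the formats.

```python
import re

# MicroDVD-style cue: {start_frame}{end_frame}text
MICRODVD_LINE = re.compile(r"\{(\d+)\}\{(\d+)\}(.*)")


def frames_to_srt_time(frame, fps):
    """Convert a frame index to an SRT-style HH:MM:SS,mmm timestamp."""
    total_ms = round(frame * 1000 / fps)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def microdvd_line_to_cue(line, fps):
    """Return (start, end, text) with time-based stamps for one frame-based line."""
    match = MICRODVD_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a frame-based cue: {line!r}")
    start, end, text = match.groups()
    return (frames_to_srt_time(int(start), fps),
            frames_to_srt_time(int(end), fps),
            text)


if __name__ == "__main__":
    print(microdvd_line_to_cue("{25}{75}Hello", fps=25.0))
```

The same frame number lands on a different time at 23.976, 25 or 30 frames per second, which is why a frame-based file can drift out of sync when it is paired with a differently encoded video.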


AQTitle (.aqt) (As frames)

Movie subtitles file created in the AQTitle format, an older format used by the Czech subtitling community; uses a plain text format that specifies when each subtitle should be displayed; not a common format, but supported by some modern video players.


JACOSub (.jss) (As frames)


MicroDVD (.sub) (As frames)


Phoenix Subtitle (.pjs) (As frames)


SAMI (.smi, .sami) (As frames)

    “Synchronized Accessible Media Interchange (SAMI) is a Microsoft accessibility initiative released in 1998. The structured markup language is designed to simplify creating subtitles for media playback on a PC.”
  • Got it to work with VLC
  • Did NOT get it to work with Elmedia Player; must have been internal text-file format issues
  • Did NOT get it to work with IINA player; must have been internal text-file format issues



(End of the short sections on technologies listed with ‘As frames’ timing precision.

Now back to the main sections on the more frequently used technologies.)

Synchronized Multimedia Integration Language (SMIL)

More of a “control” mechanism for bringing different parts together and controlling when and how they are presented.

Invented and intended for use on the Internet.

It’s quite a comprehensive standard and takes time to become familiar with.

From wikipedia article:

Synchronized Multimedia Integration Language (SMIL (/smaɪl/)) is a World Wide Web Consortium recommended Extensible Markup Language (XML) markup language to describe multimedia presentations. It defines markup for timing, layout, animations, visual transitions, and media embedding, among other things. SMIL allows presenting media items such as text, images, video, audio, links to other SMIL presentations, and files from multiple web servers. SMIL markup is written in XML, and has similarities to HTML.



Take note that different SMIL players provide varying levels of implementation– in other words, some players implement the entire specification, and some use only parts of it. Also, not all of SMIL’s accessibility features are supported by all SMIL players. In these cases, we offer workaround solutions that will work with existing players.


SubRip Subtitle (SRT)

File format

The SubRip file format is described on the Matroska multimedia container format website as “perhaps the most basic of all subtitle formats.”[12] SubRip (SubRip Text) files are named with the extension .srt, and contain formatted lines of plain text in groups separated by a blank line. Subtitles are numbered sequentially, starting at 1. The timecode format used is hours:minutes:seconds,milliseconds with time units fixed to two zero-padded digits and fractions fixed to three zero-padded digits (00:00:00,000). The fractional separator used is the comma, since the program was written in France.

Each subtitle entry consists of four parts:

    1. A numeric counter identifying each sequential subtitle
    2. The time that the subtitle should appear on the screen, followed by --> and the time it should disappear
    3. Subtitle text itself on one or more lines
    4. A blank line containing no text, indicating the end of this subtitle[12]


1
00:02:17,440 --> 00:02:20,375
Senator, we're making
our final approach into Coruscant.

2
00:02:20,476 --> 00:02:22,501
Very good, Lieutenant.
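
To make the four-part structure concrete, here is a minimal parsing sketch. It splits the text on blank lines, reads the `-->` line, and converts each `hours:minutes:seconds,milliseconds` stamp to an integer number of milliseconds. The function names and the dictionary layout are choices made for this note, and tolerating a missing counter line is a leniency of the sketch, not part of the format.

```python
import re

TIMECODE = re.compile(r"(\d{2,}):(\d{2}):(\d{2}),(\d{3})")


def to_ms(timecode):
    """Convert 'HH:MM:SS,mmm' to milliseconds."""
    match = TIMECODE.fullmatch(timecode.strip())
    if match is None:
        raise ValueError(f"bad SRT timecode: {timecode!r}")
    hours, minutes, seconds, millis = map(int, match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + millis


def parse_srt(text):
    """Return a list of cues as dicts with start_ms, end_ms and text."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        # Drop the numeric counter if present (lenient: it may be missing).
        if lines and "-->" not in lines[0]:
            lines = lines[1:]
        if not lines or "-->" not in lines[0]:
            continue  # not a subtitle entry
        start, end = (part.strip() for part in lines[0].split("-->", 1))
        cues.append({"start_ms": to_ms(start),
                     "end_ms": to_ms(end),
                     "text": "\n".join(lines[1:])})
    return cues


if __name__ == "__main__":
    sample = ("1\n00:02:17,440 --> 00:02:20,375\nSenator, we're making\n"
              "our final approach into Coruscant.\n\n"
              "2\n00:02:20,476 --> 00:02:22,501\nVery good, Lieutenant.\n")
    for cue in parse_srt(sample):
        print(cue)
```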


Unofficially the format has very basic text formatting, which can be either interpreted or passed through for rendering depending on the processing application. Formatting is derived from HTML tags for bold, italic, underline and color:[13]

    • Bold – <b>…</b> or {b}…{/b}
    • Italic – <i>…</i> or {i}…{/i}
    • Underline – <u>…</u> or {u}…{/u}
    • Font color – <font color="color name or #code">…</font> (as in HTML)
    • Line position – {\a7} would denote text should appear starting on “line 7.”[14]

Nested tags are allowed; some implementations prefer whole-line formatting only.
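
Because this formatting is unofficial, a player or converter that does not interpret the tags may want to remove them rather than show them literally. A minimal sketch of that, covering only the tag forms listed above (the function name and the regular expression are assumptions for this note):

```python
import re

# Matches <b>, <i>, <u>, <font ...> and their closing forms,
# the {b}/{i}/{u} curly variants, and {\aN} position codes.
SRT_TAG = re.compile(
    r"</?(?:b|i|u|font)(?:\s[^>]*)?>|\{/?[biu]\}|\{\\a\d+\}",
    re.IGNORECASE,
)


def strip_srt_tags(text):
    """Return the subtitle text with the basic formatting tags removed."""
    return SRT_TAG.sub("", text)


if __name__ == "__main__":
    print(strip_srt_tags('<i>Very good</i>, <font color="#ff0000">Lieutenant</font>.'))
```

Anything that is not one of the listed tags, such as a literal `<` in dialogue, is left untouched.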


The SubRip .srt file format is supported by most software video players.

Calculating Conversion of SRT Time Code (Aegisub) into Conventional Time Code (Premiere)
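
The arithmetic behind such a conversion can be sketched as follows, assuming "conventional" time code means the frame-based `HH:MM:SS:FF` form used by video editors. The milliseconds are turned into a frame number within the second. The function name is hypothetical, and the sketch assumes an integer frame rate with non-drop-frame counting, so it does not cover 29.97 fps drop-frame time code.

```python
def srt_to_frame_timecode(srt_time, fps=25):
    """Convert 'HH:MM:SS,mmm' to 'HH:MM:SS:FF' at an integer frame rate."""
    hms, millis = srt_time.strip().split(",")
    hours, minutes, seconds = hms.split(":")
    # Floor division picks the frame that is on screen at that instant.
    frames = int(millis) * fps // 1000
    return f"{hours}:{minutes}:{seconds}:{frames:02d}"


if __name__ == "__main__":
    print(srt_to_frame_timecode("00:02:17,440", fps=25))
```

Going the other way, a frame number `FF` corresponds to `FF * 1000 / fps` milliseconds, so a round trip can lose up to one frame's worth of time.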


“Most subtitles distributed on the Internet are in this format.[4][5]”



When you create an SRT file in a text editor, you need to format the text correctly and save it as an SRT file. This format should include:

[Subtitle sequence number]

[Time the subtitle begins] --> [Time the subtitle ends]

[Subtitle text]


To format the timestamps correctly, use:

[hours]:[minutes]:[seconds],[milliseconds]


Correct formatting is crucial for SRT files to work properly.
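
Since small formatting slips (a dot instead of a comma, a missing blank line) are easy to make by hand, it can be safer to generate the file from data. A minimal sketch, assuming cues are given as `(start_ms, end_ms, text)` tuples; the function names are made up for this note:

```python
def format_srt_time(total_ms):
    """Format milliseconds as HH:MM:SS,mmm (comma before the milliseconds)."""
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    seconds, millis = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def build_srt(cues):
    """Build SRT text from (start_ms, end_ms, text) tuples, numbered from 1."""
    entries = []
    for index, (start_ms, end_ms, text) in enumerate(cues, start=1):
        entries.append(
            f"{index}\n"
            f"{format_srt_time(start_ms)} --> {format_srt_time(end_ms)}\n"
            f"{text}\n"
        )
    # Joining with a newline leaves exactly one blank line between entries.
    return "\n".join(entries)


if __name__ == "__main__":
    srt_text = build_srt([(137440, 140375, "Hello"), (140476, 142501, "Bye")])
    print(srt_text)
```

Writing the result with `open("subs.srt", "w", encoding="utf-8")` would complete the job; UTF-8 is a common choice here, not something the format mandates.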


SubStation Alpha (SSA), Advanced … (ASS)



SubStation Alpha (or Sub Station Alpha), abbreviated SSA, is a subtitle file format that allows for more advanced subtitles than SRT and similar formats.

Advanced SubStation Alpha (ASS) is a script for more advanced subtitles than SSA. It is technically SSA v4+.



Field 2:     Start
Start Time of the Event, in 0:00:00:00 format ie. Hrs:Mins:Secs:hundredths. This is the time elapsed during script playback at which the text will appear onscreen. Note that there is a single digit for the hours!

Field 3:     End
End Time of the Event, in 0:00:00:00 format ie. Hrs:Mins:Secs:hundredths. This is the time elapsed during script playback at which the text will disappear offscreen. Note that there is a single digit for the hours!
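
The Start and End fields can be converted to and from milliseconds with a few lines of code. One caveat: the text quoted above writes a colon before the hundredths, while script files produced by common tools write a dot (`0:00:00.00`). The sketch below therefore accepts either separator when reading and writes the dot form. That leniency and the function names are assumptions of the sketch.

```python
import re

# One digit for hours, two each for minutes, seconds and hundredths.
ASS_TIME = re.compile(r"(\d):(\d{2}):(\d{2})[.:](\d{2})")


def ass_time_to_ms(value):
    """Convert 'H:MM:SS.cc' (or 'H:MM:SS:cc') to milliseconds."""
    match = ASS_TIME.fullmatch(value.strip())
    if match is None:
        raise ValueError(f"bad SSA/ASS time: {value!r}")
    hours, minutes, seconds, hundredths = map(int, match.groups())
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + hundredths * 10


def ms_to_ass_time(total_ms):
    """Format milliseconds as 'H:MM:SS.cc', truncating to hundredths."""
    total_cs = total_ms // 10
    hours, rest = divmod(total_cs, 360_000)
    minutes, rest = divmod(rest, 6_000)
    seconds, hundredths = divmod(rest, 100)
    return f"{hours}:{minutes:02d}:{seconds:02d}.{hundredths:02d}"


if __name__ == "__main__":
    print(ass_time_to_ms("0:02:17.44"))
    print(ms_to_ass_time(137445))
```

Converting from SRT to SSA/ASS drops the last millisecond digit, so `137445` ms and `137440` ms both come out as `0:02:17.44`.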

Timed Text Markup Language (TTML)

Web Video Text Tracks (WebVTT)