NCDAE Tips and Tools: Web Captioning
Created: December 2006
The following is a brief introduction to the principles and potential challenges of captioning for the web. It is meant to be a starting point, not a definitive guide to captioning. If you are interested in learning more, read the NCDAE captioning article and the captioning resources provided by WebAIM, an NCDAE partner.
What are Captions?
Captions are text equivalents of the spoken word and other audio content. They make the audio content of web multimedia accessible to those who do not have access to audio, primarily people who are Deaf or hard of hearing. Captioning can be expensive, and a little daunting at first, but it is also a very important part of accessible design.
Common web accessibility guidelines indicate that captions should be:
- Synchronized - the text content should appear at approximately the same time that audio would be available
- Equivalent - content provided in captions should be equivalent to that of the spoken word
- Accessible - caption content should be readily accessible and available to those who need it
There are five main steps to captioning a file for the web.
1. Create or obtain a transcript
There are a few ways that a transcript can be created, and each has its advantages and disadvantages.
- From production script: If a script is available, you may already have a transcript.
- Generated by stenographer: A stenographer can create a transcript in a very short time, but this process can be extremely expensive, usually $75-$100 per hour.
- Typed by hand: This is usually the most time-consuming way to create a transcript, but it may be the most cost-effective if fast production time is not critical. A fast typist can create a transcript at a much lower cost than a stenographer.
- Created using voice recognition: Although voice recognition is an exciting alternative, the technology is far from perfect. In order for voice recognition to be reliable, a person must train the software and speak very clearly. Some people take advantage of voice recognition through a process called "shadow speaking," in which a person who has trained the software repeats live or recorded speech so that the software can transcribe it.
2. Segment into individual caption displays and add speaker names
Before captions can be created, text must be chunked into smaller units of one or two short sentences. This is usually accomplished by adding manual line breaks between units (hit Enter twice). New speakers or a change in speaker should also be identified by starting the line with the person's name, a colon and a space. Sometimes you will see the speaker identified in a separate line, but this is usually a waste of space.
Note: This step can be combined with Step 1.
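The segmenting convention described in Step 2 can be illustrated with a short Python sketch. It is only an illustration, not part of any captioning tool: the names `Caption` and `segment_transcript` are hypothetical, and the pattern used to recognize a speaker label (a short capitalized name, a colon and a space) is an assumption that may misfire on sentences that happen to contain a colon.

```python
import re
from typing import List, NamedTuple, Optional


class Caption(NamedTuple):
    speaker: Optional[str]  # None when the unit does not start with a speaker label
    text: str


# Assumed convention: "Name: text" at the start of a unit marks a speaker change.
SPEAKER_RE = re.compile(r"^([A-Z][\w .'-]{0,30}): (.+)$", re.DOTALL)


def segment_transcript(transcript: str) -> List[Caption]:
    """Split a transcript into caption units separated by blank lines."""
    captions = []
    for chunk in re.split(r"\n\s*\n", transcript.strip()):
        # Collapse line breaks and repeated spaces inside one unit.
        text = " ".join(chunk.split())
        if not text:
            continue
        match = SPEAKER_RE.match(text)
        if match:
            captions.append(Caption(match.group(1), match.group(2)))
        else:
            captions.append(Caption(None, text))
    return captions


if __name__ == "__main__":
    sample = (
        "Maria: Welcome to the course.\n\n"
        "Today we cover captions.\n\n"
        "John: Thanks, Maria."
    )
    for caption in segment_transcript(sample):
        print(caption)
```

Keeping the speaker separate from the text makes it easy to put the name back on the same line as the caption when the caption file is written.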
3. Assign timecode for each caption to synchronize with audio
Several programs exist to help people synchronize text transcripts with media. The two most popular tools are MAGpie (a free tool) and Hi-Caption. For more information on using these tools, see the following tutorials.
- WebAIM article: Captioning with MAGpie 1
- WebAIM article: Captioning with MAGpie 2
- WebAIM article: Captioning with HiCaption
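Tools such as MAGpie capture timing interactively while the media plays. The data they produce is essentially a start time for each caption. The Python sketch below shows that pairing under stated assumptions: the function names are hypothetical, start times are given in milliseconds, and the `HH:MM:SS.mmm` notation is a generic one chosen for illustration, not the timecode syntax of any particular player.

```python
from typing import List, Tuple


def format_timecode(milliseconds: int) -> str:
    """Format a non-negative millisecond offset as HH:MM:SS.mmm."""
    if milliseconds < 0:
        raise ValueError("timecode cannot be negative")
    seconds, ms = divmod(milliseconds, 1000)
    minutes, s = divmod(seconds, 60)
    hours, m = divmod(minutes, 60)
    return f"{hours:02d}:{m:02d}:{s:02d}.{ms:03d}"


def assign_timecodes(captions: List[str], start_times_ms: List[int]) -> List[Tuple[str, str]]:
    """Pair each caption with the timecode at which it should appear."""
    if len(captions) != len(start_times_ms):
        raise ValueError("need exactly one start time per caption")
    # Captions are displayed in order, so start times must strictly increase.
    for earlier, later in zip(start_times_ms, start_times_ms[1:]):
        if later <= earlier:
            raise ValueError("start times must be strictly increasing")
    return [(format_timecode(t), text) for t, text in zip(start_times_ms, captions)]


if __name__ == "__main__":
    timed = assign_timecodes(
        ["Maria: Welcome to the course.", "Today we cover captions."],
        [0, 2500],
    )
    for timecode, text in timed:
        print(timecode, text)
```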
4. Create appropriate caption files (QTtext, RealText, SAMI)
Every media player uses a different format for its caption files. This can be frustrating if your media files exist in more than one format, but many tools, including those listed in Step 3, can create files in these different formats. The following is a list of the most common file types.
- SAMI (Synchronized Accessible Media Interchange) – The file that contains caption data with timing information for Windows Media Player.
- Quicktime Text Track – The file that contains captions and timing information for Quicktime media.
- RealText – The file that contains caption and timecode data for RealPlayer.
- SMIL (Synchronized Multimedia Integration Language) – The layout language used by Quicktime and RealPlayer.
There are also some tools that allow you to create captions for Adobe Flash content, although there is not currently a single specified format for captions in Flash.
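To give a sense of what one of these caption files looks like, the following Python sketch writes timed captions out in the general shape of a SAMI file. Treat it as a minimal illustration under assumptions, not a reference implementation: the function name `build_sami` is hypothetical, the class name `ENUSCC` and the style values are conventional examples, and the output should be checked against the SAMI documentation and tested in the target player before use.

```python
from html import escape
from typing import List, Tuple


def build_sami(captions: List[Tuple[int, str]], title: str = "Captions") -> str:
    """Return SAMI-style markup for (start_ms, text) caption pairs."""
    lines = [
        "<SAMI>",
        "<HEAD>",
        f"<TITLE>{escape(title)}</TITLE>",
        '<STYLE TYPE="text/css"><!--',
        # Example styling only; large, high-contrast text is easier to read.
        "P { font-family: Arial; font-size: 12pt; color: white; background-color: black; }",
        # Assumed language class for English closed captions.
        ".ENUSCC { Name: English; lang: en-US; SAMIType: CC; }",
        "--></STYLE>",
        "</HEAD>",
        "<BODY>",
    ]
    # Each SYNC element gives the start time in milliseconds.
    for start_ms, text in sorted(captions):
        lines.append(
            f"<SYNC Start={start_ms}><P Class=ENUSCC>{escape(text)}</P></SYNC>"
        )
    lines += ["</BODY>", "</SAMI>"]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_sami([(0, "Maria: Welcome to the course."), (2500, "Q & A follows.")]))
```

Escaping the caption text matters because characters such as `&` and `<` would otherwise be read as markup.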
5. Combine with media and distribute the captioned media
There is no easy way to learn how to combine media and caption files. It can be a difficult process. If you are interested in captioning for a specific format, the following WebAIM tutorials might be helpful.
- WebAIM article: Captioning Quicktime
- WebAIM article: Captioning RealPlayer
- WebAIM article: Captioning Windows Media
Captioning Accessibility Challenges and Solutions
The following table lists common challenges associated with captions, the people with disabilities who might be affected, and possible solutions to these challenges.
|Accessibility challenge||Disability type(s)||Solution(s)|
|A person cannot hear or easily understand audio or video content.||Deaf, Cognitive, Low literacy, non-native language, All||
|Captions may be too long, causing part of the caption to be hidden, or making it difficult to read.||All, Cognitive||Each excerpt should be no more than two lines long.|
|Captioned media is not accessible to a person relying on a Refreshable Braille device||Deaf Blind||Provide a text transcript in addition to captions.|
|In a video, there may be important content conveyed visually that is not included in the captions.||Blind||
|Embedded media players may not be fully keyboard-accessible||Blind, Cannot use a mouse||
|Small videos may be hard to view||Low Vision||When possible, offer the option of a higher-resolution video.|
|Large videos may be very difficult to view by someone with a slow internet connection||All users||
|Small font size and poor fonts may make a caption unreadable||Low vision, all users||
|It may be difficult for some Deaf people to read captions, as English is not the first language for many Deaf people (it is ASL or another sign language).||Deaf||It may be appropriate to provide a video with an ASL (in the U.S.) interpreter in addition to a captioned video. This approach is not always recommended, because it can be very expensive, and because an ASL alternative will not benefit as many people as a captioned video.|
|Nonverbal audio cues are not always included in captions.||Deaf, Anyone who is unable to hear the audio||Ensure that captions include all important audio cues including non-verbal sounds and change of speaker.|
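The two-line limit suggested in the table can be checked or enforced automatically. The Python sketch below splits an overlong caption into several displays of at most two lines each. The function name is hypothetical and the width of 32 characters per line is an assumption chosen for illustration; use whatever width suits your player and font size. Each new display would still need its own timecode.

```python
import textwrap
from typing import List

MAX_CHARS_PER_LINE = 32  # assumed line width, not a figure from any guideline
MAX_LINES = 2            # "no more than two lines long"


def split_long_caption(
    text: str,
    width: int = MAX_CHARS_PER_LINE,
    max_lines: int = MAX_LINES,
) -> List[str]:
    """Wrap a caption and split it into displays of at most max_lines lines."""
    lines = textwrap.wrap(text, width=width)
    return [
        "\n".join(lines[i:i + max_lines])
        for i in range(0, len(lines), max_lines)
    ]


if __name__ == "__main__":
    long_caption = (
        "Captions are text equivalents of the spoken word "
        "and other audio content on the web."
    )
    for display in split_long_caption(long_caption):
        print(display)
        print("---")
```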
Real-time, streaming web multimedia introduces an additional challenge for captioning. Generating real-time captions involves two difficulties: 1) audio information must be converted into text in real time, and 2) the text captions must be delivered to the end user so that they are synchronized with the audio.