Closed Captioning for Online Video
Live Captioning
Live captioning is the real-time transcription of broadcasts such as news, sports, and other live events as they happen. It is generally performed by court reporters using 10-key machines whose keys produce a form of shorthand. Software then translates the 10-key output into English words.
VITAC Corp. is responsible for captioning 160,000 hours of programming each year for major networks, ESPN, Fox Sports, and various feature films. VITAC had to heavily invest in new infrastructure and reprogram portions of its internal software to meet the new challenges in captioning. Tim Taylor, vice president of engineering and facility operations at VITAC, says, "For offline, we had to make changes for captioning in HD by modifying our software for a new aspect ratio—16x9. Then, the next hurdle with HD was that the frame rate for SD (standard definition) is at 29.97 frames per second, PAL is at 25 per second, and now HD has two different frame rates: 29.97 and 24 frames per second. Moreover, everything we caption ‘live’ is pulled off of satellite. So, in order for us to be able to see all of the HD feeds, we now have 120 satellite receivers, with more on the way. To caption for the HD networks meant that we had to make significant investments just so that we could see all of the HD feeds simultaneously. This investment in infrastructure was gradual over a period of time as HD began to phase in. IRD satellite receivers cost $5,000 to $9,000 each. This represents hundreds of thousands of dollars in investment in captioning and subtitling capabilities for both Pre-Recorded and Real-Time Captioning."
Figure 1. VITAC Corp. is responsible for captioning 160,000 hours of programming each year for major networks, ESPN, Fox Sports, and various feature films.
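To make the frame-rate hurdle in Taylor's description concrete, the following minimal sketch shows how one caption start time lands on a different frame number at each rate he mentions. It assumes that 29.97 means exactly 30000/1001 frames per second and treats 24 as the nominal rate; the function names are hypothetical and are not taken from VITAC's software.

```python
from fractions import Fraction

# Nominal frame rates named in the quote. 29.97 fps is assumed to be
# exactly 30000/1001; 24 fps is taken at face value.
FRAME_RATES = {
    "29.97 (SD/HD)": Fraction(30000, 1001),
    "25 (PAL)": Fraction(25),
    "24 (HD)": Fraction(24),
}


def seconds_to_frame(seconds, fps):
    """Return the index of the frame on screen at a non-negative time."""
    return int(Fraction(seconds) * fps)  # int() floors non-negative values


def caption_start_frames(seconds):
    """Map each frame-rate label to the frame where a caption would start."""
    return {name: seconds_to_frame(seconds, fps)
            for name, fps in FRAME_RATES.items()}


if __name__ == "__main__":
    # A caption that starts one minute into the program.
    for name, frame in caption_start_frames(60).items():
        print(f"{name:>14}: frame {frame}")
```

A caption that starts 60 seconds in falls on frame 1798, 1500, or 1440 depending on the rate, which is why a caption file timed for one format cannot simply be reused for another.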
The ‘How’ of Live Captioning
"For live webcast captioning, video streaming servers are used," says Dilip K. Som, president of Computer Prompting & Captioning Co. (CPC), a company that produces live and offline captioning and subtitling software for Windows and Mac. "From the computer running our WebCaption software that encodes captions to the videos, a link is made to a streaming server that is usually equipped with large bandwidth to accommodate hundreds to thousands of web viewers. The customer sends a stream to one of the companies that uses our software, such as Caption Technologies. That company (Caption Technologies) uses a steno system or a speech recognition software to create the captions in real time and embed the captions to the video. We embed the captions inside the video instead of sending the captions as a separate stream to achieve full synchronization of captions to audio. At the present moment, we only work with Windows Media Video. We investigated other players—including Real and QuickTime, but there were some technical difficulties to achieve the same solution. We’re still working on it."
Figure 2. CPC’s WebCaption software encodes captions to videos, then links to a streaming server that is equipped with large bandwidth to accommodate hundreds to thousands of web viewers.
Live captioning from an originally English feed directly into other languages is rare, and VITAC is one of the few companies that offer such a service. Very few people can key shorthand on a 10-key machine while translating accurately at the same time, which makes these translating captionists especially valuable. Live translation is particularly difficult because of the wide variety of subject matter that appears on broadcasts, from financial news to international politics to science and technology programs.
Software and Offline Captioning
For offline captioning and subtitling, some of the big players in software packages include CPC, Adobe, and Sonic Scenarist. At the click of a button, CPC’s software breaks a transcribed script instantaneously into the raw captions required, splitting each sentence into the proper two or three subtitles, all of correct length as specified by the FCC. Then, the software guides the user to assign timecodes to each entry point with mouse clicks.
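The script-splitting step can be illustrated with a minimal sketch. It is not CPC's algorithm: it assumes a limit of 32 characters per line (the row width of CEA-608 captions) and two lines per caption, and it breaks only at word boundaries, whereas a production tool would also weigh phrase boundaries. The function name is hypothetical.

```python
import textwrap

MAX_CHARS_PER_LINE = 32    # assumption: row width of CEA-608 captions
MAX_LINES_PER_CAPTION = 2  # assumption: a common two-line caption layout


def split_into_captions(sentence,
                        width=MAX_CHARS_PER_LINE,
                        lines_per_caption=MAX_LINES_PER_CAPTION):
    """Break a sentence into captions, each a list of short lines."""
    # Wrap at spaces only, so hyphenated words such as "real-time" stay whole.
    lines = textwrap.wrap(sentence, width=width, break_on_hyphens=False)
    # Group consecutive lines into captions of at most lines_per_caption.
    return [lines[i:i + lines_per_caption]
            for i in range(0, len(lines), lines_per_caption)]


if __name__ == "__main__":
    text = ("Live captioning is the real-time transcription "
            "of a broadcast as it happens.")
    for number, caption in enumerate(split_into_captions(text), start=1):
        print(f"Caption {number}:")
        for line in caption:
            print(f"  {line}")
```

Each resulting caption would then receive its own timecode in the mouse-click step described above.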
For translation of captioning files or subtitling for the web and DVD, International Services offers a web-based, do-it-yourself Subtitler designed to control translation into other languages with glossaries and other features for linguists. Subtitle text can be translated by a professional translator (if you don’t have one, links are provided to qualified translators), or users can send the subtitles for automated translation by language software.
Automatic Sync Technologies (AST) has created web-based software to provide a "digital drive-thru," a quick-turnaround version of captioning in which users drop off their files via the web and pick up the finished captions via email. To provide this quick turnaround, AST runs each transcription through parsing software. Then, the company's algorithms synchronize the script words with the corresponding audio in a text-to-audio match pattern.
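One way to picture text-to-audio matching is the sketch below. It is not AST's method: it assumes a hypothetical speech recognizer has already produced rough (word, start time) pairs from the audio, then uses Python's difflib to line up the clean script against those words and copy the timestamps across. Script words the recognizer misheard are left without a time.

```python
from difflib import SequenceMatcher


def normalize(word):
    """Lowercase a word and drop punctuation so matching is forgiving."""
    return "".join(ch for ch in word.lower() if ch.isalnum())


def align_script_to_audio(script_words, recognized):
    """Attach start times to script words.

    recognized is a list of (word, start_seconds) pairs from a
    hypothetical recognizer. Returns (script_word, start_seconds or None).
    """
    script_keys = [normalize(w) for w in script_words]
    heard_keys = [normalize(w) for w, _ in recognized]
    timings = [None] * len(script_words)

    matcher = SequenceMatcher(a=script_keys, b=heard_keys, autojunk=False)
    for block in matcher.get_matching_blocks():
        # Every word inside a matching run inherits the recognizer's time.
        for offset in range(block.size):
            timings[block.a + offset] = recognized[block.b + offset][1]
    return list(zip(script_words, timings))


if __name__ == "__main__":
    script = "Good evening and welcome to the news".split()
    heard = [("good", 0.0), ("evening", 0.4), ("an", 0.9), ("welcome", 1.1),
             ("to", 1.6), ("the", 1.7), ("news", 1.9)]
    for word, start in align_script_to_audio(script, heard):
        print(word, start)
```

In this example the misrecognized "an" leaves "and" untimed; a fuller tool could interpolate such gaps from the neighboring words.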
Where Does the Money Go?
The price of an hour of outsourced captioning varies widely depending on the vendor; $600 is fairly typical, but it is not the only price around. The bean counter in you may respond, "But transcription sounds so easy. Where does the money go?" First, transcription is a long and precise process. There can be 10,000 words in 1 hour of content. A typist working at 100 words per minute would need 100 minutes of uninterrupted typing just to key that many words, and a fast typist at 200 words per minute would still need 50 minutes. Neither figure allows for pausing and replaying unclear audio, which in practice adds to the total. Then, the transcription must be proofread before being broken down into captions and timecode assignments.
A professional transcriptionist transcribes much faster, and more thoroughly, than a typist. Professionals in transcription often use 10-key court-reporting equipment that speeds up the transcription turnaround significantly. In addition, they are usually excellent proofreaders.
VITAC has invented a type of "shorthand for keyboards" for use with its own internal software that is a 21st-century version of Gregg Shorthand, which was popular in the 1950s. According to Yelena Makarczyk, director of subtitling operations for VITAC, both captioning and subtitling are arts that may not be readily apparent to nondeaf viewers, but poorly split sentence text is like a punch in the gut to people exposed to captioning and subtitling on a daily basis. "So much goes into one caption. The average reading speed is 15 characters per second, but cartoons for kids must be slowed, with [fewer] words showing on screen for a much longer period of time," Makarczyk says. "For other languages, we also perform cultural assessment checks. Every single caption is treated as a unit—dividing the timing and process of captioning and subtitling into age group, accuracy, flavor, consistency, [and] verification. And there is research involved. We once spent 4 days in our off-hours looking for the exact spelling of an African god, until someone from Sudan finally found it in a book."
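The reading-speed figure Makarczyk cites translates directly into a minimum on-screen time for each caption. The sketch below works through that arithmetic. It assumes spaces and punctuation count as characters, and the slower rate of 10 characters per second for children's programming is an illustrative value, not one given by VITAC.

```python
READING_SPEED_CPS = 15   # average reading speed quoted in the article
CHILDREN_SPEED_CPS = 10  # assumption: illustrative slower rate for kids


def min_display_seconds(caption_text, chars_per_second=READING_SPEED_CPS):
    """Return how long a caption must stay up to be read at the given rate."""
    return len(caption_text) / chars_per_second


if __name__ == "__main__":
    caption = "Where does the money go, then?"  # 30 characters
    print(min_display_seconds(caption))                      # 2.0
    print(min_display_seconds(caption, CHILDREN_SPEED_CPS))  # 3.0
```

The same 30-character caption needs 2 seconds at the average rate and 3 seconds at the slower one, which is why children's captions carry fewer words for longer.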