The Human Art of Captioning

By: David Titmus

For some, closed captions are simply the words that flow across the bottom of a television screen, describing the on-screen dialogue and action. But for nearly 50 million Americans in the deaf and hard of hearing community, captions are so much more: they are an important connection to a world that many in the hearing community take for granted. Captions provide a link not only to entertainment, but also to education, news, and emergency information.

In short, captions play a crucial role in the daily lives of millions. And it’s for this reason that clear, concise, accurate captions are absolutely essential.

Though new technology and voice recognition systems are making strides in the industry, the best approach to captioning remains firmly rooted in the human experience of the spoken word. Professionals trained to provide captions bring human sensitivities and contextual awareness to a captioning engagement that no Automatic Speech Recognition (ASR) system can.

Indeed, when comparing captions created by humans to those created exclusively by even the smartest of machines, there is an obvious disparity. ASR systems often fail to present names and technical terms properly, stumble on accented or mumbled speech and background noise, introduce errors in caption timing, synchronicity, and placement, and can have difficulty distinguishing between what a speaker “said” and what they actually “meant.”

The whole point of captions is to enable accurate access to the audible content on the screen, but if captions present that content in ways that make no sense, the captions are useless. This is why human captioners remain the linchpin of successful captioning.

That is not to say that technology does not matter. The technology can provide the science in captioning, while the human captioner provides the art behind the captions. In fact, an understanding of the technologies involved in the presentation of live, broadcast, and streamed content plays a critical role.

VITAC is committed to developing, finding, and implementing the latest technologies for delivering accessibility to viewers. But we do not cut corners, and human beings are still the best at understanding the many complexities and contexts that go into captioning the spoken word.