In-house, Outsource, or ASR – What is the Best Way to Transcribe a Video?

Oct 18 2019 David Titmus

Broadcast transcription has come a long way. We’ve moved from digibeta to digital workflows and have expanded the use of transcriptions from simply speeding up edits to facilitating localization and making content accessible, but the basic need to convert video to text remains unchanged. In fact, transcriptions are probably more important to our industry now than ever before.

Valuable as it is, transcribing video content is seen as a relatively simple part of a complex content supply chain, and tight budgets make it tempting to assign the task to internal teams rather than outsourcing it to professionals. The advent of ASR transcription tools and the industry’s keen adoption of this technology seem to suggest that this will all be automated soon – if it isn’t already. So, what is the best way to transcribe a video for broadcast?

Option 1: Using In-House Resources to Transcribe Video

Tasking your production assistant, or even a runner, with transcribing your videos may seem like a perfect solution. They’re already working for you, so there’s little to no additional cost involved. Their workload often drops off after the shoot, so they probably have some capacity to take on additional tasks during post-production. And they’re already part of your team and familiar with the project. Keeping all your media in-house also means that you don’t have to worry about external service providers’ content security protocols.

However, it’s unlikely that you hired your runner based on either their language skills or typing ability, so they could take an inordinate amount of time to produce transcriptions that are unusable anyway because they’re riddled with spelling mistakes and formatting issues. You would also have to manage these team members’ schedules and provide the space, workstations, and playback and control tools for them to work with. Another big issue is scalability. Tight post-production schedules generally mean that you’re dealing with high volumes of content that need to be turned around quickly. To deliver on time, you need a big, dedicated team working for short periods rather than one or two runners doing a bit of transcription work in between their other tasks. Often, trying to save money by transcribing content internally ends up costing both time and money because you have to pay someone to redo the work.

Option 2: ASR Transcription for Video

Automatic speech recognition (ASR) technology is an increasingly popular method of transcribing video content and, as such, has been integrated into many software platforms across the industry. The benefits are obvious. ASR tools provide transcriptions at lightning speed, with transcriptions completed within seconds or minutes of placing an order, and they’re extremely cost-effective, available globally, and able to handle any volume of work.

But while ASR technology has improved enormously since it was first adopted for captions on live programming, this automated software is not yet able to provide the same level of accuracy as human transcribers. This can negatively affect your ability to search transcripts for keywords and phrases, particularly if your video requires multiple speaker identification or if it wasn’t recorded in a sound-controlled environment. If 100% accuracy isn’t necessary for your project, then ASR technology will get the job done quickly and cheaply. However, the final product is unlikely to be formatted to your specific needs and will often require cleanup.
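If you want to put a number on how accurate an ASR transcript is before deciding whether it is good enough, one common yardstick is word error rate (WER): the number of word substitutions, insertions, and deletions needed to turn the ASR output into a trusted human transcript, divided by the number of words in that human transcript. The Python sketch below is a minimal illustration under simple assumptions (lowercased text, words split on whitespace, punctuation already removed). The function name and the sample sentences are made up for this example and do not describe any particular vendor's scoring method.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    `reference` is the trusted (human) transcript; `hypothesis` is the ASR output.
    Assumes punctuation has already been removed from both strings.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # reference word missing from the ASR output
                curr[j - 1] + 1,     # extra word inserted by the ASR output
                prev[j - 1] + cost,  # match, or one word substituted for another
            ))
        prev = curr
    return prev[-1] / len(ref)


# Made-up sample: the ASR output mishears one word out of five.
human = "the quick brown fox jumps"
asr = "the quick brown box jumps"
print(f"WER: {word_error_rate(human, asr):.0%}")
```

A lower score is better, and a score of zero means the two transcripts match word for word. Note that WER says nothing about speaker labels or formatting, which still have to be checked separately.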

Option 3: Outsourcing Video Transcription to a Specialist Service Provider

Broadcast transcription specialists provide highly accurate video transcriptions for producers, content providers, and localization companies using either human transcribers or a combination of ASR and human “polishing.” Because their processes aren’t fully automated, they can customize their services to meet your specific requirements and can expand the video transcriptions into full post-production (or as-broadcast) scripts that include dialogue lists, on-screen captions, music cues, and credit lists, all key deliverables for localization. Many transcription agencies, like VITAC, also provide related services such as translations, captions, and audio descriptions, so you don’t have to deal with multiple providers. As these agencies are industry specialists, they understand how transcripts are used in production and are familiar with the different broadcasters’ specifications, so they’re able to provide transcriptions and other content in the format most appropriate for its intended purpose. Finally, broadcast transcription specialists should be certified as compliant with industry security protocols, like the DPP Committed to Security marks, so you can rest assured that your content is safe.

Outsourcing to broadcast transcription specialists costs more than other methods of transcribing your video, but the difference in cost is often less than you might expect and is generally reflected in the quality of the final product. Human transcribers also can’t compete with the speed of ASR (so you shouldn’t expect to receive transcriptions within minutes of submitting them to a transcription company) but, by combining ASR with human polishing and enlisting a large, internationally distributed workforce, some providers are able to offer overnight turnarounds, even on high volumes of content.

As you can see, there are pros and cons to each approach, and the best way to transcribe a video will depend on how you’re planning to use the transcription and whether speed, quality, or price is most important to you.

Contact VITAC to find out how little you can expect to pay for highly accurate video transcriptions for your next production.