TranscribeMe, Nuance Fine-Tuning Audio Transcription For Hot Market

and transcription prices drop. For example, media producers often publish full texts of their productions so that viewers and listeners can find them using search engines. “Google doesn’t index MP3 files,” he says.

Smartphones are also market catalysts, because they feature high-quality recording devices with enough memory to store audio from meetings two or three hours long, Dunayev says. TranscribeMe offers mobile apps for Windows, Android, and iPhone that guide users through recording audio and uploading it to the company's customer portal. There, customers can store their audio files and place transcription orders.

TranscribeMe is now preparing to allow clients to order transcripts of files they’ve stored on other content hosting platforms such as YouTube and Dropbox. TranscribeMe charges $2 per transcribed audio minute when the file involves two or more speakers. The price for a single-speaker transcript is $1 per minute.
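To make the two rates concrete, here is a minimal Python sketch of the arithmetic. Only the $1 and $2 per-minute figures come from the article. The function name `estimate_cost` is hypothetical, and the sketch assumes the caller supplies the billable minutes directly, since the article does not say how partial minutes are billed.

```python
from decimal import Decimal

# Rates quoted in the article, in US dollars per transcribed audio minute.
SINGLE_SPEAKER_RATE = Decimal("1.00")
MULTI_SPEAKER_RATE = Decimal("2.00")


def estimate_cost(audio_minutes, speaker_count):
    """Return an estimated price in dollars (hypothetical helper).

    Assumption: audio_minutes is already the billable length. How the
    company rounds partial minutes is not stated in the article.
    """
    if audio_minutes < 0:
        raise ValueError("audio_minutes must not be negative")
    if speaker_count < 1:
        raise ValueError("speaker_count must be at least 1")
    # One speaker gets the lower rate; two or more get the higher rate.
    rate = SINGLE_SPEAKER_RATE if speaker_count == 1 else MULTI_SPEAKER_RATE
    return rate * Decimal(str(audio_minutes))


if __name__ == "__main__":
    # A 90-minute meeting with three speakers versus a 90-minute dictation.
    print(estimate_cost(90, 3))
    print(estimate_cost(90, 1))
```

At these rates, a 90-minute recording of a three-person meeting would cost twice as much as a 90-minute solo dictation.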

Dunayev says TranscribeMe has been growing rapidly since its service came online in 2012, but the private company doesn’t disclose its revenues. The company raised $900,000 in November 2012 from investors including Tech Coast Angels, Sierra Angels, TA Ventures, TEC Ventures, ICE Angels, and Maverick Angels, bringing its total funds raised to $1.5 million. TranscribeMe has 30 staffers in Berkeley and an operations branch in New Zealand.

While startups continue to develop methods to work around the limitations of multiple-voice transcription software, tech giants are amassing big databases of digitized speech and matching text that may help researchers enhance the accuracy of voice recognition programs across a range of accents and vocal quirks. Microsoft, Google, and Apple are trying to improve their services based on speech recognition, such as voicemail-to-text conversion and voice-activated computer commands.

Nuance, by supplying voicemail-to-text technology to phone companies, has already made some headway in transcribing the speech of people its programs haven’t been specifically trained to interpret. At Nuance, research divisions are now trying to develop high-quality transcribing capabilities for lengthy, multiple-voice audio files, Mahoney says.

The first hurdle is to build software that can simply recognize that a new person has begun to speak—a feat that can be challenging even for human beings when two speakers on the same audio file have similar voices. Mahoney says Nuance is working on voice identification software that weighs a number of speech characteristics to sort out different speakers.

The future Nuance software would then split up the audio file so that each individual speaker’s statements could be transcribed separately and later re-assembled, Mahoney says. The new process is being designed as a service for enterprise customers such as businesses that record their meetings, not as a consumer software product. The service, while it would rely on improved speech recognition software, would still make some use of human transcriptionists, the company says. No timeline has been released for its launch.
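The split, transcribe, and re-assemble flow Mahoney describes can be pictured with a short Python sketch. This illustrates the general idea only and is not Nuance's software: the `Segment` type, the `transcribe_by_speaker` function, and the `transcribe` callable are all hypothetical, and the sketch assumes the speaker-identification step has already labeled each stretch of audio with a speaker and a start time.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    """One stretch of audio attributed to a single speaker (hypothetical)."""
    start: float   # seconds from the beginning of the recording
    speaker: str   # label assigned by an assumed speaker-identification step
    audio: str     # stand-in for the raw audio of this stretch


def transcribe_by_speaker(
    segments: List[Segment],
    transcribe: Callable[[Segment], str],
) -> List[str]:
    """Group segments by speaker, transcribe each group, then merge by time."""
    # Step 1: split the recording so each speaker's statements are together.
    by_speaker = defaultdict(list)
    for segment in segments:
        by_speaker[segment.speaker].append(segment)

    # Step 2: transcribe each speaker's statements separately.
    lines = []
    for speaker, speaker_segments in by_speaker.items():
        for segment in speaker_segments:
            lines.append((segment.start, speaker, transcribe(segment)))

    # Step 3: re-assemble the transcript in chronological order.
    lines.sort(key=lambda line: line[0])
    return [f"{speaker}: {text}" for _, speaker, text in lines]


if __name__ == "__main__":
    meeting = [
        Segment(0.0, "Speaker A", "welcome everyone"),
        Segment(4.5, "Speaker B", "thanks for having me"),
        Segment(9.0, "Speaker A", "let's begin"),
    ]
    # A stand-in "recognizer" that simply returns the placeholder text.
    for line in transcribe_by_speaker(meeting, lambda seg: seg.audio):
        print(line)
```

Grouping by speaker before transcription is what would let a recognizer work on one voice at a time; sorting by start time afterward restores the order of the conversation.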

TranscribeMe’s Dunayev says he fully expects speech recognition software to gain added power, no matter which big company eventually meets the challenge of multiple-voice transcription. He says he doesn’t fear the competition.

“We actually count on it happening,” Dunayev says. Rather than undercutting TranscribeMe’s business, voice-to-text technology advances could allow the company to improve its service by reducing costs and lowering prices for customers, he says. Dunayev doubts that software will soon eliminate the need for human transcribers to perfect computer-generated transcripts.

“There’s still a need for that last person to do that quality validation,” Dunayev says.

Bernadette Tansey is Xconomy's San Francisco Editor. You can reach her at btansey@xconomy.com. Follow @Tansey_Xconomy
