Re: [NTLK] Pen computing is dead... long live the iNewt

From: Ryan Vetter <>
Date: Sun Aug 16 2009 - 18:34:23 EDT

I think Ron was just being incredulous to save face...

As I told you in another email, Ron, there is no legitimate automated transcription software. It just does not exist. Google doesn't have it, and neither do Microsoft, TRADOS, or the underground language translation/transcription Defence contractors.

The best we have been able to come up with for transcription is software that allows one to use a pedal, along with some workflow enhancements (e.g. predictive text entry, like on smartphones). Press the pedal to start the audio/video, press it again to stop, etc. With Dragon NaturallySpeaking and a headset, the transcriber can dictate what they are hearing in the audio/video recording.

iListen did offer a transcription pack when it was on the market, but it was not what you would think it is. The recording had to be of high quality, and the speech had to come from a voice trained under iListen. Not only that, but the speech had to be of the kind iListen recognizes (you must say the punctuation marks, e.g. "period", "comma", etc.). Because of this, it was unusable in the transcription industry. Unfortunately, no speech recognition software is going to automatically format your speech with periods, quotation marks, etc. The output of this software when it comes to transcription is a mess and therefore unusable.

But iListen/MacSpeech Dictate/Dragon NaturallySpeaking are not designed for transcription; they are designed for direct speech input via an approved microphone and a trained speaker. And I mention DNS and MSD because they are the best speech-to-text applications in the world, and many other companies license DNS's speech recognition algorithm. It makes sense though: that algorithm has seen over 20 years of development and is the best, most widely used speech-to-text recognition software on the planet.

The problem with transcription software is that it is really an all-or-nothing thing: either the software works or it does not. We are not even 50% there yet, and we are many years away from anything remotely usable because there are so many challenges with automated transcription: multiple voices, varying pitch, speed, and pronunciation, background noise, etc.

You might want to check out that universal translator that the US DOD uses, which was produced by a Defence contractor. It sounds great, but is itself very limited (it will give you a basic sentence resulting from the input of some very basic, predefined speech). Nothing right now exists that is smart and seamless, and it won't for quite some time. But dictating to your computer directly has basically been perfected, and the leaps came just in the past 4 years.

As for SpinVox, it follows that they do not have some superior, intelligent software that other industry leaders lack (e.g. TRADOS, DNS, Lionbridge, etc.). SpinVox outsources the work for low cost to places like India, where human transcribers transcribe your voicemails into text and send them to you via SMS...

And as for your strawman about SpinVox and privacy issues, Ron: personal/business voicemails and technology discussions from active users on a discussion list are two different worlds. For example, my clients sometimes leave their credit card information on our answering system, although they are not instructed to do so. Handing that information out to SpinVox is not only risky and unprofessional, but illegal. Here is a link discussing a privacy breach at SpinVox:

