Network World

Time to reappraise speech recognition systems?

By Nick Booth, CIO, 11/17/2008

One April, 11 and a half years ago, Hollywood actor Richard Dreyfuss presented a new type of software that was going to 'revolutionise business'. He had been paid to host the launch of Dragon's NaturallySpeaking application, which could faultlessly translate spoken words into text. If this worked, we could chuck away our keyboards. Productivity would multiply. Dragon would become the new Microsoft and a new era of IT would dawn.

And work it did too -- in the demonstration. But not everything about the event was quite so well stage-managed. New York was suffering its worst ever blizzard and few made it through the snow. One year later, founders Janet and Jim Baker hadn't found the mass market they may have anticipated. That year, a Belgian firm called Lernout & Hauspie introduced Voice-Express, another desktop speech software product that could potentially free us all from the tyranny of crouching over a keyboard, ruining our posture and giving ourselves RSI. In a demo, it even outperformed the world's fastest typist.

So why aren't we using this software on every computer in the land? Why aren't we talking to computers, telling them what we want to do? How come Windows and Mac OS remained the user interfaces of choice, when voice commands would be so much more efficient and user friendly? Especially as speech dictation has become part of so many phone calls to buy tickets, report meter readings and query bills?

Clearly, the cynic might argue, the software was less effective outside a demo environment. As soon as there is no public relations executive standing over you, fussing that "it won't work if you do it like that", then the technology doesn't function and we give up.

And yet it's still with us. The inventors have persisted through all the trials and tribulations. How often have we seen technology deliver the promised benefits, after a decade of trying?

Videoconferencing took nearly 50 years to deliver on its initial promise. Windows took three versions to make a big impact. After only 10 years, voice recognition software might actually be worth installing for some users, especially if they have a disability, share computers or regularly insert slabs of ready-made text.

To find out where speech software is today, CIO magazine talked to the man behind Dragon at the launch of version 10 of NaturallySpeaking.

Well documented

Steve Chambers has been president of mobile services at Nuance (owner of Dragon as well as a number of other speech recognition companies it has acquired) for eight years, after a decade at Xerox. His career reflects the changing emphasis in managing information. Xerox was the place to be for document management. Now speech is the key to managing intelligent systems. There have been a lot of false dawns. So has voice recognition really come of age and is it ready for CIOs to consider for broad deployment? Or are optimists still kissing frogs?

"I would argue that a CIO has some pretty sound reasons for deploying this software, even if it's on a limited basis," says Chambers. "Although actually, that's one of the best ways you can build mass appeal for a new application among your users. Give it to the privileged few, and pretty soon everyone will be knocking on your door demanding it," he says.

The problem that speech recognition suffered from when it first appeared was that its capabilities were oversold. It was never going to be a replacement for typing, just an alternative for certain types of users and certain types of documents or usage scenarios. It was a mistake to give people the impression that they could dispense with their keyboards completely and look forward to talking to their computers. And it was also premature to make too many bold claims for the robustness of the technology. It wasn't ready back then, Chambers admits, and whether it is ready now is open to debate.

"It all depends on the tasks you're undertaking," says Chambers. "Clearly there are some processes where you are going to write several drafts before you are happy, sometimes of the first paragraph. That's still a lot easier to do with a keyboard than with vocal commands."

One of the most obvious drawbacks with NaturallySpeaking is that you have to speak slowly and clearly, and you have to know exactly what you are going to say before you start dictating. The best way to work this out, of course, is to write yourself a script. And how would you do that? Using a word processor and a keyboard.

However, there are still niches where speech recognition has an obvious effect on productivity: jobs where no creative thought is needed, such as gathering information in the field. Insurance investigators reporting their findings can have their observations quickly transcribed, and legal professionals can dictate notes. Doctors who have dictated their observations about patients into a voice recorder can now have them transcribed by playing back the recording to NaturallySpeaking.
