Computers May Listen in the Future : Telecommunications: AT&T’s plan to replace many telephone operators with computerized voice recognition systems should help speed the use of machines that understand human speech.
American Telephone & Telegraph’s plan to replace many of its telephone operators with computerized voice recognition systems is not a dramatic technological breakthrough, but it should nonetheless help speed the deployment of machines that understand human speech, analysts said Wednesday.
Over the next few years, voice recognition should become an alternative to touch-tone phones for navigating corporate telephone systems. It will also replace the traditional typewriter keyboard as a means of entering information on some highly specialized computer systems.
But the day when people can casually talk to their computers using normal speech remains far off. And for many researchers, the focus now is less on finding a revolutionary new technology than on developing practical uses for the impressive but limited capabilities that already exist.
“Within a couple of years, everyone will have some experience with voice recognition systems,” said John Oberteuffer, president of Voice Information Associates, a Lexington, Mass., consulting firm. “There are a lot of good industrial applications for speech input that have been slow to catch on. That should change as people get used to it.”
Companies such as AT&T, Northern Telecom and International Business Machines--as well as research universities such as Massachusetts Institute of Technology and Carnegie-Mellon--have been working on speech recognition technologies for decades. Early systems had to be “trained” to understand a specific voice, and they could understand only a small number of words. They got thoroughly confused if the speaker talked too fast or slipped in a word that wasn’t part of their limited vocabulary.
But the growing power of computers and the increasing sophistication of the mathematical formulas used to analyze and understand the spoken word have solved some of these problems. Today, it’s possible to build systems that understand a few words spoken by anyone, or many words spoken carefully by one person--though it’s still difficult to combine the two.
The key to the AT&T system is a technique called “word spotting,” which makes it possible for the system to comprehend a request even if it is phrased in many different ways. Thus a caller can say, “I’d like to make a collect call,” or, “Make this call collect.”
“This is very important for ease of use,” said Jay Wilpon, a Bell Laboratories researcher who played a key role in developing the technology. “You have to be able to talk to a voice recognizer without thinking about it.”
The AT&T system can identify only a handful of words and phrases, but it can consistently understand virtually any intonation or accent.
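The word-spotting idea can be illustrated with a minimal Python sketch. It is not AT&T’s method: a real system spots keywords in the audio signal itself, while this sketch assumes the speech has already been turned into text. The keyword table, the action names and the function `spot_request` are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical vocabulary: a handful of key phrases, each mapped to an
# illustrative action name. Not taken from any real system.
KEYWORDS = {
    "collect": "collect_call",
    "calling card": "calling_card",
    "third number": "third_number_billing",
    "person to person": "person_to_person",
    "operator": "operator",
}


def spot_request(utterance):
    """Return the action for the earliest key phrase in the utterance.

    Words outside the vocabulary are simply ignored, which is the point
    of word spotting. Returns None when no key phrase is present.
    """
    # Lowercase and turn every run of non-letters into a single space,
    # so "Person-to-person" becomes "person to person".
    text = re.sub(r"[^a-z]+", " ", utterance.lower()).strip()
    padded = f" {text} "  # pad so phrases match only on word boundaries

    best = None  # (position, action) of the earliest match so far
    for phrase, action in KEYWORDS.items():
        pos = padded.find(f" {phrase} ")
        if pos != -1 and (best is None or pos < best[0]):
            best = (pos, action)
    return best[1] if best else None
```

Under these assumptions, both phrasings quoted above resolve to the same request, because everything around the word “collect” is disregarded.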
Karl Kozarsky, executive vice president of Probe Research, a New Jersey telecommunications consulting firm, said the market for telephone-based voice response systems would grow from $54 million in 1991 to nearly $250 million by 1995. Voice recognition is also gaining currency on computer systems. Quality-control workers at the Saturn car factory, for example, can describe problems aloud into a headset, and the computer system then puts the corrective process in motion, according to Oberteuffer.
Voice recognition systems can also aid people with physical disabilities.
Yet the major computer companies have not moved to make voice input a part of their systems--both because the technology remains imperfect and expensive and because it’s not clear what benefits voice would provide for mainstream computing.
Apple Computer, for example, recently hired renowned Carnegie-Mellon voice researcher Kai-fu Lee and has come up with a system that enables a Macintosh personal computer to understand basic voice commands even without the addition of special voice-processing hardware.
But when the technology was demonstrated for reporters and analysts at a meeting on Wednesday, nearly every phrase had to be repeated two or three times before the machine understood. And Apple manager Phac Le Tuan indicated that the company wouldn’t release a product until it had also completed software that would enable people to develop their own ways of using voice commands.