IBM’s voice recognition grows up
2000-06-22
We knew that Ozzy Osbourne’s talents were diverse, but we didn’t expect to see him popping up as general manager of IBM Voice Systems. Okay, it is a different chap but somebody who associates himself with the hard rock icon must be either barking mad or know exactly what he is doing. Judging by IBM’s latest announcements in voice recognition, we would suggest the latter.
IBM has announced a complete revamp of its voice strategy, which reflects and confirms the direction taken by the IT market as a whole. Essentially, the technology landscape is moving away from the fat client architecture built on general-purpose PCs, towards a structure that concentrates the heavy processing on the server and performs certain specific functions on embedded chips in client devices. These two areas – thick, general-purpose servers and thin, function-specific clients – provide the model upon which IBM is to base its strategy for voice recognition in the future.
It has to be said that something had to give. Having advocated the potential of voice recognition for many years, I felt obliged to run through the learning mode of IBM’s latest release of ViaVoice. After more than an hour of reading sections of (most appropriately, I must say) Alice in Wonderland to my computer, I attempted to dictate an article. The results ranged from the hilarious to the baroque – progress was slow, not helped by my own, frankly puerile, giggling at some of the phrases that were generated. Great dream, but the reality is sadly lacking. Companies such as SpeechWorks, which concentrate on specific, server-side areas such as telephone share trading or flight checking, have seen more success, but even these have been subjected to the mockery of the general public. Even assuming that the ability of computer software to interpret the spoken word does become a reality, the fact is that we do not speak as we write – dictation is difficult enough to beg the question: why bother? Which is why IBM’s strategy makes a lot of sense. Let’s look at why.
IBM is focusing on two areas. The first is the server, where (like SpeechWorks) the aim is to integrate speech recognition into enterprise applications. According to \link{http://www.news.com,News.com}, WebSphere Voice Server with ViaVoice Technology, a suite of tools for helping call centres make better use of the Web, is planned for release in the autumn. Also to come is a product that will enable Siebel users to integrate voice calls and Web-based queries. In addition, IBM announced a partnership with Internet speech specialist General Magic that is to target voice for eCommerce applications. The second area of focus is the embedded device: IBM is to release embedded ViaVoice, a Java-based software development kit targeted at PDAs and mobile phones as well as in-car devices and the like. According to W. S. (Ozzy) Osbourne, IBM is positioning its technologies as a framework for others to use rather than trying to go direct to the end-user.
Why is IBM’s strategy sound? To believe this, one first has to accept that voice recognition does have a future, albeit a more focused one than the generalised “human-talks-to-computer” model. Given this, on the enterprise side IBM is concentrating on specific application areas such as call centres – no doubt with limited vocabulary, though enterprise servers will have the processing power required to support more general recognition. On the client side, IBM is making it easier for voice features to be built into devices, rather than implementing such devices itself – it is likely that product developers (such as Motorola, likely to bring out an in-car facility) are better placed to identify and develop workable applications for voice.
IBM is not dropping ViaVoice, but it is recognising that the shrink-wrapped market is never going to be a major target for voice recognition. Rather, it is focusing on areas that make good business sense and also give these still-young technologies a better chance to shine. Even then, it will be a while before voice recognition manages a reasonable interpretation of the lyrics of Mr Osbourne’s namesake. Not even IBM can do everything.
(First published 22 June 2000)