Voice Recognition Part 1 1.0
A voice in the wilderness – Part 1
This is an attempt to dictate a document using voice recognition. I am doing this slightly differently to how I would normally. For a start, I am also walking the dog. Therefore, it is not possible to look at my computer in the normal way, so I have put it in my rucksack. This makes it impossible to see the screen, so I have had to resort to other means. I am viewing my laptop screen via a wireless network connection from my Dell Axim PDA, using a VNC (Virtual Network Computing) software client to create a remote display of the laptop screen on the PDA. When I require some mouse control, I have a Hand Track portable trackerball from Trust, or I can use the touch screen of the PDA to control where I am on the laptop screen. The voice recognition software is Dragon NaturallySpeaking XP, running on Windows 98.
And guess what? It all works – well, mostly.
This is by no means a perfect solution. The laptop is one of the first Sony PictureBooks, running only a 400 MHz processor with 128 Meg of RAM and Windows 98. All the same, it would appear to be adequate for the job. I would rather not have to walk around staring at a PDA, but it isn’t that much trouble; the occasional glance is enough. The particular microphone I am using works well, as long as there is no wind. I confess that for this particular, unplanned test I am not using a noise-reducing headset but a cheaper Plantronics model that clips to my ear. For some reason, when the breeze picks up, the recognition starts to favour the words “inward”, “wooden” and “women”. Why exactly this is, I leave to your imagination, but I do know that I have achieved better results with a noise-reducing model.
Not all of us would want to be staggering along with a laptop in our backpack in order to dictate an article, but this is clearly a possibility, and it does the job. It wasn’t easy to set things up – getting a peer-to-peer network between two wireless cards took an age (until I found an undocumented checkbox), and then there was the mucking around with the temperamental VNC client to make the screen viewable on the PDA. Everything came together in the end, but it wasn’t a job for the faint-hearted.
Perhaps what all of this illustrates is the power of integration, or the lack of it with mainstream vendors. If I could get things set up with old technology, why exactly have the big IT companies been unable to bring such capabilities to market? While Microsoft and Intel still struggle to deliver the perfect tablet PC with integrated voice recognition, an old PC with an old operating system and an old version of a software package proved perfectly capable. Equally, while network operators and equipment vendors try to tackle the concept of “mobility”, turning it into some distant target that will make a great deal of money for whoever can crack the code, they have missed the point. For the past five years, there have existed opportunities to mobilize the masses, and they didn’t require multi-billion, high-bandwidth infrastructures. Not everyone is going to want to use voice recognition, but let’s face it – the idea of people walking about chattering into space is no longer as unnerving as it was. And what if – just imagine – voice recognition turns out to be the missing piece in the entire mobility puzzle? Not that we should all be lugging laptops around, but many of us are doing this anyway.
Ultimately, if it all boils down to integration, the biggest problem is that nobody is doing the integrating. There are lots of options out there – IBM has a version of its own recognition package, ViaVoice, that runs on Linux, so there would be nothing to stop someone porting it to a Sharp Zaurus PDA (though, truth be told, users of ViaVoice in general have met with varying levels of success). The PDA I have in my hand has a processor as powerful as the one in the laptop in my backpack, at least if the clock speed is anything to go by. Perhaps a smaller laptop (there are some great ones available in Japan), with a Bluetooth-integrated remote screen and microphone, rather than VNC over WiFi? Great theory, but as anyone who’s tried to connect a Bluetooth headset to a computer will tell you, it just ain’t happening at the moment. There are lots of options, but each has to be tried and tested. Even if things did work as they should, the mass market of punters won’t be spending the time using computer equipment like Lego sets, and nor should they have to.
Perhaps things have been moving too fast for even the vendors to stop and think. In our struggle to look for the latest and greatest gadgets (and I confess, I have reverted from my new Nokia 6600 phone to my old 6310i because it was better at the basic job of making calls), it is possible to take our eyes off the ball. Or perhaps – but surely not – there is something more insidious going on here, and the big guns don’t want us to have such capabilities just yet? A bit like dodgy accounting practices, maybe they prefer to spread out innovations over a number of years?
Before the conspiracy theorists pick this up and run with it, they should recognize that the truth is a little more mundane – driven by fear and greed, even the biggest companies are still insisting on following technological rainbows rather than making existing products work together as they should. Networking with Bluetooth is a good example – rather than trying to fix existing “standards”, they are already pursuing the next generation. Ultra Wide Band (UWB) will begin to appear next year (100 Mb/s bandwidth to start, going to 400 Mb/s), not to mention the short-distance WiFi version that’s just been announced – hopefully somebody will treat the issue of compatibility at the outset, rather than leaving it to the consumer to fix yet again.
That’s not to say that new developments won’t be very welcome. Part two of this article considers how to start a voice recognition revolution, if only the price were right. Meanwhile, as I walk along watching the sunset, my faithful mutt off in some bushes, I think to myself that this has been, without doubt, one of the most enjoyable experiences I have ever had writing an article. If this is the future of portable computing, I can’t wait.