Author: | J. Kelly Robison |
Title: | Via Voice 98 |
Publication info: | Ann Arbor, MI: MPublishing, University of Michigan Library August 2000 |
Rights/Permissions: |
This work is protected by copyright and may be linked to without seeking permission. Permission must be received for subsequent distribution in print or electronically. Please contact [email protected] for more information. |
Source: | Via Voice 98 J. Kelly Robison vol. 3, no. 2, August 2000 |
Article Type: | Software Review |
URL: | http://hdl.handle.net/2027/spo.3310410.0003.218 |
SOFTWARE REVIEWS
Editor's Note
Continuing the trend of reviewing software on a thematic basis, we have included in this issue two reviews of voice recognition software. For avid writers, as historians generally are, voice recognition software promises to be a godsend. Imagine, if you will, the ability to instantly convert what one says into text that can be saved and later edited. The writing of lectures, research notes, articles, and even book-length manuscripts would become simpler, as one step in the writing process, the actual writing by placing finger to keyboard, would no longer be necessary. Voice recognition technology is still fairly new. The reviews of Dragon NaturallySpeaking and IBM's ViaVoice 98 provide an assessment of whether or not the average historian should begin to think about using this technology.
J. Kelly Robison
Daniel Pfiefer, DePauw University
ViaVoice 98, IBM, Inc.
J. Kelly Robison, Martin Luther University Halle-Wittenberg
PC system requirements: Intel® Pentium® 166 MHz with MMX™ and 256K L2 cache (or equivalent), including AMD-K6® 200 MHz or AMD-K6 with 3DNow!™, each with 256K L2 cache. 48MB RAM and 260MB of available hard disk space. Quad-speed CD-ROM drive or faster. Display mode set to 256 colors or higher (recommended). Windows 95/98 compatible 16-bit sound card with a microphone input jack and good recording capability. External speakers.
Probably the only thing better for historians than a computer that converts one's speech into text would be a direct connection to the brain that would convert thoughts into text. Perhaps today's voice-to-text conversion programs should get a bit better first; then the next step could be examined. IBM's ViaVoice 98 is a far cry from Star Trek's voice-activated computer. As I see it, there are several significant problems with this program, though these problems probably exist in other voice recognition programs as well. Two of them may work themselves out over time, but the effort needed to reach that point is, for many users, more than likely too great.
Setup is a simple yet time-consuming affair. Actually getting the program onto the hard drive of the computer is easy enough, and anyone who has previously installed a program (successfully) should have no problems with ViaVoice. Once the program is on the hard disk and the system rebooted, a further set of setup tasks is required to make sure that the microphone and speakers are at the required levels for recording and reproducing sound. Again, these are fairly simple steps. This is where the first problem comes into play. Unless one has a very sensitive (and therefore expensive) microphone, the microphone (and therefore the program) will not pick up one's voice to be converted to text. The alternative to an expensive desktop microphone is a headset. The problem here is one of comfort. Many people, myself included, think and talk best while wandering around, pacing if you will. The headset, being tied to the computer via its cable, prevents this movement. The second problem is that headsets are not the most comfortable of devices. The feeling of being encumbered might be one that the user becomes comfortable with over time, but at first it is a great distraction.
The third, longest, and most important step in the setup procedure is the voice recognition process. To "enroll," to use IBM's term for it, the user must spend an hour or more with the computer (headset over ears) reciting a series of prescribed sentences that the program then uses to recognize how the user says words and sounds. ViaVoice includes several levels of enrollment, each of which, according to the theory of the program, enables the software to better understand the way the user speaks. I enrolled only at the first level, and perhaps my experience with the software (and hence this review) would have been better had I gone through the complete enrollment. But I think that other historians using this software would become frustrated, as I was, sitting at the computer for hours reciting sentences that ViaVoice put on the screen.
Once enrollment is complete, or at least as complete as one desires to make it, the user can begin to actually use ViaVoice. This is when frustration really begins to set in. Having gone through the lengthy enrollment process, one expects the program to respond to voice commands. This it does, though not always and not as one might expect. I believe that for historians the benefit of voice recognition software would be the ability to think aloud and have those thoughts become text that could be saved to disk, edited, reworked, and finally published or given as a lecture or paper. ViaVoice works with Microsoft Word, but does not work with WordPerfect. One could devise all sorts of scenarios involving Bill Gates and the executives at IBM, but I think this was simply a business decision on the part of IBM. So, the user must have Word installed. I first attempted to write this review using only the microphone, ViaVoice, and Word, but gave up after the first paragraph. Two problems that I recognized quickly made me decide to simply type what I was thinking. First, and this is where a higher level of enrollment may have helped, ViaVoice did not accurately convert what I was saying to text. In fact, when I coughed, the Edit menu dropped down. The program, at least when used with a word processor, types what it thinks the user is saying. This is probably a problem that corrects itself upon enrolling at a higher level and/or using the program more so that it "learns" how one speaks.
The second problem was psychological rather than a problem with the program itself. As a teacher I have stood in front of up to two hundred students at a time and delivered lectures, conducted discussions, and answered questions. Granted, public speaking is something one becomes more comfortable with over time (or finds different employment), but I was rarely at a loss for words while in front of an audience. The same is not true of being hooked up to a computer with a gadget around my head and a program awaiting instructions. I could not think in this circumstance. When I said something, I then thought of a better way of saying it or decided that this particular phrase was not what I really wanted. In between were pauses, and ViaVoice does not like pauses. It repeated on the screen the most recent phrase I had said. But again, this effect is psychological and one which some people might be able to cope with or adapt to. It also might go away on its own as one becomes more comfortable with using the program. I wonder, however, how many historians will simply put the headset away and go back to typing?
J. Kelly Robison