Author:	Bradford Lee Eden
Title:	Clifford Nass and Scott Brave's Wired for Speech: How Voice Activates and Advances the Human-Computer Relationship
Publication info:	Ann Arbor, MI: MPublishing, University of Michigan Library April 2006
Rights/Permissions:	This work is protected by copyright and may be linked to without seeking permission. Permission must be received for subsequent distribution in print or electronically. Please contact mpub-help@umich.edu for more information.
Source:	Clifford Nass and Scott Brave's Wired for Speech: How Voice Activates and Advances the Human-Computer Relationship Bradford Lee Eden vol. 9, no. 1, April 2006
Article Type:	Book Review
URL:	http://hdl.handle.net/2027/spo.3310410.0009.106

Wired for Speech: How Voice Activates and Advances the Human-Computer Relationship

Bradford Lee Eden, Ph.D.

University of Nevada, Las Vegas Libraries

Nass, Clifford and Scott Brave. Wired for Speech: How Voice Activates and Advances the Human-Computer Relationship. Cambridge, MA: MIT Press, 2005. xvii + 96 p. ISBN 0-262-14092-6. $32.50.

This book is the culmination of ten years of research by the authors on the psychology and design of voice interfaces. Voice-interface technology had numerous problems before 2000, when the open source CSLU Toolkit appeared on the scene and helped address some of them. Now, with voice extensible markup language (VXML), as well as compelling experiments and designs from academia and private enterprise, much more is known about computer-activated voice products and services, and much more can be explored.

Our world is now populated with interfaces that talk and listen to us. Almost every major business is now manned by a computer-maintained voice interface that offers choices and options to the caller. For many, these interfaces are frustrating and unfriendly, as they often keep callers from talking to a real person, and they cannot respond to emotions or problems directly. In this book, the authors present new theories and ideas from actual user testing and courses showing how people interact with technology-based voices.

In the first chapter, the authors discuss the importance and background of sound and speech as the basis for human interaction and communication. In Chapter 2, they present evidence on whether computer voices should sound male or female. Chapter 3 contains information related to gender stereotyping of voices, even in computer-generated voice technology. Moving from stereotyping to personality, Chapter 4 examines how humans relate to similar and dissimilar voice personalities from computers. Chapter 5 then shows how personalities come to be mixed in voice interfaces, and why confusion can result in inconsistencies between users and voice interfaces. The topics of accents, ethnicity, and race in voice interfaces are dealt with in Chapter 6. In Chapter 7, user emotion and voice-generated emotion are discussed, especially in relation to talking cars, while Chapter 8 further elaborates on the mixing of voice and content emotion in computer interfaces. Chapter 9 deals with synthetic voices, including whether users interact differently with them and can tell the difference, as well as with multiple voices. Whether users should be able to choose the voice of their interface is examined there as well. The concept of “I-ness,” and whether synthetic and recorded voices should actually sound like humans, is brought up in Chapter 10. Chapter 11 then moves into the issues related to synthetic and recorded voices and faces. The pros and cons of mixing recorded and synthetic voices are discussed in Chapter 12, while some interesting communication contexts from experimentation and user study are explored in Chapter 13. Misrecognition and blame are examined in Chapter 14. Finally, in the conclusion, the authors summarize their findings and shift the focus of the book from listening to and talking at voice interfaces to speaking with them.

What I especially liked about this book is that the authors tell the reader at the outset that the footnotes (of which there are many) do not provide additional information beyond the text. They are there to acknowledge references, as well as to provide statistical information such as data tables and questionnaires for further study. The text already summarizes this statistical information, so the authors include the full statistics only in the footnotes, for those inclined to examine them. This book provides some very interesting information on the current state of voice-enabled technologies, their relation and significance to human-computer interaction, and where this technology may be heading.