Justine Cassell
Media Lab, MIT
Mobile networks and embedded processors increasingly allow computation to suffuse all of the spaces in which we work and play. These smart environments and intelligent rooms will put at our disposal a vastly expanded inventory of information, without requiring us to learn special command languages to access data. The designers of such "invisible computers" describe them as ways for people to interact with computation "as they interact with another person".
In this talk, however, I will agree with Harry Potter that one should "never trust anything that can think for itself, if you can't see where it keeps its brain". I'll argue that humans need to locate intelligence, and that this need poses problems for the invisible computer. Bodies are the best possible example of located intelligence, of course. When people talk to one another face-to-face to accomplish real-world tasks, they recruit and rely on a range of behaviors, complementing strings of words with appropriate intonation and movements of the eyes, face, hands and body, and they make extensive use of the real-world environment around them, in a joint effort to maintain a shared understanding of one another's contributions to the conversation and to the ongoing task.
In integrating these properties and insights about human conversation into the design of human-computer interfaces, I have developed a computational model called the FMBT model (pronounced "fembot") with several key properties:
(a) the system-internal representation of the world and of information is modality-free, but can be conveyed via any one of several modalities (speech, intonation, posture, hand gesture, facial display);
(b) the functions of the system are modality-free, but can be realized in any one of a number of different surface behaviors in a number of different modalities;
(c) the representations of conversation are not all symbolic, as cultural and social conventions cannot all be captured in logical form;
(d) co-occurrences of surface-level behaviors carry meaning over and above that carried by each of the constituent behaviors;
(e) there are syntactic and semantic/pragmatic rules to describe the interrelationship between different modalities.
I will demonstrate the FMBT model with a series of interactive systems my students and I have implemented, including embodied conversational agents, online avatars, story listening systems for children, and some new work on embodied behaviors to evoke user trust, and on "shared reality" -- a paradigm in which human and computer share a real physical space within which to make hand gestures, facial displays and body movements, and share real physical objects that can be passed back and forth between the real and virtual world.
But, at a more fundamental level, I will claim that neither embodied systems nor invisible computers will ever succeed unless we understand the "affordances" of the body -- that is, how the body works in face-to-face dialogue, in situating intelligence.