Natural Language Generation for a Speech Prosthesis
Ivan Sag, Herbert Clark, Ann Copestake, Dan Flickinger, Rob Malouf,
John Carroll
CONTACT INFORMATION
Ann Copestake
CSLI, Ventura Hall
Stanford University
Stanford, CA 94305-4115
Phone: (415) 725-2312
Fax: (415) 725-2166
Email: aac@csli.stanford.edu
This is an NSF-funded project (IRI-9612682) that began in March 1997. It is
part of the LinGO project at CSLI and has close ties with the Archimedes
project.
See also Some background on AAC and NLP.
Project summary
This project is developing a novel approach to natural language generation,
applying it to computer-aided text and speech generation for people with
physical disabilities. Many people who cannot speak because of physical
disability utilize text-to-speech generators as prosthetic devices. However,
users of speech prostheses often have more general loss of motor control, and
despite aids such as word prediction, text entry is slow and difficult. For
typical users, current speech prostheses have output rates less than a tenth of
the speed of normal speech. This prevents natural social conversation, since
it completely disrupts the usual processes of turn-taking, and can lead to
negative effects on the listener's attitude to the prosthesis user. The main
focus of this research is the investigation of techniques which can improve
rates sufficiently for more natural conversation to be possible, without
sacrificing flexibility of content. This new approach employs a combination of
a wide-coverage grammar, corpus-based word frequency data, and conversational
templates. Applied to speech prosthesis, it enables the production of full
sentences from minimal user input in a context-sensitive way. The approach can
also be applied more generally to the efficient production of formulaic text,
such as the structured reports widely used in business and government. It is
also useful in computer-aided language learning, both for people who are not
fully literate and for those whose first language is not English.
So far most of our work has fallen into the following
categories:
- Expansion and refinement of
the existing English Resource Grammar. This includes improving the
coverage of constructions which occur in conversations,
further developing the semantic representation and enlarging the lexicon
(see e.g., Bender and Flickinger, 1999; Smith, 1999;
Bouma, Flickinger and van Eynde, in press;
Copestake et al, 1997).
This work has been partly supported by
the German Verbmobil machine translation project.
- Implementation of a generation algorithm which works
with the English Resource Grammar. This has resulted in a novel
algorithm which is generally applicable to unification-based grammars
and has improvements in efficiency/flexibility compared
to previous work. See Carroll et al (1999).
- Studies of dialogue, including an in-depth study of data
from two people with ALS (Lou Gehrig's disease) to get a better
understanding of their communication needs
(Copestake and Flickinger, 1998). We have also
started a detailed analysis of some
dialogues between non-AAC speakers in order to begin to
develop a computationally-tractable version of an existing
formal theory of discourse (Copestake and Lascarides, 1998).
- Development and evaluation of software for prediction of
word completions, and development of an initial system
for prediction of closed-class
words (Copestake, 1996, 1997; Copestake and Flickinger, 1999).
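To make the idea of word-completion prediction concrete, here is a minimal sketch in Python. It is not the project's software: it assumes the simplest possible model, in which candidate completions for a typed prefix are ranked by how often each word occurs in a corpus. The function names and the toy corpus are hypothetical, chosen only for illustration.

```python
from collections import Counter


def build_frequencies(corpus_text):
    """Count lower-cased, whitespace-separated word tokens in a corpus."""
    return Counter(corpus_text.lower().split())


def predict_completions(prefix, frequencies, n=3):
    """Return up to n words that start with prefix, most frequent first."""
    prefix = prefix.lower()
    candidates = [(w, c) for w, c in frequencies.items() if w.startswith(prefix)]
    # Sort by descending count, then alphabetically so ties are deterministic.
    candidates.sort(key=lambda wc: (-wc[1], wc[0]))
    return [w for w, _ in candidates[:n]]


# Toy corpus (an assumption for illustration, not project data).
toy_corpus = "the cat sat on the mat and then the cat ran to the other mat"
freqs = build_frequencies(toy_corpus)
print(predict_completions("th", freqs))
```

A user who types "th" would be offered "the" before "then", because "the" is more frequent in the toy corpus. A realistic predictor would also condition on the preceding words and on the conversational context, which is where the grammar and templates described in the project summary come in.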
In the course of our research on grammars and generation,
we have greatly improved our grammar development environment
(the LKB system; see Copestake (1999)
and Lascarides and Copestake (1999)). We have made the LKB system
generally available to researchers on the Web:
http://www-csli.stanford.edu/~aac/lkb.html
Publications
Most of these papers are downloadable.
If you have problems, please look
here for some instructions and suggestions.
-
Emily Bender and Dan Flickinger.
Peripheral constructions and core phenomena.
In Gert Webelhuth, Andreas Kathol, and Jean-Pierre Koenig (eds.),
Lexical and Constructional Aspects of Linguistic Explanation,
CSLI Publications, Stanford, 1999.
-
Gosse Bouma, Dan Flickinger, and Frank van Eynde.
Constraint-based lexicons. In
Frank van Eynde (ed.),
Lexicon Development for Speech and Language Processing. ELSNET,
Leuven, in press.
-
John Carroll, Ann Copestake, Dan Flickinger and Victor Poznanski.
An Efficient Chart Generator for (Semi-)Lexicalist Grammars.
Proceedings of the 7th European Workshop on Natural Language
Generation (EWNLG'99), Toulouse, 1999.
-
Ann Copestake.
Applying Natural Language Processing Techniques to Speech Prostheses.
In Working Notes of the 1996 AAAI Fall Symposium on Developing
Assistive Technology for People with Disabilities.
-
Ann Copestake.
Augmented and alternative NLP techniques
for augmentative and alternative communication.
Proceedings of
the ACL workshop on Natural Language Processing for
Communication Aids, Madrid, 1997.
-
Ann Copestake.
The (new) LKB system.
Ms., CSLI, 1999.
http://www-csli.stanford.edu/~aac/doc5-2.pdf
-
Ann Copestake, Dan Flickinger, Ivan Sag and Carl Pollard.
Minimal Recursion Semantics: An Introduction.
Ms., CSLI, 1999.
-
Ann Copestake and Dan Flickinger.
Enriched language models for flexible generation in AAC systems.
Technology and Persons with Disabilities Conference (CSUN-98).
Los Angeles, CA, 1998.
-
Ann Copestake and Dan Flickinger.
Evaluation of NLP technology for AAC using logged data.
In Filip Loncke, John Clibbens, Helen Arvidson and Lyle Lloyd (eds.),
Augmentative and Alternative Communication: new directions
in research and practice, 123-132.
Whurr Publishers, London, 1999.
-
Ann Copestake and Alex Lascarides.
Integrating symbolic and statistical representations: the
lexicon-pragmatics interface.
To appear in the Proceedings of
the ACL, Madrid, 1997.
The relationship of this paper to the project is slightly indirect,
but the first part of it
illustrates the sort of combination of statistical and symbolic
techniques which we are developing.
-
Ann Copestake and Alex Lascarides.
Resolving Underspecified Values with Discourse Information.
Paper presented at the workshop on Models of
underspecification and the representation of meaning, Bad Teinach,
Germany, 1998.
-
Alex Lascarides and Ann Copestake.
Default representation in constraint-based frameworks.
Computational Linguistics, 25:1, 55-105, 1999.
-
Jeffrey D. Smith.
English Number Names in HPSG.
In Gert Webelhuth, Andreas Kathol, and Jean-Pierre Koenig (eds.),
Lexical and Constructional Aspects of Linguistic Explanation,
CSLI Publications, Stanford, 1999.
Workshops
The project was involved in the organisation of the 1997 Workshop
on NLP for communication aids. Copies of the workshop proceedings
should still be available through the
Association for Computational Linguistics (see publications order form).
Websites of some other groups working on AAC
- Applied Science and Engineering Laboratories
- ACSD (University of Dundee)
This material is based upon work supported by the National Science Foundation
under Grant No. IRI-9612682. Any opinions, findings and conclusions or
recommendations expressed in this material are those of the author(s) and do
not necessarily reflect the views of the National Science Foundation (NSF).
Ann Copestake
aac@csli.stanford.edu
Created: June 12, 1997
Last updated: December 12, 1999