Seminar on Computational Learning and Adaptation
Knowledge-Lean Word Sense Disambiguation
Ted Pedersen
Department of Computer Science and Engineering
Southern Methodist University
Dallas, TX 75275
pedersen@seas.smu.edu
[joint work with Rebecca Bruce, also of SMU]
Natural language processing applications often require that the
meanings of ambiguous words be resolved. Automatic methods of word
sense disambiguation usually depend on the availability of
costly knowledge sources such as manually annotated text, semantic
networks, or machine readable dictionaries. This limits the
applicability of such approaches to domains where this type of
knowledge is already available. This presentation discusses several
knowledge-lean alternatives that are able to make word sense
distinctions based only on features found in the raw text surrounding
the ambiguous words. McQuitty's and Ward's agglomerative clustering
algorithms and the EM algorithm are evaluated with respect to their
disambiguation accuracy. The results show that (1) McQuitty's
algorithm is more accurate when the underlying sense distribution is
very skewed, while the EM algorithm is more accurate given a somewhat
balanced sense distribution, and (2) features that occur within one or
two positions of the ambiguous word may be sufficient to attain
reasonable levels of disambiguation accuracy.
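To make the idea of clustering on nearby raw-text features concrete, here is a minimal Python sketch. It is not the speakers' implementation. It assumes position-tagged words within a small window as features and a Jaccard dissimilarity between contexts, neither of which is stated in the abstract. The merge step uses the update rule commonly associated with McQuitty's method (WPGMA), in which the distance from a merged cluster to any other cluster is the plain average of the two old distances. The names window_features, dissimilarity, and mcquitty_cluster are hypothetical.

```python
from itertools import combinations


def window_features(tokens, target, width=2):
    """Collect (offset, word) features within `width` positions of each
    occurrence of `target`. Returns one feature set per occurrence."""
    contexts = []
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        feats = set()
        for offset in range(-width, width + 1):
            j = i + offset
            if offset != 0 and 0 <= j < len(tokens):
                feats.add((offset, tokens[j]))
        contexts.append(feats)
    return contexts


def dissimilarity(a, b):
    """Jaccard distance between two feature sets (an assumed measure)."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def mcquitty_cluster(contexts, n_clusters):
    """Agglomerative clustering with the WPGMA (McQuitty) update rule.
    Returns sorted lists of context indices, one list per cluster."""
    n_clusters = max(1, n_clusters)
    clusters = {i: [i] for i in range(len(contexts))}
    dist = {
        frozenset((i, j)): dissimilarity(contexts[i], contexts[j])
        for i, j in combinations(clusters, 2)
    }
    next_id = len(contexts)
    while len(clusters) > n_clusters:
        # Merge the closest pair of clusters.
        pair = min(dist, key=dist.get)
        a, b = tuple(pair)
        # New distance to every other cluster: simple average of the two
        # old distances, regardless of cluster sizes.
        updated = {
            k: (dist[frozenset((a, k))] + dist[frozenset((b, k))]) / 2
            for k in clusters
            if k not in pair
        }
        dist = {p: d for p, d in dist.items() if a not in p and b not in p}
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        for k, d in updated.items():
            dist[frozenset((next_id, k))] = d
        next_id += 1
    return sorted(sorted(members) for members in clusters.values())


# Toy corpus (invented for illustration): four occurrences of "bank".
tokens = (
    "she sat on the river bank near the water . "
    "he sat on the river bank near the trees . "
    "the central bank raised rates . "
    "the central bank raised interest ."
).split()

contexts = window_features(tokens, "bank", width=2)
groups = mcquitty_cluster(contexts, n_clusters=2)
print(groups)  # [[0, 1], [2, 3]]
```

On this toy input the two "river bank" occurrences fall into one cluster and the two "central bank" occurrences into the other, using only words within two positions of the target. Real data would need many more instances per word, and the number of clusters would have to be chosen or estimated.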
Date: Thurs., February 5; Time: 4:15-5:30PM; Place: Gates 100
The goal of this seminar is to increase
communication among local researchers with interests in computational
approaches to learning and adaptation. If you would like to be added
to (or removed from) the mailing list, or if you are interested in
giving a talk in the seminar, please send email to
iba@isle.org.