| HOT-FROM-THE-OVEN MEALS: Keep hot food HOT; warm isn't good enough. Set the oven temperature at 140 degrees or hotter. Use a meat thermometer. And cover with foil to keep food moist. Eat within two hours. | ``Change is always happening,'' said the ebullient trumpeter, whose words tumble out almost as fast as notes from his trumpet. ``That's one of the wonderful things about jazz music.'' For many jazz fans, Ferguson is one of the wonderful things about jazz music. |
| | eat | hot | jazz | meat | trumpet |
| Music | | | 3 | | 1 |
| Food | 1 | 2 | | 1 | |
We proceed through the whole corpus of documents like this, for each
word building up a "number signature" which tells us how often that
word appeared in the presence of each content-bearing word. Many words,
perhaps most, appear in many contexts, and some (particularly
pronouns and number words) can appear in almost any context. Other
words - like jazz and meat - each turn up regularly in the same kinds of context.
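As a rough illustration, here is a minimal Python sketch of this counting process. It assumes that "in the presence of" means "in the same document". The list of content-bearing words, the function names and the two toy documents are made up for the example; they are not taken from Infomap itself.

```python
import re
from collections import Counter, defaultdict

# Hypothetical choice of content-bearing words (the columns of the table).
CONTENT_WORDS = ["eat", "hot", "jazz", "meat", "trumpet"]


def tokenize(text):
    """Lower-case the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def build_signatures(documents, content_words=CONTENT_WORDS):
    """Map each word to counts of the content words seen in the same document.

    Assumption: the "context" of a word is the whole document it occurs in.
    """
    signatures = defaultdict(Counter)
    for doc in documents:
        tokens = tokenize(doc)
        # How often each content-bearing word occurs in this document.
        content_counts = Counter(t for t in tokens if t in content_words)
        # Every distinct word in the document co-occurs with those counts.
        for word in set(tokens):
            signatures[word].update(content_counts)
    return signatures


def as_vector(signatures, word, content_words=CONTENT_WORDS):
    """Return the word's number signature as a list, one entry per content word."""
    counts = signatures.get(word, Counter())
    return [counts[c] for c in content_words]


# Two toy documents, invented for this sketch.
DOCS = [
    "Jazz music and more jazz music with a trumpet",
    "Eat hot food with a meat thermometer while it is hot",
]

signatures = build_signatures(DOCS)
print(as_vector(signatures, "music"))  # [0, 0, 2, 0, 1]
print(as_vector(signatures, "food"))   # [1, 2, 0, 1, 0]
print(as_vector(signatures, "with"))   # [1, 2, 2, 1, 1]
```

Note how "with", which occurs in both documents, picks up counts in every column, while "music" and "food" each stay within their own context.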
The huge table or matrix built by this process gives us a
profile of the way different words are used across the corpus of
texts. Each word is given a list of numbers, and this list of numbers
is called a word-vector. A good way to think of these numbers
is as `meaning-coordinates', just as latitude and longitude associate
spatial coordinates with points on the surface of the earth. If you
find such abstract ideas interesting, you might want to read an
introduction to vectors which I am
gradually extending. (I would be very glad of any
criticism of it - please feel free to
send your comments.)
One simple application of this model is word-association.
You can type in a word - or a longer query - and Infomap will search
for new words with similar meaning-coordinates. This allows you to
select the meanings you do intend to use - and reject the ones you
don't. This is a powerful tool for resolving ambiguity and narrowing
down a search until your query describes exactly the meaning you
desire.
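To make the idea of "similar meaning-coordinates" concrete, here is a small Python sketch of word-association over a table of word-vectors. It assumes that similarity is measured by the cosine of the angle between two vectors, which is a common choice but is not stated above as the measure Infomap uses. The function names are hypothetical, and apart from the "music" and "food" rows the vectors are invented for the example.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_words(query, vectors, k=3):
    """Return the k words whose vectors are most similar to the query word's."""
    q = vectors[query]
    scored = [(cosine(q, v), w) for w, v in vectors.items() if w != query]
    # Highest similarity first; ties are broken alphabetically.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [w for _, w in scored[:k]]


# Columns: eat, hot, jazz, meat, trumpet.
# "music" and "food" follow the table; the other rows are made up.
VECTORS = {
    "music": [0, 0, 3, 0, 1],
    "food": [1, 2, 0, 1, 0],
    "band": [0, 0, 2, 0, 2],
    "oven": [1, 3, 0, 1, 0],
    "the": [2, 2, 2, 2, 2],
}

print(nearest_words("music", VECTORS, k=2))  # ['band', 'the']
print(nearest_words("food", VECTORS, k=1))   # ['oven']
```

A word like "the", which appears everywhere, is moderately similar to everything, so it ranks below the words that share a specific context with the query.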