Bi-annual workshop bringing together linguists, computer scientists, psychologists, and philosophers to discuss problems at the intersection of semantics, pragmatics, and cognition.
Embeddings 2.0: The Lexicon as Memory, animated by Hinrich Schütze and Jay McClelland
For more information about this workgroup, see: http://cis.lmu.de/schuetze/wgmemory.pdf
In NLP, we need knowledge about words. Today this knowledge often comes from embeddings learned by word2vec, fastText, etc., but these embeddings are limited. We will explore models of memory and learning from cognitive science and machine learning to come up with a foundation for better word representations.
Exploring grounded and distributional learning of language, animated by Noah Goodman and Mike Frank
For more information, see: http://web.stanford.edu/~azaenen/MIC/MikeandNoah.pdf
Learning language from text corpora ('distributional' learning) gives easy access to very large data sets. On the other hand, learning from situated, or 'grounded', utterances seems closer to what language is actually used for and closer to the human language acquisition problem. A lot of progress has been made recently on both approaches. Here we will try to understand the relationship between them and how they can be combined. For instance, we will ask: Do (Bayesian) theories of language use prescribe the relationship between grounded and corpus data? Can representations (deep-)learned from corpora help to learn grounded meanings?
Workgroup 3: Tuesday/Wednesday
Coping with polysemy, animated by Louise McNally
For more information about this workgroup, see: http://web.stanford.edu/~azaenen/MIC/wgpolysemy.pdf
Although distributional/distributed systems have proven fairly good at handling the resolution of polysemy in context, one might wonder whether they could be improved by incorporating a key premise of the Rational Speech Act (RSA) model of utterance meaning: namely, that speakers and hearers carry out probabilistic reasoning about the meaning conveyed by a given expression in context under specific assumptions about a generally limited set of alternative expressions that could have been chosen in that context. The goal of this workgroup is to consider how distributed representations might be dynamically modulated with information about the contextually salient alternatives that conversational agents are considering, or, alternatively, how distributed representations could be exploited specifically in interaction with the utterance options component of an RSA model.
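To make the RSA premise concrete, here is a minimal sketch of the standard recursion (literal listener, speaker, pragmatic listener) over a small set of alternative utterances. The toy lexicon, the meaning labels, the uniform prior, and the function names are all assumptions chosen for illustration; they are not taken from the workgroup materials, and a real system would replace the hand-written truth values with scores derived from distributed representations.

```python
# Hypothetical toy lexicon: how well each utterance fits each candidate meaning
# (1.0 = literally true, 0.0 = literally false). Illustrative values only.
LEXICON = {
    "glasses": {"glasses_only": 1.0, "glasses_and_hat": 1.0},
    "hat": {"glasses_only": 0.0, "glasses_and_hat": 1.0},
}
MEANINGS = ["glasses_only", "glasses_and_hat"]
PRIOR = {m: 0.5 for m in MEANINGS}  # assumed uniform prior over meanings


def normalize(scores):
    """Rescale non-negative scores so that they sum to 1."""
    total = sum(scores.values())
    return {k: v / total for k, v in scores.items()}


def literal_listener(utterance):
    """L0(m | u): proportional to literal fit times the prior."""
    return normalize({m: LEXICON[utterance][m] * PRIOR[m] for m in MEANINGS})


def speaker(meaning, alpha=1.0):
    """S1(u | m): proportional to L0(m | u) ** alpha over the alternative utterances."""
    return normalize({u: literal_listener(u)[meaning] ** alpha for u in LEXICON})


def pragmatic_listener(utterance, alpha=1.0):
    """L1(m | u): proportional to S1(u | m) times the prior."""
    return normalize(
        {m: speaker(m, alpha)[utterance] * PRIOR[m] for m in MEANINGS}
    )


if __name__ == "__main__":
    print(pragmatic_listener("glasses"))
    print(pragmatic_listener("hat"))
```

In this toy setting the pragmatic listener hearing "glasses" shifts probability toward `glasses_only`, because a speaker who meant `glasses_and_hat` had the more informative alternative "hat" available. The set of alternatives (`LEXICON` keys) is exactly the component the workgroup proposes to connect with distributed representations.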
Neural networks and Textual Inference, animated by Lauri Karttunen and Ignacio Cases
For more information about this workgroup, see: http://web.stanford.edu/~azaenen/MIC/inferences.pdf
Recent work by Bowman et al. (2015) and Rocktäschel et al. (2016) has shown that, given a sufficiently large data set such as the Stanford Natural Language Inference Corpus (SNLI), neural networks can match the performance of classical RTE systems (Dagan et al. 2006) that rely on NLP pipelines with many manually created components and features. The goal of this group is to explore whether neural networks can learn general properties of inference relations that are not represented in existing data sets such as SNLI and SICK, for example, that entailment is a reflexive and transitive relation and that contradiction is symmetric.
Language and Natural Reasoning
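The relational properties named for the textual inference workgroup (reflexivity and transitivity of entailment, symmetry of contradiction) can be probed directly on any model's predictions. The sketch below assumes a hypothetical `predict(premise, hypothesis)` function that returns one of the labels "entailment", "contradiction", or "neutral"; `toy_predict` and its lookup table are invented stand-ins for a trained network, with one transitivity gap left in on purpose.

```python
from itertools import permutations

# Hypothetical stand-in for a trained NLI model: a lookup table of labeled pairs.
# The pair ("a dog runs", "something moves") is deliberately left out.
TOY_LABELS = {
    ("a dog runs", "an animal runs"): "entailment",
    ("an animal runs", "something moves"): "entailment",
    ("something moves", "nothing moves"): "contradiction",
    ("nothing moves", "something moves"): "contradiction",
}


def toy_predict(premise, hypothesis):
    """Return a label for the pair; identical sentences count as entailment."""
    if premise == hypothesis:
        return "entailment"
    return TOY_LABELS.get((premise, hypothesis), "neutral")


def reflexivity_violations(predict, sentences):
    """Sentences that the model does not judge to entail themselves."""
    return [s for s in sentences if predict(s, s) != "entailment"]


def symmetry_violations(predict, sentences):
    """Pairs where a contradicts b but b is not judged to contradict a."""
    return [
        (a, b)
        for a, b in permutations(sentences, 2)
        if predict(a, b) == "contradiction" and predict(b, a) != "contradiction"
    ]


def transitivity_violations(predict, sentences):
    """Triples where a entails b and b entails c, but a is not judged to entail c."""
    return [
        (a, b, c)
        for a, b, c in permutations(sentences, 3)
        if predict(a, b) == "entailment"
        and predict(b, c) == "entailment"
        and predict(a, c) != "entailment"
    ]


if __name__ == "__main__":
    sentences = ["a dog runs", "an animal runs", "something moves", "nothing moves"]
    print(reflexivity_violations(toy_predict, sentences))
    print(symmetry_violations(toy_predict, sentences))
    print(transitivity_violations(toy_predict, sentences))
```

Passing such checks on a finite sentence set does not show that a model has learned the properties in general; it only surfaces counterexamples, which is the kind of evidence that data sets like SNLI and SICK do not directly provide.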