CSLI MONTHLY
-----------------------------------------------------------------------
November 1986                                              Vol. 2, No. 2
-----------------------------------------------------------------------
A monthly publication of The Center for the Study of Language and Information
------------------

Contents

Communication as Rational Interaction, by Philip Cohen
The Wedge: Syntax or Semantics?
Distributivity, by Craige Roberts
Representation: A Personal View, by Adrian Cussins
Symbolic Systems Program, by Helen Nissenbaum
Postdoctoral Fellowship
CSLI Publications

------------------

COMMUNICATION AS RATIONAL INTERACTION
Phil Cohen

I was asked to describe my research program with Hector Levesque (University of Toronto), and to a certain extent with Ray Perrault (who does not agree with everything that follows), in a short space for wide circulation. I agreed, realizing only later how hard a job it was. A worthwhile exercise, to be sure, but one that requires ruthless editing. For example, the first things that have to go are the usual hedges. So, since I will be overstating my case a bit, I hereby hedge a bit for the remainder of this article. A more complete exposition can be found in (Cohen and Levesque 1986, 1986a).

We take the view that language use can be productively regarded from the perspective of action. For us, this provides not just a slogan, but a program of research, namely, to identify those aspects of language use that follow from general principles of rational, cooperative interaction. Our pursuing such a research program does not mean that we believe all language use is completely and consciously thought out and planned. Far from it. Rather, just as there are grammatical, processing, and sociocultural constraints on language use, so may there be constraints imposed by the rational balance agents maintain among their beliefs, intentions, commitments, and actions.
Our goals are to discover such constraints, to develop a logical theory that incorporates them and predicts dialogue phenomena, and finally to apply them in developing algorithms for human-computer interaction in natural language.

To pursue this research program, we treat utterance events as special cases of other events that change the state of the world; utterance events change the mental states of speakers and hearers. Typically, utterance events are performed by a speaker in order to effect such changes. Moreover, they do so because they signal, or carry, (at least) the information that the speaker is in a certain mental state, such as intending the hearer to adopt a mental state. Conversations arise and proceed because of an interplay among agents' mental states, their capabilities for purposeful behavior, their cooperativeness, the content and circumstances of their utterances, and surely other factors to be elucidated. A theory of conversation based on this approach would explain dialogue coherence in terms of the mental states of the participants, how those mental states lead to communicative action, how those acts affect the mental states of the hearers, etc.

A natural avenue to travel in pursuit of some of these goals would appear to be speech act theory. After all, here is where theorists have promoted, and examined in some depth, many of the implications of treating language as action. Speech act theory was originally conceived as part of action theory. Many of Austin's insights about the nature of speech acts, felicity conditions, and modes of failure were derived from a study of noncommunicative actions. Searle (1969) repeatedly mentions that many of the conditions he attributes to various illocutionary acts (such as requests and questions) apply more generally to noncommunicative action.
However, in recent work Searle and Vanderveken (1985) (hereafter, S&V) formalize communicative acts and propose a logic in which their properties (e.g., "preparatory conditions" and "modes of achievement") are primitively stipulated, rather than derived from more basic principles of action (as S&V in fact recommend). We believe such an approach misses significant generalities. Our research shows how to derive properties of illocutionary acts from principles of rationality, and hence suggests that the theory of illocutionary acts is descriptive but not explanatory. Consider the following seemingly trivial dialogue fragment: A: "Open the door." B: "Sure." Linguistically, these utterances are uninteresting. Of course, the semantics and effects of imperatives are nontrivial (and I'll get to that), and the meaning of "Sure" is unclear. But, it seems to me that the speakers' intentions and the situation of their utterances play the crucial role in determining what has happened during the dialogue, and how what has changed can influence agents' further actions. It would be reasonable to `describe' what has happened by saying that A has performed a directive speech act (e.g., a request), and that B has performed a commissive (e.g., a promise). To see that B did, imagine B's saying "Sure" and then doing nothing. A would surely be justified in complaining, or asking for an explanation. A competence theory of communication needs to explain how an interpersonal commitment becomes established. Ours does so by explaining what effects are brought about by a speaker's uttering an imperative in a given situation, and how the uttering of "Sure" relates to those effects. These explanations will make crucial reference to intention, but not necessarily to illocutionary acts. It is tempting to read, or perhaps misread, philosophers of language as saying that illocutionary force recognition is required for successful communication. 
Austin (1962) and Strawson (1964) require "uptake" to take place. Searle and Vanderveken (Searle 1969, Searle and Vanderveken 1985) claim that illocutionary force is part of the meaning of an utterance, and the intended effect of an utterance is "understanding." Hence, because hearers are intended to understand the utterance, presumably including at least an understanding of its meaning, on one reading of their claim, the hearer is intended to recognize the utterance's illocutionary force. (NOTE: But, perhaps they mean illocutionary force `potential'. They write: "Part of the meaning of an elementary sentence is that its literal utterance in a given context constitutes the performance or attempted performance of an illocutionary act of a particular illocutionary force." (Searle and Vanderveken 1985, p. 7). The question at issue here is whether, in a hearer's understanding an utterance and knowing its meaning, the hearer recognizes (or is intended to recognize) that the specific utterance in the specific context was uttered with a specific illocutionary force.) It is so tempting to read these writers this way that many, including myself, have made this assumption. For example, computational models of dialogue (Allen 1979, Allen and Perrault 1980, Brachman et al. 1979) that my colleagues and I have developed have required the computer program to recognize which illocutionary act the user performed in order for the system to respond as intended. However, we now claim that force recognition is usually unnecessary. For example, in both of the systems mentioned above, all the inferential power of the recognition of illocutionary acts was already available from other inferential sources (Cohen and Levesque 1980). Instead, we claim that many illocutionary acts can be `defined' in terms of the speaker's and hearer's mental states, especially beliefs and intentions.
As such, what speakers and hearers need only do is to recognize the speaker's intentions (based on mutual beliefs). Contrary to other proposed theories, we do not require that those intentions include intentions that the hearer recognize precisely which illocutionary act(s) were being performed.

Although one can `label' parts of a discourse with names of illocutionary acts, illocutionary labeling does not constitute an explanation of a dialogue. Rather, the labeling itself, if reliably obtained, constitutes data to be explained by constraints on mental states and actions. That is, one would show how to derive the labelings, given their definitions, from (for example) the beliefs and intentions the participants are predicted to have given what has happened earlier in the interaction. Although hearers `may' find it heuristically useful to determine just which illocutionary act was performed, our view is that illocutionary labeling is an extra task in which conversational participants may only retrospectively be able to engage.

The stance that illocutionary acts are not primitive, and need not be explicitly recognized, is a liberating one. Once taken, it becomes apparent that many of the difficulties in applying speech act theory to discourse, or incorporating it into computer systems, stem from taking these acts too seriously---i.e., as primitives.

FORM OF THE ARGUMENT

We show that at least some illocutionary acts need not be primitive by deriving Searle's conditions on various illocutionary acts from an independently motivated theory of action. The realm of communicative action is entered following Grice (1969): by postulating a correlation between the utterance of a sentence with a certain syntactic feature (e.g., its dominant clause is an imperative) and a complex propositional attitude expressing the speaker's intention.
As a result of the speaker's uttering a sentence with that feature under certain conditions the hearer thinks it is mutually believed that the speaker has the attitude. Because of general principles governing beliefs and intentions, other consequences of the speaker's having the expressed intention can be derived. Such derivations will be used to form complex action descriptions that capture illocutionary acts in that the speaker is attempting to bring about some part of the chain of consequences by means of bringing about an antecedent. For example, the action description to be called REQUEST will capture a derivation in which a speaker attempts to make it the case that (1) the hearer forms the intention to act because (2) it is mutually believed the speaker wants him/her to act. The conditions licensing the inference from (2) to (1) can be shown to subsume those claimed by Searle (1969) to be felicity conditions. However, they have been derived here from first principles, and without the need for a primitive action of requesting. Moreover, they meet a set of adequacy criteria, which include differentiating utterance form from illocutionary force, handling the major kinds of illocutionary acts, modeling speakers' insincere performances of illocutionary acts, providing an analysis of performative utterances, showing how illocutionary acts can be performed with multiple utterances, and how multiple illocutionary acts can be simultaneously performed with one utterance, and explaining indirect speech acts. Our approach is similar to that of Bach and Harnish (1979) in its reliance on inference. A theory of rational interaction will provide the formal foundation for drawing the needed inferences. A notion of sincerity is crucial for treating deception and nonserious utterances. 
Finally, a characterization of utterance features (e.g., mood) is required in making a transition from the domain of utterance syntax and semantics, to that of utterance effects (on speakers and hearers).

There are two main steps in constructing the theory.

                   C(1)      C(2)              C(i-1)
    C: A ---> E(1) ---> E(2) ---> E(3) ---> ... ---> E(i)

    Figure 1: Actions producing gated effects

1. `Infer illocutionary point from utterance form'. The theorist derives the chains of inference needed to connect the intentions and beliefs signaled by an utterance's form with typical "illocutionary points" (Searle and Vanderveken 1985), such as getting a hearer to do some action. These derivations are based on principles of rational interaction, and are independent of theories of speech acts and communication. Specifically, (referring to Figure 1) assume actions (A) are characterized as producing certain effects E(1) when executed in circumstances C. Separately, assume the theorist has either derived or postulated relationships between effects of type E(i-1) and other effects, say of type E(i) such that if E(i-1) holds in the presence of some gating condition C(i-1), then E(i) holds as well. One can then prove that in the right circumstances, specifically those satisfying the gating conditions, doing action A makes E(i) true. (NOTE: Another way to characterize utterance effects is by applying "default logic" (Perrault 1986).)

2. `Treat illocutionary acts as attempts'. Because illocutionary acts can be performed with utterances of many different forms, we abstract away from any specific form in defining illocutionary acts. Searle (1969) points out that many communicative acts are attempts to achieve some effect. For example, requests are attempts to get (in the right way) the hearer to do some action. We will say an agent `attempts' to achieve some state of affairs E(i) if s/he does some action or sequence of actions A that s/he intends should bring about effect E(i), and believes does so.
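The gated chain of Figure 1 and the notion of an attempt can be made concrete with a small sketch. It is only an illustration under stated assumptions, not the formal theory: the names `Step`, `effects_of`, and `is_attempt`, and the particular gating conditions chosen for an imperative, are hypothetical, and the agent's intention to bring about the effect is simply taken as given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    gate: str    # gating condition C(i-1) that must hold
    effect: str  # effect E(i) that then follows from E(i-1)


def effects_of(circumstances, first_effect, chain, facts):
    """Effects of doing A, in order, given the set of conditions that hold."""
    if circumstances not in facts:
        return []                  # A yields E(1) only in circumstances C
    effects = [first_effect]
    for step in chain:
        if step.gate not in facts:
            break                  # the chain stops at the first closed gate
        effects.append(step.effect)
    return effects


def is_attempt(intended_effect, beliefs, circumstances, first_effect, chain):
    """True if, by the agent's own beliefs about which conditions hold,
    doing A in circumstances C would bring about the intended effect."""
    return intended_effect in effects_of(circumstances, first_effect, chain, beliefs)


# Hypothetical chain for an imperative such as "Open the door."
E1 = "mutual belief that speaker wants hearer to act"
CHAIN = [
    Step(gate="hearer is cooperative", effect="hearer intends to act"),
    Step(gate="hearer is able", effect="hearer acts"),
]

# What the speaker believes holds (an assumption made for the example).
speaker_beliefs = {"C", "hearer is cooperative", "hearer is able"}
print(is_attempt("hearer acts", speaker_beliefs, "C", E1, CHAIN))
```

Note that `is_attempt` consults the speaker's beliefs rather than the facts, so an utterance can count as an attempt even when a gate is in fact closed and the effect never comes about.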
The intended effect may not be an immediate consequence of the utterance act, but could be related to act A by some chain of causally related effects. Under these conditions, for A to be an attempt to bring about E(i), the agent would have to believe that the gating conditions C(i) will hold after A, and hence if s/he did A in circumstances C, E(i) would obtain. This way of treating communicative acts has many advantages. The framework clarifies the degrees of freedom available to the theorist by showing which properties of communicative acts are consequences of independently motivated elements, and which properties are stipulations. Furthermore, it shows the freedom available to linguistic communities in naming patterns of inference as illocutionary verbs. Moreover, it gives technical substance to the use made of such terms as "counts as," "felicity conditions," and "illocutionary force." However, it makes no commitment to a reasoning strategy. For example, the theorist's derivations from first principles may be encapsulated by speakers and hearers as frequently used lemmas. Moreover, speakers and hearers may not in fact believe the gating conditions hold, but may instead assume they hold and "jump" to the conclusion of the lemma. Below, I describe how the theory addresses two important kinds of phenomena. `Performatives'. We basically follow a Bach and Harnish-style analysis (Bach and Harnish 1979) in which performative utterances are treated as declarative mood utterances whose content is that the utterance event itself constitutes the performance of the mentioned illocutionary act. Because of the essential use made of the utterance event in assigning truth conditions, performative utterances are a clear case of the need for situated language use. Our ability to handle performatives is met almost entirely because illocutionary acts are defined as attempts. 
Since attempts depend on the speaker's beliefs and intentions, if a speaker sincerely says, for example, "I request you to open the door," he must believe he did the act with the requisite beliefs and intentions, and hence the utterance is a request. Institutionally based performatives work because society defines attempts by certain people in the right circumstances as successes, such as judges who say "I now pronounce you husband and wife." Finally, perlocutionary verbs, e.g., "frighten," cannot be used performatively because frightening requires success, not a mere attempt; neither the logic of rational interaction, nor institutions, make attempts to frighten into frightenings. `Multiact utterances and Multiutterance acts'. The use of inference allows both of these phenomena to be addressed. For the first, observe that there may be many other chains of inference emanating from the utterance event. Hence, an utterance may be an attempt by the speaker to achieve many different effects simultaneously, some of which may be labeled by illocutionary verbs in a given language. Multiutterance acts are a natural extension of our approach because action A that brings about the core effects may in fact be a sequence of utterance acts. This immediately allows the formalism to address problems of discourse, but specific solutions remain to be developed. (NOTE: See (Grosz and Sidner 1986) for progress on this front.) However, notice that these acts are problematic for theories requiring force recognition for each utterance. It may take five sentences, and three speaking turns, for a speaker to complete a request. On force-recognition accounts, the illocutionary force of each utterance would have to be recognized. However, such theories do not require that a hearer identify the illocutionary force of the `discourse' (here, as a request). But, that would be the most important act to be recognized. 
Moreover, if such theories did so require, they would have to provide a calculus of forces to describe how the individual ones combine to form another. Many discourse analysts have tried to give such analyses in terms of sequences of illocutionary acts and discourse "grammars." Apart from the fact that multiact utterances prevent the structure of a dialogue from being analyzed as a tree, we believe such analyses are operating at the wrong level. If illocutionary acts are definable in terms of mental states, a theory of communication will explain discourse with a logic of those attitudes and their contents. Thus, one needs to characterize how the effects of individual utterances accumulate to achieve more global intended effects. The labeling of individual utterances as the performance of specific illocutionary acts contributes nothing to an account of effect accumulation. To the extent that our analysis is on the mark, then the subject of illocutionary acts is in some sense less interesting than it has been made out to be. That is, the interest should be in the nature of rational interaction and in the kinds of reasoning (especially nonmonotonic (Kautz 1986, Perrault 1986)) that agents use to plan and to recognize the intentions and plans of others. Constraints on the use of particular illocutionary acts in conversation should follow from the underlying principles of rationality, not from a list of sequencing constraints (e.g., adjacency pairs). (NOTE: To see that this is not just a straw man, consider the following passage from Searle and Vanderveken (1985, p. 11): "But we will not get an adequate account of linguistic competence or of speech acts until we can describe the speaker's ability to produce and understand utterances (i.e., to perform and understand illocutionary acts) in `ordered speech act sequences' that constitute arguments, discussions, buying and selling, exchanging letters, making jokes, etc. ... 
The key to understanding the structure of conversations is to see that each illocutionary act creates the possibility of a finite and usually quite limited set of appropriate illocutionary acts as replies." [emphasis in original])

To make this more concrete, I shall briefly describe aspects of our approach to a theory of rational interaction that serves as the foundation for analyzing communication.

RATIONAL INTERACTION

Bratman (1986) argues that rational behavior cannot be analyzed just in terms of beliefs and desires (as many philosophers have held). A third mental state, intention, which is related in many interesting ways to beliefs and desires but is not reducible to them, is necessary. There are two justifications for this claim. First, noting that agents are resource-bounded, Bratman suggests that no agent can continually weigh his/her competing desires, and concomitant beliefs, in deciding what to do next. At some point, the agent must just `settle on' one state of affairs for which to aim. Deciding what to do establishes a limited form of `commitment'. We shall explore the consequences of such commitments.

A second reason is the need to coordinate one's future actions. Once a future act is settled on, i.e., intended, one typically decides on other future actions to take with that action as given. This ability to plan to do some act A in the future, and to base decisions on what to do subsequent to A, requires that a rational agent `not' simultaneously believe s/he will `not' do A. If s/he did, the rational agent would not be able to plan past A since s/he believes it will not be done. Without some notion of commitment, deciding what else to do would be a hopeless task.

Bratman argues that intentions play the following three functional roles:

1) `Intentions normally pose problems for the agent; the agent needs to determine a way to achieve them.'

2) `Intentions provide a "screen of admissibility" for adopting other intentions'.
Whereas desires can be inconsistent, agents do not normally adopt intentions that they believe conflict with their present and future-directed intentions.

3) `Agents "track" the success of their attempts to achieve their intentions'. Not only do agents care whether their attempts succeed, but they are disposed to replan to achieve the intended effects if earlier attempts fail.

In addition to the above functional roles, it has been argued that intending should satisfy at least the following property:

4) `Agents need not intend all the expected side effects of their intentions'.

We will develop a theory in which expected side effects are `chosen', but not intended.

Intention as a Composite Concept

We model intention as a composite concept specifying what the agent has `chosen' and how the agent is `committed' to that choice. First, consider an agent as choosing among his/her (possibly inconsistent) desires those s/he wants most. Call these chosen desires, loosely, goals. (NOTE: Chosen desires are ones that speech act theorists claim to be conveyed by illocutionary acts such as requests.) By assumption, chosen desires are consistent. We will give them a possible-world semantics, and hence the agent will have chosen a set of worlds in which the goals hold.

Next, consider an agent to have a `persistent goal' if s/he has a goal (i.e., a proposition true in all of the agent's chosen worlds) that s/he believes currently to be false, and that will continue to be chosen at least as long as certain facts hold. Persistence involves an agent's `internal' commitment over time to his/her choices. (NOTE: This is not a `social' commitment. It remains to be seen if the latter can be built out of the former.) For example, the ultimate fanatic is persistent with respect to believing his/her goal has been achieved or is impossible. The fanatical agent will only drop his/her commitment to achieving the goal if either of those circumstances hold.
Thus, we model intention as a kind of persistent goal---a persistent goal to do an action, believing one is about to do it, or achieve some state of affairs, believing one is about to achieve it. When modeled this way, intention can be shown to have Bratman's functional characteristics. Although I cannot substantiate that claim here (see (Cohen and Levesque 1986a) for details) it is instructive to see how the concept of persistence avoids one of the thornier issues for a theory of intention---closure under expected consequences. According to our analysis of goal (as a proposition true in a chosen set of worlds), what one believes to be true must be true in all one's chosen worlds. Hence, if one believes `p ⊃ q', `p ⊃ q' is true in all chosen worlds. So, if one has chosen worlds in which `p', then one has chosen worlds in which `q'.

Now, consider a case of taking a drug to cure an illness, believing that as a side effect, one will upset one's stomach. In choosing to take the drug, the agent has surely chosen stomach distress. But, the agent did not intend to upset his/her stomach. Using our analysis of intention, the agent will have adopted a persistent goal to take the drug. However, the sickening side effect is only present in the agent's chosen worlds because of a `belief'. Should the agent take a new and improved version of the drug, and not upset his/her stomach, s/he could change his/her belief about the relationship between taking the drug and its gastric effects. In such a case, stomach distress would no longer be present in the agent's chosen worlds. But, the agent would have dropped the goal of upsetting his/her stomach for reasons other than believing it was achieved or believing it was impossible. Hence, the agent was not committed to upsetting his/her stomach, and thus did not intend to upset it.
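The drug example can be traced with a toy possible-worlds model. This is a minimal sketch under simplifying assumptions, not the formalism itself: a world is just the set of atomic propositions true in it, the chosen worlds are those compatible with the agent's beliefs in which the chosen proposition holds, and all of the names below are hypothetical.

```python
from itertools import product

ATOMS = ("take_drug", "upset_stomach")


def all_worlds():
    """Every assignment of truth values to ATOMS, as a set of true atoms."""
    return [
        frozenset(atom for atom, true in zip(ATOMS, values) if true)
        for values in product([False, True], repeat=len(ATOMS))
    ]


def chosen_worlds(belief, choice):
    """Worlds compatible with the agent's belief in which the choice holds."""
    return [w for w in all_worlds() if belief(w) and choice(w)]


def is_goal(prop, worlds):
    """A goal is a proposition true in all of the agent's chosen worlds."""
    return bool(worlds) and all(prop in w for w in worlds)


def chooses_drug(w):
    return "take_drug" in w


def old_belief(w):
    # Taking the drug implies an upset stomach.
    return "take_drug" not in w or "upset_stomach" in w


def new_belief(w):
    # After the improved drug: no believed link between the two.
    return True


before = chosen_worlds(old_belief, chooses_drug)
after = chosen_worlds(new_belief, chooses_drug)

# The side effect is chosen only while the belief lasts.
print(is_goal("upset_stomach", before), is_goal("upset_stomach", after))
# Taking the drug stays a goal throughout.
print(is_goal("take_drug", before), is_goal("take_drug", after))
```

In the sketch the stomach upset leaves the chosen worlds through a change of belief alone, neither achieved nor judged impossible, which is why it fails the test for a persistent goal.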
(NOTE: If the agent were truly committed to gastric distress, for instance as his/her indicator that the drug was effective, then if his/her stomach were not upset after taking the drug, s/he would ask for a refund.) This example deals with expected consequences of one's intentions. What about other consequences? Strictly speaking, the formalism predicts that agents only intend the logical equivalences of their intentions, and in some cases intend their logical consequences, and consequences that they always believe always hold. Thus, even using a possible-worlds approach, one can develop an analysis that satisfies many desirable properties of a model of intention. I believe an approach using situation theory would tighten up the analysis a bit, so that agents could choose states of affairs (in their technical sense) rather than entire worlds. However, I would expect that much of the present analysis would remain. A useful extension of the concept of persistent goal, upon which one can define an extended concept of intention, is the expansion of the conditions under which an agent can give up his/her goal. When necessary conditions for an agent's dropping a goal include his/her having other goals (call them "supergoals"), the agent can generate a chain of goals such that if the supergoals are given up, so may the subgoals. If the conditions necessary for an agent's giving up a persistent goal include his/her believing that some `other' agent has a persistent goal, a chain of interpersonally linked goals is created. For example, if Mary requests Sam to do something and Sam agrees, Sam's goal should be persistent unless he finds out Mary no longer wants him to do the requested action (or, in the usual way, he has done the action or finds it to be impossible). Both requests and promises are analyzed in terms of such "interpersonally relativized" persistent goals. 
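The dropping conditions for fanatical and relativized persistent goals can be sketched as a small data structure. This is an illustration only: the class `PersistentGoal`, its field names, and the reduction of the agent's beliefs to boolean flags are assumptions made for the example, not part of the formal theory.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PersistentGoal:
    holder: str
    content: str
    # The goal this one is relativized to; it may belong to another agent.
    relative_to: Optional["PersistentGoal"] = None
    believes_achieved: bool = False
    believes_impossible: bool = False
    # The holder's belief that the goal it is relative to has been given up.
    believes_supergoal_dropped: bool = False

    def may_be_dropped(self) -> bool:
        """A fanatic drops a goal only when it is believed achieved or
        impossible; a relativized goal may also be dropped once the holder
        believes the goal it depends on is no longer held."""
        if self.believes_achieved or self.believes_impossible:
            return True
        return self.relative_to is not None and self.believes_supergoal_dropped


# Mary requests Sam to do something and Sam agrees.
mary_goal = PersistentGoal(holder="Mary", content="Sam opens the door")
sam_goal = PersistentGoal(holder="Sam", content="open the door",
                          relative_to=mary_goal)

print(sam_goal.may_be_dropped())          # Sam is still committed
sam_goal.believes_supergoal_dropped = True
print(sam_goal.may_be_dropped())          # Sam learns Mary no longer wants it
```

The test is deliberately stated over Sam's beliefs rather than Mary's actual state: on the account above, what releases Sam is his finding out that Mary no longer wants the action.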
BACK TO DISCOURSE

To see how all this comes into play in discourse, let us reconsider the earlier trivial dialogue. Loosely speaking, the effects of sincerely uttering an imperative in the right circumstances are: the speaker (A) makes it mutually believed with the hearer (B) that A's persistent goal is that B form an intention, relative to A's goal that B do some act, thereby leading B to act. In our example, by attempting to achieve all these effects, A has requested B to open the door. B did not have to recognize the imperative itself as a request, i.e., as an attempt to achieve all these effects. The effects (i.e., the mutual belief about A's persistent goals) just needed to hold.

So much for A's utterance. Why does B say "Sure"? We would claim that B knows that requirements of consistency on agents' persistent goals and intentions mean that the adoption of a persistent goal constrains the adoption of others. A cooperative step one can take for others is to tell them when their persistent goals have been achieved (so they can be dropped). In the case at hand, A's persistent goal was B's forming an intention to act, relative to A's desires. By saying "sure", B has made it mutually believed that he has adopted that relativized intention. Now, it seems not unreasonable to characterize making a commitment `to' another person to do something in terms of making it mutually believed that one has an intention/persistent goal (i.e., one is internally committed) to do that action `relative to the other's goals'. (NOTE: We are not trying to characterize the institutional concept of obligation here, but are trying to shed some light on its rational underpinnings.) This helps to explain why one cannot felicitously promise to someone something one knows he does not want.
B only needs to recognize the illocutionary force of the utterance if s/he is concerned with why s/he is intended to form his/her intention (e.g., because of his/her being cooperative, or because of A's authority). The claim that illocutionary force recognition is crucial to all communication would say that hearers must reason about how they are intended to adopt their attitudes. Although I believe people do not do this frequently, the burden of proof is on those who argue that such reasoning is necessary to successful communication. (NOTE: Some illocutionary acts, such as greetings, have no propositional content. Their effects consist entirely of getting the hearer to recognize that the speaker was trying to perform that act (Searle and Vanderveken 1985). Thus, at least for these acts, illocutionary act recognition is required for communication to take place. While admitting this to be true, we suggest that these acts are the exception rather than the rule.)

Generally speaking, the participants' intentions and the interactions among those intentions are the keys to dialogue success. Illocutionary act recognition is mostly beside `that' point.

ACKNOWLEDGMENTS

Many thanks to Herb Clark, David Israel, Martha Pollack and the Discourse, Intention, and Action group at CSLI for valuable comments.

REFERENCES

Allen, J. F. 1979. A Plan-based Approach to Speech Act Recognition. Technical Report 131. Department of Computer Science, University of Toronto, Toronto, Canada.

Allen, J. F., and C. R. Perrault. 1980. Analyzing Intention in Dialogues. "Artificial Intelligence" 15(3):143--78.

Austin, J. L. 1962. "How To Do Things With Words". London: Oxford University Press.

Bach, K., and R. Harnish. 1979. "Linguistic Communication and Speech Acts". Cambridge, Mass.: MIT Press.

Brachman, R., R. Bobrow, P. Cohen, J. Klovstad, B. L. Webber, and W. A. Woods. 1979. "Research in Natural Language Understanding". Technical Report 4274. Cambridge, Mass.: Bolt Beranek and Newman Inc.

Bratman, M. 1986. Intentions, Plans, and Practical Reason. In preparation.

Cohen, P. R., and H. J. Levesque. Communication as Rational Interaction. In preparation.

Cohen, P. R., and H. J. Levesque. 1986. Persistence, Intention, and Commitment. Timberline Workshop on Planning and Practical Reasoning. Los Altos, Calif.: Morgan Kaufmann Publishers, Inc.

Cohen, P. R., and H. J. Levesque. 1980. Speech Acts and the Recognition of Shared Plans. In "Proceedings of the Third Biennial Conference". Canadian Society for Computational Studies of Intelligence, Victoria, B. C., pp. 263--71.

Grice, H. P. 1969. Utterer's Meaning and Intentions. "Philosophical Review" 78(2):147--77.

Grosz, B. J., and C. L. Sidner. 1986. Attention, Intentions, and the Structure of Discourse. "Computational Linguistics" 12(3):175--204.

Kautz, H. 1986. Generalized Plan Recognition. In "Proceedings of the Fifth Annual Meeting of the American Association for Artificial Intelligence", Philadelphia, Penn.

Perrault, C. R. An Application of Default Logic to Speech Act Theory. In preparation.

Searle, J. R. 1969. "Speech Acts: An Essay in the Philosophy of Language". Cambridge: Cambridge University Press.

Searle, J. R., and D. Vanderveken. 1985. "Foundations of Illocutionary Logic". New York, N. Y.: Cambridge University Press.

Strawson, P. F. 1964. Intention and Convention in Speech Acts. "The Philosophical Review" 73. Reprinted in "Logico-Linguistic Papers". London: Methuen, 1971.

------------------

THE WEDGE
Syntax or Semantics?

[Editor's note: Peter Ludlow, a CSLI Visiting Scholar, and some of the members of the STASS project have kindly given the Monthly permission to publish this recent exchange of electronic mail messages.]

Date: Sat 15 Nov 86 14:36:24-PST
From: Peter Ludlow
Subject: syntax or semantics
To: STASS@CSLI.STANFORD.EDU

All,

I think this might be the right forum to air some concerns that I have. The concerns involve the amount of burden taken over from syntax by situation semantics.
As far as I'm concerned there is no problem with the idea that situation semantics should be capable of doing the work that syntax does with respect to binding, scope, etc. If some linguists find it more helpful to think about these phenomena as semantic, then we should provide the resources for them to study the phenomena as semantic. My concern is that situation semantics may be becoming a theory in which these phenomena MUST be treated as semantic. I see the job of the situation semanticist as providing certain tools to aid ongoing linguistic inquiry. We know that there is a large class of semantic phenomena which resists study from a Davidsonian and Montagovian perspective. It is with respect to these phenomena, I think, that the situation semanticist should be most concerned. I can't see the point of forcing a situation-theoretic way of doing things on syntacticians who are involved in a productive research paradigm. Frankly, I can't see the difference between the view that binding theory etc. should be recast in semantics and Hartry Field's absurd view that physics should be recast in measurement theory because set theory is epistemologically troublesome. Peter ------- Date: Sun 16 Nov 86 04:15:55-PST From: Jon Barwise Subject: Re: syntax or semantics To: LUDLOW@CSLI.STANFORD.EDU cc: STASS@CSLI.STANFORD.EDU Peter, You need to distinguish situation theory, situation semantics, and any particular situation semantics account. The first is our theory of the world of semantic objects. The second is the general program of applying the first to the analysis of meaningful types of things. Within this program there is lots of room for competing accounts of any particular phenomenon. Now there is nothing in either (1) or (2) that FORCES you to treat scope, say, or coreference as a purely semantical phenomenon. There will be room for competing accounts, including one where scope and coreference are indicated in the syntax. 
On the other hand, the Relational Theory of Meaning (RTM) is at the core of situation semantics, and it does give a perspective on language and meaning which suggests a pretty radical rethinking of the relation between syntax and semantics. It does not prevent you from putting a lot of weight on the syntax, but it makes you ask why it belongs there. And most of us in the group have come to the conclusion that it is misplaced. If, as the RTM suggests, syntax and semantics are mutually constraining, but neither prior to the other, then you can see why accounts that took syntax to be autonomous and prior to semantics would have to "discover" all kinds of invisible syntactic features that would be better seen as semantic. So while situation semantics does not prevent someone from treating coreference, say, as syntactic, it is hard for me to imagine how anyone who has understood the perspective could think that it was. Jon ------- Date: Sun 16 Nov 86 10:30:46-PST From: Mark Gawron Subject: Ludlow's message To: stass@CSLI.STANFORD.EDU It seems as if there are two ways of looking at scientific explanation -- although there probably aren't two ways of doing it -- as theory-making or as theory-explaining. I think the view evolving in the STASS group -- and growing out of the Relational Theory of Meaning -- is one that leads to useful theory-explaining. The view that indices shouldn't be thought of as decorations of syntactic representations isn't a denial that a whole line of productive research is meaningful -- it's an attempt to explain that line of research, to show how what indices account for falls out of the relation between structure, circumstance, and content. A clear explication of that relation can be useful even if the interpretation offered is completely compatible with current formalizations of indices -- in other words, it will be useful even if our notation of our interpretation turns out to be a notational variant of the current formalizations. Why? 
Because indices DON'T currently have any well-grounded interpretation (either set-theoretic or measure-theoretic) and currently persist purely as syntactic decorations. The way we think about our theoretical objects does influence the way we develop our theories; and certainly thinking about indices as syntactic objects has had an influence on various versions of various binding theories. It is completely consistent with that view -- for example -- that indices might themselves have internal structure, might be decorated with further decorations, and there are a number of proposals in the literature to do just that (Haik 1984, Chomsky 1980, Chametzky 1985). Summing up: the aim isn't so much to stop the presses on binding theory as it is to come up with a well-grounded view of the facts at issue to constrain the ways in which a binding theory might develop. mark ------- Date: Mon 17 Nov 86 13:15:05-PST From: Peter Ludlow Subject: on binding theory etc. To: STASS@CSLI.STANFORD.EDU A couple of comments regarding Jon and Mark's replies -- Regarding Jon's remarks, I should note that my worries are not directed at situation theory, nor situation semantics generally, but toward certain situation-theoretic accounts -- those which try to subsume binding theory. I agree that syntax and semantics should be mutually constraining; I just disagree with the idea that binding theory (and, while we're at it, scope) is best given a semantic account. But perhaps we do not disagree. Let me clarify what I take binding theory to be a theory of. It is not a theory of what it means for two NPs to be coreferential (or for one to bind the other), nor is it a theory of when they will be coreferential. Rather, it is a theory of constraints on possible interpretations due to the relative positions of constituents in a p-marker. Let me illustrate. Binding theory does not tell us whether "Bill" and "he" are coreferential in a given utterance of "Bill thinks he is groovy." 
Rather, binding theory tells us that so far as syntax is concerned, they can be coreferential. It is the job of the situation semanticist to determine under what situations "Bill" and "he" ARE coreferential. Binding theory is not so flexible in other circumstances. In "Bill grooves on himself," binding theory dictates that from the point of view of syntax, Bill and the pronoun must corefer. Here, I think, the situation semanticist has little to add. Mark's comments, if I may crudely summarize them, suggest that situation semantics is not in competition with binding theory, but is a theory of what binding theorists are really studying. I wonder. If it is a deeper explanation of coreference to say two NPs utilize the same parameter than to say that they refer to the same object, then I suppose Mark has a point. Frankly, I can't see that notions like coreference need an explanation. I've never had a problem understanding what coreference was. Mark is right to point out that indices explain nothing, however. As far as I'm concerned they are just heuristic devices to help us keep track of the binding facts imposed upon a sentence by the grammar. If GB grammarians confuse indices for significant portions of the syntax and define certain grammatical properties off of indices, then, to my thinking, this is just bad linguistics, and the last thing it needs is a theory. Peter ------- Date: Mon 17 Nov 86 13:31:28-PST From: Mark Gawron Subject: Re: on binding theory etc. To: LUDLOW@CSLI.STANFORD.EDU I think we're converging, but there still seem to be some points that need clarifying. (1) If all that indices were used for in GB was to indicate coreference, I doubt that explaining them would need much work. The point is that they're not. The relationship between a wh-operator and its trace, whatever it is, isn't coreference, nor, in general, do contraindexing conditions amount to disjoint reference conditions, even in cases with referential NPs. 
(2) The point of referring to work which has given indices internal structure was not to make a promise that we could "explain" such uses of indexing, but to show that there were uses of indexing that seemed to defy explanation, but which were consistent with the view that they were syntactic decorations. The idea is that if some clear interpretation underlies their use, people won't do such odd things with indices. (3) I agree that the point is to provide a useful discovery vehicle, and I agree that, to some extent, GB's binding theory has been just that. And yes, the only real validation of any explanation is to do just that, and that includes our work on anaphora. If indices were the only thing at issue, there would probably be small promise of that. We think there are a number of issues that can be addressed reasonably well from this perspective... mark -------- Date: Mon 17 Nov 86 15:39:03-PST From: Carl Pollard Subject: Re: on binding theory etc. To: LUDLOW@CSLI.STANFORD.EDU cc: STASS@CSLI.STANFORD.EDU Since this is a free-for-all, here is my two cents. The idea that so-called indices, usually regarded as syntactic in some ill-defined way, are better thought of as something semantic (parameters of the types of things that language-use situations describe) IS a productive hypothesis for guiding research; people who have worked on long-distance dependencies, anaphora, control, agreement, etc. within the HPSG framework have found it to be a perfectly "effective discovery vehicle" (calling a hypothesis that reminds me of calling a toothpaste an "effective decay-preventive dentifrice") for the past two years or so. But it is a somewhat pernicious simplification to describe it as providing "a semantic account" of binding, as opposed to a syntactic account. 
A fundamental principle in situation semantics is that linguistic meaning is a CONSTRAINT in the technical sense of a relation between parametrized types, more specifically a relation between (at least) the type of the utterance situation and the type of thing (individual, property, situation, etc.) the utterance describes. Thus aspects of the expression (including, potentially, syntactic category, configuration, grammatical relations, phonology, morphology) and aspects of the content (including thematic roles, anchoring or absorption of parameters, scope, etc.) are MUTUALLY CONSTRAINING. In particular, a situation semantics-oriented account of binding would seek to account for the mutual constraints that hold among the syntactic components of certain language-use situations (i.e., uses of traces, reflexives and reciprocals, personal pronouns, ellipses, proper nouns, quantifiers, definite and indefinite descriptions, etc.) and the parameters of the corresponding content component that those uses introduce. This is a very different thing, and in my opinion a much better thing, than giving either a strictly syntactic or a strictly semantic account -- either of which would be senseless from the point of view of the Relational Theory of Meaning. On the other hand, it is very similar -- perhaps just more general -- to "a theory of constraints on possible interpretations due to the relative positions of constituents in a p-marker." As far as (co)referentiality is concerned, it is not enough for a binding theory to say whether or not a given configuration requires or forbids given pairs of elements from being coreferential. 
Finer distinctions are required, as lots of work done within both situation semantics and discourse representation theory -- by people like Peter Sells, Craige Roberts, Mats Rooth, Jon Barwise, as well as Gawron and Peters -- has taken great pains to show, although there is still not a consensus as to just what the right distinctions are: just consider "Only Bill grooves on himself." It is not true that the situation semanticist has little to add to the principle of binding theory that "Bill" and "himself" must corefer. Neither is it appropriate to characterize the situation semanticist's job as "to determine under what situations `Bill' and `he' are coreferential" if the intention of that characterization is to exclude syntax from the subject matter of situation semantics (and, presumably, leave that in the hands of syntacticians of the correct persuasion); syntax and other aspects of the utterance situation figure in the meaning relation just as much as the semantic content does. As far as I can tell, it IS deeper to say (for example) that two NPs utilize the same parameter than to say that they refer to the same object (what object is referred to by "it" in "every farmer who owns a donkey beats it"?). Notions like coreference DO need an explanation, and many people over the years have had profound difficulties understanding what it is. Carl ------- Date: Tue 18 Nov 86 16:52:21-PST From: Peter Ludlow Subject: reply to Mark and Carl To: STASS@CSLI.STANFORD.EDU WRT Mark's comments, I agree that we seem to be converging. Mark is right to point out that binding theory is more than just a theory of coreference. I just used coreference as an example of the kind of thing I have in mind. Binding theory is also a theory of what parts of syntax are operators, what parts are variables, and when a given operator binds a variable. Now a semanticist (situation or otherwise) will have something interesting to say about the interpretation of quantifiers. 
But by "interpretation of quantifiers" I mean to speak of whether quantifiers are objectual, substitutional, etc. and questions of how they come to be interpreted as having a group reading, a distributed reading, or whatnot. But I guess I wouldn't consider any of these questions to be questions in binding theory per se. Mark's second point is that people wouldn't do such odd things with indices if they had a clear interpretation of how they were being used. With this I agree, but I think the clear interpretation might just be a statement of what syntactic relations indices are used to represent. This is a point that I know Higginbotham has made, and it seems to me that Susan Stucky has made the same point about syntactic representations generally. (Right, Susan?) Mark's third point is that situation semantics will prove to be a more productive paradigm for the study of binding theory facts than the current syntactic one. Time will tell. WRT Carl's comments, I would distinguish a theory of binding (including a theory of index assignment) from a theory of indices. My point is that the former should be thought of as syntactic. I don't care about the theory of indices. I'm not sure what a theory of indices would be and I doubt that one can make sense of the notion of constructing either a syntactic or semantic account of what an index is. Perhaps Carl just means that semantics determines the assignment of indices. If this is the claim, he is partially right (if one uses indices to signify, among other things, all cases of coreference), and I agree that for discourse anaphora and a number of other phenomena, syntax will be silent on how indices are to be assigned. I should add that, for me, these phenomena fall outside of binding theory. Carl's second point, that the discussion has been oversimplified, is perhaps correct. What I don't see is that my view is in conflict with the idea that syntax and semantics are mutually constraining. 
This view is of course implicit even in my remark (quoted by Carl) that binding theory is a theory of "constraints on possible interpretations [I should add of pronouns and bound variables] due to the relative positions of constituents in a p-marker." All I'm saying is that binding theory is a theory of some of the constraints placed on interpretation by the syntax. Syntax surely does not provide all the constraints, and if you like, you can say it only provides 1% of the constraints. And of course the theory of syntax must be constructed with the goal of getting the interpretation of sentences right. Carl is correct to point out that the situation semanticist should not be excluded from doing syntax. I can see the point that syntactic objects are situation-theoretic objects. My only concern is that the contribution of the syntactic object to the meaning of an utterance be given its full due. I still don't see how "utilizes the same parameter" is "deeper" than "refers to the same object." But perhaps I am dense. WRT the "it" of donkey sentences: it does not refer, but is a bound variable (if Heim is right the indefinite article is a bound variable here too). Indices are not in want of explanation, but binding theory facts are. Remember that it is not reference itself that I am interested in, but merely the fact that constituents in certain syntactic configurations must corefer, in other configurations they cannot, and in still other configurations an operator can bind a variable. Question: Why is it unsatisfying or unexplanatory to embed binding theory, so understood, in generative syntax? 
--Peter ------- Date: Wed 19 Nov 86 08:18:14-PST From: Craige Roberts Subject: more on indices To: STASS@CSLI.STANFORD.EDU Peter says in his last note on indices that "it is not reference itself that I am interested in, but merely the fact that constituents in certain syntactic configurations must corefer, in other configurations they cannot, and in still other configurations an operator can bind a variable." He then asks: "Why is it unsatisfying or unexplanatory to embed binding theory, so understood, in generative syntax?" I am in full agreement with the claim that syntax and semantics are mutually constraining, and in principle I have no problem with abstracting away from the facts of interpretation for the purpose of examining the more purely syntactic constraints on binding (c-command, f-command, governing category, whatever). But syntacticians would benefit by paying more attention to the semantic (and pragmatic) side of the analysis of anaphoric phenomena. For example, I see a number of problems with the binding theory of the Government and Binding framework which arise from a failure to take the interpretation of indices more seriously, as well as a failure to take into account various facts about focus and other contextually determined elements of interpretation. For example, Gareth Evans, Tanya Reinhart, and others have all pointed out that certain examples which the binding theory predicts to be ungrammatical are in fact quite acceptable in the proper context, perhaps with the proper intonation, etc. One's theory of binding, even if only a theory of the relevant syntactic constraints on the sentential level, has to be consistent with what we find in larger contexts. 
I assume that Mark had things like this in mind when he said that a theory of anaphora should have a "pragmatic component that might play Gricean principles off the syntactic component to `derive' properties of both the reference-tracking features of the linguistic circumstances and their relationship to syntactic structure"; that is exactly what Reinhart has suggested for the disjoint reference facts, instead of trying to force them into a purely syntactic theory. Further, note that it IS generally assumed by binding theorists that indices have an interpretation, and this assumption has been the basis of many of the judgments of (un)grammaticality where anaphoric relations are involved; as some of the comments in this discussion (including the one above from Peter) show, for most folks coindexation means coreference and noncoindexation means disjoint reference. First, given the possibilities for anaphora in discourse, it is clearly wrong to say that two NPs which are not coindexed are disjoint in reference. And even the more plausible claim that coindexation means coreference may well be wrong--Leslie Saxon has found cases in Dogrib (an Athapaskan language) of "disjoint anaphors," pronouns which must be bound within their governing category, like English reflexives, but mean something like "someone other than the individual denoted by my antecedent"; and my work on plural anaphors in distributive predicates also challenges the coindexation-is-coreference assumption. Binding seems to be a very abstract relationship, its interpretation determined partly by lexical properties of the anaphors involved, partly by operations (such as distributivity) on larger constituents in which they occur. What I am saying, then, amounts to this: if binding theory is to be EMPIRICALLY ADEQUATE (let alone explanatory), then syntacticians must heed the semantic and pragmatic "components" of a full theory of anaphora. Cooperation and mutual respect will lead to better theories. 
------- ------------------ DISTRIBUTIVITY Craige Roberts, CSLI Postdoctoral Fellow My work on distributivity grew out of a general interest in the relationship between anaphora and operator scope, in the context of a theory of discourse. The work of Heim (1982) and Kamp (1981), and my extensions of Discourse Representation Theory to reflect the phenomenon of Modal Subordination (see Roberts (1986)) all support a simple generalization about anaphoric relations and referential dependence more generally: an anaphoric element may take an NP as antecedent only if any and all operators which have scope over the potential antecedent have scope over the anaphor as well. Certain phenomena associated with distributivity provide a challenge to this hypothesis, and hence must be addressed in order to maintain it. Further, the analysis of distributivity is a prerequisite to the extension of this generalization to plural anaphora, as we shall see below. Conversely, considering the distributive phenomena from the perspective of a theory of anaphora in discourse provides insight into the basic character of distributivity, and has led to a fairly simple and general characterization of it which differs in important respects from earlier theories (cf. Lakoff (1970), Bennett (1974), Scha (1981), Link (1983), for example). Consider the following examples:
(1) Four men lifted a piano.
(2) Bill, Pete, Hank, and Dan lifted a piano.
(3) Bill, Pete, Hank, and Dan each lifted a piano.
(4) Each man lifted a piano.
(5) It was heavy.
(6) He developed a crick in his back later.
(7) They each developed a crick in their back later.
(1) and (2) are ambiguous in the same way. There is a group reading, where together the four men lifted a single piano; and there is a distributive reading, where each of the men in question has the property of having (singlehandedly) lifted a piano. 
(In fact, there are two distributive readings, one where the men each lifted the same piano, and another where there may have been a different piano involved in each lifting. For the purposes of this discussion, we will ignore the first kind of reading, where the indefinite has wide scope over the subject, and concentrate only on the other reading. In fact, the difference is not crucial for the theory I propose, but illustrates the important fact that distributivity is not reducible to questions of NP scope.) (2) is ambiguous in the same way as (1). (3), on the other hand, has only the distributive reading. And if we assume that there are only four men, then the truth conditions for (4) are identical to those for the distributive readings of (1), (2), and (3), again ignoring the reading where the indefinite object has wide scope over the subject. Now compare the anaphoric potential of the NPs in these examples under various readings. On the group reading, it is felicitous to follow (1) or (2) by (5) with `a piano' serving as antecedent for `it', but on the distributive reading which interests us, neither (1)+(5) nor (2)+(5) is felicitous; similarly (3)+(5) is infelicitous, as is (4)+(5) on the intended reading of (4). The parallel between the subject in (1) and the quantificational subject of (4) tempts one to analyze the former as quantificational too. But this would not solve the problem of the analysis of distributivity, since the parallel extends to (2) and (3), with subjects which are clearly nonquantificational. Further, the subjects of (1) to (3) display a different anaphoric potential than that of (4). The latter may not serve as an antecedent for anaphors in subsequent sentences--witness the infelicity of (4) followed by (6). But the subjects of (1) to (3) may serve as antecedents on any of their possible readings; hence any of these examples may precede (7), with THEY anaphoric to their subject. 
One of the keys to the account of such examples is in the analysis of (3). Dowty and Brodie (1984) argue that this "floated" EACH is an adverbial operator, which modifies the predicate to give a sense which may be paraphrased, "this predicate is true of each of the members of the group denoted by the subject." Here it is the adverbial operator which introduces the universal quantificational force, rather than a quantificational subject, as in (4). We may then capture the parallels between (3) and the truth-conditionally equivalent distributive reading of (2) by positing an implicit adverbial distributivity operator in the latter example as well. The extension of this treatment to (1) is then natural. Now we can explain the anaphoric facts about (1) to (7) under the hypothesis about anaphora mentioned above: On the group readings, there are no operators in (1) or (2) which have scope over A PIANO, and hence it is available to serve as an antecedent in discourse. But on the intended distributive readings of these examples or of (3), the indefinite object is under the scope of an adverbial operator which does not have scope over any NPs outside of the sentence; thus A PIANO may not serve as antecedent to a pronoun in a succeeding sentence, IT in (5). This is the case with the indefinite in (4) as well, though here the operator is the determiner of the subject rather than an adverbial. In (4), this operator in the subject may not have scope outside its immediate sentence, and this explains the infelicity of anaphoric relations between its subject and that of (6). But the subjects of (1) to (3) need not be quantificational themselves in order to explain the quantificational force of the distributive interpretation. If we assume that they are not quantificational and are not under the scope of the distributivity operator, then we can explain why, on the distributive reading of these examples, the subjects are available to serve as antecedents for the subject of (7). 
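As an informal illustration only, the two ideas used in this explanation can be sketched in Python. The names `distribute` and `accessible`, the toy model of who lifted a piano, and the representation of operator scope as a set of labels are all assumptions made for this sketch; they are not part of the formal treatment in Roberts (1986).

```python
from typing import Callable, FrozenSet, Set

Individual = str
Group = FrozenSet[Individual]


def distribute(pred: Callable[[Individual], bool]) -> Callable[[Group], bool]:
    """Hypothetical adverbial distributivity operator.

    Turns a predicate of individuals into a predicate of groups that is
    true of a group just in case the original predicate is true of each
    member of that group.
    """
    return lambda group: all(pred(x) for x in group)


def accessible(ops_over_antecedent: Set[str], ops_over_anaphor: Set[str]) -> bool:
    """Hypothetical check of the generalization about anaphora.

    An NP may serve as antecedent only if every operator with scope over
    it also has scope over the anaphor.
    """
    return ops_over_antecedent <= ops_over_anaphor


# Toy model (an assumption): the individuals who singlehandedly lifted a piano.
lifted_a_piano_alone = {"Bill", "Pete", "Hank", "Dan"}
men: Group = frozenset({"Bill", "Pete", "Hank", "Dan"})

# Distributive reading of (2)/(3): the modified predicate applied to the group.
each_lifted_a_piano = distribute(lambda x: x in lifted_a_piano_alone)
distributive_reading_holds = each_lifted_a_piano(men)

# Group reading of (1)+(5): no operator has scope over A PIANO.
group_reading_ok = accessible(set(), set())

# Distributive reading of (1)+(5): A PIANO is under the operator "D",
# but IT in the next sentence is not.
distributive_object_ok = accessible({"D"}, set())

# (1)+(7): the subject is not under "D", so THEY can pick it up.
distributive_subject_ok = accessible(set(), set())
```

On this toy encoding the distributive reading comes out true of the four men, the indefinite object is available as an antecedent only on the group reading, and the subject remains available on either reading, which is the pattern described for (1) to (7).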
Adverbial distributivity need not apply only to VPs, but can apply to derived predicates as well, as was noticed by Link (1986). So, for example, in (8), it may be the case that three girls each received a valentine from John: (8) John gave a valentine to three girls. The derived predicate here may be expressed by LAMBDAx(John gave a valentine to x) or the related type in situation theory. If this is modified by the distributivity operator and the result is predicated of the group-denoting NP THREE GIRLS, we derive the intended interpretation. The view of distributivity sketched informally here contrasts with earlier theories which in general either viewed the distributive-group distinction as due to lexical properties of predicates or as arising purely from properties of NPs (e.g., quantificational vs. referring). In the work from which this brief summary is drawn, Roberts (1986), I consider such theories in detail and show why none of them is sufficiently general to account for the full range of distributive phenomena. This proposal also lays the groundwork for a simple theory of plural anaphora, where plural as well as singular pronouns are treated as simple bound variables. Thus, we have an account of examples such as (9) (which might be uttered in the orthopedic ward of a hospital in Colorado): (9) These people broke their leg skiing. The distributive reading of this example is strongly preferred, since it is unlikely that the people broke a single, communal leg. And it seems to mean that each person broke his or her own leg. Suppose that adverbial distributivity applies here to a derived predicate LAMBDAx(x broke x's leg), where the pronoun is treated as a variable bound by the same operator as the subject role. When this modified predicate applies to the group-denoting subject, the resulting interpretation may be paraphrased, "each person in the group indicated has the property of having broken his or her leg." 
Though the plural pronoun here is bound by the subject, it is not coreferential with it, since only the pronoun is under the scope of the distributivity operator. Finally, if this theory is used in conjunction with a theory of the semantics of plurality along lines suggested by Link (1983), we may develop a simple and empirically adequate account of the Dependent Plural phenomena. However, the details of this proposal, as well as the formal details of the treatment of distributivity, must be omitted here for reasons of space. References: Bennett, Michael R. 1974. "Some Extensions of a Montague Fragment of English". Ph.D. dissertation, UCLA. Dowty, David R., and Belinda Brodie. 1984. The Semantics of "Floated" Quantifiers in a Transformationless Grammar. In Mark Cobler, Susannah MacKaye, and Michael T. Wescoat (eds.), "Proceedings of WCCFL III". The Stanford Linguistics Association, Stanford University, pp. 75-90. Lakoff, George. 1970. Linguistics and Natural Logic, "Synthese" 22:151-271. Reprinted in Donald Davidson and Gilbert Harman (eds.), "Semantics of Natural Language". Dordrecht: Reidel, 1972. Link, Godehard. 1983. The Logical Analysis of Plurals and Mass Terms: A Lattice-theoretical Approach. In Rainer Bauerle, Christoph Schwarze, and Arnim von Stechow (eds.), "Meaning, Use, and Interpretation of Language". Berlin: de Gruyter. Link, Godehard. 1986. Generalized Quantifiers and Plurals, manuscript, University of Munich and CSLI, Stanford. To appear as CSLI Report No. 66. Roberts, Craige. 1986. "Modal Subordination, Anaphora, and Distributivity". Ph.D. dissertation, University of Massachusetts, Amherst. Scha, Remko. 1981. Distributive, Collective and Cumulative Quantification. In Jeroen Groenendijk, Theo M. V. Janssen, and Martin Stokhof (eds.), "Formal Methods in the Study of Language, Vol. I". Mathematisch Centrum, Amsterdam. Reprinted in Groenendijk, Janssen and Stokhof (eds.), "Truth, Interpretation and Information". Dordrecht: Foris, 1984. 
------------------ REPRESENTATION: A PERSONAL VIEW Adrian Cussins, CSLI Postdoctoral Fellow Any proper theory of representation must draw a distinction between cognitive and communicative representations. For without this distinction we will not understand the different goals that a theory of representation may have. Anything at all can function as a communicative representation: Morse code, footprints in the sand, stick-figure drawings, computer icons, smoke, noises and marks that we and other animals, and inanimate things, make. All that is required is that one or more intentional agents interpret the object or event, or that there be a convention of interpretation within some community of intentional agents to interpret the object or event. By contrast, only a very restricted category of things can function as cognitive representations. My current perceptual experience of a Xerox Dandelion is a cognitive representation because its functioning as a representation does not depend on interpretation by some intentional agent, or on a convention of interpretation of some community of agents. Although I can interpret my own experience (for example, when some aspect of it is ambiguous), I do not have to do so for my experience to represent, whereas a communicative representation must be interpreted or belong to a convention of interpretation for it to represent.(1) We normally specify what my experience is by means of conventional devices, the functioning of which depends on other intentional agents. But that is a quite separate point. A red traffic light represents the command to stop only in virtue of a convention which governs traffic lights. My perceptual experience, or thought, when confronted with a traffic light, represents independently of any convention or act of interpretation, even though we would normally specify what the intentional state is of by means of the conventional linguistic phrase "traffic light." 
It is, indeed, only because of the PRIMARY representation of experience and thought (cognition) that the DERIVATIVE representation of communicative signs is possible. [Throughout I am treating language as Chomsky's E-language, a system of linguistic communication (Chomsky 1986).] There can be a phrase of the language, "traffic light" only because some members of the community of language users are capable of thinking of traffic lights. As we know from the phenomenon of the division of linguistic labor, as well as other linguistic phenomena, it is not necessary that all members of the linguistic community have concepts for all phrases of the language, but it is necessary that for each phrase of the language, some member of the community has the appropriate concepts. A speaker may often exploit language to make a reference that he does not himself understand, but as Evans writes (1982, p. 92), "Given the divergence between the requirements for understanding and the requirements for saying, it would be absurd to deny that our primary interest ought to be in the more exigent conditions which are required for understanding." This point holds quite generally for the use of systems of communicative representation. Communicative representation represents but it does so only in virtue of the cognitive representation of one or more intentional agents or a convention of interpretation which must itself be understood in terms of the cognitive representations of the community which upholds the convention. An understanding of how representation is possible must ultimately rest on an understanding of how cognitive representation is possible. A theory of cognitive representation is explanatorily primary; a theory of communicative representation is explanatorily derivative. 
This suggests that we ought to begin our theory of representation with a theory of perception, memory, and thought (i.e., a theory of cognition) and only when we have such a theory will we be able to provide a theory of language and other derivative representation which exploits the cognitive representation of interpreting agents (a theory of communication). Cognition is prior to communication in the explanation of representation. It is then a little alarming to discover that the vast majority of work on representation depends on the reverse priority. When specifying the content of cognitive representation one aims to capture how things are from the agent's point of view, yet most theories will simply specify cognitive content in terms of linguistic reference to the objects or properties that the content is about. The content of beliefs and the other attitudes is specified sententially by a "that clause" and the content of perceptual experience is specified by linguistic reference to objects and properties that the experience is of, NOT ONLY IN OUR EVERYDAY COMMUNICATION, BUT AS A THEORETICAL SPECIFICATION WHICH IS A PART OF A GENERAL THEORY OF COGNITIVE CONTENT. Cognitive content is generally specified by theorists of representation by means of linguistic reference to the world, as if a theory of linguistic communication was explanatorily prior to a theory of cognition. Do not mistake my point. Cognitive contents may be specified correctly by means of linguistic (or, in general, communicative) reference to the world of the agent; we so specify them on most occasions when we communicate about our own or others' mental states. I call such specification of content "conceptual specification of content." But the goal of the theoretician of representation is quite different. His goal is not communication, but to understand how it is possible for physical systems to represent the world. 
For example, in certain cases the theoretician should capture what is available in the experience of the agent as a disposition (see Evans (1982), chapter 6). When the theoretician characterizes the disposition in language, as of course he must, there need be no presupposition that the agent understands that bit of language, or that the theoretician must explain what it is for language to function, independently of a theory of cognition. All that is presupposed would be a theory of what it is for organisms to possess dispositions; a presupposition which is entirely innocent from the point of view of cognitive theory. But if the theoretician specifies the cognitive contents of (i.e., what is available to) intentional agents directly by means of linguistic reference (or reference in some other system of communicative representation) to what the content is about, then he must suppose that a theory of communicative representation is explanatorily prior to a theory of cognitive representation. And so much the worse for his enterprise. For, if we cannot explain how linguistic (or, in general, communicative) reference is possible for physical systems in terms of a prior theory of cognitive representation, then where else are we to turn? -- Theories of the causal relation between the use of some communicative representation and bits of the world? But nobody has any idea as to how such causal relations could be specified noncircularly. (For an excellent criticism of causal theories of intentionality which such a theorist would be committed to, see Evans (1982) chapter 3 and part 2.) -- Information-based theories of the information that utterances carry about the world? But if the theory of linguistic reference is explanatorily prior to the theory of cognitive reference, and the theory of linguistic reference is information-based, then the theory of cognitive reference will have to be information-based. 
And it has been clear for a long time that the standard notion of information-transmitted (rather than a notion of information which is cognitively available) is an inadequate basis for a theory of cognition. (Most recently, see Fodor (1986).) What, in effect, all modern information-based theories of representation do is introduce a new notion of information, not the standard notion, which is the output from processes of "attunement" or "digitalization," and thus which is a notion of representation rather than information (Dretske 1981, Barwise and Perry 1983). Since such processes are to be understood by means of a cognitive (psychological) theory, the notion of nonstandard information, which is a notion of a representational state output from the processes of digitalization/attunement, is itself a cognitive notion. Despite the misleading terminology, it cannot be used to ground a theory of linguistic representation which is explanatorily prior to a theory of cognitive representation. And there is a further problem with (standard) information-based theories of representation. Information exists in the world because of relations of constraint between situations in the world (let's suppose). These relations of constraint are supposed to hold, and to be explained, quite independently of (a theory of) the representational activities of intentional agents. If we are to provide a theory of communicative representation, which is independent of a theory of cognition, in terms of standard information, then we must suppose that the explanation of what it is for a situation of a given type to obtain in the world is independent of the theory of representation, both communicative and cognitive. But the vast implausibility of this position was the downfall of early, Austin-style, theories of correspondence. Facts just are true thoughts, or, what is expressed by true sentences, etc.
To suppose that we have independent access to our thoughts, to the world, and to some relation of correspondence (or noncorrespondence) between the two is epistemologically incoherent. We think, and thereby have access to the world. Our grasp of the true/false distinction is not a result of our having independent access to the world and to our representations, and a discovery of the difference between the latter's corresponding to the former and their failing to so correspond. Our conception of the world, and our conception of the true/false distinction are a joint, and inseparable, product of our cognitive development. An explanation of what it is to have a conception of the world just is an explanation of what it is to grasp the distinction between true and false. Nor can one appeal to the physical sciences in support of a tripartite conception which involves three theories (each lower-numbered theory being independent of each higher-numbered theory): (1) a physical science theory of the world as-it-is-in-itself, including constraints between physical situations, and thus information, (2) a theory of communication, including linguistic communication, in terms of the information carried by uses of communicative representations in context, and (3) a theory of cognition. This tripartite conception would support a unidirectional independence between the physical sciences, the linguistic sciences and the psychological sciences. The physical sciences could work in independence of the other two categories of science. The linguistic sciences could take over a notion of information from the physical sciences and use it to characterize the functioning of communicative representational systems in independence from the cognitive constructs of psychology. Under this conception the scarcity of psychologists at CSLI would make a lot of sense, for the functioning of communication could be studied independently of the functioning of mind.
A multidisciplinary psychological center would require the presence of linguists but a multidisciplinary center for the study of language (the center) would not require the presence of psychologists. But, as I said, the tripartite conception is unsupported. The physical sciences provide theories of the nature of atoms and molecules, not of the nature of tables, chairs, Xerox Dandelions, traffic lights, or people. Given the restriction to the resources of the physical sciences, there is no closed specification which picks out all and only the chairs in the universe. The physical sciences cannot explain what it is for there to be a chair, or a ... , in the universe, even if they can explain what it is for there to be the elements out of which chairs are constructed. But the kind of information that we need for a theory of communication is information about things like chairs, not merely information about the mereological constituents of chairs. So even if we could support the tripartite conception for thought and talk about mathematics and the physical sciences (which in any case I doubt), we could not support it for the vast majority of our thought and talk. The world just is what is presented to us in our perception and in our thinking, so a theory of thought and a theory of the world must be interdependent. The interdependence of a theory of thinking and a theory of what we think about means that a theory of thinking must not presuppose a theory of what it is that we think about, for that would force a theory of what it is that we think about to be independent of a theory of cognition. A theory of thinking would presuppose a theory of what it is that we think about if it employed what I called "conceptual specifications" of the content of thinkings. For a conceptual specification of the content of a thinking specifies the content in terms of the objects and properties of the world that the thinking is about.
If our theory of cognition took conceptual specifications as basic, then it would have to presuppose a theory of what it is for there to be such objects and properties. There would be no room for illumination of what it is for there to be such objects and properties from a theory of cognition.(2) There is yet a further problem. Not only must we not presuppose the theory of objects and properties that our cognizings are about, but we must also not presuppose the possession of concepts of those objects and properties by the subjects of cognition. Our aim is to explain what it is for organisms to possess concepts, not to describe general features of cognition given that the cognition is assumed to be already conceptual. As the traditional philosophical project of providing definitions was constrained to provide noncircular definitions, so the epistemic project of explaining what it is for an organism to understand and think is constrained not to presuppose the possession of concepts by the organism. Now, "concept" is a word which is ill-regarded around here, so I shall be excused for spending a few paragraphs in its defense. It is one of the great sources of philosophical wonder that there exists not just the world but perspectives on the world; that in the world are things which think about the world. It's as if we feel we understand in principle (if not in detail) how physical and biological evolution could produce a world of objects that bear causal relations to each other, but not how it could produce a world of objects which reflect on those causal relations. Concepts are abilities of (certain) organisms in virtue of which they can think about the world. Hence the possession of concepts by organisms is a source of wonder, and a challenge to the project of naturalism. 
The behavior of most objects in the world can be understood by adopting the "physical" or "design" stance (to use Dennett's (1978) terminology) towards them, but without adopting the "intentional stance." We can understand the behavior of a conventional chess playing computer in terms of a procedural consequence specification of the program, so long as there is no malfunction. We can also adopt the intentional stance towards the computer, as when we think that the computer "believes that it is good to get its queen out early," but the point is that we don't have to do so in order to understand the behavior of the machine (although we do have to do so in order to understand why we might wish to build such machines). Although the programmer may be guided in the design of the machine by his adoption of the intentional stance, it is not necessary to adopt this stance in order to understand what it is that he has designed. This is what it means to treat the intentional stance "instrumentalistically." But there are some creatures - us humans, at least - the (majority of the) behavior of which must be understood by adopting the intentional stance. The attribution of intentional states to adult humans is realistic; that is, the purpose of the attribution is not just the prediction of sequences of behavior in a given, constrained, context, but the explanation of the causation of that behavior. We, unlike overhead projectors, frogs, or conventional computers, act out of our beliefs, thoughts, memories, perceptions, and imaginations. Our behavior, unlike the frog's, is not as it is merely because the world is a certain way, but also because we believe it, remember it, desire it, and imagine it to be a certain way. Were it not for this, there would be no genuine basis for the distinction between adopting the moral stance towards things like us, but not adopting it towards overhead projectors. 
It is because the attribution of mental states to flies is to be construed instrumentalistically, and the attribution of mental states to humans (and others) is to be construed realistically, that it is all right to swat flies but it is not all right to swat humans (and others). The cognitive challenge to naturalism is to show how human representation is so extraordinarily and wonderfully different from frog representation, even though frogs and humans are both products of the identical processes of natural selection. The point of all this is to make the absurdity of the realistic ascription of concepts to screwdrivers (or GM robot welders) more than usually apparent. We need an account of concept possession that makes sense of the different attitudes we adopt towards things like screwdrivers and other nonconcept-exercising things and things which, like us, have concepts and, thus, a world about which we think. If it made sense to ascribe concepts to screwdrivers (or robot welders) it would have to make sense to ascribe just one or two concepts to a thing. But then it would have to make sense to think about a world which just contained screws, and two properties, screwed or unscrewed. But screws can only be part of a world in which there are factories which make them, people who need them, properties of rigidity, etc., which are required for them to work, directions in which they are screwed, locations where they are, ... Nor will it do to say that the concept of a screw which a screwdriver or robot welder has is not our concept of a screw; for the concept of a screw just is our concept of a screw, an object which makes sense, and has its identity, in our world. If screwdrivers could talk, we would not understand what they said. So, if the program of naturalism is to make room for conceptions of the world, we will do well to explain how it is possible for merely physical organisms to possess concepts.
But if our scientific psychology adopts conceptual specifications in its theory of the representational abilities of organisms, then concept possession will have been presupposed and no dent made on the challenge to naturalism. We need scientific psychology to employ nonconceptual specifications of the cognitive representational states of organisms which are such that: (a) we understand what it is for physical systems to be in states thus described, and (b) we understand why it is that being in states so described is what it is to possess concepts (have a conception of the world). As I argue in my thesis, this project is possible because, and only because, the constitutive structure of thought is its nonconceptual structure. The nonconceptual theoretical specification of content is in terms of psychological mechanisms -- mechanisms the possession of which does not presuppose the ability to refer. There is no reason at all why as theorists of content we must adopt the same specifications of content as we use in everyday communication about people's attitudes and experiences, or technical extensions of such specifications. And good reason not to, since doing so would leave the central challenge to naturalism a mystery. The general moral is that communicative representation is explanatorily derivative upon cognitive representation, nonconceptually specified. A scientific psychology of cognition which does not presuppose the possession of concepts is explanatorily prior to any theory of communication, including linguistic theories of natural language. Wouldn't it be great if at CSLI we had something to say about a theory of how organisms represent, which neither presupposes a theory of what it is that we think about nor presupposes a theory of what it is for organisms to possess concepts!
If we don't, it will be a shame because we won't have much of substance to say about communicative representation either.(3)

Notes:

(1) There is little need for these purposes to draw any distinction between communicative representation and what one might call "functional" or "teleological" representation. Functional representation is representation which is assigned to a piece of mechanism by an interpreter in order to understand better how that bit of mechanism functions in the context of the system of which it is a part. For example, we might assign the representation of the speed of sound to the neural mechanism of auditory localization. Teleological representation is representation which is assigned to a system in order to better understand why a system has been designed or why it has evolved. Both of these types of representation are classified as "communicative" here, even though their function is not for communication.

(2) Whereas a dispositional theory of content, for example, would not presuppose a cognitively independent theory of what it is that we think about. It would merely presuppose the world.

(3) Thanks to Brian Smith, Craige Roberts, and Susan Stucky for their comments.

References:

Barwise, J. and J. Perry. 1983. Situations and Attitudes. Cambridge: MIT Press.

Chomsky, N. 1986. Knowledge of Language: Its Nature, Origin and Use. Praeger.

Dennett, D. 1978. Brainstorms: Philosophical Essays on Mind and Psychology. Cambridge: MIT Press.

Dretske, F. 1981. Knowledge and the Flow of Information. Cambridge: MIT Press.

Evans, G. 1982. The Varieties of Reference. Oxford University Press.

Fodor, J. A. 1986. Information and Association, Notre Dame Journal of Formal Logic 27.

------------------
SYMBOLIC SYSTEMS PROGRAM
Helen Nissenbaum

Stanford has a new undergraduate major, one with close ties to CSLI. From its quiet start in September of this year, the Symbolic Systems Program (SSP) has enjoyed steady growth.
Fifteen students, most of them juniors, have already enrolled as majors. Typically these juniors consider the program "a godsend," giving them a program of study that is just what they wanted but had been unable to find before. Among sophomores, who are just beginning to think about their majors, there also seems to be a lot of serious interest. SSP offers students the opportunity to explore the way people and machines use symbols to cope with the world. Key notions are symbol, representation, information, intelligence, action, and language. By requiring course work in the departments of Computer Science, Linguistics, Philosophy, and Psychology, the curriculum shows how these notions are approached from a variety of perspectives, including those of artificial intelligence, computer science, cognitive psychology, linguistics, philosophy, and symbolic logic. Each Symbolic Systems major completes a core of eleven required courses. The four in computer science include theories of computation, topics in AI, and the basics of machine and assembly language, and provide considerable training in actual programming. Two linguistics courses introduce students to theories of syntax, semantics, and pragmatics. Students take a sequence of two logic courses. Two philosophy courses cover many of the central topics in traditional analytical philosophy with an emphasis on philosophy of language and philosophy of mind. The psychology requirement is in cognitive psychology. In addition to the core, majors select an area of concentration in which they complete an additional five courses. The idea of the concentration is to encourage students to develop an area of expertise that is consistent with their interests and long-term goals. Students may select from the predesigned concentrations in artificial intelligence, cognitive science, computation, logic, natural language, philosophical foundations, semantics, and speech; or they may design their own.
Although most current majors have adopted the predesigned concentrations (some with minor changes), there are some individually designed concentrations, including one in computer music and one in psychobiology. The program has a large and diverse faculty committee, comprising faculty from the affiliated departments (of Computer Science, Linguistics, Philosophy, and Psychology) and consulting faculty from industrial research centers in the Bay Area (SRI International, Schlumberger, and Xerox PARC). The faculty participate in a variety of ways: advising students, teaching courses, and making decisions about the curriculum to steer the intellectual course of the program. In the winter quarter of 1987, the Symbolic Systems Program will offer its first course, called "Introduction to Information and Intelligence." This is a survey of the program's subject area, given as a series of exploratory self-contained lectures. Lectures will be given by members of the program's committee. The course will be given at a campus location as well as broadcast on the air by Stanford's Instructional TV Network. Several additional courses are being developed for the major including undergraduate offerings in philosophy of language, computational linguistics, the semantics of programming languages, and ethical issues in the uses of computers. The Symbolic Systems Program has several ties to CSLI. Most important, of course, is a curriculum which reflects CSLI's intellectual direction. Consequently, the program's faculty committee is made up almost entirely of CSLI affiliates, both regular Stanford faculty and consulting faculty from industry. SSP is directed by Jon Barwise, the first director of CSLI, and is coordinated by Helen Nissenbaum, one of CSLI's first postdoctoral fellows. In addition, CSLI provided important support while the program was being established. 
In particular, Tom Wasow, one of the current directors of CSLI, led the drive to get the program approved by the Stanford administration and faculty. It is the hope that this program will inspire similar programs at other universities around the world, programs that will contribute to the training of researchers in language and information. Any readers who would like more information about the program should call the program office at (415) 723-4091, or write: Symbolic Systems Program, 62H Building 60, Stanford University, Stanford, CA 94305.

------------------
NEW CSLI PUBLICATIONS

61. D-PATR: A Development Environment for Unification-based Grammars
    Lauri Karttunen
62. A Sheaf-Theoretic Model of Concurrency
    Luis F. Monteiro and Fernando C. N. Pereira
63. Discourse, Anaphora and Parsing
    Mark Johnson and Ewan Klein
64. Tarski on Truth and Logical Consequence
    John Etchemendy

CSLI Reports and a complete list of publications can be obtained by writing to Trudy Vizmanos, CSLI, Ventura Hall, Stanford, CA 94305, or Trudy@CSLI.STANFORD.EDU.

------------------
NOTICED IN HARVARD MAGAZINE
November-December 1986

In the Books and Authors section, listed under `Political Science':

Noam Chomsky, Gj '51-'55, Barriers, M.I.T., $17.50 (paper, $7.95). Exploration of complex questions concerning theories of government and including the possibility of a unified approach.

----------------------------------------------------------------------
Editor's note

Selected commentary about Monthly articles or other matters will be published in future issues. Please send correspondence to the Editor of the Monthly at CSLI or by electronic mail to Monthly-Editor@csli.stanford.edu.
----------------------------------------------------------------------
- Elizabeth Macken
  Editor