
Every teacher of logic knows that the ease with which a student can translate a natural language sentence into first-order logic depends, among other things, on how that sentence is phrased. This talk reports findings from a pilot data mining study of a large-scale corpus in formal logic education, which we used to provide empirical evidence for specific characteristics of natural language problem statements that frequently lead students to make mistakes. We developed a rich taxonomy of the types of errors that students make, and implemented tools for automatically classifying student errors into these categories. We focus on three phenomena that were prevalent in our data: students were found (a) to have particular difficulty distinguishing the conditional from the biconditional, (b) to be sensitive to word-order effects during translation, and (c) to be sensitive to factors associated with the naming of constants. The talk concludes by considering the implications of this kind of large-scale empirical study for improving an automated assessment system specifically, and logic teaching more generally.
Joint work with Richard Cox (University of Sussex) and Robert Dale (Macquarie University).