Disagreements and Language Interpretation
A project funded by the ERC, grant number ERC-2015-AdG
Project description
Natural language expressions are supposed to be unambiguous in context. Yet more and more examples are emerging of expressions that are ambiguous in context but nonetheless felicitous and rhetorically unmarked. In my own work, I demonstrated that ambiguity in anaphoric reference is ubiquitous, through the study of disagreements in annotation, an approach I pioneered in computational linguistics (CL). Since then, additional cases of ambiguous anaphoric reference have been found, and similar findings have been made for other aspects of language interpretation, including word sense disambiguation and even part-of-speech tagging. Using the Phrase Detectives Game-With-A-Purpose to collect massive amounts of judgments online, we found that up to 30% of anaphoric expressions in our data are ambiguous. These findings raise a serious challenge for CL, as the assumption that an expression has a single interpretation in context is built into the dominant methodology, which depends on a reliably annotated gold standard.
The goal of the proposed project is to tackle this fundamental issue of disagreements in interpretation by using computational methods for collecting and analysing such disagreements. Some of these methods already exist but have never before been applied in linguistics on a large scale; others we will develop from scratch. First, I propose to develop more advanced games-with-a-purpose to collect massive amounts of data about anaphora from people playing a game. Second, I propose to use Bayesian models of annotation, widely used in epidemiology but not in linguistics, to analyse such data and identify genuine ambiguities; doing this for anaphora will require novel methods. Third, I propose to use these data to revisit current theories about anaphoric expressions that do not seem to cause infelicity when ambiguous. Finally, I propose to develop the first supervised approach to anaphora resolution that does not require a gold standard, as a blueprint for other areas.
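To give a rough idea of what a model of annotation does with disagreeing judgments, the sketch below implements a small Dawid-and-Skene-style aggregator in plain Python. It is an illustration only, not the project's method: it uses EM point estimates rather than full Bayesian inference, it assumes a fixed label set for every item (which anaphora does not have), and the function names, the smoothing constant, and the threshold-based flagging heuristic are all assumptions introduced here.

```python
from collections import defaultdict


def dawid_skene(judgments, labels, iterations=20, smoothing=0.01):
    """Estimate a posterior over labels for each item.

    judgments: list of (item, annotator, label) triples (hypothetical format).
    labels: the fixed set of possible labels (an assumption of this sketch).
    """
    items = sorted({item for item, _, _ in judgments})
    annotators = sorted({ann for _, ann, _ in judgments})
    by_item = defaultdict(list)
    for item, ann, label in judgments:
        by_item[item].append((ann, label))

    # Initialise each item's posterior from its smoothed vote proportions.
    posterior = {}
    for item in items:
        counts = {l: smoothing for l in labels}
        for _, label in by_item[item]:
            counts[label] += 1.0
        total = sum(counts.values())
        posterior[item] = {l: c / total for l, c in counts.items()}

    for _ in range(iterations):
        # M-step: label prior and one confusion matrix per annotator.
        prior = {l: sum(posterior[i][l] for i in items) / len(items)
                 for l in labels}
        confusion = {a: {t: {o: smoothing for o in labels} for t in labels}
                     for a in annotators}
        for item in items:
            for ann, observed in by_item[item]:
                for true in labels:
                    confusion[ann][true][observed] += posterior[item][true]
        for ann in annotators:
            for true in labels:
                row_total = sum(confusion[ann][true].values())
                for observed in labels:
                    confusion[ann][true][observed] /= row_total

        # E-step: recompute each item's posterior given the new parameters.
        for item in items:
            scores = {}
            for true in labels:
                score = prior[true]
                for ann, observed in by_item[item]:
                    score *= confusion[ann][true][observed]
                scores[true] = score
            total = sum(scores.values())
            posterior[item] = {l: s / total for l, s in scores.items()}
    return posterior


def candidate_ambiguities(posterior, threshold=0.9):
    """Crude heuristic: items whose most probable label falls below threshold."""
    return [item for item, dist in posterior.items()
            if max(dist.values()) < threshold]
```

A call might look like `dawid_skene([("m1", "ann1", "A"), ("m1", "ann2", "B")], ["A", "B"])`, where `m1` is a hypothetical markable and `A`, `B` are two candidate interpretations. Because a model of this kind assumes a single true label per item, a flat posterior is at best a hint that an item deserves inspection; separating genuine ambiguity from annotator noise is exactly where the novel methods mentioned above would be needed.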
Events
We organised the Games4NLP symposium at EACL'17 in Valencia, Spain.
People
At the University of Essex, School of Computer Science and Electronic Engineering:
- Massimo Poesio
- Richard Bartle
- Udo Kruschwitz
- Jon Chamberlain
- Silviu Paun
- Juntao Yu
- Chris Madge
Collaborators
- Linguistic Data Consortium (University of Pennsylvania)
- Bob Carpenter (Columbia University)
- Dirk Hovy (University of Copenhagen)
- Dagmara Dziedzic (Adam Mickiewicz University, Poznan, Poland)
- Wojciech Wlodarczyk (Adam Mickiewicz University, Poznan, Poland)
Publications
- Testing game mechanics in games with a purpose for NLP applications
Proc. Games4NLP symposium, Valencia
Madge, Kruschwitz, Chamberlain, Bartle and Poesio
- Metrics of games-with-a-purpose for NLP applications
Proc. Games4NLP symposium, Valencia
Chamberlain, Bartle, Kruschwitz, Madge and Poesio
- Phrase Detectives Corpus 1.0 Crowdsourced Anaphoric Coreference.
Proc. LREC16, Portoroz
Chamberlain, Poesio & Kruschwitz, 2016
The dataset is available on request; please contact jchamb@essex.ac.uk
- Novel Incentives for Phrase Detectives.
Proc. LREC16, Portoroz
Poesio, Chamberlain, Kruschwitz & Madge, 2016
- Phrase Detectives: Utilizing collective intelligence for internet-scale language resource creation (extended abstract).
Proc. IJCAI15, Buenos Aires
Poesio, Chamberlain, Kruschwitz, Robaldo & Ducceschi, 2015
- User Performance Indicators In Task-Based Data Collection Systems.
Proc. MindTheGap'14, Berlin.
Chamberlain & O'Reilly, 2014.
- The Annotation-Validation (AV) Model: Rewarding Contribution Using Retrospective Agreement.
Proc. GamifIR'14, Amsterdam.
Chamberlain, 2014.
- Methods for Engaging and Evaluating Users of Human Computation Systems.
Handbook of Human Computation (Springer)
Chamberlain, Kruschwitz & Poesio, 2014.
- Using Games to Create Language Resources: Successes and Limitations of the Approach.
The People's Web Meets NLP, 3-44 (Springer).
Chamberlain, Fort, Kruschwitz, Lafourcade & Poesio, 2013.
- Phrase Detectives: Utilizing Collective Intelligence for Internet-Scale Language Resource Creation.
ACM Transactions on Interactive Intelligent Systems (TiiS) 3(1), 3.
Poesio, Chamberlain, Kruschwitz, Robaldo & Ducceschi, 2013.
- Motivations for Participation in Socially Networked Collective Intelligence Systems.
Proc. CI2012, Boston.
Chamberlain, Kruschwitz & Poesio, 2012.
- The Phrase Detective Multilingual Corpus, Release 0.1.
Proc. LREC2012, Istanbul.
Poesio, Chamberlain, Kruschwitz, Robaldo & Ducceschi, 2012.
- Italian Anaphoric Annotation with the Phrase Detectives Game-With-A-Purpose.
AI*IA 2011: Artificial Intelligence Around Man and Beyond, pp. 407-412.
Robaldo, Poesio, Ducceschi, Chamberlain & Kruschwitz, 2011.
- Phrase Detectives: A Web-based collaborative annotation game.
Proc. iSemantics, Graz.
Chamberlain, Poesio & Kruschwitz, 2008.
Phrase Detectives
The Phrase Detectives game has been released and is collecting collaborative anaphoric decisions from online volunteers.
Phrase Detectives on Facebook
The Phrase Detectives game was redeveloped for release on the Facebook social network in February 2011.