Creating anaphorically annotated resources through Web cooperation
A project funded by the EPSRC, grant number EP/F00575X/1
Project description
Progress in Computational Linguistics depends on the availability of large annotated corpora, but creating such corpora by hand is very expensive and time-consuming; in practice, annotating more than one million words by hand is infeasible.
However, the success of Wikipedia and other projects shows that another approach might be possible: take advantage of the willingness of Web users to contribute to collaborative resource creation. AnaWiki is a recently started project that will develop tools to allow and encourage large numbers of volunteers over the Web to collaborate in the creation of semantically annotated corpora (in the first instance, of a corpus annotated with information about anaphora).
People
At the University of Essex, School of Computer Science and Electronic Engineering:
Collaborators
At the University of Bielefeld:
- Daniela Goecke
- Maik Stuehrenberg
- Nils Diewald
- Daniel Jettka
Researchers from the following institutions are using the annotated corpus:
- University of Ottawa, Canada
Exploring the relationship between topical structure of text and the likelihood of anaphoric links within and across topical segments.
- Queen Mary University, England
Attempting abstractive summarization of narrative.
- University of Essex, England
Investigating crowd aggregation techniques and developing new games for annotation tasks.
Publications
- Phrase Detectives Corpus 1.0: Crowdsourced Anaphoric Coreference.
Proc. LREC16, Portoroz
Chamberlain, Poesio & Kruschwitz, 2016
The dataset is available on request; please contact jchamb@essex.ac.uk
- Novel Incentives for Phrase Detectives.
Proc. LREC16, Portoroz
Poesio, Chamberlain, Kruschwitz & Madge, 2016
- Phrase Detectives: Utilizing collective intelligence for internet-scale language resource creation (extended abstract).
Proc. IJCAI15, Buenos Aires
Poesio, Chamberlain, Kruschwitz, Robaldo & Ducceschi, 2015
- User Performance Indicators In Task-Based Data Collection Systems.
Proc. MindTheGap'14, Berlin.
Chamberlain & O'Reilly, 2014.
- The Annotation-Validation (AV) Model: Rewarding Contribution Using Retrospective Agreement.
Proc. GamifIR'14, Amsterdam.
Chamberlain, 2014.
- Methods for Engaging and Evaluating Users of Human Computation Systems.
Handbook of Human Computation (Springer)
Chamberlain, Kruschwitz & Poesio, 2014.
- Using Games to Create Language Resources: Successes and Limitations of the Approach.
The People's Web Meets NLP, 3-44 (Springer).
Chamberlain, Fort, Kruschwitz, Lafourcade & Poesio, 2013.
- Phrase Detectives: Utilizing Collective Intelligence for Internet-Scale Language Resource Creation.
ACM Transactions on Interactive Intelligent Systems (TiiS) 3(1), 3.
Poesio, Chamberlain, Kruschwitz, Robaldo & Ducceschi, 2013.
- Motivations for Participation in Socially Networked Collective Intelligence Systems.
Proc. CI2012, Boston.
Chamberlain, Kruschwitz & Poesio, 2012.
- The Phrase Detective Multilingual Corpus, Release 0.1.
Proc. LREC2012, Istanbul.
Poesio, Chamberlain, Kruschwitz, Robaldo & Ducceschi, 2012.
- Italian Anaphoric Annotation with the Phrase Detectives Game-With-A-Purpose.
AI*IA 2011: Artificial Intelligence Around Man and Beyond, pp. 407-412.
Robaldo, Poesio, Ducceschi, Chamberlain & Kruschwitz, 2011.
- Constructing An Anaphorically Annotated Corpus With Non-Experts: Assessing The Quality Of Collaborative Annotations.
Proc. ACL-IJCNLP 09, Singapore.
Chamberlain, Kruschwitz & Poesio, 2009.
- Markup Infrastructure for the Anaphoric Bank.
Modeling, Learning and Processing of Text Technological Data Structures, Studies in Computational Intelligence (Springer).
Poesio, Diewald, Stuehrenberg, Chamberlain, Goecke, Jettka & Kruschwitz, 2009.
- A new life for a dead parrot: Incentive structures in the Phrase Detectives game.
Proc. Webcentives09, Madrid.
Chamberlain, Poesio & Kruschwitz, 2009.
- (Linguistic) Science Through Web Collaboration in the ANAWIKI Project.
Proc. WebSci09, Athens.
Kruschwitz, Chamberlain & Poesio, 2009.
- Phrase Detectives: A Web-based collaborative annotation game.
Proc. iSemantics, Graz.
Chamberlain, Poesio & Kruschwitz, 2008.
- Addressing the Resource Bottleneck to Create Large-Scale Annotated Texts.
Proc. STEP2008, Venice.
Chamberlain, Poesio & Kruschwitz, 2008.
- ANAWIKI: Creating anaphorically annotated resources through Web cooperation.
Proc. LREC'08, Marrakech.
Poesio, Kruschwitz & Chamberlain, 2008.
Phrase Detectives
The Phrase Detectives game has been released and is collecting collaborative anaphoric decisions from online volunteers.
Phrase Detectives on Facebook
The Phrase Detectives game was redeveloped for release on the Facebook social network in February 2011.
Anaphoric Bank
The Anaphoric Bank is a club created to facilitate resource sharing among researchers working on anaphora.