About The Project

Sick of the city, you need some perspective. You catch the red eye to England to find out more about the masterminds behind Phrase Detectives.

As they welcome you into their brightly lit University, you realise they aren't so shadowy after all.

More about the game...

The Phrase Detectives game has been developed as a fun way for people to collaborate in creating large linguistic resources that will be used to further language technology used on the Internet, in business and on home computers. It is an example of a "Game With A Purpose" or GWAP, where a gaming environment is used for completing tasks on a scale not possible in more traditional ways. The data collected from the game is compiled to create an annotated corpus (a collection of files that humans have worked on enough to make them useful for computational purposes). This methodology has been referred to as human computation or the Wisdom of the Crowds, where large groups of people collaboratively come up with good answers.

The original Phrase Detectives game was released in 2008 and collects collaborative anaphoric decisions from online volunteers. As of December 2018, the game had collected over 4 million examples of human language, submitted by 60,000 players: a collaborative effort of over 9,500 hours, or about 395 days. Data exported from the game shows that the combined answers of players give a very high-quality result.
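As a rough illustration of how combined answers can outperform individual ones, the sketch below aggregates several players' judgments for one expression by simple majority vote and reports how strongly they agree. This is only an assumed stand-in: this page does not describe how the game actually combines answers, and the function name `aggregate` and the sample judgments are hypothetical.

```python
from collections import Counter


def aggregate(judgments):
    """Return (most common label, share of players who chose it).

    Hypothetical helper: simple majority voting, not necessarily
    the method used by the game itself.
    """
    if not judgments:
        raise ValueError("need at least one judgment")
    counts = Counter(judgments)
    label, votes = counts.most_common(1)[0]
    return label, votes / len(judgments)


# Made-up judgments: each expression maps to the antecedents chosen by players.
sample = {
    "it": ["the parrot", "the parrot", "the parrot", "the shop"],
    "they": ["the owners", "the customers", "the owners", "the customers"],
}

for expression, labels in sample.items():
    label, agreement = aggregate(labels)
    print(f"{expression}: {label} (agreement {agreement:.2f})")
```

A low agreement score, as for "they" in this made-up data, marks an expression that players interpret in more than one way.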

Read more about the game in these articles:
Innovations Report
Science Daily
PhysOrg
CS4FN

Disagreements and Language Interpretation (DALI) Project

Natural language expressions are supposed to be unambiguous in context. Yet more and more examples are emerging of expressions that are ambiguous in context, yet felicitous and rhetorically unmarked. In previous work, we demonstrated that ambiguity in anaphoric reference is ubiquitous, through the study of disagreements in annotation, which we pioneered in computational linguistics (CL). Since then, additional cases of ambiguous anaphoric reference have been found, and similar findings have been made for other aspects of language interpretation, including word sense disambiguation and even part-of-speech tagging. Using the Phrase Detectives Game-With-A-Purpose to collect massive amounts of judgments online, we found that up to 30% of anaphoric expressions in our data are ambiguous. These findings raise a serious challenge for CL, as assumptions about the existence of a single interpretation in context are built into the dominant methodology, which depends on a reliably annotated gold standard.

The goal of DALI is to tackle this fundamental issue of disagreements in interpretation by using computational methods for collecting and analysing such disagreements. Some of these methods already exist but have never before been applied in linguistics on a large scale; others we will develop from scratch. First, we will develop more advanced games-with-a-purpose to collect massive amounts of data about anaphora from people playing a game.

Second, we will use Bayesian models of annotation, widely used in epidemiology but not in linguistics, to analyse such data and identify genuine ambiguities; doing this for anaphora will require novel methods. Third, we will use these data to revisit current theories about anaphoric expressions that do not seem to cause infelicitousness when ambiguous. Finally, we intend to develop the first supervised approach to anaphora resolution that does not require a gold standard, as a blueprint for other areas.

The original AnaWiki Project

Creating anaphorically annotated resources through Web cooperation

The ability to make progress in Computational Linguistics depends on the availability of large annotated corpora, but creating such corpora by hand annotation is very expensive and time-consuming; in practice, it is infeasible to think of annotating more than one million words.

However, the success of Wikipedia and other projects shows that another approach might be possible: take advantage of the willingness of Web users to contribute to collaborative resource creation. AnaWiki is a project that develops tools to allow and encourage large numbers of volunteers over the Web to collaborate in the creation of semantically annotated corpora (in the first instance, of a corpus annotated with information about anaphora).

Publications

Listed below are published papers relating to the Phrase Detectives game. For more information, please contact jchamb@essex.ac.uk.

Crowdsourcing and Aggregating Nested Markable Annotations
Proc. of ACL 2019.
Madge, Yu, Chamberlain, Kruschwitz, Paun & Poesio, 2019.

A Crowdsourced Corpus of Multiple Judgments and Disagreement on Anaphoric Interpretation
Proc. of NAACL 2019, Minneapolis, USA.
Poesio, Chamberlain, Paun, Yu, Uma & Kruschwitz, 2019.

Comparing Bayesian Models of Annotation
Transactions of the Association for Computational Linguistics
Paun, Carpenter, Chamberlain, Hovy, Kruschwitz & Poesio, 2018.

Optimising Crowdsourcing Efficiency: Amplifying Human Computation with Validation
it-Information Technology 60(1):41-49
Chamberlain, Kruschwitz & Poesio, 2018.

A Probabilistic Annotation Model for Crowdsourcing Coreference
Proc. of EMNLP 2018, Brussels, Belgium.
Paun, Chamberlain, Kruschwitz, Yu & Poesio, 2018.

Metrics of games-with-a-purpose for NLP applications.
Games4NLP Workshop, co-located at EACL17, Valencia.
Chamberlain, Bartle, Kruschwitz, Madge & Poesio, 2017.

Phrase Detectives Corpus 1.0 Crowdsourced Anaphoric Coreference.
Proc. LREC'16, Slovenia.
Chamberlain, Poesio & Kruschwitz, 2016.

User Performance Indicators In Task-Based Data Collection Systems.
Proc. MindTheGap'14, Berlin.
Chamberlain & O'Reilly, 2014.

The Annotation-Validation (AV) Model: Rewarding Contribution Using Retrospective Agreement
Proc. GamifIR'14, Amsterdam.
Chamberlain, 2014.

Methods for Engaging and Evaluating Users of Human Computation Systems.
Handbook of Human Computation (Springer)
Chamberlain, Kruschwitz & Poesio, 2013.

Using Games to Create Language Resources: Successes and Limitations of the Approach.
The People's Web Meets NLP, 3-44 (Springer).
Chamberlain, Fort, Kruschwitz, Lafourcade & Poesio, 2013.

Phrase Detectives: Utilizing Collective Intelligence for Internet-Scale Language Resource Creation.
ACM Transactions on Interactive Intelligent Systems (TiiS) 3(1), 3.
Poesio, Chamberlain, Kruschwitz, Robaldo & Ducceschi, 2013.

Motivations for Participation in Socially Networked Collective Intelligence Systems.
Proc. CI2012, Boston.
Chamberlain, Kruschwitz & Poesio, 2012.

The Phrase Detective Multilingual Corpus, Release 0.1.
Proc. LREC2012 Istanbul.
Poesio, Chamberlain, Kruschwitz, Robaldo & Ducceschi, 2012.

Italian Anaphoric Annotation with the Phrase Detectives Game-With-A-Purpose.
AI*IA 2011: Artificial Intelligence Around Man and Beyond, pp407-412.
Robaldo, Poesio, Ducceschi, Chamberlain & Kruschwitz, 2011.

Markup Infrastructure for the Anaphoric Bank. Modeling, Learning and Processing of Text Technological Data Structures.
Studies in Computational Intelligence (Springer).
Poesio, Diewald, Stuhrenberg, Chamberlain, Goecke, Jettka & Kruschwitz, 2009.

Constructing An Anaphorically Annotated Corpus With Non-Experts: Assessing The Quality Of Collaborative Annotations.
Proc. ACL-IJCNLP 09, Singapore.
Chamberlain, Kruschwitz & Poesio, 2009.

A new life for a dead parrot: Incentive structures in the Phrase Detectives game.
Proc. Webcentives09, Madrid.
Chamberlain, Poesio & Kruschwitz, 2009.

(Linguistic) Science Through Web Collaboration in the ANAWIKI Project.
Proc. WebSci09, Athens.
Kruschwitz, Chamberlain & Poesio, 2009.

Phrase Detectives: A Web-based collaborative annotation game.
Proc. iSemantics, Graz.
Chamberlain, Poesio & Kruschwitz, 2008.

Addressing the Resource Bottleneck to Create Large-Scale Annotated Texts.
Proc. STEP2008, Venice.
Chamberlain, Poesio & Kruschwitz, 2008.

ANAWIKI: Creating anaphorically annotated resources through Web cooperation.
Proc. LREC'08, Marrakech.
Poesio, Kruschwitz & Chamberlain, 2008.

Prizes Oct 2021

Most annotations:
Magoogy £50
Magic_Is_Real £15
Wellington £10

Best comments:
"not a correct NP, they need to be kept separate"
AColson £30
"'Where are my brothers': this is a full sentence, not a NP"
Wellington £30
