About The Project


Sick of the city, you need some perspective. You catch the red eye to England to find out more about the masterminds behind Phrase Detectives.

As they welcome you into their brightly lit university, you realise they aren't so shadowy after all.

More about the game...

The Phrase Detectives game has been developed as a fun way for people to collaborate in creating large linguistic resources that will be used to further language technology used on the Internet, in business and on home computers. It is an example of a "Game With A Purpose" or GWAP, where a gaming environment is used for completing tasks on a scale not possible in more traditional ways. The data collected from the game is compiled to create an annotated corpus (a collection of files that humans have annotated thoroughly enough to be useful for computational purposes). This methodology has been referred to as human computation or the Wisdom of the Crowds, where large groups of people collaboratively come up with good answers.

The original Phrase Detectives game was released in 2008 and collects collaborative anaphoric decisions from online volunteers. As of December 2018, the game has collected over 4 million examples of human language in the database, submitted by 60,000 players in a collaborative effort of over 9,500 hours, or 395 days. Exported data from the game shows that the combined answers of players give a very high quality result.
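
As a rough illustration of how combined answers can be turned into a single result, the sketch below aggregates player judgments by simple majority vote and flags items where players disagree. This is a minimal sketch and not the project's actual aggregation method; the function name, the 0.7 threshold and the sample data are all hypothetical.

```python
from collections import Counter


def aggregate(judgments, ambiguity_threshold=0.7):
    """Combine player answers for each phrase by simple majority vote.

    judgments: dict mapping a phrase id to the list of answers players gave.
    Returns a dict mapping each phrase id to a tuple of
    (most common answer, share of players who gave it, possibly ambiguous?).
    The threshold is an illustrative assumption, not a project setting.
    """
    results = {}
    for phrase_id, answers in judgments.items():
        if not answers:
            continue  # nothing to aggregate for this phrase
        counts = Counter(answers)
        top_answer, top_count = counts.most_common(1)[0]
        agreement = top_count / len(answers)
        results[phrase_id] = (top_answer, agreement, agreement < ambiguity_threshold)
    return results


# Hypothetical judgments: each answer names what players think a phrase refers to.
judgments = {
    "phrase-1": ["Holmes", "Holmes", "Holmes", "Holmes", "new entity"],
    "phrase-2": ["the band", "the snake", "the band", "the snake"],
}

results = aggregate(judgments)
for phrase_id, (answer, agreement, ambiguous) in results.items():
    print(phrase_id, answer, f"{agreement:.0%}", "ambiguous" if ambiguous else "clear")
```

Here the first phrase has a clear majority, while the second splits evenly between two readings and is flagged rather than forced into a single answer.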

Read more about the game in these articles:
Innovations Report
Science Daily
PhysOrg
CS4FN

Disagreements and Language Interpretation (DALI) Project

Natural language expressions are supposed to be unambiguous in context. Yet more and more examples are emerging of expressions that are ambiguous in context but still felicitous and rhetorically unmarked. In previous work, we demonstrated that ambiguity in anaphoric reference is ubiquitous, through the study of disagreements in annotation, which we pioneered in computational linguistics (CL). Since then, additional cases of ambiguous anaphoric reference have been found, and similar findings have been made for other aspects of language interpretation, including word sense disambiguation and even part-of-speech tagging. Using the Phrase Detectives Game-With-A-Purpose to collect massive amounts of judgments online, we found that up to 30% of anaphoric expressions in our data are ambiguous. These findings raise a serious challenge for CL, as assumptions about the existence of a single interpretation in context are built into the dominant methodology, which depends on a reliably annotated gold standard.

The goal of DALI is to tackle this fundamental issue of disagreements in interpretation by using computational methods for collecting and analysing such disagreements. Some of these methods already exist but have never before been applied in linguistics on a large scale; others we will develop from scratch. First of all, we will develop more advanced games-with-a-purpose to collect massive amounts of data about anaphora from people playing a game.

Second, we will use Bayesian models of annotation, widely used in epidemiology but not in linguistics, to analyse such data and identify genuine ambiguities; doing this for anaphora will require novel methods. Third, we will use these data to revisit current theories about anaphoric expressions that do not seem to cause infelicity when ambiguous. Finally, we intend to develop the first supervised approach to anaphora resolution that does not require a gold standard, as a blueprint for other areas.

The original AnaWiki Project

Creating anaphorically annotated resources through Web cooperation

The ability to make progress in Computational Linguistics depends on the availability of large annotated corpora, but creating such corpora by hand annotation is very expensive and time-consuming; in practice, it is infeasible to annotate more than one million words.

However, the success of Wikipedia and other projects shows that another approach might be possible: take advantage of the willingness of Web users to contribute to collaborative resource creation. AnaWiki is a project that develops tools to allow and encourage large numbers of volunteers over the Web to collaborate in the creation of semantically annotated corpora (in the first instance, of a corpus annotated with information about anaphora).

Publications

Listed below are published papers relating to the Phrase Detectives game. For more information, please contact jchamb@essex.ac.uk.
Crowdsourcing and Aggregating Nested Markable Annotations
Proc. of ACL 2019.
Madge, Yu, Chamberlain, Kruschwitz, Paun & Poesio, 2019.
A Crowdsourced Corpus of Multiple Judgments and Disagreement on Anaphoric Interpretation
Proc. of NAACL 2019, Minneapolis, USA.
Poesio, Chamberlain, Paun, Yu, Uma & Kruschwitz, 2019.
Comparing Bayesian Models of Annotation
Transactions of the Association for Computational Linguistics
Paun, Carpenter, Chamberlain, Hovy, Kruschwitz & Poesio, 2018.
Optimising Crowdsourcing Efficiency: Amplifying Human Computation with Validation
it-Information Technology 60(1):41-49
Chamberlain, Kruschwitz & Poesio, 2018.
A Probabilistic Annotation Model for Crowdsourcing Coreference
Proc. of EMNLP 2018, Brussels, Belgium.
Paun, Chamberlain, Kruschwitz, Yu & Poesio, 2018.
Metrics of games-with-a-purpose for NLP applications.
Games4NLP Workshop, co-located at EACL17, Valencia.
Chamberlain, Bartle, Kruschwitz, Madge & Poesio, 2017.
Phrase Detectives Corpus 1.0 Crowdsourced Anaphoric Coreference.
Proc. LREC'16, Slovenia.
Chamberlain, Poesio & Kruschwitz, 2016.
User Performance Indicators In Task-Based Data Collection Systems.
Proc. MindTheGap'14, Berlin.
Chamberlain & O'Reilly, 2014.
Methods for Engaging and Evaluating Users of Human Computation Systems.
Handbook of Human Computation (Springer)
Chamberlain, Kruschwitz & Poesio, 2013.
Using Games to Create Language Resources: Successes and Limitations of the Approach.
The People's Web Meets NLP, 3-44 (Springer).
Chamberlain, Fort, Kruschwitz, Lafourcade & Poesio, 2013.
Phrase Detectives: Utilizing Collective Intelligence for Internet-Scale Language Resource Creation.
ACM Transactions on Interactive Intelligent Systems (TiiS) 3(1), 3.
Poesio, Chamberlain, Kruschwitz, Robaldo & Ducceschi, 2013.
Motivations for Participation in Socially Networked Collective Intelligence Systems.
Proc. CI2012, Boston.
Chamberlain, Kruschwitz & Poesio, 2012.
The Phrase Detective Multilingual Corpus, Release 0.1.
Proc. LREC2012 Istanbul.
Poesio, Chamberlain, Kruschwitz, Robaldo & Ducceschi, 2012.
Italian Anaphoric Annotation with the Phrase Detectives Game-With-A-Purpose.
AI*IA 2011: Artificial Intelligence Around Man and Beyond, pp407-412.
Robaldo, Poesio, Ducceschi, Chamberlain & Kruschwitz, 2011.
Markup Infrastructure for the Anaphoric Bank.
Modeling, Learning and Processing of Text Technological Data Structures, Studies in Computational Intelligence (Springer).
Poesio, Diewald, Stuhrenberg, Chamberlain, Goecke, Jettka & Kruschwitz, 2009.
A new life for a dead parrot: Incentive structures in the Phrase Detectives game.
Proc. Webcentives09, Madrid.
Chamberlain, Poesio & Kruschwitz, 2009.
(Linguistic) Science Through Web Collaboration in the ANAWIKI Project.
Proc. WebSci09, Athens.
Kruschwitz, Chamberlain & Poesio, 2009.
Phrase Detectives: A Web-based collaborative annotation game.
Proc. iSemantics, Graz.
Chamberlain, Poesio & Kruschwitz, 2008.
Addressing the Resource Bottleneck to Create Large-Scale Annotated Texts.
Proc. STEP2008, Venice.
Chamberlain, Poesio & Kruschwitz, 2008.
ANAWIKI: Creating anaphorically annotated resources through Web cooperation.
Proc. LREC'08, Marrakech.
Poesio, Kruschwitz & Chamberlain, 2008.
Prizes Oct 2021
Most annotations:
Magoogy £50
Magic_Is_Real £15
Wellington £10

Best comments:
"not a correct NP, they need to be kept separate"
AColson £30
"'Where are my brothers': this is a full sentence, not a NP"
Wellington £30
Game stats

612 documents completed

The most recent was The Adventures of Sherlock Holmes - The Adventure of the Speckled Band (Sir Arthur Conan Doyle) completed by anagram on 09 Nov 2023

The last document to be worked on was Evolution (Wikipedia) by samtxt