Background on Classifiers
Sometimes I think we should give up on classifying bad actors in SurveyMan: it's a very hard problem, and developing solutions feels far afield from what I want to be doing. Then I see blog posts like this one that show the limitations of post-hoc modeling and the pervasiveness of bad actors. As some of the commenters pointed out, if people were responding randomly, we could treat their responses as noise. The trouble is that people do not respond randomly: some questions may be more likely than others to elicit "incorrect" responses, which puts us in bias-detection territory.
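To make that distinction concrete, here's a minimal simulation (not anything from SurveyMan itself; the population sizes and answer distributions are made up) of why uniform random responders are recoverable noise while biased careless responders are not:

```python
# Minimal sketch: uniform random responders flatten the observed answer
# distribution symmetrically (correctable if you can estimate the careless
# fraction), while biased responders inflate specific answers.
import numpy as np

rng = np.random.default_rng(0)
n_options = 4
true_dist = np.array([0.1, 0.2, 0.3, 0.4])  # honest answer distribution

def sample(n_honest, n_careless, careless_dist):
    honest = rng.choice(n_options, size=n_honest, p=true_dist)
    careless = rng.choice(n_options, size=n_careless, p=careless_dist)
    return np.concatenate([honest, careless])

uniform = np.full(n_options, 1 / n_options)    # random clicking
first_option = np.array([1.0, 0.0, 0.0, 0.0])  # always picks option A

for label, dist in [("uniform", uniform), ("first-option", first_option)]:
    responses = sample(9000, 1000, dist)
    observed = np.bincount(responses, minlength=n_options) / len(responses)
    print(label, np.round(observed, 3))
# Uniform careless responders pull every option toward 0.25 by the same
# known mixture; first-option responders distort option A alone: bias.
```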
[Read More]

My Personality
To get a sense of one of the background papers on identifying careless responses, I took its personality inventory. For funsies, here's what it said. My non-average traits are:
Domain | Subdomain | High | Low |
---|---|---|---|
Extroversion | Assertiveness | ✔ | |
Agreeableness | Morality | ✔ | |
Agreeableness | Altruism | ✔ | |
Agreeableness | Cooperation | ✔ | |
Conscientiousness | Self-efficacy | ✔ | |
Conscientiousness | Orderliness | ✔ | |
Conscientiousness | Dutifulness | ✔ | |
Conscientiousness | Cautiousness | ✔ | |
Neuroticism | Anger | ✔ | |
Neuroticism | Vulnerability | ✔ | |
Openness to Experience | Artistic Interests | ✔ | |
Openness to Experience | Intellect | ✔ | |
Openness to Experience | Liberalism | ✔ | |
Crowdsourcing system basics
Some time ago I started writing up the requirements for a sound crowdsourcing system. Last year I wrote about Amazon's report on using verification tools at scale, and it got me thinking about the kinds of abstractions one would need to formally verify a crowdsourcing system.
[Read More]

Counterfactual Estimation and Optimization of Click Metrics for Search Engines
So Eytan suggested this paper as a reference for our work on static analysis for PlanOut, but I've recently been thinking about how some of its ideas might be ported to SurveyMan.
[Read More]

Smarter scheduling in SurveyMan
Conventional wisdom (and testimonials from researchers who have been burned) says that time of day can introduce bias into crowdsourced data collection. Right now, SurveyMan posts a single HIT per survey, requesting \(n\) assignments. If we collect \(n\) assignments and find that they are low quality, we ask for more by extending the HIT.
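The mechanics of that post-then-extend flow on MTurk might look something like the sketch below. It uses boto3's MTurk client rather than SurveyMan's actual implementation, the HIT parameters are illustrative, and the quality check `enough_good_responses` is a hypothetical placeholder:

```python
# Sketch of posting a single HIT with n assignments and extending it
# when the collected responses turn out to be low quality. Not
# SurveyMan's implementation; values below are placeholders.
import boto3

mturk = boto3.client("mturk", region_name="us-east-1")

def enough_good_responses(hit_id):
    # Hypothetical stub: a bad-actor classifier would decide here
    # whether the assignments collected so far are usable.
    raise NotImplementedError

def post_survey_hit(question_xml, n):
    """Post a single HIT for the whole survey, requesting n assignments."""
    hit = mturk.create_hit(
        Title="Take a short survey",
        Description="Answer a randomized survey.",
        Reward="0.50",
        MaxAssignments=n,
        LifetimeInSeconds=24 * 60 * 60,
        AssignmentDurationInSeconds=30 * 60,
        Question=question_xml,
    )
    return hit["HIT"]["HITId"]

def extend_hit(hit_id, n_more):
    """Ask for more responses by extending the existing HIT."""
    if not enough_good_responses(hit_id):
        mturk.create_additional_assignments_for_hit(
            HITId=hit_id,
            NumberOfAdditionalAssignments=n_more,
        )
```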
[Read More]