How to interpret entropy

Note: This is a post that I started some time ago and have had on my todo list to finish for... maybe a year now? Apologies for the delay!

We've argued that more entropy in a survey is better for detecting bad actors. The argument goes like this: a survey of 5 yes/no questions has (ignoring breakoff) 32 possible unique answer combinations. The maximum entropy of this survey, achieved when every answer is equally likely, is

$$-5\bigl(\tfrac{1}{2} \log_2(\tfrac{1}{2}) + \tfrac{1}{2} \log_2(\tfrac{1}{2}) \bigr) = -5\bigl(-\tfrac{1}{2} - \tfrac{1}{2}\bigr) = 5 \text{ bits.}$$
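The same calculation generalizes to any number of questions and options per question. Here is a minimal sketch in Python (the function name and structure are mine, not from any survey library):

```python
from math import log2

def max_survey_entropy(num_questions, options_per_question=2):
    """Maximum entropy (in bits) of a survey answered uniformly
    at random: each question contributes -sum(p * log2(p)) over
    its options, which simplifies to log2(options) bits."""
    p = 1.0 / options_per_question
    per_question = -sum(p * log2(p) for _ in range(options_per_question))
    return num_questions * per_question

print(max_survey_entropy(5))  # 5 yes/no questions -> 5.0 bits
```

Note that each uniform yes/no question contributes exactly one bit, so the maximum entropy is just the number of questions; respondents whose answers have much lower empirical entropy (e.g. straight-lining one option) stand out against this ceiling.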

[Read More]


This one time, I wrote something in Perl

I was just looking at some old crappy code I wrote four years ago, to show Molly that one should have no shame when posting code online. While looking at my own HMM code, I ran across this code. I have no idea what it does. I just remember wanting to use Perl for string munging crap. Yikes!

[Read More]


Participation and Contribution in Crowdsourced Surveys

Participation and Contribution in Crowdsourced Surveys, a recent PLOS ONE article, discusses some interesting approaches to crowdsourced surveys. Not only are the answers crowdsourced, but the questions themselves are also crowdsourced. The surveys are seeded with a small number of questions and later augmented with questions supplied by respondents. These questions are curated by hand and presented to respondents in a random order.

[Read More]


Rage Against the Machine Learning

Whoa, it's been a while since I actually made a blog post. I'm going to try to make a blog post every day until I head off for my Facebook internship in June (super psyched!). I'm putting it in writing, so that's how you know I'm definitely going to break my word...

I owe the above title to Bryan Ford, who listened to me raging against Machine Learning this afternoon. Bryan is visiting for our first systems lunch in a while. It's been a long hiring season...

[Read More]