Hacker News

While that's a very large pool, it's almost certainly not random.



What sample size or methodology for the question at hand would satisfy you at this point?

https://en.wikipedia.org/wiki/On_Exactitude_in_Science


A huge sample size does not necessarily mean a good sample. While it's difficult to say what the impact is of sampling only from people who self-select into a given population, it's clearly a bias. If a quarter of the population self-selects into Facebook, I'd prefer a sample that includes three times as many people who didn't.

If nothing else, it's worth considering the demographics: within the USA, women use Facebook at a higher rate than men; people with some college education use Facebook at a higher rate than either college graduates or adults who've never been to college; Facebook use rate is roughly negatively correlated with income level; use of Facebook is negatively correlated with age (http://www.pewinternet.org/2016/11/11/social-media-update-20...). So the leftover US population skews male and slightly higher-income than the Facebook population. And globally, Facebook use is correlated with national wealth (minus some outliers). So there's clearly a sampling bias, and that's worth understanding.

That said, whether the results would be any different is hard (impossible?) to say without actually trying a different sampling methodology, and I suspect the results from this study are valid, sampling bias notwithstanding.


A few hundred randomly selected people would give more certainty than billions that aren't randomly selected.
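
To make that concrete, here's a rough simulation sketch. Everything in it is made up for illustration: the population, the outcome being measured, and the assumption that the chance of self-selecting rises with that outcome (about a quarter of people join overall). It only shows the mechanism, and says nothing about whether the study in question is actually biased this way.

```python
import random
import statistics


def simulate(pop_size=200_000, random_n=300, seed=0):
    """Compare a huge self-selected sample with a small simple random sample.

    All numbers here are hypothetical; nothing is taken from a real study.
    """
    rng = random.Random(seed)

    # Hypothetical population: one numeric outcome per person (mean ~50, sd ~10).
    population = [rng.gauss(50, 10) for _ in range(pop_size)]
    true_mean = statistics.fmean(population)

    # Assumed self-selection: the probability of joining is about 25% on
    # average and rises with the outcome, clamped to the range [0, 1].
    def join_probability(x):
        return min(1.0, max(0.0, 0.25 + 0.01 * (x - 50)))

    self_selected = [x for x in population if rng.random() < join_probability(x)]

    # Simple random sample: every person is equally likely to be drawn.
    random_sample = rng.sample(population, random_n)

    return (
        true_mean,
        statistics.fmean(self_selected),
        len(self_selected),
        statistics.fmean(random_sample),
    )


if __name__ == "__main__":
    true_mean, biased_mean, biased_n, srs_mean = simulate()
    print(f"true mean:            {true_mean:.2f}")
    print(f"self-selected (n={biased_n}): {biased_mean:.2f}")
    print(f"random sample (n=300):  {srs_mean:.2f}")
```

The self-selected estimate is off by a fixed amount that no increase in sample size removes, while the random sample's error is pure noise that shrinks as it grows. If selection were unrelated to the outcome, the big sample would win easily, which is why the real question is how selection correlates with what's being measured.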



