Any doubts about the value of the pile of data that Mark Zuckerberg sits on were dispelled for me last month with the publication of a short research paper by Michal Kosinski and colleagues from the Cambridge Psychometrics Centre in the Proceedings of the National Academy of Sciences.
The study took the ‘likes’ of 58,000 American Facebook users and built statistical models relating them to the users’ volunteered demographic information and psychometric profiles, all with the users’ permission. The goal was to see how accurately personal details, such as religious beliefs, political leaning and sexual orientation, could be predicted using only Facebook likes. Using more than 700 likes, Kosinski built models with outstanding predictive accuracy from non-explicit ‘likes’, such as music choices, rather than explicit ones such as ‘likes gay marriage’.
The resulting models were:
- 88 per cent accurate in differentiating gay men from straight men
- 95 per cent accurate in differentiating African Americans from Caucasian Americans
- 85 per cent accurate in differentiating Republicans from Democrats
- 82 per cent accurate in differentiating Christians from Muslims
The predictions rely on inference rather than on explicit likes. That is, they are based on large numbers of relatively innocuous likes such as music, TV and food preferences. For example, the best predictors of high intelligence include likes for thunderstorms, The Colbert Report, science and curly fries, whereas low intelligence was indicated by liking French cosmetics brand Sephora, ‘I Love Being a Mom’, Harley-Davidson and Lady Antebellum. Good predictors of male homosexuality included MAC Cosmetics, Wicked the musical and the No H8 Campaign (a charitable campaign designed to raise opposition to California’s Proposition 8, which banned gay marriage in the state). My favourite TV show is The Colbert Report, but I’ll leave you to make your own inferences.
Whilst there are some bizarre correlations here (curly fries and high intelligence?), many fit comfortably within the boundaries of racial, sexual and political stereotypes. Bear in mind, these are predictors, not guaranteed outcomes.
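To make the idea of ‘predictors, not guaranteed outcomes’ concrete, here is a minimal sketch of how a trait might be estimated from likes. It is a toy illustration only and does not claim to reproduce the study’s pipeline: the page names, the six users and their trait labels are all invented, and plain logistic regression is simply my stand-in for ‘a statistical model’.

```python
import math

# Hypothetical toy data: each row is one user, each column one page.
# 1 means the user liked the page. Page names and labels are invented.
PAGES = ["band_a", "tv_show_b", "snack_c", "brand_d"]
USER_LIKES = [
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 1, 1],
]
# Volunteered trait for each user (1 = has the trait, 0 = does not).
TRAIT = [1, 1, 1, 0, 0, 0]


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, likes):
    """Estimated probability that a user with these likes has the trait."""
    z = bias + sum(w * x for w, x in zip(weights, likes))
    return sigmoid(z)


def train(rows, labels, epochs=2000, lr=0.1):
    """Fit one weight per page with simple stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for likes, label in zip(rows, labels):
            error = predict_proba(weights, bias, likes) - label
            weights = [w - lr * error * x for w, x in zip(weights, likes)]
            bias -= lr * error
    return weights, bias


weights, bias = train(USER_LIKES, TRAIT)

# A new user who has liked only the first page.
new_user = [1, 0, 0, 0]
probability = predict_proba(weights, bias, new_user)
print(f"Estimated probability of trait: {probability:.2f}")
```

The output is a probability rather than a verdict, which is the point: each like nudges the estimate up or down, and no single like settles the question.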
The models are also accurate in predicting consumption behaviour, particularly alcohol, cigarette and drug use. For example, liking ‘That spider is more scared than U’ is a predictor of being a non-smoker.
Clearly, aggregated ‘like’ data can generate a surprisingly accurate picture of the personal traits of millions of Facebook users worldwide, a potential boon for advertisers. Unlike other academic research in this area, the authors don’t decry the dangers of such data being used by online advertisers. In fact, they give examples of how likes could be used to target online advertising for the users’ benefit (the researchers were in part funded by Microsoft). Any Facebook user can try a one-click personality test for themselves.
The potential goes beyond advertising, of course. A recent article in The Guardian (“How Facebook Could Get You Arrested”, 9 March 2013) reported that Facebook has begun to work with the police, using algorithms and historical data to predict which of its users might commit crimes using its services. One example of a flagged pattern is an adult male user who chats with under-18s, whose friends are mostly female, and who uses keywords such as ‘sex’ or ‘date’. Has anyone seen Minority Report?
Unfortunately for Facebook, its share price continues to fall despite this big-data vindication of what Zuckerberg has been telling us. Optimism that the social network had found a way to unlock the value of its mobile users raised the share price from $26 at the end of last year to almost $33 at the end of January. It has since dropped back for a number of reasons: the comparative success of LinkedIn, data indicating that users spend less time on the site than they did just months ago, and a general sentiment in the investment community that Facebook should have been sold privately to a Google or a Yahoo rather than floated to the general public with such limited streams of revenue.
I have tried to find references to the PNAS study in financial blogs, but it seems the investment community hasn’t picked up on what I think is a valuable piece of PR for Facebook.
What’s the best conclusion you can draw from all of this? Don’t take investment advice from me!
Follow me on Twitter: @Experian