Ask News.YC: Any studies on whether social filtering works?

JacobAldridge · on May 26, 2008

I would assume it better to compare and make predictions about me based on somebody unknown to me but with strikingly similar purchasing habits, than to compare me just to known associates.
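A rough sketch of what "strikingly similar purchasing habits" could mean in practice, assuming purchases are plain sets of item names and using Jaccard overlap as the similarity measure. The measure, the function names, and the sample data are all illustrative assumptions, not something any particular site is known to use.

```python
def jaccard(a, b):
    """Overlap between two purchase histories: |A and B| / |A or B|."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(target, purchases):
    """Return the other customer whose purchases overlap most with target's."""
    others = (name for name in purchases if name != target)
    return max(others, key=lambda name: jaccard(purchases[target], purchases[name]))


# Hypothetical data: a known friend with little overlap, a stranger with a lot.
purchases = {
    "me": ["x", "y", "z"],
    "friend": ["x"],
    "stranger": ["x", "y", "z", "w"],
}
best_match = most_similar("me", purchases)
```

With this made-up data the stranger is the closer match, which is the situation described above.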

Part of the problem with the social graph, particularly as it exists on broad networking sites like Facebook et al, is the lack of data about the nature of those relationships (i.e., "the more you know about them"). I've got however many dozen friends, but the social graph there doesn't distinguish between the guys I went to primary school with and the guys I see every weekend.

If I could better describe my relationships, more accurate data could be created. E.g., if I declare Dave is my 'drinking buddy' and Dan is my 'film friend', the system could make reasonable predictions about what I might like to drink based on Dave and what films I might watch based on Dan, while ignoring the reverse.
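As a minimal sketch of that idea, assuming each friendship carries a single declared label and each friend's likes are stored per domain. The data layout, the labels, and the `predict` helper are hypothetical and only there to illustrate the filtering step.

```python
from collections import Counter

# Hypothetical data: each friendship is tagged with the domain it is relevant to.
RELATIONSHIPS = {
    "me": {"Dave": "drinks", "Dan": "films"},
}

# Hypothetical preference data per person, per domain.
LIKES = {
    "Dave": {"drinks": ["stout", "porter"], "films": ["action flick"]},
    "Dan": {"drinks": ["cider"], "films": ["noir", "documentary"]},
}


def predict(user, domain):
    """Return items liked by friends whose declared relationship matches the domain."""
    counts = Counter()
    for friend, label in RELATIONSHIPS.get(user, {}).items():
        if label != domain:
            continue  # skip friends tagged for a different domain
        counts.update(LIKES.get(friend, {}).get(domain, []))
    return [item for item, _ in counts.most_common()]


drink_picks = predict("me", "drinks")
film_picks = predict("me", "films")
```

Here Dave's taste in films and Dan's taste in drinks never reach the result, because their relationship labels don't match the domain being asked about.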

In other words, the value of the social graph depends on more meaningful information about relationships, not just more people. Without that, aggregated data wins out.

You're right - pure speculation is cool!

einarvollset · on May 26, 2008

Turns out that "it depends"; a recent paper by Jon Kleinberg, Sid Suri and others suggests that for Wikipedia edits, social networks are more important predictors, whereas for LiveJournal, "similarity networks" are better.

For hard data, google the paper (it's a preprint so I'm not gonna link to it): Sid Suri site:cs.cornell.edu, and check out the Papers link.

aston · on May 26, 2008

Made me take the long way to the paper, but from the abstract it's exactly what I was looking for. Thanks.

skmurphy · on May 26, 2008

Social filtering absolutely works in the small: consider how often your friends and co-workers can recommend things that are of interest and of use to you. How you construct an application that facilitates or automates this is a different question.

morbidkk · on May 26, 2008

It all depends on how the data is aggregated and how a specialized service like yours uses that data.

For example, LinkedIn collects data from the end user whenever there is a communication event:

1) adding a profile: name/school/college/specialization/job industry/company

2) adding a new job: job industry/company

3) adding a contact: type of relationship

4) asking a question: mark that to industry vertical/set of people

So the granular details that would help your service serve users better define what data you need to collect from the start, and then you can put that data to use.

Pure speculation is cooler with the above set of data.

wheels · on May 26, 2008

From a theoretical perspective, what you're assuming above is that you apply a social filter and then use collaborative filtering (i.e. the method used by Amazon, Netflix and the like). In fact the two supplement each other: run collaborative filtering over the entire customer database, then do a weighted merge of that with an algorithm that ranks things based on social graph traversal.
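A minimal sketch of such a weighted merge, under several assumptions: the collaborative filtering scores are already computed and roughly on a 0 to 1 scale, the social score comes from a breadth-first traversal that halves a friend's influence with each extra hop, and `social_weight` is a tunable knob. The function names, decay scheme, and sample data are hypothetical, not how Amazon or Netflix actually do it.

```python
from collections import deque


def social_scores(graph, likes, user, decay=0.5, max_hops=2):
    """Score items by breadth-first traversal of the social graph from user.

    Direct friends contribute 1.0 per liked item, friends of friends
    contribute `decay`, and so on up to max_hops.
    """
    scores = {}
    seen = {user}
    queue = deque([(user, 0)])
    while queue:
        person, hops = queue.popleft()
        if hops > 0:
            for item in likes.get(person, []):
                scores[item] = scores.get(item, 0.0) + decay ** (hops - 1)
        if hops < max_hops:
            for friend in graph.get(person, []):
                if friend not in seen:
                    seen.add(friend)
                    queue.append((friend, hops + 1))
    return scores


def weighted_merge(cf_scores, soc_scores, social_weight=0.3):
    """Blend the two score sets and return items ranked best first."""
    items = set(cf_scores) | set(soc_scores)
    merged = {
        item: (1 - social_weight) * cf_scores.get(item, 0.0)
        + social_weight * soc_scores.get(item, 0.0)
        for item in items
    }
    # Sort by descending score, breaking ties by item name.
    return sorted(merged, key=lambda item: (-merged[item], item))


# Hypothetical inputs: a tiny friend graph and precomputed CF scores.
graph = {"me": ["dave"], "dave": ["erin"]}
likes = {"dave": ["a"], "erin": ["b"]}
cf = {"a": 0.2, "b": 0.9, "c": 0.6}

soc = social_scores(graph, likes, "me")
ranking = weighted_merge(cf, soc, social_weight=0.5)
```

Setting `social_weight` to 0 falls back to pure collaborative filtering and 1 to pure social ranking; where the right balance sits would have to be found empirically.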