
> I'm curious why it didn't work for music recommendation for you. In the netflix challenge for movie recommendation, pretty much everyone ended up using SVD-based methods and variations thereof.

I expect the algorithm did reasonably well given the data available. Probably a mix of reasons why I was disappointed though:

* My expectations were too high; I didn't realise quite how hard a problem good music recommendation is

* Dataset wasn't large enough

* Dataset was based on deliberately-stated opinions (ratings) rather than actual observed behaviour (listening data)

* I didn't find a way to use the timing data associated with the ratings

* Figuring out how best to normalise the dataset prior to attempting SVD was tricky, and I'm not convinced I found the best way (rough sketch of the general approach below)

I admit I was also sort of naively hoping that the 'features' identified by SVD would have at least a vaguely human-recognisable theme to them. The first 2 or 3 did, but from there on it all looked pretty random.
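For what it's worth, the general shape of what I was doing was roughly this - toy numbers, made-up track names and a made-up k, purely for illustration: centre each user's ratings around their own mean before a truncated SVD, then list the strongest-loading tracks per latent feature to look for a theme.

    import numpy as np

    # Toy user x track ratings matrix (NaN = unrated); values and size are made up.
    R = np.array([[5.0, 3.0, np.nan, 1.0],
                  [4.0, np.nan, np.nan, 1.0],
                  [1.0, 1.0, np.nan, 5.0],
                  [np.nan, 1.0, 5.0, 4.0]])

    # Normalise: subtract each user's mean rating so the factorisation models
    # taste rather than how generous a rater someone is, then treat missing
    # entries as "no deviation from that user's mean".
    user_means = np.nanmean(R, axis=1, keepdims=True)
    R_centred = np.where(np.isnan(R), 0.0, R - user_means)

    # Truncated SVD: keep only the first k latent "features".
    k = 2
    U, s, Vt = np.linalg.svd(R_centred, full_matrices=False)
    predictions = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :] + user_means

    # Inspect each latent feature by listing its strongest-loading tracks;
    # this is where I hoped to see human-recognisable themes.
    track_names = ["track_a", "track_b", "track_c", "track_d"]  # hypothetical
    for f in range(k):
        top = np.argsort(-np.abs(Vt[f]))[:3]
        print(f"feature {f}:", [(track_names[i], round(float(Vt[f, i]), 2)) for i in top])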

Also, like a lot of recommenders, it gave the impression of having based its recommendations on some kind of generic, averaged-out 'middle point' of your overall usage data.

Really I'd rather it clustered my usage data, then recommended new things in and around each of the clusters.
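Something like this hand-wavy sketch, assuming each track already has a latent vector (e.g. the item factors from the SVD) and using scikit-learn's KMeans - the catalogue and listening history here are randomly generated stand-ins:

    import numpy as np
    from sklearn.cluster import KMeans

    rng = np.random.default_rng(0)

    # Hypothetical setup: latent vectors for every track in the catalogue,
    # plus the indices of the tracks I actually listen to.
    item_factors = rng.normal(size=(1000, 20))   # 1000 tracks, 20 latent dims
    my_tracks = rng.choice(1000, size=60, replace=False)

    # Cluster *my* listening in latent space instead of averaging it into one point.
    n_clusters = 4
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(item_factors[my_tracks])

    # For each cluster, recommend the nearest unheard tracks to that cluster's centre.
    unheard = np.setdiff1d(np.arange(1000), my_tracks)
    for c, centre in enumerate(km.cluster_centers_):
        dists = np.linalg.norm(item_factors[unheard] - centre, axis=1)
        picks = unheard[np.argsort(dists)[:5]]
        print(f"cluster {c}: recommend tracks {picks.tolist()}")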

I have some ideas for what I'd do differently the next time around - that being one - but also ideas about how better to tie algorithmic recommendation tools in to human interaction.




Yeah, there's always the danger that there's just too little data, or that it's too noisy. But the technical issues you mention (normalization, subtracting bias, using temporal data) all came up in the Netflix movie recommendation competition as well, so you can always look at how people handled them there.
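The bias subtraction in those write-ups is basically a baseline predictor: global mean plus a regularised per-user offset plus a per-item offset, with the factorisation then modelling the residuals. Rough sketch below - toy data, and the regularisation constant is made up:

    import numpy as np

    # Hypothetical (user, item, rating) triples.
    ratings = np.array([(0, 0, 5.0), (0, 3, 1.0), (1, 0, 4.0),
                        (2, 1, 1.0), (2, 3, 5.0), (3, 2, 5.0)],
                       dtype=[("u", int), ("i", int), ("r", float)])

    mu = ratings["r"].mean()   # global mean rating
    lam = 10.0                 # shrinkage constant (made up here)
    n_users, n_items = ratings["u"].max() + 1, ratings["i"].max() + 1

    # Regularised user offsets: shrink towards zero when a user has few ratings.
    b_u = np.zeros(n_users)
    for u in range(n_users):
        mask = ratings["u"] == u
        b_u[u] = (ratings["r"][mask] - mu).sum() / (lam + mask.sum())

    # Item offsets estimated from what's left after removing the user offsets.
    b_i = np.zeros(n_items)
    for i in range(n_items):
        mask = ratings["i"] == i
        b_i[i] = (ratings["r"][mask] - mu - b_u[ratings["u"][mask]]).sum() / (lam + mask.sum())

    # Baseline prediction for any (u, i); a factorisation then models the residual.
    def baseline(u, i):
        return mu + b_u[u] + b_i[i]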

Some useful information can be found in the report of the winners: http://www.netflixprize.com/assets/ProgressPrize2007_KorBell...

There's a lot of fancy stuff that would be overkill in a real system, but a lot of practical info too.

Also, there was an article by the winners, "Collaborative Filtering with Temporal Dynamics", which might be useful; I think it's freely available.


You seem to know quite a lot about this type of stuff. It's great. What's your story?



