Similarity functions for user-based collaborative recommendations

11 May 2013

This is from my master's thesis, in which I am writing about recommender systems.
Here are some rankscore results (ten-fold cross-validation on historical data; read more on evaluation in this great book) obtained with different similarity functions in a simple user-based collaborative filtering algorithm.
The vectors are based on implicit feedback (downloads) and are binary or, where noted, weighted by 1 / (total downloads).
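
To make the setup concrete, here is a minimal sketch of the three similarity functions on binary download vectors. It is not the code from my thesis. The function names, the conversion of Euclidean distance into a similarity via 1 / (1 + d), and the reading of the weight as a per-item total download count are all assumptions made for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # a user with no downloads is similar to nobody
    return dot / (norm_u * norm_v)

def tanimoto_coefficient(u, v):
    """For binary vectors: shared items divided by items in either vector."""
    both = sum(1 for a, b in zip(u, v) if a and b)
    either = sum(1 for a, b in zip(u, v) if a or b)
    return both / either if either else 0.0

def euclidean_similarity(u, v):
    """Assumption: distance d is mapped to a similarity with 1 / (1 + d)."""
    d = math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))
    return 1.0 / (1.0 + d)

def weight_vector(u, total_downloads):
    """Assumption: each item is scaled by 1 / (total downloads of that item)."""
    return [a / t if t else 0.0 for a, t in zip(u, total_downloads)]

# Two hypothetical users over four items (1 = downloaded, 0 = not).
user_a = [1, 1, 0, 1]
user_b = [1, 0, 0, 1]
item_totals = [10, 2, 5, 4]  # made-up download counts per item

plain = cosine_similarity(user_a, user_b)
weighted = cosine_similarity(
    weight_vector(user_a, item_totals),
    weight_vector(user_b, item_totals),
)
```

With this kind of weighting, rarely downloaded items contribute more to the similarity than popular ones, so two users who share only very popular items end up less similar than the binary vectors alone would suggest.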

Similarity function              Rankscore
Euclidean distance               0.171801575012
Euclidean distance (weighted)    0.177710916698
Tanimoto coefficient             0.129399988285
Cosine similarity                0.179036040926
Cosine similarity (weighted)     0.212398453763
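
For readers unfamiliar with rankscore, the sketch below shows one common textbook definition: each hit in the recommendation list is discounted exponentially by its rank, and the sum is normalised by the best achievable score. The function name, the `alpha` half-life parameter and its default value are assumptions for illustration, not necessarily the exact setup behind the numbers above.

```python
def rankscore(recommended, relevant, alpha=10.0):
    """Rank-discounted hit score, normalised to the range [0, 1].

    recommended: ranked list of item ids (best first)
    relevant:    item ids the user actually downloaded in the test set
    alpha:       half-life; a hit at rank alpha + 1 counts half as much
                 as a hit at rank 1 (assumed value, tune for your data)
    """
    relevant = set(relevant)
    if not relevant:
        return 0.0
    # Score actually achieved: sum the discount of every hit.
    achieved = sum(
        2 ** (-(rank - 1) / alpha)
        for rank, item in enumerate(recommended, start=1)
        if item in relevant
    )
    # Best possible score: all relevant items at the top of the list.
    best = sum(2 ** (-(i - 1) / alpha) for i in range(1, len(relevant) + 1))
    return achieved / best

# Hypothetical example: the single relevant item appears at rank 3.
score = rankscore(["a", "b", "c", "d"], {"c"}, alpha=2.0)
```

In a ten-fold evaluation, a score like this would be computed per user on the held-out downloads and then averaged over users and folds.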

Note that cosine similarity performs very well on my dataset, with the weighted variant giving the highest rankscore in the table, even though the algorithm is user-based (rather than item-based).
I am not using Pearson correlation because my input consists of binary vectors.

I will publish my thesis here in June if you wish to learn more about the dataset, recommender algorithms, weighting techniques and evaluation functions.
