Daniel Lemire's blog

Hierarchy of Collaborative Filtering Distribution

I think that, increasingly, both creators and clients want to regain control. The beauty of it is that businesses can be built on putting customers back in charge. To a large extent, I keep preferring Amazon to my local bookstore in part because I have more control when using Amazon.

Increasingly, we are seeing that creators want to stay in control. Publishers struggle to stay in charge, but they are fighting a losing battle. The next logical step is that clients will want more “control” as well.

This issue led me to design this “Hierarchy of Collaborative Filtering Distribution” [1]. Definition: In a collaborative filtering recommender system, we have two types of human agents: the creators who want to sell their content, and the clients who are willing to share some of their preferences. By this definition, Google is typically not a collaborative filtering recommender system.

Level 1. The data (and goods) are centralized. The creators relinquish all control. The clients need to trust one entity with their preferences. The business value is in controlling the channel and the data, and in providing good tools. (Think: Amazon, or standard distribution channels.)

Level 2. Only the metadata is centralized. The creators keep control, but the clients need to trust one entity with their preferences. Some of the business value lies in the clients’ metadata. (Think: inDiscover.)

Level 3. Both the data and the metadata are distributed, and only the aggregation needs to happen at one point of contact. The clients and the creators use interoperable tools and data formats and keep tight control of their data. The business value is in the tools and services themselves, not in the data. (Think: Semantic Webish applications.)

Regarding music, reaching level 3 is not hard. Sites like inDiscover and webjay already make playlists available in XML. This is where the work of people like Lucas Gonze on XML formats for MP3 playlists can become interesting. Imagine a world where artists post on various web sites not only their MP3s, but also some standard XML file allowing aggregation. Imagine also that users post their playlists (inDiscover and webjay users do this already). We then have the possibility of a level 3 distributed recommender system “à la Semantic Web”.
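To make the level 3 idea a bit more concrete, here is a minimal sketch in Python of what the aggregation step could look like. Everything in it is an assumption made for illustration: the `<playlist>`/`<track>` XML layout is a made-up simplification (not the actual format used by inDiscover, webjay or anyone else), the URLs are placeholders, and the scoring is a plain co-occurrence count rather than any particular published algorithm. In a real setting the playlist documents would be fetched from the users' own sites; here they are inlined so the example runs on its own.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical playlist documents, standing in for XML files that users
# would publish on their own sites (the format is invented for this sketch).
PLAYLISTS = [
    """<playlist owner="alice">
         <track>http://example.org/a/song1.mp3</track>
         <track>http://example.org/b/song2.mp3</track>
         <track>http://example.org/c/song3.mp3</track>
       </playlist>""",
    """<playlist owner="bob">
         <track>http://example.org/a/song1.mp3</track>
         <track>http://example.org/b/song2.mp3</track>
       </playlist>""",
    """<playlist owner="carol">
         <track>http://example.org/b/song2.mp3</track>
         <track>http://example.org/d/song4.mp3</track>
       </playlist>""",
]


def parse_playlist(xml_text):
    """Return the set of track URLs found in one playlist document."""
    root = ET.fromstring(xml_text)
    return {t.text.strip() for t in root.iter("track") if t.text}


def build_cooccurrence(playlists):
    """Count how often each ordered pair of tracks shares a playlist."""
    co = defaultdict(Counter)
    for tracks in playlists:
        for a, b in permutations(tracks, 2):
            co[a][b] += 1
    return co


def recommend(co, liked, k=3):
    """Rank tracks outside `liked` by total co-occurrence with liked tracks."""
    scores = Counter()
    for track in liked:
        for other, n in co.get(track, {}).items():
            if other not in liked:
                scores[other] += n
    # Highest score first; ties are broken by URL so the order is stable.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [track for track, _ in ranked[:k]]


if __name__ == "__main__":
    co = build_cooccurrence(parse_playlist(p) for p in PLAYLISTS)
    print(recommend(co, {"http://example.org/a/song1.mp3"}))
```

The point of the sketch is that the aggregator only reads what creators and clients choose to publish: it holds no MP3s and owns no preference data, which is where level 3 differs from levels 1 and 2.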

This can then be very interesting research-wise and business-wise.

Update: Rod Savoie points me to DLORN (Distributed Learning Object Repository Network) as a related tool.

1- I checked and this concept appears to be new. If you ever use it, you have to cite this blog entry! There is related work however, such as Tomas Olsson, Bootstrapping and Decentralizing Recommender Systems, 2003 and Resource Profiles by Stephen Downes (also in 2003).