Daniel Lemire's blog


Netflix game gets exciting: BellKor’s Pragmatic Chaos is passed by The Ensemble

20 thoughts on “Netflix game gets exciting: BellKor’s Pragmatic Chaos is passed by The Ensemble”

  1. Surely the Netflix judges will conclude that a difference of ~ 1% (i.e. 0.01% improvement) is statistically insignificant. It’s got to be a tie. Which still means 500K per team… divided by (say) 10 per team…

    Sounds like it may be easier to get an NSF grant.

    Which got me thinking. Maybe this is a harbinger of the future of academic granting… privatization of donors, commercial relevance of research, at least if the Reform (er, I mean Conservative) government has anything to say about it.

  2. Wilfrid says:

    I did not fully understand, but the conclusion resonates with me. Indeed, science remains a quest, a desire to understand; moreover, it holds no monopoly on knowledge.

  3. Daniel Haran says:

    As expected, it’s getting interesting: http://www.danielharan.com/2009/06/27/netflix-prize-let-the-games-begin/

    Andre: the prize is awarded based on half of the submission; the other half is used for the public scoring. This was done to prevent contestants from using results to tweak submissions.

    This would really suck for the top team, thinking they had it and being pushed out by a grand coalition. Cooperation wins again against a small, closed coalition.

    For me, that has to be the most important lesson from this competition.
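
    For concreteness, here is a minimal sketch in Python of the quiz/test split described above. The function names and the random halving are assumptions for illustration; only the general scheme (public RMSE on one half, hidden judging on the other) matches the actual contest.

        import math
        import random

        def rmse(predicted, actual):
            # Root mean squared error, the metric the Netflix Prize used.
            return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

        def split_holdout(holdout_ids, seed=0):
            # Halve the held-out ratings: the "quiz" half drives the public
            # leaderboard, the "test" half stays hidden and decides the prize.
            ids = list(holdout_ids)
            random.Random(seed).shuffle(ids)
            mid = len(ids) // 2
            return ids[:mid], ids[mid:]  # (quiz, test)

    Since teams only ever see their quiz score, they cannot tune against the half that decides the prize, which is precisely the situation discussed in the comments below.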

  4. Joao says:

    I have not been closely following the Netflix competition since they hit the lower 8% improvement barrier, but I’m very happy to hear that what finally broke it was a merger of diverse ideas.

    I wholeheartedly agree with your extrapolation of the competition to the scientific pursuit in general. Thank you for the insight.

  5. Jeremy says:

    Yes, we should reward creativity. But I also strongly believe that at the end of the day we need to have learned something. Especially about recommendation. Because only by learning something can we actually apply it to something else, i.e. reuse that information.

    So what do we learn from this? That ensemble methods yield better results? We’ve known that for years.

    Don’t get me wrong: I think it’s great that so much successful effort has gone into this project. But when you say that science needs diverse explanations, techniques, and opinions, my feeling is that there should actually be something explanatory at the end of the day. Don’t you think?

  6. An Insider says:

    Daniel Haran wrote: “the prize is awarded based on half of the submission; the other half is used for the public scoring.”
    This is entirely true, so the apparent lead of The Ensemble in the leaderboard doesn’t necessarily reflect the true leader.
    In fact, the Netflix Prize forum mentions that the other team (BellKor’s Pragmatic Chaos) is the top winning candidate by achieving a better score on the test set (the more important, non-public, half of the submission).
    Thus, I wonder why Daniel makes the following conclusion:
    “This would really suck for the top team, thinking they had it and being pushed out by a grand coalition. Cooperation wins again against a small, closed coalition. For me, that has to be the most important lesson from this competition.”
    The fact is that exactly the opposite happened: the closed coalition won.

  7. Daniel Haran says:

    “An Insider”: I wrote that before that information was made public, hence the “would”.

    Nonetheless, a small coalition was a risky strategy – it was really, really close.

  8. @jeremy

    1) Is the Netflix competition engineering or science? I submit to you that it is a noble engineering problem. It is akin to designing a new plane or a new bridge. Thus it is not surprising that we did not come out of it with an immediate better understanding of the problem: this may come later after science gets its way with the results…

    2) Most Computer Science papers (including too many of my own) are terribly boring and, at the end of the day, not very good science.

    Trying to build an entire science out of papers like this is like building houses using only one type of material:

    * We consider problem X;
    * other people solved problem X with solution Y;
    * we propose solution Z;
    * we show that solution Z is better than solution Y.

    Reference:

    Are your research papers telling original stories?

    http://www.daniel-lemire.com/blog/archives/2009/03/11/are-your-research-papers-telling-original-stories/

  9. jeremy says:

    But isn’t the method you list in (2) nothing more than the scientific method itself? Start with an observable phenomenon, X, with explanatory theory Y. We propose explanatory theory Z. We show that Z provides better explanations (fits the data better, etc.) than Y. QED.

    So it’s not that method itself that I have a problem with, because obviously all of the house of science is built up using that one material already.

    My problem is with finding better questions to ask, finding better theories Z, Z’, Z” that yield insight into the problem, rather than ones that yield little insight.

  10. jeremy says:

    I think of it more as an Occam’s Razor kind of thing. An ensemble, grand coalition approach might indeed work 0.01% better than the BellKor method. But in that ensemble theory of recommendation, entities have been multiplied unnecessarily. Or at least a bit more than necessarily.

    Where’s the Occamian solution in that? Where’s the “simplest explanation” storytelling? That’s what I find lacking.

    Not that ensemble methods aren’t good engineering. I’ve used similar approaches in a lot of my own work, to great benefit. Sometimes I just want a better story, though. Both a better story from other researchers, as well as a better story from myself, for my own work. (I.e. I am not immune from my own criticism — I need to improve, too.)
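
    To make the “multiplied entities” concrete, here is a minimal sketch of linear blending, roughly the flavor of combination the top teams relied on. The data, model count, and least-squares fit are assumptions for illustration: the point is that every extra base model is literally one more column and one more fitted weight.

        import numpy as np

        def fit_blend(base_preds, targets):
            # Least-squares weights for combining base-model predictions.
            # base_preds: (n_ratings, n_models), one column per base predictor.
            # targets:    (n_ratings,) true ratings on a held-out blend set.
            weights, *_ = np.linalg.lstsq(base_preds, targets, rcond=None)
            return weights

        # Three hypothetical base models predicting five ratings.
        P = np.array([[3.1, 3.4, 2.9],
                      [4.0, 4.2, 3.8],
                      [2.2, 2.0, 2.5],
                      [4.8, 4.6, 4.9],
                      [3.5, 3.3, 3.6]])
        y = np.array([3.0, 4.0, 2.0, 5.0, 3.5])
        w = fit_blend(P, y)
        print(P @ w)  # blended predictions; each new model adds a column and a weight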

  11. Daniel Haran says:

    @jeremy I’d buy that argument if the other top team weren’t throwing dozens – or hundreds – of models at the problem.

  12. jeremy says:

    Ok, let me change my story: *Both* teams aren’t really teaching us anything about science. 🙂

    Seriously, though.. my argument wasn’t meant to be directed for or against any one team. Seriously.

  13. @jeremy

    “But isn’t the method you list in (2) nothing more than the scientific method itself? Start with an observable phenomenon, X, with explanatory theory Y. We propose explanatory theory Z. We show that Z provides better explanations (fits the data better, etc.) than Y. QED.”

    I have a different point of view. A “method” applied to solve a problem is not an explanatory theory or a hypothesis. If that were the case, then the Netflix competition would be the perfect example of science in Computer Science. Yet I just argued that what they are doing is engineering (comparing methods to solve a problem, and picking the best method).

    In any case, since you cannot easily acquire new data, research in collaborative filtering can never be more than some form of astronomy (which is certainly scientific, but not really a science). There is a lot of data; we can describe it and learn its patterns… but, for example, we don’t know how to simulate it, we cannot control the different variables, and so on.

    So, there is no such thing, as far as I know, as a “theory of collaborative filtering”. There are tidbits, but nothing that could qualify as a theory. There are some explanations, but most of our knowledge is purely empirical: this works better than that…

    Comparing two spam filtering methods and concluding that method A is better than method B might be “scientific” in an engineering kind of way, but it is not science. Not anymore than comparing two different methods of facial recognition, data indexing, and so on. Science has to be asking deeper questions.

    It is a subjective matter, of course… but I would argue that much of current Computer Science is actually engineering research… except for domains such as theoretical computer science (which is, these days, just a form of mathematics), the purest form of HCI, and so on.

    In itself, this is not much of a problem… but we should not be surprised if all we end up with is new methods, but little new understanding… and it may grow tedious… at the very least, I find very few papers giving me new insights… but maybe that’s a personal feeling.

    More seriously, maybe we should not be surprised that the whole field of Computer Science is in crisis despite the undeniable ubiquity of the computer itself. I suspect that the real science will eventually happen elsewhere, in other departments, or in entirely new communities…

  14. jeremy says:

    I agree that a “method” is not an “explanation”. I was just pointing out that the method you seem to find distasteful (reject X, create Y, show that Y > X) is pretty much the scientific method, is it not? And so the goal is still to create something elegant, Occam’s razor simple, and explanatory, right?

    But what you are saying, if I understand you correctly, is that even if the method is the same, the application of that same method to Collaborative Filtering is not and never will be science, because it’s not, in a sense, “repeatable”? Is that what you’re saying? Because the nature of the problems is not.. how should I say.. universal.. we can’t expect to come up with universal (scientific) solutions? Only engineering ones?

    Hmm. That does make sense. I’ll have to think about that some more.

    And on the topic of Computer Science in crisis.. what do you mean? I have a sense of what you’re talking about, but do you have a blog post that I perhaps missed, that goes into greater detail?

  15. “I agree that a “method” is not an “explanation”. I was just pointing out that the method you seem to find distasteful (reject X, create Y, show that Y > X) is pretty much the scientific method, is it not? And so the goal is still to create something elegant, Occam’s razor simple, and explanatory, right?”

    The Netflix competitors used this exact method. They used a scientific method, but that does not make it science, for me. The goal of science is not to come up with better methods; that’s engineering. The goal of science is to further our understanding.

    The difference matters because if you review a paper that seeks better understanding, and you have an engineering outlook, you will complain that the work fails to bring up a substantially better method. Or you may complain that the work fails to propose a new method altogether and reject the paper outright.

    Einstein’s special relativity does not help us design more efficient engines, does it?

    So, if we truly desire better understanding, we have got to stop systematically trying to find better methods, and instead ask deeper questions and investigate these deeper questions.

    Computer Science is so centered on designing new and better methods that we have lost track of where the important science might happen. We have stopped looking. (This is not true all around, of course. I’m making a generalization.)

    “But what you are saying, if I understand you correctly, is that even if the method is the same, the application of that same method to Collaborative Filtering is not and never will be science, because it’s not, in a sense, “repeatable”? Is that what you’re saying? Because the nature of the problems is not.. how should I say.. universal.. we can’t expect to come up with universal (scientific) solutions? Only engineering ones?”

    Given the current state of our research ability, I see little chance to do real science in the context of collaborative filtering from a Computer Science perspective. Of course, I cannot guess at how ingenious a fellow Computer Science researcher can get.

    Here is a challenge: find a collaborative filtering question that needs an answer and can be pursued, but that does not entail designing a new and better predictor. That is, find a non-engineering question.

    “And on the topic of Computer Science in crisis.. what do you mean? I have a sense of what you’re talking about, but do you have a blog post that I perhaps missed, that goes into greater detail?”

    In a few short years, the number of Computer Science students went from being larger than the number of students in Mathematics, Physics, and Chemistry put together, to a number similar to Physics alone. Of course, enrollment varies from school to school, but the general picture is grim. There have been small gains in the last two years, but hardly enough to compensate for the huge losses.

    You can blame it on the dot-com crisis and on outsourcing, and they played a role, but I think the deeper issue is an identity crisis. Computer Science offers a mixed message. Come to us and we will train you to create great software… oh! forget that!… we will train you to study the fundamentals of information processing… Who is the model? Alan Turing, Bill Gates, Bill Joy, Tim Bray, Jim Gray, Paul Graham?

  16. Kevembuangga says:

    Computer Science in crisis…

    According to some, Computer Science is dying if not dead already!

  17. Daniel Haran says:

    * We consider problem X;
    * other people solved problem X with solution Y;
    * we propose solution Z;
    * we show that solution Z is better than solution Y.

    I think what’s missing here, for this to count as a “scientific method”, is a hypothesis. What have we learned? Why does solution Z work better than Y? Will it always work better?

    One problem I’d like to solve is finding optimal parameters in kNN in a deterministic fashion. The current “find a distance metric and k that works” approach is deeply unsatisfying. Whether guessing or using brute force, it lacks elegance and intellectual rigor.
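
    For reference, here is a minimal sketch of the brute-force search being criticized, with invented metrics, candidate grid, and a mean-of-neighbours predictor. It finds a (metric, k) pair that works, but says nothing about why that pair wins.

        import numpy as np

        def knn_predict(x, train_X, train_y, k, dist):
            # Predict a rating as the mean of the k nearest neighbours' ratings.
            d = np.array([dist(x, t) for t in train_X])
            nearest = np.argsort(d)[:k]
            return train_y[nearest].mean()

        def grid_search(train_X, train_y, val_X, val_y, ks, metrics):
            # Try every (metric, k) pair; keep whichever scores best on validation.
            best = None
            for name, dist in metrics.items():
                for k in ks:
                    preds = np.array([knn_predict(x, train_X, train_y, k, dist)
                                      for x in val_X])
                    err = np.sqrt(np.mean((preds - val_y) ** 2))  # RMSE
                    if best is None or err < best[0]:
                        best = (err, k, name)
            return best  # (rmse, k, metric): no insight into why these values win

        metrics = {
            "euclidean": lambda a, b: np.sqrt(np.sum((a - b) ** 2)),
            "manhattan": lambda a, b: np.sum(np.abs(a - b)),
        }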

  18. jeremy says:

    I think what confused me initially about your blog post is that you’re using the word “method” in two different ways. First, you talk about the overall scientific process as a method, and then you talk about the hypothesis generated by that scientific process (the actual solution) as another method. To wit:

    (Method Definition 1) I was just pointing out that the method you seem to find distasteful (reject X, create Y, show that Y > X) is pretty much the scientific method, is it not? “The Netflix competitors used this exact method. They used a scientific method, but it does not make it science, for me.”

    (Method Definition 2) “So, if we truly desire better understanding, we have got to stop systematically trying to find better methods, and instead ask deeper questions and investigate these deeper questions.” [That is: method definition #2 is the “Y” itself, which is being tested as part of the whole scientific method (definition #1).]

    Am I understanding correctly the two ways you’re using the word “method”?

    If so, then I would say that method definition #1 (the scientific method) really isn’t the problem. The problem is that we’re inventing method definition #2’s (research hypotheses, the various Y’s) that do not fully conform to the strictures placed on them by method definition #1. What is that stricture? Again, it’s Occam’s razor. The hypothesized Y’s should themselves be explanatory and simpler than the X’s that came before them. An ensemble method multiplies entities. It does not reduce them.

    I would therefore argue that the problem isn’t with method definition #1 (the scientific method), but with the fact that we’re not actually following that method. If we were to follow it, and only prefer hypotheses that both (i) fit the data better, and (ii) do so without multiplying entities, then we’d be on track.

    Very interesting discussion, by the way. Thank you for engaging!

  19. jeremy says:

    “In a few short years, the number of Computer Science students went from being larger than the number of students in Mathematics, Physics, and Chemistry put together, to a number similar to Physics alone. Of course, enrollment varies from school to school, but the general picture is grim. There have been small gains in the last two years, but hardly enough to compensate for the huge losses.”

    “You can blame it on the dot-com crisis and on outsourcing, and they played a role, but I think the deeper issue is an identity crisis. Computer Science offers a mixed message.”

    …and I think this other topic is fascinating. Do you have more posts on it? Are you willing to discuss it more in the future?

  20. @jeremy

    Yes, I have used the word “methods” to mean two different things. And yes, the identity of Computer Science is an interesting topic. Is it science, engineering, or mathematics? For a time, Computer Science exploited this confusion by being everything to everyone. In the long run, however, I suspect it is counterproductive. I think that the promotion of Computer Science would be far easier with a clearer message.