Daniel Lemire's blog


Are you descriptive or predictive?

13 thoughts on “Are you descriptive or predictive?”

  1. Peter Turney says:

    “A wizard comes to you and gives you a choice.”

    These kinds of thought experiments are very popular in philosophy. By now, philosophers have grown very wary of such “intuition pumps”. You believe that you can imagine something, but can you really imagine it? Can you really imagine prediction without understanding or understanding without prediction? At first, it seems possible, but I believe the two are very closely intertwined, so you cannot separate them in the way you suppose you can. This, I think, is the real crux of the matter: What is the relation between understanding and predicting? The thought experiment distracts us from this key question.

  2. Peter Turney says:

    Consider Newton’s theory of gravity: One might argue that it is purely predictive; it doesn’t really explain gravity at all. What do we want from an explanation? Why is there something unsatisfying about Newton’s theory as an explanation?

  3. Peter, I acknowledge this in my post by saying “The difference between the two is probably a matter of philosophical debate.” I do not want to enter this philosophical debate as it is not very well grounded and I got tired of philosophical debates in my 20s.

    But the point I make is kind of like asking whether you discover or invent algorithms. Which one is it? Both views are valid, and equivalent at some level. Nevertheless, there *is* a cultural difference.

    Is man more than an animal? You can argue both ways until you are blue in the face. Both viewpoints are justifiable. Clearly, man is more advanced than any other animal we know. Yet, almost everything human beings do has been done by animals (except things like “write a book” and even that is not so far off…).

    However, there is a cultural difference between scientists who believe that human beings are animals and those who don’t. There is a cultural difference between people who believe they discover algorithms and those who believe they invent them.

    I claim there is the same type of difference between descriptive and predictive people.

  4. Cyril says:

    Interestingly, it seems that according to your dichotomy, statistics (of which Machine Learning is a sub-field) is both predictive and descriptive. Also, as pointed out by Peter, physics is “descriptive” by your definition, but filled with predictive models.

    I wonder if there isn’t a (common) confusion here between “true laws” and their models, which are predictive by nature. In your example, you seem to contrast a very accurate but hard to explain model with a less accurate but easier to explain model. Whichever you may want depends on the application you are considering rather than on being a “predictive person” vs. “descriptive person”, I would think.

    Btw I think the lack of emphasis on explainable models in Machine Learning was one of the points in J. Friedman’s talk at SDM-2007, if you remember…

  5. Cyril is correct to point out that J. Friedman’s talk at SDM-2007 was related to my post.

    My main point is whether we should factor in the limitations of our own minds (including our finite short-term memories) when we do science, or whether we just ignore them.

    Maybe the falsifiable part is that there are limitations to the human mind?

    The problems we are facing now involve hundreds of terabytes of raw data. You can throw models with one million free parameters at them. You can create a gigantic soup. Maybe out of this soup will come the best spam filter ever created. But this spam filter is like an asocial genius in a society: it might be brilliant, but it does not know how to communicate its brilliance, and society will reject it.

  6. Kevembuangga says:

    “You can either be handed out the laws of the universe as an algorithm, but in such a form that your brain will be prevented from ever understanding them.”

    This hypothesis seems a bit flawed purely on a matter of principle: if you don’t have some understanding of the laws of the universe as embodied in the “algorithm”, how would you know which questions to ask?
    As a personal opinion, anyway, I don’t expect there could be a definitive statement of all “the laws of the universe” (the Theory of Everything); rather, I think that no matter how advanced our knowledge becomes, it will remain indefinitely perfectible.

  7. Jeremy:

    I guess people want to make predictions to be one step ahead of the universe itself, or to jump forward in time.

    However, being able to predict is not quite the same as understanding the universe because our brains are too limited to track excessively complex models.

  8. jeremy says:

    Isn’t the perfect predictive model of the universe... just the universe itself? Quantum issues aside, the whole Universe’s worth of data at timestep t essentially is the predictor of the whole Universe at timestep t+1, natch? The whole universe is its own best induction engine?

    Is this what you are asking?

  9. That is correct, Jeremy.

  10. jeremy says:

    Yes, I agree with you, that being able to predict is not quite the same as understanding the universe. But that was the choice you were giving us, with your wizard scenario.

    What I was trying to do was take the first part of the scenario to a sort of ad absurdum extreme. Again you write: “You can…be handed out the laws of the universe as an algorithm, but in such a form that your brain will be prevented from ever understanding them.”

    Isn’t the most extreme case of that the following algorithm:

    (1) Take the state of the universe, all the atoms, masses, velocities, energies, and directional vectors.

    (2) Move everything forward by “one”.

    (3) Observe.

    That is the most extreme form of the algorithm, right? Use the Universe to model itself. You could even have an algorithm that forked off a parallel universe, and ran the CPU twice as fast, to see what would happen in this Universe before it actually happened. The point is, that approach, that “whole universe as an algorithm”, is the extreme case of the machine learning, brain-nonunderstanding aspect, am I correct?

    So now any step you take to generalize, to reduce the state space of the model, does itself constitute some sort of understanding of the Universe. The very choice of making the decision about what to include and exclude from your model is, itself, a form of brain understanding, or descriptive, science, right?
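As a rough illustration of the three steps above, and of what “reducing the state space” might look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the “universe” is a handful of one-dimensional particles moving freely with no forces, and the names `Particle`, `step`, `centre_of_mass` and `reduced_step` are hypothetical. The full model carries every particle forward; the reduced model keeps only two numbers (the centre of mass and its velocity) and still predicts the same summary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Particle:
    mass: float
    position: float
    velocity: float


def step(state, dt=1.0):
    """Full model: move every particle forward by one time step (free motion)."""
    return [Particle(p.mass, p.position + p.velocity * dt, p.velocity) for p in state]


def centre_of_mass(state):
    """Summarise the whole state as a single number."""
    total_mass = sum(p.mass for p in state)
    return sum(p.mass * p.position for p in state) / total_mass


def reduced_step(com, com_velocity, dt=1.0):
    """Reduced model: track only the centre of mass and its velocity."""
    return com + com_velocity * dt


# A toy "universe" with two particles (values chosen arbitrarily).
state = [Particle(1.0, 0.0, 2.0), Particle(3.0, 4.0, -1.0)]

# (1) take the state, (2) move everything forward by one, (3) observe.
full_prediction = centre_of_mass(step(state))

# The reduced model throws away the individual particles.
total_mass = sum(p.mass for p in state)
com_velocity = sum(p.mass * p.velocity for p in state) / total_mass
reduced_prediction = reduced_step(centre_of_mass(state), com_velocity)
```

In this toy setting the two predictions agree, but only because someone decided beforehand that the centre of mass was the quantity worth keeping. That decision is the part the comment calls a form of understanding.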

  11. Kevembuangga says:

    (1) Take the state of the universe, all the atoms, masses, velocities, energies, and directional vectors.

    (2) Move everything forward by “one”.

    Not so sure this makes sense.

    (1) The “state of the universe” is NOT a collection of data bits about this or that. This (huge) “lump of data” pertains to a model of the universe, not to “the universe” per se. The map is not the territory; the realist illusion strikes again.

    (2) “Move everything forward”, along which time scale?
    Time is not defined for the universe as a whole, remember Einstein’s relativity?

  12. jeremy says:

    No, Kevembuangga. What I said doesn’t make complete sense, I agree.

    My point was to point out that, in order to do Machine Learning (“predictive” science), you have to start with some sort of description of what it is that you are running the machine learning algorithms on (“descriptive science”). You might not have a complete, parametric model, like F = MA or something like that. Machine learning can be non-parametric. But whether or not you do parametric learning, you have to feed into your ML algorithm some kind of feature set. And those features themselves are like small little atomic “models”. When you measure a feature, you are essentially putting a descriptive wrapper around some phenomenon.

    Since predictive science, then, is essentially dependent on descriptive science, I was trying to say that I didn’t really understand the fundamental assumptions behind Daniel’s dichotomy. It seems to me that all science is descriptive.
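The point about feature sets can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: the three features, the word list, the two training messages and the function names `extract_features` and `nearest_neighbour` are all hypothetical, and the learner is a bare one-nearest-neighbour rule chosen because it is non-parametric. The learner never sees a message, only the tuple of numbers that the feature extractor chose to describe it with.

```python
SPAMMY_WORDS = {"free", "winner", "prize"}  # hypothetical hand-picked word list


def extract_features(message):
    """Describe a message with three numbers; this choice is itself a small model."""
    words = [w.strip("!?.,").lower() for w in message.split()]
    return (
        len(words),                               # message length in words
        sum(w in SPAMMY_WORDS for w in words),    # count of "spammy" words
        message.count("!"),                       # number of exclamation marks
    )


def nearest_neighbour(train, query):
    """Non-parametric prediction: return the label of the closest training example."""
    features = extract_features(query)

    def squared_distance(example):
        return sum((a - b) ** 2 for a, b in zip(example[0], features))

    return min(train, key=squared_distance)[1]


# Two made-up labelled messages, purely for illustration.
labelled = [
    ("Free prize! You are a winner!", "spam"),
    ("Meeting moved to 3pm tomorrow", "ham"),
]
train = [(extract_features(text), label) for text, label in labelled]

prediction = nearest_neighbour(train, "Claim your free prize now!")
```

Changing `extract_features` changes what the learner can possibly predict, which is one way to read the claim that the predictive step depends on a prior descriptive one.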

  13. jeremy says:

    To finish out my thought: It seems to me that the dichotomy isn’t between predictive vs. descriptive. I see it as all descriptive. The difference comes at the level of granularity of that description.

    Astronomy and biology are macro-level descriptive sciences. Machine learning is a micro-level descriptive science.

    IMHO.