, 8 min read
Olivier Bousquet at Curves and Surfaces 2006: Learning on Manifolds
6 thoughts on “Olivier Bousquet at Curves and Surfaces 2006: Learning on Manifolds”
, 8 min read
6 thoughts on “Olivier Bousquet at Curves and Surfaces 2006: Learning on Manifolds”
No, I don’t think he made any sort of comparison and I don’t recall mention of ISOMAP or LLE (though, they may have be cited?), and I think that in the talk, he only tested his technique on a toy case with images having two degrees of freedom, and 4 significant eigenvectors (corresponding to the corner of a 2D patch). I got the feeling he had his own Matlab implementation and that’s all he used. He didn’t claim, however, to either present a survey or that all the work presented was his own.
Naturally, the usual disclaimers apply: don’t trust me too much and check with Olivier for details.
Did he compare with ISOMAP and LLE ?
With the warning that my understanding of what he presented is very limited (CS 2006 papers will only be out next year) and that I don’t know what ISOMAP or LLE are (other than what I could learn in 10 seconds through Google), one argument he had was that you don’t actually want to compute or estimate the manifold for the problem types he is interested in. Moreover, solving for the manifold can actually be an ill-posed problem in the sense that an actual manifold might not exist in the real problems, while the “approximate idea” of a manifold might still be useful. I don’t know what ISOMAP or LLE do… My shallow understanding says the he casts the problem as a graph problem and remains there for all time (though it is a weighted graph!). The manifold idea kicks in only through the computation of the Laplacian which is used to find your clusters through a dominant eigenvector approach.
So, all he ever does is classification/clustering. You’ve got nodes in your graph and you want to cluster them. Soooo… he does, essentially, discrete mathematics… (or at least solves discrete problems).
One point that wasn’t entirely clear in his presentation is how you find the nearest points, and he got a question about this from Wolfang Dahmen, but I don’t think he had worried about the computational complexity.
(Need I remind you again not to trust me too much?)
Now, I could imagine machine learning techniques where you actually, given dense and uniform data points, you try to compute the manifold (maybe by local patches?) for the purpose of, say, prediction/extrapolation/interpolation… that wouldn’t be at all what he presented and I think he made a case against the practicality of such approaches.
A lot of learning theory seems to have to do with “regression”: given data points, find the “function” (and I extend “function” to include “manifold”) which best fits the data. Given some assumptions on the how nice the function (or the manifold) is, then you can prove some nice (theoretical) results about the rate of convergence and stuff. This is not what Olivier presented.
My blog post was actually a rant about some of these techniques that were not justified by applications: for example, smoothness is often a crazy expectation in the real world. There are many assumptions you can make, but differentiability of your data seems so unrealistic! (to me, but then again, it is only a blog post)
Inviting accusations of self-promotion, I can point you to a few papers discussing the use of graph Laplacian-based techniques for ad hoc information retrieval. The CIKM paper in particular reduces a few classic IR techniques to graph regularization.
My thinking is very clear-cut: theory and practice are at their best when they collaborate. A theorem is nothing without practical validation, and experimental results are nothing without a theorem. We should never go too far without a theory, and we should never go too far with experimental validation.
I’m amazed at how few people share my views on this. I end up sounding like an extremist because I advocate something that ought to be common sense (I think).
Thanks a lot for attending my talk and for your interest in this work. I must start by saying that what I presented is not my own work, but it was more a survey of recent work that has been done in the Machine Learning community.
The goal was to show to people in the audience that this community had worked on interesting problems and proposed interesting approaches to deal with those.
My own work was more on the theoretical side (trying to understand how these things work, what are their properties, how to unify the various algorithms that have been proposed…). But as you point out, I have been a bit shy when answering Ron DeVore’s question about analytical results.
I am fine with the fact that, as a mathematician, Ron needs properly formulated questions to start looking for results to prove.
But on the other hand, I think the ultimate test is practical efficiency.
In a way, my brain is now split into two INDEPENDENT parts: one part of my brain likes to prove theorems, the other part likes to solve practical problems.
Unfortunately, I never managed to have them collaborate in a productive manner 😉
Still, I concede that it is necessary to try and formalize the problems one is working on. Not in order to prove things but in order to understand what you are doing.
For example, in the context of graph Laplacians, I think it is important to understand the meaning of the quantities you manipulate. For example, the fact that there are analogies between discrete quantities and continuous ones is important to guide intuition.
But proving that the former converge to the latter is just a mathematical exercise, which is often frustrating because you have to introduce a lot of assumptions (which the practical side of my brain is fighting against) in order to prove anything…
If you are interested, I will put the slides on my publications page. Most important is to look at the last slide which contains bibliographic references (and most important is to look for the references inside these papers, not the papers themselves which are too theoretically oriented;)