Daniel Lemire's blog


AI requires huge volumes of data to exist: what about learning?

This claim has been around for quite some time, but it keeps popping up left and right: “Google (…) believes that strong AI requires huge data volumes to really exist.”

Since nobody knows what is required for strong AI to exist, this is currently a non-falsifiable conjecture. One thing is for sure: it takes several years before a human being can hope to pass the famous Turing test. My 4-month-old baby can’t pass the Turing test.

So, there is strong evidence that you need lots and lots of data before intelligence, as defined by the Turing test, can emerge. Now, what does this say about “learning”? It seems to imply that to “learn”, you need to be exposed to lots and lots of data. This suggests, maybe, that the web is the real future of learning because, face it, there is only so much an instructor can convey to a group while spending hours in front of a blackboard. I can look up facts and theories much faster through the web. This is a recent development, though: until a few years ago, the blackboard was still a more efficient way to gather data and, in some fields, like mathematics, it still is.

One interesting conclusion, though, is that broadband ought to be very useful for learning. If being exposed to lots and lots of data is required, then you need broadband. What am I doing here, in my basement, with my cable modem? I need a T1, stat! Oh! Right! I’d still be limited by how fast others can deliver the information.

Theorem. A large data output is necessary for having a rich learning experience.

So, if you have online content for a given course, the relative performance of the server does matter. Multimedia content does matter.

Or does it? Notice I didn’t attempt to prove my theorem. So, let’s call it the “Lemire conjecture” for now.