14th March 2016, 36 min read

Artificial intelligence is mostly a matter of engineering?

38 thoughts on “Artificial intelligence is mostly a matter of engineering?”

Evan Estola says:

March 14, 2016 at 3:53 pm

According to the CEO of DeepMind the performance on a single machine is still very impressive.

“Distributed version is only ~75% win rate against single machine version! Using distributed for match but single machine AG very strong also”

https://twitter.com/demishassabis/status/708489093676568576
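
To put the quoted ~75% win rate in perspective, it can be converted into a rating gap. The sketch below assumes the standard Elo logistic model, which the tweet does not mention; the function names are illustrative only.

```python
import math


def elo_gap(win_rate: float) -> float:
    """Rating gap implied by a win rate under the Elo logistic model (an assumption here)."""
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


def win_rate_from_gap(gap: float) -> float:
    """Inverse of elo_gap: expected win rate for the stronger side."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))


# The ~75% figure from the tweet: distributed version vs. single-machine version.
gap = elo_gap(0.75)
print(f"Implied rating gap: about {gap:.0f} Elo points")
```

Under that assumption, a 75% win rate corresponds to a gap of roughly 190 points: a clear edge, but a modest one given the extra hardware.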
1. Daniel Lemire says:
  
  March 14, 2016 at 4:02 pm
  
  @Evan
  
  There are a few issues to consider.
  
  1. When Hassabis says “one machine”, he does not mean “one processor”. He probably refers to a machine worth the price of dozens of commodity PCs, at least.
  
  2. How much hardware was used during the training phase?
  1. Leonid Boytsov says:
    
    March 14, 2016 at 4:32 pm
    
    What is even more astounding is the team of 100+ PhDs (5x the size of IBM Watson team) that worked on this highly specialized problem. If deep learning is so easy, why would you need so many people?
    1. Jouni says:
      
      March 14, 2016 at 6:46 pm
      
      As someone working at a genomics institute, that doesn’t surprise me at all. The real world is messy, and 100+ PhDs isn’t that much. When you do something for the first time, it may involve a lot of routine work, which still requires deeper understanding of the problem.
      1. Daniel Lemire says:
        
        March 14, 2016 at 7:49 pm
        
        Software is magical in the sense that a few really smart people can make a huge difference…
        
        jld says:
        
        March 15, 2016 at 8:22 am
        
        Doesn’t this conflict with your usual “no genius” stance?
        
        Daniel Lemire says:
        
        March 15, 2016 at 1:19 pm
        
        Doesn’t this conflict with your usual “no genius” stance?
        
        Technology acts as a multiplier. My favorite example is Greg Linden. He implemented Amazon’s recommender system (“If you like this book, you might like…”). Greg is in the top 1% of engineers… very smart… but he does not have magical powers. What he implemented is relatively simple, and it was probably just a challenge because of the scope involved at Amazon.
        
        Now, what happened? Well. It changed the world. We all know about this feature. It has changed how we think of e-Commerce.
        
        So yes, I would say that there is strong evidence that a few smart people can use software to make a huge difference in the world… but they don’t need to be Einstein-like geniuses. Hard work, a solid education and lots of focus are probably all that is needed.
        
        Jouni says:
        
        March 15, 2016 at 9:33 am
        
        Machine learning tends to be more about the data than the software. Instead of having a well-defined problem to solve, you start with only a general idea of what you’re trying to achieve. You then explore various interpretations of the data and different approaches to solving the problem, tweaking the solutions until you have something good enough. To me, that looks more like a biological problem than a typical computer science problem.
        
        Daniel Lemire says:
        
        March 15, 2016 at 1:20 pm
        
        Agreed. And it is the subject of my next blog post, by coincidence.
    2. Daniel Lemire says:
      
      March 14, 2016 at 7:42 pm
      
      DeepMind has over 100 PhDs working on “solving AI”, according to the CEO. The team behind AlphaGo is much smaller. We actually know exactly who did what thanks to the Nature paper:
      
      http://www.nature.com/nature/journal/v529/n7587/full/nature16961.html#contrib-auth
      
      So it is 5 to 10 people for the search functions, then 5 to 10 people for writing the neural-network software. We know that it took them about two years.
      1. Leonid Boytsov says:
        
        March 14, 2016 at 7:57 pm
        
        I counted 21 on this paper, not counting the founder! You don’t take into account the prior art either (e.g., the mentioned Monte Carlo tree search). Actually, the same was true for Deep Blue and IBM Watson (lots of prior art). Hundreds of PhDs making advances over dozens of years for a highly specialized AI task. But, yeah, what if AI is just a matter of computational power? Well, this may be true, but there is no evidence for this yet.
        
        Daniel Lemire says:
        
        March 14, 2016 at 8:52 pm
        
        I counted 21 on this paper, not counting the founder!
        
        I count 20 authors but that’s the whole team. There is then a division of labor (described in the manuscript). Some wrote the search routines, some wrote the deep learning software, some set up the testing framework, some worked on the papers…
        
        You don’t take into account the prior art either (e.g., the mentioned Monte Carlo tree search).
        
        According to Wikipedia Monte Carlo Search goes back to the 1940s… https://en.wikipedia.org/wiki/Monte_Carlo_tree_search#History
        
        They also used deep learning which dates back to the 1980s…
        https://en.wikipedia.org/wiki/Deep_learning
        
        Of course, they use the modern version that relies on GPU computing to be practical… but by now, it is not new.
        
        By their own accounts, they have used well known techniques coupled with superb engineering and good hardware.
        
        Hundreds of PhDs making advances over dozens of years for a highly specialized AI task.
        
        I think that’s precisely what did not happen here. Here is what happened. A small team (~20 people) worked from scratch over two years… using mostly standard machine-learning expertise, plus lots of expensive hardware, plus some of their own domain knowledge… and they cracked the problem.
        
        I think you are selling DeepMind short here. It is not a company set out to solve Go. They want to solve AI. All of it.
        
        To solve AI, all of it, you can’t solve every specialized problem using specific solutions and hundreds of PhDs. You need generic tools that are widely applicable.
        
        They clearly mean to solve various other games, health problems and so forth… using very similar techniques.
        
        But, yeah, what if AI is just a matter of computational power? Well, this may be true, but there is no evidence for this yet.
        
        My concluding statement is: If “all” it takes to build superhuman intelligences is more hardware… and the ability to use it… then it is good news.
        
        You know me well enough, Leo, to know that I know that it is hard to make good use of computing resources. Using thousands of CPUs and hundreds of GPUs, and using them well is hard.
        
        These 20 people are no doubt amazing people.
        
        But the point is still that a small team (20 is not large) and the right tools are all you need.
        
        This should not surprise us. It did not take 2000 engineers to invent the GPS, the transistor, the plane, the modern engine, the car, the computer… Once you have the right tools, enough resources and political support, inventions fall into place.
        
        It appears that AI is following that pattern. When we barely had the computers to solve Chess, it happened. We now barely have the computing resources to solve Go, and it happened. And so forth.
        
        Let me qualify. I don’t mean that “I” could have solved Go the way they did given the computing resources. But there are many human beings, lots of smart people… given the possibilities, some team, somewhere, is bound to get it done… as long as we encourage it.
        
        To be clearer, had the DeepMind team been killed, probably Facebook would have solved Go next year or the year after that.
        
        I think it is precisely why we should worry about the end of Moore’s law. If our software performance stalls, it is possible that our computing technology would stall as well. We need to push forward.
        
        In other words, performance and engineering matter, a great deal.
        
        Leonid Boytsov says:
        
        March 15, 2016 at 12:31 am
        
        I am not shortselling anybody or anything. I am just reminding you what John Langford reminded us: good performance in Go crucially depends on the effectiveness of the Monte Carlo tree search. This was found by trial and error over a period of many years. It was verified only recently, but before AlphaGO was created.
        
        For one thing, one should give credit to people who invented the algorithm AND demonstrated its effectiveness in Go.
        
        For another thing, it has nothing to do with hardware (though having more hardware obviously helps here).
        
        This is just what John Langford said.
        
        Another problem with your post is that it reads like: being clever doesn’t matter, we can just have more hardware. I have to disagree here again, because, clearly, all the widely publicized milestones (Chess, Jeopardy, GO) were about being clever: call it engineering or whatever.
        
        In fact, when you say that AlphaGO is just the result of engineering, it is you who are selling them short. I am pretty sure it was a lot of hard core research, not just engineering.
        
        Daniel Lemire says:
        
        March 15, 2016 at 1:56 am
        
        (…) good performance in Go crucially depends on the effectiveness of the Monte Carlo tree search. This was found by trial and error over a period of many years. It was verified only recently, but before AlphaGO was created.
        
        Yes. But I think nobody had combined MCTS and deep learning. At least, not in the way AlphaGo did it. As far as I can tell, it was a non-obvious, but also not entirely counter-intuitive, insight.
        
        For another thing, it has nothing to do with hardware (though having more hardware obviously helps here). This is just what John Langford said.
        
        I am not sure what in Langford’s article you refer to.
        
        Deep Blue, the system that defeated Kasparov, had 11 GFLOPS whereas a modern iPhone has close to 200 GFLOPS. A single GPU today can deliver about 7000 GFLOPS. So AlphaGo has computing capabilities that are maybe hundreds of thousands of times what Deep Blue had.
        
        Is your contention that the Deep Blue team could have defeated the best Go players had they been cleverer using only 11 GFLOPS?
        
        Another problem with your post is that it reads like: being clever doesn’t matter, we can just have more hardware. I have to disagree here again, because, clearly, all the widely publicized milestones (Chess, Jeopardy, GO) were about being clever: call it engineering or whatever.
        
        The DeepMind people were clever. The people behind Watson and Deep Blue were clever. No doubt about that. But they couldn’t have done what they did with ENIAC.
        
        Watson became possible at the point where having tens of thousands of GFLOPS doesn’t risk bankrupting IBM. Not before.
        
        I am pretty sure it was a lot of hard core research, not just engineering.
        
        If by hard-core you mean “academic publication focused research” then the answer is no.
        
        Here are the DBLP pages of the founders of DeepMind:
        
        http://dblp.uni-trier.de/pers/hd/l/Legg:Shane
        
        http://dblp.uni-trier.de/pers/hd/h/Hassabis:Demis
        
        http://dblp.uni-trier.de/pers/hd/s/Suleyman:Mustafa
        
        Leonid Boytsov says:
        
        March 15, 2016 at 3:11 am
        
        The saying about Langford got mixed up. I didn’t mean to claim he said anything about hardware.
        
        Hardware is a necessary condition, but it is not a sufficient one. For example, IBM Watson worked on a single computer, but it took too long to answer.
        
        >If by hard-core you mean “academic publication focused research” then the answer is no.
        
        I am not sure hard-core research and academic publications are always equal. Often, it is true, but not always.
        
        Daniel Lemire says:
        
        March 15, 2016 at 3:39 am
        
        Hardware is a necessary condition, but it is not a sufficient one. For example, IBM Watson worked on a single computer, but it took too long to answer.
        
        You can get AlphaGo to run on a Raspberry Pi or a 386. I mean, it is all about Turing machines, right?
        
        But hardware matters a great deal.
        
        You know how it is Leo. When you are programming, you need to test your ideas out. If it takes forever to test the simplest idea, progress is going to be slow.
        
        If your progress is too slow, the project will die. Either you will get discouraged, or people will stop funding you or… something more interesting will come along.
        
        The fact is, it is really hard to be “ahead of your time”.
        
        This is just an extension of “the medium is the message”. In theory, the Internet is nothing new. You write text, other people read it. So there is nothing, on the surface, that we can do with the Internet that we couldn’t do before. I mean, we could be exchanging letters right now.
        
        The fact that things get easier means that we can start working on new problems that were too hard before.
        
        Of course, it is not just the hardware. You need to have the engineering talent to use it. You need the resources, the encouragement, the courage, the focus and so forth. But the starting point is access to good tools.

Michael Hay says:

March 17, 2016 at 11:51 am

While using “commodity vector” processors (GPUs) and Intel CPUs is helpful I would argue that the chief challenge for AI generally is that the hardware platform is insufficiently plastic. A key facet of the brain is that it changes over time in structure.

Thankfully Intel has invested in Altera which could result in a general processing environment which is both fixed (IA) and plastic (FPGA). It is going to take some work to make FPGAs be more general programming friendly and the notion of computing in space (FPGAs) vs. computing in time (CPUs) is something that will have to be contemplated and worked on for some time.

With a more plastic processing model I suspect that we’d see even more emergent properties of an “AI” game-master and perhaps even the ability to allow something the scale of a Raspberry Pi to be Go or Chess champion.

Daniel Lemire says:

March 17, 2016 at 1:35 pm

As far as I can tell, FPGAs are not much faster than GPUs and CPUs, but they can be ten times more energy efficient. At scale, this can matter a great deal, I suppose.

I do not know whether they have other benefits.

Leonid Boytsov says:

March 14, 2016 at 8:28 pm

PS: I am not implying that this wasn’t a great feat, not even close. But the claims that intelligence is solved now and that AI is merely more computing power… Excuse me, but I don’t see these claims substantiated.

Daniel Lemire says:

March 14, 2016 at 8:58 pm

intelligence is solved now

No. Intelligence isn’t solved. We can’t even do as well as a bee using computer vision.

What I am suggesting is that we may have lots of what we need to build drones that are just as smart as bees. We may not need so many conceptual breakthroughs. We may just need more focus on good engineering.

Benoit says:

March 15, 2016 at 12:42 pm

Great point! Deep learning is not easier, it just gives you larger and more powerful building blocks. But the number of engineers involved does not seem to decrease.

Daniel Lemire says:

March 15, 2016 at 1:47 pm

@Benoit

AlphaGo was built by 20 engineers over 2 years.

There are many open source projects, including some I have been involved with, that have multiple times this number of engineers.

Evan Estola says:

March 14, 2016 at 5:15 pm

Agreed! Certainly the single machine is still expensive and beefy, but we also know how much harder it is to scale vertically vs. horizontally. I just thought it was interesting that the utility of adding more hardware diminished that quickly, especially considering that as Google they likely have access to as much hardware as they possibly want.

Daniel Lemire says:

March 14, 2016 at 7:44 pm

I think that it might be an unfair comparison. It is probable that all instances of AlphaGo benefit from some of the same training. So it is not like one team worked with a single machine all along while another team had all of Google’s resources… All instances benefited from tremendous computational resources.

qznc says:

March 14, 2016 at 5:41 pm

It still is a huge engineering job to fit “280 GPUs and 1920 CPUs” into my pocket and make them run on battery for a day. 😉

Daniel Lemire says:

March 15, 2016 at 2:01 am

I don’t know if we’ll ever have that much power in our pockets. Maybe. But we do not need to have it in our pockets because computers and networks are ubiquitous.

Benoit says:

March 15, 2016 at 12:45 pm

There was a conceptual breakthrough, namely Monte Carlo Tree Search in 2006. Without it, you could be using the entire Google data center and still not make a dent in the problem.

Daniel Lemire says:

March 15, 2016 at 1:45 pm

@Benoit

Wikipedia suggests that MCTS is much older…

https://en.wikipedia.org/wiki/Monte_Carlo_tree_search#History
1. Benoit says:
  
  March 15, 2016 at 8:12 pm
  
  Monte Carlo is indeed much older, but Rémi Coulom is the one who made it work for tree search and for Go in 2006. I believe this is widely acknowledged, and that he deserved to be on stage in Seoul with the DeepMind team today.
  Back to your point about general intelligence, I agree with you that it may be mostly a matter of having enough computational resources, but I also believe that some conceptual breakthroughs are still needed. Maybe not “100 Nobel prizes away” as some put it, but still a few at least.
  1. Leonid Boytsov says:
    
    March 16, 2016 at 1:42 pm
    
    Given that all of the current AI is a bag of tricks that work in specific and very limited environments, we may be even 1000 Nobel prizes away (in terms of the number of breakthroughs, not years). Likely, we will be slowly gaining speech, vision, and language processing capabilities in a very incremental way, by inventing new and new tricks.
    
    But it is impossible to know for sure because we don’t know anything about the brain (or almost anything). With new evidence comes understanding that brains are much more complicated than was previously thought. One recent example, but there are many more: http://news.discovery.com/human/life/memory-10-times-more-massive-than-thought-160121.htm
    I wouldn’t be surprised to learn that in 10 years, they discover the brain can hold 10x of what we think it can remember today.
    1. Daniel Lemire says:
      
      March 16, 2016 at 2:02 pm
      
      Given that all of the current AI is a bag of tricks that work in specific and very limited environments, (…)
      
      I think I could program an application that looks at pictures and yells “butterfly” when it sees a butterfly in, probably, less than an hour, using nothing but JavaScript and publicly available APIs and libraries. And my application would work well too. It could even yell “butterfly” in the language of your choice.
      
      Let me add to this. I think that a smart high school student could build the same application in about the same time, provided they have learned to program a bit beforehand.
      
      Of course, if I need to recognize the flowers from my garden by name, that’s a bit more difficult. I don’t know how to do that, but give me a team of 20 great engineers and two years… and I bet I can write an application that can satisfy paying customers.
      
      Likely, we will be slowly gaining speech, vision, and language processing capabilities in a very incremental way, by inventing new and new tricks.
      
      I think that what you call “tricks”, I might call “engineering”.
      
      I am willing to bet with you that within 10 years, we will have human-level speech recognition. And that almost all of it will be achieved using techniques that are known today.
2. Sergiu Goschin says:
  
  March 16, 2016 at 2:13 am
  
  To be fair, the credit assignment for the contributions is a bit more complex. Some steps I am personally aware of:
  1. UCB algorithm (bandits = finite stochastic optimization) (Finite-time Analysis of the Multiarmed Bandit Problem, Auer, Cesa-Bianchi, Fischer, 2002).
  2. UCT algorithm as an extension of UCB (generalization to game-tree search) (Bandit based Monte-Carlo Planning, Kocsis, Szepesvari, 2006).
  3. UCT successfully applied to Go (Rémi Coulom).
  
  I think each step is a non-trivial / non-obvious extension of the previous one and thus deserves recognition.
  
  Note also that the first paper is a theory paper with a non-intuitive yet simple algorithm. It has had impact in practice (you can actually find its key idea almost unchanged in the AlphaGo Nature paper) by making people aware that simple explore-exploit techniques (like constant/local exploration) are sometimes bound to fail.
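
The UCB idea in step 1 of the list above can be sketched in a few lines. This is a minimal illustration of the UCB1 selection rule (empirical mean plus an exploration bonus of sqrt(2 ln t / n)), run on a made-up three-armed Bernoulli bandit; the function names, arm probabilities and seed are assumptions for the example, and none of this is AlphaGo's actual code.

```python
import math
import random


def ucb1_select(counts, sums, t):
    """Return the arm with the highest upper confidence bound at round t."""
    # Play every arm once first: the bound is undefined for an untried arm.
    for arm, n in enumerate(counts):
        if n == 0:
            return arm
    # Empirical mean plus a bonus that shrinks as an arm is sampled more often.
    return max(
        range(len(counts)),
        key=lambda a: sums[a] / counts[a] + math.sqrt(2.0 * math.log(t) / counts[a]),
    )


def run_bandit(probs, rounds, seed=0):
    """Simulate UCB1 on Bernoulli arms (hypothetical payoffs); return pull counts."""
    rng = random.Random(seed)
    counts = [0] * len(probs)
    sums = [0.0] * len(probs)
    for t in range(1, rounds + 1):
        arm = ucb1_select(counts, sums, t)
        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


counts = run_bandit([0.2, 0.5, 0.8], rounds=5000)
print(counts)  # the last arm (payoff 0.8) should receive most of the pulls
```

The bonus term is what separates this from constant exploration: rarely tried arms keep getting revisited, but less and less often as evidence accumulates. UCT (step 2) applies the same rule at each node of a game tree.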

Sharad Sinha says:

March 16, 2016 at 6:38 am

The simple mention of raw computing power in today’s phones versus that of Deep Blue can be misleading for some readers. I think we should not forget the power requirements to sustain such computing power over extended periods of time and the resulting efforts in cooling technology. Today’s phones may have that much raw computing power but they are still not there in terms of power delivery and cooling technologies.

Daniel Lemire says:

March 16, 2016 at 1:45 pm

I agree that mobile computing cannot be compared to server-class computing. This being said… An iPhone can still beat the best Chess players in the world.
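
For a rough sense of scale, the figures quoted earlier in this thread (11 GFLOPS for Deep Blue, about 200 for a modern iPhone, about 7000 for a single GPU, and the 280 GPUs mentioned for AlphaGo) can be turned into ratios. This is a back-of-the-envelope sketch: it assumes peak throughput, ignores the 1920 CPUs, and says nothing about power delivery or cooling. The constant and function names are illustrative.

```python
# Peak figures as quoted in the thread (approximate, circa 2016).
DEEP_BLUE_GFLOPS = 11
IPHONE_GFLOPS = 200
GPU_GFLOPS = 7000
ALPHAGO_GPUS = 280  # distributed AlphaGo; CPUs deliberately ignored


def ratio_to_deep_blue(gflops: float) -> float:
    """How many times Deep Blue's raw throughput a system provides."""
    return gflops / DEEP_BLUE_GFLOPS


iphone_ratio = ratio_to_deep_blue(IPHONE_GFLOPS)
gpu_ratio = ratio_to_deep_blue(GPU_GFLOPS)
alphago_ratio = ratio_to_deep_blue(ALPHAGO_GPUS * GPU_GFLOPS)

print(f"iPhone:     ~{iphone_ratio:,.0f}x Deep Blue")
print(f"Single GPU: ~{gpu_ratio:,.0f}x Deep Blue")
print(f"280 GPUs:   ~{alphago_ratio:,.0f}x Deep Blue")
```

Under these assumptions the phone comes out near 18 times Deep Blue and the 280-GPU cluster near 180,000 times, which is the order of magnitude claimed above.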

Scott Conger says:

March 25, 2016 at 4:24 am

It’s hard to interpret an AI triumph if you want to know how close we are to surpassing “natural” intelligence. We see ourselves as smart, but we know humans didn’t evolve to play Go. We may well be quite bad at it. I suspect you’re better off examining animals where we do have an idea of what they evolved to do, and there AI can look pretty miserable.

Small insects seem to crush our efforts in AIs for robotics. I don’t know how much power a bee brain has, but it can’t be much. And yet, they can fly, pathfind, recognize faces, communicate, dance, and work together to build their impressive hives. That kind of capability still seems a way off, especially in such a tiny package.

Daniel Lemire says:

March 25, 2016 at 1:39 pm

Small insects seem to crush our efforts in AIs for robotics. I don’t know how much power a bee brain has, but it can’t be much. And yet, they can fly, pathfind, recognize faces, communicate, dance, and work together to build their impressive hives.

Right. I use bees as an example myself a lot. Our autonomous drones do a lot of what bees do, but using many more tricks and seemingly less intelligence. I usually get criticized for saying so, but I think it is true… we can’t even mimic the intelligence of a bee on a supercomputer.

That kind of capability still seems a way off, especially in such a tiny package.

Up until recently, an autonomous drone would cost hundreds of thousands of dollars if not more. You can now purchase one on Amazon for less than $2000. If you live in a big city, you have probably seen athletes running around followed by an autonomous drone filming them.

Many of us have semi-autonomous cleaning robots in our homes. They are outrageously silly and loud… but kettles started out this way too.

Early days, of course, but I would not bet against autonomous-drone technology at this point in time.

Jennifer Akers says:

December 5, 2019 at 6:28 am

Fine article, Daniel. AI is all about the science and engineering of making intelligent machines. These systems are by nature program-intensive. AI provides all the viable platforms of technologies and standards that are required to engineer machines to make humans think better. CSAT.AI (https://www.csat.ai/) and ELEMENT.AI (https://www.elementai.com/) are AI-powered tools that can be adapted to provide new levels of visibility into Customer Service and can automate QA. These can analyze and ultimately synthesize AI components and systems which will have a great impact in the field of engineering.