The easiest way to think of their relationship is to visualize them as concentric circles with AI — the idea that came first — the largest, then machine learning — which blossomed later, and finally deep learning — which is driving today’s AI explosion — fitting inside both.
AI (Artificial Intelligence) is a subfield of computer science, founded in the 1950s, concerned with solving tasks that are easy for humans but hard for computers. In particular, a so-called Strong AI would be a system that can do anything a human can (perhaps excluding purely physical tasks). This is fairly generic and includes all kinds of tasks: planning, moving around in the world, recognizing objects and sounds, speaking, translating, performing social or business transactions, creative work (making art or poetry), and so on.
NLP (Natural Language Processing) is simply the part of AI that has to do with language (usually written language).
Machine Learning is concerned with one aspect of this: given some AI problem that can be described in discrete terms (e.g., out of a particular set of actions, which one is the right one), and given a lot of information about the world, figure out the "correct" action without having the programmer program it in. Typically some outside process is needed to judge whether the action was correct or not. In mathematical terms, it's a function: you feed in some input, and you want it to produce the right output, so the whole problem is simply to build a model of this mathematical function in some automatic way. To draw a distinction with AI: if I can write a very clever program that has human-like behavior, it can be AI, but unless its parameters are automatically learned from data, it's not machine learning.
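As a minimal illustration of "building a model of a function automatically from data", here is a sketch (using numpy, with a made-up toy dataset) that recovers the parameters of a line purely from input/output examples, without the rule ever being written into the model:

```python
import numpy as np

# Toy "world": inputs and the correct outputs we want the model to learn.
# The underlying rule (y = 2x + 1) is never written into the model itself.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0

# Fit a line y ≈ a*x + b purely from the data (least squares).
a, b = np.polyfit(x, y, deg=1)

print(round(a, 3), round(b, 3))  # the learned parameters approximate 2 and 1
```

The programmer supplied only examples and a generic fitting procedure; the parameters came out of the data.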
Deep Learning is one kind of machine learning that’s very popular now. It involves a particular kind of mathematical model that can be thought of as a composition of simple blocks (function composition) of a certain type, and where some of these blocks can be adjusted to better predict the final outcome.
The word "deep" means that the composition has many of these blocks stacked on top of each other, and the tricky bit is how to adjust the blocks that are far from the output, since a small change there can have very indirect effects on the output. This is done via something called backpropagation, inside of a larger process called gradient descent, which lets you change the parameters in a way that improves your model.
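A toy sketch of this adjustment process, with just two stacked blocks and hand-derived chain-rule gradients (the numbers and target are illustrative, not from any real system):

```python
# Two stacked "blocks": inner h = w1 * x, outer y = w2 * h.
# The chain rule (backpropagation) tells us how a change in the inner
# block's parameter w1 indirectly moves the final output, and gradient
# descent uses that information to adjust both parameters.
w1, w2 = 0.5, 0.5
x, target = 1.0, 6.0
lr = 0.05  # learning rate

for _ in range(500):
    h = w1 * x          # inner block
    y = w2 * h          # outer block
    err = y - target    # how wrong the final output is
    # Gradients of the squared error, via the chain rule:
    grad_w2 = 2 * err * h        # direct effect on the output
    grad_w1 = 2 * err * w2 * x   # indirect effect, routed through w2
    w2 -= lr * grad_w2
    w1 -= lr * grad_w1

print(round(w1 * w2 * x, 3))  # the composed output ends up near 6.0
```

Note how the update for `w1` has to route through `w2`: that is the "indirect effect" that makes deep stacks tricky to train.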
Over the past few years AI has exploded, and especially since 2015. Much of that has to do with the wide availability of GPUs that make parallel processing ever faster, cheaper, and more powerful. It also has to do with the simultaneous one-two punch of practically infinite storage and a flood of data of every stripe (that whole Big Data movement) – images, text, transactions, mapping data, you name it.
Don’t model the World; Model the Mind.
When you Model the Mind you can create systems capable of Learning everything about the world. It is a much smaller task, since the world is very large and changes behind your back, which means World Models will become obsolete the moment they are made. The only hope to create intelligent systems is to have the system itself create and maintain its own World Models. Continuously, in response to sensory input.
Following this line of reasoning, Machine Learning is NOT a subset of AI. It really is the ONLY kind of AI there is.
And this is now proving true, in a big way. Since 2012, a specific Machine Learning technique called Deep Learning has been taking the AI world by storm. Researchers are abandoning the classical "Programming Tricks" style of AI in droves and switching to Deep Learning, based mainly on the fact that it actually works. We've made more progress in the three years since 2012 than in the preceding 25 years on several key AI problems, including Image Understanding (a really hard one), Signal Processing, Voice Understanding, and Text Understanding.
Another clue that we are now on the right track: Old style AI projects like CYC ran to millions of propositions or millions of lines of code. Systems that (successfully) Model the Mind can be as small as 600 lines of code; several recent Deep Learning projects clock in somewhere in that range. And these programs can move from one problem domain to another with very few changes to the core; this means these methods are GENERAL intelligences, not specific to any one problem domain. This is why it is called Artificial General Intelligence. And we’ve never had any AI programs that could do this in the past. As an example, the language understanding programs we are creating using DL will work equally well in any language, not just English. It just takes a re-training to switch to Japanese… another indication that Deep Learning is closer to true intelligence than traditional NLP systems.
Google is currently using Machine Learning a lot: by my estimate, over a hundred places in their systems have been replaced by Deep Learning and other ML techniques in the past few years. Even their patented "PageRank" algorithm, which was the initial key to their success, is being replaced, even as I write this, with a new algorithm called "RankBrain" based on Deep Learning.
Deep Learning is not AI either. We are currently using Supervised Deep Learning, which is another (but less critical) programmer’s cheat since the “supervision” is a kind of World Model. Real AI requires Unsupervised Deep Learning.
Deep Learning isn’t AI but it’s the only thing we have that’s on the path to True AI.
Machine Learning is a technology within the sphere of 'Artificial Intelligence'.
The pioneering technology within Machine Learning is the neural network (NN), which mimics (at a very rudimentary level) the pattern recognition abilities of the human brain by processing thousands or even millions of data points. Pattern recognition is pivotal to intelligence.
It is worth keeping in mind that a lot of people assume that through Machine Learning we are developing general AI rather than applied AI. Applied AI is intelligence, but in a very limited field: for example, recognizing human faces (Facebook), driving cars (Google's autonomous cars), or matching teachers to students for optimal outcomes. This is the strength of Machine Learning: pattern recognition over organized, complex information, albeit within a limited, defined scope.
NOTE: A general AI, on the other hand, is not limited to a narrow field where humans still have to impose certain rules before it can 'learn' (cars are not animals, etc.). To clarify: hundreds of companies are using applied AI; none has developed general AI.
As per the Turing test, a computer can be said to be intelligent if it can achieve human-level performance in all cognitive tasks, sufficient to fool an interrogator. To be artificially intelligent and pass the Turing test, the computer should possess the following capabilities:
- Natural language processing to enable it to communicate successfully in English (or some other human language).
- Knowledge representation to store information provided before or during the interrogation.
- Automated reasoning to use the stored information to answer questions and to draw new conclusions.
- Machine learning to adapt to new circumstances and to detect and extrapolate patterns.
Machine Learning — An Approach to Achieve Artificial Intelligence
Machine Learning at its most basic is the practice of using algorithms to parse data, learn from it, and then make a determination or prediction about something in the world. So rather than hand-coding software routines with a specific set of instructions to accomplish a particular task, the machine is “trained” using large amounts of data and algorithms that give it the ability to learn how to perform the task.
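As a minimal, hypothetical illustration of "trained from data" versus "hand-coded instructions", here is a one-feature nearest-neighbor classifier; the feature values and labels are invented for the example:

```python
# Instead of hand-coding rules like "if x > 2.5 then 'large'", we hand
# the program labeled examples and let a generic algorithm (here,
# 1-nearest-neighbor) infer the decision from the data itself.

# Hypothetical training data: (feature value, label) pairs.
examples = [(1.0, "small"), (1.2, "small"), (3.8, "large"), (4.1, "large")]

def predict(x):
    # Classify by copying the label of the closest training example.
    nearest = min(examples, key=lambda ex: abs(ex[0] - x))
    return nearest[1]

print(predict(1.1))  # "small"
print(predict(4.0))  # "large"
```

The decision boundary was never written down; it emerged from the examples, which is the essence of "training" as described above.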
Machine learning came directly from the minds of the early AI crowd, and the algorithmic approaches over the years included decision tree learning, inductive logic programming, clustering, reinforcement learning, and Bayesian networks, among others. As we know, none achieved the ultimate goal of General AI, and even Narrow AI was mostly out of reach with early machine learning approaches.
As it turned out, one of the very best application areas for machine learning for many years was computer vision, though it still required a great deal of hand-coding to get the job done. People would go in and write hand-coded classifiers like edge detection filters so the program could identify where an object started and stopped; shape detection to determine if it had eight sides; a classifier to recognize the letters “S-T-O-P.” From all those hand-coded classifiers they would develop algorithms to make sense of the image and “learn” to determine whether it was a stop sign.
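A hand-coded classifier of the kind described can be sketched in a few lines; this toy 1-D edge detector (the kernel and pixel values are illustrative) flags where image intensity changes, with the filter written by a programmer rather than learned from data:

```python
import numpy as np

# A hand-coded "classifier" of the kind described above: a fixed
# edge-detection filter, chosen by a programmer, not learned.
kernel = np.array([-1.0, 0.0, 1.0])  # responds to left-to-right change

def edge_strength(row):
    # Slide the kernel over a 1-D strip of pixel intensities.
    return [float(np.dot(kernel, row[i:i + 3])) for i in range(len(row) - 2)]

# A strip that is dark (0) on the left and bright (9) on the right:
strip = np.array([0.0, 0.0, 0.0, 9.0, 9.0, 9.0])
print(edge_strength(strip))  # large values mark where the edge sits
```

Every such rule had to be designed and tuned by hand, which is exactly why these pipelines were brittle in fog or partial occlusion.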
Good, but not mind-bendingly great. Especially on a foggy day when the sign isn't perfectly visible, or when a tree obscures part of it. There's a reason computer vision and image detection didn't come close to rivaling humans until very recently: it was too brittle and too prone to error.
Time, and the right learning algorithms, made all the difference.
Deep Learning — A Technique for Implementing Machine Learning
Another algorithmic approach from the early machine-learning crowd, Artificial Neural Networks, came and mostly went over the decades. Neural Networks are inspired by our understanding of the biology of our brains – all those interconnections between the neurons. But, unlike a biological brain where any neuron can connect to any other neuron within a certain physical distance, these artificial neural networks have discrete layers, connections, and directions of data propagation.
You might, for example, take an image and chop it up into a bunch of tiles that are fed into the first layer of the neural network. Individual neurons in the first layer process their inputs and pass the results to a second layer. The second layer of neurons does its task, and so on, until the final layer produces the final output.
Each neuron assigns a weighting to its input: how correct or incorrect it is relative to the task being performed. The final output is then determined by the total of those weightings. So think of our stop sign example. Attributes of a stop sign image are chopped up and "examined" by the neurons: its octagonal shape, its fire-engine-red color, its distinctive letters, its traffic-sign size, and its motion or lack thereof. The neural network's task is to conclude whether this is a stop sign or not. It comes up with a "probability vector," really a highly educated guess, based on the weightings. In our example the system might be 86% confident the image is a stop sign, 7% confident it's a speed-limit sign, 5% confident it's a kite stuck in a tree, and so on; the network architecture then tells the neural network whether it is right or not.
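The layered forward pass and final "probability vector" described above can be sketched roughly like this (the weights are random placeholders, not trained values, and the layer sizes are arbitrary):

```python
import numpy as np

# A minimal layered network: each layer is a weighted sum of its inputs
# followed by a nonlinearity, and the final layer's scores are turned
# into a probability vector with softmax.

def softmax(z):
    e = np.exp(z - z.max())      # subtract max for numerical stability
    return e / e.sum()

rng = np.random.default_rng(0)
x = rng.random(4)                # 4 input "tiles"
W1 = rng.random((3, 4))          # first layer: 4 inputs -> 3 neurons
W2 = rng.random((3, 3))          # second layer: 3 -> 3 class scores

h = np.maximum(0.0, W1 @ x)      # layer 1: weighted sums + ReLU
probs = softmax(W2 @ h)          # layer 2: scores -> probabilities

print(probs, probs.sum())        # three class probabilities summing to 1
```

With untrained random weights the probabilities are meaningless; training (below) is what tunes them toward answers like "86% stop sign".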
Even this example is getting ahead of itself, because until recently neural networks were all but shunned by the AI research community. They had been around since the earliest days of AI, and had produced very little in the way of "intelligence." The problem was that even the most basic neural networks were very computationally intensive; it just wasn't a practical approach. Still, a small heretical research group led by Geoffrey Hinton at the University of Toronto kept at it, finally parallelizing the algorithms for supercomputers to run and proving the concept, but it wasn't until GPUs were deployed in the effort that the promise was realized.
If we go back again to our stop sign example, chances are very good that as the network is getting tuned or “trained” it’s coming up with wrong answers — a lot. What it needs is training. It needs to see hundreds of thousands, even millions of images, until the weightings of the neuron inputs are tuned so precisely that it gets the answer right practically every time — fog or no fog, sun or rain. It’s at that point that the neural network has taught itself what a stop sign looks like; or your mother’s face in the case of Facebook; or a cat, which is what Andrew Ng did in 2012 at Google.
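The see-examples-and-adjust training loop can be illustrated with the simplest possible learner, a perceptron trained on the AND function (a deliberately tiny stand-in for the millions-of-images case, not deep learning itself):

```python
import numpy as np

# A toy version of "training": show the model labeled examples many
# times, nudging the weights after each mistake until it classifies
# everything correctly (a classic perceptron update rule).
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 0, 0, 1])       # the AND function as labeled examples

w = np.zeros(2)
b = 0.0
for _ in range(20):              # many passes over the data
    for xi, yi in zip(X, y):
        pred = 1 if xi @ w + b > 0 else 0
        # Nudge the weights in proportion to the error.
        w += (yi - pred) * xi
        b += (yi - pred)

preds = [1 if xi @ w + b > 0 else 0 for xi in X]
print(preds)  # [0, 0, 0, 1] once training has converged
```

Scale the same loop up to millions of parameters and millions of labeled images, and you have the essence of how the stop-sign network gets tuned.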
Ng’s breakthrough was to take these neural networks, and essentially make them huge, increase the layers and the neurons, and then run massive amounts of data through the system to train it. In Ng’s case it was images from 10 million YouTube videos. Ng put the “deep” in deep learning, which describes all the layers in these neural networks.
Today, image recognition by machines trained via deep learning in some scenarios is better than humans, and that ranges from cats to identifying indicators for cancer in blood and tumors in MRI scans. Google’s AlphaGo learned the game, and trained for its Go match — it tuned its neural network — by playing against itself over and over and over.
Thanks to Deep Learning, AI Has a Bright Future
Deep Learning has enabled many practical applications of Machine Learning and, by extension, the overall field of AI. Deep Learning breaks down tasks in ways that make all kinds of machine assists seem possible, even likely. Driverless cars, better preventive healthcare, even better movie recommendations are all here today or on the horizon. AI is the present and the future. With Deep Learning's help, AI may even get to that science-fiction state we've so long imagined. I'll take a C-3PO; you can keep your Terminator.
To learn more about where AI is going next, see NVIDIA CEO Jen-Hsun Huang’s report on The Intelligent Industrial Revolution.