Looking to build an artificial intelligence? This tutorial explains the three basic steps, from data warehousing to deep learning to attention and consciousness.
English writer Richard Braithwait was the first to use the word “computer,” in 1613. He was describing a highly skilled person able to perform complex calculations. It is not without irony that, in the not-too-distant future, computing machines may become virtual persons able to operate independently, with their own intelligence and consciousness.
A brief history of computer science
The first concept of a computer machine emerged in the 19th century. English engineer Charles Babbage (1791–1871) conceived – but never built – the Analytical Engine. The design had (i) an arithmetical logic unit (the “mill”, now called ALU) to perform all four arithmetic operations; (ii) a control unit to interpret instructions (“punched cards”, now called programs); and (iii) the ability to memorize 1,000 numbers of 40 digits each using physical wheels (the “store,” now called RAM).
Still, it took another century before Alan Turing laid out, in his 1936 paper “On Computable Numbers,” the key principles for machines to perform detailed computations. The need for machines to help decipher encoded messages during WWII led to the first general-purpose digital computer: the Electronic Numerical Integrator and Computer (ENIAC), in 1946. It is said that the lights dimmed in Philadelphia when it was turned on for the first time.
From there, the technology improved exponentially, each generation building on the achievements of the previous one:
- first commercial use in 1951 (Universal Automatic Computer, or UNIVAC 1);
- replacement of vacuum tubes by transistors in 1955;
- invention of integrated circuits in the late 1950s;
- Intel’s first single-chip microprocessor in 1971;
- IBM’s first personal computer (PC) for home and office use in the 1980s, running Microsoft’s new MS-DOS operating system;
- introduction of Windows in the 1990s.
There doesn’t seem to be an end to this technological progress. As with hardware, artificial intelligence will get smarter in successive waves of innovation.
Artificial Intelligence 1.0: “knowledge” learning
Today, computers have become ubiquitous in all areas of life, at work and at home. Indeed, computers are useful tools that outperform most humans at information-processing jobs (65 percent of the American workforce).
In addition to surpassing the human mental capacity for calculation and processing tasks, computers enable global interconnection of people and information sharing through the internet. This has brought universal knowledge to our connected doors.
The development of powerful computers and cheap, fast-access memory in the late 1970s made this progress possible. Today, operators can direct machines to collect, clean and sort data to gain insight. The logical next step was for computers to recognize patterns in the data by themselves. Welcome to the birth of artificial intelligence.
Artificial Intelligence 2.0: deep, “predictive” learning
Behavior recognition to predict outcomes is what machines are learning to do today. For example, many tech companies are developing solutions to assess which customers are most likely to buy specific products. This information is then used to choose the best marketing and distribution channels. Without computer analytics, decisions are made on expert judgment alone. Such decisions are often biased toward the outcome most favorable to the expert’s interests, and the results are usually worse than those of relying purely on data.
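As a toy illustration of this kind of predictive scoring, here is a minimal logistic-regression sketch in Python. All features, data points and the notion of a “purchase likelihood” score are invented for the example; a real system would learn from far richer customer data.

```python
import numpy as np

# Invented example data: two made-up features per customer
# (visits per month, past purchases) and whether they bought (1) or not (0).
X = np.array([[1, 0], [2, 0], [3, 1], [5, 1],
              [6, 2], [8, 2], [9, 3], [10, 4]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1, 1, 1], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Fit weights with plain gradient descent on the logistic loss.
w = np.zeros(X.shape[1])
b = 0.0
lr = 0.1
for _ in range(2000):
    p = sigmoid(X @ w + b)          # current predicted probabilities
    w -= lr * (X.T @ (p - y)) / len(y)
    b -= lr * np.mean(p - y)

# Score a new customer (hypothetical feature values).
new_customer = np.array([7.0, 2.0])
score = sigmoid(new_customer @ w + b)
print(f"purchase likelihood: {score:.2f}")
```

The output score is a probability between 0 and 1, which a marketing team could use to rank customers by likelihood to buy.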
The human brain was the inspiration for deep neural network technology. Via successive grouping and layering, computers can extract the essential relations (characteristics) present in large clouds of unstructured data. Through this deeper-and-deeper classification, the system learns without supervision or task-specific programming.
A new generation of graphics processing units (GPUs), derived from video games, is particularly suited to recognizing patterns in large volumes of heterogeneous data. Interestingly, computers can reach a 98% success rate in image recognition, whereas humans are wrong in about 5% of cases.
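The “successive grouping and layering” idea can be sketched as a stack of simple transformations, each compressing its input into a smaller, more abstract set of features. The weights below are random placeholders; a real network would learn them from data.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, n_out):
    """One dense layer with random placeholder weights and a ReLU,
    which keeps only the 'active' features."""
    w = rng.normal(size=(x.shape[0], n_out))
    return np.maximum(0.0, x @ w)

# A raw, unstructured input (e.g. pixel values), random here for illustration.
x = rng.normal(size=64)

# Deeper-and-deeper classification: each layer extracts a smaller
# set of characteristics from the one before it.
h1 = layer(x, 32)
h2 = layer(h1, 16)
h3 = layer(h2, 4)

print([v.shape[0] for v in (x, h1, h2, h3)])  # 64 -> 32 -> 16 -> 4
```

The narrowing shapes (64 to 4) are the point: the final four numbers stand in for the “essential relations” distilled from the raw input.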
Soon, deep learning will allow a machine to learn a language and grasp not only the general sense but also the context, irony, metaphors, jokes, even intonation and silence. For the moment, machines can identify patterns but lack the tools to put things in perspective, prioritize and recommend a course of action. To make decisions, a computer would need to be aware of its context. This is the next step toward consciousness.
Artificial Intelligence 3.0: “attention” learning
Somehow, the brain also processes information, but in a more subjective manner, through memories, senses and emotions, which provide context and awareness. To reach this level of consciousness, a machine would need:
- an “object blueprint” of its surroundings. Each object or fact is represented by a generic vector. Using deep learning (see 2.0 above), today’s computers are already able to build such representations on their own.
- its own “body blueprint.” Each machine must have a virtual representation of itself, including its physical shape, personality traits (behavior) and past performance (memories). This is a critical phase if one wants to avoid creating a killing machine.
- an “attention scheme.” This scheme describes the complex relationship between the “body” and the “objects.” It is a practical map providing information about where we are, where we want to be, and the various paths and landmarks in between.
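The three ingredients above can be caricatured in a few lines of Python. Every name, vector and number here is invented for illustration, and cosine similarity merely stands in for the far richer “map” an attention scheme would be; real attention-schema models remain an open research topic.

```python
import numpy as np

rng = np.random.default_rng(1)

# (i) Object blueprints: each surrounding object as a generic vector.
objects = {"cup": rng.normal(size=8), "door": rng.normal(size=8)}

# (ii) Body blueprint: the machine's own representation, in the same space.
body = rng.normal(size=8)

def attention_scheme(body, objects):
    """A toy 'map' relating the body to each object: cosine similarity
    standing in for 'where we are' relative to 'where it is'."""
    scheme = {}
    for name, vec in objects.items():
        sim = vec @ body / (np.linalg.norm(vec) * np.linalg.norm(body))
        scheme[name] = sim
    return scheme

# (iii) The attention scheme itself, and the object it singles out.
scheme = attention_scheme(body, objects)
focus = max(scheme, key=scheme.get)
print(f"attending to: {focus}")
```

The point of the sketch is the structure, not the math: the machine holds a self-model, object models, and an explicit description of how the two relate, which is what lets it report what it is attending to.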
Without an attention scheme, the machine would not be able to explain how it relates to anything in the world. With one, the machine could affirm its awareness of an object… much the same way a human says they have consciousness without being able to explain where it comes from. This is still science fiction, but the kind based on realistic scientific advances.