In the book by Gerrish, How Smart Machines Think[1],
the author purports to address the field of Artificial Intelligence by example,
namely via the construct of machines that think. The examples he uses are chess
playing, movie selection, playing the TV game Jeopardy, playing Atari games
or Go, and self-driving vehicles. Now this does not cover the whole field we
generally call AI but it does present a powerful set of examples that
demonstrate what AI may encompass.
The problem is that we can mostly agree as to what a machine
is, simply hardware and software, plus some set of past and ongoing data
regarding the target at hand, but we have always had difficulty arriving at a clear
definition of what thinking entails. Philosophers have for centuries
opined on this topic, and despite a massive amount of new information on
the neural processes in the human we still face a conundrum of definitions regarding
a machine. At best we have Turing and his putative definitions, which may
still be quite wanting.
Instead of bemoaning the lack of clarity in defining the process of
thinking, and equally its correlative, the term intelligence, we will
focus a bit on the area of artificial intelligence as an artifact of computer
science. All too often AI is in the eye of the beholder. Set loose upon the
Press, it has almost taken on a life of its own. Moreover, recently with the MIT
push to create its first "college" as an entity almost sanctified by
the AI mantra, it means whatever one seems to want it to mean. To that end we
shall attempt to explore it a bit.
To start out, my view is shaped by half a century working on
the periphery of AI. My personal experience is in taking AI's fundamental
techniques and applying them to a variety of situations. But before examining
them let me step back. I would contend that much of what we are looking
at today started with Wiener and his work on Cybernetics. It included McCulloch,
Pitts, Minsky, Papert, and even Chomsky to a degree. These were the idea folks,
lacking the power of machines and working with primitive algorithms. In many ways they
were trying to emulate what they conceived of as the brain and its functions. I
personally see Wiener as a key initial player, because he added the major
element of uncertainty. One could see his gun tracking system as an integrated
"thinking machine" in a world of uncertainty. Wiener's world was an
analog world, which is how he envisioned things, but it was also limited by the tools
at hand. We have abandoned that world a bit but as we will see it may still be
floating around in current thought.
Now to commence, there are two issues worth focusing on when
examining AI. First, what types of embodiments would we generally accept as
fitting the field of AI? Second, how is the field of AI practiced; namely, is
there a set of fundamental precepts and canonical tools or is it just a set of
ad hoc problem-solving methods? Thus, is AI akin to, say, 19th century
medicine, a collection of techniques that may or may not work depending on the
patient and the disease? 21st century medicine has become focused on
causes and on therapeutics that address the underlying causes. It is an extension
of Koch's postulates to genetic structures.
Let us consider several of the areas of "AI" focus
and development. This is not a comprehensive list but merely descriptive. Minsky's
landscape of AI, his book The Society of Mind, is a somewhat rambling but highly
insightful discussion of the dimensions. It has stood the test of time and is
always worth a review.
1. Pattern Recognition
In a sense this is one of the oldest forms. It takes say a
letter, A, and reads it and then using the output of the sensors determines the
weighting that best gives A in the presence of 25 other letters. The list of
letters is fixed as is their size and font type. The sensors are two
dimensional and of a density that satisfies a reasonable text identification
probability.
We can assume N×N, or N², sensors and the output of the sensors
can be simply 0 or 1. We can then, assuming 26 letters, choose N² weights so
that by adding up the weighted N² samples we can divide the output space into
26 regions each uniquely assigned to a specific letter. This is a simple
pattern recognition algorithm. We optimize this by repetitively
"teaching" the system by submitting the 26 letters again and again to
maximize the detection rate and minimize the false alarm rate. We assume that
some form of convergence exists.
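The repetitive "teaching" described above can be sketched in a few lines. This is only an illustration under stated assumptions, not a description of any historical system: the 7×7 grid, the randomly generated stand-in "letters", the extra bias weight, and the use of one weight set per letter (a multiclass perceptron rather than a single set of N² weights) are all choices made here for the sake of a runnable example. The names make_templates, predict and train are hypothetical.

```python
import random
import string

N = 7                              # assumed sensor grid: N x N binary sensors
LETTERS = string.ascii_uppercase   # the 26 fixed classes


def make_templates(seed=0):
    """Build 26 distinct random 0/1 patterns as stand-ins for scanned letters."""
    rng = random.Random(seed)
    seen, templates = set(), {}
    for letter in LETTERS:
        while True:
            pattern = tuple(rng.randint(0, 1) for _ in range(N * N))
            if pattern not in seen:          # keep every letter distinguishable
                seen.add(pattern)
                templates[letter] = pattern
                break
    return templates


def predict(weights, pattern):
    """Sum the weighted sensor outputs per letter and pick the largest sum."""
    x = pattern + (1,)                       # constant 1 feeds the bias weight
    sums = [sum(w * v for w, v in zip(row, x)) for row in weights]
    return LETTERS[sums.index(max(sums))]


def train(templates, max_epochs=5000):
    """Present the 26 letters again and again until none is misread."""
    weights = [[0] * (N * N + 1) for _ in LETTERS]
    for epoch in range(1, max_epochs + 1):
        mistakes = 0
        for letter, pattern in templates.items():
            guess = predict(weights, pattern)
            if guess != letter:
                mistakes += 1
                good, bad = LETTERS.index(letter), LETTERS.index(guess)
                for i, v in enumerate(pattern + (1,)):
                    weights[good][i] += v    # pull the right letter's sum up
                    weights[bad][i] -= v     # push the wrong letter's sum down
        if mistakes == 0:
            return weights, epoch            # converged: every letter read correctly
    return weights, None                     # gave up without converging


templates = make_templates()
weights, epochs_used = train(templates)
print("epochs needed:", epochs_used)
print("reads the pattern for A as:", predict(weights, templates["A"]))
```

The sketch only shows that the weights settle for clean, fixed patterns. It says nothing about false alarm rates on noisy or unseen inputs, which is where the real difficulty lies.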
Now there are many algorithms which have been developed for
this class of problems. We can examine a finite set of precisely defined
"letters" or objects and then begin to expand it to
We can even extend it to blood cell identification, and the
whole field of pathology. Winston in the 1960s applied some of these techniques
to blood analysis. The techniques have been also applied to EKG analyses. These
however are significantly more complex. One can approach the EKG world from two
dimensions. One is from the training perspective, where thousands of EKGs are
presented and classified. Then the system uses this base to select a diagnosis.
The second approach is the physical analysis approach. Here we would assume we
know the physio-electro dynamics of the heart. Then we would try to use the
underlying model of reality to ascertain what was defective and attempt to
match that with what we have observed, thus identifying the underlying defects from
what has to change to match the results. It should be noted that the preceding
two methodologies also mirror the two broad ways in which we have tried to describe
how one gets to know things. Perhaps humans who are proficient in this area
utilize both approaches.
The characteristics of this class of recognition system are:
1. Finite number of distinguishable classes of objects,
albeit large classes.
2. Objects which have a finite set of identifiers, albeit
large sets, such as shape, color, etc.
3. Objects which are static during recognition
4. Finite sets, albeit large sets, of objects
2. Speech Recognition
Speech recognition has reached a reasonable level of
usefulness. Speech recognition is an example of a trained technique to detect
answers to questions and ultimately to capture fully formed speech.
It has evolved extensively over the past three decades and many techniques are
available. One may question whether this is AI or just a technology. The
question may be: is the system making decisions of any type or just matching
utterances with written words?
One could perhaps combine this with a quasi-AI system which
emulates an interview with a psychiatrist, a physician, or a professor, and then
from the results of the interaction makes certain decisions. Yet these elements
transcend the tasks of speech recognition.
3. Text Translation
Text translation is a complex process. Transliteration
generally leads to nonsense text. One language has a structure and nuance which
may be absent from another. Even dialects can be strikingly different. My Sicilian
Italian learned in my childhood was incomprehensible in Florence and insulting
in Milan. My translations of Dumas can be childlike whereas a good translator
can convey the drama of the author. Then again translating Pushkin can be even
more challenging. Finally one should try translating legal documents from Arabic
to English. Culture, religion, different language structures all lead to
cumbersome results.
To quote from Joseph Stalin, not one known for either
academic excellence or a broad understanding of cultures:
Thus, a nation is not a casual or ephemeral
conglomeration, but a stable community of people. But not every stable
community constitutes a nation. Austria and Russia are also stable communities,
but nobody calls them nations. What distinguishes a national community from a
state community? The fact, among others, that a national community is
inconceivable without a common language, while a state need not have a common
language. The Czech nation in Austria and the Polish in Russia would be impossible
if each did not have a common language, whereas the integrity of Russia and
Austria is not affected by the fact that there are a number of different
languages within their borders. We are referring, of course, to the spoken
languages of the people and not to the official governmental languages.
Thus, a common language is one of the characteristic
features of a nation. This, of course, does not mean that different nations
always and everywhere speak different languages, or that all who speak one language
necessarily constitute one nation. A common language for every nation, but not
necessarily different languages for different nations! There is no nation which
at one and the same time speaks several languages, but this does not mean that
there cannot be two nations speaking the same language! Englishmen and
Americans speak one language, but they do not constitute one nation. The same
is true of the Norwegians and the Danes, the English and the Irish. But why,
for instance, do the English and the Americans not constitute one nation in
spite of their common language?
This quote is descriptive of the sensitivity of language.
Yes, the English and Americans speak a similar and mutually understandable language.
But there are fundamental differences and thus any language translation must
take these into consideration. Thus far it does not appear that any AI system accomplishes
this.
4. Text Interpretation
"What do you mean by that?" may be a frequent
question. We understand what was said, we can translate it, but we may still
be lacking the meaning.
5. Information Retrieval (Q and A)
The game of Jeopardy is a classic example of information
retrieval, via a question and answer scenario. Specifically we deal with the
Question as well as the answer. As described by Gerrish, the IBM approach was
complex, because it first required the parsing of the question and seeing what
was asked for. Typically in the game there are categories of questions and then
in each category a set of questions seeking the identity of some person, place
or thing for which the specific question is the answer. This is somewhat the
opposite of our usual way of processing, since here we see the answer posed and
then seek to pose the question. However the same may apply in reverse. In
either case it is still merely a case of checking known facts. It is static and
certain and the answer is almost always unique. It also is non-iterative,
namely we get just one chance at selecting the "question". As such
this is a clear case of information retrieval. It does add the dimension of
parsing and syntax analysis.
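To make the static, one-shot nature of this kind of retrieval concrete, here is a toy sketch. It is in no way the IBM approach that Gerrish describes. The three-entry fact table, the stopword list, the keyword-overlap scoring, and the names FACTS, parse_clue and respond are all assumptions invented for illustration.

```python
import re

# Hypothetical mini knowledge base: (kind, name, descriptive keywords).
FACTS = [
    ("person", "George Washington", {"first", "president", "united", "states"}),
    ("place", "Paris", {"capital", "france", "seine"}),
    ("thing", "the telephone", {"patented", "bell", "1876", "device"}),
]

# Words carrying no retrieval value in this toy parser.
STOPWORDS = {"the", "of", "a", "an", "this", "in", "on", "is", "was", "by"}


def parse_clue(clue):
    """Crude parsing step: lowercase, tokenize, drop stopwords."""
    words = re.findall(r"[a-z0-9]+", clue.lower())
    return {w for w in words if w not in STOPWORDS}


def respond(clue):
    """One non-iterative lookup: best keyword overlap, phrased as a question."""
    terms = parse_clue(clue)
    kind, name, keywords = max(FACTS, key=lambda fact: len(terms & fact[2]))
    if not terms & keywords:
        return None                      # nothing known matches the clue
    pronoun = "Who" if kind == "person" else "What"
    return f"{pronoun} is {name}?"


print(respond("This man was the first President of the United States"))
print(respond("Capital of France on the Seine"))
```

Everything the sketch "knows" is fixed in advance and each clue gets a single attempt, which is the sense in which the task is static and non-iterative. All of the difficulty in a real system sits in the parsing and in the size and quality of the fact base.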
6. Directed Decision Dynamics
Robotic assembly machines may fit this area. They are
directed, they are dynamic, and they must make decisions. For example if we
have an assembly line with multiple models of cars, there may be a multiplicity
of assembly directions for each model. The robot must identify the car and
perhaps even "see" the differences.
7. Undirected Decision Dynamics
Consider a game of cards, a random game of cards. Namely
when the deal changes so too may the game. Five card stud and so forth may be
chosen. Thus every time a new game starts the system must first ascertain what
the game is and then learn it and then play it. This area naturally fits into
what we have seen for decades as war games. Certain centers such as the Naval
War College conduct a multiplicity of games to see what scenarios could be
presented by a variety of putative adversaries. Then we examine the response
and continue the effort. The 1983 movie, War Games, is a classic initial
presentation of taking this simulation approach, placing the "rules"
on a computer, and then taking the "human" out of the loop. War Games
are a classic example of undirected decision dynamics. We do not know the game
the adversary is playing and the only way to assess this is by sampling highly
uncertain information, possibly taking some action to see the response and then
redirecting our efforts according to some overall metric of success.
The 1950-1970 period laid out a multiplicity of War Game
Scenarios in a nuclear environment. Survival of a limited number of humans
capable of reproducing was the acceptable end point. The destruction of society
and of billions of people was acceptable, until some started to think a bit about this
"mutual assured destruction" approach. Taking the "Games"
and placing them on a computer would be the ultimate enablement of an
undirected dynamic decision AI system. One would suspect that perhaps as in the
film the ultimate decision is "not to play the game".
Now a recent case which may fit this scenario is that of the
self-driving car. At best we may tell the vehicle the desired end point. We
could equally ask the vehicle to take us to view the Fall foliage in New
England, thus creating a second layer of vagueness but with some modicum of specificity.
8. Thinking
What is thinking? Does it mean I can write a poem? Write a
short story? Devise a new algorithm or find a new chemical pathway or genetic
pathway? Can some AI system develop a new philosophical approach, say aligning
Wittgenstein and Heidegger? These tasks are complex and beyond what appears
achievable today.
However when we examine the machines that think which
Gerrish describes we find a set of common threads.
1. Directed
All of the examples are task directed. They drive a car,
play a game, work a test, and even may diagnose a disease. They are not general
in any way.
2. Trained
They all get trained to do a task. Their advantage is the
ability to look ahead, but only along the path that they were already trained upon.
3. Bounded
Each approach is limited to the task at hand and cannot
readily or possibly at all be used for even a moderately different task. The
machine plays Go, Chess, Atari Games, but cannot go laterally to another game.
4. Common Techniques
Whether we call it deep learning, neural nets, hidden Markov
models or whatever, there are some common methodologies that enable the direction
and learning that get the systems to maximize their performance. Driving a car has
two objectives: get to where you want, and do so in as harmless a manner as
possible. There is a path and there are exogenous limitations.
Thus AI as a broad rubric can be understood as such; yet it
fails to achieve what we saw a century ago in radio design, for example.