User talk:Veritas Aeterna/Work in Progress, Symbolic Artificial Intelligence

Current Lead Text

Since many people may only read the introductory paragraphs, it is important to ensure they are correct. Unfortunately, the middle paragraph of the current lead section has some key inaccuracies and parts that are misleading. I am referring to this paragraph:

"Symbolic AI was the dominant paradigm of AI research from the mid-1950s until the middle 1990s. Researchers in the 1960s and the 1970s were convinced that symbolic approaches would eventually succeed in creating a machine with artificial general intelligence and considered this the goal of their field. However, in the late 80s and 90s specific technical problems (such as brittleness and intractability) showed the limits of the symbolic approach. AI research turned to new methods (called "sub-symbolic" at the time) including connectionism, soft computing, mathematical optimization and neural networks. These methods were directed towards specific problems with specific solutions, rather than general intelligence. "Deep learning" (a sub-symbolic approach) had spectacular success in handling vision, speech recognition, speech synthesis, image generation, and machine translation. But by 2020 difficulties with bias, explanation, comprehensibility, and robustness have become more apparent with deep learning approaches and AI researchers have called for combining the best of both the symbolic and neural network approaches."

Parts that are Misleading or Inaccurate

The problem with these sentences is that they give an erroneous view of symbolic AI, especially in the three sentences in bold (added). Specifically, they propagate these viewpoints:

The key problems of symbolic AI that led to the Second AI Winter were (a) brittleness, and (b) computational intractability.
Russell and Norvig, 2021, pages 21 and 23-25, are cited to support that contention. Those pages do not; see the clarifications below.
From the mid-1990s, AI research focused on connectionism, soft computing, mathematical optimization and neural networks.
These methods were "new methods", implying there had been no prior work on connectionism, soft computing, mathematical optimization or neural networks.
All of these methods were considered sub-symbolic, not just neural networks and connectionism.
There is an implication that by a "turn to" these methods, AI researchers turned away from symbolic approaches. That is not true.
Russell and Norvig, 2021, pages 25-26, are cited to support "AI research turned to new methods (called "sub-symbolic" at the time) including connectionism, soft computing, mathematical optimization and neural networks". They do not.

Clarifications

However, in the late 80s and 90s specific technical problems (such as brittleness and intractability) showed the limits of the symbolic approach.:
Russell and Norvig, page 21, describe problems from earlier AI work, around 1957-1959, prior to the Lighthill Report in 1973, which indeed mentioned combinatorial explosion as a problem.
Attempts to address computational intractability were part of early efforts regarding the use of heuristics in searching, such as A*.
Heuristic search, such as the A* algorithm, published in 1968, is one means of addressing this problem.
Applying expert knowledge is another approach to focusing search, as Feigenbaum mentions in his CACM interview.
So, yes, this was a problem for the First AI Winter, but at this point we are talking about the Second AI Winter, starting in 1988.
The main reasons for the failure of expert systems leading to the Second AI Winter are described by Kautz as problems with knowledge acquisition and handling uncertainty. Similarly, Russell and Norvig on p. 24 say, "It turned out to be difficult to build and maintain expert systems for complex domains, in part because the reasoning methods used by the systems broke down in the face of uncertainty and in part because the systems could not learn from experience."
Brittleness applies to both deep learning and symbolic systems. It was indeed a problem for expert systems, but is not unique to symbolic AI. I am still including it, however.
Instead of brittleness and intractability, Kautz cites two key technical problems for the end of enthusiasm for expert systems:
"The first challenge was the need for principled and practical methods for probabilistic reasoning." [Kautz, 2022, p. 110]
"The second unsolved challenge for the expert system approach was named the “knowledge acquisition bottleneck.”" [Kautz, 2022, p. 110]

AI research turned to new methods (called "sub-symbolic" at the time) including connectionism, soft computing, mathematical optimization and neural networks. These methods were directed towards specific problems with specific solutions, rather than general intelligence.:
Kautz characterizes the next two decades and their primary focus on p. 110: "overcoming these challenges set the workplan for the next two decades of research in AI." The focus is not on sub-symbolic systems, but instead on handling uncertainty and the knowledge acquisition bottleneck.
Both Kautz and Russell & Norvig cite probabilistic reasoning (Bayesian networks, HMMs, and later statistical relational learning) and machine learning (primarily symbolic approaches, but also SVMs and other classifiers, including Valiant's theoretical work).
It is not until about 2012 that deep learning takes off. Russell and Norvig say 2011, citing speech recognition, but Kautz starts that period at 2012, and most references I have seen, such as Marcus, also use 2012.
Indeed, [Marcus, 2019] describes neural net research as being considered unsuccessful until around 2012:

"Still, many people continued in Rosenblatt's tradition for decades. And until recently, his successors too struggled mightily. Until Big Data became commonplace, the general consensus in the AI community was that the so-called neural-network approach was hopeless. Systems just didn't work that well, compared to other methods.
... A revolution came in 2012, when a number of people, including a team of researchers working with Hinton, worked out a way to use the power of GPUs to enormously increase the power of neural networks.
Suddenly, for the first time, Hinton's team and others began setting records, most notably in recognizing images in the ImageNet database we mentioned earlier. Competitors Hinton and others focused on a subset of the database-1.4 million images, drawn from one thousand categories. Each team trained its system on about 1.25 million of those, leaving 150,000 for testing. Before then, with older machine-learning techniques, a score of 75 percent correct was a good result; Hinton's team scored 84 percent correct, using a deep neural network, and other teams soon did even better; by 2017, image labeling scores, driven by deep learning, reached 98 percent."

These methods were directed towards specific problems with specific solutions, rather than general intelligence.: The implication is that symbolic AI had never been focused on narrow AI before, but only on AGI. That is not the case. Expert systems are an obvious case of narrow AI. So, I propose just dropping this sentence.

Approach to Address these Problems

I think the problem is that overall the explanation is too coarse. It does not distinguish the Second AI Winter, the period immediately following it, when probabilistic reasoning and symbolic machine learning received much greater focus, and the period in which deep learning took off (circa 2012). Finally, a shift to a greater focus on hybrid systems appears to have started about 2020.

I propose refining the introductory discussion to break out these periods and reserving "sub-symbolic" to describe only neural nets and connectionism, and not using it to encompass probabilistic methods, Bayesian approaches, or optimization. The latter techniques can be used for symbolic AI, deep learning, and in various hybrid logical-probabilistic approaches, such as Markov Logic Networks.

Regarding the use of "soft", fuzzy logic was introduced in 1965, and Danny Hillis founded Thinking Machines Corporation in 1983. So, there wasn't a sudden shift to soft and sub-symbolic approaches in the late 80's and neural nets didn't become dominant until about 2012. We can certainly talk more about fuzzy logic and other extensions to logic later on.

Revised Sentences and New Paragraph for Deep Learning

Here is what I propose, discussed one part at a time:

Symbolic AI was the dominant paradigm of AI research from the mid-1950s until the middle 1990s.: { No changes. }

Researchers in the 1960s and the 1970s were convinced that symbolic approaches would eventually succeed in creating a machine with artificial general intelligence and considered this the ultimate goal of their field.: { Just clarified that AGI was an **ultimate** goal and not the only one. Certainly, many others were focused on specific applications, just as they are today. E.g., theorem provers, planners, symbolic mathematics, etc. And of course, expert systems were, by definition, narrow AI.}

An early boom, with early successes such as the Logic Theorist and Samuel's Checkers Playing Program, led to unrealistic expectations and promises and was followed by the First AI Winter as funding dried up.^[1]^[2]: { I'm adding a bit more of the early history and a mention of the first AI winter. Russell mentions Samuel's program. Kautz mentions the Logic Theorist. }

A second boom (1969-1986) occurred with the rise of expert systems, their promise of capturing corporate expertise, and an enthusiastic corporate embrace.^[3]^[4] That boom, and some early successes, e.g., with XCON at DEC, was followed again by later disappointment.^[4] Difficulties arose in knowledge acquisition, in maintaining large knowledge bases, and with brittleness in handling out-of-domain problems. Another, second, AI Winter (1988-2011) followed.^[5]: { Boom and bust again, with time periods from first Russell then Kautz and some specifics on why there was a boom. It all looked very promising back then, initially. }

Subsequently, AI researchers focused on addressing underlying problems in handling uncertainty and in knowledge acquisition.^[5] Uncertainty was addressed with formal methods such as Hidden Markov Models, Bayesian reasoning, and statistical relational learning. Symbolic machine learning addressed the knowledge acquisition problem with contributions including Version Space, Valiant's PAC learning, Quinlan's ID3 decision-tree learning, case-based learning, and inductive logic programming to learn relations.^[6]: { Some specifics about symbolic machine learning until I write that section. For handling uncertainty, Bayesian reasoning and HMMs are mentioned by Russell & Norvig while Kautz mentions Bayesian reasoning and statistical relational reasoning. }

---

Next, I'd start a new paragraph just to address deep learning and history to the present:

Neural networks, a sub-symbolic approach, had been pursued from early days. Early examples are Rosenblatt's perceptron learning work, the backpropagation work of Rumelhart, Hinton and Williams,^[7] and work in convolutional neural networks by LeCun et al. in 1989.^[8]: { Brief intro and overview. I could say even more, citing William Grey Walter's work on cybernetics and the work of Minsky and Papert on neural networks, but I'm trying to keep it shorter. }

However, neural networks were not viewed as successful until about 2012:

"Until Big Data became commonplace, the general consensus in the AI community was that the so-called neural-network approach was hopeless. Systems just didn't work that well, compared to other methods. ...A revolution came in 2012, when a number of people, including a team of researchers working with Hinton, worked out a way to use the power of GPUs to enormously increase the power of neural networks."^[9]

{ Explain why we view 2012 as the jumping off point for when deep learning really takes off. It seems necessary to make this point as so much of what you read attempts to rewrite history to imply first there was symbolic AI, and then deep learning; End of story. That's just not true.}

Over the next several years, deep learning had spectacular success in handling vision, speech recognition, speech synthesis, image generation, and machine translation.: { Acknowledge incredible results. }

However, by 2020, as inherent difficulties with bias, explanation, comprehensibility, and robustness became more apparent with deep learning approaches, AI researchers began calling for combining the best of both the symbolic and neural network approaches^[10]^[11] and addressing areas that both approaches have difficulty with, such as common-sense reasoning.^[9]: { Finally, how both approaches may be best together. I may need to add more on hybrid approaches that extend logic to handle probability; I intend to put that in somewhere. }

New Paragraphs

Symbolic AI was the dominant paradigm of AI research from the mid-1950s until the middle 1990s.^[12]^[13] Researchers in the 1960s and the 1970s were convinced that symbolic approaches would eventually succeed in creating a machine with artificial general intelligence and considered this the ultimate goal of their field.^[14] An early boom, with early successes such as the Logic Theorist and Samuel's Checkers Playing Program, led to unrealistic expectations and promises and was followed by the First AI Winter as funding dried up.^[1]^[2] A second boom (1969-1986) occurred with the rise of expert systems, their promise of capturing corporate expertise, and an enthusiastic corporate embrace.^[3]^[4] That boom, and some early successes, e.g., with XCON at DEC, was followed again by later disappointment.^[4] Difficulties arose in knowledge acquisition, in maintaining large knowledge bases, and with brittleness in handling out-of-domain problems. Another, second, AI Winter (1988-2011) followed.^[5] Subsequently, AI researchers focused on addressing underlying problems in handling uncertainty and in knowledge acquisition.^[6] Uncertainty was addressed with formal methods such as Hidden Markov Models, Bayesian reasoning, and statistical relational learning.^[15]^[16] Symbolic machine learning addressed the knowledge acquisition problem with contributions including Version Space, Valiant's PAC learning, Quinlan's ID3 decision-tree learning, case-based learning, and inductive logic programming to learn relations.^[6]

Neural networks, a sub-symbolic approach, had been pursued from early days and were to reemerge strongly in 2012. Early examples are Rosenblatt's perceptron learning work, the backpropagation work of Rumelhart, Hinton and Williams,^[17] and work in convolutional neural networks by LeCun et al. in 1989.^[18] However, neural networks were not viewed as successful until about 2012: "Until Big Data became commonplace, the general consensus in the AI community was that the so-called neural-network approach was hopeless. Systems just didn't work that well, compared to other methods. ...A revolution came in 2012, when a number of people, including a team of researchers working with Hinton, worked out a way to use the power of GPUs to enormously increase the power of neural networks."^[9] Over the next several years, deep learning had spectacular success in handling vision, speech recognition, speech synthesis, image generation, and machine translation. However, since 2020, as inherent difficulties with bias, explanation, comprehensibility, and robustness became more apparent with deep learning approaches, an increasing number of AI researchers have called for combining the best of both the symbolic and neural network approaches^[10]^[11] and addressing areas that both approaches have difficulty with, such as common-sense reasoning.^[9]

[1] Kautz 2020, pp. 107–109.
[2] Russell and Norvig 2021, p. 19.
[3] Russell and Norvig 2021, pp. 22–23.
[4] Kautz 2020, pp. 109–110.
[5] Kautz 2020, p. 110.
[6] Kautz 2020, pp. 110–111.
[7] Rumelhart, David E.; Hinton, Geoffrey E.; Williams, Ronald J. (1986). "Learning representations by back-propagating errors". Nature. 323 (6088): 533–536. doi:10.1038/323533a0. ISSN 1476-4687.
[8] LeCun, Y.; Boser, B.; Denker, J.; Henderson, D.; Howard, R.; Hubbard, W.; Jackel, L. (1989). "Backpropagation Applied to Handwritten Zip Code Recognition". Neural Computation. 1 (4): 541–551.
[9] Marcus and Davis 2019.
[10] Rossi, Francesca. "Thinking Fast and Slow in AI". AAAI. Retrieved 5 July 2022.
[11] Selman, Bart. "AAAI Presidential Address: The State of AI". AAAI. Retrieved 5 July 2022.
[12] Kolata 1982.
[13] Russell & Norvig 2003, p. 5.
[14] Russell & Norvig 2021, p. 24.
[15] Russell and Norvig 2020, p. 25.
[16] Kautz 2020, p. 111.
[17] Rumelhart, David E.; Hinton, Geoffrey E.; Williams, Ronald J. (1986). "Learning representations by back-propagating errors". Nature. 323 (6088): 533–536. doi:10.1038/323533a0. ISSN 1476-4687.
[18] LeCun, Y.; Boser, B.; Denker, J.; Henderson, D.; Howard, R.; Hubbard, W.; Jackel, L. (1989). "Backpropagation Applied to Handwritten Zip Code Recognition". Neural Computation. 1 (4): 541–551.
