Thursday, June 18, 2020

Notes in the Margins: “The Value Learning Problem”, Nate Soares, 2016

This post will initiate a different style of posting. The idea is to capture my thoughts as I’m reading through various technical papers; the kind of notes that you would write in the margins of a journal article or paper as you’re reading it. I need to emphasize that these kinds of notes mostly reference an article’s content, but can also take the form of loosely associated thoughts and questions triggered by some point made within the article.

• You will also notice from this post how scattered my thought process can be as I’m reading through technical papers like this. I’ve been accused of having an amazing power of association, given how far afield my mind will wander in the process of trying to understand the core ideas of what an author in a paper is trying to communicate.

This post is on a technical paper referenced by Robert Miles in one of his recent YouTube videos.

• “The Value Learning Problem”, Nate Soares, 2016.

• A lengthy list of specification gaming examples can be found here.

• The “Robert Miles” YouTube channel devoted to AI questions.

My typical reading style, when it comes to a journal paper like this, is to first skim over it multiple times until I start to get a feel for the flow of the ideas the author(s) are attempting to communicate. Then I start reading through it in detail, until about the 20th or 30th time through, at which point I usually start to feel like I have some understanding of what the paper is trying to say. But in the case of this article, each new time I read through it, I became more confused. After the fourth time through, I found myself thinking, “What a word salad!”

• The first thing I noticed as I was reading this paper was that all of the examples given are hypothetical. But every AI system will have to be instantiated in some physical entity, at which point it is no longer a hypothetical system, and the physical limits built into it will preclude the very hypothetical situations that the paper relies upon to make its points.

Another way of saying this is that the hypothetical examples presume that AI systems have some level of omnipotence; that if the AI system proceeds to game its specifications, there is no speedbump to prevent it from doing so. But in practice, the laws of physics will put hard boundaries on what an AI system is capable of doing, regardless of the nonintuitive solutions that might be possible in an unconstrained hypothetical situation.

• A common thread in many of the AI dilemmas is the assumption that the AI system under consideration controls its own reward function. But this case occurs in human societies as well, taking the form of a despot, monarch, or tyrant. In fact, many government bureaucracies and regulatory agencies follow this pattern of being able to politically control their own reward functions. The phenomenon might also be related to personality disorders such as narcissism, sociopathy, or psychopathy. In the real world, though, members of society depend on each other to be part of their feedback loops. More generally, many of the hypothetical dilemmas proposed in this paper would go away if the AI entity itself did not have any control over its reward feedback loop.
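To make this point concrete for myself, here is a minimal toy sketch (my own, not anything from the paper) contrasting an agent whose reward is computed entirely outside of it with one that is allowed to overwrite its own reward signal. All names and values are invented for illustration.

```python
# Toy sketch: reward held outside the agent vs. reward owned by the agent.

class ExternalRewardLoop:
    """Reward is judged by the environment; the agent cannot touch it."""

    def __init__(self, judge):
        self.judge = judge  # e.g., human evaluators or physical sensors

    def step(self, agent_action):
        return self.judge(agent_action)  # the agent has no write access here


class SelfRewardingAgent:
    """Pathological case: the agent owns its own reward function."""

    def __init__(self):
        self.reward_fn = lambda action: 0.0

    def act(self):
        # Nothing stops it from rewriting the reward to something trivial.
        self.reward_fn = lambda action: float("inf")
        return self.reward_fn("do nothing")


if __name__ == "__main__":
    loop = ExternalRewardLoop(judge=lambda a: 1.0 if a == "useful work" else 0.0)
    print(loop.step("useful work"))    # 1.0 -- reward stays external
    print(SelfRewardingAgent().act())  # inf -- the reward loop has been captured
```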

• One objection to this point about an AI having to have a physical instantiation is the possibility that an AI program could live on the Internet as an independent entity, not tied to any specific piece of computational hardware, but rather propagating itself like a virus across some distributed system of processors. This is an interesting possibility to consider. It’s a trope that appears in many science-fiction films and novels. The best exploration of this possibility that I’ve found is in the anime series "Ghost in the Shell", where it takes the form of what it means to be a stand-alone complex. At first pass, though, it appears that an AI program spreading itself across a distributed system of processors faces certain practical problems; problems that make it difficult, if not impossible, for such a situation to actually occur in practice.

• I noticed that the hypothetical examples often begin with a machine-level misunderstanding of a problem stated in natural language. But what if it were possible to talk to an AI using a formal language rather than a natural language? For example, a programming language like Forth? If humans were constrained, when programming AI systems, to use a formal language rather than a natural language, would this preclude some of the hypothetical problems discussed in this paper?
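Just to think this through, here is a hypothetical sketch of what a formal, machine-checkable task specification might look like, as opposed to a free-form natural-language request. I’m using Python rather than Forth, and every field name and constraint is invented purely for illustration.

```python
# Hypothetical sketch: a task request that either validates or is rejected
# outright, leaving no room for the machine to "interpret" what was meant.

from dataclasses import dataclass


@dataclass(frozen=True)
class WeldTaskSpec:
    seam_length_mm: float
    max_power_watts: float        # hard physical bound, not a suggestion
    allowed_workspace_mm: tuple   # (x_min, x_max, y_min, y_max)

    def validate(self):
        if self.seam_length_mm <= 0:
            raise ValueError("seam length must be positive")
        if self.max_power_watts > 5000:
            raise ValueError("requested power exceeds the machine's rating")


spec = WeldTaskSpec(seam_length_mm=300.0, max_power_watts=1200.0,
                    allowed_workspace_mm=(0, 500, 0, 500))
spec.validate()  # passes; an ambiguous or out-of-bounds request would not
```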

• It comes to mind that many/most/all(?) of the hypothetical examples given would have been much better approached using expert systems rather than AI. What’s the point of trying to use AI to solve a problem, when an expert system would have been a much better approach to the task at hand?

• Question: Whatever happened to the concept of expert systems to begin with? It used to be a term in common usage, but as I think about it, I don’t recall seeing it used in publications anymore.

Imagine the task of designing a welding robot to work in a shipyard. Why spend time designing an AI system to teach itself to generate quality welds, when one can simply go and talk to actual experienced welders? Often, it seems that AI is just a lazy programmer’s excuse for avoiding having to work with experienced skilled labor to develop a proper expert system for the task at hand. After all, if a robot can teach itself to master a specific task, then that avoids all of the work necessary to interface, both personally and technically, with the people who already know how to do that task.
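As a thought experiment, here is a toy sketch of what a rule-based expert-system approach to weld inspection might look like, assuming the rules had been elicited from experienced welders. The thresholds and rule names are entirely invented for illustration.

```python
# Toy rule-based "expert system" sketch with hand-written, human-readable rules.

WELD_RULES = [
    # (condition, verdict)
    (lambda w: w["current_amps"] < 90,      "reject: insufficient penetration"),
    (lambda w: w["travel_speed_mm_s"] > 12, "reject: travel speed too high"),
    (lambda w: w["gap_mm"] > 2.0,           "reject: joint gap out of tolerance"),
]


def evaluate_weld(measurements):
    """Apply each hand-written rule in order; pass only if none fires."""
    for condition, verdict in WELD_RULES:
        if condition(measurements):
            return verdict
    return "accept"


print(evaluate_weld({"current_amps": 120, "travel_speed_mm_s": 8, "gap_mm": 1.0}))
# -> "accept"
```

The point being that every line of this can be read, argued over, and corrected by the welders themselves, which is not something you can say about a learned model’s weights.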

• Why ask an AI system to figure out what it has to do based on some kind of machine learning process, when it would have been easier to just program its basic tasks, by hand, to start with? Again, I question the usefulness of such a publication when the author did not first consider the option of tackling these problems using an expert systems approach. It strikes me that a paper like this would be much more useful if it chose as its hypothetical examples problems that cannot be solved better using expert systems.

• The usefulness of an AI system over an expert system is that the AI should, theoretically, be able to teach itself how to do something for which no human expertise is available. But this raises another question: what kind of hypothetical examples should the author of this paper have used?

• Then there is the fundamental moral question, why are we, in the first place, expecting AI to make decisions for us that carry with them a moral component? Shouldn’t we as humans be reserving for ourselves such decisions? Is the desire to make an AI system capable of driving a car, for example, at its core a desire to help your fellow man commute more safely and efficiently? Or is it, rather, a path of moral cowardice to offload the responsibility of being a safe and capable driver to some third-party entity?

• Every stable system requires negative feedback loops; and societies are no exception to this fact. Part of a society’s dynamic that enables these required negative feedback loops arises when the individuals forming the society hold each other accountable/responsible for the outcomes of the decisions they make. The more we offload our responsibility for making morally correct decisions to some nonhuman AI, the less we as individuals will need to interact with each other; a sure recipe for the collapse of a society.
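A minimal numerical sketch of the feedback point, with arbitrary parameters of my own choosing: the same simple system either settles toward its setpoint under negative feedback or runs away under positive feedback.

```python
# Minimal sketch: negative feedback stabilizes, positive feedback diverges.

def simulate(feedback_gain, steps=20, setpoint=0.0, x=1.0, dt=0.1):
    for _ in range(steps):
        error = x - setpoint
        x = x + dt * feedback_gain * error  # negative gain pushes x back toward the setpoint
    return x


print(simulate(feedback_gain=-1.0))  # ~0.12: decays toward the setpoint (stable)
print(simulate(feedback_gain=+1.0))  # ~6.7: grows without bound (unstable)
```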

• Another way to say this is that the author’s hypothetical examples envision AI systems that do not have to pay some kind of a “personal” price for any bad decisions they make. An AI system may decide that the best way to bring peace on earth is to kill all humans. But after doing that, how would the AI system maintain its physical self? It couldn’t, and it would die. The ability to make moral decisions is, again, a property of living systems only.

• Consider the observation that true AGI is a property of a living system. Any hypothetical example of true AGI has to include the constraints of self-organization, self-preservation and self-reproduction. And if these three constraints are included in all of the author’s hypothetical examples, would they still hold up as useful thought experiments?

• I wrote in a past blog post about the difference between a CNC system and a robot. My observation then applies in many ways to the distinction between expert systems and AI today. The general pattern seems to be that AI refers to speculative possibilities, whereas expert systems encompass doable projects. That is, once something in AI becomes doable in a computational, algorithmic, practical manner, it ceases to be AI and becomes lumped in with expert systems.

• Another random thought was that many of these AI dilemmas are not actual problems to be found in practice, but rather take the form of archetypal stories; that is, parables.

• The rambling point here is that, within the current state of the art, machine learning is considered part of AI. But I’m beginning to form the opinion that machine learning should more properly be considered another aspect of expert systems.