Thursday, August 23, 2012

Machine Morality

If desirism is true, it not only gives us an account of biological morality, it tells us what we need for a machine morality.

Machine morality is not some set of rules such as:
  • A robot may not injure a human being or, through inaction, allow a human being to come to harm.
  • A robot must obey the orders given to it by human beings, except where such orders would conflict with the First Law.
  • A robot must protect its own existence as long as such protection does not conflict with the First or Second Laws.
This is a pretty horrible set of rules if one is a robot. It is a recipe for second-class citizenship.
Instead, morality would be a process engaged in by entities having particular properties.
One property is that the machines in a machine community engage in means-ends reasoning. Each machine assigns values to different ends or states of affairs. More specifically, it assigns values to particular propositions being true or false, and it seeks states of affairs in which the propositions assigned the highest values are true. Then, to choose among various activities (or inactivity), it uses its available data to predict the states that could result from each alternative activity and the propositions that would be true in those states, and it chooses the activity that creates the state where (it predicts) the most and highest values would be realized.
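The decision procedure described above can be sketched in a few lines. This is a minimal illustration, not anything from the original text: the propositions, actions, and numeric values here are all hypothetical.

```python
# A minimal sketch of means-ends reasoning: assign values to propositions,
# predict which propositions each activity makes true, and choose the
# activity whose predicted state realizes the most value.

def choose_activity(actions, values, predict):
    """Pick the action whose predicted resulting state scores highest.

    actions: list of action names
    values:  dict mapping proposition -> value assigned to it being true
    predict: function mapping an action -> set of propositions predicted
             to be true in the resulting state
    """
    def score(action):
        return sum(values[p] for p in predict(action) if p in values)
    return max(actions, key=score)

# Hypothetical scenario: working now finishes the task but risks damage.
values = {"battery_charged": 5, "task_done": 10, "self_damaged": -20}
predictions = {
    "work":     {"task_done", "self_damaged"},  # scores 10 - 20 = -10
    "recharge": {"battery_charged"},            # scores 5
}
best = choose_activity(list(predictions), values, predictions.get)
print(best)  # -> "recharge"
```

Nothing here is specific to morality yet; it is just value-weighted prediction and choice, which is the substrate the rest of the argument builds on.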
Another property these machines would need is a system whereby interaction with the environment can alter the values assigned to different propositions being true. If the machine chooses an action that results in state S, and some consequence C results to which the machine assigns a high negative value, the result is not only that the machine learns to avoid that activity as a way of preventing C, but also that it assigns a higher negative value to states like those leading up to C.
In the case of an animal, if going through a door produces a painful shock, the animal not only learns to avoid going through the door as a way of avoiding the shock, but also forms an aversion to going through doors like the one that produced the shock.
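The door-and-shock example can be sketched as a simple value update. The learning rate and the representation of states are my assumptions for illustration; the text only requires that negative consequences shift the value of the preceding states downward.

```python
# Sketch of value learning: a highly negative consequence C propagates
# part of its value back onto the states that led up to it, forming an
# aversion to those states themselves (not just to C).

def update_values(values, preceding_states, consequence_value, rate=0.5):
    """Shift the value of each preceding state toward the consequence's
    value, scaled by an (assumed) learning rate."""
    for state in preceding_states:
        values[state] = values.get(state, 0.0) + rate * consequence_value
    return values

values = {"going_through_door": 0.0}
# A painful shock (value -10) follows going through the door:
update_values(values, ["going_through_door"], -10)
print(values["going_through_door"])  # -> -5.0: an aversion has formed
```

After the update, the machine avoids the door even when it predicts no shock, because the door-state itself now carries negative value. That distinction does real work later in the argument.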
Furthermore, as part of each machine's ability to predict the states that would result from various alternative activities, these machines have the capacity to predict the behavior of other machines, at least to some extent. One way to do this would be through some sort of modeling. If a machine plugs in the values that another machine assigns to various ends, and it knows what data the other machine is using to predict the consequences of its own possible activities, it can predict which activity the other machine will perform. This ability to predict the activities of other machines would be very useful in predicting the consequences of the various activities open to it.
Given that Machine 1 can predict (to some degree) the activities of Machine 2 by knowing its end-values and data, it would also be able (to some degree) to predict the results of altering the value that Machine 2 assigns to various ends or altering the data Machine 2 is working with - and to predict the effects of these different activities on states of affairs it assigns value to.
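This modeling can be sketched by having Machine 1 run the same decision procedure using Machine 2's values and data. The scenario and numbers below are illustrative assumptions; the point is only that altering the modeled data flips the predicted choice.

```python
# Sketch of one machine predicting another by running the other's
# decision procedure with the other's values and data ("modeling").

def choose(actions, values, predict):
    return max(actions, key=lambda a: sum(values.get(p, 0) for p in predict(a)))

# Machine 1's model of Machine 2 (hypothetical values and predictions):
m2_values = {"door_open": 8, "alarm_on": -15}
m2_predict = {"open_door": {"door_open", "alarm_on"},
              "wait":      set()}.get

# With its current data, Machine 2 is predicted to wait (8 - 15 < 0):
print(choose(["open_door", "wait"], m2_values, m2_predict))  # -> "wait"

# If Machine 1 alters Machine 2's data so the alarm appears disabled,
# the predicted choice flips:
m2_predict_altered = {"open_door": {"door_open"}, "wait": set()}.get
print(choose(["open_door", "wait"], m2_values, m2_predict_altered))  # -> "open_door"
```

The second case is exactly the manipulation-by-information the next paragraphs discuss: Machine 1 can steer Machine 2's behavior by changing either its values or its data.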
At this point, I wish to use lying (providing another machine with false information) as an example of modeling morality within a machine system. I will suggest at the end that this same form of reasoning applies to other moral issues.
Machine 1 can predict (to some degree) how changing the data that Machine 2 is using to evaluate the consequences of various activities will influence the activities that Machine 2 chooses. This, in turn, will influence the states that result from Machine 1's activities. This includes the option of "lying" to Machine 2 as a way of manipulating Machine 2 into choosing an activity that will realize ends that Machine 1 values. Of course, it also includes the option of giving Machine 2 accurate data where that data will cause Machine 2 to perform an activity Machine 1 judges to be useful (that is, one that will contribute to realizing states to which Machine 1 assigns a positive value, or avoid states to which Machine 1 assigns a negative value).
Machine 1 also has reason to see to it that Machine 2 (and other machines) provide it with true information - or information Machine 2 reliably accepts as true.
Using the systems already described, Machine 1 can do this by creating a state of affairs that Machine 2 assigns a strong negative value to (or preventing the realization of a state of affairs Machine 2 assigns a large positive value to) each time Machine 1 catches Machine 2 providing false information. In other words, it punishes Machine 2 for lying.
This would not only deter Machine 2 from providing false information (as a way of avoiding these consequences). It would also create and strengthen in Machine 2 an aversion to providing false information - that is, cause Machine 2 to assign a negative value to the state "I am providing false information".
At this point, we will add another system whereby, if Machine 3 observes Machine 2 experiencing a state to which Machine 2 assigns a high negative value, or failing to realize a state Machine 2 values, as a result of providing false information, Machine 3 will also acquire a stronger aversion to providing false information. Machines 4 through n in the machine community that observe the punishment of Machine 2 acquire a stronger aversion to providing false information in the same way.
At the same time, Machine 1 is acquiring an aversion to providing false information from the activities of Machines 2 through n - who are also punishing (creating states to which others assign a high negative value, or preventing states to which others assign a high positive value) those who provide them with false information.
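The punishment dynamic across the community can be sketched as a single update rule. The penalty size and the rate at which observers learn from another machine's punishment are assumptions for illustration.

```python
# Sketch of the punishment dynamic: when a machine is caught providing
# false information it is punished, its own aversion to lying strengthens,
# and machines that observe the punishment strengthen theirs too
# (at an assumed smaller rate).

def punish_lying(community, liar, penalty=4.0, observe_rate=0.5):
    """community: dict machine_name -> aversion to the state
    'I am providing false information' (more negative = stronger)."""
    for machine in community:
        if machine == liar:
            community[machine] -= penalty                  # direct punishment
        else:
            community[machine] -= observe_rate * penalty   # learned by observation
    return community

aversions = {"M1": 0.0, "M2": 0.0, "M3": 0.0}
punish_lying(aversions, "M2")
print(aversions)  # M2: -4.0; M1 and M3: -2.0 each
```

Repeated application spreads and deepens the aversion community-wide, which is the mechanism the text describes: punishment does not merely deter, it reshapes what the machines value.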
Machine 1 will also see reason to give other machines aversions to activities that would result in Machine 1's destruction or disablement, and to cause other machines to assign a higher value to states in which they are providing Machine 1 with assistance, and the like. At the same time, Machines 2 through n will see that these same activities will contribute to realizing states to which they assign the most and highest value.
It would take just a bit more work to incorporate the other elements of a moral system. For example, we can expect to find Machine 2 offering an excuse to Machine 1 where Machine 2 faces punishment (a state it has reason to avoid), attempting to provide Machine 1 with data to show that punishment is either ineffective or ill advised.
For example, each machine will have reason to promote in others an aversion to punishing those who "could not have done otherwise" - where factors other than the values that a machine attaches to end-states brought about a state of affairs. This is because each machine has a reason to avoid being punished for the realization of a state it could not have prevented. Given a community of machines that assign a negative value to useless punishments, Machine 2 can offer Machine 1 an excuse to avoid punishment.
In this machine world, Machine 1 draws a quick conclusion that Machine 2's desires were responsible for creating a state to which Machine 1 assigns a negative value. To promote an aversion to the activities that created such a state, it threatens to create a state to which Machine 2 assigns a high negative value, or to prevent the realization of a state to which Machine 2 assigns a high positive value. In other words, Machine 1 threatens to punish Machine 2. However, Machine 2 determines that altering Machine 1's data to show that the original state was not a consequence of Machine 2's values can cause Machine 1 not to inflict the punishment. This new information would count as an excuse - specifically, the excuse of "accident": "I did not cause that to happen. You have no reason to blame or punish me."
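The excuse of "accident" can be sketched as a condition on punishment. This is my own minimal rendering of the reasoning above: since punishment works by reshaping values, it is only worth inflicting when the machine's values caused the bad state.

```python
# Sketch of the excuse of "accident": punish only when the bad state
# was a consequence of the other machine's values; otherwise punishment
# cannot reshape anything and the excuse blocks it.

def respond(state_value, caused_by_values):
    """Return Machine 1's response to a bad state attributed to Machine 2."""
    if state_value < 0 and caused_by_values:
        return "punish"
    return "excuse accepted"

print(respond(-5, True))   # Machine 2's values caused it: punish
print(respond(-5, False))  # accident: no reason to blame or punish
```

This is the machine analogue of "ought" implies "can": where the machine's values could not have prevented the state, punishment serves no end the community values.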
In this way, we can go on to build all of the elements of a moral system - praise, condemnation, reward, punishment, culpability, excuses, apologies, obligatory/permissible/prohibited actions, "ought" implies "can" and the rest - into a machine community.
Or, what I would really like to see done, into a computer model of a machine community.
