Today, a member of the studio audience named Colin left a comment identifying himself as "a physicist who programs BDI agents."
This called to mind a project that I have been thinking about for a long time.
I want to have somebody try to program morality into a set of BDI (belief-desire-intention) agents.
How would this work?
Well, if we have a community of BDI agents, then we have a community of entities that have beliefs and desires.
An agent's beliefs take the form of data stored in a database that describe the world around it. Those beliefs may, of course, be false. The agent uses various methods to collect evidence, but it might still end up with 'false beliefs' – data in its database that do not accurately describe the world. Even so, the agent will act as if its beliefs are true.
Its desires take the form of goals – or objectives – that the agent is trying to achieve. Specifically, while the beliefs identify what the machine thinks is true about the world, its desires determine what the machine will try to make true. If the machine’s goal is to keep the room it is in at 25 degrees C, then this is its desire.
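To make this concrete, here is a minimal sketch of such an agent in Python. It is not any particular BDI framework; the names (BDIAgent, sense, deliberate) and the thermostat goal are invented for illustration.

```python
# A minimal sketch of a BDI-style agent: beliefs are data the agent treats as
# true, desires are the states it tries to bring about. Illustrative only.

class BDIAgent:
    def __init__(self, name, desires):
        self.name = name
        self.beliefs = {}              # what the agent takes to be true
        self.desires = dict(desires)   # what it will try to make true

    def sense(self, observation):
        """Store evidence as beliefs. The evidence may be misleading, but the
        agent will still act as if its beliefs are true."""
        self.beliefs.update(observation)

    def deliberate(self):
        """Pick an action that moves the believed world toward a desired state."""
        believed = self.beliefs.get("room_temperature")
        target = self.desires.get("room_temperature")
        if believed is None or target is None:
            return "wait"
        if believed < target:
            return "turn_heater_on"
        if believed > target:
            return "turn_heater_off"
        return "do_nothing"


agent = BDIAgent("A", {"room_temperature": 25.0})
agent.sense({"room_temperature": 22.5})   # possibly a false belief
print(agent.deliberate())                 # -> "turn_heater_on"
```

The point is that the agent deliberates over what it believes, not over what is actually the case.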
For the sake of this project, we will need to have a community of BDI agents. They do not need to all have the same desires (the same values). They simply need to have desires. The morality will come about in part through an interplay of different desires.
These are the things that we get automatically from a community of BDI agents. However, in order to create morality, we need a few more things.
(1) The desires have to be malleable. There has to be a way for environmental factors to alter the agent’s desires. Perhaps, if it sees something red, it will change its goal from keeping the room at 25 degrees to keeping the room at 30 degrees. If its power supply drops at too fast a rate, then it grows averse to activities that consume power. (A sketch of this appears after this list.)
(2) Agents need to be able to 'theorize' about what the desires of other agents are, and how those desires impact their own desires. For example, a machine with a desire to keep the room at 25 degrees will need to know what effect the different behaviors of the other agents will have on the temperature. It will also need to know how to promote desires in others that will keep the temperature at 25 degrees, and to inhibit desires that will tend to move the temperature away from 25 degrees. At the same time, other agents will need to know how to change this agent into one that tries to keep the temperature at 28 degrees or 30 degrees, and how that will affect their own goals.
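Point (1) might look something like the following sketch, in which observations rewrite the agent's goals directly. The particular update rules (a red object shifting the temperature goal, a fast power drain creating an aversion) are just the examples above expressed in code, and the threshold and weights are arbitrary.

```python
# A sketch of "malleable desires": environmental events rewrite the agent's
# goals. The update rules and numbers below are invented examples.

def update_desires(desires, observation):
    """Return a new desire set adjusted by what the agent has just observed."""
    desires = dict(desires)

    # Seeing something red shifts the temperature goal from 25 to 30 degrees.
    if observation.get("sees_red_object"):
        desires["room_temperature"] = 30.0

    # Power draining too quickly creates an aversion (a negative weight)
    # toward power-hungry activities.
    if observation.get("power_drop_rate", 0.0) > 0.5:
        desires["power_hungry_actions"] = desires.get("power_hungry_actions", 0.0) - 1.0

    return desires


goals = {"room_temperature": 25.0}
goals = update_desires(goals, {"sees_red_object": True, "power_drop_rate": 0.8})
print(goals)  # {'room_temperature': 30.0, 'power_hungry_actions': -1.0}
```

Point (2), theorizing about the desires of other agents, shows up again in the bargaining sketch further down.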
Contemporary morality uses a system of rewards and punishments.
Note: I typically use the phrase, "praise, condemnation, reward, and punishment." However, praise and condemnation are simply verbal forms of reward and punishment. We could program the robots to interpret certain signals as praise or condemnation. In other words, "If another robot shows a flashing red light, then the desires you were seeking to fulfill become weaker. If it shows a flashing green light, then the desires you were acting on become stronger." Of course, machines are also programmed to give off a blinking red-light signal if their desires are being thwarted, and a blinking green-light signal if their desires are being fulfilled.
Green lights represent praise, while red lights represent condemnation.
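If each desire carries a numeric strength, the reward-and-punishment layer might be sketched as follows. The signal names and the 0.1 adjustment step are arbitrary choices for illustration, not a claim about how real BDI systems do this.

```python
# A sketch of praise and condemnation as signals: a green light strengthens
# the desires an agent was acting on, a red light weakens them. Agents emit
# green when their own desires are being fulfilled and red when thwarted.

LEARNING_RATE = 0.1

def receive_signal(desire_strengths, active_desires, signal):
    """Adjust the strengths of whatever desires the agent was acting on."""
    delta = LEARNING_RATE if signal == "green" else -LEARNING_RATE
    for d in active_desires:
        desire_strengths[d] = desire_strengths.get(d, 0.0) + delta
    return desire_strengths

def emit_signal(my_desires_fulfilled):
    """Flash green if my desires are being fulfilled, red if they are thwarted."""
    return "green" if my_desires_fulfilled else "red"


strengths = {"keep_room_at_25": 1.0}
strengths = receive_signal(strengths, ["keep_room_at_25"], emit_signal(False))
print(strengths)  # {'keep_room_at_25': 0.9}
```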
Now, we have the makings of a moral system among our BDI agents. We turn them loose and watch how they struggle to promote desires that tend to fulfill other desires, and inhibit desires that tend to thwart other desires. Desires that fulfill other desires trigger green lights, which then strengthen those desires, while desires that thwart other desires trigger red lights, which then inhibit those desires.
Hopefully, the community, over time, will grow to have more and more flashing green lights and fewer and fewer flashing red lights.
This would be a rudimentary moral system – machines using actions as signs of the desires that other agents have, drawing inferences as to what the impact of those desires will be on the fulfillment of their own desires, and modifying those desires through blinking red (blame) and green (praise) lights that inhibit desire-thwarting desires and promote desire-fulfilling desires.
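Putting the pieces together, "turning them loose" might amount to a loop like the one below. The agents, their desires, and the table of which desires tend to fulfill or thwart the community's desires are all invented stand-ins; the only point is that desire strengths drift under green and red signals.

```python
# A toy community: each round one agent acts on its strongest desire, the
# others flash green or red according to a shared (invented) impact table,
# and the actor's desire strengths shift accordingly.

import random

agents = {
    "A": {"keep_room_at_25": 1.0, "hoard_power": 2.0},
    "B": {"keep_room_at_25": 1.0},
    "C": {"keep_room_at_25": 1.0},
}

# Whether acting on a desire tends to fulfill (+1) or thwart (-1) others' desires.
IMPACT = {"keep_room_at_25": +1, "hoard_power": -1}

def step():
    actor = random.choice(list(agents))
    acted_on = max(agents[actor], key=agents[actor].get)  # strongest desire wins
    for other in agents:
        if other == actor:
            continue
        signal = "green" if IMPACT[acted_on] > 0 else "red"
        agents[actor][acted_on] += 0.1 if signal == "green" else -0.1

for _ in range(200):
    step()
print(agents)  # A's desire-thwarting "hoard_power" weakens over time
```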
To this system, we then add another layer of capacity. At this level, agents are capable of studying the behavior of other agents to learn what their desires are. Once agents acquire beliefs about the desires of other agents, they can engage in rudimentary bargaining and threats.
For example, Agent A (with a desire that P) forms the belief that Agent B has a desire that Q. So, Agent A communicates to Agent B, "If you help me to realize P, then I will help you to realize Q." In this way, our agents are programmed to bargain. Of course, bargains create a risk that an agent will perform its part of the bargain only to see the other agent defect. But, agents have reason to flash red on instances of defection and green on instances of completion – to give other agents an aversion to breaking a contract and a desire to live up to its terms.
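A rudimentary bargain, along with the community's response to keeping or breaking it, might be sketched like this. The contract record and the way performance is tracked are invented conveniences, not a standard protocol.

```python
# A sketch of bargaining: Agent A, wanting P, offers to help Agent B realize Q.
# Other agents flash green when both sides perform and red when one side defects.

def propose_bargain(proposer, accepter, p, q):
    """Record the deal: 'if you help me realize p, I will help you realize q.'"""
    return {"proposer": proposer, "accepter": accepter,
            "proposer_wants": p, "accepter_wants": q,
            "proposer_performed": False, "accepter_performed": False}

def judge_bargain(contract):
    """The community's verdict: green on completion, red on defection."""
    if contract["proposer_performed"] and contract["accepter_performed"]:
        return "green"
    if contract["proposer_performed"] or contract["accepter_performed"]:
        return "red"   # one side performed and the other defected
    return None        # nothing has happened yet


deal = propose_bargain("A", "B", p="room stays at 25", q="spare power cell delivered")
deal["accepter_performed"] = True   # B did its part...
print(judge_bargain(deal))          # ...A defected, so the community flashes "red"
```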
Alternatively, Agent A might offer a different deal to Agent B. "If you prevent me from realizing P, then I will do my best to prevent you from realizing Q." In this way, our agents are programmed to make threats – including the threat to punish those who do not perform desire-fulfilling actions. Of course, agents have reason to give others an aversion to making threats, unless those threats in turn tend to promote behavior that fulfills desires. They have reason to flash red at the sign of unjustified threats (unjust laws), and green at the sign of justified threats (just laws).
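The threat case might then reduce to a judgment about what the threat deters. In this sketch, a threat counts as "justified" only if the behavior it discourages would itself thwart other agents' desires; the impact score is a stand-in for whatever the agents would actually compute.

```python
# A sketch of judging threats: flash green at threats that deter desire-thwarting
# behavior (just laws), red at threats that deter desire-fulfilling behavior
# (unjust laws). The numeric impact scores are invented for illustration.

def judge_threat(deterred_behavior_impact):
    """Positive impact means the deterred behavior would have fulfilled desires."""
    return "green" if deterred_behavior_impact < 0 else "red"


print(judge_threat(-2))  # deters a harmful act  -> "green" (justified threat)
print(judge_threat(+1))  # deters a helpful act  -> "red" (unjustified threat)
```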
Next, in addition to the ability to alter the desires of other agents, we must give agents the ability to alter the beliefs of other agents – to engage in communication. Agent A, in this model, will give out certain signals that will cause all who hear them to form a belief that P. Of course, since Agent A is ultimately only concerned with the fulfillment of its own desires, it will discover that one of the ways it might fulfill those desires from time to time is to lie – to communicate false beliefs to other agents so that those agents will act so as to fulfill Agent A’s desires.
Except, Agent A will also realize that it has reason to build in others an aversion to lying and other types of manipulation. So, it will flash red when it detects a lie, and flash green when it detects other agents being truthful – so as to promote an aversion to lying and a desire for honesty.
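Finally, the communication layer, together with the policing of honesty, might be sketched as follows. The dictionary standing in for the actual world, and the later audit step, are invented conveniences.

```python
# A sketch of communication: an assertion installs a belief in the hearer.
# Hearers that later find the belief does not match the world flash red
# (condemnation of lying); a match gets green (praise of honesty).

world = {"door_is_open": False}   # stand-in for how things actually are

def assert_to(hearer_beliefs, proposition, value):
    """Speaking causes the hearer to believe the proposition."""
    hearer_beliefs[proposition] = value

def audit(hearer_beliefs, proposition):
    """Compare the installed belief against the world and signal accordingly."""
    return "green" if hearer_beliefs[proposition] == world[proposition] else "red"


b_beliefs = {}
assert_to(b_beliefs, "door_is_open", True)   # Agent A lies to Agent B
print(audit(b_beliefs, "door_is_open"))      # B later checks -> "red"
```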
These features, then, will give us elementary bargaining, threats, and honest communication.
In this way, we build up a moral system in computer language. There is nothing in this that gives us any reason to doubt our capacity, ultimately, to create machines that have morals.
The next thing you know, robots will have rights.