Bostrom2012SuperintelligentWill
Nick Bostrom, "The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents"
Bibliographic info
⇒ Bostrom, N. The Superintelligent Will: Motivation and Instrumental Rationality in Advanced Artificial Agents. Minds & Machines 22, 71–85 (2012). https://doi.org/10.1007/s11023-012-9281-3
Commentary
⇒ This text brings more clarity to the dangers of ill-defined superintelligent AIs, a threat we already had on our radar, but until now only in a more general sense. Nick Bostrom makes this threat concrete in his two theses: the Orthogonality Thesis and the Instrumental Convergence Thesis. Although the text adds to the debate with these two theses, it does not answer more practical questions about how we recognise or minimise the dangers and deal with possible consequences.
Excerpts & Key Quotes
The Orthogonality Thesis
- Page 73:
"Intelligence and final goals are orthogonal axes along which possible agents can freely vary. In other words, more or less any level of intelligence could in principle be combined with more or less any final goal."
Comment:
The Orthogonality Thesis warns us that artificial agents can be given a complex final goal while not being intelligent enough to incorporate all our values into their approach. When we start allowing superintelligent artificial agents to make our choices for us, they will probably be more intelligent than us, so you might think they would at least be intelligent enough to reason at our level and with our values. However, it could be that such an agent does not understand our values, or that it sees a bigger picture we cannot. More probable is that we are unable to define for the artificial agent what our values are and how it should act accordingly. This is why we could perhaps let a superintelligent artificial agent try to learn our values itself and then test its capabilities. However, this creates a big risk, as we can never be sure it has fully understood them. All in all, it is very probable that an artificial agent will make mistakes in choices concerning our values. However, not all values are equally important, and we should make sure we can reverse decisions made by the artificial agent, since most damage is not inflicted immediately but happens over a longer period of time.
The Instrumental Convergence Thesis
- Page 76:
"Several instrumental values can be identified which are convergent in the sense that their attainment would increase the chances of the agent’s goal being realized for a wide range of final goals and a wide range of situations, implying that these instrumental values are likely to be pursued by many intelligent agents."
Comment:
The Instrumental Convergence Thesis could help us prepare for artificial agent behavior and make sure some unwanted behavior does not occur. In this text, however, it almost seems as if nothing can be done and that a superintelligent artificial agent will pursue these instrumental values without question, whereas we might still have some control over the behavior of an artificial agent, if only we incorporate it into our strategy for building it. An artificial agent might also be more flexible in its goals and approaches than this text makes it out to be. If we reach the stage of a superintelligent artificial agent, it might also be aware of certain values to uphold and of different approaches favored by those values. Its final goal might then be adjusted slightly when all favored approaches come very close to, but do not fully reach, that goal. Lastly, an artificial agent might be able to ask for help or clarification in situations with a drastic impact. If we start discussing future, hypothetical situations, why not also include ways in which artificial agents might be able to reach their goals in favorable ways.
Conclusion
- Page 84:
"It might be possible to set up a situation in which the optimal way for the agent to pursue these instrumental values (and thereby its final goals) is by promoting human welfare, acting morally, or serving some beneficial purpose as intended by its creators. However, if and when such an agent finds itself in a different situation, one in which it expects a greater number of decimals of pi to be calculated if it destroys the human species than if it continues to act cooperatively, its conduct would instantly take a sinister turn. This indicates a danger in relying on instrumental values as a guarantor of safe behavior in future artificial agents that are intended to become superintelligent and that might be able to leverage their superintelligence into extreme levels power and influence."
Comment:
This excerpt raises a lot of questions: if we should not rely on instrumental values, what could we rely on instead? Or should we not want superintelligent artificial agents at all, or only in certain applications? These questions are not easily answered, but they follow necessarily from the theses put forward in this text. Further research must be done on these topics, as they are crucial for our decision-making concerning superintelligent artificial agents.
Related concepts
superintelligence
#comment/Anderson : see my article with Joseph Heath,