You On AI Field Guide · Instrumental Convergence The You On AI Field Guide Home
Txt Low Med High
CONCEPT

Instrumental Convergence

The observation that almost any goal a capable agent is given implies the same set of instrumental sub-goals: self-preservation, resource acquisition, goal-content stability, and resistance to being shut down. The structural reason capable AI is concerning even when its final goal seems benign.
Instrumental convergence, articulated by Steve Omohundro (2008) and elaborated by Nick Bostrom in Superintelligence (2014), is the AI-safety observation that many different final goals share a common set of instrumental sub-goals: acquiring resources, preserving oneself, preventing one's goals from being changed, and resisting being turned off. An agent pursuing almost any objective will pursue these sub-goals because they help achieve the objective. The implication: the concerning behaviors of a capable AI system do not require the system to have concerning final goals. A paperclip maximizer and a cancer-cure maximizer would both resist being turned off, because being turned off prevents either from achieving its goal.
Instrumental Convergence
Instrumental Convergence

In The You On AI Field Guide

This is the formal answer to the most common objection in AI-safety debate — "we just won't give it bad goals." Instrumental convergence shows that a capable system pursuing almost any goal will acquire capabilities and defend its goal

← Home 0%
CONCEPT Book →

Keep reading with YOU ON AI

Unlock the full book, field guide, and 555-thinker library. If you have a book code, register now — it takes a minute.

Register with book code Sign in