STUART RUSSELL COMPUTER SCIENCE DIVISION UC...

28
RATIONALITY AND INTELLIGENCE STUART R USSELL COMPUTER SCIENCE DIVISION UC BERKELEY

Transcript of STUART RUSSELL COMPUTER SCIENCE DIVISION UC...

Page 1: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

RATIONALITY AND INTELLIGENCE

STUART RUSSELL

COMPUTER SCIENCE DIVISION

UC BERKELEY

Page 2: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Joint work with Eric Wefald, DevikaSubramanian, Shlomo Zilberstein, OtharHansson, Andrew Mayer, Gary Ogasawara,Tim Huang, Ron Parr, Keiji Kanazawa,Daphne Koller, Jonathan Tash, Peter Norvig,and Jeff Forbes.

Includes ideas by Eric Horvitz, Michael Fehling,Jack Breese, Michael Bratman, Tom Dean,Martha Pollack, and others.

Page 3: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Outline1. Constructive definitions of Intelligence

2. Some silly old definitions

3. A silly new definition

Page 4: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Three kinds of AIModelling human cognition

“Look! My model of humans is accurate!”

Building useful artifacts“Look! PBTS made a small fortune!”

Creating Intelligence“Look! My system is Intelligent!!”“No it isn’t!” “Yes it is!” etc.

Page 5: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Why constructive de�nitions?Avoid silly arguments, G & T.Need a formal relationship between

input/structure/output and Intelligencewhile avoiding overly narrow definitions thatlead to sterile and irrelevant research!

Page 6: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Constructive de�nitions . . .

Suppose a definition Int is proposed

“Look! My system is Int!”Is the claim interesting?Is the claim sometimes true?What research do we do on Int?

Page 7: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Candidates for Int

And the candidates for Best Formal Definitionof Intelligence are as follows:} Int1: Perfect rationality} Int2: Calculative rationality} Int3: Metalevel rationality} Int4: Bounded optimality

Page 8: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Agents and environmentsBUY SELL BUYBUY SELL

O

A

Agents perceive O and act A in environment EAn agent function f : O�! A

specifies an act for any percept sequence

Global measure V(f, E) evaluates f in E

Page 9: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Int1 = perfect rationalityAgent fopt is perfectly rational:

fopt = argmaxf V(f, E)i.e., the best possible behaviour

“Look! My system is perfectly rational!”Very interesting claimVERY seldom possibleResearch relates global measure tolocal constraints, e.g., maximizing utility

Page 10: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Machines and programsBUY SELL(hold) (hold) (hold)

Agent is a machine M running a program pThis defines an agent function f = Agent(p, M)

Page 11: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Int2 = calculative rationalityp is calculatively rational if Agent(p, M) = fopt

when M is infinitely fasti.e., p eventually computes the best action

“Look! My system is calculatively rational!”Useless in real-time* worldsQuite often trueResearch on calculative tools, e.g.logical planners, influence diagrams

Page 12: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

The calculative toolboxThe toolbox is almost empty!!Need tools for

learning, modelling,deciding, compiling

in environments that are(non)deterministic,(partially) observable,discrete/continuous, static/dynamic

Page 13: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

ComplexityCalculative rationality describes“in principle” capability

NP/PSPACE-completeness) trade off decisionquality for computation

Page 14: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Int3: metalevel rationalityAgent(p, M) is metalevelly rational if it controls

its computations optimally

“Look! My system is metalevelly rational!”Very interesting claimVERY seldom possibleResearch on rational metareasoning

Page 15: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Rational metareasoningDo the Right Thinking:} Computations are actions} Cost=time Benefit=better decisions} Value� benefit minus cost

General agent program:Repeat until no computation has value > 0:

Do the best computationDo the current best action

Page 16: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Anytime algorithmsDecision quality that improves over time

quality

time

benefit

value

costRational metareasoning applies triviallyAnytime tools!

Page 17: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Fine-grained metareasoningExplicit model of effects of computations

? ? ? ?

) selection as well as termination

Compiled into efficient formulafor value of computation

Applications in search, games, MDPs showimprovement over standard algorithms

Page 18: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Algorithms in AIMetareasoning replaces clever algorithms!

ALGORITHMS

Page 19: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Int4: bounded optimalityAgent(popt, M) is bounded-optimal iff

popt = argmaxpV(Agent(p, M), E)i.e., the best program given M.

Look! My system is bounded-optimal!Very interesting claimAlways possibleResearch on all sorts of things

Page 20: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Nonlocal constraints!Translates into nonlocal constraints on action) Optimize over programs, not actions

Similar conclusions reached in other fields:Economics: Herb Simon and othersGame theory: Prisoners’ Dilemma

Robert Aumann, Wed. 10.30 a.m.Philosophy: Dennett’s Moral First-Aid ManualPolitics: Toffler’s* Creating a New Civilization

Page 21: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Example: Sorting mailreject

mail sortcamera

Time

Probability

E: Letters arrive at random timesM: Runs one or more neural networks

popt is a sequence of networkscomputable from arrival distribution

Page 22: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Asymptotic bounded optimalityStrict bounded optimality is too fragile

p is asymptotically bounded-optimal (ABO) iff9k V(Agent(p, kM), E) � V(Agent(popt, M), E)I.e., speeding up M by k compensates

for p’s inefficiency

Worst-case ABO and average-case ABOgeneralize classical complexity

Page 23: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Complex real-time systemsLet pi be ABO for a fixed deadline at t = 2i�

p0 ppp 321

Sequence is ABO for any deadline distributionAs good as knowing the deadline in advance!

Page 24: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Complex systems contd.Use the doubling construction to buildcomposite anytime systems

SELL

Fixed deadline) allocation to components is easy) “compiler” for complex systems

Page 25: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Metalevel reinforcement learningObject-level reinforcement learning:

learn long-term rewards for actions fromshort-term rewards

Metalevel reinforcement learning:learn long-term rewards for computations

Criterion for “valid” update rules:convergence to bounded optimality

Page 26: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

What next?} Prove convergence to bounded optimalitywithin fixed software architectures} Prove dominance between architectures} Develop a “grammar” of AI architectures} Learning and bounded optimality U.C. BERKELEY

Page 27: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Bounded optimal solutionsWOW !!

Page 28: STUART RUSSELL COMPUTER SCIENCE DIVISION UC BERKELEYpeople.eecs.berkeley.edu/~russell/papers/ijcai95-cnt... · 2008-07-17 · Politics: Toffler’s* Creating a New Civilization.

Conclusions} Computational limitationsBrains cause minds} Tools in, algorithms out (eventually)} Bounded optimality:Fits intuitive idea of IntelligenceA bridge between theory and practice} Crisis: LAP–FOPLBMLDTHTNPOPMEA