
Artificial Intelligence

Chapter 1

Outline

♦ What is AI?
♦ A brief history
♦ The state of the art

What is AI?

Systems that think like humans     Systems that think rationally
Systems that act like humans       Systems that act rationally

Acting humanly: The Turing test

Turing (1950) “Computing machinery and intelligence”:
♦ “Can machines think?” → “Can machines behave intelligently?”
♦ Operational test for intelligent behavior: the Imitation Game

[Figure: an interrogator exchanges messages with an unseen human and an unseen AI system and must decide which is which]

♦ Predicted that by 2000, a machine might have a 30% chance of fooling a lay person for 5 minutes
♦ Anticipated all major arguments against AI in following 50 years
♦ Suggested major components of AI: knowledge, reasoning, language understanding, learning

Problem: Turing test is not reproducible, constructive, or amenable to mathematical analysis
Thinking humanly: Cognitive Science

1960s “cognitive revolution”: information-processing psychology replaced prevailing orthodoxy of behaviorism

Requires scientific theories of internal activities of the brain
– What level of abstraction? “Knowledge” or “circuits”?
– How to validate? Requires
  1) Predicting and testing behavior of human subjects (top-down)
  or 2) Direct identification from neurological data (bottom-up)

Both approaches (roughly, Cognitive Science and Cognitive Neuroscience) are now distinct from AI

Both share with AI the following characteristic:
the available theories do not explain (or engender) anything resembling human-level general intelligence

Hence, all three fields share one principal direction!

Thinking rationally: Laws of Thought

Normative (or prescriptive) rather than descriptive

Aristotle: what are correct arguments/thought processes?

Several Greek schools developed various forms of logic:
notation and rules of derivation for thoughts;
may or may not have proceeded to the idea of mechanization

Direct line through mathematics and philosophy to modern AI

Problems:
1) Not all intelligent behavior is mediated by logical deliberation
2) What is the purpose of thinking? What thoughts should I have out of all the thoughts (logical or otherwise) that I could have?

Acting rationally

Rational behavior: doing the right thing

The right thing: that which is expected to maximize goal achievement, given the available information

Doesn’t necessarily involve thinking—e.g., blinking reflex—but thinking should be in the service of rational action

Aristotle (Nicomachean Ethics):
Every art and every inquiry, and similarly every action and pursuit, is thought to aim at some good

Rational agents

An agent is an entity that perceives and acts

This course is about designing rational agents

Abstractly, an agent is a function from percept histories to actions:

    f : P∗ → A

For any given class of environments and tasks, we seek the agent (or class of agents) with the best performance

Caveat: computational limitations make perfect rationality unachievable
→ design best program for given machine resources
AI prehistory

Philosophy       logic, methods of reasoning
                 mind as physical system
                 foundations of learning, language, rationality
Mathematics      formal representation and proof
                 algorithms, computation, (un)decidability, (in)tractability
                 probability
Psychology       adaptation
                 phenomena of perception and motor control
                 experimental techniques (psychophysics, etc.)
Economics        formal theory of rational decisions
Linguistics      knowledge representation
                 grammar
Neuroscience     plastic physical substrate for mental activity
Control theory   homeostatic systems, stability
                 simple optimal agent designs

Potted history of AI

1943     McCulloch & Pitts: Boolean circuit model of brain
1950     Turing’s “Computing Machinery and Intelligence”
1952–69  “Look, Ma, no hands!”
1950s    Early AI programs, including Samuel’s checkers program,
         Newell & Simon’s Logic Theorist, Gelernter’s Geometry Engine
1956     Dartmouth meeting: “Artificial Intelligence” adopted
1965     Robinson’s complete algorithm for logical reasoning
1966–74  AI discovers computational complexity;
         neural network research almost disappears
1969–79  Early development of knowledge-based systems
1980–88  Expert systems industry booms
1988–93  Expert systems industry busts: “AI Winter”
1985–95  Neural networks return to popularity
1988–    Resurgence of probability; general increase in technical depth;
         “Nouvelle AI”: ALife, GAs, soft computing
1995–    Agents, agents, everywhere . . .
2003–    Human-level AI back on the agenda

State of the art

Which of the following can be done at present?
♦ Play a decent game of table tennis
♦ Drive safely along a curving mountain road
♦ Drive safely along Telegraph Avenue
♦ Buy a week’s worth of groceries on the web
♦ Buy a week’s worth of groceries at Berkeley Bowl
♦ Play a decent game of bridge
♦ Discover and prove a new mathematical theorem
♦ Design and execute a research program in molecular biology
♦ Write an intentionally funny story
♦ Give competent legal advice in a specialized area of law
♦ Translate spoken English into spoken Swedish in real time
♦ Converse successfully with another person for an hour
♦ Perform a complex surgical operation
♦ Unload any dishwasher and put everything away

Unintentionally funny stories

One day Joe Bear was hungry. He asked his friend Irving Bird where some honey was. Irving told him there was a beehive in the oak tree. Joe threatened to hit Irving if he didn’t tell him where some honey was. The End.

Henry Squirrel was thirsty. He walked over to the river bank where his good friend Bill Bird was sitting. Henry slipped and fell in the river. Gravity drowned. The End.

Once upon a time there was a dishonest fox and a vain crow. One day the crow was sitting in his tree, holding a piece of cheese in his mouth. He noticed that he was holding the piece of cheese. He became hungry, and swallowed the cheese. The fox walked over to the crow. The End.

Unintentionally funny stories contd.

Joe Bear was hungry. He asked Irving Bird where some honey was. Irving refused to tell him, so Joe offered to bring him a worm if he’d tell him where some honey was. Irving agreed. But Joe didn’t know where any worms were, so he asked Irving, who refused to say. So Joe offered to bring him a worm if he’d tell him where a worm was. Irving agreed. But Joe didn’t know where any worms were, so he asked Irving, who refused to say. So Joe offered to bring him a worm if he’d tell him where a worm was . . .

Intelligent Agents

Chapter 2
Reminders

Assignment 0 (lisp refresher) due 1/28
Lisp/emacs/AIMA tutorial: 11–1 today and Monday, 271 Soda

Outline

♦ Agents and environments
♦ Rationality
♦ PEAS (Performance measure, Environment, Actuators, Sensors)
♦ Environment types
♦ Agent types

Agents and environments

[Figure: an agent receives percepts from the environment through sensors and acts on it through actuators]

Agents include humans, robots, softbots, thermostats, etc.

The agent function maps from percept histories to actions:

    f : P∗ → A

The agent program runs on the physical architecture to produce f

Vacuum-cleaner world

[Figure: two squares A and B; either square may contain dirt, and the agent occupies one of them]

Percepts: location and contents, e.g., [A, Dirty]
Actions: Left, Right, Suck, NoOp
A vacuum-cleaner agent

Percept sequence              Action
[A, Clean]                    Right
[A, Dirty]                    Suck
[B, Clean]                    Left
[B, Dirty]                    Suck
[A, Clean], [A, Clean]        Right
[A, Clean], [A, Dirty]        Suck
. . .                         . . .

function Reflex-Vacuum-Agent( [location,status]) returns an action
    if status = Dirty then return Suck
    else if location = A then return Right
    else if location = B then return Left

What is the right function?
Can it be implemented in a small agent program?

Rationality

Fixed performance measure evaluates the environment sequence
– one point per square cleaned up in time T?
– one point per clean square per time step, minus one per move?
– penalize for > k dirty squares?

A rational agent chooses whichever action maximizes the expected value of the performance measure given the percept sequence to date

Rational ≠ omniscient
– percepts may not supply all relevant information
Rational ≠ clairvoyant
– action outcomes may not be as expected
Hence, rational ≠ successful

Rational ⇒ exploration, learning, autonomy
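The table above is literally the agent function: a finite lookup from percept sequences to actions. A minimal Common Lisp sketch of that view (the names and representation here are illustrative, not the AIMA code base):

(defparameter *vacuum-table*
  '((((A Clean))           . Right)
    (((A Dirty))           . Suck)
    (((B Clean))           . Left)
    (((B Dirty))           . Suck)
    (((A Clean) (A Clean)) . Right)
    (((A Clean) (A Dirty)) . Suck)))

(defun table-driven-agent (percept-history)
  ;; Look up the action for the entire percept sequence seen so far.
  (cdr (assoc percept-history *vacuum-table* :test #'equal)))

;; (table-driven-agent '((A Clean) (A Dirty)))  =>  SUCK

The table grows without bound as the history lengthens, which is why the four-line Reflex-Vacuum-Agent program above is the interesting answer to “can it be implemented in a small agent program?”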

PEAS

To design a rational agent, we must specify the task environment

Consider, e.g., the task of designing an automated taxi:

Performance measure?? safety, destination, profits, legality, comfort, . . .
Environment?? US streets/freeways, traffic, pedestrians, weather, . . .
Actuators?? steering, accelerator, brake, horn, speaker/display, . . .
Sensors?? video, accelerometers, gauges, engine sensors, keyboard, GPS, . . .

Internet shopping agent

Performance measure?? price, quality, appropriateness, efficiency
Environment?? current and future WWW sites, vendors, shippers
Actuators?? display to user, follow URL, fill in form
Sensors?? HTML pages (text, graphics, scripts)

Environment types

                  Solitaire   Backgammon   Internet shopping       Taxi
Observable??      Yes         Yes          No                      No
Deterministic??   Yes         No           Partly                  No
Episodic??        No          No           No                      No
Static??          Yes         Semi         Semi                    No
Discrete??        Yes         Yes          Yes                     No
Single-agent??    Yes         No           Yes (except auctions)   No

The environment type largely determines the agent design

The real world is (of course) partially observable, stochastic, sequential, dynamic, continuous, multi-agent

Agent types

Four basic types in order of increasing generality:
– simple reflex agents
– reflex agents with state
– goal-based agents
– utility-based agents

All these can be turned into learning agents

Simple reflex agents

[Figure: sensors feed “What the world is like now”; condition–action rules select “What action I should do now”; actuators act on the environment]

Example

function Reflex-Vacuum-Agent( [location,status]) returns an action
    if status = Dirty then return Suck
    else if location = A then return Right
    else if location = B then return Left

(setq joe (make-agent :name 'joe :body (make-agent-body)
                      :program (make-reflex-vacuum-agent-program)))

(defun make-reflex-vacuum-agent-program ()
  #'(lambda (percept)
      (let ((location (first percept)) (status (second percept)))
        (cond ((eq status 'dirty) 'Suck)
              ((eq location 'A) 'Right)
              ((eq location 'B) 'Left)))))
Reflex agents with state

[Figure: adds internal state to the simple reflex architecture: “How the world evolves” and “What my actions do” maintain “What the world is like now”]

Example

function Reflex-Vacuum-Agent( [location,status]) returns an action
    static: last_A, last_B, numbers, initially ∞
    if status = Dirty then . . .

(defun make-reflex-vacuum-agent-with-state-program ()
  (let ((last-A infinity) (last-B infinity))
    #'(lambda (percept)
        (let ((location (first percept)) (status (second percept)))
          (incf last-A) (incf last-B)
          (cond ((eq status 'dirty)
                 (if (eq location 'A) (setq last-A 0) (setq last-B 0))
                 'Suck)
                ((eq location 'A) (if (> last-B 3) 'Right 'NoOp))
                ((eq location 'B) (if (> last-A 3) 'Left 'NoOp)))))))

Goal-based agents

[Figure: the model-based architecture extended with “What it will be like if I do action A” and explicit goals to select “What action I should do now”]

Utility-based agents

[Figure: further extended with a utility function, “How happy I will be in such a state”, to choose among actions]
Learning agents

[Figure: a critic compares sensor feedback against a performance standard; the learning element uses this feedback to change the performance element, and a problem generator suggests exploratory actions]

Summary

Agents interact with environments through actuators and sensors

The agent function describes what the agent does in all circumstances

The performance measure evaluates the environment sequence

A perfectly rational agent maximizes expected performance

Agent programs implement (some) agent functions

PEAS descriptions define task environments

Environments are categorized along several dimensions:
    observable? deterministic? episodic? static? discrete? single-agent?

Several basic agent architectures exist:
    reflex, reflex with state, goal-based, utility-based

Problem solving and search

Chapter 3

Reminders

Assignment 0 due 5pm today
Assignment 1 posted, due 2/9
Section 105 will move to 9–10am starting next week
Outline

♦ Problem-solving agents
♦ Problem types
♦ Problem formulation
♦ Example problems
♦ Basic search algorithms

Problem-solving agents

Restricted form of general agent:

function Simple-Problem-Solving-Agent( percept) returns an action
    static: seq, an action sequence, initially empty
            state, some description of the current world state
            goal, a goal, initially null
            problem, a problem formulation
    state ← Update-State(state, percept)
    if seq is empty then
        goal ← Formulate-Goal(state)
        problem ← Formulate-Problem(state, goal)
        seq ← Search( problem)
    action ← Recommendation(seq, state)
    seq ← Remainder(seq, state)
    return action

Note: this is offline problem solving; solution executed “eyes closed.”
Online problem solving involves acting without complete knowledge.

Example: Romania

On holiday in Romania; currently in Arad.
Flight leaves tomorrow from Bucharest

Formulate goal:
    be in Bucharest

Formulate problem:
    states: various cities
    actions: drive between cities

Find solution:
    sequence of cities, e.g., Arad, Sibiu, Fagaras, Bucharest

[Figure: road map of Romania with driving distances, e.g., Arad–Zerind 75, Arad–Sibiu 140, Arad–Timisoara 118, Sibiu–Fagaras 99, Sibiu–Rimnicu Vilcea 80, Fagaras–Bucharest 211]
Problem types

Deterministic, fully observable =⇒ single-state problem
    Agent knows exactly which state it will be in; solution is a sequence

Non-observable =⇒ conformant problem
    Agent may have no idea where it is; solution (if any) is a sequence

Nondeterministic and/or partially observable =⇒ contingency problem
    percepts provide new information about current state
    solution is a contingent plan or a policy
    often interleave search, execution

Unknown state space =⇒ exploration problem (“online”)

Example: vacuum world

[Figure: the eight states of the two-square vacuum world, numbered 1–8]

Single-state, start in #5. Solution??
    [Right, Suck]

Conformant, start in {1, 2, 3, 4, 5, 6, 7, 8}
e.g., Right goes to {2, 4, 6, 8}. Solution??
    [Right, Suck, Left, Suck]

Contingency, start in #5
Murphy’s Law: Suck can dirty a clean carpet
Local sensing: dirt, location only. Solution??
    [Right, if dirt then Suck]

Single-state problem formulation

A problem is defined by four items:

initial state, e.g., “at Arad”

successor function S(x) = set of action–state pairs
    e.g., S(Arad) = {⟨Arad → Zerind, Zerind⟩, . . .}

goal test, can be
    explicit, e.g., x = “at Bucharest”
    implicit, e.g., NoDirt(x)

path cost (additive)
    e.g., sum of distances, number of actions executed, etc.
    c(x, a, y) is the step cost, assumed to be ≥ 0

A solution is a sequence of actions leading from the initial state to a goal state
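The four items above map directly onto a data structure. A sketch in Common Lisp, using a fragment of the Romania map with the step costs from the figure (the struct and names are illustrative, not the AIMA code base):

(defstruct problem
  initial-state   ; e.g., 'Arad
  successor-fn    ; state -> list of (action . state) pairs
  goal-test       ; state -> boolean
  step-cost)      ; (state action state) -> number

(defparameter *roads*
  ;; Fragment of the map: (city (neighbor . distance) ...)
  '((Arad (Sibiu . 140) (Timisoara . 118) (Zerind . 75))
    (Sibiu (Arad . 140) (Fagaras . 99) (Rimnicu-Vilcea . 80) (Oradea . 151))
    (Fagaras (Sibiu . 99) (Bucharest . 211))))

(defun romania-successors (city)
  ;; Each action is "drive to a neighboring city"; the result is that city.
  (mapcar (lambda (pair) (cons (car pair) (car pair)))
          (rest (assoc city *roads*))))

(defparameter *romania*
  (make-problem
   :initial-state 'Arad
   :successor-fn #'romania-successors
   :goal-test (lambda (city) (eq city 'Bucharest))
   :step-cost (lambda (from action to)
                (declare (ignore action))
                (cdr (assoc to (rest (assoc from *roads*)))))))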

Selecting a state space

Real world is absurdly complex
    ⇒ state space must be abstracted for problem solving

(Abstract) state = set of real states

(Abstract) action = complex combination of real actions
    e.g., “Arad → Zerind” represents a complex set of possible routes, detours, rest stops, etc.
    For guaranteed realizability, any real state “in Arad” must get to some real state “in Zerind”

(Abstract) solution =
    set of real paths that are solutions in the real world

Each abstract action should be “easier” than the original problem!

Example: vacuum world state space graph

[Figure: the eight vacuum-world states connected by Left (L), Right (R), and Suck (S) transitions]

states??: integer dirt and robot locations (ignore dirt amounts etc.)
actions??: Left, Right, Suck, NoOp
goal test??: no dirt
path cost??: 1 per action (0 for NoOp)
Example: The 8-puzzle

[Figure: start state 7 2 4 / 5 _ 6 / 8 3 1 and goal state 1 2 3 / 4 5 6 / 7 8 _]

states??: integer locations of tiles (ignore intermediate positions)
actions??: move blank left, right, up, down (ignore unjamming etc.)
goal test??: = goal state (given)
path cost??: 1 per move

[Note: optimal solution of n-Puzzle family is NP-hard]

Example: robotic assembly

[Figure: a jointed robot arm assembling parts]

states??: real-valued coordinates of robot joint angles; parts of the object to be assembled
actions??: continuous motions of robot joints
goal test??: complete assembly with no robot included!
path cost??: time to execute

Tree search algorithms

Basic idea:
    offline, simulated exploration of state space by generating successors of already-explored states (a.k.a. expanding states)

function Tree-Search( problem, strategy) returns a solution, or failure
    initialize the search tree using the initial state of problem
    loop do
        if there are no candidates for expansion then return failure
        choose a leaf node for expansion according to strategy
        if the node contains a goal state then return the corresponding solution
        else expand the node and add the resulting nodes to the search tree
    end

Tree search example

[Figure: successive expansions of the Romania search tree: Arad; then Sibiu, Timisoara, Zerind; then Sibiu’s successors Arad, Fagaras, Oradea, Rimnicu Vilcea]

Implementation: states vs. nodes

A state is a (representation of) a physical configuration
A node is a data structure constituting part of a search tree
    includes parent, children, depth, path cost g(x)
States do not have parents, children, depth, or path cost!

[Figure: a node with parent, action, depth = 6, and path cost g = 6 fields pointing to a state]

The Expand function creates new nodes, filling in the various fields and using the SuccessorFn of the problem to create the corresponding states.

Implementation: general tree search

function Tree-Search( problem, fringe) returns a solution, or failure
    fringe ← Insert(Make-Node(Initial-State[problem]), fringe)
    loop do
        if fringe is empty then return failure
        node ← Remove-Front(fringe)
        if Goal-Test(problem, State(node)) then return node
        fringe ← InsertAll(Expand(node, problem), fringe)

function Expand( node, problem) returns a set of nodes
    successors ← the empty set
    for each action, result in Successor-Fn(problem, State[node]) do
        s ← a new Node
        Parent-Node[s] ← node; Action[s] ← action; State[s] ← result
        Path-Cost[s] ← Path-Cost[node] + Step-Cost(node, action, s)
        Depth[s] ← Depth[node] + 1
        add s to successors
    return successors
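A compact Common Lisp rendering of Tree-Search and Expand, with the fringe as a plain list and the queuing discipline passed in as a function (an illustrative sketch built on the problem struct above, not the AIMA code base):

(defstruct node state parent action (path-cost 0) (depth 0))

(defun expand (node problem)
  ;; Build a successor node for each (action . result) pair.
  (mapcar
   (lambda (pair)
     (let ((action (car pair)) (result (cdr pair)))
       (make-node :state result :parent node :action action
                  :depth (1+ (node-depth node))
                  :path-cost (+ (node-path-cost node)
                                (funcall (problem-step-cost problem)
                                         (node-state node) action result)))))
   (funcall (problem-successor-fn problem) (node-state node))))

(defun tree-search (problem queuing-fn)
  ;; QUEUING-FN merges new nodes into the fringe: (fringe new-nodes) -> fringe.
  (let ((fringe (list (make-node :state (problem-initial-state problem)))))
    (loop
      (when (null fringe) (return nil))          ; failure
      (let ((node (pop fringe)))
        (when (funcall (problem-goal-test problem) (node-state node))
          (return node))                         ; solution node
        (setf fringe (funcall queuing-fn fringe (expand node problem)))))))

;; Breadth-first = FIFO fringe; finds Bucharest on the fragment above:
;;   (tree-search *romania* (lambda (old new) (append old new)))
;; Depth-first = LIFO fringe; as discussed below, it can loop forever
;; on state spaces with cycles such as Arad <-> Sibiu:
;;   (tree-search *romania* (lambda (old new) (append new old)))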
Search strategies

A strategy is defined by picking the order of node expansion

Strategies are evaluated along the following dimensions:
    completeness—does it always find a solution if one exists?
    time complexity—number of nodes generated/expanded
    space complexity—maximum number of nodes in memory
    optimality—does it always find a least-cost solution?

Time and space complexity are measured in terms of
    b—maximum branching factor of the search tree
    d—depth of the least-cost solution
    m—maximum depth of the state space (may be ∞)

Uninformed search strategies

Uninformed strategies use only the information available in the problem definition

    Breadth-first search
    Uniform-cost search
    Depth-first search
    Depth-limited search
    Iterative deepening search

Breadth-first search

Expand shallowest unexpanded node

Implementation:
    fringe is a FIFO queue, i.e., new successors go at end

[Figure: breadth-first expansion of a binary tree: A; then B, C; then D, E, F, G]

Properties of breadth-first search

Complete?? Yes (if b is finite)

Time?? 1 + b + b^2 + b^3 + . . . + b^d + b(b^d − 1) = O(b^(d+1)), i.e., exp. in d

Space?? O(b^(d+1)) (keeps every node in memory)

Optimal?? Yes (if cost = 1 per step); not optimal in general

Space is the big problem; can easily generate nodes at 100MB/sec, so 24hrs = 8640GB.

Uniform-cost search

Expand least-cost unexpanded node

Implementation:
    fringe = queue ordered by path cost, lowest first

Equivalent to breadth-first if step costs all equal

Complete?? Yes, if step cost ≥ ε

Time?? # of nodes with g ≤ cost of optimal solution, O(b^⌈C∗/ε⌉), where C∗ is the cost of the optimal solution

Space?? # of nodes with g ≤ cost of optimal solution, O(b^⌈C∗/ε⌉)

Optimal?? Yes—nodes expanded in increasing order of g(n)
Depth-first search

Expand deepest unexpanded node

Implementation:
    fringe = LIFO queue, i.e., put successors at front

[Figure: depth-first expansion of a binary tree rooted at A with leaves H–O: the search dives A, B, D, H, then backtracks through I, E, J, K, and so on across the tree]
Properties of depth-first search

Complete?? No: fails in infinite-depth spaces, spaces with loops
    Modify to avoid repeated states along path
    ⇒ complete in finite spaces

Time?? O(b^m): terrible if m is much larger than d
    but if solutions are dense, may be much faster than breadth-first

Space?? O(bm), i.e., linear space!

Optimal?? No

Depth-limited search

= depth-first search with depth limit l,
i.e., nodes at depth l have no successors

Recursive implementation:

function Depth-Limited-Search( problem, limit) returns soln/fail/cutoff
    Recursive-DLS(Make-Node(Initial-State[problem]), problem, limit)

function Recursive-DLS(node, problem, limit) returns soln/fail/cutoff
    cutoff-occurred? ← false
    if Goal-Test(problem, State[node]) then return node
    else if Depth[node] = limit then return cutoff
    else for each successor in Expand(node, problem) do
        result ← Recursive-DLS(successor, problem, limit)
        if result = cutoff then cutoff-occurred? ← true
        else if result ≠ failure then return result
    if cutoff-occurred? then return cutoff else return failure

Iterative deepening search

function Iterative-Deepening-Search( problem) returns a solution
    inputs: problem, a problem
    for depth ← 0 to ∞ do
        result ← Depth-Limited-Search( problem, depth)
        if result ≠ cutoff then return result
    end

[Figure: iterative deepening re-runs depth-limited search with limits l = 0, 1, 2, 3, re-expanding the upper levels of the tree each time]

Properties of iterative deepening search

Complete?? Yes

Time?? (d + 1)b^0 + db^1 + (d − 1)b^2 + . . . + b^d = O(b^d)

Space?? O(bd)

Optimal?? Yes, if step cost = 1
    Can be modified to explore uniform-cost tree

Numerical comparison for b = 10 and d = 5, solution at far right leaf:

    N(IDS) = 50 + 400 + 3,000 + 20,000 + 100,000 = 123,450
    N(BFS) = 10 + 100 + 1,000 + 10,000 + 100,000 + 999,990 = 1,111,100

IDS does better because other nodes at depth d are not expanded

BFS can be modified to apply goal test when a node is generated
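The same machinery gives depth-limited and iterative deepening search in a few lines. A sketch reusing the node/expand definitions above, with :cutoff as our marker for the cutoff result (illustrative, not the AIMA code base):

(defun depth-limited-search (problem limit
                             &optional (node (make-node :state (problem-initial-state problem))))
  ;; Returns a solution node, NIL for failure, or :CUTOFF.
  (cond ((funcall (problem-goal-test problem) (node-state node)) node)
        ((= (node-depth node) limit) :cutoff)
        (t (let ((cutoff-occurred nil))
             (dolist (succ (expand node problem)
                           (if cutoff-occurred :cutoff nil))
               (let ((result (depth-limited-search problem limit succ)))
                 (cond ((eq result :cutoff) (setf cutoff-occurred t))
                       (result (return result)))))))))

(defun iterative-deepening-search (problem)
  (loop for depth from 0
        for result = (depth-limited-search problem depth)
        unless (eq result :cutoff) return result))

;; (iterative-deepening-search *romania*)
;; => the Bucharest node at depth 3 (Arad, Sibiu, Fagaras, Bucharest)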
Summary of algorithms

Criterion    Breadth-First   Uniform-Cost   Depth-First   Depth-Limited   Iterative Deepening
Complete?    Yes∗            Yes∗           No            Yes, if l ≥ d   Yes
Time         b^(d+1)         b^⌈C∗/ε⌉       b^m           b^l             b^d
Space        b^(d+1)         b^⌈C∗/ε⌉       bm            bl              bd
Optimal?     Yes∗            Yes            No            No              Yes∗

Repeated states

Failure to detect repeated states can turn a linear problem into an exponential one!

[Figure: a linear chain of states A–B–C whose tree search expands an exponential number of nodes]

Graph search

function Graph-Search( problem, fringe) returns a solution, or failure
    closed ← an empty set
    fringe ← Insert(Make-Node(Initial-State[problem]), fringe)
    loop do
        if fringe is empty then return failure
        node ← Remove-Front(fringe)
        if Goal-Test(problem, State[node]) then return node
        if State[node] is not in closed then
            add State[node] to closed
            fringe ← InsertAll(Expand(node, problem), fringe)
    end

Summary

Problem formulation usually requires abstracting away real-world details to define a state space that can feasibly be explored

Variety of uninformed search strategies

Iterative deepening search uses only linear space and not much more time than other uninformed algorithms

Graph search can be exponentially more efficient than tree search
Informed search algorithms

Chapter 4, Sections 1–2

Outline

♦ Best-first search
♦ A∗ search
♦ Heuristics

Review: Tree search

function Tree-Search( problem, fringe) returns a solution, or failure
    fringe ← Insert(Make-Node(Initial-State[problem]), fringe)
    loop do
        if fringe is empty then return failure
        node ← Remove-Front(fringe)
        if Goal-Test[problem] applied to State(node) succeeds return node
        fringe ← InsertAll(Expand(node, problem), fringe)

A strategy is defined by picking the order of node expansion

Best-first search

Idea: use an evaluation function for each node
    – estimate of “desirability”
⇒ Expand most desirable unexpanded node

Implementation:
    fringe is a queue sorted in decreasing order of desirability

Special cases:
    greedy search
    A∗ search

Romania with step costs in km

[Figure: Romania road map with step costs in km, plus straight-line distances to Bucharest: Arad 366, Bucharest 0, Craiova 160, Dobreta 242, Eforie 161, Fagaras 176, Giurgiu 77, Hirsova 151, Iasi 226, Lugoj 244, Mehadia 241, Neamt 234, Oradea 380, Pitesti 98, Rimnicu Vilcea 193, Sibiu 253, Timisoara 329, Urziceni 80, Vaslui 199, Zerind 374]

Greedy search

Evaluation function h(n) (heuristic)
    = estimate of cost from n to the closest goal

E.g., hSLD(n) = straight-line distance from n to Bucharest

Greedy search expands the node that appears to be closest to goal

Greedy search example

[Figure: greedy expansion by hSLD: Arad (366) expands to Sibiu (253), Timisoara (329), Zerind (374); Sibiu expands to Arad (366), Fagaras (176), Oradea (380), Rimnicu Vilcea (193); Fagaras expands to Sibiu (253), Bucharest (0)]

Properties of greedy search

Complete?? No–can get stuck in loops, e.g., with Oradea as goal,
    Iasi → Neamt → Iasi → Neamt →
    Complete in finite space with repeated-state checking

Time?? O(b^m), but a good heuristic can give dramatic improvement

Space?? O(b^m)—keeps all nodes in memory

Optimal?? No

A∗ search

Idea: avoid expanding paths that are already expensive

Evaluation function f(n) = g(n) + h(n)
    g(n) = cost so far to reach n
    h(n) = estimated cost to goal from n
    f(n) = estimated total cost of path through n to goal

A∗ search uses an admissible heuristic
    i.e., h(n) ≤ h∗(n) where h∗(n) is the true cost from n.
    (Also require h(n) ≥ 0, so h(G) = 0 for any goal G.)

E.g., hSLD(n) never overestimates the actual road distance

Theorem: A∗ search is optimal

A∗ search example

[Figure: A∗ expansion by f = g + h: Arad (366=0+366) expands to Sibiu (393=140+253), Timisoara (447=118+329), Zerind (449=75+374); Sibiu expands to Arad (646=280+366), Fagaras (415=239+176), Oradea (671=291+380), Rimnicu Vilcea (413=220+193); Rimnicu Vilcea expands to Craiova (526=366+160), Pitesti (417=317+100), Sibiu (553=300+253); Fagaras expands to Sibiu (591=338+253), Bucharest (450=450+0); Pitesti expands to Bucharest (418=418+0), Craiova (615=455+160), Rimnicu Vilcea (607=414+193); the goal Bucharest is returned with f = 418]

Optimality of A∗ (standard proof )

Suppose some suboptimal goal G2 has been generated and is in the queue.
Let n be an unexpanded node on a shortest path to an optimal goal G1.

    f(G2) = g(G2)   since h(G2) = 0
          > g(G1)   since G2 is suboptimal
          ≥ f(n)    since h is admissible

Since f(G2) > f(n), A∗ will never select G2 for expansion

Optimality of A∗ (more useful)

Lemma: A∗ expands nodes in order of increasing f value∗

Gradually adds “f-contours” of nodes (cf. breadth-first adds layers)
Contour i has all nodes with f = fi, where fi < fi+1

[Figure: f-contours at 380, 400, and 420 spreading across the Romania map from Arad]
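A∗ is just best-first search on f(n) = g(n) + h(n). A sketch using the node/expand machinery above, with a sorted-list fringe standing in for a priority queue and the slide’s straight-line distances restricted to the map fragment defined earlier (illustrative, not the AIMA code base):

(defparameter *h-sld*
  ;; Straight-line distances to Bucharest, from the slide (fragment only).
  '((Arad . 366) (Sibiu . 253) (Timisoara . 329) (Zerind . 374)
    (Fagaras . 176) (Oradea . 380) (Rimnicu-Vilcea . 193) (Bucharest . 0)))

(defun a-star-search (problem h)
  ;; H maps a state to the estimated cost to the nearest goal.
  (flet ((f (node) (+ (node-path-cost node) (funcall h (node-state node)))))
    (let ((fringe (list (make-node :state (problem-initial-state problem)))))
      (loop
        (when (null fringe) (return nil))
        (let ((node (pop fringe)))
          (when (funcall (problem-goal-test problem) (node-state node))
            (return node))
          (setf fringe (sort (append (expand node problem) fringe)
                             #'< :key #'f)))))))

;; (a-star-search *romania* (lambda (city) (cdr (assoc city *h-sld*))))
;; On the fragment this returns Bucharest via Sibiu and Fagaras (g = 450);
;; on the full map, A* instead finds the cheaper route through Rimnicu
;; Vilcea and Pitesti with f = 418, as in the example above.

Note that the goal test is applied when a node is popped, not when it is generated — exactly the detail the optimality proof above relies on.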
Properties of A∗

Complete?? Yes, unless there are infinitely many nodes with f ≤ f(G)

Time?? Exponential in [relative error in h × length of soln.]

Space?? Keeps all nodes in memory

Optimal?? Yes—cannot expand fi+1 until fi is finished

A∗ expands all nodes with f(n) < C∗
A∗ expands some nodes with f(n) = C∗
A∗ expands no nodes with f(n) > C∗

Proof of lemma: Consistency

A heuristic is consistent if

    h(n) ≤ c(n, a, n′) + h(n′)

If h is consistent, we have

    f(n′) = g(n′) + h(n′)
          = g(n) + c(n, a, n′) + h(n′)
          ≥ g(n) + h(n)
          = f(n)

I.e., f(n) is nondecreasing along any path.

Admissible heuristics

E.g., for the 8-puzzle:

h1(n) = number of misplaced tiles
h2(n) = total Manhattan distance
    (i.e., no. of squares from desired location of each tile)

[Figure: start state 7 2 4 / 5 _ 6 / 8 3 1 and goal state 1 2 3 / 4 5 6 / 7 8 _]

h1(S) =?? 6
h2(S) =?? 4+0+3+3+1+0+2+1 = 14

Dominance

If h2(n) ≥ h1(n) for all n (both admissible) then h2 dominates h1 and is better for search

Typical search costs:
    d = 14: IDS = 3,473,941 nodes; A∗(h1) = 539 nodes; A∗(h2) = 113 nodes
    d = 24: IDS ≈ 54,000,000,000 nodes; A∗(h1) = 39,135 nodes; A∗(h2) = 1,641 nodes

Given any admissible heuristics ha, hb,
    h(n) = max(ha(n), hb(n))
is also admissible and dominates ha, hb

Relaxed problems

Admissible heuristics can be derived from the exact solution cost of a relaxed version of the problem

If the rules of the 8-puzzle are relaxed so that a tile can move anywhere, then h1(n) gives the shortest solution

If the rules are relaxed so that a tile can move to any adjacent square, then h2(n) gives the shortest solution

Key point: the optimal solution cost of a relaxed problem is no greater than the optimal solution cost of the real problem

Relaxed problems contd.

Well-known example: travelling salesperson problem (TSP)
Find the shortest tour visiting all cities exactly once

[Figure: a set of cities connected by a minimum spanning tree]

Minimum spanning tree can be computed in O(n^2) and is a lower bound on the shortest (open) tour

Summary

Heuristic functions estimate costs of shortest paths

Good heuristics can dramatically reduce search cost

Greedy best-first search expands lowest h
    – incomplete and not always optimal

A∗ search expands lowest g + h
    – complete and optimal
    – also optimally efficient (up to tie-breaks, for forward search)

Admissible heuristics can be derived from exact solution of relaxed problems
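The two 8-puzzle heuristics from the Admissible heuristics slide are tiny functions. A sketch in which a state is a row-major list of nine entries with 0 for the blank (our representation choice, not the AIMA code base):

(defparameter *8-puzzle-goal* '(1 2 3 4 5 6 7 8 0))

(defun h1 (state)
  ;; Number of misplaced tiles (the blank doesn't count).
  (loop for tile in state
        for goal-tile in *8-puzzle-goal*
        count (and (/= tile 0) (/= tile goal-tile))))

(defun h2 (state)
  ;; Total Manhattan distance of each tile from its goal square.
  (loop for tile in state
        for i from 0
        unless (= tile 0)
          sum (let ((j (position tile *8-puzzle-goal*)))
                (+ (abs (- (floor i 3) (floor j 3)))    ; row distance
                   (abs (- (mod i 3) (mod j 3)))))))    ; column distance

;; The start state from the slide:
;; (h1 '(7 2 4 5 0 6 8 3 1))  =>  6
;; (h2 '(7 2 4 5 0 6 8 3 1))  =>  14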

Local search algorithms

Chapter 4, Sections 3–4

Outline

♦ Hill-climbing
♦ Simulated annealing
♦ Genetic algorithms (briefly)
♦ Local search in continuous spaces (very briefly)


Iterative improvement algorithms

In many optimization problems, path is irrelevant; the goal state itself is the solution

Then state space = set of “complete” configurations;
    find optimal configuration, e.g., TSP
    or, find configuration satisfying constraints, e.g., timetable

In such cases, can use iterative improvement algorithms;
keep a single “current” state, try to improve it

Constant space, suitable for online as well as offline search

Example: Travelling Salesperson Problem

Start with any complete tour, perform pairwise exchanges

[Figure: a tour before and after a pairwise exchange of two edges]

Variants of this approach get within 1% of optimal very quickly with thousands of cities

Example: n-queens

Put n queens on an n × n board with no two queens on the same row, column, or diagonal

Move a queen to reduce number of conflicts

[Figure: three boards with h = 5, h = 2, and h = 0 conflicts]

Almost always solves n-queens problems almost instantaneously for very large n, e.g., n = 1 million

Hill-climbing (or gradient ascent/descent)

“Like climbing Everest in thick fog with amnesia”

function Hill-Climbing( problem) returns a state that is a local maximum
    inputs: problem, a problem
    local variables: current, a node
                     neighbor, a node
    current ← Make-Node(Initial-State[problem])
    loop do
        neighbor ← a highest-valued successor of current
        if Value[neighbor] ≤ Value[current] then return State[current]
        current ← neighbor
    end
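The pseudocode transcribes almost directly into Common Lisp. A sketch abstracted over the successor and objective functions (illustrative, not the AIMA code base):

(defun hill-climbing (state successors value)
  ;; Climb until no successor improves on the current state.
  (loop
    (let* ((succs (funcall successors state))
           (best (when succs
                   (reduce (lambda (a b)
                             (if (>= (funcall value a) (funcall value b)) a b))
                           succs))))
      (when (or (null best)
                (<= (funcall value best) (funcall value state)))
        (return state))
      (setf state best))))

;; A toy 1-D landscape: neighbors are x±1, objective -(x-3)^2.
;; (hill-climbing 0
;;                (lambda (x) (list (1- x) (1+ x)))
;;                (lambda (x) (- (expt (- x 3) 2))))   =>  3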


Hill-climbing contd.

Useful to consider state space landscape

[Figure: objective function over the state space, showing a global maximum, a local maximum, a “flat” local maximum, and a shoulder]

Random-restart hill climbing overcomes local maxima—trivially complete

Random sideways moves escape from shoulders, loop on flat maxima

Simulated annealing

Idea: escape local maxima by allowing some “bad” moves but gradually decrease their size and frequency

function Simulated-Annealing( problem, schedule) returns a solution state
    inputs: problem, a problem
            schedule, a mapping from time to “temperature”
    local variables: current, a node
                     next, a node
                     T, a “temperature” controlling prob. of downward steps
    current ← Make-Node(Initial-State[problem])
    for t ← 1 to ∞ do
        T ← schedule[t]
        if T = 0 then return current
        next ← a randomly selected successor of current
        ∆E ← Value[next] – Value[current]
        if ∆E > 0 then current ← next
        else current ← next only with probability e^(∆E/T)
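The annealing loop, again as a sketch over user-supplied successors, objective, and cooling schedule (the schedule in the comment is just one plausible choice; assumes every state has at least one successor):

(defun simulated-annealing (state successors value schedule)
  (loop for time from 1
        for temp = (funcall schedule time)
        do (when (<= temp 0) (return state))
           (let* ((succs (funcall successors state))
                  (next (elt succs (random (length succs))))
                  (delta-e (- (funcall value next) (funcall value state))))
             ;; Always take uphill moves; take downhill moves
             ;; with probability e^(dE/T).
             (when (or (> delta-e 0)
                       (< (random 1.0) (exp (/ delta-e temp))))
               (setf state next)))))

;; e.g., an exponential cooling schedule that reaches 0 after 1000 steps:
;; (lambda (time) (if (> time 1000) 0 (* 100.0 (expt 0.95 time))))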

Properties of simulated annealing

At fixed “temperature” T, state occupation probability reaches Boltzmann distribution

    p(x) = αe^(E(x)/kT)

T decreased slowly enough =⇒ always reach best state x∗
because e^(E(x∗)/kT) / e^(E(x)/kT) = e^((E(x∗)−E(x))/kT) ≫ 1 for small T

Is this necessarily an interesting guarantee??

Devised by Metropolis et al., 1953, for physical process modelling

Widely used in VLSI layout, airline scheduling, etc.

Local beam search

Idea: keep k states instead of 1; choose top k of all their successors

Not the same as k searches run in parallel!
Searches that find good states recruit other searches to join them

Problem: quite often, all k states end up on same local hill

Idea: choose k successors randomly, biased towards good ones

Observe the close analogy to natural selection!


Genetic algorithms

= stochastic local beam search + generate successors from pairs of states

[Figure: a population of digit strings ranked by fitness, selected in proportion to fitness, paired, crossed over, and mutated]

Genetic algorithms contd.

GAs require states encoded as strings (GPs use programs)

Crossover helps iff substrings are meaningful components

GAs ≠ evolution: e.g., real genes encode replication machinery!

Continuous state spaces

Suppose we want to site three airports in Romania:
– 6-D state space defined by (x1, y1), (x2, y2), (x3, y3)
– objective function f(x1, y1, x2, y2, x3, y3) =
  sum of squared distances from each city to nearest airport

Discretization methods turn continuous space into discrete space,
e.g., empirical gradient considers ±δ change in each coordinate

Gradient methods compute

    ∇f = (∂f/∂x1, ∂f/∂y1, ∂f/∂x2, ∂f/∂y2, ∂f/∂x3, ∂f/∂y3)

to increase/reduce f, e.g., by x ← x + α∇f(x)

Sometimes can solve for ∇f(x) = 0 exactly (e.g., with one city).
Newton–Raphson (1664, 1690) iterates x ← x − H⁻¹(x)∇f(x)
to solve ∇f(x) = 0, where Hij = ∂²f/∂xi∂xj

Constraint Satisfaction Problems

Chapter 5
Outline

♦ CSP examples
♦ Backtracking search for CSPs
♦ Problem structure and problem decomposition
♦ Local search for CSPs

Constraint satisfaction problems (CSPs)

Standard search problem:
    state is a “black box”—any old data structure that supports goal test, eval, successor

CSP:
    state is defined by variables Xi with values from domain Di
    goal test is a set of constraints specifying allowable combinations of values for subsets of variables

Simple example of a formal representation language

Allows useful general-purpose algorithms with more power than standard search algorithms

Example: Map-Coloring

[Figure: map of Australia showing Western Australia, Northern Territory, Queensland, South Australia, New South Wales, Victoria, and Tasmania]

Variables: WA, NT, Q, NSW, V, SA, T
Domains: Di = {red, green, blue}
Constraints: adjacent regions must have different colors
    e.g., WA ≠ NT (if the language allows this), or
    (WA, NT) ∈ {(red, green), (red, blue), (green, red), (green, blue), . . .}

Example: Map-Coloring contd.

[Figure: the map colored with a satisfying assignment]

Solutions are assignments satisfying all constraints, e.g.,
{WA = red, NT = green, Q = red, NSW = green, V = red, SA = blue, T = green}
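The whole CSP fits in a few data definitions. A Common Lisp sketch of the map-coloring problem with a consistency check (the adjacency list encodes the borders shown on the map; names are illustrative, not the AIMA code base):

(defparameter *variables* '(WA NT Q NSW V SA T))

(defparameter *domains*
  (mapcar (lambda (v) (cons v '(red green blue))) *variables*))

(defparameter *adjacent*
  '((WA NT) (WA SA) (NT SA) (NT Q) (SA Q) (SA NSW) (SA V) (Q NSW) (NSW V)))

(defun consistent-p (assignment)
  ;; True iff no two adjacent regions share a color in ASSIGNMENT,
  ;; an alist of (variable . color); unassigned variables are ignored.
  (loop for (x y) in *adjacent*
        for cx = (cdr (assoc x assignment))
        for cy = (cdr (assoc y assignment))
        never (and cx cy (eq cx cy))))

;; The slide's solution satisfies every constraint:
;; (consistent-p '((WA . red) (NT . green) (Q . red) (NSW . green)
;;                 (V . red) (SA . blue) (T . green)))   =>  T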
Constraint graph

Binary CSP: each constraint relates at most two variables

Constraint graph: nodes are variables, arcs show constraints

[Figure: constraint graph for the Australia CSP; Tasmania is disconnected from the mainland variables]

General-purpose CSP algorithms use the graph structure to speed up search. E.g., Tasmania is an independent subproblem!

Varieties of CSPs

Discrete variables
    finite domains; size d ⇒ O(d^n) complete assignments
        ♦ e.g., Boolean CSPs, incl. Boolean satisfiability (NP-complete)
    infinite domains (integers, strings, etc.)
        ♦ e.g., job scheduling, variables are start/end days for each job
        ♦ need a constraint language, e.g., StartJob1 + 5 ≤ StartJob3
        ♦ linear constraints solvable, nonlinear undecidable

Continuous variables
    ♦ e.g., start/end times for Hubble Telescope observations
    ♦ linear constraints solvable in poly time by LP methods

Varieties of constraints

Unary constraints involve a single variable,
    e.g., SA ≠ green

Binary constraints involve pairs of variables,
    e.g., SA ≠ WA

Higher-order constraints involve 3 or more variables,
    e.g., cryptarithmetic column constraints

Preferences (soft constraints), e.g., red is better than green
    often representable by a cost for each variable assignment
    → constrained optimization problems

Example: Cryptarithmetic

[Figure: the column addition TWO + TWO = FOUR, with carry digits X1, X2, X3, and its constraint hypergraph over F, T, U, W, R, O]

Variables: F T U W R O X1 X2 X3
Domains: {0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
Constraints:
    alldiff(F, T, U, W, R, O)
    O + O = R + 10 · X1, etc.
Real-world CSPs

Assignment problems, e.g., who teaches what class
Timetabling problems, e.g., which class is offered when and where?
Hardware configuration
Spreadsheets
Transportation scheduling
Factory scheduling
Floorplanning

Notice that many real-world problems involve real-valued variables

Standard search formulation (incremental)

Let’s start with the straightforward, dumb approach, then fix it

States are defined by the values assigned so far

♦ Initial state: the empty assignment, { }
♦ Successor function: assign a value to an unassigned variable that does not conflict with current assignment.
    ⇒ fail if no legal assignments (not fixable!)
♦ Goal test: the current assignment is complete

1) This is the same for all CSPs!
2) Every solution appears at depth n with n variables
    ⇒ use depth-first search
3) Path is irrelevant, so can also use complete-state formulation
4) b = (n − ℓ)d at depth ℓ, hence n!d^n leaves!!!!

Backtracking search

Variable assignments are commutative, i.e.,
    [WA = red then NT = green] same as [NT = green then WA = red]

Only need to consider assignments to a single variable at each node
    ⇒ b = d and there are d^n leaves

Depth-first search for CSPs with single-variable assignments is called backtracking search

Backtracking search is the basic uninformed algorithm for CSPs

Can solve n-queens for n ≈ 25

function Backtracking-Search(csp) returns solution/failure
    return Recursive-Backtracking({ }, csp)

function Recursive-Backtracking(assignment, csp) returns soln/failure
    if assignment is complete then return assignment
    var ← Select-Unassigned-Variable(Variables[csp], assignment, csp)
    for each value in Order-Domain-Values(var, assignment, csp) do
        if value is consistent with assignment given Constraints[csp] then
            add {var = value} to assignment
            result ← Recursive-Backtracking(assignment, csp)
            if result ≠ failure then return result
            remove {var = value} from assignment
    return failure

Backtracking example

[Figure: successive steps of backtracking search coloring the Australia map]
Improving backtracking efficiency

General-purpose methods can give huge gains in speed:
1. Which variable should be assigned next?
2. In what order should its values be tried?
3. Can we detect inevitable failure early?
4. Can we take advantage of problem structure?

Minimum remaining values

Minimum remaining values (MRV):
  choose the variable with the fewest legal values

[Figure: partial map colorings in which SA, the variable with the fewest
remaining values, is chosen next]

Chapter 5 18 Chapter 5 19

Degree heuristic

Tie-breaker among MRV variables

Degree heuristic:
  choose the variable with the most constraints on remaining variables

Least constraining value

Given a variable, choose the least constraining value:
  the one that rules out the fewest values in the remaining variables

[Figure: with WA = red and NT = green, choosing Q = red allows 1 value
for SA, whereas Q = blue allows 0 values for SA]

Combining these heuristics makes 1000-queens feasible
Chapter 5 20 Chapter 5 21
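These heuristics slot into Select-Unassigned-Variable and
Order-Domain-Values. A hedged Python sketch, assuming domains are kept
pruned (e.g., by forward checking) so len(domains[v]) is the number of
remaining legal values:

  def select_variable(domains, neighbors, assignment):
      # MRV: fewest legal values; degree heuristic breaks ties by
      # preferring the variable constraining the most unassigned others.
      unassigned = [v for v in domains if v not in assignment]
      return min(unassigned,
                 key=lambda v: (len(domains[v]),
                                -sum(1 for n in neighbors[v]
                                     if n not in assignment)))

  def order_values(var, domains, neighbors, assignment):
      # LCV: try first the values that rule out the fewest choices
      # for the neighboring unassigned variables.
      def ruled_out(value):
          return sum(value in domains[n]
                     for n in neighbors[var] if n not in assignment)
      return sorted(domains[var], key=ruled_out)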
Forward checking

Idea: Keep track of remaining legal values for unassigned variables
Terminate search when any variable has no legal values

[Figures: four snapshots of the remaining domains of WA, NT, Q, NSW, V,
SA, T as assignments are made; after WA = red, Q = green, V = blue the
domain of SA is empty, so search can terminate]
Chapter 5 24 Chapter 5 25
Constraint propagation

Forward checking propagates information from assigned to unassigned
variables, but doesn't provide early detection for all failures:

[Figure: after WA = red and Q = green, forward checking leaves NT and SA
each with the single value blue]

NT and SA cannot both be blue!

Constraint propagation repeatedly enforces constraints locally

Arc consistency

Simplest form of propagation makes each arc consistent

X → Y is consistent iff
  for every value x of X there is some allowed y
Chapter 5 26 Chapter 5 27

Arc consistency contd.

[Figures: successive arc-consistency checks on the remaining domains of
WA, NT, Q, NSW, V, SA, T]

If X loses a value, neighbors of X need to be rechecked
Chapter 5 28 Chapter 5 29
Arc consistency detects failure earlier than forward checking

Can be run as a preprocessor or after each assignment

Time O(n^2 d^3), can be reduced to O(n^2 d^2)
(but detecting all inconsistencies is NP-hard)

Arc consistency algorithm

function AC-3(csp) returns the CSP, possibly with reduced domains
  inputs: csp, a binary CSP with variables {X1, X2, . . . , Xn}
  local variables: queue, a queue of arcs, initially all the arcs in csp
  while queue is not empty do
    (Xi, Xj) ← Remove-First(queue)
    if Remove-Inconsistent-Values(Xi, Xj) then
      for each Xk in Neighbors[Xi] do
        add (Xk, Xi) to queue

function Remove-Inconsistent-Values(Xi, Xj) returns true iff succeeds
  removed ← false
  for each x in Domain[Xi] do
    if no value y in Domain[Xj] allows (x, y) to satisfy the constraint Xi ↔ Xj
      then delete x from Domain[Xi]; removed ← true
  return removed
Chapter 5 30 Chapter 5 31
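AC-3 translates almost line for line into Python. In this sketch the
constraint test allowed(Xi, x, Xj, y) and the neighbors table are assumed
interfaces, not names from the slides:

  from collections import deque

  def ac3(domains, neighbors, allowed):
      # allowed(xi, x, xj, y) -> bool encodes each binary constraint.
      queue = deque((xi, xj) for xi in domains for xj in neighbors[xi])
      while queue:
          xi, xj = queue.popleft()
          removed = [x for x in domains[xi]
                     if not any(allowed(xi, x, xj, y) for y in domains[xj])]
          if removed:
              for x in removed:
                  domains[xi].remove(x)
              if not domains[xi]:
                  return False           # a domain wiped out: inconsistency
              queue.extend((xk, xi) for xk in neighbors[xi] if xk != xj)
      return True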

Problem structure

[Figure: constraint graph for the Australia problem; Tasmania is
disconnected from the mainland]

Tasmania and mainland are independent subproblems

Identifiable as connected components of constraint graph

Problem structure contd.

Suppose each subproblem has c variables out of n total

Worst-case solution cost is (n/c) · d^c, linear in n

E.g., n = 80, d = 2, c = 20:
  2^80 = 4 billion years at 10 million nodes/sec
  4 · 2^20 = 0.4 seconds at 10 million nodes/sec
Chapter 5 32 Chapter 5 33
Tree-structured CSPs

[Figure: a tree-shaped constraint graph over variables A–F]

Theorem: if the constraint graph has no loops, the CSP can be solved in
O(n d^2) time

Compare to general CSPs, where worst-case time is O(d^n)

This property also applies to logical and probabilistic reasoning:
an important example of the relation between syntactic restrictions
and the complexity of reasoning.

Algorithm for tree-structured CSPs

1. Choose a variable as root, order variables from root to leaves
   such that every node's parent precedes it in the ordering,
   e.g., A B C D E F
2. For j from n down to 2, apply RemoveInconsistent(Parent(Xj), Xj)
3. For j from 1 to n, assign Xj consistently with Parent(Xj)
Chapter 5 34 Chapter 5 35
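A hedged Python sketch of the three steps, assuming the tree is given by
a root-first ordering `order` and a `parent` map (illustrative names,
with the same allowed(p, x, v, y) constraint test as in the AC-3 sketch):

  def solve_tree_csp(domains, parent, order, allowed):
      # Backward pass: make each arc Parent(Xj) -> Xj consistent by
      # deleting unsupported parent values (children are processed
      # before their parents, so child domains are already final).
      for v in reversed(order[1:]):
          p = parent[v]
          domains[p] = [x for x in domains[p]
                        if any(allowed(p, x, v, y) for y in domains[v])]
          if not domains[p]:
              return None                          # unsolvable
      # Forward pass: every parent value now has support, so a
      # consistent assignment can be read off greedily.
      assignment = {order[0]: domains[order[0]][0]}
      for v in order[1:]:
          assignment[v] = next(
              y for y in domains[v]
              if allowed(parent[v], assignment[parent[v]], v, y))
      return assignment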

Nearly tree-structured CSPs

Conditioning: instantiate a variable, prune its neighbors' domains

[Figure: instantiating SA and removing it from the Australia constraint
graph leaves a tree over WA, NT, Q, NSW, V, T]

Cutset conditioning: instantiate (in all ways) a set of variables
such that the remaining constraint graph is a tree

Cutset size c ⇒ runtime O(d^c · (n − c)d^2), very fast for small c

Iterative algorithms for CSPs

Hill-climbing, simulated annealing typically work with
"complete" states, i.e., all variables assigned

To apply to CSPs:
  allow states with unsatisfied constraints
  operators reassign variable values

Variable selection: randomly select any conflicted variable

Value selection by min-conflicts heuristic:
  choose value that violates the fewest constraints
  i.e., hillclimb with h(n) = total number of violated constraints
Chapter 5 36 Chapter 5 37
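Min-conflicts in Python, as a sketch; conflicts(var, value, assignment)
is an assumed helper that counts constraints violated between var=value
and the other assigned variables (it must not compare var with itself):

  import random

  def min_conflicts(domains, conflicts, max_steps=100_000):
      # Start from a random complete assignment.
      current = {v: random.choice(domains[v]) for v in domains}
      for _ in range(max_steps):
          conflicted = [v for v in current
                        if conflicts(v, current[v], current) > 0]
          if not conflicted:
              return current                  # no violated constraints
          var = random.choice(conflicted)     # random conflicted variable
          current[var] = min(domains[var],    # min-conflicts value choice
                             key=lambda val: conflicts(var, val, current))
      return None                             # give up after max_steps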
Example: 4-Queens

States: 4 queens in 4 columns (4^4 = 256 states)
Operators: move queen in column
Goal test: no attacks
Evaluation: h(n) = number of attacks

[Figure: three boards along a min-conflicts trajectory, with h = 5,
h = 2, and h = 0]

Performance of min-conflicts

Given random initial state, can solve n-queens in almost constant time
for arbitrary n with high probability (e.g., n = 10,000,000)

The same appears to be true for any randomly-generated CSP
except in a narrow range of the ratio

    R = (number of constraints) / (number of variables)

[Figure: CPU time plotted against R, peaking sharply at a critical ratio]
Chapter 5 38 Chapter 5 39

Summary

CSPs are a special kind of problem:
  states defined by values of a fixed set of variables
  goal test defined by constraints on variable values

Backtracking = depth-first search with one variable assigned per node

Variable ordering and value selection heuristics help significantly

Forward checking prevents assignments that guarantee later failure

Constraint propagation (e.g., arc consistency) does additional work
to constrain values and detect inconsistencies

The CSP representation allows analysis of problem structure

Tree-structured CSPs can be solved in linear time

Iterative min-conflicts is usually effective in practice
Game playing

Chapter 6
Outline

♦ Games
♦ Perfect play
  – minimax decisions
  – α–β pruning
♦ Resource limits and approximate evaluation
♦ Games of chance
♦ Games of imperfect information

Games vs. search problems

"Unpredictable" opponent ⇒ solution is a strategy
  specifying a move for every possible opponent reply

Time limits ⇒ unlikely to find goal, must approximate

Plan of attack:
• Computer considers possible lines of play (Babbage, 1846)
• Algorithm for perfect play (Zermelo, 1912; Von Neumann, 1944)
• Finite horizon, approximate evaluation (Zuse, 1945; Wiener, 1948;
  Shannon, 1950)
• First chess program (Turing, 1951)
• Machine learning to improve evaluation accuracy (Samuel, 1952–57)
• Pruning to allow deeper search (McCarthy, 1956)
Chapter 6 2 Chapter 6 3

Types of games

                        deterministic        chance
perfect information     chess, checkers,     backgammon,
                        go, othello          monopoly
imperfect information   battleships,         bridge, poker, scrabble,
                        blind tictactoe      nuclear war

Game tree (2-player, deterministic, turns)

[Figure: tic-tac-toe game tree; MAX (X) and MIN (O) move alternately
down to TERMINAL states with utility −1, 0, or +1]
Chapter 6 4 Chapter 6 5
Minimax

Perfect play for deterministic, perfect-information games

Idea: choose move to position with highest minimax value
  = best achievable payoff against best play

E.g., 2-ply game:

[Figure: MAX root with moves A1, A2, A3 leading to MIN nodes of value
3, 2, 2, over leaves 3 12 8, 2 4 6, 14 5 2; the minimax value of the
root is 3]

Minimax algorithm

function Minimax-Decision(state) returns an action
  inputs: state, current state in game
  return the a in Actions(state) maximizing Min-Value(Result(a, state))

function Max-Value(state) returns a utility value
  if Terminal-Test(state) then return Utility(state)
  v ← −∞
  for a, s in Successors(state) do v ← Max(v, Min-Value(s))
  return v

function Min-Value(state) returns a utility value
  if Terminal-Test(state) then return Utility(state)
  v ← ∞
  for a, s in Successors(state) do v ← Min(v, Max-Value(s))
  return v
Chapter 6 6 Chapter 6 7
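The same algorithm in Python; the `game` object bundling
actions/result/terminal/utility is an assumed interface for this sketch,
not an API from the slides:

  def minimax_decision(state, game):
      def max_value(s):
          if game.terminal(s):
              return game.utility(s)
          return max(min_value(game.result(s, a)) for a in game.actions(s))
      def min_value(s):
          if game.terminal(s):
              return game.utility(s)
          return min(max_value(game.result(s, a)) for a in game.actions(s))
      # Choose the action whose successor has the highest minimax value.
      return max(game.actions(state),
                 key=lambda a: min_value(game.result(state, a)))

Checking it on the slide's 2-ply tree:

  from types import SimpleNamespace
  tree = {'root': ['A1', 'A2', 'A3'],
          'A1': [3, 12, 8], 'A2': [2, 4, 6], 'A3': [14, 5, 2]}
  game = SimpleNamespace(
      actions=lambda s: range(len(tree[s])) if s in tree else [],
      result=lambda s, a: tree[s][a],
      terminal=lambda s: s not in tree,
      utility=lambda s: s)
  print(minimax_decision('root', game))   # 0, i.e. move A1 (value 3)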

Properties of minimax

Complete?? Yes, if tree is finite (chess has specific rules for this).
  NB a finite strategy can exist even in an infinite tree!
Optimal?? Yes, against an optimal opponent. Otherwise??
Time complexity?? O(b^m)
Space complexity?? O(bm) (depth-first exploration)

For chess, b ≈ 35, m ≈ 100 for "reasonable" games
  ⇒ exact solution completely infeasible

But do we need to explore every path?

α–β pruning example

[Figures: five stages of α–β pruning on the 2-ply tree above; after the
leaves 3, 12, 8 the first MIN node has value 3, the second MIN node is
pruned as soon as its first leaf 2 shows its value is ≤ 2, and the root
value is 3]
Chapter 6 16 Chapter 6 17
Why is it called α–β ?

[Figure: a MAX–MIN–. . .–MAX–MIN path; α is tracked along the path down
to a MIN node whose value V is being computed]

α is the best value (to max) found so far off the current path

If V is worse than α, max will avoid it ⇒ prune that branch

Define β similarly for min

The α–β algorithm

function Alpha-Beta-Decision(state) returns an action
  return the a in Actions(state) maximizing Min-Value(Result(a, state))

function Max-Value(state, α, β) returns a utility value
  inputs: state, current state in game
    α, the value of the best alternative for max along the path to state
    β, the value of the best alternative for min along the path to state
  if Terminal-Test(state) then return Utility(state)
  v ← −∞
  for a, s in Successors(state) do
    v ← Max(v, Min-Value(s, α, β))
    if v ≥ β then return v
    α ← Max(α, v)
  return v

function Min-Value(state, α, β) returns a utility value
  same as Max-Value but with roles of α, β reversed
Chapter 6 18 Chapter 6 19
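In Python, reusing the assumed `game` interface from the minimax sketch
(so the tiny tree example above works here unchanged):

  import math

  def alphabeta_decision(state, game):
      def max_value(s, alpha, beta):
          if game.terminal(s):
              return game.utility(s)
          v = -math.inf
          for a in game.actions(s):
              v = max(v, min_value(game.result(s, a), alpha, beta))
              if v >= beta:
                  return v              # MIN will never allow this node
              alpha = max(alpha, v)
          return v
      def min_value(s, alpha, beta):
          if game.terminal(s):
              return game.utility(s)
          v = math.inf
          for a in game.actions(s):
              v = min(v, max_value(game.result(s, a), alpha, beta))
              if v <= alpha:
                  return v              # MAX will never allow this node
              beta = min(beta, v)
          return v
      return max(game.actions(state),
                 key=lambda a: min_value(game.result(state, a),
                                         -math.inf, math.inf))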

Properties of α–β

Pruning does not affect final result

Good move ordering improves effectiveness of pruning

With "perfect ordering," time complexity = O(b^(m/2))
  ⇒ doubles solvable depth

A simple example of the value of reasoning about which computations are
relevant (a form of metareasoning)

Unfortunately, 35^50 is still impossible!

Resource limits

Standard approach:
• Use Cutoff-Test instead of Terminal-Test
  e.g., depth limit (perhaps add quiescence search)
• Use Eval instead of Utility
  i.e., evaluation function that estimates desirability of position

Suppose we have 100 seconds, explore 10^4 nodes/second
  ⇒ 10^6 nodes per move ≈ 35^(8/2)
  ⇒ α–β reaches depth 8 ⇒ pretty good chess program

Chapter 6 20 Chapter 6 21
Evaluation functions

[Figure: two chess positions, "Black to move: White slightly better"
and "White to move: Black winning"]

For chess, typically linear weighted sum of features

    Eval(s) = w1 f1(s) + w2 f2(s) + . . . + wn fn(s)

e.g., w1 = 9 with
  f1(s) = (number of white queens) – (number of black queens), etc.

Digression: Exact values don't matter

[Figure: two game trees whose leaf values differ by a monotonic
transformation (1 2 2 4 versus 1 20 20 400) but whose best moves agree]

Behaviour is preserved under any monotonic transformation of Eval

Only the order matters:
  payoff in deterministic games acts as an ordinal utility function
Chapter 6 22 Chapter 6 23
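As a concrete instance of such a weighted feature sum, here is a hedged
Python sketch of a pure material evaluator; the board encoding (one
letter per piece, uppercase for White) and the weights are assumptions
of the sketch, not from the slides:

  def material_eval(pieces, weights={'Q': 9, 'R': 5, 'B': 3, 'N': 3, 'P': 1}):
      # Eval(s) = sum_i w_i * f_i(s), where f_i counts White pieces of a
      # type minus Black pieces of that type.
      score = 0
      for p in pieces:
          w = weights.get(p.upper(), 0)   # kings and unknowns contribute 0
          score += w if p.isupper() else -w
      return score

  print(material_eval("QRPPP" + "qrp"))   # (9+5+3) - (9+5+1) = 2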

Deterministic games in practice

Checkers: Chinook ended the 40-year reign of human world champion
Marion Tinsley in 1994. Used an endgame database defining perfect play
for all positions involving 8 or fewer pieces on the board, a total of
443,748,401,247 positions.

Chess: Deep Blue defeated human world champion Garry Kasparov in a
six-game match in 1997. Deep Blue searches 200 million positions per
second, uses very sophisticated evaluation, and undisclosed methods for
extending some lines of search up to 40 ply.

Othello: human champions refuse to compete against computers, who are
too good.

Go: human champions refuse to compete against computers, who are too
bad. In go, b > 300, so most programs use pattern knowledge bases to
suggest plausible moves.

Nondeterministic games: backgammon

[Figure: a backgammon board, points numbered 0–25]
Chapter 6 24 Chapter 6 25
Nondeterministic games in general

In nondeterministic games, chance introduced by dice, card-shuffling

Simplified example with coin-flipping:

[Figure: MAX over two CHANCE nodes with values 3 and −1; each chance
node averages two MIN nodes with probability 0.5 each, over the leaves
2 4, 7 4, 6 0, 5 −2]

Algorithm for nondeterministic games

Expectiminimax gives perfect play

Just like Minimax, except we must also handle chance nodes:

  ...
  if state is a Max node then
    return the highest ExpectiMinimax-Value of Successors(state)
  if state is a Min node then
    return the lowest ExpectiMinimax-Value of Successors(state)
  if state is a chance node then
    return average of ExpectiMinimax-Value of Successors(state)
  ...
Chapter 6 26 Chapter 6 27
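A Python sketch of the case analysis; node_type and outcomes (yielding
(probability, successor) pairs at chance nodes) are assumed interface
names for this sketch:

  def expectiminimax(state, game):
      kind = game.node_type(state)   # 'max', 'min', 'chance', 'terminal'
      if kind == 'terminal':
          return game.utility(state)
      if kind == 'max':
          return max(expectiminimax(game.result(state, a), game)
                     for a in game.actions(state))
      if kind == 'min':
          return min(expectiminimax(game.result(state, a), game)
                     for a in game.actions(state))
      # Chance node: probability-weighted average over outcomes.
      return sum(p * expectiminimax(s, game)
                 for p, s in game.outcomes(state))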

Nondeterministic games in practice

Dice rolls increase b: 21 possible rolls with 2 dice

Backgammon ≈ 20 legal moves (can be 6,000 with 1-1 roll)

  depth 4 = 20 × (21 × 20)^3 ≈ 1.2 × 10^9

As depth increases, probability of reaching a given node shrinks
  ⇒ value of lookahead is diminished

α–β pruning is much less effective

TDGammon uses depth-2 search + very good Eval
  ≈ world-champion level

Digression: Exact values DO matter

[Figure: the same chance tree with leaves 2 3 1 4 versus leaves
20 30 1 400; the order-preserving transformation changes which move is
best, since the expected values are 2.1 vs 1.3 in one case and 21 vs
40.9 in the other]

Behaviour is preserved only by positive linear transformation of Eval

Hence Eval should be proportional to the expected payoff
Chapter 6 28 Chapter 6 29
Games of imperfect information

E.g., card games, where opponent's initial cards are unknown

Typically we can calculate a probability for each possible deal

Seems just like having one big dice roll at the beginning of the game*

Idea: compute the minimax value of each action in each deal,
  then choose the action with highest expected value over all deals*

Special case: if an action is optimal for all deals, it's optimal.*

GIB, current best bridge program, approximates this idea by
  1) generating 100 deals consistent with bidding information
  2) picking the action that wins most tricks on average

Example

Four-card bridge/whist/hearts hand, Max to play first

[Figure: a four-card deal and its game tree, with value 0]
Chapter 6 30 Chapter 6 31

Example contd.

[Figures: minimax trees for the same hand under different deals; each
individual deal has value 0, but in the combined situation, where Max
cannot tell the deals apart, some actions are worth −0.5]
Chapter 6 32 Chapter 6 33
Commonsense example

Road A leads to a small heap of gold pieces
Road B leads to a fork:
  take the left fork and you'll find a mound of jewels;
  take the right fork and you'll be run over by a bus.

Road A leads to a small heap of gold pieces
Road B leads to a fork:
  take the left fork and you'll be run over by a bus;
  take the right fork and you'll find a mound of jewels.

Road A leads to a small heap of gold pieces
Road B leads to a fork:
  guess correctly and you'll find a mound of jewels;
  guess incorrectly and you'll be run over by a bus.

Proper analysis

* Intuition that the value of an action is the average of its values
  in all actual states is WRONG

With partial observability, value of an action depends on the
information state or belief state the agent is in

Can generate and search a tree of information states

Leads to rational behaviors such as
  ♦ Acting to obtain information
  ♦ Signalling to one's partner
  ♦ Acting randomly to minimize information disclosure
Chapter 6 36 Chapter 6 37
Summary

Games are fun to work on! (and dangerous)

They illustrate several important points about AI
♦ perfection is unattainable ⇒ must approximate
♦ good idea to think about what to think about
♦ uncertainty constrains the assignment of values to states
♦ optimal decisions depend on information state, not real state

Games are to AI as grand prix racing is to automobile design

Logical agents

Chapter 7
Chapter 6 38 Chapter 7 1

Outline

♦ Knowledge-based agents
♦ Wumpus world
♦ Logic in general—models and entailment
♦ Propositional (Boolean) logic
♦ Equivalence, validity, satisfiability
♦ Inference rules and theorem proving
  – forward chaining
  – backward chaining
  – resolution

Knowledge bases

[Figure: a knowledge base (domain-specific content) beneath an
inference engine (domain-independent algorithms)]

Knowledge base = set of sentences in a formal language

Declarative approach to building an agent (or other system):
  Tell it what it needs to know
  Then it can Ask itself what to do—answers should follow from the KB

Agents can be viewed at the knowledge level
  i.e., what they know, regardless of how implemented
Or at the implementation level
  i.e., data structures in KB and algorithms that manipulate them
Chapter 7 2 Chapter 7 3
A simple knowledge-based agent

function KB-Agent(percept) returns an action
  static: KB, a knowledge base
    t, a counter, initially 0, indicating time
  Tell(KB, Make-Percept-Sentence(percept, t))
  action ← Ask(KB, Make-Action-Query(t))
  Tell(KB, Make-Action-Sentence(action, t))
  t ← t + 1
  return action

The agent must be able to:
  Represent states, actions, etc.
  Incorporate new percepts
  Update internal representations of the world
  Deduce hidden properties of the world
  Deduce appropriate actions

Wumpus World PEAS description

Performance measure
  gold +1000, death −1000
  −1 per step, −10 for using the arrow

Environment
  Squares adjacent to wumpus are smelly
  Squares adjacent to pit are breezy
  Glitter iff gold is in the same square
  Shooting kills wumpus if you are facing it
  Shooting uses up the only arrow
  Grabbing picks up gold if in same square
  Releasing drops the gold in same square

Actuators Left turn, Right turn, Forward, Grab, Release, Shoot

Sensors Breeze, Glitter, Smell

[Figure: a 4×4 wumpus-world cave with pits, breezes, stenches, gold,
and the START square at (1,1)]
Chapter 7 4 Chapter 7 5

Wumpus world characterization

Observable?? No—only local perception
Deterministic?? Yes—outcomes exactly specified
Episodic?? No—sequential at the level of actions
Static?? Yes—Wumpus and Pits do not move
Discrete?? Yes
Single-agent?? Yes—Wumpus is essentially a natural feature

Exploring a wumpus world

[Figure: the agent A at (1,1); no percept, so the adjacent squares are
marked OK]
Chapter 7 12 Chapter 7 13

[Figures: successive stages of exploration. A breeze (B) at (2,1) makes
(2,2) and (3,1) possible pits (P?); a stench (S) but no breeze at (1,2)
places the wumpus (W) at (1,3), clears (2,2), and so pins the pit (P)
to (3,1); the agent then moves on to (2,3), where it senses breeze,
glitter, and stench (BGS) and grabs the gold]

Other tight spots

Breeze in (1,2) and (2,1)
  ⇒ no safe actions

Assuming pits uniformly distributed,
  (2,2) has pit w/ prob 0.86, vs. 0.31

Smell in (1,1)
  ⇒ cannot move

Can use a strategy of coercion:
  shoot straight ahead
  wumpus was there ⇒ dead ⇒ safe
  wumpus wasn't there ⇒ safe
Chapter 7 20 Chapter 7 21

Logic in general

Logics are formal languages for representing information
  such that conclusions can be drawn

Syntax defines the sentences in the language

Semantics define the "meaning" of sentences;
  i.e., define truth of a sentence in a world

E.g., the language of arithmetic
  x + 2 ≥ y is a sentence; x2 + y > is not a sentence
  x + 2 ≥ y is true iff the number x + 2 is no less than the number y
  x + 2 ≥ y is true in a world where x = 7, y = 1
  x + 2 ≥ y is false in a world where x = 0, y = 6

Entailment

Entailment means that one thing follows from another:

    KB |= α

Knowledge base KB entails sentence α
  if and only if α is true in all worlds where KB is true

E.g., the KB containing "the Giants won" and "the Reds won"
  entails "Either the Giants won or the Reds won"

E.g., x + y = 4 entails 4 = x + y

Entailment is a relationship between sentences (i.e., syntax)
  that is based on semantics

Note: brains process syntax (of some sort)
Chapter 7 22 Chapter 7 23
Models

Logicians typically think in terms of models, which are formally
structured worlds with respect to which truth can be evaluated

We say m is a model of a sentence α if α is true in m

M(α) is the set of all models of α

Then KB |= α if and only if M(KB) ⊆ M(α)

E.g. KB = Giants won and Reds won
     α = Giants won

[Figure: M(KB) drawn as a region inside M(α)]

Entailment in the wumpus world

Situation after detecting nothing in [1,1],
  moving right, breeze in [2,1]

Consider possible models for the ? squares, assuming only pits

3 Boolean choices ⇒ 8 possible models
Chapter 7 24 Chapter 7 25

Wumpus models

[Figures: the 8 possible pit configurations for the squares (1,2),
(2,2), (3,1); the models of KB = wumpus-world rules + observations
(3 of the 8) lie inside the models of α1]

α1 = "[1,2] is safe", KB |= α1, proved by model checking
Chapter 7 28 Chapter 7 29

Wumpus models contd.

[Figure: the models of α2 = "[2,2] is safe" do not contain all the
models of KB]

KB = wumpus-world rules + observations
α2 = "[2,2] is safe", KB ⊭ α2

Inference

KB ⊢i α = sentence α can be derived from KB by procedure i

Consequences of KB are a haystack; α is a needle.
Entailment = needle in haystack; inference = finding it

Soundness: i is sound if
  whenever KB ⊢i α, it is also true that KB |= α

Completeness: i is complete if
  whenever KB |= α, it is also true that KB ⊢i α

Preview: we will define a logic (first-order logic) which is expressive
enough to say almost anything of interest, and for which there exists a
sound and complete inference procedure.

That is, the procedure will answer any question whose answer follows
from what is known by the KB.

Chapter 7 30 Chapter 7 31
Propositional logic: Syntax

Propositional logic is the simplest logic—illustrates basic ideas

The proposition symbols P1, P2 etc are sentences
If S is a sentence, ¬S is a sentence (negation)
If S1 and S2 are sentences, S1 ∧ S2 is a sentence (conjunction)
If S1 and S2 are sentences, S1 ∨ S2 is a sentence (disjunction)
If S1 and S2 are sentences, S1 ⇒ S2 is a sentence (implication)
If S1 and S2 are sentences, S1 ⇔ S2 is a sentence (biconditional)

Propositional logic: Semantics

Each model specifies true/false for each proposition symbol

E.g.   P1,2    P2,2    P3,1
       true    true    false

(With these symbols, 8 possible models, can be enumerated automatically.)

Rules for evaluating truth with respect to a model m:

  ¬S        is true iff  S is false
  S1 ∧ S2   is true iff  S1 is true and S2 is true
  S1 ∨ S2   is true iff  S1 is true or S2 is true
  S1 ⇒ S2   is true iff  S1 is false or S2 is true
            i.e., is false iff S1 is true and S2 is false
  S1 ⇔ S2   is true iff  S1 ⇒ S2 is true and S2 ⇒ S1 is true

Simple recursive process evaluates an arbitrary sentence, e.g.,
  ¬P1,2 ∧ (P2,2 ∨ P3,1) = true ∧ (false ∨ true) = true ∧ true = true
Chapter 7 32 Chapter 7 33
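The recursive evaluation is easy to state in Python; the nested-tuple
sentence representation is an assumption of this sketch, and the model
below is chosen to match the slide's worked evaluation (¬P1,2 true):

  def pl_true(sentence, model):
      # sentence: a symbol string, or ('not', s), ('and', s1, s2),
      # ('or', s1, s2), ('=>', s1, s2), ('<=>', s1, s2).
      if isinstance(sentence, str):
          return model[sentence]
      op, *args = sentence
      vals = [pl_true(a, model) for a in args]
      if op == 'not': return not vals[0]
      if op == 'and': return vals[0] and vals[1]
      if op == 'or':  return vals[0] or vals[1]
      if op == '=>':  return (not vals[0]) or vals[1]
      if op == '<=>': return vals[0] == vals[1]
      raise ValueError(op)

  # ¬P1,2 ∧ (P2,2 ∨ P3,1), written with symbols P12, P22, P31:
  model = {'P12': False, 'P22': False, 'P31': True}
  print(pl_true(('and', ('not', 'P12'), ('or', 'P22', 'P31')), model))  # True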

Truth tables for connectives

  P      Q      ¬P     P ∧ Q   P ∨ Q   P ⇒ Q   P ⇔ Q
  false  false  true   false   false   true    true
  false  true   true   false   true    true    false
  true   false  false  false   true    false   false
  true   true   false  true    true    true    true

Wumpus world sentences

Let Pi,j be true if there is a pit in [i, j].
Let Bi,j be true if there is a breeze in [i, j].

  ¬P1,1
  ¬B1,1
  B2,1

"Pits cause breezes in adjacent squares"
Chapter 7 34 Chapter 7 35
Wumpus world sentences contd.

"A square is breezy if and only if there is an adjacent pit":

  B1,1 ⇔ (P1,2 ∨ P2,1)
  B2,1 ⇔ (P1,1 ∨ P2,2 ∨ P3,1)

Truth tables for inference

[Table: truth values of the rules R1–R5 and of KB for all 128
assignments to B1,1, B2,1, P1,1, P1,2, P2,1, P2,2, P3,1; KB is true in
exactly three rows]

Enumerate rows (different assignments to symbols),
  if KB is true in row, check that α is too
Chapter 7 36 Chapter 7 37

Inference by enumeration

Depth-first enumeration of all models is sound and complete

function TT-Entails?(KB, α) returns true or false
  inputs: KB, the knowledge base, a sentence in propositional logic
    α, the query, a sentence in propositional logic
  symbols ← a list of the proposition symbols in KB and α
  return TT-Check-All(KB, α, symbols, [ ])

function TT-Check-All(KB, α, symbols, model) returns true or false
  if Empty?(symbols) then
    if PL-True?(KB, model) then return PL-True?(α, model)
    else return true
  else do
    P ← First(symbols); rest ← Rest(symbols)
    return TT-Check-All(KB, α, rest, Extend(P, true, model)) and
           TT-Check-All(KB, α, rest, Extend(P, false, model))

O(2^n) for n symbols; problem is co-NP-complete

Logical equivalence

Two sentences are logically equivalent iff true in same models:
  α ≡ β if and only if α |= β and β |= α

  (α ∧ β) ≡ (β ∧ α)                       commutativity of ∧
  (α ∨ β) ≡ (β ∨ α)                       commutativity of ∨
  ((α ∧ β) ∧ γ) ≡ (α ∧ (β ∧ γ))           associativity of ∧
  ((α ∨ β) ∨ γ) ≡ (α ∨ (β ∨ γ))           associativity of ∨
  ¬(¬α) ≡ α                               double-negation elimination
  (α ⇒ β) ≡ (¬β ⇒ ¬α)                     contraposition
  (α ⇒ β) ≡ (¬α ∨ β)                      implication elimination
  (α ⇔ β) ≡ ((α ⇒ β) ∧ (β ⇒ α))           biconditional elimination
  ¬(α ∧ β) ≡ (¬α ∨ ¬β)                    De Morgan
  ¬(α ∨ β) ≡ (¬α ∧ ¬β)                    De Morgan
  (α ∧ (β ∨ γ)) ≡ ((α ∧ β) ∨ (α ∧ γ))     distributivity of ∧ over ∨
  (α ∨ (β ∧ γ)) ≡ ((α ∨ β) ∧ (α ∨ γ))     distributivity of ∨ over ∧
Chapter 7 38 Chapter 7 39
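Truth-table entailment in Python, as a sketch building on the pl_true
function given earlier; the explicit list of symbols mirrors the
pseudocode:

  def tt_entails(kb, alpha, symbols, model=None):
      # Depth-first enumeration of all models: alpha must hold in
      # every model in which kb holds.
      if model is None:
          model = {}
      if not symbols:
          return pl_true(alpha, model) if pl_true(kb, model) else True
      first, rest = symbols[0], symbols[1:]
      return (tt_entails(kb, alpha, rest, {**model, first: True}) and
              tt_entails(kb, alpha, rest, {**model, first: False}))

  # E.g., {P ∨ Q, ¬P} |= Q:
  kb = ('and', ('or', 'P', 'Q'), ('not', 'P'))
  print(tt_entails(kb, 'Q', ['P', 'Q']))   # True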
Validity and satisfiability

A sentence is valid if it is true in all models,
  e.g., True, A ∨ ¬A, A ⇒ A, (A ∧ (A ⇒ B)) ⇒ B

Validity is connected to inference via the Deduction Theorem:
  KB |= α if and only if (KB ⇒ α) is valid

A sentence is satisfiable if it is true in some model
  e.g., A ∨ B, C

A sentence is unsatisfiable if it is true in no models
  e.g., A ∧ ¬A

Satisfiability is connected to inference via the following:
  KB |= α if and only if (KB ∧ ¬α) is unsatisfiable
  i.e., prove α by reductio ad absurdum

Proof methods

Proof methods divide into (roughly) two kinds:

Application of inference rules
  – Legitimate (sound) generation of new sentences from old
  – Proof = a sequence of inference rule applications
    Can use inference rules as operators in a standard search alg.
  – Typically require translation of sentences into a normal form

Model checking
  truth table enumeration (always exponential in n)
  improved backtracking, e.g., Davis–Putnam–Logemann–Loveland
  heuristic search in model space (sound but incomplete)
    e.g., min-conflicts-like hill-climbing algorithms
Chapter 7 40 Chapter 7 41

Forward and backward chaining

Horn Form (restricted)
  KB = conjunction of Horn clauses
  Horn clause =
    ♦ proposition symbol; or
    ♦ (conjunction of symbols) ⇒ symbol
  E.g., C ∧ (B ⇒ A) ∧ (C ∧ D ⇒ B)

Modus Ponens (for Horn Form): complete for Horn KBs

    α1, . . . , αn,   α1 ∧ · · · ∧ αn ⇒ β
    -------------------------------------
                     β

Can be used with forward chaining or backward chaining.
These algorithms are very natural and run in linear time

Forward chaining

Idea: fire any rule whose premises are satisfied in the KB,
  add its conclusion to the KB, until query is found

  P ⇒ Q
  L ∧ M ⇒ P
  B ∧ L ⇒ M
  A ∧ P ⇒ L
  A ∧ B ⇒ L
  A
  B

[Figure: the corresponding AND–OR graph, with Q at the top and the
known facts A, B at the bottom]
Chapter 7 42 Chapter 7 43
Forward chaining algorithm

function PL-FC-Entails?(KB, q) returns true or false
  inputs: KB, the knowledge base, a set of propositional Horn clauses
    q, the query, a proposition symbol
  local variables: count, a table, indexed by clause, initially the
      number of premises
    inferred, a table, indexed by symbol, each entry initially false
    agenda, a list of symbols, initially the symbols known in KB
  while agenda is not empty do
    p ← Pop(agenda)
    unless inferred[p] do
      inferred[p] ← true
      for each Horn clause c in whose premise p appears do
        decrement count[c]
        if count[c] = 0 then do
          if Head[c] = q then return true
          Push(Head[c], agenda)
  return false

Forward chaining example

[Figure: the AND–OR graph with each clause annotated by its remaining
premise count, initially Q:1, P:2, M:2, L:2 and 2]
Chapter 7 44 Chapter 7 45
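A Python rendering of the counting scheme; encoding clauses as
(premises, head) pairs is an assumption of this sketch:

  from collections import deque

  def pl_fc_entails(clauses, facts, q):
      # clauses: list of (premises, head); facts: symbols known true.
      count = [len(premises) for premises, _ in clauses]
      inferred = set()
      agenda = deque(facts)
      while agenda:
          p = agenda.popleft()
          if p == q:
              return True
          if p in inferred:
              continue
          inferred.add(p)
          for i, (premises, head) in enumerate(clauses):
              if p in premises:
                  count[i] -= 1
                  if count[i] == 0:       # all premises known: fire rule
                      agenda.append(head)
      return False

  # The slide's example KB:
  clauses = [(['P'], 'Q'), (['L', 'M'], 'P'), (['B', 'L'], 'M'),
             (['A', 'P'], 'L'), (['A', 'B'], 'L')]
  print(pl_fc_entails(clauses, ['A', 'B'], 'Q'))   # True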

[Figures: six stages of forward chaining on the example, the premise
counts decreasing as A, B, L, M, P are inferred, until Q is reached]

Proof of completeness

FC derives every atomic sentence that is entailed by KB

1. FC reaches a fixed point where no new atomic sentences are derived
2. Consider the final state as a model m, assigning true/false to symbols
3. Every clause in the original KB is true in m
   Proof: Suppose a clause a1 ∧ . . . ∧ ak ⇒ b is false in m
     Then a1 ∧ . . . ∧ ak is true in m and b is false in m
     Therefore the algorithm has not reached a fixed point!
4. Hence m is a model of KB
5. If KB |= q, q is true in every model of KB, including m

General idea: construct any model of KB by sound inference, check α
Chapter 7 52 Chapter 7 53

Backward chaining

Idea: work backwards from the query q:
  to prove q by BC,
    check if q is known already, or
    prove by BC all premises of some rule concluding q

Avoid loops: check if new subgoal is already on the goal stack

Avoid repeated work: check if new subgoal
  1) has already been proved true, or
  2) has already failed

Backward chaining example

[Figure: the same AND–OR graph, now explored top-down from the query Q]
Chapter 7 54 Chapter 7 55
[Figures: twelve stages of backward chaining on the example, expanding
subgoals of Q down to the known facts A and B]
Chapter 7 64 Chapter 7 65

Forward vs. backward chaining

FC is data-driven, cf. automatic, unconscious processing,
  e.g., object recognition, routine decisions

May do lots of work that is irrelevant to the goal

BC is goal-driven, appropriate for problem-solving,
  e.g., Where are my keys? How do I get into a PhD program?

Complexity of BC can be much less than linear in size of KB

Resolution

Conjunctive Normal Form (CNF—universal)
  conjunction of disjunctions of literals (the disjunctions are clauses)
  E.g., (A ∨ ¬B) ∧ (B ∨ ¬C ∨ ¬D)

Resolution inference rule (for CNF): complete for propositional logic

    ℓ1 ∨ · · · ∨ ℓk,   m1 ∨ · · · ∨ mn
    -----------------------------------------------------------------
    ℓ1 ∨ · · · ∨ ℓi−1 ∨ ℓi+1 ∨ · · · ∨ ℓk ∨
        m1 ∨ · · · ∨ mj−1 ∨ mj+1 ∨ · · · ∨ mn

where ℓi and mj are complementary literals. E.g.,

    P1,3 ∨ P2,2,   ¬P2,2
    --------------------
           P1,3

Resolution is sound and complete for propositional logic

[Figure: wumpus-world grid illustrating the P1,3 / P2,2 inference]
Chapter 7 66 Chapter 7 67
Conversion to CNF

B1,1 ⇔ (P1,2 ∨ P2,1)

1. Eliminate ⇔, replacing α ⇔ β with (α ⇒ β) ∧ (β ⇒ α).
   (B1,1 ⇒ (P1,2 ∨ P2,1)) ∧ ((P1,2 ∨ P2,1) ⇒ B1,1)
2. Eliminate ⇒, replacing α ⇒ β with ¬α ∨ β.
   (¬B1,1 ∨ P1,2 ∨ P2,1) ∧ (¬(P1,2 ∨ P2,1) ∨ B1,1)
3. Move ¬ inwards using de Morgan's rules and double-negation:
   (¬B1,1 ∨ P1,2 ∨ P2,1) ∧ ((¬P1,2 ∧ ¬P2,1) ∨ B1,1)
4. Apply distributivity law (∨ over ∧) and flatten:
   (¬B1,1 ∨ P1,2 ∨ P2,1) ∧ (¬P1,2 ∨ B1,1) ∧ (¬P2,1 ∨ B1,1)

Resolution algorithm

Proof by contradiction, i.e., show KB ∧ ¬α unsatisfiable

function PL-Resolution(KB, α) returns true or false
  inputs: KB, the knowledge base, a sentence in propositional logic
    α, the query, a sentence in propositional logic
  clauses ← the set of clauses in the CNF representation of KB ∧ ¬α
  new ← { }
  loop do
    for each Ci, Cj in clauses do
      resolvents ← PL-Resolve(Ci, Cj)
      if resolvents contains the empty clause then return true
      new ← new ∪ resolvents
    if new ⊆ clauses then return false
    clauses ← clauses ∪ new
Chapter 7 68 Chapter 7 69
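A compact Python sketch of PL-Resolution; representing clauses as
frozensets of string literals with a leading '-' marking negation is an
assumed encoding, not from the slides:

  def resolve(ci, cj):
      # All resolvents of two clauses on complementary literal pairs.
      resolvents = []
      for lit in ci:
          comp = lit[1:] if lit.startswith('-') else '-' + lit
          if comp in cj:
              resolvents.append((ci - {lit}) | (cj - {comp}))
      return resolvents

  def pl_resolution(clauses):
      # clauses: CNF of KB ∧ ¬α; True means unsatisfiable, i.e. KB |= α.
      clauses = {frozenset(c) for c in clauses}
      while True:
          new = set()
          for ci in clauses:
              for cj in clauses:
                  if ci != cj:
                      for r in resolve(ci, cj):
                          if not r:
                              return True      # derived the empty clause
                          new.add(frozenset(r))
          if new <= clauses:
              return False
          clauses |= new

  # The example below: KB = (B11 ⇔ (P12 ∨ P21)) ∧ ¬B11, query α = ¬P12,
  # so the input is the CNF of KB plus the negated query P12:
  cnf = [{'-B11', 'P12', 'P21'}, {'-P12', 'B11'}, {'-P21', 'B11'},
         {'-B11'}, {'P12'}]
  print(pl_resolution(cnf))   # True, so KB |= ¬P12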

Resolution example

KB = (B1,1 ⇔ (P1,2 ∨ P2,1)) ∧ ¬B1,1        α = ¬P1,2

[Figure: resolution derivation of the empty clause from the CNF clauses
of KB ∧ ¬α]

Summary

Logical agents apply inference to a knowledge base
  to derive new information and make decisions

Basic concepts of logic:
  – syntax: formal structure of sentences
  – semantics: truth of sentences wrt models
  – entailment: necessary truth of one sentence given another
  – inference: deriving sentences from other sentences
  – soundness: derivations produce only entailed sentences
  – completeness: derivations can produce all entailed sentences

Wumpus world requires the ability to represent partial and negated
information, reason by cases, etc.

Forward, backward chaining are linear-time, complete for Horn clauses

Resolution is complete for propositional logic

Propositional logic lacks expressive power
Chapter 7 70 Chapter 7 71
First-order logic

Chapter 8

Outline

♦ Why FOL?
♦ Syntax and semantics of FOL
♦ Fun with sentences
♦ Wumpus world in FOL
Chapter 8 1 Chapter 8 2

Pros and cons of propositional logic

Propositional logic is declarative: pieces of syntax correspond to facts

Propositional logic allows partial/disjunctive/negated information
  (unlike most data structures and databases)

Propositional logic is compositional:
  meaning of B1,1 ∧ P1,2 is derived from meaning of B1,1 and of P1,2

Meaning in propositional logic is context-independent
  (unlike natural language, where meaning depends on context)

Propositional logic has very limited expressive power
  (unlike natural language)
  E.g., cannot say "pits cause breezes in adjacent squares"
    except by writing one sentence for each square

First-order logic

Whereas propositional logic assumes world contains facts,
first-order logic (like natural language) assumes the world contains

• Objects: people, houses, numbers, theories, Ronald McDonald, colors,
  baseball games, wars, centuries . . .
• Relations: red, round, bogus, prime, multistoried . . .,
  brother of, bigger than, inside, part of, has color, occurred after,
  owns, comes between, . . .
• Functions: father of, best friend, third inning of, one more than,
  end of . . .
Chapter 8 3 Chapter 8 4
Logics in general

Language             Ontological Commitment            Epistemological Commitment
Propositional logic  facts                             true/false/unknown
First-order logic    facts, objects, relations         true/false/unknown
Temporal logic       facts, objects, relations, times  true/false/unknown
Probability theory   facts                             degree of belief
Fuzzy logic          facts + degree of truth           known interval value

Syntax of FOL: Basic elements

Constants     KingJohn, 2, UCB, . . .
Predicates    Brother, >, . . .
Functions     Sqrt, LeftLegOf, . . .
Variables     x, y, a, b, . . .
Connectives   ∧ ∨ ¬ ⇒ ⇔
Equality      =
Quantifiers   ∀ ∃
Chapter 8 5 Chapter 8 6

Atomic sentences

Atomic sentence = predicate(term1, . . . , termn)
                  or term1 = term2

Term = function(term1, . . . , termn)
       or constant or variable

E.g., Brother(KingJohn, RichardTheLionheart)
      >(Length(LeftLegOf(Richard)), Length(LeftLegOf(KingJohn)))

Complex sentences

Complex sentences are made from atomic sentences using connectives

  ¬S,  S1 ∧ S2,  S1 ∨ S2,  S1 ⇒ S2,  S1 ⇔ S2

E.g. Sibling(KingJohn, Richard) ⇒ Sibling(Richard, KingJohn)
     >(1, 2) ∨ ≤(1, 2)
     >(1, 2) ∧ ¬>(1, 2)
Chapter 8 7 Chapter 8 8
Truth in first-order logic

Sentences are true with respect to a model and an interpretation

Model contains ≥ 1 objects (domain elements) and relations among them

Interpretation specifies referents for
  constant symbols  → objects
  predicate symbols → relations
  function symbols  → functional relations

An atomic sentence predicate(term1, . . . , termn) is true
iff the objects referred to by term1, . . . , termn
are in the relation referred to by predicate

Models for FOL: Example

[Figure: a model with two persons, R (Richard) and J (John), a crown on
John's head, the brother relation between them, the king property, and
a "left leg" function for each person]
Chapter 8 9 Chapter 8 10

Truth example

Consider the interpretation in which
  Richard → Richard the Lionheart
  John → the evil King John
  Brother → the brotherhood relation

Under this interpretation, Brother(Richard, John) is true
just in case Richard the Lionheart and the evil King John
are in the brotherhood relation in the model

Models for FOL: Lots!

Entailment in propositional logic can be computed by enumerating models

We can enumerate the FOL models for a given KB vocabulary:
  For each number of domain elements n from 1 to ∞
    For each k-ary predicate Pk in the vocabulary
      For each possible k-ary relation on n objects
        For each constant symbol C in the vocabulary
          For each choice of referent for C from n objects . . .

Computing entailment by enumerating FOL models is not easy!
Chapter 8 11 Chapter 8 12
Universal quantification

∀ ⟨variables⟩ ⟨sentence⟩

Everyone at Berkeley is smart:
  ∀ x At(x, Berkeley) ⇒ Smart(x)

∀ x P is true in a model m iff P is true with x being
each possible object in the model

Roughly speaking, equivalent to the conjunction of instantiations of P
    (At(KingJohn, Berkeley) ⇒ Smart(KingJohn))
  ∧ (At(Richard, Berkeley) ⇒ Smart(Richard))
  ∧ (At(Berkeley, Berkeley) ⇒ Smart(Berkeley))
  ∧ ...

A common mistake to avoid

Typically, ⇒ is the main connective with ∀

Common mistake: using ∧ as the main connective with ∀:
  ∀ x At(x, Berkeley) ∧ Smart(x)
means "Everyone is at Berkeley and everyone is smart"
Chapter 8 13 Chapter 8 14

Existential quantification

∃ ⟨variables⟩ ⟨sentence⟩

Someone at Stanford is smart:
  ∃ x At(x, Stanford) ∧ Smart(x)

∃ x P is true in a model m iff P is true with x being
some possible object in the model

Roughly speaking, equivalent to the disjunction of instantiations of P
    (At(KingJohn, Stanford) ∧ Smart(KingJohn))
  ∨ (At(Richard, Stanford) ∧ Smart(Richard))
  ∨ (At(Stanford, Stanford) ∧ Smart(Stanford))
  ∨ ...

Another common mistake to avoid

Typically, ∧ is the main connective with ∃

Common mistake: using ⇒ as the main connective with ∃:
  ∃ x At(x, Stanford) ⇒ Smart(x)
is true if there is anyone who is not at Stanford!

Chapter 8 15 Chapter 8 16
Properties of quantifiers

∀ x ∀ y is the same as ∀ y ∀ x (why??)

∃ x ∃ y is the same as ∃ y ∃ x (why??)

∃ x ∀ y is not the same as ∀ y ∃ x
  ∃ x ∀ y Loves(x, y)
    "There is a person who loves everyone in the world"
  ∀ y ∃ x Loves(x, y)
    "Everyone in the world is loved by at least one person"

Quantifier duality: each can be expressed using the other
  ∀ x Likes(x, IceCream)    ¬∃ x ¬Likes(x, IceCream)
  ∃ x Likes(x, Broccoli)    ¬∀ x ¬Likes(x, Broccoli)
Chapter 8 17 Chapter 8 18

Fun with sentences

Brothers are siblings
  ∀ x, y Brother(x, y) ⇒ Sibling(x, y).

"Sibling" is symmetric
  ∀ x, y Sibling(x, y) ⇔ Sibling(y, x).

One's mother is one's female parent
  ∀ x, y Mother(x, y) ⇔ (Female(x) ∧ Parent(x, y)).

A first cousin is a child of a parent's sibling
  ∀ x, y FirstCousin(x, y) ⇔
    ∃ p, ps Parent(p, x) ∧ Sibling(ps, p) ∧ Parent(ps, y)
Chapter 8 21 Chapter 8 22

Equality

term1 = term2 is true under a given interpretation
if and only if term1 and term2 refer to the same object

E.g., 1 = 2 and ∀ x ×(Sqrt(x), Sqrt(x)) = x are satisfiable
      2 = 2 is valid

E.g., definition of (full) Sibling in terms of Parent:
  ∀ x, y Sibling(x, y) ⇔ [¬(x = y) ∧ ∃ m, f ¬(m = f) ∧
    Parent(m, x) ∧ Parent(f, x) ∧ Parent(m, y) ∧ Parent(f, y)]

Interacting with FOL KBs

Suppose a wumpus-world agent is using an FOL KB
and perceives a smell and a breeze (but no glitter) at t = 5:

  Tell(KB, Percept([Smell, Breeze, None], 5))
  Ask(KB, ∃ a Action(a, 5))

I.e., does KB entail any particular actions at t = 5?

Answer: Yes, {a/Shoot}   ← substitution (binding list)

Given a sentence S and a substitution σ,
Sσ denotes the result of plugging σ into S; e.g.,
  S = Smarter(x, y)
  σ = {x/Hillary, y/Bill}
  Sσ = Smarter(Hillary, Bill)

Ask(KB, S) returns some/all σ such that KB |= Sσ

Chapter 8 23 Chapter 8 24
Knowledge base for the wumpus world

"Perception"
  ∀ b, g, t Percept([Smell, b, g], t) ⇒ Smelt(t)
  ∀ s, b, t Percept([s, b, Glitter], t) ⇒ AtGold(t)

Reflex: ∀ t AtGold(t) ⇒ Action(Grab, t)

Reflex with internal state: do we have the gold already?
  ∀ t AtGold(t) ∧ ¬Holding(Gold, t) ⇒ Action(Grab, t)

Holding(Gold, t) cannot be observed
  ⇒ keeping track of change is essential

Deducing hidden properties

Properties of locations:
  ∀ x, t At(Agent, x, t) ∧ Smelt(t) ⇒ Smelly(x)
  ∀ x, t At(Agent, x, t) ∧ Breeze(t) ⇒ Breezy(x)

Squares are breezy near a pit:

Diagnostic rule—infer cause from effect
  ∀ y Breezy(y) ⇒ ∃ x Pit(x) ∧ Adjacent(x, y)

Causal rule—infer effect from cause
  ∀ x, y Pit(x) ∧ Adjacent(x, y) ⇒ Breezy(y)

Neither of these is complete—e.g., the causal rule doesn't say whether
squares far away from pits can be breezy

Definition for the Breezy predicate:
  ∀ y Breezy(y) ⇔ [∃ x Pit(x) ∧ Adjacent(x, y)]
Chapter 8 25 Chapter 8 26

Keeping track of change

Facts hold in situations, rather than eternally
  E.g., Holding(Gold, Now) rather than just Holding(Gold)

Situation calculus is one way to represent change in FOL:
  Adds a situation argument to each non-eternal predicate
  E.g., Now in Holding(Gold, Now) denotes a situation

Situations are connected by the Result function
  Result(a, s) is the situation that results from doing a in s

[Figure: two wumpus-world snapshots, situation S0 and the situation
S1 = Result(Forward, S0)]

Describing actions I

"Effect" axiom—describe changes due to action
  ∀ s AtGold(s) ⇒ Holding(Gold, Result(Grab, s))

"Frame" axiom—describe non-changes due to action
  ∀ s HaveArrow(s) ⇒ HaveArrow(Result(Grab, s))

Frame problem: find an elegant way to handle non-change
  (a) representation—avoid frame axioms
  (b) inference—avoid repeated "copy-overs" to keep track of state

Qualification problem: true descriptions of real actions require
endless caveats—what if gold is slippery or nailed down or . . .

Ramification problem: real actions have many secondary consequences—
what about the dust on the gold, wear and tear on gloves, . . .
Chapter 8 27 Chapter 8 28
Describing actions II

Successor-state axioms solve the representational frame problem

Each axiom is “about” a predicate (not an action per se):
  P true afterwards ⇔ [an action made P true
                       ∨ P true already and no action made P false]

For holding the gold:
∀ a, s Holding(Gold, Result(a, s)) ⇔
    [(a = Grab ∧ AtGold(s))
     ∨ (Holding(Gold, s) ∧ a ≠ Release)]

Making plans

Initial condition in KB:
  At(Agent, [1, 1], S0)
  At(Gold, [1, 2], S0)

Query: Ask(KB, ∃ s Holding(Gold, s))
  i.e., in what situation will I be holding the gold?

Answer: {s/Result(Grab, Result(Forward, S0))}
  i.e., go forward and then grab the gold

This assumes that the agent is interested in plans starting at S0 and that S0
is the only situation described in the KB

Making plans: A better way

Represent plans as action sequences [a1, a2, . . . , an]

PlanResult(p, s) is the result of executing p in s

Then the query Ask(KB, ∃ p Holding(Gold, PlanResult(p, S0)))
has the solution {p/[Forward, Grab]}

Definition of PlanResult in terms of Result:
  ∀ s PlanResult([ ], s) = s
  ∀ a, p, s PlanResult([a|p], s) = PlanResult(p, Result(a, s))

Planning systems are special-purpose reasoners designed to do this type of
inference more efficiently than a general-purpose reasoner

Summary

First-order logic:
  – objects and relations are semantic primitives
  – syntax: constants, functions, predicates, equality, quantifiers
Increased expressive power: sufficient to define wumpus world

Situation calculus:
  – conventions for describing actions and change in FOL
  – can formulate planning as inference on a situation calculus KB
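As an aside before leaving Chapter 8: the recursive PlanResult definition above is directly executable. A minimal Python sketch, assuming situations are just nested Result(a, s) terms encoded as tuples (the encoding and function names are ours, not the slides’):

import functools  # not required; shown plain below

def result(action, situation):
    # Result(a, s) as an uninterpreted term
    return ('Result', action, situation)

def plan_result(plan, situation):
    # PlanResult([], s) = s;  PlanResult([a|p], s) = PlanResult(p, Result(a, s))
    for action in plan:          # unrolls the recursion as a loop
        situation = result(action, situation)
    return situation

print(plan_result(['Forward', 'Grab'], 'S0'))
# ('Result', 'Grab', ('Result', 'Forward', 'S0'))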
Inference in first-order logic

Chapter 9

Outline
♦ Reducing first-order inference to propositional inference
♦ Unification
♦ Generalized Modus Ponens
♦ Forward and backward chaining
♦ Logic programming
♦ Resolution

A brief history of reasoning

450 B.C.  Stoics         propositional logic, inference (maybe)
322 B.C.  Aristotle      “syllogisms” (inference rules), quantifiers
1565      Cardano        probability theory (propositional logic + uncertainty)
1847      Boole          propositional logic (again)
1879      Frege          first-order logic
1922      Wittgenstein   proof by truth tables
1930      Gödel          ∃ complete algorithm for FOL
1930      Herbrand       complete algorithm for FOL (reduce to propositional)
1931      Gödel          ¬∃ complete algorithm for arithmetic
1960      Davis/Putnam   “practical” algorithm for propositional logic
1965      Robinson       “practical” algorithm for FOL—resolution

Universal instantiation (UI)

Every instantiation of a universally quantified sentence is entailed by it:
        ∀ v α
    ───────────────
    Subst({v/g}, α)
for any variable v and ground term g

E.g., ∀ x King(x) ∧ Greedy(x) ⇒ Evil(x) yields
  King(John) ∧ Greedy(John) ⇒ Evil(John)
  King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
  King(Father(John)) ∧ Greedy(Father(John)) ⇒ Evil(Father(John))
  ...
Existential instantiation (EI)

For any sentence α, variable v, and constant symbol k
that does not appear elsewhere in the knowledge base:
        ∃ v α
    ───────────────
    Subst({v/k}, α)

E.g., ∃ x Crown(x) ∧ OnHead(x, John) yields
  Crown(C1) ∧ OnHead(C1, John)
provided C1 is a new constant symbol, called a Skolem constant

Another example: from ∃ x d(x^y)/dy = x^y we obtain
  d(e^y)/dy = e^y
provided e is a new constant symbol

Existential instantiation contd.

UI can be applied several times to add new sentences;
the new KB is logically equivalent to the old

EI can be applied once to replace the existential sentence;
the new KB is not equivalent to the old,
but is satisfiable iff the old KB was satisfiable

Reduction to propositional inference

Suppose the KB contains just the following:
  ∀ x King(x) ∧ Greedy(x) ⇒ Evil(x)
  King(John)
  Greedy(John)
  Brother(Richard, John)

Instantiating the universal sentence in all possible ways, we have
  King(John) ∧ Greedy(John) ⇒ Evil(John)
  King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
  King(John)
  Greedy(John)
  Brother(Richard, John)

The new KB is propositionalized: proposition symbols are
  King(John), Greedy(John), Evil(John), King(Richard), etc.

Reduction contd.

Claim: a ground sentence is entailed by the new KB iff entailed by the original KB
Claim: every FOL KB can be propositionalized so as to preserve entailment

Idea: propositionalize KB and query, apply resolution, return result

Problem: with function symbols, there are infinitely many ground terms,
  e.g., Father(Father(Father(John)))

Theorem: Herbrand (1930). If a sentence α is entailed by an FOL KB,
it is entailed by a finite subset of the propositional KB

Idea: For n = 0 to ∞ do
  create a propositional KB by instantiating with depth-n terms
  see if α is entailed by this KB

Problem: works if α is entailed, loops if α is not entailed

Theorem: Turing (1936), Church (1936): entailment in FOL is semidecidable
Problems with propositionalization

Propositionalization seems to generate lots of irrelevant sentences.
E.g., from
  ∀ x King(x) ∧ Greedy(x) ⇒ Evil(x)
  King(John)
  ∀ y Greedy(y)
  Brother(Richard, John)
it seems obvious that Evil(John), but propositionalization produces lots of
facts such as Greedy(Richard) that are irrelevant

With p k-ary predicates and n constants, there are p · n^k instantiations

With function symbols, it gets much, much worse!

Unification

We can get the inference immediately if we can find a substitution θ
such that King(x) and Greedy(x) match King(John) and Greedy(y)

θ = {x/John, y/John} works

Unify(α, β) = θ if αθ = βθ

  p                q                       θ
  Knows(John, x)   Knows(John, Jane)       {x/Jane}
  Knows(John, x)   Knows(y, OJ)            {x/OJ, y/John}
  Knows(John, x)   Knows(y, Mother(y))     {y/John, x/Mother(John)}
  Knows(John, x)   Knows(x, OJ)            fail

Standardizing apart eliminates overlap of variables, e.g., Knows(z17, OJ)
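To make the table concrete, here is a small Python sketch of Unify—a simplified version without the occur check, not the book’s algorithm. The term representation (lowercase strings as variables, tuples as compound terms) is our own convention:

def is_var(t):
    return isinstance(t, str) and t[0].islower()

def substitute(t, theta):
    # Apply substitution theta to term t, following variable bindings
    if is_var(t):
        return substitute(theta[t], theta) if t in theta else t
    if isinstance(t, tuple):
        return tuple(substitute(a, theta) for a in t)
    return t

def unify(a, b, theta=None):
    # Return a substitution making a and b identical, or None on failure
    if theta is None:
        theta = {}
    a, b = substitute(a, theta), substitute(b, theta)
    if a == b:
        return theta
    if is_var(a):
        return {**theta, a: b}
    if is_var(b):
        return {**theta, b: a}
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            theta = unify(x, y, theta)
            if theta is None:
                return None
        return theta
    return None

# Reproduces the table above:
print(unify(('Knows', 'John', 'x'), ('Knows', 'John', 'Jane')))       # {'x': 'Jane'}
print(unify(('Knows', 'John', 'x'), ('Knows', 'y', 'OJ')))            # {'y': 'John', 'x': 'OJ'}
print(unify(('Knows', 'John', 'x'), ('Knows', 'y', ('Mother', 'y'))))
print(unify(('Knows', 'John', 'x'), ('Knows', 'x', 'OJ')))            # None (fail)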

Generalized Modus Ponens (GMP)

    p1′, p2′, . . . , pn′,  (p1 ∧ p2 ∧ . . . ∧ pn ⇒ q)
    ─────────────────────────────────────────────────   where pi′θ = piθ for all i
                          qθ

E.g., p1′ is King(John)           p1 is King(x)
      p2′ is Greedy(y)            p2 is Greedy(x)
      θ is {x/John, y/John}       q is Evil(x)
      qθ is Evil(John)

GMP is used with a KB of definite clauses (exactly one positive literal)
All variables are assumed universally quantified

Soundness of GMP

Need to show that
    p1′, . . . , pn′, (p1 ∧ . . . ∧ pn ⇒ q) |= qθ
provided that pi′θ = piθ for all i

Lemma: For any definite clause p, we have p |= pθ by UI

1. (p1 ∧ . . . ∧ pn ⇒ q) |= (p1 ∧ . . . ∧ pn ⇒ q)θ = (p1θ ∧ . . . ∧ pnθ ⇒ qθ)
2. p1′, . . . , pn′ |= p1′ ∧ . . . ∧ pn′ |= p1′θ ∧ . . . ∧ pn′θ
3. From 1 and 2, qθ follows by ordinary Modus Ponens
Example knowledge base

The law says that it is a crime for an American to sell weapons to hostile
nations. The country Nono, an enemy of America, has some missiles, and
all of its missiles were sold to it by Colonel West, who is American.

Prove that Col. West is a criminal

Example knowledge base contd.

. . . it is a crime for an American to sell weapons to hostile nations:
  American(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ Hostile(z) ⇒ Criminal(x)
Nono . . . has some missiles, i.e., ∃ x Owns(Nono, x) ∧ Missile(x):
  Owns(Nono, M1) and Missile(M1)
. . . all of its missiles were sold to it by Colonel West
  ∀ x Missile(x) ∧ Owns(Nono, x) ⇒ Sells(West, x, Nono)
Missiles are weapons:
  Missile(x) ⇒ Weapon(x)
An enemy of America counts as “hostile”:
  Enemy(x, America) ⇒ Hostile(x)
West, who is American . . .
  American(West)
The country Nono, an enemy of America . . .
  Enemy(Nono, America)

Forward chaining algorithm

function FOL-FC-Ask(KB, α) returns a substitution or false
  repeat until new is empty
      new ← { }
      for each sentence r in KB do
          (p1 ∧ . . . ∧ pn ⇒ q) ← Standardize-Apart(r)
          for each θ such that (p1 ∧ . . . ∧ pn)θ = (p1′ ∧ . . . ∧ pn′)θ
                      for some p1′, . . . , pn′ in KB
              q′ ← Subst(θ, q)
              if q′ is not a renaming of a sentence already in KB or new then do
                  add q′ to new
                  φ ← Unify(q′, α)
                  if φ is not fail then return φ
      add new to KB
  return false
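To see the loop in action, here is a toy Python forward chainer for the crime KB above. It is a naive sketch, not the book’s algorithm: premises are matched against ground facts only (so a simple matcher suffices instead of full unification), and the encoding, match/subst helpers, and all names are our own:

from itertools import product

def match(pattern, fact, theta):
    # Match a pattern against a ground fact, extending substitution theta;
    # lowercase strings are variables, everything else must agree exactly.
    if len(pattern) != len(fact):
        return None
    for p, f in zip(pattern, fact):
        if isinstance(p, str) and p[0].islower():   # variable
            if theta.get(p, f) != f:
                return None
            theta = {**theta, p: f}
        elif p != f:                                # constants must agree
            return None
    return theta

def subst(term, theta):
    return tuple(theta.get(t, t) for t in term)

rules = [
    ([('American', 'x'), ('Weapon', 'y'), ('Sells', 'x', 'y', 'z'),
      ('Hostile', 'z')], ('Criminal', 'x')),
    ([('Missile', 'x'), ('Owns', 'Nono', 'x')], ('Sells', 'West', 'x', 'Nono')),
    ([('Missile', 'x')], ('Weapon', 'x')),
    ([('Enemy', 'x', 'America')], ('Hostile', 'x')),
]
facts = {('Owns', 'Nono', 'M1'), ('Missile', 'M1'),
         ('American', 'West'), ('Enemy', 'Nono', 'America')}

def forward_chain(rules, facts):
    while True:
        new = set()
        for premises, conclusion in rules:
            # naive: try every tuple of known facts against the premises
            for chosen in product(facts, repeat=len(premises)):
                theta = {}
                for p, f in zip(premises, chosen):
                    theta = match(p, f, theta)
                    if theta is None:
                        break
                if theta is not None:
                    q = subst(conclusion, theta)
                    if q not in facts:
                        new.add(q)
        if not new:
            return facts
        facts = facts | new

for fact in sorted(forward_chain(rules, facts)):
    print(fact)   # includes the derived ('Criminal', 'West')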
Forward chaining proof

[Figure: the proof tree built bottom-up. From the facts American(West),
Missile(M1), Owns(Nono,M1), Enemy(Nono,America), the first iteration adds
Weapon(M1), Sells(West,M1,Nono), and Hostile(Nono); the second iteration
adds Criminal(West).]

Properties of forward chaining

Sound and complete for first-order definite clauses
(proof similar to propositional proof)

Datalog = first-order definite clauses + no functions (e.g., crime KB)
FC terminates for Datalog in poly iterations: at most p · n^k literals

May not terminate in general if α is not entailed
This is unavoidable: entailment with definite clauses is semidecidable
Efficiency of forward chaining

Simple observation: no need to match a rule on iteration k
if a premise wasn’t added on iteration k − 1
⇒ match each rule whose premise contains a newly added literal

Matching itself can be expensive

Database indexing allows O(1) retrieval of known facts
  e.g., query Missile(x) retrieves Missile(M1)

Matching conjunctive premises against known facts is NP-hard

Forward chaining is widely used in deductive databases

Hard matching example

[Figure: the map of Australia (WA, NT, SA, Q, NSW, V, T) as a
graph-colouring CSP]

Diff(wa, nt) ∧ Diff(wa, sa) ∧
Diff(nt, q) ∧ Diff(nt, sa) ∧
Diff(q, nsw) ∧ Diff(q, sa) ∧
Diff(nsw, v) ∧ Diff(nsw, sa) ∧
Diff(v, sa) ⇒ Colorable()

Diff(Red, Blue)    Diff(Red, Green)
Diff(Green, Red)   Diff(Green, Blue)
Diff(Blue, Red)    Diff(Blue, Green)

Colorable() is inferred iff the CSP has a solution
CSPs include 3SAT as a special case, hence matching is NP-hard

Backward chaining algorithm

function FOL-BC-Ask(KB, goals, θ) returns a set of substitutions
  inputs: KB, a knowledge base
          goals, a list of conjuncts forming a query (θ already applied)
          θ, the current substitution, initially the empty substitution { }
  local variables: answers, a set of substitutions, initially empty

  if goals is empty then return {θ}
  q′ ← Subst(θ, First(goals))
  for each sentence r in KB
          where Standardize-Apart(r) = (p1 ∧ . . . ∧ pn ⇒ q)
          and θ′ ← Unify(q, q′) succeeds
      new goals ← [p1, . . . , pn | Rest(goals)]
      answers ← FOL-BC-Ask(KB, new goals, Compose(θ′, θ)) ∪ answers
  return answers
Backward chaining example

[Figure: the backward-chaining proof tree for Criminal(West), grown
depth-first, left to right. Unifying the goal with the crime rule binds
{x/West} and yields the subgoals American(West), Weapon(y),
Sells(West, y, z), Hostile(z). American(West) succeeds with { };
Weapon(y) reduces to Missile(y), which succeeds with {y/M1};
Sells(West, M1, z) unifies with the Sells rule, binding {z/Nono} and
succeeding via Missile(M1) and Owns(Nono, M1); finally Hostile(Nono)
succeeds via Enemy(Nono, America). The accumulated substitution is
{x/West, y/M1, z/Nono}.]

Properties of backward chaining

Depth-first recursive proof search: space is linear in size of proof

Incomplete due to infinite loops
  ⇒ fix by checking current goal against every goal on stack

Inefficient due to repeated subgoals (both success and failure)
  ⇒ fix using caching of previous results (extra space!)

Widely used (without improvements!) for logic programming

Logic programming

Sound bite: computation as inference on logical KBs

  Logic programming                       Ordinary programming
  1. Identify problem                     Identify problem
  2. Assemble information                 Assemble information
  3. Tea break                            Figure out solution
  4. Encode information in KB             Program solution
  5. Encode problem instance as facts     Encode problem instance as data
  6. Ask queries                          Apply program to data
  7. Find false facts                     Debug procedural errors

Should be easier to debug Capital(NewYork, US) than x := x + 2 !
Prolog systems

Basis: backward chaining with Horn clauses + bells & whistles
Widely used in Europe, Japan (basis of 5th Generation project)
Compilation techniques ⇒ approaching a billion LIPS

Program = set of clauses = head :- literal1, . . . , literaln.

  criminal(X) :- american(X), weapon(Y), sells(X,Y,Z), hostile(Z).

Efficient unification by open coding
Efficient retrieval of matching clauses by direct linking
Depth-first, left-to-right backward chaining
Built-in predicates for arithmetic etc., e.g., X is Y*Z+3
Closed-world assumption (“negation as failure”)
  e.g., given alive(X) :- not dead(X).
  alive(joe) succeeds if dead(joe) fails

Prolog examples

Depth-first search from a start state X:

  dfs(X) :- goal(X).
  dfs(X) :- successor(X,S), dfs(S).

No need to loop over S: successor succeeds for each

Appending two lists to produce a third:

  append([],Y,Y).
  append([X|L],Y,[X|Z]) :- append(L,Y,Z).

query:   append(A,B,[1,2]) ?
answers: A=[] B=[1,2]
         A=[1] B=[2]
         A=[1,2] B=[]

Resolution: brief summary

Full first-order version:
    ℓ1 ∨ · · · ∨ ℓk,          m1 ∨ · · · ∨ mn
  ─────────────────────────────────────────────────────────────────────────
  (ℓ1 ∨ · · · ∨ ℓi−1 ∨ ℓi+1 ∨ · · · ∨ ℓk ∨ m1 ∨ · · · ∨ mj−1 ∨ mj+1 ∨ · · · ∨ mn)θ
where Unify(ℓi, ¬mj) = θ.

For example,
    ¬Rich(x) ∨ Unhappy(x),   Rich(Ken)
    ──────────────────────────────────
              Unhappy(Ken)
with θ = {x/Ken}

Apply resolution steps to CNF(KB ∧ ¬α); complete for FOL

Conversion to CNF

Everyone who loves all animals is loved by someone:
  ∀ x [∀ y Animal(y) ⇒ Loves(x, y)] ⇒ [∃ y Loves(y, x)]

1. Eliminate biconditionals and implications
  ∀ x [¬∀ y ¬Animal(y) ∨ Loves(x, y)] ∨ [∃ y Loves(y, x)]

2. Move ¬ inwards: ¬∀ x p ≡ ∃ x ¬p,  ¬∃ x p ≡ ∀ x ¬p:
  ∀ x [∃ y ¬(¬Animal(y) ∨ Loves(x, y))] ∨ [∃ y Loves(y, x)]
  ∀ x [∃ y ¬¬Animal(y) ∧ ¬Loves(x, y)] ∨ [∃ y Loves(y, x)]
  ∀ x [∃ y Animal(y) ∧ ¬Loves(x, y)] ∨ [∃ y Loves(y, x)]
Conversion to CNF contd.

3. Standardize variables: each quantifier should use a different one
  ∀ x [∃ y Animal(y) ∧ ¬Loves(x, y)] ∨ [∃ z Loves(z, x)]

4. Skolemize: a more general form of existential instantiation.
Each existential variable is replaced by a Skolem function
of the enclosing universally quantified variables:
  ∀ x [Animal(F(x)) ∧ ¬Loves(x, F(x))] ∨ Loves(G(x), x)

5. Drop universal quantifiers:
  [Animal(F(x)) ∧ ¬Loves(x, F(x))] ∨ Loves(G(x), x)

6. Distribute ∧ over ∨:
  [Animal(F(x)) ∨ Loves(G(x), x)] ∧ [¬Loves(x, F(x)) ∨ Loves(G(x), x)]

Resolution proof: definite clauses

[Figure: a resolution refutation of the crime KB starting from
¬Criminal(West). Each step resolves away one literal: the negated goal
resolves with the clause form of the crime rule; then with
American(West); then Weapon(y) is resolved via ¬Missile(x) ∨ Weapon(x)
and Missile(M1); then Sells(West, M1, z) via the Sells clause,
Missile(M1) and Owns(Nono, M1); then Hostile(Nono) via
¬Enemy(x, America) ∨ Hostile(x) and Enemy(Nono, America), yielding the
empty clause.]

Uncertainty

Chapter 13

Outline
♦ Uncertainty
♦ Probability
♦ Syntax and Semantics
♦ Inference
♦ Independence and Bayes’ Rule
Uncertainty

Let action At = leave for airport t minutes before flight
Will At get me there on time?

Problems:
1) partial observability (road state, other drivers’ plans, etc.)
2) noisy sensors (KCBS traffic reports)
3) uncertainty in action outcomes (flat tire, etc.)
4) immense complexity of modelling and predicting traffic

Hence a purely logical approach either
1) risks falsehood: “A25 will get me there on time”
or 2) leads to conclusions that are too weak for decision making:
“A25 will get me there on time if there’s no accident on the bridge
and it doesn’t rain and my tires remain intact etc. etc.”

(A1440 might reasonably be said to get me there on time
but I’d have to stay overnight in the airport . . .)

Methods for handling uncertainty

Default or nonmonotonic logic:
  Assume my car does not have a flat tire
  Assume A25 works unless contradicted by evidence
Issues: What assumptions are reasonable? How to handle contradiction?

Rules with fudge factors:
  A25 ↦0.3 AtAirportOnTime
  Sprinkler ↦0.99 WetGrass
  WetGrass ↦0.7 Rain
Issues: Problems with combination, e.g., Sprinkler causes Rain??

Probability
  Given the available evidence,
  A25 will get me there on time with probability 0.04
  Mahaviracarya (9th C.), Cardano (1565): theory of gambling

(Fuzzy logic handles degree of truth NOT uncertainty, e.g.,
WetGrass is true to degree 0.2)

Probability

Probabilistic assertions summarize effects of
  laziness: failure to enumerate exceptions, qualifications, etc.
  ignorance: lack of relevant facts, initial conditions, etc.

Subjective or Bayesian probability:
Probabilities relate propositions to one’s own state of knowledge
  e.g., P(A25|no reported accidents) = 0.06

These are not claims of a “probabilistic tendency” in the current situation
(but might be learned from past experience of similar situations)

Probabilities of propositions change with new evidence:
  e.g., P(A25|no reported accidents, 5 a.m.) = 0.15

(Analogous to logical entailment status KB |= α, not truth.)

Making decisions under uncertainty

Suppose I believe the following:
  P(A25 gets me there on time| . . .)   = 0.04
  P(A90 gets me there on time| . . .)   = 0.70
  P(A120 gets me there on time| . . .)  = 0.95
  P(A1440 gets me there on time| . . .) = 0.9999

Which action to choose?

Depends on my preferences for missing flight vs. airport cuisine, etc.

Utility theory is used to represent and infer preferences

Decision theory = utility theory + probability theory
Probability basics

Begin with a set Ω—the sample space
  e.g., 6 possible rolls of a die.
ω ∈ Ω is a sample point/possible world/atomic event

A probability space or probability model is a sample space
with an assignment P(ω) for every ω ∈ Ω s.t.
  0 ≤ P(ω) ≤ 1
  Σω P(ω) = 1
e.g., P(1) = P(2) = P(3) = P(4) = P(5) = P(6) = 1/6.

An event A is any subset of Ω
  P(A) = Σ{ω∈A} P(ω)
E.g., P(die roll < 4) = P(1) + P(2) + P(3) = 1/6 + 1/6 + 1/6 = 1/2

Random variables

A random variable is a function from sample points to some range, e.g., the
reals or Booleans
  e.g., Odd(1) = true.

P induces a probability distribution for any r.v. X:
  P(X = xi) = Σ{ω : X(ω) = xi} P(ω)
e.g., P(Odd = true) = P(1) + P(3) + P(5) = 1/6 + 1/6 + 1/6 = 1/2
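These definitions are directly executable. A tiny Python sketch of the die example (the names are illustrative only):

omega = {1, 2, 3, 4, 5, 6}
P = {w: 1/6 for w in omega}          # P(ω) for each sample point

def prob(event):
    # P(A) = Σ_{ω∈A} P(ω) for an event A ⊆ Ω
    return sum(P[w] for w in event)

odd = lambda w: w % 2 == 1           # a Boolean random variable

print(prob({w for w in omega if w < 4}))     # P(die roll < 4) = 0.5
print(prob({w for w in omega if odd(w)}))    # P(Odd = true)   = 0.5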

Propositions

Think of a proposition as the event (set of sample points)
where the proposition is true

Given Boolean random variables A and B:
  event a = set of sample points where A(ω) = true
  event ¬a = set of sample points where A(ω) = false
  event a ∧ b = points where A(ω) = true and B(ω) = true

Often in AI applications, the sample points are defined
by the values of a set of random variables, i.e., the
sample space is the Cartesian product of the ranges of the variables

With Boolean variables, sample point = propositional logic model
  e.g., A = true, B = false, or a ∧ ¬b.

Proposition = disjunction of atomic events in which it is true
  e.g., (a ∨ b) ≡ (¬a ∧ b) ∨ (a ∧ ¬b) ∨ (a ∧ b)
  ⇒ P(a ∨ b) = P(¬a ∧ b) + P(a ∧ ¬b) + P(a ∧ b)

Why use probability?

The definitions imply that certain logically related events must have related
probabilities

E.g., P(a ∨ b) = P(a) + P(b) − P(a ∧ b)

[Figure: Venn diagram of overlapping events a and b inside the sample
space True]

de Finetti (1931): an agent who bets according to probabilities that violate
these axioms can be forced to bet so as to lose money regardless of outcome.
Syntax for propositions

Propositional or Boolean random variables
  e.g., Cavity (do I have a cavity?)
  Cavity = true is a proposition, also written cavity

Discrete random variables (finite or infinite)
  e.g., Weather is one of ⟨sunny, rain, cloudy, snow⟩
  Weather = rain is a proposition
  Values must be exhaustive and mutually exclusive

Continuous random variables (bounded or unbounded)
  e.g., Temp = 21.6; also allow, e.g., Temp < 22.0.

Arbitrary Boolean combinations of basic propositions

Prior probability

Prior or unconditional probabilities of propositions
  e.g., P(Cavity = true) = 0.1 and P(Weather = sunny) = 0.72
correspond to belief prior to arrival of any (new) evidence

Probability distribution gives values for all possible assignments:
  P(Weather) = ⟨0.72, 0.1, 0.08, 0.1⟩ (normalized, i.e., sums to 1)

Joint probability distribution for a set of r.v.s gives the
probability of every atomic event on those r.v.s (i.e., every sample point)

P(Weather, Cavity) = a 4 × 2 matrix of values:

  Weather =         sunny   rain   cloudy   snow
  Cavity = true     0.144   0.02   0.016    0.02
  Cavity = false    0.576   0.08   0.064    0.08

Every question about a domain can be answered by the joint
distribution because every event is a sum of sample points

Probability for continuous variables

Express distribution as a parameterized function of value:
  P(X = x) = U[18, 26](x) = uniform density between 18 and 26

[Figure: a flat density of height 0.125 over the interval [18, 26]]

Here P is a density; it integrates to 1.

P(X = 20.5) = 0.125 really means
  lim_{dx→0} P(20.5 ≤ X ≤ 20.5 + dx)/dx = 0.125

Gaussian density

  P(x) = (1/(σ√(2π))) e^(−(x−µ)²/(2σ²))
Conditional probability

Conditional or posterior probabilities
  e.g., P(cavity|toothache) = 0.8
  i.e., given that toothache is all I know
  NOT “if toothache then 80% chance of cavity”

(Notation for conditional distributions:
  P(Cavity|Toothache) = 2-element vector of 2-element vectors)

If we know more, e.g., cavity is also given, then we have
  P(cavity|toothache, cavity) = 1
Note: the less specific belief remains valid after more evidence arrives,
but is not always useful

New evidence may be irrelevant, allowing simplification, e.g.,
  P(cavity|toothache, 49ersWin) = P(cavity|toothache) = 0.8
This kind of inference, sanctioned by domain knowledge, is crucial

Conditional probability

Definition of conditional probability:
  P(a|b) = P(a ∧ b)/P(b)   if P(b) ≠ 0

Product rule gives an alternative formulation:
  P(a ∧ b) = P(a|b)P(b) = P(b|a)P(a)

A general version holds for whole distributions, e.g.,
  P(Weather, Cavity) = P(Weather|Cavity)P(Cavity)
(View as a 4 × 2 set of equations, not matrix mult.)

Chain rule is derived by successive application of the product rule:
  P(X1, . . . , Xn) = P(X1, . . . , Xn−1) P(Xn|X1, . . . , Xn−1)
    = P(X1, . . . , Xn−2) P(Xn−1|X1, . . . , Xn−2) P(Xn|X1, . . . , Xn−1)
    = . . .
    = Π_{i=1}^{n} P(Xi|X1, . . . , Xi−1)

Inference by enumeration

Start with the joint distribution:

                 toothache            ¬toothache
               catch   ¬catch       catch   ¬catch
  cavity       .108    .012         .072    .008
  ¬cavity      .016    .064         .144    .576

For any proposition φ, sum the atomic events where it is true:
  P(φ) = Σ_{ω : ω |= φ} P(ω)

  P(toothache) = 0.108 + 0.012 + 0.016 + 0.064 = 0.2

  P(cavity ∨ toothache) = 0.108 + 0.012 + 0.072 + 0.008 + 0.016 + 0.064 = 0.28

Can also compute conditional probabilities:

  P(¬cavity|toothache) = P(¬cavity ∧ toothache)/P(toothache)
                       = (0.016 + 0.064)/(0.108 + 0.012 + 0.016 + 0.064)
                       = 0.4
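A short Python sketch of inference by enumeration over this joint distribution; the encoding as a dict keyed by truth assignments is our own convention:

VARS = ('cavity', 'toothache', 'catch')
JOINT = {                      # P(cavity, toothache, catch)
    (True,  True,  True):  0.108, (True,  True,  False): 0.012,
    (True,  False, True):  0.072, (True,  False, False): 0.008,
    (False, True,  True):  0.016, (False, True,  False): 0.064,
    (False, False, True):  0.144, (False, False, False): 0.576,
}

def prob(event):
    # P(event), where event fixes some variables, e.g. {'toothache': True}
    return sum(p for world, p in JOINT.items()
               if all(world[VARS.index(v)] == val for v, val in event.items()))

print(prob({'toothache': True}))                       # 0.2
print(prob({'cavity': False, 'toothache': True})
      / prob({'toothache': True}))                     # 0.4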

Normalization

[The joint distribution table above, with the toothache columns selected]

  P(Cavity|toothache) = α P(Cavity, toothache)
    = α [P(Cavity, toothache, catch) + P(Cavity, toothache, ¬catch)]
    = α [⟨0.108, 0.016⟩ + ⟨0.012, 0.064⟩]
    = α ⟨0.12, 0.08⟩ = ⟨0.6, 0.4⟩

Denominator can be viewed as a normalization constant α

General idea: compute distribution on query variable
by fixing evidence variables and summing over hidden variables

Inference by enumeration, contd.

Let X be all the variables. Typically, we want
the posterior joint distribution of the query variables Y
given specific values e for the evidence variables E

Let the hidden variables be H = X − Y − E

Then the required summation of joint entries is done by summing out the
hidden variables:
  P(Y|E = e) = α P(Y, E = e) = α Σ_h P(Y, E = e, H = h)

The terms in the summation are joint entries because Y, E, and H together
exhaust the set of random variables

Obvious problems:
1) Worst-case time complexity O(d^n) where d is the largest arity
2) Space complexity O(d^n) to store the joint distribution
3) How to find the numbers for O(d^n) entries???
Independence

A and B are independent iff
  P(A|B) = P(A)  or  P(B|A) = P(B)  or  P(A, B) = P(A)P(B)

[Figure: the four-variable model Cavity, Toothache, Catch, Weather
decomposes into a Cavity–Toothache–Catch component and a separate
Weather component]

P(Toothache, Catch, Cavity, Weather)
  = P(Toothache, Catch, Cavity) P(Weather)

32 entries reduced to 12; for n independent biased coins, 2^n → n

Absolute independence powerful but rare

Dentistry is a large field with hundreds of variables,
none of which are independent. What to do?

Conditional independence

P(Toothache, Cavity, Catch) has 2³ − 1 = 7 independent entries

If I have a cavity, the probability that the probe catches in it doesn’t depend
on whether I have a toothache:
  (1) P(catch|toothache, cavity) = P(catch|cavity)

The same independence holds if I haven’t got a cavity:
  (2) P(catch|toothache, ¬cavity) = P(catch|¬cavity)

Catch is conditionally independent of Toothache given Cavity:
  P(Catch|Toothache, Cavity) = P(Catch|Cavity)

Equivalent statements:
  P(Toothache|Catch, Cavity) = P(Toothache|Cavity)
  P(Toothache, Catch|Cavity) = P(Toothache|Cavity)P(Catch|Cavity)

Conditional independence contd.

Write out full joint distribution using chain rule:
  P(Toothache, Catch, Cavity)
    = P(Toothache|Catch, Cavity) P(Catch, Cavity)
    = P(Toothache|Catch, Cavity) P(Catch|Cavity) P(Cavity)
    = P(Toothache|Cavity) P(Catch|Cavity) P(Cavity)

I.e., 2 + 2 + 1 = 5 independent numbers (equations 1 and 2 remove 2)

In most cases, the use of conditional independence reduces the size of the
representation of the joint distribution from exponential in n to linear in n.

Conditional independence is our most basic and robust
form of knowledge about uncertain environments.

Bayes’ Rule

Product rule P(a ∧ b) = P(a|b)P(b) = P(b|a)P(a)

⇒ Bayes’ rule  P(a|b) = P(b|a)P(a)/P(b)

or in distribution form
  P(Y|X) = P(X|Y)P(Y)/P(X) = α P(X|Y)P(Y)

Useful for assessing diagnostic probability from causal probability:
  P(Cause|Effect) = P(Effect|Cause)P(Cause)/P(Effect)

E.g., let M be meningitis, S be stiff neck:
  P(m|s) = P(s|m)P(m)/P(s) = (0.8 × 0.0001)/0.1 = 0.0008

Note: posterior probability of meningitis still very small!
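The meningitis arithmetic, checked in two lines of Python (all numbers taken from the slide above):

p_s_given_m, p_m, p_s = 0.8, 0.0001, 0.1
print(p_s_given_m * p_m / p_s)   # P(m|s) = 0.0008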
Bayes’ Rule and conditional independence

P(Cavity|toothache ∧ catch)
  = α P(toothache ∧ catch|Cavity) P(Cavity)
  = α P(toothache|Cavity) P(catch|Cavity) P(Cavity)

This is an example of a naive Bayes model:
  P(Cause, Effect1, . . . , Effectn) = P(Cause) Πi P(Effecti|Cause)

[Figure: a Cause node with children Effect1 . . . Effectn, e.g.
Cavity → Toothache, Catch]

Total number of parameters is linear in n

Wumpus World

[Figure: 4 × 4 wumpus grid; squares [1,1], [1,2], [2,1] are visited
(OK), with breezes observed in [1,2] and [2,1]]

Pij = true iff [i, j] contains a pit
Bij = true iff [i, j] is breezy
Include only B1,1, B1,2, B2,1 in the probability model

Specifying the probability model

The full joint distribution is P(P1,1, . . . , P4,4, B1,1, B1,2, B2,1)

Apply product rule: P(B1,1, B1,2, B2,1 | P1,1, . . . , P4,4) P(P1,1, . . . , P4,4)
(Do it this way to get P(Effect|Cause).)

First term: 1 if pits are adjacent to breezes, 0 otherwise
Second term: pits are placed randomly, probability 0.2 per square:
  P(P1,1, . . . , P4,4) = Π_{i,j} P(Pi,j) = 0.2^n × 0.8^(16−n)
for n pits.

Observations and query

We know the following facts:
  b = ¬b1,1 ∧ b1,2 ∧ b2,1
  known = ¬p1,1 ∧ ¬p1,2 ∧ ¬p2,1

Query is P(P1,3|known, b)

Define Unknown = Pij s other than P1,3 and Known

For inference by enumeration, we have
  P(P1,3|known, b) = α Σ_{unknown} P(P1,3, unknown, known, b)

Grows exponentially with number of squares!
Using conditional independence

Basic insight: observations are conditionally independent of other hidden
squares given neighbouring hidden squares

[Figure: the 4 × 4 grid partitioned into KNOWN, FRINGE, the QUERY
square [1,3], and OTHER]

Define Unknown = Fringe ∪ Other
  P(b|P1,3, Known, Unknown) = P(b|P1,3, Known, Fringe)

Manipulate query into a form where we can use this!

Using conditional independence contd.

P(P1,3|known, b)
  = α Σ_{unknown} P(P1,3, unknown, known, b)
  = α Σ_{unknown} P(b|P1,3, known, unknown) P(P1,3, known, unknown)
  = α Σ_{fringe} Σ_{other} P(b|known, P1,3, fringe, other) P(P1,3, known, fringe, other)
  = α Σ_{fringe} Σ_{other} P(b|known, P1,3, fringe) P(P1,3, known, fringe, other)
  = α Σ_{fringe} P(b|known, P1,3, fringe) Σ_{other} P(P1,3, known, fringe, other)
  = α Σ_{fringe} P(b|known, P1,3, fringe) Σ_{other} P(P1,3) P(known) P(fringe) P(other)
  = α P(known) P(P1,3) Σ_{fringe} P(b|known, P1,3, fringe) P(fringe) Σ_{other} P(other)
  = α′ P(P1,3) Σ_{fringe} P(b|known, P1,3, fringe) P(fringe)

Using conditional independence contd.

[Figure: the consistent fringe models weighted by their priors—
0.2 × 0.2 = 0.04, 0.2 × 0.8 = 0.16, 0.8 × 0.2 = 0.16 for P1,3 = true,
and 0.04, 0.16 for P1,3 = false]

P(P1,3|known, b) = α ⟨0.2(0.04 + 0.16 + 0.16), 0.8(0.04 + 0.16)⟩
                 ≈ ⟨0.31, 0.69⟩

P(P2,2|known, b) ≈ ⟨0.86, 0.14⟩

Summary

Probability is a rigorous formalism for uncertain knowledge

Joint probability distribution specifies probability of every atomic event

Queries can be answered by summing over atomic events

For nontrivial domains, we must find a way to reduce the joint size

Independence and conditional independence provide the tools
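The ⟨0.31, 0.69⟩ answer can be checked by brute force, enumerating pit configurations over all 13 unknown squares. A Python sketch (grid encoding and names are ours; it assumes the slides’ model: pits i.i.d. with probability 0.2, breezy iff adjacent to a pit):

from itertools import product

squares = [(i, j) for i in range(1, 5) for j in range(1, 5)]
known_no_pit = {(1, 1), (1, 2), (2, 1)}
unknown = [s for s in squares if s not in known_no_pit]

def adjacent(a, b):
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1

def consistent(pits):
    # b = ¬b1,1 ∧ b1,2 ∧ b2,1: breezy iff some neighbouring square has a pit
    obs = {(1, 1): False, (1, 2): True, (2, 1): True}
    return all(any(adjacent(sq, p) for p in pits) == breezy
               for sq, breezy in obs.items())

weight = {True: 0.2, False: 0.8}
totals = {True: 0.0, False: 0.0}
for assignment in product([True, False], repeat=len(unknown)):
    pits = {s for s, has_pit in zip(unknown, assignment) if has_pit}
    if consistent(pits):
        w = 1.0
        for has_pit in assignment:
            w *= weight[has_pit]
        totals[(1, 3) in pits] += w

norm = totals[True] + totals[False]
print(totals[True] / norm, totals[False] / norm)   # ≈ 0.31, 0.69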
Learning from Observations

Chapter 18, Sections 1–3

Outline
♦ Learning agents
♦ Inductive learning
♦ Decision tree learning
♦ Measuring learning performance

Learning

Learning is essential for unknown environments,
i.e., when designer lacks omniscience

Learning is useful as a system construction method,
i.e., expose the agent to reality rather than trying to write it down

Learning modifies the agent’s decision mechanisms to improve performance

Learning agents

[Figure: learning-agent architecture. A critic compares sensor input
against a performance standard and sends feedback to the learning
element; the learning element makes changes to the performance element,
uses its knowledge, and sets learning goals; a problem generator
suggests exploratory experiments; the performance element maps percepts
to actions via effectors in the environment.]


Learning element

Design of learning element is dictated by
♦ what type of performance element is used
♦ which functional component is to be learned
♦ how that functional component is represented
♦ what kind of feedback is available

Example scenarios:

  Performance element   Component           Representation             Feedback
  Alpha-beta search     Eval. fn.           Weighted linear function   Win/loss
  Logical agent         Transition model    Successor-state axioms     Outcome
  Utility-based agent   Transition model    Dynamic Bayes net          Outcome
  Simple reflex agent   Percept-action fn   Neural net                 Correct action

Supervised learning: correct answers for each instance
Reinforcement learning: occasional rewards

Inductive learning (a.k.a. Science)

Simplest form: learn a function from examples (tabula rasa)

f is the target function

An example is a pair x, f(x), e.g., a tic-tac-toe board position paired
with the value +1

Problem: find a(n) hypothesis h
  such that h ≈ f
given a training set of examples

(This is a highly simplified model of real learning:
  – Ignores prior knowledge
  – Assumes a deterministic, observable “environment”
  – Assumes examples are given
  – Assumes that the agent wants to learn f—why?)

Inductive learning method

Construct/adjust h to agree with f on training set
(h is consistent if it agrees with f on all examples)

E.g., curve fitting:

[Figure: a sequence of fits h to the same data points f(x) vs. x—a
straight line, then progressively more complex curves, up to a
high-degree polynomial that wiggles through every point]

Ockham’s razor: maximize a combination of consistency and simplicity
Attribute-based representations

Examples described by attribute values (Boolean, discrete, continuous, etc.)
E.g., situations where I will/won’t wait for a table:

  Example  Alt  Bar  Fri  Hun  Pat    Price  Rain  Res  Type     Est    WillWait
  X1       T    F    F    T    Some   $$$    F     T    French   0–10   T
  X2       T    F    F    T    Full   $      F     F    Thai     30–60  F
  X3       F    T    F    F    Some   $      F     F    Burger   0–10   T
  X4       T    F    T    T    Full   $      F     F    Thai     10–30  T
  X5       T    F    T    F    Full   $$$    F     T    French   >60    F
  X6       F    T    F    T    Some   $$     T     T    Italian  0–10   T
  X7       F    T    F    F    None   $      T     F    Burger   0–10   F
  X8       F    F    F    T    Some   $$     T     T    Thai     0–10   T
  X9       F    T    T    F    Full   $      T     F    Burger   >60    F
  X10      T    T    T    T    Full   $$$    F     T    Italian  10–30  F
  X11      F    F    F    F    None   $      F     F    Thai     0–10   F
  X12      T    T    T    T    Full   $      F     F    Burger   30–60  T

Classification of examples is positive (T) or negative (F)

Decision trees

One possible representation for hypotheses
E.g., here is the “true” tree for deciding whether to wait:

[Figure: the “true” tree. Patrons? branches None → F, Some → T,
Full → WaitEstimate?; WaitEstimate? branches >60 → F, 30–60 →
Alternate?, 10–30 → Hungry?, 0–10 → T; the remaining branches test
Reservation?, Fri/Sat?, Alternate?, Bar?, and Raining? before reaching
T/F leaves]

Expressiveness

Decision trees can express any function of the input attributes.
E.g., for Boolean functions, truth table row → path to leaf:

  A   B   A xor B
  F   F   F
  F   T   T
  T   F   T
  T   T   F

[Figure: the corresponding tree—root A, a B test on each branch, and
leaves F, T, T, F]

Trivially, there is a consistent decision tree for any training set
w/ one path to leaf for each example (unless f nondeterministic in x)
but it probably won’t generalize to new examples

Prefer to find more compact decision trees


Hypothesis spaces

How many distinct decision trees with n Boolean attributes??
= number of Boolean functions
= number of distinct truth tables with 2^n rows = 2^(2^n)

E.g., with 6 Boolean attributes, there are 18,446,744,073,709,551,616 trees

How many purely conjunctive hypotheses (e.g., Hungry ∧ ¬Rain)??
Each attribute can be in (positive), in (negative), or out
⇒ 3^n distinct conjunctive hypotheses

More expressive hypothesis space
– increases chance that target function can be expressed
– increases number of hypotheses consistent w/ training set
⇒ may get worse predictions
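Quick sanity checks of these counts in Python:

print(2 ** (2 ** 6))   # 18446744073709551616 Boolean functions of 6 attributes
print(3 ** 6)          # 729 purely conjunctive hypotheses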

Decision tree learning

Aim: find a small tree consistent with the training examples

Idea: (recursively) choose “most significant” attribute as root of (sub)tree

function DTL(examples, attributes, default) returns a decision tree
  if examples is empty then return default
  else if all examples have the same classification then return the classification
  else if attributes is empty then return Mode(examples)
  else
      best ← Choose-Attribute(attributes, examples)
      tree ← a new decision tree with root test best
      for each value vi of best do
          examplesi ← {elements of examples with best = vi}
          subtree ← DTL(examplesi, attributes − best, Mode(examples))
          add a branch to tree with label vi and subtree subtree
      return tree

Choosing an attribute

Idea: a good attribute splits the examples into subsets that are (ideally) “all
positive” or “all negative”

[Figure: the 12 examples split on Patrons? (None/Some/Full) vs. on
Type? (French/Italian/Thai/Burger)]

Patrons? is a better choice—gives information about the classification
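A compact Python sketch of DTL, run here on the XOR table from the Expressiveness slide. Choose-Attribute is stubbed out as “take the first attribute”, since information gain is only introduced on the next slide; the dict representation and names are our own:

def mode(examples):
    labels = [e['class'] for e in examples]
    return max(set(labels), key=labels.count)

def dtl(examples, attributes, default=None):
    if not examples:
        return default
    if len({e['class'] for e in examples}) == 1:
        return examples[0]['class']
    if not attributes:
        return mode(examples)
    best = attributes[0]                    # placeholder attribute choice
    rest = [a for a in attributes if a != best]
    return (best, {v: dtl([e for e in examples if e[best] == v],
                          rest, mode(examples))
                   for v in {e[best] for e in examples}})

xor = [{'A': a, 'B': b, 'class': a != b}
       for a in (False, True) for b in (False, True)]
print(dtl(xor, ['A', 'B']))
# roughly: ('A', {False: ('B', {False: False, True: True}),
#                 True:  ('B', {False: True,  True: False})})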


Information

Information answers questions

The more clueless I am about the answer initially, the more information is
contained in the answer

Scale: 1 bit = answer to Boolean question with prior ⟨0.5, 0.5⟩

Information in an answer when prior is ⟨P1, . . . , Pn⟩ is
  H(⟨P1, . . . , Pn⟩) = Σ_{i=1}^{n} −Pi log2 Pi
(also called entropy of the prior)

Information contd.

Suppose we have p positive and n negative examples at the root
  ⇒ H(⟨p/(p+n), n/(p+n)⟩) bits needed to classify a new example
E.g., for 12 restaurant examples, p = n = 6 so we need 1 bit

An attribute splits the examples E into subsets Ei, each of which (we hope)
needs less information to complete the classification

Let Ei have pi positive and ni negative examples
  ⇒ H(⟨pi/(pi+ni), ni/(pi+ni)⟩) bits needed to classify a new example
  ⇒ expected number of bits per example over all branches is
      Σi ((pi + ni)/(p + n)) H(⟨pi/(pi+ni), ni/(pi+ni)⟩)

For Patrons?, this is 0.459 bits, for Type this is (still) 1 bit
⇒ choose the attribute that minimizes the remaining information needed
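A quick Python check of the 0.459-vs-1-bit comparison, using the (positive, negative) splits induced by the 12 restaurant examples: Patrons? gives (0,2), (4,0), (2,4) and Type? gives (1,1), (1,1), (2,2), (2,2):

from math import log2

def H(ps):
    # Entropy of a discrete prior, in bits
    return -sum(p * log2(p) for p in ps if p > 0)

def remainder(splits, total=12):
    # Expected bits still needed after splitting: splits = [(pos, neg), ...]
    return sum((p + n) / total * H((p / (p + n), n / (p + n)))
               for p, n in splits)

print(remainder([(0, 2), (4, 0), (2, 4)]))          # Patrons? ≈ 0.459
print(remainder([(1, 1), (1, 1), (2, 2), (2, 2)]))  # Type?    = 1.0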

Example contd.

Decision tree learned from the 12 examples:

[Figure: the learned tree. Patrons? branches None → F, Some → T,
Full → Hungry?; Hungry? branches No → F, Yes → Type?; Type? branches
French → T, Italian → F, Thai → Fri/Sat?, Burger → T; Fri/Sat?
branches No → F, Yes → T]

Substantially simpler than “true” tree—a more complex hypothesis isn’t
justified by a small amount of data

Performance measurement

How do we know that h ≈ f? (Hume’s Problem of Induction)

1) Use theorems of computational/statistical learning theory

2) Try h on a new test set of examples
(use same distribution over example space as training set)

Learning curve = % correct on test set as a function of training set size

[Figure: learning curve for decision-tree learning on the restaurant
data—% correct on the test set rises from about 0.4 toward 1 as the
training set grows to 100 examples]
Performance measurement contd.

Learning curve depends on
– realizable (can express target function) vs. non-realizable
  non-realizability can be due to missing attributes
  or restricted hypothesis class (e.g., thresholded linear function)
– redundant expressiveness (e.g., loads of irrelevant attributes)

[Figure: % correct vs. number of examples for the realizable,
redundant, and nonrealizable cases]

Summary

Learning needed for unknown environments, lazy designers

Learning agent = performance element + learning element

Learning method depends on type of performance element, available
feedback, type of component to be improved, and its representation

For supervised learning, the aim is to find a simple hypothesis
that is approximately consistent with training examples

Decision tree learning using information gain

Learning performance = prediction accuracy measured on test set

Introduction to Probability and Reasoning

Ramprasad S Joshi

April 8, 2018

Abstract

After studying the modules on Search and Logical Inference in an Artificial Intelligence
course, we usually enter the uncertain waters of probabilistic models and planning-learning. This
note is supposed to be a bridge from the certain world of hard computing (and haphazard heuristic
soft computing when the hard computing becomes NP-Hard) to embracing uncertainty with rigour.
We demonstrate how the language of probability theory is a convenient shorthand to model myriad
possibilities of NP-Hard computation of boolean logic, without sacrificing a lot of guarantees coming
from rigour. We use the Monty Hall Problem to demonstrate this. This is a game-show-related
paradox which defied the arguments of no less than a genius like Paul Erdös.

1 Introduction

The Monty Hall Problem is, in its bare essentials, this:

  Suppose you’re on a game show, and you’re given the choice of three doors: Behind one
  door are keys of a new car that you can win; behind the others, goats. You pick a door, say
  A, and the host, who knows what’s behind the doors, opens another door, say C, which has
  a goat. He then says to you, “Do you want to pick door B?” Is it to your advantage to switch
  your choice?

The solution is that it is advantageous to always switch, doubling the chance of winning the car, as
shown in the first known formal solution to this problem (Selvin, 1975b, see Figure 1). Let us approach
the same via propositional logic and formal modelling. The propositions in the KB are:

  p : The car door is chosen.
  q : After the show, the door is switched.

And then we take a probability distribution P on these propositions under natural assumptions such as
the probability of any door being the car door being equal: P[p] = 1/3.

  p   q   win   P
  T   T   F     P[p]P[q]
  T   F   T     P[p](1 − P[q])
  F   T   T     (1 − P[p])P[q]
  F   F   F     (1 − P[p])(1 − P[q])

Thus, with P[q] = 1 (fixed, “always switching” strategy) P[win] = 1 − P[p], which is 2/3 when P[p] = 1/3
(uniformity). Moreover (for some people somewhat counterintuitively), the lower the chance of choosing
the car door in the first choice, the better is the winning probability with a fixed switching strategy.
For example, if each of A, B, C can be the car door with probabilities a, b, c respectively and the
Contestant chooses each of them with probabilities x, y, z respectively, P[win] = 1 − (ax + by + cz), which
is 1 − (a + b + c)/3 = 1 − 1/3 = 2/3 if the Contestant chooses the doors uniformly, but otherwise it is
higher than 2/3 when the Contestant’s choice probabilities x, y, z are in reverse order of the door priors
a, b, c. Thus if [a, b, c] = [1/6, 1/3, 1/2] and [x, y, z] = [1/2, 1/3, 1/6], then P[win] = 13/18 > 2/3.

2 Formal Enumeration

Selvin (1975b) has shown that there are 9 cases (see Figure 1) of composition of original configura-
tion, first choice, showing a goat, and switching choice. Later, in response to a barrage of objections,
Selvin again (1975a) gave a proof by the more traditional Bayesian argument. Let us formalize Selvin’s
original (1975b) approach in our bridge-between-logic-and-probability way. Here we also parametrize
the probabilities (generalizing and dropping the simple fairness assumptions) as in Table 1.

[Figure 1: Selvin’s Solution]

  Description                                                Door A   Door B   Door C
  Can be car door                                            a        b        c = 1 − a − b
  Contestant first chooses                                   x        y        z = 1 − x − y
  Host shows (car door is A, Contestant also chooses it)     0        ta       1 − ta
  Host shows (car door is B, Contestant also chooses it)     tb       0        1 − tb
  Host shows (car door is C, Contestant also chooses it)     tc       1 − tc   0
  Host shows (car door is A, Contestant chose B)             0        0        1
  Host shows (car door is A, Contestant chose C)             0        1        0
  Host shows (car door is B, Contestant chose A)             0        0        1
  Host shows (car door is B, Contestant chose C)             1        0        0
  Host shows (car door is C, Contestant chose A)             0        1        0
  Host shows (car door is C, Contestant chose B)             1        0        0

  Switching probabilities for the Contestant:
  Chose   Shown   Switches
  A       B       sab
  A       C       sac
  B       A       sba
  B       C       sbc
  C       A       sca
  C       B       scb

  Table 1: Generalization and Enumeration
We express in FOL the probabilities thus:

The Car Door  P[car(A)] = a; P[car(B)] = b; P[car(C)] = c; a + b + c = 1.

The Chosen Door  P[chosen(A)] = x; P[chosen(B)] = y; P[chosen(C)] = z; x + y + z = 1.

The Shown Door  P[shown(B)|car(A) ∧ chosen(A)] = ta; P[shown(C)|car(A) ∧ chosen(A)] = 1 − ta.
  P[shown(A)|car(B) ∧ chosen(B)] = tb; P[shown(C)|car(B) ∧ chosen(B)] = 1 − tb.
  P[shown(A)|car(C) ∧ chosen(C)] = tc; P[shown(B)|car(C) ∧ chosen(C)] = 1 − tc.

Switching  P[switched|chosen(A) ∧ shown(B)] = sab; P[switched|chosen(A) ∧ shown(C)] = sac.
  P[switched|chosen(B) ∧ shown(A)] = sba; P[switched|chosen(B) ∧ shown(C)] = sbc.
  P[switched|chosen(C) ∧ shown(A)] = sca; P[switched|chosen(C) ∧ shown(B)] = scb.

Now we can state the main result:

Claim 1. Given the probabilities in Table 1 as a = b = c; x = y = z; ta = tb = tc = 1/2; s∗∗ = 1, i.e. with
fairness and uniformity assumptions, when the Contestant chooses the strategy of always switching, the
probability of winning is 2/3; otherwise, if in the same situation the Contestant never chooses to switch
(s∗∗ = 0), then the probability of winning is 1/3.

Proof. First note that, by a full joint distribution,

  P[win] = Σ_{X∈{A,B,C}} ( P[car(X) ∧ chosen(X) ∧ ¬switched] + P[car(X) ∧ ¬chosen(X) ∧ switched] )

         = Σ_{X,Y,Z∈{A,B,C}; X≠Y; Y≠Z; X≠Z} ( P[car(X) ∧ chosen(X) ∧ shown(Y) ∧ ¬switched]
             + P[car(X) ∧ chosen(X) ∧ shown(Z) ∧ ¬switched]
             + P[car(X) ∧ chosen(Y) ∧ switched] + P[car(X) ∧ chosen(Z) ∧ switched] )

         = a x ta (1 − sab) + a x (1 − ta)(1 − sac)
             + b y tb (1 − sba) + b y (1 − tb)(1 − sbc)
             + c z tc (1 − sca) + c z (1 − tc)(1 − scb)
             + a y sbc + a z scb + b x sac + b z sca + c x sab + c y sba          (1)

Substitute a = b = c = x = y = z = 1/3; t∗ = 1/2; s∗∗ = 1 (always switching); then

  P[win|switching] = ay + az + bx + bz + cx + cy = 6/(3 × 3) = 2/3.

Instead now substitute a = b = c = x = y = z = 1/3; t∗ = 1/2; s∗∗ = 0 (never switching); then
P[win|¬switching] = ax + by + cz = 3/(3 × 3) = 1/3.

It will be interesting to see the behaviour of the resulting probabilities for different values (without
fairness, uniformity and fixed strategic choice) of the parameters. But the main lesson here is about
conditional independence: just consider the case of x = y = z = 1/3 and a fixed strategy of always
switching. Regardless of the values of a, b, c (the priors of doors being car doors) or the Host’s
fairness (t∗), it turns out that

  P[win|switching] = 2 P[win|¬switching] = 2(a + b + c)/3 = 2/3.

Given uniformity in the behaviour of the Contestant, the winning probability is independent of the doors’
priors, and always switching is 100% better! Deliberately making a bad choice the first time, if such can
be made, pays off even more!

3 Maximizing the Winning Probability

Note that ax + by + cz is the probability of the Contestant choosing the car door in the first chance,
and with the fixed strategy of always switching, the winning probability is 1 − (ax + by + cz), regardless
of the Host’s choice when indeed the Contestant chooses the car door the first time. That means, to
maximize the chance of winning finally, the Contestant must minimize the chance of choosing the car
door in the first attempt! For a, b, c all distinct, when is this minimization happening? Intuitively, when
the Contestant’s guesses about a, b, c are all “wrong”. Formally, ax + by + cz is minimized when a < b < c
and x > y > z, and maximized when a < b < c and x < y < z – by the Rearrangement Inequality (Hardy
et al., 1952, Section 10.2, Theorem 368). Thus, if the Contestant can guess the car door priors better,
then actually the first choice must be made against the guess so that the Host reveals more information
to confirm the correct guess about the car door.

3.1 Generalization

If there are n > 3 doors (with ai probabilities of being the reward door), 1 reward, and the Host reveals
k < n − 1 non-reward doors, then the probability that the reward is behind one of the n − k − 1
doors not revealed to and not chosen by the Contestant is the complement of the probability that the
Contestant chose the reward door in the first choice. Thus, taking the reward door priors to be ⟨ai⟩,
the Contestant guesses of them to be ⟨xi⟩, and the switching probabilities ⟨yilj⟩, the winning probability
after (always) switching is

  P[win|switching] = (1 − Σ_i ai xi) ( Σ_i Σ_{l∈[1..n]−{i}, α∈C([1..n]−{i,l}, k)} aα xα yilα )

where α is the multi-index choosing all groups of size k from [1..n] − {l, i}. Thus, in this case, there is
a multiobjective optimization problem: the left factor is maximized by the same inverse ordering of the
Contestant’s choice against the door priors, but the right factor is minimized by the same; we need more
rigorous analysis. However, noting that the two factors are in fact governed by two different choices,
the Contestant can choose the first door according to the reversal of priors and then switch according to
the priors – then both the factors are maximized.

References

Hardy, G., Littlewood, J., and Pólya, G. (1952). Inequalities. Cambridge University Press, 2nd edition.

Selvin, S. (1975a). On the Monty Hall problem (in letters to the editor). The American Statistician,
29(3):131–134.

Selvin, S. (1975b). A problem in probability (in letters to the editor). The American Statistician,
29(1):67–71.
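A Monte Carlo sanity check of Claim 1 in Python, under the fairness and uniformity assumptions (a = b = c = x = y = z = 1/3, t∗ = 1/2); the helper names are ours:

import random

def play(switch, doors=('A', 'B', 'C')):
    car = random.choice(doors)
    first = random.choice(doors)
    # Host opens a goat door other than the contestant's choice
    shown = random.choice([d for d in doors if d != first and d != car])
    if switch:
        final = next(d for d in doors if d != first and d != shown)
    else:
        final = first
    return final == car

trials = 100_000
print(sum(play(True) for _ in range(trials)) / trials)    # ≈ 0.667
print(sum(play(False) for _ in range(trials)) / trials)   # ≈ 0.333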

the given position, if we move any single queen anywhere else in its row, except any of the corner
queens moving to the other corner in its corner, the number of conflicts grow only. That means,
the four-down-four-up solution is not found by a strategy here, unless any sequence of moves of
any length assessed in hindsight is considered a strategy.
When Many Queens Fight Over Territory
How to make M IN -C ONFLICTS work 1.1 Formal Definition
We know that two queens cannot be on the same row or column without attacking each other.
Ramprasad S Joshi Therefore, to reduce the search space, we fix that each queen is on a different column, such that
then we need to choose different rows for them that place them on different diagonals, to satisfy
April 2, 2017 all the constraints.
Definition 1.1 (N-Q UEENS). Given n, find a permutation x1 , x2 , . . . , xn of [1..n] such that
Abstract
∀i 6= j ∈ [1..n], |xi − xj |6= |i − j|.
This short note is about the assertion in AIMA(Russell and Norvig, 2010, p.221) that “on
the n-queens problem, if you don’t count the initial placement of queens, the run time of min- In this formal definition, it is implicit that the ith queen remains on the ith column. We choose
conflicts is roughly independent of problem size”. We learnt the hard way, that to verify this state- only a permutation on the rows, thus again making sure that each queen is on a unique row. Then
ment, we need to carefully implement the heuristic with full rigour.
the constraints are only on the diagonals.

1 Introduction 2 Solution Strategies


The N-Q UEENS problem is a classic pedagogical constraint satisfaction problem or csp for short. The
The observation above on the example instance and its solution brings us to the question of what
problem is to place n queens on an n × n chessboard such that no queen attacks another. In chess
constitutes a strategy, and also to concommittant questions of how to design, verify and assess
(by the usual conventional rules) a queen can attack along the row, the column and each diagonal
strategies. We will take that general question up for discussion later when we have accounted for
it is placed on. As shown in Figure 1, only the corner queens are attacking each other, but no other
the particular question at hand.
queens are attacking any other queen. In this example, it is easy to see that if the first four (in a
For the latter, let us summarize what we are studying:
The Problem The N-Q UEENS problem is a csp. There is a finite set of variable that are to be
assigned values from (finite) known domains such that any of the values do not violate
any of some fixed constraints. In this particular case, the domain is 1 to n when n is the
number of queens (also determining that n × n is the board size). The constraints constitute
at least n(n−1)
2 comparisons of absolute differences of row-columns, if we do not consider
the permutation part.
Solutions  There are two widely recognized effective solution strategies for csps in general:

1. Complete search, in which assignment of rows to queens happens in a one-queen-at-a-time fashion. BACKTRACKING SEARCH falls in this category (a minimal illustrative sketch follows this list). Whenever a "dead end" happens in the search, the most recently assigned queens lose their assignments in favour of potential improvement. This is depth-first tree search, which ensures the efficacy: that feasible assignments will be visited exhaustively. Here (deterministic) heuristics, such as trying the most constrained variables and the least constraining values first, are efficient, in that heuristic preferences, when optimal, obviate the need for exhaustive search.

2. The second strategy, of local search or iterative improvement, seeks to exploit some inherent solution-localization properties of the search space. Beginning with some initial complete assignment, such a strategy will try to improve it by some randomized heuristic choice of a variable and of a change in its assignment. This strategy is not always complete.
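As a minimal illustration of the first strategy in its plainest, heuristic-free form (a sketch assuming 0-based rows per column; backtrack is an illustrative name):

def backtrack(rows, n):
    """Depth-first complete search: place one queen per column; on a dead
    end, the most recently placed queen is retracted by the recursion."""
    col = len(rows)
    if col == n:
        return rows                      # all n queens placed without conflict
    for r in range(n):
        if all(r != rc and abs(r - rc) != col - c
               for c, rc in enumerate(rows)):
            found = backtrack(rows + [r], n)
            if found:
                return found
    return None                          # dead end: no feasible row in this column

print(backtrack([], 8))                  # e.g. [0, 4, 7, 5, 2, 6, 1, 3]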

2.1 Local Search

The local search strategy is very simple to define in its full generality; see Algorithm 1. In Line 5 of Algorithm 1, the update can depend on both the current (algorithmic) state and the current assignment. The state can include the random number generator state, or deterministic state, the number of steps passed or to go, etc. This allows such a wide variety of heuristics that it can represent plain hill-climbing, hill-climbing with different tiebreaker criteria to escape various local traps in the search, and so on. SIMULATED ANNEALING and a singleton-population, mutation-only GENETIC ALGORITHM can similarly be retrofitted. Even BACKTRACKING SEARCH can be retrofitted to this, by removal of randomization.

Algorithm 1 Local Search for NQueens
Require: n, the number of queens.
 1: Q ← RandomPermutation([1..n])
 2: State ← InitializeState(Q)
 3: while ¬(Safe(Q) ∨ Failed(State)) do
 4:     q ← ChooseRandom([1..n])
 5:     Q ← UpdateState(State, Q, q)
 6: end while
 7: return (Q, Safe(Q))

2.1.1 Local Search with MIN-CONFLICTS

One such updater is MIN-CONFLICTS. It changes the current assignment by moving only one (randomly chosen) queen, to a row in its column that gives it the minimum number of conflicts, with the choice of tiebreaker again left open to interpretation. The latter can be a source of inconsistent results, as we will see in the next section.
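Concretely, one such update might look like the following sketch (assuming the board encoding rows[c] = row of the queen in column c; the random_tie flag makes the tiebreaker policy explicit, and the function names are illustrative):

import random

def conflicts(rows, col, row):
    """Number of queens (in other columns) attacking square (row, col)."""
    return sum(1 for c, r in enumerate(rows)
               if c != col and (r == row or abs(r - row) == abs(c - col)))

def min_conflicts_step(rows, random_tie=True):
    """One MIN-CONFLICTS update: move one randomly chosen conflicted queen
    to a minimum-conflict row in its column.  Returns False when solved."""
    n = len(rows)
    conflicted = [c for c in range(n) if conflicts(rows, c, rows[c]) > 0]
    if not conflicted:
        return False                     # no conflicts: rows is a solution
    col = random.choice(conflicted)
    counts = [conflicts(rows, col, r) for r in range(n)]
    best = min(counts)
    ties = [r for r in range(n) if counts[r] == best]
    rows[col] = random.choice(ties) if random_tie else ties[0]
    return True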

3 MIN-CONFLICTS in Action

AIMA says (Russell and Norvig, 2010, p. 221):

    Amazingly, on the n-queens problem, if you don't count the initial placement of queens, the run time of min-conflicts is roughly independent of problem size. It solves even the million-queens problem in an average of 50 steps (after the initial assignment). This remarkable observation was the stimulus leading to a great deal of research in the 1990s on local search and the distinction between easy and hard problems, which we take up in Chapter 7. Roughly speaking, n-queens is easy for local search because solutions are densely distributed throughout the state space. Min-conflicts also works well for hard problems.

However, when we implemented MIN-CONFLICTS initially, we did not get such nice performance. We got something between linear and quadratic time in the number of queens n, making us wonder: what went wrong, and why?

To answer this question, we had to go back to the basics. Looking at the algorithm's description and the various implementations given in the accompanying material on the text-book's website (aima.cs.berkeley.edu), we reviewed our implementation and finally got to the root of the problem, which is the subject matter of this article.

We first describe the description-specification of the algorithm, and then turn to the implementation issues.

3.1 Description and Algorithm

The same page (Russell and Norvig, 2010, p. 221) describes MIN-CONFLICTS as follows (see Figure 2):

    function MIN-CONFLICTS(csp, max_steps) returns a solution or failure
        inputs: csp, a constraint satisfaction problem
                max_steps, the number of steps allowed before giving up
        current ← an initial complete assignment for csp
        for i = 1 to max_steps do
            if current is a solution for csp then return current
            var ← a randomly chosen conflicted variable from csp.VARIABLES
            value ← the value v for var that minimizes CONFLICTS(var, v, current, csp)
            set var = value in current
        return failure

Figure 2: The Text-Book Description

The line "value ← the value v for var that minimizes CONFLICTS(var, v, current, csp)" in the text-book's pseudo-code gives the impression that there is (always) a unique minimizing value. It does not answer (or, for that matter, even raise) the question of how to break the tie when more than one value minimizes the conflict count.

We found that, indeed, this unfortunate gloss over this important question is the root of all trouble. Let us see how.

3.2 The Resolution

First, there is the Python implementation from the text-book's web-site (see Figure 3). The assignment to value is made by a call to min_conflicts_value. And in that function, the comment, "Return the value that will give var the least number of conflicts. If there is a tie, choose at random", and its implementation, argmin_random_tie(csp.domains[var], lambda val: csp.nconflicts(var, val, current)), are a clear giveaway: the question of the missing tiebreaker is both raised and answered here. Now we know what was missing in the text-book itself.

#______________________________________________________________________________
# Min-conflicts hillclimbing search for CSPs

def min_conflicts(csp, max_steps=1000000):
    """Solve a CSP by stochastic hillclimbing on the number of conflicts."""
    # Generate a complete assignment for all vars (probably with conflicts)
    current = {}; csp.current = current
    for var in csp.vars:
        val = min_conflicts_value(csp, var, current)
        csp.assign(var, val, current)
    # Now repeatedly choose a random conflicted variable and change it
    for i in range(max_steps):
        conflicted = csp.conflicted_vars(current)
        if not conflicted:
            return current
        var = random.choice(conflicted)
        val = min_conflicts_value(csp, var, current)
        csp.assign(var, val, current)
    return None

def min_conflicts_value(csp, var, current):
    """Return the value that will give var the least number of conflicts.
    If there is a tie, choose at random."""
    return argmin_random_tie(csp.domains[var],
                             lambda val: csp.nconflicts(var, val, current))
#______________________________________________________________________________

Figure 3: Python implementation of MIN-CONFLICTS.
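The helper argmin_random_tie is defined in the utils module of the same code base and is not shown in the figure; a minimal sketch consistent with its use here would be:

import random

def argmin_random_tie(seq, fn):
    """Return an element x of seq minimizing fn(x), breaking ties
    uniformly at random."""
    best_score, best = None, []
    for x in seq:
        score = fn(x)
        if best_score is None or score < best_score:
            best_score, best = score, [x]
        elif score == best_score:
            best.append(x)
    return random.choice(best)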

3.2.1 The Solution and its Validation

Therefore, we implemented Algorithm 2, in which Line 4 makes explicit, via the variable RandomChoice, the difference between the text-book description version and the Python version given above. We implemented it in C (see Appendix A), a merely serendipitous choice; doing it in Python might have been equally useful or rewarding.

Algorithm 2 MinConflicts Hill-Climbing for NQueens
Require: n, the number of queens; RandomChoice, a boolean choice.
 1: Q ← RandomPermutation([1..n])
 2: while ¬Safe(Q) do
 3:     q ← ChooseRandom([1..n])
 4:     if RandomChoice then
 5:         Q.q ← ChooseRandom(FindMinimumConflictingRows(Q, q))
 6:     else
 7:         Q.q ← ChooseFirst(FindMinimumConflictingRows(Q, q))
 8:     end if
 9: end while

We tested this under the following conditions:

1. Compile the program with the flag "-D BADCHOICE", and test-run it for various n repeatedly.

2. Do the same without that flag during compilation.

Do the experiment and you will see for yourselves.

Validation  See the performance in Table 1 with the bad (first-minimizing) choice, without randomization.

n     Step Size    Iterations    Restarts    Unresolved Conflicts
96    1            1000          0           2
22    1            1000          0           1
13    1            1000          0           1
33    1            1000          0           2
56    1            1000          0           2
88    1            1000          0           2
99    1            1000          0           2
62    1            1000          0           1
73    1            1000          0           2
31    1            1000          0           1

Table 1: With BADCHOICE

The performance with random choice is shown in Table 2.

n     Step Size    Iterations    Restarts    Unresolved Conflicts
25    1            100           0           0
26    1            249           0           0
52    1            79            0           0
85    1            117           0           0
86    1            111           0           0
68    1            140           0           0
60    1            155           0           0
69    1            96            0           0
80    1            85            0           0
66    1            124           0           0

Table 2: Without BADCHOICE, with randomization
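A small driver along the following lines (illustrative; it reuses the min_conflicts_step sketch from Section 2.1.1 and is not expected to reproduce the exact counts of the tables) exercises both tiebreaker policies:

import random

def solve(n, random_tie, max_steps=1000):
    """Run min-conflicts from a random assignment; return the number of
    steps used, or None if max_steps is exhausted."""
    rows = [random.randrange(n) for _ in range(n)]
    for step in range(max_steps):
        if not min_conflicts_step(rows, random_tie):
            return step
    return None

for n in (25, 50, 100):
    print(n, solve(n, random_tie=True), solve(n, random_tie=False))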

A Appendix: C Implementation of Algorithm 2

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Count all pairwise conflicts in the assignment s, where s[i] is the row of
 * the queen in column i.  On return, c[i] holds the number of conflicts that
 * queen i participates in; the total count is returned. */
int Eval(int n, int *s, int *c) {
    int i, j, totc = 0;
    for (i = 0; i < n; i++) c[i] = 0;
    for (i = 0; i < n - 1; i++) {
        for (j = i + 1; j < n; j++) {
            /* same row, or same diagonal */
            if (s[i] == s[j] || abs(s[i] - s[j]) == abs(i - j)) {
                c[i]++;
                c[j]++;
                totc++;
            }
        }
    }
    return totc;
}

/* MIN-CONFLICTS hill-climbing.  Returns the number of iterations remaining
 * when a solution is found, or 0 if MaxIterations are exhausted.  Compiling
 * with -DBADCHOICE replaces random tie-breaking by "always take the first
 * minimizing row".  (Note: tempc uses O(n^2) stack space.) */
int HillClimbNQueens(int n, int *s, int MaxIterations, int *restarts, int step) {
    int val, i, j, qi, qj, min, temp;
    int *c, u[n], tempc[n][n];
    *restarts = 0;
    if (step > n) step -= n;
    for (i = 0; i < n; i++) s[i] = rand() % n;      /* random initial assignment */
    val = Eval(n, s, c = &(tempc[0][0]));
    while (MaxIterations-- > 0) {
        if (val == 0) return MaxIterations;         /* solved */
        do qi = rand() % n; while (c[qi] == 0);     /* pick a conflicted queen */
        min = val;
        qj = -1;                                    /* u[0..qj] collects minimizing rows */
        for (j = 0; j < n; j++) {
            s[qi] = j;                              /* try queen qi on row j */
            temp = Eval(n, s, &(tempc[j][0]));
            if (temp < min) {
                min = temp;
                qj = 0;
                u[qj] = j;
            }
            else if (temp == min) {
                qj++;
                u[qj] = j;
            }
        }
        if (qj >= 0) {
#ifdef BADCHOICE
            qj = 0;                                 /* deterministic: first minimizing row */
#else
            if (qj > 0) qj = rand() % (qj + 1);     /* break ties uniformly at random */
#endif
            s[qi] = u[qj];
            val = min;
            c = &(tempc[u[qj]][0]);                 /* conflict counts for the chosen row */
        }
    }
    return 0;
}

/* Usage: nqueens [n [seed [maxit [step]]]]
 * Output columns: n, step, iterations, restarts, unresolved conflicts. */
int main(int argc, char *argv[]) {
    int n = 10, maxit = 1000, seed, step = 1;
    if (argc > 1) n = atoi(argv[1]);
    if (argc > 2) seed = atoi(argv[2]); else seed = time(0);
    if (argc > 3) maxit = atoi(argv[3]);
    if (argc > 4) step = atoi(argv[4]);
    srand(seed);
    int s[n];
    int i;
    for (i = 0; i < n; i++) s[i] = rand() % n;      /* re-randomized inside HillClimbNQueens */
    int restarts;
    int tempit = maxit - HillClimbNQueens(n, s, maxit, &restarts, step);
    if (tempit < maxit) {                           /* solved: 0 unresolved conflicts */
        printf("%d\t%d\t%d\t%d\t0\n", n, step, tempit, restarts);
    }
    else {                                          /* gave up after maxit iterations */
        int c[n];
        printf("%d\t%d\t%d\t%d\t%d\n", n, step, tempit, restarts, Eval(n, s, c));
    }
    return 0;
}

References

Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, 3rd edition, 2010.
