Day 17 - AI/ML Search Algorithms and Optimization Notes (Stanford CS221)
Nov 20 2024
(Stanford AI/ML)
Search 1 - Dynamic Programming, Uniform Cost Search | Stanford CS221: AI (Autumn 2019)
Transport Problem
- Search - model first: what are the actions, the successor function, the cost function, the IsEnd function, and the initial state?
Simple Brute Force Algos
- Backtracking search (simplest) - start from the initial state and recursively follow every possible action, then every next set of actions (see the sketch after this list)
- Branching factor is number of branches per state
- D = maximum depth of the tree (lowercase d is the depth at which the shallowest solution is found, which can be less than D)
- Note: This algo will go through the entire tree!
| Algo | Cost | Time | Space (memory) |
| ---- | ---- | ---- | ---- |
| Backtracking Search | Any | O(b^D) | O(D) (just the things I need to store to reach a solution) |
| Depth First Search (great when there is an abundance of solutions) | Costs assumed to be zero (it stops once it finds a solution) | O(b^D) | O(D) |
| Breadth First Search | Constant cost c > 0, all costs assumed equal | O(b^d) | O(b^d) |
| DFS-Iterative Deepening (analogy of a leash let out one level at a time after exploring all previous levels) | All costs assumed equal and greater than zero | O(b^d) | O(d) |
- Note: “Any” for costs means that it can also include negative numbers
- Always exponential time
- Avoid exponential space with DFS-Iterative Deepening (it can find the same solution as BFS while using only O(d) memory)
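A minimal sketch of backtracking search, assuming a hypothetical problem object with start_state(), is_end(s), and succ_and_cost(s) methods (my own naming, not the course's code):

```python
# A minimal backtracking-search sketch. The problem interface (start_state,
# is_end, succ_and_cost yielding (action, next_state, cost)) is my own naming.
def backtracking_search(problem):
    best = {"cost": float("inf"), "path": None}

    def recurse(state, path, total_cost):
        if problem.is_end(state):
            if total_cost < best["cost"]:
                best["cost"], best["path"] = total_cost, list(path)
            return
        for action, next_state, cost in problem.succ_and_cost(state):
            path.append(action)
            recurse(next_state, path, total_cost + cost)  # explores every branch: O(b^D) time
            path.pop()                                    # backtrack: only O(D) stored at a time

    recurse(problem.start_state(), [], 0)
    return best["path"], best["cost"]
```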
Dynamic Programming (backtracking with memoization of past states, potentially exponential savings)
- FutureCost(s) = 0 if IsEnd(s); otherwise FutureCost(s) = min over actions a of [Cost(s, a) + FutureCost(Succ(s, a))]
- Future cost depends on current state
- **Store previously computed future costs to reuse later** (if FutureCost has already been computed for s, return the cached answer; see the sketch after this list)
- Key idea: a ‘state’ is a summary of all the past actions sufficient to choose future actions optimally
- You want to bring the state attributes down to the absolute minimum needed to choose future actions, which can cut the computation by large factors (e.g., from N² down to 2N)
- LIMITATION - Does not work for Cyclic Graphs
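A minimal sketch of the memoized FutureCost recurrence on the tram-style problem from lecture (walk to s+1 for cost 1, take the tram to 2s for cost 2); the function names and the choice of N are my own:

```python
from functools import lru_cache

N = 10  # hypothetical goal block; start at block 1

def succ_and_cost(state):
    """Tram-problem transitions: walk to state+1 (cost 1), tram to 2*state (cost 2)."""
    results = []
    if state + 1 <= N:
        results.append(("walk", state + 1, 1))
    if 2 * state <= N:
        results.append(("tram", 2 * state, 2))
    return results

@lru_cache(maxsize=None)        # memoize: each state's FutureCost is computed once
def future_cost(state):
    if state == N:              # IsEnd(s)
        return 0
    return min(cost + future_cost(next_state)
               for _, next_state, cost in succ_and_cost(state))

print(future_cost(1))           # minimum total cost from block 1 to block N
```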
Uniform Cost Search (Dijkstra's algorithm)
- Can handle cyclic graphs!
- Great for an uninformed, goal-directed agent moving around the search space (see the sketch after this list)
- Here is a great video explaining Uniform Cost Search
- Stanford talked about three sets of states:
  - a) Explored - states we've found the optimal path to
  - b) Frontier - states we've seen but are still figuring out how to reach cheaply
  - c) Unexplored - states we haven't seen yet
- Dijkstra explores all the states in the graph, whereas this version stops when you hit the solution
- Theorem: correctness - When a state s is popped from the "Frontier" and moved to "Explored", its priority is PastCost(s), which is the minimum cost to s.
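A minimal UCS sketch using a priority queue keyed by PastCost(s), reusing the hypothetical problem interface from the backtracking sketch above; states are assumed hashable and comparable:

```python
import heapq

def uniform_cost_search(problem):
    """Return the minimum past cost to reach an end state; assumes costs >= 0."""
    frontier = [(0, problem.start_state())]  # priority = PastCost(s)
    explored = set()                          # states whose optimal past cost is known
    while frontier:
        past_cost, state = heapq.heappop(frontier)
        if state in explored:
            continue                          # stale entry: a cheaper path was already popped
        if problem.is_end(state):
            return past_cost                  # stops here instead of exploring the whole graph
        explored.add(state)
        for action, next_state, cost in problem.succ_and_cost(state):
            if next_state not in explored:
                heapq.heappush(frontier, (past_cost + cost, next_state))
    return float("inf")                       # no end state is reachable
```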
A* search (A-star search) - Heuristic or Informed search algorithm
- Here is a great video explaining A* Search
- The heuristic must never overestimate the true future cost (estimating it exactly is fine)
- Similar to UCS, but the priority is the cost of the path so far plus the heuristic value at the end node of the path (see the sketch below)
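A minimal A* sketch, assuming the same hypothetical problem interface and a consistent heuristic h; the only real change from the UCS sketch above is ordering the frontier by PastCost(s) + h(s):

```python
import heapq

def a_star_search(problem, h):
    """UCS with the frontier ordered by PastCost(s) + h(s); h must never overestimate."""
    start = problem.start_state()
    frontier = [(h(start), 0, start)]
    explored = set()
    while frontier:
        _, past_cost, state = heapq.heappop(frontier)
        if state in explored:
            continue
        if problem.is_end(state):
            return past_cost
        explored.add(state)
        for action, next_state, cost in problem.succ_and_cost(state):
            if next_state not in explored:
                g = past_cost + cost
                heapq.heappush(frontier, (g + h(next_state), g, next_state))
    return float("inf")
```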
| Algo | Cost | Time | Space (memory) |
| ---- | ---- | ---- | ---- |
| Dynamic Programming (no cycles) | Any | O(N) | O(N) |
| Uniform Cost Search (works with cycles) | Costs >= 0 | O(n log n) | O(n log n) |
- N total states, n of which are closer than the end state
- UCS potentially explores fewer states, but requires more overhead to maintain priority queue
- Assume number of actions per state is constant (independent of n and N)
- “Any” for costs means that it can also include negative numbers
Addition on 12/5/2024
- In the [Markov Decision Processes 1 - Value Iteration video](https://youtu.be/9g32v7bK3Co?si=apVcymgoFGlQXVL-&t=4690), they also circled back to the concept of RELAXING constraints to make the problem easier (so maybe it becomes a closed-form solution, an easier search problem, or independent subproblems)…and you should combine heuristics using max
- but you are just getting a heuristic!! The problem is not done; that was just a step toward solving it. You feed that heuristic back into your A* search (via the costs)
- Another example was to swap going from state 1 to N and instead start at N and go to 1! FutureCost_rel(location)…so PastCosts of the reversed problem become the FutureCosts of the original problem (so in the tram example, a walk action would now be -1 instead of plus one, and a tram action would now be s divided by 2 instead of times two); see the sketch after this list
- The point of all this ‘relaxation’ is that we are reducing edge costs from infinity to some finite cost
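A sketch of that reversal on the tram example: run a Dijkstra-style search from the goal N over the reversed edges, so PastCost of the reversed problem gives FutureCost of the original, which can then be plugged into A* as h(s). The code and edge definitions are my own reading of the example, not the lecture's code:

```python
import heapq

def reversed_future_costs(N):
    """Dijkstra from the goal N over the *reversed* tram edges
    (walk becomes s -> s-1 for cost 1, tram becomes s -> s/2 for even s, cost 2).
    PastCost of the reversed problem equals FutureCost of the original,
    so h[s] below can be fed to A* as the heuristic."""
    h = {N: 0}
    frontier = [(0, N)]
    while frontier:
        cost, s = heapq.heappop(frontier)
        if cost > h.get(s, float("inf")):
            continue                                  # stale entry
        reversed_edges = [(s - 1, 1)]                 # undo a walk
        if s % 2 == 0:
            reversed_edges.append((s // 2, 2))        # undo a tram ride
        for prev, edge_cost in reversed_edges:
            if prev >= 1 and cost + edge_cost < h.get(prev, float("inf")):
                h[prev] = cost + edge_cost
                heapq.heappush(frontier, (cost + edge_cost, prev))
    return h

print(reversed_future_costs(10)[1])  # FutureCost of block 1 in the original problem
```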
Definitions
- Relaxed search problem - a relaxed search problem P_rel of a search problem P has costs that are less than or equal to the costs of the original problem: Cost_rel(s, a) <= Cost(s, a)
- Relaxed heuristic - given a relaxed search problem, define the relaxed heuristic h(s) = FutureCost_rel(s), the minimum cost from s to an end state using Cost_rel(s, a)
Theorem: consistency of relaxed heuristics
- Suppose h(s) = FutureCost_rel(s) for some relaxed problem P_rel.
- THEN h(s) is a consistent heuristic !!
How am I going to find a consistent heuristic? Well, here’s one way!
- Pick your problem and relax it…so the relaxed cost is less than or equal to the original cost; then the FutureCost of the relaxed problem is your consistent heuristic
- Trade-off between efficiency and tightness - don't remove too many constraints; if the heuristic is not a good estimate of future costs, it is not a good heuristic…so strike that balance when removing constraints
- If you are lucky enough to have two heuristics to choose from, then take the max of them, and it will also be consistent (see the sketch below).
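A tiny sketch of combining two heuristics with max, assuming h1 and h2 are both consistent heuristic functions (hypothetical names):

```python
def combined_heuristic(h1, h2):
    """If h1 and h2 are both consistent, max(h1, h2) is also consistent
    and at least as tight as either one alone."""
    return lambda state: max(h1(state), h2(state))

# Hypothetical usage with the A* sketch above:
# h = combined_heuristic(heuristic_a, heuristic_b)
# a_star_search(problem, h)
```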
Summary
- Structured perceptron (reverse engineering): learn cost functions (search + learning). This does converge in the sense that it recovers the actual y's
- A* (star) - add in a heuristic estimate of future costs
- Relaxation (breaking some rules): framework for producing consistent heuristics