Day 17 - AI/ML Search Algorithms and Optimization Notes (Stanford CS221)
Nov 20 2024
(Stanford AI/ML)
Search 1 - Dynamic Programming, Uniform Cost Search | Stanford CS221: AI (Autumn 2019)
Transport Problem
- Search - model first: what are the actions, the successor function, the cost function, the IsEnd function, and the initial state?
Simple Brute Force Algos
- Backtracking search (simplest) - start from the initial state and recursively follow every possible action, then every next set of actions (see the sketch after this list)
- Branching factor is number of branches per state
- D = maximum depth of the tree (lowercase d is the depth at which the shallowest solution is found, which can be less than D)
- Note: This algo will go through the entire tree!
| Algo | Cost | Time | Space (memory) |
| ---- | ---- | ---- | ---- |
| Backtracking Search | Any | O(b^D) | O(D) (just the things I need to store to reach a solution) |
| Depth First Search (great when there is an abundance of solutions) | Costs assumed to be zero (it stops once it finds a solution) | O(b^D) | O(D) |
| Breadth First Search | Constant cost c > 0, all costs assumed equal | O(b^d) | O(b^d) |
| DFS-Iterative Deepening (analogy of a leash let out one level at a time after exploring all previous levels) | All costs assumed equal and greater than zero | O(b^d) | O(d) |
- Note: “Any” for costs means that it can also include negative numbers
- Always exponential time
- Avoid exponential space with DFS-Iterative Deepening (it can find the same solution as BFS while using only O(d) memory)
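A minimal sketch of backtracking search, assuming a hypothetical problem object with start_state(), is_end(s), and succ_and_cost(s) methods (my own naming, not the course's code):

```python
# A minimal backtracking-search sketch. The problem interface (start_state,
# is_end, succ_and_cost yielding (action, next_state, cost)) is my own naming.
def backtracking_search(problem):
    best = {"cost": float("inf"), "path": None}

    def recurse(state, path, total_cost):
        if problem.is_end(state):
            if total_cost < best["cost"]:
                best["cost"], best["path"] = total_cost, list(path)
            return
        for action, next_state, cost in problem.succ_and_cost(state):
            path.append(action)
            recurse(next_state, path, total_cost + cost)  # explores every branch: O(b^D) time
            path.pop()                                    # backtrack: only O(D) stored at a time

    recurse(problem.start_state(), [], 0)
    return best["path"], best["cost"]
```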
Dynamic Programming (backtracking with memoization of past states, potentially exponential savings)
- FutureCost(s) = 0 if IsEnd(s); otherwise FutureCost(s) = min over actions a of [Cost(s, a) + FutureCost(Succ(s, a))]
- Future cost depends on current state
- **Store previously computed future costs to reuse later** (if FutureCost has already been computed for s, return the cached answer; see the sketch after this list)
- Key idea: a ‘state’ is a summary of all the past actions sufficient to choose future actions optimally
- You want to bring the state attributes down to the absolute minimum needed to choose future actions, which can cut the computation by large factors (e.g., from N² down to 2N)
- LIMITATION - Does not work for Cyclic Graphs
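A minimal sketch of the memoized FutureCost recurrence on the tram-style problem from lecture (walk to s+1 for cost 1, take the tram to 2s for cost 2); the function names and the choice of N are my own:

```python
from functools import lru_cache

N = 10  # hypothetical goal block; start at block 1

def succ_and_cost(state):
    """Tram-problem transitions: walk to state+1 (cost 1), tram to 2*state (cost 2)."""
    results = []
    if state + 1 <= N:
        results.append(("walk", state + 1, 1))
    if 2 * state <= N:
        results.append(("tram", 2 * state, 2))
    return results

@lru_cache(maxsize=None)        # memoize: each state's FutureCost is computed once
def future_cost(state):
    if state == N:              # IsEnd(s)
        return 0
    return min(cost + future_cost(next_state)
               for _, next_state, cost in succ_and_cost(state))

print(future_cost(1))           # minimum total cost from block 1 to block N
```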
Uniform Cost Search (Dijkstra's algorithm)
- Can handle cyclic graphs!
- Great for an uninformed, goal-directed agent moving around the search space (see the sketch after this list)
- Here is a great video explaining Uniform Cost Search
- Stanford talked about three sets of states:
  - a) Explored - states we've found the optimal path to
  - b) Frontier - states we've seen but are still figuring out how to reach cheaply
  - c) Unexplored - states we haven't seen yet
- Dijkstra explores all the states in the graph, whereas this version stops when you hit the solution
- Theorem: correctness - When a state s is popped from the "Frontier" and moved to "Explored", its priority is PastCost(s), which is the minimum cost to s.
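A minimal UCS sketch using a priority queue keyed by PastCost(s), reusing the hypothetical problem interface from the backtracking sketch above; states are assumed hashable and comparable:

```python
import heapq

def uniform_cost_search(problem):
    """Return the minimum past cost to reach an end state; assumes costs >= 0."""
    frontier = [(0, problem.start_state())]  # priority = PastCost(s)
    explored = set()                          # states whose optimal past cost is known
    while frontier:
        past_cost, state = heapq.heappop(frontier)
        if state in explored:
            continue                          # stale entry: a cheaper path was already popped
        if problem.is_end(state):
            return past_cost                  # stops here instead of exploring the whole graph
        explored.add(state)
        for action, next_state, cost in problem.succ_and_cost(state):
            if next_state not in explored:
                heapq.heappush(frontier, (past_cost + cost, next_state))
    return float("inf")                       # no end state is reachable
```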
A* search (A-star search) - Heuristic or Informed search algorithm
- Here is a great video explaining A* Search
- The heuristic must never overestimate the true future cost (estimating it exactly is fine)
- Similar to UCS, but the priority is the cost of the path so far plus the heuristic value at the end node of the path (see the sketch below)
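A minimal A* sketch, assuming the same hypothetical problem interface and a consistent heuristic h; the only real change from the UCS sketch above is ordering the frontier by PastCost(s) + h(s):

```python
import heapq

def a_star_search(problem, h):
    """UCS with the frontier ordered by PastCost(s) + h(s); h must never overestimate."""
    start = problem.start_state()
    frontier = [(h(start), 0, start)]
    explored = set()
    while frontier:
        _, past_cost, state = heapq.heappop(frontier)
        if state in explored:
            continue
        if problem.is_end(state):
            return past_cost
        explored.add(state)
        for action, next_state, cost in problem.succ_and_cost(state):
            if next_state not in explored:
                g = past_cost + cost
                heapq.heappush(frontier, (g + h(next_state), g, next_state))
    return float("inf")
```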
| Algo | Cost | Time | Space (memory) |
| ---- | ---- | ---- | ---- |
| Dynamic Programming (no cycles) | Any | O(N) | O(N) |
| Uniform Cost Search (works with cycles) | Costs >= 0 | O(n log n) | O(n log n) |
- N total states, n of which are closer than the end state
- UCS potentially explores fewer states, but requires more overhead to maintain priority queue
- Assume number of actions per state is constant (independent of n and N)
- “Any” for costs means that it can also include negative numbers
Addition on 12/5/2024
- In the [Markov Decision Processes 1 - Value Iteration video](https://youtu.be/9g32v7bK3Co?si=apVcymgoFGlQXVL-&t=4690), they also circled back to the concept of RELAXING constraints to make the problem easier (so maybe it becomes a closed-form solution, an easier search problem, or independent subproblems)…and you should combine heuristics using max
- but you are just getting a heuristic!! The problem is not done; that was just a step toward solving it. You feed that heuristic back into your A* search (via the costs)
- Another example was to swap going from state 1 to N and instead start at N and go to 1! FutureCost_rel(location)…so PastCosts of the reversed problem become the FutureCosts of the original problem (so in the tram example, a walk action would now be -1 instead of plus one, and a tram action would now be s divided by 2 instead of times two); see the sketch after this list
- The point of all this ‘relaxation’ is that we are reducing edge costs from infinity to some finite cost
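A sketch of that reversal on the tram example: run a Dijkstra-style search from the goal N over the reversed edges, so PastCost of the reversed problem gives FutureCost of the original, which can then be plugged into A* as h(s). The code and edge definitions are my own reading of the example, not the lecture's code:

```python
import heapq

def reversed_future_costs(N):
    """Dijkstra from the goal N over the *reversed* tram edges
    (walk becomes s -> s-1 for cost 1, tram becomes s -> s/2 for even s, cost 2).
    PastCost of the reversed problem equals FutureCost of the original,
    so h[s] below can be fed to A* as the heuristic."""
    h = {N: 0}
    frontier = [(0, N)]
    while frontier:
        cost, s = heapq.heappop(frontier)
        if cost > h.get(s, float("inf")):
            continue                                  # stale entry
        reversed_edges = [(s - 1, 1)]                 # undo a walk
        if s % 2 == 0:
            reversed_edges.append((s // 2, 2))        # undo a tram ride
        for prev, edge_cost in reversed_edges:
            if prev >= 1 and cost + edge_cost < h.get(prev, float("inf")):
                h[prev] = cost + edge_cost
                heapq.heappush(frontier, (cost + edge_cost, prev))
    return h

print(reversed_future_costs(10)[1])  # FutureCost of block 1 in the original problem
```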
Definitions
- Relaxed search problem - a relaxed search problem P_rel of a search problem P has costs that are less than or equal to the costs of the original problem: Cost_rel(s, a) <= Cost(s, a)
- Relaxed heuristic - given a relaxed search problem, define the relaxed heuristic h(s) = FutureCost_rel(s), the minimum cost from s to an end state using Cost_rel(s, a)
Theorem: consistency of relaxed heuristics
- Suppose h(s) = FutureCost_rel(s) for some relaxed problem P_rel.
- THEN h(s) is a consistent heuristic !!
How am I going to find a consistent heuristic? Well, here’s one way!
- Pick your problem and relax it…so the relaxed cost is less than or equal to the original cost; then the FutureCost of the relaxed problem is your consistent heuristic
- Trade-off between efficiency and tightness - don't remove too many constraints; if the heuristic is not a good estimate of future costs, it is not a good heuristic…so strike that balance when removing constraints
- If you are lucky enough to have two heuristics to choose from, then take the max of them, and it will also be consistent (see the sketch below).
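A tiny sketch of combining two heuristics with max, assuming h1 and h2 are both consistent heuristic functions (hypothetical names):

```python
def combined_heuristic(h1, h2):
    """If h1 and h2 are both consistent, max(h1, h2) is also consistent
    and at least as tight as either one alone."""
    return lambda state: max(h1(state), h2(state))

# Hypothetical usage with the A* sketch above:
# h = combined_heuristic(heuristic_a, heuristic_b)
# a_star_search(problem, h)
```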
Summary
- Structured perceptron (reverse engineering): learn cost functions (search + learning). This does converge in the sense that it recovers the actual y's
- A* (star) - add in a heuristic estimate of future costs
- Relaxation (breaking some rules): framework for producing consistent heuristics