Nov 20 2024

(Stanford AI/ML)

Search 1 - Dynamic Programming, Uniform Cost Search | Stanford CS221: AI (Autumn 2019)

Transport Problem

  • Search - model the problem first: what are the actions, the successor function Succ(s, a), the cost function Cost(s, a), the IsEnd(s) test, and the initial state
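A minimal sketch of that model for the lecture's tram/transportation example (the class and method names like `succAndCost` are my reconstruction, not the course's verbatim code):

```python
class TransportationProblem:
    """Walk from block s to s + 1 (cost 1) or take the tram from s to 2s (cost 2)."""
    def __init__(self, N):
        self.N = N                      # goal block

    def startState(self):
        return 1

    def isEnd(self, state):
        return state == self.N

    def succAndCost(self, state):
        # Returns (action, nextState, cost) triples for every legal action.
        results = []
        if state + 1 <= self.N:
            results.append(('walk', state + 1, 1))
        if 2 * state <= self.N:
            results.append(('tram', 2 * state, 2))
        return results
```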

Simple Brute Force Algos

  • Backtracking search (simplest) - start from the initial state, follow every possible action, then every next set of actions
    • b = branching factor, the number of actions available per state
    • D = maximum depth (lowercase d is the depth at which a solution is found, which can be less than D)
    • Note: This algo will go through the entire tree!
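A hedged sketch of backtracking search against the model above (reuses the assumed `succAndCost` interface from the earlier sketch):

```python
def backtrackingSearch(problem):
    """Explore the entire tree; keep the cheapest (totalCost, path) found."""
    best = {'cost': float('inf'), 'path': None}

    def recurse(state, path, totalCost):
        if problem.isEnd(state):
            if totalCost < best['cost']:
                best['cost'], best['path'] = totalCost, list(path)
            return
        for action, nextState, cost in problem.succAndCost(state):
            path.append((action, nextState))
            recurse(nextState, path, totalCost + cost)
            path.pop()                  # undo the choice: only the current path is stored

    recurse(problem.startState(), [], 0)
    return best['cost'], best['path']
```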

| Algo | Cost | Time | Space (memory) |
| --- | --- | --- | --- |
| Backtracking Search | Any | O(b^D) | O(D) (just the current path to the solution) |
| Depth First Search (great when there is an abundance of solutions) | Cost assumed to be zero (it stops once it finds a solution) | O(b^D) | O(D) |
| Breadth First Search | Constant cost c > 0 (all costs assumed equal) | O(b^d) | O(b^d) |
| DFS-Iterative Deepening (analogy: a leash let out one level at a time after exploring each level) | Constant cost c > 0 (all costs assumed equal) | O(b^d) | O(d) |

  • Note: “Any” for costs means costs can also include negative numbers
    • Always exponential time
    • Avoid exponential space with DFS-ID (the solution it finds can equal BFS’s)

Dynamic Programming (backtracking with memoization of past states, potentially exponential savings)

  • FutureCost(s) = 0 if IsEnd(s); otherwise min over actions a of [ Cost(s, a) + FutureCost(s’) ], where s’ = Succ(s, a)
  • Future cost depends on current state
    • **Store previous future costs to reuse later (if FutureCost is already computed for s, return the cached answer)**
  • Key idea: a ‘state’ is a summary of all the past actions sufficient to choose future actions optimally
  • You want to pare state attributes down to the absolute minimum to cut computation by as many factors as possible (e.g., from N^2 states to 2N)

  • LIMITATION - Does not work for Cyclic Graphs
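A minimal memoized sketch of the recurrence above (assumes the same problem interface; acyclic state graphs only, per the limitation):

```python
def dynamicProgramming(problem):
    """FutureCost(s) = 0 if IsEnd(s), else min_a [Cost(s, a) + FutureCost(Succ(s, a))]."""
    cache = {}  # state -> future cost; the memoization that gives exponential savings

    def futureCost(state):
        if problem.isEnd(state):
            return 0
        if state in cache:
            return cache[state]         # already computed for s: return cached answer
        options = [cost + futureCost(nextState)
                   for action, nextState, cost in problem.succAndCost(state)]
        cache[state] = min(options) if options else float('inf')  # dead end
        return cache[state]

    return futureCost(problem.startState())
```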

Uniform Cost Search (Dijkstra’s algorithm)

  • Can handle cyclic graphs!
  • Great for an uninformed, goal-directed agent moving around the search space
  • Here is a great video explaining Uniform Cost Search
  • Stanford talked about three groups of states:
    a) Explored - states we’ve found the optimal path to
    b) Frontier - states we’ve seen, but are still figuring out how to reach cheaply
    c) Unexplored - states we haven’t seen
  • Dijkstra’s explores all the states in the graph, whereas UCS stops once it pops the solution
  • Theorem: correctness - when a state s is popped from the Frontier and moved to Explored, its priority equals PastCost(s), the minimum cost to reach s.
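A hedged UCS sketch using Python’s `heapq` (lazy deletion of stale frontier entries instead of a decrease-key operation):

```python
import heapq

def uniformCostSearch(problem):
    """Pop states in order of PastCost; the first pop of a state is its minimum cost."""
    frontier = [(0, problem.startState())]    # (pastCost, state)
    explored = set()
    while frontier:
        pastCost, state = heapq.heappop(frontier)
        if state in explored:
            continue                          # stale entry; state already popped cheaper
        if problem.isEnd(state):
            return pastCost                   # stop at the solution (unlike full Dijkstra)
        explored.add(state)
        for action, nextState, cost in problem.succAndCost(state):
            if nextState not in explored:
                heapq.heappush(frontier, (pastCost + cost, nextState))
    return float('inf')                       # no path to an end state
```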

A* search (A-star search) - Heuristic or Informed search algorithm

  • Here is a great video explaining A* Search
  • The heuristic must never overestimate the true future cost (estimating it exactly is fine)
  • Similar to UCS, but the priority is the cost of the path so far plus the heuristic at the path’s end node: PastCost(s) + h(s)
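A hedged sketch: the same loop as UCS, but the frontier is ordered by PastCost(s) + h(s):

```python
import heapq

def aStarSearch(problem, h):
    """UCS, but order the frontier by pastCost + h(state)."""
    start = problem.startState()
    frontier = [(h(start), 0, start)]         # (priority, pastCost, state)
    explored = set()
    while frontier:
        _, pastCost, state = heapq.heappop(frontier)
        if state in explored:
            continue                          # stale frontier entry
        if problem.isEnd(state):
            return pastCost
        explored.add(state)
        for action, nextState, cost in problem.succAndCost(state):
            if nextState not in explored:
                newCost = pastCost + cost
                heapq.heappush(frontier, (newCost + h(nextState), newCost, nextState))
    return float('inf')
```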

| Algo | Cost | Time | Space (memory) |
| --- | --- | --- | --- |
| Dynamic Programming (not w/ cycles) | any | O(N) | O(N) |
| Uniform Cost Search (works w/ cycles) | costs >= 0 | O(n log n) | O(n log n) |

  • N total states, n of which are closer (cheaper to reach) than the end state
  • UCS potentially explores fewer states, but requires more overhead to maintain priority queue
  • Assume number of actions per state is constant (independent of n and N)
  • “Any” for costs means that it can also include negative numbers

Addition on 12/5/2024

  • In the [Markov Decision Processes 1 - Value Iteration video](https://youtu.be/9g32v7bK3Co?si=apVcymgoFGlQXVL-&t=4690),
    • they also circled back to the concept of RELAXING constraints to make the problem easier (so maybe it becomes a closed-form solution, an easier search problem, or independent subproblems)…and you should combine heuristics using max
    • but you are just getting a heuristic!! The problem is not done, that was just a step toward solving the problem. You use that heuristic to factor back into your A* Search problem (the costs)
    • Another example was to swap going from state 1 to N: instead, start at N and go to 1! FutureCost_rel(location)…so PastCosts of the reversed problem become the FutureCosts of the original problem (in the tram example, a walk action now takes you from s to s - 1 instead of s + 1, and a tram action takes you from s to s / 2 instead of 2s); see the sketch after this list
    • The point of all this ‘relaxation’ is that we are reducing edge costs from infinity to some finite cost
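A hedged sketch of that reversal trick on the tram example (my reading: reversed walk is s -> s - 1, reversed tram is s -> s / 2 for even s; one Dijkstra pass from N yields FutureCost for every state at once):

```python
import heapq

def reversedFutureCosts(N):
    """PastCost in the reversed problem == FutureCost in the original problem."""
    dist = {N: 0}                             # start the reversed search at the end state N
    frontier = [(0, N)]
    while frontier:
        d, s = heapq.heappop(frontier)
        if d > dist.get(s, float('inf')):
            continue                          # stale entry
        # Reversed walk: original s-1 -> s (cost 1) becomes s -> s-1.
        succs = [(s - 1, 1)] if s > 1 else []
        # Reversed tram: original s/2 -> s (cost 2) becomes s -> s/2 (even s only).
        if s % 2 == 0:
            succs.append((s // 2, 2))
        for t, c in succs:
            if d + c < dist.get(t, float('inf')):
                dist[t] = d + c
                heapq.heappush(frontier, (d + c, t))
    return dist  # dist[s] = FutureCost(s) in the original problem, usable as h(s)
```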

Definitions

  • Relaxed search problem - a relaxed search problem P_rel of a search problem P has costs that are less than or equal to the costs of the original problem: Cost_rel(s, a) <= Cost(s, a)
  • Relaxed heuristic - given a relaxed search problem, define the relaxed heuristic h(s) = FutureCost_rel(s), the minimum cost from s to an end state using Cost_rel(s, a)

Theorem: consistency of relaxed heuristics

  • Suppose h(s) = FutureCost_rel(s) for some relaxed problem P_rel.
    • THEN h(s) is a consistent heuristic !!
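Sketch of why (my own one-liner, just from the definitions above): FutureCost_rel satisfies its own Bellman inequality, and relaxed costs lower-bound the true ones:

```latex
h(s) = \mathrm{FutureCost}_{\mathrm{rel}}(s)
     \le \mathrm{Cost}_{\mathrm{rel}}(s, a) + \mathrm{FutureCost}_{\mathrm{rel}}(\mathrm{Succ}(s, a))
     \le \mathrm{Cost}(s, a) + h(\mathrm{Succ}(s, a)),
\qquad h(s_{\mathrm{end}}) = 0.
```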

How am I going to find a consistent heuristic? Well, here’s one way!

  • Pick your problem and relax it…so its costs are less than or equal to the original problem’s costs; then the FutureCost of the relaxed problem is your consistent heuristic
  • Trade-off between efficiency and tightness - don’t remove too many constraints; if the heuristic is not a good estimate of future costs, it is not a good heuristic…so keep that balance when removing constraints
  • If you are lucky enough to have two heuristics to choose from, then choose the max of them, and it will also be consistent.
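A tiny sketch of that combination (pointwise max of two consistent heuristics, feedable into the A* sketch above):

```python
def maxHeuristic(h1, h2):
    """Combine two consistent heuristics; the pointwise max is also consistent."""
    return lambda s: max(h1(s), h2(s))

# e.g., aStarSearch(problem, maxHeuristic(h1, h2)) -- at least as tight as either alone
```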

Summary

  • Structured Perceptron (reverse engineering): learn cost functions (search + learning). This converges in the sense that it does recover the actual y outputs
  • A* (star) - add in a heuristic estimate of future costs
  • Relaxation (breaking some rules): framework for producing consistent heuristics