taxi-search — David Long

← back

taxi-search

Python · Graph Search · AI Planning · Taxi-v3 (Gymnasium)

Overview

taxi-search is an AI search framework that implements and compares classical graph search algorithms in the Taxi-v3 environment from Gymnasium.

The project focuses on building a fully self-contained state representation and search system that can solve the environment without relying on Gym’s internal transition logic during search.

Algorithms Implemented

Depth-First Search (DFS)
Breadth-First Search (BFS)
Uniform Cost Search (UCS)
A* Search (with Manhattan-distance heuristic)

Core Design

Custom state encoding/decoding of Taxi-v3 environment
State represented as a 4-element vector (taxi, passenger, destination, location)
Explicit transition system via a custom TaxiPuzzle class
Action masking implemented independently of Gymnasium
Search tree built via node expansion rather than environment stepping

A* Heuristic

Manhattan distance to passenger when not picked up
Manhattan distance to destination after pickup
Two-phase goal structure (pickup → drop-off)

Key Engineering Decisions

Decoupled search logic from Gymnasium for full control and reproducibility
Explicit node-based search tree (parent/child structure)
Priority queue ordering for optimal-path selection
Uniform cost interpretation of reward (-1 step, +20 delivery, -10 penalty)

Performance Analysis

Evaluated across 500 random environments
Metrics: search time, expansions, frontier size, final reward
A* and UCS consistently found optimal solutions
DFS produced faster but non-optimal solutions

Key Results (Summary)

A*, UCS, BFS: optimal reward (~7.69 average)
DFS: faster but suboptimal (~0.82 average reward)
A* significantly reduced node expansions vs BFS/UCS
BFS had highest frontier size and memory cost

Limitations

Fully decoupled from Gym environment (manual simulation required)
No dynamic environment stepping during search
Action mask implemented manually (no runtime env queries)
Limited to Taxi-v3 state space structure

What I Learned

Graph search behaviour differences in practice (DFS vs BFS vs UCS vs A*)
Trade-offs between optimality and computational cost
Designing state machines independent of simulation environments
Impact of heuristics on search efficiency