← back
taxi-search
Python · Graph Search · AI Planning · Taxi-v3 (Gymnasium)
Overview
taxi-search is an AI search framework that implements and compares classical graph search
algorithms in the Taxi-v3 environment from Gymnasium.
The project focuses on building a fully self-contained state representation and search system
that can solve the environment without relying on Gym’s internal transition logic during search.
Algorithms Implemented
- Depth-First Search (DFS)
- Breadth-First Search (BFS)
- Uniform Cost Search (UCS)
- A* Search (with Manhattan-distance heuristic)
Core Design
- Custom state encoding/decoding of Taxi-v3 environment
- State represented as a 4-element vector (taxi, passenger, destination, location)
- Explicit transition system via a custom TaxiPuzzle class
- Action masking implemented independently of Gymnasium
- Search tree built via node expansion rather than environment stepping
A* Heuristic
- Manhattan distance to passenger when not picked up
- Manhattan distance to destination after pickup
- Two-phase goal structure (pickup → drop-off)
Key Engineering Decisions
- Decoupled search logic from Gymnasium for full control and reproducibility
- Explicit node-based search tree (parent/child structure)
- Priority queue ordering for optimal-path selection
- Uniform cost interpretation of reward (-1 step, +20 delivery, -10 penalty)
Performance Analysis
- Evaluated across 500 random environments
- Metrics: search time, expansions, frontier size, final reward
- A* and UCS consistently found optimal solutions
- DFS produced faster but non-optimal solutions
Key Results (Summary)
- A*, UCS, BFS: optimal reward (~7.69 average)
- DFS: faster but suboptimal (~0.82 average reward)
- A* significantly reduced node expansions vs BFS/UCS
- BFS had highest frontier size and memory cost
Limitations
- Fully decoupled from Gym environment (manual simulation required)
- No dynamic environment stepping during search
- Action mask implemented manually (no runtime env queries)
- Limited to Taxi-v3 state space structure
What I Learned
- Graph search behaviour differences in practice (DFS vs BFS vs UCS vs A*)
- Trade-offs between optimality and computational cost
- Designing state machines independent of simulation environments
- Impact of heuristics on search efficiency