Department of Computer Science
Towards solving large-scale POMDP problems via spatio-temporal belief state clustering
The Markov decision process (MDP) is commonly used to model a stochastic environment for supporting optimal decision making. However, solving a large-scale MDP problem under the partially observable condition (also called a POMDP) is known to be computationally intractable. Belief compression, which reduces the dimension of the belief state, has recently been shown to be an effective way of making the problem tractable. With the conjecture that temporally close belief states should possess a low intrinsic degree of freedom due to problem regularity, this paper proposes to cluster the belief states based on a criterion function measuring the spatial and temporal differences between belief states. Further reduction of the belief state dimension can then result in a more efficient POMDP solver. The proposed method has been tested on a synthesized navigation problem (Hallway2) and empirically shown to be a promising direction towards solving large-scale POMDP problems. Some future research directions are also included.
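The abstract describes clustering belief states by a criterion that combines their spatial (belief-vector) and temporal differences, but does not give the criterion itself. As a rough illustrative sketch only, the idea can be approximated by appending a scaled time coordinate to each belief vector and running a plain k-means over the augmented features, so that Euclidean distance jointly reflects spatial and temporal separation. The function name `cluster_belief_states` and the weighting parameter `lam` are hypothetical and not taken from the paper.

```python
import numpy as np

def cluster_belief_states(beliefs, times, k=2, lam=0.1, iters=50, seed=0):
    """Toy spatio-temporal clustering of belief states (a sketch, not
    the paper's method). Each belief vector is augmented with a scaled
    time coordinate so that Euclidean distance in the augmented space
    combines spatial (belief) and temporal differences; lam weights the
    temporal term and is a hypothetical parameter."""
    beliefs = np.asarray(beliefs, dtype=float)
    t_col = lam * np.asarray(times, dtype=float)[:, None]
    X = np.hstack([beliefs, t_col])

    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each augmented belief to the nearest cluster center.
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dists.argmin(axis=1)
        # Recompute centers as the mean of their assigned points.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Toy example: beliefs over 3 hidden states from two temporally
# separated phases of a trajectory.
beliefs = [
    [0.9, 0.1, 0.0], [0.8, 0.2, 0.0],   # early steps, mass near state 0
    [0.0, 0.1, 0.9], [0.0, 0.2, 0.8],   # late steps, mass near state 2
]
times = [0, 1, 10, 11]
labels = cluster_belief_states(beliefs, times, k=2)
```

Once belief states are grouped this way, each cluster can be compressed independently (e.g. with a low-dimensional projection per cluster), which is the kind of further dimension reduction the abstract alludes to.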
Source Publication Title
IJCAI-05 Workshop Reasoning with Uncertainty in Robotics (RUR-05)
This work has been partially supported by RGC Central Allocation Group Research Grant (HKBU 2/03/C).
Li, Xin, William K. Cheung, and Jiming Liu. "Towards solving large-scale POMDP problems via spatio-temporal belief state clustering." IJCAI-05 Workshop Reasoning with Uncertainty in Robotics (RUR-05) (2005): 17-24.