Department of Computer Science
Improving POMDP tractability via belief compression and clustering
The partially observable Markov decision process (POMDP) is a commonly adopted mathematical framework for solving planning problems in stochastic environments. However, computing the optimal policy of a POMDP for large-scale problems is known to be intractable, and the high dimensionality of the underlying belief space is one of the major causes. In this paper, we propose a hybrid approach that integrates two different approaches for reducing the dimensionality of the belief space: 1) belief compression and 2) value-directed compression. In particular, a novel orthogonal nonnegative matrix factorization is derived for belief compression, which is then integrated into a value-directed framework for computing the policy. In addition, based on the conjecture that a properly partitioned belief space can have its per-cluster intrinsic dimension further reduced, we propose applying a k-means-like clustering technique to partition the belief space into a set of sub-POMDPs before applying the dimension reduction techniques to each of them. We have evaluated the proposed belief compression and clustering approaches on a set of benchmark problems and demonstrated their effectiveness in reducing the cost of computing policies while retaining the quality of the policies.
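As a rough illustration of the belief compression idea, the following minimal sketch factors a matrix of sampled belief vectors into two low-rank nonnegative factors. It is not the orthogonal nonnegative matrix factorization derived in the paper: it uses the standard Lee-Seung multiplicative updates as a stand-in, and the function names (`sample_beliefs`, `nmf_compress`), the synthetic belief generator, and all parameter values are assumptions made only for this example.

```python
import numpy as np


def sample_beliefs(n_beliefs, n_states, n_modes, seed=0):
    """Hypothetical belief sampler: each belief is a random mixture of a
    few fixed distributions, so the sample has low intrinsic dimension."""
    rng = np.random.default_rng(seed)
    modes = rng.dirichlet(np.full(n_states, 0.5), size=n_modes)
    weights = rng.dirichlet(np.ones(n_modes), size=n_beliefs)
    return weights @ modes  # each row is nonnegative and sums to 1


def nmf_compress(B, rank, iters=500, seed=0, eps=1e-9):
    """Approximate nonnegative B (n_beliefs x n_states) as W @ H using
    standard multiplicative updates (plain NMF, no orthogonality constraint)."""
    rng = np.random.default_rng(seed)
    n, m = B.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(iters):
        # Multiplicative updates keep both factors nonnegative.
        H *= (W.T @ B) / (W.T @ W @ H + eps)
        W *= (B @ H.T) / (W @ H @ H.T + eps)
    return W, H


if __name__ == "__main__":
    B = sample_beliefs(n_beliefs=200, n_states=10, n_modes=3)
    W, H = nmf_compress(B, rank=3)
    rel_err = np.linalg.norm(B - W @ H) / np.linalg.norm(B)
    print("compressed shape:", W.shape, "relative error:", rel_err)
```

Here each row of `W` plays the role of a compressed belief (3 coordinates instead of 10), and `H` maps compressed beliefs back to the original state space. In the setting the abstract describes, a factorization of this kind would be computed separately for each belief cluster.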
Belief clustering, Belief compression, Nonnegative matrix factorization (NMF), Partially observable Markov decision process (POMDP)
Source Publication Title
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Institute of Electrical and Electronics Engineers
Li, Xin, William K. Cheung, and Jiming Liu. "Improving POMDP tractability via belief compression and clustering." IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 40.1 (2010): 125-136.