Aggregate Simulation for Planning and Inference.
Cui, Hao.
2019
-
Many algorithms for
decision making and machine learning problems are centered around the ideas of sampling
and optimization. In this thesis, we introduce a new technique, aggregate simulation,
and show how it works for decision making in Markov decision processes (MDP), decision
making in partially observable MDPs (POMDP), and inference in Bayesian networks. The
original idea of aggregate ... read moresimulation is motivated in the context of MDP planning, where
such simulation approximates the results of many sampled trajectories with a simple
algebraic calculation. This provides a symbolic representation of the estimated long
term reward which is then optimized with gradient ascent. The resulting algorithm,
symbolic online gradient based optimization for factored actions (SOGBOFA), is a
state-of-the-art planner for large MDPs. In POMDPs, observations provide partial
information on the state of the world and the agent must act using only this partial
information. We introduce a second technique, sampling networks, that enables aggregate
simulation of both the state-action trajectories and the observations. The resulting
algorithm, Sampling Networks and Aggregate simulation for POMDP (SNAP), has excellent
performance on the benchmark POMDP problems. Our final contribution builds on the
connections between aggregate simulation and approximate inference in Bayesian networks.
We introduce a new reduction and show how aggregate simulation can be used to solve
difficult Marginal MAP inference problems. The resulting algorithm, algebraic
gradient-based solver (AGS), is competitive with the state-of-the-art, and it is
especially strong in problems with hard summation sub-problems. In all these problems,
aggregate simulation provides a very efficient approximation. As our experimental
evidence shows, despite the approximation, this enables effective and high quality
solutions of large planning and inference problems, across many problem
domains.
Thesis (Ph.D.)--Tufts University, 2019.
Submitted to the Dept. of Computer Science.
Advisor: Roni Khardon.
Committee: Liping Liu, Anselm Blumer, Chris Amato, and Xiaozhe Hu.
Keyword: Computer science.read less - ID:
- 5t34sx91p
- To Cite:
- TARC Citation Guide EndNote