CS-500: Planning in Learned Environments
Rutgers University
Fall 2007
Michael L. Littman, Lihong Li
Time: Tuesday 1:00PM-2:30PM
Place: Hill 482
Semester: Fall 2007
Course Number: 16:198:500:02
Index Number: 28630
Description
We'll be meeting to read and discuss papers on planning in learned
environments. Recent advances in model-based reinforcement learning
have made it possible to learn accurate descriptions of exponentially
large environments quickly. In this seminar, we will address the
problem of deciding what to do given these learned descriptions.
All registered students will be expected to lead a discussion.
Meeting Schedule
Paper Bank
- Classic Planning
- A linear programming heuristic for optimal planning. Tom Bylander. National Conference on Artificial Intelligence, 1997.
- Heuristic search in cyclic AND/OR graphs. Eric A. Hansen and Shlomo Zilberstein. National Conference on Artificial Intelligence, 1998. (A journal version appears in Artificial Intelligence, 129(1-2), 2001).
- Admissible heuristics for optimal planning. Patrik Haslum and Hector Geffner. International Conference on Artificial Intelligence Planning and Scheduling, 2000.
- The FF planning system: Fast plan generation through heuristic search. Jorg Hoffmann and Bernhard Nebel. Journal of Artificial Intelligence Research, 14, 2001.
- Learning planning rules in noisy stochastic worlds. Luke S. Zettlemoyer, Hanna M. Pasula, and Leslie Pack Kaelbling. National Conference on Artificial Intelligence, 2005.
- Learning partially observable action schemas. Dafna Shahaf and Eyal Amir. National Conference on Artificial Intelligence, 2006.
- Learning symbolic models of stochastic domains. Hanna M. Pasula, Luke S. Zettlemoyer, and Leslie Pack Kaelbling. Journal of Artificial Intelligence Research, 29, 2007.
- Probabilistic planning in hybrid probabilistic logic programs. Emad Saad. International Conference on Scalable Uncertainty Management, 2007.
- Planning in MDPs
- Learning to act using real-time dynamic programming. Andrew G. Barto, Steven J. Bradtke, and Satinder P. Singh. Artificial Intelligence, 72(1-2), 1995.
- The Parti-game algorithm for variable resolution reinforcement learning in multidimensional state spaces. Andrew Moore and Chris Atkeson. Machine Learning, 21(3), 1995.
- Decision-theoretic planning: Structural assumptions and computational leverage. Craig Boutilier, Thomas Dean, and Steve Hanks. Journal of Artificial Intelligence Research, 11, 1999.
- SPUDD: Stochastic planning using decision diagrams. Jesse Hoey, Robert St-Aubin, Alan Hu, and Craig Boutilier. Annual Conference on Uncertainty in Artificial Intelligence, 1999.
- Multiagent planning with factored MDPs. Carlos Guestrin, Daphne Koller, and Ronald Parr. Annual Conference on Neural Information Processing Systems, 2001.
- Distributed planning in hierarchical factored MDPs. Carlos Guestrin and Geoffrey J. Gordon. Annual Conference on Uncertainty in Artificial Intelligence, 2002.
- Motion planning through policy search. Nicholas Roy and Sebastian Thrun. IEEE/RSJ International Conference on Intelligent Robots and Systems, 2002.
- On local rewards and scaling distributed reinforcement learning. J. Andrew Bagnell and Andrew Y. Ng. Annual Conference on Neural Information Processing Systems, 2005.
- A causal approach to hierarchical decomposition of factored MDPs. Anders Jonsson and Andrew Barto. International Conference on Machine Learning, 2005.
- Learning partially observable action schemas. Dafna Shahaf and Eyal Amir. National Conference on Artificial Intelligence, 2006.
- Approximate Planning
- A sparse sampling algorithm for near-optimal planning in large Markov decision processes. Michael J. Kearns, Yishay Mansour, and Andrew Y. Ng. International Joint Conference on Artificial Intelligence, 1999.
- Approximate planning in large POMDPs via reusable trajectories. Michael Kearns, Yishay Mansour, and Andrew Y. Ng. Annual Conference on Neural Information Processing Systems, 1999.
- Efficient solution algorithms for factored MDPs. Carlos Guestrin, Daphne Koller, Ronald Parr, and Shobha Venkataraman. Journal of Artificial Intelligence Research, 19, 2003.
- The linear programming approach to approximate dynamic programming. Daniela P. de Farias and Benjamin van Roy. Operations Research, 51(6), 2003.
- An adaptive sampling algorithm for solving Markov decision processes. Hyeong Soo Chang, Michael C. Fu, Jiaqiao Hu, and Steven I. Marcus. Operations Research, 53(1), 2005.
- Bandit based Monte-Carlo planning. Levente Kocsis and Csaba Szepesvari. European Conference on Machine Learning, 2006.
- A fast analytical algorithm for solving Markov decision processes with real-valued resources. Janusz Marecki, Sven Koenig, and Milind Tambe. International Joint Conference on Artificial Intelligence, 2007.
- Misc
- Automatically generating abstractions for planning. Craig A. Knoblock. Artificial Intelligence, 68(2), 1994.
- Planning and acting in partially observable stochastic domains. Leslie Pack Kaelbling, Michael L. Littman, and Anthony R. Cassandra. Artificial Intelligence, 101(1-2), 1998.
- The computational complexity of probabilistic planning. Michael L. Littman, Judy Goldsmith, and Martin Mundhenk. Journal of Artificial Intelligence Research, 9, 1998.
- Value-function approximations for partially observable Markov decision processes. Milos Hauskrecht. Journal of Artificial Intelligence Research, 13, 2000.
- Contingent planning under uncertainty via stochastic satisfiability. Stephen M. Majercik and Michael L. Littman. Artificial Intelligence, 147(1-2), 2003.
Further Resources
For any comments or questions regarding this webpage, please email me.
Last updated: 12/01/07.