CS-500: Bayesian Reinforcement Learning
Rutgers University
Fall 2008
Michael L. Littman, Carlos Diuk, Chris Mansley
Time: Tuesday 12:00PM-1:30PM
Place: Hill 482
Semester: Fall 2008
Course Number: 16:198:500:03
Index Number: 07899
Description
The purpose of this seminar is to meet weekly and discuss research
papers in Bayesian machine learning, with a special focus on
reinforcement learning (RL). We will focus on three types of papers.
The first type will consist of recent work that provides a good
background on Bayesian methods as applied in machine learning:
Dirichlet and Gaussian processes, infinite HMMs, hierarchical Bayesian
models, etc. The second type of papers will consist of applications of
Bayesian methods specifically to RL. The third type will involve work
on Bayesian models at the intersection of computer science and cognitive
science. It will be assumed that participants have been exposed to
basic concepts in reinforcement learning. All enrolled students will
be expected to lead at least one discussion.
Meeting Schedule
- 09/09/08: First meeting (welcome back!). Carlos will give a short
introduction, then Chris will present an introduction to
Bayes, conjugate priors and Gaussian process regression.
- Gaussian Processes for Machine Learning by C. Rasmussen and
C. Williams (Website)
- Technical Introduction: A Primer on Probabilistic Inference by
T. Griffiths and A. Yuille (PDF)
- Chris Mansley's Notes from the seminar (PDF)
- Zoubin Ghahramani UAI Lecture Notes on Non-Parametric Bayesian Methods (PDF)
- Zoubin Ghahramani videolecture on Gaussian Processes (Website)
- 09/16/08: Carlos will be presenting Dirichlet
distributions, processes and intuitive interpretations.
- Dirichlet Processes tutorial by Yee Whye Teh (PDF)
- Videolecture by Yee Whye Teh, with slides (Website)
- Videolecture by Michael Jordan, with slides (Website)
- Second part of the slides by Zoubin Ghahramani we used for GP (PDF)
- 09/23/08: Michael and Carlos presented work on using
Dirichlet distributions to model the world
- 09/30/08: John will be presenting Model-based Bayesian
Exploration
- Model based Bayesian Exploration by Dearden, Friedman and Andre (UAI99) (PDF)
- 10/07/08: Scott will be presenting a Bayesian Framework for RL
- A Bayesian Framework for Reinforcement Learning by Strens (ICML00) (PDF)
- 10/14/08: Ari will tell us how to use Gaussian Processes
for continuous RL
- Reinforcement Learning with Gaussian Processes (ICML 2005) (PDF)
- 10/21/08: Sergiu will present some material on applications
in cognitive science for non-parametric Bayesian techniques
- Intuitive Theories as Grammars for Causal Inference by Josh Tenenbaum, Tom Griffiths and Sourabh Niyogi (2007) (PDF)
- Two Proposals for Causal Grammars by Tom Griffiths and Josh Tenenbaum (2007) (PDF)
- Video Lecture by Josh T. (Website)
- Sergiu's slides (odp)
- 10/28/08: Tom will give us more details about Josh Tenenbaum et al.'s methods
- The discovery of structural form (PNAS 2008) (PDF)
- Corresponding appendix, with the important details and math (PDF)
- 11/04/08: Tom and/or Carlos will tell us about:
- 11/11/08: Suhrid will teach us MCMC:
- 11/18/08: David Wingate is visiting us from MIT and will talk about Church and the Infinite Latent Events Model (unpublished). For Church, read:
- 11/25/08: Brainstorming session
- 12/02/08: Lihong will present:
- 12/09/08: John will talk about applications of DPs. The core paper is:
Paper Bank
- Bayes for Cognition
- Nonparametric Bayes Papers
- NBP Repository (Collection of Michael Jordan's NPB papers) Website
- An Introduction to MCMC for Machine Learning, by Andrieu, De Freitas, Doucet and M. Jordan (Machine Learning, 2003)
- Hierarchical beta processes and the Indian buffet process (Describes theory behind the Indian Buffet) R. Thibaux and M. I. Jordan. Proceedings of the Conference on Artificial Intelligence and Statistics (AISTATS), 2007.
- Nonparametric empirical Bayes for the Dirichlet process mixture model (NPB and Dirichlet processes) J. D. McAuliffe, D. M. Blei and M. I. Jordan. Statistics and Computing, 16, 5-14, 2006.
- Variational methods for the Dirichlet process (More methods for Dirichlet processes) D. M. Blei and M. I. Jordan. Proceedings of the 21st International Conference on Machine Learning (ICML), 2004.
- Bayesian nonparametric latent feature models (Modeling Latent Features) Ghahramani, Z., Griffiths, T.L., Sollich, P. (2007) Bayesian Statistics 8.
- A Nonparametric Bayesian Approach to Modeling Overlapping Clusters (Clustering and Nonparametric Bayes) Heller, K.A., and Ghahramani, Z. (2007) In the Eleventh International Conference on Artificial Intelligence and Statistics (AISTATS-2007)
- Compact approximations to Bayesian predictive distributions (compact approximations to Bayesian distros) Snelson, E., and Ghahramani, Z. (2005) In Twenty-second International Conference on Machine Learning (ICML-2005).
- Infinite Latent Feature Models and the Indian Buffet Process (More on Indian Buffet) Griffiths, T.L., and Ghahramani, Z. (2006) In Advances in Neural Information Processing Systems 18 (NIPS-2005).
- The Variational Bayesian EM Algorithm for Incomplete Data: with Application to Scoring Graphical Model Structures (Bayesian with Incomplete Data) Beal, M. J. and Ghahramani, Z. (2002) In Bayesian Statistics 7
- Dirichlet Processes and Infinite HMMs
- Hierarchical Dirichlet Processes (Hierarchical Dirichlet Processes) Y. W. Teh, M. I. Jordan, M. J. Beal and D. M. Blei. Journal of the American Statistical Association, 101, 1566-1581, 2006
- Separating Precision and Mean in Dirichlet-Enhanced High-Order Markov Models (Learns an HMM using hierarchical Dirichlet) Takahashi, R. 18th European Conference on Machine Learning (ECML2007)
- The Infinite Hidden Markov Model (HMM of potentially infinite states) Beal, M. J., Z. Ghahramani and C. E. Rasmussen:
- Using Dirichlet Mixture Priors to Derive Hidden Markov Models for Protein Families (HMMs, Dirichlet, Bio App) Michael Brown, Richard Hughey, Anders Krogh, I. Saira Mian, Kimmen Sjolander, David Haussler, Intel. Sys. and Molecular Bio
- Hidden Markov Model Induction by Bayesian Model Merging (Bayesian model merging) Andreas Stolcke, Stephen Omohundro, NIPS 93
- Applied Work
- Hierarchical topic models and the nested Chinese restaurant process (NBP and Chinese Restaurant with topic models) D. Blei, T. Griffiths, M. Jordan, and J. Tenenbaum. NIPS 16 (2003)
- Hierarchical Dirichlet Processes for Tracking Maneuvering Targets (Maneuvering Target Tracking) E.B. Fox, E.B. Sudderth, A.S. Willsky, Proceedings of the International Conference on Information Fusion, Quebec, Canada, July 2007.
- Noah Goodman's Work (applied Bayesian methods) Website
- Bayesian haplotype inference via the Dirichlet process (Biology application) E. P. Xing, R. Sharan, and M. I. Jordan. Proceedings of the 21st International Conference on Machine Learning (ICML), 2004.
- Bayesian Inference for Differential Equations. Mark Girolami. Theoretical Computer Science, 2008.
- Possible Worlds Models in RL
- Bayesian RL
- Multi-Task Reinforcement Learning: A Hierarchical Bayesian Approach (bayes, multiagents, hierarchies, fun) Aaron Wilson, Alan Fern, Soumya Ray, and Prasad Tadepalli. ICML-07
- Model-based Bayesian Reinforcement Learning in Partially Observable Domains (model based bayesian rl for POMDPs) Pascal Poupart and Nikos Vlassis. AI-Math 2008
- An Analytic Solution to Discrete Bayesian Reinforcement Learning (Discrete Bayesian RL) Pascal Poupart, Nikos Vlassis, Jesse Hoey and Kevin Regan, ICML-06
- Bayesian Actor Critic Algorithms (Bayesian Actor critic) Mohammad Ghavamzadeh & Yaakov Engel. ICML-07
- Bayesian Policy Gradient Algorithms (Bayesian Policy Gradient) Mohammad Ghavamzadeh & Yaakov Engel. NIPS-06
- Model based Bayesian Exploration (model based, with exploration) Dearden, R.; Friedman, N.; Andre, D. UAI-99
- A Bayesian Framework for Reinforcement Learning (Bayesian RL) Malcolm Strens. ICML-00
- Percentile Optimization in Uncertain Markov Decision Processes with Application to Efficient Exploration (Tractable Bayesian MDP learning) Erick Delage, Shie Mannor, ICML-07
- Design for an Optimal Probe, by Michael Duff, ICML 2003
- Gaussian Processes
- Nonmyopic Active Learning of Gaussian Processes: An Exploration-Exploitation Approach (Learning a GP with active exploration) Andreas Krause, Carlos Guestrin ICML-07
- Learning to Control an Octopus Arm with Gaussian Process Temporal Difference Methods (infamous octopus arm) Yaakov Engel, Peter Szabo and Dmitry Volkinshtein, NIPS-05
- Reinforcement Learning with Gaussian Processes (General GPs and RL) Yaakov Engel, Shie Mannor, Ron Meir, ICML-05
- Bayes Meets Bellman: The Gaussian Process Approach to Temporal Difference Learning (Bayes meets Bellman) Yaakov Engel, Shie Mannor, Ron Meir, ICML-03
- Graph kernels and Gaussian processes for relational reinforcement learning (GPs with graph kernels for relational RL) Kurt Driessens, Jan Ramon, Thomas Gartner, Journal of Machine Learning-06
- Bayesian Reinforcement Learning with Gaussian Process and Temporal Difference Methods (journal version) Yaakov Engel, Shie Mannor, Ron Meir, Tech Report???
- Approximate Dynamic Programming with Gaussian Processes (General GP for Dynamic Programming / solving Bellman eqns.) Deisenroth, M. P., J. Peters and C. E. Rasmussen, American Control Conference-08
Please contact Carlos Diuk (cdiuk@cs.rutgers.edu) with any questions.