Fitted Q-Learning for Relational Domains
Reinforcement Learning, First Order Logic
We consider the problem of Approximate Dynamic Programming in relational domains. Inspired by the success of fitted Q-learning methods in propositional settings, we develop the first relational fitted Q-learning algorithms by representing the value function and Bellman residuals. When we fit the Q-functions, we show how the two steps of the Bellman operator, application and projection, can be performed using a gradient-boosting technique. Our proposed framework performs reasonably well on standard domains without using domain models and while using fewer training trajectories.
Knowledge Representation and Reasoning 2020, Fall 2020.
© The Authors, 2020
Das, S., Natarajan, S., Roy, K., Parr, R., & Kersting, K. (2020). Fitted Q-Learning for Relational Domains. arXiv preprint arXiv:2006.05595.