Latent State Recovery in Reinforcement Learning

There are three core orthogonal problems in reinforcement learning: (1) Crediting actions (2) generalizing across rich observations (3) Exploring to discover the information necessary for learning. Good solutions to pairs of these problems are fairly well known at this point, but solutions for all three are just now being discovered. I’ll discuss several such results and dive into details on a few of them.



Microsoft Research


John Langford