Optimal Control of Distributed Markov Decision Processes with Network Delays

  • S. Adlakha, R. Madan, S. Lall and A. Goldsmith.
  • Proceedings of the IEEE Conference on Decision and Control, pp. 3308-3314, 2007.
We consider the problem of finding an optimal feedback controller for a network of interconnected subsystems, each of which is a Markov decision process. Each subsystem is coupled to its neighbors via communication links over which signals are delayed but otherwise transmitted noise-free. One of the subsystems receives input from a controller, and the controller receives delayed state measurements from all of the subsystems. We show that an optimal controller requires only a finite amount of memory that does not grow with time, and we obtain a bound on the amount of memory that the controller needs for each subsystem. This makes the computation of an optimal controller through dynamic programming tractable. We illustrate our result with a numerical example, and show that it generalizes previous results on Markov decision processes with delayed state measurements.
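To make the finite-memory idea concrete, the following is a minimal sketch of the classical single-system case that the abstract says the result generalizes: one Markov decision process whose state is observed after a fixed delay. It is not the networked algorithm of the paper. The sketch assumes the standard augmented state (the most recently observed state together with the actions applied since), a discounted cost, and a cost that is charged when the state is revealed; the function name `delayed_value_iteration`, the array layout `P[x][u][y]`, and the toy model are all hypothetical choices made for illustration.

```python
from itertools import product


def delayed_value_iteration(P, cost, delay, gamma=0.9, tol=1e-9):
    """Value iteration for one MDP whose state is observed `delay` steps late.

    P[x][u][y] is the probability of moving from state x to y under action u,
    and cost[x][u] is the stage cost. Both layouts are assumptions of this sketch.
    """
    n_states = len(P)
    n_actions = len(P[0])
    # Augmented state: (last observed state, tuple of the `delay` actions
    # applied since then). Its size is n_states * n_actions**delay, which is
    # fixed and does not grow with time.
    aug_states = [(x, pending)
                  for x in range(n_states)
                  for pending in product(range(n_actions), repeat=delay)]
    V = {z: 0.0 for z in aug_states}
    while True:
        V_new, policy = {}, {}
        for x, pending in aug_states:
            best_q, best_a = None, None
            for a in range(n_actions):
                # Append the new action; the oldest action in the queue is
                # the one that drives the observed state forward.
                queue = pending + (a,)
                oldest, rest = queue[0], queue[1:]
                q = cost[x][oldest] + gamma * sum(
                    P[x][oldest][y] * V[(y, rest)] for y in range(n_states))
                if best_q is None or q < best_q:
                    best_q, best_a = q, a
            V_new[(x, pending)] = best_q
            policy[(x, pending)] = best_a
        diff = max(abs(V_new[z] - V[z]) for z in aug_states)
        V = V_new
        if diff < tol:
            return V, policy


# Hypothetical toy model: action a moves the system to state a with
# probability 1, and being in state 1 costs 1 per step.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[1.0, 0.0], [0.0, 1.0]]]
cost = [[0.0, 0.0],
        [1.0, 1.0]]

V, policy = delayed_value_iteration(P, cost, delay=1)
```

With `delay=0` the pending-action tuple is empty and the routine reduces to ordinary value iteration. The dynamic program runs over a table whose size depends on the delay but not on the time index, which is the sense in which bounded memory keeps the computation tractable.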