Constrained Online Learning in Networks with Sublinear Regret and Fit

January 1, 2019
Abstract

In this paper, we consider a network of agents that select actions in order to minimize an objective function of which they only have local observations and subject to a set of constraints. The objective function and the constraints can vary arbitrarily over time. The selection of actions, also called a strategy, is causal and decentralized, i.e., the dynamical system that determines the actions of a given agent depends only on the constraints at the current time and on its own actions and those of its neighbors. We propose a decentralized saddle point algorithm to select the actions and we show that the network fit and regret are sublinear with respect to the time horizon of the problem. Specifically, we define the global fit of a strategy as a vector that integrates over time the global constraint violations as seen by a given node. The fit is a performance loss associated with online operation as opposed to offline clairvoyant operation which can always select an action if one exists, that satisfies the constraints at all times. If this fit grows sublinearly with the time horizon it suggests that the strategy approaches the feasible set of actions. Likewise, we define the regret of a strategy as the difference between its accumulated cost and that of the best-fixed action that one could select knowing beforehand the time evolution of the objective function. Numerical examples support the theoretical conclusions.

Download
Publication Type
Paper
Conference / Journal Name
IEEE CDC

BibTeX


@inproceedings{
    author = {},
    title = {‌Constrained Online Learning in Networks with Sublinear Regret and Fit‌},
    booktitle = {Proceedings of IEEE CDC‌},
    year = {‌2019‌}
}