Stochastic Convex Optimization With Bandit Feedback

Loading...
Thumbnail Image

Related Collections

Degree type

Discipline

Subject

derivative-free optimization
bandit optimization
ellipsoid method
Statistics and Probability

Funder

Grant number

License

Copyright date

Distributor

Related resources

Contributor

Abstract

This paper addresses the problem of minimizing a convex, Lipschitz function f over a convex, compact set X under a stochastic bandit (i.e., noisy zeroth-order) feedback model. In this model, the algorithm is allowed to observe noisy realizations of the function value f(x) at any query point x ∈ X. The quantity of interest is the regret of the algorithm, which is the sum of the function values at algorithm's query points minus the optimal function value. We demonstrate a generalization of the ellipsoid algorithm that incurs O(poly(d) √T) regret. Since any algorithm has regret at least Ω(√T) on this problem, our algorithm is optimal in terms of the scaling with T.

Advisor

Date Range for Data Collection (Start Date)

Date Range for Data Collection (End Date)

Digital Object Identifier

Series name and number

Publication date

2013-01-01

Journal title

SIAM Journal on Optimization

Volume number

Issue number

Publisher

Publisher DOI

relationships.isJournalIssueOf

Comments

Recommended citation

Collection