Bandit Problems With Infinitely Many Arms

Loading...
Thumbnail Image

Related Collections

Degree type

Discipline

Subject

bandit problems
sequential experimentation
dynamic allocation of Bernoulli processes
staying with a winner
switching with a loser
Applied Statistics

Funder

Grant number

License

Copyright date

Distributor

Related resources

Contributor

Abstract

We consider a bandit problem consisting of a sequence of n choices from an infinite number of Bernoulli arms, with n → ∞. The objective is to minimize the long-run failure rate. The Bernoulli parameters are independent observations from a distribution F. We first assume F to be the uniform distribution on (0, 1) and consider various extensions. In the uniform case we show that the best lower bound for the expected failure proportion is between √2/√n and 2/√n and we exhibit classes of strategies that achieve the latter.

Advisor

Date Range for Data Collection (Start Date)

Date Range for Data Collection (End Date)

Digital Object Identifier

Series name and number

Publication date

1997

Journal title

The Annals of Statistics

Volume number

Issue number

Publisher

Publisher DOI

Journal Issues

Comments

Recommended citation

Collection