Bandit Problems With Infinitely Many Arms

We consider a bandit problem consisting of a sequence of n choices from an infinite number of Bernoulli arms, with n → ∞. The objective is to minimize the long-run failure rate. The Bernoulli parameters are independent observations from a distribution F. We first assume F to be the uniform distribution on (0, 1) and consider various extensions. In the uniform case we show that the best lower bound for the expected failure proportion is between √2/√n and 2/√n and we exhibit classes of strategies that achieve the latter.

Publication date

1997

Journal title

The Annals of Statistics

Collection

Articles

Bandit Problems With Infinitely Many Arms

Files

Related Collections

Degree type

Discipline

Subject

Funder

Grant number

License

Copyright date

Distributor

Related resources

Permalink

Author

Contributor

Abstract

Advisor

Date Range for Data Collection (Start Date)

Date Range for Data Collection (End Date)

Digital Object Identifier

Series name and number

Publication date

Journal title

Volume number

Issue number

Publisher

Publisher DOI

Journal Issues

Comments

Recommended citation

Collection

Penn's Heritage