Estimation, Optimization, and Parallelism when Data is Sparse

We study stochastic optimization problems when the data is sparse, which is in
a sense dual to current perspectives on high-dimensional statistical learning and
optimization. We highlight both the difficulties—in terms of increased sample
complexity that sparse data necessitates—and the potential benefits, in terms of
allowing parallelism and asynchrony in the design of algorithms. Concretely, we
derive matching upper and lower bounds on the minimax rate for optimization
and learning with sparse data, and we exhibit algorithms achieving these rates.
We also show how leveraging sparsity leads to (still minimax optimal) parallel
and asynchronous algorithms, providing experimental evidence complementing
our theoretical results on several medium- to large-scale learning tasks.
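The kind of sparsity-exploiting asynchrony the abstract refers to can be illustrated with a short sketch. The following is a minimal Hogwild-style example, not the paper's exact algorithm: lock-free SGD on sparse logistic regression, where each update touches only the nonzero coordinates of its sample, so concurrent workers rarely write to the same entry of the weight vector when the data is sparse. The logistic objective, step size, thread count, and synthetic data are illustrative assumptions.

```python
# Minimal sketch (illustrative, not the paper's algorithm): lock-free SGD
# where each update touches only the nonzero coordinates of its sample.
import threading

import numpy as np
from scipy import sparse


def sparse_sgd_epoch(X, y, w, lr=0.1):
    """One SGD pass over CSR matrix X; updates only active coordinates of w."""
    n = X.shape[0]
    for i in np.random.permutation(n):
        start, end = X.indptr[i], X.indptr[i + 1]
        idx = X.indices[start:end]            # nonzero coordinates of x_i
        vals = X.data[start:end]              # corresponding values
        margin = y[i] * vals.dot(w[idx])      # y_i * <x_i, w>, sparse dot product
        grad_coef = -y[i] / (1.0 + np.exp(margin))  # logistic-loss gradient scale
        w[idx] -= lr * grad_coef * vals       # write only the active coordinates


# Demo on synthetic sparse data: several lock-free workers share one w.
rng = np.random.default_rng(0)
X = sparse.random(2000, 500, density=0.01, format="csr", random_state=0)
w_true = rng.standard_normal(500)
y = np.sign(X @ w_true + 1e-9)               # labels from a planted model

w = np.zeros(500)                             # shared parameter vector, no locks
threads = [threading.Thread(target=sparse_sgd_epoch, args=(X, y, w))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print("training accuracy:", np.mean(np.sign(X @ w) == y))
```

Because each sparse example activates only a few coordinates, two workers usually update disjoint entries of w, which is why lock-free updates remain nearly conflict-free; note that Python's GIL means this demo illustrates the access pattern rather than a true parallel speedup.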

Paper (pdf)

Authors: John Duchi, Michael Jordan, Brendan McMahan
Publication Date: December 2013
Conference: Neural Information Processing Systems (NIPS)