A major challenge in many large-scale machine learning tasks is to solve an optimization objective involving data that is distributed across multiple machines. In this setting, optimization methods that work well on single machines must be re-designed to leverage parallel computation while reducing communication costs. This requires developing new distributed optimization methods with both competitive practical performance and strong theoretical convergence guarantees. CoCoA is a novel framework for communication-efficient distributed optimization that meets these requirements, while allowing users to re-use arbitrary single-machine solvers locally on each node.
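To make the communication pattern concrete, here is a minimal sketch in plain Python (the released code is separate; see the links below). All names here (`local_solver`, `cocoa_round`) are hypothetical stand-ins, not the actual CoCoA API: each "machine" runs an arbitrary local solver on its own data partition, and the resulting updates are combined by averaging (as in the NIPS 2014 paper) or adding (as analyzed in the ICML 2015 paper).

import numpy as np

def local_solver(A_k, y_k, w, steps=10, lr=0.1):
    """Stand-in single-machine solver: a few gradient steps on the
    local least-squares subproblem. Returns the local model update."""
    delta = np.zeros_like(w)
    for _ in range(steps):
        grad = A_k.T @ (A_k @ (w + delta) - y_k) / len(y_k)
        delta -= lr * grad
    return delta

def cocoa_round(partitions, w, aggregation="average"):
    """One outer round: solve local subproblems (in parallel in a real
    deployment), then communicate and combine the updates."""
    K = len(partitions)
    deltas = [local_solver(A_k, y_k, w) for A_k, y_k in partitions]
    if aggregation == "average":
        return w + sum(deltas) / K   # conservative averaging (CoCoA)
    # Adding updates (CoCoA+) is more aggressive; in the actual framework
    # it requires appropriately scaled local subproblems, omitted here.
    return w + sum(deltas)

# Toy usage: least squares on data split across K = 4 machines.
rng = np.random.default_rng(0)
A, y = rng.normal(size=(400, 10)), rng.normal(size=400)
parts = [(A[i::4], y[i::4]) for i in range(4)]
w = np.zeros(10)
for _ in range(20):
    w = cocoa_round(parts, w, aggregation="average")

Note the key design choice: only the (small) model updates cross the network once per outer round, while the arbitrarily expensive local solves touch only local data.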
Related publications:
Adding vs. Averaging in Distributed Primal-Dual Optimization, ICML 2015. [pdf] [code]
Communication-Efficient Distributed Dual Coordinate Ascent, NIPS 2014. [pdf] [code]