At the AMPLab, we are constantly looking for ways to improve the performance and user experience of large scale advanced …


# SparkNet

Training deep networks is a time-consuming process, with networks for object recognition often requiring multiple days to train. For this …

# CoCoA: A Framework for Distributed Optimization

A major challenge in many large-scale machine learning tasks is to solve an optimization objective involving data that is distributed …

# KeystoneML

KeystoneML is a research project exploring techniques to simplify the construction of large-scale, end-to-end machine learning pipelines. KeystoneML is designed around …

# Splash: Efficient Stochastic Learning on Clusters

Splash is a general framework for parallelizing stochastic learning algorithms (SGD, Gibbs sampling, etc.) on multi-node clusters. It consists of a …
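The core idea behind parallelizing a sequential stochastic algorithm such as SGD can be illustrated with a simple shard-and-average scheme. This is a minimal single-machine sketch, not Splash's actual API: the function names (`local_sgd`, `parallel_sgd`) and the least-squares objective are illustrative assumptions.

```python
import numpy as np

def local_sgd(X, y, w, lr=0.1, epochs=5, seed=0):
    """One 'worker': sequential SGD on its own data shard (least-squares loss)."""
    rng = np.random.default_rng(seed)
    w = w.copy()
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            grad = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5*(x·w - y)^2
            w -= lr * grad
    return w

def parallel_sgd(X, y, n_workers=4, rounds=10):
    """Shard the data, run SGD independently on each shard, average the models.

    Each round, every worker starts from the current consensus model; the
    averaged result becomes the next consensus. Real systems (Splash included)
    use more careful reweighting, but the shard/compute/combine shape is the same.
    """
    w = np.zeros(X.shape[1])
    shards = list(zip(np.array_split(X, n_workers), np.array_split(y, n_workers)))
    for r in range(rounds):
        local_models = [local_sgd(Xs, ys, w, seed=r) for Xs, ys in shards]
        w = np.mean(local_models, axis=0)
    return w
```

Naive model averaging can converge slowly when shards disagree; frameworks in this space exist largely to combine local updates more cleverly than a plain mean.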

# GraphX: Large-Scale Graph Analytics

Increasingly, data-science applications require the creation, manipulation, and analysis of large graphs ranging from social networks to language …

# Concurrency Control for Machine Learning

Many machine learning (ML) algorithms iteratively transform some global state (e.g., model parameters or variable assignments), giving the illusion of …

# MLbase: Distributed Machine Learning Made Easy

Implementing and consuming Machine Learning techniques at scale are difficult tasks for ML Developers and End Users. MLbase is a platform …

# BLB: Bootstrapping Big Data

The bootstrap provides a simple and powerful means of assessing the quality of estimators. However, in settings involving very large …
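The bag of little bootstraps (BLB) idea can be sketched for the simplest case, estimating the standard error of a sample mean: bootstrap over small subsamples, but weight each resample as if it had the full sample size. This is a minimal illustration, not the project's implementation; the function name and parameter defaults are illustrative assumptions.

```python
import numpy as np

def blb_mean_se(data, n_subsamples=5, subset_frac=0.6, n_boot=50, seed=0):
    """Bag-of-little-bootstraps estimate of the standard error of the mean.

    Each subsample has b = n**subset_frac distinct points. Within a subsample,
    multinomial weights summing to n make every resample behave like a
    full size-n bootstrap resample while touching only b points.
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    b = int(n ** subset_frac)
    se_estimates = []
    for _ in range(n_subsamples):
        subset = rng.choice(data, size=b, replace=False)
        boot_means = []
        for _ in range(n_boot):
            # A size-n resample represented as counts over only b points.
            weights = rng.multinomial(n, np.full(b, 1.0 / b))
            boot_means.append(np.dot(weights, subset) / n)
        se_estimates.append(np.std(boot_means))
    # Average the per-subsample quality assessments.
    return float(np.mean(se_estimates))
```

The appeal at scale is that each subsample (and its inner bootstrap) is independent, so the outer loop parallelizes trivially across machines while each worker holds only b data points.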

# DFC — Divide-and-Conquer Matrix Factorization

Divide-Factor-Combine (DFC) is a parallel divide-and-conquer framework for noisy matrix factorization problems, e.g., matrix completion and robust matrix factorization. DFC …
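The divide-factor-combine pattern can be sketched for the simplest setting, low-rank approximation of a noisy matrix: split the columns into blocks, factor each block independently, then stitch the pieces back together by projecting onto a shared column space. This is a rough NumPy sketch of the pattern under those assumptions, not the DFC algorithms themselves; the function name and the choice of truncated SVD as the base factorization are illustrative.

```python
import numpy as np

def dfc_proj(M, rank, n_blocks=4):
    """Divide-Factor-Combine sketch (projection-style combine step).

    Divide:  split the columns of M into n_blocks blocks.
    Factor:  rank-`rank` truncated SVD of each block (embarrassingly parallel).
    Combine: project every block's estimate onto the column space of the
             first block's estimate, then concatenate the columns.
    """
    blocks = np.array_split(M, n_blocks, axis=1)
    low_rank = []
    for B in blocks:
        U, s, Vt = np.linalg.svd(B, full_matrices=False)
        low_rank.append(U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank])
    # Orthonormal basis for the first block's estimated column space.
    U0 = np.linalg.svd(low_rank[0], full_matrices=False)[0][:, :rank]
    return np.hstack([U0 @ (U0.T @ L) for L in low_rank])
```

Because each block is factored independently, the expensive step runs in parallel on submatrices a fraction of the original size, and only the cheap combine step needs the pieces together.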