Matei Zaharia

Alumnus

http://www.cs.berkeley.edu/~matei/
matei@eecs.berkeley.edu
@matei_zaharia

AMPLab Publications

Matrix Computations and Optimization in Apache Spark
SparkR: Scaling R Programs with Spark
MLlib: Machine Learning in Apache Spark
FairRide: Near-Optimal, Fair Cache Sharing
Spark SQL: Relational Data Processing in Spark
Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks
Discretized Streams: Fault-Tolerant Streaming Computation at Scale
Tachyon: Memory Throughput I/O for Cluster Computing Frameworks
Sparrow: Distributed, Low Latency Scheduling
Large Scale Estimation in Cyberphysical Systems using Streaming Data: a Case Study with Smartphone Traces
Scaling the Mobile Millennium System in the Cloud
Choosy: Proportional Sharing for Datacenter Jobs with Constraints
Fast and Interactive Analytics over Hadoop Data with Spark
Shark: SQL and Rich Analytics at Scale
Discretized Streams: An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters
Faster and More Accurate Sequence Alignment with SNAP
Shark: Fast Data Analysis Using Coarse-grained Distributed Memory (Best Demo Award)
A Common Substrate for Cluster Computing
Improving MapReduce Performance in Heterogeneous Environments
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing (Best Paper Award)
The Datacenter Needs an Operating System
Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
Dominant Resource Fairness: Fair Allocation of Multiple Resource Types
Spark: Cluster Computing with Working Sets
Managing Data Transfers in Computer Clusters with Orchestra