National Science Foundation
Expeditions in Computing
AMPLab Publications
- SparkR: Scaling R Programs with Spark
- Time-Evolving Graph Processing at Scale
- Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics
- BlowFish: Dynamic Storage-Performance Tradeoff in Data Stores
- FairRide: Near-Optimal, Fair Cache Sharing
- SparkNet: Training Deep Networks on Spark
- Succinct: Enabling Queries on Compressed Data
- FastLane: Making Short Flows Shorter with Agile Drop Notification
- Low Latency, Geo-distributed Data Analytics
- Efficient Coflow Scheduling Without Prior Knowledge
- Feral Concurrency Control: An Empirical Investigation of Modern Application Integrity
- G-OLA: Generalized Online Aggregation for Interactive Analysis on Big Data
- CellIQ: Real-Time Cellular Network Analytics at Scale
- Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks
- Coordination Avoidance in Database Systems
- Discretized Streams: Fault-Tolerant Streaming Computation at Scale
- The Power of Choice in Data-Aware Cluster Scheduling
- GraphX: Graph Processing in a Distributed Dataflow Framework
- Efficient Coflow Scheduling with Varys
- Knowing When You’re Wrong: Building Fast and Reliable Approximate Query Processing Systems
- Scalable Atomic Visibility with RAMP Transactions
- Tachyon: Memory Throughput I/O for Cluster Computing Frameworks
- GraphX: Unifying Data-Parallel and Graph-Parallel Analytics
- Sparrow: Distributed, Low Latency Scheduling
- Carat: Collaborative Energy Diagnosis for Mobile Devices
- Highly Available Transactions: Virtues and Limitations
- A General Bootstrap Performance Diagnostic
- Leveraging Endpoint Flexibility in Data-Intensive Clusters
- GraphX: A Resilient Distributed Graph System on Spark
- HAT, not CAP: Towards Highly Available Transactions
- PBS at Work: Advancing Data Management with Consistency Metrics (Demo)
- Bolt-on Causal Consistency
- The Case for Tiny Tasks in Compute Clusters
- Choosy: Proportional Sharing for Datacenter Jobs with Constraints
- BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data (Best Paper Award)
- Blink and It’s Done: Interactive Queries on Very Large Data
- Fast and Interactive Analytics over Hadoop Data with Spark
- Sweet Storage SLOs with Frosting
- Cake: Enabling High-level SLOs on Shared Storage Systems
- Shark: SQL and Rich Analytics at Scale
- Tradeoffs in CDN Designs for Throughput Oriented Traffic
- The Potential Dangers of Causal Consistency and an Explicit Solution
- Discretized Streams: An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters
- Coflow: A Networking Abstraction for Cluster Applications
- Carat: Collaborative Energy Debugging for Mobile Devices
- FairCloud: Sharing the Network in Cloud Computing
- Surviving Failures in Bandwidth-Constrained Datacenters
- Probabilistically Bounded Staleness for Practical Partial Quorums
- A Case for Performance-Centric Network Allocation
- Faster and More Accurate Sequence Alignment with SNAP
- Blue-Fi: Enhancing Wi-Fi Performance using Bluetooth Signals
- PACMan: Coordinated Memory Caching for Parallel Jobs
- Reining in the Outliers in MapReduce Clusters using Mantri
- Re-optimizing Data Parallel Computing
- Shark: Fast Data Analysis Using Coarse-grained Distributed Memory (Best Demo Award)
- A Common Substrate for Cluster Computing
- Improving MapReduce Performance in Heterogeneous Environments
- Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing (Best Paper Award)
- Scarlett: Coping with Skewed Content Popularity in MapReduce Clusters
- The Datacenter Needs an Operating System
- Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
- Dominant Resource Fairness: Fair Allocation of Multiple Resource Types
- Spark: Cluster Computing with Working Sets
- Managing Data Transfers in Computer Clusters with Orchestra
- Disk-Locality in Datacenter Computing Considered Irrelevant