Ion Stoica

Director

http://www.cs.berkeley.edu/~istoica/
istoica@eecs.berkeley.edu

AMPLab Publications

SparkR: Scaling R Programs with Spark
Time-Evolving Graph Processing at Scale
Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics
BlowFish: Dynamic Storage-Performance Tradeoff in Data Stores
FairRide: Near-Optimal, Fair Cache Sharing
SparkNet: Training Deep Networks on Spark
Succinct: Enabling Queries on Compressed Data
FastLane: Making Short Flows Shorter with Agile Drop Notification
Low Latency, Geo-distributed Data Analytics
Efficient Coflow Scheduling Without Prior Knowledge
Feral Concurrency Control: An Empirical Investigation of Modern Application Integrity
G-OLA: Generalized Online Aggregation for Interactive Analysis on Big Data
CellIQ: Real-Time Cellular Network Analytics at Scale
Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks
Coordination Avoidance in Database Systems
Discretized Streams: Fault-Tolerant Streaming Computation at Scale
The Power of Choice in Data-Aware Cluster Scheduling
GraphX: Graph Processing in a Distributed Dataflow Framework
Efficient Coflow Scheduling with Varys
Knowing When You’re Wrong: Building Fast and Reliable Approximate Query Processing Systems
Scalable Atomic Visibility with RAMP Transactions
Tachyon: Memory Throughput I/O for Cluster Computing Frameworks
GraphX: Unifying Data-Parallel and Graph-Parallel Analytics
Sparrow: Distributed, Low Latency Scheduling
Carat: Collaborative Energy Diagnosis for Mobile Devices
Highly Available Transactions: Virtues and Limitations
A General Bootstrap Performance Diagnostic
Leveraging Endpoint Flexibility in Data-Intensive Clusters
GraphX: A Resilient Distributed Graph System on Spark
HAT, not CAP: Towards Highly Available Transactions
PBS at Work: Advancing Data Management with Consistency Metrics (Demo)
Bolt-on Causal Consistency
The Case for Tiny Tasks in Compute Clusters
Choosy: Proportional Sharing for Datacenter Jobs with Constraints
BlinkDB: Queries with Bounded Errors and Bounded Response Times on Very Large Data (Best Paper Award)
Blink and It’s Done: Interactive Queries on Very Large Data
Fast and Interactive Analytics over Hadoop Data with Spark
Sweet Storage SLOs with Frosting
Cake: Enabling High-level SLOs on Shared Storage Systems
Shark: SQL and Rich Analytics at Scale
Tradeoffs in CDN Designs for Throughput Oriented Traffic
The Potential Dangers of Causal Consistency and an Explicit Solution
Discretized Streams: An Efficient and Fault-Tolerant Model for Stream Processing on Large Clusters
Coflow: A Networking Abstraction for Cluster Applications
Carat: Collaborative Energy Debugging for Mobile Devices
FairCloud: Sharing The Network in Cloud Computing
Surviving Failures in Bandwidth-Constrained Datacenters
Probabilistically Bounded Staleness for Practical Partial Quorums
A Case for Performance-Centric Network Allocation
Faster and More Accurate Sequence Alignment with SNAP
Blue-Fi: Enhancing Wi-Fi Performance using Bluetooth Signals
PACMan: Coordinated Memory Caching for Parallel Jobs
Reining in the Outliers in MapReduce Clusters using Mantri
Re-optimizing Data Parallel Computing
Shark: Fast Data Analysis Using Coarse-grained Distributed Memory (Best Demo Award)
A Common Substrate for Cluster Computing
Improving MapReduce Performance in Heterogeneous Environments
Resilient Distributed Datasets: A Fault-Tolerant Abstraction for In-Memory Cluster Computing (Best Paper Award)
Scarlett: Coping with Skewed Content Popularity in MapReduce Clusters
The Datacenter Needs an Operating System
Mesos: A Platform for Fine-Grained Resource Sharing in the Data Center
Dominant Resource Fairness: Fair Allocation of Multiple Resource Types
Spark: Cluster Computing with Working Sets
Managing Data Transfers in Computer Clusters with Orchestra
Disk-Locality in Datacenter Computing Considered Irrelevant