AMP BLAB: The AMPLab Blog

Sweet Storage SLOs with Frosting

awang

A typical page load requires sourcing and combining many pieces of data. For example, a frontend application like a newsfeed requires making many storage requests to fetch your name, your photo, your friends and their photos, and your friends’ most recent posts. Since loading a page requires making many of these storage requests, controlling storage request latency is crucial to reducing overall page load times.

Storage systems are thus provisioned and tuned to meet these latency requirements. However, this requires provisioning for peak, not average, load. This means that the hardware is often underutilized. MapReduce batch analytics jobs are perfect for taking up this excess slack capacity in the system. However, traditional storage systems are unable to support both a MapReduce and frontend workload without adversely affecting frontend latency. This is compounded by the dynamic, time-variant nature of a frontend workload, which makes it difficult to tune the storage system for a single set of conditions.

This is what motivated our work on Frosting. Frosting is a request scheduling layer on top of HBase, a distributed column-store, which dynamically tunes its internal scheduling to meet the requirements of the current workload. Application programmers directly specify high-level performance requirements to Frosting in the form of service-level objectives (SLOs), which are throughput or latency requirements on operations to HBase. Frosting then carefully admits requests to HBase such that these SLOs are met.

In the case of combining a high-priority, latency-sensitive frontend workload and a low-priority, throughput-oriented MapReduce workload, Frosting will continually monitor the frontend’s empirical latency and only admit requests from MapReduce when the frontend’s SLO is satisfied. For instance, if the frontend is easily meeting its latency target, Frosting might choose to admit more MapReduce requests since there is slack capacity in the system. If the frontend latency increases above its SLO due to increased load, Frosting will accordingly admit fewer MapReduce requests.
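To make the admission policy concrete, here is a minimal sketch (in Python, with illustrative names; this post does not describe Frosting's actual controller) of an SLO-driven admission loop that additively increases the admitted MapReduce fraction while the frontend meets its latency SLO, and multiplicatively backs off when it does not:

```python
class SloAdmissionController:
    """Illustrative sketch of SLO-based admission control: adjust the
    fraction of low-priority MapReduce requests admitted, based on whether
    the frontend's observed latency meets its SLO."""

    def __init__(self, slo_latency_ms, step=0.05, backoff=0.5):
        self.slo = slo_latency_ms
        self.step = step            # additive increase when the SLO is met
        self.backoff = backoff      # multiplicative decrease on violation
        self.mr_admit_fraction = 0.0  # share of MapReduce requests admitted

    def observe(self, frontend_latency_ms):
        if frontend_latency_ms <= self.slo:
            # Slack capacity in the system: admit more batch work.
            self.mr_admit_fraction = min(1.0, self.mr_admit_fraction + self.step)
        else:
            # SLO violated: shed batch load quickly.
            self.mr_admit_fraction *= self.backoff
        return self.mr_admit_fraction
```

The AIMD shape mirrors TCP congestion control: ramp up conservatively while there is slack, and shed batch load fast when the frontend's latency target is threatened.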

This is ongoing work with Shivaram Venkataraman (shivaram@eecs) and Sara Alspaugh (alspaugh@eecs). No paper is yet available publicly. However, we’d love to talk to you if you’re interested in Frosting, especially if you use HBase in production, or have workload traces that we could get access to.

Energy Debugging with Carat Enters Beta

Adam Oliner

Carat is a new research project in the AMP Lab that aims to detect energy bugs—app behavior that is consuming energy unnecessarily—using data collected from a community of mobile devices. Carat provides users with actions they can take to improve battery life (and the expected improvements).

Carat collects usage data on devices (we care about privacy), aggregates these data in the cloud, performs a statistical analysis using Spark, and reports the results back to users. In addition to the Action List shown in the figure, the app empowers users to dive into the data, answering questions like “How does my energy use compare to similar devices?” and “What specific information is being sent to the server?”

The key insight of our approach is that we can acquire implicit statistical specifications of what constitutes “normal” energy use under different circumstances. This idea of statistical debugging has been applied to correctness and performance bugs, but this is the first application to energy bugs. The project faces a number of interesting (and sometimes distinguishing) technical challenges, such as accounting for sampling bias, reasoning with noisy and incomplete information, and providing users with an experience that rewards them for participating.

We need your help testing our iOS implementation and gathering some initial data! If you have an iPhone or iPad with iOS 5.0 or later and are willing to give us a few minutes of your time, please click here.

Getting It All from the Crowd

Beth Trushkowsky

What does a query result mean when the data comes from the crowd? This is one of the fundamental questions raised by CrowdDB, a hybrid human/machine database system developed here in the AMP lab. For example, consider what could be thought of as the simplest query: SELECT * FROM table. If tuples are being provided by the crowd, how do you know when the query is complete? Can you really get them all?

In traditional database systems, query processing is based on the closed-world assumption: all data relevant to a query is assumed to reside in the database. When the data is crowdsourced, we find that this assumption no longer applies; existing information could always be extended further by asking the crowd. However, in our current work we show that it is possible to understand query results in the “open world” by reasoning about query completeness and the cost-benefit tradeoff of acquiring more data.

Consider asking workers on a crowdsourcing platform (e.g., Amazon Mechanical Turk) to provide items in a set (one at a time). As you can imagine, answers arriving from the crowd follow a pattern of diminishing returns: initially there is a high rate of arrival for previously unseen answers, but as the query progresses the arrival rate of new answers begins to taper off. The figure below shows an example of this curve when we asked workers to provide names of the US States.


Number of unique answers seen vs. total number of answers in the US States experiment (average)

This behavior is well known in fields such as biology and statistics, where this type of figure is known as the Species Accumulation Curve (SAC). This analysis is part of the species estimation problem; the goal is to estimate the number of distinct species using observations of species in the locale of interest. We apply these techniques in the new context of crowdsourced queries by drawing an analogy between observed species and worker answers from the crowd. It turns out that the estimation algorithms sometimes fail due to crowd-specific behaviors, like some workers providing many more answers than others (“streakers vs. samplers”). We address this by designing a heuristic that reduces the impact of overzealous workers. We also demonstrate a heuristic to detect when workers are consulting the same list on the web, helpful if the system wants to switch to another data-gathering regime like webpage scraping.
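As a concrete illustration of the species-estimation analogy, here is a sketch of the classic Chao1 richness estimator applied to crowd answers (a standard estimator from the species-estimation literature, shown for illustration; it is not necessarily the exact variant our system uses):

```python
from collections import Counter

def chao1_estimate(answers):
    """Chao1 species-richness estimator applied to crowd answers: predict
    the total number of distinct answers from how many answers were seen
    exactly once (f1) and exactly twice (f2)."""
    counts = Counter(answers)
    observed = len(counts)
    f1 = sum(1 for c in counts.values() if c == 1)  # singletons
    f2 = sum(1 for c in counts.values() if c == 2)  # doubletons
    if f2 == 0:
        # Bias-corrected form avoids division by zero.
        return observed + f1 * (f1 - 1) / 2.0
    return observed + (f1 * f1) / (2.0 * f2)
```

Intuitively, many singleton answers mean the crowd is still surfacing new items (the SAC is still rising), so the estimated set size extends well past what has been observed so far.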

Species estimation techniques provide a way to reason about query results, despite being in the open world. For queries with a bounded result size, we can form a progress estimate as answers arrive by predicting the cardinality of the result set. Of course, some sets are very large or contain items that few workers would think of or find (the long tail), so it does not make sense to try to predict set size. For these cases, we propose a pay-as-you-go approach to directly consider the benefit of asking the crowd for more answers.

For more details, please check out the paper.

Highlights From the AMPLab Winter 2012 Retreat

Michael Franklin


The 2nd AMPLab research retreat was held Jan 11-13, 2012 at a mostly snowless Lake Tahoe.   120 people from 21 companies, several other schools and labs, and of course UC Berkeley spent 2.5 days getting an update on the current state of research in the lab, discussing trends and challenges in Big Data analytics, and sharing ideas, opinions and advice.   Unlike our first retreat, held last May, which was long on vision and inspiring guest speakers,  the focus of this retreat was current research results and progress.   Other than a few short overview/intro talks by faculty, virtually all of the talks (16 out of 17) were presented by students from the lab.   Some of these talks discussed research that had been recently published, but most of them discussed work that was currently underway, or in some cases, just getting started.

The first set of talks focused on Applications. Tim Hunter described how he and others used Spark to improve the scalability of the core traffic estimation algorithm used in the Mobile Millennium system, giving them the ability to run models faster than real time and to scale to larger road networks. Alex Kantchelian presented some very cool results on algorithmically detecting spam in tweet streams. Matei Zaharia described a new algorithmic approach to sequence alignment called SNAP. SNAP rethinks sequence alignment to exploit the longer reads being produced by modern sequencing machines and shows 10x to 100x speedups over the state of the art, as well as improvements in accuracy.

The second technical session was about the Data Management portion of the BDAS (Berkeley Data Analytics System) stack that we are building in the AMPLab.    Newly-converted database professor Ion Stoica gave an overview of the components.   And then there were short talks on SHARK (an implementation of the Hive SQL processor on Spark),  Quicksilver – an approximate query answering system that is aimed at massive data,  scale-independent view maintenance in PIQL (the Performance Insightful Query Language), and a streaming (i.e., very low-latency) implementation of Spark.  These were presented by Reynold Xin, Sameer Agarwal, Michael Armbrust and Matei Zaharia, respectively.   Undergrad Ankur Dave wrapped up the session by wowing the crowd with a live demo of the Spark Debugger that he built – showing how the system can be used to isolate faults in some pretty gnarly, parallel data flows.

The Algorithms and People parts of the AMP agenda were represented in the 3rd technical session. John Duchi presented his results on speeding up stochastic optimization for a host of applications. He developed a parallelized method for introducing random noise into the process that leads to faster convergence. Fabian Wauthier reprised his recent NIPS talk on detecting and correcting for bias in crowdsourced input. Beth Trushkowsky talked about “Getting It All from the Crowd”, and showed how we must think differently about the meaning of queries in a hybrid machine/human database system such as CrowdDB.

A session on Machine-focused topics included talks by Ali Ghodsi on the PacMan caching approach for map-reduce style workloads,  Patrick Wendell on early work on low-latency scheduling of parallel jobs, Mosharaf Chowdhury on fair sharing of network resources in large clusters, and Gene Pang on a new programming model and consistency protocol for applications that span multiple data centers.

The technical talks were rounded out by two presentations from students who worked with partner companies to get access to real workloads, logs and systems traces.  Yanpei Chen talked about an analysis of the characteristics of various MapReduce loads from a number of sources.   Ari Rabkin presented an analysis of trouble tickets from Cloudera.

As always, we got a lot of feedback from our Industrial attendees.   A vigorous debate broke out about the extent to which the lab should work on producing  a complete, industrial-strength analytics stack.   Some felt we should aim to match the success of earlier high-impact projects coming out of Berkeley, such as BSD and Ingres.  Others insisted that we focus on high-risk, further out research and leave the systems building to them.   There were also discussions about challenge applications (such as the Genomics X  Prize competition) and how to ensure that we achieve the high degree of integration among the Algorithms, Machines and People components of the work, which is the hallmark of our research agenda.   Another topic of great interest to the Industrial attendees was around how to better facilitate interactions and internships with the always amazing and increasingly in-demand students in the lab.

From a logistical point of view, we tried a few new things.   The biggest change was  with the poster session(s).   As always, the cost of admission for students was to present a poster of their current research.   This year, however, we also invited visitors to submit posters describing relevant work at their companies in general, and projects for which they were looking to hire interns in particular.   We then partitioned the posters into two separate poster sessions (one each night), thereby giving everyone a chance to spend more time discussing the projects that they were most interested in while still getting a chance to survey the wide scope of work being presented.   Feedback on both of these changes was overwhelmingly positive.  So we’ll likely stick to this new format for future retreats.

Kattt Atchley, Jon Kuroda and Sean McMahon did a flawless job of organizing the retreat.   Thanks to them and all the presenters and attendees for making it a very successful event.

Traffic jams, cell phones and big data

Timothy Hunter

(With contributions from Michael Armbrust, Leah Anderson and Jack Reilly)

It is well known that big data processing is becoming increasingly important in many scientific fields, including astronomy, biomedicine, and climatology. In addition, newly created hybrid disciplines like biostatistics are an even stronger indicator of this overall trend. Other fields like civil engineering, and in particular transportation, are no exception to the rule, and the AMP Lab is actively collaborating with the department of Intelligent Transportation Systems at Berkeley to explore this new frontier.

It comes as no surprise to residents of California that congestion on the streets is a major challenge that affects everyone. While congestion is well studied for highways, it remains an open question for urban streets (also called the arterial road network). So far, the most promising source of data is GPS readings from cellphones. However, a large volume of this very noisy data is required in order to maintain a good level of accuracy. The rapid adoption of smartphones, all equipped with GPS, is changing the game. I introduce in this post some ongoing efforts to combine Mobile Millennium, a state-of-the-art transportation framework, with the AMPLab software stack.

What does this GPS data look like? Here is an example in the San Francisco Bay area: a few hundred taxicabs relay their position every minute in real time to our servers.

The precise trajectories of the vehicles are unobserved and need to be reconstructed using a sophisticated map matching pipeline implemented in Mobile Millennium. The results of this process are some timestamped trajectory segments. These segments are the basic observations to predict traffic.


Our traffic estimation algorithms work by estimating a probability distribution of the travel time on each link of the road network. This process is iterated to improve the quality of the estimates. The overall algorithm is intensive in both computation and memory. Fortunately, it also falls into the category of “embarrassingly parallel” algorithms and is a perfect candidate for distributed computing.
Implementing a high-performance algorithm as a distributed system is not an easy task. Instead of implementing everything by hand, our implementation relies on Spark, a programming framework in Scala. Thanks to the Spark framework, we were able to port our single-machine implementation to the EC2 cloud within a few weeks and achieve nearly linear scaling. In a future post, I will discuss some practical considerations we faced when integrating the AMPLab stack with the Mobile Millennium system.
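The per-link independence that makes the algorithm embarrassingly parallel can be sketched as follows (a toy Python analogue of the structure Spark distributes across a cluster; the per-link Gaussian summary is an illustrative stand-in for the real travel-time model):

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, pstdev

def fit_link_distribution(link_observations):
    """Fit a per-link travel-time distribution (here just a Gaussian summary).
    link_observations: (link_id, [travel times in seconds])."""
    link_id, times = link_observations
    return link_id, (mean(times), pstdev(times))

def estimate_all_links(observations_by_link):
    # Each road link is independent, so the work maps cleanly across
    # workers -- the same structure Spark parallelizes across a cluster.
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(fit_link_distribution, observations_by_link.items()))
```

In the real pipeline, each iteration re-fits the per-link distributions from the map-matched trajectory segments, then the refined estimates feed the next iteration.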


Trip Report from the NIPS Big Learning Workshop

Matei Zaharia

A few weeks ago, I went to the Big Learning workshop at NIPS, held in Spain. The workshop brought together researchers in large-scale machine learning, an area near and dear to the AMP Lab’s goal of integrating Algorithms, Machines, and People to tame big data, and contained a lot of interesting work. There were about ten invited talks and ten paper presentations. I myself gave an invited talk on Spark, our framework for large-scale parallel computing, which won a runner-up best presentation award.

The topics presented ranged from FPGAs to accelerate vision algorithms in embedded devices, to GPU programming, to cloud computing on commodity clusters. For me, some highlights included the discussion on training the Kinect pose recognition algorithm using DryadLINQ, which ran on several thousand cores and had to overcome substantial fault mitigation and I/O challenges; and the GraphLab presentation from CMU, which discussed many interesting applications implemented using their asynchronous programming model. Daniel Whiteson from UC Irvine also gave an extremely entertaining talk on the role of machine learning in the search for new subatomic particles.

One of the groups I was happy to see represented was the Scala programming language team from EPFL. Scala features prominently as a high-level language for parallel computing. We use it in the Spark programming framework in our lab, as well as the SCADS scalable key-value store. It’s also used heavily in the Pervasive Parallelism Lab at a certain school across the bay. It was good to hear that the Scala team is working on new features that will make the language easier to use as a DSL for parallel computing, making it simpler to build highly expressive programming tools in Scala such as Spark.

The AMP Lab was also represented by John Duchi, who presented a new algorithm for stochastic gradient descent in non-smooth problems that is the first parallelizable approach for these problems, and Ariel Kleiner and Ameet Talwalkar, who presented the Bag of Little Bootstraps, a scalable bootstrap algorithm based on subsampling. It’s certainly neat to see two successes in parallelizing very disparate statistical algorithms one year into the AMP Lab.

In summary, the workshop showcased very diverse ideas and showed that big learning is a hot field. It was the biggest workshop at NIPS this year. In the future, as users gain experience with the various programming models and the best algorithms for each problem type are found, we expect to see some consolidation of these ideas into unified stacks of composable tools. Designing and building such a stack is one of the main goals of our lab.

An AMP Blab about some recent system conferences – Part 3: Hadoop World 2011

ychen

I recently had the pleasure of visiting Portugal for SOSP/SOCC, and New York for Hadoop World. Below are some bits that I found interesting. This is the personal opinion of an AMP Lab grad student – in no way does it represent any official or unanimous AMP Lab position.

Part 3: Hadoop World 2011

Not exactly a research conference, Hadoop World is a multi-track industry convention hosted by Cloudera, an enterprise Hadoop vendor, and draws various companies with some stake in the Hadoop community. This year’s Hadoop World saw some 1500 attendees, including Hadoop vendors, Hadoop users, executives from various companies, vendors building on top of Hadoop, people looking to learn more about Hadoop, and of course, a small contingent of researchers. I believe Hadoop World is a good place for researchers to get a state-of-the-industry view of the big data, big systems space.

One theme is that Hadoop has really become “mainstream”, and moved much beyond its initial use cases in supporting e-commerce type services. The convention agenda included talks from household names beyond typical high-tech industries. The talks also had audiences in ripped jeans and flip flops sitting next to others in pressed three piece suits, indicating the present diversity of the community, and perhaps pointing to opportunities for multi-disciplinary collaboration in the near future.

Accel Partners announced a $100M “Big data fund” to accelerate innovation in all layers of the “big data stack”. This should be of interest to entrepreneurial-minded students within the Lab.

Another theme is that Hadoop is still waiting for a “killer app”. One keynote speaker dubbed 2012 to be “the year of apps”. In other words, the Hadoop infrastructure is sufficient to be “enterprise ready”; therefore innovation should now focus on using Hadoop to derive business value.

Also, the “data scientist” role is gaining prominence. Jeff Hammerbacher pioneered this role at Facebook. Companies across many industries are looking for similarly skilled people to make sense of the data deluge that’s happening everywhere. The role requires some combination of expertise in computer science, statistics, social science, natural science, business, and other skills. The AMP Lab is rooted in computer science and statistics and, depending on individual students’ interests, is also reasonably literate in social science, natural science, and business. I certainly found it motivational to see the countless ways that the Lab’s expertise can be applied to create business value, help improve the quality of life, and even discover new knowledge.

NetApp and Cloudera announced a partnership in providing the NetApp Open Solution for Hadoop running on Cloudera Distribution including Apache Hadoop. It’s great to see increased collaboration between our industry partners beyond knowledge sharing through the Lab.

I gave a joint talk on “Hadoop and Performance” with Todd Lipcon, our colleague from Cloudera. The talk was well received, and folks are looking forward to our imminent release of the “Cloudera Hadoop workload suite”. One could say that typical enterprises must focus either on profit (monetary and societal) or on arguing that “my performance is better”. Thus, it remains the academic community’s responsibility and opportunity to develop scientific design and performance evaluation methodologies.

No travel notes this time.

An AMP Blab about some recent system conferences – Part 2: SOCC 2011

ychen


Part 2: Symposium on Cloud Computing (SOCC) 2011

This year represented the second iteration of the conference. SOCC has certainly established itself as a noteworthy conference that brings together diverse computer system specialties. The proceedings are available through the ACM. Perhaps SOCC will become a stand-alone venue next year, instead of being co-located with SIGMOD (last year) or SOSP (this year).

The AMP Lab is fortunate to have inherited many members from its predecessor, the RAD Lab, which made some contributions in highlighting cloud computing as an important technology trend and emerging research area. The numerous SOCC papers on MapReduce optimizations and key-value stores continue the research paths that the RAD Lab helped identify regarding MapReduce schedulers and scale-independent storage.

The program committee awarded three “papers of distinction”: 1. Pesto: Online Storage Performance Management in Virtualized Datacenters, 2. Opportunistic Flooding to Improve TCP Transmit Performance in Virtualized Clouds, 3. PrIter: A Distributed Framework for Prioritized Iterative Computations. I especially liked the TCP paper – the authors actually modified the TCP kernel, a painful task per my own past experience.

Our AMP Lab colleagues presented two talks – Improving Per-Node Efficiency in the Datacenter with New OS Abstractions (Barret Rhoden, Kevin Klues, David Zhu, and Eric Brewer), and Scaling the Mobile Millennium System in the Cloud (Timothy Hunter, Teodor Moldovan, Matei Zaharia, Justin Ma, Samy Merzgui, Michael Franklin, Pieter Abbeel, and Alexandre Bayen). Both went very well.

One train of thought appeared several times: how do the system improvements demonstrated over artificial benchmarks translate to real-life situations? Folks from different organizations raised this point during Q&A for several papers, with the response being the familiar lament regarding the shortage of large-scale system traces available to academia. This prompted our friend John Wilkes from Google to give a 1-slide impromptu presentation highlighting the imminent public release of some large-scale Google cluster traces, and inviting researchers to work with Google. I felt it helpful to do a 1-slide impromptu follow-up presentation highlighting that the AMP Lab has access to large-scale system traces from several different organizations, inviting researchers to work with the AMP Lab and our industrial partners, and of course thanking our Google colleagues John Wilkes, Joseph L. Hellerstein, and others for their guidance on our early efforts to understand large-scale system workloads.

Portugal travel note 2: Consider taking in the stunning sunset at Castelo de Sao Jorge, set against the 25 de Abril Bridge across the River Tejo, with the Cristo Rei Statue lit by bright light on the opposite side of the River. Walking about the medieval Castle in semi-darkness is a unique and almost haunting experience, provided you can muster the courage and the night-vision. Or just head to the Bairro Alto historical neighborhood and stuff yourself on fantastic local food.

An AMP Blab about some recent system conferences – Part 1: SOSP 2011

ychen


Part 1: Symposium on Operating System Principles (SOSP) 2011

A very diverse and high quality technical program, as expected. You can find the proceedings and talk slides/videos at http://sigops.org/sosp/sosp11/current/index.html.

One high-level reaction I have from the conference is that the AMP Lab’s observe-analyze-act design loop positions us well to identify emerging technology trends and to design systems with high impact under real-life scenarios. Our industry partnerships also allow us to address engineering concerns beyond the laboratory, thus expediting bilateral knowledge transfer between academia and industry.

One best-paper award went to “A File is Not a File: Understanding the I/O Behavior of Apple Desktop Applications”, authored by our friends from the Univ. of Wisconsin, Professors Andrea and Remzi Arpaci-Dusseau, as well as their students. The paper did an elaborate study of Apple laptop file traces and found many pathological behaviors. For example, a file “write” actually writes a file multiple times, and a file “open” touches a great number of seemingly unrelated files.

Another best-paper award went to “Cells: A Virtual Mobile Smartphone Architecture” from Columbia. This study proposes and implements “virtual phones”, the same idea as virtual machines, for example running a “work phone” and a “home phone” on the same physical device. The talk highlight was a demo of two versions of the Angry Birds game running simultaneously on the same phone.

The audience-choice best presentation award went to “Atlantis: Robust, Extensible Execution Environments for Web Applications”, a joint work between MSR and Rutgers. The talk very humorously surveyed the defects of current Internet browsers and proposed an “exokernel browser” architecture in which web applications have the flexibility to define their own execution stack, e.g., markup languages, scripting environments, etc. As expected, the talk catalyzed very entertaining questioning from companies with business interests in the future of web browsers.

Also worthy of highlighting: the session on Security contained three papers, all three with Professor Nickolai Zeldovich on the author list, and all three of high quality. I have not done a thorough historical search, but I’m sure it’s rare that a single author manages to fill a complete session at SOSP.

There was also a very lively discussion of ACM copyright policies during the SIGOPS working dinner. I personally believe it’s vital that we find policies that balance upholding the quality of research, preserving the strength of the research community, and facilitating the sharing of cutting-edge knowledge and insights.

My own talk on “Design Implications for Enterprise Storage Systems via Multi-Dimensional Trace Analysis” went very well. This study performs an empirical analysis of large-scale enterprise storage traces, identifies different workloads, and discusses design insights specifically targeted at each workload. The rigorous trace analysis allows us to identify simple, threshold-based storage system optimizations, with high confidence that the optimizations bring concrete benefit under realistic settings. A big thank you to everyone at the AMP Lab and our co-authors at NetApp for helping me prepare the talk!

Lisbon travel note 1: If history/food is dear to your heart, you will find it worthwhile to visit the Jerónimos Monastery, and try the Pasteis de Nata sold nearby. This is THE authentic egg tart, originated at the Monastery, and very good for a mid-day sugar-high. I had too many – I felt too happy after eating the first 10, lost count of how many more I ate, and skipped lunch and dinner for that day.

Scale-Independent Query Processing With PIQL

Michael Armbrust

(This is joint work with Kristal Curtis and Tim Kraska.)

The Internet is littered with stories of traditional relational databases failing to meet the performance needs of fast-growing internet sites. The story usually goes as follows: everything works great while the site is small. Suddenly, the site becomes popular and queries start running slowly. As a result, the developers abandon the relational database and switch to writing queries imperatively against a distributed key/value store. Examples of sites that use distributed key/value stores under the covers include digg, facebook, and twitter, along with many others.

One key driver of this NoSQL movement is the fact that the data independence provided by an RDBMS actually exacerbates the scaling problem by hiding potentially expensive queries behind the relative simplicity of high-level declarative queries.  In contrast, the simple get/put interface of most NoSQL stores provides predictable per-operation performance, independent of the size of the underlying database.

The NoSQL ‘solution,’ however, leads to its own set of problems.  The use of imperative functions instead of declarative queries means that changes to the data model often require time-consuming rewrites.  Additionally, developers have to manually parallelize key/value store requests or suffer delays due to sequential execution. In other words, the benefits of physical and logical data independence are lost.

Instead, we propose PIQL (pronounced pickle), the Performance-Insightful Query Language.  In addition to the physical and logical data independence provided by a traditional system, PIQL ensures the scale independence of all queries in an application at compile time.  A scale-independent query is guaranteed to perform only a bounded number of operations no matter how large the underlying database grows.
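The compile-time guarantee can be illustrated with a toy bound calculation (our own sketch, not the PIQL compiler): a query plan is scale-independent only if every operator carries a static bound on the number of key/value operations it performs, so the plan's worst case is a product of constants rather than a function of database size.

```python
def max_key_value_ops(plan):
    """plan: list of (operator_name, per_invocation_bound) pairs; a bound of
    None means the operator's cost grows with database size (e.g., a full
    scan). Returns the worst-case number of key/value operations, or None
    if the query is not scale-independent."""
    total = 1
    for name, bound in plan:
        if bound is None:
            return None  # unbounded operator: reject at compile time
        total *= bound   # worst-case fan-out compounds multiplicatively
    return total

# A bounded "recent posts from people I follow" style plan:
bounded = [("index_lookup_following", 50), ("per_user_recent_posts", 10)]
# An unbounded plan containing a full table scan:
unbounded = [("full_scan_users", None)]
```

The operator names and bounds above are hypothetical; the point is that a bounded plan (here, at most 50 × 10 = 500 operations) costs the same whether the database holds a thousand users or a billion, while the unbounded plan is rejected outright.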

Some systems, for example Google AppEngine’s GQL, impose severe functional restrictions, such as removing joins, in order to ensure scalability. In contrast, PIQL employs language extensions, query compilation technology, and response-time estimation to provide scale independence over a larger and more powerful subset of SQL.

We evaluated our ideas by building a prototype on top of SCADS, a distributed key/value store first proposed at CIDR 2009.  Using this prototype we constructed two benchmarks, one based on Twitter and the other on the user-facing queries from TPC-W.

The throughput of the system while running the TPC-W benchmark increases linearly as machines are added to the cluster.

As the above figure shows, systems built using PIQL scale linearly as machines are added while keeping response time constant. If you’d like to learn more, you can check out our paper appearing in VLDB 2012 or look at some of the queries written in our LINQ-like Scala DSL.