Looking Back at AMP Year 5

It’s that time of year when we look back at what we’ve done over the past year and look forward to the next. For AMPLab, it’s a particularly good point at which to do this because we are wrapping up the 5th year of what we originally intended to be a 5-year project. As you may know, in 2012 we received a 5-year “Expeditions in Computing” Award from the National Science Foundation, so we extended our project to run until the end of 2016. Thus, we’re heading into the final full year of the AMPLab project, and we’re working on planning what we want to do next. We’ll be saying more about those plans as they develop, and those of you attending our Winter retreat meeting in Lake Tahoe next month will contribute to discussions on the “Next Lab”. In the meantime, let’s take a quick look at what AMPLab accomplished in 2015.

BDAS Software – We continued to innovate around our core platform, the Berkeley Data Analytics System (BDAS). This past year we released important new components of BDAS such as the KeystoneML machine learning pipeline system, the Succinct compressed storage and search system, Splash for parallelizing stochastic learning algorithms, and SampleClean/AMPCrowd for human-in-the-loop data cleaning and labeling. We’ve also made improvements to GraphX, MLlib, and Tachyon, among others.

Students – Our students and their research continued to be recognized with awards at top conferences and elsewhere. One result that we are extremely proud of is that 2 of the 3 winners of this year’s ACM Dissertation Award were from AMPLab: Matei Zaharia and John Duchi. These awards recognize outstanding Ph.D. work in Computer Science from all of the CS departments worldwide. It is rare to have two winners from the same department. It is unheard of to have two winners from the same lab. We also had some best paper awards and a CACM Research Highlight selection, as noted in the news section of the AMPLab web site. And of course, we continued to publish papers in the top conferences in Systems, Databases, Machine Learning, Networking, etc. Have a look at our Publications page to see a pretty impressive list. Our graduates received job offers from all the top academic institutions in CS and, of course, are in tremendous demand by companies of all sizes.

Industry Impact – AMPLab-born software is driving innovation in the Big Data industry.  Spark, Mesos, and Tachyon all have large groups of contributors and are used in production across a wide range of industries.  A recent salary survey by O’Reilly indicated that knowledge of Spark provided the highest increase in median salary across all of the Big Data technologies they studied, providing a bigger boost even than getting a Ph.D. (we try not to emphasize this point with our students!).  AMPLab sponsors have made large bets based on our software, including recent announcements from IBM, AWS, SAP and others.  Another interesting factoid – at the start of the year, spark.meetup.com listed an impressive 43 Apache Spark meetup groups around the world with a total of over 12,000 members.   As of this writing (less than a year later), there are 132 groups with 53,520 members.  There are now Spark meetups on every continent except (as far as we know) Antarctica.

Big Data and Data Science – AMPLab also played an important role in new research initiatives on campus and nationally.  For example, Berkeley recently was named to host the NSF West Big Data Innovation Hub and AMPLab will anchor a large part of Berkeley’s involvement in the hub.  Also, Lawrence Berkeley National Lab, in conjunction with Cray, is integrating BDAS with more traditional High Performance Computing infrastructure.  The convergence of Big Data and HPC is a key pillar of the National Strategic Computing Initiative recently announced by President Obama.

AMPCamp and Beyond – We continue to host successful outreach events such as AMPCamp 6, which was held earlier in November, and AMPCamps in Shanghai (hosted by Intel) and Beijing (hosted by Microsoft). AMPLab faculty have spoken at Davos, published in Science, and opined on Big Data topics in a host of major media outlets.

The above is just a sampling of what we did during 2015. Please visit the AMPLab web site to dig deeper and keep up with the latest developments.

Best wishes from AMPLab for a happy, healthy and productive New Year.