For a quick overview of our current and recent projects,
see the posters from our Bears ’12 open house.

BLB: Bootstrapping Big Data

The bootstrap provides a simple and powerful means of assessing the quality of estimators. However, in settings involving very large datasets, computing bootstrap-based quantities can be extremely demanding. As an alternative, we introduce the Bag of Little … Continue reading
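To make the cost concern concrete, here is a minimal sketch of the ordinary bootstrap (not the alternative procedure introduced by the project) used to estimate the standard error of an estimator. The function name `bootstrap_standard_error` and its parameters are hypothetical, chosen only for illustration. Each of the `num_resamples` rounds draws a resample as large as the full dataset and re-runs the estimator on it, which is why the approach becomes expensive as the data grow.

```python
import random
import statistics


def bootstrap_standard_error(data, estimator, num_resamples=1000, seed=0):
    """Estimate the standard error of `estimator` with the ordinary bootstrap.

    Hypothetical helper for illustration: every round resamples len(data)
    points with replacement and recomputes the estimator on that resample.
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    n = len(data)
    estimates = []
    for _ in range(num_resamples):
        # One bootstrap resample: n draws with replacement from the data.
        resample = rng.choices(data, k=n)
        estimates.append(estimator(resample))
    # The spread of the resampled estimates approximates the standard error.
    return statistics.stdev(estimates)


# Example usage: standard error of the sample mean of a small dataset.
sample = [float(i) for i in range(1, 101)]
standard_error = bootstrap_standard_error(sample, statistics.mean, num_resamples=500)
```

The total work is roughly `num_resamples` times the cost of one estimator run on the full dataset, so the loop above is practical only when a single pass over the data is cheap.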

Carat – Collaborative Detection of Energy Bugs

An energy bug is a system behavior that causes unexpectedly heavy use of energy and which is not intrinsic to providing the desired functionality. We aim to identify and help diagnose energy bugs in mobile devices by performing distributed, low-overhead sampling, aggregating these data, and applying statistical methods to identify the apps, … Continue reading

Mesos – Dynamic Resource Sharing for Clusters

Mesos block diagram

Mesos is a cluster manager that provides efficient resource isolation and sharing across distributed applications, or frameworks. It can run Hadoop, MPI, Hypertable, Spark (a new framework for low-latency interactive and iterative jobs), and other applications. Mesos is open source in the Apache Incubator. More information … Continue reading

PIQL – Scale Independent Query Processing

PIQL is implemented as a library that interacts with a distributed key/value store.

PIQL is a SQL-like language that uses a new scale-independent optimization strategy to execute relational queries while maintaining the performance predictability and scalability provided by distributed key/value stores. Scale-independent optimization guarantees that all queries will perform a bounded number of storage operations … Continue reading
