A recent O’Reilly Strata blog post on Big Data and Cancer Genomics highlights the exciting new results coming out of the ADAM work led by Matt Massie in the AMPLab. ADAM applies modern data formatting concepts to genomics data, enabling processing to be scaled horizontally to achieve potentially game-changing speedups for key parts of the genomics processing pipeline. Results on both AWS EC2 and an in-house cluster at the Mt. Sinai Medical Center show 10x to 1000x speedups compared to an existing single-node approach.
These results were first shown at last week’s AMPLab Winter Research Retreat and are reported in a new Tech Report. ADAM is an open source project being developed under an Apache License. See the ADAM GitHub repository for more information.