AMP Lab – UC Berkeley

National Science Foundation
Expeditions in Computing

Tag Archives: query processing

Succinct on Apache Spark: Queries on Compressed RDDs

Posted on November 5, 2015 by Rachit Agarwal

tl;dr Succinct is a distributed data store that supports a wide range of point queries (e.g., search, count, range, random …

Tags: amp, Big Data, compression, query processing, range queries, scalability, search, spark, Succinct

Spark SQL: Relational Data Processing in Spark

Michael Armbrust, Reynold Xin, Yin Huai, Davies Liu, Joseph K. Bradley, Xiangrui Meng, Tomer Kaftan, Michael Franklin, Ali Ghodsi, Matei Zaharia
ACM SIGMOD Conference 2015, May 2015.
Tags: Catalyst, Dataframes, JSON, Optimization, query processing, Shark, spark, SQL

GraphX: Graph Processing in a Distributed Dataflow Framework

Joseph Gonzalez, Reynold Xin, Ankur Dave, Dan Crankshaw, Michael Franklin, Ion Stoica
OSDI, Oct. 2014.
Tags: Big Data, dataflow, Graphs, Graphx, query processing, spark

A Sample-and-Clean Framework for Fast and Accurate Query Processing on Dirty Data

Jiannan Wang, Sanjay Krishnan, Michael Franklin, Ken Goldberg, Tim Kraska, Tova Milo
SIGMOD, Jun. 2014.
Tags: Big Data, crowdsourcing, Data Cleaning, query processing, Sampling

Leveraging Transitive Relations for Crowdsourced Joins

Jiannan Wang, Guoliang Li, Tim Kraska, Michael Franklin, Jianhua Feng
ACM SIGMOD Conference, Jun. 2013.
Tags: amp, crowdsourcing, query processing

CrowdQ: Crowdsourced Query Understanding

Gianluca Demartini, Beth Trushkowsky, Tim Kraska, Michael Franklin
CIDR 2013, Jan. 2013.
Tags: crowdsourcing, data quality, query processing, semantic web
