Training deep networks is a time-consuming process, with networks for object recognition often requiring multiple days to train. For this …
Tag Archives:
SparkNet
Succinct on Apache Spark: Queries on Compressed RDDs
tl;dr Succinct is a distributed data store that supports a wide range of point queries (e.g., search, count, range, random …
Succinct: Enabling Queries on Compressed Data
Web applications and services today collect, store and analyze an immense amount of data. As data sizes continue to grow, the …
Splash: Efficient Stochastic Learning on Clusters
Splash is a general framework for parallelizing stochastic learning algorithms (SGD, Gibbs sampling, etc.) on multi-node clusters. It consists of a …
GraphX: Large-Scale Graph Analytics
Increasingly, data-science applications require the creation, manipulation, and analysis of large graphs ranging from social networks to language …
Traffic jams, cell phones and big data
(With contributions from Michael Armbrust, Leah Anderson and Jack Reilly) It is well known that big data processing is becoming …
Spark – Lightning-Fast Cluster Computing
Spark is an open source cluster computing system that aims to make data analytics fast — both fast to run and …