As datasets continue to grow, storage and networking pose the most challenging bottlenecks for many workloads. To address these bottlenecks, we developed Alluxio (formerly known as Tachyon), a memory-centric, fault-tolerant virtual distributed storage system. With Alluxio, any application can access any data from anywhere and store any data anywhere.
Alluxio is memory-centric rather than memory-only: its tiered storage feature lets it use any storage media. Because Alluxio provides a storage unification layer through an API, applications can access any underlying persistent storage or file system. Alluxio supports any big data framework (Apache Spark, Apache MapReduce, Apache Flink, Impala, etc.) with any storage system or file system (Alibaba OSS, Amazon S3, EMC, NetApp, OpenStack Swift, Red Hat GlusterFS, and more), running on any storage media (SSD, HDD, DRAM, etc.).
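To make the idea of a unification layer with a memory tier concrete, here is a minimal conceptual sketch in Python. It is not Alluxio's API or implementation: the names `UnifiedStore`, `DictBackend`, `mount`, and the `/s3/` and `/hdfs/` prefixes are all hypothetical, the "persistent stores" are plain dictionaries, and least-recently-used eviction with write-through persistence is an assumption chosen only for illustration.

```python
from collections import OrderedDict


class DictBackend:
    """Hypothetical stand-in for a persistent store (e.g. an object store)."""

    def __init__(self):
        self.blobs = {}
        self.reads = 0  # counts how often the slow store is actually read

    def read(self, key):
        self.reads += 1
        return self.blobs[key]

    def write(self, key, data):
        self.blobs[key] = data


class UnifiedStore:
    """Toy unification layer: one namespace, a memory tier, many backends."""

    def __init__(self, memory_capacity):
        self.memory_capacity = memory_capacity  # max number of cached objects
        self.memory = OrderedDict()             # least recently used first
        self.mounts = {}                        # path prefix -> backend

    def mount(self, prefix, backend):
        self.mounts[prefix] = backend

    def _resolve(self, path):
        # Pick the longest mounted prefix that matches the path.
        matches = [p for p in self.mounts if path.startswith(p)]
        if not matches:
            raise KeyError(f"no backend mounted for {path}")
        prefix = max(matches, key=len)
        return self.mounts[prefix], path[len(prefix):]

    def _cache(self, path, data):
        self.memory[path] = data
        self.memory.move_to_end(path)
        while len(self.memory) > self.memory_capacity:
            self.memory.popitem(last=False)  # evict least recently used

    def write(self, path, data):
        backend, key = self._resolve(path)
        backend.write(key, data)  # write through so data survives eviction
        self._cache(path, data)

    def read(self, path):
        if path in self.memory:  # served from the memory tier
            self.memory.move_to_end(path)
            return self.memory[path]
        backend, key = self._resolve(path)
        data = backend.read(key)  # fall back to the underlying store
        self._cache(path, data)
        return data


# Usage: two different backends appear under a single namespace.
store = UnifiedStore(memory_capacity=2)
s3_like, hdfs_like = DictBackend(), DictBackend()
store.mount("/s3/", s3_like)
store.mount("/hdfs/", hdfs_like)
store.write("/s3/a", b"alpha")
store.write("/hdfs/b", b"beta")
store.write("/s3/c", b"gamma")  # memory tier is full, so /s3/a is evicted
first = store.read("/s3/a")     # miss: fetched from the backend, then cached
second = store.read("/s3/a")    # hit: served from memory
```

The application only ever sees paths in one namespace; which backend holds the data, and whether a read is served from memory or from the underlying store, is decided inside the layer.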
The Alluxio project is open source (Apache License 2.0) and is already deployed in production clouds on petabyte-scale workloads. The project is the storage layer of the Berkeley Data Analytics Stack (BDAS) and is also part of the Fedora distribution. In collaboration with our industry partners, the lab continues to enhance the system and build on it.
For more information, please visit the Alluxio website. The source code can be obtained from the project’s GitHub repository. We also host regular meetups in the Bay Area. The project is commercially backed by Alluxio, Inc.