SAN JOSE, Calif., February 18, 2015 – Strata + Hadoop World, Booth #929 – Altiscale, Inc., the leading provider of Hadoop-as-a-Service, today announced that Apache Spark is now available on the Altiscale Data Cloud. Altiscale customers can now leverage Apache Spark on Apache Hadoop in order to achieve their critical analytical and business objectives. The addition of Apache Spark provides a broader array of analytical services for machine learning, stream processing, and data processing for large data sets.
“Altiscale is dedicated to helping customers quickly find value in the ever-increasing flood of data generated by the connected world,” said Raymie Stata, co-founder and CEO of Altiscale. “Apache Spark in the Altiscale Data Cloud ensures that customers can take advantage of the latest in-memory processing techniques as they are processing their data assets. The Altiscale Data Cloud is purpose-built to provide the fastest, most scalable Hadoop-as-a-Service and is the ideal place to run Spark.”
Apache Spark is an open source framework that is gaining adoption for its machine learning, interactive analytics, and streaming analytics capabilities for large datasets. Spark is well suited to low-latency computations and iterative algorithms that benefit from its in-memory computing capabilities. At Altiscale, Spark is fully integrated into the larger Hadoop ecosystem, so customers benefit not only from Spark, but also from Hive, Pig, MapReduce, and even tools like R and H2O. All of these tools run side by side on the same Hadoop Distributed File System (HDFS) cluster, managed through YARN.
“There’s a common misunderstanding that you have to choose between Hadoop and Spark,” said David Chaiken, CTO, Altiscale. “Hadoop is a large ecosystem that includes storage, security, and multiple ways to process your data. Spark is a computing paradigm that fits into and runs best in the Hadoop ecosystem. At Altiscale, customers get the full benefit of leveraging Spark’s complementary strengths.”
Altiscale customers are already using both MapReduce and Apache Spark on the Altiscale Data Cloud. For example, one customer that had already been using MapReduce to run regular billing analysis on tens of millions of customer records needed fast, daily geographic analysis and reporting. Apache Spark was quickly added to its Altiscale Data Cloud subscription, and the customer expects to expand its use.
Apache Spark runs reliably in the Altiscale Data Cloud, where all aspects of hardware, networking, security, software, tools, and operations are optimized for the processing and analysis of massive data sets. Learn more at https://www.altiscale.com/why-altiscale/
The Altiscale Data Cloud with Apache Spark is available now. Demonstrations of the Altiscale Data Cloud will be available in person at the Strata + Hadoop World conference at the San Jose Convention Center from February 17 through February 20, 2015, Booth #929. Web demonstrations are also available by request at https://www.altiscale.com/contact/
The Altiscale Data Cloud was created to provide organizations access to the only infrastructure “purpose-built” for Hadoop, as well as the operational expertise needed to execute complex Hadoop projects. By monitoring both the infrastructure and jobs, Altiscale provides unparalleled levels of service for its customers. Altiscale was founded in 2012, and its team has been at the forefront of Apache Hadoop – from its incubation at Yahoo to operating more than 40,000 Hadoop nodes. Because the company understands the transformative power of this technology and its challenges, no other organization is better positioned to deliver reliable and scalable Apache Hadoop. Investors include Accel Partners, General Catalyst, Northgate, and Sequoia Capital. For more information, please visit altiscale.com or follow us on Twitter @Altiscale.