
Cutting Through Big Data Complexity with Spark-as-a-Service

As we discussed in an earlier blog post, “Spark and Hadoop: Friends not Foes,” IBM announced back in June that it would make massive investments in Spark-related technologies. Now, Big Blue is moving forward with that commitment by making IBM Analytics available on Spark.

Welcome to the party, IBM.

Altiscale’s customers have been using Spark as part of our Big Data platform since our initial launch of the Altiscale Data Cloud in February 2014. Based on our extensive Spark-as-a-Service experience, we predicted that interest in its fast in-memory processing would continue to heat up as more and more organizations look to rapidly process large volumes of data. As announcements like IBM’s demonstrate, that prediction is clearly coming to pass.

Spark: Neither Quick Nor Easy

Our experience has also shown us that, for most organizations, getting started with Spark is neither quick nor easy. Running Spark is resource-intensive, and the operational challenges only grow as data volumes expand. Adding to the complexity (and potential confusion) is the fact that Spark is not purely a standalone solution or an alternative to Hadoop, as some recent media headlines might have us believe. It runs well on top of Hadoop as one option among several processing frameworks (e.g., MapReduce, Spark, and Tez).

Cutting Through Complexity and Confusion with Spark-as-a-Service

At Altiscale, we’re cutting through the complexity and confusion surrounding Spark and how best to “do” Big Data with our recent release of the Altiscale Data Cloud 4.0. This latest version of our Big Data platform includes an expanded Spark-as-a-Service offering with full operational support, Hadoop YARN for resource management, and a complete data science workbench. The Altiscale Data Cloud delivers production-ready Big Data solutions so that all of our customers’ jobs can get done, not just those appropriate for Spark.

We Leave No One Behind

As a cloud provider, Altiscale is also able to address the challenges created by the rapid evolution of Spark. So that organizations can absorb changes at their own pace, we’re uniquely offering full support for all major recent Apache Spark versions (1.5.0, 1.4.1, 1.3.1). This provides not only access to Spark’s latest features and performance improvements, but also the ability to run prior versions as needed for already-built analytical applications or data analysis.

A Secure Solution with In-House Certifications

And let’s not forget the importance of solution security. Customers of huge vendors have to hope that those vendors’ own myriad vendors and subcontractors are compliant with security requirements. Altiscale’s customers, on the other hand, can be certain that our security compliance is valid and up to date. The Altiscale Data Cloud provides enterprise-class security with automatic Kerberos authentication, SOC 2 certification, and compliance with the requirements of the Payment Card Industry Data Security Standard (PCI DSS) and the Health Insurance Portability and Accountability Act (HIPAA). We have the in-house certifications to prove it.

For companies like IBM that want to increase the appeal of their analytics offerings, jumping on the Spark bandwagon is an obvious choice. What’s also obvious is that organizations are better off with a comprehensive Spark-as-a-Service offering backed by a team of Spark experts. At Altiscale, we understand how Spark fits into the Big Data big picture, and we’re here to make sure our customers quickly and easily make the most of its benefits.