Busting out of Hadoop incrementalism: Going beyond “cheaper, faster” to “fundamentally new and different”, Part 1

How is the advent of Hadoop like the advent of electricity? This blog post explores how technology gets adopted, understood, and increasingly leveraged over time. Companies often use technology in familiar ways; the full value is realized when other elements are in place to maximize use.

March 26th, 2015 | Hadoop, Hadoop as a Service

DockerContainerExecutor: Launching Docker Containers in YARN

Ensure job success by using Docker containers in YARN. Docker containers keep tasks from interfering with one another and reduce the configuration management and maintenance burden on cluster administrators.

March 24th, 2015 | Docker Containers

Open Data Platform, Part 3: Why do we need a new organization?

In the previous post, we explained the fundamental value that the ODP will provide: an agreed-upon, tested, and integrated core of Hadoop ecosystem components. The ODP will accelerate innovation on top of Hadoop by dramatically reducing the friction felt by enterprise customers and third-party application providers who want to use Hadoop as a platform.

One natural question that we […]

March 6th, 2015 | Hadoop, Open Data Platform

What is the problem that the Open Data Platform seeks to solve?

Last week, the Open Data Platform (ODP) was announced and written about in the technology and business press. While Altiscale has received strong support for its participation in the ODP, we have also received many questions about why the ODP is necessary and the exact role that it would serve. In this blog post, we want to explain more […]

February 24th, 2015 | Hadoop, Spark

Apache Spark Now on the Altiscale Data Platform

Apache Spark, the increasingly popular in-memory analytics platform, is now available on Altiscale’s Hadoop-as-a-Service (HaaS).

Spark is especially well suited for machine learning and other memory-intensive, iterative processes. Spark is increasingly used for stream processing as well. If you struggle with the latency encountered in many MapReduce jobs, consider using Spark on the Altiscale HaaS. Spark employs a distributed […]

February 18th, 2015 | Big Data, Hadoop, Hadoop as a Service, Spark

Altiscale Supports Kerberos Authentication for Hadoop

Altiscale now offers secure-mode Hadoop using Kerberos authentication. Kerberos is well known as a strong authentication mechanism, but it was designed for a client-server environment, not the highly dynamic and scalable Hadoop environment. As a result, many Hadoop clusters do not run in secure mode. This is a problem.

Without an authentication mechanism such as Kerberos, there is no way […]

February 18th, 2015 | Hadoop, Hadoop as a Service, Security

The Open Data Platform: Uniting for an Enterprise-Class Hadoop Ecosystem

This morning, a coalition of fourteen leading technology organizations announced the creation of the Open Data Platform (ODP), an industry association dedicated to accelerating the adoption of enterprise-class, big data applications that are based on the Apache Hadoop ecosystem of solutions. We at Altiscale are proud to be part of this initiative.

One of my roles at Yahoo! was to […]

February 17th, 2015 | Big Data, Hadoop, Hadoop as a Service

The Total Economic Impact™ Of Altiscale Hadoop-as-a-Service: Cost Savings And Business Benefits Enabled By Hadoop-as-a-Service

Altiscale commissioned Forrester Consulting to conduct a Total Economic Impact™ (TEI) study and examine the potential return on investment (ROI) enterprises may realize by deploying its Hadoop-as-a-Service (HaaS). The purpose of this study is to provide readers with a framework to evaluate the potential financial impact of Altiscale’s Hadoop-as-a-Service on their organizations.

To better understand the benefits, costs, and risks […]

February 11th, 2015 | Big Data, Hadoop

Meetup: Tips for building a Data Science Platform

Heading to Strata+Hadoop World next week? Come see our very own Dr. David Chaiken present at the Big Data Science @ Strata Meetup on Tuesday, February 17th at 5:30pm. The Meetup will take place at the San Jose Convention Center, Room 210AE.

David’s Talk:

In attempting to use Hadoop-based data, data scientists face two bad options: use Hadoop indirectly by using […]

February 11th, 2015 | Big Data, Data Science, Hadoop

Spark and Hadoop Together in the Cloud

When it comes to running Spark and Hadoop in the cloud, Altiscale provides the best of both worlds. For example, our cloud platform uses YARN for resource management. This means you can use MapReduce for large-scale batch processing while deploying Spark for in-memory, interactive analysis using GraphX, MLlib, or your own custom Spark applications.

Altiscale supports Spark […]

February 3rd, 2015 | Hadoop, HDFS, Spark, YARN