Altiscale is now part of SAP.


A big blog for Big Data.

June 9th, 2016 | Hadoop

How to Identify and Resolve Hadoop NodeGroup Performance Problems Part 2.1: NodeGroup Performance on Containerized Clusters

In the earlier parts of this series, we showed how to achieve similar performance on hardware clusters with NodeGroup. We then began experimenting with NodeGroup on Docker-containerized clusters. This post builds on the previous two posts in the Hadoop NodeGroup Performance series to discuss how to identify and resolve Hadoop NodeGroup performance problems.

April 27th, 2016 | Hadoop, NodeGroup, Performance

Part 1.2: Investigation, Analysis, and Resolution of NodeGroup Performance Issues on Bare Metal Hardware Clusters

As we saw in Part 1.1 of our blog series, “How to Identify and Resolve Hadoop NodeGroup Performance Problems on Hardware Clusters with No Virtualization,” performance on the DFSIO benchmark degraded once we started using the NodeGroup implementation. In this post, Part 1.2 of the series, we explain the steps we took to identify and resolve this problem.

March 1st, 2016 | Analytics, Big Data, Hadoop

How to Identify and Resolve Hadoop NodeGroup Performance Problems Part 1.1: Performance on Hardware Clusters with No Virtualization

In this blog series, we’ll discuss the performance of Hadoop NodeGroup on both hardware and virtualized clusters. We at Altiscale did this work as a precursor to launching a new initiative that will use NodeGroup to increase the performance, scalability, and customizability of container-aware Hadoop. This post, Part 1.1 of the series, introduces the configuration of rack-aware replica placement with the NodeGroup implementation and evaluates the performance of the DFSIO benchmark when NodeGroup is enabled.