Altiscale is now part of SAP.


A big blog for Big Data.

By | September 13th, 2016 | HDFS File Browser

Improving the Hadoop DFS Web UI

Altiscale contributed features to make the HDFS web browser fully usable. HDFS users who do not want to wait on the command line can now easily and intuitively get their work done via this interface. For now, these features are slated to be available with Hadoop 3. At Altiscale we chose to backport these upgrades to our current Hadoop clusters.

By | September 12th, 2016 | Spark

Apache Spark 2.0 Now Available

Apache Spark 2.0, released in late July, is now available on the Altiscale Data Cloud. As one of the most active open-source Big Data projects under development, Spark is an integral component of the Big Data ecosystem and is used by a majority of Altiscale customers in areas such as interactive SQL and data transformation. The addition of Spark 2.0 to Altiscale’s Spark-as-a-Service offering provides improved performance, expanded SQL support, and streamlined programming APIs.

By | July 7th, 2016 | Big Data News

Big Data News: Talend files for IPO, Big Data in the UK, Manufacturing, and Education

Talend joins the early wave of Big Data companies seeking financing from the public market. A UK Parliamentarian ponders the benefits and perils of Big Data, while Forbes seeks a better forecasting method in the wake of Brexit. Stanford University weighs the ethical concerns of using Big Data to manage student retention, while Georgia Tech creates a new data engineering institute.

By | June 14th, 2016 | Altiscale Data Cloud

The Real Time Is Now! Announcing the Altiscale Insight Cloud Real-Time Edition

Back in March, we took on the challenge of making Big Data accessible to a broader range of users with the launch of the Altiscale Insight Cloud. Through a combination of pre-integrated components and user-friendly interfaces, we delivered a set of self-service ingestion, transformation, and analytic capabilities into the hands of business users, allowing them to obtain value more quickly and easily from powerful tools like Hadoop and Spark.

By | June 9th, 2016 | Hadoop

How to Identify and Resolve Hadoop NodeGroup Performance Problems Part 2.1: NodeGroup Performance on Containerized Clusters

In the earlier parts of this series, we showed how to achieve similar performance across hardware clusters with NodeGroup. Then we started to experiment with NodeGroup on Docker containerized clusters. This post builds on the previous two posts in the Hadoop NodeGroup performance series to discuss how to identify and resolve NodeGroup performance problems.