As an execution fabric for running applications, YARN faces the same software-dependency problem encountered by all operating systems—“DLL Hell” as it was once affectionately known. In one post we explained how lightweight Linux Containers combined with speedy Docker Images are a perfect way to solve this dependency problem. In another post, we highlighted some of the deficiencies of Docker […]
On June 15th, IBM announced plans to make massive investments in Spark-related technologies. This caps off almost a year of significant business and press attention for Spark.
Our customers have been using Spark since the launch of the Altiscale Data Cloud. During that time, Altiscale has included a version of Spark on our Hadoop-based big data platform, starting from Spark […]
This is (the long-awaited) part two of a blog series describing important Hadoop logs and their use at Altiscale. In part one, we discussed five logs that yield the most insights and information about our Hadoop 2/YARN clusters. In this post, we’ll discuss another five.
We depend heavily on log entries to effectively monitor and optimize Altiscale resources. It’s especially important […]
When using a service provider like Altiscale, customers are naturally very concerned about the safety and availability of their information assets. Many customers would like to perform a detailed security review of each of their service providers. However, as a practical matter, service providers cannot afford to support custom security reviews from hundreds to thousands of customers—and most customers […]
Gartner started quite a buzz in the big data world with a study showing that over half of respondents have no plans to invest in Hadoop at this time.
Did Gartner get it right? Does this mean bad things for Hadoop? For Altiscale?
Yes, not really, and hell no.
Gartner simply pointed out the difficult reality that Hadoop adoption is slower […]
In this blog post, we will look at Apache Oozie’s launcher job and ways to avoid some of the common pitfalls associated with it. We hope this provides our readers with a better understanding of Oozie’s execution model, including its subtleties.
Oozie’s Execution Model: A Different Approach
Oozie’s execution model is different from the default approach users take to run […]
As more of our customers begin running Spark on Hadoop, we’ve identified—and helped them to overcome—some challenges they commonly face. To help other organizations tackle these hurdles, we’re launching this series of blog posts that will share tips and tricks for quickly getting up and running on Spark and reducing overall time to value.
Our focus is Spark running on […]
Last Friday, May 8th, Altiscale had the pleasure and honor of organizing a unique event for the Apache Hadoop community. Along with co-sponsors Hortonworks, Huawei, Infosys, and Pivotal, we held a global bug bashing event that involved registrants from eight countries and nine time zones. Over 150 people volunteered to participate in a day of work, as a collaborative […]
Altiscale is excited to be named a 2015 Cool Vendor in Big Data by Gartner, Inc.*
In the report, Gartner notes the following regarding the reality of big data in enterprises…
“In short, with new technologies getting data volumes under control — through prescriptive analytics benefiting from high-velocity data, and with analytical technologies gradually expanding the variety of data types being analyzed […]
Today Computer Reseller News announced its 2015 CRN Big Data 100. Altiscale is honored to be included in this list of companies that are recognized for delivering innovative technologies and services that help organizations grapple with, and gain meaningful insight from, the vast amounts of data generated in today’s business environment.
In its announcement, CRN points out that data volumes […]