A big blog for Big Data.

By | May 20th, 2016 | Documentation

Altiscale Documentation for Hadoop and Spark

In the Wild West that is Big Data, new features are made available at a relentless pace. In 2015, for example, Apache Spark had four releases with hundreds of new capabilities. How can you stay on top of the latest developments in Hadoop, Spark, and other platform capabilities at Altiscale? It’s easy. Read the fabulous manual. The Altiscale technical writing team prides itself on providing clear, straightforward direction to ensure that you quickly have the understanding that you need to be successful with Big Data. We have even run into people at conferences who have said, “Oh, you’re from Altiscale? You have the best Hadoop documentation around. I can actually understand it.”

By | May 11th, 2016 | Awards

Independent Research Firm Recognizes Altiscale as a Strong Performer in Big Data Hadoop Cloud Solutions report

For many years, Hadoop has been mostly deployed on-premises. Starting this past year, however, a majority of enterprises report that they are looking to the cloud as the best place to experience Big Data, due to its superior cost model and scalability and because their concerns about cloud security are being rapidly overcome. Now, Forrester Research has come out with “The Forrester Wave™: Big Data Hadoop Cloud Solutions, Q2 2016,” and we are delighted to report that Altiscale is recognized as a Strong Performer among the eight cloud Hadoop providers evaluated.

By | May 4th, 2016 | Big Data News

Big Data News: Hadoop at American Express, Big Data and the CMO, the genomist, and the expert traveler

American Express talks to Forbes about its Big Data journey, which serves as a textbook case for the trials and tribulations of on-premises Hadoop deployment. SQL on Hadoop continues its evolution and grows in demand, while Kafka holds its first summit. And how can Big Data help you avoid being kidnapped in Bogota? There’s actually an app for that.

By | May 2nd, 2016 | Awards

Altiscale Named to CRN Big Data 100 for 2nd Year in a Row

Today, Computer Reseller News announced its 2016 CRN Big Data 100. Altiscale is honored to be included in this group of companies that are recognized for delivering innovative technologies and services that help organizations gain meaningful insight from the increasingly vast amounts of data in today’s business environment. Altiscale customers are using Big Data to develop new products and services, identify cyber threats and fraud, better understand their customers and rapidly shifting market environments, and optimize their purchasing and operations. We are delighted to be recognized as a Big Data leader by CRN for the second year in a row.

By | April 27th, 2016 | Hadoop, NodeGroup, Performance

Part 1.2: Investigation, Analysis, and Resolution of NodeGroup performance issues on Bare Metal Hardware clusters.

As we saw in Part 1.1 of our blog series on “How to Identify and Resolve Hadoop NodeGroup Performance Problems on Hardware clusters with no virtualization,” once we started to use the NodeGroup implementation, we observed a performance degradation on the DFSIO benchmark. In this blog, Part 1.2 of the series, we’re going to explain the steps we’ve taken to identify and resolve this problem.

By | April 21st, 2016 | Hadoop as a Service, Spark, Spark on Hadoop

Hadoop-as-a-Service in the Classroom

This entry is by Jimmy Lin, Professor and the David R. Cheriton Chair in the David R. Cheriton School of Computer Science at the University of Waterloo. Jimmy just finished teaching a big data course to a group of about 70 undergraduate and graduate students. A big impediment to conducting courses of this type is the lack of large-scale Hadoop infrastructure. Fortunately, Altiscale was able to provide a Hadoop cluster where Jimmy's students could conduct their coursework. Below, Jimmy shares his findings and experiences over the course of the semester.

By | April 19th, 2016 | Spark on Hadoop

Spark On Hadoop: Thin JARs

Thus far in this blog series we have focused on the Apache Spark framework, with an emphasis on RDDs, resource tuning, and memory settings. This blog, Part 5 in the series about Spark on Hadoop, will cover some best practices for building a Spark JAR file using either the Simple Build Tool (SBT) or Apache Maven.

By | April 14th, 2016 | Big Data News

Big Data News: Battle for Hadoop standards continues, RBS and Big Data, your face as data

As the Hadoop ecosystem continues to develop and expand, Spark is expected to be a dominant project over the next 6 years, according to Wikibon. ODPi announces its runtime spec to further Hadoop ecosystem standards. RBS tries to use Big Data to go back in time to a better customer service experience. And how do you feel when your face is data? An art student finds it too easy to find “strangers” online.