The Power of Community: Apache Hadoop Global Bug Bash

By | May 14th, 2015 | Open Beta Platform

Last Friday, May 8th, Altiscale had the pleasure and honor of organizing a unique event for the Apache Hadoop community. Along with co-sponsors Hortonworks, Huawei, Infosys, and Pivotal, we held a global bug bashing event that involved registrants from eight countries and nine time zones. Over 150 people volunteered to participate in a day of work, as a collaborative effort to strengthen Hadoop and honor the work of the community by reviewing and resolving software patches.

Hadoop Global Bug Bash 2015

In addition to dozens of people participating online from Japan, China, Taiwan, India, Pakistan, the UK, Canada, and the US, there were three event locations – at Infosys’ Bangalore headquarters, at Hortonworks Seattle, and at Altiscale headquarters in Palo Alto. There was such enthusiasm for the event that Seattle had 50% more people show up to get some coding done than had actually registered. The Palo Alto event had a contributor who came all the way from Japan.

Hadoop Global Bug Bash Bangalore

Hadoop Global Bug Bash Palo Alto

Almost 900 patches available at the start...

When we first planned this event, we thought that we’d just host it in Palo Alto and get a few people online. We set a target of ten patches resolved, since that is well above the number resolved on an average day.

We had underestimated the enthusiasm and energy of the Hadoop community.

Registrations poured in from around the globe, including deeply experienced committers and enthusiastic newer contributors. Infosys wanted to host in Bangalore. Hortonworks wanted to host in Seattle. As the event grew in registrations and locations, we doubled the target patch fix goal to twenty.

Again, we underestimated the community.

On the day of the event, it was clear when the Tokyo and Bangalore crews were done that a target of twenty was far too low. They had already beaten it, soundly. After the event had moved across the United States and finished in Seattle and Palo Alto, 134 issues were closed in some way. 114 issues in “patch available” state were closed in some way, and 72 patches were fully committed to code.

It was a jaw-dropping result, and we were impressed with the dedication, focus, and enthusiasm of all of the participants. In Palo Alto, some committers kept working long after the scheduled close, even after the wind-down party had started, to do justice to the efforts of the original patch submitters and of the contributors who reviewed, and sometimes added to, their patches.

Bug Bash in one chart: 890 issues in “patch available” state reduced to 776 over a global day.

Cleaning out the backlog

Some contributors have been waiting a while for their issues and patches to be reviewed and resolved. The oldest issue that got resolved was from 2006. The entire 2007 backlog was cleaned up. The oldest patch that got committed had been waiting since 2010. We hope that people who have been waiting a while to see their work recognized will continue to contribute to the project.

A strong community makes a strong Hadoop.

Hadoop has become a critical application for many companies – it drives business strategy, it serves as a core application platform, it helps determine the work that businesses do every day. It’s important for the Hadoop community to remain active, to ensure that new features and bug fixes are committed to code in a timely fashion. We were happy to be of some assistance in making this happen.

One of the great aspects of this event was seeing the full range of companies involved. People came not just from companies whose core business involves selling Hadoop, such as Hortonworks, Cloudera, Pivotal, MapR, Altiscale, and IBM, but also from companies like Intel, Staples, InMobi, NTT, Juniper, TIBCO, and more. It is a testament to the power of Hadoop that there are now Hadoop experts at companies across the spectrum, from retail to mobile apps.

Supporting a strong ASF community, building a robust Hadoop ecosystem, and supporting critical enterprise capabilities are key tenets of the Open Data Platform, of which Hortonworks, Pivotal, Infosys, and Altiscale are members. Hortonworks, Pivotal, and Infosys are also major supporters of the ASF. It is through these two intertwined organizations working together that Hadoop will continue its rise as a critical solution addressing key business problems.

Let’s do it again!

Despite the long global day and the hours of work from so many people, the atmosphere at the close was euphoric. Or, sleep-deprived giddy. I, along with the other organizers, had been coordinating for weeks, California time and Bangalore time. On event day, technical organizer Allen Wittenauer and I stayed up from the Tokyo start through the India start, only to take a few hours of sleep in order to get ready for hosting the Palo Alto event.

We were pleased to see so much enthusiasm for doing another Global Bug Bash, and for spreading the support to additional Apache projects. We are thinking of an autumn event, perhaps around the Strata+Hadoop World NYC timeframe. Let Allen and me get a little more recovery sleep, and we’ll be back with more details.

Thanks again to everyone who participated. Special thanks to the Bangalore team led by Infosys, Huawei, and Intel, to the NTT crew who kicked off the entire event from Japan, and to the Seattle Hortonworks team, who stepped in to host at the last minute and whose event ended up being quite a party. We look forward to seeing you again soon.