My Impressions of the Hadoop Summit Dublin 2016

The Hadoop Summit is a tech-conference hosted by Hortonworks, being one of the biggest Apache Hadoop distributors, and Yahoo, being the company in which Hadoop was born. Software developers, consultants, business owners, administrators, that have a mutual interest in Hadoop and the technologies of its ecosystem, all gathered in Dublin – this year’s Hadoop Summit of Europe took place in Ireland. The Hadoop Summit 2016 Dublin had some great  keynotes, plenty of time to network and a lot of exciting talks about bleeding edge technology, its use cases and success stories. Also it was a great opportunity for companies working with Hadoop to present themselves and for the visitors to get to know them.

20160414_103139
Keynote: “Data is Beautiful”

The organisation of the conference was great. 1300 people participated, but it never felt crowded, nor were there any (big) waiting lines to enter the speaker rooms or at the lunch buffet.

My Favorite Talks

This is a list of my favorite talks in a chronological order with their videos embedded. To be honest, this list is basically almost all of the talks that I saw in person and probably I missed even more great talks, that were given in parallel. Fortunately, we can see all of them on the official Hadoop Summit 2016 Dublin Youtube channel.

  • SQL streaming: This talk gave a really nice overview of the development of an SQL streaming solution with all its technical challenges and how they were addressed. Also simple technical use cases were discussed and compared to traditional SQL, where each query terminates, whereas streaming SQL queries never terminate.

  • Hadoop at LinkedIn: Here we got valuable insights into the Hadoop landscape of LinkedIn, as well as job monitoring and automated health checks. A job monitoring tool, Dr. Elephant, developed by LinkedIn was open sourced only a few days before the start of the Summit.

  • IMG-20160423-WA0000Containerization at Spotify:  This talk was about how Spotify uses docker containers and the tools involved in their automated IT landscape. The best part starts at 39:30, where it is revealed, that Spotify overcomes security challenges by not implementing internal security measurements at all. According to the speaker everyone can access everyones data. If life could always be as simple as that 🙂

  • Apache Zeppelin + Apache Livy: Apache Zeppelin already is a great tool for interactive data analysis, exploration or even doing ETL tasks using Apache Pig, querying data using Apache Hive, as well as executing Python, R or bash scripts. Apache Livy helps data scientists work together in one notebook on a secure cluster. What I like a lot about this talk is, that the speakers nicely explain the authentication mechanism involved.

  • Apache Phoenix: Apache Phoenix is a SQL query engine on top of Apache HBase and much more. This talk was basically a view on the capabilities and features of Apache Phoenix. Great stuff – nothing more to add. Watch the video!

10 Years of Hadoop Party

In the night of day one, the Guinness storehouse was utilized as a huge burger-beer-and-big-data networking event. As you can imagine there was good food, Guinness, great music by Irish bands on several floors and of course most importantly the same cool people attending the conference.

IMG_20160413_200647.jpg
Author in the Guinness storehouse

Summary

My first Hadoop Summit attendance was a great experience in all its particulars. I got great contacts, gained lots of knowledge and had lots of fun at the same time. Hopefully, I will be able to attend the next Hadoop Summit 2017 in Munich.

Leave a comment