I had always wanted to attend a Maker Faire whenever I heard about one: a fair for people who build and create things with their own minds and hands; a fair that shows children how accessible technology is and how easy it is to get started building their own things; a fair for those who do instead of just talk. Until now, however, I never had the time, or was simply too far away, to attend. Last weekend, April 16 and 17, the Maker Faire came to Vienna, which meant that I finally had the opportunity.
The One Love Machine Band
There is not enough space here to list all the great and cool things shown at the Maker Faire Vienna 2016, but we saw a machine that automatically baked "Palatschinken", the typical Austrian pancakes, and another that mixed the typical Austrian "Spritzer", a blend of sparkling water and wine. A technical college showcased its pupils' cool projects, and the racing team and the space team of the Vienna University of Technology were there as well. Vienna's hacker spaces had their own booths, and many, many more people demonstrated their skills; it's a pity that it isn't feasible to name them all. High rooms, old wooden floors, big wooden beams across the room and huge pillars gave the event a special atmosphere.
Since there was so much to see and I had to prepare my own talk, "Smart Home – from Maker to Market", I can't really say anything about the other talks. They are all available on Vimeo, so they are worth watching if you are interested and couldn't attend the sessions.
My talk was not so much about a single smart home, but about what would happen if we connected many smart homes: the benefits we could gain, the challenges, and the high-level architecture. I would have liked to cover all of these things in more detail, but I was limited to 30 minutes. Watch the video [German], and if you have questions, don't hesitate to contact me.
I am excited to see this event outgrow its current location, and I hope the Maker Faire Vienna 2017 comes back with even more makers, more cool projects and more great people.
A friend said, "Vienna needs a Hadoop User Group", and I agreed. The next step was to create a Meetup group; Meetup is a platform where anyone can organize any kind of meeting on any topic. Hadoop has just started to gain some traction in Austria and Vienna, and I think it's the perfect time to start a group like this.
This group is for everyone in Vienna using Apache Hadoop, at any skill level. The focus of the group is clearly technical, with an eye on use cases. I will try to organize technical talks by Hadoop-related vendors for the sessions. I also want to establish opportunities to work together on real-world problems and get hands-on with Hadoop. In this group we will build a network of Hadoop users, discuss recent and interesting (technical) topics, eat, drink and, most importantly, have fun together.
I'd like the group to be interactive, so that everyone has the opportunity to contribute.
For the first Meetup on Wednesday, May 18, I plan to briefly introduce the goals of the group. I believe all members should brainstorm together about what we expect of the group in the future, how often we should meet, and which topics we want to work on.
My ideas on how it could look in the future:
- One of us could provide some code and walk the others through it. That way, the more experienced among us can provide feedback and hints on what to improve, and the less experienced gain knowledge.
- We could define a project to work on together: e.g., building a Hadoop cluster out of Raspberry Pis, writing streaming applications in Apache Storm or Apache Spark, or whatever you want.
- I plan to combine the Meetup every now and then with the Vienna Kaggle Meetup and do a session about “Data Science and Hadoop”.
- Similarly to the Vienna Kaggle group, I have created a Git organisation for code that we work on together. If you are interested in joining, just contact me and I will give you access.
I am looking forward to getting to know you and to hearing your ideas on what you would like to contribute to the group.
The Hadoop Summit is a tech conference hosted by Hortonworks, one of the biggest Apache Hadoop distributors, and Yahoo, the company in which Hadoop was born. Software developers, consultants, business owners and administrators with a shared interest in Hadoop and the technologies of its ecosystem all gathered in Dublin, as this year's European Hadoop Summit took place in Ireland. The Hadoop Summit 2016 Dublin had some great keynotes, plenty of time to network and a lot of exciting talks about bleeding-edge technology, its use cases and success stories. It was also a great opportunity for companies working with Hadoop to present themselves and for the visitors to get to know them.
Keynote: “Data is Beautiful”
The organisation of the conference was great: 1300 people participated, but it never felt crowded, nor were there any (big) waiting lines at the speaker rooms or the lunch buffet.
My Favorite Talks
This is a list of my favorite talks in chronological order, with their videos embedded. To be honest, this list contains almost all of the talks I saw in person, and I probably missed even more great talks given in parallel. Fortunately, all of them can be watched on the official Hadoop Summit 2016 Dublin YouTube channel.
- SQL streaming: This talk gave a really nice overview of the development of an SQL streaming solution, with all its technical challenges and how they were addressed. Simple technical use cases were also discussed, contrasting traditional SQL, where each query terminates, with streaming SQL, where queries never terminate.
- Hadoop at LinkedIn: Here we got valuable insights into the Hadoop landscape at LinkedIn, as well as into job monitoring and automated health checks. Dr. Elephant, a job monitoring tool developed by LinkedIn, was open-sourced only a few days before the start of the Summit.
- Containerization at Spotify: This talk was about how Spotify uses Docker containers and the tools involved in their automated IT landscape. The best part starts at 39:30, where it is revealed that Spotify overcomes security challenges by not implementing internal security measures at all; according to the speaker, everyone can access everyone's data. If only life were always that simple 🙂
- Apache Zeppelin + Apache Livy: Apache Zeppelin is already a great tool for interactive data analysis and exploration, or even for ETL tasks using Apache Pig, querying data using Apache Hive, and executing Python, R or Bash scripts. Apache Livy helps data scientists work together in one notebook on a secure cluster. What I like a lot about this talk is that the speakers explain the authentication mechanism involved very nicely.
- Apache Phoenix: Apache Phoenix is a SQL query engine on top of Apache HBase, and much more. This talk was basically an overview of the capabilities and features of Apache Phoenix. Great stuff – nothing more to add. Watch the video!
10 Years of Hadoop Party
On the night of day one, the Guinness Storehouse was turned into a huge burger-beer-and-big-data networking event. As you can imagine, there was good food, Guinness, great music by Irish bands on several floors and, most importantly of course, the same cool people who attended the conference.
Author in the Guinness Storehouse
My first Hadoop Summit was a great experience in all its particulars: I made great contacts, gained lots of knowledge and had lots of fun at the same time. Hopefully, I will be able to attend the Hadoop Summit 2017 in Munich as well.
Apache NiFi is part of the Hortonworks DataFlow (HDF) product and manages data flows. The Raspberry Pi is a small, open source, multi-purpose computer. If you are not familiar with one or more of these products, just follow the links for more information. 🙂
Hardware and Software Specifications
Impressions and Remarks
- The docs say that after installation the command
service nifi start
should work out of the box, but for me it did not work without further modifications; a different invocation was needed.
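A minimal sketch of the usual alternative, assuming a default install under /opt/nifi (both the install path and the use of NiFi's bundled nifi.sh launch script are assumptions here):
/opt/nifi/bin/nifi.sh start   # assumption: install path; nifi.sh ships with every NiFi distribution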
- After starting NiFi, I tried to access the web interface, but it didn't work. I checked the logs, but everything seemed alright; I saw something like the following in nifi-bootstrap.log:
2016-04-02 21:06:29,563 INFO [NiFi Bootstrap Command Listener] org.apache.nifi.bootstrap.RunNiFi Apache NiFi now running and listening for Bootstrap requests on port 47094
After 6 minutes and 3 seconds, though, the web interface was available. As you can see in the screenshot below, HDF takes 100% of one core of the RasPi during the start-up process:
The HDF start-up process occupies one full core of the RasPi
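Instead of repeatedly refreshing the web interface, the start-up progress can be followed in the bootstrap log; a minimal sketch, assuming the same /opt/nifi install path as above:
tail -f /opt/nifi/logs/nifi-bootstrap.log   # assumption: install path; prints bootstrap messages as they arrive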
- After the webserver is up and running, NiFi’s resource usage looks more moderate:
NiFi needs about 16.7% CPU (of the 400% available across the RasPi's four cores) and almost 40.5% of its RAM
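These numbers can be checked with top; a minimal sketch, assuming the main NiFi JVM runs the class org.apache.nifi.NiFi, so pgrep can find its process id:
top -p "$(pgrep -f org.apache.nifi.NiFi | head -n 1)"   # assumption: main class name; shows CPU and memory usage of the NiFi JVM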
- I followed the "Getting Started" guide, in which NiFi is configured with two processors: one reads files from disk, sends them to the other processor and deletes them; the other just receives the files and logs their information to nifi-app.log. Although the name of the "LogAttribute" processor is quite self-explanatory, the official documentation does not describe what it actually does. I found an amazing blog post on www.nifi.rocks where quite a lot of processors are described. A quick shell check of this flow is sketched below, after the screenshots.
Writing a file, then being deleted by the NiFi GetFile processor 100000 times, then …
…, then getting transferred to the LogAttribute processor, and finally …
… the LogAttribute processor logs the incoming FlowFile data in nifi-app.log.
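To verify the flow end to end from the shell, write a file into the watched directory and follow the application log; a minimal sketch, where /tmp/nifi-in is a placeholder for whatever directory the GetFile processor is configured to watch:
echo "hello nifi" > /tmp/nifi-in/test.txt   # placeholder: the input directory configured in GetFile
tail -f /opt/nifi/logs/nifi-app.log         # assumption: install path; LogAttribute entries appear here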
NiFi is as easy to install on a Raspberry Pi as anywhere else and stands out with all of its features, being complex but not complicated. I did not test many different processors on the RasPi, nor did I test this simple setup with large amounts of data, but even in its simplicity the possibilities are endless. Combining the power and ease of use of the RasPi's GPIOs with NiFi's power and simplicity in directing and redirecting data flows, practically every child could, e.g., send temperature sensor data into a Hadoop file system and even process and filter it on the way.