Apache NiFi is part of the Hortonworks DataFlow (HDF) product and manages data flows. The Raspberry Pi is a small, open source, multi-purpose computer. If you are not familiar with one or more of these products, just follow the links for more information. 🙂
Hardware and Software Specifications
Impressions and Remarks
- Docs say that after installation the command
service nifi start
should work out of the box, but for me only this works without further modifications:
- After starting NiFi, I tried to access the web interface, but it didn’t work. I checked the logs, and everything seemed all right. I saw something like the following in nifi-bootstrap.log:
2016-04-02 21:06:29,563 INFO [NiFi Bootstrap Command Listener] org.apache.nifi.bootstrap.RunNiFi Apache NiFi now running and listening for Bootstrap requests on port 47094
After 6 minutes and 3 seconds, the web interface finally became available. As you can see in the screenshot below, HDF takes 100% of one core of the RasPi during the start-up process:
The HDF start-up process occupies one full core of the RasPi
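If you want to measure the start-up delay on your own Pi instead of watching the clock, a small polling helper can do it. This is only a sketch: the function name wait_until is made up for illustration, and the URL in the example call assumes NiFi's web interface listens on port 8080 of the local machine, which may differ in your configuration.

```shell
#!/bin/sh
# wait_until TIMEOUT INTERVAL COMMAND...
# Re-runs COMMAND every INTERVAL seconds until it succeeds or until TIMEOUT
# seconds have passed. Prints the elapsed seconds on success and returns 1
# on timeout. (Hypothetical helper, not part of NiFi or HDF.)
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    start=$(date +%s)
    while :; do
        if "$@" >/dev/null 2>&1; then
            now=$(date +%s)
            echo $((now - start))
            return 0
        fi
        now=$(date +%s)
        if [ $((now - start)) -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
    done
}

# Example call (assumed URL, adjust host and port to your setup):
# wait_until 900 5 curl --silent --fail --output /dev/null http://localhost:8080/nifi
```

Started right after launching NiFi, the example call would print roughly how many seconds passed before the web interface answered, or give up after 15 minutes.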
- After the webserver is up and running, NiFi’s resource usage looks more moderate:
NiFi needs about 16.7% CPU (of the 400% available across all cores) and about 40.5% of the RasPi’s RAM
- I followed the “Getting Started” guide, in which NiFi is configured with two processors. The first one reads files from the disk, sends them to the other processor and deletes them. The second one just receives the files and logs their information to nifi-app.log. Although the name of the “LogAttribute” processor is fairly self-explanatory, the official documentation does not describe what it actually does. I found an amazing blog post on www.nifi.rocks, where quite a lot of processors are described.
A file is written and then deleted by the NiFi GetFile processor, 100000 times over, then …
…, then it gets transferred to the LogAttribute processor, and finally …
… finally the LogAttribute processor logs the incoming FlowFile data in the nifi-app.log.
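To feed such a flow with test data, you can generate the input files with a small loop. The following is a sketch under assumptions: the function name generate_files, the file names and the demo directory are invented for illustration, and the directory has to match whatever input directory you configured in your GetFile processor.

```shell
#!/bin/sh
# generate_files DIR COUNT
# Writes COUNT small text files into DIR, one after the other.
# (Hypothetical helper for producing test input.)
generate_files() {
    target_dir=$1
    count=$2
    mkdir -p "$target_dir"
    i=1
    while [ "$i" -le "$count" ]; do
        # Write to a hidden temporary name first and rename afterwards, so
        # that a watcher configured to ignore hidden files never picks up a
        # half-written file.
        tmp="$target_dir/.sample_$i.tmp"
        printf 'sample record %s\n' "$i" > "$tmp"
        mv "$tmp" "$target_dir/sample_$i.txt"
        i=$((i + 1))
    done
}

# Assumed demo directory; replace it with the directory GetFile reads from.
generate_files "${TMPDIR:-/tmp}/nifi-input-demo" 10
```

Raising the count to 100000 reproduces the scale of the run shown in the screenshots, although on a Pi's SD card that many writes will take a while.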
NiFi is as easy to install on a Raspberry Pi as anywhere else and stands out with all of its features, being complex but not complicated. I did not test a lot of different processors on the RasPi, nor did I test this simple setup with large amounts of data, but even in its simplicity the possibilities are endless. Combining the power and ease of use of the RasPi’s GPIOs with NiFi’s power and simplicity to direct and redirect data (flows), practically every child can, e.g., send temperature sensor data into a Hadoop File System and even process and filter it on its way.
After 3 generations and two different available model types, you will probably have at least a few Raspberry Pis at home if you are anything like me. Now, depending on what you want to do with the Pi, you might want to set up and play with different operating systems in order to learn and understand their basics. Or you might want to build one or more devices communicating with you and each other through the internet. Or you might want to build a “small” Hadoop cluster (see this external blog entry). Or you might want to benchmark some software or the Pis themselves on all 3 generations just for the sake of benchmarking 😉 (read this blog post on the official Raspberry Pi website). Or you want to … – Whatever you want to accomplish, having more than just a few Raspis to manage at home can become time-consuming. Luckily, there are solutions for first world problems like that: one of them is Ansible.
Many devices are best managed with the right tool to save time and complications.
Getting Started with Ansible
So what is Ansible and how does it work? I will repeat the official documentation only far enough to say that Ansible was created to manage and configure multiple nodes. It does so from a central control machine – in this case your desktop or notebook computer – which pushes code, configuration and commands to your remote devices.
For more details:
There are several reasons to use Ansible – or similar tools – to manage your Raspis:
- Ansible can be easily installed on your computer and you are ready to go.
- Ansible uses SSH to connect to your devices – the same way you do.
- Fast setup of your Raspis. Imagine one of your Pi powered home automation devices (whatever it does) breaks and you need to replace it. Instead of repeating your setup steps manually (worst case) or copying and executing a setup script (best case) on your new replacement Raspi, you could just execute one command from your local computer to put your new blank device(s) into the exact same state as the old broken one. Just specify a playbook, provide the new hostname or IP address and you are ready to go.
- Remote simultaneous maintenance. Do you want to upgrade your devices? Do you want to install a new package on all of them? Do it simultaneously on all of them with one Ansible command.
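As a rough sketch of the "one command for all devices" idea: the helper below writes a minimal inventory file, and a second function wraps an ad-hoc Ansible call that upgrades the packages on every listed Pi. The group name raspis, the file name hosts.example, the IP addresses and both function names are assumptions made for illustration; the remote user pi is the Raspbian default.

```shell
#!/bin/sh
# write_inventory FILE HOST...
# Writes a minimal INI-style Ansible inventory with one group.
# (Group name "raspis" is a made-up placeholder.)
write_inventory() {
    inventory_file=$1
    shift
    {
        echo "[raspis]"
        for host in "$@"; do
            echo "$host ansible_user=pi"
        done
    } > "$inventory_file"
}

# upgrade_all FILE
# Refreshes the package cache and upgrades all packages on every host in the
# group. Needs Ansible on the controlling machine and SSH access to the Pis;
# --ask-pass prompts for the SSH password, --become runs the task via sudo.
upgrade_all() {
    ansible raspis -i "$1" --become --ask-pass \
        -m apt -a "update_cache=yes upgrade=dist"
}

# Demo with placeholder addresses: only writes and prints the inventory.
write_inventory "${TMPDIR:-/tmp}/hosts.example" 192.168.1.21 192.168.1.22
cat "${TMPDIR:-/tmp}/hosts.example"
```

Calling upgrade_all with the path of the inventory file then runs the upgrade on all listed devices in parallel. It is deliberately not called in the sketch, since it changes the remote systems.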
Raspberry Pi and Ansible
I put a simple Ansible playbook on GitHub: https://github.com/Condla/ansible-playground/tree/master/raspbian-bootstrap. It sets up one or more of your Raspberry Pis running a fresh Raspbian installation. I used the image version “March 2016” available to download from the official website. This playbook bootstraps your Raspberry Pi 3 to be used over your WPA WiFi network, if you provide a correct SSID and password as a playbook variable. It will additionally install software required to use Amazon’s AWS IoT Node.js SDK (AWS IoT Device SDK Setup).
After the first boot of your Raspberry Pi, follow these few steps in order to bootstrap your machine.
- Install Ansible and Git on your “Controller” machine. Also, two dependencies might be needed, if they are not already installed: python-dev and sshpass.
- Clone this git repository.
- Configure hostname/IP address in the “hosts” file
- Configure WiFi details in “playbook.yml”
- Unfortunately: Log in to the Raspi and expand the SD card with “sudo raspi-config”. This is one open point to be automated.
- Execute the playbook
# Install Ansible and Git on the machine.
sudo apt-get install python-pip git python-dev sshpass
sudo pip install ansible
# Clone this repo:
git clone https://github.com/Condla/ansible-playground.git
# Configure IP address in "hosts" file. If you have more than one
# Raspberry Pi, add more lines and enter details
# Configure WiFi details in "playbook.yml" file.
# Execute playbook
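A typical way to run a playbook against an inventory file could look like the sketch below. It assumes the file names hosts and playbook.yml from the steps above, that you are inside the raspbian-bootstrap directory of the cloned repository, and that password-based SSH login is used (hence sshpass). The function name run_playbook is made up, and the repository itself may document a different invocation.

```shell
#!/bin/sh
# run_playbook [INVENTORY] [PLAYBOOK]
# Checks that both files exist, then hands them to ansible-playbook.
# Defaults are the file names mentioned in the steps above.
# (Hypothetical wrapper, not part of the repository.)
run_playbook() {
    inventory=${1:-hosts}
    playbook=${2:-playbook.yml}
    for f in "$inventory" "$playbook"; do
        if [ ! -f "$f" ]; then
            echo "missing file: $f" >&2
            return 1
        fi
    done
    # --ask-pass prompts for the SSH password of the remote user.
    ansible-playbook -i "$inventory" "$playbook" --ask-pass
}
```

Calling run_playbook without arguments uses the default file names. The pre-flight check only catches a wrong working directory early; everything else is left to ansible-playbook.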
Outlook and Appendix
Getting Started with the Raspberry Pi
There are so many excellent tutorials and project descriptions out there already. Just make sure you visit the official Raspberry Pi website.