Using Apache NiFi to send data to Tinybird
Unfortunately, I don’t work with Apache NiFi very frequently anymore, but it remains one of my favourite tools in the data space. The Data In Motion team at Cloudera are doing a fantastic job at leading the development of NiFi through the open source project, though I would love to see much more of the K8S operator work being contributed back into Apache-land. At the very least, Cloudera’s Flow Management product (their commercial NiFi offering) is easily their best product. Honestly, I’d like to see the product spun out as its own company to compete in the wider data space.
Anyway, NiFi is a fantastic way to move data, and it’s ridiculously easy to use NiFi to send data to Tinybird. Tinybird has the Events API, which is just an HTTP endpoint that accepts POST requests with NDJSON-formatted data. So, we don’t even need any custom NiFi processors, as all of this can be achieved with the existing Record and HTTP processors in NiFi.
NDJSON in NiFi
Firstly, you’ll need to convert your data into NDJSON to send it to the Tinybird Events API. This is trivial to do with NiFi’s existing Record-based processors. Use the appropriate RecordReader to read the incoming format of your data, and pair it with a JsonRecordSetWriter to output JSON data. When you configure the JsonRecordSetWriter, you must set the **Output Grouping** property to **One Line Per Object**. If you hadn’t already guessed it, this is just another way of saying NDJSON. That’s all there is to it.
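If you haven’t come across NDJSON before, here’s a minimal Python sketch of what the format looks like on the wire: one compact JSON object per line, separated by newlines. This is only an illustration of the output format, not what NiFi does internally, and the `to_ndjson` helper and the sample records are made up for the example.

```python
import json


def to_ndjson(records):
    """Serialise an iterable of dicts as NDJSON: one JSON object per line."""
    # Compact separators keep each object on a single short line.
    return "\n".join(json.dumps(record, separators=(",", ":")) for record in records)


# Hypothetical sample records, purely for illustration.
records = [
    {"sensor": "a", "value": 1},
    {"sensor": "b", "value": 2},
]

ndjson = to_ndjson(records)
print(ndjson)
```

Each line is a complete JSON document on its own, which is why a receiver can process the data line by line without parsing one big array.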
Sending to Tinybird
To send your NDJSON data to the Tinybird Events API, all you need to do is build an HTTP POST request. Again, this is mega simple, as you can use the InvokeHTTP processor, which does everything you need.
Configure the InvokeHTTP processor and set the **HTTP Method** property to **POST**.
Then set the **HTTP URL** property to **https://api.tinybird.co/v0/events?name=my_datasource_name**. This uses a static string as the Data Source name (my_datasource_name) but, depending on your flow, you might use a parameter to set the Data Source name dynamically, e.g. **https://api.tinybird.co/v0/events?name=#{datasource_name}**. If the name lives in a FlowFile attribute instead, reference it with Expression Language, i.e. ${datasource_name}.
Now set the **Request Content-Type** property to **application/json**.
Finally, you’ll need to add the Authorization header to authenticate your request to the Events API. To add a header to a request, just add a new dynamic property to the InvokeHTTP processor. Set the property name to **Authorization** and then the value to **Bearer <your Auth Token>**. This is an Auth Token generated in Tinybird, and it will need an append scope to write to your Data Source in Tinybird. Again, you might want to use a (secure) parameter for your Auth Token, e.g. **Bearer #{auth_token}**.
Your NDJSON data is contained in the contents of the incoming FlowFile, which is sent along as the body of the request.
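To make it clear what InvokeHTTP ends up sending, here’s a rough Python equivalent of the request using only the standard library. Treat it as a sketch under assumptions: the `build_events_request` helper is hypothetical, the token is a placeholder, and the body is made-up sample data. It only builds the request; the line that would actually send it is left as a comment.

```python
import urllib.parse
import urllib.request

EVENTS_URL = "https://api.tinybird.co/v0/events"


def build_events_request(datasource_name, auth_token, ndjson_body):
    """Build (but don't send) a POST request mirroring the InvokeHTTP config."""
    # The Data Source name goes in the `name` query parameter.
    query = urllib.parse.urlencode({"name": datasource_name})
    return urllib.request.Request(
        url=f"{EVENTS_URL}?{query}",
        data=ndjson_body.encode("utf-8"),  # FlowFile content becomes the body
        method="POST",
        headers={
            "Authorization": f"Bearer {auth_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholder values, purely for illustration.
request = build_events_request(
    "my_datasource_name",
    "<your Auth Token>",
    '{"sensor":"a","value":1}\n{"sensor":"b","value":2}',
)

# To actually send it, you would call: urllib.request.urlopen(request)
print(request.get_method(), request.full_url)
```

The mapping back to NiFi is one-to-one: the method and URL are the **HTTP Method** and **HTTP URL** properties, the two headers are the content type property and the dynamic **Authorization** property, and the body is the FlowFile content.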
What’s left?
The rest of your flow! There are a few posts on my blog about building flows with NiFi, as well as links to other great NiFi resources here. I recently did a live stream about using Tinybird to build REST APIs over your Kafka streams, where I used a small Python script to pull data from the National Grid’s Carbon Intensity API and send it to Kafka. You could rebuild my Python script very easily in NiFi.
Whatever you build, just convert the data to NDJSON using the JsonRecordSetWriter and send it to Tinybird using InvokeHTTP :)