Airbnb Airflow vs Apache NiFi

CMPE · Sep 8, 2016 · Viewed 26.1k times

Do Airflow and NiFi perform the same job on workflows? What are the pros and cons of each? I need to read some JSON files, add custom metadata to them, and put them in a Kafka queue to be processed. I was able to do it in NiFi; I am still working on Airflow. I am trying to choose the best workflow engine for my project. Thank you!

Answer

JDP10101 · Sep 14, 2016

For a great overview of Airflow and Apache NiFi, check out this Reddit post: https://www.reddit.com/r/bigdata/comments/51mgk6/comparing_airbnb_airflow_and_apache_nifi/

For your specific use case of ingesting JSON files, enriching them, and routing them to Kafka, I believe NiFi is the right tool for the job. A couple of processors you could potentially use, along with the documentation for each, are below:

GetFile: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.9.2/org.apache.nifi.processors.standard.GetFile/index.html

JoltTransformJSON: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.9.2/org.apache.nifi.processors.standard.JoltTransformJSON/index.html

PublishKafka (or PublishKafka_0_10 depending on your version): https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-kafka-0-9-nar/1.9.2/org.apache.nifi.processors.kafka.pubsub.PublishKafka/index.html
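
To make the shape of that flow concrete, here is a rough Python sketch of the same three steps (read files, enrich, publish). It is not NiFi code and not an Airflow DAG, just an illustration of the logic that either tool would be orchestrating. The function names, the "metadata" field, and the injectable publish callable are all assumptions made for this example, and it assumes each file contains a single JSON object.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path
from typing import Callable, Iterator, Tuple


def read_json_files(directory: Path) -> Iterator[Tuple[str, dict]]:
    """Yield (file name, parsed record) for each *.json file (the GetFile role)."""
    for path in sorted(directory.glob("*.json")):
        with path.open(encoding="utf-8") as handle:
            yield path.name, json.load(handle)


def enrich(record: dict, source_file: str) -> dict:
    """Return a copy of the record with extra metadata (the JoltTransformJSON role).

    The "metadata" key and its fields are hypothetical; use whatever your
    downstream consumers expect.
    """
    enriched = dict(record)
    enriched["metadata"] = {
        "source_file": source_file,
        "ingested_at": datetime.now(timezone.utc).isoformat(),
    }
    return enriched


def run_pipeline(directory: Path, publish: Callable[[bytes], None]) -> int:
    """Read, enrich, and publish every record; return how many were sent.

    `publish` stands in for the PublishKafka step. With the kafka-python
    package it could be something like
        lambda payload: producer.send("my-topic", payload)
    where "my-topic" and the producer's broker address are placeholders.
    """
    count = 0
    for name, record in read_json_files(directory):
        payload = json.dumps(enrich(record, name)).encode("utf-8")
        publish(payload)
        count += 1
    return count


if __name__ == "__main__":
    # Demo with a temporary directory and an in-memory list instead of Kafka.
    with tempfile.TemporaryDirectory() as tmp:
        folder = Path(tmp)
        (folder / "example.json").write_text(json.dumps({"id": 1}), encoding="utf-8")
        sent = []
        total = run_pipeline(folder, sent.append)
        print(total, sent[0].decode("utf-8"))
```

Keeping the publish step as a plain callable means the enrichment logic can be tried out without a running broker; in NiFi the equivalent wiring is done by connecting the processors on the canvas rather than in code.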