Apache Kafka is a distributed streaming platform used to publish and subscribe to streams of records. It was originally developed at LinkedIn and later donated to the Apache Software Foundation.
Data is the fuel of the 21st century. By 2025, it’s estimated that 463 exabytes of data will be created each day globally, the equivalent of 212,765,957 DVDs per day! So it’s essential to develop tools that separate data into streams and keep it manageable, much like refining crude oil. This is where Apache Kafka helps us.
Why Apache Kafka?
Almost every company starts out very simple, with a single source system and a single target system.
And after a while every company looks like this:
As the image above shows, data pipelines grow more complex as the number of systems increases, which makes the overall data flow very complicated.
This is the problem that messaging systems such as Kafka were built to solve.
So, let's see how Kafka provides a solution to such problems.
Kafka decouples the data pipelines between systems: each system talks only to Kafka rather than to every other system, which makes communication between systems simpler and more manageable.
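To put a rough number on that decoupling, here is a minimal sketch that counts the integrations needed with and without a central broker. It assumes the worst case, where every source feeds every target. The function names and the example system counts are made up for illustration.

```python
def point_to_point_links(sources: int, targets: int) -> int:
    """Integrations needed when every source connects directly to every target."""
    return sources * targets


def brokered_links(sources: int, targets: int) -> int:
    """Integrations needed when every system connects only to the broker."""
    return sources + targets


# Hypothetical company sizes: (number of sources, number of targets).
for sources, targets in [(1, 1), (4, 6), (10, 20)]:
    direct = point_to_point_links(sources, targets)
    via_broker = brokered_links(sources, targets)
    print(f"{sources} sources, {targets} targets: {direct} direct links vs {via_broker} via a broker")
```

With 4 sources and 6 targets that is 24 direct integrations against 10 through a broker, and the gap widens as more systems are added.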
I hope you understand why Kafka is needed.
IoT and Apache Kafka:
When it comes to the Internet of Things (IoT), most people think in terms of microcontrollers, system-on-chip boards, sensors, and single-board computers.
While devices are undoubtedly the foundation of IoT, the core value of a connected solution lies in the data generated by these devices.
Apache Kafka is an event streaming platform, i.e. a combination of messaging + storage + processing.
Therefore, it is a powerful platform that fits many use cases, including microservice architectures and scalable systems such as IoT big data processing. For these kinds of use cases, Apache Kafka is a strong choice.
What is the difference between Kafka and MQTT?
It is like comparing apples and oranges: both exist for very different reasons.
Both are built around message brokers, but they use different protocols and serve different purposes.
Kafka is a message broker built around an append-only log. Records are kept for a configurable retention period, and consumers subscribe to topics and pull records from the log.
MQTT is a lightweight messaging protocol for machine-to-machine communication, with clients connecting through a broker. Its purpose is to keep a communication channel alive on the client side without draining the battery, and to provide reliable messaging.
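The append-only log that Kafka consumers pull from can be pictured with a small in-memory toy. This is only a sketch of the idea, not Kafka's client API: the `ToyLog` class, its method names, and the consumer names are all hypothetical, and a real Kafka topic is partitioned, replicated, and stored on disk.

```python
from collections import defaultdict


class ToyLog:
    """In-memory stand-in for a single append-only log (hypothetical, not Kafka's API)."""

    def __init__(self) -> None:
        self._records: list[str] = []
        # Each consumer name maps to the offset of the next record it will read.
        self._offsets: dict[str, int] = defaultdict(int)

    def append(self, record: str) -> int:
        """Add a record at the end of the log and return its offset."""
        self._records.append(record)
        return len(self._records) - 1

    def poll(self, consumer: str, max_records: int = 10) -> list[str]:
        """Return the next unread records for this consumer and advance its offset."""
        start = self._offsets[consumer]
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch


log = ToyLog()
for reading in ["temp=21", "temp=22", "temp=23"]:
    log.append(reading)

print(log.poll("dashboard", max_records=2))  # ['temp=21', 'temp=22']
print(log.poll("dashboard"))                 # ['temp=23']
print(log.poll("archiver"))                  # ['temp=21', 'temp=22', 'temp=23']
```

Records are never removed when they are read. Each consumer simply tracks its own position, so a consumer that arrives late, like the `archiver` above, can still read everything from the beginning.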