I am a beginner with Kafka.
We are looking to size our Kafka cluster (a 5-node cluster) to process 17,000 events/sec, with each event 600 bytes in size. We are planning a replication factor of 3 and retention of events for a week.
I read this on the Kafka documentation page:
"assuming you want to be able to buffer for 30 seconds and compute your memory need as write_throughput*30."
So what is this write throughput? If it is the number of MB per second, I am looking at 9960MB/sec.
If I take that as my write throughput, then the memory works out to 292GB (9960MB/sec * 30).
So what does the 292GB represent: the memory requirement for one node, or for the entire cluster (5 nodes)?
I would really appreciate some insights on the sizing of memory and disk.
Regards,
VB
If your message size is 600 bytes at 17k msg/s, then your throughput is ~10MB/s [17000*600/(1024*1024)]. (Your 9960 figure is in KB/s, not MB/s, which is how you arrived at 292GB.) If you're partitioning the topic across the 5 brokers with a replication factor of 3, that works out to 10*3/5 = 6MB/s of writes per broker, which shouldn't be a problem on any normal hardware. Buffering 30s would then mean about 180MB of memory per broker.
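To make the arithmetic concrete, here's a small Python sketch of the same back-of-the-envelope math (the function name and layout are just for illustration; the 30s buffer comes from the docs quote above, and it assumes partitions and replicas are spread evenly across brokers). It also extends the same numbers to the one-week retention you asked about for disk:

def size_cluster(msgs_per_sec, msg_bytes, brokers, replication,
                 buffer_secs=30, retention_secs=7 * 24 * 3600):
    # bytes/s entering the cluster, before replication
    cluster_write = msgs_per_sec * msg_bytes
    # each broker carries replication/brokers of the total write load
    per_broker_write = cluster_write * replication / brokers
    # memory to buffer buffer_secs of writes, per broker (the docs' rule of thumb)
    buffer_mem = per_broker_write * buffer_secs
    # disk to hold retention_secs of writes, per broker
    disk = per_broker_write * retention_secs
    return cluster_write, per_broker_write, buffer_mem, disk

MB, GB, TB = 1024**2, 1024**3, 1024**4
cw, pb, mem, disk = size_cluster(17_000, 600, brokers=5, replication=3)
print(f"cluster writes: {cw / MB:.1f} MB/s")        # ~9.7 MB/s
print(f"per broker:     {pb / MB:.1f} MB/s")        # ~5.8 MB/s
print(f"30s buffer:     {mem / MB:.0f} MB/broker")  # ~175 MB
print(f"1 week on disk: {disk / TB:.1f} TB/broker") # ~3.4 TB

So for disk, a week of retention at these rates comes to roughly 3.4TB per broker, about 17TB across the cluster, before accounting for compression or index overhead.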
In case you actually meant a message size of 600kB, the same math gives ~6GB/s of writes per broker; you'd need to add plenty of very fast storage to sustain that, and actually it would be better to increase the number of nodes in the cluster instead.
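For comparison, the same sketch with 600kB messages gives the per-broker figure quoted above:

_, pb, _, _ = size_cluster(17_000, 600_000, brokers=5, replication=3)
print(f"per broker: {pb / GB:.1f} GB/s")  # ~5.7 GB/s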