We are having a problem with Kafka. Sometimes, without warning, we go out of synchronization and start getting exceptions when emitting events.
The exception we are getting is:
java.io.IOException: Too many open files
It seems this is a generic exception that Kafka throws in many cases. We investigated a little, and we think the root cause is that emitting events to some topic fails because Kafka doesn't have a leader partition for that topic.
Can someone help?
I assume that you are on Linux. If that is the case, then what's happening is that you are running out of open file descriptors. The real question is why this is happening.
Linux by default generally keeps this number fairly low. You can check the actual value via ulimit:
ulimit -a | grep "open files"
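You can then raise that value, again via ulimit. Note that ulimit is a shell builtin, so it cannot be run through sudo; run it in the same shell session that starts Kafka (going above the hard limit requires root, or a change in /etc/security/limits.conf):
ulimit -n 4096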
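One thing worth verifying: the limit your shell reports is not necessarily the limit the running broker has, since limits are inherited when the process starts. Here is a minimal sketch for reading the limit of a live process, assuming a Linux host with /proc mounted. The function name max_open_files is made up for illustration, and the pgrep pattern in the comment is an assumption about how your broker was launched.

```shell
#!/usr/bin/env bash
# Print the soft and hard "Max open files" limits of a running process,
# read from /proc/<pid>/limits (Linux only).
# max_open_files is a hypothetical helper name, not part of Kafka or any tool.
max_open_files() {
    local pid="$1"
    # Line format: "Max open files  <soft>  <hard>  files"
    awk '/^Max open files/ {print $4, $5}' "/proc/${pid}/limits"
}

# Example (assumption: the broker's command line contains "kafka.Kafka"):
#   max_open_files "$(pgrep -f kafka.Kafka | head -n 1)"
```

If the numbers printed here are lower than what ulimit -n shows in your shell, the broker was started with different limits and needs a restart from a session (or service configuration) that has the higher value.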
That said, unless the Kafka host in question has lots of topics / partitions it is unusual to hit that limit. What's probably happening is that some other process is keeping files or connections open. In order to figure out which process you're going to have to do some detective work with lsof.
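To get a starting point for that detective work, here is a rough sketch that ranks processes by how many file descriptors they hold. It assumes Linux with /proc mounted and reads /proc/<pid>/fd directly rather than parsing lsof output. The function names count_fds and top_fd_users are invented for this example. Run it as root if you want counts for processes owned by other users; without permission a process simply shows up as 0.

```shell
#!/usr/bin/env bash
# Count the open file descriptors of one process by listing /proc/<pid>/fd.
# Prints 0 if the process does not exist or we are not allowed to look.
count_fds() {
    local pid="$1"
    ls "/proc/${pid}/fd" 2>/dev/null | wc -l
}

# Print "count pid command" for the processes holding the most descriptors.
# The first argument is how many rows to show (default 10).
top_fd_users() {
    local limit="${1:-10}"
    local dir pid
    for dir in /proc/[0-9]*; do
        pid="${dir#/proc/}"
        printf '%s %s %s\n' \
            "$(count_fds "$pid")" "$pid" "$(cat "$dir/comm" 2>/dev/null)"
    done | sort -rn | head -n "$limit"
}
```

Calling top_fd_users 5 lists the five heaviest processes. Once one stands out, lsof -p <pid> shows exactly which files and sockets it is holding, which usually makes it clear whether it is Kafka log segments, leaked client connections, or something unrelated.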