I'm trying to debug a file descriptor leak in a Java webapp running in Jetty 7.0.1 on Linux.
The app had been happily running for a month or so when requests started to fail due to too many open files, and Jetty had to be restarted.
java.io.IOException: Cannot run program [external program]: java.io.IOException: error=24, Too many open files
    at java.lang.ProcessBuilder.start(ProcessBuilder.java:459)
    at java.lang.Runtime.exec(Runtime.java:593)
    at org.apache.commons.exec.launcher.Java13CommandLauncher.exec(Java13CommandLauncher.java:58)
    at org.apache.commons.exec.DefaultExecutor.launch(DefaultExecutor.java:246)
At first I thought the issue was with the code that launches the external program, but it's using commons-exec and I don't see anything wrong with it:
    CommandLine command = new CommandLine("/path/to/command")
        .addArgument("...");
    ByteArrayOutputStream errorBuffer = new ByteArrayOutputStream();
    Executor executor = new DefaultExecutor();
    executor.setWatchdog(new ExecuteWatchdog(PROCESS_TIMEOUT));
    executor.setStreamHandler(new PumpStreamHandler(null, errorBuffer));
    try {
        executor.execute(command);
    } catch (ExecuteException executeException) {
        if (executeException.getExitValue() == EXIT_CODE_TIMEOUT) {
            throw new MyCommandException("timeout");
        } else {
            throw new MyCommandException(errorBuffer.toString("UTF-8"));
        }
    }
Listing open files on the server I can see a high number of FIFOs:
# lsof -u jetty
...
java 524 jetty 218w FIFO 0,6 0t0 19404236 pipe
java 524 jetty 219r FIFO 0,6 0t0 19404008 pipe
java 524 jetty 220r FIFO 0,6 0t0 19404237 pipe
java 524 jetty 222r FIFO 0,6 0t0 19404238 pipe
When Jetty starts there are just 10 FIFOs; after a few days there are hundreds of them.
I know it's a bit vague at this stage, but do you have any suggestions on where to look next, or how to get more detailed info about those file descriptors?
The problem comes from your Java application (or a library you are using).
First, read the subprocess's entire output, both stdout and stderr, and do it promptly (search for "StreamGobbler" to see the usual pattern).
The Javadoc for java.lang.Process says:
The parent process uses these streams to feed input to and get output from the subprocess. Because some native platforms only provide limited buffer size for standard input and output streams, failure to promptly write the input stream or read the output stream of the subprocess may cause the subprocess to block, and even deadlock.
Secondly, call waitFor() and wait for your process to terminate.
Then close its input, output and error streams; each one holds a pipe file descriptor.
Finally, call destroy() on your Process.
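The four steps above can be sketched roughly as follows. This is only an illustration using plain java.lang.Process, not the commons-exec code from the question. The names ProcessCleanupSketch, StreamGobbler, Result and run are hypothetical, and it assumes the subprocess needs no input on stdin and that a Unix sh is available for the demo in main.

```java
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: drain, wait, close, destroy.
class ProcessCleanupSketch {

    /** Hypothetical holder for what the subprocess produced. */
    static final class Result {
        final int exitCode;
        final String stdout;
        final String stderr;

        Result(int exitCode, String stdout, String stderr) {
            this.exitCode = exitCode;
            this.stdout = stdout;
            this.stderr = stderr;
        }
    }

    /** Drains one stream on its own thread so the child never blocks on a full pipe. */
    static final class StreamGobbler extends Thread {
        private final InputStream in;
        private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();

        StreamGobbler(InputStream in) {
            this.in = in;
        }

        @Override
        public void run() {
            byte[] chunk = new byte[4096];
            try {
                int n;
                while ((n = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, n);
                }
            } catch (IOException ignored) {
                // The stream was closed underneath us; keep what was read so far.
            }
        }

        /** Call only after join(), so the buffer is complete and safely published. */
        String text() throws IOException {
            return buffer.toString("UTF-8");
        }
    }

    static Result run(List<String> command) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).start();
        StreamGobbler out = new StreamGobbler(process.getInputStream());
        StreamGobbler err = new StreamGobbler(process.getErrorStream());
        try {
            // 1. Start reading both outputs right away.
            out.start();
            err.start();
            // Assumption: the child reads nothing from stdin, so close it immediately.
            process.getOutputStream().close();
            // 2. Wait for the process to terminate, then for both readers to reach EOF.
            int exitCode = process.waitFor();
            out.join();
            err.join();
            return new Result(exitCode, out.text(), err.text());
        } finally {
            // 3. Close all three streams, even on failure; each holds a pipe descriptor.
            closeQuietly(process.getInputStream());
            closeQuietly(process.getErrorStream());
            closeQuietly(process.getOutputStream());
            // 4. Make sure the child is gone (harmless if it has already exited).
            process.destroy();
        }
    }

    private static void closeQuietly(Closeable c) {
        try {
            c.close();
        } catch (IOException ignored) {
            // Nothing useful can be done if close() itself fails.
        }
    }

    public static void main(String[] args) throws Exception {
        Result r = run(Arrays.asList("sh", "-c", "echo out; echo err 1>&2; exit 3"));
        System.out.println(r.exitCode + " / " + r.stdout.trim() + " / " + r.stderr.trim());
    }
}
```

Putting the close() and destroy() calls in a finally block is the point of the sketch: the pipes are released even when waitFor() is interrupted or reading fails, which is the kind of path where descriptors tend to leak.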