lseek/write suddenly returns -1 with errno = 9 (Bad file descriptor)

Ger Teunis · Mar 30, 2010 · Viewed 20.1k times

My application uses lseek() to seek to the desired position before writing data. The file is opened successfully with open(), and my application was able to call lseek() and write() many times.

At some point, for some users and not easily reproducibly, lseek() returns -1 with errno set to 9. The file is not closed before this, and the file descriptor (an int) isn't reset.

After this, another file is created; open() succeeds again, and lseek() and write() work again.

To make it even worse, this user tried the complete sequence again and all was well.

So my question is, can the OS close the file handle for me for some reason? What could cause this? A file indexer or file scanner of some sort?

What is the best way to solve this; is this pseudo code the best solution? (never mind the code layout, will create functions for it)

int fd = open(...);
if (fd > -1) {
  off_t result = lseek(fd, ....);
  if (result == (off_t)-1 && errno == EBADF) {  /* EBADF is errno 9 here */
      close(fd); /* make sure we try to close nicely */
      fd = open(...);

      result = lseek(fd, ....);
  }
}

Does anybody have experience with something similar?

Summary: seeking and writing work fine for a given fd, then lseek() suddenly fails with errno=9 for no apparent reason.

Answer

nos · Mar 30, 2010

So my question is, can the OS close the file handle for me for some reason? What could cause this? A file indexer or file scanner of some sort?

No, this will not happen.

What is the best way to solve this; is this pseudo code the best solution? (never mind the code layout, will create functions for it)

No, the best way is to find the bug and fix it.

Does anybody have experience with something similar?

I've seen fds get messed up many times, resulting in EBADF in some cases and blowing up spectacularly in others. The causes have been:

  • Buffer overflows: overflowing something and writing a nonsense value into an 'int fd;' variable.
  • Silly bugs in some corner case where someone wrote if (fd = foo[i].fd) when they meant if (fd == foo[i].fd).
  • Race conditions between threads: some thread closes the wrong file descriptor, one that another thread still wants to use.
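
To make the last case concrete, here is a minimal sketch of what a stray close() does to an otherwise healthy descriptor. It is not the asker's code: the function names and the /tmp scratch path are made up for illustration, and it assumes a POSIX system with a writable /tmp. The second function shows why the same bug sometimes blows up instead of returning EBADF: the kernel hands out the lowest free descriptor number, so a later open() can quietly reuse the closed number and the stale fd then points at a different file.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: open a scratch file and return its descriptor,
 * or -1 on failure. The file is unlinked at once so nothing is left behind. */
static int open_scratch_file(void)
{
    char path[] = "/tmp/ebadf_demo_XXXXXX";  /* assumed writable location */
    int fd = mkstemp(path);
    if (fd != -1)
        unlink(path);
    return fd;
}

/* Returns the errno that lseek() reports once the descriptor has been
 * closed behind the caller's back (0 if lseek() unexpectedly succeeds,
 * -1 if the setup itself failed). */
int errno_after_stray_close(void)
{
    int fd = open_scratch_file();
    if (fd == -1)
        return -1;

    /* A seek on the freshly opened descriptor works. */
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1) {
        close(fd);
        return -1;
    }

    close(fd);  /* stands in for the buggy close() elsewhere in the program */

    errno = 0;
    if (lseek(fd, 0, SEEK_SET) == (off_t)-1)
        return errno;
    return 0;
}

/* Returns 1 if the next open() gets the descriptor number that was just
 * closed, 0 if it gets a different one, -1 if the setup failed. */
int closed_fd_number_is_reused(void)
{
    int first = open_scratch_file();
    if (first == -1)
        return -1;
    close(first);

    int second = open_scratch_file();
    if (second == -1)
        return -1;

    int reused = (second == first);
    close(second);
    return reused;
}

/* Runs both demonstrations; meant to be called from main(). */
void run_demo(void)
{
    int err = errno_after_stray_close();
    printf("lseek() after stray close: errno=%d (%s)\n", err, strerror(err));
    assert(err == EBADF);

    /* Holds in a single-threaded process where nothing else opens files. */
    assert(closed_fd_number_is_reused() == 1);
}
```

The reuse case is the nastier one: lseek() and write() succeed, just on the wrong file, so the retry-on-EBADF idea from the question would never even trigger.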

If you can find a way to reproduce this problem, run your program under 'strace' so you can see what's going on.
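
Since the problem is hard to reproduce, it may also help to add some checking inside the program itself. The following is only a sketch under my own assumptions: fd_is_open(), require_open_fd() and checked_lseek() are hypothetical names, not part of any library. It relies on the fact that fcntl(fd, F_GETFD) fails with EBADF when fd is not an open descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if fd refers to an open descriptor, 0 otherwise.
 * fcntl(F_GETFD) fails with EBADF for a descriptor that is not open. */
int fd_is_open(int fd)
{
    return fcntl(fd, F_GETFD) != -1 || errno != EBADF;
}

/* Hypothetical debugging aid: abort as soon as a descriptor is found to be
 * stale, so the core dump is taken close to where things went wrong. */
void require_open_fd(int fd)
{
    assert(fd_is_open(fd));
}

/* Hypothetical wrapper around lseek(): on failure, log the call site, the
 * descriptor value and whether that descriptor is still open. 'where' is a
 * free-form label chosen by the caller. */
off_t checked_lseek(int fd, off_t offset, int whence, const char *where)
{
    off_t pos = lseek(fd, offset, whence);
    if (pos == (off_t)-1) {
        int saved = errno;  /* the calls below may overwrite errno */
        int open_now = fd_is_open(fd);
        fprintf(stderr, "%s: lseek(fd=%d) failed: %s (fd %s open)\n",
                where, fd, strerror(saved), open_now ? "is" : "is NOT");
        errno = saved;
    }
    return pos;
}
```

A logged fd value that looks like garbage (negative or very large) points toward memory corruption or the assignment typo; a plausible small number that is no longer open points toward a stray close().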