How exactly do fopen() and fclose() work?

Fabian · Feb 27, 2011 · Viewed 18.4k times

I was just wondering about the functions fopen, fclose, socket and closesocket. When calling fopen or opening a socket, what exactly is happening (especially memory-wise)?

Can opening files/sockets without closing them cause memory leaks?

And third, how are sockets created and what do they look like memory-wise?

I'm also interested in the role of the operating system (Windows) in reading the sockets and sending the data.

Answer

Stefan Monov · Feb 27, 2011

Disclaimer: I'm mostly unqualified to talk about this. It'd be great if someone more knowledgeable posted too.

Files

The details of how things like fopen() are implemented will depend a lot on the operating system (UNIX has fopen() too, for example). Even versions of Windows can differ a lot from each other.

I'll give you my idea of how it works, but it's basically speculation.

  • When called, fopen allocates a FILE object, typically on the heap. Note that the contents of a FILE object are undocumented: FILE is an opaque struct, and your code can only use pointers to it.
  • The FILE object gets initialized. For example, something like fillLevel = 0 where fillLevel is the amount of buffered data that hasn't been flushed yet.
  • A call to the filesystem driver (FS driver) opens the file and provides a handle to it, which is put somewhere in the FILE struct.
    • To do this, the FS driver figures out the HDD address corresponding to the requested path, and internally remembers this HDD address, so it can later fulfill calls to fread etc.
      • The FS driver uses a sort of indexing table (stored on the HDD) to figure out the HDD address corresponding to the requested path. This will differ a lot depending on the filesystem type - FAT32, NTFS and so on.
      • The FS driver relies on the HDD driver to perform the actual reads and writes to the HDD.
  • A cache might be allocated in RAM for the file. This way, if the user requests 1 byte to be read, the C library may read a whole KB just in case, so later reads can be served from memory without going back to the disk.
  • A pointer to the allocated FILE gets returned from fopen.
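
As a rough sketch of the steps above, here is a toy version of the idea, assuming a POSIX system (on Windows the C library would go through the Win32 API instead). ToyFile, toy_open, toy_getc and toy_close are made-up names for illustration; a real FILE is private to each C library and will look different.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>    /* open (POSIX) */
#include <stdlib.h>   /* malloc, free */
#include <unistd.h>   /* read, close (POSIX) */

/* Hypothetical stand-in for FILE; real layouts are private to each C library. */
typedef struct {
    int    fd;          /* handle returned by the OS */
    size_t fill_level;  /* number of valid bytes in buf */
    size_t pos;         /* index of the next unread byte in buf */
    char   buf[1024];   /* read-ahead cache */
} ToyFile;

/* Roughly fopen(path, "rb"): allocate, initialize, ask the OS for a handle. */
ToyFile *toy_open(const char *path) {
    ToyFile *f = malloc(sizeof *f);
    if (f == NULL) return NULL;
    f->fill_level = 0;
    f->pos = 0;
    f->fd = open(path, O_RDONLY);
    if (f->fd < 0) { free(f); return NULL; }
    return f;
}

/* Roughly fgetc(): serve from the cache, refilling it up to 1 KB at a time. */
int toy_getc(ToyFile *f) {
    assert(f != NULL);
    if (f->pos == f->fill_level) {
        ssize_t n = read(f->fd, f->buf, sizeof f->buf);
        if (n <= 0) return -1;   /* end of file or error */
        f->fill_level = (size_t)n;
        f->pos = 0;
    }
    return (unsigned char)f->buf[f->pos++];
}

/* Roughly fclose(): give the handle back to the OS, then free the struct. */
int toy_close(ToyFile *f) {
    assert(f != NULL);
    int rc = close(f->fd);
    free(f);
    return rc;
}
```

Skipping toy_close here would leak exactly the pieces described: the heap-allocated struct, its cache, and the OS handle stored in fd.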

If you open a file and never close it, some things will leak, yes. The FILE struct will leak, the FS driver's internal data will leak, the cache (if any) will leak too.

But memory is not the only thing that will leak. The file handle itself will leak, because the OS will keep treating the file as open even though your code no longer uses it. This can become a problem, for example, on Windows, where a file opened for writing may not be openable for writing again (depending on the sharing mode) until it's been closed.
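
One way to see that handles are a finite resource is to keep opening a file without closing it until fopen fails. This is only a sketch in standard C: open_until_failure, close_all and the MAX_TRIES cap are made up for illustration, and the number of opens that succeed depends entirely on the OS and its per-process limits (on a system with a very high limit the loop simply stops at the cap).

```c
#include <assert.h>
#include <stdio.h>

#define MAX_TRIES 65536   /* arbitrary cap so the loop always ends */

static FILE *leaked[MAX_TRIES];

/* Open `path` repeatedly without closing; return how many opens succeeded. */
size_t open_until_failure(const char *path) {
    size_t n = 0;
    while (n < MAX_TRIES) {
        FILE *f = fopen(path, "r");
        if (f == NULL) break;      /* usually: the process ran out of handles */
        leaked[n++] = f;
    }
    return n;
}

/* Close the first n streams again, releasing their handles. */
void close_all(size_t n) {
    assert(n <= MAX_TRIES);
    for (size_t i = 0; i < n; i++) fclose(leaked[i]);
}
```

After close_all returns, fopen on the same path should work again, which is the difference between a leaked handle and a released one.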

If your app exits without closing some file, most OSes will clean up after it. But that's not much use, because your app will probably run for a long time before exiting, and during that time it will still need to properly close all files. Also, you can't fully rely on cleanup at exit: the C Standard only guarantees that open streams are flushed and closed on normal termination (returning from main or calling exit). After abort() or a crash, buffered data that was never flushed may be lost.
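
Closing matters for the data too, not just for the handle. The sketch below, in standard C, writes a few bytes and checks how large the file looks to a second reader before and after fclose. visible_size and write_and_observe are made-up helper names, and the exact buffering behaviour is up to the C library, so treat the "before" value as typical rather than guaranteed.

```c
#include <assert.h>
#include <stdio.h>

/* Size of `path` in bytes as a second, independent reader sees it, or -1. */
long visible_size(const char *path) {
    FILE *f = fopen(path, "rb");
    if (f == NULL) return -1;
    long size = -1;
    if (fseek(f, 0, SEEK_END) == 0) size = ftell(f);
    fclose(f);
    return size;
}

/* Write 5 bytes; report the visible size before and after fclose(). */
int write_and_observe(const char *path, long *before, long *after) {
    assert(before != NULL && after != NULL);
    FILE *out = fopen(path, "wb");
    if (out == NULL) return -1;
    setvbuf(out, NULL, _IOFBF, 4096);   /* ask for full buffering */
    fputs("hello", out);                /* lands in the stream's buffer */
    *before = visible_size(path);       /* typically 0: nothing flushed yet */
    int rc = fclose(out);               /* flushes the buffer, frees the handle */
    *after = visible_size(path);
    return rc;
}
```

If the program crashed between fputs and fclose, the five bytes would most likely never reach the file.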

Sockets

A socket's implementation will depend on the type of socket - network listen socket, network client socket, inter-process socket, etc.

A full discussion of all types of sockets and their possible implementations wouldn't fit here.

In short:

  • just like a file, a socket keeps some info in RAM, describing things relevant to its operation, such as the IP of the remote host.
  • it can also have caches in RAM for performance reasons
  • it can hold onto finite OS resources such as open ports, making them unavailable for use by other apps

All these things will leak if you don't close the socket.
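
To illustrate the point about ports, here is a small sketch using the Berkeley sockets API on a POSIX system. On Windows the same calls exist in Winsock, with closesocket() in place of close() and a WSAStartup() call first. open_bound_socket and bound_port are made-up helper names.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket bound to `port` (0 = let the OS pick); -1 on failure. */
int open_bound_socket(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);   /* do not leak the descriptor on the error path */
        return -1;
    }
    return fd;
}

/* Ask the OS which port a bound socket ended up with; 0 on failure. */
unsigned short bound_port(int fd) {
    assert(fd >= 0);
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0) return 0;
    return ntohs(addr.sin_port);
}
```

While the first socket stays open, a second open_bound_socket call for the same port is expected to fail; once the first one is closed, the port becomes available again.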

The role of the OS in sockets

The OS implements TCP/IP, Ethernet and the other protocols needed to schedule/dispatch/accept connections, and makes them available to user code through an API such as Berkeley Sockets (or Winsock on Windows).

The OS will delegate network I/O (communication with the network card) to the network driver.
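
How much the OS does on its own can be seen with a single process talking to itself over the loopback interface. This is a sketch for a POSIX system that assumes the loopback address 127.0.0.1 is usable; loopback_roundtrip is a made-up name. Note that the connection is established and the data is sent before the receiving side has called accept() or recv(); the OS holds everything in its own queues in the meantime.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Send `msg` to ourselves over loopback TCP; copy what arrives into `out`.
   Returns the number of bytes received, or -1 on any failure. */
long loopback_roundtrip(const char *msg, char *out, size_t out_size) {
    assert(msg != NULL && out != NULL);
    long received = -1;
    int listener = -1, client = -1, server = -1;
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                       /* let the OS pick a free port */

    listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0) goto done;
    if (bind(listener, (struct sockaddr *)&addr, sizeof addr) < 0) goto done;
    if (listen(listener, 1) < 0) goto done;
    if (getsockname(listener, (struct sockaddr *)&addr, &len) < 0) goto done;

    client = socket(AF_INET, SOCK_STREAM, 0);
    if (client < 0) goto done;
    /* The OS completes the TCP handshake here, before we call accept(). */
    if (connect(client, (struct sockaddr *)&addr, sizeof addr) < 0) goto done;
    /* The OS queues these bytes until the receiving side asks for them. */
    if (send(client, msg, strlen(msg), 0) < 0) goto done;

    server = accept(listener, NULL, NULL);
    if (server < 0) goto done;
    received = (long)recv(server, out, out_size, 0);

done:
    /* Close every socket that was opened, on success and on failure alike. */
    if (server >= 0) close(server);
    if (client >= 0) close(client);
    if (listener >= 0) close(listener);
    return received;
}
```

The single cleanup path at the end is one common way in C to make sure no socket is left open whichever step fails.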