Good practices for writing C dynamic libraries [DSOs] (binary compatibility + memory management)

void_ptr · Aug 27, 2011

I have some experience writing C libraries, but I've never read any formal documents describing good practices for writing them. My question mainly concerns two topics:

  1. How to maintain binary compatibility? (I've heard of the pImpl idiom, d-pointer)
  2. How to design interfaces which remain backward compatible?

From my research, the main thing I can see about binary compatibility is that the pImpl idiom helps keep a library binary compatible, but changing the structure (adding new data members, etc.) can still affect its binary compatibility even with pImpl. Also, is there a way to add new methods/functions to a library without breaking binary compatibility? I am assuming that adding these would change the size and layout of the library, thus breaking compatibility.

Is there a tool to check binary compatibility?

I have already read these articles. Are there any other docs I can peruse?

http://en.wikipedia.org/wiki/Opaque_pointer

http://techbase.kde.org/Policies/Binary_Compatibility_Issues_With_C++

Also, are there articles that describe memory ownership issues in the context of designing library interfaces? What are the general conventions? Who owns memory, for how long, and who is responsible for deallocating it?

Answer

R.. GitHub STOP HELPING ICE · Aug 27, 2011

The key compatibility issues are:

  • function signatures
  • format of any data accessed by both library and caller
  • global variables in library accessed by caller
  • library code that ends up in the caller due to macros/inline functions in headers
  • #define/enum constant values in shared headers
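To make the macro/inline point concrete, here is a minimal sketch. The `buf_v1`/`buf_v2` structs and the `BUF_LEN` macro are hypothetical names invented for illustration, not taken from any real library. Because the macro expands inside the caller, the release-1 offset of `len` is compiled into the application and goes stale if release 2 rearranges the struct.

```c
#include <stddef.h>

/* Hypothetical release 1: the public header exposes the struct layout. */
struct buf_v1 { size_t len; char *data; };

/* Macro in the public header: it expands inside the CALLER, so the
 * release-1 offset of `len` is compiled into every application. */
#define BUF_LEN(b) ((b)->len)

/* Code in an application built against release 1. */
size_t buf_len_via_macro(const struct buf_v1 *b)
{
    return BUF_LEN(b);
}

/* Hypothetical release 2 reorders the members inside the library. */
struct buf_v2 { char *data; size_t len; };

/* Offset the old application still reads from ... */
size_t stale_offset_of_len(void)
{
    return offsetof(struct buf_v1, len);
}

/* ... versus where release 2 actually keeps the field. */
size_t actual_offset_of_len(void)
{
    return offsetof(struct buf_v2, len);
}
```

An application compiled against release 1 would read the `data` pointer bits as a length when handed a release-2 object. Had the header exposed only a real function for the length, the library could have changed the layout freely.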

So the best list of guidelines I can give is:

  • Never change the signature (return/argument types) of any public interface. If you need to expand an interface, instead add a new function that takes more arguments (think dup versus dup2 or wait versus waitpid).
  • As much as possible, use pointers to fully-encapsulated opaque data objects, and do not even expose the definition of such structures in the public headers (make them incomplete struct types).
  • When you do want to share a structure, arrange that the caller never declare variables of that structure type, and instead call explicit allocate/free functions in the library. Never change the type of existing members or delete existing members; instead, add new members only at the end of the structure.
  • Do not expose global variables from libraries, period. Unless you understand "copy relocations" it's best not to ask why. Just don't do it.
  • Do not put inline functions or code-containing macros in your library's public headers unless they use only the documented, exposed interface that will be kept permanent. If they poke at internals of opaque data objects, they will cause problems when you decide to change the internals.
  • Do not renumber existing #define/enum constants. Only add new constants with previously-unused values.
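As a rough sketch of the first two guidelines, assuming a hypothetical `counter` library (all names here are made up for illustration): the header exposes only an incomplete struct type plus allocate/free functions, and the interface is extended by adding `counter_next_by` rather than changing the signature of `counter_next`.

```c
#include <stdlib.h>

/* ---- public header (hypothetical counter.h) ---- */
typedef struct counter counter;   /* incomplete type: layout hidden from callers */

counter *counter_create(void);                   /* the library allocates ... */
void     counter_destroy(counter *c);            /* ... and the library frees */
int      counter_next(counter *c);               /* original interface, never changed */
int      counter_next_by(counter *c, int step);  /* added later, like dup2 next to dup */

/* ---- private implementation (hypothetical counter.c) ---- */
struct counter {
    int value;   /* can be reordered or extended: no caller depends on this layout */
};

counter *counter_create(void)
{
    return calloc(1, sizeof(counter));   /* zero-initialized; NULL on failure */
}

void counter_destroy(counter *c)
{
    free(c);                             /* free(NULL) is a no-op */
}

int counter_next_by(counter *c, int step)
{
    c->value += step;
    return c->value;
}

int counter_next(counter *c)
{
    return counter_next_by(c, 1);        /* old entry point kept as a thin wrapper */
}
```

This also gives one simple ownership convention: whatever the library allocates, the caller hands back to the library's own free function, so the two sides never have to agree on an allocator.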

If you follow these guidelines, I think you're at least 95% covered.
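For the shared-structure and constant guidelines, a small sketch under the same caveat (the `lib_options` and `lib_mode` names are hypothetical, and the `_v1`/`_v2` suffixes exist only so both releases can sit in one file): new members go at the end, existing constants keep their values, and the struct size still changes, which is why the caller should never declare variables of that type itself.

```c
#include <stddef.h>

/* Layout as shipped in hypothetical release 1. */
struct lib_options_v1 {
    int    verbosity;
    size_t buffer_size;
};

/* Release 2: existing members untouched, new member appended at the end. */
struct lib_options_v2 {
    int    verbosity;
    size_t buffer_size;
    int    timeout_ms;   /* new in release 2 */
};

/* Constants: existing values stay fixed, new ones take unused values. */
enum lib_mode_v1 { LIB_MODE_READ_V1 = 0, LIB_MODE_WRITE_V1 = 1 };
enum lib_mode_v2 { LIB_MODE_READ_V2 = 0, LIB_MODE_WRITE_V2 = 1,
                   LIB_MODE_APPEND_V2 = 2 };

/* Returns nonzero if old callers still find every release-1 member
 * at the offset they were compiled against. */
int lib_options_prefix_unchanged(void)
{
    return offsetof(struct lib_options_v1, verbosity)
               == offsetof(struct lib_options_v2, verbosity)
        && offsetof(struct lib_options_v1, buffer_size)
               == offsetof(struct lib_options_v2, buffer_size);
}
```

The offsets of the old members match, but `sizeof` grows between releases, so a caller that put a release-1 struct on its own stack and passed it to a release-2 library would hand over too little memory.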