What is the difference between ranlib, ar, and ld for making libraries?

R71 · Dec 20, 2017 · Viewed 7.2k times

To make libraries in C++/Unix from *.o files, I have noticed two different ways in my project (legacy code):

ar qc libgraphics.a *.o
ranlib libgraphics.a

and

ld -r -o libgraphics.a *.o

What is the difference between the two approaches, and which is to be used for what purpose?

Answer

Mike Kinghan · Dec 21, 2017

ar

In Linux, ar is the GNU general purpose archiver. (There are non-GNU variants of ar in other Unix-like OSes). With the option c

ar c... archive-name file...

It creates an archive containing copies of file.... The archive-name conventionally but not necessarily has the extension .a (for archive). Each file... may be any kind of file whatever, not necessarily an object file.
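To see that an archive is just a container, here is a minimal sketch that archives two plain text files. It assumes an ar that accepts the r (insert) operation with the c modifier, as GNU ar does; the names bag.a, notes.txt and data.txt are made up for illustration.

```shell
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Two ordinary text files, not object files.
printf 'hello\n' > notes.txt
printf 'world\n' > data.txt

# r = insert members, c = create the archive without a warning.
ar rc bag.a notes.txt data.txt

# t = list the members, p = print a member's content to stdout.
members=$(ar t bag.a)
extracted=$(ar p bag.a notes.txt)
printf 'members:\n%s\n' "$members"
printf 'content of notes.txt: %s\n' "$extracted"
```

Nothing here involves the linker: ar stores and retrieves the files unchanged.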

When the archived files are all object files it is usually the intention to use the archive for delivering that selection of object files into the linkage of programs or DSOs (Dynamic Shared Objects). In this case archive-name will also conventionally be given the prefix lib, e.g. libfoo.a, so that it can be discovered as a candidate linker input file via the linker option -lfoo.

Used as a linker input file, libfoo.a is normally called a static library. This usage is a perpetual source of confusion to inexpert programmers, because it leads them to think that an archive libfoo.a is much the same kind of thing as a DSO, libfoo.so, normally called a dynamic/shared library, and to build false expectations on this basis. In fact a "static library" and a "dynamic library" are not at all similar things and are used in linkage in utterly different ways.

A conspicuous difference is that a static library is not produced by the linker, but by ar. So no linkage happens, no symbol resolution happens. The archived object files are unchanged: they're just put in a bag.

When an archive is input in the linkage of something that is produced by the linker - such as a program or DSO - the linker looks in the bag to see if there are any object files in it that provide definitions for unresolved symbol references that have accrued earlier in the linkage. If it finds any, it extracts those object files from the bag and links them into the output file, exactly as if they were named individually in the linker commandline and the archive not mentioned at all. So the entire role of an archive in linkage is as a bag of object files from which the linker can select the ones it needs to carry on the linkage.

By default, GNU ar makes its output archives ready for use as linker inputs. It adds a phony "file" to the archive, with a magic phony filename, and in this phony file it writes content that the linker is able to read as a lookup table from the global symbols that are defined by any object files in the archive to the names and positions of those object files in the archive. This lookup table is what enables the linker to look in the archive and identify any object files that define any unresolved symbol references it has got in hand.

An ar implementation is allowed to skip updating this lookup table with the q ( = quick append) option - which in fact you've used in your own ar example - although GNU ar rebuilds the table even with q. You can suppress the creation or updating of the table explicitly with the (capital) S ( = no symbol table) option. And if an archive hasn't got an up-to-date symbol table for any reason, then you can give it one by invoking ar with the s option.

ranlib

ranlib doesn't create libraries at all. In Linux, ranlib is a legacy program that adds an (up-to-date) symbol table to an ar archive if it doesn't have one. Its effect is exactly the same as ar s, with GNU ar. Historically, before ar was equipped to generate a symbol table itself, ranlib was the kludge that injected the magic phony file into an archive to enable the linker to pick object files out of it. In non-GNU Unix-like OSes, ranlib might still be needed for this purpose. Your example:

ar qc libgraphics.a *.o
ranlib libgraphics.a

says:

  • Create libgraphics.a by appending to an archive all *.o files in the current directory, without necessarily writing a symbol table (GNU ar writes one even with q; some other ar implementations do not).
  • Then add a symbol table to libgraphics.a

In Linux, this has the same net effect as:

ar cr libgraphics.a *.o

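By itself, ar qc libgraphics.a *.o is not guaranteed to create an archive that the linker can use: an ar implementation is allowed to skip the symbol table on a quick append, and the linker cannot search an archive that has no symbol table. GNU ar writes the table even with q, so in Linux the following ranlib is redundant but harmless.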

ld

Your example:

ld -r -o libgraphics.a *.o

is actually quite unorthodox. This illustrates the fairly rare use of the linker, ld, to produce a merged object file by linking multiple input files into a single output object file, in which symbol resolution has been done as far as is possible, given the input files. The -r ( = relocatable) option directs the linker to produce an object file target (rather than a program, or DSO) by linking the inputs as far as possible and not to fail the linkage if undefined symbol references remain in the output file. This usage is called partial linking.

The output file of ld -r ... is an object file, not an ar archive, and specifying an output filename that looks like that of an ar archive doesn't make it one. So your example illustrates a deception. This:

ld -r -o graphics.o *.o

would be truthful. It's unclear to me what the purpose of such a deception could be, because even if an ELF object file is called libgraphics.a, and is input to a linkage either by that name, or by -lgraphics, the linker will correctly identify it as an ELF object file, not an ar archive, and will consume it the way it consumes any object file in the commandline: it links it unconditionally into the output file, whereas the point of inputting a genuine archive is to link archive members only on condition that they are referenced. Perhaps you just have an example of ill-informed linking here.

Wrapping up...

We've actually only seen one way of producing something that is conventionally called a library, and that's the production of a so-called static library, by archiving some object files and putting a symbol table in the archive.

And we haven't seen at all how to produce the other and most important kind of thing that's conventionally called a library, namely a Dynamic Shared Object/shared library/dynamic library.

Like a program, a DSO is produced by the linker. A program and a DSO are variants of ELF binary that the OS loader understands and can use to assemble a running process. Usually we invoke the linker via one of the GCC frontends (gcc, g++, gfortran, etc):

Linking a program:

gcc -o prog file.o ... -Ldir ... -lfoo ...

Linking a DSO:

gcc -shared -o libbar.so file.o ... -Ldir ... -lfoo ...

Both shared libraries and static libraries can be offered to the linker by the uniform -lfoo protocol, when you are linking some other program or DSO. That option directs the linker to scan its specified or default search directories to find either libfoo.so or libfoo.a. By default, once it finds either one of them it will input that file to the linkage, and if it finds both in the same search directory, it will prefer libfoo.so. If libfoo.so is selected then the linker adds that DSO to the runtime dependency list of whatever program or DSO you are making. If libfoo.a is selected then the linker uses the archive as a selection of object files for linkage into the output file, if needed, right there and then. No runtime dependency on libfoo.a itself is possible; it cannot be mapped into a process; it means nothing to the OS loader.