How do header and source files in C work?

Dan Lugg · May 5, 2011 · Viewed 62.9k times

I've perused the possible duplicates, but none of the answers there are sinking in.

tl;dr: How are source and header files related in C? Do projects sort out declaration/definition dependencies implicitly at build time?

I'm trying to understand how the compiler understands the relationship between .c and .h files.

Given these files:

header.h:

int returnSeven(void);

source.c:

int returnSeven(void){
    return 7;
}

main.c:

#include <stdio.h>
#include <stdlib.h>
#include "header.h"
int main(void){
    printf("%d", returnSeven());
    return 0;
}

Will this mess compile? I'm currently doing my work in NetBeans 7.0 with gcc from Cygwin, which automates much of the build task. When a project is compiled, will the project files involved sort out this implicit inclusion of source.c based on the declarations in header.h?

Answer

Jesper · May 6, 2011

Converting C source code files to an executable program is normally done in two steps: compiling and linking.

First, the compiler converts the source code to object files (*.o). Then the linker takes these object files, together with any statically linked libraries, and creates an executable program.

In the first step, the compiler takes a compilation unit (the C standard calls it a translation unit) and converts it to an object file. A compilation unit is normally a preprocessed source file, that is, a source file with the contents of all the headers that it #includes pasted in.

In each compilation unit, every function that is used must be declared before it is called, so that the compiler knows the function exists, what it returns, and what arguments it takes. In your example, the declaration of the function returnSeven is in the header file header.h. When you compile main.c, you include the header with the declaration so that the compiler knows that returnSeven exists when it compiles main.c.
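A minimal sketch of what the declaration buys you, collapsed into a single file so that it stands on its own. The caller sevenPlusOne is a hypothetical stand-in for main; in the question's layout the prototype would arrive through #include "header.h" and the definition would live in source.c.

```c
/* What header.h contributes after preprocessing: a declaration only. */
int returnSeven(void);

/* Hypothetical caller standing in for main(). The compiler has not seen
 * the body of returnSeven yet, but the prototype above tells it the
 * return type (int) and that the function takes no arguments. */
int sevenPlusOne(void)
{
    /* A call such as returnSeven(42) would be rejected at compile time,
     * because the prototype says the function takes no arguments. */
    return returnSeven() + 1;
}

/* The definition. In the question this lives in source.c; it is placed
 * here only so the sketch compiles and links as one file. */
int returnSeven(void)
{
    return 7;
}
```

The compiler only needs the prototype to translate the call inside sevenPlusOne; where the body comes from is left for the linker to resolve.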

When the linker does its job, it needs to find the definition of each function. Each function with external linkage has to be defined exactly once across all the object files; if more than one object file contains a definition of the same function, the linker will stop with an error.
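As a hedged illustration of "declared many times, defined once": repeating a declaration is harmless, which is why several source files can include the same header, while a second definition is what triggers the error. The sketch keeps everything in one file; the closing comment describes the multi-file case rather than reproducing it.

```c
/* Declarations may be repeated freely, as happens when several
 * source files include the same header. */
int returnSeven(void);
int returnSeven(void);

/* Exactly one definition for the whole program. */
int returnSeven(void)
{
    return 7;
}

/* If a second .c file also defined returnSeven with external linkage,
 * each file would still compile to an object file on its own; the
 * failure would only appear at link time. GNU ld typically reports it
 * as a "multiple definition" error (exact wording varies by toolchain). */
```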

Your function returnSeven is defined in source.c (and the main function is defined in main.c).

So, to summarize, you have two compilation units: source.c, and main.c together with the header files it includes. You compile these to two object files: source.o and main.o. The first one will contain the definition of returnSeven, the second one the definition of main. Then the linker will glue those two together into an executable program.
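For the example project, the two steps could be run by hand roughly as follows. The commands are shell, so they appear as comments on top of source.c to keep the listing in C. The output name program is an assumption, and an IDE would normally issue equivalent commands for every source file that belongs to the project.

```c
/* source.c: one of the two compilation units.
 *
 * Assumed manual build with gcc:
 *
 *   gcc -c source.c -o source.o       compile only, produces source.o
 *   gcc -c main.c -o main.o           compile only, produces main.o
 *   gcc main.o source.o -o program    link both object files
 *
 * Or both steps in one invocation:
 *
 *   gcc main.c source.c -o program
 */
int returnSeven(void)
{
    return 7;
}
```

Note that header.h never appears on the command line. source.c ends up in the program because it is compiled and its object file is handed to the linker, not because main.c includes a header that mentions returnSeven.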

About linkage:

There is external linkage and internal linkage. By default, functions have external linkage, which means that the compiler makes these functions visible to the linker. If you make a function static, it has internal linkage: it is only visible inside the compilation unit in which it is defined, and the linker will not match it against references from other object files. This can be useful for functions that do something internally in a source file and that you want to hide from the rest of the program.
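A small sketch of the difference, assuming two hypothetical functions that are not part of the question's code: addOne is a private helper, and returnEight is the function the rest of the program is meant to call.

```c
/* Internal linkage: visible only inside this compilation unit. Another
 * source file could define its own, unrelated function named addOne
 * without a clash at link time. */
static int addOne(int x)
{
    return x + 1;
}

/* External linkage (the default): other compilation units can call
 * this once they have seen the declaration  int returnEight(void);  */
int returnEight(void)
{
    return addOne(7);
}
```

Only returnEight would be declared in a header; addOne stays an implementation detail of its source file.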