Compilation Process in C Programming

The process of translating source code written in a high-level language into low-level machine code is called compilation. It is performed by a compiler.

The C compilation process consists of four stages:

  1. Pre-processing
  2. Compilation
  3. Assembling
  4. Linking

In the pre-processing stage, the preprocessor performs tasks such as

  1. removing comments from the source code
  2. expanding macros
  3. expanding included header files

To run only the pre-processing stage, use the following command

gcc -E helloworld.c -o helloworld.i

Pre-processing generates a temporary file with the .i extension. By convention, preprocessed files are given the extension .i for C programs and .ii for C++ programs. Because the preprocessor inserts the contents of the header files into the source file, preprocessed files are larger than the original source file.

The contents of the source file helloworld.c

#include <stdio.h>
#define STRING "Hello, World!"

int main()
{
   printf("My first program - %s\n", STRING);
   return 0;
}

Let’s look into the preprocessed file helloworld.i

In the preprocessed file, the statement #include <stdio.h> has been replaced by the contents of stdio.h, and the macro STRING in printf("My first program - %s\n", STRING); has been replaced by "Hello, World!".

The lines that begin with #, such as

# 1 "/usr/include/stdio.h" 1 3 4
# 27 "/usr/include/stdio.h" 3 4
# 1 "/usr/include/features.h" 1 3 4
# 367 "/usr/include/features.h" 3 4
# 1 "/usr/include/x86_64-linux-gnu/sys/cdefs.h" 1 3 4
# 410 "/usr/include/x86_64-linux-gnu/sys/cdefs.h" 3 4
# 1 "/usr/include/x86_64-linux-gnu/bits/wordsize.h" 1 3 4
# 411 "/usr/include/x86_64-linux-gnu/sys/cdefs.h" 2 3 4
# 368 "/usr/include/features.h" 2 3 4

are called linemarkers. Linemarkers convey the source file name and line number information in the form

# linenum filename flags

They mean that the following line originated in file filename at line linenum. After the file name comes zero or more flags, which are ‘1’, ‘2’, ‘3’, or ‘4’. If there are multiple flags, spaces separate them. Here is what the flags mean:

  • ‘1’ means the start of a new file
  • ‘2’ means returning to a file (after having included another file)
  • ‘3’ means the following text comes from a system header file, so certain warnings should be suppressed
  • ‘4’ means the following text should be treated as being wrapped in an implicit extern "C" block

In # 1 "/usr/include/stdio.h" 1 3 4, flag 1 means the preprocessor starts reading the header file stdio.h, located in the /usr/include directory, at line 1. Flag 3 means the text that follows comes from a system header file, so the compiler suppresses certain warnings for it. Flag 4 tells the compiler to treat the text that follows as if it were wrapped in an implicit extern "C" block.

In # 411 "/usr/include/x86_64-linux-gnu/sys/cdefs.h" 2 3 4, flag 2 means the preprocessor has finished reading the header file it included earlier and is returning to cdefs.h at line 411. Flags 3 and 4 again mark the text that follows as coming from a system header file and as being treated as wrapped in an implicit extern "C" block.

If the only flags on a linemarker are 3 and 4, as in # 27 "/usr/include/stdio.h" 3 4, the preprocessor is neither entering nor returning to a file. The linemarker only records the current position inside the file it is already reading. If an #include directive sits at that position, the next line is a linemarker with flag 1 and line 1, such as # 1 "/usr/include/features.h" 1 3 4. Otherwise, the next line is ordinary text that does not begin with #, as in the following

# 64 "/usr/include/stdio.h" 3 4
typedef struct _IO_FILE __FILE;

In the compilation stage, the compiler performs tasks such as

  • checking the C program for syntax errors
  • translating the preprocessed file into assembly language
  • optionally optimizing the translated code for better performance

To run the compilation stage, use the following command

gcc -S helloworld.i -o helloworld.s

Compilation generates a temporary file with the .s extension.

Let’s look into the generated assembly file helloworld.s

    .file	"helloworld.c"
    .section	.rodata
.LC0:
    .string	"Hello, World!"
.LC1:
    .string	"My first program - %s\n"
    .text
    .globl	main
    .type	main, @function
main:
.LFB0:
    .cfi_startproc
    pushq	%rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq	%rsp, %rbp
    .cfi_def_cfa_register 6
    movl	$.LC0, %esi
    movl	$.LC1, %edi
    movl	$0, %eax
    call	printf
    movl	$0, %eax
    popq	%rbp
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size	main, .-main
    .ident	"GCC: (Ubuntu 5.4.0-6ubuntu1~16.04.9) 5.4.0 20160609"
    .section	.note.GNU-stack,"",@progbits

In the assembling stage, the assembler translates the assembly file into machine code. Successful assembling generates a file with the .o extension (on Linux) or .obj (on Windows), known as an object file.

To run the assembling stage, use the following command

as helloworld.s -o helloworld.o

The generated object file is binary, so a text editor cannot display it meaningfully; use a hex editor instead.


The final stage of the compilation process links a set of object files into a single executable program file. An executable needs many external functions from the system and C run-time libraries, and the linker resolves all of these dependencies.

To link statically, use the following command

ld -static -o helloworld -L`gcc -print-file-name=` /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o helloworld.o /usr/lib/x86_64-linux-gnu/crtn.o --start-group -lc -lgcc -lgcc_eh --end-group

To link dynamically, use the following command

ld -dynamic-linker /lib/x86_64-linux-gnu/ -o helloworld -L`gcc -print-file-name=` /usr/lib/x86_64-linux-gnu/crt1.o /usr/lib/x86_64-linux-gnu/crti.o helloworld.o /usr/lib/x86_64-linux-gnu/crtn.o --start-group -lc -lgcc -lgcc_eh --end-group
