Learn about the process of compiling a C program

Sebastián Paruma
5 min read · Jun 15, 2020

Today we’re going to learn what the C programming language is, what a compiler is (we’re going to use GCC), how it works, and what happens inside it.

Let’s start with it…

What is a programming language?

I’m guessing that you use applications every day, either on your smartphone or on your computer. These applications are software developed by programmers, who build them by writing code in a language that can then be processed into a form a smartphone or a computer understands.

In this blog post, we’re going to learn about one of the most popular languages: C.

C is a powerful general-purpose programming language developed by Dennis Ritchie; it first appeared in 1972. It was intended for developing operating systems, thanks to features like low-level access to memory that allow development close to the device hardware.

How do we start?

To start writing a C program, we’re going to access the shell via a terminal (you can learn about it here, in my first blog post). The shell lets us interact with the system, and inside it we’ll have all the tools required to write and run our first C programs.

One of the necessary tools is the compiler.

What is a compiler?

It’s a program that ‘translates’ the instructions we write in our code into machine language so that they can be read and executed by a computer. This process is known as compilation, and we will dive deeper into it later on.

We will use the GCC compiler; the initials stand for GNU Compiler Collection. It supports many programming languages, including C, it’s free software, and it’s the standard compiler on most Unix-like operating systems.

The GNU Project is a mass collaborative initiative for the development of free software. Richard Stallman announced it in 1983 while working at MIT. “Free” refers to the freedom of anyone who wishes to run, copy, distribute, study, change and improve the software (adapted from Techtarget).

What do we compile?

Well, we need to have or write the source code of the program that we want to compile.

“Source code is the list of human-readable instructions that a programmer writes (often in a word processing program) when he is developing a program”. ~ThoughtCo.

In the example below, we can see the source code of the file “hello.c”, a program written in the C programming language that prints the sentence “Hello, World!”.

Let’s take a look at the different parts it has…

Hello, World!

In the first line, we have a preprocessor directive, “#include”. It tells the preprocessor to insert the contents of a header file into our code. In this case, “stdio.h” is the header, and we will explain more about it later.

The comments about the code start on the third line and end on the fifth. All they do is remind us what the program does. It’s a really good practice to have them in every program we write, no matter whether it’s tiny or really complex.

In the seventh line, we can see the “entry point”, which in this case is the function main(void). The entry point is the function where execution begins when the operating system runs our program. Inside the curly brackets, our code has two statements. The first one, in the tenth line, prints the sentence “Hello, World!” followed by a new line using the printf function.

The second one, in the twelfth line, is the return statement: it ends the program and returns the value 0, which signals that the program finished properly.

Now, let’s compile it!

By now we should understand what this program will do once we turn it into an executable: as we saw before, its goal is to print the sentence “Hello, World!”

To compile our “hello.c” file (our source code) we’re going to enter the following command:

gcc hello.c

Remember when we talked about the ‘GNU Compiler Collection’ before? This time we will use it to make our program executable and see how it runs.

What happens when you type ‘gcc hello.c’?

Compilation process in C: pre-processor, compiler, assembler and linker.

Let’s see in detail what these steps do!

Pre-Processor

  • Removes the comments from our code
  • Includes the code of header files (.h), which contain function declarations and macro definitions
  • Replaces every macro name (an alias defined in our code or in a header) with its definition

To see this step in our “hello.c” file we need to enter the following command:

gcc -E hello.c
Last lines of the output we get.

Compiler

After the file is preprocessed, the compiler translates it into assembly language, passing through an internal Intermediate Representation (IR) along the way. The result is a “.s” file that contains our code in assembly language.

Let’s see this step in our “hello.c” file by entering the command:

gcc -S hello.c
Here’s what our “hello.s” file contains.

Assembler

In this step, the assembler converts the assembly code from the previous step into object code, that is, machine language code (not human-readable), and stores it in a “.o” file.

We can see this step applied to our “hello.c” file by entering this command:

gcc -c hello.c
Our program in machine language code.

Linker

This is the final step of compiling a C program. Here the object code of our program is linked with the object code of the library functions it uses (such as printf). If a program is split across two or more source files, this is also the step where their object code is linked together into one program.

The output of this step is an executable file that we can run to see our program working.

When the whole compilation process ends, we get by default an executable file named “a.out”, but we can give it any name we want (e.g. hello) by using the -o option:

gcc hello.c -o hello

And run this executable with:

./hello
Final output.

Now we’ve learned what happens “behind the scenes” when compiling a C program.
