Add rest of C++ tutorial
lukemartinlogan committed Dec 13, 2024
1 parent 4721c42 commit c01e2f3
Showing 14 changed files with 3,550 additions and 7 deletions.
@@ -102,7 +102,7 @@ echo ${MY_VAR}
To use this bash script, run:

```bash
-cd ${SCS_TUTORIAL}/1.1.linux_intro
+cd ${GRC_TUTORIAL}/1.1.linux_intro
```

### Limited Scope
@@ -7,8 +7,8 @@ various HPC programs from scratch.
## Setup

```bash
-mkdir ${SCS_TUTORIAL}/2.2.1.scratch
-cd ${SCS_TUTORIAL}/2.2.1.scratch
+mkdir ${GRC_TUTORIAL}/2.2.1.scratch
+cd ${GRC_TUTORIAL}/2.2.1.scratch
mkdir src
mkdir install
```
@@ -45,7 +45,7 @@ Next we will configure Zlib to the particular machine.

```bash
cd zlib-1.2.13
-./configure --prefix=${SCS_TUTORIAL}/2.2.1.scratch/install
+./configure --prefix=${GRC_TUTORIAL}/2.2.1.scratch/install
```

The output of ./configure is a Makefile.
@@ -22,7 +22,10 @@ where a program crashes, without having to use an expensive debugger all the tim
## Setup

```bash
-cd ${SCS_TUTORIAL}/cpp_hello_world
+git clone https://github.com/grc-iit/grc-tutorial.git
+cd grc-tutorial
+export GRC_TUTORIAL=${PWD}
+cd ${GRC_TUTORIAL}/cpp/01-cpp-hello-world
```

## C++ Source File
@@ -99,8 +102,9 @@ the program succeeded. Any other value indicates a failure and the reason for fa

## Building

-We will build this code manually using gcc. It is generally a bad idea to compile things manually, but the knowledge of how
-the compiler is called will be helpful.
+We will build this code manually using gcc. In general, building things manually
+is a bad idea. Build tools (CMake nowadays for C/C++) automate much of the
+process. However, the knowledge of how the compiler is called is helpful.

Here we will use gcc to compile the program "`hello_world.cc`".

346 changes: 346 additions & 0 deletions docs/02-hpc-tutorials/04-cpp-introduction/02-cpp-build-manually.md
@@ -0,0 +1,346 @@
# Building C++ Code Manually
In this section, we have five objectives:
1. How to compile a program into a shared object
2. How to compile a program into an executable
3. How to link a shared object to an executable
4. How to run an executable which relies on a shared object
5. How to tell the compiler to optimize code

An **executable** is a piece of code that can be run in a terminal to produce
some sort of output. A **shared object** is code that an executable relies on,
but in and of itself is not executable. Shared objects are the primary way that
C++ enables software to be re-used across code bases and avoid code duplication.

Imagine you have two services:
1. A movie database which tracks information such as movie description,
user reviews, critic reviews, actors, etc.
2. A grocery database which tracks information such as product name,
product description, quantity available, price, etc.

Both services are databases. Databases provide a few general functions: creating
records, reading records, updating records, and deleting records (CRUD). Rather
than developing a separate database management system for each use case, one
could develop a single generic database technology and then build both the
movie and grocery databases on top of it.

In C++, this can be done using a shared library. In our example:
1. [src/database_lib.cc](https://github.com/scs-lab/scs-tutorial/blob/main/3.2.building_cpp/src/database_lib.cc) implements the CRUD operations.
2. [src/grocery_db.cc](https://github.com/scs-lab/scs-tutorial/blob/main/3.2.building_cpp/src/grocery_db.cc) implements the grocery database on top of CRUD
3. [src/movies_db.cc](https://github.com/scs-lab/scs-tutorial/blob/main/3.2.building_cpp/src/movies_db.cc) implements the movie database on top of CRUD

## Setup + Repo Structure

```bash
git clone https://github.com/grc-iit/grc-tutorial.git
cd grc-tutorial
export GRC_TUTORIAL=${PWD}
cd ${GRC_TUTORIAL}/cpp/02-cpp-build-manually
```

In this example, our repo is structured as follows:
1. src: contains all source files
2. include: contains all header files

This is the typical way a C++ repo is structured. In many cases, header files
are meant to be public and available to other programs, whereas source files
are almost always private. Keeping them in separate directories makes it much
easier to share the headers without also exposing the sources.

## Database Library Header
```cpp
#ifndef DATABASE_LIB_H
#define DATABASE_LIB_H

#include <string>

class Database {
 public:
  std::string file_;

  Database(const std::string &file) : file_(file) {}

  void create();
  void read();
  void update();
  void del();
};

#endif
```

The header file defines the Database class. The class declares the CRUD
methods but does not implement them. Their implementation is in the source
file.

## Database Library Source

```cpp
#include <iostream>
#include "database_lib.h"

void Database::create() {
  std::cout << file_ << ": in create" << std::endl;
}

void Database::read() {
  std::cout << file_ << ": in read" << std::endl;
}

void Database::update() {
  std::cout << file_ << ": in update" << std::endl;
}

void Database::del() {
  std::cout << file_ << ": in delete" << std::endl;
}
```

The source file implements the Database methods. For simplicity, each method
only prints a message. We include "database_lib.h" so that the compiler can
find the declaration of the Database class we're implementing.

## Grocery Database Source

```cpp
#include "database_lib.h"

int main() {
  Database db("grocery");
  db.create();
  db.read();
  db.update();
  db.del();
}
```
Here we create the grocery database by instantiating the Database class
declared in "database_lib.h".

## Movies Database Source

```cpp
#include "database_lib.h"

int main() {
  Database db("movies");
  db.create();
  db.read();
  db.update();
  db.del();
}
```
Here we create the movies database by instantiating the Database class
declared in "database_lib.h".

## C++ Compiler Pipeline

C/C++ compilation is divided into 4 phases:
1. **Preprocessing**: Locates and loads #include files and makes simple
textual modifications to the source code (such as expanding macros).
2. **Compiling**: Turns the preprocessed source code into assembly code.
3. **Assembling**: Converts assembly code into machine code
(i.e., object code).
4. **Linking**: Locates libraries and ensures that any missing symbols
in the object code are resolved.

## The Build Directory

First, we should make a build directory to store our files. This
makes cleaning up intermediate files much easier.

```bash
mkdir build
```

## Compile + Assemble database_lib.cc

We first try to compile the database library into an object file:
```bash
g++ src/database_lib.cc -fpic -c -o build/database_lib.o
```

-fpic tells the compiler to generate Position Independent Code (-fPIC is a
closely related variant of the same flag). This is necessary whenever building
a shared object, because a shared library can be loaded at a different address
in each process that uses it, so machine code that depends on fixed addresses
is problematic.

-c tells g++ to compile and assemble the source, but not link it, producing an
object file.

-o build/database_lib.o sets the output of the
compilation to be build/database_lib.o.

**You should receive the following error**:
<pre><b>src/database_lib.cc:2:10:</b> <font color="#EF2929"><b>fatal error: </b></font>database_lib.h: No such file or directory
2 | #include <font color="#EF2929"><b>&quot;database_lib.h&quot;</b></font>
| <font color="#EF2929"><b>^~~~~~~~~~~~~~~~</b></font>
compilation terminated.</pre>

This is because the compiler doesn't know to look in the include directory
for the database_lib.h file. There are two ways to tell the compiler where to
search for this file.

### Fix 1: The -I flag

```bash
g++ src/database_lib.cc -I${PWD}/include -fpic -c -o build/database_lib.o
```

-I${PWD}/include adds the include directory to the list of directories the
compiler searches for headers.

### Fix 2: Environment Variables

```bash
INCLUDE=${PWD}/include \
CPATH=${PWD}/include \
g++ src/database_lib.cc -fpic -c -o build/database_lib.o
```

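GCC searches the directories listed in CPATH for header files, as if they had
been passed with -I (but after any directories given with -I). INCLUDE is the
equivalent variable for some other compilers (e.g., MSVC) and is not read by GCC.
This approach is useful because you don't need to modify build scripts in
order for it to work.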

## Link database_lib.o
```bash
g++ -shared build/database_lib.o -o build/libdatabase_lib.so
```

This command will produce the shared library. Note, the general naming
convention of a shared object is "libNAME.so". The linker's -l flag expects
the file name to begin with "lib" and end with an "so" extension, so
-lNAME refers to libNAME.so.

## Create the Executables

```bash
g++ src/grocery_db.cc -I${PWD}/include -ldatabase_lib -o grocery_db
g++ src/movies_db.cc -I${PWD}/include -ldatabase_lib -o movies_db
```

-ldatabase_lib tells the linker to search for a library named libdatabase_lib.so.

**You should receive the following error**:
```
/usr/bin/ld: cannot find -ldatabase_lib
collect2: error: ld returned 1 exit status
```

This is because the linker doesn't know where to search for libdatabase_lib.so.
We have to tell it to search the build directory. There are two fixes.

### Fix 1: The -L flag

```bash
g++ src/grocery_db.cc -I${PWD}/include -L${PWD}/build -ldatabase_lib -o build/grocery_db
g++ src/movies_db.cc -I${PWD}/include -L${PWD}/build -ldatabase_lib -o build/movies_db
```

-L${PWD}/build tells the linker to search this directory for libraries.

### Fix 2: Environment Variables

```bash
export LIBRARY_PATH=${PWD}/build
g++ src/grocery_db.cc -I${PWD}/include -ldatabase_lib -o build/grocery_db
g++ src/movies_db.cc -I${PWD}/include -ldatabase_lib -o build/movies_db
```

LIBRARY_PATH is read by GCC at link time. The directories it lists are searched
for libraries requested with -l, in addition to any directories given with -L.

## Run the Executable

```bash
build/grocery_db
build/movies_db
```

**You should see the following error**:
```
build/grocery_db: error while loading shared libraries: libdatabase_lib.so: cannot open shared object file: No such file or directory
```

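This is because shared objects are loaded dynamically at runtime. The linker
only records that the executable needs libdatabase_lib.so, not where the file
is located, so the dynamic loader must be able to find the file when the
program starts. There are two fixes to this problem: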

### Fix 1: Install the shared objects

The OS will search a number of paths by default when loading a program.
Common locations include /usr/lib and /usr/local/lib. You can copy your shared
object into one of these locations to resolve the issue. This approach,
however, requires root privileges.

```bash
sudo cp build/libdatabase_lib.so /usr/lib/libdatabase_lib.so
build/grocery_db
build/movies_db
sudo rm /usr/lib/libdatabase_lib.so
```

From grocery_db:
```
grocery: in create
grocery: in read
grocery: in update
grocery: in delete
```

From movies_db:
```
movies: in create
movies: in read
movies: in update
movies: in delete
```

### Fix 2: Environment Variables

```bash
export LD_LIBRARY_PATH=${PWD}/build:${LD_LIBRARY_PATH}
build/grocery_db
build/movies_db
```

From grocery_db:
```
grocery: in create
grocery: in read
grocery: in update
grocery: in delete
```

From movies_db:
```
movies: in create
movies: in read
movies: in update
movies: in delete
```

## Introduce Compiler Optimization

One of the things to consider when building code is how optimized you want
the code to be. Compilers allow for varying levels of optimization. You
may ask, why would I ever want unoptimized code? There are two main reasons:
1. Some optimizations are dangerous and can break your program
2. Debugging is much harder since code can get rearranged for optimization

We won't go into detail about every optimization compilers provide. We will
only mention the typical optimizations used to make C++ code perform well.
The optimizations described in this section are all safe and will not
affect the correctness of programs that do not rely on undefined behavior.

In GCC and Clang, compiler optimization levels are tuned using the "-O"
parameter. There are four numbered levels of optimization:

0. No optimization
1. Some optimization
2. Moderate optimization
3. Heavy optimization

```bash
# No optimization
g++ src/database_lib.cc -I${PWD}/include -O0 -fpic -c -o build/database_lib.o
# Some optimization
g++ src/database_lib.cc -I${PWD}/include -O1 -fpic -c -o build/database_lib.o
# Moderate optimization
g++ src/database_lib.cc -I${PWD}/include -O2 -fpic -c -o build/database_lib.o
# Heavy optimization
g++ src/database_lib.cc -I${PWD}/include -O3 -fpic -c -o build/database_lib.o
```
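
One way to see that the optimization level changes the generated code is to compare the assembly produced at two levels. This is a minimal sketch, assuming g++ is installed. The file `square.cc` and its function are hypothetical and used only for this demonstration.

```bash
# Work in a throwaway directory so nothing in the tutorial repo is touched.
workdir=$(mktemp -d)
cd "${workdir}"

# Hypothetical function with a simple loop for the optimizer to work on.
cat > square.cc <<'EOF'
int sum_of_squares(int n) {
  int total = 0;
  for (int i = 0; i < n; ++i) total += i * i;
  return total;
}
EOF

# -S stops after the compile phase and writes assembly instead of object code.
g++ square.cc -O0 -S -o square_O0.s
g++ square.cc -O2 -S -o square_O2.s

# Compare the size of the generated assembly and preview the differences.
wc -l square_O0.s square_O2.s
diff square_O0.s square_O2.s | head -n 20 || true
```

The exact assembly depends on the compiler version and the target machine, so the line counts will vary, but the two files should differ.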