Commit c01e2f3 (1 parent: 4721c42). Showing 14 changed files with 3,550 additions and 7 deletions.
docs/02-hpc-tutorials/04-cpp-introduction/02-cpp-build-manually.md (346 additions, 0 deletions)
# Building C++ Code Manually

This section covers five topics:
1. How to compile a program into a shared object
2. How to compile a program into an executable
3. How to link a shared object to an executable
4. How to run an executable which relies on a shared object
5. How to tell the compiler to optimize code

An **executable** is a program that can be run in a terminal to produce
some sort of output. A **shared object** (also called a shared library) is
compiled code that an executable relies on but that is not itself executable.
Shared objects are one of the main ways that C++ code is reused across code
bases without duplication.

Imagine you have two services:
1. A movie database which tracks information such as movie description,
   user reviews, critic reviews, actors, etc.
2. A grocery database which tracks information such as product name,
   product description, quantity available, price, etc.

Both services are databases. Databases provide a few general functions: creating
records, reading records, updating records, and deleting records (CRUD). Rather
than developing an entire database management system for each use case, one
could develop a single generic database technology and then build the movie
and grocery databases on top of it.

In C++, this can be done using a shared library. In our example:
1. [src/database_lib.cc](https://github.com/scs-lab/scs-tutorial/blob/main/3.2.building_cpp/src/database_lib.cc) implements the CRUD operations.
2. [src/grocery_db.cc](https://github.com/scs-lab/scs-tutorial/blob/main/3.2.building_cpp/src/grocery_db.cc) implements the grocery database on top of CRUD.
3. [src/movies_db.cc](https://github.com/scs-lab/scs-tutorial/blob/main/3.2.building_cpp/src/movies_db.cc) implements the movie database on top of CRUD.

## Setup + Repo Structure

```bash
git clone https://github.com/grc-iit/grc-tutorial.git
cd grc-tutorial
export GRC_TUTORIAL=${PWD}
cd ${GRC_TUTORIAL}/cpp/02-cpp-build-manually
```

In this example, our repo is structured as follows:
1. src: contains all source files
2. include: contains all header files

This is a typical way to structure a C++ repo. In many cases,
header files are meant to be public and available to other programs, whereas
source files are almost always private. Keeping them in separate directories
makes it easy to publish the headers without exposing the sources.

## Database Library Header
```cpp
#ifndef DATABASE_LIB_H
#define DATABASE_LIB_H

#include <string>

class Database {
 public:
  std::string file_;

  Database(const std::string &file) : file_(file) {}

  // CRUD operations; declared here, defined in the source file
  void create();
  void read();
  void update();
  void del();
};

#endif
```

The header file defines the Database class and declares its CRUD
methods. The methods are not implemented here; their implementations
are in the source file.

## Database Library Source

```cpp
#include <iostream>
#include "database_lib.h"

void Database::create() {
  std::cout << file_ << ": in create" << std::endl;
}

void Database::read() {
  std::cout << file_ << ": in read" << std::endl;
}

void Database::update() {
  std::cout << file_ << ": in update" << std::endl;
}

void Database::del() {
  std::cout << file_ << ": in delete" << std::endl;
}
```

The source file implements the Database methods. For simplicity, each method
only prints a message. We include "database_lib.h" so that the compiler can
see the declaration of the Database class we're implementing.

## Grocery Database Source

```cpp
#include "database_lib.h"

int main() {
  Database db("grocery");
  db.create();
  db.read();
  db.update();
  db.del();
}
```
Here we create the grocery database by instantiating the Database class
declared in "database_lib.h".

## Movies Database Source

```cpp
#include "database_lib.h"

int main() {
  Database db("movies");
  db.create();
  db.read();
  db.update();
  db.del();
}
```
Here we create the movies database by instantiating the Database class
declared in "database_lib.h".

## C++ Compiler Pipeline

Building a C/C++ program involves four phases:
1. **Preprocessing**: Locates and inserts #include files, expands macros, and
   makes other simple textual modifications to the source code.
2. **Compiling**: Turns the preprocessed source code into assembly code.
3. **Assembling**: Converts assembly code into machine code
   (i.e., object code).
4. **Linking**: Locates libraries and ensures that any missing symbols
   in the object code are resolved.

## The Build Directory

First, we should make a build directory to store our files. This
makes cleaning up intermediate files much easier.

```bash
mkdir build
```

## Compile + Assemble database_lib.cc

We first try to compile the database library into an object file:
```bash
g++ src/database_lib.cc -fpic -c -o build/database_lib.o
```

-fpic tells the compiler to generate position-independent code (PIC); -fPIC
is a variant of the same flag. This is necessary whenever building a shared
object, because a shared library can be loaded at a different address in each
program that uses it, so code that depends on fixed addresses is problematic.

-c tells g++ to compile and assemble the source into an object file without
linking.

-o build/database_lib.o sets the output of the
compilation to be build/database_lib.o.

**You should receive the following error**:
<pre><b>src/database_lib.cc:2:10:</b> <font color="#EF2929"><b>fatal error: </b></font>database_lib.h: No such file or directory
    2 | #include <font color="#EF2929"><b>"database_lib.h"</b></font>
      |          <font color="#EF2929"><b>^~~~~~~~~~~~~~~~</b></font>
compilation terminated.</pre>

This is because the compiler doesn't know to look in the include directory
for the database_lib.h file. There are two ways to tell the compiler where
to search for this file.

### Fix 1: The -I flag

```bash
g++ src/database_lib.cc -I${PWD}/include -fpic -c -o build/database_lib.o
```

-I${PWD}/include ensures that the compiler searches the include directory
for headers.

### Fix 2: Environment Variables

```bash
INCLUDE=${PWD}/include \
CPATH=${PWD}/include \
g++ src/database_lib.cc -fpic -c -o build/database_lib.o
```

GCC searches the directories listed in CPATH for header files, as if they had
been passed with -I. INCLUDE serves a similar purpose for some other compilers,
such as MSVC; g++ does not read it. This approach is useful because you don't
need to modify build scripts in order for it to work.

## Link database_lib.o
```bash
g++ -shared build/database_lib.o -o build/libdatabase_lib.so
```

This command produces the shared library. Note that the general naming
convention for a shared object is "libNAME.so". The linker's -l option
expects a shared object's file name to begin with "lib" and end with the
".so" extension.

## Create the Executables

```bash
g++ src/grocery_db.cc -I${PWD}/include -ldatabase_lib -o grocery_db
g++ src/movies_db.cc -I${PWD}/include -ldatabase_lib -o movies_db
```

-ldatabase_lib tells the linker to search for libdatabase_lib.so.

**You should receive the following error**:
```
/usr/bin/ld: cannot find -ldatabase_lib
collect2: error: ld returned 1 exit status
```

This is because the linker doesn't know where to search for libdatabase_lib.so.
We have to tell it to search the build directory. There are two fixes.

### Fix 1: The -L flag

```bash
g++ src/grocery_db.cc -I${PWD}/include -L${PWD}/build -ldatabase_lib -o build/grocery_db
g++ src/movies_db.cc -I${PWD}/include -L${PWD}/build -ldatabase_lib -o build/movies_db
```

-L${PWD}/build tells the linker to search this directory for libraries.

### Fix 2: Environment Variables

```bash
export LIBRARY_PATH=${PWD}/build
g++ src/grocery_db.cc -I${PWD}/include -ldatabase_lib -o build/grocery_db
g++ src/movies_db.cc -I${PWD}/include -ldatabase_lib -o build/movies_db
```

## Run the Executable

```bash
build/grocery_db
build/movies_db
```

**You should see the following error**:
```
build/grocery_db: error while loading shared libraries: libdatabase_lib.so: cannot open shared object file: No such file or directory
```

This is because shared objects are loaded dynamically at runtime. The -L flag
and LIBRARY_PATH only tell the linker where to find the library at build time;
when the program starts, the dynamic loader searches its own set of paths.
There are two fixes to this problem:

### Fix 1: Install the shared objects

The dynamic loader searches a number of paths by default when starting a
program. Two of the most common locations are /usr/lib and /usr/local/lib. You
can copy your shared object into one of these locations to resolve the issue.
This approach, however, requires root privileges.

```bash
sudo cp build/libdatabase_lib.so /usr/lib/libdatabase_lib.so
build/grocery_db
build/movies_db
sudo rm /usr/lib/libdatabase_lib.so
```

From grocery_db:
```
grocery: in create
grocery: in read
grocery: in update
grocery: in delete
```

From movies_db:
```
movies: in create
movies: in read
movies: in update
movies: in delete
```

### Fix 2: Environment Variables

```bash
export LD_LIBRARY_PATH=${PWD}/build:${LD_LIBRARY_PATH}
build/grocery_db
build/movies_db
```

From grocery_db:
```
grocery: in create
grocery: in read
grocery: in update
grocery: in delete
```

From movies_db:
```
movies: in create
movies: in read
movies: in update
movies: in delete
```

## Introduce Compiler Optimization

One of the things to consider when building code is how optimized you want
the code to be. Compilers allow for varying levels of optimization. You
may ask, why would I ever want unoptimized code? There are two main reasons:
1. Some aggressive optimizations can break programs, particularly ones that
   rely on undefined behavior
2. Debugging is much harder since code can get rearranged for optimization

We won't go into detail about every optimization compilers provide. We will
only mention the typical optimizations used to make C++ code perform well.
The optimization levels described in this section should not affect the
correctness of well-defined programs.

In GCC and Clang, compiler optimization levels are set using the "-O"
option. There are four numbered levels of optimization:

0. No optimization
1. Some optimization
2. Moderate optimization
3. Heavy optimization

```bash
# No optimization
g++ src/database_lib.cc -I${PWD}/include -O0 -fpic -c -o build/database_lib.o
# Some optimization
g++ src/database_lib.cc -I${PWD}/include -O1 -fpic -c -o build/database_lib.o
# Moderate optimization
g++ src/database_lib.cc -I${PWD}/include -O2 -fpic -c -o build/database_lib.o
# Heavy optimization
g++ src/database_lib.cc -I${PWD}/include -O3 -fpic -c -o build/database_lib.o
```