Fundamentals of a C++ program

Overview

These notes focus on C++17, one of the most widely used versions of C++ in the industry. The prerequisite for this set of documents is a good knowledge of the C programming language, notes for which are available in the previous section. These notes focus only on what C++ offers on top of what C already provides.

The coding examples in these notes are meant to be compiled with a standard GNU g++ compiler. Any editor or IDE can be used to write them, such as VS Code, vi, or CodeLite.

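Most C programs can be compiled with a C++ compiler, but not the other way round. C++ supports nearly all the features of C, with a few exceptions (for example, C++ keywords such as class and new cannot be used as identifiers), and C++ programs can use those features directly.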

Structure of a C++ program

C++ has roughly 90 keywords. The more keywords a language has, the more complex its grammar tends to be, although some of the keywords are seldom used. One need not memorise all the keywords, as there is online documentation (https://en.cppreference.com/) for things you seldom need. But it is essential to know the subset that is most commonly used. The list below follows cppreference and also includes keywords introduced in C++20 and in Technical Specifications, which are not available in C++17. The set of keywords in C++ is as follows:

alignas
alignof
and
and_eq
asm
atomic_cancel
atomic_commit
atomic_noexcept
auto
bitand
bitor
bool
break
case
catch
char
char8_t
char16_t
char32_t
class
compl
concept
const
consteval
constexpr
constinit
const_cast
continue
co_await
co_return
co_yield
decltype
default
delete
do
double
dynamic_cast
else
enum
explicit
export
extern
false
float
for
friend
goto
if
inline
int
long
mutable
namespace
new
noexcept
not
not_eq
nullptr
operator
or
or_eq
private
protected
public
reflexpr
register
reinterpret_cast
requires
return
short
signed
sizeof
static
static_assert
static_cast
struct
switch
synchronized
template
this
thread_local
throw
true
try
typedef
typeid
typename
union
unsigned
using
virtual
void
volatile
wchar_t
while
xor
xor_eq

C++ keywords

Preprocessor directives

A preprocessor, also called a precompiler, is a program that processes the source code before the compiler sees it. It looks for preprocessor directives and executes them. Preprocessor directives start with # (the pound or hash symbol). Examples of preprocessor directives are as follows:

`#include <header_file.h>`

In the C/C++ programming languages, the #include directive tells the preprocessor to insert the contents of another file into the source code at the point where the #include directive is found. Include directives are typically used to include the header files for C functions that are held outside of the current source file.

For example, we may call math functions such as pow and log from the source code we write. These functions are declared in a standard header file. Including math.h (or cmath in C++) places those declarations in the source code before the compilation phase, so the compiler knows how the functions are to be called. The definitions themselves live in the standard library, which the linker combines with our object code to produce the final binary.

`#include "header_file.h"`

When functions we call in our source code are not part of the standard library provided by C/C++, we can declare them in a separate header file and write the corresponding definitions in a C/C++ source file. The header then needs to be included in the source code we write. A user-defined header file is included using double quotes, in contrast to the angle brackets used for standard header files.

`#if, #elif, #else, #endif`

The #if directive, with the #elif, #else, and #endif directives, controls the compilation of portions of a source file. If the expression written (after the #if) has a nonzero value, the line group immediately following the #if directive is kept in the translation unit.

Each #if directive in a source file must be matched by a closing #endif directive. Any number of #elif directives can appear between the #if and #endif directives, but at most one #else directive is allowed. The #else directive is optional; if present, it must be the last directive before #endif.

The #if, #elif, #else, and #endif directives can nest in the text portions of other #if directives. Each nested #else, #elif, or #endif directive belongs to the closest preceding #if directive.

All conditional-compilation directives, such as #if and #ifdef, must match a closing #endif directive before the end of the file. Otherwise, an error message is generated. When conditional-compilation directives are contained in include files, they must satisfy the same conditions: There must be no unmatched conditional-compilation directives at the end of the include file.

Macro replacement is done within the part of the line that follows an #elif command, so a macro call can be used in the constant expression. The preprocessor selects one of the given occurrences of text for further processing. A block specified in the text can be any sequence of text. It can occupy more than one line. Usually, the text is program text that has meaning to the compiler or the preprocessor.

The preprocessor processes the selected text and passes it to the compiler. If the text contains preprocessor directives, the preprocessor carries out those directives. Only text blocks selected by the preprocessor are compiled.

The preprocessor selects a single text item by evaluating the constant expression following each #if or #elif directive until it finds a true (non-zero) constant expression. It selects all text (including other preprocessor directives beginning with #) up to its associated #elif, #else, or #endif.

If the #if expression and every #elif expression (if any) are false, the preprocessor selects the text block after the #else clause. When there is no #else clause and every constant expression in the #if block is false, no text block is selected.

The constant expression is an integer constant expression with these additional restrictions:

Expressions must have an integral type and can include only integer constants, character constants, and the defined operator.
The expression can’t use sizeof or a type-cast operator.
The target environment may be unable to represent all ranges of integers.
The translation represents type int the same way as type long, and unsigned int the same way as unsigned long.
The translator can translate character constants to a set of code values different from the set for the target environment. To determine the properties of the target environment, use an app built for that environment to check the values of the LIMITS.H macros.
The expression must not query the environment and must remain insulated from implementation details on the target computer.

`#ifdef`

In C and C++, the #ifdef directive allows for conditional compilation. The preprocessor checks whether the given macro has been defined before including the subsequent code in the compilation process. If the macro is defined, the block inside #ifdef is passed on to the compiler; otherwise the block is skipped.

Syntax:

#ifdef macro_name
    // code compiled only if macro_name is defined
#endif

The macro must be defined for the preprocessor to include the enclosed source code in the compiled application. Note that the #ifdef directive must be closed by an #endif directive.

Example:

#include <stdio.h>

#define DEFINED 1

int main()
{
    #ifdef DEFINED
    printf("DEFINED!");
    #endif
    return 0;
}

Output: DEFINED!

A common use for the #ifdef directive is to enable the insertion of platform-specific source code into a program.

`#ifndef`

In C and C++, the #ifndef directive allows for conditional compilation. The preprocessor checks that the given macro has not been defined before including the subsequent code in the compilation process. If the macro is not defined, the block inside #ifndef is passed on to the compiler; otherwise the block is skipped.

Syntax:

#ifndef macro_name
    // code compiled only if macro_name is not defined
#endif

The macro must not be defined for the preprocessor to include the enclosed source code in the compiled application. Note that the #ifndef directive must be closed by an #endif directive.

Example:

#include <stdio.h>

// #define DEFINED 0

int main()
{
    #ifndef DEFINED
    printf("NOT DEFINED!");
    #endif
    return 0;
}

Output: NOT DEFINED!

Common uses for the #ifndef directive are supplying a default definition for a macro that has not been defined elsewhere and guarding a header file against being included more than once.

`#define`

The #define directive allows the definition of macros within your source code. These macro definitions allow constant values to be declared for use throughout your code. Every occurrence of the macro name in the source code after the #define is replaced with the defined value.

Macro definitions are not variables and cannot be changed in the program code. This syntax is generally used while creating constants that represent numbers, strings or expressions.

Syntax

#define CONSTANT value
// OR
#define CONSTANT (expression)

CONSTANT is the name of the constant. It is a common practice to define the constants in all uppercase but there is no explicit rule that this has to be done.
value is the value of the constant.
expression: The expression whose value is assigned to the constant. The expression must be enclosed in parentheses if it contains operators.

Note that the semicolon character should not be put at the end of #define statements. This is a common mistake.

Examples:

#include <iostream>
#define COMPANY "Quantmasters"
int main()
{
    std::cout << COMPANY << " is an ed-tech company";
    return 0;
}

Output: Quantmasters is an ed-tech company

#include <iostream>
#define NUMBER (10/2)

int main()
{
    std::cout << "10 / 2 = " << NUMBER;
    return 0;
}

Output: 10 / 2 = 5

`#undef`

The #undef directive tells the preprocessor to remove all definitions for the specified macro. A macro can be redefined after it has been removed by the #undef directive. Once a macro is undefined, an #ifdef directive on that macro will evaluate as false.

Syntax

#undef macro_definition

Example:

#include <stdio.h>

#define DEFINED 1

#undef DEFINED

int main()
{
    #ifdef DEFINED
        printf("DEFINED");
    #endif
    #ifndef DEFINED
        printf("NOT DEFINED");
    #endif
    return 0;
}

Output: NOT DEFINED

In this example, the DEFINED macro is first defined with a value of 1 and then undefined using the #undef directive. Since the macro no longer exists, the statement #ifdef DEFINED evaluates to false. This causes the subsequent printf function to be skipped.

`#line`

The #line directive tells the preprocessor to set the compiler’s reported values for the line number and filename to a given line number and filename.

Syntax

#line digit-sequence ["filename"]

Here filename is optional and digit-sequence must be an integer value.

Examples:

#include <stdio.h>

int main()
{
    printf("Line: %d\n", __LINE__);    // printing line number, line 5

    // resetting line to 23, although the next line would otherwise be line 9
    #line 23
    printf("Line: %d\n", __LINE__);    // printing line number, now 23

    printf("Line: %d, File: %s\n", __LINE__, __FILE__);
    // now we use #line to reset the filename to "new_filename.c"
    // and the line number to 83
    #line 83 "new_filename.c"
    printf("Line: %d, File: %s\n", __LINE__, __FILE__);
    return 0;
}

Output:

Line: 5
Line: 23
Line: 25, File: main.cpp
Line: 83, File: new_filename.c

`#error`

The #error directive causes preprocessing to stop at the location where the directive is encountered. Information following the #error directive is output as a message prior to stopping preprocessing.

Syntax

#error message

Examples:

#include <stdio.h>
#define NUM 50
int main()
{
    #ifndef NUM
        #error NUM not defined
    #endif
    return 0;
}

The above code doesn’t produce any output.

#include <stdio.h>
#define NUM 50
int main()
{
    #ifdef NUM
        #error NUM defined
    #endif
    return 0;
}

Output:

main.c: In function ‘main’:
main.c:6:10: error: #error NUM defined
    6 |         #error NUM defined

`#pragma`

This directive is a special purpose directive and is used to turn on or off some features. These types of directives are compiler-specific i.e., they vary from compiler to compiler.

`#warning`

The #warning directive is similar to the #error directive but does not stop preprocessing. The text following the #warning directive is output as a message, and preprocessing then continues. It became part of the C++ standard only in C++23; before that, compilers such as GCC supported it as an extension.

Syntax

#warning message

Examples:

#include <stdio.h>
#define NUM 50
int main()
{
    #ifndef NUM
        #warning NUM not defined
    #endif
    return 0;
}

The above code doesn’t produce any output.

#include <stdio.h>
#define NUM 50
int main()
{
    #ifdef NUM
        #warning NUM defined
    #endif
    return 0;
}

Output:

main.c: In function ‘main’:
main.c:6:10: warning: #warning NUM defined [-Wcpp]
    6 |         #warning NUM defined

Overview of preprocessor directives

Preprocessor directives mind map

Of the directives above, #include is one of the most used. The preprocessor sees the #include statement and replaces it with the contents of the header file it refers to. Header files usually contain the prototypes (signatures) of the functions the program will use. The preprocessor then recursively processes the inserted content as well, so that any headers included by that header are expanded too. The output of this process is a single file containing the program the programmer has written along with the declarations of everything it uses. This makes it easy for the compiler to do its job.

Sometimes, the programmer might want to compile the same code into platform-dependent binaries, for example for Windows or macOS. In this case, the code has to use the libraries supporting the corresponding operating system. This is conditional compilation. It is widely used in cross-compilation, where binaries are built on one machine and executed on another. To achieve this, preprocessor directives such as #if, #elif, and #else are used extensively.

It is important to note that the C++ preprocessor does not understand C++. It simply follows the preprocessor directives and gets the source code ready for the compiler. The compiler is the program that understands C++.

Comments

Comments are human-readable explanations that help a programmer understand what a piece of code does. Comments in C++ source code are the same as those in C: a double forward slash (//) starts a single-line comment, and multiline comments are enclosed in /* and */.

The `main()` function

Every C++ program must have one and only one main() function. A program may consist of any number of files, but main() must be defined in exactly one of them. When a C++ program executes, the main() function is called by the operating system. The logic in main() is executed, and the value it returns is received by the operating system. Conventionally, a return value of 0 means the program executed successfully. Any other value indicates an error, and a table of error codes may be maintained to look up what went wrong.

There are 2 versions of main(), both of which are accepted as a standard version of main(). They are as follows:

int main() {
    // code
    return 0;
}

int main(int argc, char *argv[]) {
    // code
    return 0;
}

The first version of the main() is mostly used. The second version is used to receive command-line arguments from the operating system. The second version expects 2 pieces of information from the operating system:

Argument count – Count of the arguments (argc).
Argument vector – An array of pointers to C-style strings, one per argument (argv).

Command-line argument handling and usage will be dealt with in a later part of the notes. Note that main() should always return an integer.

Namespace

As the C++ programs we write get more complex, our own code is often combined with the C++ standard library and with libraries written by third-party developers. A variable or function may be defined in the standard library, and a variable or function of the same name may be defined by a third-party library. This causes a conflict in which the compiler does not know which variable or function to use when it is called. This is called a naming conflict.

C++ namespace is a feature which acts as a container that groups the code entities. If a programmer wants to use a variable or a function from a particular namespace, one can use the scope resolution operator (::). The syntax to do so is as follows:

<namespace>::<variable/function>

One might find it tedious to use this syntax every single time a variable has to be called or used. C++ provides a workaround to do that. One can use the using namespace directive at the beginning of the program to use a particular namespace. But note that, once the using namespace directive is used, it brings all the variables and functions which are defined in that namespace. This might lead to conflict too.

C++ provides a solution to this problem too. The programmer can mention the specific coding entities being used in the program at the beginning of the program. Only those entities will be imported. The syntax to do this is as follows:

using <namespace>::<entity>;

First C++ program

In this topic, we shall see our first C++ program using the concepts we have learned so far. Consider the following program:

#include <iostream>
using namespace std;
int main() {
    return 0;
}

This program neither performs any operations nor prints anything. It simply includes the iostream header, uses the using namespace directive to let the compiler know that it is going to use entities from the std namespace (though it uses none of them), and defines the main() function, which just returns 0 and exits.

Basic input-output using cin and cout

cin, cout, cerr and clog are defined in the C++ standard library. To use them, one must include the iostream header. C++ uses a stream abstraction to handle I/O with devices such as the keyboard and the console. cout is an output stream that defaults to the console/screen. cerr and clog are also output streams; both default to the standard error stream, with cerr unbuffered and clog buffered. cin is an input stream that defaults to the keyboard.

The insertion operator (<<) and the extraction operator (>>) are used with output and input streams respectively.

The insertion operator

The insertion operator inserts the value of the operand on its right into the stream on its left. Consider the following statement:

std::cout << variable_1;

In this case, the insertion operator inserts the value of variable_1 into the cout output stream. As cout defaults to the console, this statement displays the value of variable_1 on the screen.

Since we are using the stream abstraction, we can chain multiple insertions in the same statement, which makes basic I/O very easy. An example of this is as follows:

std::cout << "The value in variable_1 is: " << variable_1;

The insertion operator does not automatically add linebreaks to move the cursor to the next line on the console. But it can be achieved in two ways which are depicted in the following examples:

std::cout << variable_1 << "\n";
// Or
std::cout << variable_1 << std::endl;

If the end-line manipulator (endl) is used, it also flushes the stream, whereas the newline character \n only moves the cursor to the next line.

Examples: Insertion operator

#include <iostream>
using namespace std;
int main() {
    cout << "Hello World";
    return 0;
}

Output: Hello World

#include <iostream>
using namespace std;
int main() {
    cout << "Hello";
    cout << "World";
    return 0;
}

Output: HelloWorld
Conclusion: The insertion operator does not add any extra characters such as space or newline.

The extraction operator

This operator extracts the information from the operand to the left and stores this information in the operand to the right. Consider the following example:

std::cin >> variable_2;

In this case, the value from cin, which by default is keyboard, is taken and stored in variable_2. The way in which the information is interpreted is based on the type of the variable. If the data type of variable_2 is an integer, the input value is converted to an integer and then passed onto variable_2. If the variable is float, the value is converted to float.

The extraction operator can be chained to read multiple values in a single line. This is illustrated in the following example:

std::cin >> variable_2 >> variable_3;

The above statement reads a value from the keyboard, converts it to the data type of variable_2, and stores it in variable_2. The same process is then repeated for variable_3. So the values are read from left to right.

Note that the extraction operation can fail if the value entered cannot be interpreted as the data type of the target variable on the right side. For example, if the variable is an integer and the user enters "Prajwal", which is a string, the operation fails.

Example 1

#include <iostream>
#include <string>
using namespace std;
int main() {
    int variable_1;
    cin >> variable_1;
    cout << "variable_1: " << variable_1 << endl;
    string variable_2;
    cin >> variable_2;
    cout << "variable_2: " << variable_2;
    return 0;
}

Output

1
variable_1: 1
asdf
variable_2: asdf

Conclusion: The input value is interpreted based on the data type of the variable in which it is going to be stored. Hence the input value for variable_1 was an integer and for variable_2, it was interpreted as a string.

Example 2

#include <iostream>
#include <string>
using namespace std;
int main() {
    int variable_1;
    string variable_2;
    cin >> variable_1 >> variable_2;
    cout << "variable_1: " << variable_1 << endl;
    cout << "variable_2: " << variable_2;
    return 0;
}

Output

1
asdf
variable_1: 1
variable_2: asdf

Conclusion: The input value can be chained into a single cin statement and it works accordingly.

C++ primitive data types

The computer stores information in binary representation, and the size of the primitive data types is expressed in bits. The number of bits allocated to a data type determines both the number of unique values it can store (n bits give 2ⁿ values) and the amount of memory it needs. Hence it is important to choose a data type appropriate for the application. Size and precision are compiler dependent. The climits header contains macros describing the limits of the integer types for the compiler the program is being compiled with.

Size in bits	Number of unique values representable
1	2¹
2	2²
4	2⁴
8	2⁸
16	2¹⁶

Character types

Character types are used to represent characters such as the ones in the ASCII table. A character is often represented in 8 bits (1 byte). From the above table, it is clear that one can represent a maximum of 2^8 = 256 distinct characters using 8 bits. However, some languages such as Mandarin have thousands of characters, which cannot be represented in just 8 bits. In order to support these languages, C++ supports wider character types which can be as large as necessary. The following table describes some of the character types C++ supports.

Type Name	Size/Precision
char	Exactly 1 byte
char16_t	16 bits
char32_t	32 bits
wchar_t	Can represent the largest available character set

Unicode is a common standard used to represent multiple character sets in any language.

Integer types

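Integer types are used to represent whole numbers, both signed and unsigned. There are many versions of this data type. The following table shows the C++ integer data types together with the minimum size and range that the standard guarantees; actual sizes are compiler dependent and often larger (int, for example, is 32 bits with most current compilers).

Type name	Minimum size	climits macros	Minimum range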
signed short int	16 bits	SHRT_MIN / SHRT_MAX	[−32,767, +32,767]
signed int	16 bits	INT_MIN / INT_MAX	[−32,767, +32,767]
signed long int	32 bits	LONG_MIN / LONG_MAX	[−2,147,483,647, +2,147,483,647]
signed long long int	64 bits	LLONG_MIN / LLONG_MAX	[−9,223,372,036,854,775,807, +9,223,372,036,854,775,807]
unsigned short int	16 bits	0 / USHRT_MAX	[0, 65,535]
unsigned int	16 bits	0 / UINT_MAX	[0, 65,535]
unsigned long int	32 bits	0 / ULONG_MAX	[0, 4,294,967,295]
unsigned long long int	64 bits	0 / ULLONG_MAX	[0, +18,446,744,073,709,551,615]

In addition to this, it is possible to store both signed and unsigned integers in character data type. This capability of C/C++ is often exploited to efficiently store integers of a shorter range.

Floating-point types

The usual method used by computers to represent real numbers is floating-point notation. There are many varieties of floating-point notation and each has individual characteristics. The key concept is that a real number is represented by a number called the mantissa, times a base raised to an integer power called the exponent. The base is usually fixed, and the mantissa and the exponent vary to represent different real numbers. For example, if the base is fixed at 10, the number 123.45 could be represented as 12345 × 10⁻². The mantissa is 12345, and the exponent is -2. Other possible representations are 0.12345 × 10³ and 123.45 × 10⁰. We choose the representation in which the mantissa is an integer with no trailing 0s.

In one simple floating-point notation, used here for illustration, a real number is represented by a 32-bit value consisting of a 24-bit mantissa followed by an 8-bit exponent. The base is fixed at 10. Both the mantissa and the exponent are two's complement binary integers. For example, the 24-bit binary representation of 12345 is 0000 0000 0011 0000 0011 1001, and the 8-bit two's complement binary representation of -2 is 1111 1110; the representation of 123.45 is therefore 0000 0000 0011 0000 0011 1001 1111 1110. Most C++ implementations actually use the IEEE 754 binary formats, which differ in detail (base 2, a sign bit, and a biased exponent) but rest on the same mantissa-and-exponent idea.

The advantage of floating-point notation is that it can be used to represent numbers with extremely large or extremely small absolute values. C++ has three floating-point types: float, double and long double. The following table describes their typical precision and range, which are compiler dependent:

Type	Precision	Range
float	7 decimal digits	1.2 × 10⁻³⁸ to 3.4 × 10³⁸
double	15 decimal digits	2.2 × 10⁻³⁰⁸ to 1.8 × 10³⁰⁸
long double	19 decimal digits	3.3 × 10⁻⁴⁹³² to 1.2 × 10⁴⁹³²

Boolean type

The boolean data type (bool) is used to represent true or false values. In C++, 0 is false and any non-zero value is true. C++ also provides the keywords true and false for use with the boolean data type. A bool usually occupies 8 bits (1 byte) of memory.

Declaring and using variables

Apart from using the assignment operator to initialise identifiers with constants, C++ provides one more way to do the same job:

int variable_1 {100};

does the same job as

int variable_1 = 100;
