========================================================
C By Example Part 5: Integer Types in C and Type Aliases
========================================================

In __Session 6 (Page 3)__ you learned the fundamental concept behind variables
in C, for example that every variable has a type. In that example, however, only
the type _int_ was used. Here you find a list of all the other integer types
that are available in C. You will also see that so-called type aliases have to
be used to get a guaranteed size (i.e. a certain number of bits). These are
declared in headers of the standard library.

:links: Session 6 \(Page 3\) -> doc:session06/page03

=============
Integer types								[TOC]
=============

Here you find a list of all signed and unsigned types that are part of the
language. This list also contains the format specifiers for formatted printing.

The tables are basically taken from __here__ and extended by the concrete sizes
used by ``ucc`` and the ``gcc`` installation on ``theon``. The size is given in
bytes, and for both the ULM architecture and the Intel architecture a byte
consists of 8 bits. The C standard just guarantees that a byte has at least 8
bits.

Also note that the lists are still not complete. For example, a type like
``signed long long int`` can also be written as ``int long signed long``,
``long int long signed``, etc. Personally, I prefer the shortest form
for expressing a type, e.g. ``long long`` instead of ``signed long long int``
and ``unsigned long long`` instead of ``unsigned long long int``.

:links: here -> https://en.wikipedia.org/wiki/C_data_types#Basic_types


Type ``char``
=============
The C standard specifies that type ``char`` has a size of exactly one byte.
However, it does not specify whether ``char`` is signed or unsigned. In many
cases this does not matter, for example if it is used to represent __ASCII__
characters, which require just 7 bits for encoding (a byte has at least 8 bits,
so the sign bit is not used by the encoding). If signedness is relevant you
have to use the types ``signed char`` or ``unsigned char`` explicitly.

:links: ASCII -> https://en.wikipedia.org/wiki/ASCII

Signed Integer Types
====================
+---------------------------+-----------+-----------+-------+---------------+
| _Type_	    	    | _Size in	| _gcc on   | _ucc_ | _Format	    |
|			    | bytes_	| theon_    |	    | specifier_    |
+---------------------------+-----------+-----------+-------+---------------+
| ``signed char``	    | $1$	| $1$	    | $1$   |- for	    |
|			    |		|	    |	    |  character    |
|			    |		|	    |	    |  printing	    |
|			    |		|	    |	    |  ``%c``	    |
|			    |		|	    |	    |- decimal:	    |
|			    |		|	    |	    |  ``%hhd`` or  |
|			    |		|	    |	    |  ``%hhi``	    |
|			    |		|	    |	    |- octal:	    |
|			    |		|	    |	    |  ``%hho``	    |
|			    |		|	    |	    |- hex:	    |
|			    |		|	    |	    |  ``%hhx``	    |
+---------------------------+-----------+-----------+-------+---------------+
| ``short``		    | $\geq 2$	| $2$	    | $2$   | - decimal:    |
|			    |		|	    |	    |   ``%hd`` or  |
| ``short int``		    |		|	    |	    |   ``%hi``	    |
|			    |		|	    |	    | - octal:	    |
| ``signed short``	    |		|	    |	    |   ``%ho``	    |
|			    |		|	    |	    | - hex:	    |
| ``signed short int``	    |		|	    |	    |	``%hx``	    |
|			    |		|	    |	    |		    |
+---------------------------+-----------+-----------+-------+---------------+
| ``int``		    | $\geq 2$	| $4$	    | $2$   | - decimal:    |
|			    |		|	    |	    |   ``%d`` or   |
| ``signed``		    |		|	    |	    |	``%i``	    |
|			    |		|	    |	    | - octal:	    |
| ``signed int``	    |		|	    |	    |	``%o``	    |
|			    |		|	    |	    | - hex:	    |
|			    |		|	    |	    |	``%x``	    |
+---------------------------+-----------+-----------+-------+---------------+
| ``long``		    | $\geq 4$	| $8$	    | $4$   | - decimal:    |
|			    |		|	    |	    |   ``%ld`` or  |
| ``long int``		    |		|	    |	    |	``%li``	    |
|			    |		|	    |	    | - octal:	    |
| ``signed long``	    |		|	    |	    |	``%lo``	    |
|			    |		|	    |	    | - hex:	    |
| ``signed long int``	    |		|	    |	    |	``%lx``	    |
+---------------------------+-----------+-----------+-------+---------------+
| ``long long``		    | $\geq 8$	| $8$	    | $8$   | - decimal:    |
|			    |		|	    |	    |   ``%lld``    |
| ``long long int``	    |		|	    |	    |	or	    |
|			    |		|	    |	    |	``%lli``    |
| ``signed long long``	    |		|	    |	    | - octal:	    |
|			    |		|	    |	    |	``%llo``    |
| ``signed long long int``  |		|	    |	    | - hex:	    |
|			    |		|	    |	    |   ``%llx``    |
+---------------------------+-----------+-----------+-------+---------------+

Unsigned Integer Types
======================

+---------------------------+-----------+-----------+-------+-----------+
| _Type_	    	    | _Size in	| _gcc on   | _ucc_ | _Format	|
|			    | bytes_	| theon_    |	    | specifier_|
+---------------------------+-----------+-----------+-------+-----------+
| ``unsigned char``	    | $1$	| $1$	    | $1$   |- for	|
|			    |		|	    |	    |  character|
|			    |		|	    |	    |  printing |
|			    |		|	    |	    |  ``%c``	|
|			    |		|	    |	    |- decimal:	|
|			    |		|	    |	    |  ``%hhu``	|
|			    |		|	    |	    |- octal:	|
|			    |		|	    |	    |  ``%hho``	|
|			    |		|	    |	    |- hex:	|
|			    |		|	    |	    |  ``%hhx``	|
+---------------------------+-----------+-----------+-------+-----------+
| ``unsigned short``	    | $\geq 2$	| $2$	    | $2$   | - decimal:|
|			    |		|	    |	    |   ``%hu``	|
| ``unsigned short int``    |		|	    |	    | - octal:	|
|			    |		|	    |	    |   ``%ho``	|
|			    |		|	    |	    | - hex:	|
|                   	    |		|	    |	    |	``%hx``	|
|			    |		|	    |	    |		|
+---------------------------+-----------+-----------+-------+-----------+
| ``unsigned``		    | $\geq 2$	| $4$	    | $2$   | - decimal:|
|			    |		|	    |	    |   ``%u``	|
| ``unsigned int``	    |		|	    |	    | - octal:	|
|			    |		|	    |	    |	``%o``	|
|			    |		|	    |	    | - hex:	|
|			    |		|	    |	    |	``%x``	|
+---------------------------+-----------+-----------+-------+-----------+
| ``unsigned long``	    | $\geq 4$	| $8$	    | $4$   | - decimal:|
|			    |		|	    |	    |   ``%lu``	|
| ``unsigned long int``	    |		|	    |	    | - octal:	|
|		    	    |		|	    |	    |	``%lo``	|
|			    |		|	    |	    | - hex:	|
|			    |		|	    |	    |	``%lx``	|
+---------------------------+-----------+-----------+-------+-----------+
| ``unsigned long long``    | $\geq 8$	| $8$	    | $8$   | - decimal:|
|			    |		|	    |	    |   ``%llu``|
| ``unsigned long long int``|		|	    |	    | - octal:	|
|			    |		|	    |	    |	``%llo``|
|			    |		|	    |	    | - hex:	|
|			    |		|	    |	    |   ``%llx``|
+---------------------------+-----------+-----------+-------+-----------+

===========================================
Standardized type aliases for integer types				[TOC]
===========================================

Through type aliases the C standard library provides further integer types, for
example the unsigned integer type __size_t__ and the signed integer type
__ptrdiff_t__.  The exact size of these types depends on the memory model
supported by the compiler. ``size_t`` is the type of the result of the
``sizeof`` operator.  It is large enough to store the maximum size of a
theoretically possible object of any type (including arrays).
``ptrdiff_t`` is the signed integer type of the result of subtracting two
pointers.

Other examples of specified type aliases are the __fixed width integer types__
(e.g. ``uint8_t``, ``uint16_t``, ``uint32_t``, ``uint64_t``, ``int8_t``,
``int16_t``, ``int32_t``, ``int64_t``) and the type alias ``bool``.

These type aliases are declared in certain header files. This page summarizes
which header you need to include. Furthermore, it shows how these type aliases
are declared in the (incomplete) standard library for ``ucc``.

---- SHELL (path=session07/,hide) ----------------------------------------------
rm -rf hpc0_cprog_page8
cp -r -P /home/numerik/pub/cprog hpc0_cprog_page8
--------------------------------------------------------------------------------


Types ``size_t`` and ``ptrdiff_t``
==================================
The declaration of these types can be imported by including the standard header
__stddef.h__.  The actual size of both types basically depends on the address
space of the architecture for which the compiler generates code.

As ``ucc`` only supports code generation for the ULM (which has a 64-bit virtual
memory space), these types always have 8 bytes. ``gcc``, on the other hand,
supports different architectures, so there the sizes depend on the installation
and the compiler flags.

``stddef.h`` from the ULM standard library
------------------------------------------
Using a __typedef__ declaration, ``size_t`` and ``ptrdiff_t`` are declared as
type aliases for ``uint64_t`` and ``int64_t``, respectively. As you will see
below, these in turn are type aliases for ``unsigned long long`` and
``long long``.

:import: /home/numerik/pub/ulmcc/include/stddef.h


Example
-------
Let's check the default sizes used by ``ucc`` and ``gcc`` with the following test:

---- CODE (file=session07/hpc0_cprog_page8/xprintf_size_t.c) -------------------
#include <stddef.h>
#include <stdio.h>

int
main()
{
    printf("sizeof(size_t) = %zu\n", sizeof(size_t));
    printf("sizeof(ptrdiff_t) = %zu\n", sizeof(ptrdiff_t));
}
--------------------------------------------------------------------------------

---- SHELL (path=session07/hpc0_cprog_page8) -----------------------------------
mkdir -p build_gcc
gcc  -o build_gcc/xprintf_size_t xprintf_size_t.c
mkdir -p build_ucc
ucc  -o build_ucc/xprintf_size_t xprintf_size_t.c
--------------------------------------------------------------------------------

---- SHELL (path=session07/hpc0_cprog_page8) -----------------------------------
./build_ucc/xprintf_size_t
./build_gcc/xprintf_size_t
--------------------------------------------------------------------------------

When invoked with the ``-m32`` option, ``gcc`` generates code for the 32-bit
__i386__ architecture that Intel introduced in the mid 80s (of the last
century). As Intel's current hardware is still backwards compatible, this code
actually runs on ``theon``. However, for demonstrating this we have to use an
old ``gcc`` installation (who would have thought we would ever need that again):

---- SHELL (path=session07/hpc0_cprog_page8) -----------------------------------
/opt/ulm/athenry/bin/gcc -m32 xprintf_size_t.c -o build_gcc/xprintf_size_t-32
./build_gcc/xprintf_size_t-32
--------------------------------------------------------------------------------


Format specifiers
-----------------
For printing variables of type ``size_t`` the correct format specifiers are
``%zu`` (decimal), ``%zx`` (hexadecimal) and ``%zo`` (octal). For variables of
type ``ptrdiff_t`` the corresponding specifiers are ``%td`` (decimal), ``%tx``
(hexadecimal) and ``%to`` (octal). In the following example the format
specifiers contain an optional width (e.g.  ``%20zu`` has the width ``20``) so
that the numbers are printed nicely in columns:

---- CODE (file=session07/hpc0_cprog_page8/xprintf_fmt_zu_td.c) ----------------
#include <stddef.h>
#include <stdio.h>

int
main()
{
    size_t s = -1;	// so that we get the largest value ;-)
    ptrdiff_t p = -1;

    printf("s = %20zu (hex: %16zx, oct: %23zo)\n", s, s, s);
    printf("p = %20td (hex: %16tx, oct: %23to)\n", p, p, p);
}
--------------------------------------------------------------------------------

---- SHELL (path=session07/hpc0_cprog_page8) -----------------------------------
mkdir -p build_gcc
gcc  -o build_gcc/xprintf_fmt_zu_td xprintf_fmt_zu_td.c
mkdir -p build_ucc
ucc  -o build_ucc/xprintf_fmt_zu_td xprintf_fmt_zu_td.c
--------------------------------------------------------------------------------
---- SHELL (path=session07/hpc0_cprog_page8) -----------------------------------
./build_ucc/xprintf_fmt_zu_td
./build_gcc/xprintf_fmt_zu_td
--------------------------------------------------------------------------------


Fixed width integer types
=========================
The __fixed width integer types__ are declared in __stdint.h__. Here you see
the declarations in the ULM library:

:import: /home/numerik/pub/ulmcc/include/stdint.h

Type ``bool`` and literals ``true`` and ``false``
=================================================
Since C99 the C language provides an additional integer type ``_Bool`` which is
guaranteed to be large enough to store the values 0 and 1. Its size is
nevertheless at least one byte, because a byte is the smallest unit of size.
This is just another example of how the C standard is as vague as possible (so
that various platforms can be supported) and consistent at the same time.

In __stdbool.h__ for the ULM library the type ``bool`` is declared as an alias
for ``_Bool``. By including it you also get integer literals ``false`` and
``true`` with values ``0`` and ``1`` respectively:

:import: /home/numerik/pub/ulmcc/include/stdbool.h

Note that the C standard actually requires ``bool`` to be a macro and not a
typedef (__here some discussion why__). Well, that is currently not possible as
the preprocessor for the ULM compiler is very limited.


:links: size_t	-> https://en.cppreference.com/w/c/types/size_t
	ptrdiff_t -> https://en.cppreference.com/w/c/types/ptrdiff_t
	fixed width integer types -> https://en.cppreference.com/w/cpp/types/integer
	stddef.h -> https://www.cplusplus.com/reference/cstddef/
	stdint.h -> https://en.cppreference.com/w/c/types/integer
	i386 -> https://en.wikipedia.org/wiki/I386
	typedef -> https://en.wikipedia.org/wiki/Typedef
	stdbool.h -> https://en.cppreference.com/w/c/types/boolean
	here some discussion why -> https://stackoverflow.com/questions/46797609/why-do-major-compilers-use-typedef-for-stdint-h-but-use-define-for-stdbool-h