========================================================
C By Example Part 5: Integer Types in C and Type Aliases
========================================================

In __Session 6 (Page 3)__ you learned the fundamental concept behind variables
in C, for example that every variable has a type. But in that example only the
type _int_ was used. Here you find a list of all the other integer types that
are available in C. You will also see that so-called type aliases have to be
used to get a guaranteed size (e.g. a certain number of bits). These are
declared in headers of the standard library.

:links: Session 6 \(Page 3\) -> doc:session06/page03

=============
Integer types [TOC]
=============

Here you find a list of all signed and unsigned types that are part of the
language. This list also contains the format specifiers for formatted printing.
The tables are basically taken from __here__ and extended with the concrete
sizes used by ``ucc`` and the ``gcc`` installation on ``theon``. The size is
given in bytes, and for both the ULM architecture and the Intel architecture a
byte consists of 8 bits. The C standard just guarantees that a byte has at
least 8 bits.

Also note that the lists are still not complete. For example, a type like
``signed long long int`` can also be written as ``int long signed long``,
``long int long signed``, etc. Personally I prefer the shortest form for
expressing a type anyway, e.g. ``long long`` instead of
``signed long long int`` and ``unsigned long long`` instead of
``unsigned long long int``.

:links: here -> https://en.wikipedia.org/wiki/C_data_types#Basic_types

Type ``char``
=============

The C standard specifies that type ``char`` has exactly one byte. However, the
C standard does not specify whether it is signed or unsigned. In many cases
this does not matter, for example if it is used to represent __ASCII__
characters, which require just 7 bits for encoding (and a byte has at least 8
bits, so the sign bit is not used by the encoding).
If signedness is relevant you have to use the types ``signed char`` or
``unsigned char`` explicitly.

:links: ASCII -> https://en.wikipedia.org/wiki/ASCII

Signed Integer Types
====================

+---------------------------+-----------+-----------+-------+---------------+
| _Type_                    | _Size in  | _gcc on   | _ucc_ | _Format       |
|                           | bytes_    | theon_    |       | specifier_    |
+---------------------------+-----------+-----------+-------+---------------+
| ``signed char``           | $1$       | $1$       | $1$   |- for          |
|                           |           |           |       |  character    |
|                           |           |           |       |  printing     |
|                           |           |           |       |  ``%c``       |
|                           |           |           |       |- decimal:     |
|                           |           |           |       |  ``%hhd`` or  |
|                           |           |           |       |  ``%hhi``     |
|                           |           |           |       |- octal:       |
|                           |           |           |       |  ``%hho``     |
|                           |           |           |       |- hex:         |
|                           |           |           |       |  ``%hhx``     |
+---------------------------+-----------+-----------+-------+---------------+
| ``short``                 | $\geq 2$  | $2$       | $2$   | - decimal:    |
|                           |           |           |       |   ``%hd`` or  |
| ``short int``             |           |           |       |   ``%hi``     |
|                           |           |           |       | - octal:      |
| ``signed short``          |           |           |       |   ``%ho``     |
|                           |           |           |       | - hex:        |
| ``signed short int``      |           |           |       |   ``%hx``     |
+---------------------------+-----------+-----------+-------+---------------+
| ``int``                   | $\geq 2$  | $4$       | $2$   | - decimal:    |
|                           |           |           |       |   ``%d`` or   |
| ``signed``                |           |           |       |   ``%i``      |
|                           |           |           |       | - octal:      |
| ``signed int``            |           |           |       |   ``%o``      |
|                           |           |           |       | - hex:        |
|                           |           |           |       |   ``%x``      |
+---------------------------+-----------+-----------+-------+---------------+
| ``long``                  | $\geq 4$  | $8$       | $4$   | - decimal:    |
|                           |           |           |       |   ``%ld`` or  |
| ``long int``              |           |           |       |   ``%li``     |
|                           |           |           |       | - octal:      |
| ``signed long``           |           |           |       |   ``%lo``     |
|                           |           |           |       | - hex:        |
| ``signed long int``       |           |           |       |   ``%lx``     |
+---------------------------+-----------+-----------+-------+---------------+
| ``long long``             | $\geq 8$  | $8$       | $8$   | - decimal:    |
|                           |           |           |       |   ``%lld``    |
| ``long long int``         |           |           |       |   or          |
|                           |           |           |       |   ``%lli``    |
| ``signed long long``      |           |           |       | - octal:      |
|                           |           |           |       |   ``%llo``    |
| ``signed long long int``  |           |           |       | - hex:        |
|                           |           |           |       |   ``%llx``    |
+---------------------------+-----------+-----------+-------+---------------+

Unsigned Integer Types
======================

+---------------------------+-----------+-----------+-------+-----------+
| _Type_                    | _Size in  | _gcc on   | _ucc_ | _Format   |
|                           | bytes_    | theon_    |       | specifier_|
+---------------------------+-----------+-----------+-------+-----------+
| ``unsigned char``         | $1$       | $1$       | $1$   |- for      |
|                           |           |           |       |  character|
|                           |           |           |       |  printing |
|                           |           |           |       |  ``%c``   |
|                           |           |           |       |- decimal: |
|                           |           |           |       |  ``%hhu`` |
|                           |           |           |       |- octal:   |
|                           |           |           |       |  ``%hho`` |
|                           |           |           |       |- hex:     |
|                           |           |           |       |  ``%hhx`` |
+---------------------------+-----------+-----------+-------+-----------+
| ``unsigned short``        | $\geq 2$  | $2$       | $2$   | - decimal:|
|                           |           |           |       |   ``%hu`` |
| ``unsigned short int``    |           |           |       | - octal:  |
|                           |           |           |       |   ``%ho`` |
|                           |           |           |       | - hex:    |
|                           |           |           |       |   ``%hx`` |
+---------------------------+-----------+-----------+-------+-----------+
| ``unsigned``              | $\geq 2$  | $4$       | $2$   | - decimal:|
|                           |           |           |       |   ``%u``  |
| ``unsigned int``          |           |           |       | - octal:  |
|                           |           |           |       |   ``%o``  |
|                           |           |           |       | - hex:    |
|                           |           |           |       |   ``%x``  |
+---------------------------+-----------+-----------+-------+-----------+
| ``unsigned long``         | $\geq 4$  | $8$       | $4$   | - decimal:|
|                           |           |           |       |   ``%lu`` |
| ``unsigned long int``     |           |           |       | - octal:  |
|                           |           |           |       |   ``%lo`` |
|                           |           |           |       | - hex:    |
|                           |           |           |       |   ``%lx`` |
+---------------------------+-----------+-----------+-------+-----------+
| ``unsigned long long``    | $\geq 8$  | $8$       | $8$   | - decimal:|
|                           |           |           |       |   ``%llu``|
| ``unsigned long long int``|           |           |       | - octal:  |
|                           |           |           |       |   ``%llo``|
|                           |           |           |       | - hex:    |
|                           |           |           |       |   ``%llx``|
+---------------------------+-----------+-----------+-------+-----------+

===========================================
Standardized type aliases for integer types [TOC]
===========================================

Through type aliases the C standard library provides further integer types, for
example the unsigned integer type __size_t__ and the signed integer type
__ptrdiff_t__.
The exact size of these types depends on the memory model supported by the
compiler. ``size_t`` is the type returned by the ``sizeof`` operator. The size
of ``size_t`` is such that it can store the maximum size of a theoretically
possible object of any type (including arrays). ``ptrdiff_t`` is the signed
integer type of the result of subtracting two pointers.

Other examples of specified type aliases are the __fixed width integer types__
(e.g. ``uint8_t``, ``uint16_t``, ``uint32_t``, ``uint64_t``, ``int8_t``,
``int16_t``, ``int32_t``, ``int64_t``) and the type alias ``bool``.

These type aliases are declared in certain header files. This page summarizes
which header you need to include. Furthermore, it shows how these type aliases
are declared in the (incomplete) standard library for ``ucc``.

---- SHELL (path=session07/,hide) ----------------------------------------------
rm -rf hpc0_cprog_page8
cp -r -P /home/numerik/pub/cprog hpc0_cprog_page8
--------------------------------------------------------------------------------

Types ``size_t`` and ``ptrdiff_t``
==================================

The declaration of these types can be imported by including the standard header
__stddef.h__. The actual size of both types basically depends on the address
space of the architecture for which the compiler generates code. As ``ucc``
only supports code generation for the ULM (which has a 64 bit virtual memory
space) these types always have 8 bytes there. As ``gcc`` on the other hand
supports different architectures, the size depends on the installation and the
compiler flags.

``stddef.h`` from the ULM standard library
------------------------------------------

Using a __typedef__ declaration, ``size_t`` and ``ptrdiff_t`` are declared as
type aliases for ``uint64_t`` and ``int64_t`` respectively. As you will see
below, these in turn are type aliases for ``unsigned long long`` and
``long long``.
:import: /home/numerik/pub/ulmcc/include/stddef.h

Example
-------

Let's check the default sizes used by ``ucc`` and ``gcc`` with the following
test:

---- CODE (file=session07/hpc0_cprog_page8/xprintf_size_t.c) -------------------
#include <stddef.h>
#include <stdio.h>

int
main()
{
    printf("sizeof(size_t) = %zu\n", sizeof(size_t));
    printf("sizeof(ptrdiff_t) = %zu\n", sizeof(ptrdiff_t));
}
--------------------------------------------------------------------------------

---- SHELL (path=session07/hpc0_cprog_page8) -----------------------------------
mkdir -p build_gcc
gcc -o build_gcc/xprintf_size_t xprintf_size_t.c
mkdir -p build_ucc
ucc -o build_ucc/xprintf_size_t xprintf_size_t.c
--------------------------------------------------------------------------------

---- SHELL (path=session07/hpc0_cprog_page8) -----------------------------------
./build_ucc/xprintf_size_t
./build_gcc/xprintf_size_t
--------------------------------------------------------------------------------

Invoking ``gcc`` with the ``-m32`` option makes it generate code for the 32-bit
__i386__ architecture that Intel introduced in the mid 80s (of the last
century). And as Intel's current hardware is actually still backwards
compatible, this code actually runs on ``theon``. However, for demonstrating
this we have to use an old ``gcc`` installation (who would have thought we
would ever need that again):

---- SHELL (path=session07/hpc0_cprog_page8) -----------------------------------
/opt/ulm/athenry/bin/gcc -m32 xprintf_size_t.c -o build_gcc/xprintf_size_t-32
./build_gcc/xprintf_size_t-32
--------------------------------------------------------------------------------

Format specifiers
-----------------

For printing variables of type ``size_t`` the correct format specifiers are
``%zu`` (decimal), ``%zx`` (hexadecimal) and ``%zo`` (octal). For variables of
type ``ptrdiff_t`` they are correspondingly ``%td`` (decimal), ``%tx``
(hexadecimal) and ``%to`` (octal).

In this example the format specifiers contain an optional width (e.g.
``%20zu`` has the width ``20``) so that we get the numbers printed nicely in
columns:

---- CODE (file=session07/hpc0_cprog_page8/xprintf_fmt_zu_td.c) ----------------
#include <stddef.h>
#include <stdio.h>

int
main()
{
    size_t s = -1; // so that we get the largest value ;-)
    ptrdiff_t p = -1;

    printf("s = %20zu (hex: %16zx, oct: %23zo)\n", s, s, s);
    printf("p = %20td (hex: %16tx, oct: %23to)\n", p, p, p);
}
--------------------------------------------------------------------------------

---- SHELL (path=session07/hpc0_cprog_page8) -----------------------------------
mkdir -p build_gcc
gcc -o build_gcc/xprintf_fmt_zu_td xprintf_fmt_zu_td.c
mkdir -p build_ucc
ucc -o build_ucc/xprintf_fmt_zu_td xprintf_fmt_zu_td.c
--------------------------------------------------------------------------------

---- SHELL (path=session07/hpc0_cprog_page8) -----------------------------------
./build_ucc/xprintf_fmt_zu_td
./build_gcc/xprintf_fmt_zu_td
--------------------------------------------------------------------------------

Fixed width integer types
=========================

The __fixed width integer types__ are declared in __stdint.h__. Here you see
the declarations in the ULM library:

:import: /home/numerik/pub/ulmcc/include/stdint.h

Type ``bool`` and literals ``true`` and ``false``
=================================================

Since C99 the C language provides an additional integer type ``_Bool`` which is
guaranteed to be large enough to store 0 and 1. But its size is at least one
byte because that is supposed to be the smallest unit of size. Again, just
another example of how the C standard is as vague as possible (so that various
platforms can be supported) and consistent at the same time.

In __stdbool.h__ for the ULM library the type ``bool`` is declared as an alias
for ``_Bool``.
By including it you also get the integer literals ``false`` and ``true`` with
values ``0`` and ``1`` respectively:

:import: /home/numerik/pub/ulmcc/include/stdbool.h

Note that the C standard actually requires that ``bool`` be a macro and not a
typedef (__here some discussion why__). Well, currently that is not possible
as the preprocessor for the ULM compiler is very limited.

:links: size_t                    -> https://en.cppreference.com/w/c/types/size_t
        ptrdiff_t                 -> https://en.cppreference.com/w/c/types/ptrdiff_t
        fixed width integer types -> https://en.cppreference.com/w/cpp/types/integer
        stddef.h                  -> https://www.cplusplus.com/reference/cstddef/
        stdint.h                  -> https://en.cppreference.com/w/c/types/integer
        i386                      -> https://en.wikipedia.org/wiki/I386
        typedef                   -> https://en.wikipedia.org/wiki/Typedef
        stdbool.h                 -> https://en.cppreference.com/w/c/types/boolean
        here some discussion why  -> https://stackoverflow.com/questions/46797609/why-do-major-compilers-use-typedef-for-stdint-h-but-use-define-for-stdbool-h