=========================================== CBE Part 14: Dynamic Memory Management in C [TOC] =========================================== ---- VIDEO ------------------------------ https://www.youtube.com/embed/vHFOjvmC1ww ----------------------------------------- More about _malloc()_ and _free()_ ================================== _man malloc_ or _man free_ both will show __the same manpage__. On this manpage you will also find references for all other functions from the standard library for dynamic memory management. :links: the same manpage -> https://man7.org/linux/man-pages/man3/malloc.3.html About _calloc()_ and _realloc()_ ================================ With _calloc()_ you can allocate a zero initialized block of memory. It is declared as ---- CODE (type=c) ------------------------------------------------------------- void *calloc(size_t count, size_t size); -------------------------------------------------------------------------------- where _count_ specifies a number of objects and _size_ the size of one object in bytes. Hence it will try to allocate a zero initialized block with _count * size_ bytes. If such a block could be allocated it returns a pointer to it and otherwise a null pointer. With _realloc()_ you can change the size of a previously allocated block. It is declared as ---- CODE (type=c) ------------------------------------------------------------- void *realloc(void *ptr, size_t size); -------------------------------------------------------------------------------- where _ptr_ can be a pointer to a previously allocated block and _size_ the new size in bytes. If _ptr_ is a null pointer the function behaves like _malloc()_. In case _ptr_ points to a previously allocated block the implementation of _realloc()_ might be able to simple enlarge the size. In this case it will return the same pointer _ptr_. But it also might be required to allocate a new block with the desired size. Then it will copy data from the previously allocated block to the new block and releases the block to which _ptr_ points. It returns a pointer to the new block. If reallocation is not possible it returns a null pointer. Requirements for a Vector Class =============================== Assume that we have some important algorithm from numerical linear algebra that requires to access elements from a vector. The algorithm has some nested loops and accessing vector elements is required in the most inner loop. Furthermore, this algorithm is a building block for other algorithms. For an efficient implementation it means the have to write a function that will be used called from other functions. Therefore accessing elements has to be fast. But it also has to be safe. If we by mistake access elements with an invalid index (e.g. if we have some _off by one error_) it will be hard to track the problem. Some nightmare scenario would be that accessing invalid elements has no effect if it by coincidence has a value of zero. We then would sometimes pass numerical test, sometimes they would fail, and sometimes we just get a segmentation fault. The software pattern for a vector class should therefore provide the following: - In some _debug mode_ we have some check whether an index is correct, a so called _bounce check_. In this mode the numerical tests are done. - In some _bench mode_ or _release mode_ the code accesses data directly. No overhead for _bounce check_ is tolerated and also no overhead for a function call. This are some special requirements from a special field of application. It therefore leads to special design patterns. In other fields you can have other requirements. Therefore discussions about good or bad design always should take into account the field of application. Example: Crude Code that We Kind of Want ======================================== Assume we simply want to have a vector $x$ with a certain dimension. Then we want to initialize the elements and print the elements of the vector. Some crude code that has minimal overhead that fulfills our requirements could look like that, where the dimension is hard coded to $\text{dim} = 4$: :import: session16/vector/xtest_vector_crude.c [fold] It does the job: ---- SHELL (path=session16/vector) --------------------------------------------- gcc -Wall -o xtest_vector_crude xtest_vector_crude.c ./xtest_vector -------------------------------------------------------------------------------- And it actually does things right: - In the declaration of members _data_ and _dim_ the _const_ qualifier is used so that you can not accidentally change them after initialization. - There is a check if allocation worked. If this is not the case the program terminates with an error message. - It releases the memory at the end of _main()_ before vector $x$ leaves the scope. - All indices are valid because both for loops iterates indices from $0$ to $\text{dim} - 1$. However it is not convenient if you have to deal with vectors in this style. Assume you have two vectors instead of just one. You at least want to have one function that constructs a vector, i.e. does in this case the memory allocation including the error handling. Also the explicit call of _free()_ for releasing the memory of a vector is not expressive. Having an expressive function for releasing memory of vector in particular becomes important if you have more than on exit point. Also, if you have more complex data structure releasing memory might require more than just calling _free()_. So also destructing should be bundled in a function. Furthermore, the correct access of vector elements requires knowledge of the data structure. I.e. you have to know that indices are valid if and only if they are in the range from 0 to _x.dim -1_. If you have more complex data structures this no longer might be that obvious. So we want a function that can be used for accessing elements and in _debug mode_ it should check if a index is valid. If data structures become more complicated we just have to adapt a few places. Example: Some Vector Class in C =============================== The following implementation is more convenient to use and more expressive. It is also better scalable if you have more than just one vector. It uses the same declaration for a _struct Vector_. But in addition functions for construct a vector with allocated memory, releasing its memory. It also provides functions for accessing elements. For checking indices it uses the macro function _assert_ from the standard library. If you compile with _-DNDEBUG_ the preprocessor removes _assert_. Otherwise _assert_ will terminate the program execution with an error and an exit code unequal to zero if the assert condition is false. :import: session16/vector/vector.h [fold] :import: session16/vector/vector.c [fold] :import: session16/vector/xtest_vector.c [fold] Here a test run: ---- SHELL (path=session16/vector) --------------------------------------------- gcc -Wall -o xtest_vector vector.c xtest_vector.c ./xtest_vector -------------------------------------------------------------------------------- If this implementation further gets improved for using __inline functions__ the compiler can generate with full optimizations code from this that is equivalent to the "crude implementation" from above: - Calling _VectorConstruct()_ normally creates variable of type _struct Vector_ on its stack that gets returned and then copied. This means ---- CODE (type=c) ----------------------------------------------------------- struct Vector x = VectorConstruct(4); ------------------------------------------------------------------------------ would be inefficient. However, here the return value is used to initialize a variable of the callee that has the same type. The compiler can change the function call such that _x_ gets directly initialized with the return value. That means no additional copying is required. If _VectorConstruct()_ can be inlined the compiler generates code as if you would have written ---- CODE (type=c) ----------------------------------------------------------- struct Vector x = { malloc(4 * sizeof(x.data)), // allocate memory for 4 double 4, // dim is hard coded to 4 }; // check if memory could be allocated if (!x.data) { fprintf(stderr, "allocating memoy for vector with %zu elemets failed\n", x.dim); exit(1); } ------------------------------------------------------------------------------ - In debug mode the destructor _VectorDestruct()_ also sets the member _data_ to the null pointer. This helps to detect cases where the destructor was called but the object is afterwards still used. - After inlining the expression statement ---- CODE (type=c) ----------------------------------------------------------- *VectorElementPtr(&x, i) = 42 + i; ------------------------------------------------------------------------------ is equivalent to ---- CODE (type=c) ----------------------------------------------------------- assert(i < (&x)->dim); (&x)->data[i] = 42 + i; ------------------------------------------------------------------------------ and that is equivalent to ---- CODE (type=c) ----------------------------------------------------------- assert(i < x.dim); x.data[i] = 42 + i; ------------------------------------------------------------------------------ If you compile with _-DNDEBUG_ it is equivalent to ---- CODE (type=c) ----------------------------------------------------------- x.data[i] = 42 + i; ------------------------------------------------------------------------------ - Analogously the statement ---- CODE (type=c) ----------------------------------------------------------- printf("%lf", VectorElement(vec, i)); ------------------------------------------------------------------------------ in the example can be treated as if you would have written ---- CODE (type=c) ----------------------------------------------------------- assert(i < vec->dim); printf("%lf", vec->data[i]); ------------------------------------------------------------------------------ or in bench mode as if you would have written ---- CODE (type=c) ----------------------------------------------------------- printf("%lf", vec->data[i]); ------------------------------------------------------------------------------ Inlining a function call is here in particular important for accessing vector elements. Just think about the addition overhead for calling a function and the instructions in the function's prologue and epilogue that otherwise would be executed for each element access. Using this design is less error prone but still requires discipline. You have to be sure that _VectorConstruct()_ gets called for each of you variables of type _struct Vector_ before you use them. You also have to make sure that _VectorDestruct()_ get called before it leaves scope. Example: Some Vector Class in C++ ================================= Now basically the same in C++ with more syntactic sugar. But most of all: The C++ compiler makes sure that the constructor will always be called before the object can be used (see __RAII__), and that the destructor gets called before the object leaves the scope. :import: session16/vector/vector.cpp [fold] ---- SHELL (path=session16/vector) --------------------------------------------- g++ -Wall -o vector vector.cpp ./vector -------------------------------------------------------------------------------- :links: RAII -> https://en.wikipedia.org/wiki/Resource_acquisition_is_initialization inline functions -> https://en.wikipedia.org/wiki/Inline_function