CBE Part 14: Dynamic Memory Management in C

More about malloc() and free()

man malloc or man free both will show the same manpage. On this manpage you will also find references for all other functions from the standard library for dynamic memory management.

About calloc() and realloc()

With calloc() you can allocate a zero initialized block of memory. It is declared as

1
void *calloc(size_t count, size_t size);

where count specifies a number of objects and size the size of one object in bytes. Hence it will try to allocate a zero initialized block with count * size bytes. If such a block could be allocated it returns a pointer to it and otherwise a null pointer.

With realloc() you can change the size of a previously allocated block. It is declared as

1
void *realloc(void *ptr, size_t size);

where ptr can be a pointer to a previously allocated block and size the new size in bytes. If ptr is a null pointer the function behaves like malloc(). In case ptr points to a previously allocated block the implementation of realloc() might be able to simple enlarge the size. In this case it will return the same pointer ptr. But it also might be required to allocate a new block with the desired size. Then it will copy data from the previously allocated block to the new block and releases the block to which ptr points. It returns a pointer to the new block. If reallocation is not possible it returns a null pointer.

Requirements for a Vector Class

Assume that we have some important algorithm from numerical linear algebra that requires to access elements from a vector. The algorithm has some nested loops and accessing vector elements is required in the most inner loop. Furthermore, this algorithm is a building block for other algorithms.

For an efficient implementation it means the have to write a function that will be used called from other functions. Therefore accessing elements has to be fast. But it also has to be safe. If we by mistake access elements with an invalid index (e.g. if we have some off by one error) it will be hard to track the problem. Some nightmare scenario would be that accessing invalid elements has no effect if it by coincidence has a value of zero. We then would sometimes pass numerical test, sometimes they would fail, and sometimes we just get a segmentation fault.

The software pattern for a vector class should therefore provide the following:

  • In some debug mode we have some check whether an index is correct, a so called bounce check. In this mode the numerical tests are done.

  • In some bench mode or release mode the code accesses data directly. No overhead for bounce check is tolerated and also no overhead for a function call.

This are some special requirements from a special field of application. It therefore leads to special design patterns. In other fields you can have other requirements. Therefore discussions about good or bad design always should take into account the field of application.

Example: Crude Code that We Kind of Want

Assume we simply want to have a vector \(x\) with a certain dimension. Then we want to initialize the elements and print the elements of the vector. Some crude code that has minimal overhead that fulfills our requirements could look like that, where the dimension is hard coded to \(\text{dim} = 4\):

#include <stdlib.h>
#include <stdio.h>

struct Vector
{
    double * const data;
    const size_t dim;
};

int
main(void)
{
    struct Vector x = {
        malloc(4 * sizeof(x.data)),     // allocate memory for 4 double
        4,                              // dim is hard coded to 4
    };

    // check if memory could be allocated
    if (!x.data) {
        fprintf(stderr, "allocating memoy for vector with %zu elemets failed\n",
                x.dim);
        exit(1);
    }
    
    // init vector elements
    for (size_t i = 0; i < x.dim; ++i) {
        x.data[i] = 42 + i;
    };

    // print vector elements
    for (size_t i = 0; i < x.dim; ++i) {
        printf("%lf", x.data[i]);
        if (i + 1 < x.dim) {
            printf(", ");
        }
    };
    printf("\n");

    // release memory
    free(x.data);
}

It does the job:

theon$ gcc -Wall -o xtest_vector_crude xtest_vector_crude.c
theon$ ./xtest_vector
42.000000, 43.000000, 44.000000, 45.000000
theon$ 

And it actually does things right:

  • In the declaration of members data and dim the const qualifier is used so that you can not accidentally change them after initialization.

  • There is a check if allocation worked. If this is not the case the program terminates with an error message.

  • It releases the memory at the end of main() before vector \(x\) leaves the scope.

  • All indices are valid because both for loops iterates indices from to \(\text{dim} - 1\).

However it is not convenient if you have to deal with vectors in this style. Assume you have two vectors instead of just one. You at least want to have one function that constructs a vector, i.e. does in this case the memory allocation including the error handling.

Also the explicit call of free() for releasing the memory of a vector is not expressive. Having an expressive function for releasing memory of vector in particular becomes important if you have more than on exit point. Also, if you have more complex data structure releasing memory might require more than just calling free(). So also destructing should be bundled in a function.

Furthermore, the correct access of vector elements requires knowledge of the data structure. I.e. you have to know that indices are valid if and only if they are in the range from 0 to x.dim -1. If you have more complex data structures this no longer might be that obvious. So we want a function that can be used for accessing elements and in debug mode it should check if a index is valid. If data structures become more complicated we just have to adapt a few places.

Example: Some Vector Class in C

The following implementation is more convenient to use and more expressive. It is also better scalable if you have more than just one vector. It uses the same declaration for a struct Vector. But in addition functions for construct a vector with allocated memory, releasing its memory. It also provides functions for accessing elements. For checking indices it uses the macro function assert from the standard library. If you compile with -DNDEBUG the preprocessor removes assert. Otherwise assert will terminate the program execution with an error and an exit code unequal to zero if the assert condition is false.

#ifndef VECTOR_H
#define VECTOR_H

#include <stddef.h>

struct Vector
{
    double * const data;
    const size_t dim;
};

// Constructor
struct Vector VectorConstruct(size_t dim);

// Destructor
void VectorDestruct(struct Vector *vec);

// Pointer to element (with bounds check)
double *VectorElementPtr(struct Vector *vec, size_t index);

// Value of element (with bounds check)
double VectorElement(const struct Vector *vec, size_t index);

#endif // VECTOR_H
#include <assert.h>
#include <stdlib.h>
#include <stdio.h>

#include "vector.h"

struct Vector
VectorConstruct(size_t dim)
{
    struct Vector vec = {
        malloc(dim * sizeof(vec.data)),
        dim,
    };
    if (!vec.data) {
        fprintf(stderr, "VectorConstruct(%zu): out of memory\n", dim);
        exit(1);
    }
    return vec;
}

void
VectorDestruct(struct Vector *vec)
{
    free(vec->data);
#   ifndef NDEBUG
    *(double *) &vec->data = 0;
#   endif
}

double *
VectorElementPtr(struct Vector *vec, size_t index)
{
    assert(index < vec->dim);
    return &vec->data[index];
}

double
VectorElement(const struct Vector *vec, size_t index)
{
    assert(index < vec->dim);
    return vec->data[index];
}
#include <stdio.h>

#include "vector.h"

void
VectorPrint(const struct Vector *vec)
{
    for (size_t i = 0; i < vec->dim; ++i) {
        printf("%lf", VectorElement(vec, i));
        if (i + 1 < vec->dim) {
            printf(", ");
        }
    }
}

int
main(void)
{
    // define and construct vector
    struct Vector x = VectorConstruct(4);

    for (size_t i = 0; i < x.dim; ++i) {
        *VectorElementPtr(&x, i) = 42 + i;
    }

    VectorPrint(&x);
    printf("\n");

    // destruct vector before it leaves scope
    VectorDestruct(&x);
}

Here a test run:

theon$ gcc -Wall -o xtest_vector vector.c xtest_vector.c
theon$ ./xtest_vector
42.000000, 43.000000, 44.000000, 45.000000
theon$ 

If this implementation further gets improved for using inline functions the compiler can generate with full optimizations code from this that is equivalent to the “crude implementation” from above:

  • Calling VectorConstruct() normally creates variable of type struct Vector on its stack that gets returned and then copied. This means

    1
    struct Vector x = VectorConstruct(4);
    

    would be inefficient. However, here the return value is used to initialize a variable of the callee that has the same type. The compiler can change the function call such that x gets directly initialized with the return value. That means no additional copying is required.

    If VectorConstruct() can be inlined the compiler generates code as if you would have written

     1
     2
     3
     4
     5
     6
     7
     8
     9
    10
    11
    struct Vector x = {
        malloc(4 * sizeof(x.data)),       // allocate memory for 4 double
        4,                                // dim is hard coded to 4
    };
    
    // check if memory could be allocated
    if (!x.data) {
        fprintf(stderr, "allocating memoy for vector with %zu elemets failed\n",
          x.dim);
        exit(1);
    }
    
  • In debug mode the destructor VectorDestruct() also sets the member data to the null pointer. This helps to detect cases where the destructor was called but the object is afterwards still used.

  • After inlining the expression statement

    1
    *VectorElementPtr(&x, i) = 42 + i;
    

    is equivalent to

    1
    2
    assert(i < (&x)->dim);
    (&x)->data[i] = 42 + i;
    

    and that is equivalent to

    1
    2
    assert(i < x.dim);
    x.data[i] = 42 + i;
    

    If you compile with -DNDEBUG it is equivalent to

    1
    x.data[i] = 42 + i;
    
  • Analogously the statement

    1
    printf("%lf", VectorElement(vec, i));
    

    in the example can be treated as if you would have written

    1
    2
    assert(i < vec->dim);
    printf("%lf", vec->data[i]);
    

    or in bench mode as if you would have written

    1
    printf("%lf", vec->data[i]);
    

Inlining a function call is here in particular important for accessing vector elements. Just think about the addition overhead for calling a function and the instructions in the function's prologue and epilogue that otherwise would be executed for each element access.

Using this design is less error prone but still requires discipline. You have to be sure that VectorConstruct() gets called for each of you variables of type struct Vector before you use them. You also have to make sure that VectorDestruct() get called before it leaves scope.

Example: Some Vector Class in C++

Now basically the same in C++ with more syntactic sugar. But most of all: The C++ compiler makes sure that the constructor will always be called before the object can be used (see RAII), and that the destructor gets called before the object leaves the scope.

#include <cassert>
#include <cstddef>
#include <cstdio>

class Vector
{
    public:
        Vector(std::size_t dim);
        ~Vector();

        double &operator()(std::size_t index);
        double operator()(std::size_t index) const;

        double *data;
        std::size_t dim;
};

Vector::Vector(std::size_t dim)
    : data(new double[dim]), dim(dim)
{
}

Vector::~Vector()
{
    delete [] data;
}

double &
Vector::operator()(std::size_t index)
{
    assert(index < dim);
    return data[index];
}

double
Vector::operator()(std::size_t index) const
{
    assert(index < dim);
    return data[index];
}

void
print(const Vector &x)
{
    for (size_t i = 0; i < x.dim; ++i) {
        std::printf("%lf", x(i));
        if (i + 1 < x.dim) {
            std::printf(", ");
        }
    }
}

int
main(void)
{
    Vector x(4);

    for (std::size_t i = 0; i < x.dim; ++i) {
        x(i) = 42 + i;
    }

    print(x);
    printf("\n");
}
theon$ g++ -Wall -o vector vector.cpp
theon$ ./vector
42.000000, 43.000000, 44.000000, 45.000000
theon$