Content

Avoiding the Creation of Temporaries

By default FLENS will never create any temporaries during the evaluation of linear algebra expressions. As a consequence some linear algebra expressions can not be evaluated through (CXX)BLAS calls. If possible such cases will trigger a compile time error. If such cases can not be checked at compile time you at least get an runtime assertion unless you compile with -DNDEBUG.

However, you can change the default such that temporaries get created but only when necessary. In addition you get a warning in the log-file such that you subsequently can optimize your code for completely avoiding the unintended creation of temporaries.

Of course this become much more useful when your linear algebra expression become more complex.

Expressions that require Temporaries

Consider \(x := Ax\) where \(x\) is a (dense) vector and \(A\) a general matrix then the CXXBLAS function gemv can not be used directly. This is because gemv does computations of the form \(y \leftarrow \beta y + \alpha A x\) where \(x\) and \(y\) do not reference the same memory (i.e. \(x\) and \(y\) must be different vectors). Hence, we need a temporary \(y_\text{tmp}\) and evaluate \(x := Ax\) in two steps:

Allowing and Logging the Creation of Temporaries

Like in the last example we turn on logging before the first linear algebra expression get evaluated.

#include <flens/flens.cxx>
#include <iostream>

using namespace flens;
using namespace std;

int
main()
{
    typedef double                               T;
    typedef DenseVector<Array<T> >               DEVector;
    typedef GeMatrix<FullStorage<T, ColMajor> >  GEMatrix;

    DEVector x(3);
    x = 123;

    GEMatrix A(3,3);
    A = 123,
        567,
        543;

    verbose::ClosureLog::start("mylogfile");

    x = A*x;

    cout << "x = " << x << endl;

    verbose::ClosureLog::stop();

    return 0;
}

Compile, Run and Examine the Log-File

Like in the last example we compile the example with -DFLENS_DEBUG_CLOSURES and link against some auxiliary object files from the flens/debug/auxiliary directory.

$shell> cd flens/examples                                                     
$shell> #                                                                     
$shell> # cleanup old object files and compile some stuff needed for logging  
$shell> #                                                                     
$shell> rm -f *.o                                                             
$shell> g++ -Wall -DFLENS_DEBUG_CLOSURES -std=c++11 -I../.. -c ../../flens/debug/auxiliary/*.cc                   
$shell> #                                                                     
$shell> # compile with -DFLENS_DEBUG_CLOSURES and link against the log stuff  
$shell> #                                                                     
$shell> g++ -Wall -DFLENS_DEBUG_CLOSURES -std=c++11 -I../.. *.o tut02-page04-example1.cc                          
$shell> ./a.out                                                               
x = 
           14             38             22 
$shell> #                                                                     
$shell> # look what is in the log file                                        
$shell> #                                                                     
$shell> cat mylogfile                                                         
--------------------------------------------------------------------------------
x1 = (A1 * x1);
flens::assign((A1 * x1), x1);
    flens::blas::copy((A1 * x1), x1);
        tmp_x2 = x1;
        flens::assign(x1, tmp_x2);
            --> flens::blas::copy(x1, tmp_x2);
        --> flens::blas::mv(NoTrans, 1, A1, tmp_x2, 0, x1);
        WARNING: Temporary tmp_x2 was required for x1

Other Expressions that require Temporaries (Important!)

No Associative property

Floating point number are not associative. That means in general we have \(x + (y + z) \neq (x + y) + z\) if \(x, y\) and \(z\) are floating point numbers or matrices/vectors of floating point numbers. In FLENS expressions without parentheses the expression gets done according to the usual C/C++ precedence. That means for example that \(x + y + z = (x + y) + z\).

No Distributive property

There is also no distributive property for floating point numbers. That means for example that \(A(x+y) \neq Ax + Ay\) if \(A\) is a matrix and \(x, y\) are vectors of floating point numbers.

Example Code

In the following code example temporaries are required due to the lack of the above property.


#include <flens/flens.cxx>
#include <iostream>

using namespace flens;
using namespace std;

int
main()
{
    typedef double                               T;
    typedef DenseVector<Array<T> >               DEVector;
    typedef GeMatrix<FullStorage<T, ColMajor> >  GEMatrix;

    DEVector x(3), y(3), z(3);
    x = 123;
    y = 234;
    z = 345;

    GEMatrix A(3,3);
    A = 123,
        567,
        543;

    verbose::ClosureLog::start("mylogfile");

    DEVector v = x + (y + z);

    z = A*(x+y);

    cout << "x = " << x << endl;

    verbose::ClosureLog::stop();

    return 0;
}

Compile, Run and Examine the Log-File

Like before ...

$shell> cd flens/examples                                                     
$shell> #                                                                     
$shell> # cleanup old object files and compile some stuff needed for logging  
$shell> #                                                                     
$shell> rm -f *.o                                                             
$shell> g++ -Wall -DFLENS_DEBUG_CLOSURES -std=c++11 -I../.. -c ../../flens/debug/auxiliary/*.cc                   
$shell> #                                                                     
$shell> # compile with -DFLENS_DEBUG_CLOSURES and link against the log stuff  
$shell> #                                                                     
$shell> g++ -Wall -DFLENS_DEBUG_CLOSURES -std=c++11 -I../.. *.o tut02-page04-example2.cc                          
$shell> ./a.out                                                               
x = 
            1              2              3 
$shell> #                                                                     
$shell> # look what is in the log file                                        
$shell> #                                                                     
$shell> cat mylogfile                                                         
--------------------------------------------------------------------------------
x1 = (x4 + (x3 + x2));
flens::assign((x4 + (x3 + x2)), x1);
    flens::blas::copy((x4 + (x3 + x2)), x1);
        --> flens::blas::copy(x4, x1);
        flens::blas::axpy(1, (x3 + x2), x1);
            tmp_x5 = (x3 + x2);
            flens::assign((x3 + x2), tmp_x5);
                flens::blas::copy((x3 + x2), tmp_x5);
                    --> flens::blas::copy(x3, tmp_x5);
                    --> flens::blas::axpy(1, x2, tmp_x5);
            --> flens::blas::axpy(1, tmp_x5, x1);
            WARNING: Temporary tmp_x5 was required for (x3 + x2)
--------------------------------------------------------------------------------
x2 = (A1 * (x4 + x3));
flens::assign((A1 * (x4 + x3)), x2);
    flens::blas::copy((A1 * (x4 + x3)), x2);
        tmp_x6 = (x4 + x3);
        flens::assign((x4 + x3), tmp_x6);
            flens::blas::copy((x4 + x3), tmp_x6);
                --> flens::blas::copy(x4, tmp_x6);
                --> flens::blas::axpy(1, x3, tmp_x6);
        --> flens::blas::mv(NoTrans, 1, A1, tmp_x6, 0, x2);
        WARNING: Temporary tmp_x6 was required for (x4 + x3)