FLENS-LAPACK Implementation

At the moment we are working on a C++ port of LAPACK. Sound tedious? It is a joy with FLENS! This is because FLENS gives you easy-to-use tools for implementing efficient, robust and reliable numerical software.

Purpose

FLENS is a comfortable tool for the implementation of numerical algorithms. At the same time we avoid negative impacts on efficiency due to abstraction. But only claiming that we can achieve this two goals is one thing. The other thing is proving this! In this respect you could regard the FLENS-LAPACK as a prove of our claims. So our FLENS-LAPACK port demonstrates the following features:

Easy to read and understand

There are many C++ libraries that implement LAPACK functionality. But when you look at their code it is often hard to even recognize the underlying algorithm. It is like the algorithm gets lost in all these C++ template tricks and tweaks.

The Fortan implementation of LAPACK is much easier to read and understand. Even if you are not very familiar with Fortran! Well, maybe there is one drawback. As there are now actual matrix/vector types many parameters have to be passed to LAPACK routines. This is sometime error-prone and hard to read.

Anyway, the FLENS-LAPACK implementation of these algorithms is really readable.
Same results as the Fortran Implementation of LAPACK

LAPACK is the king in the numerical software field,established and well tested. Our implementation is intended to produce exactly the same results as the Fortran LAPACK (Version 3.3.1). As long as the same BLAS implementation gets used. And with exactly the same results we mean that we even produce the same roundoff errors.
Same performance as the Fortran implementation of LAPACK

While we have not begun with benchmarking we are confident that in the end we achieve the same performance as the Fortran LAPACK. Again under the assumption that in both cases the same BLAS implementation is used.
CXXBLAS

We provide a generic BLAS implementation that gets called if no native BLAS implementation like ATLAS, GotoBLAS or OpenBLAS is available or if the involved data types are not supported.

While CXXBLAS currently passes all BLAS test we plan to modify its implementation such that it produces exactly the same results as the BLAS reference implementation.

Due to CXXBLAS the FLENS-LAPACK routines can be used with data types from C++ multi-precision libraries.

Current Status

While it is a joy to port LAPACK to C++ we are still far from complete. And as we require that the resulting implementation is readable and understandable we can not completely automate the process:

The Fortran implementation of LAPACK uses a lot of goto statements. In our implementation we try to avoid them were possible and appropriate.
... and other reasons.

At the moment we focus on porting LAPACK routines were data types are real and the involved matrix types have full storage:

We picked LAPACK routines for double precision and turned them in generic C++ code that uses FLENS matrix/vector types. While they should also work in single and multi-precision we have not tested this so far.
Involved FLENS matrix types are GeMatrix, TrMatrix and SyMatrix and FLENS vector type DenseVector.
As the LAPACK naming is well known we stick with it in our port. However, we removed letters from the function names that merely specify the argument types. For example, the LAPACK functions dgetrs and dtrtrs resulted in the FLENS-LAPACK function trs which is overloaded for GeMatrix and TrMatrix.
All LAPACK routines are located in directory flens/lapack and loosely grouped in subdiretories:
- aux contains various auxiliary routines
- eig contains routines most relevant for the computation of eigenvalues and eigenvectors,
- gesv for solving systems of linear equations and
- qr for computing the QR factorization.

But before you get lost in the details you might be interested in the following LAPACK driver routines that are currently implemented in FLENS.

Linear Equation Routines

TYPE	FLENS	DESCRIPTION	LAPACK
General	sv	Solves a general system of linear equations \(AX=B\). Example: lapack-gesv.	dgesv
	svx	Solves a general system of linear equations \(AX=B\). Error bounds on the solution and a condition estimate are also provided.	dgesvx
	trf	Computes an \(LU\) factorization of a general matrix, using partial pivoting with row interchanges. Example: lapack-getrf.	dgetrf
	trs	Solves a general system of linear equations \(AX=B,\) \(A^T X=B,\) or \(A^H X=B,\) using the \(LU\) factorization.	dgetrs
	tri	Computes the inverse of a general matrix, using the \(LU\) factorization.	dgetri
Positive Definite	posv	Solves a symmetric positive definite system of linear equations \(AX=B.\)	dposv
	potrf	Computes the Cholesky factorization of a symmetric positive definite matrix.	dpotrf
	potrs	Solves a symmetric positive definite system of linear equations \(AX=B,\) using the Cholesky factorization.	dpotrs
	potri	Computes the inverse of a general matrix, using the Cholesky factorization.	dpotri
Triangular	trs	Solves a triangular system of linear equations \(AX=B,\) \(A^T X=B,\) or \(A^H X=B,\) using the \(LU\) factorization.	dtrtrs
Triangular	tri	Computes the inverse of a triangular matrix, using the \(LU\) factorization.	dtrtri

Orthogonal Factorizations

FLENS	DESCRIPTION	LAPACK
qrf	Computes a \(QR\) factorization of a general rectangular matrix. Example: lapack-geqrf.	dgeqrf
qrs	Solve the least squares problem \(\min\\| AX - B \\|\) using the \(QR\) factorization	dgeqrs
orgqr	Generates all or part of the orthogonal matrix \(Q\) from a \(QR\) factorization.	dorgqr
ormqr	Multiplies a general matrix by the orthogonal matrix from a \(QR\) factorization.	dormqr

Non-Symmetric Eigenvalue Routines

FLENS	DESCRIPTION	LAPACK
ev	Computes the eigenvalues and left and right eigenvectors of a general matrix. Example: lapack-geev.	dgeev
evx	Computes the eigenvalues and left and right eigenvectors of a general matrix. Optionally also, it computes a balancing transformation to improve the conditioning of the eigenvalues and eigenvectors, reciprocal condition numbers for the eigenvalues, and reciprocal condition numbers for the right eigenvectors.	dgeevx
es	Computes for a general matrix, the eigenvalues, the real Schur form \(T\), and, optionally, the matrix of Schur vectors \(Z\). This gives the Schur factorization \(A = Z T Z^T.\)	dgees
esx	Like es but optionally, it also orders the eigenvalues on the diagonal of the real Schur form so that selected eigenvalues are at the top left; computes a reciprocal condition number for the average of the selected eigenvalues; and computes a reciprocal condition number for the right invariant subspace corresponding to the selected eigenvalues. The leading columns of \(Z\) form an orthonormal basis for this invariant subspace.	dgeesx
hrd	Reduces a general matrix to upper Hessenberg form by an orthogonal similarity transformation.	dgehrd
orghr	Generates the orthogonal transformation matrix from a reduction to Hessenberg form.	dorghr

Todo

Of course there many more TODOs. But theses are the most important:

Change CXXBLAS such that it produces exactly the same results as RefBLAS
Operator overloading works for most things but needs still needs some work and checking. At the moment we call most of the time the FLENS-BLAS functions directly. Actually that is not that cumbersome. But once we have overloaded operators back we can the FLENS-LAPACK implementation further!

Related Projects

LAPACK itself of course.
mpack which is also a generic C++ port of LAPACK. To our knowledge the

following strategy gets used for porting LAPACK:
- f2c is used to create a C implementation of LAPACK
- Various scripts (the magic ingredient) are used to create a generic C++ implementation from the C code.
This approach has both, advantages and (depending on your own goals) disadvantages:
- Pros: You have a complete generic C++ port of LAPACK that supports various multiple precision arithmetic libraries like GMP, MPFR and QD.
- Cons: The automatic generated code is hard to read (as is typical for f2c-generated code).