FLENS-LAPACK

At the moment we are working on a C++ port of LAPACK. Sounds tedious? It is a joy with FLENS, because FLENS gives you easy-to-use tools for implementing efficient, robust and reliable numerical software.

Not all LAPACK functions have been ported so far. However, if you have an external LAPACK implementation on your system, you can use FLENS-LAPACK as a high-level interface to it. FLENS-LAPACK accesses the external LAPACK library through the low-level CXXLAPACK layer. The usage and concept of this high-level (CXX)LAPACK interface are illustrated in the tutorial.

Generic FLENS-LAPACK: Purpose

FLENS is a comfortable tool for implementing numerical algorithms, and its abstractions are designed to avoid negative impacts on efficiency. Our FLENS-LAPACK port demonstrates the following features of FLENS:

Easy to read and understand

There are many C++ libraries that implement LAPACK functionality. But when you look at their code, it is often hard to even recognize the underlying algorithm: it gets lost in all the C++ template tricks and tweaks.

The Fortran implementation of LAPACK is much easier to read and understand than all these C++ implementations, even if you are not very familiar with Fortran. So even if some C++ nerds don't like to hear it: Fortran LAPACK is great. However, there is one drawback. As there are no actual matrix/vector types, many parameters have to be passed to LAPACK routines. This is sometimes error-prone and hard to read, but it is still better than many C++ implementations.

FLENS-LAPACK not only competes with the Fortran implementation in terms of readability but even exceeds it. FLENS provides a very expressive notation for numerical linear algebra, so the FLENS-LAPACK implementation of numerical algorithms is really readable. You can consider FLENS-LAPACK an improved reimplementation of Fortran LAPACK.

Same results as the Fortran implementation of LAPACK

LAPACK is the king in the numerical software field: established and well tested. Our implementation is intended to produce exactly the same results as the Fortran LAPACK (version 3.3.1), as long as the same BLAS implementation is used. By exactly the same results we mean that we even reproduce the same roundoff errors.
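
The condition on the BLAS implementation matters because floating-point addition is not associative: two mathematically equivalent routines that accumulate in a different order can round differently. The following is a minimal illustration in plain C++. It is not FLENS code, and the function names are chosen for this sketch only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Accumulates the entries from first to last.
double sumForward(const std::vector<double> &x)
{
    double s = 0;
    for (std::size_t i = 0; i < x.size(); ++i) s += x[i];
    return s;
}

// Accumulates the entries from last to first.
double sumBackward(const std::vector<double> &x)
{
    double s = 0;
    for (std::size_t i = x.size(); i-- > 0; ) s += x[i];
    return s;
}

// Same three numbers, two accumulation orders, two different results.
void demoSummationOrder()
{
    const std::vector<double> x = {1e100, -1e100, 1.0};
    assert(sumForward(x)  == 1.0);  // (1e100 - 1e100) + 1
    assert(sumBackward(x) == 0.0);  // (1 - 1e100) + 1e100: the 1 is absorbed
}
```

Reproducing the roundoff of a reference implementation therefore means performing the same operations in the same order, in the LAPACK layer as well as in the BLAS underneath it.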

Same performance as the Fortran implementation of LAPACK

While we have not yet begun benchmarking, we are confident that in the end we will achieve the same performance as the Fortran LAPACK, again under the assumption that the same BLAS implementation is used in both cases.

CXXBLAS

We provide a generic BLAS implementation that gets called if no native BLAS implementation like ATLAS, GotoBLAS or OpenBLAS is available or if the involved data types are not supported.

While CXXBLAS currently passes all BLAS tests, we plan to modify its implementation so that it produces exactly the same results as the BLAS reference implementation.

Due to CXXBLAS the FLENS-LAPACK routines can be used with data types from C++ multi-precision libraries.
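
To illustrate why a generic implementation opens the door to other element types: a BLAS-style kernel written as a template only requires that the element type supports `+` and `*`. The sketch below is plain C++ and not the CXXBLAS source. The name `genericDot` and the toy `Rational` type (standing in for a type from a multi-precision library) are assumptions made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Generic dot product: works for any T that provides +, * and
// construction from 0.
template <typename T>
T genericDot(const std::vector<T> &x, const std::vector<T> &y)
{
    assert(x.size() == y.size());
    T result = T(0);
    for (std::size_t i = 0; i < x.size(); ++i) {
        result = result + x[i] * y[i];
    }
    return result;
}

// Toy exact rational type, used here only as a stand-in for a
// multi-precision number type (no normalization, no overflow checks).
struct Rational
{
    long num, den;
    Rational(long n = 0, long d = 1) : num(n), den(d) {}
};

Rational operator+(const Rational &a, const Rational &b)
{
    return Rational(a.num * b.den + b.num * a.den, a.den * b.den);
}

Rational operator*(const Rational &a, const Rational &b)
{
    return Rational(a.num * b.num, a.den * b.den);
}

// Compares values by cross-multiplication, so 12/6 equals 2/1.
bool operator==(const Rational &a, const Rational &b)
{
    return a.num * b.den == b.num * a.den;
}
```

The same `genericDot` can be called with `std::vector<double>` and with `std::vector<Rational>`; no type-specific variant has to be written.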

Current Status

Below we give an overview of the functionality currently provided by FLENS-LAPACK. Function names in FLENS-LAPACK are derived from the corresponding LAPACK names by removing the letters that merely specify the argument types:

sv in FLENS-LAPACK corresponds to dgesv and zgesv in LAPACK
trs in FLENS-LAPACK corresponds to dgetrs, zgetrs, dtrtrs and ztrtrs in LAPACK
...
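
Dropping the type letters works because C++ can select the right routine from the argument types. A minimal sketch of this idea in plain C++ follows. The `General` and `Triangular` class templates and the string return values are hypothetical and only show which routine would be chosen; they are not the FLENS-LAPACK interface.

```cpp
#include <cassert>
#include <complex>
#include <string>

// Hypothetical stand-ins for a general and a triangular matrix type.
template <typename T> struct General {};
template <typename T> struct Triangular {};

// Type letter that LAPACK puts in front of a routine name.
inline std::string prefix(double)               { return "d"; }
inline std::string prefix(std::complex<double>) { return "z"; }

// One name, selected by matrix type and element type.
template <typename T>
std::string trs(const General<T> &)
{
    return prefix(T()) + "getrs";
}

template <typename T>
std::string trs(const Triangular<T> &)
{
    return prefix(T()) + "trtrs";
}

// The caller writes trs(...) in all four cases.
void demoNaming()
{
    assert(trs(General<double>())    == "dgetrs");
    assert(trs(Triangular<double>()) == "dtrtrs");
}
```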

At the moment only a subset of LAPACK is re-implemented in FLENS-LAPACK. For the other LAPACK functions, FLENS-LAPACK serves as a high-level interface. See the tutorial to learn how to use an external LAPACK implementation. By default, FLENS-LAPACK prefers a generic LAPACK implementation over an external implementation: an external implementation is only called if no generic implementation is available. You can change this default behavior through macros, as described in the tutorial.
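
The selection rule can be pictured as a compile-time switch. The sketch below is plain C++ with hypothetical macro names (`HAVE_EXTERNAL_LAPACK`, `PREFER_EXTERNAL_LAPACK`) invented for this illustration; the macros that FLENS-LAPACK actually uses are the ones described in the tutorial.

```cpp
#include <cassert>
#include <string>

// Hypothetical switches, named for this sketch only:
//   HAVE_EXTERNAL_LAPACK    an external LAPACK library is available
//   PREFER_EXTERNAL_LAPACK  call it even where a generic version exists

inline std::string genericImpl()  { return "generic"; }
inline std::string externalImpl() { return "external"; }

// A routine that has a generic implementation: the external library is
// used only if it is available and explicitly preferred.
inline std::string routineWithGeneric()
{
#if defined(HAVE_EXTERNAL_LAPACK) && defined(PREFER_EXTERNAL_LAPACK)
    return externalImpl();
#else
    return genericImpl();
#endif
}

#ifdef HAVE_EXTERNAL_LAPACK
// A routine without a generic implementation exists only if an external
// library is available.
inline std::string routineWithoutGeneric()
{
    return externalImpl();
}
#endif

// Without any switch defined, the generic implementation is chosen.
void demoDefault()
{
#if !defined(HAVE_EXTERNAL_LAPACK)
    assert(routineWithGeneric() == "generic");
#endif
}
```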

Routines for Matrices with Full Storage

Linear Equation Routines

TYPE	FLENS-LAPACK	DESCRIPTION	LAPACK
General	sv	Solves a general system of linear equations \(AX=B\). Example: lapack-gesv.	dgesv, zgesv
	svx	Solves a general system of linear equations \(AX=B\). Error bounds on the solution and a condition estimate are also provided.	dgesvx
	trf	Computes an \(LU\) factorization of a general matrix, using partial pivoting with row interchanges. Example: lapack-getrf.	dgetrf, zgetrf
	trs	Solves a general system of linear equations \(AX=B,\) \(A^T X=B,\) or \(A^H X=B,\) using the \(LU\) factorization. Example: lapack-getrs.	dgetrs, zgetrs
	tri	Computes the inverse of a general matrix, using the \(LU\) factorization. Example: lapack-getri.	dgetri, zgetri
Positive Definite	posv	Solves a symmetric positive definite system of linear equations \(AX=B.\) Example: lapack-posv.	dposv, zposv
	potrf	Computes the Cholesky factorization of a symmetric positive definite matrix. Example: lapack-potrs, lapack-potri.	dpotrf, zpotrf
	potrs	Solves a symmetric positive definite system of linear equations \(AX=B,\) using the Cholesky factorization computed by potrf. Example: lapack-potrs.	dpotrs, zpotrs
	potri	Computes the inverse of a positive definite matrix, using the Cholesky factorization computed by potrf. Example: lapack-potri.	dpotri, zpotri¹
Triangular	trs	Solves a triangular system of linear equations \(AX=B,\) \(A^T X=B,\) or \(A^H X=B.\) Example: lapack-trtrs.	dtrtrs, ztrtrs
	tri	Computes the inverse of a triangular matrix. Example: lapack-trtri.	dtrtri, ztrtri
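
For orientation, the computation behind `trf` followed by `trs` for a general matrix (factor once with partial pivoting, then solve by forward and back substitution) can be sketched in a few lines of plain C++. This is a textbook, unblocked version written for this overview, not the FLENS-LAPACK or LAPACK source. The names `luFactor` and `luSolve` and the row-major `std::vector` storage are assumptions of the sketch.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

typedef std::vector<double> Vec;

// In-place LU factorization with partial pivoting of the n x n row-major
// matrix A. On return, A holds L (unit diagonal, strictly below the
// diagonal) and U (on and above the diagonal); piv[k] is the row that was
// swapped with row k in step k. Returns false if a zero pivot occurs.
bool luFactor(Vec &A, std::vector<std::size_t> &piv, std::size_t n)
{
    assert(A.size() == n * n);
    piv.resize(n);
    for (std::size_t k = 0; k < n; ++k) {
        // Choose the entry of largest magnitude in column k as pivot.
        std::size_t p = k;
        for (std::size_t i = k + 1; i < n; ++i) {
            if (std::fabs(A[i * n + k]) > std::fabs(A[p * n + k])) p = i;
        }
        piv[k] = p;
        if (A[p * n + k] == 0.0) return false;
        if (p != k) {
            for (std::size_t j = 0; j < n; ++j) {
                std::swap(A[k * n + j], A[p * n + j]);
            }
        }
        // Eliminate below the pivot and store the multipliers in place.
        for (std::size_t i = k + 1; i < n; ++i) {
            A[i * n + k] /= A[k * n + k];
            for (std::size_t j = k + 1; j < n; ++j) {
                A[i * n + j] -= A[i * n + k] * A[k * n + j];
            }
        }
    }
    return true;
}

// Solves A x = b with the factors from luFactor; b is overwritten by x.
void luSolve(const Vec &LU, const std::vector<std::size_t> &piv, Vec &b)
{
    const std::size_t n = b.size();
    assert(LU.size() == n * n && piv.size() == n);
    // Apply the row interchanges to b.
    for (std::size_t k = 0; k < n; ++k) std::swap(b[k], b[piv[k]]);
    // Forward substitution with unit lower triangular L.
    for (std::size_t i = 1; i < n; ++i) {
        for (std::size_t j = 0; j < i; ++j) b[i] -= LU[i * n + j] * b[j];
    }
    // Back substitution with upper triangular U.
    for (std::size_t i = n; i-- > 0; ) {
        for (std::size_t j = i + 1; j < n; ++j) b[i] -= LU[i * n + j] * b[j];
        b[i] /= LU[i * n + i];
    }
}
```

Separating the two steps is what makes `trf` and `trs` useful as individual routines: the factorization is the expensive part and can be reused for any number of right-hand sides.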

Orthogonal Factorizations

FLENS	DESCRIPTION	LAPACK
qrf	Computes a \(QR\) factorization of a general rectangular matrix. Example: lapack-geqrf.	dgeqrf, zgeqrf
qp3	Computes a \(QR\) factorization with column pivoting of a matrix \(A\) such that \(AP = QR\). Example: lapack-geqp3.	dgeqp3, zgeqp3¹
orgqr, ungqr	Generates all or part of the orthogonal/unitary matrix \(Q\) from a \(QR\) factorization. Example: lapack-orgqr, lapack-ungqr.	dorgqr, zungqr¹
ormqr, unmqr	Multiplies a general matrix by the orthogonal/unitary matrix \(Q\) from a \(QR\) factorization. Example: lapack-ormqr, lapack-unmqr.	dormqr, zunmqr
lqf	Computes an \(LQ\) factorization of a general rectangular matrix. Example: lapack-gelqf.	dgelqf, zgelqf¹
orglq, unglq	Generates all or part of the orthogonal/unitary matrix \(Q\) from an \(LQ\) factorization. Example: lapack-orglq, lapack-unglq.	dorglq, zunglq¹
ormlq, unmlq	Multiplies a general matrix by the orthogonal/unitary matrix \(Q\) from an \(LQ\) factorization. Example: see lapack-gelqf.	dormlq, zunmlq¹

Least Squares Problems

FLENS	DESCRIPTION	LAPACK
ls	Solves overdetermined or underdetermined real linear systems involving an \(m \times n\) matrix \(A\), or its transpose, using a \(QR\) or \(LQ\) factorization of \(A\). It is assumed that \(A\) has full rank. If \(m \geq n\): find the least squares solution of an overdetermined system, i.e., minimize \(\| B - AX \|\), or find the minimum norm solution of an underdetermined system \(A^T X = B\) (or \(A^H X = B\)). If \(m < n\): find the minimum norm solution of an underdetermined system \(A X = B\), or find the least squares solution of an overdetermined system, i.e., minimize \(\| B - A^T X \|\) (or \(\| B - A^H X \|\)). Example: lapack-gels.	dgels, zgels¹
lsy	Computes the minimum-norm solution to a real linear least squares problem: minimize \(\| A X - B \|\) using a complete orthogonal factorization of \(A\). \(A\) is an \(m \times n\) matrix which may be rank-deficient. The rank of \(A\) is determined using incremental condition estimation. Example: lapack-gelsy.	dgelsy, zgelsy¹
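
For the full-rank case, the two kinds of solution mentioned above can be written out explicitly. These are standard results, stated here for a single right-hand side \(b\) (one column of \(B\)) and with reduced factorizations; the symbols \(Q\), \(R\) and \(L\) are the usual factors and are not specific to FLENS-LAPACK.

If \(m \geq n\) and \(A = QR\), where \(Q\) is \(m \times n\) with orthonormal columns and \(R\) is \(n \times n\), upper triangular and nonsingular, then

\[
x = R^{-1} Q^T b \quad \text{minimizes} \quad \| b - A x \|_2 .
\]

If \(m < n\) and \(A = LQ\), where \(L\) is \(m \times m\), lower triangular and nonsingular, and \(Q\) is \(m \times n\) with orthonormal rows, then

\[
x = Q^T L^{-1} b \quad \text{satisfies} \quad A x = L Q Q^T L^{-1} b = b
\]

and has the smallest norm \(\| x \|_2\) among all solutions of \(A x = b\). In both cases only a triangular solve and a multiplication by \(Q^T\) are needed once the factorization is available.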

Non-Symmetric Eigenvalue Routines

FLENS	DESCRIPTION	LAPACK
ev	Computes the eigenvalues and left and right eigenvectors of a general matrix. Example: lapack-geev.	dgeev, zgeev
evx	Computes the eigenvalues and left and right eigenvectors of a general matrix. Optionally, it also computes a balancing transformation to improve the conditioning of the eigenvalues and eigenvectors, reciprocal condition numbers for the eigenvalues, and reciprocal condition numbers for the right eigenvectors.	dgeevx
es	Computes for a general matrix, the eigenvalues, the real Schur form \(T\), and, optionally, the matrix of Schur vectors \(Z\). This gives the Schur factorization \(A = Z T Z^T.\)	dgees
esx	Like es but optionally, it also orders the eigenvalues on the diagonal of the real Schur form so that selected eigenvalues are at the top left; computes a reciprocal condition number for the average of the selected eigenvalues; and computes a reciprocal condition number for the right invariant subspace corresponding to the selected eigenvalues. The leading columns of \(Z\) form an orthonormal basis for this invariant subspace.	dgeesx
hrd	Reduces a general matrix to upper Hessenberg form by an orthogonal similarity transformation.	dgehrd
orghr	Generates the orthogonal transformation matrix from a reduction to Hessenberg form.	dorghr
unghr	Generates the unitary transformation matrix from a reduction to Hessenberg form.	zunghr

Routines for Matrices with Band Storage

Linear Equation Routines

TYPE	FLENS-LAPACK	DESCRIPTION	LAPACK
General	sv	Solves a general system of linear equations \(AX=B\). Example: lapack-gbsv.	dgbsv¹, zgbsv¹
	trf	Computes an \(LU\) factorization of a general matrix, using partial pivoting with row interchanges. Example: see lapack-gbtrs.	dgbtrf¹, zgbtrf¹
	trs	Solves a general system of linear equations \(AX=B,\) \(A^T X=B,\) or \(A^H X=B,\) using the \(LU\) factorization. Example: lapack-gbtrs.	dgbtrs¹, zgbtrs¹
Positive Definite	pbsv	Solves a symmetric positive definite system of linear equations \(AX=B.\) Example: lapack-pbsv.	dpbsv¹, zpbsv¹
	pbtrf	Computes the Cholesky factorization of a symmetric positive definite matrix. Example: see lapack-pbtrs.	dpbtrf¹, zpbtrf¹
	pbtrs	Solves a symmetric positive definite system of linear equations \(AX=B,\) using the Cholesky factorization computed by pbtrf. Example: lapack-pbtrs.	dpbtrs¹, zpbtrs¹

Note: Maybe we should break with the strict naming scheme and rename the functions pbsv, pbtrf, pbtrs simply to posv, potrf, potrs.

Related Projects

LAPACK itself, of course.
mpack, which is also a generic C++ port of LAPACK. To our knowledge, the following strategy is used for porting LAPACK:
- f2c is used to create a C implementation of LAPACK.
- Various scripts (the magic ingredient) are used to create a generic C++ implementation from the C code.
This approach has both advantages and (depending on your own goals) disadvantages:
- Pros: You have a complete generic C++ port of LAPACK that supports various multiple-precision arithmetic libraries like GMP, MPFR and QD.
- Cons: The automatically generated code is hard to read (as is typical for f2c-generated code).

Footnotes

¹This functionality is only available through an external LAPACK implementation. See the examples or the CXXLAPACK section of the tutorial for further details on using an external LAPACK.