Triangular/Trapezoidal Matrices with Full Storage
Triangular, symmetric and hermitian matrices have in common that only data from a triangular matrix part is required to represent the matrix. If saving memory is of primary concerns packed storage formats for these matrix types are provided. These are matrix classes TpMatrix, SpMatrix and HpMatrix.
On this slide we will introduce matrix class TrMatrix. On the next page the classes SyMatrix and HeMatrix. All this classes use a full storage scheme for representing the matrix. Either the upper or lower triangular part of the full storage scheme is used to represent a triangular, symmetric or hermitian matrix.
You should not think of a TrMatrix being a triangular/trapezoidal matrix type. Consider instead: TrMatrix = Full storage scheme assumed to represent a triangular/trapezoidal matrix type. Having a TrMatrix class just statically tags this information to the full storage scheme and therefore makes the assumption known at compile time.
Information whether the represented matrix is upper or lower triangular/trapezoidal is stored dynamically. This gives more flexibility because sometimes you want to change this flag. But consequently accessing the wrong triangular part trigger a run-time assertion instead of a compile time error.
You can create a general matrix view from a triangular matrix to access all elements of the underlying full storage. So accessing the wrong triangular part is not forbidden per se. We just don't want that it happens accidentally and therefore require an explicit cast.
People unfamiliar in the field of numerical linear algebra might consider the design concept of these classes wrong. If you think accessing the lower part of an upper triangular part should always return a zero you are one of those people. Let's make you familiar!
People familiar in the field know: This is exactly what you want!
TrMatrix: Triangular/Trapezoidal Matrix with Full Storage
With a TrMatrix class the underlying full storage scheme should be assumed to represent a lower or upper triangular or trapezoidal matrix. A trapezoidal matrix is a non-square matrix where element above or below the diagonal are assumed to be zero.
Examples
TrMatrix<FullStorage<double, ColMajor> >
|
A triangular/trapezoidal matrix where memory gets allocated when instanciated and released at the end of its scope. |
TrMatrix<FullStorage<double, ColMajor> >::View
|
A triangular/trapezoidal matrix that references an existing full storage scheme. Memory neither gets allocated nor released at any point. Creation/destruction is cheap. View is a publib typedef in TrMatrix and in this case defined as TrMatrix<FullStorageView<double, ColMajor> > |
TrMatrix<FullStorage<double, ColMajor> >::ConstView
|
A triangular/trapezoidal matrix that references an existing full storage scheme. Memory neither gets allocated nor released at any point. Creation/destruction is cheap Calling any non-const methods will cause a compile time error. So this is a read-only matrix view. ConstView is a public typedef in TrMatrix and in this case defined as TrMatrix<ConstFullStorageView<double, ColMajor> > |
Some Public Typedefs
Like all matrix/vector classes public typedefs should be used to refer to element, index and view types.
IndexType |
Like its name suggests the type used for indexing. |
ElementType |
The element type. |
View, ConstView |
Types for referencing a rectangular part of the matrix. |
NoView |
Matrix type with memory allocation. If you want to copy matrix parts instead of referencing use this type. |
EngineView, EngineConstView |
Storage schemes for referencing parts from another storage scheme. For TrMatrix these are typedefs for FullStorageView<..> and ConstFullStorageView<..>. |
Constructors, Methods and operations for Matrix Manipulation
Methods and operators are provided analogously to class GeMatrix. For the special (but important) case that a TrMatrix is triangular (i.e. of dimension \(n \times n\)) and not trapezoidal (i.e. of dimension \(m \times n\)) some additional methods are provided.
Output operators are also defined. The will print the logical view, that means elements from the not referenced triangular part will be printed as zero. And ones for diagonal elements if the unit-diag flag is set.
Convenience methods like list initializers are not provides for a TrMatrix.
TrMatrix(IndexType dim, StorageUpLo upLo, Diag diag = NonUnit); |
Create a dim x dim triangular matrix.
|
TrMatrix(IndexType numRows, IndexType numCols, StorageUpLo upLo, Diag diag = NonUnit); |
Create a numRows x numCols trapezoidal matrix.
|
IndexType numRows() const; |
Return the number of rows. |
IndexType numCols() const; |
Return the number of columns. |
IndexType dim() const; |
For a triangular, i.e. square matrix returns the dimension. If the matrix is not square this causes a runtime assertion. |
StorageUpLo upLo() const; |
Returns Lower if the matrix represents a lower triangular/trapezoidal matrix. Otherwise Upper. In non-const context the method returns a reference of the attribute. So you can change it. |
StorageUpLo & upLo(); |
|
Diag diag() const; |
Returns Unit if elements on the diagonal must be assumed to have value one. Otherwise NonUnit. In non-const context the method returns a reference of the attribute. So you can change it. |
Diag & diag(); |
|
const ElementType& operator()(i, j) const; |
Return a reference to element in row \(i\) and column \(j\). Only accessing elements from the triangular part specified by upLo() is allowed. Otherwise a runtime assertion gets triggered. |
ElementType& operator()(i, j); |
|
const ElementType * data() const; |
Pointer to the first element of the matrix. So this is the address of A(1,1). |
ElementType * data(); |
|
StorageOrder
order() const; |
Returns RowMajor, ColMajor or Grid. An storage order Grid occurs if you have e.g. a matrix view that references every second row and every third column of a matrix. |
IndexType
strideRow() const; |
Stride in memory between elements in the same column. So A.data() + A.strideRow() is the address of A(2,1). If your matrix is ColMajor this is 1. |
IndexType
strideCol() const; |
Stride in memory between elements in the same row. So A.data() + A.strideCol() is the address of A(2,1). If your matrix is RowMajor this is 1. |
IndexType
leadingDimension() const; |
|
Why is Accessing the wrong Triangular Part Illegal and doesn't just return Zero?
An efficient algorithm will exploit the fact that for a triangular matrix the elements from the opposite triangular part are zero. In particular, an efficient algorithm will not use or reference them anyway.
If you have an algorithm that depends on zero values on the opposite triangular part then it is not exploiting the triangular property of the matrix. In that case you just as well can use a GeMatrix and set elements of the opposite triangular part explicitly to zero.
Simple example
In the following toy example we define a function that initializes a given TrMatrix. Such a function must at least distinguish 4 cases:
-
Upper triangular with non-unit diagonal.
-
Upper triangular with unit diagonal.
-
Lower triangular with non-unit diagonal.
-
Lower triangular with unit diagonal.
Also note that we always initialize the matrix column wise. If elements are stored row wise we also should differentiate in that respect. We then end up with 8 cases. At this point some people tell us that we should provide iterators:
-
You are free to implement iterators on top of FLENS. But it is important to know that behind the iterators still these 8 cases are hidden.
-
We have something like iterators. But first learn to do things by hand before using fancy abstractions.
#include <algorithm>
#include <iostream>
using namespace std;
using namespace flens;
// Initialize a triangular/trapezoidal matrix column wise with 1, 2, 3, ...
template <typename MA>
void
myInit(TrMatrix<MA> &A)
{
int counter = 1;
int m = A.numRows();
int n = A.numCols();
if (A.upLo()==Upper) {
if (A.diag()==NonUnit) {
for (int j=1; j<=n; ++j) {
for (int i=1; i<=std::min(j,m); ++i) {
A(i,j) = counter++;
}
}
} else {
for (int j=1; j<=n; ++j) {
for (int i=1; i<std::min(j,m); ++i) {
A(i,j) = counter++;
}
}
}
} else {
if (A.diag()==NonUnit) {
for (int j=1; j<=n; ++j) {
for (int i=j; i<=m; ++i) {
A(i,j) = counter++;
}
}
} else {
for (int j=1; j<=n; ++j) {
for (int i=j+1; i<=m; ++i) {
A(i,j) = counter++;
}
}
}
}
}
int
main()
{
TrMatrix<FullStorage<double> > A(3, Upper);
TrMatrix<FullStorage<double> > B(3, Upper, Unit);
TrMatrix<FullStorage<double> > C(3, 4, Upper);
TrMatrix<FullStorage<double> > D(3, 4, Upper, Unit);
TrMatrix<FullStorage<double> > E(4, 3, Upper);
TrMatrix<FullStorage<double> > F(4, 3, Upper, Unit);
TrMatrix<FullStorage<double> > G(3, Lower);
TrMatrix<FullStorage<double> > H(3, Lower, Unit);
TrMatrix<FullStorage<double> > I(3, 4, Lower);
TrMatrix<FullStorage<double> > J(3, 4, Lower, Unit);
TrMatrix<FullStorage<double> > K(4, 3, Lower);
TrMatrix<FullStorage<double> > L(4, 3, Lower, Unit);
myInit(A);
myInit(B);
myInit(C);
myInit(D);
myInit(E);
myInit(F);
myInit(G);
myInit(H);
myInit(I);
myInit(J);
myInit(K);
myInit(L);
cout << "A = " << A << endl;
cout << "B = " << B << endl;
cout << "C = " << C << endl;
cout << "D = " << D << endl;
cout << "E = " << E << endl;
cout << "F = " << F << endl;
cout << "G = " << G << endl;
cout << "H = " << H << endl;
cout << "I = " << I << endl;
cout << "J = " << J << endl;
cout << "K = " << K << endl;
cout << "L = " << L << endl;
}
Compile and Run
$shell> cd flens/examples $shell> g++ -Wall -std=c++11 -I../.. trmatrix.cc $shell> ./a.out A = 1 2 4 0 3 5 0 0 6 B = 1 1 2 0 1 3 0 0 1 C = 1 2 4 7 0 3 5 8 0 0 6 9 D = 1 1 2 4 0 1 3 5 0 0 1 0 E = 1 2 4 0 3 5 0 0 6 0 0 0 F = 1 1 2 0 1 3 0 0 1 0 0 0 G = 1 0 0 2 4 0 3 5 6 H = 1 0 0 1 1 0 2 3 1 I = 1 0 0 0 2 4 0 0 3 5 6 0 J = 1 0 0 0 1 1 0 0 2 3 1 0 K = 1 0 0 2 5 0 3 6 8 4 7 9 L = 1 0 0 1 1 0 2 4 1 3 5 6
Creating TrMatrix Views from a GeMatrix
In the next section we show an application for creating triangular views that reference the storage scheme of a general matrix.
The GeMatrix class provides the following methods to create TrMatrix views referencing the underlying full storage scheme:
const ConstTriangularView
upper() const; |
Return a triangular or trapezoidal matrix view represented by the upper triangular part. |
TriangularView
upper(); |
|
const ConstTriangularView
upperUnit() const; |
Return a triangular or trapezoidal matrix view represented by the upper triangular part. Elements on the diagonal must be assumed to be one. |
TriangularView
upperUnit(); |
|
const ConstTriangularView
lower() const; |
Return a triangular or trapezoidal matrix view represented by the lower triangular part. |
TriangularView
lower(); |
|
const ConstTriangularView
lowerUnit() const; |
Return a triangular or trapezoidal matrix view represented by the lower triangular part. Elements on the diagonal must be assumed to be one. |
TriangularView
lowerUnit(); |
The return types are public typedefs of GeMatrix:
-
ConstTriangularView is defined as TrMatrix<ConstFullStorageView<..> >.
-
TriangularView is defined as TrMatrix<FullStorageView<..> >.
For convenience you can use the auto keyword:
auto L = A.lowerUnit();
auto U = A.lower();
Why using Full Storage Schemes for Triangular Matrices
There are many applications where you start with a general matrix, do some operations. Then you go on with triangular/trapezoidal matrices which are referencing the upper or lower triangular/trapezoidal part. So TrMatrix views created from a GeMatrix play a major role.
Consider solving a system of linear equations \(AX=b\) using a \(LU\) factorization:
-
First compute the factorization \(A = P L U\). If \(A\) is a \(n \times n\) matrix then
-
\(P\) is a \(n \times n\) permutation matrix that can be represented by a permutation vector of length \(n\).
-
\(L\) is a lower triangular matrix with ones on the diagonal.
-
\(U\) is a upper triangular matrix.
So in practise the strict lower triangular part of \(A\) gets overwritten with the strict lower triangular part of \(L\). The ones from the diagonal of \(L\) will not be stored. The upper triangular part of \(A\) gets overwritten with the upper triangular part of \(U\). So only additional memory for storing the permutation vector \(p\) is required. The computational cost is \(\mathcal{O}(n^3)\)
-
-
\(Ax = b \quad \Leftrightarrow \quad P L U x = b \quad \Leftrightarrow \quad L U x = P^T b\)
So in the next step we formally compute a new right hand side \(\tilde{b} = P^T b\). In practise we overwrite \(b\) with \(P^T b\). As this just means to permute the vector elements and involves \(n\) operations.
-
\(L U x = b \quad \Leftrightarrow \quad L y = b\) with \( y:=U x\).
Using a triangular solver you solve \(Ly=b\) for \(y\). So the triangular solver actually gets \(A\) but only references the strict lower triangular part of \(A\). Elements on the diagonal and assumed to be one but not touched by the solver. Also elements from the upper triangular part are just assumed to be zero but never referenced.
The triangular solver will overwrite \(b\) with the solution \(y\). The complexity of the triangular solver is \(\mathcal{O}(n^2)\).
-
As \(b\) now contains \(y\) we solve \(Ux=b\). Again we use a triangular solver that will overwrite \(b\) with the solution \(x\).
The application of matrix view is obvious. The very same full storage scheme
-
First gets used to represent a general matrix when computing the \(LU\) factorization.
-
Then gets used to represent a lower triangular matrix.
-
And finally as a upper triangular matrix.
Note that this a algorithm also can be used if only \(A\) is know a priori and the right hand side \(b\) (or a sequence of right hand sides) is known only a posteriori. In that case you can;
-
Compute the \(LU\) factorization a priori (Complexity \(\mathcal{O}(n^3)\)).
-
Solve \(Ly=b\) and \(Ux=b\) a posterior as soon a right hand side is \(b\) given (Complexity \(\mathcal{O(n^2)}\)).
Example: Solving a System of Linear Equations
#include <flens/flens.cxx>
using namespace std;
using namespace flens;
int
main()
{
GeMatrix<FullStorage<double> > A(4, 4);
DenseVector<Array<double> > b(4);
DenseVector<Array<int> > piv(4);
A = 2, 3, -1, 0,
-6, -5, 0, 2,
2, -5, 6, -6,
4, 6, 2, -3;
cout << "A = " << A << endl;
lapack::trf(A, piv);
for (int i=0; i<3; ++i) {
cout << "Right hand side b in problem set " << i << endl;
fillRandom(b);
cout << "b = " << b << endl;
lapack::laswp(b, piv);
blas::sv(NoTrans, A.lowerUnit(), b);
blas::sv(NoTrans, A.upper(), b);
cout << "x = " << b << endl;
}
}
Brief Explanation
A pripori compute the triangular factorization \(PLU = A\) using the FLENS-LAPACK function lapack::trf.
The right hand sides are only known a posteriori. Furthermore we have three different right hand sides for the same coefficient matrix \(A\).
A posteriori call the triangular solvers to solve \(Ax=b\) for different right hand sides \(b\).
cout << "Right hand side b in problem set " << i << endl;
fillRandom(b);
cout << "b = " << b << endl;
Apply the permutation: \(b \leftarrow Pb\) using the FLENS-LAPACK function lapack::laswp.
Solve \(L y = b\) using FLENS-BLAS function blas::sv. Vector \(b\) will be overwritten with \(y\).
Solve \(U x = b\) using FLENS-BLAS function blas::sv. Vector \(b\) will be overwritten with \(x\).
Compile and Run
$shell> cd flens/examples $shell> g++ -Wall -std=c++11 -I../.. lu.cc $shell> ./a.out A = 2 3 -1 0 -6 -5 0 2 2 -5 6 -6 4 6 2 -3 Right hand side b in problem set 0 b = -0.999984 -0.736924 0.511211 -0.0826997 x = 2.05119 -0.252638 4.34444 5.1535 Right hand side b in problem set 1 b = 0.0655345 -0.562082 -0.905911 0.357729 x = 0.530196 0.0390826 1.11211 1.40725 Right hand side b in problem set 2 b = 0.358593 0.869386 -0.232996 0.0388327 x = -1.22452 0.185683 -2.25059 -2.77466
Why a TrMatrix is not Required to be Square?
Reconsider solving a system of linear equations \(Ax=b\). If the right hand side \(b\) is known a priori you can consider the extended matrix \(\tilde(A) = (A,b)\). Applying the \(LU\) factorization to \(\tilde{A}\) you get the same permutation \(P\) and lower triangular matrix \(L\) as for \(A\). But the upper triangular matrix has an additional column:
\[(A,b) = \tilde{A} = P L \tilde{U} = P L (U, L^{-1}b)\]So the forward solving of \(Ly=b\) is already done and the last column got overwritten with \(y = L^{-1}b\).
If you have multiple right hand sides that are known a priori you can put them together column wise in a matrix \(B\). Extending \(A\) for \(B\) leads to \(\tilde{A} = (A,B)\). The \(LU\) factorization of $\tilde{A} gives
\[(A,B) = \tilde{A} = P L \tilde{U} = P L (U, L^{-1}B)\]So afterwards solving the matrix equation \(AX = B\) reduces to solving \(UX=B\).
Example: Solving a System of Linear Equations with a priori known \(B\)
#include <flens/flens.cxx>
using namespace std;
using namespace flens;
int
main()
{
GeMatrix<FullStorage<double> > AB(4, 4+3);
DenseVector<Array<int> > piv(4);
Underscore<int> _;
auto A = AB(_,_(1,4));
auto B = AB(_,_(5,7));
A = 2, 3, -1, 0,
-6, -5, 0, 2,
2, -5, 6, -6,
4, 6, 2, -3;
fillRandom(B);
cout << "(A,B) = " << AB << endl;
lapack::trf(AB, piv);
blas::sm(Left, NoTrans, 1.0, A.upper(), B);
cout << "X = " << B << endl;
}
Brief Explanation
A pripori compute the triangular factorization \(PLU = (A,B)\) using the FLENS-LAPACK function lapack::trf.
Solve \(U X = B\) using FLENS-BLAS function `blas::sv. Matrix $B$ gets overwritten with the solution $X$.
Compile and Run
$shell> cd flens/examples $shell> g++ -Wall -std=c++11 -I../.. lu2.cc $shell> ./a.out (A,B) = 2 3 -1 0 -0.999984 0.0655345 0.358593 -6 -5 0 2 -0.736924 -0.562082 0.869386 2 -5 6 -6 0.511211 -0.905911 -0.232996 4 6 2 -3 -0.0826997 0.357729 0.0388327 X = 2.05119 0.530196 -1.22452 -0.252638 0.0390826 0.185683 4.34444 1.11211 -2.25059 5.1535 1.40725 -2.77466
Creating GeMatrix Views from a TrMatrix
In the next paragraph we give reasons why it is sometime important to create a general matrix view from a triangular matrix. The following methods of TrMatrix will be used:
const ConstGeneralView
general() const; |
Creates a GeMatrix view that references the same underlying full storage scheme. So the return type is e.g. GeMatrix<FullStorageView<..> >. The exact type is defined through the public typedef ConstGeneralView and GeneralView. |
GeneralView
general(); |
Yes, it's legal: Changing the upper Part of a lower TrMatrix
Assume you have a TrMatrix that represents a lower triangular matrix. Using the element access operator for the upper triangular part will cause a runtime assertion error. That is because we assume something like that happens accidentally.
Having said that, it is not illegal per se. If the matrix is not const you have the right to change any element from the underlying full storage scheme. If you explicitly create a GeMatrix view from the TrMatrix then any element can be changed using the general view.
Application
One reason to use a full storage scheme with \(n \times n\) elements for a triangular matrix (instead of a packed storage scheme that only uses half the memory) is the possibility to use the opposite triangular part as workspace.
Example
We just give a brief example for illustrating the technical details.
#include <iostream>
using namespace std;
using namespace flens;
int
main()
{
TrMatrix<FullStorage<double> > L(3, Lower);
L(1,1) = 11;
L(2,1) = 21; L(2,2) = 22;
L(3,1) = 31; L(3,2) = 32; L(3,3) = 33;
auto A = L.general();
A(1,3) = 42;
A(2,2) = 42*2;
A(3,1) = 42*3;
cout << "A = " << A << endl;
cout << "L = " << L << endl;
}
Brief Explanation
Define a \(3 \times 3\) triangular matrix with memory for \(9\) doubles.
Init L somehow
L(2,1) = 21; L(2,2) = 22;
L(3,1) = 31; L(3,2) = 32; L(3,3) = 33;
The following line is equivalent to: TrMatrix<FullStorage<double> >::GeneralView A = L.general();
Change any element from the upper or lower triangular part
A(2,2) = 42*2;
A(3,1) = 42*3;
Note the output operator for TrMatrix only reads elements from the lower triangular part from memory. Elements from the upper part are just assumed to be zero.
cout << "L = " << L << endl;
}
Compile and Run
$shell> cd flens/examples $shell> g++ -Wall -std=c++11 -I../.. trmatrix2.cc $shell> ./a.out A = 11 0 42 21 84 0 126 32 33 L = 11 0 0 21 84 0 126 32 33
Interface in Detail: Creating Views Referencing Matrix Parts
As you can create a general matrix view from a TrMatrix technically any block can be used to create a matrix view:
auto A = U.general();
A12 = A(_(1,4),_(5,8));
// ...
As creating rectangular parts or row/column views from a TrMatrix are frequently needed we also support direct methods. It is not required that the selected area lies withing the proper triangular part. E.g. you sometime want to select from a upper triangular matrix a block in the lower triangular part and use it as workspace. Creating a view from a block selection always returns a GeMatrix view. Even if the block is on the diagonal. Selecting a row/column always results in a DenseVector view.
-
Methods for selecting rectangular matrix parts:
const ConstGeneralView
operator()(const Range<IndexType> &rows,
const Range<IndexType> &cols) const;
Select a rectangualr area of the form (fromRow:toRow, fromCol:toCol) or (fromRow:stepRow:toRow, fromCol:stepRow:toCol)
GeneralView
operator()(const Range<IndexType> &rows,
const Range<IndexType> &cols);
const ConstGeneralView
operator()(const Underscore<IndexType> &,
const Range<IndexType> &cols) const;
All rows are selected. So the range has the form (all, from:to).
GeneralView
operator()(const Underscore<IndexType> &,
const Range<IndexType> &cols);
const ConstGeneralView
operator()(const Range<IndexType> &rows,
const Underscore<IndexType> &) const;
All columns are selected. So the range has the form (from:to, all).
GeneralView
operator()(const Range<IndexType> &rows,
const Underscore<IndexType> &);
-
Methods for selecting vector parts (the DenseVector class will be introduced on the next page):
const ConstVectorView
operator()(IndexType row, const Underscore<IndexType> &) const;
Select (row, all), i.e. the row with index row.
VectorView
operator()(IndexType row, const Underscore<IndexType> &);
const ConstVectorView
operator()(IndexType row, const Range<IndexType> &cols) const;
Select (row, from:to), i.e. from the row with index row the elements in the column in range from:to.
VectorView
operator()(IndexType row, const Range<IndexType> &cols);
const ConstVectorView
operator()(const Underscore<IndexType> &, IndexType col) const;
Select (all, col), i.e. the column with index col.
VectorView
operator()(const Underscore<IndexType> &, IndexType col);
const ConstVectorView
operator()(const Range<IndexType> &rows, IndexType col) const;
Select (from:to, col), i.e. from the column with index col the elements in the row range from:to.
VectorView
operator()(const Range<IndexType> &rows, IndexType col);