This is the mail archive of the
gsl-discuss@sources.redhat.com
mailing list for the GSL project.
vector-matrix dot product
- To: gsl-discuss at sources dot redhat dot com
- Subject: vector-matrix dot product
- From: "E. Robert Tisdale" <edwin at netwood dot net>
- Date: Thu, 20 Jul 2000 17:42:30 +0000
On most modern serial computers, the typical cache line is 128 bits --
four 32-bit single-precision or two 64-bit double-precision
floating-point numbers. When the matrix-vector product
Ax
where A is an m by n matrix object and x is an n-element column vector,
is implemented in Fortran, which stores arrays in column-major order,
striding between elements in the rows of matrix A (m elements apart in memory)
may result in a cache miss on every access if A is a very large matrix.
In this case, it might be a good idea to store the transpose of matrix A
B = A^T
in memory instead of matrix A and compute the matrix-vector dot product
B^Tx
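instead of the matrix-vector product Ax, so that each element of the result
is a dot product of x with a contiguous column of B. A similar problem occurs
when the vector-matrix product xA is implemented in C, which stores arrays
in row-major order. In that case, it might be better to store the transpose of matrix A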
B = A^T
in memory instead of matrix A and compute the vector-matrix dot product
xB^T
instead of the vector-matrix product xA.
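
To make the C case concrete, here is a minimal sketch of the two access patterns. The function names vec_mat and vec_mat_dot are hypothetical (they are not GSL routines), and flat row-major arrays of double are assumed. The first routine walks down a column of A with stride n; the second reads a row of B = A^T with stride 1.

```c
#include <stddef.h>

/* y = xA, with A stored row-major as an m x n array.
   The inner loop walks down column j of A, so consecutive
   accesses are n elements apart in memory. */
void vec_mat(size_t m, size_t n,
             const double *x, const double *A, double *y)
{
    for (size_t j = 0; j < n; j++) {
        double sum = 0.0;
        for (size_t i = 0; i < m; i++)
            sum += x[i] * A[i * n + j];   /* stride n */
        y[j] = sum;
    }
}

/* y = xB^T, with B = A^T stored row-major as an n x m array.
   Each y[j] is the dot product of x with row j of B,
   which is contiguous in memory. */
void vec_mat_dot(size_t m, size_t n,
                 const double *x, const double *B, double *y)
{
    for (size_t j = 0; j < n; j++) {
        double sum = 0.0;
        for (size_t i = 0; i < m; i++)
            sum += x[i] * B[j * m + i];   /* stride 1 */
        y[j] = sum;
    }
}
```

Both routines return the same vector y; only the memory layout of the stored matrix and the resulting stride differ.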
Similarly, the matrix-matrix dot product
A^TX
in Fortran or
XA^T
in C has a slight advantage over the matrix-matrix product.
Block algorithms can be applied to the matrix-matrix dot product
just as they are applied to the matrix-matrix product.
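
As a rough illustration of the blocked form in C, the sketch below computes C = XA^T tile by tile. The name mat_mat_dot_blocked and the tile size BLOCK are assumptions made for this example, not part of GSL, and all three matrices are assumed to be flat row-major arrays of double. Because A is stored rather than A^T, the innermost loop reads a row of X and a row of A, both with stride 1.

```c
#include <stddef.h>

#ifndef BLOCK
#define BLOCK 32   /* hypothetical tile size; tune for the target cache */
#endif

static size_t min_sz(size_t a, size_t b) { return a < b ? a : b; }

/* C = X A^T, all row-major: X is p x k, A is n x k, C is p x n.
   C[i][j] is the dot product of row i of X with row j of A. */
void mat_mat_dot_blocked(size_t p, size_t n, size_t k,
                         const double *X, const double *A, double *C)
{
    for (size_t i = 0; i < p * n; i++)
        C[i] = 0.0;

    for (size_t ii = 0; ii < p; ii += BLOCK) {
        size_t i_end = min_sz(ii + BLOCK, p);
        for (size_t jj = 0; jj < n; jj += BLOCK) {
            size_t j_end = min_sz(jj + BLOCK, n);
            for (size_t ll = 0; ll < k; ll += BLOCK) {
                size_t l_end = min_sz(ll + BLOCK, k);
                /* Accumulate the contribution of one tile. */
                for (size_t i = ii; i < i_end; i++) {
                    for (size_t j = jj; j < j_end; j++) {
                        double sum = C[i * n + j];
                        for (size_t l = ll; l < l_end; l++)
                            sum += X[i * k + l] * A[j * k + l];
                        C[i * n + j] = sum;
                    }
                }
            }
        }
    }
}
```

The tile size of 32 is only a placeholder; a suitable value depends on the cache size of the machine.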