This is the mail archive of the gsl-discuss@sourceware.org mailing list for the GSL project.
Large linear least squares
- From: Patrick Alken <alken at colorado dot edu>
- To: "gsl-discuss at sourceware dot org" <gsl-discuss at sourceware dot org>
- Date: Mon, 9 Nov 2015 15:16:23 -0700
- Subject: Large linear least squares
Hello all,
I have implemented two solvers for large linear least squares systems
in GSL. These routines are designed for so-called "Tall Skinny"
matrices, which have many more rows than columns. The main idea is to
accumulate the matrix one block of rows at a time to avoid having to
store the entire matrix in memory at once (which may not be possible
depending on the size of your matrix).
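Schematically, the matrix X and right hand side y are partitioned into
row blocks,

  X = [ A_1 ; A_2 ; ... ; A_K ],   y = [ y_1 ; ... ; y_K ],

and each pair (A_k, y_k) is processed and then discarded before the
next block is read in, so only one block plus a small amount of
per-method state is ever held in memory at a time.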
The first method is the normal equations method, which is well known
and popular for its speed and simplicity. This method calculates and
stores only the normal equations matrix X^T X, which is much smaller
than the full matrix X. The normal equations method is accurate when applied
to well conditioned matrices, but suffers from numerical instabilities
for ill-conditioned matrices.
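In block form, the normal equations accumulate as

  X^T X = sum_k A_k^T A_k,    X^T y = sum_k A_k^T y_k,

so each block contributes a small m-by-m update, and the m-by-m system
(X^T X) c = X^T y is solved once at the end. The instability on badly
conditioned problems comes from the squaring of the condition number:
cond(X^T X) = cond(X)^2.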
The second method is the Tall Skinny QR (TSQR) algorithm from Demmel
et al., 2008. This method calculates the QR decomposition of X, updating
the R factor each time a new block of rows is added to the system. Since
it is based on a QR decomposition, it is much more stable than the
normal equations method on badly conditioned matrices, though it is
roughly twice as slow as the normal equations approach when n >> m.
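The per-block update is just a QR factorization of the previous R
stacked on top of the new rows,

  R_k = qr_R([ R_{k-1} ; A_k ]),

with Q applied implicitly to the right hand side, so again only m-by-m
state is carried between blocks. As a rough, hypothetical sketch of the
accumulate/solve flow (function names are taken from the gsl_multilarge
interface in git; treat them and the synthetic data as illustrative
rather than definitive):

/* Hypothetical sketch: block TSQR least squares with the
 * gsl_multilarge interface. Names assumed from the git docs;
 * the block data are synthetic placeholders.
 * Build with: gcc sketch.c -lgsl -lgslcblas -lm
 */
#include <math.h>
#include <stdio.h>
#include <gsl/gsl_matrix.h>
#include <gsl/gsl_vector.h>
#include <gsl/gsl_multilarge.h>

int main(void)
{
  const size_t m = 3;          /* columns (small) */
  const size_t nrows = 100;    /* rows per block */
  const size_t nblocks = 50;   /* total rows = 5000 >> m */
  gsl_multilarge_linear_workspace *w =
    gsl_multilarge_linear_alloc(gsl_multilarge_linear_tsqr, m);
  gsl_matrix *A = gsl_matrix_alloc(nrows, m);
  gsl_vector *y = gsl_vector_alloc(nrows);
  gsl_vector *c = gsl_vector_alloc(m);
  double rnorm, snorm; /* residual norm and solution norm */
  size_t i, j, k;

  for (k = 0; k < nblocks; ++k)
    {
      /* fill one block of rows; a real application would read
         these from disk or generate them on the fly */
      for (i = 0; i < nrows; ++i)
        {
          double t = (double) (k * nrows + i) / (nrows * nblocks);
          for (j = 0; j < m; ++j)
            gsl_matrix_set(A, i, j, pow(t, (double) j));
          gsl_vector_set(y, i, 1.0 + 2.0 * t); /* exact c = (1,2,0) */
        }

      /* fold this block into the running R factor; the block may
         be overwritten, so it is refilled each iteration */
      gsl_multilarge_linear_accumulate(A, y, w);
    }

  /* lambda = 0: plain (unregularized) least squares */
  gsl_multilarge_linear_solve(0.0, c, &rnorm, &snorm, w);

  printf("c = (%g, %g, %g), residual = %g\n",
         gsl_vector_get(c, 0), gsl_vector_get(c, 1),
         gsl_vector_get(c, 2), rnorm);

  gsl_multilarge_linear_free(w);
  gsl_matrix_free(A);
  gsl_vector_free(y);
  gsl_vector_free(c);
  return 0;
}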
Everything is in git and documented, with an example program.
Feedback is always welcome. I am planning to push out a 2.1 release soon
to address a few of the bugs reported over the last couple of weeks.
Thanks,
Patrick