Previous Contents Next

Chapter 4  Solving Least Squares Systems: SVD Approach

One way to solve overdetermined systems is to use the Singular Value Decomposition of a matrix. Suppose that a matrix A is given that has more rows than columns, ie n, the number of rows, is larger than m, the number of columns. We have given a data vector b of size n and we wish to find a vector x which minimizes the standard euclidean norm of the error
||Ax - b||2

4.1  The Singular Value Decomposition:

Now the theory behind this goes like this: as discussed in class, the singular value decomposition of the matrix A represents A as the following matrix product:
A = V F UT

This decomposition uses some things that come out of the square symmetric matrix ATA which is m × m. We let the eigenvalues of this matrix be denoted by {l1, ..., lm }. We know these eigenvalues are nonnegative and we can organize them in descending fashion from the largest l1 to the smallest positive one, lr with the remaining eigenvalues being 0. We call the square roots of these eigenvalues the singular values of A; the singular values are denoted by the symbol si and we know that s1 ³ s2 ³ ... ³ sr > 0 with the rest actually 0. We also know that the eigenvectors of ATA form an orthonormal basis for Âm which we label {u1, ..., , um }

then using the notation above, these matrices can be shown to have the following form:
V =
Au1
s1
, ...,
Aur
sr
, Vr+1, ..., Vn

where the columns of V give an orthonormal basis for Ân with the columns Vr+1 through Vn chosen to augment the first r linearly independent columns.

The matrix F is n × m and has a block form with an r × r diagonal matrix in the upper left hand corner and the rest zeros. The diagonal matrix has the singular values on its main diagonal.
F =
r by r diagonal of singular values r by m zero matrix
m-r by m zero matrix  
n-m by m zero matrix  

So, for example, a 5 × 3 matrix A with r=2 would have an associated F matrix which looks like this:
F =
s1 0 0
0 s2 0
0 0 0
0 0 0
0 0 0

while if r=3 we would see
F =
s1 0 0
0 s2 0
0 0 s3
0 0 0
0 0 0

Note that our theory tells us that r must be less than or equal m. Finally, the matrix U simply consists of the eigenvectors of ATA.
U =
U1, ... Um

4.2  Using the SVD of A For Least Squares:

Now in class we figured out how to use the SVD of A to solve a Least Squares problem. We do the following steps: This solution x has the property that
||Ax -b||22 = ||Fy -c||22 = sum j=r+1 to m cj2

4.3  Doing It In Matlab!:

Now in Matlab, we can find out about the SVD of A by asking for help:
>> help svd

 SVD    Singular value decomposition.
    [U,S,V] = SVD(X) produces a diagonal matrix S, of the same 
    dimension as X and with nonnegative diagonal elements in
    decreasing order, and unitary matrices U and V so that
    X = U*S*V'.
 
    S = SVD(X) returns a vector containing the singular values.
 
    [U,S,V] = SVD(X,0) produces the "economy size"
    decomposition. If X is m-by-n with m > n, then only the
    first n columns of U are computed and S is n-by-n.
 
    See also SVDS, GSVD.

 Overloaded methods
    help sym/svd.m
So if we want the SVD of the matrix A, we would type
[U,S,V] = svd(A)
We also want to find norms of vectors, so we again ask for help:
>> help norm

 NORM   Matrix or vector norm.
    For matrices...
      NORM(X) is the largest singular value of X, max(svd(X)).
      NORM(X,2) is the same as NORM(X).
      NORM(X,1) is the 1-norm of X, the largest column sum,
                      = max(sum(abs((X)))).
      NORM(X,inf) is the infinity norm of X, the largest row sum,
                      = max(sum(abs((X')))).
      NORM(X,'fro') is the Frobenius norm, sqrt(sum(diag(X'*X))).
      NORM(X,P) is available for matrix X only 
      if P is 1, 2, inf or 'fro'.
 
    For vectors...
      NORM(V,P) = sum(abs(V).^P)^(1/P).
      NORM(V) = norm(V,2).
      NORM(V,inf) = max(abs(V)).
      NORM(V,-inf) = min(abs(V)).
 
    See also COND, CONDEST, NORMEST.

 Overloaded methods
    help ss/norm.m
    help lti/norm.m
    help frd/norm.m
Hence, for us, the euclidean norm of a vector x is simply
norm(x)

4.3.1  Our MatLab Session:

In Matlab, we thus have to translate a bit. Our U is the Matlab V, our F is the Matlab S and our V is the Matlab U. Ughh!!! Here is our session:

svd.prompts
  1 >> A = [1 1 1; 2 -1 2; -1 4 3; 4 2 1; 3 -3 4]
  2 
  3 A =
  4 
  5      1     1     1
  6      2    -1     2
  7     -1     4     3
  8      4     2     1
  9      3    -3     4
 10 
 11 >> b = [1; 2; -1; 4; 8]
 12 
 13 b =
 14 
 15      1
 16      2
 17     -1
 18      4
 19      8
 20 
 21 >> S = svd(A)
 22 
 23 S =
 24 
 25     7.0494
 26     5.6213
 27     3.4216
 28 
 29 >> [U,S,V] = svd(A);
 30 >> U
 31 
 32 U =
 33 
 34    -0.1695    0.2172    0.0837   -0.9246   -0.2496
 35    -0.4209   -0.0717   -0.0540    0.2865   -0.8560
 36    -0.0772    0.8502   -0.4944    0.1556    0.0500
 37    -0.4452    0.3633    0.7721    0.1826    0.2009
 38    -0.7681   -0.3047   -0.3867   -0.0745    0.4027
 39 
 40 >> S
 41 
 42 S =
 43 
 44     7.0494         0         0
 45          0    5.6213         0
 46          0         0    3.4216
 47          0         0         0
 48          0         0         0
 49 
 50 >> V
 51 
 52 V =
 53 
 54    -0.7120   -0.0422    0.7009
 55     0.1924    0.9483    0.2526
 56    -0.6753    0.3147   -0.6670
 57 
 58 >> V'
 59 
 60 ans =
 61 
 62    -0.7120    0.1924   -0.6753
 63    -0.0422    0.9483    0.3147
 64     0.7009    0.2526   -0.6670
 65 
 66 >> U*S*V'
 67 
 68 ans =
 69 
 70     1.0000    1.0000    1.0000
 71     2.0000   -1.0000    2.0000
 72    -1.0000    4.0000    3.0000
 73     4.0000    2.0000    1.0000
 74     3.0000   -3.0000    4.0000
 75 
 76 >> c = U'*b
 77 
 78 c =
 79 
 80    -8.8595
 81    -1.7607
 82     0.4648
 83    -0.3723
 84     2.0134
 85 
 86 >> y = [c(1)/S(1,1); c(2)/S(2,2); c(3)/S(3,3)]
 87 
 88 y =
 89 
 90    -1.2568
 91    -0.3132
 92     0.1358
 93 
 94 >> x = V*y
 95 
 96 x =
 97 
 98     1.0033
 99    -0.5045
100     0.6595
101 
102 >> e = A*x - b
103 
104 e =
105 
106     0.1583
107     1.8301
108    -0.0427
109    -0.3364
110    -0.8385
111 
112 >> S*y - c
113 
114 ans =
115 
116          0
117          0
118          0
119     0.3723
120    -2.0134
121 
122 >> norm(e)
123 
124 ans =
125 
126     2.0476
127 
128 >> f = S*y - c
129 
130 f =
131 
132          0
133          0
134          0
135     0.3723
136    -2.0134
137 
138 >> norm(f)
139 
140 ans =
141 
142     2.0476
143 
144 >> g = c(4)*c(4) + c(5)*c(5)
145 
146 g =
147 
148     4.1925
149 
150 >> sqrt(g)
151 
152 ans =
153 
154     2.0476


syntax highlighted by Code2HTML, v. 0.8.8b

4.4  Solving Least Squares: The QR Approach:

Now we will solve the same least squares problem using a QR decomposition of the matrix A. The help qr command in Matlab gives the following information:
>> help qr

 QR     Orthogonal-triangular decomposition.
    [Q,R] = QR(A) produces an upper triangular matrix R of the same
    dimension as A and a unitary matrix Q so that A = Q*R.
 
    [Q,R,E] = QR(A) produces a permutation matrix E, an upper
    triangular R and a unitary Q so that A*E = Q*R.  The column
    permutation E is chosen so that abs(diag(R)) is decreasing.
 
    [Q,R] = QR(A,0) produces the "economy size" decomposition.
    If A is m-by-n with m > n, then only the first n columns of Q
    are computed.
 
    [Q,R,E] = QR(A,0) produces an "economy size" decomposition in
    which E is a permutation vector, so that Q*R = A(:,E).  The column
    permutation E is chosen so that abs(diag(R)) is decreasing. 
 
    By itself, QR(A) returns the output of LINPACK'S ZQRDC routine.
    TRIU(QR(A)) is R.
 
    For sparse matrices, QR can compute a "Q-less QR decomposition",
    which has the following slightly different behavior.
 
    R = QR(A) returns only R.  Note that R = chol(A'*A).
    [Q,R] = QR(A) returns both Q and R, but Q is often nearly full.
    [C,R] = QR(A,B), where B has as many rows as A, returns C = Q'*B.
    R = QR(A,0) and [C,R] = QR(A,B,0) produce economy size results.
 
    The sparse version of QR does not do column permutations.
    The full version of QR does not return C.
 
    The least squares approximate solution to A*x = b can be found
    with the Q-less QR decomposition and one step of iterative refinement:
 
        x = R\(R'\(A'*b))
        r = b - A*x
        e = R\(R'\(A'*r))
        x = x + e;
 
    See also LU, NULL, ORTH, QRDELETE, QRINSERT, QRUPDATE.

4.5  The Theory:

In class, we discussed the QR decomposition for a square matrix. A similar discussion can show that for a no square matrix A of size n × m with n bigger than m, we can still say A = QR but now Q is n × n and R is upper triangular of size n × m. For for a 5 × 3 matrix A we would have Q is a nice 5 × 5 orthogonal matrix and R would have the block form
R =
U11 U12 U13
0 U22 U23
0 0 U33
0 0 0
0 0 0

with U an upper triangular 3× 3 matrix.

Now,
||Ax -b||2 = ||QRx -b||2
  = ||QRx - QQTb||2
= ||Rx - QTb||2

Then, letting c be QT b, we see
Rx =
Ux first m components
0's rest are zero

and
c =
c1
c2
···
cm
0
···
0

Let the first m components of c be the product cU. Then
||Ax - b||2 =
||
Ux - cu
cm+1
···
cn
0
···
0
||2

We can see that this error is minimal if x is chosen to solve the upper triangular system Ux = cU and that the squared residual error will be
sum j=m+1 to n cj2

Note that we are assuming our upper triangular part U is non singular!

4.5.1  Plugging It Into Matlab:

Here is our Matlab script: we will use our simple upper triangular solver from the last time here also.

qr.prompts
 1 >> [Q,R] = qr(A)
 2 
 3 Q =
 4 
 5    -0.1796    0.2185   -0.0538   -0.9271   -0.2401
 6    -0.3592   -0.1122   -0.2088    0.2777   -0.8589
 7     0.1796    0.6968   -0.6749    0.1561    0.0484
 8    -0.7184    0.5079    0.3902    0.1847    0.1990
 9    -0.5388   -0.4429   -0.5881   -0.0703    0.4034
10 
11 
12 R =
13 
14    -5.5678    1.0776   -3.2329
15          0    5.4625    0.8208
16          0         0   -4.4581
17          0         0         0
18          0         0         0
19 
20 >> U = R(1:3,1:3)
21 
22 U =
23 
24    -5.5678    1.0776   -3.2329
25          0    5.4625    0.8208
26          0         0   -4.4581
27 
28 >> c = Q' * b
29 
30 c =
31 
32    -8.2618
33    -2.2145
34    -2.9403
35    -0.3516
36     2.0172
37 
38 >> d = c(1:3)
39 
40 d =
41 
42    -8.2618
43    -2.2145
44    -2.9403
45 
46 >> cd Source
47 >> x = UTriSol(U,d)
48 
49 x =
50 
51     1.0033
52    -0.5045
53     0.6595
54 
55 >> e = A*x - b
56 
57 e =
58 
59     0.1583
60     1.8301
61    -0.0427
62    -0.3364
63    -0.8385
64 
65 >> norm(e)
66 
67 ans =
68 
69     2.0476
70 
71 >> f = R*x - c
72 
73 f =
74 
75          0
76          0
77          0
78     0.3516
79    -2.0172
80 
81 >> norm(f)
82 
83 ans =
84 
85     2.0476


syntax highlighted by Code2HTML, v. 0.8.8b
Previous Contents Next