Chapter 4 Solving Least Squares Systems: SVD Approach
One way to solve overdetermined systems is to use the
Singular Value Decomposition of a matrix. Suppose that
a matrix A is given that has more rows than columns, ie
n, the number of rows, is larger than m, the number of columns.
We have given a data vector b of size n and we wish to find a
vector x which minimizes the standard euclidean norm of the error
4.1 The Singular Value Decomposition:
Now the theory behind this goes like this: as discussed in class, the
singular value decomposition of the matrix A represents A as the
following matrix product:
This decomposition uses some things that come out of the
square symmetric matrix ATA which is m × m. We let
the eigenvalues of this matrix be denoted by
{l1, ..., lm }. We
know these eigenvalues are nonnegative and we can
organize them in descending fashion from the largest
l1 to the smallest positive one, lr with
the remaining eigenvalues being 0. We call the square roots
of these eigenvalues the singular values of A;
the singular values are denoted by the symbol si and
we know that
s1 ³ s2 ³ ... ³ sr
> 0 with the rest actually 0. We also know that the
eigenvectors of ATA form an orthonormal basis for Âm
which we label
{u1, ..., , um }
then using the notation above, these matrices can be shown to
have the following form:
where the columns of V give an orthonormal basis for Ân
with the columns Vr+1 through Vn chosen to augment
the first r linearly independent columns.
The matrix F is n × m and has a block form
with an r × r diagonal matrix in the upper left
hand corner and the rest zeros. The diagonal matrix has the
singular values on its main diagonal.
| F |
= |
|
|
| r by r diagonal of singular values |
r by m zero matrix |
| m-r by m zero matrix |
|
| n-m by m zero matrix |
|
|
|
So, for example, a 5 × 3 matrix A with r=2 would have an
associated F matrix which looks like this:
| F |
= |
|
|
| s1 |
0 |
0 |
| 0 |
s2 |
0 |
| 0 |
0 |
0 |
| 0 |
0 |
0 |
| 0 |
0 |
0 |
|
|
while if r=3 we would see
| F |
= |
|
|
| s1 |
0 |
0 |
| 0 |
s2 |
0 |
| 0 |
0 |
s3 |
| 0 |
0 |
0 |
| 0 |
0 |
0 |
|
|
Note that our theory tells us that r must be less than or equal m.
Finally, the matrix U simply consists of the eigenvectors of ATA.
4.2 Using the SVD of A For Least Squares:
Now in class we figured out how to use the SVD of A to solve a
Least Squares problem. We do the following steps:
-
Compute c = VT × b
- Compute
where ar+1 through am are arbitrarily chosen.
- Compute x = U × y
This solution x has the property that
| ||Ax -b||22 |
= |
||Fy -c||22 = sum j=r+1 to m
cj2 |
4.3 Doing It In Matlab!:
Now in Matlab, we can find out about the SVD of A by
asking for help:
>> help svd
SVD Singular value decomposition.
[U,S,V] = SVD(X) produces a diagonal matrix S, of the same
dimension as X and with nonnegative diagonal elements in
decreasing order, and unitary matrices U and V so that
X = U*S*V'.
S = SVD(X) returns a vector containing the singular values.
[U,S,V] = SVD(X,0) produces the "economy size"
decomposition. If X is m-by-n with m > n, then only the
first n columns of U are computed and S is n-by-n.
See also SVDS, GSVD.
Overloaded methods
help sym/svd.m
So if we want the SVD of the matrix A, we would type
[U,S,V] = svd(A)
We also want to find norms of vectors, so we again ask for help:
>> help norm
NORM Matrix or vector norm.
For matrices...
NORM(X) is the largest singular value of X, max(svd(X)).
NORM(X,2) is the same as NORM(X).
NORM(X,1) is the 1-norm of X, the largest column sum,
= max(sum(abs((X)))).
NORM(X,inf) is the infinity norm of X, the largest row sum,
= max(sum(abs((X')))).
NORM(X,'fro') is the Frobenius norm, sqrt(sum(diag(X'*X))).
NORM(X,P) is available for matrix X only
if P is 1, 2, inf or 'fro'.
For vectors...
NORM(V,P) = sum(abs(V).^P)^(1/P).
NORM(V) = norm(V,2).
NORM(V,inf) = max(abs(V)).
NORM(V,-inf) = min(abs(V)).
See also COND, CONDEST, NORMEST.
Overloaded methods
help ss/norm.m
help lti/norm.m
help frd/norm.m
Hence, for us, the euclidean norm of a vector x is
simply
norm(x)
4.3.1 Our MatLab Session:
In Matlab, we thus have to translate a bit.
Our U is the Matlab V, our F is the Matlab S and
our V is the Matlab U. Ughh!!! Here is our session:
svd.prompts
1 >> A = [1 1 1; 2 -1 2; -1 4 3; 4 2 1; 3 -3 4]
2
3 A =
4
5 1 1 1
6 2 -1 2
7 -1 4 3
8 4 2 1
9 3 -3 4
10
11 >> b = [1; 2; -1; 4; 8]
12
13 b =
14
15 1
16 2
17 -1
18 4
19 8
20
21 >> S = svd(A)
22
23 S =
24
25 7.0494
26 5.6213
27 3.4216
28
29 >> [U,S,V] = svd(A);
30 >> U
31
32 U =
33
34 -0.1695 0.2172 0.0837 -0.9246 -0.2496
35 -0.4209 -0.0717 -0.0540 0.2865 -0.8560
36 -0.0772 0.8502 -0.4944 0.1556 0.0500
37 -0.4452 0.3633 0.7721 0.1826 0.2009
38 -0.7681 -0.3047 -0.3867 -0.0745 0.4027
39
40 >> S
41
42 S =
43
44 7.0494 0 0
45 0 5.6213 0
46 0 0 3.4216
47 0 0 0
48 0 0 0
49
50 >> V
51
52 V =
53
54 -0.7120 -0.0422 0.7009
55 0.1924 0.9483 0.2526
56 -0.6753 0.3147 -0.6670
57
58 >> V'
59
60 ans =
61
62 -0.7120 0.1924 -0.6753
63 -0.0422 0.9483 0.3147
64 0.7009 0.2526 -0.6670
65
66 >> U*S*V'
67
68 ans =
69
70 1.0000 1.0000 1.0000
71 2.0000 -1.0000 2.0000
72 -1.0000 4.0000 3.0000
73 4.0000 2.0000 1.0000
74 3.0000 -3.0000 4.0000
75
76 >> c = U'*b
77
78 c =
79
80 -8.8595
81 -1.7607
82 0.4648
83 -0.3723
84 2.0134
85
86 >> y = [c(1)/S(1,1); c(2)/S(2,2); c(3)/S(3,3)]
87
88 y =
89
90 -1.2568
91 -0.3132
92 0.1358
93
94 >> x = V*y
95
96 x =
97
98 1.0033
99 -0.5045
100 0.6595
101
102 >> e = A*x - b
103
104 e =
105
106 0.1583
107 1.8301
108 -0.0427
109 -0.3364
110 -0.8385
111
112 >> S*y - c
113
114 ans =
115
116 0
117 0
118 0
119 0.3723
120 -2.0134
121
122 >> norm(e)
123
124 ans =
125
126 2.0476
127
128 >> f = S*y - c
129
130 f =
131
132 0
133 0
134 0
135 0.3723
136 -2.0134
137
138 >> norm(f)
139
140 ans =
141
142 2.0476
143
144 >> g = c(4)*c(4) + c(5)*c(5)
145
146 g =
147
148 4.1925
149
150 >> sqrt(g)
151
152 ans =
153
154 2.0476
syntax highlighted by Code2HTML, v. 0.8.8b
4.4 Solving Least Squares: The QR Approach:
Now we will solve the same least squares problem using a
QR decomposition of the matrix A. The help qr
command in Matlab gives the following information:
>> help qr
QR Orthogonal-triangular decomposition.
[Q,R] = QR(A) produces an upper triangular matrix R of the same
dimension as A and a unitary matrix Q so that A = Q*R.
[Q,R,E] = QR(A) produces a permutation matrix E, an upper
triangular R and a unitary Q so that A*E = Q*R. The column
permutation E is chosen so that abs(diag(R)) is decreasing.
[Q,R] = QR(A,0) produces the "economy size" decomposition.
If A is m-by-n with m > n, then only the first n columns of Q
are computed.
[Q,R,E] = QR(A,0) produces an "economy size" decomposition in
which E is a permutation vector, so that Q*R = A(:,E). The column
permutation E is chosen so that abs(diag(R)) is decreasing.
By itself, QR(A) returns the output of LINPACK'S ZQRDC routine.
TRIU(QR(A)) is R.
For sparse matrices, QR can compute a "Q-less QR decomposition",
which has the following slightly different behavior.
R = QR(A) returns only R. Note that R = chol(A'*A).
[Q,R] = QR(A) returns both Q and R, but Q is often nearly full.
[C,R] = QR(A,B), where B has as many rows as A, returns C = Q'*B.
R = QR(A,0) and [C,R] = QR(A,B,0) produce economy size results.
The sparse version of QR does not do column permutations.
The full version of QR does not return C.
The least squares approximate solution to A*x = b can be found
with the Q-less QR decomposition and one step of iterative refinement:
x = R\(R'\(A'*b))
r = b - A*x
e = R\(R'\(A'*r))
x = x + e;
See also LU, NULL, ORTH, QRDELETE, QRINSERT, QRUPDATE.
4.5 The Theory:
In class, we discussed the QR decomposition for
a square matrix. A similar discussion can show that for a
no square matrix A of size n × m with n bigger than
m, we can still say A = QR but now
Q is n × n and R is upper triangular of size n × m.
For for a 5 × 3 matrix A we would have Q is a nice
5 × 5 orthogonal matrix and R would have the block form
| R |
= |
|
|
| U11 |
U12 |
U13 |
| 0 |
U22 |
U23 |
| 0 |
0 |
U33 |
| 0 |
0 |
0 |
| 0 |
0 |
0 |
|
|
with U an upper triangular 3× 3 matrix.
Now,
| ||Ax -b||2 |
= |
||QRx -b||2 |
| |
= |
||QRx - QQTb||2 |
| |
= |
||Rx - QTb||2 |
Then, letting c be QT b, we see
| Rx |
= |
|
|
| Ux |
first m components |
| 0's |
rest are zero |
|
|
and
Let the first m components of c be the product cU. Then
| ||Ax - b||2 |
= |
| ||
|
| Ux - cu |
| cm+1 |
| ··· |
| cn |
| 0 |
| ··· |
| 0 |
|
||2 |
|
We can see that this error is minimal if x is chosen to solve
the upper triangular system Ux = cU and that the squared residual error will
be
Note that we are assuming our upper triangular part U is non singular!
4.5.1 Plugging It Into Matlab:
Here is our Matlab script: we will use our simple
upper triangular solver from the last time here also.
qr.prompts
1 >> [Q,R] = qr(A)
2
3 Q =
4
5 -0.1796 0.2185 -0.0538 -0.9271 -0.2401
6 -0.3592 -0.1122 -0.2088 0.2777 -0.8589
7 0.1796 0.6968 -0.6749 0.1561 0.0484
8 -0.7184 0.5079 0.3902 0.1847 0.1990
9 -0.5388 -0.4429 -0.5881 -0.0703 0.4034
10
11
12 R =
13
14 -5.5678 1.0776 -3.2329
15 0 5.4625 0.8208
16 0 0 -4.4581
17 0 0 0
18 0 0 0
19
20 >> U = R(1:3,1:3)
21
22 U =
23
24 -5.5678 1.0776 -3.2329
25 0 5.4625 0.8208
26 0 0 -4.4581
27
28 >> c = Q' * b
29
30 c =
31
32 -8.2618
33 -2.2145
34 -2.9403
35 -0.3516
36 2.0172
37
38 >> d = c(1:3)
39
40 d =
41
42 -8.2618
43 -2.2145
44 -2.9403
45
46 >> cd Source
47 >> x = UTriSol(U,d)
48
49 x =
50
51 1.0033
52 -0.5045
53 0.6595
54
55 >> e = A*x - b
56
57 e =
58
59 0.1583
60 1.8301
61 -0.0427
62 -0.3364
63 -0.8385
64
65 >> norm(e)
66
67 ans =
68
69 2.0476
70
71 >> f = R*x - c
72
73 f =
74
75 0
76 0
77 0
78 0.3516
79 -2.0172
80
81 >> norm(f)
82
83 ans =
84
85 2.0476
syntax highlighted by Code2HTML, v. 0.8.8b