Sunday, April 17, 2016

My example shows SVD is less numerically stable than QR decomposition


I asked this question on Math StackExchange, but it didn't seem to get enough attention there, so I am asking it here: http://math.stackexchange.com/questions/1729946/why-do-we-say-svd-can-handle-singular-matrx-when-doing-least-square-comparison?noredirect=1#comment3530971_1729946

I learned from some tutorials that SVD should be more stable than QR decomposition when solving a least-squares problem, and that it is able to handle singular matrices. But the following example I wrote in MATLAB seems to support the opposite conclusion. I don't have a deep understanding of SVD, so if you could look at my questions in the old post on Math StackExchange and explain this to me, I would appreciate it a lot.

I use a matrix that has a large condition number (about 1e+13). The result shows SVD getting a much larger error (0.8) than QR (about 1e-27).

% we do a linear regression between Y and X
data = [ 47.667483331 -122.1070832; 47.667483331001 -122.1070832 ];
X = data(:,1);
Y = data(:,2);
X_1 = [ones(length(X),1), X];

%% SVD method
[U,D,V] = svd(X_1,'econ');
beta_svd = V*diag(1./diag(D))*U'*Y;

%% QR method (one can also use the "\" operator here, which gives the same result
%% as I tested; I just wrote out the backward substitution to educate myself)
[Q,R] = qr(X_1);
% now do backward substitution
[nr, nc] = size(R);
beta_qr = [];
Y_1 = Q'*Y;
for i = nc:-1:1
    s = Y_1(i);
    for j = nc:-1:i+1   % loop over the columns to the right of the diagonal
        s = s - R(i,j)*beta_qr(j);
    end
    beta_qr(i) = s/R(i,i);
end

svd_error = 0;
qr_error = 0;
for i = 1:length(X)
    svd_error = svd_error + (Y(i) - beta_svd(1) - beta_svd(2)*X(i))^2;
    qr_error  = qr_error  + (Y(i) - beta_qr(1)  - beta_qr(2)*X(i))^2;
end

3 Answers

Answer 1

Your SVD-based approach is basically the same as the pinv function in MATLAB (see Pseudo-inverse and SVD). What you are missing, though (for numerical reasons), is a tolerance value such that any singular values less than this tolerance are treated as zero.

If you refer to edit pinv.m, you can see something like the following (I won't post the exact code here because the file is copyrighted to MathWorks):

[U,S,V] = svd(A,'econ');
s = diag(S);
tol = max(size(A)) * eps(norm(s,inf));
% .. use above tolerance to truncate singular values
invS = diag(1./s);
out = V*invS*U';

In fact pinv has a second syntax where you can explicitly specify the tolerance value, pinv(A,tol), if the default one is not suitable...
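To make that concrete, here is a minimal sketch (not the MathWorks source) of applying such a tolerance to the least-squares problem from the question; X_1 and Y are the asker's variables, while r and beta_trunc are names I introduced for illustration:

% Truncated-SVD least-squares solve (sketch, not the actual pinv code)
[U,S,V] = svd(X_1,'econ');
s = diag(S);
tol = max(size(X_1)) * eps(norm(s,inf));  % tolerance in the style of pinv's default
r = sum(s > tol);                         % numerical rank
s_inv = zeros(size(s));
s_inv(1:r) = 1./s(1:r);                   % keep reciprocals only for the large singular values
beta_trunc = V*diag(s_inv)*U'*Y;          % minimum-norm least-squares solution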


So when solving a least-squares problem of the form minimize norm(A*x-b), you should understand that the pinv and mldivide solutions have different properties:

  • x = pinv(A)*b is characterized by the fact that norm(x) is smaller than the norm of any other solution.
  • x = A\b has the fewest possible nonzero components (i.e., it is sparse).

Using your example (note that rcond(A) is very small, near machine epsilon):

data = [
    47.667483331    -122.1070832;
    47.667483331001 -122.1070832
];
A = [ones(size(data,1),1), data(:,1)];
b = data(:,2);

Let's compare the two solutions:

x1 = A\b; x2 = pinv(A)*b; 

First you can see how mldivide returns a solution x1 with one zero component (this is obviously a valid solution, because both equations can be satisfied by setting the slope to zero, as in b + a*0 = b):

>> sol = [x1 x2]
sol =
 -122.1071   -0.0537
        0   -2.5605

Next you see how pinv returns a solution x2 with a smaller norm:

>> nrm = [norm(x1) norm(x2)]
nrm =
  122.1071    2.5611

Here is the residual error of both solutions, which is acceptably small:

>> err = [norm(A*x1-b) norm(A*x2-b)]
err =
   1.0e-11 *
        0    0.1819

Note that using mldivide, linsolve, or qr will give pretty much the same results:

>> x3 = linsolve(A,b)
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 2.159326e-16.
x3 =
 -122.1071
        0

>> [Q,R] = qr(A); x4 = R\(Q'*b)
x4 =
 -122.1071
        0

Answer 2

SVD can handle rank-deficiency. In your code, the diagonal matrix D has a near-zero element, so you need to use the pseudoinverse for SVD, i.e. set the 2nd element of 1./diag(D) to 0 rather than to the huge value (~10^14). You should find that SVD and QR have equally good accuracy in your example. For more information, see this document: http://www.cs.princeton.edu/courses/archive/fall11/cos323/notes/cos323_f11_lecture09_svd.pdf
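As an illustration (my own sketch, not part of the original answer), the fix to the question's beta_svd line could look like the following; the hard-coded index 2 is specific to this 2x2 example, and in general each singular value should be compared against a tolerance:

[U,D,V] = svd(X_1,'econ');
d_inv = 1./diag(D);   % the 2nd entry blows up (~1e14) because D(2,2) is nearly zero
d_inv(2) = 0;         % treat the near-zero singular value as exactly zero
beta_svd = V*diag(d_inv)*U'*Y;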

Answer 3

Try this SVD variant called block SVD: you just set the number of iterations according to the accuracy you want, and usually 1 is enough. If you want all the factors (the code below has a default number selected for factor reduction), then edit the k = line to the size of the matrix, if I recall my MATLAB correctly.

A = randn(100,5000);
A = corr(A);               % A is your correlation matrix
tic
k = 1000;                  % number of factors to extract
bsize = k + 50;
block = randn(size(A,2),bsize);
iter = 2;                  % could set via tolerance

[block,R] = qr(A*block,0);
for i = 1:iter
    [block,R] = qr(A*(A'*block),0);
end
M = block'*A;

% Economy size dense SVD.
[U,S] = svd(M,0);
U = block*U(:,1:k);
S = S(1:k,1:k);

% Note SVD of a symmetric matrix is:
% A = U*S*U' since V=U in this case, S=eigenvalues, U=eigenvectors
V = real(U*sqrt(S));       % scaling matrix for simulation
toc

% reduced randomized matrix for simulation
sims = 2000;
randnums = randn(k,sims);
corrrandnums = V*randnums;
est_corr_matrix = corr(corrrandnums');
total_corrmatrix_difference = sum(sum(est_corr_matrix - A))

