I asked this question on Math StackExchange, but it didn't seem to get enough attention there, so I am asking it here: http://math.stackexchange.com/questions/1729946/why-do-we-say-svd-can-handle-singular-matrx-when-doing-least-square-comparison?noredirect=1#comment3530971_1729946
I learned from some tutorials that SVD should be more stable than QR decomposition when solving least-squares problems, and that it is able to handle singular matrices. But the example below, which I wrote in MATLAB, seems to support the opposite conclusion. I don't have a deep understanding of SVD, so if you could look at my questions in the old post on Math StackExchange and explain them to me, I would appreciate it a lot.
I use a matrix that has a large condition number (e+13). The result shows SVD getting a much larger error (0.8) than QR (e-27).
% we do a linear regression between Y and X
data = [ 47.667483331    -122.1070832;
         47.667483331001 -122.1070832 ];
X = data(:,1);
Y = data(:,2);
X_1 = [ones(length(X),1), X];

%% SVD method
[U,D,V] = svd(X_1,'econ');
beta_svd = V*diag(1./diag(D))*U'*Y;

%% QR method (here one can also use the "\" operator, which gives the same result, as I tested;
% I just wrote out the backward substitution to educate myself)
[Q,R] = qr(X_1);
% now do backward substitution
[nr, nc] = size(R);
beta_qr = [];
Y_1 = Q'*Y;
for i = nc:-1:1
    s = Y_1(i);
    for j = nc:-1:i+1
        s = s - R(i,j)*beta_qr(j);
    end
    beta_qr(i) = s/R(i,i);
end

%% compare the squared residuals of the two fits
svd_error = 0;
qr_error = 0;
for i = 1:length(X)
    svd_error = svd_error + (Y(i) - beta_svd(1) - beta_svd(2)*X(i))^2;
    qr_error  = qr_error  + (Y(i) - beta_qr(1)  - beta_qr(2)*X(i))^2;
end
3 Answers
Answer 1
Your SVD-based approach is basically the same as the pinv function in MATLAB (see Pseudo-inverse and SVD). What you are missing, though (for numerical reasons), is a tolerance value such that any singular values less than this tolerance are treated as zero.
If you refer to edit pinv.m, you can see something like the following (I won't post the exact code here because the file is copyrighted by MathWorks):
[U,S,V] = svd(A,'econ');
s = diag(S);
tol = max(size(A)) * eps(norm(s,inf));
% .. use above tolerance to truncate singular values
invS = diag(1./s);
out = V*invS*U';
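For illustration only (this is not the MathWorks code), one way that tolerance might be applied is to invert only the singular values above tol and leave the rest at zero:

% hypothetical sketch: truncate before inverting (assumes U, s, V, tol from the snippet above)
r = sum(s > tol);                                 % numerical rank
invS = diag([1./s(1:r); zeros(numel(s)-r,1)]);    % small singular values are not inverted
out = V*invS*U';                                  % truncated pseudoinverse of A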
In fact, pinv has a second syntax where you can explicitly specify the tolerance value, pinv(A,tol), if the default one is not suitable...
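A quick sketch of that second syntax (A and b here stand for any coefficient matrix and right-hand side; the threshold 1e-8 is only an illustrative choice, not a recommendation):

tol = 1e-8;           % illustrative tolerance
x = pinv(A, tol)*b;   % singular values of A below tol are treated as zero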
So when solving a least-squares problem of the form minimize norm(A*x-b), you should understand that the pinv and mldivide solutions have different properties:
- x = pinv(A)*b is characterized by the fact that norm(x) is smaller than the norm of any other solution.
- x = A\b has the fewest possible nonzero components (i.e. it is sparse).
Using your example (note that rcond(A) is very small, near machine epsilon):
data = [ 47.667483331    -122.1070832;
         47.667483331001 -122.1070832 ];
A = [ones(size(data,1),1), data(:,1)];
b = data(:,2);
Let's compare the two solutions:
x1 = A\b; x2 = pinv(A)*b;
First you can see how mldivide returns a solution x1 with one zero component (this is obviously a valid solution, because you can solve both equations by multiplying by zero, as in b + a*0 = b):
>> sol = [x1 x2]
sol =
 -122.1071   -0.0537
         0   -2.5605
Next you see how pinv returns a solution x2 with a smaller norm:
>> nrm = [norm(x1) norm(x2)]
nrm =
  122.1071    2.5611
Here is the residual error of both solutions, which is acceptably small in both cases:
>> err = [norm(A*x1-b) norm(A*x2-b)]
err =
   1.0e-11 *
        0    0.1819
Note that using mldivide, linsolve, or qr will give pretty much the same results:
>> x3 = linsolve(A,b)
Warning: Matrix is close to singular or badly scaled. Results may be inaccurate. RCOND = 2.159326e-16.
x3 =
 -122.1071
         0
>> [Q,R] = qr(A); x4 = R\(Q'*b)
x4 =
 -122.1071
         0
Answer 2
SVD can handle rank deficiency. The diagonal matrix D has a near-zero element in your code, and you need to use the pseudoinverse for SVD, i.e. set the 2nd element of 1./diag(D) to 0 rather than the huge value (10^14). You should find that SVD and QR have equally good accuracy in your example. For more information, see this document: http://www.cs.princeton.edu/courses/archive/fall11/cos323/notes/cos323_f11_lecture09_svd.pdf
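A minimal sketch of what this answer describes, reusing X_1 and Y from the question's code; the tolerance rule is borrowed from the pinv-style cutoff in Answer 1 and is an assumption, not part of this answer:

[U,D,V] = svd(X_1,'econ');
d = diag(D);
tol = max(size(X_1)) * eps(max(d));   % pinv-style cutoff (assumed)
dinv = zeros(size(d));
dinv(d > tol) = 1./d(d > tol);        % zero out the near-zero singular value instead of inverting it
beta_svd_trunc = V*diag(dinv)*U'*Y;   % residual norm(X_1*beta_svd_trunc - Y) is now tiny, comparable to QR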
Answer 3
Try this SVD variant called block SVD: you just set the number of iterations according to the accuracy you want; usually 1 is enough. If you want all the factors (the code has a default number selected for factor reduction), then edit the k = ... line to the size of the matrix, if I recall my MATLAB correctly.
A = randn(100,5000);
A = corr(A);                  % A is your correlation matrix
tic
k = 1000;                     % number of factors to extract
bsize = k + 50;
block = randn(size(A,2),bsize);
iter = 2;                     % could set via tolerance
[block,R] = qr(A*block,0);
for i = 1:iter
    [block,R] = qr(A*(A'*block),0);
end
M = block'*A;
% Economy size dense SVD.
[U,S] = svd(M,0);
U = block*U(:,1:k);
S = S(1:k,1:k);
% Note SVD of a symmetric matrix is:
% A = U*S*U' since V=U in this case, S = eigenvalues, U = eigenvectors
V = real(U*sqrt(S));          % scaling matrix for simulation
toc
% reduced randomized matrix for simulation
sims = 2000;
randnums = randn(k,sims);
corrrandnums = V*randnums;
est_corr_matrix = corr(corrrandnums');
total_corrmatrix_difference = sum(sum(est_corr_matrix - A))