I'm implementing PCA using eigenvalue decomposition for sparse data. I know matlab has PCA implemented, but it helps me understand all the technicalities when I write code. I've been following the guidance from here, but I'm getting different results in comparison to built-in function princomp.
Could anybody look at it and point me in the right direction.
Here's the code:
function [mu, Ev, Val ] = pca(data)
% mu - mean image
% Ev - matrix whose columns are the eigenvectors corresponding to the eigen
% values Val
% Val - eigenvalues
if nargin ~= 1
error ('usage: [mu,E,Values] = pca_q1(data)');
end
mu = mean(data)';
nimages = size(data,2);
for i = 1:nimages
data(:,i) = data(:,i)-mu(i);
end
L = data'*data;
[Ev, Vals] = eig(L);
[Ev,Vals] = sort(Ev,Vals);
% computing eigenvector of the real covariance matrix
Ev = data * Ev;
Val = diag(Vals);
Vals = Vals / (nimages - 1);
% normalize Ev to unit length
proper = 0;
for i = 1:nimages
Ev(:,i) = Ev(:,1)/norm(Ev(:,i));
if Vals(i) < 0.00001
Ev(:,i) = zeros(size(Ev,1),1);
else
proper = proper+1;
end;
end;
Ev = Ev(:,1:nimages);
Here's how I would do it:
function [V newX D] = myPCA(X)
X = bsxfun(@minus, X, mean(X,1)); %# zero-center
C = (X'*X)./(size(X,1)-1); %'# cov(X)
[V D] = eig(C);
[D order] = sort(diag(D), 'descend'); %# sort cols high to low
V = V(:,order);
newX = X*V(:,1:end);
end
and an example to compare against the PRINCOMP function from the Statistics Toolbox:
load fisheriris
[V newX D] = myPCA(meas);
[PC newData Var] = princomp(meas);
You might also be interested in this related post about performing PCA by SVD.