I'm in the second week of Professor Andrew Ng's Machine Learning course through Coursera. We're working on linear regression and right now I'm dealing with coding the cost function.
The code I've written solves the problem correctly but does not pass the submission process and fails the unit test because I have hard coded the values of theta and not allowed for more than two values for theta.
Here's the code I've got so far:
function J = computeCost(X, y, theta)
m = length(y);
J = 0;
for i = 1:m,
h = theta(1) + theta(2) * X(i)
a = h - y(i);
b = a^2;
J = J + b;
end;
J = J * (1 / (2 * m));
end
The unit test is
computeCost( [1 2 3; 1 3 4; 1 4 5; 1 5 6], [7;6;5;4], [0.1;0.2;0.3])
and should produce ans = 7.0175
So I need to add another for loop to iterate over theta, therefore allowing for any number of values for theta, but I'll be damned if I can wrap my head around how/where.
Can anyone suggest a way I can allow for any number of values for theta within this function?
If you need more information to understand what I'm trying to ask, I will try my best to provide it.
You can vectorize the operations in Octave/Matlab. Iterating over an entire vector element by element is a really bad idea if your programming language lets you vectorize operations. R, Octave, Matlab, and Python (numpy) all support this. For example, if theta = (t0, t1, t2, t3) and X = (x0, x1, x2, x3), you can get their scalar product like this:
theta * X' = (t0, t1, t2, t3) * (x0, x1, x2, x3)' = t0*x0 + t1*x1 + t2*x2 + t3*x3
The result is a scalar.
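To make the scalar product concrete, here is a small sketch that compares the vectorized form with an explicit loop. The numbers and the names x, p_vec, and p_loop are made up for illustration only; they are not from the course material.

```octave
% Hypothetical values, chosen only to illustrate the scalar product
theta = [0.1 0.2 0.3 0.4];   % row vector (t0, t1, t2, t3)
x     = [1 2 3 4];           % row vector (x0, x1, x2, x3)

% Vectorized: a 1x4 row times a 4x1 column gives a single number
p_vec = theta * x';

% The same sum written as an explicit loop
p_loop = 0;
for k = 1:numel(theta)
  p_loop = p_loop + theta(k) * x(k);
end
```

Both p_vec and p_loop hold the same value here (3, up to floating-point rounding), but the vectorized line works unchanged for a vector of any length.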
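For example, you can vectorize h in your code as follows:
H = (theta'*X')';          % m x 1 vector of predictions, same as X*theta
S = sum((H - y) .^ 2);     % sum of squared errors over all examples
J = S / (2*m);             % m = length(y), as in your function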
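Putting those three lines back into the function gives something like the sketch below. It assumes X is an m x n matrix whose rows are examples (including the leading column of ones), y is m x 1, and theta is n x 1, which matches the shapes in the unit test from the question. The second function, computeCostLoop, is a hypothetical name used here only to show where the extra inner loop over theta would go if you wanted to keep the loop-based style.

```octave
function J = computeCost(X, y, theta)
  % Vectorized cost: works for any number of columns in X / entries in theta
  m = length(y);                     % number of training examples
  H = X * theta;                     % m x 1 predictions, same as (theta' * X')'
  J = sum((H - y) .^ 2) / (2 * m);   % halved mean of squared errors
end

function J = computeCostLoop(X, y, theta)
  % Loop-based equivalent (hypothetical helper, for comparison only)
  m = length(y);
  J = 0;
  for i = 1:m
    h = 0;
    for j = 1:length(theta)          % inner loop over every theta value
      h = h + theta(j) * X(i, j);    % index X by row i and column j
    end
    J = J + (h - y(i))^2;
  end
  J = J / (2 * m);
end
```

Calling computeCost([1 2 3; 1 3 4; 1 4 5; 1 5 6], [7;6;5;4], [0.1;0.2;0.3]) gives 7.0175, the value the unit test expects: the predictions are 1.4, 1.9, 2.4, and 2.9, the squared errors sum to 56.14, and 56.14 / 8 = 7.0175. Note that the loop version indexes X(i, j) with both a row and a column; with a single index, X(i) uses linear (column-major) indexing and for i = 1:m only ever reads the first column.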