Normalize a feature in this table

bjd2385 · Jun 9, 2015 · Viewed 29.5k times

This has become quite a frustrating question, but I've asked in the Coursera discussions and they won't help. Below is the question:

[Image: screenshot of the quiz question]

I've gotten it wrong 6 times now. How do I normalize the feature? Hints are all I'm asking for.

I'm assuming x_2^(2) is the value 5184, unless I'm supposed to add the x_0 column of 1's, which the question doesn't mention but which he certainly mentions in the lectures when talking about creating the design matrix X. In that case x_2^(2) would be the value 72. Assuming one or the other is right (I'm playing a guessing game), what should I use to normalize it? He talks about 3 different ways to normalize in the lectures: one using the maximum value, another using the range (the difference between max and min), and another using the standard deviation. They want an answer correct to the hundredths. Which one am I to use? This is so confusing.

Answer

smci · Jun 9, 2015

...use both feature scaling (dividing by the "max-min", or range, of a feature) and mean normalization.

So for any individual feature f:

f_norm = (f - f_mean) / (f_max - f_min)

e.g. for x2, (midterm exam)^2 = {7921, 5184, 8836, 4761}

> x2 <- c(7921, 5184, 8836, 4761)
> mean(x2)
[1] 6675.5
> max(x2) - min(x2)
[1] 4075
> round((x2 - mean(x2)) / (max(x2) - min(x2)), 3)
[1]  0.306 -0.366  0.530 -0.470

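Hence norm(5184) = -0.366, i.e. -0.37 to the hundredths. Note the sign: 5184 is below the mean, so the normalized value is negative.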

(using R language, which is great at vectorizing expressions like this)
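
Since the question mentions three candidate denominators (max, range, standard deviation), here is a minimal sketch that puts them side by side. The helper name mean_normalize and its by argument are made up for illustration, and subtracting the mean in all three variants is my assumption; only the "range" variant is what the quoted quiz text asks for.

```r
# Hypothetical helper (not a builtin): subtract the mean, then divide by
# a selectable denominator.
mean_normalize <- function(f, by = c("range", "max", "sd")) {
  by <- match.arg(by)
  denom <- switch(by,
    range = max(f) - min(f),  # feature scaling by the range
    max   = max(f),           # scaling by the maximum value
    sd    = sd(f)             # scaling by the sample standard deviation
  )
  (f - mean(f)) / denom
}

x2 <- c(7921, 5184, 8836, 4761)

# Range version, rounded to the hundredths: 0.31 -0.37 0.53 -0.47
round(mean_normalize(x2, "range"), 2)

# The other two denominators give different numbers for the same feature.
round(mean_normalize(x2, "max"), 2)
round(mean_normalize(x2, "sd"), 2)
```

The second element of each result is the normalized value of 5184 under that choice of denominator.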

I agree it's confusing they used the notation x_2^(2) to mean x_2^(norm) or x_2'


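EDIT: in practice most people call the builtin scale(...) function, which does the same kind of centering and scaling. By default it divides by the standard deviation rather than the range, so pass center and scale explicitly if you want to reproduce the numbers above.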
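
One caveat worth spelling out: with its default arguments, scale() centers on the mean and divides by the sample standard deviation, not by max - min. A short sketch of both calls, assuming the same x2 vector as above (the variable names z_default and z_range are just for illustration):

```r
x2 <- c(7921, 5184, 8836, 4761)

# Default behaviour: (x - mean(x)) / sd(x)
z_default <- scale(x2)

# Explicit center and scale reproduce (f - f_mean) / (f_max - f_min)
z_range <- scale(x2, center = mean(x2), scale = max(x2) - min(x2))

# scale() returns a one-column matrix with attributes; as.vector() drops them.
# Rounded to 3 places the values are 0.306, -0.366, 0.530, -0.470.
round(as.vector(z_range), 3)
```

So scale() only matches the formula above when the range is passed in as the scale argument.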