What is the BigO of linear regression?

BCS picture BCS · Dec 23, 2009 · Viewed 12.6k times · Source

How large a system is it reasonable to attempt to do a linear regression on?

Specifically: I have a system with ~300K sample points and ~1200 linear terms. Is this computationally feasible?

Answer

Emiller picture Emiller · Dec 3, 2010

The linear regression is computed as (X'X)^-1 X'Y.

If X is an (n x k) matrix:

  1. (X' X) takes O(n*k^2) time and produces a (k x k) matrix

  2. The matrix inversion of a (k x k) matrix takes O(k^3) time

  3. (X' Y) takes O(n*k^2) time and produces a (k x k) matrix

  4. The final matrix multiplication of two (k x k) matrices takes O(k^3) time

So the Big-O running time is O(k^2*(n + k)).

See also: http://en.wikipedia.org/wiki/Computational_complexity_of_mathematical_operations#Matrix_algebra

If you get fancy it looks like you can get the time down to O(k^2*(n+k^0.376)) with the Coppersmith–Winograd algorithm.