I'm writing a bigger program and getting the determinants of 3x3 matrices as fast as possible is pretty important for it to work well. I've read that I could use numPy to do it, but I thought that maybe writing my own code would be more educational as I'm in my 3rd semester of CompSci.
So I wrote two functions and I'm using time.clock() (I'm on a win7 machine) to time how long it takes for each function to return a value.
This is the first function:
def dete(a):
x = (a[0][0] * a[1][1] * a[2][2]) + (a[1][0] * a[2][1] * a[3][2]) + (a[2][0] * a[3][1] * a[4][2])
y = (a[0][2] * a[1][1] * a[2][0]) + (a[1][2] * a[2][1] * a[3][0]) + (a[2][2] * a[3][1] * a[4][0])
return x - y
And this is the second function:
def det(a):
a.append(a[0]); a.append(a[1]);
x = 0
for i in range(0, len(a)-2):
y=1;
for j in range(0, len(a)-2):
y *= a[i+j][j]
x += y
p = 0
for i in range(0, len(a)-2):
y=1;
z = 0;
for j in range(2, -1, -1):
y *= a[i+z][j]
z+=1
z += 1
p += y
return x - p
They both give the correct answers, however the first one seems to be slightly faster, which makes me think that since for-loops are more elegant to use and usually faster, I'm doing something wrong - I made the loops too slow and fat. I tried trimming it down, but it seems that the *= and += operations are taking too much time, there's too many of them. I haven't checked how fast numPy takes care of this problem yet, but I wanna get better at writing efficient code. Any ideas on how to make those loops faster?
First let me note, that micro scale speed optimization should take place in another language. So you are better off using a library that employs c-written functionality.
About your for loops: It is a common technique to unroll (small) loops for speedup, so it is not always faster to let loops do the job. Usually it is just more generic (and most generic algorithms are in fact slower than specialized ones).
As stated in the comments it will not increase speed in python, when replacing -
with *
but it might increase speed if there are less arithmetic operations involved. Hence I will post the factored out term here:
def dete(a):
return (a[0][0] * (a[1][1] * a[2][2] - a[2][1] * a[1][2])
-a[1][0] * (a[0][1] * a[2][2] - a[2][1] * a[0][2])
+a[2][0] * (a[0][1] * a[1][2] - a[1][1] * a[0][2]))
As you can see there are 5 +/-
and 9 *
where as in the original version there are 5 +/-
and 12 *
. Also note that this version accesses a
only 15 times were the original one accessed it 18 times.
Summarized this yields 3 arithmetic operations and 3 variable accesses less than the fully multiplied version.