Here's my code:
def factorize(n):
sieve = [True] * (n + 1)
for x in range(2, int(len(sieve) ** 0.5) + 1):
if sieve[x]:
for i in range(x + x, len(sieve), x):
sieve[i] = False
lowerPrimes = i for i in range(2, len(sieve)) if sieve[i]] and (n % i == 0)]
return lowerPrimes
factorize(n)
returns all prime factors of the given value n
. As you can see, it first makes an Eratosthenes sieve for n
and then uses a list comprehension to return all values in the sieve that are factors of n
. It works relatively fine for this purpose, however, I want it to return a list so that if you multiply every item in it, the result is n
. Do you get my point?
For example, factorize(99020)
returns [2, 5, 4951]
, but I'd like it to return [2, 2, 5, 4951]
, as 2*2*5*4951 = 99020
.
I know my approach is not even close, but could you help me to make it so?
The code in the answer by Blender is very nice, but that algorithm is lacking in one very important respect: it tests way too much. E.g. trying to factorize n=392798360393
, which is a prime number, it will try to divide it by all the numbers below it (including itself). This will take a lot of time.
Is it really necessary? If n == a*b
and a < b
, having found a
we don't really need to test divide n
by b
. We know that it too divides n
, because n/a == b
implies n/b == a
. So we only need to test while the potential factor a
is smaller than (or equal to) the potential factor b
. That is to say, until the square root of the number is reached.
Also, instead of starting over from 2 for each reduced n
it's OK to start from the previous value of i
:
def factors(n): # (cf. https://stackoverflow.com/a/15703327/849891)
j = 2
while n > 1:
for i in xrange(j, int(sqrt(n+0.05)) + 1):
if n % i == 0:
n /= i ; j = i
yield i
break
else:
if n > 1:
yield n; break
Indeed, factoring 9000009
by this code takes 0.08 seconds on Ideone, instead of 0.59 seconds.
This is guaranteed to produce only primes (because we divide out each factor found, and we try the candidates in non-decreasing order). The overhead of generating primes first and then testing only by the primes (a.o.t. testing by all numbers, as done here above) will be worth it if we are factoring several numbers at once; for factoring just one number it might not be worth it, depending on the speed of your prime generation.
But then, what really should be done when factoring several numbers at once, is to create the smallest factor sieve first where we mark each number in a given range by its smallest (prime) factor (from which it was generated by the sieve) instead of just by True
or False
as in the sieve of Eratosthenes. This smallest factor sieve is then used for efficient factorization of each of the numbers given, in the same range, by successive division by their factors, from the smallest up, which are efficiently found by a direct look-up in the sieve instead of testing anew:
def sfs_factorize(nums):
top = max(nums)
sv = smallest_factor_sieve(top)
for k in nums:
fs = [] ; n = k
while n > 1:
f = sv[n] # no testing
n /= f
fs.append(f)
print( k, list(fs))