What does the power operator (**) in Python translate into?

Boyan Kushlev · Nov 17, 2015

In other words, what is behind the two asterisks? Does it simply multiply the number by itself x times, or does it do something else? As a follow-up question, is it better to write 2**3 or 2*2*2? I'm asking because I've heard that in C++ it's better not to use pow() for simple calculations, since it calls a function.

Answer

m01 · Nov 17, 2015

If you're interested in the internals, I'd disassemble a function that uses the operator with the dis module to see the CPython bytecode it maps to. Using Python 3:

>>> import dis
>>> def test():
...     return 2**3
...
>>> dis.dis(test)
  2           0 LOAD_CONST               3 (8)
              3 RETURN_VALUE

OK, so the compiler has done the calculation up front and stored the result, 8, as a constant. You get exactly the same CPython bytecode for 2*2*2 (feel free to try it). So, for expressions that evaluate to a constant, you get the same result and the choice doesn't matter.
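To check the constant case on your own interpreter without reading the disassembly by eye, here is a minimal sketch. The helper name has_binary_op is made up for this example, and it assumes that binary-operation opcodes start with BINARY_ (true for BINARY_POWER and BINARY_MULTIPLY on older releases and for BINARY_OP on newer ones):

```python
import dis


def has_binary_op(source):
    """Return True if the compiled expression still performs a binary
    operation at run time (hypothetical helper, for illustration only)."""
    code = compile(source, "<example>", "eval")
    return any(
        ins.opname.startswith("BINARY_")
        for ins in dis.get_instructions(code)
    )


# Constant expressions should be folded; an expression with a name cannot be.
for source in ("2**3", "2*2*2", "n**3"):
    status = "computed at run time" if has_binary_op(source) else "folded into a constant"
    print(source, "->", status)
```

The expression is only compiled, never evaluated, so the undefined name n in the last case is harmless.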

What if you want the power of a variable?

Now you get two different bits of bytecode:

>>> def test(n):
...     return n ** 3
...
>>> dis.dis(test)
  2           0 LOAD_FAST                0 (n)
              3 LOAD_CONST               1 (3)
              6 BINARY_POWER
              7 RETURN_VALUE

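vs. repeated multiplication (this example multiplies n by the constant 2 twice rather than computing n * n * n, but it exercises the same multiplication opcode):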

>>> def test(n):
...     return n * 2 * 2
...
>>> dis.dis(test)
  2           0 LOAD_FAST                0 (n)
              3 LOAD_CONST               1 (2)
              6 BINARY_MULTIPLY
              7 LOAD_CONST               1 (2)
             10 BINARY_MULTIPLY
             11 RETURN_VALUE
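The opcode names above come from an older Python 3 release. From CPython 3.11 onwards, both operators compile to a generic BINARY_OP instruction that carries the operator as its argument, so your output may look different. As a rough sketch, the following lists just the binary operations in each function. The function names power, multiply and binary_ops are invented for this example:

```python
import dis


def power(n):
    return n ** 3


def multiply(n):
    return n * n * n


def binary_ops(func):
    """Return (opname, argrepr) for each binary-operation instruction
    in func (hypothetical helper, for illustration only)."""
    return [
        (ins.opname, ins.argrepr)
        for ins in dis.get_instructions(func)
        if ins.opname.startswith("BINARY_")
    ]


print("power:   ", binary_ops(power))
print("multiply:", binary_ops(multiply))
```

Whatever the opcode names are on your version, the power function should show one binary operation and the multiplication function two.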

The question now, of course, is whether BINARY_MULTIPLY is quicker than BINARY_POWER.

The best way to check that is to use timeit. I'll use the IPython %timeit magic. Here's the output for multiplication:

%timeit test(100)
The slowest run took 15.52 times longer than the fastest. This could mean that an intermediate result is being cached 
10000000 loops, best of 3: 163 ns per loop

and for power:

%timeit test(100)
The slowest run took 5.44 times longer than the fastest. This could mean that an intermediate result is being cached 
1000000 loops, best of 3: 473 ns per loop

You may wish to repeat this for representative inputs, but empirically the multiplication looks roughly three times quicker here (163 ns vs. 473 ns per loop). Do note the caveat in the output about the variance between runs.
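If you'd rather not depend on IPython, the standard library's timeit module can run the same kind of comparison. This is only a sketch: the function names cube_pow, cube_mul and best_time and the sample inputs are assumptions for illustration, and the numbers you get will depend on your machine and Python version.

```python
import timeit


def cube_pow(n):
    return n ** 3


def cube_mul(n):
    return n * n * n


def best_time(func, arg, number=50_000, repeat=5):
    """Return the best per-call time in seconds over several runs
    (hypothetical helper, for illustration only)."""
    timer = timeit.Timer(lambda: func(arg))
    # Taking the minimum reduces the influence of noisy runs.
    return min(timer.repeat(repeat=repeat, number=number)) / number


if __name__ == "__main__":
    # Try a few kinds of input: small int, larger int, float.
    for n in (3, 100, 10**6, 2.5):
        t_pow = best_time(cube_pow, n)
        t_mul = best_time(cube_mul, n)
        print(f"n={n!r}: ** {t_pow * 1e9:.0f} ns, * {t_mul * 1e9:.0f} ns")
```

Both timings include the overhead of the lambda and the function call, so the difference between them is more meaningful than either absolute value.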

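If you want further internals, I'd suggest digging into the CPython source code. There, BINARY_POWER is handled by PyNumber_Power, which for int operands ends up in long_pow in Objects/longobject.c.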
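Seen from the Python side rather than the C side, the ** operator is dispatched to the left operand's __pow__ method (or the right operand's __rpow__ as a fallback), the same mechanism the built-in pow() and operator.pow use. A small sketch of that dispatch follows. The Mod7 class is a made-up toy type, not something from the question:

```python
import operator

# For built-in ints, these are all routes to the same operation.
print(2 ** 3, pow(2, 3), operator.pow(2, 3), (2).__pow__(3))


class Mod7:
    """Hypothetical toy type: integers modulo 7, used only to show
    that ** calls __pow__."""

    def __init__(self, value):
        self.value = value % 7

    def __pow__(self, exponent):
        # Invoked for Mod7(...) ** exponent.
        return Mod7(pow(self.value, exponent, 7))

    def __repr__(self):
        return f"Mod7({self.value})"


print(Mod7(3) ** 2)
```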