Speed of cos() and sin() function in GLSL shaders?

ulmangt picture ulmangt · Apr 14, 2012 · Viewed 24.7k times · Source

I'm interested in information about the speed of sin() and cos() in Open GL Shader Language.

The GLSL Specification Document indicates that:

The built-in functions basically fall into three categories:

  • ...
  • ...
  • They represent an operation graphics hardware is likely to accelerate at some point. The trigonometry functions fall into this category.

EDIT:

As has been pointed out, counting clock cycles of individual operations like sin() and cos() doesn't really tell the whole performance story.

So to clarify my question, what I'm really interested in is whether it's worthwhile to optimize away sin() and cos() calls for common cases.

For example, in my application it'll be very common for the argument to be 0. So does something like this make sense:

float sina, cosa;

if ( rotation == 0 )
{
   sina = 0;
   cosa = 1;
}
else
{
   sina = sin( rotation );
   cosa = cos( rotation );
}

Or will the GLSL compiler or the sin() and cos() implementations take care of optimizations like that for me?

Answer

Nicol Bolas picture Nicol Bolas · Apr 14, 2012

For example, in my application it'll be very common for the argument to be 0. So does something like this make sense:

No.

Your compiler will do one of two things.

  1. It will issue an actual conditional branch. In the best possible case, if 0 is a value that is coherent locally (such that groups of shaders will often hit 0 or non-zero together), then you might get improved performance.
  2. It will evaluate both sides of the condition, and only store the result for the correct one of them. In which case, you've gained nothing.

In general, it's not a good idea to use conditional logic to dance around small performance like this. It needs to be really big to be worthwhile, like a discard or something.

Also, do note that floating-point equivalence is not likely to work. Not unless you actually pass a uniform or vertex attribute containing exactly 0.0 to the shader. Even interpolating between 0 and non-zero will likely never produce exactly 0 for any fragment.