If I have some integer n, and I want to know the position of the most significant bit (that is, if the least significant bit is on the right, I want to know the position of the furthest left bit that is a 1), what is the quickest/most efficient method of finding out?
I know that POSIX supports a ffs() method in strings.h to find the first set bit, but there doesn't seem to be a corresponding fls() method.
Is there some really obvious way of doing this that I'm missing?
What about in cases where you can't use POSIX functions for portability?
Edit: What about a solution that works on both 32- and 64-bit architectures? (Many of the code listings seem like they'd only work on 32-bit ints.)
-- Built-in Function: int __builtin_clz (unsigned int x)
     Returns the number of leading 0-bits in X, starting at the most significant bit position. If X is 0, the result is undefined.

-- Built-in Function: int __builtin_clzl (unsigned long)
     Similar to `__builtin_clz', except the argument type is `unsigned long'.

-- Built-in Function: int __builtin_clzll (unsigned long long)
     Similar to `__builtin_clz', except the argument type is `unsigned long long'.
I'd expect them to be translated into something reasonably efficient for your current platform, whether it be one of those fancy bit-twiddling algorithms, or a single instruction.
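As a rough sketch of how these builtins answer the original question: the index of the highest set bit is the type's width minus one, minus the leading-zero count. The helper names msb_index and msb_index_ll are made up for illustration, and the code assumes GCC or a compatible compiler such as Clang. Deriving the width from sizeof rather than hard-coding 31 or 63 is one way to keep it working on both 32-bit and 64-bit targets.

```c
#include <assert.h>
#include <limits.h>

/* Index (0 = least significant bit) of the highest set bit of x.
 * Precondition: x != 0, because __builtin_clz(0) has an undefined result.
 * msb_index is a hypothetical helper name, not part of any library. */
static int msb_index(unsigned int x)
{
    assert(x != 0);
    return (int)(sizeof x * CHAR_BIT) - 1 - __builtin_clz(x);
}

/* Same idea for unsigned long long, using the "ll" variant of the builtin. */
static int msb_index_ll(unsigned long long x)
{
    assert(x != 0);
    return (int)(sizeof x * CHAR_BIT) - 1 - __builtin_clzll(x);
}
```

With these, msb_index(0x80u) is 7 and msb_index_ll(1ULL << 40) is 40.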
A useful trick if your input can be zero is __builtin_clz(x | 1): unconditionally setting the low bit without modifying any others makes the output 31 for x=0, without changing the output for any other input.
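Here is a minimal sketch of that trick, again assuming GCC or Clang. The names clz_or1, check_clz_or1 and UINT_BITS are invented for this example. The "31" in the text corresponds to UINT_BITS - 1 when unsigned int is 32 bits wide. The check function only samples a handful of bit patterns, so treat it as a sanity check rather than a proof.

```c
#include <assert.h>
#include <limits.h>

#define UINT_BITS ((int)(sizeof(unsigned int) * CHAR_BIT))

/* Leading-zero count that is safe to call with x == 0.
 * OR-ing in bit 0 guarantees a non-zero argument, so the builtin's
 * undefined case is never reached; for x == 0 it returns UINT_BITS - 1. */
static int clz_or1(unsigned int x)
{
    return __builtin_clz(x | 1u);
}

/* Sanity check on sample inputs: for non-zero x the extra low bit
 * does not change the result, because some higher (or equal) bit
 * already determines the leading-zero count. */
static void check_clz_or1(void)
{
    for (int bit = 0; bit < UINT_BITS; bit++) {
        unsigned int pow2 = 1u << bit;         /* a single set bit */
        unsigned int ones = pow2 | (pow2 - 1); /* that bit and all below it */
        assert(clz_or1(pow2) == __builtin_clz(pow2));
        assert(clz_or1(ones) == __builtin_clz(ones));
    }
    assert(clz_or1(0u) == UINT_BITS - 1);
}
```

Note that the result for x=0 is then the same as for x=1, so this only helps if you don't need to tell those two inputs apart.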
To avoid needing to do that, your other option is platform-specific intrinsics like ARM GCC's __clz (no header needed), or x86's _lzcnt_u32 on CPUs that support the lzcnt instruction. (Beware that lzcnt decodes as bsr on older CPUs instead of faulting, which gives 31-lzcnt for non-zero inputs.)
There's unfortunately no way to portably take advantage of the various CLZ instructions on non-x86 platforms that do define the result for input=0 as 32 or 64 (according to the operand width). x86's lzcnt does that, too, while bsr produces a bit-index that the compiler has to flip unless you use 31-__builtin_clz(x).
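If you just want a defined answer for zero and don't mind an explicit check, one possible wrapper looks like the sketch below. The names clz32 and clz32_portable are hypothetical. The non-GCC branch is a plain shift loop, which is not one of the "fancy" bit-twiddling algorithms but works anywhere. A compiler may be able to fold the zero check into an instruction that already defines the zero case, but nothing guarantees that.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Portable fallback: shift left until the top bit is set.
 * Returns 32 for x == 0. */
static int clz32_portable(uint32_t x)
{
    int n = 0;
    if (x == 0)
        return 32;
    while ((x & UINT32_C(0x80000000)) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Leading-zero count of a 32-bit value, with clz32(0) defined as 32. */
static int clz32(uint32_t x)
{
#if defined(__GNUC__)
    /* unsigned long is at least 32 bits wide; if it is wider, the
     * builtin also counts the extra high bits, so subtract them. */
    const int extra = (int)(sizeof(unsigned long) * CHAR_BIT) - 32;
    int n = x ? __builtin_clzl((unsigned long)x) - extra : 32;
    assert(n == clz32_portable(x)); /* debug-only cross-check */
    return n;
#else
    return clz32_portable(x);
#endif
}
```

The position of the most significant bit is then 31 - clz32(x) for non-zero x, and the x == 0 case yields -1, which is easy to test for.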
(The "undefined result" is not C Undefined Behavior, just a value that isn't defined. It's actually whatever was in the destination register when the instruction ran. AMD documents this; Intel doesn't, but Intel's CPUs do implement that behaviour. But it's not whatever was previously in the C variable you're assigning to; that's not usually how things work when gcc turns C into asm. See also Why does breaking the "output dependency" of LZCNT matter?)