How are bit shifts implemented at the hardware level when the number to shift by is unknown?
I can't imagine that there would be a separate circuit for each number you can shift by (that would be 64 shift circuits on a 64-bit machine), nor can I imagine that it would be a loop of shifts by one (that would take up to 64 shift cycles on a 64-bit machine). Is it some sort of compromise between the two, or is there some clever trick?
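The circuit is called a "barrel shifter" - it's basically a stack of multiplexers. It has one layer per bit of the shift amount: layer k either passes its input through unchanged or shifts it by 2^k positions, depending on whether bit k of the shift amount is set. So an 8-bit barrel shifter needs three bits to say "how much to shift by" and hence 3 layers of muxes (shifting by 1, 2 and 4), and a 64-bit one needs only 6 layers.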
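To make the layering concrete, here is a minimal software sketch of the same idea, assuming a logical left shift. The function name `barrel_shift_left` and its parameters are made up for illustration; real hardware does this with parallel muxes in combinational logic, not a loop.

```python
def barrel_shift_left(value: int, amount: int, width: int = 8) -> int:
    """Simulate a layered barrel shifter (logical left shift).

    Hypothetical helper for illustration only; `width` is assumed
    to be a power of two, as in typical hardware.
    """
    mask = (1 << width) - 1
    # One mux layer per bit of the shift amount: 3 for width 8, 6 for width 64.
    layers = (width - 1).bit_length()
    value &= mask
    for k in range(layers):
        # Layer k is a row of 2:1 muxes driven by bit k of `amount`:
        # either pass the value through or shift it by 2**k positions.
        if (amount >> k) & 1:
            value = (value << (1 << k)) & mask
    # Bits of `amount` above the top layer are ignored in this sketch.
    return value


print(bin(barrel_shift_left(0b00000001, 5)))  # layers for 1 and 4 are active
```

Each loop iteration corresponds to one layer of muxes, so any shift amount from 0 to 7 passes through exactly three layers, whatever its value.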
Here's a picture of an 8-bit one from http://www.globalspec.com/reference/55806/203279/chapter-9-additional-circuit-designs: