C#'s Math class does roots and powers in double only. Various things may go a bit faster if I add float-based square-root and power functions to my Math2 class. (Today is a relaxation day, and I find optimization relaxing.)
So - Fast square-root and power functions that I don't have to worry about licensing for, plskthx. Or a link that'll get me there.
I'm going to take it as axiomatic that no software method will compete with the hardware instruction for square roots. The only difficulty is that .NET doesn't give us direct control of the hardware as in the days of inline assembler for C code.
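For reference, the simplest float-based versions just narrow the double results. This is only a minimal sketch: the Math2 name comes from the question, the method shapes are my assumption, and whether the JIT reduces Math.Sqrt to a single hardware instruction is not something I have verified here.

```csharp
using System;

public static class Math2
{
    // Float-in, float-out square root: computed in double by Math.Sqrt,
    // then narrowed back to float.
    public static float Sqrt(float x)
    {
        return (float)Math.Sqrt(x);
    }

    // Float-in, float-out power: same approach, via Math.Pow.
    public static float Pow(float x, float y)
    {
        return (float)Math.Pow(x, y);
    }
}
```

Usage would be along the lines of `float r = Math2.Sqrt(2f);`. Any speedup over calling Math.Sqrt directly has to come from the hardware side discussed below, since these wrappers add a conversion rather than remove work.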
Let's first discuss a generic x86 hardware prospect.
The x86 floating-point instruction FSQRT does come in three precisions: single, double, and extended (the native precision of the 80-bit FP registers), and single precision takes 25-40% fewer clocks than double. See here for 32-bit x86 instructions.
That may sound like a big opportunity, but it's only a dozen clocks or so. A saving that small will easily get lost in the overhead unless you can carefully manage the code from function call to return value. Managed C++ sounds (as Marcelo Cantos suggests) like a more practical base for this than C#.
Note: Timings for FSQRT are identical to those of FDIV, with which it shares an execution unit in the Intel architecture, and thus a common latency.
A better opportunity for specialized C# code probably exists in the direction of SSE SIMD instructions, where the hardware allows up to four single-precision square roots to be done in parallel. JIT compiler support for this has been missing for years, but here are some leads on current development.
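For reference, the four-at-a-time idea looks roughly like this in C#. It is a sketch only, and it assumes a later runtime that ships the System.Numerics vector types, which .NET Framework 4 does not. SimdSqrt and SqrtInPlace are hypothetical names, and whether Vector4.SquareRoot actually becomes a packed SSE square root depends on the JIT in use.

```csharp
using System;
using System.Numerics;

public static class SimdSqrt
{
    // Hypothetical helper: replaces each element of 'values' with its
    // square root, processing four floats per step where possible.
    public static void SqrtInPlace(float[] values)
    {
        int i = 0;

        // Main loop: pack four floats into a Vector4 and take all four roots at once.
        for (; i + 4 <= values.Length; i += 4)
        {
            var v = new Vector4(values[i], values[i + 1], values[i + 2], values[i + 3]);
            Vector4 r = Vector4.SquareRoot(v);
            values[i] = r.X;
            values[i + 1] = r.Y;
            values[i + 2] = r.Z;
            values[i + 3] = r.W;
        }

        // Scalar tail for the remaining zero to three elements.
        for (; i < values.Length; i++)
        {
            values[i] = (float)Math.Sqrt(values[i]);
        }
    }
}
```

Packing and unpacking element by element, as done here for clarity, costs time of its own, so this shape only pays off when the data is already laid out in blocks of four.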
Intel has jumped in (Dec. 15, 2010), seeing that .NET Framework 4 wasn't doing anything with SIMD:
[Intel Performance Libraries allow... SIMD instructions in C#]
Even before that, the Mono project added JIT support for SIMD in Mono 2.2:
The possibility of calling Mono's SIMD support from MS C# was recently raised here:
[Calling mono c# code from Microsoft .net ? -- Stackoverflow]
An earlier question also addresses (though without much love shown!) how to install Mono's SIMD support: