Why is matrix multiplication in .NET so slow?

Matthew Olenik picture Matthew Olenik · Jul 12, 2010 · Viewed 9.5k times · Source

I don't quite understand what makes matrix multiplication in C#/.NET (and even Java) so slow.

Take a look at this benchmark (source): Trying to find an updated benchmark.

Java vs C# vs C++ breakdown

C#'s integer and double performance is damn close to C++ compiled with MSVC++. 87% as fast for double and 99% as fast for 32-bit integer. Pretty damn good, I'd say. But then look at matrix multiplication. The gap widens to C# being about 19% as fast. This is a pretty huge discrepancy that I don't understand. Matrix multiplication is just a bunch of simple math. How is it getting so slow? Shouldn't it be roughly as fast as an equivalent number of simple floating point or integer operations?

This is especially of a concern with games and with XNA, where matrix and vector performance are critical for things like physics engines. Some time ago, Mono added support for SIMD instructions through some nifty vector and matrix classes. It closes the gap and makes Mono faster than hand-written C++, although not as fast as C++ with SIMD. (source)

Matrix multiplication comparison

What's going on here?

Edit: Looking closer, I misread the second graph. C# appears pretty close. Is the first benchmark just doing something horribly wrong? Sorry, I missed the version number on the first benchmark. I grabbed it as a handy reference for the "C# linear algebra is slow" that I've always heard. I'll try to find another.

Answer

Hans Passant picture Hans Passant · Jul 12, 2010

With large matrices like this, the CPU cache becomes the limiting factor. What's hyper-important is how the matrix is stored. And the benchmark code is comparing apples and oranges. The C++ code used jagged arrays, the C# code uses two-dimensional arrays.

Rewriting the C# code to use jagged arrays as well doubled its speed. Rewriting the matrix multiply code to avoid the array index boundary check seemed pointless, nobody would use code like this for real problems.