I know there are optimized algorithms around for all kinds of matrix decompositions (QR decomposition, SVD, ...), multiplications, and the like. Yet I couldn't find a good overview. For C++, there is quite a bit of useful information in this question, but I'm looking for the same things in C.
You did not mention whether you wanted open-source or commercial software, so here is a list containing both:
There was also this previous question on the subject.