F# performance in scientific computing

Anycorn · May 2, 2010

I am curious as to how F# performance compares to C++ performance? I asked a similar question with regard to Java, and the impression I got was that Java is not suitable for heavy number crunching.

I have read that F# is supposed to be more scalable and more performant, but how does its real-world performance compare to C++? Specific questions about the current implementation:

  • How well does it do floating-point?
  • Does it allow vector instructions?
  • How friendly is it towards optimizing compilers?
  • How big a memory footprint does it have? Does it allow fine-grained control over memory locality?
  • Does it have capacity for distributed memory processors, for example Cray?
  • What features does it have that may be of interest to computational science where heavy number processing is involved?
  • Are there actual scientific computing implementations that use it?

Thanks

Answer

J D · May 10, 2010

I am curious as to how F# performance compares to C++ performance?

Varies wildly depending upon the application. If you are making extensive use of sophisticated data structures in a multi-threaded program then F# is likely to be a big win. If most of your time is spent in tight numerical loops mutating arrays then C++ might be 2-3× faster.

Case study: Ray tracer

My benchmark uses a tree for hierarchical culling and numerical ray-sphere intersection code to generate an output image. This benchmark is several years old, and the C++ code has been improved upon dozens of times over the years and read by hundreds of thousands of people. Don Syme at Microsoft managed to write an F# implementation that is slightly faster than the fastest C++ code when compiled with MSVC and parallelized using OpenMP.

I have read that F# is supposed to be more scalable and more performant, but how does its real-world performance compare to C++?

Developing code is much easier and faster with F# than C++, and this applies to optimization as well as maintenance. Consequently, when you start optimizing a program the same amount of effort will yield much larger performance gains if you use F# instead of C++. However, F# is a higher-level language and, consequently, places a lower ceiling on performance. So if you have infinite time to spend optimizing you should, in theory, always be able to produce faster code in C++.

This is exactly the same benefit that C++ had over Fortran and Fortran had over hand-written assembler, of course.

Case study: QR decomposition

This is a basic numerical method from linear algebra provided by libraries like LAPACK. The reference LAPACK implementation is 2,077 lines of Fortran. I wrote an F# implementation in under 80 lines of code that achieves the same level of performance. But the reference implementation is not fast: vendor-tuned implementations like Intel's Math Kernel Library (MKL) are often 10× faster. Remarkably, I managed to optimize my F# code well beyond the performance of Intel's implementation running on Intel hardware whilst keeping my code under 150 lines and fully generic (it can handle single and double precision, and complex and even symbolic matrices!): for tall thin matrices my F# code is up to 3× faster than the Intel MKL.
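To give a rough idea of what "fully generic" numerical code can look like in F#, here is a minimal sketch. It is not the QR code from the case study: `dot` is a hypothetical helper written for illustration. It relies on `inline` so that the arithmetic operators are resolved separately for each element type at the call site.

```fsharp
// Hypothetical helper, not taken from the case study.
// 'inline' lets (+), (*) and the zero value be resolved per element type,
// so the same source works for float, float32, int, and so on.
let inline dot xs ys =
    Array.fold2 (fun acc x y -> acc + x * y) LanguagePrimitives.GenericZero xs ys

// Double precision
let d = dot [| 1.0; 2.0 |] [| 3.0; 4.0 |]

// Single precision, same source code
let s = dot [| 1.0f; 2.0f |] [| 3.0f; 4.0f |]
```

Because the function is inlined, each call site is compiled against the concrete element type, so the generic version need not pay for boxing or virtual dispatch.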

Note that the moral of this case study is not that you should expect your F# to be faster than vendor-tuned libraries but, rather, that even experts like Intel's will miss productive high-level optimizations if they use only lower-level languages. I suspect Intel's numerical optimization experts failed to exploit parallelism fully because their tools make it extremely cumbersome whereas F# makes it effortless.

How well does it do floating-point?

Performance is similar to ANSI C but some functionality (e.g. rounding modes) is not available from .NET.

Does it allow vector instructions?

No.

How friendly is it towards optimizing compilers?

This question does not make sense: F# is a proprietary .NET language from Microsoft with a single compiler.

How big a memory footprint does it have?

An empty application uses 1.3 MB here.

Does it allow fine-grained control over memory locality?

Better than most memory-safe languages but not as good as C. For example, you can unbox arbitrary data structures in F# by representing them as "structs".
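As a minimal sketch of the struct approach mentioned above (the `Vec3` type and `sumX` function are hypothetical names chosen for illustration), marking a type as a struct makes it a value type, so an array of them is stored as one contiguous block instead of an array of references to separately allocated objects:

```fsharp
// Hypothetical 3D point type; the Struct attribute makes it a value type.
[<Struct>]
type Vec3 =
    val X: float
    val Y: float
    val Z: float
    new (x: float, y: float, z: float) = { X = x; Y = y; Z = z }

// The elements are laid out inline in the array, one after another.
let points = Array.init 1000 (fun i -> Vec3(float i, 0.0, 0.0))

// A tight loop over the contiguous data, with no pointer chasing per element.
let sumX (ps: Vec3[]) =
    let mutable total = 0.0
    for i in 0 .. ps.Length - 1 do
        total <- total + ps.[i].X
    total
```

Calling `sumX points` walks memory sequentially, which is the kind of locality control the answer is referring to. Layout within the struct is still left to the runtime unless you add further attributes.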

Does it have capacity for distributed memory processors, for example Cray?

Depends what you mean by "capacity for". If you can run .NET on that Cray then you could use message passing in F# (just like the next language) but F# is intended primarily for desktop multicore x86 machines.

What features does it have that may be of interest to computational science where heavy number processing is involved?

Memory safety means you do not get segmentation faults and access violations. The support for parallelism in .NET 4 is good. The ability to execute code on-the-fly via the F# interactive session in Visual Studio 2010 is extremely useful for interactive technical computing.
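As a small sketch of the parallelism support, assuming the work splits into independent pieces, `Array.Parallel.map` from the F# core library distributes a function over an array across the available cores. The `rows` data and the `rowNorm` function here are made up for illustration:

```fsharp
// Made-up input: four rows of three identical values each.
let rows = Array.init 4 (fun i -> Array.create 3 (float i))

// Euclidean norm of a single row.
let rowNorm (row: float[]) =
    sqrt (Array.sumBy (fun x -> x * x) row)

// Same result as Array.map, but rows are processed in parallel.
let norms = Array.Parallel.map rowNorm rows
```

Swapping `Array.map` for `Array.Parallel.map` is the only change needed here, although for inputs this small the scheduling overhead would outweigh any gain. The same lines can be pasted into an F# interactive session to try them out.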

Are there actual scientific computing implementations that use it?

Our commercial products for scientific computing in F# already have hundreds of users.

However, your line of questioning indicates that you think of scientific computing as high-performance computing (e.g. Cray) and not interactive technical computing (e.g. MATLAB, Mathematica). F# is intended for the latter.