What is the purpose of JMH @Fork?

Ole · Jan 27, 2016 · Viewed 7.8k times

IIUC, each fork creates a separate virtual machine because each virtual machine instance might compile the code slightly differently under the JIT?

I'm also curious about what the time attribute does in the below annotations:

@Warmup(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)

TIA, Ole

Answer

Nitsan Wakart · Feb 2, 2016

JMH offers the fork functionality for a few reasons. One is compilation-profile separation, as discussed by Rafael above. This behaviour is not controlled by the @Fork annotation, though (unless you choose 0 forks, which means no subprocesses are forked to run benchmarks at all). You can choose to run all the benchmarks as part of your benchmark warmup (thus creating a mixed profile for the JIT to work with) by using the warmup mode control (-wm).
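To make that concrete, here's a hedged sketch of where those controls live; the class name and benchmark body are invented for illustration. @Fork(5) asks JMH to run the measurement in 5 fresh JVM subprocesses, while @Fork(0) would run everything in the launching JVM with a shared compilation profile:

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Warmup;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
// time/timeUnit set how long each warmup or measurement iteration runs:
// here, 10 iterations of 500 ms each
@Warmup(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@Measurement(iterations = 10, time = 500, timeUnit = TimeUnit.MILLISECONDS)
@Fork(5) // each fork is a fresh JVM subprocess with its own compilation profile
public class ForkExample {

    @Benchmark
    public long sum() {
        // placeholder workload, invented for the example
        long s = 0;
        for (int i = 0; i < 1_000; i++) {
            s += i;
        }
        return s;
    }
}

(Swapping @Fork(5) for @Fork(0) runs the benchmark inside the host JVM itself, and the -wm option on the command-line runner selects the warmup mode mentioned above.)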

The reality is that many things can conspire to tilt your results one way or another, and running any benchmark multiple times to establish run-to-run variance is an important practice which JMH supports (and which most hand-rolled frameworks don't help with). Reasons for run-to-run variance might include (but I'm sure there are more):

  • CPUs start at a certain C-state and scale up the frequency in the face of load, then overheat and scale it back down. You can control this issue on certain OSs.

  • Memory alignment of your process can lead to paging behaviour differences.

  • Background application activity.
  • CPU allocation by the OS will vary resulting in different sets of CPUs used for each run.
  • Page cache contents and swapping.
  • JIT compilation is triggered concurrently and may lead to different results (this tends to happen when larger bits of code are under test). Note that small single-threaded benchmarks will typically not have this issue.
  • GC behaviour can trigger with slightly different timings from run to run leading to different results.

Running your benchmark with at least a few forks will help shake out these differences and give you an idea of the run-to-run variance in your results. I'd recommend you start with the default of 10 and cut it back (or increase it) experimentally depending on your benchmark.
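If it helps, here's a small hedged sketch of setting the fork count programmatically (ForkExample is the invented benchmark class from the snippet above; -f 10 on the JMH command line does the same thing):

import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.RunnerException;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;

public class RunForks {
    public static void main(String[] args) throws RunnerException {
        Options opts = new OptionsBuilder()
                .include(ForkExample.class.getSimpleName()) // invented benchmark class from above
                .forks(10) // start from 10 forks, then adjust experimentally
                .build();
        new Runner(opts).run();
    }
}

Each fork's results are reported separately and then aggregated, so comparing the per-fork scores is a quick way to see the run-to-run variance discussed above.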