I know that 'crossing boundaries' when making a JNI call in Java is slow.
However, I want to know what it is that makes it slow. What does the underlying JVM implementation do when making a JNI call that makes it so slow?
First, it's worth noting that by "slow," we're talking about something that can take tens of nanoseconds. For trivial native methods, in 2010 I measured calls at an average of 40 ns on my Windows desktop and 11 ns on my Mac desktop. Unless you're making many calls, you're not going to notice.
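If you want a rough feel for the numbers on your own machine, here is a minimal sketch of that kind of measurement. It is not the code I used in 2010. The class name `NativeCallTiming` and its helpers are made up for illustration, and it assumes that `Runtime.freeMemory()` is an ordinary native method on your JVM (it is declared `native` in OpenJDK, but a JVM is free to treat it specially). A plain loop like this is only indicative; a harness such as JMH would give more trustworthy results.

```java
import java.util.function.LongSupplier;

// Hypothetical helper class, for illustration only.
final class NativeCallTiming {
    // Written after each run so the JIT cannot discard the calls as dead code.
    static volatile long sink;

    // Returns the average wall-clock nanoseconds per invocation of `call`.
    static double averageNanos(LongSupplier call, int iterations) {
        long acc = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            acc += call.getAsLong();
        }
        long elapsed = System.nanoTime() - start;
        sink = acc;
        return (double) elapsed / iterations;
    }

    // Trivial pure-Java method used as the baseline.
    static long plainJava() {
        return 42L;
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        int n = 1_000_000;

        // Warm-up passes so both paths are JIT-compiled before timing.
        averageNanos(rt::freeMemory, n);
        averageNanos(NativeCallTiming::plainJava, n);

        // Assumption: freeMemory() is a real native call on this JVM.
        double nativeNs = averageNanos(rt::freeMemory, n);
        double javaNs = averageNanos(NativeCallTiming::plainJava, n);

        System.out.printf("native call: %.1f ns/call%n", nativeNs);
        System.out.printf("java call:   %.1f ns/call%n", javaNs);
    }
}
```

The Java baseline will likely be inlined and show close to zero, so treat the difference as an upper-level impression of call overhead plus whatever work the native method itself does, not as a precise figure.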
That said, calling a native method can be slower than making a normal Java method call. Causes include:
Some additional discussion, possibly dated, can be found in "Java™ Platform Performance: Strategies and Tactics", 2000, by Steve Wilson and Jeff Kesselman, in section "9.2: Examining JNI costs". It's about a third of the way down this page, provided in the comment by @Philip below.
The 2009 IBM developerWorks paper "Best practices for using the Java Native Interface" provides some suggestions on avoiding performance pitfalls with JNI.