It seems counterintuitive that calculating more vertices would be faster than just reading more of them from VRAM. But if memory bandwidth is the issue that makes tessellation worth it, then why do things like displacement mapping exist? If you read from a texture in the tessellation shader, you are accessing VRAM more anyway. Are texture lookups less expensive than more original vertices? Why is tessellation fast?
Say you had a vertex amplification of 32 with a very low-polygon model. Would this be faster than, say, a higher-polygon model with a tessellation vertex amplification of only 8? In other words, do you gain performance linearly the more you tessellate?
There is no single factor that makes tessellation faster in every possible case. Different benefits and trade-offs apply to each use case. Some things that might contribute to making tessellation faster than the alternatives:
There are probably other factors that I have missed...
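To put rough numbers on the bandwidth part of the question, here is a back-of-the-envelope sketch. It is a toy model, not a measurement: it only counts bytes read from VRAM and ignores caches, texture filtering, and the cost of the tessellator and shaders themselves. The function name, the 32-byte vertex stride, and the 2-byte height texel are all assumptions made up for illustration, and "amplification" is taken to mean output vertices divided by input vertices, as in the question.

```python
def vertex_fetch_bytes(target_vertices, amplification,
                       vertex_stride=32, height_texel_bytes=2):
    """Toy estimate of VRAM bytes read to produce `target_vertices`.

    Assumptions (hypothetical, for illustration only):
    - amplification = output vertices / input vertices
    - each input vertex costs `vertex_stride` bytes (e.g. position,
      normal, UV)
    - each generated vertex samples one displacement texel of
      `height_texel_bytes` bytes; no filtering, no cache hits
    - amplification == 1 means a pre-tessellated mesh with no
      displacement lookup
    """
    input_vertices = target_vertices // amplification
    vertex_bytes = input_vertices * vertex_stride
    if amplification > 1:
        displacement_bytes = target_vertices * height_texel_bytes
    else:
        displacement_bytes = 0
    return vertex_bytes + displacement_bytes


if __name__ == "__main__":
    target = 1 << 20  # about one million final vertices
    for amp in (1, 8, 32):
        print(amp, vertex_fetch_bytes(target, amp))
```

Under these assumptions, a displacement lookup is much smaller than a full vertex, which is why reading a texture can still come out ahead of reading more original vertices. The saving is not linear in the amplification factor, though: the per-output-vertex texture reads stay constant, so they start to dominate as the input mesh shrinks. Real hardware behaviour depends on many things this model leaves out.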