Is there any limitation on EC2 machine or network?

user3413814 · Sep 10, 2015 · Viewed 15.7k times

I have two instances on Amazon EC2. One is a t2.micro machine acting as a web cache server; the other runs a performance test tool.

When I started a test, TPS (transactions per second) was about 3000, but a few minutes later it dropped to 300.

At first I thought the CPU credit balance was exhausted, but there were enough credits left to process requests. During the test, the web cache's peak outgoing traffic was 500 Mbit/s, CPU usage was 60%, and there was more than enough free memory.

I couldn't find any cause for the drop in TPS. Is there any limitation on the EC2 machine or network?

Answer

John Rotenstein · Sep 10, 2015

There are several factors that could be constraining your processes.

CPU credits on T2 instances

As you mentioned, T2 instances use credits for bursting CPU. They are very powerful machines, but each instance is limited to a baseline share of CPU. A t2.micro instance has a 10% baseline, meaning it effectively gets 100% of the CPU only 10% of the time (at low-millisecond resolution).

Instances start with CPU credits for a fast start, and these credits are consumed when the CPU is used faster than the credits are earned. However, you say that the credit balance was sufficient, so this appears not to be the cause.
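To make the credit arithmetic concrete, here is a rough sketch. It assumes the standard definition of one CPU credit as one vCPU running at 100% for one minute, so a 10% baseline earns 6 credits per hour. The function names and the starting balance of 30 credits are illustrative assumptions, not values taken from your instance.

```python
# Rough T2 CPU-credit arithmetic (a sketch, not an official calculator).
# Assumption: one CPU credit = one vCPU at 100% utilization for one minute.

def credits_per_hour(cpu_percent, vcpus=1):
    """Credits earned (at baseline) or spent (at actual usage) per hour."""
    return cpu_percent * 60 * vcpus / 100


def hours_until_exhausted(balance, baseline_percent, usage_percent, vcpus=1):
    """Hours until the balance reaches zero, or None if it never drains."""
    earned = credits_per_hour(baseline_percent, vcpus)
    spent = credits_per_hour(usage_percent, vcpus)
    net_drain = spent - earned
    if net_drain <= 0:
        return None  # usage is at or below baseline, so credits do not run out
    return balance / net_drain


if __name__ == "__main__":
    # 10% baseline and 60% usage come from the question and answer;
    # the balance of 30 credits is a hypothetical starting value.
    print(credits_per_hour(10))               # earned per hour
    print(credits_per_hour(60))               # spent per hour
    print(hours_until_exhausted(30, 10, 60))  # hours of headroom
```

At a sustained 60% CPU the instance spends credits six times faster than it earns them, so it is worth watching the CPUCreditBalance metric over the whole test rather than only at the start.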

Network Bandwidth

Each Amazon EC2 instance is limited to a certain network throughput. Smaller instances have 'low' bandwidth; bigger instances have more. There is no official statement of the actual bandwidth figures, but this Server Fault question is an interesting reference: Bandwidth limits for Amazon EC2

Disk IOPS

If your application accesses the disk for each transaction, and your instance is using a General Purpose (SSD) volume type, then your disk may have consumed all of its available burst credits. If your disk is small, this means it will run slowly once the credits are gone (baseline speed is 3 IOPS per GB, so a 20GB disk would run at 60 IOPS). Check the Amazon CloudWatch VolumeQueueLength metric to see if IO is queuing excessively.
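The baseline figure above is simple multiplication, and the same arithmetic gives a rough idea of how long a burst can last. In this sketch the 3 IOPS per GB rate is from the text above, while the 3,000 IOPS burst ceiling and the 5.4 million credit bucket are assumptions based on the published General Purpose (SSD) figures as I recall them; check them against the current EBS documentation. The function names are illustrative.

```python
# Rough General Purpose (SSD) volume arithmetic (a sketch under stated assumptions).

IOPS_PER_GB = 3                  # baseline rate quoted in the answer
BURST_IOPS = 3000                # assumed burst ceiling
BUCKET_CREDITS = 5_400_000       # assumed size of a full I/O credit bucket


def baseline_iops(size_gb):
    """Sustained IOPS once burst credits are gone."""
    return size_gb * IOPS_PER_GB


def burst_seconds(size_gb):
    """Seconds a full bucket lasts at the burst ceiling, or None if no drain."""
    drain_rate = BURST_IOPS - baseline_iops(size_gb)
    if drain_rate <= 0:
        return None  # baseline already meets the ceiling, so the bucket never empties
    return BUCKET_CREDITS / drain_rate


if __name__ == "__main__":
    size = 20  # GB, the example size used in the answer
    print(baseline_iops(size))       # 60
    print(burst_seconds(size) / 60)  # minutes of full-speed burst
```

Under these assumptions a 20GB volume could burst for roughly half an hour before falling back to 60 IOPS, which is the kind of sudden drop worth ruling out with VolumeQueueLength.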

Something else

The slowdown could also be due to your application or cache system (e.g. running out of free memory for storing data).