How to implement rate limiting based on a client token in Spring?

aj.esler · Apr 17, 2012 · Viewed 9.6k times

I am developing a simple REST API using Spring 3 + Spring MVC. Authentication will be done through OAuth 2.0 or basic auth with a client token using Spring Security. This is still under debate. All connections will be forced through an SSL connection.

I have been looking for information on how to implement rate limiting, but it does not seem like there is a lot of information out there. The implementation needs to be distributed, in that it works across multiple web servers.

For example, if there are three API servers A, B, and C, and clients are limited to 5 requests per second, then a client that makes 6 requests as follows will find the request to C rejected with an error.

A receives 3 requests   \
B receives 2 requests    | Executed in order, all requests from one client.
C receives 1 request    /
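
The scenario above can be sketched as a fixed-window counter kept in a store that all servers share. This is only an illustration under assumptions: `SharedCounterStore`, `InMemoryCounterStore`, and `FixedWindowLimiter` are hypothetical names, and the in-memory store stands in for a shared service such as memcached (whose `incr` command is atomic) so that the sketch runs without external dependencies.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical abstraction over a shared store: one atomic counter per key.
interface SharedCounterStore {
    long incrementAndGet(String key);
}

// In-memory stand-in so the sketch runs locally. It never evicts old keys;
// a real shared store would need an expiry on each window key.
class InMemoryCounterStore implements SharedCounterStore {
    private final ConcurrentHashMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

    @Override
    public long incrementAndGet(String key) {
        return counters.computeIfAbsent(key, k -> new AtomicLong()).incrementAndGet();
    }
}

// One limiter instance per API server; all instances share the same store.
class FixedWindowLimiter {
    private final SharedCounterStore store;
    private final int maxPerSecond;

    FixedWindowLimiter(SharedCounterStore store, int maxPerSecond) {
        this.store = store;
        this.maxPerSecond = maxPerSecond;
    }

    // Returns true if the request is allowed. The key combines the token with
    // the current one-second window, so the count starts over in each window.
    boolean allow(String token, long nowMillis) {
        String key = token + ":" + (nowMillis / 1000);
        return store.incrementAndGet(key) <= maxPerSecond;
    }
}

class ThreeServerDemo {
    public static void main(String[] args) {
        SharedCounterStore store = new InMemoryCounterStore();
        FixedWindowLimiter a = new FixedWindowLimiter(store, 5);
        FixedWindowLimiter b = new FixedWindowLimiter(store, 5);
        FixedWindowLimiter c = new FixedWindowLimiter(store, 5);
        long now = 1_000_000L; // fixed timestamp: every request falls in one window

        for (int i = 0; i < 3; i++) System.out.println("A allowed: " + a.allow("user-1", now));
        for (int i = 0; i < 2; i++) System.out.println("B allowed: " + b.allow("user-1", now));
        System.out.println("C allowed: " + c.allow("user-1", now));
    }
}
```

A fixed window is the simplest variant to share across servers because it needs only one atomic increment per request, but it can let through up to twice the limit when a burst straddles a window boundary.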

It needs to work based on a token included in the request, as one client may be making requests on behalf of many users, and each user should be rate limited rather than the server IP address.

The setup will be multiple (2-5) web servers behind an HAProxy load balancer. There is a Cassandra backend, and memcached is used. The web servers will be running on Jetty.

One potential solution might be to write a custom Spring Security filter that extracts the token and checks how many requests have been made with it in the last X seconds. This would allow us to do some things like different rate limits for different clients.
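
As a rough sketch of the check such a filter might perform, the class below keeps the timestamps of recent requests per token and rejects a request once the limit for the last X milliseconds is reached. It is plain Java with no Spring dependency. The class name `SlidingWindowLimiter`, the `limitOverrides` map for per-client limits, and the idea of calling `allow` from the filter after extracting the token are all assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper: counts requests per token over a sliding time window.
class SlidingWindowLimiter {
    private final long windowMillis;
    private final int defaultLimit;
    private final Map<String, Integer> limitOverrides; // per-token limits, if any
    private final Map<String, Deque<Long>> history = new ConcurrentHashMap<>();

    SlidingWindowLimiter(long windowMillis, int defaultLimit, Map<String, Integer> limitOverrides) {
        this.windowMillis = windowMillis;
        this.defaultLimit = defaultLimit;
        this.limitOverrides = new HashMap<>(limitOverrides);
    }

    // Returns true if the request is allowed and records it; rejected
    // requests are not recorded and so do not count against the limit.
    boolean allow(String token, long nowMillis) {
        Integer override = limitOverrides.get(token);
        int limit = (override != null) ? override : defaultLimit;
        Deque<Long> timestamps = history.computeIfAbsent(token, t -> new ArrayDeque<>());
        synchronized (timestamps) {
            // Drop timestamps that have fallen out of the window.
            while (!timestamps.isEmpty() && timestamps.peekFirst() <= nowMillis - windowMillis) {
                timestamps.pollFirst();
            }
            if (timestamps.size() >= limit) {
                return false;
            }
            timestamps.addLast(nowMillis);
            return true;
        }
    }
}

class SlidingWindowDemo {
    public static void main(String[] args) {
        Map<String, Integer> overrides = new HashMap<>();
        overrides.put("premium-client", 10); // hypothetical client with a higher limit
        SlidingWindowLimiter limiter = new SlidingWindowLimiter(1000, 5, overrides);

        long now = System.currentTimeMillis();
        for (int i = 0; i < 6; i++) {
            System.out.println("request " + (i + 1) + " allowed: " + limiter.allow("user-1", now));
        }
    }
}
```

This version holds its state in one JVM, so on its own it does not satisfy the multi-server requirement; the per-token history would have to live in a shared store, or requests for a token would have to be routed to a single server.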

Any suggestions on how it can be done? Is there an existing solution or will I have to write my own solution? I haven't done a lot of web site infrastructure before.

Answer

cauhn · Apr 1, 2019

> It needs to work based on a token included in the request, as one client may be making requests on behalf of many users, and each user should be rate limited rather than the server IP address.
>
> The setup will be multiple (2-5) web servers behind an HAProxy load balancer. There is a Cassandra backend, and memcached is used. The web servers will be running on Jetty.

As I understand it, the project uses a request/response HTTP(S) protocol, with HAProxy as the front end. HAProxy may be able to load balance based on the token; you can check from here.

Then all requests carrying the same token will reach the same web server, and that server can implement the rate limiter with a simple in-memory cache.
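
A minimal sketch of such an in-memory limiter, assuming token-based routing is already in place, is a token bucket per client token. The answer does not name an algorithm, so the token bucket here is a substitute chosen for illustration, and `TokenBucketLimiter` and its parameters are hypothetical names.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-server limiter: one token bucket per client token.
class TokenBucketLimiter {
    private static final class Bucket {
        double tokens;
        long lastRefillNanos;

        Bucket(double tokens, long lastRefillNanos) {
            this.tokens = tokens;
            this.lastRefillNanos = lastRefillNanos;
        }
    }

    private final double capacity;        // maximum burst size
    private final double refillPerSecond; // sustained request rate
    private final ConcurrentHashMap<String, Bucket> buckets = new ConcurrentHashMap<>();

    TokenBucketLimiter(double capacity, double refillPerSecond) {
        this.capacity = capacity;
        this.refillPerSecond = refillPerSecond;
    }

    // Returns true and consumes one unit if the client still has budget.
    boolean tryAcquire(String clientToken, long nowNanos) {
        Bucket bucket = buckets.computeIfAbsent(clientToken, k -> new Bucket(capacity, nowNanos));
        synchronized (bucket) {
            // Refill in proportion to the elapsed time, capped at capacity.
            double elapsedSeconds = (nowNanos - bucket.lastRefillNanos) / 1_000_000_000.0;
            bucket.tokens = Math.min(capacity, bucket.tokens + elapsedSeconds * refillPerSecond);
            bucket.lastRefillNanos = nowNanos;
            if (bucket.tokens >= 1.0) {
                bucket.tokens -= 1.0;
                return true;
            }
            return false;
        }
    }
}

class TokenBucketDemo {
    public static void main(String[] args) {
        TokenBucketLimiter limiter = new TokenBucketLimiter(5, 5); // 5 requests per second
        long now = System.nanoTime();
        for (int i = 0; i < 6; i++) {
            System.out.println("request " + (i + 1) + " allowed: " + limiter.tryAcquire("user-1", now));
        }
    }
}
```

Because the state lives only in one server's memory, the counts for a token start over if the load balancer moves that token to a different server, for example after a server failure.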