5 ways to handle AWS API rate-limiting
When dealing with AWS API rate-limiting, there are a few tips and tricks I find helpful. If your environment is like mine, with a lot of code interacting with the AWS APIs (sometimes poorly), handling the default rate limits without errors is important.

Python’s Tenacity
I’ve found that Tenacity for Python is a lifesaver. Tenacity is a general-purpose library that automates retry logic: decorate a function and Tenacity will automatically retry it when an exception is raised, with behavior determined by the decorator. In the code below it retries the API call up to 10 times, waiting with a random exponential backoff between attempts.
The great thing is that it only takes a little bit of effort to refactor your code to take advantage of it. All you have to do is make each AWS API call a function and put the decorator on it. With the reraise=True option, your existing error handling will continue to work as it is coded now.
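Here’s a minimal sketch of what that refactor looks like. The describe_instances wrapper and the specific backoff numbers are my own illustrative choices, not something Tenacity mandates:

```python
import boto3
from botocore.exceptions import ClientError
from tenacity import (retry, retry_if_exception_type, stop_after_attempt,
                      wait_random_exponential)

ec2 = boto3.client("ec2")

# Retry up to 10 times with a random exponential backoff between attempts.
# reraise=True re-raises the original exception after the final attempt, so
# the error handling you already have keeps seeing the same exceptions.
@retry(
    reraise=True,
    retry=retry_if_exception_type(ClientError),  # covers throttling errors
    stop=stop_after_attempt(10),
    wait=wait_random_exponential(multiplier=1, max=60),
)
def describe_instances(**kwargs):
    return ec2.describe_instances(**kwargs)

instances = describe_instances(MaxResults=50)
```

Callers don’t change at all; they keep calling the wrapped function exactly as before.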
AWS Go SDK’s CustomRetryer
The AWS Go SDK also has default retry logic built in. In addition to the defaults, it lets you customize when to retry and how many times to do it. Once you initialize your session with the CustomRetryer, it is used automatically.
The nice thing about this approach is that you get to define the custom logic for exactly when a retry should happen.
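Here’s a rough sketch of that wiring against the v1 aws-sdk-go; the 429 check is just an example condition, and the field values are assumptions rather than recommended settings:

```go
package main

import (
	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/client"
	"github.com/aws/aws-sdk-go/aws/request"
	"github.com/aws/aws-sdk-go/aws/session"
)

// CustomRetryer embeds the SDK's DefaultRetryer and overrides when to retry.
type CustomRetryer struct {
	client.DefaultRetryer
}

// ShouldRetry adds an extra condition (plain HTTP 429) on top of the defaults.
func (r CustomRetryer) ShouldRetry(req *request.Request) bool {
	if req.HTTPResponse != nil && req.HTTPResponse.StatusCode == 429 {
		return true
	}
	return r.DefaultRetryer.ShouldRetry(req)
}

func main() {
	// Every client built from this session picks up the custom retryer.
	sess := session.Must(session.NewSession(&aws.Config{
		Retryer: CustomRetryer{DefaultRetryer: client.DefaultRetryer{NumMaxRetries: 10}},
	}))
	_ = sess // create service clients (e.g., ec2.New(sess)) from here
}
```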
Caching
Sometimes the data you query from AWS is fairly static. For example, the KubernetesCluster or Environment tags on my Kubernetes EC2 instances never change. Instead of making an API call every time I need to know the tag values, I can save them to a local file or to ElastiCache and check there first. If the cached value doesn’t exist, the script can fall back to making the API calls.
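A sketch of the local-file flavor of that follows; the cache path and helper name are made up for illustration, and an ElastiCache-backed version would look the same with the file reads and writes swapped for cache client calls:

```python
import json
import os

import boto3

CACHE_PATH = "/var/tmp/instance-tags.json"  # hypothetical local cache file

def get_instance_tags(instance_id):
    """Return an instance's tags, checking the local cache before the API."""
    cache = {}
    if os.path.exists(CACHE_PATH):
        with open(CACHE_PATH) as f:
            cache = json.load(f)
        if instance_id in cache:
            return cache[instance_id]

    # Cache miss: fall back to the EC2 API and store the result for next time.
    ec2 = boto3.client("ec2")
    resp = ec2.describe_tags(
        Filters=[{"Name": "resource-id", "Values": [instance_id]}]
    )
    tags = {t["Key"]: t["Value"] for t in resp["Tags"]}
    cache[instance_id] = tags
    with open(CACHE_PATH, "w") as f:
        json.dump(cache, f)
    return tags

print(get_instance_tags("i-0123456789abcdef0").get("KubernetesCluster"))
```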
Instance Metadata API
If you haven’t been keeping up with what’s available via the local instance metadata API, it’s probably time for a look. With the newer instance types (e.g., the c/m/r 5-series) more data is available than there was before. It unfortunately still doesn’t expose my most-queried resource (tags), but it does have useful information. For example, on a 5-series instance you no longer have to run aws ec2 describe-instance-status to find out if there are upcoming maintenance events. Instead, you can query the metadata API for that information at http://169.254.169.254/latest/meta-data/events/maintenance/scheduled. This change alone saved many thousands of API calls an hour across the fleet.
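A sketch of that lookup, assuming the token-less (IMDSv1-style) endpoint; the event fields printed are the ones documented for EC2 scheduled events:

```python
import json
import urllib.error
import urllib.request

# The metadata service is local to the instance, so this costs zero EC2 API
# calls and isn't subject to the same rate limits.
URL = "http://169.254.169.254/latest/meta-data/events/maintenance/scheduled"

try:
    with urllib.request.urlopen(URL, timeout=2) as resp:
        events = json.loads(resp.read().decode() or "[]")
except (urllib.error.URLError, ValueError):
    events = []  # not on EC2, an older instance type, or nothing returned

for event in events:
    print(event.get("Code"), event.get("NotBefore"))
```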
Requesting a limit increase
This one is kind of cheating, but it is actually possible, and the docs even say so. In practice, though, you’ll almost always first get a response telling you to implement backoffs and retries. If you can make a good business case for why you need a limit increase, AWS can grant one. For example, I ran into a situation where the Kubernetes external-dns provider was making too many requests per second when running on all of my clusters. There wasn’t a way for me to adjust it, so AWS had to increase the limit (slightly) on the account.
In the end…
Unfortunately it all boils down to whether you should retry or whether you should even be making the API calls in the first place. Thankfully the approaches I described are fairly easy to implement. As you may have guessed, we’re still getting rate-limited for some applications. It’s currently a whack-an-app process where we reduce the calls across quite a few applications a little at a time.