API gateway rate limiting vs application-level rate limiting
API gateway rate limiting, Kong rate limiting plugin, application middleware, latency impact, centralized vs distributed enforcement, per-route limits
Two Enforcement Points
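Rate limiting can live at the API gateway, before requests reach application code, or inside the application as middleware. The gateway gives uniform enforcement with no application changes; middleware gives finer control per endpoint and per user tier.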
API Gateway Rate Limiting
Gateways such as Kong, AWS API Gateway, and Nginx reject over-limit requests before they reach your application. An example Kong configuration:
# Kong rate limiting plugin (declarative YAML)
plugins:
- name: rate-limiting
  config:
    minute: 100        # max requests per minute
    hour: 1000         # max requests per hour
    policy: redis      # keep counters in Redis so all gateway nodes share them
    redis_host: redis
    redis_port: 6379
No application code is required, and the limit applies uniformly across microservices. The downside is coarse granularity: per-route or per-user-tier limits require additional gateway configuration.
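To make the two-window configuration above concrete, here is a minimal sketch of a fixed-window limiter that enforces a per-minute and a per-hour cap together. It is not Kong's implementation: `createMultiWindowLimiter` and `allow` are hypothetical names, and the counters live in process memory, whereas a gateway using `policy: redis` would keep them in Redis.

```javascript
// Hypothetical in-memory sketch; not Kong's actual implementation.
const WINDOWS = { minute: 60_000, hour: 3_600_000 }; // window lengths in ms

function createMultiWindowLimiter(limits) {
  // key: `${client}:${windowName}:${windowIndex}` -> request count
  // (old keys are never evicted here; a real store would expire them)
  const counters = new Map();

  return function allow(client, now = Date.now()) {
    const keys = [];
    // First pass: check every window without mutating anything, so a
    // rejected request does not consume quota in any window.
    for (const [name, max] of Object.entries(limits)) {
      const windowIndex = Math.floor(now / WINDOWS[name]);
      const key = `${client}:${name}:${windowIndex}`;
      if ((counters.get(key) || 0) >= max) return false;
      keys.push(key);
    }
    // Second pass: the request is allowed, so count it in every window.
    for (const key of keys) counters.set(key, (counters.get(key) || 0) + 1);
    return true;
  };
}
```

With `createMultiWindowLimiter({ minute: 100, hour: 1000 })`, a client that bursts 100 requests in one minute is rejected until the next minute begins, and is rejected for the rest of the hour once it has made 1000 requests.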
Application-Level Rate Limiting
Middleware inside your service gives full control over different limits per endpoint and per user tier:
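// Per-route limits in Express (assuming the express-rate-limit package)
const rateLimit = require('express-rate-limit');
// windowMs: window length in milliseconds; max: requests allowed per window
router.post('/expensive-op', rateLimit({ windowMs: 60000, max: 5 }));
router.get('/cheap-list', rateLimit({ windowMs: 60000, max: 1000 }));
This gives better granularity, but every service must implement it correctly. In a microservices architecture, prefer gateway limiting for baseline protection and layer application-level limiting on top for fine-grained per-route policies.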
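The per-user-tier case mentioned earlier can be sketched as Express-style middleware that looks up its cap from a policy table. Everything here is an assumption for illustration: the `POLICIES` table, the tier names `free` and `pro`, the `tieredRateLimit` helper, and the expectation that earlier authentication middleware has set `req.user = { id, tier }`. Counters are kept in process memory, so this only works for a single instance.

```javascript
// Hypothetical policy table: requests per minute, by route and user tier.
const POLICIES = {
  'POST /expensive-op': { free: 5, pro: 50 },
  'GET /cheap-list': { free: 1000, pro: 5000 },
};
const WINDOW_MS = 60_000;
const hits = new Map(); // `${userId}:${routeKey}` -> { windowStart, count }

function tieredRateLimit(routeKey) {
  return function (req, res, next) {
    // Assumes auth middleware has already populated req.user.
    const { id, tier } = req.user;
    // Unknown tiers fall back to the most restrictive (free) cap.
    const max = POLICIES[routeKey][tier] ?? POLICIES[routeKey].free;
    const now = Date.now();
    const key = `${id}:${routeKey}`;
    let entry = hits.get(key);
    // Start a fresh fixed window if none exists or the old one expired.
    if (!entry || now - entry.windowStart >= WINDOW_MS) {
      entry = { windowStart: now, count: 0 };
      hits.set(key, entry);
    }
    if (entry.count >= max) {
      res.status(429).json({ error: 'Too many requests' });
      return;
    }
    entry.count += 1;
    next();
  };
}
```

It would be attached the same way as the limits above, for example `router.post('/expensive-op', tieredRateLimit('POST /expensive-op'), handler)`, where `handler` stands in for the route's own logic. Running several instances of the service would require moving `hits` into a shared store such as Redis.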
