- Overview
- Key Features
- Architecture
- Interaction Flow
- Configuration
- Running the Proxy
- Why Use Redis Over Memcached?
- Extending Nginx Configuration with Snippets
- Prometheus
- Request Flow
## Overview

This lightweight, distributed rate limiter acts as a reverse proxy, regulating incoming traffic and enforcing rate limits before requests reach your backend. By curbing excessive traffic and potential abuse, it improves both security and performance.
## Key Features

- Kubernetes Sidecar Proxy: Designed to manage traffic before it enters your main application container, ensuring seamless rate limiting within a Kubernetes environment.
- NGINX + Lua: Implemented with Lua scripting inside NGINX, leveraging `lua-resty-global-throttle` and `lua-resty-redis`.
- Flexible Caching: Supports both Redis and Memcached as distributed caching providers.
- Configurable Rules: Rate limit rules are defined in a YAML file, allowing flexible and dynamic configurations.
## Architecture

```mermaid
graph TD
    subgraph Client
        A[Client]
    end
    subgraph Infrastructure
        B[Nginx Proxy] -- Rate Limit Check --> C{Cache Provider}
        C -- 429 Too Many Requests --> B
        style B fill:#f9f,stroke:#333,stroke-width:2px
        style C fill:#ccf,stroke:#333,stroke-width:2px
        classDef rate_limiting fill:#ffc,stroke:#333;
        class C rate_limiting
    end
    subgraph Application
        E[Main Application]
        style E fill:#eef,stroke:#333,stroke-width:2px
    end
    B -- Forward if allowed --> E
    E -- Response --> B
    B -- Response --> A
    A -- Request --> B
    classDef external fill:#eee,stroke:#333
    class A external
    classDef container fill:#ccf,stroke:#333
    class E container
    classDef proxy fill:#f9f,stroke:#333
    class B proxy
    linkStyle 0,1,2,3 stroke:#0aa,stroke-width:2px;
    linkStyle 4,5 stroke:#080,stroke-width:2px;
```
## Interaction Flow

- Client Request: The client sends a request to the application.
- NGINX Proxy: The request is intercepted by the NGINX proxy.
- Rate Limiting: The proxy checks the request against the rate limiting rules defined in the YAML file.
- Decision-Making & Request Handling:
  - Ignored Segments: The request IP/user is first checked against the `ignoredSegments` configuration. If matched, rate limiting is bypassed and the request is forwarded.
  - Rate Limit Exceeded: If the request exceeds the defined rate limit, a `429 Too Many Requests` response is immediately returned to the client.
  - Within Limits: If the request is within the rate limit, it is proxied to the main application.
  - Lua Exception Handling: If an exception occurs in the Lua rate limiting script, the request is still proxied to the main application. This fail-open behavior should be considered carefully and logged/monitored.
  - Rules Precedence: Explicit IP addresses in the configuration take priority over users and generic CIDR ranges such as `0.0.0.0/0` (an example follows this list).
- Main Application: The request is processed by the main application if it passes the rate limiting check.
- Response: The main application's response travels back through the NGINX proxy to the client.
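For instance, with an illustrative rules fragment like the one below, a request from `192.168.1.1` is governed by its explicit IP rule, while any other client falls through to the catch-all CIDR rule:

```yaml
rules:
  /v1:
    ips:
      192.168.1.1:   # explicit IP rule: takes priority for this address
        limit: 200
        window: 60
      0.0.0.0/0:     # catch-all CIDR: applies (per IP) to everyone else
        limit: 50
        window: 60
```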
## Configuration

Rate limit rules are defined in the `ratelimits.yaml` file. The structure of the YAML file is as follows:
```yaml
ignoredSegments:
  users:
    - admin
  ips:
    - 127.0.0.1
  urls:
    - /v1/ping
rules:
  /v1:
    users:
      user2:
        limit: 50
        window: 60
    ips:
      192.168.1.1:
        limit: 200
        window: 60
  ^/v2/[0-9]$:
    users:
      user3:
        flowRate: 10
        limit: 30
        window: 60
```
- `ignoredSegments`: Defines users, IPs, and URLs for which rate limiting should be skipped. This is useful for administrative users, specific URLs, or trusted IPs.
- `rules`: Contains the rate limit rules for different URI paths.
- `path`: The URI path to which the rate limit applies. To rate limit all paths, provide `/` as a global path; for regex paths, refer to https://github.com/openresty/lua-nginx-module?tab=readme-ov-file#ngxrematch.
- `user`/`IP`: The user or IP address to which the rate limit applies.
- `limit`: The maximum number of requests allowed within the time window.
- `window`: The time window in seconds during which the limit applies.
- `flowRate`: The rate at which requests are allowed, in requests per second (RPS). Applies to the `leaky-bucket` (leak rate) and `token-bucket` (refill rate) algorithms. Defaults to `limit/window`, as shown below.
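For example, with `limit: 30` and `window: 60` and no explicit `flowRate`, the default rate works out to 30/60 = 0.5 RPS. The illustrative rule below raises that to 10 RPS while keeping the burst capacity at 30 requests:

```yaml
rules:
  ^/v2/[0-9]$:
    users:
      user3:
        flowRate: 10   # refill (token-bucket) or leak (leaky-bucket) rate in RPS
        limit: 30      # maximum requests within the window
        window: 60     # window length in seconds
```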
🔹 Configuration Note: Ensure that your `ratelimits.yaml` file is mounted to `/usr/local/openresty/nginx/lua/ratelimits.yaml`.
🔹 Global Rate Limiting (`0.0.0.0/0`): If `0.0.0.0/0` is specified in the rules, rate limiting is applied per IP rather than globally. For example, if the limit is set to 10 requests per second (RPS) and two clients, `127.0.0.1` and `127.0.0.2`, make requests, each IP is allowed 10 RPS independently.
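An illustrative rule matching that example:

```yaml
rules:
  /:
    ips:
      0.0.0.0/0:
        limit: 10   # each client IP gets 10 requests...
        window: 1   # ...per 1-second window, counted independently
```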
The proxy is configured via the following environment variables:

| Variable | Description | Required | Default |
|---|---|---|---|
| `UPSTREAM_PORT` | The port of the main application. | ✅ | - |
| `UPSTREAM_HOST` | The hostname of the main application. | ✅ | - |
| `UPSTREAM_TYPE` | The type of upstream server. Valid values: `http` (HTTP upstreams), `fastcgi` (FastCGI upstreams), and `grpc` (gRPC upstreams). | ❌ | `http` |
| `INDEX_FILE` | The default index file for FastCGI upstreams. | ❌ | `index.php` |
| `SCRIPT_FILENAME` | The script filename for FastCGI upstreams. | ❌ | `/var/www/app/public/index.php` |
| `CACHE_HOST` | The hostname of the distributed cache. | ✅ | - |
| `CACHE_PORT` | The port of the distributed cache. | ✅ | - |
| `CACHE_PROVIDER` | The provider of the distributed cache, either `redis` or `memcached`. | ✅ | - |
| `CACHE_PREFIX` | A unique cache prefix per server group/namespace that reflects the context where rate limits should be applied. | ✅ | - |
| `CACHE_ALGO` | The rate-limiting algorithm to use. Options: `fixed-window`, `sliding-window`, `leaky-bucket`, or `token-bucket`. Only applicable when using `redis`. | ❌ | `token-bucket` |
| `REMOTE_IP_KEY` | The request variable to use as the source IP for rate limiting. Options: `http_cf_connecting_ip` (extracts the IP from `CF-Connecting-IP`, Cloudflare); `http_x_forwarded_for` (uses the first IP in the `X-Forwarded-For` header); `remote_addr` (uses `ngx.var.remote_addr`, the default NGINX client IP). | ✅ | - |
| `PROMETHEUS_METRICS_ENABLED` | Enables Prometheus metrics export. | ❌ | `false` |
| `LOGGING_ENABLED` | Enables NGINX logs. | ❌ | `true` |
## Running the Proxy

To run the NGINX Rate Limiter Proxy with Docker, mount the rate limit configuration file and set the required environment variables:
```bash
docker run --rm --platform linux/amd64 \
  -v $(pwd)/ratelimits.yaml:/usr/local/openresty/nginx/lua/ratelimits.yaml \
  -e UPSTREAM_HOST=localhost \
  -e UPSTREAM_TYPE=http \
  -e UPSTREAM_PORT=3000 \
  -e CACHE_HOST=mcrouter \
  -e CACHE_PORT=5000 \
  -e CACHE_PROVIDER=memcached \
  -e CACHE_PREFIX=local \
  -e REMOTE_IP_KEY=remote_addr \
  ghcr.io/omarfawzi/nginx-ratelimiter-proxy:master
```
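A hypothetical variant using Redis instead, which also lets you choose the algorithm via `CACHE_ALGO` (assuming a Redis instance is reachable at `redis:6379`):

```bash
docker run --rm --platform linux/amd64 \
  -v $(pwd)/ratelimits.yaml:/usr/local/openresty/nginx/lua/ratelimits.yaml \
  -e UPSTREAM_HOST=localhost \
  -e UPSTREAM_TYPE=http \
  -e UPSTREAM_PORT=3000 \
  -e CACHE_HOST=redis \
  -e CACHE_PORT=6379 \
  -e CACHE_PROVIDER=redis \
  -e CACHE_PREFIX=local \
  -e CACHE_ALGO=sliding-window \
  -e REMOTE_IP_KEY=remote_addr \
  ghcr.io/omarfawzi/nginx-ratelimiter-proxy:master
```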
You can mount your own `resolver.conf` file to `/usr/local/openresty/nginx/conf/resolver.conf` in order to use a custom resolver.
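A minimal `resolver.conf` sketch (the nameservers here are illustrative, not defaults of the image):

```nginx
# Hypothetical resolver.conf: resolve upstream hostnames via public DNS,
# caching answers for 30 seconds.
resolver 1.1.1.1 8.8.8.8 valid=30s;
```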
## Why Use Redis Over Memcached?

When implementing rate limiting, Redis is generally preferred over Memcached because it handles atomic operations and structured data efficiently:

- Atomicity: Redis `EVAL` scripts ensure race-condition-free updates. Memcached lacks built-in atomic counters with expiration, making it less reliable for rate limiting.
- Algorithms: Redis supports multiple rate-limiting algorithms, including fixed window, sliding window, and token bucket.

In short, Redis provides precise, reliable, and scalable rate limiting, while Memcached lacks the atomicity and data structures needed for advanced rate-limiting techniques.
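For illustration only (not the library's actual script), the sketch below shows the kind of atomic update Redis enables: a fixed-window counter executed as a single `EVAL` script, so concurrent requests can never race between the increment and the expiry:

```lua
-- Illustrative fixed-window limiter, run atomically by Redis EVAL.
-- KEYS[1]: counter key (e.g. "local:/v1:192.168.1.1")
-- ARGV[1]: window in seconds, ARGV[2]: request limit
local count = redis.call("INCR", KEYS[1])
if count == 1 then
  -- First hit in this window: start the expiry clock.
  redis.call("EXPIRE", KEYS[1], tonumber(ARGV[1]))
end
-- 1 = allow the request, 0 = reject with 429 upstream.
if count > tonumber(ARGV[2]) then
  return 0
end
return 1
```

Memcached offers no equivalent server-side scripting, which is why increment-plus-expiry sequences against it are prone to races.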
Using Redis replicas for rate limiting is not recommended due to potential delays in data replication. Redis replication is asynchronous, meaning there can be a lag between the master and replica nodes. This can result in inconsistent rate limits, where some requests might pass even after exceeding the limit due to stale data in the replica.
To ensure accurate and real-time enforcement of rate limits:
- Always use the Redis master instance for both read and write operations related to rate limiting.
- Replicas should only be used for read-heavy operations that are not time-sensitive.
Using a replica for rate limiting can lead to bypassing rate limits and unexpected behaviors, defeating the purpose of traffic control.
## Extending Nginx Configuration with Snippets

This setup allows easy customization by including additional snippet files. These snippets let you extend the core Nginx configuration without modifying `nginx.conf`.

The Nginx configuration includes external snippet files from the `/usr/local/openresty/nginx/conf/` directory:
- `http_snippet.conf`: Modify HTTP settings, applied to the `http` context.
- `server_snippet.conf`: Modify server-wide settings, applied to the `server` context.
- `location_snippet.conf`: Customize location-based routing and proxying, applied to the `location` context.
- `resolver.conf`: Define custom DNS resolvers.
Nginx will automatically load these files if they exist.
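For example, a hypothetical `server_snippet.conf` that tags every response passing through the proxy (the header name is illustrative):

```nginx
# Hypothetical server_snippet.conf: add a marker header to all responses.
add_header X-RateLimiter-Proxy "enabled" always;
```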
To extend the logic, create your snippet files and mount them into the container:
```bash
docker run -d \
  -v $(pwd)/snippets/http_snippet.conf:/usr/local/openresty/nginx/conf/http_snippet.conf \
  -v $(pwd)/snippets/server_snippet.conf:/usr/local/openresty/nginx/conf/server_snippet.conf \
  -v $(pwd)/snippets/location_snippet.conf:/usr/local/openresty/nginx/conf/location_snippet.conf \
  ghcr.io/omarfawzi/nginx-ratelimiter-proxy:master
```
## Prometheus

- Set `PROMETHEUS_METRICS_ENABLED` to `true`.

Prometheus metrics are exposed on port `9145` at the `/metrics` endpoint. They can be accessed via:

```bash
curl http://<server-ip>:9145/metrics
```
This endpoint provides various statistics, including:
- `nginx_proxy_http_requests_total`: Total number of HTTP requests, categorized by host and status.
- `nginx_proxy_http_request_duration_seconds`: Histogram tracking request latency.
- `nginx_proxy_http_connections`: Gauge tracking active connections (reading, writing, waiting, active).
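A minimal Prometheus scrape configuration sketch for collecting these metrics (job name and target are placeholders):

```yaml
scrape_configs:
  - job_name: nginx-ratelimiter-proxy   # illustrative job name
    static_configs:
      - targets: ["<server-ip>:9145"]   # the proxy's metrics endpoint
```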
## Request Flow

```mermaid
graph TD
    subgraph IP Rules
        CheckIPRule{Is there an exact IP rule?} -->|Yes| ApplyIPRateLimit["Apply Rate Limit for IP"]
        ApplyIPRateLimit --> CheckLimit{Exceeded Limit?}
    end
    subgraph User Rules
        CheckUserRule{Is there a user rule?} -->|Yes| ApplyUserRateLimit["Apply Rate Limit for User"]
        ApplyUserRateLimit --> CheckLimit
    end
    subgraph CIDR Rules
        CheckCIDRRule{Does IP match CIDR rule?} -->|Yes| ApplyCIDRRateLimit["Apply Rate Limit for IP CIDR"]
        ApplyCIDRRateLimit --> CheckLimit
    end
    subgraph Global Rules
        CheckGlobalIPRule{Is there a global IP rule?} -->|Yes| ApplyGlobalIPRateLimit["Apply Global Rate Limit"]
        ApplyGlobalIPRateLimit --> CheckLimit
    end
    Start["Request Received"] --> CheckIgnore{Is IP/User/URL Ignored?}
    CheckIgnore -->|Yes| AllowRequest["Allow Request"]
    CheckIgnore -->|No| CheckIPRule
    CheckIPRule -->|No| CheckUserRule
    CheckUserRule -->|No| CheckCIDRRule
    CheckCIDRRule -->|No| CheckGlobalIPRule
    CheckGlobalIPRule -->|No| AllowRequest
    CheckLimit -->|Yes| ThrottleResponse["Return 429 Too Many Requests"]
    CheckLimit -->|No| AllowRequest
```