# Rate Limits

LVNG enforces rate limits at two levels: per-endpoint limits on HTTP routes and per-plan limits on API key usage. Socket.io connections have their own event-based throttle. Every response includes headers that tell you where you stand.

## HTTP API Rate Limits

Each endpoint group has its own rate limit, enforced per authenticated user or per IP address (for public routes). All windows are 1 minute.

| Endpoint Group | Rate | Scope |
|---------------|------|-------|
| Messages API | 60 req/min | Per user |
| Channels API | 300 req/min | Per user |
| Chat API | 100 req/min | Per IP |
| Chat API | 30 req/min | Per user |
| General v2 APIs | 100 req/min | Per user |
| Artifacts API | 120 req/min | Per user |
| Twin Workspace API | 60 req/min | Per user |
| Voice Notes API | 100 req/min | Per user |
| Public endpoints | 60 req/min | Per IP |

> **Note:** The Chat API has two simultaneous limits. A single IP can send up to 100 requests per minute, but each individual user is limited to 30 requests per minute regardless of IP.

## API Key Plan-Based Limits

When authenticating with an API key, additional rate limits apply based on the plan tier associated with that key. These are tracked across three time windows: per minute, per hour, and per day.

| Plan | Per Minute | Per Hour | Per Day |
|------|-----------|----------|---------|
| Free | 5 | 20 | 100 |
| Pro | 100 | 500 | 10,000 |
| Enterprise | Unlimited | Unlimited | Unlimited |

Need higher limits? [Upgrade your plan](/pricing) or contact [hello@lvng.ai](mailto:hello@lvng.ai).
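If you want to stay under your plan's limits rather than discover them through 429 responses, you can keep a local request budget. The following is a minimal sketch, not part of any LVNG SDK: `PlanBudget`, `PLAN_LIMITS`, and `WindowLimit` are hypothetical names, and the sliding-window counting is an assumption, since the docs do not say whether the server uses fixed or sliding windows.

```typescript
// Plan limits copied from the table above. Enterprise has no limits.
type Plan = 'free' | 'pro' | 'enterprise'

interface WindowLimit {
  windowMs: number
  max: number
}

const MINUTE = 60_000
const HOUR = 60 * MINUTE
const DAY = 24 * HOUR

const PLAN_LIMITS: Record<Plan, WindowLimit[]> = {
  free: [
    { windowMs: MINUTE, max: 5 },
    { windowMs: HOUR, max: 20 },
    { windowMs: DAY, max: 100 },
  ],
  pro: [
    { windowMs: MINUTE, max: 100 },
    { windowMs: HOUR, max: 500 },
    { windowMs: DAY, max: 10_000 },
  ],
  enterprise: [],
}

// Hypothetical client-side tracker. It assumes sliding windows, which may
// not match how the server actually counts requests.
class PlanBudget {
  private sent: number[] = []
  private readonly limits: WindowLimit[]

  constructor(limits: WindowLimit[]) {
    this.limits = limits
  }

  // True if one more request fits inside every window at time `now` (ms).
  canSend(now: number): boolean {
    return this.limits.every(
      ({ windowMs, max }) => this.sent.filter((t) => now - t < windowMs).length < max,
    )
  }

  // Record a request sent at `now` and drop timestamps older than one day.
  record(now: number): void {
    this.sent = this.sent.filter((t) => now - t < DAY)
    this.sent.push(now)
  }
}
```

Call `canSend(Date.now())` before each request and `record(Date.now())` after sending it. Treat the result as advisory: the server's headers and 429 responses remain the source of truth.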

## Socket.io Rate Limits

Socket.io connections are rate-limited by event count, tracked per socket connection.

| Parameter | Default |
|-----------|---------|
| Max events per window | 100 |
| Window duration | 60 seconds |

These values are configurable server-side via the `SOCKET_RATE_LIMIT_WINDOW` and `SOCKET_RATE_LIMIT_MAX` environment variables.

When a socket connection exceeds the limit, the server emits an `error` event:

```json
{
  "code": "RATE_LIMIT_EXCEEDED",
  "message": "Too many events. Please wait."
}
```
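On the client, you can listen for this `error` event and pause outgoing events for a while. The sketch below is one possible approach: `SocketThrottleGuard` and `ErrorEventSource` are hypothetical names, and the 60-second cooldown is an assumption based on the default window duration, since the payload does not say how long to wait.

```typescript
// Shape of the error payload shown above.
interface SocketErrorPayload {
  code: string
  message: string
}

// Minimal structural type: any emitter exposing a compatible `on` method.
interface ErrorEventSource {
  on(event: 'error', listener: (payload: SocketErrorPayload) => void): unknown
}

// Hypothetical helper that pauses outgoing events after a rate-limit error.
class SocketThrottleGuard {
  private pausedUntil = 0
  private readonly cooldownMs: number

  // cooldownMs defaults to 60 s (assumed, matching the default window).
  constructor(source: ErrorEventSource, cooldownMs = 60_000) {
    this.cooldownMs = cooldownMs
    source.on('error', (payload) => {
      // Ignore unrelated errors; only rate-limit errors start a cooldown.
      if (payload.code === 'RATE_LIMIT_EXCEEDED') {
        this.pausedUntil = Date.now() + this.cooldownMs
      }
    })
  }

  // True when no cooldown is active at time `now` (ms since epoch).
  canEmit(now: number = Date.now()): boolean {
    return now >= this.pausedUntil
  }
}
```

Check `guard.canEmit()` before emitting, and queue or drop events while it returns `false`. If your server overrides the default window, pass a matching `cooldownMs`.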

## Rate Limit Headers

Every HTTP API response includes rate limit headers so you can track your usage in real time.

| Name | Type | Description |
|------|------|-------------|
| `X-RateLimit-Limit` | integer | Maximum number of requests allowed in the current time window. |
| `X-RateLimit-Remaining` | integer | Number of requests remaining in the current window. |
| `X-RateLimit-Reset` | string (ISO 8601) | Timestamp indicating when the current window resets. |
| `Retry-After` | integer | Seconds to wait before retrying. Only present on 429 responses. |

### Example Response Headers

```
HTTP/1.1 200 OK
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 2026-03-19T14:30:00.000Z
```
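To read these headers in client code, something like the following could work. It is a minimal sketch: `parseRateLimitHeaders`, `RateLimitInfo`, and `HeaderReader` are hypothetical names, not part of an LVNG SDK. The function returns `null` when any header is missing or malformed.

```typescript
interface RateLimitInfo {
  limit: number
  remaining: number
  resetAt: Date
}

// Anything with a `get` method, such as the Fetch API `Headers` object.
interface HeaderReader {
  get(name: string): string | null
}

// Parses the three X-RateLimit-* headers described in the table above.
function parseRateLimitHeaders(headers: HeaderReader): RateLimitInfo | null {
  const limit = Number.parseInt(headers.get('X-RateLimit-Limit') ?? '', 10)
  const remaining = Number.parseInt(headers.get('X-RateLimit-Remaining') ?? '', 10)
  // X-RateLimit-Reset is an ISO 8601 timestamp, which Date can parse directly.
  const resetAt = new Date(headers.get('X-RateLimit-Reset') ?? '')

  if (Number.isNaN(limit) || Number.isNaN(remaining) || Number.isNaN(resetAt.getTime())) {
    return null
  }
  return { limit, remaining, resetAt }
}
```

With `fetch`, pass `response.headers` straight in: `const info = parseRateLimitHeaders(response.headers)`.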

## Handling 429 Too Many Requests

When you exceed your rate limit, the API responds with HTTP `429`. The response body and headers tell you when you can retry.

```
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 2026-03-19T14:31:00.000Z
Retry-After: 60

{
  "error": "Too Many Requests",
  "message": "Rate limit exceeded",
  "retryAfter": 60
}
```

### Recommended Retry Strategy

1. Read the `Retry-After` header from the 429 response.
2. Wait that many seconds before retrying.
3. If the retry also fails, apply exponential backoff with jitter.
4. Avoid tight retry loops. Continued requests while rate-limited may result in longer cooldown periods.

### TypeScript Retry Example

```typescript
async function fetchWithRetry(url: string, options: RequestInit, maxRetries = 3) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options)

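    if (response.status !== 429) {
      return response
    }

    // Skip the wait after the final attempt; the error below is thrown right away
    if (attempt === maxRetries - 1) {
      break
    }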

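    // Prefer the server's Retry-After (seconds); fall back to exponential backoff
    // when the header is missing or is not a number
    const retryAfter = parseInt(response.headers.get('Retry-After') ?? '', 10)
    const delaySeconds = Number.isNaN(retryAfter) ? Math.pow(2, attempt) : retryAfter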

    // Add jitter to avoid thundering herd
    const jitter = Math.random() * 1000
    const delayMs = delaySeconds * 1000 + jitter

    console.warn(`Rate limited. Retrying in ${(delayMs / 1000).toFixed(1)}s (attempt ${attempt + 1}/${maxRetries})`)

    await new Promise((resolve) => setTimeout(resolve, delayMs))
  }

  throw new Error('Max retries exceeded -- still rate limited.')
}
```

## Best Practices

- **Monitor headers proactively.** Check `X-RateLimit-Remaining` before it hits zero and throttle your request rate accordingly.
- **Use WebSockets for real-time data.** Instead of polling, connect to the LVNG Socket.io server for live updates like messages, presence, and typing indicators. This avoids burning through HTTP rate limits.
- **Cache responses.** If data does not change frequently (e.g., agent configs, workspace metadata), cache it client-side to avoid redundant calls.
- **Implement exponential backoff.** Never retry at a fixed interval. Increase the delay between retries to let the rate window reset.
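
As an illustration of proactive throttling, the sketch below spreads the remaining request budget evenly over the time left in the window. `pacingDelayMs` is a hypothetical helper, and even spacing is only one possible policy. Its inputs correspond to the `X-RateLimit-Remaining` and `X-RateLimit-Reset` header values.

```typescript
// Hypothetical pacing helper: returns how long to wait (in ms) before the
// next request so the remaining budget lasts until the window resets.
function pacingDelayMs(remaining: number, resetAt: Date, now: Date = new Date()): number {
  const msUntilReset = Math.max(0, resetAt.getTime() - now.getTime())

  // Out of budget: wait for the window to reset.
  if (remaining <= 0) {
    return msUntilReset
  }

  // Otherwise spread the remaining requests evenly across the time left.
  return Math.ceil(msUntilReset / remaining)
}
```

For example, with 10 requests remaining and 30 seconds until reset, the helper suggests waiting 3000 ms between requests. Once the reset time has passed, it returns 0.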

## Next Steps

- [Error Codes](/docs/errors) -- HTTP and Socket.io error codes, response formats, and troubleshooting.
- [API Reference](/docs/api) -- Full endpoint reference for chat, agents, workflows, and more.
