API Rate Limiting

Master LegalCollects.ai API rate limiting: understand tiers, response headers, handling 429 responses, and best practices for efficient integration.

Rate Limiting Overview

The LegalCollects.ai API implements comprehensive rate limiting to protect infrastructure, prevent abuse, and ensure fair usage across all clients. Rate limiting protects against brute force attacks, resource exhaustion, and denial-of-service scenarios while allowing legitimate API consumers to integrate reliably.

Our rate limiting strategy uses rolling 15-minute time windows, with limits applied per IP address for unauthenticated endpoints and per user ID for authenticated endpoints. When you exceed your tier's limit, the API returns HTTP 429 (Too Many Requests) with information about when you can retry.

Understanding rate limits is essential for building reliable integrations. This guide covers all five rate limit tiers, response headers for monitoring usage, best practices for avoiding limits, and code examples for handling rate limit responses in JavaScript and Python.

Rate Limiting Protects Your Integration

Rate limiting isn't punitive—it's protective. By understanding limits and implementing intelligent retry logic, you build integrations that are resilient, efficient, and respectful of shared infrastructure. All rate limit errors include actionable information to help you recover gracefully.

Rate Limit Tiers

LegalCollects.ai defines five distinct rate limit tiers based on endpoint purpose and resource consumption:

| Tier | Endpoints | Limit | Window | Use Case |
| --- | --- | --- | --- | --- |
| General | `/api/*` (catch-all) | 100 requests | Per 15 min | Standard API endpoints, per IP |
| Auth | `/api/auth/login`, `/api/auth/register` | 10 requests | Per 15 min | Brute force protection, per IP |
| Case Creation | `/api/cases` | 30 requests | Per 15 min | Resource-intensive operations, per user |
| Bulk Upload | `/api/bulk-upload` | 5 requests | Per 15 min | Prevent resource exhaustion, per user |
| Public Pages | Static files (`/`, `/portal`, `/admin`) | 200 requests | Per 15 min | Public content, per IP (reserved) |

General API Limiter

The General API Limiter applies to most endpoints under `/api/*`. This tier allows 100 requests per 15 minutes per IP address. It provides standard protection for typical API usage patterns. Use this tier as your baseline when planning API integration.

Auth Limiter (Brute Force Protection)

The Auth Limiter applies to authentication endpoints: `/api/auth/login` and `/api/auth/register`. This tier allows only 10 requests per 15 minutes per IP address—a strict limit designed to prevent brute force attacks and unauthorized access attempts.

If you're building an application that handles authentication, account for this tight limit. After exceeding 10 authentication attempts, your IP will be rate limited for 15 minutes. Encourage users to enter credentials carefully and implement account lockout policies on your end.

Case Creation Limiter

The Case Creation Limiter applies to `/api/cases` endpoints. This tier allows 30 requests per 15 minutes per authenticated user. Unlike other limits that count per IP, case creation is tracked per user ID (from JWT token), preventing individual users from creating excessive cases regardless of IP address.

This limit applies to authenticated requests only. You must provide a valid API token or JWT. The limit prevents resource exhaustion from case creation operations, which are computationally intensive.

Bulk Upload Limiter

The Bulk Upload Limiter applies to `/api/bulk-upload` endpoints and enforces the strictest limit: 5 requests per 15 minutes per authenticated user. Bulk operations consume significant server resources, so this tier is the most restrictive.

Plan bulk uploads carefully. If you need to upload large volumes, space requests across the 15-minute window and use batching strategies where possible.

Public Pages Limiter

The Public Pages Limiter applies to static files and public endpoints. This tier allows 200 requests per 15 minutes per IP address. Currently exported but not actively enforced (static files bypass rate limiting), this tier is reserved for future dynamic public endpoints.

Response Headers

All rate-limited API responses include standard HTTP headers that communicate your current usage and limit status:

Standard Rate Limit Headers

Every response from rate-limited endpoints includes three headers:

| Header | Description | Example |
| --- | --- | --- |
| X-RateLimit-Limit | Maximum requests allowed in the current window | 100 |
| X-RateLimit-Remaining | Requests remaining in the current window | 87 |
| X-RateLimit-Reset | Unix timestamp (seconds) when limit resets | 1712701234 |

Using Headers for Proactive Management

Check X-RateLimit-Remaining after each request to monitor your usage. When remaining requests approach zero, reduce your request frequency. For example:

  • If remaining > 50% of limit: proceed normally
  • If remaining 25-50%: reduce request frequency by 50%
  • If remaining < 25%: wait until limit resets or significantly throttle
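These thresholds can be encoded in a small helper. The function below is an illustrative sketch: the name `throttle_factor` and the exact cutoffs are our own choices, not part of the API, so tune them for your workload.

```python
def throttle_factor(remaining: int, limit: int) -> float:
    """Return a request-frequency multiplier based on quota left.

    1.0 = proceed normally, 0.5 = halve frequency, 0.0 = wait for reset.
    """
    ratio = remaining / limit
    if ratio > 0.5:
        return 1.0   # plenty of quota left: proceed normally
    if ratio >= 0.25:
        return 0.5   # getting close: cut request frequency in half
    return 0.0       # nearly exhausted: wait until the window resets
```

Feed it the parsed `X-RateLimit-Remaining` and `X-RateLimit-Limit` header values after each response and scale your request rate by the result.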

Reset Timing

X-RateLimit-Reset shows when your current limit window expires, as a Unix timestamp in seconds. To compute the time until reset, subtract the current Unix time from the reset timestamp. After the reset time, your request counter returns to zero and you can make a full quota of requests.
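That subtraction can be wrapped in a small helper. This sketch assumes the header value arrives as a string; the function name is illustrative:

```python
import time


def seconds_until_reset(reset_header: str) -> int:
    """Convert the X-RateLimit-Reset Unix timestamp into a wait duration.

    Returns 0 if the window has already reset.
    """
    reset_ts = int(reset_header)
    return max(0, reset_ts - int(time.time()))
```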

Reset Is Not At Fixed Times

Rate limit windows are rolling, not fixed. Your 15-minute window doesn't reset at the top of every hour—it's based on when your first request in the current window occurred. Each request increments your counter; the counter resets when 15 minutes pass from your first request.

HTTP 429 Response Format

When you exceed your rate limit, the API returns HTTP 429 (Too Many Requests). The response includes JSON with error details and retry information:

{
  "success": false,
  "message": "Too many requests",
  "error": "RATE_LIMIT_EXCEEDED",
  "retryAfter": 425,
  "resetAt": "2026-04-09T22:00:34.000Z"
}

Response Fields

| Field | Type | Description |
| --- | --- | --- |
| success | Boolean | Always `false` for 429 responses |
| message | String | `"Too many requests"` |
| error | String | `"RATE_LIMIT_EXCEEDED"` |
| retryAfter | Integer | Seconds to wait before retrying |
| resetAt | ISO 8601 | Timestamp when the limit resets |

Retry-After Header

The HTTP response also includes a Retry-After header with the same value as retryAfter in the response body:

HTTP/1.1 429 Too Many Requests
Retry-After: 425
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1712701234

Handling 429 Responses

When you receive a 429 response, implement automatic retry logic:

  1. Extract the retryAfter value from the response (in seconds)
  2. Wait at least that many seconds before retrying
  3. Optionally add jitter (random delay) to prevent thundering herd
  4. Retry the request using the same parameters
  5. If rate limit is hit again, implement exponential backoff

Best Practices for API Consumers

1. Implement Exponential Backoff

When you receive a 429 response, don't retry immediately. Instead, wait the time specified in Retry-After, then increase the wait time exponentially on subsequent failures:

  • First retry: wait 1 second
  • Second retry: wait 2 seconds
  • Third retry: wait 4 seconds
  • Fourth retry: wait 8 seconds
  • Maximum: cap at 60-300 seconds

Add random jitter (0-1 second) to prevent multiple clients from retrying simultaneously and causing a thundering herd problem.
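The schedule above fits in a one-line helper. The function below is an illustrative sketch; the 300-second cap and 0-1 second jitter follow the guidance above:

```python
import random


def backoff_seconds(attempt: int, cap: float = 300.0) -> float:
    """Exponential backoff for retry `attempt` (1-based), with jitter.

    attempt 1 -> ~1s, 2 -> ~2s, 3 -> ~4s, ... capped at `cap` seconds,
    plus 0-1s of random jitter to avoid a thundering herd.
    """
    return min(2 ** (attempt - 1), cap) + random.random()
```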

2. Monitor X-RateLimit-Remaining

Proactively throttle requests before hitting the limit. As you approach your quota, reduce request frequency. This prevents rate limit errors and ensures a smoother API experience.

Implement simple logic: if remaining is less than 20% of limit, wait or batch requests.

3. Implement Response Caching

Cache API responses when possible. For example:

  • Cache case lookups for 5-10 minutes
  • Cache user data after retrieval
  • Avoid repeated requests for the same data

Caching reduces API calls, lowers rate limit pressure, and improves application performance.
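A minimal time-based cache is enough for these cases. The sketch below is illustrative (the class name and TTL default are our own); in production you might prefer a library such as `cachetools`:

```python
import time


class TTLCache:
    """Minimal time-based cache to avoid repeated requests for the same data."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._store.get(key)
        if entry and entry[0] > time.time():
            return entry[1]
        self._store.pop(key, None)  # drop expired or missing entries
        return None

    def set(self, key, value):
        self._store[key] = (time.time() + self.ttl, value)
```

Check the cache before issuing a case lookup and store the response on a miss; a 5-10 minute TTL matches the guidance above.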

4. Use Batch Operations

When creating multiple cases or uploading bulk data, use batch endpoints if available. A single batch request counts as one API call instead of multiple individual requests.

For example, creating 10 cases via bulk upload uses 1 request instead of 10.

5. Space Requests Across Time

Distribute requests evenly across the 15-minute window instead of spiking. For example:

  • Instead of making 100 requests in 1 minute, spread them over 15 minutes
  • Implement queuing to smooth request patterns
  • Batch operations to reduce total request count
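Spreading requests can be as simple as sleeping between calls. The generator below is an illustrative sketch sized for the General tier defaults (100 requests per 900 seconds, i.e. about one request every 9 seconds):

```python
import time


def spaced_calls(items, limit=100, window_seconds=900):
    """Yield items with a delay that spreads `limit` calls over the window.

    With the General tier defaults this sleeps ~9s between requests
    instead of bursting them all at once.
    """
    interval = window_seconds / limit
    for i, item in enumerate(items):
        if i > 0:
            time.sleep(interval)  # pace requests evenly across the window
        yield item
```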

6. Handle Rate Limits Gracefully

Plan for rate limit errors in your error handling. Return informative messages to users and implement automatic retry logic rather than failing immediately.

7. Authenticate When Possible

For endpoints that support authentication, provide API keys or tokens. Authenticated requests often have higher limits or different rate limit categories. Using authentication can increase your allowance.

Rate Limits Enable Scale

Rate limiting ensures the API remains responsive for all users. By respecting limits and implementing best practices, you contribute to a healthier ecosystem and ensure your integration remains reliable as we scale.

Code Examples

Here are practical examples for handling rate limits in JavaScript and Python:

JavaScript: Exponential Backoff with Fetch

async function apiCallWithRetry(url, options = {}, maxRetries = 5) {
  let lastError;

  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const response = await fetch(url, options);

      // Check headers for rate limit status
      const remaining = response.headers.get('X-RateLimit-Remaining');
      const limit = response.headers.get('X-RateLimit-Limit');
      console.log(`Rate limit: ${remaining}/${limit}`);

      if (response.status === 429) {
        // Rate limited: honor the server's Retry-After, then back off exponentially
        const retryAfter = response.headers.get('Retry-After');
        const waitSeconds = parseInt(retryAfter || '1', 10);

        if (attempt < maxRetries) {
          const backoffMs = Math.min(
            Math.max(waitSeconds * 1000, Math.pow(2, attempt) * 1000) +
              Math.random() * 1000, // jitter
            300000 // 5 minute max
          );
          console.log(`Rate limited. Waiting ${backoffMs}ms...`);
          await new Promise(r => setTimeout(r, backoffMs));
          continue;
        }
      }

      if (!response.ok) {
        throw new Error(`HTTP ${response.status}: ${response.statusText}`);
      }

      return await response.json();
    } catch (error) {
      lastError = error;
      if (attempt < maxRetries) {
        const backoffMs = Math.pow(2, attempt) * 1000;
        await new Promise(r => setTimeout(r, backoffMs));
      }
    }
  }

  throw lastError;
}

// Usage:
const data = await apiCallWithRetry('https://api.legalcollects.ai/api/cases', {
  method: 'POST',
  headers: {
    'Authorization': 'Bearer YOUR_API_TOKEN',
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({ /* case data */ })
});

Python: Rate Limit-Aware Client

import requests
import time
from datetime import datetime

class LegalCollectsClient:
    def __init__(self, api_token):
        self.api_token = api_token
        self.base_url = 'https://api.legalcollects.ai'
        self.rate_limit_reset = None
        self.rate_limit_remaining = None

    def _handle_rate_limit(self, response):
        """Update rate limit tracking from response headers"""
        self.rate_limit_remaining = int(
            response.headers.get('X-RateLimit-Remaining', -1)
        )
        reset_ts = int(response.headers.get('X-RateLimit-Reset', 0))
        if reset_ts:
            self.rate_limit_reset = datetime.fromtimestamp(reset_ts)

    def _check_rate_limit_before_request(self):
        """Proactively throttle if approaching limit"""
        if self.rate_limit_remaining is not None:
            if self.rate_limit_remaining < 5:
                # Approaching limit, wait until reset
                if self.rate_limit_reset:
                    wait_seconds = (
                        self.rate_limit_reset - datetime.now()
                    ).total_seconds()
                    if wait_seconds > 0:
                        print(f"Rate limit approaching. Waiting {wait_seconds}s...")
                        time.sleep(wait_seconds + 1)

    def request(self, method, endpoint, max_retries=5, **kwargs):
        """Make rate-limit-aware request with exponential backoff"""
        url = f"{self.base_url}{endpoint}"
        headers = kwargs.pop('headers', {})
        headers['Authorization'] = f'Bearer {self.api_token}'

        for attempt in range(max_retries + 1):
            # Check before making request
            self._check_rate_limit_before_request()

            try:
                response = requests.request(
                    method, url, headers=headers, **kwargs
                )

                # Update rate limit tracking
                self._handle_rate_limit(response)

                if response.status_code == 429:
                    # Rate limited: honor the server's Retry-After, then back off
                    retry_after = int(
                        response.headers.get('Retry-After', '1')
                    )

                    if attempt < max_retries:
                        # Exponential backoff with jitter, capped at 5 minutes
                        backoff = min(
                            max(retry_after, 2 ** attempt) + (time.time() % 1),
                            300
                        )
                        print(f"Rate limited. Retrying in {backoff:.1f}s...")
                        time.sleep(backoff)
                        continue

                response.raise_for_status()
                return response.json()

            except requests.exceptions.RequestException as e:
                if attempt < max_retries:
                    backoff = (2 ** attempt) * 1
                    print(f"Request failed: {e}. Retrying in {backoff}s...")
                    time.sleep(backoff)
                    continue
                raise

        raise Exception("Max retries exceeded")

    def create_case(self, case_data):
        """Create a case with automatic rate limit handling"""
        return self.request('POST', '/api/cases', json=case_data)

    def get_cases(self):
        """List cases with automatic rate limit handling"""
        return self.request('GET', '/api/cases')

# Usage:
client = LegalCollectsClient('YOUR_API_TOKEN')
case = client.create_case({
    'claimant': 'ABC Corp',
    'defendant': 'XYZ Inc',
    'amount': 50000
})

Best Practice: Batch Operations

For bulk operations, batch multiple actions into single requests:

// Instead of this (10 API calls):
for (const caseData of cases) {
  await fetch('/api/cases', {
    method: 'POST',
    body: JSON.stringify(caseData)
  });
}

// Do this (1 API call):
await fetch('/api/cases/bulk', {
  method: 'POST',
  body: JSON.stringify({ cases })
});

Ready to Integrate with LegalCollects.ai?

Start building with the LegalCollects.ai API. Our comprehensive rate limiting ensures reliable performance for all integrations. Check our full API documentation for detailed endpoint references and authentication details.


Frequently Asked Questions

What rate limits does the LegalCollects.ai API enforce?

LegalCollects.ai implements five tiered rate limits: General API endpoints allow 100 requests per 15 minutes per IP address. Authentication endpoints (login/register) allow 10 requests per 15 minutes per IP. Case creation allows 30 requests per 15 minutes per authenticated user. Bulk upload allows 5 requests per 15 minutes per authenticated user. Public pages allow 200 requests per 15 minutes per IP address. Limits use rolling 15-minute windows, not fixed times.

How can I monitor my rate limit usage?

Every API response includes rate limit headers: X-RateLimit-Limit (maximum requests in window), X-RateLimit-Remaining (requests left in current window), and X-RateLimit-Reset (Unix timestamp when limit resets). Monitor the X-RateLimit-Remaining header to track usage. Use this information to proactively throttle requests before hitting the limit. When you receive a 429 response, the body includes retryAfter (seconds to wait) and resetAt (when your limit resets).

How should I handle an HTTP 429 response?

When you receive HTTP 429, implement automatic retry with exponential backoff. Extract the Retry-After header (or retryAfter from response JSON) to learn how many seconds to wait. Wait that duration, then retry the request. If rate limited again, double the wait time and retry (exponential backoff). Add random jitter (0-1 second) to prevent multiple clients from retrying simultaneously. Cap maximum wait time at 300 seconds (5 minutes). Most 429 errors resolve automatically with this strategy.

When do rate limit windows reset?

LegalCollects.ai uses rolling 15-minute windows, not fixed reset times. Your rate limit window is based on when your first request in the current window occurred, not on clock time. For example, if your first request is at 2:30 PM, your 15-minute window is 2:30-2:45 PM. If you make another request at 2:40 PM, that request is in the same 2:30-2:45 window. This approach is fairer because it allows continuous usage spread across the window, rather than penalizing users near fixed reset times.

Can I get higher rate limits for high-volume integrations?

Standard tier limits are available to all API users. For high-volume integrations, contact the LegalCollects.ai team to discuss enterprise rate limits. We can evaluate your use case and may be able to provide higher limits or tiered rate limiting based on your account tier. Alternatively, implement batch operations and caching to reduce API calls within standard limits. Most use cases can be optimized to work within default limits.

How do I avoid hitting rate limits?

Implement these strategies: (1) Monitor X-RateLimit-Remaining and proactively throttle when approaching limits. (2) Cache responses to avoid repeated requests for the same data. (3) Use batch operations for bulk actions instead of individual requests. (4) Space requests evenly across the 15-minute window instead of spiking. (5) Implement exponential backoff for 429 responses. (6) Use authentication where available (authenticated endpoints may have higher limits). Most rate limit issues come from inefficient API usage patterns rather than legitimate high volume.


Legal Collects Developer Team

The LegalCollects API team builds robust, scalable infrastructure for commercial debt collection. We design rate limiting and API architecture to ensure reliable integration for developers and fair usage for all users.