Recent product change: API rate limits effective June 30, 2026

LILT is committed to delivering fast, reliable translation performance to every organization on our platform. To protect that shared experience and make sure no single workload can degrade service for others, we’re adding per-organization rate limits to our document pretranslation and file-translation APIs, beginning June 30, 2026. The endpoints themselves aren’t changing; we’re simply introducing fair-use limits so capacity stays balanced across all our customers.

Please note: All LILT Connectors are out of scope for these rate limits, which apply only to the API.

Overview

Starting June 30, 2026, LILT is introducing per-organization rate limits on the following endpoints:

POST /v2/documents/pretranslate
POST /v2/translate/file
POST /v2/documents/files (only when the upload invokes MT - see exclusion note below)
POST /v2/jobs (request-rate limit only; character throughput is not counted)

Two independent limits apply to every organization:

Limit	Threshold
Request rate	300 requests per minute
Character throughput	2,500,000 characters per minute

Requests that exceed either limit receive an HTTP 429 Too Many Requests response. This guide explains how to read the 429 response headers and restructure your integration to stay comfortably within these thresholds.
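The two thresholds can also be tracked on the client so that a request is held back before it would exceed either one. The following is a minimal sketch, not part of any LILT SDK: the `MinuteBudget` class and its method names are hypothetical, it assumes you know each request's character count in advance, and its local 60-second window may not line up with the window the server uses.

```python
import time

REQUESTS_PER_MINUTE = 300      # request-rate threshold from the table above
CHARS_PER_MINUTE = 2_500_000   # character-throughput threshold from the table above


class MinuteBudget:
    """Client-side estimate of usage in the current 60-second window.

    Hypothetical helper: the server's window boundaries may differ from
    this local clock, so treat the result as a hint, not a guarantee.
    """

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._window_start = clock()
        self.requests_used = 0
        self.chars_used = 0

    def _roll_window(self):
        # Start a fresh window once 60 seconds have elapsed.
        if self._clock() - self._window_start >= 60:
            self._window_start = self._clock()
            self.requests_used = 0
            self.chars_used = 0

    def seconds_until_allowed(self, chars):
        """Return 0.0 if a request of `chars` characters fits, else the wait in seconds."""
        self._roll_window()
        fits = (self.requests_used + 1 <= REQUESTS_PER_MINUTE
                and self.chars_used + chars <= CHARS_PER_MINUTE)
        if fits:
            return 0.0
        return max(0.0, 60 - (self._clock() - self._window_start))

    def record(self, chars):
        """Count one sent request and its characters against the window."""
        self._roll_window()
        self.requests_used += 1
        self.chars_used += chars
```

Before each call, sleep for `seconds_until_allowed(chars)` and then call `record(chars)` once the request is sent. A single request larger than the per-minute character threshold never fits in this sketch and needs to be split first.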

File uploads without Machine Translation (MT) are excluded

POST /v2/documents/files uploads a file; it does not, by itself, run machine translation (MT). Uploads that don’t invoke MT are not counted against the character-throughput limit: LILT evaluates each request’s translation cost, and a non-MT upload is measured as zero characters, so it does not draw down your throughput budget. Only calls that actually consume MT (POST /v2/documents/pretranslate, POST /v2/translate/file, and a POST /v2/documents/files upload that triggers pretranslation) count toward the 2,500,000-characters-per-minute limit. You can upload source files freely; batch and pace the translation calls that follow.

Understanding the 429 Response

When a request is rate-limited, LILT returns a 429 with three headers that tell you exactly when you can safely retry:

Header	Meaning
`X-RateLimit-Limit`	Your per-minute allowance (requests or characters, depending on which limit was hit)
`X-RateLimit-Remaining`	Requests or characters still available in the current one-minute window
`X-RateLimit-Reset`	Seconds until the current window resets and your full quota is restored

The two limits are independent. You can hit the character-throughput ceiling without exhausting your request count, or vice versa. When handling a 429, check X-RateLimit-Limit to see which limit was hit and X-RateLimit-Reset to see how long to wait.

Example 429 response

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 300
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 38

{
  "error": "rate_limit_exceeded",
  "message": "Request rate limit reached. Retry after 38 seconds."
}
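As an illustration of reading these headers, the sketch below turns them into a small summary. The function name `describe_rate_limit` and the returned keys are hypothetical. It assumes the header values are plain integers, as in the example above, and that `X-RateLimit-Limit` matches one of the two default thresholds; any other value is reported as unknown.

```python
REQUEST_LIMIT = 300        # requests per minute (default threshold)
CHAR_LIMIT = 2_500_000     # characters per minute (default threshold)


def describe_rate_limit(headers):
    """Summarize the rate-limit headers of a 429 response.

    `headers` is any mapping of header name to string value. A plain dict
    is case-sensitive, so the names must match exactly.
    """
    limit = int(headers["X-RateLimit-Limit"])
    if limit == REQUEST_LIMIT:
        kind = "request rate"
    elif limit == CHAR_LIMIT:
        kind = "character throughput"
    else:
        kind = "unknown"   # for example, an organization with customized limits
    return {
        "limit_hit": kind,
        "remaining": int(headers["X-RateLimit-Remaining"]),
        "retry_after_s": int(headers["X-RateLimit-Reset"]),
    }


example = {
    "X-RateLimit-Limit": "300",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "38",
}
summary = describe_rate_limit(example)
```

For the example response above, `summary` reports the request-rate limit with a 38-second wait.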

Handling a 429: Retry with Back-Off

The simplest fix for an occasional 429 is to wait for the window to reset before retrying. Do not tight-loop or immediately re-send. This wastes quota and keeps triggering 429s.

Recommended retry pattern

On receiving a 429:

Read the X-RateLimit-Reset value from the response headers.
Sleep for that many seconds (plus a small jitter to avoid synchronized retries across parallel workers).
Re-send the original request.

Python example

import time, random, requests

def pretranslate(payload, headers):
    url = "https://api.lilt.com/v2/documents/pretranslate"
    for attempt in range(5):
        resp = requests.post(url, json=payload, headers=headers)
        if resp.status_code == 429:
            reset_in = int(resp.headers.get("X-RateLimit-Reset", 60))
            jitter   = random.uniform(0.5, 2.0)
            wait     = reset_in + jitter
            print(f"Rate limited. Retrying in {wait:.1f}s (attempt {attempt+1}/5)")
            time.sleep(wait)
            continue
        resp.raise_for_status()
        return resp.json()
    raise RuntimeError("Exceeded retry limit")

Node.js example

async function pretranslate(payload, headers) {
  const url = "https://api.lilt.com/v2/documents/pretranslate";
  for (let attempt = 0; attempt < 5; attempt++) {
    const res = await fetch(url, {
      method: "POST",
      headers: { ...headers, 'Content-Type': 'application/json' },
      body: JSON.stringify(payload),
    });
    if (res.status === 429) {
      const resetIn = parseInt(res.headers.get("X-RateLimit-Reset") ?? "60", 10);
      const jitter  = Math.random() * 1.5 + 0.5;
      const wait    = (resetIn + jitter) * 1000;
      console.log(`Rate limited. Retrying in ${(wait/1000).toFixed(1)}s`);
      await new Promise(r => setTimeout(r, wait));
      continue;
    }
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return res.json();
  }
  throw new Error("Exceeded retry limit");
}

Batching Requests

If your integration sends many small, individual translation calls in rapid succession, batching (combining multiple documents or files into fewer API calls) is the most effective way to stay well under the rate limits.

What to batch

Endpoint	Batching approach
`POST /v2/documents/pretranslate`	Pass an array of document IDs in a single request body
`POST /v2/translate/file`	Batch by referencing multiple already-uploaded files in a single call. Pass their IDs via the fileId query parameter (e.g. ?fileId=1,2,3 or repeated ?fileId=1&fileId=2). This does not accept multiple source files in the multipart body; the files must already exist (uploaded via the file-upload endpoint first).
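The two fileId query-string forms mentioned in the table can be built with Python's standard library. This is a sketch only: the helper names are hypothetical, and whether the server also accepts a percent-encoded comma (%2C), which most HTTP clients produce by default, is an assumption worth verifying against the API reference.

```python
from urllib.parse import urlencode


def comma_form(file_ids):
    """Build 'fileId=1,2,3', keeping the comma unescaped."""
    joined = ",".join(str(i) for i in file_ids)
    return urlencode({"fileId": joined}, safe=",")


def repeated_form(file_ids):
    """Build 'fileId=1&fileId=2&fileId=3' by repeating the parameter."""
    return urlencode({"fileId": list(file_ids)}, doseq=True)


comma_query = comma_form([1, 2, 3])
repeated_query = repeated_form([1, 2, 3])
```

Either string can be appended after `?` on the request URL.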

Character throughput tip: Batching reduces your request count, but each batch’s characters still count toward the 2,500,000 character-per-minute throughput limit. If you’re working with very large documents, spread batches across multiple windows rather than sending all characters at once.

Sizing your batches

There is no fixed rule for batch size; it depends on your document sizes and submission cadence. Use these guidelines as a starting point:

Keep each batch well under 2,500,000 characters to leave headroom for concurrent jobs from other parts of your organization.
If X-RateLimit-Remaining is consistently close to zero, reduce batch frequency or split large batches into smaller ones with a brief pause between them.
For burst workloads (e.g., end-of-sprint file exports), schedule submissions in staggered windows rather than all at once.
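One way to apply these guidelines is to group documents greedily so that each batch stays under a character budget. The sketch below assumes you already know each document's source character count. The function name `pack_batches` and the defaults of 1,000,000 characters and 50 documents per batch are illustrative choices, not values recommended by LILT.

```python
def pack_batches(doc_sizes, max_chars=1_000_000, max_docs=50):
    """Group (doc_id, char_count) pairs into batches under a character budget.

    Documents are taken in order. A document larger than max_chars ends up
    in a batch of its own rather than being dropped.
    """
    batches, current, current_chars = [], [], 0
    for doc_id, chars in doc_sizes:
        too_big = current_chars + chars > max_chars
        too_many = len(current) >= max_docs
        # Close the current batch before it would exceed either cap.
        if current and (too_big or too_many):
            batches.append(current)
            current, current_chars = [], 0
        current.append(doc_id)
        current_chars += chars
    if current:
        batches.append(current)
    return batches


sizes = [(1, 400_000), (2, 500_000), (3, 300_000), (4, 1_200_000), (5, 100_000)]
batches = pack_batches(sizes)
```

Each resulting list of IDs can then be submitted as one request, with a pause between requests when the combined characters approach the per-minute threshold.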

Pretranslation: batching document IDs

The POST /v2/documents/pretranslate endpoint accepts an array of document IDs. Instead of issuing one request per document, collect IDs and submit them together:

import requests

# requests needs an absolute URL; a bare path raises MissingSchema
PRETRANSLATE_URL = "https://api.lilt.com/v2/documents/pretranslate"

# ✗  One request per document (inefficient)
for doc_id in document_ids:
    requests.post(PRETRANSLATE_URL, json={'id': [doc_id]}, headers=headers)

# ✓  All documents in a single request (efficient)
requests.post(
    PRETRANSLATE_URL,
    json={'id': document_ids},   # e.g. [1001, 1002, 1003, ...]
    headers=headers,
)

If you have hundreds of documents to pretranslate, split them into chunks and submit one chunk per window:

import time

CHUNK_SIZE = 50   # documents per request
WINDOW_SEC = 62   # slightly more than 60 s to be safe

def batch_pretranslate(doc_ids, headers):
    chunks = [doc_ids[i:i+CHUNK_SIZE] for i in range(0, len(doc_ids), CHUNK_SIZE)]
    for i, chunk in enumerate(chunks):
        print(f'Submitting chunk {i+1}/{len(chunks)} ({len(chunk)} docs)')
        pretranslate({'id': chunk}, headers)   # uses retry helper above
        if i < len(chunks) - 1:
            time.sleep(WINDOW_SEC)

File translation: Uploading and batching by reference

The POST /v2/documents/files endpoint accepts one file per request. Additional files in a multipart body are silently ignored. To translate many files, upload each one individually, then batch the downstream operation by passing the resulting IDs in a single call (e.g. an array of document IDs to POST /v2/documents/pretranslate, or multiple fileId query parameters to POST /v2/translate/file).

import requests

def upload_files(file_paths, headers):
    """Upload files one at a time; return the list of created file IDs."""
    file_ids = []
    for path in file_paths:
        with open(path, 'rb') as fh:
            files = {'file': (path.split('/')[-1], fh, 'application/octet-stream')}
            resp = requests.post(
                'https://api.lilt.com/v2/documents/files',
                files=files,
                headers=headers,
            )
        resp.raise_for_status()
        file_ids.append(resp.json()['id'])  # adjust to actual response field
    return file_ids

def translate_files(file_ids, headers):
    """Batch-translate already-uploaded files in a single request by reference."""
    resp = requests.post(
        'https://api.lilt.com/v2/translate/file',
        params={'fileId': ','.join(str(i) for i in file_ids)},
        headers=headers,
    )
    resp.raise_for_status()
    return resp.json()

Proactive Throttling

Rather than reacting to 429s, you can read X-RateLimit-Remaining and X-RateLimit-Reset on every successful response and slow down before you hit the ceiling.

import time

def check_and_throttle(response, low_watermark=20):
    """Pause proactively when remaining quota is low."""
    remaining = int(response.headers.get("X-RateLimit-Remaining", 999))
    reset_in  = int(response.headers.get("X-RateLimit-Reset", 0))
    if remaining <= low_watermark and reset_in > 0:
        print(f'Quota low ({remaining} left). Pausing {reset_in}s.')
        time.sleep(reset_in + 1)

Pre-Launch Checklist

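Before June 30, 2026, verify that your integration:

Sends arrays of document IDs to /v2/documents/pretranslate rather than one ID at a time, and passes multiple file IDs to /v2/translate/file via the fileId query parameter.
Uploads files one per request to /v2/documents/files (additional files in a multipart body are silently ignored), then batches the downstream translation calls by reference.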
Handles HTTP 429 by sleeping for X-RateLimit-Reset seconds (plus jitter) before retrying.
Does not tight-loop on 429 responses.
Monitors X-RateLimit-Remaining and throttles proactively when quota is low.
Schedules large burst workloads across multiple one-minute windows.

Need More Headroom?

The default thresholds are designed to sit well above typical usage patterns. If your workload genuinely requires higher limits, reply to the rate-limits notification email and the LILT team will work with you to find a configuration that fits your needs without impacting the shared platform.

Contact support: Reach out via your rate-limits notification email, or contact LILT support at support.lilt.com. Please include your organization ID and a brief description of your workload volume when you get in touch.
