April 28, 2026  ·  5 min read

OpenAI Privacy Filter — rate limits and pricing (2026)

If you're planning to use OpenAI Privacy Filter in production via PrivacyFilter.run, this page covers the rate limits, character caps, batch options, cost estimates, and retry patterns you need for capacity planning.

Plan comparison

Plan               Price        Redactions          Max chars/call   Batch         Daily cap
Free               $0           3/day per IP        2,000            No            3
Redact Pack        $9 one-time  50 (never expire)   10,000           5 docs/call   —
Unlimited Monthly  $19/month    Unlimited           10,000           20 docs/call  200/day soft cap

The 200/day soft cap on Unlimited Monthly is a fair-use guard against abuse. Most teams processing support tickets, emails, or documents will stay well below this ceiling. If you routinely need more, contact support.

API rate limiter behaviour

The /api/redact endpoint returns HTTP 429 Too Many Requests when a plan's daily limit is exhausted — the Free tier's 3 redactions/day per IP, or the Unlimited Monthly 200/day soft cap.

The 429 response body includes a detail field explaining which limit was hit:

{"detail": "Free tier: 3 redactions/day used. Upgrade at privacyfilter.run/#pricing"}

Retry strategy

import time
import httpx

def redact_with_retry(text: str, license_key: str, max_retries: int = 3) -> dict:
    url = "https://privacyfilter.run/api/redact"
    for attempt in range(max_retries):
        r = httpx.post(url, json={"text": text, "license_key": license_key}, timeout=20)
        if r.status_code == 429:
            wait = 2 ** attempt  # 1s, 2s, 4s
            time.sleep(wait)
            continue
        r.raise_for_status()
        return r.json()
    raise RuntimeError("Rate limit persists after retries")

Cost estimates for common workloads
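The per-redaction cost falls straight out of the plan table; the workload size below is an assumption chosen for illustration, not a benchmark:

```python
# Cost per redaction on each paid plan (prices from the table above).
redact_pack_cost = 9 / 50  # $9 one-time for 50 redactions = $0.18 each

# Unlimited Monthly, amortized over an assumed 100 redactions/day workload
# (well under the 200/day soft cap).
monthly_redactions = 100 * 30
unlimited_cost = 19 / monthly_redactions  # ≈ $0.0063 each
```

The crossover is quick: past roughly 100 redactions in a month, Unlimited Monthly is cheaper per document than buying Redact Packs.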

Character counting

Character limits are applied to the input text length (Python len(text), which is Unicode codepoints, not bytes). Emoji and CJK characters each count as 1. For documents near the 10,000-char limit, add a pre-check:

MAX_CHARS = 10_000

def split_if_needed(text: str, chunk_size: int = MAX_CHARS) -> list[str]:
    """Split on paragraph boundaries to avoid cutting mid-sentence."""
    if len(text) <= chunk_size:
        return [text]
    paragraphs = text.split("\n\n")
    chunks, current, current_len = [], [], 0
    for p in paragraphs:
        # +2 accounts for the "\n\n" separator that join() re-inserts.
        added = len(p) + (2 if current else 0)
        if current and current_len + added > chunk_size:
            chunks.append("\n\n".join(current))
            current, current_len = [p], len(p)
        else:
            current.append(p)
            current_len += added
    if current:
        chunks.append("\n\n".join(current))
    return chunks
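A quick confirmation of the codepoint counting described above — Python's len() counts Unicode codepoints, not encoded bytes:

```python
# len() counts codepoints, so multi-byte characters still count as 1 each.
assert len("héllo") == 5                # accented letter is one codepoint
assert len("日本語") == 3                # each CJK character counts as 1
assert len("🎉") == 1                   # one codepoint...
assert len("🎉".encode("utf-8")) == 4   # ...but four UTF-8 bytes
```

This means a pre-check against the 10,000-char limit should use len(text) directly, never the byte length of the encoded payload.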

Latency expectations

Typical p50 latency from an EU server: 800 ms–1.5 s per call; p99 under normal load: ~3 s. The LLM inference step dominates, so network RTT is negligible in comparison. For latency-sensitive paths, pre-redact in a background worker and cache the result rather than calling inline.

Start with the free tier — 3 redactions/day, no credit card, up to 2,000 characters each.

Try PrivacyFilter free →
