OpenAI Privacy Filter — rate limits and pricing (2026)
If you're planning to use OpenAI Privacy Filter in production via PrivacyFilter.run, this page covers the rate limits, character caps, batch options, cost estimates, and retry patterns you need for capacity planning.
Plan comparison
| Plan | Price | Redactions | Max chars / call | Batch | Daily cap |
|---|---|---|---|---|---|
| Free | $0 | 3/day per IP | 2,000 | No | 3 |
| Redact Pack | $9 one-time | 50 (never expire) | 10,000 | 5 docs/call | — |
| Unlimited Monthly | $19/month | Unlimited | 10,000 | 20 docs/call | 200/day soft cap |
The 200/day soft cap on Unlimited Monthly is a fair-use guard against abuse. Most teams processing support tickets, emails, or documents will stay well below this ceiling. If you routinely need more, contact support.
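The batch sizes and character caps in the table can be enforced client-side before any call is made. The following is a minimal sketch that only groups documents; `PLAN_LIMITS`, its plan keys, and `group_for_batch` are hypothetical names, and how a batch is actually submitted to the endpoint is not described on this page.

```python
# Limits copied from the plan table; the dictionary keys are illustrative names.
PLAN_LIMITS = {
    "free": {"max_chars": 2_000, "batch_size": 1},
    "redact_pack": {"max_chars": 10_000, "batch_size": 5},
    "unlimited_monthly": {"max_chars": 10_000, "batch_size": 20},
}

def group_for_batch(docs: list[str], plan: str) -> list[list[str]]:
    """Group documents into lists no longer than the plan's batch size."""
    limits = PLAN_LIMITS[plan]
    # Reject documents over the per-call character cap before grouping.
    too_long = [i for i, d in enumerate(docs) if len(d) > limits["max_chars"]]
    if too_long:
        raise ValueError(f"documents at indexes {too_long} exceed {limits['max_chars']} chars")
    size = limits["batch_size"]
    return [docs[i:i + size] for i in range(0, len(docs), size)]
```

For example, twelve short documents on the Redact Pack would be grouped into batches of 5, 5, and 2.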
API rate limiter behaviour
The /api/redact endpoint returns HTTP 429 Too Many Requests when:
- Free tier: you've used 3 calls in the current calendar day (UTC)
- Paid plans (hourly limit): you've exceeded 30 calls/hour on the same license key prefix
- Checkout spam protection: more than 10 checkout attempts per hour from the same IP
The 429 response body includes a detail field explaining which limit was hit:
{"detail": "Free tier: 3 redactions/day used. Upgrade at privacyfilter.run/#pricing"}
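Because the free-tier limit is counted per UTC calendar day, a client that hits it can work out how long to wait instead of retrying blindly. This is a rough sketch: `seconds_until_utc_reset` and `is_daily_free_limit` are hypothetical helpers, and matching on the "Free tier" prefix assumes the wording of the `detail` message shown above stays stable.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def seconds_until_utc_reset(now: Optional[datetime] = None) -> float:
    """Seconds until the next UTC midnight, when the free-tier day rolls over."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    next_midnight = (now + timedelta(days=1)).replace(
        hour=0, minute=0, second=0, microsecond=0
    )
    return (next_midnight - now).total_seconds()

def is_daily_free_limit(body: dict) -> bool:
    """True if a 429 body looks like the free-tier daily message (assumed prefix)."""
    return str(body.get("detail", "")).startswith("Free tier")
```

If `is_daily_free_limit` returns true, a short backoff is unlikely to help; scheduling the next attempt after `seconds_until_utc_reset()` is the more sensible option.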
Retry strategy
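import time
import httpx

def redact_with_retry(text: str, license_key: str, max_retries: int = 3) -> dict:
    """POST to /api/redact, backing off exponentially on HTTP 429."""
    url = "https://privacyfilter.run/api/redact"
    for attempt in range(max_retries):
        r = httpx.post(url, json={"text": text, "license_key": license_key}, timeout=20)
        if r.status_code != 429:
            r.raise_for_status()  # surface any other 4xx/5xx immediately
            return r.json()
        # Backoff only helps with short-lived limits; the free-tier daily cap
        # does not clear until the next UTC day.
        if attempt < max_retries - 1:
            time.sleep(2 ** attempt)  # 1s, 2s, ... (no sleep after the last attempt)
    raise RuntimeError("Rate limit persists after retries")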
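Retrying handles a 429 after the fact. A client can also pace itself so the 30 calls/hour limit is rarely hit. The sketch below is a client-side sliding window; `HourlyThrottle` is a hypothetical class, and whether the server counts a sliding or a fixed hour is an assumption this page does not settle.

```python
import time
from collections import deque

class HourlyThrottle:
    """Client-side sliding window that tracks calls made within the last hour."""

    def __init__(self, max_calls: int = 30, window_s: float = 3600.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock
        self.calls = deque()  # timestamps of calls still inside the window

    def wait_time(self) -> float:
        """Seconds to wait before the next call fits in the window (0.0 if it fits now)."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] >= self.window_s:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.window_s - (now - self.calls[0])

    def record(self) -> None:
        """Record that a call was just sent."""
        self.calls.append(self.clock())
```

A caller would sleep for `throttle.wait_time()` seconds, send the request, then call `throttle.record()`. The injectable `clock` is there only to make the class easy to test.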
Cost estimates for common workloads
- 10–50 docs/month (support tickets, occasional reports): Free tier or Redact Pack ($9 one-time)
- 50–200 docs/month: Redact Pack if documents are independent; Unlimited Monthly ($19/mo) if you need consistent access
- 200–6,000 docs/month: Unlimited Monthly ($19/mo). At the 200/day soft cap, which is about 6,000/month over 30 days, the cost works out to roughly $0.003 per redaction
- 6,000+ docs/month: Contact support for a volume arrangement
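The estimates above can be reproduced with a few lines of arithmetic. This sketch assumes one document equals one redaction, that Redact Packs can be bought repeatedly, and a 30-day month; none of these is confirmed on this page, and `monthly_cost_options` is a hypothetical helper.

```python
import math

PACK_PRICE, PACK_REDACTIONS = 9.00, 50   # Redact Pack: $9 for 50 redactions
MONTHLY_PRICE = 19.00                    # Unlimited Monthly
MONTHLY_SOFT_CAP = 200 * 30              # 200/day soft cap over an assumed 30-day month

def monthly_cost_options(docs_per_month: int) -> dict:
    """Compare paid options, assuming 1 document = 1 redaction."""
    packs = math.ceil(docs_per_month / PACK_REDACTIONS)
    within_cap = docs_per_month <= MONTHLY_SOFT_CAP
    return {
        "redact_packs": packs * PACK_PRICE,  # assumes packs can be stacked
        "unlimited_monthly": MONTHLY_PRICE if within_cap else None,
        "monthly_cost_per_doc": (
            round(MONTHLY_PRICE / docs_per_month, 4) if docs_per_month else None
        ),
    }
```

Under these assumptions, two packs ($18) cover 100 documents, while a third pack brings the total to $27, which is more than one month of Unlimited Monthly.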
Character counting
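Character limits are applied to the input text length (Python len(text), which counts Unicode code points, not bytes). A CJK character or a single-code-point emoji counts as 1, but emoji built from several code points (ZWJ sequences, flags, skin-tone variants) count as more than 1. For documents near the 10,000-char limit, add a pre-check: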
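MAX_CHARS = 10_000

def split_if_needed(text: str, chunk_size: int = MAX_CHARS) -> list[str]:
    """Split on paragraph boundaries; hard-split only paragraphs longer than chunk_size."""
    if len(text) <= chunk_size:
        return [text]
    sep = "\n\n"
    chunks, current, current_len = [], [], 0
    for p in text.split(sep):
        # A single paragraph longer than the limit has to be cut into pieces.
        pieces = [p[i:i + chunk_size] for i in range(0, len(p), chunk_size)] or [p]
        for piece in pieces:
            # Length of the chunk if this piece is joined on, separator included.
            joined_len = current_len + len(sep) + len(piece) if current else len(piece)
            if current and joined_len > chunk_size:
                chunks.append(sep.join(current))
                current, current_len = [piece], len(piece)
            else:
                current.append(piece)
                current_len = joined_len
    if current:
        chunks.append(sep.join(current))
    return chunks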
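To see how code-point counting differs from byte counting, and why some emoji count as more than one character, the following sketch compares a few strings. `char_count` is a hypothetical helper that simply wraps `len`, on the assumption that the server counts the same way as described above.

```python
def char_count(text: str) -> int:
    """Length in Unicode code points, as the limit is described above."""
    return len(text)

cjk = "\u65e5\u672c\u8a9e"                             # three CJK characters
thumbs_up = "\U0001F44D"                               # one emoji, one code point
family = "\U0001F468\u200D\U0001F469\u200D\U0001F467"  # three emoji joined by two ZWJs

print(char_count(cjk), len(cjk.encode("utf-8")))  # 3 code points, 9 UTF-8 bytes
print(char_count(thumbs_up))                      # 1
print(char_count(family))                         # 5
```

The family emoji renders as a single glyph but uses five code points, so emoji-heavy text reaches the limit sooner than its visible length suggests.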
Latency expectations
Typical p50 latency from an EU server: 800ms–1.5s per call. p99 under normal load: ~3s. The LLM inference step dominates; network RTT is negligible in comparison. For latency-sensitive paths, pre-redact in a background worker and cache the result rather than calling inline.
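One way to keep redaction off the hot path is a cache keyed by a hash of the input, filled by a background worker. This is a minimal in-memory sketch: `RedactionCache` is a hypothetical class, and the `redact` callable stands in for whatever function wraps the API call.

```python
import hashlib
from typing import Callable

class RedactionCache:
    """In-memory cache keyed by a SHA-256 hash of the input text."""

    def __init__(self, redact: Callable[[str], dict]):
        self._redact = redact  # e.g. a wrapper around the API call
        self._store: dict[str, dict] = {}

    def get(self, text: str) -> dict:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key not in self._store:
            # Slow path: one API round trip, then reuse the stored result.
            self._store[key] = self._redact(text)
        return self._store[key]
```

A background worker would call `cache.get(text)` ahead of time, so a later inline lookup for the same text returns without another API call. Repeated inputs also stop counting against the rate limits.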
Start with the free tier — 3 redactions/day, no credit card, up to 2,000 characters each.