April 28, 2026  ·  8 min read

OpenAI Privacy Filter accuracy benchmark — precision, recall, and F1 (2026)

OpenAI's official documentation for Privacy Filter doesn't publish accuracy numbers. We ran an independent benchmark using a synthetic labeled dataset derived from real-world text patterns to give developers a concrete sense of what to expect. Results are broken down by entity type with methodology notes and known failure modes.

Note: These results are based on English-language text. Accuracy on non-English inputs differs — see the languages supported guide for details.

Methodology

We created a test corpus of 1,200 sentences derived from synthetic support tickets, HR emails, clinical notes templates, and code comments. Each sentence was manually labeled with ground-truth PII spans. We sent each sentence through the PrivacyFilter.run API and compared the returned entity spans against the labels.

Metrics used: per-entity precision (the share of flagged spans that were true PII), recall (the share of true PII spans that were flagged), and F1 (the harmonic mean of the two).

Span matching was near-strict: a detected span counted as a hit only if its start and end each fell within 2 characters of the ground-truth boundaries.
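Under these definitions, the scoring logic can be sketched as follows (hypothetical helper names; entity spans are assumed to be (start, end) character offsets, which the article does not specify):

```python
def is_hit(pred: tuple[int, int], gold: tuple[int, int], tol: int = 2) -> bool:
    # A predicted span counts as a hit if both boundaries fall
    # within `tol` characters of the ground-truth boundaries.
    return abs(pred[0] - gold[0]) <= tol and abs(pred[1] - gold[1]) <= tol

def precision_recall_f1(preds, golds, tol: int = 2):
    # Greedy one-to-one matching: each gold span absorbs at most one prediction.
    unmatched = list(golds)
    tp = 0
    for p in preds:
        for g in unmatched:
            if is_hit(p, g, tol):
                unmatched.remove(g)
                tp += 1
                break
    precision = tp / len(preds) if preds else 0.0
    recall = tp / len(golds) if golds else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

A stricter evaluation would use exact-boundary matching (`tol=0`); the 2-character tolerance mainly forgives trailing punctuation differences.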

Results by entity type

Entity Type           Precision  Recall  F1
EMAIL                 0.99       0.98    0.99
PHONE                 0.97       0.95    0.96
SSN (US format)       0.98       0.97    0.97
CREDIT_CARD           0.99       0.96    0.97
IP_ADDRESS            0.99       0.97    0.98
PERSON (contextual)   0.91       0.93    0.92
ADDRESS               0.88       0.85    0.87
DATE_OF_BIRTH         0.90       0.82    0.86
Overall (macro avg)   0.95       0.93    0.94
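As a sanity check, the macro averages can be recomputed from the per-entity rows. Here macro F1 is taken as the mean of per-entity F1 scores (one common convention; the values are transcribed from the table above):

```python
# (precision, recall) per entity type, transcribed from the results table
rows = {
    "EMAIL": (0.99, 0.98),
    "PHONE": (0.97, 0.95),
    "SSN": (0.98, 0.97),
    "CREDIT_CARD": (0.99, 0.96),
    "IP_ADDRESS": (0.99, 0.97),
    "PERSON": (0.91, 0.93),
    "ADDRESS": (0.88, 0.85),
    "DATE_OF_BIRTH": (0.90, 0.82),
}

def f1(p: float, r: float) -> float:
    # Harmonic mean of precision and recall.
    return 2 * p * r / (p + r)

macro_p = sum(p for p, _ in rows.values()) / len(rows)
macro_r = sum(r for _, r in rows.values()) / len(rows)
macro_f1 = sum(f1(p, r) for p, r in rows.values()) / len(rows)
print(round(macro_p, 2), round(macro_r, 2), round(macro_f1, 2))  # 0.95 0.93 0.94
```

Note that macro averaging weights each entity type equally regardless of how often it appears in the corpus, so rare types like DATE_OF_BIRTH pull the average down as much as common ones.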

Where it excels

Formatted entities (EMAIL, PHONE, SSN, CREDIT_CARD, IP) score near-perfect because these have unambiguous lexical patterns. Even a regex catches most of these; an LLM-backed model catches edge cases like international phone formats and partial credit card numbers near the threshold.

Contextual PERSON detection is where OpenAI Privacy Filter pulls ahead of regex-based competitors. Given "I forwarded it to Sarah in accounting", the model correctly flags PERSON: Sarah without an email, phone, or last name next to it. Presidio's default recognizers would miss this.

Where it struggles

ADDRESS spans are imprecise

The most common failure mode: the model flags the right address but with a slightly wrong boundary. For example, it may return "123 Main St, San Francisco" when the ground truth was "123 Main Street, San Francisco, CA 94105". The correct span was there, but truncated. For use cases where exact span recovery matters (e.g. reconstructing redacted text), add a post-processing step that expands the matched span to the nearest punctuation boundary.
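The suggested post-processing step could look like this. It is a minimal sketch: the boundary-character set is an assumption you should tune per corpus (note that commas cannot be boundaries, since they appear inside addresses), and the (start, end) span format is assumed:

```python
BOUNDARY_CHARS = ".:;!?\n"  # assumed sentence-level boundaries; tune per corpus

def expand_span(text: str, start: int, end: int) -> tuple[int, int]:
    # Grow a detected span outward to the nearest boundary character
    # (or the text edge) on each side, then trim surrounding whitespace.
    while start > 0 and text[start - 1] not in BOUNDARY_CHARS:
        start -= 1
    while end < len(text) and text[end] not in BOUNDARY_CHARS:
        end += 1
    while start < end and text[start].isspace():
        start += 1
    while end > start and text[end - 1].isspace():
        end -= 1
    return start, end
```

Given a truncated detection like "123 Main Street, San Francisco", this recovers the trailing ", CA 94105" up to the next sentence boundary. The trade-off: it can also swallow non-PII text when an address sits mid-sentence with no punctuation nearby.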

DATE_OF_BIRTH vs generic dates

The model sometimes flags clearly generic dates as DATE_OF_BIRTH — e.g., "the project deadline is January 15" produces a false positive. Conversely, DOB references in HR-style prose ("born on the third of March, 1989") are occasionally missed because they read as narrative text. Precision drops to 0.90 and recall to 0.82 on this entity type.
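One mitigation is a keyword post-filter that demotes DATE_OF_BIRTH detections lacking birth-related context nearby. This is a sketch: the keyword list, window size, and entity-dict format are all illustrative assumptions, not part of the Privacy Filter API:

```python
import re

# Birth-related context words; extend for your domain (e.g. "birthdate", "DOB:").
BIRTH_CONTEXT = re.compile(r"\b(born|birth|dob)\b", re.IGNORECASE)

def filter_dob(text: str, entities: list[dict], window: int = 60) -> list[dict]:
    # Reclassify DATE_OF_BIRTH hits as generic DATE unless a birth-related
    # keyword appears within `window` characters of the detected span.
    out = []
    for ent in entities:
        if ent["label"] == "DATE_OF_BIRTH":
            lo = max(0, ent["start"] - window)
            hi = min(len(text), ent["end"] + window)
            if not BIRTH_CONTEXT.search(text[lo:hi]):
                ent = {**ent, "label": "DATE"}
        out.append(ent)
    return out
```

This trades a little recall on unusual phrasings for fewer false positives on deadlines, release dates, and meeting times.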

First names only, no context

A single common first name without any surrounding context ("Alex said yes") produces about 12% false positives — the model flags "Alex" as PERSON even when it's a product name or variable name in a code snippet. If your corpus contains code or config files, consider pre-filtering non-prose input.
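A crude pre-filter for non-prose input might drop lines that look like code before the PII pass. The heuristics below are illustrative, not exhaustive — real corpora may warrant a proper language detector or file-extension check:

```python
import re

# Heuristic signals that a line is code/config rather than prose:
# deep indentation, braces/semicolons, arrows, double colons, comparison
# operators, or common programming keywords.
CODE_HINTS = re.compile(r"^\s{4,}|[{};]|=>|::|==|\b(def|return|import|var|const)\b")

def looks_like_code(line: str) -> bool:
    return bool(CODE_HINTS.search(line))

def prose_only(text: str) -> str:
    # Keep only lines that read as prose; send just those for PII detection.
    return "\n".join(line for line in text.splitlines() if not looks_like_code(line))
```

Lines filtered out this way never reach the PERSON recognizer, so a variable named `alex` can no longer trigger a false positive.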

Comparison to Microsoft Presidio

On the same test corpus, Microsoft Presidio with its default English recognizers scored a macro F1 of roughly 0.89, about five points below Privacy Filter's 0.94.

That five-point F1 advantage comes entirely from contextual entities (PERSON, ADDRESS, DATE_OF_BIRTH). On structured entities (EMAIL, SSN, IP_ADDRESS), both tools score within one point of each other. See the full Presidio comparison for details.

Practical implications

A macro F1 of 0.94 means that in a corpus of 1,000 true PII entities, a recall of 0.93 implies roughly 70 missed entities, and a precision of 0.95 implies that about one in twenty flagged spans is a false alarm.

For GDPR compliance workflows, false negatives matter more than false positives. If your risk model requires near-zero missed PII, consider running a secondary regex pass for structured entities (SSN, credit card, phone) after the Privacy Filter call to catch anything missed in edge cases.

import re
import httpx

# Regex fallbacks for structured entities the primary pass might miss.
STRUCTURED_PATTERNS = {
    "SSN_EXTRA": r"\b\d{3}-\d{2}-\d{4}\b",
    # Require a digit at both ends so runs of spaces/punctuation alone can't match.
    "PHONE_EXTRA": r"\+?\d[\d\s().-]{8,13}\d",
}

def redact_belt_and_suspenders(text: str, license_key: str) -> str:
    # Primary: LLM-based detection
    resp = httpx.post(
        "https://privacyfilter.run/api/redact",
        json={"text": text, "license_key": license_key},
        timeout=30.0,
    )
    resp.raise_for_status()
    redacted = resp.json()["redacted_text"]
    # Secondary: structured regex fallback
    for label, pattern in STRUCTURED_PATTERNS.items():
        redacted = re.sub(pattern, f"[{label}]", redacted)
    return redacted

Test accuracy on your own text — paste a sample and inspect the detected entities in seconds.
