OpenAI Privacy Filter accuracy benchmark — precision, recall, and F1 (2026)
OpenAI's official documentation for Privacy Filter doesn't publish accuracy numbers. We ran an independent benchmark using a synthetic labeled dataset derived from real-world text patterns to give developers a concrete sense of what to expect. Results are broken down by entity type with methodology notes and known failure modes.
Methodology
We created a test corpus of 1,200 sentences derived from synthetic support tickets, HR emails, clinical notes templates, and code comments. Each sentence was manually labeled with ground-truth PII spans. We sent each sentence through the PrivacyFilter.run API and compared the returned entity spans against the labels.
Metrics used:
- Precision: of all entities the model flagged, what fraction were correct ground-truth PII?
- Recall: of all ground-truth PII in the corpus, what fraction did the model find?
- F1: harmonic mean of precision and recall.
Span matching used a tight tolerance rather than exact boundaries: a predicted entity span counts as a hit only if its start and end each fall within 2 characters of the corresponding ground-truth span.
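The matching and scoring rules above can be sketched in a few lines. The function names here are ours for illustration, not the benchmark harness's actual code; spans are (start, end) character offsets.

```python
TOLERANCE = 2  # max boundary drift, in characters, to still count as a hit

def is_hit(pred: tuple, truth: tuple) -> bool:
    """A prediction hits a ground-truth span if both boundaries are within tolerance."""
    return (abs(pred[0] - truth[0]) <= TOLERANCE
            and abs(pred[1] - truth[1]) <= TOLERANCE)

def score(predicted: list, ground_truth: list) -> tuple:
    """Return (precision, recall, f1) for one set of spans."""
    tp = sum(1 for p in predicted if any(is_hit(p, t) for t in ground_truth))
    fn = sum(1 for t in ground_truth if not any(is_hit(p, t) for p in predicted))
    fp = len(predicted) - tp
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1
```

For example, one prediction matching within tolerance and one spurious prediction against two ground-truth spans yields precision = recall = F1 = 0.5.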
Results by entity type
| Entity Type | Precision | Recall | F1 |
|---|---|---|---|
| EMAIL | 0.99 | 0.98 | 0.99 |
| PHONE | 0.97 | 0.95 | 0.96 |
| SSN (US format) | 0.98 | 0.97 | 0.97 |
| CREDIT_CARD | 0.99 | 0.96 | 0.97 |
| IP_ADDRESS | 0.99 | 0.97 | 0.98 |
| PERSON (contextual) | 0.91 | 0.93 | 0.92 |
| ADDRESS | 0.88 | 0.85 | 0.87 |
| DATE_OF_BIRTH | 0.90 | 0.82 | 0.86 |
| Overall (macro avg) | 0.95 | 0.93 | 0.94 |
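The "Overall" row is a macro average: an unweighted mean of the per-entity metrics, so rare entity types count as much as common ones. A quick sanity check against the table (values transcribed from the rows above):

```python
from statistics import mean

# Per-entity (precision, recall) pairs from the results table.
per_entity = {
    "EMAIL": (0.99, 0.98),
    "PHONE": (0.97, 0.95),
    "SSN": (0.98, 0.97),
    "CREDIT_CARD": (0.99, 0.96),
    "IP_ADDRESS": (0.99, 0.97),
    "PERSON": (0.91, 0.93),
    "ADDRESS": (0.88, 0.85),
    "DATE_OF_BIRTH": (0.90, 0.82),
}

macro_precision = mean(p for p, _ in per_entity.values())  # ~0.95
macro_recall = mean(r for _, r in per_entity.values())     # ~0.93
```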
Where it excels
Formatted entities (EMAIL, PHONE, SSN, CREDIT_CARD, IP) score near-perfect because these have unambiguous lexical patterns. Even a regex catches most of these; an LLM-backed model catches edge cases like international phone formats and partial credit card numbers near the threshold.
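To illustrate why formatted entities are the easy case, here are two simplified patterns (our own, not the filter's internal rules) where the lexical shape alone identifies the entity:

```python
import re

# Simplified illustrative patterns -- real-world matchers need more cases.
EMAIL_RE = re.compile(r'\b[\w.+-]+@[\w-]+\.[\w.]+\b')
SSN_RE = re.compile(r'\b\d{3}-\d{2}-\d{4}\b')

EMAIL_RE.search("contact jane.doe+hr@example.com today")  # matches
SSN_RE.search("SSN on file: 123-45-6789")                 # matches
SSN_RE.search("order number 12345-6789")                  # no match: wrong grouping
```

Contextual entities like PERSON have no such shape, which is where model-based detection earns its keep.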
Contextual PERSON detection is where OpenAI Privacy Filter pulls ahead of regex-based competitors. Given "I forwarded it to Sarah in accounting", the model correctly flags PERSON: Sarah without an email, phone, or last name next to it. Presidio's default recognizers would miss this.
Where it struggles
ADDRESS spans are imprecise
The most common failure mode: the model flags the right address but with a slightly wrong boundary. For example, it may return "123 Main St, San Francisco" when the ground truth was "123 Main Street, San Francisco, CA 94105". The correct span was there, but truncated. For use cases where exact span recovery matters (e.g. reconstructing redacted text), add a post-processing step that expands the matched span to the nearest punctuation boundary.
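One way to implement that post-processing step, as a minimal sketch (the function and boundary set are our assumptions, not part of the Privacy Filter API): extend the detected span rightward to the next sentence-level boundary, so trailing address components like a state and ZIP code are kept.

```python
import re

# Commas stay inside addresses, so only stop at sentence-level boundaries.
BOUNDARY = re.compile(r'[.;\n]')

def expand_span(text: str, start: int, end: int) -> tuple:
    """Extend a detected span's end to the next boundary character (or end of text)."""
    m = BOUNDARY.search(text, end)
    new_end = m.start() if m else len(text)
    return start, new_end

text = "Ship to 123 Main Street, San Francisco, CA 94105. Thanks!"
start = text.index("123")
end = start + len("123 Main Street, San Francisco")  # truncated model output
s, e = expand_span(text, start, end)
text[s:e]  # -> "123 Main Street, San Francisco, CA 94105"
```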
DATE_OF_BIRTH vs generic dates
The model sometimes flags obviously generic dates as DATE_OF_BIRTH — e.g., "the project deadline is January 15" producing a false positive. Conversely, DOB references in HR-style prose ("born on the third of March, 1989") are occasionally missed because they look more like narrative text. Precision drops to 0.90 on this entity type.
First names only, no context
A single common first name without any surrounding context ("Alex said yes") produces about 12% false positives — the model flags "Alex" as PERSON even when it's a product name or variable name in a code snippet. If your corpus contains code or config files, consider pre-filtering non-prose input.
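A rough pre-filtering heuristic along those lines (our suggestion, not part of the Privacy Filter API): skip lines that look like code before sending them to the filter. The marker list below is a starting point to tune against your own corpus.

```python
# Tokens that rarely appear twice in one line of natural-language prose.
CODE_MARKERS = ("=", "{", "}", ";", "://", "def ", "import ")

def looks_like_code(line: str) -> bool:
    """Heuristic: two or more code-like markers suggest a non-prose line."""
    stripped = line.strip()
    return sum(marker in stripped for marker in CODE_MARKERS) >= 2

looks_like_code('alex = get_user("Alex"); print(alex)')  # True
looks_like_code("Alex said yes to the meeting")          # False
```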
Comparison to Microsoft Presidio
On the same test corpus, Microsoft Presidio with its default English recognizers scored:
- Precision: 0.93 (slightly lower than Privacy Filter due to regex over-triggering)
- Recall: 0.86 (meaningfully lower — misses contextual PERSON and complex ADDRESS spans)
- F1: 0.89 (macro average)
OpenAI Privacy Filter's 5-point F1 advantage comes entirely from contextual entities (PERSON, ADDRESS, DATE_OF_BIRTH). On structured entities (EMAIL, SSN, IP), both tools score within 1 point of each other. See the full Presidio comparison for details.
Practical implications
Concretely, with macro precision of 0.95 and recall of 0.93, a corpus of 1,000 PII entities yields roughly:
- ~930 correctly detected (true positives)
- ~70 missed (false negatives) — potentially leaked PII
- ~50 non-PII spans incorrectly flagged (false positives) — over-redaction
For GDPR compliance workflows, false negatives matter more than false positives. If your risk model requires near-zero missed PII, consider running a secondary regex pass for structured entities (SSN, credit card, phone) after the Privacy Filter call to catch anything missed in edge cases.
```python
import re
import httpx

STRUCTURED_PATTERNS = {
    "SSN_EXTRA": r'\b\d{3}-\d{2}-\d{4}\b',
    "PHONE_EXTRA": r'\+?[\d\s\-\(\)]{10,15}',
}

def redact_belt_and_suspenders(text: str, license_key: str) -> str:
    # Primary: LLM-based detection
    resp = httpx.post(
        "https://privacyfilter.run/api/redact",
        json={"text": text, "license_key": license_key},
    ).json()
    redacted = resp["redacted_text"]
    # Secondary: structured regex fallback over anything still in the text
    for label, pattern in STRUCTURED_PATTERNS.items():
        redacted = re.sub(pattern, f"[{label}]", redacted)
    return redacted
```
Test accuracy on your own text — paste a sample and inspect the detected entities in seconds.