How to Detect PII in AI Prompts
Learn how to automatically detect and block personally identifiable information in LLM prompts before it reaches third-party AI providers.
One of the most common risks in AI applications is PII leakage. Users paste emails, phone numbers, and other personal data into prompts without realizing that this data is sent to third-party AI providers like OpenAI or Anthropic.
What counts as PII?
In the context of AI prompts, PII includes:

- Email addresses
- Phone numbers
- Social Security numbers
- Credit card numbers
- Physical addresses
- Names (when combined with other identifiers)
Detection approaches
Pattern matching (regex)

The simplest approach uses regular expressions to detect known PII patterns:

- Emails: a standard email regex
- Phone numbers: US format patterns (xxx-xxx-xxxx, etc.)
- SSNs: the xxx-xx-xxxx pattern
This is what SignalVault uses for its `contains_pii` rule type.
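The pattern-matching approach can be sketched in a few lines of Python. The regexes below are simplified illustrations (a production deployment would tune them to reduce false positives), and this `contains_pii` function is a standalone example, not SignalVault's implementation:

```python
import re

# Simplified, US-centric patterns for illustration only.
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def contains_pii(prompt: str) -> list[str]:
    """Return the names of the PII patterns found in the prompt."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(prompt)]
```

For example, `contains_pii("email me at jane@example.com")` returns `["email"]`, while a clean prompt returns an empty list.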
Named entity recognition (NER)

More sophisticated approaches use ML models to identify PII, but these add latency and complexity. For real-time guardrails, regex patterns typically offer the best tradeoff between speed and accuracy.
What to do when PII is detected
You have several options:

1. **Block** the request entirely — safest, but disrupts the user experience
2. **Redact** the PII — replace it with placeholders like [EMAIL] or [PHONE]
3. **Warn** — log the detection but allow the request through
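The redaction option can be sketched as a simple substitution pass. This is an illustrative example using the same simplified regexes, not SignalVault's redaction logic; typed placeholders preserve sentence structure for the downstream model:

```python
import re

# Each pattern maps to a typed placeholder. Order matters only in that
# placeholders must not themselves match a later pattern (these don't).
REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]"),
    (re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"), "[PHONE]"),
    (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), "[SSN]"),
]

def redact(prompt: str) -> str:
    """Replace each detected PII match with its placeholder."""
    for pattern, placeholder in REDACTIONS:
        prompt = pattern.sub(placeholder, prompt)
    return prompt
```

For example, `redact("Email jane@example.com or call 555-123-4567")` yields `"Email [EMAIL] or call [PHONE]"`.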
The right choice depends on your compliance requirements. For SOC2 and GDPR, blocking or redacting is typically required.
Implementation with SignalVault
Create a PII detection rule in your app's Rules tab:

- Type: PII Detection
- Action: Block or Redact
- Environment: All environments
- Patterns: email, phone, ssn
Every request will be checked automatically, and violations appear in your dashboard.