PII Detection

CID222 automatically detects and protects Personally Identifiable Information (PII) and Protected Health Information (PHI) across 15+ entity types with configurable handling policies.

Supported Entity Types

Personal Identifiers

Entity | Description | Detection Method
------ | ----------- | ----------------
PERSON | Full names, including titles | NER
EMAIL | Email addresses | Regex + validation
PHONE | Phone numbers (intl formats) | Regex + libphonenumber
SSN | Social Security Numbers (US) | Regex + checksum
DATE_OF_BIRTH | Birth dates | NER + context
PASSPORT | Passport numbers | Regex by country
DRIVER_LICENSE | Driver's license numbers | Regex by state/country

Financial Data

Entity | Description | Detection Method
------ | ----------- | ----------------
CREDIT_CARD | Credit/debit card numbers | Regex + Luhn
IBAN | International bank accounts | Regex + checksum
BANK_ACCOUNT | Domestic bank accounts | Regex + context
TAX_ID | Tax identification numbers | Regex by country
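
The "Regex + Luhn" and "Regex + checksum" methods pair a candidate pattern with an arithmetic check, so that random digit runs are rejected. The following is a minimal sketch of that idea, not CID222's actual implementation: the candidate pattern and all function names are illustrative assumptions, and the IBAN check assumes the standard mod-97 scheme.

```python
import re

# Hypothetical candidate pattern: 13 to 19 digits, optionally separated
# by single spaces or hyphens. A real detector may use stricter patterns.
CARD_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn check."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    total = 0
    # Walk from the rightmost digit and double every second digit.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def iban_valid(iban: str) -> bool:
    """Return True if `iban` passes the mod-97 check (assumed scheme)."""
    compact = re.sub(r"\s+", "", iban).upper()
    if not re.fullmatch(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}", compact):
        return False
    # Move the country code and check digits to the end, then map
    # letters to numbers (A=10 ... Z=35) using base-36 digit values.
    rearranged = compact[4:] + compact[:4]
    numeric = "".join(str(int(ch, 36)) for ch in rearranged)
    return int(numeric) % 97 == 1


def find_card_numbers(text: str) -> list:
    """Return regex candidates in `text` that also pass the Luhn check."""
    return [m.group() for m in CARD_CANDIDATE.finditer(text)
            if luhn_valid(m.group())]
```

With this sketch, `find_card_numbers` keeps the test number 4111-1111-1111-1111 but drops a candidate whose last digit has been changed, because the second stage fails even though the regex matches.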

Location Data

Entity | Description | Detection Method
------ | ----------- | ----------------
LOCATION | Physical addresses | NER + patterns
IP_ADDRESS | IPv4 and IPv6 addresses | Regex
GPS_COORDINATES | Latitude/longitude | Regex + validation
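
As an illustration of "Regex + validation" for GPS coordinates, the sketch below first finds decimal "latitude, longitude" pairs and then discards pairs outside the valid ranges. The pattern and function names are assumptions made for this example. For IP addresses, the sketch uses Python's standard `ipaddress` module as a stand-in for the regex listed in the table.

```python
import ipaddress
import re

# Hypothetical pattern for decimal "lat, lon" pairs such as "40.7128, -74.0060".
# The lookbehind avoids starting a match in the middle of a longer number.
GPS_CANDIDATE = re.compile(
    r"(?<![\d.])(-?\d{1,2}(?:\.\d+)?)\s*,\s*(-?\d{1,3}(?:\.\d+)?)"
)


def is_ip_address(token: str) -> bool:
    """Return True if `token` parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(token)
        return True
    except ValueError:
        return False


def find_gps_coordinates(text: str) -> list:
    """Return (lat, lon) pairs whose values fall within valid ranges."""
    results = []
    for match in GPS_CANDIDATE.finditer(text):
        lat, lon = float(match.group(1)), float(match.group(2))
        # Validation step: latitude in [-90, 90], longitude in [-180, 180].
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            results.append((lat, lon))
    return results
```

The range check is what separates a plausible coordinate from an arbitrary pair of comma-separated numbers that happens to fit the pattern.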

Health Data (PHI)

Entity | Description | Detection Method
------ | ----------- | ----------------
MEDICAL_ID | Medical record numbers | Regex + context
HEALTH_CONDITION | Diseases, diagnoses | NER + medical NER
MEDICATION | Drug names, dosages | NER + drug database

Detection Accuracy

CID222 combines pattern matching, validation, and NER to reach the following precision, recall, and F1 scores on representative entity types:

Entity Type | Precision | Recall | F1 Score
----------- | --------- | ------ | --------
EMAIL | 99.5% | 99.8% | 99.6%
CREDIT_CARD | 99.2% | 98.9% | 99.0%
PHONE | 97.8% | 96.5% | 97.1%
PERSON | 94.2% | 92.8% | 93.5%
LOCATION | 91.5% | 89.3% | 90.4%
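
The F1 score is the harmonic mean of precision and recall, so the last column can be recomputed from the other two. A minimal sketch (the function name is illustrative and not part of CID222):

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, in the same units as the inputs."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Recompute the F1 column from the published precision and recall (in percent).
published = {
    "EMAIL": (99.5, 99.8),
    "CREDIT_CARD": (99.2, 98.9),
    "PHONE": (97.8, 96.5),
    "PERSON": (94.2, 92.8),
    "LOCATION": (91.5, 89.3),
}
recomputed = {name: round(f1_score(p, r), 1) for name, (p, r) in published.items()}
```

Rounded to one decimal place, the recomputed values agree with the F1 figures listed for all five entity types.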

Masking Formats

Detected PII is replaced with type-specific placeholders:

Masking Examples
Original: "Contact john.smith@company.com or call 555-123-4567"
Masked:   "Contact [EMAIL] or call [PHONE]"

Original: "My SSN is 123-45-6789 and credit card is 4111-1111-1111-1111"
Masked:   "My SSN is [SSN] and credit card is [CREDIT_CARD]"

Original: "Send to John Smith at 123 Main St, New York"
Masked:   "Send to [PERSON] at [LOCATION]"
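
The placeholder substitution itself can be sketched with a few regular expressions. The patterns below are deliberately simple, hypothetical stand-ins for CID222's detectors and cover only the regex-based types; PERSON and LOCATION rely on NER and are outside the scope of this example.

```python
import re

# Hypothetical, simplified patterns keyed by entity type. Order matters:
# each pattern runs on the output of the previous substitution.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "CREDIT_CARD": re.compile(r"\b\d{4}(?:[ -]?\d{4}){3}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


def mask(text: str) -> str:
    """Replace each detected span with a [TYPE] placeholder."""
    for entity_type, pattern in PATTERNS.items():
        text = pattern.sub(f"[{entity_type}]", text)
    return text
```

Applied to the first two originals above, `mask` returns the masked strings shown.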

Custom Patterns

Add custom regex patterns for organization-specific identifiers:

Custom Pattern Example
{
  "name": "Employee ID",
  "entity_type": "EMPLOYEE_ID",
  "pattern": "EMP-[0-9]{6}",
  "action": "MASK",
  "confidence": 0.95
}

Test custom patterns in the dashboard's Filter Testing tool before deploying to production.
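
Before uploading a pattern, it can help to check the regex locally. The sketch below loads the example definition and applies it with Python's `re` module. How CID222 interprets the `action` and `confidence` fields is not specified here, so the handling in `apply_custom_pattern` (a hypothetical helper) is an assumption.

```python
import json
import re

CUSTOM_PATTERN_JSON = """
{
  "name": "Employee ID",
  "entity_type": "EMPLOYEE_ID",
  "pattern": "EMP-[0-9]{6}",
  "action": "MASK",
  "confidence": 0.95
}
"""


def apply_custom_pattern(text: str, config: dict) -> str:
    """Mask every match of config["pattern"] with a [ENTITY_TYPE] placeholder."""
    # Assumption: only the MASK action rewrites text; other actions pass through.
    if config["action"] != "MASK":
        return text
    compiled = re.compile(config["pattern"])
    return compiled.sub(f"[{config['entity_type']}]", text)


config = json.loads(CUSTOM_PATTERN_JSON)
masked = apply_custom_pattern("Badge EMP-123456 issued", config)
```

Note that the example pattern is unanchored, so in this sketch a longer string such as EMP-1234567 would be matched on its first six digits. Adding word boundaries (`\\b` in JSON) would tighten it, assuming the CID222 regex engine supports them.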

Context Awareness

CID222's detection is context-aware to reduce false positives:

  • Number sequences: short numbers such as "Call 911" do not trigger phone detection
  • Code contexts: IP addresses in code comments may be allowed
  • Example data: placeholder values such as "example@example.com" can be exempted
  • Business context: company names are distinguished from person names
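
Two of these rules, short number sequences and exempted example data, can be approximated with a post-filter on regex candidates. This is a rough sketch under stated assumptions: the allowlist, the digit threshold, and the patterns are all hypothetical, and CID222's real context model is likely more involved.

```python
import re

# Hypothetical allowlist of placeholder addresses that should not be flagged.
EXEMPT_EMAILS = {"example@example.com"}
# Hypothetical threshold: shorter digit runs (e.g. "911") are not phone numbers.
MIN_PHONE_DIGITS = 7

EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_CANDIDATE = re.compile(r"\+?\d[\d\s().-]*\d")


def detect(text: str) -> list:
    """Return (entity_type, matched_text) pairs after context filtering."""
    findings = []
    for match in EMAIL.finditer(text):
        # Example data rule: skip allowlisted placeholder addresses.
        if match.group().lower() not in EXEMPT_EMAILS:
            findings.append(("EMAIL", match.group()))
    for match in PHONE_CANDIDATE.finditer(text):
        # Number sequence rule: require a minimum number of digits.
        digit_count = len(re.sub(r"\D", "", match.group()))
        if digit_count >= MIN_PHONE_DIGITS:
            findings.append(("PHONE", match.group()))
    return findings
```

In this sketch, "Call 911" yields no findings, while "Call 555-123-4567" yields a single PHONE finding.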