CID222 Documentation

Content Detection

CID222 automatically detects sensitive content in all requests. This page documents the detection results format and available endpoints for querying detection logs.

Detection Result Format

Every chat completion response includes a detections field that shows what was found and how it was handled:

{
  "detections": {
    "input": [
      {
        "entity_type": "EMAIL",
        "text": "john@example.com",
        "start": 25,
        "end": 41,
        "confidence": 0.99,
        "action": "MASK",
        "masked_text": "[EMAIL]"
      }
    ],
    "output": []
  }
}
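
As a rough illustration of how the fields in this result relate to the original text, the sketch below rebuilds the masked prompt on the client side. It assumes that start and end are 0-based, end-exclusive character offsets (the sample values 25 and 41 are consistent with that, but the source does not state it), and the helper name apply_masks and the sample prompt are hypothetical.

```python
def apply_masks(text, detections):
    """Replace the span of every MASK detection in `text` with its placeholder."""
    masked = text
    # Work right to left so earlier offsets stay valid after each replacement.
    for d in sorted(detections, key=lambda d: d["start"], reverse=True):
        if d["action"] != "MASK":
            continue
        # Assumption: `start`/`end` are 0-based, end-exclusive character offsets.
        masked = masked[: d["start"]] + d["masked_text"] + masked[d["end"]:]
    return masked


# Hypothetical prompt chosen so the email sits at offsets 25..41.
prompt = "Please email the file to john@example.com"
response = {
    "detections": {
        "input": [
            {
                "entity_type": "EMAIL",
                "text": "john@example.com",
                "start": 25,
                "end": 41,
                "confidence": 0.99,
                "action": "MASK",
                "masked_text": "[EMAIL]",
            }
        ],
        "output": [],
    }
}

found = response["detections"]["input"][0]
print(prompt[found["start"]:found["end"]])
print(apply_masks(prompt, response["detections"]["input"]))
```

Replacing from the last span backwards avoids recomputing offsets when a placeholder is shorter or longer than the text it replaces.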

Supported Entity Types

| Type | Description | Examples |
|---|---|---|
| PERSON | Person names | John Smith, Dr. Jane Doe |
| EMAIL | Email addresses | user@domain.com |
| PHONE | Phone numbers | +1-555-123-4567 |
| SSN | Social Security Numbers | 123-45-6789 |
| CREDIT_CARD | Credit card numbers | 4111-1111-1111-1111 |
| IBAN | Bank account numbers | DE89370400440532013000 |
| IP_ADDRESS | IP addresses | 192.168.1.1, 2001:db8::1 |
| LOCATION | Physical addresses | 123 Main St, New York |
| DATE_OF_BIRTH | Birth dates | 01/15/1990 |
| MEDICAL_ID | Medical record numbers | MRN-12345678 |

Detection Actions

| Action | Description |
|---|---|
| MASK | Replace with placeholder (e.g., [EMAIL]). Request proceeds with masked content. |
| REJECT | Block the entire request. Returns 403 error. |
| FLAG | Log the detection but allow request to proceed unchanged. |
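
A client may want to tell these three outcomes apart. The sketch below is one possible way to do that from the HTTP status code and the detections field. The function name classify_outcome and the outcome labels are hypothetical. It also assumes a 403 always means a REJECT, which may not hold if the API returns 403 for other reasons such as insufficient permissions.

```python
def classify_outcome(status_code, detections=None):
    """Summarise how a request was handled, based on the documented actions."""
    # Assumption: a 403 response corresponds to a REJECT action.
    if status_code == 403:
        return "rejected"

    # Collect detections from both the input and output sides of the response.
    found = []
    for side in ("input", "output"):
        found.extend((detections or {}).get(side, []))

    actions = {d["action"] for d in found}
    if "MASK" in actions:
        return "masked"    # content was replaced with placeholders
    if "FLAG" in actions:
        return "flagged"   # content was logged but passed through unchanged
    return "clean"         # nothing was detected


print(classify_outcome(403))
print(classify_outcome(200, {"input": [{"action": "MASK"}], "output": []}))
print(classify_outcome(200, {"input": [], "output": []}))
```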

Safety Categories

Beyond PII, CID222 detects harmful content categories:

| Category | Description |
|---|---|
| TOXIC_CONTENT | Profanity, abuse, harassment |
| HATE_SPEECH | Discriminatory content targeting protected groups |
| SEXUAL_CONTENT | Adult or explicit material |
| VIOLENCE | Violent content or threats |
| JAILBREAK | Prompt injection attempts |

Query Detection Logs

GET /admin/detections

This endpoint requires admin permissions.

| Parameter | Type | Description |
|---|---|---|
| entity_type | string | Filter by entity type |
| action | string | Filter by action taken |
| start_date | string | Filter from date (ISO 8601) |
| end_date | string | Filter to date (ISO 8601) |

Query Detections
curl -X GET "https://api.cid222.ai/admin/detections?entity_type=EMAIL&action=MASK" \
-H "Authorization: Bearer YOUR_API_KEY"
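
The same query can be assembled programmatically. The following is a minimal sketch using only the Python standard library. It builds the request without sending it, and the helper name build_detections_request is hypothetical. The URL, the four filter names, and the Bearer header are taken from this page. The exact ISO 8601 variant the API accepts is not specified, so the timestamp format used here is an assumption.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.cid222.ai/admin/detections"
ALLOWED_FILTERS = {"entity_type", "action", "start_date", "end_date"}


def build_detections_request(api_key, **filters):
    """Build (but do not send) a GET request for the detection log."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filters: {sorted(unknown)}")
    # Drop filters that were left unset, then percent-encode the rest.
    params = {k: v for k, v in filters.items() if v is not None}
    url = BASE_URL + ("?" + urlencode(params) if params else "")
    return Request(url, headers={"Authorization": f"Bearer {api_key}"}, method="GET")


req = build_detections_request(
    "YOUR_API_KEY",
    entity_type="EMAIL",
    action="MASK",
    start_date="2024-01-01T00:00:00Z",  # assumed timestamp format
)
print(req.full_url)
```

Passing the resulting object to urllib.request.urlopen would send the request. urlencode percent-encodes the colons in the timestamp, which keeps the query string unambiguous.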

Detection Statistics

GET /admin/detections/stats

Get aggregate statistics on detections:

Response
{
  "total_detections": 15420,
  "by_entity_type": {
    "EMAIL": 5230,
    "PHONE": 3150,
    "PERSON": 4890,
    "CREDIT_CARD": 2150
  },
  "by_action": {
    "MASK": 14200,
    "REJECT": 820,
    "FLAG": 400
  },
  "period": {
    "start": "2024-01-01T00:00:00Z",
    "end": "2024-01-31T23:59:59Z"
  }
}
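
To show how such a response might be consumed, the sketch below computes each action's share of the total from the sample payload above. The helper name action_shares is hypothetical. In this sample the per-action counts sum to total_detections, but the source does not guarantee that in general.

```python
def action_shares(stats):
    """Return each action's share of all detections, as a percentage."""
    total = stats["total_detections"]
    if total == 0:
        return {action: 0.0 for action in stats["by_action"]}
    return {
        action: round(100 * count / total, 1)
        for action, count in stats["by_action"].items()
    }


# Sample payload copied from the response shown above.
stats = {
    "total_detections": 15420,
    "by_entity_type": {"EMAIL": 5230, "PHONE": 3150, "PERSON": 4890, "CREDIT_CARD": 2150},
    "by_action": {"MASK": 14200, "REJECT": 820, "FLAG": 400},
    "period": {"start": "2024-01-01T00:00:00Z", "end": "2024-01-31T23:59:59Z"},
}

for action, share in action_shares(stats).items():
    print(f"{action}: {share}%")
```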

Confidence Scores

Each detection includes a confidence score between 0 and 1:

  • > 0.9: High confidence, action applied automatically
  • 0.7 to 0.9: Medium confidence, may require review
  • < 0.7: Low confidence, typically flagged only

Confidence thresholds are configurable per filter. See Content Safety Pipeline for details.
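
The bands above can be expressed as a small helper. This is only a sketch: the function name confidence_band is hypothetical, and the default cut-offs mirror the values listed on this page, which may differ from a filter's configured thresholds. A score of exactly 0.9 falls in the medium band, following the "> 0.9" wording.

```python
def confidence_band(score, high=0.9, medium=0.7):
    """Map a confidence score in [0, 1] to the band described above."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    if score > high:
        return "high"      # action applied automatically
    if score >= medium:
        return "medium"    # may require review
    return "low"           # typically flagged only


for score in (0.99, 0.9, 0.75, 0.4):
    print(score, confidence_band(score))
```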