
Secure RAG — Use Internal Documents Safely with RAG

Starter Plan or Above Required

/rag/ingest and /rag/resolve are available on the Starter plan and above. Requests from a Free plan account receive 403 Forbidden. View pricing →

For industries where data cannot leave the premises — manufacturing, healthcare, finance — PII Firewall provides a Secure RAG pipeline that tokenizes PII before passing documents to the LLM.

What is Secure RAG?

In a standard RAG pipeline, internal documents are passed directly to a vector DB or LLM, so personal information, customer data, and confidential content are sent to external cloud services.
Secure RAG tokenizes PII before documents enter the pipeline, so the LLM sees only anonymized text. Before the response is returned to the user, the tokens are restored so that users receive complete information.

Internal doc → rag_ingest → PII tokenized → Vector DB
                  ↓                              ↓
    [Only anonymized text reaches the cloud]   LLM search & response

                                         rag_resolve → PII restored → User
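
The flow in the diagram can be illustrated with a toy round trip. This is a minimal sketch, not PII Firewall's detection engine: it handles only email addresses with a simple regex, and the names `toyIngest`, `toyResolve`, and the `toy1`-style ids are hypothetical. The token shape is copied from the examples on this page.

```typescript
// Toy illustration only: NOT the real rag_ingest / rag_resolve implementation.
// Maps each generated token to the original value it replaced.
type ToyTokenMap = Record<string, string>;

const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;

function toyIngest(text: string): { anonymized: string; tokenMap: ToyTokenMap } {
  const tokenMap: ToyTokenMap = {};
  let counter = 0;
  const anonymized = text.replace(EMAIL_RE, (match: string) => {
    counter += 1;
    // Hypothetical id scheme; real ids are generated by the product.
    const token = "[SRAG:cat=PII,name=EMAIL,id=toy" + counter + "]";
    tokenMap[token] = match;
    return token;
  });
  return { anonymized, tokenMap };
}

function toyResolve(text: string, tokenMap: ToyTokenMap): string {
  let restored = text;
  for (const token of Object.keys(tokenMap)) {
    // split/join replaces every occurrence without regex escaping.
    restored = restored.split(token).join(tokenMap[token]);
  }
  return restored;
}

const doc = "Contact: alice@corp.com / Spec: Part-A tolerance ±0.01mm";
const { anonymized, tokenMap } = toyIngest(doc);

// Stand-in for an LLM answer that echoes the anonymized text.
const llmAnswer = "Summary: " + anonymized;
const restoredAnswer = toyResolve(llmAnswer, tokenMap);
```

The point of the sketch is the separation of concerns: only `anonymized` would leave the machine, while `tokenMap` stays local and is needed again at restore time.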

MCP Tools

| Tool | Description |
| --- | --- |
| rag_ingest | Tokenize PII in documents and split into RAG chunks safely |
| rag_resolve | Restore PII tokens in RAG search results |

Usage in Claude Desktop

Once the MCP Server is installed, ask Claude naturally:

"Use rag_ingest to safely ingest this internal document: [document text]"

"Restore the [SRAG:...] tokens in the RAG result using rag_resolve"

SDK Usage

```typescript
import { createFirewall } from '@pii-firewall/sdk'

const fw = createFirewall({ lang: 'en' })

// 1. Tokenize PII before ingesting into RAG
const ingestResult = await fw.ragIngest(documentText)
// ingestResult.chunks   → anonymized chunks to store in vector DB
// ingestResult.tokenMap → token map to keep locally for restoration

// 2. Restore tokens in LLM response
const resolvedText = await fw.ragResolve(llmResponse, ingestResult.tokenMap)
```

Proxy API

```bash
# rag_ingest: Tokenize PII in a document (use extraTypes: ["name"] to detect names)
curl -X POST https://pii-firewallproxy-production.up.railway.app/rag/ingest \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer pf_live_xxx" \
  -d '{"text": "Contact: Alice Smith (alice@corp.com) / Customer: ABC Manufacturing", "lang": "en", "extraTypes": ["name"]}'

# rag_resolve: Restore tokens to original values
curl -X POST https://pii-firewallproxy-production.up.railway.app/rag/resolve \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer pf_live_xxx" \
  -d '{"text": "[SRAG:cat=PII,name=EMAIL,id=abc123]", "sessionId": "..."}'
```
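
The same /rag/ingest call can be made from TypeScript. The sketch below mirrors the URL, headers, and body fields of the curl example; the helper names `buildIngestRequest` and `ragIngestViaProxy` are hypothetical, and because the response body shape is not documented on this page it is returned as `unknown`. It assumes a runtime with a global `fetch` (for example Node 18 or later).

```typescript
const PROXY_BASE = "https://pii-firewallproxy-production.up.railway.app";

// Fields taken from the curl example; extraTypes is optional there.
interface IngestBody {
  text: string;
  lang: string;
  extraTypes?: string[];
}

// Builds the request separately from sending it, so it can be inspected or logged.
function buildIngestRequest(apiKey: string, body: IngestBody) {
  return {
    url: PROXY_BASE + "/rag/ingest",
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: "Bearer " + apiKey,
      },
      body: JSON.stringify(body),
    },
  };
}

async function ragIngestViaProxy(apiKey: string, body: IngestBody): Promise<unknown> {
  const { url, init } = buildIngestRequest(apiKey, body);
  const res = await fetch(url, init);
  if (res.status === 403) {
    // Per the plan note on this page, Free plan requests are rejected with 403.
    throw new Error("403 Forbidden: /rag/ingest requires the Starter plan or above");
  }
  if (!res.ok) {
    throw new Error("rag/ingest failed with HTTP " + res.status);
  }
  // Response shape is an unknown here; inspect it before relying on any field.
  return res.json();
}
```

Handling 403 explicitly makes a plan-level rejection easy to tell apart from other failures such as a malformed request.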

Name detection

Name detection requires extraTypes: ["name"] (off by default).
The engine detects names with a space between first and last name (e.g. Alice Smith). Names written without a space are not currently detected.

Verified Test Result

Input:  Contact: Alice Smith (alice@corp.com) / Customer: ABC Manufacturing / Tel: 03-1234-5678
        Spec: Part-A tolerance ±0.01mm (CONFIDENTIAL)

After rag_ingest (sent to LLM):
  Contact: [SRAG:cat=PII,name=NAME,id=...] ([SRAG:cat=PII,name=EMAIL,id=...])
  Customer: [SRAG:cat=BUSINESS,name=COMPANY_NAME_JP,id=...]
  Tel: [SRAG:cat=PII,name=PHONE,id=...]
  Spec: Part-A tolerance ±0.01mm (CONFIDENTIAL)  ← technical data passes through

After rag_resolve: 4/4 tokens fully restored ✅

Technical data (specs, numbers, confidentiality markers) is not PII and passes through unchanged — preserving RAG utility while protecting privacy.
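
To check on the client side that a response contains no leftover tokens after rag_resolve, the token format shown above can be parsed with a regular expression. This is a sketch under the assumption that every token follows the `[SRAG:cat=...,name=...,id=...]` shape seen in the samples on this page; the real grammar may allow other fields, and `findSragTokens` is a hypothetical helper, not part of the SDK.

```typescript
interface SragToken {
  raw: string;  // the full token text, including brackets
  cat: string;  // e.g. PII or BUSINESS
  name: string; // e.g. EMAIL, NAME, PHONE
  id: string;
}

// Pattern inferred from the sample tokens; field values may not contain "," or "]".
const SRAG_PATTERN = "\\[SRAG:cat=([^,\\]]+),name=([^,\\]]+),id=([^,\\]]+)\\]";

function findSragTokens(text: string): SragToken[] {
  const found: SragToken[] = [];
  // A fresh RegExp per call avoids sharing lastIndex state between calls.
  const re = new RegExp(SRAG_PATTERN, "g");
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    found.push({ raw: m[0], cat: m[1], name: m[2], id: m[3] });
  }
  return found;
}

// Returns true when no SRAG tokens remain in the text.
function isFullyResolved(text: string): boolean {
  return findSragTokens(text).length === 0;
}
```

Running `findSragTokens` on the anonymized text before sending it, and `isFullyResolved` on the restored text afterwards, gives a simple count to compare against a result like the one reported above.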

Use Cases by Industry

| Industry | Use Case |
| --- | --- |
| Manufacturing | Use spec sheets, quality records, and supplier data with external LLMs |
| Healthcare | RAG search over patient records |
| Finance | Internal AI use with customer data, contracts, and transaction history |
| Legal | Research and summarization with client-identifying documents |
| HR | Use employee data in internal policy and review documents |

Local processing — zero cloud transmission

When using the MCP Server, rag_ingest and rag_resolve run entirely on the local machine, so the original PII is never transmitted. Only tokenized text reaches the cloud.

Privacy by Design.