Cloud LLMs are powerful and convenient, but they are off-limits for certain categories of evidence. When you're processing classified documents, attorney-client privileged material, or evidence in active litigation, sending content to an external API creates unacceptable risk, regardless of the provider's security posture.
The alternative: run the AI locally, on infrastructure you control, with no external network connectivity.
Why Air-Gap
The case for air-gapped AI processing is straightforward:
- ▶Data sovereignty: Evidence never leaves your physical control
- ▶No third-party access: No API provider handles your data. With a cloud API, encryption in transit ends at the provider, which must see plaintext to run inference
- ▶No logging concerns: No external service logging queries or responses
- ▶Compliance: Satisfies requirements that prohibit cloud processing of sensitive data
- ▶Privilege protection: Processing privileged material on a machine you control discloses nothing to a third party, so it avoids the waiver risk that sharing with an outside service can create
The tradeoff is capability. Local models are smaller and less capable than frontier cloud models. But for many forensic analysis tasks, they're sufficient.
Hardware Requirements
Running useful LLMs locally requires serious hardware:
Minimum viable configuration:
- ▶64GB RAM
- ▶GPU with 24GB+ VRAM (NVIDIA RTX 4090 or A5000)
- ▶2TB NVMe SSD (models + evidence storage)
- ▶No network interface card (true air gap)
TCI's standard deployment:
- ▶128GB RAM
- ▶Dual GPU setup (2x RTX 4090 or A6000)
- ▶RAID-10 NVMe array
- ▶Hardware security module (HSM) for key management
- ▶No wireless hardware, physically disconnected ethernet
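The "no network hardware" requirement can be spot-checked from software as well as by physical inspection. The following is a minimal sketch, assuming a Linux host where the kernel lists interfaces under `/sys/class/net`; the function names are hypothetical, and this is not presented as TCI's actual verification procedure. On a machine whose Ethernet port is merely unplugged rather than removed, the interface will still appear in this list and be flagged.

```python
from pathlib import Path


def unexpected_interfaces(names, allowed=("lo",)):
    """Return interface names that should not exist on an air-gapped host."""
    return sorted(n for n in names if n not in allowed)


def list_interfaces(sys_net="/sys/class/net"):
    """List kernel-visible network interfaces (Linux only; empty elsewhere)."""
    path = Path(sys_net)
    return sorted(p.name for p in path.iterdir()) if path.is_dir() else []


if __name__ == "__main__":
    extra = unexpected_interfaces(list_interfaces())
    if extra:
        print("WARNING: unexpected network interfaces:", ", ".join(extra))
    else:
        print("OK: only loopback present")
```

A software check like this complements physical inspection but does not replace it, since a disabled or driverless device may not be visible to the kernel.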
Model Selection
Not every model is appropriate for air-gapped forensic work. TCI evaluates models on:
- ▶License: Must allow on-premises deployment (no API-only models)
- ▶Size vs. capability: 7B-70B parameter range balances capability with hardware requirements
- ▶Task performance: Entity extraction, summarization, and classification accuracy on forensic datasets
- ▶Quantization tolerance: How much quality degrades at reduced precision, which determines whether the model fits into available VRAM
Models we deploy:
- ▶Llama 3 variants (70B at 4-bit quantization for analysis, 8B for lightweight tasks)
- ▶Mistral/Mixtral (good performance-per-parameter ratio)
- ▶Fine-tuned models for domain-specific entity recognition
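The link between parameter count, quantization, and VRAM comes down to simple arithmetic: weight memory is roughly parameters times bits per weight, divided by 8. The sketch below is a back-of-the-envelope estimate only. The 20% `headroom` for KV cache and activations is an assumed figure, and real 4-bit formats typically average slightly more than 4 bits per weight.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the model weights alone, in GB (10^9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_vram(params_billion, bits_per_weight, vram_gb, headroom=0.2):
    """Rough fit check; `headroom` is an assumed allowance for KV cache and activations."""
    return weight_memory_gb(params_billion, bits_per_weight) * (1 + headroom) <= vram_gb


if __name__ == "__main__":
    cases = [("70B at 4-bit", 70, 4), ("70B at 16-bit", 70, 16), ("8B at 16-bit", 8, 16)]
    for label, params, bits in cases:
        print(f"{label}: ~{weight_memory_gb(params, bits):.0f} GB of weights")
```

By this estimate, a 70B model at 4-bit needs about 35 GB for weights alone, which does not fit on a single 24 GB card but does fit within the 48 GB of a dual RTX 4090 setup.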
Software Stack
The air-gapped system runs a minimal software stack:
- ▶OS: Hardened Linux (no unnecessary services, package manager not configured with any remote repositories)
- ▶Inference: llama.cpp or vLLM for model serving
- ▶Orchestration: Custom Python pipeline for task management
- ▶Storage: LUKS-encrypted volumes with WORM-compliant evidence storage
- ▶Audit: Comprehensive logging to append-only local storage
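One way to make an append-only audit log tamper-evident is to chain entries by hash, so that altering any earlier entry invalidates everything after it. The source does not say how TCI implements its audit storage; hash chaining is a technique swapped in here for illustration, and the names `append_entry` and `verify_chain` are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    """Hash the previous entry's hash together with this event."""
    body = json.dumps({"prev": prev_hash, "event": event}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(log: list, event: dict) -> dict:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"prev": prev_hash, "event": event, "hash": _digest(prev_hash, event)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; any edited, removed, or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

In practice the log would be written to the append-only storage described above; the in-memory list here only keeps the sketch short.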
Software is installed from physical media (such as USB drives) whose contents are verified against known checksums before installation. Updates follow the same physical transfer process.
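Checksum verification of transfer media can be sketched as follows. This assumes a manifest mapping relative file paths to expected SHA-256 digests, obtained through a separate trusted channel. The manifest format and the function names are illustrative assumptions, not a description of TCI's tooling.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(media_root, manifest):
    """Check each file against its expected digest.

    `manifest` maps relative path -> expected SHA-256 hex string.
    Returns a list of (path, reason) tuples; an empty list means all files passed.
    """
    failures = []
    for rel_path, expected in manifest.items():
        target = Path(media_root) / rel_path
        if not target.is_file():
            failures.append((rel_path, "missing"))
        elif sha256_file(target) != expected.lower():
            failures.append((rel_path, "checksum mismatch"))
    return failures
```

Nothing from the media should be installed unless `verify_manifest` returns an empty list.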
Workflow
Evidence Ingestion
Evidence enters the air-gapped system via:
- ▶Physical media (USB, external drive) with chain of custody documentation
- ▶One-way data diodes (hardware-enforced: data in, no data out)
- ▶Optical media for maximum assurance
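Chain of custody documentation at ingestion can be modeled as a record that fixes the evidence hash on arrival and logs each handoff. The sketch below is illustrative only: the `CustodyRecord` fields and the `ingest` helper are hypothetical, and a real custody form will carry whatever fields the applicable procedure requires.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class CustodyRecord:
    evidence_id: str
    sha256: str                      # digest fixed at the moment of ingestion
    source_medium: str               # e.g. "usb", "optical", "data-diode"
    transfers: list = field(default_factory=list)

    def transfer(self, from_person, to_person, when=None):
        """Record a handoff between custodians with a UTC timestamp."""
        when = when or datetime.now(timezone.utc)
        self.transfers.append({"from": from_person, "to": to_person, "at": when.isoformat()})

    def matches(self, data: bytes) -> bool:
        """Confirm the evidence is bit-for-bit what was originally ingested."""
        return hashlib.sha256(data).hexdigest() == self.sha256


def ingest(evidence_id, data, source_medium, delivered_by, received_by):
    """Create the custody record and log the first handoff."""
    record = CustodyRecord(evidence_id, hashlib.sha256(data).hexdigest(), source_medium)
    record.transfer(delivered_by, received_by)
    return record
```

Calling `record.matches(data)` before each processing run gives a cheap check that the evidence has not changed since it entered the system.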
Processing
The local LLM pipeline processes evidence through:
- ▶Document parsing and text extraction
- ▶Entity extraction (names, dates, organizations, financial entities)
- ▶Relationship mapping between extracted entities
- ▶Summarization of key documents
- ▶Classification by relevance and sensitivity
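The stages above can be sketched as a chain of functions applied to each document. In this sketch, `generate` is a hypothetical prompt-to-text callable standing in for whichever local inference server is used; it is not the API of llama.cpp or vLLM. The prompts and the sentence-level co-occurrence rule for relationships are simplifying assumptions, not TCI's actual pipeline.

```python
from itertools import combinations
from typing import Callable

Generate = Callable[[str], str]  # hypothetical wrapper around the local model


def parse(doc, generate):
    """Stand-in for real document parsing: decode bytes to text."""
    doc["text"] = doc["raw"].decode("utf-8", errors="replace").strip()
    return doc


def extract_entities(doc, generate):
    reply = generate("List the names, dates, and organizations, one per line:\n" + doc["text"])
    doc["entities"] = [line.strip() for line in reply.splitlines() if line.strip()]
    return doc


def map_relationships(doc, generate):
    """Naive rule: entities mentioned in the same sentence are considered related."""
    pairs = set()
    for sentence in doc["text"].split("."):
        present = sorted(e for e in doc["entities"] if e in sentence)
        pairs.update(combinations(present, 2))
    doc["relationships"] = sorted(pairs)
    return doc


def summarize(doc, generate):
    doc["summary"] = generate("Summarize in one sentence:\n" + doc["text"]).strip()
    return doc


def classify(doc, generate):
    label = generate("Answer RELEVANT or NOT_RELEVANT:\n" + doc["text"]).strip().upper()
    # Anything the model does not answer cleanly goes to a human.
    doc["relevance"] = label if label in {"RELEVANT", "NOT_RELEVANT"} else "NEEDS_REVIEW"
    return doc


STAGES = [parse, extract_entities, map_relationships, summarize, classify]


def run_pipeline(raw: bytes, generate: Generate) -> dict:
    doc = {"raw": raw}
    for stage in STAGES:
        doc = stage(doc, generate)
    return doc
```

Keeping the model behind a single callable means the same pipeline can be exercised with a stub during testing and pointed at the real local model in production.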
Results Export
Processed results (not raw evidence) are exported via:
- ▶Physical media with integrity verification
- ▶Printed reports for non-digital distribution
- ▶Encrypted exports with key management through the HSM
The key principle: raw evidence enters but doesn't leave the air-gapped system. Only processed, reviewed outputs are exported.
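That export rule can be expressed as a simple gate in the export tooling. The following sketch assumes each artifact is tagged with a kind and, once reviewed, a reviewer name. The `Artifact` type, the kind labels, and `split_for_export` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional

RAW = "raw_evidence"
PROCESSED = "processed_output"


@dataclass(frozen=True)
class Artifact:
    name: str
    kind: str                          # RAW or PROCESSED
    reviewed_by: Optional[str] = None  # set only after human review


def split_for_export(artifacts):
    """Return (approved, blocked): only processed AND reviewed artifacts may leave."""
    approved, blocked = [], []
    for item in artifacts:
        allowed = item.kind == PROCESSED and bool(item.reviewed_by)
        (approved if allowed else blocked).append(item)
    return approved, blocked
```

Blocked items are returned rather than silently dropped so that each refused export attempt can be recorded in the audit log.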
Operational Challenges
Model Updates
Updating models on an air-gapped system requires physical media transfer. TCI maintains a quarterly update cycle, with critical updates applied ad hoc.
Performance Tuning
Without internet access, troubleshooting performance issues requires preloaded documentation and experienced operators.
User Training
Analysts accustomed to cloud LLMs need training on the capabilities and limitations of local models. Managing expectations is critical: a local 70B model won't match GPT-4, but it handles most forensic analysis tasks adequately.
Conclusion
Air-gapped AI isn't about having the best model. It's about having a model that runs where your security requirements demand. For sensitive evidence processing, the reduction in capability is outweighed by keeping the evidence under your physical control at all times.
TCI provides air-gapped AI deployment as a standard offering for clients handling highly sensitive evidence.
