Tools & Infrastructure

Technology Stack

Our technical infrastructure covers digital forensics, evidence capture, AI-powered correlation, WORM-compliant storage, and data anonymization. We use open-source tools where possible and enterprise-grade platforms where required.

Infrastructure Philosophy

We build evidence capture and preservation systems using a combination of open-source tools and enterprise platforms. Our stack prioritizes immutability (WORM storage), verifiability (cryptographic hashing), AI-powered correlation (LLMs and graph analysis), and privacy compliance (automated PII redaction).

  • WORM: Immutable Storage
  • SHA-256: Hash Verification
  • LLM: AI Correlation
  • PII: Auto-Redaction

Relationship Mapping

Link Analysis & Visualization

Maltego

Link Analysis

Enterprise-grade link analysis and data visualization platform for mapping relationships between entities, infrastructure, and digital footprints.

  • Entity relationship mapping
  • Transform-based data enrichment
  • Visual graph analysis
  • Custom transform development

Gephi

Network Visualization

Open-source network analysis and visualization software for exploring and understanding graph structures in large datasets.

  • Large-scale graph rendering
  • Community detection algorithms
  • Network metrics calculation
  • Interactive exploration
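
To make "network metrics" and "community detection" concrete, here is a minimal sketch in plain Python that computes degree centrality and connected components over a tiny edge list. The entity names and the edge list are invented for illustration, and connected components are only a crude stand-in for the community detection algorithms a tool like Gephi provides.

```python
from collections import defaultdict, deque

# Hypothetical edge list: each pair is an observed link between two entities.
EDGES = [
    ("alice@example.com", "example.com"),
    ("bob@example.com", "example.com"),
    ("example.com", "203.0.113.7"),
    ("carol@example.org", "example.org"),
]


def build_adjacency(edges):
    """Return an undirected adjacency map: node -> set of neighbours."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def degree_centrality(adjacency):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(adjacency)
    if n <= 1:
        return {node: 0.0 for node in adjacency}
    return {node: len(neigh) / (n - 1) for node, neigh in adjacency.items()}


def connected_components(adjacency):
    """Group nodes that can reach one another, using breadth-first search."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        queue = deque([start])
        seen.add(start)
        component = set()
        while queue:
            node = queue.popleft()
            component.add(node)
            # The set difference is evaluated once, before 'seen' is updated.
            for neighbour in adjacency[node] - seen:
                seen.add(neighbour)
                queue.append(neighbour)
        components.append(component)
    return components


adjacency = build_adjacency(EDGES)
centrality = degree_centrality(adjacency)
components = connected_components(adjacency)
```

In this toy graph the shared domain ends up as the best-connected node, and the unrelated `example.org` pair forms its own component.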

Neo4j

Graph Database

Native graph database for storing and querying complex relationship data with high-performance traversal operations.

  • Cypher query language
  • ACID compliance
  • Real-time graph algorithms
  • Scalable relationship storage

Machine Learning

AI & Information Correlation

LLM Pipelines

Information Extraction

Large language model workflows for entity extraction, summarization, translation, and cross-document correlation.

  • Named entity extraction
  • Document summarization
  • Multi-language processing
  • Relationship inference

Vector Databases

Semantic Search

Embedding-based search infrastructure (Pinecone, Weaviate, pgvector) for semantic similarity and concept matching across large document collections.

  • Semantic similarity search
  • Hybrid keyword + vector
  • Clustering & deduplication
  • Cross-language matching
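
Embedding-based search boils down to ranking documents by vector similarity. The sketch below shows that ranking step with cosine similarity in plain Python. The three-dimensional vectors and file names are made up for illustration; in practice the embeddings would come from a model, and a vector database would handle indexing and approximate search.

```python
import math

# Toy 3-dimensional "embeddings"; the values are invented for illustration.
DOCUMENTS = {
    "invoice_2021.pdf": [0.9, 0.1, 0.0],
    "wire_transfer_memo.txt": [0.8, 0.3, 0.1],
    "holiday_photos.zip": [0.0, 0.2, 0.9],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query_vector, documents, top_k=2):
    """Return the top_k (name, score) pairs, most similar first."""
    scored = [
        (name, cosine_similarity(query_vector, vector))
        for name, vector in documents.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


results = search([1.0, 0.2, 0.0], DOCUMENTS)
```

With this query, the two finance-related documents rank ahead of the unrelated archive.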

Document AI

OCR & Extraction

ML-powered document processing for OCR, table extraction, and structured data extraction from PDFs and images.

  • Multi-language OCR
  • Table structure detection
  • Form field extraction
  • Handwriting recognition

Collection Pipelines

Evidence Capture

Archive-It / Webrecorder

Web Archiving

WARC-compliant web capture tools for creating authenticated, timestamped archives of web content with full fidelity.

  • WARC format preservation
  • JavaScript rendering capture
  • Authenticated session recording
  • Replay verification

HTTrack / wget

Site Mirroring

Command-line tools for recursive website downloading and offline archival of complete site structures.

  • Recursive crawling
  • Rate limiting & politeness
  • Link rewriting for offline use
  • Resume interrupted downloads
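
As an illustration of how the features above map onto wget options, this sketch assembles a command line in Python without running it. The helper name, the default depth and wait values, and the output directory are assumptions chosen for the example, not settings taken from our pipeline.

```python
import shlex


def build_wget_command(url, output_dir, depth=3, wait_seconds=2):
    """Assemble a polite, resumable recursive wget invocation as an argv list."""
    return [
        "wget",
        "--recursive",                       # follow links within the site
        f"--level={depth}",                  # limit recursion depth
        "--no-parent",                       # never ascend above the start URL
        "--page-requisites",                 # fetch images, CSS, and scripts
        "--convert-links",                   # rewrite links for offline browsing
        f"--wait={wait_seconds}",            # pause between requests
        "--random-wait",                     # vary the pause between requests
        "--continue",                        # resume partially downloaded files
        f"--directory-prefix={output_dir}",  # where to store the mirror
        url,
    ]


command = build_wget_command("https://example.com/", "evidence/example.com")
printable = shlex.join(command)  # shell-safe string, useful for audit logs
```

Passing the list to `subprocess.run(command, check=True)` would execute it; logging `printable` alongside the capture records exactly how the mirror was produced.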

yt-dlp / gallery-dl

Media Capture

Specialized downloaders for video, audio, and image content from platforms with metadata preservation.

  • Multi-platform support
  • Metadata extraction
  • Format selection
  • Playlist handling

Immutable Archives

WORM Storage & Preservation

WORM Storage

Immutable Archives

Write Once, Read Many storage architecture ensuring evidence integrity. Data cannot be modified or deleted after writing, meeting legal hold and compliance requirements.

  • Immutable data storage
  • Compliance with SEC Rule 17a-4
  • Legal hold support
  • Tamper-evident logging
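
One common way to make a log tamper-evident is to chain entries by hash, so that changing any earlier entry invalidates everything after it. The following is a minimal sketch of that idea, not a description of our production logging; the entry layout, the genesis value, and the function names are assumptions made for the example.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_entry(log, payload):
    """Append a payload to the log, linking it to the entry before it."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    entry = {
        "prev": prev_hash,
        "payload": payload,
        "hash": _entry_hash(prev_hash, payload),
    }
    log.append(entry)
    return entry


def verify_log(log):
    """Recompute every hash and check that each entry links to the previous one."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(entry["prev"], entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"action": "ingest", "object": "capture-0001.warc"})
append_entry(log, {"action": "read", "object": "capture-0001.warc"})
intact = verify_log(log)

# Simulate tampering with the first entry after the fact.
log[0]["payload"]["action"] = "delete"
tampered_detected = not verify_log(log)
```

The chain only detects modification; preventing it still depends on writing the log itself to immutable storage.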

AWS S3 Object Lock

Cloud WORM

Cloud-native WORM implementation using S3 Object Lock, with retention policies enforced in either governance or compliance mode.

  • Governance & compliance modes
  • Retention period enforcement
  • Legal hold flags
  • Versioning integration
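
The sketch below shows how retention mode, retention period, and legal hold could be expressed as arguments to the S3 `put_object` call. It only builds the argument dictionary, so it runs without credentials; the helper name, bucket name, and object key are hypothetical, and the actual upload (shown in a comment) assumes boto3 is installed and the bucket was created with Object Lock enabled.

```python
from datetime import datetime, timedelta, timezone


def object_lock_put_args(bucket, key, body, retain_days,
                         mode="COMPLIANCE", legal_hold=False):
    """Build keyword arguments for an S3 put_object call with Object Lock."""
    if mode not in ("GOVERNANCE", "COMPLIANCE"):
        raise ValueError("mode must be 'GOVERNANCE' or 'COMPLIANCE'")
    retain_until = datetime.now(timezone.utc) + timedelta(days=retain_days)
    args = {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        "ObjectLockMode": mode,
        "ObjectLockRetainUntilDate": retain_until,
    }
    if legal_hold:
        # A legal hold blocks deletion independently of the retention date.
        args["ObjectLockLegalHoldStatus"] = "ON"
    return args


args = object_lock_put_args(
    bucket="evidence-archive",            # hypothetical bucket name
    key="case-0001/capture-0001.warc",    # hypothetical object key
    body=b"example bytes",
    retain_days=365,
)

# To perform the upload (requires boto3 and valid AWS credentials):
#   import boto3
#   boto3.client("s3").put_object(**args)
```

Compliance mode is the stricter default here because retention in that mode cannot be shortened or removed by any user, whereas governance mode lets specially permitted users override it.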

Wasabi Hot Cloud Storage

Cost-Effective Archive

S3-compatible cloud storage with immutability features and no egress fees, ideal for large evidence archives.

  • Object lock support
  • No egress charges
  • Eleven nines (99.999999999%) durability
  • S3 API compatibility

Chain of Custody

Hashing & Timestamping

SHA-256 / SHA-3

Cryptographic Hashing

Industry-standard cryptographic hash functions for evidence integrity verification and chain-of-custody documentation.

  • Collision resistance
  • Deterministic output
  • Court-admissible verification
  • NIST-approved algorithms
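
A minimal sketch of integrity verification with Python's standard `hashlib`: the file is hashed in chunks so large evidence files never need to fit in memory, and a later check compares the fresh digest against the recorded one. The function names and the temporary sample file are assumptions for the example.

```python
import hashlib
import hmac
import os
import tempfile


def hash_file(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in fixed-size chunks."""
    digest = hashlib.new(algorithm)  # e.g. "sha256" or "sha3_256"
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file(path, expected_hex, algorithm="sha256"):
    """Re-hash the file and compare against the digest recorded at capture."""
    return hmac.compare_digest(hash_file(path, algorithm), expected_hex.lower())


# Demonstration on a throwaway file containing the bytes b"abc".
with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.bin")
    with open(sample, "wb") as handle:
        handle.write(b"abc")
    recorded = hash_file(sample)
    still_matches = verify_file(sample, recorded)
    sha3_recorded = hash_file(sample, algorithm="sha3_256")
```

Because the output is deterministic, anyone holding the file and the recorded digest can repeat the check independently.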

OpenTimestamps

Blockchain Timestamping

Decentralized timestamping that uses the Bitcoin blockchain to prove that data existed at a specific point in time.

  • Bitcoin-backed timestamps
  • Decentralized verification
  • Long-term validity
  • Open standard

RFC 3161 TSA

Trusted Timestamping

RFC 3161-compliant Timestamp Authority services for legally recognized time attestation of digital evidence.

  • PKI-based timestamps
  • Legal recognition
  • Third-party attestation
  • eIDAS compliance

Privacy Compliance

PII Redaction & Anonymization

Microsoft Presidio

PII Detection

Open-source data protection SDK for PII detection and anonymization using NLP and pattern matching.

  • Multi-language NER
  • Custom recognizers
  • Multiple anonymization operators
  • Batch processing
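
To illustrate the pattern-matching half of PII detection, here is a small standard-library sketch that replaces matches with typed placeholders. It is not Presidio's API. The regular expressions are deliberately simplistic assumptions that would miss many real-world formats, and the NLP-based recognition of names and places is out of scope.

```python
import re

# Simplified, illustrative patterns; real recognizers need far more care.
PATTERNS = {
    "EMAIL": re.compile(r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\b"),
    "IPV4": re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "US_SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text):
    """Replace each match with <LABEL> and return (redacted_text, findings)."""
    findings = []
    for label, pattern in PATTERNS.items():
        def _replace(match, label=label):
            # Record what was removed so the redaction can be audited.
            findings.append((label, match.group(0)))
            return f"<{label}>"
        text = pattern.sub(_replace, text)
    return text, findings


redacted, findings = redact(
    "Contact jane.doe@example.com from 192.0.2.10, SSN 123-45-6789."
)
# redacted == "Contact <EMAIL> from <IPV4>, SSN <US_SSN>."
```

Keeping the findings list separate from the redacted text lets the original values stay in the restricted archive while only the placeholder version is released.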

spaCy + Custom Models

Entity Recognition

Industrial-strength NLP library with custom-trained models for domain-specific entity extraction and redaction.

  • Custom NER training
  • Multi-language support
  • Fast inference
  • Pipeline integration

ExifTool

Metadata Handling

Comprehensive metadata reader/writer for scrubbing or extracting EXIF, XMP, and IPTC data from media files.

  • Read/write 400+ formats
  • Batch processing
  • Metadata scrubbing
  • Geolocation extraction
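
The two most common ExifTool tasks in this context, scrubbing metadata from a release copy and extracting GPS coordinates, can be sketched as command lines built in Python. The helper names and file paths are hypothetical, and the commands are only assembled here, not executed.

```python
import shlex


def scrub_command(source, destination):
    """Write a metadata-free copy to 'destination', leaving 'source' untouched."""
    # -all= deletes every writable tag; -o directs output to a new file.
    return ["exiftool", "-all=", "-o", destination, source]


def gps_command(source):
    """Print GPS latitude and longitude as JSON with numeric values."""
    # -json selects JSON output; -n disables human-readable formatting.
    return ["exiftool", "-json", "-n", "-GPSLatitude", "-GPSLongitude", source]


scrub = scrub_command("original/photo.jpg", "release/photo.jpg")
gps = gps_command("original/photo.jpg")
printable = shlex.join(scrub)  # shell-safe string for the audit trail
```

Writing the scrubbed version to a separate path keeps the archived original, and its hash, unchanged.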

End-to-End Pipeline

Typical Workflow

1. Capture: Automated collection with timestamps
2. Hash: SHA-256 + blockchain timestamp
3. Store: WORM-compliant cloud archive
4. Correlate: AI-powered entity linking
5. Redact: PII anonymization for output