Tri-Netra
Securing Voice and Multimodal AI Agents Against Deepfakes and Prompt Injection
Real-time multimodal threat detection across audio, text, and visual inputs. Powered by LCNN and Transformer audio models, CLIP image-text matching, and three-layer prompt injection analysis.
Audio Path
LCNN deepfake detector + Transformer classifier + GE2E voice clone checker. Catches synthetic speech, voice cloning, and audio spoofing.
30% weight
Prompt Path
Three-layer defense: keyword matching, regex pattern scanning, and structural analysis for role override and injection detection.
25% weight
Vision Path
EasyOCR text extraction, CLIP image-text mismatch detection, and ResNet amplification for embedded visual injection attacks.
30% weight
Cross-Modal Fusion
Correlates signals across all paths for holistic threat assessment. Weighted fusion produces a final risk score with explainable breakdown.
15% weight
Threat Decision Pipeline
PASS
Forward to the agent (safe input)
FLAG
Hold for human review
BLOCK
Log + discard input
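The weighted fusion and the three-way decision described above can be sketched in a few lines of Python. This is a minimal illustration, not Tri-Netra's actual implementation: the path weights come from the overview, but the `fuse` function, the `Verdict` type, the assumption that each path emits a risk score in [0, 1], and both thresholds are hypothetical.

```python
from dataclasses import dataclass

# Path weights as listed in the overview (30% + 25% + 30% + 15% = 100%).
WEIGHTS = {"audio": 0.30, "prompt": 0.25, "vision": 0.30, "cross_modal": 0.15}

# Hypothetical cut-offs: the overview does not say where PASS, FLAG,
# and BLOCK are separated, so these values are placeholders.
FLAG_THRESHOLD = 0.40
BLOCK_THRESHOLD = 0.70


@dataclass
class Verdict:
    decision: str    # "PASS", "FLAG", or "BLOCK"
    risk: float      # fused risk score in [0, 1]
    breakdown: dict  # per-path weighted contribution to the risk score


def fuse(scores: dict) -> Verdict:
    """Combine per-path risk scores (each assumed to lie in [0, 1])."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing path scores: {sorted(missing)}")
    for path in WEIGHTS:
        if not 0.0 <= scores[path] <= 1.0:
            raise ValueError(f"{path} score must be in [0, 1]")

    # Keeping each weighted term gives the explainable breakdown.
    breakdown = {path: WEIGHTS[path] * scores[path] for path in WEIGHTS}
    risk = sum(breakdown.values())

    if risk >= BLOCK_THRESHOLD:
        decision = "BLOCK"   # log + discard input
    elif risk >= FLAG_THRESHOLD:
        decision = "FLAG"    # hold for human review
    else:
        decision = "PASS"    # forward to the agent
    return Verdict(decision, risk, breakdown)


if __name__ == "__main__":
    verdict = fuse({"audio": 0.9, "prompt": 0.2, "vision": 0.5, "cross_modal": 0.4})
    print(verdict.decision, round(verdict.risk, 2), verdict.breakdown)
```

Returning the per-path contributions alongside the fused score is one simple way to get the explainable breakdown: a reviewer handling a FLAG can see which path drove the risk.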