Surfinguard

The Trust Layer for AI Agents

Surfinguard is the independent security layer that sits between AI agents and the actions they take. It analyzes URLs, commands, file operations, API calls, database queries, code execution, and 12 more action types — scoring each against 5 risk primitives to catch threats before they cause harm.

  • 18 analyzers covering every action an AI agent can take
  • 152 threat patterns from data exfiltration to prompt injection
  • 5 risk primitives: Destruction, Exfiltration, Escalation, Persistence, Manipulation
  • Zero-latency local mode — no network calls required
  • SDKs for JavaScript, Python, and Go plus a CLI and MCP server

Quick Install

npm install @surfinguard/sdk

30-Second Example

import { Guard } from '@surfinguard/sdk';
 
const guard = await Guard.create({ mode: 'local' });
 
// Check a suspicious URL
const result = guard.checkUrl('https://g00gle-login.tk/verify');
console.log(result.level);   // "DANGER"
console.log(result.score);   // 9
console.log(result.reasons); // ["Brand impersonation: google", "Risky TLD: .tk", ...]
 
// Check a destructive command
const cmd = guard.checkCommand('rm -rf / --no-preserve-root');
console.log(cmd.level);      // "DANGER"
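A typical integration gates the agent's action on the verdict. The policy helper below is a sketch of ours, not part of the SDK; it only assumes the result shape shown above ({ level, score, reasons }).

```javascript
// Block on DANGER, let SAFE and CAUTION through (your policy may differ).
// This helper is illustrative, not a Surfinguard API.
function shouldBlock(result) {
  return result.level === 'DANGER';
}

// Usage with the guard instance from the example above:
//   const result = guard.checkUrl(url);
//   if (shouldBlock(result)) {
//     throw new Error(`Blocked: ${result.reasons.join('; ')}`);
//   }
```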

How It Works

Every action is scored against five risk primitives: DESTRUCTION, EXFILTRATION, ESCALATION, PERSISTENCE, and MANIPULATION. Within each primitive, pattern scores are additive (capped at 10). The composite score is the maximum across all primitives.
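The scoring model above can be sketched in a few lines. The data shape (a map of primitive name to matched pattern scores) and both function names are illustrative assumptions, not the SDK's internals; only the arithmetic comes from the description.

```javascript
// Within a primitive, matched pattern scores add up, capped at 10.
function scorePrimitive(patternScores) {
  return Math.min(10, patternScores.reduce((sum, s) => sum + s, 0));
}

// The composite score is the maximum across all primitives
// (0 if nothing matched at all).
function compositeScore(matchesByPrimitive) {
  const scores = Object.values(matchesByPrimitive).map(scorePrimitive);
  return Math.max(0, ...scores);
}

// Example: two EXFILTRATION patterns (6 + 5, capped at 10) and one
// MANIPULATION pattern (4) yield a composite score of 10.
compositeScore({ EXFILTRATION: [6, 5], MANIPULATION: [4] }); // → 10
```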

Score   Level     Meaning
0-2     SAFE      No significant risk detected
3-6     CAUTION   Potential risk, review recommended
7+      DANGER    High risk, action should be blocked
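The thresholds in the table map to levels as follows; the function name is ours for illustration, not an SDK export.

```javascript
// Map a composite score (0-10) to the level names used in the docs.
function levelForScore(score) {
  if (score >= 7) return 'DANGER';  // high risk, block the action
  if (score >= 3) return 'CAUTION'; // potential risk, review recommended
  return 'SAFE';                    // no significant risk detected
}

levelForScore(9); // → "DANGER"
```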

Next Steps