Surfinguard

The Trust Layer for AI Agents

Surfinguard is the independent security layer that sits between AI agents and the actions they take. It analyzes URLs, commands, file operations, API calls, database queries, code execution, and 12 more action types — scoring each against 5 risk primitives to catch threats before they cause harm.

  • 18 analyzers covering every action an AI agent can take
  • 152 threat patterns from data exfiltration to prompt injection
  • 5 risk primitives: Destruction, Exfiltration, Escalation, Persistence, Manipulation
  • Zero-latency local mode — no network calls required
  • SDKs for JavaScript, Python, and Go plus a CLI and MCP server

Quick Install

npm install @surfinguard/sdk

30-Second Example

import { Guard } from '@surfinguard/sdk';
 
const guard = await Guard.create({ mode: 'local' });
 
// Check a suspicious URL
const result = guard.checkUrl('https://g00gle-login.tk/verify');
console.log(result.level);   // "DANGER"
console.log(result.score);   // 9
console.log(result.reasons); // ["Brand impersonation: google", "Risky TLD: .tk", ...]
 
// Check a destructive command
const cmd = guard.checkCommand('rm -rf / --no-preserve-root');
console.log(cmd.level);      // "DANGER"
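A typical integration gates the agent's action on the verdict. The policy helper below is a sketch of ours, not part of the SDK; it only assumes the result shape shown above ({ level, score, reasons }).

```javascript
// Block on DANGER, let SAFE and CAUTION through (your policy may differ).
// This helper is illustrative, not a Surfinguard API.
function shouldBlock(result) {
  return result.level === 'DANGER';
}

// Usage with the guard instance from the example above:
//   const result = guard.checkUrl(url);
//   if (shouldBlock(result)) {
//     throw new Error(`Blocked: ${result.reasons.join('; ')}`);
//   }
```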

How It Works

Every action is scored against five risk primitives: DESTRUCTION, EXFILTRATION, ESCALATION, PERSISTENCE, and MANIPULATION. Within each primitive, pattern scores are additive (capped at 10). The composite score is the maximum across all primitives.
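The scoring model above can be sketched in a few lines. The data shape (a map of primitive name to matched pattern scores) and both function names are illustrative assumptions, not the SDK's internals; only the arithmetic comes from the description.

```javascript
// Within a primitive, matched pattern scores add up, capped at 10.
function scorePrimitive(patternScores) {
  return Math.min(10, patternScores.reduce((sum, s) => sum + s, 0));
}

// The composite score is the maximum across all primitives
// (0 if nothing matched at all).
function compositeScore(matchesByPrimitive) {
  const scores = Object.values(matchesByPrimitive).map(scorePrimitive);
  return Math.max(0, ...scores);
}

// Example: two EXFILTRATION patterns (6 + 5, capped at 10) and one
// MANIPULATION pattern (4) yield a composite score of 10.
compositeScore({ EXFILTRATION: [6, 5], MANIPULATION: [4] }); // → 10
```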

Score   Level     Meaning
0-2     SAFE      No significant risk detected
3-6     CAUTION   Potential risk, review recommended
7+      DANGER    High risk, action should be blocked
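The thresholds in the table map to levels as follows; the function name is ours for illustration, not an SDK export.

```javascript
// Map a composite score (0-10) to the level names used in the docs.
function levelForScore(score) {
  if (score >= 7) return 'DANGER';  // high risk, block the action
  if (score >= 3) return 'CAUTION'; // potential risk, review recommended
  return 'SAFE';                    // no significant risk detected
}

levelForScore(9); // → "DANGER"
```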

Next Steps