GuardLLM, hardened tool calls for LLM apps

Author: mhcoen

by mhcoen·Feb 14, 2026·1 point·0 comments


AI Analysis

●●● Banger · Big Brain · Solve My Problem · Wizardry

A lifecycle-aware security pipeline rather than a set of point tools, with shared context from ingress through output.

Strengths

•Architectural insight: treats security as a full data lifecycle problem, not isolated checks, enabling context-aware decisions downstream.
•Concrete controls with measurable performance: 0.1ms processing, 100% coverage vs 61% from point tools, runs fully local without external APIs.
•Addresses a real LLM security gap: existing defenses are either model-dependent (slow) or fragmented (OPA, Casbin, etc. don't share context).

Weaknesses

•Early stage with no releases published; adoption will depend on integration with major agent frameworks (LangChain, LlamaIndex, etc.).
•Python-only, which limits applicability in polyglot AI stacks; Node.js or Go equivalents would unlock broader use.

Post Description

Most agent frameworks treat prompt injection as a model-level problem. In practice, once your agent ingests untrusted text and has tool access, you need application-layer controls — structural isolation, tool-call gating, exfiltration detection — that don't depend on the model behaving correctly. I built guardllm to provide those controls. guardllm is a small, auditable Python library that provides:

•Inbound hardening: sanitize and structurally isolate untrusted content (web, email, docs, tool output) so it is treated as data, not instructions.
•Tool-call firewall: deny-by-default destructive operations unless explicitly authorized; fail-closed confirmation when no confirmation handler is wired.
•Request binding: bind (tool name, canonical args, message hash, TTL) to prevent replay and argument substitution.
•Exfiltration detection: scans outbound tool arguments for secret patterns and flags substantial verbatim overlap with recently ingested untrusted content.
•Provenance tracking: enforces stricter no-copy rules on content with known untrusted origin, independent of the overlap heuristic.
•Canary tokens: per-session canary generation and detection to catch prompt leakage into outputs.
•Source gating: blocks high-risk sources from being promoted into long-lived memory or KG extraction to reduce memory poisoning.
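
To make the request-binding idea concrete, here is a minimal sketch that assumes an HMAC over the bound fields. It is not guardllm's actual API: `bind_request`, `verify_request`, `SECRET_KEY`, and the nonce field are hypothetical names chosen for illustration, and the library itself may bind and verify differently.

```python
import hashlib
import hmac
import json
import secrets
import time
from typing import Optional

# Hypothetical key for illustration; a real deployment would use a per-session secret.
SECRET_KEY = b"demo-key-not-for-production"


def canonical_args(args: dict) -> str:
    """Serialize args deterministically so key order cannot change the binding."""
    return json.dumps(args, sort_keys=True, separators=(",", ":"))


def _tag(tool: str, args: dict, message: str, expires: float, nonce: str) -> str:
    """HMAC over (tool name, canonical args, message hash, expiry, nonce)."""
    msg_hash = hashlib.sha256(message.encode("utf-8")).hexdigest()
    payload = json.dumps([tool, canonical_args(args), msg_hash, expires, nonce])
    return hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def bind_request(tool: str, args: dict, message: str, ttl_s: float,
                 now: Optional[float] = None) -> dict:
    """Issue a binding for one specific tool call, valid for ttl_s seconds."""
    now = time.time() if now is None else now
    expires = now + ttl_s
    nonce = secrets.token_hex(16)
    return {"expires": expires, "nonce": nonce,
            "tag": _tag(tool, args, message, expires, nonce)}


def verify_request(binding: dict, tool: str, args: dict, message: str,
                   used_nonces: set, now: Optional[float] = None) -> bool:
    """Accept the call only if it is unexpired, unused, and matches the binding."""
    now = time.time() if now is None else now
    if now > binding["expires"]:
        return False  # TTL elapsed
    if binding["nonce"] in used_nonces:
        return False  # replay of an already-consumed binding
    expected = _tag(tool, args, message, binding["expires"], binding["nonce"])
    if not hmac.compare_digest(expected, binding["tag"]):
        return False  # tool, args, or originating message were substituted
    used_nonces.add(binding["nonce"])
    return True
```

In this sketch the caller keeps `used_nonces` for the session, so a binding can be consumed once. Swapping the tool name, changing any argument, or presenting the binding after its TTL all cause verification to fail.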

It is intentionally minimal and not framework-specific. It does not replace least-privilege credentials or sandboxing; it sits above them.

Repo: https://github.com/mhcoen/guardllm

I'd like feedback on: what threat model gaps you see; whether the default overlap thresholds are reasonable for summarization and quoting workflows; and which framework adapters would make this easiest to adopt (LangChain, OpenAI tool calling, MCP proxy, etc.).
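
For readers weighing the overlap-threshold question, the sketch below shows one common way a verbatim-overlap heuristic can be built: the share of an outbound argument's word n-grams that also appear in untrusted content. This is an illustrative assumption, not guardllm's implementation. The function names, the 5-word window, and the 0.5 threshold are all hypothetical and are not the library's defaults.

```python
import re


def word_ngrams(text: str, n: int) -> set:
    """Return the set of lowercase word n-grams in text."""
    words = re.findall(r"\w+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_ratio(outbound: str, untrusted: str, n: int = 5) -> float:
    """Fraction of outbound n-grams that occur verbatim in the untrusted text."""
    out = word_ngrams(outbound, n)
    if not out:
        return 0.0  # fewer than n words: too short to judge
    return len(out & word_ngrams(untrusted, n)) / len(out)


def flags_overlap(outbound: str, untrusted: str, n: int = 5,
                  threshold: float = 0.5) -> bool:
    """Flag the outbound text when verbatim overlap meets the (hypothetical) threshold."""
    return overlap_ratio(outbound, untrusted, n) >= threshold


untrusted_doc = ("the quarterly revenue figures are confidential and must not "
                 "leave the finance team under any circumstances")
copied = "the quarterly revenue figures are confidential and must not leave the finance team"
summary = "Finance wants its revenue numbers kept internal."

print(flags_overlap(copied, untrusted_doc))   # True: every 5-gram is copied
print(flags_overlap(summary, untrusted_doc))  # False: paraphrase shares no 5-gram
```

Under this kind of heuristic, paraphrased summaries tend to score near zero, while quoting workflows score high by design. A quote-heavy use case would therefore need a larger window, a higher threshold, or an explicit allowance for attributed quotes.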

Similar Projects

AI/ML · ●●● Banger

Castor – a secure execution layer for LLM agents

Kernel interception stops runaway agents where LangGraph and AutoGen only advise.

Big Brain · Solve My Problem · Ship It

claytonia

10 · 2mo ago

Developer Tools · ●● Solid

Carapace – A security-hardened Rust alternative to OpenClaw

Hardened Rust alternative to OpenClaw, but early (v0.1 preview, still rough edges).

Big Brain · Niche Gem

puremachinery

20 · 4mo ago

Security · ●● Solid

SecureClaw – Open-Source Security Layer for OpenClaw Agents

The two-layer approach — a code plugin for gates/hardening plus a tiny ~1,230-token LLM skill for behavioral rules — is smart and practical. I appreciate that detection runs in bash (no token bloat) and that they mapped concrete checks to OWASP ASI and MITRE frameworks; the tradeoff is obvious: this is highly valuable if you run OpenClaw, but mostly irrelevant outside that ecosystem.

Niche Gem · Big Brain

alex_polyakov

21 · 3mo ago

Security●●Solid