Trust No Skill: BIV Audit Finds 80% of AI Agent Skills Misbehave
Unit 42's Behavioral Integrity Verification scanned 49,943 OpenClaw skills and found 80% deviate from declared behavior, with multi-stage attack chains enabling credential theft...

Executive Summary
Palo Alto Networks Unit 42 researchers have developed a new audit primitive called Behavioral Integrity Verification (BIV) that scans AI agent skills for hidden malicious behavior. Applied to 49,943 skills from the OpenClaw public registry in early 2026, BIV found that 80% of skills (39,933) exhibit at least one mismatch between what they claim to do and what they actually do. While most mismatches are benign documentation errors, a dangerous subset contains multi-stage attack chains — combining individually innocuous capabilities into credential theft, remote code execution (RCE), or silent data exfiltration. The research, published June 11, 2026, positions the agent-skill ecosystem where mobile apps and browser extensions were a decade ago: extensibility has outpaced supply-chain audit primitives.
Technical Analysis
AI agents extend their functionality through third-party "skills" — small packages bundling executable code (Python, JavaScript, shell), a YAML manifest, and a natural-language SKILL.md file that tells the agent when and how to use the skill. Once installed, a skill runs inside the agent's privileged context, with access to environment variables, file systems, external services, and shell commands.
BIV addresses a unique audit challenge: a skill's behavior splits across three modalities — metadata, executable code, and natural-language instructions. No existing scanner reads all three simultaneously. BIV uses a fixed taxonomy of 29 capabilities organized into seven families (Network, File system, Process execution, Environment, Encoding, Credentials, Instruction-level threats). Two parallel tracks populate the taxonomy: a "declared track" parses metadata and uses an LLM to extract claimed capabilities from natural-language descriptions (grounded in quoted source spans), and an "actual track" applies static analyzers (AST-level taint analysis, regex, pattern matching) to code and an LLM to instructions for prompt-injection and instruction-override motifs.
A skill passes when its actual capability set fits within its declared set. It fails when it performs an undeclared action (under-specification — the dangerous direction) or declares a capability it never uses (over-specification — usually benign template residue). Three filters keep LLM outputs honest: rejecting verbatim taxonomy echoes, requiring source-span anchoring, and demanding domain-specific keywords for high-risk capabilities. Every flagged deviation ships with file-and-line evidence for manual audit.
Across 49,943 OpenClaw skills, BIV surfaced 250,706 behavioral deviations. A clustering pass produced 137 distinct threat clusters and four novel compound threat categories:
- Exfiltration chains:
FILE_READ→ base64 encoding →NETWORK_SEND - Remote code execution (RCE) chains: download → write to disk → execute
- Code obfuscation: encoding chain →
dynamic eval() - Data lineage violations:
FILE_READ→FILE_WRITE(mostly benign data-pipeline boilerplate)
The threat lives in the chain, not the individual steps. A skill that reads a file is benign; a skill that reads a file, base64-encodes the content, and sends it to an external endpoint is exfiltration.
Mitigations & Recommendations
Security teams running LLM agents in production should inventory all installed third-party skills and require a behavioral-integrity check before installation — not after. Unit 42 recommends treating skills like any other third-party dependency: apply the principle of least privilege, restrict network egress from agent environments, and monitor for unexpected file reads or process executions. Until automated audit primitives like BIV become standard in registries, manual review of skill manifests and code for multi-step patterns (read-encode-send, download-write-execute) is advised. Palo Alto Networks customers can leverage Prisma AIRS and the Unit 42 AI Security Assessment for deeper protection.
Stay Updated
Get the latest cybersecurity news delivered to your inbox.

