Methodology

How AI Fleet Shield reviews every skill.

Fleet Shield is an automated pattern scanner. It reads each OpenClaw skill’s source, looks for known risk patterns, rewards defensive practices, and returns a score, a grade, and a plain-English findings summary. It’s decision support, not a certification.

Example ratings:
A · 96/100 · Safe
C · 68/100 · Caution
D · 42/100 · Dangerous
F · 18/100 · Blocked

What Fleet Shield actually does

1. Read the source

We fetch each skill’s full SKILL.md and supporting source. The scanner runs on plain text — no sandboxed execution, no guessing at intent. You can read the same source yourself on every skill page.

2. Match risk patterns

The scanner checks for 20 malicious patterns (code execution, credential access, external calls, obfuscation, surveillance) and 6 positive signals (such as validation, rate limiting, and auth checks). Every match is reported with a line reference.

3. Score and rescan

Each match deducts points, and each defensive signal adds points. The result is a final score from 0 to 100, a grade from A to F, and a verdict of safe, caution, dangerous, or blocked. When the source changes, we rescan.
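The matching step can be pictured as a line-by-line regex pass over the skill's text. The sketch below is a minimal illustration only: the `RULES` list, the regexes, and the `Finding` and `scan` names are hypothetical stand-ins, not Fleet Shield's actual rules or code. It shows how a plain-text scanner can report each match with a line reference without ever executing the skill.

```python
import re
from dataclasses import dataclass

# Hypothetical rules for illustration; the real scanner's regexes are not published here.
RULES = [
    ("Curl Piped to Shell", "critical", re.compile(r"curl\s+[^|\n]*\|\s*(?:ba|z)?sh\b")),
    ("Shell Command Execution", "high", re.compile(r"\bchild_process\b|\bos\.system\(")),
    ("Hardcoded IP Address", "medium", re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")),
]


@dataclass(frozen=True)
class Finding:
    pattern: str
    severity: str
    line: int      # 1-based line number in the scanned source
    excerpt: str   # the matching line, stripped of surrounding whitespace


def scan(source: str) -> list[Finding]:
    """Match every rule against every line of plain text; nothing is executed."""
    findings = []
    for number, text in enumerate(source.splitlines(), start=1):
        for name, severity, regex in RULES:
            if regex.search(text):
                findings.append(Finding(name, severity, number, text.strip()))
    return findings
```

Calling `scan` on a skill's text returns one `Finding` per rule per matching line, which is enough to build a findings summary that points readers at the exact line.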

20 risk patterns we look for

Each match deducts points from a skill’s 100-point starting score. Severity drives the deduction. A skill with two or more critical matches is flagged blocked; one critical match is flagged dangerous.

| Pattern | Severity | What we’re looking for | Points deducted |
| --- | --- | --- | --- |
| Remote Code Execution (fetch + eval) | critical | Downloads code from a remote URL and executes it. The #1 attack vector for malicious skills. | 25 |
| Shell Command Execution | high | Executes operating system commands. Can be used to install backdoors or exfiltrate data. | 15 |
| Environment Variable Theft | high | Accesses sensitive environment variables like API keys, secrets, or passwords. | 15 |
| SSH Credential Access | critical | Attempts to read SSH private keys or authorized_keys files for lateral movement. | 25 |
| Cryptocurrency Wallet Access | critical | Attempts to access wallets, private keys, or seed phrases. | 25 |
| Data Exfiltration | high | Uses browser APIs to silently send data to external servers. | 15 |
| Obfuscated Code Execution | critical | Decodes base64 data and executes it. Used to hide malicious payloads from scanners. | 25 |
| File System Write/Delete | medium | Writes or deletes files on disk. Could be used to modify configs or plant backdoors. | 8 |
| .env File Access | high | Reads .env files, which typically contain all application secrets. | 15 |
| Hardcoded IP Address | medium | Contains a hardcoded IP address, which could be a command-and-control server. | 8 |
| Tunnel Service URL | high | References a tunneling service used to expose local servers to the internet. | 15 |
| Browser Storage Access | medium | Accesses cookies or browser storage, which may contain auth tokens. | 8 |
| Curl Piped to Shell | critical | Downloads and immediately executes a remote script. Classic supply chain attack. | 25 |
| Remote Download + Execute | critical | Downloads a file and makes it executable. Used to install persistent malware. | 25 |
| Surveillance / Keylogging | critical | Contains references to keylogging, screen capture, or device access. | 25 |
| Reverse / Bind Shell | critical | Attempts to create a reverse or bind shell for remote access. | 25 |
| Sandbox Disabling | high | Disables security sandboxes, removing isolation protections. | 15 |
| Prompt Injection Attempt | high | Contains patterns used for prompt injection or agent instruction override. | 15 |
| OpenClaw Config Access | high | Attempts to read or modify OpenClaw configuration files directly. | 15 |
| Excessive External URLs | low | References many external URLs outside GitHub/npm. May indicate data exfiltration endpoints. | 3 |

6 positive signals we reward

Defensive coding patterns add points back. They don’t erase critical findings, but they raise the score on otherwise clean skills.

| Signal | Points | What we reward |
| --- | --- | --- |
| Input Validation / Sanitization | +3 | Uses sanitize, validate, escape, DOMPurify, or schema-parse patterns. |
| Rate Limiting | +2 | Throttles or caps request volume to external services. |
| Authorization Checks | +2 | Verifies identity or permissions before performing an action. |
| Error Handling | +1 | Uses try/catch, .catch, or explicit error handlers. |
| Timeout / Retry Limits | +1 | Uses timeouts, AbortController, or bounded retries to avoid runaway calls. |
| Read-Only Operations | +2 | Uses readonly types, Object.freeze, or explicit immutability. |
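To make the bonus side concrete, here is a minimal sketch of how the six signals could be detected and summed. The point values are the ones listed above. The regexes, the `SIGNALS` and `positive_bonus` names, and the choice to count each signal at most once per skill are all assumptions made for illustration, not a description of Fleet Shield's implementation.

```python
import re

# Point values are taken from the list above; the regexes are hypothetical stand-ins.
SIGNALS = {
    "Input Validation / Sanitization": (3, re.compile(r"sanitize|validate|escape|DOMPurify", re.I)),
    "Rate Limiting": (2, re.compile(r"rate[-_ ]?limit|throttle", re.I)),
    "Authorization Checks": (2, re.compile(r"authoriz|isAuthenticated|hasPermission", re.I)),
    "Error Handling": (1, re.compile(r"\btry\b|\bcatch\b|\.catch\(")),
    "Timeout / Retry Limits": (1, re.compile(r"timeout|AbortController|maxRetries", re.I)),
    "Read-Only Operations": (2, re.compile(r"\breadonly\b|Object\.freeze")),
}


def positive_bonus(source: str) -> tuple[int, list[str]]:
    """Return the total bonus and the names of the signals found.

    Assumption: each signal counts once per skill, however often it appears.
    """
    matched = [name for name, (_, regex) in SIGNALS.items() if regex.search(source)]
    return sum(SIGNALS[name][0] for name in matched), matched
```

Under the once-per-skill assumption, the bonus can never exceed 11 points, which fits the statement that positive signals do not erase critical findings.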

Scoring, grades, and verdicts

Every skill starts at 100. Malicious matches deduct points, positive signals add points, and the result is clamped to the range 0-100. The grade is derived from the final score. The verdict also considers how many critical matches were found.
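The arithmetic can be sketched in a few lines. The per-severity deductions below are the values from the risk pattern table (critical 25, high 15, medium 8, low 3); the function and constant names are hypothetical.

```python
# Deductions per severity, as listed in the risk pattern table.
SEVERITY_POINTS = {"critical": 25, "high": 15, "medium": 8, "low": 3}


def final_score(severities: list[str], bonus: int = 0) -> int:
    """Start at 100, subtract per match, add the bonus, clamp to 0-100."""
    deducted = sum(SEVERITY_POINTS[severity] for severity in severities)
    return max(0, min(100, 100 - deducted + bonus))
```

For example, one high match and one medium match with a +3 bonus give 100 − 15 − 8 + 3 = 80. The clamp means a clean skill with bonuses still tops out at 100, and a skill with five critical matches bottoms out at 0 rather than going negative.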

Grade thresholds

A: Score 90-100
B: Score 75-89
C: Score 60-74
D: Score 40-59
F: Score 0-39
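These thresholds map directly onto a small lookup. The sketch below uses a hypothetical `grade` function; only the cut-offs come from the list above.

```python
def grade(score: int) -> str:
    """Map a clamped 0-100 score to a letter grade using the thresholds above."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for floor, letter in ((90, "A"), (75, "B"), (60, "C"), (40, "D")):
        if score >= floor:
            return letter
    return "F"
```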

Verdict levels

Safe

No critical matches. Score 75 or higher. Review the findings summary before you run it.

Caution

No critical matches but score below 75. Medium / high severity findings are reported. Read them before using in production.

Dangerous

Exactly one critical match. Not recommended without review by a technical person.

Blocked

Two or more critical matches. Flagged blocked on the catalog.
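Read together, the four verdict levels amount to a short decision rule: the critical-match count is checked first, and the score only separates safe from caution. A minimal sketch, with a hypothetical function name:

```python
def verdict(score: int, critical_matches: int) -> str:
    """Apply the verdict levels above: critical count first, then the score."""
    if critical_matches >= 2:
        return "blocked"
    if critical_matches == 1:
        return "dangerous"
    return "safe" if score >= 75 else "caution"
```

Because the critical count takes priority, a skill with one critical match is dangerous even if defensive signals lift its score.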

Rescan policy

Software changes. A skill that scored safe today can ship an update tomorrow that introduces a new risk. Fleet Shield rescans skills when their source changes. If the score drops a grade, the verdict changes, or new critical findings appear, the scorecard is updated and the skill’s listing reflects the new state.
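One way to picture the rescan rule is as two checks: has the source changed, and if so, does the new scorecard differ in a way that matters? The sketch below is an assumption-heavy illustration. Detecting change by comparing SHA-256 hashes, the `Scorecard` fields, and every function name are hypothetical; only the three update conditions come from the policy above.

```python
import hashlib
from dataclasses import dataclass

GRADE_ORDER = "ABCDF"  # best to worst


@dataclass(frozen=True)
class Scorecard:
    source_hash: str
    score: int
    grade: str
    verdict: str
    critical_patterns: frozenset[str]


def source_fingerprint(source: str) -> str:
    """Hash the source text so a change can be detected cheaply (assumed mechanism)."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def needs_rescan(previous: Scorecard, current_source: str) -> bool:
    return source_fingerprint(current_source) != previous.source_hash


def listing_changes(old: Scorecard, new: Scorecard) -> list[str]:
    """List which of the three update conditions from the policy apply."""
    changes = []
    if GRADE_ORDER.index(new.grade) > GRADE_ORDER.index(old.grade):
        changes.append("grade dropped")
    if new.verdict != old.verdict:
        changes.append("verdict changed")
    if new.critical_patterns - old.critical_patterns:
        changes.append("new critical findings")
    return changes
```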

Every scorecard shows the date of the most recent scan. You can see exactly what was checked and what was found, line by line, on the skill’s scorecard page.

What Fleet Shield is, and is not

Fleet Shield is: an automated, transparent pattern scanner that flags known risk patterns and rewards defensive coding practices; a consistent scoring method applied across the full catalog; and a public findings summary you can read before deciding what to run.

Fleet Shield is not: a certification, a guarantee of safety, a guarantee of fitness for your use case, an audit of every possible exploit, or a substitute for your own review. Static pattern scanning cannot catch every form of obfuscation, and we do not run skills in a sandbox.

Fleet Shield is decision support. We flag what we find. We do not guarantee security or fitness for purpose. You decide what to run.

Fleet Shield covers every scored skill

Every skill in the catalog carries a Fleet Shield scorecard. Browse the full library or jump to a curated outcome stack below.

Browse 17,644+ scored skills
