
Shell Allowlist: 63 Safe Binaries, Everything Else Blocked

0 security incidents | Productivity & Security | 5 min read

Key Takeaway

With 30+ AI agents sharing one server, we lock every agent to 63 pre-approved shell commands. Anything else gets blocked instantly, no exceptions, forming the first layer of a three-layer defense system.

The Problem

Here's the scenario that keeps me up at night: 31 AI agents running on one Ubuntu server. Each agent can execute shell commands. Each agent talks to an LLM that might hallucinate, get prompt-injected, or just do something stupid.

What happens when an agent decides to run rm -rf /? Or curl a payload from an unknown IP? Or chmod 777 on your SSH keys?

"The model wouldn't do that" isn't a security policy. Models do unexpected things. They get confused. They follow injected instructions from scraped web pages. They try creative solutions that happen to be destructive.

The real question isn't if an agent will try something dangerous. It's when. And your system needs to handle that moment with zero damage.

Most people solve this with vibes. "The model is aligned." "I told it not to." That's not security. That's hope.

The Solution

Mr.Chief's shell allowlist: a whitelist of exactly 63 binaries that agents are permitted to execute. Everything else returns an immediate error: "command not in allowlist". No fallback. No override. No "are you sure?"

Combined with Docker sandboxing and filesystem isolation, this creates three independent layers of defense. An agent would need to bypass all three to cause real damage.
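The allowlist check itself fits in a few lines. The following is a minimal sketch, not Mr.Chief's actual implementation: the names `ALLOWLIST`, `CommandBlocked`, and `check_command` are hypothetical, the list is abbreviated, and the message format follows the blocked-command output shown in The Process.

```python
import shlex
from typing import List, Set

# Abbreviated for the example; the real list would come from config.yaml.
ALLOWLIST: Set[str] = {"ls", "cat", "head", "tail", "grep"}


class CommandBlocked(Exception):
    """Raised when a command's binary is not on the allowlist."""


def check_command(command_line: str, allowlist: Set[str] = ALLOWLIST) -> List[str]:
    """Split a command line and return its argv if the binary is allowed."""
    try:
        argv = shlex.split(command_line)
    except ValueError as exc:  # e.g. unbalanced quotes
        raise CommandBlocked(f"Command blocked: cannot parse ({exc})") from exc
    if not argv:
        raise CommandBlocked("Command blocked: empty command")
    binary = argv[0]
    # Exact-name match: "/bin/rm" or "ls;" do not match "rm" or "ls".
    if binary not in allowlist:
        raise CommandBlocked(f"Command blocked: '{binary}' not in allowlist")
    return argv
```

This sketch checks only the first token. It assumes the returned argv is then executed without a shell (for example `subprocess.run(argv)`), so operators such as `&&` or `|` arrive as literal arguments rather than chaining a second command.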

The Process

The allowlist lives in Mr.Chief's configuration. Here's what it looks like:

# config.yaml
security:
  sandbox: require          # All non-main agents run in Docker
  shellAllowlist:
    - ls
    - cat
    - head
    - tail
    - grep
    - sed
    - awk
    - sort
    - uniq
    - wc
    - find
    - echo
    - printf
    - date
    - whoami
    - pwd
    - cd
    - cp
    - mv
    - mkdir
    - touch
    - chmod
    - tar
    - gzip
    - gunzip
    - curl           # allowed: outbound requests needed for skills
    - wget
    - git
    - python3
    - pip
    - node
    - npm
    - npx
    - jq
    - yq
    - tr
    - cut
    - paste
    - diff
    - patch
    - tee
    - xargs
    - env
    - export
    - which
    - file
    - stat
    - du
    - df
    - basename
    - dirname
    - realpath
    - readlink
    - md5sum
    - sha256sum
    - base64
    - sleep
    - true
    - false
    - test
    - "["           # test alias (quoted: a bare [ would start a YAML flow sequence)
    - ssh
    - scp
    - rsync
    - playwright
    - chromium
    - timeout
    - kill          # own processes only (sandbox-scoped)

63 binaries. That's it. Everything an agent legitimately needs for file operations, data processing, web requests, scripting, and development. Nothing it needs for system administration, package management, service control, or destruction.
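Because the list is the security boundary, it can help to lint it before it is loaded. The sketch below assumes the `shellAllowlist` entries have already been parsed into a Python list. The function name `lint_allowlist` and the three rules are illustrative assumptions, not part of Mr.Chief.

```python
from collections import Counter
from typing import List


def lint_allowlist(entries: List[str]) -> List[str]:
    """Return human-readable problems found in an allowlist."""
    problems: List[str] = []
    # Duplicates make the real size of the list hard to audit.
    for name, count in Counter(entries).items():
        if count > 1:
            problems.append(f"duplicate entry: {name!r}")
    for name in entries:
        if not name or name != name.strip():
            problems.append(f"empty or padded entry: {name!r}")
        elif "/" in name:
            # Entries are expected to be bare binary names, not paths.
            problems.append(f"path instead of bare name: {name!r}")
        elif any(ch.isspace() for ch in name):
            # "git status" is a command line, not a binary.
            problems.append(f"entry contains whitespace: {name!r}")
    return problems
```

An empty result means the list passed these checks. Comparing `len(entries)` against the number you document is a cheap extra guard against the two drifting apart.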

Here's what happens when an agent crosses the line:

Agent → exec: rm -rf /tmp/workspace
┌─────────────────────────────────────────────┐
│ ❌ Command blocked: 'rm' not in allowlist   │
│                                             │
│ Allowed commands: ls, cat, head, tail, ...  │
│ Request approval from operator to proceed.  │
└─────────────────────────────────────────────┘

Real examples from our logs:

[2026-02-18 14:22] agent:research:sub3 → rm -rf /tmp → BLOCKED
[2026-02-19 09:15] agent:scraper:sub1  → apt-get install → BLOCKED
[2026-02-21 11:47] agent:finance:sub2  → systemctl restart → BLOCKED
[2026-03-01 16:33] agent:content:sub1  → sudo su → BLOCKED

None of these were attacks. They were agents trying to be "helpful": cleaning up temp files, installing a missing package, restarting a service. All reasonable intent. All blocked. Because reasonable intent with unrestricted access is how servers die.
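Log lines in this shape are easy to tally, which shows which binaries agents keep reaching for. This is a rough sketch that assumes the format is exactly `[timestamp] agent → command → STATUS` with `→` as the separator. The regex and the `blocked_binaries` helper are hypothetical, and the sample lines are the four entries above.

```python
import re
from collections import Counter
from typing import Iterable

# Assumed line format: [timestamp] agent → command → STATUS
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\]\s+(?P<agent>\S+)\s+→\s+"
    r"(?P<command>.+?)\s+→\s+(?P<status>\w+)$"
)


def blocked_binaries(lines: Iterable[str]) -> Counter:
    """Count blocked commands by their first token (the binary name)."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("status") == "BLOCKED":
            counts[match.group("command").split()[0]] += 1
    return counts


sample = [
    "[2026-02-18 14:22] agent:research:sub3 → rm -rf /tmp → BLOCKED",
    "[2026-02-19 09:15] agent:scraper:sub1  → apt-get install → BLOCKED",
    "[2026-02-21 11:47] agent:finance:sub2  → systemctl restart → BLOCKED",
    "[2026-03-01 16:33] agent:content:sub1  → sudo su → BLOCKED",
]
print(blocked_binaries(sample).most_common())
```

Lines that do not match the assumed format are skipped rather than raising, so a tally like this is a review aid, not an audit trail.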

The three layers work together:

Layer 1: Shell Allowlist (63 binaries)
  ↓ command passes allowlist
Layer 2: Docker Sandbox (isolated container)
  ↓ runs in ephemeral container with limited mounts
Layer 3: Filesystem Isolation (workspace only)
  ↓ can only read/write within agent workspace
  ✅ Command executes safely
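Layers 2 and 3 can be sketched in the same spirit. The code below is an illustration under stated assumptions, not how Mr.Chief wires its sandbox: the helper names, the `/workspace` mount point, and the image name in the test are hypothetical. The `docker run` flags used (`--rm`, `--read-only`, `-v`, `-w`) are standard Docker CLI options, and the function only builds the argument list without running anything.

```python
from pathlib import Path
from typing import List


def is_inside_workspace(workspace: str, candidate: str) -> bool:
    """Layer 3: True if candidate resolves to a path inside workspace."""
    root = Path(workspace).resolve()
    # Joining with an absolute candidate yields the candidate itself,
    # and resolve() collapses ".." segments before the comparison.
    target = (root / candidate).resolve()
    return target == root or root in target.parents


def docker_argv(image: str, workspace: str, argv: List[str]) -> List[str]:
    """Layer 2: build a docker run command for an already-allowed argv."""
    root = Path(workspace).resolve()
    return [
        "docker", "run",
        "--rm",                      # ephemeral: container removed on exit
        "--read-only",               # container root filesystem is read-only
        "-v", f"{root}:/workspace",  # the workspace is the only mount
        "-w", "/workspace",          # start inside the workspace
        image,
        *argv,
    ]
```

Comparing resolved path components instead of string prefixes means a sibling directory such as `/tmp/ws-other` is not mistaken for being inside `/tmp/ws`.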

The Results

Allowed binaries: 63
Blocked commands (past 30 days): 47
Security incidents from agents: 0
False positives (legitimate commands blocked): 3
Time to add a binary to allowlist: ~30 seconds
Agents affected: All non-main agents (30)
Performance overhead: Negligible (~2ms per command check)

The three false positives were real needs: an agent needed zip for a file export, another needed ffmpeg for audio processing, and one needed pip3 as a separate alias. Each was reviewed, approved, and added to the allowlist in under a minute.

That's the design. It's not meant to never block legitimate commands. It's meant to always block dangerous ones. The approval process is fast. The damage from rm -rf / is not.

Try It Yourself

# In your Mr.Chief config.yaml
security:
  sandbox: require
  shellAllowlist:
    - ls
    - cat
    - grep
    # ... add what your agents actually need
    # Start minimal. Add as agents request.

Start with 20 commands. Let agents hit the wall. Review what they need. Add deliberately. You'll converge on your own allowlist within a week.


63 commands is plenty. Everything else is a conversation.
