Founder
52 Skills Installed: Which Ones Actually Get Used? The Agent Audit
Key Takeaway
Running a skill audit across our 52 installed Mr.Chief skills revealed 8 dead-weight skills to prune, 3 outdated versions to update, and 2 custom skills worth publishing to ClawHub.
The Problem
Mr.Chief skills are powerful. Each one gives the agent a new capability: web scraping, financial analysis, email outreach, competitive intel. So I installed them aggressively. Every skill that looked useful got installed. Over 6 months, I accumulated 52 skills.
The problem with 52 skills: context window pollution.
Every installed skill adds to the agent's available context. Skill descriptions, capability summaries, parameter schemas: it all consumes tokens. At 52 skills, the agent was spending roughly 18,000 tokens of context per turn just knowing what it could do, before doing anything.
Worse: some skills overlapped. I had three different ways to search the web, two approaches to email drafting, and a LinkedIn scraper I'd literally never used. The agent occasionally picked the wrong skill for a task because the descriptions were too similar.
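One way to spot that kind of overlap before the agent trips on it is to compare skill descriptions pairwise. The sketch below uses Jaccard similarity over word sets; the skill names, descriptions, and the 0.5 threshold are hypothetical placeholders, not values from our setup, and a real audit might use embeddings instead.

```python
from itertools import combinations


def tokenize(description: str) -> set[str]:
    """Lowercased word set, skipping words of three characters or fewer."""
    return {w.strip(".,;:()").lower() for w in description.split() if len(w) > 3}


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |A & B| / |A | B|; 0.0 when both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def overlapping_pairs(descriptions: dict[str, str], threshold: float = 0.5):
    """Return (skill_a, skill_b, similarity) for pairs at or above the threshold."""
    words = {name: tokenize(text) for name, text in descriptions.items()}
    pairs = []
    for a, b in combinations(sorted(words), 2):
        sim = jaccard(words[a], words[b])
        if sim >= threshold:
            pairs.append((a, b, round(sim, 2)))
    return pairs


# Hypothetical skill names and descriptions, for illustration only.
skills = {
    "web-search-a": "Search the web and return ranked results with snippets",
    "web-search-b": "Search the web and return results with page snippets",
    "email-drafter": "Draft outreach emails from a short brief",
}
print(overlapping_pairs(skills))
```

Pairs that score high are candidates for consolidation or for rewriting one description so the two skills read as clearly distinct.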
Skills are like apps on your phone. You install with optimism, use 30% regularly, and let the rest rot. Except these apps cost tokens every single turn.
The Solution
Alfrawd runs a skill audit. Check every installed skill against actual usage data. Score by frequency, recency, context cost, and value delivered. Prune what's dead. Update what's stale. Publish what's novel.
The ClawHub skill handles the marketplace operations: search, install, update, publish. The agent handles the judgment: what stays, what goes, what's worth sharing.
The Process
Step 1: Inventory and usage analysis
```bash
# List all installed skills
mrchief skills check

# Output:
# 52 skills installed
# Last audit: never
# Context budget: ~18,000 tokens on skill descriptions alone
```
The agent cross-references with session logs to build a usage profile:
```text
# Agent analyzes session transcripts for skill invocations
# and builds a usage matrix:

Skill Usage Report (Last 30 Days)
──────────────────────────────────────────────────────────────
Skill                         | Uses | Last Used   | Tokens
──────────────────────────────────────────────────────────────
google-workspace              |  142 | today       |    380
notion                        |   89 | today       |    420
excel                         |   67 | yesterday   |    350
reddit-scraper                |   45 | 3 days ago  |    290
weather                       |   38 | today       |    180
session-logs                  |   34 | today       |    250
...
linkedin-influencer-discovery |    0 | never       |    340
programmatic-seo-spy          |    0 | never       |    380
customer-win-back-sequencer   |    0 | never       |    290
messaging-ab-tester           |    0 | 45 days ago |    310
──────────────────────────────────────────────────────────────
```
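How the agent extracts invocations from transcripts is specific to the session log format, so the following is only a sketch of the aggregation step. It assumes the invocations have already been parsed into `(skill_name, date)` pairs; that input shape, the `usage_profile` helper, and the example dates are all hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def usage_profile(invocations, installed, today, window_days=30):
    """Count uses per skill inside the window and compute days since last use."""
    cutoff = today - timedelta(days=window_days)
    uses = Counter()
    last_used = {}
    for skill, day in invocations:
        if day > cutoff:
            uses[skill] += 1
        if skill not in last_used or day > last_used[skill]:
            last_used[skill] = day
    profile = {}
    for skill in installed:
        last = last_used.get(skill)
        profile[skill] = {
            "uses_in_window": uses[skill],
            # None means the skill never appears in the logs at all.
            "days_since_use": (today - last).days if last is not None else None,
        }
    return profile


# Hypothetical parsed log entries, for illustration only.
today = date(2025, 1, 31)
invocations = [
    ("notion", date(2025, 1, 31)),
    ("notion", date(2025, 1, 30)),
    ("messaging-ab-tester", date(2024, 12, 17)),
]
installed = ["notion", "messaging-ab-tester", "linkedin-influencer-discovery"]
profile = usage_profile(invocations, installed, today)
for name, row in profile.items():
    print(name, row)
```

Iterating over the installed list rather than the log entries is the important part: skills with zero invocations never show up in the logs, and those are exactly the ones the audit is looking for.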
Step 2: Pruning decision logic
The agent applies a scoring framework:
```python
def score_skill(skill):
    """Skill value score, nominally 0-100 (the context penalty can push it below 0)."""
    frequency_score = min(skill.uses_30d * 2, 40)      # Max 40 points
    recency_score = max(0, 30 - skill.days_since_use)  # Max 30 points
    context_cost = -skill.token_cost / 50              # Penalty for heavy skills
    value_signal = skill.user_satisfaction * 10        # From feedback signals
    return frequency_score + recency_score + context_cost + value_signal

# Decision thresholds
# Score > 50:  Keep
# Score 20-50: Review (might be seasonal)
# Score < 20:  Prune candidate
```
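To see how the thresholds play out, here is a small worked example. The scoring function is repeated so the snippet runs on its own. The `Skill` dataclass, the `decide` helper, the 0-3 satisfaction scale, the sentinel of 999 days for never-used skills, and the `seasonal-report` entry are all assumptions made for illustration; only the use counts and token costs for the first two skills come from the usage report.

```python
from dataclasses import dataclass


@dataclass
class Skill:
    name: str
    uses_30d: int
    days_since_use: int       # assumption: 999 stands in for "never used"
    token_cost: int
    user_satisfaction: float  # assumption: feedback signal on a 0-3 scale


def score_skill(skill: Skill) -> float:
    frequency_score = min(skill.uses_30d * 2, 40)
    recency_score = max(0, 30 - skill.days_since_use)
    context_cost = -skill.token_cost / 50
    value_signal = skill.user_satisfaction * 10
    return frequency_score + recency_score + context_cost + value_signal


def decide(score: float) -> str:
    """Map a score onto the three decision bands."""
    if score > 50:
        return "keep"
    if score >= 20:
        return "review"
    return "prune"


examples = [
    Skill("google-workspace", 142, 0, 380, 3.0),
    Skill("linkedin-influencer-discovery", 0, 999, 340, 0.0),
    Skill("seasonal-report", 5, 20, 300, 1.0),  # hypothetical skill
]
decisions = {s.name: decide(score_skill(s)) for s in examples}
for s in examples:
    print(f"{s.name}: {score_skill(s):.1f} -> {decisions[s.name]}")
```

Under these assumptions the heavily used skill lands well inside the keep band, the never-used one goes slightly negative, and the occasional one falls into review, which is where a human look is most useful.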
Results:
PRUNE (score < 20, unused 30+ days):
- linkedin-influencer-discovery (0 uses, 340 tokens)
- programmatic-seo-spy (0 uses, 380 tokens)
- customer-win-back-sequencer (0 uses, 290 tokens)
- messaging-ab-tester (0 uses, 310 tokens)
- champion-tracker (0 uses, 260 tokens)
- serp-feature-sniper (0 uses, 350 tokens)
- trending-ad-hook-spotter (0 uses, 280 tokens)
- tech-stack-teardown (0 uses, 320 tokens)
SAVINGS: ~2,530 tokens per turn (8 skills × avg 316 tokens)
UPDATE (newer versions available on ClawHub):
- excel v1.2.0 → v1.4.1 (new pivot table support)
- reddit-scraper v2.0.0 → v2.1.3 (rate limit fixes)
- weather v1.0.0 → v1.1.0 (Open-Meteo fallback)
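The savings figure is simple arithmetic over the prune list, and it is worth re-deriving whenever the list changes. A quick check using the token costs listed above and the ~18,000-token starting budget:

```python
# Token cost of each pruned skill description, as listed in the prune results.
pruned = {
    "linkedin-influencer-discovery": 340,
    "programmatic-seo-spy": 380,
    "customer-win-back-sequencer": 290,
    "messaging-ab-tester": 310,
    "champion-tracker": 260,
    "serp-feature-sniper": 350,
    "trending-ad-hook-spotter": 280,
    "tech-stack-teardown": 320,
}
budget_before = 18_000  # approximate context spent on skill descriptions

total_saved = sum(pruned.values())
average = total_saved / len(pruned)
budget_after = budget_before - total_saved
share = total_saved / budget_before

print(f"saved per turn: {total_saved}")      # 2530
print(f"average per skill: {average:.0f}")   # 316
print(f"budget after prune: {budget_after}") # 15470
print(f"share of budget: {share:.1%}")       # 14.1%
```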
Step 3: Execute
```bash
# Prune unused skills
clawhub uninstall linkedin-influencer-discovery
clawhub uninstall programmatic-seo-spy
clawhub uninstall customer-win-back-sequencer
# ... (5 more)

# Update outdated skills
clawhub update excel
clawhub update reddit-scraper
clawhub update weather

# Verify
mrchief skills check
# 44 skills installed
# Context budget: ~15,470 tokens (-2,530 saved)
```
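If the prune list comes out of a script rather than being typed by hand, a dry-run wrapper makes it easy to review before anything is removed. This is a sketch, assuming `clawhub uninstall <name>` behaves as in the commands above; the `uninstall` helper and its `dry_run` flag are hypothetical and not part of ClawHub.

```python
import shlex
import subprocess


def uninstall(skills, dry_run=True):
    """Print (or run) one `clawhub uninstall` call per skill; return the commands."""
    commands = [["clawhub", "uninstall", name] for name in skills]
    for cmd in commands:
        if dry_run:
            print(shlex.join(cmd))
        else:
            # check=True raises CalledProcessError and stops at the first failure.
            subprocess.run(cmd, check=True)
    return commands


prune = [
    "linkedin-influencer-discovery",
    "programmatic-seo-spy",
    "customer-win-back-sequencer",
    "messaging-ab-tester",
    "champion-tracker",
    "serp-feature-sniper",
    "trending-ad-hook-spotter",
    "tech-stack-teardown",
]
commands = uninstall(prune)  # dry run: only prints the commands
```

Defaulting to a dry run keeps the judgment step separate from the destructive step: the list gets reviewed first, then the same call is repeated with `dry_run=False`.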
Step 4: Publish custom skills
Two skills we built internally proved general enough to share:
```bash
# Publish PyratzLabs custom skills to ClawHub
clawhub publish skills/pyratz-compliance-checker \
  --name "eu-crypto-compliance" \
  --description "MiCA and EU regulatory compliance checker for crypto projects"

clawhub publish skills/pyratz-investor-pipeline \
  --name "investor-pipeline-tracker" \
  --description "Track investor conversations, term sheets, and follow-ups"
```
Both published. Both available for the community.
The Results
| Metric | Before Audit | After Audit |
|---|---|---|
| Skills installed | 52 | 44 |
| Context tokens on skills | ~18,000 | ~15,470 |
| Token savings per turn | n/a | 2,530 (~14%) |
| Outdated skills | 3 | 0 |
| Unused skills (30d) | 8 | 0 |
| Skills published to ClawHub | 0 | 2 |
| Skill selection accuracy | ~85% | ~94% |
The skill selection accuracy improvement was unexpected. With fewer, more distinct skills, the agent picks the right tool more often. Less confusion, fewer retries.
The 14% context savings compound. Every turn, every conversation, every day. Over a month, that's millions of tokens not wasted on skills that were never invoked: at 2,530 tokens per turn, every 1,000 turns adds up to about 2.5 million tokens.
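How quickly the savings add up depends entirely on how busy the agent is. The sketch below projects monthly savings for a few turn volumes; the turns-per-day figures are hypothetical, since the only number taken from the audit is the 2,530 tokens saved per turn.

```python
TOKENS_SAVED_PER_TURN = 2_530  # from the audit results


def monthly_savings(turns_per_day: int, days: int = 30) -> int:
    """Tokens no longer spent on pruned skill descriptions over `days` days."""
    return TOKENS_SAVED_PER_TURN * turns_per_day * days


# Hypothetical activity levels, for illustration only.
for turns in (10, 50, 200):
    print(f"{turns:>3} turns/day -> {monthly_savings(turns):,} tokens/month")
```

With these assumptions the monthly total passes one million tokens at 14 turns per day, so the payoff scales directly with how heavily the agent is used.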
Try It Yourself
- Run `mrchief skills check` to see your current skill inventory
- Cross-reference with actual usage from session logs
- Score each skill: frequency + recency + value signal - context cost
- Prune anything unused for 30+ days (you can always reinstall)
- Check ClawHub for updates: `clawhub search [skill-name]`
- If you've built something useful, publish it: `clawhub publish`
Schedule the audit quarterly. Skills accumulate like browser tabs: you don't notice the bloat until you clear it.
Every installed skill is a standing cost. Audit ruthlessly. Your context window will thank you.