Engineering Manager

Technical Documentation That Developers Actually Read

1-2 week engineer ramp-up vs 4-8 weeks | DevOps & Cloud

Key Takeaway

The Technical Writing skill generates complete documentation: API reference, architecture docs, onboarding guides, ADRs, runbooks, and changelogs. Structured, searchable, and actually useful, not the "we should document this" wall of text nobody maintains.

The Problem

Documentation in most engineering teams:

  • README.md: Written once during initial commit. Never updated. Instructions don't work.
  • Architecture docs: Confluence page from 2 years ago. Completely wrong now.
  • API docs: Swagger/OpenAPI generated from code. Technically accurate, practically useless.
  • Onboarding guide: "Ask Steve." Steve left 6 months ago.
  • Runbooks: Don't exist. Knowledge lives in one person's head.
  • ADRs: What are those? (Architecture Decision Records. Nobody writes them.)

The cost: every new engineer takes 4-8 weeks to become productive (vs 1-2 weeks with good docs). Every incident takes 3x longer to resolve (no runbook). Every architecture discussion re-litigates decisions made years ago (no ADRs).

The Solution

The Technical Writing skill generates structured documentation from your existing code, architecture, and plain-text descriptions.

The Process

You: Generate technical documentation for our platform:
- Architecture overview (3 services + database + cache + queue)
- API reference for our REST API (endpoints attached)
- Onboarding guide for new engineers
- Runbook for the 5 most common incidents
- ADR template + 3 ADRs for our recent decisions
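
The article does not show the ADR template itself, so here is a minimal sketch of what one might look like, assuming the commonly used Status / Context / Decision / Consequences layout. The `ADR` class, the `render_adr` function, and the placeholder text are all hypothetical; only the ADR title is borrowed from the architecture sample in this article.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ADR:
    """One Architecture Decision Record (field names are illustrative)."""
    number: int
    title: str
    status: str
    context: str
    decision: str
    consequences: List[str]


def render_adr(adr: ADR) -> str:
    """Render the record as markdown with a fixed section order."""
    consequences = "\n".join(f"- {item}" for item in adr.consequences)
    return (
        f"# ADR-{adr.number:03d}: {adr.title}\n\n"
        f"## Status\n{adr.status}\n\n"
        f"## Context\n{adr.context}\n\n"
        f"## Decision\n{adr.decision}\n\n"
        f"## Consequences\n{consequences}\n"
    )


# Placeholder content only: the real reasoning has to come from the team.
example = ADR(
    number=3,
    title="Order Service Language Selection",
    status="TODO: Proposed, Accepted, or Superseded",
    context="TODO: what forced this decision and which options were considered",
    decision="TODO: what was chosen and why",
    consequences=["TODO: what gets easier", "TODO: what gets harder"],
)

if __name__ == "__main__":
    print(render_adr(example))
```

Keeping the section order fixed is the point of a template: anyone reopening an old decision knows where to look for the context before re-litigating it.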

Architecture Doc Output:

# Platform Architecture

## System Overview

[Diagram: Client → CDN → API Gateway → Services → Data]

### Service Inventory

| Service | Language | Port | Dependencies | SLA |
|---|---|---|---|---|
| API Gateway | Node.js | 3000 | Auth, Catalog, Orders | 99.9% |
| Catalog Service | Python | 8001 | PostgreSQL, Redis | 99.9% |
| Order Service | Go | 8002 | PostgreSQL, RabbitMQ | 99.99% |

### Data Flow: Order Placement
1. Client → API Gateway: POST /orders
2. API Gateway → Auth Service: Validate JWT
3. API Gateway → Order Service: Create order
4. Order Service → PostgreSQL: Insert order record
5. Order Service → RabbitMQ: Publish "OrderCreated" event
6. Catalog Service: Consume event → Decrement inventory
7. Notification Service: Consume event → Send confirmation email

### Decision: Why Go for Order Service?
See ADR-003: Order Service Language Selection
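
Stepping outside the sample output for a moment: steps 4 to 7 of the order-placement flow describe a publish/subscribe fan-out, which can be sketched in a few lines. This is an in-memory illustration only. `EventBus` stands in for RabbitMQ, a plain list stands in for PostgreSQL, and every class, function, and SKU name here is an assumption rather than part of the platform described above.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List, Tuple

Handler = Callable[[dict], None]


class EventBus:
    """In-memory stand-in for the queue: each event goes to every subscriber."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: dict) -> None:
        for handler in self._handlers[event_type]:
            handler(payload)


def place_order(bus: EventBus, orders: List[dict], sku: str, quantity: int) -> dict:
    """Order Service: store the order (step 4), then publish the event (step 5)."""
    order = {"id": len(orders) + 1, "sku": sku, "quantity": quantity}
    orders.append(order)
    bus.publish("OrderCreated", order)
    return order


def run_demo() -> Tuple[List[dict], Dict[str, int], List[str]]:
    bus = EventBus()
    orders: List[dict] = []
    inventory = {"sku-1": 10}
    emails: List[str] = []

    def decrement_inventory(event: dict) -> None:
        # Catalog Service consumer (step 6)
        inventory[event["sku"]] -= event["quantity"]

    def send_confirmation(event: dict) -> None:
        # Notification Service consumer (step 7)
        emails.append(f"Order {event['id']} confirmed")

    bus.subscribe("OrderCreated", decrement_inventory)
    bus.subscribe("OrderCreated", send_confirmation)
    place_order(bus, orders, "sku-1", 3)
    return orders, inventory, emails


if __name__ == "__main__":
    print(run_demo())
```

The Order Service never calls the Catalog or Notification services directly; it only publishes the event. That decoupling is what the data-flow list documents, and it is worth stating in an architecture doc because it explains why a consumer outage does not block order creation.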

Runbook Output:

# Runbook: Database Connection Pool Exhaustion

## Symptoms
- API response times > 5s
- Logs: "connection pool exhausted" or "too many connections"
- Grafana: pg_active_connections near max_connections (100)

## Severity: P1 (service degradation)

## Diagnosis (5 minutes)
1. Check active connections:
   SELECT count(*) FROM pg_stat_activity;
2. Check idle connections:
   SELECT count(*) FROM pg_stat_activity WHERE state = 'idle';
3. Check long-running queries:
   SELECT pid, now() - pg_stat_activity.query_start AS duration,
   query FROM pg_stat_activity WHERE state = 'active'
   ORDER BY duration DESC LIMIT 10;

## Resolution

### If idle connections are high (>50):
- Terminate connections that have been idle for more than 5 minutes
  (state_change records when a session entered its current state):
  SELECT pg_terminate_backend(pid) FROM pg_stat_activity
  WHERE state = 'idle' AND state_change < now() - interval '5 minutes';
- Root cause: app not releasing connections. Check connection
  pool configuration (PgBouncer or application pool).

### If active connections are high (>80):
- Identify and cancel long-running queries (>30s); pg_cancel_backend
  stops the query but keeps the session open:
  SELECT pg_cancel_backend(pid) FROM pg_stat_activity
  WHERE state = 'active' AND query_start < now() - interval '30 seconds';
- Root cause: slow query or missing index. Check query plans.

### If connections keep growing after cleanup:
- Restart application pods (rolling restart):
  kubectl rollout restart deployment/api-gateway
- Investigate: connection leak in application code

## Prevention
- Set PgBouncer pool_size = 25 per service (total: 75 < 100 max)
- Set application pool max_connections = 20
- Alert at 70% connection usage (70 connections)
- Weekly review of slow query log
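
The runbook's thresholds and pool arithmetic are simple enough to encode and check. The sketch below is an illustration, not part of the generated runbook: the function names are hypothetical, and the numbers (100 max connections, idle > 50, active > 80, alert at 70%, 3 services with a pool of 25 each) are taken directly from the runbook text above.

```python
MAX_CONNECTIONS = 100   # PostgreSQL max_connections, per the runbook
IDLE_THRESHOLD = 50     # "If idle connections are high (>50)"
ACTIVE_THRESHOLD = 80   # "If active connections are high (>80)"
ALERT_PERCENT = 70      # "Alert at 70% connection usage"


def pool_budget(services: int, pool_size: int) -> int:
    """Worst-case connections if every service fills its pool."""
    return services * pool_size


def should_alert(total: int, max_connections: int = MAX_CONNECTIONS) -> bool:
    """True once usage reaches the alert percentage (integer math, no rounding)."""
    return total * 100 >= ALERT_PERCENT * max_connections


def triage(idle: int, active: int) -> str:
    """Pick the runbook branch that matches the observed counts."""
    if idle > IDLE_THRESHOLD:
        return "terminate connections idle for more than 5 minutes"
    if active > ACTIVE_THRESHOLD:
        return "cancel queries running longer than 30 seconds"
    return "no threshold crossed: keep watching for growth"


if __name__ == "__main__":
    print(pool_budget(3, 25), "of", MAX_CONNECTIONS)   # 75 of 100
    print(should_alert(70))                            # True
    print(triage(idle=60, active=10))
```

Writing the thresholds down as constants makes the prevention section checkable: if someone raises a pool size, `pool_budget` shows immediately whether the total still fits under `max_connections`.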

The Results

| Doc Type | Without Agent | With Technical Writing Skill |
|---|---|---|
| Architecture overview | Outdated Confluence | Living doc, updated on changes |
| API reference | Auto-generated (bare) | Annotated with examples + errors |
| Onboarding guide | "Ask someone" | Step-by-step, tested |
| Runbooks | Don't exist | Copy-paste resolution steps |
| ADRs | Don't exist | Decision context preserved |
| New engineer ramp-up | 4-8 weeks | 1-2 weeks |

Setup on MrChief

skills:
  - technical-writing
  - api-design  # For API documentation
