
PR Review Workflow — Agent Summarizes the Diff Before I Read a Single Line

PR review time: 45 min → 10 min · Engineering & DevOps · 5 min read

Key Takeaway

Before I review any pull request, an AI agent pre-analyzes the diff — summarizing changes, flagging risks, checking test coverage, and recommending a verdict. My review time dropped from 45 minutes to 10.

The Problem

Pull request reviews are the bottleneck nobody talks about.

A 500-line PR lands. You open it. You see 12 files changed. You have no context β€” you weren't the one who wrote this code. So you start from scratch: read the PR description (if there is one), skim the diff, try to understand the architecture, look for bugs, check edge cases, verify tests exist.

45 minutes later, you've done a mediocre review because your attention flagged around line 300. The subtle bug in the auth middleware? You scrolled right past it. The missing test for the error path? Didn't notice.

PR reviews require the most cognitive load at the exact moment you have the least context. That's a design flaw.

The Solution

When a new PR arrives on any of our repos, our agent pre-reviews it. By the time I open the PR, I have a summary of what changed, a risk assessment, a test coverage delta, and a recommended verdict. I'm not reading the diff cold — I'm validating an analysis.

The Process

The agent hooks into PR creation events and generates a comprehensive pre-review:

```bash
# Fetch PR details
pr_data=$(gh pr view "$PR_NUMBER" --repo "$REPO" \
  --json title,body,files,additions,deletions,baseRefName,headRefName,\
commits,statusCheckRollup,reviews,linkedIssues)

# Get the full diff
gh pr diff "$PR_NUMBER" --repo "$REPO" > /tmp/pr_diff.patch
```
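The "hooks into PR creation events" part can be a small webhook filter that decides which deliveries kick off a pre-review. A minimal Python sketch, assuming GitHub's `pull_request` webhook payload shape (the function names here are illustrative, not part of any library):

```python
def is_new_pr_event(event_type: str, payload: dict) -> bool:
    """True for webhook deliveries that open a PR (or promote a draft)."""
    if event_type != "pull_request":
        return False
    return payload.get("action") in {"opened", "ready_for_review"}

def pr_identifiers(payload: dict) -> tuple[str, int]:
    """Extract (repo, PR number) to feed into `gh pr view` / `gh pr diff`."""
    repo = payload["repository"]["full_name"]
    number = payload["pull_request"]["number"]
    return repo, number
```

Everything downstream only needs the repo and PR number, so the handler stays thin.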

The pre-review has five components:

1. Change Summary

The agent reads every file in the diff and generates a plain-English summary:

```
📝 PR #156: "Add rate limiting to API endpoints"

Summary of changes:
- New middleware: RateLimitMiddleware in api/middleware/rate_limit.py
- Configuration: Rate limits defined per-endpoint in settings.py
- Redis integration: Uses Redis for distributed rate counting
- 3 endpoint decorators updated to include rate limit headers
- New management command for resetting rate limit counters
```
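Producing that summary is mostly prompt assembly. A sketch of the prompt builder (the LLM call itself is omitted; the field names match the `gh pr view --json` output above, and the truncation limit is an arbitrary assumption):

```python
def build_summary_prompt(pr_data: dict, diff: str, max_diff_chars: int = 20000) -> str:
    """Assemble a summarization prompt from PR metadata plus a truncated diff."""
    files = "\n".join(
        f"- {f['path']} (+{f['additions']}/-{f['deletions']})"
        for f in pr_data["files"]
    )
    return (
        "Summarize this pull request for a reviewer.\n"
        f"Title: {pr_data['title']}\n"
        f"Description: {pr_data.get('body') or '(none)'}\n"
        f"Files changed:\n{files}\n\n"
        f"Diff (truncated):\n{diff[:max_diff_chars]}\n\n"
        "Output: a bulleted plain-English summary of what changed and why."
    )
```

Truncating the diff keeps large PRs inside the model's context window; for very large diffs you would chunk per-file instead.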

2. Risk Assessment

```python
RISK_FACTORS = {
    "high": {
        "patterns": ["middleware", "auth", "security", "payment", "migration"],
        "file_paths": ["settings.py", "urls.py", "wsgi.py", "manage.py"]
    },
    "medium": {
        "patterns": ["model", "serializer", "view", "permission"],
        "thresholds": {"files_changed": 10, "additions": 300}
    }
}

def assess_risk(pr_data: dict, diff: str) -> list[str]:
    risk_flags = []
    for file in pr_data["files"]:
        path = file["path"].lower()
        if (any(p in path for p in RISK_FACTORS["high"]["patterns"])
                or any(path.endswith(fp) for fp in RISK_FACTORS["high"]["file_paths"])):
            risk_flags.append(f"⚠️ HIGH: Modifies {file['path']}")
        elif any(p in path for p in RISK_FACTORS["medium"]["patterns"]):
            risk_flags.append(f"⚠️ MEDIUM: Touches {file['path']}")

    if pr_data["additions"] > 300 and pr_data["deletions"] < 20:
        risk_flags.append("⚠️ MEDIUM: Large addition with minimal cleanup")

    return risk_flags
```

Output:

```
🎯 Risk Assessment: MEDIUM-HIGH
  ⚠️ HIGH: Modifies api/middleware/ (request pipeline)
  ⚠️ MEDIUM: Touches settings.py (configuration)
  ✅ LOW: Test files have matching additions

  Flags:
  - This PR adds middleware that runs on EVERY request.
    Performance impact should be benchmarked.
  - Redis dependency added — ensure Redis is in all environments.
```
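Rolling the per-file flags up into the headline MEDIUM-HIGH level is a small aggregation step. A sketch of one possible rule (the ordering policy here is an assumption, not necessarily the agent's exact logic):

```python
def overall_risk(risk_flags: list[str]) -> str:
    """Collapse individual risk flags into one headline level."""
    has_high = any(f.startswith("⚠️ HIGH") for f in risk_flags)
    has_medium = any(f.startswith("⚠️ MEDIUM") for f in risk_flags)
    if has_high and has_medium:
        return "MEDIUM-HIGH"
    if has_high:
        return "HIGH"
    if has_medium:
        return "MEDIUM"
    return "LOW"
```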

3. Test Coverage Delta

```bash
# Check if new code has corresponding tests
new_files=$(echo "$pr_data" | jq -r '.files[].path' | grep -v test)
test_files=$(echo "$pr_data" | jq -r '.files[].path' | grep test)

# Count added functions/classes vs added test functions
# (added lines in the patch start with '+')
new_functions=$(grep "^+" /tmp/pr_diff.patch | grep -c "def \|class ")
new_tests=$(grep "^+" /tmp/pr_diff.patch | grep -c "def test_\|class Test")
```

Output:

```
🧪 Test Coverage:
  New code: 4 functions, 1 class added
  New tests: 6 test functions added
  Coverage ratio: 1.5 tests per function ✅

  Missing coverage:
  - No test for rate_limit_exceeded response format
  - No test for Redis connection failure fallback
```

4. CI & Merge Status

```
🔄 CI Status: ✅ All 4 checks passing
  - lint: ✅ (12s)
  - test-unit: ✅ (1m 34s)
  - test-integration: ✅ (2m 12s)
  - build: ✅ (3m 01s)

📋 Merge Status:
  - Conflicts: None
  - Branch freshness: 2 commits behind main (rebase recommended)
  - Linked issues: #148 (rate limiting request)
```
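The CI line comes straight from the `statusCheckRollup` field fetched earlier. A sketch of the rendering step (field names follow the check-run shape `gh` returns from the GraphQL API, but verify against your `gh` version):

```python
def ci_summary(status_check_rollup: list[dict]) -> str:
    """Render the statusCheckRollup list from `gh pr view --json` as one line."""
    failing = [c.get("name", "?") for c in status_check_rollup
               if c.get("conclusion") == "FAILURE"]
    if failing:
        return f"❌ {len(failing)} failing: {', '.join(failing)}"
    return f"✅ All {len(status_check_rollup)} checks passing"
```

Pending checks would need a third branch in a real implementation; this sketch only distinguishes pass from fail.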

5. Verdict

```
🏁 Recommendation: REQUEST CHANGES

Reasoning:
1. Core logic looks correct and well-structured
2. Missing tests for error paths (Redis down, malformed headers)
3. No performance benchmark for middleware on hot path
4. settings.py change needs env variable documentation

Suggested action: Ask for 2 additional tests + performance note
in PR description. Approve after.
```
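The verdict itself is a deterministic combination of the earlier signals. A sketch of one possible decision rule (the thresholds are illustrative, not the agent's actual policy):

```python
def recommend(risk_level: str, missing_tests: int, ci_green: bool) -> str:
    """Turn pre-review signals into a recommended verdict."""
    if not ci_green:
        return "REQUEST CHANGES (CI failing)"
    if missing_tests > 0 or risk_level in {"HIGH", "MEDIUM-HIGH"}:
        return "REQUEST CHANGES"
    return "APPROVE"
```

Keeping the rule this simple matters: the human reviewer is validating the recommendation, so it should be easy to see why the agent arrived at it.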

The complete pre-review drops into Telegram, and I can optionally have Nico (our code review agent) post it as a PR comment on GitLab/GitHub.

The Results

| Metric | Before (Manual) | After (Agent Pre-Review) | Improvement |
|---|---|---|---|
| Time to first review comment | 2-4 hours | 5 minutes (auto-generated) | 96% faster |
| Review time per PR | 30-45 min | 10-15 min | 65% faster |
| Missed test gaps per review | ~2 | ~0.3 | 85% fewer misses |
| High-risk changes flagged | ~60% | 98% | Near-complete |
| Review quality (bugs caught) | Baseline | +40% | Focused attention |

Try It Yourself

Use gh pr diff and gh pr view to extract PR data. The summarization, risk scoring, and test coverage analysis can run through any LLM. The patterns for risk detection are project-specific β€” define what "high risk" means for your codebase. Pipe the output wherever your team communicates.
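If you want the glue in one place, the fetch step can also be scripted in Python around the same `gh` commands. A sketch (the injectable `run` parameter exists only so the function can be tested without the network; the JSON field list mirrors the earlier example):

```python
import json
import subprocess

def fetch_pr(repo: str, number: int, run=subprocess.run) -> tuple[dict, str]:
    """Pull PR metadata and the raw diff via the gh CLI."""
    meta = run(
        ["gh", "pr", "view", str(number), "--repo", repo,
         "--json", "title,body,files,additions,deletions,statusCheckRollup"],
        capture_output=True, text=True, check=True,
    ).stdout
    diff = run(
        ["gh", "pr", "diff", str(number), "--repo", repo],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(meta), diff
```

From there, hand the metadata and diff to your summarizer, risk scorer, and test counter, and post the result wherever your team reads it.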


I still review every PR. I just don't start from zero anymore.

GitHub · Code Review · Automation · AI Agents · Pull Requests


PR Review Workflow — Agent Summarizes the Diff Before I Read a Single Line — Mr.Chief