Every senior engineer has reviewed a pull request, spotted an obvious bug, and wondered why the submitter did not catch it themselves. Claude can be that first-pass reviewer — reading every PR automatically and leaving structured comments before a human even opens the diff. This does not replace code review; it makes it better by ensuring the easy things are already handled.

How It Works

A GitHub Actions workflow triggers on every pull request, extracts the diff, sends it to Claude via the Anthropic API, and posts the response as a PR comment. The whole flow takes under 30 seconds and typically costs a few cents per review.

Prerequisites

You need an Anthropic API key stored as a GitHub Actions secret (ANTHROPIC_API_KEY), and Python available in your runner (standard on ubuntu-latest).

name: AI Code Review
on:
  pull_request:
    types: [opened, synchronize]

jobs:
  review:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Get diff
        run: git diff origin/${{ github.base_ref }}...HEAD > pr_diff.txt

      - name: Run Claude review
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          PR_NUMBER: ${{ github.event.pull_request.number }}
          REPO: ${{ github.repository }}
        run: |
          pip install anthropic -q
          python .github/scripts/claude_review.py

The Review Script

Save this as .github/scripts/claude_review.py. The system prompt is where you encode your team’s standards — tailor it to your stack and policies.

import anthropic, os, subprocess

client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])

with open("pr_diff.txt") as f:
    diff = f.read()[:12000]  # stay within token budget

system = """You are a senior IT engineer reviewing infrastructure and automation code.
Focus on:
- Security issues (hardcoded secrets, overly broad permissions, missing input validation)
- Error handling gaps (missing try/catch, unhandled edge cases)
- PowerShell/Python best practices and naming consistency
- Anything that could cause silent failures in production
Be concise. Use bullet points. Start with a one-line summary verdict."""

message = client.messages.create(
    model="claude-opus-4-5",
    max_tokens=1024,
    messages=[{"role": "user", "content": f"Review this PR diff:\n\n{diff}"}],
    system=system
)

review = message.content[0].text
comment = f"## AI Code Review\n\n{review}\n\n*First-pass only — human review still required.*"

subprocess.run([
    "gh", "pr", "comment", os.environ["PR_NUMBER"],
    "--repo", os.environ["REPO"],
    "--body", comment
], check=True)  # gh inherits GH_TOKEN from the environment
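
The flat 12,000-character slice keeps the request within budget, but it can cut the diff mid-hunk. If that becomes a problem, a small helper (a sketch, not part of the script above) can truncate at the last complete file boundary instead:

```python
def truncate_diff(diff: str, limit: int = 12000) -> str:
    """Truncate a unified diff, preferring a file boundary over a mid-hunk cut."""
    if len(diff) <= limit:
        return diff
    # Each file's section in `git diff` output starts with "diff --git".
    cut = diff.rfind("\ndiff --git", 0, limit)
    # Fall back to a hard cut if the first file alone exceeds the limit.
    return diff[:cut] if cut > 0 else diff[:limit]
```

Swap `diff = f.read()[:12000]` for `diff = truncate_diff(f.read())` and Claude always sees whole files rather than a hunk chopped in half.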

Customising the System Prompt for IT Teams

Encode your team’s specific standards directly in the system prompt:

  • Flag any credentials not sourced from Key Vault or environment variables
  • Warn on -ErrorAction SilentlyContinue without a corresponding log statement
  • Alert when a script targets All users without a -WhatIf guard
  • Check that Bicep templates include resource locks on production scopes
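
Rules like these drop straight into the system string in claude_review.py. One convenient pattern (a sketch; the base wording is illustrative) is to keep them as a plain list so the team can maintain the rules separately from the boilerplate:

```python
# Team-specific review rules, maintained as a simple list.
TEAM_RULES = [
    "Flag any credentials not sourced from Key Vault or environment variables.",
    "Warn on -ErrorAction SilentlyContinue without a corresponding log statement.",
    "Alert when a script targets All users without a -WhatIf guard.",
    "Check that Bicep templates include resource locks on production scopes.",
]

BASE = (
    "You are a senior IT engineer reviewing infrastructure and automation code.\n"
    "Be concise. Use bullet points. Start with a one-line summary verdict.\n"
    "Team-specific standards:\n"
)

# Assemble the final system prompt passed to client.messages.create(...).
system = BASE + "\n".join(f"- {rule}" for rule in TEAM_RULES)
```

Adding a new standard is then a one-line change to TEAM_RULES rather than an edit to a long prompt string.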

Summary

Once deployed, this workflow runs silently on every PR. Your team gets faster, more consistent feedback, and human reviewers can focus on architecture and logic rather than catching the same missing error handling for the hundredth time.