Building AI-Powered Code Review Workflows with Custom Prompts
Why AI Code Review Matters
Manual code review is a bottleneck. Senior developers spend 20-30% of their time reviewing pull requests, yet studies show human reviewers miss approximately 50% of bugs in code under review. AI-assisted code review doesn't replace humans — it augments them by catching mechanical issues so reviewers can focus on architecture and design decisions.
The key insight: the quality of AI code review is entirely determined by the prompt. A generic "review this code" instruction produces generic, surface-level feedback. A well-engineered prompt produces specific, actionable, priority-ranked issues that match your team's standards.
The Three-Layer Review Architecture
Production AI code review should operate in three distinct layers, each with a specialised prompt:
- Security Layer — Scans for vulnerabilities: injection attacks, auth bypasses, data exposure, insecure dependencies
- Quality Layer — Evaluates code quality: logic errors, edge cases, error handling, type safety, test coverage
- Style Layer — Enforces consistency: naming conventions, documentation, architectural patterns, team standards
Running these as separate prompts is more effective than a single "review everything" prompt because each layer has different evaluation criteria and severity scales.
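The three-pass structure above can be sketched as a small orchestration loop. Here, `review_with_prompt` is a hypothetical wrapper around whatever model API you use, and the prompt strings are abbreviated placeholders for the full prompts shown later:

```python
# Sketch: run each review layer as a separate prompt and merge the results.
# review_with_prompt is a hypothetical callable wrapping your model API;
# it takes (system_prompt, code) and returns a list of findings.

LAYER_PROMPTS = {
    "security": "You are a senior application security engineer...",
    "quality": "You are a principal software engineer...",
    "style": "You are a meticulous reviewer enforcing team standards...",
}

def review(code: str, review_with_prompt) -> dict:
    """Run all three layers independently and collect findings keyed by layer."""
    findings = {}
    for layer, system_prompt in LAYER_PROMPTS.items():
        findings[layer] = review_with_prompt(system_prompt, code)
    return findings
```

Keeping the layers independent also means a failure or timeout in one layer does not block the others.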
Security Review Prompt
System: You are a senior application security engineer performing a security-focused code review.
## Context
- Language: {language}
- Framework: {framework}
- This code handles: {description}
## Security Checklist
Evaluate the code against these categories:
1. INJECTION: SQL injection, XSS, command injection, LDAP injection, template injection
2. AUTHENTICATION: Broken auth flows, session management, credential handling
3. AUTHORISATION: Missing access controls, IDOR, privilege escalation
4. DATA EXPOSURE: Sensitive data in logs, hardcoded secrets, PII leakage
5. CRYPTOGRAPHY: Weak algorithms, improper key management, predictable tokens
6. INPUT VALIDATION: Missing sanitisation, type coercion, boundary checks
7. DEPENDENCIES: Known CVEs, outdated packages, supply chain risks
## Output Format
For each finding:
- SEVERITY: CRITICAL | HIGH | MEDIUM | LOW
- CWE: The relevant CWE identifier
- LOCATION: File and line number
- DESCRIPTION: What the vulnerability is
- EXPLOIT: How an attacker could exploit it
- FIX: The specific code change needed
If no security issues are found, state "No security issues identified" and explain what security measures are correctly implemented.
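The {language}, {framework}, and {description} placeholders in the prompt above can be filled with an ordinary string template. A minimal sketch, with the prompt text abbreviated and the variable names matching the placeholders:

```python
# Abbreviated security prompt; the placeholder names match the full prompt above.
SECURITY_PROMPT = (
    "You are a senior application security engineer performing a "
    "security-focused code review.\n"
    "## Context\n"
    "- Language: {language}\n"
    "- Framework: {framework}\n"
    "- This code handles: {description}\n"
)

def build_security_prompt(language: str, framework: str, description: str) -> str:
    """Fill the context placeholders before sending the prompt to the model."""
    return SECURITY_PROMPT.format(
        language=language, framework=framework, description=description
    )
```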
Quality Review Prompt
System: You are a principal software engineer reviewing code for production readiness.
## Review Criteria
1. CORRECTNESS: Logic errors, off-by-one errors, race conditions, null handling
2. EDGE CASES: Empty inputs, boundary values, concurrent access, network failures
3. ERROR HANDLING: Uncaught exceptions, error propagation, user-facing error messages
4. PERFORMANCE: N+1 queries, unnecessary re-renders, memory leaks, algorithmic complexity
5. TESTABILITY: Tight coupling, hidden dependencies, untestable side effects
6. MAINTAINABILITY: Complex conditionals, deep nesting, duplicate logic, magic numbers
## Constraints
- Focus on substantive issues, not nitpicks
- Every issue must include a concrete fix
- Rate each issue: MUST_FIX | SHOULD_FIX | CONSIDER
- If the code is well-written, say so and explain what makes it good
## Output
Provide your review as a structured list, ordered by severity.
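Because the prompt asks for MUST_FIX | SHOULD_FIX | CONSIDER ratings, ordering the output by severity in post-processing is trivial. A sketch, assuming each finding is parsed into a dict with a "rating" key:

```python
# Map each rating to a sort rank; unknown ratings sink to the bottom.
RATING_ORDER = {"MUST_FIX": 0, "SHOULD_FIX": 1, "CONSIDER": 2}

def sort_findings(findings: list) -> list:
    """Order parsed findings by severity rating."""
    return sorted(findings, key=lambda f: RATING_ORDER.get(f.get("rating"), 99))
```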
Integrating AI Review into CI/CD
The most effective pattern integrates AI review directly into your pull request workflow. Here's a production architecture:
# .github/workflows/ai-review.yml
name: AI Code Review
on:
  pull_request:
    types: [opened, synchronize]
jobs:
  ai-review:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - name: Get changed files
        id: changed
        run: |
          echo "files=$(git diff --name-only origin/main...HEAD | tr '\n' ' ')" >> "$GITHUB_OUTPUT"
      - name: Run AI Security Review
        run: |
          for file in ${{ steps.changed.outputs.files }}; do
            # Build the JSON body with jq so file contents are escaped safely
            curl -X POST https://your-api/review \
              -H "Authorization: Bearer ${{ secrets.AI_API_KEY }}" \
              -H "Content-Type: application/json" \
              -d "$(jq -n --arg file "$(cat "$file")" '{file: $file, layer: "security"}')"
          done
Handling False Positives
AI code reviewers produce false positives. Managing them is critical for developer trust:
- Calibrate severity thresholds — Start with CRITICAL and HIGH only; add lower severities once trust is established
- Provide context — Include the project's tech stack, coding standards, and known patterns in the prompt
- Use suppress comments — Allow developers to mark false positives with // ai-review-ignore: reason
- Track accuracy — Log accept/reject rates per issue category and use this data to refine your prompts
- Feedback loop — Feed dismissed issues back into the prompt as "do not flag" examples
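The suppress-comment mechanism needs only a few lines of post-processing: drop any finding whose flagged line carries the ignore marker. A sketch, assuming each finding records a 1-based "line" number:

```python
IGNORE_MARKER = "ai-review-ignore"

def filter_suppressed(findings: list, source_lines: list) -> list:
    """Drop findings whose flagged source line contains the ignore marker."""
    kept = []
    for f in findings:
        line_no = f.get("line", 0)
        text = source_lines[line_no - 1] if 0 < line_no <= len(source_lines) else ""
        if IGNORE_MARKER not in text:
            kept.append(f)
    return kept
```

Logging the suppressed findings alongside their reasons gives you the accept/reject data the next bullet point relies on.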
Diff-Based vs Full-File Review
A common mistake is sending entire files for review. For pull requests, diff-based review is superior:
- Token efficiency — You pay for input tokens. Sending only the diff can reduce costs by 80%+
- Focused feedback — The model focuses on what changed rather than re-reviewing existing code
- Context window — Large files may exceed the model's context window
However, include surrounding context (10-20 lines above and below each change) so the model understands the code's environment. The optimal format:
## Changed File: src/auth/login.ts
## Change Type: Modified
### Context (lines 45-85, changed lines marked with +/-)
async function handleLogin(req: Request) {
const { email, password } = req.body;
- const user = await db.query('SELECT * FROM users WHERE email = ' + email);
+ const user = await db.query('SELECT * FROM users WHERE email = $1', [email]);
if (!user) {
return res.status(401).json({ error: 'Invalid credentials' });
}
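git can produce this shape directly: the -U flag expands the context around each hunk. A sketch of collecting per-file diffs with ~15 context lines and rendering the header format shown above, using only the standard git CLI; the header function is illustrative:

```python
import subprocess

def diff_with_context(base: str = "origin/main", context: int = 15) -> dict:
    """Return {path: unified diff with extra context lines} for changed files."""
    files = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.split()
    return {
        path: subprocess.run(
            ["git", "diff", f"-U{context}", f"{base}...HEAD", "--", path],
            capture_output=True, text=True, check=True,
        ).stdout
        for path in files
    }

def format_changed_file(path: str, change_type: str, diff: str) -> str:
    """Render a diff in the header format shown above."""
    return (
        f"## Changed File: {path}\n"
        f"## Change Type: {change_type}\n"
        f"### Context (changed lines marked with +/-)\n"
        f"{diff}"
    )
```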
Multi-Model Review Strategy
Different models have different strengths for code review:
| Model | Best For | Weakness |
|---|---|---|
| GPT-4 | Security analysis, complex logic | Can be verbose; higher cost |
| Claude 3.5 Sonnet | Code quality, refactoring suggestions | May over-suggest abstractions |
| Gemini Pro | Documentation review, API consistency | Less reliable on security edge cases |
A production system can route different review layers to different models, optimising for both quality and cost.
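Layer-to-model routing can be a simple lookup table; the model identifiers below are illustrative, not exact API model strings:

```python
# Hypothetical layer-to-model routing table; identifiers are illustrative.
MODEL_FOR_LAYER = {
    "security": "gpt-4",
    "quality": "claude-3-5-sonnet",
    "style": "gemini-pro",
}

def pick_model(layer: str, default: str = "gpt-4") -> str:
    """Choose the model for a review layer, falling back to a default."""
    return MODEL_FOR_LAYER.get(layer, default)
```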
How AI Prompt Architect Helps
AI Prompt Architect provides pre-built code review prompt templates that are battle-tested across hundreds of repositories. Use the Generate workflow with "code review" as your task to get a structured review prompt tailored to your stack. The Refine workflow can then customise it with your team's specific coding standards and common pitfalls.
