Severities and categories

What ADO Pilot's severities and categories mean, and how findings roll up to a PASS, ADVISORY, or FAIL verdict.

Last updated 2026-05-05

ADO Pilot tags every finding with one severity (how urgent) and one category (what domain). This page explains both axes and the rule that turns the set of findings into a single verdict on your PR.

The reference table

The full severity/category/verdict matrix lives in a shared partial so it stays consistent across docs:

Severity levels

Severity	Icon	Meaning	Blocks merge?
Blocker	red circle	Must fix before merge. Bugs that produce wrong behavior, security vulnerabilities, data loss, or breaking API changes.	Yes, when the `ai-pr-review` status is a required branch policy.
Warning	warning sign	Should fix before merge. Live-path performance regressions, error-handling gaps, design issues that compound.	No.
Suggestion	lightbulb	Consider fixing. Improves maintainability, readability, or test coverage.	No.

Category	What it covers
Bug	Logic errors, off-by-one, null dereference, race conditions, type mismatches.
Security	Injection, XSS, hardcoded credentials, weak crypto, missing auth, path traversal.
Performance	N+1 queries, unbounded loops, redundant work in production paths.
Error-handling	Swallowed exceptions, missing propagation, unhandled rejections, missing boundary validation.
Maintainability	Duplication, deep nesting, misleading names, convention violations.
Testing	Missing tests for new code, assertions that don't verify behavior, brittle setup.

Verdict roll-up

ADO Pilot computes a single verdict per review from the findings:

Findings present	Verdict
Any blocker	FAIL
At least one warning, no blocker	ADVISORY
Suggestions only, or no findings	PASS

The verdict drives the ai-pr-review status check on your PR. PASS and ADVISORY both report Succeeded; FAIL reports Failed. Configure the status as a required branch policy if you want FAIL to block merge.
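The mapping from verdict to status state can be sketched as a small lookup. This is an illustration only: the names `VERDICT_TO_STATUS` and `status_for` are hypothetical and are not part of ADO Pilot's API. Only the verdict and state values come from this page.

```python
# Hypothetical sketch: verdict -> state reported on the ai-pr-review status.
# The names below are illustrative assumptions, not ADO Pilot's real API.
VERDICT_TO_STATUS = {
    "PASS": "Succeeded",
    "ADVISORY": "Succeeded",  # warnings alone do not fail the status
    "FAIL": "Failed",         # blocks merge only if the status is a required policy
}


def status_for(verdict: str) -> str:
    """Return the status state for a verdict; reject unknown verdicts."""
    try:
        return VERDICT_TO_STATUS[verdict]
    except KeyError:
        raise ValueError(f"unknown verdict: {verdict!r}") from None
```

Note that ADVISORY and PASS are indistinguishable at the status level; the difference is visible only in the review itself.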

The rest of this page goes deeper on each axis.

Severities in practice

Severity answers "how urgent is this?" and is independent of the technical domain.

Blocker is reserved for findings that produce wrong behavior, security holes, data loss, or breaking API changes. The bar is "if this ships, something concrete goes wrong." Pass 2 is conservative about labelling something a blocker; if it does, treat it as a real defect.
Warning covers findings that you should fix before merge but that do not, on their own, produce a bug today. Examples: a swallowed exception that masks future failures, an N+1 query that will degrade as data grows, a missing validation at a trust boundary.
Suggestion covers improvements that are nice-to-have. They never block merge, and a review that contains only suggestions is still a PASS. Examples: extracting a duplicated helper, renaming a misleading variable, adding a missing unit test.

Categories in practice

Category answers "what kind of issue is this?" — it helps you route the finding to the right reviewer or tooling.

Bug. Logic errors that produce wrong output for some input. Off-by-one, null dereference, inverted condition, race.
Security. Trust-boundary violations. Injection, XSS, insecure crypto, hardcoded secrets, missing authentication or authorization, path traversal.
Performance. Resource use in production paths. N+1 queries, unbounded loops, redundant work in hot loops, missing indices.
Error-handling. Resilience and observability. Swallowed exceptions, missing propagation, unhandled rejections, missing validation.
Maintainability. Future-proofing. Duplication, deep nesting, misleading names, convention violations, tangles that hide defects.
Testing. Coverage and assertion quality. New code without tests, tests that cannot fail, brittle setup.

Severity and category combine freely. You can have a blocker · maintainability (a tangled conditional that hides a real defect), a warning · testing (the new tests pass but do not actually exercise the new branch), or a suggestion · security (a comment leaks an internal URL).
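One way to picture the two axes is as a pair of independent enumerations attached to each finding. The sketch below is a hypothetical data model, not ADO Pilot's actual schema: the class names, field names, and `label` helper are assumptions made for illustration. Only the severity and category values come from the tables above.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    BLOCKER = "blocker"
    WARNING = "warning"
    SUGGESTION = "suggestion"


class Category(Enum):
    BUG = "bug"
    SECURITY = "security"
    PERFORMANCE = "performance"
    ERROR_HANDLING = "error-handling"
    MAINTAINABILITY = "maintainability"
    TESTING = "testing"


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding record: exactly one severity and one category."""
    severity: Severity
    category: Category
    message: str

    def label(self) -> str:
        # Render the "severity · category" pair used in the prose above.
        return f"{self.severity.value} · {self.category.value}"


# The three combinations mentioned in the text.
examples = [
    Finding(Severity.BLOCKER, Category.MAINTAINABILITY,
            "tangled conditional hides a real defect"),
    Finding(Severity.WARNING, Category.TESTING,
            "new tests do not exercise the new branch"),
    Finding(Severity.SUGGESTION, Category.SECURITY,
            "comment leaks an internal URL"),
]
```

Because neither field constrains the other, all 3 × 6 = 18 severity and category pairs are valid.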

How findings roll up to a verdict

The verdict on the tracking comment is computed from the findings. The rule is short:

if any finding is a blocker       → FAIL
else if any finding is a warning  → ADVISORY
else                              → PASS

Three consequences worth internalizing:

A single blocker forces FAIL. It does not matter how many warnings or suggestions are in the same review.
ADVISORY means "at least one warning, no blockers." Suggestions may also be present. If you see only warnings and suggestions, you can merge, but the warnings are worth reading.
PASS includes the case "suggestions only." A review with five suggestions and no warnings or blockers is still a PASS.
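The rule and its three consequences can be checked with a few lines of code. This is a minimal sketch, assuming severities arrive as lowercase strings; the function name `roll_up` is hypothetical and does not reflect how ADO Pilot implements the rule internally.

```python
from typing import Iterable


def roll_up(severities: Iterable[str]) -> str:
    """Hypothetical roll-up: reduce a review's severities to one verdict."""
    seen = set(severities)
    if "blocker" in seen:
        return "FAIL"      # a single blocker is enough
    if "warning" in seen:
        return "ADVISORY"  # at least one warning, no blockers
    return "PASS"          # suggestions only, or no findings at all


# One blocker among many warnings and suggestions still fails.
assert roll_up(["warning", "warning", "suggestion", "blocker"]) == "FAIL"
# Warnings plus suggestions, no blocker.
assert roll_up(["warning", "suggestion"]) == "ADVISORY"
# Five suggestions and nothing else.
assert roll_up(["suggestion"] * 5) == "PASS"
# No findings.
assert roll_up([]) == "PASS"
```

The count of findings never matters; only the highest severity present does.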

For how this maps to merge gating, see Anatomy of a review comment.

What ADO Pilot does not flag

The reviewer is intentionally quiet about a few things so the noise floor stays low:

Pre-existing issues in unchanged code.
Mechanical style — indentation, spacing, brace placement. Use a formatter.
Issues a linter already reports, unless they indicate a real bug.
Subjective preferences (const vs let, early-return vs nested branches) when both forms are correct.
Performance concerns in test fixtures or rarely-called admin paths.
TODOs and FIXMEs that the author already noted in a comment.

For the harder boundaries (file size caps, exclusions, closed PRs), see What is not reviewed.