Unit, acceptance, integration — and why the shape matters

Tests come in categories with different speed and cost. The cheapest tests should catch the most bugs. When the shape inverts (many slow tests, few fast ones), every defect takes hours to find and fix. This inverted shape is commonly called the ice-cream cone, and it's where most teams end up by accident.

The deployment pipeline runs tests from fastest and cheapest to slowest and most expensive. Any error should be caught with the cheapest test possible — if an acceptance test catches something a unit test could have, write the unit test next.
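A minimal sketch of that ordering, assuming each stage can be reduced to a callable that reports pass or fail. The `Stage` type, the `run_pipeline` function, and the durations are hypothetical names and numbers chosen for illustration, not part of any particular CI tool.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Stage:
    name: str
    typical_seconds: float   # rough cost estimate, used only for ordering
    run: Callable[[], bool]  # returns True when the stage passes


def run_pipeline(stages: List[Stage]) -> Optional[str]:
    """Run stages cheapest-first and stop at the first failure.

    Returns the name of the failing stage, or None if everything passed.
    """
    for stage in sorted(stages, key=lambda s: s.typical_seconds):
        if not stage.run():
            return stage.name
    return None


executed: List[str] = []


def make_stage(name: str, seconds: float, passes: bool) -> Stage:
    # Illustrative stand-in for a real test run: records that it ran.
    def run() -> bool:
        executed.append(name)
        return passes
    return Stage(name, seconds, run)


stages = [
    make_stage("integration", 3600, True),
    make_stage("unit", 5, True),
    make_stage("acceptance", 600, False),
]
failed = run_pipeline(stages)
print(failed, executed)  # acceptance ['unit', 'acceptance']
```

In this run the failing acceptance stage stops the pipeline before the hour-long integration stage starts, which is the point of ordering by cost.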

// the three layers · fastest to slowest

Unit tests

milliseconds · the foundation

Test that a single class or function does what the programmer intended, with the database stubbed out. Run them on every commit, in parallel, so they finish in seconds. They should make up the bulk of the test suite.
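As an illustration of a unit test with the database stubbed out, the sketch below assumes a hypothetical `order_total` function that reads prices through a small `PriceStore` interface. All names here are invented for the example; prices are integer cents to keep the arithmetic exact.

```python
from typing import Dict, Protocol


class PriceStore(Protocol):
    """What the code under test needs from the database (hypothetical)."""
    def price_of(self, sku: str) -> int: ...


class InMemoryPriceStore:
    """Stub that stands in for the real database during unit tests."""
    def __init__(self, prices_in_cents: Dict[str, int]) -> None:
        self._prices = prices_in_cents

    def price_of(self, sku: str) -> int:
        return self._prices[sku]


def order_total(store: PriceStore, quantities: Dict[str, int]) -> int:
    """Code under test: total price in cents for the given quantities."""
    return sum(store.price_of(sku) * qty for sku, qty in quantities.items())


def test_order_total_multiplies_price_by_quantity() -> None:
    store = InMemoryPriceStore({"apple": 50, "pear": 200})
    assert order_total(store, {"apple": 4, "pear": 1}) == 400


def test_empty_order_totals_zero() -> None:
    assert order_total(InMemoryPriceStore({}), {}) == 0
```

Because nothing here touches a network or disk, tests in this style run in milliseconds and can be collected by a runner such as pytest or called directly.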

Acceptance tests

seconds-minutes · what the customer meant

Test the application as a whole: business acceptance criteria, API correctness, regression coverage. As Humble and Farley put it, the aim is to "prove our application does what the customer meant it to, not that it works the way programmers think it should."

Integration tests

minutes-hours · the real world

Run against real or production-like downstream services rather than stubbed-out interfaces. These tests are brittle, expensive, and slow, so minimize the count: catch as much as you can in unit and acceptance tests first. Where calling the real remote service is impractical, use a virtualized or contract-tested stand-in.
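One way to keep a virtualized service honest is a contract check: the consumer writes down the fields it relies on, and the same check runs against the stand-in (and, ideally, against the real provider in its own pipeline). The sketch below is a hand-rolled illustration under that assumption; `INVENTORY_CONTRACT`, `contract_violations`, and `virtual_inventory_service` are hypothetical names, not the API of any contract-testing tool.

```python
from typing import Any, Dict, List

# Hypothetical contract: the fields (and types) our code reads
# from a remote inventory service.
INVENTORY_CONTRACT: Dict[str, type] = {"sku": str, "in_stock": int}


def contract_violations(response: Dict[str, Any],
                        contract: Dict[str, type]) -> List[str]:
    """List every way the response fails to satisfy the contract."""
    problems: List[str] = []
    for field, expected in contract.items():
        if field not in response:
            problems.append(f"missing field: {field}")
        elif not isinstance(response[field], expected):
            actual = type(response[field]).__name__
            problems.append(
                f"{field}: expected {expected.__name__}, got {actual}"
            )
    return problems


def virtual_inventory_service(sku: str) -> Dict[str, Any]:
    """Virtualized stand-in returning a canned response.

    Extra fields are allowed; the contract only lists what we depend on.
    """
    return {"sku": sku, "in_stock": 3, "warehouse": "A"}


# The stand-in satisfies the contract.
print(contract_violations(virtual_inventory_service("apple"), INVENTORY_CONTRACT))  # []

# A drifted response (stock sent as a string) is reported.
drifted = {"sku": "apple", "in_stock": "3"}
print(contract_violations(drifted, INVENTORY_CONTRACT))
# ['in_stock: expected int, got str']
```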

// fowler's 10-minute build

Martin Fowler's heuristic: a 10-minute commit-stage build is perfectly within reason for most projects. The commit stage compiles the code and runs localized unit tests with the database stubbed out. Bugs that only show up against a real database or in end-to-end interaction are caught in the slower acceptance stage, which can take a couple of hours.

The split is intentional: commit-stage is the fast loop for engineers; acceptance is the thorough check before promotion.
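A rough sketch of that split, assuming each test is tagged with whether it needs a real database or end-to-end behavior. The `SuiteEntry` type, the tags, and the timings are hypothetical; the budget check sums serial run time, so it is pessimistic for suites that run in parallel.

```python
from dataclasses import dataclass
from typing import List, Tuple

COMMIT_STAGE_BUDGET_SECONDS = 10 * 60  # the ten-minute guideline


@dataclass(frozen=True)
class SuiteEntry:
    name: str
    seconds: float
    needs_real_db: bool = False
    end_to_end: bool = False


def split_stages(
    tests: List[SuiteEntry],
) -> Tuple[List[SuiteEntry], List[SuiteEntry]]:
    """Return (commit_stage, acceptance_stage) test lists."""
    commit = [t for t in tests if not (t.needs_real_db or t.end_to_end)]
    acceptance = [t for t in tests if t.needs_real_db or t.end_to_end]
    return commit, acceptance


def within_budget(commit: List[SuiteEntry],
                  budget: float = COMMIT_STAGE_BUDGET_SECONDS) -> bool:
    """True when the commit stage fits the time budget (serial estimate)."""
    return sum(t.seconds for t in commit) <= budget


suite = [
    SuiteEntry("test_order_total", 0.01),
    SuiteEntry("test_tax_rounding", 0.02),
    SuiteEntry("test_checkout_persists_order", 45.0, needs_real_db=True),
    SuiteEntry("test_browse_to_purchase", 180.0, end_to_end=True),
]
commit, acceptance = split_stages(suite)
print([t.name for t in commit], within_budget(commit))
```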

// the pyramid vs. the ice-cream cone

// pyramid · the ideal

       ┌─────┐
       │ man │  ← exploratory, manual
      ┌┴─────┴┐
      │ integ │  ← few, slow, real services
     ┌┴───────┴┐
     │ accept  │  ← business correctness
    ┌┴─────────┴┐
    │   unit    │  ← thousands, in seconds
    └───────────┘

Bulk of confidence comes from fast unit tests. Acceptance adds business correctness. Integration is the smallest tier — only what truly cannot be tested cheaper.

// ice-cream cone · the failure

   ┌──────────────┐
   │ manual / e2e │  ← hours of human time
   └─────┬────────┘
        ┌┴─────┐
        │ integ│  ← brittle, slow, expensive
        └┬─────┘
         ┌─┐
         │u│  ← almost nothing
         └─┘

This is the accidental shape most teams end up with: bugs surface in slow tests, fixing them takes hours, and the pipeline is constantly red. Every speed advantage of the pyramid is inverted.
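A quick way to see which shape a suite has is to compare test counts per tier. The sketch below is a deliberately crude heuristic under the assumption that counts alone are a fair proxy; the function name and the example counts are illustrative.

```python
def suite_shape(unit: int, acceptance: int, integration: int) -> str:
    """Classify a suite by how its test counts are distributed."""
    if unit > acceptance > integration:
        return "pyramid"
    if integration > acceptance > unit:
        return "ice-cream cone"
    return "neither"


print(suite_shape(unit=5000, acceptance=300, integration=40))  # pyramid
print(suite_shape(unit=12, acceptance=40, integration=300))    # ice-cream cone
```

Counts say nothing about whether the tests catch real defects, so a "pyramid" result is necessary rather than sufficient.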

// coverage as a guardrail (not a target)

Under deadline pressure, devs stop writing unit tests. Measure coverage and surface it — you can fail the validation suite when coverage drops below a threshold (e.g. 80% of classes have unit tests). Coverage isn't a quality measure, but a sustained downward trend is a leading indicator that the pyramid is degrading.
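The guardrail can be sketched as two small checks: a gate on the share of classes that have at least one unit test, and a trend check over recent builds. This assumes a mapping from class name to unit-test count is already available from some tool; the function names, the three-build window, and the sample data are all hypothetical.

```python
from typing import Dict, List


def classes_with_tests(tests_per_class: Dict[str, int]) -> int:
    """Number of classes that have at least one unit test."""
    return sum(1 for count in tests_per_class.values() if count > 0)


def coverage_gate(tests_per_class: Dict[str, int],
                  threshold_percent: int = 80) -> bool:
    """True when enough classes have unit tests for the build to pass."""
    total = len(tests_per_class)
    if total == 0:
        return True  # assumption: nothing to gate on in an empty codebase
    # Integer comparison avoids floating-point edge cases at the threshold.
    return classes_with_tests(tests_per_class) * 100 >= threshold_percent * total


def sustained_decline(history: List[float], runs: int = 3) -> bool:
    """True when coverage fell on each of the last `runs` builds."""
    recent = history[-(runs + 1):]
    return len(recent) == runs + 1 and all(
        later < earlier for earlier, later in zip(recent, recent[1:])
    )


per_class = {"Cart": 12, "Checkout": 7, "Pricing": 3, "Tax": 5, "Shipping": 0}
print(coverage_gate(per_class))              # True: 4 of 5 classes is exactly 80%
print(sustained_decline([90, 88, 85, 81]))   # True: three drops in a row
```

The gate fails a single build; the trend check is the early warning, since coverage can slide for a long time before it crosses the threshold.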

"For a large retailer e-commerce site, we went from running 1,300 manual tests every ten days to running only ten automated tests upon every code commit — it's far better to run a few tests that we trust than to run tests that aren't reliable. Over time, we grew this test suite to hundreds of thousands of automated tests."

// Gary Gruver — VP of Quality Engineering / Release Engineering / Ops, Macys.com.

Knowledge Check

Question 1/2

A team has 200 integration tests that take 4 hours, 20 acceptance tests, and 15 unit tests. What's the most likely outcome?

Question 2/2

A team has 12,000 unit tests that catch nothing meaningful — the system breaks weekly in integration. What might be wrong?
