The purpose of Continuous Integration is to fail
February 5, 2026 - 10 min read
CI is only valuable when it fails. When it passes, it's just overhead: the same outcome you'd get without CI.
What is Continuous Integration?
Software development follows a cyclical iterative pattern. Developers make changes, commit them to version control, deploy them to users, and repeat. Continuous integration (CI) sits between committing and deploying, running automated checks for every commit. If the checks pass, we say "CI passed", and the change can be deployed. If the checks fail, we say "CI failed", and the change is blocked from deployment.
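To make this concrete, here is a minimal sketch of what a CI pipeline boils down to: run every check, and block deployment on the first non-zero exit code. The specific commands (ruff, pytest) are placeholders; substitute whatever checks your project uses.

```python
import subprocess
import sys

# Placeholder checks; substitute your project's own commands.
CHECKS = [
    ["ruff", "format", "--check", "."],  # formatting
    ["ruff", "check", "."],              # linting
    ["pytest", "-q"],                    # tests
]

def main() -> int:
    for cmd in CHECKS:
        print(f"Running: {' '.join(cmd)}")
        if subprocess.run(cmd).returncode != 0:
            print("CI failed: the change is blocked from deployment.")
            return 1
    print("CI passed: the change can be deployed.")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```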
If you're an experienced developer, you're probably thinking "Duh!". To really understand the purpose of CI, we have to look at what happens with and without CI.
The Feedback Loop
I hear "you could also just not be stupid" a lot, but realistically we developers will make mistakes, and the more productive we are, the more mistakes we make.
What happens when we make mistakes? Consequences can range from "the code is now misformatted" to "payments don't work and we are losing millions per hour".
Without CI, our only chance to catch mistakes is after deployment, when users or teammates encounter them. At that point we roll back to a previous version, fix the problem, and try again.
| No mistakes | A mistake |
|---|---|
| Work | Work |
| Commit | Commit |
| Deploy | Deploy |
| | Error occurs ✗ |
| | Error is noticed |
| | Rollback |
Note that the mistake only becomes apparent after deployment, and may be noticed an arbitrarily long time after it has caused damage, if it is noticed at all. This feedback loop is long, manual, and dangerous.
Catching Problems Early
No check can catch every mistake, but checks can certainly catch some of them, and as it turns out, that's already valuable.
"Program testing can be used to show the presence of bugs, but never to show their absence!" ― Edsger W. Dijkstra
Indeed, any mistake caught by CI is one less mistake that reaches production.
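As a concrete (hypothetical) example, suppose a developer introduces the kind of payment bug mentioned earlier. A single unit test running in CI is enough to block the deploy; the function name and the bug are invented for illustration.

```python
def total_cents(price_cents: int, quantity: int) -> int:
    # Mistake: should multiply, not add.
    return price_cents + quantity

def test_total_cents():
    # CI runs this test, it fails, and the change never reaches production.
    assert total_cents(price_cents=500, quantity=3) == 1500
```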
Let's see what happens when CI fails because of a mistake:
| No mistakes | A mistake |
|---|---|
| Work | Work |
| Commit | Commit |
| CI runs | CI runs |
| CI passes | CI fails ✓ |
| Deploy | |
In this case, the process is interrupted (and restarted) before deployment. This makes the feedback loop shorter, more automated, and less dangerous.
Remember: this only helps when CI does in fact catch the mistake. That limitation is fundamental, and when CI cannot catch the mistake, the process falls back to the no-CI scenario above.
In practice you'll probably want more rigorous checks than you think, but there is certainly such a thing as "too much CI" as well.
CI as a Safety Net
If we compare the "mistake" cases with and without CI side-by-side, we can see how CI changes the outcome:
| Without CI | With CI |
|---|---|
| Work | Work |
| Commit | Commit |
| Deploy | CI runs |
| Error occurs ✗ | CI fails ✓ |
| Error is noticed | |
| Rollback | |
Here we clearly see the value of CI: it prevents a bad outcome (the error occurring) by catching the mistake early.
Too much CI
If CI is good, then more CI is better, right? No, not quite. To understand why, we have to look at what happens when no mistakes are made:
| With CI | Without CI |
|---|---|
| Work | Work |
| Commit | Commit |
| CI runs | Deploy |
| CI passes | |
| Deploy | |
Note that the end result is the same in both cases: the change is deployed successfully. The only difference is that in the "with CI" case, we had to wait for CI to run and pass before we could deploy.
This means that in the "no mistake" case, CI is just an extra step that adds friction and slows us down, without providing any value.
Faulty CI
The whole reason we use CI in the first place is that we expect developers to make mistakes, so we can't then assume that they won't make mistakes when setting up CI. Nor can we assume that the developers who built the CI system itself are infallible.
One dreaded and very common situation is a failing CI run that can be made to pass simply by re-running it. We call this flaky CI.
Flaky CI is nasty because it means that a CI failure no longer reliably indicates that a mistake was caught. And it is doubly nasty because, in principle, it is unfixable; sometimes machines just explode.
Luckily, flakiness can be detected: whenever a CI run fails, we can re-run it. If it passes the second time, we are sure it was flaky. If it fails the second time, it may have caught a real mistake (but it could also just have been flaky again).
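Here is a sketch of that detection strategy, assuming a check is any command whose exit code signals success or failure:

```python
import subprocess

def classify_failure(cmd: list[str], retries: int = 1) -> str:
    """Re-run a failing check to distinguish flaky failures from real ones."""
    if subprocess.run(cmd).returncode == 0:
        return "passed"
    for _ in range(retries):
        if subprocess.run(cmd).returncode == 0:
            return "flaky"  # same commit, different outcome: definitely flaky
    return "probably real"  # repeated failures could still be flaky

# Example (hypothetical check): classify_failure(["pytest", "-q"])
```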
Faulty CI is a Real and Important problem that I enjoy solving, but it is outside the scope of this article.
The value of CI
Here are the four scenarios again:
| With CI, no mistakes | With CI, a mistake | Without CI, no mistakes | Without CI, a mistake |
|---|---|---|---|
| Work | Work | Work | Work |
| Commit | Commit | Commit | Commit |
| CI runs | CI runs | Deploy | Deploy |
| CI passes | CI fails ✓ | | Error occurs ✗ |
| Deploy | | | Error is noticed |
| | | | Rollback |
Note that in the "no mistake" cases, CI passing or not existing makes no difference to the outcome. The difference is only in the "mistake" cases, where CI failing prevents a bad outcome. This means that the only valuable outcome of CI is when it fails.
What "Failure" Means
It's unfortunate that we use the word "failure" to describe the valuable outcome of CI, because it makes it sound like a bad thing. The colours typically used to represent CI outcomes are also a bit backwards. This is what it usually looks like:
Even worse, the valuable "Failure" outcome is represented with the same icon and colour as the worst outcome: "Flaky".
Instead, I propose we use icons like this:
Or maybe even with a bit more emoji so we definitely know how to feel about each outcome:
It's probably too late to make this change, and red meaning "action required" is well established, but I hope this reframing helps you see CI failures in a new light.
Conclusion
CI's value comes from failing, not from passing. Flakiness undermines that value.
In all the diagrams so far, "Work" and "Commit" have come before "CI runs". In the next blog post we'll discuss how to optimise that further by introducing local-first CI.