The purpose of Continuous Integration is to fail
February 5, 2026 - 10 min read
CI is only valuable when it fails. When it passes, it's just overhead: the same outcome you'd get without CI.
What is Continuous Integration?
Software development follows a cyclical iterative pattern. Developers make changes, commit them to version control, deploy them to users, and repeat. Continuous integration (CI) sits between committing and deploying, running automated checks for every commit. If the checks pass, we say "CI passed", and the change can be deployed. If the checks fail, we say "CI failed", and the change is blocked from deployment.
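To make this concrete, here is a minimal sketch of what a CI pipeline boils down to: run every check, and block deployment on the first non-zero exit code. The specific commands (ruff, pytest) are placeholders; substitute whatever checks your project uses.

```python
import subprocess
import sys

# Placeholder checks; substitute your project's own commands.
CHECKS = [
    ["ruff", "format", "--check", "."],  # formatting
    ["ruff", "check", "."],              # linting
    ["pytest", "-q"],                    # tests
]

def main() -> int:
    for cmd in CHECKS:
        print(f"Running: {' '.join(cmd)}")
        if subprocess.run(cmd).returncode != 0:
            print("CI failed: the change is blocked from deployment.")
            return 1
    print("CI passed: the change can be deployed.")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```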
If you're an experienced developer, you're probably thinking "Duh!". To really understand the purpose of CI, we have to look at what happens with and without CI.
The Feedback Loop
I hear "you could also just not be stupid" a lot, but realistically we developers will make mistakes, and the more productive we are, the more mistakes we make.
What happens when we make mistakes? Consequences can range from "the code is now misformatted" to "payments don't work and we are losing millions per hour".
Without CI, our only chance to catch mistakes is after deployment, when users or teammates encounter them. At that point we roll back to a previous version, fix the problem, and try again.
| No mistakes | A mistake |
|---|---|
| Work | Work |
| Commit | Commit |
| Deploy | Deploy |
| | Error occurs ✗ |
| | Error is noticed |
| | Rollback |
Note that the mistake only becomes apparent after deployment, and may be noticed an arbitrarily long time after it has caused damage, if it is noticed at all. This feedback loop is long, manual, and dangerous.
Catching Problems Early
No check can catch every mistake, but checks can certainly catch some of them, and as it turns out, that's already valuable.
"Program testing can be used to show the presence of bugs, but never to show their absence!" ― Edsger W. Dijkstra
Indeed, any mistake caught by CI is one less mistake that reaches production.
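As a concrete (hypothetical) example, suppose a developer introduces the kind of payment bug mentioned earlier. A single unit test running in CI is enough to block the deploy; the function name and the bug are invented for illustration.

```python
def total_cents(price_cents: int, quantity: int) -> int:
    # Mistake: should multiply, not add.
    return price_cents + quantity

def test_total_cents():
    # CI runs this test, it fails, and the change never reaches production.
    assert total_cents(price_cents=500, quantity=3) == 1500
```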
Let's see what happens when CI fails because of a mistake:
| No mistakes | A mistake |
|---|---|
| Work | Work |
| Commit | Commit |
| CI runs | CI runs |
| CI passes | CI fails ✓ |
| Deploy | |
In this case, the process is interrupted (and restarted) before deployment. This makes the feedback loop shorter, more automated, and less dangerous.
Remember: this only helps when CI does in fact catch the mistake. That limitation is fundamental, and when CI cannot catch the mistake, the process falls back to the no-CI scenario above.
In practice you'll probably want more rigorous checks than you think, but there is certainly such a thing as "too much CI" as well.
CI as a Safety Net
If we compare the "mistake" cases with and without CI side-by-side, we can see how CI changes the outcome:
| Without CI | With CI |
|---|---|
| Work | Work |
| Commit | Commit |
| Deploy | CI runs |
| Error occurs ✗ | CI fails ✓ |
| Error is noticed | |
| Rollback | |
Here we clearly see the value of CI: it prevents a bad outcome (the error occurring) by catching the mistake early.
Too much CI
If CI is good, then more CI is better, right? No, not quite. To understand why, we have to look at what happens when no mistakes are made:
| With CI | Without CI |
|---|---|
| Work | Work |
| Commit | Commit |
| CI runs | Deploy |
| CI passes | |
| Deploy | |
Note that the end result is the same in both cases: the change is deployed successfully. The only difference is that in the "with CI" case, we had to wait for CI to run and pass before we could deploy.
This means that in the "no mistake" case, CI is just an extra step that adds friction and slows us down, without providing any value.
Faulty CI
The whole reason we use CI in the first place is that we expect developers to make mistakes, so we can't then assume that they won't make mistakes when setting up CI. Nor can we assume that the developers who built the CI system itself are infallible.
One dreaded and very common situation is a failing CI run that can be made to pass simply by re-running it. We call this flaky CI.
Flaky CI is nasty because it means that a CI failure no longer reliably indicates that a mistake was caught. And it is doubly nasty because, in principle, it is unfixable; sometimes machines just explode.
Luckily, flakiness can be detected: whenever a CI run fails, we can re-run it. If it passes the second time, we are sure it was flaky. If it fails the second time, it may have caught a real mistake (but it could also just have been flaky again).
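Here is a sketch of that detection strategy, assuming a check is any command whose exit code signals success or failure:

```python
import subprocess

def classify_failure(cmd: list[str], retries: int = 1) -> str:
    """Re-run a failing check to distinguish flaky failures from real ones."""
    if subprocess.run(cmd).returncode == 0:
        return "passed"
    for _ in range(retries):
        if subprocess.run(cmd).returncode == 0:
            return "flaky"  # same commit, different outcome: definitely flaky
    return "probably real"  # repeated failures could still be flaky

# Example (hypothetical check): classify_failure(["pytest", "-q"])
```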
Faulty CI is a Real and Important problem that I enjoy solving, but it is outside the scope of this article.
The value of CI
Here are the four scenarios again:
| With CI, no mistakes | With CI, a mistake | Without CI, no mistakes | Without CI, a mistake |
|---|---|---|---|
| Work | Work | Work | Work |
| Commit | Commit | Commit | Commit |
| CI runs | CI runs | Deploy | Deploy |
| CI passes | CI fails ✓ | | Error occurs ✗ |
| Deploy | | | Error is noticed |
| | | | Rollback |
Note that in the "no mistake" cases, CI passing or not existing makes no difference to the outcome. The difference is only in the "mistake" cases, where CI failing prevents a bad outcome. This means that the only valuable outcome of CI is when it fails.
What "Failure" Means
It's unfortunate that we use the word "failure" to describe the valuable outcome of CI, because it makes it sound like a bad thing. The colours typically used to represent CI outcomes are also a bit backwards. This is what it usually looks like:
Even worse, the valuable "Failure" outcome is represented with the same icon and colour as the worst outcome: "Flaky".
Instead, I propose we use icons like this:
Or maybe even with a bit more emoji so we definitely know how to feel about each outcome:
It's probably too late to make this change, and red meaning "action required" is well established, but I hope this reframing helps you see CI failures in a new light.
Conclusion
CI's value comes from failing, not from passing. Flakiness undermines that value.
In all the diagrams so far, "Work" and "Commit" have come before "CI runs". In the next blog post we'll discuss how to optimise that further by introducing local-first CI.