Document and cover missing-image alerting via version_checker_image_failures_total#436
Open
Agent-Logs-Url: https://github.com/jetstack/version-checker/sessions/1c6d20ae-3be6-4d6d-93bb-8b410348b11c Co-authored-by: davidcollom <1504448+davidcollom@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add alert for missing images in version checker" to "Document and cover missing-image alerting via version_checker_image_failures_total" on May 9, 2026
davidcollom approved these changes on May 9, 2026
Pull request overview
This PR clarifies and exemplifies how version-checker surfaces upstream image lookup failures via the version_checker_image_failures_total Prometheus counter, and adds a focused controller-level regression test to ensure the counter increments when an upstream version cannot be resolved.
Changes:
- Document `version_checker_image_failures_total` semantics and label set, and add a PromQL query example.
- Add a Prometheus Operator `PrometheusRule` example for alerting on recent image lookup failures.
- Add a controller test that simulates a “no version found” lookup and asserts the failure counter is incremented with the expected labels.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| pkg/controller/pod_sync_test.go | Adds a regression test that simulates a no-version-found error and asserts version_checker_image_failures_total is incremented with workload labels. |
| docs/metrics.md | Expands metric documentation and adds query + alerting examples for missing/unavailable upstream images. |
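
As a rough illustration of the behavior the test and docs describe, the sketch below models a failure counter keyed by the workload labels named in this PR (`namespace`, `pod`, `container`, `image`). It is not the project's implementation: `failureCounter`, `syncContainer`, `lookupFunc`, and `errNotFound` are hypothetical names, and a plain map stands in for the real Prometheus counter.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// failureKey mirrors the label set named in this PR.
type failureKey struct {
	Namespace, Pod, Container, Image string
}

// failureCounter is a hypothetical stand-in for a labeled Prometheus counter.
type failureCounter struct {
	mu     sync.Mutex
	counts map[failureKey]uint64
}

func newFailureCounter() *failureCounter {
	return &failureCounter{counts: make(map[failureKey]uint64)}
}

// inc adds one failure for the given label combination.
func (c *failureCounter) inc(k failureKey) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.counts[k]++
}

// get returns the current failure count for the given label combination.
func (c *failureCounter) get(k failureKey) uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.counts[k]
}

// errNotFound is a hypothetical sentinel for "no version found upstream".
var errNotFound = errors.New("no tags found for given image URL")

// lookupFunc is a hypothetical upstream lookup returning the latest version.
type lookupFunc func(image string) (string, error)

// syncContainer records a failure only when the upstream lookup errors.
func syncContainer(c *failureCounter, lookup lookupFunc, k failureKey) error {
	if _, err := lookup(k.Image); err != nil {
		c.inc(k)
		return fmt.Errorf("resolving %q: %w", k.Image, err)
	}
	return nil
}

func main() {
	c := newFailureCounter()
	k := failureKey{"default", "web-0", "app", "docker.io/example/missing"}
	missing := func(string) (string, error) { return "", errNotFound }

	_ = syncContainer(c, missing, k)
	fmt.Println(c.get(k)) // count for the affected workload after one failed lookup
}
```

Keying the count by the full label set is what lets an alert point at one specific container rather than at the checker as a whole.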
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| otherwise unavailable in the registry. | ||
| ``` | ||
| To make this alert effective, ensure version-checker is actually checking the containers you care about, either by enabling `--test-all-containers` / `versionChecker.testAllContainers=true` or by opting specific containers in with `version-checker.jetstack.io/enabled`. |
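
The prerequisite above can be pictured with a small sketch: a container that is never checked never produces a failure sample, so the alert cannot fire for it. The helper `shouldCheck` is hypothetical, and the exact annotation format and opt-out rules in version-checker may differ from this simplification; only the annotation key and the flag name are taken from this PR.

```go
package main

import "fmt"

// enabledAnnotation is the opt-in key quoted in this PR.
const enabledAnnotation = "version-checker.jetstack.io/enabled"

// shouldCheck is a hypothetical helper: a container is checked when the
// global test-all setting is on, or when the pod opts in via annotation.
// Assumption: opt-out handling and per-container keys are ignored here.
func shouldCheck(testAll bool, annotations map[string]string) bool {
	if testAll {
		return true
	}
	return annotations[enabledAnnotation] == "true"
}

func main() {
	optedIn := map[string]string{enabledAnnotation: "true"}
	fmt.Println(shouldCheck(false, optedIn)) // opted in by annotation
	fmt.Println(shouldCheck(false, nil))     // neither flag nor annotation
}
```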
| rules: | ||
| - alert: VersionCheckerImageLookupFailures | ||
| expr: increase(version_checker_image_failures_total[15m]) > 0 | ||
| for: 15m |
| kubeClient := fake.NewClientBuilder().WithObjects(pod).Build() | ||
| metrics := metrics.New(log, reg, kubeClient) | ||
| checker := checker.New( | ||
| fakesearch.New().With(nil, versionerrors.NewVersionErrorNotFound("%s", fmt.Sprintf("no tags found for given image URL: %q", "docker.io/example/missing"))), |
Comment on lines +198 to +200
| builder := options.New(map[string]string{ | ||
| "version-checker.jetstack.io/enabled": "true", | ||
| }) |
A user asked whether version-checker can alert when a pod references an image/tag that no longer exists upstream. The behavior was already present in the metrics pipeline, but it was not clearly documented or covered by a focused regression test.
- Clarify existing alerting behavior
  - `version_checker_image_failures_total` increments when version-checker cannot resolve an upstream image version, including removed or unavailable tags.
  - The counter carries workload labels (`namespace`, `pod`, `container`, `image`) so alerts can route directly to the affected workload.
- Add alerting examples
  - Adds a `PrometheusRule` example showing how to alert on missing/unavailable upstream images.
- Document operational prerequisites
  - Alerting requires that the affected containers are actually checked, via `--test-all-containers` / `versionChecker.testAllContainers=true` or per-container opt-in annotations.
- Add regression coverage
  - Adds a controller test that simulates a `no version found` lookup and verifies `version_checker_image_failures_total` is incremented for the affected container.

Example alert expression: