
Document and cover missing-image alerting via version_checker_image_failures_total #436

Open
Copilot wants to merge 3 commits into main from copilot/feature-alert-for-missing-images


Conversation


Copilot AI commented May 9, 2026

A user asked whether version-checker can alert when a pod references an image/tag that no longer exists upstream. The behavior was already present in the metrics pipeline, but it was not clearly documented or covered by a focused regression test.

  • Clarify existing alerting behavior

    • Document that version_checker_image_failures_total increments when version-checker cannot resolve an upstream image version, including removed or unavailable tags.
    • Call out the relevant labels (namespace, pod, container, image) so alerts can route directly to the affected workload.
  • Add alerting examples

    • Add a Prometheus query example for detecting recent image lookup failures.
    • Add a PrometheusRule example showing how to alert on missing/unavailable upstream images.
  • Document operational prerequisites

    • Note that version-checker must actually be evaluating the target containers, either via --test-all-containers / versionChecker.testAllContainers=true or per-container opt-in annotations.
  • Add regression coverage

    • Add a controller-level test that simulates a "no version found" lookup and verifies that version_checker_image_failures_total is incremented for the affected container.

Example alert expression:

increase(version_checker_image_failures_total[15m]) > 0

Agent-Logs-Url: https://github.com/jetstack/version-checker/sessions/1c6d20ae-3be6-4d6d-93bb-8b410348b11c

Co-authored-by: davidcollom <1504448+davidcollom@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot May 9, 2026 14:56
Agent-Logs-Url: https://github.com/jetstack/version-checker/sessions/1c6d20ae-3be6-4d6d-93bb-8b410348b11c

Co-authored-by: davidcollom <1504448+davidcollom@users.noreply.github.com>
Copilot AI requested review from Copilot and removed request for Copilot May 9, 2026 14:58
Copilot AI changed the title [WIP] Add alert for missing images in version checker Document and cover missing-image alerting via version_checker_image_failures_total May 9, 2026
Copilot AI requested a review from davidcollom May 9, 2026 14:59
@davidcollom davidcollom marked this pull request as ready for review May 9, 2026 15:26
@davidcollom davidcollom requested a review from maria-reynoso as a code owner May 9, 2026 15:26
Copilot AI review requested due to automatic review settings May 9, 2026 15:26

Copilot AI left a comment


Pull request overview

This PR clarifies and exemplifies how version-checker surfaces upstream image lookup failures via the version_checker_image_failures_total Prometheus counter, and adds a focused controller-level regression test to ensure the counter increments when an upstream version cannot be resolved.

Changes:

  • Document version_checker_image_failures_total semantics and label set, and add a PromQL query example.
  • Add a Prometheus Operator PrometheusRule example for alerting on recent image lookup failures.
  • Add a controller test that simulates a “no version found” lookup and asserts the failure counter is incremented with the expected labels.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| pkg/controller/pod_sync_test.go | Adds a regression test that simulates a no-version-found error and asserts version_checker_image_failures_total is incremented with workload labels. |
| docs/metrics.md | Expands metric documentation and adds query + alerting examples for missing/unavailable upstream images. |


Comment thread docs/metrics.md
otherwise unavailable in the registry.
```

To make this alert effective, ensure version-checker is actually checking the containers you care about, either by enabling `--test-all-containers` / `versionChecker.testAllContainers=true` or by opting specific containers in with `version-checker.jetstack.io/enabled`.
Comment thread docs/metrics.md
rules:
  - alert: VersionCheckerImageLookupFailures
    expr: increase(version_checker_image_failures_total[15m]) > 0
    for: 15m
Comment thread pkg/controller/pod_sync_test.go
kubeClient := fake.NewClientBuilder().WithObjects(pod).Build()
metrics := metrics.New(log, reg, kubeClient)
checker := checker.New(
	fakesearch.New().With(nil, versionerrors.NewVersionErrorNotFound("%s", fmt.Sprintf("no tags found for given image URL: %q", "docker.io/example/missing"))),
Comment on lines +198 to +200
builder := options.New(map[string]string{
	"version-checker.jetstack.io/enabled": "true",
})


Development

Successfully merging this pull request may close these issues.

[FEATURE] Question - is it possible to use version-checker to alert for missing images

3 participants