OCPBUGS-86303: e2e/ote-ccm-aws: enhance tests to run hybrid in hypershift HC#464
OCPBUGS-86303: e2e/ote-ccm-aws: enhance tests to run hybrid in hypershift HC#464mtulio wants to merge 2 commits into
Conversation
|
@mtulio: This pull request references Jira Issue OCPBUGS-86303, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (mrbraga@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Skipping CI for Draft Pull Request. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughThis PR adds topology-aware cloud-config retrieval for HyperShift, AWS region validation and regional endpoint configuration, centralized common test helpers (including a management-cluster skip toggle), updates NLB tests to use those helpers, promotes an openshift/api dependency, and adds OTE runner documentation. ChangesAWS CCM E2E Test Refactoring with HyperShift Support
Sequence Diagram(s)sequenceDiagram
participant TestBinary
participant common
participant loadAWSConfig
participant InfrastructureAPI
participant AWS_SDK
TestBinary->>common: GetCloudConfig(ctx, cs)
common->>InfrastructureAPI: IsExternalTopology / Infrastructure query
alt external topology
common->>ManagementCluster: fetch aws-cloud-config using HYPERSHIFT_MANAGEMENT_CLUSTER_KUBECONFIG
else standalone
common->>KubeAPI: fetch cloud-conf ConfigMap
end
TestBinary->>loadAWSConfig: loadAWSConfig(ctx)
loadAWSConfig->>InfrastructureAPI: getRegionFromInfrastructure (when needed)
loadAWSConfig->>AWS_SDK: return cfg (with optional BaseEndpoint)
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 9 | ❌ 3❌ Failed checks (3 warnings)
✅ Passed checks (9 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
/payload-job periodic-ci-openshift-hypershift-release-5.0-e2e-aws-ovn-conformance-ccm-techpreview |
|
@mtulio: trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command |
|
@mtulio: This pull request references Jira Issue OCPBUGS-86303, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (mrbraga@redhat.com), skipping review request. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 3
♻️ Duplicate comments (1)
openshift-tests/ccm-aws-tests/e2e/aws/helper.go (1)
128-130:⚠️ Potential issue | 🟠 Major | ⚡ Quick winSame BaseEndpoint conditional logic issue as ELBv2 client.
The condition
cfg.Region != ""will always evaluate to true after successfulloadAWSConfig, resulting in unconditional BaseEndpoint forcing. See the comment on lines 50-52 for the same issue increateAWSClientLoadBalancer.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@openshift-tests/ccm-aws-tests/e2e/aws/helper.go` around lines 128 - 130, The code unconditionally overwrites o.BaseEndpoint because cfg.Region is always non-empty after loadAWSConfig; change the condition to only set o.BaseEndpoint when it is not already provided (e.g., check o.BaseEndpoint == nil or empty) in addition to cfg.Region being non-empty so the endpoint is not forced; adjust the conditional around the assignment to reference o.BaseEndpoint and cfg.Region (the symbols in question) similar to the fix applied in createAWSClientLoadBalancer.
🧹 Nitpick comments (1)
docs/dev/e2e-ote-ccm-aws.md (1)
125-125: ⚡ Quick winUse hyphenated compound modifier for consistency.
Consider changing “hosted cluster” to “hosted-cluster” in these adjective positions for grammar consistency in the doc.
Also applies to: 161-161
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/dev/e2e-ote-ccm-aws.md` at line 125, Change the adjectival phrase "hosted cluster" to the hyphenated form "hosted-cluster" for consistency in modifier usage; update each occurrence (e.g., the phrase found in the sentence starting "When running against a HyperShift hosted cluster, KUBECONFIG must point to the" and the similar occurrence around line 161) to "hosted-cluster" so the compound modifier is hyphenated wherever it precedes a noun.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/dev/e2e-ote-ccm-aws.md`:
- Around line 97-98: Replace the placeholder link token "[OCPBUGS-TBD]" in the
docs text (the line referencing OTE framework limitation) with the real issue
identifier or a full URL to the bug tracker; if no final issue exists, remove
the bracketed reference entirely and adjust the sentence to avoid a dead
placeholder so the doc reads correctly.
- Around line 27-38: The fenced code blocks showing the directory tree and the
CI job name lack a language tag; update both fenced blocks (the one containing
the directory tree starting with "openshift-tests/ccm-aws-tests/" and the other
showing the CI job name mentioned around lines 176-178) to declare a language
(e.g., use ```text) so markdownlint passes and rendering is consistent.
In `@openshift-tests/ccm-aws-tests/e2e/aws/helper.go`:
- Around line 50-52: The BaseEndpoint override is currently guarded by
cfg.Region != "" which is always true after loadAWSConfig, so update the
conditional in createAWSClientLoadBalancer and createAWSClientEC2 to reflect the
intended behavior: if you want the override only for HyperShift/external
topology, replace the cfg.Region check with common.IsExternalTopology() (use
that predicate to set o.BaseEndpoint); otherwise, if you intend to always use
the regional public endpoint, remove the conditional entirely and always assign
o.BaseEndpoint using cfg.Region. Ensure you update both functions
(createAWSClientLoadBalancer and createAWSClientEC2) and keep the BaseEndpoint
construction using fmt.Sprintf("https://elasticloadbalancing.%s.amazonaws.com",
cfg.Region).
---
Duplicate comments:
In `@openshift-tests/ccm-aws-tests/e2e/aws/helper.go`:
- Around line 128-130: The code unconditionally overwrites o.BaseEndpoint
because cfg.Region is always non-empty after loadAWSConfig; change the condition
to only set o.BaseEndpoint when it is not already provided (e.g., check
o.BaseEndpoint == nil or empty) in addition to cfg.Region being non-empty so the
endpoint is not forced; adjust the conditional around the assignment to
reference o.BaseEndpoint and cfg.Region (the symbols in question) similar to the
fix applied in createAWSClientLoadBalancer.
---
Nitpick comments:
In `@docs/dev/e2e-ote-ccm-aws.md`:
- Line 125: Change the adjectival phrase "hosted cluster" to the hyphenated form
"hosted-cluster" for consistency in modifier usage; update each occurrence
(e.g., the phrase found in the sentence starting "When running against a
HyperShift hosted cluster, KUBECONFIG must point to the" and the similar
occurrence around line 161) to "hosted-cluster" so the compound modifier is
hyphenated wherever it precedes a noun.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 1be81588-e297-4557-b9d1-4cac9b4f6369
📒 Files selected for processing (6)
docs/dev/e2e-ote-ccm-aws.mddocs/dev/ote-ccm-aws.mdopenshift-tests/ccm-aws-tests/e2e/aws/helper.goopenshift-tests/ccm-aws-tests/e2e/aws/loadbalancer.goopenshift-tests/ccm-aws-tests/e2e/common/helper.goopenshift-tests/ccm-aws-tests/go.mod
💤 Files with no reviewable changes (1)
- docs/dev/ote-ccm-aws.md
|
Local tests targeting hypershift hosted cluster are passing: Except the checking the ingress, which I am figuring out how to create a cluster with that flag enabled to get routers uses NLB on install time. |
|
/payload-job periodic-ci-openshift-hypershift-release-5.0-e2e-aws-ovn-conformance-ccm-techpreview |
|
@mtulio: trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command |
|
/payload-job periodic-ci-openshift-release-main-ci-4.22-e2e-aws-ovn-rhcos10-techpreview |
|
@mtulio: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/d7770d80-5495-11f1-8ea0-eafa9b9db0f9-0 |
|
/payload-job periodic-ci-openshift-hypershift-release-4.22-periodics-e2e-aws-ovn-conformance |
|
@mtulio: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/f0ca3500-5495-11f1-90e5-249a448fb62c-0 |
|
/payload-job periodic-ci-openshift-hypershift-release-5.0-e2e-aws-ovn-conformance-ccm-techpreview |
|
@mtulio: trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command |
|
/payload-job periodic-ci-openshift-hypershift-release-5.0-e2e-aws-ovn-conformance-ccm-techpreview |
|
@mtulio: trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command |
|
/payload-job periodic-ci-openshift-hypershift-release-5.0-periodics-e2e-aws-ovn-conformance |
|
@mtulio: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/0dfea370-5498-11f1-8ee7-71af62bf6546-0 |
|
/payload-job periodic-ci-openshift-hypershift-release-5.0-periodics-e2e-aws-ovn-conformance-ccm-techpreview |
|
@mtulio: trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command |
|
/payload-job pull-ci-openshift-hypershift-release-4.23-periodics-e2e-aws-ovn-conformance-ccm-techpreview |
|
@mtulio: trigger 0 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command |
|
/testwith openshift/hypershift/main/e2e-aws-ovn-conformance-ccm-techpreview |
|
/payload-job periodic-ci-openshift-hypershift-release-4.23-periodics-e2e-aws-ovn-conformance-ccm-techpreview |
|
@mtulio: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/51495bb0-54b2-11f1-9071-d22d799e67a5-0 |
|
/test all |
|
Alll required changes must be addressed now. Re-testing on hosted cluster and standalone before converting to ready: /payload-job periodic-ci-openshift-hypershift-release-4.23-periodics-e2e-aws-ovn-conformance-ccm-techpreview /test e2e-aws-ovn |
|
@mtulio: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/e6c59230-5548-11f1-9885-d4b892a16acb-0 |
|
Seeing e2e-aws-ovn issues of unrelated changes. Asked DPTP on related (BYOIP) thread. |
|
Those changes has been reviewed locally and in previous CI payload jobs - awaiting one more with cleaned up branch.
cc @mfbonfigli @nrb |
|
/test e2e-aws-ovn |
|
Confirmed CI infra issues in standalone profiles, following the Slack thread before triggering again. |
|
@mtulio: This pull request references Jira Issue OCPBUGS-86303, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (mrbraga@redhat.com), skipping review request. The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/test all |
1 similar comment
|
/test all |
|
/test unit e2e-aws-ovn |
|
Analysis: 5/6 tests with prefix
6/6 tests with prefix
[1] Local tests targeting devel env in hypershift setup is also validated: #464 (comment) The results show this new hybrid aws client configuration is working both self-managed and hosted cluster's installations. /verified by @mtulio CI and local tests. |
|
/pipeline-required |
|
@mtulio: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@mtulio: This pull request references Jira Issue OCPBUGS-86303, which is valid. 3 validation(s) were run on this bug
No GitHub users were found matching the public email listed for the QA contact in Jira (mrbraga@redhat.com), skipping review request. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: mfbonfigli The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
Address chaibot feeback by updating error wrapping, as well improve const documentations of decision taking of using regional endpoints preventing VPC provided endpoints which does not work outside VPC - where the client/ote runs.
|
Chai-bot feedback addressed. |
|
Changes should not affect job results (only error handling). Re-testing to collect more data points of stability while waiting for review. /payload-job periodic-ci-openshift-hypershift-release-4.23-periodics-e2e-aws-ovn-conformance-ccm-techpreview |
|
@mtulio: trigger 1 job(s) for the /payload-(with-prs|job|aggregate|job-with-prs|aggregate-with-prs) command
See details on https://pr-payload-tests.ci.openshift.org/runs/ci/5284e010-586f-11f1-98dd-c6c6dbea694c-0 |
|
@mtulio: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
OCPBUGS-86303
The
[cloud-provider-aws-e2e-openshift] AWSServiceLBNetworkSecurityGroupe2e tests fail on HyperShift hosted clusters due to three distinct issues:elasticloadbalancing.<vpc-endpoint-id>.amazonaws.com) that is unreachable from the CI test pod running on the management cluster;AWS_REGIONenvironment variable is set to a CI leaseUUID(fromLEASED_RESOURCE) instead of a valid AWS region, producing invalid endpoint URLs even with the public endpoint override; andcloud-confin namespaceopenshift-cloud-controller-manageron the guest cluster, but in HyperShift the CCM configuration lives in ConfigMapaws-cloud-configin the hosted control plane namespace on the management cluster.Proposed Solution
This PR forces the ELBv2 and EC2 clients to use public regional AWS endpoints via BaseEndpoint, validates the SDK-resolved region against a pattern covering all five AWS partitions (standard, China, GovCloud, European Sovereign Cloud, and ISO/ISOB) with fallback to infrastructure/cluster
status.platformStatus.aws.region, and makes the cloud-config test topology-aware by detecting External control plane topology to read the config from the HCP namespace on the management cluster using environment variables (HYPERSHIFT_MANAGEMENT_CLUSTER_KUBECONFIG,HYPERSHIFT_MANAGEMENT_CLUSTER_NAMESPACE) set by the CI conformance step. It also adds aSKIP_MANAGEMENT_CLUSTER_TESTSopt-in flag for environments without management cluster access, extracts reusable helpers (GetCloudConfig,IsExternalTopology,IsConfigPresentCloudConfig,IsNLBSecurityGroupModeManaged) intocommon/helper.gowithout Ginkgo control-flow dependencies, and replaces the stale developer documentation with corrected paths, HyperShift setup instructions, and batch test execution examples.Test Requirements
We must ensure those tests are validating those changes:
Standalone:
Summary by CodeRabbit
Documentation
New Features
Bug Fixes