wip: Storage provisioner add multinode support #22450
base: master
Conversation
… test to validate it.
…ity on PVs in multi-node clusters.
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: medyagh. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Details: Needs approval from an approver in each of these files.
Approvers can indicate their approval by writing `/approve` in a comment.
…arch image push to use docker buildx.
…as a direct argument to Run.
…er mode, and disable leader election for the provisioner.
…n integration test to verify NodeAffinity for hostPath PVs in multi-node clusters.
@medyagh: The following tests failed, say `/retest` to rerun all failed tests or `/retest-required` to rerun all mandatory failed tests:
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.
Details: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Pull request overview
This PR adds multinode support to the storage provisioner by converting it from a single Pod to a DaemonSet. This enables hostPath-based persistent volumes to work correctly in multi-node clusters by ensuring PVs have node affinity set to the node where storage is actually created.
Changes:
- Upgraded sig-storage-lib-external-provisioner from v6 to v13 to leverage newer provisioning features
- Converted storage-provisioner deployment from Pod to DaemonSet with node affinity awareness
- Added WaitForFirstConsumer volume binding mode to the StorageClass to enable topology-aware scheduling
- Added a comprehensive integration test to verify node affinity is correctly set on provisioned PVs (a sketch of the affinity handling follows this list)
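The sketch below is offered only as an illustration of the approach described above, not the PR's actual code: a DaemonSet-based hostPath provisioner pins each PV it creates to its own node via `spec.nodeAffinity`, with the node name assumed to arrive through the NODE_NAME environment variable set via the downward API. The function name `pvWithNodeAffinity` is hypothetical; the imports assumed are `v1 "k8s.io/api/core/v1"` and `metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"`.

```go
// Sketch only: how a DaemonSet-based hostPath provisioner can pin a PV to the
// node it was provisioned on. Capacity, access modes, and reclaim policy are
// omitted for brevity.
func pvWithNodeAffinity(pvName, hostPath, nodeName string) *v1.PersistentVolume {
	return &v1.PersistentVolume{
		ObjectMeta: metav1.ObjectMeta{Name: pvName},
		Spec: v1.PersistentVolumeSpec{
			PersistentVolumeSource: v1.PersistentVolumeSource{
				HostPath: &v1.HostPathVolumeSource{Path: hostPath},
			},
			// Pin the PV to the node where the hostPath directory actually exists,
			// so the scheduler only places consuming pods on that node.
			NodeAffinity: &v1.VolumeNodeAffinity{
				Required: &v1.NodeSelector{
					NodeSelectorTerms: []v1.NodeSelectorTerm{{
						MatchExpressions: []v1.NodeSelectorRequirement{{
							Key:      "kubernetes.io/hostname",
							Operator: v1.NodeSelectorOpIn,
							Values:   []string{nodeName}, // from the NODE_NAME env var
						}},
					}},
				},
			},
		},
	}
}
```

Combined with WaitForFirstConsumer on the StorageClass, provisioning is deferred until a consuming pod is scheduled, so the affinity recorded this way matches where the workload actually lands.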
Reviewed changes
Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.
Summary per file:
| File | Description |
|---|---|
| pkg/storage/storage_provisioner.go | Added NODE_NAME environment variable handling, deterministic identity for DaemonSet restarts, and node affinity setting on PVs |
| deploy/addons/storage-provisioner/storage-provisioner.yaml.tmpl | Converted from Pod to DaemonSet, added NODE_NAME env var, and leases permissions for leader election |
| deploy/addons/storageclass/storageclass.yaml | Added WaitForFirstConsumer volume binding mode to support topology-aware provisioning |
| test/integration/multinode_test.go | Added validateStorageProvisionerNodeAffinity test to verify node affinity functionality |
| test/integration/testdata/*.yaml | Added test manifests for PVC and Pod used in the integration test |
| Makefile | Updated storage provisioner version and build process to use buildx for multi-arch images |
| deploy/storage-provisioner/Dockerfile | Changed ARG from arch to TARGETARCH for buildx compatibility |
| go.mod & go.sum | Updated dependencies to provisioner library v13 and related transitive dependencies |
```go
nodes := fmt.Sprintf("%s-m02", profile)
// (In a real test we might want to discover the node name dynamically,
// but FreshStart2Nodes guarantees minikube and minikube-m02 exist)
```
Copilot AI · Jan 15, 2026
The hardcoded node name suffix "-m02" is fragile and makes an assumption about node naming that may not always hold. Consider dynamically discovering the second node name using kubectl get nodes to make the test more robust and less dependent on internal naming conventions.
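As a hedged illustration of that suggestion, the test could resolve the secondary node name at runtime instead of assuming the "-m02" suffix. The helper below is hypothetical and uses plain os/exec for clarity; the real test would presumably go through the suite's Run helper and its context handling. Assumed imports: "os/exec", "strings", "testing".

```go
// Hypothetical helper (not part of the PR): discover a secondary node name
// instead of hardcoding "<profile>-m02".
func secondaryNodeName(t *testing.T, profile string) string {
	t.Helper()
	out, err := exec.Command("kubectl", "--context", profile, "get", "nodes",
		"-o", "jsonpath={.items[*].metadata.name}").Output()
	if err != nil {
		t.Fatalf("failed to list nodes: %v", err)
	}
	for _, name := range strings.Fields(string(out)) {
		// The primary node is named after the profile; anything else is a worker.
		if name != profile {
			return name
		}
	}
	t.Fatalf("no secondary node found in %q", string(out))
	return ""
}
```

With a helper like this, `nodes := secondaryNodeName(t, profile)` would replace the hardcoded `fmt.Sprintf("%s-m02", profile)` shown above.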
```diff
  hostPath := path.Join(p.pvDir, options.PVC.Namespace, options.PVC.Name)
- klog.Infof("Provisioning volume %v to %s", options, hostPath)
+ klog.Infof("Provisioning volume %v to %s", options.PVC.Name, hostPath)
```
Copilot AI · Jan 15, 2026
The log message format changed from logging the entire options object to just options.PVC.Name. While this is more concise, it removes potentially useful debugging information. Consider retaining more context such as namespace, storage class, or at minimum log both namespace and name: Infof("Provisioning volume %s/%s to %s", options.PVC.Namespace, options.PVC.Name, hostPath).
Suggested change:
```diff
- klog.Infof("Provisioning volume %v to %s", options.PVC.Name, hostPath)
+ klog.Infof("Provisioning volume %s/%s to %s", options.PVC.Namespace, options.PVC.Name, hostPath)
```
```go
required := nodeAffinity["required"].(map[string]interface{})
terms := required["nodeSelectorTerms"].([]interface{})
term := terms[0].(map[string]interface{})
expressions := term["matchExpressions"].([]interface{})
expr := expressions[0].(map[string]interface{})
values := expr["values"].([]interface{})
val := values[0].(string)
```
Copilot AI · Jan 15, 2026
The deep navigation through the node affinity structure using type assertions without error checking is fragile. If any of these fields are missing or have unexpected types, the test will panic instead of providing a clear error message. Consider adding proper error handling and validation at each step to make test failures more debuggable.
Suggested change:
```diff
- required := nodeAffinity["required"].(map[string]interface{})
- terms := required["nodeSelectorTerms"].([]interface{})
- term := terms[0].(map[string]interface{})
- expressions := term["matchExpressions"].([]interface{})
- expr := expressions[0].(map[string]interface{})
- values := expr["values"].([]interface{})
- val := values[0].(string)
+ requiredRaw, ok := nodeAffinity["required"]
+ if !ok {
+ 	t.Fatalf("PV %s nodeAffinity is missing 'required' field: %v", pvName, nodeAffinity)
+ }
+ required, ok := requiredRaw.(map[string]interface{})
+ if !ok {
+ 	t.Fatalf("PV %s nodeAffinity.required is not an object: %T", pvName, requiredRaw)
+ }
+ termsRaw, ok := required["nodeSelectorTerms"]
+ if !ok {
+ 	t.Fatalf("PV %s nodeAffinity.required is missing 'nodeSelectorTerms': %v", pvName, required)
+ }
+ terms, ok := termsRaw.([]interface{})
+ if !ok {
+ 	t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms is not an array: %T", pvName, termsRaw)
+ }
+ if len(terms) == 0 {
+ 	t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms is empty", pvName)
+ }
+ term, ok := terms[0].(map[string]interface{})
+ if !ok {
+ 	t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0] is not an object: %T", pvName, terms[0])
+ }
+ expressionsRaw, ok := term["matchExpressions"]
+ if !ok {
+ 	t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0] is missing 'matchExpressions': %v", pvName, term)
+ }
+ expressions, ok := expressionsRaw.([]interface{})
+ if !ok {
+ 	t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0].matchExpressions is not an array: %T", pvName, expressionsRaw)
+ }
+ if len(expressions) == 0 {
+ 	t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0].matchExpressions is empty", pvName)
+ }
+ expr, ok := expressions[0].(map[string]interface{})
+ if !ok {
+ 	t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0] is not an object: %T", pvName, expressions[0])
+ }
+ valuesRaw, ok := expr["values"]
+ if !ok {
+ 	t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0] is missing 'values': %v", pvName, expr)
+ }
+ values, ok := valuesRaw.([]interface{})
+ if !ok {
+ 	t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values is not an array: %T", pvName, valuesRaw)
+ }
+ if len(values) == 0 {
+ 	t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values is empty", pvName)
+ }
+ val, ok := values[0].(string)
+ if !ok {
+ 	t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values[0] is not a string: %T", pvName, values[0])
+ }
```
```yaml
selector:
  matchLabels:
    integration-test: storage-provisioner
    addonmanager.kubernetes.io/mode: Reconcile
```
Copilot AI · Jan 15, 2026
The DaemonSet selector includes the label addonmanager.kubernetes.io/mode: Reconcile which is typically used for addon management metadata and not for pod selection. This could cause issues if the addon manager changes this label. Consider using a dedicated label for pod selection (e.g., app: storage-provisioner) instead of repurposing the addonmanager label.
```go
identity := types.UID(uuid.NewUUID())
if nodeName != "" {
	identity = types.UID(provisionerName + "-" + nodeName)
}
```
Copilot AI · Jan 15, 2026
The fallback to a random UUID when NODE_NAME is not set creates an inconsistent behavior. A warning is logged, but the provisioner continues with a random identity which won't work correctly in a DaemonSet deployment. Consider making NODE_NAME required and returning an error if it's not set, or documenting this as supporting both single-node (legacy Pod) and multi-node (DaemonSet) deployments.
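For the stricter variant, a minimal sketch reusing the names from the excerpt above (nodeName, provisionerName, types.UID) and assuming the surrounding setup code can return an error:

```go
// Sketch only: require NODE_NAME instead of falling back to a random UUID.
// NODE_NAME is expected to be injected via the downward API (spec.nodeName)
// in the DaemonSet manifest; without it the provisioner refuses to start.
nodeName := os.Getenv("NODE_NAME")
if nodeName == "" {
	return fmt.Errorf("NODE_NAME is not set; it is required when the storage provisioner runs as a DaemonSet")
}
identity := types.UID(provisionerName + "-" + nodeName)
```

The alternative, as the comment notes, is to keep the UUID fallback but document it explicitly as the legacy single-Pod path.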
PR needs rebase.
Details: Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
No description provided.