@medyagh
Member

@medyagh medyagh commented Jan 14, 2026

No description provided.

@k8s-ci-robot added the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) on Jan 14, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: medyagh

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved (indicates a PR has been approved by an approver from all required OWNERS files) and cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) labels on Jan 14, 2026
@k8s-ci-robot added the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) and removed the size/L label (denotes a PR that changes 100-499 lines, ignoring generated files) on Jan 14, 2026
@medyagh changed the title from "Storage provisioner add multinode support" to "wip: Storage provisioner add multinode support" on Jan 14, 2026
@k8s-ci-robot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) on Jan 14, 2026
…n integration test to verify NodeAffinity for hostPath PVs in multi-node clusters.
@k8s-ci-robot
Contributor

@medyagh: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| integration-vfkit-docker-macos-arm | 78f1f09 | link | false | /test integration-vfkit-docker-macos-arm |
| integration-none-containerd-linux-x86 | 78f1f09 | link | false | /test integration-none-containerd-linux-x86 |
| integration-docker-containerd-linux-arm | 78f1f09 | link | false | /test integration-docker-containerd-linux-arm |
| integration-docker-containerd-linux-x86 | 78f1f09 | link | true | /test integration-docker-containerd-linux-x86 |
| integration-docker-docker-linux-x86 | 78f1f09 | link | true | /test integration-docker-docker-linux-x86 |
| integration-docker-docker-linux-arm | 78f1f09 | link | true | /test integration-docker-docker-linux-arm |
| integration-none-docker-linux-x86 | 78f1f09 | link | true | /test integration-none-docker-linux-x86 |
| integration-docker-crio-linux-x86 | 78f1f09 | link | false | /test integration-docker-crio-linux-x86 |
| integration-kvm-crio-linux-x86 | 78f1f09 | link | false | /test integration-kvm-crio-linux-x86 |
| integration-kvm-docker-linux-x86 | 78f1f09 | link | true | /test integration-kvm-docker-linux-x86 |
| integration-kvm-containerd-linux-x86 | 78f1f09 | link | true | /test integration-kvm-containerd-linux-x86 |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Contributor

Copilot AI left a comment

Pull request overview

This PR adds multinode support to the storage provisioner by converting it from a single Pod to a DaemonSet. This enables hostPath-based persistent volumes to work correctly in multi-node clusters by ensuring PVs have node affinity set to the node where storage is actually created.

Changes:

  • Upgraded sig-storage-lib-external-provisioner from v6 to v13 to leverage newer provisioning features
  • Converted storage-provisioner deployment from Pod to DaemonSet with node affinity awareness
  • Added WaitForFirstConsumer volume binding mode to the StorageClass to enable topology-aware scheduling
  • Added comprehensive integration test to verify node affinity is correctly set on provisioned PVs
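For readers unfamiliar with PV node affinity, the shape of the field that pins a hostPath volume to one node can be sketched as follows. This is a minimal illustration, not the code from this PR: the struct types are local stand-ins for the Kubernetes API types of the same shape, `affinityForNode` is a hypothetical helper, and the choice of the `kubernetes.io/hostname` label as the selector key is an assumption.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local stand-ins that mirror the JSON shape of a PV's spec.nodeAffinity.
// They are defined here only to keep the sketch free of external modules.
type nodeSelectorRequirement struct {
	Key      string   `json:"key"`
	Operator string   `json:"operator"`
	Values   []string `json:"values"`
}

type nodeSelectorTerm struct {
	MatchExpressions []nodeSelectorRequirement `json:"matchExpressions"`
}

type nodeSelector struct {
	NodeSelectorTerms []nodeSelectorTerm `json:"nodeSelectorTerms"`
}

type volumeNodeAffinity struct {
	Required nodeSelector `json:"required"`
}

// affinityForNode (hypothetical name) builds an affinity that restricts a
// volume to the single node identified by nodeName.
// Assumption: the node is selected by its kubernetes.io/hostname label.
func affinityForNode(nodeName string) volumeNodeAffinity {
	return volumeNodeAffinity{
		Required: nodeSelector{
			NodeSelectorTerms: []nodeSelectorTerm{{
				MatchExpressions: []nodeSelectorRequirement{{
					Key:      "kubernetes.io/hostname",
					Operator: "In",
					Values:   []string{nodeName},
				}},
			}},
		},
	}
}

func main() {
	out, err := json.MarshalIndent(affinityForNode("minikube-m02"), "", "  ")
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

With WaitForFirstConsumer binding, provisioning is delayed until a pod using the claim is scheduled, so the node name passed in would be the node that pod landed on.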

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 5 comments.

Show a summary per file
| File | Description |
|---|---|
| pkg/storage/storage_provisioner.go | Added NODE_NAME environment variable handling, deterministic identity for DaemonSet restarts, and node affinity setting on PVs |
| deploy/addons/storage-provisioner/storage-provisioner.yaml.tmpl | Converted from Pod to DaemonSet, added NODE_NAME env var, and leases permissions for leader election |
| deploy/addons/storageclass/storageclass.yaml | Added WaitForFirstConsumer volume binding mode to support topology-aware provisioning |
| test/integration/multinode_test.go | Added validateStorageProvisionerNodeAffinity test to verify node affinity functionality |
| test/integration/testdata/*.yaml | Added test manifests for PVC and Pod used in the integration test |
| Makefile | Updated storage provisioner version and build process to use buildx for multi-arch images |
| deploy/storage-provisioner/Dockerfile | Changed ARG from arch to TARGETARCH for buildx compatibility |
| go.mod & go.sum | Updated dependencies to provisioner library v13 and related transitive dependencies |

Comment on lines +626 to +628
nodes := fmt.Sprintf("%s-m02", profile)
// (In a real test we might want to discover the node name dynamically,
// but FreshStart2Nodes guarantees minikube and minikube-m02 exist)

Copilot AI Jan 15, 2026

The hardcoded node name suffix "-m02" is fragile and makes an assumption about node naming that may not always hold. Consider dynamically discovering the second node name using kubectl get nodes to make the test more robust and less dependent on internal naming conventions.
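One way to discover the node dynamically, sketched under assumptions: the test would run `kubectl get nodes -o jsonpath='{.items[*].metadata.name}'`, which prints node names separated by spaces, and filter out the primary node. The helper name `workerNodes` is hypothetical and the sample output is illustrative, not taken from a real run.

```go
package main

import (
	"fmt"
	"strings"
)

// workerNodes (hypothetical helper) takes the space-separated node names
// printed by
//   kubectl get nodes -o jsonpath='{.items[*].metadata.name}'
// and returns every node except the primary one.
func workerNodes(kubectlOutput, primary string) []string {
	var workers []string
	for _, name := range strings.Fields(kubectlOutput) {
		if name != primary {
			workers = append(workers, name)
		}
	}
	return workers
}

func main() {
	// Illustrative output for a two-node profile named "minikube".
	out := "minikube minikube-m02"

	workers := workerNodes(out, "minikube")
	if len(workers) == 0 {
		fmt.Println("no secondary node found")
		return
	}
	fmt.Println("secondary node:", workers[0])
}
```

In the integration test, an empty result would be a natural place to fail with a clear message rather than continue with a guessed name.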

Copilot uses AI. Check for mistakes.

  hostPath := path.Join(p.pvDir, options.PVC.Namespace, options.PVC.Name)
- klog.Infof("Provisioning volume %v to %s", options, hostPath)
+ klog.Infof("Provisioning volume %v to %s", options.PVC.Name, hostPath)

Copilot AI Jan 15, 2026

The log message format changed from logging the entire options object to just options.PVC.Name. While this is more concise, it removes potentially useful debugging information. Consider retaining more context such as namespace, storage class, or at minimum log both namespace and name: Infof("Provisioning volume %s/%s to %s", options.PVC.Namespace, options.PVC.Name, hostPath).

Suggested change
- klog.Infof("Provisioning volume %v to %s", options.PVC.Name, hostPath)
+ klog.Infof("Provisioning volume %s/%s to %s", options.PVC.Namespace, options.PVC.Name, hostPath)

Comment on lines +690 to +697
required := nodeAffinity["required"].(map[string]interface{})
terms := required["nodeSelectorTerms"].([]interface{})
term := terms[0].(map[string]interface{})
expressions := term["matchExpressions"].([]interface{})
expr := expressions[0].(map[string]interface{})
values := expr["values"].([]interface{})
val := values[0].(string)

Copilot AI Jan 15, 2026

The deep navigation through the node affinity structure using type assertions without error checking is fragile. If any of these fields are missing or have unexpected types, the test will panic instead of providing a clear error message. Consider adding proper error handling and validation at each step to make test failures more debuggable.

Suggested change
- required := nodeAffinity["required"].(map[string]interface{})
- terms := required["nodeSelectorTerms"].([]interface{})
- term := terms[0].(map[string]interface{})
- expressions := term["matchExpressions"].([]interface{})
- expr := expressions[0].(map[string]interface{})
- values := expr["values"].([]interface{})
- val := values[0].(string)
+ requiredRaw, ok := nodeAffinity["required"]
+ if !ok {
+     t.Fatalf("PV %s nodeAffinity is missing 'required' field: %v", pvName, nodeAffinity)
+ }
+ required, ok := requiredRaw.(map[string]interface{})
+ if !ok {
+     t.Fatalf("PV %s nodeAffinity.required is not an object: %T", pvName, requiredRaw)
+ }
+ termsRaw, ok := required["nodeSelectorTerms"]
+ if !ok {
+     t.Fatalf("PV %s nodeAffinity.required is missing 'nodeSelectorTerms': %v", pvName, required)
+ }
+ terms, ok := termsRaw.([]interface{})
+ if !ok {
+     t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms is not an array: %T", pvName, termsRaw)
+ }
+ if len(terms) == 0 {
+     t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms is empty", pvName)
+ }
+ term, ok := terms[0].(map[string]interface{})
+ if !ok {
+     t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0] is not an object: %T", pvName, terms[0])
+ }
+ expressionsRaw, ok := term["matchExpressions"]
+ if !ok {
+     t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0] is missing 'matchExpressions': %v", pvName, term)
+ }
+ expressions, ok := expressionsRaw.([]interface{})
+ if !ok {
+     t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0].matchExpressions is not an array: %T", pvName, expressionsRaw)
+ }
+ if len(expressions) == 0 {
+     t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0].matchExpressions is empty", pvName)
+ }
+ expr, ok := expressions[0].(map[string]interface{})
+ if !ok {
+     t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0] is not an object: %T", pvName, expressions[0])
+ }
+ valuesRaw, ok := expr["values"]
+ if !ok {
+     t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0] is missing 'values': %v", pvName, expr)
+ }
+ values, ok := valuesRaw.([]interface{})
+ if !ok {
+     t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values is not an array: %T", pvName, valuesRaw)
+ }
+ if len(values) == 0 {
+     t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values is empty", pvName)
+ }
+ val, ok := values[0].(string)
+ if !ok {
+     t.Fatalf("PV %s nodeAffinity.required.nodeSelectorTerms[0].matchExpressions[0].values[0] is not a string: %T", pvName, values[0])
+ }
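A shorter alternative to checking every assertion by hand, sketched here as an option rather than what the PR does, is to decode the PV JSON into a small typed struct. `encoding/json` then reports type mismatches as errors, and only the length checks remain. The struct `pvAffinity`, the helper `firstAffinityValue`, and the sample document are all hypothetical; the sketch assumes the test has the PV available as JSON (for example from `kubectl get pv -o json`).

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// pvAffinity mirrors only the part of a PersistentVolume that the test reads:
// spec.nodeAffinity.required.nodeSelectorTerms[].matchExpressions[].values.
type pvAffinity struct {
	Spec struct {
		NodeAffinity struct {
			Required struct {
				NodeSelectorTerms []struct {
					MatchExpressions []struct {
						Key      string   `json:"key"`
						Operator string   `json:"operator"`
						Values   []string `json:"values"`
					} `json:"matchExpressions"`
				} `json:"nodeSelectorTerms"`
			} `json:"required"`
		} `json:"nodeAffinity"`
	} `json:"spec"`
}

// samplePV is an illustrative document, not output captured from a cluster.
const samplePV = `{"spec":{"nodeAffinity":{"required":{"nodeSelectorTerms":[
  {"matchExpressions":[{"key":"kubernetes.io/hostname","operator":"In","values":["minikube-m02"]}]}
]}}}}`

// firstAffinityValue returns values[0] of the first match expression of the
// first selector term, or an error that names the missing piece.
func firstAffinityValue(pvJSON []byte) (string, error) {
	var pv pvAffinity
	if err := json.Unmarshal(pvJSON, &pv); err != nil {
		return "", fmt.Errorf("decoding PV: %w", err)
	}
	terms := pv.Spec.NodeAffinity.Required.NodeSelectorTerms
	if len(terms) == 0 {
		return "", errors.New("nodeSelectorTerms is missing or empty")
	}
	exprs := terms[0].MatchExpressions
	if len(exprs) == 0 {
		return "", errors.New("matchExpressions is missing or empty")
	}
	if len(exprs[0].Values) == 0 {
		return "", errors.New("values is missing or empty")
	}
	return exprs[0].Values[0], nil
}

func main() {
	val, err := firstAffinityValue([]byte(samplePV))
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("PV is pinned to node:", val)
}
```

In the test, a single `t.Fatalf` on the returned error would replace the chain of per-field checks while still avoiding a panic.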

Comment on lines +110 to +113
selector:
matchLabels:
integration-test: storage-provisioner
addonmanager.kubernetes.io/mode: Reconcile

Copilot AI Jan 15, 2026

The DaemonSet selector includes the label addonmanager.kubernetes.io/mode: Reconcile which is typically used for addon management metadata and not for pod selection. This could cause issues if the addon manager changes this label. Consider using a dedicated label for pod selection (e.g., app: storage-provisioner) instead of repurposing the addonmanager label.
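The concern can be illustrated with a small model of `matchLabels` semantics: a selector matches a pod only if every listed key/value pair is present on the pod. The function below is a simplified stand-in written for this comment, not a Kubernetes library call, and the label values are examples.

```go
package main

import "fmt"

// selectorMatches models matchLabels: every pair in the selector must be
// present, with the same value, in the pod's labels.
func selectorMatches(matchLabels, podLabels map[string]string) bool {
	for key, want := range matchLabels {
		got, ok := podLabels[key]
		if !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	// Pod labels after the addon manager mode label has been changed.
	pod := map[string]string{
		"app":                             "storage-provisioner",
		"addonmanager.kubernetes.io/mode": "EnsureExists",
	}

	// Selector that also depends on the addon manager label.
	coupled := map[string]string{
		"app":                             "storage-provisioner",
		"addonmanager.kubernetes.io/mode": "Reconcile",
	}
	// Selector that uses only a dedicated label.
	dedicated := map[string]string{"app": "storage-provisioner"}

	fmt.Println("coupled selector matches:", selectorMatches(coupled, pod))
	fmt.Println("dedicated selector matches:", selectorMatches(dedicated, pod))
}
```

Because a DaemonSet's selector cannot be changed after creation, keeping it to a label that exists only for selection avoids tying it to metadata owned by another component.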

Comment on lines +60 to +63
identity := types.UID(uuid.NewUUID())
if nodeName != "" {
identity = types.UID(provisionerName + "-" + nodeName)
}

Copilot AI Jan 15, 2026

The fallback to a random UUID when NODE_NAME is not set creates inconsistent behavior. A warning is logged, but the provisioner continues with a random identity, which won't work correctly in a DaemonSet deployment. Consider making NODE_NAME required and returning an error if it's not set, or documenting this as supporting both single-node (legacy Pod) and multi-node (DaemonSet) deployments.
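A minimal sketch of the "make NODE_NAME required" option, assuming the identity keeps the `provisionerName + "-" + nodeName` form from the diff. The helper name `provisionerIdentity` and the provisioner name `example.com/hostpath` are hypothetical, and the real code wraps the result in `types.UID`, which is omitted here to keep the example free of external modules.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// errNoNodeName is returned when the node name is empty.
var errNoNodeName = errors.New("NODE_NAME must be set")

// provisionerIdentity (hypothetical helper) derives a deterministic identity
// from the provisioner name and the node name, and refuses to fall back to a
// random value when the node name is missing.
func provisionerIdentity(provisionerName, nodeName string) (string, error) {
	if nodeName == "" {
		return "", errNoNodeName
	}
	return provisionerName + "-" + nodeName, nil
}

func main() {
	// The diff reads the node name from the NODE_NAME environment variable.
	id, err := provisionerIdentity("example.com/hostpath", os.Getenv("NODE_NAME"))
	if err != nil {
		fmt.Println("refusing to start:", err)
		return
	}
	fmt.Println("identity:", id)
}
```

If the legacy single-Pod deployment still needs to be supported, the alternative is to keep the UUID fallback and document both modes, as the comment notes.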

@k8s-ci-robot added the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) on Jan 15, 2026
@k8s-ci-robot
Contributor

PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
