-
Notifications
You must be signed in to change notification settings - Fork 0
<fix>[storage]: Serialize and modify multiple mds nodes #3215
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 5.5.6
Are you sure you want to change the base?
Conversation
工作流程概览本变更将外部主存储更新消息处理从同步改为异步链式处理,通过向同步任务链提交ChainTask来包装更新操作。新增队列内处理方法、加强配置变更条件验证,并在异步分支中确保completion信号被正确触发。同时新增集成测试以验证并发场景下的重连消息处理。 变更
代码审查工作量评估🎯 3 (中等) | ⏱️ ~22 分钟 诗句
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
`@storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorage.java`:
- Line 215: Fix the conditional in ExternalPrimaryStorage to avoid the stray
spacing and possible NPE by replacing the current check that uses
oldConfig.equals(msg.getConfig()) with a null-safe comparison such as
Objects.equals(oldConfig, msg.getConfig()) (and add the import
java.util.Objects), or flip the comparison to use
msg.getConfig().equals(oldConfig) after confirming msg.getConfig() is non-null;
ensure the condition reads with a space before && (e.g., if (msg.getConfig() !=
null && !Objects.equals(oldConfig, msg.getConfig())) ) and update any imports
accordingly.
📜 Review details
Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorage.javatest/src/test/groovy/org/zstack/test/integration/storage/primary/addon/zbs/ZbsPrimaryStorageCase.groovy
🧰 Additional context used
📓 Path-based instructions (2)
**/*.*
⚙️ CodeRabbit configuration file
**/*.*: - 代码里不应当有有中文,包括报错、注释等都应当使用正确的、无拼写错误的英文来写
Files:
test/src/test/groovy/org/zstack/test/integration/storage/primary/addon/zbs/ZbsPrimaryStorageCase.groovystorage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorage.java
**/*.java
⚙️ CodeRabbit configuration file
**/*.java: ## 1. API 设计要求
- API 命名:
- API 名称必须唯一,不能重复。
- API 消息类需要继承
APIMessage;其返回类必须继承APIReply或APIEvent,并在注释中用@RestResponse进行标注。- API 消息上必须添加注解
@RestRequest,并满足如下规范:
path:
- 针对资源使用复数形式。
- 当 path 中引用消息类变量时,使用
{variableName}格式。- HTTP 方法对应:
- 查询操作 →
HttpMethod.GET- 更新操作 →
HttpMethod.PUT- 创建操作 →
HttpMethod.POST- 删除操作 →
HttpMethod.DELETE- API 类需要实现
__example__方法以便生成 API 文档,并确保生成对应的 Groovy API Template 与 API Markdown 文件。
2. 命名与格式规范
类名:
- 使用 UpperCamelCase 风格。
- 特殊情况:
- VO/AO/EO 类型类除外。
- 抽象类采用
Abstract或Base前缀/后缀。- 异常类应以
Exception结尾。- 测试类需要以
Test或Case结尾。方法名、参数名、成员变量和局部变量:
- 使用 lowerCamelCase 风格。
常量命名:
- 全部大写,使用下划线分隔单词。
- 要求表达清楚,避免使用含糊或不准确的名称。
包名:
- 统一使用小写,使用点分隔符,每个部分应是一个具有自然语义的英文单词(参考 Spring 框架的结构)。
命名细节:
- 避免在父子类或同一代码块中出现相同名字的成员或局部变量,防止混淆。
- 命名缩写:
- 不允许使用不必要的缩写,如:
AbsSchedulerJob、condi、Fu等。应使用完整单词提升可读性。
3. 编写自解释代码
意图表达:
- 避免使用布尔型参数造成含义不明确。例如:
- 对于
stopAgent(boolean ignoreError),建议拆分为不同函数(如stopAgentIgnoreError()),或使用枚举表达操作类型。- 命名应尽量用完整的单词组合表达意图,并在名称中体现数据类型或用途(例如在常量与变量名称中,将类型词放在末尾)。
- 避免使用魔法值(Magic Value):
直接使用未经定义的数值或字符串(如 if (status == 5))应替换为枚举或常量。
示例:
// 错误示例:魔法值
if (user.getStatus() == 5) { ... }
// 正确示例:常量或枚举
public static final int STATUS_ACTIVE = 5;
if (user.getStatus() == STATUS_ACTIVE) { ... }
// 或使用枚举
enum UserStatus { ACTIVE, INACTIVE }
注释:
- 代码应尽量做到自解释,对少于两行的说明可以直接写在代码中。
- 对于较长的注释,需要仔细校对并随代码更新,确保内容正确。
- 接口方法不应有多余的修饰符(例如
public),且必须配有有效的 Javadoc 注释。
4. 流程控制和结构优化
if...else 的使用:
- 应尽量减少 if...else 结构的使用,建议:
- 限制嵌套层级最多为两层,且内层不应再出现
else分支。- 尽早返回(Early Return),将条件判断中的处理逻辑提前结束或抽成独立方法。
- 使用 Java Stream 或 Lambda 表达式代替冗长的循环与条件判断。
条件判断:
- if 条件表达不宜过长或过于复杂,必要时可以将条件抽成 boolean 变量描述。
代码块长度:
...
Files:
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorage.java
🧠 Learnings (2)
📚 Learning: 2025-08-25T03:55:07.988Z
Learnt from: zstack-robot-1
Repo: MatheMatrix/zstack PR: 2504
File: plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java:2046-2078
Timestamp: 2025-08-25T03:55:07.988Z
Learning: The OFFLINE_SNAPSHOT_COMMIT path for NFS primary storage snapshot commit operations is properly handled in the test infrastructure via NfsPrimaryStorageSpec.groovy, which includes both simulator and VFS hook implementations for testing the offline snapshot commit functionality.
Applied to files:
test/src/test/groovy/org/zstack/test/integration/storage/primary/addon/zbs/ZbsPrimaryStorageCase.groovy
📚 Learning: 2025-08-24T07:35:42.793Z
Learnt from: zstack-robot-2
Repo: MatheMatrix/zstack PR: 2496
File: plugin/sharedMountPointPrimaryStorage/src/main/java/org/zstack/storage/primary/smp/KvmBackend.java:2545-2566
Timestamp: 2025-08-24T07:35:42.793Z
Learning: 在 ZStack 代码库中,当响应类包含数值字段(如 size、actualSize)时,优先使用原始类型(long)而不是包装类型(Long),以避免 NPE 风险和不必要的装箱/拆箱操作。如果 Agent 端可能不设置该字段,应在 Agent 端确保设置默认值,而不是在使用端做 null 检查。
Applied to files:
test/src/test/groovy/org/zstack/test/integration/storage/primary/addon/zbs/ZbsPrimaryStorageCase.groovy
🔍 Remote MCP AtlassianMCP
Summary of Relevant Context for PR Review
Issue Background: ZSTAC-79225
The issue (ZSTAC-79225) describes a problem where modifying one MDS node's port before the previous modification API completes results in the second modification failing. More specifically, when one MDS node port modification is initiated through the UI and another MDS node port modification is triggered before the first one completes, the second modification fails because the message for the second modification includes stale parameters from the first modification's state.
The assigned priority is P2 (Normal), and the issue is targeted for fix version 5.5.6. The root cause identified is that when modifying one MDS node, all 3 MDS node parameters are included in the message, and when the UI simultaneously modifies two MDS nodes, the later message contains the older parameters from the earlier modification, causing the update to fail.
Solution Approach
The solution implements serialization of APIUpdateExternalPrimaryStorageMsg handling by queuing updates for multiple MDS nodes, ensuring modifications are applied serially rather than concurrently.
Key Technical Changes
The PR implements:
-
Asynchronous Chain-Based Message Handling: Wraps the
APIUpdateExternalPrimaryStorageMsgupdate processing in aChainTaskand submits it to a sync task chain for serialized execution -
In-Queue Processing Method: Adds
doUpdateExternalPrimaryStorageInQueue(APIUpdateExternalPrimaryStorageMsg, NoErrorCompletion)to handle updates within the queue -
Configuration Change Validation: Tightens the update condition to only apply changes when the new config is non-null and different from the old one (with null check on oldConfig)
-
Proper Completion Handling: Ensures
completion.done()is invoked in all asynchronous branches afterbus.publish(evt)to guarantee completion signaling -
Integration with ChainTask Framework: Uses NoErrorCompletion for the chain callback to integrate with the existing chain task flow, which manages serialization of concurrent requests through dispatch queues
Test Coverage
The integration test in ZbsPrimaryStorageCase.groovy validates:
- Concurrent update API calls through thread coordination
- Proper serialization (exactly 2 reconnect messages from concurrent updates)
- Task lifecycle management (running task count remains 1 during reconnect)
🔇 Additional comments (4)
storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorage.java (2)
175-197: 链式任务实现正确,实现了 API 请求的串行化处理。使用
ChainTask和基于 UUID 的同步签名确保了对同一存储的多个并发更新请求会被串行处理,这正确解决了 ZSTAC-79225 中描述的并发修改问题。
241-251: 异步分支中的 completion 处理正确。在 reconnect 和非 reconnect 两个路径中都正确调用了
completion.done(),确保了链式任务能够正常推进到下一个任务。test/src/test/groovy/org/zstack/test/integration/storage/primary/addon/zbs/ZbsPrimaryStorageCase.groovy (2)
53-54: 新增字段用于测试并发更新场景。
ThreadFacadeImpl thdf和AtomicInteger reconnectMsgCount用于协调和验证并发更新测试,实现合理。
286-328: 修复代码格式问题并改进测试的并发稳定性。
第 288 行:
run变量的初始赋值是冗余的,因为它在消息处理器内部被立即重新赋值,建议删除此行或重构变量作用域。第 321 行:
>=运算符两侧缺少空格,应改为>= 1。测试并发稳定性问题:三个并发线程(分别配置 127.0.1.4、127.0.1.5 和 oldConfig)修改存储配置,但断言期望恰好 2 次 reconnect 消息(第 327 行)。由于线程调度顺序不确定,实际触发的 reconnect 次数可能变化,导致测试失败。建议:
- 使用
CountDownLatch或CyclicBarrier等同步机制控制线程执行顺序,或- 将第 327 行的断言改为
assert reconnectMsgCount.get() >= 2以提高稳定性。⛔ Skipped due to learnings
Learnt from: zstack-robot-1 Repo: MatheMatrix/zstack PR: 2504 File: plugin/nfsPrimaryStorage/src/main/java/org/zstack/storage/primary/nfs/NfsPrimaryStorageKVMBackend.java:2046-2078 Timestamp: 2025-08-25T03:55:07.988Z Learning: The OFFLINE_SNAPSHOT_COMMIT path for NFS primary storage snapshot commit operations is properly handled in the test infrastructure via NfsPrimaryStorageSpec.groovy, which includes both simulator and VFS hook implementations for testing the offline snapshot commit functionality.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| boolean needReconnect = false; | ||
| String oldConfig = externalVO.getConfig(); | ||
| if (msg.getConfig() != null) { | ||
| if (msg.getConfig() != null&& !oldConfig.equals(msg.getConfig())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
修复格式问题并防止潜在的空指针异常。
当前代码存在两个问题:
&&前缺少空格- 如果
oldConfig(从externalVO.getConfig()获取)为 null,调用oldConfig.equals()会抛出 NPE
建议使用 Objects.equals() 或交换比较顺序以确保安全。
🐛 建议的修复方案
- if (msg.getConfig() != null&& !oldConfig.equals(msg.getConfig())) {
+ if (msg.getConfig() != null && !msg.getConfig().equals(oldConfig)) {或者使用 Objects.equals():
+import java.util.Objects;
...
- if (msg.getConfig() != null&& !oldConfig.equals(msg.getConfig())) {
+ if (msg.getConfig() != null && !Objects.equals(oldConfig, msg.getConfig())) {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (msg.getConfig() != null&& !oldConfig.equals(msg.getConfig())) { | |
| if (msg.getConfig() != null && !msg.getConfig().equals(oldConfig)) { |
🤖 Prompt for AI Agents
In
`@storage/src/main/java/org/zstack/storage/addon/primary/ExternalPrimaryStorage.java`
at line 215, Fix the conditional in ExternalPrimaryStorage to avoid the stray
spacing and possible NPE by replacing the current check that uses
oldConfig.equals(msg.getConfig()) with a null-safe comparison such as
Objects.equals(oldConfig, msg.getConfig()) (and add the import
java.util.Objects), or flip the comparison to use
msg.getConfig().equals(oldConfig) after confirming msg.getConfig() is non-null;
ensure the condition reads with a space before && (e.g., if (msg.getConfig() !=
null && !Objects.equals(oldConfig, msg.getConfig())) ) and update any imports
accordingly.
dd00df3 to
70ee841
Compare
| boolean needReconnect = false; | ||
| String oldConfig = externalVO.getConfig(); | ||
| if (msg.getConfig() != null) { | ||
| if (msg.getConfig() != null&& !oldConfig.equals(msg.getConfig())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment from yu.qiu:
已修改
b22e238 to
cb5e833
Compare
When API requests for modifying multiple MDS nodes are initiated, place the API requests into a queue for serialized modification. Resolves: ZSTAC-79225 Change-Id: I6878717373676e63796f6c7079716f6c6b6e7a61
cb5e833 to
983b093
Compare
When API requests for modifying
multiple MDS nodes are initiated,
place the API requests into a queue for serialized modification.
Resolves: ZSTAC-79225
Change-Id: I6b7963687668796878786568746d797165756b69
sync from gitlab !9034