Fix region migration reliability regressions #17513
Open
Pengzna wants to merge 1 commit into apache:master from
Pull request overview
This PR addresses region migration reliability regressions across ConfigNode region-maintenance RPC behavior, IoTConsensus crash-recovery peer creation, and IoTConsensusV2 realtime replication index assignment in the Pipe subsystem.
Changes:
- Avoid “double retry storms” by using a single RPC attempt for `DELETE_OLD_REGION_PEER` when the target DataNode is `Unknown`.
- Allow `IoTConsensus#createLocalPeer` to proceed when the consensus peer directory already exists (e.g., crash recovery).
- Assign IoTConsensusV2 realtime `replicateIndexForIoTV2` lazily at supply-time (and idempotently), with added unit tests.
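The lazy, idempotent assignment described in the last bullet can be sketched roughly as follows. This is an illustration only: `SketchEvent`, `LazyIndexAssigner`, and `assignIfNeeded` are hypothetical names, not the PR's classes, and the per-source `AtomicLong` counter is an assumption. The point it shows is that an index is consumed only when an event is actually supplied, and that supplying the same event twice does not consume a second index.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for an event that carries a replicate index.
class SketchEvent {
  static final long NO_INDEX = -1L;
  private long replicateIndex = NO_INDEX;

  long getReplicateIndex() {
    return replicateIndex;
  }

  boolean hasReplicateIndex() {
    return replicateIndex != NO_INDEX;
  }

  void setReplicateIndex(long index) {
    this.replicateIndex = index;
  }
}

// Hypothetical helper: hands out indices at supply time instead of at assignment time.
class LazyIndexAssigner {
  private final AtomicLong nextIndex = new AtomicLong(1);

  // Idempotent: an event that already has an index keeps it, so no index is wasted.
  synchronized SketchEvent assignIfNeeded(SketchEvent event) {
    if (!event.hasReplicateIndex()) {
      event.setReplicateIndex(nextIndex.getAndIncrement());
    }
    return event;
  }
}
```

Under these assumptions, an event that is created but never supplied never takes an index, so the supplied events receive consecutive indices without holes.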
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| iotdb-core/datanode/src/test/java/org/apache/iotdb/db/pipe/source/dataregion/realtime/PipeRealtimeReplicateIndexAssignmentTest.java | Adds a unit test to validate lazy + idempotent replicate index assignment behavior. |
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/source/dataregion/realtime/assigner/PipeDataRegionAssigner.java | Removes eager replicate index assignment during event assignment. |
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/source/dataregion/realtime/PipeRealtimeDataRegionTsFileSource.java | Assigns replicate index (if needed) at supply() time. |
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/source/dataregion/realtime/PipeRealtimeDataRegionSource.java | Introduces shared lazy/idempotent replicate index assignment helpers. |
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/source/dataregion/realtime/PipeRealtimeDataRegionLogSource.java | Assigns replicate index (if needed) at supply() time. |
| iotdb-core/datanode/src/main/java/org/apache/iotdb/db/pipe/source/dataregion/realtime/PipeRealtimeDataRegionHybridSource.java | Assigns replicate index (if needed) at supply() time to avoid holes. |
| iotdb-core/consensus/src/test/java/org/apache/iotdb/consensus/iot/StabilityTest.java | Adds a test to ensure createLocalPeer tolerates an existing consensus directory. |
| iotdb-core/consensus/src/main/java/org/apache/iotdb/consensus/iot/IoTConsensus.java | Allows reuse of an existing consensus peer directory in createLocalPeer. |
| iotdb-core/confignode/src/test/java/org/apache/iotdb/confignode/procedure/env/RegionMaintainHandlerConsensusPipeTest.java | Adds tests verifying retry behavior changes based on DataNode status. |
| iotdb-core/confignode/src/main/java/org/apache/iotdb/confignode/procedure/env/RegionMaintainHandler.java | Uses node status to decide between full retry vs single-attempt RPC for deleting old peers. |
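The status-based choice between a full retry and a single attempt, as described for `RegionMaintainHandler`, might look like the sketch below. Everything here is assumed for illustration: `SketchNodeStatus`, `DeletePeerRetryPolicy`, and the retry count of 3 are hypothetical and are not taken from the ConfigNode code.

```java
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for the DataNode status seen by the ConfigNode.
enum SketchNodeStatus {
  RUNNING,
  UNKNOWN
}

class DeletePeerRetryPolicy {
  // Assumed value for the "full retry" case; the real limit is not stated in the PR text.
  static final int FULL_RETRY_ATTEMPTS = 3;

  // An Unknown node gets exactly one attempt; any other status gets the full budget.
  static int maxAttempts(SketchNodeStatus status) {
    return status == SketchNodeStatus.UNKNOWN ? 1 : FULL_RETRY_ATTEMPTS;
  }

  // Invokes the RPC until it succeeds or the attempt budget is used up.
  static boolean callWithPolicy(SketchNodeStatus status, BooleanSupplier rpc) {
    int limit = maxAttempts(status);
    for (int attempt = 1; attempt <= limit; attempt++) {
      if (rpc.getAsBoolean()) {
        return true;
      }
    }
    return false;
  }
}
```

If the caller already re-runs a failed step, as the phrase “double retry” suggests, capping the inner loop at one attempt for an unreachable node keeps the total number of RPCs bounded by the outer loop alone.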
      String path = buildPeerDir(storageDir, groupId);
      File file = new File(path);
    - if (!file.mkdirs()) {
    + if (!file.exists() && !file.mkdirs()) {
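The reason the original check fails on recovery is that `File.mkdirs()` returns `true` only when it actually creates the directory, so it returns `false` for a directory that is already there. A standalone sketch of a directory check that tolerates an existing directory is shown below. `PeerDirSketch` and `ensureDir` are hypothetical names, and the condition is a variant of the one in the diff rather than a copy: it uses `isDirectory()` instead of `exists()`, so a regular file at the same path is still rejected.

```java
import java.io.File;

// Hypothetical helper, not part of IoTConsensus.
class PeerDirSketch {
  // Returns the directory, creating it if missing and reusing it if it already exists.
  static File ensureDir(String path) {
    File dir = new File(path);
    // mkdirs() is false for an existing directory, so check isDirectory() first;
    // the trailing isDirectory() covers a concurrent creation by another thread.
    if (!dir.isDirectory() && !dir.mkdirs() && !dir.isDirectory()) {
      throw new IllegalStateException("Unable to create or reuse directory " + path);
    }
    return dir;
  }
}
```

Calling `ensureDir` twice with the same path succeeds both times, which is the behavior a restart after a crash needs.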
Comment on lines +487 to +491

    protected boolean shouldAssignReplicateIndex(final Event suppliedEvent) {
      return !(suppliedEvent instanceof ProgressReportEvent)
          && DataRegionConsensusImpl.getInstance() instanceof IoTConsensusV2
          && IoTConsensusV2Processor.isShouldReplicate((EnrichedEvent) suppliedEvent);
    }
Summary
[...] `Unknown` [...] `createLocalPeer` to reuse an existing consensus directory after cluster crash recovery
Validation