HBASE-28659 Fix NPE in hmaster RegionStates.setServerState when the ServerStateNode is missing#8339
HBASE-28659 Fix NPE in hmaster RegionStates.setServerState when the ServerStateNode is missing#8339SwaraliJoshi wants to merge 3 commits into
Conversation
|
So this could only happen where is actually no regions on the region server? Otherwise when loading regions in AssignmentManger, we will create the server node? And I think we should create the server node here since this is not something incorrect, a warning message here will confusing the users... |
…pping Addresses review feedback: rather than logging a warning and skipping when the ServerStateNode is missing, recreate it. These split helpers only ever run for a crashed server inside an SCP, and the SCP removes the node again via removeServer at SERVER_CRASH_FINISH, so creating it on demand is correct and avoids a confusing warning. Update the regression test accordingly. Co-authored-by: Cursor <cursoragent@cursor.com>
@Apache9 I have made the changes. Please take a look at it. |
| // The node was recreated and reflects the last split state set above. | ||
| ServerStateNode serverNode = stateMap.getServerNode(crashedServer); | ||
| assertNotNull(serverNode, "the ServerStateNode should be recreated by setServerState"); | ||
| assertTrue(serverNode.isInState(ServerState.OFFLINE)); |
There was a problem hiding this comment.
nit: It is hard but it would be great if in thist test we can also validate that serverNode will get removed from the map after complition of SCP.
Umeshkumar9414
left a comment
There was a problem hiding this comment.
left a nit, otherwise fix looks good to me.
|
@Apache9 Requesting you to kindly take a look at this |
|
@Apache9 Please review this fix. |
|
@SwaraliJoshi do we have backport PRs ready for branch-2 and branch-2.6? I can merge them all together. |
|
@virajjasani No sir, I have not created the backport PRs yet. I thought once this is merged I can create those as well. |
| // in meta, so no node was recreated for it. See HBASE-28659. These split helpers only ever run | ||
| // for a crashed server inside an SCP, so recreate the node rather than NPE; the SCP will remove | ||
| // it again via removeServer at SERVER_CRASH_FINISH. | ||
| ServerStateNode serverNode = |
There was a problem hiding this comment.
I think we have a method for creating ServerStateNode? Let's only create it when necessary.
Always creating it here may hide other bugs...
There was a problem hiding this comment.
We will only create the serverNode when it is absent. I have updated the call with method createServer which does the exact same computeIfAbsent call as earlier (since this is cleaner and more readable).
Without this, it will throw NPE so we need to check if it exists and create if it doesn't.
@Apache9 Please take a look now.
There was a problem hiding this comment.
What I mean is that, we should change the code in SCP? If there is no server state node, we should create it before calling setServerState, but we'd better not creating server state node automatically inside setServerState.
Problem statement
A ServerCrashProcedure (SCP) can crash the PEWorker with a NullPointerException while processing a dead RegionServer, after a Master restart:
setServerStatedereferences the result ofgetServerNode(serverName)insidesynchronized (serverNode), butgetServerNodeis documented to return null when no node exists:This is worse than a single failed procedure: the SCP does not support rollback, so the uncaught exception triggers an unsupported rollback (
... does not support rollback but the execution failed and try to rollback, code bug?). The crashed server's WAL split and region reassignment never complete — and since this SCP carried meta, meta recovery is abandoned, which can wedge the cluster until manual intervention.Root cause
ServerStateNodes live only in the in-memoryRegionStates.serverMap; they are never persisted. On Master startup the map is rebuilt only from:RegionServerTracker.upgrade()->createServer()for the live set:rsListStorage.getAll()∪MasterWalManager.getLiveServersFromWALDir()(WAL dirs that do not end in-splitting),ServerManager.recordNewServerWithLock),hbase:meta(AssignmentManager.loadMeta->createServer(regionLocation)for any region whose persisted location is that server).Servers that only have an outstanding SCP are added to the deadservers list by
findDeadServersAndProcess, but noServerStateNodeis created for them. So if a persisted SCP resumes for a server that:-splitting(counted as "splitting", explicitly excluded from "live"),ServerName), andloadMetadoes not recreate the node),…then
getServerNode(oldServerName)returns null, and the nextsetServerStatecall (logSplittingatSERVER_CRASH_SPLIT_LOGS) NPEs.This is deterministic given that on-disk/in-memory state — it is not a timing race. It is "rare" only because reaching that state requires a Master restart to land inside an SCP's WAL-splitting window. A 2.5.8 -> 3.0 upgrade (rolling Master failovers while RegionServers bounce) widens that window dramatically, which is why the reporter only saw it during migration.
Timeline (from the attached hbase--master log)
The Master restart and the SCP are independently triggered events (a RegionServer crash created the SCP; the Master process was terminated separately). The bug is the coincidence of the two.
Solution
Recreate the
ServerStateNodeinsetServerStatewhen it is missing, instead of NPEing:This single change covers all four split helpers that route through
setServerState:metaLogSplitting,metaLogSplit,logSplitting, andlogSplit.These split helpers are only ever called from
ServerCrashProcedurefor a crashed server, and the SCP removes the node again viaremoveServeratSERVER_CRASH_FINISH. So creating the node on demand is the correct, self-cleaning behavior: the resumed SCP records its split state as usual, and the node is torn down when the SCP completes. This also follows the reviewer's suggestion — a missing node here is not an error condition, so a warning would only confuse operators; instead we just (re)create it, consistent with howcreateServeralready builds nodes with the samecomputeIfAbsent.Impact
setServerStateduring SCP replay after a Master restart/failover.ServerStateNodeexists). When it is absent, the node is recreated and cleaned up by the SCP atSERVER_CRASH_FINISH.Testing
TestRegionStates:testLogSplittingCreatesNodeWhenServerNodeMissing— reproduces the post-restart scenario (a server with noServerStateNode) and asserts all four split helpers (metaLogSplitting,metaLogSplit,logSplitting,logSplit) run without throwing, recreate the node, and record the final split state. Before the fix this threw NPE.testLogSplittingOkWhenServerNodePresent— contrast case confirming normal behavior when the node already exists.TestSCP,TestSCPWithMeta(meta-carrying scenario from the JIRA),TestRollbackSCP.