animal_shogi: fix path-dependent Zobrist hash that undercounted repetitions#1326
Open
gweber wants to merge 1 commit into
Open
animal_shogi: fix path-dependent Zobrist hash that undercounted repetitions#1326gweber wants to merge 1 commit into
gweber wants to merge 1 commit into
Conversation
…itions
The Zobrist hash was not a pure function of position:
- on capture, the captured piece was removed from the hash at the moving piece's `from`
square instead of its actual `to` square (_step_move)
- the drop hand-count update used {num_hand+1, num_hand} while capture used {num_hand-1,
num_hand}, so drop was not the inverse of capture (_step_drop)
- the hand term was indexed by the post-flip turn
This made the hash path-dependent, so repetition detection silently undercounted (two
identical positions could hash differently), affecting sennichite draws and the repetition
observation planes. Fix: recompute the hash from scratch as a pure function of the canonical
position each step. Adds a regression test (fails on the old code).
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
animal_shogi's Zobrist hash is not a pure function of the position — it is path-dependent — so repetition detection silently undercounts. Two identical positions reached via different move orders can hash differently, which means sennichite (repetition) draws can be missed and the repetition observation planes (observation[:, :, 22:24]) can be wrong.Root cause
The incremental hash updates in
_step_move/_step_dropare inconsistent:_step_move): the captured piece sits onaction.to, but it was XOR-removed from the hash usingzb_from_(the moving piece's origin square) instead of thetosquare._step_drop): drop XORed{num_hand+1, num_hand}while capture XORed{num_hand-1, num_hand}, so a drop was not the inverse of the capture that produced the hand piece._turn.Because the per-step updates aren't invertible, the hash drifts with history.
Fix
Recompute
_zobrist_hashfrom scratch as a pure function of the canonical (turn-0 frame) position after each step (_compute_hash), and drop the now-unnecessary incremental hash bookkeeping in_step_move/_step_drop. Equal positions now hash equally by construction.INIT_ZOBRIST_HASHis derived from the same function.Test
Adds
test_zobrist_hash_is_pure_position_function: random self-play (which exercises captures and drops), asserting any two states with identical(_board, _hand, _turn)have identical_zobrist_hash. This fails on the current code (e.g.569813267 != 1222688014for an identical position) and passes with the fix. All existinganimal_shogitests still pass.How it was found
Surfaced while differentially testing a re-implementation against pgx: a game had an identical position at ply 10 and ply 20 (same board/hand/turn) but different pgx hashes, so pgx reported
rep == 0where the true repetition count was ≥ 1. No king-capture or other exotic state is involved; ordinary captures trigger it.