MatPlus.Net

 Website founded by
Milan Velimirović
in 2006

10:10 UTC
ISC 2025
 
  Forum*
 
 
 
 

Username:

Password:

Remember me

 
Forgot your
password?
Click here!
SIGN IN
to create your account if you don't already have one.
CHESS
SOLVING

Tournaments
Rating lists
1-Jul-2025

B P C F





 
 
MatPlus.Net Forum General How many legal moves with distinct standard algebraic notation strings are there?
 
You can only view this page!
(1) Posted by Alice Bob [Saturday, Jun 15, 2024 02:25]

How many legal moves with distinct standard algebraic notation strings are there?


Hi, I'm here looking for advice after finding many conflicting numbers over the internet. There's a recent video "The rarest move in chess" which analyzed game data from Lichess and it shows about 30% of possible move notations (21373 out of 29754 according to the video) never appeared in the data, that made me wonder exactly how many of these are possible. Maybe most are weird bishop/knight (double) disambiguation cases that are just unlikely (and if you are wondering, the rarest type of move was "Bishop double disambiguation capture mates" - people have been intentionally playing these moves on Lichess since the video came out), but I wouldn't be surprised if some are just impossible. Another project "SAN-strings" on GitHub calculated 29274 (but included some impossible check/mate cases), and yet another article on Chess.com "Think you know algebraic notation?" got 29724 (but included some cases like R1a1 only possible with non-standard disambiguation), and I'm sure there are more numbers out there.

Specifically I'm asking about the possible legal move notations as used in popular chess platforms: moves with capture and/or check or checkmate symbols are counted as distinct (and these symbols are not optional), disambiguation is only used when necessary, and if either rank or file is enough, always only use file disambiguation, and check/checkmate symbols do not affect disambiguation. Also, pawn moves have rank and file (like e4, exd5), promotion uses "=" and en passant is not specified in their notation. Castling is written with O's rather than 0's. No special notation for double check. Other things like draw offers and annotation symbols are not included.

A further question: stalemates don't have an official symbol and aren't considered separate types of moves in the video, so I wonder if stalemate had its own symbol, can all these moves (that are not obviously incompatible, like checks/checkmates) end the game in stalemates, or even other ways to end the game like 3-fold or 50-move draws?

Also, all the impossible patterns below are because of geometric constraints in the current position. Are there any moves/possibilities that are impossible for less obvious reasons, like requiring retroanalysis to rule out?

I haven't found an online source that comprehensively list all types of moves to count and exclude, so I will attempt to list what I have found.

Castling: uncontroversial, 6 possibilities (2 moves that can also be checks or checkmates)

Pawns: I think it's uncontroversial: 6*8 normal moves, 8*2*4 move promotions, (6*2+2)*6 normal captures, (6*2+2)*2 en passant captures (not directly shown in the data because they omitted e.p.), (6*2+2)*2*4 capture promotions, total 336 (308 shown in the data). No disambiguation needed, and I don't see any reasons why any of them can't be a check or checkmate. In fact the data shows all 924 (3*308) distinct moves, so I believe they all can be checks or checkmates.

Kings: 64 normal moves and 64 capture moves; all of these can be checks or checkmates so 384 total. The data shows 382 distinct ones, so somehow 2 are missing. (Kings can't give check except discovered check, so if the king moves from the corner or along the edge of the board it can't check or checkmate, but this doesn't matter for standard notation that doesn't care where it came from. If disambiguation is allowed even when unnecessary then it would matter.)

For other piece types, the calculation is more complicated and some authors didn't show the details or used a program instead of doing manual analysis (and I'm not sure who is correct), so I will only list some non-exhaustive observations:

Queens: for instance, some double disambiguated queen check/checkmates are impossible because queens can't discover check and the queen that moved can't attack new squares in specific configurations with other queens. For example, Qb2a1 cannot cause new attacks because other queens must have been on b1 and a2 as well, so Qb2(x)a1+/# is impossible, similarly for . This seems to be ignored in some calculations I found.

Bishops: some disambiguation are impossible like Bb2a1/Bba1/B1a1. Bishops can discover checks, but some configurations make it impossible, such as Ba1b2+/# without capture is impossible, as it can't attack new squares because other bishops were on c1 and a3. (Bishop corner-to-corner moves also can't give check, but disambiguation can't specify these)

Rooks: they can't have double disambiguation; they can't discover checks when moving along the edge of the board, and some forms of rank disambiguation like R1a1 are impossible

Knights: not sure


Sources:

https://www.youtube.com/watch?v=iDnW0WiCqNc (video that inspired my question)
https://github.com/paralogical/rarest-move-in-chess (analysis code for the video)
https://chess.stackexchange.com/questions/45636 (question that led me to this forum)
https://github.com/jacksonthall22/SAN-strings (software that tries to list all possible notation strings)
https://www.chess.com/blog/kurtgodden/think-you-know-algebraic-notation (another article that calculates number of possible strings, slightly different disambiguation handling)
https://lichess.org/forum/general-chess-discussion/how-many-possible-moves-are-there-in-algebraic-notation (another post that tries to calculate)
 
(Read Only)pid=26057
(2) Posted by Jackson Hall [Sunday, Dec 22, 2024 22:50]

Hey Alice,

I'm the author of the SAN-strings repo (https://github.com/hyprchs/SAN-strings). For whatever reason, it's been a fascination of mine for a while to nail down this exact number too, probably because anyone who has spent some time thinking about it will find it's by no means a trivial problem.

There are tons of unintuitive edge cases to work out if you try to solve it "per piece", which is what I tried originally (see the README for the per-piece strategy I was using at this earlier commit: https://github.com/hyprchs/SAN-strings/tree/d43fb25658d3592084cb12957c5a812778cd8cb8). Later, someone opened an issue showing some clearly valid SAN moves that had my code was not generating: https://github.com/hyprchs/SAN-strings/issues/1. After investigating, I realized that I missed some edge cases for bishops and rooks despite having thought through all the logic in considerable detail.

Around this time I decided that trying to exhaustively solve the per-piece disambiguation logic is not only extremely hard, but also very hard to prove to be correct to other curious forum-goers for whom this question crops up as a shower thought and who are looking more for an answer that's verifiable using simple, explainable logic that jives with their understanding of the rules of chess than an exact number—especially one that was produced by a cold, brute force calculation (boooring). When an inquisitive chess player asks "how many distinct SAN moves are there?", I think what they really would be happy to know is, "how many distinct SAN moves are there, and what about the geometry of a chess board and the syntax of SAN necessitates the answer being correct?"

Suffice it to say, it would be easy to write a program that puts one of each piece type on each square and moves it to every other unoccupied square, then puts two of each piece type on every combination of two squares and moves both to every unoccupied square, then repeats with three of each piece type, and arrive at the correct answer. However, this strategy not only kills the inherent mystery in the problem but also is computational overkill (certainly would have to double-count thousands of SANs) for a problem that I think most chess players would agree should be solvable with a more mathematical and intuitive approach. It's certainly out-of-scope for my project. I also have not been able to think of any practical use case for knowing the exact answer anyway. The only thing I've seen before is people wanting a comprehensive list of valid SAN strings to test other strings against to see if they are valid SANs, but then `python-chess` has `chess.SAN_REGEX` for this use case.

After updating my repo to use more general logic, I'm quite confident that 29,274 can be considered an upper bound on the true answer, but after watching that video about the rarest move in chess, I realized too that I had not accounted for the cases where certain disambiguated moves cannot cause new squares to be attacked. I have this issue up, which I'll hopefully get to someday: https://github.com/hyprchs/SAN-strings/issues/4. To me, an adequate solution to this problem would be piece-agnostic and start from the ground up by considering what conditions about the piece configuration must be true for such a case to occur and what conditions must be true for that case to be impossible. I think if those questions can be answered intuitively, it should be possible to add a simple check in the existing code to avoid adding certain SAN strings in the appropriate cases.

In the meantime, you could open an issue in my repo if you want to investigate together further.

En passant,
Jackson
 
 
(Read Only)pid=26506
(3) Posted by Joost de Heer [Monday, Dec 23, 2024 10:31]

-
 
 
(Read Only)pid=26507

No more posts


MatPlus.Net Forum General How many legal moves with distinct standard algebraic notation strings are there?