Statistics on chess games

My numbers agree with numbers independently computed by other people, which is a good sign that my program is bug-free. Some column headings link to the corresponding integer sequence from The On-Line Encyclopedia of Integer Sequences.

I am ignoring the draw by 3-fold repetition because it is a pain to take into account. Actually, a draw by 3-fold repetition isn't automatic: one of the players must make a correct claim for the draw to occur. So if it's legal to ignore the repetition of position, then I believe that it's ok to enumerate those games. Draw by 5-fold repetition (a rule introduced in 2014) is automatic and should affect the number of games. Initially, the rule was not very clear, with one interpretation suggesting that the earliest it can apply is at ply 22. In 2017, the rule was modified, and it is now clear that the earliest draw by 5-fold repetition occurs at ply 16, reducing the number of games at ply 17 by 16^4*20 = 1310720. Draw by the 50-move rule (not automatic), draw by the 75-move rule (automatic), and draw by impossibility of checkmate (automatic) don't apply before even more moves. For the complete rules of chess (including past versions since 2009), look for Laws of Chess in the FIDE Handbook, section E.01.

*Number of chess games ending...* (Percent)
	without check nor checkmate	in check	in checkmate	total (perft)
ply 0	1	0	0	1
ply 1	20	0	0	20
ply 2	400	0	0	400
ply 3	8890	12	0	8902
ply 4	196812	461	8	197281
ply 5	4838258	27004	347	4865609
ply 6	118251225	798271	10828	119060324
ply 7	3162798012	32668081	435767	3195901860
ply 8	84029997363	959129557	9852036	84998978956
ply 9	2403434332264	35695709940	400191963	2439530234167
ply 10	68265214423776	1078854669486	8790619155	69352859712417
ply 11	2058141026024096	39147687661803	362290010907	2097651003696806
ply 12	61622159616190772	1224448528652016	8361091858959	62854969236701747
ply 13	1936467500406079794	44252532348552226	346742245764219	1981066775000396239
ply 14				(*) 61885021521585529237
ply 15				(**) 2015099950053364471960

(*) Computed by Peter Österlund and confirmed by Ankan Banerjee, but not by me... yet!

(**) Computed by Ankan Banerjee but not by me... yet!

Discovered check, double check, etc.

It's hard to find definitions of "discovered check" and "double check" written by someone who understands the fine details. Here are typical errors:

"a discovered check is a check given by one piece when another friendly piece moves out of the way". This excludes cases where the check happens because an enemy pawn gets captured en passant.

"a double check is a discovered check where the moving piece also gives check". This definition excludes cases where a king is under two discovered attacks. Also in general the expression "the moving piece" sounds awkward, because two pieces move during castling.

My definitions:

direct check: When a piece which moved gives check.
discovered check: When a piece which didn't move gives check.
double check: When two pieces give check. Split into two categories: "direct and discovered check" and "double discovered check".

*Breakdown according to the type of check*
	direct check only	single discovered check only	direct and discovered check	double discovered check	total
ply 0	0	0	0	0	0
ply 1	0	0	0	0	0
ply 2	0	0	0	0	0
ply 3	12	0	0	0	12
ply 4	461	0	0	0	461
ply 5	26998	6	0	0	27004
ply 6	797896	329	46	0	798271
ply 7	32648427	18026	1628	0	32668081
ply 8	958135303	847039	147215	0	959129557
ply 9	35653060996	37101713	5547221	10	35695709940
ply 10	1077020493859	1531274015	302900733	879	1078854669486
ply 11	39068470901662	67494850305	11721852393	57443	39147687661803

*Breakdown according to the type of checkmate*
	direct checkmate only	single discovered checkmate only	direct and discovered checkmate	double discovered checkmate	total
ply 0	0	0	0	0	0
ply 1	0	0	0	0	0
ply 2	0	0	0	0	0
ply 3	0	0	0	0	0
ply 4	8	0	0	0	8
ply 5	347	0	0	0	347
ply 6	10828	0	0	0	10828
ply 7	435767	0	0	0	435767
ply 8	9852032	4	0	0	9852036
ply 9	399421379	1869	768715	0	400191963
ply 10	8771693969	598058	18327128	0	8790619155
ply 11	360675926605	60344676	1553739626	0	362290010907

Thanks to Joost de Heer for computing ply 11 with my program on his computer, although he did it for the chess problems, not the statistics!

Perft

The number of chess games at ply n is often called perft(n) and is probably the most interesting chess sequence (see the right-most column of the first table of this page). The name "perft" comes from a command in the chess playing program Crafty. As far as I can tell, the first computations were

perft(10) in 2003 by Richard Bean,
perft(11) in 2003 by a distributed project led by Albert Bertilsson,
perft(12) in 2006 by Paul Byrne,
perft(13) in 2011 by Paul Byrne.
perft(14) in 2013 by Peter Österlund.
perft(15) in 2017 by Ankan Banerjee.

Richard Bean wrote an article that has some of the earlier history (click on the cached version).

Perft estimation for higher plies

In 2007, I calculated fairly good estimates for perft 13 and higher by randomly pruning the game tree:

*Perft estimates*
n	perft(n)
13	1.9810e+18
14	6.187e+19
15	2.015e+21
16	6.507e+22
17	2.175e+24
18	7.214e+25
19	2.462e+27
20	8.350e+28

Perft ratios

The behavior of the perft(n)/perft(n-1) ratio is more interesting:

Ignoring the 75-move rule, I expect the ratio to converge to some value, the "maximum sustainable fan-out", probably somewhere around 84.3. Below is a position which appears to sustain such an expansion factor indefinitely if we make non-capturing moves only.

k1qrq3/rq4Qq/3q1Q2/Q6q/3Q4/1Q4Q1/R1q1Q2Q/KQR2q1q

Games leading to this type of position are very rare at first, but they must eventually dominate the perft count because the fan-out is much bigger than in "normal" games.

Thanks to Andrew Buchanan for pointing out that 18 queens are impossible with only 6 captures, and for giving some ideas on how to optimize the example.

Other people's estimates

In 1878, Edwin Anthony published an article titled "The Inexhaustibility of Chess" in The Chess Player's Chronicle, Volume 2, p193-196, where he made the estimate

perft(8) = 20*20*28*29*30*31*33*32 = 318,979,584,000 ± 20%

and suggested multiplying by 30 for each additional ply. The correct value is now known to be perft(8) = 84,998,978,956, so his estimate was off by 275%.

An updated version of the article appears in James Mason's book The Principles of Chess in Theory and Practice, 4th edition (1902), p322-324 (and also in the 2nd (1896) edition and on, according to Mario Richter). The updated article adds the explicit estimate

perft(20) = [above]*30*30*30*30*30*30*30*30*30*30*30*30 = 169,518,829,100,544,000,000,000,000,000.

Those estimates shouldn't be considered accurate since the article itself states that the goal was only to get "something of an approximation". Still, the factors that were multiplied together are quite off compared to the modern perft ratios, and the errors quickly accumulate. In fact, the ratios are not very good even for 1896 because, according to MathWorld, an almost-correct value of perft(4) was known in 1889, so the product should have started 20*20*22.255*22.163 and not 20*20*28*29.

Mario Richter tells me that Karl Fabel wrote a critical article in Die Schwalbe about Anthony's estimation and gave his own estimation as perft(20) = 0.976437504 * 10^29, which is fairly close to mine (8.350e+28).

Perft disinformation

Unfortunately, Edwin Anthony's approximations above gained the status of "fact", possibly because the non-significant digits were not rounded off in the article, so readers assumed they were all significant. As an example, the Isaac Asimov's Book of Facts reports

perft(8) = 318,979,564,000 (Anthony's value with a typo!)

and not the correct (exact) value

perft(8) = 84,998,978,956.

As of January 6, 2020, the erroneous value is leading 5920 to 215 in a Google Fight (318979564000 vs 84998978956). Anthony's perft(20) value appears on multiple trivia pages on the Web.

Games ending in mate-in-n

Here we consider games at ply p that lead to a position that is mate-in-n moves (assuming perfect play). We also consider games where the player on the move is losing under perfect play. Both of these are only possible if one of the players made a game-losing blunder in the opening. The goal is to obtain a depth-to-mate (DTM) analysis of chess openings similar to endgame tablebases. Numbers for "lose-in-0" are the same as the number of games ending in checkmate from the first table of this page.

The number of games ending in win-in-n at ply p is similar to the number of games ending in lose-in-(n-1) at ply p+1. For example, at ply 5 there are 10822 games leading to a win-in-1. Among them, 6 games (such as 1.f3 e6 2.g4 Bd6 3.h3) lead to a win-in-1 with two keys (two mating moves). This gives a total of 10828 games ending in lose-in-0 at ply 6. Click on a number in the table below to access a file with the games. I thank Serge Zhemeytsev for the idea of computing these numbers.

*Number of games ending in mate-in-n* (Percent)
	win in (mate in)...					lose in (mated in)...
	1 move	2 moves	3 moves	4 moves	5 moves	0 moves	1 move	2 moves	3 moves	4 moves
ply 0	0	0	0	0	0	0	0	0	0	0
ply 1	0	0	0	0	0	0	0	0	0	0
ply 2	0	0	0	0	0	0	0	0	0	0
ply 3	8	8	0	0	0	0	0	0	0	0
ply 4	347	303	25	0	0	8	8	0	0	0
ply 5	10822	7891	540	18	14	347	315	25	0	0
ply 6	419046	247827	29087	1403		10828	8212	547	19
ply 7	9581293	4898320	615559			435767	269465	30882	1659
ply 8	377316151	153916676				9852036	5324016	665748
ply 9	8391307628	3105258919				400191963	175265641
ply 10	339181405568					8790619155	3520797365
ply 11						362290010907
ply 12						8361091858959
ply 13						346742245764219

page last updated: January 6, 2020
back to Chess Problems by Computer