
Cheating at Chess—Not Again

September 21, 2022


Play the opening like a book, the middle game like a magician, and the end game like a machine — Rudolf Spielmann

Kenneth Regan is my dear friend and co-writer of this blog. He obtained his doctorate—technically D.Phil not PhD—in 1986 for a thesis titled On the Separation of Complexity Classes from the University of Oxford under Dominic Welsh. He has, however, been enmeshed this month in a story quite separate from complexity classes.

It was Ken’s birthday just last week and we wish him many more.

Cheating at Chess

Ken was the 1977 US Junior co-champion and once held the record of youngest USCF Master since Bobby Fischer. He holds the title of International Master with a rating of 2372. Ken is perhaps the strongest chess player ever with a doctorate in complexity theory.

He is certainly the world’s best at the combination of complexity theory and cheating at chess. Ken is one of the leading experts in detecting cheating in games played in real tournaments.

He has, however, been occupied by a major story that erupted after the world champion, Magnus Carlsen, lost to the American teenager and bottom-rated participant Hans Niemann in the third round of the Sinquefield Cup in St. Louis. The next day, Labor Day, Carlsen abruptly withdrew from the tournament with no explanation beyond a cryptic tweet. This was widely regarded as an insinuation of some kind of cheating. Ken was involved in daily monitoring of the event and was cited in a subsequent press release as having found nothing amiss.

Nevertheless—really ever more so—this has sparked renewed discussion of cheating at chess and of measures to protect tournaments at all levels. Let’s go into that.

Detecting Cheating

How does one cheat at chess? Imagine Bob is playing a game in a live chess tournament. Bob is a strong player but is not nearly as strong as his opponent Ted. How does Bob cheat?

The basic idea is quite simple: Bob uses a computer program {P} to make moves for him. He enters Ted’s moves into {P} and then plays its replies. The reason this is so powerful is that the rating of the program {P} is likely much higher than Ted’s. It could be rated at 3000 or even higher. This means that Bob is unlikely to lose to Ted and may well beat him.

The challenge for Bob in cheating this way is that he must ask the program {P} for its moves without being detected. Bob is not allowed to have a digital device like a phone or a laptop with him during play. He must enter Ted’s last move and then follow {P}’s reply without anyone noticing that he invoked the program. This is the challenge that the cheater must solve.

The cheater may be able to send the moves to the program {P} in various ways. In some cases cheaters have been found using a hidden device to pass this information to {P}, and equally clever ways to get the moves back from it.

Why Is Detection Hard?

Ken is one of the world’s foremost experts on using predictive analytics to help detect computer-assisted cheating in chess tournaments. Why is this hard? There are several reasons, but the central point is expressed by Alexander Grischuk, who notes that “only a very stupid Bob who stubbornly plays the computer’s first line” is likely to get detected.

Let’s examine what Grischuk means. Bob, as above, is trying to use {P}’s moves to defeat Ted. Grischuk’s point is that Bob is foolish if he blindly plays the first move that the program {P} suggests. Programs often suggest more than one move that is safe to play, which makes detection much harder.

An even more powerful point: what if Bob consults more than one program? Perhaps Bob checks the top moves from several programs {P_1, P_2, \dots, P_6}. This could make detecting his cheating even more difficult.
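To make Grischuk’s point concrete, here is a toy sketch in Python (my illustration, not Ken’s model; all function names and move data are invented). A detector that counts only agreement with a single engine’s first line sees a much weaker signal once Bob may play any move from the top-k suggestions of any of several engines:

```python
def top1_match_rate(player_moves, engine_lines):
    """Fraction of the player's moves equal to one engine's first choice."""
    hits = sum(1 for mv, line in zip(player_moves, engine_lines) if mv == line[0])
    return hits / len(player_moves)

def any_topk_match_rate(player_moves, per_engine_lines, k):
    """Fraction of moves appearing in the top-k list of at least one engine."""
    hits = 0
    for i, mv in enumerate(player_moves):
        if any(mv in lines[i][:k] for lines in per_engine_lines):
            hits += 1
    return hits / len(player_moves)

# Invented toy data: four of Bob's moves and two engines' ranked suggestions.
bob = ["e4", "Nf3", "Bb5", "d4"]
engine_a = [["e4", "d4"], ["Nc3", "Nf3"], ["Bb5", "Bc4"], ["O-O", "d4"]]
engine_b = [["c4", "e4"], ["Nf3", "g3"], ["Bc4", "Bb5"], ["d4", "c3"]]

print(top1_match_rate(bob, engine_a))                     # 0.5
print(any_topk_match_rate(bob, [engine_a, engine_b], 2))  # 1.0
```

With the toy data, Bob matches engine A’s first line only half the time, yet every one of his moves is “safe” according to some engine’s top two, which is precisely the dilution Grischuk describes.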

Bob could use similar ideas to make detecting that he is consulting a program even more complicated. This is why Ken’s task of checking whether cheating occurred is so difficult. He tries to stay ahead on the detection end. For instance, his model is not predicated on identifying which program was used, and the provisionally deployed ideas explored with his students here quantify departure from human predictivity independently of any particular program.
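For a flavor of how departure from human predictivity can be quantified, here is a simplified sketch (my own illustration, supplied under assumptions; Ken’s actual model is far more elaborate, fitting move-matching probabilities to rating and position). Given a per-move probability that a human of the player’s rating would play the engine’s choice, the observed match count yields a z-score:

```python
from math import sqrt

def match_zscore(match_probs, matched):
    """z-score of observed engine matches against a human-play model.

    match_probs[i]: assumed modeled probability that a human of this
                    rating plays the engine's choice on move i.
    matched[i]:     1 if the player actually matched on move i, else 0.
    """
    mu = sum(match_probs)                        # expected number of matches
    var = sum(p * (1 - p) for p in match_probs)  # variance, moves independent
    return (sum(matched) - mu) / sqrt(var)

# Invented numbers: a 100-move sample where the model expects a 50% match
# rate but the player matched the engine on 70 moves.
z = match_zscore([0.5] * 100, [1] * 70 + [0] * 30)
print(round(z, 2))  # 4.0
```

A z-score of 4.0 would lie well above, for example, a 2.50 reporting threshold; the point of the sketch is only that the test measures statistical deviation from expected human play, not agreement with any one program.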

Consult this for a recent claim that Niemann used anal beads to signal moves. Even Elon Musk raised this possibility. It is just an extreme example of why detecting cheating is tough.

Losing in Translation

The chess story took another twist when Carlsen and Niemann faced each other on Monday in the Julius Baer Generations Cup, an online tournament sponsored by Carlsen’s own organization. Carlsen played one move and then resigned the game—again giving no comment. Much effort has been expended in trying to translate exactly what Carlsen meant by losing in this manner.

Two years ago, a story in the Guardian newspaper subtitled “paranoia has become the culture” featured Ken and efforts to avert cheating in tournaments that were moved online on account of the pandemic. Its quotes of Ken included an example of translation from English to English:

“The pandemic has brought me as much work in a single day as I have had in a year previously,” said Prof Kenneth Regan, an international chess master and computer scientist whose model is relied on by the sport’s governing body, FIDE, to detect suspicious patterns of play. “It has ruined my sabbatical.”

What Ken actually said was, “It ate my sabbatical.”

Now Ken was mentioned in the Guardian yesterday and again today. Today’s mention linked a longer article on the ChessBase site explaining his methods and conclusions to date. Ken may have more to say after the developments—and ongoing media contacts—settle down.

Open Problems

How will chess come out of the current controversies? I hope Ken had a happy birthday in the meantime.

18 Comments
  1. September 22, 2022 7:19 am

    A colleague in a remote office recently tried to motivate me to play online chess. I used to play “competitive” chess when I was younger. I was never really strong, but I still enjoy playing chess, or solving chess puzzles. However, the possibility that my opponent in online chess could cheat (without me having any chance to detect it) totally kills my motivation to even try. My colleague tried to convince me that the online platforms have ways to detect cheating. Maybe I should believe him.

    For me, Hans’s position that he only ever cheated in online chess is hard to bear. Of course, he should be forgiven, but he should try to make some effort to understand what he has done. Not just the damage to the actual online opponents he cheated, but the damage to the wider community by spreading mistrust and killing motivation. For me, this alone already justifies Magnus’s actions, independent of whether Hans also cheated in offline tournaments.

  2. norbertv permalink
    September 22, 2022 9:50 am

    With respect to “why detection is hard”, another point is that a “careful cheater” would likely restrict cheating to a couple of moves each game. Especially for top-level players, such limited cheating would presumably still have a significant effect on performance.


  4. Peter Gerdes permalink
    September 23, 2022 3:43 am

    For in-person chess, why not just give up on the fancy math and insist the competitors play inside a full Faraday cage or submit to a TSA-style x-ray? Those machines are expensive, but you can probably rent one, and they are designed to detect this kind of stuff.

    Any cheating-detection strategy faces the absurdly hard challenge that to be credible it will need to publish its methodology, and players can simply design their cheating specifically to fool that methodology. Besides, the issue is as much one of creating confidence as of catching the cheaters, and even a highly effective analytic strategy will face a big challenge in convincing people it’s really catching them. So I just don’t see a way around physical security being used in some fashion (at the very least you can use it to get a fair-play baseline).

  5. javaid aslam permalink
    September 23, 2022 9:41 pm

    Dear Prof Lipton and Prof Regan,
    Hope you are not running out of topics to cover in this blog.
    If I may, I would like to suggest a topic: why has Graph Isomorphism been (almost) established as an “intermediate” problem between P and NP-complete?
    Most of the arguments I have read seem to be a chain of plausible conjectures claiming it is not NP-complete. They are related, but involve hand-waving approximations that are difficult to prove or refute.

    • paul permalink
      September 28, 2022 4:10 pm

      Proving or refuting that GI is NP-intermediate would be a sensational result, so it’s unsurprising that we don’t have a proof or disproof. Hand-waving approximations are the best we can hope for. Babai’s proof that GI is solvable in quasi-polynomial time is an argument against NP-hardness though.

  6. Marek Holeva permalink
    September 27, 2022 1:39 am

    I’ve been following this issue lately as well. What struck me was the amount of exposure to Kenneth Regan’s method in the chess and non-chess media.

    Kenneth Regan’s analysis certainly works pretty well in the case of less sophisticated cheating, especially if the suspect is a lower-rated player (say 2600 and below) who needs assistance even in more trivial positions.

    Nonetheless, as Fabiano Caruana pointed out here (https://www.chessdom.com/fabiano-caruana-i-would-take-regans-analysis-with-a-large-grain-of-salt/), and as a few other top GMs have as well, an already decent player (say 2600 and above) would gain a tremendous edge just from knowing that a particular position is critical. Elite players don’t even need to know which move is best, since they can often find it themselves once told that an outstanding move exists. There seems to be a consensus among top GMs that just a hint at 2-4 exceptional moments per game (but not the moves themselves) is sufficient for an unfair advantage.

    That being said, it seems that almost any algorithmic method, such as Kenneth Regan’s, can effectively decide whether trivial cheating has occurred. But spotting highly obfuscated means of cheating (for example, merely signaling that a particular position has a crucial move) may be out of reach of any known algorithm.

    To sum up, I see it this way: Kenneth Regan’s technique has found that no particular instance of trivial cheating happened. It did not, however, rule out more advanced cheating methods, such as indicating pivotal positions, which are more likely in elite events.

    • October 23, 2022 10:52 pm

      You are right about the advantage of getting just 1 bit of useful information at a critical juncture per game—I agree with Anand’s estimate that this is worth +150 Elo points. With my guess of which case Caruana was referring to having been confirmed, I can say that his statement with “exonerated” was factually inaccurate. His general point is fair enough, but needs this context: In sports doping cases, one would expect a chemical test to be definitive, but this too devolves into statistics. CAS Lausanne describes “comfortable satisfaction” as a judgment level above “balance of probability” and this is most often regarded as the target for disqualification actions. The mere fact of having this setup, even with a regime of physical tests—not just predictive analytics—means that there are going to be cases where you can get a fairly high balance of probability while being short of comfort level for a grave action.

      To those saying my model cannot be effective with 2600+ players, take a look at slide 20 of my 2012 Turing Centennial talk at the University of Bergen. This was using an early form of my model on just 71 moves from four of nine games by the 2650-rated GM Feller; the aggregate z-score was well over the FIDE 2.50 threshold.

  7. Ross Hallock permalink
    September 28, 2022 9:44 pm

    From what I saw on YouTube, it looked like Ken’s data was at the tournament level. That is, it looked like he was assessing the distribution of correspondence percentages for entire tournaments rather than individual games. Is that correct? If so, one could outfox the cheat detection by cheating in only one game of a nine-game tournament. Say you had 100% accuracy in one game and 50% accuracy in all the others; then your tournament accuracy would be about 55%, well within the normal curve. Am I missing something?

  8. September 28, 2022 11:13 pm

    Happy Belated Birthday Ken! I hope that you are doing well. 🙂

  9. October 3, 2022 9:16 am

    Could something like this (previously suggested by Fischer, etc.) confuse the state-of-the-art chess ML engines for a couple of months but not mess up human players too much?

    Roll a six-sided die at the start to get X \in {1…6}, and replace the major piece with index X for White and index (7-X) for Black, skipping K and Q, with a pawn.
    The replacement pawn at 1X/8(7-X) behaves like a regular pawn except that on its first move it is allowed to jump over a piece to advance two squares forward; this pawn cannot participate in castling.

    The ML part of chess engines is over-fit to a training dataset; it is undoubtedly superior to human memorization, but it is extremely brittle, and when we perturb the game even a little, the ML component should come crashing down.
    The engines’ tree-search algorithms have vastly superior accuracy compared to humans, but whether they are currently tactically/positionally superior to humans I don’t know; even so, tree search does not generalize to most real-world problems.

    There is no evidence that the current flavor of ML can generalize outside its training set, and very likely we will have another AI ‘winter’ once all the crazy investment money dries up (the latter is already happening).

  10. October 6, 2022 2:42 pm

    Hi, a couple questions.

    1. Does fidelity distinguish “cheating with an agent” versus “prepped with an agent”? If not, is there an argument that says “cheating with an agent” is more likely to be true than “prepping with an agent”?

    2. How many parameters are in the model?

    3. Is the code public?

  11. Tomas Klos permalink
    October 11, 2022 6:25 am

    Just FYI: the link “press release” to the blog on grandchesstour.org has a typo. The apostrophe in the word “arbiter’s” is currently straight ‘ but has to be smart ’. (Is that the term to use?) The correct version is as follows.
    https://grandchesstour.org/blog/2022-sinquefield-cup-chief-arbiter%E2%80%99s-statement
    Otherwise, thanks for another nice post!


  13. Finster Phace permalink
    November 4, 2022 3:42 am

    Hi, I’ve been following the Niemann insinuations lately and recently came across this website while looking for ways that cheating might be detected by impartial methods and models. I’m also a (hobbyist) developer of a (relatively weak) chess engine myself, and I’m really wondering what is currently possible in terms of cheat detection that would actually hold up to meaningful critical review. So I’m trying to formulate (basically two) questions that might yield further pointers for digging into this:

    1. I’ve tried to find any (preferably peer-reviewed) literature about the method / model / algorithm you are using to evaluate the likelihood of engine involvement in chess games supposedly played by humans. Has your work been received (and critically reviewed, ideally even cross-validated) by anyone else? Sadly, I’ve not been able to find any (peer-reviewed) publication by you that directly addresses cheat detection (rather than “just” evaluating general level of play).
    2. Has your method / model actually been used so far in any case where there was no other kind of evidence (like catching someone in the bathroom with their iPhone or whatnot), where you identified someone cheating solely on the basis of your method(s)? I ask because I have some doubts that purely statistical tests would hold up to the level of scrutiny required in cases where “positives” might mean serious (career-damaging or even criminal) consequences, especially in a context where no additional evidence is available.

    I’ve looked at quite a number of podcasts and articles covering a lot of your (very interesting!) work, but I’m just missing anyone ever really asking these (more critical) questions, hence my asking specifically for peer-reviewed work by you, and for the scientific community ideally picking up and critically reviewing your work (regarding “cheat detection”).

  14. davidlittleboy permalink
    November 4, 2022 9:48 pm

    Replying to space2001’s comment. In chess, unlike Go, the classical (non-MCTS non-ML) programs are holding their own as the top engines*. So at least for classical chess, that doesn’t work. (Also, my take is that traditional chess programs are basically better than humans at tactics, so randomization would only increase the machine advantage.)

    *: The last I checked, a non-MCTS non-ML engine was the strongest, with a hybrid engine and an MCTS/ML engine vying for second place. I’m surprised that MCTS is proving so much (relatively) less effective for chess than Go; I was expecting it to handle transitions to endgames effectively (i.e. being better able to find paths to winning (or non-losing from bad positions) endgames than regular programs or people).

  15. November 13, 2022 5:44 am

    Hello Prof Lipton and Prof Regan
    Since the days back in 2011 when I started following your blog, I can’t remember having waited so long as this time (almost two months) for a new posting.
    Hope everything is okay and we can read you again soon on a regular basis, as in all the years before.
    BR
    JL

    • November 14, 2022 2:56 pm

      Thanks very much for the good wishes. We are finally working on a post, which will include a note about the silence.
