The gambler’s revenge - AlphaZero, the brilliant universal chess champion



Is Magnus Carlsen better than AlphaZero? We don’t actually know, but it would be difficult to argue that he is. Magnus, appropriately, seems not to take too seriously this entity that plays chess better than anyone else in the universe, himself included. After all, Magnus is too young to be bothered that a piece of technology can perform a task better than humans. He did not grow up, as I did, in a place and time in which everything about technology and innovation was seen with great suspicion, to say the least. Although Italy is still an extreme case, many people were worried when Deep Blue won its match against Garry Kasparov, as reported by Tim van Gelder in a philosophical analysis in which all the usual concerns were carefully considered.[1]

When I started playing chess, everybody was worried by the idea that a computer could be better than humans at chess. Since the beginning of so-called AI, chess has been considered a good arena to test a machine’s intelligence, as stated by Turing himself in a cornerstone paper.[2] After the machines started to win, the players kept playing, but they used to grumble, a major, nice virtue of many chess players, as of all human beings, after all. At any rate, the rise of computers cheap enough to be used by really everybody did not decide anything about chess. They did not end the game, they did not calculate everything in advance (since that is computationally infeasible),[3] and they did not reduce the players’ interest in the game. On the contrary, chess is more alive than ever before. But there is one exception: the role of the “pundits”, the great authorities of the past, has changed once and for all.

It was a typical experience in chess clubs to listen carefully to some authority who could shut you up just because you had started to play a different opening, for instance. There was a common style in chess, based essentially on the fashion of the period, like everything else. The pundits and authorities of the past were really shocked by the rise of the computers precisely because their opinions then had less impact, if any, on everybody else. Even Magnus is sometimes careful not to express a definitive opinion on a move before checking it with a chess engine. But, again, this didn’t bring chess to an end. However, it changed the tales, the stories around the game, which are so important to all chess players. Stories change, and so do the chess players’ visions of the game. At the same time, the game is more alive than ever before, and the quality of chess players is still improving, even on average. At the end of the day, this is an important achievement. What has really changed, then, is the role of human opinion and, specifically for chess, the fact that chess is no longer the only game in town. After all, today there are real-time strategy games, now possible because of the explosion of computing power (strictly speaking, machines do not perform real-time computations, but it appears so to the players). The human player simply changed role, from being the only chess calculator in the universe to being supported by something else that calculates better and much faster.

However, nobody is really interested in watching machine games. Why? Because they don’t display anything really interesting. Eighty moves on average, truly imperceptible mistakes: definitely a competition between two objects. This is what it is. It is a race between two cars without drivers, something deeply silly. Two objects do not have any will and, insofar as chess is actually the struggle between two wills, let’s say two similar but different visions of the world, what kind of interest could be raised by two inert things that don’t even care whether they win or lose? None at all. ‘Wars’ between ants are not studied by military strategists or commanders because, from that point of view, they are simply not interesting. Watching a match between two computers has the same effect on a chess player: it could be interesting from a very abstract and distant point of view, but ‘real chess’ is another thing, and so a chess player prefers to watch an online game between two GMs instead of switching on his or her PC to watch a fight between machines. Nobody does that, simply. This was a perfect description of my life until I discovered AlphaZero. AlphaZero is another story.

Actually, I was very suspicious of the ‘revolution’ of so-called deep learning, namely a way in which software, implemented in a machine, is trained by reinforcing its ability to reach the expected outcome. I am still very suspicious of the narrative in which humans will ‘solve’ life, removing entropy with a magic technology: once chemistry, then psychology and, today, AI (and, unlike chemistry, AI still cannot feed a human being, and it is not able to heal our spirit by inventing something better). Also, I still find it difficult to believe that there is another finite intelligent entity, apart from humans, able to speak and reason. And I am not alone in thinking this. However, since computer engineers have to find metaphors to speak about their products (it is important to humanize things if you want to understand their behavior and communicate it to other people), and given the greedy tendency to sell slogans to increase budgets, you can hear everything about AI, machine learning and other amazing, revolutionary things that still increase the complexity of the human world, with some gains and some losses. So, of course, I was deeply skeptical about deep learning too, and about AlphaZero as an instance of it. However, something revolutionary happened this time. After all, revolutions are really unusual events, but they are still possible and we should never forget it.

The revolution is that the machine can generate its own data, and so it does not rely on any kind of privileged human suggestions on how to solve a problem. To be clear, machine learning works within a definite frame in which that amount of data can be generated by following strict rules, such as the rules of chess. For the moment, and this will last many years, machines, trained or not, cannot invent a new game. They would need interests, feelings and semantics, three things that are still unclear even for humans. So, the revolution is still stuck inside a definite, discrete, limited framework. However, AlphaZero is different from Stockfish, which has an opening book, tablebases for the endgame and many different evaluation functions, starting from the commonsensical material evaluation that follows the more or less canonical values of the chess pieces still employed by humans to teach other humans to play (pawn = 1, knight = 3, etc.). Still, at first I thought there was nothing new under the sun.
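To make the idea of a material evaluation concrete, here is a minimal sketch in Python. It is only an illustration of the textbook counting method, not Stockfish’s actual evaluation: the text above gives pawn = 1 and knight = 3, while the other values (bishop = 3, rook = 5, queen = 9) and the function name `material_balance` are assumptions added for the example.

```python
# Conventional piece values. Pawn = 1 and knight = 3 come from the text;
# the remaining values are the usual textbook ones (an assumption here).
PIECE_VALUES = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}


def material_balance(fen_board: str) -> int:
    """Return White's material minus Black's for the board field of a FEN string.

    Uppercase letters are White pieces, lowercase letters are Black pieces;
    digits (empty squares) and '/' (rank separators) are ignored.
    """
    balance = 0
    for ch in fen_board:
        value = PIECE_VALUES.get(ch.upper())
        if value is None:
            continue  # digit or '/'
        balance += value if ch.isupper() else -value
    return balance


# Starting position: material is equal.
START = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
print(material_balance(START))

# White has given up three pawns: pure material counting sees only a deficit,
# whatever activity the pieces may have gained in exchange.
THREE_PAWNS_DOWN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPP3/RNBQKBNR"
print(material_balance(THREE_PAWNS_DOWN))
```

A function like this is blind to everything except the count of pieces, which is exactly the kind of hand-written, human-supplied judgment that AlphaZero does without.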

However, I talked to Marco Bettalli, a strong Candidate Master and a renowned professor of Greek history at the Università degli Studi di Siena. Like many of his generation, Marco was a bit worried to discover that the summit had not even been reached by the ‘old machines’ such as Stockfish and Fritz. Indeed, like many others, he noted that those machines did not invent a new style, a new way to play. The ‘old machines’ were only able to play perfectly (and boringly) because they are simply better at calculating than humans, but basically they were stupid things unable to interest anybody, and nobody could have considered them ‘intelligent’ in any meaningful sense of the word. However, he was worried because AlphaZero is actually showing a completely new style and seems a quite different ‘thing’ compared to the old chess engines. Although he is the father of an enfant prodige and a very good chess player, so much so that I never even drew with him in the old happy days in Siena, I didn’t really believe him. But he gave me the first reason to take AlphaZero seriously, and then I discovered my favorite chess player of contemporary times.

Demis Hassabis, the CEO and co-founder of DeepMind, the Google-owned company that created AlphaZero (and, before it, AlphaGo, which beat Lee Sedol, the first time a machine defeated a world-class professional Go player), stated that the revolution in this respect is caused exactly by the fact that the machine invented something new, something really never seen before and something from which we can learn. And he is right. Although I suspect that there is nothing really intelligent behind AlphaZero, and there is no “Dwarf” inside “the Turk Machine”,[4] I really believe there is something interesting and relatively new under the sun. A bunch of ants finally managed to create a military strategy that could be interesting even for Clausewitz and Sun Tzu: this doesn’t mean it is the dawn of the ants, but it is the rise of something from which we can learn, still using Clausewitz and Sun Tzu, maybe in a different, more creative way. So, not only did I have a look at the AlphaZero games, but I also followed the fun and insightful chess analysis proposed by Anna Rudolf on YouTube, where she explains how AlphaZero plays (and it is a much-viewed video compared with the average chess video, which seems to prove that chess players are interested in AlphaZero).[5] Marco defined AlphaZero as a Martian, something that came from outer space. I disagree: AlphaZero is still an extension of how humans sometimes play, as I will try to argue, but the results are so astonishing that they deserve attention. They are really fun and amusing games. That is the thing! Finally, a machine is able to provide some entertainment, which is also insightful, and we should be humble enough to be willing to learn from something with nothing inside, a machine unable to appreciate the game that it plays.

Recently, AlphaZero played against one of the best ‘typical, old’ chess engines, Stockfish, which has a higher rating than Magnus Carlsen, the chess player who reached the highest Elo rating in history, surpassing even Garry Kasparov, once considered the greatest chess player of all time. According to Kasparov himself, the match between the two chess engines could have been better conceived, but he doesn’t question the result.[6] Indeed, on one side, AlphaZero was run on a computer entirely dedicated to it, as it is right now: AlphaZero is not simply a piece of software but the combination of a (super)computer, which incorporates neural networks working in parallel (at the same time), and a two-stage algorithm. The first stage is devoted to elaborating the possible variations, whose selection and elaboration follow a probabilistic evaluation (a Monte Carlo approach to creating the new variations). The second stage is devoted entirely to the difficult, fundamental function of selecting the most appealing among the variations proposed by the Monte Carlo algorithm, namely the ones on which AlphaZero ‘would like to bet’. I am using quotation marks here to suggest that this is just a metaphor. Actually, AlphaZero is simply a machine created to follow the rules of games, to generate variations and to select them according to a clear utility function defined by the rules of the games themselves. To sum up again, AlphaZero works by making reasonable bets on the possible variations (which are the objects of the bets) and then it selects among them, trying to follow the same winning patterns it found in its past games.[7]
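The ‘betting’ described above can be illustrated with a short sketch of how a search of this kind might choose which move to explore next. The score below is a simplified form of the PUCT rule commonly associated with AlphaZero-style search: the average result a move has produced so far, plus a bonus that is large for moves the network initially favored and that have been explored little. This is not DeepMind’s code; the names `Edge`, `puct_score` and `select_move`, the constant `c_puct = 1.5` and all the numbers in the example are assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Edge:
    """Search statistics for one candidate move (hypothetical structure)."""
    prior: float            # the network's initial "bet" on this move
    visits: int = 0         # how many times the search has tried it
    value_sum: float = 0.0  # sum of results seen after it (+1 win, -1 loss)

    def mean_value(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


def puct_score(edge: Edge, parent_visits: int, c_puct: float = 1.5) -> float:
    """Average result so far plus an exploration bonus weighted by the prior."""
    bonus = c_puct * edge.prior * math.sqrt(parent_visits) / (1 + edge.visits)
    return edge.mean_value() + bonus


def select_move(edges: dict, c_puct: float = 1.5) -> str:
    """Pick the move whose bet currently looks most worth exploring."""
    parent_visits = max(1, sum(e.visits for e in edges.values()))
    return max(edges, key=lambda m: puct_score(edges[m], parent_visits, c_puct))


edges = {
    "e4": Edge(prior=0.5, visits=10, value_sum=6.0),
    "d4": Edge(prior=0.3, visits=2, value_sum=0.8),
    "h4": Edge(prior=0.2),  # never tried yet
}
first_choice = select_move(edges)   # the untried move gets its chance

# Suppose the try goes badly: the bet on h4 is quickly abandoned.
edges["h4"].visits += 1
edges["h4"].value_sum += -1.0
second_choice = select_move(edges)
print(first_choice, second_choice)
```

The design point is that no human-written evaluation appears anywhere: the only inputs are the machine’s own prior guess and the record of what happened when it tried the move.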

A DeepMind expert, Dr. Nando de Freitas, analyzed several ways in which humans learn (by imitation, studying alone, memorizing patterns, etc.).[8] One fundamental aspect that is often overlooked by many in secondary schools (at least in my quite modest, and definitely unlucky, experience) is fun or, more importantly, gaining satisfaction from experience and from the learning process. If you are bored, after a certain amount of time you will stop learning and choose something better to do: a good decision, actually. If learning is only a painful experience, then it is simply torture, as would be recognized by the relatively recent laws on torture and nasty interrogation techniques! This idea is connected to the untutored machine learning used to train AlphaZero. AlphaZero is ‘encouraged’ to select some particular paths, which are based on a memory it created by itself. We encounter a similar experience with some great (human) intellects.

Reading the lives of two great geniuses, Beethoven and Prokofiev, I discovered that they were not appreciated for their styles, considered at the time too difficult, strange and inappropriate for the taste of the day. This is, actually, a reliable pattern: even Kant and Spinoza were fully recognized only much later, although they were immediately considered great “intellects” (but not so much as to become the “leading” scholars of their time). Today they are all praised as absolute geniuses. Why so? They all studied alone for a great amount of time, without caring about the praise or taste of other people (Kant even wrote so explicitly), and that’s why they didn’t have much success: creating a new style and new techniques is not necessarily an easy thing to digest for the average human being, trained to recognize certain patterns, following some habits of thought and discarding others. As with all great minds, their main reward was their own creative and critical intellectual satisfaction and, driven by it, they created something new, something unheard of. AlphaZero worked in the same fashion. It didn’t follow any precondition created by databases, opening books or even evaluation functions. It followed its own heuristics, if any, based only on its combination of memory, calculation and reinforcement of what it stored in its (huge) memory. It starts by cutting down the positions that are not good bets according to it, because they don’t maximize the expected outcome based on the comparison between its analysis and the record of the past games it played. AlphaZero is a good gambler.

This is not something entirely new in the field of chess thinking. Chess players are gamblers. All of them. In fact, they cannot exhaust chess with sheer calculation, and so they have to trust their own intuitions and feelings. This was stated by Garry Kasparov himself, who can calculate even 30-40 moves in advance when necessary: not exactly the person you would expect to say that intuitions and feelings (he even used the word “guts”) are underrated qualities of the human mind, especially when it plays chess.[9] But he stated clearly that without insights and intuitions, nobody can play chess. The role of intuitions in our cognitive capacities and activity is an old philosophical debate. Actually, without them nobody could think at all. But in a more practical way, chess players are also gamblers in this strict sense: they calculate until they stop for a general evaluation of the position (the so-called “stop position”), and sometimes they select the one in which they believe more, which they trust more or which they like more. These are not textbook ways to define a good method for playing chess. But this is what it is. This is done by everybody at every level of chess. When you come to choose between two or three different variations, sometimes you just make a bet, preferring one position to another, where the skill lies exactly in reducing the amount of uncertainty related to each “lottery”. So, all chess players are sophisticated gamblers, and the better gamblers they are, the stronger they will be.
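The “lottery” metaphor can be made concrete with a little arithmetic. In the sketch below each candidate variation is treated as a lottery over the three results of a chess game, scored in the usual way (win = 1, draw = 0.5, loss = 0), and the player bets on the variation with the highest expected score. The probabilities are invented purely for illustration, and the names `expected_score` and `candidates` are hypothetical; a real player only has rough, intuitive estimates of such numbers.

```python
def expected_score(p_win: float, p_draw: float, p_loss: float) -> float:
    """Expected tournament score of a variation seen as a lottery.

    A win counts 1 point, a draw 0.5 and a loss 0, so the loss
    probability only matters through the check that the three sum to 1.
    """
    if abs(p_win + p_draw + p_loss - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return 1.0 * p_win + 0.5 * p_draw


# Two stop positions the player cannot calculate any further.
# The (win, draw, loss) estimates are made up for the example.
candidates = {
    "quiet line": (0.20, 0.75, 0.05),
    "pawn sacrifice": (0.50, 0.20, 0.30),
}

for name, probs in candidates.items():
    print(name, expected_score(*probs))

best = max(candidates, key=lambda name: expected_score(*candidates[name]))
print("bet on:", best)
```

With these invented numbers the riskier sacrifice has the higher expected score even though it loses far more often. A player who mainly wants to avoid defeat would look at the loss column instead, which is a different way of settling the same bet.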

The interesting result, compared to Stockfish, is that AlphaZero is much more a qualitative machine than a quantitative one. To be entirely clear, AlphaZero still calculates many more variations than Carlsen could ever dream of, but far fewer than Stockfish.[10] Also, the match between AlphaZero and Stockfish could have been less in favor of AlphaZero if only Stockfish had been run on a better machine. Actually, the Stockfish engine was not running on the best possible machine, which would have allowed it to play at its best. However, as Kasparov argued, the result is sufficiently striking to be a major achievement anyway. And it is difficult to disagree, because one game in particular shows AlphaZero’s amazing capacity for annihilating its opponent.[11] Basically, AlphaZero reduces the number of reasonable moves available to the opponent, maximizing the activity of its pieces even at the cost of sacrificing pawns (even three or four!). It is a very good gambler in every sense of the word: from a quantitative perspective, its calculations are based on generating positions by probability; from a qualitative perspective, its games are based not so much on material as on the dynamic features of the positions, always improving the activity of its pieces at the opponent’s expense. Let’s admit this: AlphaZero plays as we would have liked to play against any stupid machine such as Stockfish. We would like to outplay it, annihilating it in such a clear and indisputable way.

My title, of course, is a provocation (but maybe not entirely false, although AlphaZero is not a world chess player, because a thing cannot be a player of anything, as I argued elsewhere).[12] However, I prefer watching AlphaZero’s games against poor Stockfish to watching the games played in the last World Championship match, for many reasons. Carlsen and Caruana are too good at minimizing the chances of losing to try to win at all costs. I am not a utopian, and I agree with them that today chess is based on the capacity to be unbeatable, which doesn’t mean “able to beat anybody else”, insofar as being unbeatable is sufficient not to lose. So, the first thing is to draw with Black and to fight with White (though then the opponent with Black is playing to draw, a good recipe for a boring game). Instead, AlphaZero is always trying to win, and it does so in a unique, enjoyable style. Is it because it is a gambler? Yes, an extraordinary gambler, of course. But everybody would rather watch a breathtaking Russian roulette than, let’s say, a very bad movie in which you know from the beginning how it ends. So, AlphaZero may not be intelligent, but it is definitely a very creative and insightful gambler, one that could be happy to win its bets because it doesn’t fear anything.

[1] Van Gelder, Tim; (1998), “Into the Deep Blue Yonder”, Quadrant, 41 (2-1), pp. 33-39.

[2] Turing, Alan; (1950), “Computing Machinery and Intelligence”, Mind, Vol. 59, No. 236, pp. 433-460.

[3] Ciancarini, Paolo; (2005), “Il computer gioca a scacchi”, Mondo Digitale, V. 3, p. 9.

[4] Poe, Edgar A.; (1836), “Maelzel’s Chess-Player”, Createspace Independent Pub (2014).

[5] Rudolf, Anna; (2018), “AlphaZero’s Attacking Chess”, https://www.youtube.com/watch?v=nPexHaFL1uo&t=1312s Accessed 27.05.2019, 12.45.

[6] Kasparov, Garry; (2018), “Class of 2006 War Studies Conference”, West Point – Military Academy, https://www.youtube.com/watch?v=QSyKlzh9Zl8&t=1734s Accessed 27.05.2019, 12.49.

[7] For a very detailed analysis: Silver, David; et al.; (2018), “A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play”, Science 362 (6419), pp. 1140-1144. DOI: 10.1126/science.aar6404

[8] I strongly suggest watching the whole of this insightful video: de Freitas, Nando; (2017), “DeepMind’s Nando de Freitas – Learning to Learn”, https://www.youtube.com/watch?v=5yNirTp92Uk, Accessed 27.05.2019, 12.54.

[9] Kasparov, Garry, (2013), “Garry Kasparov on “Achieving Your Potential””, https://www.youtube.com/watch?v=NPT0vg_Jl8Q, Accessed 27.05.2019, 13.00.

[10] Silver, David; et al.; (2018), “A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play”, Science 362 (6419), pp. 1140-1144. DOI: 10.1126/science.aar6404

[11] Rudolf, Anna; (2018), “AlphaZero’s Attacking Chess”, https://www.youtube.com/watch?v=nPexHaFL1uo&t=1312s Accessed 27.05.2019, 12.45.

[12] Pili, Giangiuseppe; (2012), Un mistero in bianco e nero – La filosofia degli scacchi, Le Due Torri, Bologna, Chap. 10.

Giangiuseppe Pili

Giangiuseppe Pili holds a Ph.D. in philosophy and sciences of the mind (2017). He is the founder of Scuola Filosofica (Philosophical School), where he is publisher, editor and author; since the portal was founded in 2009, he has written over 800 posts for it. He is an expert in intelligence and international security, war and philosophy, and the author of numerous essays and articles in international journals on these topics. He has published numerous books in Italian. He is a passionate chess player and (back in the day!) an amateur movie maker.
