Machine learning is an exciting and fast-growing area of computer science that explores whether and how computers can learn without being explicitly programmed. Within the last few weeks, machine learning programs have learned games such as Go and Chess and become very capable players: Google's AlphaZero beat the well-known Chess engine Stockfish after just 24 hours of learning to play, and just over a year ago AlphaGo beat the world's strongest human player, Ke Jie, at Go.
AlphaZero differs from all previous Chess engines in that it learns by playing. Given only the rules of Chess (the aim of the game and how the pieces move), it played millions of games against itself, learning as it went. The Google DeepMind AlphaZero team have published a paper on their research, and it makes for interesting reading.
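AlphaZero's actual method combines deep neural networks with Monte Carlo tree search, but the core idea of improving by playing against yourself can be sketched with something far simpler. The toy Python example below is entirely my own illustration, not anything from the paper: a single tabular agent plays both sides of a simple take-away game (players alternately remove 1 or 2 stones; whoever takes the last stone wins) and nudges its estimate of each move's value towards each game's final result.

```python
import random

def train(episodes=20000, alpha=0.3, epsilon=0.2, start_pile=10):
    """Self-play training for a toy Nim-like game.

    q maps (pile_size, move) -> estimated value for the player to move.
    The same table plays both sides, so every game generates learning
    signal for winner and loser alike.
    """
    q = {}
    rng = random.Random(0)  # fixed seed for reproducibility
    for _ in range(episodes):
        pile, history = start_pile, []
        while pile > 0:
            moves = [m for m in (1, 2) if m <= pile]
            if rng.random() < epsilon:
                move = rng.choice(moves)          # explore
            else:
                move = max(moves, key=lambda m: q.get((pile, m), 0.0))
            history.append((pile, move))
            pile -= move
        # The player who made the last move won: walk the game backwards,
        # crediting +1 to the winner's moves and -1 to the loser's.
        reward = 1.0
        for state_move in reversed(history):
            old = q.get(state_move, 0.0)
            q[state_move] = old + alpha * (reward - old)
            reward = -reward
    return q

def best_move(q, pile):
    """Greedy move from the learned value table."""
    moves = [m for m in (1, 2) if m <= pile]
    return max(moves, key=lambda m: q.get((pile, m), 0.0))
```

After training, the agent rediscovers the game's known strategy (leave your opponent a multiple of 3 stones) purely from self-play results, which is the same flavour of discovery, at a vastly smaller scale, as AlphaZero finding strong openings on its own.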
From a Chess perspective the data is fascinating: it shows how AlphaZero discovered some key well-known openings (the English, the Sicilian, the Ruy Lopez), used them in its games, and then discarded some as it found 'better' alternatives. Table 2 on page 6 shows how the frequency of each opening varied with training time. Some highlights from the data:
The English Opening (1. c4 e5 2. g3 d5 3. cxd5 Nf6 4. Bg2 Nxd5 5. Nf3) was a clear favourite with AlphaZero from very early on, and grew in popularity.
The Queen's Gambit (1. d4 d5 2. c4 c6 3. Nc3 Nf6 4. Nf3 a6 5. g3 dxc4 6. a4) also became a preferred opening.
Interestingly, the Sicilian Defence (1. e4 c5) was not favoured; instead, the preferred reply to 1. e4 was 1. ... e5, leading to the Ruy Lopez (1. e4 e5 2. Nf3 Nc6 3. Bb5).
It's worth remembering that AlphaZero deduced these well-known, long-played openings and variations by itself in 24 hours, compared with the centuries of human play that have gone into developing them.
Apart from the purely academic exercise of building machines that learn to play games, there are financially lucrative applications of machine learning, such as product recommendations. Amazon and Netflix make extensive use of recommenders, where machines make predictions about a user based on users who showed similar behaviour ("people who liked what you like also like this..."). Segmenting all users to find those with similar properties is a key part of the machine learning process for this application.
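The "users who behaved like you" idea can be sketched in a few lines of Python. This is a minimal illustration of user-based collaborative filtering on made-up toy data, not how Amazon or Netflix actually implement it: we score the items a target user hasn't seen by the ratings of other users, weighted by how similar those users are.

```python
from math import sqrt

# Toy ratings data: user -> {item: rating}. Names and scores are invented.
ratings = {
    "ann":  {"matrix": 5, "inception": 4, "up": 1},
    "bob":  {"matrix": 4, "inception": 5, "interstellar": 4},
    "cara": {"up": 5, "frozen": 4, "matrix": 1},
}

def cosine(u, v):
    """Cosine similarity between two users, over the items both rated."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = sqrt(sum(u[i] ** 2 for i in shared))
    norm_v = sqrt(sum(v[i] ** 2 for i in shared))
    return dot / (norm_u * norm_v)

def recommend(target, ratings, top_n=1):
    """Rank unseen items by similarity-weighted ratings of other users."""
    mine = ratings[target]
    scores = {}
    for other, theirs in ratings.items():
        if other == target:
            continue
        sim = cosine(mine, theirs)
        for item, rating in theirs.items():
            if item not in mine:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=scores.get, reverse=True)[:top_n]
```

Here "ann" rates much like "bob" and unlike "cara", so bob's favourites dominate her recommendations. Real systems refine this basic scheme heavily (clustering users at scale, matrix factorisation, handling millions of sparse ratings), but the segmentation idea in the paragraph above is the same.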
In conclusion: "It's an exciting time for Machine Learning. There is ample work to be done at all levels: from the theory end to the framework end, much can be improved. It's almost as exciting as the creation of the internet." Ryan Dahl, creator of Node.js