Departments of Computer Science and Psychology
The University of Queensland, QLD 4072, Australia
Technical Report 339, Department of Computer Science, The University of Queensland, 1995
(This document is - permanently - under construction).
See also: Jay Burmeisters PhD Thesis (2000)
2.0 The Game of Go
2.3.1 Dead or Alive
2.3.2 False Eyes
2.5 Sente and Gote
2.10 Winning the Game
4.1 Academic Work
4.1.3 Reitman and Wilcox
4.1.4 Other Academic Work
4.2.1 The Many Faces of Go
126.96.36.199 Many Faces of Go
188.8.131.52 Candidate Move Generation
184.108.40.206 Evaluation Function
220.127.116.11 Performance and Timeline
4.3.1 The Ing Prize
5.0 What's in a Go Program?
6.0 Performance of Current Programs
6.1 Program versus Program Performance
7.0 Future Improvements in Performance
8.1 Computer Go related Internet Resources
8.1.3 Game Record Formats
8.1.5 The Computer Go Ladder
8.1.8 Game Databases
8.2 Go Related Internet Resources
8.2.3 The Go News Group
This document introduces the reader to the computer Go field and Internet resources pertaining to Go and Computer Go.
For readers unfamiliar with Go, section 2 describes the rules and common concepts of the game. Readers comfortable with their knowledge of Go may skip section 2, reading it only if the use of terms in this document needs clarification.
Section 6 describes the current state-of-the-art in Go programs which is far behind that of Chess programs even allowing for the comparative disparity in the amount of effort expended on developing programs for each game. Theoretical and practical reasons are given in section 3 for this disparity as well as reasons why Go cannot be programmed in a similar fashion to Chess (see Table 1 for a brief comparison).
The history of the computer Go field is briefly outlined in section 4. Included in this outline are descriptions of some of the academic work that has used Go as a vehicle to pursue research, computer Go competitions and results, and programs.
Internet resources pertaining to Go and computer Go are described in section 8. The resources listed include a server for playing a game of Go, an FTP archive site and a mailing list which provides a forum for discussing aspects of programming Go.
Go is a two player board game in which chance plays no part. It is an Oriental game which originated between 2500 and 4000 years ago and enjoys a similar status in Japan, China, Taiwan and Korea as Chess does in Western countries.
Go is played on a board which consists of a grid made by the intersection of horizontal and vertical lines. The size of the board is generally 19 x 19, however, 9 x 9 and 13 x 13 sized boards are also used for playing quicker games. Intersection points are connected along the horizontal and vertical lines such that the neighbours of any given point are the intersection points that are horizontally and vertically adjacent to it. Two players alternate in placing black and white stones on the intersection points of the board (including edges and corners of the board) with the black player moving first. The aim of Go is to capture more territory and prisoners than your opponent.
Empty points that neighbour a stone are called its liberties (points marked L in Figure 1). Any stone which has no liberties is captured and removed from the board (white stone in Figure 2). This is the only instance in which stones, once placed, are moved.
Stones of the same colour can be joined into strings by being horizontally or vertically connected to each other (Figure 4). Single stones can also be described as strings which we will refer to as unitary strings. Instead of the stones in a string having individual liberties, the liberties of the constituent stones of the string become the liberties of the string (points marked L in Figure 5). To be captured, a string must be surrounded by opponent stones such that no stone in the string has any liberties (black stones in Figure 6). When a player fills the penultimate liberty of a stone or string, it is courtesy to declare "Atari!" which alerts the owner of the stone or string of its impending capture.
Eyes are formed when a string surrounds one or more empty intersection points (point marked E in Figure 7). Strings which contain a single eye can be captured by first being surrounded and then the eye being filled (Figure 8). Filling the eye does not violate the suicide rule since the opponents captured stones are first removed, creating liberties for the last stone played. A string that has two eyes (points marked E in Figure 9) is safe from capture since there is no way for an opponent to capture such a string. The reason for the inability to capture a string with two eyes is that the stone played in the first eye will be subject to the suicide rule.
Strings which have two eyes are said to be alive. In general, a string is alive if it cannot be captured even if the opponent moves first. A string is considered to be dead if there is no way to stop it from being captured even if its owner moves first.
To stop endless cycles occurring, a player cannot play a stone which causes a previous board position to be repeated. In general, the only situation in which a cycle can arise is called a ko. In Figure 10, a white stone at 1 is captured by a black stone at 2 which could itself be captured by a white stone being played at 1 again. Such a situation is prohibited by the rule against repetition. A ko situation can lead to a ko fight (Figure 11) in which the player which loses a stone in the ko (white stone at 4 captured by black at 1) threatens an opponent stone elsewhere (at 2) and then recaptures the opponent stone in the ko (at 4) after the opponent responds to the threat (at 3).
Skilful players can engineer the game so that they gain the initiative for a number of moves and dictate the response of their opponent. In such a case, a player is said to have sente and the opponent to have gote. Sente usually occurs when players place their opponent in a situation of having to save stones and only having one option available to do so. Sente allows a player to control the course of the game so as to execute a plan of several moves without the opponent being able to frustrate the plan in midstream.
Ladders are a capturing race between a group that is almost enclosed, and a surrounding group. Consider the initial stones in Figure 12 (the stones without numbers). The black stone is doomed to be captured no matter what the Black player does. When White plays at 1, the black stone is threatened with capture. To defend this stone, Black plays at 2. White once again threatens capture by playing at 3. This process continues and results in a ladder being formed. White eventually wins the ladder when the board edge is encountered.
The only links between stones which is recognized in the rules of Go is between horizontally and vertically adjacent stones (called nobi). In practice however, there are several links which are recognized by experienced Go players (Figure 13). The ikken tobi link can be physically connected as long as black plays next. The kosumi link can be made as long as black plays next or each of the connecting points are empty if white plays first. Of the other links, physical connection depends on the context of the surrounding stones. If two strings are denied a physical connection they are said to be cut (Figure 14).
A group is composed of strings of one colour that are in close proximity to each other such as A and B, and C1, C2, C3 and C4 in Figure 15. Being in close proximity means that they can be securely linked by one of the links in Figure 13 (other than nobi of course). For the C group, C1 is linked to C2 by an ogeima link; C2 is linked to C3 by an ikken tobi link; and C3 is linked to C4 by a kogeima link. Groups are the main perceptual units concerning the player throughout the game. The most important attributes of a group is whether it is alive, and whether it can create two eyes or connect to a group with two eyes to ensure survival.
An army is a loose federation of groups of one colour that are in close proximity to each other. In Figure 15, the two groups (A and B, and C1, C2, C3 and C4) form one army. Being in close proximity means that there are no intervening enemy stings between the groups. Although armies cover a large territory, they are not as secure for defending territory as groups. Thus it is important for a players to convert armies into groups whilst stopping their opponents from doing the same. Converting an army into groups implies playing stones in between strings of separate groups such that they can be linked together as described above (e.g. by playing at one of the points enclosed by the dashed line in Figure 15).
A player may pass at any time and the game is over when both players pass consecutively. A determination of territory begins by removing the stones in black strings and white strings which are dead. Each player then adds the number of opponent stones they have captured and the number of empty intersection points which are surrounded by their stones i.e. points that cannot trace a path to an opponent's stone along the horizontal and vertical grid lines (points marked B in Figure 16). The player with the largest territory wins.
Go has a sophisticated handicap and ranking system. Players are ranked according to their ability with a complete novice being ranked at 30 kyu (pronounced "cue"). After playing between 10 and 15 games, a player's rank would likely be around 20 kyu. After reaching 1 kyu, further improvement would result in a rank of 1 dan or first-degree master. Amateur ranks continue up to 6 dan. Professional ranks start at what would be 7 dan amateur and extend from 1 dan to 9 dan.
The amount of effort required to increase to the next best rank becomes harder as a player moves up in ranks. For instance, to rise from 20 kyu to 10 kyu would require playing 1 or 2 games a week for around a year and possibly reading a few books on Go tactics and strategy. [The differentiation between professional ranks is also smaller than between amateur ranks. The difference between a 1 dan professional and a 9 dan professional is approximately equivalent to the relative difference of 2 amateur ranks, say 4 dan to 6 dan.] Many years of full-time play is required to reach professional ranking.
To provide balance for both weak and strong players, the weak player is given handicap stones at the start of the game. In the Chinese rules, the handicap stones are placed at the weaker player's discretion whereas in the Japanese rules, the handicap stones are placed on set points called hoshi points. The weaker player is given as many handicap stones as the difference between the players rankings. For instance, a 10 kyu player would give 5 handicap stones to a 15 kyu player. Ranks are usually determined in one of two ways. The first is by finding the number of handicap stones a player needs to win about half the games played against a stronger player on a 19x19 board. The other is by giving 1 handicap stone for each 10 stones that a player is beaten by. The relative value of handicap stones diminish as the board size increases. One handicap stone on a 9x9 board is worth two on a 13x13 board and four on a 19x19 board.
Playing first without handicap is equivalent to a 5 point advantage. It is customary for the weaker player to play first using black stones. If both players are ranked equally, the White player is given a 5 point bonus or komi (pronounced "ko-mee"). In tournaments the komi is usually 5 1/2 points so as to avoid ties.
Compared to the Chess programming field, the Go programming field is not well developed. A strong commitment to research on programming Chess in the 1960's and 70's has not been replicated in the Go field. There are fundamental differences between Chess and Go which have contributed to making Chess a more attractive research domain so far. However, even allowing for the smaller amount of work done on programming Go, the performance of Go programs have not kept pace with the performance of Chess programs. There are several reasons for this disparity, which is in part due to the complexity of Go and also the unsuitability of the programming techniques used in Chess as described below.
1. There are fewer types of pieces in Go than chess (in chess there are 6 types of pieces, whereas in Go each player has only one type, called a stone). However, the board size is much greater in Go (8x8 squares in chess vs 19x19 grid in Go).
2. The size of the board, and the relative freedom in the placement of stones mean that there are also many more moves made in a typical Go game (about 80 moves in chess vs. about 300 moves300 moves in Go).
3. Stones can be placed anywhere on the board, making for a very large branching factor in the selection of each move (estimated at ~200), whereas pieces in chess are constrained to a much smaller set of legal moves (branch factor of ~ 40). The diversity of chess pieces is exploited to reduce the branching factor in chess programs in a way that is not possible in Go (see discussion of evaluation functions in 7. below). The branching factor also plays a role in the different stages of both games (beginning, middle and end). In the opening stage of chess, there are many well-known openings, and these are often followed for up to 10 moves. In Go, the number of sensible openings is much larger, and set openings are rarely followed for more than about 3 moves. However, there are sequences of moves in local skirmishes, known as joseki, that reflect optimal play (for both sides) in tactical battles in the corners. Skill in the use of joseki libraries lies in selecting the right joseki to optimise interactions with stones outside the region of interest or diverge from the known sequence to serve other interests.
4. Both games can be won by resignation of the opponent, or by an outright win - check-mate in chess, or surrounding more territory and taking more prisoners in Go. In chess, the end of a game is easy to determine: checkmate is simply defined in terms of the position and threats to the King. In Go, a game is finished when both players pass, but it is frequently difficult for beginners to know when the end of a game has been reached. Beginners typically prolong play beyond the point where experts would stop, and their difficulties are echoed by Go programs, which also have difficulty estimating when to end a game. The natural end of the game occurs when playing additional stones reduces the player's score, either by filling in their own territory, or giving unnecessary prisoners to the opponent. Go programs are often inefficient in both these ways. There are several different rule systems for scoring in Go: they are all similar, based on calculating the sum of territory surrounded plus prisoners taken during the game for each player, and taking the difference between these scores (a typical margin in a professional game would be less than five points).
5. In both chess and Go, pieces can have long range effects. In chess the effect is directly a result of some pieces' ability to move many squares (e.g, queens, rooks, bishops). In Go, a stone once placed on a grid point does not move (although it can be captured and removed from the board). However, a group (or pattern of stones) does have long range effects in that it can play a role in a capturing race, or can affect the life and death struggle of another group. For example, a stone played in the path of a ladder (a group of stones involved in a certain type of capturing race) can change the potential to link two stones later in the game - even though the stones are on the other side of the board.
6. The state of the board changes rapidly in chess, as pieces move positions. In Go, the board is only gradually changing, as stones are added to the existing configurations. This gradual change compensates significantly for the greater memory load imposed by the larger board. It also allows accurate "read-out" of board positions later in the game - even beginners can read out ladders up to 60 moves ahead (a deep but narrow search). The only time the physical state of the board changes significantly is when a group is captured, and the stones are removed from the board. Groups worth as much as 30 points are often won or lost in a game as trade-offs for other groups, even though the final difference in scores may be just 2 points.
7. Evaluation of board positions (in expert human play) typically reflects both tactical and strategic factors in both chess and Go. However, in chess, there is a good correlation between the likelihood of winning (from any stage of the game) and the number and quality of pieces on each side. Thus in computer chess programs, strategic factors have not been essential in evaluating board position. In Go, there is no strong correlation between winning a tactical struggle over a group, and winning the game, as each tactical struggle over a group requires moves that are not contributing to another group. Early in the game, players strive to achieve influence on the board, rather than directly taking territory. In fact, taking territory at the start of the game can indicate an over-concentrated position, that will not be effective in containing the opponent's territorial moves much later in the game. Algorithmic approaches to measuring the influence of a position are a standard part of most Go programs but there are no methods to confirm their efficacy, except for the strength of the program itself which is usually very weak.
8. For all the reasons discussed above, programming approaches to chess are amenable to tree searches, with good evaluation criteria. Such approaches have not succeeded in Go, because the branching factor is too large for brute force search techniques, and pruning is not a viable option without good evaluation measures.
9. Human lookahead in chess and Go is very different. Beginners of both games might start by only looking ahead a few moves, but in Go, patterns such as ladders can be read out with very little effort, and beginners are soon able to trace the path of a ladder across the board (which can require up to 60 moves). However, in these read-out sequences, there is only one sensible move at each stage, i.e, a series of forcing moves, so there are no alternative paths to follow (see the horizon effect below). In a typical chess game, expert players may search alternative possible moves many steps ahead (although typically about 10), but for each move there are several alternatives. In Go, there are too many alternatives that are reasonable to search far ahead in most moves, although for any move, there are likely to be some sequences that can be very deep (e.g., ladders). This combination of factors is not captured well by either breadth-first or depth-first search techniques.
10. The horizon effect in chess is where a search to a specific depth (such as 12 moves ahead) elicits an evaluation that would be radically altered if a few more moves had been studied. This effect is often seen in sacrifice moves. The horizon effect begins to be a problem at the Grandmaster level in chess. In Go, the horizon effect is seen in assessment of even beginner level play (e.g., ladders).
11. Chess was used in the early 1970s to study human grouping processes, and Chase and Simon (1973) showed that expert chess players view the board in terms of hierarchical groups of stones. In trying to replicate the results with Go players, Reitman (1976) found a completely different structure, in which stones are seen as part of many intersecting groups. Identifying the eventual group to which stones belong is critical in determining their safety, however, from Reitman's research, it appears that no clean assignments can be made until the end stage of the game. Such results indicate sources of potential problems for computer go programs that assign stones into hierarchical groups in order to estimate features such as influence.
12. One major difference in the actual playing of chess and Go lies in the standards of the opponents. In chess, a similar standard is required for an even game. In Go, there is a very effective handicap system, in which a weaker player starts with a predefined number of stones (usually 2-9) placed at influential points on the board. Thus the greater skill of one player is offset by the material advantage of the weaker one. In such contests (and most Go games are of this type) neither player can assume an infallible opponent, for the weaker player tries to simplify the game, and play safely, whereas the stronger player must take more risks, and exploit the weakness of the other player. Although the final score of a Go game is often used as an indication of the relative strengths of the players, the important factor is winning, by however small a margin, rather than taking chances that may increase the margin, but are not as safe. Thus, estimating the other player's level of skill is a critical aspect of many Go games. It is also the case that a Go player needs to be skilled at all facets of the game, rather than having great strength in just one area. This requirement is because players exploit the weaknesses of their opponents. It is common to rank the strength of computer Go programs according to their first game against a human of a given level. In subsequent games, the same human player will often overwhelm the same program that had won previously, because the human will learn the program's weaknesses, but the program does not change.
An algorithm which can be solved in polynomial time can be practically implemented to solve "real-world" problems. Algorithms which cannot be solved in polynomial time are described as exponential-time algorithms. Except for very small problem spaces, these algorithms are impractical. Problems that can be solved by a computer with a polynomial amount of memory space are said to be solvable in P-space.
It has been shown that, given an arbitrary Go position on an n x n board, determining the winner is P-space hard (Lichtenstein and Sipser, 1980). This result used together with situations arising from ko positions have been used to show that for an arbitrary Go position on an n x n board, deciding whether Black can force a win or deciding on Black's best move is exponential-time complete (Robson, 1983). These results indicate that the use of approximation algorithms which return near-optimal solutions are necessary to solve some aspects of Go and that brute-force search methods will quickly run into difficulties.
A comparison of the complexity of Go to the complexity of other games is provided in Allis et al. (1991). They defined the terms cracked and solved, and illustrated the difference between search-space complexity and decision complexity. A game is said to be cracked if a program is capable of achieving the best possible theoretical results regardless of its opposition. To be solved, the best move must be explicable in human terms. The difference between search-space complexity and decision complexity is illustrated by a variant of Go. The variant consists of players alternatively placing stones on unoccupied points of the board or passing. The first player to place 181 stones wins. This variant has the same search-space complexity as Go, however, since if the player moving first does not pass is bound to win, the decision complexity is trivial.
Allis et al. graphed log log search-space complexity versus log decision complexity, dividing the graph into solvable and unsolvable space for 16 games. Go was shown to be the most complex in both log log search-space complexity and log decision complexity and to be in the unsolvable area. A comparison of Go and Chess shows that 9x9 Go is around about the same complexity as Chess (35 versus 50 in 10 log search space) and that 19x19 Go is a vastly more complex game than Chess and may well be unsolvable as defined above (Allis et al. (1991).
Chess programs typically use a heuristic search and evaluation technique. Search trees of board positions are generated to a fixed depth and are heuristically pruned according to an evaluation of the merit of the board positions. This approach works well in Chess because the board size is sufficiently small and the nature of Chess is more tactical than strategic.
Evaluation of a board position in Go presents problems not encountered in Chess. Go is a much more strategic game in comparison to Chess. Unlike Chess, Go does not focus around the capture of a single piece. Positional advantages are slowly built up in achieving the long term goal of acquiring more territory than the opponent. There are many direct and indirect ways to achieve this goal such as making territory, building influence, attacking weak enemy groups, securing friendly groups, destroying enemy territory etc. Due to the large size of the board, a Go game is comprised of many small local skirmishes. If a game of Chess were described as a battle, a game of Go could be described as a war. Many good tactical moves at the local level must all compete for selection in the context of strategic global considerations. Thus a player must balance resources to achieve local goals at many locations whilst trying to pursue an overall global objective.
Due to the problems associated with evaluation, the brute force methods employed in AI to program Chess will not work for Go. As a domain, Go is rich for research in AI and cognitive science. Hans Berliner, a former Correspondence Chess World Champion, a top-ranked U.S. over-the-board player and a well-known Chess programmer referring to Go as saying "... this game may have to replace Chess as the `task par excellence' for AI" (Berliner, 1978).
Early work in programming Go began in the late sixties and was primarily academic in nature. Academic programs are not generally intended to contribute to a general programmable theory of Go, rather they use Go to further other research aims. Serious attempts to develop commercial Go programs was advanced by the spread of personal computers in the home and the financial inducement of the Ing prize.
The earliest Go program was written by Albert Zobrist as part of a PhD thesis on pattern recognition (Zobrist, 1970). The present description of Zobrist's program will concentrate on those aspects that are relevant to computer Go considerations.
Zobrist introduced the idea of using an influence function to segment the board into black and white territories. The influence function computed a numeric value for every point on the board. Black stones were given a value of +50 and white stones a value of -50 and empty points were given a value of 0. The neighbours of every point that had a positive influence value had +1 added to their influence value and similarly the neighbours of points with negative influence values had -1 added to their influence values. This process was repeated another three times (for a total of four times). This influence function spread black and white influence numerically and Zobrist then segmented the board into areas of contiguous positive and negative numeric values (Figure 17).
The program constructed an internal representation of the board which was stored as numeric values in various 19x19 arrays. The representation included arrays containing information such as:
The program also used a second influence function to store an internal representation which measured the influence of stones. It was similar to the influence described above except that occupied points were initially given values of +100 and positive points radiated +3 to their neighbours whilst negative points radiated -4 to their neighbours. Since the computer played black (positive values), this influence function overstated the opponent's influence value so that the program would play cautiously.
Having constructed an internal representation, the program performed pattern recognition over the entire board which amounted to searching for certain numeric patterns contained in the program's various arrays. Along with each pattern would be stored an associated move and a numeric value reflecting the priority of the move. When a pattern match was made, the numeric value would be added to the point corresponding to the associated move. An attempt was made to match each pattern "known" to the program across the entire board, rotating and reflecting the patterns as necessary. At the end of this process, the point with the highest value became the program's next move. Thus Zobrist's program used summed feature evaluation to choose its next move.
The program exhibited weaknesses in its play which Zobrist wished to overcome. He divided the game into four phases: corner and edge play at the beginning, extension into the centre of the board, defence and attack of loosely staked out territories, and stabilizing surrounded territories. To enable the program to play appropriate moves during the various phases, only appropriate patterns (along with patterns which were relevant throughout the entire game) were allowed to be matched during those phases.
Another addition to the program was limited lookahead or heuristic search. During the pattern recognition process, local areas would be suggested for further examination by lookahead which was carried out to a depth of three moves. The program handled saving/capturing strings, connecting/cutting strings, ladders, and the formation of eyes using lookahead.
The performance of Zobrist's program was modest, beating two novices but performing poorly against experienced players.
Jon Ryder's program (Ryder, 1971) can be seen as an extension of Zobrist's approach in as much as it used an influence function and move selection was based on summed feature evaluation. However, Ryder added both strategic and tactical considerations and also used lookahead extensively. The distinction drawn by Ryder between tactics and strategy was that tactical considerations dealt with short-term objectives whilst strategic considerations dealt with long-term objectives
Due to the intractability of using game trees in Go, a suitable replacement had to be found. Ryder's program used the combined analyses of many local tactical problems performed by lookahead together with other analyses as a replacement for the game tree.
Ryder saw Go as a game which required a balance between acquiring actual or potential territory and avoiding the loss of important stones to the opponent. He formulated three areas of interest which were designed to help maintain this balance: board organization showing degree of control of each side over various regions; identification of areas of best play for both sides; and analysis of strings to determine whether they are safe or endangered.
Move generation was accomplished in two passes. In the first pass, the tactical status of all strings were determined. Armies, walls and groups were identified and a process of developmental analysis was performed on each which resulted in determining areas and endangered groups for each player. Each legal move was then examined in an attempt to eliminate as many as possible from future consideration with the fifteen best moves being further analysed in the second pass.
The second pass began with an analysis of the best fifteen moves from the first pass with respect to tactical and naive strategic theories. The score obtained by a move from these analyses were added to the score obtained in the developmental analysis. The move with the highest combined score after the first and second passes was then recommended as the next move. Second pass analysis would stop before all fifteen moves were analysed if an "obviously" best move was identified.
Ryder's program used a forward-pruning method to search the game trees that resulted during lookahead. In forward pruning, a decision based on various criteria is made regarding the desirability of pursuing a given branch in a lookahead tree i.e. whether the future implications of a particular move is worth examining. The criteria for terminating a particular search tree included whether a target string had two eyes, whether a saving move resulted in more than five liberties, or it was unlikely that the target would be captured. The danger in forward pruning is that the best move may be discarded before it has been examined.
Like Zobrist, Ryder also used an influence function to provide a numeric number to indicate the degree of tactical control each stone exerted over its neighbours. His influence function was similar to Zobrist's in that black influence was positive and white influence was negative and the influence value at each point was the sum of the influence values propagated by its nearby neighbours. Ryder's influence function was simpler than Zobrist's in that each stone contributed an influence value to its nearby neighbours according to a table of values (Figure 18).
Both too much and too little influence at a point indicated a problem with playing at that point. Too much influence indicated that there was an over-concentration to the detriment of the global situation whereas too little influence indicated that there was the possibility of capture by the opponent. Optimal moves were those that were close to the peak of the value/influence curve. Influence was regarded by Ryder to be a local property of a stone although he admitted that there were cases (such as ladder-breakers), in which stones exerted control on distant parts of the board. In such situation where the influence value was not a reliable measure, tactical evaluation was used to overcome the short-comings of the influence function.
One situation in which the influence function was not a good measure was that of connectivity. Ryder defined two terms: connectedness and strongly related. The definition for connectedness was that a point was connected to colour X if it was occupied by a non-dead stone of colour X, or if it was empty, it was adjacent to at least one stone of colour X and none of the opposite colour. An extension to connectedness was half-connectedness. An empty point was half-connected to colour X if it was adjacent to at least two stones of colour X and one or more stones of the opposite colour. The definition of strongly related was that a point was strongly related to colour X if it was occupied by or adjacent to a stone of colour X. A point could also be strongly related to colour X if it was diagonally adjacent to a stone of colour X with at least one empty point (kosumi link) or no more than one empty point away (ikken-tobi link). Groups were defined as sets of points which were connected or half-connected.
The influence function was particularly suited to determining walls and armies. Walls were defined as a connected set of points in which the influence values were equal to or larger than a predefined wall threshold. Similarly, armies were defined as a set of points including empty points in which the influence value were equal to or larger than a predefined army threshold.
The tactical concepts employed in Ryder's program included liberty estimation, ko, ladders and snapbacks, and eyes and false eyes. Ryder admitted that improvement was needed in the program's tactical ability which he described as "blunder avoidance".
Information was included in the representation if it satisfied any of the following criteria: it was basic and was needed in many calculations; it took a significant effort to re-calculate and was used more than once; or it was accessed many times and was constant through many if not all of those accesses.
Groups and armies were represented in a pointer-based manner. There was a pointer from each point in a group or an army to the entity descriptor of the group or army it belonged to. Each entity had a pointer to one of the points that comprised it which was the base point for that entity. Searches for points near entities were initiated from the base point.
The global representation of the board was contained in a 21x21 array (19x19 with an empty point border around the outside). By using various bits in the memory location of each point in the array, the following information/pointers were able to be stored: area number; army number; wall number; group number; string number; and current state (black, white or empty).
Ryder reported the results of only one game which was lost to a novice which had just learned the rules and had been given some rudimentary tips on the tactics and strategies of Go. With regard to Ryder's main interest, he reported that the tactical analysis was quite an efficient searching mechanism which provided valuable information for move generation and compared favourably to the powers of analysis exhibited by human beginners. He was also of the opinion that only a small effort would be required to improve the program's tactical strength.
Future improvements suggested by Ryder included re-using previously calculated information in searching, the incorporation of distant ladder-breakers into tactical analysis and improving the definition of endangered groups. Ryder identified the developmental analysis stage which consumed around 10% of the processing time as needing attention. One possible reason for the poor performance of Ryder's program was that it made no distinction between the opening, mid and end games and was therefore playing inappropriate tactical moves in the end game in particular. The addition of such distinction was recognized to be of importance in future improvements by Ryder.
Walter Reitman and Bruce Wilcox began using Go to conduct research in 1972. The program that resulted from that research has been described both as the Reitman-Wilcox Go program and to the INTERIM.2 Go program. After the research project was discontinued, Wilcox rewrote the INTERIM.2 program to produce NEMESIS, a commercial Go program.
The research conducted by Reitman and Wilcox using Go was predominately in the Artificial Intelligence area and included pattern recognition (Reitman et al., 1978; Reitman and Wilcox, 1975,1978), planning (Reitman and Wilcox, 1974) and replication of human perceptual and cognitive abilities in Go. Indeed, the objective of program was to model human go play and to this end, they videotaped protocols and post-game analyses of a strong Go player to try to ascertain how humans play Go.
According to Reitman and Wilcox, the three essential ingredients to a Go program are perception, knowledge and coordination (Wilcox, 1988). They tried to incorporate into their programs representation as many of the perceptions of a skilled Go player as they could. They identified and stored two types of Go knowledge: process knowledge which included how to kill a group, how to make territory, etc, and evaluation knowledge which enabled the stability of groups etc. to be judged, the estimation of relative values, etc. Coordination of the program's perceptions and knowledge was achieved through appropriate control structures which allowed it to choose the next move.
The INTERIM.2 program created and maintained a representation that was a selectively updated, multilevel network which Reitman and Wilcox said was analogous to the type of perceptual and cognitive representation a skilled Go player would create. There was a hierarchy of experts and critics which operated on the network which chose the next move. Lookahead was not full-board, rather it was a selective, goal-driven process.
INTERIM.2 summarized information up and filtered decisions down (Wilcox, 1988). Information contained in low-level representations e.g. strings, were summarized and used in high-level representations e.g. groups. The program made decisions at a high-level about what to "focus on". These decisions were filtered down to lower levels and influenced decisions made at those levels.
The program stored its representation in a structure called GAMEMAP which was indexed through a set of interrelated arrays called GAMEBOARDS. In each GAMEBOARD array there were pointers that pointed to every instance of a particular data structure e.g. STRING-BOARD contained pointers to every string on the board. The GAMEBOARDS progressed from simple to complex: TYPEBOARD, STRINGBOARD, LINKBOARD. Special GAME-BOARDS such as LENSBOARD, WEBBOARD, SECTORBOARD and TACTICSBOARD were designed to contain representations which corresponded to a skilled human players perceptual and cognitive representations.
The flow of control in the INTERIM.2 program oscillated between MOVE and REFLEX. MOVE selectively updated the components of GAMEMAP that were affected when stones, black or white, were placed on the board. REFLEX was responsible for choosing the program's next move which was achieved by stepping through an ordered list of functions and accepting the first move returned by them.
The program's tactician, PROBE, was used to answer specific questions and therefore performed narrow, goal-driven lookahead instead of general purpose, full-board lookahead that is common in Chess programs. The basic assumption behind PROBE was that if the results of a small number of intelligently chosen lines were the same for a given question, then those results would most likely be the answer to that question. Such an assumption is not always valid, however, Reitman and Wilcox felt it was a small price to pay for a quick searching and was not dissimilar to the penalties experienced when human searches failed for similar reasons. PROBE was designed to model how skilled human Go players performed tactical problem solving by selective lookahead (Reitman and Wilcox, 1979).
PROBE would be employed to answer a specific question and would propose a reasonable initial move.The initial move would be hypothetically played and a new hypothetical board position would result. PROBE would then propose another reasonable move based on the hypothetical board position. This would continue until a position was reached which could be judged as either a success or a failure with respect to the specific question that PROBE was trying to answer. If the position was a success, PROBE returned the initial move as the result. If the position was a failure, the hypothetical moves were retracted until there was a position in which a reasonable alternative move could be made and the process was repeated. In unrestricted lookahead, all moves were examined in this fashion so that the best move could be discovered. In PROBE, not even all reasonable moves were examined. Each side was limited to two or three failed variations before the search was terminated and a failure result was returned.
The selection and examination of reasonable moves was conducted by a hierarchy of experts which formed, evaluated and implemented goal-directed "chains of thought" (Wilcox, 1988). There were a variety of experts including experts that generated goals and subgoals, experts that proposed moves to realize goals, experts that used exclusively the current board position and current goals, and experts that used all goals and subgoals. The experts contained a large amount of specific Go knowledge. The control structures implemented by Reitman and Wilcox in PROBE allowed the knowledge contained in the various experts to be organized, easily debugged, and readily extended (Wilcox, 1988). The goal-directed behaviour of PROBE resulted in very restrictive searching which typically resulted in around 60 - 80 hypothetical moves per lookahead (Wilcox, 1988).
INTERIM.2 maintained many different data structures in its representation. Point occupancy (black, white, empty) was used to determine strings. A basic property of each string was whether it could be captured or saved. The links recognized by INTERIM.2 were kosumi, ikken-tobi, nikken-tobi, kogeima and ogeima. Links were classified as normal (connectable), dead (unconnectable but potentially useful) or string capture (linkages between strings capturing an opponent string). Basic properties of linkages included whether they could be joined or cut, whether they crossed opponent links, and why they were created. Special edge linkages which were links from strings up to the fourth line to the edge of the board were also recognized. Sector lines were long range links and were used in stability assessment and move generation. Virtual strings were strings connected by unbreakable links. Groups were sets of strings joined together by links. Enclosed territory was a set of empty points or dead stones that were enclosed by strings, linkages and edge linkages.The motivation for moves e.g. why a link was created, was explicitly tracked to facilitate move selection and to ease debugging.
Influence was radiated using Ryder's algorithm. Reitman and Wilcox saw influence as misleading in certain situations and therefore developed the notion of influence blocks. Areas of empty, contiguous, same-signed influence points that did not include linkage points were collected into a structure. The perimeter of these structures were located and then iteratively "shrunk" to provide nested loops. Since shrinking did not take place across stones or linkage points, it was possible that the loops created in the shrinking process would become arc segments or strips. Eventually individual structures emerged which were called influence blocks. Points contained in an influence block were considered more likely to be turned into territory the further they were away from the perimeter of the block.
Creating and maintaining INTERIM.2's representation was described by Reitman and Wilcox as perception and the data structures used were also sometimes referred to as perceptual representations.Two components of the perception system employed by the INTERIM.2 program are particularly interesting. Webs were used to determine what regions could be "seen" from a stone and lenses were used to monitor actual patterns of stones in a game using the program's knowledge of Go patterns.
A web was constructed around a group as concentric "circles" of liberty points (circumferential strands) to a maximum depth of nine levels. Radial strands emanated from the groups. The web did not continue behind other stones, either friendly or unfriendly. Links between stones and between stones and the edge of the board (up to three moves away) also stopped a web. The continuous loop of the outermost points of a web comprised its perimeter. The points between the group and the perimeter were considered to be transparent. The perimeter of a web was useful information for a group in deciding where potential threats might come from, where it could run to avoid those threats, the location of nearby friendly groups, where linkages might be attacked etc.
To Reitman and Wilcox, local stone configurations and their transformation by sequences of moves is a fundamental aspect of Go. Sequences of moves associated with standard configurations e.g. attacking or defending a kogeima link, are recognized by competent Go players. Lenses were designed to monitor such configurations and suggest moves that were known pattern continuations.
A lens was comprised of a set of fields. Each field consists of a set of points and watches for the development of a sequence of moves. The lens in turn supplied information about the suggested moves and global status information to the tactician. When a field recognized a configuration, it supplied suggested moves and other associated information to the lens. When fields no longer recognized patterns they were discarded which caused a lens to get smaller over time. This in effect amounted to the "focus" of a lens becoming more concentrated over time.
The results of three games played by the INTERIM.2 program are reported in Reitman & Wilcox (1978). Two were won against players ranked 22 and 34 kyu and one was lost against a player ranked 4 kyu. The computed ranking of the INTERIM.2 program from these games was 27 kyu.
Reitman and Wilcox were pleased with the performance of the INTERIM.2 program in regard to their initial objectives of modelling human play. It played conservatively, choosing to defend endangered groups in preference to attacking opponent groups. It defended well; the only cases in which it lost groups (other than due to bugs) were when several groups were simultaneously endangered.
Some of the academic work that has used Go as a domain is presented below. The various papers are described at a high level since, in general, they do not provide specific details that would be of use in developing a Go program.
Paul Lehner, a student of Wilcox, produced a PhD on strategic planning in Go (Lehner, 1981) which is also described in Lehner (1981). His work was based on representative search which Wilcox found to be frequently exhibited by Go players (Kerwin and Wilcox, 1973).
A representative search provides information about many potential move sequences that satisfy a strategic plan by examining only one of those sequences. Lehner's results suggested that representative lookahead was not alone sufficient to evaluate strategic plans, however, it could quickly reject strategically unsatisfactory plans. He saw the use of representative search as providing a means to reduce the number of moves that would be passed on to a more precise tactical lookahead mechanism.
Go was used as a domain for researching machine learning by David Stoutamire (1991). A simple error function was developed which, rather than incorporating tactical or strategic Go knowledge, was derived from a database of expert games. Stoutamire then developed a classification technique called pattern preference which automated the process of deriving patterns that were representative of good moves. Due to exponential growth of the pattern database, a hashing function was developed which had the nice property of graceful degradation as memory requirements increased.
To avoid the combinatorial explosion that results from using tree search to evaluate and select moves, Peter Sander (1979) devised a program with a top-down or goal directed organization that used strategies to generate possible moves. Sander's program (called Kyu) generated bottom-up hierarchically structured descriptions of board positions which were incrementally updated after every move. It progressively refined its strategies into goals and then selected moves. Kyu only played the opening game to move 25 and an intended look-ahead mechanism was also not implemented.
Kyu's data structures were based on linked records which represented strings, groups and armies of both colours. From any point on the board, it was possible to access the string, group and army that the point belonged to if occupied, or had influence over it if unoccupied, by following pointers. After each move, the data structures were incrementally updated to reflect the changes caused by a move. The update was from low-level to high-level: from points to strings, from strings to groups, and from groups to armies.
The architecture which was instantiated in Kyu was summarized by Sander as being comprised of a multi-level structured data representation, and a hierarchically organized planning system which explicitly encoded strategies and goals. He also believed that such an architecture had applications in different domains other than Go.
Kenneth Friedenbach studied the interaction of perception and cognition in the solution of complex problems by using Go as a domain (Friedenbach, 1988). He formalized the notion of abstraction hierarchies which are based on graph theory and then applied them to Go. The use of Go as a domain was because, according to Friedenbach, concepts in the Go literature are related in a multi-level hierarchy.
Friedenbach's model of perception and cognition involved three components. The first was a hierarchy of static objects which consisted of five levels: the points of the board, strings of similar points, groups of related strings, loose federations of groups or potential groups (armies), and a summary of the properties of the federations in a global level. The second was an analytic component concerned with changes in structures as a result of sequences of moves or transformations that are considered. The third was an integration component concerned with the retention and application of information from one analyses to another.
Four types of analysis were carried out. Tactical Safety analysis was carried out at the string level and was concerned with capturing and protecting strings. Strategic Safety analysis was concerned with killing and defending groups and involved analysing eyes and connections. Strategic Development analysis was carried out at the group level and was concerned with open areas, balance measures, the growth of new groups, and the interactions of emerging groups. Global Balance analysis was the highest level of analysis and was concerned with initiative and the overall strategy of the game.
Friedenbach characterizes an analysis by three conditions: the condition under which it is performed, the conditions under which it is terminated, causing its evaluation, and the conditions under which continuations are chosen. Friedenbach uses tactical analysis as an example. Tactical analysis is invoked whenever there is a string that has less than three liberties and is terminated when all strings have three or more liberties. Continuation involves considering atari moves and captures by the attacker, and captures or liberty increasing moves by the defender. A reanalysis would result if a move was made which affected the number of liberties of a key string during the analysis.
A description of past and present Go programs are given for those programs that such details could be obtained. Other programs are listed together with their programmer(s) without description in the results in section 6.
The Many Faces of Go (MFG) is one of the best commercial Go programs currently available. David Fotland has been writing Go programs on a part time basis since 1981 and released MFG in 1990. Fotland has continued to enhance MFG since then and has contributed regularly to discussions on the computer-go mailing list concerning computer go programming in general and MFG in particular. MFG evolved from two other significant programs: G2 and Cosmos.
Fotland's first Go program determined territory by radiating influence. Unfortunately, it could be beaten by a simple liberty filling algorithm so he discarded it and wrote G2. The internal structure of G2 was described by Fotland in terms of its data structures, its tactical analyser, and its move selector (Fotland, 1986). G2 used full board reading and evaluated all legal moves with an emphasis on strategic moves. After the full board evaluation of all moves, the move that received the highest score was played.
Most of G2's code was devoted to updating the data structures. Stones occupied points (referred to as squares) and were organized into strings (referred to as groups) and then into groups (referred to as armies). Data was associated with each point on the board and included the distance to the nearest and second nearest edge, a list of adjacent and diagonal neighbours, string number of the stone occupying the point, the number and a list of liberties, colour of stones on neighbouring points (white, black, or mixed), an influence function, and the nearest occupied point in each direction.
Horizontally or vertically adjacent points containing like-coloured stones were organised into strings. String data included colour, number and list of liberties, a list of its connections, a list of neighbouring enemy strings, the army it was part of, and an aliveness value (described below).
The types of connections recognized by G2 included one and two point jumps, diagonal connection and knight's move. Connection data included the strings it joined, its type, the points involved, and its status.
Groups were comprised of strings which were connected by unbreakable connections or were both adjacent to a dead enemy string. Group data included a list of its constituent strings and dead enemy strings used as connections, the number of stones and liberties, the number of eyes, and an aliveness value.
The tactical analyser was used to evaluate board positions so that a move could be selected. Each group was assigned an aliveness value which was also inherited by its constituent strings. The strength of G2 was primarily determined by the tactical analyser and consumed the largest portion of Fotland's effort (Fotland 1986). Since most of the program's time was spent in the tactical analyser, it had to be both fast and efficient in its pruning of search trees.
Aliveness values resulted from a multi-stage process: strings which could be captured were identified as dead; eyes were found by the eye evaluator; unbreakable connection were identified; connected strings were identified as groups; territory was determined by the territory evaluator; and the life and death evaluator assigned an aliveness value to each group which was then assigned to its constituent strings.
The tactical analyser examined every string with less than five liberties to determine whether it was dead i.e., whether it could be captured; strings which could obtain at least five liberties were assumed to avoid capture. Interesting moves were suggested during lookahead to determine the status of a string. Interesting attacking moves included geta moves for groups with two liberties, liberty filling moves, moves adjacent to an enemy string's liberties, and liberty gaining moves for friendly strings. Interesting defensive moves included diagonal moves into unoccupied territory, one point jumps, extensions, connections, and liberty filling moves (for adjacent enemy strings).
The interesting moves were heuristically evaluated and the best were selected for further analysis. The best of these moves was hypothetically played and the resulting situation analysed from the opponents point of view. Once again, the best moves were chosen and the best one was hypothetically played. Normal backtracking was used at each stage as well as alpha-beta pruning. Thus, a depth-first tree was generated until the string being examined was captured or gained sufficient liberties to live. The search tree was generated to a maximum of 80 moves which still enabled ladders to be accurately read. The number of "best moves" considered (between 1 and 3) depended on the depth of the current move in the search tree: more moves were considered near the root and less near the leaves. Since the value of the terminal node had only two values (i.e., captured or escaped), alpha-beta pruning was very efficient (Fotland, 1996).
An eye evaluator was used to maintain information about eyes, half eyes (eyes made in gote) and quarter eyes (eyes needing two moves to be made) by evaluating small enclosed eyespace. G2 recognised some basic dead shapes and various types of eyes including three in a row, four in a square, bulky five, pyramid, and cross.
Groups were determined by finding strings that were connected by unbreakable connections. The tactical analyser tried to cut connections to test their strength. Dead opponent strings could also be used as connections between live friendly strings.
Territory was determined by the territory evaluator. As well as completely enclosed territory, G2 also evaluated territory near the board edge which did not need to be completely enclosed. Territory could be made near the edge by being between a stone and the edge or a connection and the edge. However, such territory could be challenged by a stone within four lines along the edge, and closer to the edge (undercutting the edge).
The life and death evaluator analysed every group. An aliveness value was assigned to each group by considering the number of liberties it had, its eyespace, the amount of territory it controlled, whether it could extend along the edge, its size, whether it was in contact with an enemy string, and whether it was completely surrounded. The aliveness value rating itself had 20 different values including "has two eyes", "has enough space to make two eyes", "unsettled", "not enough eyespace", and "can't possibly make two eyes".
A rule-based strategic evaluator with 65 rules was used to select moves by choosing the highest scoring move using a two step process. In the first step, strategic moves were evaluated and in the second step, all legal moves were evaluated. Information from the data structures and the tactical analyser was used to evaluate board positions resulting from moves in both step one and two. Fotland felt that this was the weakest component of G2 due to the weakness of the data structures (Fotland, 1986).
The rules suggested strategic moves to evaluate such as playing in an empty corner, make shimari or kakari, a move from the joseki dictionary, extending along an edge, invasions, contact fights, blocking, cutting, connecting, blocking an invasion on a wall, or making a ko threat.
In the first step, all the suggested strategic moves were evaluated. A particular strategic move was hypothetically played and the resulting board position was evaluated. The evaluation process consisted of assigning values to the board points and then summing them together to return an overall score. The value for a given board point was determined by the aliveness value of the string that controlled it. If the board position resulting from a strategic move did not satisfy the strategic goals associated with the rule that suggested it, the evaluation of the move was ignored.
In the second step, all legal moves (including the strategic moves considered in the first step) were evaluated. The value assigned to a board position resulting from hypothetically playing a move was based on the aliveness, size, and amount of territory controlled by each string on the board.
The values that resulted for each move from step two were summed with the value they obtained if they were considered in step one. The move with the overall highest score was then chosen as G2's next move.
Between 1981 and 1988, Fotland spent around four years part time writing G2. He stopped work on G2 for several years because the available computers were not fast enough for the development of strong Go programs (Fotland, 1996). G2 was around 11000 lines of C code and used around 700k for the code and data. During that time, Fotland's Go skill advanced from 15 kyu to 1 dan. G2 played at about the 25 - 20 kyu level and could beat beginners without much trouble (Fotland, 1986). Fotland had some success at computer Go competitions with G2 coming 4th in the world computer Go competition in Taiwan and 1st in the US computer Go championship in 1987.
Fotland converted G2 into Cosmos in 1988. The main difference between Cosmos and G2 was that instead of evaluating all legal moves, only suggested moves were evaluated. This transition was achieved by increasing the number of move suggestors to over 250. Quiescent search was added and the connection, eye, and territory evaluations were made more accurate (Fotland, 1996).
The data structures used in Comos were basically the same as those used in G2. The eye information included the number of eyes achievable in gote, the number of eyes achievable in sente, the number of eyes achievable if the opponent moved twice, a list of vital points, and eye type. The information contained in the data structures was updated either incrementally (e.g., lists of liberties) or after every move (e.g., influence).
Essentially, Cosmos had the same tactician as G2 but better move generation and sorting. In Cosmos, the tactician was only used to determine dead and threatened strings by trying to capture them. Its parameters were maximum liberty count, maximum, depth, and maximum search size. The maximum liberty count was 4, and therefore as for G2, strings with more than 4 liberties were assumed to avoid capture.The maximum depth allowed was 100. The search size determined the playing level (Stoutamire, 1991) and since forcing moves were not counted, ladders could be accurately read even at low playing levels.
Territory was determined by radiating positive influence from alive groups and negative influence from dead groups. Radiated influence did not pass through stones or connections and was inversely proportional to distance.
Life and death of groups was the primary concern dealt with by the evaluation process. Board positions were evaluated by a similar process to that used in G2 with the addition of quiescent search.
The tactician was used to determine dead and threatened strings (those that could be captured if they moved second) and whether the stones at the diagonal of eyes could be captured. This information was useful in identifying unbreakable connections and forming strings into groups.
Eyes were analysed to determine their potential and then allocated to groups. False eyes were identified by checking the diagonals of eyes and some dead shapes were also known by Cosmos. Any group with enough eyespace to make two eyes was considered to be alive. Potential eye space could be gained by extensions along the edge, possible connections, being adjacent to a threatened enemy group, and by controlling territory which was not already considered to be an eye.
There were 25 values that described a groups stren