
Evaluation vs Search

Started by
21 comments, last by Argus 21 years, 8 months ago
Question: these two things appear to be very similar, so why does most game AI seem strongly biased in favour of search algorithms rather than focusing on evaluation? Theoretically, if one had a good enough evaluation function, search would be unnecessary. On the other hand, good evaluation is often aided by some kind of search. Humans seem to focus mostly on evaluation, not possessing the memory for search, while the best chess AIs (for example) are mostly search-based. Is it reasonable to attempt to produce a decent AI based only on evaluating the moves available at one point in time?
Is it really possible to come up with a perfect evaluation function for every problem?

Also, if we attempted to create a solely evaluation-based chess game, how could we take the opponent's future moves into account without some sort of searching?
I don't understand how the evaluation function alone (at least in an environment that an outside source is also manipulating, such as a game with two or more players) could be used to predict states more than one level into the future.
Hmm, perhaps I'd better explain that assumption. A perfect evaluation function, treated as a black box, is a function that always returns the best move. As long as there is a best move (which there is as long as there are moves to be made), we can imagine a function that handles such an occurrence. Thus we can have a function which returns the best possible move given any game state. This is, I expect, either trivial or outright wrong, although I think it's reasonable.

Any (non-searching) evaluation is an attempt to bypass a search by taking advantage of patterns inherent in the game. But I haven't seen many AIs of this sort around (they seem to do basic evaluation and heavy searching rather than vice versa), which leads me to ask why.
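To make the idea concrete, a purely evaluation-based player might look something like this rough sketch (legal_moves, apply_move and evaluate are hypothetical, game-specific helpers): score every immediate successor position statically and play the highest-scoring move, with no lookahead at all.

# Minimal sketch of an evaluation-only player: no search, just one-ply scoring.
# legal_moves, apply_move and evaluate are hypothetical game-specific functions.
def pick_move_by_evaluation(state, legal_moves, apply_move, evaluate):
    best_move, best_score = None, float("-inf")
    for move in legal_moves(state):
        score = evaluate(apply_move(state, move))  # static judgement of the successor position
        if score > best_score:
            best_move, best_score = move, score
    return best_move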
The problem is that you have to find the correct evaluation function! It may be possible for a human to find the correct function in VERY simple worlds with a small number of states, but the more states there are, the more difficult it becomes.

One solution is to let the computer learn the function on its own, through neural nets or (better, since it is unsupervised) reinforcement learning. A disadvantage of this is that it takes a very large amount of training to play even remotely well.

There is a backgammon program out there (TD-Gammon) that taught itself to play (i.e., learned an evaluation function) by playing against itself; after roughly 1.5 million games it reached world-class level. But backgammon is still very simple compared to a computer game, with its nearly unlimited number of possible states.
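The core trick, roughly, is a temporal-difference update: after each move, the value estimate of the previous position is nudged toward the reward plus the value estimate of the new position. TD-Gammon actually uses a neural network and TD(lambda); the tabular TD(0) sketch below is only meant to illustrate the idea.

# Simplified, tabular TD(0) update; purely illustrative, not TD-Gammon's actual code.
def td0_update(value, state, next_state, reward, alpha=0.1, gamma=1.0):
    old = value.get(state, 0.0)
    target = reward + gamma * value.get(next_state, 0.0)  # bootstrapped target
    value[state] = old + alpha * (target - old)           # move the estimate toward the target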

------------------------------
"Reality is nothing, perception is everything!" - First Wizard Zeddicus Zu''l Zorander

------------------------------

There are only 10 kinds of people: those that understand binary and those that don't.

You don't need to supervise a neural net's training. You can 'evolve' its dynamics.

Fogel's checkers player reached a world-class ranking in about 70 generations (using a minimax variant with a neural net as the evaluation function).
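For the curious, the general flavour of the approach (not Fogel's exact setup; his network architecture, selection scheme and mutation operators differ, so treat this as illustrative only) is an evolutionary loop over the evaluation function's weights, something like:

import random

# Illustrative only: evolve a population of evaluation-function weight vectors.
# play_tournament is a hypothetical function that plays games (e.g. minimax using
# each weight vector as the evaluator) and returns one fitness score per individual.
def evolve_evaluator(population, play_tournament, generations=70, sigma=0.05):
    for _ in range(generations):
        fitness = play_tournament(population)
        ranked = [w for _, w in sorted(zip(fitness, population),
                                       key=lambda pair: pair[0], reverse=True)]
        parents = ranked[:len(ranked) // 2]            # keep the better half
        children = [[w + random.gauss(0.0, sigma) for w in p] for p in parents]
        population = parents + children                # next generation
    return population[0]                               # best of the last evaluated generation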

I'll leave the answer to the original question to Timkin as he's much more qualified to answer it than I am.



ai-junkie.com
You'll find that search is used most often in n-player games (typically n=2) for the following reason.

The perfect evaluation function is only useful in an environment where all state trajectories lead to the optimal state - i.e., a win - which is the definition of such a function. This is generally not likely in most n-player games, since these environments are dynamic, with part of the state evolution (transition) beyond the control of the player. The same is true in single-player games with randomness in state transitions. There are some n-player games that have been shown to have deterministic paths to a win state for one player given the first move, but these are the exception rather than the rule.

An optimal evaluation function would return the move that maximises some measure over a limited horizon of the game states. For n-player games this horizon is usually finite and much shorter than the depth of the tree. Typically we seek an optimal evaluation function since a perfect evaluation function is not available or cannot be stored in finite memory.

Your question, at least as far as I interpret it, is really asking why so many implementations of game AI (in these games) involve search over this horizon, rather than a black-box optimal evaluation function that knows the optimal move.

The answer is that, generally, search is more efficient and more accurate, given the same level of computation to solve the task (i.e., considering training and evaluation of states vs search). Simply evaluate all possible state trajectories to the horizon depth, work out their actual (or best estimate of their) value, and choose the move that starts your side on that path. Alternatively, you need to learn a conditional plan (policy) which dictates, for each state you (your AI) find yourself in, an action. There are certainly methods to learn such policies; however, these methods struggle once the state space gets beyond about 10^5 to 10^6 states.
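To be explicit about the "evaluate trajectories to the horizon depth" part: this is essentially depth-limited minimax. A bare-bones sketch (no alpha-beta pruning; evaluate, legal_moves, apply_move and is_terminal are hypothetical hooks into the game) looks like:

# Depth-limited minimax, no pruning; purely illustrative.
def minimax(state, depth, maximising, evaluate, legal_moves, apply_move, is_terminal):
    if depth == 0 or is_terminal(state):
        return evaluate(state)            # the evaluation function judges the horizon state
    scores = [minimax(apply_move(state, m), depth - 1, not maximising,
                      evaluate, legal_moves, apply_move, is_terminal)
              for m in legal_moves(state)]
    return max(scores) if maximising else min(scores)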

As you may have considered though, even search over a local horizon requires an evaluation function, albeit a function that evaluates the quality of a state, rather than the quality of a move (so the quality of the move sequence is implicit). Obviously if the horizon depth equals the tree depth, the evaluation is simple: did I win? In other circumstances, it is difficult. In games such as Go, it seems exceptionally difficult to come up with a good evaluation function, possibly because the game state can change so dramatically from a single move (you jump branches in the tree).
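As a toy example of such a state evaluator for chess, a bare material count (using the conventional rough piece weights; a real evaluator would add positional terms like mobility, king safety and pawn structure) might look like:

# Toy static evaluation: material balance from White's point of view.
PIECE_VALUE = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9, "K": 0}

def material_evaluation(pieces):
    # pieces: iterable of (piece_letter, is_white) for every piece on the board
    score = 0
    for piece, is_white in pieces:
        score += PIECE_VALUE[piece] if is_white else -PIECE_VALUE[piece]
    return score   # positive favours White, negative favours Black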

My final comment revolves around the issue of using local horizons for planning (search is just a form of planning). These methods are generally seen as an extension of reactive planning (which would be a search over a horizon depth of 1, or half a move in 2-player games). They therefore suffer the usual problem of reactive strategies: there are no global guarantees of optimality when maximising only a local utility (value) function. The only instance when this is not the case is when you have access to the global utility function... and in that case, your problem is solved before you begin.

I hope this has helped clarify things for you.

Cheers,

Timkin
Timkin, where do you get the claim that learning algorithms struggle above 10^6 states? Backgammon has an estimated 10^20 states and can be learned to an excellent level... The problem I see is that it takes quite a while to train such a game, though.

fup, do you have a URL for this checkers program? 70 training steps seem awfully few, even for a relatively simple game like checkers...

------------------------------
"Reality is nothing, perception is everything!" - First Wizard Zeddicus Zu''l Zorander

------------------------------

There are only 10 kinds of people: those that understand binary and those that don't.

Genetic Algorithms also depend on how many individuals each generation has...

Given enough individuals you could probably even learn in fewer than 10 generations (well... with REALLY many individuals per generation, hehe)
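The back-of-envelope reason is that the real cost is fitness evaluations, roughly population size times generations, so trading generations for individuals keeps the total work in the same ballpark (the numbers below are made up purely for illustration):

# Made-up numbers, just to show the population-vs-generations trade-off.
def total_evaluations(population_size, generations):
    return population_size * generations

print(total_evaluations(30, 70))   # small population, many generations -> 2100 evaluations
print(total_evaluations(300, 7))   # huge population, few generations   -> 2100 evaluations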

twinkish, I know.
-----The scheduled downtime is omitted cause of technical problems.
grbrg: Do you doubt my word, dear boy? Then read Blondie24 by David Fogel - a book well worth buying if you like this sort of stuff. If you can't afford the book, there are some papers online that give a brief overview of his techniques. A quick search should turn them up.

I can't remember the exact number of generations. 70 was a guess. It's definitely somewhere around that figure though.

If you email me I can send you a copy of an article David wrote for the magazine "Intelligence". It's 6 MB though. Same goes for anyone else who wants a copy.

Omnibrain: The population size was small. Also, the performance of a GA does not necessarily increase with increasing population size. In fact, above a certain value it will almost certainly start to decrease. Schaffer (1989) and Grefenstette (1986) both showed that small population sizes perform better than what was considered to be the optimum at that time. (In the test problems used, the optimum was considered to be 100; they demonstrated that it was actually 20-30.)

P.S. Checkers is not as simple as you may think.



ai-junkie.com

quote: Original post by grbrg
Timkin, where do you get the claim that learning algorithms struggle above 10^6 states?


That wasn't what I said (my apologies if it was unclear). I was saying that planning methods for learning optimal policies struggle on such large state spaces. This is common knowledge in the planning literature. For state spaces beyond this size, approximation techniques are generally applied. I can provide some references if you like.

Cheers,

Timkin

