Theoretical works on human knowledge conceptualisation?

Started by
2 comments, last by Sigloxx 21 years, 6 months ago
The title already says a lot, I guess. Many domains in which AI researchers would have loved to get brilliant results have hit one big snag: an expert system (or any other program) can't be as effective as a human mind, because its lack of knowledge necessarily makes it dumb. Language is a fine example: getting a program to produce speech or analyze it is an interesting problem, but actually making it produce speech that has a meaning (for the machine as well as for us) looks absolutely impossible. Another, far less extreme example would be bots that detect spam e-mails (I am thinking of working on one). A rule-based program (delete mails with many links or graphics, or mails that arrive identically for the second time, etc.) or a database of spam e-mail addresses are obvious but very imperfect solutions. Alas, a perfect program is out of reach, as the program can't know what spam really is. I was wondering whether any advanced theoretical work has already been done and published on the problem of modelling human knowledge, as it's a domain that would surely interest me. Thanks in advance and happy new year. Sigloxx.
David Antonini.
And sorry if it's more of a general AI question than a game AI one. It's of course a problem game AI also faces, though solutions are hardly as badly needed there.

David Antonini
David Antonini.
Modelling knowledge is a very well-known area in AI. Modelling a concrete domain is quite easy. Expert systems are a good example: they hold very good knowledge of one concrete area. The problem is finding a general representation that copes with several domains.

The problem you mention (detecting spam) has been a hot topic in AI for years. The usual way to solve it is with a Bayes classifier. You choose a vocabulary of the words that you are going to consider (usually 1000 or 2000 words). Then you represent each e-mail by the number of times each word in your vocabulary appears in it, so each e-mail becomes a vector of 1000 or 2000 word counts. Next, you collect a large number of e-mails, some spam, some interesting mails. After collecting hundreds or thousands of examples, you can train a simple "Naive Bayes Classifier" on this data. If the vocabulary is well chosen, you can correctly detect more than 80% of spam. If you think about it, this is a great result, because deciding whether an e-mail is spam is not as easy as you might think (maybe some users really do want to read those "enlarge your penis" e-mails, and for them they are not spam).
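
The steps above can be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the six-word vocabulary, the four training mails, and the names VOCABULARY, to_counts, train and classify are all made up for the example, and the add-one (Laplace) smoothing is one common choice rather than something the description above prescribes.

```python
import math
from collections import Counter

# Hypothetical toy vocabulary; a real filter would use 1000-2000 words.
VOCABULARY = ["free", "money", "offer", "meeting", "project", "report"]


def to_counts(text, vocabulary=VOCABULARY):
    """Represent an e-mail as a vector of vocabulary word counts."""
    words = Counter(text.lower().split())
    return [words[w] for w in vocabulary]


def train(examples, vocabulary=VOCABULARY):
    """Fit a multinomial Naive Bayes model from (text, label) pairs."""
    labels = sorted({label for _, label in examples})
    mails_per_label = Counter(label for _, label in examples)
    word_totals = {label: [0] * len(vocabulary) for label in labels}
    for text, label in examples:
        for i, count in enumerate(to_counts(text, vocabulary)):
            word_totals[label][i] += count

    model = {}
    for label in labels:
        total = sum(word_totals[label])
        # Add-one smoothing (an assumption) so unseen words never get
        # probability zero.
        log_likelihoods = [
            math.log((count + 1) / (total + len(vocabulary)))
            for count in word_totals[label]
        ]
        log_prior = math.log(mails_per_label[label] / len(examples))
        model[label] = (log_prior, log_likelihoods)
    return model


def classify(model, text, vocabulary=VOCABULARY):
    """Return the label with the highest log-posterior score."""
    counts = to_counts(text, vocabulary)
    scores = {
        label: log_prior + sum(c * ll for c, ll in zip(counts, lls))
        for label, (log_prior, lls) in model.items()
    }
    return max(scores, key=scores.get)


# Made-up training data; a real filter needs hundreds or thousands of mails.
training_data = [
    ("free money offer", "spam"),
    ("free offer free money", "spam"),
    ("project meeting report", "ham"),
    ("meeting about the project report", "ham"),
]
model = train(training_data)
print(classify(model, "free money"))
print(classify(model, "project report meeting"))
```

Working in log-probabilities avoids numerical underflow once the vocabulary grows to thousands of words, and words outside the vocabulary are simply ignored.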

Anyway, representing general knowledge is still an open problem. I'll try to find some interesting papers later if I get time, and post the links here.

cheers

santi

[edited by - popolon on January 8, 2003 5:51:03 AM]
thanks
David Antonini.

