Senior member
Total posts: 473
Joined: Aug 28, 2010
I generally try to find a balance between reinventing the wheel and keeping abreast of all the research that’s been published in the fields of interest to me (e.g. artificial intelligence and robotics).
On the one hand, I think that attempting to “reinvent the wheel” is good because even though it’s very unlikely that I’d ever find a new and better solution to any given problem, it consolidates the things that I have learned. Ultimately, this engenders a greater respect and appreciation for the work of others.
However, I would also like to make progress as rapidly as possible. We’re probably all familiar with Sir Isaac Newton’s famous statement, “If I have seen further it is by standing on the shoulders of giants,” and that’s what I’d like this forum topic to be about.
So what is the current state of the art for artificial intelligence in general, and conversational software (chatbots) in particular? I’ll list a few of the projects that I know about here, from time to time, and I would like to hear about any others that you know about too. Then it would be inspirational and informative to discuss them further, possibly in topics of their own.
Number one on the list would have to be CYC (pronounced “sike”). This project has been in development since 1984 and was originally expected to take fifty years to complete! Unfortunately, it has mostly been funded by the US military, so the general public will probably never know its true capabilities (or lack of them). However, the last I heard was that CYC’s initial knowledge base of common sense (its primer) was largely complete and that it was busy reading the internet to acquire new knowledge.
http://www.cyc.com/
There is also a partly open source version that the general public can download and use here:
http://www.opencyc.org/
Posted: Sep 30, 2010
[ # 1 ]
Senior member
Total posts: 623
Joined: Aug 24, 2010
Cyc is a very cool project. I wish the website went into more technical detail in some areas.
The only thing about its database that I’m skeptical about is the need to transform words into special Cyc constants. I feel like this is unnecessary human interference if the database is sufficiently large. My hunch is that, just from looking at the database and comparing what contexts a word arises in, links between similar looking words (light and lit, for example) would be recognizable. Words with multiple definitions (bat and bat, light and light, etc.) as well. At least, these differences would be discernible to the point where Cyc could ask if the word can mean more than one thing, or if two words are related.
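To make that hunch a bit more concrete, here is a minimal sketch of comparing the contexts that words appear in. The three-sentence corpus, the window size, and the function names are all made up for illustration. It is plain co-occurrence counting with cosine similarity, not anything Cyc is known to do.

```python
from collections import Counter
from math import sqrt

# Tiny invented corpus; a real test would need a very large database.
SENTENCES = [
    "please light the candle tonight",
    "he lit the candle tonight",
    "the bat flew away",
]

def context_vector(word, sentences, window=2):
    """Count the words seen within `window` positions of `word`."""
    ctx = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, token in enumerate(tokens):
            if token == word:
                lo, hi = max(0, i - window), i + window + 1
                ctx.update(tokens[lo:i] + tokens[i + 1:hi])
    return ctx

def cosine(a, b):
    """Cosine similarity between two context-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

light = context_vector("light", SENTENCES)
lit = context_vector("lit", SENTENCES)
bat = context_vector("bat", SENTENCES)

print("light vs lit:", cosine(light, lit))
print("light vs bat:", cosine(light, bat))
```

In this toy corpus "light" and "lit" share more context words than "light" and "bat", which is the kind of signal that could prompt a question like "are these two words related?". Telling apart two senses of the same spelling would need something more, such as clustering the contexts of a single word.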
But this is only a hunch. Of course, the point of Cyc is to be an information aggregator, and the best way to do that is not always the way that allows Cyc the most freedom to grow independently. I’m sure a lot of information restructuring and culling is necessary, no matter how good the NL processing.
The FACTory is a great idea to help Cyc build confidence in various facts in a structured environment. Perhaps not as fun as chatting, but probably more beneficial.
Posted: Oct 1, 2010
[ # 2 ]
Senior member
Total posts: 473
Joined: Aug 28, 2010
http://www.cyc.com/cycdoc/ref/cycl-syntax.html#constants
CYC constants don’t necessarily correspond to words (which are vague anyway) and I think that’s why it’s important to distinguish them visually by the hash-dollar prefix. While it might be a workable scheme to use specific word senses, e.g. bat1, bat2, light1, light2, etc., the designers of CYC undoubtedly found it convenient to label (or reify) many concepts for which any given natural language simply doesn’t have a specific word or sense of a word to use.
The other thing to remember is that to be really useful, CYC has to be independent of any particular language. It should be relatively straightforward to localize CYC constants for work in English or Spanish or Japanese etc. Creation of the mappings between CYC constants and word senses or phrases in various languages is a separate task, and probably a much more difficult one, which is associated with the (logical form) grammar which would have to be created for each natural language interface.
If you like the idea of the FACTory then you should take a look at a couple of earlier, independently conceived implementations of that idea from the early days of the internet. They were called OpenMind and MindPixel respectively. Both projects have tragic stories attached to them, but they were the first seeds of what is likely to become something really great one day.
http://en.wikipedia.org/wiki/Mindpixel
http://en.wikipedia.org/wiki/Open_Mind_Common_Sense
Posted: Oct 1, 2010
[ # 3 ]
Senior member
Total posts: 623
Joined: Aug 24, 2010
I can understand the desire to be able to map the database to any language; however, there is so much grammar tied into the structure of the database that I wonder how feasible this would be. Perhaps I need to go back and look more closely.
I remember hearing about the Mindpixel project while it was going on. And I think I’ve seen an implementation of OMCS at the MIT Museum before. The console was broken when I was there, which was a shame. (It might have been another project that utilized OMCS. Can’t remember now.)
Wouldn’t it be cool to get your hands on some of these projects, just long enough to really put them through their paces?
Posted: Oct 1, 2010
[ # 4 ]
Senior member
Total posts: 473
Joined: Aug 28, 2010
Well, I have copies of all the original data and server source code from OpenMind which I downloaded once when it was available on the internet (circa 2003). Since Push Singh’s death the project seems to have forked into many different related projects and I’m not even sure if these files can be downloaded anywhere anymore. All the code and data are written in Common Lisp and stored in plain text files, so it’s quite accessible. If you or anyone else would like to get copies, please send me a private message and I’ll send them to you.
Posted: Oct 1, 2010
[ # 5 ]
Senior member
Total posts: 974
Joined: Oct 21, 2009
I just read a bit about “Cyc NL” (at http://www.cyc.com/cyc/cycrandd/areasofrandd_dir/nlu).
It seems a bit too rigid for my liking. However, I strongly agree that its immense store of common sense knowledge will prove invaluable for helping the system decide which parse tree is the one the user meant to convey. Also, I really think they are on the right track when they say things like:
In the example “the man saw the light with the telescope”, the semantic component would consult the KB to find out whether telescopes are typically used as instruments in seeing, and whether lights are the kinds of things that usually have telescopes.
This is the approach I’m taking; I believe humans use this kind of ‘typical’ reasoning when we interpret NL statements.
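A minimal sketch of that kind of ‘typical’ reasoning for the telescope example might look like the following. The two lookup tables are a toy stand-in for a knowledge base, with entries I made up for illustration. They are not Cyc data, and the function name is hypothetical.

```python
# Toy knowledge base; every entry is an invented illustration, not Cyc content.
INSTRUMENTS_FOR = {
    "see": {"telescope", "binoculars", "microscope"},
    "cut": {"knife", "scissors"},
}
TYPICALLY_HAS = {
    "man": {"telescope", "hat"},
    "house": {"roof", "door"},
}

def attach_pp(verb, obj, pp_noun):
    """Decide where 'with the PP_NOUN' attaches in 'VERB the OBJ with the PP_NOUN'.

    Returns 'verb' (instrument reading), 'noun' (the object has it),
    or 'ambiguous' when the toy KB supports both readings or neither.
    """
    verb_ok = pp_noun in INSTRUMENTS_FOR.get(verb, set())
    noun_ok = pp_noun in TYPICALLY_HAS.get(obj, set())
    if verb_ok and not noun_ok:
        return "verb"
    if noun_ok and not verb_ok:
        return "noun"
    return "ambiguous"

# "the man saw the light with the telescope"
print(attach_pp("see", "light", "telescope"))
```

Here the telescope is a typical instrument for seeing and lights are not listed as typically having telescopes, so the instrument reading wins. A real system would presumably need graded typicality rather than yes/no membership.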
I’m not sure how flexible their concept of ‘verb-based’ templates will be, though. I would REALLY love to see Cyc-NL in action. Is anyone aware of any video demos, or even screenshots or sample conversations?
I realize Cyc-NL wasn’t designed to be part of a chatbot engine, but I would imagine somewhere there is a demo of it taking an NL statement and being asked a follow-up question to test its understanding.
Posted: Oct 1, 2010
[ # 6 ]
Administrator
Total posts: 3111
Joined: Jun 14, 2010
Victor Shulist - Oct 1, 2010:
I realize Cyc-NL wasn’t designed to be part of a chatbot engine, but I would imagine somewhere there is a demo of it taking an NL statement and being asked a follow-up question to test its understanding.
I think that if you look at a lot of our “history of invention”, you’ll find that there are a LOT of instances where an item or tool was designed for one thing, but found a more productive application as part of a totally different system. The worm gear, for example, wasn’t originally designed, hundreds of years ago, to help us steer our vehicles, or to extrude plastics through a molding die, but here we are. Innovation is a GOOD thing!
Posted: Oct 2, 2010
[ # 7 ]
Senior member
Total posts: 971
Joined: Aug 14, 2006
Andrew Smith - Sep 30, 2010: So what is the current state of the art for artificial intelligence in general, and conversational software (chatbots) in particular?
Hey Andrew, again you inspired me. Here’s an idea:
Suppose we make a list of AI projects and we ask Chatbots.org members to vote. Each vote is good for one point. The more points a project gets, the more appealing it apparently is.
Furthermore, it should be a dynamic list. Each vote starts out at 1 point, but as time passes, the points gradually crumble (after one day, a vote is worth 0.999 points, a day later 0.998 points, etc.).
Furthermore, votes of senior Chatbots.org members (those who’ve posted a lot) are worth more points (each 100 forum postings is good for 1 extra point).
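A rough sketch of how that scoring could work is below. It assumes the decay is a fixed multiplication by 0.999 per day and that seniority adds one point per full 100 postings; both are just one reading of the idea, and the function names are made up.

```python
def vote_weight(forum_posts):
    """Base weight of 1 point, plus 1 extra point per full 100 forum postings."""
    return 1 + forum_posts // 100

def vote_value(forum_posts, age_days, daily_decay=0.999):
    """Current value of a vote cast `age_days` ago (assumed multiplicative decay)."""
    return vote_weight(forum_posts) * daily_decay ** age_days

def project_score(votes):
    """Sum the current values of (forum_posts, age_days) pairs for one project."""
    return sum(vote_value(posts, age) for posts, age in votes)

# Two votes for one project: a newcomer who voted yesterday and a member
# with 300 postings who voted 30 days ago.
print(project_score([(0, 1), (300, 30)]))
```

With this decay rate a vote keeps roughly 83% of its value after half a year (0.999 to the power 182), so a faster decay would be needed if old votes are supposed to fade out quickly.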
We’ll publish this list on the home page.
What do you think? State of the art revealed?
Posted: Oct 2, 2010
[ # 8 ]
Administrator
Total posts: 3111
Joined: Jun 14, 2010
I’m not sure, Erwin. “State of the Art” doesn’t necessarily equate with popularity. Granted, a majority of the community here are likely to be “early adopters”, when it comes to AI technology, so the more popular projects will also be more in the “State of the Art” category, but this is a generalization that may or may not hold true. It’s certainly worth trying, but I don’t think we should put all that much emphasis on any correlations, just yet.
Posted: Oct 2, 2010
[ # 9 ]
Senior member
Total posts: 473
Joined: Aug 28, 2010
I agree with Dave on this one. Marketing oriented polls more properly belong with marketing oriented areas of the website. In this topic at least I would like the emphasis to be placed on projects which are pushing the boundaries of technical capability, even when it means that they are currently beyond the reach of the chatbot enthusiasts using this website. Of course, when an advanced project has byproducts that we can use (e.g. CYC is producing OpenCYC) that’s great, but it is also important to stimulate our imagination. For example, just knowing about a project like IBM’s Watson quiz show contestant could provide plenty of extra motivation for those of us engaged in long solitary coding sessions…
Posted: Oct 3, 2010
[ # 10 ]
Experienced member
Total posts: 69
Joined: Aug 17, 2010
Haha, I hope you guys will vote in several months. So that my system may have the chance to win.
Erwin Van Lun - Oct 2, 2010: Andrew Smith - Sep 30, 2010: So what is the current state of the art for artificial intelligence in general, and conversational software (chatbots) in particular?
Hey Andrew, again you inspired me. Here’s an idea:
Suppose we make a list of AI projects and we ask Chatbots.org members to vote. Each vote is good for one point. The more points a project gets, the more appealing it apparently is.
Furthermore, it should be a dynamic list. Each vote starts out at 1 point, but as time passes, the points gradually crumble (after one day, a vote is worth 0.999 points, a day later 0.998 points, etc.).
Furthermore, votes of senior Chatbots.org members (those who’ve posted a lot) are worth more points (each 100 forum postings is good for 1 extra point).
We’ll publish this list on the home page.
What do you think? State of the art revealed?
Posted: Oct 4, 2010
[ # 11 ]
Senior member
Total posts: 697
Joined: Aug 5, 2010
I think I’d also be careful with such a thing. Polls of this type usually end up being about a lot of things other than their original (or intended) purpose.
Posted: Oct 5, 2010
[ # 12 ]
Senior member
Total posts: 623
Joined: Aug 24, 2010
On the topic of state-of-the-art for NLP, how about NELL: http://www.nytimes.com/2010/10/05/science/05compute.html
It sounds like NELL knows a limited number of sentence parses, but by crawling the web, can find enough material within its grammatical understanding to build an impressive database.
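As a toy illustration of how far a handful of fixed sentence patterns can go, here is a sketch that pulls category facts out of plain text. The two regular expressions and the example sentences are my own inventions, and this is only a guess at the general flavour of the approach, not NELL’s actual method.

```python
import re
from collections import defaultdict

# Two hand-written patterns, each tagged with which captured group is the instance.
PATTERNS = [
    (re.compile(r"(\w+) is an? (\w+)"), "instance_first"),   # "X is a Y"
    (re.compile(r"(\w+) such as (\w+)"), "category_first"),  # "Ys such as X"
]

def extract_facts(texts):
    """Return a dict mapping each category word to the set of instances found."""
    facts = defaultdict(set)
    for text in texts:
        for pattern, order in PATTERNS:
            for first, second in pattern.findall(text):
                if order == "instance_first":
                    instance, category = first, second
                else:
                    instance, category = second, first
                facts[category.lower()].add(instance)
    return facts

texts = [
    "Pittsburgh is a city in Pennsylvania.",
    "He visited cities such as Boston last year.",
]
for category, instances in extract_facts(texts).items():
    print(category, sorted(instances))
```

Even this crude version shows the weakness discussed in the article: a single misleading sentence puts a wrong instance into a category, and nothing in the patterns themselves can notice the mistake.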
Posted: Oct 5, 2010
[ # 13 ]
Senior member
Total posts: 473
Joined: Aug 28, 2010
C R Hunt - Oct 5, 2010: On the topic of state-of-the-art for NLP, how about NELL: http://www.nytimes.com/2010/10/05/science/05compute.html
It sounds like NELL knows a limited number of sentence parses, but by crawling the web, can find enough material within its grammatical understanding to build an impressive database.
That is a very interesting article and the description of how NELL went wrong was particularly intriguing. Given that NELL is supposed to be able to correct itself as it learns more, I wonder if it would have been able to figure out its misunderstanding of baked goods by itself eventually. If not, then what strategies, other than the obvious one of intervention by a teacher, might reduce the likelihood of such mistakes in future?
Posted: Oct 5, 2010
[ # 14 ]
Senior member
Total posts: 971
Joined: Aug 14, 2006
Dave Morton - Oct 2, 2010: I’m not sure, Erwin. “State of the Art” doesn’t necessarily equate with popularity
Andrew Smith - Oct 2, 2010: I agree with Dave on this one. Marketing oriented polls more properly belong with marketing oriented areas of the website. ....In this topic at least I would like the emphasis to be placed on projects which are pushing the boundaries of technical capability, even when it means that they are currently beyond the reach of the chatbot enthusiasts using this website.
Jan Bogaerts - Oct 4, 2010: I think I’d also be careful with such a thing. Polls of this type usually end up being about a lot of things other than their original (or intended) purpose.
OK, so the general opinion is that we should not implement a standard way of polling. I fully agree on all arguments provided.
Here’s what I’m thinking about:
“Pushing the boundaries of technical capability” is key. This is also time-bound, because what’s pushing the boundaries today is outdated tomorrow. So the list should be very dynamic. After half a year, all points should be lost.
Marketing-oriented polls… I actually would like to ask you guys to vote. But I understand what you mean. Someone who entered ‘chatbot’ in Google, arrived on this website and registered on this forum can also vote. That person could be a guru, but could also be an interesting youngster of 16 from Walleehhoo. Once registered, he can vote as well.
So here’s the mechanism:[ul]
[li]we have cases: entries and descriptions, with video if possible[/li]
[li]all registered members of Chatbots.org can upload cases[/li]
[li]all registered members of Chatbots.org have a credit point basket[/li]
[li]active Chatbots.org members can assign credit points to cases by dragging and dropping points onto cases[/li]
[li]Per 100 postings, you earn 1 point. If you have written 300 postings, you have 3 credit points to assign to projects[/li]
[li]people with proven track records in AI (university professors, authors etc.) also get more credit points[/li]
[li]The top 10 state of the art projects are displayed in the left-hand column of the AI Zone forum. The list is also shown in the Awards area of chatbots.org[/li]
[li]you can reassign your credit points at any time by dragging and dropping your credits onto another project[/li]
[li]your assignments remain valid for 6 months. After that, your points fall back into your credit point basket.[/li][/ul]
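The mechanism above can be sketched roughly as follows. The class and function names are hypothetical, “6 months” is approximated as 182 days, and the bonus for a proven AI track record is left as a free parameter because no amount has been decided.

```python
from collections import Counter
from datetime import date, timedelta

ASSIGNMENT_LIFETIME = timedelta(days=182)  # "6 months", approximated (assumption)

class Member:
    def __init__(self, name, postings, bonus_points=0):
        self.name = name
        # 1 credit point per 100 postings, plus an optional track-record bonus.
        self.total_points = postings // 100 + bonus_points
        self.assignments = {}  # case name -> (points, date assigned)

    def _expire(self, today):
        """Drop assignments older than the lifetime; their points become free again."""
        self.assignments = {
            case: (points, day)
            for case, (points, day) in self.assignments.items()
            if today - day < ASSIGNMENT_LIFETIME
        }

    def basket(self, today):
        """Points currently free to assign."""
        self._expire(today)
        return self.total_points - sum(p for p, _ in self.assignments.values())

    def assign(self, case, points, today):
        """Move points from the basket to a case (this also refreshes its date)."""
        if points <= 0 or points > self.basket(today):
            raise ValueError("not enough credit points in the basket")
        held, _ = self.assignments.get(case, (0, today))
        self.assignments[case] = (held + points, today)

    def withdraw(self, case):
        """Take all points back from a case so they can be reassigned."""
        self.assignments.pop(case, None)

def top_cases(members, today, n=10):
    """Rank cases by the total of unexpired points assigned to them."""
    totals = Counter()
    for member in members:
        member.basket(today)  # expire old assignments first
        for case, (points, _) in member.assignments.items():
            totals[case] += points
    return totals.most_common(n)

today = date(2010, 10, 5)
andrew = Member("Andrew", postings=473)
erwin = Member("Erwin", postings=971)
andrew.assign("Cyc", 3, today)
erwin.assign("NELL", 5, today)
erwin.assign("Cyc", 4, today)
print(top_cases([andrew, erwin], today))
```

One consequence of the “1 point per 100 postings” rule shows up straight away: members with fewer than 100 postings have an empty basket, so they could upload cases but not vote.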
What do you think?
Posted: Oct 5, 2010
[ # 15 ]
Administrator
Total posts: 3111
Joined: Jun 14, 2010
I think that sounds good, Erwin, except that perhaps the points should revert “back to zero” at 3 months rather than six. This is because of “Moore’s Law”, as applied to AI, making “The State of the Art” a very short-lived notion. Therefore, the lifespan of our votes should be equally short-lived.