|
|
Senior member
Total posts: 473
Joined: Aug 28, 2010
|
After the enthusiastic discussions about knowledge representation in the video hangouts over the last week, not to mention finding some thought provoking articles online (linked below), I thought it was time to dig out some of my old knowledge base code.
I wrote this program almost seven years ago after reading through “Artificial Intelligence: A Modern Approach” by Russell and Norvig. It was an attempt to consolidate my understanding of the chapters on knowledge representation and logical resolution. At the time I could never quite get it to work properly but after reading the articles linked below, problem solved! (FYI my handling of equalities was sloppy in the original version)
http://asmith.id.au/source/logic.lisp
My usual practice is to prototype my designs in Common Lisp and then develop scalable high capacity production versions in C. It will be interesting to see how long this conversion takes, and how much code is the result. I’ve always found it to be worth it though, not just for the performance boost but for the necessary simplification of the representation of the algorithms. Everything has to be written out explicitly in C and when there is no hidden magic, the program is easier to understand and maintain.
Here are the first couple of articles. They were novel and interesting but they left me wondering if they represented any real improvement over logical resolution. Maybe they just look good on paper. Has anyone else done machine learning and knowledge representation this way?
http://sifter.org/~simon/journal/20130713.h.html
http://sifter.org/~simon/journal/20130802.html
And here are some recent lecture notes about logical resolution which filled in (more of) the remaining blanks in my understanding of the subject. Judging by the amount of effort that is being devoted to advancing the theory and practice of logical resolution, and the substantial economic benefits arising from it, it does seem to be a technology worth investigating further.
http://www.doc.ic.ac.uk/~sgc/teaching/pre2012/v231/lecture8.html
http://www.doc.ic.ac.uk/~sgc/teaching/pre2012/v231/lecture9.html
Bonus link, a state of the art, ready to use, commercial quality, modelling and theorem proving system for your enjoyment:
http://alloy.mit.edu/alloy
|
|
|
|
|
Posted: Aug 11, 2013 |
[ # 1 ]
|
|
Senior member
Total posts: 133
Joined: Sep 25, 2012
|
Here are the first couple of articles. They were novel and interesting but they left me wondering if they represented any real improvement over logical resolution. Maybe they just look good on paper. Has anyone else done machine learning and knowledge representation this way?
http://sifter.org/~simon/journal/20130713.h.html
I’ve never used that system. So far I read only your first article above, and I found some drawbacks of your representation already…
When you hear that John is older than Mary and Mary is older than Jack and you “just know” without having to think about it that John is older than Jack, it’s a fair sign that that rule has been learned by your model. But when you learn the rule explicitly as above, it first becomes part of your state—mere factual memory, and not intuition.
I don’t believe rules are necessary for such representation. Binding of some sort is necessary, but binding doesn’t necessarily mean rules.
Lokendra Shastri explored what he calls “reflexive reasoning” with a model called SHRUTI. If you look through his articles on SHRUTI, you can find diagrams that are neurally inspired that show how activation (sometimes with synchronization) propagates through the system and automatically causes inferences, especially his examples of transferring ownership of a book, and the story of Little Red Riding Hood (LRRH). The idea is that in animal brains such inferences must happen very fast, so he calls such inferences “reflexive” and gives numerical time estimates for how fast such inferences are known to happen. Here is some mention of his work to get you started, though at the moment I can’t find any full articles of his online:
http://www1.icsi.berkeley.edu/~shastri/psfiles/ShastriOnVandervelde2.htm
http://cogweb.ucla.edu/Abstracts/Shastri_93.html
http://www.complexity.org.au/ci_louise/vol02/learning/node3.html
One look at Shastri’s diagrams, however, should tell you right away that a biological brain is very unlikely to contain something of that sort. Even if a brain did look like that on the inside, the problems of learning, symbol grounding, recognition, etc. would be prohibitive. I like some of Shastri’s ideas, but in my opinion any such knowledge representation fails immediately. I believe the brain is using something much more clever and much more natural than such labeled circuitry.
One drawback of your database table representation is redundancy (5 rows contain only 2 companies XCorp and ZCorp). You could represent that table as a semantic net that doesn’t duplicate fields, but semantic nets have drawbacks in representing “not”, non-binary relations, relationships between relationships, and so on. Your example of “~69.5 < X < ~70.5” is basically referring to fuzzy logic, but fuzzy logic has a number of problems as well. For example, how do you know for sure that your desired range is limited to exactly +- 0.5, and what fuzzy membership function is to be used, and why calculate all those numbers at high accuracy just to wastefully produce a fuzzy number at the end? And you already noted some of the problems with matching states: inexact states, and temporal issues. As you noted, using statistics it may be possible to make assumptions about states of complicated objects such as identification of visual objects by correlating rows and columns but any associative memory like that is wasteful since you need a separate line for every possible attribute.
So what am I saying, altogether? (1) Here’s a response to your post; and (2) knowledge representation is a hard problem that I suspect involves something more exotic than symbols, especially logic symbols. I believe you’re just falling into the same type of problems that others have encountered. I’ll read your second article hopefully soon, and I’ll try to comment. I also plan to post a lengthy math thread that discusses some of these issues indirectly.
|
|
|
|
|
Posted: Aug 11, 2013 |
[ # 2 ]
|
|
Senior member
Total posts: 473
Joined: Aug 28, 2010
|
Excuse me? They’re not my articles. They’re written by someone called Simon Funk. The articles failed to convince me that this method was worth pursuing instead of Logical Resolution, but I wondered if anyone else had any practical experience with it already.
Since making the post I’ve (coincidentally) found a more rigorous description of the method here:
http://en.wikipedia.org/wiki/Method_of_analytic_tableaux
In the meantime I’ll be sticking with Logical Resolution and yes I’m fully aware of it’s limitations, but until someone invents a better algorithm it will have to do.
PS Thanks for taking the trouble to respond. There’s certainly a lot to be discussed in this realm.
|
|
|
|
|
Posted: Aug 11, 2013 |
[ # 3 ]
|
|
Senior member
Total posts: 133
Joined: Sep 25, 2012
|
OK, I looked through the second of excuse-me-someone-called-Simon-Funk’s article, and it was just more detail (code) about his first article. Logical resolution is what I had to learn in one AI class I took, where we had to learn Prolog. I don’t have many comments on logic and Prolog apart from what is already well-known: logic is only one method of representation, such equations are a lot of work (we had to go through a lot of them in that class!) and are very similar to algebraic equations or algebraic proofs, there are problems with Prolog due to practical constraints, and in my experience humans use a very limited amount of logic, and the average person does it very poorly, so I don’t regard logic as the key to AI.
The Method of analytic tableaux is new to me and therefore mildly more interesting than the usual logic. Thanks for making us aware of that twist on proof techniques.
|
|
|
|
|
Posted: Aug 12, 2013 |
[ # 4 ]
|
|
Senior member
Total posts: 141
Joined: Apr 24, 2011
|
Hello, please let me step into this conversation.
Andrew: Tableaux is a logic method (something like a method as Gauss Jodan for Matrix-inverse resolution) but appropriate for propositional logic, also called FOL (Firs Order Logic) the whole stuff is used in Reasoners, using the Euler Method to avoid the double path problem.
In Prolog this methods are mostly used, there are lots of “reasoners” out there, as you might know. The use propositional logic and the reasoning (finding new relations) is based upon 2 kind of methods. Simply speaking, 1 way is try to infer from the known rules we have, and create new ones as far as they can be created, and see if they satisfy in some way the proposition you want to proof, this is called forward proofing.
The other method is backward-proofing, and is based upon starting from the proof you want to make, and making transformation by adding the statements you might have, until you find a closure-tautology or a contradiction.
Who knows a lot on this is Jos de Roo, I contacted him some time ago, he is very kind and a good teacher, he developed a lot of code for proofing mechanisms, just google him or go to this star point:
http://eulersharp.sourceforge.net/
Continuing with the “sort” explanation..
Both proofing methods are complicated and never not perfect: forward methods suffers from geometric path and data explosion, you normally have much more (huge amount) of hidden relations than the explicit ones; so fulfilling all is memory and time consuming, and don’t guarantee success, and the time for finding the answer is uncertain.
The other method or theorem proofing is backward-proofing, is also complicated because you search blindly until resolution is hit.. or not, also has exponential explosion but is able to be limited by Euler-path methods, it can be also enriched with local cache, etc.
My comment is that this is a expensive way to do AI, as long as they use FOL/RDF/N3 logic and the logic is binary there is little or small capability unless you have a big-server-farm and much money to waste in memory (>64GB RAM on each server, with 8 or more cores.. etc.) and even electricity…
The needed logic for AI (in my opinion) should be a “statistic” abstraction layer, backed up on “lazy rules” learned from the RDF-data (of course the FOL data and format ARE important) an automatic hinting using Linear Algebra Theory using Random Knowledge Vectors, Holographic & Relational Memory structures, where you can encode into a singe vector of (lets say) 2k components, as much as the information content of the Holly Bible, in just one vector! and make a simple linear-operation to see if something might be demonstrated with data from this text, or not.
All this has been researched a lot, for many years, the matrix linear-methods like SVD, and PCA, are very interesting and useful, but lacks precision, also they are based upon “bag-of-words” starting with a lazy knowledge representation. On the other hand are the hard-logic (boolean) proofing methods using large amount of memory and metadata..
There are also other AI-like methods (mostly classifiers) that use regression-based methods, like SVM, CRF (last fashion) and traditional HMM methods, based on Dynamic programming (Viterbi, etc.). all they go from the statistical towards the deterministic. The other comes from deterministic to exact. Both are wrong (for me)
the third branch is “parsers” which resemble FOL with a way-more complex structure, there is no yes/no logic, there s a “tree” of abstract relations (thanks to N. Chomsky) and they suffer also from geometric explosion, so they are not the “best” for AI, just good for small tasks like chunking, date-time parsing, ruler-based named entity recognition, among some more.
But in my thought…None of them can answer a “maybe” logical question, and find what are the “best” proof for a “homicide” just by reading the book and finding out all relations and guessing the most probable, (doing inference) Just in case: the “butler” did it!
best!
|
|
|
|
|
Posted: Aug 16, 2013 |
[ # 5 ]
|
|
Guru
Total posts: 1009
Joined: Jun 13, 2013
|
I’m afraid I am very unfamiliar with AI jargon, but if I understand correctly then logical resolution uses normal facts and inference, and the other writes knowledge as propositional rules (if - then). Yes?
At this point in development I’m only using regular facts and inference, but I do reckon that these will not suffice for e.g. problem-solving.
Inference would conclude from the facts “socrates is a man” + “men are mortal” that socrates inherits the trait of being mortal. My program inferes this if-then relation while running, because there are various other relations possible between facts. Storing static if-then rules in the knowledge database itself would be less versatile, restricting, and result in a larger database.
On the other hand are facts of requirement. For these one could store the rule “if it has legs, then it can walk”, or the facts “it has legs” + “legs can walk” or the requirement “walking: requires legs”. Only the latter seems suitable for backwards-proofing, which I think is the more efficient method for goal-oriented processes like problem-solving or deducing “whodunnit” (murder: requires motive, means, opportunity).
My main issue with knowledge representation is that computers operate quite linear, while I suspect that knowledge in the human brain is easily searchable in reverse. e.g. Roses are red. “Name something that is red” and “What colour is a rose?” derive their answer from one and the same association. But while MS Windows needs only a microsecond to search for the specified subdirectory “rose”, it takes minutes to search 10000 subdirectories for the specified word “red”.
|
|
|
|
|
Posted: Aug 16, 2013 |
[ # 6 ]
|
|
Senior member
Total posts: 133
Joined: Sep 25, 2012
|
Don Patrick - Aug 16, 2013:
My main issue with knowledge representation is that computers operate quite linear, while I suspect that knowledge in the human brain is easily searchable in reverse. e.g. Roses are red. “Name something that is red” and “What colour is a rose?” derive their answer from one and the same association. But while MS Windows needs only a microsecond to search for the specified subdirectory “rose”, it takes minutes to search 10000 subdirectories for the specified word “red”.
True. Content-addressable memory (CAM) can be built into a digital circuit easily enough, but the main problem is that the search lines (the vertical lines in Figure 1 at https://www.pagiamtzis.com/cam/camintro/) are costly, since you need one line for every attribute for the object in memory that you might want to search for, and with a long set of addresses that means very long lines. The brain probably doesn’t use such lines (although admittedly I’m currently designing an AI system that uses such search lines, at least in my initial design). As for searching for subdirectories, I believe brute force software search is used, not CAM, which is typically built into hardware.
As for your question about differences between various logics and production systems and their terminology, I don’t know offhand since I rarely use either of those anymore so I’d have to look them up again to refresh my memory.
Nice overview, Andres. That’s deeper than I ever got into the subject.
|
|
|
|
|
Posted: Aug 16, 2013 |
[ # 7 ]
|
|
Senior member
Total posts: 141
Joined: Apr 24, 2011
|
Hi Mark and Don
First,
Don:
you got the intuition!
Indeed but you need to read/learn a lot! if not, and even if you are brilliant.. you will struggle hard and may re-invent the AI-wheel many times, and in your lifetime you’ll never reach something affordable or just to say reaching or touching the state of-the-art.
The main reason why the academic people publish things is not just because they are egocentric (perhaps they are), but the main reason is just because its too hard to re-invent everything in.. just a fraction of a lifetime.
This is the main reason of success of the mankind, just learn what others did, and start from there!
Don’t try to reinvent the AI, just follow the papers, start reading the first ones .. older.. (never the last/new ones) or you may crash your mind, and abandon!
During your reading/learning you will find that many many people was so enlightened as you; some others less and a few, may be more!
You will find out that after reading (and for god-will.. understanding) the ‘state of the art’ methods, you will be able to see what is solved, and what not, and the paths others have followed on the search.. this gives you enormous power to create new stuff!
belief me
Mark:
Sorry for the neologisms, I just wanted to be precise, but I don’t want to include a glossary on each answer so go to google and search the AI terms, and in wikipedia there is a lot of info! (I just wanted to explain some things based upon the actual knowledge on AI) sorry for the inconvenience! and Good luck with your AI-process! and please read Marvin Minsky’s book: the society of mind! (what you mean are similar to the k-lines)
|
|
|
|
|
Posted: Aug 17, 2013 |
[ # 8 ]
|
|
Guru
Total posts: 1009
Joined: Jun 13, 2013
|
Mark, thank you for pointing out those options, I did not know of hardware solutions like CAM. I’m not a fan of brute force the way IBM practices it (I’m a fan of efficiency), but it’s nice to know there are such solutions available should I eventually run into speed limits.
Andres, such credit is not due me , but I am flattered.
You are obviously a man of great experience. I am almost ashamed to say that I must decline your good advice of learning from reading, sound as it is in general. Due to some form of dyslexia or other, I am realistically faster at re-inventing a wheel than I am at reading texts, while inventing grants me deeper insights and that most important motivation. Were one to show me solutions, or how much I would have to catch up on, I would abandon instantly.
One other reason is that I believe that all that I could read is already being read and practiced by scholars and professors far more suitable. I would just be following them further behind on the same path.
What I shall do instead of reading, is approach some AI professors who have been recommended to me. It is… more efficient.
|
|
|
|
|
Posted: Aug 17, 2013 |
[ # 9 ]
|
|
Member
Total posts: 2
Joined: Jul 29, 2013
|
Dear, Andrew, Mark, Andres, Don!
It is very interesting discussion that is going right now, but probably you can get more information by looking the following links I put in another thread several weeks ago.
The theory http://generalinformationtheory.com
The representation language is described in the chapter 4 http://generalinformationtheory.com/content4.php
The part 5 is examples.
The introduction into the theory is publicized here http://www.sciedu.ca/journal/index.php/air/article/view/2225
The subject of the theory called TMI is what kinds of knowledge is actually represent by C++ and how this language (I mean the linguistic part, but not the programing tool) has to be extended in order to produce the universal representation language.
The result of extension - the language T - is completely described in the work. There are also such example representations as the ontology of time, the ontology of event and several sentences from Sowa like following
“Diamonds were given to Ruby“.
“Mary waited until noon“.
“The truck was serviced for 5 hours“.
I have also an example representation of the following not really simple sentence:
“Back in 1969, when Grazyna Bialowas was 18 years old, she and a friend planted two rows of trees to beautify the state-owned cable factory that was the center of her world in the town of Ozarow, outside Warsaw.”
The theory is not restricted to the human language and can be applied to every kind of knowledge possessed by any possible object as living as artificial.
I hope you can answer your questions by studying this stuff.
|
|
|
|
|
Posted: Sep 30, 2013 |
[ # 10 ]
|
|
Senior member
Total posts: 133
Joined: Sep 25, 2012
|
Andres Hohendahl - Aug 16, 2013:
and please read Marvin Minsky’s book: the society of mind! (what you mean are similar to the k-lines)
Wow, Andres, you really got me going on Minsky’s book “The Society of Mind”! I bought that book years ago but barely got any of it read before moving off to other reading, but since you motivated me to look up K-lines, I decided to read that whole book, and I’m now about 1/3 the way through.
What a great book! His hypothesis about agents is very convincing, although I still have my own hypothesis of intelligence that is different. The book shows very deep thinking, which is not surprising since I remember it was online in many draft forms for years before it was finally published, which means he got loads of feedback and ideas from people on his ideas before publishing it. Already I’ve incorporated some of his concepts and terminology into my own architecture, and I’m heartened to find that many of my own conclusions or insights were independently thought of by Minsky. (Or did I happen to see those ideas in the book years ago, where they merely took hold years later where they were planted as seeds in my subconscious?) The book is full of great insights, great quotes, and great designs. Object-oriented programming (OOP) enthusiasts should take note of his K-lines design since that idea resolves one of the worst problems in OOP, the problem of multiple inheritance. You might notice that I’ve also started to include quotes from his book in my big thread about the math of AI.
Here’s part of the section on K-lines from that book, if anybody’s interested:
(p. 82)
8.1 K-LINES: A THEORY OF MEMORY
We often talk of memory as though the things we know were stored away in boxes of the
mind, like objects we keep in closets in our homes. but this raises many questions.
How is knowledge represented?
How is it stored?
How is it retrieved?
Then, how is it used?
Whenever we try to answer any of these, others seem to get more complicated, because we
can’t distinguish clearly what we know from how it’s used. The next few sections explain a
theory of memory that tries to answer all these questions at once by suggesting that we keep
each thing we learn close to the agents that learn it in the first place. That way, our knowledge
becomes easy to reach and easy to use. The theory is based on the idea of a type of agent called
a “Knowledge-line,” or “K-line” for short.
Whenever you “get a good idea,” solve a problem, or have a memorable experience,
you activate a K-line to “represent” it. A K-line is a wirelike structure that attaches
itself to whichever mental agents are active whenyou solve a problem or have a
good idea.
When you activate that K-line later, the agents attached to it are aroused, putting
you into a “mental state” much like the one you were in when you solved that
problem or got that idea. This should make it relatively easy for you to solve new,
similar problems!
In other words, we “memorize” what we’re thinking about by making a list of the agents
involved in that activity. Making a K-line is like making a list of the people who came to a
successful party.
(“The Society of Mind”, Marvin Minsky, 1986)
Boris:
I still intend to look through your web site in detail, but I’ve been too busy lately.
|
|
|
|
|
Posted: Oct 3, 2013 |
[ # 11 ]
|
|
Senior member
Total posts: 133
Joined: Sep 25, 2012
|
Boris Sunik - Aug 17, 2013:
The theory is not restricted to the human language and can be applied to every kind of knowledge possessed by any possible object as living as artificial.
Thanks, Boris. I finally looked at your web pages and file in more detail today. I liked that you created a way to declare chronological events, especially if things happen simultaneously. The main drawback I detect is the same drawback that any symbolic language has: it is abstracted and discretized representation.
For example, what if two events happened simultaneously but it were known that certain subevents within those two events happened in specific temporal relation to each other? For example, one person is loading their car while two people are conversing nearby. The car loader happens to slam his trunk shut just as one person says the word “not” within a sentence. Then that accidental overlap happens to be significant because it completely obliterates the auditory signal “not”, which in turn reverses the meaning of the associated sentence, which in turn might manifest itself in a broken friendship, broken business deal, or some other major mishap. What the conversers will later remember about the event is that one moment where the two subevents happened to interfere, so a representation system that captured *all* relevant information would need to be able to store the *continuous* details of the overlap, *exactly*.
Similar situations occur even without time. For example, to represent a chessboard, it may be sufficient during an online chess match to give only the symbol “N” for a knight, but if a physical chessboard were being sold online, the style of the carved knight (e.g., Staunton, French Regence, etc.) or whether it contained an internal weight might be information that is very relevant. That suggests that the goal or purpose of a representation is a critical criterion for knowing which information to save and which to discard, and representation of such real-world goals is quite problematic.
(p. 257)
We shall now try to show not only that human behavior can be regular
without being governed by formalizable rules, but, further, that it has to
be, because a total system of rules whose application to all possible
eventualities is determined in advance makes no sense.
In our earlier discussion of problem solving we restricted ourselves to
formal problems, in which the subject had to manipulate unambiguous
symbols according to a given set of rules, and to other context-free
problems such as analogy intelligence tests. But if CS is to provide a
psychological theory—and if AI programs are to count as intelligent—
they must extend mechanical information processing to all areas of
human activity, even those areas in which people confront and solve
open-structured problems in the course of their everyday lives.
Open-structured problems, unlike games and tests, raise three sorts of
difficulties: one must determine which facts are possibly relevant; which
are actually relevant; and, among these, which are essential and which
inessential. To begin with, in a given situation not all facts fall within the
realm of possible relevancy. They do not even enter the situation. Thus,
in the context of a game of chess, the weight of the pieces is irrelevant.
It can never come into question, let alone be essential or inessential for
deciding on a specific move. In general, deciding whether certain facts
are relevant or irrelevant, essential or inessential, is not like taking blocks
out of a pile and leaving others behind. What counts as essential depends
on what what counts as inessential and vice versa, and the distinction cannot
be decided in advance, independently of some particular problem, or
some particular stage of some particular game. Now, since facts are not
relevant or irrelevant in a fixed way, but only in terms of human pur-
poses, all facts are possibly relevant in some situation. Thus for example,
if one is manufacturing chess sets, the weight is possibly relevant (al-
though in most decisions involved in making and marketing chess sets,
it will not actually be relevant, let alone essential). This situational
character of relevance works both ways: In any particular situation an
(p. 258)
indefinite number of facts are possibly relevant and an indefinitely large
number are irrelevant. Since a computer is not in a situation, however,
it must treat all facts as possibly relevant at all times. This leaves AI
workers with a dilemma: they are faced either with storing and accessing
an infinity of facts, or with having to exclude some possibly relevant facts
from the computer’s range of calculations.
But even if one could restrict the universe for each particular problem
to possibly relevant facts—and so far this can only be done by the
programmer, not the program—the problem remains to determine what
information is actually relevant. Even in a nonformal game like playing
the horses—which is much more systematic than everyday open-struc-
tured problems—an unlimited, indefinitely large number of facts remain
as possibly relevant. In placing a bet we can usually restrict ourselves to
such facts as the horse’s age, jockey, past performance, and competition.
Perhaps, if restricted to these facts from the racing form, the machine
could do fairly well, possibly better than an average handicapper; but
there are always other factors such as whether the horse is allergic to
goldenrod or whether the jockey has just had a fight with the owner,
which may in some cases be decisive. Human handicappers are no more
omniscient than machines, but they are capable of recognizing the rele-
vance of such facts if they come across them.
(“What Computers Still Can’t Do: A Critique of Artificial Reason”, Herbert L. Dreyfus, 1992)
(p. 149)
The aim of a model is, of course, precisely not to reproduce
reality in all its complexity. It is rather to capture in a vivid, often
formal, way what is esssential to understanding some aspect of its
structure or behavior. The word “essential” as used in the above
sentence is enormously significant, not to say problematical. It im-
plies, first of all, purpose. In our example, we seek to understand
how the object falls, and not, say, how it reflects sunlight in its
descent or how deep a hold it would dig on impact if dropped from
such and such a height. Were we interested in the latter, we would
have to concern ourselves with the object’s weight, its terminal ve-
locity, and so on. We select, for inclusion in our model, those fea-
tures of reality that we consider to be essential to our purpose.
(“Computer Power and Human Reason: From Judgment to Calculation”, Joseph Weizenbaum, 1976)
|
|
|
|