Writing Efficient CycL: Part 1

This section will give you some specific suggestions on how to write CycL that can be more efficiently handled by our Inference Engine. The lessons will give you a set of heuristics, some of which are mutually contradictory, so you’ll have to trade off between them (all good heuristics are mutually contradictory, otherwise you could set up a decision tree, right?).

If you have completed all of the lessons in order, then you’ve been exposed to a lot of suggestions on doing OE that make representational sense, so they are suggestions from the representational side. This section, however, is focusing on stuff from the implementation and algorithmic side that treats CycL as more than just a notational exercise in predicate calculus calligraphy. You’re writing something that is meant to be used by an automaton efficiently to do useful things. So you’ll see some suggestions which might seem a little bit inelegant; actually, many of them have analogs to software engineering principles which would be considered elegant by software engineers, so it’s worth keeping your mind open.

The slide shows an outline of the lessons in this section, going from most fundamental to least fundamental (and most idiosyncratic to our system). Many of the earlier lessons apply to any knowledge representation system and some of the later ones reflect the current CycL state.

"Simpler is Better"
Simpler is Better

This is probably the most powerful engineering principle of all time. In our system, simpler is better in many very concrete ways. For example, if you can find a way of expressing the knowledge that you want using GAFs (ground atomic formulas) instead of rules, that is better. It is better because those GAFs probably have a much more uniform definition of all the vocabulary in them, which means that you’re probably reusing a lot of work that’s already there. So there is probably better indexing involved, better ways of more naturally generating English from them, and better ways of translating them into other applications that can deal only with GAFs.

GAFs are better than rules

Even though rules are very valuable, they are just one tool in your kit. Rule macro predicates are sort of a hybrid between rules and GAFs that have the terseness of GAFs and the expressibility of rules (you’ll learn more about these in the next lesson). You really can express a lot of your knowledge using GAFs, and I strongly encourage you to do so. By weight in the system, approximately 98% of the system is GAFs. This is too low. The system should be more like 99.9% GAFs with very powerful, very general rules defining their meaning.

A lot of the rules tend to be initially conceived in a very idiosyncratic way, like “a koala eats eucalyptus.” We should have a general notion of prototypical diet that could be used in a GAF to match things like koala and eucalyptus (we, in fact, have the predicate #$hasDiet, that can match a koala with the type of food it eats). You can imagine all of the other analogous things that you could stamp out like that with one rule that defines what “prototypicalDiet” means. That’s an example of the fact that simpler representation is better because it encourages you to write a more powerful definition based on a more general, more reusable rule.
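The koala example can be sketched in a few lines of Python. This is only a toy model, not CycL and not a description of how the Inference Engine stores anything: diet knowledge is kept as ground facts shaped like (hasDiet animal-type food-type), and one general function plays the role of the single rule that gives those facts their meaning. The names `GAFS` and `typically_eats`, and all of the sample facts, are assumptions made up for illustration.

```python
# Toy illustration: many ground facts plus ONE general rule.
# Each GAF is a tuple (predicate, arg1, arg2); the facts are invented.
GAFS = {
    ("hasDiet", "Koala", "EucalyptusLeaf"),
    ("hasDiet", "GiantPanda", "Bamboo"),
    ("hasDiet", "Cheetah", "Antelope"),
    ("isa", "Kira", "Koala"),
    ("isa", "Chui", "Cheetah"),
}


def typically_eats(individual):
    """The single general 'rule': an instance of an animal type typically
    eats whatever the hasDiet GAFs for that type name."""
    types = {a2 for (p, a1, a2) in GAFS if p == "isa" and a1 == individual}
    return {a2 for (p, a1, a2) in GAFS if p == "hasDiet" and a1 in types}


print(typically_eats("Kira"))
```

Adding a new animal type here means adding one more tuple; the function that interprets the tuples never changes.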

Binary is better than non-binary

If you can represent something in binary predicates instead of non-binary predicates, that’s probably better also. There’s a reason why frame-and-slot systems and other sorts of binary representational systems can go a long way; it’s because, many times, the higher-arity stuff is an example of not factoring the knowledge concretely. A common case of this is in gratuitously ternary predicates where there is some term in the arg1 position, some other term in the arg2 position and some other term in the arg3 position. But the relationship between arg1 and arg2 and the relationship between arg1 and arg3 are independent. If you had two separate predicates, you could state those relationships independently.

This is exactly analogous to the difference between a highly normalized database and a non-normalized one. Picture a table with ten columns where, when people enter new records, the records have a bunch of nulls in them because they don’t know the values. The fact that you can enter a null is a tip-off that the column is independent of the others and does not belong in that table; there should be separate tables that state each such column independently. So if you see yourself wanting to create a quaternary predicate, convince yourself that all of its arguments are needed together to state a single fact -- that you’re not using one assertion to state n separable things.
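The null tip-off can be shown with a small Python sketch. The predicate names `personInfo`, `birthYear`, and `employer`, and the sample records, are hypothetical; the point is only that a ternary predicate whose arg2 and arg3 are independent forces placeholder values, while the factored binary predicates never need them.

```python
# A "gratuitously ternary" predicate: (personInfo person birthYear employer).
# The two relationships are independent, so unknown values force None in.
ternary = [
    ("personInfo", "Alice", 1970, "Acme"),
    ("personInfo", "Bob", None, "Acme"),    # birth year unknown
    ("personInfo", "Carol", 1985, None),    # employer unknown
]


def factor(ternary_facts):
    """Split each ternary fact into independent binary facts,
    emitting a fact only when its value is actually known."""
    binary = []
    for _pred, person, year, employer in ternary_facts:
        if year is not None:
            binary.append(("birthYear", person, year))
        if employer is not None:
            binary.append(("employer", person, employer))
    return binary


binary = factor(ternary)
for fact in binary:
    print(fact)
```

After factoring, a missing value is simply an absent assertion rather than a null inside some other assertion.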

Horn rules are better than non-Horn rules

Horn rules are preferred over non-Horn rules. For those of you who are unfamiliar with this vocabulary, a “Horn” rule is one in which you have a number of literals in the antecedent ANDed together and a single literal in the consequent. It’s used to say something like “a conjunction of things implies something to be true.” A non-Horn rule would contain a disjunction in its consequent, like “x or t”.

Horn rules are easier to deal with than non-Horn rules because non-Horn rules tend to require proof by cases. If it’s a non-Horn rule, then there is going to be a disjunction in the consequent, which means that you’re concluding something that is kind of weak and represents several cases from which you might have to choose.
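The contrast can be made concrete with a toy propositional sketch in Python. It is an illustration under simplifying assumptions (literals are plain strings, there are no variables) and says nothing about how the Inference Engine is actually implemented. Horn rules can be fired by simple forward chaining; a disjunctive conclusion is only useful if the goal can be proved separately in every case.

```python
# Horn rules: (set of antecedent literals, single consequent literal).
HORN_RULES = [
    ({"bird(tweety)", "healthy(tweety)"}, "flies(tweety)"),
    ({"flies(tweety)"}, "canLeaveNest(tweety)"),
]


def forward_chain(facts, rules):
    """Fire Horn rules repeatedly until nothing new is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for antecedent, consequent in rules:
            if antecedent <= known and consequent not in known:
                known.add(consequent)
                changed = True
    return known


def proves_by_cases(facts, disjuncts, goal, rules):
    """A non-Horn conclusion 'd1 or d2 or ...' yields the goal only if the
    goal follows in EVERY case, so each case is chained separately."""
    return all(goal in forward_chain(set(facts) | {d}, rules) for d in disjuncts)


derived = forward_chain({"bird(tweety)", "healthy(tweety)"}, HORN_RULES)
print("canLeaveNest(tweety)" in derived)

CASE_RULES = [
    ({"cat(rex)"}, "pet(rex)"),
    ({"dog(rex)"}, "pet(rex)"),
    ({"dog(rex)"}, "barks(rex)"),
]
print(proves_by_cases(set(), ["cat(rex)", "dog(rex)"], "pet(rex)", CASE_RULES))
print(proves_by_cases(set(), ["cat(rex)", "dog(rex)"], "barks(rex)", CASE_RULES))
```

Knowing only “cat or dog”, the sketch can conclude `pet(rex)` because both cases reach it, but not `barks(rex)`, and it has to run the chaining once per disjunct to find that out.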

Several small rules are better than one gargantuan rule

Write several small rules instead of one gargantuan rule. I once saw a rule which filled seven printed pages -- this should strike you as being WRONG. It is way too complex to be correct. It will never get used and takes up a giant chunk of memory to store and likely does not even say what was intended. A very nice thing about separability of that which is independent is that it’s a form of modularity which allows you to independently generalize each of the pieces. For example, a very specific rule that only applies to spaceships taking off from Florida and going into the vacuum of outer space is very difficult to generalize. But if it had been broken into twelve separate pieces (one about transportation through a medium, another about transportation mode, etc.), you could generalize each of the rules -- the rule about transportation mode might apply to all kinds of non-vehicle transportation, or even transporting materials in cells, or something like this.

"Simpler is Better"

As you learned in the previous lesson, simpler is better. You should write nice rules in the system, but where possible you should use a rule macro predicate that expresses them. This is a very important form of simple being better: it states the knowledge as simply as it needs to be stated and no more. Not only that, it makes it easier to do meta-reasoning about the vocabulary itself. I can reason about the genlPreds hierarchy very easily, whereas if it’s all expressed in rules I have this extra interpretation step. I would have to keep asking things like “is this really a genlPreds rule? What is the superior predicate? What is the inferior predicate?” genlPreds in effect factors out the interpretation and ensures uniform interpretation. Because of this factoring out, it makes it easy to attach a new HL module and implement some new reasoning capabilities for that particular approach.

Simpler is better

Even though logically a long set of rules can be modeled as totally equivalent to a rule macro predicate, the terseness of the rule macro predicate provides some additional benefits, optimization points, and places to hook up special cases of functionality which make it very beneficial to have them.  Those of you who are programmers know this as functional abstraction. Stamping out a million rules is nothing but Copy and Paste, and abstracting out a property like genlPreds is like abstracting out your Helper Function or a function that defines what I’m intending. I can write rules that define what genlPreds means and that’s like the implementation of the abstract interface.
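A rough Python sketch of what this factoring buys you follows. It treats (genlPreds SPEC GENL) as “anything true of SPEC is also true of GENL”; the particular predicate names and the asserted fact are invented for illustration, and the code is a stand-in for the idea of a dedicated reasoning module, not a description of any actual HL module.

```python
# (genlPreds SPEC GENL): anything true of SPEC is also true of GENL.
# The hierarchy and the fact below are made up for this example.
GENL_PREDS = {
    ("biologicalMother", "mother"),
    ("mother", "parent"),
    ("parent", "relative"),
}
GAFS = {("biologicalMother", "Ann", "Beth")}


def all_genl_preds(pred):
    """Every predicate reachable upward from pred, including pred itself."""
    seen, frontier = {pred}, [pred]
    while frontier:
        current = frontier.pop()
        for spec, genl in GENL_PREDS:
            if spec == current and genl not in seen:
                seen.add(genl)
                frontier.append(genl)
    return seen


def holds(pred, arg1, arg2):
    """True if some asserted GAF uses pred or one of its spec predicates."""
    return any(
        pred in all_genl_preds(p) and (a1, a2) == (arg1, arg2)
        for (p, a1, a2) in GAFS
    )


print(holds("relative", "Ann", "Beth"))
print(sorted(all_genl_preds("mother")))
```

Because the hierarchy is plain data, questions about the vocabulary itself (such as “what are all the more general predicates of mother?”) are answered by a graph walk, with no need to inspect and interpret individual implication rules.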

This is basically a form of functional abstraction at the OE level, and those with software engineering knowledge understand its utility. Rule macro predicates are also analogous to macros in a programming language that provides them. It’s kind of a way of having your Copy-and-Edit cake and eating it, too: if I use a macro, it will do the Copy and Paste uniformly.

Reason with the vocabulary itself

If a rule is too complicated, it might be too complicated because you’re assuming that you’re not allowed to add new vocabulary. A big suggestion for making your life easier is to create the simplified vocabulary which will allow you to write these things tersely. For example, let’s say I was working for the Animal Planet. Let’s say that they wanted to build a website that had a bunch of knowledge about which animals eat other animals. I could start writing all of these rules, like “cheetahs eat wildebeests” and “cheetahs eat antelopes” and just keep stamping out rule after rule until I have somewhere near ten thousand of these because they have all of this knowledge about what animals eat.

You can look at that and say “we’re just saying the same thing over and over. Let’s just abstract out ‘animal eats other animal’ and define what that one predicate means.” Now we can state ten thousand GAFs with the additional overhead of adding this new predicate and then write a rule that defines what it means for an animal to eat another animal.

Reason with the vocabulary itself (cont.)

Adding the predicate that represents animal eats other animal (from the previous slide) is an example of creating simpler vocabulary which allows you to dramatically simplify what you’re going to write next. That’s what we at Cycorp have done over the years. Every predicate that you see, every function that you see, every collection -- did not exist at one point.  It was added because there was utility in adding it.

Take this approach to writing CycL. If you’re entering a lot of rich knowledge in a new domain, add some vocabulary to make your work easier and define it in ways that link it up to other existing vocabulary so that you can write very tersely the things you need to say.

Add Rule Macro Predicates

You could prove that arithmetic can be defined in terms of Set Theory, for example. So you could use 0 and #$SuccessorFn and could define all of the natural numbers that way. And you could define addition in terms of operations on this. But no one would use that to balance their checkbook.  It’s useful to add some abstractions like Arabic numeral sequences and things like this which are representations that effectively chunk a whole bunch of this functionality together.
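To see why nobody balances a checkbook this way, here is a small Python sketch that builds natural numbers from nothing but a zero and a successor wrapper and defines addition on top of them. Nested tuples stand in for the successor construction; the helper names `succ`, `add`, `from_int`, and `to_int` are assumptions for this illustration and have no connection to how #$SuccessorFn is defined in the KB.

```python
# Natural numbers built only from zero and a successor wrapper.
ZERO = ()


def succ(n):
    """Wrap n once more: succ(n) represents n + 1."""
    return (n,)


def add(m, n):
    """Successor-style addition: m + 0 = m, m + succ(k) = succ(m) + k."""
    while n != ZERO:
        m, n = succ(m), n[0]
    return m


def from_int(i):
    """Build the successor representation of a non-negative int."""
    n = ZERO
    for _ in range(i):
        n = succ(n)
    return n


def to_int(n):
    """Count the successor wrappers to recover an ordinary int."""
    count = 0
    while n != ZERO:
        n = n[0]
        count += 1
    return count


two, three = from_int(2), from_int(3)
print(add(two, three))          # the raw successor structure
print(to_int(add(two, three)))  # the same value as an ordinary int
print(2 + 3)                    # the chunked abstraction everyone uses
```

Both routes give the same answer, but the successor version needs a number of steps proportional to the value being added, while the chunked numeral representation hides all of that behind one operation.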

For whatever reason, there seems to be a persistent reluctance to add new vocabulary in order to help one out in doing knowledge representation. This happens with Cyc users and with users of other knowledge representation schemes.  There seems to be a desire for simplicity in vocabulary almost like the desire for elegance just for the sake of elegance. Isn’t it interesting how we can express arithmetic in terms of Set Theory? Well, yes, but it’s not very useful for an automaton that has to reason with it.

"Simpler is Better"
Some False Ideas

There’s an erroneous mindset out there that somehow vocabulary is expensive but arbitrarily complicated expressions are somehow as cheap as any other. This is what Doug Lenat refers to as “Physics Envy” where people in knowledge representation are wishing for the “Maxwell’s Equations of KR”: four simple little things which can express how to represent knowledge and put it on a T-shirt. It’s just not like that. In fact we have the reverse in our system. Adding new vocabulary is no more expensive than adding any other assertion about something and complex assertions are quantitatively harder to deal with than simple things like GAFs.

New Vocabulary Will Not Make Rules Less Reusable

If you’re concerned that creating new vocabulary will make your rules less reusable, think about it this way. Just as we expect people to reuse #$Dog when they are talking about the concept dog, and we expect to have tools available to enable someone to find that concept when they want to say something about dogs, we should have analogous tools for other kinds of vocabulary. If you’re looking for a predicate that can relate one type of thing to another type of thing, or a function which can denote a certain type of thing, or something else, there should be tools which help you find all of those kinds of things and reuse them. Moreover, the meaning of #$Dog is only defined by how it’s linked up to other things in the system, just like the meaning of the predicate #$isa is defined by how it’s used in the system.

So it’s not like the vocabulary for predicates and things like that has any additional burden that other terms don’t have as well. If you expect #$Dog to get reused properly, it had better get hooked up as a spec of #$Animal, right? It had better have links to other things that you would expect it to be linked to. So, the vocabulary is no different than other terms that we add to the KB.

Truly large-scale, reusable, interesting knowledge bases are going to be big. There must be tools which help you find what you’re looking for. This goes hand in hand with having a large knowledge base. There are other systems which have not spent as much time on tools because they tend to deal with small theories where you can print out the entire ontology in a tree-like graph that fits on one page. In this case you can use the human eye as a search tool. For small things you can get away with that, but for large-scale knowledge bases, you must have tools. For example, we have pretty good tools for walking the hierarchies.

OE vs. KE vs. SMEs and Their Vocabulary
The Rule of 10

This is a suggestion for when you create simplifying vocabulary, called the Rule of 10. Simplifying a bunch of assertions that you’re about to make is reason enough to create some new vocabulary. If you’re going to introduce a new term to the system, you should have about ten interesting things to say about it.

For example, if you’re going to say a whole bunch of stuff that keeps repeating the same pattern over and over, like the CycL that you see at the bottom of the slide that begins with #$and, then the concept of those people that have gender masculine would be worth factoring out, as you see in the CycL in the middle of the slide. Then you would hook up the things that you would say to #$MalePerson, instead of having to describe it every time you refer to this concept.

You should have about ten things to say about the average term in a system. If you have more than ten things to say about it, then maybe there is something that you’re not factoring out about it -- maybe there are sub-properties that are interesting to talk about. If you have fewer than ten things to say about it, then maybe it’s not a distinction worth making from other things in the system. In practice, if you look at the ratio of the number of assertions in our system to the number of reified concepts in our system, for at least the last eleven years this ratio has hovered between 8 and 12.

Empirically, this seems to play out in representing a large knowledge base. If you find a predicate which, if you created it, you would know of at least ten uses for it and think there is potential for more in the future, then it’s worth considering creating it.
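The ratio behind the Rule of 10 is easy to compute for any collection of assertions. The Python sketch below does it for a handful of invented GAFs; the sample data and the function names are assumptions for illustration, and a KB this small will of course fall far short of ten assertions per term.

```python
from collections import Counter

# Made-up assertions; each is (predicate, arg1, arg2).
ASSERTIONS = [
    ("isa", "Koala", "AnimalType"),
    ("genls", "Koala", "Marsupial"),
    ("hasDiet", "Koala", "EucalyptusLeaf"),
    ("isa", "Kira", "Koala"),
    ("genls", "Marsupial", "Mammal"),
]


def mentions_per_term(assertions):
    """Count how many assertions mention each term (predicates included)."""
    counts = Counter()
    for assertion in assertions:
        counts.update(set(assertion))  # count a term once per assertion
    return counts


def assertions_per_term(assertions):
    """Overall ratio: number of assertions / number of distinct terms."""
    terms = {term for assertion in assertions for term in assertion}
    return len(assertions) / len(terms)


counts = mentions_per_term(ASSERTIONS)
print(counts["Koala"])
print(round(assertions_per_term(ASSERTIONS), 2))
```

The per-term counts are the more useful diagnostic when deciding about a single term: a term with very few mentions may not be earning its place, and one with very many may be hiding sub-concepts worth factoring out.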

Why Simplify Vocabulary?
Summary
  • Simpler is Better
  • Use Rule Macro Predicates
  • Create Simplifying Vocabulary

This concludes the lesson on creating simplifying vocabulary.