Errors with Constants, Variables and Reliance on NL

Errors in Representing Knowledge

The “Errors in Representing Knowledge” tutorial covers a whole host of errors people have been known to make when representing knowledge in Cyc.  In this first lesson, we’ll focus on errors related to choice of vocabulary and naming of variables and constants.

Letting NL dictate your representation

One of the errors people commonly make is assuming that, because natural language (NL) uses the same word to mean several different things, those things can safely be represented with the same Cyc vocabulary.  This causes problems because, in order to write rules that draw correct conclusions, we need to tease apart the ambiguities present in NL.  The slide shows some of the possible meanings of “has.”  Several quite different states of affairs are described in English with the word “has.”  Each of them licenses different conclusions about the objects involved in the relationship: for example, that John loves his son, or that John can afford to buy a sandwich.  If we used a single Cyc predicate, “#$has”, to represent these four very different relationships, it would be very difficult to express the rules we’d need to draw those conclusions.

At least 4 different senses

If we instead use more precise relations such as those on the slide, we more accurately represent what the actual relationship is and make it much easier to write rules that will draw correct conclusions.

Relying on Constant Names

Another problem many knowledge enterers have is assuming that the name of a term they are encountering for the first time tells them the full story about the term's intended meaning.  Remember, term names are basically arbitrary labels, much like variable names; Cyc cannot guess at the meaning of a term by looking at its name.  The meaning of a given term is derived from the assertions in which that term appears.  If you haven’t examined those assertions recently, you should probably take a look to make sure you have the correct term before you make assertions using it.

Using Vague Constant Names

Using vague names sets the stage for future confusion.  Even though you shouldn't rely on constant names to tell you the meaning of a term, it’s still a good idea to be as precise as possible when choosing term names as a courtesy to other Cyclists.  When giving a precise name, we often use a convention in which the basic name is given first, and a clarifying word or phrase is appended after a hyphen.

Problems with Variables: Meaningless Variable Names

Single-letter variables, especially non-mnemonic ones like ?X, make rules hard to read.  Do yourself and your fellow Cyclists a favor, and use variable names that give some indication of what they stand for.  Avoid giving a variable exactly the same name as a Cyc constant, because that can be confusing, too.  Notice how hard it is to find the problem in the rule at the bottom of the slide.

Problems with Variables: Meaningless Variable Names

If mnemonic variable names are used, the error is much easier to spot.

Problems with Variables: Check them carefully!

Be careful with your variables and make sure you always spell them the same way.  Can you identify the error in the rule on the slide?  The variable ?CIRCLE appears for the first time in the last line, so it could mean anything.  Remember (from the lesson on CycL Syntax) that variables that aren’t explicitly quantified are assumed to be universally quantified.  What does the rule on the slide mean?  It says that if you have something called ?CIRC and it is a circle with a radius, and you compute ?AREA as Pi times the radius squared, then EVERYTHING IN THE UNIVERSE has ?AREA as its area.  If the variable ?CIRCLE had been correctly spelled as ?CIRC, the rule would have expressed the intended meaning: if you have something called ?CIRC and it is a circle with a radius, and you compute ?AREA as Pi times the radius squared, then ?CIRC has ?AREA as its area.
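A misspelling like this can be caught mechanically, because the stray variable occurs in the consequent but nowhere in the antecedent.  The following is a minimal sketch in Python, not part of Cyc: it assumes a rule is written as nested tuples, and the predicate and function names ("radius", "area", "evaluate", "TimesFn", "SquaredFn") are hypothetical stand-ins for whatever appears on the slide.

```python
def variables(expr):
    """Collect every ?VARIABLE token that appears anywhere in a nested-tuple formula."""
    if isinstance(expr, str):
        return {expr} if expr.startswith("?") else set()
    found = set()
    for part in expr:
        found |= variables(part)
    return found


def unbound_consequent_variables(rule):
    """Return variables used in the consequent that never occur in the antecedent."""
    op, antecedent, consequent = rule
    if op != "implies":
        raise ValueError("expected an (implies antecedent consequent) rule")
    return variables(consequent) - variables(antecedent)


# Hypothetical reconstruction of the kind of rule described in the text.
antecedent = (
    "and",
    ("isa", "?CIRC", "Circle"),
    ("radius", "?CIRC", "?RADIUS"),
    ("evaluate", "?AREA", ("TimesFn", "Pi", ("SquaredFn", "?RADIUS"))),
)
misspelled_rule = ("implies", antecedent, ("area", "?CIRCLE", "?AREA"))
corrected_rule = ("implies", antecedent, ("area", "?CIRC", "?AREA"))

print(unbound_consequent_variables(misspelled_rule))  # {'?CIRCLE'}
print(unbound_consequent_variables(corrected_rule))   # set()
```

A variable flagged this way is not always a mistake, but it is always worth a second look, since it will be treated as universally quantified.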

Problems with Variables: Typing

One trap that is easy to slip into is forgetting to properly constrain mnemonically named variables.  A mnemonic name documents your intent, but it does nothing to restrict what the variable can be bound to.  #$inCont-Open is a relation that means the first argument is contained within the second argument, where the second argument is an open container.  It would hold between coffee and a coffee cup, or between a puppy and an open box.  Therefore, when using this relation, one cannot assume that Cyc knows what type of thing is contained, or what type of thing is containing it.  (#$in-Floating ?OBJ ?FLUID) means that ?OBJ is surrounded by and buoyed by the liquid ?FLUID.  Again, the predicate itself does not fix what is floating or what kind of fluid it is floating in.  Can you think of counterexamples to the rule on the slide, as it is written?  How about a box with two puppies in it?  Because the variables are not further constrained, ?CANYON could refer to a box (or anything else, for that matter) and ?RIVER could refer to two puppies.  For the rule to be correct, we’d need to add the three literals listed at the bottom of the slide to the antecedent.
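To see why the variable name alone constrains nothing, consider a toy pattern matcher.  This is an illustrative Python sketch, not how Cyc performs inference, and it does not reproduce the rule on the slide.  The facts and the constants ColoradoRiver, GrandCanyon, TwoPuppies, CardboardBox, River and Canyon are invented for the example.

```python
# Invented facts: a river in a canyon, and two puppies in an open box.
FACTS = {
    ("inCont-Open", "ColoradoRiver", "GrandCanyon"),
    ("inCont-Open", "TwoPuppies", "CardboardBox"),
    ("isa", "ColoradoRiver", "River"),
    ("isa", "GrandCanyon", "Canyon"),
}


def match(literals, facts, bindings=None):
    """Yield every variable binding that satisfies all literals against the facts."""
    bindings = bindings or {}
    if not literals:
        yield dict(bindings)
        return
    first, rest = literals[0], literals[1:]
    for fact in facts:
        if len(fact) != len(first):
            continue
        trial = dict(bindings)
        consistent = True
        for pattern_term, fact_term in zip(first, fact):
            if pattern_term.startswith("?"):
                # Bind the variable, or check it against an earlier binding.
                if trial.setdefault(pattern_term, fact_term) != fact_term:
                    consistent = False
                    break
            elif pattern_term != fact_term:
                consistent = False
                break
        if consistent:
            yield from match(rest, facts, trial)


untyped = [("inCont-Open", "?RIVER", "?CANYON")]
typed = untyped + [("isa", "?RIVER", "River"), ("isa", "?CANYON", "Canyon")]

print(sorted(b["?RIVER"] for b in match(untyped, FACTS)))  # ['ColoradoRiver', 'TwoPuppies']
print(sorted(b["?RIVER"] for b in match(typed, FACTS)))    # ['ColoradoRiver']
```

Without the isa literals, ?RIVER happily binds to the puppies; with them, only the actual river survives.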

Problems with Variables: Typing

Note that if a variable must already refer to a term of a certain type because of the argument constraints on the predicates it’s used with, it’s best NOT to state those constraints explicitly.  Doing so would produce a rule with extraneous left-hand-side constraints (constraints in the antecedent) that aren’t needed for rule correctness.  Here, the predicates #$coworkers and #$acquaintedWith require instances of #$Person in both their arguments, so we can drop the (#$isa ?PERS #$Person) literal from the left side.  One happy consequence is that the rule is now general enough to act like a #$genlPreds assertion.  It therefore matches a pattern that an existing HL module already handles, and so gets efficient support without any additional work.
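The check for an extraneous isa literal can be sketched in a few lines of Python.  This is a simplified illustration rather than anything Cyc provides: the ARG_ISA table restates the argument constraints given in the text, the antecedent is an assumed shape for the slide's rule, ?OTHER is a made-up variable name, and the check only catches exact type matches (it ignores #$genls).

```python
# Argument constraints as stated in the text: both predicates require
# instances of Person in both argument positions.
ARG_ISA = {
    "coworkers": ("Person", "Person"),
    "acquaintedWith": ("Person", "Person"),
}


def redundant_isa_literals(antecedent, arg_isa):
    """Return (isa ?VAR Type) literals already implied by another literal's argument constraints."""
    implied = set()
    for predicate, *args in antecedent:
        for arg, required_type in zip(args, arg_isa.get(predicate, ())):
            implied.add((arg, required_type))
    return [
        literal
        for literal in antecedent
        if literal[0] == "isa" and (literal[1], literal[2]) in implied
    ]


# Assumed antecedent: the explicit isa literal plus the coworkers literal.
antecedent = [
    ("isa", "?PERS", "Person"),
    ("coworkers", "?PERS", "?OTHER"),
]

print(redundant_isa_literals(antecedent, ARG_ISA))  # [('isa', '?PERS', 'Person')]
```

Any literal the function returns can be dropped without changing what the rule concludes.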

Summary

This concludes the lesson on common errors in using constants and variables, and reliance on natural language.