Errors with Specialization, Generalization & Rules

Errors in Representing Knowledge

In the first lesson of the “Errors in Representing Knowledge” tutorial, we covered errors related to the choice of vocabulary and the naming of variables and constants.  Now we are going to concentrate on over-generalization, over-specialization, and some errors with formulating rules.

Over-generalization

  When stating knowledge in Cyc, we aim to state rules that are as general as possible without over-generalizing.   If a rule is too general, it will produce some false conclusions.  If it is too specific, it will fail to draw some conclusions that are correct and desirable to draw.   The world is a messy place, so it is not always possible to write a single rule that captures the whole point at the proper level of generality, but that is the ideal.

Refer to the slide for an example of an overly general rule.  Can you think of counterexamples?   Sure:  Clams and starfish and plants (and a bunch of other living things!) don’t have heads. How would we fix this?  There are many ways to fix this problem; let’s consider one approach.  There are two main groups of critters that all have heads: vertebrates and insects.  Thus we could simply state two separate, more specific rules for these two groups.
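
The effect of replacing one over-general rule with two narrower ones can be sketched in a few lines of Python.  This is only a toy model, not CycL and not the rule on the slide: it assumes the over-general rule says something like “every living thing has a head,” and the class names and helper functions are made up for illustration.

```python
# Toy check of an over-general rule against known counterexamples.
# Plain Python, not CycL; all names below are illustrative assumptions.
HEADLESS_KINDS = {"Clam", "Starfish", "Plant"}  # counterexamples from the text
KINDS = ["Vertebrate", "Insect", "Clam", "Starfish", "Plant"]


def general_rule(kind: str) -> bool:
    """Over-general: concludes 'has a head' for every kind of living thing."""
    return True


def specific_rules(kind: str) -> bool:
    """Two narrower rules: one for vertebrates, one for insects."""
    return kind == "Vertebrate" or kind == "Insect"


def false_conclusions(rule) -> list:
    """Kinds for which the rule concludes a head that does not exist."""
    return [k for k in KINDS if rule(k) and k in HEADLESS_KINDS]


print(false_conclusions(general_rule))    # ['Clam', 'Starfish', 'Plant']
print(false_conclusions(specific_rules))  # []
```

The narrower rules give up nothing for vertebrates and insects, and they no longer conclude anything false about the headless kinds.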

Over-generalization

  Refer to the slide for another overly general rule example.  Would this rule apply to mute people or infants?  This rule is fine as a default rule – almost every person does speak some language.  Thus, we can state this rule as a default and then represent exceptions, using #$exceptWhen, for infants and people who are mute, in a coma, etc.
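
The idea of a default rule that is blocked by listed exceptions can be sketched as follows.  This is a toy Python model, not CycL and not how Cyc implements #$exceptWhen; the Person fields and function names are hypothetical.

```python
# Toy model of a default rule with exceptions.
# Plain Python, not CycL; every name here is an illustrative assumption.
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str
    is_infant: bool = False
    is_mute: bool = False
    in_coma: bool = False


# Each exception is a predicate; the default is blocked if any one holds.
EXCEPTIONS = (
    lambda p: p.is_infant,
    lambda p: p.is_mute,
    lambda p: p.in_coma,
)


def default_concludes_speaks(person: Person) -> bool:
    """True if the default 'every person speaks some language' fires.

    False means only that the rule draws no conclusion for this person,
    not that the person is known to speak no language.
    """
    return not any(exception(person) for exception in EXCEPTIONS)


print(default_concludes_speaks(Person("Alex")))                # True
print(default_concludes_speaks(Person("Sam", is_infant=True)))  # False
```

Keeping the exceptions in a separate list leaves the default itself short, and new exceptions can be added without rewriting it.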

Over-generalization

  Refer to the slide for yet another overly general rule example.  Homeless people and members of some primitive cultures don’t live in buildings.  Thus, this rule is a good default only if we assume a modern culture.  We could therefore state this rule in a microtheory that makes that assumption and then encode exceptions for the homeless.

Over-specialization

  Here’s an overly specific rule.  What kinds of related conclusions will it fail to draw?

Over-specialization

It would be a stronger, better rule if we wrote it as “Every organism is younger than its parent(s).”  One can even imagine a further generalization that would cover every PartiallyTangible, along the lines of “every PartiallyTangible has a starting date later than the starting date of any actor in its creation.”

Over-specialization: Unnecessary Constraints

  Another way rules can end up over-specialized is with the inclusion of constraints that aren’t necessary for the conclusion to be true. In the rule on the slide, it doesn’t matter that the mast is part of something that is a sailboat – all masts are rigid, whether or not they are part of something.  Having these unnecessary literals keeps the rule from firing in all the circumstances in which it should.  Remove them, and we have a more powerful rule that can work with partial information.
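
To see why the extra literals hurt, consider a knowledge base that knows something is a mast but does not know what it is part of.  The sketch below is a toy Python model, not CycL; the fact triples, the constant names (Mast01, Boat01), and both functions are hypothetical.

```python
# Facts are (predicate, arg1, arg2) triples.
# Plain Python, not CycL; all names are illustrative assumptions.
facts = {
    ("isa", "Mast01", "Mast"),  # we know Mast01 is a mast ...
    # ... but nothing says which boat, if any, it belongs to.
}


def rigid_overspecialized(facts):
    """Concludes rigidity only for masts known to be part of a sailboat."""
    return {
        m
        for (pred, m, cls) in facts
        if pred == "isa" and cls == "Mast"
        for (pred2, m2, boat) in facts
        if pred2 == "partOf" and m2 == m and ("isa", boat, "Sailboat") in facts
    }


def rigid_minimal(facts):
    """Concludes rigidity for anything known to be a mast."""
    return {m for (pred, m, cls) in facts if pred == "isa" and cls == "Mast"}


print(rigid_overspecialized(facts))  # set()
print(rigid_minimal(facts))          # {'Mast01'}
```

With only partial information, the over-specialized version concludes nothing, while the minimal version still identifies the mast as rigid.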

Non-modular Rules

  Another error made when representing rules is to make them non-modular.  The example on this slide represents a perfectly good rule.  It states that for every sailboat, there exists a thing that is a mast, and that mast is part of the boat.  But what if we also wanted to talk about the hull that is part of each sailboat?

Non-modular Rules

  The example given on the slide is not the way to accomplish this.  Nested existentials are to be avoided whenever possible, especially when they are not strictly needed to express the state of the world.  In this case, the existence of the hull doesn’t depend on the existence of the mast, so …

Non-modular Rules

 … we should split up the rule into two independent rules.
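
One practical consequence of keeping the rules independent can be sketched as below.  This is a toy Python model, not CycL, and it adds an assumption that is not on the slides: a hypothetical set of dismasted boats for which the mast conclusion is blocked.  All function and constant names are made up for illustration.

```python
# Toy comparison of one combined rule versus two independent rules.
# Plain Python, not CycL; the 'dismasted' exception is an assumption.


def combined_rule(boat, dismasted):
    """One rule asserting mast and hull together; it is blocked as a whole."""
    if boat in dismasted:
        return set()
    return {("hasPart", boat, "Mast"), ("hasPart", boat, "Hull")}


def mast_rule(boat, dismasted):
    """Independent rule: every sailboat has a mast, unless dismasted."""
    return set() if boat in dismasted else {("hasPart", boat, "Mast")}


def hull_rule(boat, dismasted):
    """Independent rule: every sailboat has a hull (ignores 'dismasted')."""
    return {("hasPart", boat, "Hull")}


dismasted = {"Boat02"}
print(combined_rule("Boat02", dismasted))  # set()
print(mast_rule("Boat02", dismasted) | hull_rule("Boat02", dismasted))
# {('hasPart', 'Boat02', 'Hull')}
```

Because the hull rule does not depend on the mast rule, blocking one leaves the other free to fire.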

Non-modular Rules

  Refer to the slide for another case of a non-modular rule.  Do you see the problem?

When writing rules that include existentials, it’s important to include only the minimum set of constraints within the scope of the existential quantifier.  Otherwise, we end up with a rule that over-specifies the existential, which restricts the rule to a smaller set of situations.

Non-modular Rules

  It’s better to split the rule into two.

Summary

This concludes the lesson on Errors with Specialization and Generalization.