Tutorials

The Cyc 101 Tutorial is a self-paced online course that introduces the learner to Cyc concepts, such as representing knowledge in a Cyc Knowledge Base and creating Cyc-based applications. Each lesson takes 10-20 minutes to complete and is made up of slides containing speaker notes that roughly parallel what an instructor would say if these slides were presented in a classroom. The bottom of each slide contains links to related resources, including related lessons, related sections of the OpenCyc OE Handbook, and entries in the glossary and the vocabulary pages.

Get to the lesson format of your choice by clicking the appropriate link:

  • View the lesson in your browser (click the lesson name)
  • Print slides with speaker notes (click [ pdf ])
  • Download the PowerPoint file for a lesson (click [ zip ])
  • Download all PowerPoint files.

If you choose to view the lesson in your browser, navigate between slides by clicking the left and right arrows at the bottom center of the screen. Click the icon in the lower right corner of the screen to view the lesson as a "Full Screen Slide Show". Once in this view, right-click to navigate between slides or to return to the previous view.

KB Browser Interface Overview

pdf | zip | Interface Overview
Login

The Login Area allows CYC® users to log in to their CYC® image, and lists which users are currently logged in to the image.

1. To access the Login Area, click on the Login link in the Toolbar frame at the top of the screen.  The first line of the login page shows your current login.  If you have just started up the image, this will be either "Guest" or the name of the default login constant; the same applies the first time you access, from your machine, an image that someone else has started up elsewhere.

2. To change identities (including from or to #$Guest), enter the name of your CYC® constant (minus the "#$") in the type-in pane provided. For example, the user Fred Smith might be represented in the KB as #$FredSmith; to log in, Fred would type "FredSmith" in the input window.

3.  Click the (Submit) button.

Parts of the Screen

The KB Browser Interface is divided into four frames:

1. The Tools Frame: Also known as “The Toolbar Frame,” this frame allows the user to do three things:
  • Search for constants using the Completion Box.
  • Navigate around the KB Browser via the "Tools" and "Navigator" links.
  • Access the user login facility via the "Login" link.

2. The Index Frame: This frame is used to display information about a selected constant in the KB. It contains a list of operations that can be performed on the displayed constant, as well as an index of assertion types involving the constant. The options available in this frame will vary with each constant, as well as with different versions of the CYC® System.

3. The Assertion Display Frame:  This frame contains the actual assertions about a selected constant.  Its contents are determined by the term that is selected in the Index Frame.

4. The Systems Information Frame: Also known as “The Agenda Status Bar,” this frame provides information to the user about the current state of the CYC® Agenda. This information can be updated manually, by clicking on the "Update" link, or it can be set to update automatically ("Server Update Interval") via the Browser Options page.

Assertion Display Frame

To see the full assertion and the bookkeeping information on a given assertion, click on the ball.  Once the full assertion is displayed, you can edit it, delete it, or make a similar assertion.

White ball = Monotonically True
Yellow ball = Default True
Blue ball = Forward Rule
Purple ball = Backward Rule
Red ball = Negated
Green ball = Inferred

Arguments

Information is organized in the display window by what argument position the concept fills for any assertion. These examples show the concept as it fits into the Argument3 slot, the Argument4 slot, etc.

Other Tools
  • Clicking on a green plus sign next to a constant in the Index Frame does an ask.  For example, (#$isa #$SewingNeedle ?ARG2).
  • Clicking on a red diamond next to a constant brings you to the hierarchy browser.
Searching for Terms: Completion Box and “Show”

There are several ways to look for a term in Cyc.  Using the completion box is just one.  To do this, type the CycL term into the completion box and click on the “Show” button (or hit the “Enter” key on your keyboard).

NOTE: THE PICTURE ABOVE IS DATED. YOU WILL NOT SEE A COMPLETE BUTTON OR A GREP BUTTON, AND THE SHOW BUTTON HAS BEEN CHANGED TO A SEARCH BUTTON.

 

Searching for Terms: Completion Box and “Complete”

If you don’t type the full term into the completion box, you can begin to type the name of a constant and click on the “Complete” button.  A box containing all known completions will appear on your screen.  Select the term that you seek.

NOTE: THE PICTURE ABOVE IS DATED. YOU WILL NOT SEE A COMPLETE BUTTON OR A GREP BUTTON, AND THE SHOW BUTTON HAS BEEN CHANGED TO A SEARCH BUTTON. INSTEAD OF THE COMPLETE BUTTON, YOU CAN TYPE A PARTIAL TERM NAME FOLLOWED BY AN ASTERISK CHARACTER '*', THEN CLICK SEARCH OR HIT RETURN.

Searching for Terms: Wildcard Search

Type a partial name along with the wildcard (*) before or after or around it to look for terms containing that string.
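This kind of wildcard matching can be sketched in a few lines of Python. The term names below are invented for illustration, and `fnmatchcase` merely stands in for whatever matching the Cyc browser actually performs:

```python
from fnmatch import fnmatchcase

# A handful of invented constant names standing in for a real KB's terms.
terms = ["Dog", "DogFood", "HotDog", "Cat", "Tower", "TowerOfLondon"]

def wildcard_search(pattern, terms):
    """Return every term matching a '*' pattern, e.g. 'Dog*', '*Dog', '*owe*'."""
    return [t for t in terms if fnmatchcase(t, pattern)]

print(wildcard_search("Dog*", terms))   # terms starting with "Dog"
print(wildcard_search("*Dog", terms))   # terms ending with "Dog"
print(wildcard_search("*owe*", terms))  # terms containing "owe"
```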

Searching for Terms: Use English

Search for terms by typing them the way you would in English into the Completion box.

Systems Information Frame: Agenda

Click the “Agenda” link in the System Info area to access this screen.  On this page you can see the operation that is currently being processed.

Systems Information Frame: Remote Operations

This is where you can see others’ operations being processed on your local image.  The higher number in the lower screenshot reflects the number of operations that have been processed since the upper screenshot was taken.

The Toolbar

 To customize your working toolbar:
1)  Click on “Tools” to view the available tools.
2)  Select the check boxes to make those tools appear on your toolbar.  Some of the most useful tools are:
  • Ask
  • Assert
  • Create
3)  Click on “Update Toolbar”.

“Assert Similar” Tool

The “Assert Similar” tool lets you use an existing assertion as a template for a new assertion.

To access this tool, click on the ball next to any assertion.

On the “Cyc Assertion” screen, click on the “Assert Similar” link.

On the “Assert Similar Formula” screen, be sure to change the Mt to the most appropriate one for your new assertion and make the assertion.

Ask

When performing an “Ask” you can type in your query without the necessary “#$” symbols.  Then click on the “Cyclify” button to automatically add in the “#$” symbols.  The lower screenshot shows the result of clicking on this button.

Communication Status Area

Click the “Comm” link in the System Info area to access this screen, the Communication Status Area.  This is where you specify whether you wish to receive operations.

Log Out

  To log out of the Cyc Browser:
1)  Click on “Login” in the Tool bar.
2)  Click on the link “here” in the sentence “Click here to logout now.”

Foundations of Knowledge Representation in Cyc

pdf | zip | Why Use Logic?
Foundations of Knowledge Representation in Cyc

This is an introduction to the foundations of knowledge representation in Cyc.  Our first topic is: Why use logic?

NL vs. Logic: Expressiveness

Knowledge representation requires a representation language.  Candidate representation languages range from natural languages (such as English or Turkish) to logic-based languages to object-oriented programming languages and others.   CycL, the language used for knowledge representation in Cyc, is a (high-level) logic-based language.  This section explores the reasons for that choice, and the advantages of logic-based knowledge representation.

One issue in the choice of representations is expressiveness. Since we want a great deal of expressiveness for the kind of knowledge Cyc is going to contain, it is sometimes suggested that we use natural language. The expressiveness of natural language, though, goes beyond what we need.  It also gives rise to special problems if one wants not only to store, but also to reason with, the represented knowledge.  Logic-based representation, in contrast, gives us enough expressiveness, and facilitates the reasoning as well.

Natural language is obviously very expressive. But this can lead to problems. Consider the first three sentences on the slide. Each of these means roughly the same thing and each of them has the implication that Jim’s falling occurred before his injury. If we want to represent that implication, do we write a rule for every natural language expression that could possibly express this point?

Logic-based languages offer a simplified, more efficient approach.  First, we identify the common concepts – for example, the relation “x caused y” –  at the heart of the English sentences.  Then, we use logical relationships to formulate rules about those common concepts. For example, “if x caused y, then x temporally precedes y”.

NL vs. Logic: Ambiguity and Precision

Another issue in the choice of a knowledge representation language is ambiguity. Natural language is highly ambiguous. For example, if we say, “x is at the bank,” we don’t know whether what is meant is the riverbank or a financial institution. If we say that x is running, we don’t know whether x is changing location, operating (like a piece of machinery), or running as a candidate for office. On the other hand, with a logical representation we can precisely define the concepts we use. We can, for example, define a distinct concept corresponding to each of these three senses of "running."  This allows us to place the appropriate rules on their respective concepts, whereas they could not all be placed on the one ambiguous word.

This matters greatly for representation of knowledge in the Cyc Knowledge Base.  After all, we are representing the knowledge for a purpose: we want to use the represented knowledge in reasoning.  Reasoning means, at least in part, figuring out what must be true, given what is known.   In order to reliably figure out what follows from what you know, you must be able to specify the starting point.  Reasoning requires a clear understanding of exactly what knowledge you have.  In other words, reasoning  requires precision of meaning.
 

NL vs. Logic: Calculus of Meaning

Logic also has the advantage of offering us a calculus of meaning. Logic features several well understood logical operators, such as those listed on the slide. They are well understood in the sense that they have been studied for years and their operation is well-documented.

For example, consider the sentences: “It is not the case that all men are taller than all women.” And also: “It is the case that all men are taller than twelve inches.” It follows from these two sentences that some women are taller than twelve inches.

You can express the format of these sentences and find the conclusion based on the logical constants alone, without knowing what the particular non-logical words (such as men, taller, and women) mean. This is very helpful for reasoning.
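The example can be made explicit in first-order notation. Writing M(x) for “x is a man”, W(y) for “y is a woman”, and T(x, y) for “x is taller than y”, the derivation below is a sketch; it also makes explicit the tacit assumption that heights are totally ordered (of any two things, one is at least as tall as the other):

```latex
\begin{align*}
&\neg\,\forall x\,\forall y\,\bigl(M(x)\land W(y)\rightarrow T(x,y)\bigr)
  && \text{premise 1}\\
&\forall x\,\bigl(M(x)\rightarrow T(x,\text{12in})\bigr)
  && \text{premise 2}\\
&\exists x\,\exists y\,\bigl(M(x)\land W(y)\land\neg T(x,y)\bigr)
  && \text{from premise 1, pushing the negation inward}\\
&M(a)\land W(b)\land\neg T(a,b)
  && \text{instantiate with witnesses } a, b\\
&T(a,\text{12in})
  && \text{premise 2 applied to } M(a)\\
&T(b,\text{12in})
  && \text{from } \neg T(a,b) \text{ and } T(a,\text{12in}),\ \text{ordering assumption}\\
&\exists y\,\bigl(W(y)\land T(y,\text{12in})\bigr)
  && \text{some woman is taller than twelve inches}
\end{align*}
```

Only the logical constants (negation, quantifiers, conjunction) drive each step; the meanings of M, W, and T are never consulted.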

Logic-Based Language vs. Other Formal Languages

When choosing a formal symbolic language rather than a natural language, why choose logic rather than something like frames and slots?  With frames and slots (and some object-oriented languages) reasoning depends on the mode of the representation, so there’s less reuse of the knowledge. Because of that, you either get less coverage or more bulk as you try to represent the knowledge from every direction in which you could possibly want to use it.

Also, the knowledge representation must be designed around the indexing, and the implicit knowledge is lost if the code is separated from the knowledge base. When using logic, the knowledge representation (KR) is mode-independent and the indexing is independent of the KR. And the implicit knowledge stays in the KB, independent of the code.

Let me explain what this means . . .

Indexing and KR

Here’s an example. On the left side of the slide, we have a frame-and-slot type representation (or an object-and-attribute type representation). So we have an object, carl, and carl has the animal type elephant. Carl also has the mother claire. We have another object claire. claire has the animal type elephant, and claire’s mother is elaine. Now, this particular representation will allow us to look up carl and discover that his mother is claire.

However, if we look up claire, we’ll discover that her mother is elaine, but we won’t discover that she’s the mother of carl. In order to be able to do that kind of lookup with this type of representation, we need to create a separate index going from the mother attribute values (or slot values) to the objects of which they are mothers.

To the right of the frame-and-slot type representation, we have a logical representation. The first sentence says, Carl is an instance of Elephant; the second says, Carl’s mother is Claire. What’s worth noting about this representation is that the meaning and implications of the second sentence, Carl’s mother is Claire, can be accessed based on any of its argument values. So, we could look up mother and get all of the animals bearing this relationship. We could look up Carl and find out who his mother is. We could look up Claire and find out whose mother she is or who her mother is. So this single representation does the work of many different representations in the object-oriented or frame-and-slot representation. Cyc takes advantage of this independence, by indexing all argument places -- such as are filled by mother, Carl, Claire -- of each assertion.  This comprehensive indexing enables efficient knowledge reuse.
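The contrast can be sketched in Python. Below, each assertion is a single tuple indexed under every term it mentions, so lookups work from any argument position. These are toy data structures for illustration, not Cyc's actual implementation:

```python
from collections import defaultdict

# Each assertion is a single tuple: (predicate, arg1, arg2).
assertions = [
    ("isa", "Carl", "Elephant"),
    ("mother", "Carl", "Claire"),
    ("isa", "Claire", "Elephant"),
    ("mother", "Claire", "Elaine"),
]

# Index every assertion under every term it mentions, whatever the position.
index = defaultdict(list)
for assertion in assertions:
    for term in assertion:
        index[term].append(assertion)

# One representation answers lookups from any direction:
print(index["Claire"])  # Claire as someone's mother AND as having a mother
print(index["mother"])  # every assertion using the mother relation
```

A frame-and-slot store would need a separately maintained inverse index to answer the second kind of lookup at all.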

Implicit knowledge

Also, in a logical representation, implicit knowledge is kept in the KB. On the left side of the slide, we have a frame-and-slot representation. We have an object, elephant, and we have a number of attributes for it. Notice that if you wanted to mark somewhere that you could figure out the weight of the elephant from the other attributes, there’s nowhere in the knowledge representation itself to put that information. You’d have to create a piece of code that attaches to the weight slot with instructions. But that means if you separate the knowledge base from the code base, you lose that information.

In the box on the right half of the slide, we have our logical representation. Here we have both that Elephant is a type of mammal, and we have a rule. The logical representation of that rule declaratively states that if something is an elephant and it’s male and it’s about Y meters tall, then it’s about 2Y tons heavy. That knowledge stays in the knowledge base even when you separate it from the code.
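A rough sketch of the point in Python: the rule travels with the knowledge as data, rather than living in code attached to a slot. The representation and the height figure are invented for illustration, not CycL:

```python
# Facts and the rule live together as data; nothing is hidden in slot code.
facts = {"carl": {"isa": "Elephant", "sex": "male", "height_m": 3.2}}

rules = [
    # If something is an elephant, male, and about Y meters tall,
    # then it is about 2Y tons heavy.
    (lambda f: f.get("isa") == "Elephant" and f.get("sex") == "male",
     lambda f: {"weight_tons": 2 * f["height_m"]}),
]

def infer(entity):
    """Apply every rule whose condition holds, adding the derived facts."""
    derived = dict(facts[entity])
    for condition, conclusion in rules:
        if condition(derived):
            derived.update(conclusion(derived))
    return derived

print(infer("carl")["weight_tons"])  # 6.4
```

Because the rule is part of the data, separating this "KB" from the inference code loses nothing: another interpreter could read the same rules.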

Summary

To summarize, the advantages of logic-based knowledge representation include: expressiveness (we get enough expressiveness using logic-based knowledge representation without the extra problems that natural language would bring us), precision (so we know exactly what the represented knowledge means), a calculus of meaning (so that we can reason with the knowledge based on logical constants), and use-neutral representation (which makes the represented knowledge more reusable).

Indexing is separated from the knowledge representation, so we get more reuse and the ability to access knowledge in unanticipated ways. And the implicit knowledge is maintained in the KB itself and retained even if we separate it from the inference engine.

pdf | zip | CycL Syntax
Foundations of Knowledge Representation in Cyc

This is a continuation of the tutorial in the Foundations of Knowledge Representation in Cyc.  This section covers the Syntax of CycL.

Syntax: Constants

CycL constants denote specific individuals or collections, such as individual relations, individual people, types of computer programs, or types of cars.  Each CycL constant is prefixed by the string ‘#$’ (pronounced “hash-dollar”).

Here’s a sampling of some constants.  Each constant in the first group denotes a collection.  For example, #$Dog denotes the collection of all dogs, #$SnowSkiing denotes the collection of all snow skiing events, and #$PhysicalAttribute denotes the collection of physical attributes.

Each constant in the second group denotes an individual.  For example #$BillClinton, #$Rover, and #$DisneyLand-TouristAttraction denote partially tangible individuals, individuals you can physically pick out.  #$likesAsFriend, #$bordersOn, #$objectHasColor, #$and, #$not, #$implies, and #$forAll also denote individuals, but these are individual relations, rather than tangible objects.

Each constant in the final group denotes an attribute value.  Attribute values are specific properties that something can have, such as being red in color or being sandy-soiled.

Syntax: Formulas

A CycL formula is a relation applied to some arguments, with parentheses around the group. The slide contains three examples of CycL formulas.

The first two of these formulas are CycL sentences.  They are well-formed formulas with a truth function in the first position (a.k.a. the arg0 position).  [We'll say more about truth functions shortly.]  Every sentence has a truth value; that is, it may be true or false.  The first sentence says that #$GeorgeWBush is an instance of #$Person.  The second sentence says that #$GeorgeWBush likes #$AlGore as a friend.

The third formula is not a sentence, and cannot be true or false.  It is a Non-Atomic Term (also known as a “NAT”).  Rather than forming a sentence, this formula gives us a new term, referring to a particular event: the birth of Jacqueline Kennedy Onassis.

The next two slides look at CycL Sentences and CycL Non-Atomic Terms in more detail.

Syntax: Sentences

Let’s look at sentences again.  Each CycL sentence starts with a truth function. Truth functions in CycL are easy to recognize because they begin with a lower-case letter. Types of truth functions include predicates, logical connectives, and quantifiers. They are called truth functions because the result of applying them to some arguments is a sentence that’s true or false.

We'll look at logical connectives and quantifiers in a moment.  For now, let's focus on predicates.  The first four examples, #$likesAsFriend, #$bordersOn, #$objectHasColor, and #$isa are all predicates.  Each of the two sample CycL sentences is constructed by applying one of these predicates to some arguments.

In the first case, the predicate #$isa, which means "is an instance of," is applied to the arguments #$GeorgeWBush, which denotes the individual George W. Bush, and #$Person, which denotes the collection of all persons.  The resulting sentence says that George Bush is an instance of person.

In the second case, the predicate #$likesAsFriend is applied to the arguments #$GeorgeWBush and #$AlGore.  The resulting sentence says that George Bush likes Al Gore as a friend.

CycL sentences are used to form assertions (to tell something to Cyc) and to form queries (to ask Cyc something).
 

pdf | zip | Collections and Individuals

  A collection is a type of thing, a kind of thing, or a class of things.  Things which belong to a collection are called its instances.

Each collection is characterized by some feature or features that all of its instances share.  For example, the collection of all persons is the collection of all instances that share the properties that make something a person.  Cher, Mario Andretti, Bill Clinton, and Abraham Lincoln are all instances of #$Person.

Some other collections include: the collection of all towers, the collection of all space stations and the collection of all movie directors.

Individuals

  In contrast, an individual is a single thing, not a collection. #$Cher is an individual, and so is the #$EiffelTower.

Individuals do not have instances, but they may have parts. If you take a piece of the Eiffel Tower you don’t have an instance of the Eiffel Tower, you have a part of it.  That’s one way to tell the difference between an individual and a collection.

Other individuals include: the space station Mir, Orson Wells, and the United States Marine Corps.  That last one might surprise you.  Why is the United States Marine Corps an individual and not a collection?  Let’s look at that more closely.

Joe The Marine

  Consider Joe the marine.  Joe is a member of the Marine Corps, but he’s not an instance.  Joe is a part of the Marine Corps.  It would not make sense to say that Joe is an instance of the United States Marine Corps; he is not a Marine Corps himself.  The United States Marine Corps is an individual, specific organization.  There is only one United States Marine Corps.  It has parts (for example, Joe) but not instances.

In contrast, we also have the collection #$UnitedStatesMarine.  That collection comprises all human members of the United States Marine Corps.  It has instances, each of which is an individual marine, like Joe.

Remember...

  Remember: collections can have instances, but not parts.  Individuals can have parts but not instances.  For example, the collection of all towers can have instances, each of which is a tower, but the Eiffel Tower can have parts.

Everything Is An Instance of Something

  Everything is an instance of some collection.

Every collection is, at minimum, an instance of #$Collection.   So for example, the collection of all towers is an instance of #$Collection.  The collection of all military personnel is an instance of #$Collection.

Every individual is, at minimum, an instance of #$Individual.  So for example, the #$EiffelTower is an instance of #$Individual.  #$OrsonWells is an instance of #$Individual.

Collections of Collections and Collections of Individuals

  Now it gets more complicated: collections can have either individuals or collections as their instances.

For example, the collection of all towers has individual towers as its instances.  The collection of all dogs has individual dogs as its instances.

But the collection #$ArtifactType has collections as its instances.  For example, the collection of all computer programs is an instance of #$ArtifactType.  The collection of all pieces of pottery is an instance of #$ArtifactType.  So the instances of artifact type are themselves collections.

There are some special collections with instances of both types.  Primarily these are collections of concepts themselves, where those concepts may indicate either collections or individuals.

Disjoint Collections

  The collection of all dogs and the collection of all cats have no instances in common.  Having no instances in common is the definition of disjointness.  Therefore, the collection of all dogs and the collection of all cats are disjoint.  We express this in CycL using the predicate #$disjointWith, as in the formula along the bottom of the slide.
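Under this definition, disjointness is just an empty intersection of instance sets. A minimal Python sketch, with invented instance data:

```python
# Invented instance sets for illustration only.
instances = {
    "Dog": {"Rover", "Lassie"},
    "Cat": {"Felix"},
    "Pet": {"Rover", "Felix"},
}

def disjoint_with(x, y):
    """Two collections are disjoint iff they have no instances in common."""
    return not (instances[x] & instances[y])

print(disjoint_with("Dog", "Cat"))  # True: no shared instances
print(disjoint_with("Dog", "Pet"))  # False: Rover is in both
```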

#$isa

  To express that something is an instance of a collection in CycL, we use the predicate #$isa.  A formula of the form (#$isa X Y) means that X is an instance of collection Y.  For example #$EiffelTower is an instance of the collection of all towers, #$Canada is an instance of the collection of all countries, #$Cher is an instance of the collection of all persons, and #$UnitedStatesMarineCorps is an instance of the collection of all modern military organizations.

#$genls

 To express that one collection is subsumed by another, we use the CycL constant #$genls.  A formula of the form (#$genls X Y) means that every instance of the first collection, X, is also an instance of the second collection, Y. In other words, Y subsumes X; Y is a generalization of X.

For example, every instance of #$Dog is also an instance of #$Mammal; #$Mammal is a generalization of #$Dog.

Sometimes this is expressed in Cyclish by simply saying Y “is a genls of” X, or X “is a spec” (or specialization) of Y.  So we might say, as in the second example, that #$FixedStructure is a genls of #$Tower.  Or we might say #$Tower is a spec of #$FixedStructure.  In either case we mean the same thing: every instance of #$Tower is also an instance of #$FixedStructure.

#$genls is transitive

  The #$genls relationship is transitive.

So, since #$Dog has as a generalization, #$Mammal (meaning that every instance of the collection of all dogs is also an instance of the collection of all mammals), and #$Mammal is a specialization of #$Animal, we can conclude that #$Dog is a specialization of #$Animal – that is, every #$Dog is also an instance of #$Animal.

Similarly, since #$Computer (the collection of all computers) is a specialization of the collection #$ComputationalSystem, and #$ComputationalSystem is a specialization of #$PhysicalDevice, we can conclude that every instance of #$Computer is also an instance of #$PhysicalDevice.  That’s what we mean when we say that #$genls is transitive.
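The transitive closure that licenses these conclusions can be sketched in Python. The links are taken from the examples above; the function is illustrative only, not Cyc's inference engine:

```python
# Direct genls links (specialization -> generalizations).
genls = {
    "Dog": ["Mammal"],
    "Mammal": ["Animal"],
    "Computer": ["ComputationalSystem"],
    "ComputationalSystem": ["PhysicalDevice"],
}

def all_genls(collection):
    """Every generalization reachable through genls links (transitivity)."""
    result, frontier = set(), [collection]
    while frontier:
        for parent in genls.get(frontier.pop(), []):
            if parent not in result:
                result.add(parent)
                frontier.append(parent)
    return result

print(all_genls("Dog"))       # Mammal and Animal
print(all_genls("Computer"))  # ComputationalSystem and PhysicalDevice
```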

The #$genls hierarchy

  Here’s a sample piece of the #$genls hierarchy.

#$Lighthouse is a specialization of #$Tower, which itself is a specialization of #$FixedStructure, which in turn is a specialization of #$Individual.

Because #$genls is transitive, that means that #$Lighthouse is also a specialization of #$Individual.  And if we know about any particular lighthouse, we can conclude that it is also an instance of #$Tower, of #$FixedStructure and of #$Individual.

#$isa is NOT transitive

  #$isa, on the other hand, is not transitive.

Consider the number five.  It’s an instance of #$PositiveInteger. #$PositiveInteger, in turn, is an instance of #$InfiniteSetOrCollection. But five is not an instance of #$InfiniteSetOrCollection.

Let’s take another example.  #$Cher is an instance of #$Person.  #$Person is an instance of #$Collection, but #$Cher is not an instance of #$Collection.

So #$isa is not transitive.

Remember . . .

Because #$genls is transitive, every instance of a collection is also an instance of any #$genls (or generalization) of that collection. By the definition of #$genls, it is the case that #$isa transfers through #$genls, but #$isa does not transfer through #$isa.  An example follows.

What can we conclude about #$Rover the dog?

  If we know only that #$Rover is an individual dog (an instance of #$Dog), what else can we conclude about #$Rover given this #$genls hierarchy?

The relationships in dotted lines here represent instancehood or #$isa relationships, and the solid orange lines represent #$genls or generalization relationships.

Since #$isa transfers through #$genls, if we know that #$Rover is an instance of #$Dog, we can conclude that #$Rover is also an instance of #$Mammal, since #$Dog is a specialization of #$Mammal.  And so we can go up the chain.  We can conclude that #$Rover is an instance of #$Animal and an instance of #$Individual and an instance of #$Thing.

Notice that we cannot conclude that #$Rover is an instance of #$BiologicalSpecies because #$isa does not transfer through #$isa.  #$Dog is an instance of #$BiologicalSpecies, in contrast to being a specialization of #$Mammal.
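That lookup can be sketched in Python: take the direct #$isa links, then climb only the #$genls hierarchy. The data below mirrors the slide; the code is a toy, not Cyc's actual machinery:

```python
# isa links (instance -> collections) and genls links, mirroring the slide.
isa = {"Rover": ["Dog"], "Dog": ["BiologicalSpecies"]}
genls = {"Dog": ["Mammal"], "Mammal": ["Animal"],
         "Animal": ["Individual"], "Individual": ["Thing"]}

def all_isa(term):
    """Collections the term is an instance of: start from the direct isa
    links, then follow genls only (isa transfers through genls, not isa)."""
    collections = set(isa.get(term, []))
    frontier = list(collections)
    while frontier:
        for parent in genls.get(frontier.pop(), []):
            if parent not in collections:
                collections.add(parent)
                frontier.append(parent)
    return collections

print(all_isa("Rover"))  # Dog, Mammal, Animal, Individual, Thing
# BiologicalSpecies is absent: Dog isa BiologicalSpecies, but isa
# does not transfer through isa.
```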

A more complete list of collections of which #$Rover is an instance

  This kind of inheritance is very powerful.  It allows us to conclude many things from the single assertion that #$Rover is an instance of #$Dog, because of #$Dog’s place in the #$genls hierarchy.  Here's a more complete list of collections of which #$Rover is (through inheritance) an instance.

Is #$genls reflexive?

  Consider the formula (#$genls #$Dog #$Dog), the collection of all dogs is a generalization of the collection of all dogs.  Is this true?  It just means that every instance of #$Dog is an instance of #$Dog.  That’s true, and will be true not just for #$Dog, but for any other collection.  That means that #$genls is reflexive.

Is #$isa reflexive?

  #$isa, however, is not reflexive.  Consider: the collection of all dogs is not an instance of the collection of all dogs -- that collection is not itself a dog.  However, there are examples where formulas of this form are true.  The collection of all collections is itself a collection.  Therefore #$isa is not anti-reflexive.

Summary

  In summary, we’ve looked at collections and individuals and how to distinguish them.  We’ve looked at the concepts of #$isa and #$genls in CycL.  We discovered that #$genls is transitive and reflexive, but that #$isa isn’t transitive or reflexive.  These concepts are fundamental to knowledge representation in CycL.

pdf | zip | Microtheories

  This section is a continuation of the tutorial on foundations of knowledge representation in Cyc.  Our final topic: microtheories.

A Bundle of Assertions

The Cyc Knowledge Base (KB) can be thought of as a vast sea of assertions.  A microtheory is a set of assertions from that sea, an identification of a group of assertions that we pick out from the knowledge base.

Assertions can be bundled into microtheories based on shared assumptions, shared topics, shared sources, or other features.

Avoiding Inconsistencies

  One of the functions of microtheories is to separate assertions into consistent bundles.  Within a microtheory, the assertions must be mutually consistent.  This means that no hard contradictions are allowed, and any apparent contradictions must be resolvable by evaluation of the evidence visible in that microtheory.  In contrast, there may be inconsistencies across microtheories.

Consider the pair of microtheories on the lower left of the slide (MT1 and MT2), which differ primarily in terms of the granularity considered.  In the first microtheory (MT1), the granularity is that of ordinary human perception.  In that microtheory, tables, for example, are solid objects.  In the second microtheory (MT2), the granularity is that of particle physics.  In that microtheory, tables consist mostly of space.

Opposite those two microtheories is another example: three microtheories which differ primarily in time.  In the first microtheory, the latest of the three, Mandela is an elder statesman.  In the second microtheory, which is a bit earlier, Mandela is the president of South Africa.  In the third microtheory, which is earlier still, Mandela is a political prisoner.

Every Assertion is in a Microtheory

  Every assertion in the KB falls within at least one microtheory.  Currently every microtheory in Cyc is a specific reified, or named, term; each microtheory has its own constant.   Examples include the #$HumanActivitiesMt microtheory and the #$OrganizationMt microtheory.  These microtheories give us one way of indexing all of the assertions in Cyc.

Why Have Microtheories?

Why do we want microtheories? We want them because they enable better knowledge base building and better inference.

First, microtheories allow us to focus development of the KB.  By gathering the most relevant information together,  microtheories enable the KB builder to focus on that information, rather than wading continuously through the entire KB.  This focusing power also improves inference; reasoning can be focused on the most relevant information, reducing search space and improving efficiency.

Second, microtheories enable us to use terse assertions.  For example, if we use microtheories to gather together assertions that hold throughout 1995 and in South Africa, then when we make assertions or reason within that context, we can use a nice terse assertion such as “Mandela is president.”  Without the ability to form and specify a context, we would need to explicitly state the relevant assumptions.  That is, we would need to build such contextualization into the assertions, such as “Mandela is president throughout 1995 in South Africa.”  Microtheories make knowledge base building more efficient.

Third, microtheories allow us to cope with global inconsistency in the KB.  In building a knowledge base of this scale, covering different points of view, different times and places, different theories, and different topics, some inconsistency is inevitable.  Inconsistencies, however, can make accurate reasoning impossible.  Using microtheories, we can isolate terse assertions like the above from others with which they might be inconsistent, and reason within consistent bundles.
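A toy sketch of microtheory-scoped assertion and querying in Python. The microtheory names and assertion shapes are invented for illustration, not real Cyc constants:

```python
# Every assertion lives in some microtheory, so assertions that would
# contradict each other globally can coexist in separate bundles.
kb = {
    "SouthAfrica1995Mt": [("isPresident", "Mandela")],
    "SouthAfrica1985Mt": [("isPoliticalPrisoner", "Mandela")],
}

def ask(mt, query):
    """Answer a query using only the assertions visible in one microtheory."""
    return query in kb.get(mt, [])

print(ask("SouthAfrica1995Mt", ("isPresident", "Mandela")))  # True
print(ask("SouthAfrica1985Mt", ("isPresident", "Mandela")))  # False
```

The terse assertion ("Mandela is president") carries no explicit time or place; the microtheory it lives in supplies that context.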

,
Why Have Microtheories? (cont.)

  Because assertions are bundled into contexts, inference can focus on the most relevant assertions, and on those that share the current assumptions.

Since we use terser assertions, the inference engine has terser formulas to process, also increasing efficiency.

Because inconsistencies are isolated from each other (making the KB globally inconsistent but locally consistent), the inference engine can reason with the wide range of points of view, theories, and changes over time and space that are represented in the KB without running into inconsistencies.

,
Some types of microtheories

  There are many different types of microtheories in the Cyc KB.  On the slide is a sample of microtheory types which distinguish microtheories on the basis of how context-sensitive the information is.

For example, #$VocabularyMicrotheory is the collection of all microtheories which contain primarily definitional information, such as definitions of vocabulary having to do with transportation or with computer software.

#$TheoryMicrotheory is a collection of microtheories which contain substantial rules and knowledge that is stated in vocabulary from a specific #$VocabularyMicrotheory.  So for example, #$ComputerSoftwareMt microtheory might contain general principles about computer software (that it runs on computers, for example) using the vocabulary in #$ComputerSoftwareVocabMt.

#$DataMicrotheory, on the other hand, is the collection of all microtheories which contain specific individual-level information.  So for example, #$ComputerSoftwareDataMt might have assertions about specific software programs and their use, using the general principles in the #$ComputerSoftwareMt and the vocabulary in #$ComputerSoftwareVocabMt.

,
Some types of microtheories

  #$PropositionalInformationThing is a collection of microtheories, each of which contains the content (the information) contained in some information-bearing thing, such as a picture, movie, audio tape, or book.  For example, if we wanted to represent all of the informational content of this CycL course in Cyc, we would put it into a #$PropositionalInformationThing microtheory.  If we wanted to represent the course itself, it would be an #$InformationBearingThing, and the #$PropositionalInformationThing microtheory would be related to that #$InformationBearingThing with specific #$containsInformation assertions.

#$CounterfactualContext is an especially interesting collection of microtheories.   Each counterfactual context is a microtheory in which at least some of the assertions in it are not taken to be true.  One specialization of #$CounterfactualContext is #$FictionalContext.  Consider, for example, #$TheSimpsonsMt, which contains propositional information presented within The Simpsons TV show.  Clearly the writers of The Simpsons do not take these assertions to be literally, factually true, nor do they intend for their audience to believe them.  They want us to consider the propositions, but in some way differently than we would factual information.  It’s important to be able to distinguish microtheories like this and treat them differently from the way we treat microtheories which are supposed to be factual.

#$CounterfactualContext is also a generalization of #$HypotheticalContext.  Instances of #$HypotheticalContext allow us to consider what would happen if something were true, without actually treating that possibility in the KB as if it were factually true and modifying our current representation accordingly.

,
Microtheory predicates: #$ist

To say that an assertion is true within a microtheory, we use the predicate #$ist.  A formula of the form (#$ist MT FORMULA) means that the formula in the second argument place is true in the microtheory specified in the first argument place.  So for example, it’s true in the #$CyclistsMt microtheory that #$Lenat is an instance of #$Person.  It’s true in the #$NaiveStateChangeMt microtheory that if you have a freezing event and you have something that’s created in that freezing event, that created object is in a solid state.
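In CycL, the two examples might be written as follows. The first uses constants named in the text; the second is a simplified sketch of the slide’s rule, with the variable names and #$FreezingEvent assumed:

```cycl
(#$ist #$CyclistsMt (#$isa #$Lenat #$Person))

(#$ist #$NaiveStateChangeMt
  (#$implies
    (#$and (#$isa ?FREEZE #$FreezingEvent)
           (#$outputsCreated ?FREEZE ?OBJ))
    (#$stateOfMatter ?OBJ #$SolidStateOfMatter)))
```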

Why is it important to locate this last assertion in the #$NaiveStateChangeMt microtheory?  Because #$outputsCreated, that predicate in the second-to-last line, is restricted to use in events in which something is destroyed and something is created.  That’s the naïve view of something like freezing, where we talk about water disappearing and ice that is created.  But of course this would not be a physicist’s view of the event.  In a physicist’s view, nothing is created or destroyed; rather, something undergoes a change of state.  So, this assertion is local to #$NaiveStateChangeMt, and #$ist allows us to specify that.

,
Microtheory predicates: #$genlMt

To say that two microtheories are related by an inheritance relationship, we use the predicate #$genlMt.  A formula of the form (#$genlMt MT-1 MT-2) means that every assertion which is true in the microtheory in the second argument place is also true in the microtheory in the first argument place.  Another way of putting this is that MT-1 inherits from MT-2, or that MT-2 is visible from MT-1.
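Reconstructed from the discussion that follows, the three example sentences are:

```cycl
(#$genlMt #$TransportationMt #$NaivePhysicsMt)
(#$genlMt #$ModernMilitaryTacticsMt #$ModernMilitaryVehiclesMt)
(#$genlMt #$EconomyMt #$TransportationMt)
```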

The first example sentence states that the #$TransportationMt microtheory inherits from the #$NaivePhysicsMt microtheory.  When you’re in the #$TransportationMt microtheory you can see all of the assertions in #$NaivePhysicsMt and use them.

The second example sentence states that #$ModernMilitaryTacticsMt inherits all of the assertions from #$ModernMilitaryVehiclesMt, so if you’re talking about tactics you can use the knowledge about vehicles.

The third example sentence states that the #$TransportationMt is visible from the #$EconomyMt.  When you are reasoning in the #$EconomyMt, say about imports and exports, you can use the knowledge about transportation.

#$genlMt is transitive.  Notice that the first sentence says that #$NaivePhysicsMt is visible from #$TransportationMt, and the third says that #$TransportationMt is visible from #$EconomyMt.  Since #$genlMt is transitive, it follows that #$NaivePhysicsMt is visible from #$EconomyMt.

,
Microtheory predicates, cont’d

Here’s a sample #$genlMt hierarchy.

The #$TransportationMt microtheory inherits from the #$NaivePhysicsMt microtheory, meaning that all naïve physics knowledge is visible in #$TransportationMt.  Since #$NaivePhysicsMt inherits from #$NaiveSpatialMt and #$genlMt is transitive, all of the assertions in #$NaiveSpatialMt are also visible in #$TransportationMt.  Given the #$genlMt hierarchy on this slide, if you are writing an assertion or asking a query in #$TransportationMt, you can rely on all of the knowledge in #$NaivePhysicsMt, #$NaiveSpatialMt, #$MovementMt and #$BaseKB.

#$NaturalGeographyMt, on the lower left of the slide, inherits from #$NaiveSpatialMt as well, but doesn’t inherit from #$NaivePhysicsMt or #$MovementMt.  If you’re stating an assertion or asking a query within the #$NaturalGeographyMt microtheory, you can count on the information in #$NaiveSpatialMt and in #$BaseKB, but not the information in #$NaivePhysicsMt or #$MovementMt.
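The hierarchy described above can be sketched in #$genlMt assertions; the slide itself is not shown here, so the placement of #$MovementMt and the link to #$BaseKB are assumptions inferred from the discussion:

```cycl
(#$genlMt #$TransportationMt #$NaivePhysicsMt)
(#$genlMt #$NaivePhysicsMt #$NaiveSpatialMt)
(#$genlMt #$NaivePhysicsMt #$MovementMt)         ;; assumed placement
(#$genlMt #$NaturalGeographyMt #$NaiveSpatialMt)
(#$genlMt #$NaiveSpatialMt #$BaseKB)             ;; assumed link
```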

,
Finding the right microtheory

Microtheory placement is very important in both assertion-making and query-asking.

Consider: an assertion is visible only in the Mt’s that inherit from the microtheories in which it is placed.  When a query is answered, the inference answering the query uses exactly those assertions that are in microtheories visible from the microtheory in which it is asked.  In other words, the query is answered using only the information available in the current microtheory or above it in the microtheory hierarchy.  Furthermore, for an assertion or query to be well-formed (to make sense in Cyc), the definitional information about the terms it uses must be visible from the microtheory where it is asked or asserted.

This gives rise to a certain tension.  If you’re making an assertion and you want it to be visible everywhere in which it might be needed, you want to place it fairly high up, not too low down.  On the other hand, if you want to make sure that your assertions are not visible where they’re not needed, so that you minimize your search space for inference, you don’t want to place your assertions too high;  you want to place them lower, more specifically.

As it turns out, this tension has different consequences depending on the application that is being developed.  If accuracy (or completeness) is more important than efficiency, KB builders will tend to place their assertions higher in the microtheory hierarchy.  If efficiency is more important, they’ll tend to place them lower and sacrifice completeness.  In either case, though, good microtheory placement is very important.

,
Finding the right microtheory

  There are many instances of #$Microtheory in the KB – several thousand, in fact.  At the moment, there is no substitute for familiarity with the microtheories relevant to the area in which you are working.

To understand a microtheory in the KB, the first thing to do is to read the comment.  Examining the assertions can also reveal the intent behind the microtheory, as can examining where it falls in the microtheory hierarchy.

,
Forthcoming Changes/Improvements

Cycorp has identified several forthcoming changes and improvements that will enable us to get the most out of the microtheory hierarchy.

To maximize the efficiency of ontology building and inference, we want microtheories that are dynamically generated.  We don’t want to specify ahead of time which gatherings, or bundles, of assertions might be most relevant.  We want to be able to do that on the fly, according to any features that we need at the moment.

We’d like to have power tools that suggest where in the microtheory hierarchy a query or assertion can be placed.  This would assist a KB builder or application user in finding the right information without requiring them to have full familiarity with all possibly relevant microtheories.  We’d also like more specifically targeted, smaller contexts.  In other words, we’d like to re-place assertions to get exactly the right place in the hierarchy, to counteract some of the inexactness coming from the tension described earlier.

In order to have those features we need more explicit representation of the context features than we currently have.  We’d like to explicitly represent the topic, the level of granularity, and the time period in which a microtheory holds as well as other features that, in our experience, tend to be relevant for inference and knowledge representation.  Further, for each such feature, a deep and explicit representation of the relationship that holds between contexts that differ along that feature, for example contexts that differ in time,  is also needed.

These developments are covered under two projects.  The Rapid Knowledge Formation project includes the development of power tools to assist in microtheory placement.  The internal Context Overhaul project includes moving to dynamic generation, smaller context definition, and more explicit context definition.

,
Summary

  In summary, we’ve reviewed what a microtheory is: a bundle of assertions out of the Cyc KB.  We've looked at some reasons for having microtheories, and the benefits for both knowledge base building and inference.  We’ve seen some sample types of microtheories.  We’ve looked at the two foundational  microtheory predicates: #$ist and #$genlMt.  Finally, we reviewed considerations in finding the right microtheory: not too specific and not too general.

Predicates and Denotational Functions

pdf | zip | The Basics

This is a talk about predicates and denotational functions.  You’ve probably heard a little bit about these in other talks before, but we’ll go into a little more detail than you might have heard before.  The talk will be divided into five sections.  We’ll start with The Basics.

,
Predicates and Denotational Functions

First off, this is a diagram just to show where the two collections that we’re going to be concerned with figure into a hierarchy – a collection hierarchy – in CycL.  As you can see, a #$Predicate is a kind of #$TruthFunction and a #$Function-Denotational, or denotational function, is a kind of #$Relation; a #$TruthFunction is also a kind of #$Relation.  So both of these things are specific kinds of relations that are commonly used in the CycL language.

,
#$Predicate

 Let’s look at a few basic points about predicates.  Here are three instances, or examples, of specific predicates: #$mother, #$objectHasColor, and #$memberStatusOfOrganization.  Now, the primary use for predicates is to make sentences.

,
#$Predicate

 Let’s see how these particular examples would be used to make sentences.
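Reconstructed from the discussion below; the argument order for #$memberStatusOfOrganization and the constants #$Rover, #$TanColor, and #$FoundingMember are assumptions:

```cycl
(#$mother #$ChelseaClinton #$HillaryClinton)
(#$objectHasColor #$Rover #$TanColor)
(#$memberStatusOfOrganization #$NATO #$Norway #$FoundingMember)
```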

The first example applies the predicate #$mother to Chelsea Clinton and Hillary Clinton.  So the entire expression there is a sentence in CycL which means that Chelsea Clinton’s mother is Hillary Clinton.

The second example uses the second predicate and that sentence means that Rover is tan colored.

Finally, the third sentence applies the third example predicate to the three arguments shown, and it means that Norway is one of the founding members of NATO.

,
#$Predicate

One important feature of predicates is that when you build a sentence with a predicate, the result is true or false.  Because the result is a sentence, it’s either going to be true or false.  When I say that Chelsea Clinton’s mother is Hillary, that is a true sentence.  But if I say something like “the White House has the Lincoln Memorial as one of its physical parts,” that is a sentence of course, but it’s not true.  It’s a false sentence.  So predicates are sometimes called truth-functional relations because when you apply them to arguments, the resulting sentence will either be true or false.

,
#$Predicate

  Now a further point about the truth or falsity of a sentence made with a predicate is that whether a given sentence is true or false depends upon certain facts about the world and isn’t just a matter of logic all by itself.  So for example, our example sentence states that Chelsea Clinton’s mother is Hillary Clinton.  This happens to be true because, as a matter of fact, Hillary Clinton is Chelsea Clinton’s mother.  So it’s true, and it’s true because of facts about the world.

The second sentence that we saw, which says that the Lincoln Memorial is a part of the White House, is false because it’s just a fact that  the Lincoln Memorial is not a part of the White House.

Sometimes this feature of predicates is summed up by saying that they’re extra-logical relations.  In other words, whether they result in true or false sentences depends not just on pure logic, but on facts beyond the world of logic; facts about the physical world, for example.

,
#$Function-Denotational

  Now let’s turn to some basic points about denotational functions.  Some example denotational functions are #$MotherFn, #$BorderBetweenFn, and #$GroupFn.  As opposed to predicates, which are mainly used to form sentences, functions are mainly used to form terms.  And since these are complex (or compound) terms that are built out of other ones, they’re sometimes called non-atomic terms.

,
#$Function-Denotational

Let’s look at some examples using our example functions:
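The slide’s terms, reconstructed from the discussion that follows (the argument to #$GroupFn is assumed for illustration):

```cycl
(#$MotherFn #$ChelseaClinton)
(#$BorderBetweenFn #$Sweden #$Norway)
(#$GroupFn #$Person)
```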

If we take the #$MotherFn function and apply it to one term, #$ChelseaClinton, we have a non-atomic term, which is the entire line of the first example below “Terms using these functions.”

Now, the #$BorderBetweenFn happens to take two arguments at a time.  So when we apply it to two arguments, we have a term.

#$GroupFn takes one argument, so again, one term.

,
#$Function-Denotational

A very important feature of terms -- including those that are built with functions -- is that they denote things; that is, they stand for or refer to things.  So, the example of #$MotherFn applied to Chelsea Clinton denotes Hillary Clinton.  You should think of that term as being equivalent to the compound English phrase “the mother of Chelsea Clinton,” who is in fact Hillary Clinton.

You should think of the second example, where we applied the #$BorderBetweenFn to Sweden and Norway, as a non-atomic term that denotes the border between Sweden and Norway – in other words, a certain geographical line.

Non-atomic terms are like other terms in that you can apply a predicate or a function to a non-atomic term and therefore form a sentence or another (even more complex) non-atomic term.

,
How to Tell The Difference Between Predicates and Functions in CycL

Telling the difference between a predicate and a function in the CycL language is easy.  Just remember that the name of a predicate always begins with a lower-case letter.  With very few exceptions (that you don’t need to worry about at this point),  only predicates start with lower-case letters.  So if you see something starting with a lower-case letter, it’s a predicate.  Functions, on the other hand, are a little looser as far as their possible names go.  Usually, the name of a function will have Fn at the end of it, or sometimes somewhere in the middle.  Almost always it will have an Fn as part of it.  But that’s not a hard and fast rule.  There are some exceptions, as you’ll learn, but for the most part, if you’ve got the Fn there, you know it’s a function.

To see some examples that are comparable to each other that we’ve already looked at, let’s review.  You have #$mother, which is a predicate (you can tell because it’s lower-case), and that’s different from #$MotherFn, which is of course a function, not a predicate. The first one, as we know, can be applied to two arguments and it gives you a sentence.  The second one applies to one argument and it gives you a non-atomic term.

Now, below that are two other examples that you haven’t seen yet.  There’s a predicate called #$sponsors and there’s a corresponding function called #$SponsorFn-Agent.  The predicate #$sponsors, like all predicates, can be used to form a sentence.  The sentence there, “(#$sponsors #$GeneralMotors #$USOlympicTeam),” is a CycL sentence that’s equivalent to saying that General Motors is one of the sponsors of the US Olympic Team.  Below that is a non-atomic term built with a function, applying #$SponsorFn-Agent to #$USOlympicTeam.  So, that is a non-atomic term.  It’s not a sentence, so it’s not something that’s true or false; it’s a term that denotes something.  Think about what it denotes: the US Olympic Team doesn’t have just one sponsor, it has a number of them.  So this full non-atomic term denotes, not one particular sponsor (because there is no one sponsor of the US Olympic Team), but the collection of sponsors of the US Olympic Team.  And if you looked at the definition of #$SponsorFn-Agent, you would see that stated explicitly.

,
Two Central Features of Predicates and Functions

There are a couple of important features that every predicate and function has.  First of all, arity.  Arity has to do with how many arguments a predicate or function requires, or in other words, how many arguments you have to apply the function to at a given time to result in a meaningful sentence or term.

Then there’s also the notion of argument types, which has to do with what types of things a predicate or function requires as a particular argument.

We’ll be getting into arity and argument types in more detail in the next two lessons.

,
Summary
  • Predicates are used to make sentences
  • Predicates are truth-functional relations
  • Predicates are extra-logical relations
  • Functions are used to make non-atomic terms (NATs)
  • NATs denote things
  • Differentiating between predicates and functions in CycL
  • Two central features of predicates and functions are arity and argument types

 This concludes the lesson on the basics of predicates and denotational functions.

pdf | zip | Arity
Predicates and Denotational Functions

Now we’re going on to the second of the five lessons in this section.  I will introduce arity, which you’ve already seen, and we’ll delve into it in more detail.

,
Specifying Arity

Arity, as you know, refers to the number of argument places that a particular predicate or function has.  There are two ways to express the arity of a particular predicate or function in the CycL language.

First of all, we have a predicate, #$arity, which you can apply to any relation – in other words, any predicate or function – in conjunction with a numeric value to denote how many arguments that relation accepts.  For example, (#$arity #$GroupFn 1) denotes that the #$GroupFn function accepts only one argument.  The #$mother predicate has an arity of two; thus it takes two arguments at a time.

The second method of expressing arity in CycL is through membership in a collection that encodes a specific arity.  There are several pre-defined collections of this kind.  For example, the collection #$UnaryPredicate is the collection of all predicates whose arity is one.  #$BinaryFunction is the collection of all functions whose arity is two.  So, instead of directly stating the arity of a particular function (for example #$GroupFn), you could say that #$GroupFn is an instance of the collection #$UnaryFunction, thus assigning #$GroupFn an arity of one.  Another example would be to say that #$mother is an instance of the #$BinaryPredicate collection, thus assigning #$mother an arity of two.
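In CycL, the two methods look like this:

```cycl
;; Stating arity directly with the #$arity predicate:
(#$arity #$GroupFn 1)
(#$arity #$mother 2)

;; Stating arity via membership in an arity-specific collection:
(#$isa #$GroupFn #$UnaryFunction)
(#$isa #$mother #$BinaryPredicate)
```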

,
Keep Arity Low

Most relations in CycL have low arities; in fact, most have an arity of just one or two.  There are some relations that take three or four arguments, a few that take five, and a very few that take more than five.  Seven is probably the highest arity used, although in principle arity could be any number.  Try to keep arities on the low side.

,
Unary Properties

  In a sense, there is an exception to the statement that most predicates and functions have an arity of one or two.  So far, there are very few instances of #$UnaryPredicate in CycL.  There are a lot of unary functions, but very few unary predicates.  The reason for this has to do with the Cyc Inference Engine and certain facts about how it works most efficiently.   There are alternative ways to express what you might think of intuitively as a unary property.  Where you could do it with a unary predicate, we usually do it in one of these other ways.

The first way has to do with using collections to express concepts that could have been expressed using a unary predicate.  For example, consider the concept of a dog.  We could apply a unary predicate, #$dog, to the name #$Lassie.  This would allow us to say “Lassie is a dog.”  Instead, we assert that #$Lassie is an instance of the collection called #$Dog.  So, to say that Lassie or Rover is a dog, we do so with an #$isa statement, relating the particular dog to the collection of all dogs.
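The contrast, in CycL (the unary predicate #$dog is hypothetical; the #$isa form is the one actually used):

```cycl
;; Hypothetical unary-predicate version (not used in the KB):
;; (#$dog #$Lassie)

;; The collection-based version actually used:
(#$isa #$Lassie #$Dog)
```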

,
Unary Properties

  An alternate way to represent unary properties, without doing so in terms of a predicate, is with what we call an attribute, or #$AttributeValue.  For example, if we wanted to express the concept of something having the color tan, we could do that with a predicate like #$tanColored and say (#$tanColored #$Rover),  but instead we do it with an attribute value.  In other words, we treat tan color as a type of attribute that a thing can possess, and then we have certain predicates that relate individual things to attributes that they possess.   An example of one of these predicates is #$objectHasColor.  You could use this predicate to say that Rover has the color of tan.  Tan is being represented as an attribute value rather than as a unary predicate.
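A sketch of the attribute-value approach (#$Rover and #$TanColor are assumed constant names):

```cycl
;; Hypothetical unary-predicate version (not used in the KB):
;; (#$tanColored #$Rover)

;; The attribute-value version, relating the object to a color attribute:
(#$objectHasColor #$Rover #$TanColor)
```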

,
Variable-arity

CycL has some predicates and functions which are called “variable arity.”  This means that they can take a variable number of arguments; in other words, a variable arity relation doesn’t always take, say, just two arguments at a time, or just three arguments at a time -- it might take two or three or four, depending on the situation.

The collection of all of these variable-arity relations is called #$VariableArityRelation.  Here are a couple of examples.  The predicate #$different is variable-arity, so it is defined in such a way that it can take two or more arguments.  In other words, it has to have at least two, but it can have three, four, or more arguments.  When you apply the #$different predicate to two or more things, the resulting sentence means that all of those things are different from each other.  You can see why it’s convenient to have variable-arity.  Otherwise we’d have to have a bunch of different predicates, depending on the number of things we wanted to differentiate from each other --  this way we can do it all with one.

An example of a variable-arity function is #$JoinListsFn.  This function takes as its arguments two or more lists, and concatenates them into one list.  You can also apply that function to three or four lists, and it will do the same thing.  So, again, it’s a convenient way of doing something that would otherwise require a bunch of functions to do.
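A sketch of both examples; the specific argument constants and the #$TheList terms are assumptions for illustration:

```cycl
;; #$different with two, then four, arguments -- both are acceptable:
(#$different #$HillaryClinton #$ChelseaClinton)
(#$different #$Norway #$Sweden #$Lenat #$Rover)

;; A variable-arity function applied to two, then three, lists:
(#$JoinListsFn (#$TheList 1 2) (#$TheList 3 4))
(#$JoinListsFn (#$TheList 1 2) (#$TheList 3 4) (#$TheList 5 6))
```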

,
Well-formedness

Every relation has an arity, as we’ve seen.  You have to apply a given relation to the proper number of arguments, according to its arity, if the result is going to be what we call “syntactically well-formed.”

Take for example the predicate #$objectHasColor.  We’ve seen this predicate before and we know that it’s binary, or a two-place predicate.  So if I form an expression with only one argument, I’ve violated its arity.  Even though, strictly speaking, (#$objectHasColor #$Rover) is a term in CycL, it’s a syntactically malformed term and basically meaningless.

The second example does apply #$objectHasColor to the right number of arguments for its arity, so it is syntactically well-formed.  You’ll notice that what it says is that Rover is blue.  Assuming that Rover is a dog, it is very unlikely that Rover really is blue, so you can see that it’s a false sentence, but still it’s syntactically well-formed.
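The two expressions discussed above (#$BlueColor is assumed as the name of the color constant):

```cycl
;; Violates arity (one argument for a binary predicate) --
;; syntactically malformed:
(#$objectHasColor #$Rover)

;; Correct arity -- syntactically well-formed, though false:
(#$objectHasColor #$Rover #$BlueColor)
```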

,
Summary
  • The number of argument places a predicate or function has is its “arity”
  • Two ways to specify arity
    • #$arity
    • Collections that denote arity
  • Representing unary properties in CycL
  • Variable-arity predicates and functions

This concludes the lesson on arity of predicates and denotational functions.

pdf | zip | Argument Types
Predicates and Denotational Functions

Now we’ll talk about argument types in a little more detail.

,
Argument Types

Argument types have to do with the types of things that a predicate or function requires as its arguments. If you look at the first example expression on the slide, it takes the #$MotherFn function and applies it to Hillary Clinton, and of course we know that that gives you a non-atomic term that would denote the mother of Hillary Clinton, Dorothy Rodham.  This makes perfect sense.

But consider the second example; there’s something really odd about it.  I’ve taken the #$MotherFn function and applied it to the White House.  That’s perfectly fine as far as arity goes, but there’s something strange about that argument.  The White House is a building, and a building is not the kind of thing that can even have a mother; so there’s something really weird about that term.   It is a term that is syntactically well-formed, but its meaning is nonsensical.  So you would say that the argument that #$MotherFn is being applied to is not the right type of argument for that function (it’s not an appropriate or correct type of argument for the function we have).
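The two slide expressions discussed above:

```cycl
;; Sensible: denotes the mother of Hillary Clinton, i.e. Dorothy Rodham:
(#$MotherFn #$HillaryClinton)

;; Syntactically well-formed but nonsensical -- a building has no mother:
(#$MotherFn #$WhiteHouse)
```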

As you can see from the second example, we need to be able to specify, in the CycL language, what types of arguments are appropriate for a particular function or predicate.

,
Argument Types

  There are two ways you can indicate the appropriate types of arguments for particular functions or predicates.  The first is a group of predicates, generically represented as #$arg[N]Isa, which state what collection the Nth argument of the relation they are applied to must be an instance of.  For example, the #$arg2Isa predicate states that the second argument of a particular relation must be an instance of a particular collection.

The other way to specify a relation’s argument types is with a similar group of predicates, generically represented as #$arg[N]Genl, which state that a certain relation, with respect to the Nth argument place, must take an argument that is a specialization, or sub-collection, of a particular collection.

Now let’s see how these are actually used.

,
Argument Types

Refer to the slide for some examples of argument type designation.  The first bullet under Example 1 says that the first argument of the #$mother predicate must be an instance of the collection #$Animal.  The second bullet under Example 1 says that the second argument of the #$mother predicate must be an instance of the #$FemaleAnimal collection.  So, basically, what the two sentences say is that the #$mother predicate relates an animal to a female animal.
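Example 1’s declarations and the resulting sentence, reconstructed from the discussion:

```cycl
(#$arg1Isa #$mother #$Animal)
(#$arg2Isa #$mother #$FemaleAnimal)

;; Well-formed: Chelsea Clinton is an #$Animal and
;; Hillary Clinton is a #$FemaleAnimal:
(#$mother #$ChelseaClinton #$HillaryClinton)
```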

Refer to the third sentence, applying the #$mother predicate to Chelsea Clinton and  Hillary Clinton.  This sentence is well-formed based on the bulleted argument type declarations on the slide.  Chelsea Clinton is an instance of #$Animal and Hillary Clinton is an instance of #$FemaleAnimal.  Those arguments are both of the right types for the predicate in question -- #$mother.

,
Argument Types

  Now let’s look at the same sort of thing with respect to a function, the #$TransportViaFn function, which we haven’t seen yet.  The specifications on the slide under Example 2 tell us about the #$TransportViaFn function’s first argument.  In fact it’s just a unary function, so it only has one argument.  The first bullet states that the one argument must be an instance of the #$ExistingObjectType collection.  The second bullet states that the argument must be a specialization of the #$SolidTangibleThing collection.
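Example 2’s declarations and a well-formed application, reconstructed from the discussion:

```cycl
(#$arg1Isa #$TransportViaFn #$ExistingObjectType)
(#$arg1Genl #$TransportViaFn #$SolidTangibleThing)

;; Well-formed: #$Automobile satisfies both constraints:
(#$TransportViaFn #$Automobile)
```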

The application of #$TransportViaFn to the collection, #$Automobile, is well-formed.  This is because #$Automobile is an instance of #$ExistingObjectType and is a specialization of #$SolidTangibleThing.  So it meets the argument type constraints, and the arity constraints are met as well.

,
Argument Types

To sum up the points that we’ve been discussing, a relation needs to be applied to arguments that meet its argument type specifications in order for the result to be what we call “semantically well-formed.”  Reviewing some earlier examples, #$MotherFn, applied to the White House, is semantically malformed, but when it’s applied to Hillary Clinton the result is semantically well-formed.

Let’s compare that to the other notion of well-formedness that we talked about in the previous lesson.  If a relation is applied to arguments which meet that relation’s argument type specifications, the result is said to be semantically well-formed.  If a relation is applied to the correct number of arguments for its arity, then the result is said to be syntactically well-formed.

,
Examples of Predicate Use

  Let’s look at some examples and determine whether they’re well-formed or not.  First we’re given some information about the #$objectHasColor predicate. We’re told it’s a binary predicate and that its first argument must be an instance of #$SpatialThing-Localized.  So we know that  it’s some spatial thing that has some location in the physical universe.  We’re also told that its second argument has to be a color.  Consider the sample sentences on the lower half of the screen and determine if they’re well-formed.
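Reconstructed from the discussion that follows; the extra argument in the first sample and the color constants are assumptions:

```cycl
;; Given:
(#$isa #$objectHasColor #$BinaryPredicate)
(#$arg1Isa #$objectHasColor #$SpatialThing-Localized)
(#$arg2Isa #$objectHasColor #$Color)

;; Sample sentences:
(#$objectHasColor #$Rover #$TanColor #$BrownColor)  ;; 1. too many arguments
(#$objectHasColor #$Emerald-Gem #$GreenColor)       ;; 2. wrong type of arg1
(#$objectHasColor #$WhiteHouse #$WhiteColor)        ;; 3. well-formed
```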

The first example applies #$objectHasColor to three arguments.  This presents a problem since #$objectHasColor is a binary predicate, meaning that it takes only two arguments to be syntactically well-formed.  And since it is not syntactically well-formed, it is therefore not semantically well-formed.  A sentence cannot be semantically well-formed if it’s not syntactically well-formed.

,
Examples of Predicate Use

  The second sentence obeys arity because it applies the predicate to two arguments, so it’s syntactically well-formed.  Now let’s determine if it’s semantically well-formed.  The first argument, #$Emerald-Gem, as you might be able to guess from the name, is the collection of all emeralds -- not a particular emerald.  Does that meet the arg type constraints?  We’re told above that the first argument must be an instance of #$SpatialThing-Localized.  Even though any individual emerald is, of course, a spatial thing localized, the collection of all emeralds is not.  The collection of all emeralds is an abstract collection that includes all emeralds, so it’s not something that has a location in the physical universe.  Therefore, #$Emerald-Gem violates the arg type constraint for the first argument place of the predicate, thus making this example not semantically well-formed.

,
Examples of Predicate Use

  Now consider the third example sentence.  Arity is obeyed, so it’s syntactically well-formed.  In the first argument place we have the White House.  As you can probably tell from that name, it is not a collection, but a particular building.  A particular building is a spatial thing localized,  so that argument meets the arg constraint for the first argument position.  The second argument position requires an instance of #$Color.  Since #$WhiteColor is an instance of #$Color, the constraint on the second argument position is also fulfilled.  So we can say that this example is both syntactically and semantically well-formed.  This sentence simply says that the White House is white.

The last example is similar to the third; it just substitutes #$PinkColor for #$WhiteColor.  Even though the sentence is false (the White House is not pink), the sentence is semantically well-formed.  In fact, if a sentence isn’t semantically well-formed, it can’t be true or false.  So, the fact that a sentence is true or false implies that it is semantically well-formed.
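Putting the last three example sentences side by side makes the verdicts easy to compare.  The constant names here are illustrative (in particular, #$GreenColor is a hypothetical stand-in for whichever color the slide uses in its second example):

  (#$objectHasColor #$Emerald-Gem #$GreenColor)   -- not semantically well-formed (arg1 is a collection)
  (#$objectHasColor #$WhiteHouse #$WhiteColor)    -- well-formed, and true
  (#$objectHasColor #$WhiteHouse #$PinkColor)     -- well-formed, but false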

,
Summary

 This concludes the lesson on Argument Types.

pdf | zip | Second-order Predicates
Predicates and Denotational Functions

Now, let’s talk a little more about predicates and go into further details.

,
Second-Order Predicates

Sometimes we want to make a statement about a predicate itself, or even relate two predicates to each other.  To do so requires the use of a special kind of predicate, loosely called a “second-order predicate.”  A second-order predicate is one which can be applied, not only to individual objects, but also to relations themselves.  Some of the example predicates already discussed in this tutorial are second-order predicates.  For example, #$arg1Isa relates a predicate to a certain collection, so it’s a second-order predicate.  Refer to the #$arity predicate example on the slide.  In this example, #$arity relates  the predicate, #$mother, to the term, 2.  #$isa is also a second-order predicate because it relates something to a collection that the thing belongs to.  Some collections are collections of predicates.  For example, if we say (#$isa #$mother #$BinaryPredicate), #$isa is being used to relate the predicate, #$mother, to a certain collection, #$BinaryPredicate.  #$isa is a second-order predicate in this sense.
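Written out in CycL, the two second-order assertions just mentioned are:

  (#$arity #$mother 2)
  (#$isa #$mother #$BinaryPredicate)

In both sentences the first argument is itself a predicate, #$mother, which is exactly what makes these uses of #$arity and #$isa second-order.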

There are special second-order predicates that always relate predicates to each other.  #$isa can relate all sorts of different things to each other and sometimes it relates predicates to collections of predicates.  In that sense, #$isa is used in a second-order way.  However, there are some second-order predicates which are always used to relate predicates to each other.  We’re going to discuss four of them: #$genlPreds, #$genlInverse, #$negationPreds, and #$negationInverse.

,
#$genlPreds

Predicates relate to each other within a structure of predicates in Cyc.  We can think of this structure as a hierarchy, because some predicates are more general than others.  The first special predicate we’re going to look at is called #$genlPreds.  Let’s look at its definition.  #$genlPreds is a binary (#$arity = 2) predicate which takes a predicate as both its first and second arguments. If we apply #$genlPreds to two predicates (represented on the slide as the variables  ?NARROW-PRED and ?WIDE-PRED), then the first argument, ?NARROW-PRED, is a restricted version of the second argument, ?WIDE-PRED.  In other words, any arguments of which the ?NARROW predicate is true, are also arguments of which the ?WIDE predicate is true.  We’ll look at some specific examples in the following slides, but first note that predicates of any arity may be related with #$genlPreds provided that both predicates have the same arity.

,
#$genlPreds

  In the first example on the slide, the predicate #$biologicalMother is related, by #$genlPreds, to the predicate #$biologicalParents.  This means that if x is the biological mother of y, then x is also a biological parent of y.  So, as you can see, #$biologicalMother is just a more restricted, or narrower, version of the predicate, #$biologicalParents.

In the second example on the slide, #$genlPreds holds between the predicate #$createdBy and the predicate #$startsAfterStartingOf.  In other words, if x is created by y, then x starts (or x begins to exist) after y begins to exist.  This says that the creator of something has to exist before the creation exists.
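As CycL assertions, the two slide examples read:

  (#$genlPreds #$biologicalMother #$biologicalParents)
  (#$genlPreds #$createdBy #$startsAfterStartingOf)

And the general pattern behind #$genlPreds, sketched here for the binary case, is:

  (#$implies
    (?NARROW-PRED ?ARG1 ?ARG2)
    (?WIDE-PRED ?ARG1 ?ARG2))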

,
#$genlInverse

  Let’s now consider another second-order predicate, #$genlInverse.  It’s similar to #$genlPreds, but with a twist.  By stating that #$genlInverse holds between two predicates, we’re saying that the first predicate is a narrower version of the second predicate with its argument order reversed.  So if the narrower predicate holds between arguments a and b, the wider predicate holds between arguments b and a, in that order.

,
#$genlInverse

  How would we express that in a rule, say with an #$implies statement (assume the arity for both predicates is 2)?

,
#$genlInverse

  If ?NARROW-PRED relates argument 1 and argument 2, then ?WIDE-PRED-INV relates the inverse argument set, or argument 2 and argument 1.

With #$genlInverse, the two predicates you’re relating have to be binary predicates.  You’ll remember with #$genlPreds they could be any arity as long as they were the same.  So this is a little more restricted.

Let’s look at one specific example, there at the bottom of the slide.  This says that the predicate, #$customers, is related by #$genlInverse to the predicate, #$suppliers.  This means that if a is a customer of b, then b is a supplier to a.
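Sketched as CycL, the rule for the binary case and the slide’s example are:

  (#$implies
    (?NARROW-PRED ?ARG1 ?ARG2)
    (?WIDE-PRED-INV ?ARG2 ?ARG1))

  (#$genlInverse #$customers #$suppliers)

(The argument order shown follows the reading given above: if a is a customer of b, then b is a supplier to a.)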

,
#$negationPreds

  The third second-order predicate is called #$negationPreds. If this predicate is applied to two predicates (P1 and P2) and P1 holds for a certain set of arguments (a1, a2, ..., aN), then P2 does not hold for P1's argument set.  Similar to #$genlPreds, the two predicates related by #$negationPreds can be of any arity, as long as they are both of the same arity.

The last line on the slide is an example of a sentence using #$negationPreds to relate #$owns to #$rents.  This says that if a owns b, then it’s not the case that a rents b; you can’t both own and rent a given thing at the same time.

(Note: the predicate #$rents means “rents from someone,” not “rents to someone.”)
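The slide’s example, and the entailment it licenses, can be written as:

  (#$negationPreds #$owns #$rents)

  (#$implies
    (#$owns ?AGENT ?THING)
    (#$not (#$rents ?AGENT ?THING)))

(The second formula is our paraphrase of what the #$negationPreds assertion entails, not a separate assertion from the slide.)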

,
#$negationInverse

  The final second-order predicate we’ll discuss is #$negationInverse.  You can think of this predicate as an equivalent to the combination of #$genlInverse and #$negationPreds. That is, if #$negationInverse is applied to two
predicates and the first predicate holds for two arguments, then it's not the case that the second predicate will hold for the same two arguments in reverse argument order.  Similar to #$genlInverse, the two predicates being related both have to be binary.

The example in the last line on the slide is interesting because #$negationInverse relates #$subordinates to itself.  If we take a closer look at the meaning of this sentence, we see that if a is a subordinate of b, then it is not the case that b is a subordinate of a.  Thus, any binary predicate which is asymmetric will be related by #$negationInverse to itself, because that’s just another way of saying that the predicate is asymmetric.  In fact, if you look at the extension of #$negationInverse in the KB (Knowledge Base), you’ll find that most of the cases represented in the KB involve relating a predicate to itself (of course those are the asymmetric ones).
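In CycL, the self-relating example and its consequence look like this (the second formula is our paraphrase of the entailment):

  (#$negationInverse #$subordinates #$subordinates)

  (#$implies
    (#$subordinates ?A ?B)
    (#$not (#$subordinates ?B ?A)))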

,
Summary

 This concludes the lesson on second-order predicates.

pdf | zip | More On Functions
Predicates and Denotational Functions

For the final section of this tutorial,  I am going to say a little more about functions.

,
Function Result Types

  An important thing about functions, one that doesn’t really apply in the case of predicates, is that when you apply a function to arguments, the result is not something true or false like a sentence; it is a term that denotes something.  It would be useful to be able to say what kind of thing a given function returns when you apply it to appropriate arguments.  We have ways of doing that in CycL.

#$MotherFn, which we’ve seen before, always returns a female animal.  So, if I apply the #$MotherFn function to any appropriate argument, the resulting NAT (non-atomic term) would denote somebody’s mother, some female animal.  We can specify that to help define the meaning of the function.

#$TransportViaFn always returns some collection of transportation events, so we also want to have a way to be able to say that a certain function returns a collection of a certain type.  So how do we do that in CycL?

,
Function Result Types

There are two ways of doing that.

First is with a predicate called #$resultIsa.  #$resultIsa relates a function to a type of thing (by “type of thing” we mean a collection of things) such that the result of the function must be an instance of that collection.

Second is #$resultGenl.  This relates a function to a type of thing (a collection, again) such that the result of the function must be a specialization of that collection.  So this predicate applies only to functions that denote collections, because only collections are specializations of other collections.

,
Function Result Types: #$resultIsa

  Look at the first example.  We see that the “result is a” for #$GovernmentFn is #$RegionalGovernment.  That tells us that the #$GovernmentFn function always returns an instance of #$RegionalGovernment.  So, if you look below, if you form a term by applying the #$GovernmentFn function to Sweden, we know that will return a regional government.  In other words, that NAT denotes something that is an instance of #$RegionalGovernment.

Similarly, if I apply the #$GovernmentFn function to #$CityOfAustinTX, again, that is an instance of #$RegionalGovernment.  It has to be, by the definition of #$GovernmentFn in terms of its #$resultIsa.
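Written out in CycL, the definition and its two consequences are:

  (#$resultIsa #$GovernmentFn #$RegionalGovernment)

  (#$isa (#$GovernmentFn #$Sweden) #$RegionalGovernment)
  (#$isa (#$GovernmentFn #$CityOfAustinTX) #$RegionalGovernment)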

,
Function Result Types: #$resultIsa and #$resultGenl

  Let’s look at another example where #$resultIsa and #$resultGenl are both applied to #$TransportViaFn.  We’re told that the function always returns an instance of #$Collection and we’re also told that it always returns a specialization of #$TransportationEvent.  In other words, whenever you apply it to an appropriate argument, #$TransportViaFn gives you back some collection of transportation events.

So, if I apply #$TransportViaFn to #$Automobile, the result there is an instance of #$Collection, and is a specialization of #$TransportationEvent.  As you can probably figure out by the names of the terms, the specific thing that this usage of the function would return is the collection of all transportation events in which an automobile is the transporter.
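In CycL, the two definitional assertions and what they tell us about the NAT are:

  (#$resultIsa  #$TransportViaFn #$Collection)
  (#$resultGenl #$TransportViaFn #$TransportationEvent)

  (#$isa   (#$TransportViaFn #$Automobile) #$Collection)
  (#$genls (#$TransportViaFn #$Automobile) #$TransportationEvent)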

,
Transitivity of #$genls

There are certain things that can be inferred from the #$resultGenl statement regarding particular functions.  One type of inference which you can draw has to do with what we call the “transitivity of #$genls,” which you’ve probably already read about in previous lessons.  Just to review, by saying that #$genls is transitive, all we mean is that, for example, if the collection #$Dog generalizes to the collection #$Mammal and the collection #$Mammal generalizes to the collection #$Animal, then we can infer that the collection #$Dog generalizes to the collection #$Animal.

That fact, the transitivity of genls, can be used in conjunction with, say, a #$resultGenl assertion, to infer something from the two given statements.  So, for example, since I know that the #$TransportViaFn #$resultGenls to #$TransportationEvent, and I know that #$TransportationEvent, itself, genls to #$Event, I can conclude what’s stated there at the bottom of the slide: if you apply #$TransportViaFn to #$Automobile, then that result will generalize to #$Event.
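Spelled out, the inference runs:

  (#$genls (#$TransportViaFn #$Automobile) #$TransportationEvent)
  (#$genls #$TransportationEvent #$Event)

therefore, by the transitivity of #$genls:

  (#$genls (#$TransportViaFn #$Automobile) #$Event)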

,
Transitivity of #$genls Example

  There’s something called the “transfer of #$isa through #$genls.”  Here’s an example.  If I know that Lassie is a dog, and that #$Dog generalizes to #$Mammal, I can infer that Lassie is a mammal.

This kind of reasoning can be used in conjunction with #$resultIsa.  For example, since I know that the result of #$GovernmentFn is always an instance of #$RegionalGovernment, and #$RegionalGovernment generalizes to #$Organization, I can infer that when you apply #$GovernmentFn to an argument, such as Sweden, that the result is an instance of #$Organization.
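Spelled out (using #$Lassie as an illustrative constant name for the dog), the transfer of #$isa through #$genls looks like:

  (#$isa #$Lassie #$Dog)
  (#$genls #$Dog #$Mammal)
  therefore:  (#$isa #$Lassie #$Mammal)

and, combining a #$resultIsa with a #$genls assertion:

  (#$resultIsa #$GovernmentFn #$RegionalGovernment)
  (#$genls #$RegionalGovernment #$Organization)
  therefore:  (#$isa (#$GovernmentFn #$Sweden) #$Organization)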

,
Examples of Function Use: #$BodyPartFn

  Let’s look at a few examples of the use of particular functions that you haven’t seen yet, and determine if the uses are well-formed or not.  We have the definition of a function called #$BodyPartFn.  Now let’s see what that means.  We’re told that it is a binary function, that its first argument must be an #$Animal, that its second argument must be an #$AnimalBodyPartType, and also that the second argument must be a #$UniqueAnatomicalBodyPartType.  Together, this tells us that the second argument must be a type of animal body part such that a given animal has only one such part.  So “leg” would not meet both of those conditions, because animals that have legs have more than one.  “Tail” or “head” would work, because animals that have tails or heads have only one, making them unique body part types.

We’re also told that #$BodyPartFn takes, as its second argument, something that generalizes to #$AnimalBodyPart.  It should be obvious already that it would have to do that.  Finally, we’re told that the result of #$BodyPartFn is always an instance of #$AnimalBodyPart.

So this function takes a unique animal body part type, in other words, a collection of animal body parts of a certain type, and it gives you back a specific animal body part.  In particular, what it gives you back is the specific body part, of that type, belonging to the animal serving as the first argument.

,
Examples of Function Use: #$BodyPartFn

  Let’s look at these examples and see how it all plays out.  In the first example, in the lower part of the slide, #$BodyPartFn is applied to #$SubaruCar and #$Fender.  Now, is that well-formed?  Well, arity is okay, it’s a binary function that’s applied to two arguments.  What about argument type?  The first argument is supposed to be an animal and #$SubaruCar is not an animal (in fact, it’s a collection of cars), so that violates the argument type constraints and it’s not semantically well-formed.  In other words, it doesn’t make sense to ask what that NAT denotes.  Because it is not semantically well-formed, it doesn't have a well-defined meaning at all; it’s essentially meaningless and should not ever be used.

Look at the second example NAT.  Arity, again, is fine.  The first argument is #$GoldenRetriever.  Let’s see.  The first argument of #$BodyPartFn is supposed to be an instance of #$Animal.  #$GoldenRetriever, as you could probably guess by the way Cyc constants are named, denotes a collection, not an individual.  It’s not the name of some specific golden retriever; it denotes the collection of all golden retrievers.  The collection of all golden retrievers is not an instance of #$Animal, so that violates the arg type constraint for the first argument place.  So that’s not well-formed, either.

,
Examples of Function Use: #$BodyPartFn

 Now we go down to the third example NAT.  Right away you can see that arity is violated because it only has one argument.  So that’s not good, either.

Look at the last example.  Now we have something that meets all of the specifications.  The function applies to two arguments, so arity is fine.  The first argument, #$Rover, as you can surmise from the name, is the name of a particular dog (not a collection), so #$Rover is an instance of #$Animal.  So that meets the constraints on the type for the first argument.  And #$Tail, as we’ve already seen, meets the argument constraints for the second argument.  So that NAT is the only one of the four that is well-formed.  That NAT, as a whole, denotes Rover’s tail.
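Side by side, the ill-formed first NAT and the well-formed last one:

  (#$BodyPartFn #$SubaruCar #$Fender)   -- violates the arg1 constraint: #$SubaruCar is not an #$Animal
  (#$BodyPartFn #$Rover #$Tail)         -- well-formed; denotes Rover’s tail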

,
Examples of Function Use: #$InstructionsFn-Making

Let’s take an example of a function that you haven’t seen yet.  This is called #$InstructionsFn-Making.  Now let’s study its definition.  It’s a unary function.  Its first argument is an existing object type.  Its first argument generalizes to an artifact.  So, together, those two statements tell us that the first argument to the function should be a type of artifact, or a collection of artifacts.  The result of the function is an #$ObjectType and the result generalizes to #$Instructions, so that tells us that what it returns is a type of instructions, or a collection of instructions.  To be a little more specific about it (more than what this tells you here), when you apply this function to a type of artifact, what it returns is the collection of instructions for making artifacts of that type.  So, for example, if I applied the function to #$Cabinet (which is the collection of all cabinets) what it should return is the collection of all instructions for making cabinets.

With that in mind, look at the examples at the bottom of the slide.

,
Examples of Function Use: #$InstructionsFn-Making

  In the first example NAT, you’ll see that the function is being applied to two arguments, but #$InstructionsFn-Making is a unary function, so arity is violated.  That is not syntactically well-formed, so we can reject that.

Look at the second NAT.  Here the function is applied to one argument, so arity is obeyed.  Now let’s see about the type of argument.  Think of #$RadioReceiver as the collection of all radio receivers.  That is a type of object, or an existing object type, so that’s fine.  Now, does #$RadioReceiver generalize to #$Artifact?  Yes, radios are artifacts -- they are artificial things, or man-made things.  So, #$RadioReceiver is a specialization of #$Artifact-Generic, and it meets the argument type constraints, so it is semantically as well as syntactically well-formed.  It is a legitimate NAT, and as a whole it denotes the collection of all instructions for making radio receivers.

,
Examples of Function Use: #$InstructionsFn-Making

  Look at the third NAT now.  The arity is okay.  What about argument type?  Is #$Mud an existing object type?  I don’t think it is.  And even more definitely, it’s certainly not a type of artifact because mud is not, essentially, a man-made thing.  Of course, a person can take some dirt and water and mix them together and make mud, so you can create mud; but as a collection, as a whole, mud is not a type of artifact, because most mud occurs naturally.  So that violates the argument type constraints, making it not semantically well-formed.

Finally, the last NAT there applies the function to #$SetOrCollection.  Arity is okay.  I’m not sure that #$SetOrCollection is an existing object type, but I am sure that #$SetOrCollection is not a spec, or a specialization of #$Artifact-Generic because sets or collections are abstract mathematical sorts of entities which are not man-made artifacts (like chairs or radio receivers).  So that violates arg type constraints as well, making the last example not semantically well-formed either.

,
Individuals or Collections?

  There is a key distinction among  functions that we have already addressed indirectly, but let me point it out specifically now.  Some functions always return individuals, as opposed to collections.  Like the #$MotherFn function, you apply it to any argument and you get back somebody’s mother, which is an individual.  The #$BorderBetweenFn function takes two geographical regions and, if they do border each other, it would give you back the border (the geographical line that separates them).  Again, that’s an individual, not a collection.

On the other hand, there are some functions, including some that we’ve already seen, that always return collections, not individuals, like #$TransportViaFn.  As we’ve seen, when you apply it to an appropriate argument, it always gives you back a collection of transportation events.  The #$GroupFn (I think we’ve seen that as well), when you apply that to any appropriate argument, it gives you back, not a particular group, but the collection of all groups of a certain type.  So, again, that always returns a collection.

,
#$IndividualDenotingFunction

  Now, the names for these types of functions are, first of all, #$IndividualDenotingFunction.  That is a sub-collection of #$Function-Denotational, and it is, in particular, the collection of all functions that always return an individual for any appropriate argument.  You can assert that fact in CycL with a #$resultIsa statement, where you relate the function to #$Individual.

When you define an individual-denoting function in the KB, you have to specify its #$resultIsa.  That’s just a condition we insist upon so that there is a way to tell what kind of thing a function returns.  But note that with an individual-denoting function, you must not have a #$resultGenl specification.  Why is that?  Well, if you think about it, #$resultGenl means that the result of the function is a specialization of some collection.  Now, you can only be a specialization of a collection if you are a collection.  An individual-denoting function always returns an individual, and an individual is never a specialization of anything, so it wouldn’t make sense for an individual-denoting function to have a #$resultGenl specification.

Here are two more examples of individual-denoting functions that we’ve already come across: #$GovernmentFn (it takes a geographical entity like a city or country, and it gives you back the government of that city or country, and that’s an individual) and #$BirthFn.  #$BirthFn is a new one to us, I think.  If you apply the #$BirthFn function to any animal, you get back the event, the particular event, in which that animal was born.  That event is an individual, not a collection, so, again, these are individual-denoting functions.

,
#$CollectionDenotingFunction

  Now, the other sort are called collection-denoting functions.  These are all functions such that the result is always a collection of things, not an individual.  Collection-denoting functions, like all functions, have to have a #$resultIsa specification.

And a collection-denoting function may also have a #$resultGenl specification.  Because of the fact that it always returns a collection, it does make sense to have a #$resultGenl assertion about it.  It’s not required.  It’s okay to define a function and put it in the KB without specifying the #$resultGenl, but if you can give one, it’s better.

A couple more examples of #$CollectionDenotingFunction would be #$ResidentsFn and #$TeacherFn.  #$ResidentsFn takes a geographical region and it gives you back the collection of all residents of that region.  So that’s obviously a collection-denoting function.

#$TeacherFn takes an academic subject and it returns the collection of all teachers who teach that academic subject.  So, again, that’s a collection-denoting function.

,
"Individual vs."

  Sort of summing up what we’ve been talking about, as you can see, #$Function-Denotational is divided into individual-denoting functions and collection-denoting functions.  Those are disjoint from each other because obviously nothing can be both.  If you’re an individual-denoting function, you certainly can’t be a collection-denoting function, and vice-versa.  Below those are just some of the examples we’ve just seen.

,
Summary

This concludes the lesson that explores more functions in CycL.

Errors in Representing Knowledge

pdf | zip | Errors with Constants, Variables and Reliance on NL
Errors in Representing Knowledge

The “Errors in Representing Knowledge” tutorial covers a whole host of errors people have been known to make when representing knowledge in Cyc.  In this first lesson, we’ll focus on errors related to choice of vocabulary and naming of variables and constants.

,
Letting NL dictate your representation

  One of the errors people commonly make is assuming that, because natural language (NL) uses the same word to mean several different things, those things can safely be represented with the same Cyc vocabulary.  This will cause problems because, in order to write rules that make correct conclusions, we really need to tease apart the ambiguities present in NL.  The slide shows some of the possible meanings of “has.”  There are several states of affairs that use the English word “has” to state a relationship.   Given each of these, there are different conclusions we’d like to be able to draw about the objects involved in the relationship – for example, that John loves his son, or that John can afford to buy a sandwich.  If we used the same Cyc predicate, “#$has”, to represent these four very different relationships, it would be very difficult to express the rules we’d need to draw those conclusions.

,
At least 4 different senses

If we instead use more precise relations such as those on the slide, we more accurately represent what the actual relationship is and make it much easier to write rules that will draw correct conclusions.

,
Relying on Constant Names

  Another problem many knowledge enterers have is assuming that the name of a term that they have encountered for the first time gives them the full story on the intended meaning of the term.  Remember, the term names are basically variables; Cyc cannot guess at the meaning of a term by looking at its name.  The meaning of a given term is derived from the assertions with which that term is related.  If you haven’t examined those assertions recently, you should probably take a look to make sure you have the correct term before you make assertions using it.

,
Using Vague Constant Names

Using vague names sets the stage for future confusion.  Even though you shouldn't rely on constant names to tell you the meaning of a term, it’s still a good idea to be as precise as possible when choosing term names as a courtesy to other Cyclists.  When giving a precise name, we often use a convention in which the basic name is given first, and a clarifying word or phrase is appended after a hyphen.

,
Problems with Variables: Meaningless Variable Names

Single letter variables, especially non-mnemonic ones like ?X, make rules hard to read.  Do yourself and your fellow Cyclists a favor, and use variable names that give some indication of what they stand for.  Avoid naming a variable exactly the same as a Cyc constant, because that can be confusing, too.  Notice how hard it is to find the problem in the rule at the bottom of the slide.

,
Problems with Variables: Meaningless Variable Names

 If mnemonic variable names are used, the error is much easier to spot.

,
Problems with Variables: Check them carefully!

Be careful with your variables and make sure you always spell them the same way.  Can you identify the error in the rule on the slide?   The variable ?CIRCLE is introduced in the last line for the first time, so it could mean anything.  Remember (from the lesson on CycL Syntax) that variables that aren’t explicitly quantified are assumed to be universally quantified.  What does the rule on the slide mean?  The rule says that if you have something called ?CIRC and it is a circle with a radius, and you compute ?AREA as Pi times the radius squared, then EVERYTHING IN THE UNIVERSE has ?AREA as its area.  If the variable ?CIRCLE had been correctly spelled as ?CIRC, then the rule would have expressed the intended meaning -- if you have something called ?CIRC and it is a circle with a radius, and you compute ?AREA as Pi times the radius squared, then ?CIRC has ?AREA as its area.
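Here is a sketch of the corrected rule.  Since the slide isn’t reproduced here, the predicate and function names #$radiusOfCircle, #$areaOfObject, #$ExponentFn, and #$Pi are hypothetical stand-ins; the point is only that the consequent reuses the antecedent’s variable ?CIRC rather than introducing a fresh ?CIRCLE:

  (#$implies
    (#$and
      (#$isa ?CIRC #$Circle)
      (#$radiusOfCircle ?CIRC ?RADIUS)
      (#$evaluate ?AREA
        (#$TimesFn #$Pi (#$ExponentFn ?RADIUS 2))))
    (#$areaOfObject ?CIRC ?AREA))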

,
Problems with Variables: Typing

  One trap that is easy to slip into is forgetting to properly constrain mnemonically-named variables.  #$inCont-Open is a relation that means the first argument is contained within the second argument, where the container is an open container.  This would hold between coffee and a coffee cup, or a puppy and an open box.  Therefore, when using this relation, one cannot assume that Cyc knows what type of thing is contained, nor what is containing the thing.  (#$in-Floating ?OBJ ?FLUID) means that ?OBJ is surrounded by and buoyed by the liquid, ?FLUID.  Again, it is up to you to specify what is floating and in what kind of fluid.  Can you think of counterexamples to the rule on the slide, as it is written?  How about a box with two puppies in it?  Because the variables are not further defined, ?CANYON could refer to a box (or anything else, for that matter) and ?RIVER could refer to two puppies.  In order for the rule to be correct, we’d need to add the three literals that are listed at the bottom of the slide to the antecedent.

,
Problems with Variables: Typing

  Note that if a variable has to refer to a constant of a certain type due to argument constraints on the predicates it’s used with, it’s best NOT to state those constraints explicitly, as that would produce a rule with extraneous left-hand-side constraints (constraints in the antecedent) that aren’t strictly needed for rule correctness.  Here, the predicates #$coworkers and #$acquaintedWith require instances of #$Person in both their arguments, so we can drop the (#$isa ?PERS #$Person) literal from the left side.  One happy consequence of this is that the rule is now general enough to act like #$genlPreds.  Thus, it operates in such a way that matches a pattern of an already existing HL module and therefore already has efficient support without doing any additional work.
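A reconstruction of the two versions of the rule (our sketch of what the slide shows, with ?PERS1 and ?PERS2 as the variable names):

  ;; Overspecialized: the #$isa literals restate what the argument
  ;; constraints on #$coworkers and #$acquaintedWith already guarantee.
  (#$implies
    (#$and
      (#$isa ?PERS1 #$Person)
      (#$isa ?PERS2 #$Person)
      (#$coworkers ?PERS1 ?PERS2))
    (#$acquaintedWith ?PERS1 ?PERS2))

  ;; Preferred: drop the redundant literals.
  (#$implies
    (#$coworkers ?PERS1 ?PERS2)
    (#$acquaintedWith ?PERS1 ?PERS2))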

,
Summary

This concludes the lesson on common errors in using constants and variables, and reliance on natural language.

pdf | zip | Errors with Specialization, Generalization & Rules
Errors in Representing Knowledge

In the first lesson of the “Errors in Representing Knowledge” tutorial, we covered errors related to choice of vocabulary and naming of variables and constants.  Now we are going to concentrate on over-generalization, overspecialization, and some errors with formulating rules.

,
Over-generalization

  When stating knowledge in Cyc, we aim to state rules that are as general as possible, without overgeneralizing.   If a rule is too general, it will result in some false conclusions.  If it is too specific, it fails to draw some conclusions that would be correct and desirable to draw.   Now, the world is a messy place, and so it may not be possible in some cases to write a single rule that captures the whole point at the proper level of generality, but that is the ideal situation.

Refer to the slide for an example of an overly general rule.  Can you think of counterexamples?   Sure:  Clams and starfish and plants (and a bunch of other living things!) don’t have heads. How would we fix this?  There are many ways to fix this problem; let’s consider one approach.  There are two main groups of critters that all have heads: vertebrates and insects.  Thus we could simply state two separate, more specific rules for these two groups.

,
Over-generalization

  Refer to the slide for another overly general rule example.  Would this rule apply to mute people or infants?  This rule is fine as a default rule – almost every person does speak some language.  Thus, we can state this rule as a default and then represent exceptions, using #$exceptWhen, for infants and people who are mute, in a coma, etc.
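A hedged sketch of the default-plus-exception pattern follows; the constant and predicate names here, such as #$HumanInfant and #$languageSpoken, are illustrative and may not match the exact KB vocabulary:

  (#$implies
     (#$isa ?PERS #$Person)
     (#$thereExists ?LANG
        (#$and
           (#$isa ?LANG #$NaturalLanguage)
           (#$languageSpoken ?PERS ?LANG))))

  (#$exceptWhen
     (#$isa ?PERS #$HumanInfant)
     <the default rule above>)

The default rule draws the desired conclusion for almost everyone, while the #$exceptWhen assertion blocks it for the stated exceptions.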

,
Over-generalization

  Refer to the slide for yet  another overly general rule example.  Homeless people and primitive cultures don’t live in buildings.  Thus, this rule is a good default only if we assume a modern culture.  We could therefore state this rule in a microtheory that makes that assumption and then encodes exceptions for the homeless.

,
Over-specialization

  Here’s an overly specific rule.  What kinds of related conclusions will it fail to draw?

,
Over-specialization

It would be a stronger, better rule if we wrote it as “Every organism is younger than its parent(s).”  One can even imagine a further generalization that would cover instances of #$PartiallyTangible, along the lines of “every partially tangible thing has a starting date later than the starting date of any actor in its creation.”
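One way the generalized rule might be sketched in CycL (the predicate names #$biologicalParents and #$youngerThan are illustrative here):

  (#$implies
     (#$and
        (#$isa ?ORG #$Organism-Whole)
        (#$biologicalParents ?ORG ?PARENT))
     (#$youngerThan ?ORG ?PARENT))

Stated at this level, the rule applies to every organism, not just to whatever narrower class the overspecialized version mentioned.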

,
Over-specialization: Unnecessary Constraints

  Another way rules can end up overspecialized is with the inclusion of constraints that aren’t necessary for the conclusion to be true.  In the rule on the slide, it doesn’t matter that the mast is part of something that is a sailboat – all masts are rigid, whether or not they are part of anything.  Having these unnecessary literals keeps the rule from firing in all the circumstances in which it should.  Remove them, and we have a more powerful rule that can work with partial information.

,
Non-modular Rules

  Another error made when representing rules is to make them non-modular.  The example on this slide represents a perfectly good rule.  It states that for every sailboat, there exists a thing, which is a mast, and the mast is part of the boat.  But, what if we also wanted to talk about the hull that is part of each sailboat?

,
Non-modular Rules

  The example given on the slide is not the way to accomplish this.  Nested existentials are to be avoided whenever possible, especially when they are not strictly needed to express the state of the world.  In this case, the existence of the hull doesn’t depend on the existence of the mast, so …

,
Non-modular Rules

 … we should split up the rule into two independent rules.
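In sketch form, the two modular rules might look like this (constant names such as #$Sailboat, #$Mast-SupportingStructure, and #$Hull are illustrative):

  (#$implies
     (#$isa ?BOAT #$Sailboat)
     (#$thereExists ?MAST
        (#$and
           (#$isa ?MAST #$Mast-SupportingStructure)
           (#$physicalParts ?BOAT ?MAST))))

  (#$implies
     (#$isa ?BOAT #$Sailboat)
     (#$thereExists ?HULL
        (#$and
           (#$isa ?HULL #$Hull)
           (#$physicalParts ?BOAT ?HULL))))

Because neither existential is nested inside the other, each rule can fire independently of the other.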

,
Non-modular Rules

  Refer to the slide for another  case of a non-modular rule.  Do you see the problem?

When writing rules that include existentials, it’s important to include only the minimum set of constraints within the existential quantifier;  otherwise, we end up with a rule which overly specifies the existential, thus reducing the applicability of the rule to a smaller set of situations.

,
Non-modular Rules

  It’s better to split the rule into two.

,
Summary

This concludes the lesson on Errors with Specialization and Generalization.

pdf | zip | Other Errors
Errors in Representing Knowledge

We will now discuss various other errors in knowledge representation.

,
#$equiv usually isn’t

  #$equiv is Cyc’s logical connective for bidirectional implication.  This connective exists mainly for completeness and because the canonicalizer uses it.  (The raison d’être of the canonicalizer is to soundly translate equivalent epistemological-level formulae into a single heuristic-level construct, which avoids redundantly adding one formula that is simply a rephrasing of another.)  It’s best to avoid using this connective during regular KR work, because it’s easy to make errors with it.

Refer to the rule given on the slide.  Do you see the problem with this rule?  It’s certainly true that all of one’s parents’ brothers are one’s uncles, but not all of one’s uncles are the brother of some parent.  Some uncles might have married one of the sisters of a parent.
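Since the implication holds in only one direction, a sketch of the corrected rule would use #$implies rather than #$equiv (the predicate names here, such as #$brothers and #$uncleOf, are illustrative):

  (#$implies
     (#$and
        (#$parents ?KID ?PARENT)
        (#$brothers ?PARENT ?UNCLE))
     (#$uncleOf ?KID ?UNCLE))

Every brother of a parent is an uncle, but the converse fails for uncles by marriage, so the biconditional version would license false conclusions.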

,
Definable Collections

  Another common knowledge representation problem is the over-definition of collections.  If you are considering introducing a new collection for which there are only a few things you want to say, and the collection can already be defined in terms of other, already established vocabulary, it’s best not to introduce it and instead just refer to the concept by writing the expanded formula.  However, if there are on the order of 10 or more assertions you’d like to make that use that concept, go ahead and introduce the collection even if it’s already definable.  For these reasons, it probably would not make sense to introduce the collection #$WhiteCat; however, there is probably sufficient justification for introducing the collection #$BlackCat.

,
Tedious Hand-Reification

  Certain types of knowledge may be characterized by the grouping together of a large number of objects, about which the same kinds of things can be stated.  In cases where these properties can be functionally determined, it’s a good idea to take advantage of the regularity and avoid having to create and populate each term by hand, because doing so is an error-prone process.

The metric units of measure are one such knowledge area.  As an example, what needs to be stated about Kilogram?  It’s a unit of measure.  It measures Mass, but that could be determined by the fact that Kilogram is derived from Gram, and Grams measure Mass.  A Kilogram is 1000 times larger than a Gram, but that can be determined by the fact that it uses the “kilo” prefix.

,
Avoiding Tedious Hand-Reification

We can avoid having to create all the metric units of measure by hand by creating a few basic units of measure such as #$Gram, #$Meter, and #$Hertz, and introducing a class of functions: #$MetricUnitPrefix.  Functions of this class are unary, so all instances take one argument.  An example is shown at the bottom of the slide: the function #$Kilo.  When applied to a single argument that is a #$UnitOfMeasureNoPrefix, such as #$Meter, it produces an instance of #$UnitOfMeasureWithPrefix.  This example takes one meter and makes one thousand of them.  The 1 in the formula comes from the 1 in front of “km” in natural language.
 

,
Avoiding Tedious Hand-Reification

  By functionally generating these new units of measure, we can write rules that completely and correctly enforce the proper constraints.  The first rule on the slide states that #$Kilo applied to any unit of measure will generate a new one that is 1000 times greater than the base unit.

The second rule says that any metric prefix applied to a base unit will yield a unit of measure that measures the same quantity as the base unit measures.   From these two rules, we get our definitional assertions about (#$Kilo #$Meter).   [Note for the interested: #$natArgument and #$natFunction are #$EvaluatableFunctions that can be used in the left hand side of rules to extract items from a NAT expression.  You can’t make assertions using #$EvaluatableFunctions, but supporting code exists that will prove these statements true: (#$natFunction (#$Kilo #$Meter) #$Kilo) and (#$natArgument (#$Kilo #$Meter) 1 #$Meter).]

,
Gratuitous High-arity Predicates: Lumping independent properties together

  Another common knowledge representation error is to create high arity predicates.  People with a relational database background are prone to making this sort of error.   The major problem with having a high-arity predicate that collects all of the details of a situation together is that it makes it difficult to state or reason if you only have partial information.  For instance, the inventor and time of invention are really independent properties of an artifact and therefore should be represented as separate concepts.

,
Gratuitous High-arity Predicates: Hiding concepts inside assertions

  Refer to the slide for another example of a gratuitous higher-arity predicate.  In this case, the designer was hiding information about position-played-on-the-team in the order of the arguments.  Since each position is a rich object, about which many other things could be stated, and because we’ll want to be able to say that Aikman was quarterback even if we don’t know who all the other players were, it’s better to reify the positions and state each position held independently.

,
Summary

 This concludes the final lesson on Errors in Representing Knowledge.

Survey of Knowledge Base Content

pdf | zip | Introduction
Survey of Knowledge Base Content

  This set of lessons (tutorial) is intended to give a broad overview of the content of the Knowledge Base.  In an effort to expose you to as much of the knowledge base as possible, we have included many examples to illustrate the topics we address.  Some of the slides in this tutorial are simple and self-explanatory enough that we give no explanation; others have many examples of a common type, only some of which are explained, with the expectation that the reader can fill in the explanation for the rest.

,
The Form and Content Of The Knowledge Base

  There are many methods for representing knowledge, including written documents, text files, databases, etc.  The advantage that Cyc has over these methods is the language in which its knowledge is written, CycL.  In CycL, the meanings of statements and the inferential connections between statements are encoded in a way that is accessible to a machine.  At present, natural languages are virtually meaningless to machines.  I can say “All animals have spinal cords.  All dogs are animals.  My pet is a dog.”  From these sentences, a person can infer that my pet has a spinal cord, but a machine cannot, at least not until machines can understand English sentences.

In the formal language Cyc uses, inference is reduced to a matter of symbol manipulation, and thus something that a machine can do.  When an argument is written in CycL, its meaning is encoded in the shape, or symbolic structure, of the assertion it contains.  Determining whether or not an argument is valid can be achieved by checking for certain simple physical patterns in the CycL sentence representing its premises and conclusions.

,
Arrangement, by Generality

  The Knowledge Base (KB) itself comprises a massive taxonomy of concepts and specifically-defined relationships that describe how those concepts are related.

This figure represents the context of the knowledge arranged by degrees of generality, with a small layer of abstract generalizations at the top and a large layer of real-world facts at the bottom.

,
Arrangement, by Generality

  The Upper Ontology doesn’t say much about the world at all.  It represents very general relations between very general concepts.  For example, it contains the assertions to the effect that every event is a temporal thing, every temporal thing is an individual, and every individual is a thing.  “Thing” is Cyc’s most general concept.  Everything whatsoever is an instance of “thing.”

,
Arrangement, by Generality

  The KB contains several core theories that represent general facts about space, time, and causality.  These are the theories that are essential to almost all common-sense reasoning.

,
Arrangement, by Generality

Domain-Specific Theories are more specific than core theories.  These theories apply to special areas of interest like military movement, the propagation of diseases, finance, chemistry, etc.  These are the theories that make Cyc particularly useful, but are not necessary for common sense reasoning.

,
Arrangement, by Generality

  The final layer contains what are sometimes called “ground-level facts.”  These are statements about particular individuals in the world.  For example, “John has anthrax” is a specific statement about one person.  Generalizations would not go here; they would go in a layer above.  Anything you can imagine as a headline in a newspaper would probably go here.

,
Summary

This concludes the introduction to the tutorial that surveys the contents of the Knowledge Base.

pdf | zip | Fundamental Expression Types
Survey of Knowledge Base Content

  In this lesson we will focus on the fundamental expression types of CycL.  Generally speaking, fundamental expression types are the building blocks of language.  They are the things that you probably think of when you think of grammar.  In English we have nouns, adjectives, verbs, etc.  Here, we’re going to focus on the fundamental expression types in CycL, such as constants, functions, terms, predicates, quantifiers, etc.

,
Constants

  Constants can denote individuals, collections, or collections of collections.  #$GeorgeWBush, #$Sudan, and #$0-TheDigit are all constants that denote a specific individual.  #$Sudan denotes Sudan, the country in Africa.  #$0-TheDigit denotes zero, an individual that is a specific abstract object.

#$WorldLeader and #$Country are constants that denote collections.  #$WorldLeader, for example, denotes the collection of all world leaders.  #$Country does not denote a specific country, but rather the collection of all countries.

Collections of collections can be more confusing: their members are themselves collections.  For example, the members of the collection #$AutomobileTypeByBrand are all of the kinds of vehicles characterized by being of a certain brand.  The members are not the vehicles themselves, but kinds of vehicles.  So collections of collections don’t denote physical clusters or groups; instead they denote an abstract group whose members are distinguished by the value of a shared attribute.

,
Functions

  Functions take arguments and return results.  Consider the following examples.

#$PresidentFn takes a country as its argument and returns that country’s president as its result.  So, (#$PresidentFn #$Mexico) takes #$Mexico as its argument and returns Vicente Fox as its result.  Another way of saying this is that (#$PresidentFn #$Mexico) denotes Vicente Fox.

The function #$MotherFn does the obvious; it denotes an animal’s biological mother.  We can build compound functional expressions by putting two functions together, as in (#$MotherFn (#$PresidentFn #$UnitedStates)).  When this lesson was written, this expression denoted Barbara Bush, who was the mother of the then-current president of the United States.

The function #$GroupFn denotes not an individual, but a collection.  Thus, in this case, (#$GroupFn #$Person) denotes the very large collection of all groups (or collections) of people.  Note that members of these groups can overlap, thus the groups include Americans, smokers, plumbers, athletes, members of the 1992 Boston Red Sox, etc.

,
Terms Used to Relate: #$isa and #$genls

  #$isa is the most basic term in CycL.  This term is used to say that something is an instance (a member) of a collection.  Everything belongs to at least one collection.

(#$isa #$GeorgeWBush #$WorldLeader) says that George Bush is a world leader.  It also says that George Bush is an individual in the collection of world leaders.  (#$isa #$Cat #$OrganismClassificationType) is an example involving a collection of collections.  #$Cat does not refer to a specific cat, but to the collection of all cats.  This expression says that #$Cat is a member of the collection #$OrganismClassificationType.

The #$genls term is used to say that one collection is a sub-collection of another.  Thus, if instead of using #$isa above, we had said (#$genls #$Cat #$OrganismClassificationType) we would have been saying that every individual in the collection #$Cat is also in the collection #$OrganismClassificationType.  This is false, of course, since Tigger The Cat is not a type of organism classification.  The expression (#$genls #$Cat #$Carnivore) says that every cat is a carnivore.  Thus, if Tigger is an individual in the collection of all Cats, Tigger is also in the collection of all Carnivores.

,
Terms Used to Relate: #$typeGenls

  The relation #$typeGenls is difficult to understand.  To gain a better understanding of this term, let me translate the first example on the slide.  It says “every collection that is an instance of #$OrganismClassificationType has #$BiologicalLivingObject as a genls.”  In other words, every member of the collection of collections #$OrganismClassificationType is a subcollection of #$BiologicalLivingObject.  Because this concept is difficult to understand, let me state it one more time in different words: if we take the members of #$BiologicalLivingObject and group them into smaller collections such as #$Cat, #$Dog, etc., those collections are the kind of thing found as members of the collection of collections called #$OrganismClassificationType.

This relation is distinct from the relation in the previous example of #$Cat being a sub-collection of Carnivore.  In that case, Carnivore is a collection of individuals (with names like Tigger and Rover), and we grouped some of those individuals into a sub-collection that we called #$Cat.  In the current example, #$Tigger and #$Rover would be explicit members of #$BiologicalLivingObject, but not explicit members of #$OrganismClassificationType (whose members would have names like #$Cat and #$Dog).  However, because we know that Cat is an Organism Classification Type and members of collections of Organism Classification Types can be generalized from the members of Biological Living Objects, we can infer that any member of the collection Cat is also in the collection of all Biological Living Objects.

,
Terms Used to Relate: #$disjointWith

The term #$disjointWith is rather simple.  It means that nothing exists that is a member of both collections to which it’s referring.  Thus, the example on the slide says that nothing is both a fish and a mammal, or rather, no individual exists that is a member of the collection #$Fish and also a member of the collection #$Mammal.

,
Other Terms Used to Relate

There are many more terms used to denote relationships in CycL.  The relationships that the terms on this slide denote are obvious in how they are named.  Consider the first example.  #$biologicalRelatives relates two terms, in this case Jerry Lee Lewis and Jimmy Swaggart.  This example asserts that these two are biological relatives (as opposed to being related legally or in some other manner).

Notice that the relationships on the slide are between individuals, not collections of individuals.  The vast majority of relational terms in CycL are used to relate one individual to another individual.

,
Connecting Relational Terms: #$genlPreds and #$genlInverse

 CycL has terms that relate one relational term to another.  The two main terms for accomplishing this are #$genlPreds and #$genlInverse.

We have already discussed these two terms in previous lessons, but let’s review them again here.  #$genlPreds is used to say that if something is true of the first term, then it will be true of the second term as well.  Refer to the examples on the slide.  The first example says that if a thing x is a geographical subregion of thing y, then thing x is also a physical part of thing y.  The second example says that if something is a physical part of thing y, it also temporally intersects (or exists at the same time as) thing y.  The last example says that if x is a father to y, then x is also a biological relative to y.

#$genlInverse is just like #$genlPreds, but you flip the terms.  Consider the first example under #$genlInverse on the slide.  This example says that if x is an event that causes y, then y starts after the start, or beginning, of x.  The second example says that if x is a father to y, then y is a biological relative to x.  Compare the meaning of this sentence to the meaning of the last example sentence for #$genlPreds.  Notice that the x and y changed places in the second half of the sentence.

,
Predicates for Well-formedness: #$arity and #$argxIsa

CycL has predicates that are used to describe syntactic and semantic conditions for writing well-formed sentences.  These are #$arity and #$argxIsa.

#$arity denotes the number of arguments that a predicate must have -- the syntactic constraint.  Consider the example on the slide, (#$arity #$biologicalMother 2).  It says that any assertion using the predicate #$biologicalMother must include exactly two arguments (presumably the name of the mother and the name of that which is mothered).

The #$argxIsa predicate imposes semantic constraints.  It constrains the meanings of the terms that are legal arguments for that predicate. The first #$argxIsa example on the slide, (#$arg1Isa #$biologicalMother #$Animal), says that the first argument of a sentence that uses the predicate #$biologicalMother must be an animal.  Thus, the first argument cannot be a door or a car, etc.  The second example, (#$arg2Isa #$biologicalMother #$FemaleAnimal), says that the second argument in a sentence using #$biologicalMother must be a female animal.  Thus the sentence (#$biologicalMother #$Jim #$John) would be well-formed as far as the first argument is concerned, since Jim is an animal, but not as far as the second argument is concerned, since John is presumably not a female animal.
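Putting the slide’s constraints together, the declarations for #$biologicalMother would look something like the following sketch (the first line, using #$BinaryPredicate, is an illustrative addition, not taken from the slide):

  (#$isa #$biologicalMother #$BinaryPredicate)
  (#$arity #$biologicalMother 2)
  (#$arg1Isa #$biologicalMother #$Animal)
  (#$arg2Isa #$biologicalMother #$FemaleAnimal)

Given these, (#$biologicalMother #$Jim #$John) satisfies the arity and #$arg1Isa constraints but fails the #$arg2Isa constraint.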

,
Logical Connectives: #$or

  Let’s now review some CycL logical connectives that we’ve seen in previous lessons.

The logical connective #$or relates two or more assertions in such a way that an expression using this connective is true if any (one or more) of the assertions is true.  Refer to the example on the slide.  Let’s say that we want to hire Chris for an entry-level programming position.  If Chris meets any of the following qualifications, we’ll extend him an offer:
• is a college graduate
• is a computer programmer
• is a genius
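In CycL, that disjunction might be sketched as follows (the constant names, such as #$Chris and #$CollegeGraduate, are illustrative):

  (#$or
     (#$isa #$Chris #$CollegeGraduate)
     (#$isa #$Chris #$ComputerProgrammer)
     (#$isa #$Chris #$Genius))

The whole expression is true as long as at least one of the three disjuncts is true.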

,
Logical Connectives: #$and and #$not

The #$not and the #$and connectives in the example on this slide are used to say that it is not the case that the following assertions are both true.  Thus it is not true that ChrisX is both a male person and a female person.
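The slide’s sentence might be sketched as follows (#$ChrisX as in the text; the collection names #$MalePerson and #$FemalePerson follow the wording above):

  (#$not
     (#$and
        (#$isa #$ChrisX #$MalePerson)
        (#$isa #$ChrisX #$FemalePerson)))

The #$and builds the conjunction, and the outer #$not denies that the conjunction holds.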

,
Logical Connectives: #$implies

  The connective #$implies is used to say that if the first assertion is true, then the second assertion is true as well.  Notice that the second assertion in the example on the slide is preceded by the #$not connective, so what is implied is the negation of that assertion.  Therefore, this example says that if Chris is a male person, then Chris is not a female person.
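The implication just described might be sketched as (constant names as in the previous example; illustrative):

  (#$implies
     (#$isa #$ChrisX #$MalePerson)
     (#$not (#$isa #$ChrisX #$FemalePerson)))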

,
Quantifiers: #$forAll

Now let’s review some CycL quantifiers.

#$forAll is the universal quantifier.  When used with variables, this term allows you to say things like the example on the slide: For any ?COUNTRY and any ?PERSON, if that ?COUNTRY is a #$Superpower, then the ?PERSON who is the head of its government has the status of #$WorldLeader.
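A sketch of that universally quantified rule (the predicate names #$headOfGovernment and #$hasStatus are illustrative, not necessarily the slide’s):

  (#$forAll ?COUNTRY
     (#$forAll ?PERSON
        (#$implies
           (#$and
              (#$isa ?COUNTRY #$Superpower)
              (#$headOfGovernment ?COUNTRY ?PERSON))
           (#$hasStatus ?PERSON #$WorldLeader))))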

,
Quantifiers: #$thereExists

  #$thereExists is the existential quantifier.  When used with variables, this term allows you to say that something exists.  The example on this slide says that every vertebrate has a tongue as an anatomical part.  To be more literal in the translation, it says that for every animal, it follows from that animal being a vertebrate that it has a part that is a tongue and that part is one of its anatomical parts.
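A sketch of the vertebrate/tongue rule (constant names such as #$Vertebrate and #$Tongue are illustrative):

  (#$implies
     (#$isa ?ANIMAL #$Vertebrate)
     (#$thereExists ?TONGUE
        (#$and
           (#$isa ?TONGUE #$Tongue)
           (#$anatomicalParts ?ANIMAL ?TONGUE))))

The existential quantifier asserts that such a tongue exists for each vertebrate, without naming any particular tongue.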

,
Summary

 This concludes the lesson on Fundamental Expression Types in CycL.

pdf | zip | Top Level Collections
Survey of Knowledge Base Content

The purpose of this lesson is to expose you to some of the top level collections in Cyc and give you an idea of how they relate to each other.

,
Some Top Level Collections

  This slide shows the organization of some of the most fundamental collections in Cyc.  For example, the collection #$Intangible is related to #$Thing with a solid green line, meaning that #$Intangible generalizes to (or is a sub-collection of) #$Thing.  #$Thing is the supreme collection.  Everything in the universe (even intangible things) is an instance of #$Thing.  Consequently, every collection is a specialization of #$Thing.

#$Intangible is the collection of all things that cannot be touched.  #$TemporalThing is the collection of things that exist in time.  An integer would not be in this collection because integers do not exist at a specific time -- they are abstract, or timeless.  Instances of #$SpatialThing-Localized are simply those things that have a location in space.

#$ExistingStuffType and #$ExistingObjectType correspond to the common-sense notions of “stuff” and “object” respectively.  Water is stuff, but a lake is an object.  The collection #$Water is stuff-like in that each portion of an instance of #$Water is also an instance of #$Water.  #$Lake is object-like in that typically any proper part of a lake is not itself a lake.

,
#$Dog (the collection of all dogs)

Let’s look at a specific lower-level collection in detail.  The collection #$Dog is asserted in Cyc to be an instance of the following collections: #$OrganismClassificationType, #$BiologicalTaxon, #$BiologicalSpecies, and #$DomesticatedAnimalType.

#$Dog is asserted to be a specialization of the collection #$CanineAnimal for the obvious reason that every instance of #$Dog is also an instance of #$CanineAnimal.

,
45 Collections of which #$Dog is a Specialization

  “Genls” is a transitive relation, so any genls of a genls of #$Dog is itself a genls of #$Dog.  For example, #$CanineAnimal is a genls of #$Dog and #$Carnivore is a genls of #$CanineAnimal.  Consequently, #$Carnivore is a genls of #$Dog.  By the same reasoning, every genls of #$Carnivore is also a genls of #$Dog.  This slide shows all collections that are genls of #$Dog.

To take a few examples, Cyc knows that the members of #$Dog are also: agents, air-breathing vertebrates, heterotrophs (they require organic nutrients in order to survive), hexalateral objects (meaning that it is appropriate to refer to a front side, a back side, a top, a bottom, a left side, and a right side -- unlike a tennis ball which has no sides), perceptual agents (they can perceive things), and spatially localized things (they have a location in physical space), etc.

,
11 Collections of which #$Dog is an Instance

  In contrast to the previous list of 45 collections of which #$Dog is a spec, this is a list of the collections of which #$Dog is an instance.  Earlier we listed the four collections of which #$Dog is asserted to be a member.  From this, Cyc knows that #$Dog is actually an instance of eleven collections, because those first four collections are themselves, collectively, specializations of seven other collections.

#$Dog is an instance of #$ExistingObjectType, meaning that any part of a dog is not a dog on its own (a dog’s leg is not a dog).

#$Dog is an instance of #$TemporalStuffType, meaning that anything that is a dog at one time will always be a dog (this is not true of a teacher, for example, who can be a teacher one year and an attorney the next year).

#$Dog is an instance of #$OrganismClassificationType which means, roughly, that this collection is used in the scientific classification of organisms.

,
Summary

 This concludes the lesson on top level collections in Cyc.

pdf | zip | Time and Dates
Survey of Knowledge Base Content

Fundamental to all discussions of causality and reasoning is knowing what happened before what.  Therefore, time is crucial to all reasoning.  This lesson will cover the basics of representing time and dates in Cyc.

,
Functions Which Return Time Intervals

What the functions on this slide denote seems obvious.  For example, (#$YearFn 2000) denotes the year 2000.

,
Functions Which Return Time Intervals: Composite Expressions

  You can combine the functions to form a composite functional expression like the example on the slide.  This expression denotes the last second of the year 2000.  A more literal translation would be “the fifty-ninth second of the fifty-ninth minute of the twenty-third hour of the thirty-first day of the month of December of the year 2000.”
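In full, that composite expression might be sketched as follows (the exact nesting and argument order shown here are illustrative):

  (#$SecondFn 59
     (#$MinuteFn 59
        (#$HourFn 23
           (#$DayFn 31
              (#$MonthFn #$December
                 (#$YearFn 2000))))))

Each function takes the interval denoted by its inner expression and picks out the indicated sub-interval of it.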

,
Time As A Quantity

  There are also functions that denote intervals of time, like years, which have a place on the timeline.

The example on the slide says that the duration of the year 2000 is a one-year duration.  If instead the example said (#$duration (#$YearFn 2000) (#$YearsDuration 3)), it would be a well-formed assertion, but a false one: it would claim that the duration of the year 2000 is three years.

These quantities are simply abstract objects that measure the length of an interval of time.

,
Relations Between Temporal Things

Most reasoning is more concerned with duration, or chunks of time, than specific points in time.  Therefore, Cyc has many predicates for describing relationships between chunks of time.

The predicates listed on the slide are only a sample of what is available in Cyc.  Most mean just what they say, so #$startsAfterStartingOf could be applied to this hour and last hour to say that this hour starts after the starting of last hour.

The last four predicates in the column on the right are the least commonly used because they are restricted to relating points in time.

,
Relations of Types of Intervals

  Now we will consider a couple of the relationships that are so important for reasoning.

The first collection listed on the slide, #$TemporalStuffType, is a collection of events such that any proper time slice of any one of its members (events) is itself a member (event) in that collection.  In this sense, the #$TemporalStuffType collection is similar to the #$ExistingStuffType collection; however, #$TemporalStuffType applies to proper time slices of events as opposed to portions of objects.  For an example of #$TemporalStuffType, think of walking.  Disregarding issues of granularity, a time slice of walking would represent what happens at any point in the walking event.

#$TemporalObjectType is a collection of events such that any proper time slice of any one of those events is not itself in that collection.  This corresponds to #$ExistingObjectType. For example, consider a marathon run.  Any proper time slice of the run would be a shorter run, and therefore would not be a marathon run.  Another example would be the event of making a cake.  Any proper time slice of the process would be beating eggs or stirring or preheating the oven, and would not represent everything that happens in making a cake.

,
Relations of Types of Intervals

There are also predicates which allow us to make assertions about the inter-relatedness of time intervals.  The predicates listed on the slide allow us to make assertions like the following:

Every February 29th is subsumed by a February: (#$includedInIntervalType (#$DayFn 29 #$February) #$February).
Every February subsumes a Wednesday.
Every February intersects some winter season (in a theory applying to the Northern Hemisphere).
The day Jim was born occurred in a February.
Every Tuesday is followed by a Wednesday.

,
Summary

 This concludes the lesson on representing time and dates in Cyc.

pdf | zip | Spatial Properties and Relations
Survey of Knowledge Base Content

  Just as Cyc has many ways of representing aspects of time, Cyc has many ways of describing spatial properties and relationships.  Fortunately, the ideas that we present in this lesson are familiar and much of the vocabulary is self-documenting, so we won’t add much explanation.

,
Spatial Properties and Relations

  Cyc has various ways of expressing relative location, such as those listed on this slide.

Notice how many different shape attributes Cyc has -- 63!  These include attributes like #$ArcShaped, #$Linear-Planar, #$Round, and #$Amorphous.

Types of Spatial Symmetry include things like bilateral and radial.

You can specify direction and orientation with ideas like “in front of” and “above,” but Cyc requires specifics like #$inFrontOf-Directly and #$inFrontOf-Generally.

Cyc differentiates between various senses of “between.”  For example, you can specify the distance between two objects on a line and you can specify the distance between two objects on a path (a path which might bend or even be circular like a track).
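Relative-location assertions of the kind just described might look roughly like this in CycL (a hedged sketch — the individual constants such as Podium001 and Screen001 are hypothetical, invented for illustration; only the two predicate names come from the lesson):

```cycl
;; Hypothetical individuals, real predicates from this lesson.
(#$inFrontOf-Directly Podium001 Screen001)    ;; squarely in front of the screen
(#$inFrontOf-Generally Lectern001 Screen001)  ;; only roughly in front of it
```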

,
Senses of ‘In’

  What do people mean when they use the word ‘in’?  Use the questions on this and the two following slides as a guide when making decisions as to what sense of ‘in’ to use in a given situation.

,
Senses of ‘In’
,
Senses of ‘In’
,
Senses of ‘Part’

  The list of predicates on this slide represents different aspects of “being a part of something.”  The predicate #$parts is the most general predicate in this list.

#$subInformation is totally abstract.  It means that one piece of information is part of another.  For example, “Jim and Mary went to the store” has “Mary went to the store” as a part of its information.

#$physicalDecompositions allows you to refer to an arbitrary physical chunk of an object.  This is distinct from #$physicalPortions which refers to a portion that contains a representative sample of everything in the whole.  So a physical portion of a salad with five ingredients might have representatives of all five ingredients, whereas a physical decomposition might have only one, two, or three of the ingredients.

#$physicalParts is what most people think of when they think of an object’s parts.  This is used to refer to the physically-separable parts of an object, even if the parts are glued or welded together, such that each part has its own identity (e.g., the wheels of a car, the tail of a dog, the fingers on a hand).

#$anatomicalParts refers to parts like physical parts, but anatomical parts each have their own anatomical function (e.g., the throat of a dog, the nervous system of a person, the eye of a bird).
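The distinctions above might be sketched in CycL as follows (hedged: the individuals Car001, Dog001, and Salad001 are hypothetical, and the argument order — whole first, part second — is an assumption):

```cycl
;; Hypothetical individuals; predicates taken from this slide's list.
(#$physicalParts Car001 Wheel001)             ;; a separable part with its own identity
(#$anatomicalParts Dog001 Throat001)          ;; a part with an anatomical function
(#$physicalDecompositions Salad001 Chunk001)  ;; an arbitrary physical chunk of the salad
```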

,
Organizations

  Similar to physical objects, organizations also have parts.  Cyc has a large number of terms to describe organizations.  This slide shows a sampling of those terms, most of which are self-explanatory.

Notice that both predicates and functions are listed here.  #$PolicyFn denotes the collection of all policies of an organization.

,
Summary

This concludes the lesson on spatial properties and relations.

pdf | zip | Event Types
Survey of Knowledge Base Content

  Cyc uses around 37,000 different event types to describe what happens in the world.   This large number includes events that range from the extremely general to the extremely specific.  The extremely specific events might only be used once because they were created for a very specific situation and therefore involve a large number of qualifiers.  For example, the sub-collection of all events in which garbage containing an explosive part has been disposed of by encasing in….

,
Some Event Types

  The slide lists common, everyday event types that are self-explanatory.  To use them, you would represent a particular instance of an event and then represent the relationships of the participants in that event via predicates called Actor Slots.  We’ll review some examples of this after a few more slides.

,
Roles and ActorSlots (the world’s largest collection)

  In Cyc, “roles” and “actor slots” are used to describe the kinds of things listed on this slide.  “Objects destroyed” includes objects that you can’t get back out once you put them in.  For example, eggs are destroyed in the #$BakingACake event.  Think of facilitating objects as helpers, or assistants.  For example, an electric mixer is a facilitating object in the #$BakingACake event.  Slots (predicates) of motion and location are used when describing moving events, as in moving an egg from a carton to the bowl.  The beneficiary is the recipient in an event.
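A hedged sketch of the #$BakingACake example in CycL (the individuals BakingACake001, Fred, Egg001, and Mixer001 are hypothetical, and #$deviceUsed is an assumed name for the facilitating-object actor slot; #$performedBy and #$inputsDestroyed appear elsewhere in these lessons):

```cycl
(#$isa BakingACake001 #$BakingACake)
(#$performedBy BakingACake001 Fred)        ;; Fred does the baking
(#$inputsDestroyed BakingACake001 Egg001)  ;; the egg can't be gotten back out
(#$deviceUsed BakingACake001 Mixer001)     ;; assumed predicate: the facilitating object
```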

,
Roles and ActorSlots

  Refer to the slide for an example of a particular instance of an action.  The following two slides will diagram how we use roles and actor slots in representing this action.

,
Roles and ActorSlots “Moe clobbered Curly with the British scepter.”

  First of all, Cyc represents events by relating the event to its participants (not by relating participants to each other).  Therefore, this diagram shows Clobbering14 in the center with all other aspects, or actor slots, related directly to the event, not to other aspects of the event.  #$performedBy is an actor slot that is used to relate Moe to Clobbering14.  This is like saying “Clobbering, Moe did it” and “Clobbering, Curly received it” as opposed to “Moe clobbered Curly.”  Because Cyc describes events in this manner, we must have very specific ways of representing roles and actor slots.  Cyc has over 200 role and actor slot predicates.

,
Roles and ActorSlots “Moe clobbered Curly with the British scepter.”

  Clobbering14 would be an instance of the collection of events called #$Clobbering.  It is crucial that events be described in terms of a particular instance and not the event in general because Cyc links participants (performers, victims, etc.) to each particular event.  Therefore, events must be reified (represented in the knowledge base with a particular name).

,
Roles in events and subevents

  Because we describe events in Cyc in terms of a particular instance and not the event in general, roles and subevents allow us to represent more about an event.  Consider the Krebs Process, a kind of biochemical process in which any instance of the process will be represented by multiple subevents related to a parent event.  The first subevent creates an output that becomes an input to the second subevent.  The second subevent destroys that input.  So the same object (the black ball in the diagram on the slide) uses a different ActorSlot predicate (output/input) for each of the different subevents it relates to.
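The output/input pattern described above might be sketched in CycL like this (hedged: all the individual constants are hypothetical reifications, and #$outputsCreated is an assumed actor-slot name paired with #$inputsDestroyed, which appears later in these lessons):

```cycl
(#$isa KrebsProcess001 #$KrebsProcess)    ;; hypothetical instance of the process
(#$subEvents KrebsProcess001 SubEvent1)
(#$subEvents KrebsProcess001 SubEvent2)
(#$outputsCreated SubEvent1 Molecule001)  ;; the first subevent produces the molecule
(#$inputsDestroyed SubEvent2 Molecule001) ;; the second subevent consumes it
```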

,
Summary
  •  Cyc has a large variety of predicates for representing roles performed in events and the actors who perform them.
  • Events are represented by relating actors to the event.
  • The product of one sub-event is the input to another.

This concludes the lesson on event types in Cyc.

pdf | zip | Information
Survey of Knowledge Base Content

Information is one of the trickiest things to ontologize (represent in the Knowledge Base) because everything contains some kind of information.  Movies, books, plays, and TV shows are all obvious examples of things that contain information, but the rings in a tree stump and the color of the sky contain information as well.

,
Information

  There are three obvious categories of things that contain information: information-bearing things, abstract strings and characters, and propositional content.  However, in designing Cyc we discovered a fourth area which we call conceptual works.

Information-bearing things are the physical embodiments of information (i.e. an individual newspaper, which you can use to read or to wrap fish).

Information is encoded in abstract strings and characters.  They are the abstract symbols and structures like words, sounds, and handwriting that are used to convey information.  These are the most common elements used to intentionally convey information.

Propositional Content is what is encoded.  It is the information that the abstract strings and characters combine to represent.

The following slides will further explicate the above ideas and introduce the idea of a conceptual work.

,
What is “Moby Dick” ?

  This slide presents an example that we will tease apart through the remainder of this lesson.

Consider “Moby Dick” and the three main categories of information detailed on the previous slide.  As we continue with this lesson, you will see that “Moby Dick” doesn’t fit entirely into any one of those categories.

Question: What is “Moby Dick”?
Answer: A conceptual work.

Let’s see why....

,
Slide 4

  If I were to say that I like “Moby Dick,” I might mean that I like the way my special edition leather-bound copy of “Moby Dick” looks on the shelf.  In this case I would be referring to the aspect of “Moby Dick” that is an information bearing thing (IBT).

If I were to say that I like “Moby Dick,” (albeit unlikely) I might mean that I like the specific sequence of the letters in the work called “Moby Dick.”  In this case, I would be referring to the abstract information structure of “Moby Dick” (AIS).

If I were to say that I like “Moby Dick,” I might mean that I had read the story in several languages and liked it each time I read it.  In this case, I would be referring to the propositional information thing (PIT) that “Moby Dick” can denote.  This concept is important, as it is what allows Cyc to be independent of language.  A PIT that is expressed in CycL (a portion of which is given on the slide) can then be expressed in any desired language because PIT’s are independent of AIS’s (the sequence of symbols).  One PIT could have multiple AIS’s.

,
Slide 5

Since most people don’t normally use the name of a book to refer to the paper on which it is printed, when they say “Moby Dick,” they are probably not using “Moby Dick” as an information-bearing thing (IBT).

Similarly, people don’t normally use the name of a book to refer to the enormously long sequence of symbols (letters) that make the text of the book (the sequence of all of the letters in the entire work, including the small sequence on the slide, “’-T-i-s--M-o-b-y--D-i-c-k-!”).  Thus, “Moby Dick” probably does not refer to an abstract information structure (AIS).

In the same way that “Don Quixote” probably refers to a novel that is written in Spanish, when people say “Moby Dick” they are probably referring to the original English version.  In fact, some experts might argue as to the validity of reading the novel in a language other than the original English.  Thus, when people say “Moby Dick,” they are probably not referring just to the propositional information thing (PIT), but to something more, which includes the language in which it was written.

,
Slide 6

  Hence, we know that “Moby Dick” denotes something that does not fit into any one of the three obvious categories for representing information.  So we now understand the need for the fourth category -- Conceptual Works.  When someone says “Moby Dick” they are probably referring to the thing that Cyc knows as #$MobyDickTheBook-CW.

,
Slide 7

  #$MobyDickTheBook-CW is embodied in thousands of different IBT’s around the world (all the copies of all the editions, in all the libraries, homes, schools, etc.) and is represented by a specific AIS, and is associated with a specific PIT.

,
Slide 8

  In Cyc, we relate conceptual works to other things, like IBT’s, via a large number of relations.  Although these relations have no common-sense usage, they allow us to specify to Cyc the part of the conceptual work to which we are referring.  So, in order to refer to a specific copy of “Moby Dick” we would refer to the IBT that is an #$instantiationOfCW of “Moby Dick.”
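Referring to a specific copy might be sketched like this (hedged: MyLeatherBoundCopy is a hypothetical individual, and the argument order of #$instantiationOfCW — IBT first, conceptual work second — is an assumption based on the prose above):

```cycl
;; A particular information-bearing thing that embodies the conceptual work.
(#$instantiationOfCW MyLeatherBoundCopy #$MobyDickTheBook-CW)
```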

,
Slide 9

This slide presents a glimpse of the various relations between conceptual works and the other categories pertaining to information.

,
Summary
  • InformationBearingThing
  • AbstractInformationStructure
  • PropositionalInformationThing
  • ConceptualWork
  • Relating these categories

This concludes the lesson on how Cyc represents Information.

pdf | zip | More Content Areas
Survey of Knowledge Base Content

  The purpose of this lesson is to introduce you to some of the content areas in Cyc.  We have already studied the Information area.  This lesson will provide a more general overview of some of the other content areas in Cyc.

,
Emotion

The column on the left of the slide lists some collections of emotional attributes.  The collection called #$Abhorrence includes things that are related to feelings of abhorrence. All the degrees of all of the ranges of abhorrence should be in this collection.  Entering knowledge about emotions into Cyc is facilitated by using relations like those that are listed on the right of the slide.

For instance, Cyc knows five different emotions that are related to Abhorrence with the #$contraryFeelings relation.  Three of these emotions are #$Enjoyment, #$Adulation, and #$Love.  Thus, Cyc knows that something is wrong when a person both abhors and loves a thing.

Similarly, we can tell Cyc which emotions are appropriate for a given situation according to the role that a person plays in the situation.  For instance, we can say that a given emotion is an #$AppropriateEmotion for the groom at a wedding, but not for the wedding coordinator.

Another emotion-representing relation is #$feelsTowardPersonType.  This relation allows us to tell Cyc how a person feels about a group of people.  #$actionExpressesFeeling is used to tell Cyc things like laughter expresses amusement.

,
Propositional Attitudes: Relations Between Agents and Propositions

  Another content area in Cyc is Propositional Attitudes.  This slide lists some of the predicates that relate intelligent agents to propositions.  They are fairly self-explanatory, but we’ll discuss the #$desires relation as an example.  In the sentence “Jim desires that the sky will be cloudy,” Jim is the agent and #$desires would be used to represent the verb phrase “desires that,” which relates Jim to the proposition “the sky will be cloudy.”
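The “Jim desires that the sky will be cloudy” example might be sketched in CycL as follows (heavily hedged: Jim is a hypothetical reified person, and the embedded formula — including the predicate #$weatherCondition and its terms — is an invented stand-in for “the sky will be cloudy”; only #$desires comes from the slide):

```cycl
;; #$desires relates an agent to a proposition (a CycL sentence).
(#$desires Jim
  (#$weatherCondition TomorrowSky #$CloudyWeather))  ;; stand-in inner sentence
```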

Similar to relating emotion predicates to each other, we can relate propositional attitude predicates to other propositional attitude predicates.  For example, Cyc can infer that if Jim #$knows something to be true, then Jim does not hope it will be true; rather, Jim believes it is true, and it is not his goal for it to be true.

,
Biology

  Yet another content area in Cyc is biology.  In Cyc, organisms are classified according to their biological taxon (such as #$Rat, #$Vertebrate, and #$Bee), their habitat (where they tend to live), and what they eat.  Some classifications are scientific, while others are not, like #$Worm.  Cyc can also take anatomical descriptions of organisms that say what organs they have and how they are related (for example, how they work in processes like metabolism and digestion).  We can even describe an organism’s life stages to Cyc.

,
Materials

  Cyc has a diverse Materials vocabulary for describing types of substances, physical properties of substances, and tangible attributes of substances.

You can discuss all states of matter (solid, liquid, etc.) in Cyc.  This allows us to assert, for example, that glass is classified as a solid tangible thing and yet exists in a liquid state.

Structural Attributes include things like colloid and crystalline.

Tangible attributes are used to describe the perception of how something feels.  It is important, however, to note that something that is a solid tangible thing is not necessarily in a solid state of matter.  For example, consider a piece of wood.  That piece is a solid tangible thing in that it can be touched and its shape is independent from its container, but the water that is in it is in a liquid state of matter and that water can account for 20 to 60 percent of the piece of wood.

,
Devices

  Cyc knows about lots of types of devices, from #$BlowDryer, #$Gun, #$Engine, #$CopyMachine, #$FishingHook, and #$Wheel, to #$AtomicClock.

Various attributes of devices can be represented within Cyc.  You can describe the state of a device, like #$PausedState, #$DeviceOn, and #$OffHook (for a phone).  You can describe actions of devices, like the spinning of a washing machine.  You can describe the power rating, operating cost, etc. of a device.  Finally, you can tell Cyc the purpose of a device with the #$PrimaryFunction predicate.  For example, the primary function of a hammer is to be used in a hitting event.

,
Food

Another content area in Cyc is food.  Cyc knows a lot about food.

There are various food types in Cyc, such as the #$Fruit collection, the #$Meat collection, etc.

You can tell Cyc who can typically eat a food with the  #$EdibleByFn function.

#$PreparingFoodOrDrink is the collection of events that are associated with preparing a #$FoodAndDrink item (whether you start “from scratch” or use a mix).  Similarly, #$ConsumingFoodOrDrink is the collection of events in which a person or other animal ingests some portion of food or drink through its mouth.

Cyc also has more specific food-related vocabulary.  For example, to say that a person is a vegetarian, you can say that he or she #$eatsWillingly #$VegetarianCuisine.

,
Weather

  All normal human expressions about weather, another content area in Cyc, can be expressed in CycL.

You would say that a tornado occurred on Thursday by referring to an instance of the #$TornadoAsEvent collection.

You would distinguish the snowing event from snow itself by referring to an instance of the collection called #$SnowProcess.

In order to say that a tornado hit my house on Thursday, you’d refer to an instance of the collection called #$TornadoAsObject.

,
Geography

  Cyc has an extensive vocabulary for describing the geography content area.

Cyc has various ways of encoding physical addresses, breaking them down into street, zip code, etc.

Cyc also has various predicates for describing characteristics of the populace like religion, language, etc.

Cyc has lots of predicates for physical relations between things (like borders), geo-political subdivisions (like voting districts), and natural land divisions (like islands and seas).

,
  • Emotion
  • Propositional Attitudes
  • Biology
  • Materials
  • Devices
  • Food
  • Weather
  • Geography

This concludes the lesson on additional content areas in Cyc.

OE Example: Events and Roles

pdf | zip | Events in Cyc

This section surveys Cyc’s representation of events.  It ties in with later sections, where we’ll see how components of events are related to events in Cyc by roles, such as actor slots and sub-events.

,
Events in Cyc

Events in Cyc are represented as individuals that belong to a collection called #$Event.  These individuals have components; that is, they’re not empty stretches of space or time, they have parts.

They are also situations. By “situation” we mean any configuration or arrangement, such as a group of people, their equipment, etc., in a room, at a specific time; or just that computer sitting on the table; or some birds flying outside.  Situations have structure, precise or “fuzzy.”

Events also have temporal extent.  That is, they occur over time.  So they’re different from arrangements such as geometric configurations or abstract mathematical series, which are also configurations of individuals, but they’re not extended temporally.

Finally, events are dynamic.  That means that they can change over time.

Most individual events in Cyc are classified, not just as instances of #$Event, but as instances of a more specialized subcollection of #$Event.  So you wouldn’t see #$Event001 reified in the Cyc KB; you’d more likely see #$Reading001, #$Negotiating001, #$PoliticalCampaign001, or some other instance of a more specialized collection.

,
Partial hierarchy for #$Event

  This slide provides a glimpse of the #$genls hierarchy of the collections surrounding #$Event.  As you can see, #$Event is indeed a specialization of #$Situation-Temporal, and that’s how Cyc knows events are arrangements of objects that extend over time.  Also, you can see that #$Event is not a specialization of #$RelationalStructure, and that’s how Cyc knows that events are different from abstract geometric or mathematical series.  Events are dynamic, so #$StaticSituation is a sibling collection for #$Event.  #$StaticSituation collects situations that are extended in time but don’t change, whereas the collection #$Event collects the situations that are extended in time but do change.  Under #$Event you see some of the more general specializations of event.

This is really just the tip of the iceberg.  There are many many more specializations of #$Event.

,
Why Reify Events?

Why do we reify individual events (instances of #$Event) in Cyc?  If our knowledge changes about an event, having a reified data structure to represent the event enables us to add information or alter the representation in Cyc very easily.

Also, because kinds of events are related to each other in the #$genls hierarchy, we can use that hierarchy to inherit knowledge downward from the more general types of events to the more specialized types of events.  So for instance, if we have the general event collection, #$TransportationEvent, and we state about that collection that anything that travels during a transportation event moves from an origin to a destination, then Cyc will know that this is also true of specializations of #$TransportationEvent, such as #$SubmarineTransportationEvent or #$BicycleTransportationEvent.  More specialized knowledge will also hold in those cases.  We’ll know that for an instance of #$BicycleTransportationEvent the device used was a bicycle, and still the more general assertions hold.

As we will see, roles too have a hierarchy that extends Cyc’s ability to reason about the components – the participants and sub-parts – of events.

,
Components of events

There are all kinds of things that can be components of events.  Events can have performers, and there can be devices that performers use during the events.  Events can have sub-events, or sub-stages.  Events can occur at places, and those places are somehow involved in the events.  Events take place at times, and the times of events are also somehow involved in events (we have special predicates to relate times to events).  We state how components of events are involved in events with role predicates – predicates that are instances of the collection #$Role.

,
Using Roles

For example, consider the event that is the Battle of the Nile.  It is a fairly complex event in which a number of things are involved.  In a sense, the year 1798 is involved in the Battle of the Nile, because that’s the year in which the battle occurred.  Abu Qir Bay is involved in the event, because that’s where the event occurs.  Horatio Nelson is an actor in the event.  The British Attack is a sub-event of the battle, as is the French Defense.  And so on.
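A hedged sketch of these involvements in CycL (the constant names are hypothetical reifications; #$eventOccursAt and #$subEvents appear elsewhere in these lessons, while #$dateOfEvent and #$YearFn are assumed names for relating the event to its year):

```cycl
(#$isa BattleOfTheNile001 #$Battle)              ;; hypothetical reified event
(#$dateOfEvent BattleOfTheNile001 (#$YearFn 1798))  ;; assumed predicate and function
(#$eventOccursAt BattleOfTheNile001 AbuQirBay)
(#$subEvents BattleOfTheNile001 BritishAttack001)
(#$subEvents BattleOfTheNile001 FrenchDefense001)
```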

,
Roles and Events

In CycL we use special predicates called “roles” to relate reified events to their components.  We build a lot of knowledge into the construction of role predicates to help Cyc understand how these roles function to relate components of events to reified events.

,
Summary

In summary, events are a certain kind of individual that we find useful to reify in order to increase inferential efficiency.  Events are the sorts of things that have components.  We use roles to relate components of events to events.

pdf | zip | Roles and Event Predicates

This section is on roles and event predicates.  Roles are predicates that relate components of events to events.  Event predicates provide a distinct means of representing the ways that events and their components are related to one another.

,
Roles in Cyc

  Roles are specialized predicates developed for the purpose of relating components of events to events.  There are two general specializations of the collection #$Role.  They are #$ActorSlot and #$SubProcessSlot.  In the following section we will discuss actor slots and in the last section we will discuss sub-process slots.

Roles are arranged in a predicate hierarchy based on #$genlPreds.  The top node of the hierarchy is #$actors.  Every instance of #$Role is a specialization (a.k.a. “spec. pred”) of #$actors.

,
Roles include many kinds of relations

Roles include many kinds of relations between a situation and its components. For instance, in an event which is a reading of a book, there are many components.  There is the person doing the reading, there is the book being read, there’s the information in the book that’s being transferred to the person doing the reading, and there are sub-events (such as the reading of each individual page).  Roles relate those components to the reading of the book.

In a naval battle there are many components which would be related to that naval battle via roles.  These include the groups engaged in the battle, the attack undergone, the officers who direct the battle, all of the different devices that are used in the battle, and other kinds of sub-events (such as the signals that occur between ships during the battle).

,
Representing roles in CycL

Here are some examples of CycL sentences with roles in the arg0 slot.  #$performedBy, #$informationOrigin, #$eventOccursAt, and #$topicOfInfoTransfer are all instances of #$Role and they specify a role that an individual plays in a particular reading event.
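The slide itself is not reproduced in these notes, but assertions of the kind described — the four role predicates named above, applied to a particular reading event — might look like this (hedged: the individuals Reading001, Fred, BookCopy001, FredsOffice, and the topic constant are hypothetical):

```cycl
(#$isa Reading001 #$Reading)
(#$performedBy Reading001 Fred)               ;; Fred does the reading
(#$informationOrigin Reading001 BookCopy001)  ;; the thing being read
(#$eventOccursAt Reading001 FredsOffice)      ;; where the reading happens
(#$topicOfInfoTransfer Reading001 Sailing001) ;; what the reading is about
```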

So, what do you think these mean?

,
Representing roles in CycL

The translations are italicized.

,
Representing roles in CycL

 See if you can translate these before going on to the next slide.

,
Representing roles in CycL

 Here are the translations.

,
Ontologizing with roles

So, we can use roles to describe particular situations and events with CycL predicates.  But even more importantly, we can use roles to reason in general terms about kinds of situations and the kinds of things that are involved in them.

Using rules with roles, we can state common-sense generalizations such as: “anything reading a book must be an intelligent agent,” “anything being read must be textual material,” or “after a reading event, the reader is familiar with the information contained in the thing read.”

In CycL, we generalize by quantifying over the individual events and their components.  With this ability to generalize, we can infer more information about situations from the information that we already have.  And note that quantification also enables us to ask Cyc questions that cover entire classes of situations and roles.

,
Rules that use roles

Here are some CycL rules that capture general knowledge about roles, including knowledge about the kinds of things that are related by certain roles.

Can you tell what the first one says?

It says: “In every instance of #$Reading that has a source, that information source is textual material” (a separate assertion should tell us that every instance of #$Reading does have an information source).  In other words, “Whenever someone reads, they read text.”

What does this next one say?

It says: “Any reading event done by a blind person is also an instance of perceiving by touch.”  In other words,  “Blind people read by touching.”

Note that we could further infer that what is being touched is the object containing the textual material.  An important additional use of roles is to establish inferentially ascertainable relations between items that are related (by roles) to the same situation.
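The first rule discussed above might be sketched roughly as follows (hedged: #$TextualMaterial is an assumed collection name; the other constants appear in this lesson):

```cycl
;; "Whenever someone reads, they read text."
(#$implies
  (#$and
    (#$isa ?READ #$Reading)
    (#$informationOrigin ?READ ?SOURCE))
  (#$isa ?SOURCE #$TextualMaterial))  ;; assumed collection name
```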

,
Rules that use roles

  Here are some more examples.

What does the first rule say?

It says: “For every instance of #$SeaBattle, there is some [instance of] #$BodyOfWater where that event occurs.”  Or, “Every sea battle occurs at a body of water.”

The second rule says, “Anyone who is involved in a sea battle is able to swim.”  (Is that true?  Alas, no.)
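The first rule on this slide might be sketched as follows (hedged: #$SeaBattle and #$BodyOfWater are taken from the prose above; the exact formula on the slide may differ):

```cycl
;; "Every sea battle occurs at a body of water."
(#$implies
  (#$isa ?BATTLE #$SeaBattle)
  (#$thereExists ?WATER
    (#$and
      (#$isa ?WATER #$BodyOfWater)
      (#$eventOccursAt ?BATTLE ?WATER))))
```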

,
Inferencing using roles

  Here are some examples of how the common sense knowledge represented in Cyc, with the #$genlPreds relation between roles and some of the rules stated on roles, enable Cyc to make some common sense inferences.

Look at the two assertions.  First of all, #$NelsonPlanningTrafalgarAttack is an instance of #$PlanningToDoSomething.  Second of all, #$NelsonPlanningTrafalgarAttack is done by #$HoratioNelson.  From these assertions, Cyc can conclude that, first of all, #$HoratioNelson is an intelligent agent.  Cyc knows this because Cyc knows that anyone who plans to do something must be an intelligent agent.  Further, Cyc can conclude that Horatio Nelson performs the planning of the Trafalgar attack because Cyc knows that anything that’s done by an agent is also performed by an agent.

,
#$Role has important sub-types

#$Role has two specializations (i.e., sub-collections or “specs”): #$ActorSlot and #$SubProcessSlot.  #$ActorSlot is the largest sub-collection of #$Role.   Instances of this collection are predicates that relate specific events to things that are somehow “involved” in those events.   Instances of #$ActorSlot account for two-thirds of the instances of #$Role.  Some important instances of #$ActorSlot are #$performedBy and #$objectActedOn.

Another  important specialization of #$Role is #$SubProcessSlot. Sub-process slots account for about one-third of the instances of #$Role. Instances of this collection are predicates that relate specific events to temporal parts of those events.  The most important sub-process slot is #$subEvents.  Every instance of #$SubProcessSlot is a spec pred of #$subEvents.

There are a few instances of #$Role that don’t fall under either the #$ActorSlot or the #$SubProcessSlot collection.  These role predicates typically don’t fall into the other more specialized collections because they aren’t binary.  Every actor slot and every sub-process slot is a binary predicate.  These other #$Role predicates are usually ternary or variable-arity.  For instance, #$objectsInContact could specify any number of different things that are in contact during an event.  It need not just be the event and one thing in contact.  It’s usually going to be at least the event and then two things in contact.

,
An Alternative Representation: Event Predicates

There is an alternative method for representing events which does not require reifying them: event predicates.

Using the event/role method of describing events and their interrelations and their relation to their components requires saying something like “There is an attack; it’s an instance of #$MilitaryAttack; that which plays the role #$performedBy in the #$MilitaryAttack is the #$BritishFleet; and that which plays the role of #$maleficiary in the attack is #$NapoleonsEgyptianFleet.”  Instead, we can say something that, on the surface, looks a lot simpler: “The #$BritishFleet attacks #$NapoleonsEgyptianFleet.”

As I said, on the surface the second method looks a lot simpler.  That’s because it’s closer to natural language.  Few people will state something in a way as complex as the first assertion.  They’ll typically say something like “the British fleet attacks Napoleon’s Egyptian fleet.”  But, in fact, that simple statement can be expanded into the complex one.

,
The Event Representation can be Recovered

Cyc knows that anything that militarily attacks anything else does so during a military attack.  Cyc also knows that that which is attacked in an attack is the #$maleficiary of the attack.  So, the simple-looking assertion that is closer to natural language can be expanded into something like the more complex assertion automatically by Cyc.
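The expansion might be captured by a rule roughly like this (hedged: #$militaryAttacks is an assumed name for the binary event predicate; #$MilitaryAttack, #$performedBy, and #$maleficiary come from the previous slide):

```cycl
;; Expanding the event-predicate form into the event/role form.
(#$implies
  (#$militaryAttacks ?ATTACKER ?TARGET)  ;; assumed binary event predicate
  (#$thereExists ?ATTACK
    (#$and
      (#$isa ?ATTACK #$MilitaryAttack)
      (#$performedBy ?ATTACK ?ATTACKER)
      (#$maleficiary ?ATTACK ?TARGET))))
```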

,
Specializations of #$EventPredicate

There are two important kinds of #$EventPredicate: #$ActionPredicates and #$ScriptPredicates.  We use #$ActionPredicates to make assertions about actions performed by agents and we use #$ScriptPredicates to make assertions about instances of #$ScriptedEventType.

#$ActionPredicates usually relate two agents to each other or an agent to something, and they specify what the agent is doing.  #$ScriptPredicates typically relate various different event types to one another and establish their order in a typical performance of an event.  So, consider the brushing of teeth.  What comes first?  Well, in most cases wetting the toothbrush comes before putting the toothpaste on the toothbrush.  We establish that by using assertions that use instances of #$ScriptPredicate.

,
Summary

  In summary, there are many kinds of roles in Cyc.  Roles relate components of events to events.  We can state general facts using roles and rules, which gives a lot of inferencing power to Cyc.  The two most significant specializations of #$Role are #$ActorSlot and #$SubProcessSlot.  And event predicates are an alternative to roles.

pdf | zip | Actor Slots

  This is the third lesson on Events and Roles.  It will discuss the concept of Actor Slots.

,
Actor “slots”

  #$ActorSlot is the largest sub-collection of #$Role.  Actor slots are predicates that are arranged in a semantically powerful hierarchy.  All the instances of #$ActorSlot are predicates that have #$actors as their #$genlPreds, directly or indirectly.  In that way, #$actors imposes upon all the predicates below it the constraints shown on this slide: actor slots relate instances of #$Event to instances of #$SomethingExisting because those are the argument constraints on #$actors.

*Note about “plays a part”:  It’s not much easier to give a precise criterion for actor slots than it is to give one for the “involved in” criterion for #$Roles generally. For example, actor slots don’t necessarily indicate that something is active.  There is one strict temporal requirement imposed through #$actors: anything that can fill an #$ActorSlot role #$temporallyIntersects the event.

,
#$ActorSlot hierarchy (partial)

  This slide shows a small part of the top of the #$ActorSlot #$genlPreds hierarchy.  Constraints imposed by predicates at the top of the hierarchy inherit downward.  At the bottom of this part of the hierarchy, we see the #$ActorSlot #$inputsDestroyed.  Since #$inputsDestroyed inherits from #$inputs, Cyc knows that anything destroyed in an event is an input to the event.  Since #$inputs inherits from #$objectActedOn, Cyc knows that any input to an event is somehow acted on in the event.  Since #$objectActedOn inherits from #$preActors, Cyc knows that every object acted on in an event takes part in the event and exists before the event begins, and since #$preActors inherits from #$actors, Cyc knows that any object that takes part in an event and exists before the event begins is a #$SomethingExisting that temporally intersects the event.
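
The chain just described corresponds to a ladder of #$genlPreds assertions, roughly (the actual KB may include intermediate predicates between these):

```cycl
;; Each predicate inherits the constraints of the one above it.
(#$genlPreds #$inputsDestroyed #$inputs)
(#$genlPreds #$inputs #$objectActedOn)
(#$genlPreds #$objectActedOn #$preActors)
(#$genlPreds #$preActors #$actors)
```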

,
#$ActorSlot hierarchy (partial)

  This slide provides a glimpse of a larger portion of the vast #$ActorSlot hierarchy.  As you can see, there are many actor slots that inherit significant constraints from other actor slots in the hierarchy.

,
Inferential relations of roles and temporal predicates

This slide shows another example of how actor slot constraints inherit down through the #$genlPreds hierarchy.  Since #$actors inherits from #$temporallyIntersects, Cyc knows that anything that plays a role in an event temporally intersects the event.

,
Inferential relations of roles and temporal predicates

  Similarly, since #$preActors inherits from #$startsAfterStartingOf and #$actors, Cyc knows that anything that is a pre-actor in an event temporally intersects the event, and starts before the event does.

,
Inferential relations of roles and temporal predicates

Since #$postActors inherits from #$endsAfterEndingOf and #$actors, Cyc knows that anything that is a post-actor in an event temporally intersects the event, and ends after the event does.

,
Some non-#$ActorSlot roles

Although #$ActorSlot is the largest sub-collection of #$Role,  not every instance of #$Role is an instance of #$ActorSlot.  From what we have learned so far in this presentation about actor slots in Cyc, it should be easy to tell why some instances of #$Role cannot be instances of #$ActorSlot.

So, why isn’t #$affectedAgent an instance of #$ActorSlot?

,
Some non-#$ActorSlot roles

That which is specified in the arg2 slot of an #$affectedAgent GAF need not be a participant in the event that’s being described.  For example, in some sense I was affected by World War II in that it had a tremendous effect on the way that society developed in the fifties, sixties, and so on, and I’m a product of that society, but I was not a participant in World War II.  So although we might want to say that I was affected by that event, I wasn’t a participant in it.  So this predicate is not an #$ActorSlot because, as I said, all instances of #$ActorSlot are spec preds of #$actors, and it’s a requirement stated on the predicate #$actors that whatever occurs in the arg2 slot must be a participant in the event in the arg1 slot.

,
Some non-#$ActorSlot roles

 #$distanceTranslated is not an #$ActorSlot.  Why not?

,
Some non-#$ActorSlot roles

  #$distanceTranslated is not an instance of #$ActorSlot because its arg2 is not required to be an instance of #$SomethingExisting.  This is another constraint placed on the #$ActorSlot hierarchy by the top node, #$actors, which requires that its second argument be an instance of the collection #$SomethingExisting.

,
Some non-#$ActorSlot roles

#$objectsInContact is not an #$ActorSlot.  Why?

,
Some non-#$ActorSlot roles

  Because it has variable arity.  When we say that objects are in contact during an event, we’re going to need to pick at least two objects and the event.  So it needs to be at least ternary to specify the event and the two objects that are in contact, and it could even be that there are more than two objects in contact during an event.  So this would not be a binary predicate, and all instances of #$ActorSlot are binary predicates.

,
Inferencing using roles

This slide shows an example of how inheritance relations among roles via the #$genlPreds hierarchy can be very useful in inferencing.  From the first bulleted set of assertions, the #$genlPreds hierarchy under #$actors enables Cyc immediately to conclude that the assertions in the second bulleted set hold.
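
As a sketch of this style of inference (the constants #$Fire001 and #$Cabin001, and the collection #$FireBurning, are used here for illustration):

```cycl
;; Asserted (illustrative constants):
(#$isa #$Fire001 #$FireBurning)
(#$inputsDestroyed #$Fire001 #$Cabin001)

;; Concluded via the #$genlPreds chain under #$actors:
(#$inputs #$Fire001 #$Cabin001)
(#$objectActedOn #$Fire001 #$Cabin001)
(#$preActors #$Fire001 #$Cabin001)
(#$actors #$Fire001 #$Cabin001)
(#$temporallyIntersects #$Fire001 #$Cabin001)
```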

,
Summary

So, #$ActorSlots are a special kind of #$Role predicate with special properties enforced by a #$genlPreds hierarchy with #$actors at the top.  The #$genlPreds hierarchy enforces constraints that inherit downward and enable Cyc to conclude many significant facts from assertions made with specialized #$ActorSlot predicates that come from the lower sections of the hierarchy.

pdf | zip | Sub-events

This section will cover sub-events in Cyc.  In general, #$subEvents is the single most important predicate among the non-#$ActorSlot roles. #$SubProcessSlot is one of the two most significant specializations of #$Role.  #$subEvents is the predicate at the top of a hierarchy of predicates that make up the #$SubProcessSlot collection.

,
Sub-events

  In Cyc, an event can have sub-events.  Sub-events are events that occur within the temporal scope of a larger event.  They can be related to each other in various different ways; two different sub-events of a single event can occur simultaneously, one can occur before the other, or they can sort of overlap temporally.

Cyc will infer certain temporal relationships between the “main-event” and the sub-event(s) related to it by various sub-event roles. For instance, #$subEvents implies the relationships #$temporallySubsumes and #$parts.  #$firstSubEvents implies the temporal relations #$temporallySubsumes and #$temporallyStartedBy.

Having identified multiple sub-events of a reference event, we may then represent relationships that hold between the different sub-events.
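
For example, with hypothetical event constants, the implications carried by these sub-event roles look like:

```cycl
;; Asserted (hypothetical constants):
(#$subEvents #$Battle001 #$Signaling001)
(#$firstSubEvents #$Battle001 #$Signaling001)

;; Cyc can then conclude:
(#$temporallySubsumes #$Battle001 #$Signaling001)
(#$parts #$Battle001 #$Signaling001)
(#$temporallyStartedBy #$Battle001 #$Signaling001)
```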

,
A time-slice of the #$TrafalgarSignals (1)

  Let’s look at an example with some #$subEvents of a particular, extended communications event:

Consider a time-slice of the event #$TrafalgarSignals --  namely, the event in which Nelson signals at 12:15 (shortly after noon). That’s when he asks the rest of the fleet to engage the enemy more closely.

,
A time-slice of the #$TrafalgarSignals (1)

Nelson’s signaling communicates to the rest of the British fleet reading the signal that they are to engage the enemy more closely.

,
Cotemporal sub-events

  Nelson’s signalling and the fleet’s reading of the signals are cotemporal sub-events.

We can say of the first that it’s an instance of #$VisualCommunicating while the second is an instance of #$SignalReading.  Each of those is a specialization of #$Event.

We can also say of the first that it’s performed by the signalman on the H.M.S. Victory and of the second that it’s performed by the rest of the British fleet.

Further, we can say that the sender of information in the signaling is Horatio Nelson, and he’s also the sender of information in the second event.  But notice that we wouldn’t say that #$TrafalgarSignals-British is itself an instance of #$SignalReading.  Only a sub-event of it is an instance of #$SignalReading.  Nor would we say that the Trafalgar signals are entirely performed by the signalman on the H.M.S. Victory.  Only a particular sub-event of the signaling is actually performed by that individual.

We could also say that the British fleet at Trafalgar is the recipient of information in both the signaling and the reading.

These roles, such as #$performedBy, specify the relations that certain individuals have with sub-events of the signaling.  But those relations that those individuals have with the sub-events don't necessarily inherit up or across to other sub-events.

,
Time-slices of the #$TrafalgarSignals

  Let’s look at a case where there are non-cotemporal sub-events of a larger sub-event.  Again, take the battle of Trafalgar, in which the fleets are arranged in the manner indicated in the picture on the right.

One sub-event of #$TrafalgarSignals is Nelson’s signaling the HMS Mars at 9:41 AM.  At that point, Nelson communicated to the rest of the fleet that they should take up the station of the Royal Sovereign – in other words, get in line behind the Royal Sovereign.

,
Time-slices of the #$TrafalgarSignals

  Collingwood signaled the Lee Column four minutes after Nelson’s signal.  So these sub-events of the Trafalgar signals are non-cotemporal because Nelson’s signaling precedes Collingwood’s signaling.  Collingwood just relayed Nelson’s signal down the Lee column.

,
Sequential (non-cotemporal) sub-events

  We can say of the first sub-event, Nelson signaling the H.M.S. Mars, that it is an instance of #$VisualCommunicating and that the second sub-event, Collingwood’s relaying the signal, is also an instance of #$VisualCommunicating.

Each one of these instances of #$VisualCommunicating is performed by a different signalman, namely the signalman on the relevant ships.

The sender of information is different in each case and the recipient of information is different in each case.

,
Summary

  We’ve talked about sub-events.  Sub-events are components of events and we relate sub-events to the events of which they are components via sub-process slots.  #$SubProcessSlot is one of the two significant specializations of #$Role.  The top node in the predicate hierarchy that constitutes #$SubProcessSlot is #$subEvents.  So a sub-event and all its spec preds relate certain events to the events of which they are sub-events, “temporal components” in some sense.  Sub-events stand in certain relations to each other and the event of which they are components.  They can occur simultaneously, they can occur in sequence, or they can temporally overlap.  We looked at some examples of both cotemporal sub-events and sequential sub-events.

Writing Efficient CycL: Some Concrete Suggestions

pdf | zip | Writing Efficient CycL: Part 1
"Part One"

This section will give you some specific suggestions on how to write CycL that can be more efficiently handled by our Inference Engine. The lessons will give you a set of heuristics, some of which are mutually contradictory, so you’ll have to trade off between them (all good heuristics are mutually contradictory, otherwise you could set up a decision tree, right?).

If you have completed all of the lessons in order, then you’ve been exposed to a lot of suggestions on doing OE that make representational sense, so they are suggestions from the representational side. This section, however, is focusing on stuff from the implementation and algorithmic side that treats CycL as more than just a notational exercise in predicate calculus calligraphy. You’re writing something that is meant to be used by an automaton efficiently to do useful things. So you’ll see some suggestions which might seem a little bit inelegant; actually, many of them have analogs to software engineering principles which would be considered elegant by software engineers, so it’s worth keeping your mind open.

The slide shows an outline of the lessons in this section, going from most fundamental to least fundamental (and most idiosyncratic to our system). Many of the earlier lessons apply to any knowledge representation system and some of the later ones reflect the current CycL state.

,
"Simpler is Better"
,
Simpler is Better

This is probably the most powerful engineering principle of all time. In our system, simpler is better in many very concrete ways. For example, if you can find a way of expressing the knowledge that you want using GAFs instead of rules, that is better. It is better because those GAFs probably have a much more uniform definition of all the vocabulary in them, which means that you’re probably reusing a lot of work that’s already there. So there is probably better indexing involved, better ways of more naturally generating it in English, and better ways of translating that into other applications that can deal only with GAFs.

,
GAFs are better than rules

Even though rules are very valuable, they are just one tool in your kit. Rule macro predicates are sort of a hybrid between rules and GAFs that have the terseness of GAFs and the expressibility of rules (you’ll learn more about these in the next lesson). You really can express a lot of your knowledge using GAFs, and I strongly encourage you to do so. By weight in the system, approximately 98% of the system is GAFs. This is too low. The system should be more like 99.9% GAFs with very powerful, very general rules defining their meaning.

A lot of the rules tend to be initially conceived in a very idiosyncratic way, like “a koala eats eucalyptus.” We should have a general notion of prototypical diet that could be used in a GAF to match things like koala and eucalyptus (we, in fact, have the predicate #$hasDiet, that can match a koala with the type of food it eats). You can imagine all of the other analogous things that you could stamp out like that with one rule that defines what “prototypicalDiet” means. That’s an example of the fact that simpler representation is better because it encourages you to write a more powerful definition based on a more general, more reusable rule.

,
Binary is better than non-binary

  If you can represent something in binary predicates instead of non-binary predicates, that’s probably better also. There’s a reason why frame-and-slot systems and other sorts of binary representational systems can go a long way; it’s because, many times, the higher-arity stuff is an example of not factoring the knowledge concretely. A common case of this is in gratuitously ternary predicates where there is some term in the arg1 position, some other term in the arg2 position and some other term in the arg3 position. But the relationship between arg1 and arg2 and the relationship between arg1 and arg3 are independent. If you had two separate predicates, you could state those relationships independently.

This is exactly analogous to the difference between a highly normalized database and a non-normalized database, where you have a database of ten columns and when people enter new records, the records have a bunch of nulls in them because they don’t know the values. The fact that you can enter a null is a tip-off to an unnecessary, independent column. There should be another table which just states each column independently. So if you see yourself wanting to create a quaternary predicate, convince yourself that those arguments really do constitute a single, inseparable statement -- that you’re not using one assertion to state n separable things.
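
A sketch of the factoring, with all predicate and position names hypothetical (only #$FredSmith comes from earlier in this tutorial):

```cycl
;; Gratuitously ternary: the arg2 and arg3 relationships are independent.
(#$employmentDetails #$FredSmith #$Cycorp #$OntologistPosition)

;; Factored into two binary predicates, each stated independently:
(#$employedBy #$FredSmith #$Cycorp)
(#$holdsPosition #$FredSmith #$OntologistPosition)
```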

,
Horn-rules are better than non-horn rules

Horn rules are preferred over non-horn rules. For those of you who are unfamiliar with this vocabulary, a “horn” rule is one in which you have a number of literals in the antecedent added together and a single literal in the consequent. It’s used to say something like “a conjunction of things implies something to be true.” A non-horn rule would contain a disjunction in its consequent, like “x or y”.

Horn rules are easier to deal with than non-horn rules because non-horn rules tend to require proof by cases. If it’s a non-horn rule, then there is going to be a disjunction in the consequent, which means that you’re concluding something that is kind of weak and represents several cases from which you might have to choose.
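
For example (the collection names here are only illustrative):

```cycl
;; Horn rule: a conjunction of literals implies a single literal.
(#$implies
  (#$and
    (#$isa ?ANIMAL #$Dog)
    (#$isa ?ANIMAL #$JuvenileAnimal))
  (#$isa ?ANIMAL #$DomesticatedAnimal))

;; Non-horn rule: a disjunction in the consequent forces proof by cases.
(#$implies
  (#$isa ?ANIMAL #$Dog)
  (#$or
    (#$isa ?ANIMAL #$MaleAnimal)
    (#$isa ?ANIMAL #$FemaleAnimal)))
```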

,
Several small rules are better than one gargantuan rule

  Write several small rules instead of one gargantuan rule. I once saw a rule which filled seven printed pages -- this should strike you as being WRONG. It is way too complex to be correct. It will never get used and takes up a giant chunk of memory to store and likely does not even say what was intended.  A very nice thing about separability of that which is independent is that it’s a form of modularity which allows you to independently generalize each of the pieces.  For example, a very specific rule that only applies to spaceships taking off from Florida and going into the vacuum of outer space is very difficult to generalize. But if it had been broken into twelve separate pieces (one about transportation through a medium, another about  transportation mode, etc.), you could generalize each of the rules -- the rule about transportation mode might apply to all kinds of non-vehicle transportation, or even transporting materials in cells, or something like this.

,
"Simpler is Better"

As you learned in the previous lesson, simpler is better. You should write nice rules in the system, but where possible you should use a rule macro predicate that expresses it. This is a very important form of simple being better, where it states it as simply as it needs to be and no more. Not only that, it makes it easier to do meta-reasoning about the vocabulary itself. I can reason about the genlPreds hierarchy very easily, whereas if it’s all expressed in rules I have this extra interpretation step. I would have to keep asking things like “is this really a genlPreds rule? What is the superior predicate? What is the inferior predicate?” genlPreds in effect factors out the interpretation and ensures uniform interpretation. Because of this factoring out, it makes it easy to attach a new HL module and implement some new reasoning capabilities for that particular approach.

,
Simpler is better

Even though logically a long set of rules can be modeled as totally equivalent to a rule macro predicate, the terseness of the rule macro predicate provides some additional benefits, optimization points, and places to hook up special cases of functionality which make it very beneficial to have them.  Those of you who are programmers know this as functional abstraction. Stamping out a million rules is nothing but Copy and Paste, and abstracting out a property like genlPreds is like abstracting out your Helper Function or a function that defines what I’m intending. I can write rules that define what genlPreds means and that’s like the implementation of the abstract interface.

This is a form of basically functional abstraction at the OE level. Those with software engineering knowledge understand the utility of functional abstraction. Rule macro predicates are also analogous to macros in a language that provides you with macros. It’s kind of a way of having your Copy-and-Edit cake and eating it, too. If I use a macro, it will do the Copy and Paste uniformly.

,
Reason with the vocabulary itself

  If a rule is too complicated, it might be too complicated because you’re assuming that you’re not allowed to add new vocabulary. A big suggestion for making your life easier is to create the simplified vocabulary which will allow you to write these things tersely. For example, let’s say I was working for the Animal Planet.  Let’s say that they wanted to build a website that had a bunch of knowledge about which animals eat other animals. I could start writing all of these rules, like “cheetahs eat wildebeests” and “cheetahs eat antelopes” and just keep stamping out rule after rule until I have somewhere near ten thousand of these because they have all of this knowledge about what animals eat.

You can look at that and say “we’re just saying the same thing over and over. Let’s just abstract out ‘animal eats other animal’ and define what that one predicate means.” Now we can state ten thousand GAFs with the additional overhead of adding this new predicate and then write a rule that defines what it means for an animal to eat another animal.
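
Sketched in CycL -- the predicate #$typicallyEatsType, the predicate #$eats, and the defining rule are all hypothetical here, and the rule ignores the “typically” qualification for simplicity:

```cycl
;; One new predicate replaces ten thousand near-identical rules.
(#$typicallyEatsType #$Cheetah #$Wildebeest)
(#$typicallyEatsType #$Cheetah #$Antelope)
(#$typicallyEatsType #$Koala #$Eucalyptus)

;; One general rule defines what the predicate means (sketch):
(#$implies
  (#$and
    (#$typicallyEatsType ?EATER-TYPE ?FOOD-TYPE)
    (#$isa ?ANIMAL ?EATER-TYPE))
  (#$thereExists ?FOOD
    (#$and
      (#$isa ?FOOD ?FOOD-TYPE)
      (#$eats ?ANIMAL ?FOOD))))
```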

,
Reason with the vocabulary itself (cont.)

Adding the predicate that represents animal eats other animal (from the previous slide) is an example of creating simpler vocabulary which allows you to dramatically simplify what you’re going to write next. That’s what we at Cycorp have done over the years. Every predicate that you see, every function that you see, every collection -- did not exist at one point.  It was added because there was utility in adding it.

Take this approach to writing CycL. If you’re entering a lot of rich knowledge in a new domain, add some vocabulary to make your work easier and define it in ways that link it up to other existing vocabulary so that you can write very tersely the things you need to say.

,
Add Rule Macro Predicates

You could prove that arithmetic can be defined in terms of Set Theory, for example. So you could use 0 and #$SuccessorFn and could define all of the natural numbers that way. And you could define addition in terms of operations on this. But no one would use that to balance their checkbook.  It’s useful to add some abstractions like Arabic numeral sequences and things like this which are representations that effectively chunk a whole bunch of this functionality together.

For whatever reason, there seems to be a persistent reluctance to add new vocabulary in order to help one out in doing knowledge representation. This happens with Cyc users and with users of other knowledge representation schemes.  There seems to be a desire for simplicity in vocabulary almost like the desire for elegance just for the sake of elegance. Isn’t it interesting how we can express arithmetic in terms of Set Theory? Well, yes, but it’s not very useful for an automaton that has to reason with it.

,
"Simpler is Better"
,
Some False Ideas

  There’s an erroneous mindset out there that somehow vocabulary is expensive but arbitrarily complicated expressions are somehow as cheap as any other. This is what Doug Lenat refers to as “Physics Envy” where people in knowledge representation are wishing for the “Maxwell’s Equations of KR”: four simple little things which can express how to represent knowledge and put it on a T-shirt.  It’s just not like that.  In fact we have the reverse in our system. Adding new vocabulary is no more expensive than adding any other assertion about something and complex assertions are quantitatively harder to deal with than simple things like GAFs.

,
New Vocabulary Will Not Make Rules Less Reusable

If you’re concerned that creating new vocabulary will make your rules less reusable, think about it this way. Just as we expect people to reuse #$Dog when they are talking about the concept dog, and we expect to have tools available to enable someone to find that concept when they want to say something about dogs, we should have analogous tools for other kinds of vocabulary. If you’re looking for a predicate that can relate one type of thing to another type of thing, or a function which can denote a certain type of thing, or something else, there should be tools which help you find all of those kinds of things and reuse them. Moreover, the meaning of #$Dog is only defined by how it’s linked up to other things in the system, just like the meaning of the predicate #$isa is defined by how it’s used in the system.

So it’s not like the vocabulary for predicates and things like that has any additional burden that other terms don’t have as well.  If you expect #$Dog to get reused properly, it had better get hooked up as a spec of #$Animal, right?  It had better have links to other things that you would expect it to be linked to. So, the vocabulary is no different than other terms that we add to the KB.

Truly large-scale, reusable, interesting knowledge bases are going to be big. There must be tools which help you find what you’re looking for. This goes hand in hand with having a large knowledge base. There are other systems which have not spent as much time on tools because they tend to deal with small theories where you can print out the entire ontology in a tree-like graph that fits on one page. In this case you can use the human eye as a search tool. For small things you can get away with that, but for large-scale knowledge bases, you must have tools. For example, we have pretty good tools for walking the hierarchies.

,
OE vs KE vs SME’s and Their Vocabulary
,
The Rule of 10

This is a suggestion for when you create simplifying vocabulary, called the Rule of 10. Simplifying a bunch of assertions that you’re about to make is reason enough to create some new vocabulary. If you’re going to introduce a new term to the system, you should have about ten interesting things to say about it.

For example, if you’re going to say a whole bunch of stuff that keeps repeating the same pattern over and over, like the CycL that you see at the bottom of the slide that begins with #$and, then the concept of those people that have gender masculine would be worth factoring out, as you see in the CycL in the middle of the slide. Then you would hook up the things that you would say to #$MalePerson, instead of having to describe it every time you refer to this concept.
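
A sketch of that factoring in CycL, using #$MaleAnimal as a stand-in for the gender-masculine condition (the exact CycL on the slide may differ):

```cycl
;; The repeated pattern (sketch):
(#$and (#$isa ?X #$Person) (#$isa ?X #$MaleAnimal))

;; Factored out as a reified collection:
(#$isa #$MalePerson #$Collection)
(#$genls #$MalePerson #$Person)
(#$genls #$MalePerson #$MaleAnimal)

;; Subsequent assertions can then refer to the concept directly:
(#$isa #$FredSmith #$MalePerson)
```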

You should have about ten things to say about the average term in a system. If you have more than ten things to say about it, then maybe there is something that you’re not factoring out about it -- maybe there are sub-properties that are interesting to talk about. If you have less than ten things to say about it, then maybe it’s not a distinction worth making from other things in the system. In practice, if you look at the ratio of the number of assertions in our system to number of reified concepts in our system, for at least the last eleven years this ratio has hovered between 8 and 12.

Empirically, this seems to play out in representing a large knowledge base. If you find a predicate which, if you created it, you would know of at least ten uses for it and think there is potential for more in the future, then it’s worth considering creating it.

,
Why Simplify Vocabulary?
,
Summary
  • Simpler is Better
  • Use Rule Macro Predicates
  • Create Simplifying Vocabulary

This concludes the lesson on creating simplifying vocabulary.

pdf | zip | Writing Efficient CycL: Part 2
"Part Two"
,
"Factor out Commonality"

Usually, when similarity is not a coincidence, there is often some common abstraction that is worth factoring out.

,
N*M reduces to N+M

Let’s say that I have four collections (A, B, C, and D) and four specs of those collections (W, X, Y, and Z) such that collection A has W, X, Y, and Z as its specs, just as collection C has W, X, Y, and Z as its specs. To Cyc, it’s just an interesting coincidence that they all have the same specs. But often there is a reason that the specs are all the same; there’s some common property in the middle that explains the similarity. Maybe, if the specs are all this one type of thing (P), then this one type of thing is expected of all those collections.

These sorts of N by M lattices, depicted on the left side of the slide, can be reduced to N + M statements with a judicious extraction of why it is that they all have this in common, as depicted on the right side of the slide. Reducing the number of genls or genlMt relations that have to be stated explicitly saves the system a lot of work, especially when doing inference. In fact, for any situation where you have (in the case of microtheories) more than two microtheories with more than two specMts (N and M are greater than 2), the savings to the system equals [(N*M) - (N+M) - (the overhead of creating one new microtheory)]. In the simple case on the slide, this comes to 8 fewer genlMt assertions. The overhead involved in using one extra microtheory is negligible.

,
Commonalities in Collections

Let’s look first at an example involving collections. Suppose that we have the following four collections (shown at the bottom of the slide): Olympic Men’s Individual 400 Meter Butterfly, Olympic Women’s Individual 100 Meter Backstroke, etc. Each collection refers to all instances of that type of event.

,
Commonalities in Collections

In order for these collections to make sense, they need to inherit from (at least) the collections above them on the slide: Olympic Competition, Individual Competition, and Swimming Competition. Thus the collections along the bottom of the slide must be made specs of the collections at the top of the slide.

,
Commonalities in Collections

So we need to create genl links from each of the more specific collections along the bottom of the slide up to the more general collections along the top of the slide. Now we have 12 (3 * 4) genl links which allow each of the lower collections to inherit from the necessary collections above.

You and I can look at this and imagine that there are probably many more possible collections of the type “Olympic <Gender> Individual <Distance> <Stroke>.” The number of genl links necessary gets very high very quickly. We can also see that there is an obvious pattern in these more specific collections. Cyc doesn’t see the pattern (coincidences are possible and we don’t want Cyc to think there is a pattern when one is not really there). Why not factor out those properties that are in common among all of these collections and tell Cyc about it?

,
Commonalities in Collections

Factor out what is in common and put it in a new collection, called Olympic Individual Swimming. Now we need only 7 (3 + 4) genl links. The four lower collections inherit from the new collection and the new collection inherits from the three upper collections. The lower collections can still see the information they need, but now it’s all collected together in one place.
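
Sketched in CycL, with illustrative constant names:

```cycl
;; The factored-out middle collection inherits from the three general ones:
(#$genls #$OlympicIndividualSwimmingCompetition #$OlympicCompetition)
(#$genls #$OlympicIndividualSwimmingCompetition #$IndividualCompetition)
(#$genls #$OlympicIndividualSwimmingCompetition #$SwimmingCompetition)

;; Each specific event type now needs just one link:
(#$genls #$OlympicMensIndividual400MeterButterfly
         #$OlympicIndividualSwimmingCompetition)
;; ...and likewise for the other three: 3 + 4 = 7 links instead of 3 * 4 = 12.
```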

,
Commonalities in Microtheories

The same idea works with microtheories and genlMt links. Suppose that we’re representing in the KB the contents of each day’s Sports Section from USA Today. Each #$PropositionalInformationThing (“PIT”) is its own microtheory, so we have Mts (shown at the bottom of the screen) such as: “April 15, 2002 USA Today Sports Section Mt,” “June 2, 2002 USA Today Sports Section Mt,” and so on. Each of these microtheories contains the semantic content of the sports section of USA Today from a particular day. We’ll have about 260 of these for each year that we represent (weekdays only!). So if a record-breaking event takes place on April 14th, the April 15 Mt will contain a description of that event.

,
Commonalities in Microtheories

  In order for these Mts to make sense, they need to inherit from (at least) the microtheories above them on the slide: Rules for Games Mt (which describes how each sport is played), American Sports Competition Mt (which has information on schedules, team rosters, and which teams play in which leagues), and Current and Historical Stats Mt (which has current and historical statistics). Thus the microtheories along the bottom of the slide must be made specMts of the microtheories at the top of the slide.

,
Commonalities in Microtheories

  So we need genlMt links from each of the more specific microtheories along the bottom of the slide up to the more general microtheories along the top of the slide. Now we have 780 (3 * 260) genlMt links per year of Sports Sections, which allow each of the lower microtheories to inherit from the necessary microtheories above.

In reality, the number of genlMt links necessary would probably be greater than 3 * 260 per year, because each new issue would likely have to be linked to more than 3 microtheories. We can also see that each new PIT will have to inherit from the same microtheories. So we want the PITs to be able to refer to all of the microtheories at the top of the slide using a single referent. We can do this by introducing one new microtheory in the middle.

,
Commonalities in Microtheories

  Now we need only 263 (3 + 260) genlMt links to represent a year’s worth of Sports Sections. The lower microtheories inherit from the new microtheory and the new microtheory inherits from the three upper microtheories. The lower microtheories can still see the information they need, but now it’s all visible via a single Mt, the Rules, Competition, and Stats Mt.
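The arithmetic behind this saving can be sketched as follows (a toy Python illustration, not Cyc code; the function names `links_without_hub` and `links_with_hub` are invented for this example):

```python
# Counting the genlMt links needed to connect N specific Mts (daily
# Sports Section PITs) to M general Mts, with and without a single
# intermediate "hub" Mt such as the Rules, Competition, and Stats Mt.

def links_without_hub(n_general: int, n_specific: int) -> int:
    # Every specific Mt links directly to every general Mt.
    return n_general * n_specific

def links_with_hub(n_general: int, n_specific: int) -> int:
    # The hub inherits from each general Mt; each specific Mt
    # inherits only from the hub.
    return n_general + n_specific

assert links_without_hub(3, 260) == 780   # one genlMt link per pair
assert links_with_hub(3, 260) == 263      # 3 links up, 260 links down
```

The same arithmetic covers the collections example earlier in the lesson: 3 * 4 = 12 genls links reduce to 3 + 4 = 7.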

,
Microtheories + a Forward Rule

  You’ve seen how factoring out commonality saves work in inference by  reducing the number of genl/genlMt links involved. Now let’s look at another one of the benefits of factoring out commonality.

First, consider how a forward rule behaves when two or more of its literals and/or the rule itself are in different Mts. Let’s say that the KB contains some microtheories (X, Y, Z, and T) and T contains a forward rule (P ∧ Q ∧ R ⇒ S). The highest (most general) specMt that can see P, Q, R, and the rule will automatically have an S asserted in it.

,
Microtheories + a Forward Rule

  However, in a situation like the one on the slide, where several microtheories are all the highest (most general) specMt, poor Cyc can successfully conclude S, but it has no choice but to conclude S in all of these microtheories. If there are three hundred of these microtheories, Cyc will conclude the same S in all three hundred microtheories.

,
Microtheories + a Forward Rule

  So, you should abstract out an Mt from which all of the specMts inherit. Now Cyc can conclude S in one place, and all of the microtheories below inherit that conclusion from it. Not only are there fewer genlMt links, but S is only asserted in one place, so inference involving S or any of the specMts is much simpler.

This is a very general mechanism that I encourage you to use when you have similar things like this in your system: factor out that which explains the commonality.

,
"Factor out Commonality"
,
Concluding #$thereExists

  The fact that we have this nice, complicated predicate logic language doesn’t mean that all constructs one can state in it are equally easy to reason with. In particular, things that conclude #$thereExists are a little more expensive than others, because they are basically extending the number of concepts and symbols in the language that the system might have to think about. In order to conclude thereExists, our system introduces a function under the covers which will denote the thing that so exists, so that the system can reason with it. This process is called skolemization. If you state a lot of existence statements, you will be creating a bunch of these under the covers.
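The skolemization idea can be sketched in a few lines of Python over tuple-encoded formulas (a minimal illustration, not Cyc’s actual machinery; the `SKF-n` names and the `anatomicalParts` literal are invented for this example):

```python
# Minimal skolemization sketch: replace (thereExists ?X body) with body
# in which ?X becomes a Skolem function term applied to the enclosing
# universally quantified variables.

import itertools

_counter = itertools.count(1)

def substitute(formula, var, term):
    if formula == var:
        return term
    if isinstance(formula, tuple):
        return tuple(substitute(part, var, term) for part in formula)
    return formula

def skolemize(formula, universal_vars):
    if isinstance(formula, tuple) and formula and formula[0] == "thereExists":
        _, var, body = formula
        # Introduce a fresh function term "under the covers".
        sk_term = ("SKF-%d" % next(_counter),) + tuple(universal_vars)
        return substitute(skolemize(body, universal_vars), var, sk_term)
    if isinstance(formula, tuple):
        return tuple(skolemize(part, universal_vars) for part in formula)
    return formula

# "Every elephant has a head": the head becomes a function of the elephant.
f = ("thereExists", "?HEAD", ("anatomicalParts", "?ELEPHANT", "?HEAD"))
result = skolemize(f, ["?ELEPHANT"])
print(result)
# → ('anatomicalParts', '?ELEPHANT', ('SKF-1', '?ELEPHANT'))
```

Each existential you assert creates one of these hidden function terms, which is why concluding #$thereExists carries extra cost.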

,
State Existence Generally

  It’s also a pervasive problem that existence is stated far too specifically in many cases. Thus you see things like: Every elephant has a head. Every elephant has a head with a trunk. Every cow has a head. Etc. It is very ennobling to state that things actually exist, rather than just state constraints on things were they to exist. People love to state that things actually do exist.

However, it seems like you could state a very general property like uniquePartType, that matches elephant to head, and then write one very powerful rule that says for each object there exists a unique object for each of its uniquePartTypes. That one very general rule states the existence. Then you could state constraints on those things that exist more specifically. If you’re writing a rule that concludes thereExists, I would strongly encourage you to generalize it as much as possible.

,
"Factor out Commonality"
,
Overuse of #$not

The case on this slide is an example of what we consider an overuse of #$not in the system. Let’s say that someone wants to write a rule that says that birds are by default capable of flying. Then they realize “oh, but not penguins” so they stick a #$not in there. Then they get hungry and go out to lunch and forget about this. Well, what about all of the other cases like those listed on the slide? Emus? Birds with broken wings? And all of the other conditions that would make a bird unable to fly?

These are exceptions and they deserve to be treated exceptionally. In our system, it does not make sense to state these rare, negated checks directly in the rule. Doing so mixes the intent of the rule together with meta-knowledge about when the rule doesn’t apply. It’s good to separate the two.

,
Use Default Reasoning

  Because we support default reasoning in our system, the best thing to do in the situation described on the previous slide would be to state the nice general rule that you want (i.e. birds are by default capable of flying) and then state exceptions to it. So this rule has an exception to it where if the bird is a flightless bird, then this rule doesn’t apply. Then you can add some nice ontologizations of penguins, emus, dodo birds -- all kinds of flightless birds.
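The pattern can be sketched in Python (a toy illustration with invented names; real Cyc states exceptions declaratively rather than procedurally):

```python
# Default rule with exceptions: birds fly by default, and flightlessness
# is ontologized separately rather than wired into the rule with #$not.

flightless_bird_types = {"Penguin", "Emu", "Ostrich", "DodoBird"}

def capable_of_flight(bird_type: str) -> bool:
    # Exception check: does a stated exception override the default?
    if bird_type in flightless_bird_types:
        return False
    # The nice general default rule: birds can fly.
    return True

assert capable_of_flight("Robin") is True
assert capable_of_flight("Penguin") is False
```

Adding a new flightless bird means extending the ontology of exceptions, not rewriting the rule.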

,
Postpone Checking for Exceptions

  In our system, when we produce a candidate inference, and we find a rule that was used to produce it, if that rule has exceptions stated about it, then, at the very end, we can easily check to see if the exceptions are true. Since these are by default exceptional, they are rarely true; so we save a lot of work by postponing this check until the end when we have to. So when you see yourself writing #$not’s in the antecedent, you should ask yourself “is this really part of the warrant of the rule? Is this what the rule really means, or is this a meta-statement about cases where that rule wouldn’t apply and should be stated as an exception?” If it’s the latter, then use exceptions.

,
"Factor out Commonality"
,
Underuse of #$not

  Other systems, which use negation-as-failure as their primary negation mechanism, sort of assume that everything stated in the system is true and that, by default, anything not stated is false. That’s how they get some of their “not”s. This case is an example of underuse of #$not. It’s much better to be able to strongly conclude that something is not true than to assume it’s not true. This is especially the case in our system, because we often use negations in proving inferences. Negations are the very thing that allow you to do most of the interesting pruning.

We have some very powerful vocabulary to state negations in our system. #$disjointWith between two collections allows you to state that none of the things in one collection are going to be in the other collection. #$disjointWith between #$PartiallyTangible and #$Intangible trivially implies on the order of ten billion negations -- one single #$disjointWith plus the reasoning involved. Do I have to assume that a table is not an integer? No. I don’t have to assume that, because one #$disjointWith covers that and billions of other reasons that things aren’t other things.
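Here is a toy sketch of how one disjointness assertion licenses many negations (the tiny KB, genls links, and function names are invented for illustration):

```python
# One disjointWith assertion between two high-level collections lets us
# conclude negations for every pair of collections beneath them.

genls = {  # child collection -> parent collections
    "Table": ["PartiallyTangible"],
    "Dog": ["PartiallyTangible"],
    "Integer": ["Intangible"],
}

disjoint_pairs = {frozenset(["PartiallyTangible", "Intangible"])}

def all_genls(col):
    # Collect col and all of its ancestors in the genls hierarchy.
    seen, stack = set(), [col]
    while stack:
        c = stack.pop()
        if c not in seen:
            seen.add(c)
            stack.extend(genls.get(c, []))
    return seen

def provably_disjoint(c1, c2):
    # True if some ancestor of c1 is asserted disjoint with some
    # ancestor of c2 -- so no instance of c1 can be an instance of c2.
    return any(frozenset([a, b]) in disjoint_pairs
               for a in all_genls(c1) for b in all_genls(c2))

assert provably_disjoint("Table", "Integer")   # one assertion covers this
```

A single disjoint_pairs entry here rules out Table/Integer, Dog/Integer, and every other cross-pair, which is the point of the slide.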

,
Powerful Negations

  A very powerful form of #$disjointWith is to state that an entire taxonomy of collections has this interesting property. This is what #$SiblingDisjointCollection is for. Take the Linnaean taxonomy of life, which says that for any two taxa, either one is a spec of the other, or they’re disjoint. This gives you a nice, perfect tree. Asserting #$SiblingDisjointCollection about the animal taxonomic types allows you to conclude that a chicken is not a giraffe, a giraffe is not a wildebeest, a wildebeest is not a wombat, etc. All of these things from one statement plus knowledge about the Linnaean taxonomy.

Two other interesting predicates are #$completeExtentKnown and #$completeCollectionExtent. The former applied to a predicate says “the complete extent of this predicate is stated in the KB. So if it’s not there it’s false.” This is useful for things like #$nationalBorder. There are only so many countries in the world and they all have on average something like 2 or 3 borders, right? So with about 700 or fewer assertions you could state every border in the world and then state that’s it, that’s all of them. Then you could reason that Canada does not border on France just from knowing that I don’t know it and I know I should know it if it’s there.

,
Another Powerful Negation

  The analog for collections is #$completeCollectionExtent. I strongly encourage you to use this predicate about certain collections that are complete. This allows you to iterate over the current instances of the collection in inference rather than always having to wait until some other part of inference returns candidate instances. A good example of this is #$MonthOfYearType (based on the Julian calendar: January, February, March, etc.). There are only twelve of those -- there is not going to be a thirteenth. So once you have all twelve stated, tell the system that that’s it.

Note that these talk about the current theory only. So as far as the current theory is concerned, those are all of the months. In the future, if the world changed and we were to add “Yule” as something between December and January, a new two-day month, then we could update that and then there would be thirteen months and that would be all of them. Go ahead and use these things; and in the future, if you add something new, it just means that the theory has been revised, just like any other -- just like adding any other new vocabulary.

,
"Factor out Commonality"
,
Disjunctions? Try to Generalize

  The example on the slide is a particular overuse of #$or which I call “list-em” KR. People just start listing stuff in an order that hints at what they’re trying to get at. Consider the rule on the slide about sibling rivalry. It says “if there’s a person and there’s something which is either that person’s brother or their sister, then they feel rivalry towards them.” Remember, #$or can be an arbitrary disjunction. So following the #$or you could have something which is their sister, or their hair is green, or something else. The rule on the slide is a case where the intended meaning of #$or is not arbitrary. There’s an intended commonality among these two things. They are dancing around the concept of sibling. What they really should have is a predicate called #$siblings. If there were a predicate called #$siblings, which had #$brothers and #$sisters as specPreds, then this would be a very simple rule.

If you’re writing sentences that have #$or’s in them, ask yourself if there is some generalization that you’re wishing was there, that you’re having to describe and circumscribe around with your disjunction instead of a more appropriate predicate. Rules like the one on the slide devolve into proof by cases. Proof by cases, however, can be handled much more efficiently using genlPred’s if they already exist.

,
"Factor out Commonality"
,
Use #$different; Avoid (#$not (#$equals ...))

This is very Cyc-specific. We have a version of “not equals” called #$different. #$different can take any number of items and state that they all have to be different. So if you have n items, rather than writing roughly n²/2 not-equals statements, you can just say that they’re all different. This has a nice HL module implementing it, which fails fast if two identical things show up in it. This is good for pruning, too.
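The saving is easy to see in a sketch (helper names invented; this is the n-choose-2 arithmetic, not Cyc’s HL module):

```python
# One all-different constraint vs. pairwise (#$not (#$equals ...)):
# n items need n*(n-1)/2 pairwise statements, but a single
# all-different check covers them all and can fail fast.

def pairwise_not_equals_count(n: int) -> int:
    return n * (n - 1) // 2

def all_different(items) -> bool:
    # Fail fast as soon as a duplicate appears.
    seen = set()
    for item in items:
        if item in seen:
            return False
        seen.add(item)
    return True

assert pairwise_not_equals_count(10) == 45   # 45 statements vs. one
assert all_different(["Hamlet", "Gertrude", "Ophelia"])
assert not all_different(["Hamlet", "Hamlet"])
```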

,
Summary
  • Factor out Commonality
  • Existence is Expensive
  • Exceptions are Exceptional
  • State Negations Explicitly
  • Generalize - Don’t List
  • Use #$different
  • Avoid (#$not (#$equals ...))

This concludes Part 2 of the lesson called “Writing Efficient CycL.”

Inference in Cyc

pdf | zip | Logical Aspects of Inference

We’ll be talking about Inference in Cyc.  Inference is the mechanism we use to conclude new facts from other existing facts and rules in the system.  We’ll be talking about it in four sections and the first section will be about the logical aspects of inference.

,
Inference uses Deduction: Rules

  “Rules” are very general statements in a formal logical language, such as this formula, under “Rules,” which essentially says that everyone loves their mother. If ?MOTHER is the mother of some ?PERSON, then that ?PERSON loves that ?MOTHER.  It’s a conditional statement.  It’s general -- it doesn’t apply to one specific thing and it quantifies over objects in the world.  As such it usually has variables (i.e. “?PERSON”) in it and there are usually multiple literals, or statements, that are connected together by logical connectives (i.e. “implies”).

Rules tend to be the complicated logical statements.  The other statements in the system can be colloquially described as “facts.”

,
Inference uses Deduction: Facts

  “Facts” tend to be about some very specific thing in the world -- unlike rules which are general.  They tend to be ground, meaning that they don’t have any variables in them, because they’re talking about some specific case, they’re not quantifying over the world.

Facts are also atomic, which has to do with it being a single statement, like “the mother of Hamlet is Gertrude.”  It is the application of a predicate, #$mother, to arguments, #$Hamlet and #$Gertrude.  So the rule from the previous slide would not be atomic because there are two literals connected together by a logical connective to say that if this (a person has a mother) were the case, then this (a person loves their mother) would be the case.  So something is “atomic” if there is no conditional aspect to it, if it’s just a single statement.

“Facts” are more formally called ground atomic formulas because they are formulas that are both ground and atomic.

,
Inference uses Deduction: Non-atomic terms, Predicates, and Functions

A “non-atomic” term would be a functional term, something like (#$BabyFn #$Jaguar).  Let’s say you had a function called #$BabyFn that applies to any animal type and you can use it to denote the baby forms of any animal.  So you could denote a baby cat, a baby whale, or a baby jaguar.

There is a distinction between predicates and functions in logic.  Predicates are statements about the truth of something -- it is either true or it’s not.  In the case of (#$mother #$Hamlet #$Gertrude), it’s stating “it is true that the mother of Hamlet is Gertrude.”  Functions denote a new term that you’re talking about.  The application of #$BabyFn to #$Jaguar does not say “it is true that BabyFn of Jaguar.”  That doesn’t make any sense.  The application of #$BabyFn to #$Jaguar allows you to denote a new concept, that of “Baby Jaguar.”  Functions and predicates can both be used to make what Cyc calls “formulas.”

,
Inference uses Deduction: Formulas and Logic

  A formula is just an operator applied to arguments.  In that sense, logical connectives like #$implies, #$and, and #$or are also operators.  So the previous example of a rule is a formula as well.  The operator #$implies takes two formulas as arguments and makes a larger formula.  This is what a logical connective is -- it is something that connects formulas into more complex formulas.

The basis for inference in Cyc is just performing the standard logical deductions or syllogisms that you learn in logic classes: All men are mortal.  Socrates is a man.  Therefore, Socrates is mortal.  This is just to point out that the basis of our system is not Bayesian Reasoning or fuzzy logic or something like that.  For every inference step we make, we actually have a logically-sound proof behind it that you can look at to see “X is true specifically because I did a deduction and Y and Z are the two things which together allow me to conclude it.”  This will be important later, when we describe how we perform truth maintenance in our system.

,
Inference uses Deduction: Rules + Facts

  Our system uses logical deduction as the basis for inference.  This can be tersely summarized as the application of rules plus facts to conclude some new facts.

An example of a fact is “the mother of Hamlet is Gertrude.”  You can use this fact plus the rule from previous slides to perform a logical deduction.  This deductive inference would be that because of this rule (everyone loves their mother) and this fact (the mother of Hamlet is Gertrude) you can logically conclude that Hamlet loves Gertrude.

,
The Resolution Principle

Let’s talk in a little more detail about how we perform these logical deductions.  The specific method that we use is the “Resolution Principle.” The Resolution Principle is a very standard logical mechanism for connecting two logical statements to come up with a third one.  The algorithm that embodies the resolution principle can be tersely summarized as “Unify, Substitute, and Merge,” in that order.  I’ll describe each of these steps in this next example.

Assume for a moment that I have a question that I’m asking the system: “Who is it that Hamlet both knows and loves?”  I want the system to try and prove what answers for “Who?” are in our system.  So, given that the formula on the left of the slide is the query that I’m asking, and we have the rule on the right of the slide in our system (which says that everyone loves their mother), then one way of concluding who someone loves would be if you know who their mother is.

,
The Resolution Principle: Unify

  I can use the query and the rule to perform one logical deductive step: I can prove an answer for your query if I can prove an answer for something else because of some rule in the system.  In effect, an application of the Resolution Principle allows you to take one logical formula and use it in concert with the thing you’re trying to prove to help you turn it into something else you should try and prove.  This can either be more complicated or less complicated than what you start with.  I’ll talk about the ramifications of both of those options later.  For now, let’s look at how we would perform this one step in this case.

The act of combining the query and the rule to help answer part of the query is one deductive step.  The following is how we apply the resolution principle to this situation.  The key is to identify one particular literal which is common between the two.  In this case, we have the literal “Who does Hamlet love?” and a rule that would allow us to conclude who everyone loves.  “Who does Hamlet love?” would be called the “pivot literal” in the query and “Person loves its mother” would be called the “pivot literal” in the rule because these are the literals around which the whole step pivots.  So we identify that we are going to try and prove this literal, (#$loves #$Hamlet ?WHO), using the rule on the right of the slide.

,
The Resolution Principle: Unify

  So we have decided to see if we can prove the loves part, the third line of the query, using the third line of the rule.  We can do it by unifying the “loves” from the query with the “loves” from the rule.  The unification step is attempting to identify a way of making the two “loves” literals exactly the same.  This is the essence of Unification.  We’re trying to come up with a recipe for how to make both of these actually exactly the same.  That recipe is referred to as the “Most General Unifier.”  This is a recipe for substituting one thing for another such that if you did the substitution, then you would actually have the exact same thing.  So, in this case, we can get the literals to be exactly the same with this proviso: if #$Hamlet is matched up with ?PERSON and ?WHO is matched up with ?MOTHER.  It’s called the Most General Unifier because we make the most conservative recipe possible -- we’re not committing to anything more than we have to in order to get these things to match, and that makes it most general.

So once we’ve identified the Most General Unifier, the most general recipe for making them the same, we’ve completed the unifying step.
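The Unify step can be sketched in Python over tuple-encoded literals (a minimal illustration; the function names are invented, and a production unifier would also need an occurs-check):

```python
# Minimal unification: variables are strings starting with "?";
# literals are tuples. unify returns a most general unifier as a
# dict of variable bindings, or None if the literals don't unify.

def is_var(term):
    return isinstance(term, str) and term.startswith("?")

def _bind(var, val, subst):
    if var in subst:
        return unify(subst[var], val, subst)
    subst[var] = val
    return subst

def unify(a, b, subst=None):
    subst = dict(subst or {})
    if a == b:
        return subst
    if is_var(a):
        return _bind(a, b, subst)
    if is_var(b):
        return _bind(b, a, subst)
    if isinstance(a, tuple) and isinstance(b, tuple) and len(a) == len(b):
        for x, y in zip(a, b):
            subst = unify(x, y, subst)
            if subst is None:
                return None
        return subst
    return None

# The slide's pivot literals:
mgu = unify(("loves", "Hamlet", "?WHO"), ("loves", "?PERSON", "?MOTHER"))
assert mgu == {"?PERSON": "Hamlet", "?WHO": "?MOTHER"}
```

Note that the result commits to nothing beyond what is needed: ?WHO is matched to ?MOTHER rather than to any particular constant, which is what makes the unifier most general.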

,
The Resolution Principle: Substitute

  Given the Most General Unifier, we can apply this recipe to both of them to perform the Substitution step.  We apply this recipe to one literal in order to get it to be the same as the other.  If I substitute into the rule, #$Hamlet for ?PERSON and ?WHO for ?MOTHER, I’ll then have something in which the third line of both the query and the rule exactly match.  We also apply the Most General Unifier to the rest of the query and the rule.  This is a simple unification, but it is common in the inference that Cyc normally does.

,
The Resolution Principle: Merge

  The final step is the Merge step, where we then take the remaining pieces of both and merge them together into something like what you see at the bottom of the slide.  The loves part disappears because that’s the thing that actually got proved in this step.  The result of the merge is a combination of the two substituted pieces together.  So, we start out with “Who does Hamlet know and love?” and using this rule, “Everyone loves their mother,” we can turn our original query into “Who does Hamlet know? And is that person Hamlet’s mother?”  If I were to find an answer for the new, merged question, then because of the rule, it would be an answer for the query.

So what results is a new thing, shown at the bottom of the slide, that we try and prove.  It indicates that any answer for the formula, because of the rule on the right of the slide, would be an answer for the query on the left of the slide.
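The Substitute and Merge steps can be sketched as follows (representation and helper names invented; `apply_subst` plays the role of applying the Most General Unifier):

```python
# One resolution step: drop the pivot literal, apply the MGU to the
# remaining pieces of the query and the rule, and merge them.

def apply_subst(formula, subst):
    if isinstance(formula, tuple):
        return tuple(apply_subst(part, subst) for part in formula)
    while formula in subst:       # follow chained variable bindings
        formula = subst[formula]
    return formula

# Query: Who does Hamlet know and love?
query = [("knows", "Hamlet", "?WHO"), ("loves", "Hamlet", "?WHO")]
# Rule: (implies (mother ?PERSON ?MOTHER) (loves ?PERSON ?MOTHER))
rule_antecedent = [("mother", "?PERSON", "?MOTHER")]
pivot = ("loves", "Hamlet", "?WHO")    # the literal proved in this step
mgu = {"?PERSON": "Hamlet", "?WHO": "?MOTHER"}

merged = ([apply_subst(lit, mgu) for lit in query if lit != pivot] +
          [apply_subst(lit, mgu) for lit in rule_antecedent])
assert merged == [("knows", "Hamlet", "?MOTHER"),
                  ("mother", "Hamlet", "?MOTHER")]
```

The merged result is exactly the slide’s new question: “Who does Hamlet know? And is that person Hamlet’s mother?”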

,
The Resolution Principle using a fact

 
So, the act of inference is taking things that you're trying to prove and applying things you already know in order to find different things to prove.  Hopefully these things you are trying to prove eventually reduce down to something like “true” which is patently true already and does not need to be proven.  Therefore, the path of how you got there would provide one complete logical proof of an answer.

Let’s look at another example of the Resolution Principle -- one that uses a fact instead of a rule.  Let’s say that we actually have “Hamlet loves Ophelia” as a fact instead of a rule.  We’re still asking “Who does Hamlet know and love?”  I can use the fact “Hamlet loves Ophelia” to resolve against the loves literal in the query.  In this case, the Most General Unifier would be simpler.  It would just be matching ?WHO with #$Ophelia.  All that would be left to prove is if Hamlet knows Ophelia, so down below we have a very simple statement.  This would usually be proved using some other knowledge, for example Ophelia is Hamlet’s friend and everyone knows their friends, or something like that.  With this, I could reduce the formula (#$knows #$Hamlet #$Ophelia) down to #$True.  There would be nothing left to prove.  It would be like there being an empty degenerate (#$and) in the query, which is equivalent to #$True.

This example shows that the use of a fact strictly simplifies a proof, while the use of a rule usually leaves a proof at least as complicated and sometimes more complicated (if it has more conditions in the antecedent) than the original.

,
Resolving to #$True

  Let me say more about proving “True.”  Once you arrive at #$True as the final thing to prove, you can realize that #$True is true, so you’ve actually finally proved it -- you’ve reached the end.

This leaves you with one complete deductive proof such that you can walk backwards and say “What did I use in each step along the way?” and collect up the formulas that you used along the way to make each step.  By collecting all of the Most General Unifiers involved in the proof, you can see things like “what did this variable ?WHO get bound to?  Oh, it eventually got bound to #$Ophelia here.”  So this kind of information keeps getting passed up to the top and eventually out the top will bubble an answer.  “Here is one answer for what you asked for and the path all the way down has each step along the way -- the intermediate steps (Most General Unifiers) and how you proved them, and what you used to justify this answer.”

Next we’ll be talking about how we actually algorithmically perform all of these resolution steps.  We just described logically how Cyc performs inference and in the next lesson we’re going to describe in more detail the mechanism that we use to perform these logical deductions.

,
Summary
  • Inference uses Deduction
    • Facts + Rules => New Fact
    • Rules vs. Facts
    • Predicates vs. Functions
  • Inference uses Resolution
    • The Resolution Principle: Unify, Substitute, Merge
    • Resolving to #$True

This concludes the lesson on the logical aspects of Inference.

pdf | zip | Incompleteness in Searching
Inference in Cyc

We described earlier how inference can be viewed as a sequence of logical deductions chained together.  At each point along the way, you might often have different ways that you could now perform a new deduction.  So, in effect, from what you start with there is a branching set of possibilities for how you can prove it.  And that branching set can spread out in all kinds of different, novel ways.

,
The Search Tree

  For example, you might want to try to prove “Who does Hamlet know and love?” based on information about his family relationships.  Or you might want to try to prove it based on knowledge of his patriotism, or his knowledge of his king -- information like this.  So, you’re going to find that it’s not just a single path that you’re trying to prove.  You’ll end up having this forking, branching set of possibilities.  Therefore it’s very natural to view inference in our system as a kind of search problem where it’s much like a search tree.

You start out at the top of the tree, the root, with the branches as steps away going down.  You can imagine the top of the tree as being the query that you actually asked.  Each step down to child nodes in this tree can be viewed as one potential logical deduction that would take you from trying to prove the original query into trying to prove a different query below using this logical deductive step.  The fan out of possibilities can be viewed as this branching tree, getting bushier and bushier and deeper and deeper.

  So, for purposes of explaining this process, pretend that you magically have available to you something that will tell you all of the things that you could try to prove in order to answer the query.  Each of the things you could try ends up as one step down to a child node.  Later we’ll talk about how we deal with actually finding useful paths.  For now we’ll talk about how to decide amongst them.

,
Justifying the Answers

Imagine that each node in this tree represents something that you’re trying to prove.  Each link from a parent node higher up to a child node below represents one logical deduction that you did to go from the query above to the query below.  So, associated with that is going to have to be the formula you used to go there and the Most General Unifier you used (the recipe that you used to match the formula above with this rule to get this formula below).

Since each dashed line represents one potential path, you can see that a path from where you start down through the tree (shown above with a solid pink line) represents just one unique proof approach or path that you’re attempting.  You want this proof to end, or be successful, which would be identified in this metaphor by finding a child node in the tree where you have nothing left to prove.  This would be where the thing to prove at that leaf node is ‘true’ -- like the ‘base case’ of this.  We refer to this kind of node in the tree as a ‘Goal Node.’  This is where you have found one successful end to this proof -- there’s nothing left to prove here.  When you find these, then the path all the way up represents one unique, justified reason for one set of answers for the variables in your query.  You can identify that by gathering up all of the formulas you used in each deductive step and looking at the Most General Unifiers -- asking what the variable finally got bound to and passing that information up.

,
Justifying the Answers (cont.’d)

Note that if there are no free variables then the answer you’re trying to prove is “Can I prove it true or not?”  In this situation a goal node would be “yes, I was able to prove it true.”  Along the path to the goal node, I may have introduced some variables along the way such that the top formula would be true if you could find some individual with some particular property.  And then later I might find an individual who actually has that property.

So it could be that you introduced variables along the way and the actual fact of what those individuals were doesn’t show up in the bindings* at the top, but they would show up in the justification.  So if there are intermediate steps along the way in your proof, these would show up as intermediate steps in the justification.

*Bindings are values that have been assigned to variables in the unification process.

,
Incompleteness in Searching

Now the problem is that you have a big tree of possibilities and you often have cases where this proof could just go on forever.  Let me now describe some potential situations for proofs that go on forever and how we deal with them.

In any sufficiently interesting or complex logical system, there is going to be an arbitrarily large number of potential proofs that you can make.  Some of them are arbitrarily long and you may not know whether you’re ever going to find an end to this proof.  This is the essence of two of the main problems for computer science; they’re often referred to as the “halting problem” and “incompleteness of logic.” Gödel proved in the 1930’s that any sufficiently complicated logical system is inherently incomplete.  There are going to be statements that you just cannot logically prove.  His argument for that is related to this other problem, the halting problem.

,
The Halting Problem

The halting problem refers to the fact that for certain algorithms, you cannot look at the algorithm plus its starting data and determine whether the algorithm will ever end in an answer.  It may run and run, and you don’t know whether, if you just give it another minute or so, it might actually end in an answer or just run forever.  When we talk about Cyc we’re not talking about a system with only ten axioms and five facts to start with; we’re talking about millions of facts and tens of thousands of rules that can chain together in arbitrarily complicated and interesting ways, so the space of potential proofs is infinite.  And you’re dealing with a tree which logically is infinitely large for many interesting queries.  Due to this, you will run into some inherent incompleteness issues; for example, you cannot simply say “let’s just manifest this entire tree” or “let’s look at every possible proof and gather up all the answers” because there just aren’t enough seconds in the history of the universe for this.

You will never be able to convince yourself, because of logic and the halting problem, that you’re done yet.  Maybe if you work a little harder, you’ll get even more and more.  So we immediately ran into the problems of incompleteness in our system.  The algorithm I’m about to describe is expressly for the purpose of dealing with the problem of incompleteness in our system.

,
Depth-first Search

We run into incompleteness because the search tree that we’re describing here is just too large.  So our approach is to only search portions of the tree.  There are well-known strategies from Artificial Intelligence for how one addresses search problems like this.  One strategy is to search the tree in a “depth-first” fashion.

A depth-first search would start at the top of the tree and go as deeply as possible down some path, expanding nodes as you go, until you find a dead end.    A dead end is either a goal (success) or a node where you are unable to produce new children (you don’t have enough information to keep going deeper).  An example of an unsuccessful node would be if you came to a node where you could continue if you just knew if Nelson Mandela had a PhD in physics, but this fact is not in the system.  So the system can’t prove anything beyond that point.  The system is not omniscient.
 

,
Depth-first Search Traversal

  Let’s walk through a depth-first search and traversal of this tree.  You start at the top node and go as deeply as possible:

1)  Start at the highest node
2)  Go as deeply as possible down one path

,
Depth-first Search Traversal

  3)  When you run into a dead end, back up to the last node that you turned away from.  If there is a path there that you haven’t tried, go down it.  In this case, we have to go up several nodes before we arrive at a node with another option.  Follow this option until you reach a dead end or a goal.

,
Depth-first Search Traversal

4) This path leads to another dead-end, so go back up a node and try the other branch.

,
Depth-first Search Traversal

  5) This path leads to a goal.  In other words, this final node is a case in which you have to prove that #$true is true.  So you have one answer.  Keep searching for other answers by going up a couple more nodes and then down a path you haven’t tried.

,
Depth-first Search Traversal

6) Another dead-end!  So go back up a couple of nodes and try another path.
Continue going as deeply as possible on every path, bounded only by goals and dead-ends.
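The traversal just walked through can be sketched in a few lines of code.  This is only a toy illustration -- the tree, the node names, and the goal set below are invented for the example and have nothing to do with Cyc’s actual data structures:

```python
# Depth-first search over a toy proof tree: dive down one path,
# back up at dead ends, and record every goal found.
# "goal" nodes stand for sub-proofs that bottom out in #$true.
TREE = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],      # a1 is a dead end, a2 leads on
    "a1": [],               # dead end: no way to produce children
    "a2": ["goal1"],        # goal: nothing left to prove
    "b": ["b1"],
    "b1": [],               # another dead end
}
GOALS = {"goal1"}

def depth_first(node, found):
    if node in GOALS:
        found.append(node)          # a successful proof
        return
    for child in TREE.get(node, []):
        depth_first(child, found)   # always go deeper first

answers = []
depth_first("root", answers)
print(answers)                       # -> ['goal1']
```

Note that the recursion stack is exactly the “path back up” that a depth-first search has to remember.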

,
Depth-first: advantage and disadvantage

 
The advantage of depth-first search is that it is a very space-efficient way to search a tree.  It limits the amount of space that you have to keep for remembering the things you haven’t looked at yet.  All you have to remember is the path back up and, for each of the steps along that path, the sibling nodes that you haven’t explored yet.  The amount you have to remember is proportional to the average branching factor of the tree times the depth of the tree.  So it scales roughly with the depth of the tree, and depth is a rather scalable thing -- not very explosive.

The disadvantage with depth-first search is that once you get started down some path, you will trace it all the way to the bitter end -- and we’ve got some infinite bitter ends in our system.  So it can be like Alice in Wonderland going down the rabbit hole -- there’s no end to this rabbit hole.
 

,
A Depth-first Rabbit Hole: “Cyc-ic Friends”

  Let me give you an Alice-in-Wonderland-like proof from our history.

Several years ago we had a system which was meant to demonstrate two Cyc systems collaborating to answer something (we referred to this as “Cyc-ic Friends”).  We had two Cycs.  One had a bunch of political knowledge; one had a bunch of geographic knowledge.  If you asked some geo-political question like “Who are all of the elected leaders of countries north of the equator?” neither Cyc alone had enough information to actually answer the question.  But when they collaborated, the geographic Cyc would try to answer until it got stuck on a political thing.  Then it would say, “I know you know some political stuff -- I’ll ask you.  If you can answer this question, I’ll take your answer and splice it in and continue.”  So if they can collaborate, then the two of them together can answer something that neither one alone could.

In this particular situation, we asked “Who are all of the elected leaders of countries north of the equator?”  It came up with some of the obvious answers quickly: Bill Clinton, John Major, etc.  And then it sort of ground away for a while.  At one point I interrupted it to see what it was actually attempting to prove.

,
A Depth-first Rabbit Hole: “Cyc-ic Friends”

 
It was way down the rabbit hole on one proof, trying to prove that #$NelsonMandela was an instance of #$LawnFurniture.  “What?!!”  It turns out that it had gone really deep down some potential path and had clearly reached something that was going to be logically nonsensical, so this path was doomed to fail.  But it still had lots of mechanisms available that it thought it could use to keep trying.  So this is an example of the halting problem: it thought it still had some mechanism available -- if I just work a little bit harder, I might be able to actually prove that he is lawn furniture.  It turns out that at that point we were missing some constraint about the disjointness between #$BiologicalArtifacts and #$Furniture -- something like that.

This highlights another aspect of why our system is so large.  At each step along the way it had made one logical deduction, conditioned on some default-true rule.  Each of the steps was based on a rule which, locally, is almost always true.  Notice that Cyc did not produce a nonsensical answer, although it was doing some reasoning that appeared nonsensical.

,
A Depth-first Rabbit Hole: “Cyc-ic Friends”

In this example, the steps were something like the following:
Certain objects are usually only found north of the equator.  Certain of those objects are geographic regions of certain types.  Certain geographic regions are typically outdoor regions.  Certain objects would be found in an outdoor region.  Certain other objects are found outdoors.  And certain objects, like lawn furniture, are typically found outdoors….

Each of these steps along the way involves a default-true rule which, in isolation, is almost always true -- there are some exceptions.  Well, when you start chaining a number of these default-true rules together, the further afield you get, the more likely you are to run into some exceptional condition.  So, at some point along the way we crossed the line and found ourselves attempting to prove that a person is an artifact.

,
The Halting Problem: a Trade-off

This gets us into another trade-off.  At every point along the way in doing these proofs, you’re often faced with the halting problem -- should I just do a little more work to try to finish this proof?  Or should I try to do a meta-proof, looking at the entire proof and asking whether it is nonsensical -- can I prove that I shouldn’t even bother trying to do this proof?  You can spend your effort either doing more of the proof or trying to prove whether the proof is going to be fruitful at all.

You might not be able to prove, for example, that Nelson Mandela is not a piece of lawn furniture -- maybe you’re missing some constraint.  Maybe he is.  Maybe certain people are robots and certain robots are robotic lawn furniture, right?  You’re missing some key constraint, so you might not be able to prove that this whole line of reasoning is nonsensical.  So you’re facing this trade-off: you only have so many resources available.  You can spend some of them cranking out more of this tree, or you can spend some of them trying to prune off large sections of the tree -- saying, don’t bother, this is never going to come up with an answer.

,
The Halting Problem: a Trade-off

This is analogous to doing a geometry proof.  You might be forced to try to do a constructive proof for a while.  Then you run into problems and say, “Wait a minute.”  Now I’m going to use a different mode -- something like proof by contradiction.  I’ll assume that the thing I’m trying to prove is not true and see if I can run into some cheap contradiction.  If I do, then my original assumption must not be true.  Therefore, that’s a proof.

So it’s a lot like deciding when to switch modes.  Can I prove this is true?  Or should I switch into a different mode where I try to prove that this whole thing could not possibly be true, and save myself a bunch of future work by pruning it off here?  Because we’re dealing with actual computer hardware, a finite machine, you have to decide how you’re going to spend your resources.  It’s a difficult trade-off between trying to do more and trying to prove that you shouldn’t be trying to do more.  Work harder or work smarter?  You could spend all day analyzing your job and never do your job.  You can’t do all of one and not any of the other.

The previous examples demonstrate that in our system, for virtually all interesting proofs, you will have rabbit holes.  You really will have a way to generate infinitely many and infinitely deep proofs.  So you can’t just use blind depth-first search.  We have these Alice-in-Wonderland holes for almost everything.

,
Breadth-first Search: advantage and disadvantage

Another strategy for searching is a breadth-first search.  This is where you search layer by layer.  First you try to do all of the zero-step proofs, then you try to do all of the one-step proofs, then all of the two-step proofs, etc.  So the advantage of breadth-first search is that you’re guaranteed to get this Ockham’s Razor benefit where you’ll get all of the simplest proofs before you get anything that’s strictly more complicated.  If there is an n-step proof, you’ll look at it before you even look at any (n+1)-step proofs.  You get the simplest ones early.

The disadvantage of breadth-first search is that just as we’ve got these huge deep trees, we also have huge bushy trees.  We have steps like the one on the right of the slide where we could have thousands, even tens of thousands, of child nodes.  For example, you might be doing a proof and come to a point where, if you could just find a person -- any random person -- you could do one more step of inference.  But say that every currently living person is represented in the system: you could have a six-billion-way fan-out right there.  There are even worse situations.  For example, if all you need is an integer, there are infinitely many integers.  So there are cases where you can’t even generate all of the child nodes, because there are infinitely many of them.

,
Breadth-first Search: another disadvantage

Another disadvantage of breadth-first search is that the amount of space you need to store “the stuff I haven’t looked at yet” is truly explosive.  By the time I look at the last node on the second layer of the tree on this slide, I may have had to store the entire third layer, because I will have generated all of the children of every node in the second layer.  So, if the third layer is explosively large, I would have to store all of it before I could even look at it.  The amount of space you use is proportional to the average branching factor raised to the power of the depth of the tree.  So it grows exponentially and explosively -- especially when the average branching factor is huge.

With a breadth-first search, the deeper you go into the tree, the more enormous the space requirements become for storing that which you haven’t looked at yet.  There just isn’t enough memory in a computer to store an infinitely large third layer -- the space of possible solutions is far too large.
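To see the space problem concretely, here is a small sketch (a uniform toy tree, not anything from the KB) that counts how many nodes a breadth-first frontier holds at each layer:

```python
from collections import deque

# Breadth-first search over a uniform toy tree with a fixed branching
# factor. The frontier (nodes generated but not yet examined) grows as
# branching**depth, which is what makes BFS memory-hungry on bushy trees.
def bfs_frontier_sizes(branching, depth):
    frontier = deque([()])               # start with the root (an empty path)
    sizes = []
    for _ in range(depth):
        sizes.append(len(frontier))
        next_layer = deque()
        for node in frontier:
            for i in range(branching):
                next_layer.append(node + (i,))  # generate every child
        frontier = next_layer
    return sizes

print(bfs_frontier_sizes(10, 4))         # -> [1, 10, 100, 1000]
```

With a branching factor in the thousands, the frontier becomes unmanageable after only a couple of layers.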
 

,
We Need a Better Search

So we run into the problem where two of the traditional algorithms for search, depth-first and breadth-first, are going to run into problems with a large system like ours.  If you notice the ellipses on the left and right of the chart, you’ll see that if you used either algorithm, you’d never get to the goal-node in the middle.  In the depth-first case, you’d get stuck down the rabbit hole.  In the breadth-first case, you’d get lost generating endless nodes on the first step.

In our system, because we can’t generate all of the possible nodes -- the incompleteness of it -- we have to approach searching in a different way.  The search strategy that we use attempts to identify the most promising path down below and expand each node in order of which ones look most promising at the time.  If we have a good heuristic that recognizes that the node on the upper right is going to generate an enormous number of children, or that the node on the lower left looks like it’s going to lead down a rabbit hole -- if we have a good estimate for which node looks most promising -- we ought to be able to home in on that goal node before we look at other, less promising ones.  This is called a heuristic search.  You search the tree in order of heuristic quality, thereby pushing off the parts that might go on infinitely.

The devil is in the details here.  In this case, the details are: you had better have a good heuristic; at each point you had better be able to look at a node and know whether it looks like it’s headed towards success.  So when we go down the solid purple path, each node should get a higher and higher heuristic estimate -- looks good, looks better, looks really good -- and eventually hit the goal, where the heuristic should say “obviously these are the best nodes.”  So, we have explosive trees, and we work on them in the order which looks most promising.
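A minimal sketch of heuristic (best-first) search, using a priority queue: nodes are expanded in order of an estimated score rather than by depth or layer.  The tree and the scores below are made up for illustration; the point is only that the “rabbit-hole” branch never gets expanded before the goal is found:

```python
import heapq

# Best-first ("heuristic") search: always expand the open node with the
# best (lowest) heuristic estimate. Children are stored as
# (name, heuristic) pairs; all values here are invented.
TREE = {
    "root": [("rabbit-hole", 9), ("promising", 2), ("bushy", 8)],
    "promising": [("better", 1), ("so-so", 5)],
    "better": [("goal", 0)],
    "rabbit-hole": [("deeper", 9)],    # never reached before the goal
}

def best_first(start, goal, budget):
    open_nodes = [(5, start)]           # (heuristic estimate, node)
    expanded = []
    while open_nodes and len(expanded) < budget:
        score, node = heapq.heappop(open_nodes)
        expanded.append(node)
        if node == goal:
            return expanded
        for child, h in TREE.get(node, []):
            heapq.heappush(open_nodes, (h, child))
    return expanded

print(best_first("root", "goal", budget=10))
# -> ['root', 'promising', 'better', 'goal']
```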

,
Summary
  •  The Search Tree
  • Incompleteness of Logic
  • The Halting Problem
    • Infinite Possibilities
    • Work harder or work smarter?
  • Depth-first Searches
    • Rabbit-holes
  • Breadth-first Searches
    • Infinite fan-out
    • Infinite space required to store possible solutions

This concludes the section on the Incompleteness of Searching.

pdf | zip | Incompleteness from Resource Bounds and Continuable Searches
Inference in Cyc

We described earlier how inference can be viewed as a sequence of logical deductions chained together.  At each point along the way, you might often have different ways that you could now perform a new deduction.  So, in effect, from what you start with there is a branching set of possibilities for how you can prove it.  And that branching set can spread out in all kinds of different, novel ways.

,
Cyc is Life in the Big City

A former Cyclist referred to the explosiveness of the searches as “Life in the Big City.”  We’re not dealing with “toy” problems where a small number of terms is talked about by a small number of assertions; this is life in the Big City.  We have on the order of a million assertions and on the order of a hundred thousand terms being talked about.  So, we very often run into huge potential proofs because we chain all of these things together -- both deep and broad.  Because of this we cannot search these trees to exhaustion.  So we take a completely different approach.
 

,
Inference is Resource-bounded

We search within some specified resource limits.  Inference in our system is “resource-bounded.”  By “resources” we’re referring to the resources available for doing inference.  Some resource bounds that we can impose are: the amount of time to work, the number of answers to find, the depth to reach (a measure of the complexity of the proof), and the number of rules to be chained together (chaining rules leads to child nodes which can be more complex than the original query).  By circumscribing a set of parameters which reflect the real-world resources available for having an automaton grind out these inferences, we turn the algorithm on its head.  It’s not the algorithm’s responsibility to “magically” figure out how much resource it’s supposed to use; we give the application that’s invoking the inference control over how much resource to spend in answering the original query: how many seconds to spend thinking, how many “yes’s” to come up with, how deep to go, how complex (rule-wise) to go, etc.  This gives the application the choice of the boundaries for the search.

So, the application has a way to determine how much of the tree to let the inference search over.  It tries to then find the best answers within these resource boundaries.  It is then the inference mechanism's own responsibility to figure out, given the resources available, how much time and resource to spend on each particular step -- while adhering to the overall resource restrictions from the outside.
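As a sketch of what resource-bounded search might look like, here the caller hands in a time budget, an answer budget, and a depth bound, and the search quits as soon as any of them is exhausted.  The endless toy tree stands in for an infinite proof space; none of this is Cyc’s actual interface:

```python
import time
from collections import deque

# Resource-bounded search: the caller, not the algorithm, decides when
# to stop, via limits on wall-clock time, number of answers, and depth.
def bounded_search(is_goal, children, max_seconds, max_answers, max_depth):
    deadline = time.monotonic() + max_seconds
    frontier = deque([("root", 0)])      # (node, depth) pairs
    answers = []
    while frontier and len(answers) < max_answers:
        if time.monotonic() > deadline:
            break                        # out of time: quit early
        node, depth = frontier.popleft()
        if is_goal(node):
            answers.append(node)
        if depth < max_depth:            # the depth bound prunes the rest
            frontier.extend((c, depth + 1) for c in children(node))
    return answers

# An endless toy tree: every node has two children; "goals" are nodes
# whose names happen to end in "LL".
answers = bounded_search(
    is_goal=lambda n: n.endswith("LL"),
    children=lambda n: [n + "L", n + "R"],
    max_seconds=1.0, max_answers=3, max_depth=6)
print(answers)                           # -> ['rootLL', 'rootLLL', 'rootRLL']
```

Without the answer and depth bounds, this loop would grind on until the machine ran out of memory -- exactly the behavior the resource limits exist to prevent.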
 

,
Resource-bounded Incompleteness

  In effect, we are allowing resource-bounded incompleteness, where the incompleteness comes from the algorithm’s decision as to how complete it wants to be.  For certain queries, you can say “run forever,” “give me every answer you can get,” “go as deep as you need,” or “use as many rules as you want.”  This is basically just turning it back over to Cyc and letting it give you everything.  It could go forever, so you almost always want to give some resource bounds to give it a reason to quit.  As we’ve seen, in many interesting cases if you don’t give it a reason to quit, it won’t.  It’ll go on until it runs out of memory on the computer or they shut off the lights on the universe.

Deciding how much work to do is the way in which we deal with “Life in the Big City.”  An aspect of this solution is that if you’re letting it quit early, you should give it a way to continue on from where it left off.

,
Inference is Continuable

  The other main aspect of addressing this incompleteness is making our inferences continuable.  If I use some resources to do a little bit of work, and it quits because it comes up with an answer, and I then decide to do some more work on it, I shouldn't have to start from scratch all over again.  I should be able to tell it to just pick up where it left off and use this many more resources, run for ten more seconds, give me one more answer, or go one step deeper in the tree, or something like this.  Inferences in our system are explicitly continuable, and this answers a direct need that arises from letting it quit early -- letting it pick up where it left off.
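A Python generator makes a convenient sketch of continuability: the search state survives between requests, so asking for one more answer resumes exactly where the last request stopped.  The “proof enumerator” below is a dummy, not real inference:

```python
# Continuable inference sketch: a generator keeps the search state
# alive between calls, so "give me one more answer" picks up exactly
# where the last request left off instead of restarting from scratch.
def prove(limit=50):
    # Dummy proof enumerator: the "answers" are just multiples of 7.
    for n in range(limit):
        if n % 7 == 0:
            yield n                   # suspend here; state is retained

search = prove()                      # start the search; no work done yet
first = next(search)                  # spend resources: one answer
second = next(search)                 # continue from where we stopped
more = [next(search) for _ in range(3)]
print(first, second, more)            # -> 0 7 [14, 21, 28]
```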

,
Discarding the Search Structure

There will come a time at the end when the system knows it’s finished with the structure, and that is the point where it cleans up and throws it all away.  Here’s an example: I want to know all of the states that border Tennessee.  After finding Missouri, which is a goal, the inference mechanism will continue the search.  Eventually it will find all eight states and be unable to give any more answers -- it will have exhausted the potential search.  At that point it might decide that it’s finished with the search and will never be able to do anything more -- or it may have been told that it’s not going to bother trying anything more -- so now it can clean up the data structure.
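A trivial sketch of exhaustion, using the bordering-states example (the geography is real; the “search” is a deliberately simple stand-in): once the finite candidate pool is used up, no further answers are possible and the search structure can be discarded.

```python
# A finite search space: Tennessee borders exactly eight states.
BORDERS_TENNESSEE = {
    "Kentucky", "Virginia", "North Carolina", "Georgia",
    "Alabama", "Mississippi", "Arkansas", "Missouri",
}

def find_bordering_states(candidates):
    answers = set()
    for state in candidates:          # a finite candidate pool
        if state in BORDERS_TENNESSEE:
            answers.add(state)
    # The loop ended: the search space is exhausted, so the (here
    # implicit) search structure can safely be thrown away.
    return answers

found = find_bordering_states(["Missouri", "Texas", "Kentucky", "Georgia"])
print(sorted(found))                  # -> ['Georgia', 'Kentucky', 'Missouri']
```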

,
The Search Tree is a Metaphor

The search tree is really a metaphor for the work of actually chaining deductions together.  There are a lot of ways that you can do this kind of proof.

For example, the system could reason that a line of reasoning for a particular proof isn’t worth the bother because a future step is going to be unprovable, and prune it off.  This suggests additional mechanisms for reasoning about how to prune off whole chunks of the tree -- mechanisms to identify which chunks of the tree are likely to pay off and which are not, so that the search can be refocused.

For another example, let’s say that you have three machines available for searching.  There will be a mechanism to assign each of the machines to search different parts of the tree.

So we have lots of possible extensions to this whole search metaphor that we’re looking forward to creating.  Viewed with this metaphor, we’re just trying to discover the most intelligent and profitable way of chaining deductions together.

,
Work Harder or Smarter? Deep Blue Example

  Let’s consider again the idea of “working harder versus working smarter.”  A good analogy for this is the Deep Blue chess project.  In this analogy, we’re searching through possible moves in chess.  Each node in the search tree is a possible state of the chess board.  Each step down from the origin represents a possible move by player A (player A could move his knight and go down one path, or he could move his pawn and go down another path, etc.).  Then it’s the opponent’s turn: what could player B do now?  The number of possible chess games is astronomically large.  With every move that is actually made, you go a node deeper, so an actual game is a path through this huge tree of possible chess games.  Eventually you come to “Game Over” at the end, where a draw is called, or a checkmate, or something else ends the game.

,
Work Harder or Smarter? Deep Blue Example

The search tree would have to represent every possible arrangement of pieces. So every game is somewhere in that tree.  This tree is huge and your goal is to figure out which moves to make in order to create a scenario that puts you in a position where you are more likely to win than your opponent.  This gives you two options.  You can “work smarter” or you can “work harder.”

If you “work harder,” you crank out as much of the tree below your current board as possible and analyze which move is likely to leave you in a favorable position.  In the history of chess-playing computer programs, there have been approaches that do exactly this, using standard board-strength heuristics (over the history of chess, people have come up with some very good heuristics for judging whether a position is better for the white pieces or the black pieces) to analyze which move would be strongest for a given player.

In the Deep Blue project, they built this massively parallel machine which carves off huge chunks of this chess space.  At each step it looks at the current state of the board and then considers every possible future move of both players.  It distributes the problem across an enormous number of processors which search the essentially independent possible paths in parallel.  It then finds the best possible move.
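The “work harder” approach can be sketched as a plain minimax search: expand the game tree to a fixed depth and score the leaves with a board-strength heuristic.  The game below is a made-up numeric toy, not chess, and the code is only illustrative of the idea:

```python
# Minimax to a fixed depth with a heuristic at the leaves: the
# "work harder" strategy in miniature. The deeper the search, the
# more of the tree gets cranked out before a move is chosen.
def minimax(state, depth, maximizing, children, evaluate):
    kids = children(state)
    if depth == 0 or not kids:
        return evaluate(state)        # heuristic board-strength estimate
    scores = [minimax(k, depth - 1, not maximizing, children, evaluate)
              for k in kids]
    return max(scores) if maximizing else min(scores)

# Toy game: a state is an integer; each move adds 1 or doubles it,
# and the "board strength" is just the number itself.
best = minimax(
    state=3, depth=3, maximizing=True,
    children=lambda s: [s + 1, s * 2] if s < 20 else [],
    evaluate=lambda s: s)
print(best)                           # -> 14
```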

,
Work Harder or Smarter? Deep Blue Example

Deep Blue does have a bit of the “work smarter” aspect in that it uses the board-strength heuristic to determine which paths to go down.  Along the way, this heuristic makes the search extremely efficient, but it’s not a very smart thing.  It only gets used along the way, and it doesn’t prune off large enough chunks of the tree to really count as working “smarter.”

Deep Blue is an outstanding chess player.  But as we’ve seen, such brute-force approaches will not work in the “Big City.”

,
Summary
  • Life in the Big City
  • Inference is Resource-bounded
  • Resource-bounded Incompleteness
  • Inference is Continuable
    • Proof Search is Stored
    • Meta data could be stored
  • Deep Blue: Working Harder

This concludes the section on Incompleteness from Resource Bounds and Continuable Searches.

pdf | zip | Efficiency Through Heuristics
Inference in Cyc

This lesson will discuss the efficiency aspects of inference in Cyc. Earlier we discussed the logical aspects of inference, the steps we use to perform a logical deduction and how we chain them together. We talked about some of the incompleteness issues that have to do with the large space of potential proofs that we can construct in this fashion and some of the strategies which would not work in our system to approach this; and then we introduced heuristic search as the mechanism that we do use to attempt to efficiently identify some successful proofs in the system. This talk is now going to concentrate on the efficiency aspects of heuristic search.

,
Inference is Modular

One of the most interesting techniques that we use for performing inference is that we don’t treat inference as one monolithic mechanism that has to handle the whole space of possibilities. Instead, we consider inference in our system to be one of potentially hundreds or thousands of special cases, each of which has an efficient mechanism to identify and solve that kind of problem. Inference in our system is modular in the sense that we have hundreds of independent modules, which we call inference modules or HL modules, to perform this reasoning -- each of which is specially tuned to handle one particular kind of problem.

So in the example on the slide, we have a node which has to do with proving what types of things Hamlet is an instance of; and out of all of the hundreds of potential reasoning modules, one is specially designed to handle questions of the form (#$isa <TERM> <VARIABLE>). In our system, we have a special-purpose reasoning structure for performing this sort of type-reasoning that can efficiently generate the approximately 25 to 30 bindings for ?WHAT and use that to generate the 25 to 30 child nodes that should come under this node.
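A sketch of what modular dispatch might look like: a registry of special-purpose handlers keyed by the kind of formula they solve, with one hypothetical module for (#$isa &lt;TERM&gt; &lt;VARIABLE&gt;) queries.  The formula encoding, the module, and the KB contents are all invented for illustration:

```python
# Toy "HL module" dispatch: each module handles one shape of query.
# Invented KB fragment: the collections Hamlet is an instance of.
KB_ISA = {"Hamlet": ["Play", "Tragedy", "LiteraryWork"]}

def isa_module(formula):
    # Handles queries of the form ("isa", <TERM>, <VARIABLE>):
    # one variable binding per collection the term belongs to.
    _pred, term, var = formula
    return [{var: collection} for collection in KB_ISA.get(term, [])]

MODULES = {"isa": isa_module}

def dispatch(formula):
    # The meta-reasoning step: pick only the module relevant to this
    # formula, quickly ignoring all of the irrelevant ones.
    module = MODULES.get(formula[0])
    return module(formula) if module else []

print(dispatch(("isa", "Hamlet", "?WHAT")))
# -> [{'?WHAT': 'Play'}, {'?WHAT': 'Tragedy'}, {'?WHAT': 'LiteraryWork'}]
```

Each returned binding would become one child node under the (#$isa #$Hamlet ?WHAT) node in the search tree.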

,
Inference is Modular (cont.)

By carving up the space of potential proofs into a large number of special cases, and having a very flexible system where you can always add yet another special case, you can identify the space of interesting problems that you want to solve and design HL modules optimized to solve them. The tight inner loop of our inference mechanism, which constructs these search trees and generates child nodes, effectively contains a very efficient expert system that performs the meta-reasoning of “what inference options and modules do I have available to solve the problem at this particular node?” In this fashion it can quickly set aside the irrelevant mechanisms for performing inference and bring the relevant ones to bear.

,
Inference is Bimodal

In an earlier talk we had described the differences between inference steps which strictly simplify a problem and inference steps which transform the problem into one which is either of the same or greater complexity. We noted the tension between these two: one of them uses facts in the system to potentially simplify a proof hopefully down towards true (which would be a goal node that indicates that you now have a successful proof) and one of them applies general purpose rules to potentially produce a completely new and novel way of trying to prove your problem.

In our system, these can be thought of as the two large classes of mechanisms for performing proofs. Because they are so different, our system distinguishes between the two and tries each of them in an optimal fashion. Inference in our system can be considered bimodal in that we have these two large modes to consider; we call them “removal” and “transformation.” The inference steps which apply rules tend to transform the problem into a potentially completely different problem, by applying a general-purpose rule to the question being proved. The application of facts, by contrast, adds nothing new or conditional to the proof; it strictly removes one of the steps that you’re trying to prove, and thus strictly decreases the complexity of the proof.

,
Inference is Bimodal (cont.)

The transformation steps provide the large fan-out, which has to do with stitching together a large number of the common-sense rules that quantify over things in the world. They’re the ones which effectively produce the large space of potential novel, interesting ways to approach and solve a problem; whereas the facts in the system can be thought of as the ways that a given approach either succeeds or fails, by bottoming out the proof in facts that are known about the world.

You can think of the interplay between these two as analogous to standing at the top of a mountain, like Pike’s Peak in the middle of the Rocky Mountains, knowing that deep in some valley somewhere in the Rockies is a pot of gold that you have to find. Applying a rule is analogous to being transported from the top of one mountain to the top of another (which could be a great distance away, maybe even several states away). So rules can bring you quite far afield from the place where you were originally trying to do the proof.

,
Inference is Bimodal (cont.)

Given that you have identified a promising peak to start at, the application of facts can be analogous to very quickly and efficiently pounding out an exhaustive search of all of the valleys right around that one mountain top. Because Removal strictly simplifies the problem, you know that there is a limit to the amount of work that you have to do -- there are only so many valleys around a single mountain top.

Thus there is a trade-off between introducing more mountains to start looking at and pounding out efficient searches as quickly as possible of the valleys around a particular mountain. In our system, then, we have two large sets of heuristics which have to do with these two qualitatively different approaches to solving the problem: generating new, complex proofs to examine and then efficiently trying to solve those proofs using facts that you know in our system.
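The two modes can be sketched side by side.  In this toy (the facts and the rule are invented), “removal” discharges goals that match known facts, strictly shrinking the problem, while “transformation” swaps a goal for a rule’s antecedents, possibly making the problem larger first:

```python
# Toy bimodal proof step: goals are tuples; FACTS and RULES are invented.
FACTS = {("isa", "Fred", "Person")}
RULES = {("elected", "Fred"): [("politician", "Fred"), ("wonVote", "Fred")]}

def removal(goals):
    # Removal: any goal matching a known fact is discharged, so the
    # set of open goals strictly shrinks -- bounded work, like
    # searching the valleys around one mountain top.
    return [g for g in goals if g not in FACTS]

def transformation(goals):
    # Transformation: a matching rule replaces a goal with its
    # antecedents -- a possibly larger, quite different problem,
    # like jumping to a new mountain top.
    out = []
    for g in goals:
        out.extend(RULES.get(g, [g]))
    return out

goals = [("isa", "Fred", "Person"), ("elected", "Fred")]
goals = removal(goals)         # -> [('elected', 'Fred')]
goals = transformation(goals)
print(goals)                   # -> [('politician', 'Fred'), ('wonVote', 'Fred')]
```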

,
Inference is Heuristic

  Inference in our system is heuristic. Now, the key thing to remember about heuristics in our system is that they affect efficiency, not correctness. All they do is provide an ordering on the possible proofs that you’re examining, hopefully so that you get the proofs that are going to succeed earlier rather than those that are less likely to succeed. If you are asking an exhaustive query, one that searches the entire search space, the ordering doesn't really matter because you’re intending to search all of it. So the benefits of heuristics only really show up when you provide resource constraints that allow the inference to halt early. Hopefully then you’ve hit the good ones by the time you halt early and you’ve ignored most of the less promising proofs.

Let’s explore the difference between the Transformation heuristics and the Removal heuristics. The purpose of the Transformation heuristics is effectively to order the possible proofs based on the rules involved in them, hopefully trying to provide proofs that use more coherent sets of rules -- rules that are working together in a more promising fashion. The purpose of the Removal heuristics is to generate answers as efficiently as possible and -- a corollary to that -- to prune dead ends as early as possible, noticing as early as possible that something is going to be a dead end and heading away from that. This way you can prune off entire sections of the proof space.

,
Summary
  • Inference is Modular
    • Internal expert system does meta-reasoning
    • HL modules
  • Inference is Bimodal
    • Removal (facts)
    • Transformation (rules)
  • Inference is Heuristic
    • affect efficiency, not correctness
pdf | zip | Inference Features in Cyc

This is the final lesson in the Inference Tutorial. It will focus on microtheories and forward/backward inference.

,
Inference Uses Mts for Consistency

Another unique feature of the Cyc system is our use of microtheories to deal with the difficulty of having global consistency in a knowledge base. The Cyc Knowledge Base does not consist of one single theory that has to be consistent. As theories get larger and larger, it becomes more and more difficult to maintain consistency among all of the statements in them. We solved this problem in our system by not having just one theory; we have a large number of what we call “microtheories.” These are smaller theories, usually on the order of a few hundred to a few thousand assertions in each one.

,
Mts Inherit from More General Mts Using #$genlMt

This gives us the ability to state that certain microtheories inherit from other microtheories; we can set up an ontology of theories and have one theory built upon other theories from which it inherits. It is easier to manage the space of millions of assertions because we carve them up into smaller sets of assertions that have common assumptions about them and then we state the relationships among these theories as a means of organizing them and making them more modular and reusable. The predicate that we use to state this inheritance relationship in microtheories is #$genlMt.
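As a concrete sketch (the microtheory names here are invented for illustration, not taken from the KB), a #$genlMt link is stated as an ordinary assertion:

```
;; Hypothetical example: every assertion visible in #$MyGeneralMt
;; becomes visible in #$MySpecializedMt as well.
(#$genlMt #$MySpecializedMt #$MyGeneralMt)
```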

,
Inference is performed Within Mts

When you perform an inference in our system, you perform an inference in a particular microtheory. This means that every assertion in that microtheory and all of the microtheories from which it inherits are “visible” for inference. This allows the system to maintain an enormous number of potential theories in an efficient fashion and to support performing inferences in any of them at the same time.

,
Inference Uses Microtheories and Inheritance

This slide shows a more complicated example of performing an inference within a microtheory. Imagine four microtheories, represented by the blue rectangles on the slide. Across the three microtheories on top, there are three assertions. In the first microtheory, there is the assertion we’re calling P. In the second microtheory, there is the assertion called Q. In the third microtheory, we have a rule that says that P and Q together imply R. Notice that none of these three microtheories constitutes a theory from which you can conclude R. But the fourth microtheory, below, since it has the three microtheories above it as its #$genlMts, in effect inherits all of those assertions in one place. So in that microtheory you now have a theory which can see all three assertions and can therefore soundly deduce R. If you were to ask R in any of the three theories above, Cyc would not be able to prove it. But if you were to ask it in the theory below, Cyc would.
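To make the visibility computation concrete, here is a minimal sketch in Python (not Cyc’s actual implementation; the microtheory and assertion names are invented) that models #$genlMt inheritance and shows that R is provable only in the fourth microtheory:

```python
# Hypothetical sketch: model microtheory inheritance as a directed graph
# and show that R is only provable in the microtheory that inherits P, Q,
# and the rule together.

def visible_assertions(mt, genl_mt, assertions):
    """Collect assertions in mt and in every microtheory it inherits from."""
    seen, stack, result = set(), [mt], set()
    while stack:
        m = stack.pop()
        if m in seen:
            continue
        seen.add(m)
        result |= assertions.get(m, set())
        stack.extend(genl_mt.get(m, []))
    return result

# Four microtheories: Mt1 holds P, Mt2 holds Q, Mt3 holds the rule
# "P and Q imply R" (encoded as a tuple), and Mt4 inherits from all three.
assertions = {"Mt1": {"P"}, "Mt2": {"Q"}, "Mt3": {("P", "Q", "R")}}
genl_mt = {"Mt4": ["Mt1", "Mt2", "Mt3"]}

def provable(query, mt):
    """Prove query from the assertions visible in mt (one rule step deep)."""
    facts = visible_assertions(mt, genl_mt, assertions)
    rules = {a for a in facts if isinstance(a, tuple)}
    for p, q, conclusion in rules:
        if p in facts and q in facts and conclusion == query:
            return True
    return query in facts

print(provable("R", "Mt3"))  # False: Mt3 sees the rule but not P or Q
print(provable("R", "Mt4"))  # True: Mt4 inherits P, Q, and the rule
```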

,
Two Important Microtheories: #$BaseKB and #$EverythingPSC

There are two microtheories which are worth pointing out as being interesting in cases like the one depicted on the previous slide.

The #$BaseKB can be thought of as the microtheory on top, from which everything inherits. This microtheory is always visible to all other theories. It is meant to represent the universal theory -- everything which is true, no matter what the theory. In fact, there are now approximately six microtheories “above” #$BaseKB (an example is #$UniversalVocabularyMt). But it is still true that all microtheories (except for these six) can see #$BaseKB (and, as a result, can see all of the microtheories above #$BaseKB).

The converse of #$BaseKB is a microtheory called #$EverythingPSC (“PSC” stands for “Problem Solving Context”). This can be thought of as the bottom of the microtheory ontology, which inherits from every microtheory in the system. In general, this is not a sound thing to do. But for pragmatic reasons, in various applications it is often useful to have available a microtheory in which you can do an ask that will effectively ignore all microtheory boundaries. #$EverythingPSC is a microtheory which has no logically consistent meaning but has high practical utility just because it is able to see the assertions in every microtheory.

,
Placing a New Microtheory

So, when you’re designing an application, it is useful to introduce a microtheory into the ontology of microtheories and add some judicious #$genlMt links to the theories that you want to use. You are, in effect, constructing the theory you want your application to reason in. This allows you to control which theories you use in inference and which theories you ignore, providing another mechanism for filtering out and pruning the space of possible proofs that you’d make when you’re performing inferences.
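Sketched in CycL, placing such an application microtheory might look like this (the microtheory names are invented for illustration):

```
;; Hypothetical example: create an application microtheory and give it
;; judicious #$genlMt links to the theories the application needs.
(#$isa #$MyApplicationMt #$Microtheory)
(#$genlMt #$MyApplicationMt #$MyDomainTheoryMt)
(#$genlMt #$MyApplicationMt #$MyAssumptionsMt)
```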

,
Inference can be Forward or Backward

Inference in Cyc is not limited to either forward or backward inference. We support both. Let me describe what we mean by “forward inference” and “backward inference.”

Forward inference can be thought of as eagerly concluding additional assertions as soon as new assertions are added to the system. So forward inference occurs at update time to the Knowledge Base. In effect, it causes more updates to the Knowledge Base, which then cause still more updates, until eventually the process quiesces and the system allows the update operations to complete normally. So, forward inference eagerly concludes, from the existing assertions, new assertions that you may or may not ever want to use in inference.

The opposite of forward inference is backward inference. Backward inference occurs at query time and starts from the particular query that you want to ask. It attempts to identify ways in which the original query would be true in terms of something else, and hopefully you can chain these conditional proofs back until you eventually hit something which is already true in your knowledge base and stop the backward search.
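A toy sketch of the two modes (not Cyc’s implementation; the rules and fact names are invented) makes the contrast concrete:

```python
# Hypothetical sketch contrasting forward and backward inference on the
# same rule set. Rules are (premises, conclusion) pairs.

RULES = [({"parent", "male"}, "father"), ({"father"}, "ancestor")]

def forward_close(facts):
    """Forward inference: eagerly apply rules at update time until fixpoint."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in RULES:
            if premises <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True
    return facts

def backward_prove(goal, facts):
    """Backward inference: start from the goal at query time and chain back."""
    if goal in facts:
        return True
    return any(conclusion == goal and
               all(backward_prove(p, facts) for p in premises)
               for premises, conclusion in RULES)

facts = {"parent", "male"}
print(sorted(forward_close(facts)))       # the eager closure adds "father", "ancestor"
print(backward_prove("ancestor", facts))  # the lazy proof of the same thing: True
```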

,
Forward Inference: Strengths and Weaknesses

Both forward and backward inference can be thought of as attempts to provide a deductive chain between what you are trying to ask in your query and what you already know in your assertions. In both cases, you want to find a connection between these two; they are just two different approaches to doing inference. One does it eagerly at assert time and one does it lazily at query time.

There are strengths and weaknesses to both. The strength of forward inference is that it provides a larger target for your backward inference to eventually hit. But the weakness is that you have to do a lot of work at update time. So if you have a lot of forward inference, the amount of work you do at update time could be quite large -- it could get larger and larger and larger as the knowledge base grows, so that it could eventually reach a point where there is so much to do at update time that you can’t keep up with the updates. There’s a limit to how much you can do with strictly forward inference in Cyc, because the space of potential things you can conclude is truly large and it’s often far larger than the space of things that you ever want to actually ask the system.

,
Limitation of Forward Inference

In a system with exclusively forward inference, you would waste a lot of space on concluding things that you are never going to ask about. In the diagram on the slide, that corresponds to the two triangles at the bottom, which indicate things you have bothered to conclude but are never going to bother to ask about.

Systems that have exclusively forward inference are fairly common among other knowledge representation systems. You can think of active databases with triggers as being exclusively forward. There are other well-known examples in the AI community: the Rete match algorithm is an exclusively forward matching strategy, and the Magic Sets transformation in the deductive-database literature describes how to encode backward inference in an exclusively forward system. So, there are many systems out there that are exclusively forward, and their limitation is that beyond a certain size of knowledge base, the space of conclusions you get in a forward fashion is so large that it just becomes unwieldy.

,
Limitation of Backward Inference

Backward inference is another common strategy which is often exclusively used in other systems. In backward inference you don’t try to remember anything beyond what is stated to the system and you re-derive things when asked at query time. Exclusively backward systems are those like Prolog, where the set of rules and facts are stated to the system ahead of time and proofs are done exclusively at query time, in a backwards fashion; and if you want to re-prove it, you have to re-run the proof again.

The downside of an exclusively backward system is the flip side of that of the exclusively forward system. You can have enormous fan-out in the space of subproofs you are trying to prove, subproofs that are never going to bottom out at anything you know about. In this diagram, that is equivalent to the triangles on the top, which represent subqueries fanning out from the query that you asked that have no hope of ever reaching anything stated in your system.

,
Cyc Supports Both Forward and Backward Inference

The benefit of having a system which supports both forward and backward inference is that with a judicious amount of forward inference you can increase the target of knowledge that is already represented in the system so that you have a larger target for your backward inference to hit. In the diagram on this slide, that is represented by the two approaches judiciously meeting in the middle. So, you can save all of the wasted space in the triangles by using a judicious amount of forward inference to expand the target area for your backward inference to hit.

,
A Subset of the KB is Marked “Forward”

Cyc supports both forward and backward inference, and this is the way it is used: every assertion in the system is labeled as being either a forward assertion or not. You can think of all of the forward assertions in the knowledge base as being a subset of the knowledge base that is labeled “forward”, and amongst all of the forward assertions, whenever a new assertion comes in, forward inference triggers and runs exhaustively amongst just that set. By judiciously choosing a subset of the knowledge base on which it is worthwhile to perform this forward inference, we can have a good mixture of the benefits of both forward and backward inference without having to suffer through the weaknesses of having only one or the other.

Just to give you an idea of what is labeled “forward” in the system, effectively all GAFs in the system are labeled “forward” and a tiny percentage (probably around 5 percent or less) of the rules in the system are labeled “forward”. The kind of rules that are labeled forward are typically those which are either extremely application-specific or constraints of some kind. The application-specific rules are in some focused microtheory so that the application knows that it wants these conclusions done because it is going to target exactly the results of those conclusions. Things like arities and #$argType constraints (and some classification rules that conclude things are instances of other things) are worth doing in a forward fashion -- especially those things that have to do with canonicalization and well-formedness checking; these are things where you don’t want to do deep inference at assert time to check those things, so it’s good to have those things computed in a forward fashion so that you can have simpler queries in the system to check them.
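The effect of the “forward” label can be sketched as follows (a toy model, not Cyc’s implementation; the rules and fact names are invented):

```python
# Hypothetical sketch: only rules marked "forward" fire eagerly at assert
# time; everything else is left for backward inference at query time.

# (premises, conclusion, forward?) triples
RULES = [
    ({"isa_dog"}, "isa_mammal", True),    # classification rule: marked forward
    ({"isa_mammal"}, "has_fur", False),   # ordinary rule: backward only
]

KB = set()

def assert_fact(fact):
    """Add a fact, then exhaustively run just the forward-marked rules."""
    KB.add(fact)
    changed = True
    while changed:
        changed = False
        for premises, conclusion, forward in RULES:
            if forward and premises <= KB and conclusion not in KB:
                KB.add(conclusion)
                changed = True

assert_fact("isa_dog")
print("isa_mammal" in KB)  # True: concluded eagerly by the forward rule
print("has_fur" in KB)     # False: left for backward inference at query time
```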

,
Summary
  • Inference Uses Mts for Consistency
  • Mts Inherit from More General Mts Using #$genlMt
  • Inference is performed Within Mts
  • Two Important Microtheories: #$BaseKB and #$EverythingPSC
  • Inference can be Forward or Backward
  • A Subset of the KB is Marked “Forward”

This concludes the tutorial on Inference in Cyc.