KB Term Indexing in Cyc

All assertions in the KB point to Cyc terms via the CNF representation of their formulas. The purpose of all the indexing hanging off of terms in the KB is to provide pointers in the other direction, from a term to all the assertions mentioning it. To assist in inference, the indexing is arranged to provide efficient access to groups of assertions which use a term in a similar fashion.

There are two classes of indexed terms: constants and assertions.

If an indexed term has assertions stated about it, then that term has a pointer to an index-structure which has a set of slots for holding the different classes of indexing structres.

Each index-structure has the following slots:

(a) arg1, arg2, arg3, arg4, arg5, arg0
These slots are used in indexing ground atomic formulas involving the term.

(b) neg, pos
These slots are used in indexing non-gafs (rules) involving the term.

(c) ist
If the term is a microtheory, this slot is used to index rules stated in that microtheory.

(d) other
This slot is used to index all the "none of the above" uses of the constant.

The format of the indexing for these slots are now discussed in detail.

 

Argument Positions 1-5 (arg1, arg2, arg3, arg4, arg5)

Consider an index for the constant #$Foo. These slots on the index are used to index all the ground atomic formulas involving exactly #$Foo in position 1 - 5, respectively, of each formula. The indexing occurs in three layers:

layer 1 : ( <count> . <pred-tree> ) or NIL

<count> is total number of assertions indexed within <pred-tree>
<pred-tree> is a binary tree where for each node:
-tag is the predicate used in all gafs in the layer 2 indexing
-state is layer 2 indexing

layer 2 : ( <count> . <mt-tree> )

<count> is total number of assertions indexed within <mt-tree>
<mt-tree> is a binary tree where for each node:
-tag is the local microtheory for all gafs in the layer 3 indexing
-state is layer 3 of indexing

layer 3 : ( <count> . <assertions> )

<count> is total number of assertions in <assertions>
<assertions> is a list of assertions

Example: a gaf assertion such as

(#$genls #$Dog #$Animal) in #$AnimalTaxonomyMt

would be indexed using these slots as follows:

#$Dog
 index-struc
  arg1 slot (layer 1)
   #$genls node (layer 2)
     #$AnimalTaxonomyMt node (layer 3)
       list of assertions, including above

On #$Animal
 index-struc
  arg2 slot (layer 1)
   #$genls node (layer 2)
     #$AnimalTaxonomyMt node (layer 3)
       list of assertions, including above

Argument Position 0 (arg0)

Consider an index for the predicate #$foo. The slot arg0 on the index is used to index all the ground atomic formulas where #$foo is used as the predicate (i.e. arg 0). The indexing occurs in two layers:

layer 1 : ( <count> . <mt-tree> ) or NIL

<count> is total number of assertions indexed within <mt-tree>
<mt-tree> is a binary tree where for each node:
-tag is the local microtheory for all gafs in the layer 2 indexing
-state is layer 2 of indexing

layer 2 : ( <count> . <assertions> )

<count> is total number of assertions in <assertions> <assertions> is a list of assertions

Example: a gaf assertion such as

(#$genls #$Dog #$Animal) in #$AnimalTaxonomyMt

would be indexed using this slot as follows:

#$genls
 index-struc
  arg0 slot (layer 1)
   #$AnimalTaxonomyMt node (layer 2)
     list of assertions, including above

Negative and Positive (neg and pos)

Consider an index for the constant #$Foo. These slots on the index are used to index all formulas involving #$Foo which are not ground atomic formulas. The NEG slot is used to index those formulas in which the term appears in a negative literal. The POS slot is used to index those formulas in which the term appears in a positive literal.

There are additional restrictions involving the use of NEG and POS:

(a) If #$foo is a predicate, then #$foo is used in the predicate position of a literal in all the assertions indexed via these slots. Also, the form of the literal involved is NOT one of those described in either (b) or (c) or (d) below.

(b) If #$Foo is a non-predicate function, then #$Foo is used in a literal of the form (#$termOfUnit ?varN (#$Foo ... )) in all the assertions indexed via these slots.

(c) If #$Foo is a collection, then #$Foo is used in a literal of the form (#$isa ?varN #$Foo) in all the formulas indexed via these slots.

(d) If <Foo> is an assertion, then <Foo> is used in a literal of the form (#$abnormal <whatever> <Foo>) in all the formulas indexed via these slots.

To summarize, these slots are used to index the terms in the predicate position of literals of non-gafs except for the three special cases outlined above, in which a different term is indexed rather than the predicate.

The indexing occurs in three layers:

layer 1 : ( <count> . <mt-tree> ) or NIL

<count> is total number of assertions indexed within <mt-tree>
<mt-tree> is a binary tree where for each node:
-tag is the local microtheory for all rules in the layer 2 indexing
-state is layer 2 of indexing

layer 2 : ( <forward> <backward> )

<forward> is layer 3 indexing of forward assertions
<backward> is layer 3 indexing of backward assertions

layer 3 : ( <count> . <assertions> )

<count> is total number of assertions in <assertions>
<assertions> is a list of assertions

Example: a backward rule such as

(#$implies
 (#$and
  (#$isa ?PER #$Person)
  (#$mother ?PER ?MOM))
 (#$loves ?MOM ?PER)) in #$HumanSocialLifeMt

would be indexed using these slots as follows:

#$Person
 index-struc
  neg slot (layer 1)
   #$HumanSocialLifeMt (layer 2)
     backward (layer 3)
      list of assertions, including above

#$mother
 index-struc
  neg slot (layer 1)
   #$HumanSocialLifeMt (layer 2)
     backward (layer 3)
      list of assertions, including above

#$loves
 index-struc
  pos slot (layer 1)
   #$HumanSocialLifeMt (layer 2)
     backward (layer 3)
      list of assertions, including above

ist

Consider an index for the microtheory #$FooMt. This slot on the index is used to index all assertions which are locally asserted in mt #$FooMt. However, if #$FooMt is a broad microtheory, i.e.

  (#$isa #$FooMt #$BroadMicrotheory)

then it is not generally worth the overhead of maintaining this indexing. Therefore, for broad microtheories, this index is not maintained and this slot is NIL. The#$BaseKB is the canonical example of a broad microtheory. The index consists of a single layer:

layer 1 : ( <count> . <assertions> ) or NIL

<count> is total number of assertions in <assertions>
<assertions> is a list of assertions

Example: the gaf example used above would be indexed using this slot as follows:

#$AnimalTaxonomyMt
  index-struc
   ist slot (layer 1)
    list of assertions, including gaf

Other

Consider an index for the constant #$Foo. This slot (other) is used to index all remaining assertions, including both ground atomic formulas and rules, which involve #$Foo but are not covered by any of the indexing listed above. Its primary purpose is to ensure that when a term is killed, all assertions referring to it are removed from the KB. This "miscellaneous" indexing is currently not used in inference, so it merely has a single layer:

layer 1 : ( <count> . <assertions> )

<count> is total number of assertions in <assertions>
<assertions> is a list of assertions

Example: a gaf assertion such as

(#$nounPrep #$Product-TheWord #$Of-TheWord (#$products :OBLIQUE-OBJECT :NOUN))
in #$EnglishMt

would be indexed using this slot as follows:

#$products
 index-struc
  other slot (layer 1)
   list of assertions, including gaf

More Examples

 

Example #1 -- use of slots arg1-arg5, and arg0

Assertion:

(#$isa #$Goolsbey #$Person) in #$BaseKB

This assertion is indexed as follows:

#$Goolsbey 
 index-struc
   arg1 slot
    #$isa
     #$BaseKB
       list of assertions, including above

#$Person 
 index-struc
  arg2 slot
    #$isa
     #$BaseKB
       list of assertions, including above

#$isa
 index-struc
  arg0 slot
   #$BaseKB
     list of assertions, including above

Note that the assertion is not indexed under the ist slot of the index-struc for #$BaseKB since #$BaseKB is a broad microtheory.

 

Example #2 -- use of slots neg, pos and ist

Assertion (backward):

(#$implies
  (#$owns ?AGT ?OBJ)
  (#$possesses ?AGT ?OBJ)) in #$PropertyMt

This is canonicalized as follows in CNF form:

neg-lits : ((#$owns ?var0 ?var1))
pos-lits : ((#$possesses ?var0 ?var1))

The assertion is indexed as follows:

#$owns
 index-struc
  neg slot
   #$PropertyMt
     backward
      list of assertions, including above

#$possesses
 index-struc
  pos slot
   #$PropertyMt
     backward
      list of assertions, including above

#$PropertyMt
 index-struc
  ist slot
   list of assertions, including above

Example #3 -- special cases of slots neg and pos

Assertion (forward):

(#$imples
 (#$and
  (#$termOfUnit ?U (#$CaptialOf ?REGION))
  (#$isa ?REGION #$State-UnitedStates))
 (#$isa ?U #$City-UnitedStates))  in #$USGeographyMt

This is canonicalized as follows in CNF form:

neg-lits : ((#$termOfUnit ?var0 (#$CaptialOf ?var1))
	    (#$isa ?var1 #$State-UnitedStates))
pos-lits : ((#$isa ?var0 #$City-UnitedStates))

The assertion is indexed as follows:

#$CaptialOf
 index-struc
  neg slot
   #$USGeographyMt
     forward
      list of assertions, including above

#$State-UnitedStates
 index-struc
  neg slot
   #$USGeographyMt
     forward
      list of assertions, including above

#$City-UnitedStates
 index-struc
  pos slot
   #$USGeographyMt
     forward
      list of assertions, including above

#$USGeographyMt
 index-struc
  ist slot
   list of assertions, including above

Note that it is not indexed under #$termOfUnit or #$isa because the form of these literals are the special cases which allow the indexing off of #$CaptialOf, #$State-UnitedStates, and #$City-UnitedStates.

Example #4 -- use of other

Assertion (backward):

(#$implies
  (#$isa ?MT #$Microtheory)
  (#$genlMt ?MT #$BaseKB)) in #$BaseKB

The assertion is canonicalized as follows in CNF form:

neg-lits : ((#$isa ?var0 #$Microtheory))
pos-lits : ((#$genlMt ?var0 #$BaseKB))

The assertion is indexed as follows:

#$BaseKB
 index-struc
  other slot
   list of assertions, including above

#$Microtheory
 index-struc
  neg slot
   #$BaseKB
     backward
      list of assertions, including above

#$genlMt
 index-struc
  pos slot
   #$BaseKB
     backward
      list of assertions, including above

Simple Indexing

For terms which have relatively few assertions stated about them, the indexing-structure is in fact just a simple list of all assertions mentioning the term. In this case, all the slots and indexing organization described above still exist but only in a virtual sense -- the indexing code is abstracted out in such a way as to be able to handle both the simple and complex cases.