CycL is a formal language whose syntax derives from first-order predicate calculus (the language of formal logic) and from Lisp. In order to express common sense knowledge, however, it goes far beyond first order logic. The vocabulary of CycL consists of terms. The set of terms can be divided into constants, non-atomic terms (NATs), variables, and a few other types of objects. Terms are combined into meaningful CycL expressions, which are used to make assertions in the Cyc knowledge base.
- Logical Connectives
- Non-Atomic Terms
Constants are the “vocabulary words” of the Cyc knowledge base. The Cyc KB is an attempt to model the world as most sane, adult humans perceive it, so each constant stands for some thing or concept in the world that we think many people know about and/or that most could understand.
The KB contains constants that denote collections of other concepts, such as #$AnimalWalkingProcess (the set of all actions in which some animal walks) or #$Typewriter (the set of all typewriters). It can have constants that denote individual things, some of which are more-or-less permanently in the KB, like #$InternalRevenueService, and some of which might get created only when reasoning about some state of affairs, like #$Walking00036 (a particular case of walking). Some of the individuals represented in the KB are predicates, such as #$isa or #$likesAsFriend, that allow one to express relationships among constants. Others are functions, such as #$GovernmentFn, which can take constants or other things as arguments, and generate new concepts (i.e., (#$GovernmentOf #$Canada)).
Each constant has its own data structure in the KB, consisting of the constant and the assertions which describe it.
Most Cyc constants have a unique name, such as #$BillJ, #$massOfObject, or #$MapleTree. (For information on those that don’t, see Reifiable Functions below.)
Cyc constants are referred to with the prefix “#$” (read “hash-dollar”). These characters are sometimes omitted in documents describing CycL, and they may be omitted by certain interface tools. But in these Cyc Documentation pages, the policy will be to use the “#$” prefix when referring to Cyc constants.
The name of a Cyc constant — the part after the “#$” prefix — must follow these rules: All Cyc constant names must be at least 2 characters long (not including the #$ prefix).
Constant names can contain upper and lower case characters and colons, underscores, and dashes. valid-constant-name-p can be used to check that a string is a valid constant name. This policy is enforced in the Cyc Functional Interface and in the Cyc Web Interface.
Cyc constant names are case-sensitive: #$foo is not the same as #$Foo. However, distinguishing two constant names solely on the basis of capitalization is prohibited by the system.
All Cyc predicate names must begin with a lowercase character. (This does not include all the things that are presently instances of #$Predicate in Cyc. Some of these latter things are more like functions, and their names begin with uppercase letters).
All non-predicate constant names must begin with an uppercase character. Non-predicate constant names may also begin with a numeric character (e.g., #$3MCorporation). We may also allow predicates to begin with numeric characters, if someone makes a compelling argument for why this should be allowed.
All Cyc constant names should be composed of one or more meaningful “words” in sequence, with no breaks except for dashes or underlines (e.g. #$isa and #$SportsCar). A sequence of numeric characters may count as a “word” (e.g., #$FrontOfficeOf123Corp). With the exception noted above for predicate names, each (non-numeric) “word” in a sequence must begin with a capital letter. An acronym may count as a “word”, but all its characters will be the same case (e.g., lower case if the acronym begins the name of a predicate constant; otherwise uppercase).
Hyphens are used to set off parts of names which restrict or refine the meaning of the name, as in #$Fruit-TheWord or #$Horse-Domesticated.
In general, it’s best to give related constants names which are alphabetically proximal. Some of our interface tools make it easy to search for all constants whose name begins with a certain string of characters, and it’s easier to find all constants having to do with horses if they have been given names like #$Horse-Domesticated and #$Horse-Wild than if they have been given names like “DomesticatedHorse” and “WildHorse”.
When naming a constant, it’s important to assign a name that distinguishes the denoted concept from other concepts it might get confused with. So “Bow” would be a terrible name for a constant. Instead, names like “Bow-BoatPart”, “BowTheWeapon”, “Bowing-BodyMovement” should be used, depending on the underlying concept denoted.
Sometimes it is possible to take this principle of specificity in names to an extreme, and attempt to embody the whole meaning of the constant in its name. This is discouraged. For example, one might be tempted to give the constant #$physicalParts the name “distinctIdentifiablePhysicalParts”, but it is better to leave the name a bit terser since it isn’t easily confused with some other concept, and put the additional information in the constant documentation.
It’s important to remember that the names we assign to constants mean nothing to Cyc. It doesn’t matter whether the concept green is represented by #$Green, #$GreenColor, #$Verde, #$Gruen, or #$EMRG.
It’s also very important never to assume that you, the observer of the Cyc KB, can know with certainty what a constant denotes to the system, just from seeing its name and nothing else.
The meaning of a constant in CycL is determined by the assertions in the KB that use that constant. For example, from the following assertions, it is easy to tell what the hypothetical constant #$EMRG means:
(#$isa #$EMRG #$Color) (#$colorOfObject #$Grass37 #$EMRG) (#$forAll ?O (#$implies (#$isa ?O #$Okra) (#$colorOfObject ?O #$EMRG)))
For convenience, we choose names for Cyc constants that will indicate to human users what the constant is intended to mean. (For example, #$Red or #$RedColor.) But remember, Cyc doesn’t understand those strings. Don’t be misled by evocative names like #$LittleRedHairedGirlLikedByCharlieBrown. Unless that constant is appropriately related to other Cyc constants such as #$FemaleChild, #$hairColor, #$RedHairColor, #$CharlieBrown, and #$likesAsFriend, it is meaningless to Cyc.
Quantified CycL expressions (discussed below) contain one or more variables which stand for constants whose identities are not specified. A variable may appear (nearly) anywhere a constant can appear.
Variable names must contain uppercase alphanumeric characters only, must start with either one or two question marks, and the character after the first or second question mark must be a letter. valid-el-var-name? can be used to check that a string meets these requirements.
In formulas in which only one variable is used, it is common to use a single-letter variable, such as “?X”. However, when a formula contains more than one variable, it will be much more readable if you give the variables mnemonic names. Here’s an example:
(#$implies (#$and (#$isa ?TRANSFER #$TransferringPossession) (#$fromPossessor ?TRANSFER ?FROM)) (#$isa ?FROM #$SocialBeing))
“The initial possessor in a possession transfer is a social being.”
CycL formulas combine terms into meaningful expressions.
Every formula has the structure of a Lisp list. It is enclosed in parentheses, and consists of a list of objects which are commonly designated ARG0, ARG1, ARG2, etc. The object in the ARG0 position may be a predicate, a logical connective, or a quantifier. The remaining arguments may be atomic constants, non-atomic terms, variables, numbers, English strings delimited by double quotes (“), or other formulas.
This is the class of well-formed formulas in CycL. If a CycL formula satisfies all the constraints on the number and types of arguments to the relations that appear in it, the system will recognize it as an instance of the collection #$CycFormula.
The simplest kind of formula is an atomic formula: a formula in which the ARG0 position is occupied by a predicate, and all the other argument positions are filled with terms:
(#$likesAsFriend #$DougLenat #$KeithGoolsbey) (#$skillCapableOf #$LinusVanPelt #$PlayingAMusicalInstrument #$performedBy) (#$colorOfObject ?CAR ?COLOR)
The first two of the atomic formulas above are ground atomic formulas (GAFs), since none of the terms filling the argument positions ARG1, ARG2, etc. are variables.
Every CycL atomic formula must begin with a predicate in order to be well-formed.
Most predicates are defined to take a fixed number of arguments. There are no optional predicate arguments in CycL. However, a few predicates, such as #$different, can take a variable number of arguments. Such predicates are elements of the collection #$VariableArityRelation. In most cases, arity is automatically inferred by Cyc when a relation or predicate is made an instance of a certain type of collection (e.g. #$BinaryPredicate). However, arity can also be asserted directly, via the binary predicate #$arity.
The number of arguments a predicate takes is determined by its parity. A predicate is described as unary, binary, ternary, quaternary, or quintary, according to whether it takes 1, 2, 3, 4, or 5 arguments. Currently, no CycL predicate takes more than 5 arguments; however, if some representation required a predicate to take more arguments, CycL would be changed to allow this.
To be well-formed, an atomic formula must have the right number of arguments for the predicate filling the ARG0 position. So,
(#$likesAsFriend #$DougLenat #$KeithGoolsbey #$Fido)
is not well-formed, since the arity of #$likesAsFriend is 2, but this formula gives 3 arguments to #$likesAsFriend.
The type of each argument must be specified in the definition of the predicate, using the predicates #$arg1Isa, #$arg2Isa, etc. For example, suppose the predicate #$residesInDwelling is defined by the following:
(#$isa #$residesInDwelling #$BinaryPredicate) (#$arg1Isa #$residesInDwelling #$Animal) (#$arg2Isa #$residesInDwelling #$ShelterConstruction)
To be well-formed, every formula which has #$residesInDwelling in the ARGO position must have a term which is an instance of #$Animal in the ARG1 position, and term which is an instance of #$ShelterConstruction in the ARG2 position. So,
(#$residesInDwelling #$PottedPlant37 #$KarensHouse)
is probably not well-formed. Though we can never be absolutely certain just from the names, #$KarensHouse could be an instance of #$ShelterConstruction, but #$PottedPlant37 is probably not an instance of #$Animal.
Complex formulas can be built up out of atomic formulas or other complex formulas by using logical connectives, which are special constants analogous to the logical operators of formal logic. The most important logical connectives in CycL are #$not, #$and, #$or, and #$implies.
The connective #$not takes a single formula as an argument. Like the “not” of formal logic, it reverses the truth valueof its argument. Thus,
(#$not (#$colorOfObject #$FredsBike #$RedColor))
will be true if and only if (#$colorOfObject#$FredsBike #$RedColor) is false. Likewise,
(#$not (#$not (<#$colorOfObject #$FredsBike #$RedColor)))
will have the same truth value as (#$colorOfObject#$FredsBike #$RedColor).
The connective #$andtakes one or more formulas as arguments. Like the “and” of formal logic, it returns true if and only if each of its arguments evaluates to true. Here’s an example:
(#$and (#$colorOfObject #$FredsBike #$RedColor) (#$objectFoundInLocation #$FredsBike #$FredsGarage))
This formula states that Fred’s bike is red and that it is located in Fred’s garage. If both of those things are true then the whole formula is true, but if one or both are false, then the whole formula is false.
The connective #$ortakes one or more formulas as arguments. Like the “or” of formal logic, it returns true if and only if at least one of its arguments evaluates to true. Here’s an example:
(#$or (#$colorOfObject #$FredsBike #$RedColor) (#$objectFoundInLocation #$FredsBike #$FredsGarage) (#$owns #$Fred #$FredsBike))
This assertion states that either Fred’s bike is red, or it is located in Fred’s garage, or Fred owns it, or all three. (The word “or” in English is sometimes taken to imply that one alternative or the other is true, but not both. That is not the case with #$or.) If any or all of these three statements is true, then the whole formula is true. All would have to be false for the formula as a whole to be false.
The connective #$impliestakes exactly two formulas as arguments. Like the “if-then” statement of formal logic, it returns true if and only if it is not the case that its first argument is true and its second argument is false. Here’s an example:
(#$implies (#$owns #$Fred #$Bike001) (#$colorOfObject #$Bike001 #$RedColor))
This assertion states that if #$Bike001 is owned by #$Fred, then it is red. Newcomers to formal logic may misinterpret #$implies as implying a causal relationship. But, strictly speaking, a #$impliesassertion says only that either the first argument is false, or the second argument is true. So, for example, the assertion
(#$implies (#$isa #$RichardNixon #$Fruit) (#$colorOfObject #$BillJ #$PastelMintGreen))
is true, because the first argument is false.
Assertions involving #$implies are very common in the Cyc KB. We also call them conditionals or rules, and we often refer to the first argument as the antecedent and the second argument as the consequent. Note, however, that the particular formula above is not representative of assertions likely to be found in the Cyc KB. We will come to some more representative examples in a moment.
Any complex formula formed by using the logical connectives will be well-formed if the formulas given as arguments to the connectives are also well-formed and if the right number of arguments are given. (The formulas given as arguments to the logical connectives need not be atomic formulas, like most of the examples above, but can be any well-formed formula.) Another way of saying this is that #$not, #$and, #$or and #$implies produce CycFormulas when they are given arguments which are also CycFormulas.
Suppose A and B are syntactically legal, and C is not. Then,
(#$not A) (#$and A) (#$and A B) (#$or A) (#$or A B) (#$implies A B)
would all be CycFormulas. But
(#$not A B) (#$and) (#$and A C) (#$implies A)
would NOT be CycFormulas. Why? (#$not A B) violates the requirement that #$not take only one formula as an argument. (#$and) and (#$implies A) also violate restrictions on the number of formulas these connectives take as arguments. (#$andA C) is not well-formed because C is not; any complex formula that contained C would be syntactically bad for the same reason.
It should also be noted that #$and and #$or are elements of #$VariableArityRelation: hence, if A, B, C, and D are well-formed CycL expressions,
(#$and A B C D)
would be well-formed and also
(#$or A B C D)
would be well-formed.
So far, we have only looked at ways to make statements about specific objects, like #$FredsBike. But CycL, like first-order predicate calculus, also gives us two ways to talk about objects without being specific about the identity of the objects involved: universal quantification and existential quantification. Universal quantification corresponds to English expressions like every, all, always, everyone, and anything, while existential quantification corresponds to English expressions like someone, something, and somewhere. CycL contains one universal quantifier, #$forAll, and four existential quantifiers, #$thereExists, #$thereExistAtLeast, #$thereExistAtMost, and #$thereExistExactly.
The quantifier #$forAll takes two arguments, a variable and a formula in which the variable appears. In practice, the formula is almost always a conditional in which the antecedentis used to restrict the scope of the variable. Here’s an example:
(#$forAll ?X (#$implies (#$owns #$Fred ?X) (#$objectFoundInLocation ?X #$FredsHouse)))
This formula states that it is true, concerning every object in the Cyc ontology, that if #$Fred owns that object, then that object is located in #$FredsHouse. In other words, all Fred’s stuff is in his house.
Formulas may contain more than one quantifier, as in the following:
(#$forAll ?X (#$forAll ?Y (#$implies (#$and (#$owns #$Fred ?X) (#$owns #$Fred ?Y)) (#$near ?X ?Y))))
which says that any two things owned by Fred are near each other. Note that each quantifier introduces a new variable, and that each variable must have a different name.
Normally, variables need to be introduced (“bound”) by a quantifier before they are used. Each quantifier binds exactly one variable, and every variable used should be bound by exactly one quantifier. Furthermore, a variable has no meaning outside the scopeof the quantifier which binds it.
However, if a unbound variable appears in a CycL formula, it is always assumed to be universally quantified, with the result that
(#$implies (#$owns #$Fred ?X) (#$objectFoundInLocation ?X #$FredsHouse))
is exactly equivalent to
(#$forAll ?X (#$implies (#$owns #$Fred ?X) (#$objectFoundInLocation ?X #$FredsHouse)))
Since the former is easier to write and read, it is almost always preferred in practice, and you will rarely see a #$forAllwhile browsing the Cyc KB. Note, however, that unbound variables which appear only in the consequent of a conditional, and not in the antecedent, may have drastic and undesired consequences. Take, for example, the following:
(#$implies (#$owns #$Fred ?WHATEBER) (#$objectFoundInLocation ?WHATEVER #$FredsHouse))
Because of the typo, the variable ?WHATEVER will range over the entire Cyc ontology. In other words, the assertion above states that as long as Fred owns one thing, everything is located in #$FredsHouse–probably not what we wanted.
The quantifier #$thereExists takes two arguments, a variable and a formula in which the variable appears. In practice one uses #$thereExistsonly in certain ways, of which the following is a good example:
(#$implies (#$isa ?A #$Animal) (#$thereExists ?M (#$mother ?A ?M)))
This assertion states that, for every animal, there exists at least one object which is that animal’s mother. The object which is the animal’s mother may be an object which is already represented by a Cyc constant, or it may be a new object of which Cyc has no knowledge. But unless and until it is told otherwise, Cyc will assume that the object is a new one not identical with any “known” object.
These three quantifiers are similar to #$thereExists, but provide greater quantitative expressiveness. Each of them takes three arguments: a positive integer, a variable, and a formula in which the variable appears. Their meaning should be fairly self-explanatory. Look at the following examples:
(#$implies (#$isa ?P #$Person) (#$thereExistExactly 2 ?LEG (#$and (#$isa ?LEG #$Leg) (#$anatomicalParts ?P ?LEG))))
(#$implies (#$isa ?T #$Table) (#$thereExistAtLeast 3 ?LEG (#$and (#$isa ?LEG #$Leg) (#$anatomicalParts ?T ?LEG))))
(#$implies (#$isa ?P #$Person) (#$thereExistAtMost 1 ?SPOUSE (#$spouse ?P ?SPOUSE)))
As you probably by now expect, any formula beginning with a quantifier is well-formed if and only if its arguments are of the right number, of the right types, in the right order, and its formula argument is well-formed.
People writing assertions for entry into the Cyc KB use #$thereExists quite frequently. But when you browse the KB, you rarely see #$thereExists in an assertion. That’s because once assertions are entered into the KB, occurences of#$thereExists are automatically converted into Skolem functions. The only exceptions are certain cases where #$thereExistsis used within an expression that is an argument to a predicate. Thus, an assertion which was entered as:
(#$implies (#$isa ?A #$Animal) (#$thereExists ?M (#$and (#$mother ?A ?M) (#$isa ?M #$FemaleAnimal))))
will appear in the KB as 4 different assertions:
(#$isa #$SKF-8675309 #$SkolemFunction) (#$arity #$SKF-8675309 1) (#$implies (#$isa ?A #$Animal) (#$mother ?A (#$SKF-8675309 ?A))) (#$implies (#$isa ?A #$Animal) (#$isa (#$SKF-8675309 ?A) #$FemaleAnimal))
For more details, look at “An Introduction to Cyc Inferencing”.
A non-atomic term (NAT) is a way of specifying a termas a function of some other term(s). Every NAT is composed of a function and one or more arguments to that function.
Consider, for example, the function #$FruitFn, which takes as an argument a type of plant and returns the collection of the fruits of that type of plant. This function can be used to build the following NATs:
(#$FruitFn #$AppleTree) (#$FruitFn #$PearTree) (#$FruitFn #$WatermelonPlant) . . . .
Note that there may or may not be a named Cyc constant corresponding to the collection of apples (that is, a constant called #$Apple). The NAT (#$FruitFn#$AppleTree) provides a way of talking about this collection even if the corresponding constant does not exist.
NATs can be used anywhere a constant can be used. One could write, for example:
(#$implies (#$isa ?APPLE (#$FruitFn #$AppleTree)) (#$colorOfObject ?APPLE #$RedColor))
Since all NATs are built around functions, it’s important to understand more about them.
Like predicates, most functions have a fixed arity. A function is described as unary, binary, ternary, quaternary, or quintary, according to whether it takes 1, 2, 3, 4, or 5 arguments. No CycL function currently takes more than 5 arguments.
A few functions do not have a fixed arity, but can take a variable number of arguments. Mathematical functions like #$PlusFn are one example. And in Cyc-10, IBQEs are now treated as NATs in which the units of measure are functions which can take either one or two arguments, according to whether they are intended to denote a single value or a range.
Functions with fixed arity are similar to predicates in that the definition of the function must specify the type of each argument, using the predicates #$arg1Isa, #$arg2Isa, etc.
Functions with no fixed arity are defined using the predicate #$argsIsa, which specifies a single type of which every argument must be an instance.
Functions differ from predicates in that they return a Cyc term as a result. Accordingly, function definitions must also describe the type of the result to be returned, using the predicate #$resultIsa. Consider, for example, the function #$GovernmentFn:
(#$arity #$GovernmentFn 1) (#$arg1Isa #$GovernmentFn #$GeopoliticalEntity) (#$resultIsa #$GovernmentFn #$RegionalGovernment)
The argument to #$GovernmentFn must always be an instance of #$GeopoliticalEntity, and a NAT created using #$GovernmentFn will always be an instance of #$RegionalGovernment. So, for instance,
(#$isa (#$GovernmentFn #$UnitedStatesOfAmerica) #$RegionalGovernment)
Most functions are instances of either #$IndividualDenotingFunction or #$CollectionDenotingFunction. #$GovernmentFn is an example of the former, since a NAT like (#$GovernmentFn #$UnitedStatesOfAmerica) denotes an individual government. On the other hand, #$FruitFn is an example of the latter, since a NAT like (#$FruitFn #$AppleTree) denotes the collectionof all apples, not an individual apple.
The distinction between individuals and collections is an important one in CycL. For more on this topic, look at the constants #$Individual and #$Collection.
The definition of an instance of #$CollectionDenotingFunction should specify, not only its argument types and result type, but also the collection that the result will have as genls. This is done using the predicate #$resultGenl. For example, if the function #$LeftPairMemberFn is defined by:
(#$isa #$LeftPairMemberFn #$CollectionDenotingFunction) (#$arity #$LeftPairMemberFn 1) (#$arg1Isa #$LeftPairMemberFn #$SymmetricalPartType) (#$resultIsa #$LeftPairMemberFn #$ExistingObjectType) (#$resultGenl #$LeftPairMemberFn #$LeftObject)
then the following must be true concerning a NAT constructed from #$LeftPairMemberFn and #$Shoe:
(#$isa (#$LeftPairMemberFn #$Shoe) <#$ExistingObjectType) (#$genls (#$LeftPairMemberFn #$Shoe) #$LeftObject)
In other words, the set of left shoes is an instance of #$ExistingObjectType and a subset of #$LeftObject.
Many CycL functions are instances of #$ReifiableFunction. Each time an instance of #$ReifiableFunction is used with a new set of arguments to build a NAT, that NAT is reified, that is, preserved in the Cyc ontology as a constant. Constants which are reified NATs don’t start out with proper constant names, but can always be referred to by their NAT expression. They can later be assigned constant names if desired.
When a new reified NAT-constant is first created, Cyc automatically sets up the correspondence
(#$termOfUnit NAT-CONSTANT NAT-EXPRESSION)
where NAT-CONSTANT is the automatically created constant and NAT-EXPRESSION is the non-atomic term that can be used to refer to it. In the Cyc web interface, such an assertion might look like this:
(#$termOfUnit (#$GovernmentFn #$Canada) (#$GovernmentFn #$Canada))
It looks like the first and second arguments are the same, but that’s because, in lieu of a proper constant name for the reified NAT-constant, the system uses the NAT expression as a print name. If you look very carefully at any #$termOfUnit assertion in the web interface, you will see that the opening parenthesis of ARG1 is a followable link (depending on the web-browser you use, it may be underlined, or a different color), but the opening parenthesis of ARG2 is just opaque text. Clicking on the opening paren of a NAT expression will display the page for the reified NAT-constant the expression denotes.
Moreover, a reified NAT can be explicitly identified with an existing constant using the predicate #$termOfUnit:
(#$termOfUnit #$TheYear1996 (#$YearFn 1996)) (#$termOfUnit #$Apple (#$FruitFn #$AppleTree))
When a NAT is identified with a constant using #$termOfUnit, the two are asserted to be de dictoequivalent.
Skolem functions are reifiable.
Non-reifiable functions include mathematical functions like #$PlusFn. Just because we use a NAT like (#$PlusFn 59 64) doesn’t mean we want to add to the KB a unit denoting the number 123. If we want to talk about the number 123, we’ll just refer to it directly.
Also, #$UnitOfMeasure is not a subset of #$ReifiableFunction, so IBQEs such as (#$Inch 37) and (#$Meter 500) are not reified when they are referred to.
You may find, on occasion, that you want to write a rule which involves quantifying into a NAT, that is, using a NAT which has a variable as one of its arguments. You can do so by following these guidelines:
- Use the variables ?ARG1, ?ARG2, ?ARG3, etc., to stand for any arguments over which you want to quantify.
- Introduce a free variable which represents the result of the NAT.
- Add a clause to the antecedent which identifies this variable with the NAT. To do this, use the predicate #$equals.
(#$implies (#$and (#$equals ?U (#$PreviouslyOwnedFn ?ARG1)) (#$isa ?X ?U)) (#$hasAttributes ?X #$Used))
For quantifying into NATs you should always use #$equals, never #$termOfUnit. The predicate #$termOfUnitis used in the automatic mapping between the system generated data structure and the original non-atomic-term and should figure in the hand-entered assertions of human Cyclists very rarely, if at all.
Note that you cannot quantify into a NAT unless it is built on an instance of #$ReifiableFunction.
So far this document has dealt mostly with the syntax of formulas in CycL. This is the syntax used by people or external programs when they assert things into a version of the Cyc KB or query the KB. Now we will shift our focus to what formulas look like once they have been asserted into the KB.
The Cyc KB consists of a large number of assertions. When a formula is successfully asserted into the KB, it is stored as one of these. Each assertion is composed of a number of elements:
- a formula
- a microtheory
- a truth value
- a direction (or access level)
- a support
We’ll discuss each of these and point out where they come from. It’s important that you be familiar with all of them:
You are already familiar with formulas–they are the CycFormulas we use to state things in the Cyc KB.
Every assertion is contained in some microtheory. A particular formula may be asserted into (or concluded in) more than one microtheory; when this is the case, there will be an assertion which has that formula in each of those microtheories. The largest number of assertions are currently in the #$BaseKB.
Microtheories are covered in more detail here, as well as in the constant vocabulary, under #$Microtheory. Where does the microtheory information on assertions come from? That depends on the origin of the assertion. If an assertion is added to the KB by the inference engine as the result of firing a rule, the inference engine code decides what microtheory the conclusion should be added in and records it at add time. If an assertion is the result of a person or external program asserting a formula into the KB, at that time the asserter must specify which microtheory the formula is to go in. Some interfaces for knowledge entry may not require the user to specify a microtheory for new assertions, and will then either try to choose the right one or will use #$BaseKB as a default. If you use such an interface make sure you know what the default behavior is.
Attached to every assertion is a truth value that indicates its degree of truth. CycL contains five possible truth values, of which the most common are default true and monotonically true.
Assertions that are monotonically true are held to be true in every case, that is, for every possible set of bindings to the universally quantified variables (if any) in the assertion, and cannot be overridden. In the case of a monotonically true assertion with universally quantified variables in its formula, if an object is found for which the assertion is not true, an error is signalled. In the case of a ground assertion that is monotonically true, if the negation of that formula is ever asserted or arrived at during inference (in the same microtheory), an error is signalled.
Assertions that are default true, in contrast, are held to be true in most cases, and can be overridden. If the negation of an existing ground, default assertion is asserted in the same microtheory, or is arrived at through inference, no error is signalled. Instead, the argumentation mechanism is invoked to decide what the final truth value of the assertion will be.
By default, GAFs which begin with the predicates #$isa and #$genls are monotonically true, while all other assertions (including rules) are default true.
Direction is a value associated with every assertion that determines when inferencing involving that assertion should be performed. There are three possible values for direction: forward, backward, and code. Inferencing involving assertions with direction forward is performed at assert time (that is, when a new assertion is added to the KB), while inferencing involving assertions with direction backward is postponed until a query occurs and that query allows backward inference. By default, GAFs have direction forward, while ruleshave direction backward. Only in very special cases should rules have direction forward.
Assertions with direction code are not used in normal inference at all; instead, special HL modules have been written to supplant the need for inference using the assertion itself. Code assertions cannot be edited via the HTML interface.
One way of viewing directions is as a hierarchy of “when it gets used” :
- :codeAssertions declared :code are exhaustively supported directly by HL modules.
- :forwardAssertions declared forward are used exhaustively in forward inference. All :code assertions are also used implicitly at this level since the HL modules supplanting them are used here.
- :backwardAssertions declared backward are only used in backward inference. All :forward assertions are also used at this level, and the HL modules supplanting :code assertions are again used at this level.
Older Cyc documentation refers to “access levels” rather than to direction. Access level 0 is equivalent to direction forward, while access level 4 or higher is equivalent to direction backward.
Attached to every assertion is a support, which consists of one or more justifications which form the support for the presence of the assertion in the KB. In many cases, at least one of the supporting justifications is local, indicating that the assertion was added to the KB from an outside source (most commonly, a human KEer). In other cases, a supporting justification is a source which indicates the assertion was inferred and which outlines the final step of some argument, or chain of reasoning, which supports the assertion.
The support is the one element of an assertion which need not be specified by the KEer when performing knowledge entry. It is created and updated automatically. However, supports can be displayed by KB browsing tools.