April 20, 2015


Constants are the “vocabulary words” of the CYC knowledge base. The CYC KB is an attempt to model the world as most sane, adult humans perceive it, so each constant stands for some thing or concept in the world that we think many people know about and/or that most could understand.

The KB contains constants that denote collections of other concepts, such as #$AnimalWalkingProcess (the set of all actions in which some animal walks) or #$Typewriter (the set of all typewriters). It can have constants that denote individual things, some of which are more-or-less permanently in the KB, like #$InternalRevenueService, and some of which might get created only when reasoning about some state of affairs, like #$Walking00036 (a particular case of walking). Some of the individuals represented in the KB are predicates, such as #$isa or #$likesAsFriend, that allow one to express relationships among constants. Others are functions, such as #$GovernmentFn, which take constants or other things as arguments and denote new concepts, e.g., (#$GovernmentFn #$Canada).

Each constant has its own data structure in the KB, consisting of the constant and the assertions which describe it.

2.1.1.  Constant Names

All CYC constants have a unique name, such as #$GeorgeWBush, #$massOfObject, or #$MapleTree. CYC constants are referred to with the prefix “#$” (read “hash-dollar”). These characters are sometimes omitted in documents describing CycL, and they may be omitted by certain interface tools. But in these CYC Documentation pages, the policy will be to use the “#$” prefix when referring to CYC constants.

2.1.2.  Naming Conventions

The name of a CYC constant (the part after the “#$” prefix) must follow these rules.

All CYC constant names must be at least 2 characters long (not including the “#$” prefix).

Constant names can include any uppercase or lowercase letter, any digit, and the symbols “-” (dash), “_” (underscore), and “?” (question mark). No other characters, such as “!”, “&”, or “@”, are allowed. This policy is enforced in the CYC Functional Interface and in the CYC Web Interface.

CYC constant names are case-sensitive: #$foo is not the same as #$Foo. However, distinguishing two constant names solely on the basis of capitalization is prohibited by the system.
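The prohibition on names that differ only in capitalization can be pictured as a check for collisions after case folding. The following is a minimal sketch, not the system's actual enforcement code; the function name `find_case_collisions` is hypothetical, and names are given without the “#$” prefix.

```python
def find_case_collisions(names):
    """Group constant names that differ only in capitalization.

    Hypothetical helper: names are plain strings without the '#$' prefix.
    """
    groups = {}
    for name in names:
        # Names that fold to the same lowercase key collide.
        groups.setdefault(name.lower(), []).append(name)
    # Keep only groups containing at least two distinct spellings.
    return [group for group in groups.values() if len(set(group)) > 1]


collisions = find_case_collisions(["Foo", "foo", "Dog", "isa"])
```

Here `collisions` holds one group containing “Foo” and “foo”, the pair from the example above.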

All CYC predicate names must begin with a lowercase character. All non-predicate constant names must begin with an uppercase character. Non-predicate constant names may also begin with a numeric character (e.g., #$3MCorporation). We may also allow predicates to begin with numeric characters, if someone makes a compelling argument for why this should be allowed.
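The length, character, and first-character rules stated so far are mechanical enough to check in a few lines. The sketch below is illustrative only and is not the check performed by the CYC interfaces; the function name `check_constant_name` is hypothetical, names are passed without the “#$” prefix, and “letter” is assumed to mean an ASCII letter.

```python
import string

# Assumption: "letter" means an ASCII letter.
ALLOWED_CHARS = set(string.ascii_letters + string.digits + "-_?")


def check_constant_name(name, is_predicate):
    """Return a list of rule violations (an empty list means the name passes)."""
    problems = []
    if len(name) < 2:
        problems.append("too short: need at least 2 characters")
    bad = sorted(set(name) - ALLOWED_CHARS)
    if bad:
        problems.append("disallowed characters: " + "".join(bad))
    first = name[:1]
    if is_predicate:
        # Predicates must begin with a lowercase letter.
        if not (first.isascii() and first.islower()):
            problems.append("predicate names must begin with a lowercase letter")
    else:
        # Non-predicates must begin with an uppercase letter or a digit.
        if not (first.isascii() and (first.isupper() or first.isdigit())):
            problems.append("non-predicate names must begin with an uppercase letter or a digit")
    return problems


ok = check_constant_name("3MCorporation", is_predicate=False)
not_ok = check_constant_name("Isa", is_predicate=True)
```

In this sketch `ok` is an empty list, while `not_ok` reports the first-character violation.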

Constant names should not be plural nouns. Even in the case of collections, the associated constant name should be the singular noun that describes individual members of the collection; e.g., the collection of all dogs is #$Dog, not #$Dogs.

All CYC constant names should be composed of one or more meaningful “words” in sequence, with no breaks except for dashes or underscores (e.g., #$isa and #$SportsCar). A sequence of numeric characters may count as a “word” (e.g., #$FrontOfficeOf123Corp). With the exception noted above for predicate names, each (non-numeric) “word” in a sequence must begin with a capital letter. An acronym may count as a “word”, but all its characters must be the same case: lowercase if the acronym begins the name of a predicate constant, uppercase otherwise.

These conventions make for easier reading by ontological engineers, as well as better English generation for unlexified terms.
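To see why the word conventions help with generating English for unlexified terms, consider how a tool might split a name back into its “words”. The following is a rough heuristic sketch and not Cyc's actual generation code; the function name `split_name_into_words` and the regular expression are our own assumptions.

```python
import re

# A "word" is one of:
#   - a run of capitals not followed by a lowercase letter (an acronym or initial),
#   - an optional capital followed by lowercase letters,
#   - a run of digits.
_WORD_RE = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|[0-9]+")


def split_name_into_words(name):
    """Split a constant name (without the '#$' prefix) into its component words."""
    # Dashes, underscores, and question marks match no alternative,
    # so findall simply skips over them.
    return _WORD_RE.findall(name)


words = split_name_into_words("FrontOfficeOf123Corp")
```

With the example name above, `words` is the list “Front”, “Office”, “Of”, “123”, “Corp”. A name that ignored the capitalization conventions would give such a splitter nothing to work with.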

Hyphens are used to set off parts of names which restrict or refine the meaning of the name, as in #$Fruit-TheWord or #$Horse-Domesticated.

2.1.3.  Naming Strategies

All other things being equal, it’s best to give related constants names which are alphabetically proximal. Some of our interface tools make it easy to search for all constants whose names begin with a certain string of characters, and it’s easier to find all constants having to do with horses if they have been given names like #$Horse-Domesticated and #$Horse-Wild than if they have been given names like “DomesticatedHorse” and “WildHorse”.
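The benefit of alphabetically proximal names shows up in any prefix search over a sorted list of names. The sketch below is only an illustration of that idea, not the search used by any Cyc interface tool; the function name `names_with_prefix` is hypothetical, and names are given without the “#$” prefix.

```python
from bisect import bisect_left


def names_with_prefix(sorted_names, prefix):
    """Return every name in an already-sorted list that starts with prefix."""
    # All names sharing a prefix sit next to each other in sorted order,
    # beginning at the position where the prefix itself would be inserted.
    start = bisect_left(sorted_names, prefix)
    matches = []
    for name in sorted_names[start:]:
        if not name.startswith(prefix):
            break
        matches.append(name)
    return matches


names = sorted([
    "Horse-Domesticated", "Horse-Wild",
    "DomesticatedHorse", "WildHorse",
    "Dog", "MapleTree",
])
horses = names_with_prefix(names, "Horse")
```

Here `horses` contains “Horse-Domesticated” and “Horse-Wild”, while “DomesticatedHorse” and “WildHorse” are scattered elsewhere in the list and are missed.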

However, as Cyc’s natural language capability improves, and as new lexical lookup utilities are added, it becomes easier to look up constants by any of the strings that are known to refer to them, rather than by their constant names. For example, if you type “FBI” into the Complete box in the Cyc browser, it offers #$FederalBureauOfInvestigation as a disambiguation. Hence, naming constants is only one piece of the work; doing thorough lexification is also very important.
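Lookup by referring string, as in the “FBI” example, amounts to keeping an index from strings to the constants they can denote. The following toy sketch is not Cyc's lexicon; the class `LexicalIndex`, its methods, and the choice of case-insensitive matching are all assumptions made for illustration.

```python
from collections import defaultdict


class LexicalIndex:
    """Toy index from referring strings to constant names (hypothetical)."""

    def __init__(self):
        self._by_string = defaultdict(set)

    def add(self, text, constant):
        # Assumption: matching ignores case.
        self._by_string[text.casefold()].add(constant)

    def lookup(self, text):
        # Several constants may share a string, so return all candidates.
        return sorted(self._by_string.get(text.casefold(), ()))


index = LexicalIndex()
index.add("FBI", "#$FederalBureauOfInvestigation")
index.add("Federal Bureau of Investigation", "#$FederalBureauOfInvestigation")
candidates = index.lookup("fbi")
```

The more strings that have been entered for a constant, the less its findability depends on the constant name itself.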

When naming a constant, it’s important to assign a name that distinguishes the denoted concept from other concepts it might get confused with. So “Bow” would be a terrible name for a constant. Instead, names like “Bow-BoatPart”, “BowTheWeapon”, or “Bowing-BodyMovement” should be used, depending on the underlying concept denoted.

Sometimes it is possible to take this principle of specificity in names to an extreme, and attempt to embody the whole meaning of the constant in its name. This is discouraged. For example, one might be tempted to give the constant #$physicalParts the name “distinctIdentifiablePhysicalParts”, but it is better to leave the name a bit terser since it isn’t easily confused with some other concept, and put the additional information in the constant documentation.

2.1.4.  Significance of Names

It’s important to remember that the names we assign to constants mean nothing to CYC. It doesn’t matter whether the concept green is represented by #$Green, #$GreenColor, #$Verde, #$Gruen, or #$EMRG.

It’s also very important never to assume that you, the observer of the CYC KB, can know with certainty what a constant denotes to the system, just from seeing its name and nothing else.

The meaning of a constant in CycL is determined by the assertions in the KB that use that constant. For example, from the following assertions, it is easy to tell what the hypothetical constant #$EMRG means:

     (#$isa #$EMRG #$Color)
     (#$colorOfObject #$Grass37 #$EMRG)
     (#$forAll ?O (#$implies (#$isa ?O #$Okra) (#$colorOfObject ?O #$EMRG)))

For convenience, we choose names for CYC constants that will indicate to human users what the constant is intended to mean. (For example, #$PurpleColor or #$RedColor.) But remember, CYC doesn’t understand those strings. Don’t be misled by evocative names like #$LittleRedHairedGirlLikedByCharlieBrown. Unless that constant is appropriately related to other CYC constants such as #$FemaleChild, #$hairColor, #$RedHairColor, #$CharlieBrown, and #$likesAsFriend, it is meaningless to CYC.
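The point that meaning comes from assertions rather than from names can be made concrete with a toy model. In the sketch below, which is not how the CYC KB is actually stored, assertions are nested Python tuples and the helper names `mentions` and `assertions_about` are hypothetical.

```python
def mentions(formula, constant):
    """Return True if constant occurs anywhere inside a nested-tuple formula."""
    if isinstance(formula, tuple):
        return any(mentions(part, constant) for part in formula)
    return formula == constant


def assertions_about(kb, constant):
    """Collect every assertion in the toy KB that uses the constant."""
    return [assertion for assertion in kb if mentions(assertion, constant)]


# The three example assertions about #$EMRG, written as nested tuples.
kb = [
    ("#$isa", "#$EMRG", "#$Color"),
    ("#$colorOfObject", "#$Grass37", "#$EMRG"),
    ("#$forAll", "?O",
     ("#$implies",
      ("#$isa", "?O", "#$Okra"),
      ("#$colorOfObject", "?O", "#$EMRG"))),
]

about_emrg = assertions_about(kb, "#$EMRG")
about_girl = assertions_about(kb, "#$LittleRedHairedGirlLikedByCharlieBrown")
```

In this toy KB, `about_emrg` contains all three assertions, so the opaque name #$EMRG has content, while `about_girl` is empty: the evocative name gives the system nothing to reason with.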