Information

Survey of Knowledge Base Content

Information is one of the trickiest things to ontologize (represent in the Knowledge Base) because everything contains some kind of information.  Movies, books, plays, and TV shows are all obvious examples of things that contain information, but the rings in a tree stump and the color of the sky contain information as well.

Information

  There are three obvious categories of things that contain information: information-bearing things, abstract strings and characters, and propositional content.  However, in designing Cyc we discovered the need for a fourth category, which we call conceptual works.

Information-bearing things are the physical embodiments of information (e.g., an individual newspaper, which you can use to read or to wrap fish).

Information is encoded in abstract strings and characters.  They are the abstract symbols and structures like words, sounds, and handwriting that are used to convey information.  These are the most common elements used to intentionally convey information.

Propositional Content is what is encoded.  It is the information that the abstract strings and characters combine to represent.

The following slides will further explicate the above ideas and introduce the idea of a conceptual work.

What is “Moby Dick” ?

  This slide presents an example that we will tease apart through the remainder of this lesson.

Consider “Moby Dick” and the three main categories of information detailed on the previous slide.  As we continue with this lesson, you will see that “Moby Dick” doesn’t fit entirely into any one of those categories.

Question: What is “Moby Dick”?
Answer: A conceptual work.

Let’s see why....

Slide 4

  If I were to say that I like “Moby Dick,” I might mean that I like the way my special edition leather-bound copy of “Moby Dick” looks on the shelf.  In this case I would be referring to the aspect of “Moby Dick” that is an information-bearing thing (IBT).

If I were to say that I like “Moby Dick,” I might mean (unlikely as it is) that I like the specific sequence of letters in the work called “Moby Dick.”  In this case, I would be referring to the abstract information structure (AIS) of “Moby Dick.”

If I were to say that I like “Moby Dick,” I might mean that I had read the story in several languages and liked it each time I read it.  In this case, I would be referring to the propositional information thing (PIT) that “Moby Dick” can denote.  This concept is important, as it is what allows Cyc to be independent of language.  A PIT that is expressed in CycL (a portion of which is given on the slide) can then be expressed in any desired language because PITs are independent of AISs (the sequences of symbols).  One PIT could have multiple AISs.
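The point that one PIT could have multiple AISs can be sketched in Python.  This is only an illustration, not how Cyc stores anything: the class names, the field names, the PIT label, and the Spanish rendering are all assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropositionalInformationThing:
    """Language-independent content (PIT). The label is a hypothetical stand-in."""
    label: str


@dataclass(frozen=True)
class AbstractInformationStructure:
    """One specific sequence of symbols (AIS) in some language."""
    language: str
    symbols: str
    encodes: PropositionalInformationThing


# Hypothetical PIT for the exclamation shown on the slide.
whale_identified = PropositionalInformationThing("that thing is Moby Dick")

# Two different symbol sequences; the Spanish one is an invented rendering.
english = AbstractInformationStructure("English", "'Tis Moby Dick!", whale_identified)
spanish = AbstractInformationStructure("Spanish", "¡Es Moby Dick!", whale_identified)

# Same content, different symbols: one PIT, two AISs.
same_content = english.encodes == spanish.encodes
different_symbols = english.symbols != spanish.symbols
```

Both `same_content` and `different_symbols` come out true, which is the sense in which the content is independent of the symbols used to write it down.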

Slide 5

Since most people don’t normally use the name of a book to refer to the paper on which it is printed, when they say “Moby Dick,” they are probably not using “Moby Dick” as an information-bearing thing (IBT).

Similarly, people don’t normally use the name of a book to refer to the enormously long sequence of symbols (letters) that make up the text of the book (the sequence of all of the letters in the entire work, including the small sequence on the slide, “’-T-i-s--M-o-b-y--D-i-c-k-!”).  Thus, “Moby Dick” probably does not refer to an abstract information structure (AIS).

In the same way that “Don Quixote” probably refers to a novel that is written in Spanish, when people say “Moby Dick” they are probably referring to the original English version.  In fact, some experts might question the validity of reading the novel in a language other than the original English.  Thus, when people say “Moby Dick,” they are probably not referring just to the propositional information thing (PIT), but to something more, which includes the language in which it was written.

Slide 6

  Hence, we know that “Moby Dick” denotes something that does not fit into any one of the three obvious categories for representing information.  So we now understand the need for the fourth category -- Conceptual Works.  When someone says “Moby Dick” they are probably referring to the thing that Cyc knows as #$MobyDickTheBook-CW.

Slide 7

  #$MobyDickTheBook-CW is embodied in thousands of different IBTs around the world (all the copies of all the editions, in all the libraries, homes, schools, etc.).  It is also represented by a specific AIS and associated with a specific PIT.
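As a rough sketch of this picture (many IBTs, one AIS, one PIT, all tied to a single conceptual work), the following Python models each category as a small record.  The class names, field names, and placeholder strings are assumptions for illustration; only the constant name MobyDickTheBook-CW comes from the lesson.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualWork:
    name: str  # the Cyc constant named in the lesson, without the #$ prefix
    ais: str   # placeholder for the one symbol sequence that represents the work
    pit: str   # placeholder for the content associated with the work


@dataclass(frozen=True)
class InformationBearingThing:
    description: str       # hypothetical description of one physical copy
    work: ConceptualWork   # the conceptual work this copy embodies


moby_dick = ConceptualWork(
    name="MobyDickTheBook-CW",
    ais="<the full English text of the novel>",
    pit="<the story that the text conveys>",
)

# Three of the thousands of physical copies (all invented for the example).
copies = [
    InformationBearingThing("leather-bound special edition on my shelf", moby_dick),
    InformationBearingThing("paperback in a school library", moby_dick),
    InformationBearingThing("worn copy in a used-book shop", moby_dick),
]

# Every copy points back to the same single conceptual work.
works_embodied = {copy.work for copy in copies}
```

However many copies are added to `copies`, `works_embodied` still holds exactly one conceptual work.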

Slide 8

  In Cyc, we relate conceptual works to other things, like IBTs, via a large number of relations.  Although these relations have no everyday, common-sense usage, they allow us to specify to Cyc the part of the conceptual work to which we are referring.  So, in order to refer to a specific copy of “Moby Dick” we would refer to the IBT that is an #$instantiationOfCW of “Moby Dick.”
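To make the idea of picking out a specific copy through a relation more concrete, here is a minimal Python sketch that treats assertions as (predicate, IBT, conceptual work) triples.  The triple layout, the argument order, the helper function, and every constant other than instantiationOfCW and MobyDickTheBook-CW are assumptions for illustration, not Cyc's actual representation.

```python
# A toy knowledge base: each assertion is (predicate, IBT, conceptual work).
# The argument order is an assumption made for this sketch.
assertions = {
    ("instantiationOfCW", "MyLeatherBoundCopy", "MobyDickTheBook-CW"),
    ("instantiationOfCW", "LibraryPaperback", "MobyDickTheBook-CW"),
    ("instantiationOfCW", "MyQuixoteCopy", "DonQuixote-CW"),  # hypothetical constant
}


def instantiations_of(work, kb):
    """Return, in sorted order, the IBTs asserted to instantiate the given work."""
    return sorted(
        ibt
        for predicate, ibt, conceptual_work in kb
        if predicate == "instantiationOfCW" and conceptual_work == work
    )


# The specific copies of "Moby Dick" known to this toy knowledge base.
moby_copies = instantiations_of("MobyDickTheBook-CW", assertions)
```

Here `moby_copies` contains the two hypothetical copies of “Moby Dick” and leaves out the copy of “Don Quixote.”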

Slide 9

This slide presents a glimpse of the various relations between conceptual works and the other categories pertaining to information.

Summary
  • InformationBearingThing
  • AbstractInformationStructure
  • PropositionalInformationThing
  • ConceptualWork
  • Relating these categories

This concludes the lesson on how Cyc represents Information.