This section describes the Cyc system and sets the conceptual stage for the administrative tasks presented in the following sections.
Cyc as a Software System
Cyc is a common-sense knowledge base and machine-reasoning system. A knowledge base system is in many ways very similar to a relational database server. Among the similarities are:
- There is a centralized server component.
- The two main modes of interaction are updating information and answering queries (or performing inference, in the parlance of knowledge-based systems).
- The server offers a variety of APIs, ranging from web-based tools to programmatic APIs, connected via TCP/IP.
- There is a dedicated scripting language.
- The representation and query language can be expressive enough that some updates or queries are mathematically guaranteed to exhaust all available resources unless properly constrained.
However, there are also key differences, which Cyc shares with many knowledge base systems, that make the system unlike relational databases:
- Cyc is not transactional. There are no commit points and no rollbacks.
- Cyc keeps all changes in memory (though it maintains a journal for replay purposes). Unless Cyc is explicitly instructed to serialize its knowledge base, changes made to it may not be available automatically after a restart. In addition, serializing the knowledge base is considered an infrequent operation and is not very fast compared to the rate of writes that relational databases achieve.
- Many of the performance-related parameters that are set globally in a relational database are set on a "per query" basis in Cyc. That is, each query sent to the Cyc system includes a list of parameters and values that govern how long the query can run before it times out, which portion of the knowledge base it can access in its search, the maximum number of answers it can return, and a whole suite of other settings that can play a large role in the performance of the system. Because these parameters depend on the query being asked and can, and often do, vary from query to query, they are not expected to be set by system administrators or IT managers and, accordingly, are not covered in this document.
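To make the "per query" idea concrete, here is a minimal Python sketch of a query call that carries its own limits. It is not Cyc's API: the names QueryParams, run_query, max_seconds, max_answers, and context are hypothetical, chosen only to mirror the three examples above (time limit, searchable portion of the knowledge base, and answer cap).

```python
import time
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class QueryParams:
    """Hypothetical per-query limits; these names are illustrative, not Cyc's."""
    max_seconds: Optional[float] = None  # time budget before the query is cut off
    max_answers: Optional[int] = None    # cap on the number of answers returned
    context: Optional[str] = None        # portion of the knowledge base to search


def run_query(
    candidates: Iterable[Tuple[str, str]],
    params: QueryParams,
    clock: Callable[[], float] = time.monotonic,
) -> List[str]:
    """Collect answers from (context, answer) pairs until a limit is reached."""
    deadline = None if params.max_seconds is None else clock() + params.max_seconds
    answers: List[str] = []
    for context, answer in candidates:
        if deadline is not None and clock() >= deadline:
            break  # time budget used up
        if params.max_answers is not None and len(answers) >= params.max_answers:
            break  # enough answers collected
        if params.context is not None and context != params.context:
            continue  # outside the portion this query may search
        answers.append(answer)
    return answers
```

Two callers can pass different QueryParams values against the same data and get different cost and result profiles, which is why such settings travel with the query rather than living in server-wide configuration.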
The following diagram provides a high-level overview of the Cyc software system, including some of the key server-external components that administrators might be called upon to manage. (See also the section on support components below.)
- Cyc runs on top of a Java virtual machine (Java version 6), which mediates all access to the operating system and its resources.
- At its core, Cyc combines a knowledge store and a reasoning engine.
- Server activity is logged to the log files.
- The majority of the knowledge is housed in read-only knowledge base files; modifications are kept in main memory and in journaled transcripts.
- There is a set of core tools for manipulating the knowledge store and interacting with the reasoning engine.
- In addition to the core tools, Cyc supports server-side scripting.
- All of that functionality is exposed to the outside world via a set of network APIs in the form of TCP/IP-based socket services.
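The split between a read-only baseline, in-memory modifications, and a journal kept for replay can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not Cyc's implementation: the JournaledStore class, the JSON-lines journal format, and the fact strings are all hypothetical and do not reflect Cyc's actual transcript format.

```python
import json
import os
from typing import Iterable, Set


class JournaledStore:
    """Toy store: baseline facts plus in-memory changes logged to a journal."""

    def __init__(self, baseline: Iterable[str], journal_path: str) -> None:
        self.facts: Set[str] = set(baseline)  # stands in for the read-only files
        self.journal_path = journal_path      # hypothetical journal location

    def _apply(self, op: str, fact: str) -> None:
        if op == "assert":
            self.facts.add(fact)
        elif op == "retract":
            self.facts.discard(fact)
        else:
            raise ValueError(f"unknown operation: {op!r}")

    def update(self, op: str, fact: str) -> None:
        """Change the in-memory state, then append the change to the journal."""
        self._apply(op, fact)
        with open(self.journal_path, "a", encoding="utf-8") as journal:
            journal.write(json.dumps({"op": op, "fact": fact}) + "\n")

    def replay(self) -> int:
        """Re-apply journaled changes in order; return how many were applied."""
        if not os.path.exists(self.journal_path):
            return 0
        applied = 0
        with open(self.journal_path, encoding="utf-8") as journal:
            for line in journal:
                if line.strip():
                    entry = json.loads(line)
                    self._apply(entry["op"], entry["fact"])
                    applied += 1
        return applied
```

In this model, a freshly started store holds only the baseline until replay() is called, which mirrors the point that unserialized changes are not automatically present after a restart.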