April 8, 2002
Who Is He? Doug Lenat is an artificial intelligence pioneer who is leading the human "memome" project, an effort to codify all the common sense in a person's head.
Austin, Texas-based Cycorp Inc. claims to be "the leading supplier of formalized common sense." CEO and founder Doug Lenat has labored 17 years to codify facts such as "Once people die, they stop buying things." He uses a form of symbolic logic called "predicate calculus" to classify and show the properties of information in a standard way.
The Cyc knowledge base adds power to applications by adding common-sense information on top of the domain-specific knowledge that occurs in every application, Lenat tells Computerworld's Gary H. Anthes.
"We see it as the next great thing," says Morrie Sigel, a partner at Atlantic Capital Partners LLC in Darien, Conn. "The knowledge base provides such a broad platform for a multiplicity of products that it's mind-boggling."
What have you accomplished so far? We've put in 600 person-years of effort, and we've assembled a knowledge base containing 3 million rules of thumb that the average person knows about the world, plus about 300,000 terms or concepts.
Can you give an example? Terms like "first date" and rules of thumb like "People are more polite on their first date than they are on their nth date." A lot of these things were true 50,000 years ago, like "If you are carrying a container that's open on one side, you should carry it with the open end up." The idea is to represent these in formal logic as opposed to English sentences. You want the machine to be able to crank through the logical deductions—the consequences of these assertions—the same way you or I would.
What will the knowledge base be used for? I see this more as a power source rather than a single application. [For any given application], you need common-sense knowledge and domain knowledge. We are building in the common-sense knowledge.
Are there any applications so far? Yes, it's called CycSecure, and we are beta-testing it. Cyc knows what are normal, legitimate actions and what are actions taken by hackers, [and it knows about operating system vulnerabilities]. It uses its [artificial intelligence] planning ability and knowledge of the world to come up with network attack plans. You tell it about your network, and instead of running canned exploits against it and doing the old-fashioned intrusion detection, you do hypothetical reasoning. You experiment on the model instead of the actual network.
What is OpenCyc? It's a daring gamble to gradually make everything in the Cyc knowledge base public. An initial release last week made available about 5,000 concepts and 50,000 axioms or assertions about them. We will gradually, over the next two years, migrate everything to the public mode. But OpenCyc will always lag by 24 to 30 months.
Are you continuing to add to Cyc? Yes. Cyc finally knows enough that it can actually help with the knowledge-entry process. It's changed in the past year from where we were entering these things by hand and writing them in logic to a kind of tutoring mode. For example, you say, "I want to tell you about a new kind of bacteria," and it might say, "What kinds of things does it kill? Is it similar to anything I know about already?" Up until now, the only people adding knowledge were a small priesthood of logicians. Now, suddenly, millions of people can add their knowledge to Cyc. Because of the acceleration, we'll be at 10 million assertions a year from now.
Won't input from the public bring in a lot of garbage? I'll have an OpenCyc committee to help vet knowledge that is suggested. Also, we've developed the notion of local consistency, which is analogous to our everyday notion of the earth as being locally flat and globally spherical. In the same way, we have divided the knowledge base into regions that are locally consistent, and all the inconsistent information is so far away that you can ignore it. If someone puts in "Dining room tables are made of Jello," that will contradict so many things in the "normal" part of the knowledge base that it automatically will get pushed out into the boonies.
Is Cyc like the human genome project, where eventually you will be done, or will it grow forever? I refer to it as the human "memome" project. A typical person knows about 100 million things about the world. I see us crossing that point in five years. It's difficult to predict the course thereafter.
The Cyc knowledge base uses predicate calculus to encode assertions such as “Animals sleep at home.”
(ForAll ?x (ForAll ?S (ForAll ?PLACE
(implies (and
(isa ?x Animal)
(isa ?S SleepingEvent)
(performer ?S ?x)
(location ?S ?PLACE))
(home ?x ?PLACE)))))
This says that if x is an animal and is the performer of a sleeping event, then the place where that event takes place is the home of x.
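To see how a machine might "crank through" a rule like this, here is a minimal Python sketch. It is not Cyc's inference engine or its API. The tuple encoding, the function name infer_homes, and the sample facts (Rex, Nap1, Kennel and so on) are all assumptions made up for illustration. It applies the single rule above once to a small set of ground facts.

```python
# Ground facts as (predicate, arg1, arg2) tuples.
# Every individual named here is hypothetical, invented for this sketch.
facts = {
    ("isa", "Rex", "Animal"),
    ("isa", "Nap1", "SleepingEvent"),
    ("performer", "Nap1", "Rex"),
    ("location", "Nap1", "Kennel"),
    ("isa", "Nap2", "SleepingEvent"),
    ("performer", "Nap2", "Desk"),   # Desk is never asserted to be an Animal
    ("location", "Nap2", "Office"),
}


def infer_homes(facts):
    """Apply the 'animals sleep at home' rule once and return the derived facts.

    If x is an Animal, s is a SleepingEvent, x performs s, and s is located
    at place, then derive (home, x, place).
    """
    animals = {x for (pred, x, cls) in facts if pred == "isa" and cls == "Animal"}
    sleeps = {s for (pred, s, cls) in facts if pred == "isa" and cls == "SleepingEvent"}

    derived = set()
    for (pred, s, x) in facts:
        # Match the (performer ?S ?x) clause with both isa conditions satisfied.
        if pred == "performer" and s in sleeps and x in animals:
            for (pred2, s2, place) in facts:
                # Match the (location ?S ?PLACE) clause for the same event.
                if pred2 == "location" and s2 == s:
                    derived.add(("home", x, place))
    return derived


print(sorted(infer_homes(facts)))
# prints [('home', 'Rex', 'Kennel')]
```

Only Rex gets a home, because the second sleeping event has a performer that was never declared an animal, so the left-hand side of the rule does not hold for it.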
by Gary H. Anthes