In a leviathan effort, scientists and philosophers teach computers common sense.
Be honest: how many times in the last week did you want to slap your mulish computer upside the monitor? Before you resort to violence, consider why our relationship with these supposedly smart machines is fraught with frustration, miscommunication and disappointment.
The problem is easy to diagnose, says computer scientist Douglas Lenat, PhD ’76, who heads Cycorp in Austin, Texas. Though they are crammed with data, computers understand less about the world than a 10-year-old child does. They don’t know that you’ll get wet if you go out in the rain, that parents are older than their children, or that Cal can’t win Big Game. They’re flummoxed by plain English and stumble over anomalous data.
To become smarter, the former Stanford professor argues, computers don’t need faster chips or bigger memories. They need an infusion of common sense—all those ordinary facts and assumptions about the world that enable people to survive and communicate with each other. Eighteen years ago, Lenat left academia for the private sector, gambling that he could teach this fundamental knowledge to computers.
The fruit of his work is Cyc, smart software that according to Lenat knows and applies common sense. Cyc’s schooling has consumed $60 million and 600 person-years of effort from programmers, philosophers and others—collectively known as Cyclists—who have been codifying what Lenat calls “consensus reality” and entering it into a massive database.
Much more than another killer app, Cyc will revolutionize our interactions with computers, Lenat asserts: “Cyc will underlie almost all applications that people run on computers.” Lenat believes that Cyc will not only give us easier-to-use, more powerful software—and reduce the number of assaults on computers—but will also be a big advance in artificial intelligence (AI), the effort to build machines and programs that can reason, learn, understand human languages and perform other brainy tasks.
No other AI project comes close to Cyc’s scale and ambition, says Nils Nilsson, MS ’56, PhD ’58, an emeritus professor of computer science at Stanford. Though AI researchers acknowledge that the “common sense problem” has to be cracked, most are trying to solve it part by part, he says. “I don’t know that anyone is trying to master all of common sense, apart from Doug.”
Because Cyc falls outside the mainstream of AI, skeptics initially declared it wouldn’t work—and many persist in saying so, in part because few have seen it in action. However, Cycorp plans to release a free, slimmed-down version known as OpenCyc, which will give the public its first chance to judge whether Lenat’s gamble paid off.
When he left Stanford at age 34, Lenat had already made his reputation by writing two programs that could learn on their own. The first, his dissertation project, was called AM, or Automatic Mathematician. Lenat would give the program a few mathematical concepts and rules for judging how interesting its discoveries were, and AM would search for new concepts. For instance, Lenat explains, AM independently discovered prime numbers.
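To get a feel for how a program might stumble onto primes, consider the toy sketch below. It is not AM’s actual code or method. The function names and the "interestingness" rule (favor the smallest divisor count shared by more than one number) are assumptions made up for illustration.

```python
# Toy illustration only: a made-up heuristic, not Lenat's AM.

def divisors(n):
    """Return every positive integer that divides n evenly."""
    return [d for d in range(1, n + 1) if n % d == 0]


def group_by_divisor_count(limit):
    """Group the numbers 1..limit by how many divisors each one has."""
    groups = {}
    for n in range(1, limit + 1):
        groups.setdefault(len(divisors(n)), []).append(n)
    return groups


def most_interesting(groups):
    """Hypothetical rule: extreme cases are interesting, so pick the
    smallest divisor count that more than one number shares."""
    candidates = [count for count, members in groups.items() if len(members) > 1]
    return min(candidates)


groups = group_by_divisor_count(30)
count = most_interesting(groups)
discovered = groups[count]
print(count, discovered)
```

Under these assumptions the program singles out the numbers with exactly two divisors, which is one way of defining the primes.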
Lenat proved the mettle of AM’s successor, Eurisko, by winning a national war-game tournament. Eurisko parsed the rule book—several hundred pages of minutiae, such as how extra armor changes the vulnerability and maneuverability of ships—and came up with some unorthodox designs for a battle fleet. One strategy allowed damaged ships to “commit suicide” to increase fleet maneuverability, and another deployed pesky flotillas of lightly armed ships. Lenat stomped all comers two years running; then the organizers asked him only half-jokingly not to come back.
Lenat still relishes that triumph, and Eurisko also provided a key lesson that inspired Cyc: computers that don’t know a lot can’t learn a lot. “So if you want computers not to be brittle [prone to breakdown] in their dealings with human beings, you need to prime the pump with vast quantities of consensus reality,” he says. That enormous endeavor would require far more work than one professor and a few grad students could accomplish.
A national crisis gave Lenat his opportunity. In 1982, fearing that Japanese companies would take over the global computer market, a consortium of U.S. high-tech companies launched the Microelectronics and Computer Technology Corporation (MCC). Its brief was to foster research that would help fend off the Japanese juggernaut. More important for Lenat, the participants—which included Motorola, Kodak and Hewlett-Packard—had plenty of cash for long-term projects. So in 1984, he moved to the Silicon Hills of Austin. MCC nurtured Cyc until Cycorp became a separate corporation in 1994. Since then, Cycorp has survived on contracts with government and private companies for specialized Cyc applications.
Cyc now lives and learns in Cycorp headquarters in northwest Austin. Its home is a brown brick building shaded year-round by live-oak trees. The hallways are almost silent, while in dark offices lit only by glowing computer screens, Cyc’s tutors, known as ontological engineers, are prepping it for its public debut. Almost every surface in Lenat’s office is stacked high with papers. Stocky, with dark hair almost free of gray, he wears a magenta shirt and black slacks.
If you expect Cyc to have grasping robotic arms or a droning android voice, you’ll be disappointed. It’s a “knowledge base,” basically a database and the software to use it. Although the name is short for encyclopedia, Cyc contains much more than facts. Lenat and his crew scanned newspapers, novels and ads to unearth the implicit assumptions that make text intelligible—in his words, “automating the white space.” For example, if you read “George Bush is in Washington,” you assume that Bush’s foot is also in Washington. We all know that, unless Bush has undergone amputation, his foot travels along with him. But a computer has to be told.
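The foot example can be spelled out in a few lines of code. What follows is a minimal sketch, not Cyc’s representation: the class `KnowledgeBase`, its fields and the method `where_is` are hypothetical names, and the rule it encodes (a part is wherever its whole is, unless the part has been detached) is a simplification of the assumption described above.

```python
# Toy sketch: none of these names come from Cyc itself.
from dataclasses import dataclass, field


@dataclass
class KnowledgeBase:
    located_in: dict = field(default_factory=dict)  # thing -> place, stated outright
    part_of: dict = field(default_factory=dict)     # part -> the whole it belongs to
    detached: set = field(default_factory=set)      # parts no longer attached

    def where_is(self, thing):
        """Return a stated location, or inherit one from the whole."""
        seen = set()
        while thing is not None and thing not in seen:
            if thing in self.located_in:
                return self.located_in[thing]
            if thing in self.detached:
                return None  # a detached part does not travel with its owner
            seen.add(thing)
            thing = self.part_of.get(thing)
        return None  # nothing known


kb = KnowledgeBase()
kb.located_in["George Bush"] = "Washington"
kb.part_of["Bush's foot"] = "George Bush"
print(kb.where_is("Bush's foot"))
```

The program was told only where Bush is and that the foot belongs to him. It works out the location of the foot from the rule, which is the kind of unstated step a human reader takes without noticing.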
After 18 years of painstaking tutorial sessions, Cyc now holds some 1.5 million mostly banal assertions of this kind, all rendered in a formal language developed for the purpose. A few examples:
Water is wet.
Every person has a mother.
When people die, they stay dead.
To make this storehouse of common sense usable, Cyc boasts an inference engine, software that allows it to reason from its knowledge. In addition, Cyc can handle contradictory assertions without blowing a circuit board, because its knowledge is divided into microtheories. For example, “Vampires don’t exist” resides in one microtheory and “Dracula is a vampire” in another.
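The idea behind microtheories can be sketched briefly. The code below is an illustration under stated assumptions, not Cyc’s design: the class `MicrotheoryStore`, the methods `tell` and `ask`, and the microtheory names are all invented here. Each microtheory keeps its own set of assertions, so a contradiction is an error only when it occurs inside a single microtheory.

```python
# Toy sketch: hypothetical names, not Cyc's API or its formal language.

class MicrotheoryStore:
    def __init__(self):
        # microtheory name -> {proposition: True or False}
        self._facts = {}

    def tell(self, microtheory, proposition, value=True):
        """Record an assertion, rejecting contradictions within one microtheory."""
        facts = self._facts.setdefault(microtheory, {})
        if facts.get(proposition, value) != value:
            raise ValueError(f"{proposition!r} contradicts {microtheory}")
        facts[proposition] = value

    def ask(self, microtheory, proposition):
        """Return True, False, or None when the microtheory is silent."""
        return self._facts.get(microtheory, {}).get(proposition)


store = MicrotheoryStore()
store.tell("RealWorld", "vampires exist", False)
store.tell("DraculaFiction", "vampires exist", True)
store.tell("DraculaFiction", "Dracula is a vampire", True)

print(store.ask("RealWorld", "vampires exist"))
print(store.ask("DraculaFiction", "vampires exist"))
```

Asked about vampires, the store gives a different answer depending on which microtheory the question is posed in, and neither answer disturbs the other.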
Lenat rattles off a long list of possible uses for Cyc, starting with smart web searchers that find what you want because they understand content, instead of just matching key words. Cyc could automatically cleanse files, databases and spreadsheets of errors and inconsistencies.
Cyc might also be the final piece in two efforts: developing speech-understanding software that doesn’t trip over the many ambiguities of English, and giving computers the ability to interpret “natural language,” that is, plain English. If a word processor could interpret text, it could see, for example, that you promised to discuss three topics and delivered only two. E-mail programs could read, summarize, annotate and even answer your messages.
Will his brainchild live up to its billing? Vaughan Pratt, one of the few independent experts to interrogate Cyc, expresses doubt. The emeritus professor of computer science at Stanford quizzed Cyc in a short session in 1994. He says its reasoning ability and knowledge fell short of the claims made for it. “Cyc knows that soup comes in eight-ounce quantities, but it doesn’t know that you die without food.” Although the program now contains about three times as many assertions as the version he tried, Pratt, PhD ’72, thinks the real problem lies in Cyc’s premise. Instead of stuffing computers full of knowledge to make intelligent machines, he says, we need to focus on improving their ability to reason and manipulate facts.
Lenat derides this approach as the result of physics envy. He says many AI researchers are consumed with finding the “Maxwell’s equation of human thought”—a simple, elegant formulation that “you could put on a T-shirt and that would unlock the secret of intelligence.” Until that happens, Lenat says, his way is the only way to get a computer to learn common sense.
Of course, success is the best reply to criticism. Cycorp has contracts with the Defense Department and companies like GlaxoSmithKline, and it easily won a 1998 Defense competition for intelligent databases. Another test should come sometime this year with the release of OpenCyc, the free, open-source version containing about 60,000 core assertions and 5,000 core concepts. Anyone will be able to download a copy, and that will enable Cyc to start learning from conversations with people in the outside world. That’s a big step, but Cyc is 18 years old. About time it got away from home.
Mitchell Leslie lives in Albuquerque, N.M., and is a frequent contributor to Science and Stanford magazines.