
In the past year, over 40,000 copies of OpenCyc have been downloaded from SourceForge. How should these separate Cyc-enabled computers collaborate to create a whole that is much smarter than the sum of its parts? And while one Cyc server can easily support a dozen clients at the same time, handling operations from thousands of clients would overwhelm a compute-bound Cyc server. How do we address this scalability issue?

The current state of the art suggests that a super-peer network is the solution. Cyc's context mechanism provides a natural partitioning of knowledge that is suitable for a distributed implementation.

In fact, this is how humans work: we specialize and collaborate. Although most humans share a core of common knowledge which allows them to communicate, no one human possesses all the knowledge of the human race. Instead, we have experts in law, medicine, construction, automotive repair, and software design. When a human lacks the knowledge to solve a problem, he can often find a solution by collaborating with an expert. When you go to the doctor with a persistent headache, the doctor's knowledge of medicine combines with your knowledge of your symptoms to produce a solution that neither could have reached alone.

In a distributed Cyc architecture, the network is populated with Cyc agents, each of which shares a common core of knowledge as well as a unique extension of expert knowledge in a particular domain. Most importantly, the various Cyc agents are endowed with the ability to communicate with each other and to perform inferencing in a collaborative fashion. The inter-agent communication can be handled flexibly by using CycML, OWL, or some other knowledge-sharing protocol, or it can be implemented using a more efficient Cyc-specific protocol.

Cyc agents in a distributed architecture share knowledge describing how they can be reached (via which network address, port, and protocol) and what their areas of expertise are. During inferencing, when an agent tries to expand a formula whose content lies outside its area of expertise, it determines (by consulting its own knowledge base or consulting with a central knowledge-broker agent) whether another agent is available that might be able to help. If so, the agent sends a message to its remote counterpart asking it to help expand the formula in question. The answer may be a complete set of bindings, but more commonly it is simply a partial result, still containing unbound variables. The local agent incorporates the result into its own proof tree just as if it had done the work itself, and continues its task. In many cases, the local agent may need to consult the remote agent, or several remote agents, multiple times in the course of its inferencing process.
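The lookup-and-delegate step described above can be sketched roughly as follows. This is a minimal illustration, not Cyc's implementation: the `Agent` and `Broker` classes, the use of predicate names as "areas of expertise", and the example addresses are all assumptions, and the remote call is simulated with a direct method call rather than a network message.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """Hypothetical agent record: how to reach it and what it knows about."""
    name: str
    address: str                     # e.g. "host:port"; illustrative only
    expertise: set                   # predicates this agent is expert in
    facts: dict = field(default_factory=dict)  # predicate -> list of tuples

    def expand(self, predicate, broker):
        """Return bindings for a predicate, delegating when it is outside our expertise."""
        if predicate in self.expertise:
            return list(self.facts.get(predicate, []))
        remote = broker.find_expert(predicate, exclude=self)
        if remote is None:
            return []                # no agent can help with this formula
        # A real system would send a message to remote.address here.
        return remote.expand(predicate, broker)


class Broker:
    """Hypothetical central knowledge-broker mapping expertise to agents."""

    def __init__(self):
        self.agents = []

    def register(self, agent):
        self.agents.append(agent)

    def find_expert(self, predicate, exclude=None):
        for agent in self.agents:
            if agent is not exclude and predicate in agent.expertise:
                return agent
        return None


broker = Broker()
geo = Agent("GeoAgent", "geo.example:9000", {"northOf"},
            {"northOf": [("Britain", "Equator")]})
pol = Agent("PolAgent", "pol.example:9000", {"headOfGovernmentOf"},
            {"headOfGovernmentOf": [("TonyBlair", "Britain")]})
broker.register(geo)
broker.register(pol)

# The GeoAgent has no political knowledge, so it delegates to the PolAgent.
print(geo.expand("headOfGovernmentOf", broker))
```

In this sketch the delegating agent simply returns the remote result; the text above describes a richer process in which partial results with unbound variables are merged into the local proof tree.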

Cycorp has cooperated with the Computer Science Department of the University of Maryland, Baltimore County (UMBC) to develop a demo of such a distributed architecture. In the demo, three Cyc agents communicate with each other using KQML. While all three agents possess the same core knowledge base, each also possesses knowledge about one further domain in which it is considered to be an expert: the GeoAgent in geography, the PolAgent in politics, and the EcoAgent in economics. Working together, they can answer queries that none of them could have answered alone.
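To give a feel for what an inter-agent request might look like, here is a small sketch that renders a KQML-style message. The `ask-all` performative and the `:sender`, `:receiver`, `:language`, `:reply-with`, and `:content` parameters are standard KQML, but the specific values (the `CycL` language tag, the `q1` label, the query) are assumptions; the source does not show the messages actually exchanged in the demo. The helper `kqml_message` is hypothetical.

```python
def kqml_message(performative, **params):
    """Render a KQML-style message as an s-expression string.

    Keyword names use underscores in Python and are written with
    hyphens in the message (reply_with -> :reply-with).
    """
    lines = [f"({performative}"]
    for key, value in params.items():
        lines.append(f"  :{key.replace('_', '-')} {value}")
    return "\n".join(lines) + ")"


# Assumed example: the GeoAgent asks the PolAgent for all bindings of one clause.
msg = kqml_message(
    "ask-all",
    sender="GeoAgent",
    receiver="PolAgent",
    language="CycL",
    reply_with="q1",
    content='"(#$headOfGovernmentOf ?x ?y)"',
)
print(msg)
```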

For example, suppose a user asks the GeoAgent for "elected heads of government of countries north of the equator". This might be represented as:

(#$and
   (#$headOfGovernmentOf ?x ?y)
   (#$hasAttributes ?x #$Elected)
   (#$northOf ?y #$Equator))

The GeoAgent is able to find bindings for the third clause by using its own knowledge of the geography domain:

  • Britain is in Europe.
  • Europe is in the northern hemisphere.
  • The northern hemisphere is north of the equator.
  • If region A is part of region B, and region B is north of region C, then region A is north of region C.
  • Therefore, Britain is north of the equator.
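The chain of reasoning above can be sketched as a tiny forward-chaining loop. This is only an illustration under stated assumptions: the facts are stored as plain Python tuples, "is in" is treated as the "part of" relation from the rule, and the function name `derive_north_of` is hypothetical. Cyc's own inference engine works quite differently.

```python
# Toy facts mirroring the list above (representation is an assumption).
part_of = {("Britain", "Europe"), ("Europe", "NorthernHemisphere")}
north_of = {("NorthernHemisphere", "Equator")}


def derive_north_of(part_of, north_of):
    """Apply partOf(A, B) and northOf(B, C) => northOf(A, C) until a fixed point."""
    derived = set(north_of)
    changed = True
    while changed:
        changed = False
        for a, b in part_of:
            for b2, c in list(derived):
                if b == b2 and (a, c) not in derived:
                    derived.add((a, c))
                    changed = True
    return derived


result = derive_north_of(part_of, north_of)
print(sorted(result))
# [('Britain', 'Equator'), ('Europe', 'Equator'), ('NorthernHemisphere', 'Equator')]
```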

But to find bindings for the first two clauses, the GeoAgent must enlist outside help. It sends these as queries to the PolAgent, which is able to find bindings for them using its knowledge of the politics domain:

  • Heads of government of democratic countries are elected.
  • Britain is a democratic country.
  • Tony Blair is the head of government of Britain.
  • Therefore, Tony Blair is the elected head of government of Britain.

When the GeoAgent receives this answer back from the PolAgent, it combines it with its own partial answer to produce the final result: TonyBlair.
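Combining the two partial answers amounts to joining binding sets on their shared variable, here ?y. The following is a minimal sketch assuming bindings are represented as Python dictionaries; the `join` helper and the placeholder entries `LeaderZ` and `CountryZ` are hypothetical and exist only to show a binding that the join filters out.

```python
def join(left, right):
    """Merge binding dicts that agree on every variable they share."""
    merged = []
    for l in left:
        for r in right:
            shared = l.keys() & r.keys()
            if all(l[v] == r[v] for v in shared):
                merged.append({**l, **r})
    return merged


# GeoAgent's own bindings for (#$northOf ?y #$Equator).
geo_bindings = [{"?y": "Britain"}]

# Bindings returned by the PolAgent for the first two clauses.
# The second entry is a made-up placeholder, not a real-world fact.
pol_bindings = [
    {"?x": "TonyBlair", "?y": "Britain"},
    {"?x": "LeaderZ", "?y": "CountryZ"},
]

answers = [b["?x"] for b in join(geo_bindings, pol_bindings)]
print(answers)  # ['TonyBlair']
```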

For more details on the Cycorp/UMBC demo of the distributed Cyc architecture, see "The Cycic Friends Network: getting Cyc agents to reason together", a paper describing the project which was presented at the 1995 CIKM conference. Also see The UMBC KQML Web, a page at UMBC describing the concept in greater detail.

Although the description above assumes that all the agents participating in the distributed architecture are Cyc agents, this is not a requirement. Cycorp has defined a very simple protocol for implementing cooperative inferencing, and any agent that adheres to this protocol and possesses knowledge of interest to other agents can meaningfully participate in such an architecture. For instance, the role normally played by a Cyc agent in a distributed architecture could equally well be filled by a gateway to:

  • an expert system implemented in Prolog
  • an SQL database
  • a special-purpose inferencing tool (e.g. for spatial inferencing)
  • a human expert
  • etc.

In fact, the WWW Information Retrieval application builds on the distributed Cyc architecture by filling that role with a gateway to a WWW information source.
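As one illustration of the gateway idea, the sketch below wraps an SQL database so that it can answer a single predicate with variable bindings. It does not reproduce Cycorp's protocol, which is not specified here: the `SqlGateway` class, its `ask` method, the table layout, and the convention that variables start with "?" are all assumptions made for the example.

```python
import sqlite3


class SqlGateway:
    """Hypothetical gateway that answers one predicate from one table."""

    def __init__(self, conn, predicate, table, columns):
        self.conn = conn
        self.predicate = predicate
        # Table and column names are interpolated into SQL, so they must
        # come from trusted configuration, never from a remote agent.
        self.table = table
        self.columns = columns

    def ask(self, predicate, args):
        """Return a list of binding dicts for the variables in args.

        args is a tuple of strings; those starting with '?' are variables,
        the rest are constants used to filter rows.
        """
        if predicate != self.predicate or len(args) != len(self.columns):
            return []
        where, values = [], []
        for col, arg in zip(self.columns, args):
            if not arg.startswith("?"):
                where.append(f"{col} = ?")
                values.append(arg)
        sql = f"SELECT {', '.join(self.columns)} FROM {self.table}"
        if where:
            sql += " WHERE " + " AND ".join(where)
        rows = self.conn.execute(sql, values).fetchall()
        return [
            {arg: val for arg, val in zip(args, row) if arg.startswith("?")}
            for row in rows
        ]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE heads (person TEXT, country TEXT)")
conn.execute("INSERT INTO heads VALUES ('TonyBlair', 'Britain')")

gateway = SqlGateway(conn, "headOfGovernmentOf", "heads", ["person", "country"])
print(gateway.ask("headOfGovernmentOf", ("?x", "Britain")))  # [{'?x': 'TonyBlair'}]
```

To the other agents, such a gateway would look like any other participant: it receives a formula and returns bindings, regardless of how the answer was computed.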




Copyright © 2002-2010 Cycorp, Inc. All Rights Reserved.
