Incompleteness from Resource Bounds and Continuable Searches

Inference in Cyc

We described earlier how inference can be viewed as a sequence of logical deductions chained together.  At each point along the way, there are often several different ways you could perform the next deduction.  So, in effect, from what you start with there is a branching set of possibilities for how you can prove it, and that branching set can spread out in all kinds of different, novel ways.

Cyc is Life in the Big City

A former Cyclist referred to the explosiveness of these searches as “Life in the Big City.”  We’re not dealing with “toy” problems, where a small number of terms is talked about by a small number of assertions; this is life in the Big City.  We have on the order of a million assertions and on the order of a hundred thousand terms being talked about.  So we very often run into huge potential proofs, because chaining all of these things together makes the search both deep and broad.  Because of this we cannot search these trees to exhaustion, so we take a completely different approach.
 

Inference is Resource-bounded

We search within some specified resource limits: inference in our system is “resource-bounded.”  By “resources” we mean the resources available for doing inference.  Some resource bounds we can impose are the amount of time to spend working, the number of answers to find, the depth the search may reach (a measure of the complexity of the proof), and the number of rules that may be chained together (chaining rules produces child nodes that can be more complex than the original query).  By circumscribing a set of parameters that reflect the real-world resources available for having an automaton grind out these inferences, we turn the algorithm on its head.  It’s not the algorithm’s responsibility to “magically” figure out how much resource it’s supposed to use; instead, we give the application that’s invoking the inference control over how much resource to spend in answering the original query: how many seconds to spend thinking, how many “yes’s” to come up with, how deep to go, how complex (rule-wise) to get, and so on.  This gives the application the choice of where to set the boundaries for the search.

So, the application has a way to determine how much of the tree to let the inference search over.  The inference then tries to find the best answers within those resource boundaries.  It is the inference mechanism's own responsibility to figure out, given the resources available, how much time and effort to spend on each particular step -- while adhering to the overall resource restrictions imposed from the outside.
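As a concrete illustration, here is a minimal sketch of what a resource-bounded query interface might look like from the application's side.  The names (`ResourceBounds`, `InferenceEngine`, `ask`) and the CycL-style query string are hypothetical, not the actual Cyc API; the point is only that the caller, not the inference algorithm, decides the bounds.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ResourceBounds:
    """Caller-specified limits on how hard the inference engine may work.
    None means 'unbounded' along that dimension."""
    max_seconds: Optional[float] = None   # how many seconds to spend thinking
    max_answers: Optional[int] = None     # how many "yes's" to come up with
    max_depth: Optional[int] = None       # how deep to go (proof complexity)
    max_rule_chain: Optional[int] = None  # how many rules may be chained together

# Hypothetical usage -- the application chooses the boundaries of the search:
#
#   engine = InferenceEngine(knowledge_base)
#   answers = engine.ask("(bordersOn ?STATE Tennessee)",
#                        bounds=ResourceBounds(max_seconds=5, max_answers=8))
```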
 

Resource-bounded Incompleteness

In effect, we are allowing resource-bounded incompleteness, where the incompleteness comes from a deliberate decision about how complete the search should be.  For certain queries you can say “run forever,” “give me every answer you can get,” “go as deep as you need,” or “use as many rules as you want.”  That is basically just turning it back over to Cyc and letting it give you everything.  It could go on forever, so you almost always want to give it some resource bounds, some reason to quit.  As we’ve seen, in many interesting cases, if you don’t give it a reason to quit, it won’t.  It’ll go on until it runs out of memory on the computer or they shut off the lights on the universe.

Deciding how much work to do is how we deal with “Life in the Big City.”  One aspect of this solution is that if you let the inference quit early, you should also give it a way to continue on from where it left off.

Inference is Continuable

The other main aspect of addressing this incompleteness is making our inferences continuable.  If I spend some resources to do a little bit of work, and the inference quits because it comes up with an answer, and I then decide to do some more work on it, I shouldn't have to start from scratch.  I should be able to tell it to pick up where it left off: use this many more resources, run for ten more seconds, give me one more answer, or go one step deeper in the tree.  Inferences in our system are explicitly continuable, and this answers a direct need that arises from letting them quit early -- they can pick up where they left off.
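To make the idea concrete, here is a toy sketch of a continuable search, in Python and not resembling Cyc's actual implementation.  The essential point is that the frontier of unexplored nodes and the answers found so far live in an object that survives between calls, so asking for more work resumes the search instead of restarting it.

```python
import time
from collections import deque

class ContinuableSearch:
    """Toy illustration of a continuable, resource-bounded search
    (not Cyc's actual machinery)."""

    def __init__(self, start, neighbors, is_goal):
        self.neighbors = neighbors      # node -> iterable of successor nodes
        self.is_goal = is_goal          # node -> bool: is this node an answer?
        self.frontier = deque([start])  # open nodes still waiting to be expanded
        self.visited = {start}
        self.answers = []

    def run(self, max_seconds=None, max_answers=None):
        """Expand nodes until a resource bound is hit ('suspended')
        or nothing is left to explore ('exhausted')."""
        deadline = None if max_seconds is None else time.monotonic() + max_seconds
        found = 0
        while self.frontier:
            if deadline is not None and time.monotonic() > deadline:
                return "suspended"                 # out of time; state is preserved
            node = self.frontier.popleft()
            for nxt in self.neighbors(node):       # expand first, so suspending below
                if nxt not in self.visited:        # never loses this node's children
                    self.visited.add(nxt)
                    self.frontier.append(nxt)
            if self.is_goal(node):
                self.answers.append(node)
                found += 1
                if max_answers is not None and found >= max_answers:
                    return "suspended"             # enough answers for now
        return "exhausted"                         # the search can never yield more
```

A caller could do `search.run(max_answers=1)` to get a first answer quickly, and later `search.run(max_seconds=10)` to spend ten more seconds on the very same search.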

Proof Search Could be Stored With Meta Data

Making inference continuable has a direct consequence for how the proof search itself is handled.  The search structure built so far -- the tree of deductions explored, the answers found, and the open nodes still waiting to be expanded -- could be stored rather than thrown away when a resource bound is hit.  Meta data about the search could be stored along with it: which bounds were in force, how much of each resource has already been consumed, and how far the exploration has gotten.  With the stored search and its meta data, a request to continue can pick up exactly where the earlier work stopped instead of redoing it.

 

Discarding the Search Structure

There will come a time, at the end, when the system knows it is finished with the structure, and that is the point where it cleans up and throws it all away.  Here’s an example: I want to know all of the states that border on Tennessee.  After finding Missouri, which is one goal, the inference mechanism will continue the search.  Eventually it will find all eight states and be unable to give any more answers -- it will have exhausted the potential search.  At that point it may determine that the search is finished and can never yield anything more, or it may be told not to bother trying anything more; either way, it can now clean up the data structure.
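Continuing the toy `ContinuableSearch` sketch from the section on continuable inference (still purely illustrative, with a hand-written fragment of border data), exhaustion is the signal that the stored structure can safely be discarded:

```python
# Illustrative data only: the eight states bordering Tennessee.
borders = {
    "Tennessee": ["Kentucky", "Virginia", "North Carolina", "Georgia",
                  "Alabama", "Mississippi", "Arkansas", "Missouri"],
}

search = ContinuableSearch(
    start="Tennessee",
    neighbors=lambda s: borders.get(s, []),
    is_goal=lambda s: s != "Tennessee",   # every state reached from Tennessee is an answer
)

search.run(max_answers=1)                 # stop after the first bordering state is found
status = search.run()                     # continue with no bounds: exhaust the search
assert status == "exhausted" and len(search.answers) == 8
del search                                # nothing more can come of it; discard the structure
```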

The Search Tree is a Metaphor

The search tree is really a metaphor for the work of actually chaining deductions together.  There are a lot of ways that you can do this kind of proof.

For example, you could have better reasoning that recognizes a particular line of reasoning isn’t worth the bother -- because some future step along it is going to be unprovable -- and prunes it off.  That calls for additional mechanisms for reasoning about how to prune off whole chunks of the tree: mechanisms that identify which chunks of the tree are likely to pay off and which are not, so that the search can be refocused (a small sketch of this idea follows at the end of this section).

For another example, let’s say that you have three machines available for searching.  There will be a mechanism to assign each of the machines to search different parts of the tree.

So we have lots of possible extensions to this whole search metaphor that we’re looking forward to creating.  Viewed with this metaphor, we’re just trying to discover the most intelligent and profitable way of chaining deductions together.
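As a sketch of the pruning idea (an imagined extension, not an existing Cyc mechanism), the expansion step of the toy search above could consult a caller-supplied judgment about whether a branch can still pay off before that branch ever reaches the frontier:

```python
def expand_with_pruning(search, node, worth_expanding):
    """Like the expansion step in ContinuableSearch, but branches judged
    unlikely to pay off are pruned before they enter the frontier.
    `worth_expanding` is a hypothetical predicate, e.g. 'could any future
    step of this line of reasoning actually be proved?'"""
    for nxt in search.neighbors(node):
        if nxt in search.visited or not worth_expanding(nxt):
            continue                      # prune off this whole chunk of the tree
        search.visited.add(nxt)
        search.frontier.append(nxt)
```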

Work Harder or Smarter? Deep Blue Example

Let’s consider again the idea of “working harder versus working smarter.”  A good analogy for this is the Deep Blue chess project.  In this analogy, we’re searching through possible moves in chess.  Each node in the search tree, or data structure, is a possible state of the chess board.  Each step down from the root represents a possible move by player A (player A could move his knight and go down one path, or he could move his pawn and go down another path, and so on).  Then it’s the opponent’s turn: what could player B do now?  The number of possible states of the board is astronomically large.  With every move that is actually made, you go a node deeper, so an actual game is represented by a path through this huge tree of possible chess games.  Eventually you come to “Game Over” at the end, where a draw is called, or a checkmate, or something else ends the game.

Work Harder or Smarter? Deep Blue Example (cont.’d)

The search tree would have to represent every possible arrangement of pieces, so every game is somewhere in that tree.  This tree is huge, and your goal is to figure out which moves to make in order to reach a position where you are more likely to win than your opponent.  That gives you two options: you can “work smarter” or you can “work harder.”

If you “work harder,” you search as much of the tree below your current board position as you can, looking for the moves that are likely to leave you in a favorable position.  In the history of chess-playing computer programs, there have been approaches that just crank out as much of the tree as possible and then apply standard board-strength heuristics to judge which move would be strongest for a given player (over the history of chess, people have developed very good heuristics for judging whether the current state of the board favors the white pieces or the black pieces).

In the Deep Blue project, they built a massively parallel machine that carves off huge chunks of this chess space.  At each step it looks at the current state of the board and then considers an enormous number of possible future moves by both players.  It distributes the problem across a very large number of processors, which search the essentially independent possible paths in parallel.  It then plays the strongest move it found within that search.
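Here is a minimal sketch of the “work harder” style of search: depth-limited minimax that cranks out as much of the tree as its depth bound allows and falls back on a board-strength heuristic at the leaves.  This is generic game-tree search, not Deep Blue’s actual algorithm; `legal_moves`, `apply_move`, and `board_strength` are placeholders a real program would have to supply.

```python
def minimax(board, depth, maximizing, legal_moves, apply_move, board_strength):
    """Heuristic value of `board`, looking `depth` plies ahead."""
    moves = legal_moves(board)
    if depth == 0 or not moves:           # bottom of the carved-off chunk of the tree
        return board_strength(board)      # fall back on the board-strength heuristic
    values = [minimax(apply_move(board, m), depth - 1, not maximizing,
                      legal_moves, apply_move, board_strength)
              for m in moves]
    return max(values) if maximizing else min(values)

def best_move(board, depth, legal_moves, apply_move, board_strength):
    """Pick the move whose subtree scores best for the player to move."""
    return max(legal_moves(board),
               key=lambda m: minimax(apply_move(board, m), depth - 1, False,
                                     legal_moves, apply_move, board_strength))
```

Because the subtrees under each candidate move are independent, the per-move `minimax` calls in `best_move` are exactly the kind of work that can be farmed out across many processors.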

Work Harder or Smarter? Deep Blue Example (cont.’d)

Deep Blue does have a bit of the “work smarter” aspect, in that it uses the board-strength heuristic to decide which paths to go down.  That heuristic makes the search far more efficient along the way, but it’s not a very smart thing: it only gets applied locally, and it doesn’t prune off large enough chunks of the tree to really count as working “smarter.”

Deep Blue is an outstanding chess player. But as we’ve seen, such brute-strength approaches will not work in the “Big City.”

Summary
  • Life in the Big City
  • Inference is Resource-bounded
  • Resource-bounded Incompleteness
  • Inference is Continuable
    • Proof Search is Stored
    • Meta data could be stored
  • Deep Blue: Working Harder

This concludes the section on Incompleteness from Resource Bounds and Continuable Searches.