What exactly is the halting problem?

poundifdef picture poundifdef · Jul 10, 2009 · Viewed 22.4k times · Source

Whenever people ask about the halting problem as it pertains to programming, people respond with "If you just add one loop, you've got the halting program and therefore you can't automate task"

Makes sense. If your program has an infinite loop, then when your program is running, you have no way of knowing whether the program is still crunching input, or if it is just looping infinitely.

But some of this seems counter intuitive. What if I was writing a halting problem solver, which takes source code as its input. rascher@localhost$ ./haltingSolver source.c

If my code (source.c) looks like this:

for (;;) {  /* infinite loop */  }

It seems like it'd be pretty easy for my program to see this. "Look the loop, and look at the condition. If the condition is just based on literals, and no variables, then you always know the outcome of the loop. If there are variables (eg while (x < 10)), see if those variables are ever modified. If not, then you always know the outcome of the loop."

Granted, these checks would not be trivial (calculating pointer arithmetics, etc) but it does not seem impossible. eg:

int x = 0
while (x < 10) {}

could be detected. along with - albeit not trivially:

int x = 0
while (x < 10)
{
   x++;
   if (x == 10)
   {
      x = 0
   }
}

Now what about user input? That is the kicker, that is what makes a program unpredictable.

int x = 0;
while (x < 10) 
{
   scanf("%d", &x); /* ignoring infinite scanf loop oddities */
}

Now my program can say: "If the user enters a 10 or greater, the program will halt. On all other input, it will loop again."

Which means that, even with hundreds of inputs, one ought to be able to list the conditions on which the program will stop. Indeed, when I write a program, I always make sure someone has the ability to terminate it! I am not saying that the resulting list of conditions is trivial to create, but it doesn't seem impossible to me. You could take input from the user, use them to calculate pointer indexes, etc - but that just adds to the number of conditions to ensure the program will terminate, doesn't make it impossible to enumerate them.

So what exactly is the halting problem? What am I not understanding about the idea that we cannot write a problem to detect infinite loops? Or, why are "loops" such an oft-cited example?

UPDATE

So, let me change the question a little bit: what is the halting problem as it applies to computers? And then I will respond to some of the comments:

Many people have said that the program must be able to deal with "any arbitrary input." But in computers, there isn't ever any arbitrary input. If I only input a single byte of data, than I only have 2^8 possible inputs. So, as an example:

int c = getchar()

switch (c) {
   case 'q':
      /* quit the program */
}

All of the sudden, I have just accounted for all of the possibilities. If c has the bit pattern 0x71, it does one thing. For all other patterns, it does something else. Even a program that accepts arbitrary string input is never really "arbitrary", since resources are finite, which means that while the theory of "arbitrary" applies... it isn't exactly one-to-one with the practice.

The other example people cited is this:

while (n != 1)
    if (n & 1 == 1) 
        n = 3 * n + 1;
    else 
        n /= 2;

If n is a 32-bit integer... then I can visually tell you whether or not this will halt.

I guess this edit isn't asking anything, but the most convincing example I've seen is this one:

Assume that you have your magical program/method to determine that a program halts.

public bool DeterminesHalt(string filename, string[] args){
    //runs whatever program you tell it do, passing any args
    //returns true if the program halts, false if it doesn't
}

Now lets say we write a small piece of code such as...

public static void Main(string[] args){
    string filename = Console.ReadLine(); //read in file to run from user
    if(DeterminesHalt(filename, args))
        for(;;);
    else
        return;
}

So for this example, we can write a program to do the exact opposite of our magical halting method does. If we somehow determine that a given program will halt, we just hop into an infinite loop; otherwise if we determine that the program is in an infinite loop, we end the program.

Then again, if you intentionally write a program which contains an infinite loop... "solving the halting problem" is kind of moot, isn't it?

Answer

Bob Cross picture Bob Cross · Jul 10, 2009

EDIT (much later than original answer): MarkCC of Good Math, Bad Math recently wrote up an excellent discussion of the Halting problem with concrete examples.

The halting problem is basically a formal way of asking if you can tell whether or not an arbitrary program will eventually halt.

In other words, can you write a program called a halting oracle, HaltingOracle(program, input), which returns true if program(input) would eventually halt, and which returns false if it wouldn’t?

The answer is: no, you can’t.

Following up on questions about whether the input to the Halting problem is relevant or a red herring: Yes, the input is important. Also, there seems to be some confusion in that I see "infinite" being used where "arbitrary" is more correct.

Practical example: Imagine that you are working in a QA position and you are to write a halting checker program (aka an oracle) that will confirm that for any arbitrary program written by the development team (D) and any arbitrary input provided by the end-user (I), program D will eventually halt when given input I.

Cue manager voice: "Ho ho, those goofy users, let's make sure that no matter what garbage they type, our server tasks will never end up in an endless loop. Make it so, code monkey!"

This seems like a great idea, right? You don't want your server to hang, right?

What the halting problem is telling you is that you are being handed an unsolvable task. Instead, in this particular case, you need to plan for tasks that run past a threshold time and be ready to cancel them.

Mark uses code instead of input to illustrate the problem:

def Deciever(i):
  oracle = i[0]
  in = i[1]
  if oracle(Deceiver, i):
    while True:
      continue
  else:
    return i

In my discussion in the comments, I went the route of malicious input manipulation to force an unsolvable problem. Mark's example is far more elegant, using the halting oracle to defeat itself:

So, the input to Deceiver is actually a list of two elements: the first one is a proposed halting oracle. The second is another input. What the halting killer does is ask the Oracle: “Do you think I’ll halt for input i?”. If the oracle says, “Yes, you’ll halt”, then the program goes into an infinite loop. If the oracle says “No, you won’t halt”, then it halts. So no matter what the oracle says, it’s wrong.

Said another way, without cheating, reformatting inputs, countable / uncountable infinities or anything other distractions, Mark has written a piece of code that can defeat any halting oracle program. You cannot write an oracle that answers the question of whether Deceiver ever halts.

Original answer:

From the great Wikipedia:

In computability theory, the halting problem is a decision problem which can be stated as follows: given a description of a program and a finite input, decide whether the program finishes running or will run forever, given that input.

Alan Turing proved in 1936 that a general algorithm to solve the halting problem for all possible program-input pairs cannot exist. We say that the halting problem is undecidable over Turing machines. Copeland (2004) attributes the actual term halting problem to Martin Davis.

One of the critical points is that you have no control over either the program or the input. You are handed those and it's up to you to answer the question.

Note also that Turing machines are the basis for effective models of computability. Said another way, everything that you do in modern computer languages can be mapped back to these archetypical Turing machines. As a result, the halting problem is undecidable in any useful modern language.