How does the Brainfuck Hello World actually work?

speeder picture speeder · May 30, 2013 · Viewed 70.6k times · Source

Someone sent this to me and claimed it is a hello world in Brainfuck (and I hope so...)

++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>.

I know the basics that it works by moving a pointer and increment and decrementing stuff...

Yet I still want to know, how does it actually work? How does it print anything on the screen in the first place? How does it encode the text? I do not understand at all...

Answer

Scony picture Scony · Nov 8, 2013

1. Basics

To understand Brainfuck you must imagine infinite array of cells initialized by 0 each.

...[0][0][0][0][0]...

When brainfuck program starts, it points to any cell.

...[0][0][*0*][0][0]...

If you move pointer right > you are moving pointer from cell X to cell X+1

...[0][0][0][*0*][0]...

If you increase cell value + you get:

...[0][0][0][*1*][0]...

If you increase cell value again + you get:

...[0][0][0][*2*][0]...

If you decrease cell value - you get:

...[0][0][0][*1*][0]...

If you move pointer left < you are moving pointer from cell X to cell X-1

...[0][0][*0*][1][0]...

2. Input

To read character you use comma ,. What it does is: Read character from standard input and write its decimal ASCII code to the actual cell.

Take a look at ASCII table. For example, decimal code of ! is 33, while a is 97.

Well, lets imagine your BF program memory looks like:

...[0][0][*0*][0][0]...

Assuming standard input stands for a, if you use comma , operator, what BF does is read a decimal ASCII code 97 to memory:

...[0][0][*97*][0][0]...

You generally want to think that way, however the truth is a bit more complex. The truth is BF does not read a character but a byte (whatever that byte is). Let me show you example:

In linux

$ printf ł

prints:

ł

which is specific polish character. This character is not encoded by ASCII encoding. In this case it's UTF-8 encoding, so it used to take more than one byte in computer memory. We can prove it by making a hexadecimal dump:

$ printf ł | hd

which shows:

00000000  c5 82                                             |..|

Zeroes are offset. 82 is first and c5 is second byte representing ł (in order we will read them). |..| is graphical representation which is not possible in this case.

Well, if you pass ł as input to your BF program that reads single byte, program memory will look like:

...[0][0][*197*][0][0]...

Why 197 ? Well 197 decimal is c5 hexadecimal. Seems familiar ? Of course. It's first byte of ł !

3. Output

To print character you use dot . What it does is: Assuming we treat actual cell value like decimal ASCII code, print corresponding character to standard output.

Well, lets imagine your BF program memory looks like:

...[0][0][*97*][0][0]...

If you use dot (.) operator now, what BF does is print:

a

Because a decimal code in ASCII is 97.

So for example BF program like this (97 pluses 2 dots):

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++..

Will increase value of the cell it points to up to 97 and print it out 2 times.

aa

4. Loops

In BF loop consists of loop begin [ and loop end ]. You can think it's like while in C/C++ where the condition is actual cell value.

Take a look BF program below:

++[]

++ increments actual cell value twice:

...[0][0][*2*][0][0]...

And [] is like while(2) {}, so it's infinite loop.

Let's say we don't want this loop to be infinite. We can do for example:

++[-]

So each time a loop loops it decrements actual cell value. Once actual cell value is 0 loop ends:

...[0][0][*2*][0][0]...        loop starts
...[0][0][*1*][0][0]...        after first iteration
...[0][0][*0*][0][0]...        after second iteration (loop ends)

Let's consider yet another example of finite loop:

++[>]

This example shows, we haven't to finish loop at cell that loop started on:

...[0][0][*2*][0][0]...        loop starts
...[0][0][2][*0*][0]...        after first iteration (loop ends)

However it is good practice to end where we started. Why ? Because if loop ends another cell it started, we can't assume where the cell pointer will be. To be honest, this practice makes brainfuck less brainfuck.