Abdullah Diab’s Blog

Brainfuck Python Interpreter

What is brainfuck?

Brainfuck is a programming language

This is the definition from Wikipedia:

The brainfuck programming language is an esoteric programming language noted for its extreme minimalism. It is a Turing tarpit, designed to challenge and amuse programmers, and is not suitable for practical use. Its name has been variously bowdlerized. The name of the language is generally not capitalized, although it is a proper noun.

This programming language is very easy to learn, very hard to do work with. As the Wikipedia article describes it; the language is an esoteric language, which means it was created for fun

I discovered the language while I was reading an article about writing the perfect settings file for Django. The author used DPaste website to link to pieces of code. DPaste website says that it uses Pygments for syntax highlighting. Pygments say that their syntax highlighter even supports brainfuck. And that’s how I got to brainfuck

After I read articles about the language I wanted to test some code of it, I found some online interpreters, but I wanted to test some code on my machine. I found a compiler for it, but I’m using Windows 7 on my machine and the compiler is not compatible with it. I got pissed off and I decided I have to write my own interpreter

Brainfuck in simple description

The language consists of 8 symbols < > . , + – [ ]

The language deals with a 30,000 bytes array, at the start of the application all the items of that array are zeros.

The symbols can just deal with a pointer on that array to do specific jobs.

PLY

I mentioned in a previous post that I’m implementing a compiler for my 4th year in Informatics faculty. Hence I made a lot of searching around the net for information about implementing compilers. Implementing a compiler requires building a lexer which can be generated using lexer generator tools, and a parser which also can be generated using parser generator tools.

Since I’m a Python-Lover I searched for Python lexer and parser tools, and I came across the great PLY.

PLY (Python Lex-Yacc) is an implementation of lex and yacc parsing tools for Python.

The usage of PLY is not the target of this post.

Writing a lexer for brainfuck

The lexer would be so simple, since the language only has 8 symbols, and capturing them requires almost nothing.

The lexer file is the following

import ply.lex as lex

tokens = [
    'GoLeft',
    'GoRight',
    'Print',
    'Read',
    'Increase',
    'Decrease',
    'WhileStart',
    'WhileEnd'
]

t_GoLeft = r'\<'
t_GoRight = r'\>;'
t_Print = r'\.'
t_Read = r','
t_Increase = r'\+'
t_Decrease = r'\-'
t_WhileStart = r'\['
t_WhileEnd = r'\]'

def t_error(t):
    t.lexer.skip(1)

lexer = lex.lex()

if __name__ == "__main__":
    lex.runmain()

It simply captures the symbols of the language and ignores any other symbol. Simple isn’t it 🙂

Writing a parser for brainfuck

The parser will have to capture the instructions and store them in a tree so that it can do operations on the code without too much efforts.

The parser will ask the lexer to capture the symbols, while the parser will be responsible for making instructions from symbols.

Sparse matrix

I won’t actually allocate 30,000 bytes array for the application, and since the default value for any item is 0 then I can make use of the Sparse Matrix.

Implementing a simple sparse matrix is nothing more than a dictionary:

class sparseMatrix:
    def __init__(self):
        self.dict = {}

    def __getitem__(self, index):
        if self.dict.has_key(index):
            return self.dict[index]
        else:
            return 0

    def __setitem__(self, index, value):
        self.dict[index] = value

Building the tree

Six commands from the brainfuck language have no arguments or any other elements attaching to them, so they would just add elements containing the command type.

While two commands have a scope of code attached with them, the “[“ and “]” can be used to define while loops. So the while loop is the only command that will add an element with sub-level to the tree.

def p_start(p):
    """
    start : code
            | empty
    """
    if p[1]:
        executeStatements(p[1])

def p_empty(p):
    'empty : '
    pass

def p_code(p):
    """
    code : code statement
            | code whilestatement
            | statement
            | whilestatement
    """
    if len(p) == 2:
        p[0] = [p[1]]
    else:
        p[0] = p[1][:]
        p[0].extend([p[2]])

def p_statement(p):
    """
    statement : GoLeft
                   | GoRight
                   | Print
                   | Read
                   | Increase
                   | Decrease
    """
    if p[1] == '<':
        p[0] = ('GoLeft', None)
    elif p[1]  == '>':
        p[0] = ('GoRight', None)
    elif p[1] == '.':
        p[0] = ('Print', None)
    elif p[1] == ',':
        p[0] = ('Read', None)
    elif p[1] == '+':
        p[0] = ('Increase', None)
    elif p[1] == '-':
        p[0] = ('Decrease', None)

def p_whilestatement(p):
    'whilestatement : WhileStart code WhileEnd'
    p[0] = ('While', p[2])

Executing the tree

Since I’m not building a real compiler I won’t have to generate machine specific code after building the tree. I decided that I can execute the code from within python since the code is too simple to be executed.

So I wrote a method to execute a list of tree elements and it can execute while loops using recursive calling to itself.

The method will be called when the parser parses a valid application and when it can recognize the start symbol of the generating rules of the language.

def executeStatements(tuplesList):
    global matrix
    global index
    global MAX_INDEX

    for node in tuplesList:
        if node[0] == 'GoLeft':
            if index > 0:
                index -= 1
            else:
                raise Exception, "Index out of array bounds!"
        elif node[0] == 'GoRight':
            if index < MAX_INDEX:
                index += 1
            else:
                raise Exception, "Index out of array bounds!"
        elif node[0] == 'Print':
            print(chr(matrix[index]), sep='', end='')
        elif node[0] == 'Read':
            matrix[index] = ord(raw_input()[0])
        elif node[0] == 'Increase':
            matrix[index] += 1
        elif node[0] == 'Decrease':
            matrix[index] -= 1
        elif node[0] == 'While':
            while matrix[index] != 0:
                executeStatements(node[1])

Testing the interpreter

After running the interpreter, I tried the Hello World example in Wikipedia and I got the result:

BrainFuck> ++++++++++[>+++++++>++++++++++>+++>+<<<<-]>++.>+.+++++++..+++.>++.<<+++++++++++++++.>.+++.------.--------.>+.>.
Hello World!
BrainFuck>

Download source code

You can download and test the interpreter from here.