It has been a long time that I wrote my last C++ code. And obviously, C++ has released the new standard, C++17, for a few months. There are a lot features in which has been introduced or changed.
One out many is the pattern matching using std::variant, std::monostate and std::visit. This feature could lead the data-oriented design easier in modern C++. And it's not that hard to write a (possibly buggy) brainfuck interpreter with this feature.
Although with that said, we still need some extra 'hacking' to achieve a more elegant code style. There is a blog post by Marius Elvert explaining a two-line visitor in C++, with which you could write a few lambda expressions in one visitor and then the std::variant could be dispatched by the receiving type of those lambda.
auto my_visitor = visitor{ [&](int value) { /* ... */ } [&](std::string const & value) { /* ... */ } };
To accomplish the goal, Marius found that lager just uses two lines of code. The details are explained in aforementioned blog post. And for the readability of the code today, I just substituted the name of the struct.
And if you're in a hurry, the full code is on my GitHub, #/brainfuck-cpp17
template<class... Ts> struct brainfuck_vm : Ts... { using Ts::operator()...; }; template<class... Ts> brainfuck_vm(Ts...) -> brainfuck_vm<Ts...>;
And the next step is to define all the 8 valid ops of brainfuck.
#pragma mark - brainfuck ops struct increment_value_op {}; // + struct decrement_value_op {}; // - struct increment_ptr_op {}; // > struct decrement_ptr_op {}; // < struct print_op {}; // , struct read_op {}; // . struct loop_start_op {}; // [ struct loop_end_op {}; // ]
To keep the code clean and simple as much as possible, the using keyword is very helpful. And to be able to write some comments (with no brainfuck keyword inside of course) in brainfuck code, std::monostate would be useful too. So the std::variant to be defined will have 8+1 valid types.
/// brainfuck_op allowed ops in C++17 std::variant using brainfuck_op = std::variant< increment_value_op, decrement_value_op, increment_ptr_op, decrement_ptr_op, print_op, read_op, loop_start_op, loop_end_op, std::monostate >;
Furthermore, since we need to read from the stdin or from any file, which will using character form ops (+-><,.[]). A map is used for convenience.
/// a map from char to brainfuck_op const std::map<char, brainfuck_op> bf_op_map { {'+', increment_value_op{}}, {'-', decrement_value_op{}}, {'>', increment_ptr_op{}}, {'<', decrement_ptr_op{}}, {'.', print_op{}}, {',', read_op{}}, {'[', loop_start_op{}}, {']', loop_end_op{}} };
Well, the last definition would surely be the brainfuck virtual machine! It's highly possibly not the best design, maybe even not a good one, but it would work as we intended.
I'm using std::map to simulate an infinite length tape of brainfuck vm, typically, the tape of brainfuck vm is 30000. But, hey, why not:) Given the existence of the operator[], it's easy for you to substitute the std::map with std::array.
#pragma mark - brainfuck vm /// brainfuck virtual machine status struct brainfuck_vm_status { /// virtual infinity length tape std::map<int, int> tape; /// current cell of the tape int tape_ptr = 0; /// used for keeping track of all valid brainfuck_op std::vector<char> instruction; /// current brainfuck_op index int instruction_ptr_current = -1; /// keeping track of loops std::stack<int> instruction_loop_ptr; /// flag of skipping loop, e.g /// +-[[[------------++++++++++-.>>[>]>>>--<<<<<<--]]]++++ /// ^skipping from, but we need all ^end of skipping /// instructions inside. int jump_loop = 0; };
With all these definitions, the prototype of our interpreter would be like below.
int main(int argc, const char * argv[]) { // the brainfuck vm brainfuck_vm_status status; // temp variable for receving the character form op auto char_op = char{}; // read char op forever until EOF while (std::cin >> char_op) { // interpret run_vm(status, char_op); } }
That's pretty straightforward! Let's focus on the run_vm function then.
#pragma mark - brainfuck vm interpreter /** run brainfuck vm @param status run brainfuck vm from the given state @param char_op character form op @param via_loop due to the way I wrote, a flag is needed to avoid re-adding ops */ void run_vm(brainfuck_vm_status & status, char char_op, bool via_loop = false) { // get the op from char_op brainfuck_op op = next_op(status, char_op, via_loop); // parttern matching std::visit(brainfuck_vm { [&](increment_value_op) { }, [&](decrement_value_op) { }, [&](increment_ptr_op) { }, [&](decrement_ptr_op) { }, [&](print_op) { }, [&](read_op) { }, [&](loop_start_op) { }, [&](loop_end_op) { }, [&](std::monostate) { } }, op); }
The skeleton shares much similarity with some functional programming language. The things haven't done are the corresponding code of each op and the next_op function.
Let's start with the next_op function, without which, no tests could perform to verify our code blocks for these 8 brainfuck ops.
#pragma mark - helper function /** next brainfuck op @param status the brainfuck vm status @param char_op character form op @param via_loop due to the way I wrote, a flag is needed to avoid re-adding ops @return brainfuck_op */ brainfuck_op next_op(brainfuck_vm_status & status, char char_op, bool via_loop = false) { // find the brainfuck_op from bf_op_map if (auto op = bf_op_map.find(char_op); op != bf_op_map.end()) { // do not append the char_op if we're retriving the next op inside a loop_op if (!via_loop) { // save char_op to instruction status.instruction.emplace_back(char_op); // increse the ptr of current instruction status.instruction_ptr_current++; } // return next op return op->second; } else { // invaild char for brainfuck // monostate is returned return std::monostate(); } }
And finally, to fill the blanks we left in run_vm:) Most of which are deadly simple but the loop ones. It takes some time to check whether the boundaries and recursions are correct in the code.
#pragma mark - brainfuck vm interpreter /** run brainfuck vm @param status run brainfuck vm from the given state @param char_op character form op @param via_loop due to the way I wrote, a flag is needed to avoid re-adding ops */ void run_vm(brainfuck_vm_status & status, char char_op, bool via_loop = false) { // get the op from char_op brainfuck_op op = next_op(status, char_op, via_loop); // parttern matching std::visit(brainfuck_vm { [&](increment_value_op) { printf("increment_value_op\n"); // skip actual action if we're skipping loop if (status.jump_loop == 0) { status.tape[status.tape_ptr]++; } }, [&](decrement_value_op) { printf("decrement_value_op\n"); // skip actual action if we're skipping loop if (status.jump_loop == 0) { status.tape[status.tape_ptr]--; } }, [&](increment_ptr_op) { printf("increment_ptr_op\n"); // skip actual action if we're skipping loop if (status.jump_loop == 0) { status.tape_ptr++; } }, [&](decrement_ptr_op) { printf("decrement_ptr_op\n"); // skip actual action if we're skipping loop if (status.jump_loop == 0) { status.tape_ptr--; } }, [&](print_op) { printf("print_op\n"); // skip actual action if we're skipping loop if (status.jump_loop == 0) { putchar(status.tape[status.tape_ptr]); printf("%c - %d\n", status.tape[status.tape_ptr], status.tape[status.tape_ptr]); } }, [&](read_op) { // skip actual action if we're skipping loop if (status.jump_loop == 0) { status.tape[status.tape_ptr] = getchar(); } }, [&](loop_start_op) { printf("loop from ins ptr: %d cond[%d]\n", status.instruction_ptr_current, status.tape[status.tape_ptr]); // if and only if 1) `current_cell_value != 0` // 2) and we're not do the skipping // we can record the starting index of the if instruction // besides, if we're in condition 1) // the if statement should be also skipped if (status.tape[status.tape_ptr] != 0 && status.jump_loop == 0) { // push the starting instruction index of loop status.instruction_loop_ptr.emplace(status.instruction_ptr_current); } else { status.jump_loop++; } }, [&](loop_end_op) { printf("loop end ins ptr: %d cond[%d]\n", status.instruction_ptr_current, status.tape[status.tape_ptr]); // decrease the jump_loop value if we encounter the `]` // and we were previously doing the skip if (status.jump_loop != 0) { status.jump_loop--; } else { // if we were not in skipping // then we need to check the loop condition, `current_cell_value != 0` if (status.tape[status.tape_ptr] != 0) { // the instruction range of current loop printf("loop [%d, %d]:\n", status.instruction_loop_ptr.top(), status.instruction_ptr_current); #ifdef DEBUG // dump all instructions inside the loop for (int debug = status.instruction_loop_ptr.top(); debug <= status.instruction_ptr_current; debug++) { putchar(status.instruction[debug]); } putchar('\n'); #endif // loop the instruction until condition satisfies no more while (status.tape[status.tape_ptr] != 0) { // save current instruction pointer int current = status.instruction_ptr_current; // start the loop right after the index of `[` status.instruction_ptr_current = status.instruction_loop_ptr.top() + 1; // run one op at a time // until the next op is the corresponding `]` while (status.instruction_ptr_current < current) { run_vm(status, status.instruction[status.instruction_ptr_current], true); status.instruction_ptr_current++; } // restore the current instruction pointer status.instruction_ptr_current = current; } // pop current loop starting index status.instruction_loop_ptr.pop(); } } }, [&](std::monostate) { } }, op); }
You may try this brainfuck code which should output "Hello World!":)
++++++++[>++++[>++>+++>+++>+<<<<-]>+>+>->>+[<]<-]>>.>---.+++++++..+++.>>.<-.<.+++.------.--------.>>+.>++.
One thought on “Brainfuck Interpreter in C++17——A Modern Approach to Kill Your Brain”