Pushing the Limits of Self-Programming Artificial Intelligence

This article is the second in a series of three. See also: Part 1, Part 2, Part 3, and research paper.

Introduction

Is it possible for a computer program to write its own programs? In the previous article this question was answered with the creation of a program, using artificial intelligence, that could write its own programs to output “Hello World” and other simple strings. The AI computer program used a genetic algorithm to write child programs using the programming language brainfuck, a Turing-complete programming language consisting of 8 instructions.

1
+->,>,[>+,],,,,-<[.+<]

“dlroW olleH” (“Hello World” backwards)

The above programming code was created by an artificial intelligence program, designed to write programs with self-modifying and self-improving code. The program created the above result in 60 seconds. The above program accepts user input and prints the reversed text.

Artificial Intelligence Gets Smart, Or Was It Just Lucky?

With the success from the initial experiments of outputting simple strings, I was curious about pushing the bounds of the AI program towards generating even more complex programs. I thought, perhaps having the AI write a program to output “Hello World” was too easy. Maybe it was just luck? After all, there isn’t much logic involved in figuring out a combination of increment, decrement, and print commands to output the string. It’s not really doing anything all that interesting. Or is it?

Since brainfuck is a proven Turing-complete language, theoretically capable of solving any computational problem in the universe (which gives our AI a lof of potential), surely the AI could generate more interesting programs that actually perform a useful task.

A Personal View of Self-Programming Artificial Intelligence and Possibilities

When I see a file’s raw display of bytes, for example from a hex-edit view of an executable, I see software. I can visualize the same software in any random combination of 0’s and 1’s. If a computer program started with a blank array of bytes, and worked fast enough and long enough, it could theoretically re-create that same exact software. It’s certainly a mathematical possibility that the computer could simply select a random series of bytes on its first try, and accidentally discover the executable signature for, say.. Starcraft II.

Ok, maybe that seems too much to even consider. But, how about a bare minimum “Hello World” executable? With a working executable size of just 142 bytes, it’s clearly reasonable for a computer program to stumble across this signature, and re-create the same executable. If Moore’s Law holds true, and computers get exponentially faster and dramatically cheaper, this drastically reduces the apparent infinite time-scale to achieve such results from an evolutionary search algorithm.

Windows 8, Apple iOS, Street Fighter 2, and even DOS, are all a series of 0’s and 1’s. If an artificial intelligent program is capable of guiding itself through the randomness of bytes, it’s possible to re-create each of those computers programs by searching through the binary to locate the correct pattern. It’s also possible for the computer to discover new ones, and in the process, create new software that never existed before.

The Big Binary Bang

Binary bits of 1’s and 0’s are just like the atoms that make up every physical object in the universe. If it were possible to manipulate and arrange atoms, it would be theoretically possible to re-create any physical object. Star Trek’s replicator comes to mind. Of course, that’s just science fiction, right? However, arranging zeroes and ones is certainly within our capabilities. By manipulating and arranging bits, we can theoretically create any type of software - even ones that do not yet exist. And unlike the process of arranging atoms to create, for example gold, this process wouldn’t even require nuclear reactions.

Given a distinct enough heuristic, it could even be possible to describe strong AI. While technology has yet been able to achieve this, it is theoretically possible for an artificial intelligent program to search the vast space of binary to locate a solution - if one is even possible. But, let’s start with baby steps.

Next Generation of Artificial Intelligence

Since the success of the first AI program in writing its own “Hello World” program, the next logical step would be to have the AI write a program that accepts user input. A simple example would be to write a program that asks the user for their name and then prints, “Hello Kory”. With this initial goal in mind, I modified the AI program’s fitness function to favor user input.

Significant improvements were made in the AI engine to improve code generation speed. In particular, the simple change of replacing string concatenation with the StringBuilder class dramatically increased generation time. It specifically improved the call to encode and decode brainfuck instructions from their number representation as a double (0.0 .. 1.0) to their corresponding character representation. This was initially done by simply concatenating the next programming instruction character to the program string. However, since this method is called hundreds of thousands of times during a typical run, changing this to use StringBuilder provided a noticeable improvement.

Another speed improvement came by allowing the artificial intelligence to optimize its generated code. It does this by including a fitness bonus for programs that have the least number of executions (also called, ticks).

1
2
// Bonus for less operations to optimize the code.
countBonus += ((_maxIterationCount - _bf.m_Ticks) / 100.0);

Self-Compilation to .exe

The AI program has also been updated to include the ability to compile the resulting child code into a physical executable (.exe) file. In this manner, a developer can create his own ruleset to instruct the AI to write a program. Once the AI succeeds, a .exe file will be saved, containing the compiled brainfuck program.

Aside from automatically generating an executable upon completion of the AI writing a program, another benefit of having an automatic built-in compiler is that we can include our own extensions to the brainfuck programming language. We can add new instructions and supply their operations. Our compiler can include this support when generating the executable. The built-in compiler was actually easy to implement via the usage of the CSharpCodeProvider, which lets you pass a string of source code (with the brainfuck code embedded within it) and produce a .exe file as a result.

Traditional Brainf-ck

For the majority of the experiments in this article, the traditional brainfuck programming instruction set was used. This allows easier proof of the programs that the AI creates. However, as the programs get more complex, speed improvements were made by enhancing the instruction set with extensions, including Brainfuck Extended Type 3. We’ll take a look at that towards the end of the results.

Show Me The Fitness

Most of the fitness methods, used by the AI, rely on a simple check to judge the quality of a program, even though they include quite a bit of other code for running the brainfuck interpreter, sending input, collecting output for scoring, etc. I’ve defined the fitness functions used in the new experiments below:

Hello Name

1
2
3
4
5
6
7
for (int j = 0; j < targetStringName.Length; j++)
{
if (_console.Length > j)
{
Fitness += 256 - Math.Abs(_console[j] - targetStringName[j]);
}
}

This is very similar to the basic Hello World fitness function. We simply check the full string against the output. The key to this fitness method is in accepting user input and providing a penalty when the program asks for input at the inappropriate time.

Reverse String

1
2
3
4
5
6
7
for (int j = 0; j < name.Length; j++)
{
if (_console.Length > j)
{
Fitness += 256 - Math.Abs(_console[j] - name[name.Length - j - 1]);
}
}

For reversing a string, we follow the same pattern as the Hello World fitness methods, with the exception of checking against the reverse of the target string. Notice in the code above, we compare each output character, starting from index 0, against the target string, starting at index length - 1 and working backwards.

Add and Subtract

1
2
3
4
5
6
7
8
9
10
11
// Adding
if (Int32.TryParse(_console.ToString(), out value))
{
Fitness += 256 - Math.Abs(value - (input1 + input2));
}

// Subtracting
if (Int32.TryParse(_console.ToString(), out value))
{
Fitness += 256 - Math.Abs(value - (input1 - input2));
}

Mathematical operations use a fairly simple fitness check to compare the output with the target result. We also include a conversion of the output to an integer (if possible), which is itself, part of the fitness score. Interestingly, I was unsuccessful with applying the same fitness to multiplication.

If/Then Conditionals

Conditionals follow similar fitness methods as the Hello World programs, since they’re checking against a target output string. The main change to the fitness check for conditionals is the introduction of state into the fitness method. Specifically, an integer variable is used to maintain state for when input should be provided to the program and when output should be provided. If the program asks for input or prints output at the inappropriate time, a penalty is applied to the fitness.

I Didn’t Write The Programs You’re About To See

Keep in mind, as you browse the results below, that I didn’t write any of these programs. The artificial intelligence did. In fact, I don’t even know how to program in brainfuck (who does?). I’m familiar with the instructions set, but otherwise, if you asked me to write a program in it, I probably couldn’t do very much at all. With the disclaimer out of the way, let’s see the results.

Results?

Hello Name

The AI successfully wrote a program to output “Hello [NAME]” after 42,800 generations in about 30 minutes. It produced the following code:

1
2
3
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++[++++++++[++++.+++++++++++++++++++++
++++++++.+++++[++..<>+++.>+++++++++++++++++++++++++++++<>++++-.[,.,.,.,.]].].][.,]-,-]][+.--+><<
..>[+>[+.<+,]><<>]<><<.>.,--<>[<>-,,,><.>.<-,][-[-,-[.-[-+

If we trim off the excess, the resulting code is:

1
2
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++[++++++++[++++.+++++++++++++++++++++
++++++++.+++++[++..<>+++.>+++++++++++++++++++++++++++++<>++++-.[,.,.,.,.]]]]

When run, the above code prompts the user for input. The user can type in a character and the program repeatedly prompts the user for more characters until a value of 0 is entered. The program prints “Hello “ followed by the text that is typed in. The fact that the AI was able to program a loop, for accepting user input, was a nice surprise.

It’s interesting to note that most human programmers would probably write a program like this by first asking the user their name, and then printing the full text. In contrast, the AI first prints the word “Hello “, then asks for input, and then prints the remaining text. It actually makes sense to do it this way, and in fact, this provides a more responsive experience for the end-user (output is immediately displayed, rather than queued).

The following screenshots show the program running:

AI learning how to program

AI result

Reversing a String

The AI successfully wrote a program to reverse a string after 2,600 generations in about 60 seconds. It produced the following code:

1
2
3
+->,>,[>+,],,,,-<[.+<].[>[,,<,>><,-+[]-[<[<>>+-<[,+]]>><[,[].-[]<[>]+[.->+.+,>-[<+-+]
->>,<>,>-+][,-.<--+<[[[,+>-+]-]>,><-[>,+<[<...[++[+.><<-..[,<[+<]-<>+-,,>,,,-.[,+,+]<
>>].+]]-,...[.<,,.],-[].]<[><,<.>>[+.-+--,++].+[]..]><.-][.,-,>][+][+,--,<--.<+]

If we trim off the excess, the resulting code is:

1
+->,>,[>+,],,,,-<[.+<]

When run, the above code prompts the user for input. The user can type in a character and the program repeatedly prompts the user for more characters until a value of 0 is entered. In contrast to the Hello Name program, which immediately displayed output, this program reads all user input into memory first, and then outputs the reversed string. If you think about it, the program must read all input first, since the last character entered needs to be the first character output. The AI figured this out by itself.

The following screenshots show the program running:

AI learning how to program

AI result

Addition

The AI successfully wrote a program to add two numbers together after 92,400 generations in about 45 minutes. It produced the following code:

1
2
3
,>,-[-<+>]<+.>][+<+[<>>,>]-->>.[.<,[[,>><<[,.+,<[>,,.--->.[[+<>,[[[+>,.[,-.,>.,--<
+<+,,.][+>++-[+++.+.+.[].-+,][<[].<]<][[,].<[[+..]].->-+-]+-+.-].]<,,]+<->.[][.[>.
>]+[[<,>[.,,+-]+,+><,.,<-]-.]]+-++->,-.[-+->--]-.].>+.<.>.>,+<.+<..-],.+-,,,><[+-+.>[>

If we trim off the excess, the resulting code is:

1
,>,-[-<+>]<+.

When run, the above code prompts the user for two inputs. The input is expected to be in number format (byte), rather than character format (ASCII). If running this on a web-based brainfuck interpreter, you should provide your input using the format #7 for the number 7, otherwise if you just typed 7, most web-based interpreters would assume you mean a byte value of 55, which is the ASCII value for the character ‘7’.

The program successfully adds any two numbers from 0 - 255, which is the maximum size of a byte, and wraps any value that exceeds the limit. So for example, 30 + 50 = 80, 210 + 20 = 230, and 255 + 2 = 1.

Keep in mind, the AI program has no idea how to do addition. In fact, it knows nothing of the concept of math. By evolving child programs based upon the fitness function, the AI is able to successfully arrive at a solution for adding two numbers together. I was extremely impressed when I saw this result. This is a far leap from simply outputting a string to the console.

The following screenshots show the program running:

AI learning addition

AI result

AI result

Subtraction

The AI successfully wrote a program to subtract two numbers after 177,900 generations in about 1 hour. It produced the following code:

1
2
3
,-->,-[-<->]<+.,[+-[,,]+[-.[[]],[+>--+,>->],<+[.<>.[,][,,><[.<,]][]+],>
[-+,[-+[+-[-[[--<>,>[[],---]+<[[,>].[,<[.,,[-.<.[-<[+]<>.-<<..+[<<+]-]<
]+<>,[].,,,.]<-[<>>[+[<><,[<>,+-],-,+]+>.<><--[.]<[,,<][,[

If we trim off the excess, the resulting code is:

1
,-->,-[-<->]<+.

When run, the above code prompts the user for two inputs. As with the generated program for addition, the input is expected to be in number format (byte), rather than character format (ASCII). The program successfully subtracts the second number from the first and outputs the result. Note, the output is a byte value, so the displayed character on most web-based interpreters will appear as a different ASCII character than the digit. For example, if the result of subtracting two numbers is 90, a web-based interpreter might print out “Z” (which is the ASCII representation for the value 90).

The following screenshots show the program running:

AI result

Note, the first generated program successfully subtracts two digits, but includes extraneous zeros in the result. While this is still mathematically correct (0000008 = 8), and a reason why the AI chose it as a solution, we can do better. The second result below shows the result of the AI program after running it a second time.

AI result

AI result

If-Then Conditionals

Now we get to the more interesting and complex programs. The following programs are really beginning to push to AI towards generating complex programming logic, including the ability to perform if/then decisions and actions. If the AI can successfully formulate this kind of logic, the potential is open for a large variety of artificial intelligence automatically generated software.

The Fitness Function Is Growing

As the complexity for the target programs grows, unfortunately, so too does the fitness function. After all, the fitness function needs to guide the AI in determining how well a particular child program suits the targeted solution. For conditionals, I was initially unsuccessful with the bare-bones fitness function used for the prior programs. An additional tweak was needed.

A check was added, within the fitness function, to peek into the interpreter’s memory register (ie., current data pointer via shift operations). We count the number of distinct memory registers used by the program and give a bonus to fitness, favoring more memory registers over less. This appears to help inspire diversity.

We also record the instruction pointer used for each print command. A penalty is given for re-use of the same print command. This also appears to help foster diversity and achieve a successful if/then result. It’s up for discussion whether peering into the interpreter’s memory registers and instruction pointers is fair-game or not, but for the test cases in this article they added the needed diversity to achieve a result.

Hi Z Bye

The AI successfully wrote a program to ask the user for input (1-3) and output text dependent on which value was entered, similar to selecting an option from a menu. For example, entering 1 prints the text “hi”, entering 2 prints the text “z”, and entering 3 prints the text “bye”. The AI successfully wrote the program in 446,200 generations in about 2 hours. It produced the following code:

1
2
3
4
5
6
7
8
9
[[]]>++++>+<+++++++++++++++<--[+++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++>]++++++++++++++++++++++++
++++++++++++>,-[--[>[][[,[]]]+[<<<<.]]<+++++++++++++++++++++++++++++++++++++++++
++++++++++++++<++++++++++++++>+++++++.<++-++.<<.<]++++++++++++++++++++++++++++++
++++++++++++++++++++++[+++++[+++++++++++++[++++++++++++++++++++++++++++++++++.>+
++++++++++[+++++><><++++++++++++++++<>++++++[++<>+++++++++++++++++++++++++++++++
+++++++++++++++++++++++++++++<>+++++.<<<<<<[[[,,,<<.+-[[+.[>[-]]-[+[<<,[<<[<[,]>
[<,]>[<><<.[+<,,]++-]--[+<-[-][..-<-------<>..<[]<[>[,>.>+[,<>][,],][+->[.]+[.[[
.]]<[...[+

This is a really interesting result. It’s also the longest program yet written by the AI, containing 650 instructions (although not all instructions are needed). In the above code, the AI has successfully written a program to perform an if/then conditional decision, based upon a given problem. It then goes on to perform, one of three, independent actions depending on that decision. All of this is encoded within the above automatically generated source code. This is truly a huge step forward from the AI’s humble beginnings.

If you try the above program on a web-based interpreter, input should be entered as byte values (not ASCII), so the menu option 1 should be input as #1 and the menu option 3 should be input as #3, etc. Or you can download and run the generated executable, which automatically takes your input in number format.

The following screenshots show the program running:

AI result

AI result

AI result

Cats Are Evil

The AI is now going to combine longer strings with conditional logic. To obtain a result within a reasonable amount of time, we’re going to use a slightly extended version of the brainfuck programming language, Brainfuck Extended Type III. This version includes several new instructions, specifically 0-F, which dramatically speeds up the AI by allowing it to quickly arrive at the desired range of ASCII values. While most web-based brainfuck interpreters do not implement Extended Type 3, our built-in compiler does. So the generated executables will still run correctly, using source code written by the AI.

The AI successfully wrote a program to ask the user for input (1-3) and output the text “cats”, “are”, “evil”, depending on which value was entered. The AI successfully wrote the program in 814,400 generations in about 3 hours. It produced the following code:

1
2
3
>,--[-[7-------------.--.<7++++.-.<-]>>>>7-----------.>7++++++>6+++++++++<.,]
>[.>7----.[<<<+]]6+.>6++++++++++++++++++.>6+++++.]-+.<+<.<+++],+[]+.[++[+[+[+
,+[+..<.]C+..<-][-+..,>,-+><,><].]<++>>+,]+-,[

First, note the new instructions that were made available to the AI. The characters 0 through F speed up generation of ASCII range values by setting the value of a given memory register to a corresponding value. These instructions are part of brainfuck Extended Type 3. Once added to the pool of available instructions, the AI immediately saw the value in using them and sprinkled them throughout the code. The result was a dramatically faster AI program generation, with an increase of over 30% in speed.

Also note that the generated code is dramatically shorter. Actually, it’s compressed. This is similar to how compression algorithms like zip and rar work. They take repeating characters and replace them with smaller ones. Upon decompression, they take the single character and expand them to the repeating ones, restoring the original. The AI is performing the same idea, by replacing a repeating series of + or - instructions with a single character 7 (which actually corresponds to 112 ‘+’ instructions).

The following screenshots show the program running:

AI result

AI result

AI result

Migrating to Brainfuck Extended Type III

As demonstrated in the “cats are evil” program above, an extended version of brainfuck was used to allow the AI to successfully write a program within a shorter amount of time. The AI program actually contains the option of selecting which version of brainfuck should be used, by simply changing a value in the configuration file.

The “cats are evil” program could certainly be generated using classic brainfuck, but it would take a lot longer than three hours. While the obvious benefit of having the AI use the extended programming language is the speed boost and code compression, the down-side is that we can no longer test the result in a web-based interpreter. However, we’re going to have to leave behind classic brainfuck if we ever hope to have the AI write programs that contain even more interesting features, such as file I/O and networking capabilities. Therefore, it’s inevitable that we migrate from classic to a more enhanced version, and possibly even further. In fact, the built-in compiler for the AI project already calls its modified version “BrainPlus” in anticipation of this.

Threats, Danger, and Harmful Potential

The programming instruction set supplied to the AI is like a toy-bin consisting of 8 legos. You can build all kinds of houses, castles, dungeons, and roads with just those 8 legos. However, eventually, you’re going to want to do more. Similarly, when we wanted the speed increase from code compression, we gave the AI some additional legos (programming instructions). Likewise, if we want the AI to write programs than can access the computer’s file system or communicate over the Internet, we’re going to need new instructions. Providing these instructions is relatively straightforward. It’s up to our own interpreter and compiler for how we execute the new instructions (ie., read a byte from the network, store a byte in a file stream, etc). But, let’s think for a moment about the dangerous side-effects that can occur from such potential.

One of the reasons for choosing brainfuck as the programming language of choice for our AI was the simple instruction set. It’s also highly helpful that the instructions are simple operations on controlled memory and do not consist of changes to the host computer’s file system or memory. Early on in my experiments, I’ve tried generating programs using BASIC, C, and C++ and quickly saw the potential to damage my computer. In fact, most programs failed immediately upon execution. The worst simply rebooted my machine.

However, what if one of the programs generated code similar to the hard-drive format command signature? Or a BIOS overwrite command? Or maybe the AI generates a program with an infinite loop, accessing a network.

It’s important to keep these potential harmful side-effects in mind when adding to the instruction set with capabilities to access the host system or a remote network. AI program generation should ideally be ran within a controlled simulated environment, until a successful program is generated with the desired result. A sandbox execution environment can be created in combination with a local web server or even a simulated server. After all, bytes need not physically be exchanged over a network in order for the AI to generate a program with networking communications. The instructions could simply be interpreted to judge correctness and allow the AI to evolve accordingly.

Some Failures Too

While the above results by the AI are certainly impressive, it’s not all success. There were some failures too, mainly due to the AI hitting local maxima with the genetic algorithm. These include fitness methods for: Multiplication, Fibonacci sequence, and a Number Guessing Game.

For example, with the number guessing game program, the goal was to have the AI produce a program that would choose a random number from 1-10. It would then ask the user for a guess and respond with an answer of higher or lower or win. Ideally, it would also count the number of tries for the user to guess the correct number. In my initial experiments, the best the AI could do is to match the sequence of guesses in the training sets and produce a program that simply ran through the sequence of higher, lower, lower, win, etc to achieve the “desired” result. If the input deviated from the expected sequence, the program would just continue to print out the same sequence of phrases. Perhaps, this is a fault of the fitness function, rather than the AI itself. More experimentation is needed.

Feel free to download the source code and give it a try.

Constructive Criticism

The original article, describing the results of the AI writing a “Hello World” program, generated quite a bit of conversation in the programming community, particularly on reddit. While many were fascinated with the idea of artificial intelligence being capable of writing its own computer programs via a genetic algorithm, others were quite critical of the idea. Whether or not this type of implementation actually proves to be useful, remains to be seen. However, it’s still exciting, and so I’ve addressed some of the comments below.

Some felt that the “Hello World” program, developed by the AI, was too trivial. After all, others have written genetic algorithms that create strings from random ASCII characters before. Further, adding a programming language on top of it was just one more layer, and really just a distraction. They stated that brainfuck itself, was particularly unsuitable for AI generation in this manner, due to the fact that a single mutation to one of its bits could throw off the entire program, rendering a result useless. They claimed that using brainfuck as the target programming language may have been simply a parlor trick.

To this note, I hope the above results of more complex programs (hello name, add, subtract, reverse string, if/then, etc) help to put this doubt to rest.

Another critical point stated, the fact that the AI used tens of thousands of generations to come up with a solution proved its limitation. They mentioned that if it took the AI two hours to produce a program to display “hello world” on the screen, it would take an unacceptable amount of time (perhaps, beyond the lifetime of our universe) to produce a page, or even a paragraph, of text.

This is true, certainly in the original version of the AI program. However, note in this recent version, several modifications have been added to greatly increase the speed. First, a change was made to the fitness method to favor shorter programs over longer ones. This removes extraneous programming instructions, making the programs more concise and faster. This decreased generation time and allowed programs that could output slightly longer strings. Second, by using Extended Type III, the AI could generate programs even faster, particularly with regard to outputting strings. The extended version includes codes to quickly push the memory registers into the valid ASCII range of characters. From there, it’s a relatively simple count-down to the target string. There are likely many more speed improvements that can be made (including both programmatic and hardware) to constrain generation time.

Of course, the purpose of an AI writing its own programs isn’t to produce books. So, outputting text is just one part of the end result. The capability for the AI to capture logic is much more important. And this was demonstrated (at least, for a simple case) in the if/then examples above.

Conclusion

I believe there is great potential behind the combination of evolutionary algorithms and self-programming software. Much like the evolution of single-celled bacteria into complex forms of life, I’d still like to believe that software can be transformed via evolutionary techniques into much more complicated byproducts.

I’m really not sure if the answer to strong AI is to create an ever-more complicated database of knowledge that eventually seems to be intelligent simply by its responses to queries. Or if an increasingly packed knowledge-base somehow coalesces into consciousness. I hope that a much simpler framework must exist that can achieve this end result.

The examples, discussed above, are simply baby steps towards the potential of self-programming AI. Years ago this technique of programming was very difficult, largely due to the slower speed of computers. Today, this technique is viable. I’m actually surprised by the results already achieved. I originally thought the AI would be unable to move past printing simple phrases. However, the results in these experiments prove otherwise.

Artificial intelligence is indeed capable of producing its own computer programs with increasingly complex programmatic logic. With any luck, we can push things even further.

Download @ GitHub

Like what you see? All source code for the project is available on GitHub. This includes the simplified fitness framework for easier creation of programs by the AI.

About the Author

This article was written by Kory Becker, software developer and architect, skilled in a range of technologies, including web application development, machine learning, artificial intelligence, and data science.

Share