Code generation by genetic algorithms

dfens picture dfens · Apr 20, 2011 · Viewed 11.3k times · Source

Evolutionary programming seems to be a great way to solve many optimization problems. The idea is very easy and the implementation does not make problems.

I was wondering if there is any way to evolutionarily create a program in ruby/python script (or any other language)?

The idea is simple:

  1. Create a population of programs
  2. Perform genetic operations (roulette-wheel selection or any other selection), create new programs with inheritance from best programs, etc.
  3. Loop point 2 until program that will satisfy our condition is found

But there are still few problems:

  1. How will chromosomes be represented? For example, should one cell of chromosome be one line of code?
  2. How will chromosomes be generated? If they will be lines of code, how do we generate them to ensure that they are syntactically correct, etc.?

Example of a program that could be generated:

Create script that takes N numbers as input and returns their mean as output.

If there were any attempts to create such algorithms I'll be glad to see any links/sources.

Answer

Jivlain picture Jivlain · Apr 21, 2011

If you are sure you want to do this, you want genetic programming, rather than a genetic algorithm. GP allows you to evolve tree-structured programs. What you would do would be to give it a bunch of primitive operations (while($register), read($register), increment($register), decrement($register), divide($result $numerator $denominator), print, progn2 (this is GP speak for "execute two commands sequentially")).

You could produce something like this:

progn2(
  progn2(
    read($1)
    while($1
      progn2(
        while($1
          progn2( #add the input to the total
            increment($2)
            decrement($1)
          )
        )
        progn2( #increment number of values entered, read again
          increment($3)
          read($1)
        )
      )
    )
  )
  progn2( #calculate result
    divide($1 $2 $3)
    print($1)
  )
)  

You would use, as your fitness function, how close it is to the real solution. And therein lies the catch, that you have to calculate that traditionally anyway*. And then have something that translates that into code in (your language of choice). Note that, as you've got a potential infinite loop in there, you'll have to cut off execution after a while (there's no way around the halting problem), and it probably won't work. Shucks. Note also, that my provided code will attempt to divide by zero.

*There are ways around this, but generally not terribly far around it.