How to write self-modifying code in C?

AnkurVj · Sep 16, 2011 · Viewed 38.9k times

I want to write a piece of code that changes itself continuously, even if the change is insignificant.

For example maybe something like

for i in 1 to 100, do
begin
    x := 200
    for j in 200 downto 1, do
    begin
        do something
    end
end

Suppose I want my code, after the first iteration, to change the line x := 200 to x := 199, then after the next iteration to x := 198, and so on.

Is it possible to write such code? Would I need to use inline assembly for that?

EDIT: Here is why I want to do it in C:

This program will run on an experimental operating system, and I can't (or don't know how to) use programs compiled from other languages. The real reason I need such code is that it runs on a guest operating system inside a virtual machine. The hypervisor is a binary translator that translates chunks of code and applies some optimizations. It translates each chunk only once; the next time the same chunk is used in the guest, the translator reuses the previously translated result. If the code is modified on the fly, the translator notices, marks its previous translation as stale, and is forced to re-translate the same code. This is what I want to achieve: forcing the translator to do many translations. Typically these chunks are the instructions between two branch instructions (such as jump instructions). I just think that self-modifying code would be a fantastic way to achieve this.

Answer

Heath Hunnicutt · Sep 16, 2011

You might want to consider writing a virtual machine in C, where you can build your own self-modifying code.

If you wish to write self-modifying executables, much depends on the operating system you are targeting. You might approach your desired solution by modifying the in-memory program image. To do so, you would first obtain the in-memory address of your program's code bytes. Then you would change the operating system's protection on this memory range so that you can modify the bytes without encountering an Access Violation or '''SIGSEGV'''. Finally, you would use pointers (perhaps '''unsigned char *''' pointers, possibly '''unsigned long *''' as on RISC machines) to modify the opcodes of the compiled program.
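The steps above can be sketched roughly as follows. This is only an illustration under assumptions the question does not state: an x86 or x86-64 target, and a POSIX-like system that provides mmap and mprotect (your experimental OS may offer something different). To keep it predictable, it patches a tiny hand-written routine in a freshly mapped page rather than the bytes the compiler emitted for your own function, since finding the right offset inside compiled code depends on the compiler and its options. The names make_code_page and patch_and_run are made up for this example.

```c
/* Sketch only: assumes x86 or x86-64 and a POSIX system (mmap/mprotect). */
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes on glibc */
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#if !defined(__x86_64__) && !defined(__i386__)
#error "The opcode bytes below are x86-specific; other targets need other bytes."
#endif

typedef int (*int_fn)(void);

/* Machine code for:  mov eax, imm32 ; ret
 * Bytes 1..4 hold the 32-bit immediate that gets patched later. */
static const unsigned char template_code[] = {
    0xB8, 0x00, 0x00, 0x00, 0x00, /* mov eax, 0 */
    0xC3                          /* ret */
};

/* Hypothetical helper: map one writable page and copy the routine into it.
 * Returns NULL if the mapping fails. */
unsigned char *make_code_page(void)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    void *page = mmap(NULL, page_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (page == MAP_FAILED)
        return NULL;
    memcpy(page, template_code, sizeof template_code);
    return (unsigned char *)page;
}

/* Hypothetical helper: rewrite the immediate operand, then run the routine.
 * Returns the value the patched code produced, or -1 if mprotect fails
 * (so -1 itself is ambiguous; good enough for a sketch). */
int patch_and_run(unsigned char *page, int32_t value)
{
    size_t page_size = (size_t)sysconf(_SC_PAGESIZE);
    assert(page != NULL);

    /* Step 1: make the code bytes writable. */
    if (mprotect(page, page_size, PROT_READ | PROT_WRITE) != 0)
        return -1;

    /* Step 2: modify the opcode bytes through a plain byte pointer.
     * x86 is little-endian, so copying the int32_t directly is correct. */
    memcpy(page + 1, &value, sizeof value);

    /* Step 3: make the page executable again. */
    if (mprotect(page, page_size, PROT_READ | PROT_EXEC) != 0)
        return -1;

    /* Effectively does nothing on x86, but architectures with separate
     * instruction caches need this (GCC/Clang builtin). */
    __builtin___clear_cache((char *)page, (char *)page + sizeof template_code);

    /* Step 4: call the freshly modified code. */
    int_fn fn = (int_fn)(uintptr_t)page;
    return fn();
}
```

Calling patch_and_run(page, 200), then patch_and_run(page, 199), and so on from the outer loop would mirror the x := 200, x := 199, ... sequence in the question. Whether each rewrite actually triggers a re-translation depends on how the binary translator detects writes to code pages, so that part is something to verify on the hypervisor itself.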

A key point is that you will be modifying machine code of the target architecture. There is no canonical format for C code while it is running -- C is a specification of a textual input file to a compiler.