martes, 19 de agosto de 2014

Automating SUB encoder

There is a very interesting method for creating reliable alphanumeric shellcode in cases when the set of allowed characters is very restricted. This so-called method is used in the HP NNM exploit ( coded by muts from offsec. You may see that in this particular exploit the egghunter code appears to be heavily encoded:

The reason for this is that the input sent to the HP NNM server is filtered by a loaded module (ovfilter.dll). This process encodes characters in the range [0x80 - 0xFF] plus a few extra badchars in the middle of the remaining set [0x00 - 0x7C], those chars are: 0x00, 0x0a, 0x0d, 0x40, 0x3f, 0x3a, 0x2f. We are left only with ASCII characters allowed, but not even the entire ASCII table. No automated encoder I know of would automatically encode generic shellcode based on these restrictions (confession: only tried those given by msfencode).

The main reason why the SUB encoding technique is so interesting to automate is that it uses logic gates in order to get the required opcodes into the stack or any other location, depending on the case. This means that making some common tasks such as zeroing registers or pushing opcodes will use only algebraic operations which, in turn, are among the allowed opcodes. These instructions may be ADD, SUB, AND, PUSH, POP and INC. If some of those instructions aren't allowed in a particular vulnerability you will see some trouble decoding the original bytecode and jumping to it.

The first operation the encoder will do is zeroing a register, this will allow us to perform algebraic operations on it to get a final value pushed on to the stack that will represent the original opcodes. Zeroing a register can be done in a few number of ways which we will not mention here. For the sake of your intellectual growth, please notice:

Performing two AND operations on any register will zero it iff the two operands used turn to be 0 when using the AND operation between them.

Example: 0x08080808 AND 0xF7F7F7F7 = 0x00000000

Which means you could do:
AND EAX, 0x08080808

And get EAX to hold a zero value, independently of what was stored before. Now, the next thing we'd like is getting EAX to hold a four byte value which will happen to be four bytes from your original shellcode. This is an interesting fact as opcodes are no longer considered semantically (that means 0xEB isn't JMP anymore) but rather numerically (that means 0xEB is simply 0xEA + 1, or anything like it).

Nowadays the most common egghunter used in public exploits could be the NtAccessCheckAndAuditAlarm, other well-known egghunters like IsBadReadPtr or NtDisplayString have been deprecated on some Windows versions and can't be relied upon. The NtAccessCheckAndAuditAlarm egghunter takes up to 32 bytes of space, which can be chopped into 8 chunks of 4 bytes each. We now want some way in which each of those 4 bytes can be put (using only algebraic operations) into our zero-value register in order to finally push that register to what ESP is pointing, efectively getting our original shellcode into the stack.

One way of doing so is decomposing our 4-byte value into two or three terms to be added to our register (EAX) so that it will hold the original 4-byte value after the two or three instructions. Proceeding this way may be complicated (as it is in the case of the NNM vulnerability) as the terms of the sum might probably contain bytes within the badchar range.

There's a way around this by using the 2's complement, this will substantially reduce the magnitude of the offset to be added or substracted from the original 4-bytes dword. The automation of the algorithmic process follows:

1. Prepare the shellcode to be encoded: Pad with NOPs or other harmless opcodes to achieve a shellcode length multiple of 4.
2. Divide into dwords: Get 4-byte groups in order to be encoded.
3. Setup right endianess for decoding:Reverse the order of the whole shellcode as well as the correct endianess of each dword to be read from the stack.
4. Find sum terms for each dword: The internals of this process is just a carry count. In brief:
  • Take each badchar in the dword and split it into a couple of terms that contain only bytes <= 0x7c. Substract 0x1 from each term to avoid 0x00 into the next term:
    0x81057f42 = 0x7c047c41 + 0x05010301
  • If the dword contains any byte bigger than 0x7f*2 a third term is required, not forgetting to fix the carry from each term, avoiding 0x00:
    0xfa057f42 = 0x7c037c40 + 0x7c010201 + 0x02010101 
    The above process may be easily automatized in python:

Adding up the zeroing of registers with calculating the sum terms for the whole shellcode we'd get something like:
push esp
pop eax

and eax,0x01010101
and eax,0x10101010
sub eax,0x2c7c6e69
sub eax,0x01500101
push eax

and eax,0x01010101
and eax,0x10101010
sub eax,0x157c167c
sub eax,0x017c010f
sub eax,0x01080101
push eax

and eax,0x01010101
and eax,0x10101010
sub eax,0x4f147c50
sub eax,0x01080101
push eax

and eax,0x01010101
and eax,0x10101010
sub eax,0x4f147c50
sub eax,0x01010e01
push eax

[ ... snip ... ]

jmp esp
 The #SUB EAX# line shows the place where the appropiate instructions are to be replaced for substracting or adding EAX to the place we want our shellcode decoded before restoring this value into ESP. This is how the final shellcode looks like:

This alphanumeric shellcode can be generated using the script from

No hay comentarios:

Publicar un comentario