[UD2] Undefined but simple anti-decompiling instruction
UD2 is an x86 assembly instruction that simulates an invalid opcode. It is mostly used for testing purposes, but not only: malware authors, for example, can use it to disturb and slow down the reverse engineering of their malware.
UD2 is an x86 assembly mnemonic that stands for Undefined Instruction. Only used for testing purposes, this instruction simulates the presence of an invalid opcode in the code and, when executed, raises an invalid opcode exception.
Opcode | Mnemonic |
---|---|
0F 0B | UD2 |
When I gave you the definition of the UD2 instruction, I said that it is only used for testing purposes. Well, I lied: it can also be used as an anti-decompiling technique. To explain this usage, let’s look at a simple example program.
The example program checks whether it was executed with a parameter and exits if it was not. If a parameter was given, it first checks that the length of the string passed as argument is equal to 18, and then that the string is equal to G0od_Byp4ss_0f_UD2. If the two conditions are not both met, it returns after printing a ‘Try again’ message.
Now let’s decompile our program. For this example I’m using Ghidra’s decompiler.
We can see that the pseudocode generated by the decompiler is very similar to our original source code, for two main reasons:
1- The program is very simple
2- The program doesn’t implement any anti-reverse-engineering method
So as not to be lynched by the reverse engineering community: I used the word “protected” in the title only to distinguish between the two programs. In real cases, this kind of protection is much more complex than what I will show you. Now that I’ve covered my back, we can continue ☺️.
Let’s take our simple program and throw a UD2 instruction into the middle of the source code. Here we will use the __asm keyword to include an assembly instruction in our C code. Also, we will not deal with the exception handling part.
Finally, after re-decompiling our program, we end up with the following pseudocode:
We can see that instead of continuing the decompilation process, the decompiler stopped after adding an infinite loop that calls the invalidInstructionException() function. My personal explanation is that, during its optimization process, the decompiler detected an invalid opcode (UD2) and logically deduced that the program would end with an invalid instruction exception. That is not wrong when UD2 is there for testing purposes, but when it is used against reverse engineering, the author will usually handle this exception so that it does not affect the program’s workflow.
To bypass the “protection” in our simple program, we can simply patch it by replacing the UD2 instruction with two NOP instructions. UD2 is two bytes long (0F 0B) and a NOP is a single byte (90), so two NOPs fill the gap exactly.
Before patching:

```asm
...
0x000011f8 call printf ; sym.imp.printf ; int printf(const char *format) ; sym.imp.printf
; int printf("Usage : %s password\n")
0x000011fd mov edi, 1 ; int status
0x00001202 call exit ; sym.imp.exit ; void exit(int status) ; sym.imp.exit
; void exit(38161477)
0x00001207 ud2
0x00001209 mov rax, qword [rbp - 0x10]
0x0000120d add rax, 8
0x00001211 mov rax, qword [rax]
0x00001214 mov rdi, rax
0x00001217 call strlen
...
```
After patching:

```asm
...
0x000011f8 call printf ; sym.imp.printf ; int printf(const char *format) ; sym.imp.printf
; int printf("Usage : %s password\n")
0x000011fd mov edi, 1 ; int status
0x00001202 call exit ; sym.imp.exit ; void exit(int status) ; sym.imp.exit
; void exit(38161477)
0x00001207 nop
0x00001208 nop
0x00001209 mov rax, qword [rbp - 0x10]
0x0000120d add rax, 8
0x00001211 mov rax, qword [rax]
0x00001214 mov rdi, rax
0x00001217 call strlen
...
```
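The patch itself boils down to overwriting two bytes. Here is a minimal sketch of it, operating on an in-memory copy of the code; the function name `patch_ud2` is hypothetical, and the offset of the instruction inside the buffer is assumed to be already known (for example, worked out from the disassembly). Checking the bytes before overwriting avoids patching the wrong location, since the sequence 0F 0B can also appear inside data or inside another instruction's operands.

```c
#include <stddef.h>

/* Overwrites a UD2 (0F 0B) located at `offset` with two NOPs (90 90).
   Returns 0 on success, -1 if the offset is out of range or the bytes
   found there are not a UD2. */
int patch_ud2(unsigned char *code, size_t len, size_t offset)
{
    if (len < 2 || offset > len - 2)
        return -1;
    if (code[offset] != 0x0F || code[offset + 1] != 0x0B)
        return -1;

    code[offset] = 0x90;     /* nop */
    code[offset + 1] = 0x90; /* nop */
    return 0;
}
```

In practice the same two-byte change can be made with any hex editor or with the patching feature of a disassembler.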