Description:
This is an IDA plugin which can decompile one function at a time. To try it in IDA, place your cursor on a function, and execute the plugin. The decompiled function will appear in the output window.
It is currently capable of decompiling small functions with fairly simple control flow. It may also be able to decompile larger functions by pure luck. It shows what can be done in a few thousand lines of Python.
The first analysis phase takes care of transforming every instruction into a form very close to static single assignment form. For example, add eax, 1 becomes eax = eax + 1. Instructions that modify more than one location (such as push, pop, leave, etc.) are expanded into their more basic representation, so that pop edi becomes edi = *(esp) followed by esp = esp + 4.
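The expansion described above can be sketched roughly as follows. This is only an illustration: the lift function, its tuple-based operands and the string output are assumptions made for the example, not the plugin's actual code.

```python
# Hypothetical lifter: the function name and the string-based output
# are illustrative assumptions, not the plugin's real interface.

def lift(mnemonic, operands, word_size=4):
    """Expand one x86 instruction into a list of simple assignments."""
    if mnemonic == "mov":
        dst, src = operands
        return [f"{dst} = {src}"]
    if mnemonic in ("add", "sub"):
        dst, src = operands
        op = "+" if mnemonic == "add" else "-"
        return [f"{dst} = {dst} {op} {src}"]
    if mnemonic == "push":
        (src,) = operands
        # push first moves the stack pointer, then stores the value
        return [f"esp = esp - {word_size}", f"*(esp) = {src}"]
    if mnemonic == "pop":
        (dst,) = operands
        # pop first loads the value, then moves the stack pointer
        return [f"{dst} = *(esp)", f"esp = esp + {word_size}"]
    if mnemonic == "leave":
        # leave is equivalent to: mov esp, ebp ; pop ebp
        return lift("mov", ("esp", "ebp")) + lift("pop", ("ebp",))
    raise NotImplementedError(mnemonic)


print(lift("add", ("eax", "1")))   # ['eax = eax + 1']
print(lift("pop", ("edi",)))       # ['edi = *(esp)', 'esp = esp + 4']
```

Expanding leave in terms of mov and pop keeps each instruction's side effects in one place, at the cost of ignoring corner cases such as push esp.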
This phase also attempts to track modifications to the eflags register. All status bits are supported, although only zf, cf, of and sf have a proper decompiled representation; the pf and af flags are displayed as PARITY(...) and ADJUST(...) respectively. Modifications to eflags are tracked by emitting assignments to special registers (named %eflags.*). When a jump instruction is later encountered, the corresponding condition is emitted using these flags as operands; for example, jz, which is taken when zf is set, is emitted as if(%eflags.zf == 1). Unused flags are then eliminated as dead code, and used ones are propagated the normal way when uses are replaced by definitions.
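As a rough illustration of the flag tracking, the sketch below lifts a cmp into assignments to %eflags.* pseudo-registers and then folds the relevant one into a following jump. The helper names (lift_cmp, lift_jump, propagate_flags) and the string representation are assumptions made for the example; only zf and sf are modelled.

```python
# Conditions for a few jumps, written in terms of the flag registers.
# jz/je is taken when ZF is set, jnz when it is clear.
JUMP_CONDITIONS = {
    "jz": "%eflags.zf == 1",
    "jnz": "%eflags.zf == 0",
    "js": "%eflags.sf == 1",
    "jns": "%eflags.sf == 0",
}


def lift_cmp(a, b):
    """cmp a, b computes a - b and only keeps the resulting flags."""
    return [
        ("%eflags.zf", f"({a} - {b}) == 0"),
        ("%eflags.sf", f"({a} - {b}) < 0"),
    ]


def lift_jump(mnemonic, target):
    """Emit the jump as an if(...) goto using flag registers."""
    return f"if({JUMP_CONDITIONS[mnemonic]}) goto {target}"


def propagate_flags(assignments, jump):
    """Replace each flag used by the jump with its definition.

    Flag assignments that the jump does not use are simply dropped,
    which plays the role of dead code elimination here.
    """
    for flag, expr in assignments:
        jump = jump.replace(flag, f"({expr})")
    return jump


flags = lift_cmp("eax", "ebx")
print(propagate_flags(flags, lift_jump("jz", "loc_1")))
# if(((eax - ebx) == 0) == 1) goto loc_1
```

A later simplification pass could reduce the result to if(eax == ebx); that step is left out of this sketch.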
The second analysis phase attempts to track definition-use chains. When an assignment takes place, a new def-use chain is created. All following uses of the assigned register are attached to the chain until a subsequent assignment to the same register takes place. This makes it possible to determine which registers are 'active' at a specific location during the execution of the function.
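For a single straight-line block, chain construction might look like the sketch below. It assumes statements are (destination register, expression string) pairs and only recognises a handful of 32-bit register names; both are simplifications chosen for the example, not the plugin's data model.

```python
import re

# Only a few 32-bit registers are recognised in this sketch.
REGISTER = re.compile(r"\b(e[a-d]x|e[sd]i|e[sb]p)\b")


def build_chains(statements):
    """Build def-use chains for a straight-line list of (dest, expr).

    Each chain records the defined register, the index of the defining
    statement and the indices of the statements that use that value.
    """
    chains = []
    active = {}  # register -> chain that is currently open for it
    for index, (dest, expr) in enumerate(statements):
        # Uses on the right-hand side attach to the currently open chain.
        for reg in set(REGISTER.findall(expr)):
            if reg in active:
                active[reg]["uses"].append(index)
        # The assignment closes the previous chain and opens a new one.
        chain = {"reg": dest, "def": index, "uses": []}
        chains.append(chain)
        active[dest] = chain
    return chains


block = [
    ("eax", "1"),
    ("ebx", "eax + 2"),
    ("eax", "ebx + eax"),
    ("ecx", "eax"),
]
for chain in build_chains(block):
    print(chain)
# {'reg': 'eax', 'def': 0, 'uses': [1, 2]}
# {'reg': 'ebx', 'def': 1, 'uses': [2]}
# {'reg': 'eax', 'def': 2, 'uses': [3]}
# {'reg': 'ecx', 'def': 3, 'uses': []}
```

The registers that are 'active' at a given statement are the keys of the active dictionary at that point. A real implementation also has to follow chains across basic blocks, which this sketch ignores.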
In this phase, def-use chains are simplified by replacing uses with their definitions until a definition has no remaining uses, at which point it is eliminated as dead code.
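A minimal sketch of this simplification on one straight-line block is shown below. The simplify function, the (dest, expr) statement format and the live_out set (registers still needed after the block) are assumptions made for the example. A use is only replaced when none of the definition's operands has been reassigned in between.

```python
import re

NAME = re.compile(r"\b[a-z]\w*\b")


def names(expr):
    """Register names mentioned in an expression string."""
    return set(NAME.findall(expr))


def simplify(statements, live_out):
    """Propagate definitions into their uses and drop dead definitions."""
    stmts = [list(s) for s in statements]
    i = 0
    while i < len(stmts):
        dest, expr = stmts[i]
        wrapped = expr if re.fullmatch(r"\w+", expr) else f"({expr})"
        pattern = re.compile(rf"\b{re.escape(dest)}\b")
        # Once an operand of expr is reassigned, expr no longer describes
        # the value of dest, so later uses must be left alone.
        stale = dest in names(expr)
        remaining_uses = 0
        for j in range(i + 1, len(stmts)):
            other_dest, other_expr = stmts[j]
            if dest in names(other_expr):
                if stale:
                    remaining_uses += 1
                else:
                    stmts[j][1] = pattern.sub(lambda m: wrapped, other_expr)
            if other_dest == dest:
                break  # this chain ends at the next assignment to dest
            if other_dest in names(expr):
                stale = True
        else:
            # Never reassigned: the value may still be needed afterwards.
            if dest in live_out:
                remaining_uses += 1
        if remaining_uses == 0:
            del stmts[i]  # no uses left: dead code
        else:
            i += 1
    return [tuple(s) for s in stmts]


print(simplify([("eax", "ebx + 1"), ("ecx", "eax"), ("edx", "5")], {"ecx"}))
# [('ecx', '(ebx + 1)')]
```

The sketch makes a single pass and ignores memory operands, so it can be more conservative than necessary, but it never moves an expression past a reassignment of one of its operands.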
In the control flow phase, basic blocks are combined together to form more complex control blocks. Simple algorithms are applied iteratively in an attempt to build more complex statements such as if, while and do-while from simple if(...) goto constructs.
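The iterative combination of blocks could be sketched as below. The block format is an assumption made for the example: each block has a list of code lines, an optional branch condition and a list of successors, where the first successor of a conditional block is the one taken when the condition holds. Only three rules are shown (while, if without else, and plain sequence); the plugin's actual rules may differ.

```python
def indent(lines):
    return ["    " + line for line in lines]


def predecessors(blocks, name):
    return [n for n, b in blocks.items() if name in b["succ"]]


def reduce_once(blocks, entry):
    """Apply the first rule that matches; return True if something changed."""
    for name, block in blocks.items():
        if block["cond"] is not None:
            then, other = block["succ"]
            body = blocks[then]
            simple = (then != name and body["cond"] is None
                      and predecessors(blocks, then) == [name])
            if simple and body["succ"] == [name] and not block["code"]:
                keyword = "while"   # body jumps straight back to the test
            elif simple and body["succ"] == [other]:
                keyword = "if"      # body falls through to the join block
            else:
                continue
            block["code"] += [f"{keyword} ({block['cond']}) {{",
                              *indent(body["code"]), "}"]
            block["cond"], block["succ"] = None, [other]
            del blocks[then]
            return True
        elif len(block["succ"]) == 1:
            nxt = block["succ"][0]
            if nxt not in (name, entry) and predecessors(blocks, nxt) == [name]:
                following = blocks.pop(nxt)  # sequence: merge into this block
                block["code"] += following["code"]
                block["cond"], block["succ"] = following["cond"], following["succ"]
                return True
    return False


def structure(blocks, entry):
    """Repeat the reductions until no rule applies any more."""
    blocks = {n: {"code": list(b["code"]), "cond": b["cond"],
                  "succ": list(b["succ"])} for n, b in blocks.items()}
    while reduce_once(blocks, entry):
        pass
    return blocks


cfg = {
    "entry": {"code": ["i = 0"], "cond": None, "succ": ["head"]},
    "head": {"code": [], "cond": "i < 10", "succ": ["body", "exit"]},
    "body": {"code": ["i = i + 1"], "cond": None, "succ": ["head"]},
    "exit": {"code": ["return i"], "cond": None, "succ": []},
}
print("\n".join(structure(cfg, "entry")["entry"]["code"]))
```

For this graph the four blocks collapse into a single one containing i = 0, a while (i < 10) loop and return i. When no rule matches, the remaining blocks are left as they are and would still need goto statements.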
TODO:
This project could use some improvements in the following areas:
More instructions are needed. Currently this decompiler supports a very limited number of x86/x64 instructions.
There is currently no attempt at data type analysis, which would be necessary in order to produce recompilable output, or even more correct output.
Add support for other architectures (ARM, etc.).
Add support for more calling conventions. Currently, only the System V x64 ABI (x64 Linux, gcc) is supported. Under other compilers, function calls will be displayed without parameters.
Add a GUI for renaming variables, inverting if-else branches, and other easy stuff.
When possible, functions called from the one being decompiled should be analysed to determine their arguments and which registers they restore.
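As an idea for the calling convention item, support for several conventions could start from a table of argument registers. The sketch below is only a suggestion: the table keys, the call_arguments helper and the heuristic of taking argument registers for as long as they were assigned before the call are all assumptions, and stack-passed or floating point arguments are ignored.

```python
# Integer/pointer argument registers, in order, for two 64-bit conventions.
CALLING_CONVENTIONS = {
    "sysv_x64": ("rdi", "rsi", "rdx", "rcx", "r8", "r9"),  # Linux, gcc
    "ms_x64": ("rcx", "rdx", "r8", "r9"),                  # Windows x64
}


def call_arguments(convention, defined_registers):
    """Guess the register arguments of a call.

    Heuristic (an assumption of this sketch): take the convention's
    argument registers in order, stopping at the first one that was
    not assigned before the call.
    """
    args = []
    for reg in CALLING_CONVENTIONS[convention]:
        if reg not in defined_registers:
            break
        args.append(reg)
    return args


print(call_arguments("sysv_x64", {"rdi", "rsi", "rcx"}))  # ['rdi', 'rsi']
print(call_arguments("ms_x64", {"rcx", "rdx"}))           # ['rcx', 'rdx']
```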