Reverse Engineering Asked by jjmcc on February 10, 2021
I am fairly new to reversing so apologies in advance if any terminology is incorrect.
I am currently using Ghidra on Windows to look at the instructions/decompilation of a binary, and I want to add some instructions to an existing function to change its behaviour. In this case it is fairly trivial, as I only want to add a fixed value to an existing function parameter, but I would also like some information on more advanced cases where the inserted code is slightly more complex.
I made a small test program to try this out, and I managed to add the instruction by editing the binary in a hex editor: I simply shifted the function bytes down and inserted my own. However, I realized that this was only possible because there was a bunch of empty memory following the function, so I could shift everything down. This isn't always the case.
**************************************************************
* FUNCTION *
**************************************************************
ulonglong __fastcall FUN_140011810(int param_1, int para
ulonglong RAX:8 <RETURN>
int ECX:4 param_1
int EDX:4 param_2
undefined4 Stack[0x10]:4 local_res10 XREF[3]: 140011810(W),
140011835(R),
140011848(R)
undefined4 Stack[0x8]:4 local_res8 XREF[2]: 140011814(W),
14001184e(R)
undefined1 Stack[-0x10]:1 local_10 XREF[1]: 140011867(*)
undefined4 Stack[-0xf4]:4 local_f4 XREF[4]: 140011858(W),
14001185b(R),
140011861(W),
140011864(R)
undefined1 Stack[-0xf8]:1 local_f8 XREF[1]: 140011821(*)
FUN_140011810 XREF[1]: thunk_FUN_140011810:14001104b(T),
thunk_FUN_140011810:14001104b(j)
140011810 89 54 24 10 MOV dword ptr [RSP + local_res10],param_2
140011814 89 4c 24 08 MOV dword ptr [RSP + local_res8],param_1
140011818 55 PUSH RBP
140011819 57 PUSH RDI
14001181a 48 81 ec SUB RSP,0x108
08 01 00 00
140011821 48 8d 6c LEA RBP=>local_f8,[RSP + 0x20]
24 20
140011826 48 8b fc MOV RDI,RSP
140011829 b9 42 00 MOV param_1,0x42
00 00
14001182e b8 cc cc MOV EAX,0xcccccccc
cc cc
140011833 f3 ab STOSD.REP RDI
140011835 8b 8c 24 MOV param_1,dword ptr [RSP + local_res10]
28 01 00 00
14001183c 48 8d 0d LEA param_1,[DAT_140021008] = 01h
c5 f7 00 00
140011843 e8 44 f8 CALL thunk_FUN_140011e80 undefined thunk_FUN_140011e80(ch
ff ff
140011848 8b 85 08 MOV EAX,dword ptr [RBP + local_res10]
01 00 00
14001184e 8b 8d 00 MOV param_1,dword ptr [RBP + local_res8]
01 00 00
140011854 03 c8 ADD param_1,EAX
140011856 8b c1 MOV EAX,param_1
140011858 89 45 04 MOV dword ptr [RBP + local_f4],EAX
14001185b 8b 45 04 MOV EAX,dword ptr [RBP + local_f4]
14001185e 83 c0 0a ADD EAX,0xa
140011861 89 45 04 MOV dword ptr [RBP + local_f4],EAX
140011864 8b 45 04 MOV EAX,dword ptr [RBP + local_f4]
140011867 48 8d a5 LEA RSP=>local_10,[RBP + 0xe8]
e8 00 00 00
14001186e 5f POP RDI
14001186f 5d POP RBP
140011870 c3 RET
140011871 cc ?? CCh
140011872 cc ?? CCh
140011873 cc ?? CCh
140011874 cc ?? CCh
140011875 cc ?? CCh
140011876 cc ?? CCh
140011877 cc ?? CCh
140011878 cc ?? CCh
140011879 cc ?? CCh
14001187a cc ?? CCh
14001187b cc ?? CCh
14001187c cc ?? CCh
14001187d cc ?? CCh
14001187e cc ?? CCh
14001187f cc ?? CCh
140011880 cc ?? CCh
140011881 cc ?? CCh
140011882 cc ?? CCh
Specifically, I could duplicate the instruction 14001185e 83 c0 0a ADD EAX,0xa and change 0xa to alter the output value.
In the more complex binary I have a larger function with similar parameters, but there is no spare memory at the end of the function, so shifting the remaining bytes wouldn't work: another function sits directly below. I also can't remove any of the current instructions to make space, as that might break existing functionality. There is plenty of empty memory elsewhere in the binary, so I thought of adding a JMP to that area, performing some instructions there, and then jumping back. However, some of the instructions use local variables, so I'm unsure whether this will work.
So given the above example, and none of the extra memory at the end of the function, how can I insert some custom instructions?
I believe you have to replace the instruction at that address with an appropriate JMP and append ADD EAX,0xa at the end of the executable file, together with any instructions that were overwritten and whatever else you want to do. When finished, JMP back to the address following the patch. You will also need to correct the executable's header for the modified file length. When adding your own instructions, remember to fix up any registers you change, e.g. the stack pointer.
Answered by Zurab on February 10, 2021
You are asking how to insert a "code cave" or a "balcony" into existing code. You could proceed like so:
1. Overwrite the instruction at the insertion point with a JMP code and four offset bytes, and fill any surplus bytes that follow with NOP statements to avoid garbage code.
2. Re-insert the overwritten instruction at the target of the JMP.
3. Add your new code after it, at the target of the JMP. If addresses are involved, re-calculate them according to RIP-relative addressing.
4. JMP back to the next statement of your original code.

Let me give an example of how to calculate the target address offset of the JMP statement:
Assume you wish to replace the LEA param_1, [DAT_140021008] statement with a JMP to 140018000, where you might have free space for the code cave, and then re-insert that LEA command at the new location.
Of course
Calculate the address offset: take your destination address and subtract the address of the instruction that follows the JMP. Your replaced code will look similar to the following (syntax possibly not correct):
14001183c E9 xx xx xx xx JMP 140018000 ; the xx's to be calculated
140011841 90 NOP
140011842 90 NOP
140011843 e8 44 f8 ff ff CALL thunk_FUN_140011e80 ;existing code
Offset: 140018000 - 140011841 = 67bf, the JMP line becoming
14001183c E9 bf 67 00 00 JMP 140018000
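The same arithmetic can be checked with a few lines of Python. This is only a sketch: the helper name encode_jmp_rel32 is made up for illustration, and the addresses are the ones from the example above. It encodes the 5-byte E9 JMP and pads the rest of the 7-byte LEA slot with NOPs.

```python
import struct

def encode_jmp_rel32(jmp_addr: int, target: int) -> bytes:
    """Encode a 5-byte near JMP (opcode E9) located at jmp_addr that lands on target."""
    next_instr = jmp_addr + 5            # rel32 is counted from the end of the JMP
    rel = target - next_instr
    if not -2**31 <= rel < 2**31:
        raise ValueError("target is out of rel32 range")
    return b"\xe9" + struct.pack("<i", rel)   # little-endian signed 32-bit offset

# Addresses taken from the example: the LEA at 14001183c is 7 bytes long.
patch = encode_jmp_rel32(0x14001183C, 0x140018000)
patch += b"\x90" * (7 - len(patch))      # pad the remaining 2 bytes with NOPs
print(patch.hex(" "))                    # e9 bf 67 00 00 90 90
```

The range check matters once the cave is far away: a rel32 JMP can only reach targets within about 2 GB of the patched instruction.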
At address 140018000 you might wish to re-insert the LEA statement:
140018000 48 8d 0d 01 90 00 00 LEA param_1, [DAT_140021008]
140018007 Your new code
...
JMP back to 140011843
The offset for the LEA statement has been re-calculated relative to the instruction that follows it at its new location:
140021008 - 140018007 = 9001
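The displacement of a moved RIP-relative instruction can be cross-checked against the original bytes in the listing (48 8d 0d c5 f7 00 00 at 14001183c). Below is a rough sketch; relocate_rip_relative is a hypothetical helper, and it assumes the 32-bit displacement is the last four bytes of the instruction, which holds for this LEA but not for instructions that carry a trailing immediate.

```python
import struct

def relocate_rip_relative(insn: bytes, old_addr: int, new_addr: int):
    """Re-encode insn so it still references the same absolute address after a move.

    Assumes the disp32 occupies the final four bytes of the instruction.
    Returns (new_bytes, absolute_target).
    """
    (disp,) = struct.unpack("<i", insn[-4:])
    target = old_addr + len(insn) + disp          # RIP points past the instruction
    new_disp = target - (new_addr + len(insn))
    return insn[:-4] + struct.pack("<i", new_disp), target

lea = bytes.fromhex("488d0dc5f70000")             # original LEA from the listing
moved, target = relocate_rip_relative(lea, 0x14001183C, 0x140018000)
print(hex(target), moved.hex(" "))                # 0x140021008 48 8d 0d 01 90 00 00
```

Decoding the target from the original bytes first, rather than typing it in, guards against mixing it up with the address of a neighbouring instruction.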
Perhaps the mnemonic param_1 will be replaced by RCX, as the low half of that register (ECX) is holding param_1.
At the end of your code cave you have to calculate the JMP back to your original code in just the same way.
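To illustrate the backward jump, here is a sketch that assembles the bytes of a whole cave. Several things are assumptions: the cave address 140018000 is the one assumed in this answer, the LEA bytes are re-encoded so that they still point at DAT_140021008 from the new location, and the single NOP merely stands in for whatever new instructions you want to run.

```python
import struct

CAVE = 0x140018000      # free space assumed in this answer
RESUME = 0x140011843    # the CALL that follows the patched LEA

def jmp_rel32(at: int, target: int) -> bytes:
    """5-byte E9 JMP located at `at` that lands on `target`."""
    return b"\xe9" + struct.pack("<i", target - (at + 5))

# LEA RCX,[DAT_140021008] re-encoded for its new address (disp = 0x9001).
relocated_lea = bytes.fromhex("488d0d01900000")
new_code = b"\x90"      # placeholder: put your own instruction bytes here

body = relocated_lea + new_code
cave = body + jmp_rel32(CAVE + len(body), RESUME)   # offset is negative here
print(cave.hex(" "))    # 48 8d 0d 01 90 00 00 90 e9 36 98 ff ff
```

Because the return JMP sits after the new code, its offset changes whenever the length of the new code changes, so it is easiest to compute it last.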
You might have noticed that due to the necessary target address re-calculations your simple "shift down" method also needs careful attention in the general case.
Remark: If you look in your example code at the statement at address 140011843, CALL thunk_FUN_140011e80, you might note the "thunk_" prefix. It means that the immediate address is different from 140011e80. It is a "proxy", probably a JMP target inserted by the compiler leading to the address indicated in Ghidra's code. Ghidra calculates this for you.
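You can verify this from the bytes of the CALL in the listing (e8 44 f8 ff ff). A small sketch, with rel32_target being a made-up helper name:

```python
import struct

def rel32_target(addr: int, insn: bytes) -> int:
    """Destination of a 5-byte E8 CALL or E9 JMP located at addr."""
    if len(insn) != 5 or insn[0] not in (0xE8, 0xE9):
        raise ValueError("expected a 5-byte E8/E9 instruction")
    (rel,) = struct.unpack("<i", insn[1:])   # signed, little-endian
    return addr + 5 + rel

print(hex(rel32_target(0x140011843, bytes.fromhex("e844f8ffff"))))  # 0x14001108c
```

The encoded destination is 14001108c, not 140011e80, which is consistent with the CALL going through a thunk first.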
The method outlined here only sketches the general construction of a code cave. Problems such as local variables located on the stack (the stack must be kept consistent) or items listed in the relocation table of the 64-bit PE (PE32+) file must also be considered, and care must be taken to handle them properly.
Answered by josh on February 10, 2021
What you are really looking for is a code stud. A code cave uses unused space: you jump there, run your own code, and jump back. The problem with code caves is the size constraint: a PE binary won't give you a lot of unused space.
A code stud, on the other hand, means adding an additional .TEXT section. With this method you can avoid the pitfalls of DEP and have much more space. I have easily had up to 8 MB of space to work in.
All you need to do is open the binary in StudPE, add a section, and make sure it is executable. Then just jump to it, do anything you want, and jump back.
Also, don't worry about inserting bytes by hand. Just use Ram Michael's multi assembler tool in a debugger (Olly or x64dbg) and copy and paste the assembly code in. It needs to match MASM syntax, but converting is easy.
I have done whole projects like this, so rest assured this is the easy way of doing things and will save you hours of work.
You can use StudPE to create the new section.
Answered by LUser on February 10, 2021