Reverse Engineering Asked by user1365830 on July 13, 2021
How could this 32-bit x86 assembly be written in C?
loc_536FB0:
mov cl, [eax]
cmp cl, ' '
jb short loc_536FBC
cmp cl, ','
jnz short loc_536FBF
loc_536FBC:
mov byte ptr [eax], ' '
loc_536FBF:
mov cl, [eax+1]
inc eax
test cl, cl
jnz short loc_536FB0
I have already figured out that it is a for loop that loops 23 times before exiting.
Such small snippets are not too hard to decompile manually. Let's try it.
You have already figured out that cl holds a character; this means that eax, where it's read from, is a pointer to a character array. Let's call it p. Now, let's do a dumb translation of every assembly statement to C:
l1: ; l1:
mov cl, [eax] ; cl = *p;
cmp cl, ' ' ; if ( cl < ' ' )
jb short l2 ; goto l2
cmp cl, ',' ; if ( cl != ',' )
jnz short l3 ; goto l3
l2: ; l2:
mov byte ptr [eax], ' ' ; *p = ' '
l3: ; l3:
mov cl, [eax+1] ; cl = *(p+1)
inc eax ; p = p + 1
test cl, cl ; if ( cl != 0 )
jnz short l1 ; goto l1
And cleaned up:
l1:
cl = *p;
if ( cl < ' ' )
goto l2;
if ( cl != ',' )
goto l3;
l2:
*p = ' ';
l3:
cl = *(p+1);
p = p + 1;
if ( cl != 0 )
goto l1;
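As a sanity check, the literal translation above already compiles as C once it is wrapped in a function. The sketch below does just that. The function name sanitize_goto and the unsigned char types are assumptions added for illustration (unsigned because jb is an unsigned comparison); neither comes from the original binary.

```c
/* Literal goto translation of the loop, wrapped in a function.
 * The name sanitize_goto is hypothetical; p plays the role of eax. */
void sanitize_goto(unsigned char *p)
{
    unsigned char cl;          /* plays the role of the cl register */

l1:
    cl = *p;
    if (cl < ' ')
        goto l2;
    if (cl != ',')
        goto l3;
l2:
    *p = ' ';
l3:
    cl = *(p + 1);
    p = p + 1;
    if (cl != 0)
        goto l1;
}
```

Called on a writable buffer holding "a,b\tc", it should leave "a b c" behind, which gives a reference to compare the later, tidier versions against.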
Now, let's have a look at the second if. It has the following form:
if ( condition )
goto end_of_if;
<if body>
end_of_if:
And here's how we can get rid of the goto:
if ( !condition )
{
<if body>
}
Applying it to our snippet:
l1:
cl = *p;
if ( cl < ' ' )
goto l2;
if ( cl == ',' ) {
l2:
*p = ' ';
}
cl = *(p+1);
p = p + 1;
if ( cl != 0 )
goto l1;
Now, how can we get rid of goto l2? If you look at it carefully, you can see that the body at l2 will get executed if either cl < ' ' or cl == ','. So we can just combine the two conditions with a logical OR (||):
l1:
cl = *p;
if ( cl < ' ' || cl == ',' ) {
*p = ' ';
}
cl = *(p+1);
p = p + 1;
if ( cl != 0 )
goto l1;
Now we have one goto left. We have: 1) a label at the beginning of a statement block, 2) a check at the end of the block, and 3) a goto to the start of the block if the check succeeds. This is the typical pattern of a do-while loop, and we can easily convert it:
do {
    cl = *p;
    if ( cl < ' ' || cl == ',' ) {
        *p = ' ';
    }
    cl = *(p+1);
    p = p + 1;
} while ( cl != 0 );
Now the code is almost nice and pretty, but we can compress it a bit more by substituting equivalent statements:
do {
    if ( *p < ' ' || *p == ',' )
        *p = ' ';
    cl = *++p;
} while ( cl != 0 );
And, finally, the last assignment can be moved into the condition:
do {
    if ( *p < ' ' || *p == ',' )
        *p = ' ';
} while ( *++p != 0 );
Now it's obvious what the code is doing: it goes through the string and replaces all control characters (those with codes below 0x20, the space character) and commas with spaces.
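Put into a complete function, the result might look like the sketch below. The name replace_ctrl_and_commas is hypothetical, and the parameter is declared unsigned char * on the assumption that the comparison should stay unsigned like the original jb; with a plain signed char, bytes of 0x80 and above would compare as negative and get replaced as well. Like the assembly, it expects a non-empty, NUL-terminated, writable buffer.

```c
/* Sketch of the recovered loop as a standalone function.
 * Assumes s points to a writable, non-empty, NUL-terminated string. */
void replace_ctrl_and_commas(unsigned char *s)
{
    do {
        /* unsigned compare, matching the jb in the assembly */
        if (*s < ' ' || *s == ',')
            *s = ' ';
    } while (*++s != 0);
}
```

A string literal must not be passed directly, since the function writes through the pointer; copy it into an array first, for example unsigned char buf[] = "x,y";.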
Correct answer by Igor Skochinsky on July 13, 2021
Well, the Hex-Rays Decompiler was invented for exactly that. It decompiles assembly code into pseudo-C, and from there you can write the C logic of the assembly code you have.
Answered by Denis Laskov on July 13, 2021
Here's what it would have looked like in the source. __fastcall stands in for the custom leaf calling convention the compiler used when it optimized the function.
void __fastcall __forceinline RemoveControlChars(char* szInput) {
    int i;
    for (i = 0; i < 23 && *szInput; ++i, ++szInput) {
        if (*szInput < ' ' || *szInput == ',')
            *szInput = ' ';
    }
}
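One behavioural difference between a for loop and the posted snippet is worth keeping in mind: the snippet tests for the terminator only after processing a character, so on its own it is a do-while. The sketch below contrasts the two shapes on a buffer that starts with a NUL byte. Both function names are hypothetical, and the 23-iteration cap is left out because no counter appears in the posted instructions. The snippet may well be preceded by an emptiness check that was not posted, in which case the two would behave the same.

```c
/* Check-first loop, as in a for/while: stops immediately on an empty string. */
void clean_check_first(unsigned char *s)
{
    for (; *s; ++s) {
        if (*s < ' ' || *s == ',')
            *s = ' ';
    }
}

/* Check-last loop, as in the posted assembly: always processes the
 * first byte, so a leading NUL (0 < ' ') is overwritten with a space. */
void clean_check_last(unsigned char *s)
{
    do {
        if (*s < ' ' || *s == ',')
            *s = ' ';
    } while (*++s != 0);
}
```

Given the four bytes {0, 'a', ',', 0}, the check-first version leaves the buffer untouched, while the check-last version overwrites the leading NUL and keeps going until the next terminator.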
Answered by Tox1k on July 13, 2021
You can use the r2dec plugin for radare2 with the pdda command:
[0x08048060]> pdda
; assembly | /* r2dec pseudo code output */
| /* ret @ 0x8048060 */
| #include <stdint.h>
|
; (fcn) entry0 () | int32_t entry0 (void) {
| do {
| /* [01] -r-x section size 23 named .text */
0x08048060 mov cl, byte [eax] | cl = *(eax);
0x08048062 cmp cl, 0x20 |
| if (cl >= 0x20) {
0x08048065 jb 0x804806c |
0x08048067 cmp cl, 0x2c |
| if (cl != 0x2c) {
0x0804806a jne 0x804806f | goto label_0;
| }
| }
0x0804806c mov byte [eax], 0x20 | *(eax) = 0x20;
| label_0:
0x0804806f mov cl, byte [eax + 1] | cl = *((eax + 1));
0x08048072 inc eax | eax++;
0x08048073 test cl, cl |
0x08048075 jne 0x8048060 |
| } while (cl != 0);
| }
[0x08048060]>
Answered by VIII on July 13, 2021