Tutorial: Reverse Engineering 101 - Part 1
Reverse Engineering 101 - Part 1 #1
After a discussion yesterday on Discord, I've decided to run a series of threads to teach you guys the basics of reverse engineering. This will not be a short series (it wouldn't surprise me if it came out to be 20+ parts), this will not be an easy series, and I will not be using any commercial tools (IDA, Hopper, etc.) to make the job easier. If you can sit through all of this, then you might learn something.



So, part 1. In this part we will get you set up in a development environment where you will be able to compile programs, then dump and attempt to reverse engineer them. We are going to do it this way for now because it will be a lot easier for you to match code you wrote with code generated from it. You won't be expected to translate large or foreign programs for quite some time, so just sit back and relax.

Prerequisites:

1. A Linux (or BSD, which I will be using) machine. I advise using a VM for this to ensure that it isn't too cluttered.
[Image: p7oJC7B.png]

2. Working knowledge of the UNIX command line
There are a few tutorials out there by @CyberWarrior that have gotten good reviews. The first one is here

3. Working knowledge of the C programming language
Here is a link to a discussion about C tutorials

4. Knowledge of basic machine operations.
For the purposes of this class, you can get by with just the basic data operation instructions, but since we will be working primarily with assembly, the more you know the better.
Chapters 1, 2 and 3 from this book should give you enough background

5. The GNU toolchain (this one tends to create the least mangled code, so we will stick with this for now).
GNU C compiler (gcc)
GNU debugger (gdb)
GNU binutils (objdump)



So, let's write a simple hello world program and test our tools out.

First, the program itself:
Code:
#include <stdio.h>
int main()
{
    printf("Hello, World!\n");
    return 0;
}
Now, let's compile this as
Code:
gcc -o hello -g hello.c
We are using the -g flag (which adds debugging symbols) so that we can test that gdb works (not that we would normally need it).
[Image: ZUKPWWS.png]

Next, let's check that gdb works.
If you get a message like this, source-level debugging isn't going to work. I'm not worried about this on my system as I don't need source translation.
[Image: ufaptK2.png]

Now, let's make sure we can disassemble it (with objdump)
Code:
objdump -D ./hello | less
[Image: XEy87wP.png]
Now, be aware that since we enabled debugging symbols, this is going to be a lot more code than we need. Let's recompile and redump without the -g option.

[Image: o6SOBlP.png]

Now, there is a lot of garbage in there that we aren't going to worry about for now, so scroll down to the subroutine main. That's our code. Short as it may be.
Code:
push   %rbp
mov    %rsp,%rbp
mov    $0x4007d7,%edi
callq  400450
mov    $0x0,%eax
pop    %rbp
retq
nopl   0x0

Now, there are a few things I will cover on this part of the class.

1. Alignment padding
If you know anything about programming, you know functions must ALL return, so that retq (the ret instruction with the AT&T q suffix, meaning it pops a 64-bit quadword return address) is the last instruction executed in this function.
So what's the nopl 0x0 for?
Well, that's just a no-operation instruction. It sits after the retq, so it never actually runs. It's simply there as padding so that the next subroutine starts on an aligned address (4-byte here; 16-byte alignment is also common on x86-64). Ignore it.
So, if we take that and the return out, we are left with the following:

Code:
push   %rbp
mov    %rsp,%rbp
mov    $0x4007d7,%edi
callq  400450
mov    $0x0,%eax
pop    %rbp
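
If you want to see where that padding comes from, here is a minimal sketch of the arithmetic in C. The function name padding_needed and the example addresses in the comments are made up for illustration; they are not taken from the dump above.

```c
#include <stdint.h>

/*
 * Number of filler bytes (nop padding) needed after a function that ends
 * at end_addr so that the next function starts on an align-byte boundary.
 * align must be a power of two (4, 8, 16, ...).
 *
 * Hypothetical examples:
 *   a function ending at 0x400544 with 16-byte alignment needs 12 bytes
 *   a function ending at 0x400541 with 4-byte alignment needs 3 bytes
 */
uint64_t padding_needed(uint64_t end_addr, uint64_t align)
{
    uint64_t mask = align - 1;                 /* low bits that must be zero */
    return (align - (end_addr & mask)) & mask; /* 0 when already aligned */
}
```

The assembler fills that many bytes with one or more nop instructions, which is why you see them after the final retq.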

This brings me to point
2. Calling frames and frame pointers.
There are 3 instructions in here that deal with the call frame. This is a bit of code that you will see in nearly every subroutine compiled from C without optimization (it may look different, or be left out entirely, when the compiler optimizes), and it is very important. It's used so that if we use recursion or call another function, we don't overwrite our local variables (those on the stack).
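
To see why each call needs its own frame, here is a small sketch in C. The function name frames_are_distinct is hypothetical and only exists for this illustration. Each level of recursion keeps a local variable, checks that it does not share an address with its caller's local, and checks after the recursive call that its own value survived.

```c
#include <stdint.h>

/*
 * Returns 1 if every call in the recursion had its own copy of `local`
 * (a different address than its caller's copy, and a value that was not
 * overwritten by the deeper calls), otherwise 0.
 * Start it with caller_local_addr = 0, which no real local can have.
 */
int frames_are_distinct(int depth, uintptr_t caller_local_addr)
{
    volatile int local = depth;            /* stored in this call's frame */
    uintptr_t my_addr = (uintptr_t)&local;

    if (my_addr == caller_local_addr)
        return 0;                          /* would mean a shared slot */
    if (depth == 0)
        return 1;

    int deeper_ok = frames_are_distinct(depth - 1, my_addr);

    /* Reading local after the call shows the callee did not clobber it. */
    return deeper_ok && local == depth;
}
```

Calling frames_are_distinct(3, 0) walks four nested frames. If you disassemble it built without optimization, you should find the same push/mov/pop pattern around the body that we see in main.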
The first part sets up the call frame:
Code:
push    %rbp
mov    %rsp,%rbp
The push instruction saves the caller's frame pointer to the stack, then we copy our stack pointer into rbp to set up a new frame. rbp is a pointer to the base of our stack frame, which is where local variables live. This is common with Intel instruction sets; other platforms (such as RISC machines) often handle frames differently.
We also have a pop %rbp at the end, which restores the caller's frame pointer so we can return properly. Let's trim the frame stuff out.

Code:
mov    $0x4007d7,%edi
callq  400450
mov    $0x0,%eax

Now, here is the code we will reverse engineer.
The first line is the one that will throw many off, because we have to actually go look for it. It loads a value into edi, which holds the first argument under the x86-64 calling convention. Since we know that the call target 400450 is puts (we saw that from objdump earlier), we can check to see if that value is the address of a static string (a constant, not a variable). Use the strings command for this.
[Image: sy7tAb4.png]
We see that there is a string at offset 0x7d7, which matches address 0x4007d7 if the file is mapped at 0x400000 (an assumption, but it fits the numbers here). So, you can think of this (briefly) as a variable assignment (even though it's not).
Code:
str = "Hello, World!";
The next instruction is a call to puts(const char *), so scrap your variable assignment and insert the string directly
Code:
puts("Hello, World!");
Yes, the newline is missing, and strings is not to blame: gcc turned our printf call into puts, which appends the newline itself, so the string stored in the binary doesn't contain one.
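
The lookup we just did by hand can be sketched in C as follows. This is only an illustration: the names vaddr_to_file_offset and find_cstring are mine, and the assumption that the segment is mapped at 0x400000 from file offset 0 is inferred from the numbers in this example, not read from the ELF headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Translate a virtual address into an offset inside the file, given where
 * the containing segment is mapped (seg_vaddr) and where it starts in the
 * file (seg_offset). With the assumed values 0x400000 and 0, the address
 * 0x4007d7 becomes file offset 0x7d7.
 */
uint64_t vaddr_to_file_offset(uint64_t vaddr, uint64_t seg_vaddr,
                              uint64_t seg_offset)
{
    return vaddr - seg_vaddr + seg_offset;
}

/*
 * Find a NUL-terminated string inside a raw byte buffer (for example the
 * contents of the executable read into memory). Returns the offset of the
 * first match, or -1 if the string is not present.
 */
long find_cstring(const unsigned char *buf, size_t len, const char *needle)
{
    size_t n = strlen(needle) + 1;   /* also match the terminating NUL */

    for (size_t i = 0; i + n <= len; i++) {
        if (memcmp(buf + i, needle, n) == 0)
            return (long)i;
    }
    return -1;
}
```

If the offset returned by find_cstring for "Hello, World!" equals the result of vaddr_to_file_offset for the value moved into edi, you have confirmed which string is being passed to puts.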

So, the only line we have left is
Code:
mov    $0x0,%eax
Once again, consider that a variable assignment briefly, until we see that the next instruction is retq, indicating (via the C ABI, which returns integer values in eax) that the return value is 0.

So, our function main has been reversed to
Code:
int main() //no args because empty frame
{
        puts("Hello, World!");
        return 0;
}
You see we didn't come up with exactly the same code, and we almost never will. This is because the compiler transforms the code for us (here, our printf came out as puts), and we can't recover exactly what the original author was thinking.
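
As a concrete illustration of that kind of rewrite: gcc commonly replaces a printf call with puts when the format string is a plain literal ending in a newline and the result is unused. The two functions below (the names hello_printf and hello_puts are mine, not from the original program) produce the same output, which is why the disassembly can show puts even though the source said printf.

```c
#include <stdio.h>

/* What the source code says. The return value of printf is not used. */
void hello_printf(void)
{
    printf("Hello, World!\n");
}

/* What the disassembly shows: the literal has no '\n' because puts
   writes the newline itself. */
void hello_puts(void)
{
    puts("Hello, World!");
}
```

Whether your compiler performs this substitution depends on its version and flags, so compile both and compare the objdump output on your own system before relying on it.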



Well, there you have it! In the next part we will explore local variables.

RE: Reverse Engineering 101 - Part 1 #2
Nice introduction, I've been trying to get a look into this for a while. I have some minimal experience messing with GDB and it's a lot to take in.

RE: Reverse Engineering 101 - Part 1 #3
Posting for future reading please also tag me in new ones as you create.
@Skullmeat @phyrrus9 @Bish0pQ @mr.kurd and @ender are my best friends on SL

RE: Reverse Engineering 101 - Part 1 #4
I see that you followed my advice, very nice tutorial. Looking forward to the rest!

RE: Reverse Engineering 101 - Part 1 #5
(10-12-2016, 05:42 AM)Slacker Wrote: Posting for future reading please also tag me in new ones as you create.

I can't promise a tag, but I'll probably pop one off every 4-5 days.

(10-12-2016, 09:22 AM)Bish0pQ Wrote: I see that you followed my advice, very nice tutorial. Looking forward to the rest!

I tried to break it out into the fundamentals. This one was mostly aimed at trimming off the fat, the next one will be block identification in sort.

RE: Reverse Engineering 101 - Part 1 #6
Nice tutorial that you have made. Gonna watch out for the other parts :)

RE: Reverse Engineering 101 - Part 1 #7
(12-27-2016, 09:53 AM)Schism Wrote: Nice tutorial that you have made. Gonna watch out for the other parts :)

There's actually 3 parts already out, I haven't had the time recently to do any more but it's certainly on my list. Glad you enjoyed it

RE: Reverse Engineering 101 - Part 1 #8
(12-27-2016, 10:07 AM)phyrrus9 Wrote:
(12-27-2016, 09:53 AM)Schism Wrote: Nice tutorial that you have made. Gonna watch out for the other parts :)

There's actually 3 parts already out, I haven't had the time recently to do any more but it's certainly on my list. Glad you enjoyed it

That's great to know! Gonna read them and I'll wait for your other releases soon :)

RE: Reverse Engineering 101 - Part 1 #9
I think I'm grave digging the hell out of this but wow.... This actually is a pretty darn comprehensive tutorial. Thanks.

RE: Reverse Engineering 101 - Part 1 #10
(12-27-2016, 10:47 PM)ProfessorChill Wrote: I think I'm grave digging the hell out of this but wow.... This actually is a pretty darn comprehensive tutorial. Thanks.

I honestly think it's pretty impossible to gravedig any of my threads. I try to write them in such a way that conversation, no matter what the quality, will always bring a positive impact. The fact that this thread got "gravedug" in the first place is what exposed you to it.
