
BINARY EXPLOITATION

Introduction

Binary Exploitation is about finding vulnerabilities in programs and utilising them to do what you wish. Sometimes this can result in an authentication bypass or the leaking of classified information, but occasionally (if you’re lucky) it can also result in Remote Code Execution (RCE). The most basic forms of binary exploitation occur on the stack, a region of memory that stores temporary variables created by functions in code. When a new function is called, a memory address in the calling function is pushed to the stack - this way, the program knows where to return to once the called function finishes execution. Let’s look at a basic binary to show this.

Prerequisites

This post uses radare2, pwndbg, and similar tools for dynamic analysis. The pwntools Python package is also useful for binary exploitation.

Investigation

  • Basic
file ./example
strings ./example

objdump -d ./example
# -M intel: print the disassembly in Intel syntax
objdump -M intel -d ./example

Security Properties

First check the executable properties.

checksec --file=./example

RELRO (stands for Relocation Read-Only)

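  • Partial RELRO - Part of the global offset table (GOT), the entries used for lazily resolved functions, stays writable, so those GOT entries can be overwritten.

  • Full RELRO - The whole GOT is read-only once the program starts, so we cannot overwrite GOT entries.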

STACK CANARY

  • No canary found - Nothing detects a stack buffer overflow before the function returns, so an overflow can overwrite the saved return address unnoticed.

NX (stands for Non-eXecutable segments)

  • NX enabled - We cannot execute custom shellcode from the stack.

PIE (stands for Position Independent Executable)

  • No PIE - The binary is always loaded at the same base address, so the addresses of its code and data are known in advance.
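As a rough illustration of what the PIE line reflects, the sketch below reads the e_type field of an ELF header using only Python's standard library. The names elf_pie_status and fake_header, and the hand-built sample headers, are assumptions made for this example; checksec itself performs more checks (for instance, it tells shared libraries apart from PIE executables).

```python
import struct

ET_EXEC = 2  # fixed-address executable (reported as "No PIE")
ET_DYN = 3   # shared object or position-independent executable


def elf_pie_status(header: bytes) -> str:
    """Classify an ELF file from the first 18 bytes of its header."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF header")
    # e_ident[5] gives the byte order: 1 = little-endian, 2 = big-endian.
    endian = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field located right after the 16-byte e_ident.
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    if e_type == ET_EXEC:
        return "No PIE"
    if e_type == ET_DYN:
        return "PIE or shared object"
    return "other"


def fake_header(e_type: int) -> bytes:
    """Build a minimal little-endian 64-bit ELF header prefix (demo only)."""
    e_ident = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9)
    return e_ident + struct.pack("<H", e_type)


print(elf_pie_status(fake_header(ET_EXEC)))
print(elf_pie_status(fake_header(ET_DYN)))
```

To try it on a real file, pass the first 18 bytes of the binary, for example open("./example", "rb").read(18).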

ASLR (Address Space Layout Randomization) in Machine

ASLR is a security technique that randomizes where the stack, heap, and shared libraries are placed in memory, which makes memory corruption vulnerabilities harder to exploit reliably. Check the current setting on the machine:

cat /proc/sys/kernel/randomize_va_space
2

  • 0 - The address space is NOT randomized.
  • 1 - The stack, mmap base (shared libraries), and vDSO are randomized.
  • 2 - Same as 1, and the heap (brk) is randomized as well.
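The same check can be scripted. This is a minimal sketch, assuming a Linux machine; the helper names read_aslr and describe_aslr are made up for this example, and the descriptions are short paraphrases rather than official wording.

```python
from pathlib import Path

ASLR_MODES = {
    0: "disabled",
    1: "stack, mmap base, shared libraries and vDSO randomized",
    2: "same as 1, plus heap (brk) randomized",
}


def describe_aslr(value: int) -> str:
    """Translate a randomize_va_space value into a short description."""
    return ASLR_MODES.get(value, "unknown value")


def read_aslr(path: str = "/proc/sys/kernel/randomize_va_space"):
    """Return the current setting, or None when the file is unavailable."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


setting = read_aslr()
if setting is None:
    print("randomize_va_space is not readable on this system")
else:
    print(setting, "-", describe_aslr(setting))
```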

x86 Architecture

General-Purpose Registers (GPR) - 16-bit naming conventions

The 8 GPRs are as follows :

  • Accumulator register (AX). Used in arithmetic operations.

  • Counter register (CX). Used in shift/rotate instructions and loops.

  • Data register (DX). Used in arithmetic operations and I/O operations.

  • Base register (BX). Used as a pointer to data (located in segment register DS, when in segmented mode).

  • Stack Pointer register (SP). Pointer to the top of the stack.

  • Stack Base Pointer register (BP). Used to point to the base of the stack.

  • Source Index register (SI). Used as a pointer to a source in stream operations.

  • Destination Index register (DI). Used as a pointer to a destination in stream operations.

Identifiers to access registers and parts thereof.

Register     64-bit  32-bit  16-bit  8-bit
Accumulator  RAX     EAX     AX      AL
Counter      RCX     ECX     CX      CL
Data         RDX     EDX     DX      DL
Base         RBX     EBX     BX      BL
Stack Ptr    RSP     ESP     SP      SPL
Base Ptr     RBP     EBP     BP      BPL
Source       RSI     ESI     SI      SIL
Destination  RDI     EDI     DI      DIL

Register

Registers are essentially places where the processor can store data. You can think of them as buckets that the processor keeps information in. Here is a list of the x64 registers and their common use cases.

  • rbp : Base Pointer, points to the bottom of the current stack frame

  • rsp : Stack Pointer, points to the top of the current stack frame

  • rip : Instruction Pointer, points to the next instruction to be executed

Register  Description
rax       Accumulator register
rbx       Base register
rcx       Counter register
rdx       Data register
rsi       Source index register
rdi       Destination index register
r8        General-purpose register
r9        General-purpose register
r10       General-purpose register
r11       General-purpose register
r12       General-purpose register
r13       General-purpose register
r14       General-purpose register
r15       General-purpose register

There are sixteen 64-bit General Purpose Registers (GPRs), listed in the following table. A GPR can be accessed as the full 64 bits, or only its lower 32, 16, or 8 bits can be accessed.

64-bit Register  32-bit  16-bit  8-bit
rax              eax     ax      al
rbx              ebx     bx      bl
rcx              ecx     cx      cl
rdx              edx     dx      dl
rsi              esi     si      sil
rdi              edi     di      dil
rbp              ebp     bp      bpl
rsp              esp     sp      spl
r8               r8d     r8w     r8b
r9               r9d     r9w     r9b
r10              r10d    r10w    r10b
r11              r11d    r11w    r11b
r12              r12d    r12w    r12b
r13              r13d    r13w    r13b
r14              r14d    r14w    r14b
r15              r15d    r15w    r15b

In x64 Linux, arguments to a function are passed via registers. The first six integer or pointer arguments are passed in these registers :

Register  Argument Number
rdi       First Argument
rsi       Second Argument
rdx       Third Argument
rcx       Fourth Argument
r8        Fifth Argument
r9        Sixth Argument
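To make the table concrete, here is a small sketch that assigns a list of integer arguments to registers in the order shown, with anything past the sixth argument left for the stack. The function name place_arguments is hypothetical, and the model ignores floating-point arguments, which are passed differently.

```python
SYSV_ARG_REGISTERS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]


def place_arguments(args):
    """Map integer/pointer arguments to registers; extras go on the stack."""
    in_registers = dict(zip(SYSV_ARG_REGISTERS, args))
    on_stack = list(args[len(SYSV_ARG_REGISTERS):])
    return in_registers, on_stack


regs, stack = place_arguments([10, 20, 30, 40, 50, 60, 70, 80])
print(regs)   # first six arguments, keyed by register name
print(stack)  # remaining arguments, which would be passed on the stack
```

When building a ROP chain or calling a function by hand, this is the mapping that decides which register each value has to be loaded into.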

With 32-bit x86 ELF binaries, arguments are passed on the stack instead. Also, as you may know, a C function can return a value. In x64, this value is returned in the rax register. In x86, it is returned in the eax register.

Registers also come in different sizes. The typical sizes we will be dealing with are 8 bytes, 4 bytes, 2 bytes, and 1 byte. The different sizes are historical: as the architecture grew from 16 to 32 to 64 bits, the registers were widened, and the older, narrower names still refer to the lower part of each register.

8 Byte Register  Lower 4 Bytes  Lower 2 Bytes  Lower Byte
rbp              ebp            bp             bpl
rsp              esp            sp             spl
rip              eip
rax              eax            ax             al
rbx              ebx            bx             bl
rcx              ecx            cx             cl
rdx              edx            dx             dl
rsi              esi            si             sil
rdi              edi            di             dil
r8               r8d            r8w            r8b
r9               r9d            r9w            r9b
r10              r10d           r10w           r10b
r11              r11d           r11w           r11b
r12              r12d           r12w           r12b
r13              r13d           r13w           r13b
r14              r14d           r14w           r14b
r15              r15d           r15w           r15b

In x64 we will see the 8 byte registers. In x86, however, the largest registers we can use are the 4 byte ones such as ebp, esp, and eip. We can also use registers smaller than the maximum size for the architecture.

In x64 there are the rax, eax, ax, and al registers. The rax register refers to the full 8 bytes. The eax register is just the lower 4 bytes of rax. The ax register is the lowest 2 bytes of rax. Lastly, the al register is the lowest byte of rax.
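The relationship between these names can be modelled with bit masks. The sketch below is only an illustration: the function name sub_registers and the sample value are assumptions, not output from a real debugger.

```python
def sub_registers(rax: int) -> dict:
    """Return the views of a 64-bit value seen through rax, eax, ax and al."""
    rax &= 0xFFFFFFFFFFFFFFFF           # keep the value within 64 bits
    return {
        "rax": rax,                     # all 8 bytes
        "eax": rax & 0xFFFFFFFF,        # lower 4 bytes
        "ax": rax & 0xFFFF,             # lower 2 bytes
        "al": rax & 0xFF,               # lowest byte
    }


views = sub_registers(0x1122334455667788)
for name, value in views.items():
    print(f"{name:>3} = {value:#x}")
```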

Stack

In computer architecture, the stack is a hardware manifestation of the stack data structure (a Last In, First Out queue).

In x86, the stack is simply an area in RAM that was chosen to be the stack - there is no special hardware to store stack contents. The esp/rsp register holds the address in memory where the top of the stack resides. When something is pushed to the stack, esp decrements by 4 (or rsp by 8 on 64-bit x86), and the value that was pushed is stored at that location in memory. Likewise, when a pop instruction is executed, the value at esp is retrieved (i.e. esp is dereferenced), and esp is then incremented by 4 (or 8).

N.B. The stack "grows" down to lower memory addresses!
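The push and pop behaviour described above can be sketched as a toy model. The class name Stack64 and the starting address are arbitrary choices for this example; a dictionary stands in for RAM.

```python
class Stack64:
    """Toy model of a 64-bit stack: memory is a dict keyed by address."""

    def __init__(self, rsp: int = 0x7FFFFFFFE000):
        self.rsp = rsp
        self.memory = {}

    def push(self, value: int) -> None:
        self.rsp -= 8                   # the stack grows toward lower addresses
        self.memory[self.rsp] = value   # store the value at the new top

    def pop(self) -> int:
        value = self.memory[self.rsp]   # read the value at the top
        self.rsp += 8                   # shrink the stack; old bytes stay in memory
        return value


stack = Stack64()
stack.push(0x41)
stack.push(0x42)
print(hex(stack.rsp))                       # 16 bytes below the starting address
print(hex(stack.pop()), hex(stack.pop()))   # last value pushed comes out first
```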

Conventionally, ebp/rbp contains the address of the base of the current stack frame (its highest address, since the stack grows down), and so local variables are often referenced as an offset relative to ebp rather than an offset to esp. A stack frame is essentially just the space used on the stack by a given function.

Uses

The stack is primarily used for a few things :

  • Storing function arguments

  • Storing local variables

  • Storing processor state between function calls.

Now one of the most common memory regions you will be dealing with is the stack. It is where local variables in the code are stored.

For instance, in this code the variable x is stored in the stack :

#include <stdio.h>

void main(void)
{
    int x = 5;
    puts("hi");
}

Specifically we can see it is stored on the stack at rbp-0x4.

0000000000001135 <main>:
    1135:       55                      push   rbp
    1136:       48 89 e5                mov    rbp,rsp
    1139:       48 83 ec 10             sub    rsp,0x10
    113d:       c7 45 fc 05 00 00 00    mov    DWORD PTR [rbp-0x4],0x5
    1144:       48 8d 3d b9 0e 00 00    lea    rdi,[rip+0xeb9]        # 2004 <_IO_stdin_used+0x4>
    114b:       e8 e0 fe ff ff          call   1030 <puts@plt>
    1150:       90                      nop
    1151:       c9                      leave  
    1152:       c3                      ret    
    1153:       66 2e 0f 1f 84 00 00    nop    WORD PTR cs:[rax+rax*1+0x0]
    115a:       00 00 00
    115d:       0f 1f 00                nop    DWORD PTR [rax]
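The raw bytes in the listing encode the same information as the mnemonics. As a rough sketch, the snippet below decodes the -0x4 displacement and the immediate 5 from the mov, and recomputes the call target from its relative offset. Only the bytes and addresses come from the listing; the variable names are made up for this example.

```python
import struct

# Bytes copied from the listing: mov DWORD PTR [rbp-0x4],0x5
mov_bytes = bytes.fromhex("c745fc05000000")
(disp8,) = struct.unpack_from("<b", mov_bytes, 2)    # signed 8-bit displacement
(imm32,) = struct.unpack_from("<I", mov_bytes, 3)    # little-endian 32-bit immediate

# Bytes copied from the listing: call 1030 <puts@plt>, located at 0x114b
call_addr = 0x114B
call_bytes = bytes.fromhex("e8e0feffff")
(rel32,) = struct.unpack_from("<i", call_bytes, 1)   # signed 32-bit offset
# The offset is relative to the address of the instruction after the call.
call_target = call_addr + len(call_bytes) + rel32

print(disp8, imm32, hex(call_target))  # prints: -4 5 0x1030
```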

Values are normally added to the stack by pushing them and removed by popping them (it is a LIFO data structure). However, we can still read or write any value on the stack by referencing it relative to rsp or rbp, as the mov to [rbp-0x4] above does, and the compiler can reserve space by adjusting rsp directly, as in sub rsp,0x10.

The bounds of the current stack frame are recorded by two registers, rbp and rsp. The base pointer rbp points to the bottom (base) of the frame. The stack pointer rsp points to the top of the stack, which is the lowest address in use.

Flags

There is one register that contains flags. A flag is a particular bit of this register. Whether it is set or clear typically records something about the result of a previous instruction or about the processor state. Here is the list of flags.

Bit  Flag Name                      Description
00   Carry Flag (CF)                Set when an operation produces a carry or borrow (unsigned overflow)
01   Reserved                       Always set to 1
02   Parity Flag (PF)               Set when the low byte of the result has an even number of 1 bits
03   Reserved                       Always set to 0
04   Adjust Flag (AF)               Set on a carry or borrow out of the low 4 bits (used for BCD arithmetic)
05   Reserved                       Always set to 0
06   Zero Flag (ZF)                 Set when the result of an operation is zero
07   Sign Flag (SF)                 Copy of the most significant bit of the result (set when it is negative)
08   Trap Flag (TF)                 Enables single-step execution for debugging purposes
09   Interrupt Enable Flag (IF)     Enables or disables maskable hardware interrupts
10   Direction Flag (DF)            Specifies the direction for string instructions
11   Overflow Flag (OF)             Set on signed arithmetic overflow
12   I/O Privilege Level (low bit)  Lower bit of the privilege level for I/O operations
13   I/O Privilege Level (high bit) Higher bit of the privilege level for I/O operations
14   Nested Task Flag (NT)          Indicates that the current task is nested
15   Reserved                       Always set to 0 on modern processors
16   Resume Flag (RF)               Temporarily suppresses debug exceptions from instruction breakpoints

There are other flags than the ones listed, but we rarely deal with them (and even among these, there are only a few we actively use, mainly the zero, sign, carry, and overflow flags).
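When a debugger prints the flags register as a number, the individual flags can be recovered by testing bits. This is a minimal sketch: decode_flags is a hypothetical helper, the abbreviations are the usual short names for the flags, and 0x246 is just a sample value chosen for the example.

```python
# Bit position -> abbreviation for the status and control flags.
FLAG_BITS = {
    0: "CF", 2: "PF", 4: "AF", 6: "ZF", 7: "SF",
    8: "TF", 9: "IF", 10: "DF", 11: "OF",
}


def decode_flags(value: int) -> list:
    """Return the names of the flags that are set in a flags register value."""
    return [name for bit, name in FLAG_BITS.items() if value & (1 << bit)]


print(decode_flags(0x246))  # prints: ['PF', 'ZF', 'IF']
```

Bit 1 is also set in 0x246, but it is a reserved bit that always reads as 1, so the helper does not report it.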

Instructions

Now we will cover some of the more common instructions. This isn’t everything you will run into, but these are the ones that show up most often.

  • mov

The mov instruction copies data from one place to another, for example from one register to another. For instance :

mov rax, rdx

This will copy the data from the rdx register into the rax register; rdx itself is unchanged.

dereference

If you ever see brackets like [], they are meant to dereference, which deals with pointers. A pointer is a value that points to a particular memory address (it is a memory address). Dereferencing a pointer means to treat a pointer like the value it points to. For instance :

mov rax, [rdx]

Will move the value pointed to by rdx into the rax register. On the flipside :

mov [rax], rdx

Will move the value of the rdx register into whatever memory is pointed to by the rax register. The actual value of the rax register does not change.
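The three forms of mov can be imitated with two dictionaries, one for registers and one for memory. This is only a toy model; the helper names and the sample addresses are assumptions made for the example.

```python
regs = {"rax": 0x1000, "rdx": 0x2000}
memory = {0x1000: 0xAAAA, 0x2000: 0xBBBB}


def mov_reg_reg(dst: str, src: str) -> None:
    """mov dst, src: copy one register into another."""
    regs[dst] = regs[src]


def mov_load(dst: str, src: str) -> None:
    """mov dst, [src]: copy the value that src points to into dst."""
    regs[dst] = memory[regs[src]]


def mov_store(dst: str, src: str) -> None:
    """mov [dst], src: write src into the memory that dst points to."""
    memory[regs[dst]] = regs[src]


mov_store("rax", "rdx")   # memory at 0x1000 now holds 0x2000; rax is unchanged
print(hex(regs["rax"]), hex(memory[0x1000]))

mov_load("rax", "rdx")    # rax now holds the value stored at address 0x2000
print(hex(regs["rax"]))
```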

  • lea

The lea (load effective address) instruction calculates the address of the second operand and stores that address in the first operand, without accessing memory. For instance :

lea rdi, [rbx+0x10]

This will store the address rbx+0x10 in the rdi register.
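Because lea and a mov with brackets look alike, it can help to see them side by side. In this sketch (function names and values are illustrative assumptions), lea returns the computed address while the mov-style load returns what is stored there.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def lea(base: int, offset: int) -> int:
    """lea dst, [base+offset]: return the computed address itself."""
    return (base + offset) & MASK64


def mov_load(memory: dict, base: int, offset: int) -> int:
    """mov dst, [base+offset]: return the value stored at that address."""
    return memory[(base + offset) & MASK64]


memory = {0x5010: 0xDEADBEEF}
rbx = 0x5000
print(hex(lea(rbx, 0x10)))               # the address
print(hex(mov_load(memory, rbx, 0x10)))  # the value stored at that address
```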

  • add

This adds the two operands together and stores the sum in the first operand. For instance:

add rax, rdx

That will set rax equal to rax + rdx.

  • sub

This instruction subtracts the second operand from the first one and stores the difference in the first operand. For instance :

sub rsp, 0x10

This will set the rsp register equal to rsp - 0x10.

  • xor

This performs the bitwise xor operation on its two operands and stores the result in the first operand :

xor rdx, rax

That will set the rdx register equal to rdx ^ rax.

The and and or instructions work the same way, except that they apply the bitwise and and or operators.
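Python has the same bitwise operators, so the effect of these instructions is easy to check. The sketch below uses made-up helper names and masks results to 64 bits to mimic a register; xoring a register with itself, a common idiom in compiled code, always gives zero.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def xor64(a: int, b: int) -> int:
    """xor a, b: bits that differ between the operands."""
    return (a ^ b) & MASK64


def and64(a: int, b: int) -> int:
    """and a, b: bits set in both operands."""
    return (a & b) & MASK64


def or64(a: int, b: int) -> int:
    """or a, b: bits set in either operand."""
    return (a | b) & MASK64


rdx, rax = 0b1100, 0b1010
print(bin(xor64(rdx, rax)), bin(and64(rdx, rax)), bin(or64(rdx, rax)))
print(xor64(rax, rax))  # xor of a register with itself clears it
```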

  • push

The push instruction grows the stack by 8 bytes (on x64; 4 bytes on x86), then copies the contents of a register into the new stack space. For instance :

push rax

This will grow the stack by 8 bytes, and the contents of the rax register will be on top of the stack.

  • pop

The pop instruction copies the top 8 bytes (on x64; 4 bytes on x86) of the stack into its operand, then shrinks the stack by the same amount. For instance:

pop rax

The top 8 bytes of the stack will end up in the rax register.

  • jmp

The jmp instruction will jump to an instruction address. It is used to redirect code execution. For instance:

jmp 0x602010

That instruction will cause the code execution to jump to 0x602010, and execute whatever instruction is there.

  • call & ret

This is similar to the jmp instruction. The difference is that call first pushes the return address (the address of the instruction right after the call) onto the stack, then jumps to whatever address it is given. This is used for calling functions. The caller’s rbp is not saved by call itself; the called function saves it with push rbp in its prologue and restores it with leave or pop rbp before returning. When the function is finished, a ret instruction pops the saved return address back into rip, so execution continues right where it left off.
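Here is a toy model of the pair, in which call pushes only the return address and ret pops it back into rip. The class name Cpu and the starting rsp are assumptions for the example; the addresses 0x114b and 0x1030 are those of the call to puts in the main listing earlier in the post.

```python
class Cpu:
    """Minimal model of rip, rsp and stack memory for call/ret."""

    def __init__(self, rsp: int = 0x8000):
        self.rip = 0
        self.rsp = rsp
        self.memory = {}

    def call(self, target: int, instr_len: int = 5) -> None:
        return_address = self.rip + instr_len   # instruction after the call
        self.rsp -= 8
        self.memory[self.rsp] = return_address  # push the return address
        self.rip = target                       # jump to the function

    def ret(self) -> None:
        self.rip = self.memory[self.rsp]        # pop the return address into rip
        self.rsp += 8


cpu = Cpu()
cpu.rip = 0x114B                # address of the call instruction
cpu.call(0x1030)                # call puts@plt
saved = cpu.memory[cpu.rsp]     # what now sits on top of the stack
cpu.ret()
print(hex(saved), hex(cpu.rip))
```

This saved return address is exactly what a stack buffer overflow aims to overwrite: whatever value sits at that stack slot when ret runs becomes the new rip.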

  • cmp

The cmp instruction is similar to the sub instruction, except that it doesn’t store the result in the first operand. It only sets the flags according to whether the result is less than zero, greater than zero, or equal to zero, so that a following conditional jump can act on the comparison.

  • jnz / jz

The jump if not zero and jump if zero (jnz/jz) instructions are similar to the jmp instruction. The difference is that they only take the jump depending on the state of the zero flag: jz jumps only if the zero flag is set, and jnz jumps only if it is clear.
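The interaction between cmp and the conditional jumps can be sketched as follows. The functions are illustrative assumptions that model only three flags (ZF, SF, CF) for 64-bit operands; a real cmp also updates OF, PF, and AF.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def cmp(a: int, b: int) -> dict:
    """Compute a - b as cmp does and return the flags; the result is discarded."""
    a &= MASK64
    b &= MASK64
    result = (a - b) & MASK64
    return {
        "ZF": result == 0,           # operands are equal
        "SF": bool(result >> 63),    # top bit of the result
        "CF": a < b,                 # unsigned borrow
    }


def jz_taken(flags: dict) -> bool:
    """jz jumps when the zero flag is set."""
    return flags["ZF"]


def jnz_taken(flags: dict) -> bool:
    """jnz jumps when the zero flag is clear."""
    return not flags["ZF"]


print(jz_taken(cmp(5, 5)), jnz_taken(cmp(5, 5)))
print(jz_taken(cmp(3, 5)), jnz_taken(cmp(3, 5)))
```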

Analysis

An analysis starts with two checks:

First, run file on the binary to confirm its executable format (ELF on Linux) and whether it is 32-bit or 64-bit. It is recommended to follow along in a virtual machine of your own, preferably running Linux.

Then check which protections the binary was built with using the checksec command, as described above.

GDB Introductions

GDB, or the GNU Debugger, is the standard debugger of Linux systems, developed by the GNU Project. It has been ported to many systems and supports C, C++, Objective-C, Fortran, Go, Rust, and other languages.

GDB provides us with the usual traceability features like breakpoints or stack trace output and allows us to intervene in the execution of programs. It also allows us, for example, to manipulate the variables of the application or to call functions independently of the normal execution of the program.

We use the GNU Debugger (GDB) to view the compiled binary at the assembly level. Once we have loaded the binary in GDB, we can disassemble the program’s main function.

  • Start Debug
# Make the binary executable so it can be run under the debugger
chmod +x example

# -q: quiet mode, suppresses the startup banner
gdb -q example

  • $ gdb -q <file>
  • $ r2 -d -A <file>

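For radare2, -d starts the binary under the debugger and -A runs analysis when it loads.

Once the binary is loaded in the debugger, the first thing to look at is the list of functions in the program (for example with info functions in GDB or afl in radare2).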

example :

0x08049000  _init
0x08049030  gets@plt
0x08049040  puts@plt
0x08049050  __libc_start_main@plt
0x08049060  _start
0x080490a0  _dl_relocate_static_pie
0x080490b0  __x86.get_pc_thunk.bx
0x080490c0  deregister_tm_clones
0x08049100  register_tm_clones
0x08049140  __do_global_dtors_aux
0x08049170  frame_dummy
0x08049172  unsafe
0x080491ab  main
0x080491c3  __x86.get_pc_thunk.ax
0x080491d0  __libc_csu_init
0x08049230  __libc_csu_fini
0x08049231  __x86.get_pc_thunk.bp
0x08049238  _fini

Note that this is only an example: the list of functions and their addresses differ from program to program.
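A listing like this is easy to turn into a lookup table for later use in an exploit script. The sketch below parses a few lines copied from the example output; parse_functions is a hypothetical helper, and it assumes each line has exactly an address followed by a name.

```python
LISTING = """\
0x08049030  gets@plt
0x08049172  unsafe
0x080491ab  main
"""


def parse_functions(text: str) -> dict:
    """Map function names to addresses from 'address  name' lines."""
    functions = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0].startswith("0x"):
            functions[parts[1]] = int(parts[0], 16)
    return functions


funcs = parse_functions(LISTING)
print(hex(funcs["unsafe"]))
# Imported functions such as gets are worth noting: gets performs no bounds checking.
print([name for name in funcs if name.endswith("@plt")])
```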

This post is licensed under CC BY 4.0 by the author.