Buffer Overflow Vulnerabilities
Buffer overflow vulnerabilities occur when an application copies user-controllable data into a memory buffer that is not sufficiently large to accommodate it. The destination buffer is overflowed, resulting in adjacent memory being overwritten with the user’s data. Depending on the nature of the vulnerability, an attacker may be able to exploit it to execute arbitrary code on the
server or perform other unauthorized actions. Buffer overflow vulnerabilities have been hugely prevalent in native software over the years, and have been widely regarded as the Public Enemy Number One that developers of such software need to avoid.
Stack Overflows
Buffer overflows typically arise when an application uses an unbounded copy operation (such as strcpy in C) to copy a variable-size buffer into a fixed-size buffer without verifying that the fixed-sized buffer is large enough. For example, the following function copies the username string into a fixed-size buffer allocated on the stack:
bool CheckLogin(char* username, char* password)
{
char _username[32];
strcpy(_username, username);
…
If the username string contains more than 32 characters, the _username buffer is overflowed, and the attacker will overwrite the data in adjacent memory. In a stack-based buffer overflow, a successful exploit typically involves over writing the saved return address on the stack. When the CheckLogin function is called, the processor pushes onto the stack the address of the instruction following the call. When CheckLogin is finished, the processor pops this address back off the stack and returns execution to that instruction. In the meantime, the CheckLogin function allocates the _username buffer on the stack right next to the saved return address. If an attacker can overflow the _username buffer, he can overwrite the saved return address with a value of his choosing, thereby causing the processor to jump to this address and execute arbitrary code.
Heap Overflows
Heap-based buffer overflows essentially involve the same kind of unsafe operation as described previously, except that the overflowed destination buffer is allocated on the heap, not the stack:
bool CheckLogin(char* username, char* password)
{
char* _username = (char*) malloc(32);
strcpy(_username, username);
…
In a heap-based buffer overflow, what is typically adjacent to the destination buffer is not any saved return address but other blocks of heap memory, separated by heap control structures. The heap is implemented as a doubly linked list: each block is preceded in memory by a control structure that contains the size of the block, a pointer to the previous block on the heap, and a pointer to the next block on the heap. When a heap buffer is overflowed, the control structure of an adjacent heap block is overwritten with user-controllable data. This type of vulnerability is less straightforward to exploit than a stack-based overflow, but a common approach is to write crafted values into the overwritten heap control structure so as to cause an arbitrary overwrite of a critical pointer at some future time. When the heap block whose control structure has been overwritten is freed from memory, the heap manager needs to update the linked list of heap blocks. To do this, it needs to update the back link pointer of the following heap block, and update the forward link pointer of the preceding heap block, so that these two items in the linked list point to each other. To do this, it uses the values in the overwritten control structure. Specifically, in order to update the following block’s back link pointer, the heap manager dereferences the forward link pointer taken from the overwritten control structure and writes into the structure at this address the value of the back link pointer taken from the overwritten control structure. In other words, it writes a user-controllable value to a user-controllable address. If an attacker has crafted his overflow data appropriately, he can overwrite any pointer in memory with a value of his choosing, with the objective of seizing control of the path of execution and so executing arbitrary code. Typical targets for the arbitrary pointer overwrite are the value of a function pointer that will later be called by the application, or the address of an exception handler that will be invoked the next time an exception occurs.
“Off-by-One” Vulnerabilities
A specific kind of overflow vulnerability arises where a programming error enables an attacker to write a single byte (or a small number of bytes) beyond the end of an allocated buffer.
Consider the following code, which allocates a buffer on the stack, performs a counted buffer copy operation, and then null-terminates the destination string:
bool CheckLogin(char* username, char* password)
{
char _username[32];
int i;
for (i = 0; username[i] && i < 32; i++)
_username[i] = username[i];
_username[i] = 0;
…
The code copies up to 32 bytes and then adds the null terminator. Hence, if the username is 32 bytes or longer, the null byte will be written beyond the end of the _username buffer, corrupting adjacent memory. This condition may be exploitable: if the adjacent item on the stack is the saved frame pointer of the calling frame, then setting the lower-order byte to zero may cause it to point into the _username buffer, and so to data that the attacker controls. When the calling function returns, this may enable an attacker to take control of the flow of execution.
A similar kind of vulnerability arises when developers overlook the need for string buffers to include room for a null terminator. Consider the following “fix” to the original heap overflow:
bool CheckLogin(char* username, char* password)
{
char* _username = (char*) malloc(32);
strncpy(_username, username, 32);
…
Detecting Buffer Overflow Vulnerabilities
The basic methodology for detecting buffer overflow vulnerabilities is to send long strings of data to an identified target and monitor for anomalous results. In some cases, subtle vulnerabilities exist that can only be detected by sending an overlong string of a specific length, or within a small range of lengths. However, in most cases vulnerabilities can be detected simply by sending a string that is longer than the application is expecting.
Programmers commonly create fixed-size buffers using round numbers in either decimal or hexadecimal, such as 32, 100, 1024, 4096, and so on. A simple approach to detecting any “low-hanging fruit” within the application is to send long strings as each item of target data identified and to monitor the server’s responses for anomalies.
In some instances, your test cases may be blocked by input validation checks implemented either within the application itself or by other components such as the web server. This often occurs when overlong data is submitted within the URL query string, and may be indicated by a generic message such as “URL too long” in response to every test case. In this situation, you should experiment to determine the maximum length of URL permitted (which is often around 2000 characters) and adjust your buffer sizes so that your test cases comply with this requirement. Overflows may still exist behind the generic length filtering, which can be triggered by strings short enough to get past that filtering.