Conference Paper

Hacking Blind

Authors:
To read the full-text of this research, you can request a copy directly from the authors.

Abstract

We show that it is possible to write remote stack buffer overflow exploits without possessing a copy of the target binary or source code, against services that restart after a crash. This makes it possible to hack proprietary closed-binary services, or open-source servers manually compiled and installed from source where the binary remains unknown to the attacker. Traditional techniques are usually paired against a particular binary and distribution where the hacker knows the location of useful gadgets for Return Oriented Programming (ROP). Our Blind ROP (BROP) attack instead remotely finds enough ROP gadgets to perform a write system call and transfers the vulnerable binary over the network, after which an exploit can be completed using known techniques. This is accomplished by leaking a single bit of information based on whether a process crashed or not when given a particular input string. BROP requires a stack vulnerability and a service that restarts after a crash. We implemented Braille, a fully automated exploit that yielded a shell in under 4,000 requests (20 minutes) against a contemporary nginx vulnerability, yaSSL + MySQL, and a toy proprietary server written by a colleague. The attack works against modern 64-bit Linux with address space layout randomization (ASLR), no-execute page protection (NX) and stack canaries.

No full-text available

Request Full-text Paper PDF

To read the full-text of this research,
you can request a copy directly from the authors.

... Another possibility is to leak the value of the canary word. In an attack called stack reading [20] or a byte-by-byte [21] attack, applications that are forking child processes are attacked by only overwriting one byte of the canary word. When the process crashes, the guess for the byte was wrong; when it continues, the bytes were guessed correctly and the attack continues with the next byte until the whole canary word is leaked. ...
... The random canary word is, as its name suggests, a random data word. For the random canary word to work, it is important to re-randomize it for every start of a process, or even with a fork of a child process to not be vulnerable against certain attacks such as stack reading [20]. Since embedded devices typically do not operate with strict processes that are created and loaded, re-randomization of the canary may take more effort and cannot fully be covered by the compiler alone. ...
Article
Full-text available
With the increasing popularity of IoT (Internet-of-Things) devices, their security becomes an increasingly important issue. Buffer overflow vulnerabilities have been known for decades, but are still relevant, especially for embedded devices where certain security measures cannot be implemented due to hardware restrictions or simply due to their impact on performance. Therefore, many buffer overflow detection mechanisms check for overflows only before critical data are used. All data that an attacker could use for his own purposes can be considered critical. It is, therefore, essential that all critical data are checked between writing a buffer and its usage. This paper presents a vulnerability of the ESP32 microcontroller, used in millions of IoT devices, that is based on a pointer that is not protected by classic buffer overflow detection mechanisms such as Stack Canaries or Shadow Stacks. This paper discusses the implications of vulnerability and presents mitigation techniques, including a patch, that fixes the vulnerability. The overhead of the patch is evaluated using simulation as well as an ESP32-WROVER-E development board. We showed that, in the simulation with 32 general-purpose registers, the overhead for the CoreMark benchmark ranges between 0.1% and 0.4%. On the ESP32, which uses an Xtensa LX6 core with 64 general-purpose registers, the overhead went down to below 0.01%. A worst-case scenario, modeled by a synthetic benchmark, showed overheads up to 9.68%.
Article
Address space layout randomization (ASLR) can hide code addresses, which has been widely adopted by security solutions. However, code probes can bypass it. In real attack scenarios, a single code probe can only obtain very limited code information instead of the information of the entire code segment. So, randomizing the entire code segment is unnecessary. How to minimize the size of the randomized object is a key to reducing the complexity and overhead for ASLR methods. Moreover, ASLR needs to be completed between the time after code probe occurs and before the probed code is used by attackers, otherwise it is meaningless. How to select an appropriate randomization time point is a basic condition for achieving effective address hiding. In this paper, we propose a runtime partial randomization method RandFun. It only randomizes the probed function with parallel threads. And the randomization is performed when and only when potential code probes are detected. In addition, RandFun can protect the probed code from being used as gadgets, whether during or after randomization. Experiments and analysis show RandFun has a good defense effect on code probes and only introduces 1.6% overhead to CPU.
Conference Paper
Full-text available
Developing expertise in vulnerability research is critical to closing the cybersecurity workforce shortage. However, very few institutions have adopted vulnerability research into their cybersecurity curriculum, and fewer have examined how to teach this skill to students. The recent emergence of lightweight, container-based vir-tualization presents a unique opportunity to address this challenge by offering reproducible environments that ease course facilitation. This paper presents an undergraduate vulnerability course design. Our approach leverages a hands-on methodology that challenges students to develop complex binary exploits over our lectures, labs, and exams. We share our detailed design, labs, experiences, lessons learned, and a lightweight virtual environment for this course for others to build on our initial success. CCS CONCEPTS • Social and professional topics → Model curricula; Computing education programs.
Article
Full-text available
Represented by reactive security defense mechanisms, cyber defense possesses a static, reactive, and deterministic nature, with overwhelmingly high costs to defend against ever-changing attackers. To change this situation, researchers have proposed moving target defense (MTD), which introduces the concept of an attack surface to define cyber defense in a brand-new manner, aiming to provide a dynamic, continuous, and proactive defense mechanism. With the increasing use of machine learning in networking, researchers have discovered that MTD techniques based on machine learning can provide omni-bearing defense capabilities and reduce defense costs at multiple levels. However, research in this area remains incomplete and fragmented, and significant progress is yet to be made in constructing a defense mechanism that is both robust and available. Therefore, we conducted a comprehensive survey on MTD research, summarizing the background, design mechanisms, and shortcomings of MTD, as well as relevant features of intelligent MTD that are designed to overcome these limitations. We aim to provide researchers seeking the future development of MTD with insight into building an intelligently affordable, optimized, and self-adaptive defense mechanism.
Article
Full-text available
Control-flow integrity (CFI) is considered a principled mitigation against control-flow hijacking even under the most powerful attacker who can arbitrarily write and read memory. However, existing schemes still demonstrated limitations in either guaranteeing high security level or achieving low performance and memory overhead. These limitations have restricted the application of CFI in real software. To improve its applicability similar to mandatory protection schemes such as DEP and ASLR, it is essential to improve both high security guarantee and low overhead. In this paper, we propose “BGCFI”, which is a fine-grained CFI based on a Bipartite Graph. The relationship between an indirect branch and a valid target address at the branch is represented by an edge in the bipartite graph. The verification of the indirect branch is achieved by checking the existence of the corresponding edge in the bipartite graph. The verification method for fine-grained CFI results in more efficiency on both computational and memory overhead, while completely preserving high security guarantee.We demonstrate our results through the implementation of a proof-of-concept module and evaluation on the SPEC CPU 2017 suite and the Firefox browser.
Article
Address space layout randomization (ASLR) has been widely deployed in operating systems (OS) to hide memory layout, which mitigates code reuse attacks (CRAs). Unfortunately, the memory probing techniques can still provide attackers with enough information to bypass ASLR. Although the control flow integrity (CFI) methods are not affected by code probing, they face the precision problem of control flow graphs (CFG). To make matters worse, most methods rely on the source code of the targets to be protected, which leads to their restrictions on the protection of the objects without source code. To mitigate CRAs, a security method that can defend memory probing, does not require a high-precision CFG, and does not rely on the source code is needed. The MagBox proposed in this paper is such a method. It identifies the risk functions that can provide gadgets for CRAs by detecting and analyzing attackers’ code probing activities. If the function is probed, it will be moved to a new address space. After that, the control flow transfers of the function will be tracked and analyzed in real time to judge their legitimacy. Experiment results and analysis show that MagBox can mitigate CRAs, and only introduces 3.4% performance overhead to the CPU.
Article
Various fine-grained randomization schemes have been designed to increase the entropy of process space, while none of them can rise from an academic exercise to industrial deployment like Address Space Layout Randomization (ASLR). One of the critical reasons is the incorrectness of randomization caused by the mismatch between their pointer collection capabilities and the high accuracy requirements of the pointer patching task. In this article, we present PointerScope, an accurate compile-time pointer collection scheme deriving from a group of novel observations. The success of PointerScope relies on the complete tracing of the pointer generation process, including the compilation chain from compiler to static linker and the interface specification between them. From this view, PointerScope identifies four types of pointer-related static linker behaviors and clarifies five types of inherent addressing modes in the x86-64 architecture. The vague understanding of them causes the Compiler-assisted Code Randomization (CCR) to incorrectly collect pointers and patch them to the wrong values after randomization. Further, we measure the pointer collection capability of augmented binary analysis, the experimental results show that they can mitigate challenges from the traditional binary analysis by the given premises, but additional heuristics still need to be designed to support the fine-grained randomization.
Article
Control-Flow Integrity (CFI) techniques focus often on protecting forward edges and assume that backward edges are protected by shadow stacks. However, software-based shadow stacks that can provide performance, security and compatibility are still hard to obtain, leaving an important security gap on x86-64. In this paper, we introduce a simple, efficient and effective parallel shadow stack design (based on LLVM), FlashStack , for protecting return addresses in single- and multi-threaded programs running under 64-bit Linux on x86-64, with three distinctive features. First, we introduce a novel dual-prologue approach to enable a protected function to thwart the TOCTTOU attacks, which are constructed by Microsoft’s red team and lead to the deprecation of Microsoft’s RFG. Second, we design a new mapping mechanism, Segment+Rsp-S , to allow the parallel shadow stack to be accessed efficiently while satisfying the constraints of arch_prctl() and ASLR in 64-bit Linux. Finally, we introduce a lightweight inspection mechanism, SideChannel-K , to harden FlashStack further by detecting entropy-reduction attacks efficiently and protecting the parallel shadow stack effectively with a 10-ms shuffling policy. Our evaluation on SPEC CPU2006 , Nginx and Firefox shows that FlashStack can provide high performance, meaningful security, and reasonable compatibility for server- and client-side programs on x86-64.
Article
Full-text available
Intel SGX (Software Guard Extensions) is a hardware-based security solution that provides a trusted computing environment. SGX creates an isolated memory area called enclave and prevents any illegal access from outside of the enclave. SGX only allows executables already linked statically to the enclave when compiling executables to access its memory, so code injection attacks to SGX are not effective. However, as a previous study has demonstrated, Return-Oriented Programming (ROP) attacks can overcome this defense mechanism by injecting a series of addresses of executable codes inside the enclave. In this study, we propose a novel ROP attack, called SGXDump, which can repeat the attack payload. SGXDump consists only of gadgets in the enclave and unlike previous ROP attacks, the SGXDump attack can repeat the attack payload, communicate with other channels, and implement conditional statements. We successfully attacked two well-known SGX projects, mbedTLS-SGX and Graphene-SGX. Based on our attack experiences, it seems highly probable that an SGXDump attack can leak the entire enclave memory if there is an exploitable memory corruption vulnerability in the target SGX application.
Article
Protecting code pointers (e.g., return address, function pointer) from leakage is desirable from a security perspective. Isolation mechanisms have been the favoured candidate to protect code pointers. However, these mechanisms result in significant performance overhead as they need to instrument extra instructions for frequent permission switching or bound checking. In this paper, we propose CPP, a novel Code Pointer-only Memory Page Management to restrict attack-critical operations for code pointers by hardware. Our hardware-software co-design allows CPP mark code pointers at page granularity that requires minor hardware modification. CPP checks the legality of their operations in parallel with instruction execution. We implement a prototype system and our evaluation shows CPP can effectively mitigate the code pointer leakage attacks with less than 2.1% performance overhead.
Article
Full-text available
Code-reuse attacks pose a threat to embedded devices since they are able to defeat common security defences such as non-executable stacks. To succeed in his code-reuse attack, the attacker has to gain knowledge of some or all of the instructions of the target firmware/software. In case of a bare metal firmware that is protected from being dumped out of a device, it is hard to know the running instructions of the target firmware. This consequently makes code-reuse attacks more difficult to achieve. This paper presents a novel approach how an attacker can gain knowledge of some of these instructions by sniffing unencrypted incremental updates. These updates exist to reduce the radio reception power for resource-constrained devices. It will be demonstrated how a return-oriented programming (ROP) attack can be accomplished on a MSP430 MCU using only the passively sniffed incremental updates. The generated updates of the R3diff and Delta Generator (DG) differencing algorithms will be under assessment. The evaluation reveals that both of them can be exploited by the attacker and how an attacker can maximize his information gain when dealing with more than one update. It also shows that the DG generated updates leak more information than the R3diff generated updates. This stresses the fact that even delta updates need to be protected with encryption. To defend against this attack, different countermeasures that consider different power consumption scenarios are proposed, but yet to be evaluated.
Article
Code tampering is serious issue in Internet of Things(IOT). IoT devices are used to collect environment data like temperature value, light intensity, hart pulse etc. Once an IoT device deployed it will left untouched forever. In that situation an adversary can tamper running code through malicious software. This paper discusses a novel way to protect a binary program running inside an IoT device. This paper uses Return Oriented Programming (ROP), which is mainly used by an adversary to hijack a software program. We use ROP to prevent binary from tampering. In this method, we combine code and stack segment to execute the code. The main program call ROP enabled chain which will fetch the address of executable binary function. if any adversary try to tamper running program then they also tamper ROP chain. So, running binary will stop and send an alert to admin. we also verify our approach with RIPE benchmark and our binary is tamper proof against code tampering attach and our performance overhead is up to 5%.
Article
Affected by vulnerabilities, the control data on the stack is easily destroyed, which provides the most convenient conditions for code reuse attacks (CRAs). The operating system (OS) does not impose strict restrictions on the control flow paths. It allows instructions to jump to any location in the same address space. The OS will prevent code execution if and only if an execution error occurs. However, attackers can use stack overflow to accurately tamper with the control data on the stack and avoid execution errors. Although canary technology has been widely adopted, it turns out that this method can be bypassed. The traditional shadow stack technology can only protect the backward control flow and is invalid for the forward control flow. In contrast, the defense effect of the control flow integrity methods is better. Unfortunately, they either cannot get rid of the source code dependence on the protected objects, or cannot provide high-precision instruction boundaries. All these problems make it difficult to eliminate the CRAs based on stack overflow. Faced with these problems, this paper proposes a new security method KPointer. It filters the vulnerable data by tracking the overwriting operation to the stack data. Next, these data will be tracked to locate the jump instructions related to them. Finally, we use new security strategies to determine whether the current instruction is illegal. Experiments and analysis show that KPointer has a good protection effect on the CRAs based on stack overflow. It does not depend on the source code of the protected objects and only introduces 2.7% performance overhead to the CPU.
Article
Full-text available
The microservice architecture has many advantages, such as technology heterogeneity, isolation, scalability, simple deployment, and convenient optimization. These advantages are important as we can use diversity and redundancy to improve the resilience of software system. It is necessary to study the method of improving the resilience of software system by diversity implementation and redundant deployment of software core components based on microservice framework. How to optimize the diversity deployment of microservices is a key problem to maximize system resilience and make full use of resources. To solve this problem, an efficient microservice diversity deployment mechanism is proposed in this paper. Firstly, we creatively defined a load balancing indicator and a diversity indicator. Based on this, a diversified microservices deployment model is established to maximize the resilience and the resource utilization of the system. Secondly, combined with load balancing, a microservice deployment algorithm based on load balance and diversity is proposed, which reduces the service’s dependence on the underlying mirror by enriching diversity and avoids the overreliance of microservices on a single node through decentralized deployment, while taking into account load balancing. Finally, we conduct experiments to evaluate the performance of the proposed algorithm. The results demonstrate that the proposed algorithm outperforms other algorithms.
Article
Shadow stacks play an important role in protecting return addresses to mitigate ROP attacks. Parallel shadow stacks, which shadow the call stack of each thread at the same constant offset for all threads, are known not to support multi-threading well. On the other hand, compact shadow stacks must maintain a separate shadow stack pointer in thread-local storage (TLS) , which can be implemented in terms of a register or the per-thread Thread-Control-Block (TCB) , suffering from poor compatibility in the former or high performance overhead in the latter. In addition, shadow stacks are vulnerable to information disclosure attacks. In this paper, we propose to mitigate ROP attacks for single- and multi-threaded server programs running on general-purpose computing systems by using a novel stack layout, called a buddy stack (referred to as Bustk ), that is highly performant, compatible with existing code, and provides meaningful security. These goals are met due to three novel design aspects in Bustk . First, Bustk places a parallel shadow stack just below a thread’s call stack (as each other’s buddies allocated together), avoiding the need to maintain a separate shadow stack pointer and making it now well-suited for multi-threading. Second, Bustk uses an efficient stack-based thread-local storage mechanism, denoted STK-TLS , to store thread-specific metadata in two TLS sections just below the shadow stack in dual redundancy (as each other’s buddies), so that both can be accessed and updated in a lightweight manner from the call stack pointer rsp alone. Finally, Bustk re-randomizes continuously (on the order of milliseconds) the return addresses on the shadow stack by using a new microsecond-level runtime re-randomization technique, denoted STK-MSR . This mechanism aims to obsolete leaked information, making it extremely unlikely for the attacker to hijack return addresses, particularly against a server program that sits often tens of milliseconds away from the attacker. Our evaluation using web servers, Nginx and Apache Httpd , shows that Bustk works well in terms of performance, compatibility, and security provided, with its parallel shadow stacks incurring acceptable memory overhead for real-world applications and its STK-TLS mechanism costing only two pages per thread. In particular, Bustk can protect the Nginx and Apache servers with an adaptive 1-ms re-randomization policy (without observable overheads when IO is intensive, with about 17,000 requests per second). In addition, we have also evaluated Bustk using other non-server applications, Firefox , Python , LLVM , JDK and SPEC CPU2006 , to demonstrate further the same degree of performance and compatibility provided, but the protection provided for, say, browsers, is weaker (since network-access delays can no longer be assumed).
Conference Paper
Full-text available
Nowadays, exploits often rely on a code-reuse approach. Short pieces of code called gadgets are chained together to execute some payload. Code-reuse attacks can exploit vulnerabilities in the presence of operating system protection that prohibits data memory execution. The ROP chain construction task is the code generation for the virtual machine defined by an exploited executable. It is crucial to understand how powerful ROP attacks can be. Such knowledge can be used to improve software security. We implement MAJORCA that generates ROP and JOP payloads in an architecture agnostic manner and thoroughly consider restricted symbols such as null bytes that terminate data copying via strcpy. The paper covers the whole code-reuse payloads construction pipeline: cataloging gadgets, chaining them in DAG, scheduling, linearizing to the ready-to-run payload. MAJORCA automatically generates both ROP and JOP payloads for x86 and MIPS. MAJORCA constructs payloads respecting restricted symbols both in gadget addresses and data. We evaluate MAJORCA performance and accuracy with rop-benchmark and compare it with open-source compilers. We show that MAJORCA outperforms open-source tools. We propose a ROP chaining metric and use it to estimate the probabilities of successful ROP chaining for different operating systems with MAJORCA as well as other ROP compilers to show that ROP chaining is still feasible. This metric can estimate the efficiency of OS defences.
Conference Paper
Return-oriented programming (ROP) relies on in-memory code sequences ending in return instructions to chain together arbitrary malware. ROP is one of the most dangerous security exploits because, if wittingly crafted, it can be used to wreak havoc on the system, network, and nodes connected to it. It is not surprising that ROP has been studied on architectures such as x86 and ARM, mostly with an operating system (OS). Xtensa is one of the most popular industry standards for digital signal processors and it is present in many resource-constrained firmware-based embedded WiFi home automation devices, which operate by reading instructions directly from flash memory. Despite leveraging no real OS, Xtensa is not immune to ROP, and there have been reports of buffer overflow vulnerability exploitations leading to ROP in Xtensa. Therefore, we present the first detection of ROP, and its variant Jump-oriented programming (JOP), in a firmware-only environment using hardware performance counters (HPCs). Our approach discerns the variations in the HPC micro-architectural events triggered by ROP attacks and benign program execution. We implemented attack scenarios using instrumented programs and exploits that perform tasks similar to those in a known microprocessor benchmark programs. We recorded micro-architectural events to train a machine learning binary classifier. The learned model identifies relevant HPCs, which could serve as predictors of ROP/JOP execution even in embedded firmware-only configurations, where features atypical to conventional processors, like instruction memory and data memory, are available. Our evaluation results indicate a high precision, recall, and accuracy of the classifier predictions.
Article
Full-text available
Currently, security-critical server programs are well protected by various defense techniques, such as Address Space Layout Randomization(ASLR), eXecute Only Memory(XOM), and Data Execution Prevention(DEP), against modern code-reuse attacks like Return-oriented Programming(ROP) attacks. Moreover, in these victim programs, most syscall instructions lack the following ret instructions, which prevents attacks to stitch multiple system calls to implement advanced behaviors like launching a remote shell. Lacking this kind of gadget greatly constrains the capability of code-reuse attacks.This paper proposes a novel code-reuse attack method called Signal Enhanced Blind Return Oriented Programming (SeBROP) to address these challenges. Our SeBROP can initiate a successful exploit to server-side programs using only a stack overflow vulnerability. By leveraging a side-channel that exists in the victim program, we show how to find a variety of gadgets blindly without any pre-knowledges or reading/disassembling the code segment. Then, we propose a technique that exploits the current vulnerable signal checking mechanism to realize the execution flow control even when ret instructions are absent. Our technique can stitch a number of system calls without returns, which is more superior to conventional ROP attacks. Finally, the SeBROP attack precisely identifies many useful gadgets to constitute a Turing-complete set. SeBROP attack can defeat almost all state-of-the-art defense techniques. The SeBROP attack is compatible with both modern 64-bit and 32-bit systems.To validate its effectiveness, We craft three exploits of the SeBROP attack for three real-world applications, i.e., 32-bit Apache 1.3.49, 32-bit ProFTPD 1.3.0, and 64-bit Nginx 1.4.0. Experimental results demonstrate that the SeBROP attack can successfully spawn a remote shell on Nginx, ProFTPD, and Apache with less than 8500/4300/2100 requests, respectively.
Article
Performing mobile phone acquisition today requires breaking—often hardware assisted—security. In recent years, Embedded Secure Element (eSE) hardware has been introduced in mobile phones, with a view towards increasing the security of critical system features and encrypted user data. The idea being that the eSE should remain secure even if the rest of the system is compromised. The eSE is set to become crucial to modern mobile phone security, challenging Digital Forensics. The eSE is designed to withstand both logical and physical attacks, including side channel attacks, and to keep the attack surface towards the rest of the system/phone small, and complexity low to minimise the risk of implementation errors. In this paper we adapt current state-of-the-art attacks to the eSE platform and present an attack on an eSE by Samsung, recently introduced in their premium mobile phones. We show how, with limited resources, our approach discovered a vulnerability that could be exploited, leading to a complete compromise of all the eSE security goals and a full loss of future eSE trust, as mitigation of our attack in already fielded devices is challenging. This eSE is Common Criteria EAL 5+ certified and our attack exposes the gap between intended and achieved security, undermining the implied trust in such certifications. We explain the eSE security design, the details of our attack, and discuss how a single vulnerability can have such devastating security results. The ultimate result of our research facilitates acquisition of affected devices, demonstrating use of offensive methods in advanced Digital Forensic Acquisition.
Conference Paper
Full-text available
Unlike library code, whose instruction addresses can be randomized by address space layout randomization (ASLR), application binary code often has static instruction addresses. Attackers can exploit this limitation to craft robust shell codes for such applications, as demonstrated by a recent attack that reuses instruction gadgets from the static binary code of victim applications. This paper introduces binary stirring, a new technique that imbues x86 native code with the ability to self-randomize its instruction addresses each time it is launched. The input to STIR is only the application binary code without any source code, debug symbols, or relocation information. The output is a new binary whose basic block addresses are dynamically determined at load-time. Therefore, even if an attacker can find code gadgets in one instance of the binary, the instruction addresses in other instances are unpredictable. An array of binary transformation techniques enable STIR to transparently protect large, realistic applications that cannot be perfectly disassembled due to computed jumps, code-data interleaving, OS callbacks, dynamic linking and a variety of other difficult binary features. Evaluation of STIR for both Windows and Linux platforms shows that stirring introduces about 1.6% overhead on average to application runtimes.
Conference Paper
Full-text available
In recent years, the deployment of many application-level countermeasures against memory errors and the increasing number of vulnerabilities discovered in the kernel has fostered a renewed interest in kernel-level exploitation. Unfortunately, no comprehensive and well-established mechanism exists to protect the operating system from arbitrary attacks, due to the relatively new development of the area and the challenges involved. In this paper, we propose the first design for fine-grained address space randomization (ASR) inside the operating system (OS), providing an efficient and comprehensive countermeasure against classic and emerging attacks, such as return-oriented programming. To motivate our design, we investigate the differences with application-level ASR and find that some of the well-established assumptions in existing solutions are no longer valid inside the OS; above all, perhaps, that information leakage becomes a major concern in the new context. We show that our ASR strategy outperforms state-of-the-art solutions in terms of both performance and security without affecting the software distribution model. Finally, we present the first comprehensive live rerandomization strategy, which we found to be particularly important inside the OS. Experimental results demonstrate that our techniques yield low run-time performance overhead (less than 5% on average on both SPEC and syscall-intensive benchmarks) and limited run-time memory footprint increase (around 15% during the execution of our benchmarks). We believe our techniques can greatly enhance the level of OS security without compromising the performance and reliability of the OS.
Article
Full-text available
Through randomization of the memory space and the confinement of code to non-data pages, computer security researchers have made a wide range of attacks against program binaries more difficult. However, attacks have evolved to exploit weaknesses in these defenses. To thwart these attacks, we introduce a novel technique called Instruction Location Randomization (ILR). Conceptually, ILR randomizes the location of every instruction in a program, thwarting an attacker's ability to re-use program functionality (e.g., arc-injection attacks and return-oriented programming attacks). ILR operates on arbitrary executable programs, requires no compiler support, and requires no user interaction. Thus, it can be automatically applied post-deployment, allowing easy and frequent re-randomization. Our preliminary prototype, working on 32-bit x86 Linux ELF binaries, provides a high degree of entropy. Individual instructions are randomly placed within a 31-bit address space. Thus, attacks that rely on a priori knowledge of the location of code or derandomization are not feasible. We demonstrated ILR's defensive capabilities by defeating attacks against programs with vulnerabilities, including Adobe's PDF viewer, acroread, which had an in-the-wild vulnerability. Additionally, using an industry-standard CPU performance benchmark suite, we compared the run time of prototype ILR-protected executables to that of native executables. The average run-time overhead of ILR was 13% with more than half the programs having effectively no overhead (15 out of 29), indicating that ILR is a realistic and cost-effective mitigation technique.
Article
Full-text available
The wide adoption of non-executable page protections in recent versions of popular operating systems has given rise to attacks that employ return-oriented programming (ROP) to achieve arbitrary code execution without the injection of any code. Existing defenses against ROP exploits either require source code or symbolic debugging information, or impose a significant runtime overhead, which limits their applicability for the protection of third-party applications. In this paper we present in-place code randomization, a practical mitigation technique against ROP attacks that can be applied directly on third-party software. Our method uses various narrow-scope code transformations that can be applied statically, without changing the location of basic blocks, allowing the safe randomization of stripped binaries even with partial disassembly coverage. These transformations effectively eliminate about 10%, and probabilistically break about 80% of the useful instruction sequences found in a large set of PE files. Since no additional code is inserted, in-place code randomization does not incur any measurable runtime overhead, enabling it to be easily used in tandem with existing exploit mitigations such as address space layout randomization. Our evaluation using publicly available ROP exploits and two ROP code generation toolkits demonstrates that our technique prevents the exploitation of the tested vulnerable Windows 7 applications, including Adobe Reader, as well as the automated construction of alternative ROP payloads that aim to circumvent in-place code randomization using solely any remaining unaffected instruction sequences.
Article
Full-text available
This paper presents a systematic solution to the persistent problem of buffer overflow attacks. Buffer overflow attacks gained notoriety in 1988 as part of the Morris Worm incident on the Internet. While it is fairly simple to fix individual buffer overflow vulnerabilities, buffer overflow attacks continue to this day. Hundreds of attacks have been discovered, and while most of the obvious vulnerabilities have now been patched, more sophisticated buffer overflow attacks continue to emerge. We describe StackGuard: a simple compiler technique that virtually eliminates buffer overflow vulnerabilities with only modest performance penalties. Privileged programs that are recompiled with the StackGuard compiler extension no longer yield control to the attacker, but rather enter a fail-safe state. These programs require no source code changes at all, and are binary-compatible with existing operating systems and libraries. We describe the compiler technique (a simple patch to gcc), as well as a set of variations on the technique that trade-off between penetration resistance and performance. We present experimental results of both the penetration resistance and the performance impact of this technique.
Conference Paper
Full-text available
To strengthen systems against code injection attacks, the write or execute only policy (W¿X) and address space layout randomization (ASLR) are typically used in combination. The former separates data and code, while the latter randomizes the layout of a process. In this paper we present a new attack to bypass W¿X and ASLR. The state-of-the-art attack against this combination of protections is based on brute-force, while ours is based on the leakage of sensitive information about the memory layout of the process. Using our attack an attacker can exploit the majority of programs vulnerable to stack-based buffer overflows surgically, i.e., in a single attempt. We have estimated that our attack is feasible on 95.6% and 61.8% executables (of medium size) for Intel x86 and x86-64 architectures, respectively. We also analyze the effectiveness of other existing protections at preventing our attack. We conclude that position independent executables (PIE) are essential to complement ASLR and to prevent our attack. However, PIE requires recompilation, it is often not adopted even when supported, and it is not available on all ASLR-capable operating systems. To overcome these limitations, we propose a new protection that is as effective as PIE, does not require recompilation, and introduces only a minimal overhead.
Conference Paper
Full-text available
Despite the numerous prevention and protection mechanisms that have been introduced into modern operating systems, the exploitation of memory corruption vulnerabilities still represents a serious threat to the security of software systems and networks. A recent exploitation technique, called Return-Oriented Programming (ROP), has lately attracted a considerable attention from academia. Past research on the topic has mostly focused on refining the original attack technique, or on proposing partial solutions that target only particular variants of the attack. In this paper, we present G-Free, a compiler-based approach that represents the first practical solution against any possible form of ROP. Our solution is able to eliminate all unaligned free-branch instructions inside a binary executable, and to protect the aligned free-branch instructions to prevent them from being misused by an attacker. We developed a prototype based on our approach, and evaluated it by compiling GNU libc and a number of real-world applications. The results of the experiments show that our solution is able to prevent any form of return-oriented programming.
Conference Paper
Return-oriented programming (ROP) has become the primary exploitation technique for system compromise in the presence of non-executable page protections. ROP exploits are facilitated mainly by the lack of complete address space randomization coverage or the presence of memory disclosure vulnerabilities, necessitating additional ROP-specific mitigations. In this paper we present a practical runtime ROP exploit prevention technique for the protection of third-party applications. Our approach is based on the detection of abnormal control transfers that take place during ROP code execution. This is achieved using hardware features of commodity processors, which incur negligible runtime overhead and allow for completely transparent operation without requiring any modifications to the protected applications. Our implementation for Windows 7, named kBouncer, can be selectively enabled for installed programs in the same fashion as user-friendly mitigation toolkits like Microsoft's EMET. The results of our evaluation demonstrate that kBouncer has low runtime overhead of up to 4%, when stressed with specially crafted workloads that continuously trigger its core detection component, while it has negligible overhead for actual user applications. In our experiments with in-the-wild ROP exploits, kBouncer successfully protected all tested applications, including Internet Explorer, Adobe Flash Player, and Adobe Reader.
Article
We introduce return-oriented programming, a technique by which an attacker can induce arbitrary behavior in a program whose control flow he has diverted, without injecting any code. A return-oriented program chains together short instruction sequences already present in a program’s address space, each of which ends in a “return” instruction. Return-oriented programming defeats the W⊕X protections recently deployed by Microsoft, Intel, and AMD; in this context, it can be seen as a generalization of traditional return-into-libc attacks. But the threat is more general. Return-oriented programming is readily exploitable on multiple architectures and systems. It also bypasses an entire category of security measures---those that seek to prevent malicious computation by preventing the execution of malicious code. To demonstrate the wide applicability of return-oriented programming, we construct a Turing-complete set of building blocks called gadgets using the standard C libraries of two very different architectures: Linux/x86 and Solaris/SPARC. To demonstrate the power of return-oriented programming, we present a high-level, general-purpose language for describing return-oriented exploits and a compiler that translates it to gadgets.
Article
Instruction Set Randomization (ISR) has been proposed as a promising defense against code injection attacks. It defuses all standard code injection attacks since the attacker does not know the instruction set of the target machine. A motivated attacker, however, may be able to circumvent ISR by determining the randomization key. In this paper, we investigate the possibility of a remote attacker successfully ascertaining an ISR key using an incremental attack. We introduce a strategy for attacking ISR-protected servers, develop and analyze two attack variations, and present a technique for packaging a worm with a miniature virtual machine that reduces the number of key bytes an attacker must acquire to 100. Our attacks can break enough key bytes to infect an ISR-protected server in about six minutes. Our results provide insights into properties necessary for ISR implementations to be secure.
Article
Attacks which exploit memory programming errors (such as buffer overflows) are one of today's most seri-ous security threats. These attacks require an attacker to have an in-depth understanding of the internal details of a victim program, including the locations of critical data and/or code. Program obfuscation is a general technique for securing programs by making it difficult for attackers to acquire such a detailed understanding. This paper de-velops a systematic study of a particular kind of obfusca-tion called address obfuscation that randomizes the loca-tion of victim program data and code. We discuss differ-ent implementation strategies to randomize the absolute locations of data and code, as well as relative distances between data locations. We then present our implemen-tation that transforms object files and executables at link-time and load-time. It requires no changes to the OS ker-nel or compilers, and can be applied to individual appli-cations without affecting the rest of the system. It can be implemented with low runtime overheads. Address ob-fuscation can reduce the probability of successful attacks to be as low as a small fraction of a percent for most memory-error related attacks. Moreover, the random-ization ensures that an attack that succeeds against one victim will likely not succeed against another victim, or even for a second time against the same victim. Each failed attempt will typically crash the victim program, thereby making it easy to detect attack attempts. These aspects make it particularly effective against large-scale attacks such as Code Red, since each infection attempt requires significantly more resources, thereby slowing down the propagation rate of such attacks.
Article
This paper presents a non-invasive, half-blind firmware extraction technique that subverts a mask ROM boot-loader in order to recover the firmware of a microcon-troller. A stack based buffer overflow is used to forcibly enter the bootloader. Further, a practical method of blind return-oriented programming is presented in which a gadget's entry point is brute forced, being unknown a priori. In this paper we show that when a software vul-nerability has been found (e.g. by fuzzing), the attacker can locate by brute force the sequences of instructions, gadgets, required for bypassing the protections present in a bootloader. In a half-blind attack, the presence of a bootloader in mask ROM helps the attacker in that, while he must still discover blindly a vulnerability in unknown firmware and the appropriate gadgets, he knows the exact contents of the bootloader.
Conference Paper
Address-space randomization is a technique used to fortify systems against buffer overflow attacks. The idea is to introduce artificial diversity by randomizing the memory location of certain system components. This mechanism is available for both Linux (via PaX ASLR) and OpenBSD. We study the effectiveness of address-space randomization and find that its utility on 32-bit architectures is limited by the number of bits available for address randomization. In particular, we demonstrate a derandomization attack that will convert any standard buffer-overflow exploit into an exploit that works against systems protected by address-space randomization. The resulting exploit is as effective as the original exploit, although it takes a little longer to compromise a target machine: on average 216 seconds to compromise Apache running on a Linux PaX ASLR system. The attack does not require running code on the stack. We also explore various ways of strengthening address-space randomization and point out weaknesses in each. Surprisingly, increasing the frequency of re-randomizations adds at most 1 bit of security. Furthermore, compile-time randomization appears to be more effective than runtime randomization. We conclude that, on 32-bit architectures, the only benefit of PaX-like address-space randomization is a small slowdown in worm propagation speed. The cost of randomization is extra complexity in system support.
Conference Paper
Current software attacks often build on exploits that subvert machine-code execution. The enforcement of a basic safety property, Control-Flow Integrity (CFI), can prevent such attacks from arbitrarily controlling program behavior. CFI enforcement is simple, and its guarantees can be established formally even with respect to powerful adversaries. Moreover, CFI enforcement is practical: it is compatible with existing software and can be done efficiently using software rewriting in commodity systems. Finally, CFI provides a useful foundation for enforcing further security policies, as we demonstrate with efficient software implementations of a protected shadow call stack and of access control for memory regions.
Conference Paper
Static analysis of programs in weakly typed languages such as C and C++ is generally not sound because of possible memory errors due to dangling pointer references, uninitialized pointers, and array bounds overflow. We describe a compilation strategy for standard C programs that guarantees that aggressive interprocedural pointer analysis (or less precise ones), a call graph, and type information for a subset of memory, are never invalidated by any possible mem- ory errors. We formalize our approach as a new type system with the necessary run-time checks in operational semantics and prove the correctness of our approach for a subset of C. Our semantics provide the foundation for other sophisticated static analyses to be applied to C programs with a guarantee of soundness. Our work builds on a previously published transformation called Automatic Pool Allocation to ensure that hard-to-detect memory errors (dan- gling pointer references and certain array bounds errors) cannot in- validate the call graph, points-to information or type information. The key insight behind our approach is that pool allocation can be used to create a run-time partitioning of memory that matches the compile-time memory partitioning in a points-to graph, and effi- cient checks can be used to isolate the run-time partitions. Further- more, we show that the sound analysis information enables static checking techniques that eliminate many run-time checks. Our ap- proach requires no source code changes, allows memory to be man- aged explicitly, and does not use meta-data on pointers or individual tag bits for memory. Using several benchmarks and system codes, we show experimentally that the run-time overheads are low (less than 10% in nearly all cases and 30% in the worst case we have seen). We also show the effectiveness of static analyses in eliminat- ing run-time checks.
Vudo malloc tricks by maxx
  • M Kaempf
Getting around non-executable stack (and fix). [Online] Available: http://seclists.org/bugtraq
  • S Designer
Pax address space layout randomization (aslr)
  • P Team
Scraps of notes on remote stack overflow exploitation
  • A Zabrocki
nginx 1.3.9/1.4.0 x86 brute force remote exploit
  • Kingcope
Mwr labs pwn2own 2013 write-up - webkit exploit. [Online] Available: https://labs.mwrinfosecurity.com/blog
  • M Labes
Deter exploit bruteforcing Available: http://en.wikibooks.org/wiki/Grsecurity/Appendix/Grsecurity and PaX Configuration Options#Deter exploit bruteforcing
  • Grsecurity
Introduction to intel memory protection extensions Available: http://software.intel.com/en-us/articles/introduction-to-intel-memory-protection-extensions
  • Intel
Advances in format string exploitation Available: http://www.phrack.org/archives
  • Riq Gera
About a generic way to exploit linux targets
  • Kingcope
Addresssanitizer - clang 3.4 documentation
  • T C Team
Smashing the stack for fun and profit
  • one