Reiter, Michael K.
2007-04-23
2007-04-23
1993-07
http://techreports.library.cornell.edu:8081/Dienst/UI/1.0/Display/cul.cs/TR93-1367
https://hdl.handle.net/1813/6141

While there is considerable experience with addressing the needs for security and fault-tolerance individually in distributed systems, much less is understood about how to simultaneously address these needs in a single, integrated solution. Indeed, the goals of security and availability have traditionally been viewed as being in conflict, because replicating data and services for availability makes them inherently harder to protect. This thesis presents the design and implementation of a security architecture for fault-tolerant systems, including a set of results that underpin this architecture. We first present a methodology for balancing the aforementioned tradeoff between security and availability in distributed services. Using our techniques, a service can be replicated so that it will remain available and correct despite the corruption of some servers and clients by a malicious intruder. These results include the identification and prevention of a new form of attack in which an intruder effects and exploits violations of causality in the sequence of requests processed by the service. Second, we bring this replication methodology and other novel techniques to bear on an issue for which the conflict between security and availability is particularly troublesome, namely cryptographic key distribution via trusted services. We present authentication and time services that can securely and fault-tolerantly support cryptographic key distribution in a wide range of settings. Third, we present the design and implementation of our security architecture, which employs these services. The architecture supports process groups, a common paradigm of fault-tolerant computing, as its primary security abstraction, and provides tools to construct applications that are resilient to benign failures and malicious attacks.
We discuss the integration of this architecture in the Horus system and focus on techniques to make group communication secure and efficient. In the final contributions of the thesis, we further explore the importance of detecting causal relationships for security. We present a framework for examining attacks on attempts to detect causal relationships. We also present several algorithms to prevent these attacks in some situations.

10588043 bytes
1874921 bytes
application/pdf
application/postscript
en-US
computer science
technical report
A Security Architecture for Fault-Tolerant Systems