    • Fault Tolerance For Main-Memory Applications In The Cloud 

      Cao, Tuan (2013-05-26)
      Advances in hardware have enabled many long-running applications to execute entirely in main memory. With the emergence of cloud computing, thousands of machines could be made available to deploy such applications with ...
    • A Gossip-Style Failure Detection Service 

      Van Renesse, Robbert; Minsky, Yaron; Hayden, Mark (Cornell University, 1998-05)
      Failure Detection is valuable for system management, replication, load balancing, and other distributed services. To date, Failure Detection Services scale badly in the number of members that are being monitored. This paper ...
    • The Horus and Ensemble Projects: Accomplishments and Limitations 

      Birman, Kenneth P.; Constable, Robert; Hayden, Mark; Hickey, Jason; Kreitz, Christoph; Van Renesse, Robbert; Rodeh, Ohad; Vogels, Werner (Cornell University, 1999-10)
      The Horus and Ensemble efforts culminated a multi-year Cornell research program in process group communication used for fault-tolerance, security and adaptation. Our intent was to understand the degree to which a single ...
    • Horus: A Flexible Group Communications System 

      Van Renesse, Robbert; Birman, Kenneth P.; Glade, Bradford B.; Guo, Katie; Hayden, Mark; Hickey, Takako; Malki, Dalia; Vaysburd, Alex; Vogels, Werner (Cornell University, 1995-03)
      The Horus system offers flexible group communication support for distributed applications. It is extensively layered and highly reconfigurable, allowing applications to only pay for services they use, and for groups with ...
    • Incorporating System Resource Information into Flow Control 

      Hickey, Takako M.; Van Renesse, Robbert (Cornell University, 1995-02)
      Upcall-based distributed systems have become widespread in recent years. While upcall-based systems provide some obvious advantages, experiences with these systems have exposed unanticipated problems of unpredictability ...
    • Investigating correct-by-construction attack-tolerant systems 

      Constable, Robert; Bickford, Mark; Van Renesse, Robbert (2011-09-12)
      Attack-tolerant distributed systems change their protocols on-the-fly in response to apparent attacks from the environment; they substitute functionally equivalent versions possibly more resistant to detected threats. ...
    • Nerio: Leader Election and Edict Ordering 

      Van Renesse, Robbert; Schneider, Fred; Gehrke, Johannes (2011-09-26)
      Coordination in a distributed system is facilitated if there is a unique process, the leader, to manage the other processes. The leader creates edicts and sends them to other processes for execution or forwarding to other ...
    • New Applications Of Data Redundancy Schemes In Cloud And Datacenter Systems 

      Abu-Libdeh, Hussam (2015-01-26)
      Data redundancy techniques such as replication and erasure coding have been studied in the context of distributed systems for almost four decades. This thesis discusses new uses of erasure coding and replication in the ...
    • Operating System Support for Mobile Agents 

      Johansen, Dag; Van Renesse, Robbert; Schneider, Fred B. (Cornell University, 1994-12)
      An "agent" is a process that may migrate through a computer network in order to satisfy requests made by its clients. Agents implement a computational metaphor that is analogous to how most people conduct business in their ...
    • Operating Systems Abstractions For Software Packet Processing In Datacenters 

      Marian, Tudor (2011-01-31)
      Over the past decade, the modern datacenter has reshaped the computing landscape by providing a large scale consolidated platform that efficiently powers online services, financial, military, scientific, and other application ...
    • Protocol Composition in Horus 

      Van Renesse, Robbert; Birman, Kenneth P. (Cornell University, 1995-03)
      Horus is a communication architecture that treats a protocol as an abstract data type. Protocol layers can be stacked on top of each other in a variety of ways, at run-time. This paper starts out with describing the many ...
    • Reducing Costs Of Byzantine Fault Tolerant Distributed Applications 

      Ho, Chi (2011-08-31)
      Byzantine fault tolerance (BFT) is a powerful technique for building software that tolerates arbitrary failures. The technique has been developed since the 70s and has a rich research literature. Yet no production system ...
    • A Security Architecture for Fault-Tolerant Systems 

      Reiter, Michael K.; Birman, Kenneth P.; Van Renesse, Robbert (Cornell University, 1993-06)
      Process groups are a common abstraction for fault-tolerant computing in distributed systems. We present a security architecture that extends the process group into a security abstraction. Integral parts of this architecture ...
    • Toward Robust High Performance Distributed Services 

      Song, Yeejiun (2011-08-31)
      This thesis presents steps towards simplifying the implementation of robust high performance distributed services. First, we investigate consensus algorithms in the context of fault tolerant systems. Consensus algorithms, ...
    • Towards A Secure Federated Information System 

      Liu, Mon Jed (2012-08-20)
      We are entering an era in which federated information systems are widely used to share information and computation. Federated systems support new services and capabilities by integrating computer systems across independent ...