A New Platform for OS Research and Dependable Systems
The Need for Singularity
Software runs on a platform that has evolved over the past 40 years and is increasingly showing its age. This platform is the vast collection of code (operating systems, programming languages, compilers, libraries, run-time systems, middleware, etc.) and hardware that enables a program to execute. On one hand, this platform is an enormous success in both financial and practical terms. The platform forms the foundation of the $179 billion packaged software industry and has enabled revolutionary innovations such as the Internet. On the other hand, the platform and software running on it are less robust, reliable, and secure than most users (and developers!) would wish.
Part of the problem is that our current platform has not evolved far beyond the computer architectures, operating systems, and programming languages of the 1960s and 1970s. The computing environment of that period was very different from today's milieu. Computers were extremely limited in speed and memory capacity, were used only by a small group of technically literate and non-malicious users, and were rarely networked or connected to physical devices. None of these characteristics remains true, but modern computer architectures, operating systems, and programming languages have not evolved to accommodate a fundamental shift in computers and their use.
With its exponential rate of progress, hardware evolution commonly appears to drive fundamental changes in systems and applications. Software, with its more glacial progress, rarely creates opportunities for fundamental improvements. However, software does evolve, and its change makes it possible, and necessary, to rethink old assumptions and practices. Advances in programming languages, run-time systems, and program analysis tools provide the building blocks to construct architectures and systems that are more dependable and robust than existing systems.
Languages and tools based on these advances are already used to detect and prevent programming errors. Less well explored is how these mechanisms enable deep changes in system architecture, which in turn might advance the ultimate goal of preventing and mitigating software defects.
What Is Singularity
Singularity is a research project in Microsoft Research that started with the question: what would a software platform look like if it were designed from scratch with the primary goal of dependability, instead of the more common goal of performance? Singularity is working to answer this question by building on advances in programming languages and programming tools to build a new system architecture and operating system (named Singularity), with the aim of producing a more robust and dependable software platform. Although dependability is difficult to measure in a research prototype, Singularity shows the practicality of new technologies and architectural decisions, which should lead to more robust and dependable systems in the future.
Design of Singularity
A key aspect of Singularity is an extension model based on Software-Isolated Processes (SIPs), which encapsulate pieces of an application or a system and provide information hiding, failure isolation, and strong interfaces. SIPs are used throughout the operating system and application software. It is believed that building a system on this abstraction will lead to more dependable software.
SIPs are not just used to encapsulate application extensions. Singularity uses a single mechanism for both protection and extensibility, instead of the conventional dual mechanisms of processes and dynamic code loading. As a consequence, Singularity needs only one error recovery model, one communication mechanism, one security policy, and one programming model, rather than the layers of partially redundant mechanisms and policies in current systems. A key experiment in Singularity is to construct an entire operating system using SIPs and demonstrate that the resulting system is more dependable than a conventional system.
The Singularity kernel consists almost entirely of safe code, and the rest of the system, which executes in SIPs, consists only of verifiably safe code, including all device drivers, system processes, and applications. While all untrusted code must be verifiably safe, parts of the Singularity kernel and run-time system, called the trusted base, are not verifiably safe. Language safety protects this trusted base from untrusted code.
The integrity of the SIPs depends on language safety and on a system-wide invariant that a process does not hold a reference into another process's object space.
Ensuring code safety is obviously essential. In the short term, Singularity relies on compiler verification of source and intermediate code. In the future, typed assembly language (TAL) will allow Singularity to verify the safety of compiled code. TAL requires that a program executable supply a proof of its type safety (which can be produced automatically by a compiler for a safe language). Verifying that a proof is correct and applicable to the instructions in an executable is a straightforward task for a simple verifier of a few thousand lines of code. This end-to-end verification strategy eliminates a compiler (a large, complex program) from Singularity's trusted base. The verifier must be carefully designed, implemented, and checked, but these tasks are feasible because of its size and simplicity.
The memory independence invariant that prohibits cross-object space pointers serves several purposes. First, it enhances the data abstraction and failure isolation of a process by hiding implementation details and preventing dangling pointers into terminated processes. Second, it relaxes implementation constraints by allowing processes to have different run-time systems and their garbage collectors to run without coordination. Third, it clarifies resource accounting and reclamation by making unambiguous a process's ownership of a particular piece of memory. Finally, it simplifies the kernel interface by eliminating the need to manipulate multiple types of pointers and address spaces.
A major objection to this architecture is the difficulty of communicating through message passing, as compared with the flexibility of directly sharing data. Singularity is addressing this problem through an efficient messaging system, programming language extensions that concisely specify communication over channels, and verification tools.
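The memory independence invariant has a direct consequence for messaging: once a message is handed over, the sender should no longer be able to reach it. The following Python sketch is only an illustration of that idea, not Singularity's channel implementation. `OwnedBuffer`, `Channel`, and the process names are hypothetical, and the sketch assumes that sending a message transfers ownership of its payload to the receiver.

```python
from collections import deque


class OwnedBuffer:
    """Message payload that has at most one owning process at a time."""

    def __init__(self, data: bytes, owner: str):
        self._data = bytes(data)
        self.owner = owner  # None while the buffer is in flight

    def read(self, who: str) -> bytes:
        if who != self.owner:
            raise PermissionError(f"{who} does not own this buffer")
        return self._data


class Channel:
    """One-directional channel between two named processes (hypothetical)."""

    def __init__(self, sender: str, receiver: str):
        self.sender = sender
        self.receiver = receiver
        self._queue = deque()

    def send(self, buf: OwnedBuffer) -> None:
        if buf.owner != self.sender:
            raise PermissionError("only the owner may send a buffer")
        buf.owner = None  # the sender gives up ownership on send
        self._queue.append(buf)

    def receive(self) -> OwnedBuffer:
        buf = self._queue.popleft()
        buf.owner = self.receiver  # ownership passes to the receiver
        return buf


channel = Channel(sender="driver", receiver="app")
packet = OwnedBuffer(b"disk block 7", owner="driver")
channel.send(packet)
received = channel.receive()
```

After the send, a call such as `packet.read("driver")` raises `PermissionError`, while `received.read("app")` returns the payload. The rule is checked at run time here purely for illustration.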
Software creators rarely anticipate the full functionality demanded by users of their system or application. Rather than trying to satisfy everyone with a monolithic system, most non-trivial software provides mechanisms to load additional code. For example, Microsoft Windows supports over 100,000 third-party device drivers, which enable it to control almost any hardware device. Similarly, countless browser add-ons and extensions augment a browser's interface and components for web pages. Even open source projects, although theoretically modifiable, provide "plug-in" mechanisms, since extensions are easier to develop, distribute, and combine than new versions of software.
An extension usually consists of code that is dynamically loaded into its parent's address space. With direct access to the parent's internal interfaces and data structures, extensions can provide rich functionality. However, flexibility comes at a high cost. Extensions are a major cause of software reliability, security, and backward compatibility problems. Although extension code is often untrusted, unverified, faulty, or even malicious, it is loaded directly into a program's address space with no hard interface, boundary, or distinction between host and extension. The outcome is often unpleasant. For example, Swift reports that faulty device drivers cause 85% of diagnosed Windows system crashes. Moreover, because an extension lacks a hard interface, it can use unexposed aspects of its parent's implementation, which can constrain evolution of a program and require extensive testing to avoid incompatibilities.
Dynamic code loading imposes a second, less obvious tax on performance and correctness. Software that can load code is an open environment in which it is impossible to make sound assumptions about the system's states, invariants, or valid transitions. Consider the Java virtual machine (JVM). An interrupt, exception, or thread switch can invoke code that loads a new file, overwrites class and method bodies, and modifies global state. In general, the only feasible way to analyze a program running under such conditions is to start with the unsound assumption that the environment cannot change arbitrarily between any two operations.
One alternative is to prohibit code loading and isolate dynamically created code in its own environment. Previous attempts along these lines were not widely popular because the isolation mechanisms had performance and programmability problems that made them less appealing than the risks of running without isolation. The most common mechanism is a traditional OS process, but its high costs limit its usability. Memory management hardware provides hard boundaries and protects processor state, but it also makes inter-process control and data transfers expensive. On an x86 processor, switching between processes can cost hundreds to thousands of cycles, not including TLB and cache refill misses.
More recent systems, such as the Java virtual machine and Microsoft Common Language Runtime (CLR), are designed for extensibility and use language safety, not hardware, as the mechanism to isolate computations running in the same address space. Safe languages, by themselves, do not guarantee isolation. Shared data can provide a navigable path between computations' object spaces, at which point reflection mechanisms can subvert data abstraction and information hiding. As a consequence, these systems incorporate complex security mechanisms and policies, such as Java's fine-grained access control or the CLR's code access security, to limit access to system mechanisms and interfaces. These mechanisms are difficult to use properly and impose considerable overhead.
Equally important, computations that share a run-time system and execute in the same process are not isolated upon failure. When a computation running in a JVM fails, the entire JVM process typically is restarted because it is difficult to isolate and discard corrupted data and find a clean point to restart the failed computation.
Singularity uses SIPs to encapsulate code in a closed environment. Every device driver, system process, application, and extension runs in its own SIP and communicates over channels that provide limited and appropriate functionality. If code in a SIP fails, it terminates, which allows the system to reclaim resources and notify communication partners. Since these partners did not share state with the extension, error recovery is entirely local and is facilitated by the explicit protocols on channels.
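The failure path described above (terminate, reclaim, notify) can be pictured with a small simulation. This is a sketch under stated assumptions rather than Singularity code: the `Sip` and `System` classes, their fields, and the `faulty_driver` task are all hypothetical, and a Python exception stands in for a failure inside a SIP.

```python
class Sip:
    """Hypothetical stand-in for a software-isolated process."""

    def __init__(self, name, task):
        self.name = name
        self.task = task
        self.alive = True
        self.resources = {"heap": bytearray(1024)}  # private to this SIP
        self.partners = []  # SIPs at the other end of its channels
        self.notices = []   # failure notifications delivered by the system


class System:
    """Runs SIPs and cleans up after the ones that fail."""

    def connect(self, a: Sip, b: Sip) -> None:
        a.partners.append(b)
        b.partners.append(a)

    def run(self, sip: Sip):
        try:
            return sip.task()
        except Exception as exc:
            self.terminate(sip, reason=repr(exc))
            return None

    def terminate(self, sip: Sip, reason: str) -> None:
        sip.alive = False
        sip.resources.clear()  # reclaim everything the SIP owned
        for partner in sip.partners:
            # Partners share no state with the failed SIP, so a
            # notification is all they need in order to recover.
            partner.notices.append((sip.name, reason))


def faulty_driver():
    raise RuntimeError("device timeout")


system = System()
driver = Sip("driver", faulty_driver)
app = Sip("app", lambda: "ok")
system.connect(driver, app)
system.run(driver)  # the driver fails; the app keeps running
```

In this sketch the failed driver ends up with no resources and `app.notices` holds one entry naming the driver, while the application itself is untouched and can still be run.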
Another run-time source of new code is dynamic code generation, commonly encapsulated in a reflection interface. This feature allows a running program to examine existing code and data, and to produce and install new methods. Reflection is commonly used to produce marshalling code for objects or parsers for XML schemas. Singularity's closed SIPs do not allow run-time code generation.
Instead, Singularity provides compile-time reflection (CTR), which provides similar functionality that executes when a file is compiled. Normal reflection, which has access to runtime values, is more general than CTR. However, in many cases, the class to be marshaled or the schemas to be parsed are known ahead of execution. In these cases, CTR produces code during compilation. In the other cases, Singularity will support a mechanism for generating code and running it in a separate SIP.
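As a rough analogy for the case where the class to be marshaled is known ahead of execution, the Python sketch below emits marshalling source code in a separate generation step, so the generated function contains no run-time inspection. It does not reproduce Singularity's CTR mechanism: `Point`, `generate_marshaller`, and the generated function name are hypothetical, and the `exec` call merely stands in for compiling the generated file as part of a build.

```python
from dataclasses import dataclass, fields


@dataclass
class Point:
    x: int
    y: int


def generate_marshaller(cls) -> str:
    """Generation step: emit source for a marshaller specialised to cls."""
    names = [f.name for f in fields(cls)]
    body = ", ".join(f"{name!r}: obj.{name}" for name in names)
    return f"def marshal_{cls.__name__}(obj):\n    return {{{body}}}\n"


# In a real build the generated source would be written to a file and
# compiled with the rest of the program; exec() stands in for that step.
source = generate_marshaller(Point)
namespace = {}
exec(source, namespace)
marshal_Point = namespace["marshal_Point"]

encoded = marshal_Point(Point(3, 4))
```

The generated function is ordinary straight-line code that reads each field by name, which is the property that makes this style compatible with a closed environment once the generation step has finished.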
Operating systems currently do not treat programs or applications as a first-class abstraction. A modern application is a collection of files containing code, data, and metadata, which an untrusted agent installs by copying the pieces into a file system and registering them in namespaces. The system is largely unaware of relationships among the pieces and has little control over the installation process. A well-known consequence is that adding or removing an application can break unrelated software.
In Singularity, an application consists of a manifest and a collection of resources. The manifest describes the application in terms of its resources and their dependencies. Although many existing setup descriptions combine declarative and imperative aspects, Singularity manifests contain only declarative statements that describe the desired state of the application after installation or update.
The process of realizing this state is Singularity's responsibility. A manifest must provide enough information for the Singularity installer to deduce appropriate installation steps, detect conflicts with existing applications, and decide whether the installation succeeded. Singularity can prevent installations that impair the system.
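A minimal sketch of how an installer might act on purely declarative manifests follows. The manifest format shown here is an assumption made for illustration (the text does not specify one): `Manifest`, its `provides` and `requires` fields, and the `find_problems` and `install` functions are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Manifest:
    """Declarative description of an application (hypothetical format)."""
    name: str
    provides: frozenset              # resources the application installs
    requires: frozenset = frozenset()  # resources it depends on


def find_problems(installed: dict, new: Manifest) -> list:
    """Return the reasons why installing `new` would impair the system."""
    problems = []
    available = set(new.provides)
    for other in installed.values():
        if other.name == new.name:
            continue  # an update replaces the application's own entry
        available |= other.provides
        for resource in sorted(other.provides & new.provides):
            problems.append(
                f"conflict: {resource} already provided by {other.name}")
    for resource in sorted(new.requires - available):
        problems.append(f"missing dependency: {resource}")
    return problems


def install(installed: dict, new: Manifest) -> list:
    """Commit the manifest only if no problems were detected."""
    problems = find_problems(installed, new)
    if not problems:
        installed[new.name] = new
    return problems


installed = {}
install(installed, Manifest("runtime", frozenset({"libcore"})))
ok = install(installed,
             Manifest("editor", frozenset({"edit.exe"}), frozenset({"libcore"})))
rejected = install(installed,
                   Manifest("rogue", frozenset({"libcore"}), frozenset({"libgfx"})))
```

Because the manifest contains no imperative steps, the installer can evaluate the whole change before touching anything. In this sketch the third installation is refused with two reported problems and the set of installed applications is left unchanged.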
Other aspects of Singularity also utilize information from a manifest. For example, Singularity's security model introduces applications as a security principal, which enables an application to be entered in a file's access control list (ACL). Treating an application as a principal requires knowledge of the application's constituent pieces and dependencies and a strong identity, all of which come from the manifest.
In summary, the key contributions of Singularity are:
- Construction of a system and application model called software-isolated processes, which uses verified safe code to implement a strong boundary between processes without hardware mechanisms. Since SIPs cost less to create and schedule, the system and applications can support more and finer isolation boundaries and a stronger isolation model.
- A consistent extension model for the system and applications that simplifies the security model, improves dependability and failure recovery, increases code optimization, and makes programming and testing tools more effective.
- A fast, verifiable communication mechanism between the processes on a system, which preserves process independence and isolation, yet enables processes to communicate correctly and at low cost.
- Language and compiler support to build an entire system in safe code and to verify interprocess communications with explicit resource management.
- Elimination of the distinction between an operating system and a safe language run-time system, such as the JVM or the Microsoft CLR.
- Pervasive use of specifications throughout a system to describe, configure and verify components.