May 7-10, 2017, Asilomar, California

DaLi: Database as a Library

Gowtham Kaki

The landscape of data-intensive applications today spans the gamut from large-scale Web applications, expected to provide persistent, fault-tolerant, highly available, and low-latency geo-distributed services, to loosely connected IoT networks composed of millions of heterogeneous devices streaming and processing real-time data feeds. In both cases, application logic is usually expressed using high-level, often domain-specific, language abstractions, while data management issues are typically relegated to an opaque, monolithic data querying and storage service. While this architecture encourages separation of concerns, it provides little opportunity for synergy across the boundary between the application and the database or storage layer. In particular, applying well-understood programming language principles and verification techniques to ensure that data management services enforce application-level invariants becomes difficult, jeopardizing safety and maintainability. To overcome these drawbacks, we propose a radically different view of how applications and databases should interact with one another. Our approach encapsulates data management functionality within transparent libraries written in the same language as the application they support (OCaml in our case). An immediate benefit of our approach is that properties relevant to the application can now be directly interpreted, enforced, and verified by the data management layer. Similarly, database functionality related to consistency, integrity, scalability, and fault-tolerance can be couched in terms of the data types manipulated by the application. Our ideas can be thought of as a natural extension of unikernels, applied to data (as opposed to computation). We sketch the design of a library-centric data management stack called DaLi, describe how we unify data representation issues across layers, and show how we exploit language-level data type information to realize scalability and persistence.