May 7-10, 2017 Asilomar, California

Physical Simulation using Database Languages

Gilbert Bernstein

Physical simulations are some of the oldest, most computationally intensive, and scientifically critical numerical programs in computing. The resulting inexhaustible demand for computation leads programmers to employ a large array of different geometric representations, parallel hardware, physical models, and numeric techniques. But in code, these concerns turn into spaghetti. Professors attempt to train their students to become super-expert generalists. Code size balloons. Code bases ossify. For instance, porting a simulation from a serial prototype running on a laptop to a fully distributed supercomputer program is estimated to increase code size by 5-10x. I refer to the problem of separating these concerns and curbing combinatorially induced code growth as the “simulation expression problem.”

In work with colleagues, I’ve been exploring the idea that designing high-performance languages built on database concepts, notably relational algebra, can drastically simplify simulation programs. We’ve found that (1) completely automatic parallelization is possible for broad classes of simulations; (2) complex cyclic mesh data structures can be automatically verified without any proof assistant; and (3) code size can frequently be reduced by an order of magnitude. The resulting simulations often outperform hand-written code, for the simple reason that most existing code bases make sub-optimal decisions on cross-cutting concerns like data representation or parallelization strategies.
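
To give a rough sense of what "simulation as relational algebra" can look like, here is a minimal sketch in plain Python. It is not the language or system described in this abstract. The relation names (`vertices`, `edges`), the function `diffuse_step`, and the coefficient `k` are all hypothetical, chosen only for illustration. One explicit diffusion step on a small mesh is written as a join of edges with vertices, followed by a group-by-sum and a projection.

```python
from collections import defaultdict

# Hypothetical relations (assumptions for illustration only):
#   vertices(id, temp)  stored as {id: temp}
#   edges(tail, head)   stored as a list of (tail, head) rows
vertices = {0: 100.0, 1: 0.0, 2: 0.0}
edges = [(0, 1), (1, 2)]


def diffuse_step(vertices, edges, k=0.25):
    """One explicit diffusion step phrased as join + group-by-sum."""
    # Join: look up both endpoint temperatures for every edge row
    # and compute a per-edge flux.
    fluxes = [(t, h, k * (vertices[h] - vertices[t])) for (t, h) in edges]

    # Group by vertex id and sum the flux contributions.
    delta = defaultdict(float)
    for t, h, f in fluxes:
        delta[t] += f  # flux into the tail vertex
        delta[h] -= f  # equal and opposite flux at the head vertex

    # Project: produce the updated temperature column.
    return {v: temp + delta[v] for v, temp in vertices.items()}


new_vertices = diffuse_step(vertices, edges)
```

Nothing in `diffuse_step` says how the rows are stored or in what order the edges are visited. In a sketch like this, that is the property a compiler could exploit: the per-edge join is independent across rows, and the group-by-sum is a reduction, so data layout and parallelization strategy are left open rather than baked into the simulation code.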