May 7-10, 2017 Asilomar, California
We live in the twilight of Moore’s Law. From data centers to mobile devices, performance and energy scaling is limited by locality (the distance over which data has to move, e.g., from nearby caches, from faraway main memory, or across networks) and parallelism. Because of this, I argue that we should think of the performance and efficiency of an application as determined not just by the algorithm and the hardware on which it runs, but critically also by the organization of computations and data. For algorithms with the same complexity—even the exact same set of arithmetic operations and data—executing on the same hardware, with the same level of tuning in their innermost loop, the order and granularity of execution and the placement of data can easily change performance by an order of magnitude because of locality and parallelism.
To extract the full potential of our machines—indeed, to scale at all beyond the performance we have today—we need to treat the organization of computation as a first-class concern in our applications, and therefore also in the languages and tools with which we build them. In this talk, I will give a taste of how this philosophy appears in the Halide language and its concept of “schedules,” point out some successes and challenges with this specific design, and suggest ways the organization of computation could be exposed and exploited more broadly.