Date of Award

12-15-2014

Document Type

Open Access Dissertation

Department

Computer Science and Engineering

First Advisor

Jason D. Bakos

Abstract

Embedded system-on-chip processors such as the Texas Instruments C66 DSP and the IBM Cell provide the programmer with a software controlled on-chip memory to supplement a traditional but simple two-level cache. By decomposing data sets and their corresponding workload into small subsets that fit within this on-chip memory, the processor can potentially achieve equivalent or better performance, power efficiency, and area efficiency than with its sophisticated cache. However, program controlled on chip memory requires a shift in the responsibility for management and allocation from the hardware to the programmer. Specifically, this requires the explicit mapping of program arrays to specific types of on chip memory structure and the addition of supporting code that allocates and manages the on chip memory. Previous work in tiling focuses on automated loop transformations but are hardware agnostic and do not incorporate a performance model of the underlying memory design. In this work we will explore the relationship between mapping and allocation of tiles for stencil loops and linear algebra kernels on the Texas Instruments Keystone II DSP platform.

Share

COinS