Date of Award


Document Type

Campus Access Dissertation


Computer Science and Engineering

First Advisor

Jason Bakos


In the heterogeneous computing execution model, one or more general-purpose processors are accelerated using one or more co-processors. In this model, general-purpose CPUs are generally assigned portions of the software that either do not map well to the available co-processor microarchitectures or whose low execution time does not warrant the extra effort required to adapt the code to the co-processor's programming model. The co-processors, on the other hand, are assigned the most computationally expensive portions of the software, and this code is adapted to the co-processor's specialized programming model.

In order for legacy code to take advantage of a heterogeneous computer, a programmer must partition its code to select which portions of it to map to the co-processor. The selection criteria typically involves finding and selecting the application's most expensive computations, or kernels. However, this methodology only considers execution time while ignoring memory behavior. In heterogeneous systems where the general-purpose processor and co-processor have disjoint memory spaces, there is a penalty required to exchange data between processors, and it is important to minimize communication cost. We refer to this category of heterogeneous systems as "Disjoint Memory Co-Processor Accelerated Computing (DiMCAC)."

The partitioning procedure is typically performed in an ad hoc manner due to the lack of existing automation tools designed for this task. The tools that do exist are not specially designed for the DiMCAC model, or require that a programmer perform manual analysis which can take a considerable amount of time when the programmer is not familiar with the application.

To address this issue, this research presents a Partitioning Analysis Tool for Heterogeneous Systems (PATHS). PATHS is an analysis toolchain that performs a fully automated behavioral analysis of applications to be partitioned for execution on computing platforms that correspond to the DiMCAC model. In this case, making effective partitioning decisions may require optimizing against competing constraints.

In this dissertation, we describe new instrumentation, measurement, presentation and selection components that are implemented in PATHS to support a systematic partitioning methodology. PATHS' primary contributions are the development of 1) a novel methodology for instrumentation and runtime data collection to monitor execution time and transferred data movement at the loop level, 2) an objective function for determining the fitness of an arbitrary set of assignments, and 3) a heuristic search technique for finding an effective solution.

In an experimental evaluation of five different computationally intensive applications, PATHS provides a top ranked candidate accelerator for each application with a fitness evaluation that is higher than candidate accelerators selected from application profiles performed by GNU Gprof.