Date of Award


Document Type

Campus Access Dissertation



First Advisor

Diansheng Guo


Spatial interactions (SI), such as daily human movements, disease spread, and commodity flows, are among the essential forces that drive many physical and socioeconomic processes. They are inherently complex. A typical SI data set spans three data spaces: (1) the geographic space, such as the locations of origins and destinations; (2) the graph/network space, such as the flows/links between locations; and (3) the multivariate space, including variables for locations (e.g., unemployment rate, median income) and for flows (e.g., migrant characteristics in terms of race, income, education, and profession).
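As a rough illustration of how the three data spaces might sit together in one structure, the following sketch uses plain Python dataclasses. The class names, field names, coordinates, and attribute values are all hypothetical and chosen only for illustration; they are not taken from the dissertation.

```python
from dataclasses import dataclass, field


@dataclass
class Location:
    """Geographic space (coordinates) plus location-level variables."""
    name: str
    lon: float
    lat: float
    attributes: dict = field(default_factory=dict)


@dataclass
class Flow:
    """Graph/network space (origin -> destination) plus flow-level variables."""
    origin: str
    destination: str
    attributes: dict = field(default_factory=dict)


@dataclass
class SIDataset:
    """Hypothetical container tying the three data spaces together."""
    locations: dict  # maps location name -> Location
    flows: list      # list of Flow objects

    def out_flows(self, name):
        """Return all flows that leave the given location."""
        return [f for f in self.flows if f.origin == name]


# Made-up example values, for illustration only.
dataset = SIDataset(
    locations={
        "A": Location("A", -81.0, 34.0, {"median_income": 41000}),
        "B": Location("B", -80.0, 35.0, {"median_income": 52000}),
    },
    flows=[
        Flow("A", "B", {"migrants": 120}),
        Flow("B", "A", {"migrants": 95}),
    ],
)
print(len(dataset.out_flows("A")))
```

Keeping location variables and flow variables in separate records mirrors the distinction the abstract draws between variables for locations and variables for flows.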

The goal of this research is to address the underutilization and underrepresentation of SI data. "Underutilization" refers to the lack of a powerful and comprehensive approach for analyzing and extracting the rich information hidden in large and complex SI data. "Underrepresentation" refers to the difficulty of mapping, visualizing, and communicating SI information and knowledge. Currently, there is a lack of powerful exploratory analytic methods that can handle the complexity of spatial interactions, which often involve: (1) multiple data spaces (i.e., geographic space, network space, and multivariate space); (2) various spatial constraints (e.g., travel distances, geographic contiguity, and physical barriers); (3) many variables for locations and interactions (flows); and (4) large data size: a moderate-sized data set involving 50 to 1,000 locations can easily have thousands or millions of connections. It is unlikely that any single method can fully address these challenges.
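The data-size point can be checked with simple arithmetic. Assuming every ordered pair of distinct locations may carry a flow (an upper bound, since real data sets are usually sparser), n locations allow at most n(n - 1) directed connections. The function name below is hypothetical.

```python
def max_directed_flows(n_locations: int) -> int:
    """Upper bound on origin-destination pairs, excluding self-flows."""
    return n_locations * (n_locations - 1)


for n in (50, 1000):
    print(n, max_directed_flows(n))
# 50 2450
# 1000 999000
```

Under this assumption, 50 locations already allow a few thousand connections, and 1,000 locations allow roughly a million.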

This dissertation develops an integrated computational-visual approach that examines SI data from different perspectives and synthesizes those perspectives into a holistic understanding. The contribution of this research is twofold. First, it develops a graph partitioning method to discover spatially contiguous community patterns (SI regions). Beyond representing graph patterns, SI regions also serve as a data aggregation strategy for summarizing massive spatial flows. Evaluations with benchmark data indicate that the developed method is more effective and computationally efficient than traditional methods. Second, this research combines the three SI data spaces in data exploration and representation: (1) SI regions are extracted from the graph space under geographic constraints; and (2) SI regions, multivariate patterns, and geographic patterns of SI flows are analyzed simultaneously in a novel, interactive visual analytic system. The combination of graph partitioning, multivariate visualization, flow mapping, and interactive interfaces creates a flexible, comprehensive, and efficient environment for exploring SI data from different perspectives and obtaining a holistic understanding.
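To make the idea of extracting regions from the graph space under a geographic contiguity constraint more concrete, here is a deliberately simple greedy agglomerative sketch. It is not the dissertation's partitioning method, which the abstract does not specify; it is a stand-in that repeatedly merges the two bordering regions exchanging the most flow. The function name, input layout, and toy data are all assumptions.

```python
from collections import defaultdict


def greedy_contiguous_regions(flows, adjacency, k):
    """Merge bordering regions with the largest mutual flow until k remain.

    flows:     {(origin, destination): volume}
    adjacency: {location: set of geographically neighbouring locations}
    Returns a {location: region_label} mapping.
    """
    # Start with every location as its own region.
    region_of = {loc: loc for loc in adjacency}
    while len(set(region_of.values())) > k:
        # Total flow (both directions) between each pair of distinct regions.
        between = defaultdict(float)
        for (origin, dest), volume in flows.items():
            pair = frozenset((region_of[origin], region_of[dest]))
            if len(pair) == 2:
                between[pair] += volume
        # Only region pairs that share a border are eligible to merge.
        touching = {
            frozenset((region_of[a], region_of[b]))
            for a, neighbours in adjacency.items()
            for b in neighbours
            if region_of[a] != region_of[b]
        }
        if not touching:
            break  # no contiguous merge is possible
        # Pick the eligible pair with the most flow; labels break ties.
        best = max(
            touching,
            key=lambda p: (between.get(p, 0.0), tuple(sorted(p))),
        )
        keep, drop = sorted(best)
        for loc, label in region_of.items():
            if label == drop:
                region_of[loc] = keep
    return region_of


# Toy example: four locations in a row, A - B - C - D.
adjacency = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
flows = {
    ("A", "B"): 100, ("B", "A"): 80,
    ("C", "D"): 90, ("D", "C"): 70,
    ("B", "C"): 5,
    ("A", "D"): 500,  # large flow, but A and D do not share a border
}
regions = greedy_contiguous_regions(flows, adjacency, k=2)
print(regions)
```

In the toy data, the large A-to-D flow never triggers a merge because A and D are not neighbours, which is the effect a contiguity constraint is meant to have. The resulting labels could then be used to aggregate flows by region, in the spirit of the aggregation strategy described above.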

A large U.S. inter-county migration data set is used to assess the developed approach and the implemented visual analytic system from an application perspective. The data set contains over 700,000 county-to-county migration flows (i.e., origin-destination pairs). Each flow has a vector of variables, such as the income, education, and racial composition of the migrants in that flow. The results demonstrate that the SI regions obtained by analyzing the spatial information and network connections can reveal real-world structures, such as the strong "core-suburban relationship," from a network perspective. A focused study on income migration shows that the developed visual analytic system can facilitate new and comprehensive analyses that existing research methodologies cannot support.