Xianshan Qu

Date of Award

Spring 2021

Document Type

Open Access Dissertation


Computer Science and Engineering

First Advisor

John Rose


With the availability of large-scale data sets, researchers in many areas, such as natural language processing, computer vision, and recommender systems, have adopted deep learning models and achieved great progress in recent years. In this dissertation, we study three important classification problems using deep learning models.

First, with the fast growth of e-commerce, more people choose to purchase products online and browse reviews before making decisions. It is therefore essential to build a model that identifies helpful reviews automatically. Our work is inspired by the observation that a customer's expectation of a review can be greatly affected by review sentiment and by the degree to which the customer is aware of pertinent product information. To model such customer expectation and capture important information from a review text, we propose a novel neural network that encodes the sentiment of a review through an attention module and introduces a product attention layer that fuses information from both the target product and related products. Our experimental results for the task of identifying whether a review is helpful show AUC improvements of 5.4% and 1.5% over the previous state-of-the-art model on the Amazon and Yelp data sets, respectively. We further validate the effectiveness of each attention layer of our model in two application scenarios. The results demonstrate that both attention layers contribute to model performance and that combining them has a synergistic effect. We also evaluate our model as a recommender system using three commonly used metrics: NDCG@10, Precision@10, and Recall@10. Our model outperforms PRH-Net, a state-of-the-art model, on all three metrics.
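For readers unfamiliar with the three ranking metrics named above, the following is a minimal sketch of how Precision@k, Recall@k, and NDCG@k are commonly computed for a single ranked list. It assumes binary relevance (a review is either helpful or not), which is an assumption of this sketch and not necessarily the setup used in the dissertation; the function names and the example identifiers are hypothetical.

```python
import math


def precision_at_k(ranked, relevant, k=10):
    """Fraction of the top-k ranked items that are relevant."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k


def recall_at_k(ranked, relevant, k=10):
    """Fraction of all relevant items that appear in the top k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)


def ndcg_at_k(ranked, relevant, k=10):
    """Normalized discounted cumulative gain with binary gains (assumed)."""
    # DCG: each relevant item at 0-based position i contributes 1 / log2(i + 2).
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, item in enumerate(ranked[:k])
        if item in relevant
    )
    # Ideal DCG: all relevant items placed at the top of the list.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical example: four reviews ranked by predicted helpfulness.
ranked_reviews = ["r1", "r2", "r3", "r4"]
helpful_reviews = {"r1", "r3"}
p = precision_at_k(ranked_reviews, helpful_reviews, k=2)
r = recall_at_k(ranked_reviews, helpful_reviews, k=2)
n = ndcg_at_k(ranked_reviews, helpful_reviews, k=2)
```

In practice these per-list values are averaged over all products or users in the test set; graded relevance (for example, helpfulness vote ratios) would require replacing the binary gain in `ndcg_at_k`.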

Second, real-time bidding (RTB), which features per-impression real-time ad auctions, has become a popular practice in today's digital advertising industry. In RTB, click-through rate (CTR) prediction is a fundamental problem for ensuring the success of an ad campaign and boosting revenue. We present a dynamic CTR prediction model designed for the Samsung demand-side platform (DSP). We identify two key technical challenges that have not been fully addressed by existing solutions: the dynamic nature of RTB and the scarcity of user information. To address both challenges, we develop a model that captures the dynamic evolution of both users and ads and integrates auxiliary data sources (e.g., installed apps) to better model users' preferences. We put forward a novel interaction layer that fuses explicit user responses (e.g., clicks on ads) with auxiliary data sources to generate consolidated user preference representations. We evaluate our model on a large amount of data collected from the Samsung advertising platform and compare it against several state-of-the-art methods that are likely suitable for real-world deployment. The evaluation results demonstrate the effectiveness of our method and its potential for production use.

Third, for Highway Performance Monitoring System (HPMS) purposes, the South Carolina Department of Transportation (SCDOT) must provide a classification of vehicles to the Federal Highway Administration (FHWA). However, because lighting is limited at night, classifying vehicles at nighttime is quite challenging. To solve this problem, we designed three CNN models with different architectures that operate on thermal images. Of these, model 2 achieves the best performance. To avoid over-fitting and further improve the performance of model 2, we propose two training-test methods based on data augmentation. The experimental results demonstrate that the second training-test method further improves the performance of model 2 in terms of both accuracy and F1-score.