Yuqi Song

Date of Award

Summer 2023

Document Type

Open Access Dissertation


Computer Science and Engineering

First Advisor

Jianjun Hu


Discovering new materials and understanding their crystal structures and chemical properties are critical tasks in materials science. Computational methods such as Density Functional Theory (DFT) provide a convenient means of calculating certain material properties and, when combined with search algorithms, of predicting crystal structures. However, DFT is too computationally demanding for structure prediction and property calculation in most material families, especially for materials with a large number of atoms. This dissertation aims to address this limitation by developing novel deep learning and machine learning algorithms for effective prediction of crystal structures and material properties. Our data-driven modeling approaches learn both explicit and implicit chemical and geometric knowledge, in the form of patterns and constraints, from known materials and then exploit that knowledge for efficient sampling in crystal structure prediction and for feature extraction in material property prediction.

In the first topic, we present DeltaCrystal, a new deep learning-based method for crystal structure prediction (CSP). This data-driven algorithm learns and exploits the abundant atom interaction distributions found in known crystal structures to achieve efficient structure search. It first uses a deep residual neural network to predict the atomic distance matrix for a given material composition and then uses a genetic algorithm to reconstruct the 3D crystal structure from this matrix. Through extensive experiments, we demonstrate that our model can learn implicit interatomic relationships and that it exploits this information effectively and reliably for crystal structure prediction. Compared to a global optimization-based CSP method, our algorithm achieves better structure prediction performance for more complex crystals.
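The second stage described above, recovering 3D coordinates from a distance matrix with a genetic algorithm, can be illustrated with a minimal sketch. This is not the DeltaCrystal implementation: it assumes the target distance matrix is already given (here computed from a toy four-atom arrangement rather than predicted by a network), ignores lattice parameters and periodic boundary conditions, and uses hypothetical function names and hyperparameters (`reconstruct`, `box`, `pop_size`, and so on) chosen only for illustration.

```python
import math
import random

def distance_matrix(coords):
    """Pairwise Euclidean distances for a list of (x, y, z) points."""
    return [[math.dist(a, b) for b in coords] for a in coords]

def fitness(coords, target):
    """Mean absolute error between candidate and target distances (lower is better)."""
    n = len(coords)
    errors = [abs(math.dist(coords[i], coords[j]) - target[i][j])
              for i in range(n) for j in range(i + 1, n)]
    return sum(errors) / len(errors)

def crossover(a, b, rng):
    """Uniform crossover: each atom position is copied from one of the two parents."""
    return [pa if rng.random() < 0.5 else pb for pa, pb in zip(a, b)]

def mutate(individual, sigma, rate, rng):
    """Perturb each atom with probability `rate` by Gaussian noise of width `sigma`."""
    return [tuple(x + rng.gauss(0.0, sigma) for x in atom)
            if rng.random() < rate else atom
            for atom in individual]

def reconstruct(target, box=5.0, pop_size=60, generations=300, seed=0):
    """Evolve atom coordinates whose pairwise distances match `target`.

    All hyperparameters are illustrative assumptions, not values from the source.
    """
    rng = random.Random(seed)
    n_atoms = len(target)
    population = [[tuple(rng.uniform(0.0, box) for _ in range(3))
                   for _ in range(n_atoms)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=lambda ind: fitness(ind, target))
        history.append(fitness(population[0], target))
        elite = population[: pop_size // 4]  # elitism: best quarter survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            parent_a, parent_b = rng.sample(elite, 2)
            children.append(mutate(crossover(parent_a, parent_b, rng), 0.1, 0.3, rng))
        population = elite + children
    population.sort(key=lambda ind: fitness(ind, target))
    history.append(fitness(population[0], target))
    return population[0], history

# Toy target: distances of a made-up four-atom arrangement (not real crystal data).
true_coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (0.0, 1.5, 0.0), (0.0, 0.0, 1.5)]
target = distance_matrix(true_coords)
best, history = reconstruct(target)
print(f"initial error {history[0]:.3f}, final error {history[-1]:.3f}")
```

Because the best quarter of each generation is carried over unchanged, the best error never increases from one generation to the next. The recovered coordinates match the target only up to rotation, translation, and reflection, since a distance matrix does not fix absolute orientation.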

In the second topic, we shift our focus from predicting the position of each atom individually to predicting crystal structures from structural polyhedron motifs. This shift is based on the observation that these atom patterns appear frequently across different crystal materials with high geometric conservation, which has the potential to significantly reduce the search complexity. We extract a large set of structural motifs from a vast collection of material structures. Through a comprehensive analysis of these motifs, we uncover common patterns that span different materials. Our work represents a preliminary step toward exploring material structures from the motif point of view and exploiting such motifs for efficient crystal structure prediction.

In the third topic, we propose a machine learning-based framework for discovering new hypothetical 2D materials. It first trains a deep learning generative model to generate material compositions and a random forest-based 2D materials classifier to select candidate 2D material compositions. Then, a template-based element substitution approach is developed to predict the crystal structures for a subset of the newly predicted hypothetical 2D formulas, which allows us to confirm their structural stability using DFT calculations. So far, we have predicted 101 crystal structures and confirmed 92 2D/layered materials by DFT formation energy calculation.
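The idea behind template-based element substitution can be sketched as follows. This is an illustrative simplification, not the dissertation's code: the `Structure` class, the function names, and the rule that pairs elements by their atom counts are all assumptions, and the lattice vectors and fractional coordinates are placeholder numbers rather than real crystallographic data. In practice, a substituted structure would still need to be relaxed and validated, for example with the DFT calculations mentioned above.

```python
import math
from dataclasses import dataclass
from functools import reduce

@dataclass(frozen=True)
class Structure:
    """Hypothetical minimal structure: lattice vectors plus (element, fractional coords) sites."""
    lattice: tuple
    sites: tuple

def composition(structure):
    """Count atoms of each element in the structure."""
    counts = {}
    for element, _ in structure.sites:
        counts[element] = counts.get(element, 0) + 1
    return counts

def reduced(counts):
    """Divide all counts by their greatest common divisor (e.g. Mo2S4 -> MoS2)."""
    divisor = reduce(math.gcd, counts.values())
    return {element: n // divisor for element, n in counts.items()}

def prototype(counts):
    """Element-agnostic stoichiometry, e.g. (1, 2) for any AB2 compound."""
    return tuple(sorted(reduced(counts).values()))

def substitute(template, target_counts):
    """Return a copy of `template` with its elements replaced by the target's elements.

    Elements are paired by sorted (count, symbol) order, a simplifying assumption;
    returns None when the stoichiometries do not match.
    """
    template_counts = reduced(composition(template))
    target_counts = reduced(target_counts)
    if prototype(template_counts) != prototype(target_counts):
        return None

    def ordered(counts):
        return sorted(counts, key=lambda element: (counts[element], element))

    mapping = dict(zip(ordered(template_counts), ordered(target_counts)))
    sites = tuple((mapping[element], frac) for element, frac in template.sites)
    return Structure(template.lattice, sites)

# Placeholder AB2 template (illustrative numbers only, not a real MoS2 cell).
template = Structure(
    lattice=((3.0, 0.0, 0.0), (-1.5, 2.6, 0.0), (0.0, 0.0, 15.0)),
    sites=(("Mo", (0.0, 0.0, 0.5)),
           ("S", (1 / 3, 2 / 3, 0.4)),
           ("S", (1 / 3, 2 / 3, 0.6))),
)

candidate = substitute(template, {"W": 1, "Se": 2})
mismatch = substitute(template, {"Ga": 1, "N": 1})
print(composition(candidate), mismatch)
```

Pairing elements purely by count is ambiguous when two elements occur equally often. A more careful approach would also consider chemical similarity, such as oxidation state or ionic radius, when choosing which template site each new element occupies.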

In the last topic, we focus on machine learning models for predicting material properties, including piezoelectric coefficients and the noncentrosymmetry of nonlinear optical materials, which play key roles in many important applications such as laser technology and X-ray shutters. We conduct a comprehensive study on developing advanced machine learning models and evaluating their performance in predicting the piezoelectric modulus from materials’ compositions and structures. We train several prediction models using two approaches: extensive feature engineering combined with machine learning models, and automated feature learning based on deep graph neural networks. We use the best model to predict the piezoelectric coefficients of 12,680 materials and report the top 20 potential high-performance piezoelectric materials. Similarly, we develop machine learning models to screen potential noncentrosymmetric materials from 2,000,000 hypothetical materials generated by our material composition generative design model and report the top 80 candidate noncentrosymmetric nonlinear optical materials.