Manas Gaur

Date of Award

Summer 2022

Document Type

Open Access Dissertation


Computer Science and Engineering

First Advisor

Amit P. Sheth

Sixth Committee Member

Lorne Hofseth, Vignesh Narayanan


In DARPA’s view of the three waves of AI, the first wave, symbolic AI, focused on explicit knowledge. The second and current wave is termed statistical AI. Deep learning techniques have exploited large amounts of data and massive computational power to improve on human levels of performance in narrowly defined tasks. Separately, knowledge graphs have emerged as a powerful tool to capture and exploit a variety of explicit knowledge, helping algorithms better comprehend content and enabling the next generation of data processing, such as semantic search. After initial hesitancy about the scalability of the knowledge creation process, the last decade has seen significant growth in developing and applying knowledge, usually in the form of knowledge graphs. Examples range from the use of DBpedia in IBM’s Watson, to the Google Knowledge Graph in Google Semantic Search, to the application of the Protein Data Bank in AlphaFold, recognized by many as the most significant AI breakthrough. Furthermore, numerous domain-specific knowledge graphs and sources have been applied to improve AI methods in diverse domains such as medicine, healthcare, finance, manufacturing, and defense.

Now, we move towards the third wave of AI, built on the "Neuro-Symbolic" approach that combines the strengths of statistical and symbolic AI. Combining the respective strengths of knowledge graphs and deep learning is particularly attractive. This has led to the development of an approach and practice in computer science termed "knowledge-infused (deep) learning" (KiL). This dissertation will serve as a primer on methods that use diverse forms of knowledge (linguistic, commonsense, broad-based, and domain-specific) and will provide novel evaluation metrics to assess knowledge-infusion algorithms on various datasets, such as social media, clinical interviews, electronic health records, and information-seeking dialogues. Specifically, this dissertation will provide the necessary grounding in shallow infusion, semi-deep infusion, and a more advanced form called deep infusion to alleviate five bottlenecks in statistical AI: (1) Context Sensitivity, (2) Handling Uncertainty and Risk, (3) Interpretability, (4) User-level Explainability, and (5) Task Transferability. Further, the dissertation will introduce a new theoretical and conceptual approach called Process Knowledge Infusion, which enforces semantic flow in AI algorithms by altering their learning behavior with procedural knowledge. Such knowledge is manifested in questionnaires and guidelines that are usable by AI (or KiL) systems for sensible and safety-constrained response generation.

The hurdle to proving the acceptability of KiL in the AI and natural language understanding communities lies in the absence of realistic datasets that can expose the five bottlenecks in statistical AI. The dissertation describes the process involved in constructing a wide variety of gold-standard datasets using expert knowledge, questionnaires, guidelines, and knowledge graphs. These datasets challenge statistical AI on explainability, interpretability, uncertainty, and context sensitivity, and showcase remarkable performance gains obtained by KiL-based algorithms. This dissertation terms these gold-standard datasets Knowledge-intensive Language Understanding (KILU) tasks and considers them complementary to the well-adopted General Language Understanding Evaluation (GLUE) benchmarks. On KILU and GLUE datasets, KiL-based algorithms outperformed existing state-of-the-art methods in natural language generation and classification problems. Furthermore, KiL-based algorithms provided user-understandable explanations in sensitive problems such as mental health by highlighting concepts that depict the reason behind the model’s prediction or generation. Mapping these concepts to entities in an external knowledge source can support experts with user-level explanations and reasoning. A cohort-based qualitative evaluation indicated that KiL should support stronger interleaving of a greater variety of knowledge at different levels of abstraction with layers in a deep learning architecture. This would enforce controlled knowledge infusion and prevent the model from extrapolating or overgeneralizing. This dissertation opens future research questions on neural models within the domain of natural language understanding. For instance: (a) Which layers within deep neural language models (NLMs) require knowledge? (b) It is known that NLMs learn by abstraction. How can the inherent abstraction of external knowledge be leveraged to enhance the context of learned statistical representations? (c) Layered knowledge infusion might result in high-energy nodes contributing to the outcome, which runs counter to current softmax-based predictions. How should the most probable outcome be picked? This dissertation takes a first step towards addressing these and other questions; however, more efficient methods are needed that provide user-level explanations, are interpretable, and propel safe AI.