Date of Award

Spring 2025

Document Type

Open Access Dissertation

Department

Computer Science and Engineering

First Advisor

Amit P. Sheth

Abstract

Deception is an inherent aspect of social interactions, with research indicating that most people engage in deceptive behavior at least once or twice daily. In parallel, advances in artificial intelligence have led to machines exhibiting deceptive tendencies. These deceptions can be categorized into two types: unintended and intentional. Unintended deceptions, often referred to as hallucinations, occur when generative AI systems produce narratives that are plausible and convincing yet factually inaccurate. This phenomenon primarily results from the systems' architectural design, extensive parametric memory, and reliance on statistical assumptions. In this thesis, we provide a comprehensive discussion on the characterization, detection, avoidance, and mitigation of hallucinations. Although recent research has observed early signs of deception, cheating, and self-preservation in top-performing reasoning models, these phenomena fall outside the scope of the current study.

Mustafa Suleyman, author of The Coming Wave and co-founder of DeepMind and Inflection AI, outlines three waves of AI: Wave 1, classification and training; Wave 2, generative AI, which creates new data; and Wave 3, interactive AI, where conversation serves as the interface and autonomous bots collaborate behind the scenes. We are currently in Wave 2, generative artificial intelligence (GenAI), which has profoundly impacted everyday life and accelerated AI development. Recent advances in GenAI have demonstrated remarkable capability in generating high-quality text, images, videos, and even software code with minimal human intervention. Large Foundation Models (LFMs), such as GPT and DALL-E, are accessible to the general public, enabling individuals to efficiently produce high-quality creative content at scale. For example, in healthcare, GenAI models assist in drug discovery and medical imaging analysis, while in education, they enhance learning by creating adaptive and personalized content, among other applications. Nonetheless, the widespread adoption of GenAI has brought substantial challenges concerning misinformation, safety, and ethics, highlighting the need for regulatory measures to mitigate its impact.

A key challenge of LFMs lies in their tendency to generate factually inaccurate, logically incoherent, or entirely fabricated outputs while maintaining an appearance of plausibility, a phenomenon referred to as "hallucination." For instance, in early 2024, Air Canada faced legal action after its AI-powered chatbot provided inaccurate information regarding bereavement travel discounts. Moreover, the Cambridge Dictionary declared "hallucinate" its Word of the Year for 2023. As GenAI systems, including large language models and image or video generation models, are widely adopted across industries, hallucinations present a critical obstacle. In an interview with The Verge, Google CEO Sundar Pichai described AI hallucination as an inherent feature of LLMs, calling it an "unsolved problem."

In this dissertation, I examine six distinct components to address the challenge of hallucination. (i) Characterization: We developed a first-of-its-kind taxonomy for the systematic classification of hallucinations and introduced a large-scale benchmark called HILT. (ii) Quantification: We introduced novel evaluation metrics, including the Hallucination Vulnerability Index (HVI) and its automated variant HVI_auto, to assess and rank LLMs by their susceptibility to hallucination.
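To make the idea of a vulnerability-style ranking concrete, the following minimal Python sketch aggregates per-category hallucination counts into a single 0-100 score; the category names, weights, and normalization are illustrative assumptions and do not reproduce the HVI formulation introduced in the dissertation.

```python
# Hypothetical sketch of a hallucination-vulnerability-style index.
# Category names, weights, and normalization are illustrative assumptions,
# not the HVI definition developed in this dissertation.
from typing import Dict

CATEGORY_WEIGHTS: Dict[str, float] = {
    "numeric_error": 1.0,      # e.g., wrong numbers in generated text
    "time_error": 1.0,         # e.g., wrong dates or chronology
    "fabricated_entity": 2.0,  # more severe categories weighted higher
}

def vulnerability_index(counts: Dict[str, int], num_prompts: int) -> float:
    """Aggregate weighted hallucination counts into a 0-100 score."""
    if num_prompts <= 0:
        raise ValueError("num_prompts must be positive")
    weighted = sum(CATEGORY_WEIGHTS.get(cat, 1.0) * n for cat, n in counts.items())
    worst_case = num_prompts * max(CATEGORY_WEIGHTS.values())
    return 100.0 * weighted / worst_case

if __name__ == "__main__":
    # Toy numbers: 500 prompts, hallucinations observed per category.
    counts = {"numeric_error": 12, "time_error": 7, "fabricated_entity": 3}
    print(f"Vulnerability index: {vulnerability_index(counts, 500):.2f}")
```

A higher score indicates a model that hallucinates more often or in more severe categories, which is the sense in which such an index can be used to rank models.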
We are confident that the dataset and the evaluation metrics will be valuable resources for future researchers studying hallucination behaviors in LLMs and developing effective detection and mitigation strategies. (iii) Detection: We introduced an innovative automated span-based hallucination detection method, referred to as Factual Entailment; this technique achieved a 30% increase in accuracy on the FACTOID benchmark compared to state-of-the-art textual entailment (TE) methods (a minimal entailment-based sketch appears below). (iv) Avoidance: We introduced a new prompting technique, termed "Sorry, Come Again?" (SCA), to avoid hallucinations through prompt analysis; a [PAUSE] injection technique slows LLM generation to enhance comprehension, and optimal paraphrasing combined with LDA improves performance in both the Number and Time categories. (v) Mitigation: We introduced RADIANT, Retrieval-Augmented entIty-context AligNmenT, a paradigm that combines retrieval-augmented generation (RAG) with alignment principles, enhancing the interaction between retrieved evidence and the model’s internal representations. (vi) Multi-modal: Lastly, we constructed analogous hallucination taxonomies and datasets, called VHILT and ViBe, for the (a) image-to-text and (b) text-to-video modalities, along with a preliminary analysis. These datasets will support further research by the community.

Overall, this dissertation offers a concrete approach to evaluating the content generated by language models and addressing hallucinations across all modalities.
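As a rough, hedged illustration of span-level detection via textual entailment (not the Factual Entailment model developed in this dissertation), the Python sketch below scores each claim span against retrieved evidence using an off-the-shelf NLI checkpoint and flags spans the evidence does not entail; the model name, span granularity, and threshold are assumptions.

```python
# Minimal sketch of span-level hallucination detection via textual
# entailment (TE). This is not the dissertation's Factual Entailment
# method; the checkpoint, span splitting, and threshold are assumptions.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "roberta-large-mnli"  # off-the-shelf NLI model
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME).eval()

def entailment_probability(premise: str, hypothesis: str) -> float:
    """Probability that the evidence (premise) entails the claim span."""
    inputs = tokenizer(premise, hypothesis, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)[0]
    entail_idx = model.config.label2id.get("ENTAILMENT", 2)
    return probs[entail_idx].item()

def flag_unsupported_spans(evidence: str, spans: list[str], threshold: float = 0.5):
    """Return claim spans whose entailment probability falls below threshold."""
    return [(s, p) for s in spans
            if (p := entailment_probability(evidence, s)) < threshold]

if __name__ == "__main__":
    evidence = "The Eiffel Tower was completed in 1889 and stands in Paris."
    spans = ["The Eiffel Tower is located in Paris.",
             "The tower was completed in 1925."]
    for span, prob in flag_unsupported_spans(evidence, spans):
        print(f"Possibly hallucinated span: {span!r} (entailment={prob:.2f})")
```

Unlike sentence-level classification, scoring individual spans localizes which part of a generation lacks support, which is the intuition behind span-based detection.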

Rights

© 2025, Vipula Rawte
