CB-28 Unlocking the Power of Synthetic Data: Enhancing AI Models with Artificially Generated Datasets
SCURS Disciplines
Computer Sciences
Document Type
Poster Presentation
Abstract
Synthetic data has emerged as a promising solution in artificial intelligence for addressing challenges in data availability, privacy, and model development. This research examines the fundamentals of synthetic data in the context of generative AI and analyzes its role in data privacy and model development. It presents a comparative analysis of three primary categories of synthetic data: fully synthetic, partially synthetic, and hybrid synthetic datasets, each serving a distinct purpose in AI development.
The work describes real-world applications of synthetic data, focusing mainly on healthcare, where keeping patient records secure and confidential is critical. Drawing on current generative AI techniques, the research proposes approaches to generating high-fidelity synthetic datasets that retain the statistical characteristics of the source data while masking sensitive information. The study addresses both the technical aspects of data generation and the broader context of AI development, covering benefits such as scalability and privacy preservation as well as challenges in maintaining data quality and representativeness.
This systematic review surveys current approaches, limitations, and future research directions in synthetic data generation, contributing to the growing body of knowledge on generative AI applications. The findings suggest that synthetic data is a promising means of advancing AI innovation while addressing critical data confidentiality and availability concerns, particularly in highly sensitive domains such as healthcare.
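As a rough illustration of the distinction between fully and partially synthetic data drawn in the abstract, the following minimal Python sketch fits a normal distribution to each column of a toy table and samples from it. It is not the method used in this work: the records, the field names (`age`, `systolic_bp`), and the helper functions are all hypothetical, and the simple per-column sampling stands in for the generative AI techniques the abstract refers to. Hybrid synthetic data is omitted because the abstract does not define how it is constructed.

```python
import random
import statistics

# Toy "real" records; every value here is invented for illustration.
REAL = [
    {"age": 34, "systolic_bp": 118.0},
    {"age": 51, "systolic_bp": 131.0},
    {"age": 47, "systolic_bp": 126.0},
    {"age": 62, "systolic_bp": 142.0},
    {"age": 29, "systolic_bp": 112.0},
    {"age": 58, "systolic_bp": 137.0},
]


def fit_normal(values):
    """Return the sample mean and standard deviation of a numeric column."""
    return statistics.mean(values), statistics.stdev(values)


def fully_synthetic(records, n, rng):
    """Sample every field from fitted distributions; no real row is reused.

    Columns are sampled independently, so cross-column correlation is lost.
    """
    params = {key: fit_normal([r[key] for r in records]) for key in records[0]}
    return [
        {key: rng.gauss(mu, sd) for key, (mu, sd) in params.items()}
        for _ in range(n)
    ]


def partially_synthetic(records, sensitive_key, rng):
    """Keep the real rows but replace one sensitive field with sampled values."""
    mu, sd = fit_normal([r[sensitive_key] for r in records])
    return [{**r, sensitive_key: rng.gauss(mu, sd)} for r in records]


rng = random.Random(0)  # fixed seed so the sketch is reproducible
full = fully_synthetic(REAL, 1000, rng)
partial = partially_synthetic(REAL, "systolic_bp", rng)
```

In this sketch the fully synthetic table preserves only each column's mean and spread, while the partially synthetic table keeps the real `age` values and replaces only the field treated as sensitive. Neither variant offers a formal privacy guarantee, which is one reason the data quality and representativeness challenges noted above matter in practice.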
Keywords
Synthetic Data, Artificial Intelligence, Generative AI, Data Privacy, Fully Synthetic Data, Partially Synthetic Data, Hybrid Synthetic Data, Data Generation Techniques, Healthcare Applications, AI Model Development.
Start Date
11-4-2025 9:30 AM
Location
University Readiness Center Greatroom
End Date
11-4-2025 11:30 AM