CB-28 Unlocking the Power of Synthetic Data: Enhancing AI Models with Artificially Generated Datasets

SCURS Disciplines

Computer Sciences

Document Type

Poster Presentation

Abstract

Synthetic data has emerged as a transformative approach in artificial intelligence, addressing challenges in data availability, privacy, and model development. This research examines the fundamentals of synthetic data in the context of generative AI and analyzes its role in data privacy and model development. The study centers on a comparative analysis of three primary categories of synthetic data: fully synthetic, partially synthetic, and hybrid synthetic datasets, each serving a distinct purpose in AI development.

The work presents real-world applications of synthetic data, focusing mainly on healthcare, where keeping patient records secure and confidential is critical. Drawing on current generative AI techniques, the research proposes approaches to generating high-fidelity synthetic datasets that retain the statistical characteristics of the source data while masking sensitive information. The study addresses both the technical aspects of data generation and the broader context for AI development, covering benefits such as scalability and privacy preservation as well as challenges in maintaining data quality and representativeness.

This systematic review summarizes current approaches, limitations, and future research directions in synthetic data generation, contributing to the growing body of knowledge on generative AI applications. The findings suggest that synthetic data is a promising means of driving AI innovation while addressing critical data confidentiality and availability concerns, particularly in highly sensitive domains such as healthcare.

Keywords

Synthetic Data, Artificial Intelligence, Generative AI, Data Privacy, Fully Synthetic Data, Partially Synthetic Data, Hybrid Synthetic Data, Data Generation Techniques, Healthcare Applications, AI Model Development.

Start Date

April 11, 2025, 9:30 AM

Location

University Readiness Center Greatroom

End Date

April 11, 2025, 11:30 AM

This document is currently not available here.
