Publications

Knowledge Infused Policy Gradients with Upper Confidence Bound for Relational Bandits

Kaushik Roy, University of South Carolina - Columbia
Qi Zhang, University of South Carolina - Columbia
Manas Gaur, University of South Carolina - Columbia
Amit Sheth, University of South Carolina - Columbia

Document Type

Conference Proceeding

Abstract

Contextual Bandits find important use cases in various real-life scenarios such as online advertising, recommendation systems, and healthcare. However, most of the algorithms use flat feature vectors to represent context, whereas in the real world the context contains a varying number of objects and the relations among them. For example, in a music recommendation system, the user context contains what music they listen to, which artists create this music, the artists' albums, etc. Adding richer relational context representations also introduces a much larger context space, making exploration-exploitation harder. To improve the efficiency of exploration-exploitation, knowledge about the context can be infused to guide the exploration-exploitation strategy. Relational context representations allow a natural way for humans to specify knowledge owing to their descriptive nature. We propose an adaptation of Knowledge Infused Policy Gradients to the Contextual Bandit setting and a novel Knowledge Infused Policy Gradients Upper Confidence Bound algorithm. We perform an experimental analysis on a simulated music recommendation dataset and various real-life datasets, showing where expert knowledge can drastically reduce the total regret and where it cannot.

Publication Info

Preprint version 2021.

APA Citation

Roy, K., Zhang, Q., Gaur, M., & Sheth, A. (2021). Knowledge infused policy gradients with upper confidence bound for relational bandits.

Included in

Computer Engineering Commons, Electrical and Computer Engineering Commons
