Applying Reinforcement Learning Based Tutor Strategy Recommendation Service To The ASSISTments

Dai, Zekun

Etd

Public

Reinforcement learning, specifically the multi-armed bandit (MAB) family of algorithms, has shown strong results in personalized recommendation. This thesis focused on adding a reinforcement-learning-based service that recommends personalized hints and explanations to ASSISTments, an online learning platform. It investigated several MAB algorithms and implemented them for testing on a collected educational dataset. It also explored the design and implementation of the infrastructure needed to support a tutor-strategy recommendation service for ASSISTments. Experiments comparing the bandit algorithms, with mean cumulative regret as the metric, found Thompson Sampling to be the best choice among those selected for production use. A comparison of both contextual and non-contextual MABs with randomized controlled methods for this specific application showed that the MABs did not introduce bias, but also did not have a significant advantage over random methods.
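
As an illustration of the comparison described above, the following is a minimal sketch of non-contextual Beta-Bernoulli Thompson Sampling evaluated against uniform random assignment by mean cumulative regret. It is not the thesis's implementation: the function names, the three synthetic success rates, the horizon, and the number of runs are all assumptions chosen for the example, and the rewards are simulated rather than drawn from ASSISTments data.

```python
import random


def thompson_sampling(true_rates, horizon, rng):
    """Beta-Bernoulli Thompson Sampling; returns cumulative regret after each step."""
    n_arms = len(true_rates)
    successes = [0] * n_arms  # observed rewards of 1 per arm
    failures = [0] * n_arms   # observed rewards of 0 per arm
    best_rate = max(true_rates)
    regret, history = 0.0, []
    for _ in range(horizon):
        # Draw one sample per arm from its Beta(successes + 1, failures + 1) posterior.
        samples = [rng.betavariate(successes[a] + 1, failures[a] + 1)
                   for a in range(n_arms)]
        arm = max(range(n_arms), key=samples.__getitem__)
        reward = 1 if rng.random() < true_rates[arm] else 0
        if reward:
            successes[arm] += 1
        else:
            failures[arm] += 1
        # Regret uses the (hypothetical) true rates, known only in simulation.
        regret += best_rate - true_rates[arm]
        history.append(regret)
    return history


def uniform_random(true_rates, horizon, rng):
    """Baseline: pick an arm uniformly at random at every step."""
    best_rate = max(true_rates)
    regret, history = 0.0, []
    for _ in range(horizon):
        arm = rng.randrange(len(true_rates))
        regret += best_rate - true_rates[arm]
        history.append(regret)
    return history


def mean_cumulative_regret(policy, true_rates, horizon, runs, seed=0):
    """Average the final cumulative regret of `policy` over independent runs."""
    total = 0.0
    for r in range(runs):
        total += policy(true_rates, horizon, random.Random(seed + r))[-1]
    return total / runs


if __name__ == "__main__":
    rates = [0.2, 0.5, 0.8]  # synthetic success rates for three tutor strategies
    for name, policy in [("thompson", thompson_sampling), ("random", uniform_random)]:
        print(name, mean_cumulative_regret(policy, rates, horizon=500, runs=20))
```

With success rates this far apart, Thompson Sampling should accumulate much less regret than random assignment. When the strategies have nearly equal success rates, the gap shrinks, which is one setting in which a bandit would show little advantage over random methods.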

Creator