Applying Reinforcement Learning Based Tutor Strategy Recommendation Service To The ASSISTments
Reinforcement learning, and specifically the multi-armed bandit (MAB) family of algorithms, has shown strong results in personalized recommendation. This thesis adds a reinforcement-learning-based service that recommends personalized hints and explanations to ASSISTments, an online learning platform. It investigates several MAB algorithms, implements them, and tests them on a collected educational dataset. It also explores the design and implementation of the infrastructure needed to support a tutor-strategy recommendation service for ASSISTments. Experiments comparing the bandit algorithms, with mean cumulative regret as the metric, found Thompson Sampling to be the best choice among those selected for production use. A comparison of both contextual and non-contextual MABs with random controlled methods for this specific application showed that the MABs do not introduce bias, but also do not have a significant advantage over random methods.
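To make the evaluation metric concrete, the following is a minimal sketch of how mean cumulative regret might be computed for Thompson Sampling and for uniform random selection. It is not the thesis implementation: the arms are synthetic Bernoulli arms with assumed success rates (the hypothetical `true_rates`), the Beta(1, 1) prior is an assumption, and the function names `run_bandit` and `mean_cumulative_regret` are illustrative. Because the data are synthetic, the sketch does not reproduce the thesis findings on ASSISTments data.

```python
import random


def run_bandit(true_rates, horizon, policy, rng):
    """Simulate one run and return the cumulative regret after each step.

    true_rates: assumed Bernoulli success probability of each arm (synthetic).
    policy: "thompson" for Beta-Bernoulli Thompson Sampling, else uniform random.
    """
    n_arms = len(true_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    best_rate = max(true_rates)
    regret = 0.0
    curve = []
    for _ in range(horizon):
        if policy == "thompson":
            # Draw one sample per arm from its Beta posterior (Beta(1, 1) prior)
            # and play the arm with the largest sample.
            samples = [
                rng.betavariate(successes[a] + 1, failures[a] + 1)
                for a in range(n_arms)
            ]
            arm = max(range(n_arms), key=samples.__getitem__)
        else:
            arm = rng.randrange(n_arms)
        # Observe a Bernoulli reward and update the posterior counts.
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
        # Regret is measured against always playing the best arm.
        regret += best_rate - true_rates[arm]
        curve.append(regret)
    return curve


def mean_cumulative_regret(true_rates, horizon, policy, n_runs, seed=0):
    """Average the cumulative regret curve over n_runs independent runs."""
    rng = random.Random(seed)
    totals = [0.0] * horizon
    for _ in range(n_runs):
        for t, value in enumerate(run_bandit(true_rates, horizon, policy, rng)):
            totals[t] += value
    return [total / n_runs for total in totals]


if __name__ == "__main__":
    rates = [0.2, 0.5, 0.8]  # hypothetical arm success rates
    for name in ("thompson", "random"):
        curve = mean_cumulative_regret(rates, horizon=500, policy=name, n_runs=20)
        print(name, round(curve[-1], 2))
```

In this sketch, regret is computed from the known success rates rather than from observed rewards, which is only possible in simulation. With logged educational data, the per-arm rates would have to be estimated first.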
Permanent link to this page: https://digital.wpi.edu/show/0v8383662