Predicting Patient Satisfaction With Ensemble Methods Public

Downloadable Content

open in viewer

Health plans are constantly seeking ways to assess and improve the quality of patient experience in various ambulatory and institutional settings. Standardized surveys are a common tool used to gather data about patient experience, and a useful measurement taken from these surveys is known as the Net Promoter Score (NPS). This score represents the extent to which a patient would, or would not, recommend his or her physician on a scale from 0 to 10, where 0 corresponds to ""Extremely unlikely"" and 10 to ""Extremely likely"". A large national health plan utilized automated calls to distribute such a survey to its members and was interested in understanding what factors contributed to a patient's satisfaction. Additionally, they were interested in whether or not NPS could be predicted using responses from other questions on the survey, along with demographic data. When the distribution of various predictors was compared between the less satisfied and highly satisfied members, there was significant overlap, indicating that not even the Bayes Classifier could successfully differentiate between these members. Moreover, the highly imbalanced proportion of NPS responses resulted in initial poor prediction accuracy. Thus, due to the non-linear structure of the data, and high number of categorical predictors, we have leveraged flexible methods, such as decision trees, bagging, and random forests, for modeling and prediction. We further altered the prediction step in the random forest algorithm in order to account for the imbalanced structure of the data.

Last modified
  • 02/01/2021
  • English
  • etd-043015-133118
Defense date
  • 2015
Date created
  • 2015-04-30
Resource type
Rights statement


In Collection:


Permanent link to this page: