Student Work

Cleaning On Demand with a Recommender System (CODeRS)



Data science has become a popular tool for deriving solutions in a variety of domains, such as medicine, materials science, and finance. The performance of data science applications depends on how well the data is cleaned and preprocessed. However, choosing the correct techniques can be difficult because the decision depends heavily on the data itself. Automating this process could therefore greatly benefit those using data science by reducing human error and, in turn, producing more reliable and generalizable predictive models. This project aims to develop CODeRS, an automated cleaning and preprocessing web application for non-technical users. We performed a literature review to identify state-of-the-art techniques, covering a range of assumptions, for our application to recommend. We implemented those techniques in an automated recommender system that suggests the appropriate cleaning technique based on the dataset. Additionally, we developed a graphical user interface to simplify the experience for users from other domains who are creating data science solutions. Furthermore, we developed the application in a modular fashion to ensure scalability, longevity, and flexibility. We deployed our final product and evaluated its functionality and design in a series of user studies with a group of materials scientists.

  • This report represents the work of one or more WPI undergraduate students submitted to the faculty as evidence of completion of a degree requirement. WPI routinely publishes these reports on its website without editorial or peer review.
  • 48941
  • E-project-030222-203722
  • 2022
Date created
  • 2022-03-02

