Student Work

HAR-CTGAN: Synthesizing Continuous and Discrete Mobile Sensor Data Using GAN models for Human Activity Recognition

Public Deposited

Human activity recognition (HAR) is the process of determining physical activities performed by individuals using mobile sensor data. HAR is the backbone of many mobile healthcare applications, such as passive health monitoring systems, early diagnosis systems, and fall detection systems. Effective HAR models rely on deep learning to accurately classify which activity was being performed in a data instance. In turn, HAR models require large collections of labeled real-world human activity data. Unfortunately, HAR datasets are expensive to collect, are often mislabeled, and have large class imbalances. State-of-the-art approaches to these challenges use Generative Adversarial Networks (GANs) to generate additional synthetic data along with their labels. However, these HAR GANs synthesize only continuous features (features represented with real numbers) recorded from gyroscopes, accelerometers, GPS, and other sensors that produce continuous data. This is limiting because mobile sensor data commonly includes discrete features that provide additional context, such as Bluetooth state, sensor location (proprioception), and time of day. It has been shown that the availability of these discrete features can substantially improve HAR classification. Within the healthcare domain, misclassifications can have damaging or even fatal impacts on the individuals who rely on these models. Hence, we studied Conditional Tabular Generative Adversarial Networks (CTGANs) to synthesize mobile sensor data containing both continuous and discrete features, a task that state-of-the-art approaches have not previously addressed. We show that HAR-CTGANs generate more realistic data, which leads to better downstream performance in HAR models. Using synthesized data from HAR-CTGAN in HAR model training resulted in a 63% greater improvement in F1 performance than using synthesized data from state-of-the-art approaches. When state-of-the-art models were modified with HAR-CTGAN characteristics, downstream F1 performance increased by 18%.

  • This report represents the work of one or more WPI undergraduate students submitted to the faculty as evidence of completion of a degree requirement. WPI routinely publishes these reports on its website without editorial or peer review.
Identifier
  • 72466
  • E-project-081722-150154
Year
  • 2022
Date created
  • 2022-08-17
Source
  • E-project-081722-150154
Last modified
  • 2022-12-21

Permanent link to this page: https://digital.wpi.edu/show/mg74qq26z