Etd

Contributions to Collective Dynamical Clustering-Modeling of Discrete Time Series

Public

The analysis of sequential data is important in business, science, and engineering, for tasks such as signal processing, user behavior mining, and commercial transaction analysis. In this dissertation, we build upon the Collective Dynamical Modeling and Clustering (CDMC) framework for discrete time series modeling, making contributions to clustering initialization, dynamical modeling, and scaling.

We first propose a modified Dynamic Time Warping (DTW) approach for clustering initialization within CDMC. The proposed approach provides DTW metrics that penalize deviations of the warping path from the path of constant slope. This reduces over-warping while retaining the efficiency advantages of global constraint approaches, and without relying on domain-dependent constraints.

Second, we investigate the use of semi-Markov chains as dynamical models of temporal sequences in which state changes occur infrequently. Semi-Markov chains allow the distribution of state visit durations to be specified explicitly. This makes them more flexible than traditional Markov chains, which implicitly assume a geometric state duration distribution (the discrete-time analogue of the exponential distribution).

Third, we consider convergence properties of the CDMC framework. We establish convergence by viewing CDMC from an Expectation Maximization (EM) perspective. We investigate how our efficient DTW-based initialization technique and selected dynamical models affect the time to convergence. We also explore the convergence implications of various stopping criteria.

Fourth, we consider scaling up CDMC to process big data, using Storm, an open-source distributed real-time computation system that supports batch and distributed data processing.

We performed an experimental evaluation on human sleep data and on user web navigation data. Our results demonstrate the superiority of the strategies introduced in this dissertation over state-of-the-art techniques in terms of modeling quality and efficiency.
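The abstract describes the modified DTW only in words, so the following is a minimal sketch of the general idea, not the dissertation's actual metric. It assumes a 0/1 mismatch cost between discrete symbols and an additive penalty proportional to how far each warping-path cell lies from the constant-slope path in normalized coordinates. The function name `penalized_dtw`, the `penalty` parameter, and the penalty form itself are all hypothetical choices made for illustration.

```python
def penalized_dtw(x, y, penalty=1.0):
    """DTW distance between two discrete sequences with a soft slope penalty.

    Illustrative assumption: each cell (i, j) on the warping path pays
    penalty * |i/n - j/m|, its distance from the constant-slope path.
    This is NOT taken from the dissertation; it only sketches the idea.
    """
    n, m = len(x), len(y)
    if n == 0 or m == 0:
        raise ValueError("both sequences must be non-empty")

    inf = float("inf")
    # cost[i][j] = cheapest alignment of x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            # 0/1 mismatch cost, suitable for categorical symbols
            local = 0.0 if x[i - 1] == y[j - 1] else 1.0
            # Deviation of this cell from the constant-slope path
            deviation = abs(i / n - j / m)
            step = local + penalty * deviation
            cost[i][j] = step + min(
                cost[i - 1][j],      # advance in x only
                cost[i][j - 1],      # advance in y only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


if __name__ == "__main__":
    a, b = "ABBB", "AAAB"
    print(penalized_dtw(a, b, penalty=0.0))  # plain DTW, warping is free
    print(penalized_dtw(a, b, penalty=1.0))  # warping now carries a cost
```

With `penalty=0.0` the function reduces to ordinary DTW, which can warp freely to align the two sequences at no cost. A positive penalty makes heavily warped paths more expensive without forbidding them outright, unlike a hard band constraint.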

Creator
Contributors
Degree
Unit
Publisher
Language
  • English
Identifier
  • etd-042716-223920
Keyword
Advisor
Committee
Defense date
Year
  • 2016
Date created
  • 2016-04-27
Resource type
Rights statement

Relations

In Collection:

Articles

Items

Permanent link to this page: https://digital.wpi.edu/show/1831ck05t