Deep Learning on Attributed Sequences

Zhuang, Zhongfang

Etd

Deep Learning on Attributed Sequences

Public

Recent research in feature learning has been extended to sequence data, where each instance consists of a sequence of heterogeneous items with a variable length. However, in many real-world applications, the data exists in the form of attributed sequences, which is composed of a set of fixed-size attributes and variable-length sequences with dependencies between them. In the attributed sequence context, feature learning remains challenging due to the dependencies between sequences and their associated attributes. In this dissertation, we focus on analyzing and building deep learning models for four new problems on attributed sequences. First, we propose a framework, called NAS, to produce feature representations of attributed sequences in an unsupervised fashion. The NAS is capable of producing task independent embeddings that can be used in various mining tasks of attributed sequences. Second, we study the problem of deep metric learning on attributed sequences. The goal is to learn a distance metric based on pairwise user feedback. In this task, we propose a framework, called MLAS, to learn a distance metric that measures the similarity and dissimilarity between attributed sequence feedback pairs. Third, we study the problem of one-shot learning on attributed sequences. This problem is important for a variety of real-world applications ranging from fraud prevention to network intrusion detection. We design a deep learning framework OLAS to tackle this problem. Once the OLAS is trained, we can then use it to make predictions for not only the new data but also for entire previously unseen new classes. Lastly, we investigate the problem of attributed sequence classification with attention model. This is challenging that now we need to assess the importance of each item in each sequence considering both the sequence itself and the associated attributes. In this work, we propose a framework, called AMAS, to classify attributed sequences using the information from the sequences, metadata, and the computed attention. Our extensive experiments on real-world datasets demonstrate that the proposed solutions significantly improve the performance of each task over the state-of-the-art methods on attributed sequences.

Creator