Attention-based Deep Learning Models for Text Classification and their Interpretability

Sen, Cansu

Etd

In this dissertation, I design novel attention-based neural architectures for classifying document series (e.g., clinical notes of a patient taken over time), with an emphasis on 1) exploiting the rich information contained in this complex data type, which corresponds to a time series of documents, and 2) improving the interpretability of model predictions with the help of attention mechanisms. In the first part of this dissertation, I model the clinical notes of a patient as hierarchical attributed sequences, where categorical information is associated with different levels of the document hierarchy. I then design novel models, ranging from hierarchical attention networks enhanced with external attributes to time-informed attention mechanisms that account for the temporal structure of document series. In the second part, I focus on the interpretability of attention-based neural networks. First, I assess how human-like the explanations generated by a variety of attention mechanisms are. To this end, I conduct a large-scale study to collect “human attention maps” through crowdsourcing. Then, I design a novel human-guided attention mechanism that concurrently achieves improved classification performance and offers human-like reasoning for model predictions.

Creator