Robust Generative Models for Deep Learning and Applications

Karkare, Rasika

In this thesis, we demonstrate how deep autoencoders can be generalized to inpainting, missing-entry prediction, and denoising, even when no clean training data is available. In particular, we show how neural networks can be trained to perform all of these tasks simultaneously. While deep autoencoders implemented as neural networks have demonstrated potential for denoising and anomaly detection, standard autoencoders have the drawback that they require access to clean data for training. However, recent work on Robust Deep Autoencoders (RDAEs) shows how autoencoders can be trained to eliminate outliers and noise in a dataset without access to any clean training data. Inspired by this work, we extend RDAEs to the case where the data is not only noisy and contaminated by outliers, but also only partially observed. Specifically, the dataset on which we train the neural network has the properties that every entry is noisy, some entries are corrupted by large errors, and many entries are not observed at all. Given such an algorithm, many standard tasks, such as denoising, image inpainting, and unobserved-entry imputation, can all be accomplished simultaneously within the same framework. We demonstrate these techniques on standard machine learning tasks, such as image inpainting and denoising for the MNIST and CIFAR10 datasets. These approaches are applicable not only to image processing problems but also to datasets arising from real-world problems, such as manufacturing and network processing, where noisy, partially observed data arise naturally. Moreover, the large volume of sensor data collected in manufacturing processes is pivotal to Industrial Internet of Things (IIoT) applications. However, these sensors are prone to transmitting inaccurate readings from time to time, for reasons such as network connectivity and hardware issues, leading to datasets with missing values and noisy features.
Thus, in this thesis we develop robust neural network solutions to address such data quality challenges and to obtain insights into manufacturing processes through data-driven modelling in the absence of robust physics-based solutions. We apply our models to manufacturing datasets in which the sensor data exhibits varying levels and patterns of missingness. By combining the Hadamard Autoencoder with the self-supervised learning paradigm, we show how our model can be used to improve the quality of training data used for downstream analyses. We demonstrate the effectiveness of our approach via a case study on a High Pressure Die Casting (HPDC) dataset. A second application covered in this thesis is the analysis of large-scale social networks. Such analysis is hampered by limited data, as complete network measurements are expensive or impossible to collect. We extend our model, paired with pretraining, to predict missing topology information in ultra-sparsely sampled social networks. Our model is evaluated on variations of Barabási-Albert and power-law cluster graphs as well as on a real-world Facebook network. Results are compared with a deterministic Low-rank Matrix Completion (LMC) method as well as with an autoencoder trained on partially observed data from the test network. The results show that the pretrained autoencoder far outperforms LMC when fewer than 1% of the distance samples are available, while remaining competitive at higher sampling fractions. In the context of graph networks, we also develop a novel approach to searching the parameter space of random graph models for optimal parameters. Our model first searches for the most suitable synthetic graph parameters, which then serve as an "oracle" to create arbitrarily large training datasets.
Such an approach enables us to evaluate the robustness of the autoencoder to artificial training data while making predictions on real-world social networks. Accordingly, we conduct experiments on real-world Facebook, Email, and Train Bombing networks and show that our approach, named "oracle search pretraining" (OSP), outperforms models without pretraining, models pretrained with inaccurate parameters, and other distance prediction schemes such as Low-rank Matrix Completion. Finally, we show how our models can be applied to financial datasets for dimension estimation in the presence of noise and missing data. We demonstrate the robustness of our approach by applying it to synthetic data and real-world data with varying levels of anomalies. Further, we demonstrate our approach on financial intraday data from the S&P 500 constituents.
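
As an illustration of training on partially observed data, a reconstruction loss can be restricted to the observed entries with an entrywise (Hadamard) mask, so that unknown entries contribute nothing to the error. The following is a minimal NumPy sketch under that assumption. The function name masked_mse, the argument names, and the toy arrays are hypothetical and are not taken from the thesis, which may define its loss differently.

```python
import numpy as np

def masked_mse(x, x_hat, observed):
    """Mean squared error computed over observed entries only.

    x        : data matrix (entries at unobserved positions are ignored,
               so they may hold any placeholder, including NaN)
    x_hat    : reconstruction, same shape as x
    observed : boolean array, True where the entry of x was observed
    """
    # Entrywise masking: residuals at unobserved positions are set to zero.
    residual = np.where(observed, x - x_hat, 0.0)
    # Normalise by the number of observed entries (guard against an empty mask).
    n_observed = max(int(observed.sum()), 1)
    return float(np.sum(residual ** 2) / n_observed)

# Toy example: the entry at position (0, 1) is missing.
x = np.array([[1.0, np.nan], [3.0, 4.0]])
observed = np.array([[True, False], [True, True]])
x_hat = np.array([[1.0, 100.0], [3.0, 2.0]])

loss = masked_mse(x, x_hat, observed)
```

Because the missing position is masked out, the reconstruction value there has no effect on the loss; only the three observed entries are scored.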

Creator