Biological Discovery via Data Science: Multigranular Artificial Intelligence Approaches for Cells, Cerebra and Cruise Ships

Many important breakthroughs in the life sciences were made possible by technologies that were novel at the time: the invention of the microscope, the discovery of X-rays, and DNA sequencing, which made it possible to decode the human genome. Each era of technological innovation provided more quantifiable data about living systems, and ever-improving access to technologies such as imaging and sequencing has resulted in exponential growth in biological data. In recent decades, the computational sciences have played a fundamental role in extracting insights from biological data and have helped to develop paradigms of complex systems. In this research, I designed data-driven artificial intelligence approaches for three diverse biological problems. This research is driven by the need to discover phenomena at multiple functional levels of biological systems. First, single-cell RNA sequencing (scRNA-seq) is a recent technology that enables fine-grained discovery of cellular subtypes and specific cell states. Analysis of scRNA-seq data routinely involves machine learning methods, such as feature learning, clustering, and classification, to help uncover novel information. However, current methods are not well suited to deal with the substantial noise created by the experiments or with the variation among cells of the same type. To address this, a new hybrid approach, Deep Unsupervised Single-cell Clustering (DUSC), is designed. It integrates feature generation based on a deep learning architecture with a model-based clustering algorithm to find a compact and informative representation of the single-cell transcriptomic data that yields robust clusters. A technique is also designed to estimate an efficient number of latent features in the deep learning model.
This method outperforms both classical and state-of-the-art feature learning and clustering methods, approaching the accuracy of supervised learning. DUSC was applied to a single-cell transcriptomics dataset obtained from a triple-negative breast cancer tumor to identify potential cancer subclones accentuated by copy-number variation and to investigate the role of clonal heterogeneity. The method is freely available to the community and will hopefully advance our understanding of the cellular atlas of living organisms and provide the means to improve patient diagnostics and treatment. Second, the traumatic experiences of childhood abuse and neglect can be precursors to the development of dissociative disorders such as dissociative identity disorder (DID) and the dissociative subtype of post-traumatic stress disorder (PTSD-DS), and these disorders carry an increased risk of self-injurious behavior, including suicide attempts. Data from comprehensive psychiatric assessments play a fundamental role in understanding complex mental illnesses. Artificial intelligence (AI) methods can be applied to complex high-dimensional psychometric data in an unbiased and holistic manner to identify patient clusters and key psychometrics and to predict critical outcomes, such as suicidal behavior. Here, an integrated AI approach that combines unsupervised and supervised learning methods is designed and applied to a patient sample with PTSD and/or DID and matched controls. Patient clusters were identified that correlate with symptom severity, specific clinical markers for categories of interest were determined, and these markers were then used to accurately predict outcomes, including suicidality. The unsupervised approach identified subclusters along a dissociation spectrum corresponding to patients with severe traumatic and dissociative symptomatology and alterations in vegetative functions.
The supervised models accurately predicted suicidality and identified fully dissociated actions as critical markers. It was also observed that, compared to patients without a dissociative disorder, patients with DID had significantly greater odds and risk ratios for attempting suicide. Finally, the proposed AI approach can provide clinicians with an integrated view of patient psychometrics to identify subgroups with severe symptoms and predict suicide risk in new cohorts. Third, infection outbreaks are a major health threat to passengers traveling in confined, close-quarter environments such as airplanes and cruise ships. On cruise ships, perhaps the most notorious and frequent disease outbreaks have been attributed to the highly contagious norovirus. Additionally, there have been instances of novel pandemic viruses spreading on cruise ships, such as the SARS coronavirus, the 2009 H1N1 virus, and, recently, the SARS-CoV-2 virus. To date, no study exists that incorporates geographical information, human behavior, and a pathogen model to provide real-time analysis of an outbreak on passenger vessels. Here, a novel approach to study the real-time dynamics of an infection outbreak in a confined environment is introduced. Specifically, the proposed framework integrates a 3D geographic representation of the environment, agent-based modeling, an explicit virologic model of infection, and containment protocols. This new framework allows the explicit modeling of the attributes and behaviors of individual hosts and viral particles and of their interactions with each other. The developed approach is then applied to simulate various outbreaks on a model of an actual cruise ship, and the simulations are validated against real-world outbreak data and other epidemiological models. The recent SARS-CoV-2 virus and its Delta variant are modeled with face mask mandates. Furthermore, four containment protocols were modeled to examine their efficacy given variations in implementation practices.
Future applications of this approach may include accurate real-time modeling of a novel pathogen outbreak in any confined environment (school, corporate office, military vessel, etc.). The proposed artificial intelligence approaches demonstrate the need for data-driven approaches to discover complex biological phenomena and can be seen as part of the new era of “biological data science”, in which artificial intelligence methods can become part of the analytical toolbox for wet lab scientists and clinicians, from bench to bedside.
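
The abstract reports both odds ratios and risk ratios for suicide attempts in patients with DID. For readers less familiar with how the two measures differ, the minimal Python sketch below computes each from a 2x2 table. The counts are hypothetical and are not taken from the thesis, and the function name odds_and_risk_ratio is illustrative only.

```python
def odds_and_risk_ratio(exposed_events, exposed_total, control_events, control_total):
    """Return (odds_ratio, risk_ratio) for a 2x2 table of event counts.

    "Exposed" is the group of interest (for example, patients with a given
    diagnosis) and "control" is the comparison group.
    """
    exposed_non_events = exposed_total - exposed_events
    control_non_events = control_total - control_events

    # Both ratios are undefined if a denominator would be zero.
    if exposed_total <= 0 or control_events <= 0 or exposed_non_events <= 0:
        raise ValueError("ratios are undefined for these counts")

    # Odds ratio: (a / b) / (c / d), written as a cross-product.
    odds_ratio = (exposed_events * control_non_events) / (
        exposed_non_events * control_events
    )
    # Risk ratio: (a / (a + b)) / (c / (c + d)), written as a cross-product.
    risk_ratio = (exposed_events * control_total) / (
        exposed_total * control_events
    )
    return odds_ratio, risk_ratio


# Hypothetical counts (NOT from the thesis): 30 of 50 exposed patients and
# 10 of 50 controls experienced the event.
odds_ratio, risk_ratio = odds_and_risk_ratio(30, 50, 10, 50)
print(f"odds ratio = {odds_ratio:.2f}, risk ratio = {risk_ratio:.2f}")
# prints "odds ratio = 6.00, risk ratio = 3.00"
```

With these illustrative counts the odds ratio (6.0) is twice the risk ratio (3.0). The two measures diverge when the outcome is common in either group, which is one reason a study may report both.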

Identifier
  • etd-48571
Year
  • 2022
Date created
  • 2022-03-01
Last modified
  • 2023-10-09

Permanent link to this page: https://digital.wpi.edu/show/wd3760507