Harnessing Incomplete, Noisy, and Multi-level Labels for Classification and Annotation Tasks
Deep learning models excel in various tasks but require large amounts of accurate labels. Unfortunately, acquiring quality labels is costly and requires domain expertise. Hence, datasets tend to have missing or noisy labels. Additionally, data might be labeled on multiple levels. For instance, in detecting foodborne illness incidents from a tweet, the aim at the tweet level is to predict whether the tweet indicates illness, while the aim at the word level is to identify relevant entities such as location or food. Because labels are difficult and costly to acquire at both levels, either level may have incomplete or noisy labels. This dissertation explores three directions for handling incomplete, noisy, and multi-level labeled data. Direction 1 learns from two-level task datasets where one task has complete labels and the other has incomplete labels. We propose a novel deep learning solution that jointly learns the tasks at both levels and strikes a balance between the fully labeled and incompletely labeled tasks. Direction 2 focuses on learning from data with noisy labels. We propose a method that harnesses the Local Intrinsic Dimensionality (LID) score to detect and correct noisy labels. Direction 3 develops strategies for annotating two-level labeled data given mostly unlabeled instances. We develop a Large Language Model (LLM)-based solution that capitalizes on the relationship between the two levels and integrates multi-example retrieval methods. Our experimental studies on real-world domains demonstrate that our proposed methods outperform state-of-the-art methods for each of these difficult label-related challenges.
- Identifier: etd-121442
- Year: 2024
- Date created: 2024-04-23
- Source: etd-121442
- Last modified: 2024-05-29
Permanent link to this page: https://digital.wpi.edu/show/g732dd94c