Auditory Grouping: Using Machine Learning to Predict Locations of Groups in Music Clips (Extension)

Humans perceive a variety of features from an auditory stream: our auditory system detects frequency, pitch, dynamics, and more, and we can process music in several different ways based on these features. Machines, however, struggle to do the same. Some previous models already achieve state-of-the-art performance in predicting acoustic boundaries, but machine perception of audio segmentation from a human perspective remains an open problem. Our project uses machine learning to build a model that enables machines to separate music into segments the way humans do. The model we built produced clear grouping distinctions for audio clips in the musical genre it was trained on, but generalized poorly to other genres. We believe the model can be improved with a larger and more varied training set and higher-quality grouping-boundary labels for the data. This extension includes materials prepared to support continuation of this project by future teams, as well as an elaborated conclusion section from the original report.
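
For orientation, the following is a minimal, hypothetical sketch (not the project's actual model) of how grouping boundaries can be estimated from a music clip in Python. It uses MFCC timbre features and librosa's built-in agglomerative segmentation as a stand-in for a learned grouping predictor; the file name clip.wav and the segment count are placeholder assumptions.

    # Hypothetical sketch only; not the project's model. Estimates segment
    # ("grouping") boundary times in a music clip from MFCC features using
    # librosa's agglomerative segmentation.
    import librosa

    def estimate_boundaries(path, n_segments=8):
        # Load the clip as mono audio at librosa's default sample rate.
        y, sr = librosa.load(path, sr=22050, mono=True)
        # MFCCs give a rough per-frame description of timbre.
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
        # Cluster adjacent frames into n_segments contiguous groups and
        # convert the resulting boundary frames to times in seconds.
        bounds = librosa.segment.agglomerative(mfcc, n_segments)
        return librosa.frames_to_time(bounds, sr=sr)

    if __name__ == "__main__":
        # "clip.wav" is a placeholder path for any short audio excerpt.
        print(estimate_boundaries("clip.wav"))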

  • This report represents the work of one or more WPI undergraduate students submitted to the faculty as evidence of completion of a degree requirement. WPI routinely publishes these reports on its website without editorial or peer review.
Creator:
Publisher:
Identifier: 91621; E-project-030623-203709
Advisor:
Year: 2023
Date created: 2023-03-06
Resource type:
Major:
Source: E-project-030623-203709
Rights statement:
Last modified: 2023-04-19

Permanent link to this page: https://digital.wpi.edu/show/g158bm62m