Year of Award

2014

Degree Type

Thesis

Degree Name

Doctor of Philosophy (PhD)

Department

Department of Computer Science

Principal Supervisor

Leung, Clement H.C.

Keywords

Emotions, Emotions in music, Information storage and retrieval systems, Mathematical models, Music

Language

English

Abstract

The digital music industry has expanded dramatically over the past decades, generating enormous amounts of music data. With the Internet, a growing volume of quantitative data about users (e.g., their behaviors and preferences) can now be collected easily. Together, these factors have the potential to produce big data in the music industry. Through big data analysis of music-related data, music can be better understood semantically (e.g., by genre and emotion), and users' high-level needs, such as automatic recognition and annotation, can be satisfied. For example, commercial music companies such as Pandora, Spotify, and Last.fm have already attempted to use big data and machine learning techniques to drastically alter music search and discovery. According to theories in musicology and psychology, music can reflect our heart and soul, and emotion is the core component of music that expresses complex and conscious experience. However, there is insufficient research in this field. Given the impact of the emotion conveyed by music, retrieving and discovering useful music information at the emotion level from big music data is extremely important. Over the past decades, researchers have made great strides in automated systems for music retrieval and recommendation. Music is a temporal art that involves specific emotional expression. Yet while it is easy for human beings to recognize the emotions expressed by music, it remains a challenge for automated systems to do so. Although some significant emotion models (e.g., Hevner's adjective circle, the Arousal-Valence model, and the Pleasure-Arousal-Dominance model), built on discrete emotion theory and dimensional emotion theory, have been widely adopted in the field of emotion research, they still suffer from limitations in scalability and specificity in the music domain.
As a result, the effectiveness and availability of music retrieval and recommendation at the emotion level are still unsatisfactory. This thesis makes contributions at the theoretical, technical, and empirical levels. First, a hybrid musical emotion model named "Resonance-Arousal-Valence (RAV)" is proposed and constructed. It explores the computational and time-varying expressions of musical emotions. Building on the RAV musical emotion model, a joint emotion space model (JESM) that combines musical audio features with emotion tag features is constructed. Second, corresponding to static and time-varying representations of musical emotion, two methods of music retrieval at the emotion level are designed: (1) a unified framework for music retrieval in the joint emotion space; and (2) dynamic time warping (DTW) for music retrieval using time-varying music emotions. Automatic music emotion annotation and segmentation follow naturally from these methods. Third, following the theory of affective computing (e.g., emotion intensity decay and emotion state transition), an intelligent affective system for music recommendation is designed, in which conditional random fields (CRFs) are applied to predict the listener's dynamic emotion state based on his or her personal music listening history within a session. Finally, an experimental dataset is created and the proposed systems are implemented. Empirical results on recognition, retrieval, and recommendation accuracy are presented and compared with previous techniques, demonstrating that the proposed methods improve the effectiveness of emotion-based music retrieval and recommendation.

Keywords: Music and emotion, Music information retrieval, Music emotion recognition, Annotation and retrieval, Music recommendation, Affective computing, Time series analysis, Acoustic features, Ranking, Multi-objective optimization
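
The second retrieval method named in the abstract, dynamic time warping over time-varying emotions, can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes each track is already represented as a sequence of emotion points (for instance, one Resonance-Arousal-Valence triple per segment), uses the textbook DTW recurrence with Euclidean distance, and the names `dtw_distance` and `rank_by_emotion` as well as all sample values are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Textbook DTW cost between two sequences of emotion points.

    Each point is a tuple of floats (assumed here to be one
    emotion vector per music segment); both sequences must be non-empty.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = math.dist(a[i - 1], b[j - 1])  # Euclidean distance
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b stays
                cost[i][j - 1],      # b advances, a stays
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def rank_by_emotion(query, library):
    """Return track names ordered from most to least similar to the query."""
    return sorted(library, key=lambda name: dtw_distance(query, library[name]))


# Hypothetical emotion trajectories, invented for illustration only.
query = [(0.1, 0.2, 0.3), (0.8, 0.7, 0.6)]
library = {
    "slow_build": [(0.1, 0.2, 0.3), (0.1, 0.2, 0.3), (0.8, 0.7, 0.6)],
    "steady_sad": [(-0.9, -0.8, -0.7), (-0.9, -0.8, -0.7)],
}
ranking = rank_by_emotion(query, library)
```

Because DTW allows one sequence to be stretched against the other, a track that follows the same emotional trajectory at a different pace still scores as close to the query, which is the property that makes it a natural fit for time-varying emotion data.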

Comments

Thesis (Doctor of Philosophy)--Hong Kong Baptist University, 2014. Principal supervisor: Professor Leung Clement HC. Includes bibliographical references (pages 186-214).

