Investigating the Singing Voice: Quantitative and Qualitative Approaches to Studying Cross-Cultural Vocal Production

Proutskova, Polina. 2019. Investigating the Singing Voice: Quantitative and Qualitative Approaches to Studying Cross-Cultural Vocal Production. Doctoral thesis, Goldsmiths, University of London [Thesis]

COM_RedactedThesis_ProutskovaP_2019.pdf - Accepted Version
Available under License Creative Commons Attribution Non-commercial No Derivatives.


Abstract or Description

This thesis was motivated by an experiment carried out in the 1960s that studied the relationship between vocal performance practice and society by means of statistical analysis. Using a comprehensive corpus of audio recordings of singing from around the world collected over several decades, the ethnomusicologist Alan Lomax devised the Cantometrics project, the largest comparative study of music, in which 36 performance practice characteristics were rated for each recording. With particular interest in vocal production, we intended to formalise the knowledge of vocal production to enable statistical and computational approaches in the spirit of Cantometrics.

Three models of vocal production were investigated: the perceptual model from Cantometrics, a physical model from voice science and a physiological model from singing education. We built on Johan Sundberg's vocal source parameters and Jo Estill's physiological building blocks as the basis to develop an ontology of vocal production.

Two approaches to automated characterisation of the ontological descriptors were considered. For the incremental approach, a proof-of-concept experiment on automatic labelling of phonation modes was presented, based on reconstructing the vocal source waveform by means of inverse filtering. We created a dataset of sustained sung vowels annotated with pitch, vowel and phonation mode, on which our model was trained. Steps to generalise this experiment to more complex data were outlined, together with the challenges of such generalisation.

The integrated approach addressed the full variance in the data, turning to the methodology of expert knowledge elicitation in order to annotate the original Cantometrics dataset with our descriptors. We performed an investigative mixed-methods study in which 13 vocal physiology experts from different professional backgrounds were interviewed; they used our ontology to analyse vocal production in the Cantometrics dataset. The goals of the study were to (a) validate acceptance of our ontological terms, (b) verify consensus between experts on the values of the descriptors, and (c) collect reliable annotations. While acceptance of the ontology was good for most terms, quantitative analysis showed good agreement between experts for only two out of 11 descriptors (larynx height and aryepiglottic sphincter). A detailed qualitative analysis of the interview data (over 33 hours) was followed by a meta-analysis extracting common themes and confounding issues, which point to probable reasons for the disagreement. For aryepiglottic sphincter and larynx height we collected the average ratings, which constitute the first set of reliable annotations on vocal production. A strong correlation was found between larynx height and the vocal width parameter from Cantometrics; larynx height was therefore a good candidate to replace vocal width as a more objective descriptor.
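As a rough illustration of the last step described above, the sketch below averages several experts' ratings per recording and correlates the result with a second rating scale. It is not the analysis used in the thesis: the abstract does not state which correlation or agreement statistic was applied, so Pearson's r is an assumption here, and the function names and data layout are hypothetical.

```python
from math import sqrt
from statistics import mean


def average_ratings(ratings_by_expert):
    """Average ratings per recording.

    ratings_by_expert: one list per expert, all aligned so that
    position i refers to the same recording (hypothetical layout).
    """
    lengths = {len(r) for r in ratings_by_expert}
    if len(lengths) != 1:
        raise ValueError("every expert must rate the same recordings")
    # zip(*...) turns per-expert rows into per-recording columns.
    return [mean(column) for column in zip(*ratings_by_expert)]


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)
```

With real data, `average_ratings` would be applied to the experts' larynx height ratings and the result passed to `pearson` together with the Cantometrics vocal width codes for the same recordings. Since both scales are ordinal, a rank-based coefficient might be a better fit than Pearson's r.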

The current work was based on knowledge from a number of research disciplines, and its results are discussed from the viewpoint of several fields, namely music information retrieval (MIR), vocal pedagogy and Cantometrics, for which they present significant implications. Future research is suggested for each of the fields. Based on the meta-analysis, we account for the reasons for disagreement between experts on the subject of vocal production, from MIR and singing education perspectives. We further explain the various kinds of bias that affect raters.

We conclude that vocal physiology, though offering a more objective language than perceptual descriptors, is not well-suited as an ontological middle layer for statistical approaches to singing given the current state of knowledge. A mixed perceptual-objective path to ontology building is suggested and ways to collect reliable annotations are outlined.

In the domain of vocal pedagogy we touch on the issue of communication about vocal physiology, both between experts and between teacher and student; we consider the future of teaching vocal technique and make suggestions for new experiments in the field.

A plan is presented for revising and scaling up Cantometrics as an interdisciplinary collaboration. Possible contributions of MIR researchers, ethnomusicologists and vocal production specialists are specified.

Item Type:

Thesis (Doctoral)

Additional Information:

This is an edited version of the thesis, with pictures removed.

Keywords:

singing voice, ontology, Cantometrics

Departments, Centres and Research Units:

Computing

Date:

28 February 2019

Item ID:

26133

Date Deposited:

01 Apr 2019 13:20

Last Modified:

07 Sep 2022 17:14

URI:

https://research.gold.ac.uk/id/eprint/26133
