Audio-Visual Sound Separation Using Hidden Markov Models

Casey, Michael A. and Hershey, J. 2002. 'Audio-Visual Sound Separation Using Hidden Markov Models'. In: Advances in Neural Information Processing Systems. 1 January 2002. [Conference or Workshop Item]

No full text available

Item Type:

Conference or Workshop Item (Paper)

Additional Information:

Originality: presents a major advance over the state of the art in single-channel, multi-speaker audio source separation. It is also one of the first major publications to introduce a joint audio-visual approach to separating individual speakers' speech from a mixture. Rigour: Bayesian modeling yields an Expectation-Maximization algorithm for inference in factorial (coupled) hidden Markov models of individual speakers using audio-visual features. Significance: the NIPS acceptance rate is consistently below 30%, and the conference is a primary venue for research in machine learning and audio. Funded by Mitsubishi Electric Research, this research is used in their general audio products.

Departments, Centres and Research Units:



January 2002 (Published)

Date range:


Item ID:


Date Deposited:

12 Mar 2009 15:41

Last Modified:

20 Jun 2017 09:43

