Speaker discriminability for visual speech modes
dc.contributor.author | Kim, J. | |
dc.contributor.author | Davis, C. | |
dc.contributor.author | Kroos, Christian | |
dc.contributor.author | Hill, H. | |
dc.contributor.editor | - | |
dc.date.accessioned | 2017-01-30T10:39:41Z | |
dc.date.available | 2017-01-30T10:39:41Z | |
dc.date.created | 2015-07-16T06:21:55Z | |
dc.date.issued | 2009 | |
dc.identifier.citation | Kim, J. and Davis, C. and Kroos, C. and Hill, H. 2009. Speaker discriminability for visual speech modes, in 10th Annual Conference of the International Speech Communication Association INTERSPEECH 2009, Sep 6-9 2009, pp. 2259-2262. Brighton, UK: ISCA. | |
dc.identifier.uri | http://hdl.handle.net/20.500.11937/4521 | |
dc.description.abstract | Does speech mode affect recognizing people from their visual speech? We examined 3D motion data from 4 talkers saying 10 sentences (twice). Speech was produced in noise, in quiet, or whispered. Principal Component Analyses (PCAs) were conducted and speaker classification was determined by Linear Discriminant Analysis (LDA). The first five PCs for the rigid motion and the first 10 PCs each for the non-rigid motion and the combined motion were input to a series of LDAs covering all possible combinations of the retained PCs. The discriminant functions and classification coefficients were determined on the training data and used to predict the talker of the test data. Classification performance for both the in-noise and whispered speech modes was superior to that for the in-quiet mode. This superiority held even when only the first PC (jaw motion) was used; that is, measures of jaw motion when speaking in noise or whispering hold promise for bimodal person recognition or verification. | |
dc.publisher | ISCA | |
dc.relation.uri | http://www.isca-speech.org/archive/archive_papers/interspeech_2009/papers/i09_2259.pdf | |
dc.subject | Visual speech | |
dc.subject | speech modes | |
dc.subject | speaker recognition | |
dc.title | Speaker discriminability for visual speech modes | |
dc.type | Conference Paper | |
dcterms.source.startPage | 2259 | |
dcterms.source.endPage | 2262 | |
dcterms.source.issn | 1990-9772 | |
dcterms.source.title | Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech 2009) | |
dcterms.source.series | Proceedings of the 10th Annual Conference of the International Speech Communication Association (Interspeech 2009) | |
dcterms.source.conference | INTERSPEECH 2009 | |
dcterms.source.conference-start-date | Sep 6 2009 | |
dcterms.source.conferencelocation | Brighton, UK | |
dcterms.source.place | Australia | |
curtin.accessStatus | Fulltext not available |
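Below is a minimal sketch of the PCA-then-LDA speaker-classification pipeline described in the abstract above. The data shapes, the random placeholder features, the train/test split, and the choice of 10 retained components are illustrative assumptions, not details taken from the paper.

```python
# Illustrative sketch (not the authors' code): reduce motion features with PCA,
# then classify the talker with Linear Discriminant Analysis.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# Hypothetical motion features: rows = utterances, columns = flattened 3D
# motion measurements; labels give the talker for each utterance.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(40, 60))      # e.g. 4 talkers x 10 sentences
y_train = np.repeat(np.arange(4), 10)
X_test = rng.normal(size=(40, 60))       # second repetition used as test set (assumed split)
y_test = np.repeat(np.arange(4), 10)

# Retain a small number of principal components (the paper keeps 5 PCs for
# rigid motion and 10 each for non-rigid and combined motion).
pca = PCA(n_components=10).fit(X_train)
Z_train = pca.transform(X_train)
Z_test = pca.transform(X_test)

# Fit the discriminant functions on the training data and predict the
# talker of each test utterance.
lda = LinearDiscriminantAnalysis().fit(Z_train, y_train)
accuracy = lda.score(Z_test, y_test)
print(f"speaker classification accuracy: {accuracy:.2f}")
```

The paper additionally runs a series of LDAs over all possible subsets of the retained PCs; that exhaustive search could be reproduced by looping the fit/score step over combinations of columns of Z_train and Z_test.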