Applying Convolutional Neural Networks concepts to hybrid NN-HMM model for speech recognition, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.4277-4280, 2012. ,
Open Vocabulary ASR for Audiovisual Document Indexation, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.1013-1016, 2005. ,
Speech Recognition of Aged Voices in the AAL Context: Detection of Distress Sentences, The 7th International Conference on Speech Technology and Human-Computer Dialogue (SpeD), pp.177-184, 2013. ,
URL : https://hal.archives-ouvertes.fr/hal-00953248
Automatic Speech Recognition: A Review, International Journal of Computer Applications, vol.60, issue.9, pp.34-44, 2012. ,
Automatic modeling for adding new words to a large-vocabulary continuous speech recognition system, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). T. 1, pp.305-308, 1991. ,
The DRAGON system--An overview, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.23, issue.1, pp.24-29, 1975. ,
DOI : 10.1109/TASSP.1975.1162650
Automatic Detection of the Prosodic Structures of Speech Utterances, Speech and Computer. T. 8113, pp.1-8, 2013. ,
On the Syllabification of Phonemes, Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics, pp.308-316, 2009. ,
An inequality with applications to statistical estimation for probabilistic functions of Markov processes and to a model for ecology, Bulletin of the American Mathematical Society, vol.73, issue.3, pp.360-363, 1967. ,
DOI : 10.1090/S0002-9904-1967-11751-8
Statistical Inference for Probabilistic Functions of Finite State Markov Chains, The Annals of Mathematical Statistics, vol.37, issue.6, pp.1554-1563, 1966. ,
DOI : 10.1214/aoms/1177699147
Neural Probabilistic Language Models, The Journal of Machine Learning Research, vol.3, pp.1137-1155, 2003. ,
DOI : 10.1007/3-540-33486-6_6
URL : https://hal.archives-ouvertes.fr/hal-01434258
Annotation automatique en syllabes d'un dialogue oral spontané, Journées d'Étude sur la Parole (JEP), pp.1-4, 2010. ,
Open vocabulary speech recognition with flat hybrid models, Proceedings of Interspeech, pp.725-728, 2005. ,
Any questions? Automatic question detection in meetings, IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.485-489, 2009. ,
URL : https://hal.archives-ouvertes.fr/hal-01194279
Class-Based n-gram Models of Natural Language, Computational Linguistics 18, pp.467-479, 1992. ,
A speech event detection and localization task for multiroom environments, Workshop on Handsfree Speech Communication and Microphone Arrays (HSCMA), pp.157-161, 2014. ,
BDLEX : a Lexicon for Spoken and Written French, Proceedings of the International Conference on Language Resources and Evaluation (LREC), pp.1129-1136, 1998. ,
Speech recognition in French with a very large dictionary, Proceedings of Eurospeech, pp.2150-2153, 1989. ,
Ridge estimators in Logistic Regression, Applied Statistics, vol.41, issue.1, pp.191-201, 1992. ,
Institution des sourds et muets par la voie des signes méthodiques, 1776. ,
Structured language modeling, Computer Speech & Language 14, pp.283-332, 2000. ,
An Empirical Study of Smoothing Techniques for Language Modeling, 1998. ,
The BBN BYBLOS Continuous Speech Recognition system, Proceedings of the workshop on Speech and Natural Language, HLT '89, pp.89-92, 1987. ,
DOI : 10.3115/100964.100968
LIPCOM, prototype d'aide automatique à la réception de la parole par les personnes sourdes, pp.36-40, 1999. ,
Tessa, a system to aid communication with deaf people, Proceedings of the fifth international ACM conference on Assistive technologies, pp.205-212, 2002. ,
Similarity-Based Models of Word Cooccurrence Probabilities, Machine Learning, vol.34, issue.1-3, 1999. ,
Context-Dependent Pre-trained Deep Neural Networks for Large Vocabulary Speech Recognition, IEEE Transactions on Audio, Speech and Language Processing, 2012. ,
Automatic Recognition of Spoken Digits, The Journal of the Acoustical Society of America, vol.24, issue.6, pp.637-642, 1952. ,
Comparison of Parametric Representations for Monosyllabic Word Recognition in Continuously Spoken Sentences, IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.28, issue.4, pp.357-366, 1980. ,
Language Model Adaptation, Computational Models of Speech Pattern Processing. T. 169, pp.280-303, 1999. ,
Spoken digit recognition using time-frequency pattern matching, The Journal of the Acoustical Society of America, vol.32, issue.11, pp.1450-1455, 1960. ,
The EPAC corpus: manual and automatic annotations of conversational speech in French broadcast news, Proceedings of the International Conference on Language Resources and Evaluation (LREC), 2010. ,
URL : https://hal.archives-ouvertes.fr/hal-01433895
CoALT: A Software for Comparing Automatic Labelling Tools, Proceedings of the International Conference on Language Resources and Evaluation (LREC), 2012. ,
Experiments with a new boosting algorithm, Thirteenth International Conference on Machine Learning, pp.148-156, 1996. ,
Audio-visual speech recognition incorporating facial depth information captured by the Kinect, Proceedings of the 20th European Signal Processing Conference (EUSIPCO), pp.2714-2717, 2012. ,
The ESTER 2 evaluation campaign for rich transcription of French broadcasts, Proceedings of Interspeech. ,
Syllable-based large vocabulary continuous speech recognition, IEEE Transactions on Speech and Audio Processing 9, pp.358-366, 2001. ,
Extracting deep bottleneck features using stacked auto-encoders, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pp.3377-3381, 2013. ,
The ETAPE corpus for the evaluation of speech-based TV content processing in the French language, Proceedings of the International Conference on Language Resources, Evaluation and Corpora (LREC), 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00712591
Probabilistic and bottle-neck features for LVCSR of meetings, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). T. 4, pp.757-760, 2007. ,
The WEKA Data Mining Software: An Update, SIGKDD Explorations, vol.11, issue.1, pp.10-18, 2009. ,
Syllable-Length Acoustic Units in Large-Vocabulary Continuous Speech Recognition, Proceedings of SPECOM, pp.499-502, 2005. ,
A fast learning algorithm for deep belief nets, Neural Computation, vol.18, issue.7, pp.1527-1554, 2006. ,
Analysis of speaker variability, Proceedings of Interspeech, pp.1377-1380, 2001. ,
PocketSphinx: A free, real-time continuous speech recognition system for hand-held devices, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2006. ,
Grapheme-to-Phoneme Conversion using Conditional Random Fields, Proceedings of Interspeech, pp.2313-2316, 2011. ,
URL : https://hal.archives-ouvertes.fr/inria-00614981
Bag-of-Words Input for Long History Representation in Neural Network-Based Language Models for Speech Recognition, Proceedings of Interspeech, 2015. ,
Design of a linguistic statistical decoder for the recognition of continuous speech, IEEE Transactions on Information Theory, vol.21, issue.3, pp.250-256, 1975. ,
Self-organized language modeling for speech recognition, Readings in Speech Recognition, pp.450-506, 1990. ,
A Dynamic Language Model for Speech Recognition, Proceedings of the Workshop on Speech and Natural Language, pp.293-295, 1991. ,
Phonetic speaker identification, Proceedings of Interspeech, 2002. ,
Evaluating grapheme-to-phoneme converters in automatic speech recognition context, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4821-4824, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00753364
Combining forward-based and backward-based decoders for improved speech recognition performance, Proceedings of Interspeech, 2013. ,
A machine learning based approach for vocabulary selection for speech transcription, Proceedings of the 16th International Conference on Text, Speech and Dialogue (TSD). T. 8082, pp.60-67, 2013. ,
Automatic speech recognition and understanding: A first step toward natural human machine communication, Proceedings of the IEEE, vol.88, issue.8, pp.1142-1165, 2000. ,
Automatic detection of discourse structure for speech recognition and understanding, IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.88-95, 1997. ,
Pattern Recognition in Speech and Language Processing, 2003. ,
Using the Web to Obtain Frequencies for Unseen Bigrams, Computational Linguistics, vol.29, issue.3, pp.459-484, 2003. ,
Estimating confidence using word lattices, Proceedings of Eurospeech, 1997. ,
A Preliminary Study Of Prosody-based Detection Of Questions In Arabic Speech Monologues, Arabian Journal for Science and Engineering, vol.35, issue.2C, pp.167-181, 2010. ,
Power-normalized cepstral coefficients (PNCC) for robust speech recognition, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4101-4104, 2012. ,
On Development of Consistently Punctuated Speech Corpora, Proceedings of Interspeech, pp.833-836, 2011. ,
Sentence modality recognition in French based on prosody, International Conference on Enformatika, Systems Sciences and Engineering (ESSE 2005), vol.8, pp.185-188, 2005. ,
URL : https://hal.archives-ouvertes.fr/hal-00013968
On Information and Sufficiency, The Annals of Mathematical Statistics 22.1, pp.79-86, 1951. ,
Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition, Speech communication 26, pp.283-297, 1998. ,
Microphone Array Processing for Distant Speech Recognition: From Close-Talking Microphones to Far-Field Sensors, IEEE Signal Processing Magazine, vol.29, issue.6, pp.127-140, 2012. ,
The Sounds of the World's Languages, 1996. ,
Reconnaissance automatique de phonèmes guidée par les syllabes, 2006. ,
Automatically finding semantically consistent n-grams to add new words in LVCSR systems, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4676-4679, 2011. ,
URL : https://hal.archives-ouvertes.fr/hal-00645223
Acoustic modeling for large vocabulary speech recognition, Computer Speech and Language, vol.4, issue.2, pp.127-165, 1990. ,
An overview of the SPHINX speech recognition system, IEEE Transactions on Acoustics, Speech and Signal Processing, vol.38, issue.1, pp.35-45, 1990. ,
A Retrospective View of the HEARSAY-II Architecture, Proceedings of the Fifth International Joint Conference on Artificial Intelligence, pp.790-800, 1977. ,
Detecting question-bearing turns in spoken tutorial dialogues, Proceedings of Interspeech, 2006. ,
Context dependent language model adaptation, Proceedings of Interspeech, 2008. ,
Methodology for developing an advanced communications system for the Deaf in a new domain, Knowledge-Based Systems 56, pp.240-252, 2014. ,
Evaluation of a noise-robust DSR front-end on Aurora databases, Proceedings of Interspeech, 2002. ,
Question detection in spoken conversations using textual conversations, pp.118-124, 2011. ,
Beyond SIRI: Exploring Spoken Language in Warehouse Operations, Offender Monitoring and Robotics, Mobile Speech and Advanced Natural Language Solutions, pp.3-21, 2013. ,
Automatic estimation of language model parameters for unseen words using morpho-syntactic contextual information, Proceedings of Interspeech, pp.1602-1605, 2008. ,
A logical calculus of the ideas immanent in nervous activity, The Bulletin of Mathematical Biophysics 5, pp.115-133, 1943. ,
French Gigaword third edition, Proceedings of the Linguistic Data Consortium, 2011. ,
Recurrent neural network based language model, Proceedings of Interspeech, pp.1045-1048, 2010. ,
Empirical Evaluation and Combination of Advanced Language Modeling Techniques, Proceedings of Interspeech. ISCA, 2011. ,
Extensions of recurrent neural network language model, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pp.5528-5531, 2011. ,
Efficient Estimation of Word Representations in Vector Space, 2013. ,
Philosophy: The Power Of Ideas: Ninth Edition, 2013. ,
Morpheme Based Factored Language Models for German LVCSR, Proceedings of Interspeech, pp.1445-1448, 2011. ,
Class-Based N-Gram Language Model for New Words Using Out-of-Vocabulary to In-Vocabulary Similarity, IEICE Transactions on Information and Systems E95-D.9, pp.2308-2317, 2012. ,
On the estimation of 'small' probabilities by leaving-one-out, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.17, issue.12, pp.1202-1212, 1995. ,
Speech recognition technology for individuals with disabilities, Augmentative and Alternative Communication 8.4, pp.297-303, 1992. ,
Comparison and Analysis of Several Phonetic Decoding Approaches, Proceedings of the 16th International Conference on Text, Speech and Dialogue (TSD), 2013. ,
PrefixSpan: Mining Sequential Patterns Efficiently by Prefix-Projected Pattern Growth, International Conference on Data Engineering, pp.215-224, 2001. ,
Qualitative investigation of the display of speech recognition results for communication with deaf people, Workshop on Speech and Language Processing for Assistive Technologies (SLPAT), 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01183349
Language Model Adaptation Using Different Class-Based Models, Proceedings of SPECOM, pp.449-454, 2007. ,
A decision tree-based method for speech processing: question sentence detection, Proceedings of the Third international conference on Fuzzy Systems and Knowledge Discovery, pp.1205-1212, 2006. ,
Automatic question detection: prosodic-lexical features and crosslingual experiments, Proceedings of Interspeech, pp.2257-2260, 2007. ,
Speaker independent recognition of isolated words using clustering techniques, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 4, pp.574-577, 1979. ,
Fundamentals of Speech Recognition, 1993. ,
A new method for OOV detection using hybrid word/fragment system, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pp.3953-3956, 2009. ,
Towards using hybrid word and fragment units for vocabulary independent LVCSR systems, Proceedings of Interspeech, pp.1931-1934, 2009. ,
Transcription automatique pour malentendants : amélioration à l'aide de mesures de confiance locales, 2008. ,
Sentence Boundary Detection: A Long Solved Problem, Proceedings of COLING, pp.985-994, 2012. ,
Multimodal human communication -Targeting facial expressions, speech content and prosody, pp.2346-2356, 2012. ,
A training procedure for verifying string hypotheses in continuous speech recognition, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). T. 1, pp.281-284, 1995. ,
The multilayer perceptron as an approximation to a Bayes optimal discriminant function, IEEE Transactions on Neural Networks 1.4, pp.296-298, 1990. ,
Long short-term memory recurrent neural network architectures for large scale acoustic modeling, Proceedings of Interspeech, pp.338-342, 2014. ,
Using phase spectrum information for improved speech recognition performance, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). T. 1, pp.133-136, 2001. ,
Hybrid Language Models Using Mixed Types of Sub-Lexical Units for Open Vocabulary German LVCSR, Proceedings of Interspeech, pp.1441-1444, 2011. ,
Using morpheme and syllable based sub-words for Polish LVCSR, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pp.4680-4683, 2011. ,
Voice control of a powered wheelchair, IEEE Transactions on Neural Systems and Rehabilitation Engineering 10.2, pp.122-125, 2002. ,
DOI : 10.1109/TNSRE.2002.1031981
Dimensionality reduction for speech recognition using neighborhood components analysis, Proceedings of Interspeech, pp.1158-1161, 2007. ,
The ETSI extended distributed speech recognition (DSR) standards: client side processing and tonal language recognition evaluation, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). T. 1, pp.129-132, 2004. ,
Towards better language models for spontaneous speech, The 3rd International Conference on Spoken Language Processing (ICSLP). ISCA, 1994. ,
Hybrid acoustic models for distant and multichannel large vocabulary speech recognition, IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.285-290, 2013. ,
Sub-word modeling of out of vocabulary words in spoken term detection, Spoken Language Technology Workshop (SLT), pp.273-276, 2008. ,
Using morphemes in language modeling and automatic speech recognition of Amharic, Natural Language Engineering, pp.235-259, 2014. ,
Syllable-based and hybrid acoustic models for Amharic speech recognition, Workshop on Spoken Language Technologies for Under-Resourced Languages (SLTU), pp.5-10, 2012. ,
URL : https://hal.archives-ouvertes.fr/hal-00954042
Speech recognition technology for disabilities education, Journal of Educational Technology Systems, vol.33, issue.2, pp.173-184, 1994. ,
Recurrent type-2 fuzzy neural network using Haar wavelet energy and entropy features for speech detection in noisy environments, Expert systems with applications 39.3, pp.2479-2488, 2012. ,
From Frequency to Meaning: Vector Space Models of Semantics, Journal of Artificial Intelligence Research, vol.37, issue.1, pp.141-188, 2010. ,
Multichannel Automatic Recognition of Voice Command in a Multi-Room Smart Home : an Experiment involving Seniors and Users with Visual Impairment, Proceedings of Interspeech, pp.1008-1012, 2014. ,
URL : https://hal.archives-ouvertes.fr/hal-01003492
Decoding with Large-Scale Neural Language Models Improves Translation, pp.1387-1392, 2013. ,
Audio source localization by optimal control of a mobile robot, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2015. ,
URL : https://hal.archives-ouvertes.fr/hal-01103949
The String-to-String Correction Problem, Journal of the ACM, vol.21, issue.1, pp.168-173, 1974. ,
Phoneme recognition using time-delay neural networks, IEEE Transactions on Acoustics, Speech and Signal Processing, vol.37, issue.3, pp.328-339, 1989. ,
Letter-to-sound pronunciation prediction using conditional random fields, Signal Processing Letters 18, 2011. ,
Linguistic constraints in hidden Markov model based speech recognition, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). T. 2, pp.699-702, 1989. ,
Neural-network based measures of confidence for word recognition, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.887-890, 1997. ,
A Comparison Of Word Graph And N-Best List Based Confidence Measures, Proceedings of Eurospeech, pp.315-318, 1999. ,
Confidence Measures for Large Vocabulary Continuous Speech Recognition, IEEE Transactions on Speech and Audio Processing 9, pp.288-298, 2001. ,
The HWIM speech understanding system, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2, pp.784-787, 1977. ,
Channel Selection Measures for Multi-Microphone Speech Recognition, Speech Communication, vol.57, pp.170-180, 2013. ,
Incorporating Information From Syllable-Length Time Scales Into Automatic Speech Recognition, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.721-724, 1998. ,
Random forests and the data sparseness problem in language modeling, Computer Speech & Language 21.1, pp.105-152, 2007. ,
An approach to automatic language identification based on language-dependent phone recognition, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.3511-3514, 1995. ,
Hybrid language models for out of vocabulary word detection in large vocabulary conversational speech recognition, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, pp.745-748, 2004. ,
The HTK Book Version 3.4, 2006. ,
Improved Bottleneck Features Using Pretrained Deep Neural Networks, Proceedings of Interspeech, pp.237-240, 2011. ,
Detection of questions in Chinese conversational speech, IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), pp.47-52, 2005. ,
Word level confidence annotation using combinations of features, Proceedings of Eurospeech, 2001. ,
Acoustic Feature Combination for Robust Speech Recognition, Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). T. 1, pp.457-460, 2005. ,
The MIT SUMMIT Speech Recognition System: A Progress Report, Proceedings of the Workshop on Speech and Natural Language, pp.179-189, 1989. ,