Abe, M., S. Nakamura, K. Shikano, and H. Kuwabara, «Voice conversion through vector quantization», Proc. ICASSP, pp.655-658, 1988.

Ali, R. H. and S. B. Jebara, «Esophageal speech enhancement using excitation source synthesis and formant patterns modification», Proc. Int. Conf. on Signal-Image Technology & Internet Based Systems (SITIS), pp.315-324, 2006.

Bahl, L., P. Brown, P. V. de Souza, and R. Mercer, «Maximum mutual information estimation of hidden Markov model parameters for speech recognition», Proc. ICASSP '86, IEEE International Conference on Acoustics, Speech, and Signal Processing, pp.49-52, 1986.
DOI : 10.1109/ICASSP.1986.1169179

Bahl, L., P. Brown, P. V. de Souza, and R. Mercer, «A tree-based statistical language model for natural language speech recognition», IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.37, issue.7, pp.1001-1008, 1989.
DOI : 10.1109/29.32278

Baker, J., «The DRAGON system - an overview», IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.23, issue.1, pp.24-29, 1975.
DOI : 10.1109/tassp.1975.1162650

Baum, L. E., «An inequality and associated maximization technique in statistical estimation for probabilistic functions of Markov processes», Inequalities, pp.1-8, 1972.

Bellandese, M. H., J. W. Lerman, and H. R. Gilbert, «An acoustic analysis of excellent female esophageal, tracheoesophageal, and laryngeal speakers», Journal of Speech, Language, and Hearing Research, vol.44, issue.6, pp.1315-1320, 2001.

Espy-Wilson, C. Y., V. Chari, J. MacAuslan, C. Huang et al., «Enhancement of electrolaryngeal speech by adaptive filtering», Journal of Speech, Language, and Hearing Research, vol.41, issue.6, pp.1253-1264, 1998.

Cole, D., S. Sridharan, and M. Geva, «Application of noise reduction techniques for alaryngeal speech enhancement», Speech & Image Process. for Computing & Telecommun, pp.491-494, 1997.

Davis, S. and P. Mermelstein, «Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences», IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.28, issue.4, pp.357-366, 1980.

Dempster, A., N. Laird, and D. Rubin, «Maximum likelihood from incomplete data via the EM algorithm», Journal of the Royal Statistical Society. Series B (Methodological), vol.39, issue.1, pp.1-38, 1977.

Desai, S., A. W. Black, B. Yegnanarayana, and K. Prahallad, «Spectral mapping using artificial neural networks for voice conversion», IEEE Transactions on Audio, Speech, and Language Processing, vol.18, issue.5, pp.954-964, 2010.

Doi, H., T. Toda, K. Nakamura, H. Saruwatari et al., «Alaryngeal speech enhancement based on one-to-many eigenvoice conversion», IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol.22, issue.1, pp.172-183, 2014.

Garcia, B., J. Vicente, and E. Aramendi, «Time-spectral technique for esophageal speech regeneration», Biosignal: Analysis of biomedical signals and images, pp.113-116, 2002.

Garofolo, J. S., L. F. Lamel, W. M. Fisher, J. G. Fiscus et al., The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CDROM, pp.31-68, 1993.

Gauvain, J.-L. and C.-H. Lee, «Maximum a posteriori estimation for multivariate Gaussian mixture observations of Markov chains», IEEE Transactions on Speech and Audio Processing, vol.2, issue.2, pp.291-298, 1994.

Haeb-Umbach, R. and H. Ney, «Linear discriminant analysis for improved large vocabulary continuous speech recognition», Proc. ICASSP, pp.13-16, 1998.

Hisada, A. and H. Sawada, «Real-time clarification of esophageal speech using a comb filter», International Conference on Disability, pp.39-46, 2002.

Jelinek, F. and R. L. Mercer, «Interpolated estimation of Markov source parameters from sparse data», Proc. Workshop Pattern Recognition in Practice, pp.381-397, 1980.

Jelinek, F., R. L. Mercer, L. R. Bahl, and J. K. Baker, «Perplexity - a measure of the difficulty of speech recognition tasks», Journal of the Acoustical Society of America, vol.62, 1977.

Jouvet, D., L. Mauuary, and J. Monné, «Automatic adjustments of the structure of Markov models for speech recognition applications», Proc. EUROSPEECH 91, pp.927-930, 1991.

Juang, B. and L. Rabiner, «Mixture autoregressive hidden Markov models for speech signals», IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.33, issue.6, pp.1404-1413, 1985.

Kain, A. and M. Macon, «Spectral voice conversion for text-to-speech synthesis», Proc. ICASSP, pp.285-288, 1998.

Kawahara, H., I. Masuda-Katsuse, and A. de Cheveigné, «Restructuring speech representations using a pitch-adaptive time-frequency smoothing and an instantaneous-frequency-based F0 extraction: Possible role of a repetitive structure in sounds», Speech Communication, vol.27, issue.3, pp.187-207, 1999.

Kuhn, R. and R. De Mori, «A cache-based natural language model for speech recognition», IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.12, issue.6, pp.570-583, 1990.

Kumar, N. and A. G. Andreou, «Heteroscedastic discriminant analysis and reduced rank HMMs for improved speech recognition», Speech Communication, vol.26, issue.4, pp.283-297, 1998.

Lachhab, O., J. Di Martino, E. H. Ibn Elhaj, and A. Hammouch, «Real time context-independent phone recognition using a simplified statistical training algorithm», 3rd International Conference on Multimedia Computing and Systems (ICMCS'12), pp.31-36, 2012.

Lachhab, O., J. Di Martino, E. H. Ibn Elhaj, and A. Hammouch, «Improving the recognition of pathological voice using the discriminant HLDA transformation», Third IEEE International Colloquium in Information Science and Technology (CIST), pp.370-373, 2014.

Lachhab, O., J. Di Martino, E. H. Ibn Elhaj, and A. Hammouch, «A preliminary study on improving the recognition of esophageal speech using a hybrid system based on statistical voice conversion», SpringerPlus, pp.1-14, 2015.

Lamel, L. and J.-L. Gauvain, «High performance speaker-independent phone recognition using CDHMM», Proc. Eurospeech, pp.121-124, 1993.

Laures, J. S. and K. Bunton, «Perceptual effects of a flattened fundamental frequency at the sentence level under different listening conditions», Journal of Communication Disorders, vol.36, issue.6, pp.449-464, 2003.

Laures, J. S. and G. Weismer, «The effects of a flattened fundamental frequency on intelligibility at the sentence level», Journal of Speech, Language, and Hearing Research, vol.42, issue.5, pp.1148-1156, 1999.

Lee, K. and H. Hon, «Speaker-independent phone recognition using hidden Markov models», IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.37, issue.11, pp.1641-1648, 1989.

Lee, K., H. Hon, and R. Reddy, «An overview of the SPHINX speech recognition system», IEEE Transactions on Acoustics, Speech, and Signal Processing, vol.38, issue.1, pp.35-45, 1990.

Linde, Y., A. Buzo, and R. Gray, «An algorithm for vector quantizer design», IEEE Transactions on Communications, vol.28, issue.1, pp.84-95, 1980.

Loscos, A. and J. Bonada, «Esophageal voice enhancement by modeling radiated pulses in frequency domain», Proceedings of 121st Convention of the Audio Engineering Society, pp.3-6, 2006.

MacQueen, J., «Some methods for classification and analysis of multivariate observations», in L. M. Le Cam and J. Neyman (eds.), Proc. 5th Berkeley Symposium on Math, pp.281-99, 1967.

Mantilla-Caeiros, A., M. Nakano-Miyatake, and H. Perez-Meana, «A pattern recognition based esophageal speech enhancement system», Journal of Applied Research and Technology, vol.8, issue.1, pp.56-71, 2010.

Markel, J. D. and A. H. Gray, «Linear prediction of speech», 1976.

Matsui, K., N. Hara, N. Kobayashi, and H. Hirose, «Enhancement of esophageal speech using formant synthesis», Proc. ICASSP, pp.1831-1834, 1999.

Ming, J. and F. J. Smith, «Improved phone recognition using Bayesian triphone models», International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp.409-412, 1998.

Moulines, E. and F. Charpentier, «Pitch-synchronous waveform processing techniques for text-to-speech synthesis using diphones», Speech Communication, vol.9, issue.5, pp.453-467, 1990.

Nakamura, K., T. Toda, H. Saruwatari, and K. Shikano, «Speaking-aid systems using GMM-based voice conversion for electrolaryngeal speech», Speech Communication, vol.54, issue.1, pp.134-146, 2012.

Narendranath, M., H. Murthy, S. Rajendran, and B. Yegnanarayana, «Transformation of formants for voice conversion using artificial neural networks», Speech Communication, vol.16, issue.2, pp.207-216, 1995.

Bi, N. and Y. Qi, «Application of speech conversion to alaryngeal speech enhancement», IEEE Transactions on Speech and Audio Processing, vol.5, issue.2, pp.97-105, 1997.

Normandin, Y., R. Cardin, and R. De Mori, «High-performance connected digit recognition using maximum mutual information estimation», IEEE Transactions on Speech and Audio Processing, vol.2, issue.2, pp.299-311, 1994.

Ohtani, Y., T. Toda, H. Saruwatari, and K. Shikano, «Maximum likelihood voice conversion based on GMM with STRAIGHT mixed excitation», Proc. Interspeech, pp.2266-2269, 2006.

Pravena, D., S. Dhivya, and A. Durga Devi, «Pathological voice recognition for vocal fold disease», International Journal of Computer Applications, vol.47, issue.60, pp.31-37, 2012.

Qi, Y. and B. Weinberg, «Low-frequency energy deficit in electrolaryngeal speech», Journal of Speech and Hearing Research, vol.34, issue.6, pp.1250-1256, 1991.

Kazi, R. A., V. M. Prasad, J. Kanagalingam, C. M. Nutting et al., «Assessment of the formant frequencies in normal and laryngectomized individuals using linear predictive coding», Journal of Voice, vol.21, issue.6, pp.661-668, 2007.

Robinson, T. and F. Fallside, «A recurrent error propagation network speech recognition system», Computer Speech and Language, vol.5, issue.3, pp.259-274, 1991.

Rumelhart, D. E., G. E. Hinton, and R. J. Williams, «Learning internal representations by error propagation», Parallel Distributed Processing: Explorations in the Microstructure of Cognition, vol.1, pp.318-362, 1986.

Sakoe, H. and S. Chiba, «A dynamic programming approach to continuous speech recognition», Proc. 7th Int. Congr. on Acoustics, pp.65-68, 1971.

Sharifzadeh, H. R., I. V. McLoughlin, and F. Ahmadi, «Reconstruction of normal sounding speech for laryngectomy patients through a modified CELP codec», IEEE Transactions on Biomedical Engineering, vol.57, issue.10, pp.2448-2458, 2010.

Stylianou, Y., O. Cappé, and E. Moulines, «Continuous probabilistic transform for voice conversion», IEEE Transactions on Speech and Audio Processing, vol.6, issue.2, pp.131-142, 1998.

Tanaka, K., T. Toda, G. Neubig, S. Sakti et al., «A hybrid approach to electrolaryngeal speech enhancement based on noise reduction and statistical excitation generation», IEICE Transactions on Information and Systems, vol.E97-D, issue.6, pp.1429-1437, 2014.

Toda, T., A. W. Black, and K. Tokuda, «Voice conversion based on maximum-likelihood estimation of spectral parameter trajectory», IEEE Transactions on Audio, Speech, and Language Processing, vol.15, issue.8, pp.2222-2235, 2007.

Toda, T., K. Nakamura, H. Sekimoto, and K. Shikano, «Voice conversion for various types of body transmitted speech», Proc. ICASSP, pp.285-288, 2009.

Toda, T., Y. Ohtani, and K. Shikano, «Eigenvoice conversion based on Gaussian mixture model», Proc. ICSLP, pp.2446-2449, 2006.
DOI : 10.1121/1.4787180

URL : http://library.naist.jp/dspace/bitstream/10061/7903/1/ASAASJ_2006_%284%29.pdf

Tokuhira, M. and Y. Ariki, «Effectiveness of KL-transformation in spectral delta expansion», Proc. Eurospeech 99, pp.359-362, 1999.

Turk, O. and L. Arslan, «Robust processing techniques for voice conversion», Computer Speech and Language, vol.20, issue.4, pp.441-467, 2006.

Turkmen, H. and M. Karsligil, «Reconstruction of dysphonic speech by MELP», Lecture Notes in Computer Science, vol.5197, pp.767-774, 2008.

Valbret, H., E. Moulines, and J. Tubach, «Voice transformation using PSOLA technique», Proc. ICASSP, pp.145-148, 1992.

Werghi, A., J. Di Martino, and S. B. Jebara, «On the use of an iterative estimation of continuous probabilistic transforms for voice conversion», Proceedings of the 5th International Symposium on Image/Video Communication over fixed and Mobile Networks (ISIVC), pp.1-4, 2010.

Wuyts, F. L., M. S. De Bodt, G. Molenberghs, M. Remacle et al., «The dysphonia severity index: an objective measure of vocal quality based on a multiparameter approach», Journal of Speech, Language, and Hearing Research, vol.43, issue.3, pp.796-809, 2000.

Young, S., D. Kershaw, J. Odell, D. Ollason et al., The HTK Book Revised for HTK Version 3, pp.31-75, 2006.

Young, S., N. Russell, and J. Thornton, «Token passing: a simple conceptual model for connected speech recognition systems», p.44, 1989.

Young, S. J., J. J. Odell, and P. C. Woodland, «Tree-based state tying for high accuracy acoustic modeling», Proc. ARPA Workshop Human Language Technol, pp.307-312, 1994.

Young, S. J. and P. C. Woodland, «State clustering in HMM-based continuous speech recognition», Computer Speech and Language, vol.8, issue.4, pp.369-384, 1994.

Yu, P., M. Ouaknine, J. Revis, and A. Giovanni, «Objective voice analysis for dysphonic patients: a multiparametric protocol including acoustic and aerodynamic measurements», Journal of Voice, vol.15, issue.4, pp.529-542, 2001.

Zweig, G. and S. Russell, «Probabilistic modeling with Bayesian networks for automatic speech recognition», Australian Journal of Intelligent Information Processing, vol.5, issue.4, pp.253-260, 1999.