United States Patent | 5,165,007 |
Bahl, et al. | November 17, 1992 |
In a speech recognition system, an apparatus and method for modelling words with label-based Markov models are disclosed. The modelling includes: entering a first speech input, corresponding to words in a vocabulary, into an acoustic processor which converts each spoken word into a sequence of standard labels, where each standard label corresponds to a sound type assignable to an interval of time; representing each standard label as a probabilistic model which has a plurality of states, at least one transition from a state to a state, and at least one settable output probability at some transitions; entering selected acoustic inputs into an acoustic processor which converts the selected acoustic inputs into personalized labels, each personalized label corresponding to a sound type assigned to an interval of time; and setting each output probability as the probability of the standard label represented by a given model producing a particular personalized label at a given transition in the given model. The present invention addresses the problem of generating models of words simply and automatically in a speech recognition system.
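The final step of the abstract, setting each output probability to the probability that a standard-label model produces a particular personalized label, can be illustrated with a minimal sketch. This is not the training procedure of the patent: it assumes, purely for illustration, that the same utterance has been labelled both ways and that the two label sequences are aligned interval by interval, and it keeps one output distribution per model rather than one per transition. The function name and the label names (`S1`, `P7`, and so on) are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_output_probs(standard_seq, personalized_seq):
    """Relative-frequency estimate of P(personalized label | standard-label model).

    Assumption (not from the patent): the two sequences are aligned
    one-to-one, one label per time interval.
    """
    if len(standard_seq) != len(personalized_seq):
        raise ValueError("sequences must be aligned interval by interval")

    # counts[std][pers] = number of intervals where the model for `std`
    # coincided with the personalized label `pers`.
    counts = defaultdict(Counter)
    for std, pers in zip(standard_seq, personalized_seq):
        counts[std][pers] += 1

    # Normalize each model's counts so its output probabilities sum to 1.
    probs = {}
    for std, counter in counts.items():
        total = sum(counter.values())
        probs[std] = {pers: n / total for pers, n in counter.items()}
    return probs


# Hypothetical label names, for illustration only.
standard = ["S1", "S1", "S2", "S2", "S2", "S1"]
personal = ["P7", "P7", "P3", "P4", "P3", "P9"]
probs = estimate_output_probs(standard, personal)
```

With these toy sequences, the model for `S1` assigns probability 2/3 to `P7` and 1/3 to `P9`. A real system would estimate these values per transition and without a fixed alignment; simple counting is used here only to make the meaning of an output probability concrete.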
Inventors: | Bahl; Lalit R. (Amawalk, NY); DeSouza; Peter V. (Yorktown Heights, NY); Mercer; Robert L. (Yorktown Heights, NY); Picheny; Michael A. (White Plains, NY) |
Assignee: | International Business Machines Corporation (Armonk, NY) |
Appl. No.: | 366231 |
Filed: | June 12, 1989 |
Current U.S. Class: | 704/243; 704/256 |
Intern'l Class: | G10L 005/06 |
Field of Search: | 381/41-50 395/2 |
4032710 | Jun. 1977 | Martin et al. | 381/43
4156868 | May 1979 | Levinson et al. | 381/43
4181821 | Jan. 1980 | Pirz et al. | 381/43
4319085 | Mar. 1982 | Welch et al. | 381/45
4383135 | May 1983 | Scott et al. | 381/45
4555796 | Nov. 1985 | Sakoe | 381/43
4587670 | May 1986 | Levinson et al. | 381/43
Bahl et al., "A Maximum Likelihood Approach to Continuous Speech Recognition", IEEE Trans. on Pattern Analysis and Machine Intelligence, vol. PAMI-5, No. 2, Mar. 1983, pp. 179-190.
Sadaoki Furui, "A Training Procedure for Isolated Word Recognition Systems", IEEE Trans. on Acoustics, Speech & Signal Processing, vol. ASSP-28, No. 2, Apr. 1980, pp. 129-136.
Robert M. Gray, "Vector Quantization", IEEE ASSP Magazine, Apr. 1984, pp. 4-29.
A. Nadas et al., "Continuous Speech Recognition With Automatically Selected Acoustic Prototypes Obtained by Either Bootstrapping or Clustering", Proceedings ICASSP, 1981, pp. 1153-1155.
R. Bakis, "Spoken Word Spotting Via Centisecond Acoustic States", IBM Technical Disclosure Bulletin, vol. 18, No. 10, Mar. 1976, pp. 3479-3481, New York, U.S.
H. Bourlard et al., "Speaker Dependent Connected Speech Recognition Via Phonemic Markov Models", ICASSP '85, Tampa, Fla., U.S., 26th-29th Mar. 1985, vol. 3, pp. 1213-1216, IEEE, New York, U.S.
R. Bakis, "Continuous Speech Recognition Via Centisecond Acoustic States", IBM Research Report #5971, Apr. 5, 1976, pp. 1-8 and title page with abstract.
F. Jelinek, "Continuous Speech Recognition by Statistical Methods", Proceedings of the IEEE, vol. 64, No. 4, Apr. 1976, pp. 532-556.
Fumitada Itakura, "Minimum Prediction Residual Principle Applied to Speech Recognition", reprinted from IEEE Trans. Acoust., Speech, and Signal Process., vol. ASSP-23, pp. 67-72, Feb. 1975.