ABSTRACT

There are two justifications for including a chapter on speech coding in a book on speech synthesis and recognition. The first is that some specialized low-data-rate communication channels actually code the speech so that it can be regenerated by synthesis using a functional model of the human speaking system, and some systems even use automatic speech recognition to identify the units for coding. The second justification arises, as will be explained in Chapter 5, because a common method of automatic speech synthesis is to replay a sequence of message parts which have been derived directly from human utterances of the appropriate phrases, words or parts of words. In any modern system of this type the message components will be stored in digitally coded form. For these reasons this chapter will briefly review some of the most important methods of coding speech digitally, and will discuss the compromises that must be made between the number of digits that need to be transmitted or stored, the complexity of the coding methods, and the intelligibility and quality of the decoded speech. Most of these coding methods were originally developed for real-time speech transmission over digital links, which imposes the need to avoid appreciable delay between the speech entering the coder and emerging from the decoder. This requirement does not apply to the use of digital coding for storing message components, and so for this application there is greater freedom to exploit variable redundancy in the signal structure.