Chapter Four: Synthesis

13. Speech Synthesis and the Channel Vocoder | Page 5

A Brief History of the Vocoder

Prior to the invention of the Voder and Vocoder, many earlier devices, dating as far back as the 17th century, attempted to reproduce human speech through mechanical means.  Perhaps the most infamous was the Euphonia, created by Joseph Faber in the mid-nineteenth century.  With bellows, piano strings, and physical analogs of a glottis, vocal tract and tongue, Faber’s “Fabulous Talking Machine” had 16 keys that could be manipulated to produce words in English, French and German.  It was exhibited as entertainment, but audiences were perhaps not ready for the disembodied female face that opened and closed its mouth as the speech dictated; indeed, it seems many were horrified.

Homer Dudley developed the vocoder out of research into speech mechanisms that he began in 1928. He was further encouraged by the development of the artificial glottis during that period.  Dudley received patents for his vocoder in 1937 and 1939 while working at Bell Labs.  To save substantial bandwidth, Bell began exploring the transmission of the low-bandwidth control signals from the vocoder's analysis stage over copper telephone wires, to be reconstituted by a resynthesis-stage device on the receiving end.

As mentioned previously, Dudley fashioned his original Voder (Voice Operation DEmonstratoR, patented in 1938), which incorporated the resynthesis stage of his vocoder, to also model and demonstrate the way a human vocal tract works.  Resynthesis was controlled not by electronic analysis like the vocoder, but by human physical actions of the operator’s fingers, wrist and foot. Its unique keyboard and set of controls were “played” to mechanically produce speech.  

The Voder's ten pressure-sensitive keys, the equivalent of the vocoder's control signals, controlled the channel VCAs (which corresponded to the Voder's ten bandpass "voiced" filters) and produced vowel sounds. Three wrist bars produced the unvoiced consonants, a foot pedal modulated the overall pitch, and a "quiet" key muted the sound (we all need one, right?).  The wrist bars were divided by stopped sounds (t-d, p-b, k-g) and fed the "hiss" generator into three higher-frequency filters.  Pedro the Voder was first demonstrated publicly at the 1939 World's Fair AT&T Pavilion by 24 "Voderettes," highly skilled women who were trained for a year (primarily by virtuoso Mrs. Helen Harper, pictured here), two of whom exchanged short messages via Voders from opposite coasts during the Fair.

voder
sigsaly

During WWII, the US military used vocoder technology as part of its SIGSALY program to develop and implement secure telephone and wireless transmission. SIGSALY was a nonsense acronym, though the SIG stood for the Army Signal Corps.  Other names included Green Hornet, because of the buzzing sound anyone listening in would hear.  Bell Labs received the Army contract, with brief help (and approval for British use) from none other than Alan Turing, of digital computer and ENIGMA code-cracking fame.  Bell added several important and groundbreaking features to the vocoder's 10-channel output signal: digitization of those streams with FSK pulse code modulation at a sampling rate of ~50 times a second, multiplexing of the multi-signal control output into a single stream, a voiced/unvoiced control signal, and a fundamental pitch inflection control signal (which varied only about 25 Hz but had 30 step values).

They further encrypted the outgoing message stream with a random noise key of six values, which was subtracted (via modulo arithmetic) from each of the amplitude analysis streams.  Both the transmitting and receiving ends had the encryption key noise on phonograph records, which were spun on highly accurate, precisely timed turntables.  The dozen SIGSALY stations provided communications between Roosevelt, Churchill, Eisenhower, MacArthur and other important figures. The original system weighed approximately 50 tons. An excellent technical description can be found here. While the huge SIGSALY system was in service only through 1946, the Army continued to use increasingly compact forms of the vocoder for encryption through the early 1960's, ending with the 2400 bit/s HY-2, built in 1962 and used in Vietnam.
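The modulo subtraction described above can be illustrated with a minimal Python sketch. It assumes each amplitude sample has already been quantized to one of six levels (0 to 5), and it uses a seeded pseudo-random generator as a stand-in for the noise recorded on the matched phonograph records. The function names, the seed, and the sample values are hypothetical and chosen only for illustration; this is not a reconstruction of the actual SIGSALY hardware.

```python
import random

LEVELS = 6  # six possible values per sample, as described in the text


def make_key(length, seed):
    """Stand-in for the shared noise record: both ends derive the same
    sequence of values 0..5 (the seed is a hypothetical substitute for
    playing identical records in sync)."""
    rng = random.Random(seed)
    return [rng.randrange(LEVELS) for _ in range(length)]


def encrypt(samples, key):
    """Subtract the key from each quantized sample, modulo 6."""
    return [(s - k) % LEVELS for s, k in zip(samples, key)]


def decrypt(cipher, key):
    """Add the same key back, modulo 6, to recover the samples."""
    return [(c + k) % LEVELS for c, k in zip(cipher, key)]


# Hypothetical quantized amplitude values for one analysis channel.
samples = [0, 3, 5, 2, 1, 4, 4, 0]
key = make_key(len(samples), seed=1943)

cipher = encrypt(samples, key)
recovered = decrypt(cipher, key)
```

Because the subtraction wraps around modulo 6, every encrypted value stays within the same six levels, and the receiver recovers the original samples only if its key sequence is identical to the sender's and aligned in time, which is why the turntables had to be so precisely synchronized.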

Image source: SIGSALY exhibit at the National Cryptologic Museum

The Siemens Synthesizer, developed in the late 1950's, used vocoder technology, but at a cost of more than $15,000 it was beyond the means of most musicians.  The techno band Kraftwerk was very interested in the technology and, being unable to afford the Siemens, hired an engineer to design their first vocoder.

Robert Moog built a solid-state, active filter vocoder for the University of Buffalo electronic studio in 1968.  As the U.S. military was winding down its use of vocoder encryption, the 1970's seemed to be the heyday of vocoder application in the music world.  In 1971, Moog and Wendy Carlos (of Switched-On Bach fame) modified some existing Moog synthesizer modules into a vocoder configuration for the soundtrack of A Clockwork Orange. Musical vocoder applications began to add instrumental sources as well as speech and song to the vocoder's input.

Moog vocoder

By 1978, Moog had commercially released his 16 Channel Vocoder, co-designed with engineer/instrument designer Harold Bode, which was completely patchable between analysis and resynthesis stage channels.  The balance between hiss and buzz was tunable, and there was a selector switch. The "carrier" audio input was designed to take either vocal or instrumental input as its excitation source. It also featured a pedal-controlled sample and hold to "freeze" the timbre. A re-released version, Model MBVO, is shown on the right or below (manual here).

Moog vocoder

Also in the 1970’s, EMS (of Synthi fame) released its EMS 5000 and 2000 Studio Vocoders, used by Stockhausen, Stevie Wonder and Kraftwerk, who were among the early commercial vocoder adopters.

Harold Bode had much earlier, along with Dudley, introduced the vocoder technology to Werner Meyer-Eppler, a co-founder of the German NWDR Cologne studios and Stockhausen mentor. The meeting was tremendously influential on Meyer-Eppler, then a professor of phonology, encouraging him to pursue his dream of combining electronic music, phonology and communication.  Bode went on to develop the Model 7702 (used by the Commodores in Animal Instincts), which was an almost identical version of the commercial Moog vocoder (and licensed to Moog Music), featuring the same complete patchability of the 16 analysis/resynthesis channels.  However, as mentioned earlier, Bode’s model added a switchable complete pass-through for the unvoiced sounds, those that made it through a highpass filter, as he claimed this made it both more responsive and more natural sounding on non-voiced sounds.  By the end of the 1970’s, many more manufacturers had jumped on board, including Korg with their VC-10, EMS with the Vocoder 2000 (voice of Battlestar Galactica’s Cylons), and Sennheiser with the more affordable Vocoder VSM201 (Neil Young).

Vocoder tech made its way into many keyboard synths, including those of Roland, Korg and Kurzweil, and DAW software packages and plug-ins, including Ableton Live, Apple Logic, TAL (free plug-in) and Native Instruments.

An extensive list of vocoders can be found here.

Finally, it seems like most discussions of vocoder technology mention the “poor man’s vocoder,” the Heil Talk Box, made famous by Peter Frampton in Show Me the Way. It consists of a speaker driver, driven by, say, a guitar input, feeding into a plastic tube. One puts the open end of the tube in one’s mouth and creates a talking guitar by altering the shape of one’s vocal cavity.  The tube is typically taped next to the vocal microphone for amplification. Almost no unvoiced sounds can be created this way.