Our perception of timbre, or tone quality, seems most closely related to the physical phenomenon of partials unfolding over time in the spectrum of a sound, called the spectral envelope. It is what allows us to distinguish between two different instruments playing the same note at the same amplitude. What we expect of a familiar sound, say a piano note, is a set of characteristics that change over time. If one were to chop off the attack of a piano note and listen to the remainder, it might not sound very piano-like at all. As mentioned above, the classical study of timbre dates to the 19th century and Helmholtz's book On the Sensations of Tone, in which the overall envelope of a complex waveform formed the basis for our tonal judgment. A more accurate picture draws on the work of Fourier, who showed that any complex periodic wave can be expressed as a sum of sine waves.
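The idea that a complex wave is a sum of sine waves can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 220 Hz fundamental, and the two sets of partial amplitudes are invented for the example and do not model any real instrument.

```python
import math

def additive_sample(t, fundamental_hz, amplitudes):
    """Return one sample at time t (seconds) of a sum of harmonic sine partials.

    amplitudes[k] is the amplitude of partial k + 1 (partial 1 is the fundamental).
    """
    return sum(
        amp * math.sin(2 * math.pi * fundamental_hz * (k + 1) * t)
        for k, amp in enumerate(amplitudes)
    )

def render(fundamental_hz, amplitudes, duration_s=0.01, sample_rate=44100):
    """Render a list of samples for the given partial amplitudes."""
    n = round(duration_s * sample_rate)
    return [
        additive_sample(i / sample_rate, fundamental_hz, amplitudes)
        for i in range(n)
    ]

# Two hypothetical tones at the same pitch with different spectra.
# The amplitude lists are illustrative assumptions, not measured data.
mellow = render(220.0, [1.0, 0.3, 0.1])
bright = render(220.0, [1.0, 0.8, 0.7, 0.6, 0.5])
```

Both tones share a fundamental, so they have the same pitch; only the relative strength of the partials differs, which is the difference we hear as timbre. The amplitudes here are static, whereas in a real instrument each partial would follow its own curve over time.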
We say sounds with stronger upper partials sound "brighter," and those with weaker upper partials sound "duller." Natural spectra tend to roll off with varying slopes at higher frequencies. Computer music is capable of creating any rolloff desired, and so harmonic sounds whose upper partials carry as much energy as their lower ones are often characterized as being "buzzy." Noise with equally strong higher-frequency components is characterized as being more "hissy."
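A rough sketch of how a rolloff slope shapes a harmonic spectrum follows. It assumes a constant slope in decibels per octave, which is a simplification, and it uses the spectral centroid (the amplitude-weighted mean frequency) as a stand-in for "brightness." The centroid is a common proxy but is not mentioned in the text above, and the function names and the 220 Hz fundamental are illustrative.

```python
import math

def partial_amplitudes(num_partials, rolloff_db_per_octave):
    """Amplitude of each harmonic partial for a constant rolloff slope.

    Partial n lies log2(n) octaves above the fundamental, so it is attenuated
    by rolloff_db_per_octave * log2(n) decibels.
    """
    return [
        10 ** (-rolloff_db_per_octave * math.log2(n) / 20)
        for n in range(1, num_partials + 1)
    ]

def spectral_centroid(fundamental_hz, amplitudes):
    """Amplitude-weighted mean frequency, a rough proxy for brightness."""
    total = sum(amplitudes)
    weighted = sum(a * fundamental_hz * (k + 1) for k, a in enumerate(amplitudes))
    return weighted / total

flat = partial_amplitudes(16, 0.0)    # equally strong partials: the "buzzy" case
steep = partial_amplitudes(16, 12.0)  # steep rolloff: a "duller" tone

print(spectral_centroid(220.0, flat))
print(spectral_centroid(220.0, steep))
```

The flat spectrum yields the higher centroid of the two, in line with the description of strong upper partials as brighter.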
Other aspects of timbre include vibrato, an oscillation of frequency (most important), and tremolo, an oscillation of amplitude (less important). Violins have very narrow formants (or resonating frequencies), and the addition of vibrato may push a tone in and out of a peak formant region, making for a very dynamic sound. Our sense of timbre is further tempered by our perception of the complex of formants belonging to a particular sound or instrument regardless of its register, so that, although somewhat different, a low clarinet note may still be considered related to a much higher one. The art of orchestration depends heavily on the ability of a composer to mix the spectra of numerous instruments, not necessarily playing the same pitch or doubling at the octave, and create a single timbral entity, perhaps unheard before. Most wind players are familiar with the slang term floboe, referring to the frequent octave doubling of melodic lines by a flute and oboe in Classical symphonies.
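The distinction between vibrato and tremolo drawn earlier in this passage can be made concrete with a small sketch: vibrato modulates the frequency of a tone, tremolo modulates its amplitude. The function name, parameter names, and default rates and depths below are assumptions chosen for illustration, and a single sine wave stands in for an instrument tone.

```python
import math

def vibrato_tremolo(carrier_hz=440.0, vib_rate_hz=6.0, vib_depth_hz=5.0,
                    trem_rate_hz=6.0, trem_depth=0.0,
                    duration_s=1.0, sample_rate=8000):
    """Render a sine tone with vibrato (frequency LFO) and tremolo (amplitude LFO)."""
    samples = []
    phase = 0.0
    for i in range(round(duration_s * sample_rate)):
        t = i / sample_rate
        # Vibrato: the instantaneous frequency swings around the carrier.
        freq = carrier_hz + vib_depth_hz * math.sin(2 * math.pi * vib_rate_hz * t)
        # Tremolo: the gain swings between (1 - trem_depth) and 1.
        gain = 1.0 - trem_depth * 0.5 * (1.0 + math.sin(2 * math.pi * trem_rate_hz * t))
        samples.append(gain * math.sin(phase))
        # Accumulate phase so that frequency changes cause no discontinuities.
        phase += 2 * math.pi * freq / sample_rate
    return samples

vibrato_only = vibrato_tremolo()  # default: 5 Hz of frequency swing, no tremolo
tremolo_only = vibrato_tremolo(vib_depth_hz=0.0, trem_depth=0.5)
```

On a tone with many partials, the same frequency swing would move every partial at once, which is how vibrato can sweep partials in and out of a narrow formant region.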
Some studies have indicated that it takes at least 60 ms to recognize the timbre of a sound. It has also been hypothesized that we hear differences in tones up to the 30th partial, which is notable given how close in pitch adjacent higher partials become and how soft they usually are in nature. Temporal relationships also form an aspect of timbre. If a series of tones is played rapidly enough, the tones will merge into a single timbre in a process called fusion, though the effect is highly influenced by the intervallic distance between pitches. With sufficient reverberation, even disparate tones can fuse. Stockhausen used this principle in Studie II, creating mixtures of sine tones played through a reverb to produce unique timbres.
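The remark about the closeness of higher partials can be checked with a short calculation. Adjacent harmonic partials are evenly spaced in hertz, so the musical interval between partial n and partial n + 1 shrinks as n rises. The sketch below expresses that interval in cents (1200 cents to the octave); the function name is illustrative.

```python
import math

def interval_cents(n):
    """Interval in cents between harmonic partial n and partial n + 1."""
    return 1200 * math.log2((n + 1) / n)

for n in (1, 2, 4, 8, 16, 29):
    print(f"partials {n} and {n + 1}: {interval_cents(n):.1f} cents")
```

Partials 1 and 2 are a full octave apart, while partials 29 and 30 are separated by a little more than half an equal-tempered semitone.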
Finally, instruments do not radiate their spectra equally in all directions, making it tricky to mic such instruments and gather their full tonal qualities (see the excellent diagram of the radiation pattern of a cello in Huber and Runstein, Modern Recording Techniques, 7th ed., p. 57).
On the Sensations of Tone as a Physiological Basis for the Theory of Music, or simply Sensations of Tone, was first published in German in 1863 by the physicist Hermann von Helmholtz (1821-1894); an English translation followed in 1875. Helmholtz focused on the human physiological response to sound, building pioneering acoustic testing apparatus, such as the Helmholtz resonator, in collaboration with the craftsman Rudolph Koenig. Helmholtz also worked in other areas of human perception, such as vision, designing the first useful ophthalmoscope.