Generally, the sensitivity of human hearing is restricted to the frequency range
of 20 Hz to 20,000 Hz, with greatest sensitivity centered in the 500 to 8,000 Hz
frequency range. Above and below this range, the ear becomes progressively less
sensitive. To account for this feature of human hearing, sound level meters apply
filtering of acoustic signals according to frequency. This filtering is called A-weighting.
Sound pressure level values obtained using this weighting are referred to as A-weighted
sound pressure levels and are signified by the identifier dBA.
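As a rough illustration of the frequency-dependent filtering described above, the sketch below evaluates the commonly published analog approximation of the A-weighting curve. The formula and its constants are not given in this glossary; they are assumed here from the standard sound level meter weighting, and the function name is hypothetical.

```python
import math

def a_weighting_db(f_hz: float) -> float:
    """Approximate A-weighting gain in dB at frequency f_hz (in Hz)."""
    f2 = f_hz ** 2
    # Corner frequencies (Hz) of the commonly published A-weighting curve
    # (assumed constants, not taken from this glossary).
    num = (12194.0 ** 2) * f2 ** 2
    den = ((f2 + 20.6 ** 2)
           * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
           * (f2 + 12194.0 ** 2))
    # The +2.00 dB offset normalizes the curve to about 0 dB at 1 kHz.
    return 20.0 * math.log10(num / den) + 2.00

if __name__ == "__main__":
    for f in (31.5, 100, 1000, 4000, 16000):
        print(f"{f:>8} Hz  {a_weighting_db(f):6.1f} dB")
```

The curve attenuates low and very high frequencies most strongly, which mirrors the reduced sensitivity of the ear outside its mid-frequency range.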
The exact pitch value of a musical note (for example, middle C) as opposed to its
position relative to other pitches.
The point in space of the origin of sound. For a sound emitting transducer
(e.g., a loudspeaker), the point from which the spherical waves appear to
diverge as observed at remote points. (See also acoustic
The complete set of all objects and their respective physical properties having
an influence on the sound field that surrounds a listener. The acoustic environment
is a major determinant of perceived sound quality because most of the sound emitted
by a source (e.g., a loudspeaker) typically arrives at the listener through
a multiplicity of paths. (A single bounce off an object is termed a "first-order"
reflection, two bounces a "second-order reflection," and so on.) Each
time a sound reflects off an object, the object's material properties affect how
much each frequency component of the sound wave is absorbed and how much is reflected
back into the environment. Sounds can also pass through objects, including such
"substantial" objects as walls, ceilings, floors and windows. An object's
material properties and its geometry (its
corners, edges, openings, shape, size, etc.) often influence sound in ways more complex than just
reflection, including diffraction,
refraction and diffusion.
The point in time at which the signal originates. (See also
Any sound of 6 dB or more that masks others of similar frequency.
The measured percentage of articulation loss of consonants
experienced by a listener. A %ALCONS of 0 indicates perfect clarity and
intelligibility with no loss of consonant understanding, while values of 10% and beyond
tend toward poor intelligibility, with 15% typically representing the maximum
loss acceptable. %ALCONS can be measured by acoustic analyzers
such as TEF.
In room acoustics, early reflections and reverberation. The audible sense of a room
or environment surrounding a sound source.
A parameter of sound related to the extent of oscillation of a vibrating body, of
sound pressure, or of an analog voltage.
The function describing how the maximum amplitude of a sound waveform evolves over
time. The amplitude envelope is often characterized as consisting of four parts:
The attack portion (i.e., the part during which the amplitude is rapidly
increasing); the decay portion (i.e., the "backside" of the attack,
during which the amplitude is rapidly diminishing); the sustain portion (i.e.,
the part during which the amplitude is relatively stable); and the release
portion (i.e., the final part during which the amplitude diminishes into silence).
A change in amplitude according to a periodic or aperiodic function. If the modulation
is done periodically, its effects on the carrier tone can be described in two equivalent
ways. The first is by simply describing the result as a repeating change in the
amplitude of the carrier. The second is to describe it as a mixture of a fixed intensity
carrier with a number of additional fixed intensity tones, called "side bands."
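The equivalence of the two descriptions can be checked numerically for the simplest case, a sinusoidal carrier whose amplitude is modulated by a single sinusoid. The sketch below is only an illustration under that assumption; the function names and the example frequencies are hypothetical.

```python
import math

def am_as_changing_amplitude(t, fc, fm, depth):
    """Carrier at fc (Hz) whose amplitude varies periodically at fm (Hz)."""
    envelope = 1.0 + depth * math.cos(2 * math.pi * fm * t)
    return envelope * math.cos(2 * math.pi * fc * t)

def am_as_carrier_plus_sidebands(t, fc, fm, depth):
    """Same signal written as a fixed carrier plus two fixed side bands."""
    carrier = math.cos(2 * math.pi * fc * t)
    upper = 0.5 * depth * math.cos(2 * math.pi * (fc + fm) * t)
    lower = 0.5 * depth * math.cos(2 * math.pi * (fc - fm) * t)
    return carrier + upper + lower

if __name__ == "__main__":
    # Hypothetical example: 1000 Hz carrier, 50 Hz modulation, 50% depth.
    for n in range(5):
        t = n / 8000.0
        a = am_as_changing_amplitude(t, 1000.0, 50.0, 0.5)
        b = am_as_carrier_plus_sidebands(t, 1000.0, 50.0, 0.5)
        print(f"t={t:.6f}  {a:+.6f}  {b:+.6f}")
```

For a single sinusoidal modulator there are exactly two side bands, at the carrier frequency plus and minus the modulation frequency; more complex periodic modulators add further pairs.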
A general term referring to the impairment of musical abilities due to damage to
one or both cerebral hemispheres.
The ability of a listener to perceptually isolate individual elements of a complex
sound or sequence, such as frequency components in a complex sound or individual
events in rapid sequences. In synthetic listening
the tendency is to perceive sound complexes or temporal sequences in a global fashion.
Literally, without echo. An anechoic chamber is a low-noise, highly absorptive environment,
often used in acoustical testing, that allows the direct sound of the device under
test (e.g., a loudspeaker) to be measured without contamination from reflections
off the chamber's walls, floor or ceiling.
A general term referring to the impairment of language abilities following damage
to the left hemisphere of right-handed people.
If two lamps at two different locations in space are flashed in close succession,
the viewer obtains an impression of motion between them.
Apparent source width (ASW).
Discovered and developed by A. H. Marshall, ASW is a subjective parameter of spaciousness
in concert halls, and is related to the level, at the listener's ears, of
lateral reflections in the first 50 to 80 milliseconds after the arrival of the
direct sound. Increasing the ratio of this reflected energy to the direct sound
increases the sense of spaciousness. Narrow, rectangular, "shoebox-shaped"
halls like the famous Musikvereinsaal in Vienna and Symphony Hall in Boston tend
to foster strong, early-arriving reflections from the side walls, subjectively broadening
the sound source and imparting body and fullness to the music.
(From the Italian appoggiare meaning to lean.) A short-duration tone that
is a neighboring note (a semitone or whole tone higher or lower) of the principal
note which it precedes.
Articulation loss of consonants.
A measure of speech intelligibility. The percentage of consonants heard incorrectly,
strongly influenced by noise or excessive reverberation. (See also
A type of very soft noise appearing in speech sounds. It occurs in the phoneme "h"
in English, or with less duration after the release of an unvoiced consonant, for
example, after the "p" in "pie."
(See tonal system.)
The lessening of sound signal level due to divergence, absorption, reflection, refraction,
diffraction, etc., typically expressed in decibels.
A general term referring to impairments in recognizing auditory objects, events,
and sequences that usually follow damage to both temporal lobes.
The sensation of periodic fluctuation that results when two simultaneous components
are very close to one another in frequency. Listeners hear the fluctuation pattern
as consisting of beats when their auditory system lacks enough frequency resolution
to distinguish the component frequencies.
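For two equal-amplitude pure tones, the physical fluctuation behind this sensation can be sketched as follows: the sum of the two tones equals a tone at the mean frequency whose amplitude rises and falls at a rate equal to the frequency difference. This is a minimal illustration assuming sinusoidal components; the function names and example frequencies are hypothetical.

```python
import math

def beat_rate_hz(f1, f2):
    """Number of amplitude fluctuations per second for two close tones."""
    return abs(f1 - f2)

def two_tones(t, f1, f2):
    """Sum of two equal-amplitude sinusoids at f1 and f2 (Hz)."""
    return math.sin(2 * math.pi * f1 * t) + math.sin(2 * math.pi * f2 * t)

def mean_tone_with_envelope(t, f1, f2):
    """Same signal written as a slowly varying envelope times a mean-frequency tone."""
    envelope = 2.0 * math.cos(2 * math.pi * 0.5 * (f1 - f2) * t)
    return envelope * math.sin(2 * math.pi * 0.5 * (f1 + f2) * t)

if __name__ == "__main__":
    # Hypothetical example: 440 Hz and 444 Hz sounded together.
    print("beat rate:", beat_rate_hz(440.0, 444.0), "Hz")
```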
The perceived image of the acoustic environment; the way the
acoustic environment is perceived. (See also
virtual auditory environment.)
What is auditorially perceived, in contrast to a sound event, which is a physical
phenomenon of vibrations and waves in air or other elastic medium. The relationship
between sound events and auditory events is the subject of
A mental description of a physical (or virtual) sound source and its behavior through
time. Auditory stream segregation refers to the process of perceptual organization
of sound that accomplishes the construction of this description.
(See Eustachian tube.)
The technique of using computer-based mathematical models of an acoustic environment
and 3-D sound processing methods to make audible the sound field of a source in
the modeled space. Somewhat analogous to building and viewing a scale model of a
contemplated building, auralization enables an acoustician or sound designer to
build a computer model of a listening space and then "play" the room's
sound through headphones. See also article, "Virtual
Backward recognition masking (also called informational
The reduction in the ability to recognize a sound pattern due to the subsequent
presentation of another sound pattern with similar information content. This kind
of masking is thought to result from a process different from that of normal (sensory)
The organ of hearing. More specifically, a membrane that runs the length of the
cochlea which is a bony, fluid-filled spiral in the
inner ear. The basilar membrane performs a kind
of frequency analysis of the incoming acoustic signal: different locations along
the membrane vibrate preferentially in response to different frequencies. The hair
cells connected to each part of the membrane thus preferentially send neural information
about the presence of those frequencies to the brain. The spatial pattern of activity
along the basilar membrane thus encodes the frequency content of the signal.
(See auditory beats.)
In the home entertainment context, pertaining to presentations involving the visual
and auditory sensory modalities.
Pertaining to two ears. A presentation of sound is binaural when both ears are presented
with the sound. Binaural sound also refers to a specific sound playback technology,
used mainly in headphones-based research and virtual reality applications, in which
an individual's HRTFs are determined and synthesized to enable 3-D auditory experiences
that are indistinguishable from reality, or nearly so.
BR (bass ratio).
In concert hall acoustics, the ratio of the average reverberation
times at 125 and 250 Hz to the average
of the RT's at 500 and 1000 Hz. It is determined only for a hall when fully occupied.
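The ratio defined above can be computed directly from four octave-band reverberation times. The sketch below is a minimal illustration; the function name and the example values are hypothetical and are not measurements of any real hall.

```python
def bass_ratio(rt_125, rt_250, rt_500, rt_1000):
    """Bass ratio from occupied-hall reverberation times (seconds) in four octave bands."""
    low_average = (rt_125 + rt_250) / 2.0
    mid_average = (rt_500 + rt_1000) / 2.0
    return low_average / mid_average

if __name__ == "__main__":
    # Hypothetical reverberation times, for illustration only.
    print(round(bass_ratio(2.2, 2.0, 1.9, 1.9), 3))
```

A value above 1 means the low-frequency bands reverberate longer than the mid-frequency bands.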
In concert hall acoustics, a bright, clear, ringing sound, rich in harmonics, is
called "brilliant." In a brilliant sound the treble frequencies are
prominent and decay slowly. This means that the high frequencies are diminished
only by the natural absorption of the sound in the air itself.
C80(3) or clarity factor.
In concert hall acoustics, the ratio, expressed in decibels,
of the energy in the first 80 milliseconds of an impulse sound arriving at a listener's
position divided by the energy in the sound after 80 milliseconds. The divisor is
approximately the total energy of the reverberant sound. The symbol (3) indicates
the average of the C80 values in the 500, 1000 and 2000 Hz
bands. More generally, clarity refers to the degree to which the separate strands
in a musical performance perceptually stand apart from one another; see also
In musical research, a unit of pitch change equal to 0.01 semitones.
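As a small worked example, the interval between two frequencies can be expressed in cents. The sketch assumes the equal-tempered convention of 12 semitones per octave, so that an octave (a 2:1 frequency ratio) spans 1200 cents; the function name is hypothetical.

```python
import math

def cents_between(f1_hz, f2_hz):
    """Signed pitch distance from f1 to f2 in cents (100 cents = 1 semitone)."""
    return 1200.0 * math.log2(f2_hz / f1_hz)

if __name__ == "__main__":
    print(cents_between(440.0, 880.0))   # one octave
    print(cents_between(440.0, 442.0))   # a small mistuning
```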
The simultaneous sounding of a group of notes, usually three or more. In Western
music, chords of three notes consisting of the first, third and fifth degrees of
a scale are called triads. Major triads consist of intervals of a major third (four
semitones) and perfect fifth (seven semitones) with respect to a reference pitch
(the root). The third is minor (three semitones) in a minor triad. The third is
major and the fifth is augmented (eight semitones) in an augmented triad. The third
is minor and the fifth is diminished (six semitones) in a diminished triad. When
the notes of a chord are played in ascending or descending succession, the melodic
figure is called an arpeggio.
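The four triad types described above can be written as semitone offsets from the root. The sketch below represents pitches as integer note numbers (an assumption made for illustration, with 60 standing for middle C as in MIDI); the names are hypothetical.

```python
# Semitone offsets of the root, third and fifth for each triad type.
TRIAD_INTERVALS = {
    "major": (0, 4, 7),
    "minor": (0, 3, 7),
    "augmented": (0, 4, 8),
    "diminished": (0, 3, 6),
}

def triad(root, quality):
    """Return the three note numbers of a triad built on the given root."""
    return [root + offset for offset in TRIAD_INTERVALS[quality]]

def arpeggio(root, quality, descending=False):
    """Return the triad's notes in the order they would be played in succession."""
    notes = triad(root, quality)
    return notes[::-1] if descending else notes

if __name__ == "__main__":
    for quality in TRIAD_INTERVALS:
        print(quality, triad(60, quality))
```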
Going around the perimeter of the pinna; said
of certain headphones.
The snail-shaped cavity, approximately 1-1/4 inches long, 3/8 inches wide and 2
inches high, in the temporal bone that contains the basilar
membrane which is the organ of hearing.
Cocktail party effect.
A form of auditory stream segregation by which a listener's
ability to localize sound sources (see localization)
can increase intelligibility. So called because at a cocktail party, a listener
can focus on and understand a conversation while dozens or even hundreds of other
conversations occur all around. If a conventional 2-channel high-resolution recording
were made and subsequently played back, the listener would not be able to understand
individual conversations because they have been spatially blended into the two speakers.
A sequence of evenly spaced peaks or dips in the frequency response, when viewed
on a linear scale, caused by two or more identical signals that combine at nearly equal
amplitudes but with slightly different arrival times. So called because the frequency
response plot resembles the teeth of a comb.
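The even spacing of the dips can be illustrated for the simplest case of a signal summed with one delayed copy of itself. Under that assumption the dips fall at odd multiples of half the reciprocal of the delay. The function names and the 1 ms example delay are hypothetical.

```python
import math

def comb_notches_hz(delay_s, f_max_hz):
    """Dip frequencies (Hz) up to f_max_hz for a signal plus one copy delayed by delay_s."""
    notches = []
    k = 0
    while (2 * k + 1) / (2.0 * delay_s) <= f_max_hz:
        notches.append((2 * k + 1) / (2.0 * delay_s))
        k += 1
    return notches

def comb_magnitude(f_hz, delay_s, gain=1.0):
    """Magnitude of the combined response |1 + gain * exp(-j*2*pi*f*delay)|."""
    phase = 2 * math.pi * f_hz * delay_s
    power = 1.0 + gain ** 2 + 2.0 * gain * math.cos(phase)
    return math.sqrt(max(0.0, power))  # guard against tiny negative rounding errors

if __name__ == "__main__":
    # Hypothetical example: the copy arrives 1 ms after the original.
    print(comb_notches_hz(0.001, 5000.0))
```

With equal amplitudes the dips are complete cancellations; with a weaker delayed copy (gain below 1) they become shallower but stay at the same frequencies.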
A problem for virtual reality designers (who by definition must add interactivity):
when a person can probe the environment, the VR designer must provide for a nearly
infinite number of possible simulations.
A tone composed of two or more pure tones. (See also spectrum.)
In acoustics, the portion of a sound wave in which air molecules are pushed together,
forming a region with higher-than-normal atmospheric pressure. The opposite of rarefaction. In audio signal processing, the
reduction in dynamic range caused by a compressor.
(See neural net.)
A range of frequencies surrounding the frequency of a designated pure tone. When
other pure tones whose frequencies are within this range are played at the same
time as the designated tone, the auditory system does not hear the two completely
independently. The designated tone may be masked (see masking),
beats may be heard, or other forms of interaction may occur. The size of the critical
band increases for higher frequency tones, ranging from about 100 Hz for low-frequency
tones to above 2 kHz for very high ones.
The distance from a sound source at which direct sound and reverberant sound are
at the same level.
Abbreviation of decibel.
(See amplitude envelope.)
A unit of the intensity of sound. The decibel (abbreviated dB) is a relational measure,
expressing the relative intensity of the described sound to a reference sound. The
decibel is a logarithmic measure, specifically 10 times the base-10 logarithm of the ratio
of two powers or intensities, or equivalently 20 times the logarithm of the ratio of two
voltages, currents or sound pressures. A difference of 20 dB between two
sounds means that the more intense one has 10 times the amplitude (100 times the
power) of the softer. A single decibel is commonly thought to be the smallest change
in sound pressure level that the trained human ear can detect.
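The arithmetic behind the 20 dB example can be sketched as below. The sketch assumes the usual convention that power ratios use a factor of 10 and amplitude ratios (voltage, current, sound pressure) use a factor of 20, because power is proportional to amplitude squared; the function names are hypothetical.

```python
import math

def db_from_power_ratio(power_ratio):
    """Level difference in dB for a ratio of two powers or intensities."""
    return 10.0 * math.log10(power_ratio)

def db_from_amplitude_ratio(amplitude_ratio):
    """Level difference in dB for a ratio of two voltages, currents or sound pressures."""
    return 20.0 * math.log10(amplitude_ratio)

def amplitude_ratio_from_db(level_db):
    """Inverse of db_from_amplitude_ratio."""
    return 10.0 ** (level_db / 20.0)

if __name__ == "__main__":
    print(db_from_amplitude_ratio(10.0))   # 10 times the amplitude
    print(db_from_power_ratio(100.0))      # 100 times the power
```

Both calls describe the same pair of sounds, which is why they yield the same level difference.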
A trademark term of Keith Yates Design Group referring to highly
immersive entertainment characterized by the depth and multiplicity
of sensory modalities presented to the audience.
In concert hall acoustics, definition, like clarity, refers to the degree to which
individual strands in a musical presentation can be differentiated from each other.
There are two kinds of definition: horizontal, which applies to tones played in
succession; and vertical, in which tones are played simultaneously. Horizontal definition
refers to the degree to which sounds that follow one another stand apart. Composers
can specify certain musical factors that determine the horizontal definition, such
as tempo, repetition of tones in a phrase, and the relative loudness of successive
notes. Performers can vary the horizontal definition by the manner they choose to
phrase a passage. Acoustical factors that affect horizontal definition are the length
of the reverberation and the ratio of the loudness
of the early sound to that of the reverberant sound--the same two factors that determine
fullness of tone, but in inverse relation. Vertical definition refers to
the degree to which sounds that occur simultaneously are heard separately. Composers
specify vertical definition by choosing simultaneous tones and their relation to
the tones surrounding them, and the choice of instruments on which they're
played. Performers can alter vertical definition by varying the dynamics of their
simultaneous sounds and through the precision of their ensemble. Acoustical factors
such as the energy ratio of early sound to reverberant sound also affect vertical
The bending of a wave front around an obstacle in the sound field.
Sound field in which the sound pressure level is the same everywhere and the flow
of energy is equally probable in all directions.
The spatial and/or temporal scattering of sound energy.
An acoustical device designed to spread sound reflections.
Directional transfer function (DTF).
(See head-related transfer function.)
Directivity factor (Q).
The ratio of the sound pressure squared, radiated directly ahead of a sound source,
to the sound pressure squared averaged over all directions.
The perception of fine distinctions or differences between stimuli.
In Western tonal music, the fifth degree of the diatonic scale or the triad (see
chord) built on it. This is an important degree
from the standpoint of the tonal hierarchy since, as its name indicates, it dominates
the other degrees (excepting the tonic). (See also tonal
Early decay time (EDT).
In concert hall acoustics, the measurement, expressed in seconds, taken in the same
fashion as reverberation time except that EDT
is the time it takes for a signal to decay from 0 to -10 dB relative to its steady-state
value. A multiplying factor of 6 is necessary to make the EDT time comparable to
RT. Short decay times cause music and speech to sound dry or muffled. Long decay
times make speech difficult to understand or even unintelligible.
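The factor of 6 follows from extrapolating a 10 dB decay to the 60 dB decay used for reverberation time. The sketch below estimates EDT from a decay curve given as (time, level) samples; the data format, function names and example curve are hypothetical, and linear interpolation between samples is an assumption of this illustration.

```python
def time_to_level(decay, target_db):
    """Time (s) at which a decay curve first falls to target_db.

    decay is a list of (time_s, level_db) samples, starting at 0 dB and
    decreasing; the crossing is found by linear interpolation.
    """
    for (t0, l0), (t1, l1) in zip(decay, decay[1:]):
        if l0 >= target_db >= l1 and l0 != l1:
            fraction = (l0 - target_db) / (l0 - l1)
            return t0 + fraction * (t1 - t0)
    raise ValueError("decay curve never reaches the target level")

def early_decay_time(decay):
    """EDT in seconds: the 0 to -10 dB decay time multiplied by 6."""
    return 6.0 * time_to_level(decay, -10.0)

if __name__ == "__main__":
    # Hypothetical decay falling steadily at 30 dB per second.
    curve = [(0.0, 0.0), (0.5, -15.0), (1.0, -30.0)]
    print(early_decay_time(curve))
```

For a perfectly steady decay such as the one in the example, EDT and reverberation time coincide; in real rooms they differ when the early part of the decay is faster or slower than the later part.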
An individual's unique head-related transfer function (HRTF), typically derived for each ear by placing a
tiny probe microphone inside the meatus, placing a loudspeaker
at a known location relative to the listener, playing a test signal through the
loudspeaker and recording the microphone signal. By comparing the original test
signal to the signal received by the probe microphone, the filter function of a
sound source at that position, and for that ear, is known. The loudspeaker is then
moved to another location and the process is repeated until an entire, spherical
map of filter sets has been devised.
A sound wave which has been reflected or otherwise returned with sufficient magnitude
and delay (typically >90 milliseconds) to be perceived as distinct from that
A hypothetical preperceptual sensory register within which auditory information
is temporarily stored without being recoded. The function of this memory would
be to preserve sensory information during the time needed for higher-level processing
mechanisms to extract useful information. Echoic memory does not last more than
a few seconds. It corresponds to iconic memory in the visual modality.
Energy-Time Curve (ETC).
In TEF measurements, a display of all the energy
returned during a specified time span. Time is displayed on the abscissa (x axis)
and energy on the ordinate (y axis). An ETC reveals how energy is released from
a system or room or device after it is hit with a sudden application of input energy
confined to a given frequency band.
In concert hall acoustics, envelopment is the second component of
spaciousness, and generally describes a listener's impression of
the strength and directions from which the reverberant sound appears to arrive.
Listener envelopment (abbreviated LEV) is judged highest
when the reverberant sound seems to arrive at a person's ears equally from all directions--forward,
overhead and behind.
Trade name of an infrasonic floor-motion system developed by Keith Yates and produced
and marketed under the Immersive Technologies brand name. The eQuake system relies
on several proprietary elements, including a real-time processor that takes a subwoofer-output
audio feed from a surround-sound processor, and synthesizes a sub-20 Hz signal that
is output to a below-floor excitation system. The eQuake is the world's first residential
system to add realistic, infrasonic, haptic content
to conventional audio/video playback, thereby marking the transition from a
bimodal to a trimodal sensory experience.
Also known as the auditory tube, the Eustachian tube is an approximately
1-1/2 inch long conduit that serves to equalize air pressure on both sides of the
tympanic membrane (eardrum), and to allow for
drainage of the middle ear by serving as a portal
into the nasopharynx (a region of the alimentary canal).
Judging that a signal is present when it is not or that a change occurred when none
did. Also called a false-positive response.
The distribution of sound energy at a very much greater distance from a source
than the linear dimensions of the source and in which the sound waves can be considered
to be plane waves.
A device that can change the relative amplitudes
and phases of the frequency components in the
spectrum of a signal. A high-pass filter attenuates
low frequencies and lets the high ones pass through. A low-pass filter does the opposite.
(See temporal coherence boundary.)
In room acoustics, a series of specific reflective returns caused by large surfaces
being parallel to each other.
(See resonance structure.)
A mathematical analysis of waves, discovered by the French mathematician Fourier
(1768-1830). Fourier proved that any periodic sound, or any non-periodic sound of
limited duration, could be represented (Fourier analysis) or created out of (Fourier
synthesis) the sum of a set of pure tones with different frequencies, amplitudes
A mathematical description of the relationship between functions of time and corresponding
functions of frequency; a map for converting from one domain to the other. For example,
if we have a signal that is a function of time--an impulse response--then the Fourier
Transform will convert that time domain data into frequency data, for example, a
The central portion of the retina where visual acuity, or the ability to distinguish
small objects and details, is greatest. Only about half a millimeter in diameter,
the fovea is the retina's "rod-free zone" and is densely packed with cones.
(See also retina.)
An environment in which there are no reflective surfaces within the frequency region
A measure of the rate at which something repeats. This term usually refers to the
repetition rate of a periodic waveform and is expressed in
Hz (cycles per second) or kHz (thousands of cycles per second). The
period is the inverse of frequency, or the amount of time a single cycle lasts.
(See also harmonicity.)
A speech sound produced by frication, that is, by forcing air through a constriction
in the vocal tract. Examples are "s" and "f."
G (strength factor).
In concert hall acoustics, the ratio, expressed in decibels,
of the sound energy at a seat in a hall that comes from a non-directional source
(usually located successively at one to three different positions on the stage)
to the sound energy from the same source when measured in an
anechoic room at a distance of 10 meters. G is measured in six frequency
bands: 125, 250, 500, 1000, 2000 and 4000 Hz.
Gmid (mid-frequency strength
Same as G (strength factor), except that the decibel
levels are the average of the G's measured in the 500 and 1000 Hz bands.
Glow (low-frequency strength
Same as G (strength factor), except that the decibel
levels are the average of the G's measured in the 125 and 250 Hz bands.
From the German word for "form" or "shape." The central
idea of Gestalt psychology is that the properties of a whole form cannot be derived
by simply summing the properties of its individual parts. The constitution of these
forms obeys the perceptual laws (or principles) that were demonstrated for visual
perception by the Gestalt psychologists in the early decades of the 20th century,
but which have in general been confirmed for auditory perception as well. These
principles include the grouping into forms of elements on the basis of their proximity,
similarity, continuity, symmetry and closure. A configuration of elements that obeys
one or more of these principles may be considered to be "well formed"
and as such is a preferred way of experiencing the sensory input. (See also
In concert hall acoustics, if the side walls or the surfaces of hanging panels are
flat and smooth and are positioned to produce strong early sound reflections, the
sound from them may take on a brittle or harsh quality, analogous to optical glare.
Acoustical glare can generally be prevented by adding irregularities to these surfaces
or by curving them. In the 18th and 19th centuries, fine-scale irregularities on
sound-reflecting surfaces were provided by baroque carvings or plaster ornamentation.
(See auditory stream,
(See basilar membrane.)
Pertaining to the sense of touch, from the Greek word haptein, to grasp.
There are four types of sensory neurons (mechanoreceptors)
involved in the haptic modality. The haptic, or tactile, sensory modality is the
only active sense that can be used to explore our environment; vision and hearing
are passive senses since they cannot act upon the environment [no e-mail regarding
the Heisenberg Uncertainty Principle, please!].
One component (or partial, or
overtone) of a complex tone whose component frequencies are all integer
multiples of a common fundamental frequency (see frequency).
The intervals between components of the harmonic series are defined by harmonic
ratios (i.e., ratios of simple integer numbers). The term "harmonic
ratios" can also be applied to very low frequency rates of repetition as are
found in rhythms.
The state of being harmonic or periodic. Periodicity is mathematically synonymous
with harmonicity, though the former refers to a regularity in the sound's time description
while the latter refers to a regularity in its frequency description. Contrasting
terms to this one include inharmonicity or aperiodicity (usually for
complex tones composed of inharmonically related partials) and randomness (usually
employed to refer to noise waveforms).
(See precedence effect.)
Head-related transfer function (HRTF).
The frequency response between the point in space where a sound source is located,
and the ear, due to anatomical features of the head, upper torso and pinnae. These
features shape the response in such a way as to allow the ear to localize a sound
source in space. (Also known as head transfer function [HTF], pinnae transform,
outer ear transfer function [OETF], and directional transfer function
[DTF]. See also localization.)
Helmholtz, Hermann von.
Scientist who, during the second half of the 19th century, contributed to our knowledge
about almost every topic in the fields of perception and sensory processes. Helmholtz
argued that perception was based upon a process of inference, in which, through
past experience, we infer from the sensations we receive at a given time the nature
of the object or event that they probably represent.
(See temporal lobe.)
The organization of a set of elements into subsets according to relations of dominance
and subordination. Each element of a subset is subordinate to the subset as a whole
which itself is subordinate to the superset of which it is an element, and so on.
In a strict hierarchy no element can be a member of more than one subset at a given
level of the hierarchy.
Abbreviation for head-related transfer function. (See
also earprint and localization.)
Abbreviation for heating, ventilation and air conditioning.
Abbreviation of Hertz. (See frequency.)
IACCA (interaural cross-correlation
The measure of the difference in the sounds arriving at the two ears of a listener
facing the performing entity in a hall. IACC is usually measured by recording on
a digital tape recorder the outputs of two tiny microphones located at the entrances
to the ear canals of a person or a dummy head, and quantifying the two ear differences
with a computer program. IACCA is determined with a frequency
bandwidth of about 100 to 8000 Hz and for a time period of 0 to about 1 second.
No frequency weighting is used.
The interaural cross-correlation coefficient determined for a time period of 0 to
80 milliseconds. It is the average of the values measured in the three octave bands
with mid-frequencies of 500, 1000 and 2000 Hz. It has been shown to be a sensitive
measure for determining the apparent source width
(ASW) of a performing entity as heard by a person seated in the audience.
The interaural cross-correlation coefficient determined by averaging the values
in the 500, 1000 and 2000 Hz bands, for a time period of 80 to 750 milliseconds.
It correlates approximately to the state of sound diffusion in a concert hall.
The ability to retrieve from memory a name or concept associated with an object
Abbreviation for inside-the-head localization.
Abbreviation for Impact Isolation Class.
Pertaining to "immersion," or the feeling of being present in a mediated
world rather than the immediate physical environment. The success of the phenomenon
is thus dependent on the absence of, or the ability to block out, sensory cues associated
with the immediate environment (the "real world"), and the degree to which
the cues supplied by the mediated world are both deep (i.e., rich in informational
content) and broad (i.e., correlated across multiple sensory modalities; see also
haptic and eQuake). The
mediated environment can be purely fictional or a temporally and/or spatially distant
real environment. The question isn't whether the created world is as real as the
physical world, but whether the created world is real enough for you to suspend
your disbelief for a period of time. The introduction of perspective in painting
by Masaccio in the 1420s took a first step toward immersion by creating a sense
of depth that integrated the spectator into the pictorial space. But because the
medium of painting simulates depth on a flat surface the spectator cannot break
through the canvas and walk into the pictorial space. (See also
Immersive Technologies Corporation.
A California company founded by Keith Yates in 1998 to develop and manufacture and/or
license technologies to increase the immersive power of movie and music playback
experiences. See also eQuake.
Impact Isolation Class (IIC).
A measure or specification of isolation effectiveness of building structures from
impact noises such as slammed doors, dropped objects, footfalls, shuffled furniture,
etc. The higher the IIC rating, the better such isolation. Impact noises can be
transmitted through walls, floors, and ceilings throughout a building and re-radiated
at distant locations. Careful design and special construction materials (floating
floors, isolation pads, resilient channels, spring rails, flexible connectors and
hangers, for example) can help improve IIC ratings, which may be thought of as the
structure-borne equivalent of the airborne noise ratings addressed by
A measurement of sound pressure versus time, showing how a device responds to an
A key concept in cognitive psychology. Drawing on the image of the way computers
work, information resulting from stimulation of the sense organs is analyzed and
transformed by a number of serial and parallel processors (see
neural net) each of which takes as input the information output by another
(See backward recognition masking.)
Pertaining to frequencies below the audible range, i.e., sub-20 Hz. Note: Sound
in the 2-5 Hz range played at 100-125 dB may produce difficulty in swallowing and
slight post-exposure headache. Sound in the 2-5 Hz range played at 125-137 dB may
produce chest wall vibration; difficulty in speaking and voice modulation; swaying
sensations; lethargy and drowsiness; and post-exposure fatigue and headaches. Sound
in the 5-15 Hz range played at 125-137 dB may produce middle-ear pain; difficulty
in speaking and voice modulation; severe chest wall vibration; severe abdomen vibration
and associated feelings of nausea; a falling sensation; lack of concentration and
drowsiness; tinnitus; and severe post-exposure fatigue
and headaches. According to some researchers, 7 Hz is possibly the most disturbing
frequency, being close to the natural resonance frequency of many of the internal
body organs and being the same frequency as the alpha brainwaves. Sound in the 15-20 Hz
range played at 125-137 dB may produce severe middle ear pain; respiratory difficulties
(gagging sensations); nasal cavity vibration; persistent eye watering; tinnitus;
sensation of fear; excessive perspiration and shivering; and severe post-exposure
fatigue and headaches.
A tone composed of partials that are not all integer
multiples of a common fundamental.
Initial time delay gap (ITDG).
The deepest part of the ear. It is contained within a system of spaces and canals,
known as the osseous or bony labyrinth, in the temporal bone. These spaces and canals
are divided into three sections: the vestibule, which contains two balance
organs, the utricle and saccule; the semicircular canals, located
behind the vestibule, and the cochlea. The spaces between
the bony walls of the osseous labyrinth and the membranous labyrinth are filled
with one of several types of fluid, which deliver nutrients to the cells of the
inner ear; provide the chemical environment needed for transfer of energy from a
vibratory stimulus to a neural signal; and function as the medium to carry vibratory
stimuli from the oval window to the sensory structures along the cochlear partition.
Inside the head localization (IHL).
The name given to the physical energy with which a sound is present. It contrasts
with "loudness," which is the perceptual experience approximately correlated
with that physical intensity.
Intimacy (or presence).
In concert hall acoustics, a venue is said to have "acoustical intimacy"
if music played in it gives the impression of being played in a small hall. In the
language of the recording and broadcast industries, an intimate hall is said to
have "presence." See also t1 (initial time-delay gap).
A sequence of events is called isochronous if the time separating each pair of successive
events is strictly equal. The absence of isochrony is called anisochrony.
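The definition above can be checked mechanically. The following is a minimal sketch, assuming events are given as a list of onset times in seconds; the function name and the small tolerance (to absorb floating-point rounding) are illustrative choices, not part of the definition.

```python
def is_isochronous(onsets, tolerance=1e-9):
    """Return True if every pair of successive onsets is separated by the same interval."""
    # Onset-to-onset intervals between successive events.
    intervals = [later - earlier for earlier, later in zip(onsets, onsets[1:])]
    # Compare every interval with the first one; an empty list counts as isochronous.
    return all(abs(interval - intervals[0]) <= tolerance for interval in intervals)
```

For example, onsets at 0.0, 0.5, 1.0 and 1.5 seconds are isochronous, while onsets at 0.0, 0.5 and 1.2 seconds are anisochronous.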
Abbreviation for just noticeable difference.
Just noticeable difference (jnd).
The smallest change in a stimulus parameter (frequency, intensity, duration) that
can be detected by a listener at a predefined level of performance (e.g., 71 percent
of the time). (See also Weber's law.)
Perceptual proximity of the keys of the Western tonal system. Keys sharing more
pitches are considered to be more closely related than those with fewer pitches
Abbreviation of kiloHertz. (See frequency.)
Lateral Energy Fraction.
Lateral geniculate body.
A peanut-sized area of the brain to which the output of the
retina is sent. Each lateral geniculate body (there are two, one on each
side of the brain) routes its output to the visual cortex.
The identification of a sound that is presented over headphones is described as
"lateralization" rather than localization in recognition of the fact that
sound playback over headphones is generally not "externalized," i.e.,
it is experienced as coming from somewhere between the two ears rather than from
somewhere in the surrounding environment. Lateralization is the identification of
the position of the sound on the left-right dimension. Also referred to as inside-the-head
localization (IHL).
Abbreviation for listener envelopment.
The lateral energy fraction determined by the ratio of the output of a figure-8
microphone with its null axis pointed to the source of the sound, divided by the
output of a non-directional [i.e., omnidirectional] microphone at the same position.
LFE4 is determined for the time period of 0 to 80 milliseconds
and is the average of the LF's in the four frequency bands, 125, 250, 500 and 1000
Hz. It is equal to the ratio of the weighted energy in the sound that does not come
from the direction of the source to that which comes from all directions including
that of the source. LFE4 also correlates with the
apparent source width (ASW).
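The averaging step can be sketched as follows. This is an illustration only: it assumes the energy captured by each microphone in the 0 to 80 millisecond window has already been measured per octave band, and the function name and dictionary layout are hypothetical.

```python
def lfe4(figure8_energy, omni_energy):
    """Average the lateral fraction over the four octave bands named in the definition.

    Both arguments map a band center frequency in Hz to the energy measured
    in the 0-80 ms window (hypothetical inputs supplied by the caller).
    """
    bands_hz = (125, 250, 500, 1000)
    # Lateral fraction per band: figure-8 output divided by omnidirectional output.
    fractions = [figure8_energy[band] / omni_energy[band] for band in bands_hz]
    return sum(fractions) / len(fractions)
```

Calling `lfe4` with made-up figure-8 energies of 0.1, 0.2, 0.3 and 0.2 against omnidirectional energies of 1.0 in every band gives a value of about 0.2.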
In concert hall acoustics, a component of spaciousness referring to a listener's
impression of the strengths and directions from which the reverberant sound seems
to arrive. Listener envelopment is judged highest when the reverberant sound seems
to arrive equally from all directions--forward, overhead, behind.
In concert hall acoustics, a subjective quality related primarily to the reverberation
times at the middle and high frequencies, those above about 350 Hz. A hall can sound
"live" and still be deficient in bass. If a room is sufficiently reverberant
at low frequencies, it is said to sound "warm."
The judgment of the place of spatial origin of a sound. Humans localize sounds based
on two primary cues: interaural intensity difference (IID), and interaural
time difference (ITD). IID refers to the fact that a sound is louder at
the ear it is closer to (the "ipsilateral" ear) for two reasons:
because sound intensity diminishes with distance traveled; and because the head
itself blocks the sound path to the more distant ("contralateral")
ear. ITD refers to the fact that a sound will arrive at the ipsilateral ear before
the contralateral ear. Generally speaking, the ear-brain system uses ITD cues to
determine the spatial origin of low-frequency sounds, and IID cues to determine
the spatial origin of higher frequency sounds. The IID/ITD keys to localization
were first proposed by Lord Rayleigh in the first decade of the 20th century, and
are sometimes referred to as the duplex theory of localization. About 60
years later researchers discovered that, in addition to IID and ITD information,
the brain processes information about the sound source's location based on how its
energy has been accentuated or attenuated in the mid- and high-frequency ranges
by minute time delays caused by the folds and depressions in the listener's
pinnae (and at lower frequencies by the shoulders and upper torso): Because
of the pinna's asymmetry, different angles of sound incidence produce different
characteristic filtering. (The spectral-shaping influence of the pinnae can be readily
verified by trying to localize sound after filling their cavities with putty.) The
effect of filtering by the pinna and upper body is termed the head-related transfer
function (HRTF) and is unique for each individual, similar to a fingerprint.
(In fact, an individual's HRTF is sometimes called his or her
earprint.) Localization accuracy in humans is most precise for sound sources
located in front of the listener and at ear level. Localization is not simply an
auditory process, but includes higher order brain functions which combine learned
responses, complex pattern matching, and cross referencing with other senses in
the brain, resulting in a unified (though not always correct) perception of the
location of a sound source. (See also lateralization
and visual capture.)
A scale in which the logarithm of the physical variable is used instead of the raw
value. This has the effect that equal steps along the scale represent equal ratios
between the raw values. Examples in audition are the decibel scale and the scale
of musical pitch.
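The "equal steps represent equal ratios" property can be illustrated with the two examples named above. This sketch assumes the standard conventions of 10 times the base-10 logarithm for intensity ratios in decibels and 12 equal-tempered semitones per octave; the function names are illustrative.

```python
import math

def level_difference_db(intensity_ratio):
    """Decibel difference corresponding to a ratio of two sound intensities."""
    return 10.0 * math.log10(intensity_ratio)

def interval_semitones(frequency_ratio):
    """Musical interval, in equal-tempered semitones, for a ratio of two frequencies."""
    return 12.0 * math.log2(frequency_ratio)
```

Doubling the intensity always adds about 3 dB whatever the starting value, and doubling the frequency always adds 12 semitones (one octave), whether the starting tone is 110 Hz or 1760 Hz.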
The "hammer" bone of the middle ear.
The process by which one sound (the masker) affects the threshold of audibility
of another sound (the target or probe) when played at the same time. More intense
sounds mask less intense ones. The amount of masking depends on the proximity of
the frequency components (see critical bands,
frequency and harmonic) of the two sounds, as
well as on the global intensity of the masker. The greater the level, the greater
the extent to which a given masker frequency can mask target components at higher
frequencies (see backward recognition masking).
Meatus (also called the external auditory meatus).
The ear canal, leading from the concha to the tympanic membrane
(eardrum). Approximately 1 inch long, the outer one-third of the meatus is cartilaginous;
the remaining two-thirds is bony. Ceruminous (wax) and sebaceous (oil) glands are
plentiful in the cartilaginous segment, and are also found on the posterior and
superior walls of the bony canal. The wax and oil lubricate the canal and help keep
it free of debris and foreign objects.
Mechanoreceptors are the receptors involved in the haptic
(tactile) sensory system and come in four distinct types: Merkel's receptors
and Meissner's corpuscles, both with relatively small receptive fields and
located in the dermal papillae (superficial skin); and pacinian corpuscles
and Ruffini corpuscles, both with larger receptive fields and located deeper
in the skin, i.e., subcutaneously. The smaller receptive fields of the Merkel's
and Meissner's structures allow them to resolve finer spatial details than the pacinian
and Ruffini structures. The four mechanoreceptor types respond to different intensity
and frequency ranges of mechanical stimuli. Meissner's corpuscles are most sensitive
to low-frequency (< 100 Hz) sinusoidal mechanical stimuli; their excitation is
felt as a gentle fluttering in the skin, sometimes termed flutter sense.
In contrast, pacinian corpuscles are maximally sensitive to higher frequency (50-500
Hz) stimuli, which evoke a diffuse, humming sensation in the deeper tissue. Ruffini
corpuscles and Merkel's receptors respond to indentation of the skin. The spatial
distribution of mechanoreceptors is not uniform; the densest distribution can be
found in the fingertips. (See also sensory experience.)
The pattern of ascending and descending pitch changes in a melody.
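As a small illustration, the pattern can be extracted from a list of pitches. The sketch below assumes pitches are given as numbers (for example, frequencies or note numbers) and uses the symbols "+", "-" and "=" as an arbitrary notation for ascending, descending and repeated pitches.

```python
def melodic_contour(pitches):
    """Return the direction of each successive pitch change in a melody."""
    contour = []
    for previous, current in zip(pitches, pitches[1:]):
        if current > previous:
            contour.append("+")   # ascending step
        elif current < previous:
            contour.append("-")   # descending step
        else:
            contour.append("=")   # repeated pitch
    return contour
```

Two melodies with different intervals but the same sequence of symbols share the same contour.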
A hypothetical pattern of mental or brain activity that represents some feature
of the world, of the person, or of the interaction between the person and the world.
A mental program or formula that has been proposed by Jean Piaget and other psychologists
as a means by which people represent the world and regulate their interactions with
it. The concept implies more of an active control mechanism than the concept of
The group of phenomena related to the musical measure. It consists of the hierarchical
ordering of the piece of music into units of equal duration (beats;
see also hierarchy). This ordering is indicated by the
time signature at the beginning of the score. From a phenomenological point of view,
the presence of a metric organization in the heard piece is evidenced by the fact
that one can tap one's foot or dance in synchrony with the music.
A six-sided cavity between the outer ear and the
inner ear, principally containing the ossicles (often called the "hammer"
[malleus], "anvil" [incus] and "stirrup" [stapes], the three
smallest bones in the body); two muscles, the tensor tympani and the stapedius;
and the opening to the Eustachian tube. Sound is transformed
at the middle ear from acoustical energy at the eardrum to mechanical energy at
the ossicles; the ossicles convert the mechanical energy into fluid pressure within
the inner ear via motion at the oval window.
The phenomenon of the "missing fundamental" is one in which the listener,
presented with a harmonic tone in which the fundamental is absent, hears the same
pitch as would be heard if the fundamental had been present. Therefore, only some
of the harmonics are needed to hear the pitch. The pitch that is heard when the
fundamental is absent is called periodicity pitch because the period of the
wave is the same whether the fundamental is present or not.
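The claim that the period is unchanged can be checked numerically. The sketch below assumes the harmonics are given as whole-number frequencies in Hz, so that the repetition rate of the summed wave is their greatest common divisor; the function name and the example frequencies are illustrative.

```python
from functools import reduce
from math import gcd

def implied_fundamental_hz(partials_hz):
    """Repetition rate (Hz) of a wave built from the given integer-valued harmonics."""
    return reduce(gcd, partials_hz)

with_fundamental = [200, 400, 600, 800]
without_fundamental = [400, 600, 800]

# Both sets repeat 200 times per second, i.e. with the same period.
same_rate = implied_fundamental_hz(with_fundamental) == implied_fundamental_hz(without_fundamental)
```

In both cases the waveform repeats every 1/200 of a second, which is consistent with the listener hearing the same pitch.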
That part of a sound field, usually within about two wavelengths from a sound source,
where there is no simple relationship between sound level and distance.
A system composed of many simple processing units, formally mimicking the operation
of nerve cells, which are connected together in complex patterns of excitation and
inhibition and propagate activation to other units by way of these connections.
The current state of a given unit and the degree to which it excites other units
can be influenced by the success it has had in activating them. Propagated activity
among cells can lead the system to stable states in which the activity of the units
remains relatively constant. These states constitute the "response" of
the system to a given stimulation by the (external or internal) environment. The
main hypothesis concerning this kind of architecture (also called connectionist
or parallel distributed processing networks), is that it is better suited
to modeling the microstructure of cognition than more classical data flow or serial
processing models: processing, representation and memory are postulated to be distributed
over units in the net rather than being constrained to specific storage locations
and processing routines.
A nerve cell. A neuron's job is to take in information from the cells that feed
into it; to integrate (sum up) that information; and to deliver that integrated
information to the next neuron. The information is usually conveyed in the form
of brief nerve impulses. In a given cell, one impulse is the same as any
other; they are "stereotyped" events. Impulse rates vary from one every
few seconds to about 1000 per second. Anatomically, a nerve cell consists of
a globular-shaped cell body with a nucleus, mitochondria and other organelles;
a cylindrical-shaped, signal-transmitting nerve fiber called an axon; and
a number of branching and tapering fibers called dendrites, typically under
one millimeter in length. The entire nerve cell (the cell body, axon and dendrites) is
encased in the cell membrane. The cell body and dendrites receive information
from other nerve cells; the axon, which may be anywhere from less than a millimeter
to more than one meter in length, transmits this information from the nerve cell
to other nerve cells. Near the point where it ends, an axon typically splits into
many smaller branches whose ends come very close to, but do not touch, the cell
bodies or dendrites of other nerve cells. At these regions, called synapses,
information is conveyed from one nerve cell, called the presynaptic cell,
to the next, called the postsynaptic cell. Neural signals originate at a
point near where the axon joins the cell body, and travel down the length of the
axon, away from the cell body and toward the terminal branches. At a terminal, the
information is transferred across the synapse to the next cell or cells by a process
called chemical transmission.
A random waveform. A noise signal whose frequency spectrum contains all audible frequencies is called
white noise. A noise signal that contains all frequencies with equal energy
per octave is called pink noise, commonly used to test loudspeakers. A noise
signal that is filtered, removing higher and lower frequencies and just letting
through a small band of frequencies, is called narrow-band or band-pass
noise. Filtering out the high frequencies starting from a certain cut-off frequency
gives low-pass noise. Taking a noise waveform over a certain time period
and then repeating this segment gives what is called frozen noise.
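Frozen noise is easy to sketch in code. The example below assumes that independent, uniformly distributed random samples are an acceptable stand-in for white noise; the function names, the seed and the sample range are illustrative choices.

```python
import random

def white_noise(n_samples, seed=0):
    """Generate n_samples independent random values in [-1, 1]."""
    rng = random.Random(seed)  # fixed seed so the segment is reproducible
    return [rng.uniform(-1.0, 1.0) for _ in range(n_samples)]

def frozen_noise(segment, repeats):
    """Repeat one noise segment back to back, giving a periodic 'frozen' noise."""
    return list(segment) * repeats
```

A 1000-sample segment repeated three times yields 3000 samples in which every sample equals the one 1000 positions earlier.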
Noise criteria (NC) curves.
A measure of background noise in rooms. Each NC curve is defined by its sound pressure
level at eight octave-band center frequencies: 63, 125, 250, 500, 1000, 2000, 4000
and 8000 Hz. The lower the NC rating, the lower the background noise level. The
preferred range of NC performance for sound-critical spaces (e.g., home theaters,
home media rooms, home listening rooms, concert and opera halls, recital halls and
broadcasting and recording studios) is < NC-20. Factors that must be addressed
in achieving satisfactory NC performance typically include mechanical (HVAC)
design and the construction detailing of the room's envelope, i.e., its walls, ceiling,
floor, windows and doors, in order to reduce noise infiltration from areas exterior
to the room. (See also Room Criteria,
Sound Transmission Class (STC) and Impact Isolation Class
One of the pitch intervals in music. Physically, a note
that is an octave higher than another has a frequency that is twice that of the
Not directly in front of a microphone or loudspeaker.
Pertaining to the ear; aural.
Inflammation of the ear, which may be marked by pain, fever, hearing abnormalities,
deafness, tinnitus, and vertigo.
The external structure of the ear, consisting of the pinna
Outer ear transfer function (OETF).
(See head-related transfer function.)
A major clue to the perception of depth in vision, parallax arises from the relative
motions of near and far objects that are produced when the viewer moves his or her
head up and down or from side to side. See also stereopsis.
Parallel distributed processing.
(See neural net.)
Passing tone. Ornamental notes melodically interleaved
between two notes that are part of the triad (see chord)
of the principal key.
What the perceiver sees or hears as a result of stimulation, as opposed to the physical
reality of the stimulation. The percept may be considered the "object"
of study in perceptual psychology.
The fixing of one's gaze for sometimes very short periods of time in specific areas
as one explores a visual form. These fixation points constitute the zones of perceptual
centration. This term was applied to auditory perception by Frances to designate
the auditory information upon which listeners focus their attention at a given moment.
The impression of perceiving the same object, event or pattern in spite of variations
in stimulus structure, due, for example, to being played louder or softer, faster
or slower, higher or lower, or in different acoustic environments.
The phase is the particular point in a wave that is passing a position in space
at a certain instant of time. Phase is measured in units of degrees, with 360 degrees
representing one complete cycle of the wave. If two tones have the same period and
are occurring at the same time, the temporal lag of one with respect to the other
can be described in terms of phase. If two waves are out of phase by 180 degrees,
the later one is lagging by one-half a period.
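The relationship between phase and temporal lag can be written as a one-line conversion. This is a minimal sketch; the function name is illustrative, and it assumes both tones share the same period, as in the definition.

```python
def phase_to_time_lag(phase_degrees, period_seconds):
    """Convert a phase difference in degrees to a time lag in seconds."""
    # 360 degrees corresponds to one full period of the wave.
    return (phase_degrees / 360.0) * period_seconds
```

For a 500 Hz tone (period 0.002 seconds), a 180 degree phase difference corresponds to a lag of half a period, 0.001 seconds.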
The basic classes of sounds used to form the words of a language. Examples in English
are "k," "oo," and "th." They are often represented
by single written letters.
A hypothetical active process by which a speech sequence that is interrupted by
a noise sound in place of a given phoneme results in the listener's impression of
having heard the phoneme. This effect does not occur if a silent gap is left at
the place where the phoneme normally occurs.
The external, visible, largely cartilaginous appendage of the outer ear. Its perimeter
is demarcated by a ridge-like rim called the helix, which curves down to
the earlobe (lobule) at its bottom. Roughly in the middle is a relatively large,
cup-shaped depression called the concha.
(See head-related transfer function.)
The auditory attribute on the basis of which tones may be ordered on a musical scale.
Two aspects of the notion of pitch can be distinguished in music: one related to
the frequency (or fundamental frequency) of a sound (measured in
Hz) which is called pitch height, and the other related to its place
in a musical scale which is called pitch chroma. Pitch height varies directly
with frequency over the range of audible frequencies. This "dimension"
of pitch corresponds to the sensation of "high" and "low." Pitch
chroma, on the other hand, embodies the perceptual phenomenon of octave equivalence,
by which two sounds separated by an octave (and thus
relatively distant in terms of pitch height) are nonetheless perceived as being
somehow equivalent. This equivalence is demonstrated by the fact that almost all
scale systems in the world in which the notes are named assign the same names to
notes that are roughly separated by an octave, i.e., the labeling system cycles
at every octave. Thus pitch chroma is organized in a circular fashion, with octave-equivalent
pitches considered to have the same chroma. Chroma perception is limited to the
frequency range of musical pitch (50-4000Hz).
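The circular organization of chroma can be sketched numerically. The example assumes 12-tone equal temperament and an arbitrary 440 Hz reference, neither of which is required by the definition; the function name is illustrative.

```python
import math

def pitch_chroma(frequency_hz, reference_hz=440.0):
    """Return the chroma (0-11) of a frequency, counted in semitones from the reference."""
    # Pitch height in semitones relative to the reference, rounded to the nearest step.
    semitones = round(12.0 * math.log2(frequency_hz / reference_hz))
    # Wrapping at 12 makes octave-equivalent pitches share one chroma.
    return semitones % 12
```

Under these assumptions 220 Hz, 440 Hz and 880 Hz differ in pitch height but all map to the same chroma.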
In TDS acoustical measurements, Polar Energy-Time Curves
(ETC) measure the magnitude and time of arrival of reflections, and, importantly,
display the direction of the reflecting surface relative to the microphone placement.
Polar ETC's can thus allow the operator to pinpoint the location of one or many
reflecting surfaces in a concert hall, auditorium, theater, recording studio or
residential playback venue.
The characteristic sound radiation pattern of a microphone or loudspeaker, usually
plotted to show sound sensitivity or output, respectively, at various angles of
The positive or negative direction of an electrical, acoustical or magnetic force.
Two identical signals in opposite polarity are 180 degrees apart at all frequencies.
Polarity is not frequency dependent.
An effect in which the human auditory system suppresses early reflections of a direct
sound, i.e., it "fuses" the direct sound and its early reflections and
localizes the source on the basis of the earlier (i.e., direct) sound. The basis
for the distinction is that the reflections arrive with a certain delay compared
to the direct sound. Precedence effect is sometimes referred to as the law of the
first wavefront or the Haas effect.
Gradual and biologically normal loss of acute hearing with advancing age.
Primary auditory cortex.
(See temporal lobe.)
The travel of sound waves through a medium (e.g., air).
The sense of body position.
A notion introduced by Rosch to designate an abstract representation of a whole
class of objects, of which the prototype would constitute the central tendency.
The study of the relationship between physical measures of sound (e.g., amplitude
and frequency) and the perception of them.
A tone with a sinusoidal waveform is called a pure tone
because it is considered to be the simplest form of tone and sounds pure when played
(See directivity factor.)
The portion of a sound wave in which air molecules are spread apart, forming a region
with lower-than-normal atmospheric pressure. The opposite of
An increase in correct recall rate for the most recently presented items of a list
compared with those presented earlier in the list.
The impression that an object, event or sequence has been experienced before or
In acoustics, the bouncing or return of a sound wave from an object larger than
one-quarter wavelength of the sound. When the object is one-quarter wavelength or
slightly smaller, it also causes diffraction of the
The change in direction of a sound wave that occurs when sound passes from one medium
to another (e.g., from air to glass to air, or through layers of air with different
The phase of one sine wave compared to another.
(See absolute pitch.)
A resonance structure can be described in terms of the relative level produced at
each frequency by a resonating object. Most physical objects (membranes, bars, air
columns, strings) have several modes of vibration that resonate at different frequencies,
thus constituting a complex resonance structure. In the case of speech, these resonance
regions are called formants. The placement of the formants
is a major clue to the identity of a vowel. The way resonant frequencies change
rapidly over time is a clue to the identity of several classes of consonants.
Technically a part of the brain and located on the inner surface of the eyeball,
the retina translates light into nerve signals, which are then routed via the optic
nerve to the lateral geniculate body. The retina consists
of three layers of nerve-cell bodies. The layer at the back of the retina contains
roughly 125 million light receptors, the rods and cones. Rods, which
considerably outnumber cones, are responsible for our vision in dim light and are
out of commission in bright light. The three types of cones do not respond to dim
light but are responsible for our ability to see fine detail and for color vision;
cones are "tuned" to absorb long, medium or short wavelengths of light,
loosely corresponding to red, green and blue. The distribution of rods and cones
varies considerably over the surface of the retina; in the center, where fine-detail
vision is best, is the fovea, which is densely packed with cones. The retina's
middle layer contains three types of nerve cells: bipolar cells, which receive
input from the receptors (i.e., rods and cones); horizontal cells, which
link receptors and bipolar cells; and amacrine cells, which link bipolar
cells and retinal ganglion cells. The layer at the front of the retina contains
approximately 1 million of the aforementioned retinal ganglion cells, whose axons
pass across the surface of the retina, collect in a bundle, and leave the eye to
form the optic nerve.
Reverberant sound field.
A sound field made of reflected sounds in which the time average of the mean square
sound pressure is everywhere the same and the flow of energy in all directions is
equally probable. This requires an enclosed space with essentially no acoustic absorption,
e.g., a reverberation chamber.
In concert hall acoustics, reverberation refers to sound that persists in a venue
after a tone is suddenly stopped. A hall that is reverberant is called a "live"
hall. (See also liveness.) A room that is not reverberant
is called a "dead" or "dry" room.
Reverberation time (RT).
Defined as the time, multiplied by a factor of 2, that it takes for the sound in
a hall to decay from -5 to -35 dB relative to its steady-state value. The factor of 2
is necessary because RT must conform to the original definition of sound decay which
was from 0 to -60 dB. Roughly speaking, RT is the time it takes for a loud sound
to decay to inaudibility after its source is cut off. RT is usually measured in
octave or one-third octave bands. The source of sound may be a pink noise or a sound
impulse. Originally, RT was determined from a plot of sound pressure level vs. time
as recorded on the moving paper of a graphic level recorder. Today it is determined
by the Schroeder (1965) method which involves computer integration of a backward-played
tape recording of the decaying signal. The mid-frequency RT is the average of the
RTs at 500 and 1000 Hz. The measurement is generally made in both occupied and unoccupied
halls, at two positions when occupied or at 8 to 24 positions when unoccupied. The
data in each frequency band at the various positions are averaged. A least-squares
fit to the -5 to -35 dB portion of the decay curve is used in setting the value
of RT for each band and position. The RTs of the largest stone cathedrals can be
nearly 10 seconds; the world's most renowned concert halls typically fall in the
range of 1.8 to 2.2 seconds; opera houses typically fall in the 1.2 to 1.6 second
range; aggressively damped home theaters can exhibit RTs below 0.25 seconds. A venue's
use and its RT must be consonant: A home theater with a 6 second RT would render
movie dialog unintelligible, while a cathedral with a 0.3 second RT would deflate
its sonic grandeur.
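The least-squares step described above can be sketched as follows. This is an illustration under stated assumptions: the decay curve is supplied as matching lists of times (seconds) and levels (dB relative to the steady-state value), at least two points fall in the -5 to -35 dB range, and the function name is hypothetical.

```python
def reverberation_time(times_s, levels_db):
    """Estimate RT from a straight-line fit to the -5 to -35 dB part of a decay curve."""
    # Keep only the portion of the decay used for the fit.
    points = [(t, level) for t, level in zip(times_s, levels_db) if -35.0 <= level <= -5.0]
    n = len(points)
    mean_t = sum(t for t, _ in points) / n
    mean_level = sum(level for _, level in points) / n
    # Ordinary least-squares slope, in dB per second (negative for a decay).
    slope = (sum((t - mean_t) * (level - mean_level) for t, level in points)
             / sum((t - mean_t) ** 2 for t, _ in points))
    time_for_30_db = 30.0 / -slope
    # Multiply by 2 to conform to the original 60 dB definition.
    return 2.0 * time_for_30_db
```

A synthetic decay that falls at a constant 30 dB per second gives an RT of 2 seconds with this sketch.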
A sequence of events having a specific set of time intervals between the onsets
of successive events. Sequences having different onset-to-onset intervals are said
to have different rhythmic structures or temporal structures.
The time taken for a signal to rise from silence to full intensity. The tones of
different instruments can be distinguished by their rise time, the tones of percussive
instruments like the piano rising very rapidly and others like the tuba, more slowly.
In music, "rise time" is called "attack" (see
amplitude envelope).
Room criteria (RC) curves.
A measure or specification of background noise from HVAC
systems according to measured sound pressure level at 10 octave-band center frequencies:
16, 31.5, 63, 125, 250, 500, 1000, 2000, 4000 and 8000 Hz. Room Criteria curves
were derived for use in office spaces and are more demanding than
Noise Criteria curves at low frequencies.
Frequencies at which sound waves in a room resonate (in the form of
standing waves), based on the room dimensions.
Root mean square (rms).
The effective DC voltage of an AC signal. The square root of the mean value of the
squares of the instantaneous values of a varying quantity.
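The second sentence of the definition translates directly into code. The sketch below assumes the varying quantity has been sampled into a list of instantaneous values; the function name is illustrative.

```python
import math

def rms(values):
    """Square root of the mean of the squares of the instantaneous values."""
    return math.sqrt(sum(v * v for v in values) / len(values))

# One full cycle of a sine wave with a peak value of 1, sampled at 1000 points.
sine_cycle = [math.sin(2.0 * math.pi * k / 1000) for k in range(1000)]
```

For the sampled sine cycle, `rms(sine_cycle)` is about 0.707, i.e. the peak value divided by the square root of 2.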
(See reverberation time.)
In acoustics, a unit of absorption equal to the absorption of 1 square foot of surface
which is totally sound absorbent. Named after Wallace Clement Sabine, the Harvard
professor honored as the "father of architectural acoustics" for his investigations
into concert hall sound at the turn of the 20th century.
The normal, but largely unnoticed rapid darting of the eyes from one fixed point
A set of pitches (or notes) arranged with certain intervals among them within the
span of an octave (see also pitch). The scale pattern
generally repeats in each octave. Each note constitutes
a degree of the scale. Each diatonic scale consists of intervals between adjacent
notes that are either minor or major seconds (one or two semitones, respectively).
The different arrangements of major and minor seconds yield different modes.
The two most important modes in Western tonal music are the major and minor
modes. The chromatic scale contains all twelve semitone steps within an octave.
Another kind of scale which does not fall within the tonal system but which was
used extensively in the music of Debussy and Ravel is the whole-tone scale,
which has only six notes, all separated by whole tones. Intonation (or tuning system)
refers to the exact tuning of the notes of a given scale system. The most widely
used tuning system in Western music is the equal-tempered system in which
all intervals can be expressed as integer multiples of a standardized semitone.
This system was brought to Europe from China and adopted during the 17th century.
Schroeder integration of reverberation.
In acoustics, an integration of reverberant data in which the last energy is integrated
first and the initial arrival is integrated last, all of which is normalized by
the total. The integration simulates the effect of taking many time measurements
and averaging them together.
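The backward integration can be sketched in a few lines. This is an illustration, assuming the reverberant data is a sampled impulse response with at least one non-zero sample; the function name and the choice to express the result in decibels are assumptions.

```python
import math

def schroeder_decay_db(impulse_response):
    """Backward-integrated energy decay curve, normalized by the total, in dB."""
    energy = [sample * sample for sample in impulse_response]
    remaining = 0.0
    curve = []
    # Integrate from the end: the last energy is summed first, the initial arrival last.
    for e in reversed(energy):
        remaining += e
        curve.append(remaining)
    curve.reverse()
    total = curve[0]  # energy remaining at time zero is the total energy
    # Silent tails would give log10(0), so they are reported as negative infinity.
    return [10.0 * math.log10(c / total) if c > 0.0 else float("-inf") for c in curve]
```

The resulting curve starts at 0 dB and never rises, which makes it a convenient input for a straight-line fit of the decay.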
The "blind spot" in human vision corresponding to the region where the
optic nerve enters the eye, i.e., the oval-shaped area about 2 millimeters in diameter
with no rods or cones. You can "map" your blind spot simply by closing
one eye and gazing at a small object across the room. Hold a Q-Tip at arm's length
directly in front of the object and slowly move it out to the right exactly horizontally.
The white cotton will vanish when it is about 18 degrees out. Now, if you place
the stick so that it runs through the blind spot, it will appear as a single, continuous
stick, without any gap. (This feature is referred to as "completion.")
You are not normally aware of your blind spot, and cannot be, unless you test for
it. You don't see black or white or anything there; you see nothing.
The process by which speech signals are divided into phonemes, syllables or words.
It consists of creating boundaries between groups of elements. In music, segmentation
refers to the process of dividing an event sequence into distinct groups of sounds.
The factors playing a role in segmentation are similar to the principles of grouping
addressed by Gestalt psychology.
The smallest standard musical interval (i.e., step in pitch)
in the Western equal-tempered pitch system (see scale).
All other intervals can be described as containing an integer number of semitones,
e.g., the octave contains 12 semitones, the perfect
fifth 7 semitones, etc. A tone that is a semitone higher than another is approximately
6 percent higher in frequency. There is a semitone separation
between any black key on the piano and its nearest white neighbor or between adjacent
white keys that have no black keys between them.
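The "approximately 6 percent" figure follows from dividing the octave, a doubling of frequency, into 12 equal ratio steps. A minimal sketch, with illustrative names:

```python
# Twelve equal semitone steps must multiply together to give a factor of 2.
SEMITONE_RATIO = 2.0 ** (1.0 / 12.0)

def interval_ratio(semitones):
    """Frequency ratio of an equal-tempered interval spanning the given number of semitones."""
    return SEMITONE_RATIO ** semitones

# Percentage increase in frequency for a single semitone step.
semitone_percent = (interval_ratio(1) - 1.0) * 100.0
```

Here `semitone_percent` comes out near 5.9, the 7-semitone perfect fifth has a ratio close to 1.5, and 12 semitones give a ratio of 2.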
Sensory experiences occur when stimulus energies excite one or more types of receptor
neurons, of which there are five specialized types in
animals: chemoreceptors, mechanoreceptors, thermoreceptors, photoreceptors and nociceptors.
These receptors transduce (change) the form of input energy into a neural (electro-chemical)
signal. A single photon or micrometer of mechanical displacement is sufficient to
excite photoreceptors in the retina or
mechanoreceptors in the skin, respectively. Receptors selectively relay
certain features of the stimulus to the central nervous system. Individual receptors
are tuned to one or several stimulus features. Localization of a sensation is a
function of the size of the receptive field of the receptor. The duration of a sensation
is related to both the duration of the stimulus and the perceived intensity. The intensity
of a sensation is mediated by two mechanisms: Stimuli of increasing intensity evoke
progressively more activity in a receptor, and recruit additional receptors with
higher activation thresholds.
Signal-to-noise ratio (S/N).
The ratio in decibels between signal and noise. An audio component with a high signal-to-noise
ratio has relatively little background noise accompanying the signal; a component
with a low signal-to-noise ratio is noisy.
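As a worked illustration, the ratio can be computed from two measured values. The sketch assumes the signal and noise are given as rms amplitudes (for example, voltages), for which the decibel conversion uses a factor of 20; the function name is illustrative.

```python
import math

def signal_to_noise_db(signal_rms, noise_rms):
    """Signal-to-noise ratio in decibels from rms amplitudes."""
    return 20.0 * math.log10(signal_rms / noise_rms)
```

A signal of 1 volt rms accompanied by 1 millivolt rms of noise corresponds to a ratio of 60 dB.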
The simplest form of periodic wave motion, expressed by the equation y = sin x,
where x is the phase angle in degrees and y is voltage or sound pressure. All other forms can
be created by adding (mixing) a number of sine waves. The wave form of a "pure
tone" is a sine wave.
Having the shape of a sine wave.
In acoustics, a unit of loudness. Defined as the loudness of a 1000 Hz tone 40 dB
above threshold. A millisone is one-thousandth of a sone and is often called
the loudness unit.
Energy that is transmitted by pressure waves in air or other materials and is the objective
cause of the sensation of hearing. Longitudinal vibrations in a medium in the frequency
range 20 Hz to 20 kHz.
Sound Transmission Class (STC).
In acoustics, a single number rating for describing sound transmission loss of a
wall or partition.
In concert hall acoustics, a hall is said to be "spacious" if the music
performed in it appears to the listener to emanate from a source wider than the
visual width of the actual source, and if the listener is noticeably enveloped by
the reverberant sound. The former attribute is often referred to as
apparent source width (ASW); the latter attribute is often referred to as
listener envelopment (LEV).
A description of the frequency content of a sound waveform, usually presented as
a graph with frequency on the abscissa (x axis) and amplitude on the ordinate (y
axis). A pure tone would have a single vertical line at the appropriate frequency
with a height indicating its amplitude. A complex sound (see
complex tone) would have several such lines, indicating the multiple components.
Drawing a curve through the tops of the lines would describe the spectral envelope.
A spectrogram is another representation of a spectrum in which the time component
is reintroduced: time is represented on the abscissa, frequency on the ordinate,
and amplitude is coded as the darkness of the trace at a given frequency and time.
In an auditory neural spectrogram, instead of a continuous signal, the probability
of occurrence of nerve spikes at a given moment in time is represented. The frequency
axis is replaced by a frequency-specific auditory nerve channel (see
basilar membrane). A third type of spectral representation called a time-frequency
perspective plot is drawn in three dimensions, with time along the x axis, amplitude
along the y axis, and frequency along the z axis.
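The line spectrum described above can be sketched numerically with a discrete Fourier transform. The example below is only an illustration: the sample rate, tone frequencies and amplitudes are made up, and the plain-Python DFT is chosen for clarity rather than speed.

```python
import cmath
import math


def amplitude_spectrum(samples, sample_rate):
    """Return (frequency_hz, amplitude) pairs from 0 Hz up to the Nyquist frequency."""
    n = len(samples)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        # DC and (for even n) the Nyquist bin have no mirrored partner to fold in.
        scale = 1.0 if k == 0 or 2 * k == n else 2.0
        spectrum.append((k * sample_rate / n, scale * abs(coeff) / n))
    return spectrum


# A complex tone: 100 Hz at amplitude 1.0 plus 200 Hz at amplitude 0.5.
SAMPLE_RATE = 800
N = 80  # gives 10 Hz spacing between spectral lines
tone = [math.sin(2 * math.pi * 100 * i / SAMPLE_RATE)
        + 0.5 * math.sin(2 * math.pi * 200 * i / SAMPLE_RATE)
        for i in range(N)]

# Keep only the lines that stand clearly above numerical round-off.
peaks = [(f, a) for f, a in amplitude_spectrum(tone, SAMPLE_RATE) if a > 0.01]

if __name__ == "__main__":
    for freq, amp in peaks:
        print(f"{freq:6.1f} Hz  amplitude {amp:.3f}")
```

Repeating this analysis on successive short segments of a longer signal, and stacking the results along a time axis, is one common way to build a spectrogram.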
A mirror-like reflection of sound from a flat surface; reflections that do not spread
A measure of sound clarity that indicates the ease of understanding speech. It is
a complex function of psychoacoustics, signal-to-noise ratio of the sound source,
and direct-to-reverberant energy within the listening environment.
Speed of sound.
In air, approximately 1130 feet per second at 20 degrees Centigrade.
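The speed varies with temperature. The sketch below uses a common linear approximation for dry air, c = 331.3 + 0.606 T meters per second with T in degrees Celsius. That formula is an assumption brought in for illustration and is not part of the definition above.

```python
METERS_PER_FOOT = 0.3048


def speed_of_sound_m_per_s(temp_celsius: float) -> float:
    """Approximate speed of sound in dry air (linear approximation, assumed here)."""
    return 331.3 + 0.606 * temp_celsius


def speed_of_sound_ft_per_s(temp_celsius: float) -> float:
    """Same approximation, converted to feet per second."""
    return speed_of_sound_m_per_s(temp_celsius) / METERS_PER_FOOT


if __name__ == "__main__":
    print(f"{speed_of_sound_m_per_s(20.0):.1f} m/s")
    print(f"{speed_of_sound_ft_per_s(20.0):.0f} ft/s")
```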
A square wave is one in which there are only two values of the displacement of the
wave from the neutral position, a positive displacement and an equally large negative
displacement. The wave moves instantaneously from one state to the other and remains
equally long in each state. Its spectrum contains odd harmonics only, whose amplitudes
are inversely proportional to the harmonic number.
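The harmonic recipe can be checked by summing sine waves. A minimal sketch, assuming the usual Fourier series of a unit square wave (odd harmonics n with relative amplitude 1/n, overall scale 4/pi); the function name is hypothetical.

```python
import math


def square_wave_partial_sum(x: float, highest_harmonic: int) -> float:
    """Sum the odd harmonics 1, 3, 5, ... up to highest_harmonic at phase x (radians)."""
    total = 0.0
    for n in range(1, highest_harmonic + 1, 2):  # odd harmonics only
        total += math.sin(n * x) / n             # amplitude falls off as 1/n
    return 4.0 / math.pi * total


if __name__ == "__main__":
    for limit in (1, 9, 199):
        mid_positive = square_wave_partial_sum(math.pi / 2, limit)
        print(f"harmonics up to {limit:3d}: value at mid-plateau = {mid_positive:.4f}")
```

As more harmonics are added, the value in the middle of each half-cycle approaches +1 or -1, the two states of the square wave.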
In concert hall acoustics, the measure of the degree of support that the hall, including
the walls and ceiling of the hall and of the enclosure immediately surrounding the
players, gives to the players on stage. It is the difference, in decibels, between
the impulse sound energy from an omnidirectional sound source that arrives at a
player's position within the first 10 milliseconds, measured at a distance of 1
meter from the sound source, and that which arrives in the time interval between
20 and 100 milliseconds at the same position. The sound arriving in the later interval
has been reflected from one or more surfaces surrounding the player's position on
the stage, and its strength, minus the strength of the sound in the first 10 milliseconds,
gives the degree of support. The measurement is made with the chairs, music stands and percussion in place, except that those
near the source and receiver are set aside. The measurements are made at several
positions and the data are averaged.
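The decibel difference described above can be sketched from a sampled impulse response. This is an illustration only: the function names are hypothetical, the impulse response is synthetic, and energy is taken as the sum of squared pressure samples in each time window.

```python
import math


def window_energy(impulse_response, sample_rate, start_ms, end_ms):
    """Sum of squared samples from start_ms (inclusive) to end_ms (exclusive)."""
    start = int(round(start_ms * sample_rate / 1000.0))
    end = int(round(end_ms * sample_rate / 1000.0))
    return sum(p * p for p in impulse_response[start:end])


def stage_support_db(impulse_response, sample_rate):
    """Energy arriving at 20-100 ms relative to energy in the first 10 ms, in dB."""
    early = window_energy(impulse_response, sample_rate, 0.0, 10.0)
    late = window_energy(impulse_response, sample_rate, 20.0, 100.0)
    return 10.0 * math.log10(late / early)


# Synthetic impulse response at 1000 samples per second (one sample per ms):
# direct sound at 0 ms, one reflection at 30 ms with one-tenth the pressure.
ir = [0.0] * 200
ir[0] = 1.0
ir[30] = 0.1

if __name__ == "__main__":
    print(f"stage support: {stage_support_db(ir, 1000):.1f} dB")
```

In practice the value would be computed at several positions and averaged, as the definition states.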
In acoustics, an apparently stationary waveform created by multiple reflections
between opposite room surfaces. At certain points along the standing wave, the direct
and reflected waves cancel, and at other points the waves add together or reinforce
each other. These are sometimes called room modes.
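For a pair of parallel surfaces, the frequencies at which such standing waves form can be estimated with the standard axial-mode relation f = n c / (2 L), where L is the distance between the surfaces and c is the speed of sound. That relation and the values below are assumptions added for illustration.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air near 20 degrees C


def axial_mode_frequencies(distance_m: float, count: int,
                           speed: float = SPEED_OF_SOUND_M_PER_S):
    """First `count` axial mode frequencies (Hz) between two parallel surfaces."""
    return [n * speed / (2.0 * distance_m) for n in range(1, count + 1)]


if __name__ == "__main__":
    # Hypothetical room dimension: walls 5 m apart.
    for n, freq in enumerate(axial_mode_frequencies(5.0, 3), start=1):
        print(f"mode {n}: {freq:.1f} Hz")
```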
The smallest muscle in the body, located in the middle ear.
Contraction of the stapedius pulls the stapes, altering the mechanical efficiency
of the ossicular chain.
(See Sound Transmission Class.)
Stereopsis. The most important mechanism for assessing
depth in human vision. First enunciated in 1838 by Sir Charles Wheatstone (who also
invented the "Wheatstone bridge" in electricity), stereopsis depends on
the slight differences in the two pictures projected on the
retinas. (See also parallax.)
(See analytic listening.)
t1 (initial time-delay gap or ITDG).
In concert hall acoustics, the time interval, measured in milliseconds, between
the arrival at a seat in the hall of the direct sound from a source on stage and
the arrival of the first significant reflection. It corresponds with the subjective
impression of "intimacy."
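When the path lengths are known, the gap can be estimated from geometry: the extra distance traveled by the first reflection divided by the speed of sound. A minimal sketch with made-up distances and an assumed speed of sound:

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumed value for air near 20 degrees C


def initial_time_delay_gap_ms(direct_path_m: float, reflected_path_m: float,
                              speed: float = SPEED_OF_SOUND_M_PER_S) -> float:
    """Delay (ms) between the direct sound and the first reflection at a seat."""
    if reflected_path_m < direct_path_m:
        raise ValueError("the reflected path cannot be shorter than the direct path")
    return (reflected_path_m - direct_path_m) / speed * 1000.0


if __name__ == "__main__":
    # Hypothetical seat: 20 m from the source, first reflection travels 26.86 m.
    print(f"ITDG = {initial_time_delay_gap_ms(20.0, 26.86):.1f} ms")
```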
Abbreviation for Time Delay Spectrometry.
A computer-based platform for measuring audio devices and acoustic environments,
manufactured by Techron and more recently Goldline under license from the Jet Propulsion
Laboratory, Pasadena, California. See also Time Delay Spectrometry.
The speed of occurrence of the beats for a given metric structure. In a musical
score, the tempo is specified in terms of the number of metric units per minute,
for example, quarter-note = 60, in which the time value of each quarter-note is
1 second. The inverse of tempo, the time between beats, is called the
An adjective meaning "pertaining to time."
The degree to which the auditory system can resolve, or separately distinguish,
events separated by extremely brief time periods.
Temporal coherence boundary.
Defines the threshold for hearing a repeating two-tone sequence as composed of a
single auditory stream across a range of frequency differences between the tones
and rates of tone presentation when the listener is trying to hear a single stream.
Above the boundary, the sequence is always heard as two streams. Below it, the sequence
may be heard as a single stream. This boundary is contrasted with the fission boundary,
which defines the threshold for hearing the same kind of repeating sequences when
the listener is trying to hear two separate streams. Above the fission boundary,
the sequence may be heard as two streams, but below it the sequence is always heard
as a single stream.
A region of the lateral part of cortex (just center of and slightly behind the ears)
concerned with audition and containing primary auditory cortex (i.e., the first
cortical area to which auditory signals are relayed, also known under the name of
(See rhythm pattern.)
Like the stapedius, a small muscle in the
middle ear. Contraction of the muscle increases the stiffness of, and thus
lessens the amount of energy conducted by, the ossicular chain. Though to a significantly
lesser extent than the stapedius, the tensor tympani is involved in the acoustic reflex,
which is the automatic, protective response of the intratympanic muscles to intense
sound stimulation.
In TDS measurements, the 3-D display shows the change
in magnitude/frequency response versus time for a number of individual TDS sweeps.
Each sweep is offset in time by a constant amount, and together on the screen the sweeps form a three-dimensional
surface display sometimes called a "waterfall" plot. The three dimensions
are time, energy and frequency.
A division of Lucasfilm, San Rafael, California. Also, a set of specifications for
the enhancement of sound playback in the residential environment.
Also referred to as sound quality or sound color. The classic negative definition
of timbre is: the perceptual attribute of sound that allows a listener to distinguish
among sounds that are otherwise equivalent with respect to pitch, loudness, and
subjective duration. Contemporary research has begun to decompose the attribute
into several perceptual dimensions of a temporal, spectral, or spectro-temporal
Time Delay Spectrometry (TDS).
A method, conceived by Richard Heyser, that permits a spectrum
that has been delayed to be measured with the signal delay removed. TDS measures
in the frequency domain, then transforms the results mathematically for interpretation
in the time, energy or frequency domains. The principal advantages of TDS measurements
are superior noise and distortion rejection properties, fast data gathering capability,
and the ability to make acoustical measurements under actual use situations. TDS
measurements include the frequency response, phase response, and time response data
associated with other techniques, plus energy-time curves,
polar energy-time curves, and energy-time-frequency
curves (3-D displays).
A sensation of noise, frequently of ringing, in the ears. Tinnitus aurium
refers to a subjective sensation of noises in the ears. Objective tinnitus refers
to abnormal or pathological sounds originating within the body, in the region of
the ear, which are audible to others than the subject.
A set of musical rules that characterize Western music since the Baroque (17th century),
Classical, and Romantic styles. This system is still quite prominent in the large
majority of traditional and popular musics of the Western world. Other musical systems
in use in the West do not conform to these rules, and are consequently called non-tonal
The principal note or chord of a key in the Western
Instabilities present in the oscillation pattern of a physical object that is set
into vibration before the object settles into a stable oscillation. Also called
attack transients (see amplitude envelope). Similar
oscillatory instabilities ("legato transients") can be observed when the
object changes state suddenly as occurs when a musical instrument changes pitch
(by changing fingering on a woodwind instrument, pushing on a valve or piston in
a brass instrument, pressing down on a string with a finger, or lifting one up on
a string instrument). Transients are often characterized by a noisy or inharmonic
spectrum. (See also harmonicity,
In the home entertainment context, pertaining to the auditory, visual and
haptic sensory modalities.
Tuning system, musical.
Tympanic membrane (eardrum).
A thin, translucent, elliptically-shaped and slightly concave membrane at the end
of the meatus. The eardrum is made up of four layers.
The outermost layer is continuous with the skin of the meatus, and the innermost
layer is continuous with the mucous membrane of the middle ear.
Of the two inner layers, the outer layer is composed of radial fibers, while the
inner layer is composed of non-radial fibers. The tympanic membrane attaches to
the malleus (hammer) of the middle ear.
Virtual auditory environment.
A perceived auditory environment which has been manipulated
so that it does not correspond to the immediate physical environment. A trivial
example is the use of headphones, which typically foster the sense of sound originating
within the head, while the physical situation contains two sound sources located
on either side of the head.
The phenomenon in which visual perception dominates when visual cues and other sensory
cues (auditory, haptic, etc.) are in direct conflict. In audio design, the effect allows
a loudspeaker to be placed at some distance away from a video display without the
audience perceiving the disparity in location between the visual event generated
on the screen, and the sonic event generated in the distant speaker. There are limits
to vision's tendency to "overpower" the other senses: In the case of audio
design, the limits can be usefully defined in terms of angular disparity, beyond
which the audience "hears" the sonic event as being spatially distinct
from, and thus conflicting with, the locus of the visual event.
In concert hall acoustics, warmth is defined as liveness of the bass, or fullness
of the tone between 75 and 350 Hz, relative to that of the mid-frequency tones (350
to 1,400 Hz). Musicians sometimes describe as "dark" a hall that has too
strong a bass, or whose high frequencies are greatly attenuated.
Discovered by Ernst Heinrich Weber in 1834. States that the smallest detectable
change (jnd) in intensity is a constant fraction of the level
of stimulation. Gustav Fechner extended Weber's law into a psychophysical law in which the magnitude
of sensation (S) is proportional to the logarithm of the magnitude of stimulation (I), or S = k log I. A great deal of psychophysical
research has attempted to establish the Weber-Fechner law for sensory dimensions
other than intensity, e.g., frequency and duration in audition. While the empirical
data conform fairly well to the law over a certain range of values for each dimension,
they can differ substantially at extremes of the range of perceptible values.
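The two relations can be sketched as follows. The Weber fraction, the constant k and the base-10 logarithm are illustrative assumptions, since the entry does not fix them; intensity is taken relative to a reference of 1.

```python
import math


def just_noticeable_change(intensity: float, weber_fraction: float) -> float:
    """Weber's law: the jnd is a constant fraction of the current intensity."""
    return weber_fraction * intensity


def fechner_sensation(intensity: float, k: float = 1.0) -> float:
    """Fechner's relation S = k log I (base 10 assumed, I relative to 1)."""
    return k * math.log10(intensity)


if __name__ == "__main__":
    # With a hypothetical Weber fraction of 0.1, the jnd grows with intensity.
    for level in (10.0, 100.0, 1000.0):
        print(f"I = {level:6.0f}  jnd = {just_noticeable_change(level, 0.1):6.1f}  "
              f"S = {fechner_sensation(level):.1f}")
```

Equal ratios of stimulus intensity produce equal steps in sensation under this relation, which is why each tenfold increase above raises S by the same amount.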