Emotions of high arousal, such as fear or joy, are associated with an increase in amplitude, F0, F0 range, F0 variability, jitter, shimmer and speech rate, as well as with fewer and shorter interruptions (inter-vocalization interval). By contrast, emotions of low arousal, such as boredom, induce a low F0, narrow F0 range and low speech rate (Scherer, 1986; Murray
& Arnott, 1993; Bachorowski & Owren, 1995; Banse & Scherer, 1996; Zei Pollermann & Archinard, 2002; Juslin & Scherer, 2005; Li et al., 2007). A recent study showed that source-related parameters linked to phonatory effort (tension), perturbation and voicing frequency allowed good classification of five emotions (relief, joy, panic/fear, hot anger and sadness), but did not allow good differentiation of emotional valence (Patel et al., 2011). Filter-related cues (energy distribution, formant frequencies) have been more rarely considered in studies of emotions (Juslin & Scherer, 2005). However, it seems that spectrum parameters, particularly the energy distribution in the spectrum, F3 and F4, contrary to source-related parameters, differ between emotions of similar arousal but different valence (Banse & Scherer, 1996; Laukkanen et al., 1997; Zei Pollermann & Archinard, 2002; Waaramaa, Alku & Laukkanen, 2006; Waaramaa et al., 2010). Emotion perception studies showed that an increase in F3 is judged as more positive (Waaramaa et al., 2006, 2010). Valence could also
be reflected in other voice quality- and amplitude-related parameters, with positive emotions being characterized by steeper spectral slopes, narrower frequency ranges, less noisy signals (spectral noise), lower amplitude levels and earlier positions of the maximum peak frequency than negative ones (Hammerschmidt & Jürgens, 2007; Goudbeek & Scherer, 2010). Furthermore, the energy is lower in frequency in positive compared with negative emotions in a large portion of the spectrum (Zei Pollermann & Archinard, 2002; Goudbeek & Scherer, 2010). There are several difficulties associated with the study of affective prosody
in humans. First, voice parameters do not only result from the physiological state of the speaker, but also from socio-cultural and linguistic conventions, and more generally from voluntary control of emotion expression. Therefore, psychological, social interactional and cultural determinants of voice production may counteract each other, and act as confounding factors in the study of affective prosody (Scherer, Ladd & Silverman, 1984; Scheiner & Fisher, 2011). Second, interferences can exist between linguistic and paralinguistic domains (i.e. between vocal emotion expression and semantic or syntactic cues). In particular, the investigation of the role of formants in emotional communication is rendered difficult by their linguistic importance. They have been suggested to be crucial for communicating emotional valence, but this hypothesis is difficult to test in humans (Laukkanen et al., 1997; Waaramaa et al., 2010).