Expressive Musical Terms (EMTs) are commonly used by composers as verbal descriptions of musical expressiveness and characters that performers are requested to convey. We suggest a classification of 55 of these terms, based on the perception of professional music performers who were asked to: 1) organize the considered EMTs in a two-dimensional plane in such a way that proximity reflects similarity; and 2) rate these EMTs according to valence, arousal, extraversion, and neuroticism, using 7-level Likert scales. Using a minimization procedure, we found that a satisfactory partition requires these EMTs to be organized in four clusters (whose centroids are associated with tenderness, happiness, anger, and sadness) located in the four quarters of the valence-arousal plane of the circumplex model of affect developed by Russell (1980). In terms of the related positive-negative activation parameters, introduced by Watson and Tellegen (1985), we obtained a significant correlation between positive activation and extraversion and between negative activation and neuroticism. This demonstrates that these relations, previously observed in personality studies by Watson & Clark (1992a), extend to the musical field.

A preeminent function of music is its engagement with emotions (Huron, 2006). Music cannot be composed, performed, or listened to without affective involvement (Juslin & Sloboda, 2001). The way various features of a musical composition such as pitch, harmony, rhythm, tempo, or metric structure can influence listeners' emotional responses has been studied extensively since the early twentieth century at the levels of musical analysis and psychoacoustic research (Gabrielsson & Lindström, 2001a, 2001b). In contrast, in spite of a broad consensus that communication of emotions in music is very much dependent on the expressiveness of the performer's interpretation (Dahl & Friberg, 2004; Gabrielsson & Juslin, 1996; Jensenius, 2007), the way different performances of a given score affect the listener's perception has only been addressed in the last fifty years. It has mostly been studied in the case of Western classical music where there exists a clear distinction between composer and performer (Gabrielsson, 2003). Emotions in the general context of music performance have been the subject of a great deal of research (e.g., Canazza, De Poli, Drioli, Rodá, & Vidolin, 2004; Eerola & Vuoskoski, 2013; Gabrielsson & Juslin, 1996; Juslin, 1997; Juslin & Västfjäll, 2008). However, relatively little attention has been paid to performance instructions concerning musical expression, in the form of verbal descriptions in the score. In this study we therefore concentrate on how these verbal descriptions are interpreted by the performers.

Different terminologies are used by psychologists and musicians to verbally describe expressiveness in music. Psychologists deal with expressiveness in terms of affect or emotions,1 often distinguishing between expressed, perceived, and felt (or evoked) emotions (Gabrielsson, 2001). In the musical context, expressed emotion refers to the emotion the performer tries to communicate to the listeners, while perceived and felt emotions refer to the emotional responses of the listeners. There exist several models depicting affect structure in psychology in general and in music in particular. There is still, however, no consensus on which emotion model or how many emotion categories should be retained (Juslin & Sloboda, 2001). According to Eerola and Vuoskoski (2013), seventy percent of the models originating from empirical investigations of emotion in music belong to either categorical (or discrete) or dimensional approaches.

As reviewed by Yang and Chen (2012), the categorical approach is based on the concept of basic emotions, i.e., a limited number of universal and primary emotion classes, such as happiness, sadness, anger, fear, disgust, surprise, and tenderness, from which all other secondary emotion classes such as amusement, excitement, relief, and satisfaction can be derived (Ekman, 1992; Picard, Vyzas, & Healey, 2001). It assumes that a discrete and independent neural system subserves every emotion (Posner, Russell, & Peterson, 2005). However, the notion of basic emotions in music has been criticized, mostly on the basis that different researchers have come up with different sets of basic emotions (Juslin & Sloboda, 2001) and that, whatever the retained granularity, the number of primary emotion classes is too small to cope with the richness of the perceived emotions in music (Juslin & Laukka, 2004). This led Zentner, Grandjean, and Scherer (2008)—who opt for the approach of the adjective checklist introduced by Hevner (1936)—to propose a music-specific rating scale: the Geneva Emotional Music Scales (GEMS). This list involves several dozen labels organized in a hierarchical classification and extracts the underlying factors that are directly relevant to music (Eerola & Vuoskoski, 2013).

In contrast, the dimensional approach represents emotions in an abstract low- (usually two- or three-) dimensional Euclidian space where the axes correspond to perceptual representations of emotion in the human mind, found by analyzing the correlation between affective terms. One of the most famous dimensional models is the circumplex model of affect, developed by Russell (1980). This model suggests that emotions are organized in a circular order in a planar two-dimensional domain parametrized by valence and arousal (or activation). Valence refers to the intrinsic pleasantness (positive valence) or unpleasantness (negative valence) of an affect, while arousal corresponds to the level of energy or wakefulness. As stated by Posner et al. (2005), “this posits that the two underlying neurophysiological systems of valence and arousal subserve all affective states, and upon this substrate are layered various cognitive processes that interpret and refine emotional experience.” Based on Russell's (1980) model, Watson and Tellegen (1985) proposed the positive activation (PA)—negative activation (NA) or “consensual” model of emotion. This new reference frame is obtained by a rotation of 45° of the valence and arousal axes of Russell's model. PA represents the degree to which a person feels enthusiastic, active, and alert. High PA is characterized by strong feelings of energy and pleasant excitement, while low PA is characterized by feelings of sadness, fatigue, and exhaustion. In contrast, NA represents the level of negative excitement and distress. High NA is related to anger, fear, and nervousness. Low NA corresponds to tranquility and calmness. As noted by the authors, both systems of coordinates represent alternative conceptualizations of the same basic structure (Watson, Wiese, Vaidya, & Tellegen, 1999).
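As a hedged illustration of this 45° rotation, the relation between the two reference frames can be written explicitly. The symbols V (valence) and A (arousal) and the sign convention are assumptions introduced here for illustration: high PA is taken to combine positive valence with high arousal, and high NA to combine negative valence with high arousal, in line with the verbal descriptions above.

```latex
\mathrm{PA} = \frac{V + A}{\sqrt{2}}, \qquad
\mathrm{NA} = \frac{A - V}{\sqrt{2}}
```

Under this convention, a state with negative valence and low arousal (such as sadness or fatigue) has low PA, while a state with positive valence and low arousal (such as calmness) has low NA.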

Musicians, on the other hand, and more specifically composers of classical Western music, commonly use ‘expressive musical terms’ (EMTs)2 as verbal instructions to the performers, not directly related to tempo or dynamics and sometimes expressed in a metaphoric form, concerning their intentions about the musical character of the composition. They can be specified at the beginning of a piece but also all along the composition, depending on the expression and musical character required for the corresponding musical phrases. They directly influence performed sound characteristics such as timbre, dynamics, articulation, and timing. Although widely used for hundreds of years, EMTs have rarely been discussed in the context of the expressed emotions in music. Note, however, a recent study that presents a machine learning analysis of the correlation between ten EMTs (Tranquillo, Grazioso, Scherzando, Risoluto, Maestoso, Affettuoso, Espressivo, Agitato, Con Brio, Cantabile) and audio features of recorded expressive solo violin performances (Li, Su, Yang, & Su, 2015). This study was applied to the development of automatic expressive violin sound synthesis from a mechanical interpretation, by varying vibrato, dynamic, and duration features (Yang, Li, Su, Su, & Yang, 2016). In contrast with the above study that relates EMTs to sound physical properties, we concentrate here on their perception by music performers.

The rationale behind our study can be summarized as follows. From the literature review, it appears that most of the models depicting affect structure in music focus on listeners' perceived and felt emotions, rather than on the expressed emotions that the performer tries to communicate to listeners (Eerola & Vuoskoski, 2013; Gabrielsson & Juslin, 1996; Juslin & Västfjäll, 2008; Swaminathan & Schellenberg, 2015). As a first step towards studying this issue, the main object of this paper is to provide a classification of commonly used EMTs, based on their perception by professional music performers. In addition, relations are suggested between the obtained EMT classification and psychological models discussed in the literature. This permits us to provide a parameterization of the EMTs in terms of psychological concepts. Such results can contribute to the development of a perceptual model of music performance expressiveness, involving notions familiar to the musician community.

Toward a Classification of Musical Expressiveness

We concentrate on a list of 55 EMTs (given in Table 1), collected from the music literature. Most of them are listed, for example, in Danhauser (1950), a very popular music theory textbook among French-speaking musicians. The remaining ones were found on websites such as wikipedia.org (“List of Italian musical terms used in English”) or 8notes.com (“Musical Glossary”). The list covers a broad range of musical characters that are not directly related to tempo (such as Andante, Allegro, … ) or dynamics (such as Calando, … ). All of them are commonly used by composers to specify the kind of musical expression the performer is expected to produce. In this work, we suggest a schematic but robust categorization of the considered EMTs in order to analyze their main features. The specific nuances that each of them can carry (which depend on the musical context, the era, and the composer's style) are, however, beyond the scope of this categorization. Furthermore, although musical instructions concerning the performance expression in languages other than Italian (such as French or German) are found in the musical literature, we opted to limit ourselves to Italian terms, often used by composers of many different nationalities, in order to minimize the confusions which could arise from mixing analogous terms in different languages or cultural environments.

TABLE 1.

List of the Considered EMTs, Together with their English Translation

| EMT | English translation | EMT | English translation | EMT | English translation | EMT | English translation |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Affettuoso | With affection | Delicato | Delicately; refined | Grave | Very slow, solemn | Pomposo | Pompous, ceremonious |
| Agitato | In an agitated manner | Di Bravura | In a florid style; brilliantly | Grazioso | Gracefully; daintily | Religioso | Religiously |
| Amabile | Amiably | Disperato | Desperate | Impetuoso | Impetuously | Risoluto | In a resolute manner |
| Amoroso | Lovingly | Dolce | Sweet and soft | Lagrimoso | Tearfully; mournfully | Rustico | Rustic |
| Animato | Animated | Doloroso | With grief | Leggiero | Lightly | Scherzando | In a sprightly, playful manner |
| Appassionato | Impassioned | Drammatico | Dramatically | Maestoso | Majestically, in a stately fashion | Semplice | Simple |
| Brioso | Brilliantly, with sparkle | Energico | In an energetic manner | Malinconico | Melancholic | Serioso | Seriously |
| Burlesco | Comic; funny | Espressivo | Expressively | Martellato | With great force; hammered | Sostenuto | Sustained |
| Cantabile | In a singing style | Feroce | Fiercely | Mesto | In a pensive, sad manner | Teneramente | Tenderly |
| Capriccioso | Fanciful; capricious | Forza | Force or emphasis | Misterioso | Mysterious | Tranquillo | Tranquilly, calmly, peacefully |
| Comodo | Comfortable | Furioso | Furious | Nobile | In a noble fashion | Tristamente | With sadness |
| Con Fuoco | With fire | Giocoso | Jocosely; humorously | Patetico | Passionately, with great emotion | Vigoroso | Vigorously |
| Con Spirito | With spirit | Giusto | Right; exact; strict | Pesante | Heavily; in a ponderous manner | Vivo | With vivacity, lively |
| Deciso | Decided, with firmness | Grandioso | Grandly | Piacevole | In a pleasing manner | | |

Toward this robust classification, we first gathered the considered EMTs into clusters whose “centroids,” defined as the most representative elements of these clusters, were also found. The suggested classification was aimed at answering two questions: 1) What is the minimum number of clusters needed for obtaining a robust classification of the considered EMTs? 2) What is the characterization of these clusters and what is the connection between the terminologies used by musicians and by psychologists to describe expressiveness in music?

In this framework, two experiments were performed in the Computer Laboratory of the Music Department of Bar-Ilan University with the participation of eleven professional string players (eight violinists, two violists, and one cellist), who specialize in classical music. Their age ranged from 22 to 71 (mean age = 43, SD = 14). They all graduated from renowned music academies in England, France, Germany, Israel, Russia, and the United States. Their professional experience as orchestra, chamber music, and soloist instrumentalists was 20 years on average (SD = 15). The variety of their music education and professional experience has exposed them to the interpretations of different conductors, leaders, or musical partners. This variety was assumed to increase the representativeness of this group and prevent bias that could originate from a single musical school. All the participants were fluent in Hebrew and English, and some of them had knowledge of other languages.

Although the EMTs presented to the participants are part of a universal music language, translation to English was provided (see Table 1) together with the possibility to access online translations to other languages. The participants were asked to confirm they fully understood the meaning of all the 55 EMTs before starting the experiments. A several-minute break took place between the two experiments, and each of them lasted about one hour (a sufficient time for completion by all the participants). The protocols of the experiments were approved by the ethical committee of the Music Department. A remuneration of 250 N.I.S. was offered to each participant.

In Experiment 1, participants were required to arrange the EMTs in a two-dimensional domain, according to their levels of similarity. In Experiment 2, a rating of the EMTs in terms of their given psychological parameters was required. The experiments were performed in this order by all the participants, so that they could organize the EMTs in the first experiment according to their own choices, without being affected by the predefined parameters involved in the second experiment.

Experiment 1: Two-dimensional Spatial Arrangement of the EMTs

Participants were asked to arrange the 55 EMTs listed in Table 1 according to their level of similarity. For this purpose, we chose to follow the approach implemented by Fritz, Blackwell, Cross, Woodhouse, and Moore (2012), for comparing violin timbre adjectives, which is conveniently applicable when dealing with a few dozen qualifiers. An alternative method of analysis is a standard pair-to-pair comparison. However, this method appeared inappropriate in the present study where the large number of pair combinations (55 × 54/2 = 1485 for a set of 55 EMTs) can potentially induce a systematic drift in the participants' ratings between the beginning and the end of the experiment. Furthermore, the procedure we implemented here permits the participants to keep in mind a global view of the arrangement under construction and to correct previous choices throughout the experiment.

In this approach, the EMTs are organized in an Excel spreadsheet, by cutting and pasting them from an initial randomized list, in such a way that words with similar meanings are to be put close to each other and words with different meanings farther apart. The participants were allowed to move each EMT as many times as required to achieve a satisfactory arrangement. The distance between two words thus provides a measure of the similarity degree of their meaning. In the present experiment, an Excel spreadsheet containing 26 columns and 41 rows was used. As an example, Figure 1 displays the data provided by one of the participants. Although the two-dimensional character of the method could be viewed as a limitation, it in fact provides the participants a comprehensive overview of the EMTs, thus making their arrangement according to the perceived similarity relatively easy.

FIGURE 1.

Example of spreadsheet provided by a participant.

As expected, the eleven participants' spreadsheets show a large variability in their appearance. For a majority of them, determining the optimal partitioning of the EMTs into clusters by direct inspection is not straightforward. This suggested developing an algorithmic approach that would bypass the direct visual aspect and extract a global classification of the considered EMTs from the individual spreadsheets, as perceived by the majority of the participants.

ANALYSIS PROCEDURE

As previously mentioned, the aim of this analysis was to optimally gather the considered EMTs into clusters organized around their corresponding centroids and to provide a suitable characterization of the defined groups, leading to a robust classification of these EMTs. A main issue is the number of clusters to be retained. This number should be relatively small in order to make the partition meaningful, but it should nevertheless discriminate elements carrying clearly different expressions. In this context, it is difficult to prescribe totally objective criteria for determining the number of clusters. In the following, we will thus increase it progressively, until obtaining a reasonable partition that discriminates between clearly different characters and also displays stability and robustness. The analysis included three steps.

In the first step, we computed a dissimilarity matrix for each participant, based on the spreadsheet map that participant created. Using as coordinates the column and row indices in the spreadsheet, a generic element of index (i, j) of this matrix is given by the Euclidian distance, on the spreadsheet, between the words found in positions i and j in the EMT list.
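The first step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name, the input layout (one (column, row) pair per EMT, in the fixed order of the EMT list), and the three example placements are all assumptions made here.

```python
import numpy as np

def dissimilarity_matrix(positions):
    """Pairwise Euclidean distances between spreadsheet cells.

    positions: sequence of (column, row) indices, one per EMT, listed in
    the fixed order of the EMT list (hypothetical input layout).
    """
    pos = np.asarray(positions, dtype=float)
    # Broadcast to all (i, j) coordinate differences, then take the norm.
    diff = pos[:, None, :] - pos[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

# Hypothetical placements of three terms on one participant's sheet.
example_positions = [(1, 1), (4, 5), (1, 2)]
D = dissimilarity_matrix(example_positions)
```

The resulting matrix is symmetric with a zero diagonal; one such matrix would be computed per participant.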

The second step was to obtain a global description of the considered EMTs from the individual arrangements made by the participants, in a way that enhances the weight of the estimates benefitting from a broad consensus. For each (i, j) pair of EMTs, we considered the corresponding elements of all the dissimilarity matrices, from which we constructed a histogram by sorting the range of the taken values into M = 10 equally spaced bins (referred to by the index m = 1, …, M), with values denoted d_m(i,j). From the number w_m(i,j) of elements allocated to the bin of index m, we define the quantity

 
$$W_{ij} = \frac{\sum_{m=1}^{M} \left(w_m(i,j)\right)^{s}\, d_m(i,j)}{\sum_{m=1}^{M} \left(w_m(i,j)\right)^{s}}. \tag{1}$$

When the exponent s is taken equal to 1, W_ij approximates the mean value of the (i, j) element of the dissimilarity matrix, when averaged over all the participants. However, due to the limited number of participants, the contribution of a few outliers in the case of a skewed distribution can have a substantial effect on the mean, which can be drawn away from the center of the distribution. This point is clearly exemplified by considering the case of the pair (i = amoroso; j = cantabile). The corresponding mean distance is 5.3, while direct inspection of the histogram shown in Figure 2 indicates that on a majority of individual spreadsheets, the corresponding distance is much smaller. In order to obtain a measure that gives more weight (relative to the mean) to values that have been chosen by a large proportion of the participants, a larger value of the exponent s is preferable. In the present experiment we used s = 6 (a larger exponent produces essentially the same EMT partition). Formula (1) can be considered as approximating a mode measure of central tendency. Indeed, using formula (1) with s = 6, we get W_ij = 1.9, which reflects in a more appropriate way the level of dissimilarity between amoroso and cantabile perceived by most of the participants. The resulting matrix W can be viewed as a “collective dissimilarity matrix” providing a global estimate of the most commonly chosen distances between the i and j EMTs.
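The effect of the exponent s in formula (1) can be illustrated with a short Python sketch. Several choices here are assumptions rather than details given in the text: the bin value d_m is taken as the bin midpoint, the binning is delegated to NumPy's `np.histogram`, and the eleven sample distances are hypothetical values chosen to be right-skewed (they are not the participants' data).

```python
import numpy as np

def collective_dissimilarity(distances, s=6, n_bins=10):
    """Weighted bin average in the spirit of formula (1) for one (i, j) pair.

    distances: the (i, j) entries of all participants' dissimilarity matrices.
    """
    d = np.asarray(distances, dtype=float)
    counts, edges = np.histogram(d, bins=n_bins)  # counts play the role of w_m
    centers = 0.5 * (edges[:-1] + edges[1:])      # assumed d_m: bin midpoints
    weights = counts.astype(float) ** s           # empty bins get zero weight
    return float((weights * centers).sum() / weights.sum())

# Hypothetical, right-skewed distances from 11 participants.
sample = [1, 1, 1, 2, 2, 2, 2, 3, 9, 15, 19]
mean_like = collective_dissimilarity(sample, s=1)  # close to the plain mean
mode_like = collective_dissimilarity(sample, s=6)  # dominated by the fullest bin
```

With s = 1 the three outlying values pull the estimate upward, as they do for the arithmetic mean; with s = 6 the most populated bin dominates, so the estimate stays near the distance chosen by most of the hypothetical participants.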

FIGURE 2.

Histogram of the distance d_m(i,j) between amoroso and cantabile computed from the 11 individual maps.

The third step was then to construct the optimal partition of the ensemble of the n considered EMTs into k groups organized around EMTs considered as the respective centroids. Clustering on the basis of similarities is one of the most widely used techniques for exploratory data analysis. Nevertheless, grouping a set of objects such that similar objects end up in the same group and dissimilar objects are separated into different groups is not necessarily uniquely defined and could even involve specific difficulties. As pointed out by Shalev-Shwartz and Ben-David (2014), “mathematically speaking, similarity (or proximity) is not a transitive relation, while cluster sharing is an equivalence relation and, in particular, a transitive relation.” They exemplify this statement by considering “a very long sequence of objects, x_1, …, x_m such that each x_i is very similar to its two neighbors, x_{i−1} and x_{i+1}, but x_1 and x_m are very dissimilar.” They note that “if we wish to make sure that whenever two elements are similar they share the same cluster, we are led to put all the elements of the sequence in the same cluster. However, in that case, dissimilar elements (x_1 and x_m) are sharing the same cluster, which violates the second requirement.”

To cope with this situation and proceed to an intrinsic determination of the EMTs chosen as centroids, we simultaneously determine the optimal partition of the EMTs into clusters together with their centroids. This is done by minimizing the sum of the distances of the elements of a cluster to the respective centroid. The few elements located at the boundary of a cluster can indeed be rather close to elements belonging to another cluster, but their distance to the centroid of their own cluster is smaller than their distance to the centroids of the other clusters.

When considering all the possible partitions, the number n!/[(n − k)! k!] of possibilities of selecting k centroids among n elements grows rapidly with k (for n = 55, this number is 1485 for k = 2, 26235 for k = 3, and 341055 for k = 4; for k = 5 it is close to 3.5 million). For each set of k EMTs chosen as centroids, we first associate each of the other (n − k) EMTs with the one among the k centroids that is located at the minimal distance as given by the W matrix. We then calculate the sum S of the distances of the (n − k) EMTs to the k selected ones they are respectively connected to. Finally, we select as optimal centroids the set of k EMTs that provides the minimum value S_min of S. This corresponds to the optimal partition of the EMTs into k clusters. Figure 3 shows the values of S for all partitions with k = 4, sorted in increasing order. In order to take into account the experimental uncertainties and test the robustness of the results, we permitted some tolerance and considered all the cluster arrangements for which (S − S_min)/S_min < 5%. This threshold corresponds approximately to the first slope reduction in the variation of S from its minimum value, as seen in the insert inside Figure 3, in the case k = 4, which is analyzed in detail in the following.
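The exhaustive search described above can be sketched as follows. This is an illustrative reading of the procedure rather than the authors' code: the function names are hypothetical, and the six-element distance matrix is a toy example (points on a line) standing in for the 55 × 55 matrix W.

```python
from itertools import combinations
from math import comb

import numpy as np

def ranked_centroid_sets(W, k):
    """Evaluate S for every choice of k centroids; return (S, centroids) sorted by S."""
    W = np.asarray(W, dtype=float)
    n = W.shape[0]
    results = []
    for centroids in combinations(range(n), k):
        others = [i for i in range(n) if i not in centroids]
        # Each non-centroid element is attached to its nearest centroid;
        # S sums these minimal distances over the (n - k) elements.
        S = W[np.ix_(others, list(centroids))].min(axis=1).sum()
        results.append((float(S), centroids))
    results.sort(key=lambda item: item[0])
    return results

def within_tolerance(results, tol=0.05):
    """Keep the centroid sets whose relative excess over S_min is below tol."""
    s_min = results[0][0]
    if s_min == 0:  # degenerate case, guarded to avoid division by zero
        return [r for r in results if r[0] == 0]
    return [r for r in results if (r[0] - s_min) / s_min < tol]

# Toy example: six points on a line forming two well-separated groups.
coords = np.array([0.0, 1.0, 2.0, 10.0, 11.0, 12.0])
W_toy = np.abs(coords[:, None] - coords[None, :])
ranking = ranked_centroid_sets(W_toy, k=2)
best_S, best_centroids = ranking[0]
robust_sets = within_tolerance(ranking, tol=0.05)
```

For n = 55 and k = 4 the loop visits `comb(55, 4)` = 341055 centroid sets, which remains tractable for a brute-force search; the list returned by `within_tolerance` corresponds to the set of partitions examined for robustness.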

FIGURE 3.

S versus partition index, sorted in S-increasing order, in the case k = 4. The insert provides detail near the origin.

RESULTS AND DISCUSSION

Partition of the EMTs into clusters

The suitable number of clusters to be used is not known a priori. To determine it, we performed the above analysis with a number k of clusters ranging from two to five and compared the resulting partitions and associated centroids.

Table 2 displays the optimal two-cluster partition, with the list of the EMTs belonging to each of the clusters, together with (in parentheses) their distances to the centroid of the cluster they belong to. We notice that a large majority of the EMTs in the first cluster correspond to characters with relatively low energy, while in the second one more energetic characters are dominant. This classification makes sense and appears to be robust when considering the ten partitions satisfying the 5% tolerance condition. The obtained centroids for Cluster 1 are amoroso (three occurrences), cantabile (two occurrences), piacevole (two occurrences), lagrimoso (two occurrences), and religioso (one occurrence), and for Cluster 2 risoluto (four occurrences), energico (five occurrences), and patetico (one occurrence). These partitions are, however, not sufficiently refined, in the sense that they are limited to the energetic aspect. Another shortcoming of the optimal two-cluster partition is that the EMTs giocoso, maestoso, and vivo, which can be viewed as highly energetic, are found in Cluster 1, whose centroid is cantabile and which is dominated by low-energy terms. On the other hand, the EMTs mesto, religioso, sostenuto, and grave, located at the bottom of Cluster 2 (thus at large distances from the centroid risoluto), are not highly energetic terms. This suggests that two clusters do not provide a sufficiently refined partition.

TABLE 2.

Optimal Two-cluster Partition with, for Each Element, the Distance to the Centroid of the Cluster to Which it Belongs

Cluster 1:
Cantabile (0.00), Amoroso (1.86), Affettuoso (2.02), Nobile (2.35), Teneramente (2.57), Dolce (2.84), Amabile (2.95), Espressivo (3.25), Disperato (4.30), Piacevole (4.40), Tranquillo (4.53), Leggiero (5.28), Delicato (5.31), Giocoso (5.56), Lagrimoso (5.81), Doloroso (5.98), Grazioso (6.38), Maestoso (6.46), Malinconico (6.87), Comodo (7.28), Semplice (7.34), Vivo (8.12), Tristamente (8.27), Misterioso (8.79) 
Cluster 2:
Risoluto (0.00), Deciso (1.57), Martellato (2.51), Feroce (2.75), Di Bravura (2.79), Vigoroso (2.82), Giusto (2.96), Furioso (3.39), Patetico (3.78), Forza (3.79), Pesante (3.88), Serioso (4.07), Appassionato (4.12), Drammatico (4.67), Grandioso (4.70), Energico (4.73), Rustico (4.97), Impetuoso (5.72), Con Spirito (5.84), Agitato (6.02), Con Fuoco (6.17), Animato (6.38), Brioso (6.50), Capriccioso (6.91), Burlesco (7.07), Scherzando (7.20), Mesto (7.47), Pomposo (7.60), Religioso (8.17), Sostenuto (8.20), Grave (9.37) 

We thus repeated the analysis with three clusters. In this case, the number of partitions satisfying the 5% tolerance constraint is increased to 31. The centroids associated with the optimal partition are amoroso, risoluto, and tristamente. Among the partitions obeying the 5% criterion, we observe that these EMTs appear 19, 15, and 10 times as centroids of Clusters 1, 2, and 3 respectively. Additional frequently found centroids are giocoso (six times for Cluster 1), energico (eight times for Cluster 2), and lagrimoso (eight times for Cluster 3). Other centroids obtained for Cluster 1 are cantabile (four occurrences), animato (one occurrence), and con spirito (one occurrence). For Cluster 2, they are forza (three occurrences), drammatico (two occurrences), di bravura, agitato, and vigoroso (one occurrence each), while for Cluster 3, they are disperato (three occurrences), cantabile, doloroso, malinconico, and mesto (two occurrences each), and piacevole and religioso (one occurrence each). We note that the centroids of Cluster 2 all refer to nuances concerning powerful and energetic character. However, the most frequent centroids of Cluster 1 are amoroso (which in particular arises in the optimal partition) and giocoso (which arises in the next to optimal one), although they refer to very different characters. Also, among the centroids of Cluster 3, piacevole strongly differs from lagrimoso. These observations suggest that the three-cluster partition is not sufficiently stable and that a partition involving a larger number of clusters is necessary.

When considering the partitions of the EMTs in four clusters, much more satisfactory results are obtained. Table 3, which displays the centroid quadruplets associated with the 19 partitions satisfying a 5% tolerance, shows a strong robustness. Each column can indeed be viewed as nuances of the same idea: as centroids of Cluster 1, we get amoroso (12 occurrences), cantabile (five occurrences), and piacevole (two occurrences). For Cluster 2, we get giocoso (nine occurrences), capriccioso (four occurrences), animato (three occurrences), and burlesco (three occurrences). For Cluster 3, we get risoluto for all the partitions. For Cluster 4, we get tristamente (nine occurrences), lagrimoso (six occurrences), malinconico (two occurrences), doloroso, and religioso (one occurrence each). In particular, the first two centroid quadruplets differ only by the substitution of tristamente by lagrimoso, whose meanings are very similar. Table 4 displays the optimal four-cluster partition and the distances of the elements to the centroid of their respective cluster.
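The occurrence counts quoted above can be checked directly against the centroid quadruplets of Table 3. The sketch below transcribes the 19 quadruplets in order of increasing S and tallies each column; the variable names are illustrative choices made here.

```python
from collections import Counter

# Centroid quadruplets (Clusters 1 to 4) transcribed from Table 3,
# listed in order of increasing S.
quadruplets = [
    ("Amoroso", "Giocoso", "Risoluto", "Tristamente"),
    ("Amoroso", "Giocoso", "Risoluto", "Lagrimoso"),
    ("Amoroso", "Capriccioso", "Risoluto", "Tristamente"),
    ("Cantabile", "Giocoso", "Risoluto", "Tristamente"),
    ("Cantabile", "Giocoso", "Risoluto", "Lagrimoso"),
    ("Amoroso", "Animato", "Risoluto", "Tristamente"),
    ("Amoroso", "Burlesco", "Risoluto", "Tristamente"),
    ("Amoroso", "Capriccioso", "Risoluto", "Lagrimoso"),
    ("Amoroso", "Giocoso", "Risoluto", "Malinconico"),
    ("Amoroso", "Animato", "Risoluto", "Lagrimoso"),
    ("Amoroso", "Burlesco", "Risoluto", "Lagrimoso"),
    ("Cantabile", "Capriccioso", "Risoluto", "Tristamente"),
    ("Cantabile", "Animato", "Risoluto", "Tristamente"),
    ("Amoroso", "Giocoso", "Risoluto", "Doloroso"),
    ("Piacevole", "Giocoso", "Risoluto", "Tristamente"),
    ("Cantabile", "Burlesco", "Risoluto", "Tristamente"),
    ("Amoroso", "Capriccioso", "Risoluto", "Malinconico"),
    ("Piacevole", "Giocoso", "Risoluto", "Lagrimoso"),
    ("Amoroso", "Giocoso", "Risoluto", "Religioso"),
]

# One Counter per cluster column: how often each EMT serves as centroid.
counts_per_cluster = [Counter(column) for column in zip(*quadruplets)]
for cluster_index, counts in enumerate(counts_per_cluster, start=1):
    print(f"Cluster {cluster_index}:", counts.most_common())
```

A column dominated by one or two closely related terms, as for Cluster 3 where risoluto is the only centroid, is what the text describes as robustness of the partition.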

TABLE 3.

Centroids of the Four-cluster Partition for Which the Relative Deviation of S From its Minimal Value is Smaller than 5%

| Centroid of Cluster 1 | Centroid of Cluster 2 | Centroid of Cluster 3 | Centroid of Cluster 4 | S |
| --- | --- | --- | --- | --- |
| Amoroso | Giocoso | Risoluto | Tristamente | 180.43 |
| Amoroso | Giocoso | Risoluto | Lagrimoso | 182.59 |
| Amoroso | Capriccioso | Risoluto | Tristamente | 183.73 |
| Cantabile | Giocoso | Risoluto | Tristamente | 183.82 |
| Cantabile | Giocoso | Risoluto | Lagrimoso | 184.75 |
| Amoroso | Animato | Risoluto | Tristamente | 184.79 |
| Amoroso | Burlesco | Risoluto | Tristamente | 185.89 |
| Amoroso | Capriccioso | Risoluto | Lagrimoso | 185.90 |
| Amoroso | Giocoso | Risoluto | Malinconico | 186.04 |
| Amoroso | Animato | Risoluto | Lagrimoso | 187.56 |
| Amoroso | Burlesco | Risoluto | Lagrimoso | 187.58 |
| Cantabile | Capriccioso | Risoluto | Tristamente | 187.92 |
| Cantabile | Animato | Risoluto | Tristamente | 187.92 |
| Amoroso | Giocoso | Risoluto | Doloroso | 188.59 |
| Piacevole | Giocoso | Risoluto | Tristamente | 188.69 |
| Cantabile | Burlesco | Risoluto | Tristamente | 189.16 |
| Amoroso | Capriccioso | Risoluto | Malinconico | 189.35 |
| Piacevole | Giocoso | Risoluto | Lagrimoso | 189.42 |
| Amoroso | Giocoso | Risoluto | Religioso | 189.44 |

TABLE 4.

Optimal Four-cluster Partition and Distances of the Elements to Their Respective Centroid

| Cluster 1 | Distances | Cluster 2 | Distances | Cluster 3 | Distances | Cluster 4 | Distances |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Amoroso | | Giocoso | | Risoluto | | Tristamente | |
| Amabile | 1.81 | Scherzando | 1.30 | Deciso | 1.57 | Lagrimoso | 1.48 |
| Cantabile | 1.86 | Burlesco | 1.56 | Martellato | 2.51 | Malinconico | 1.48 |
| Affettuoso | 2.05 | Capriccioso | 1.59 | Feroce | 2.75 | Doloroso | 1.89 |
| Teneramente | 2.13 | Animato | 2.21 | Di Bravura | 2.79 | Mesto | 2.00 |
| Espressivo | 2.37 | Con Spirito | 2.33 | Vigoroso | 2.82 | Religioso | 2.38 |
| Dolce | 2.46 | Grazioso | 2.71 | Giusto | 2.96 | Grave | 3.38 |
| Tranquillo | 2.81 | Brioso | 3.04 | Furioso | 3.39 | Disperato | 3.68 |
| Appassionato | 3.27 | Vivo | 3.07 | Patetico | 3.78 | Misterioso | 6.45 |
| Leggiero | 3.64 | Agitato | 3.51 | Forza | 3.79 | Comodo | 6.82 |
| Semplice | 4.57 | Energico | 4.28 | Pesante | 3.88 | | |
| Nobile | 5.28 | Maestoso | 7.19 | Serioso | 4.07 | | |
| Delicato | 5.53 | Pomposo | 7.38 | Drammatico | 4.67 | | |
| Piacevole | 5.77 | | | Grandioso | 4.70 | | |
| Sostenuto | 6.55 | | | Rustico | 4.97 | | |
| | | | | Impetuoso | 5.72 | | |
| | | | | Con Fuoco | 6.17 | | |

Performing five-cluster partitions required a dramatic increase in computational resources. The centroids of the optimal partition are the same as those of the four-cluster one, supplemented by maestoso, which is in fact more related to a specific musical character than to the expression of a certain affect. The centroids obtained within 5% of Smin again show a strong robustness: among the 39 partitions satisfying the 5% tolerance criterion, we obtain as centroids of Cluster 1, amoroso 35 times, cantabile three times, and dolce once. For Cluster 2, we get giocoso 36 times, burlesco twice, and capriccioso once. For Cluster 3, we get risoluto 20 times, forza nine times, vigoroso five times, furioso twice, feroce twice, and con fuoco once. For Cluster 4, we get tristamente 25 times, lagrimoso nine times, malinconico four times, and doloroso once. For Cluster 5, we get maestoso 20 times, pomposo five times, drammatico four times, sostenuto four times, comodo twice, and con fuoco, vigoroso, piacevole, and grandioso once each. The five-cluster partition displays a strong consistency with the previous four-cluster partition, only adding a small nuance to the previous categorization. Table 5 displays the optimal five-cluster partition and the distances of the elements to their respective centroids.

TABLE 5.

Optimal Five-cluster Partition and Distances of the Elements to the Respective Centroids

| Cluster 1 | Distances | Cluster 2 | Distances | Cluster 3 | Distances | Cluster 4 | Distances | Cluster 5 | Distances |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Amoroso | 0.00 | Giocoso | 0.00 | Risoluto | 0.00 | Tristamente | 0.00 | Maestoso | 0.00 |
| Amabile | 1.81 | Scherzando | 1.30 | Deciso | 1.57 | Lagrimoso | 1.48 | Grandioso | 1.79 |
| Cantabile | 1.86 | Burlesco | 1.56 | Martellato | 2.51 | Malinconico | 1.48 | Pomposo | 2.38 |
| Affettuoso | 2.05 | Capriccioso | 1.59 | Feroce | 2.75 | Doloroso | 1.89 | Nobile | 2.65 |
| Teneramente | 2.13 | Animato | 2.21 | Di Bravura | 2.79 | Mesto | 2.00 | Pesante | 2.92 |
| Espressivo | 2.37 | Con Spirito | 2.33 | Vigoroso | 2.82 | Religioso | 2.38 | Sostenuto | 3.09 |
| Dolce | 2.46 | Grazioso | 2.71 | Giusto | 2.96 | Grave | 3.38 | Drammatico | 3.39 |
| Tranquillo | 2.81 | Brioso | 3.04 | Furioso | 3.39 | Disperato | 3.68 | | |
| Appassionato | 3.27 | Vivo | 3.07 | Patetico | 3.78 | Misterioso | 6.45 | | |
| Leggiero | 3.64 | Agitato | 3.51 | Forza | 3.79 | Comodo | 6.82 | | |
| Semplice | 4.57 | Energico | 4.28 | Serioso | 4.07 | | | | |
| Delicato | 5.53 | | | Rustico | 4.97 | | | | |
| Piacevole | 5.77 | | | Impetuoso | 5.72 | | | | |
| | | | | Con Fuoco | 6.17 | | | | |

Table 6 collects the numbers of occurrences of the centroids in the partitions with two to five clusters satisfying the 5% condition. Interestingly, increasing the number of clusters preserves the most frequently selected centroids of the less refined partitions and only adds further components. We notice that the centroids of the optimal (or quasi-optimal) four-cluster partitions are very reminiscent of the four basic emotions (defined as biological and universal emotions beyond cultural differences) that are most common in music: tenderness, happiness, anger, and sadness. The fifth centroid, maestoso, although commonly used in the repertoire, does not appear directly related to a basic emotion but rather indicates a manner of performing. A similar remark can be made for the other elements of Cluster 5. We thus chose to concentrate in the following on four clusters, which appear to provide the minimal acceptable partition of the considered EMTs.

TABLE 6.

Centroids of the Two, Three, Four, and Five Cluster Partitions with their Number of Occurrences, for Which the Deviation of S from its Minimal Value is Smaller than 5%.

| Number of clusters | Centroids of Cluster 1 (occurrences) | Centroids of Cluster 2 (occurrences) | Centroids of Cluster 3 (occurrences) | Centroids of Cluster 4 (occurrences) | Centroids of Cluster 5 (occurrences) |
| --- | --- | --- | --- | --- | --- |
| 2 | Cantabile (2), Amoroso (3), Piacevole (2), Lagrimoso (2), Religioso (1) | | Risoluto (4), Energico (5), Patetico (1) | | |
| 3 | Amoroso (19), Giocoso (6), Cantabile (4), Con Spirito (1), Animato (1) | | Risoluto (15), Energico (8), Forza (3), Drammatico (2), Agitato (1), Vigoroso (1), Di Bravura (1) | Tristamente (10), Lagrimoso (8), Disperato (3), Doloroso (2), Mesto (2), Cantabile (2), Malinconico (2), Piacevole (1), Religioso (1) | |
| 4 | Amoroso (12), Cantabile (5), Piacevole (2) | Giocoso (9), Capriccioso (4), Animato (3), Burlesco (3) | Risoluto (19) | Tristamente (9), Lagrimoso (6), Malinconico (2), Doloroso (1), Religioso (1) | |
| 5 | Amoroso (35), Cantabile (3), Dolce (1) | Giocoso (36), Burlesco (2), Capriccioso (1) | Risoluto (20), Forza (9), Vigoroso (5), Furioso (2), Feroce (2), Con Fuoco (1) | Tristamente (25), Lagrimoso (9), Malinconico (4), Doloroso (1) | Maestoso (20), Pomposo (5), Drammatico (4), Sostenuto (4), Comodo (2), Con Fuoco (1), Vigoroso (1), Piacevole (1), Grandioso (1) |

Relation with affect models

The 4-cluster partition can be discussed in terms of affect models commonly used in the literature. We previously observed that the four centroids are reminiscent of four basic emotions: amoroso corresponds to tenderness, giocoso to happiness, risoluto can be related to anger, and tristamente to sadness.

We note that the EMTs belonging to each cluster can be viewed as nuances and varieties of these four basic emotions. Other commonly considered basic emotions, such as fear or disgust, appear to be associated with almost none of the EMTs included in the considered list. Apparently, instructions to performers referring directly to fear or disgust are far less common than those referring to tenderness, happiness, anger, and sadness. Furthermore, as noted in the review paper by Eerola and Vuoskoski (2013), there is a debate about the exact number and labeling of emotion categories in music. Several authors have thus modified or replaced categories that appear inappropriate for music (Juslin & Västfjäll, 2008). This point is nevertheless beyond the scope of our study.

An interpretation of the optimal 4-cluster partition is provided by the two-dimensional model of affect introduced by Russell (1980). This model suggests representing emotions in a two-dimensional space, in terms of increasing continuous parameters, valence (from averseness to attractiveness) and arousal (from drowsy to energized). In this framework, the four EMT clusters obtained above as the minimal acceptable partition, correspond to the four quarters of this two-dimensional parameter space: Cluster 1 corresponds to positive valence and low arousal, Cluster 2 to positive valence and high arousal, Cluster 3 to negative valence and high arousal, and Cluster 4 to negative valence and low arousal. Note that the two-cluster partition displayed in Table 2 discriminates only between EMTs with high and low arousal, showing the leading importance of this parameter.

It is of interest at this step to provide a graphic representation for the EMTs' locations in an abstract space. For this purpose, multidimensional scaling (MDS), discussed for example in Giguére (2006), Jaworska and Chupetlovska-Anastasova (2009), or Hout, Papesh, and Goldinger (2013), was used to construct from the collective dissimilarity matrix a small-dimensional perceptual space. In this space, each EMT appears as a point, such that the distance between each pair approximates as well as possible the corresponding element of the collective dissimilarity matrix. The various MDS algorithms found in the literature differ slightly with respect to the treatment of proximity data, but they are all aimed at providing a geometric representation of data in a reduced space whose dimension results from a tradeoff between accuracy and complexity. We here used the mdscale function of Matlab software in its default configuration to perform a nonmetric MDS, which is suitable for a collective dissimilarity matrix where the ordering of the elements, rather than their specific values, is the important property to be preserved.

In the following, we prescribe a two-dimensional perceptual space, as suggested by the observation that the EMTs' clusters previously determined are satisfactorily characterized in terms of valence and arousal. This space, displayed in Figure 4, is obtained when applying the above analysis to the collective dissimilarity matrix (without any reference to cluster partitioning). The different symbols and colors refer to the optimal partition of the EMT representative points into four clusters. We verified that no significant difference was observed when, in the mdscale function, the Kruskal stress with the default option is replaced by the Sammon stress within a metric scaling.
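To make the nonmetric criterion concrete, the following Python sketch evaluates Kruskal's stress-1 for a given configuration. It is a minimal illustration under stated assumptions, not a reimplementation of Matlab's mdscale: it only scores a configuration (it does not optimize it), the function names `pava` and `kruskal_stress` are hypothetical, and the four-point configuration is invented for the example.

```python
import math
from itertools import combinations

def pava(values):
    """Least-squares non-decreasing fit (pool-adjacent-violators algorithm)."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge backwards while the block means violate monotonicity
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            total, count = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
    fitted = []
    for total, count in blocks:
        fitted.extend([total / count] * count)
    return fitted

def kruskal_stress(dissim, coords):
    """Stress-1 of the configuration `coords` for the dissimilarity matrix `dissim`."""
    # Order the pairs by increasing dissimilarity: only this ranking is used
    pairs = sorted(combinations(range(len(coords)), 2),
                   key=lambda p: dissim[p[0]][p[1]])
    dist = [math.dist(coords[i], coords[j]) for i, j in pairs]
    disparities = pava(dist)  # closest sequence that respects the ranking
    num = sum((d - h) ** 2 for d, h in zip(dist, disparities))
    return math.sqrt(num / sum(d * d for d in dist))

coords = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]  # hypothetical 2D map
# Dissimilarities that are a monotone transform (the square) of the map distances
dissim = [[math.dist(p, q) ** 2 for q in coords] for p in coords]
print(kruskal_stress(dissim, coords))  # 0.0: the rank order is reproduced exactly
```

Because only the ordering of the dissimilarities enters the criterion, any monotone distortion of the distances leaves the stress at zero, which is the property that makes nonmetric scaling suitable here.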

FIGURE 4.

2D map (obtained by MDS) of the optimal partition into 4 clusters (magenta squares for Cluster 1, red + signs for Cluster 2, green triangles for Cluster 3 and blue circles for Cluster 4). The turquoise line corresponds to the optimal fit by a circle. Here and in other figures, the symbol ⊗ refers to the centroids.

Note that dimension 1 and dimension 2, obtained from the above MDS analysis and corresponding to the coordinate axes in Figure 4, turn out to accurately fit valence and arousal respectively. As each cluster is located with rather good precision in a different quarter of the plane, the clusters can be associated with the positive/negative valence and high/low arousal affects involved in Russell's (1980) circumplex model. Furthermore, most of the EMTs are located close to the turquoise circle (with radius r = 8.76 and center (−0.65, −0.70), thus close to the origin) obtained by a best-fit procedure, which is reminiscent of the circular distribution of affect presented in Figure 1 of Russell (2003). Their locations and order are consistent with this model. Interestingly, the elements of the additional cluster arising in the optimal five-cluster partition (which indeed are not pure affects), and in particular its centroid, are not adjacent to the above circle, but rather located inside the disk.
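The text does not specify which best-fit procedure was used for the circle. As one possible illustration, the sketch below uses an algebraic least-squares fit (minimizing the residuals of x² + y² + Dx + Ey + F = 0), which is an assumption and may differ from the authors' method. The function names are hypothetical, and the sample points are synthetic points placed on a circle with the radius and center quoted above.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def fit_circle(points):
    """Algebraic least-squares circle fit; returns (center_x, center_y, radius)."""
    n = len(points)
    z = [x * x + y * y for x, y in points]
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxz = sum(x * zi for (x, _), zi in zip(points, z))
    syz = sum(y * zi for (_, y), zi in zip(points, z))
    # Normal equations for the unknowns D, E, F
    a = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    rhs = [-sxz, -syz, -sum(z)]
    d0 = det3(a)
    solution = []
    for col in range(3):  # Cramer's rule, adequate for a 3x3 system
        m = [row[:] for row in a]
        for r in range(3):
            m[r][col] = rhs[r]
        solution.append(det3(m) / d0)
    big_d, big_e, big_f = solution
    cx, cy = -big_d / 2, -big_e / 2
    return cx, cy, math.sqrt(cx * cx + cy * cy - big_f)

# Synthetic points lying exactly on a circle of radius 8.76 centered at (-0.65, -0.70)
angles = (0.0, 0.9, 2.1, 3.0, 4.2, 5.5)
points = [(-0.65 + 8.76 * math.cos(t), -0.70 + 8.76 * math.sin(t)) for t in angles]
cx, cy, r = fit_circle(points)
print(round(cx, 2), round(cy, 2), round(r, 2))  # -0.65 -0.7 8.76
```

For real MDS coordinates the points scatter around the circle, so the fitted parameters depend on the chosen criterion; a geometric fit minimizing the radial distances would be an equally plausible choice.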

The accuracy of a two-dimensional description can be evaluated by comparing the elements of the four resulting quadrants of the plane with those of the four clusters previously obtained without any reference to a geometrical representation. We observed an agreement of 85.5% on the allotment of the EMTs, which demonstrates the relevance of a two-dimensional representation. The EMTs Comodo, Con Fuoco, Impetuoso, Pomposo, Maestoso, Sostenuto, Appassionato, and Grazioso, which do not belong to the same clusters in the two partitions, are located at or relatively close to the cluster boundaries. This makes the cluster they belong to especially sensitive to the additional uncertainty introduced by the MDS procedure (which cannot always exactly preserve the original distances between the considered elements). The character of these EMTs is indeed neutral in terms of valence and/or medium in terms of arousal. For example, Comodo, which is the farthest term in the cluster characterized by negative valence and low arousal when using the initial procedure, is, when making use of MDS, located in the farthest third of the cluster associated to positive valence and low arousal. The observation that, depending on the procedure, the Comodo EMT is either characterized by a slightly positive valence (Figure 4) or appears as the most marginal element of Cluster 4 (associated with negative valence, see Table 4), is consistent with the statement that this EMT does not have a strong connotation in terms of valence. Its low-arousal character is in contrast a robust property.
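The comparison between quadrants and clusters amounts to labeling each EMT twice and counting matches. The sketch below is a hedged illustration: the function names, the tie-breaking convention on the axes, and the three sample coordinates are assumptions made for the example, not values from the study. The last lines check the arithmetic behind the reported figure: eight of the 55 EMTs change cluster, so 47 agree.

```python
def quadrant_cluster(valence, arousal):
    """Map centered (valence, arousal) coordinates to the cluster numbering of the text.

    Cluster 1: positive valence, low arousal;  Cluster 2: positive valence, high arousal;
    Cluster 3: negative valence, high arousal; Cluster 4: negative valence, low arousal.
    Points lying exactly on an axis are assigned by convention (an assumption).
    """
    if valence >= 0:
        return 2 if arousal >= 0 else 1
    return 3 if arousal >= 0 else 4

def agreement(labels_a, labels_b):
    """Percentage of terms that receive the same label in both assignments."""
    same = sum(1 for term in labels_a if labels_a[term] == labels_b[term])
    return 100.0 * same / len(labels_a)

# Hypothetical MDS coordinates for three terms (illustrative values only)
coords = {"Amoroso": (5.0, -4.0), "Risoluto": (-4.5, 5.5), "Comodo": (1.0, -7.0)}
by_quadrant = {term: quadrant_cluster(v, a) for term, (v, a) in coords.items()}
by_cluster = {"Amoroso": 1, "Risoluto": 3, "Comodo": 4}  # clusters as in Table 4
print(by_quadrant, round(agreement(by_quadrant, by_cluster), 1))

# Arithmetic check of the reported agreement: 8 of 55 EMTs differ
print(round(100 * (55 - 8) / 55, 1))  # 85.5
```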

Consistency of the analysis

At this point of the analysis, it is of interest to compare the 4-cluster partitions obtained from the individual participant spreadsheets with the collective partition presented above. The correspondence between the collective and individual clusters was made on the basis of their respective centroids, as described in Table 7. We noticed that for the partitions of most of the participants, the four centroids are close, in terms of valence and arousal, to the four centroids of the collective partition, despite noticeable differences in each personal arrangement. This led us to categorize the four clusters obtained by each participant in terms of positive/negative valence and high/low arousal, in order to compare them with the appropriate cluster of the collective partition. The comparison consisted of counting the number of EMTs common to the individual and to the corresponding collective clusters. Figure 5 summarizes the percentage of matching for each participant. We observe that for 8 participants it exceeds 50% and the mean reaches almost 60%. This correspondence is remarkable because of the significant methodological differences between the two approaches. As previously explained, the collective 4-cluster partition is obtained from the collective dissimilarity matrix, which only involves distances, with no reference to the individual partitions.

TABLE 7.

Centroids of the Collective and Individual Four-cluster Partitions

| Partition | Centroid of Cluster 1 | Centroid of Cluster 2 | Centroid of Cluster 3 | Centroid of Cluster 4 |
| --- | --- | --- | --- | --- |
| Collective | Amoroso | Giocoso | Risoluto | Tristamente |
| Participant 1 | Comodo | Drammatico | Furioso | Doloroso |
| Participant 2 | Amoroso | Pomposo | Energico | Feroce |
| Participant 3 | Cantabile | Burlesco | Feroce | Lagrimoso |
| Participant 4 | Sostenuto | Giocoso | Furioso | Nobile |
| Participant 5 | Amabile | Animato | Maestoso | Lagrimoso |
| Participant 6 | Delicato | Burlesco | Drammatico | Religioso |
| Participant 7 | Dolce | Brioso | Feroce | Malinconico |
| Participant 8 | Nobile | Energico | Furioso | Espressivo |
| Participant 9 | Dolce | Patetico | Impetuoso | Mesto |
| Participant 10 | Cantabile | Giocoso | Drammatico | Tristamente |
| Participant 11 | Patetico | Burlesco | Energico | Tristamente |

FIGURE 5.

Matching percentage between the individual and the collective partitions.

This analysis confirms that the 4-cluster collective partition is consistent with a large majority of the partitions obtained from individual participant spreadsheets. To be more specific, it is of interest to determine how often the location of each EMT in the arrangements made by individual participants fits its location in the collective partition. This information is presented in Table 8, where the EMTs in bold represent the centroids of the collective clusters. We notice that the centroids of the four-cluster collective partition arise in the second (once in the first) position in terms of percentage of agreement between individual participants. Within each cluster, the elements benefiting from a high consensus have closely related musical characters and can indeed be viewed as representative of these clusters.

TABLE 8.

Optimal Four-cluster Partition Together with, for each EMT, the Absolute (f) and Relative (in %) Numbers of Times it Arises in Individual Partitions

| Cluster 1 | f | % | Cluster 2 | f | % | Cluster 3 | f | % | Cluster 4 | f | % |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Affettuoso | 10 | 91 | Animato | 9 | 82 | Feroce | 10 | 91 | **Tristamente** | 9 | 82 |
| **Amoroso** | 9 | 82 | **Giocoso** | 7 | 64 | **Risoluto** | 9 | 82 | Lagrimoso | 9 | 82 |
| Amabile | 9 | 82 | Con Spirito | 7 | 64 | Furioso | 9 | 82 | Doloroso | 8 | 73 |
| Dolce | 8 | 73 | Brioso | 7 | 64 | Con Fuoco | 8 | 73 | Malinconico | 8 | 73 |
| Cantabile | 7 | 64 | Grazioso | 7 | 64 | Forza | 8 | 73 | Mesto | 8 | 73 |
| Delicato | 7 | 64 | Maestoso | 7 | 64 | Vigoroso | 8 | 73 | Religioso | 8 | 73 |
| Espressivo | 7 | 64 | Pomposo | 6 | 55 | Di Bravura | 7 | 64 | Disperato | 6 | 55 |
| Tranquillo | 7 | 64 | Scherzando | 6 | 55 | Rustico | 7 | 64 | Misterioso | 6 | 55 |
| Piacevole | 6 | 55 | Agitato | 5 | 45 | Deciso | 6 | 55 | Grave | 5 | 45 |
| Teneramente | 6 | 55 | Capriccioso | 5 | 45 | Impetuoso | 6 | 55 | Comodo | 3 | 27 |
| Leggiero | 5 | 45 | Energico | 5 | 45 | Martellato | 6 | 55 | | | |
| Semplice | 5 | 45 | Vivo | 5 | 45 | Giusto | 4 | 36 | | | |
| Sostenuto | 5 | 45 | Burlesco | 4 | 36 | Patetico | 4 | 36 | | | |
| Nobile | 4 | 36 | | | | Pesante | 4 | 36 | | | |
| Appassionato | 3 | 27 | | | | Drammatico | 3 | 27 | | | |
| | | | | | | Grandioso | 3 | 27 | | | |
| | | | | | | Serioso | 3 | 27 | | | |

It is also of interest to consider the EMTs that obtained low percentage (less than 50%) in terms of their belonging to the same cluster in the collective partition and in the partitions made from the individual participant spreadsheets. Even for these EMTs, most of the results appear coherent. Consider for example the EMT Sostenuto, which was obtained in the cluster whose centroid is Amoroso in the collective partition. At the level of the 11 individual spreadsheets, it was obtained once in the cluster whose centroid is Giocoso, five times in the cluster whose centroid is Amoroso, and five times in the cluster whose centroid is Tristamente. This means that Sostenuto can be associated with positive or negative valence and low arousal. This makes sense, as Sostenuto means sustaining the played notes. In the case of string instruments, it indeed refers to a bow technique of connecting between notes of a musical phrase, even when not played in the same bow direction. Leggiero, which was also obtained in the cluster whose centroid is Amoroso in the collective partition, was obtained six times in the cluster whose centroid is Giocoso and five times in the cluster whose centroid is Amoroso. This means that Leggiero can be played in a way expressing positive valence and various amounts of arousal. Other interesting examples are Vivo and Energico, which arise in the cluster whose centroid is Giocoso in the collective partition. In the individual partitions, they appear once in the cluster whose centroid is Amoroso, five times in the cluster whose centroid is Giocoso, and five times in the cluster whose centroid is Risoluto. Indeed, these two EMTs are characterized by high arousal and relatively neutral valence. Agitato, which belongs to the cluster whose centroid is Giocoso in the collective partition, was obtained five times in the cluster whose centroid is Giocoso and six times in the cluster whose centroid is Risoluto. 
Agitato indeed corresponds to high arousal with either a positive or negative valence, although it is more commonly related to negative valence. Grandioso, which is located in the cluster whose centroid is Risoluto in the collective partition, arises twice in the cluster whose centroid is Amoroso, three times in the cluster whose centroid is Risoluto, and six times in the cluster whose centroid is Giocoso. Grandioso can in fact suggest positive valence. Comodo, which belongs to the cluster whose centroid is Tristamente in the collective partition, was obtained twice in the cluster whose centroid is Giocoso, three times in the cluster whose centroid is Tristamente, and six times in the cluster whose centroid is Amoroso. This means that most of the participants put Comodo together with EMTs which are characterized by positive valence and low arousal. Indeed, Comodo literally means comfortable.

The coherence of the results presented in this section and obtained by different analysis procedures indicates that the 55 EMTs we have considered can be satisfactorily gathered into four clusters that turn out to be associated with negative/positive valence and high/low arousal. This suggests performing a second experiment where the participants use these concepts directly. This is the object of the next section, where the effect of metaphorical personality traits (extraversion and neuroticism) and their relationship to musical expressiveness are also studied.

Experiment 2: Direct Rating of EMTs in Terms of Psychological Parameters

The EMTs classification obtained in Experiment 1 appears consistent with the two-dimensional circumplex model of affect developed by Russell (1980), which is parametrized by valence and arousal. The first aim of Experiment 2 is to validate these observations by asking the same participants to rate these EMTs directly in terms of valence and arousal. Furthermore, as many of the considered EMTs describe metaphorical musical characters, we were also interested in studying the relevance of a classification in terms of parameters taken from personality models related to the expression of emotions. We concentrated on the “Big Two” traits of personality, extraversion and neuroticism (Watson et al., 1999), which, in the context of self-rated affect measurement, were found to be correlated with the “Big Two” affective dimensions, Positive and Negative Activation (Watson & Clark, 1992a). In order to test the relevance of extraversion and neuroticism in the evaluation of the considered EMTs, participants were asked to also rate the considered EMTs, in terms of these parameters which, in the musical context, are to be taken in a metaphorical sense. A Likert 7-level rating scale was used, in which valence, extraversion, and neuroticism vary discretely from −3 to +3 and arousal from 1 to 7, by steps of 1 in all the cases. For convenience, we opt to rate arousal on a positive scale that more closely fits the variation from low to high usually considered for this parameter. As previously mentioned, valence refers to intrinsic attractiveness (positive valence) or averseness (negative valence), while arousal corresponds to the level of energy or wakefulness. Extraversion refers to outgoing and energetic behavior, in contrast with introversion, which is manifested in more reserved and solitary behavior. Here, −3 corresponds to very introvert and +3 to very extravert. 
Neuroticism corresponds to emotional instability, sensitiveness, and nervousness, in contrast to emotional stability, secureness, and confidence. It is also related to a lack of self-control (Toegel & Barsoux, 2012), with −3 corresponding to very emotionally stable and +3 to very neurotic. These psychological parameters appear relevant for connecting between the terminologies used by musicians and psychologists to describe expressiveness in the context of music performance.

ANALYSIS OF THE EXPERIMENTAL DATA

For the 55 EMTs (referred to by the index i), we define the “collective” valence Vi, arousal Ai, extraversion Ei, and neuroticism Ni, viewed as a global expression of the ratings given by a majority of participants according to the formulas

 
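$$
V_i = \frac{\sum_{m=1}^{M} (w_{v_{mi}})^s \, v_{mi}}{\sum_{m=1}^{M} (w_{v_{mi}})^s}, \qquad
A_i = \frac{\sum_{m=1}^{M} (w_{a_{mi}})^s \, a_{mi}}{\sum_{m=1}^{M} (w_{a_{mi}})^s},
\tag{2}
$$

$$
E_i = \frac{\sum_{m=1}^{M} (w_{e_{mi}})^s \, e_{mi}}{\sum_{m=1}^{M} (w_{e_{mi}})^s}, \qquad
N_i = \frac{\sum_{m=1}^{M} (w_{n_{mi}})^s \, n_{mi}}{\sum_{m=1}^{M} (w_{n_{mi}})^s},
\tag{3}
$$

where $w_{v_{mi}}$, $w_{a_{mi}}$, $w_{e_{mi}}$, $w_{n_{mi}}$ are the numbers of participants attributing the ratings $v_{mi}$, $a_{mi}$, $e_{mi}$, $n_{mi}$ for the four considered parameters (valence, arousal, extraversion, and neuroticism respectively) to the EMT of index i. As in Experiment 1, we choose s = 6 in order (as explained earlier in the Analysis Procedure section) to give a higher weight to the most frequent estimates (the results stabilize for s = 5 or larger).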
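The effect of the exponent s can be illustrated with a short Python sketch. It assumes that the index m runs over the distinct rating values given to an EMT and that the weight is the number of participants who chose that value; this reading, the function name `collective_rating`, and the sample ratings are assumptions made for the example.

```python
from collections import Counter

def collective_rating(ratings, s=6):
    """Weighted mean of the distinct ratings, each weighted by (number of raters) ** s."""
    counts = Counter(ratings)  # rating value -> number of participants who gave it
    numerator = sum((w ** s) * value for value, w in counts.items())
    denominator = sum(w ** s for w in counts.values())
    return numerator / denominator

# Hypothetical valence ratings of one EMT by 11 participants (scale -3 to +3)
ratings = [1, 1, 1, 1, 1, 1, 1, 1, 3, -2, 0]
print(collective_rating(ratings, s=1))  # ordinary mean, 9/11
print(collective_rating(ratings, s=6))  # very close to the majority rating 1
```

With s = 1 the formula reduces to the ordinary arithmetic mean, while larger values of s pull the collective rating toward the rating chosen by the largest group of participants, so isolated outlying ratings have almost no influence.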

The aim of the analysis was to examine whether the two parameters valence and arousal are sufficient, or whether supplementing extraversion and neuroticism can improve the EMT classification. Each of these parameters can be viewed as a dimension in a perceptual space. This permits the definition of distances between these points and the construction of a dissimilarity matrix from which the same distance minimization procedure as described earlier in the Analysis Procedure section can be implemented, leading to optimal partitions of the EMTs into clusters.
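As a rough sketch of this pipeline, the Python code below builds a Euclidean dissimilarity matrix from rating coordinates and searches exhaustively for the set of centroids minimizing S. It assumes that S is the sum of the distances of all elements to their nearest centroid; this reading is consistent with Tables 3 and 4 (the distances of Table 4 add up to about 180.4, the smallest S of Table 3), but the exact definition is the one given in the Analysis Procedure section. The function names and the 12 sample points are hypothetical.

```python
import math
from itertools import combinations

def dissimilarity_matrix(points):
    """Euclidean distances between all pairs of points of the perceptual space."""
    return [[math.dist(p, q) for q in points] for p in points]

def best_partition(dissim, k):
    """Exhaustive search for the k centroids (chosen among the items) minimizing S."""
    n = len(dissim)
    best = None
    for centroids in combinations(range(n), k):
        # Each item joins its nearest centroid; S adds up those distances
        s = sum(min(dissim[i][c] for c in centroids) for i in range(n))
        if best is None or s < best[0]:
            best = (s, centroids)
    s_min, centroids = best
    clusters = {c: [] for c in centroids}
    for i in range(n):
        nearest = min(centroids, key=lambda c: dissim[i][c])
        clusters[nearest].append(i)
    return s_min, clusters

# Hypothetical (valence, arousal) ratings: four tight groups of three points each
points = [
    (2.0, 6.0), (2.3, 6.4), (1.7, 5.6),
    (2.0, 2.0), (2.3, 2.4), (1.7, 1.6),
    (-2.0, 6.0), (-1.7, 6.4), (-2.3, 5.6),
    (-2.0, 2.0), (-1.7, 2.4), (-2.3, 1.6),
]
s_min, clusters = best_partition(dissimilarity_matrix(points), k=4)
print(round(s_min, 2), clusters)
```

The number of candidate centroid sets grows combinatorially with k (for 55 items, 341,055 quadruplets but 3,478,761 quintuplets), so an exhaustive search of this kind becomes noticeably more expensive when a fifth cluster is added. Extending the points to three or four coordinates (adding extraversion and neuroticism) requires no change to the code.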

RESULTS AND DISCUSSION

Valence and arousal – comparison with Experiment 1

Figure 6 displays the partition in four clusters, obtained from the collective valence and arousal directly computed from the participant ratings. It is of interest to compare this partition with that originating from the Excel spreadsheets of Experiment 1, displayed in Table 4. We first note that the centroids of the clusters refer to very similar characters: Piacevole versus Amoroso, Burlesco versus Giocoso, Vigoroso versus Risoluto, Lagrimoso versus Tristamente. Furthermore, the repartition of the EMTs into the corresponding clusters is globally similar, with an agreement of 71% between the two partitions. Another 13% of the EMTs are located near the boundary between two clusters of the valence-arousal partition, which indicates that for these EMTs, either the valence is quasi-neutral or the arousal moderate. We observe that more EMTs are located in the positive valence range of the valence-arousal plane than in the corresponding range of the representation obtained from the MDS analysis in Experiment 1. This possibly reflects the fact that a sizeable number of the considered EMTs are perceived in this analysis as musically pleasant.

FIGURE 6.

Collective arousal versus collective valence directly computed from the participant ratings (magenta squares for Cluster 1, red + signs for Cluster 2, green triangles for Cluster 3 and blue circles for Cluster 4).

When we consider the four-cluster partitions based on the three-dimensional spaces (valence, arousal, extraversion) and (valence, arousal, neuroticism), or on the four-dimensional space (valence, arousal, extraversion, neuroticism), we find no major difference from the partition based on the two-dimensional space (valence, arousal) alone. More specifically, the centroid quadruplet of the clusters constructed on the two-dimensional space (Piacevole, Burlesco, Vigoroso, Lagrimoso) is very similar to those of the other three partitions: (Piacevole, Burlesco, Vigoroso, Mesto), (Delicato, Di Bravura, Forza, Lagrimoso), and (Sostenuto, Burlesco, Forza, Mesto). The agreement at the level of the cluster elements is 94.6%, 87.3%, and 85.5%, respectively. Nevertheless, the extraversion and neuroticism ratings are of intrinsic interest, as they correspond to personality traits that are used metaphorically in music and are related to the musical expression described by the EMTs. These issues are discussed in the next sections.
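
The notion of agreement at the level of the cluster elements can be made concrete with a short Python sketch. It assumes that the clusters of the two partitions have already been matched (so that cluster k in one partition corresponds to cluster k in the other) and that agreement is the percentage of terms placed in the same cluster. The function name, the term names, and the assignments are hypothetical.

```python
def partition_agreement(labels_a, labels_b):
    """Percentage of shared terms assigned to the same cluster in both partitions."""
    shared = labels_a.keys() & labels_b.keys()
    if not shared:
        raise ValueError("the two partitions have no term in common")
    same = sum(labels_a[term] == labels_b[term] for term in shared)
    return 100.0 * same / len(shared)


# Hypothetical cluster assignments (term -> cluster number), not study data.
two_dimensional = {"term_1": 1, "term_2": 2, "term_3": 3, "term_4": 4, "term_5": 4}
four_dimensional = {"term_1": 1, "term_2": 2, "term_3": 3, "term_4": 4, "term_5": 1}

agreement = partition_agreement(two_dimensional, four_dimensional)
```

In this toy case four of the five terms keep their cluster, so the agreement is 80%.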

Extraversion and neuroticism

We now consider the collective extraversion and neuroticism of the EMTs, resulting from the ratings of extraversion and neuroticism made by the individual participants. An optimal four-cluster partition of these data is presented in Figure 7. Cluster 1 (magenta squares) corresponds to EMTs associated with strong to moderate emotional stability, Cluster 2 (red + signs) to high extraversion but moderate or strong neuroticism, Cluster 3 (green triangles) to strong neuroticism and extraversion, and Cluster 4 (blue circles) to low extraversion and either high or low neuroticism. This partition differs from that based on valence and arousal, although the cluster centroids in both partitions are EMTs with very close meanings. Note that in the present musical context we observe some positive correlation between extraversion and neuroticism, in contrast with the negative correlation reported in personality studies in psychology (McCrae & Costa, 1990). Indeed, whereas in the latter field a person who experiences negative emotions is less social and less prone to dialog, most of the EMTs perceived with low extraversion also obtained low neuroticism ratings, and a majority of those perceived with high extraversion received high neuroticism ratings. In the musical context, the social aspects associated with extraversion are absent, and more generally not all the aspects of extraversion and neuroticism defined in the personality context are relevant. In spite of these differences, many important aspects of extraversion and neuroticism can be useful in the description of musical characters. The participants indeed found these psychological parameters appropriate for rating the EMTs.

FIGURE 7.

Collective extraversion versus collective neuroticism directly computed from the participant ratings (magenta squares for Cluster 1, red + signs for Cluster 2, green triangles for Cluster 3 and blue circles for Cluster 4).

Interestingly, when considering the collective quantities and comparing the partitions in terms of extraversion versus neuroticism and in terms of arousal versus valence, a matching of 75% is obtained between the two-cluster partitions resulting from the merging of Cluster 1 with Cluster 4 and of Cluster 2 with Cluster 3 in each partition. The set resulting from the merging of Clusters 1 and 4 of the arousal-valence partition corresponds to EMTs with low to mid arousal, while the union of Clusters 2 and 3 of this partition includes the EMTs with mid to high arousal. On the other hand, when merging Clusters 1 and 4 of the extraversion-neuroticism partition, the resulting set collects the EMTs with low neuroticism and low to mid extraversion, while the merging of Clusters 2 and 3 puts together EMTs with mid to high neuroticism and high extraversion. This can be viewed as reflecting the expected overlap of low arousal with emotional stability and low to moderate extraversion, and of high arousal with high neuroticism and extraversion.
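
The merging step can be sketched in Python as follows. The sketch assumes that both partitions list the EMTs in the same order and that matching means the percentage of terms that fall in the same merged set in both partitions. The names and the cluster assignments are hypothetical.

```python
# Merging used for the coarse comparison: Cluster 1 with 4, Cluster 2 with 3.
MERGE = {1: "A", 4: "A", 2: "B", 3: "B"}


def merged_matching(labels_a, labels_b, merge=MERGE):
    """Percentage of terms falling in the same merged set in both partitions."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both partitions must cover the same terms")
    hits = sum(merge[a] == merge[b] for a, b in zip(labels_a, labels_b))
    return 100.0 * hits / len(labels_a)


# Hypothetical four-cluster assignments for eight terms, not study data.
arousal_valence = [1, 4, 2, 3, 1, 2, 4, 3]
extraversion_neuroticism = [4, 1, 3, 3, 2, 2, 1, 2]

matching = merged_matching(arousal_valence, extraversion_neuroticism)
```

Two terms assigned to different clusters (for instance 1 and 4) still count as matching after the merge, which is why the merged comparison can show a correspondence that the four-cluster comparison hides.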

To interpret the above observations, it is useful to consider the alternative conceptualization of the basic structure of affect developed by Watson and Tellegen (1985), who proposed a Positive Activation–Negative Activation (PANA) model. These two independent coordinate axes result from a rotation of 45° of the valence and arousal axes proposed by Russell (1980). The change of coordinates from valence-arousal $(v, a)$ to positive-negative activations $(P, N)$ is easily obtained by the following argument. Let $\hat{e}_1$ and $\hat{e}_2$ be the unit vectors along the valence and arousal axes. For any EMT $M$ with valence $v$ and arousal $a$, one writes $M = v\,\hat{e}_1 + (a - a_0)\,\hat{e}_2$, where $a_0$ is the arousal value taken as neutral. Denoting by $\hat{\eta}_1$ and $\hat{\eta}_2$ the unit vectors along the Positive Activation (PA) and Negative Activation (NA) axes, defined as the oriented diagonals relative to the valence-arousal reference frame centered at zero valence and arousal $a_0$ (see Figure 8), one also has $M = P\,\hat{\eta}_1 + N\,\hat{\eta}_2$. Since $\hat{\eta}_1 = \frac{1}{\sqrt{2}}(\hat{e}_1 + \hat{e}_2)$ and $\hat{\eta}_2 = \frac{1}{\sqrt{2}}(-\hat{e}_1 + \hat{e}_2)$, it follows that

FIGURE 8.

Two-dimensional structures of affect.

 
$$P = \frac{1}{\sqrt{2}}\,(a - a_0 + v), \qquad N = \frac{1}{\sqrt{2}}\,(a - a_0 - v). \tag{4}$$
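
As a numerical check of this change of coordinates, the following Python sketch implements P = (a − a0 + v)/√2 and N = (a − a0 − v)/√2 together with the inverse map. The neutral arousal value `A0 = 4.0` (the midpoint of a 7-level scale) and the two sample terms are assumptions made for illustration only, and the function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def to_pana(valence, arousal, neutral_arousal):
    """Rotate (valence, arousal) by 45 degrees into (PA, NA), as in Eq. (4)."""
    centered = arousal - neutral_arousal
    return (centered + valence) / SQRT2, (centered - valence) / SQRT2


def from_pana(pa, na, neutral_arousal):
    """Inverse change of coordinates, back to (valence, arousal)."""
    return (pa - na) / SQRT2, neutral_arousal + (pa + na) / SQRT2


A0 = 4.0                # assumed neutral arousal
term_x = (1.5, 6.0)     # hypothetical (valence, arousal)
term_y = (-2.0, 3.0)    # hypothetical (valence, arousal)

pana_x = to_pana(*term_x, A0)
pana_y = to_pana(*term_y, A0)

# A rotation about (0, a0) leaves the distance between two terms unchanged.
distance_va = math.dist(term_x, term_y)
distance_pana = math.dist(pana_x, pana_y)
```

Because the two distances coincide, any partition that depends only on pairwise distances is the same in both coordinate systems.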

Many researchers (Costa & McCrae, 1980, 1984; Emmons & Diener, 1985, 1986; Tellegen, 1985; Warr, Barter, & Brownbridge, 1983; Watson & Clark, 1984, 1992a, 1992b; Watson & Tellegen, 1985) found that NA is strongly related to neuroticism, whereas PA is highly related to extraversion, in the sense that extraverted people tend to report moods that are higher in PA while neurotic people tend to report moods that are higher in NA. We observe here that this link also holds in the current context, namely performers' perception of verbal expressions of affect and metaphorical character in music.

To make this link more explicit, we perform the change of coordinates from valence-arousal to PANA described above and show in Figure 9 the EMTs in the space parametrized by PA and NA, with the best four-cluster partition. As expected, the partition is identical to that obtained in terms of valence-arousal because the transformation does not affect the distances between the EMTs. Figures 10 and 11 display the EMTs in the PA-extraversion and NA-neuroticism planes, respectively. The trend lines result from the linear regression between PA and extraversion or between NA and neuroticism. In both figures, a significant correlation between the two coordinates is conspicuous, with correlation coefficients of approximately .70 and probability values of the order of $10^{-9}$ in the two cases. This result, previously reported in the literature in the different context of personality and self-rated affect, is thus seen to extend to the field of musical performance, in the sense that EMTs that participants judged to be higher in extraversion also tend to be judged higher in positive activation, whereas EMTs that participants judged to be higher in neuroticism also tend to be judged higher in negative activation.
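
The correlation coefficient and the trend line mentioned above can be computed as in the following Python sketch, which uses the standard Pearson coefficient and an ordinary least-squares fit. The data are hypothetical and the function names are illustrative; the sketch does not reproduce the values reported for the 55 EMTs.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long samples."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def trend_line(x, y):
    """Least-squares slope and intercept of y regressed on x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical collective values for five terms, not data from the study.
positive_activation = [-2.0, -1.0, 0.0, 1.0, 2.0]
extraversion = [2.1, 2.9, 4.2, 4.8, 6.0]

r = pearson_r(positive_activation, extraversion)
slope, intercept = trend_line(positive_activation, extraversion)
```

A probability value for the coefficient requires the distribution of r under the null hypothesis; in practice a statistics library such as SciPy (`scipy.stats.pearsonr`) returns it alongside the coefficient.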

FIGURE 9.

Negative Activation versus Positive Activation (magenta squares for Cluster 1, red + signs for Cluster 2, green triangles for Cluster 3 and blue circles for Cluster 4).

FIGURE 10.

Extraversion versus Positive Activation.

FIGURE 11.

Neuroticism versus Negative Activation.

CONCLUSION

To investigate performers' perception of the Expressive Musical Terms used in classical Western music to indicate, beyond pitch, rhythm, and dynamics, the expression and musical character that is intended to be conveyed, we partitioned 55 commonly used EMTs that cover a broad range of characters and do not refer directly to tempo, and related this classification to psychological models of emotions discussed in the literature. The two experiments we performed led us to organize these EMTs into a minimal acceptable partition involving four clusters whose centroids turn out to be associated with tenderness, happiness, anger, and sadness. Using multidimensional scaling, we furthermore showed that the obtained localization of these EMTs in the plane fits the circumplex model of affect developed by Russell (1980) and that the four clusters correspond essentially to the four quarters of the valence-arousal plane. Alternatively, when positive-negative activation parameters (linearly related to the previous ones by a 45° rotation of the coordinate axes in the two-dimensional plane) are used, we obtain a significant correlation between positive activation and extraversion and between negative activation and neuroticism. This demonstrates that these relations, previously observed in personality studies (Watson & Clark, 1992a), extend to the musical field. In particular, it turns out that valence and arousal suffice for obtaining a possibly schematic but robust EMT classification. Indeed, combining them with two broad personality traits (extraversion and neuroticism) does not improve the accuracy of the classification. A possible explanation for this observation is the redundancy of these four dimensions, as demonstrated in Experiment 2 based on a rating procedure.

The findings of the present research may assist in delineating a perceptual model of musical expressiveness in performance where a significant role is played by EMTs which, although commonly used by musicians, have rarely been considered in the field of music cognition. In a planned future study, the selection of the four EMTs arising as cluster centroids will be useful in particular when building an audio corpus of short musical excerpts from the violin repertoire, each of them played by a violinist according to these EMTs (supplemented by a neutral, i.e., mechanical and non-expressive performance), in order to compare and examine their main features, with the aim of investigating the perception of expressiveness in violin performance. We believe that referring to notions belonging to the language of musical performance should facilitate cooperation with musicians, for understanding cognitive mechanisms involved in communication and perception of expressiveness in music performance. A perceptual model of musical expressiveness in performance can contribute to the development of new methods for musical education by raising students' awareness of expressiveness in musical performance. It can also be useful for technological applications such as expressive sound synthesis, machine emotion recognition and advanced tools for recording editing.

Notes

1. In this paper, affect and emotion are used interchangeably, although psychologists differentiate between them. Affect is considered an umbrella term that covers all evaluative states such as emotion, mood, and preference. Emotions are defined as relatively intense affective responses that usually involve a number of subcomponents that are more or less synchronized. Emotions focus on specific objects and last from a few minutes to a few hours (Juslin & Västfjäll, 2008).
2. We are following here Li et al. (2015), who explain that, “the expressive musical term is defined as the Italian musical term which describes an emotion, feeling, image or metaphor, rather than merely an indication of tempo or dynamics. It includes, but is not limited to emotional terms.”

References

Canazza, S., De Poli, G., Drioli, C., Rodà, A., & Vidolin, A. (2004). Modeling and control of expressiveness in music performance. Proceedings of the IEEE, 92(4), 686–701.
Canazza, S., De Poli, G., Rodà, A., & Vidolin, A. (2013). Expressiveness in music performance: Analysis, models, mapping, encoding. In J. Steyn (Ed.), Structuring music through markup language: Designs and architectures (pp. 156–186). Hershey: IGI Global.
Canazza, S., De Poli, G., & Rodà, A. (2015). CaRo 2.0: An interactive system for expressive music rendering. Advances in Human-Computer Interaction, 2015, 1–13.
Costa, P. T., & McCrae, R. R. (1980). Influence of extraversion and neuroticism on subjective well-being: Happy and unhappy people. Journal of Personality and Social Psychology, 38(4), 668–678.
Costa, P. T., & McCrae, R. R. (1984). Personality as a lifelong determinant of well-being. In C. Z. Malatesta & C. E. Izard (Eds.), Emotion in adult development (pp. 141–157). Beverly Hills, CA: Sage Publications.
Dahl, S., & Friberg, A. (2004). Expressiveness of a marimba player's body movements. Speech, Music and Hearing - Quarterly Progress and Status Report, 46(1), 75–86.
Danhauser, A. (1950). Théorie de la musique. Paris, France: Henri Lemoine.
Eerola, T., & Vuoskoski, J. K. (2013). A review of music and emotion studies: Approaches, emotion models, and stimuli. Music Perception, 30, 307–340.
Ekman, P. (1992). An argument for basic emotions. Cognition and Emotion, 6(3–4), 169–200.
Emmons, R. A., & Diener, E. (1985). Personality correlates of subjective well-being. Personality and Social Psychology Bulletin, 11(1), 89–97.
Emmons, R. A., & Diener, E. (1986). Influence of impulsivity and sociability on subjective well-being. Journal of Personality and Social Psychology, 50(6), 1211–1215.
Fritz, C., Blackwell, A. F., Cross, I., Woodhouse, J., & Moore, B. C. (2012). Exploring violin sound quality: Investigating English timbre descriptors and correlating resynthesized acoustical modifications with perceptual properties. Journal of the Acoustical Society of America, 131(1), 783–794.
Gabrielsson, A. (2001). Emotion perceived and emotion felt: Same or different? Musicae Scientiae, 5(1_suppl), 123–147.
Gabrielsson, A. (2003). Music performance research at the millennium. Psychology of Music, 31(3), 221–272.
Gabrielsson, A., & Juslin, P. N. (1996). Emotional expression in music performance: Between the performer's intention and the listener's experience. Psychology of Music, 24(1), 68–91.
Gabrielsson, A., & Lindström, E. (2001a). The role of structure in the musical expression of emotions. In P. N. Juslin & J. A. Sloboda (Eds.), Handbook of music and emotion: Theory, research, applications (pp. 367–400). Oxford, UK: Oxford University Press.
Gabrielsson, A., & Lindström, E. (2001b). The influence of musical structure on emotional expression. In P. N. Juslin & J. A. Sloboda (Eds.), Music and emotion (pp. 223–248). New York: Oxford University Press.
Giguére, G. (2006). Collecting and analyzing data in multidimensional scaling experiments: A guide for psychologists using SPSS. Tutorials in Quantitative Methods for Psychology, 2(1), 27–38.
Hevner, K. (1936). Experimental studies of the elements of expression in music. The American Journal of Psychology, 48(2), 246–268.
Hout, M. C., Papesh, M. H., & Goldinger, S. D. (2013). Multidimensional scaling. Wiley Interdisciplinary Reviews: Cognitive Science, 4(1), 93–103.
Huron, D. B. (2006). Sweet anticipation: Music and the psychology of expectation. Cambridge, MA: MIT Press.
Jaworska, N., & Chupetlovska-Anastasova, A. (2009). A review of multidimensional scaling (MDS) and its utility in various psychological domains. Tutorials in Quantitative Methods for Psychology, 5(1), 1–10.
Jensenius, A. R. (2007). Action-sound: Developing methods and tools to study music-related body movement (Doctoral dissertation). University of Oslo, Norway.
Juslin, P. N. (1997). Emotional communication in music performance: A functionalist perspective and some data. Music Perception, 14, 383–418.
Juslin, P. N., & Laukka, P. (2004). Expression, perception, and induction of musical emotions: A review and a questionnaire study of everyday listening. Journal of New Music Research, 33(3), 217–238.
Juslin, P. N., & Sloboda, J. (Eds.). (2001). Handbook of music and emotion: Theory, research, applications. Oxford, UK: Oxford University Press.
Juslin, P. N., & Västfjäll, D. (2008). Emotional responses to music: The need to consider underlying mechanisms. Behavioral and Brain Sciences, 31(5), 559–575.
Li, P.-C., Su, L., Yang, Y.-H., & Su, A. W. (2015, October). Analysis of expressive musical terms in violin using score-informed and expression-based audio features. In M. Müller & F. Wiering (Eds.), Proceedings of the 16th ISMIR Conference (pp. 809–815). Malaga, Spain.
McCrae, R. R., & Costa, P. T. (1990). Personality in adulthood. New York: The Guilford Press.
Picard, R. W., Vyzas, E., & Healey, J. (2001). Toward machine emotional intelligence: Analysis of affective physiological state. IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(10), 1175–1191.
Posner, J., Russell, J. A., & Peterson, B. S. (2005). The circumplex model of affect: An integrative approach to affective neuroscience, cognitive development, and psychopathology. Development and Psychopathology, 17(3), 715–734.
Russell, J. A. (1980). A circumplex model of affect. Journal of Personality and Social Psychology, 39(6), 1161–1178.
Russell, J. A. (2003). Core affect and the psychological construction of emotion. Psychological Review, 110(1), 145–172.
Shalev-Shwartz, S., & Ben-David, S. (2014). Understanding machine learning: From theory to algorithms. Cambridge, UK: Cambridge University Press.
Swaminathan, S., & Schellenberg, G. E. (2015). Current emotion research in music psychology. Emotion Review, 7(2), 189–197.
Tellegen, A. (1985). Structures of mood and personality and their relevance to assessing anxiety, with an emphasis on self-report. In A. H. Tuma & J. D. Maser (Eds.), Anxiety and the anxiety disorders (pp. 681–706). Hillsdale, NJ: Lawrence Erlbaum Associates.
Toegel, G., & Barsoux, J.-L. (2012). How to become a better leader. MIT Sloan Management Review, 53(3), 51–60.
Warr, P. B., Barter, J., & Brownbridge, G. (1983). On the independence of positive and negative affect. Journal of Personality and Social Psychology, 44(3), 644–651.
Watson, D., & Clark, L. A. (1984). Negative affectivity: The disposition to experience aversive emotional states. Psychological Bulletin, 96(3), 465–490.
Watson, D., & Clark, L. A. (1992a). On traits and temperament: General and specific factors of emotional experience and their relation to the five-factor model. Journal of Personality, 60(2), 441–476.
Watson, D., & Clark, L. A. (1992b). Affects separable and inseparable: On the hierarchical arrangement of the negative affects. Journal of Personality and Social Psychology, 62(3), 489–505.
Watson, D., & Tellegen, A. (1985). Toward a consensual structure of mood. Psychological Bulletin, 98(2), 219–235.
Watson, D., Wiese, D., Vaidya, J., & Tellegen, A. (1999). The two general activation systems of affect: Structural findings, evolutionary considerations, and psychobiological evidence. Journal of Personality and Social Psychology, 76(5), 820–838.
Yang, C.-H., Li, P.-C., Su, A. W., Su, L., & Yang, Y. H. (2016). Automatic violin synthesis using expressive musical term features. In P. Rajmic, F. Rund, & J. Schimmel (Eds.), Proceedings of the 19th International Conference on Digital Audio Effects (DAFx-16) (pp. 1–7). Brno, Czech Republic.
Yang, Y.-H., & Chen, H. H. (2012). Machine recognition of music emotion: A review. ACM Transactions on Intelligent Systems and Technology (TIST), 3(3), 1–30.
Zentner, M., Grandjean, D., & Scherer, K. R. (2008). Emotions evoked by the sound of music: Characterization, classification, and measurement. Emotion, 8(4), 494–521.