Previous work suggests that the perception of a visual beat in conductors’ gestures is related to certain physical characteristics of the movements they produce, most notably to periods of negative acceleration and to low position in the vertical axis. These findings are based on studies that have presented participants with relatively simple gestures, and in which participants have been required simply to tap in time with the beat. Thus, it is not clear how well these findings generalize to real-world conducting situations, in which a conductor uses considerably more complex gestures to direct an ensemble of musicians playing actual instruments. The aims of the present study were to examine the features of conductors’ gestures with which ensemble musicians synchronize their performance in an ecologically valid setting, and to develop automatic feature extraction methods for the analysis of audio and movement data. An optical motion capture system was used to record the gestures of an expert conductor directing an ensemble of expert musicians over a 20-minute period. A simultaneous audio recording of the ensemble’s performance was made and synchronized with the motion capture data. Four short excerpts were selected for analysis: two in which the conductor communicated the beat with high clarity, and two in which the beat was communicated with low clarity. Twelve movement variables were computationally extracted from the movement data and cross-correlated with the pulse of the ensemble’s performance, the latter derived from the spectral flux of the audio signal. Results of the analysis indicated that the ensemble’s performance tended to be most highly synchronized with periods of maximal deceleration along the trajectory, followed by periods of high vertical velocity (which showed a higher correlation than deceleration, but at a longer delay).
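The analysis pipeline described above (an audio pulse signal derived from spectral flux, cross-correlated at a range of lags with a movement variable) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the frame size, hop size, lag range, and function names are all assumptions introduced for the example.

```python
import numpy as np

def spectral_flux(audio, frame_len=1024, hop=512):
    """Frame-wise spectral flux: summed positive change in the magnitude
    spectrum between consecutive windowed frames (a common pulse/onset proxy).
    Frame and hop sizes here are illustrative, not from the study."""
    n_frames = 1 + (len(audio) - frame_len) // hop
    window = np.hanning(frame_len)
    mags = np.array([
        np.abs(np.fft.rfft(audio[i * hop : i * hop + frame_len] * window))
        for i in range(n_frames)
    ])
    diff = np.diff(mags, axis=0)
    # Keep only increases in energy; prepend 0 so length matches n_frames.
    return np.concatenate([[0.0], np.maximum(diff, 0.0).sum(axis=1)])

def lagged_corr(x, y, lag):
    """Pearson correlation between x and y with y delayed by `lag` samples
    (negative lag: x leads y)."""
    if lag > 0:
        x, y = x[lag:], y[:-lag]
    elif lag < 0:
        x, y = x[:lag], y[-lag:]
    return np.corrcoef(x, y)[0, 1]

def best_lag(movement, pulse, max_lag):
    """Cross-correlate a movement variable with the pulse signal over a
    window of lags; return the lag and correlation at the peak."""
    lags = np.arange(-max_lag, max_lag + 1)
    corrs = np.array([lagged_corr(movement, pulse, k) for k in lags])
    i = int(np.argmax(corrs))
    return int(lags[i]), float(corrs[i])
```

In this sketch, the sign and magnitude of the peak lag indicate whether (and by how much) the movement variable leads or trails the ensemble's pulse, which is the quantity of interest in the study.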
