Battles in Japanese role-playing games (JRPGs) present special problems for game immersion because of their sheer ubiquity and repetitiveness. This is particularly true of games that require the player to “grind”—that is, to engage in repetitive fights in order to level-up characters and thereby gain tactical advantages later. This paper argues that battle music is first and foremost a variety of functional music, a genre that fans measure not by its sonic beauty but by its psychological effectiveness. This point leads directly to a number of pressing questions: How do battle themes function? How do these functions relate to the all-important concerns of immersion and interactivity? How do we evaluate the effectiveness of battle music? And finally, what would a preliminary theory of battle music composition look like?
This article examines how the grind of JRPG fighting manifests as an integral part of battle music composition, analyzing three standard conventions in detail: (1) a clear opening audiovisual rupture, (2) a fanfare as cadence, and (3) a sustained period of harmonic stasis underpinning busy surface textures. This last phenomenon creates a sense of “musicospatial stasis”—that is, a musically induced sense of stasis that intermingles with and projects itself onto the visual and narrative fields. Because the repetitive grind of JRPG battles interrupts movement through the overworld, battle themes should be understood as ruptures in the sonic environment, just as the battle stage is a spatial rupture in the overworld. Battle themes therefore make little sense as analytical objects out of context: by definition, they signal a break that impedes the player’s movement throughout a larger environment.
At the 2015 meeting of the North American Conference on Video Game Music, William Gibbons presented research on the risks of cinematic game scores, and he used the main battle theme of Ni no Kuni: Wrath of the White Witch (Level-5, 2011) as one of his primary examples.2 This cue—a sumptuously orchestrated piece by Studio Ghibli veteran Joe Hisaishi—fell flat among some fans of Japanese role-playing games (JRPGs), who thought it ineffective despite its beauty as a piece of music. The most prominent critique along these lines came from Kirk Hamilton, at the time an editor-at-large for the prominent gaming website Kotaku:
Hisaishi’s work on Ni no Kuni is one of the most remarkable things about the game, both for its outright quality (it’s often jaw-droppingly good) and because it’s the kind of grand orchestration and performance that we rarely hear in video games.…But I just don’t care for the battle music.…The problem…is that Hisaishi has assembled this battle music like a regular non-video-game composition, without allowing for the requirements of JRPG battle music.…With the game’s battle music, the soundtrack’s more distinctive qualities work against it. Repetition is something that any video game composer should take into account, and a failure to do so can make otherwise exceptional music like the Ni no Kuni battle anthem feel chaotic and unpleasant.3
For Hamilton, Hisaishi wrote the battle theme of Ni no Kuni against the conventions of JRPG battle music, marring the player experience even as the piece stood out for its musical merits. This paradoxical relationship between compositional craft and in-game effectiveness has piqued the interest of several ludomusicologists, as it involves stakes well beyond the critical fortunes of a single game. Gibbons suggested that the theme didn’t effectively accompany the in-game action because it fell into a kind of “uncanny valley” between Western classical composition and traditional, loop-based game scoring. (In the Q&A afterward, one distinguished colleague offered that “sucky valley” might have been a more appropriate label, leading to some hilarity.) At another conference a few years later, Matthew Thompson suggested that the battle cue was simply much too slow, that it would have been perfectly fine if it were faster.4 Though these commentators all highlight different elements within Ni no Kuni’s battle theme, their concerns all circle around a central idea: a perceived mismatch of the music and its context, and consequently a failure to immerse the player in combat.
Immersion has long been a central concern for writers on game music, though the concept itself can be slippery to define. Spatial immersion—the feeling of being spatially transported into the narrative world of an artwork—is a phenomenon that literary critics and theorists of virtuality have explored in great detail over the last few decades, and it is this understanding of immersion that underpins this essay.5 Ludomusicologists, almost by definition, are those who join a passion for gaming with powerful epistemic motivations; we write about games because we have been immersed in them. Immersion has also been one of the key justifications for staking out the field. Music is one of the primary elements that integrates the experience of immersion, and William Cheng, Karen Collins, Winifred Phillips, Tim Summers, and Isabella van Elferen have all offered extended meditations on this theme.6 Their ideas generally emphasize the importance of audiovisual continuity, but Elizabeth Medina-Gray has observed that musical disjunction can also be essential for articulating virtual spaces—a key insight that we will return to later.7 Few talk about the implications of failed immersion, however, and I suspect this is for a very simple reason: in a sea of possible topics and examples, gamers/scholars reserve their time and attention for the games they like, and—as a rule—they like to be immersed.8 Immersion connects attention and pleasure, curiosity and play.
JRPG battles present special problems for musical immersion because of their sheer ubiquity and repetitiveness. A poorly designed cue might be overlooked if it’s heard only once, but in many games battle themes are heard hundreds of times. Hamilton goes so far as to claim that battle themes are “arguably the most important music” in JRPGs. “It’s the battle music that will accompany your many victories and defeats,” he states, “and so it had better be good.”9 The stakes are particularly high in games that require the player to “grind”—that is, to engage in repetitive fights in order to level-up characters and thereby gain tactical advantages later.10 How, then, to keep the player musically immersed through the hundred-odd battles on the way to yet another boss battle? Clearly designers and game composers frequently succeed, because JRPG battle themes are at the center of the emerging canon of game music, with wide representation in fan remixes, orchestral concerts, soundtrack sales, and ludomusicology conferences.11 Ni no Kuni’s relative failure in this arena can provide us with an important clue: battle music is first and foremost a variety of functional music, a genre that fans measure not by its sonic beauty but by its psychological effectiveness. This point leads directly to a number of pressing questions. How, then, do battle themes function? How do these functions relate to the all-important concerns of immersion and interactivity? How do we evaluate the effectiveness of battle music? And finally, can we create a preliminary theory of battle music composition?
In this essay, I examine how the grind of JRPG fighting manifests as an integral part of battle music composition, analyzing how a number of standard conventions articulate the spatial logic of a video game world. I focus on three sonic gestures: (1) a clear opening audiovisual rupture, (2) the fanfare as cadence, and (3) a sustained period of harmonic stasis underpinning busy surface textures. Julianne Grasso has recently touched on the first two, suggesting that RPG “musical transitions…act as framing devices that further clarify a distinction between the mode of exploration and the mode of battle.”12 I delineate these gestures further while offering a novel reading of harmonic stasis in battle themes. I define this last phenomenon as musicospatial stasis—that is, a musically induced sense of stasis that intermingles with and projects itself onto the visual and narrative fields. In developing these concepts of battle music construction, I seek to nuance earlier work in immersion, which has engaged deeply with questions of how music promotes spatial and psychological flow.13 I extend Grasso’s work on how game music mediates time and narrative engagement by demonstrating that game music also conditions the player’s sense of space at a visceral level.14 Because the repetitive grind of JRPG battles interrupts movement through the overworld, I argue that battle themes should be understood as ruptures in the sonic environment, just as the battle stage is a spatial rupture in the overworld. Battle themes therefore make little sense as analytical objects out of context: by definition, they signal a break that impedes the player’s movement throughout a larger environment.
A well-executed RPG enchants precisely because of its freedom, its judicious range of choices, its varied modes of play. Yet all RPGs have to negotiate a balance between linearity and openness, between the drive of an overarching narrative and the ludic possibilities of open-ended exploration.15 If the game is too linear, the player may grow restless at being shunted through a storyline with no meaningful decisions to make along the way. Too much open space, and one can become lost or bored. (Filling a large open world with compelling content can be a daunting task for developers, too.) As of this writing, most Western developers strongly emphasize open-world elements in their games, whereas Japanese role-playing games continue a decades-long tradition of more linear storytelling. Gibbons sums up the differences as follows:
Broadly speaking, Western RPGs tend to favor individualized character creation, free exploration, and the creation of dark, “realistic” fantasy worlds; JRPGs, on the other hand, typically privilege colorful, often cartoonish environments and situations, and a linear narrative with pre-established characters.…These differences, though increasingly pronounced in recent years, are the product of a decades-long process of stylistic evolution, driven by technological differences, cultural preferences, and eventually, generic expectations.16
These distinctions have far-reaching implications for musical scores. According to Gibbons, JRPGs are usually more stylistically eclectic, borrowing elements from rock and J-pop in addition to the classical and cinematic styles of Western RPGs.17 Musical loops based on common locations and functions (dungeons, battles, towns, etc.) are still common in JRPGs, though Western RPGs are increasingly moving toward shorter atmospheric cues, silence, or more interactive forms of underscoring.18 In this essay, I will focus specifically on JRPGs; while some of the following observations are applicable to Western RPGs as well, the conventions of the genres are different enough to require contrasting analytical methods.19
In a JRPG, the spaciousness of exploration comes to a sudden halt during battle. While fighting, the player confronts a radically constricted field of motion and possibilities. This restricted range of play frequently coincides with a narrower field of vision and a less flexible camera, creating a tension with the ludic freedom that is, in some sense, the raison d’être of gaming—as Collins puts it, the player’s perspective in games is “fundamentally different” from a viewer’s perspective in films and TV precisely because players usually have “some control over both the visual camera and the auditory perspective.”20 Battles—especially repetitive battles—hem gamers in and act as obstacles to further spatial and narrative progression.
What actually happens, musically, in a JRPG battle? In many games, there is a clear audiovisual break between the overworld and the smaller battlefield environment. This obviously has much to do with the memory limitations of early game consoles, which often needed to create a different computational space for exploration and battle functions.21 Technological advances eventually enabled the seamless integration of the battle system into the exploration of the overworld, and this approach has become common in hybrid genres such as the action-JRPG and the action-adventure game; one well-known example is The Legend of Zelda: Ocarina of Time (Nintendo, 1998), which uses music to warn the player that enemies are near, gradually edging the player into combat.22 Yet a dedicated battle space remains the convention in standard JRPGs, especially those with turn-based battle systems. Three classic examples of a spatial break occur in Zelda II: The Adventure of Link (Nintendo, 1987), Pokémon Yellow (Nintendo, 1998), and Final Fantasy X (Square, 2001).23
All three of these games use sonic cues to transition from one space to another. Zelda II uses a chromatic figure rocketing up and down, while Pokémon uses a swirling, dissonant musical gesture that mirrors the spiraling visual dissolve (see Example 1). In Final Fantasy X the overworld space literally shatters and is swept away (see Figure 1).
There is, of course, a strong mimetic element present in the sounds and visual motion of the latter two examples. Only in Zelda II does no visual transition work in tandem with the sound effect, an omission that I suspect stems from technological limitations. At the spatial break, Zelda II suddenly flashes a plain blue screen, which ends up functioning as the blue sky in the backdrop of the battle area.
Once the battle space appears, it’s game on or game over. (Some JRPGs let you escape, too, but we will forgo a discussion of the soundtracks of failure and cowardice here.26) A well-designed battle system gives the player plenty of choices on the way to the battle’s end, but the possible outcomes are extremely limited. Each has its own musical resolution:
Battle begins →
Victory → Fanfare
Defeat → Game Over
Escape → (variable; sometimes no cue)
Assuming the player fights on to victory, the game has to move back into the main exploratory space. In most JRPGs, this visual transition is much less sudden or violent than entering battle. Instead of a quick dissolve or a shattering screen, the game may announce the battle’s end by lingering on the final attack, having the characters flourish their weapons, or, in the case of Final Fantasy (Square, 1987), treating the player to an 8-bit victory dance. But like the entrance into the battle space, victory almost invariably receives its own musical treatment.
Victory cues are found throughout all sorts of games, and in fact, the fanfares closing JRPG battles are derived from much earlier entertainment systems. In her history of game sound, Collins claims that sound effects were vital to the experience of early gambling machines: “Sound was a key factor in generating the feeling of success, as sound effects were often used for wins or near wins, to create the illusion of winning.”27 While the earliest video games featured similarly basic sound design, games began including much more elaborate music in the 1980s. According to Neil Lerner, the “tonally and motivically related cues” underpinning Donkey Kong (Nintendo, 1981) and Super Mario Bros. (Nintendo, 1985) drew on the aesthetics of silent cinema, accompanying the on-screen action with an algorithmic “score” of melodramatic cues.28 The infamous ♭6–♭7–I cadence that heralds a level’s end in Super Mario Bros. (see Example 2) also concludes many JRPG battles, and the number rises further if one considers closely related progressions as well.29 The many variations of this chord progression could fill page after page; two representative examples occur in Final Fantasy (see Example 3) and Tales of Symphonia (Namco, 2003) (see Example 4).
In these examples as well as in JRPGs more generally, the fanfare acts as a sudden final cadence to the battle, opening a celebratory theme in a closely related key. (When released on an original soundtrack, the fanfare and victory theme are almost invariably a single track.) The rhythmic pulse usually settles into a slower tempo as the tension of battle dissipates. Music mediates the change from the battle space to the overworld area, with a possible detour through a ludic non-place: depending on the game, one may pass through extensive menus showing experience gained and all the great loot acquired. A quick fade or dissolve, and the overworld reappears with its exploratory theme.
As we can see from these examples, the battle arena—what we might call the battle function of the classic JRPG—is not just removed from the rest of the game spatially; it is also portioned off sonically. Medina-Gray’s analytical concepts of modularity, smoothness, and disjunction provide a useful way to read these transitions. Game music is modular, in that it consists of many different loops or modules that can be stitched together in a multitude of ways depending on the player’s actions—in a very real sense, players assemble the score every time they play the game.31 Modularity is “how game music moves from…‘rule-based space’ of the underlying code and into the ‘mediated space’ of the game’s audio-visual and tactile presentation to the player,” and the seams between modules “are as critical a part of real-time game soundtracks as are the modules themselves.”32 Game composers and sound designers can intentionally vary the smoothness or disjunction of the musical seams for immersive effect.
By incorporating Medina-Gray’s insights into our reading, we can recapitulate what we have discussed so far into the following schematic of the modules and transitions of a JRPG battle:
Overworld modular loop →
Disjunct transition →
Battle music modular loop →
Disjunct fanfare leading smoothly into victory theme →
Fade out or dissolve →
Overworld modular loop
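The schematic above can also be read as a small state machine. The following Python sketch is offered only as an illustration under stated assumptions: the names Module, SEAMS, and trace, as well as the event labels "encounter", "victory", and "exit", are hypothetical and are not drawn from any game engine or from Medina-Gray's work. Only the victory path is modeled; defeat and escape are left out.

```python
from enum import Enum


class Module(Enum):
    """Looping or through-composed musical modules (hypothetical labels)."""
    OVERWORLD = "overworld loop"
    BATTLE = "battle loop"
    VICTORY = "victory theme"


# (current module, gameplay event) -> (next module, seam between them).
# The event names are illustrative assumptions, not engine terminology.
SEAMS = {
    (Module.OVERWORLD, "encounter"): (Module.BATTLE, "disjunct transition"),
    (Module.BATTLE, "victory"): (Module.VICTORY, "disjunct fanfare"),
    (Module.VICTORY, "exit"): (Module.OVERWORLD, "fade out or dissolve"),
}


def trace(events, start=Module.OVERWORLD):
    """Return the sequence of modules and seams heard for a list of events.

    Events with no seam defined for the current module are ignored,
    which corresponds to the current loop simply continuing to repeat.
    """
    current = start
    heard = [current.value]
    for event in events:
        key = (current, event)
        if key not in SEAMS:
            continue  # no seam here: the current module keeps looping
        current, seam = SEAMS[key]
        heard.extend([seam, current.value])
    return heard


if __name__ == "__main__":
    for step in trace(["encounter", "victory", "exit"]):
        print(step)
```

Keeping the seams in a table separate from the modules reflects the point that the transitions are as much a part of the score as the loops themselves: the player's actions select the events, and the table determines which seam is heard.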
The player returns where they left off in the overworld, as does the music. A violent transition interrupted this larger space and projected the player into the battle arena, while a disjunct fanfare and smoother victory theme ushered them out. This leaves us with all the materials in the middle. What about the music that takes place within the battle space? What about the battle music itself?
Music in the Fray
Battle music often involves the most-repeated themes in the game; these are the cues that get ground deep into a player’s memory and stay there for years. A cursory survey of classic battle themes reveals some common recurring characteristics, such as minor keys, fast tempos, dissonance, driving bass lines, chaotic textures, relatively harsh or noisy timbres, and so forth. (These characteristics are also typical of the combat cues in action-adventure games.33) Collins suggests that combat music provides some of the clearest examples of musical “mood induction” available within gaming: “Mood induction and physiological responses are typically experienced most obviously when the player’s character is at significant risk of peril, as in the chaotic and fast boss music.”34 These mimetic gestures parallel the narrative tension and create the proper physiological state for the player. Yet the mimesis is not just physiological: it is also spatial, because battle themes mediate a player’s perception of space at a fundamental level.
The harmonic content of battle music is often rather threadbare, full of the simplest of chord progressions. In a surprisingly large number of cases, the bass line goes nowhere at all. Obviously, this is not true of all battle music; as Aaron N. Price has recently shown, Motoi Sakuraba’s battle themes are often highly harmonically active, developing a rhythmic “groove” that hooks players into the music.35 Yet many battle themes include long stretches of harmonic nothing. Consider this bass ostinato from Chrono Trigger (Square, 1995) in Example 5.
This bluesy bass groove underpins the entirety of the battle theme, interlocking with a series of additional ostinatos that further complicate the rhythmic texture. Other instrumental lines feature rising and falling gestures that suggest other Dorian-inflected harmonies, but these nonfunctional progressions provide no escape from the bass’s relentless D. While attempting a detailed harmonic analysis of Chrono Trigger’s variations on D Dorian might be amusingly pedantic, collapsing its chord structures would tell us very little about what makes it effective as a piece of music.
Many other battle themes also include obsessive ostinatos or long pedal tones where the bass holds the listener in a tonally immobile space; we might even say the bass line grinds in place. These sections can occur at the beginnings of pieces, or sometimes they appear as contrasting sections within larger loops. A representative sampling can be pulled from Super Mario RPG (Square, 1996), Final Fantasy VIII (Square, 1999), Persona 3 (Atlus, 2006), and Bravely Second: End Layer (Silicon Studio, 2015), and a brief survey of these can tell us much about why battle themes are structured as they are.
The main combat theme in Super Mario RPG, “Fight against Monsters,” is unusual for being in a major key, but otherwise it is constructed very similarly to Chrono Trigger’s battle theme: a series of layered ostinatos combine in a rhythmically energetic texture of beeps and lo-fi synthesized brass. Drum fills give a sense of direction at the end of each loop, which lasts about twenty-eight seconds. A single bass ostinato runs underneath almost the entire piece (see Example 6).
Though the bass outlines a functional progression, its quick tempo, repetitiveness, and banality give a sense of total harmonic stasis within the loop. For as long as the battle lasts, the player is stuck in a mood of relentless cheerfulness. Besides this bass line, a number of melodic ostinatos play in turn, many of them closely related to each other (see Example 7). Although the entire loop can be divided into four-measure sections, the ostinatos’ differing lengths provide some variety, as some of them can be heard as one- or two-measure phrases.
The other battle themes in Super Mario RPG also are short and highly repetitive, though none of them are so ebullient. “Fight against a Somewhat Stronger Monster” has significantly more dissonance and more unpredictable harmonic turns, as does “Fight against an Armed Boss.” Composer Yoko Shimomura also included a parody of the boss music from Final Fantasy IV for Culex, a hidden boss and the single most difficult enemy in Super Mario RPG.
Final Fantasy VIII’s secondary battle theme, “Man with the Machine Gun,” is significantly more complex than the Chrono Trigger and Super Mario RPG examples earlier, yet it too is constructed around two ostinatos and their transformations (see Example 8). The first ostinato always plays at pitch; the second is sometimes transposed or otherwise altered in order to fit the chord changes that occur. Furthermore, these ostinatos can work either in sequence or layered on top of one another. At least one plays throughout the entire piece except for a short, eight-measure phrase at the end of the loop. This motivic economy—a less charitable analyst might say motivic poverty—works in tandem with the limited harmonic palette to create a sense of stasis. (The first chord change appears at m. 17, approximately thirty seconds into the piece.) Electronica timbres and a dance beat round out the piece. Despite this simplistic construction—or perhaps because of it—“Man with the Machine Gun” has become a fan favorite, with a significant afterlife in remixes and keyboard paraphrases. An orchestral arrangement by Shirō Hamaguchi has played in concert halls around the world.37
A somewhat different approach can be seen in “Mass Destruction,” the main battle theme of Persona 3. Unlike the other themes we have considered, this one has vocals; it also incorporates a wider variety of styles, borrowing elements from rock, hip-hop, and J-pop. When the player encounters an enemy, a frenzied vocal introduction breaks the overworld space and launches the player into the battle arena. The following segment is dominated by blues- and Phrygian-inflected horn riffs accompanied by electric guitar power chords on G-sharp minor (see Example 9, m. 1). Rap vocals join, and the electric guitars develop the modal elements further (see Example 9, mm. 2–5).
About fifty-five seconds into the track, dance piano riffs and J-pop vocals burst onto the texture. The harmonic structure loosens slightly, including walking riffs that toggle between G-sharp minor and E major with passing chords in between. Over the course of the loop, the effect is one of sudden, grinding stasis gradually expanding outward into a tonal progression. The regular battles of Persona 3 do not end with a fanfare; although the loop can repeat verbatim, vocal introduction and all, many normal battles will end so quickly that this gradual tonal unfolding will take the player all the way back to the overworld.
In Bravely Second: End Layer, many of the techniques outlined earlier come together. The primary combat theme, “The Battle Bell Tolls,” begins with a guitar solo in ascending diminished triads (see Example 10). Afterward, the theme proper begins. A thundering bass ostinato on C rolls onward: its rhythmic groove propels each two-bar repetition forward, each phrase attacking the next with a Phrygian neighbor tone. Wailing electric guitars enter with a highly syncopated melody. The bass escapes from C after twelve measures, only to slowly walk up and cadence on C again (see Example 11).
This harmonic grinding does not last forever, but it is nonetheless one of the focal points of the battle theme, accompanying its most memorable melodic material. The remainder of the loop develops a wider range of harmonies, including a cadence on D and a dissonant sequence, before returning to C minor.
Harmonically static combat themes are ubiquitous throughout JRPGs. There are many other examples that we could have chosen, and if we count themes that feature classic prolongations such as descending tetrachords in minor, the pool would have been even larger.38 One could argue that this lack of harmonic motion is a sign of the structural weakness of game music, that it is just a musical fetish-object for regressive listeners. (I suspect that the anti-lowbrow music philosopher Theodor Adorno, for example, would have been very scornful.39) I believe, however, that we should consider this lack of harmonic motion not as evidence of musical weakness but as a key stylistic element in battle music composition. Harmonic stasis is not present in all battle themes, of course, but it is a common enough feature that we should consider it a central convention of the form.
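For readers who want to count such cases more systematically, one hedged way to operationalize harmonic stasis is to reduce a loop to one bass pitch per measure and ask how much of the loop a single pitch occupies. The Python sketch below rests on several assumptions: the one-pitch-per-measure reduction, the function names bass_stasis and longest_pedal, and the toy bass line are all invented for illustration, and the toy line is not a transcription of any cue discussed here.

```python
from collections import Counter


def bass_stasis(bass_by_measure):
    """Return the most common bass pitch and the share of measures it holds."""
    if not bass_by_measure:
        raise ValueError("need at least one measure")
    pitch, count = Counter(bass_by_measure).most_common(1)[0]
    return pitch, count / len(bass_by_measure)


def longest_pedal(bass_by_measure):
    """Return the length, in measures, of the longest run on a single bass pitch."""
    longest = run = 0
    previous = None
    for pitch in bass_by_measure:
        run = run + 1 if pitch == previous else 1
        longest = max(longest, run)
        previous = pitch
    return longest


if __name__ == "__main__":
    # Invented toy loop: twelve measures on C, a short walk, and a return to C.
    toy_loop = ["C"] * 12 + ["D", "Eb", "G", "C"]
    print(bass_stasis(toy_loop))
    print(longest_pedal(toy_loop))
```

A measure of this kind says nothing about why a static bass is effective; it only makes explicit what is being counted when a theme is called harmonically static.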
The purpose of this stasis becomes clear when we consider what battles actually do in the course of a JRPG. In many classic games, battles are about grinding. Ordinarily, the battle arena is not a space where the plot moves forward or important quest items are obtained: the battle space is an obstacle that the player has to break through in order to level-up, move through environments, and progress through the game. The harmonic stasis of much battle music thus corresponds perfectly with what happens spatially in the battle arena—the bass foundation holds the player in place within this restricted environment, while the melodic lines create a chaotic counterpoint above, mirroring the on-screen action. The main battle theme therefore is always experienced as an event within a larger spatial context. (It is important to note that this observation only applies to the main battle themes in a game; in most RPGs, battles that have a special plot significance—such as boss battles—get their own musical cues.) The theme is actually a rupture in the sonic fabric of the game, just as the battle space is an interruption of the main exploratory space. The simultaneous interruption of sound and space is articulated musically, creating what we can call musicospatial stasis.
The bass has a paradoxical dual role in the creation of this musicospatial stasis. Bass frequencies have long been used to immerse participants in contemporary spectacles: consider the heart-pounding subwoofers of movie theaters, rock concerts, or theme park attractions. Collins describes this property of sound as envelopment, which she defines as “the sensation of being surrounded by sound or the feeling of being inside a physical space (enveloped by that sound).” According to Collins, this feeling is most commonly “accomplished through the use of the subwoofer and bass frequencies, which create a physical, tangible presence for sound in a space”; because sound “is an extension of ourselves and also physically permeates us,” it can be more immediate than images, which are “external to us.”40 In JRPG battle themes, the bass therefore serves at least two functions: it envelops the player due to its acoustic properties, and it grounds the player through its harmonic content.
The harmonic stasis and repetitious structures of battle themes do not, for the most part, bother players because more varied and elaborate musical cues would actually contradict the spatial and temporal structures set up in JRPGs. The music’s aesthetic impact is inextricable from its functional role of integrating the gamer’s experience of the battle space. Summers corroborates this insight from another angle, drawing on an example from Final Fantasy VII (Square, 1997):
One gamer describes how listening to “Anxious Heart” [a slow, reflective environmental theme] allowed them to “grind”…and achieve success later in the game.…The player implies that the repetitious task of grinding is made more easily bearable through listening to the same cue repeatedly (albeit punctuated by the battle mode music). The looped cue apparently makes a task that is arduous because of its unvaried repetition, more, rather than less, easily completed.41
In other words, the constant, monotonous toggling between “Anxious Heart” and the battle theme (“Fighting”) did not disturb the gamer or their immersion within the gameplay; in fact, it centered them within the experience. Repetition, on its own, does not harm immersion:
This paradox reveals a crucial aspect of game music repetition. Direct repetition is not so much of a problem in game music as in other musical forms, partly because of the repetitious qualities of the medium. It is common for game music to be criticized for being somehow annoying, but more rarely for repetition alone, apart from in extreme cases.42
In highly repetitive modes of gameplay, monotonous sound design can actually heighten the player’s experience. As Sarah Gates has argued, music design can even encourage or discourage grinding; the music and sound design can condition or “texture” the gameplay, as Summers puts it.43 The stasis of a battle theme does not bother gamers, because they do not hear the theme by itself: they always hear it in alternation with contrasting material. Battle themes are simply a highly charged, harmonically static part of the vast and aleatoric musical structure that is the game.
Obstacles and Flow
Interpreting battle themes as a sonic rupture has interesting implications for how we theorize musical immersion, which has long been a central concern in gaming design. Sonic immersion is particularly important in RPGs, which require the player to maintain interest in long quests through imaginary landscapes and cultures. Winifred Phillips, a veteran composer in the industry, puts it this way:
As the composer for an RPG…the primary focus should be the enhancement of the world.…All the components of an RPG are structured to encourage the player to get out into the world and interact with it, learning about the people and culture while simultaneously advancing a compelling storyline through successful combat and the completion of quests. The music should surround the player with aural details about the intrinsic nature of the setting in which the game takes place. In essence, the music should serve as a world builder, joining forces with all the other elements of game design, visual artistry, and storytelling to complete the sensation of full immersion in the role-playing experience.44
Game music scholars such as van Elferen agree. In her influential essay on the ALI model of musical immersion, van Elferen argues that affective cues, prior musical literacy, and interactivity with the sonic environment all combine in a single immersive illusion.45 Phillips and van Elferen emphasize the importance of continuity of the soundtrack and the gaming world. But battle environments present a fascinating irony, in that the space itself is discontinuous. In order to create a continuous illusion, the sound design has to smooth over—one might even say justify—the discontinuities of the gaming space. As Medina-Gray puts it, “When musical disjunction supports usability and world-creation…it thus contributes to the player’s immersion in gameplay and the virtual world.”46 To mirror the environment, the music must include sonic ruptures and harmonic blockages that impede the player’s movement. Flow demands obstacles.
The relationship of in-game space and time reflects this paradox. Grasso argues that music mediates the player’s sense of time, in terms of both the gameplay and the narrative. For her, RPGs show that there is “something fundamental about music as an aesthetic element of video games: music exists through time, and time in games is organized by narrative events facilitated by play. Music, while meant to be heard…is also meant to accompany the actions of the player in that space.”47 Grasso examines Final Fantasy IV (Square, 1991) and its cave sequence as part of her argument that the relationship between time and narrative is “particularly multifaceted in RPGs,” that “music fills the spaces of these virtual worlds, inviting players to experience a musical presence in the game’s fiction,” and that music “signals events in the game narratives by marking those changes with musical ones.”48 To build on Grasso’s insights, we can say that the relationship between music and space is also particularly multifaceted in JRPGs: music mediates the player’s sense of space, and the feeling of being spatially immersed within a game is deeply entangled with the complex interactions of a game’s musical, narrative, and temporal structures.
By way of summary, let us consider another case study: Grandia II (Game Arts, 2000). The composer, Noriyuki Iwadare, puts a fascinating spin on the problem of harmonic motion in battle themes. Any ordinary battle can take one of two distinct battle themes, depending on how the player enters the fray. If the player moves in aggressively and takes the initiative, the game plays the theme “FIGHT!! Vers. 1”; if the player dawdles or gets caught unawares, it triggers the imaginatively titled “FIGHT!! Vers. 2.” Both loops have long periods of harmonic stasis, but “FIGHT!! Vers. 1” includes a more active, harmonically energetic opening (see Example 12).
After this energetic opening, “FIGHT!! Vers. 1” lands on an A minor chord. The bass grinds on A for fifteen measures, but the upper voices move through a variety of chords, creating significant harmonic tension. Harmonically static and active segments alternate throughout the rest of the battle. “FIGHT!! Vers. 2,” on the other hand, has essentially no harmonic movement whatsoever. The piece lands on an E minor seventh chord and stays there—until the player achieves victory. (Every ninety seconds or so, the bass moves rapidly through D-D-D#-E, but as this cadences back on E minor it can scarcely be thought of as harmonic motion.) Both versions feature driving rock rhythms, engaging solos, and lots of jagged timbres. Both include extended periods of harmonic stasis, but the placement of this stasis mirrors the player’s engagement with the battle space. In Version 1, the player takes an active role and gets to input all of their attack commands before the enemy does; in Version 2, the player has to sit back and watch helplessly as the enemies clobber their characters first. Iwadare’s battle themes for Grandia (Game Arts, 1997) are similarly distinct: a harmonically active theme plays if the gamer strikes first, and a harmonically passive theme plays when the gamer is ambushed.
We are now equipped to answer our original questions: How do battle themes function? And how do we evaluate the effectiveness of battle music? Battle themes function by articulating the space in which the battle takes place. This articulation happens over a number of parameters, but among the most important is the juxtaposition of busy, minor-key surface textures and a driving bass line that grinds in place. The success of a battle theme cannot be measured just by evaluating its originality and musical sophistication; any analysis of a battle theme has to take into account its effectiveness. To be effective, a battle theme has to negotiate both the continuities and discontinuities of the space in which the game takes place. A battle theme has to immerse the player within the battle arena while also frustrating them, providing a sense of musical and spatial release when they finally can exit and continue on with the game.
By way of conclusion, let’s apply these concepts to Ni no Kuni to clarify why players might have perceived its battle music as ineffective or poorly designed. Ni no Kuni has all of the moving parts we would expect from a classic JRPG battle: a clear audiovisual transition into the arena, long periods of harmonic stasis in the music itself, and a fanfare as a final victory cadence. Yet the theme fails to hold within itself the essential tension of a battle theme: the harmonic ground is not reiterated strongly enough, and the textures are not sufficiently chaotic to provide a convincing backdrop for the on-screen action. The orchestral sounds are beautifully rounded where they would ordinarily be noisy and penetrating; the bass line is elegantly suggestive where, in other games, it would assault and violently restrain the player. The music mirrors neither the immobility of the space nor the mobility of the action.
My intention in all this is not to say that this is bad music or that Hisaishi is an ineffective composer; far from it. I think it is more likely that Studio Ghibli was trying to create an experience somewhat different from the typical JRPG, just as their films are something apart from other anime. But I would argue that Ni no Kuni represents a strong departure from convention, whether that departure was intentional or unintentional. The score forgoes the kinds of internal tension that ordinarily drive the spatial logic of JRPG battle systems. This unconventional approach that fans and scholars alike have noticed is not a mirage: it is a real feature of Ni no Kuni’s score, it can be immediately perceived, and it has powerful implications for how we analyze battle music.
The examination of battle themes and their gestures, then, can provide us with a preliminary model of battle music construction. Battle themes are characterized by a powerful harmonic ground working in opposition with surface chaos; the music fuses the confinement and inevitability of the fighting environment with the unpredictability of the action within that space. Music mediates the spatial logic of the JRPG and clearly articulates the motion between the places of the overworld and the arena; as was noted earlier in the discussion on fanfares, music even mediates the player’s motion through non-places, such as menus. This, then, is the sound of the grind—a musicospatial stasis that creates satisfaction through harmonic frustration, continuity in discontinuity, and immobility within chaos.
The author is grateful to Richard J. Anatone, Karen Cook, Sarah Gates, Will Gibbons, Matthew Hoover, Neil Lerner, Elizabeth Medina-Gray, Ross Mitchell, Dana Plank, Aaron N. Price, Tim Summers, and the anonymous readers of this journal for their unfailingly helpful commentary, shared research, and vital encouragement throughout the process of developing this article. He would also like to thank Joel Armstrong, Steven Cooper, and Anthony Johnson, whose influence is woven throughout this work.
William Gibbons, “Navigating the Musical Uncanny Valley: Red Dead Redemption, Ni no Kuni, and the Dangers of Cinematic Game Scores” (presentation, North American Conference on Video Game Music, Fort Worth, TX, January 17, 2015). Gibbons also touches briefly on Ni no Kuni in his essay “Music, Genre, and Nationality in the Postmillennial Fantasy Role-Playing Game,” in The Routledge Companion to Screen Music and Sound, ed. Miguel Mera, Ronald Sadoff, and Ben Winters (New York: Routledge, 2017), 412–27.
Kirk Hamilton, “The Curious Case of Ni no Kuni’s Unpleasant Battle Music,” Kotaku, February 4, 2013, accessed December 28, 2020, https://kotaku.com/the-curious-case-of-ni-no-kunis-unpleasant-battle-music-5981512.
This conversation took place in the Q&A after an early version of this article, which I presented as “Sounding the Grind: Musicospatial Stasis in Classic RPG Battle Themes” (presentation, North American Conference on Video Game Music, Hartford, CT, March 30, 2019). Note that Ni no Kuni includes an alternate, much faster version of the piece as “Battle II,” suggesting that Hisaishi was aware that the original “Battle” track was indeed unusually slow.
For a much more detailed examination of the varieties of immersion, including spatial immersion, see Marie-Laure Ryan, Narrative as Virtual Reality: Immersion and Interactivity in Literature and Electronic Media (Baltimore and London: Johns Hopkins University Press, 2001), especially Part II, “The Poetics of Immersion,” 89–174.
See the introduction to William Cheng, Sound Play: Video Games and the Musical Imagination (Oxford, UK: Oxford University Press, 2014), 3–18; Karen Collins, Playing with Sound: A Theory of Interacting with Sound and Music in Video Games (Cambridge, MA: MIT Press, 2013), particularly chapter 2, “Being in the Game: A Sonic Approach,” 39–58; Winifred Phillips, A Composer’s Guide to Game Music (Cambridge, MA: MIT Press, 2014), especially chapter 3, “Immersion: How Music Deepens the Play Experience,” 35–54; Tim Summers, Understanding Video Game Music (Cambridge, UK: Cambridge University Press, 2016), chapters 3–4, “Texturing and the Aesthetics of Immersion” and “Music and Virtual Game Worlds,” 57–115; Isabella van Elferen, “Analysing Game Musical Immersion: The ALI Model,” in Ludomusicology: Approaches to Video Game Music, ed. Michiel Kamp, Tim Summers, and Mark Sweeney (Sheffield, UK: Equinox, 2016), 32–52; and Isabella van Elferen, “Virtual Worlds from Recording to Video Games,” in The Cambridge Companion to Music in Digital Culture, ed. Nicholas Cook, Monique M. Ingalls, and David Trippett (Cambridge, UK: Cambridge University Press, 2019), 209–26.
Elizabeth Medina-Gray, “Modular Structure and Function in Early 21st-Century Video Game Music” (PhD diss., Yale University, 2014), 9.
For one example of a discussion of failed immersion, see Summers on Advent Rising (GlyphX Games, 2005) in his Understanding Video Game Music, 150–55. On the blurring of the roles of ludomusicologist and fan, see Summers again: “These chapters are, of course, a tour of the favorite sites, views, texts and obsessions of your guide-cum-narrator. But to admit my bias and forfeit any pretence at objectivity is not to discredit my arguments or choices.…It would be disingenuous to claim that subjective fun, play, and enjoyment should have no part in the study and discussion of game music. This is, after all, the primary reason why we engage with games” (57).
Hamilton, “Ni no Kuni’s Unpleasant Battle Music.”
For more on the musical implications of grinding in the Final Fantasy series, see Sarah Gates, “Enjoying the Grind: Musical Encouragement of Repetitive Action in Final Fantasy X” (presentation, Society for Music Theory, Arlington, VA, November 3, 2017); Stefan Greenfield-Casas, “Quick Takes—Playing with Time in the Zodiac Age,” Musicology Now, August 18, 2017, accessed December 29, 2020, http://www.musicologynow.org/2017/08/playing-with-time-in-zodiac-age.html; and Lee Hartman, “Quick Takes—A Double-Edged Sword: How FFXII: The Zodiac Age’s Score Keeps the Player Engaged Despite the Game’s Heavy Automation,” Musicology Now, August 23, 2017, accessed December 29, 2020, http://www.musicologynow.org/2017/08/quick-takes-double-edged-sword-how.html.
For a variety of perspectives on game music’s coalescing canon, see the colloquy “Canons of Game Music and Sound,” Journal of Sound and Music in Games 1, no. 1 (January 2020), 75–99, accessed September 20, 2020, https://online.ucpress.edu/jsmg/issue/1/1. On the intersections of classical concert culture and video games, see William Gibbons, Infinite Replays: Video Games and Classical Music (Oxford, UK: Oxford University Press, 2018), especially the final chapter, “Classifying Game Music,” 157–71.
See Julianne Grasso, “Music in the Time of Video Games: Spelunking Final Fantasy IV,” in Music in the Role-Playing Game: Heroes and Harmonies, ed. William Gibbons and Steven Reale (New York: Routledge, 2020), 98.
Flow, a psychological concept developed by Mihály Csíkszentmihályi in works such as Flow: The Psychology of Optimal Experience (New York: Harper and Row, 1990), has been highly influential in video game design and analysis. Flow—a state characterized by pleasurable, creative absorption in one’s task, leading to an altered perception of time—is almost a synonym for being in a state of immersion. As game designer Scott Rogers puts it, “The goal of good level design is to help players achieve what psychologist Mihály Csíkszentmihályi calls flow” (emphasis in original); see his Level Up! The Guide to Great Video Game Design, 2nd ed. (West Sussex, UK: Wiley, 2014), 364.
Grasso, “Music in the Time of Video Games,” 98.
Games mingle both narrative and ludic elements, causing difficulties for textual approaches engaging with only one component. According to Cheng, early theoretical studies of games tended toward “overdrawn (albeit occasionally productive) oppositions between narratological and ludological approaches”; see his Sound Play, 181n38.
Gibbons, “Music, Genre, and Nationality,” 413.
Gibbons, “Music, Genre, and Nationality,” 415–16.
Gibbons, “Music, Genre, and Nationality,” 419–20.
For a recent collection of case studies covering both Western and Japanese RPGs, see Gibbons and Reale, Music in the Role-Playing Game.
Collins, Playing with Sound, 49.
Famous counterexamples include The Legend of Zelda (Nintendo, 1986) and Chrono Trigger (Square, 1995), in which battle and exploration functions occur in the same virtual space.
The musical design elements of action-JRPGs are beyond the scope of this essay, but some representative examples include Kingdom Hearts III (Square Enix, 2019), Tokyo Xanadu eX+ (Nihon Falcom, 2016), and most intriguingly, Ni no Kuni II: Revenant Kingdom (Level-5, 2018). For a detailed discussion of one action-adventure’s sound design for combat, see Tim Summers, The Legend of Zelda: Ocarina of Time: A Game Music Companion (Bristol and Chicago: Intellect, 2021), 215–27.
Note that Zelda II, despite its influence on the JRPG as a genre, is usually categorized as an action-adventure game, not a JRPG.
This transcription is adapted by the author from several fan-made transcriptions linked in Rose Bridges, “Music Transcription and Video Game Fandom: A Reception Study,” last modified December 11, 2016, accessed September 8, 2020, https://scalar.usc.edu/works/video-game-music-transcription/index. Providing exact credits for transcriptions of famous cues is not always possible, as many transcriptions are either anonymously or pseudonymously published and fans frequently build on each other’s work—the game music community is, in many ways, a crowdsourced community. All other transcriptions are the work of the author unless otherwise attributed.
Image captured by the author from Final Fantasy X/X-2 HD Remaster (Square Enix, 2019), Nintendo Switch.
For one discussion of game over cues, see Summers, Legend of Zelda: Ocarina of Time, 241–42.
Karen Collins, Game Sound: An Introduction to the History, Theory, and Practice of Video Game Music and Sound Design (Cambridge, MA: MIT Press, 2008), 8.
Neil Lerner, “Mario’s Dynamic Leaps: Musical Innovations (and the Specter of Early Cinema) in Donkey Kong and Super Mario Bros.,” in Music in Video Games: Studying Play, ed. K. J. Donnelly, William Gibbons, and Neil Lerner (New York: Routledge, 2014), 1–29.
In the Final Fantasy series, for example, the victory themes of the first ten games all use closely related melodic material, though some have slightly altered harmonies underpinning the melody.
This transcription is adapted by the author from Jason Brame, “The Mario Cadence,” 8-Bit Analysis, April 1, 2011, accessed September 16, 2020, http://gamemusictheory.blogspot.com/2011/04/mario-cadence.html.
Elizabeth Medina-Gray, “Modularity in Video Game Music,” in Ludomusicology: Approaches to Video Game Music, ed. Michiel Kamp, Tim Summers, and Mark Sweeney (Sheffield, UK: Equinox, 2016), 60–65.
Medina-Gray, “Modular Structure and Function,” 4 and 7.
For example, see Summers’s discussion of rhythmic ostinatos, harmonic structure, and dissonance in Legend of Zelda: Ocarina of Time, 217–23.
Collins, Game Sound, 133.
Aaron N. Price, “From Grinding to Grooving: An Investigation of Motoi Sakuraba’s RPG Combat Music” (presentation, North American Conference on Video Game Music, June 13, 2020).
Adapted from a transcription by YouTube user GUIM, reproduced in Price, “From Grinding to Grooving.”
This arrangement was first commercially released as the eighth track of the compact disc FITHOS LUSEC WECOS VINOSEC: Final Fantasy VIII (Tokyo: DigiCube, 1999); it was also frequently programmed on the Final Fantasy: Distant Worlds concert series, which launched in 2007 and toured worldwide until the COVID-19 crisis.
Descending tetrachords—for example, the chord progression A minor, G major, F major, and E major repeated over and over again—have the special property of obsessively reiterating (i.e., prolonging) a tonal center even as they fall away from that center.
Theodor W. Adorno, “On the Fetish Character in Music and the Regression of Listening,” in The Culture Industry: Selected Essays on Mass Culture, ed. J. M. Bernstein (New York: Routledge, 2001), 29–61.
Collins, Playing with Sound, 54.
Summers, Understanding Video Game Music, 169.
Gates, “Enjoying the Grind”; Summers, Understanding Video Game Music, 58–60.
Phillips, Composer’s Guide to Game Music, 103.
Van Elferen, “Analysing Game Musical Immersion.”
Medina-Gray, “Modular Structure and Function,” 9.
Grasso, “Music in the Time of Video Games,” 98.
Grasso, “Music in the Time of Video Games,” 101 and 104.