In recent years, music analysts have grappled with the sonic strategies from popular expressions that evade traditional notation. Their approaches often rely on harmonic spectrographs or various textual tools to decode the creative mechanics of these art forms. But for many practices with innate musicality—such as spoken-word poetry—these common techniques make limited explanatory headway. This article proposes an alternate path to fill the gaps: Adopt an analytic perspective, grounded in phenomenology, that listens for the musical subject’s negotiation of embodiment through their calculated treatment of timbre in the voice. Here, the analyst traces their perception of the subject’s bodily resonance through diagrams called timbral maps. And through these maps, two key concepts are discovered that structure the creator’s interior logic: timbral surfaces and timbral moments. Surfaces and moments are built into recognizable patterns, which in turn disclose the methods of these artists as lucid on their own terms. This “surface-moment” model is prototyped using a recorded performance of “This Clouded Heart” by the grunge-era Seattle poet and performance artist Steven Jesse Bernstein. The model reveals several stylistic tactics honed by Bernstein through his play with resonant shifts, but more significantly, argues for recasting timbre in analytic contexts: first, as a sustained and winding musical dimension, able to unfurl like other large-scale organizing principles; and second, as a heuristic capable of engaging listeners in an empathetic web between themselves and the subject through the mimetic connection of their bodies.

“We want music!”

- Bruce Pavitt, concert heckler and founder of Sub Pop Records, on August 9, 1987

“This is music, asshole!”1

- Steven Jesse Bernstein, performer, in response


When I first hear the recorded performance of the poem “This Clouded Heart” by its author, Steven Jesse Bernstein, I experience an initial repulsion. I tense during his sputtered delivery of the text: “all the walls are covered with pictures, already of her and me and you, fucking each other, all day, every day, with the cars and the vans going in and out, the cops taking away the night for something it did, shatterlines across the moon where it used to shine, heaven up in jail, God splintered by bars, drinking out of the toilet in San Diego.”2 But as the track continues, I slowly uncoil; I feel less guarded, less profaned. Through my gradual immersion in the grain of his voice, my own sacrosanct self-regard—so unlike this vulgar carnival barker, I insist—begins to soften and crumble. Because I realize that I know people who live this way. In my own darker moments, I recognize the spiral that tempts me to think this way. That desolate landscape is not foreign to me: the paranoia of entrapment, of a person chained to an inexorable cycle of self-destruction and loathing; yet all while admitting as much and adopting a mirthful mask—a Nietzschean affect—as the only possible reprieve. Here is a chasm of the human condition to whose edge many have stepped, but few experience (or narrate) the plunge like Bernstein. The sheer abrasion of Bernstein’s voice causes his storytelling to lodge in my gut. And then his characters become familiar to me—animated by the poet’s coarse inflections, they are the personas of desperation that have entered and exited my life across many stages. Yet I also cannot deny their presence in my own subjectivity, now buried (or perhaps only dormant for a time).

When I confront Bernstein’s musicality, two conflicting reactions take hold: visceral discomfort, from his crass sensibilities and quavering tone; and aesthetic awe, from his cunning turn of phrase and compelling sonic coherence. But while oppositional at face value, they in fact work in tandem. Their forceful blending, which joins his tone colors and frictions with the existential anxieties and emotional rancor that rivet his language, provokes me to experience his voice as a bodily conduit. Indeed, Bernstein reveals these colors and frictions, i.e., timbre, as his currency of choice, not only for his own artistic negotiation but also for the listener’s encounter with his performative self through the body. So I will analyze “This Clouded Heart” with the lens of timbre to find the musical techniques that imbue Bernstein’s poetic storytelling with such power. And to follow Bernstein’s treatment of timbre on its own terms, this project adopts a phenomenology of embodied sensation in the mode of philosophers like Maurice Merleau-Ponty and music theorists like Judith Lochhead.3 Doing so enables a new analytic model that drives this venture: freely drawn maps of timbral change in the body that borrow Thomas Clifton’s spatial theory of relief felt across a perceived musical surface.4 I harness aspects of Clifton’s work, and link them to insights from S. Alexander Reed, to explain how timbre can be experienced as a large-scale organizing principle that unfolds concurrently in two layered modes of listening: the “surface” and the “moment.”5 These individual concepts, as well as the larger analytic model and its maps, will be later explicated in depth.6 But when deployed, this approach discloses the following stylistic tactics by Bernstein:

  1. Construct a claustrophobic auditory space to activate parallel tensions in the listener.

  2. Play with shifts of resonance control to modulate the bodily profile felt by the listener.

  3. Subvert the timbral surface set by the first two tactics to catalyze a heuristic through timbre.

These three tactics provide a grammar for understanding Bernstein’s deviant musicality, and they will be teased out during the exploration of maps for “This Clouded Heart.” But there are significant upshots for my approach beyond structural edifice. First, this project pushes against dominant trends in scholarship on timbre: namely, a tendency to favor snapshots of timbre in place of sustained, temporal streams; and a tendency for the gaze of timbral analysis to focus only on the isolated performer rather than see timbre as a sonorous bridge that links the agency of performer and listener.7 Second, this project reconciles two camps in scholarship on timbre: namely, those who prioritize its structure and function through mechanistic readings, and those who prioritize its role in musical embodiment and intersubjective empathy.8 Mending this rift, and productively joining their respective insights, becomes necessary to reach my leading contention: that timbre is a heuristic device that explicates the empathy of embodiment in musical contexts and renders it meaningful; not domineering or holding sway, but empowering a litany of paths for listeners to trial-and-error new subjective and artistic relations.9 Furthermore, timbre’s heuristic capacity is nonnarrative—no one creative interpretation or story arc is privileged over another. Our embodied experience of timbre generates an imaginative experience, governed by the listener’s own self-determination.10 However, before doing the map analysis or arguing the model’s significance for timbre and music scholarship, this project must first situate the significance of Bernstein qua artist.

Grunge Archetype

Steven Jesse Bernstein (1950–91) operated at the intersection of poetry, music, performance art, and theater and grew into a celebrated figure across the storied counterculture landscape of 1970s and ’80s Seattle. Sub Pop Records founder and grunge impresario Bruce Pavitt reflects on how “Jesse was able to move amongst different disciplines with great ease. He was a bit of a chameleon, although he never really changed his colors. So many different groups of people would sort of adopt him and claim him to be their own.…He always had several things going on, writing novels, writing short pieces, he was a playwright, an actor, a director, it was just astonishing the amount of work he created.”11 For instance, in music, Bernstein was crowned the “godfather of grunge” by the UK magazine The Independent, and his spoken-word readings opened concerts for groups that would soon change the landscape of rock ’n’ roll, including Nirvana, Mudhoney, Big Black, and Soundgarden.12 As a “street poet” and performance artist, Bernstein is often lauded in the same breath as convict-turned-writer William Wantling, as well as another “laureate of American lowlife,” Charles Bukowski.13 And designers, painters, and photographers throughout the burgeoning art circuit of the Pacific Northwest constantly sought him out as a provocative subject.14 In many respects, Bernstein became the living icon of a far more expansive and dynamic grunge ethos than is popularly known, reaching beyond the small cadre of bands most associated with this American zeitgeist. Indeed, his ability to bring together disparate corners of counterculture resembles another subversive guru: William S. Burroughs. And perhaps unsurprisingly, Bernstein was himself a Burroughs confidant and collaborator: They gave several poetry readings together; the Burroughsian “cut-up” composition method appears on many of Bernstein’s works; and Burroughs first suggested that Bernstein set his poetic texts to music, as heard on “This Clouded Heart” and other tracks.15 Promoter Larry Reid argues that “[Steven Jesse Bernstein] was the orator of grunge, the erudite voice of this counterculture movement.”16

In the late 1970s, when Bernstein started as a vanguard poet doing raucous club performances, Greater Seattle’s counterculture was just another backwater outpost in just another second-tier American city. But by 1991, the year of his tragic suicide, the scene had metamorphosed into a juggernaut so distinctive in aesthetic, affect, and values that this new grunge movement became the standard-bearer of American rock writ large for the next five years. Asserting that Bernstein played a catalytic role is no hyperbole. In fact, Reid remarks how “when some of these grunge bands first started to tour Europe, I would get reports back that audiences in Germany or Belgium were shouting lyrics from [Bernstein’s poem on the 1988 Sub Pop 200 compilation] ‘Come Out Tonight.’ It had a real influence on the punk milieu.”17 Bernstein’s magnetism extended to the pivotal artists as well. Reid tells how, during the same August 9, 1987, concert that introduces this article, “right in the front is a very young, well-scrubbed Kurt Cobain. During Jesse’s set he’s right up in the front row. He’s leaning on the stage, completely eating up Bernstein.”18 While Bernstein reads from his novella Personal Effects to several hundred fans, jammed together in this sticky, humid warehouse, the jeers dwindle and a pall overtakes the space, punctuated by the occasional nervous laughter.19 His voice is mesmerizing, and I hear how each intonation is bent and stretched to prematurely disarm the crowd. The winding and macabre imagery encircles the crowd, and I watch in real time how the same features that I hear on “This Clouded Heart” (the strained inflections, piercing sardonic tone, and flush agitation of his body) produce a trancelike state over this unlikeliest of poetry audiences. That the same aural stratagems captivate Cobain in 1987, and me in 2021, no doubt explains why the works of Bernstein have enjoyed a periodic resurgence in the popular consciousness across literary, music, and film circles.20 To contextualize this enduring appeal, the next step is to investigate just how “This Clouded Heart” and its album, Prison, came about.

Creative Process in Prison

Prison is a bizarre, disjointed creature compared to conventional albums; nonetheless, it represents the definitive portrait of Bernstein’s recorded work. To corral his spoken-word poetry into a commercially viable package, the label Sub Pop hired local producer Steven Fisk to craft arrangements for their backing. The original concept paid homage to At Folsom Prison, Johnny Cash's much-heralded 1968 album.21 Evidently, Sub Pop believed that by recording in a similar fashion—in front of a throng of inmates from the Special Offenders’ Unit at the State Reformatory in Monroe, Washington—a similarly rollicking feel would result. This effort ultimately lapsed, since these sessions simply lacked the spontaneity or call-and-response atmosphere that lent so much vitality to the Cash recordings. According to Laura Cassidy, staff writer and columnist at the Seattle Weekly: “It was recorded in the morning, [and] the prisoners were into it, but they weren’t whooping and hollering. I bet it was a little too quiet and dull, so why release a record with awkward room noise when you can either record Bernstein in a better setting or set a bunch of music to it?”22

After this early stumbling block, Fisk was bestowed a wide creative berth by both Sub Pop and Bernstein. As a producer and engineer, Fisk had already amassed numerous credits with early Seattle grunge groups, yet Bernstein's carte blanche to his collaborator was rare.23 The sessions were structured whereby Bernstein first recorded the unaccompanied readings, and he then delivered these takes to Fisk for his studio wizardry to commence. The result was, according to critic Joseph Larkin, “a challenging tour de force made by two men who really didn’t have much to do with each other.”24 What may at face value have seemed like a strange potpourri—spoken-word poetry over a veneer of “bad metal or cheesy synth-jazz…[with] a touch of ambient noise gurgling”—coalesced into a serendipitous and remarkably unified musical statement.25 Yet when Bernstein took his own life on October 22, 1991, only two tracks were fully assembled, albeit in crude form at Fisk's home studio. “No No Man (Part One)” and “More Noise, Please!” were the only works from Prison that Bernstein heard before his passing.26

Thus, each track on Prison contains two musical entities: Bernstein's evocative reading and Fisk's calculated accompaniment. Together, they congeal, split, rush, or recede during episodes that recast the atmospherics of the recording. Fisk deliberately inserts flourishes (or dead time) in tandem with Bernstein’s pivots to push the work in new directions. Fisk not only imbues these tape loops with his empathetic read of Bernstein as artist but sculpts the ambient surrounding to support the heuristic paths of other listeners as well.27

The Sound World of “This Clouded Heart”

Fisk’s background for “This Clouded Heart” is a sequenced collection of samples that are layered in formal hierarchies. These sample sections, characterized by Fisk as “clusters of collage material,” are played against a cassette recording of Bernstein's voice.28 The combined recording is then transferred to a 24-track tape, and the final mastering is performed in a “proper [analog] studio” with minor editing interventions.29 The final result incorporates a wide orchestration: syncopated hissing tones, assorted hand drums, a kick drum, an upright bass, an electric guitar, and a muted trumpet, each selectively entering and exiting to sculpt specific ensemble idioms that continuously cycle.

The arrangement’s most striking feature is a phasing comb filter, or flanger, that assimilates and colors the entirety of the collage materials.30 Listeners first encounter its prominence in the trumpet motives that open Fisk’s production, imparting a sickly, swirling patina to the warble of brass.31 This flanger cycles up and down the harmonic spectrum (commencing at timestamp 0:06 on the Prison recording), and formerly dynamic sound profiles are stripped away to only those frequency bands allowed by the filter. Both the hissing tones and the instrumental samples are organized rhythmically to feature certain slices of the flanger as it glides up and down in a sine-wave pattern. This enables the flanger to obscure its total shape, moving in the backdrop and never displaying its serpentine whole. A concurrent sonic element is the grittiness of the samples chosen by Fisk, as if a deliberate grime were caked onto the instruments. Each sample suffers from a malignant low fidelity, darkening the production to loom over the unsuspecting poet. A prime example is audible at timestamp 1:03, as Bernstein delivers the lines “…this is a neighborhood of padded mud, wheels gone all the way, kisses like the electric wires inside eels, nervous knives, pretty pistols, mothers, Gods, fathers, cops, leaning with shame.” Here, the arrangement subtly thickens before it is punctured on the upswing of a flanger cycle by a shrill trumpet flourish up and down its register.
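For readers less familiar with the effect, the general principle of a flanger can be sketched in a few lines of Python: a signal is mixed with a copy of itself, and the delay of that copy is swept up and down by a slow sine wave, so that the notches of the resulting comb filter glide across the spectrum. This is only an illustrative digital approximation. The function name and every parameter value below (sweep rate, delay range, mix level) are assumptions chosen for demonstration, and nothing here reconstructs Fisk’s actual equipment or settings.

```python
import math


def flanger(samples, sample_rate, rate_hz=0.25,
            min_delay_ms=1.0, max_delay_ms=5.0, mix=0.7):
    """Mix a signal with a delayed copy whose delay follows a sine-wave sweep.

    All default values are hypothetical, picked only to illustrate the effect.
    """
    center_ms = (max_delay_ms + min_delay_ms) / 2.0
    half_range_ms = (max_delay_ms - min_delay_ms) / 2.0
    out = []
    for n, dry in enumerate(samples):
        # Slow sine wave (the sweep) that moves the delay up and down over time.
        sweep = math.sin(2.0 * math.pi * rate_hz * n / sample_rate)
        delay_in_samples = (center_ms + half_range_ms * sweep) * sample_rate / 1000.0
        position = n - delay_in_samples
        if position < 0:
            delayed = 0.0  # nothing has been played yet at this point in the past
        else:
            # Linear interpolation between neighboring samples (fractional delay).
            i = int(position)
            frac = position - i
            following = samples[i + 1] if i + 1 < len(samples) else samples[i]
            delayed = (1.0 - frac) * samples[i] + frac * following
        # Summing dry and delayed copies produces the comb-filter notches.
        out.append(dry + mix * delayed)
    return out
```

Feeding the function a single click (one nonzero sample followed by silence) returns the click plus a quieter copy a few milliseconds later; on sustained material such as a trumpet tone, the moving delay is what produces the swirling coloration described above.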

Fisk's arrangement copies the common song form Verse, Chorus, Verse, Chorus, Bridge, Verse, Chorus—notorious in part due to its sheer ubiquity across Seattle grunge.32 But looking closer, “This Clouded Heart” deviates slightly from the AABA template and may be represented as follows: Verse, Chorus, Verse’, Chorus’, Bridge, (half) Chorus, Verse”. Bernstein did not organize “This Clouded Heart”—the written poem or the recorded vocal track—with any sort of recognizable refrain, aside from the reprise of content and character allusions. So Fisk’s introduction of a hierarchical substructure behind Bernstein's irregular flow required musical tact and keen attention to his subject's guttural turns.33 Some production shifts are dictated by arbitrary time considerations, discovered through the equal timestamp ratios of different sections as they enter. In others, like the abnormally long bridge that occurs near the piece's three-quarter mark, the production is intimately linked with Bernstein's vocal agitations. Fisk clearly timed certain rhythmic and motivic ideas to suit the signification of the voice; he remarked to me how “Jesse's rhythm is the thing that drove all of that stuff. You know, there was a cadence and musicality to what he did…in his reading that didn’t really need an awful lot of editing or anything like that…I was really trying to pay attention to the dynamic of what he was doing.”34 The result is an off-kilter duet; instrumental fragments dance in the breathless seconds before Bernstein rapidly inhales and commences another poetic torrent. Examples of this syncopated interplay occur prominently at timestamps 2:11 (“look up there with the visionary stethoscope”) and 5:50 (“oh mama, get me a plane ticket out of here, oh mama, put me on a bus”).

Fisk included careful analog alterations to Bernstein's voice as well. Listeners hear reverberation trails extending from not only the decay of Bernstein's voice (as is typical) but also before its initial attack. Here, the reverberation algorithm extracts the first three or four syllables of a spoken segment and places them like grace notes just prior to the original delivery. Listeners hear the spectral whispers of a disembodied statement infiltrating the attack profile of the primary voice, and are briefly disoriented by this doubled portrait of Bernstein.35 Collectively, Fisk’s clever production choices and his sample tapestry on “This Clouded Heart” strengthen the tug of the poet’s timbral maneuvers. But how exactly does timbre access the empathy of the human voice, and by what means might an analyst describe it?

Timbral Empathy and Heuristics

Timbral empathy starts with the simple assertion—in the spirit of Barthes—that how one vocalizes is more important in its signifying force than what is being articulated through linguistic meaning. Notably, Barthes references the “erotic” of the voice's grain: “I am determined to listen to my relation with the body of the man or woman singing or playing, and that relation is erotic but in no way ‘subjective’.”36 This is a powerful claim: Barthes contends that, through this erotic relation, listeners can hear past the markers of identity that they have been socialized to code with certain sonic characteristics.37 The erotic relation of timbre is instead felt in the gut, grinding and incessant. Barthes implies that timbre can bypass the ego’s defenses and dismantle a static narrative between a performer and listener before the infiltration is even realized. And the musical ramifications for speech performance are significant. During Bernstein’s readings, his timbral elements—e.g., vocal tone, striations, amplitude, and attack profile—are modulated by the muscular quivering of his lungs, throat, mouth, and lips on a continuum of expressivity.38 For all of his complex poetic devices written on the page, they ring hollow if not properly filtered through the body’s musical instincts. So timbre is, on a functional level, the refractive prism between poetic meaning and this primal, sub-aware embodiment described by Barthes as erotic.

A corollary insight is derived from this dynamic: that listeners access and interrogate their own embodiment as they experience others doing the same, particularly when charged with artistic intent. This is not a new claim. Merleau-Ponty describes the phenomenology of embodiment as “anchorage,” or the grounding of our being in the midst of the sensuous world.39 Vocalization triggers anchorage by providing an act of will that controls the body and situates the speaker. And this assertion of embodiment then empowers listeners to explore similar reflexes of sonic self-determination. Or, as remarked by theorist Arnie Cox, “there is little or no musical imagery that does not involve motor imagery—in other words, thinking about music involves imagining doing (making) music.”40 So embodiment appears to reside in our flesh, but also leaps beyond to influence whole networks of other persons, signifiers, and creative practices—a concept titled “intercorporeity” by Merleau-Ponty. Lochhead adapts its implications to music and argues that “intercorporeity grounds an understanding of musical meaning that resides not simply in the constituting agency of the composer but in the shared and bodily-based activities of…performer and listener.”41

Indeed, when listeners perceive another’s voice, they generate an impression of the transmitting source—not only as a body that sounds like their own, but as one that tenses and oscillates its performative flesh in the same way they might. Reed calls this “aural voyeurism,” to plumb our own depths through the echoes of another.42 But I argue that, in practice, such voyeurism reaches to a wider horizon: Beyond flesh, the listener is seduced to speculate about the many contingencies of the transmitting body’s cultural and psychological life as well. Cox states: “When a[n emotive] congruency is found across domains, the result is a metaphoric conceptualization of embodied musical experience.”43 Or, in plainer terms from philosopher Don Ihde: “[the dramatic] voice does not display the difference between appearance and reality so much as it does the multiple possibilities of every voice transformed from ordinary to extraordinary. The ‘others’ who appear are the human possibilities which are also ‘my’ possibilities, and the drama is a ‘universal’ play of the existential possibilities of humankind.”44 So I call this wider horizon timbral empathy, which beckons listeners to step into the speaker’s shoes and commence an indeterminate journey that tries to anticipate the motivations and expectations of this distant body.

However, Bernstein’s point is not to induce the same (or similar) emotional world among his listeners, but rather to make some path available for empathy to unfold on its own terms. This evasion of any one-to-one correspondence or dominant narrative, and reliance instead on an open-ended framework for learning, indicates that Bernstein is pursuing a heuristic through the tactics of his voice. A heuristic enables a person to discover or practice something for themselves through trial and error and embraces all the imperfections and pitfalls of the path toward understanding. And so Bernstein’s heuristic toward timbral empathy is an evocative display of music’s socializing power, which, per Cox, “provides a medium whereby we can enact participation and community—literally, a state of sharing, where the thing shared is a state of being, by way of a shared state of doing.”45

The upshot of this heightened understanding between performer and listener is a weakening of the normal barriers of alienation between agents coursing through social worlds. These barriers dissolve when listeners see themselves in the stories being told throughout “This Clouded Heart”—a process that starts by identifying with the conditions of embodiment that render them audible. Only then do listeners pierce through Bernstein’s caustic mannerisms and begin to fathom the characters and circumstances that comprise his poetic yarns. The next step is to describe an analytic framework capable of translating how Bernstein implements this heuristic strategy and grapples with his expressive goals. This requires a phenomenological bent: to differentiate the precise ways that timbral fluctuation steers the listener’s perception of embodiment in creative episodes.

The “Surface-Moment” Model

To interpret the construction of a heuristic for listener empathy, this project proposes a new analytic framework that integrates structural and embodied approaches to timbre. This is done by retooling a pivotal idea from the phenomenology of music: theorist Thomas Clifton’s concept of perceived musical surfaces in auditory space.46 For Clifton, space is the undifferentiated “field of action” or backdrop our mind inhabits during listening episodes, and surfaces are the psychoacoustic baselines projected onto these imagined spaces when musical motion establishes some constant function that persists for a time (before evolving in new directions).47 So I suggest two major changes to plug in the prior theoretical discussion and engage the uncommon musicality of vocalizing artists like Steven Jesse Bernstein. First, adapt musical space from the abstract or etheric to what listeners are captivated by; namely, the resonant space(s) in the human body that sculpt timbre. And second, reclaim surface from purely ensemble or contrapuntal logics to trace timbre’s production of its own baseline and drift through perceived variation of contour, width, or distance across bodily terrains. Taken together, this analytic approach then looks for timbral movement in the performer’s resonating body that acquires its own textural dimension, which listeners experience in turn as a heuristic evocation.

Clifton describes the fundamental component of musical space as the line. Lines are sustained musical gestures with direction that can be perceived as thick or thin; forward or receding; and with retention or protention in the possible field of action.48 And when musical lines reach a threshold of consistency or stability, they cohere into surfaces. But this is not the end point of development in musical space. Surfaces then give rise to new lines, and their projection away from the fixed surface generates a feeling of distance or relief for listeners.49 So in Clifton’s theory, there is a cycling pattern that shapes texture through an extended topographical metaphor: Lines explore new territory in musical space before their transformation into surfaces, which then congeal and soon launch lines of relief anew.50 A given surface is almost immediately subjected to relational impingements from the relief lines it spawns. This project’s analytic framework is based on the same dynamic of surfaces and relief lines in co-creative tension with one another. But here, auditory space is now the tableau of the body; the timbral surface is a bodily profile of resonating sites, experienced in textural equilibrium; and the line’s perceptual relief is felt when the performer inserts muscular and articulatory deviations that activate new or reaccentuated areas of flesh for vocal production. In effect, the textural sensation of “pull” away from the surface is staged by the artist gradually carving out (or revealing) different bodily locations for timbral vibration.51 And the listener negotiates this as the tug between memory of the body’s past profile that the performer just departed (surface) and the edge of its present in the midst of arrival (line).

However, this interplay of referent versus relief—now graphed onto the performer’s anatomy—requires an additional modality to capture the listener’s experience. This is called a timbral moment, borrowed from the separate scholarship of S. Alexander Reed.52 Reed’s original definition is circumscribed and technical: They are sonic moments “whose component frequencies’ periodicities are unchanging, particularly with relation to one another…yet when the piece is taken as a whole…this relation of minimal units to one another can be viewed as an internal set.”53 This mechanistic framing—like pitch classes and integer functions, but for timbre—shortchanges the usefulness of the basic idea. So I repurpose and incorporate timbral moments to describe a missing third state between the competing forces of surface and line. Timbral moments occur when the pull of the relief line briefly plateaus but does not produce any sense of a “new norm” to challenge the current timbral surface. Listeners may perceive them as small rungs or steps on the growth of the relief line, often with a short-lived fixedness of tone color or intensity. They are waypoints that orient the felt expansion or contraction of relief against referent and rein in the untethered play of the line. But they provide definite utility for performers as well. Timbral moments enable artists like Bernstein to regulate the growth of relief over surface, in effect “domesticating” the sonorous spaces of their body for a few seconds before stumbling toward new terrain. By steadying vocal deviation and providing quick respite, a timbral moment then offers a jumping-off point for pursuing the next set of improvisatory inflections in the body.

To characterize the different textural scenarios that are revealed through phenomenological analysis of auditory space, Clifton posits four classifications of surface according to their scale of relief: undifferentiated, low, middle, and high.54 These four types retain their utility when adapted to musical timbre in the body, as shown later in this project. On “This Clouded Heart,” Bernstein modulates between two of Clifton’s surface types, low and middle. Clifton specifies a low relief surface as a texture wherein “the line will be perceived as adhering to the surface rather than detaching itself and going its own way. Competition between the line and the surface is held to a minimum.”55 Here, the line only barely ruptures the homogeneity of its surface and is reluctant to seek independence elsewhere in a musical space. By contrast, Clifton spells out middle relief as “the crowded, opaque surface of the music [giving] way to…noticeable projections.” For these textures, “[the relief line] still appear[s] to adhere to the underlying surface, although the strained quality with which…[it is] sung tends to make the projection quite audible.”56 As Bernstein’s performance is explicated, these two surface types will be shown to inform the range of allowable movement for a given auditory scene.

In total, I call this new framework the “surface-moment” model. There are several key turns here: for instance, the transference of Clifton and Reed’s devices from the proverbial “mind’s eye” in auditory space to the folds and cavities of the human body; the conversion of their theories for timbral topography from vertical splay in segment to horizontal spread over time; and most importantly, the linkage of this prototype model to musical concerns beyond the gamesmanship of analysis for its own sake. Indeed, I argue that the surface-moment model delivers rapprochement for timbre’s warring factions by showing the point of handoff: What starts in the structural domain by deciphering creative technique—such as Bernstein’s three stylistic tactics—can then be assimilated into the larger meditations on empathetic embodiment across the performer and listener divide. I endeavor to do precisely this; and specifically, to frame the heuristic invitation that starts in Bernstein’s bodily choices and discloses itself through patient development of the spaces, surfaces, lines, and moments that diagram his delivery. My project’s analysis of “This Clouded Heart” using the surface-moment model is shown through a visual schema that I dub a timbral map. Next, I will explicate the design and interpretation of these maps and prime the reader for following—across three timbral maps of the piece—how the surface-moment model tracks Bernstein’s conjuring of musical meaning through the empathy of embodied relations.

Timbral Maps

Three timbral maps have been produced for this project. Example 1 stretches from timestamp 00:20 to 1:20 on the “This Clouded Heart” recording from Prison; example 2 from 2:50 to 3:50; and example 3 from 5:00 to 6:00. Each is an aural scene that shows the operation of the surface-moment model, contributes to the development of the performer’s heuristic outreach through timbre, and/or exemplifies the three stylistic tactics unique to Bernstein’s musicality. As prototypes of the surface-moment model, these maps represent just one design interpretation of the model’s dynamic relationships and reflect the tools and expertise ready-to-hand for me alone. I see them as open-source, and anything but authoritative; I encourage readers who are moved by the larger framework to adopt their own mapping practice and innovate on my initial attempt in any number of ways. These images were crafted in GIMP (GNU Image Manipulation Program), which is a free, cross-platform alternative to Adobe Photoshop or other proprietary software. I decided to make a common template to standardize aspects of each example that are plausibly “objective”: for instance, the timestamp counter with poetic text callouts on the upper x-axis; flags for Fisk’s arrangement changes; the full text at the bottom; the letter annotations; and the faded spectrograph layer in the background. Note that the spectrograph serves as another temporal aid, but its formants play no role in the drawing of key components (surface, line, moment, etc.), nor do they show correspondence with the dramatic shifts of these materials; the spectrograph is functionally inert in the model.

More significant, however, are the representations of surface, line, and moment on the maps. These are freely but meticulously drawn—hand to mouse to cursor—over my repeated listening to “This Clouded Heart,” with a sustained focus on the one-minute segments chosen as examples. The timbral surface is the flowing, wavelike presence in light shading, traveling left to right in time; the relief line is the more jagged, serpentine, and segmented entity in dark shading, whose sharp edges face the viewer perspective; and timbral moments are shown with simple parentheses on the relief line. Surfaces and lines are drawn using the basic Paint function in GIMP, and each is assigned its own “brush stroke”—selected by me from the software’s preset palette—to elicit my artistic impression of their respective behaviors and functions. The ripples and strands seen in both brush strokes are precoded in GIMP, and I believe they help improve the graphic separation and dimensionality of each component for the viewer. But the smudges and fades are carefully added to both the surface and line brush strokes by me, to render my sensation of beginning or end with certain fragments, the temporary obscuring of particular segments, and/or an embodied feeling of directional tug that I perceive as teased or left unresolved by the performer’s body.

The y-axis of these maps deserves special attention. This vertical plane is the field of action that I perceive as the available sounding space of the performer’s body. Movement in this field tracks the listener’s gestalt impression of the performer’s body opening and announcing itself through newly animate timbral sites, versus closing or restricting these same areas of production to “show” less of themselves in sound. Accordingly, the y-axis is labeled “body profile” and displays an undifferentiated continuum of “conceal” to “reveal.” Lower regions of the space correspond to more cloistered or guarded listener impressions, suggesting a narrow timbral bandwidth and the performer’s tighter control of their bodily profile being shared. And as progressively higher regions of the space are explored, the listener feels the incremental disclosure of the performer’s bodily capacities through their activation of more complex and multifaceted timbral sites working together. So the timbral surface of each map is the listener’s held memory of an expressive plateau on their subject’s body, whereas the line is the drift of those felt resonant sites toward more articulate or restrained modes (depending on the line’s direction).57 The perceptual distance between these bodily states is the relief or pull sensed by the listener, which results in the changing topography of the drawn map over time. In this way, the embodied relations between listener and performer acquire their own textural dimension (contour, thickness, layer, width, etc.) through the musical voice—an insight made tangible by the surface-moment model and its maps.

There is also value in clarifying what these timbral maps are not. First: Just as Clifton’s original theory of auditory space resisted any reduction to standard musical categories, so too do these maps.58 Certainly pitch, harmonics, amplitude, rhythmic punctuation, dynamic fluctuation, prosodic shifts, perceived brightness (and more) are all heard in a performer’s voice. Thus, the temptation to analyze timbre along its constitutive elements, rather than as an elusive whole, is understandable. But with embodiment in play, reduction cuts both ways. Consider pitch: Andrew Mead describes how “our most immediate experience of pitch comes from our voice, and pitch control derives from muscle contraction and relaxation. The shorter the vocal cords, the higher the pitch—reproducing the same physical sensation of muscular contraction experienced when lifting our arms, objects, or ourselves.”59 Indeed, attempts to circumnavigate the complexity of timbre tend to land us back in the body, where “the grain” is grasped a priori to its component parts. Second: Some may be frustrated by the visualization of ambiguous concepts like surface and relief in musical memory, or timbral articulation in the body, when reliable tools for the monitoring of sound like waveforms and spectrographs are close by.60 But for all the particulate data and important perspectives revealed by precision instruments like these, they disguise their own multitude of musical senses and impressions—targeted by this project—that are every bit as real to our listening as frequency bands, yet pass through their diagrams like a sieve. In their own way, they are just as “slippery” as the phenomenological or intersubjective approaches. And third: The y-axis of the maps—while attuned to bodily resonance—does not chart shifts across specific organs or issues, as if laid out on a grid from top to bottom. There is no correspondence of the vertical plane to different hemispheres of the body, like from diaphragm or lung, to throat or middle tessitura, to nose or sinuses. Imagining a single nexus of sound to be tracked on its path to various sites of musculature is wishful thinking. Rather, any stretch of the relief line in vertical space represents various and distributed clusters of resonance in our physiology. And each of these arrays, when activated, is felt by me as pulsing and quivering on a spectrum: from tightly bound or homogenized bandwidths in the lower echelons of the map to flush or frenetic spans in higher echelons that animate and intensify more of the surrounding tissues. So the y-axis is conceptualized as a function of bodily discipline in timbre creation, journeying on a fluid continuum from concealment to revealment as sensed by the listener.

These examples show that the compelling analysis and musical insights gained through this project are simply incommensurable with quantitative measure(s). There is no objective unit or scale that an embodied sensation of timbral referent and relief syncs to—if for no other reason than because my ownership of my body, and my phenomenological scan, have no universal rigor that I can intellectualize for others. One might then argue that this model and its maps lack scholarly reproducibility. But I counter: The point is not to correctly repeat my findings when you listen and diagram in turn. Rather, the upshot is your own attempted navigation of the heuristic that Bernstein and kindred performers extend through timbral play in the body; by design, other analysts will obtain personal results. The model and its maps are merely formal means for revealing the point of entry to “ends” that prove mysterious for objective analysis: narrowing the empathetic gap between performer and listener. And this is the promised rapprochement between the structural and intersubjective approaches.61 For every analysis that utilizes this approach, the diagrams may look different. But the creative organizing principles and musical awareness made lucid to the individual are anything but idiosyncratic—unlike the many bespoke methods for quantitative timbral analysis that have proliferated. Here, each person follows guideposts on a path of their own choosing, validated by their own embodied relations with the performer, to arrive at an unspecified destination alongside their subject.62

Next, I will discuss the specific maps shown as examples 1, 2, and 3; highlight the development of Bernstein’s three stylistic tactics within these scenes; and describe the analytic throughlines that link them, as exemplars of both the surface-moment model and Bernstein’s deviant musicality.

Example 1 (00:20–01:20)

The annotations (A–E) of the first timbral map demonstrate the organizing principles and basic conditions of the surface-moment model. But they also set a norm for Bernstein’s handling of timbral embodiment, allowing later deviations of body (and model) to feel meaningful for listeners. Annotation A (“windows”) introduces the timbral surface that begins “This Clouded Heart.” This first surface is the baseline of articulatory origin from which relief lines will build and later seize control of new expressive territories. I experience his initial voice as studiously monotone; flat yet rigid; almost detached from its own flesh, like a mannequin controlled from afar. Accordingly, this surface is fixed on the lowest possible echelon of the auditory space I imagine, and any expeditions away from this dampened territory are felt by me against this original profile of his body. Annotation B (“two wagons”) starts Bernstein’s cautious play with the low relief space immediately above this original surface, as he slowly coaxes subtle shadings into his timbral profile before an abrupt halt and return to the taut and muted baseline. This induces the initial feel, however faint, of pull or distance for the listener. Other audible qualities of low relief, beyond impressions of the body, include dim amplitude wavers on implied pitches and flickers of consonant coloration, especially while other parameters hold firm. These minor shifts create a “patina” effect of swirling shades in a narrow space; some emerge into the fore while others recede. Annotation C (“padded”) denotes the first significant timbral moment with a parenthesis—lasting approximately five seconds before its quasi-pitched quality becomes stale—and I then experience Bernstein start to slowly unclench his upper chest and throat musculature as he pivots into new space.

Figure 1.

Timbral map.

After this line crests and bends back, a less flustered timbral moment is felt at Annotation D (“cruel dust”) and persists for roughly three to four seconds. Here, listeners notice a striking pattern of timbral moments being constructed in an ascending, steplike fashion. In other words, each plateau of timbral consistency encouraged by Bernstein is a rung on a ladder toward wider resonant territories. And these, in turn, disclose more information to listeners about the poet’s body and his own understanding of its musical capabilities. I sense the urgency of Bernstein’s embodiment upon grasping the stepwise tendency of his timbral moments. After a palpable line bulge that climaxes at Annotation E (“shake out”)—dramatic, considering the constricted tissues that began the piece—listeners are shown the third timbral moment of this surface. But my orientation to Bernstein’s original silhouette on the surface is weakened because the relief line’s drift from its plane of genesis starts to feel untethered. Thus, Annotation E foreshadows Bernstein’s inception of a new timbral surface that will supplant the old and reorient my memory of his body in example 2. Through Bernstein's subtle loosening and growth of the location(s) activated for sound, I understand that he now occupies a different embodied space with wider possibilities. So this map demonstrates two key concepts: first, the performer’s construction of a low relief texture, as adapted from Clifton into the surface-moment model; and second, Bernstein’s stylistic tactic of fomenting a claustrophobic auditory space, whose constricted profile and rising pressures are imitated in the listener’s body as well.

Example 2 (02:50–03:50)

The annotations (F–I) of the second timbral map demonstrate the flexibility and expanded circumstances of the surface-moment model. Chief among these is exploration of both positive and negative relief encircling the timbral surface; a bifocal extension of the “normal” style of articulatory growth my body was conditioned to follow in example 1. In this case, my baseline impression of the poet’s anatomy occupies the felt median—rather than the minimum, per example 1—of the total auditory space Bernstein’s vocal artistry can divulge. So now relief is shown protruding beneath the surface, as the poet clamps down on his resonant width and constrains his body past my fixed impression. But because of these fluxing lines, both above and below the surface, I experience this referent as less robust than example 1; hence, an instance of middle relief texture as described by Clifton. I must triangulate the surface as an imagined pivot between the topography of positive and negative relief to maintain my focus on Bernstein’s bodily states—particularly since I feel dragged along with each abrupt leap in the poet’s flesh, from tight pursed lips to sharp snarls and back again. Annotation F (“too much talk here”) highlights the stubborn pursuit of negative relief through repeated retreats into the tensed, monotone territories of example 1. Solid arrows trace this trajectory, with a snap back near the baseline before pushing even farther down.

Figure 2.

Timbral map.

After the bridge starts in Fisk's arrangement, Annotation G (“You fucking phoney [sic] genie”) showcases Bernstein's rededication to positive relief above the surface through his dilation of flesh for the listener. The poet erects two footholds through recitative-style timbral moments that link together as if in call-and-response. But these moments feel different to me than example 1. Whereas the “steps” of the first map felt dry and sardonic (perhaps evoking the text’s absurdist themes), these throb with a groundswell of energy, like rusty pipes quaking and groaning—a recapitulation of Bernstein’s first stylistic tactic in new contexts.

By Annotation H (“give them back”), Bernstein tests the limits of corporeal saturation. The fullness of his sonorous presence is dangled to the listener via the nasal wail of example 2’s final timbral moment. Here, the relief line nearly scrapes the ceiling of the possible auditory space, and this exhaustion of room to maneuver feels literal in my body, since I also hear him run out of air. My impression of his airways tightening and roiling as breath leaves without fresh return induces my own lungs to strain in reflex. This deformation of voice through the start of suffocation, and the alarm felt by its listeners, exemplifies the musical mimesis or “aural voyeurism” theorized by scholars like Cox, Mead, Eidsheim, and Reed.63 I hang onto the last breath in Bernstein’s throat alongside him, but importantly, I linger longer—and ruminate deeper—in his stream of storytelling as well.

Immediately after at Annotation I (“She’s got something to cry about”), Bernstein's resonant cavities snap shut and again the listener perceives only a shred of his full bodily profile. I am back to the start: a subterranean territory of dispassionate musings far more guarded than the baseline surface of example 2. But while this gesture frustrates my listening, another function is served. The leap back to concealment—from the edge of breathlessness, where his whole sounding apparatus was almost revealed—rescues Bernstein from impacting against the ceiling of his available space prior to the poetic climax of the piece. And the poet’s recovery here also resets listener expectations, while pulling their bodies into sync with his cycling performance. So this map demonstrates another two key concepts: first, the performer’s construction of a middle relief texture, as adapted from Clifton into the surface-moment model; and second, Bernstein’s stylistic tactic of rapidly shifting his resonant control to steer the listener’s embodied reactions.

Example 3 (05:00–06:00)

The annotations (J–L) of the third timbral map demonstrate the limits of the surface-moment model, at least for the psychoacoustic capacity of my listening. The third map’s initial surface occupies the median territory of my auditory space, like example 2. And reprising his second stylistic tactic, Bernstein starts this section by curbing my expectation for a gentle timbral landscape that opens slowly. Again, he plays with leaps of felt relief above and below the surface through whiplash modulations of his bodily discipline. I sense this maneuver’s onset at Annotation J; here, Bernstein constructs a harsh and biting timbral moment during his delivery of the phrase, “my Dad's in there, he's like a God to me, my God's in there, he's like a Dad to me.” He then inhibits his tissues and veers back down to negative relief below the surface at Annotation K (“all your cigarettes”). So at the outset, my listening relies on a weakened and triangulated timbral surface.

Figure 3.

Timbral map.

But unlike example 2, this dive is quickly counteracted by a pivot that arcs up and through the felt timbral surface: the first instance of the relief line crossing the “membrane” of the referent rather than leaping across. Spanning Annotation K to Annotation L (“a bird or two”), Bernstein then undertakes a long, unbroken run of the line in positive relief, buoyed by two brief timbral moments that gird my sensation of its inexorable ascent. These timbral moments occur between “knifed” at 5:15 and “cry” at 5:20, as well as between “fire” at 5:32 and “walkie-talkie” at 5:38. Along this continuous and winding relief curve—the longest distance yet—I feel Bernstein slowly liberate his vocal apparatus with an almost granular progression, reaching a clarity of his bodily profile I had not perceived since two minutes ago in the second map. When this similar precipice of relief over referent was reached during example 2, Bernstein throttled back suddenly to preserve the integrity of the perceived surface—even choking from lack of air to do so. But no such recalibration is attempted now, and accordingly, I lose my psychoacoustic grip on the memory of the timbral surface during Bernstein’s ascent to the summit. This is rendered on the map through the fading out of the surface just after the last timbral moment (“walkie-talkie” at 5:38). Most striking, however, is that no new surface emerges to replace the old, and I am unable to reestablish a baseline impression of Bernstein’s resonant body from this point until the end of “This Clouded Heart.”

This breakdown of the surface that governed my embodied listening should perhaps have been anticipated. Since the start of the recording, Bernstein leveraged constant stressors against the cohesion of the timbral surface in my image of the auditory space. The surface began as rooted and steady in example 1, but by example 2, abrupt shifts in the line and stymied expectations caused the referent to flicker on my phenomenological horizon. And so example 3 is primed as the stage when Bernstein’s articulation becomes so exposed and drastic that my aural imagination loses its focus. The surface then disintegrates entirely as an explanatory apparatus for grounding my own body against the performer. My listening perseveres by tracing only timbral gestures in Bernstein’s flesh, without any function of relief. The apex of this referent-less pursuit occurs at Annotation L. Here again, my body cringes as Bernstein squeezes out every drop of air, and his last layer of muscular inhibition is shed to do so. This emancipation corresponds with the mock questioning of his mother: Why is his father in jail; why can they not escape their economic misery? Explain to me, he pleads, “how come the hole in the roof isn’t big enough so I can fly out, but big enough so the rain can get in?” The unhinged state that follows bears no resemblance to the component parts or musical logic of the surface-moment model, and so I opt to render this on the map as a forceful and unpredictable circuitous ascent after Annotation L, devoid of plateaus or consistencies to fix in memory. Now, Bernstein has reached the embodied state that would seem generic for most spoken-word poets: the full-throated and intensive molding of each sonorous shape. Yet he demanded a concentrated effort from both himself and the listener to reach this point where, in extolling his sad queries to his mother, I experience his full and sustained corporeal portrait. No matter my early distrust, by the end of example 3, I feel invested in Bernstein’s emotional pilgrimage as enacted through his heuristic of the resonant body. So this last map demonstrates Bernstein’s third stylistic tactic: to undermine and demolish the psychoacoustic surface that the first two tactics shaped, so performer and listener are bridged by a shared empathy of the flesh.


This project argues for a re-envisioning of both how timbre mediates the relationship between sonorous and listening persons and what it makes possible in analytic contexts. Timbre need not be an atemporal or momentary inflection, broken apart into an aural taxonomy of frequencies and formants, for its importance to be discovered. As demonstrated here, more potency is available to listeners and analysts than the previous scholarship suggests. Indeed, by engaging with timbre’s sustained and winding musicality as it unfurls—shown in the three timbral maps of “This Clouded Heart”—its position as a large-scale organizing principle is strengthened, and its former role as a one-way signifier for the performer alone is dramatically expanded. Bernstein’s three stylistic tactics are prime examples of such organizing principles in action.

The surface-moment model sets forth a method for making artistic and emotional sense out of compelling repertoires that seem to lack conventional musical grammar(s). The model suggested here is but one of many forms that embodied timbral readings, or heuristic music projects, might adopt in the future. Furthermore, the design of these timbral maps—incorporating adaptable ideas of space, surface, relief, and moment—is easily transferable to other analytic contexts. Avenues for future scholarship in this arena include: joining the surface-moment model to quantitative or objective assessments of timbre-centric works; looking at structural readings of embodiment on different scales, from general style to features across tracks to small segments; examining other biomechanical aspects of vocal production in tandem with the surface-moment model, such as Bernstein’s pronounced nasality; and/or incorporating a culture-theoretical perspective, like Bernstein’s references to Judaism in the song text and discourses surrounding a “Jewish voice” in performance. Even within the sphere of spoken-word performance, there are promising angles for future work on Bernstein, such as musical comparisons of his readings to kindred spirits like Bukowski or Ginsberg, or situating his work in the larger milieu of paranoiac and conspiracist monologues by cult icons like Francis E. Dec.

This project’s most significant contribution is the idea of an empathetic bridge through timbre—extended by the performer as a heuristic invitation to the listener—to bring their respective agencies into psychological or cultural communion. The heuristic sets forth numerous emotional or creative paths for the listener to experiment with as they struggle to understand the subjective world of the performer. And so each listener’s heuristic journey is intimately theirs alone, as they negotiate the alienation and mistrust that is our social default. The surface-moment model as applied to “This Clouded Heart” captures the construction of the heuristic in real time. Bernstein warily opens his body to his audience through the conduit of vocal timbre, and listeners tepidly allow themselves to hear what he has to say. In short, they learn how to trust him through their experience of his resonant body, which they recognize like their own.

Recall my initial repulsion during the listening encounter that started this article: Perhaps these macabre musings strung together by Bernstein are not isolated to his tumultuous life alone. Perhaps his voice is telling me, if I am brave enough to listen, that I share some shreds of his experiences and he shares some shreds of mine. Indeed, I think what attracts listeners to artists like Bernstein—whether Kurt Cobain in 1987, or me in 2021—is a recognition of this sharing and the catharsis it provides. And of course, this strange charisma is not confined to just poetic readings with a backing track. Consider Bruce Pavitt describing the day he first met Bernstein and offered him a recording contract:

[Bernstein] came into the Sub Pop offices and we went out to lunch. I found him to be a master storyteller. He riffed on his personal life at length for an hour. By the end of that hour, he had convinced me that his mother was an opera singer, and that his father generated a small fortune through the invention of the plastic strawberry basket. And we saw him as an integral part of the rock culture, even though he wasn’t playing rock music.64


