Microphone Practice on Bon Iver’s “Skinny Love”

Microphone Practice

Within musicological circles, recording practice has been, until recently, neglected as a purely musical area of academic study. While a great deal has been written on pop and rock recordings, these studies chiefly address the analytic priorities of disciplines which are not primarily interested in musical technique per se, such as media studies, cultural studies, cultural anthropology, critical ethnomusicology, and political economy, to name a few. Other studies, such as Kevin Ryan and Brian Kehew’s (2008) groundbreaking survey of The Beatles’ recording practice throughout the 1960s, provide a thoroughly researched historical record of technologies and techniques which recordists used to create certain tracks, but they rarely explain the audible consequences of those techniques beyond the recorded repertoire of just one band in particular. In fact, very few studies have considered microphone practice as a fundamentally musical concern that is worthy of music-analytic scrutiny.[1]

This is strange because, in my opinion, microphone practice is instrumental in creating the characteristic sound of a recording. Though every step in the tracking and mixing process influences the resulting sound of a recording, I hope to prove in this paper that the consequences of microphone practice are some of the most easily audible characteristics of a completed recording of a song. This is especially true for forms of music that eschew conspicuous signal processing techniques in favour of more veridic production values.[2] There are several distinct aspects of microphone practice that I will examine individually in order to best elucidate the audible consequences of each. However, a listener hears all of these aspects at once while auditioning a recording, and so they work in tandem to create part of that recording’s sonic character. Because of this, no one aspect of microphone practice, that is, microphone choice, proximity, placement or angle, can truly be considered separate from the others.[3]

Bon Iver’s breakthrough album, For Emma, Forever Ago (2007), provides an invaluable case study for the model of analysis outlined in my master’s thesis, Towards a Model for Analyzing Microphone Practice on Rock Recordings (Lewis, 2010).[4] This album has been lauded by critics, musicians and recordists alike, most often as a triumph of the emerging ‘project’ aesthetic in recording practice (which refers to recording performed in spaces other than professional studios, with limited access to recording technology). In fact, the vast majority of the attention which FEFA has received from critics in the past three years has focused on the notoriously ascetic ‘project’ environment which Justin Vernon constructed to track the album. Specifically, Vernon tracked FEFA in a hunting cabin deep in the woods of Wisconsin (Captain Obvious, 2007). He recorded all but a few of the vocal and horn tracks which appear on FEFA using only a single Shure SM57 dynamic (moving coil) microphone, a Pro Tools “Mbox” digital-audio interface, and a laptop computer loaded with the Pro Tools “Mpowered” DAW that comes bundled with the purchase of every new “Mbox” interface (ibid). Though all of his tracking choices ultimately influence FEFA’s overall sonic character, Vernon’s unconventional use of a single dynamic microphone to transduce nearly all of his vocal and acoustic guitar tracks is of particular importance.

Vernon’s microphone choice on FEFA is unconventional because it is highly unusual for recordists to track an entire album with a single microphone.  On the contrary, it is much more usual for a wide variety of microphones to be utilized, each with its own response characteristics, preferred usage and ‘operations principle’.  The ‘operations principle’ of a microphone determines the way it transduces sound.  There are three main ‘operations principles’ used in modern recording studios: dynamic (or, moving coil), ribbon, and condenser (or, capacitor).  The ribbon microphone is more rarely used than either the dynamic or condenser microphone, and is not pertinent to the analysis of FEFA. Because of this, and due to the brevity of this paper, I will not spend time explaining the ribbon microphone’s operations principle.[5]

A condenser microphone contains a capsule called a capacitor; the term ‘condenser’, itself, is actually an outdated term for a capacitor (Izhaki, 2008: 119). In a very simplified manner of speaking, a capacitor consists of two plates, one fixed and the other unfixed. The unfixed plate sits at the front of the capsule and acts as the microphone’s diaphragm. As the front plate is disturbed by soundwaves, it vibrates sympathetically, inducing a charge between itself and the fixed plate. Because the front plate of the capsule is relatively light and moves easily, condenser microphones offer an accurate, nearly uncoloured transduction of a soundwave. The condenser microphone also boasts a wide frequency response, compared to the other types of microphones, for similar reasons. These characteristics have led the condenser microphone to become the conventional microphone choice for both acoustic guitars and vocal tracks.

In this sense, Justin Vernon’s decision to use a Shure SM57 to transduce the majority of the tracks on FEFA is unconventional, even outside of the narrowed scope of microphone choice. The Shure SM57 is a dynamic (moving coil) microphone, which has a distinctly rugged operations principle. It consists of a magnetic core, with many turns of wire wrapped around it. These turns of wire are referred to as the ‘voice coil’ of the microphone, which is connected at the front of the microphone to a diaphragm. When a soundwave disturbs a dynamic microphone’s diaphragm, the microphone’s voice coil moves in sympathetic vibration with the soundwave. The voice coil of a dynamic microphone, however, is much heavier and more rigid than the front plate of the condenser microphone’s capsule. This means that the dynamic microphone has a much more limited frequency response and, because of this, provides a much less transparent transduction of soundwaves.

The Shure SM57 plays a prominent role in constructing many of the sounds heard on FEFA. Vernon’s vocal tracks are particularly dark and muddy, even when compared to those transduced for his later EP, Blood Bank (2009), released soon after the broad distribution of FEFA. This is despite the fact that the titular song was reputed to have been written for Bon Iver’s debut album (Some American, 2009). All of the vocal tracks on FEFA, without exception, are muddier than the lead vocal tracks on “Woods,” Vernon’s first foray into conspicuous processing, and the final track on Blood Bank. To be more specific, there is more upper mid and high frequency content on the vocal tracks from the later song than on any of the tracks from FEFA. Surely the spectral content on “Woods” is further complicated by Vernon’s extensive use of pitch shifting, but even through this heavy signal processing a listener can hear delicate aspects of the vocal performance that are less audible on the 2007 album. In particular, during the opening moments of “Woods” the sibilance of Vernon’s performance and the “pops” of air from his plosive consonants are exceptionally present. Though these parts of his performance are also audible on, say, “Blindsided,” the hisses and pops are more abrupt in their envelope, and lack the nuance of the transduction of “Woods.”[6]

This comparison suggests that microphone choice has, at the very least, some bearing on the final character of a recording. However, proximity and placement also influence this character and, as earlier stated, it is inadvisable to examine one aspect of microphone practice in isolation from the others. The remainder of this paper will consist of a case study of the song “Skinny Love” from Bon Iver’s FEFA (2007).[7] The detailed examination of the album recording of this song will support claims that microphone practice can be read as a primarily musical concern and that the aspects of microphone practice cannot be considered independently from one another.[8]

A Case Study: “Skinny Love”

“Skinny Love” boasts clearer, and more present, guitar tracks than most other songs on FEFA. At first blush, there seems to be a ‘slapback’ echo on the guitar tracks, but, as the song continues, it becomes apparent that there are simply two guitar tracks playing in unison.[9] At 0:10, the guitar tracks split – panned hard left and hard right, respectively – and the different microphone techniques used to capture both become clear. One track is quite warm and seems to be the result of a microphone positioned at the sound hole. This instrument supplies the song’s bass line, which has a slightly different rhythm than the second guitar part. It seems likely that Vernon used the baritone guitar to perform this part, given its generally dark character. String noise is noticeably absent from this track, which indicates that the microphone was placed at the sound hole and pointed away from the neck.

The second guitar on “Skinny Love” is slightly detuned, and the open strings function as a ‘drone,’ or a static tone that is played through extended sections of the song (these are the higher strings of Vernon’s guitar). Buzzing and ‘string noise’ are prominent on this track, and when Vernon allows the open strings to resonate freely, reverberation becomes audible. The microphone on this track is distant from the neck of the guitar. The lack of low-frequency content in the ‘drone’ notes also suggests that there is some distance between the microphone and the neck of the guitar, and that the microphone is pointed slightly ‘off-axis’ from the sound hole, from its position at the twelfth fret.[10]

More interestingly, “Skinny Love” breaks from the percussive trends established in the first two songs on FEFA. Rather than simple kick and snare, which had been established as the norm, Vernon elects to include several tracks of hand claps. The first of these tracks enters at 3:05, but the precise timing is difficult to pinpoint because the rhythm, at first, is the same as the snare, and both tracks mask each other in the mix. The hand claps only become clearly audible in the mix after 3:20. The texture of the song thins at this point, and the hand claps both change their rhythm and become less precise in their attack. The quality of the individual hand clap tracks is brought to the listener’s attention as the attacks become more staggered and exaggerated. The microphone does not ‘peak’ when the claps occur, and there is a great deal of room reverberation on these tracks. It seems that Vernon transduced the claps at a distance from the microphone, probably of several feet. It is also likely that Vernon recorded the hand claps from several angles of incidence to the microphone, in order to get several different ‘tones’, giving the impression of a group of people clapping rather than just one person clapping very loudly.

The vocal tracks on “Skinny Love” present an excellent example of ‘comb-filtering’. “Skinny Love” features only a single melody, which Vernon doubles throughout the song. Comb-filtering is a phenomenon that can occur when two similar signals are summed. If the signals are slightly offset in time, they arrive out of phase with one another at certain frequencies, and those frequencies are attenuated or cancelled out of the summed signal. In severe cases whole spectral bands can be lost, seriously degrading the transparency of the transduction. The misalignment of the consonants from both vocal tracks suggests that they are separate takes. When Vernon sings in his chest voice, both tracks sound completely ‘in-phase’, and the overall loudness of the track increases evenly across the frequency spectrum, such as on the line “my my my, my my my my my” at the end of the first verse. Most other lines are sung in Vernon’s trademark falsetto. In these sections, certain frequencies seem to be louder than others, and the combined tone of the two tracks becomes distorted.
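
The cancellation behind comb-filtering can be illustrated with a small numerical sketch. It assumes the idealized case of a signal summed with an exact copy of itself delayed by a fixed amount; Vernon’s doubled vocals are separate takes, so the real effect is far less regular. The function names and the 1 ms delay are hypothetical and chosen only for illustration.

```python
import math


def comb_gain(freq_hz: float, delay_s: float) -> float:
    """Linear gain of x(t) + x(t - delay) relative to x(t) alone.

    A gain of 2.0 means the two copies reinforce one another fully;
    a gain of 0.0 means the frequency is cancelled completely.
    """
    return abs(2.0 * math.cos(math.pi * freq_hz * delay_s))


def notch_frequencies(delay_s: float, max_hz: float = 20000.0) -> list[float]:
    """Frequencies that cancel completely: odd multiples of 1 / (2 * delay)."""
    notches = []
    k = 0
    while True:
        f = (2 * k + 1) / (2.0 * delay_s)
        if f > max_hz:
            break
        notches.append(f)
        k += 1
    return notches


if __name__ == "__main__":
    delay = 0.001  # hypothetical 1 ms offset between the two summed signals
    for f in notch_frequencies(delay, max_hz=3000.0):
        print(f"notch near {f:.0f} Hz, gain {comb_gain(f, delay):.3f}")
```

The evenly spaced notches are what give the phenomenon its name: plotted against frequency, the response resembles the teeth of a comb. A shorter delay pushes the first notch higher in the spectrum.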

On the line “come on skinny love, just last the year, pour a little salt, we were never here,” a distinct proximity discrepancy can be heard between the two vocal tracks, one of which is panned slightly to the left, and the other slightly to the right. The vocal track panned more to the left of the ‘sound-stage’ is both more present and darker than the other. This track is also further forward in the ‘mix’ than the other, and serves as the ‘lead’ vocal track. It seems to have been transduced with the Shure SM57 at a fairly close proximity to Vernon’s mouth, and directly ‘on-axis’. The lack of ‘pops’ – that is, bursts of air that overload the circuitry of the microphone at the onset of plosive alveolar and fricative consonants – indicates that Vernon very likely used a ‘pop screen’ and was careful not to sing directly into the microphone’s diaphragm during moments of elevated intensity; or, it could indicate that the proximate nature of the track was actually achieved through equalization. The second vocal track on “Skinny Love”, panned slightly right, plays a supportive role in the texture of the song. In the first line of the song, the distance between the SM57 and Vernon’s mouth on this track is clearly audible. The tone of the voice on the right side of the ‘sound-stage’ is thinner than the voice on the left.[11] Vernon seems to intentionally use a weaker vocal tone, with more audible breath leaking past his vocal folds, but there is less evidence of the proximity effect on this track than on the other. The proximity effect is a phenomenon that is observable exclusively in directional microphones. It is an increase in a microphone’s sensitivity to bass frequencies when the microphone is positioned near a sound source. The distance at which the effect becomes audible varies from source to source, depending on the SPLs emitted from the source. Generally the proximity effect can be observed in vocal tracks miked within two inches of the singer’s mouth, but it can still color the sound of louder instruments from a distance of up to two feet (Lewis, 2010: 39).

Clearly, then, the aspects of microphone practice are inextricable from one another. Microphone choice, proximity, placement and angle each have their own audible consequences and influence the audible consequences of the other aspects. The guitar tone on “Skinny Love” is not simply a product of microphone proximity, just as the vocal tracks are not simply a product of microphone choice. Moreover, when these facets are considered in tandem, it becomes clear that they contribute a great deal to the sonic character of a recording. This is immediately audible on “Skinny Love,” “For Emma,” and “Creature Fear,” but it is true for almost any song with veridic production values. It is precisely in the universality of these audible consequences, and the complexity with which they present themselves within a given recording, that clear analytic value can be found.

About the Author

Amanda Lewis
Don Wright Faculty of Music, The University of Western Ontario


[1] One very notable exception to this is Albin J Zak III’s The Poetics of Rock: Cutting Tracks, Making Records (University of California: 2001).

[2] Signal processing techniques that are considered “particularly conspicuous” in this context include pitch-shifting, explicit auto-tuning, such as that used on the Bon Iver track “Woods” (which is similar to pitch shifting), phase-shifting and excessive use of synthetic reverberation. Such techniques are considered to be “particularly conspicuous” because they are audible to the untrained ear, and can mask the audible consequences of various other steps in the tracking process.

[3] I refer to placement and proximity as discrete aspects of microphone practice. Proximity refers only to the distance between the source sound and the microphone used. Placement, however, refers to the positioning of the microphone along the body of a resonating surface. A microphone placed at the sound hole of an acoustic guitar (a natural bass port) will transduce a signal with a much stronger low frequency presence than a microphone placed at the same instrument’s bridge.

[4] I will refer to For Emma, Forever Ago as FEFA for the remainder of this paper.

[5] For more information on this topic, one can refer to a number of sources, including The Microphone Book (Eargle, 2004), Modern Recording Techniques (Huber, 2009), The Audio Dictionary (Louie & White, 2005) or my recent master’s thesis, Towards a Model for Analyzing Microphone Practice on Rock Recordings (Lewis, 2010).

[6] It is worth noting that other tracks on Blood Bank, most particularly the title track, do share the dark, muffled vocal tone of FEFA. It was unclear, at time of submission, whether or not this is due to similar microphone choices.

[7] The scope of this paper does not allow for an analysis of the full album.  I have chosen this track both for its status as a de facto single on the indie release, and for its relatively wide and varied applications of microphone practice.

[8] It may be useful to consult both “Chapter Two: Microphones, Transduction and Acoustics” and “Chapter Three: Constructing the Model” from Towards a Model for Analyzing Microphone Practice on Rock Recordings (Lewis, 2010) before continuing on to read this brief case study. Though I clarify concepts as best I can in my analysis, they are better explained in this previous work.

[9] Slapback echo is a type of special processing first popularized by Chess and Sun Records in the 1950s. “That is, a single, rapid repeat of the source sound, spaced with sufficient delay time to make the repeat clearly audible, but near enough in time to source to provide a rhythmic effect.” (Doyle, 2005: 235) Further information on slapback can be found in The Audio Dictionary (Louie & White, 2003).

[10] Off-axis refers to the positioning of the microphone in front of the sound source. A microphone that is ‘on-axis’ points directly at the source, while an ‘off-axis’ microphone is pointed askew. The significance of this difference is due to the ‘polar pattern’ of any given microphone. While some microphones, called omnidirectional microphones, pick up sounds from all directions equally, others, known as unidirectional microphones, have specific directionality in their pickup pattern. Such microphones will transduce the same sound source differently, depending on their angle of incidence, or the angle at which they are offset from the source.

[11] The ‘sound stage’ of a recording is the imaginary space where, upon listening, a recording seems to take place.  This space has width, height and depth that are audible due to, among other reasons, the variations of the reflections of the sound source, either captured or constructed by the audio engineer to create a ‘natural space’ in which their recording can ‘exist’.  Recordings that lack these reflections can often sound unnatural and jarring.


