The Medium In The Message: Phonographic staging techniques that utilize the sonic characteristics of reproduction media.


This article is concerned with a phenomenon that I am calling media based staging and which has developed out of the work of William Moylan (1992) and Serge Lacasse (2001). William Moylan (ibid: 48) describes the sound stage in recorded music as “the location within the perceived performance environment, where the sound sources appear to be sounding”. In the literal sense this involves direction and distance but Moylan has also postulated a third metaphorical dimension of high to low pitch. The most common example of this form of staging relates to a stereo setup and the listener’s triangulation with the two speakers but these ideas can also be applied to mono, quadraphonic and 5.1 surround audio reproduction systems. Serge Lacasse (2001) has extended Moylan’s idea to include how the treatment of the sound in the production process can alter its meaning through mimesis and metaphor. Thus particular types of distortion are mimetically or metaphorically related to the aural manifestation of aggression: the sound of an overdriven amplifier, for example, shares timbral qualities with a shouting human voice. Exaggerating the amplitude of the high frequency content of a sound can suggest proximity because high frequency amplitude dissipates more quickly over distance than that of low frequency sound. This sense of proximity in turn suggests intimacy through association.

The concept of media based staging takes the idea of ‘location’ a step further to include perceptions of time and place that are associative rather than perceptual: how the aural ‘footprint’ of particular forms of mediation associated with audio reproduction media has been used to generate meaning within the production process.

Before addressing the four proposed categories of media based staging in detail, I will start with a brief discussion of the ways in which audio reproduction media can impose their ‘sonic footprint’ on the sound we hear.

The Sound of Particular Media

Particular audio reproduction media involve specific limitations in frequency and dynamic range and particular forms of distortion, ambience and noise. These media will generate associative meaning for audiences with particular forms of cultural experience. If we are familiar with telephones, we will recognize the ‘sonic footprint’ of telecommunications technology on the sound of a voice: a much reduced frequency and dynamic range, often combined with harmonic distortion.
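As a rough illustration of how such a footprint can be imposed on a signal, the sketch below band-limits a list of audio samples and then soft-clips the result. It is a minimal sketch under stated assumptions and not a description of how any of the records discussed in this article were made: the function names, the 300 to 3400 Hz pass band (the nominal range of traditional narrowband telephony), the gentle first-order filters and the drive value are all my own illustrative choices.

```python
import math


def one_pole_lowpass(samples, cutoff_hz, sample_rate):
    """First-order low-pass filter: a gentle (6 dB/octave) high frequency roll-off."""
    alpha = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    out, state = [], 0.0
    for x in samples:
        state += alpha * (x - state)
        out.append(state)
    return out


def one_pole_highpass(samples, cutoff_hz, sample_rate):
    """First-order high-pass filter: the input minus its low-passed copy."""
    low = one_pole_lowpass(samples, cutoff_hz, sample_rate)
    return [x - l for x, l in zip(samples, low)]


def telephone_stage(samples, sample_rate=44100,
                    low_hz=300.0, high_hz=3400.0, drive=4.0):
    """Hypothetical 'telephone' treatment of a mono signal in the range -1..1.

    low_hz, high_hz and drive are assumed values chosen for illustration.
    """
    # Reduced frequency range: remove content below low_hz and above high_hz.
    band = one_pole_highpass(samples, low_hz, sample_rate)
    band = one_pole_lowpass(band, high_hz, sample_rate)
    # Soft clipping: tanh squashes the peaks (reduced dynamic range) and
    # adds harmonic distortion as the drive increases.
    return [math.tanh(drive * x) for x in band]
```

Passing a full range vocal recording through a function such as `telephone_stage` and comparing the result with the original gives a crude sense of the three characteristics named above; real telephone systems add codec artefacts and noise that this sketch ignores.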

This is further complicated by issues of familiarity and expertise that may allow for finer or coarser gradations of differentiation. For example, the sound of early recording is a broad based association familiar to most members of post-industrial societies. On the other hand, hearing the difference between wax cylinder or acoustic recording and 1920s electric disc recording is a relatively easy skill to acquire but not one that’s common in contemporary society. Recognising the sound of a voice coming down a phone line is, once again, a widely acquired social skill whereas hearing the difference between a land line and a mobile phone is a more specific skill. On a slightly more esoteric level, being able to differentiate between 16 and 24 bit recording or between a good quality MP3 and a .wav or an .aif file is a less widely found talent.

Environmental Forms of Media Based Staging

I have distinguished between associations related to chronology and those that I have called environmental, i.e. associations related to contemporary society but to different environments or arenas of experience. Within this category of environmental forms of media based staging we can identify various distinct forms such as:

  • public address systems in various types of environment – supermarkets, railway stations, sporting events, aeroplanes etc.
  • sound reproduction systems such as muzak in an elevator or a supermarket, film sound in a movie theatre, AM radio sound, or the sound of TV.
  • various communication media such as different types of telephone calls, walkie talkie radios, police radios, or the sound of astronauts communicating from the moon.

I shall now investigate the way that these forms of staging can create meaning by looking at a few examples.

My first example is Eminem and the supermarket PA system or tannoy on The Real Slim Shady (2000). Aside from referencing a familiar form of paging – or requesting someone’s presence – that relates to the lyrical content, this is also a culturally familiar form of disembodied voice. The role of the narrator, a very common form of disembodied voice in contemporary media, can be summoned in many more conventional ways than this – the sound of radio announcers or TV voice overs being two examples. Record production is itself another medium that generates disembodied voices – the paradox that Evan Eisenberg (1987: 157) has described as the performer without an audience and the audience without a performer. Techniques such as this can allow the creation of multiple levels of disembodiment. The staging in this example identifies this version of Eminem’s voice as different to the main vocal – a step further away and thus a narrator commenting on or preparing us for the lead vocal.

A further example of this can be found on 10.30 Appointment by Soweto Kinch (2006), a UK rap artist from Birmingham. The cultural significance of an interview at a Job Centre in the UK (conjured up by the tannoy announcements and the office environment noises), where the protagonist explains to the employment officer that he wants to be a rapper, is not only very British but is also culturally specific to the unemployed and students claiming state benefits. The ritual humiliation of the ticket and window interview queuing technique is, however, more broadly familiar.

Moving from public address to sound reproduction systems, one frequently used example involves tuning in a radio station.

On I Wish by Skee-Lo (1995) the introduction to the track is staged as playback with the limited frequency range of a small speaker AM radio and this is itself introduced by the sound of a tuning dial being turned. The track is referencing the audio medium that is expected to be one of its primary forms of playback. It also works as an interesting twist on an often used popular music tool: arranging the introduction as a lighter version of the main theme. Rather than, for example, a solo piano introduction before the band plays the fuller arrangement, this provides a version of the track with high and low frequency filtering which is then removed as the vocal starts. In a move that dilutes and confuses the message of the radio tuning, there is also the sound of a ringing tone mixed quietly into the introduction, setting up the ‘hello’ of the vocal as answering a phone – although without any phone voice treatment.

The telephone voice utilised by Britney Spears on Oops!... I Did It Again (2000) is, however, recognisably treated and is dropped into the verse narrative of the song a few times – seemingly quite randomly – as an arrangement tool. The familiar staging of telephone communication is used as a tone colour in the vocal arrangement rather than as a cipher for disembodiment or separation – the form of meaning usually associated with telephone references in popular music.

Later on in the track, the filmic reference to the Titanic (1997) movie – a diamond necklace dropped into the ocean by the old lady narrator – is given the sonic characteristics of the reduced sound quality of a movie theatre.

The media based staging in these instances seems more like a set of references to popular culture than a means of creating meaning related to the musical and lyrical content. I shall return to this idea of ‘namechecking’ references that would be familiar to one’s target audience a little later.

Chronological Forms of Media Based Staging

The other common reason for using media based staging in record production is to evoke the sound of a particular (or more commonly just a vague) historical period. In the same way that sepia tinting of film and photographs, black and white photography and the particular colour saturation associated with Super8 and other home movie formats are used to denote age, the sound of early recordings is used as well.

On the other hand, another crucial aspect of this that should be mentioned is the way that particular forms of clarity and audio quality are associated with modernity. This has become quite tightly entangled with the distinction between expensive and cheap record production which will crop up again a little later.

Interestingly, my memory of the Beatles’ Honey Pie (1968) was that it had a vocal treatment simulating a 1920s megaphone but when I came to listen to it the vocal was full frequency range except for a short fragment of ‘old crackly record’ at the beginning. In this instance the treatment is an obvious reference to the stylistic period of the track.

The Buggles track, Video Killed The Radio Star (1979), uses a limited frequency range and dynamic compression on the vocals to suggest the sound of early radio broadcasts. This is mixed into a contemporary (to 1979) production sound and the production itself juxtaposes perceptions of antiquity with those of modernity – the voice and keyboard sounds have the restricted frequency range of antiquity whilst the female vocals, kick drum and bass have a sound of modernity that was set to become the standard in the 1980s.

Authority and Dilettante Related Forms of Media Based Staging

This brings us to a further distinction that can be made about the way that media based staging can create meaning: a way that is related to the ideas of familiarity and expertise that were mentioned earlier. The references I mentioned in the Britney Spears track related to popular culture in ways that were designed to resonate with the demographic of her projected audience – mobile phone conversations and romantic films.

Historical references can be similarly grounded in ideas of what might be perceived as ‘cool’ to a particular target audience. The idea of authenticity and the perceived authority that stems from speaking with a particular voice is central to the production used on Don’t Look Back In Anger by Oasis (1995). In this example the notion of authority stems from the ‘voice’ of late 1960s and early 1970s record production – the sound of analogue tape and valve or tube amplifiers. This voice of authority is the perceived golden age of rock – used to distance the sound of Oasis (and other Manchester bands of the early to mid 1990s) from the sound of the 1980s.

There are many other examples of particular types of production technology developing an authenticity within a particular musical style. Roland TR-808 drum machines and TB-303 synthesisers were, amongst others, central to both the sound and the artistic credibility of house and techno in the late 1980s. Playing, sampling and pressing a performance to vinyl as part of the creative process were also important statements of authenticity within the Bristol sound of artists such as Roni Size and Portishead. There has also been a pronounced anti-synthesiser stance by various rock bands at various points in the development of rock music. Both Queen and Rage Against The Machine went so far as to print explicit statements on their album covers. Rage Against The Machine’s eponymous 1992 album states “no samples, keyboards or synthesizers used in the making of this record” and Queen used the statement “No Synthesisers were used on this Album” on albums released before 1980.

In our post modern age though, the cachet of sonic signatures can go up as well as down and the ‘voice of authority’ can be sincere or it can be ironic. Whereas on certain hip hop tracks, for instance, the presence of the sound of vinyl crackle is a signifier of authenticity: of sampling from the original repertoire, in the case of the Mike Flowers Pops’ Wonderwall (1995) it is part of the ironic language of retro cheesiness that was central to the ‘Lounge’ scene of the time.

This inverted snobbery can be seen in terms of Bourdieu’s (1984) ideas on cultural capital – specifically of Thornton’s (1995) idea of subcultural capital: only an audience with the habitus of listening within a particular sonic world will understand the cultural resonances – the subtleties of authority and irony – that allow the “correct” reading of this audio event. The capital in this instance stems from the acquired skill of recognising the sound of ‘Lounge’: a deliberate attempt to reframe modern compositions in a 1950s or 1960s sonic landscape. To ‘belong’ one must be in touch with contemporary society enough to recognize the songwriting of Oasis, and yet sufficiently ‘above’ it to reject the original contemporary version in favour of an ironic reinterpretation that demonstrates an understanding and appreciation of the history of popular culture. Demonstrating ownership of this sort of cultural capital speaks to one’s worldliness in a way that is similar to, but more contemporary than, the appreciation of fine wines: as membership of an exclusive elite.

Another important way that media based staging can affect the meaning of recorded music is through a dilettante approach – on the face of it, a superficial, amateurish and partially understood approach to recording. Garage bands from the late 1950s onwards have produced rough and unpolished recordings and this has led to that roughness being embraced as a production aesthetic in itself. If the dilettante approach is chosen rather than being accidental or an economic necessity then it takes on additional meaning. In Darkthrone’s Transilvanian Hunger (1994) professional quality recording is treated as a signifier for the ‘establishment’ and is rejected accordingly. The band deliberately adopted production techniques that sound amateur and ‘raw’ for this album and this is one of the markers that demonstrate the band members’ move towards the anti-consumerist ‘Black Metal’ scene from the more ‘professionally’ produced sound of ‘Death Metal’ in their earlier career. The choice to adopt a Lo-Fi approach to the recording becomes a political statement: a marker of difference. The ability to recognize this meaning is sub-cultural capital: the listeners who understand the ideological intention of this form of production (rather than dismissing it as incompetent or cheap) can count themselves as members of a group through this specialist knowledge. This is, however, further complicated by the issue of whether and to what extent one takes on board the ideology. Anyone who has read this article can count themselves as an owner of this sub-cultural capital to the extent that they understand the ideological intent. That doesn’t mean, however, that we buy into the ‘Black Metal’ ideology. The belief structures that surround these forms of cultural and sub-cultural capital inflect their meaning. I may possess the cultural capital required to understand the Mike Flowers Pops and Darkthrone examples but my belief system will determine whether I interpret each as a higher form of understanding, a pretentious affectation or simply one of the myriad of competing forms of identity creation that popular music throws up.

An important aspect of this which relates back to issues of familiarity and expertise that were mentioned earlier is the fact that the signifier – the characteristic that identifies the medium in question – is often highly exaggerated. The crackle on the Mike Flowers record is so loud that it would have been a signifier of a badly worn record, the vocal track on Video Killed The Radio Star has a more restricted frequency range than the actuality of early AM radio and the slap back delay on the tannoy in The Real Slim Shady has slightly more feedback than the real thing. Gaining the stamp of authenticity, of speaking with the voice of authority, often requires the ‘tone of that voice’ to be exaggerated. In Biscuit by Portishead (1994) the signifying surface noise from the vinyl is exaggerated not only by loudness but also by being made intermittent, which draws attention to it even more.
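The two forms of exaggeration described here, loudness and intermittency, can be sketched in code. The Python example below is an illustration only and makes no claim about how the Portishead or Mike Flowers Pops records were actually produced: the function name `add_crackle`, its parameters and their default values are hypothetical, and the ‘crackle’ is modelled very crudely as randomly placed single sample clicks.

```python
import random


def add_crackle(samples, sample_rate=44100, clicks_per_second=20.0,
                click_level=0.3, burst_seconds=0.5, gap_seconds=0.5, seed=0):
    """Add crude, click-style 'vinyl crackle' to a mono signal (hypothetical sketch).

    click_level exaggerates the loudness of the surface noise.
    burst_seconds and gap_seconds switch the noise on and off in alternation,
    making it intermittent; gap_seconds=0 gives continuous crackle.
    """
    rng = random.Random(seed)  # seeded so that the result is repeatable
    click_probability = clicks_per_second / sample_rate
    period = max(1, int((burst_seconds + gap_seconds) * sample_rate))
    burst = int(burst_seconds * sample_rate)
    out = []
    for i, x in enumerate(samples):
        in_burst = (i % period) < burst
        if in_burst and rng.random() < click_probability:
            # A click of random polarity and random size up to click_level.
            x += rng.choice((-1.0, 1.0)) * click_level * rng.random()
        out.append(x)
    return out
```

Raising `click_level` corresponds to the ‘badly worn record’ loudness of the Mike Flowers example, while a non-zero `gap_seconds` produces the stop-start quality that, as suggested above, draws attention to the noise.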


In conclusion, I have identified various aspects of media based staging. It has been noted that it can create both environmental and chronological associations and that it can create the perception of authenticity either through association with an established voice of authority or through a dilettante rejection of those types of authority. In any event these techniques rely on an audience that has accumulated the appropriate forms of experience. These can be based on very broad social communities of experience – associations such as the telephone being a cipher for communication but also possibly separation – or they can be based on more tribal or sub-cultural groupings – associations such as Lo-Fi recording with anti-consumerism and rebellion or vinyl crackle with the more authentic sound of records over CDs in DJ culture.

Thus, whilst some examples of the meaning created through the use of media based staging relate to broadly recognized forms of technology, others have a much more esoteric coded meaning. The former may seek to generate quite specific but ubiquitous meaning such as the references to early recording technology in the Beatles (1968) and Buggles (1979) examples. The latter may involve complex notions of belonging and identity related to the recognition and correct interpretation of nuances in recorded sound such as The Mike Flowers Pops (1995) and Darkthrone (1994). This recognition and interpretation rely on skills that can be seen as cultural or sub-cultural capital (Bourdieu 1984 and Thornton 1995) but the meaning that they may have for different people who possess these skills depends on belief systems.


