Unheard Sounds: The Aesthetics of Inaudible Sounds Made Audible


Human hearing is limited in frequency range: our ears simply haven’t evolved to pick up frequencies below 20 Hz or above 20 kHz (many of us who have lived a few years may even have problems hearing frequencies at 16-17 kHz), likely because the information in this range did not improve our chances of survival. Other animals, however, can hear much higher frequencies. For instance, cats can hear frequency content up to 85 kHz (Heffner and Heffner 1985), and bats up to 200 kHz (Adams and Pedersen 2000, pp. 140-41). So what does it sound like, this ultrasonic world that exists beyond human perception? That is a question we will never be able to answer, of course. What we can do, however, is modify the content of that ultrasonic world so that we can perceive it, simply by transposing the ultrasonic components of the spectrum down to the audible range. This idea was the starting point for the project Unheard Sounds: The Aesthetics of Inaudible Sounds Made Audible.

We are indeed not the first to think of using downward transposition to “translate” inaudible sound in the ultrasonic range into sound one can hear and describe. Zoologists have transposed bat calls (Menino: 2012, p. 14)[i], and sound designers[ii], musicians, composers and sound artists have transposed recordings comprising the ultrasonic components of everything from crushing rocks (Barrett: 2010) to fireworks (Prebble: 2010), computers, street lights, TV sets and cicadas[iii]. However, as far as we can determine, there are no academic publications describing the musical use of transposing recordings containing ultrasonic material to the audible range, neither in the field of electroacoustic music studies nor in the aesthetics of record production. There are, however, a number of publications describing the use of ultrasound in artistic applications, but these mainly deal with ultrasound as a means of either measuring physical properties or producing tactile feedback (see e.g. Ciglar: 2010; Reynolds, Schoner et al.: 2001; Fléty: 2000). There are also a number of publications dealing with the so-called hypersonic effect, although there are some controversies around the reproducibility of this research (Oohashi, Nishina et al.: 2000; Oohashi, Nishina et al.: 2002; Yagi, Nishina et al.: 2002; Yagi, Nishina et al.: 2003). The effect refers to physiological, psychological and behavioural effects on listeners hearing recordings containing ultrasonic material, compared with recordings that solely comprise the audible range. Furthermore, there is a body of research literature describing the use of ultrasonic loudspeakers that produce audible sound through heterodyning (see e.g. Roh and Moon: 2002; Gan, Tan et al.: 2011). None of these research areas is directly relevant to our study.
Somewhat more relevant to our research is an electronic article by Boyk describing measurements of the frequency spectra of several musical instruments (Boyk: 1997). Although it says nothing about transposition or the experiential aspects of the instrument sounds, it points to the existence of ultrasonic components for many of the instruments measured, including some of the sound sources in this project.

Thus, with very little directly relevant prior research to base our inquiries on, we have had to take an exploratory approach, both technically and aesthetically. Our intention has been to develop strategies for recording ultrasound, including testing equipment and settings to capture ultrasonic content, and mapping out what kind of sounds have content in the ultrasonic range. Moreover, we have wanted to experiment with this content using different degrees of downward transposition and different transposition methods. Finally, we have aimed at investigating the aesthetic potential of these transposed sounds through creative experimentation, practical composition and improvisation.

To contextualize our research, we start by discussing the relationships between sampling rate and sound quality, including some historical perspectives. In particular, we focus on issues of transposition and how different degrees and methods can affect sound quality. In the following section we describe the practical methods and technology applied in recording a number of sound sources using different microphones and sampling rates. Then, we go on to discuss how these differences, along with differences in the degree of downward transposition, affect the different sounds we recorded. Using the transpositions from this process that we found most interesting, we show how this material can be used creatively, both in a fixed composition and in free improvisation. After summarizing our main findings and experiences so far in the project, we end the article by sketching out some relevant applications of these findings and future developments of the project.

Sampling rate and sound quality, then and now

Since sound quality and sampling rate in sound recording are central concepts in the Unheard Sounds project, it is helpful to present a brief historical and technical contextualization of these two notions.

Since the introduction of digital sound reproduction with sampler keyboards like the Synclavier and the Fairlight CMI, and with the Compact Disc, sound quality or fidelity has been tied primarily to the question of digital resolution and its two key numbers: the bit resolution and the sampling frequency. The former denotes the number of binary digits, or bits, used to represent a measurement of the signal, and the latter denotes the number of times per second the signal is measured. The sampling rate also directly affects the frequency range with which it is theoretically possible to represent a signal. According to the so-called Nyquist sampling theorem, the highest frequency that can usefully be represented by a digital system is given as sr/2, where sr is the sampling frequency (Roads, Strawn et al. 1996, pp. 30-31). In other words, the sampling frequency must be at least twice the highest frequency being sampled. Frequencies above the sr/2 limit – the Nyquist frequency, as it is often called – will be subject to aliasing or foldover, i.e. they will be reflected down to a lower frequency and thereby create sonic artefacts that weren’t present in the original signal[iv]. To be able to represent all frequencies that a human ear can perceive (20 Hz – 20 kHz), one needs sampling frequencies above 40 kHz, which is why the sampling rate standards of 44.1 kHz and 48 kHz were established. Sampling rate is therefore a parameter that can be linked to both sound quality and frequency range.
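As a rough illustration of the foldover behaviour (our own sketch, not taken from the cited literature), the apparent frequency of a sampled tone can be expressed as a simple calculation:

```python
def alias_frequency(f, sr):
    """Return the apparent (possibly aliased) frequency, in Hz, of a tone
    at f Hz when sampled at sr Hz. Content above sr/2 folds back down."""
    f = f % sr            # sampling cannot distinguish f from f + k*sr
    if f > sr / 2:
        f = sr - f        # reflection ("foldover") around the Nyquist frequency
    return f

# A 25 kHz component sampled at CD quality folds down to 19.1 kHz:
print(alias_frequency(25_000, 44_100))   # -> 19100
# A tone safely below the Nyquist frequency is unaffected:
print(alias_frequency(10_000, 44_100))   # -> 10000
```

This is why anti-aliasing filtering before the converter, rather than any later processing, is the only way to keep such artefacts out of a recording.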

In recent years, however, the 44.1 kHz and 48 kHz standards have been challenged by high-definition audio formats such as Super Audio CD and DVD-Audio, the former with a frequency response up to 100 kHz and the latter with a sampling rate of 192 kHz. While these formats were thought to deliver unprecedented sound quality, it has been shown that neither laymen nor sound professionals are able to differentiate perceptually between these formats and the standard CD format (44.1 kHz/16-bit) in an ordinary music playback situation, apart from recognizing a slightly higher noise level in the “silent” parts between songs on a CD compared to the other formats (Meyer and Moran: 2007)[v]. Even if ultra-high sampling rates thereby appear to add little to sound quality in ordinary playback situations, the picture changes when downward transposition is involved.

Returning to the digital samplers mentioned earlier, which entered the high-end market at the end of the 70s and beginning of the 80s, the question of transposition is relevant to our study. Since data storage was a limited resource in the early samplers, they usually had relatively few samples stored in memory, and these samples were then transposed quite a lot to make them cover the whole range of the keyboard. While this was a practical and economical arrangement, meaning that manufacturers could use less expensive memory components, the sounding results were heavily affected by what some have called “munchkinisation”, after the munchkins of The Wizard of Oz, small human-like fantasy creatures whose voices were created by playing back recordings at increased speed (Harametz: 1977, p. 97). As most of us know from experience, playing back recorded sound at increased speed affects not only the pitch of the original recording, but also the timing and the timbre of the sound, in such a way that it can affect the properties we assign to the sound source. For instance, we tend to experience creatures and objects as smaller when their sounds are played back at a higher speed, and as larger when played back at a lower speed, something that is frequently used in sound design for film (Smith, Patterson et al.: 2005; Sonnenschein: 2001). This is because the spectral envelope, with its perceptually important spectral peaks, or formants as they are usually called, will be transposed in the same ratio as the pitch of the recording. In real life, however, these spectral peaks tend to be much more invariant, with the effect that speeding up a recording of a human voice by more than about 20% can change properties like the apparent age, gender or size of that person.
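The relationship between playback speed and the resulting pitch (and formant) shift is logarithmic, and can be sketched as follows (our own illustration, not part of any sampler's code):

```python
import math

def semitone_shift(speed_factor):
    """Pitch shift in semitones when a recording is played back at
    speed_factor times its original speed. Because speed change scales
    the whole spectrum, formants shift by exactly the same amount."""
    return 12 * math.log2(speed_factor)

print(round(semitone_shift(2.0), 2))   # doubling the speed -> 12.0 (one octave up)
print(round(semitone_shift(1.2), 2))   # the ~20% speed-up mentioned above -> 3.16
```

Even the modest 20% speed-up thus moves every formant more than three semitones, which is roughly where voice identity starts to break down.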

To return to the transposition process used in the digital samplers, the sonic effects are comparable to those caused by changing the playback speed. In addition, depending on the sampling rate and the method/algorithm, transposition can introduce sonic artefacts that belong exclusively to the digital domain.[vi] One method of transposition, applied in the first Fairlight CMI samplers, was to vary the sampling rate: by playing back a sample at half the sampling rate, it would drop one octave. This would also limit the overall frequency range of the sound and, according to the Nyquist theorem discussed above, limit the possible range of frequencies that could be represented. For transpositions of several octaves, this would affect the sound quality by introducing grittiness (due to the lowered resolution) and a loss of presence (due to the lack of high-frequency content). In other words, it would produce a sound quality that one would typically characterize as lo-fi today. With the alternative method applied by many later samplers, often referred to as resampling, the interpolation of new samples between the original ones could lead to less grittiness after transposition, but the loss of presence with large downward transpositions would be about the same. Here, the method and resolution of the interpolation are of considerable importance. For instance, interpolating linearly between two samples will fill in values that lie on a straight line between the original samples, and therefore reconstruct a rather edgy waveform that adds noise and/or aliased frequency components in the upper frequencies. With the hugely popular AKAI S1000 and S1100 samplers that entered the market in the late 1980s, the “eight-point windowed sinc interpolation” method introduced artefacts “only with extreme transpositions”, according to Sound on Sound’s Steve Howell (Howell 2005).
Naturally, when memory capacity and CPU power became much cheaper during the 90s and 00s, the practice of providing one or several samples per note became unproblematic in terms of hardware. Hence, the need for high-quality transposition and interpolation algorithms was radically reduced.
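The resampling approach with linear interpolation can be sketched in a few lines (a minimal illustration of the general technique, not any sampler's actual code): the sample table is read at a fractional increment, and missing values are placed on a straight line between neighbouring samples.

```python
def transpose_linear(samples, ratio):
    """Play back `samples` at `ratio` times the original pitch
    (0.5 = one octave down) using linear interpolation."""
    out = []
    pos = 0.0
    while pos < len(samples) - 1:
        i = int(pos)
        frac = pos - i
        # straight line between the two neighbouring samples
        out.append(samples[i] * (1 - frac) + samples[i + 1] * frac)
        pos += ratio
    return out

# One octave down roughly doubles the number of output samples:
src = [0.0, 1.0, 0.0, -1.0] * 100          # a crude 400-sample waveform
print(len(transpose_linear(src, 0.5)))      # -> 798
```

The straight-line segments are exactly what produces the “edgy” waveform described above; windowed sinc interpolation replaces the straight line with a weighted sum over several surrounding samples.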

The lack of high-frequency content in extreme downward transpositions is not possible to remedy with clever algorithms, and the resulting upper limit can be calculated by dividing the Nyquist frequency by the transposition factor. For instance, using CD quality (44.1 kHz), the highest frequency content present would be:

  • up to about 11 kHz for a transposition one octave down
  • up to about 5.5 kHz for a transposition two octaves down
  • up to about 2.8 kHz for a transposition three octaves down
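The calculation (Nyquist frequency divided by the transposition factor) can be sketched as follows, using our own illustrative function name:

```python
def highest_frequency(sample_rate, octaves_down):
    """Highest frequency surviving a downward transposition:
    the Nyquist frequency divided by the transposition factor 2**octaves."""
    return (sample_rate / 2) / (2 ** octaves_down)

print(highest_frequency(44_100, 1))    # -> 11025.0 (about 11 kHz at CD quality)
print(highest_frequency(44_100, 3))    # -> 2756.25
# At 192 kHz, even three octaves down still leaves content up to 12 kHz:
print(highest_frequency(192_000, 3))   # -> 12000.0
```

The 192 kHz figure illustrates why high sampling rates matter for extreme transpositions even though they are inaudible in ordinary playback.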

Hence, one does not have to transpose the sound much before it starts to feel “muffled”, distant or lacking in presence. Moreover, at some point it will start to acquire a “grainy” or “gritty” quality as well, no matter what transposition algorithm one uses. Even if sound designers have noted that there is considerable improvement when one records at 192 kHz before extreme transpositions (two octaves or more), we believe that it is by combining high sample rates with microphones covering the ultrasonic range that we can get the most out of extreme transpositions in terms of sound quality. In the next section we describe our recording procedures with these issues in mind.

Recording ultrasound

The recordings were made in two sessions using different technical setups and a wide variety of sound sources. Both sessions were carried out in a sound studio environment in order to gain as much acoustical control as possible and to eliminate unwanted background noise. The idea was to obtain recordings as precise as possible for investigating the ultrasonic range. In order to ensure the capture of sound in the ultrasonic register, it was necessary to prepare a system that did not compromise the sound anywhere along the signal chain. The first part of the signal chain was a Brüel & Kjær 4939 ¼-inch free-field microphone that operates in the range from 4 Hz to 100 kHz. The microphone was connected to a Norsonic 1201 preamplifier. By distributing the pre-amplified signal to two recording systems, one using a 48 kHz sample rate and the other 192 kHz (both 24-bit), we could produce two recordings of the identical source but with different sampling rates.


Figure 1: Recording setup for session 1.

The first line was sampled at 192 kHz through a Prism Sound Orpheus interface using Cubase 7 as the recording medium; the second line was sampled at 48 kHz through a Soundcraft Vi 4 digital mixer connected to an RME MADI interface using Pro Tools 11 native as the recording medium. Based on the experiences gained from this first session, and after listening to these recordings in different transpositions, we decided to do a second recording session in order to explore both the technical setup and the sounding content further. The intention behind the second session was both to record more sound sources and, at the same time, to identify the differences between using an ultrasound microphone and a standard studio microphone, an AKG C 414 XLS. The AKG was set to an omnidirectional polar pattern in order to make the comparison with the ultrasound microphone as fair as possible. Since the microphones were amplified using different preamplifiers, we had to ensure that we had a similar gain structure for both of them. We achieved this by sending pink noise through a speaker in the recording room and adjusting the gain by ear. The sound interfaces used for AD conversion were identical. The setup consisted of two RME Fireface 800 audio interfaces running at sample rates of 48 kHz and 192 kHz respectively, using Cubase 7 as the DAW for both systems.


Figure 2: Recording setup for session 2.

As can be seen in Figure 2, this setup provided simultaneous recordings from both microphones at two different sample rates. In both sessions we wanted to explore a wide range of sound sources, both to have a broad basis for comparing the different recordings and to provide rich material for composition. The following section elaborates on the selection of sources.

Selection of sound sources

It was difficult to predict which types of objects would have information in the ultrasonic range before the recordings started. From the literature, we knew that many musical instruments have content in the ultrasonic range (Boyk 1997), but apart from that we had to rely on experimentation and intuition. The first session was used to try out a large variety of sources of different materials and sizes, using different excitation methods. These included glass and metal objects, electrical tools, water sounds, the sound of lighter gas and the lighting of matches. The experience gained from these first recordings enabled us to make a more focused selection of sources for the second session, as we had already learned what to look for in different objects. This session was also used to expand the selection of recordings to include musical instruments such as piano, trombone and different types of percussion.

Microphone placement and monitoring

Making recordings of unheard sound was an unusual situation in the recording studio since, clearly, we were unable to hear the content we wanted to capture. Because of this we depended on software to visually monitor the content along the way. To maintain the gain structure at a desired level for the different recordings, we adjusted both the microphone pre-amplification and the microphone placement relative to the sound source. The input levels were set by visually monitoring the meters in the preamps and the DAW, together with listening to the parts of the signal that fell within our hearing range. When it came to monitoring sounds above the audible range, we depended entirely on visual feedback. For this purpose we used the Voxengo Span plug-in (see Figure 3) inside the DAW. This analysis tool could be adjusted to display content up to 96 kHz, and gave us the possibility to see which sound objects produced information in the ultrasonic range – and which did not. Since most commercially available analysis tools for music production operate only up to 20 kHz, this specific plug-in was necessary in order to select which sound sources should be investigated further, and which to abandon.
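As a hypothetical stand-in for this kind of visual monitoring (the function and the synthetic 40 kHz test tone below are our own illustration, not part of the recording setup), one can estimate how much energy a signal carries above 20 kHz with a plain discrete Fourier transform:

```python
import cmath
import math

def band_energy(signal, sample_rate, f_low, f_high):
    """Sum of squared DFT magnitudes for bins between f_low and f_high Hz."""
    n = len(signal)
    energy = 0.0
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n
        if f_low <= freq <= f_high:
            x = sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
            energy += abs(x) ** 2
    return energy

# A 40 kHz test tone sampled at 192 kHz: its energy sits entirely above 20 kHz.
sr, n = 192_000, 96
tone = [math.sin(2 * math.pi * 40_000 * t / sr) for t in range(n)]
ultrasonic = band_energy(tone, sr, 20_000, sr / 2)
audible = band_energy(tone, sr, 0, 20_000)
print(ultrasonic > 1000 * audible)   # -> True
```

A production tool such as Span of course uses a fast Fourier transform and proper windowing, but the principle of inspecting per-band energy is the same.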


Figure 3: Voxengo Span

Recording without being able to listen to the source on the way in is, of course, not an ideal solution, but the above method helped us gain enough experience along the way to foresee some of the results without hearing them in real time. A natural next step in continuing this work would be to use an application that could transpose the sound in real time.

Evaluating transpositions

Having recorded a selection of sound sources with different microphone and sampling rate configurations, we subjected the resulting sound files to different degrees of downward transposition, and for two of the sounds we also experimented with filtering out the frequency content that was originally present in the audible range, i.e. below 20 kHz, before transposing the ultrasonic content down into the audible range. Given the preliminary and exploratory nature of this phase of the project at the time of writing, we have chosen to subject a small number of the rendered sound files to an informal and subjective evaluation.

The samples that were transposed without further filtering were single notes from a cowbell, a snare, and a trombone, each lasting no more than about 4-5 seconds. We applied the synthesis and processing environment Csound for all transpositions, and in most cases we also applied the opcode diskin2, using the 64-point windowed sinc interpolation algorithm, for rendering the transpositions[vii]. It is these transpositions that we start with.

Comparing recordings without transposition

On the whole, we observed one discernible difference between the recordings before they were transposed: the overall noise floor was higher for the B&K than for the AKG. Apart from that, we were not able to distinguish the recordings at different sampling rates from each other, which is in accordance with Meyer and Moran’s study (2007).

Comparing two-octave transpositions

For the recordings with the AKG microphone transposed two octaves down, the differences between 192 kHz and 48 kHz were quite subtle. For the cowbell sound, which when transposed sounds like a drum of sorts, one can hear that the noisy part of the sound has slightly stronger high-frequency components and gives a general experience of more “headroom” (sound example 1A and sound example 1B)[viii]. The situation is comparable for the snare drum, but here the relatively fast decay of the high-frequency components makes it more difficult to detect (sound example 2A ⇔ sound example 2B)[ix]. When listening to the recordings made with the B&K microphone, the differences become much more salient. One can detect differences between the AKG and the B&K in the 48 kHz recordings, in that the latter has slightly more high-frequency content in the attack portion of the sound, especially for the snare drum and cowbell sounds (sound example 3A ⇔ sound example 3B and sound example 4A ⇔ sound example 4B). But it is first and foremost when comparing the B&K 192 kHz recordings that really salient differences appear. Suddenly, the other recordings sound “transposed” and lacking in presence in comparison with the B&K 192 kHz, whose high frequencies sound a lot more defined and rich (sound example 5 ⇔ sound examples 6A, 6B, 6C; and sound example 7 ⇔ sound examples 8A, 8B, 8C).

Comparing three-octave transpositions

When comparing the different three-octave transpositions, the differences resemble those of the two-octave transpositions. There are relatively subtle differences between the AKG recordings and the 48 kHz B&K recordings for both the snare drum and the cowbell samples. Here it is interesting to note that the differences between the two sampling rates for the AKG microphone appear smaller than the differences that emerge when the B&K 48 kHz samples are included in the comparison. However, more striking differences appear when we compare these to the 192 kHz B&K samples. As above, these samples clearly have more high-frequency content than the other three categories, also creating a higher degree of presence and a sense of “acuity” in the sounds. Considering the highly percussive nature of these sounds, it is perhaps surprising that these high-frequency components are not only present at the beginning of the sound, but seem to last for several seconds.

Unheard sounds

One of our initial ideas when we started the project was that we wanted to hear how the samples sounded when we isolated the ultrasonic frequencies by filtering and then transposed just those frequencies down to the audible range. To achieve this we made a Csound script that filtered the sound twice using second-order Butterworth high-pass filters before transposing it using the previously discussed method. Even if the roll-off of the filters would leak small portions of the audible-range content into the sound that was transposed, we thought this would be enough to get a pretty good approximation of how the “unheard sounds” sounded when transposed down into the audible range. While we tested this for many of the sounds, we were perhaps a little disappointed in the results, at least regarding the musical potential of the sounds. Often, the transposed ultrasonic components resembled the quality of the audible range, only a lot thinner. For instance, we filtered the ultrasonic components of a cymbal strike and then transposed them down one, two, three and four octaves (sound example 9). The result still sounds a bit like a cymbal, only a lot thinner and brighter. And, as expected, the four-octave transposition clearly has the recognizable grittiness familiar from more moderate transpositions at lower sampling rates. When the transpositions were synced up and mixed together, the result clearly sounds more interesting than the transpositions one by one, now more or less resembling a tense metal wire being struck (sound example 10). We also tried the same approach with a recording of sugar grains falling onto paper, but here the impossibility of syncing a random stream of tiny impact noises instead resulted in a layered texture of granular noises (sound example 11). The sound that surprised us using this method, however, was a recording of a trombone note at fortissimo (sound example 12).
Here, each of the transpositions had almost completely lost any pitched quality, and what remained of the sound was only a stream of high- to mid-frequency grains or pulses, probably the peaks of the original oscillations (sound example 13). Thus, when mixed together, the result somehow resembled the mix of sugar transpositions, only now with regular rather than irregular grains (sound example 14). Having established that it is first and foremost the combination of the 192 kHz sampling rate and microphones sensitive to the ultrasonic range that yields a significant improvement in terms of sound quality and presence, we describe in the next section how we used the sounds resulting from this combination in musical composition and improvisation.
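The leakage through the filters' roll-off can be quantified with a small sketch (our own illustration of the textbook Butterworth magnitude response, not the Csound code itself): a second-order Butterworth high-pass has |H(f)| = 1/sqrt(1 + (fc/f)^4), and cascading two of them squares that response, doubling the stopband attenuation in decibels.

```python
import math

def cascaded_highpass_db(f, cutoff, stages=2):
    """Attenuation in dB at frequency f Hz for `stages` cascaded
    second-order Butterworth high-pass filters with the given cutoff."""
    magnitude = (1.0 / math.sqrt(1.0 + (cutoff / f) ** 4)) ** stages
    return 20 * math.log10(magnitude)

# With a 20 kHz cutoff, audible content one octave below (10 kHz) is only
# attenuated by about 25 dB, which is why some leakage remains audible:
print(round(cascaded_highpass_db(10_000, 20_000)))   # -> -25
print(round(cascaded_highpass_db(20_000, 20_000)))   # -> -6 (at the cutoff itself)
```

This helps explain why the filtered-and-transposed results could still carry a faint imprint of the original audible-range content.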

Making music

The use of playback speed manipulation as a part of music composition can be traced back almost 70 years to musique concrète and the genres and practices that followed in its footsteps (Schaeffer 1952; Holmes 2008). In today’s music, concrete sounds, whether subjected to manipulation or not, are commonly used in a wide range of genres. However, in using transpositions of concrete sonic material containing ultrasonic components, we have had access to a sonic realm that thus far has been only sporadically explored. We have followed two quite distinct paths. Firstly, we applied a significant number of sounds in an existing sample playback instrument, which was then used in free improvisation. Secondly, we applied post-production techniques to compose a piece of acousmatic electroacoustic music, albeit with musical elements from popular genres. We describe these processes in turn.

Improvisations with sample playback instrument

Our first approach was to use our samples in a transposing sample playback instrument. We put together two sets of samples in Particles, a software instrument originally made for a motion-tracking-based music device[x]. This instrument is a kind of sample player, using a sound bank of several hundred samples, with the triggering frequency based on the activity of the user and the choice of sample based on his/her position in the space. In this version, however, it was controlled with a Wii controller, with acceleration determining the triggering frequency and yaw (horizontal angle) choosing the sounds. What is interesting in this context is that the instrument also had a feature that enabled transposition by lowering the uppermost point of the body towards the floor, reaching two octaves down when positioned close to the floor. While one earlier experienced the familiar grittiness and lack of presence in the original instrument when doing this with 48 kHz samples, this was now radically improved using our high-definition samples, with the pitch (vertical tilt) of the Wii controller now controlling the transposition rate. The result was that the instrument as a whole felt a lot livelier and more timbrally rich in the lower region, without disturbing artefacts. We have attached excerpts of two “jams” made with this instrument using two different sound banks (sound examples 15 and 16).
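A hypothetical reconstruction of this height-to-transposition mapping (the function name and its normalised height parameter are our own assumptions, not Particles' actual code) might look like this:

```python
def transposition_ratio(height, max_octaves_down=2.0):
    """Playback-rate ratio for a normalised height in [0, 1]:
    1.0 at full height (no transposition), 0.25 at the floor
    (two octaves down, matching the instrument's range)."""
    height = min(max(height, 0.0), 1.0)          # clamp to the valid range
    octaves_down = max_octaves_down * (1.0 - height)
    return 2.0 ** (-octaves_down)

print(transposition_ratio(1.0))   # -> 1.0  (original pitch)
print(transposition_ratio(0.0))   # -> 0.25 (two octaves down)
```

Mapping in octaves rather than linearly in playback rate keeps the perceived pitch change proportional to the performer's movement.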

Composing acousmatic electroacoustic music

Our second approach was to compose a piece of acousmatic electroacoustic music. We can divide the composition process into four steps: 1. Selection of material, 2. Noise reduction, 3. Constructing musical elements and structures, and 4. Final mix. We will discuss these steps in turn.

Step 1: Selection of material


Figure 4: Selection process.

As shown in Figure 4, the selection process consisted of five stages, from left to right: 1) selecting objects and instruments to use as sound sources; 2) playing and recording these; 3) transposing the recordings into different octaves and editing the results; 4) from the resulting pool of sounds, selecting those we considered most interesting for further compositional work; and 5) letting the four preceding stages generate ideas for the second recording session, as indicated by the arrow going from right to left. Thus, the experiences of the first recording session gave us a much clearer view of the directions to take in the second session. All in all, the selection step was a very important part of the compositional process as a whole.

Step 2: Noise reduction

The next part of the process was to improve the sound quality of the selected sounds. A side effect of transposing the different sounds was that the overall noise level increased. Since most conventional noise-reduction tools don’t operate at ultra-high sampling rates, we had to compromise on our original intention of keeping the recorded sampling rate constant throughout the whole process. Thus, it was necessary to down-sample the compositional material to 96 kHz to be able to use it with the restoration plug-ins we had at our disposal. The noise reduction was then done with the X-Noise plug-in from Waves, using WaveLab 8 for down-sampling. This process prepared the selected sounds for further processing and composition.

Step 3: Constructing musical elements and structures

At this point the sounds were tried out in different ways. First of all, shorter and longer excerpts were tried out individually and in combination with each other in order to find appropriate musical connections. This technique was, for example, used to make the quasi-harmonic content heard in the intro of the composition. An example of this can be heard in sound example 17, which is the sound of a plate being carved with a fork, transposed down two octaves. Several of the sound clips were also convolved with each other in order to build interesting new sonic textures. An example of this can be heard in sound example 18, which is a sound morph between a peppercorn dropping on a small drum and the sound of pouring water, made with the convolution plug-in SIR2 (both sounds transposed down three octaves). Another approach was to identify tonal attributes within different sounds that could be tuned and used either individually or mapped out as multi-timbral instruments for use in melodies and chords. One example of this is the bass line in the composition, which is a downward-transposed sound of a cymbal screech created by drawing the backside of a drumstick over it (sound example 19). Another example is a multi-timbral instrument mapped out in a HALion 5 sampler, consisting of a downward-transposed cymbal screech and a tone from a small metal object (sound example 20). The rhythmical content of the composition consists of a trombone tone transposed down three octaves, with time-stretched versions of different recordings quantized into rhythmical patterns (sound example 21). A few short extracts from the improvisation discussed above were also included in the composition. Lastly, the mentioned elements, structures and textures were arranged along a timeline so as to constitute the different musical parts.
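The morphing effect of convolution can be illustrated with a minimal direct implementation (our own sketch of the general technique, not the SIR2 plug-in's code): every sample of one sound excites a scaled copy of the other, and the copies are summed, so the result carries the spectral character of both.

```python
def convolve(a, b):
    """Direct (O(n*m)) convolution of two sample sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

# Convolving with a single unit impulse returns the other sound unchanged,
# which is why convolving with a recorded texture imprints its spectrum:
print(convolve([1.0], [0.5, -0.25, 0.1]))   # -> [0.5, -0.25, 0.1]
```

Real convolution reverbs and morphing plug-ins do the same operation in the frequency domain via the FFT, which is vastly faster for long sounds.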

Step 4: Final Mix

The last step in the process was to make the different sounds fit together in a final mix, carried out as a standard music production process. Individual sounds were equalized, placed within the stereo field, edited, automated and grouped together as in a traditional mix. The result can be heard in sound example 22.

Conclusion and future work

All in all, our project highlighted the benefits of combining recording at 192 kHz with the use of microphones that can record in the ultrasonic range. The sounds recorded in this manner were significantly better in terms of sound quality and presence than recordings made at 48 kHz and/or with conventional microphones when the sounds were subjected to large downward transpositions, especially two to three octaves down. In our experience, these sounds had a lot of musical potential, both in a more experimental improvisational context and in an expression combining musical elements from popular genres with electroacoustic composition techniques. While we had originally hoped that the ultrasonic components would be highly interesting when isolated through filtering, this turned out to be less rewarding for the sounds we had chosen to work with.

As we see it, the preliminary results of this project can be relevant to many practitioners working in the fields of record production, composition, performance and sound design. A great many practices within these fields are based on sample playback, often involving transposition. This is particularly the case for sound designers and electroacoustic composers, who often use downward transpositions of several octaves. Our research highlights the benefits of combining high sampling rates with recording equipment sensitive to the ultrasonic range, and the sound quality improvements this offers at extreme transpositions.

However, the recording and compositional processes highlighted some challenges that can be addressed in future work. Firstly, monitoring ultrasonic material was a challenge, since it forced us to rely on visual feedback. Secondly, even the state-of-the-art recording equipment we used to capture the ultrasonic components added a considerable amount of noise to the recordings, especially for low-level sounds. Removing the noise components also turned out to require down-sampling to 96 kHz, something we had initially wanted to avoid.
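To illustrate the kind of processing involved in that clean-up stage, the sketch below implements a very basic spectral gate in Python/NumPy: spectrum bins falling below a threshold derived from a noise-only recording are muted. This is a simplified illustration under our own assumptions (function name, frame sizes and threshold factor are ours), not the commercial noise-removal software used in the project.

```python
import numpy as np

def spectral_gate(signal, noise, frame=1024, hop=512, factor=2.0):
    """Minimal spectral-gating noise reduction: mute any FFT bin whose
    magnitude falls below `factor` times the noise-floor estimate taken
    from a noise-only recording, then overlap-add the frames back."""
    win = np.hanning(frame)
    # Noise-floor estimate: mean magnitude spectrum of the noise segment
    noise_frames = [np.abs(np.fft.rfft(noise[i:i + frame] * win))
                    for i in range(0, len(noise) - frame, hop)]
    floor = np.mean(noise_frames, axis=0)
    out = np.zeros(len(signal))
    norm = np.zeros(len(signal))
    for i in range(0, len(signal) - frame, hop):
        spec = np.fft.rfft(signal[i:i + frame] * win)
        mask = (np.abs(spec) > factor * floor).astype(float)
        out[i:i + frame] += np.fft.irfft(spec * mask, frame) * win
        norm[i:i + frame] += win ** 2
    return out / np.maximum(norm, 1e-8)  # undo the window weighting
```

In practice, dedicated tools use far more sophisticated masking and artefact suppression; the point here is only that the noise estimate and thresholds are frequency-dependent, which is why the added sensor noise in the ultrasonic band matters after transposition.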

With the experience we have gathered thus far in this project, we therefore see the monitoring and noise removal issues as something we can improve. We want to explore real-time pitch shifting/transposition as feedback during recording, and to try out non-commercial noise removal software. Moreover, we would like to work systematically to locate sound sources and excitation methods with high musical potential, and to create large sound banks that will constitute a rich source for musical exploration. We also want to make field recordings using the methods described above, to see whether out-of-studio locations offer interesting sonic potential. While the sound sources we have worked with so far have not yielded very interesting results when the ultrasonic range is transposed down in isolation, we still want to include this approach for sounds we record in the future. In short, we want to continue our search for the aesthetic potential of unheard sounds.

[i] Moreover, an article from the Wildlife Sound Recording Society advocates using cheap microphones covering the ultrasonic range to capture animal vocalizations above the human hearing threshold, and then transposing them down to the audible range: http://www.wildlife-sound.org/equipment/technote/micdesigns/ultrasonic.html (Accessed: March 2015).

[ii] The need for high-resolution transposition is probably the reason why several sound effect libraries come as 192 kHz sound files. See e.g. the list at http://designingsound.org/resources/sfx-independence/ (Accessed: March 2015).

[iii] The album Ultrasonic Scapes by Eisuke Yanagisawa consists solely of ultrasonic material transposed down to the audible range by means of a bat detector. See http://www.gruenrekorder.de/?page_id=5260 for details (Accessed: March 2015). The artist Artificial Memory Trace aka Slavek Kwi has also used transposed ultrasonic recordings on several of his albums, for instance Surroundings and Ama_Zone1:Black-Waters. See http://www.artificialmemorytrace.com/ (Accessed: March 2015).

[iv] While this is used by some as an effect, it is in most cases considered an unwanted artefact.

[v] This is mainly due to the 16-bit resolution.

[vi] See Roads, Strawn et al.: 1996, pp. 125-130 for a discussion of two different methods of pitch-shifting for digital audio: varying clock frequency and sample-rate conversion (resampling).

[vii] See http://www.csounds.com/manual/html/diskin2.html (Accessed: March 2015) for details.

[viii] When listening to the sound examples we recommend using a high-quality sound system.

[ix] Here, the “⇔” sign means compare with.

[x] The instrument is described in more detail in Bergsland and Wechsler: 2013.

Audio Examples

Example 1a – Cowbell, AKG, 48kHz, two octaves down

Example 1b – Cowbell, AKG, 192kHz, two octaves down

Example 2a – Snare drum, AKG, 48kHz, two octaves down

Example 2b – Snare drum, AKG, 192kHz, two octaves down

Example 3a – Snare drum, AKG, 48kHz, two octaves down

Example 3b – Snare drum, B&K, 48kHz, two octaves down

Example 4a – Cowbell, AKG, 48kHz, two octaves down

Example 4b – Cowbell, B&K, 48kHz, two octaves down

Example 5 – Snare drum, B&K, 192kHz, two octaves down

Example 6a – Snare drum, AKG, 192kHz, two octaves down

Example 6b – Snare drum, B&K, 48kHz, two octaves down

Example 6c – Snare drum, AKG, 48kHz, two octaves down

Example 7 – Cowbell, B&K, 192kHz, two octaves down

Example 8a – Cowbell, AKG, 192kHz, two octaves down

Example 8b – Cowbell, B&K, 48kHz, two octaves down

Example 8c – Cowbell, AKG, 48kHz, two octaves down

Example 9 – Cymbal, B&K, 192kHz, high-pass filtered at 20kHz, one, two, three, four and five octaves down.

Example 10 – Cymbal, B&K, 192kHz, high-pass filtered at 20kHz, superimposed transpositions.

Example 11 – Sugar falling on paper, B&K, 192kHz, high-pass filtered at 20kHz, superimposed transpositions

Example 12 – Trombone tone, B&K, 192kHz, unmanipulated

Example 13 – Trombone tone, B&K, 192kHz, high-pass filtered at 20kHz, one, two, three and four octaves down

Example 14 – Trombone tone, B&K, 192kHz, high-pass filtered at 20kHz, superimposed transpositions

Example 15 – Improvisation with Particles instrument, sound bank one

Example 16 – Improvisation with Particles instrument, sound bank two

Example 17 – Fork on plate, B&K, 192kHz, two octaves down

Example 18 – Convolution using the sound of a peppercorn dropping on a small drum and the sound of pouring water, both B&K, 192kHz, both transposed down three octaves

Example 19 – Drumstick screeching on cymbal, transposed down

Example 20 – Melodic phrase based on multi-timbral instrument using recordings of cymbal and small metal object

Example 21 – Rhythmical layer based on trombone tone, high-pass filtered at 20kHz and transposed three octaves down

Example 22 – Final composition


Barrett, N. (2010). ‘Crush-2.’   Available at: http://www.natashabarrett.org/crush2.html (Accessed: May 2015)

Bergsland, A. and R. Wechsler (2013). ‘Movement-Music Relationships and Sound Design in MotionComposer, an Interactive Environment for Persons with (and without) Disabilities’. In: The proceedings of re new 2013, Copenhagen.

Boyk, J. (1997) ‘There’s Life Above 20 Kilohertz! A Survey of Musical Instrument Spectra to 102.4 KHz.’ Available at: http://www.cco.caltech.edu/~boyk/spectra/spectra.htm (Accessed: March 2015).

Ciglar, M. (2010). ‘An ultrasound based instrument generating audible and tactile sound’. In: Proceedings of New Interfaces for Musical Expression (NIME 2010), Sydney, Australia.

Fléty, E. (2000). ‘3D Gesture Acquisition Using Ultrasonic Sensors’. In: Wanderley, M. and M. Battier (eds.). Trends in Gestural Control of Music. Paris: Ircam, pp. 193-208.

Gan, W.-S., E.-L. Tan and S. M. Kuo (2011). ‘Audio projection: directional sound and its applications in immersive communication.’ In: IEEE signal processing magazine 28, 1, pp. 43-57.

Menino, H. (2012). Calls Beyond Our Hearing: Unlocking the Secrets of Animal Voices. Macmillan.

Oohashi, T., E. Nishina and M. Honda (2002). ‘Multidisciplinary study on the hypersonic effect.’ In: International Congress Series 1226, 0, pp. 27-42.

Oohashi, T., E. Nishina, M. Honda, Y. Yonekura, Y. Fuwamoto, N. Kawai, T. Maekawa, S. Nakamura, H. Fukuyama and H. Shibasaki (2000). ‘Inaudible high-frequency sounds affect brain activity: hypersonic effect.’ In: Journal of neurophysiology 83, 6, pp. 3548-3558.

Prebble, T. (2010). ‘Why Use High Sample Rates?’ In: Music of Sound [blog]. Available at: http://www.musicofsound.co.nz/blog/why-use-high-sample-rates (Accessed: May 2015)

Reynolds, M., B. Schoner, J. Richards, K. Dobson and N. Gershenfeld (2001). ‘An immersive, multi-user, musical stage environment’. In: Proceedings of the 28th annual conference on Computer graphics and interactive techniques. New York: ACM.

Roads, C., J. Strawn, C. Abbott, J. Gordon and P. Greenspan (1996). The computer music tutorial. Cambridge, Mass. & London, UK: MIT Press.

Roh, Y. and C. Moon (2002). ‘Design and fabrication of an ultrasonic speaker with thickness mode piezoceramic transducers.’ In: Sensors and Actuators A: Physical 99, 3, pp. 321-326.

Yagi, R., E. Nishina, M. Honda and T. Oohashi (2003). ‘Modulatory effect of inaudible high-frequency sounds on human acoustic perception.’ In: Neuroscience Letters 351, 3, pp. 191-195.

Yagi, R., E. Nishina, N. Kawai, M. Honda, T. Maekawa, S. Nakamura, M. Morimoto, K. Sanada, M. Toyoshima and T. Oohashi (2002). ‘Auditory display for deep brain activation: Hypersonic effect’. In: Proceedings of the International Conference on Auditory Display, Kyoto, Japan.