This is a tutorial on audio editing as it relates to recorded bird calls. The example is a recording of a White-crowned Sparrow made with an iPhone. The editing was done in Audacity, a free audio editing program available for both Mac and Windows.
The White-crowned Sparrow was recorded on January 4, 2021. It was found calling from the top of a tree in suburban southern California. As I wasn’t planning on recording anything at the time, all I had on me was an iPhone. I placed it near the bottom of the tree, pressed record, and let it record for about a minute.
The resulting recording is…
Trimming the Beginning/End
In the original recording, the White-crowned Sparrow can be heard calling at roughly 7-9k. Its first call, a short ascending vocalization, occurs at 0:06. It repeats this call at 0:15, 0:22, and so on. There are a variety of other noises in the recording: some construction noise at 0:09, a neighborhood child biking by at 0:29, and general background noise throughout.
I started editing by trimming off the dead space at the beginning of the recording. There’s no downside to deleting space before the bird starts vocalizing. The benefit of trimming off that initial audio is that anyone listening to your recording will hear the target bird within a few seconds of pressing play.
In this case, I deleted about 4.5 seconds from the beginning of the audio. Doing so means the listener only waits about 2 seconds before hearing the bird.
Typically, I would also trim off any dead space at the end of the recording. In this case, I did not do so because the bird called very near to the end of the recording.
Deleting space at the beginning or end of a file in Audacity is straightforward. Simply click and drag to highlight the beginning (or end) of the audio and then press “delete” on your keyboard.
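For those who like to think in samples, the trim itself is simple arithmetic. Below is a minimal Python sketch; the 44.1 kHz sample rate and the list-of-samples representation are assumptions for illustration, since Audacity does all of this interactively.

```python
# Sketch: trimming dead space from the start of a recording.
# Assumes audio is a sequence of samples at a known sample rate.

SAMPLE_RATE = 44100          # samples per second (assumed)

def trim_start(samples, seconds):
    """Drop the first `seconds` of audio."""
    return samples[int(seconds * SAMPLE_RATE):]

# A fake 60-second "recording" of silence:
audio = [0.0] * (60 * SAMPLE_RATE)

trimmed = trim_start(audio, 4.5)   # remove 4.5 s, as in this tutorial
print(len(trimmed) / SAMPLE_RATE)  # prints 55.5
```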
One note… It is not advisable to delete dead space in between songs/calls. Exactly how long a bird waits before vocalizing again is important information, and that “waiting time” should not be removed via editing. In this case, the bird called at 0:06 and then again at 0:15. It would be problematic to delete any of the time between those calls.
After removing the dead space at the beginning, the resulting recording is…
Adding Fades
I personally find it nice to add a quick Fade In at the beginning and a quick Fade Out at the end. This is purely cosmetic: it provides no scientific benefit, nor does it do any harm. It simply makes the audio sound a little better and removes the possibility of the audio popping in abruptly when a listener presses play.
Typically, fades that are between 0.5 and 1.0 seconds in length feel about right.
To add a Fade In in Audacity:
1 – Select the very beginning of the audio file
2 – Select Effects Menu > Fade In
To add a Fade Out in Audacity:
1 – Select the very end of the audio file
2 – Select Effects Menu > Fade Out
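Under the hood, a fade is just a gain ramp applied to the samples. The sketch below assumes raw floating-point samples and a 44.1 kHz rate; Audacity’s Fade In and Fade Out effects apply linear ramps like these.

```python
# Sketch: linear fade in/out applied to a sample array.
# Pure Python for clarity; real editing happens in the Audacity GUI.

SAMPLE_RATE = 44100  # assumed sample rate

def fade_in(samples, seconds=0.75):
    n = int(seconds * SAMPLE_RATE)
    out = list(samples)
    for i in range(min(n, len(out))):
        out[i] *= i / n          # gain ramps 0 -> 1
    return out

def fade_out(samples, seconds=0.75):
    n = int(seconds * SAMPLE_RATE)
    out = list(samples)
    total = len(out)
    for i in range(min(n, total)):
        out[total - 1 - i] *= i / n  # gain ramps 1 -> 0 toward the end
    return out

audio = [1.0] * (5 * SAMPLE_RATE)   # 5 s of full-scale "sound"
audio = fade_out(fade_in(audio))
print(audio[0], audio[-1])          # prints 0.0 0.0
```

The 0.75-second default lands in the 0.5-1.0 second range suggested above.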
After making some quick fades, the resulting recording is…
Filtering Frequencies
In this recording, the bird call is at 7-9k. The majority of the background noise is lower than that, mostly in the 1-4k range.
This breakdown is fairly common with recordings of bird vocalizations. Most bird calls occur in high frequencies. Birds are generally small animals, and small things (like a piccolo) produce high pitches. As birds get bigger, their calls/songs tend to descend in pitch. A Great Horned Owl song will be lower in pitch than a Bushtit song, and so on.
By contrast, a lot of background noise is lower in register. This is in large part because of distance. When sound travels, the high frequencies dissipate faster than the low frequencies. You can see this effect plainly in this recording. The recording was made roughly 0.3 miles from the 405 freeway. Much of the background noise is that of distant traffic. This background noise comes across as low register noise in the 1-4k range. By contrast, a nearby construction worker dropped a metal object at 0:40. This noise spans the full pitch spread all the way up to 10k.
As a result of these factors, it is quite common to have a pitch separation between a bird call and background noise, with the bird call in the upper register and the background noise in the low register.
Notably, this frequency separation is not guaranteed. If you record a Eurasian Collared-Dove, it will show up at or below 1k in your recording. And background noise, particularly from sounds that are close by, may well extend all the way up to 10k.
But in this case, the bird call exists only in the upper frequencies of 6.5k and higher. Simultaneously, there’s a lot of noise in the low register. Given the pitch separation, we can reduce the noise without altering the bird call by reducing only the low frequencies.
The function to reduce certain frequencies in Audacity is called the Filter Curve. To use the filter curve…
1 – Select All to select the full recording
2 – Go to Effects Menu > Filter Curve
3 – Draw a curve that lowers the frequencies you want to reduce
4 – Click OK
In this case, I drew the following filter curve to reduce the frequencies below 4k.
When entering points in the Filter curve, you are drawing a representation of how you want the audio to be changed. You are not drawing a representation of the original audio frequencies or the final resulting frequencies. The effect functions like a math equation….
Original Audio + Filter Curve = Final Audio
With the curve that I drew, all sound above 4000 Hz will be unaltered. The sound at 2000 Hz will be reduced by 20 dB. The sound at 1000 Hz will be reduced by 36 dB. The sound at 200 Hz will be reduced by 48 dB.
After filtering the recording, the new audio is…
A few cautions about this functionality.
First, do not try to remove noise in the same frequency as the bird call. If your bird is singing at 4k and there is background noise at 4k, such is life.
Second, do not remove frequencies above the bird call.
All pitched sounds consist of a fundamental pitch and a series of higher pitched overtones.
For a moment, let’s imagine a piano string. Let’s assume that the piano string is a length that produces a pitch at 1k when the full string is vibrated. This pitch is called the fundamental pitch. Now let’s suppose we clamp that string at its exact midpoint. When we vibrate only half the string, it produces the pitch at twice the frequency…in this case at 2k.
Now let’s go back to vibrating the full string. The full string actually produces both pitches. It’s as if the full string vibrates to sound 1k, while simultaneously each half of the string vibrates to produce 2k. The pitch of the full string is the fundamental, and the higher pitch is an overtone. Additional overtones occur at the frequencies produced when the string is divided into thirds, fourths, fifths, and so on.
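The harmonic series is easy to sketch in code. Below, a hypothetical 1k fundamental (as in the string example) and its overtones; the `sample` helper shows how the partials sum into a single waveform. This is purely illustrative, not anything Audacity does for you.

```python
import math

FUNDAMENTAL = 1000.0  # Hz, matching the piano-string example

def harmonic_frequencies(fundamental, count):
    """Frequencies of the fundamental and its first overtones."""
    return [fundamental * n for n in range(1, count + 1)]

def sample(t, amplitudes):
    """One sample of a tone whose nth harmonic has the nth amplitude."""
    return sum(a * math.sin(2 * math.pi * FUNDAMENTAL * n * t)
               for n, a in enumerate(amplitudes, start=1))

print(harmonic_frequencies(FUNDAMENTAL, 4))  # prints [1000.0, 2000.0, 3000.0, 4000.0]
```

The relative amplitudes passed to `sample` are exactly the “how strong each overtone is” that determines the sonic quality of the sound.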
The overtones are a vital part of the character of any pitched sound. Exactly how strong each overtone is determines the sonic quality of the sound.
Bird calls have overtones just like piano notes. Below is a clear example. The recording is of a Royal Tern. Each call has a fundamental pitch at roughly 4.5k and a clear overtone at 9k.
As another example, look at all the overtones in a Swainson’s Hawk call.
It is possible for there to be overtones in a recording that do not appear in the graph. In the case of the Royal Tern recording above, we noted a fundamental pitch at roughly 4.5k and an overtone at roughly 9k. No overtones are visible above 9k in the graph, but only because the graph stops at 10k.
Let’s look at the same file, imported into Audacity, and expand the frequency range up to 15k. As it turns out, there is an additional overtone at roughly 13.5k.
Big picture… There may be overtones in a sound that are not immediately visible in your graph. In the case of weak or absent overtones, their absence is also important information. Increasing or decreasing the frequencies above a bird song risks impacting the overtones and changing the sonic quality of the bird call. As a result, it’s best to leave the frequencies above the bird calls unaltered.
Last, some will argue that birders should not alter the frequencies of their recordings at all. The argument goes something like…. Reducing or enhancing only certain frequencies fundamentally changes the recording. It makes the recording sound different than the sounds that were audible in the field. This is, of course, objectively true.
The counterargument goes something like…. Yes, any kind of editing changes the file. Effective editing makes the file better, while poor editing makes it worse. Most birders edit their photographs. Even recording equipment distorts images and sound. Any recording made with a parabolic mic distorts the frequencies: parabolic mics magnify sounds originating in front of the parabola while deflecting sounds originating behind it. They also only magnify the higher frequencies, whose wavelengths are small enough to be focused by the dish; wavelengths larger than the dish itself can’t be captured and are not magnified. This means that parabolic mics distort audio as they record, and man do the recordings sound great! Given these realities, “don’t process the files” is an overly cumbersome ideology, and implementing it consistently would eliminate the majority of media examples in most databases. Instead, the goal should be to make effective and responsible edits that improve the usefulness of the media.
Ultimately, you get to decide which of these viewpoints makes sense to you. For my part, I allow myself to edit files when doing so improves the media. I also feel no obligation to make edits, as in some cases doing so would decrease the value of the media.
As an example of a file that should not be filtered, below is a recording of some Rock Pigeons.
This recording was made at a nesting site. There are adult pigeons cooing in the very low register. There are nestlings begging at 5-6k. All frequencies in the recording are providing information about the birds, and any filtering would reduce the value of the recording.
Amplifying the Audio
Amplifying audio simply makes it louder. It doesn’t change the frequencies or character of the sound.
Digital audio has a maximum decibel level of 0 dB. Decibel levels are then expressed as negative numbers, such as -10 dB or -25.4 dB.
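As a quick illustration of that scale, the peak level of a recording is computed from its loudest sample relative to digital full scale (an amplitude of 1.0). This is a sketch of the math, not how Audacity is implemented.

```python
import math

def peak_db(samples):
    """Peak level in dB relative to full scale (0 dB = amplitude 1.0)."""
    peak = max(abs(s) for s in samples)
    return 20 * math.log10(peak)

print(peak_db([1.0]))                            # prints 0.0 (full scale)
print(round(peak_db([0.05, -0.02, 0.01]), 3))    # prints -26.021 (a quiet recording)
```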
Most recordings of bird calls are quite quiet. The observer is making the recording in an outdoor environment. The bird is often somewhat far from the observer. In most cases, the result is a low decibel recording, and it is desirable to increase the decibel level.
In the case of the White-crowned Sparrow recording, the maximum decibel level was -25.491 dB. At such a low level, the recording will be hard to hear in less-than-ideal listening environments. If someone listens to this recording in the quiet of their home using high-quality speakers, it will sound fine. But if someone listens on their mobile phone while out birding, they will struggle to hear the bird call even with the phone volume at maximum.
A more ideal decibel level would be somewhere between 0 dB and -10 dB. Audacity has a function called “Amplify” that raises the decibel level of the recording. It takes all the sounds in the recording and simply makes them louder. To use the Amplify function…
1 – Select All to highlight the full recording
2 – Go to Effects Menu > Amplify
3 – Enter the desired maximum decibel level in the field “New Peak Amplitude (dB)”
4 – Click OK.
For the White-crowned Sparrow recording, I set the new peak amplitude to -10 dB. This raises all sounds in the audio by 15.491 dB, making everything louder.
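The arithmetic behind Amplify is a single gain applied uniformly to every sample. A sketch, using the numbers from this recording:

```python
def amplify_gain(current_peak_db, target_peak_db):
    """Gain in dB, and the linear multiplier applied to every sample."""
    change_db = target_peak_db - current_peak_db
    return change_db, 10 ** (change_db / 20)

# Peak of -25.491 dB amplified to a new peak of -10 dB:
change, multiplier = amplify_gain(-25.491, -10.0)
print(round(change, 3))   # prints 15.491
# Every sample (bird and background noise alike) is multiplied by
# the same factor, roughly 6x in this case.
```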
The new recording is…
Notice that Amplify raises the level of all sounds, making both the bird call and the background noise louder. Often this changes the look of the spectrogram, turning the background from white to grey. In essence, this reveals how much noise is actually in the recording.
Notably…. That noise is not new. It was in the previous recording. It just wasn’t visible in the graph because the overall level of the recording was so low. So amplify doesn’t add noise to a recording—though it may make it more obvious in the spectrogram how much noise is in the recording.
Exporting the Audio
When done, you can export your audio from Audacity using File Menu > Export As. The most common file types are MP3 and WAV. MP3 files are lower quality and contain less audio information. WAV files are higher quality and contain more audio information. The tradeoff is that WAV files are larger, take up more space on your hard drive and/or phone, and are harder to upload/email/share. In the end, I would try to work with WAV files as much as possible, but there is a convenience factor to working with MP3s that is relevant.
Note— The quality level of the file is also determined when the audio is initially recorded. A file that is originally recorded as an MP3 can’t later be upgraded to a WAV. Even if you import an MP3 into Audacity and export it to WAV, that new WAV file doesn’t have more audio information in it than the initial MP3 did. If you want to export to a WAV file at the end of your editing process, then be sure to record to WAV files when you are initially recording. If recording on a smartphone, there are apps that allow you to record directly to WAV rather than MP3. Of course, those WAV files will take up more space on your phone, hence the quality/space tradeoff.
Viewing Media in a Spectrogram Graph
When editing bird calls, it is helpful to view the audio file in a spectrogram graph. A spectrogram places frequency on the Y-axis, allowing the listener to see the exact frequency of audio activity. The graphical representations of the bird calls above are spectrograms.
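Conceptually, each column of a spectrogram is the magnitude spectrum of one short slice of audio. The numpy sketch below is for intuition only, not how Audacity draws its graphs; the 7500 Hz tone stands in for the sparrow’s call.

```python
import numpy as np

SAMPLE_RATE = 44100
tone_hz = 7500.0                    # a stand-in "bird call" frequency

# One short analysis window of a pure tone:
t = np.arange(4096) / SAMPLE_RATE
window = np.sin(2 * np.pi * tone_hz * t)

# Magnitude per frequency bin -- one "column" of a spectrogram:
spectrum = np.abs(np.fft.rfft(window))
freqs = np.fft.rfftfreq(4096, d=1 / SAMPLE_RATE)

print(freqs[spectrum.argmax()])     # the strongest bin, near 7500 Hz
```

A real spectrogram repeats this for many overlapping windows and stacks the columns left to right, with frequency on the Y-axis.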
To view a spectrogram in Audacity, click on the file name in the track header panel, which is highlighted below in red.
After clicking on the track name, a drop down menu appears. Select Spectrogram.
Audacity defaults to spectrogram settings that differ from other commonly used platforms. You can personalize the spectrogram settings in Audacity’s preferences. The preferences can be accessed from the track name drop down menu or from Audacity Menu > Preferences > Tracks > Spectrograms. For those accustomed to the look of eBird graphs, the settings below most closely emulate eBird’s settings.
Once the preferences in the settings window are changed, they become the default for all files going forward.
New Audacity users should familiarize themselves with the zoom and track size options. Zoom In is Command-1 or View Menu > Zoom > Zoom In. Zoom Out is Command-3 or View Menu > Zoom > Zoom Out. I personally like the Fit to Height option, which expands the height of the audio file to fill up the window. Fit to Height is in View Menu > Track Size > Fit to Height.