Creating and Editing Audio Files


The ability to create and edit audio files, and to link them into HTML documents, is important for language students. Increasingly, research papers in the Humanities include speech samples, ranging from dialect specimens to examples of interpreting solutions to spoken instructions for visually impaired users of standard web material. Music files are edited in much the same way as speech files, and are also increasingly used in academic work.

For the PK5 exam, students must show that they can create and/or edit an audio file. This may be done by (A) creating your own speech recording or (B) editing an audio file obtained from elsewhere; in either case, briefly describe what you have done and how you did it. (Examples of both exam options are given below.)

  1. If you record your own audio file, identify the speaker and explain briefly with what software and other equipment you made the recording;
  2. If the "edited" option is chosen, put links on one of your web pages to both the 'original' and 'edited' versions of the file, and briefly explain what type of editing was done, with what software (e.g. whether it was shortened, combined with another file, enhanced in some other way, converted from one audio format to another, etc.).

NB: If the student's "HTML paper" already includes either created or edited audio material, then the separate recording/editing page for the PK5 exam is not required.

Accessories/Peripherals Required to Create Audio Files

In order to create and edit audio files, your computer must be equipped with a sound card, which usually includes jacks for a microphone, a headset and/or external speakers, "line in" and "line out" capability, etc. Virtually all computers sold during the past decade have included sound cards.

To record speech, a microphone and speakers (either standalone or as part of a headset) are normally also required. These are usually plugged into the microphone/speaker jacks on your computer's sound card, or connected via a USB port, depending on the type of equipment you have. Laptop computers almost always have built-in speakers and many also have built-in microphones, in which case nothing needs to be connected separately.

Recent PK5 survey forms have reported that all students have audio playback capability on their computers, and most also have microphones. However, speech samples may also be recorded on many cell phones, recordable MP3 players, etc., and transferred to one's computer via Bluetooth or other wireless technologies or via a USB memory stick, among other options. Tape recordings may also be transferred via cable to the sound card's "line in" jack.

If necessary, PK5 students may also use the teacher computer in Pinni B-4087 by appointment after class hours to create or edit audio files for the exam requirement; John also has a microphone which may be borrowed to create audio files on your home computer.

Software Used in Class for Audio File Creation and Editing

Two software solutions will be demonstrated in the PK5 classes. One of these is Microsoft Sound Recorder (Programs > Accessories > Entertainment > Sound Recorder), which is a standard component of all versions of Windows. Sound Recorder enables users to record audio samples and also perform basic editing of the files.

Note, however, that Sound Recorder has changed significantly from Windows XP to Windows Vista to the present Windows 7 (see Wikipedia background). As Windows 7 is used in B-4087 and Windows XP is still installed on some university public computers (and many student computers), the differences between these versions may be confusing. The XP version of Sound Recorder only allowed recordings of up to 60 seconds, but enabled recording in WAV and a number of other audio formats, and also had a relatively wide range of editing options. Windows Vista expanded the maximum recording time, but reduced the editing options. Windows 7 reduced the recording and editing options further, only allowing recording in 'Windows Media Audio' (WMA) format.

To illustrate, see these four screenshots of the Windows XP Pro version of Sound Recorder:

[Screenshots 1-4: Windows XP Pro Sound Recorder; images 2-4 show the pull-down menus]
The screenshots above show the Windows XP Pro version of Sound Recorder still used on many UTA computers.
The pull-down menus in images 2-4 give an idea of the greater capability of the software.

Also demonstrated in class will be the open-source Audacity software, downloadable from audacity.sourceforge.net. Audacity is considerably more sophisticated, enabling both the basic functions of Sound Recorder and advanced audio editing well beyond the PK5 requirements. Among its advantages are much longer file length, the ability to work with MIDI and MP3 files as well as WAV files (see below), and 'wavetable' editing of audio files. (When using Audacity to save files in MP3 format, the lame_enc.dll file — available here — must be downloaded and added to Audacity when you are prompted to do so. It is recommended that lame_enc.dll be put in your computer's Windows folder.)

About WAV, MP3, WMA and MIDI Files

WAV, MP3 and MIDI are the three most common digital formats for audio files. Of these, WAV offers the highest audio quality, though as a result also the largest filesize. "WAV" is an abbreviation for "waveform audio." The WAV format is normally "uncompressed"; i.e. it keeps 100% of the digital data captured when the audio event was recorded. Commercial music CDs store their audio as the same kind of uncompressed data. MP3, on the other hand (short for "MPEG Audio Layer III," where MPEG stands for "Moving Picture Experts Group"), is a highly compressed digital format, in which signals which are not normally audible to the human ear have been removed in order to achieve a smaller file size.

WMA ("Windows Media Audio") is another compressed digital format, similar in function to MP3. WMA is a proprietary format of the Microsoft Corporation. WMA, WAV and MP3 all record sound as digital 'waves,' which may be either speech or music or a combination of these.

MIDI, in turn, is an acronym for "Musical Instrument Digital Interface." MIDI files are produced by music synthesizers and other digital music production devices. MIDI files are always musical. As they only represent key-tone data (as opposed to sound 'waves'), they are usually quite small in size. Due to their compact filesize and quick loading, MIDI files are often used as background music for themed HTML and PowerPoint presentations. Thousands of MIDI files are downloadable free of charge from the internet for such purposes, with one of the largest download sites being MidiWorld.

Examples of Recording, Editing and Linking Speech Files

Only a simple recording is needed to demonstrate your ability to create a speech sample. The recording need not be more 'profound' than the 'rain in Spain' recording made in class by John (cf. another version of this in 'WAV' format via Windows Sound Recorder and a Plantronics digital microphone), though it can be more sophisticated if you wish. With your audio file, identify the voice(s) in the recording, and state what the format is and what software was used to produce it. The file may be either linked or embedded. The 'identification' of the file should be similar to either (a) the in-text description in this paragraph or (b) that of the embedded audio link below.


Rain in Spain, recorded by J. Hopkins
Using Sound Recorder and the B-4087 microphone
Converted from WAV to MP3 using Audacity
(this illustrates the detail needed for the PK5 exam)

The file above was inserted into the web page as a clickable link, in the same way as any other file would be linked (in this case as a 'relative' link <a href="rain.wav">'rain in Spain'</a> to the rain.wav file, which is in the same web directory). The only difference is the filename, whose extension tells a web browser that it is an audio file recorded in a particular format (WAV, MP3, WMA, MIDI, etc.). When the link is clicked, the browser opens the appropriate audio software for that format in a separate window, with which the user can control the playback.
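
As a minimal sketch of the kind of linked file and identification described above, the following HTML fragment links the rain.wav file with a relative link and leaves room for the speaker, software and format next to it. The bracketed items are placeholders to be filled in, and the wording of the caption is only an illustration, not a required form.

```html
<!-- Relative link: rain.wav sits in the same web directory as this page -->
<p>
  <a href="rain.wav">'Rain in Spain'</a>,
  recorded by [speaker's name]
  using [recording software] and [microphone]
  (WAV format).
</p>
```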

The filesize of the linked WAV version of 'Rain in Spain' above is 429KB. Compare this to the same file converted to MP3 format as embedded in the table above, which has a filesize of 79KB (less than one fifth of the WAV size). Is there any audible difference in quality? Generally for web files (particularly for speech files), MP3 would be the preferred format, as one gets a considerably smaller and faster-loading file with little noticeable "loss" of relevant information. However, depending on the quality level chosen for a WAV recording, it is possible there would be little difference between WAV and MP3 recordings of the same event.
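
To make the comparison concrete, the size ratio of the two files can be worked out, together with the usual rule of thumb for the size of uncompressed audio. The 10-second duration and the two quality settings are illustrative assumptions, not properties of the 'Rain in Spain' file, and the small WAV file header is ignored.

```latex
\frac{429\ \text{KB}}{79\ \text{KB}} \approx 5.4

\text{size (bytes)} \approx \text{sample rate (Hz)} \times \frac{\text{bits per sample}}{8} \times \text{channels} \times \text{duration (s)}

44100 \times 2 \times 2 \times 10 = 1\,764\,000\ \text{bytes} \approx 1.7\ \text{MB} \quad \text{(16-bit stereo, CD quality)}

11025 \times 1 \times 1 \times 10 = 110\,250\ \text{bytes} \approx 108\ \text{KB} \quad \text{(8-bit mono)}
```

A WAV file recorded at a low quality setting can therefore be much closer in size to an MP3 file than one recorded at CD quality.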

Audio files can also be coded to auto-start as soon as a new web page is loaded. Any type of audio file can be used for this purpose, and the file can be coded to play only once, or to "loop" more than once, or even infinitely (or at least until one moves to another web page). An example of such auto-start files is Max Pechstein's Dancers. This uses the coding <bgsound src="dancers.mp3"> which is put as one of the first lines of code once the "body" segment of the HTML coding has begun. The coding means you would like a "background sound" to play, and the "source" of that sound is the filename you identify. This coding will play the file only once ("dancers.mp3" is a large MP3 file, so with slower internet connections there may be a delay in its starting).

However, the simple coding given above may not work with all web browsers, since <bgsound> is a non-standard tag that originated with Internet Explorer. It should work with Internet Explorer and Firefox, but not always with other browsers, such as Google Chrome. An alternative way of coding which should work with all browsers is to use the <embed> command. An example of this is Max Pechstein's Dancers (Version 2). Here, the HTML command used is <embed src="dancers.mp3" autostart=true loop=false hidden=true>. This tells the browser to use an "embedded" audio control console to play the file, in auto-start mode ("autostart=true"), played only once ("loop=false") and with the console itself hidden from the user ("hidden=true").

One may also "play it safe" by combining both codings in the same page, leaving it to the browser to choose the one it recognizes. In this case the 'combination' coding for the two "Dancers" examples above would be:

    <EMBED src="dancers.mp3" autostart=true loop=false hidden=true>
    <NOEMBED><BGSOUND src="dancers.mp3"></NOEMBED>
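
To show where this coding sits in a page, here is a minimal sketch of a complete HTML file using the combination coding. The page title and paragraph text are placeholders, and the attribute values are quoted here, which browsers treat the same as the unquoted form.

```html
<html>
<head>
  <title>Dancers (placeholder title)</title>
</head>
<body>
  <!-- Auto-start audio goes near the top of the body -->
  <embed src="dancers.mp3" autostart="true" loop="false" hidden="true">
  <!-- Fallback for browsers that ignore <embed> but know <bgsound> -->
  <noembed><bgsound src="dancers.mp3"></noembed>

  <p>Page text goes here.</p>
</body>
</html>
```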

Differences Between Browsers in How Audio Consoles Will Display

While the coding above will work with all standard web browsers, the way the consoles will display depends on (a) which browser you are using and (b) which audio software that browser uses as its default.

The key difference is the default audio software the browser uses for particular types of audio files (WAV, MP3, WMA, MIDI, etc.). The Firefox, Safari and Google Chrome browsers by default all use QuickTime to play audio clips; Internet Explorer prefers Windows Media Player, although QuickTime or other audio players may be set as the default instead.

With Google Chrome, however, if Windows Media Player is set as the default for MP3 audio files, and there is more than one audio clip in a particular web page, the browser will become confused and play all of them simultaneously. If the computer's audio software default for MP3 files is set to QuickTime, this problem will not occur. Thus, if you experience problems similar to this, the cause may often be a conflict between your computer's default audio settings and those the browser was expecting, rather than a problem in the web page coding as such.

Below are examples of how the console above-right displayed on the same computer with identical settings using Internet Explorer 8 (left) with Windows Media Player as the default audio software, and Mozilla Firefox 3.0.4 (right), which uses Apple QuickTime as its default audio player. [The images are screenshots, in which the original captions also appear.] The IE8-Windows Media console is differently colored and thicker than the Firefox-QuickTime console, with slightly different controls. Notice also how the Firefox browser has separated the console and caption with additional space, and how the caption font displays differently. [Bear in mind that (according to the paragraph above this one), if IE8 had been set to use QuickTime as the default audio player, the consoles would display similarly to the Firefox/QuickTime example below; i.e. the default audio settings will supersede the preferences of the browser itself.]

 
(NB: The two consoles above are only images; they are not functional!)

Auto-start Files and Timed Transfer From One Page to Another

Two more examples of auto-start files are linked below. These use brief home-made speech files in WAV format, which should load and play instantly (without the possible time delay of the large MP3 file in the Dancers page(s) above). The "embed" technique of coding the auto-start was used on both of these pages.

These files also illustrate automatic timed transfer from one web page to another, which may be useful for auto-play time-sequenced web materials. In addition, the second of the two pages gives an example of coding for "phased" page entry and exit. Click here to start the sequence (which will return to this point in this file).
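
The coding of the linked pages is not reproduced here, but one common way to produce a timed transfer is a "refresh" meta tag in the "head" segment of the page. The sketch below is an assumption about how such a page might be built, not a copy of the actual pages; the filenames hello.wav and page2.html and the 8-second delay are hypothetical.

```html
<html>
<head>
  <title>Sequence, page 1</title>
  <!-- After 8 seconds, load page2.html automatically -->
  <meta http-equiv="refresh" content="8; url=page2.html">
</head>
<body>
  <!-- Short speech file that starts as soon as the page loads -->
  <embed src="hello.wav" autostart="true" loop="false" hidden="true">
  <p>First page of the sequence.</p>
</body>
</html>
```

The delay should be at least as long as the audio clip, so that the transfer does not cut off the recording.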

Special Coding Needed For MP3 Files in Google Chrome!

NB: The basic auto-start feature shown above will work with all audio filetypes in all recent versions of Internet Explorer and Firefox, but not for example with MP3 files in the Google Chrome browser (at least through version 10, February 2011). [WAV files will work properly; the problem applies only to MP3 files.]

However, the 'alternate' (<embed src= . . . ) coding above does work for Google Chrome as well. Thus this second version is recommended for autostart coding unless you are certain that those viewing your pages will have a compatible browser for the basic coding. [It may be useful to include a note in your auto-start pages suggesting what browsers will work, in case those viewing your auto-start files are using non-compatible browsers.]

Google Chrome will also not play audio consoles using MP3 files with 'standard' coding that works in IE and Firefox. However, a solution to this is to add the additional parameter type="audio/mpeg" to the auto-start or console box commands (see examples in the html source code for the MP3 'Rain in Spain' console above and the 'Millionaire's Row' MP3 file consoles below). This additional parameter will not affect the console functioning correctly in Internet Explorer or Firefox. Note also that the Google Chrome browser always uses Apple's QuickTime software to display consoles and play MP3 audio (no matter how your default audio settings have been set otherwise).
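
A minimal sketch of a console with the extra parameter might look like this. The filename rain.mp3 is an assumption (the actual coding can be seen in this page's source code); the height and width follow the console examples on this page.

```html
<!-- type="audio/mpeg" tells the browser that the file is MP3 audio -->
<embed src="rain.mp3" type="audio/mpeg" autostart="false" loop="false"
       height="40" width="300">
```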

Editing an Existing Audio File and Using Visible Embedded Consoles

Below are two illustrations of editing existing audio files. Both give an original file and an "edited" extract, with an explanation of what has been done. This also illustrates how the PK5 exam option could be done (although clickable links may also be used for the exam requirement in place of embedded consoles).

The first example edits an extract from the Rev. Dr. Martin Luther King Jr.'s 28 August 1963 "I Have a Dream" speech, delivered on the steps of the Lincoln Memorial in Washington, D.C. (source: American Rhetoric: Top 100 Speeches; see the Wikipedia entry for background on the speech). The "edited" version has been shortened further to isolate the "I have a dream" sentences by removing the parts of the speech before and after these.

 

On the left above is an audio "console" on which you can click to hear the "original extract."
The console to the right will produce the edited shorter extract.
Click each console in turn to activate it, and then use the controls as desired. Both files are in WAV format.

While the same examples could also have been given simply as clickable links (as with the 'rain in Spain' example toward the top of the page), using embedded control consoles allows the page user to pause the file, adjust the volume, fast-forward or -reverse, or jump to the beginning or end of the file, as well as providing a 'progress bar' of the file length. The same controls should be available from your media player in the separate window opened via clickable links, but embedded console controls allow the reader of the page to stay within the same window. Further detail on embedding consoles will follow later in this file.

The second example below uses two extracts from a 5-minute MP3 audio tour demo file of the "Millionaire's Row" neighborhood in the New York City borough of Manhattan, originally from the Tourcaster website. Say that you want to use the tour guide's voice as an example of "New York dialect" in a paper on American regional speech. The demo file begins with a Tourcaster promotional blurb which is not useful for your paper. Likewise, 5 minutes is too long; all you need is a brief coherent segment to illustrate the distinctive tonality and other speech characteristics of the tour guide's voice. The following illustrates how this might be done.

 

The left console will play a 1 minute 50 second extract of the "original" 5-minute demo in MP3 format.
The console to the right will play a 24.5-second edited version.
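
For the exam, the same pairing of 'original' and 'edited' files could also be given as clickable links with a short description, as sketched here. The filenames tour_original.mp3 and tour_edited.mp3 are hypothetical, the description is only an example of the level of detail needed, and the editing software is left as a placeholder.

```html
<p>
  <a href="tour_original.mp3">Original extract</a> (MP3, 1 min 50 s) |
  <a href="tour_edited.mp3">Edited extract</a> (MP3, 24.5 s)
</p>
<p>
  Editing: the file was shortened to one coherent segment of the
  tour guide's speech, using [name of software].
</p>
```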

Further Detail on the <EMBED> Tag

To summarize, the "embed" (<embed>) tag allows you to insert audio into your Web page using an audio "console" or "control box" so page users can control playback of the audio. Examples of audio consoles are given above. The "embed" command is supported by the newer versions of all major browsers. It may be used either for invisible playing of auto-start files, or to produce visible user-controllable consoles.

A. Coding Options for Embedded Auto-start Files

The example given above to embed the auto-starting "dancers.mp3" was:
<embed src="dancers.mp3" autostart=true loop=false hidden=true>
Based on this example, other coding options could include:
  • src=fileURL. In this case the fileURL is "dancers.mp3", which is in the same directory as the "dancers" web page, so only a brief 'relative' hyperlink is given. If the audio file were in a different directory, or different website altogether, you would need a specific URL for both the location and filename of your audio file.

  • loop=false. "Loop" options tell the browser how many times to play the sound file. False has it play just once. True has it loop infinitely. The numbers 2,3,4, etc. would tell the sound to loop a specified number of times. Do not use the values 0 or 1 as different browsers do not use these values in the same way. Loop="-1" would also play the sound file infinitely.

  • autostart=true. "True" starts the sound playing automatically when the page loads. If you leave out the autostart switch, or else insert "autostart=false", the sound will start playing only when the reader uses the sound controls to start it playing.
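
The options in this list can be combined in different ways. The lines below are sketches using the same dancers.mp3 file; only one of them would be used in a given page. The www.example.com address in the last line is a made-up placeholder for a file kept on a different website.

```html
<!-- Play once, automatically, with no visible console -->
<embed src="dancers.mp3" autostart="true" loop="false" hidden="true">

<!-- Play three times in a row, automatically -->
<embed src="dancers.mp3" autostart="true" loop="3" hidden="true">

<!-- Loop until the reader leaves the page -->
<embed src="dancers.mp3" autostart="true" loop="true" hidden="true">

<!-- Audio file on another site: a full URL is needed (placeholder address) -->
<embed src="http://www.example.com/audio/dancers.mp3" autostart="true"
       loop="false" hidden="true">
```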

B. Coding Options for Visible Embedded Audio Consoles

  • controls=console. To insert an audio console into your Web page, you may use the "controls=" option. This embeds the visible audio player (or console). Readers can stop, pause, reverse, rewind, and re-play the sound depending on how many controls you give your audio console.

    Generally this option is not needed, as the "embed" tag itself will produce the visible console (as long as the "hidden=true" switch is not used). However, if you are using a browser other than Internet Explorer and that browser does not show your audio consoles, try this option to see if it is the solution. Always test audio consoles on different browsers if you are aiming your audio pages at a wider audience. Different browsers display consoles differently, and some [mainly older versions of] browsers won't display them at all.

  • height and width. Height=XX defines the height of the console, in pixels, as you would like it to appear. Width=XX in turn defines the width of the console as you would like it to appear. The console examples given above on this page use a height of 40 and width of 300, which (at least in 800x600 resolution) displays all usable elements of the console but no more. Experiment with other numbers to get a result that works for your own pages.

  • frameborder. Frameborder=yes should place a border around your console and frameborder=no will remove the border from the console window. Depending on the browser and version, there may not be any difference between "yes" and "no" or using the "frameborder" command at all. Likewise a "border=x" (where "x" represents a number) will work with some browsers to produce a more pronounced border.

  • Align=. The "align" attribute determines where the console is located. Options include the following:
    1. LEFT: The console will appear to the left of the text which follows (or on the left side of a separate line).
    2. RIGHT: The console will appear to the right of the text which follows (or on the right side of a separate line).
    3. CENTER: This aligns the console(s) in the center of the page or line. The examples above have two consoles each on a separate line, with each of the consoles coded to "align=center" and a non-breaking space added between the two.
    4. BASELINE: This aligns the console along the bottom edge of the text.
    5. ABSMIDDLE: This places the console at the absolute center of the text.

  • Volume. The "volume=x" (for example "volume=100") tells some browsers to start playback of the audio clip at either full available volume or a lesser percentage. Generally it is not necessary, as there is a volume control on the console. Experiment with this if desired.
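
Putting several of these options together, a pair of visible consoles like the ones on this page might be coded roughly as follows. This is a sketch, not the page's actual source: the filenames original.wav and edited.wav are hypothetical, and not every browser honours every attribute.

```html
<!-- Two consoles side by side, separated by a non-breaking space -->
<embed src="original.wav" autostart="false" loop="false"
       height="40" width="300" align="center" volume="100">
&nbsp;
<embed src="edited.wav" autostart="false" loop="false"
       height="40" width="300" align="center" volume="100">
```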



Last Updated 24 October 2011