Digital Audio Storage
Storage and transmission explained.
Text:/ Scott Willsallen & Luis Miranda
In the previous articles on the Principles of Digital Audio we have reviewed the theory of coding audio signals, filter design characteristics, digital signal processing and digital audio transmission. In this article we will review typical digital audio storage methods as well as common methods for reducing the size of digital audio files.
DIGITAL AUDIO STORAGE
Digital audio is ubiquitous, along with the means to store it. Big technological steps have been made in audio storage since the first cylinders used in the phonograph, but the intent is the same: store an audio signal to be played back later at the user’s convenience. Digital audio storage has great advantages over typical analogue storage methods: it allows an unlimited number of copies and unlimited playback without degrading the quality of the original recorded signal. Also, thanks to digital audio storage and the advances in data networks, large numbers of digital audio files can be shared between users, limited only by the extent of the network.
It is important to note that the quality of the stored signal does not depend on the storage medium; it depends on the recording process, the quality of the A/D convertors and any file compression algorithm that might be applied to the signal. What makes one digital audio storage medium better than another is usually its storage capacity and the speed at which audio data can be retrieved.
The most common audio data storage media can be divided into magnetic storage, optical storage, and hard disk and flash memory storage. What follows is a brief review of digital data storage media. At this stage we review only the storage of data; how audio is coded as data is reviewed in a later section.
Magnetic storage is not as common as it once was, now that optical storage and hard disks have taken over. The most common medium for magnetic storage is magnetic tape. A magnetic tape storage system records digital signals onto a magnetic tape by polarising the particles in its coating. It relies on the ability of these particles to be forced to take a magnetic polarisation and to keep this polarisation once the magnetic force is removed. For digital signals only two polarised states are required. The system will usually include a recording head that polarises the tape and a reading head that retrieves the information from the polarised tape. The storage capacity of a magnetic tape is determined by the density of particles on the tape’s surface. It’s important to note that particles cannot be placed too close together or they will interact, causing demagnetisation; this weakens the magnetic force seen at the reading stage and leads to reading errors. Some of the most common digital audio tape formats from the last 20 years are: DAT, ADAT, DASH and, less commonly, the Digital Compact Cassette.
The most common optical storage media are flat discs. To retrieve information stored on an optical disc, a light is shone onto its surface and variations in the reflected light are read and transformed into binary data. Data can be stored on optical discs such that a change in the stored data causes a change in the intensity, the polarisation or the phase of the reflected light. These differences are commonly caused by pits engraved into a reflective layer on the optical disc. This reflective layer is usually enclosed between two other layers: a transparent one on the bottom of the disc and another one, transparent or opaque, on top. These two layers provide protection. Information is stored along a spiral track or concentric tracks on the disc, and the storage capacity of a disc is limited by the spot size of the light beam shone onto the disc. To maximise storage capacity a highly focused laser light is used. The most common optical storage devices are differentiated by the wavelength of the laser used. The following table shows the most common optical disc storage media along with their maximum storage capacity and laser wavelength; it is important to note that the standard disc size in this table for any of these media has a diameter of 120mm. The less common 80mm disc size is not included in this table.
*The data in this column only includes user data and does not include the data read as part of the error detection and correction methods included as standard in all these formats. The raw data read rates would be higher.
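As a rough illustration of why a shorter laser wavelength allows more data on the same 120mm disc, the sketch below compares relative spot sizes for CD, DVD and Blu-ray. It assumes the focused spot diameter is roughly proportional to the wavelength divided by the numerical aperture (NA) of the lens, and that the number of spots that fit on a disc scales with the inverse square of that diameter. The NA values and the proportionality are assumptions added here and are not taken from the article; the wavelengths are the commonly quoted nominal figures. Real capacities also depend on track pitch, modulation and error correction, so the ratios are indicative only.

```python
# Nominal laser wavelength (nm) and lens numerical aperture per format.
# The NA figures are assumptions added for this sketch.
FORMATS = {
    "CD": (780, 0.45),
    "DVD": (650, 0.60),
    "Blu-ray": (405, 0.85),
}


def relative_spot(name):
    """Spot diameter in arbitrary units, taken as wavelength / NA."""
    wavelength_nm, na = FORMATS[name]
    return wavelength_nm / na


def relative_density(name, reference="CD"):
    """How many more spots fit per unit area than on the reference format."""
    return (relative_spot(reference) / relative_spot(name)) ** 2


for fmt in FORMATS:
    print(f"{fmt}: about {relative_density(fmt):.1f}x the areal density of CD")
```

Under these assumptions the smaller spot alone accounts for only part of the capacity gain between formats; the rest comes from tighter track spacing and more efficient coding.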
The following table summarises the most common lossy audio codecs available.
HARD DISKS, SOLID STATE DRIVES & FLASH MEMORY
This type of storage has gained the most ground in recent years. Under this category we can include all computers, memory cards, USB storage devices, etc. The techniques used to store data in this manner vary widely and are beyond the scope of this article, but we can summarise that all of these devices store data electronically and in most cases this data can be erased and replaced with new data.
DIGITAL AUDIO CODING
In recording and playing back digital audio, a set of guidelines is required to dictate the way audio is stored on the chosen medium and later retrieved for playback. There are several methods used to encode digital audio; they have all been designed with specific goals in mind, including the capabilities of the storage and playback media, compression capabilities, security features, etc. We can divide digital audio coding into two overarching categories: lossy audio compression algorithms and lossless audio coding schemes.
Both coding schemes allow smaller representations of audio signals, which makes storage and transmission more efficient. The exception is when using linear PCM or Delta-Sigma Modulation, where the coding only provides framing, clock information and error detection and correction. Digital compression algorithms mainly exploit two properties of the signal, which are not mutually exclusive, to reduce the size of audio files: perceptual irrelevancies and statistical redundancies (these terms are taken from Spanias, et al., Audio Signal Processing and Coding, Wiley & Sons, Inc., Hoboken, NJ, USA, 2007). A detailed explanation of these concepts is beyond the scope of this article but a short explanation is included in the following paragraphs.
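To give a sense of the baseline that compression works against, the following sketch computes the size of uncompressed linear PCM audio from its sample rate, bit depth and channel count. The CD audio parameters (44.1kHz, 16-bit, stereo) are standard; the 128kbit/s figure is only an illustrative bit rate for a compressed stream and is an assumption, not a value from this article. Framing and error correction overhead are ignored.

```python
def pcm_bytes(sample_rate_hz, bit_depth, channels, seconds):
    """Size in bytes of raw linear PCM audio, ignoring any container overhead."""
    bits = sample_rate_hz * bit_depth * channels * seconds
    return bits // 8


def stream_bytes(bit_rate_bps, seconds):
    """Size in bytes of a constant bit rate stream."""
    return bit_rate_bps * seconds // 8


one_minute_cd = pcm_bytes(44_100, 16, 2, 60)      # 10,584,000 bytes
one_minute_lossy = stream_bytes(128_000, 60)      # 960,000 bytes (illustrative rate)

print(f"Linear PCM: {one_minute_cd / 1e6:.2f} MB per minute")
print(f"128kbit/s stream: {one_minute_lossy / 1e6:.2f} MB per minute")
print(f"Reduction: about {one_minute_cd / one_minute_lossy:.1f} to 1")
```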
‘Perceptual irrelevancies’ refers to the characteristic of the human hearing system whereby some sounds are rendered inaudible by other sounds through a combination of frequency spacing, difference in level and the threshold of hearing. This effect is known as masking, and it is calculated over frequency bands that resemble how humans hear. Bit reduction is achieved by reducing the bit depth of bands that the coding algorithm deems imperceptible. For higher compression rates more information is deemed to be perceptually irrelevant; this results in audio that still resembles the original signal but with a clear loss in quality.
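The bit reduction step on its own can be sketched as follows. This is not a perceptual codec: a real codec would split the signal into frequency bands with a filterbank and use a psychoacoustic model to decide how many bits each band deserves. Here the band samples and the bit depths are hypothetical values chosen by hand, and the function name `requantise` is made up for this example. The sketch only shows that storing a band with fewer bits means a coarser grid of values and a larger error.

```python
def requantise(samples, bits):
    """Round samples in the range -1.0 to 1.0 onto a grid with `bits` bits of resolution.

    No clipping is applied; inputs are assumed to stay inside the range.
    """
    levels = 2 ** (bits - 1)          # steps available on each side of zero
    return [round(s * levels) / levels for s in samples]


def max_error(original, quantised):
    """Largest absolute difference between two sample lists."""
    return max(abs(a - b) for a, b in zip(original, quantised))


# Hypothetical samples from one frequency band.
band = [0.1234, -0.5678, 0.25, -0.031]

fine = requantise(band, 16)     # a band judged audible keeps its resolution
coarse = requantise(band, 4)    # a band judged to be masked is stored with 4 bits

print("16-bit error:", max_error(band, fine))
print(" 4-bit error:", max_error(band, coarse))
```

The error in the 4-bit band is bounded by half a step (1/16 here); the premise of perceptual coding is that this added noise sits in a band where it is masked by louder sounds.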
‘Statistical redundancies’ refers to compression of the audio data when viewed simply as a string of values. Compression can be achieved by analysing the data and exploiting patterns that are repeated throughout the audio data string. A process called entropy coding is widely used. Entropy coding assigns a code to each value or pattern in the analysed file according to how often it occurs; reduced file sizes are achieved by assigning the shortest codes to the data strings that are repeated the most.
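The idea of giving the shortest codes to the most frequently repeated values can be sketched with Huffman coding, one well-known method of this kind. Choosing Huffman coding is an assumption made for illustration; the article does not name a specific method, and the sample values below are invented.

```python
import heapq
from collections import Counter


def huffman_code(data):
    """Return a dict mapping each distinct value in `data` to a bit string."""
    freq = Counter(data)
    if not freq:
        return {}
    if len(freq) == 1:
        # A single distinct value still needs one bit per occurrence.
        return {next(iter(freq)): "0"}
    # Heap entries are (count, tie_breaker, {value: code_so_far}).
    heap = [(n, i, {value: ""}) for i, (value, n) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        # Merge the two least frequent groups, prefixing their codes with 0 and 1.
        n1, _, codes1 = heapq.heappop(heap)
        n2, _, codes2 = heapq.heappop(heap)
        merged = {v: "0" + c for v, c in codes1.items()}
        merged.update({v: "1" + c for v, c in codes2.items()})
        heapq.heappush(heap, (n1 + n2, tie, merged))
        tie += 1
    return heap[0][2]


# Invented sample values in which 0 occurs far more often than the others.
samples = [0, 0, 0, 0, 1, 1, -1, 2]
codes = huffman_code(samples)
encoded = "".join(codes[s] for s in samples)

fixed_bits = 2 * len(samples)   # four distinct values need 2 bits each at fixed length
print(codes)
print(f"{len(encoded)} bits with Huffman codes, {fixed_bits} bits with fixed-length codes")
```

For these eight samples the most common value gets a 1-bit code and the whole string takes 14 bits instead of 16. The saving grows as the distribution of values becomes more uneven, and because no information is discarded the original values can be recovered exactly.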