Reading CD Readers
Details
These are notes for an oral presentation delivered at Code as Conversation: Transmedia Dialogues Around Critical Code Studies, Cambridge Digital Humanities/Cambridge University, Cambridge, UK (1 June 2024). Feedback or corrections welcome by email.
Abstract
The MP3 moment is well understood as a decisive episode in the history of the music industry (Sterne 2012), as it engendered radical changes in how music was distributed and consumed, and ultimately precipitated music streaming, which consigns listeners to the status of renters (Drott 2024). However, with limited exceptions (Witt 2015, Eve 2021), scant attention has been paid to the mechanisms and practices by which users reformatted physical - albeit digital - releases in the compact disc (CD) medium into audio files, so that they were amenable to later distribution via the various infamous file-sharing platforms. In this paper, I discuss the first codes that enabled bit-for-bit capture of audio data from CDs (digital audio extraction, or DAE), which circulated on the Internet during the early 1990s.
I focus on Heiko Eißfeldt’s Linux application, cdda2wav, and develop a close reading of some of the earliest publicly available sources for this tool. By placing these codes in the larger context of other subsequent developments in the CD “ripping” software ecosystem, the practice of DAE is revealed to be a complex, time-critical media operation (Ernst 2013, Marshall 2019). These sources show that DAE requires active maintenance and co-operation between users (Mulvin 2017) to remain viable in the face of not just the industry’s attempts to minimise unauthorised copying but also the recondite character of the CD format itself.
This reading also illuminates interdependencies between the source code and a panoply of technical standards that specify how mid-90s computers “talked” to the CD-ROM drives with which they were being newly put in contact. From a methodological perspective, this kind of relationality - which is perhaps typical of code that negotiates the interface between software and hardware (i.e. between something read and the thing doing the reading) - challenges the notion that source code, as the putative object of critical code studies, is somehow integral.
Talk
This talk is about compact discs and what it means to read them. Audio compact discs are an optical media format, upon which digital audio is stored as a series of microscopic impressions on a plastic substrate. This medium has been made reflective, so that when a monochromatic laser beam is shone on the disc, the variations in the intensity of the light that bounces off it can be measured and deciphered. How this came about is a matter of some interest and debate, perhaps for another time. As I have done my best to argue elsewhere, there is a need for an account of the development of the CD that goes beyond corporate history and the history of the music industry to consider the development of digital optical media as a pivotal moment in the material history of phonography, and perhaps even of writing in general.
A brief but essential explainer: there are only two levels of depth on compact discs - either a depressed “pit” or a raised “land”; this is the physical property that contributes most to the CD’s binary character. But these differences, measured in nanometers, do not uncomplicatedly stand for zeroes and ones, somehow directly mapping to the bits and bytes making up the audio. Instead, an arbitrary and novel lookup table is used to relate patterns of pits and lands to a fixed vocabulary of 8-bit words. These words are composed together in an intricate, double-layer error-correction scheme which leverages discrete mathematics to make the disc resilient to defects, dust, fingerprints, and scratches below a certain size. The combination of these two techniques in the CD, along with the contact-free laser readout system, forms the core of the intellectual property patented and jointly licensed for manufacture by Sony and Philips around 1981.
Two important aspects to note: first, the fact that audio data was digitally represented in the disc surface meant that it could easily co-exist with its metadata: in particular, information about the passage of time was interwoven between “frames” of audio, and was represented explicitly and symbolically in terms of the passage of fractions of seconds. And, crucially, the definition of the CD format means that a complete timecode is represented only once every 1/75th of a second. We return to that fact later. Second, the data represented in the CD’s surface sits on a single, continuous spiral track, not unlike a gramophone record.
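To make the granularity of this timecode concrete, here is a minimal sketch of the arithmetic involved. It assumes the standard CD audio sample rate of 44,100 Hz and the 75 timecode frames per second mentioned above; the function names are hypothetical, chosen for illustration only, and the sketch ignores any drive-level addressing offsets.

```python
# Illustrative timecode arithmetic for audio CDs.
# Assumptions: 75 timecode frames per second, 44,100 Hz sample rate.
# All names here are hypothetical, not taken from any ripping tool.

FRAMES_PER_SECOND = 75
SAMPLE_RATE_HZ = 44_100
# Number of (stereo) sample instants covered by one timecode frame.
SAMPLES_PER_FRAME = SAMPLE_RATE_HZ // FRAMES_PER_SECOND


def msf_to_frames(minutes: int, seconds: int, frames: int) -> int:
    """Convert a minutes:seconds:frames timecode to a frame count."""
    if not (0 <= seconds < 60 and 0 <= frames < FRAMES_PER_SECOND):
        raise ValueError("seconds must be 0-59 and frames 0-74")
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def frames_to_msf(total_frames: int) -> tuple[int, int, int]:
    """Convert a frame count back to minutes:seconds:frames."""
    minutes, remainder = divmod(total_frames, 60 * FRAMES_PER_SECOND)
    seconds, frames = divmod(remainder, FRAMES_PER_SECOND)
    return minutes, seconds, frames


def frames_to_sample_offset(total_frames: int) -> int:
    """Index of the first sample instant belonging to a given frame."""
    return total_frames * SAMPLES_PER_FRAME


print(SAMPLES_PER_FRAME)             # 588
print(msf_to_frames(0, 16, 0))       # 1200
print(frames_to_msf(1200))           # (0, 16, 0)
print(frames_to_sample_offset(1))    # 588
```

The point of the sketch is simply that a position on the disc can be named explicitly only in steps of one frame, each of which spans several hundred samples.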
Though CD players allow listeners to cue up tracks in arbitrary order, CD media is not a truly random-access medium: readers need to locate themselves along this spiral, and seek to the required location on disc before playback can begin - this can take up to a second. It is only the fact that CDs also include a baked-in table of contents that allows players to quickly jump to the approximate location in the spiral. Indeed, the CD’s table of contents - the CD being the first consumer digital audio format to have this feature - strengthens the case for that format’s place in the history of the book, not just as a platform for multimedia literature (which place is already well established) but also for its codicological innovation.
But back to the history of computing: Audio CDs came first; CD-ROMs - which are compact discs containing arbitrary data for use by personal computers - only appeared on the market after it was determined that the audio CD could be retooled as a kind of platform to support the inscription of increasingly unstructured information. By the early 1990s, manufacturers of personal computers advertised so-called “multimedia PCs”, which bundled in CD-ROM drives with desktop towers. This allowed consumers to play back high-quality digital audio using their computers and gave rise to the multimedia CD-ROM as a new genre of electronic publication: encyclopedias, manuals, catalogues, and interactive video games.
As Jeremy Wade Morris (2015) explains, using CD-ROM drives for the playback of audio CDs was something of an afterthought, and not all CD-ROM drives supported audio CD playback. And, crucially (from our point of view), since this was the case, it was not necessarily implemented in an entirely digital signal chain: most early CD-ROM drives included an analog output, meaning that on-board hardware decoded the pits and lands of the audio CD into usable audio, which was then piped either to a headphone jack on the front of the CD-ROM drive or onward to an internal soundcard (which - ironically - may have even converted the audio back to a digital signal).
All this leaky mediation led users to ponder if there was a better way. They knew that multimedia computers already performed a digital read-out of the discs in order to access the executable data that lay there, and they knew that the data CD-ROM format was essentially a creative abuse of the predecessor audio CD format. It stood to reason that a more direct way to transfer the audio from CDs to host computers was possible. This observation gave rise to the first generation of software that allowed users for the first time to perform what became known as digital audio extraction (DAE), or, as it is more commonly called, “ripping”.
There would be no mp3 moment without DAE, since this would appear to be the most common way that high-quality copies of music were made during the late 1990s and into the 2000s. These files were destined for clandestine circulation via IRC, on the topsites of the warez Scene, and - later on - on mp3 blogs and peer-to-peer filesharing networks. All are well accounted for in the literature. Once the RIAA in the United States had aggressively pursued peer-to-peer filesharers, they even unsuccessfully advanced legal arguments that the very act of CD ripping itself should be deemed illicit; this is a situation that remains somewhat underdetermined on this side of the Atlantic. The journalist Stephen Witt’s compelling account of the fate of Dell Glover, an optical media factory worker who was convicted for leaking almost 2,000 releases to the mp3 Scene by hiding the compact format behind his belt buckle, makes it clear that the affordances of the CD are closely connected to the history of music piracy.
Despite all of this, the role of the CD ripping hardware and software ecosystem is often taken for granted. Yet there are dozens of code sources available to support the intensive study of CD ripping software, so I will only present a couple of early examples.
Today’s focus is a Linux tool, first programmed by Heiko Eißfeldt and Olaf Kindel in early 1994, called cdda2wav. This software allowed owners of an initially very small number of CD-ROM drives to rip high-quality audio to hard disk, something most users of computers took for granted during the first decade of the 2000s. The earliest release I can identify source code for dates from October 1994, which Eißfeldt versioned as v0.2alpha; Eißfeldt seems to have taken over as lead developer in releases past v0.1. As Eißfeldt notes, his code is in fact a port of an earlier MS-DOS application written by Jim McLaughlin, distributed as CDDA.EXE (then da2wav, then cdda-reader). Eißfeldt secured the C or C++ source for McLaughlin’s code and, according to code comments, first adapted it for Linux in June 1994.
Eißfeldt’s port is notable for two reasons, however. First, as an open-source code for the Linux platform, cdda2wav could support community contributions. McLaughlin’s first code supported only one CD-ROM drive: the Toshiba 3041. From the beginning, McLaughlin developed CDDA.EXE on a closed-source basis (eventually moving to a shareware model) and gradually improved the compatibility of his software with other drives, drawing on the virtual feedback and debug logs shared by interlocutors who together owned a greater variety of models than McLaughlin could afford to acquire and test himself. Eißfeldt, on the other hand, from the beginning anticipated community contributions from more technically proficient users that would expand the compatibility of cdda2wav. This is in place even in the earliest versions of cdda2wav, before he knew precisely what features the forthcoming hardware was going to support. We can see this in an example from the code that shows the names of devices that Eißfeldt had only heard by repute might support his code: these remained to be tested, an exercise left to the reader.
Even with the drives in hand, expanding the gamut of supported drives was not technically straightforward: first, there were broadly two families of CD-ROM drives, which connected to the host PC using different approaches (i.e. ATA/IDE vs. SCSI). I say “approaches”, because these terms do not fully specify a protocol for talking to CD drives: during the early years, different drive vendors required programmers to manage data coming to and from the peripheral in different ways. In the documentation accompanying his MS-DOS ripper, McLaughlin describes frustrating phone calls with technical support staff trying to finesse specification documents from different vendors in order that he could expand the compatibility of his tool, as the number of devices on the market grew. It took until 1997, some 11 years after the first commercial CD-ROM publications in 1986, for industry to co-operate in specifying a common language for controlling optical media drives: the MMC standard. In part, this accounts for the proliferation of switch cases shown here, but also for the arcane references to “cooked” and “raw” interfaces - which refer to different states of processing in which CD data is passed to the application. In the case of the Linux operating system, the preferred interface to CD-ROM drives (and other similar peripherals), which simplified such interactions, was not fully stabilised until 2003.
The second reason to consider cdda2wav is that its code sources speak to the practical difficulty of developing and debugging the first audio CD rippers. Crucially, in the early years, there was no known-good digital reference against which to quantitatively evaluate the fidelity of the audio data extracted from a given disc. The sole acknowledgement in the documentation for cdda2wav v0.2 reads: “Thanks to Rammi for debugging and hearing 100 times the first 16 seconds of the first track of the Krupps cd bravely.” Discs that are ready-to-hand in the development environment become test media, and the developers have no choice but to resort to their ears, even if this was tedious - and seemed to affect not only the software developers, but their friends or family too. And why the first 16 seconds? This came from default rip parameters that Eißfeldt had coded into cdda2wav’s earliest versions: 16-bit depth, mono audio, with a 22,050Hz sample rate. This was a quarter of the information that the CD audio standard supported, but led to a modest 700kB file at a time when 1MB of hard drive space cost approximately $1 (not adjusted for inflation). Both code and comments elsewhere in the documentation imply that it would not be unexpected that the hard-drive would run out of space while performing a rip.
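The figures above can be checked with a few lines of arithmetic. The following is a small worked sketch, not anything drawn from cdda2wav itself: the function name is hypothetical, the full-quality parameters (44,100 Hz, 16-bit, stereo) are those of the CD audio standard, and the small header that a WAV file adds is ignored.

```python
# Worked arithmetic for the size of an uncompressed PCM rip.
# The function name is hypothetical; WAV header overhead is ignored.

def rip_size_bytes(duration_s: int, sample_rate_hz: int,
                   bit_depth: int, channels: int) -> int:
    """Size in bytes of uncompressed PCM audio with the given parameters."""
    bytes_per_sample = bit_depth // 8
    return duration_s * sample_rate_hz * bytes_per_sample * channels


# Early cdda2wav defaults as described above: 16 s, 22,050 Hz, 16-bit, mono.
default_rip = rip_size_bytes(16, 22_050, 16, 1)

# The same 16 seconds at full CD quality: 44,100 Hz, 16-bit, stereo.
full_quality = rip_size_bytes(16, 44_100, 16, 2)

print(default_rip)                 # 705600
print(full_quality)                # 2822400
print(default_rip / full_quality)  # 0.25
```

So the default rip comes to roughly 700kB, a quarter of the approximately 2.8MB that the same sixteen seconds would occupy at full quality.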
Apart from listening, another technique that could be used in testing the behavior of ripping software was the use of cryptographic checksums: these summarise the content of large files in a small footprint, such that two files that produce the same checksum are likely to be the same. This allows designers and power users of ripping software to quickly assess whether - for example - repeated rips of the same disc were identical, without storing large amounts of ripped data since the rip could be discarded after the checksum was computed. During the early years of CD ripping, and for some drives in particular, it was not the case that repeated rips were bit-wise identical, and the reason is the 1/75th-second resolution that was mentioned earlier. What this means is that every time a CD drive is instructed to move to a particular region on the disc, the laser pickup could be in the wrong place by up to 500 audio samples: just one differing sample would show up as a difference in checksum (and invalidate the rip), while two non-adjacent tranches of audio, incorrectly sutured together by the ripper, could produce pops and clicks at discontinuities in the audio data.
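A minimal sketch of this checksum comparison follows, using Python's standard hashlib module. It is an illustration under stated assumptions rather than a reconstruction of any ripper's code: MD5 is chosen only as an example of a checksum function, the function name is hypothetical, and the "rips" are synthetic bytes standing in for audio data. The digest is computed incrementally, so each chunk of the rip can be discarded once it has been read.

```python
# Comparing repeated rips by checksum, without keeping the rips themselves.
# Hypothetical sketch: synthetic bytes stand in for ripped audio data.
import hashlib
from typing import Iterable


def digest_of_rip(chunks: Iterable[bytes]) -> str:
    """Return a hex digest of a rip that is delivered chunk by chunk."""
    checksum = hashlib.md5()
    for chunk in chunks:
        checksum.update(chunk)  # the chunk can be discarded after this
    return checksum.hexdigest()


# A stand-in for the audio data on a disc.
disc_audio = bytes(range(256)) * 64

# Two rips that return the same data, though delivered in different chunks.
first_rip = digest_of_rip([disc_audio])
second_rip = digest_of_rip([disc_audio[:1000], disc_audio[1000:]])

# A third rip in which a single bit differs.
damaged = bytearray(disc_audio)
damaged[5000] ^= 0x01
third_rip = digest_of_rip([bytes(damaged)])

print(first_rip == second_rip)  # True
print(first_rip == third_rip)   # False
```

A mismatch says only that two rips differ somewhere; it does not say where, or which of the two is correct.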
By v0.8 of cdda2wav in 1996, Eißfeldt had implemented a simple technique to mitigate the effects of jitter: make multiple, overlapping reads of the data on the disc in a principled way. Where subsequent reads agree, accept this as a valid readout; where subsequent reads differ, some calculations could be done to determine how the first and second reads differ. He (or others) had also implemented track-by-track computations of the MD5 cryptographic checksum function, to facilitate authoritative rips and the comparison of the software’s behavior with different drives, interfaces, and hosts. And, by that time, other actors began to crop up in the history of cdda2wav: the THANKS document notably acknowledges the financial support of the Fraunhofer Institute for Integrated Circuits, the innovators of the mp3 format.
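The general idea of overlapping reads can be sketched as follows. This is a hypothetical illustration of the technique, not Eißfeldt's implementation: the function names are invented, samples are modelled as a list of integers, and the simulated drive lands a few samples earlier than requested. The ripper looks for the tail of the data it has already accepted inside the new read, and appends only what follows the match.

```python
# Stitching two overlapping reads together despite a positioning error.
# Hypothetical sketch; not the algorithm used by any particular ripper.

def find_overlap(tail: list[int], new_read: list[int]) -> int:
    """Return the index in new_read where tail begins, or -1 if absent."""
    n = len(tail)
    for start in range(len(new_read) - n + 1):
        if new_read[start:start + n] == tail:
            return start
    return -1


def stitch(accepted: list[int], new_read: list[int], overlap: int) -> list[int]:
    """Append new_read to accepted, aligning on the last `overlap` samples."""
    if overlap <= 0:
        raise ValueError("overlap must be positive")
    tail = accepted[-overlap:]
    position = find_overlap(tail, new_read)
    if position < 0:
        raise ValueError("no overlap found; the read should be retried")
    return accepted + new_read[position + len(tail):]


# Synthetic "disc": every sample value is unique, so matches are unambiguous.
disc = list(range(100))

first_read = disc[0:40]
# The ripper asks for data from sample 30, but the drive lands at sample 27.
second_read = disc[27:70]

# Trusting the requested position duplicates three samples at the join.
naive = first_read + second_read[10:]
print(naive == disc[0:70])                                  # False

# Aligning on the overlap recovers a continuous run of samples.
print(stitch(first_read, second_read, overlap=8) == disc[0:70])  # True
```

With real audio the match is not always unambiguous (a stretch of digital silence looks the same everywhere), which is one reason practical tools need further checks beyond this sketch.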
These versions of cdda2wav attracted the attention of another programmer who attempted to improve even further the accuracy of the rips that Eißfeldt’s tool produced. Christopher “Monty” Montgomery was a professionally active multimedia engineer who proposed a series of additions to v0.8 of cdda2wav. His changes introduce a series of heuristic checks that test for the presence of errors due to frame offset issues, scratches, and other oddities. Essentially, the code assumes a starting position of distrust toward the audio data emerging from the CD-ROM drive: it makes cdda2wav even more of a paranoid reader than it already is. As CD-ROM drives became more sophisticated, they began to use caching strategies to speed up the read-out of data from discs. These techniques made the repeated read strategy implemented by Eißfeldt and others problematic, since the cache would intercept the request for a re-read and return the same data - which, ex hypothesi, was suspicious. Further developments of Monty’s cdparanoia essentially sought to outwit these caching strategies, in a kind of arms race against the vendors - each of whom implemented this technique in different ways, each requiring different solutions.
The media archaeologist Wolfgang Ernst has discussed at some length the role of buffers in computational media. Here is an open and shut case of the adversarial aspect of contemporary commercial media technologies: closed-source vendors and open-source, free software advocates agonise over a deceptively simple question, namely, when is “now” in an operative optical media system? While this might seem a marginal concern, it’s nevertheless one that is deeply felt: in part because these frame offset errors are not the only kinds of errors that might make their way into a CD rip. Since the advent of recordable CDs in the late 1990s, audiophile listeners have pursued the question of the “bit-perfect rip”, a reformatting of the CD they own on to some other storage medium that they can guarantee matches the digital bitstream that the optical media manufacturer intended. Such an artifact is sometimes called a “secure copy” of the medium. Various kinds of timing, sampling, and quantization errors - real and imagined - may influence a user’s judgment about how faithful a digital rip of a CD is.
One solution, and one that is still widely used today in well-known software including dBpoweramp and Exact Audio Copy, is to adapt the principles of cryptographic checksums already described above. By doing so, this community has effectively leveraged the internet to build a distributed database of observations of “secure copies”, called AccurateRip. This proprietary database propagates knowledge about the properties of known-good readers and rips through the network and uses this to authenticate rips as faithful to the physical medium. As Owen Marshall (2019) has argued, this particular class of temporal errors relating to digital media - known collectively as jitter - has the capacity to escalate from accurately describing seemingly highly technical concepts to describing the character of particular social arrangements. Because AccurateRip requires the co-operation of users, the paranoia becomes mutualised to an extent, with some users even aligning their own interest in perfection with the pathological. In the introduction to his “The Art of the Rip”, from a mature period in the history of CD ripping, user Thomas writes:
Making good CD rips is hard - much harder than it should be. Especially if you are affected by that perfect mix of neurotic/paranoid/packrat/perfectionist genes that I seem to be affected by.
In retrospect, we can see the history of cdda2wav in two phases. The first phase has bookends: it begins with its first public release in 1994 (which may be v0.2alpha - this is not yet established) and ends with its integration in 1997 into a package of CD-related software called cdrtools, which was maintained by another influential programmer in the history of optical media - Joerg Schilling. By that time, cdda2wav had been substantially rewritten to make use of CD drive interface routines that Schilling had written for other tools. Opening up its second life, Montgomery - the author of the first cdparanoia patch from 1996 - would eventually rewrite the version of cdparanoia that depended on cdda2wav as a self-standing software, and released this as Paranoia III, which was actively developed into the late 2000s. This took Eißfeldt’s cdda2wav as a starting point but addressed many issues in the codebase, and, notably, was distributed primarily as a redistributable C library that was integrated into other CD ripping tools and eventually inspired the paranoid approach to reading CDs that characterises the most recent, active efforts embodied by software making use of the AccurateRip database. Even if Eißfeldt’s cdda2wav remains a distant memory in computer history, its code sources - alongside McLaughlin’s MS-DOS precursor - make clear its importance in answering many fertile and as-yet unposed questions about the relationship between time, space, and digital media and the late 20th-century problematics of optical media more generally.
Selected Bibliography
Bell, Eamonn. “Interleaving as Cultural Technique in the Audio CD and the End of Archaeophonography.” Media Theory 5, no. 1 (September 25, 2021): 115–46.
Drott, Eric. Streaming Music, Streaming Capital. Durham: Duke University Press, 2024.
Ernst, Wolfgang. Chronopoetics: The Temporal Being and Operativity of Technological Media. Media Philosophy. London ; New York: Rowman & Littlefield International, 2016.
Eve, Martin Paul. Warez: The Infrastructure and Aesthetics of Piracy. 1st ed. punctum books, 2021. https://doi.org/10.53288/0339.1.00.
Long, Kenneth. “The RIAA’s Case Against Ripping CDs: When Enough Is Enough.” Houston Business & Tax Law Journal 11 (2011): 173–202.
Marshall, Owen. “Jitter: Clocking as Audible Media.” International Journal of Communication 13, no. 0 (April 14, 2019): 17ff.
Morris, Jeremy Wade. Selling Digital Music, Formatting Culture. Oakland, CA: University of California Press, 2015.
Sterne, Jonathan. MP3: The Meaning of a Format. Sign, Storage, Transmission. Durham: Duke University Press, 2012.
Witt, Stephen. How Music Got Free: The End of an Industry, the Turn of the Century, and the Patient Zero of Piracy. New York: Viking, 2015.