More wavinfo elaboration
iluvcapra committed Nov 9, 2023
1 parent c002120 commit 9911836
Showing 1 changed file with 91 additions and 59 deletions.
150 changes: 91 additions & 59 deletions data/share/man/man7/wavinfo.7
@@ -1,23 +1,23 @@
.TH wavinfo 7 "2023-11-08" "Jamie Hardt" "Miscellaneous Information Manuals"
.SH NAME
wavinfo \- everything you ever wanted to know about WAVE metadata but were
afraid to ask
.SH DESCRIPTION
.SH NAME
wavinfo \- WAVE file metadata
.SH SYNOPSIS
Everything you ever wanted to know about WAVE metadata but were afraid to ask.
.SH DESCRIPTION
.PP
The WAVE file format is forwards-compatible. Apart from audio data, it can
hold arbitrary blocks of bytes which clients will automatically ignore
unless they recognize them and know how to read them.
.PP
Without saying too much about the structure and parsing of WAVE files
themselves, a subject beyond the scope of this document, WAVE files are
themselves \- a subject beyond the scope of this document \- WAVE files are
divided into segments or
.BR chunks ,
which a client parser can either read or skip without reading. Chunks have
an identifier, or signature: a four-character-code which informs a client
what kind of chunk it is, and a length. Based on this information, a client
can look at the identifier and decide if it knows how to read that chunk or
if it wants to. If it doesn't, it can simply read the length and skip
past it.
an identifier, or signature: a four-character-code that tells a client what
kind of chunk it is, and a length. Based on this information, a client can look
at the identifier and decide if it knows how to read that chunk and if it wants
to. If it doesn't, it can simply read the length and skip past it.
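The read-or-skip behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not the wavinfo implementation: the function name iter_chunks is hypothetical, and the sketch assumes a plain RIFF/WAVE file with 32-bit little-endian lengths and payloads padded to an even number of bytes.

```python
import io
import struct


def iter_chunks(stream):
    """Yield (chunk_id, payload_offset, length) for each top-level chunk.

    Hypothetical helper: assumes a RIFF/WAVE stream with little-endian
    32-bit chunk lengths and payloads padded to an even byte count.
    """
    header = stream.read(12)
    if len(header) < 12:
        raise ValueError("stream too short for a RIFF header")
    riff, _size, form = struct.unpack("<4sI4s", header)
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream")

    while True:
        chunk_header = stream.read(8)
        if len(chunk_header) < 8:
            break  # end of file, or a truncated trailing header
        chunk_id, length = struct.unpack("<4sI", chunk_header)
        payload_offset = stream.tell()
        yield chunk_id, payload_offset, length
        # Seek absolutely, so it does not matter whether the caller
        # read the payload or ignored it. Odd lengths carry a pad byte.
        stream.seek(payload_offset + length + (length & 1), io.SEEK_SET)
```

A caller that only understands some chunks compares each identifier against the ones it knows (for example b"fmt " or b"bext") and ignores the rest. Nothing in the loop depends on the order in which chunks appear.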
.PP
Some chunks are mandated by the Microsoft standard, specifically
.I fmt
@@ -26,8 +26,8 @@ and
in the case of PCM-encoded WAVE files. Other chunks, like
.I cue
or
.I bext
are optional, and hold metadata.
.IR bext ,
are optional, and optional chunks usually hold metadata.
.PP
Chunks can also nest inside other chunks, a special identifier
.I LIST
@@ -51,16 +51,15 @@ WAVE file
.BR in-place ,
without having to re-write the file or audio data.
.IP 2)
Older authorities recommend placing metadata before the audio data, so
clients reading the file sequentially will hit it before having to seek through
the audio. This may have improved metadata read performance on certain
architectures.
Older authorities recommend placing metadata before the audio data, so clients
reading the file sequentially will hit it before having to seek through the
audio. This may improve metadata read performance on certain architectures.
.IP 3)
Older authorities also recommend inserting
.I JUNK
before the
.I data
chunk, sized in such a way so that the first byte of the
chunk, sized so that the first byte of the
.I data
payload lands immediately at 0x1000 (4096), because this was a common
factor of the page boundaries of many operating systems and architectures. This
@@ -72,63 +71,96 @@ here) tend to place the Broadcast-WAVE
.I bext
metadata before the data, followed by the data itself, and then other data
after that.
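The JUNK padding mentioned earlier, sized so the data payload starts at 0x1000, comes down to simple arithmetic. The sketch below is an illustration only: the function name junk_payload_size and its parameter are hypothetical, and it assumes the standard 12-byte RIFF header, 8-byte chunk headers, and a JUNK chunk written immediately before the data chunk.

```python
RIFF_HEADER_SIZE = 12   # "RIFF" + 32-bit size + "WAVE"
CHUNK_HEADER_SIZE = 8   # four-character code + 32-bit length


def junk_payload_size(preceding_bytes, target=0x1000):
    """Return the JUNK payload length that puts the data payload at `target`.

    Hypothetical helper. `preceding_bytes` is the total on-disk size
    (headers plus padded payloads) of every chunk written between the
    RIFF header and the JUNK chunk.
    """
    used = (RIFF_HEADER_SIZE + preceding_bytes
            + CHUNK_HEADER_SIZE    # the JUNK chunk's own header
            + CHUNK_HEADER_SIZE)   # the data chunk's header
    size = target - used
    if size < 0:
        raise ValueError("preceding chunks already extend past the target")
    return size
```

For a file whose only preceding chunk is a 16-byte fmt chunk (24 bytes on disk), junk_payload_size(24) is 4044: the JUNK payload runs from byte 44 to byte 4088, the data header fills the next 8 bytes, and the audio begins at byte 4096.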
.PP
Clients reading WAVE files should be tolerant and accept any configuration of
chunks, and should accept any file as long as the obligatory
.I fmt
and
.I data
chunks
are present.
.\" .PP
.\" Clients reading WAVE files should be tolerant and accept any configuration of
.\" chunks, and should accept any file as long as the obligatory
.\" .I fmt
.\" and
.\" .I data
.\" chunks
.\" are present.
.PP
It's not unheard-of to see a naive implementor expect
.B only
these chunks, in this order, and to hard-code the offsets of the short
.I fmt
and
.I data
chunks, in this order, and to hard-code the offsets of the short
.I fmt
chunk and
.I data
chunk into their program, and this is something that should always be checked
when evaluating a new tool, just to make sure the developer didn't do this.
Many coding examples and WAVE file explainers from the 90s and early aughts
give the basic layout of a WAVE file and naive devs go along with it.
give the basic layout of a WAVE file, and naive devs go along with it.
.SS Encoding and Decoding Text Metadata
.PP
Modern metadata systems, anything developed since the late aughts, will defer
encoding to an XML parser so when dealing with
.I ixml
or
.I axml
so a client can mostly ignore this problem.
.PP
The most established metadata systems are older than this though, and so the
entire weight of text encoding history falls upon the client.
.PP
The original WAVE specification, a part of the Microsoft/IBM Multimedia
interface of 1991, was written at a time when Windows was an ascendant and
soon-to-be dominant desktop environment. Audio files were almost
never shared via LANs or the Internet or any other way. When audio files were
shared, among the miniscule number of people who did this, it was via BBS or
usenet. Users at this time may have ripped them from CDs, but the cost of hard
drives and low quality of compressed formats at the time made this little more
than a curiosity. There was no
.I CDBaby or
.I CDDB
to download and populate metadata from at this time.
.PP
So, the
.I INFO
and
.I cue
metadata systems, which are by far the most prevalent and supported, were
published two years before the so-called "Endless September" of 1993 when the
Internet became mainstream, when Unicode was still a twinkle in the eye, and
two years before Ariana Grande was born.
.\" .PP
.\" Modern metadata systems, anything developed since the late aughts, will defer
.\" encoding to an XML parser, so when dealing with
.\" .I ixml
.\" or
.\" .I axml
.\" so a client can mostly ignore this problem.
.\" .PP
.\" The most established metadata systems are older than this though, and so the
.\" entire weight of text encoding history falls upon the client.
.\" .PP
.\" The original WAVE specification, a part of the Microsoft/IBM Multimedia
.\" interface of 1991, was written at a time when Windows was an ascendant and
.\" soon-to-be dominant desktop environment. Audio files were almost
.\" never shared via LANs or the Internet or any other way. When audio files were
.\" shared, among the miniscule number of people who did this, it was via BBS or
.\" Usenet. Users at this time may have ripped them from CDs, but the cost of hard
.\" drives and low quality of compressed formats at the time made this little more
.\" than a curiosity. There was no CDBaby or CDDB to download and populate metadata
.\" from at this time.
.\" .PP
.\" So, the
.\" .I INFO
.\" and
.\" .I cue
.\" metadata systems, which are by far the most prevalent and supported, were
.\" published two years before the so-called "Endless September" of 1993 when the
.\" Internet became mainstream, when Unicode was still a twinkle in the eye, and
.\" two years before Ariana Grande was born.
.PP
The safest assumption, and the mandate of Microsoft, is that all text
metadata, by default, be encoded in Windows codepage 819, a.k.a. ISO Latin
alphabet 1, or ISO 8859-1. This covers most Western European scripts but
excludes all of Asia, Russia and most of the European Near East, the Middle
excludes all of Asia, Russia, most of the European Near East, and the Middle
East.
.PP
To account for this, Microsoft proposed a few conventions, none of which have
been adopted with any consistency among clients of the WAVE file standard.
.IP 1)
The RIFF standard defines a
.I cset
chunk which declares a Windows codepage for character encoding, along with a
native country code, language and dialect, which clients should use for
determining text information. We have never seen a WAVE
file with a
.I cset
chunk.
.IP 2)
Certain RIFF chunks allow the writing client to override the default encoding.
Relevant to audio files is the
.I ltxt
chunk, which encodes a country, language, dialect and codepage along with a
text note for a time range. We have never seen the text field on one of these
filled out either.
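For a client that does run into a cset chunk, reading it is straightforward. The sketch below assumes the layout given in the RIFF specification as we understand it: four little-endian 16-bit words holding the codepage, country code, language and dialect, in that order. The names CsetInfo, parse_cset and codec_for are hypothetical, and mapping a codepage number to a Python codec called "cpNNNN" is a convenience guess that will not cover every Windows codepage.

```python
import codecs
import struct
from typing import NamedTuple, Optional


class CsetInfo(NamedTuple):
    code_page: int
    country_code: int
    language: int
    dialect: int


def parse_cset(payload: bytes) -> CsetInfo:
    """Unpack a cset payload, assuming four little-endian 16-bit words."""
    if len(payload) < 8:
        raise ValueError("cset payload shorter than 8 bytes")
    return CsetInfo(*struct.unpack_from("<4H", payload))


def codec_for(code_page: int) -> Optional[str]:
    """Guess a Python codec name for a codepage number, or return None."""
    name = f"cp{code_page}"
    try:
        codecs.lookup(name)
    except LookupError:
        return None
    return name
```

If codec_for returns None, the client is back to the default encoding and whatever override the user supplies.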
.PP
Some clients in our experience simply write UTF-8 into
.IR cue ,
.IR labl ,
and
.I note
fields without any kind of framing.
.PP
The practical solution at this time is to assume either ISO Latin 1, Windows
CP 859 or Windows CP 1252, and allow the client or user to override this based
on its own inferences. The
.I chardet
Python package may provide usable guesses for text encoding; YMMV.
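One way to put this advice into practice, sketched with the standard library only. The function name decode_text is hypothetical. Trying strict UTF-8 first is our own assumption, motivated by the clients noted above that write UTF-8 without framing; the fallback codec is the caller's override point.

```python
def decode_text(raw: bytes, fallback: str = "cp1252") -> str:
    """Decode a text metadata field of unknown encoding.

    Hypothetical helper: tries strict UTF-8 first, then falls back to a
    caller-supplied single-byte codec (CP 1252 by default).
    """
    # Text fields are commonly NUL-terminated and padded; keep only
    # the bytes before the first NUL.
    raw = raw.split(b"\x00", 1)[0]
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Bytes the fallback codec leaves undefined become U+FFFD.
        return raw.decode(fallback, errors="replace")
```

Pure ASCII decodes identically under all of these encodings, so the order only matters for bytes above 0x7F. Text written in Latin 1 or CP 1252 is rarely valid UTF-8 by accident, which is why the strict attempt goes first.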
.SH CHUNK MENAGERIE
A list of chunks that, in our experience, you may find in a WAVE file.
.SS Essential WAV Chunks