From cd5aacfe108e62b2e252bdf0e45e47434157a0a6 Mon Sep 17 00:00:00 2001 From: Jamie Hardt Date: Wed, 8 Nov 2023 08:22:22 -0800 Subject: [PATCH 01/13] Update README.md Added "Last Commit" badge and rearranged badges --- README.md | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index 24c1bd1..65dc017 100644 --- a/README.md +++ b/README.md @@ -1,7 +1,9 @@ -[![Documentation Status](https://readthedocs.org/projects/wavinfo/badge/?version=latest)](https://wavinfo.readthedocs.io/en/latest/?badge=latest) ![](https://img.shields.io/github/license/iluvcapra/wavinfo.svg) ![](https://img.shields.io/pypi/pyversions/wavinfo.svg) [![](https://img.shields.io/pypi/v/wavinfo.svg)](https://pypi.org/project/wavinfo/) ![](https://img.shields.io/pypi/wheel/wavinfo.svg) +![](https://img.shields.io/pypi/pyversions/wavinfo.svg) [![](https://img.shields.io/pypi/v/wavinfo.svg)](https://pypi.org/project/wavinfo/) ![](https://img.shields.io/pypi/wheel/wavinfo.svg) [![Lint and Test](https://github.com/iluvcapra/wavinfo/actions/workflows/python-package.yml/badge.svg)](https://github.com/iluvcapra/wavinfo/actions/workflows/python-package.yml) [![codecov](https://codecov.io/gh/iluvcapra/wavinfo/branch/master/graph/badge.svg?token=9DZQfZENYv)](https://codecov.io/gh/iluvcapra/wavinfo) +![GitHub last commit](https://img.shields.io/github/last-commit/iluvcapra/pycmx) [![Documentation Status](https://readthedocs.org/projects/wavinfo/badge/?version=latest)](https://wavinfo.readthedocs.io/en/latest/?badge=latest) ![](https://img.shields.io/github/license/iluvcapra/wavinfo.svg) + # wavinfo The `wavinfo` package allows you to probe WAVE and [RF64/WAVE files][eburf64] From 9e41d39b26d852ebe7e7de546f374b6c6b5a1504 Mon Sep 17 00:00:00 2001 From: Jamie Hardt Date: Wed, 8 Nov 2023 09:36:58 -0800 Subject: [PATCH 02/13] More info --- data/share/man/man7/wavinfo.7 | 31 ++++++++++++++++++++++++++----- 1 file changed, 26 insertions(+), 5 deletions(-) diff --git 
a/data/share/man/man7/wavinfo.7 b/data/share/man/man7/wavinfo.7 index f370925..3f45a58 100644 --- a/data/share/man/man7/wavinfo.7 +++ b/data/share/man/man7/wavinfo.7 @@ -1,4 +1,4 @@ -.TH waveinfo 7 "2023-11-07" "Jamie Hardt" "Miscellaneous Information Manuals" +.TH waveinfo 7 "2023-11-08" "Jamie Hardt" "Miscellaneous Information Manuals" .SH NAME wavinfo \- information about wave sound file metadata .\" .SH DESCRIPTION @@ -14,6 +14,7 @@ alignment and other data. May take an "extended" form, with additional data the file or if it is a compressed format. .IP data The audio data itself. PCM audio data is always stored as interleaved samples. +.SS Auxiliary WAV Chunks .IP JUNK A region of the file not currently in use. Clients sometimes add these before the @@ -38,6 +39,24 @@ four-character code identifying the form of the list, and is then followed by chunks of the standard key-length-data form, which may themselves be LISTs that themselves contain child chunks. WAVE files don't tend to have a very deep heirarchy of chunks, compared to AVI files. +.SS Extensions for Large Files +.IP RF64 +An RF64 file has affordances to hold chunks larger than four gigabytes. +RF64 is designed so that a RIFF WAVE file can be in-place upgraded to an +RF64 without having to rewrite any audio or metadata that may already be +written. An RF64 file begins with an +.I RF64 LIST +form instead of a +.I RIFF +form. This is immediately followed by the obligatory... +.IP ds64 +In RF64 files, the ds64 chunk begins the chunk list (in fact it must appear at +a fixed offset) and provides a list of 64-bit chunk sizes for any chunks in the +file that exceed four gigabytes. In an RF64 file, any chunk that exceeds the +32 bit size restriction will set its length field (after the identifier) to +.I 0xFFFFFFFF +and will write its true size into the list in +.IR ds64 . 
.SS RIFF Metadata The RIFF container format has a metadata system common to all RIFF files, WAVE being the most common at present, AVI being another very common format @@ -73,14 +92,14 @@ but hosts tend to use notes for longer text), and "length text" .I ltxt metadata records, which can give a cue a length, making it a range, and a text field that defines its own encoding. -.IP CSET +.IP cset Defines the character set for all text fields in .IR INFO , .I adtl and other RIFF-defined text fields. By default, all of the text in RIFF metadata fields is Windows Latin 1/ISO 8859-1, though as time passes many clients have simply taken to sticking UTF-8 into these fields. The -.I CSET +.I cset cannot represent UTF-8 as a valid option for text encoding, it only speaks Windows codepages, and we've never seen one in a WAVE file in any event and it's vanishingly likely an audio app would recognize one if it saw it. @@ -117,13 +136,14 @@ and encoding properties of individual channels in the WAVE file, and if the WAVE file contains object-based audio, it will also give all of the positioning and panning automation envelopes. .IP bxml -This is defined by the ITU as a gzip-compressed version of the +A gzip-compressed version of the .I axml chunk. .IP sxml -This is a hybrid binary/gzip-compressed-XML chunk that associates ADM +A hybrid binary/gzip-compressed-XML chunk that associates ADM documents with timed ranges of a WAVE file. .SS Dolby Metadata +Dolby metadata appears in Dolby Atmos Master ADM WAVE files. .IP dbmd Records hints for Dolby playback applications for downmixing, level normalization and other things. @@ -138,6 +158,7 @@ Region and cue point metadata. .IP elm1 .IP minf .IP umid +Doesn't actually hold a SMPTE UMID! 
.SH HISTORY The oldest document that defines the form of a Wave file is the .I Multimedia Programming Interface and Data Specifications 1.0 From 7bc5378304787768d5e28c9120f9ff1f584929f5 Mon Sep 17 00:00:00 2001 From: Jamie Hardt Date: Wed, 8 Nov 2023 11:59:21 -0800 Subject: [PATCH 03/13] Beginning to add references. --- data/share/man/man7/wavinfo.7 | 53 ++++++++++++++++++++++++++--------- 1 file changed, 40 insertions(+), 13 deletions(-) diff --git a/data/share/man/man7/wavinfo.7 b/data/share/man/man7/wavinfo.7 index 3f45a58..769f634 100644 --- a/data/share/man/man7/wavinfo.7 +++ b/data/share/man/man7/wavinfo.7 @@ -163,19 +163,46 @@ Doesn't actually hold a SMPTE UMID! The oldest document that defines the form of a Wave file is the .I Multimedia Programming Interface and Data Specifications 1.0 of August 1991. -.\" .SH REFERENCES -.\" .SS ESSENTIAL FILE FORMAT -.\" .TP -.\" .UR https://www.aelius.com/njh/wavemetatools/doc/riffmci.pdf -.\" Multimedia Programming Interface and Data Specifications 1.0 -.\" .UE -.\" The original definition of the -.\" .I RIFF -.\" container, the -.\" .I WAVE -.\" form, the original metadata facilites, and things like language, country and -.\" dialect enumerations. -.\" .TP +.SH REFERENCES +.SS Essential File Format +.TP +.B Multimedia Programming Interface and Data Specifications 1.0. Microsoft Corporation, 1991. +The original definition of the +.I RIFF +container, the +.I WAVE +form, the original metadata facilities (like +.IR INFO " and " cue ), +and things like language, country and +dialect enumerations. This document also contains descriptions of certain +variations on the WAVE, such as +.I LIST wavl +and compressed WAVE files that are so rare in practice as to be virtually +non-existent. +.TP +.B ITU Recommendation BS.2088-1-2019 \- Long-form file format for the international exchange of audio programme materials with metadata. ITU, 2019. +Formalized the RF64 file format, ADM carrier chunks like +.IR axml +and +.IR chna . 
+Formally supersedes the previous standard for RF64, +.BR "EBU 3306 v1" . +One oddity with this standard is it defines the file header for an extended +WAVE file to be +.IR BW64 , +but this is never seen in practice. +.TP +.B RFC 2361 \- WAVE and AVI Codec Registries. IETF Network Working Group, 1998. +Gives a throughly exhaustive list of all of the codecs that Microsoft had +assigned to vendor WAVE files as of 1998. At the time, numerous hardware +vendors, sound card and chip manufacturers, sound software developers and +others all provided their own slightly-different adaptive PCM codecs, linear +predictive compression codes, DCTs and other things, and Microsoft would issue +these vendors WAVE codec magic numbers. Almost all of these are no longer in +use, the only ones one ever encounters in the modern era are integer PCM +(0x01), floating-point PCM (0x03) and the extended format marker (0xFFFFFFFF). +There are over a hundred codecs assigned, however, a roll-call of failed +software and hardware brands. .\" .UR https://datatracker.ietf.org/doc/html/rfc2361 .\" RFC 2361 .\" .UE From f3f9f6b784a26f2eed8bdcde93513dad8766cf60 Mon Sep 17 00:00:00 2001 From: Jamie Hardt Date: Wed, 8 Nov 2023 12:23:43 -0800 Subject: [PATCH 04/13] More updates to man --- data/share/man/man7/wavinfo.7 | 90 +++++++++++++++++------------------ docs/source/references.rst | 3 ++ 2 files changed, 46 insertions(+), 47 deletions(-) diff --git a/data/share/man/man7/wavinfo.7 b/data/share/man/man7/wavinfo.7 index 769f634..3fdef83 100644 --- a/data/share/man/man7/wavinfo.7 +++ b/data/share/man/man7/wavinfo.7 @@ -14,7 +14,7 @@ alignment and other data. May take an "extended" form, with additional data the file or if it is a compressed format. .IP data The audio data itself. PCM audio data is always stored as interleaved samples. -.SS Auxiliary WAV Chunks +.SS Optional WAVE Chunks .IP JUNK A region of the file not currently in use. 
Clients sometimes add these before the @@ -39,32 +39,12 @@ four-character code identifying the form of the list, and is then followed by chunks of the standard key-length-data form, which may themselves be LISTs that themselves contain child chunks. WAVE files don't tend to have a very deep heirarchy of chunks, compared to AVI files. -.SS Extensions for Large Files -.IP RF64 -An RF64 file has affordances to hold chunks larger than four gigabytes. -RF64 is designed so that a RIFF WAVE file can be in-place upgraded to an -RF64 without having to rewrite any audio or metadata that may already be -written. An RF64 file begins with an -.I RF64 LIST -form instead of a -.I RIFF -form. This is immediately followed by the obligatory... -.IP ds64 -In RF64 files, the ds64 chunk begins the chunk list (in fact it must appear at -a fixed offset) and provides a list of 64-bit chunk sizes for any chunks in the -file that exceed four gigabytes. In an RF64 file, any chunk that exceeds the -32 bit size restriction will set its length field (after the identifier) to -.I 0xFFFFFFFF -and will write its true size into the list in -.IR ds64 . .SS RIFF Metadata The RIFF container format has a metadata system common to all RIFF files, WAVE being the most common at present, AVI being another very common format historically. -.IP INFO -A -.I LIST -form containing a flat list of chunks, each containing text metadata. The role +.IP "LIST form INFO" +A flat list of chunks, each containing text metadata. The role of the string, like "Artist", "Composer", "Comment", "Engineer" etc. are given by the four-character code: "Artist" is .IR IART , @@ -77,10 +57,8 @@ Comment is etc. .IP cue A binary list of cues, which are timed points within the audio data. 
-.IP adtl -A -.I LIST -form containing text labels +.IP "LIST form adtl" +Contains text labels .RI ( labl ) for the cues in the .I cue @@ -92,14 +70,14 @@ but hosts tend to use notes for longer text), and "length text" .I ltxt metadata records, which can give a cue a length, making it a range, and a text field that defines its own encoding. -.IP cset +.IP CSET Defines the character set for all text fields in .IR INFO , .I adtl and other RIFF-defined text fields. By default, all of the text in RIFF metadata fields is Windows Latin 1/ISO 8859-1, though as time passes many clients have simply taken to sticking UTF-8 into these fields. The -.I cset +.I CSET cannot represent UTF-8 as a valid option for text encoding, it only speaks Windows codepages, and we've never seen one in a WAVE file in any event and it's vanishingly likely an audio app would recognize one if it saw it. @@ -136,14 +114,14 @@ and encoding properties of individual channels in the WAVE file, and if the WAVE file contains object-based audio, it will also give all of the positioning and panning automation envelopes. .IP bxml -A gzip-compressed version of the +This is defined by the ITU as a gzip-compressed version of the .I axml chunk. .IP sxml -A hybrid binary/gzip-compressed-XML chunk that associates ADM +This is a hybrid binary/gzip-compressed-XML chunk that associates ADM documents with timed ranges of a WAVE file. .SS Dolby Metadata -Dolby metadata appears in Dolby Atmos Master ADM WAVE files. +Dolby metadata is present in Dolby Atmos master ADM WAVE files. .IP dbmd Records hints for Dolby playback applications for downmixing, level normalization and other things. @@ -158,12 +136,10 @@ Region and cue point metadata. .IP elm1 .IP minf .IP umid -Doesn't actually hold a SMPTE UMID! -.SH HISTORY -The oldest document that defines the form of a Wave file is the -.I Multimedia Programming Interface and Data Specifications 1.0 -of August 1991. 
.SH REFERENCES +(Note: We're not including URLs in this list; the title and standard number +should be sufficient to find almost all of these documents. The ITU, EBU and +IETF standards documents are freely available.) .SS Essential File Format .TP .B Multimedia Programming Interface and Data Specifications 1.0. Microsoft Corporation, 1991. @@ -193,16 +169,36 @@ WAVE file to be but this is never seen in practice. .TP .B RFC 2361 \- WAVE and AVI Codec Registries. IETF Network Working Group, 1998. -Gives a throughly exhaustive list of all of the codecs that Microsoft had -assigned to vendor WAVE files as of 1998. At the time, numerous hardware -vendors, sound card and chip manufacturers, sound software developers and -others all provided their own slightly-different adaptive PCM codecs, linear -predictive compression codes, DCTs and other things, and Microsoft would issue -these vendors WAVE codec magic numbers. Almost all of these are no longer in -use, the only ones one ever encounters in the modern era are integer PCM -(0x01), floating-point PCM (0x03) and the extended format marker (0xFFFFFFFF). -There are over a hundred codecs assigned, however, a roll-call of failed -software and hardware brands. +Gives an exhaustive list of all of the codecs that Microsoft had assigned to +vendor WAVE files as of 1998. At the time, numerous hardware vendors, sound +card and chip manufacturers, sound software developers and others all provided +their own slightly-different adaptive PCM codecs, linear predictive compression +codes, DCTs and other things, and Microsoft would issue these vendors WAVE +codec magic numbers. Almost all of these are no longer in use; the only ones +one ever encounters in the modern era are integer PCM (0x01), floating-point +PCM (0x03) and the extended format marker (0xFFFE). There are over a +hundred codecs assigned, however, a roll-call of failed software and hardware +brands. 
+.SS Broadcast WAVE Format +.TP +.B EBU Tech 3285 \- Specification of the Broadcast Wave Format (BWF). EBU, 2011. +Defines the elements of a Broadcast WAVE file, the +.I bext +metadata chunk structure, allowed sample formats and other things. Over the +years the EBU has published numerous supplements covering extensions to the +format, such as embedding SMPTE UMIDs, pre-calculated loudness data (EBU Tech +3285 v2), +.I peak +waveform overview data (Suppl. 3), ADM metadata (Suppl. 5 and 7), Dolby master +metadata (Suppl. 6), and other things. +.TP +.B SMPTE 330M-2011 \- Unique Material Identifier. SMPTE, 2011. +Describes the format of the SMPTE UMID field, a 32- or 64-byte UUID used to +identify media files. Broadcast-WAVE files conforming to +.B "EBU 3285 v2" +have a SMPTE UMID embedded in the +.I bext +chunk. .\" .UR https://datatracker.ietf.org/doc/html/rfc2361 .\" RFC 2361 .\" .UE diff --git a/docs/source/references.rst b/docs/source/references.rst index 6661bf2..bf6d44f 100644 --- a/docs/source/references.rst +++ b/docs/source/references.rst @@ -1,6 +1,9 @@ References ========== +A complete list of technical references and commentary is available as man page +and is installed as wavinfo(7) when you install `wavinfo` via pip. + Wave File Format ---------------- From 75ec68f5003d21465e7bebdaf5d7f44e037bf185 Mon Sep 17 00:00:00 2001 From: Jamie Hardt Date: Wed, 8 Nov 2023 12:47:44 -0800 Subject: [PATCH 05/13] More --- data/share/man/man7/wavinfo.7 | 63 ++++++++++++++--------------------- 1 file changed, 25 insertions(+), 38 deletions(-) diff --git a/data/share/man/man7/wavinfo.7 b/data/share/man/man7/wavinfo.7 index 3fdef83..b0ab5e5 100644 --- a/data/share/man/man7/wavinfo.7 +++ b/data/share/man/man7/wavinfo.7 @@ -70,17 +70,17 @@ but hosts tend to use notes for longer text), and "length text" .I ltxt metadata records, which can give a cue a length, making it a range, and a text field that defines its own encoding. 
-.IP CSET +.IP cset Defines the character set for all text fields in .IR INFO , .I adtl and other RIFF-defined text fields. By default, all of the text in RIFF metadata fields is Windows Latin 1/ISO 8859-1, though as time passes many clients have simply taken to sticking UTF-8 into these fields. The -.I CSET +.I cset cannot represent UTF-8 as a valid option for text encoding, it only speaks -Windows codepages, and we've never seen one in a WAVE file in any event and -it's vanishingly likely an audio app would recognize one if it saw it. +Windows codepages, and we've never seen one in a WAVE file in any event, and +it's unlikely an audio app would recognize one if it saw it. .SS Broadcast-WAVE Metadata Broadcast-WAVE is a set of extensions to WAVE files to facilitate media production maintained by the EBU. @@ -194,41 +194,28 @@ metadata (Suppl. 6), and other things. .TP .B SMPTE 330M-2011 \- Unique Material Identifier. SMPTE, 2011. Describes the format of the SMPTE UMID field, a 32- or 64-byte UUID used to -identify media files. Broadcast-WAVE files conforming to +identify media files. UMIDs are usually a dumb number in their 32-byte form, +but the extended form can encode a high-precision timestamp (with options for +epoch and timescale) and geolocation information. Broadcast-WAVE files +conforming to .B "EBU 3285 v2" have a SMPTE UMID embedded in the .I bext chunk. -.\" .UR https://datatracker.ietf.org/doc/html/rfc2361 -.\" RFC 2361 -.\" .UE -.\" A large RFC compilation of all of the known (in 1998) audio encoding formats -.\" in use. 104 different codecs are documented with a name, the corresponding -.\" magic number, and a vendor contact name, phone number and address (no -.\" emails, strangely). Almost all of these are of historical interest only. 
-.\" .SS RF64/Extended WAVE Format -.\" -.\" .TP -.\" .UR https://www.itu.int/dms_pubrec/itu-r/rec/bs/R-REC-BS.2088-1-201910-I!!PDF-E.pdf -.\" ITU Recommendation BS.2088-1-2019 -.\" .UE -.\" BS.2088 gives a detailed description of the internals of an RF64 file, -.\" .I ds64 -.\" structure and all formal requirements. It also defines the use of -.\" .IR , -.\" .IR , -.\" .IR , -.\" and -.\" .I -.\" metadata chunks for the carriage of Audio Definition Model metadata. -.\" .TP -.\" .UR https://tech.ebu.ch/docs/tech/tech3306.pdf -.\" EBU Tech 3306 "RF64: An Extended File Format for Audio Data" -.\" .UE -.\" Version 1 of Tech 3306 laid out the -.\" .I RF64 -.\" extended WAVE -.\" file format almost identically to -.\" .IR BS.2088 , -.\" Version 2 of the standard wholly adopted -.\" .IR BS.2088 . +.SS Audio Definition Model +.TP +.B ITU Recommendation BS.2076-2-2019 \- Audio definition model. ITU, 2019. +Defines the Audio Definition Model, entities, relationships and properties. If +you ever had any questions about how ADM works, this is where you would start. +.SS iXML Metadata +.TP +.B iXML Specification v3.01. Gallery Software, 2021. +iXML is a standard for embedding mostly human-created metadata into WAVE files, +and mostly with an emphasis on location sound recorders used on film and +television productions. Frustratingly the developer has never published a DTD +or schema validation or strict formal standard, and encourages vendors to just +do whatever, but most of the heavily-traveled metadata fields are standardized, +for recording information like a recording's scene, take, recording notes, +circled or alt status. iXML also has a system of +.B "families" +for associating several WAVE files together into one recording. 
From bbbe947f3b1f1ad94eb72af266b0ac182a2801be Mon Sep 17 00:00:00 2001 From: Jamie Hardt Date: Wed, 8 Nov 2023 14:25:43 -0800 Subject: [PATCH 06/13] Update wavinfo.7 Introduction and description --- data/share/man/man7/wavinfo.7 | 92 ++++++++++++++++++++++++++++++++++- 1 file changed, 90 insertions(+), 2 deletions(-) diff --git a/data/share/man/man7/wavinfo.7 b/data/share/man/man7/wavinfo.7 index b0ab5e5..12a7a35 100644 --- a/data/share/man/man7/wavinfo.7 +++ b/data/share/man/man7/wavinfo.7 @@ -1,7 +1,95 @@ .TH waveinfo 7 "2023-11-08" "Jamie Hardt" "Miscellaneous Information Manuals" .SH NAME -wavinfo \- information about wave sound file metadata -.\" .SH DESCRIPTION +wavinfo \- everything you ever wanted to know about WAVE metadata but were +afraid to ask +.SH DESCRIPTION +.PP +The WAVE file format is forwards-compatible, apart from audio data it can +hold arbitrary blocks of bytes which clients will automatically ignore +unless they recognize them and know how to read them. +.PP +Without saying too much about the structure and parsing of WAVE files +themselves, a subject beyond the scope of this document, WAVE files are +divided into segments or +.BR chunks , +which a client parser can either read or skip without reading. Chunks have +an identifier, or signature: a four-character-code which informs a client +what kind of chunk it is, and a length, which gives the client enough +information to skip over the chunk and find the next chunk in the file, +in the case the client doesn't care about it or doesn't know how to read +it. +.PP +Some chunks are mandated by the Microsoft standard, specifically +.I fmt +and +.I data +in the case of PCM-encoded WAVE files. Other chunks, like +.I cue +or +.I bext +are optional, and hold metadata. Chunks can also nest inside other +chunks, a special identifier +.I LIST +is used to indicate these. A WAVE file +is a recursive list: a top level list of chunks, where chunks may contain +a list of chunks themselves. 
+.SS Order of Metadata Chunks in a WAVE File +.PP +Chunks in a WAVE file can appear in any order, and a capable parser can +accept them appearing in any order, however authorities give guidance on +where chunks should be placed, when creating a new WAVE file. +.PP +.IP 1) +For all new WAVE files, clients should always place an empty chunk, a +so-called +.I JUNK +chunk, in the first position in the top-level list of a WAVE file, and +it should be sized large enough to hold a +.I ds64 +chunk record. This will allow clients to upgrade the file to a RF64 +WAVE file +.BR in-place , +without having to re-write the file or audio data. +.IP 2) +Older authorites recommend placing metadata before the audio data, so +clients reading the file sequentially will hit it before having to seek +through the audio. This may have improved metadata read performance at one +time. +.IP 3) +Older authorities also recommend inserting +.I JUNK +before the +.I data +chunk, sized in such a way so that the first byte of the +.I data +payload lands immediately at 0x1000 (4096), because this was a common +factor of the page boundaries of many operating systems and architectures. +This could optimize the audio I/O performance in certain situations. +.IP 4) +Modern implemenations (we're looking at +.B Pro Tools +here) tend to place the Broadcast-WAVE +.I bext +metadata before the data, followed by the data itself, and then other +data after that. +.PP +Clients reading WAVE files should be tolerant and accept any configuration +of chunks, and should accept any file as long as the obligatory +.I fmt +and +.I data +chunks +are present. It's not unheard-of to see a naive implementor expect +.B only +these chunks, in this order, and to hard-code the offsets of the short +.I fmt +chunk and +.I data +chunk into their program, and this is something that should always be +checked when evaluating a new tool, just to make sure the developer +didn't do this. 
Many coding examples and WAVE file explainers from the +90s and early aughts give the basic layout of a WAVE file and naive devs +go along with it. .SH CHUNK MENAGERIE A list of chunks that you may find in a wave file from our experience. .SS Essential WAV Chunks From d04af2d194f89631347d20b2d621ef94d04f987a Mon Sep 17 00:00:00 2001 From: Jamie Hardt Date: Wed, 8 Nov 2023 15:23:40 -0800 Subject: [PATCH 07/13] Update wavinfo.7 --- data/share/man/man7/wavinfo.7 | 43 ++++++++++++++++++++++++++++++++++- 1 file changed, 42 insertions(+), 1 deletion(-) diff --git a/data/share/man/man7/wavinfo.7 b/data/share/man/man7/wavinfo.7 index 12a7a35..4d72cf8 100644 --- a/data/share/man/man7/wavinfo.7 +++ b/data/share/man/man7/wavinfo.7 @@ -79,7 +79,9 @@ of chunks, and should accept any file as long as the obligatory and .I data chunks -are present. It's not unheard-of to see a naive implementor expect +are present. +.PP +It's not unheard-of to see a naive implementor expect .B only these chunks, in this order, and to hard-code the offsets of the short .I fmt @@ -90,6 +92,45 @@ checked when evaluating a new tool, just to make sure the developer didn't do this. Many coding examples and WAVE file explainers from the 90s and early aughts give the basic layout of a WAVE file and naive devs go along with it. +.SS Encoding and Decoding Text Metadata +.PP +Modern metadata systems, anything developed since the late aughts, will +defer encoding to an XML parser so when dealing with +.I ixml +or +.I axml +so a client can mostly ignore this problem. +.PP +The most established metadata systems are older than this though, and +so the entire weight of text encoding history falls upon the client. +.PP +The original WAVE specification, a part of the Microsoft/IBM Multimedia +interface of 1991, was written at a time when Windows was an ascendant +and soon-to-be dominant desktop environment. Audio files were almost +never shared via LANs or the Internet or any other way. 
+When audio files were shared, among the miniscule number of people +who did this, it was via BBS or usenet. Users at this time may have +ripped them from CDs, but the cost of hard drives and low quality of +compressed formats at the time made this little more than a curiosity. +There was no +.I CDBaby or +.I CDDB +to download and populate metadata from at this time. +.PP +So, the +.I INFO +and +.I cue +metadata systems, which are by far the most prevalent and supported, +were published two years before the so-called "Endless September" of +1993 when the Internet became mainstream, when Unicode was still a +twinkle in the eye, and two years before Ariana Grande was born. +.PP +The safest assumption, and the mandate of the Microsoft, is that all +text metadata, by default, be encoded in Windows codepage 819, +a.k.a. ISO Latin alphabet 1, or ISO 8859-1. This covers most Western +European scripts but excludes all of Asia, Russia and most of the European +Near East, the Middle East. .SH CHUNK MENAGERIE A list of chunks that you may find in a wave file from our experience. .SS Essential WAV Chunks From d7540b0a7972183eb6daa5d89bcd3e315bda2fd9 Mon Sep 17 00:00:00 2001 From: Jamie Hardt Date: Wed, 8 Nov 2023 15:37:08 -0800 Subject: [PATCH 08/13] Update wavinfo.7 --- data/share/man/man7/wavinfo.7 | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-) diff --git a/data/share/man/man7/wavinfo.7 b/data/share/man/man7/wavinfo.7 index 4d72cf8..45f9411 100644 --- a/data/share/man/man7/wavinfo.7 +++ b/data/share/man/man7/wavinfo.7 @@ -4,7 +4,7 @@ wavinfo \- everything you ever wanted to know about WAVE metadata but were afraid to ask .SH DESCRIPTION .PP -The WAVE file format is forwards-compatible, apart from audio data it can +The WAVE file format is forwards-compatible. Apart from audio data, it can hold arbitrary blocks of bytes which clients will automatically ignore unless they recognize them and know how to read them. 
.PP @@ -14,10 +14,10 @@ divided into segments or .BR chunks , which a client parser can either read or skip without reading. Chunks have an identifier, or signature: a four-character-code which informs a client -what kind of chunk it is, and a length, which gives the client enough -information to skip over the chunk and find the next chunk in the file, -in the case the client doesn't care about it or doesn't know how to read -it. +what kind of chunk it is, and a length. Based on this information, a client +can look at the identifier and decide if it knows how to read that chunk or +if it wants to. If it doesn't, it can simply read the length and skip +past it. .PP Some chunks are mandated by the Microsoft standard, specifically .I fmt @@ -27,12 +27,12 @@ in the case of PCM-encoded WAVE files. Other chunks, like .I cue or .I bext -are optional, and hold metadata. Chunks can also nest inside other -chunks, a special identifier +are optional, and hold metadata. +.PP +Chunks can also nest inside other chunks, a special identifier .I LIST -is used to indicate these. A WAVE file -is a recursive list: a top level list of chunks, where chunks may contain -a list of chunks themselves. +is used to indicate these. A WAVE file is a recursive list: a top level +list of chunks, where chunks may contain a list of chunks themselves. .SS Order of Metadata Chunks in a WAVE File .PP Chunks in a WAVE file can appear in any order, and a capable parser can From c002120c612b54b220bc2016d8e9fbd90a85a2fc Mon Sep 17 00:00:00 2001 From: Jamie Hardt Date: Wed, 8 Nov 2023 15:42:59 -0800 Subject: [PATCH 09/13] gq gq gq --- data/share/man/man7/wavinfo.7 | 74 +++++++++++++++++------------------ 1 file changed, 36 insertions(+), 38 deletions(-) diff --git a/data/share/man/man7/wavinfo.7 b/data/share/man/man7/wavinfo.7 index 45f9411..b75c195 100644 --- a/data/share/man/man7/wavinfo.7 +++ b/data/share/man/man7/wavinfo.7 @@ -52,9 +52,9 @@ WAVE file without having to re-write the file or audio data. 
.IP 2) Older authorites recommend placing metadata before the audio data, so -clients reading the file sequentially will hit it before having to seek -through the audio. This may have improved metadata read performance at one -time. +clients reading the file sequentially will hit it before having to seek through +the audio. This may have improved metadata read performance on certain +architectures. .IP 3) Older authorities also recommend inserting .I JUNK @@ -63,18 +63,18 @@ before the chunk, sized in such a way so that the first byte of the .I data payload lands immediately at 0x1000 (4096), because this was a common -factor of the page boundaries of many operating systems and architectures. -This could optimize the audio I/O performance in certain situations. +factor of the page boundaries of many operating systems and architectures. This +may optimize the audio I/O performance in certain situations. .IP 4) Modern implemenations (we're looking at .B Pro Tools here) tend to place the Broadcast-WAVE .I bext -metadata before the data, followed by the data itself, and then other -data after that. +metadata before the data, followed by the data itself, and then other data +after that. .PP -Clients reading WAVE files should be tolerant and accept any configuration -of chunks, and should accept any file as long as the obligatory +Clients reading WAVE files should be tolerant and accept any configuration of +chunks, and should accept any file as long as the obligatory .I fmt and .I data @@ -87,32 +87,30 @@ these chunks, in this order, and to hard-code the offsets of the short .I fmt chunk and .I data -chunk into their program, and this is something that should always be -checked when evaluating a new tool, just to make sure the developer -didn't do this. Many coding examples and WAVE file explainers from the -90s and early aughts give the basic layout of a WAVE file and naive devs -go along with it. 
+chunk into their program, and this is something that should always be checked +when evaluating a new tool, just to make sure the developer didn't do this. +Many coding examples and WAVE file explainers from the 90s and early aughts +give the basic layout of a WAVE file and naive devs go along with it. .SS Encoding and Decoding Text Metadata .PP -Modern metadata systems, anything developed since the late aughts, will -defer encoding to an XML parser so when dealing with +Modern metadata systems, anything developed since the late aughts, will defer +encoding to an XML parser so when dealing with .I ixml or .I axml so a client can mostly ignore this problem. .PP -The most established metadata systems are older than this though, and -so the entire weight of text encoding history falls upon the client. +The most established metadata systems are older than this though, and so the +entire weight of text encoding history falls upon the client. .PP The original WAVE specification, a part of the Microsoft/IBM Multimedia -interface of 1991, was written at a time when Windows was an ascendant -and soon-to-be dominant desktop environment. Audio files were almost -never shared via LANs or the Internet or any other way. -When audio files were shared, among the miniscule number of people -who did this, it was via BBS or usenet. Users at this time may have -ripped them from CDs, but the cost of hard drives and low quality of -compressed formats at the time made this little more than a curiosity. -There was no +interface of 1991, was written at a time when Windows was an ascendant and +soon-to-be dominant desktop environment. Audio files were almost +never shared via LANs or the Internet or any other way. When audio files were +shared, among the miniscule number of people who did this, it was via BBS or +usenet. Users at this time may have ripped them from CDs, but the cost of hard +drives and low quality of compressed formats at the time made this little more +than a curiosity. 
There was no .I CDBaby or .I CDDB to download and populate metadata from at this time. @@ -121,25 +119,25 @@ So, the .I INFO and .I cue -metadata systems, which are by far the most prevalent and supported, -were published two years before the so-called "Endless September" of -1993 when the Internet became mainstream, when Unicode was still a -twinkle in the eye, and two years before Ariana Grande was born. +metadata systems, which are by far the most prevalent and supported, were +published two years before the so-called "Endless September" of 1993 when the +Internet became mainstream, when Unicode was still a twinkle in the eye, and +two years before Ariana Grande was born. .PP -The safest assumption, and the mandate of the Microsoft, is that all -text metadata, by default, be encoded in Windows codepage 819, -a.k.a. ISO Latin alphabet 1, or ISO 8859-1. This covers most Western -European scripts but excludes all of Asia, Russia and most of the European -Near East, the Middle East. +The safest assumption, and the mandate of the Microsoft, is that all text +metadata, by default, be encoded in Windows codepage 819, a.k.a. ISO Latin +alphabet 1, or ISO 8859-1. This covers most Western European scripts but +excludes all of Asia, Russia and most of the European Near East, the Middle +East. .SH CHUNK MENAGERIE A list of chunks that you may find in a wave file from our experience. .SS Essential WAV Chunks .IP fmt Defines the format of the audio in the .I data -chunk: the audio codec, the sample rate, bit depth, channel count, block -alignment and other data. May take an "extended" form, with additional data -(such as channel speaker assignments) if there are more than two channels in +chunk: the audio codec, the sample rate, bit depth, channel count, block +alignment and other data. May take an "extended" form, with additional data +(such as channel speaker assignments) if there are more than two channels in the file or if it is a compressed format. 
.IP data The audio data itself. PCM audio data is always stored as interleaved samples. From 99118367e9ac450650c216190ac230b51b516684 Mon Sep 17 00:00:00 2001 From: Jamie Hardt Date: Wed, 8 Nov 2023 17:07:38 -0800 Subject: [PATCH 10/13] More wavinfo elaboration --- data/share/man/man7/wavinfo.7 | 150 +++++++++++++++++++++------------- 1 file changed, 91 insertions(+), 59 deletions(-) diff --git a/data/share/man/man7/wavinfo.7 b/data/share/man/man7/wavinfo.7 index b75c195..ccf73b6 100644 --- a/data/share/man/man7/wavinfo.7 +++ b/data/share/man/man7/wavinfo.7 @@ -1,23 +1,23 @@ .TH waveinfo 7 "2023-11-08" "Jamie Hardt" "Miscellaneous Information Manuals" -.SH NAME -wavinfo \- everything you ever wanted to know about WAVE metadata but were -afraid to ask -.SH DESCRIPTION +.SH NAME +wavinfo \- WAVE file metadata +.SH SYNOPSIS +Everything you ever wated to know about WAVE metadata but were afraid to ask. +.SH DESCRIPTION .PP The WAVE file format is forwards-compatible. Apart from audio data, it can hold arbitrary blocks of bytes which clients will automatically ignore unless they recognize them and know how to read them. .PP Without saying too much about the structure and parsing of WAVE files -themselves, a subject beyond the scope of this document, WAVE files are +themselves \- a subject beyond the scope of this document \- WAVE files are divided into segments or .BR chunks , which a client parser can either read or skip without reading. Chunks have -an identifier, or signature: a four-character-code which informs a client -what kind of chunk it is, and a length. Based on this information, a client -can look at the identifier and decide if it knows how to read that chunk or -if it wants to. If it doesn't, it can simply read the length and skip -past it. +an identifier, or signature: a four-character-code that tells a client what +kind of chunk it is, and a length. 
Based on this information, a client can look +at the identifier and decide if it knows how to read that chunk and if it wants +to. If it doesn't, it can simply read the length and skip past it. .PP Some chunks are mandated by the Microsoft standard, specifically .I fmt @@ -26,8 +26,8 @@ and in the case of PCM-encoded WAVE files. Other chunks, like .I cue or -.I bext -are optional, and hold metadata. +.IR bext , +are optional, and optional chunks usually hold metadata. .PP Chunks can also nest inside other chunks, a special identifier .I LIST @@ -51,16 +51,15 @@ WAVE file .BR in-place , without having to re-write the file or audio data. .IP 2) -Older authorites recommend placing metadata before the audio data, so -clients reading the file sequentially will hit it before having to seek through -the audio. This may have improve metadata read performance on certain -architecures. +Older authorites recommend placing metadata before the audio data, so clients +reading the file sequentially will hit it before having to seek through the +audio. This may improve metadata read performance on certain architecures. .IP 3) Older authorities also recommend inserting .I JUNK before the .I data -chunk, sized in such a way so that the first byte of the +chunk, sized so that the first byte of the .I data payload lands immediately at 0x1000 (4096), because this was a common factor of the page boundaries of many operating systems and architectures. This @@ -72,63 +71,96 @@ here) tend to place the Broadcast-WAVE .I bext metadata before the data, followed by the data itself, and then other data after that. -.PP -Clients reading WAVE files should be tolerant and accept any configuration of -chunks, and should accept any file as long as the obligatory -.I fmt -and -.I data -chunks -are present. 
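
Editorial sketch (not part of the patch): the chunk model described in this man page (a four-character identifier, then a length, with unknown chunks simply skipped) can be illustrated in Python roughly as follows. The function names `iter_chunks`, `top_level_chunks` and `build_demo_wave` are hypothetical and are not the `wavinfo` API. The sketch assumes a plain little-endian RIFF/WAVE file, assumes chunks are padded to an even length, and does not handle RF64 or `ds64` sizes.

```python
import io
import struct


def iter_chunks(stream, end):
    """Yield (identifier, payload_offset, length) for each chunk before `end`."""
    while stream.tell() + 8 <= end:
        header = stream.read(8)
        if len(header) < 8:
            break  # truncated file: stop rather than guess
        ident, length = struct.unpack("<4sI", header)
        start = stream.tell()
        yield ident, start, length
        # Skip the payload without reading it. Assumption: an odd-length
        # payload is followed by one pad byte so chunks stay word-aligned.
        stream.seek(start + length + (length & 1))


def top_level_chunks(stream):
    """Walk the top-level chunks of a RIFF/WAVE stream, in file order."""
    riff, size, form = struct.unpack("<4sI4s", stream.read(12))
    if riff != b"RIFF" or form != b"WAVE":
        raise ValueError("not a RIFF/WAVE stream (RF64 is not handled here)")
    # `size` counts everything after the first 8 bytes of the file.
    yield from iter_chunks(stream, end=8 + size)


def build_demo_wave():
    """Build a tiny in-memory file with a JUNK chunk ahead of fmt and data."""
    def chunk(ident, payload):
        pad = b"\x00" if len(payload) & 1 else b""
        return ident + struct.pack("<I", len(payload)) + payload + pad

    body = (b"WAVE"
            + chunk(b"JUNK", b"\x00" * 3)
            + chunk(b"fmt ", b"\x00" * 16)
            + chunk(b"data", b"\x01\x02\x03\x04"))
    return b"RIFF" + struct.pack("<I", len(body)) + body


if __name__ == "__main__":
    for ident, offset, length in top_level_chunks(io.BytesIO(build_demo_wave())):
        print(ident, offset, length)
```

Because the walker finds `fmt ` and `data` by identifier rather than by position, it does not depend on the fixed offsets that the text warns naive implementations tend to hard-code.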
+.\" .PP +.\" Clients reading WAVE files should be tolerant and accept any configuration of +.\" chunks, and should accept any file as long as the obligatory +.\" .I fmt +.\" and +.\" .I data +.\" chunks +.\" are present. .PP It's not unheard-of to see a naive implementor expect .B only -these chunks, in this order, and to hard-code the offsets of the short +.I fmt +and +.I data +chunks, in this order, and to hard-code the offsets of the short .I fmt chunk and .I data chunk into their program, and this is something that should always be checked when evaluating a new tool, just to make sure the developer didn't do this. Many coding examples and WAVE file explainers from the 90s and early aughts -give the basic layout of a WAVE file and naive devs go along with it. +give the basic layout of a WAVE file, and naive devs go along with it. .SS Encoding and Decoding Text Metadata -.PP -Modern metadata systems, anything developed since the late aughts, will defer -encoding to an XML parser so when dealing with -.I ixml -or -.I axml -so a client can mostly ignore this problem. -.PP -The most established metadata systems are older than this though, and so the -entire weight of text encoding history falls upon the client. -.PP -The original WAVE specification, a part of the Microsoft/IBM Multimedia -interface of 1991, was written at a time when Windows was an ascendant and -soon-to-be dominant desktop environment. Audio files were almost -never shared via LANs or the Internet or any other way. When audio files were -shared, among the miniscule number of people who did this, it was via BBS or -usenet. Users at this time may have ripped them from CDs, but the cost of hard -drives and low quality of compressed formats at the time made this little more -than a curiosity. There was no -.I CDBaby or -.I CDDB -to download and populate metadata from at this time. 
-.PP -So, the -.I INFO -and -.I cue -metadata systems, which are by far the most prevalent and supported, were -published two years before the so-called "Endless September" of 1993 when the -Internet became mainstream, when Unicode was still a twinkle in the eye, and -two years before Ariana Grande was born. +.\" .PP +.\" Modern metadata systems, anything developed since the late aughts, will defer +.\" encoding to an XML parser, so when dealing with +.\" .I ixml +.\" or +.\" .I axml +.\" so a client can mostly ignore this problem. +.\" .PP +.\" The most established metadata systems are older than this though, and so the +.\" entire weight of text encoding history falls upon the client. +.\" .PP +.\" The original WAVE specification, a part of the Microsoft/IBM Multimedia +.\" interface of 1991, was written at a time when Windows was an ascendant and +.\" soon-to-be dominant desktop environment. Audio files were almost +.\" never shared via LANs or the Internet or any other way. When audio files were +.\" shared, among the miniscule number of people who did this, it was via BBS or +.\" Usenet. Users at this time may have ripped them from CDs, but the cost of hard +.\" drives and low quality of compressed formats at the time made this little more +.\" than a curiosity. There was no CDBaby or CDDB to download and populate metadata +.\" from at this time. +.\" .PP +.\" So, the +.\" .I INFO +.\" and +.\" .I cue +.\" metadata systems, which are by far the most prevalent and supported, were +.\" published two years before the so-called "Endless September" of 1993 when the +.\" Internet became mainstream, when Unicode was still a twinkle in the eye, and +.\" two years before Ariana Grande was born. .PP The safest assumption, and the mandate of the Microsoft, is that all text metadata, by default, be encoded in Windows codepage 819, a.k.a. ISO Latin alphabet 1, or ISO 8859-1. 
This covers most Western European scripts but -excludes all of Asia, Russia and most of the European Near East, the Middle +excludes all of Asia, Russia, most of the European Near East, the Middle East. +.PP +To account for this, Microsoft proposed a few conventions, none of which have +been adopted with any consistency among clients of the WAVE file standard. +.IP 1) +The RIFF standard defines a +.I cset +chunk which declares a Windows codepage for character encoding, along with a +native country code, language and dialect, which clients should use for +determining text information. We have never seen a WAVE +file with a +.I cest +chunk. +.IP 2) +Certain RIFF chunks allow the writing client to override the default encoding. +Relevant to audio files are the +.I ltxt +chunk, which encodes a country, language, dialect and codepage along with a +time range text note. We have never seen the text field on one of these +filled-out either. +.PP +Some clients in our experience simply write UTF-8 into +.IR cue , +.IR labl , +and +.I note +fields without any kind of framing. +.PP +The practical solution at this time is to assume either ISO Latin 1, Windows +CP 859 or Windows CP 1252, and allow the client or user to override this based +on its own inferences. The +.I chardet +python package may provide useable guesses for text encoding, YMMV. .SH CHUNK MENAGERIE A list of chunks that you may find in a wave file from our experience. 
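
Editorial sketch (not part of the patch): the "practical solution" in the encoding discussion above, which is to assume a default encoding and let the caller override it, might look something like this in Python. `decode_text_field` is a hypothetical helper, not a `wavinfo` function. Trying UTF-8 first, the choice of `cp1252` and `latin-1` as fallbacks, and the assumption that fields are NUL-terminated are all made here for illustration. The `chardet` package mentioned in the text is deliberately left out so the example has no third-party dependency.

```python
def decode_text_field(raw, fallbacks=("cp1252", "latin-1"), override=None):
    """Decode a text metadata field using a best-effort encoding guess."""
    # Assumption: fields are NUL-terminated, so drop the terminator and
    # anything after it.
    raw = raw.split(b"\x00", 1)[0]

    if override is not None:
        # The caller (or user) knows better than any heuristic.
        return raw.decode(override, errors="replace")

    # Some writers emit UTF-8 with no framing. Strict UTF-8 decoding rarely
    # succeeds by accident on non-ASCII legacy text, so try it first.
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        pass

    for encoding in fallbacks:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue

    # latin-1 maps every byte to a code point, so this never fails.
    return raw.decode("latin-1")


if __name__ == "__main__":
    print(decode_text_field("café".encode("utf-8")))  # UTF-8 path
    print(decode_text_field(b"caf\xe9\x00\x00"))      # legacy single-byte path
```

The result is still a guess: a field that happens to be valid in several encodings is decoded with the first one that succeeds, which is why the `override` parameter exists.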
.SS Essential WAV Chunks From f978c5cf8b73123654c5490b802a71f5e8f02d02 Mon Sep 17 00:00:00 2001 From: Jamie Hardt Date: Wed, 8 Nov 2023 18:03:38 -0800 Subject: [PATCH 11/13] Update README.md --- README.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/README.md b/README.md index 65dc017..ae42f9d 100644 --- a/README.md +++ b/README.md @@ -7,9 +7,9 @@ # wavinfo The `wavinfo` package allows you to probe WAVE and [RF64/WAVE files][eburf64] -and extract extended metadata, with an emphasis on film, video and -professional music production. - +and extract extended metadata. `wavinfo` has an emphasis on film, video and +professional music production but aspires to be the encyclopedic and final +aource for all WAVE file metadata. ## Metadata Support From a2ea978de0b4972ba1a412d5b612e01c73cc25ea Mon Sep 17 00:00:00 2001 From: Jamie Hardt Date: Wed, 8 Nov 2023 18:04:49 -0800 Subject: [PATCH 12/13] Update README.md --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index ae42f9d..a620e44 100644 --- a/README.md +++ b/README.md @@ -9,7 +9,7 @@ The `wavinfo` package allows you to probe WAVE and [RF64/WAVE files][eburf64] and extract extended metadata. `wavinfo` has an emphasis on film, video and professional music production but aspires to be the encyclopedic and final -aource for all WAVE file metadata. +source for all WAVE file metadata. ## Metadata Support From 9fee03a67b67f0b1e4b1de1f617173c4501cf2fa Mon Sep 17 00:00:00 2001 From: Jamie Hardt Date: Wed, 8 Nov 2023 18:23:46 -0800 Subject: [PATCH 13/13] Update README.md --- README.md | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-) diff --git a/README.md b/README.md index a620e44..1580312 100644 --- a/README.md +++ b/README.md @@ -30,13 +30,13 @@ source for all WAVE file metadata. * The __wav format__ is also parsed, so you can access the basic sample rate and channel count information. 
-[bext]:https://wavinfo.readthedocs.io/en/latest/scopes/bext.html -[smpte_330m2011]:https://wavinfo.readthedocs.io/en/latest/scopes/bext.html#wavinfo.wave_bext_reader.WavBextReader.umid -[adm]:https://wavinfo.readthedocs.io/en/latest/scopes/adm.html -[ebu3285s6]:https://wavinfo.readthedocs.io/en/latest/scopes/dolby.html -[ixml]:https://wavinfo.readthedocs.io/en/latest/scopes/ixml.html -[info-tags]:https://wavinfo.readthedocs.io/en/latest/scopes/info.html -[eburf64]:https://tech.ebu.ch/docs/tech/tech3306v1_1.pdf +[bext]: https://wavinfo.readthedocs.io/en/latest/scopes/bext.html +[smpte_330m2011]: https://wavinfo.readthedocs.io/en/latest/scopes/bext.html#wavinfo.wave_bext_reader.WavBextReader.umid +[adm]: https://wavinfo.readthedocs.io/en/latest/scopes/adm.html +[ebu3285s6]: https://wavinfo.readthedocs.io/en/latest/scopes/dolby.html +[ixml]: https://wavinfo.readthedocs.io/en/latest/scopes/ixml.html +[info-tags]: https://wavinfo.readthedocs.io/en/latest/scopes/info.html +[eburf64]: https://tech.ebu.ch/docs/tech/tech3306v1_1.pdf ## How To Use @@ -60,6 +60,12 @@ The package also installs a shell command: $ wavinfo test_files/A101_1.WAV ``` +## Contributions! + +Any new or different kind of metadata you find, or any +new or different use of exising metadata you encounter, please submit +an Issue or Pull Request! + ## Other Resources * For other file formats and ID3 decoding,