Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flac: add support for LPC encoding #66

Merged
merged 15 commits into from
Oct 31, 2023
Merged

flac: add support for LPC encoding #66

merged 15 commits into from
Oct 31, 2023

Conversation

mewmew
Copy link
Member

@mewmew mewmew commented Oct 29, 2023

Fixes #35

Note: this is a preliminary commit which stores
residuals during decoding to facilitate round-trip
encoding.

Future commits will make this optional to reduce
memory usage when only decoding is required.
This is a follow-up commit of 1649e72,
reducing the memory footprint required for round-trip
encoding of audio samples (by avoiding to store
residuals when decoding).
Also, fix handling of wasted bits when encoding.
@mewmew
Copy link
Member Author

mewmew commented Oct 29, 2023

This PR adds support for Fixed and FIR LPC encoding of audio samples.

A set of round-trip test cases are used to ensure that decoding/encoding result in bit-by-bit identical FLAC files.

While many test cases pass, there are also a lot of test cases that produce diffs, and these should be investigated before merging this PR.

go test -v -run TestEncode

Furthermore, there are two TODOs that should be discussed to ensure that the act of encoding audio samples is non-destructive (i.e. does not modify frame.Subframes[i].Samples).

@mewmew
Copy link
Member Author

mewmew commented Oct 29, 2023

The fq tool by @wader has been invaluable to investigate the cause of diffs between input (decoding) and output (encoding) FLAC files.

To investigate a mismatch between input FLAC files and corresponding FLAC files produced by encode, do as follows.

  1. uncomment the ioutil.WriteFile lines in TestEncode to output FLAC file (to /tmp/got.flac) when the contents of the round-trip test case mismatch.
  2. uncomment a test case that is currently failing (e.g. testdata/59996.flac)
  3. run go test -v -run TestEncode/testdata/59996.flac
=== RUN   TestEncode/testdata/59996.flac
    enc_test.go:206: "testdata/59996.flac": content mismatch;
    expected 66 4C 61 ... 7C A2 21 22 3B ... 80 32 2A,
    got      66 4C 61 ... 7C 62 21 12 3B ... 80 32 2A
                             ^     ^
                             |     |
                             |     |
                            DIFF  DIFF
--- FAIL: TestEncode (0.02s)
    --- FAIL: TestEncode/testdata/59996.flac (0.02s)
  1. use fq to analyze the diff
    • 4a: analyze input FILE (source of truth): fq d -d flac testdata/59996.flac
    • 4b: analyze output FILE (produced by encode): fq d -d flac /tmp/got.flac

binary diff (using vbindiff) between input and output FLAC files

screenshot_2023-10-29_11:10:27

fq analysis of input FILE (source of truth)

fq d -d flac testdata/59996.flac

screenshot_2023-10-29_11:14:40

fq analysis of output FILE (produce by mewkiz/flac encoding)

fq d -d flac /tmp/got.flac

screenshot_2023-10-29_11:14:50

@mewmew mewmew marked this pull request as draft October 29, 2023 10:16
@wader
Copy link

wader commented Oct 29, 2023

🥳 happy to hear that!

Some additional tricks i usually use with fq:

  • Do "structural" diff. Will output a JSON structure with differences: fq -n 'diff(input; input)' a b. Does not work great when arrays are diffing.
  • Use same query for two or more files fq 'first(grep_by(.type=="stts"))' a.mp4 b.mp4
  • Combined fq -n 'def f: first(grep_by(.type=="stts")); diff(input|f; input|f)' a.mp4 b.mp4
  • dd is nice if you want to see all bytes, ddv if you want to see more start/stop offset details

BTW the flac decoder in fq does sample decoding to be able to verify md5 but does not expose the sampels in the decode tree as it would use a bit too much memory for most flac files. But i could probably add an option to do if it would be useful?

Also nice to see more encoding support! i've thought learning more about how that is done but all the LPC stuff feels a bit overwhelming. Will surely be following this PR with great interest!

@mewmew
Copy link
Member Author

mewmew commented Oct 29, 2023

BTW the flac decoder in fq does sample decoding to be able to verify md5 but does not expose the sampels in the decode tree as it would use a bit too much memory for most flac files. But i could probably add an option to do if it would be useful?

Having an experimental branch of fq that outputs samples (and residuals if possible) would definitely be useful for trouble-shooting : )

@mewmew
Copy link
Member Author

mewmew commented Oct 29, 2023

--input (orig):    testdata/59996.flac
++output (encode): /tmp/got.flac
 residual: (0b-0000000000000000011111100101101) -16173
 residual: (0b-0000000000000000011101101010011) -15187
 residual: (0b00000000000000000111101011001001) 31433
-residual: (0b-0000000000000000001110111110011) -7667
+residual: (0b-0000000000000000001110111110010) -7666
-residual: (0b00000000000000000110001000010010) 25106
+residual: (0b00000000000000000110001000010001) 25105
-residual: (0b00000000000000000010001110110100) 9140
+residual: (0b00000000000000000010001110110101) 9141
-residual: (0b-0000000000000000110000000101110) -24622
+residual: (0b-0000000000000000110000000110000) -24624
-residual: (0b00000000000000000001000011000000) 4288
+residual: (0b00000000000000000001000011000010) 4290
-residual: (0b00000000000000000110101101001001) 27465
+residual: (0b00000000000000000110101101000110) 27462
-residual: (0b-0000000000000000111100011100001) -30945
+residual: (0b-0000000000000000111100011011110) -30942
-residual: (0b00000000000000000111111110100011) 32675
+residual: (0b00000000000000000111111110100000) 32672
-residual: (0b-0000000000000000101000111001011) -20939
+residual: (0b-0000000000000000101000111001001) -20937
-residual: (0b-0000000000000000011011001011101) -13917
+residual: (0b-0000000000000000011011001011110) -13918
 residual: (0b00000000000000001000100101110011) 35187
 residual: (0b-0000000000000001101011111001001) -55241

There seem to be a small diff in the residual between the original input FLAC file and the encoded output FLAC file.

Still not sure why. Anyone who may have an idea? : )

@wader
Copy link

wader commented Oct 30, 2023

@mewmew Did a quick PR wader/fq#802 run with -o sample_details=true. This also reminded me that maybe some usages of "sample" in the current output should actually be "residual"?

@wader
Copy link

wader commented Oct 30, 2023

Still not sure why. Anyone who may have an idea? : )

What residual encoding mode is this? rice parameter or escape? some interesting diff in high/low or before/after zigzag? all from same partition?

@mewmew
Copy link
Member Author

mewmew commented Oct 30, 2023

What residual encoding mode is this? rice parameter or escape? some interesting diff in high/low or before/after zigzag? all from same partition?

The residuals are encoded using using rice parameter. There could for sure be issues in the high/low and/or zigzagging parts of encoding. Another aspect I think may be at fault (I think there may be more than one that's slightly broken) is the inter-channel decorrelation of audio samples.

I think it should be:

mid := int32((int64(l) + int64(r)) / 2)

instead of what is currently used for inter-channel decorrelation:

mid := int32(math.Round((float64(l) + float64(r)) / 2.0)) // rounded up.

In other words, mid should be truncated, at least according to the docs for correlation.


@wader har du discord? : )

@wader
Copy link

wader commented Oct 30, 2023

What residual encoding mode is this? rice parameter or escape? some interesting diff in high/low or before/after zigzag? all from same partition?

The residuals are encoded using using rice parameter. There could for sure be issues in the high/low and/or zigzagging parts of encoding. Another aspect I think may be at fault (I think there may be more than one that's slightly broken) is the inter-channel decorrelation of audio samples.

I think it should be:

mid := int32((int64(l) + int64(r)) / 2)

instead of what is currently used for inter-channel decorrelation:

mid := int32(math.Round((float64(l) + float64(r)) / 2.0)) // rounded up.

Aha i see, yeah float seems fishy. The div ends up with a float with a representation on wrong "side" when rounding up or so?

In other words, mid should be truncated, at least according to the docs for correlation.

Nice explanation in the comment, remember thinking about that shift/truncate and why it works.

@wader har du discord? : )

Yeap men har dålig koll på hur de fungerar, jag tror mitt username är "wader1". Hänger mest i diverse jqlang kanaler :)

@MarkKremer
Copy link
Collaborator

MarkKremer commented Oct 31, 2023

For 44 - 8-channel surround, 192kHz, 24 bit, using only 32nd order predictors.flac, WriteUnary receives numbers above 255 but the current implementation doesn't support it because the n parameter of WriteBits is of type uint8. Possible fix:

func WriteUnary(bw *bitio.Writer, x uint64) error {
	for ; x > 8; x -= 8 {
		if err := bw.WriteByte(0x0); err != nil {
			return err
		}
	}

	bits := uint64(1)
	n := byte(x + 1)
	if err := bw.WriteBits(bits, n); err != nil {
		return err
	}
	return nil
}

Edit: see PR below.

MarkKremer and others added 2 commits October 31, 2023 23:54
* Support writing Unaries above 255

* Improve unary test error messages

* Update comment on "44 - ...flac" test
Run `goimports -w` to sort imports and change order of
"expected xx, got xx" to match other test cases
(e.g. TestZigZag).
@mewmew mewmew marked this pull request as ready for review October 31, 2023 23:01
@mewmew
Copy link
Member Author

mewmew commented Oct 31, 2023

With the Unary fix in #68 by @MarkKremer, this PR is now ready to be merged.

Very happy that we finally get support for LPC encoding. The original issue (#35) was created about 5 years ago : )

Cheerful regards,
Henry & Robin

@mewmew mewmew merged commit fc0e2e6 into master Oct 31, 2023
2 checks passed
@wader
Copy link

wader commented Oct 31, 2023

🥳

@mewmew
Copy link
Member Author

mewmew commented Oct 31, 2023

While this PR has been merged, I just realized there is one final TODO we need to resolve before tagging the 1.0.10 release.

@wader
Copy link

wader commented Oct 31, 2023

Looking at the code it feels like it should be possible to postpone the wasted shift to just before write of each sample?

@mewmew
Copy link
Member Author

mewmew commented Oct 31, 2023

Looking at the code it feels like it should be possible to postpone the wasted shift to just before write of each sample?

I think the main issue is that the sample is used in getLPCResiduals to compute the residual, and that is based on already having wasted those bits. We could update the code to waste the sample bits in each place where a sample is read (e.g. encodeConstantSamples, encodeVerbatimSamples, etc).

For now, I settled on the defer method. While not the prettiest, it does help us with separation of concern, so we only handle wasted bits in one place (ref: 97aacbb).

@mewmew
Copy link
Member Author

mewmew commented Oct 31, 2023

@karlek, @MarkKremer and @wader, if you feel satisfied with the current state of affairs after the merge of this PR, then we can tag the 1.0.10 release : )

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement LPC encoding of audio samples
3 participants