Apply sign extension when decoding int #732
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This PR fixes a regression introduced in #280.
During the removal of the
byteorder
crate all conversions betweenBuf
/BufMut
and primitive integer types were replaced with std implementations (from_be_bytes
,from_le_bytes
,to_be_bytes
,to_le_bytes
etc.). When dealing with variable length signed integers (get_int
andget_int_le
), the representation of the integer was simply padded with0
bytes and then interpreted asi64
. This caused the bug.Let's take
val = -42i64
withnbytes = 2
, assume little-endian and see the order of operations of the current implementation:i64
input0xffffffffffffffd6
i64
input ->[u8; 8]
\xd6\xff\xff\xff\xff\xff\xff\xff
[u8; 8]
-> truncated&[u8]
\xd6\xff
&[u8]
-> decodedi64
(by appending0
)0xffd6
(65494)i64
converted back to&[u8]
just to show the mistake\xd6\xff\x00\x00\x00\x00\x00\x00
<- see how we have\x00
instead of\0xff
The decoded value is incorrect. To get the correct value, Sign extension must be applied to the bits representing the integer. The new operations will be:
i64
input0xffffffffffffffd6
i64
input ->[u8; 8]
\xd6\xff\xff\xff\xff\xff\xff\xff
[u8; 8]
-> truncated&[u8]
\xd6\xff
&[u8]
->u64
0xffd6
u64 << 48
0xffd6000000000000
(u64 << 48) as i64 >> 48
0xffffffffffffffd6
(-42) <-- it matches the input valueInspired by the implementation in
byteorder
: https://github.com/BurntSushi/byteorder/blob/18f32ca3a41c9823138e782752bc439e99ef7ec8/src/lib.rs#L87-L91I also found it interesting to see the difference in the resulting assembly: https://rust.godbolt.org/z/vfWdjdc3b
Fixes #730