-

@ mleku
2025-04-26 20:02:11
i have spent my afternoon tonight working on making a variable length integer encoding
the standard zig-zag varint scheme uses the 8th bit to indicate a further byte is part of the number, and it goes backwards, which confuses me
it took me ages to figure it out, but i have a similar encoding that instead of the extra bit being the sign of continue, instead it means the end, and the number sequence is reversed
so each byte in the buffer represents the smallest to largest places of what is effectively a base 128 encoding
when it finds a number that is bigger than that, with the 8th bit set, it removes the 8th bit and then stops and the sum of each base 128 digit multiplied by the power of the place, produces the result
i'm not sure why the standard method is done the exact opposite because it seems to me like this is simpler
to encode, i just AND the last 7 bits of each byte, then divide the number by 128, and once that gives a zero, the result of the last operation has 128 added
to decode, i first put the first byte value in the return value, and then each subsequent value i multiply by 128 (bitshift left by 7 bits) and then add that to the result, and when it finds a value over 128, it subtracts 128 and does the same operation and terminates
gotta formalize it into an actual algorithm now, but it's mostly written
i'm obsessed with encoding data and i wanted an encoding that can be read one byte at a time to scan the length values, without needing to have a second buffer to decode the variable integer numbers used for the length prefixes of fields
this is how i enjoy myself. fuck you, math is awesome.