What's the best way to do variable-length encoding of an unsigned integer value in C#?
The actual intent is to append a variable-length encoded integer (as bytes) to a file header.
For example: the "Content-Length" HTTP header.
Can this be achieved with some changes to the logic below?
I have written some code which does that ...
A method I have used, which makes smaller values use fewer bytes, is to encode 7 bits of data + 1 bit of overhead per byte.
The encoding works only for non-negative values (zero and up), but can be modified if necessary to handle negative values as well.
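The way the encoding works is like this:
1. Take the lowest 7 bits of the value and put them in the byte you are about to output.
2. Shift the value 7 bits to the right, discarding the bits you just took.
3. If the value is still non-zero, set the high bit of the output byte.
4. Output the byte.
5. If the value is still non-zero, repeat from step 1.
To decode:
1. Start with a result of 0 and a bit position of 0.
2. Read one byte.
3. Remember whether its high bit is set, then mask that bit away.
4. OR the remaining 7 bits into the result, shifted left by the current bit position.
5. If the high bit was set, increase the bit position by 7 and repeat from step 2.
6. Otherwise, return the result.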
bit:         39    32 31    24 23    16 15     8 7      0
value:               |DDDDDDDD|CCCCCCCC|BBBBBBBB|AAAAAAAA|
encoded:    |0000DDDD|xDDDDCCC|xCCCCCBB|xBBBBBBA|xAAAAAAA|  (note, stored in reverse order)
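To make the byte layout concrete, here is a small sketch that returns the encoded bytes as an array instead of writing them to a stream. The class name `VarintTrace` and the method `Encode` are hypothetical and only used for illustration; the loop itself is the same 7-bit scheme described above.

```csharp
using System;
using System.Collections.Generic;

public static class VarintTrace
{
    // Hypothetical helper: same 7-bit scheme, but collects the bytes in a list
    // so they can be inspected.
    public static byte[] Encode(int value)
    {
        if (value < 0)
            throw new ArgumentOutOfRangeException(nameof(value));

        var bytes = new List<byte>();
        do
        {
            byte lower7bits = (byte)(value & 0x7f); // lowest 7 bits of data
            value >>= 7;                            // drop the bits just taken
            if (value > 0)
                lower7bits |= 0x80;                 // control bit: more bytes follow
            bytes.Add(lower7bits);
        } while (value > 0);
        return bytes.ToArray();
    }

    public static void Main()
    {
        // 300 = 2 * 128 + 44: first byte is 44 (0x2C) with the control bit set, then 2.
        Console.WriteLine(BitConverter.ToString(Encode(300)));       // AC-02
        Console.WriteLine(BitConverter.ToString(Encode(123456789))); // 95-9A-EF-3A
    }
}
```

The least significant group comes out first, which is why the diagram notes that the bytes are stored in reverse order.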
As you can see, the encoded value might occupy one additional byte that is only about half used, due to the overhead of the control bits. If you expand this to a 64-bit value, 63 bits of data fill nine encoded bytes completely, so a non-negative signed 64-bit value still costs only one byte of extra overhead; a value that uses all 64 bits needs a tenth byte.
Note: Since the encoding stores values one byte at a time, always in the same order, big- or little-endian systems will not change the layout of this. The least significant 7 bits are always stored first, then the next 7 bits, and so on.
Ranges and their encoded size:
0 - 127                     : 1 byte
128 - 16,383                : 2 bytes
16,384 - 2,097,151          : 3 bytes
2,097,152 - 268,435,455     : 4 bytes
268,435,456 - max-int32     : 5 bytes
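The boundaries in this table can be checked with a short sketch that counts how many 7-bit groups a value needs. `VarintSize` and `EncodedLength` are hypothetical names introduced here for illustration.

```csharp
using System;

public static class VarintSize
{
    // Hypothetical helper: number of bytes the 7-bit scheme needs
    // for a non-negative Int32 value.
    public static int EncodedLength(int value)
    {
        if (value < 0)
            throw new ArgumentOutOfRangeException(nameof(value));

        int length = 1;             // even 0 takes one byte
        while ((value >>= 7) > 0)   // one more byte per remaining 7-bit group
            length++;
        return length;
    }

    public static void Main()
    {
        // Values on either side of each boundary in the table.
        int[] samples = { 0, 127, 128, 16383, 16384, 2097151, 2097152, 268435455, 268435456, int.MaxValue };
        foreach (int v in samples)
            Console.WriteLine(v + " -> " + EncodedLength(v) + " byte(s)");
    }
}
```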
Here are C# implementations of both:
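// LINQPad-style script; outside LINQPad, add "using System;" and "using System.IO;".
void Main()
{
    // Write one encoded value to a file...
    using (FileStream stream = new FileStream(@"c:\temp\test.dat", FileMode.Create))
    using (BinaryWriter writer = new BinaryWriter(stream))
        writer.EncodeInt32(123456789);

    // ...and read it back. Dump() is LINQPad's output helper.
    using (FileStream stream = new FileStream(@"c:\temp\test.dat", FileMode.Open))
    using (BinaryReader reader = new BinaryReader(stream))
        reader.DecodeInt32().Dump();
}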
// Define other methods and classes here
public static class Extensions
{
    /// <summary>
    /// Encodes the specified <see cref="Int32"/> value with a variable number of
    /// bytes, and writes the encoded bytes to the specified writer.
    /// </summary>
    /// <param name="writer">
    /// The <see cref="BinaryWriter"/> to write the encoded value to.
    /// </param>
    /// <param name="value">
    /// The <see cref="Int32"/> value to encode and write to the <paramref name="writer"/>.
    /// </param>
    /// <exception cref="ArgumentNullException">
    /// <para><paramref name="writer"/> is <c>null</c>.</para>
    /// </exception>
    /// <exception cref="ArgumentOutOfRangeException">
    /// <para><paramref name="value"/> is less than 0.</para>
    /// </exception>
    /// <remarks>
    /// See <see cref="DecodeInt32"/> for how to decode the value back from
    /// a <see cref="BinaryReader"/>.
    /// </remarks>
    public static void EncodeInt32(this BinaryWriter writer, int value)
    {
        if (writer == null)
            throw new ArgumentNullException("writer");
        if (value < 0)
            throw new ArgumentOutOfRangeException("value", value, "value must be 0 or greater");

        do
        {
            // Take the lowest 7 bits, then drop them from the value.
            byte lower7bits = (byte)(value & 0x7f);
            value >>= 7;

            // Set the high (control) bit if more bytes follow.
            if (value > 0)
                lower7bits |= 128;
            writer.Write(lower7bits);
        } while (value > 0);
    }

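    /// <summary>
    /// Decodes a <see cref="Int32"/> value from a variable number of
    /// bytes, originally encoded with <see cref="EncodeInt32"/> from the specified reader.
    /// </summary>
    /// <param name="reader">
    /// The <see cref="BinaryReader"/> to read the encoded value from.
    /// </param>
    /// <returns>
    /// The decoded <see cref="Int32"/> value.
    /// </returns>
    /// <exception cref="ArgumentNullException">
    /// <para><paramref name="reader"/> is <c>null</c>.</para>
    /// </exception>
    /// <exception cref="FormatException">
    /// <para>The bytes read do not form a value that <see cref="EncodeInt32"/> could have written.</para>
    /// </exception>
    public static int DecodeInt32(this BinaryReader reader)
    {
        if (reader == null)
            throw new ArgumentNullException("reader");

        bool more = true;
        int value = 0;
        int shift = 0;
        while (more)
        {
            byte lower7bits = reader.ReadByte();

            // The fifth byte (shift == 28) can only carry bits 28-30 and must be the
            // last one. Without this check, C# masks the shift count to 5 bits, so
            // malformed input would silently produce a wrong value.
            if (shift == 28 && (lower7bits & 0xf8) != 0)
                throw new FormatException("The encoded value does not fit in a non-negative Int32.");

            more = (lower7bits & 128) != 0;
            value |= (lower7bits & 0x7f) << shift;
            shift += 7;
        }
        return value;
    }
}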
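Since the question asks about unsigned values, here is a minimal sketch of how the same scheme could look for `uint`. The names `UnsignedVarintExtensions`, `EncodeUInt32` and `DecodeUInt32` are hypothetical, and the range check on the fifth byte is an assumption about how malformed input should be handled rather than part of the original code.

```csharp
using System;
using System.IO;
using System.Text;

public static class UnsignedVarintExtensions
{
    // Hypothetical unsigned variant: no negative check needed,
    // and all 32 bits are available for data.
    public static void EncodeUInt32(this BinaryWriter writer, uint value)
    {
        if (writer == null)
            throw new ArgumentNullException(nameof(writer));
        do
        {
            byte lower7bits = (byte)(value & 0x7f);
            value >>= 7;
            if (value != 0)
                lower7bits |= 0x80; // more bytes follow
            writer.Write(lower7bits);
        } while (value != 0);
    }

    public static uint DecodeUInt32(this BinaryReader reader)
    {
        if (reader == null)
            throw new ArgumentNullException(nameof(reader));
        uint value = 0;
        int shift = 0;
        while (true)
        {
            byte b = reader.ReadByte();
            // Assumption: reject input that cannot fit in 32 bits. The fifth byte
            // may only carry the top 4 bits and must not have the control bit set.
            if (shift == 28 && (b & 0xf0) != 0)
                throw new FormatException("Encoded value does not fit in a UInt32.");
            value |= (uint)(b & 0x7f) << shift;
            if ((b & 0x80) == 0)
                return value;
            shift += 7;
        }
    }

    public static void Main()
    {
        using (var stream = new MemoryStream())
        {
            // leaveOpen: true keeps the MemoryStream usable after the writer is disposed.
            using (var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true))
                writer.EncodeUInt32(uint.MaxValue);
            Console.WriteLine(stream.Length);              // 5
            stream.Position = 0;
            using (var reader = new BinaryReader(stream))
                Console.WriteLine(reader.DecodeUInt32());  // 4294967295
        }
    }
}
```

With all 32 bits in use, the largest value still fits in five bytes, matching the layout in the diagram where the fifth byte holds four data bits.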