I want to encode/compress some binary image data as a sequence of bits. (This sequence will, in general, have a length that does not fit neatly in a whole number of standard integer types.)
How can I do this without wasting space? (I realize that, unless the sequence of bits has a "nice" length, there will always have to be a small amount [< 1 byte] of leftover space at the very end.)
FWIW, I estimate that, at most, 3 bits will be needed per symbol that I want to encode. Does Python have any built-in tools for this kind of work?
There's nothing very convenient built in, but there are third-party modules such as bitstring and bitarray that are designed for this.
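from bitstring import BitArray
s = BitArray('0b11011')   # start with the 5 bits 11011
s += '0b100'              # append 3 more bits
s += 'uint:5=9'           # append the value 9 as a 5-bit unsigned integer
s += [0, 1, 1, 0, 1]      # append bits from a list
...
s.tobytes()               # bytes, zero-padded at the end to a whole byte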
To join together a sequence of 3-bit numbers (i.e. values in the range 0 to 7), you could use:
>>> symbols = [0, 4, 5, 3, 1, 1, 7, 6, 5, 2, 6, 2]
>>> BitArray().join(BitArray(uint=x, length=3) for x in symbols)
BitArray('0x12b27eab2')
>>> _.tobytes()
'\x12\xb2~\xab '
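If you'd rather avoid a third-party dependency, the same packing can be sketched with plain Python integers and int.to_bytes. This is only a minimal illustration, not part of bitstring: the names pack_symbols and unpack_symbols are made up for this example, it assumes Python 3, and it assumes you store the symbol count separately so the padding bits at the end can be ignored when decoding.

```python
def pack_symbols(symbols, bits_per_symbol=3):
    """Pack small unsigned integers into bytes, most significant bit first.

    Hypothetical helper for illustration; assumes `symbols` is a list.
    """
    acc = 0
    for value in symbols:
        if not 0 <= value < (1 << bits_per_symbol):
            raise ValueError("symbol %r does not fit in %d bits"
                             % (value, bits_per_symbol))
        # Shift what we have so far and append the new symbol's bits.
        acc = (acc << bits_per_symbol) | value
    total_bits = len(symbols) * bits_per_symbol
    pad = -total_bits % 8          # zero bits needed to reach a whole byte
    acc <<= pad
    return acc.to_bytes((total_bits + pad) // 8, "big")


def unpack_symbols(data, count, bits_per_symbol=3):
    """Recover `count` symbols from bytes produced by pack_symbols."""
    acc = int.from_bytes(data, "big")
    total_bits = len(data) * 8
    mask = (1 << bits_per_symbol) - 1
    return [
        (acc >> (total_bits - (i + 1) * bits_per_symbol)) & mask
        for i in range(count)
    ]


symbols = [0, 4, 5, 3, 1, 1, 7, 6, 5, 2, 6, 2]
packed = pack_symbols(symbols)
restored = unpack_symbols(packed, len(symbols))
```

Here the 12 symbols occupy 36 bits, so packed is 5 bytes long with 4 padding bits at the end. Building one big integer is simple but touches the whole accumulator on every append, so for large images a streaming approach (or one of the modules above) is likely a better fit.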