Lua: dealing with non-ASCII byte streams and byte-order changes

mike.dinnone · Mar 9, 2011

I need to encode and decode a byte stream (possibly containing non-ASCII bytes) from/into uint16, uint32, and uint64 values (in their typical C/C++ meaning), taking care of endianness. What is an efficient and, hopefully, cross-platform way to do this in Lua?

My target architecture is 64-bit x86_64, but I would like to keep the code portable (as long as that doesn't cost me performance).

For example, decoding the bytes 0x00, 0x1d, 0xff, 0x23, 0x44, 0x32 (currently held in a Lua string) as little endian should give:

uint16: 0x1d00 = 7424
uint32: 0x324423ff = 843326463

It would be great if someone could explain with an example.

Answer

jpjacobs · Mar 9, 2011

For converting from bytes to int (taking care of endianness at the byte level, and of signedness):

function bytes_to_int(str,endian,signed) -- use length of string to determine 8,16,32,64 bits
    local t={str:byte(1,-1)}
    if endian=="big" then --reverse bytes
        local tt={}
        for k=1,#t do
            tt[#t-k+1]=t[k]
        end
        t=tt
    end
    local n=0
    for k=1,#t do
        n=n+t[k]*2^((k-1)*8)
    end
    if signed then
        n = (n > 2^(#t*8-1) -1) and (n - 2^(#t*8)) or n -- if the top (sign) bit is set, the value is negative.
    end
    return n
end
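
To decode the example from the question, the fields have to be picked out of the string at the right offsets before converting. Below is a minimal sketch, assuming a hypothetical helper `read_uint` (not part of the answer above) that takes a 1-based offset and a byte count instead of using the whole string length. It relies on ordinary number arithmetic, so with Lua's default double-precision numbers the result is only exact up to 2^53; full-range uint64 values would need a different representation.

```lua
-- Hypothetical helper: decode an unsigned integer of `nbytes` bytes
-- starting at 1-based offset `pos`. `endian` is "big" or anything else
-- (treated as little endian, matching bytes_to_int's default).
function read_uint(str, pos, nbytes, endian)
    local bytes = {str:byte(pos, pos + nbytes - 1)}
    assert(#bytes == nbytes, "not enough bytes in input")
    local n = 0
    for k = 1, nbytes do
        -- little endian: the first byte is the least significant one
        local shift = (endian == "big") and (nbytes - k) or (k - 1)
        n = n + bytes[k] * 2^(8 * shift)
    end
    return n
end

-- Byte stream from the question; Lua strings can hold any byte value.
local data = string.char(0x00, 0x1d, 0xff, 0x23, 0x44, 0x32)
local u16 = read_uint(data, 1, 2, "little") -- bytes 00 1d
local u32 = read_uint(data, 3, 4, "little") -- bytes ff 23 44 32
assert(u16 == 7424)       -- 0x1d00
assert(u32 == 843326463)  -- 0x324423ff
```

Passing substrings to `bytes_to_int`, e.g. `bytes_to_int(data:sub(1,2))`, should give the same values; the offset-based form just avoids creating the intermediate strings.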

And while we're at it also the other direction:

function int_to_bytes(num,endian,signed)
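    if num<0 and not signed then
        num=-num
        print("warning, dropping sign from number converting to unsigned")
    end
    local res={}
    local n=1 -- number of bytes to be used: smallest width whose range holds num.
    if signed then
        while num >= 2^(8*n-1) or num < -2^(8*n-1) do n=n+1 end
    else
        while num >= 2^(8*n) do n=n+1 end
    end
    if signed and num < 0 then
        num = num + 2^(8*n) -- two's complement over n bytes
    end
    for k=n,1,-1 do -- each char holds 8 bits, so byte k has weight 256^(k-1).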
        local mul=2^(8*(k-1))
        res[k]=math.floor(num/mul)
        num=num-res[k]*mul
    end
    assert(num==0)
    if endian == "big" then
        local t={}
        for k=1,n do
            t[k]=res[n-k+1]
        end
        res=t
    end
    return string.char((table.unpack or unpack)(res)) -- unpack moved to table.unpack in Lua 5.2+
end
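
Note that `int_to_bytes` chooses the number of output bytes from the magnitude of the value, whereas a uint16 or uint32 field in a byte stream always has a fixed width. A minimal sketch of a fixed-width encoder follows, assuming a hypothetical helper `write_uint` (not part of the answer above) that takes the byte count as a parameter:

```lua
-- Hypothetical helper: encode the non-negative integer `num` into exactly
-- `nbytes` bytes. `endian` is "big" or anything else (little endian).
function write_uint(num, nbytes, endian)
    assert(num >= 0 and num < 2^(8 * nbytes), "value does not fit")
    local res = {}
    for k = 1, nbytes do
        local byte = num % 256       -- lowest 8 bits
        num = (num - byte) / 256     -- drop them
        -- little endian stores the lowest byte first
        local idx = (endian == "big") and (nbytes - k + 1) or k
        res[idx] = string.char(byte)
    end
    return table.concat(res)
end

-- Re-encode the values from the question as little endian.
local packed = write_uint(7424, 2, "little") .. write_uint(843326463, 4, "little")
assert(packed == string.char(0x00, 0x1d, 0xff, 0x23, 0x44, 0x32))
```

As with the decoder, this is exact only for values below 2^53 when numbers are doubles.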

Any remarks are welcome; it's tested, but not too thoroughly.
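
On Lua 5.3 and later, the standard library also provides `string.pack` and `string.unpack`, which cover this without hand-written arithmetic. In the format string, `<` selects little endian, `>` selects big endian, and `I2`, `I4`, `I8` denote unsigned integers of that many bytes. A short sketch, guarded so that it does nothing on older interpreters where these functions do not exist:

```lua
-- string.pack / string.unpack are only available in Lua 5.3+.
if string.unpack then
    local data = string.char(0x00, 0x1d, 0xff, 0x23, 0x44, 0x32)

    -- Read a little-endian uint16 followed by a little-endian uint32.
    -- The last return value is the position of the first unread byte.
    local u16, u32, nextpos = string.unpack("<I2I4", data)
    assert(u16 == 7424 and u32 == 843326463 and nextpos == 7)

    -- Encoding is the mirror image; ">" would produce big endian instead.
    assert(string.pack("<I2I4", u16, u32) == data)
end
```

Because Lua 5.3 integers are signed 64-bit values, uint64 fields in the upper half of the unsigned range still need extra care with this approach.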