How to convert a UTF-8 byte array to a string in Lua

Tony · Sep 9, 2013

I have a table like this:

table = {57,55,0,15,-25,139,130,-23,173,148,-24,136,158}

It is a UTF-8 encoded byte array produced by PHP's unpack function:

unpack('C*',$str);

How can I convert it into a UTF-8 string that I can read in Lua?

Answer

greatwolf · Sep 9, 2013

Lua doesn't provide a direct function for turning a table of UTF-8 bytes in numeric form into a string, particularly when some of the values are negative. But it's easy enough to write something for this with the help of string.char:

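function utf8_from(t)
  local bytearr = {}
  for _, v in ipairs(t) do
    -- Map negative (signed) values into 0..255, e.g. -25 becomes 231.
    local utf8byte = v < 0 and (0xff + v + 1) or v
    table.insert(bytearr, string.char(utf8byte))
  end
  -- Join the single-byte strings into one string.
  return table.concat(bytearr)
end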
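
For illustration, here is a variant of the same idea applied to the array from the question. It is only a sketch: the name `bytes_to_string` is made up for this example, and it swaps the explicit sign check for the modulo operator, which in Lua always yields a result in 0..255 for a divisor of 256.

```lua
-- Hypothetical variant of the function above (name chosen for this example).
-- v % 256 maps signed values into 0..255, e.g. -25 % 256 == 231 in Lua.
function bytes_to_string(t)
  local chars = {}
  for i, v in ipairs(t) do
    chars[i] = string.char(v % 256)
  end
  return table.concat(chars)
end

-- The array from the question, stored under a name other than `table`.
local bytes = {57,55,0,15,-25,139,130,-23,173,148,-24,136,158}
local s = bytes_to_string(bytes)

print(#s)         --> 13 (one byte per array entry)
print(s:byte(5))  --> 231 (the entry that was -25)
```

Note that the data is stored in a variable called `bytes`: assigning to a global named `table`, as in the question, would replace Lua's standard `table` library and make calls such as `table.concat` fail.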

Note that Lua's standard string functions are not UTF-8 aware: they treat a string as a plain sequence of bytes. Whether the string returned from the above function prints correctly depends on where the output goes; if your terminal does not expect UTF-8, you'll just see some funky symbols. If you need more extensive UTF-8 support, check out some of the libraries mentioned on the Lua wiki.
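
To make the byte-oriented behaviour concrete, here is a small sketch. The helper `utf8_length` is a made-up name, not part of Lua's standard library, and it assumes its input is valid UTF-8. It counts characters by counting every byte that is not a continuation byte (128..191).

```lua
-- The last nine bytes of the question's array, written with decimal escapes.
-- They form three three-byte UTF-8 sequences.
local s = "\231\139\130\233\173\148\232\136\158"

-- Hypothetical helper: count code points by counting the bytes that are
-- NOT UTF-8 continuation bytes (0x80..0xBF). Assumes valid UTF-8 input.
function utf8_length(str)
  local _, count = string.gsub(str, "[^\128-\191]", "")
  return count
end

print(#s)              --> 9 (bytes, which is what the # operator measures)
print(utf8_length(s))  --> 3 (code points)
```

The same mismatch affects functions such as string.sub and string.reverse, which work on byte positions and can cut a multi-byte character in half.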