I'm currently working with FreeSWITCH and its event socket library (through mod_event_socket). For instance:
import logging

from ESL import ESLconnection

# config is the project's own settings module (HOST, PORT, PWD)
cmd = 'uuid_kill %s' % active_call  # active_call comes from a Django db and is unicode
con = ESLconnection(config.HOST, config.PORT, config.PWD)
if con.connected():
    e = con.api(str(cmd))
else:
    logging.error('Couldn\'t connect to Freeswitch Mod Event Socket')
As you can see, I had to explicitly cast con.api()'s argument with str(). Without that, the call ends up in the following stack trace:
Traceback (most recent call last):
  [...]
    e = con.api(cmd)
  File "/usr/lib64/python2.7/site-packages/ESL.py", line 87, in api
    def api(*args): return apply(_ESL.ESLconnection_api, args)
TypeError: in method 'ESLconnection_api', argument 2 of type 'char const *'
I don't understand this TypeError: what does it mean? cmd contains a string, so why does casting it with str(cmd) fix it?
Could it be related to FreeSWITCH's Python API, which is generated through SWIG?
Short answer: cmd likely contains a Unicode string, which cannot be trivially converted to a const char *. The error message likely comes from a wrapper framework that automates writing Python bindings for C libraries, such as SWIG or ctypes. The framework knows what to do with a byte string, but punts on Unicode strings. Passing str(cmd) helps because it converts the Unicode string to a byte string, from which the const char * value expected by C code can be trivially extracted.
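If relying on a bare str() call feels too implicit, the conversion can be spelled out. The following is only a sketch, assuming UTF-8 is an acceptable encoding for the command; to_native_str is a hypothetical helper name (it is not part of ESL), and the UUID is a placeholder:

```python
def to_native_str(value, encoding="utf-8"):
    """Return value as the interpreter's native str type.

    Hypothetical helper, not part of ESL. On Python 2 a unicode object is
    encoded to a byte string (str); on Python 3 bytes are decoded to str.
    """
    if isinstance(value, str):
        return value  # already the native string type
    if isinstance(value, bytes):
        return value.decode(encoding)  # reached on Python 3 for bytes input
    return value.encode(encoding)  # reached on Python 2 for unicode input


# Placeholder UUID; in the question this value comes from the Django db.
cmd = to_native_str(u"uuid_kill %s" % u"some-call-uuid")
print(repr(cmd))
```

The result is what would then be handed to con.api() in place of str(cmd). Unlike str(), this version names its encoding, so a non-ASCII character is encoded as UTF-8 instead of raising an error.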
Long answer:
The C type char const *, more customarily spelled const char *, can be read as "read-only array of char", char being C's way to spell "byte". When a C function accepts a const char *, it expects a "C string", i.e. an array of char values terminated with a null character. Conveniently, Python 2 byte strings (str) are internally represented as C strings with some additional information such as type, reference count, and the length of the string (so the string length can be retrieved with O(1) complexity, and also so that the string may itself contain null characters).
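The difference between a length-tracked Python byte string and a null-terminated C string can be observed from Python itself. This is a small illustration using the standard ctypes module, not anything ESL or SWIG does internally:

```python
import ctypes

data = b"ab\x00cd"  # a byte string with an embedded null character

# Python stores the length separately, so the embedded null is just data.
print(len(data))

# create_string_buffer copies the bytes into a C char array and appends
# a terminating null character.
buf = ctypes.create_string_buffer(data)
print(ctypes.sizeof(buf))  # size of the array, including the terminator
print(repr(buf.raw))       # every byte in the array
print(repr(buf.value))     # the C-string view: stops at the first null
```

Here buf.value is only b"ab", which is what a C function such as strlen would see, while Python still knows about all five bytes.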
Unicode strings in Python 2 are represented as arrays of Py_UNICODE, which are either 16 or 32 bits wide, depending on the operating system and build-time flags. Such an array cannot be passed to code that expects an array of 8-bit chars; it needs to be converted, typically to a temporary buffer, and this buffer must be freed when no longer needed.
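To get a feel for why a wide-character array is useless to code expecting 8-bit chars, one can compare encodings of the same text. This sketch uses the utf-16-le and utf-32-le codecs as stand-ins for 16-bit and 32-bit layouts; it does not inspect the interpreter's actual internal buffer:

```python
text = u"h\u00e9llo"  # five code points; the second one is not ASCII

utf8 = text.encode("utf-8")
utf16 = text.encode("utf-16-le")
utf32 = text.encode("utf-32-le")

print(len(text))   # number of code points
print(len(utf8))   # bytes needed in UTF-8
print(len(utf16))  # bytes needed with 16-bit code units
print(len(utf32))  # bytes needed with 32-bit code units

# In the wider encodings every ASCII character carries zero bytes, so code
# that expects a null-terminated char array would stop after the first one.
print(repr(utf16[:4]))
```

The first four bytes of the 16-bit form are b"h\x00\xe9\x00": a C function reading them as a C string would see a one-character string.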
On the C side, a simple-minded (and quite unnecessary) wrapper for the C function strlen could look like this:
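#include <Python.h>
#include <string.h>

/* Python 2 C API; named py_strlen so it does not clash with C's strlen. */
static PyObject *py_strlen(PyObject *ignore, PyObject *obj)
{
    const char *c_string;
    size_t len;
    /* Accept only byte strings (str in Python 2). */
    if (!PyString_Check(obj)) {
        PyErr_Format(PyExc_TypeError, "string expected, got %s", Py_TYPE(obj)->tp_name);
        return NULL;
    }
    /* Borrow the string's internal null-terminated buffer; no copy is made. */
    c_string = PyString_AsString(obj);
    len = strlen(c_string);
    return PyInt_FromLong((long) len);
}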
The code simply calls PyString_AsString to retrieve the internal C string stored by every Python string and expected by strlen. For this code to also support Unicode objects (provided it even makes sense to call strlen on Unicode objects), it must handle them explicitly:
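#include <Python.h>
#include <string.h>

/* Python 2 C API; accepts both str and unicode objects. */
static PyObject *py_strlen(PyObject *ignore, PyObject *obj)
{
    const char *c_string;
    size_t len;
    PyObject *tmp = NULL;  /* temporary byte string, used only for unicode input */
    if (PyString_Check(obj))
        c_string = PyString_AsString(obj);
    else if (PyUnicode_Check(obj)) {
        /* Encode to a new UTF-8 byte string; this is a new reference we own. */
        if (!(tmp = PyUnicode_AsUTF8String(obj)))
            return NULL;
        c_string = PyString_AsString(tmp);
    }
    else {
        PyErr_Format(PyExc_TypeError, "string or unicode expected, got %s",
                     Py_TYPE(obj)->tp_name);
        return NULL;
    }
    len = strlen(c_string);
    Py_XDECREF(tmp);  /* release the temporary; a no-op when tmp is NULL */
    return PyInt_FromLong((long) len);
}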
Note the additional complexity, not only in lines of boilerplate code, but also in the different code paths, which require different management of the temporary object that holds the byte representation of the Unicode string. Also note that the code needed to decide on an encoding when converting a Unicode string to a byte string. UTF-8 is guaranteed to be able to encode any Unicode string, but passing a UTF-8-encoded sequence to a function expecting a C string might not make sense for some uses. In Python 2, the str function uses the default codec, normally ASCII, to encode the Unicode string, so if the Unicode string actually contained any non-ASCII characters, you would get a UnicodeEncodeError.
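The encoding choice can be tried out directly. In this sketch the explicit .encode("ascii") call stands in for what Python 2's str() does by default, and the command text is a made-up example containing one non-ASCII character:

```python
cmd = u"uuid_kill caf\u00e9"  # hypothetical command with a non-ASCII character

try:
    encoded = cmd.encode("ascii")  # the codec Python 2's str() uses by default
except UnicodeEncodeError as exc:
    encoded = None
    print("ASCII encoding failed: %s" % exc)

# UTF-8 can encode the same text, at the cost of multi-byte characters.
utf8 = cmd.encode("utf-8")
print(repr(utf8))
```

So str(cmd) is a safe fix only as long as the command is pure ASCII, which is usually the case for a call UUID.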
There have been requests to add this kind of Unicode handling to SWIG itself, but it is unclear from the linked report whether they made it in.