I am confused by this difference in behaviour between Python versions and don't understand why it happens.
Python 2.7.5 (default, Aug 25 2013, 00:04:04)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.0.68)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import ctypes
>>> c="hello"
>>> a=ctypes.c_char_p(c)
>>> print(a.value)
hello
Python 3.3.5 (default, Mar 11 2014, 15:08:59)
[GCC 4.2.1 Compatible Apple LLVM 5.0 (clang-500.2.79)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import ctypes
>>> c="hello"
>>> a=ctypes.c_char_p(c)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: bytes or integer address expected instead of str instance
One works while the other raises an error. Which one is correct?
If both are correct, how can I get the 2.7 behaviour in 3.3.5? I want to pass a char pointer to C from Python.
c_char_p is a subclass of _SimpleCData, with _type_ == 'z'. The __init__ method calls the type's setfunc, which for simple type 'z' is z_set.
In Python 2, the z_set function (2.7.7) is written to handle both str and unicode strings. Prior to Python 3, str is an 8-bit string. CPython 2.x str internally uses a C null-terminated string (i.e. an array of bytes terminated by \0), for which z_set can call PyString_AS_STRING (i.e. get a pointer to the internal buffer of the str object). A unicode string needs to first be encoded to a byte string. z_set handles this encoding automatically and keeps a reference to the encoded string in the _objects attribute.
>>> from ctypes import c_char_p
>>> c = u'spam'
>>> a = c_char_p(c)
>>> a._objects
'spam'
>>> type(a._objects)
<type 'str'>
On Windows, the default ctypes string encoding is 'mbcs', with error handling set to 'ignore'. On all other platforms the default encoding is 'ascii', with 'strict' error handling. To modify the default, call ctypes.set_conversion_mode. For example, set_conversion_mode('utf-8', 'strict').
In Python 3, the z_set function (3.4.1) does not automatically convert str (now Unicode) to bytes. The paradigm shifted in Python 3 to strictly divide character strings from binary data. The ctypes default conversions were removed, as was the function set_conversion_mode. You have to pass c_char_p a bytes object (e.g. b'spam' or 'spam'.encode('utf-8')). In CPython 3.x, z_set calls the C-API function PyBytes_AsString to get a pointer to the internal buffer of the bytes object.
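As a minimal sketch of the Python 3 approach described above (the variable names text, encoded, ptr and raised are only illustrative, and UTF-8 is an assumed encoding; use whatever your C code expects):

```python
import ctypes

text = "hello"

# Python 3: encode the str to bytes explicitly before handing it to ctypes.
encoded = text.encode("utf-8")
ptr = ctypes.c_char_p(encoded)

print(ptr.value)                  # the bytes the pointer refers to
print(ptr.value.decode("utf-8"))  # decode back to str when needed

# Passing the str itself is rejected, as in the question's traceback.
raised = False
try:
    ctypes.c_char_p(text)
except TypeError:
    raised = True
print(raised)
```

Keeping the encoded bytes in a named variable also makes it obvious which object owns the memory that the pointer refers to.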
Note that if the C function modifies the string, then you need to instead use create_string_buffer to create a c_char array. Look for a parameter to be typed as const to know that it's safe to use c_char_p.
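A small sketch of the writable-buffer case, assuming Python 3. No real C library is loaded here; ctypes.memmove stands in for a hypothetical C function that writes into its char * argument:

```python
import ctypes

original = b"hello"

# create_string_buffer copies the bytes into a writable c_char array
# and appends a terminating NUL, so the array holds 6 bytes here.
buf = ctypes.create_string_buffer(original)

# Stand-in for a C function that modifies the buffer in place.
ctypes.memmove(buf, b"HELLO", 5)

print(buf.value)   # contents up to the first NUL
print(len(buf))    # total size of the array, including the NUL
print(original)    # the source bytes object is left untouched
```

Because the buffer is a separate copy, the immutable bytes object that initialised it is never written to.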