Jython and IronPython run on platforms where strings are Unicode-capable by default. Both implementations have chosen to make str essentially an alias for unicode in Python source code. The bytes type, introduced in PEP 358 as part of the transition to the fully Unicode Python 3.0, is unambiguously a sequence of single-byte values. We can see in the table below that Jython and IronPython are caught between what is, on the one hand, most practical for interoperability with existing code and their host platforms and, on the other hand, the Right Thing as delivered by Python 3.0.
It seems clear that if you need to write code that is portable between the different Python implementations, you should steer clear of str and use unicode to unambiguously express your intent.
Of course, this is impossible, since the Python Standard Library is littered with uses of str. For example, in IronPython pickle.dumps() returns a str just like Python 2.6, but the str actually has multibyte storage. IronPython hides this well, but the abstraction can leak, resulting in much confusion. Again, Python 3.0 does what is right, and pickle.dumps() returns a bytes object.
These difficulties are most likely to occur when interfacing with native Java or .NET APIs that expect byte arrays, for example when pickling to database blobs.
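The pickle case is easy to check directly. The following is a minimal sketch, assuming a Python 3 interpreter (it should also pass on Python 2.6 and later, where bytes is simply an alias for str); the record dictionary is made up for illustration.

```python
import pickle

# Hypothetical payload, used only for illustration.
record = {"id": 1, "name": "example"}

# Serialise the object; this is the value you would hand to a blob column.
blob = pickle.dumps(record)

# On Python 3 this is a genuine bytes object. On Python 2.6+ the name
# bytes is an alias for str, so the same check passes there as well.
assert isinstance(blob, bytes)

# Reading the blob back restores an equal object.
restored = pickle.loads(blob)
assert restored == record
```

The point of the check is that code storing pickles should treat the result as raw bytes, whatever the implementation happens to call the type.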
In Jython, a str instance can be converted to a Java byte array as follows.
>>> import jarray
>>> a = jarray.array("This is a string", 'b')
>>> a
array('b', [84, 104, 105, 115, 32, 105, 115, 32, 97, 32, 115, 116, 114, 105, 110, 103])
The equivalent in IronPython, as provided by Michael Foord, is:
>>> from System import Array, Byte
>>> a = Array[Byte](tuple(Byte(ord(c)) for c in "This is a string"))
>>> a
Array[Byte]((<System.Byte object at 0x000000000000002B >, <System.Byte object at 0x000000000000002C >, <System.Byte object at 0x000000000000002D >, <System.Byte object at 0x000000000000002E >, <System.Byte object at 0x000000000000002F >, <System.Byte object at 0x0000000000000030 >, <System.Byte object at 0x0000000000000031 >, <System.Byte object at 0x0000000000000032 >, <System.Byte object at 0x0000000000000033 >, <System.Byte object at 0x0000000000000034 >, <System.Byte object at 0x0000000000000035 >, <System.Byte object at 0x0000000000000036 >, <System.Byte object at 0x0000000000000037 >, <System.Byte object at 0x0000000000000038 >, <System.Byte object at 0x0000000000000039 >, <System.Byte object at 0x000000000000003A >))
Going back to a str, we can use identical code in IronPython and Jython.
>>> s = ''.join(chr(c) for c in a)
>>> s
'This is a string'
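One caveat with this round trip: Java bytes are signed, so a Jython byte array reports values above 127 as negative numbers, which chr() rejects. A hedged sketch of a more defensive conversion follows. The helper name byte_values_to_str is hypothetical, and a plain list of integers stands in for the platform array so that the example runs on any Python; it has not been checked against System.Byte values in IronPython.

```python
def byte_values_to_str(values):
    """Join byte values into a string, one character per byte.

    Masking with 0xFF maps signed values (-128..127, as Java reports
    them) onto 0..255 and leaves values already in 0..255 unchanged.
    """
    return ''.join(chr(v & 0xFF) for v in values)


# Stand-in for a platform byte array holding ASCII text.
a = [84, 104, 105, 115, 32, 105, 115, 32, 97, 32,
     115, 116, 114, 105, 110, 103]
s = byte_values_to_str(a)

# A signed value such as -23 is the same byte as unsigned 233.
high = byte_values_to_str([-23])
```

For pure ASCII data the mask changes nothing, so the simpler join shown earlier remains adequate there.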