This is the mail archive of the archer@sourceware.org mailing list for the Archer project.



Re: Python 3.0


On Fri, 13-02-2009 at 23:57 -0800, Jim Blandy wrote:
> The first thing I ran into trying this was that Python's string
> representation has changed:
> http://docs.python.org/3.0/whatsnew/3.0.html#text-vs-data-instead-of-unicode-vs-8-bit
> 
> At the C API level, the PyString_ functions are all gone; one uses
> PyByteArray or PyUnicode instead.  I think this may not be a big deal
> for GDB, as byte arrays probably pretty much match what GDB is doing
> with its strings, last I checked: read them from the debug info, and
> dump 'em to stdout, without much concern for encoding.

IMO we should be careful here. If GDB knows it's working with a string in
the inferior (as opposed to a byte array), then it should use PyUnicode,
converting from target_encoding to Unicode. If GDB is working
with a string which originates from the host (e.g., a string the user
typed in the CLI), then it should also use PyUnicode, but converting from
host_encoding. Using target_encoding and host_encoding doesn't help much
with today's GDB, but Tromey has a patch which greatly improves charset
handling, and the Python side will benefit from it automatically IIUC.
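
To make the distinction concrete, here is a minimal Python 3 sketch of why
the source encoding matters. The helper names (from_inferior, from_host) and
the charset values are hypothetical and are not part of GDB's Python API;
they only illustrate that the same bytes turn into different text depending
on which charset is used to decode them.

```python
def from_inferior(raw, target_charset):
    """Decode bytes read from the inferior using the target charset.

    Hypothetical helper: `target_charset` stands in for whatever GDB
    considers the target encoding.
    """
    return raw.decode(target_charset, errors="replace")


def from_host(raw, host_charset):
    """Decode bytes that originate on the host (e.g. typed at the CLI).

    Hypothetical helper: `host_charset` stands in for the host encoding.
    """
    return raw.decode(host_charset, errors="replace")


# The same byte sequence, interpreted under two different charsets.
raw = b"caf\xc3\xa9"
as_utf8 = from_inferior(raw, "utf-8")      # correct if the target uses UTF-8
as_latin1 = from_inferior(raw, "latin-1")  # wrong charset gives mojibake
```

If the bytes are simply passed through as a byte array, the decision about
which charset applies is deferred to whoever prints them, which is where
mismatches tend to show up.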

I've made an effort to be consistent in that regard. There are a few
places which currently use PyString_, but that's because Python 2.x
mandates using them (for instance, the return value of __str__ has to be a
PyString). But there are also places which should be using PyUnicode_
instead...
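
As a rough illustration of how this constraint flips in Python 3, where
__str__ has to return text rather than bytes, consider the sketch below.
The classes TextValue and BytesValue are hypothetical stand-ins and do not
correspond to anything in GDB.

```python
class TextValue:
    """Hypothetical wrapper whose __str__ returns decoded text."""

    def __init__(self, raw, charset):
        self.raw = raw
        self.charset = charset

    def __str__(self):
        # Decode explicitly, so the charset choice is visible.
        return self.raw.decode(self.charset, errors="replace")


class BytesValue:
    """Hypothetical wrapper whose __str__ (incorrectly) returns bytes."""

    def __init__(self, raw):
        self.raw = raw

    def __str__(self):
        return self.raw


text = str(TextValue(b"caf\xe9", "latin-1"))

try:
    str(BytesValue(b"caf\xe9"))
    bytes_rejected = False
except TypeError:
    # Python 3 refuses a bytes return value from __str__.
    bytes_rejected = True
```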
-- 
[]'s
Thiago Jung Bauermann
IBM Linux Technology Center

