This is the mail archive of the mailing list for the Archer project.

Re: Python 3.0

On Fri, 13-02-2009 at 23:57 -0800, Jim Blandy wrote:
> The first thing I ran into trying this was that Python's string
> representation has changed:
> At the C API level, the PyString_ functions are all gone; one uses
> PyByteArray or PyUnicode instead.  I think this may not be a big deal
> for GDB, as byte arrays probably pretty much match what GDB is doing
> with its strings, last I checked: read them from the debug info, and
> dump 'em to stdout, without much concern for encoding.

IMO we should be careful here. If GDB knows it's working with a string in
the inferior (as opposed to a byte array), then it should use PyUnicode,
decoding the bytes from target_encoding. If GDB is working with a string
which originates from the host (e.g., a string the user typed in the
CLI), then it should also use PyUnicode, but decoding from host_encoding.
Using target_encoding and host_encoding doesn't help much with today's
GDB, but Tromey has a patch which greatly improves charset handling, and
the Python side will benefit from it automatically IIUC.
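
To make the distinction concrete, here is a minimal Python-level sketch of
the rule above. It is not GDB code: the function names and the two charset
values are hypothetical, chosen only for illustration. At the C API level
the analogous call would presumably be PyUnicode_Decode with the relevant
charset name.

```python
# Hypothetical charsets; in GDB these would come from the target and
# host charset settings rather than being hard-coded.
TARGET_ENCODING = "latin-1"
HOST_ENCODING = "utf-8"


def inferior_string_to_unicode(raw, target_encoding=TARGET_ENCODING):
    """Decode bytes known to be a string in the inferior."""
    return raw.decode(target_encoding)


def host_string_to_unicode(raw, host_encoding=HOST_ENCODING):
    """Decode bytes that originate from the host, e.g. CLI input."""
    return raw.decode(host_encoding)


def inferior_bytes(raw):
    """Data that is not known to be text stays a byte array, undecoded."""
    return bytes(raw)


# The same word, encoded differently on each side.
from_target = inferior_string_to_unicode(b"caf\xe9")    # Latin-1 bytes
from_host = host_string_to_unicode(b"caf\xc3\xa9")      # UTF-8 bytes
```

Both calls produce the same Unicode string, whereas decoding the host
bytes with the target charset would silently yield mojibake, which is why
the origin of each string matters.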

I've made an effort to be consistent in that regard. There are a few
places which nowadays use PyString_, but that's because Python 2.x
mandates using them (for instance, the return value of __str__ has to be a
PyString). But there are also places which should be using PyUnicode_
Thiago Jung Bauermann
IBM Linux Technology Center
