This is the mail archive of the mailing list for the Mauve project.

Re: new test cases (long)

hello Mark,

On Friday 14 February 2003 22:57, Mark Wielaard wrote:
> Hi Raif,
> On Sat, 2003-02-08 at 17:17, Raif S. Naffah wrote:
> > the tests are to ensure that the mandated (as per public Javadoc
> > 1.3.1 and 1.4.1) minimal character encodings are supported by the
> > bytecode interpreter.
> > [...]
> > +	* gnu/testlet/java/lang/String/getBytes14: new test
> Here you test for "ISO8859_15". I looked here:
> but couldn't see where it said this is a required character set.

yes it is not listed there.  but i refer you to 
<.../j2sdk1.4.1/docs/guide/intl/encoding.doc.html> page of the public 
documentation of sun's jdk-1.4.1; 2nd paragraph:

"Sun's Java 2 Software Development Kit, Standard Edition, v. 1.4.1 for 
all platforms (Solaris™ operating environment, Linux, and Microsoft 
Windows) and the Java 2 Runtime Environment, Standard Edition, v. 1.4.1 
for Solaris and Linux support all encodings shown on this page..."

and further down the same page, a table giving the "Basic Encoding Set 
(contained in lib/rt.jar) - Supported by java.nio, java.io, and 
java.lang APIs."  in the "Canonical Name for java.io and java.lang 
API" column, next to the ISO-8859-15 row entry, there is a reference to 
"extended encoding set."  i took this to mean that the canonical name 
is to be taken from the second set, i.e. the extended encoding set.

there are two possible deductions from this page:

a. "ISO-8859-15" is a MUST encoding in java.nio, as well as in java.io 
and java.lang, but in the last two the canonical name is as stated in 
the "extended set", i.e. "ISO8859_15" (ISO 8859-15, Latin alphabet No. 9), 
and hence the supporting classes are in charsets.jar rather than in 
rt.jar.

b. "ISO-8859-15" is only a MUST encoding in java.nio, but not in 
java.io nor java.lang.

i adopted the first.
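
to make the naming question concrete, here is a minimal sketch (assuming a 1.4 or later runtime with java.nio) that asks the runtime which java.nio canonical name a given name or alias resolves to.  the class name CharsetNames and the helper canonicalNioName are made up for this illustration; they are not part of Mauve or of the JDK.

```java
import java.nio.charset.Charset;

// CharsetNames is a hypothetical name used only for this illustration.
class CharsetNames {
    /**
     * Returns the java.nio canonical name for the given name or alias,
     * or null if the runtime does not support it.
     */
    static String canonicalNioName(String nameOrAlias) {
        if (!Charset.isSupported(nameOrAlias)) {
            return null;
        }
        return Charset.forName(nameOrAlias).name();
    }

    public static void main(String[] args) {
        // The two spellings discussed above; what gets printed depends on the runtime.
        String[] names = { "ISO-8859-15", "ISO8859_15" };
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + " -> " + canonicalNioName(names[i]));
        }
    }
}
```

if both spellings resolve to the same canonical name, the runtime treats one as an alias of the other, which is consistent with deduction (a) but does not by itself prove that the documentation mandates it.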

there is of course the 3rd possibility that the writer(s) of these 
documentation pages contradict themselves.

the code itself (for sun's jdk 1.4.1_01) does support ISO-8859-15, 
which can be thought of as the litmus test.
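
a rough sketch of such a litmus test follows.  it is not the actual getBytes14 test from the patch; the class name EncodingProbe and its supports() helper are hypothetical, and the only JDK behaviour relied upon is that String.getBytes(String) throws UnsupportedEncodingException for an encoding it does not know.

```java
import java.io.UnsupportedEncodingException;

// EncodingProbe is a hypothetical helper, not part of Mauve or the JDK.
class EncodingProbe {
    /** Returns true if String.getBytes accepts the given encoding name. */
    static boolean supports(String encodingName) {
        try {
            "abc".getBytes(encodingName);
            return true;
        } catch (UnsupportedEncodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Both spellings of Latin alphabet No. 9; results depend on the runtime.
        String[] names = { "ISO-8859-15", "ISO8859_15" };
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + " supported: " + supports(names[i]));
        }
    }
}
```

running this against a given runtime only shows what that runtime does; whether a failure counts as non-conformance still depends on which reading of the documentation one adopts.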

> Is it really required or just nice to have since the Sun
> implementation supports it? (Which might still be a good reason to
> add them to Mauve, but then I would like to label them explicitly as
> such.)

my interpretation of it was that it is a MUST.

> Also you seem to test (in getBytes13) for the "historical names" for
> which I couldn't find a definition.

the relevant javadoc page in sun's jdk 1.3.1_06 
<.../jdk1.3.1/docs/guide/intl/encoding.doc.html> lists the required 
encodings:

"...Sun's Java 2 Runtime Environment, Standard Edition, v. 1.3.1 for 
Windows comes in two different versions: US-only and international. The 
US-only version only supports the encodings shown in the first table. 
The international version (which includes the lib\i18n.jar file) 
supports all encodings shown on this page."

it then proceeds to list the "Basic Encoding Set" (contained in rt.jar) 
where those names are defined.

the only difference between the two versions' lists is the Latin 
Alphabet No. 9 (ISO 8859-15).

>... Do you know where they are
> specified? InputStreamReader and OutputStreamWriter getEncoding() are
> supposed to return them but they don't document what they actually
> look like.

the references sun cites are:

* The Unicode standard, and
* The Unicode FAQ.
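
for what it is worth, one way to see what getEncoding() actually returns on a given runtime is to ask it directly.  the sketch below is only an illustration: the class name HistoricalNameDemo and the helper historicalName are made up, and the output is whatever the runtime under test reports, not a specification.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

// HistoricalNameDemo is a hypothetical name used only for this illustration.
class HistoricalNameDemo {
    /**
     * Opens a reader over an empty stream with the requested encoding and
     * returns what getEncoding() reports, or null if the encoding is not
     * supported (UnsupportedEncodingException is an IOException).
     */
    static String historicalName(String requested) {
        try {
            InputStreamReader reader = new InputStreamReader(
                    new ByteArrayInputStream(new byte[0]), requested);
            try {
                return reader.getEncoding();
            } finally {
                reader.close();
            }
        } catch (IOException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        String[] names = { "US-ASCII", "ISO-8859-1", "UTF-8", "ISO-8859-15" };
        for (int i = 0; i < names.length; i++) {
            System.out.println(names[i] + " -> " + historicalName(names[i]));
        }
    }
}
```

comparing the printed names with the "Canonical Name" columns of the encoding tables is a quick way to check whether an implementation follows the same naming.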
