This is the mail archive of the libc-locales@sourceware.org mailing list for the GNU libc locales project.



Re: Variable length date strings in glibc locales?


Hi,

On 2014-05-28 10:17, Keld Simonsen wrote:
> On Tue, May 27, 2014 at 05:20:44PM -0400, Carlos O'Donell wrote:
>> On 05/27/2014 05:02 PM, Keld Simonsen wrote:
>>> On Tue, May 27, 2014 at 09:58:56AM +0300, Marko Myllynen wrote:
>>>> Hi,
>>>>
>>>> in some languages dates are written without leading zeroes so that May 3
>>>> would be "3.5.". The same for time, 08:07:00 would be "8.07.00".
>>>>
>>>> In glibc locales it would be possible to write dates and times in such
>>>> fashion but do we know how that would affect existing applications? Are
>>>> they expecting dates and times to be fixed length and would variable
>>>> length date strings cause formatting or layout issues? Looking at
>>>> existing locales, almost all of them use fixed length strings for
>>>> d_fmt/t_fmt/date_fmt/d_t_fmt.
>>>>
>>>> Ideally of course it would be nice to change certain locales to use date
>>>> and time formats according to their cultural conventions and national
>>>> recommendations but if that would lead to wonky layout in applications
>>>> then it's probably better to be pragmatic and use fixed length dates.
>>>>
>>>> I could add a few words about this to our Locales wiki page if someone
>>>> happens to know what the best approach is here.
>>>
>>> I think there are two schools of thought to consider.
>>>
>>> 1. What is expected by users using POSIX like systems.
>>>
>>> 2. What is expected by the linguistic oriented users.
>>>
>>> This varies from program to program.
>>
>> Certainly.
>>
>> So I'm going to take a policy position for glibc. The GNU C 
>> Library should accept locales with dates and times of variable
>> widths.
>>
>> If the author of a program wishes a constant width they need
>> to use an ISO standard date format, or pad the individual
>> components themselves.
> 
> Hmm, I think that would not be according to users' expectations. 
> For example "ls" and logging programs would be hard to change to that new 
> convention. I think it would be better to go along with existing practice
> and then provide guidance for how to use the info for new ways of using the
> data.
> 
> I think the main problem area is the abbreviated day names and month names. 
> I would advise that these be kept fixed, as this is what programs and users
> expect in utilities. Then we should introduce a new set of abbreviated
> day names and month names that can have varying length.
> In this way we will be backwards compatible, and in line with programmers'
> and users' expectations.

Currently, abbreviated day names and month names already vary in length
between locales, and within locales too (e.g. fr_FR). Running the simple
test script again, this time printing the narrowest and widest result of
%a and %b across all locales, we see:

localhost:~> cat t2.sh
#!/bin/bash

for f in a b ; do
  echo $f:
  for l in $(ls -1 /usr/share/i18n/locales/* | grep -Ev '(@|i18n$|iso14651|translit|POSIX)') ; do
    echo -n "$(LC_ALL=$(basename $l.UTF-8) date --date="2007-05-03 08:07:00" +"%$f" | wc -L)" ; echo -e "\t$(basename $l).UTF-8"
  done | sort -un | sed -n '1p;$p'
done
localhost:~> unset LC_ALL
localhost:~> bash ./t2.sh
a:
1	ar_AE.UTF-8
7	fa_IR.UTF-8
b:
1	gu_IN.UTF-8
7	gv_GB.UTF-8
localhost:~>
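
The variation within a single locale can be checked in a similar way. This
is only a minimal sketch: abmon_width_range is a hypothetical helper name,
it assumes GNU date and GNU wc (for --date and -L), and if the requested
locale is not installed the result silently reflects the C locale instead.

```shell
#!/bin/bash

# Hypothetical helper: print the narrowest and widest display width of the
# abbreviated month name (%b) over all twelve months in one locale.
abmon_width_range() {
  local loc=$1 m widths
  widths=$(for m in 01 02 03 04 05 06 07 08 09 10 11 12 ; do
    # Run wc in the same locale so multibyte names are measured correctly.
    LC_ALL=$loc date --date="2007-$m-15" +"%b" | LC_ALL=$loc wc -L
  done | sort -n | sed -n '1p;$p')
  # Unquoted on purpose: joins the two result lines with a single space.
  echo $widths
}
```

Calling it as "abmon_width_range fr_FR.UTF-8" prints two numbers; if they
differ, the abbreviations in that locale are not fixed width.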

In the past there was a request from an application developer to change
a locale which had fixed length month abbreviations in use:

https://bugzilla.redhat.com/show_bug.cgi?id=657572

I am not sure what the ideal solution would be here, but perhaps we could
at least reduce the confusion with some additional guidance on the wiki
page. Suggestions on that would be warmly welcome.
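
As one possible starting point for such guidance, here is a rough sketch of
the two options Carlos mentioned for programs that need constant width: use
a numeric ISO style format, or pad the locale-dependent components
yourself. The names iso_stamp and pad_right are hypothetical, GNU date is
assumed for --date, and the padding is only known to line up for ASCII
text, since a shell's printf may count bytes rather than display columns.

```shell
#!/bin/bash

# Option 1: a purely numeric ISO 8601 style format, which should have the
# same width regardless of the locale in use.
iso_stamp() {
  date --date="$1" +"%Y-%m-%d %H:%M:%S"
}

# Option 2: left-align a locale-dependent string in a fixed-width column.
# $1 is the column width, $2 is the string (e.g. the output of date +%a).
# Caveat: for multibyte text the width may be counted in bytes.
pad_right() {
  printf '%-*s' "$1" "$2"
}
```

For example, "pad_right 7 "$(date +%a)"" reserves seven columns for the
abbreviated day name, which matches the widest %a seen in the test above.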

Thanks,

-- 
Marko Myllynen

