This is the mail archive of the libc-alpha@sourceware.org mailing list for the glibc project.



Re: [PATCH 1/1] Added Locale for mfe_MU


Akhilesh Kumar <akhilesh.k@samsung.com> wrote:

> Added locale for "Morisyen" which is also called as "Mauritian Creole"
> spoken in Mauritius.

> [BZ #21971]
> 	*locale/iso-639.def: Added DEFINE_LANGUAGE_CODE3 for "mfe"        
> 	*localedata/SUPPORTED: Added mfe_MU     
> 	*localedata/locales/mfe_MU: New File
>
> ---
>  locale/iso-639.def        |    1 +
>  localedata/SUPPORTED      |    1 +
>  localedata/locales/mfe_MU |  187 +++++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 189 insertions(+), 0 deletions(-)
>  create mode 100644 localedata/locales/mfe_MU
>
> diff --git a/locale/iso-639.def b/locale/iso-639.def
> index b109a3b..6d6eceb 100644
> --- a/locale/iso-639.def
> +++ b/locale/iso-639.def
> @@ -316,6 +316,7 @@ DEFINE_LANGUAGE_CODE3 ("Mari", chm, chm)
>  DEFINE_LANGUAGE_CODE ("Marshallese", mh, mah, mah)
>  DEFINE_LANGUAGE_CODE3 ("Marwari", mwr, mwr)
>  DEFINE_LANGUAGE_CODE3 ("Masai", mas, mas)
> +DEFINE_LANGUAGE_CODE3 ("Mauritius", mfe, mfe)

“Mauritius” is the name of the country.
The name of the language in English should be “Mauritian Creole” or
“Morisyen”.

>  DEFINE_LANGUAGE_CODE3 ("Mayan languages", myn, myn)
>  DEFINE_LANGUAGE_CODE3 ("Meadow Mari", mhr, mhr)
>  DEFINE_LANGUAGE_CODE3 ("Mende", men, men)
> diff --git a/localedata/SUPPORTED b/localedata/SUPPORTED
> index 197917b..e8f4764 100644
> --- a/localedata/SUPPORTED
> +++ b/localedata/SUPPORTED
> @@ -311,6 +311,7 @@ lzh_TW/UTF-8 \
>  mag_IN/UTF-8 \
>  mai_IN/UTF-8 \
>  mai_NP/UTF-8 \
> +mfe_MU/UTF-8 \
>  mg_MG.UTF-8/UTF-8 \
>  mg_MG/ISO-8859-15 \
>  mhr_RU/UTF-8 \
> diff --git a/localedata/locales/mfe_MU b/localedata/locales/mfe_MU
> new file mode 100644
> index 0000000..8f1a426
> --- /dev/null
> +++ b/localedata/locales/mfe_MU
> @@ -0,0 +1,187 @@
> +comment_char %
> +escape_char /
> +
> +% This file is part of the GNU C Library and contains locale data.
> +% The Free Software Foundation does not claim any copyright interest
> +% in the locale data contained in this file.  The foregoing does not
> +% affect the license of the GNU C Library as a whole.  It does not
> +% exempt you from the conditions of the license if your use would
> +% otherwise be governed by that license.
> +
> +% Locale for Morisyen locale in the Mauritius
> +% Contributed by Akhilesh Kumar <akhilesh.k@samsung.com>
> +
> +LC_IDENTIFICATION
> +title      "Morisyen locale for the Mauritius"
> +source     "Samsung Electronics Co., Ltd."
> +address    ""
> +contact    ""
> +email      "akhilesh.k@samsung.com"
> +tel        ""
> +fax        ""
> +language   "English"
> +territory  "Mauritius"
> +revision   "1.0"
> +date       "2017-08-18"
> +
> +category "i18n:2012";LC_IDENTIFICATION
> +category "i18n:2012";LC_CTYPE
> +category "i18n:2012";LC_COLLATE
> +category "i18n:2012";LC_TIME
> +category "i18n:2012";LC_NUMERIC
> +category "i18n:2012";LC_MONETARY
> +category "i18n:2012";LC_MESSAGES
> +category "i18n:2012";LC_PAPER
> +category "i18n:2012";LC_NAME
> +category "i18n:2012";LC_ADDRESS
> +category "i18n:2012";LC_TELEPHONE
> +category "i18n:2012";LC_MEASUREMENT
> +END LC_IDENTIFICATION
> +
> +
> +LC_CTYPE
> +copy "i18n"
> +END LC_CTYPE
> +
> +% http://demo.icu-project.org/icu-bin/locexp?d_=en&_=mfe
> +LC_TIME
> +% Abbreviated weekday names 
> +abday	"dim";/
> +       "lin";/
> +       "mar";/
> +       "mer";/
> +	"ze ";/
> +	"van";/
> +	"sam"

Could you use whitespace consistently instead of a mixture of tabs
and spaces?
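
A quick way to spot the offending lines is sketched below. This is only
an illustration, not part of the glibc tooling: the helper name
lines_with_leading_tabs is made up here, and it only looks at the
indentation at the start of each line, not at tabs after a keyword.

```python
def lines_with_leading_tabs(text):
    """Return 1-based numbers of lines whose indentation contains a tab."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        # Everything before the first non-blank character is the indentation.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            hits.append(number)
    return hits


# Three lines modelled on the abday entry quoted above.
sample = 'abday\t"dim";/\n       "lin";/\n\t"ze ";/\n'
print(lines_with_leading_tabs(sample))  # [3]
```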

> +%
> +% Full weekday names
> +day     "dimans";/
> +        "lindi";/
> +        "mardi";/
> +        "merkredi";/
> +        "zedi";/
> +        "vandredi";/
> +        "samdi"
> +% Abbreviated month names		
> +abmon   "zan";/
> +	 "fev";/
> +        "mar";/
> +        "avr";/
> +        "me";/
> +        "zin";/
> +        "zil";/
> +	 "out";/
> +	 "sep";/
> +        "okt";/
> +        "nov";/
> +        "des"
> +% 
> +% Full month names
> +mon     "zanvie";/
> +	 "fevriye";/
> +        "mars";/
> +        "avril";/
> +        "me";/
> +	 "zin";/
> +	 "zilye";/
> +	 "out";/
> +	 "septam";/
> +        "oktob";/
> +        "novam";/
> +        "desam"
> +%
> +d_t_fmt     "<U0025><U0061><U0020><U0025><U0064><U0020><U0025><U0062><U0020><U0025><U0059><U0020><U0025><U0054><U0020><U0025><U005A>"
> +d_fmt       "<U0025><U0064><U002F><U0025><U006D><U002F><U0025><U0079>"
> +t_fmt       "<U0025><U0054>"
> +am_pm       "AM";"PM"
> +t_fmt_ampm  "<U0025><U006C><U003A><U0025><U004D><U003A><U0025><U0053><U0020><U0025><U0050><U0020><U0025><U005A>"
> +date_fmt    "<U0025><U0061><U0020><U0025><U0065><U0020><U0025><U0062>/
> +<U0020><U0025><U0048><U003A><U0025><U004D><U003A><U0025><U0053><U0020>/
> +<U0025><U005A><U0020><U0025><U0059>"
> +END LC_TIME

These are all ASCII, so you could write the text directly here as well
instead of the code points.
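
To see what the readable form would be, the <UXXXX> tokens can be
decoded mechanically. The following is a minimal sketch, assuming the
value has already been joined into one string (it does not handle the
"/" line continuation); decode_ucs is a hypothetical helper name.

```python
import re

# Matches one <UXXXX> symbolic code point as written in the locale source.
_UCS = re.compile(r"<U([0-9A-Fa-f]{4,8})>")


def decode_ucs(text):
    """Replace every <UXXXX> token with the character it names."""
    return _UCS.sub(lambda m: chr(int(m.group(1), 16)), text)


# d_fmt and t_fmt from the quoted patch.
print(decode_ucs("<U0025><U0064><U002F><U0025><U006D><U002F><U0025><U0079>"))  # %d/%m/%y
print(decode_ucs("<U0025><U0054>"))  # %T
```

So d_fmt could simply be written as "%d/%m/%y" and t_fmt as "%T".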

> +% http://wikitravel.org/en/User:LiangHH/Mauritian_Creole_phrasebook
> +LC_MESSAGES
> +yesexpr "<U005E>[+1yY]"

<U005E> could be written directly, i.e.

yesexpr "^[+1yY]"

> +noexpr  "[-0nN]"

Why is the ^ missing here?
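
The anchor matters because an unanchored expression can match anywhere
in the response. A small sketch of the difference, using Python's re
module as a stand-in for the POSIX regex matching (an assumption made
for illustration; answers_no is a made-up helper):

```python
import re


def answers_no(response, noexpr):
    # re.search scans the whole string, so without "^" the expression
    # can match a character in the middle of the response.
    return re.search(noexpr, response) is not None


print(answers_no("Certainly", "[-0nN]"))   # True: matches the "n" inside the word
print(answers_no("Certainly", "^[-0nN]"))  # False: the response starts with "C"
print(answers_no("no", "^[-0nN]"))         # True
```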

> +yesstr  "Yes"
> +nostr   "No"
> +END LC_MESSAGES
> +
> +LC_COLLATE
> +
> +% Copy the template from ISO/IEC 14651
> +copy "iso14651_t1"
> +
> +END LC_COLLATE
> +
> +LC_MONETARY
> +% https://en.wikipedia.org/wiki/Mauritian_rupee
> +int_curr_symbol     "MUR "
> +currency_symbol     "Rs"

According to the Wikipedia page you quote, this should be “₨”, not “Rs”.

> +mon_decimal_point   "."
> +mon_thousands_sep   ","

According to mfe.xml in CLDR, U+00A0 NO-BREAK SPACE is used as the
thousands separator.  So we should not use “,” but U+202F NARROW
NO-BREAK SPACE (we currently use the narrow version consistently in
thousands_sep and mon_thousands_sep in glibc).
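
For illustration, here is roughly what grouping by three digits with
U+202F produces. This is a sketch only: group_digits is a hypothetical
helper, not how glibc formats numbers internally.

```python
NNBSP = "\u202f"  # U+202F NARROW NO-BREAK SPACE


def group_digits(digits, sep=NNBSP, size=3):
    """Insert sep between groups of `size` digits, counting from the right."""
    groups = []
    while len(digits) > size:
        groups.append(digits[-size:])
        digits = digits[:-size]
    groups.append(digits)
    return sep.join(reversed(groups))


# Prints 1 234 567 with narrow no-break spaces between the groups.
print(group_digits("1234567"))
```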

> +mon_grouping        3;3
> +positive_sign       ""
> +negative_sign       "-"
> +int_frac_digits     2
> +frac_digits         2
> +p_cs_precedes       1
> +int_p_sep_by_space  1
> +p_sep_by_space      1
> +n_cs_precedes       1
> +int_n_sep_by_space  1
> +n_sep_by_space      1
> +p_sign_posn         1
> +n_sign_posn         1
> +%
> +END LC_MONETARY
> +
> +LC_NUMERIC
> +decimal_point          "."
> +thousands_sep          ","

According to mfe.xml in CLDR, U+00A0 NO-BREAK SPACE is used as the
thousands separator.  So we should not use “,” but U+202F NARROW
NO-BREAK SPACE (we currently use the narrow version consistently in
thousands_sep and mon_thousands_sep in glibc).

> +grouping               3
> +END LC_NUMERIC
> +
> +LC_PAPER
> +copy "i18n"
> +END LC_PAPER
> +
> +LC_TELEPHONE
> +% https://www.howtocallabroad.com/mauritius/
> +tel_int_fmt "+%c %l"
> +% 00 Africa: all countries except Kenya, Nigeria, Tanzania and Uganda
> +int_select     "00"
> +int_prefix     "230"
> +END LC_TELEPHONE
> +
> +LC_MEASUREMENT
> +copy "i18n"
> +END LC_MEASUREMENT
> +
> +
> +
> +LC_NAME
> +name_fmt    "<U0025><U0064><U0025><U0074><U0025><U0067><U0025><U0074>/
> +<U0025><U006D><U0025><U0074><U0025><U0066>"

Could be written directly because it is ASCII.

> +name_miss   "Miss."
> +name_mr     "Mr."
> +name_mrs    "Mrs."
> +name_ms     "Ms."
> +END LC_NAME
> +
> +
> +LC_ADDRESS
> +postal_fmt   "%f%N%h%s%N%T"
> +country_name "Mauritius"
> +country_ab2  "MU"
> +country_ab3  "MUS"
> +country_num  480
> +% https://en.wikipedia.org/wiki/List_of_international_vehicle_registration_codes
> +country_car  "MS"
> +country_isbn "978-613,978-620,978-99903,978-99949"
> +lang_name    "Morisyen"

According to mfe.xml in CLDR, this should be “kreol morisien”,
Wikipedia has it capitalized as “Kreol Morisien”. I think we should
follow CLDR here.

> +lang_ab      "mfe"

lang_ab should contain the ISO 639-1 code, but as there is no ISO 639-1
code for Morisyen, it should be empty or omitted, I think.

> +lang_term    "mfe"
> +lang_lib     "mfe"

I think lang_lib is the ISO 639-2 code, and there is no ISO 639-2
code for Morisyen either, only ISO 639-3.

I am not sure what one should do in such a case. I guess using the
ISO 639-3 code in lang_lib as well is OK.

> +END LC_ADDRESS

-- 
Mike FABIAN <mfabian@redhat.com>

