This is the mail archive of the binutils@sourceware.org mailing list for the binutils project.



Re: [Translation-i18n] xtensa message pluralization



> On Nov 8, 2017, at 8:40 PM, Alan Modra <amodra@gmail.com> wrote:
> 
> On Wed, Nov 08, 2017 at 05:52:59PM +0000, Paul.Koning@dell.com wrote:
>> The reason for the "entire sentences" rule is that a lot of languages adjust word forms in fairly complex ways, depending not just on the number (singular/plural, etc.) but also on other considerations.  If you have two sentence fragments, in English you can typically just concatenate them and be ok.  In lots of other languages it's not that simple; the correct way to phrase part 2 may depend on what is in part 1.  A sentence is a grammatical unit, and can be translated in isolation without running into these issues.  But half a sentence cannot, at least not necessarily.
> 
> Understood, and answers my question, thanks.  Using concat is not right.
> 
>> Given the limitations of the gettext machinery, if you want clean translation there are certain message constructs you have to avoid.  It appears that messages with more than one to-be-pluralized element are such an example, since there isn't an "nngettext" to give you the correct message for the plural based on more than one value.
> 
> That isn't going to happen.  The translation project is going to be
> faced with sentences that really do need two or more pluralized nouns
> for the sense to be conveyed naturally in English.  Avoiding two
> plurals in one sentence will mean loss of information (eg. dropping
> "bytes" from a quantity) or stilted contrived sentences.
> 
> To recap, the sentence we are talking about here is:
> 	"format '%s' allows %d slots, but there are %d opcodes"

Yes, I agree that cutting this into two sentences is not a clean solution.

> Bruno suggested the best solution was to break the sentence at the
> conjunction "but", which is of course a natural place to break a
> sentence into phrases.  (It's how I broke the sentence at first too,
> when considering the reordering issue.)  The code to do that would be:
> 
>      char *phrase1, *phrase2;
>      int slots = xtensa_format_num_slots (xtensa_default_isa, vinsn->format);
> 
>      if (asprintf (&phrase1, ngettext ("format '%s' allows %d slot,",
> 					"format '%s' allows %d slots,",
> 					slots),
> 		    xtensa_format_name (xtensa_default_isa, vinsn->format),
> 		    slots) == -1
> 	  || asprintf (&phrase2, ngettext ("there is %d opcode",
> 					   "there are %d opcodes",
> 					   vinsn->num_slots),
> 		       vinsn->num_slots) == -1)
> 	as_fatal ("%s", xstrerror (errno));
> 
>      as_bad (_("%s but %s"), phrase1, phrase2);
>      free (phrase1);
>      free (phrase2);
> 
> This would give a translator the following to work with:
> 
> msgid "format '%s' allows %d slot,"
> msgid_plural "format '%s' allows %d slots,"
> msgstr[0] ""
> msgstr[1] ""
> 
> msgid "there is %d opcode"
> msgid_plural "there are %d opcodes"
> msgstr[0] ""
> msgstr[1] ""
> 
> msgid "%s but %s"
> msgstr ""
> 
> The patch I posted gives:
> 
> msgid "allows %d slot"
> msgid_plural "allows %d slots"
> msgstr[0] ""
> msgstr[1] ""
> 
> msgid "there is %d opcode"
> msgid_plural "there are %d opcodes"
> msgstr[0] ""
> msgstr[1] ""
> 
> msgid "format '%s' %s, but %s"
> msgstr ""
> 
> Note that in both cases a translator does in fact have access to the
> entire sentence, but with some restrictions.  In both cases the
> "slots" phrase translation can't depend on the quantity in the
> "opcodes" phrase translation, and vice versa.  Bruno's suggestion has
> a further restriction in that the translation for "format" must be
> adjacent to the "slots" translation.  So, abbreviating F for format, S
> for slots, O for opcodes components, a translator could arrange to
> emit FSO, SFO, OFS, OSF, but not FOS or SOF.
> 
> The patch I posted allows all the ordering possibilities, but the
> translation for "format" can't depend on the "slots" quantity, and a
> translator has a little more difficulty in piecing together the
> sentence by just looking at the .pot file.
> 
> There is also the issue of other messages that may share "%s but %s"
> construction in the future.  If such exist then that is another
> complication for anyone wanting to reorder phrases, and a reason why
> it may be better to put "format" with "but".
> 
> I think that covers all the issues I've considered.  I'm not a
> linguist, and besides English only know a little German.  So I'm quite
> happy to take advice, Bruno.  The only reason to extend this thread
> was wondering whether you had considered everything, and to make
> other binutils developers aware of the problems they cause!  I know
> there is room for a lot of improvement in the binutils source
> regarding translation, not least being the fact that ld's einfo
> function doesn't allow reordering.

For this specific case the technique you suggested may be ok.  I'm worried, though, that there may be other languages where the left and right sides interact, or where the word for "but" isn't constant.  For those, a cleaner answer may be to move the "but" part into the second phrase.

I was thinking about one possible German wording: "..., es gibt aber ..." ("but there is/are" -- with the "but" appearing after the other words).  That's ok in German, where "es gibt" is the same for singular and plural.  But the analogous Dutch word pattern would be "er is echter" for singular and "er zijn echter" for plural, so the word for "but" ends up inside the part that changes with the number.  There are other ways to translate the phrase that don't run into this, but it illustrates the issue.  If you make the "but" part of the second message then this issue goes away.
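To make that concrete, here is a minimal standalone sketch of what I mean, not a tested patch.  The helper name format_slot_mismatch, the fixed buffer sizes, and the "%s %s" joining message are all made up for illustration; real gas code would keep using as_bad and the xtensa_format_* calls from your version.  The only point is that "but" now lives inside the second ngettext message, so a translator can place it relative to the number-dependent verb.

```c
#include <libintl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper for illustration only; not binutils code.
   Builds the diagnostic from two separately pluralized phrases, with
   "but" carried inside the second phrase so that a translation such
   as Dutch "er is echter" / "er zijn echter" can be written.
   Returns what snprintf returns for the final message.  */
int
format_slot_mismatch (char *buf, size_t size, const char *format_name,
                      int slots, int opcodes)
{
  char phrase1[128];
  char phrase2[128];

  /* First phrase: plural form selected by the slot count.  */
  snprintf (phrase1, sizeof phrase1,
            ngettext ("format '%s' allows %d slot,",
                      "format '%s' allows %d slots,",
                      slots),
            format_name, slots);

  /* Second phrase: plural form selected by the opcode count, and
     "but" is part of the translatable text.  */
  snprintf (phrase2, sizeof phrase2,
            ngettext ("but there is %d opcode",
                      "but there are %d opcodes",
                      opcodes),
            opcodes);

  /* Assumed joining message; it only lets a translator reorder the
     two phrases (for example with "%2$s %1$s").  */
  return snprintf (buf, size, gettext ("%s %s"), phrase1, phrase2);
}
```

With no translation catalog loaded, calling this with slots = 1 and opcodes = 3 just gives back the English text.  The cost is that the joining message is now a bare "%s %s", which makes the concern you raised about other messages sharing the same msgid worse rather than better, so this is a tradeoff and not a free win.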

	paul

