Working with Plurals and Genders

Transifex supports a subset of ICU MessageFormat, namely plural rules, through the JSON file format. If your content is encoded in ICU, you can use the JSON file format to import it into Transifex and export it again.
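To make this concrete, here is a minimal sketch of what an ICU plural message stored as a JSON value can look like. The key name `trial_days` and the wording are made up for illustration; only the `{variable, plural, one {...} other {...}}` shape comes from ICU MessageFormat itself.

```python
import json

# Hypothetical key and wording, chosen only for illustration.
source = {
    "trial_days": "{count, plural, one {Your trial ends in # day} other {Your trial ends in # days}}"
}

# Serialize the way a JSON resource file would be written to disk.
payload = json.dumps(source, ensure_ascii=False, indent=2)
print(payload)

# The ICU syntax is just an ordinary JSON string value, so it round-trips unchanged.
restored = json.loads(payload)
```

The whole plural rule lives inside a single string, which is why the file stays valid JSON while still carrying both plural forms.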

The most important and frequently used ICU features, such as plurals, context, and developer comments, are also available in other file formats in a non-ICU way. Which of these features you can access depends on the file format you use.

PO files are a good way to work around this. PO is one of the most popular file formats for localization and is supported by many frameworks. It's quite powerful, and we have good support for it.

PO files can support context and also developer comments, if you want to bring that info straight from the source file. If, in addition to plurals, you have genders, you can use the context field (msgctxt) to indicate the gender. You can also use the developer comment (#.) to explicitly tell the translator how to translate the string.

Here is an example, showing the same source string but using the context to save two different entities in Transifex.

#. Please translate it as a MALE gender  
#: src/main.py:338  
msgctxt "male"  
msgid "Cousin"  
msgstr ""  

#. Please translate it as a FEMALE gender  
#: src/main.py:340  
msgctxt "female"  
msgid "Cousin"  
msgstr ""
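
The reason the context produces two separate entities is that a string is identified by its context together with its source text, not by the source text alone. The following is a minimal sketch of that lookup, not how Transifex or gettext implement it internally. The `CATALOG` dictionary, the `translate` helper, and the Spanish translations are assumptions made for illustration.

```python
# Illustrative catalog: the keys mirror the two PO entries above,
# the Spanish translations are examples only.
CATALOG = {
    ("male", "Cousin"): "Primo",
    ("female", "Cousin"): "Prima",
}


def translate(msgctxt: str, msgid: str) -> str:
    """Return the translation for (msgctxt, msgid), or the source string if missing."""
    return CATALOG.get((msgctxt, msgid), msgid)


print(translate("male", "Cousin"))
print(translate("female", "Cousin"))
```

In real Python code, `gettext.pgettext(context, message)` (available since Python 3.8) performs this kind of context-aware lookup against a compiled catalog.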

Pluralized strings can also be defined in a PO file as follows:

#: src/main.py:340  
msgid "plural"  
msgid_plural "plurals"  
msgstr[0] "1 plural"  
msgstr[1] "%d plurals"
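
On the code side, a pluralized entry like this is usually requested with an `ngettext`-style call that passes both source forms and the count. Here is a small sketch using Python's standard `gettext` module. It assumes source strings that contain a `%d` placeholder (unlike the bare "plural"/"plurals" above) and uses `NullTranslations`, which has no catalog, so it only demonstrates the fallback to the source strings.

```python
import gettext

# NullTranslations has no catalog, so ngettext falls back to the source
# strings: the singular when n == 1, the plural otherwise.
translations = gettext.NullTranslations()


def describe(n: int) -> str:
    """Pick the right source form for n and fill in the count."""
    template = translations.ngettext("%d plural", "%d plurals", n)
    return template % n


for count in (0, 1, 3):
    print(describe(count))
```

With a real compiled catalog loaded through `gettext.translation(...)`, the same call would pick among the `msgstr[n]` entries using the plural rule declared in the PO file header.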

How pluralized strings are handled by Transifex

The number of plural forms for each language is automatically identified and displayed in the Transifex editor. To submit a translation for a pluralized string, a translator has to translate all the available plural forms. Otherwise, the save button won't be enabled and the translation won't be submitted.

As an example, English has 2 plural forms ("one" and "other"), so the source file will have two phrases, one for each plural form. Russian, on the other hand, has 4 plural forms. Once the translations are done, the translation file you download from Transifex will look as follows.

#: src/main.py:340
msgid "plural"
msgid_plural "plurals"
msgstr[0] "Ваш суд заканчивается в 1 день"
msgstr[1] "Ваш суд заканчивается в %d дней"
msgstr[2] "Ваш суд заканчивается в %d дней"
msgstr[3] "Ваш суд заканчивается в %d дней"

Transifex follows ISO standards and Unicode CLDR data for the supported languages, as you can read in our documentation.

You can also take a look at the rules we follow for defining the plural forms here.
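To illustrate where the four Russian forms come from, here is a sketch of the CLDR plural categories for Russian. The function name is hypothetical and the handling of non-integers is simplified: CLDR decides on visible fraction digits, while this sketch simply sends anything that is not an `int` to "other". It does not claim to show which `msgstr[n]` index Transifex assigns to each category.

```python
def russian_plural_category(n) -> str:
    """Return the CLDR plural category for n in Russian (simplified sketch)."""
    # Simplification: treat every non-integer value as "other".
    if not isinstance(n, int):
        return "other"
    n = abs(n)
    mod10, mod100 = n % 10, n % 100
    # 1, 21, 31, ... but not 11
    if mod10 == 1 and mod100 != 11:
        return "one"
    # 2-4, 22-24, ... but not 12-14
    if 2 <= mod10 <= 4 and not 12 <= mod100 <= 14:
        return "few"
    # 0, 5-20, 25-30, ...
    return "many"


for value in (1, 2, 5, 11, 21, 1.5):
    print(value, russian_plural_category(value))
```

Note that "one" covers more than the number 1 (21, 31, and so on), which is why every form normally keeps its numeric placeholder.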

Leveraging TM with Pluralized Strings

Transifex's Translation Memory (TM) functionality supports plural forms natively. Here is one way to use the TM with plural-enabled entities.

  • Open the Transifex Editor, click on the Search box and filter source strings by “pluralized” to only see the phrases that are marked as pluralized.

  • For each phrase you select, you will be able to see its various plural forms and toggle between them, such as "One" and "Other". On the right side of the translation box, you'll be able to see the TM suggestions for this entity's plural form.

  • To use a TM suggestion, click “Use this”, which will copy the TM entry into the translation box. You can now save the translation by clicking "Save".

Natively supporting ICU

We love the concept and idea behind ICU. It's a powerful framework when you have very advanced i18n needs. It's also a neat way to store the actual phrases in the engine itself, instead of having multiple fields. It's more fluid and dynamic, and easier to work with as a developer.

So why don't we support ICU natively, and suggest workarounds instead? Here are some of our thoughts on this.

ICU really affects a very small percentage of users and phrases. Very few of our users' phrases are pluralized (e.g., you rarely see plurals in marketing and user-generated content). Of those few users, very few also need to support male and female versions. And of those, fewer than half are actually willing to put in the effort to support it in their code.

ICU is also quite a complex framework. The phrase itself is no longer plain English; it has variables and logic in it. That's a fundamental difference: it's like replacing your house's bricks with bricks that can grow in size at the push of a button. The change affects every single piece of a localization system like Transifex. Some examples are the Translation Memory (serializing entries when updating a TM of billions of entries is no joke!) and the word count across the platform.

Finally, as outlined on this page, you can support the most important ICU features with Transifex already, such as plurals.

Properly supporting ICU would mean an overhaul of many core components in Transifex. This, in combination with the above, makes us want to postpone it until more customers ask for it.