Introduction to File Formats
Transifex supports over 25 localization file formats. You can learn more about each one by clicking on the respective link in the left menu.
If you use a file format that's not in the list, let us know and we may be able to help you.
For most file formats, Transifex uses UTF-8 encoding. This means files you upload to Transifex must be encoded in UTF-8, and all files downloaded from Transifex will be in UTF-8. There are some exceptions where a file format specification requires a different encoding. For example, Java .properties files use ISO-8859-1 and Apple's .strings files use UTF-16.
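As a rough illustration of the encoding rule above, the following Python sketch picks the expected encoding from a file's extension and checks that the raw bytes decode cleanly before upload. The extension-to-encoding table contains only the two exceptions named in this section, and the function names are hypothetical; this is not part of any Transifex tool.

```python
from pathlib import Path

# Exceptions named in the text above; every other format is assumed to be UTF-8.
FORMAT_ENCODINGS = {
    ".properties": "iso-8859-1",  # Java .properties
    ".strings": "utf-16",         # Apple .strings
}


def expected_encoding(filename: str) -> str:
    """Return the encoding a file is expected to use, based on its extension."""
    return FORMAT_ENCODINGS.get(Path(filename).suffix.lower(), "utf-8")


def decodes_cleanly(filename: str, raw: bytes) -> bool:
    """Check that the raw bytes decode in the expected encoding.

    Note: ISO-8859-1 accepts any byte sequence, so this check can only
    catch problems for the UTF-8 and UTF-16 cases.
    """
    try:
        raw.decode(expected_encoding(filename))
    except UnicodeDecodeError:
        return False
    return True
```

For example, decodes_cleanly("messages.po", data) returns False when data was saved as ISO-8859-1 and contains accented characters, which is a common cause of rejected uploads.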
For every file you have, you can download a different version depending on your needs. These different versions are called modes. They come in handy in a number of different situations. For example, a developer may want to only use reviewed translations. Or a translator may want to translate a file offline using their own desktop tool.
These are the available modes when using the Client or the API:
- reviewed: the file will include reviewed strings in the target language. All other strings will either be empty, or be in the source language (this varies depending on the file format).
- translator: the file will be suitable for offline translation of the resource.
- developer: the file will be suitable for usage by developers in their source code tree.
Some file formats will take untranslated entries and fill them in with source strings. So if you use one of the modes mentioned above, you'll get a file where the untranslated strings won't be empty. There are several modes that you can use to bypass this rule. They're available when using the API.
- onlytranslated: the response will include the translated strings and the untranslated ones will be returned empty.
- onlyreviewed: the response will include the reviewed strings and the rest (translated or not) will be returned empty.
These two modes apply only to the following file formats: Apple strings, Chrome I18N, HTML, Java Properties, Joomla (ini), JSON Key-Value, Microsoft Word (alpha), Mozilla properties, Plain Text (txt), RequireJS format, Windows JSON (resjson), and GETTEXT (PO).
- sourceastranslation: the response will include the translated strings and the untranslated ones will be filled with the corresponding source strings.
This mode is only supported for the following formats: Android, KEYVALUEJSON, PO, Properties (Java Properties), QT, SRT, Stringsdict, and YML.
For all other file formats, use the translator mode to get a file that leaves the untranslated entries empty. For some formats, such as Java .properties, the downloaded file will have the source string in a comment so the translator can translate the file faster.
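The difference between the onlytranslated, onlyreviewed, and sourceastranslation modes can be modeled with a small Python sketch. The Entry class and the render function are hypothetical names used only to illustrate the behavior described above; they do not reflect how Transifex implements these modes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Entry:
    source: str                        # string in the source language
    translation: Optional[str] = None  # None means untranslated
    reviewed: bool = False


def render(entry: Entry, mode: str) -> str:
    """Return the text that would appear in the downloaded file for one entry."""
    if mode == "onlytranslated":
        # Translated strings are kept; untranslated ones are returned empty.
        return entry.translation or ""
    if mode == "onlyreviewed":
        # Only reviewed strings are kept; everything else is returned empty.
        return entry.translation if (entry.reviewed and entry.translation) else ""
    if mode == "sourceastranslation":
        # Untranslated strings are filled with the source string.
        return entry.translation or entry.source
    raise ValueError(f"unknown mode: {mode!r}")
```

With an untranslated entry Entry("Hello"), render returns an empty string for onlytranslated and onlyreviewed, and "Hello" for sourceastranslation.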
When using the command-line client, you can specify the mode of the file you want to download with the --mode option of the tx pull command.
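If you drive the client from a script, a minimal Python sketch might look like the following. It uses only the tx pull command and the --mode option mentioned above. The helper names are hypothetical, and restricting the accepted values to the three modes listed for the Client is an assumption of this sketch, not a documented limit.

```python
import subprocess
from typing import List

# Modes this page lists as available when using the Client or the API.
CLIENT_MODES = ("reviewed", "translator", "developer")


def build_pull_command(mode: str) -> List[str]:
    """Build the argument list for `tx pull --mode <mode>`."""
    if mode not in CLIENT_MODES:
        raise ValueError(f"unsupported mode: {mode!r}")
    return ["tx", "pull", "--mode", mode]


def pull(mode: str) -> None:
    """Run the command. Requires an installed and configured Transifex client."""
    subprocess.run(build_pull_command(mode), check=True)
```

Calling pull("reviewed") from a configured project directory would then download files in the reviewed mode.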
In the web interface, choose to Download for use, Download only reviewed translations, or Download for translation.
When Transifex encounters a pluralized entry in your file, it associates all the plural forms together. This way, Transifex can present the translator with all the forms in the Editor. At the same time, Transifex knows how many plural forms each language has and asks the translator to translate all of them. Until the translator translates all the plural forms in the selected language, they won't be able to save their work.
Using variables in plurals
Don't use "1 car" and "%s cars" in your code. Include the variable in all cases, i.e. use this: "%s car" and "%s cars".
Why? English uses two plural rules: "one" ("1 car") and "other" ("0, 2, 10 cars"). Other languages, however, group the cases differently. For example, in Portuguese, it's "0, 1 car" and "2, 10 cars". So, when coding, make sure you always include the variable denoting the number of objects in the string.
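The following Python sketch shows why the variable has to appear in every plural form. It hard-codes the two rules described above for English and Portuguese. The function names and the Portuguese strings are illustrative assumptions; a real application would rely on its i18n library for plural selection.

```python
def plural_category(lang: str, n: int) -> str:
    """Pick the plural category for n, using the rules described in the text."""
    if lang == "en":
        return "one" if n == 1 else "other"
    if lang == "pt":
        # Portuguese groups 0 and 1 together.
        return "one" if n in (0, 1) else "other"
    raise ValueError(f"no plural rule for language: {lang!r}")


# Every form contains the %s variable, including the "one" form.
CAR_STRINGS = {
    "en": {"one": "%s car", "other": "%s cars"},
    "pt": {"one": "%s carro", "other": "%s carros"},
}


def format_cars(lang: str, n: int) -> str:
    """Return the localized string for n cars."""
    return CAR_STRINGS[lang][plural_category(lang, n)] % n
```

Because the "one" form in Portuguese covers both 0 and 1, a hard-coded "1 carro" would be wrong for zero cars, whereas "%s carro" produces the right number in both cases.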
Below you can find the metadata that is supported per file format. File formats that do not support any of these metadata are not included in the table.
File formats and Metadata table
Every project in Transifex has a set of resources. Each resource corresponds to a source file. If, for example, you have a project with two files for translation, foo.pot and bar.pot, you will need to create two separate resources (e.g. foo and bar) and map each one to the corresponding translation file.
Every project in Transifex is associated with a source language, which is the source language of your files.
When you upload the source file, Transifex will extract the source strings using a parser suitable for your i18n format and store them as the translations of the source language. Then, by using the web editor or by uploading a file, you can translate these strings into other languages (target languages).
When you import a source file into Transifex, the file is saved and used as a template for all automatically generated translation files that users later download. However, some parsers do not guarantee that the downloaded translation files will be the same as the source file with respect to any metadata present.
There are three internal structures in Transifex which are of interest: Source Entities, Translations, and Template files.
Each resource inherits the source language of the project it belongs to, which can be other than English. In the following examples, however, we will use English as the source language.
At the core of the Transifex translation storage engine are Source Entities. These are representations of actual translatable objects, together with their metadata, like comments.
For instance, for gettext PO files, a source entity corresponds to a msgid entry together with all the metadata it carries (context, occurrences, comments, etc.).
Each entity has Translations to a number of languages — including the source language. There is a 1-1 relationship between an entity and a translation in a particular language. So, if you upload a fresh POT file with one msgid inside, you will end up with one source entity and one translation for English.
When you upload a source file, Transifex internally will remove the source strings and replace them with a hash. Transifex will use that hash, when you download the file, to insert the correct translations for the requested language. In essence, you're downloading the English file with the English content of it replaced with French content.
Here are the steps taken, when you create a new resource by uploading a new source language file:
- Identify the translatable entities/segments in the file.
- For each one, create a new source entity in Transifex, if one does not yet exist.
- Most source files include the string in the source language (in the case of Gettext POT, this is the msgid content). Take this string and store it in the database as the translation for the source language, if needed.
- Replace this string with a hash, in order to mark the place of the source entity and be able to replace it with the correct translation when exporting the translation file.
The above architecture means that all translated versions of the file preserve the comments and non-translatable parts of the file.
When you upload a translation file, e.g. a French file, Transifex goes through all translations, locates the respective Source Entity for each one, and updates its translation. Then the file is deleted. This means that any content which may not be supported by Transifex (such as arbitrary comments in random locations in the file) might not be preserved.
When you request to download the file in a specific language, for example French, Transifex will take the template file and substitute all hashes with the French translations. Transifex can export different files based on the mode that you specify.
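The template mechanism described in this section can be sketched in Python for a simple key=value file. Everything here is an illustrative assumption: the placeholder scheme (an MD5 of the key plus a suffix), the function names, and the file format handling are hypothetical, and the actual hashing used by Transifex is not specified on this page.

```python
import hashlib
from typing import Dict, Tuple


def entity_hash(key: str) -> str:
    """Hypothetical placeholder that marks a source entity in the template."""
    return hashlib.md5(key.encode("utf-8")).hexdigest() + "_tr"


def import_source(content: str) -> Tuple[str, Dict[str, str]]:
    """Split a key=value source file into a template and its source strings."""
    template_lines = []
    strings = {}
    for line in content.splitlines():
        if "=" in line and not line.lstrip().startswith("#"):
            key, value = line.split("=", 1)
            placeholder = entity_hash(key.strip())
            strings[placeholder] = value.strip()  # stored as the source-language translation
            template_lines.append(f"{key}={placeholder}")
        else:
            # Comments and other non-translatable lines stay in the template.
            template_lines.append(line)
    return "\n".join(template_lines), strings


def export(template: str, translations: Dict[str, str]) -> str:
    """Substitute every placeholder in the template with a translation."""
    out = template
    for placeholder, text in translations.items():
        out = out.replace(placeholder, text)
    return out
```

Exporting the template with a dictionary of French strings keyed by the same placeholders yields the original file with its comments intact and only the values replaced, which mirrors the "English file with French content" description above.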