-
Notifications
You must be signed in to change notification settings - Fork 3
/
Copy pathREADME.Rmd
119 lines (78 loc) · 5.64 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
[![Project Status: Active - The project has reached a stable, usable state and is being actively developed.](http://www.repostatus.org/badges/0.1.0/active.svg)](http://www.repostatus.org/#active)
[![Travis-CI Build Status](https://travis-ci.org/RL10N/poio.svg?branch=master)](https://travis-ci.org/RL10N/poio)
# **poio**: Input/Output Functionality for "PO" and "POT" Message Translation Files
R packages use a text file format with a `.po` extension to store translations of messages, warnings, and errors. **poio** provides functionality to read in these files, fix the metadata, and write the objects back to file.
## Installation
To install the development version, you first need the *devtools* package.
```{r, eval = FALSE}
install.packages("devtools")
```
Then you can install the **poio** package using
```{r, eval = FALSE}
devtools::install_bitbucket("RL10N/poio")
```
## Functions
`read_po` reads PO and POT files into R, and stores them as an object of class `"po"` (see below for details).
`fix_metadata` fixes the metadata in a `po` object.
`generate_po_from_pot` generates a PO object from a POT object.
`write_po` writes `po` objects back to a PO file.
`get_n_plural_forms` is a convenience function for retriving the number of plural forms for a language from the `po` object's metadata.
## Datasets
`language_codes` is a list of all the language codes and the country codes that `gettext` understands.
`plural_forms` is a data.frame of the plural forms header string for over 140 common languages.
## Examples
A typical workflow begins by generating a POT master translation file for a package using `tools::xgettext2pot`. In this case, we'll use a sample file stored in the **poio** package. The contents look like this:
```{r}
library(poio)
pot_file <- system.file("extdata/R-summerof69.pot", package = "poio")
cat(readLines(pot_file), sep = "\n")
```
To import the file, use `read_po`. A description on the object's structure is shown in the "PO Objects" section below.
```{r}
(pot <- read_po(pot_file))
```
`tools::xgettext2pot` makes a mess of some of the metadata element that it generates, so they need fixing. `fix_metadata` auto-guesses sensible options, but you can manually set values if you prefer.
```{r}
pot_fixed <- fix_metadata(pot, "Language-Team" = "Team RL10N!")
pot_fixed$metadata
```
Now you need to choose some languages to translate your messages into. Suitable language codes can be found in the `language_codes` dataset included in the package.
```{r}
data(language_codes)
str(language_codes, vec.len = 8)
```
Then, for each language that you want to create a translation for, generate a `po` object and write it to file. If your current working directory is the root of your package, the correct file name is automatically generated.
```{r, eval = FALSE}
for(lang in c("de", "fr_BE"))
{
po <- generate_po_from_pot(pot, lang)
write_po(po)
}
```
## PO Objects
`po` objects are lists with class `"po"` (to allow S3 methods), containing the following elements:
- *source_type*: A string. Either `"r"` or `"c"`, depending upon whether the messages originated from R-level code, or C-level code.
- *file_type*: Either `"po"` or `"pot"`, depending upon whether the messages originated from a PO (language-specific) or POT (master translation) file. Determined from the file name.
- *initial_comments*: A `character` vector of comments added by the translator.
- *metadata*: A `data_frame` of file metadata with columns "name" and "value".
- *direct*: A `data_frame` of messages with a direct translation, as created by `stop`,
`warning`, `message` or`gettext`; its columns are described below.
- *countable*: A `data_frame`of messages where the translation depends upon a countable value, as created by `ngettext`; its columns are described below.
The `direct` element of the `po` object has the following columns.
- *msgid*: Character. The untranslated (should be American English) message.
- *msgstr*: Character. The translated message, or empty strings in the case of POT files.
- *is_obsolete*: Logical. Is the message obsolete?
- *msgctxt*: List of character. Disambiguating context information to allow multiple messages with the same ID.
- *translator_comments*: List of character. Comments added by the translator, typically to explain unclear messages, or why translation choices were made.
- *source_reference_comments*: List of character. Links to where the message occured in the source, in the form "filename:line".
- *flags_comments*: List of character. Typically used to describe formatting directives. R uses C-style formatting, which would imply a `"c-format"` flag. For example `%d` denotes an integer, and `%s` denotes a string. `"fuzzy"` flags can appear when PO files are merged.
- *previous_string_comment*: List of character. When PO files are merged with an updated POT file ,and a fuzzy flag is generated, the old msgid is stored in a previous string comment.
The `countable` element of the `po` object takes the same form as the `direct` element, with two differences.
- *msgid_plural*: Character. The plural form of the untranslated message.
- *msgstr*: This is now a list of character (rather than character.)
## See Also
The [*msgtools*](http://github.com/RL10N/msgtools) package, which has higher level tools for working with messages and translations.
The Pology python library has some useful [*documentation on the PO file format*](http://pology.nedohodnik.net/doc/user/en_US/ch-poformat.html).
The GNU [*gettext*](https://www.gnu.org/software/gettext/manual/html_node/index.html) utility.
## Acknowledgements
This package was developed as part of the [*RL10N*](https://rl10n.github.io) project, funded by the [R Consortium](https://www.r-consortium.org/).