Boost.Locale
Localized Text Formatting

The iostream manipulators are very useful, but when we create messages for the user, sometimes we need something like good old printf or boost::format.

Unfortunately boost::format has several limitations in the context of localization:

  1. It renders all parameters using the global locale rather than the target ostream's locale. For example:
    std::locale::global(std::locale("en_US.UTF-8"));
    output.imbue(std::locale("de_DE.UTF-8"));
    output << boost::format("%1%") % 1234.345;

    This would write the number with "en_US" separators, such as "1,234.345", instead of the "1.234,345" expected for the "de_DE" locale.
  2. It knows nothing about the Boost.Locale manipulators.
  3. The printf-like syntax is too limited for formatting complex localized data: it cannot format dates, times, or currencies.
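
The first limitation can be illustrated with the standard library alone. The following is a minimal sketch, not boost::format's actual implementation: it assumes a formatter that renders through a private temporary stream, and it uses a hypothetical facet, german_punct, so that the example does not depend on which system locales are installed. The function names are also made up for illustration.

```cpp
#include <locale>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical facet for illustration: German-style separators, defined
// here so the sketch does not depend on installed system locales.
struct german_punct : std::numpunct<char> {
protected:
    char do_decimal_point() const override { return ','; }
    char do_thousands_sep() const override { return '.'; }
    std::string do_grouping() const override { return "\3"; }
};

// Renders through a private stream and ignores the target stream:
// the temporary is constructed with a copy of the global locale.
std::string render_with_global_locale(double value) {
    std::ostringstream tmp;
    tmp << std::fixed;
    tmp.precision(3);
    tmp << value;
    return tmp.str();
}

// Renders through a private stream that first copies the target's locale.
std::string render_with_stream_locale(const std::ostream& target, double value) {
    std::ostringstream tmp;
    tmp.imbue(target.getloc());
    tmp << std::fixed;
    tmp.precision(3);
    tmp << value;
    return tmp.str();
}
```

With the global locale left as the classic "C" locale and a target stream imbued with std::locale(std::locale::classic(), new german_punct), the first function ignores the target's separators while the second one honors them.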

Thus a new class, boost::locale::format, is introduced. For example:

wcout << wformat(L"Today {1,date} I would meet {2} at home") % time(0) % name << endl;

Each format specifier is enclosed in braces {}. Its parameters are separated by commas ",", and each parameter may take an optional value after an equals sign '='. The value may be plain ASCII text or single-quoted localized text. To insert a single quote within quoted text, write a pair of single-quote characters.

Here is an example of a format string:

    "Ms. {1} had arrived at {2,ftime='%I o''clock'} at home. The exact time is {2,time=full}"

The syntax is described by the following grammar:

    format : '{' parameters '}' ;
    parameters : parameter | parameter ',' parameters ;
    parameter : key ["=" value] ;
    key : [0-9a-zA-Z<>]+ ;
    value : ascii-string-excluding-"}"-and-"," | local-string ;
    local-string : quoted-text | quoted-text local-string;
    quoted-text : '[^']*' ;
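
To make the grammar concrete, here is a minimal sketch of a parser for the text between '{' and '}'. It is an illustration written against the grammar above, not the parser used inside Boost.Locale; the names parameter, is_key_char, and parse_placeholder are hypothetical. Adjacent quoted-text runs are joined with a literal single quote, which is how a doubled quote ends up as one quote character.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>
#include <vector>

using parameter = std::pair<std::string, std::string>;  // key, value (may be empty)

// Characters the grammar allows in a key: [0-9a-zA-Z<>].
bool is_key_char(char c) {
    return (c >= '0' && c <= '9') || (c >= 'a' && c <= 'z') ||
           (c >= 'A' && c <= 'Z') || c == '<' || c == '>';
}

// Parses the text between '{' and '}' into key/value pairs.
std::vector<parameter> parse_placeholder(const std::string& body) {
    std::vector<parameter> result;
    std::size_t i = 0;
    while (true) {
        // key : [0-9a-zA-Z<>]+
        std::string key;
        while (i < body.size() && is_key_char(body[i])) key += body[i++];
        if (key.empty()) throw std::invalid_argument("expected a key");

        std::string value;
        if (i < body.size() && body[i] == '=') {
            ++i;
            if (i < body.size() && body[i] == '\'') {
                // local-string: one or more adjacent quoted-text runs.
                // The boundary between two runs ('') yields a literal quote.
                bool first = true;
                while (i < body.size() && body[i] == '\'') {
                    if (!first) value += '\'';
                    first = false;
                    ++i;  // skip the opening quote
                    while (i < body.size() && body[i] != '\'') value += body[i++];
                    if (i == body.size()) throw std::invalid_argument("unterminated quote");
                    ++i;  // skip the closing quote
                }
            } else {
                // Plain value: everything up to the next ','.
                while (i < body.size() && body[i] != ',') value += body[i++];
            }
        }
        result.emplace_back(key, value);
        if (i == body.size()) break;
        if (body[i] != ',') throw std::invalid_argument("expected ','");
        ++i;
    }
    return result;
}
```

For instance, parse_placeholder("2,ftime='%I o''clock'") yields the pairs ("2", "") and ("ftime", "%I o'clock").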

You can include a literal '{' or '}' in the text by doubling it: "{{" or "}}".

cout << format(translate("Unexpected `{{' in line {1} in file {2}")) % pos % file;

This would display something like:

Unexpected `{' in line 5 in file source.cpp
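
The escaping rule can be sketched with a small stand-alone function. This is a simplified illustration, not Boost.Locale code: the function name expand is hypothetical, it substitutes arguments by index only, ignores any options after the index, and does not handle a '}' inside quoted text. Skipping out-of-range indexes is a choice made for this sketch.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Replaces "{N}" (1-based) with args[N-1] and collapses "{{" and "}}" into
// single braces. Options after the index are ignored in this sketch.
std::string expand(const std::string& fmt, const std::vector<std::string>& args) {
    std::string out;
    std::size_t i = 0;
    while (i < fmt.size()) {
        const char c = fmt[i];
        if ((c == '{' || c == '}') && i + 1 < fmt.size() && fmt[i + 1] == c) {
            out += c;  // doubled brace: emit one literal brace
            i += 2;
        } else if (c == '{') {
            const std::size_t close = fmt.find('}', i);
            if (close == std::string::npos) {
                out.append(fmt, i, std::string::npos);  // no closing brace: copy as-is
                break;
            }
            // Read the leading digits as the argument index.
            std::size_t index = 0;
            std::size_t j = i + 1;
            while (j < close && fmt[j] >= '0' && fmt[j] <= '9') {
                index = index * 10 + static_cast<std::size_t>(fmt[j] - '0');
                ++j;
            }
            if (index >= 1 && index <= args.size()) out += args[index - 1];
            i = close + 1;
        } else {
            out += c;
            ++i;
        }
    }
    return out;
}
```

Calling expand("Unexpected `{{' in line {1} in file {2}", {"5", "source.cpp"}) reproduces the output shown above.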

The following format key-value pairs are supported:

  • [0-9]+ – digits, the index of the formatted parameter – required.
  • num or number – format a number. Options are:
    • hex – display in hexadecimal format
    • oct – display in octal format
    • sci or scientific – display in scientific format
    • fix or fixed – display in fixed format
      For example, number=sci
  • cur or currency – format currency. Options are:
    • iso – display using ISO currency symbol.
    • nat or national – display using national currency symbol.
  • per or percent – format a percentage value.
  • date, time, datetime or dt – format a date, a time, or a date and time. Options are:
    • s or short – display in short format.
    • m or medium – display in medium format.
    • l or long – display in long format.
    • f or full – display in full format.
  • ftime with string (quoted) parameter – display as with strftime. See as::ftime manipulator.
  • spell or spellout – spell the number.
  • ord or ordinal – format an ordinal number (1st, 2nd, etc.).
  • left or < – align-left.
  • right or > – align-right.
  • width or w – set field width (requires parameter).
  • precision or p – set precision (requires parameter).
  • locale – with parameter – switch locales for the current operation. This command generates a locale with formatting facets, giving more fine-grained control of formatting. For example:
    cout << format("This article was published at {1,date=l} (Gregorian) {1,locale=he_IL@calendar=hebrew,date=l} (Hebrew)") % date;
  • timezone or tz – the name of the timezone to display the time in. For example:
    cout << format("Time is: Local {1,time}, ({1,time,tz=EET} Eastern European Time)") % date;
  • local – display the time in local time.
  • gmt – display the time in the UTC time scale. For example:
    cout << format("Local time is: {1,time,local}, universal time is {1,time,gmt}") % time;
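
As a rough analogy for how some of these keys relate to stream state, the sketch below maps the number, alignment, width, and precision keys onto standard iostream flags. It is an assumption-laden illustration only: Boost.Locale applies its own formatting machinery, and the function name apply_option is hypothetical. Keys such as date, currency, or locale have no standard iostream equivalent and are reported as unsupported here.

```cpp
#include <ios>
#include <ostream>
#include <sstream>
#include <string>

// Applies one key/value pair to a stream using standard iostream state.
// Returns false for keys or values this sketch does not cover.
bool apply_option(std::ostream& os, const std::string& key, const std::string& value) {
    using std::ios_base;
    if (key == "num" || key == "number") {
        if (value == "hex") os.setf(ios_base::hex, ios_base::basefield);
        else if (value == "oct") os.setf(ios_base::oct, ios_base::basefield);
        else if (value == "sci" || value == "scientific")
            os.setf(ios_base::scientific, ios_base::floatfield);
        else if (value == "fix" || value == "fixed")
            os.setf(ios_base::fixed, ios_base::floatfield);
        else if (!value.empty()) return false;
        return true;
    }
    if (key == "left" || key == "<") {
        os.setf(ios_base::left, ios_base::adjustfield);
        return true;
    }
    if (key == "right" || key == ">") {
        os.setf(ios_base::right, ios_base::adjustfield);
        return true;
    }
    if (key == "width" || key == "w") {
        os.width(std::stoi(value));  // requires a numeric parameter
        return true;
    }
    if (key == "precision" || key == "p") {
        os.precision(std::stoi(value));  // requires a numeric parameter
        return true;
    }
    return false;
}
```

For example, applying ("num", "fix") and then ("p", "2") to a std::ostringstream before writing 3.14159 produces a value with two digits after the decimal point.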

The constructor for the format class can take an object of type message, simplifying integration with message translation code.

For example:

cout<< format(translate("Adding {1} to {2}, we get {3}")) % a % b % (a+b) << endl;

A formatted string can be fetched directly by using the str(std::locale const &loc=std::locale()) member function. For example:

std::wstring de = (wformat(translate("Adding {1} to {2}, we get {3}")) % a % b % (a+b)).str(de_locale);
std::wstring fr = (wformat(translate("Adding {1} to {2}, we get {3}")) % a % b % (a+b)).str(fr_locale);

Note
There is one significant difference between boost::format and boost::locale::format: Boost.Locale's format converts its parameters only when it is written to an ostream or when the str() member function is called. Until then, it stores only references to the parameter objects, which must be writable to a stream.

This is generally not a problem when all operations are done in one statement, such as:

cout << format("Adding {1} to {2}, we get {3}") % a % b % (a+b);

This works because the temporary value of (a+b) exists until the formatted data is actually written to the stream. However, the following code is wrong:

format fmt("Adding {1} to {2}, we get {3}");
fmt % a;
fmt % b;
fmt % (a+b);
cout << fmt;

It fails because the temporary value of (a+b) no longer exists when fmt is written to the stream. A correct solution would be:

format fmt("Adding {1} to {2}, we get {3}");
fmt % a;
fmt % b;
int a_plus_b = a+b;
fmt % a_plus_b;
cout << fmt;
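
The deferred conversion can be mimicked with a small stand-in class. The sketch below is not Boost.Locale's implementation: lazy_format, safe_usage, and deferred_usage are hypothetical names, and the class simply writes its arguments separated by spaces instead of handling placeholders. It stores the address of each argument and converts it only in str(), so every argument has to outlive the formatter's last use.

```cpp
#include <cstddef>
#include <functional>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in: keeps a pointer to each argument and converts it
// to text only when str() is called.
class lazy_format {
public:
    template <typename T>
    lazy_format& operator%(const T& value) {
        const T* p = &value;  // stores the address, not a copy
        writers_.push_back([p](std::ostream& os) { os << *p; });
        return *this;
    }

    // Writes the arguments separated by spaces; placeholders are omitted.
    std::string str() const {
        std::ostringstream os;
        for (std::size_t i = 0; i < writers_.size(); ++i) {
            if (i != 0) os << ' ';
            writers_[i](os);
        }
        return os.str();
    }

private:
    std::vector<std::function<void(std::ostream&)>> writers_;
};

// Safe pattern: the sum lives in a named variable that outlives fmt's use.
std::string safe_usage(int a, int b) {
    int a_plus_b = a + b;
    lazy_format fmt;
    fmt % a;
    fmt % b;
    fmt % a_plus_b;
    return fmt.str();
}

// Shows that conversion is deferred: the value is read when str() runs.
std::string deferred_usage() {
    int a = 1;
    lazy_format fmt;
    fmt % a;
    a = 42;  // changed after '%' but before conversion
    return fmt.str();
}
```

In this stand-in, safe_usage(2, 3) returns "2 3 5" and deferred_usage() returns "42", because the argument is read at conversion time rather than when it is passed to operator%.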