Java 21: Internationalization
Introduction
In this article, we will learn how to use formatting instances to format dates and numbers. We will take a look at the java.text package and its classes. We will also learn how to use the java.time package to work with dates and times. Finally, we will learn how to use the ResourceBundle object to retrieve localized messages from a properties file.
Locale class changes
Since Java 19, all public constructors of the Locale class have been deprecated. Instead, use the Locale.of static factory methods or the Locale.Builder class to create a new Locale instance.
// static constructors
Locale.of(language);
Locale.of(language, country);
Locale.of(language, country, variant);
// builder
new Locale.Builder()
.setLanguage(language)
.setRegion(country)
.setVariant(variant)
.build();
// deprecated constructors
new Locale(language);
new Locale(language, country);
new Locale(language, country, variant);
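To make the options above concrete, here is a minimal sketch that builds the same Polish locale in three ways and checks that the results are equal. The class and method names are made up for illustration. Locale.forLanguageTag is not mentioned above, but it is another standard way to obtain a Locale, from a BCP 47 language tag.

```java
import java.util.Locale;

class LocaleCreation {
    // Static factory added in Java 19; replaces the deprecated constructors.
    static Locale viaFactory() {
        return Locale.of("pl", "PL");
    }

    // The builder validates its input and throws IllformedLocaleException on bad values.
    static Locale viaBuilder() {
        return new Locale.Builder().setLanguage("pl").setRegion("PL").build();
    }

    // Parses a BCP 47 language tag such as "pl-PL".
    static Locale viaLanguageTag() {
        return Locale.forLanguageTag("pl-PL");
    }

    public static void main(String[] args) {
        System.out.println(viaFactory().equals(viaBuilder()));     // true
        System.out.println(viaFactory().equals(viaLanguageTag())); // true
        System.out.println(viaFactory().toLanguageTag());          // pl-PL
    }
}
```

Locale.of performs no syntactic checks on its arguments, so the builder is the safer choice when the values come from user input.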
Number formatting
The NumberFormat class offers several factory methods that return instances for formatting numbers:
- getPercentInstance(Locale): formats numbers as percentages.
- getCurrencyInstance(Locale): formats numbers as currency values.
- getIntegerInstance(Locale): formats numbers as integers.
- getNumberInstance(Locale): formats numbers as general-purpose numbers.
- getCompactNumberInstance(Locale, NumberFormat.Style): formats numbers as compact numbers.
Percentage formatting
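Different locales render percentages differently. For example, the US locale writes the percent sign directly after the number (75%), the French locale puts a space before it (75 %), and the Saudi Arabian locale uses Arabic-Indic digits and the Arabic percent sign (٪).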
NumberFormat.getPercentInstance(Locale.US).format(0.75);
// 75%
NumberFormat.getPercentInstance(LOCALE_FRANCE).format(0.75);
// 75 %
NumberFormat.getPercentInstance(Locale.of("ar", "SA")).format(0.75);
// ٪٧٥
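By default a percent instance rounds to whole percents. The sketch below, with a hypothetical helper named percent, shows how setMaximumFractionDigits controls that. It uses Locale.US only, because the exact output for other locales depends on the locale data shipped with the JDK.

```java
import java.text.NumberFormat;
import java.util.Locale;

class PercentDemo {
    // Formats a ratio (0.75 means 75%) with at most maxFractionDigits decimals.
    static String percent(double ratio, int maxFractionDigits, Locale locale) {
        NumberFormat format = NumberFormat.getPercentInstance(locale);
        format.setMaximumFractionDigits(maxFractionDigits);
        return format.format(ratio);
    }

    public static void main(String[] args) {
        System.out.println(percent(0.756, 0, Locale.US)); // 76%
        System.out.println(percent(0.756, 1, Locale.US)); // 75.6%
    }
}
```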
Long numbers formatting
Compact number formatting shortens large numbers. For example, instead of writing 1,000,000 you can write 1M.
This is useful when you want to display a large number in a small space. Every locale has its own set of compact forms. There are 2 styles of compact numbers:
- NumberFormat.Style.SHORT: the abbreviated form, for example 1M for 1,000,000.
- NumberFormat.Style.LONG: the spelled-out form, for example 1 million for 1,000,000.
// NumberFormat.Style.SHORT
NumberFormat.getCompactNumberInstance(LOCALE_US, Style.SHORT).format(1_213_700);
// 1M
NumberFormat.getCompactNumberInstance(LOCALE_FRANCE, Style.SHORT).format(1_213_700);
// 1 M
NumberFormat.getCompactNumberInstance(LOCALE_POLAND, Style.SHORT).format(1_213_700);
// 1 mln
// NumberFormat.Style.LONG
NumberFormat.getCompactNumberInstance(LOCALE_US, Style.LONG).format(12_137);
// 12 thousand
NumberFormat.getCompactNumberInstance(LOCALE_FRANCE, Style.LONG).format(12_137);
// 12 mille
NumberFormat.getCompactNumberInstance(LOCALE_POLAND, Style.LONG).format(12_137);
// 12 tysięcy
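Compact instances drop the fractional part by default, which is why 1,213,700 becomes 1M above. As a small sketch (the helper name compact is an assumption), raising the maximum fraction digits keeps more precision:

```java
import java.text.NumberFormat;
import java.util.Locale;

class CompactDemo {
    // Formats a value in compact form for the US locale with a chosen precision.
    static String compact(long value, NumberFormat.Style style, int maxFractionDigits) {
        NumberFormat format = NumberFormat.getCompactNumberInstance(Locale.US, style);
        format.setMaximumFractionDigits(maxFractionDigits);
        return format.format(value);
    }

    public static void main(String[] args) {
        System.out.println(compact(1_213_700, NumberFormat.Style.SHORT, 0)); // 1M
        System.out.println(compact(1_213_700, NumberFormat.Style.SHORT, 1)); // 1.2M
        System.out.println(compact(12_137, NumberFormat.Style.LONG, 0));     // 12 thousand
    }
}
```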
Numbers formatting
The getNumberInstance method returns a general-purpose formatter that applies the grouping and decimal separators of the given locale.
NumberFormat.getNumberInstance(localeUs).format(1234567.89);
// 1,234,567.89
NumberFormat.getNumberInstance(localeFrance).format(1234567.89);
// 1 234 567,89
NumberFormat.getNumberInstance(localePoland).format(1234567.89);
// 1 234 567,89
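The same instances can also parse localized text back into numbers. The following is a minimal sketch with hypothetical helper names. It wraps the checked ParseException so the helper is easy to call, and it uses the US and German locales, whose separators are plain ASCII characters.

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

class NumberRoundTrip {
    static String format(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Parses text written with the locale's separators back into a double.
    static double parse(String text, Locale locale) {
        try {
            return NumberFormat.getNumberInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number in " + locale + ": " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(format(1234567.89, Locale.US));          // 1,234,567.89
        System.out.println(format(1234567.89, Locale.GERMANY));     // 1.234.567,89
        System.out.println(parse("1.234.567,89", Locale.GERMANY));  // 1234567.89
    }
}
```

Parsing with the wrong locale does not necessarily fail, it can silently produce a different number, so the locale used for parsing should match the one used for formatting.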
Money formatting
Currency formatting is useful when you want to display a number as a currency value. Instead of concatenating a currency symbol and a number yourself, you can use a NumberFormat currency instance, which places the symbol and the separators according to the conventions of the locale.
NumberFormat.getCurrencyInstance(localePoland).format(123.45);
// 123,45 zł
NumberFormat.getCurrencyInstance(localeFrance).format(123.45);
// 123,45 €
NumberFormat.getCurrencyInstance(localeUS).format(123.45);
// $123.45
Note that different locales use different symbols for decimal separators and grouping separators. For example, in the US, the decimal separator is a period (.), while in Poland and France, it is a comma (,).
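Monetary amounts are usually kept in BigDecimal rather than double, and the display locale is not always tied to the currency being shown. The sketch below, with hypothetical helpers price and priceIn, formats a BigDecimal and overrides the currency with setCurrency. The exact text for the euro example depends on the JDK's locale data, so no output is shown for it.

```java
import java.math.BigDecimal;
import java.text.NumberFormat;
import java.util.Currency;
import java.util.Locale;

class CurrencyDemo {
    // Formats an amount in the default currency of the locale.
    static String price(BigDecimal amount, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(amount);
    }

    // Formats an amount in an explicit currency (ISO 4217 code), using the locale's conventions.
    static String priceIn(BigDecimal amount, String currencyCode, Locale locale) {
        NumberFormat format = NumberFormat.getCurrencyInstance(locale);
        format.setCurrency(Currency.getInstance(currencyCode));
        return format.format(amount);
    }

    public static void main(String[] args) {
        BigDecimal amount = new BigDecimal("123.45");
        System.out.println(price(amount, Locale.US)); // $123.45
        System.out.println(priceIn(amount, "EUR", Locale.US));
    }
}
```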
Symbols and currencies
NumberFormat is a powerful tool, but what if you only want to display a currency symbol or a decimal separator? You can use the DecimalFormatSymbols class to get the symbols for a given locale.
Currency currency = DecimalFormatSymbols.getInstance(localePoland).getCurrency();
currency.getDisplayName(LOCALE_POLAND);
// złoty polski
currency.getDisplayName(Locale.FRANCE);
// zloty polonais
currency.getDisplayName(Locale.US);
// Polish zloty
Note that the currency stays the same (the Polish złoty) and only the display locale changes: with Locale.FRANCE Java returns the French name of the currency (zloty polonais), and with Locale.US it returns the English name (Polish zloty).
Returning symbols works in the same way:
currency.getSymbol(LOCALE_POLAND);
// zł
currency.getSymbol(Locale.FRANCE);
// PLN
currency.getSymbol(Locale.US);
// PLN
The DecimalFormatSymbols class provides the set of symbols (such as the currency symbol and the separators) that are used to format numbers.
DecimalFormatSymbols.getInstance(LOCALE_POLAND).getCurrencySymbol();
// zł
DecimalFormatSymbols.getInstance(Locale.FRANCE).getCurrencySymbol();
// €
DecimalFormatSymbols.getInstance(LOCALE_POLAND).getGroupingSeparator();
// ' '
DecimalFormatSymbols.getInstance(Locale.US).getGroupingSeparator();
// ,
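A grouping separator that looks like a space is often not the ordinary space character (U+0020), which matters when you compare or parse formatted text. As a small sketch, the hypothetical helper describe below prints the separators as Unicode code points so that the difference becomes visible. Output is shown only for locales whose separators are plain ASCII.

```java
import java.text.DecimalFormatSymbols;
import java.util.Locale;

class SymbolsDemo {
    // Renders a character as a Unicode code point label, e.g. U+002C.
    static String codePoint(char c) {
        return String.format(Locale.ROOT, "U+%04X", (int) c);
    }

    static String describe(Locale locale) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(locale);
        return "decimal=" + codePoint(symbols.getDecimalSeparator())
            + " grouping=" + codePoint(symbols.getGroupingSeparator());
    }

    public static void main(String[] args) {
        System.out.println(describe(Locale.US));      // decimal=U+002E grouping=U+002C
        System.out.println(describe(Locale.GERMANY)); // decimal=U+002C grouping=U+002E
        // The grouping code point for French depends on the JDK's locale data.
        System.out.println(describe(Locale.FRANCE));
    }
}
```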
Date formatting
After the previous sections, you should be familiar with formatting values using getXXXInstance methods and Locale objects. It should come as no surprise that date formatting works the same way!
The DateFormat class provides several factory methods for formatting dates and times:
- getDateTimeInstance(int dateStyle, int timeStyle, Locale locale): returns a DateFormat instance that formats dates and times.
- getDateInstance(int dateStyle, Locale locale): returns a DateFormat instance that formats dates.
- getTimeInstance(int timeStyle, Locale locale): returns a DateFormat instance that formats times.
All of these methods take a date or time style and a locale. The DateFormat class defines 4 style constants:
- DateFormat.SHORT: the shortest, mostly numeric form.
- DateFormat.MEDIUM: a medium-length form.
- DateFormat.LONG: a longer form with the month spelled out.
- DateFormat.FULL: the most detailed form, including the day of the week and, for times, the time zone name.
// UTC is assumed to be a static import of java.time.ZoneOffset.UTC
Date wonderfulDay = Date.from(LocalDateTime.of(2005, 4, 2, 21, 37).toInstant(UTC));
DateFormat.getDateTimeInstance(DateFormat.FULL, DateFormat.FULL, LOCALE_POLAND).format(wonderfulDay);
// sobota, 2 kwietnia 2005 23:37:00 czas środkowoeuropejski letni
DateFormat.getDateTimeInstance(DateFormat.FULL, DateFormat.FULL, LOCALE_FRANCE).format(wonderfulDay);
// samedi 2 avril 2005 à 23:37:00 heure d’été d’Europe centrale
DateFormat.getDateTimeInstance(DateFormat.FULL, DateFormat.FULL, LOCALE_US).format(wonderfulDay);
// Saturday, April 2, 2005 at 11:37:00 PM Central European Summer Time
Note the time shift: the date was created as 21:37 in UTC, and DateFormat rendered it in the default time zone of the JVM, here Central European Summer Time (UTC+2), which gives 23:37.
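Because a DateFormat instance uses the JVM's default time zone unless told otherwise, the output can differ between machines. The following sketch, with a hypothetical helper formatIn, pins the time zone with setTimeZone so the result no longer depends on where the code runs. The exact punctuation between date and time varies with the JDK's locale data, so no full output is shown.

```java
import java.text.DateFormat;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class DateZoneDemo {
    // The same instant as in the example above: 2 April 2005, 21:37 UTC.
    static Date wonderfulDay() {
        return Date.from(LocalDateTime.of(2005, 4, 2, 21, 37).toInstant(ZoneOffset.UTC));
    }

    // Formats the date in an explicit time zone instead of the JVM default.
    static String formatIn(Date date, String zoneId, Locale locale) {
        DateFormat format = DateFormat.getDateTimeInstance(DateFormat.LONG, DateFormat.SHORT, locale);
        format.setTimeZone(TimeZone.getTimeZone(zoneId));
        return format.format(date);
    }

    public static void main(String[] args) {
        // Same instant, two zones: 9:37 PM in UTC and 11:37 PM in Warsaw (summer time).
        System.out.println(formatIn(wonderfulDay(), "UTC", Locale.US));
        System.out.println(formatIn(wonderfulDay(), "Europe/Warsaw", Locale.US));
    }
}
```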
Getting messages from resource bundles
In the previous sections, we learned how to format values using Locale objects. In this section, we will learn how to read translated messages using the ResourceBundle class.
A ResourceBundle is a collection of key-value pairs. The keys are strings and the values are objects. The class provides several methods for getting values from the bundle:
- getString(String key): returns a string for the given key.
- getObject(String key): returns an object for the given key.
- getKeys(): returns an Enumeration of the keys contained in this ResourceBundle and its parent bundles.
Put the following code in the src/main/resources/messages_pl_PL.properties
file.
footerText=© 2023 SimpleLocalize. Wszelkie prawa zastrzeżone.
linkText=Utwórz konto SimpleLocalize
message=Dziękujemy za wypróbowanie naszego demo SimpleLocalize dla Javy!
title=Hej {0}!
Now, we can read the messages using a ResourceBundle object.
ResourceBundle bundle = ResourceBundle.getBundle("messages", LOCALE_POLAND);
bundle.getString("linkText");
// Utwórz konto SimpleLocalize
bundle.getString("title");
// Hej {0}!
As you can see, the title message contains a placeholder {0}. We can use the MessageFormat class to replace the placeholder with a value.
ResourceBundle bundle = ResourceBundle.getBundle("messages", LOCALE_POLAND);
String title = bundle.getString("title");
String message = MessageFormat.format(title, "Jakub");
// Hej Jakub!
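MessageFormat can also format its arguments for a locale and has one well-known pitfall with apostrophes. The sketch below uses a hypothetical helper greet and made-up English patterns (they are not part of the properties file above). It shows a typed placeholder and the doubled single quote that MessageFormat needs to print a literal apostrophe.

```java
import java.text.MessageFormat;
import java.util.Locale;

class MessageDemo {
    // Fills the placeholders of a pattern, formatting typed arguments for the locale.
    static String greet(String pattern, Locale locale, Object... args) {
        return new MessageFormat(pattern, locale).format(args);
    }

    public static void main(String[] args) {
        // {1,number,integer} formats the second argument as a localized integer.
        System.out.println(greet("Hi {0}, you have {1,number,integer} new messages.",
            Locale.US, "Jakub", 1234));
        // Hi Jakub, you have 1,234 new messages.

        // A literal apostrophe must be written as two single quotes.
        System.out.println(greet("It''s {0}", Locale.US, "ready"));
        // It's ready
    }
}
```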
One of the advantages of using ResourceBundle instead of just reading from text files is that it supports inheritance. A bundle can have a parent bundle: with properties files, messages_pl_PL falls back to messages_pl and then to the base messages file. When you ask for a key, the ResourceBundle class looks for it in the current bundle and, if it doesn't find it there, in the parent bundle.
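To illustrate the parent lookup without any files on disk, here is a self-contained sketch that wires two in-memory bundles together by hand using ListResourceBundle. This is only a demonstration of the mechanism: ResourceBundle.getBundle normally builds the parent chain for you from the file names. The helper names and the English texts are made up for the example.

```java
import java.util.ListResourceBundle;
import java.util.ResourceBundle;

class BundleFallbackDemo {
    // Creates an in-memory bundle; a non-null fallback becomes its parent.
    static ResourceBundle bundle(ResourceBundle fallback, Object[][] entries) {
        return new ListResourceBundle() {
            {
                if (fallback != null) {
                    setParent(fallback);
                }
            }

            @Override
            protected Object[][] getContents() {
                return entries;
            }
        };
    }

    static ResourceBundle polishWithBaseFallback() {
        // Hypothetical base bundle, playing the role of messages.properties.
        ResourceBundle base = bundle(null, new Object[][] {
            {"title", "Hey {0}!"},
            {"linkText", "Create a SimpleLocalize account"}
        });
        // Polish bundle that translates only one of the keys.
        return bundle(base, new Object[][] {
            {"title", "Hej {0}!"}
        });
    }

    public static void main(String[] args) {
        ResourceBundle polish = polishWithBaseFallback();
        System.out.println(polish.getString("title"));    // Hej {0}!
        System.out.println(polish.getString("linkText")); // Create a SimpleLocalize account
    }
}
```

A key that is missing from the whole chain makes getString throw a MissingResourceException, so the base bundle should contain every key.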
String normalization
Normalization is the process of converting a string to a normalized form, so that equivalent text is always represented by the same sequence of characters. The java.text.Normalizer class provides the normalize and isNormalized methods for this.
The Normalizer.Form enum defines the normalization forms:
- Normalizer.Form.NFC: canonical decomposition, followed by canonical composition.
- Normalizer.Form.NFD: canonical decomposition.
- Normalizer.Form.NFKC: compatibility decomposition, followed by canonical composition.
- Normalizer.Form.NFKD: compatibility decomposition.
What is canonical and compatibility decomposition?
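In simple, non-technical words: decomposition splits a character with a diacritic into a base letter followed by separate combining marks. Normalization alone does not remove anything. In the example below, the Normalizer decomposes the string, and the replaceAll call then deletes every non-ASCII character, including the combining marks, which leaves the string without diacritics. Compatibility decomposition additionally replaces characters that are only formatting variants (such as ligatures or full-width letters) with their plain equivalents.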
Normalizer
    .normalize("Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ", Normalizer.Form.NFD)
    .replaceAll("[^\\p{ASCII}]", "");
// This is a funky String
When else might it be useful? The same text can be encoded as different character sequences, so you should normalize strings before comparing them. Compatibility forms are also handy for folding full-width and half-width variants: NFKC maps full-width Latin letters to their ordinary ASCII counterparts and half-width katakana to regular katakana. These width variants are common in Japanese and Chinese text and are a legacy of early computer character sets.
Normalizer.normalize("Ｈｅｌｌｏ, ｗｏｒｌｄ!", Normalizer.Form.NFKC);
// Hello, world!
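The comparison use case can be sketched as follows. The helper names sameText and stripMarks are assumptions for this example. Note that stripMarks removes only combining marks (\p{M}), which is a slightly different technique from the ASCII filter shown earlier: letters that have no decomposition, such as Ł, are kept instead of being deleted.

```java
import java.text.Normalizer;

class NormalizeDemo {
    // Compares two strings after bringing both to the same canonical form.
    static boolean sameText(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
            .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    // Decomposes the text and removes the combining marks, keeping the base letters.
    static String stripMarks(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFD).replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        String composed = "\u00E9";    // é as a single code point
        String decomposed = "e\u0301"; // e followed by a combining acute accent
        System.out.println(composed.equals(decomposed));   // false
        System.out.println(sameText(composed, decomposed)); // true
        System.out.println(stripMarks("Šťŕĭńġ"));           // String
        System.out.println(stripMarks("Łódź"));             // Łodz
    }
}
```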
Java 21 Localization updates
Looking up Default Locale
In Java 21, you can display the default locale using the java -XshowSettings:locale command. This command displays the default locale and the supported locales based on the system settings.
java -XshowSettings:locale -version
It's a good way to debug localization issues, because it lets you verify which locale the JVM is using by default.
Changing the default locale
You can change the default locale using the java.util.Locale.setDefault
method.
Locale.setDefault(Locale.of("fr", "CA"));
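The default locale is a JVM-wide setting, so changing it affects every piece of code that relies on it. The sketch below uses a hypothetical helper that switches only the default for formatting (Locale.Category.FORMAT), formats a number with the no-argument factory method, and restores the previous value afterwards.

```java
import java.text.NumberFormat;
import java.util.Locale;

class DefaultLocaleDemo {
    // Temporarily changes the default FORMAT locale, formats a value, then restores it.
    static String formatWithDefault(Locale temporaryDefault, double value) {
        Locale previous = Locale.getDefault(Locale.Category.FORMAT);
        try {
            Locale.setDefault(Locale.Category.FORMAT, temporaryDefault);
            // No locale argument: the instance picks up the default FORMAT locale.
            return NumberFormat.getNumberInstance().format(value);
        } finally {
            Locale.setDefault(Locale.Category.FORMAT, previous);
        }
    }

    public static void main(String[] args) {
        System.out.println(formatWithDefault(Locale.US, 1234.5));      // 1,234.5
        System.out.println(formatWithDefault(Locale.GERMANY, 1234.5)); // 1.234,5
    }
}
```

Because the setting is global and not scoped to a thread, passing the locale explicitly to each formatter is usually the more predictable choice in server code.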
Swedish language updates
Collation rules for the Swedish language have been updated in Java 21 to treat v and w as separate letters.
Old rules: va, wb, vc
New rules: va, vc, wb
The old rules are still available with Locale.forLanguageTag("sv-u-co-trad") in case you need them.
The collation rules, also known as collating rules or sorting rules, are a set of guidelines or algorithms used to determine the order in which characters are sorted and compared in a given language or character set.
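In Java, collation rules are applied through java.text.Collator, which can be passed to any sort as a comparator. A minimal sketch, with a hypothetical helper sorted, shows how the same words are ordered differently for Swedish and English. The v/w example from above is printed without an expected result, since it depends on the Java version.

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class CollationDemo {
    // Returns a copy of the words sorted by the collation rules of the locale.
    static List<String> sorted(List<String> words, Locale locale) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = List.of("z", "ä", "a");
        // In Swedish, ä is a separate letter that comes after z.
        System.out.println(sorted(words, Locale.forLanguageTag("sv"))); // [a, z, ä]
        // In English, ä is treated as a variant of a.
        System.out.println(sorted(words, Locale.ENGLISH));              // [a, ä, z]
        // Result depends on the Java version (see the old and new rules above).
        System.out.println(sorted(List.of("wb", "vc", "va"), Locale.forLanguageTag("sv")));
    }
}
```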
CLDR update
The Common Locale Data Repository (CLDR) has been updated to version 43 in Java 21. The CLDR is a repository of locale data used by many software systems, including Java.
Tzdata update
The tzdata has been updated to version 2023c in Java 21.
Conclusion
In this article, we didn't cover every problem related to internationalization and localization, but we learned how to face the most common challenges in international Java projects.
We learned how to use the Locale class and saw what Java 19 introduced and what it deprecated. We learned how to get messages from resource bundles, normalize strings, and format dates and numbers.