Java 21: Internationalization

Jakub Pomykała
Last updated: March 12, 2024 · 8 min read

Introduction

In this article, we will learn how to use formatting instances to format dates and numbers. We will take a look at the java.text package and its classes. We will also learn how to use the java.time package to work with dates and times. Finally, we will learn how to use the ResourceBundle object to retrieve localized messages from a properties file.

Locale class changes

Since Java 19, all constructors of the Locale class have been deprecated. Instead, create a new Locale instance with the Locale.of static factory methods or with the Locale.Builder class.

// static constructors
Locale.of(language);
Locale.of(language, country);
Locale.of(language, country, variant);

// builder
new Locale.Builder()
    .setLanguage(language)
    .setRegion(country)
    .setVariant(variant)
    .build();

// deprecated constructors
new Locale(language);
new Locale(language, country);
new Locale(language, country, variant);
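
The two recommended approaches are not fully interchangeable. As a minimal sketch (the class and method names here are only illustrative), the example below shows that both produce equal Locale objects for well-formed input, while Locale.Builder additionally rejects ill-formed values with an IllformedLocaleException:

```java
import java.util.IllformedLocaleException;
import java.util.Locale;

public class LocaleCreationExample {

    // Locale.of does not check that its arguments are well-formed.
    static Locale viaFactory() {
        return Locale.of("pl", "PL");
    }

    // Locale.Builder checks that each field is well-formed.
    static Locale viaBuilder() {
        return new Locale.Builder().setLanguage("pl").setRegion("PL").build();
    }

    // Returns true when the builder rejects the given language value.
    static boolean builderRejects(String language) {
        try {
            new Locale.Builder().setLanguage(language);
            return false;
        } catch (IllformedLocaleException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(viaFactory().equals(viaBuilder()));  // true
        System.out.println(viaFactory().toLanguageTag());       // pl-PL
        System.out.println(builderRejects("not a language"));   // true
    }
}
```

The builder is probably the safer choice when the language or region comes from user input or configuration.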

Number formatting

The NumberFormat class offers several factory methods, each returning a formatter for a different kind of number:

  • getPercentInstance(Locale) - formats numbers as percentages.
  • getCurrencyInstance(Locale) - formats numbers as currency values.
  • getIntegerInstance(Locale) - formats numbers as integers.
  • getNumberInstance(Locale) - formats numbers as general-purpose numbers.
  • getCompactNumberInstance(Locale, NumberFormat.Style) - formats numbers as compact numbers.

Percentage formatting

Different locales write percentages differently. For example, the US puts the % sign directly after the number, France separates them with a space, and Arabic locales can use the Arabic percent sign ٪ together with Arabic-Indic digits.

NumberFormat.getPercentInstance(Locale.US).format(0.75);
// 75%

NumberFormat.getPercentInstance(LOCALE_FRANCE).format(0.75);
// 75 %

NumberFormat.getPercentInstance(Locale.of("ar", "SA")).format(0.75);
// ٪٧٥

Long numbers formatting

Compact numbers are a shorter way of writing large numbers. For example, instead of writing 1,000,000 you can write 1M. They are useful when you want to display a large number in a small space. Every locale has its own compact forms. There are 2 styles of compact numbers:

  • NumberFormat.Style.SHORT - the short form of the compact number. For example, 1M for 1,000,000.
  • NumberFormat.Style.LONG - the long form of the compact number. For example, 1 million for 1,000,000.

// NumberFormat.Style.SHORT

NumberFormat.getCompactNumberInstance(LOCALE_US, Style.SHORT).format(1_213_700);
// 1M

NumberFormat.getCompactNumberInstance(LOCALE_FRANCE, Style.SHORT).format(1_213_700);
// 1 M

NumberFormat.getCompactNumberInstance(LOCALE_POLAND, Style.SHORT).format(1_213_700);
// 1 mln


// NumberFormat.Style.LONG

NumberFormat.getCompactNumberInstance(LOCALE_US, Style.LONG).format(12_137);
// 12 thousand

NumberFormat.getCompactNumberInstance(LOCALE_FRANCE, Style.LONG).format(12_137);
// 12 mille

NumberFormat.getCompactNumberInstance(LOCALE_POLAND, Style.LONG).format(12_137);
// 12 tysięcy
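
By default, the compact formatter shows no fraction digits, which is why 1,213,700 appears as 1M in the examples above. The sketch below (with an illustrative helper named compact) shows how setMaximumFractionDigits can keep a bit more precision:

```java
import java.text.NumberFormat;
import java.util.Locale;

public class CompactNumberExample {

    // Formats a value in the SHORT compact style for the US locale,
    // keeping at most the given number of fraction digits.
    static String compact(long value, int maxFractionDigits) {
        NumberFormat format =
                NumberFormat.getCompactNumberInstance(Locale.US, NumberFormat.Style.SHORT);
        format.setMaximumFractionDigits(maxFractionDigits);
        return format.format(value);
    }

    public static void main(String[] args) {
        System.out.println(compact(1_213_700, 0)); // 1M
        System.out.println(compact(1_213_700, 1)); // 1.2M
    }
}
```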

Numbers formatting

The getNumberInstance method returns a general-purpose formatter that applies the grouping and decimal separators of the given locale.

NumberFormat.getNumberInstance(localeUs).format(1234567.89);
// 1,234,567.89

NumberFormat.getNumberInstance(localeFrance).format(1234567.89);
// 1 234 567,89

NumberFormat.getNumberInstance(localePoland).format(1234567.89);
// 1 234 567,89
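
The same formatter instances can also parse text back into numbers, and the locale matters just as much in that direction. The following is a minimal sketch with an illustrative parse helper. It uses Locale.GERMANY rather than the French or Polish locales, because their grouping separator is a non-breaking space that is easy to get wrong when typing test input:

```java
import java.text.NumberFormat;
import java.text.ParseException;
import java.util.Locale;

public class NumberParsingExample {

    // Parses text using the separators of the given locale.
    static double parse(String text, Locale locale) {
        try {
            return NumberFormat.getNumberInstance(locale).parse(text).doubleValue();
        } catch (ParseException e) {
            throw new IllegalArgumentException("Not a number: " + text, e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("1,234,567.89", Locale.US));      // 1234567.89
        System.out.println(parse("1.234.567,89", Locale.GERMANY)); // 1234567.89

        // The same text means different numbers in different locales.
        System.out.println(parse("1.234", Locale.US));      // 1.234
        System.out.println(parse("1.234", Locale.GERMANY)); // 1234.0
    }
}
```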

Money formatting

Currency formatting is useful when you want to display a number as a currency value. Instead of joining the currency symbol and the number yourself, you can use a NumberFormat instance, which places the symbol and picks the separators according to the locale.

NumberFormat.getCurrencyInstance(localePoland).format(123.45);
// 123,45 zł

NumberFormat.getCurrencyInstance(localeFrance).format(123.45);
// 123,45 €

NumberFormat.getCurrencyInstance(localeUS).format(123.45);
// $123.45

Note that different locales use different symbols for decimal separators and grouping separators. For example, in the US, the decimal separator is ., while in Poland and France, it is ,.
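
The locale decides how an amount is written, but the currency itself does not have to be the locale's own. As a sketch under that assumption (the format helper is illustrative), the example below formats a BigDecimal amount with an explicitly chosen currency using setCurrency:

```java
import java.math.BigDecimal;
import java.text.NumberFormat;
import java.util.Currency;
import java.util.Locale;

public class CurrencyFormatExample {

    // Formats an amount in the given currency using the conventions of the locale.
    static String format(BigDecimal amount, String currencyCode, Locale locale) {
        NumberFormat format = NumberFormat.getCurrencyInstance(locale);
        format.setCurrency(Currency.getInstance(currencyCode));
        return format.format(amount);
    }

    public static void main(String[] args) {
        BigDecimal amount = new BigDecimal("123.45");
        System.out.println(format(amount, "USD", Locale.US)); // $123.45
        System.out.println(format(amount, "EUR", Locale.US)); // €123.45
    }
}
```

Using BigDecimal instead of double for money avoids binary floating-point rounding surprises before the value even reaches the formatter.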

Symbols and currencies

NumberFormat is a powerful tool, but what if you only want to display a currency symbol or a decimal separator? You can use the DecimalFormatSymbols class to get the symbols for a given locale.

Currency currency = DecimalFormatSymbols.getInstance(localePoland).getCurrency();
currency.getDisplayName(LOCALE_POLAND);
// złoty polski

currency.getDisplayName(Locale.FRANCE);
// zloty polonais

currency.getDisplayName(Locale.US);
// Polish zloty

Note that we took the Polish currency (złoty polski) and asked for its name in different locales. With Locale.FRANCE, Java returned the French name of the currency (zloty polonais), and with Locale.US, the English name (Polish zloty).

Returning symbols works in the same way:

currency.getSymbol(LOCALE_POLAND);
// zł

currency.getSymbol(Locale.FRANCE);
// PLN

currency.getSymbol(Locale.US);
// PLN

The DecimalFormatSymbols class provides a set of symbols (such as the currency symbol) that are used to format numbers.

DecimalFormatSymbols.getInstance(LOCALE_POLAND).getCurrencySymbol();
// zł

DecimalFormatSymbols.getInstance(Locale.FRANCE).getCurrencySymbol();
// €

DecimalFormatSymbols.getInstance(LOCALE_POLAND).getGroupingSeparator();
// ' '

DecimalFormatSymbols.getInstance(Locale.US).getGroupingSeparator();
// ,

Date formatting

After the previous sections, you should be familiar with formatting values using getXXXInstance methods and Locale objects. You shouldn't be surprised if I tell you that date formatting works the same way!

The DateFormat class provides a number of methods for formatting date and time.

  • getDateTimeInstance(int dateStyle, int timeStyle, Locale locale) - returns a DateFormat instance that formats dates and times.
  • getDateInstance(int dateStyle, Locale locale) - returns a DateFormat instance that formats dates.
  • getTimeInstance(int timeStyle, Locale locale) - returns a DateFormat instance that formats times.

All methods need a date or time style and a locale. The DateFormat class provides 4 style constants:

  • DateFormat.SHORT - the shortest, mostly numeric form of the date or time.
  • DateFormat.MEDIUM - the medium form of the date or time.
  • DateFormat.LONG - the long form of the date or time.
  • DateFormat.FULL - the full, most detailed form of the date or time.

Date wonderfulDay = Date.from(LocalDateTime.of(2005, 4, 2, 21, 37).toInstant(UTC));

DateFormat.getDateTimeInstance(DateFormat.FULL, DateFormat.FULL, LOCALE_POLAND).format(wonderfulDay);
// sobota, 2 kwietnia 2005 23:37:00 czas środkowoeuropejski letni

DateFormat.getDateTimeInstance(DateFormat.FULL, DateFormat.FULL, LOCALE_FRANCE).format(wonderfulDay);
// samedi 2 avril 2005 à 23:37:00 heure d’été d’Europe centrale

DateFormat.getDateTimeInstance(DateFormat.FULL, DateFormat.FULL, LOCALE_US).format(wonderfulDay);
// Saturday, April 2, 2005 at 11:37:00 PM Central European Summer Time

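Note the time in the output. In the first line, we created the date as 21:37 in the UTC time zone. DateFormat formats dates in the JVM's default time zone, which was Central European Summer Time (UTC+2) when these examples were run, so the output shows 23:37.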
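
If you prefer the java.time package, a similar result can be achieved with DateTimeFormatter. The sketch below is one possible equivalent, not the only way to do it. It sets the time zone explicitly (Europe/Warsaw is chosen here only for illustration), so the result does not depend on the JVM's default time zone. The exact wording of the output depends on the locale data shipped with your JDK, so no expected output is shown:

```java
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.FormatStyle;
import java.util.Locale;

public class ZonedDateFormatExample {

    // 21:37 UTC on April 2, 2005, expressed in the Warsaw time zone.
    static ZonedDateTime inWarsaw() {
        ZonedDateTime utc = ZonedDateTime.of(2005, 4, 2, 21, 37, 0, 0, ZoneOffset.UTC);
        return utc.withZoneSameInstant(ZoneId.of("Europe/Warsaw"));
    }

    // Formats the date and time in the FULL style for the given locale.
    static String format(Locale locale) {
        return DateTimeFormatter.ofLocalizedDateTime(FormatStyle.FULL)
                .withLocale(locale)
                .format(inWarsaw());
    }

    public static void main(String[] args) {
        System.out.println(format(Locale.of("pl", "PL")));
        System.out.println(format(Locale.FRANCE));
        System.out.println(format(Locale.US));
    }
}
```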

Getting messages from resource bundles

In the previous sections, we learned how to format values using Locale objects. In this section, we will learn how to read translated messages from the ResourceBundle class.

The ResourceBundle class is a collection of key-value pairs. The keys are strings and the values are objects. The ResourceBundle class provides a number of methods for getting values from the bundle.

  • getString(String key) - returns a string for the given key.
  • getObject(String key) - returns an object for the given key.
  • getKeys() - returns an Enumeration of the keys contained in this ResourceBundle and its parent bundles.

Put the following entries in the src/main/resources/messages_pl_PL.properties file.

footerText=© 2023 SimpleLocalize. Wszelkie prawa zastrzeżone.
linkText=Utwórz konto SimpleLocalize
message=Dziękujemy za wypróbowanie naszego demo SimpleLocalize dla Javy!
title=Hej {0}!

Now, we can read the messages using a ResourceBundle object.

ResourceBundle bundle = ResourceBundle.getBundle("messages", LOCALE_POLAND);

bundle.getString("linkText");
// Utwórz konto SimpleLocalize

bundle.getString("title");
// Hej {0}!

As you can see, the title message contains a placeholder {0}. We can use the MessageFormat class to replace the placeholder with a value.

ResourceBundle bundle = ResourceBundle.getBundle("messages", LOCALE_POLAND);
String title = bundle.getString("title");
String message = MessageFormat.format(title, "Jakub");
// Hej Jakub!
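
Placeholders can also carry a format type, and MessageFormat then formats the argument according to a locale. The static MessageFormat.format method uses the default locale, so the sketch below creates a MessageFormat instance with an explicit one. The format helper and the English pattern are illustrative only:

```java
import java.text.MessageFormat;
import java.util.Locale;

public class MessageFormatExample {

    // Fills the placeholders of the pattern using the rules of the given locale.
    static String format(String pattern, Locale locale, Object... args) {
        return new MessageFormat(pattern, locale).format(args);
    }

    public static void main(String[] args) {
        System.out.println(format("Hej {0}!", Locale.of("pl", "PL"), "Jakub"));
        // Hej Jakub!

        System.out.println(format("{0} has {1,number,integer} messages", Locale.US, "Jakub", 1234));
        // Jakub has 1,234 messages

        // A single quote starts a quoted section, so a literal apostrophe is written twice.
        System.out.println(format("It''s {0}", Locale.US, "fine"));
        // It's fine
    }
}
```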

One of the advantages of using ResourceBundle instead of just reading from text files is that it supports inheritance. Each bundle can have a parent bundle. For example, for messages_pl_PL.properties, the parent chain is messages_pl.properties and then messages.properties. The ResourceBundle class looks for the key in the current bundle and, if it doesn't find it, looks for it in the parent bundle.
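
To see the fallback in isolation, the sketch below builds two bundles in memory and links them by hand. This is only an illustration: the bundle helper, the English texts and the in-memory properties are assumptions made for the example. In a real project, ResourceBundle.getBundle builds the parent chain for you from the properties files.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

public class BundleFallbackExample {

    // Creates a bundle from properties text and links it to a fallback (parent) bundle.
    static ResourceBundle bundle(String properties, ResourceBundle fallback) {
        try {
            return new PropertyResourceBundle(new StringReader(properties)) {
                {
                    // setParent is protected, so it is called from a subclass.
                    setParent(fallback);
                }
            };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // A Polish bundle with only one key, backed by a base bundle with two keys.
    static ResourceBundle polishWithFallback() {
        ResourceBundle base =
                bundle("title=Hi {0}!\nlinkText=Create a SimpleLocalize account\n", null);
        return bundle("title=Hej {0}!\n", base);
    }

    public static void main(String[] args) {
        ResourceBundle polish = polishWithFallback();
        System.out.println(polish.getString("title"));    // Hej {0}!
        System.out.println(polish.getString("linkText")); // Create a SimpleLocalize account
    }
}
```

The second lookup succeeds even though the Polish bundle has no linkText key, because the lookup continues in the parent.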

String normalization

Normalization is the process of converting a string to a normalized form, so that text which looks the same is also represented by the same sequence of characters. The java.text.Normalizer class provides the normalize and isNormalized methods for this.

The Normalizer.Form enum defines four normalization forms:

  • Normalizer.Form.NFC - Canonical decomposition, followed by canonical composition.
  • Normalizer.Form.NFD - Canonical decomposition.
  • Normalizer.Form.NFKC - Compatibility decomposition, followed by canonical composition.
  • Normalizer.Form.NFKD - Compatibility decomposition.

What is canonical and compatibility decomposition?

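In simple, non-technical words: canonical decomposition splits a character with a diacritic into a base letter followed by separate combining marks. Compatibility decomposition additionally replaces formatting variants, such as ligatures or full-width letters, with their plain equivalents. Composition merges the pieces back into single characters where possible. In the example below, Normalizer decomposes the string, and replaceAll then removes everything outside ASCII, including the combining marks, which leaves a string without diacritics.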

Normalizer
  .normalize("Tĥïŝ ĩš â fůňķŷ Šťŕĭńġ", Normalizer.Form.NFD)
  .replaceAll("[^\\p{ASCII}]", "");
// This is a funky String

When else might it be useful? When you work with strings that contain special characters, you should normalize them before comparing them, because the same text can be encoded in more than one way. Or you may want to convert full-width characters to their regular counterparts. Full-width and half-width forms are common in Japanese and Chinese text, where they come from legacy character encodings.

Normalizer.normalize("Ｈｅｌｌｏ, ｗｏｒｌｄ!", Normalizer.Form.NFKC);
// Hello, world!
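
The comparison case is worth a closer look. In the sketch below (the equalAfterNfc helper is illustrative), the letter é is written once as a single precomposed character and once as e followed by a combining acute accent. The two strings look identical on screen but are only equal after normalization. The second part shows the difference between canonical and compatibility forms on the ﬁ ligature:

```java
import java.text.Normalizer;

public class NormalizationExample {

    // Compares two strings after bringing both to the NFC form.
    static boolean equalAfterNfc(String a, String b) {
        return Normalizer.normalize(a, Normalizer.Form.NFC)
                .equals(Normalizer.normalize(b, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        String composed = "\u00E9";    // é as one character
        String decomposed = "e\u0301"; // e + combining acute accent

        System.out.println(composed.equals(decomposed));         // false
        System.out.println(equalAfterNfc(composed, decomposed)); // true

        String ligature = "\uFB01"; // the ﬁ ligature
        // Canonical forms keep the ligature, compatibility forms replace it.
        System.out.println(Normalizer.normalize(ligature, Normalizer.Form.NFC).equals(ligature)); // true
        System.out.println(Normalizer.normalize(ligature, Normalizer.Form.NFKC));                 // fi
    }
}
```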

Java 21 Localization updates

Looking up Default Locale

You can display the default locale using the java -XshowSettings:locale command. This command displays the default locale and the available locales based on the system settings. In Java 21, its output also includes the tzdata version.

java -XshowSettings:locale -version

It's a good way of debugging localization issues, because it lets you verify which locale the JVM is using by default.

Changing the default locale

You can change the default locale using the java.util.Locale.setDefault method.

Locale.setDefault(Locale.of("fr", "CA"));
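
Keep in mind that the default locale is global to the whole JVM, so changing it affects every piece of code that relies on it. If you only need a different default for a moment, for example in a test, one option is to change it and restore the previous value afterwards. The sketch below does this for the FORMAT category only; the formatWithDefault helper is illustrative:

```java
import java.text.NumberFormat;
import java.util.Locale;

public class DefaultLocaleExample {

    // Formats a number with a temporary default FORMAT locale,
    // then restores the previous one.
    static String formatWithDefault(Locale temporaryDefault, double value) {
        Locale previous = Locale.getDefault(Locale.Category.FORMAT);
        Locale.setDefault(Locale.Category.FORMAT, temporaryDefault);
        try {
            // No locale argument: the formatter uses the default FORMAT locale.
            return NumberFormat.getNumberInstance().format(value);
        } finally {
            Locale.setDefault(Locale.Category.FORMAT, previous);
        }
    }

    public static void main(String[] args) {
        System.out.println(formatWithDefault(Locale.US, 1234.5));      // 1,234.5
        System.out.println(formatWithDefault(Locale.GERMANY, 1234.5)); // 1.234,5
    }
}
```

In application code, passing the Locale explicitly to each formatter, as in the earlier sections, is usually easier to reason about than changing the default.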

Swedish language updates

Collation rules for the Swedish language have been updated in Java 21 to distinguish between v and w.

Old rules: va, wb, vc

New rules: va, vc, wb

The old rules are still available with Locale.forLanguageTag("sv-u-co-trad") in case you need them.

The collation rules, also known as collating rules or sorting rules, are a set of guidelines or algorithms used to determine the order in which characters are sorted and compared in a given language or character set.
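
In Java, collation rules are applied through the java.text.Collator class. The sketch below (with an illustrative sorted helper) sorts the same words with English and Swedish rules. In Swedish, ö is a separate letter at the end of the alphabet, while in English it is treated as a variant of o. The last line sorts the v and w example from above; its result depends on the Java version you run it on, so no expected output is shown:

```java
import java.text.Collator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class CollationExample {

    // Returns a copy of the words sorted by the collation rules of the locale.
    static List<String> sorted(List<String> words, Locale locale) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = List.of("z", "\u00F6", "a"); // \u00F6 is ö

        System.out.println(sorted(words, Locale.ENGLISH));        // [a, ö, z]
        System.out.println(sorted(words, Locale.of("sv", "SE"))); // [a, z, ö]

        // Result differs between the old and the new Swedish rules.
        System.out.println(sorted(List.of("wb", "va", "vc"), Locale.of("sv", "SE")));
    }
}
```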

CLDR update

The Common Locale Data Repository (CLDR) has been updated to version 43 in Java 21. The CLDR is a repository of locale data used by many software systems, including Java.

Tzdata update

The tzdata has been updated to version 2023c in Java 21.

Conclusion

In this article, we didn't cover every problem related to internationalization and localization, but we learned how to face the most common challenges in international Java projects. We learned how to use the Locale class and saw what Java 19 introduced and what it deprecated. We learned how to get messages from resource bundles, normalize strings, and format dates and numbers.

Jakub Pomykała
Founder of SimpleLocalize