Java and Locales

Hideyuki Inada, Capitola Computing Inc.

Each country in the world has its own cultural conventions.  Sometimes, a country has multiple cultural conventions if it consists of more than one ethnic group.

When you internationalize your software, you have to take these conventions into account.  A locale identifies the set of cultural conventions used in a particular country or region. Data whose format depends on the locale is said to be locale-sensitive, and you have to present and process such data following the conventions of the specific locale.

Typical locale-sensitive data are:

-   Date and time format

-   Number format including monetary format

-   Collation sequence

-   Name format

-   Address format

-   Unit of measurement - Metric (SI) or pound/feet

 

For date, time and number formats and for collation sequence, Java provides strong locale support via the Locale class and various locale-sensitive formatter classes.
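As a quick orientation, here is a minimal sketch of how a locale-sensitive class is obtained by passing a Locale to a factory method. The class name FormatterSketch and its helper methods are made up for this illustration; NumberFormat and Collator are the standard java.text classes. The sample value and strings are arbitrary.

```java
import java.text.Collator;
import java.text.NumberFormat;
import java.util.Locale;

class FormatterSketch {

  // Formats a number using the conventions of the given locale.
  static String formatNumber(double value, Locale locale) {
    return NumberFormat.getNumberInstance(locale).format(value);
  }

  // Compares two strings using the collation rules of the given locale.
  // A negative result means a sorts before b, a positive result means after.
  static int compare(String a, String b, Locale locale) {
    return Collator.getInstance(locale).compare(a, b);
  }

  public static void main(String[] args) {
    // Same value, different grouping and decimal separators.
    System.out.println(formatNumber(1234567.891, Locale.US));      // 1,234,567.891
    System.out.println(formatNumber(1234567.891, Locale.GERMANY)); // 1.234.567,891

    // "\u00e4" is a-umlaut. German collation places it near "a",
    // while Swedish collation places it after "z".
    System.out.println(compare("\u00e4", "z", Locale.GERMANY));
    System.out.println(compare("\u00e4", "z", new Locale("sv", "SE")));
  }
}
```

The same pattern (a static factory method that takes a Locale) applies to the DateFormat class.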

Locale class

Before we discuss locale-sensitive data, let's go over the Locale class, which is essential for handling locale-sensitive data in your code.  Here is how to instantiate the class for the major Western European and North American countries:

 

import java.util.*;

 

public class intl31 {

 

  public static void main(String args[])

  {

    String asLanguageCountry[][] = {

       // Western Europe

      { "en", "GB"},

      { "fr", "FR"},

      { "de", "DE"},

      { "es", "ES"},

      { "it", "IT"},

      { "sv", "SE"},

      { "da", "DK"},

      { "nl", "NL"},

 

       // North America

      { "en", "US"},

      { "en", "CA"},

      { "fr", "CA"},

      { "es", "MX"}

    };

    Locale l;

    int i;

 

    // Code

    for(i = 0; i < asLanguageCountry.length; i++){

      l = new Locale(asLanguageCountry[i][0], asLanguageCountry[i][1]);

      System.out.print(l.getDisplayName(Locale.US) + "\n");

    }

  }

}

intl31.java

In the first argument of the Locale constructor, you specify the two-letter ISO-639 language code in lower case.

In the second argument of the Locale constructor, you specify the two-letter ISO-3166 country code in upper case.

 

The same country code can appear in more than one locale. For example, since Canada has an English-speaking region and a French-speaking region, it has two locales, one for each language.
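The two Canadian locales can be inspected with a small sketch like the one below. The class name LocaleSketch and the helper method of() are hypothetical names used only for this illustration; getLanguage(), getCountry() and the constant Locale.CANADA_FRENCH are part of java.util.Locale.

```java
import java.util.Locale;

class LocaleSketch {

  // Builds a locale from an ISO-639 language code and an ISO-3166 country code.
  static Locale of(String language, String country) {
    return new Locale(language, country);
  }

  public static void main(String[] args) {
    Locale enCa = of("en", "CA");
    Locale frCa = of("fr", "CA");

    // Same country, different language.
    System.out.println(enCa.getLanguage() + " / " + enCa.getCountry()); // en / CA
    System.out.println(frCa.getLanguage() + " / " + frCa.getCountry()); // fr / CA

    // The string form joins both codes with an underscore.
    System.out.println(frCa); // fr_CA

    // Some common locales are also available as predefined constants.
    System.out.println(frCa.equals(Locale.CANADA_FRENCH)); // true
    System.out.println(enCa.equals(frCa));                 // false
  }
}
```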

 

This code outputs the following:

C:\programs\javatest\site>java intl31

English (United Kingdom)

French (France)

German (Germany)

Spanish (Spain)

Italian (Italy)

Swedish (Sweden)

Danish (Denmark)

Dutch (Netherlands)

English (United States)

English (Canada)

French (Canada)

Spanish (Mexico)

 

In the sections that follow, we will use this code as the basis, and examine the formats for various locale-sensitive data.

Date and time format

Java provides the DateFormat class to present a date in the format appropriate for each locale.  The following code displays the current date for the countries used in the previous example in four formats (full, long, medium and short):

 

import java.util.*;

import java.text.*;

 

public class intl33 {

 

  public static void main(String args[])

  {

    String asLanguageCountry[][] = {

       // Western Europe

      { "en", "GB"},

      { "fr", "FR"},

      { "de", "DE"},

      { "es", "ES"},

      { "it", "IT"},

      { "sv", "SE"},

      { "da", "DK"},

      { "nl", "NL"},

 

       // North America

      { "en", "US"},

      { "en", "CA"},

      { "fr", "CA"},

      { "es", "MX"}

    };

    Locale l;

    int i, j;

    DateFormat df;

    int aiDateFormat[] = {

      DateFormat.FULL,

      DateFormat.LONG,

      DateFormat.MEDIUM,

      DateFormat.SHORT

    };

 

    // Code

    for(j = 0; j < aiDateFormat.length; j++){

      for(i = 0; i < asLanguageCountry.length; i++){

        l = new Locale(asLanguageCountry[i][0], asLanguageCountry[i][1]);

        System.out.print(l.getDisplayName(Locale.US) + "\t");

 

        df = DateFormat.getDateInstance(aiDateFormat[j], l);

        System.out.print(df.format(new Date()) + "\n");

      }

    }

  }

}

intl33.java

The output is as follows (the chcp command issued at the beginning sets the code page to 1252 on the machine that executes this program):

C:\programs\javatest\site>chcp 1252

Active code page: 1252

 

C:\programs\javatest\site>java intl33

English (United Kingdom)        04 October 2002

French (France) vendredi 4 octobre 2002

German (Germany)        Freitag, 4. Oktober 2002

Spanish (Spain) viernes 4 de octubre de 2002

Italian (Italy) venerdì 4 ottobre 2002

Swedish (Sweden)        den 4 oktober 2002

Danish (Denmark)        4. oktober 2002

Dutch (Netherlands)     vrijdag 4 oktober 2002

English (United States) Friday, October 4, 2002

English (Canada)        Friday, October 4, 2002

French (Canada) vendredi 4 octobre 2002

Spanish (Mexico)        viernes 4 de octubre de 2002

English (United Kingdom)        04 October 2002

French (France) 4 octobre 2002

German (Germany)        4. Oktober 2002

Spanish (Spain) 4 de octubre de 2002

Italian (Italy) 4 ottobre 2002

Swedish (Sweden)        den 4 oktober 2002

Danish (Denmark)        4. oktober 2002

Dutch (Netherlands)     4 oktober 2002

English (United States) October 4, 2002

English (Canada)        October 4, 2002

French (Canada) 4 octobre 2002

Spanish (Mexico)        4 de octubre de 2002

English (United Kingdom)        04-Oct-02

French (France) 4 oct. 02

German (Germany)        04.10.2002

Spanish (Spain) 04-oct-02

Italian (Italy) 4-ott-02

Swedish (Sweden)        2002-okt-04

Danish (Denmark)        04-10-2002

Dutch (Netherlands)     4-okt-02

English (United States) Oct 4, 2002

English (Canada)        4-Oct-02

French (Canada) 02-10-04

Spanish (Mexico)        4/10/2002

English (United Kingdom)        04/10/02

French (France) 04/10/02

German (Germany)        04.10.02

Spanish (Spain) 4/10/02

Italian (Italy) 04/10/02

Swedish (Sweden)        2002-10-04

Danish (Denmark)        04-10-02

Dutch (Netherlands)     4-10-02

English (United States) 10/4/02

English (Canada)        04/10/02

French (Canada) 02-10-04

Spanish (Mexico)        4/10/02

 

As you can see, the Month/Day/Year order used in the U.S. is not necessarily used in the rest of the world, and supporting each format on your own takes considerable development effort.  Using the DateFormat class takes this burden off your development and makes it easy to support international date formats in your software.
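To look at the field-order difference in isolation, the following sketch formats one fixed date (October 4, 2002) in the short style for three locales. The class name ShortDateSketch and the helper shortDate() are made up for this illustration. The exact strings, in particular the number of year digits, can differ between Java versions, so the expected output is not listed here.

```java
import java.text.DateFormat;
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

class ShortDateSketch {

  // Formats a date in the SHORT style of the given locale.
  static String shortDate(Date date, Locale locale) {
    return DateFormat.getDateInstance(DateFormat.SHORT, locale).format(date);
  }

  public static void main(String[] args) {
    // October 4, 2002, at midnight in the JVM's default time zone.
    Date date = new GregorianCalendar(2002, Calendar.OCTOBER, 4).getTime();

    // The U.S. puts the month first; the U.K. and Germany put the day first.
    System.out.println("US: " + shortDate(date, Locale.US));
    System.out.println("UK: " + shortDate(date, Locale.UK));
    System.out.println("DE: " + shortDate(date, Locale.GERMANY));
  }
}
```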

 

Time format

The appropriate time format for each locale can be obtained by calling DateFormat.getTimeInstance(). The following code contains a slight modification from the previous example to illustrate this:

 

import java.util.*;

import java.text.*;

 

public class intl34 {

 

  public static void main(String args[])

  {

    String asLanguageCountry[][] = {

       // Western Europe

      { "en", "GB"},

      { "fr", "FR"},

      { "de", "DE"},

      { "es", "ES"},

      { "it", "IT"},

      { "sv", "SE"},

      { "da", "DK"},

      { "nl", "NL"},

 

       // North America

      { "en", "US"},

      { "en", "CA"},

      { "fr", "CA"},

      { "es", "MX"}

    };

    Locale l;

    int i, j;

    DateFormat tf;

    int aiDateFormat[] = {

      DateFormat.FULL,

      DateFormat.LONG,

      DateFormat.MEDIUM,

      DateFormat.SHORT

    };

 

    // Code

    for(j = 0; j < aiDateFormat.length; j++){

      for(i = 0; i < asLanguageCountry.length; i++){

        l = new Locale(asLanguageCountry[i][0], asLanguageCountry[i][1]);

        System.out.print(l.getDisplayName(Locale.US) + "\t");

 

        tf = DateFormat.getTimeInstance(aiDateFormat[j], l);

        System.out.print(tf.format(new Date()) + "\n");

      }

      System.out.println("\n");

    }

  }

}

 

The output is shown below:

C:\programs\javatest\site>java intl34

English (United Kingdom)        17:54:06 o'clock PDT

French (France) 17 h 54 PDT

German (Germany)        17.54 Uhr PDT

Spanish (Spain) 17H54' PDT

Italian (Italy) 17.54.06 PDT

Swedish (Sweden)        kl 17:54 PDT

Danish (Denmark)        17:54:06 PDT

Dutch (Netherlands)     17:54:06 uur PDT

English (United States) 5:54:06 PM PDT

English (Canada)        5:54:06 o'clock PM PDT

French (Canada) 17 h 54 PDT

Spanish (Mexico)        05:54:06 PM PDT

 

 

English (United Kingdom)        17:54:06 PDT

French (France) 17:54:06 PDT

German (Germany)        17:54:06 PDT

Spanish (Spain) 17:54:06 PDT

Italian (Italy) 17.54.06 PDT

Swedish (Sweden)        17:54:06 PDT

Danish (Denmark)        17:54:06 PDT

Dutch (Netherlands)     17:54:06 PDT

English (United States) 5:54:06 PM PDT

English (Canada)        5:54:06 PDT PM

French (Canada) 17:54:06 PDT

Spanish (Mexico)        05:54:06 PM PDT

 

 

English (United Kingdom)        17:54:06

French (France) 17:54:06

German (Germany)        17:54:06

Spanish (Spain) 17:54:06

Italian (Italy) 17.54.06

Swedish (Sweden)        17:54:06

Danish (Denmark)        17:54:06

Dutch (Netherlands)     17:54:06

English (United States) 5:54:06 PM

English (Canada)        5:54:06 PM

French (Canada) 17:54:06

Spanish (Mexico)        05:54:06 PM

 

 

English (United Kingdom)        17:54

French (France) 17:54

German (Germany)        17:54

Spanish (Spain) 17:54

Italian (Italy) 17.54

Swedish (Sweden)        17:54

Danish (Denmark)        17:54

Dutch (Netherlands)     17:54

English (United States) 5:54 PM

English (Canada)        5:54 PM

French (Canada) 17:54

Spanish (Mexico)        05:54 PM

 

Beyond the time format, if your software is used simultaneously by users in different time zones (for example, the U.S., Japan and Europe), consider storing date/time information in a single time zone and converting each value to the user's local time zone when the software accepts or displays data.  With this approach, you no longer have to tag each date/time value with its time zone, which may be a plus depending on your application.
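A minimal sketch of this approach is shown below, assuming the value is stored as a java.util.Date (which represents a single instant independent of any time zone) and converted only at display time. The class name TimeZoneSketch, the helper inZone() and the fixed pattern are choices made for this illustration; a real application would more likely combine the user's time zone with a locale-specific DateFormat. The stored instant corresponds to the 17:54:06 PDT time shown in the output above.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

class TimeZoneSketch {

  // Renders one stored instant as wall-clock time in the given time zone.
  // A fixed pattern is used so that only the time zone affects the result.
  static String inZone(Date instant, String zoneId) {
    SimpleDateFormat f = new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.US);
    f.setTimeZone(TimeZone.getTimeZone(zoneId));
    return f.format(instant);
  }

  public static void main(String[] args) {
    // Stored once, as milliseconds since January 1, 1970 UTC.
    Date stored = new Date(1033779246000L);

    // Converted to each user's time zone only when displayed.
    System.out.println(inZone(stored, "UTC"));                 // 2002-10-05 00:54
    System.out.println(inZone(stored, "America/Los_Angeles")); // 2002-10-04 17:54
    System.out.println(inZone(stored, "Asia/Tokyo"));          // 2002-10-05 09:54
    System.out.println(inZone(stored, "Europe/Paris"));        // 2002-10-05 02:54
  }
}
```

Note that the calendar date itself differs between users: the same instant falls on October 4 in California and on October 5 in Tokyo and Paris.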

 

Conversion from string to date

In Java, it is possible to parse a single date string that contains the month, day and year and convert it to a Date object.  However, this is not recommended when your software receives input from various regions of the world at the same time.  The following example illustrates the case where a user in the U.S. enters the date "October 12, 2004" in the short format "10/12/4", and shows how this string is interpreted under different locale settings:

 

import java.util.*;

import java.text.*;

 

public class intl35 {

 

  public static void main(String args[])

  {

    String asLanguageCountry[][] = {

       // Western Europe

      { "en", "GB"},

      { "fr", "FR"},

      { "de", "DE"},

      { "es", "ES"},

      { "it", "IT"},

      { "sv", "SE"},

      { "da", "DK"},

      { "nl", "NL"},

 

       // North America

      { "en", "US"},

      { "en", "CA"},

      { "fr", "CA"},

      { "es", "MX"}

    };

    Locale l;

    int i, j;

    DateFormat df;

    DateFormat dfLong;

    String sDate = "10/12/4"; // October 12, 2004 in the U.S. format

    Date d;

 

    // Code

    for(i = 0; i < asLanguageCountry.length; i++){

       l = new Locale(asLanguageCountry[i][0], asLanguageCountry[i][1]);

       System.out.print(l.getDisplayName(Locale.US) + "\t");

 

       df = DateFormat.getDateInstance(DateFormat.SHORT, l);

       dfLong = DateFormat.getDateInstance(DateFormat.LONG, l);

 

       try {

         d = df.parse(sDate);

         if(d == null){

           System.out.print("Date string cannot be parsed." + "\n");

         }

         else{

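           // Note: this formats the current date, not the parsed value d.

           // Pass d instead (dfLong.format(d)) to display the parsed date.

           System.out.print(dfLong.format(new Date()) + "\n");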

         }

       }

       catch(Exception e){

          System.out.print("Date string cannot be parsed: " + e.getMessage() + "\n");

       }

    }

  }

}

intl35.java

The output is shown below:

 

C:\programs\javatest\site>java intl35

English (United Kingdom)        04 October 2002

French (France) 4 octobre 2002

German (Germany)        Date string cannot be parsed: Unparseable date: "10/12/4"

Spanish (Spain) 4 de octubre de 2002

Italian (Italy) 4 ottobre 2002

Swedish (Sweden)        Date string cannot be parsed: Unparseable date: "10/12/4"

Danish (Denmark)        Date string cannot be parsed: Unparseable date: "10/12/4"

Dutch (Netherlands)     Date string cannot be parsed: Unparseable date: "10/12/4