Initializing MFC/CRT for consumption of regional settings (Internationalization/C++)

Summary

In this post I’m going to demonstrate how to make sure C++ programs that use the Microsoft Foundation Classes (MFC) pick up the user’s regional settings correctly. You think it’s a fully transparent process and already done for you? Not always! I’m going to start with the conclusion, and you can read on if you need the detail. I hope people find this useful.

Conclusion

Put the following code in a function and call it from your application’s initialization routine (for example, InitInstance()). Without it, MFC classes that mix calls to the CRT and to other DLLs when formatting numbers and dates can give inconsistent, only partially localized results.

setlocale(LC_COLLATE, ".OCP");  // sets the sort order
setlocale(LC_MONETARY, ".OCP"); // sets the currency formatting rules
setlocale(LC_NUMERIC, ".OCP");  // sets the formatting of numerals
setlocale(LC_TIME, ".OCP");     // defines the date/time formatting
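As a rough sketch of how those four calls might be wrapped up, here is a small helper that applies one locale name to each category and reports whether every call succeeded. The name ApplyLocaleToFormattingCategories is my own invention, not an MFC or CRT function, and I am assuming you would rather know about a failed call than ignore it.

```cpp
#include <clocale>

// Hypothetical helper (not part of MFC or the CRT): applies one locale name
// to every category except LC_CTYPE, and reports whether all calls succeeded.
bool ApplyLocaleToFormattingCategories(const char* localeName)
{
    const int categories[] = { LC_COLLATE, LC_MONETARY, LC_NUMERIC, LC_TIME };
    const int count = sizeof(categories) / sizeof(categories[0]);

    bool allOk = true;
    for (int i = 0; i < count; ++i)
    {
        // setlocale returns a null pointer when the request cannot be honoured.
        if (std::setlocale(categories[i], localeName) == nullptr)
            allOk = false;
    }
    return allOk;
}
```

In an MFC application I would expect to call ApplyLocaleToFormattingCategories(".OCP") near the top of InitInstance(). Bear in mind that the ".OCP" name is specific to the Microsoft CRT.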

Background

I’m going to walk you through how I arrived at the conclusion above, supported by some code samples and other articles.

Whilst reviewing some region-aware code in C++, I noticed some differences between the way C++ and C# handle regional settings. In C# this is (relatively) easy, as the majority of the .NET classes are geared up for culture awareness. In MFC it appeared (at first) to be very similar, although hardly any classes provide the overloads you get in .NET.

During automated testing of the code, strange results caused me hours of head scratching. The worst thing a programmer can face is inconsistent results. We like things to fail or pass 100% of the time.

For the purposes of this example, I created an MFC application in Visual Studio 2010, using a dialog based application type with no document/view support. My code is placed into the OnAppAbout event handler (provided as default) as a playpen area.

Here I’m simply tracing the ‘localized’ time format, and then specifying a more verbose time format that includes the abbreviated month and day names.

void CInternationalizationExampleApp::OnAppAbout()
{
    COleDateTime tm(2001, 5, 1, 12, 13, 14);
    // _T() keeps the literals valid in both Unicode and MBCS builds
    TRACE(tm.Format() + _T("\n") + tm.Format(_T("%Y/%m (%b)/%d (%a) %X")));
}

This results in the following strings (I’m on a UK locale, so I expect day/month/year):

01/05/2001 12:13:14
2001/05 (May)/01 (Tue) 12:13:14

Now, if I change the regional settings to Swedish (Sweden) in the Region and Language Options dialog, press Apply, and rerun the code – I expect to see the outputs change accordingly.

Changing the regional settings temporarily.

In Swedish, May is Maj, and Tue is Ti. Their short date format is also ISO style, with year, month and day separated by hyphens.

Results:

2001-05-01 12:13:14
2001/05 (May)/01 (Tue) 12:13:14

COleDateTime has used the correct regional settings for Format(), but not for Format(pFormat). On closer inspection, the two overloads differ in one key respect: the first hands the conversion to ::VarBstrFromDate, whereas the second fills in a struct tm and calls _tcsftime.

inline CString COleDateTime::Format(_In_ DWORD dwFlags, _In_ LCID lcid) const
{
    // If null, return empty string
    if (GetStatus() == null)
        return _T("");

    // If invalid, return DateTime global string
    if (GetStatus() == invalid)
    {
        CString str;
        if (str.LoadString(ATL_IDS_DATETIME_INVALID))
            return str;
        return szInvalidDateTime;
    }

    CComBSTR bstr;
    if (FAILED(::VarBstrFromDate(m_dt, lcid, dwFlags, &bstr)))
    {
        CString str;
        if (str.LoadString(ATL_IDS_DATETIME_INVALID))
            return str;
        return szInvalidDateTime;
    }

    CString tmp = CString(bstr);
    return tmp;
}

and

inline CString COleDateTime::Format(_In_z_ LPCTSTR pFormat) const
{
    ATLENSURE_THROW(pFormat != NULL, E_INVALIDARG);

    // If null, return empty string
    if (GetStatus() == null)
        return _T("");

    // If invalid, return DateTime global string
    if (GetStatus() == invalid)
    {
        CString str;
        if (str.LoadString(ATL_IDS_DATETIME_INVALID))
            return str;
        return szInvalidDateTime;
    }

    UDATE ud;
    if (S_OK != VarUdateFromDate(m_dt, 0, &ud))
    {
        CString str;
        if (str.LoadString(ATL_IDS_DATETIME_INVALID))
            return str;
        return szInvalidDateTime;
    }

    struct tm tmTemp;
    tmTemp.tm_sec = ud.st.wSecond;
    tmTemp.tm_min = ud.st.wMinute;
    tmTemp.tm_hour = ud.st.wHour;
    tmTemp.tm_mday = ud.st.wDay;
    tmTemp.tm_mon = ud.st.wMonth - 1;
    tmTemp.tm_year = ud.st.wYear - 1900;
    tmTemp.tm_wday = ud.st.wDayOfWeek;
    tmTemp.tm_yday = ud.wDayOfYear - 1;
    tmTemp.tm_isdst = 0;

    CString strDate;
    LPTSTR lpszTemp = strDate.GetBufferSetLength(256);
    _tcsftime(lpszTemp, strDate.GetLength(), pFormat, &tmTemp);
    strDate.ReleaseBuffer();

    return strDate;
}

Stepping into the code and checking MSDN shows that VarBstrFromDate lives in oleaut32.dll, and without debug symbols for Windows the trail goes cold at that point. _tcsftime, on the other hand, lives in msvcr100d.dll (the debug CRT), which we can step into. There we see references to the _locale_t data type and a family of locale-aware functions with similar signatures, distinguished by an _l suffix (for example, _tcsftime_l).

So the CRT is supposed to be locale aware under the hood. This told me that either two different mechanisms are at play or, more probably, some sort of initialization is required.
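To make the CRT side of this concrete, here is a minimal sketch in portable C++ of what the format-string overload boils down to: fill in a struct tm by hand and pass it to strftime. The helper name FormatViaCrt is hypothetical, and I use the narrow strftime rather than _tcsftime so the sketch is not tied to the Microsoft CRT.

```cpp
#include <clocale>
#include <ctime>
#include <string>

// Hypothetical helper: builds a struct tm by hand, much as the
// format-string overload does, then formats it with strftime.
std::string FormatViaCrt(int year, int month, int day, int weekday,
                         const char* format)
{
    std::tm t = std::tm();   // zero-initialise every field
    t.tm_year = year - 1900; // years since 1900
    t.tm_mon  = month - 1;   // months are zero-based
    t.tm_mday = day;
    t.tm_wday = weekday;     // 0 = Sunday; strftime reads this for %a

    char buffer[256];
    // strftime consults the LC_TIME category of the current CRT locale.
    const std::size_t written =
        std::strftime(buffer, sizeof buffer, format, &t);
    return std::string(buffer, written);
}
```

While LC_TIME is still the "C" locale, FormatViaCrt(2001, 5, 1, 2, "%Y/%m (%b)/%d (%a)") gives the English month and day names whatever the regional settings say, which is the behaviour I was seeing.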

After some further googling, I confirmed that the functions which read the regional settings live in Kernel32.dll and have names like GetLocaleInfo() (see WinNLS.h). The CRT, by contrast, uses functions like setlocale, and adds an _l suffix to functions that accept a _locale_t object. By design, it also has similarly named internal functions such as __getlocaleinfo.
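The idea behind the _l variants, passing the locale in explicitly instead of relying on global state, has a standard C++ analogue in std::locale. The sketch below is only that analogue, not what MFC or the CRT does internally. FormatWithExplicitLocale is a hypothetical name, and std::put_time needs a C++11 standard library.

```cpp
#include <ctime>
#include <iomanip>
#include <locale>
#include <sstream>
#include <string>

// Hypothetical helper: formats a struct tm using the locale passed in,
// leaving the global C and C++ locales untouched.
std::string FormatWithExplicitLocale(const std::tm& t, const char* format,
                                     const std::locale& loc)
{
    std::ostringstream out;
    out.imbue(loc);                   // only this stream uses 'loc'
    out << std::put_time(&t, format); // strftime-style format string
    return out.str();
}
```

Passing std::locale::classic() reproduces the "C" behaviour. Passing std::locale("") asks for the user's preferred locale, although which names that accepts, and whether it throws, depends on the platform.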

I found the following articles on CodeProject that further confirmed my suspicions.

http://www.codeproject.com/Articles/9600/Windows-SetThreadLocale-and-CRT-setlocale

http://www.codeproject.com/Articles/12568/Format-Date-and-Time-As-Per-User-s-Locale-Settings

We can test this theory by setting the locale to Swedish using the setlocale function.

#include <locale.h>

void CInternationalizationExampleApp::OnAppAbout()
{
    COleDateTime tm(2001, 5, 1, 12, 13, 14);
    setlocale(LC_TIME, "swedish"); // force the CRT date/time locale to Swedish
    TRACE(tm.Format() + _T("\n") + tm.Format(_T("%Y/%m (%b)/%d (%a) %X")));
}

Results in:

2001-05-01 12:13:14
2001/05 (maj)/01 (ti) 12:13:14
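A hard-coded setlocale call like this is fine for an experiment, but it changes the CRT locale for the rest of the process by default. If you only want the change for the duration of a test, a small guard object can put the previous value back. This is a sketch under my own assumptions; ScopedCrtLocale is a hypothetical class, not something MFC or the CRT provides.

```cpp
#include <clocale>
#include <string>

// Hypothetical RAII guard: switches one locale category on construction
// and restores the previous value on destruction.
class ScopedCrtLocale
{
public:
    ScopedCrtLocale(int category, const char* name)
        : m_category(category), m_changed(false)
    {
        // Copy the current name first: the pointer setlocale returns may
        // be invalidated by the next setlocale call.
        const char* current = std::setlocale(category, nullptr);
        if (current != nullptr)
            m_previous = current;
        m_changed = std::setlocale(category, name) != nullptr;
    }

    ~ScopedCrtLocale()
    {
        if (m_changed && !m_previous.empty())
            std::setlocale(m_category, m_previous.c_str());
    }

    // True if the requested locale was accepted by the CRT.
    bool Changed() const { return m_changed; }

private:
    ScopedCrtLocale(const ScopedCrtLocale&);            // non-copyable
    ScopedCrtLocale& operator=(const ScopedCrtLocale&); // non-assignable

    int m_category;
    bool m_changed;
    std::string m_previous;
};
```

In the test above you would write ScopedCrtLocale guard(LC_TIME, "swedish"); before the TRACE call and check guard.Changed(), since the set of locale names that setlocale accepts varies between CRTs.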

This is good news, but I need it to work from the current user profile by default, and I don’t want to use the method described in the CodeProject articles if I can help it. I’m not switching locales.

On closer inspection of the CRT source code, I traced what happens in setlocale(): it calls __getlocaleinfo, then __crtGetLocaleInfoA, and finally the function that really matters, __crtGetLocaleInfoA_stat, which calls GetLocaleInfoW from Kernel32.dll. Bingo.

If I set a breakpoint in __crtGetLocaleInfoA_stat, it doesn’t fire unless we call setlocale().

This indicates that the CRT does not populate its locale buffers at startup, nor lazily; the locale has to be set explicitly (as the CodeProject articles correctly state).

Looking at the setlocale MSDN article, we can see there’s a shortcut to initialize the CRT based on the current regional settings:
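You can check this without a debugger, because calling setlocale with a null pointer for the locale name queries the current setting without changing it. The C standard says a program starts in the "C" locale, so that is what I would expect to see until something calls setlocale explicitly. CurrentCrtLocale is a hypothetical helper name of my own.

```cpp
#include <clocale>
#include <string>

// Hypothetical helper: returns the name of the locale currently active
// for a category. A null locale argument means "query, do not change".
std::string CurrentCrtLocale(int category)
{
    const char* name = std::setlocale(category, nullptr);
    return name ? std::string(name) : std::string();
}
```

Tracing CurrentCrtLocale(LC_TIME) before and after your initialization code is a quick way to confirm that the setlocale calls actually took effect.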

setlocale( LC_ALL, ".OCP" );

Explicitly sets the locale to the current OEM code page obtained from the operating system.

I can put this code in, and my test code produces the same (correct) result:

void CInternationalizationExampleApp::OnAppAbout()
{
    COleDateTime tm(2001, 5, 1, 12, 13, 14);
    setlocale(LC_ALL, ".OCP"); // initialize the CRT from the user's regional settings
    //setlocale(LC_TIME, "swedish");
    TRACE(tm.Format() + _T("\n") + tm.Format(_T("%Y/%m (%b)/%d (%a) %X")));
}

Results:

2001-05-01 12:13:14
2001/05 (maj)/01 (ti) 12:13:14

In the article quoted below, Microsoft recommends NOT using LC_ALL or LC_CTYPE: http://msdn.microsoft.com/en-us/goglobal/bb688121.aspx

Note: The code from the article doesn’t actually compile as written, because L(".OCP") is not valid syntax for a wide string literal. You can use setlocale() instead and drop both the L prefix and the _w variant.

C Run Time

CRT locale support is built around the (_w) setlocale (category, locale) call. A call to this function defines the results of all subsequent CRT-based locale-sensitive operations, not only the character encoding. The category argument defines scope of environment changes after setlocale is called.

In order to set the rules for formatting locale-sensitive data in accordance with the user locale, the following calls can be executed:

_wsetlocale (LC_COLLATE, L(".OCP") ); // sets the sort order
_wsetlocale (LC_MONETARY, L(".OCP") ); // sets the currency formatting rules
_wsetlocale (LC_NUMERIC, L(".OCP") ); // sets the formatting of numerals
_wsetlocale (LC_TIME, L(".OCP") ); // defines the date/time formatting

“.OCP” and “.ACP” parameters always refer to the settings of the user locale, not the system locale. While selecting this locale for LC_CTYPE or LC_ALL is not a good choice, all other categories should be set to match the user locale, unless your console must be explicitly independent of the user’s settings.
