Converting Text from Unicode to Character Format

To convert text data from Unicode to another character encoding standard, use ITranscoder. The Unicode data can be either a UniChar* or an IText object. The char data can be either a char* or an IString object.

To convert text from Unicode into another character format:

  1. Call ITranscoder::createTranscoder to create a transcoder for the desired character set. Use the transcoder name provided in the Transcoder Names table. You can also specify a mapping proximity. ITranscoder::kSupersetMapping is the default.
  2. Set the behavior for handling exception characters (characters that have no mapping in the target character set) if you want the transcoder to do something other than use substitution characters. Specify the behavior with ITranscoder::setUnmappedBehavior. You can also set specific substitution characters using setCharSubstitute.
  3. Preprocess the line-breaking characters by calling ILineBreakConverter::convertInPlace or convert. You must specify the line-breaking convention to use for the non-Unicode text.
  4. Transcode the text using the fromUnicode function.

For example, this code shows how to transcode text from Unicode (unicodeText) into the ISO-8859-1 (Latin-1) character set:

// Create the transcoder
ITranscoder* transcoder = ITranscoder::createTranscoder("ISO-8859-1");

// Preprocess any line-breaking characters
ILineBreakConverter::convertInPlace(unicodeText,
		ILineBreakConverter::kHost);

// Transcode the string
IString asciiText;
ITranscoder::result res = transcoder->fromUnicode(unicodeText,
		asciiText);
if (res == codecvt_base::ok) {
	// transcoding was successful
}

// Release the transcoder
delete transcoder;
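
To make steps 2 and 3 more concrete, the following is a minimal standalone sketch in standard C++ of what line-break preprocessing and substitution-based transcoding to ISO-8859-1 amount to conceptually. It does not use ITranscoder or ILineBreakConverter and is not their implementation. The helper names normalizeLineBreaks and toLatin1 are hypothetical, and the sketch assumes that the Unicode text is UTF-16 and that the target line-breaking convention is a single line feed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of ILineBreakConverter): rewrite every
// line break in UTF-16 text as a single line feed (U+000A). Recognizes
// CR LF, a lone CR, U+2028 LINE SEPARATOR, and U+2029 PARAGRAPH SEPARATOR.
std::u16string normalizeLineBreaks(const std::u16string& in) {
    std::u16string out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        char16_t c = in[i];
        if (c == u'\r') {
            // Treat CR LF as one break by skipping the LF.
            if (i + 1 < in.size() && in[i + 1] == u'\n') ++i;
            out.push_back(u'\n');
        } else if (c == 0x2028 || c == 0x2029) {
            out.push_back(u'\n');
        } else {
            out.push_back(c);
        }
    }
    return out;
}

// Hypothetical helper (not part of ITranscoder): convert UTF-16 text to
// ISO-8859-1 bytes. Code points U+0000 through U+00FF map to the byte
// with the same value; every other character is replaced by `substitute`.
// `substituted` receives the number of characters that were replaced.
std::string toLatin1(const std::u16string& in, char substitute,
                     std::size_t& substituted) {
    std::string out;
    out.reserve(in.size());
    substituted = 0;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char16_t c = in[i];
        if (c <= 0xFF) {
            out.push_back(static_cast<char>(static_cast<unsigned char>(c)));
        } else {
            // A surrogate pair encodes one character, so consume both
            // code units and emit a single substitution character.
            bool isHigh = (c >= 0xD800 && c <= 0xDBFF);
            if (isHigh && i + 1 < in.size() &&
                in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
                ++i;
            }
            out.push_back(substitute);
            ++substituted;
        }
    }
    return out;
}
```

Calling normalizeLineBreaks first and then toLatin1 with '?' as the substitute mirrors the order of steps 3 and 4. A nonzero substitution count plays the role of a result other than ok: it signals that some characters had no mapping in the target character set.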

Transcoder Names