ILineBreakConverter

This class is used to process the line breaking conventions for code set conversion.

When a string in a foreign code set is transcoded to Unicode, its line breaks are not converted, so the resulting Unicode string still carries the line breaks of the foreign code set. Clients must postprocess the Unicode string so that it uses the Unicode line breaking convention (for example, PARAGRAPH SEPARATOR, U+2029). ILineBreakConverter is a simple concrete class that does this. For instance, on DOS, Win32, and OS/2 hosts, line breaks (CR/LF sequences) are converted to PARAGRAPH SEPARATORs; on AIX, LFs are converted to PARAGRAPH SEPARATORs.

Converting the PARAGRAPH SEPARATOR back to an appropriate line breaking convention depends on the host on which the document is being viewed or used. The following table shows the rules used during line break conversion:

     Host/App             Host Line Breaks                   Unicode Line Breaks
     ========             ================                   ===================
     DOS,OS/2,Win        CR/LF:0x000D 0x000A             <=> U+2029 PARAGRAPH SEPARATOR
     Unix                     LF:0x000A                  <=> U+2029 PARAGRAPH SEPARATOR
     Mac                      CR:0x000D                  <=> U+2029 PARAGRAPH SEPARATOR
     Word/RichEdit       CR/LF_VT:0x000D 0x000A          <=> U+2029 PARAGRAPH SEPARATOR
     Word/RichEdit       CR/LF_VT:0x000B                 <=> U+2028 LINE SEPARATOR
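
The left-to-right direction of the table can be sketched in standard C++ as follows. This is only an illustration, not the Open Class implementation: the function name hostBreaksToUnicode is hypothetical, std::u16string stands in for IText, and the sketch assumes that all host conventions in the table may be recognized in a single pass rather than selected by host.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the host-to-Unicode direction of the table.
// Assumption: every host convention listed in the table is accepted at once.
std::u16string hostBreaksToUnicode(const std::u16string& text)
{
    const char16_t kCR = 0x000D, kLF = 0x000A, kVT = 0x000B;
    const char16_t kLineSep = 0x2028, kParaSep = 0x2029;

    std::u16string out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char16_t c = text[i];
        if (c == kCR) {
            // CR/LF collapses to one separator; a lone CR (Mac) also maps to one.
            if (i + 1 < text.size() && text[i + 1] == kLF) ++i;
            out.push_back(kParaSep);
        } else if (c == kLF) {
            out.push_back(kParaSep);      // UNIX line break
        } else if (c == kVT) {
            out.push_back(kLineSep);      // Word/RichEdit soft line break
        } else {
            out.push_back(c);
        }
    }
    return out;
}
```

Checking for the CR/LF pair before handling a lone CR or LF keeps a DOS line break from turning into two separators.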

The following scenario shows how to use this class: postprocess the line breaks after a text string is transcoded to Unicode, and then preprocess the line breaks before transcoding the text string back to the host character set:

     transcoder->toUnicode(hostText, uniText);
     ILineBreakConverter::convertInPlace(uniText); // postprocess
     // Do something on uniText ...
     ILineBreakConverter::convertInPlace(uniText, ILineBreakConverter::kHost); // preprocess
     transcoder->fromUnicode(uniText, hostText);
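
The preprocess step of this scenario (Unicode separators back to host line breaks) might look like the sketch below. The names Target and unicodeBreaksToHost are hypothetical, and std::u16string stands in for IText. The table says nothing about U+2028 for targets other than Word/RichEdit, so leaving it untouched there is an assumption of this sketch.

```cpp
#include <string>

// Hypothetical mirror of the concrete host values of ELineBreakConvention.
enum class Target { CRLF, LF, CR, CRLF_VT };

// Rewrites U+2029 (and, for Word/RichEdit, U+2028) to the target host breaks.
std::u16string unicodeBreaksToHost(const std::u16string& text, Target target)
{
    const char16_t kLineSep = 0x2028, kParaSep = 0x2029, kVT = 0x000B;

    // Paragraph break sequence for the requested convention.
    std::u16string paragraphBreak;
    switch (target) {
        case Target::LF:      paragraphBreak = u"\n";   break;
        case Target::CR:      paragraphBreak = u"\r";   break;
        case Target::CRLF:
        case Target::CRLF_VT: paragraphBreak = u"\r\n"; break;
    }

    std::u16string out;
    for (char16_t c : text) {
        if (c == kParaSep) {
            out += paragraphBreak;
        } else if (c == kLineSep && target == Target::CRLF_VT) {
            out.push_back(kVT);   // only Word/RichEdit has a LINE SEPARATOR rule
        } else {
            out.push_back(c);     // assumption: everything else is left as-is
        }
    }
    return out;
}
```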
 

You should not derive from this class.


ILineBreakConverter - Member Functions and Data by Group

Converting a Unicode String to Use a Given Line Break Convention

Use the functions in this group to convert a Unicode string to the specified line break convention. With these functions, you can postprocess the line breaks after text is transcoded to Unicode, and preprocess them before the text is transcoded back to the foreign host character set.


convert
public:
static IText convert( const IText& string, ELineBreakConvention target = kUnicode )
Converts a Unicode string to use the target line break convention. The default is the Unicode line break convention. Returns a Unicode string that uses the target line break convention.

Note: Use this function to postprocess the line breaks of a string after it is converted to Unicode, and to preprocess the line breaks before it is converted back to the foreign host character set.

string
The given Unicode string.
target
The target line break convention.

Supported Platforms

Windows OS/2 AIX
Yes Yes Yes


convertInPlace
public:
static void convertInPlace( IText& string, ELineBreakConvention target = kUnicode )
Converts a Unicode string in place to use the target line break convention. The default is the Unicode line break convention.

Note: Use this function to postprocess the line breaks after converting to Unicode, and to preprocess the line breaks before converting back to the foreign host character set.

string
The given Unicode string.
target
The target line break convention.

Supported Platforms

Windows OS/2 AIX
Yes Yes Yes
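
One way an in-place conversion and a copying conversion can relate to each other is sketched below. The function names are hypothetical, std::u16string stands in for IText, and only the CR/LF, CR, and LF rules for the Unicode target are covered. It is not a description of how the library implements convert or convertInPlace.

```cpp
#include <cstddef>
#include <string>

// Hypothetical in-place conversion to the Unicode convention.
// A read index and a write index walk the same buffer; because a break
// never grows in this direction, the write index never passes the read index.
void toUnicodeBreaksInPlace(std::u16string& text)
{
    std::size_t write = 0;
    for (std::size_t read = 0; read < text.size(); ++read) {
        char16_t c = text[read];
        if (c == 0x000D) {
            // Swallow the LF of a CR/LF pair so it yields one separator.
            if (read + 1 < text.size() && text[read + 1] == 0x000A) ++read;
            c = 0x2029;
        } else if (c == 0x000A) {
            c = 0x2029;
        }
        text[write++] = c;
    }
    text.resize(write);   // drop the tail left over from collapsed pairs
}

// Copying variant expressed in terms of the in-place one.
std::u16string toUnicodeBreaks(std::u16string text)
{
    toUnicodeBreaksInPlace(text);
    return text;
}
```

The opposite direction can lengthen the string (one U+2029 becomes two code units for CR/LF), so a real in-place routine for that case would have to grow the buffer rather than compact it.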


Getting the Host Line Break Convention

Use the function in this group to obtain the current line breaking convention used for the host.


hostConvention
public:
static ELineBreakConvention hostConvention()
Gets the current host line break convention.

Supported Platforms

Windows OS/2 AIX
Yes Yes Yes
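
How the host convention is determined is not documented here. As a rough illustration for the three supported platforms, a compile-time choice could look like the following. HostConvention and guessHostConvention are hypothetical names, and the sketch assumes the compiler predefines _WIN32 on Windows and __OS2__ on OS/2.

```cpp
// Hypothetical mirror of the host-related values of ELineBreakConvention.
enum class HostConvention { CRLF, LF };

// Compile-time guess at the host line break convention.
// Assumption: _WIN32 and __OS2__ are predefined by the respective compilers.
constexpr HostConvention guessHostConvention()
{
#if defined(_WIN32) || defined(__OS2__)
    return HostConvention::CRLF;   // Windows and OS/2 use CR/LF
#else
    return HostConvention::LF;     // AIX and other UNIX systems use LF
#endif
}
```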


ILineBreakConverter - Enumerations


ELineBreakConvention
enum ELineBreakConvention { kUnicode, 
                            kCRLF, 
                            kLF, 
                            kCR, 
                            kCRLF_VT, 
                            kHost }
Useful constants indicating the target line break convention.
kUnicode - Unicode line break convention.
kCRLF    - Win, OS/2, DOS line break convention.
kLF      - UNIX line break convention.
kCR      - Macintosh line break convention.
kCRLF_VT - Microsoft Word/RichEdit line break convention.
kHost    - Current host line break convention.

Supported Platforms

Windows OS/2 AIX
Yes Yes Yes


ILineBreakConverter - Inherited Member Functions and Data

Inherited Public Functions

Inherited Public Data

Inherited Protected Functions

Inherited Protected Data