FrontBase Documentation

Updated: 20-Nov-2000

6.8. FrontBase Unicode Manager

FBUnicodeManager is a YellowBox application that lets you create and maintain collations and translations.

A collation is a table-like mapping between Unicode characters and corresponding values that should be used when comparing two Unicode strings. The identity collation is thus a collation where each Unicode character is mapped into the ordinal value of the character in the Unicode character set.
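
As a rough illustration of the idea (this is not FrontBase code), a collation can be modeled in Python as a dictionary from characters to comparison values. The names `collation_key` and `case_blind` are hypothetical. Falling back to the ordinal value for unmapped characters is assumed here, so that an empty mapping behaves as the identity collation.

```python
def collation_key(text, collation):
    """Return the list of comparison values for text under a collation.

    collation maps a character to an integer value. Characters missing
    from the mapping fall back to their ordinal value (an assumption that
    makes the empty mapping act as the identity collation).
    """
    return [collation.get(ch, ord(ch)) for ch in text]


# Identity collation: every character compares by its ordinal value.
identity = {}

# Hypothetical collation giving 'a'..'z' the same values as 'A'..'Z'.
case_blind = {chr(c): c - 32 for c in range(ord("a"), ord("z") + 1)}

print(collation_key("Ab", identity))    # [65, 98]
print(collation_key("Ab", case_blind))  # [65, 66]
print(collation_key("abc", case_blind) == collation_key("ABC", case_blind))  # True
```

Two strings compare as equal under a collation when their value lists are equal, which is why the last line prints True.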

A translation maps each Unicode character defined in the translation to a string of Unicode characters. A translation is thus not a simple one-to-one lookup table: it can convert a string of characters into a completely different string, and even the lengths may differ. A typical example is converting from lower case to upper case, where in some languages a single lower case character maps into two or more upper case characters, e.g. the German lower case ß, which maps into SS. The predefined SQL 92 functions UPPER and LOWER actually use translations, so by modifying the corresponding translations you can change the result of both functions.
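
The character-to-string nature of a translation can be sketched in Python as follows. This is only a model of the concept, not FrontBase code. The `translate` helper and the `to_upper` table are hypothetical, and leaving unmapped characters unchanged is an assumption.

```python
def translate(text, translation):
    """Replace each mapped character by its string; keep unmapped characters."""
    return "".join(translation.get(ch, ch) for ch in text)


# Hypothetical fragment of an upper case translation.
# "\u00df" is the German lower case sharp s.
to_upper = {"a": "A", "e": "E", "r": "R", "s": "S", "t": "T", "\u00df": "SS"}

source = "stra\u00dfe"
result = translate(source, to_upper)
print(result)                    # STRASSE
print(len(source), len(result))  # 6 7
```

The result is one character longer than the input, which a one-to-one lookup table could not express.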

The Unicode inspector
Creating a collation
Saving a collation
To edit an existing collation
To use a collation in SQL 92
Creating a translation
Saving a translation
To edit an existing translation
To use a translation in SQL 92
The Unicode inspector

The Unicode inspector is an important component of FBUnicodeManager: it lets you view and examine all characters defined in the Unicode 2.0 character set. It is also the key instrument for creating collations and translations.

To bring the Unicode inspector to the front, click on the Unicode inspector... menu item:


Code - The ordinal value (hexadecimal) in the Unicode character set
Unicode 2.0 Name - The official full name of the character
Category - The category assigned to the character by the Unicode committee
Upper - The ordinal value (hexadecimal) of the corresponding upper case character (if any)
Lower - The ordinal value (hexadecimal) of the corresponding lower case character (if any)
Title - The ordinal value (hexadecimal) of the corresponding title case character (if any, see e.g. 1C5 hex)
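
For readers without access to the inspector, roughly the same fields can be looked up with Python's standard `unicodedata` module. This is a sketch only: Python ships a much later Unicode version than 2.0, so names and categories may differ from what FBUnicodeManager shows. The `inspect` function and the choice to leave a case field empty when there is no single distinct counterpart are assumptions.

```python
import unicodedata


def inspect(code):
    """Return inspector-style fields for one code point (an int)."""
    ch = chr(code)

    def case_code(mapped):
        # Report a case mapping only when it is a single, different character.
        if len(mapped) == 1 and mapped != ch:
            return format(ord(mapped), "X")
        return ""

    return {
        "Code": format(code, "X"),
        "Name": unicodedata.name(ch, ""),
        "Category": unicodedata.category(ch),
        "Upper": case_code(ch.upper()),
        "Lower": case_code(ch.lower()),
        "Title": case_code(ch.title()),
    }


info = inspect(0x1C4)
print(info["Code"], info["Category"], info["Lower"], info["Title"])  # 1C4 Lu 1C6 1C5
```

The character 1C4 hex is used here because its title case form is the separate character 1C5 hex.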

Clicking on a line in the inspector selects it and updates the display field and the ordinal field. Please note that the Helvetica font that comes with Mac OS X Server doesn't include a definition for all Unicode characters; in particular, few characters with an ordinal value higher than 255 (FF hex.) are defined. Examples: the character "Latin Small Ligature Ij" (132 hex.) isn't defined, while the character "Latin Small Letter O With Macron" (14D hex.) is defined.

Creating a collation

To create a collation, click on the Collations->New menu item:


Code - The ordinal value (hexadecimal) in the Unicode character set
Unicode 2.0 Name - The official full name of the character
Value - The value (decimal) which will be used when comparing two characters
Upper - The ordinal value (hexadecimal) of the corresponding upper case character (if any)
Lower - The ordinal value (hexadecimal) of the corresponding lower case character (if any)

The above collation has been filled with values by first clicking on the "Include Unicodes" button and then clicking on the "Map all lowercase into uppercase" button.

To quickly populate an otherwise empty collation, click on the "Include Unicodes" button. If you don't want e.g. the first 32 Unicode characters (they are all control codes), enter 32 in the From: field before clicking on the "Include Unicodes" button. To limit (or extend) the upper bound correspondingly, edit the To: field.
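
The effect of the two buttons can be modeled in Python roughly as below. This is a guess at the behavior based on the description, not the application's actual logic. The function names are hypothetical, the To: bound is assumed to be inclusive, and "Map all lowercase into uppercase" is assumed to give each lower case character the value of its upper case counterpart.

```python
def include_unicodes(first, last):
    """Return collation rows [code, value] for first..last, value = ordinal."""
    return [[code, code] for code in range(first, last + 1)]


def map_lowercase_to_uppercase(rows):
    """Give each lower case character the value of its upper case counterpart."""
    values = {code: value for code, value in rows}
    for row in rows:
        upper = chr(row[0]).upper()
        # Only single-character counterparts that are present in the collation.
        if len(upper) == 1 and ord(upper) != row[0] and ord(upper) in values:
            row[1] = values[ord(upper)]
    return rows


rows = include_unicodes(32, 127)  # From: 32 skips the 32 control codes
map_lowercase_to_uppercase(rows)
lookup = {code: value for code, value in rows}
print(len(rows))                           # 96
print(lookup[ord("a")], lookup[ord("A")])  # 65 65
```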

To select individual Unicodes to be inserted in a collation: go to the Unicode inspector, select the characters, click Edit->Copy, go to the Collation window and click Edit->Paste.

To rearrange characters in a collation: select the characters, click Edit->Cut, click on the character after which you want the cut characters inserted, and click Edit->Paste.

To reset the value to the ordinal value, click on the "Reset Value" button.

If you want the characters in the collation to be given consecutive values, i.e. starting from a given value, edit the "Starting at:" field and click on the "Renumber Value" button.
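
Rearranging and then renumbering is what lets a character sort somewhere other than its ordinal position. A minimal Python sketch of that combination, with hypothetical function names and an assumed ordering in which 'ä' has been moved to directly after 'a':

```python
def renumber_values(codes, starting_at):
    """Assign consecutive values to codes in their current order."""
    return {code: starting_at + i for i, code in enumerate(codes)}


def reset_values(codes):
    """Set every value back to the ordinal value of its character."""
    return {code: code for code in codes}


# "\u00e4" is 'ä'; by ordinal value alone it would sort after 'c'.
codes = [ord(c) for c in "a\u00e4bc"]
values = renumber_values(codes, 1000)

print(sorted("b\u00e4ca", key=lambda ch: values[ord(ch)]))  # ['a', 'ä', 'b', 'c']
print(reset_values(codes)[ord("\u00e4")])                   # 228
```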

Please note that characters not defined in a collation will be assigned a value that is equal to the ordinal value, i.e. it is perfectly OK to leave out characters that are of no concern to you.

Saving a collation

Once you have completed the definition of a new collation, click on the Collations->Save as... menu item:


Collations must be stored in /usr/FrontBase/Collations. If you don't have write permission for this folder, save the collation in e.g. your home folder; it can then be copied into place later. The "Table/Search threshold:" field specifies the value that defines the split between a direct lookup table and a linear search table. If you have a collation that covers a wide range of Unicode characters but is very sparse, it may not make sense to use a direct lookup table for all characters. This is mainly a memory issue for the FrontBase database server.
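
The trade-off behind the threshold can be sketched in Python. The file format and the server's real data structures are not described in this document, so the class below is purely an assumed model: codes below the threshold get one slot each in a direct table, while codes at or above it are kept as a short list that is searched linearly.

```python
class HybridCollation:
    """Direct table below a threshold, linear search at or above it."""

    def __init__(self, mapping, threshold):
        self.threshold = threshold
        # Dense part: one slot per code below the threshold, identity by default.
        self.table = [mapping.get(code, code) for code in range(threshold)]
        # Sparse part: only the codes that are actually defined.
        self.pairs = sorted((c, v) for c, v in mapping.items() if c >= threshold)

    def value(self, code):
        if code < self.threshold:
            return self.table[code]  # constant-time direct lookup
        for c, v in self.pairs:      # linear search over the sparse part
            if c == code:
                return v
        return code                  # undefined character: ordinal value


coll = HybridCollation({0x61: 0x41, 0x3B1: 0x391}, threshold=256)
print(len(coll.table), len(coll.pairs))                        # 256 1
print(coll.value(0x61), coll.value(0x3B1), coll.value(0x3B2))  # 65 913 946
```

With a threshold of 256 the table needs 256 slots. Covering the Greek entry with a direct table instead would need a table of at least 946 slots, nearly all of them identity entries.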

To edit an existing collation

Click on the Collations->Open menu item, select the collation in the Open panel and click OK. You can then modify the collation as needed and save it again by clicking on the Collations->Save menu item.

To use a collation in SQL 92

Once a collation has been stored in /usr/FrontBase/Collations, it can be used in SQL 92 statements. One very common way to use a collation is to specify it when creating a table:

	CREATE COLLATION CASE_INSENSITIVE
		FOR INFORMATION_SCHEMA.SQL_TEXT
		FROM EXTERNAL('CaseInsensitive.coll1');	
			-- CaseInsensitive.coll1 comes with a FrontBase distribution
	COMMIT;

	CREATE TABLE T0(C0 VARCHAR(1024) COLLATE CASE_INSENSITIVE);
	COMMIT;

Whenever two C0 column values are compared, the specified collation will be used. This applies not only to WHERE clauses, but to all operations in which a C0 column participates, including indexes.
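
To see the effect of a column-level collation without a FrontBase server, the same idea can be tried with SQLite through Python's standard `sqlite3` module. This is an analogy only: SQLite registers collations as callbacks rather than loading them from files, and the `case_insensitive` function is a hypothetical stand-in for CaseInsensitive.coll1.

```python
import sqlite3


def case_insensitive(a, b):
    """Three-way comparison of two strings after upper-casing both."""
    a, b = a.upper(), b.upper()
    return (a > b) - (a < b)


conn = sqlite3.connect(":memory:")
conn.create_collation("CASE_INSENSITIVE", case_insensitive)
conn.execute("CREATE TABLE T0(C0 VARCHAR(1024) COLLATE CASE_INSENSITIVE)")
conn.executemany("INSERT INTO T0 VALUES (?)", [("Hello",), ("HELLO",), ("world",)])

# The comparison in the WHERE clause picks up the collation declared on C0.
rows = conn.execute("SELECT C0 FROM T0 WHERE C0 = 'hello' ORDER BY rowid").fetchall()
print(rows)  # [('Hello',), ('HELLO',)]
```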

Creating a translation

To create a translation, click on the Translations->New menu item:


Code - The ordinal value (hexadecimal) in the Unicode character set
Unicode 2.0 Name - The official full name of the character
Upper - The ordinal value (hexadecimal) of the corresponding upper case character (if any)
Lower - The ordinal value (hexadecimal) of the corresponding lower case character (if any)
Translate into - The string of characters the selected character is to be translated into

The above translation has been filled with values by clicking on the "Include Unicodes" button.

To quickly populate an otherwise empty translation, click on the "Include Unicodes" button. If you don't want e.g. the first 32 Unicode characters (they are all control codes), enter 32 in the From: field before clicking on the "Include Unicodes" button. To limit (or extend) the upper bound correspondingly, edit the To: field.

To select individual Unicodes to be inserted in a translation: go to the Unicode inspector, select the characters, click Edit->Copy, go to the translation window and click Edit->Paste.

To rearrange characters in a translation: select the characters, click Edit->Cut, click on the character after which you want the cut characters inserted, and click Edit->Paste.

Double-click, for a given character, on the "Translate into" field and enter the proper string of characters.

Saving a translation

Once you have completed the definition of a new translation, click on the Translations->Save as... menu item:


Translations must be stored in /usr/FrontBase/Translations. If you don't have write permission for this folder, save the translation in e.g. your home folder; it can then be copied into place later. The "Table/Search threshold:" field specifies the value that defines the split between a direct lookup table and a linear search table. If you have a translation that covers a wide range of Unicode characters but is very sparse, it may not make sense to use a direct lookup table for all characters. This is mainly a memory issue for the FrontBase database server.

To edit an existing translation

Click on the Translations->Open menu item, select the translation in the Open panel and click OK. You can then modify the translation as needed and save it again by clicking on the Translations->Save menu item.

To use a translation in SQL 92

Once a translation has been stored in /usr/FrontBase/Translations, it can be used in SQL 92 statements.

	CREATE TRANSLATION ROMAN_DIGITS
		FOR INFORMATION_SCHEMA.SQL_TEXT
		TO  INFORMATION_SCHEMA.SQL_TEXT
		FROM EXTERNAL('RomanDigits.trans');
	COMMIT;

	CREATE TABLE T0(C0 VARCHAR(1024));
	INSERT INTO T0 VALUES '111', '3';
	SELECT * FROM T0 WHERE TRANSLATE(C0 USING ROMAN_DIGITS) = 'III';
	-- Returns both rows in T0 (assuming that '1' maps into 'I' and that '3' maps into 'III')
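
The reasoning behind that result can be traced with a small Python model of the query. It is not FrontBase code: the `roman_digits` table only contains the two mappings assumed in the comment above, and the `translate` helper is hypothetical.

```python
# Assumed contents of RomanDigits.trans, limited to the two digits used here.
roman_digits = {"1": "I", "3": "III"}


def translate(text, translation):
    """Replace each mapped character by its string; keep unmapped characters."""
    return "".join(translation.get(ch, ch) for ch in text)


t0 = ["111", "3"]  # the two rows inserted into T0
matches = [c0 for c0 in t0 if translate(c0, roman_digits) == "III"]
print(matches)  # ['111', '3']
```

Three characters that each map to 'I' and one character that maps to 'III' produce the same translated string, so both rows satisfy the predicate.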


©2000 FrontBase, Inc. All rights reserved.