Unicode Data

Unicode Data and Meadows Software

DesignMerge Pro and DesignMerge Catalog software for Adobe InDesign, as well as Data Exporter for DesignMerge Catalog, can import Unicode data. To use this capability, you must export your data file in UTF-8, a specific Unicode encoding.

 

About UTF-8

UTF-8 is a variable-width character encoding that can represent every character in the Unicode character set, using one to four bytes per character. For the first 128 characters (decimal codes 0 through 127), UTF-8 uses the same one-byte encoding as ASCII, so a file that contains only ASCII characters is also a valid UTF-8 file.
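The byte counts described above can be checked with a few lines of Python. This is only an illustrative sketch and is not part of DesignMerge; the sample characters and the helper name `utf8_length` are chosen here for demonstration.

```python
# Sample characters chosen for illustration: one for each UTF-8 byte length.
samples = ["A", "é", "€", "😀"]


def utf8_length(ch: str) -> int:
    """Return the number of bytes UTF-8 uses to encode one character."""
    return len(ch.encode("utf-8"))


for ch in samples:
    print(f"U+{ord(ch):04X} uses {utf8_length(ch)} byte(s) in UTF-8")

# For decimal codes 0 through 127, the UTF-8 bytes are identical to ASCII.
ascii_matches = all(
    chr(code).encode("utf-8") == chr(code).encode("ascii")
    for code in range(128)
)
```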

 

About UTF-16 or UTF-32

UTF-16 and UTF-32 are Unicode character encodings that differ from UTF-8. DesignMerge Pro and DesignMerge Catalog do not support UTF-16 or UTF-32 data at this time.

If you are using a database or spreadsheet application that cannot export data to a UTF-8 text file (such as some versions of Microsoft Excel), you can export the data to a UTF-16 text file instead. You can then use a text utility application (such as BBEdit on the Macintosh) to save the UTF-16 text file as a UTF-8 text file for use with DesignMerge Pro or DesignMerge Catalog. As of this writing, however, most versions of Microsoft Excel allow you to save your data in UTF-8 format directly.
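If a text utility is not at hand, the same UTF-16 to UTF-8 conversion can be scripted. The following is a minimal sketch, not a Meadows tool. The function name `convert_utf16_to_utf8` and its `add_bom` parameter are hypothetical, and the sketch assumes the source file begins with a UTF-16 byte order mark, which lets Python determine the byte order.

```python
def convert_utf16_to_utf8(src_path: str, dst_path: str, add_bom: bool = True) -> None:
    """Re-save a UTF-16 text file as UTF-8 (hypothetical helper).

    When add_bom is True, the output starts with the UTF-8 BOM.
    """
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding="utf-16", newline="") as src:
        text = src.read()

    # Python's "utf-8-sig" codec writes the UTF-8 BOM before the text.
    out_encoding = "utf-8-sig" if add_bom else "utf-8"
    with open(dst_path, "w", encoding=out_encoding, newline="") as dst:
        dst.write(text)
```

For example, `convert_utf16_to_utf8("export_utf16.txt", "export_utf8.txt")` would write a UTF-8 copy of a UTF-16 export; both file names are placeholders.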

 

DesignMerge Features Related to Unicode Support

Data Type:

DesignMerge Pro and DesignMerge Catalog Database Definitions have a new “Data Type” setting where you can indicate whether the database file contains UTF-8 or ASCII data.

Quick Setup:

Quick Setup automatically detects that a database file is a UTF-8 text file if the file includes the UTF-8 BOM (Byte Order Mark). If a UTF-8 file does not contain the UTF-8 BOM, Quick Setup assumes the database is an ASCII text file; however, you can change the “Data Type” setting from “ASCII” to “UTF-8” in the “Quick Setup” dialog before proceeding. You can also change the Database Definition’s “Data Type” setting in the “Edit Database Definition” dialog at any time. Please note that selecting an inappropriate Data Type may yield unpredictable results when DesignMerge imports data, that is, when DesignMerge Catalog updates Links (Placeholders) or when DesignMerge Pro merges a document.
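Before running Quick Setup, you can check whether a data file starts with the UTF-8 BOM (the three bytes EF BB BF). The sketch below is an illustration only and does not reflect how DesignMerge itself performs detection. The function names `has_utf8_bom` and `ensure_utf8_bom` are hypothetical.

```python
import codecs


def has_utf8_bom(path: str) -> bool:
    """Return True if the file begins with the UTF-8 BOM (EF BB BF)."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


def ensure_utf8_bom(path: str) -> bool:
    """Prepend the UTF-8 BOM if it is missing; return True if the file changed."""
    with open(path, "rb") as f:
        data = f.read()
    if data.startswith(codecs.BOM_UTF8):
        return False
    # Raises UnicodeDecodeError if the file is not valid UTF-8,
    # so a file in some other encoding is never given a BOM.
    data.decode("utf-8")
    with open(path, "wb") as f:
        f.write(codecs.BOM_UTF8 + data)
    return True
```

Because `ensure_utf8_bom` rewrites the file in place, it is safer to run it on a copy of the data file.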

Field Names:

Quick Setup uses the data in the first record (the “header row”) of the selected database file as the default names for the new fields. If the data in the first record includes non-ASCII character codes, Field Names may not display as expected because DesignMerge Pro and DesignMerge Catalog Field Names do not display non-ASCII characters at this time. To avoid this, we recommend either using only ASCII Printable Characters* (see note below) in the first record of a UTF-8 database file or editing the default Field Names in the “Quick Setup” dialog before proceeding.
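A header row can be checked ahead of time for characters outside the ASCII Printable Character Set (decimal codes 32 through 126, as defined in the note at the end of this document). The following sketch is an assumption-laden illustration. The function names are hypothetical, and the tab delimiter is only a default guess that should be changed to match your export.

```python
def non_printable_ascii(text: str) -> list:
    """Return the characters in text whose code is outside 32 through 126."""
    return [ch for ch in text if not 32 <= ord(ch) <= 126]


def check_header(path: str, delimiter: str = "\t") -> dict:
    """Map each problem field name in the first record to its offending characters.

    The delimiter is an assumption; pass the one your data file actually uses.
    """
    # "utf-8-sig" skips a leading UTF-8 BOM if one is present.
    with open(path, "r", encoding="utf-8-sig", newline="") as f:
        header = f.readline().rstrip("\r\n")

    problems = {}
    for name in header.split(delimiter):
        bad = non_printable_ascii(name)
        if bad:
            problems[name] = bad
    return problems
```

An empty result means every field name in the first record uses only ASCII Printable Characters.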

Database Definition Settings:

Please note that only ASCII Printable Characters* (see note below) are supported for use in any setting in a Database Definition. This includes, for example, Field Names, Variable Link Names, Variable Link Parameters, Price Styles, DesignMerge Pro Rules Names, DesignMerge Pro Rules Criteria Values, and DesignMerge Catalog Search Criteria.

DesignMerge Catalog Search Key Values:

Only ASCII Printable Characters* (see note below) are supported for use as a Search Key Value for DesignMerge Catalog Links (Placeholders) in a document.

Variable Link Import Filters:

When setting up a Variable Link for a UTF-8 database, do not set up the Link to use a Text Import Filter. Applying an Import Filter to UTF-8 data will yield unpredictable results. Instead, when using a UTF-8 database, always select “None” for a Variable Link’s “Import Filter” setting.

TransTables:

The TransTable feature has not yet been extended to convert non-ASCII characters. You may continue to use a TransTable to convert the first 128 characters in a UTF-8 file, that is, the characters with decimal codes 0 through 127, whose UTF-8 encoding matches their ASCII encoding.

 

A Special Note About Glyphs

Adobe InDesign and other applications allow you to assign any of the glyphs that a font provides for a particular character in order to display an alternate representation of that character. Please note that a glyph assignment is a formatting attribute, and formatting attributes are not included in plain ASCII or UTF-8 text files. Therefore, a font’s default glyph for each character will be applied when plain ASCII or UTF-8 text is imported into a document. This also applies to the text that DesignMerge Pro and DesignMerge Catalog import into their Links (Placeholders).

* The ASCII Printable Character Set is the set of characters whose ASCII decimal code is between 32 and 126 (letters, digits, punctuation marks, and a few miscellaneous symbols).