4.2 Encoding glyphs below U+0020 SPACE
Summary of Recommendations
We recommend that fonts not encode characters below U+0020 SPACE in the cmap.
The OS/2 table contains a field known as usFirstCharIndex which, according to the OpenType specification, is:
The minimum Unicode index (character code) in this font, according to the cmap subtable for platform ID 3 and platform-specific encoding ID 0 or 1. For most fonts supporting Win-ANSI or other character sets, this value would be 0x0020. This field cannot represent supplementary character values (codepoints greater than 0xFFFF). Fonts that support supplementary characters should set the value in this field to 0xFFFF if the minimum index value is a supplementary character.
which suggests that the norm is for Unicode cmaps to start encoding at U+0020.
Historically, however, a number of vendors (most commonly Adobe and SIL) have released fonts with characters below U+0020 (space) encoded in the cmap, resulting in the usFirstCharIndex value being less than 0x0020.
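As a rough way to check a font for this condition, the sketch below works on a plain codepoint-to-glyph-name mapping. The function names and the sample mapping are hypothetical. The load_cmap helper assumes the third-party fontTools library is installed and uses its TTFont.getBestCmap() call to obtain such a mapping from a font file; it is not exercised in the example itself.

```python
def control_codepoints(cmap):
    """Return the sorted codepoints below U+0020 that the mapping encodes."""
    return sorted(cp for cp in cmap if cp < 0x20)


def first_char_index(cmap):
    """Lowest encoded codepoint, capped at 0xFFFF as the spec text describes."""
    return min(min(cmap), 0xFFFF)


def load_cmap(path):
    """Read a font's preferred Unicode cmap (assumes fontTools is installed)."""
    from fontTools.ttLib import TTFont  # third-party dependency
    return TTFont(path).getBestCmap()


# Hypothetical mapping, for illustration only (not taken from a real font).
sample = {
    0x0000: ".null",
    0x000D: "nonmarkingreturn",
    0x0020: "space",
    0x0041: "A",
}

offenders = control_codepoints(sample)
print([f"U+{cp:04X}" for cp in offenders])  # codepoints that should not be encoded
print(hex(first_char_index(sample)))        # value usFirstCharIndex would take
```

An empty list from control_codepoints means the mapping follows the recommendation. For a real font, control_codepoints(load_cmap(path)) would perform the same check.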
Why is this a problem?
usFirstCharIndex values lower than 0x0020 can confuse software and cause unexpected results.
One example is Scribus which, as of May 2014, actually draws the glyph for CR (and possibly LF) if they are encoded. If the CR/LF glyphs have width (as CR does in many fonts), they will affect the placement of right-aligned text.
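To illustrate the arithmetic behind this effect, here is a minimal sketch of how a right-aligned line is positioned. The advance widths, glyph names, and the right_aligned_origin function are hypothetical, and this is not a description of how Scribus itself lays out text.

```python
def right_aligned_origin(advances, glyphs, right_edge):
    """Return the x position at which a right-aligned line starts."""
    return right_edge - sum(advances[g] for g in glyphs)


# Hypothetical advance widths in font units (not taken from any real font).
advances = {"A": 600, "B": 620, "CR": 500}

line = ["A", "B"]
line_with_cr = line + ["CR"]  # an application that also lays out the CR glyph

clean = right_aligned_origin(advances, line, 5000)
shifted = right_aligned_origin(advances, line_with_cr, 5000)

# The visible text moves left by exactly the advance width of CR.
print(clean - shifted)
```

If CR is not encoded, or has zero advance width, the two positions are identical and the text sits flush against the right edge.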
The exact procedures needed to make sure your font does not encode characters below U+0020 will depend on the toolchain you use.
For FontLab Studio, follow these steps:
Make sure none of CR (may be called nonmarkingreturn), tab, or anything else before the space glyph have USVs in their glyph properties.
Set FontInfo to not add .null, CR and space (in other words, leave the box unchecked).
Finally, make sure in the TrueType generation options that the first 256 characters are not being re-encoded.
If the project uses Microsoft VOLT, then the VOLT project (.vtp file) may be overriding the encoding of these first few glyphs, for example:
DEF_GLYPH ".notdef" ID 0 TYPE BASE END_GLYPH
DEF_GLYPH ".null" ID 1 UNICODE 0 TYPE BASE END_GLYPH
DEF_GLYPH "nonmarkingreturn" ID 2 UNICODE 13 TYPE BASE END_GLYPH
DEF_GLYPH "space" ID 3 UNICODE 32 TYPE BASE END_GLYPH
DEF_GLYPH "tab" ID 4 UNICODE 9 END_GLYPH
These can be cleaned up within VOLT by using the Glyph Editor, or by exporting the project to a text file, editing it to remove the UNICODE nn reference, and re-importing the project.
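The text-file route could be scripted along the following lines. This is a minimal sketch that assumes the exported file uses the single decimal UNICODE nn form shown in the excerpt above. The function name strip_control_unicodes is hypothetical, and any other way of recording codepoints in a VOLT export is left untouched.

```python
import re

# Matches one decimal "UNICODE nn" field, together with its leading whitespace.
_UNICODE_FIELD = re.compile(r"\s+UNICODE\s+(\d+)\b")


def strip_control_unicodes(vtp_text):
    """Remove UNICODE fields whose value is below 32 (U+0020)."""
    def _drop_if_control(match):
        # Keep the field unchanged unless it encodes a control character.
        return "" if int(match.group(1)) < 0x20 else match.group(0)

    return _UNICODE_FIELD.sub(_drop_if_control, vtp_text)


exported = "\n".join([
    'DEF_GLYPH ".notdef" ID 0 TYPE BASE END_GLYPH',
    'DEF_GLYPH ".null" ID 1 UNICODE 0 TYPE BASE END_GLYPH',
    'DEF_GLYPH "nonmarkingreturn" ID 2 UNICODE 13 TYPE BASE END_GLYPH',
    'DEF_GLYPH "space" ID 3 UNICODE 32 TYPE BASE END_GLYPH',
    'DEF_GLYPH "tab" ID 4 UNICODE 9 END_GLYPH',
])

cleaned = strip_control_unicodes(exported)
print(cleaned)
```

Only the entries for .null, nonmarkingreturn and tab lose their UNICODE field; the space glyph keeps UNICODE 32. It is worth reviewing the edited file by hand before re-importing it into VOLT.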