The default IDOL Server configuration file uses generic transliteration. HPE recommends that you use generic transliteration because it is the most reliable way to support cross-lingual search.
Generic transliteration performs transliteration as described in the following table.
| Language or character type | Transliteration |
|---|---|
| Symbols | All dashes and hyphens to a hyphen character. |
| Latin | Accented characters to non-accented characters |
| Spanish | Accented vowels áéíóúü to non-accented vowels |
| Portuguese | Accented vowels àáâãçéêíòóôõúü to non-accented vowels |
| Greek | Accented Greek characters to non-accented characters |
| Cyrillic (including Serbian extensions) | All characters mapped to A–Z |
| Arabic | Arabic character normalization |
| Japanese | Half width katakana to full width katakana; full width 0–9, A–Z, a–z to single byte 0–9, A–Z, a–z |
| Chinese | Full width 0–9, A–Z, a–z to single byte 0–9, A–Z, a–z |
For all other languages, transliteration does not apply, except for hyphen normalization.
Note: Languages with a sentence-breaking library might be transliterated as part of the sentence-breaking process.
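
The following Python sketch illustrates a few of the mappings in the table (hyphen normalization, accent removal for Latin and Greek characters, and width folding for Japanese and Chinese text). It is not the IDOL Server implementation: the function names and the list of dash characters are assumptions made for illustration, Unicode NFKC normalization stands in for width folding (and folds more characters than the table lists), and the Cyrillic and Arabic rows are not covered.

```python
import unicodedata

# Illustrative subset of dash-like characters (an assumption, not IDOL's list).
DASHES = "\u2010\u2011\u2012\u2013\u2014\u2015\u2212"
DASH_TABLE = {ord(ch): "-" for ch in DASHES}


def fold_widths(text: str) -> str:
    """Fold full width 0-9, A-Z, a-z to single byte characters and
    half width katakana to full width katakana, using NFKC."""
    return unicodedata.normalize("NFKC", text)


def strip_accents(text: str) -> str:
    """Remove accents from Latin and Greek characters only.

    Other scripts are left untouched, so that marks which carry meaning
    (for example the Japanese voiced sound mark) are preserved.
    """
    out = []
    for ch in text:
        decomposed = unicodedata.normalize("NFD", ch)
        base_name = unicodedata.name(decomposed[0], "")
        if base_name.startswith(("LATIN", "GREEK")):
            # Keep the base letter, drop the combining marks.
            out.append("".join(c for c in decomposed
                               if not unicodedata.combining(c)))
        else:
            out.append(ch)
    return "".join(out)


def sketch_transliterate(text: str) -> str:
    """Apply the three illustrative steps in order."""
    text = text.translate(DASH_TABLE)   # dashes and hyphens to a hyphen
    text = fold_widths(text)            # width folding (also composes accents)
    return strip_accents(text)          # accented to non-accented characters
```

For example, `sketch_transliterate("café")` returns `"cafe"`, and full width `ＡＢＣ１２３` becomes `ABC123`.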
When you set GenericTransliteration to True, it applies to all languages, unless you specifically disable transliteration for a language.
You can disable transliteration for an individual language by setting the Transliteration parameter to False in the individual language configuration section. This option completely disables transliteration for that language.
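
The following Python sketch shows how the two settings combine. The configuration fragment embedded in it is hypothetical: the `[LanguageTypes]` and `[japanese]` section names are assumptions about a typical layout, and the helper function is for illustration only. Check your own configuration file for the actual section names.

```python
import configparser

# Hypothetical configuration fragment. The section names are assumptions;
# the parameter names come from the text above.
CONFIG_TEXT = """
[LanguageTypes]
GenericTransliteration=True

[japanese]
Transliteration=False
"""

config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)


def generic_transliteration_applies(cfg: configparser.ConfigParser,
                                    language: str) -> bool:
    """Return True if generic transliteration applies to the language.

    Generic transliteration applies to every language when
    GenericTransliteration is True, unless that language's own section
    sets Transliteration to False.
    """
    generic = cfg.getboolean("LanguageTypes", "GenericTransliteration",
                             fallback=False)
    if not generic:
        return False
    # A language with no explicit setting inherits the generic behavior.
    return cfg.getboolean(language, "Transliteration", fallback=True)
```

With this fragment, the helper returns `False` for `japanese` and `True` for any language that does not set Transliteration, such as `english`.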