public enum DiacriticalMark extends Enum<DiacriticalMark> implements SimpleDatatype<Character>
A DiacriticalMark represents a specific shape (e.g. '~' or '^') that is added at a specific position (on top, at the bottom, etc.) to a letter. For instance, if you add two dots to the letter 'a' you get 'ä'. Unicode provides combining characters representing the mark itself in addition to the precomposed characters (combinations of a specific character with the mark[s]).
Note that there are combining characters that have no equivalent separate character. Further, the naming of composed characters in the Unicode standard is not always precise enough to determine the corresponding combining character. E.g. UnicodeUtil.LATIN_SMALL_LETTER_DOTLESS_J_WITH_STROKE and UnicodeUtil.LATIN_SMALL_LETTER_O_WITH_STROKE could both be considered a combination with the diacritic "stroke". However, observing the glyphs and studying Unicode indicated that this is wrong.
UnicodeUtil should be considered work in progress, and we rely heavily on your contributions to improve it and to support more diacritics. Characters or diacritics may also get renamed in future versions as our understanding of Unicode grows.
See Also: Normalizer
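The relationship between combining and precomposed characters described above can be tried out with the JDK's `java.text.Normalizer` alone. The following is a minimal sketch and not part of this library: the class `CombiningDemo` and both helper methods are hypothetical names, and using NFC/NFD normalization is an assumption about one possible way to compose and decompose, not a statement of how DiacriticalMark works internally.

```java
import java.text.Normalizer;

public class CombiningDemo {

    /** U+0308 COMBINING DIAERESIS, the combining form of the two dots. */
    public static final char COMBINING_DIAERESIS = '\u0308';

    /**
     * Hypothetical helper: appends the combining mark to the base letter and
     * normalizes the pair to NFC. Returns the precomposed character, or null
     * if the result is not a single character.
     */
    public static Character composeWithNormalizer(char base, char combiningMark) {
        String sequence = new String(new char[] {base, combiningMark});
        String composed = Normalizer.normalize(sequence, Normalizer.Form.NFC);
        return (composed.length() == 1) ? Character.valueOf(composed.charAt(0)) : null;
    }

    /** Hypothetical helper: splits a precomposed character into base letter plus mark(s) via NFD. */
    public static String decomposeWithNormalizer(char composed) {
        return Normalizer.normalize(String.valueOf(composed), Normalizer.Form.NFD);
    }

    public static void main(String[] args) {
        // 'a' followed by the combining diaeresis composes to U+00E4.
        Character composed = composeWithNormalizer('a', COMBINING_DIAERESIS);
        System.out.println(String.format("U+%04X", (int) composed.charValue()));

        // There is no precomposed 'q' with diaeresis, so the helper yields null.
        System.out.println(composeWithNormalizer('q', COMBINING_DIAERESIS));

        // Decomposing U+00E4 gives two chars: the base letter and the mark.
        System.out.println(decomposeWithNormalizer('\u00E4').length());
    }
}
```

Returning `null` for a missing composition mirrors the documented contract of compose(char) on this page; the normalizer itself would simply leave the two-character sequence unchanged.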
Enum Constant and Description |
---|
ACUTE
A mark that can be placed on top of some Latin, Cyrillic or Greek characters.
|
BREVE
A mark that can be placed on top of some Latin, ...
|
CARON
A mark that can be placed on top of some Latin, ...
|
CEDILLA
A mark that can be placed at the bottom of some Latin characters.
|
CIRCUMFLEX
A mark that can be placed on top of some Latin characters (e.g.
|
DIAERESIS
Two small dots placed on top of Latin vowels, called trema, diaeresis, or umlaut.
|
DOT_ABOVE
A small dot placed on top of some Latin letters, used in some European languages and Vietnamese.
|
DOT_BELOW
A small dot placed at the bottom of some Latin letters, used in some European languages and Vietnamese.
|
DOUBLE_ACUTE
Like ACUTE but doubled.
|
DOUBLE_GRAVE
Like GRAVE but doubled.
|
GRAVE
A mark that can be placed on top of some Latin, Cyrillic or Greek characters.
|
HOOK_ABOVE
A little question mark without the dot, that is placed on top of Vietnamese letters.
|
HORN_ABOVE
A mark similar to a comma (,) that is placed at the top right of Vietnamese vowels.
|
MACRON
A long horizontal stroke placed on top of letters.
|
OGONEK
A little hook placed at the bottom right of Latin vowels.
|
RING_ABOVE
A small ring placed on top of some Latin letters.
|
SHORT_SOLIDUS_OVERLAY
A short solidus is a diagonal line through a grapheme (letter).
|
SHORT_STROKE_OVERLAY
A short stroke is a short line drawn horizontally through a grapheme (letter).
|
TILDE
A small tilde (~) placed on top of some letters.
|
Modifier and Type | Field and Description |
---|---|
private char | combiningCharacter |
private Collection<Character> | composedCharacters |
private Map<Character,Character> | composeMap |
private Map<Character,Character> | decomposeMap |
private char | separateCharacter |
private String | title |
Modifier and Type | Method and Description |
---|---|
protected void | addComposition(char uncomposed, char composed) This method adds the given composition pair. |
Character | compose(char character) This method composes the given character with this DiacriticalMark. |
Character | decompose(char character) This method de-composes the given character with this DiacriticalMark. |
char | getCombiningCharacter() This method gets the combining character for this DiacriticalMark. |
Collection<Character> | getComposedCharacters() This method gets a Collection with all precomposed characters containing this mark. |
char | getSeparateCharacter() This method gets the separate character for this DiacriticalMark. |
Character | getValue() This method returns the raw value of this datatype. |
protected abstract void | initialize() This method is called at construction. |
String | toString() |
static DiacriticalMark | valueOf(String name) Returns the enum constant of this type with the specified name. |
static DiacriticalMark[] | values() Returns an array containing the constants of this enum type, in the order they are declared. |
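To illustrate how the methods summarized above fit together, here is a minimal sketch of a mark type that registers composition pairs at construction and serves compose and decompose from two maps. Everything in it is hypothetical: `ComposeSketch` and `MiniMark` are illustrative names, only two marks with a few pairs are registered, and the real DiacriticalMark may be implemented differently. Only the method names and the null-returning contracts are taken from this page.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class ComposeSketch {

    /** Hypothetical stand-in; NOT the real DiacriticalMark. */
    public enum MiniMark {

        DIAERESIS('\u00A8', '\u0308') {
            @Override
            protected void initialize() {
                addComposition('a', '\u00E4'); // a with diaeresis
                addComposition('o', '\u00F6'); // o with diaeresis
                addComposition('u', '\u00FC'); // u with diaeresis
            }
        },

        ACUTE('\u00B4', '\u0301') {
            @Override
            protected void initialize() {
                addComposition('a', '\u00E1'); // a with acute
                addComposition('e', '\u00E9'); // e with acute
            }
        };

        private final char separateCharacter;
        private final char combiningCharacter;
        private final Map<Character, Character> composeMap = new HashMap<>();
        private final Map<Character, Character> decomposeMap = new HashMap<>();

        MiniMark(char separateCharacter, char combiningCharacter) {
            this.separateCharacter = separateCharacter;
            this.combiningCharacter = combiningCharacter;
            initialize(); // the maps are already created by the field initializers
        }

        /** Called at construction; each constant registers its own pairs here. */
        protected abstract void initialize();

        /** Registers the pair in both directions so decompose is the inverse of compose. */
        protected void addComposition(char uncomposed, char composed) {
            this.composeMap.put(uncomposed, composed);
            this.decomposeMap.put(composed, uncomposed);
        }

        /** Returns the composed character, or null if no pair is registered. */
        public Character compose(char character) {
            return this.composeMap.get(character);
        }

        /** Returns the base character, or null if the input is not composed with this mark. */
        public Character decompose(char character) {
            return this.decomposeMap.get(character);
        }

        public Collection<Character> getComposedCharacters() {
            return Collections.unmodifiableSet(this.decomposeMap.keySet());
        }

        public char getSeparateCharacter() {
            return this.separateCharacter;
        }

        public char getCombiningCharacter() {
            return this.combiningCharacter;
        }
    }

    public static void main(String[] args) {
        System.out.println(MiniMark.DIAERESIS.compose('a'));      // registered pair
        System.out.println(MiniMark.DIAERESIS.compose('x'));      // null: no such pair
        System.out.println(MiniMark.ACUTE.decompose('\u00E9'));   // back to the base letter
    }
}
```

Keeping a second map for the reverse direction trades a little memory for constant-time decomposition; deriving it on demand from the compose map would work as well.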
public static final DiacriticalMark ACUTE
public static final DiacriticalMark BREVE
public static final DiacriticalMark CARON
public static final DiacriticalMark CEDILLA
public static final DiacriticalMark CIRCUMFLEX
public static final DiacriticalMark DIAERESIS
public static final DiacriticalMark DOT_ABOVE
public static final DiacriticalMark DOT_BELOW
public static final DiacriticalMark DOUBLE_ACUTE
Like ACUTE but doubled. If your environment supports unicode, you can see it here: ˝
public static final DiacriticalMark DOUBLE_GRAVE
Like GRAVE but doubled. If your environment supports unicode, you can see it here: ̏ .
public static final DiacriticalMark GRAVE
public static final DiacriticalMark HOOK_ABOVE
public static final DiacriticalMark HORN_ABOVE
public static final DiacriticalMark MACRON
public static final DiacriticalMark OGONEK
public static final DiacriticalMark RING_ABOVE
public static final DiacriticalMark SHORT_SOLIDUS_OVERLAY
public static final DiacriticalMark SHORT_STROKE_OVERLAY
public static final DiacriticalMark TILDE
private final char separateCharacter
private final char combiningCharacter
private final String title
private final Collection<Character> composedCharacters
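The separateCharacter and combiningCharacter fields listed above hold two different kinds of Unicode characters. The sketch below, which uses only JDK methods, shows how the two kinds can be told apart for the acute accent. The class `MarkCharacterKinds` and its helpers are hypothetical, and the code points U+00B4 and U+0301 are taken from the Unicode standard; that DiacriticalMark.ACUTE uses exactly these values is an assumption.

```java
public class MarkCharacterKinds {

    /** True if the character is a Unicode non-spacing mark (general category Mn). */
    public static boolean isCombining(char c) {
        return Character.getType(c) == Character.NON_SPACING_MARK;
    }

    /** Formats the code point together with its official Unicode name. */
    public static String describe(char c) {
        return String.format("U+%04X %s", (int) c, Character.getName(c));
    }

    public static void main(String[] args) {
        char separateAcute = '\u00B4';  // standalone accent, occupies its own cell
        char combiningAcute = '\u0301'; // combining accent, attaches to a base letter

        System.out.println(describe(separateAcute) + " combining=" + isCombining(separateAcute));
        System.out.println(describe(combiningAcute) + " combining=" + isCombining(combiningAcute));

        // In a char sequence the combining character is written after its base letter.
        String sequence = "e" + combiningAcute;
        System.out.println(sequence.length()); // two chars, rendered as one glyph
    }
}
```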
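public static DiacriticalMark[] values()
Returns an array containing the constants of this enum type, in the order they are declared. This method may be used to iterate over the constants as follows:
for (DiacriticalMark c : DiacriticalMark.values()) System.out.println(c);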
public static DiacriticalMark valueOf(String name)
Returns the enum constant of this type with the specified name.
Parameters:
name - the name of the enum constant to be returned.
Throws:
IllegalArgumentException - if this enum type has no constant with the specified name
NullPointerException - if the argument is null
protected abstract void initialize()
This method is called at construction.
protected void addComposition(char uncomposed, char composed)
This method adds the given composition pair.
Parameters:
uncomposed - is the uncomposed character.
composed - is the composed character.
public char getSeparateCharacter()
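This method gets the separate character for this DiacriticalMark. It represents the mark itself as a standalone character. Note that some marks have no separate character that is equivalent to their combining character. Therefore this method may return a character that looks similar to the diacritic mark, but is NOT the correct representation for it.
See Also: getCombiningCharacter()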
public char getCombiningCharacter()
This method gets the combining character for this DiacriticalMark. Unlike the separate character, this character gets combined with the preceding character into a single glyph.
public Character getValue()
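Description copied from interface: SimpleDatatype
This method returns the raw value of this datatype. The value is typically of a standard type (String, Character, Boolean, any type of Number, any type of java.time.LocalDate, etc.).
Specified by:
getValue in interface AttributeReadValue<Character>
Specified by:
getValue in interface SimpleDatatype<Character>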
public Character compose(char character)
This method composes the given character with this DiacriticalMark.
Parameters:
character - is the character to compose (e.g. 'a').
Returns:
the composed character or null if no such composition exists in unicode.
public Character decompose(char character)
This method de-composes the given character with this DiacriticalMark. In other words, this DiacriticalMark is removed from the given character if it is composed. It is the inverse operation of compose(char).
Parameters:
character - is the character to de-compose (e.g. 'ä' or 'á').
Returns:
the de-composed character or null if the given character is not composed with this DiacriticalMark.
public Collection<Character> getComposedCharacters()
This method gets a Collection with all precomposed characters containing this mark.
public String toString()
Description copied from interface: Datatype
Returns the String representation of this Datatype. While the general contract of Object.toString() is very weak and mainly used for debugging, the contract here is very strong. The returned String has to be suitable for end-users and official output to any kind of sink. Use NlsMessage for this purpose and implement NlsObject if you want to support I18N/L10N.
Copyright © 2001–2016 mmm-Team. All rights reserved.