新規作成  編集  Ruby-GNOME2 Project Website  ページ一覧  検索  更新履歴  編集履歴  RSS  ログイン

GLib::Unicode

Class Methods

GLib::Unicode.canonical_ordering(ucs4)
Computes the canonical ordering of a string. This rearranges decomposed characters in the string according to their combining classes. See the Unicode manual for more information.
  • ucs4: a UCS-4 encoded String
  • Returns: the canonical ordering of ucs4
GLib::Unicode.canonical_decomposition(unichar)
Computes the canonical decomposition of a Unicode character.
  • unichar: a Unicode character as Integer
  • Returns: a string of Unicode characters.

Constants

Type

These are the possible character classifications from the Unicode specification. See <URL:http://www.unicode.org/Public/UNIDATA/UnicodeData.html>.

CONTROL
General category "Other, Control" (Cc)
FORMAT
General category "Other, Format" (Cf)
UNASSIGNED
General category "Other, Not Assigned" (Cn)
PRIVATE_USE
General category "Other, Private Use" (Co)
SURROGATE
General category "Other, Surrogate" (Cs)
LOWERCASE_LETTER
General category "Letter, Lowercase" (Ll)
MODIFIER_LETTER
General category "Letter, Modifier" (Lm)
OTHER_LETTER
General category "Letter, Other" (Lo)
TITLECASE_LETTER
General category "Letter, Titlecase" (Lt)
UPPERCASE_LETTER
General category "Letter, Uppercase" (Lu)
COMBINING_MARK
General category "Mark, Spacing Combining" (Mc)
ENCLOSING_MARK
General category "Mark, Enclosing" (Me)
NON_SPACING_MARK
General category "Mark, Nonspacing" (Mn)
DECIMAL_NUMBER
General category "Number, Decimal Digit" (Nd)
LETTER_NUMBER
General category "Number, Letter" (Nl)
OTHER_NUMBER
General category "Number, Other" (No)
CONNECT_PUNCTUATION
General category "Punctuation, Connector" (Pc)
DASH_PUNCTUATION
General category "Punctuation, Dash" (Pd)
CLOSE_PUNCTUATION
General category "Punctuation, Close" (Pe)
FINAL_PUNCTUATION
General category "Punctuation, Final quote" (Pf)
INITIAL_PUNCTUATION
General category "Punctuation, Initial quote" (Pi)
OTHER_PUNCTUATION
General category "Punctuation, Other" (Po)
OPEN_PUNCTUATION
General category "Punctuation, Open" (Ps)
CURRENCY_SYMBOL
General category "Symbol, Currency" (Sc)
MODIFIER_SYMBOL
General category "Symbol, Modifier" (Sk)
MATH_SYMBOL
General category "Symbol, Math" (Sm)
OTHER_SYMBOL
General category "Symbol, Other" (So)
LINE_SEPARATOR
General category "Separator, Line" (Zl)
PARAGRAPH_SEPARATOR
General category "Separator, Paragraph" (Zp)
SPACE_SEPARATOR
General category "Separator, Space" (Zs)

BreakType

These are the possible line break classifications. The five Hangul types were added in Unicode 4.1, so, has been introduced in GLib 2.10. Note that new types may be added in the future. Applications should be ready to handle unknown values. They may be regarded as GLib::Unicode::BreakType::UNKNOWN?. See <URL:http://www.unicode.org/unicode/reports/tr14/>.

AFTER
ALPHABETIC
AMBIGUOUS
BEFORE
BEFORE_AND_AFTER
CARRIAGE_RETURN
CLOSE_PUNCTUATION
COMBINING_MARK
COMPLEX_CONTEXT
CONTINGENT
EXCLAMATION
HANGUL_LVT_SYLLABLE
HANGUL_LV_SYLLABLE
HANGUL_L_JAMO
HANGUL_T_JAMO
HANGUL_V_JAMO
HYPHEN
IDEOGRAPHIC
INFIX_SEPARATOR
INSEPARABLE
LINE_FEED
MANDATORY
NEXT_LINE
NON_BREAKING_GLUE
NON_STARTER
NUMERIC
OPEN_PUNCTUATION
POSTFIX
PREFIX
QUOTATION
SPACE
SURROGATE
SYMBOL
UNKNOWN
WORD_JOINER
ZERO_WIDTH_SPACE

Script

The GLib::Unicode::Script enumeration identifies different writing systems. The values correspond to the names as defined in the Unicode standard. The enumeration has been added in GLib 2.14. Note that new types may be added in the future. Applications should be ready to handle unknown values. See Unicode Standard Annex 24: Script names. Since 2.14

INVALID_CODE
a value never returned from GLib::UniChar.get_script
COMMON
a character used by multiple different scripts
INHERITED
a mark glyph that takes its script from the base glyph to which it is attached
ARABIC
Arabic
ARMENIAN
Armenian
BENGALI
Bengali
BOPOMOFO
Bopomofo
CHEROKEE
Cherokee
COPTIC
Coptic
CYRILLIC
Cyrillic
DESERET
Deseret
DEVANAGARI
Devanagari
ETHIOPIC
Ethiopic
GEORGIAN
Georgian
GOTHIC
Gothic
GREEK
Greek
GUJARATI
Gujarati
GURMUKHI
Gurmukhi
HAN
Han
HANGUL
Hangul
HEBREW
Hebrew
HIRAGANA
Hiragana
KANNADA
Kannada
KATAKANA
Katakana
KHMER
Khmer
LAO
Lao
LATIN
Latin
MALAYALAM
Malayalam
MONGOLIAN
Mongolian
MYANMAR
Myanmar
OGHAM
Ogham
OLD_ITALIC
Old Italic
ORIYA
Oriya
RUNIC
Runic
SINHALA
Sinhala
SYRIAC
Syriac
TAMIL
Tamil
TELUGU
Telugu
THAANA
Thaana
THAI
Thai
TIBETAN
Tibetan
CANADIAN_ABORIGINAL
Canadian Aboriginal
YI
Yi
TAGALOG
Tagalog
HANUNOO
Hanunoo
BUHID
Buhid
TAGBANWA
Tagbanwa
BRAILLE
Braille
CYPRIOT
Cypriot
LIMBU
Limbu
OSMANYA
Osmanya
SHAVIAN
Shavian
LINEAR_B
Linear B
TAI_LE
Tai Le
UGARITIC
Ugaritic
NEW_TAI_LUE
New Tai Lue
BUGINESE
Buginese
GLAGOLITIC
Glagolitic
TIFINAGH
Tifinagh
SYLOTI_NAGRI
Syloti Nagri
OLD_PERSIAN
Old Persian
KHAROSHTHI
Kharoshthi
UNKNOWN
an unassigned code point
BALINESE
Balinese
CUNEIFORM
Cuneiform
PHOENICIAN
Phoenician
SCRIPT_PHAGS_PA
Phags-pa
NKO
N'Ko

ChangeLog

  • 2006-12-10: merged with GLib::UnicodeBreak? and GLib::UnicodeScript?. - kou
  • 2006-12-10: moved from GLib. - kou
更新日時:2006/12/10 16:38:50
キーワード:
参照:[GLib::Unicode] [Ruby/GLib] [GLib::UniChar]