import "unicode"

This package provides data and functions for examining the properties of Unicode code points.

Package files

casetables.go digit.go letter.go tables.go

Constants

const (
    MaxRune         = 0x10FFFF // Maximum valid Unicode code point.
    ReplacementChar = 0xFFFD   // Represents an invalid code point.
)

Indices into the Delta arrays inside CaseRanges, used for case mapping (upper/lower case conversion).

const (
    UpperCase = iota
    LowerCase
    TitleCase
    MaxCase
)

If the Delta field of a CaseRange is UpperLower, the CaseRange represents a sequence of consecutive characters of the form Upper, Lower, Upper, Lower, and so on.

const (
    UpperLower = MaxRune + 1 // (Cannot be a valid delta.)
)

Version is the Unicode version.

const Version = "5.2.0"

Variables

var (
    Cc     = _Cc    // Cc is the set of Unicode characters in category Cc.
    Cf     = _Cf    // Cf is the set of Unicode characters in category Cf.
    Co     = _Co    // Co is the set of Unicode characters in category Co.
    Cs     = _Cs    // Cs is the set of Unicode characters in category Cs.
    Digit  = _Nd    // Digit is the set of Unicode characters with the "decimal digit" property.
    Nd     = _Nd    // Nd is the set of Unicode characters in category Nd.
    Letter = letter // Letter is the set of Unicode letters.
    Lm     = _Lm    // Lm is the set of Unicode characters in category Lm.
    Lo     = _Lo    // Lo is the set of Unicode characters in category Lo.
    Lower  = _Ll    // Lower is the set of Unicode lower case letters.
    Ll     = _Ll    // Ll is the set of Unicode characters in category Ll.
    Mc     = _Mc    // Mc is the set of Unicode characters in category Mc.
    Me     = _Me    // Me is the set of Unicode characters in category Me.
    Mn     = _Mn    // Mn is the set of Unicode characters in category Mn.
    Nl     = _Nl    // Nl is the set of Unicode characters in category Nl.
    No     = _No    // No is the set of Unicode characters in category No.
    Pc     = _Pc    // Pc is the set of Unicode characters in category Pc.
    Pd     = _Pd    // Pd is the set of Unicode characters in category Pd.
    Pe     = _Pe    // Pe is the set of Unicode characters in category Pe.
    Pf     = _Pf    // Pf is the set of Unicode characters in category Pf.
    Pi     = _Pi    // Pi is the set of Unicode characters in category Pi.
    Po     = _Po    // Po is the set of Unicode characters in category Po.
    Ps     = _Ps    // Ps is the set of Unicode characters in category Ps.
    Sc     = _Sc    // Sc is the set of Unicode characters in category Sc.
    Sk     = _Sk    // Sk is the set of Unicode characters in category Sk.
    Sm     = _Sm    // Sm is the set of Unicode characters in category Sm.
    So     = _So    // So is the set of Unicode characters in category So.
    Title  = _Lt    // Title is the set of Unicode title case letters.
    Lt     = _Lt    // Lt is the set of Unicode characters in category Lt.
    Upper  = _Lu    // Upper is the set of Unicode upper case letters.
    Lu     = _Lu    // Lu is the set of Unicode characters in category Lu.
    Zl     = _Zl    // Zl is the set of Unicode characters in category Zl.
    Zp     = _Zp    // Zp is the set of Unicode characters in category Zp.
    Zs     = _Zs    // Zs is the set of Unicode characters in category Zs.
)
var (
    Arabic                 = _Arabic                 // Arabic is the set of Unicode characters in script Arabic.
    Armenian               = _Armenian               // Armenian is the set of Unicode characters in script Armenian.
    Avestan                = _Avestan                // Avestan is the set of Unicode characters in script Avestan.
    Balinese               = _Balinese               // Balinese is the set of Unicode characters in script Balinese.
    Bamum                  = _Bamum                  // Bamum is the set of Unicode characters in script Bamum.
    Bengali                = _Bengali                // Bengali is the set of Unicode characters in script Bengali.
    Bopomofo               = _Bopomofo               // Bopomofo is the set of Unicode characters in script Bopomofo.
    Braille                = _Braille                // Braille is the set of Unicode characters in script Braille.
    Buginese               = _Buginese               // Buginese is the set of Unicode characters in script Buginese.
    Buhid                  = _Buhid                  // Buhid is the set of Unicode characters in script Buhid.
    Canadian_Aboriginal    = _Canadian_Aboriginal    // Canadian_Aboriginal is the set of Unicode characters in script Canadian_Aboriginal.
    Carian                 = _Carian                 // Carian is the set of Unicode characters in script Carian.
    Cham                   = _Cham                   // Cham is the set of Unicode characters in script Cham.
    Cherokee               = _Cherokee               // Cherokee is the set of Unicode characters in script Cherokee.
    Common                 = _Common                 // Common is the set of Unicode characters in script Common.
    Coptic                 = _Coptic                 // Coptic is the set of Unicode characters in script Coptic.
    Cuneiform              = _Cuneiform              // Cuneiform is the set of Unicode characters in script Cuneiform.
    Cypriot                = _Cypriot                // Cypriot is the set of Unicode characters in script Cypriot.
    Cyrillic               = _Cyrillic               // Cyrillic is the set of Unicode characters in script Cyrillic.
    Deseret                = _Deseret                // Deseret is the set of Unicode characters in script Deseret.
    Devanagari             = _Devanagari             // Devanagari is the set of Unicode characters in script Devanagari.
    Egyptian_Hieroglyphs   = _Egyptian_Hieroglyphs   // Egyptian_Hieroglyphs is the set of Unicode characters in script Egyptian_Hieroglyphs.
    Ethiopic               = _Ethiopic               // Ethiopic is the set of Unicode characters in script Ethiopic.
    Georgian               = _Georgian               // Georgian is the set of Unicode characters in script Georgian.
    Glagolitic             = _Glagolitic             // Glagolitic is the set of Unicode characters in script Glagolitic.
    Gothic                 = _Gothic                 // Gothic is the set of Unicode characters in script Gothic.
    Greek                  = _Greek                  // Greek is the set of Unicode characters in script Greek.
    Gujarati               = _Gujarati               // Gujarati is the set of Unicode characters in script Gujarati.
    Gurmukhi               = _Gurmukhi               // Gurmukhi is the set of Unicode characters in script Gurmukhi.
    Han                    = _Han                    // Han is the set of Unicode characters in script Han.
    Hangul                 = _Hangul                 // Hangul is the set of Unicode characters in script Hangul.
    Hanunoo                = _Hanunoo                // Hanunoo is the set of Unicode characters in script Hanunoo.
    Hebrew                 = _Hebrew                 // Hebrew is the set of Unicode characters in script Hebrew.
    Hiragana               = _Hiragana               // Hiragana is the set of Unicode characters in script Hiragana.
    Imperial_Aramaic       = _Imperial_Aramaic       // Imperial_Aramaic is the set of Unicode characters in script Imperial_Aramaic.
    Inherited              = _Inherited              // Inherited is the set of Unicode characters in script Inherited.
    Inscriptional_Pahlavi  = _Inscriptional_Pahlavi  // Inscriptional_Pahlavi is the set of Unicode characters in script Inscriptional_Pahlavi.
    Inscriptional_Parthian = _Inscriptional_Parthian // Inscriptional_Parthian is the set of Unicode characters in script Inscriptional_Parthian.
    Javanese               = _Javanese               // Javanese is the set of Unicode characters in script Javanese.
    Kaithi                 = _Kaithi                 // Kaithi is the set of Unicode characters in script Kaithi.
    Kannada                = _Kannada                // Kannada is the set of Unicode characters in script Kannada.
    Katakana               = _Katakana               // Katakana is the set of Unicode characters in script Katakana.
    Kayah_Li               = _Kayah_Li               // Kayah_Li is the set of Unicode characters in script Kayah_Li.
    Kharoshthi             = _Kharoshthi             // Kharoshthi is the set of Unicode characters in script Kharoshthi.
    Khmer                  = _Khmer                  // Khmer is the set of Unicode characters in script Khmer.
    Lao                    = _Lao                    // Lao is the set of Unicode characters in script Lao.
    Latin                  = _Latin                  // Latin is the set of Unicode characters in script Latin.
    Lepcha                 = _Lepcha                 // Lepcha is the set of Unicode characters in script Lepcha.
    Limbu                  = _Limbu                  // Limbu is the set of Unicode characters in script Limbu.
    Linear_B               = _Linear_B               // Linear_B is the set of Unicode characters in script Linear_B.
    Lisu                   = _Lisu                   // Lisu is the set of Unicode characters in script Lisu.
    Lycian                 = _Lycian                 // Lycian is the set of Unicode characters in script Lycian.
    Lydian                 = _Lydian                 // Lydian is the set of Unicode characters in script Lydian.
    Malayalam              = _Malayalam              // Malayalam is the set of Unicode characters in script Malayalam.
    Meetei_Mayek           = _Meetei_Mayek           // Meetei_Mayek is the set of Unicode characters in script Meetei_Mayek.
    Mongolian              = _Mongolian              // Mongolian is the set of Unicode characters in script Mongolian.
    Myanmar                = _Myanmar                // Myanmar is the set of Unicode characters in script Myanmar.
    New_Tai_Lue            = _New_Tai_Lue            // New_Tai_Lue is the set of Unicode characters in script New_Tai_Lue.
    Nko                    = _Nko                    // Nko is the set of Unicode characters in script Nko.
    Ogham                  = _Ogham                  // Ogham is the set of Unicode characters in script Ogham.
    Ol_Chiki               = _Ol_Chiki               // Ol_Chiki is the set of Unicode characters in script Ol_Chiki.
    Old_Italic             = _Old_Italic             // Old_Italic is the set of Unicode characters in script Old_Italic.
    Old_Persian            = _Old_Persian            // Old_Persian is the set of Unicode characters in script Old_Persian.
    Old_South_Arabian      = _Old_South_Arabian      // Old_South_Arabian is the set of Unicode characters in script Old_South_Arabian.
    Old_Turkic             = _Old_Turkic             // Old_Turkic is the set of Unicode characters in script Old_Turkic.
    Oriya                  = _Oriya                  // Oriya is the set of Unicode characters in script Oriya.
    Osmanya                = _Osmanya                // Osmanya is the set of Unicode characters in script Osmanya.
    Phags_Pa               = _Phags_Pa               // Phags_Pa is the set of Unicode characters in script Phags_Pa.
    Phoenician             = _Phoenician             // Phoenician is the set of Unicode characters in script Phoenician.
    Rejang                 = _Rejang                 // Rejang is the set of Unicode characters in script Rejang.
    Runic                  = _Runic                  // Runic is the set of Unicode characters in script Runic.
    Samaritan              = _Samaritan              // Samaritan is the set of Unicode characters in script Samaritan.
    Saurashtra             = _Saurashtra             // Saurashtra is the set of Unicode characters in script Saurashtra.
    Shavian                = _Shavian                // Shavian is the set of Unicode characters in script Shavian.
    Sinhala                = _Sinhala                // Sinhala is the set of Unicode characters in script Sinhala.
    Sundanese              = _Sundanese              // Sundanese is the set of Unicode characters in script Sundanese.
    Syloti_Nagri           = _Syloti_Nagri           // Syloti_Nagri is the set of Unicode characters in script Syloti_Nagri.
    Syriac                 = _Syriac                 // Syriac is the set of Unicode characters in script Syriac.
    Tagalog                = _Tagalog                // Tagalog is the set of Unicode characters in script Tagalog.
    Tagbanwa               = _Tagbanwa               // Tagbanwa is the set of Unicode characters in script Tagbanwa.
    Tai_Le                 = _Tai_Le                 // Tai_Le is the set of Unicode characters in script Tai_Le.
    Tai_Tham               = _Tai_Tham               // Tai_Tham is the set of Unicode characters in script Tai_Tham.
    Tai_Viet               = _Tai_Viet               // Tai_Viet is the set of Unicode characters in script Tai_Viet.
    Tamil                  = _Tamil                  // Tamil is the set of Unicode characters in script Tamil.
    Telugu                 = _Telugu                 // Telugu is the set of Unicode characters in script Telugu.
    Thaana                 = _Thaana                 // Thaana is the set of Unicode characters in script Thaana.
    Thai                   = _Thai                   // Thai is the set of Unicode characters in script Thai.
    Tibetan                = _Tibetan                // Tibetan is the set of Unicode characters in script Tibetan.
    Tifinagh               = _Tifinagh               // Tifinagh is the set of Unicode characters in script Tifinagh.
    Ugaritic               = _Ugaritic               // Ugaritic is the set of Unicode characters in script Ugaritic.
    Vai                    = _Vai                    // Vai is the set of Unicode characters in script Vai.
    Yi                     = _Yi                     // Yi is the set of Unicode characters in script Yi.
)
var (
    ASCII_Hex_Digit                    = _ASCII_Hex_Digit                    // ASCII_Hex_Digit is the set of Unicode characters with property ASCII_Hex_Digit.
    Bidi_Control                       = _Bidi_Control                       // Bidi_Control is the set of Unicode characters with property Bidi_Control.
    Dash                               = _Dash                               // Dash is the set of Unicode characters with property Dash.
    Deprecated                         = _Deprecated                         // Deprecated is the set of Unicode characters with property Deprecated.
    Diacritic                          = _Diacritic                          // Diacritic is the set of Unicode characters with property Diacritic.
    Extender                           = _Extender                           // Extender is the set of Unicode characters with property Extender.
    Hex_Digit                          = _Hex_Digit                          // Hex_Digit is the set of Unicode characters with property Hex_Digit.
    Hyphen                             = _Hyphen                             // Hyphen is the set of Unicode characters with property Hyphen.
    IDS_Binary_Operator                = _IDS_Binary_Operator                // IDS_Binary_Operator is the set of Unicode characters with property IDS_Binary_Operator.
    IDS_Trinary_Operator               = _IDS_Trinary_Operator               // IDS_Trinary_Operator is the set of Unicode characters with property IDS_Trinary_Operator.
    Ideographic                        = _Ideographic                        // Ideographic is the set of Unicode characters with property Ideographic.
    Join_Control                       = _Join_Control                       // Join_Control is the set of Unicode characters with property Join_Control.
    Logical_Order_Exception            = _Logical_Order_Exception            // Logical_Order_Exception is the set of Unicode characters with property Logical_Order_Exception.
    Noncharacter_Code_Point            = _Noncharacter_Code_Point            // Noncharacter_Code_Point is the set of Unicode characters with property Noncharacter_Code_Point.
    Other_Alphabetic                   = _Other_Alphabetic                   // Other_Alphabetic is the set of Unicode characters with property Other_Alphabetic.
    Other_Default_Ignorable_Code_Point = _Other_Default_Ignorable_Code_Point // Other_Default_Ignorable_Code_Point is the set of Unicode characters with property Other_Default_Ignorable_Code_Point.
    Other_Grapheme_Extend              = _Other_Grapheme_Extend              // Other_Grapheme_Extend is the set of Unicode characters with property Other_Grapheme_Extend.
    Other_ID_Continue                  = _Other_ID_Continue                  // Other_ID_Continue is the set of Unicode characters with property Other_ID_Continue.
    Other_ID_Start                     = _Other_ID_Start                     // Other_ID_Start is the set of Unicode characters with property Other_ID_Start.
    Other_Lowercase                    = _Other_Lowercase                    // Other_Lowercase is the set of Unicode characters with property Other_Lowercase.
    Other_Math                         = _Other_Math                         // Other_Math is the set of Unicode characters with property Other_Math.
    Other_Uppercase                    = _Other_Uppercase                    // Other_Uppercase is the set of Unicode characters with property Other_Uppercase.
    Pattern_Syntax                     = _Pattern_Syntax                     // Pattern_Syntax is the set of Unicode characters with property Pattern_Syntax.
    Pattern_White_Space                = _Pattern_White_Space                // Pattern_White_Space is the set of Unicode characters with property Pattern_White_Space.
    Quotation_Mark                     = _Quotation_Mark                     // Quotation_Mark is the set of Unicode characters with property Quotation_Mark.
    Radical                            = _Radical                            // Radical is the set of Unicode characters with property Radical.
    STerm                              = _STerm                              // STerm is the set of Unicode characters with property STerm.
    Soft_Dotted                        = _Soft_Dotted                        // Soft_Dotted is the set of Unicode characters with property Soft_Dotted.
    Terminal_Punctuation               = _Terminal_Punctuation               // Terminal_Punctuation is the set of Unicode characters with property Terminal_Punctuation.
    Unified_Ideograph                  = _Unified_Ideograph                  // Unified_Ideograph is the set of Unicode characters with property Unified_Ideograph.
    Variation_Selector                 = _Variation_Selector                 // Variation_Selector is the set of Unicode characters with property Variation_Selector.
    White_Space                        = _White_Space                        // White_Space is the set of Unicode characters with property White_Space.
)
var AzeriCase = _TurkishCase

CaseRanges is the table describing case mappings. Mappings of a code point to itself are not included.

var CaseRanges = _CaseRanges

Categories is the set of Unicode data tables.

var Categories = map[string][]Range{
    "Lm":     Lm,
    "Ll":     Ll,
    "Me":     Me,
    "Mc":     Mc,
    "Mn":     Mn,
    "Zl":     Zl,
    "letter": letter,
    "Zp":     Zp,
    "Zs":     Zs,
    "Cs":     Cs,
    "Co":     Co,
    "Cf":     Cf,
    "Cc":     Cc,
    "Po":     Po,
    "Pi":     Pi,
    "Pf":     Pf,
    "Pe":     Pe,
    "Pd":     Pd,
    "Pc":     Pc,
    "Ps":     Ps,
    "Nd":     Nd,
    "Nl":     Nl,
    "No":     No,
    "So":     So,
    "Sm":     Sm,
    "Sk":     Sk,
    "Sc":     Sc,
    "Lu":     Lu,
    "Lt":     Lt,
    "Lo":     Lo,
}
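
Looking a table up by name might be sketched as follows. This is a minimal sketch written for a current Go toolchain, where the parameter type is rune and the map values are *unicode.RangeTable rather than the []Range shown in this listing; the helper name inCategory is hypothetical.

```go
package main

import (
	"fmt"
	"unicode"
)

// inCategory reports whether r belongs to the general category called name.
// A name that is not a key of unicode.Categories yields false.
func inCategory(name string, r rune) bool {
	table, ok := unicode.Categories[name]
	if !ok {
		return false
	}
	return unicode.Is(table, r)
}

func main() {
	fmt.Println(inCategory("Lu", 'A'))
	fmt.Println(inCategory("Lu", 'a'))
	fmt.Println(inCategory("Nd", '7'))
	fmt.Println(inCategory("Xx", 'a')) // "Xx" is a made-up category name
}
```

Checking the ok result keeps an unknown name from being passed to unicode.Is as a missing table.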

Properties is the set of Unicode property tables.

var Properties = map[string][]Range{
    "Pattern_Syntax":                     Pattern_Syntax,
    "Other_ID_Start":                     Other_ID_Start,
    "Pattern_White_Space":                Pattern_White_Space,
    "Other_Lowercase":                    Other_Lowercase,
    "Soft_Dotted":                        Soft_Dotted,
    "Hex_Digit":                          Hex_Digit,
    "ASCII_Hex_Digit":                    ASCII_Hex_Digit,
    "Deprecated":                         Deprecated,
    "Terminal_Punctuation":               Terminal_Punctuation,
    "Quotation_Mark":                     Quotation_Mark,
    "Other_ID_Continue":                  Other_ID_Continue,
    "Bidi_Control":                       Bidi_Control,
    "Variation_Selector":                 Variation_Selector,
    "Noncharacter_Code_Point":            Noncharacter_Code_Point,
    "Other_Math":                         Other_Math,
    "Unified_Ideograph":                  Unified_Ideograph,
    "Hyphen":                             Hyphen,
    "IDS_Binary_Operator":                IDS_Binary_Operator,
    "Logical_Order_Exception":            Logical_Order_Exception,
    "Radical":                            Radical,
    "Other_Uppercase":                    Other_Uppercase,
    "STerm":                              STerm,
    "Other_Alphabetic":                   Other_Alphabetic,
    "Diacritic":                          Diacritic,
    "Extender":                           Extender,
    "Join_Control":                       Join_Control,
    "Ideographic":                        Ideographic,
    "Dash":                               Dash,
    "IDS_Trinary_Operator":               IDS_Trinary_Operator,
    "Other_Grapheme_Extend":              Other_Grapheme_Extend,
    "Other_Default_Ignorable_Code_Point": Other_Default_Ignorable_Code_Point,
    "White_Space":                        White_Space,
}

Scripts is the set of Unicode script tables.

var Scripts = map[string][]Range{
    "Katakana":               Katakana,
    "Malayalam":              Malayalam,
    "Phags_Pa":               Phags_Pa,
    "Inscriptional_Parthian": Inscriptional_Parthian,
    "Latin":                  Latin,
    "Inscriptional_Pahlavi":  Inscriptional_Pahlavi,
    "Osmanya":                Osmanya,
    "Khmer":                  Khmer,
    "Inherited":              Inherited,
    "Telugu":                 Telugu,
    "Samaritan":              Samaritan,
    "Bopomofo":               Bopomofo,
    "Imperial_Aramaic":       Imperial_Aramaic,
    "Kaithi":                 Kaithi,
    "Old_South_Arabian":      Old_South_Arabian,
    "Kayah_Li":               Kayah_Li,
    "New_Tai_Lue":            New_Tai_Lue,
    "Tai_Le":                 Tai_Le,
    "Kharoshthi":             Kharoshthi,
    "Common":                 Common,
    "Kannada":                Kannada,
    "Old_Turkic":             Old_Turkic,
    "Tamil":                  Tamil,
    "Tagalog":                Tagalog,
    "Arabic":                 Arabic,
    "Tagbanwa":               Tagbanwa,
    "Canadian_Aboriginal":    Canadian_Aboriginal,
    "Tibetan":                Tibetan,
    "Coptic":                 Coptic,
    "Hiragana":               Hiragana,
    "Limbu":                  Limbu,
    "Egyptian_Hieroglyphs":   Egyptian_Hieroglyphs,
    "Avestan":                Avestan,
    "Myanmar":                Myanmar,
    "Armenian":               Armenian,
    "Sinhala":                Sinhala,
    "Bengali":                Bengali,
    "Greek":                  Greek,
    "Cham":                   Cham,
    "Hebrew":                 Hebrew,
    "Meetei_Mayek":           Meetei_Mayek,
    "Saurashtra":             Saurashtra,
    "Hangul":                 Hangul,
    "Runic":                  Runic,
    "Deseret":                Deseret,
    "Lisu":                   Lisu,
    "Sundanese":              Sundanese,
    "Glagolitic":             Glagolitic,
    "Oriya":                  Oriya,
    "Buhid":                  Buhid,
    "Ethiopic":               Ethiopic,
    "Javanese":               Javanese,
    "Syloti_Nagri":           Syloti_Nagri,
    "Vai":                    Vai,
    "Cherokee":               Cherokee,
    "Ogham":                  Ogham,
    "Syriac":                 Syriac,
    "Gurmukhi":               Gurmukhi,
    "Tai_Tham":               Tai_Tham,
    "Ol_Chiki":               Ol_Chiki,
    "Mongolian":              Mongolian,
    "Hanunoo":                Hanunoo,
    "Cypriot":                Cypriot,
    "Buginese":               Buginese,
    "Bamum":                  Bamum,
    "Lepcha":                 Lepcha,
    "Thaana":                 Thaana,
    "Old_Persian":            Old_Persian,
    "Cuneiform":              Cuneiform,
    "Rejang":                 Rejang,
    "Georgian":               Georgian,
    "Shavian":                Shavian,
    "Lycian":                 Lycian,
    "Nko":                    Nko,
    "Yi":                     Yi,
    "Lao":                    Lao,
    "Linear_B":               Linear_B,
    "Old_Italic":             Old_Italic,
    "Tai_Viet":               Tai_Viet,
    "Devanagari":             Devanagari,
    "Lydian":                 Lydian,
    "Tifinagh":               Tifinagh,
    "Ugaritic":               Ugaritic,
    "Thai":                   Thai,
    "Cyrillic":               Cyrillic,
    "Gujarati":               Gujarati,
    "Carian":                 Carian,
    "Phoenician":             Phoenician,
    "Balinese":               Balinese,
    "Braille":                Braille,
    "Han":                    Han,
    "Gothic":                 Gothic,
}
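
Because the script tables are keyed by name, a reverse lookup (from a code point to its script name) can be sketched by scanning the map. This sketch assumes a current Go toolchain, where the map values are *unicode.RangeTable rather than []Range; scriptOf is a hypothetical helper, not part of the package.

```go
package main

import (
	"fmt"
	"unicode"
)

// scriptOf returns the name of the script table that contains r,
// or "" when no table in unicode.Scripts contains it.
// Map iteration order is random, but a code point has a single
// Script value, so at most one table can match.
func scriptOf(r rune) string {
	for name, table := range unicode.Scripts {
		if unicode.Is(table, r) {
			return name
		}
	}
	return ""
}

func main() {
	for _, r := range "Aあカ漢" {
		fmt.Printf("%c: %s\n", r, scriptOf(r))
	}
}
```

The scan visits every table in the worst case, so a caller that classifies many runes may prefer to test a few expected scripts directly with unicode.Is.
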
var TurkishCase = _TurkishCase

func Is

func Is(ranges []Range, rune int) bool

Is reports whether rune is in the specified table of ranges.
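
A typical use of Is is to test runes against one of the exported tables. The sketch below counts the Han characters in a string; it is written for a current Go toolchain (where the second parameter has type rune), and countHan is a hypothetical helper.

```go
package main

import (
	"fmt"
	"unicode"
)

// countHan returns the number of runes in s that belong to the Han script.
func countHan(s string) int {
	n := 0
	for _, r := range s {
		if unicode.Is(unicode.Han, r) {
			n++
		}
	}
	return n
}

func main() {
	// Two Han characters followed by three Hiragana characters.
	fmt.Println(countHan("漢字とかな"))
}
```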

func IsDigit

func IsDigit(rune int) bool

IsDigit reports whether the rune is a decimal digit.
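
IsDigit is not limited to ASCII: it follows the Digit table (category Nd), so decimal digits of other scripts also qualify. A minimal sketch, assuming a current Go toolchain; allDigits is a hypothetical helper, and treating the empty string as false is a choice made for this example.

```go
package main

import (
	"fmt"
	"unicode"
)

// allDigits reports whether s is non-empty and consists only of
// decimal digits (category Nd) from any script.
func allDigits(s string) bool {
	if s == "" {
		return false
	}
	for _, r := range s {
		if !unicode.IsDigit(r) {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(allDigits("2024"))
	fmt.Println(allDigits("\u0663\u0664")) // ARABIC-INDIC DIGIT THREE and FOUR
	fmt.Println(allDigits("12a"))
}
```

When only the ASCII digits 0 through 9 are acceptable, a plain range comparison is the safer check.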

func IsLetter

func IsLetter(rune int) bool

IsLetter reports whether the rune is a letter.

func IsLower

func IsLower(rune int) bool

IsLower reports whether the rune is a lower case letter.

func IsSpace

func IsSpace(rune int) bool

IsSpace reports whether the rune is a white space character.
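
Since IsSpace takes a rune and returns a bool, it can be passed directly as a predicate. The sketch below splits a string on Unicode white space using strings.FieldsFunc; it assumes a current Go toolchain, and splitOnSpace is a hypothetical wrapper.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// splitOnSpace splits s around runs of Unicode white space.
func splitOnSpace(s string) []string {
	return strings.FieldsFunc(s, unicode.IsSpace)
}

func main() {
	// U+3000 is IDEOGRAPHIC SPACE, which is white space but not ASCII.
	fmt.Printf("%q\n", splitOnSpace("a\u3000b\tc"))
}
```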

func IsTitle

func IsTitle(rune int) bool

IsTitle reports whether the rune is a title case letter.

func IsUpper

func IsUpper(rune int) bool

IsUpper reports whether the rune is an upper case letter.
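
The predicates above can be compared side by side with a small classifier. This is only an illustrative sketch for a current Go toolchain; classify and its label strings are hypothetical, and the order of the cases is a choice of this example.

```go
package main

import (
	"fmt"
	"unicode"
)

// classify returns a short label for r based on the package predicates.
// Cased letters are tested first, so "letter" means a letter that is
// neither upper, lower, nor title case.
func classify(r rune) string {
	switch {
	case unicode.IsUpper(r):
		return "upper"
	case unicode.IsLower(r):
		return "lower"
	case unicode.IsTitle(r):
		return "title"
	case unicode.IsLetter(r):
		return "letter"
	case unicode.IsDigit(r):
		return "digit"
	case unicode.IsSpace(r):
		return "space"
	}
	return "other"
}

func main() {
	// U+01C5 is a title case letter; U+8A9E is a Han letter with no case.
	for _, r := range []rune{'A', 'a', '\u01c5', '\u8a9e', '5', ' ', '!'} {
		fmt.Printf("%U %s\n", r, classify(r))
	}
}
```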

func To

func To(_case int, rune int) int

To maps the rune to the specified case: UpperCase, LowerCase, or TitleCase.

func ToLower

func ToLower(rune int) int

ToLower maps the rune to lower case.

func ToTitle

func ToTitle(rune int) int

ToTitle maps the rune to title case.
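
One use of ToTitle is capitalizing the first character of a word. The sketch below assumes a current Go toolchain; capitalize is a hypothetical helper, and it handles only the first rune, leaving the rest of the string untouched.

```go
package main

import (
	"fmt"
	"unicode"
	"unicode/utf8"
)

// capitalize returns s with its first rune mapped to title case.
// An invalid leading byte decodes as utf8.RuneError and is therefore
// emitted as U+FFFD.
func capitalize(s string) string {
	r, size := utf8.DecodeRuneInString(s)
	if size == 0 {
		return s // empty input
	}
	return string(unicode.ToTitle(r)) + s[size:]
}

func main() {
	fmt.Println(capitalize("hello"))
	fmt.Println(capitalize("\u00e9mile"))
}
```

For most letters ToTitle and ToUpper agree; they differ for the few characters, such as U+01C6, that have a separate title case form.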

func ToUpper

func ToUpper(rune int) int

ToUpper maps the rune to upper case.

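type CaseRange

CaseRange represents a range of Unicode code points for simple (one code point to one code point) case conversion. The range runs from Lo to Hi inclusive, with a fixed stride of 1. Adding a Delta to a code point yields the code point of that character in a different case. A Delta may be negative. If it is zero, the character is already in the corresponding case. There is also a special case representing sequences of alternating corresponding upper and lower case pairs. Such a range is marked with the following fixed Delta:

{UpperLower, UpperLower, UpperLower}

The actual value of the constant UpperLower is one that cannot be a valid delta.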

type CaseRange struct {
    Lo    int
    Hi    int
    Delta d
}
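
How a Delta might be applied, including the UpperLower marker, can be sketched with locally defined types that mirror the declarations in this listing. Everything here is an assumption made for illustration: the lower-case names, the linear search, and the three sample table entries are not taken from the package.

```go
package main

import "fmt"

const (
	upperCase = iota
	lowerCase
	titleCase
	maxCase
)

const (
	maxRune    = 0x10FFFF
	upperLower = maxRune + 1 // cannot be a valid delta
)

type delta [maxCase]int32

type caseRange struct {
	Lo, Hi int
	Delta  delta
}

// mapCase maps r to the case c using table; r is returned unchanged
// when no range contains it.
func mapCase(table []caseRange, c int, r int) int {
	for _, cr := range table {
		if r < cr.Lo || r > cr.Hi {
			continue
		}
		d := int(cr.Delta[c])
		if d == upperLower {
			// Alternating sequence starting with an upper case letter:
			// even offsets are upper case, odd offsets are lower case.
			// upperCase and titleCase are even and lowerCase is odd, so
			// the low bit of c selects the wanted member of the pair.
			return cr.Lo + ((r-cr.Lo)&^1 | (c & 1))
		}
		return r + d
	}
	return r
}

func main() {
	table := []caseRange{
		{'A', 'Z', delta{0, 32, 0}},
		{'a', 'z', delta{-32, 0, -32}},
		{0x100, 0x12F, delta{upperLower, upperLower, upperLower}},
	}
	fmt.Printf("%c\n", mapCase(table, lowerCase, 'A'))
	fmt.Printf("%U\n", mapCase(table, upperCase, 0x101))
	fmt.Printf("%U\n", mapCase(table, lowerCase, 0x100))
}
```

The marker lets one table entry cover a whole run of adjacent upper/lower pairs instead of one entry per letter.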

type Range

Range represents a range of Unicode code points. The range runs from Lo to Hi inclusive, in steps of Stride.

type Range struct {
    Lo     int
    Hi     int
    Stride int
}
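
A membership test over such a table has to honor the stride. The sketch below declares its own Range type to match the one in this listing and uses a linear scan for clarity; it assumes the table is sorted, non-overlapping, and has Stride of at least 1. The function name contains and the sample table are hypothetical.

```go
package main

import "fmt"

// Range mirrors the struct documented above.
type Range struct {
	Lo, Hi, Stride int
}

// contains reports whether r is one of the code points described by ranges.
// It assumes ranges is sorted by Lo, non-overlapping, with Stride >= 1.
func contains(ranges []Range, r int) bool {
	for _, rg := range ranges {
		if r < rg.Lo {
			return false // sorted table: no later range can match
		}
		if r <= rg.Hi {
			return (r-rg.Lo)%rg.Stride == 0
		}
	}
	return false
}

func main() {
	table := []Range{
		{'0', '9', 1},
		{0x100, 0x12E, 2}, // every second code point: 0x100, 0x102, ...
	}
	fmt.Println(contains(table, '7'))
	fmt.Println(contains(table, 0x102))
	fmt.Println(contains(table, 0x101))
}
```

A stride larger than 1 lets a single entry describe sets such as "every other code point", which are common where upper and lower case letters alternate.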

type SpecialCase

SpecialCase represents language-specific case mappings, such as those of Turkish. The methods of SpecialCase customize the standard mappings by overriding them.

type SpecialCase []CaseRange

func (SpecialCase) ToLower

func (special SpecialCase) ToLower(rune int) int

ToLower maps the rune to lower case, giving priority to the special mapping.

func (SpecialCase) ToTitle

func (special SpecialCase) ToTitle(rune int) int

ToTitle maps the rune to title case, giving priority to the special mapping.

func (SpecialCase) ToUpper

func (special SpecialCase) ToUpper(rune int) int

ToUpper maps the rune to upper case, giving priority to the special mapping.

Bugs

A mechanism that supports full case conversion is needed, including mappings that involve multiple runes in the input or output.