Wikisource:RE-Werkstatt/Alle Zeichen
The index of all characters in Paulys Realencyclopädie der classischen Altertumswissenschaft (RE), plain text without formatting, encoded as UTF-8 (Unicode), is to be read as follows:
- The source is a text file (37 MB as .zip) from the download in the RE-Werkstatt (June 2023), together with symbl.cc
- 84'346'547 multibyte characters, roughly 100 MB in total
- Sorted by alphabet (block), with a link to the code chart and a footnote as explanation
- Glyphs are the concrete rendering of a character.
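A count like the one described in the list above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the script that produced the figures on this page: the function names `count_characters` and `count_file` are hypothetical, and the file path is left to the caller. Whitespace (TAB, NL, SPACE) is tallied separately from the other characters, mirroring how the page reports it.

```python
from collections import Counter

# TAB, NL and SPACE are tallied separately from all other characters.
WHITESPACE = {"\t", "\n", " "}

def count_characters(text):
    """Return (glyph_counts, whitespace_counts) as Counters keyed by character."""
    glyphs = Counter()
    spaces = Counter()
    for ch in text:
        if ch in WHITESPACE:
            spaces[ch] += 1
        else:
            glyphs[ch] += 1
    return glyphs, spaces

def count_file(path):
    """Read a UTF-8 text file (hypothetical path) and count its characters."""
    with open(path, encoding="utf-8") as handle:
        return count_characters(handle.read())

# Small in-memory sample instead of the real 100 MB file.
sample = "RE: \u03b1, \u03b2\n"
glyphs, spaces = count_characters(sample)
print(sum(glyphs.values()), "glyphs,", sum(spaces.values()), "whitespace characters")
```

Iterating over a Python `str` yields code points rather than bytes, so a multibyte UTF-8 character such as α is counted once.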
1. All characters in the RE
Alphabet (Block) | All characters in the RE (frequency / count) |
---|---|
ASCII character set [1] | TAB NL SPACE (not counted) ! " # $ % & ' ( ) * + , - . / |
Latin-1 Supplement [2] | £ ¤ § ¨ ª « ¬ ¯ ° ± ² ³ ´ µ · ¹ º » ¼ ½ ¾ À Á Â Ã Ä Å Æ Ç É Ê Í Î Ï Ð Ó Ô Ö × Ú Û Ü ß |
Latin Extended-A [3] | Ā ā Ă ă ą Ć ć Ċ Č č Đ đ Ē ē ĕ ę ě Ğ ğ Ġ ġ Ģ ģ Ī ī ĭ į ı Ĵ ĵ Ķ ķ ł ń ņ ň ŋ Ō ō ŏ ő ŕ Ŗ ř Ś ś Ŝ ŝ ş Š š ũ Ū ū ŭ ŷ Ź ż Ž ž (8734 / 60) |
Latin Extended-B [4] | Ɔ Ǝ ƿ ǎ ǐ Ǒ ǒ ǔ Ǘ Ǧ ǧ ǰ Ǵ ǵ ȓ Ș ș ț ȥ |
IPA Extensions [5] | ə ʒ ʠ (5 / 3) |
Spacing Modifier Letters [6] | ʰ ʹ ʻ ʼ ʾ ʿ |
Combining Diacritical Marks [7] | ́ ̄ ̅ ̆ ̌ ̓ ̔ ̣ ̥ ̮ ̯ ̽ ͂ ͗ ͡ (135 / 15) |
Greek and Coptic [8] | ʹ ͵ ΄ Ά Έ Ή Ί Ό Ύ Ώ ΐ Α Β Γ Δ Ε Ζ Η Θ Ι Κ Λ Μ Ν Ξ Ο Π Ρ Σ Τ Υ Φ Χ Ψ Ω Ϊ ά έ ή ί ΰ α β γ δ ε ζ η θ ι κ λ μ ν ξ ο π ρ ς σ τ υ φ χ ψ ω ϊ ϋ ό ύ ώ ϑ ϕ |
Cyrillic [9] | Є І В О П С Т Х Э а е и й л н о р с у і ї Ҁ ӗ ӱ Ӿ (76 / 25) |
Armenian [10] | Ե ե է (11 / 3) |
Hebrew [11] | ְ ֱ ֲ ֳ ִ ֵ ֶ ַ ָ ֹ ֻ ּ ־ ׁ ׂ ׅ א ב ג ד ה ו ז ח ט י ך כ ל ם מ ן נ ס ע ף פ ץ צ ק ר ש ת ׳ ״ (3906 / 45) |
Arabic [12] | ا ب ش م (4 / 4) |
Devanagari [13] | घ ज ऩ प ़ (5 / 5) |
Cherokee [14] | Ꭶ (2 / 1) |
Unified Canadian Aboriginal Syllabics [15] | ᐨ (3 / 1) |
Runic [16] | ᛋ (3 / 1) |
Phonetic Extensions [17] | ᴗ ᵃ ᵉ (8 / 3) |
Phonetic Extensions Supplement [18] | ᶜ (11 / 1) |
Latin Extended Additional [19] | ḁ Ḋ Ḍ ḍ Ḏ ḏ ḗ Ḡ Ḥ ḥ Ḫ ḫ ḭ Ḱ ḱ Ḳ ḳ ḵ ḷ ṁ Ṃ ṃ ṅ Ṇ ṇ ṑ ṓ ṙ Ṛ ṛ ṟ Ṡ ṡ Ṣ ṣ Ṥ ṩ Ṭ ṭ Ṯ ṯ ṱ ṷ Ṿ Ẓ ẓ ẖ ẛ Ạ ạ Ả ả ậ ẹ ẽ ị (3363 / 56) |
Greek Extended [20] | ἀ ἁ ἂ ἃ ἄ ἅ ἆ ἇ Ἀ Ἁ Ἂ Ἃ Ἄ Ἅ Ἆ ἐ ἑ ἒ ἓ ἔ ἕ Ἐ Ἑ Ἒ Ἓ Ἔ Ἕ ἠ ἡ ἢ ἣ ἤ ἥ ἦ ἧ Ἠ Ἡ Ἢ Ἣ Ἤ Ἥ Ἦ Ἧ ἰ ἱ ἲ ἳ ἴ ἵ ἶ ἷ Ἰ Ἱ Ἲ Ἴ Ἵ Ἶ ὀ ὁ ὂ ὃ ὄ ὅ Ὀ Ὁ Ὂ Ὃ Ὄ Ὅ ὐ ὑ ὒ ὓ ὔ ὕ ὖ ὗ Ὑ Ὓ Ὕ Ὗ ὠ ὡ ὢ ὣ ὤ ὥ ὦ ὧ Ὠ Ὡ Ὣ Ὤ Ὥ Ὦ Ὧ |
General Punctuation [21] | – — ‖ ‘ ’ ‚ “ ” „ |
Superscripts and Subscripts [22] | ⁰ ⁴ ⁵ ⁶ ⁷ ⁸ ⁹ ₁ ₂ ₆ ₈ (2703 / 11) |
Combining Diacritical Marks for Symbols [23] | ⃒ ⃠ (8 / 2) |
Letterlike Symbols [24] | ℞ ℳ ℵ (82 / 3) |
Number Forms [25] | ⅒ ⅓ ⅔ ⅕ ⅘ ⅛ ⅜ ⅝ ⅞ ↀ ↂ Ↄ (42 / 12) |
Arrows [26] | → ↓ (2 / 2) |
Mathematical Operators [27] | ∆ − ∗ √ |
Miscellaneous Technical [28] | ⌒ ⎥ ⎷ ⎿ ⏉ ⏑ ⏓ ⏕ ⏖ (20 / 9) |
Box Drawing [29] | ━ │ ┃ ╚ ╵ ╷ (95 / 6) |
Geometric Shapes [30] | ■ □ ▭ ▯ ◡ ◺ ◻ ◿ (727 / 8) |
Miscellaneous Symbols [31] | ☐ ♃ ⚀ ⚞ ⚟ ⚭ (38 / 6) |
Dingbats [32] | ✕ ✝ ❊ ❰ (134 / 4) |
Miscellaneous Mathematical Symbols-A [33] | ⟓ ⟨ ⟩ (1127 / 3) |
Miscellaneous Mathematical Symbols-B [34] | ⧄ ⧠ (2 / 2) |
Supplemental Mathematical Operators [35] | ⪌ ⫙ (2 / 2) |
Coptic [36] | ⲁ ⲅ ⲉ ⲓ ⲕ ⲟ ⲡ ⲥ ⲧ ⲩ ⳓ (19 / 11) |
Tifinagh [37] | ⴼ ⵓ (2 / 2) |
CJK Symbols and Punctuation [38] | 〈 〉 (88 / 2) |
Alphabetic Presentation Forms [39] | fi fl (831 / 2) |
Halfwidth and Fullwidth Forms [40] | | (3 / 1) |
Osage [41] | 𐓄 (1 / 1) |
Linear A [42] | 𐙙 (1 / 1) |
Phoenician [43] | 𐤋 (2 / 1) |
Lydian [44] | 𐤭 (1 / 1) |
Old Turkic [45] | 𐰩 (1 / 1) |
Ancient Greek Musical Notation [46] | 𝈂 (2 / 1) |
Alchemical Symbols [47] | 🜷 (1 / 1) |
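The digitized text of Pauly's RE contains 70'876'673 glyphs, made up of 810 distinct glyphs in 47 blocks. In addition there are 13'469'874 whitespace characters (SPACE as word boundary, TAB as tabulator, NL as newline), of which 122'534 are paragraph breaks (NL). Glyphs and whitespace together give the 84'346'547 characters of the source file.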
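The per-block figures "(frequency / count)" in the table above pair the total number of occurrences with the number of distinct characters in a block. The following sketch shows one way such a summary could be derived from a character count. It is an assumption-laden illustration: `BLOCKS`, `block_of` and `summarize` are hypothetical names, and the block table lists only a handful of the 47 blocks, with ranges taken from the Unicode standard.

```python
from collections import Counter

# Partial block table (first code point, last code point, block name).
# Only a few of the 47 blocks are listed here; extend as needed.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0370, 0x03FF, "Greek and Coptic"),
    (0x0400, 0x04FF, "Cyrillic"),
    (0x0590, 0x05FF, "Hebrew"),
    (0x1F00, 0x1FFF, "Greek Extended"),
]

def block_of(ch):
    """Return the block name for a character, or "(other)" if not in BLOCKS."""
    cp = ord(ch)
    for first, last, name in BLOCKS:
        if first <= cp <= last:
            return name
    return "(other)"

def summarize(counts):
    """Map block name -> (total occurrences, number of distinct characters)."""
    totals = {}
    for ch, n in counts.items():
        name = block_of(ch)
        freq, distinct = totals.get(name, (0, 0))
        totals[name] = (freq + n, distinct + 1)
    return totals

# Tiny sample: Greek alpha, beta, alpha; Latin a, b; Cyrillic a.
sample = "\u03b1\u03b2\u03b1 ab \u0430"
counts = Counter(ch for ch in sample if ch not in "\t\n ")
summary = summarize(counts)
for name, (freq, distinct) in summary.items():
    print(f"{name} | ({freq} / {distinct})")

# Glyphs plus whitespace, using the figures quoted in the text.
total = 70_876_673 + 13_469_874
print(total)
```

The last two lines simply add the glyph and whitespace figures quoted in the text, which gives the 84'346'547 characters stated for the source file.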
2. Frequencies
The table shows, for every character (glyph) in the RE, its frequency together with its hexadecimal code point (Hex).
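As an illustration of how one row of such a table might be generated, the sketch below uses Python's standard `unicodedata` module to look up the character name and formats the code point as `U+XXXX`. The helper names `format_count` and `table_row` are hypothetical. The apostrophe as thousands separator and the capitalized name follow the layout used on this page and are not a claim about how the page was actually produced.

```python
import unicodedata

def format_count(n):
    """Format an integer with apostrophes as thousands separators."""
    return f"{n:,}".replace(",", "'")

def table_row(ch, n):
    """Build one row: frequency | glyph | name | hex code point |"""
    # unicodedata.name returns the upper-case Unicode name; capitalize()
    # keeps only the first letter in upper case.
    name = unicodedata.name(ch, "(unnamed)").capitalize()
    return f"{format_count(n)} | {ch} | {name} | U+{ord(ch):04X} |"

print(table_row("!", 964))
print(table_row("e", 8965941))
print(table_row("\u1f00", 17136))
```

The `:04X` format pads the code point to at least four hexadecimal digits, so code points beyond U+FFFF print with five or more digits.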
Frequency | G | Block[1] ASCII character set | Hex |
---|---|---|---|
• | ASCII punctuation and symbols | ||
964 | ! | Exclamation mark | U+0021 |
230 | " | Quotation mark | U+0022 |
207 | # | Number sign | U+0023 |
3 | $ | Dollar sign | U+0024 |
229 | % | Percent sign | U+0025 |
29 | & | Ampersand | U+0026 |
1'782 | ' | Apostrophe | U+0027 |
366'821 | ( | Left parenthesis | U+0028 |
416'591 | ) | Right parenthesis | U+0029 |
1'119 | * | Asterisk | U+002A |
• | ASCII math operator | ||
376 | + | Plus sign | U+002B |
• | ASCII punctuation | ||
1'071'184 | , | Comma | U+002C |
52'323 | - | Hyphen-minus | U+002D |
2'301'491 | . | Full stop | U+002E |
9'286 | / | Solidus | U+002F |
• | ASCII digits | ||
173'730 | 0 | Digit zero | U+0030 |
574'785 | 1 | Digit one | U+0031 |
389'842 | 2 | Digit two | U+0032 |
318'582 | 3 | Digit three | U+0033 |
275'857 | 4 | Digit four | U+0034 |
249'102 | 5 | Digit five | U+0035 |
226'088 | 6 | Digit six | U+0036 |
209'998 | 7 | Digit seven | U+0037 |
233'932 | 8 | Digit eight | U+0038 |
201'723 | 9 | Digit nine | U+0039 |
• | ASCII punctuation | ||
48'332 | : | Colon | U+003A |
130'529 | ; | Semicolon | U+003B |
• | ASCII mathematical operators | ||
83 | < | Less-than sign | U+003C |
35'789 | = | Equals sign | U+003D |
153 | > | Greater-than sign | U+003E |
• | ASCII punctuation | ||
9'288 | ? | Question mark | U+003F |
• | Uppercase Latin alphabet | ||
445'257 | A | Latin capital letter a | U+0041 |
249'607 | B | Latin capital letter b | U+0042 |
242'675 | C | Latin capital letter c | U+0043 |
255'712 | D | Latin capital letter d | U+0044 |
185'482 | E | Latin capital letter e | U+0045 |
131'645 | F | Latin capital letter f | U+0046 |
227'252 | G | Latin capital letter g | U+0047 |
180'871 | H | Latin capital letter h | U+0048 |
826'050 | I | Latin capital letter i | U+0049 |
80'664 | J | Latin capital letter j | U+004A |
151'034 | K | Latin capital letter k | U+004B |
193'097 | L | Latin capital letter l | U+004C |
208'859 | M | Latin capital letter m | U+004D |
122'855 | N | Latin capital letter n | U+004E |
88'759 | O | Latin capital letter o | U+004F |
259'961 | P | Latin capital letter p | U+0050 |
17'843 | Q | Latin capital letter q | U+0051 |
136'692 | R | Latin capital letter r | U+0052 |
434'465 | S | Latin capital letter s | U+0053 |
170'938 | T | Latin capital letter t | U+0054 |
39'419 | U | Latin capital letter u | U+0055 |
319'220 | V | Latin capital letter v | U+0056 |
100'517 | W | Latin capital letter w | U+0057 |
203'082 | X | Latin capital letter x | U+0058 |
498 | Y | Latin capital letter y | U+0059 |
83'310 | Z | Latin capital letter z | U+005A |
• | ASCII punctuation and symbols | ||
53'467 | [ | Left square bracket | U+005B |
14 | \ | Reverse solidus | U+005C |
53'557 | ] | Right square bracket | U+005D |
2 | ^ | Circumflex accent | U+005E |
37 | _ | Low line | U+005F |
55 | ` | Grave accent | U+0060 |
• | Lowercase Latin alphabet | ||
3'367'471 | a | Latin small letter a | U+0061 |
963'307 | b | Latin small letter b | U+0062 |
1'790'048 | c | Latin small letter c | U+0063 |
2'789'753 | d | Latin small letter d | U+0064 |
8'965'941 | e | Latin small letter e | U+0065 |
829'563 | f | Latin small letter f | U+0066 |
1'496'061 | g | Latin small letter g | U+0067 |
2'476'223 | h | Latin small letter h | U+0068 |
4'746'245 | i | Latin small letter i | U+0069 |
45'986 | j | Latin small letter j | U+006A |
457'791 | k | Latin small letter k | U+006B |
2'228'224 | l | Latin small letter l | U+006C |
1'396'530 | m | Latin small letter m | U+006D |
5'423'998 | n | Latin small letter n | U+006E |
1'918'094 | o | Latin small letter o | U+006F |
580'485 | p | Latin small letter p | U+0070 |
38'188 | q | Latin small letter q | U+0071 |
4'139'041 | r | Latin small letter r | U+0072 |
3'737'166 | s | Latin small letter s | U+0073 |
3'375'296 | t | Latin small letter t | U+0074 |
2'315'876 | u | Latin small letter u | U+0075 |
536'354 | v | Latin small letter v | U+0076 |
583'039 | w | Latin small letter w | U+0077 |
60'563 | x | Latin small letter x | U+0078 |
164'369 | y | Latin small letter y | U+0079 |
539'137 | z | Latin small letter z | U+007A |
• | ASCII punctuation and symbols | ||
27 | { | Left curly bracket | U+007B |
1'238 | \| | Vertical line | U+007C |
31 | } | Right curly bracket | U+007D |
144 | ~ | Tilde | U+007E |
67'729'533 | 93 | Basic Latin | |
Frequency | G | Block[2] Latin-1 Supplement | Hex |
• | Latin-1 punctuation and symbols | ||
7 | £ | Pound sign | U+00A3 |
1 | ¤ | Currency sign | U+00A4 |
5'950 | § | Section sign | U+00A7 |
4 | ¨ | Diaeresis | U+00A8 |
2 | ª | Feminine ordinal indicator | U+00AA |
28 | « | Left-pointing double angle quotation mark | U+00AB |
7 | ¬ | Not sign | U+00AC |
768 | | Soft hyphen | U+00AD |
9 | ¯ | Macron | U+00AF |
1'395 | ° | Degree sign | U+00B0 |
6 | ± | Plus-minus sign | U+00B1 |
7'883 | ² | Superscript two | U+00B2 |
2'677 | ³ | Superscript three | U+00B3 |
13 | ´ | Acute accent | U+00B4 |
3 | µ | Micro sign | U+00B5 |
2'048 | · | Middle dot | U+00B7 |
335 | ¹ | Superscript one | U+00B9 |
15 | º | Masculine ordinal indicator | U+00BA |
16 | » | Right-pointing double angle quotation mark | U+00BB |
• | Vulgar fractions | ||
51 | ¼ | Vulgar fraction one quarter | U+00BC |
363 | ½ | Vulgar fraction one half | U+00BD |
25 | ¾ | Vulgar fraction three quarters | U+00BE |
• | Letters | ||
7 | À | Latin capital letter a with grave | U+00C0 |
1 | Á | Latin capital letter a with acute | U+00C1 |
92 | Â | Latin capital letter a with circumflex | U+00C2 |
1 | Ã | Latin capital letter a with tilde | U+00C3 |
8'207 | Ä | Latin capital letter a with diaeresis | U+00C4 |
5 | Å | Latin capital letter a with ring above | U+00C5 |
14 | Æ | Latin capital letter ae | U+00C6 |
50 | Ç | Latin capital letter c with cedilla | U+00C7 |
608 | É | Latin capital letter e with acute | U+00C9 |
64 | Ê | Latin capital letter e with circumflex | U+00CA |
2 | Í | Latin capital letter i with acute | U+00CD |
19 | Î | Latin capital letter i with circumflex | U+00CE |
4 | Ï | Latin capital letter i with diaeresis | U+00CF |
1 | Ð | Latin capital letter eth | U+00D0 |
7 | Ó | Latin capital letter o with acute | U+00D3 |
5 | Ô | Latin capital letter o with circumflex | U+00D4 |
2'440 | Ö | Latin capital letter o with diaeresis | U+00D6 |
• | Mathematical operator | ||
121 | × | Multiplication sign | U+00D7 |
• | Letters | ||
3 | Ú | Latin capital letter u with acute | U+00DA |
18 | Û | Latin capital letter u with circumflex | U+00DB |
17'792 | Ü | Latin capital letter u with diaeresis | U+00DC |
89'651 | ß | Latin small letter sharp s | U+00DF |
477 | à | Latin small letter a with grave | U+00E0 |
824 | á | Latin small letter a with acute | U+00E1 |
4'045 | â | Latin small letter a with circumflex | U+00E2 |
18 | ã | Latin small letter a with tilde | U+00E3 |
258'461 | ä | Latin small letter a with diaeresis | U+00E4 |
13 | å | Latin small letter a with ring above | U+00E5 |
2 | æ | Latin small letter ae | U+00E6 |
415 | ç | Latin small letter c with cedilla | U+00E7 |
1'250 | è | Latin small letter e with grave | U+00E8 |
10'404 | é | Latin small letter e with acute | U+00E9 |
630 | ê | Latin small letter e with circumflex | U+00EA |
948 | ë | Latin small letter e with diaeresis | U+00EB |
13 | ì | Latin small letter i with grave | U+00EC |
250 | í | Latin small letter i with acute | U+00ED |
1'374 | î | Latin small letter i with circumflex | U+00EE |
1'174 | ï | Latin small letter i with diaeresis | U+00EF |
1 | ð | Latin small letter eth | U+00F0 |
120 | ñ | Latin small letter n with tilde | U+00F1 |
30 | ò | Latin small letter o with grave | U+00F2 |
308 | ó | Latin small letter o with acute | U+00F3 |
619 | ô | Latin small letter o with circumflex | U+00F4 |
13 | õ | Latin small letter o with tilde | U+00F5 |
143'463 | ö | Latin small letter o with diaeresis | U+00F6 |
• | Mathematical operator | ||
1 | ÷ | Division sign | U+00F7 |
• | Letters | ||
31 | ù | Latin small letter u with grave | U+00F9 |
164 | ú | Latin small letter u with acute | U+00FA |
1'050 | û | Latin small letter u with circumflex | U+00FB |
309'118 | ü | Latin small letter u with diaeresis | U+00FC |
66 | ý | Latin small letter y with acute | U+00FD |
7 | þ | Latin small letter thorn | U+00FE |
3 | ÿ | Latin small letter y with diaeresis | U+00FF |
876'010 | 75 | Latin-1 Supplement | |
Frequency | G | Block[3] Latin Extended-A | Hex |
• | European Latin | ||
35 | Ā | Latin capital letter a with macron | U+0100 |
2'462 | ā | Latin small letter a with macron | U+0101 |
2 | Ă | Latin capital letter a with breve | U+0102 |
57 | ă | Latin small letter a with breve | U+0103 |
1 | ą | Latin small letter a with ogonek | U+0105 |
1 | Ć | Latin capital letter c with acute | U+0106 |
128 | ć | Latin small letter c with acute | U+0107 |
11 | Ċ | Latin capital letter c with dot above | U+010A |
178 | Č | Latin capital letter c with caron | U+010C |
548 | č | Latin small letter c with caron | U+010D |
1 | Đ | Latin capital letter d with stroke | U+0110 |
10 | đ | Latin small letter d with stroke | U+0111 |
16 | Ē | Latin capital letter e with macron | U+0112 |
374 | ē | Latin small letter e with macron | U+0113 |
88 | ĕ | Latin small letter e with breve | U+0115 |
9 | ę | Latin small letter e with ogonek | U+0119 |
59 | ě | Latin small letter e with caron | U+011B |
146 | Ğ | Latin capital letter g with breve | U+011E |
288 | ğ | Latin small letter g with breve | U+011F |
21 | Ġ | Latin capital letter g with dot above | U+0120 |
26 | ġ | Latin small letter g with dot above | U+0121 |
2 | Ģ | Latin capital letter g with cedilla | U+0122 |
53 | ģ | Latin small letter g with cedilla | U+0123 |
1 | Ī | Latin capital letter i with macron | U+012A |
780 | ī | Latin small letter i with macron | U+012B |
87 | ĭ | Latin small letter i with breve | U+012D |
1 | į | Latin small letter i with ogonek | U+012F |
23 | ı | Latin small letter dotless i | U+0131 |
3 | Ĵ | Latin capital letter j with circumflex | U+0134 |
1 | ĵ | Latin small letter j with circumflex | U+0135 |
1 | Ķ | Latin capital letter k with cedilla | U+0136 |
1 | ķ | Latin small letter k with cedilla | U+0137 |
16 | ł | Latin small letter l with stroke | U+0142 |
75 | ń | Latin small letter n with acute | U+0144 |
13 | ņ | Latin small letter n with cedilla | U+0146 |
4 | ň | Latin small letter n with caron | U+0148 |
2 | ŋ | Latin small letter eng | U+014B |
3 | Ō | Latin capital letter o with macron | U+014C |
393 | ō | Latin small letter o with macron | U+014D |
38 | ŏ | Latin small letter o with breve | U+014F |
6 | ő | Latin small letter o with double acute | U+0151 |
2 | ŕ | Latin small letter r with acute | U+0155 |
1 | Ŗ | Latin capital letter r with cedilla | U+0156 |
16 | ř | Latin small letter r with caron | U+0159 |
2 | Ś | Latin capital letter s with acute | U+015A |
189 | ś | Latin small letter s with acute | U+015B |
1 | Ŝ | Latin capital letter s with circumflex | U+015C |
2 | ŝ | Latin small letter s with circumflex | U+015D |
4 | ş | Latin small letter s with cedilla | U+015F |
315 | Š | Latin capital letter s with caron | U+0160 |
1'451 | š | Latin small letter s with caron | U+0161 |
2 | ũ | Latin small letter u with tilde | U+0169 |
1 | Ū | Latin capital letter u with macron | U+016A |
548 | ū | Latin small letter u with macron | U+016B |
55 | ŭ | Latin small letter u with breve | U+016D |
3 | ŷ | Latin small letter y with circumflex | U+0177 |
1 | Ź | Latin capital letter z with acute | U+0179 |
3 | ż | Latin small letter z with dot above | U+017C |
25 | Ž | Latin capital letter z with caron | U+017D |
149 | ž | Latin small letter z with caron | U+017E |
8'734 | 60 | Latin Extended-A | |
Frequency | G | Block[4] Latin Extended-B | Hex |
• | Non-European and historic Latin | ||
1 | Ɔ | Latin capital letter open o | U+0186 |
1 | Ǝ | Latin capital letter reversed e | U+018E |
1 | ƿ | Latin letter wynn | U+01BF |
• | Pinyin diacritic-vowel combinations | ||
53 | ǎ | Latin small letter a with caron | U+01CE |
40 | ǐ | Latin small letter i with caron | U+01D0 |
1 | Ǒ | Latin capital letter o with caron | U+01D1 |
38 | ǒ | Latin small letter o with caron | U+01D2 |
25 | ǔ | Latin small letter u with caron | U+01D4 |
2 | Ǘ | Latin capital letter u with diaeresis and acute | U+01D7 |
• | Phonetic and historic letters | ||
57 | Ǧ | Latin capital letter g with caron | U+01E6 |
117 | ǧ | Latin small letter g with caron | U+01E7 |
3 | ǰ | Latin small letter j with caron | U+01F0 |
3 | Ǵ | Latin capital letter g with acute | U+01F4 |
16 | ǵ | Latin small letter g with acute | U+01F5 |
• | Additions for Slovenian and Croatian | ||
1 | ȓ | Latin small letter r with inverted breve | U+0213 |
• | Additions for Romanian | ||
1 | Ș | Latin capital letter s with comma below | U+0218 |
3 | ș | Latin small letter s with comma below | U+0219 |
6 | ț | Latin small letter t with comma below | U+021B |
• | Miscellaneous additions | ||
4 | ȥ | Latin small letter z with hook | U+0225 |
1 | Ȧ | Latin capital letter a with dot above | U+0226 |
• | Additions for Livonian | ||
4 | ȳ | Latin small letter y with macron | U+0233 |
378 | 21 | Latin Extended-B | |
Frequency | G | Block[5] IPA Extensions | Hex |
• | IPA extensions | ||
2 | ə | Latin small letter schwa | U+0259 |
1 | ʒ | Latin small letter ezh | U+0292 |
2 | ʠ | Latin small letter q with hook | U+02A0 |
5 | 3 | IPA Extensions | |
Frequency | G | Block[6] Spacing Modifier Letters | Hex |
• | Latin superscript modifier letters | ||
8 | ʰ | Modifier letter small h | U+02B0 |
• | Miscellaneous phonetic modifiers | ||
324 | ʹ | Modifier letter prime | U+02B9 |
2 | ʻ | Modifier letter turned comma | U+02BB |
2 | ʼ | Modifier letter apostrophe | U+02BC |
181 | ʾ | Modifier letter right half ring | U+02BE |
882 | ʿ | Modifier letter left half ring | U+02BF |
• | Spacing clones of diacritics | ||
1 | ˘ | Breve | U+02D8 |
• | Extended Bopomofo tone marks | ||
1 | ˫ | Modifier letter yang departing tone mark | U+02EB |
1'401 | 8 | Spacing Modifier Letters | |
Frequency | G | Block[7] Combining Diacritical Marks | Hex |
• | Ordinary diacritics | ||
13 | ́ | Combining acute accent | U+0301 |
9 | ̄ | Combining macron | U+0304 |
5 | ̅ | Combining overline | U+0305 |
5 | ̆ | Combining breve | U+0306 |
3 | ̌ | Combining caron | U+030C |
21 | ̓ | Combining comma above | U+0313 |
2 | ̔ | Combining reversed comma above | U+0314 |
29 | ̣ | Combining dot below | U+0323 |
32 | ̥ | Combining ring below | U+0325 |
1 | ̮ | Combining breve below | U+032E |
4 | ̯ | Combining inverted breve below | U+032F |
• | Miscellaneous additions | ||
2 | ̽ | Combining x above | U+033D |
• | Additions for Greek | ||
7 | ͂ | Combining greek perispomeni | U+0342 |
• | Additions for the Uralic Phonetic Alphabet | ||
1 | ͗ | Combining right half ring above | U+0357 |
• | Double diacritics | ||
1 | ͡ | Combining double inverted breve | U+0361 |
135 | 15 | Combining Diacritical Marks | |
Frequency | G | Block[8] Greek and Coptic | Hex |
• | Numeral signs | ||
1 | ʹ | Greek numeral sign | U+0374 |
2 | ͵ | Greek lower numeral sign | U+0375 |
• | Spacing accent marks | ||
17 | ΄ | Greek tonos | U+0384 |
• | Letter | ||
51 | Ά | Greek capital letter alpha with tonos | U+0386 |
• | Letters | ||
17 | Έ | Greek capital letter epsilon with tonos | U+0388 |
3 | Ή | Greek capital letter eta with tonos | U+0389 |
2 | Ί | Greek capital letter iota with tonos | U+038A |
4 | Ό | Greek capital letter omicron with tonos | U+038C |
4 | Ύ | Greek capital letter upsilon with tonos | U+038E |
1 | Ώ | Greek capital letter omega with tonos | U+038F |
544 | ΐ | Greek small letter iota with dialytika and tonos | U+0390 |
3'866 | Α | Greek capital letter alpha | U+0391 |
5'794 | Β | Greek capital letter beta | U+0392 |
3'211 | Γ | Greek capital letter gamma | U+0393 |
6'728 | Δ | Greek capital letter delta | U+0394 |
2'873 | Ε | Greek capital letter epsilon | U+0395 |
591 | Ζ | Greek capital letter zeta | U+0396 |
347 | Η | Greek capital letter eta | U+0397 |
1'985 | Θ | Greek capital letter theta | U+0398 |
2'748 | Ι | Greek capital letter iota | U+0399 |
8'400 | Κ | Greek capital letter kappa | U+039A |
3'490 | Λ | Greek capital letter lamda | U+039B |
5'511 | Μ | Greek capital letter mu | U+039C |
2'094 | Ν | Greek capital letter nu | U+039D |
297 | Ξ | Greek capital letter xi | U+039E |
1'394 | Ο | Greek capital letter omicron | U+039F |
7'200 | Π | Greek capital letter pi | U+03A0 |
423 | Ρ | Greek capital letter rho | U+03A1 |
5'760 | Σ | Greek capital letter sigma | U+03A3 |
3'561 | Τ | Greek capital letter tau | U+03A4 |
274 | Υ | Greek capital letter upsilon | U+03A5 |
2'566 | Φ | Greek capital letter phi | U+03A6 |
3'170 | Χ | Greek capital letter chi | U+03A7 |
144 | Ψ | Greek capital letter psi | U+03A8 |
245 | Ω | Greek capital letter omega | U+03A9 |
1 | Ϊ | Greek capital letter iota with dialytika | U+03AA |
32'261 | ά | Greek small letter alpha with tonos | U+03AC |
25'011 | έ | Greek small letter epsilon with tonos | U+03AD |
14'409 | ή | Greek small letter eta with tonos | U+03AE |
50'846 | ί | Greek small letter iota with tonos | U+03AF |
26 | ΰ | Greek small letter upsilon with dialytika and tonos | U+03B0 |
157'187 | α | Greek small letter alpha | U+03B1 |
18'503 | β | Greek small letter beta | U+03B2 |
35'253 | γ | Greek small letter gamma | U+03B3 |
44'364 | δ | Greek small letter delta | U+03B4 |
95'440 | ε | Greek small letter epsilon | U+03B5 |
3'927 | ζ | Greek small letter zeta | U+03B6 |
38'311 | η | Greek small letter eta | U+03B7 |
24'175 | θ | Greek small letter theta | U+03B8 |
118'181 | ι | Greek small letter iota | U+03B9 |
69'326 | κ | Greek small letter kappa | U+03BA |
71'112 | λ | Greek small letter lamda | U+03BB |
60'152 | μ | Greek small letter mu | U+03BC |
155'053 | ν | Greek small letter nu | U+03BD |
7'933 | ξ | Greek small letter xi | U+03BE |
158'596 | ο | Greek small letter omicron | U+03BF |
62'562 | π | Greek small letter pi | U+03C0 |
108'115 | ρ | Greek small letter rho | U+03C1 |
106'181 | ς | Greek small letter final sigma | U+03C2 |
65'790 | σ | Greek small letter sigma | U+03C3 |
128'509 | τ | Greek small letter tau | U+03C4 |
45'649 | υ | Greek small letter upsilon | U+03C5 |
19'696 | φ | Greek small letter phi | U+03C6 |
21'408 | χ | Greek small letter chi | U+03C7 |
2'711 | ψ | Greek small letter psi | U+03C8 |
31'116 | ω | Greek small letter omega | U+03C9 |
609 | ϊ | Greek small letter iota with dialytika | U+03CA |
35 | ϋ | Greek small letter upsilon with dialytika | U+03CB |
29'773 | ό | Greek small letter omicron with tonos | U+03CC |
17'939 | ύ | Greek small letter upsilon with tonos | U+03CD |
8'021 | ώ | Greek small letter omega with tonos | U+03CE |
• | Variant letterforms | ||
422 | ϑ | Greek theta symbol | U+03D1 |
3 | ϕ | Greek phi symbol | U+03D5 |
• | Archaic letters | ||
3 | Ϙ | Greek letter archaic koppa | U+03D8 |
5 | ϙ | Greek small letter archaic koppa | U+03D9 |
1 | Ϛ | Greek letter stigma | U+03DA |
6 | ϛ | Greek small letter stigma | U+03DB |
101 | Ϝ | Greek letter digamma | U+03DC |
113 | ϝ | Greek small letter digamma | U+03DD |
6 | ϟ | Greek small letter koppa | U+03DF |
4 | Ϡ | Greek letter sampi | U+03E0 |
1 | ϡ | Greek small letter sampi | U+03E1 |
• | Coptic letters derived from Demotic | ||
1 | ϣ | Coptic small letter shei | U+03E3 |
• | Variant letterforms | ||
1 | ϰ | Greek kappa symbol | U+03F0 |
53 | ϱ | Greek rho symbol | U+03F1 |
• | Variant letterforms and symbols | ||
5 | ϵ | Greek lunate epsilon symbol | U+03F5 |
• | Variant letterform | ||
26 | Ϲ | Greek capital lunate sigma symbol | U+03F9 |
1'902'250 | 87 | Greek and Coptic | |
Frequency | G | Block[9] Cyrillic | Hex |
• | Cyrillic extensions | ||
3 | Є | Cyrillic capital letter ukrainian ie | U+0404 |
2 | І | Cyrillic capital letter byelorussian-ukrainian i | U+0406 |
• | Basic Russian alphabet | ||
6 | В | Cyrillic capital letter ve | U+0412 |
2 | О | Cyrillic capital letter o | U+041E |
1 | П | Cyrillic capital letter pe | U+041F |
1 | С | Cyrillic capital letter es | U+0421 |
3 | Т | Cyrillic capital letter te | U+0422 |
5 | Х | Cyrillic capital letter ha | U+0425 |
2 | Э | Cyrillic capital letter e | U+042D |
8 | а | Cyrillic small letter a | U+0430 |
8 | е | Cyrillic small letter ie | U+0435 |
2 | и | Cyrillic small letter i | U+0438 |
1 | й | Cyrillic small letter short i | U+0439 |
2 | л | Cyrillic small letter el | U+043B |
1 | н | Cyrillic small letter en | U+043D |
4 | о | Cyrillic small letter o | U+043E |
8 | р | Cyrillic small letter er | U+0440 |
4 | с | Cyrillic small letter es | U+0441 |
1 | у | Cyrillic small letter u | U+0443 |
• | Cyrillic extensions | ||
1 | і | Cyrillic small letter byelorussian-ukrainian i | U+0456 |
2 | ї | Cyrillic small letter yi | U+0457 |
• | Historic letters | ||
5 | Ҁ | Cyrillic capital letter koppa | U+0480 |
• | Extended Cyrillic | ||
1 | ӗ | Cyrillic small letter ie with breve | U+04D7 |
1 | ӱ | Cyrillic small letter u with diaeresis | U+04F1 |
• | Additions for Nivkh | ||
2 | Ӿ | Cyrillic capital letter ha with stroke | U+04FE |
76 | 25 | Cyrillic | |
Frequency | G | Block[10] Armenian | Hex |
• | Uppercase letters | ||
1 | Ե | Armenian capital letter ech | U+0535 |
• | Lowercase letters | ||
5 | ե | Armenian small letter ech | U+0565 |
5 | է | Armenian small letter eh | U+0567 |
11 | 3 | Armenian | |
Frequency | G | Block[11] Hebrew | Hex |
• | Points and punctuation | ||
120 | ְ | Hebrew point sheva | U+05B0 |
8 | ֱ | Hebrew point hataf segol | U+05B1 |
13 | ֲ | Hebrew point hataf patah | U+05B2 |
1 | ֳ | Hebrew point hataf qamats | U+05B3 |
102 | ִ | Hebrew point hiriq | U+05B4 |
61 | ֵ | Hebrew point tsere | U+05B5 |
57 | ֶ | Hebrew point segol | U+05B6 |
125 | ַ | Hebrew point patah | U+05B7 |
140 | ָ | Hebrew point qamats | U+05B8 |
86 | ֹ | Hebrew point holam | U+05B9 |
3 | ֻ | Hebrew point qubuts | U+05BB |
136 | ּ | Hebrew point dagesh or mapiq | U+05BC |
12 | ־ | Hebrew punctuation maqaf | U+05BE |
45 | ׁ | Hebrew point shin dot | U+05C1 |
9 | ׂ | Hebrew point sin dot | U+05C2 |
• | Puncta extraordinaria | ||
1 | ׅ | Hebrew mark lower dot | U+05C5 |
• | Based on ISO 8859-8 | ||
177 | א | Hebrew letter alef | U+05D0 |
197 | ב | Hebrew letter bet | U+05D1 |
37 | ג | Hebrew letter gimel | U+05D2 |
100 | ד | Hebrew letter dalet | U+05D3 |
174 | ה | Hebrew letter he | U+05D4 |
166 | ו | Hebrew letter vav | U+05D5 |
31 | ז | Hebrew letter zayin | U+05D6 |
84 | ח | Hebrew letter het | U+05D7 |
23 | ט | Hebrew letter tet | U+05D8 |
301 | י | Hebrew letter yod | U+05D9 |
6 | ך | Hebrew letter final kaf | U+05DA |
83 | כ | Hebrew letter kaf | U+05DB |
192 | ל | Hebrew letter lamed | U+05DC |
87 | ם | Hebrew letter final mem | U+05DD |
135 | מ | Hebrew letter mem | U+05DE |
88 | ן | Hebrew letter final nun | U+05DF |
81 | נ | Hebrew letter nun | U+05E0 |
43 | ס | Hebrew letter samekh | U+05E1 |
131 | ע | Hebrew letter ayin | U+05E2 |
40 | ף | Hebrew letter final pe | U+05E3 |
61 | פ | Hebrew letter pe | U+05E4 |
6 | ץ | Hebrew letter final tsadi | U+05E5 |
43 | צ | Hebrew letter tsadi | U+05E6 |
55 | ק | Hebrew letter qof | U+05E7 |
349 | ר | Hebrew letter resh | U+05E8 |
147 | ש | Hebrew letter shin | U+05E9 |
145 | ת | Hebrew letter tav | U+05EA |
• | Additional punctuation | ||
4 | ׳ | Hebrew punctuation geresh | U+05F3 |
1 | ״ | Hebrew punctuation gershayim | U+05F4 |
3'906 | 45 | Hebrew | |
Frequency | G | Block[12] Arabic | Hex |
• | Based on ISO 8859-6 | ||
1 | ا | Arabic letter alef | U+0627 |
1 | ب | Arabic letter beh | U+0628 |
1 | ش | Arabic letter sheen | U+0634 |
1 | م | Arabic letter meem | U+0645 |
4 | 4 | Arabic | |
Frequency | G | Block[13] Devanagari | Hex |
• | Consonants | ||
1 | घ | Devanagari letter gha | U+0918 |
1 | ज | Devanagari letter ja | U+091C |
1 | ऩ | Devanagari letter nnna | U+0929 |
1 | प | Devanagari letter pa | U+092A |
• | Various signs | ||
1 | ़ | Devanagari sign nukta | U+093C |
5 | 5 | Devanagari | |
Frequency | G | Block[14] Cherokee | Hex |
• | Uppercase syllables | ||
2 | Ꭶ | Cherokee letter ga | U+13A6 |
Frequency | G | Block[15] Unified Canadian Aboriginal Syllabics | Hex |
• | Syllables | ||
3 | ᐨ | Canadian syllabics final short horizontal stroke | U+1428 |
Frequency | G | Block[16] Runic | Hex |
• | Letters | ||
3 | ᛋ | Runic letter sigel long-branch-sol s | U+16CB |
Frequency | G | Block[17] Phonetic Extensions | Hex |
• | Latin letters | ||
4 | ᴗ | Latin small letter bottom half o | U+1D17 |
• | Latin superscript modifier letters | ||
1 | ᵃ | Modifier letter small a | U+1D43 |
3 | ᵉ | Modifier letter small e | U+1D49 |
8 | 3 | Phonetic Extensions | |
Frequency | G | Block[18] Phonetic Extensions Supplement | Hex |
• | Modifier letters | ||
11 | ᶜ | Modifier letter small c | U+1D9C |
Frequency | G | Block[19] Latin Extended Additional | Hex |
• | Latin general use extensions | ||
1 | ḁ | Latin small letter a with ring below | U+1E01 |
1 | Ḋ | Latin capital letter d with dot above | U+1E0A |
12 | Ḍ | Latin capital letter d with dot below | U+1E0C |
138 | ḍ | Latin small letter d with dot below | U+1E0D |
7 | Ḏ | Latin capital letter d with line below | U+1E0E |
27 | ḏ | Latin small letter d with line below | U+1E0F |
1 | ḗ | Latin small letter e with macron and acute | U+1E17 |
4 | Ḡ | Latin capital letter g with macron | U+1E20 |
414 | Ḥ | Latin capital letter h with dot below | U+1E24 |
421 | ḥ | Latin small letter h with dot below | U+1E25 |
122 | Ḫ | Latin capital letter h with breve below | U+1E2A |
201 | ḫ | Latin small letter h with breve below | U+1E2B |
13 | ḭ | Latin small letter i with tilde below | U+1E2D |
4 | Ḱ | Latin capital letter k with acute | U+1E30 |
1 | ḱ | Latin small letter k with acute | U+1E31 |
191 | Ḳ | Latin capital letter k with dot below | U+1E32 |
278 | ḳ | Latin small letter k with dot below | U+1E33 |
4 | ḵ | Latin small letter k with line below | U+1E35 |
28 | ḷ | Latin small letter l with dot below | U+1E37 |
3 | ṁ | Latin small letter m with dot above | U+1E41 |
1 | Ṃ | Latin capital letter m with dot below | U+1E42 |
21 | ṃ | Latin small letter m with dot below | U+1E43 |
28 | ṅ | Latin small letter n with dot above | U+1E45 |
1 | Ṇ | Latin capital letter n with dot below | U+1E46 |
254 | ṇ | Latin small letter n with dot below | U+1E47 |
1 | ṑ | Latin small letter o with macron and grave | U+1E51 |
2 | ṓ | Latin small letter o with macron and acute | U+1E53 |
5 | ṙ | Latin small letter r with dot above | U+1E59 |
9 | Ṛ | Latin capital letter r with dot below | U+1E5A |
115 | ṛ | Latin small letter r with dot below | U+1E5B |
3 | ṟ | Latin small letter r with line below | U+1E5F |
16 | Ṡ | Latin capital letter s with dot above | U+1E60 |
2 | ṡ | Latin small letter s with dot above | U+1E61 |
71 | Ṣ | Latin capital letter s with dot below | U+1E62 |
313 | ṣ | Latin small letter s with dot below | U+1E63 |
1 | Ṥ | Latin capital letter s with acute and dot above | U+1E64 |
2 | ṩ | Latin small letter s with dot below and dot above | U+1E69 |
71 | Ṭ | Latin capital letter t with dot below | U+1E6C |
444 | ṭ | Latin small letter t with dot below | U+1E6D |
3 | Ṯ | Latin capital letter t with line below | U+1E6E |
55 | ṯ | Latin small letter t with line below | U+1E6F |
1 | ṱ | Latin small letter t with circumflex below | U+1E71 |
3 | ṷ | Latin small letter u with circumflex below | U+1E77 |
2 | Ṿ | Latin capital letter v with dot below | U+1E7E |
10 | Ẓ | Latin capital letter z with dot below | U+1E92 |
9 | ẓ | Latin small letter z with dot below | U+1E93 |
17 | ẖ | Latin small letter h with line below | U+1E96 |
1 | ẛ | Latin small letter long s with dot above | U+1E9B |
• | Latin extensions for Vietnamese | ||
2 | Ạ | Latin capital letter a with dot below | U+1EA0 |
6 | ạ | Latin small letter a with dot below | U+1EA1 |
2 | Ả | Latin capital letter a with hook above | U+1EA2 |
1 | ả | Latin small letter a with hook above | U+1EA3 |
3 | ậ | Latin small letter a with circumflex and dot below | U+1EAD |
10 | ẹ | Latin small letter e with dot below | U+1EB9 |
1 | ẽ | Latin small letter e with tilde | U+1EBD |
6 | ị | Latin small letter i with dot below | U+1ECB |
3363 | 56 | Latin Extended Additional | |
Frequency | G | Block[20] Greek Extended | Hex |
• | Precomposed polytonic Greek | ||
17'136 | ἀ | Greek small letter alpha with psili | U+1F00 |
793 | ἁ | Greek small letter alpha with dasia | U+1F01 |
321 | ἂ | Greek small letter alpha with psili and varia | U+1F02 |
149 | ἃ | Greek small letter alpha with dasia and varia | U+1F03 |
4'790 | ἄ | Greek small letter alpha with psili and oxia | U+1F04 |
493 | ἅ | Greek small letter alpha with dasia and oxia | U+1F05 |
113 | ἆ | Greek small letter alpha with psili and perispomeni | U+1F06 |
7 | ἇ | Greek small letter alpha with dasia and perispomeni | U+1F07 |
14'824 | Ἀ | Greek capital letter alpha with psili | U+1F08 |
451 | Ἁ | Greek capital letter alpha with dasia | U+1F09 |
32 | Ἂ | Greek capital letter alpha with psili and varia | U+1F0A |
2 | Ἃ | Greek capital letter alpha with dasia and varia | U+1F0B |
2'181 | Ἄ | Greek capital letter alpha with psili and oxia | U+1F0C |
117 | Ἅ | Greek capital letter alpha with dasia and oxia | U+1F0D |
18 | Ἆ | Greek capital letter alpha with psili and perispomeni | U+1F0E |
20'846 | ἐ | Greek small letter epsilon with psili | U+1F10 |
1'964 | ἑ | Greek small letter epsilon with dasia | U+1F11 |
27 | ἒ | Greek small letter epsilon with psili and varia | U+1F12 |
43 | ἓ | Greek small letter epsilon with dasia and varia | U+1F13 |
4'314 | ἔ | Greek small letter epsilon with psili and oxia | U+1F14 |
692 | ἕ | Greek small letter epsilon with dasia and oxia | U+1F15 |
4'105 | Ἐ | Greek capital letter epsilon with psili | U+1F18 |
1'154 | Ἑ | Greek capital letter epsilon with dasia | U+1F19 |
14 | Ἒ | Greek capital letter epsilon with psili and varia | U+1F1A |
10 | Ἓ | Greek capital letter epsilon with dasia and varia | U+1F1B |
557 | Ἔ | Greek capital letter epsilon with psili and oxia | U+1F1C |
232 | Ἕ | Greek capital letter epsilon with dasia and oxia | U+1F1D |
501 | ἠ | Greek small letter eta with psili | U+1F20 |
3'557 | ἡ | Greek small letter eta with dasia | U+1F21 |
986 | ἢ | Greek small letter eta with psili and varia | U+1F22 |
176 | ἣ | Greek small letter eta with dasia and varia | U+1F23 |
382 | ἤ | Greek small letter eta with psili and oxia | U+1F24 |
333 | ἥ | Greek small letter eta with dasia and oxia | U+1F25 |
474 | ἦ | Greek small letter eta with psili and perispomeni | U+1F26 |
163 | ἧ | Greek small letter eta with dasia and perispomeni | U+1F27 |
397 | Ἠ | Greek capital letter eta with psili | U+1F28 |
1'296 | Ἡ | Greek capital letter eta with dasia | U+1F29 |
5 | Ἢ | Greek capital letter eta with psili and varia | U+1F2A |
3 | Ἣ | Greek capital letter eta with dasia and varia | U+1F2B |
66 | Ἤ | Greek capital letter eta with psili and oxia | U+1F2C |
145 | Ἥ | Greek capital letter eta with dasia and oxia | U+1F2D |
29 | Ἦ | Greek capital letter eta with psili and perispomeni | U+1F2E |
9 | Ἧ | Greek capital letter eta with dasia and perispomeni | U+1F2F |
7'250 | ἰ | Greek small letter iota with psili | U+1F30 |
5'162 | ἱ | Greek small letter iota with dasia | U+1F31 |
14 | ἲ | Greek small letter iota with psili and varia | U+1F32 |
105 | ἳ | Greek small letter iota with dasia and varia | U+1F33 |
1'722 | ἴ | Greek small letter iota with psili and oxia | U+1F34 |
529 | ἵ | Greek small letter iota with dasia and oxia | U+1F35 |
1'135 | ἶ | Greek small letter iota with psili and perispomeni | U+1F36 |
368 | ἷ | Greek small letter iota with dasia and perispomeni | U+1F37 |
2'369 | Ἰ | Greek capital letter iota with psili | U+1F38 |
670 | Ἱ | Greek capital letter iota with dasia | U+1F39 |
11 | Ἲ | Greek capital letter iota with psili and varia | U+1F3A |
486 | Ἴ | Greek capital letter iota with psili and oxia | U+1F3C |
82 | Ἵ | Greek capital letter iota with dasia and oxia | U+1F3D |
13 | Ἶ | Greek capital letter iota with psili and perispomeni | U+1F3E |
2'067 | ὀ | Greek small letter omicron with psili | U+1F40 |
4'337 | ὁ | Greek small letter omicron with dasia | U+1F41 |
23 | ὂ | Greek small letter omicron with psili and varia | U+1F42 |
379 | ὃ | Greek small letter omicron with dasia and varia | U+1F43 |
1'451 | ὄ | Greek small letter omicron with psili and oxia | U+1F44 |
1'574 | ὅ | Greek small letter omicron with dasia and oxia | U+1F45 |
1'102 | Ὀ | Greek capital letter omicron with psili | U+1F48 |
407 | Ὁ | Greek capital letter omicron with dasia | U+1F49 |
1 | Ὂ | Greek capital letter omicron with psili and varia | U+1F4A |
2 | Ὃ | Greek capital letter omicron with dasia and varia | U+1F4B |
194 | Ὄ | Greek capital letter omicron with psili and oxia | U+1F4C |
146 | Ὅ | Greek capital letter omicron with dasia and oxia | U+1F4D |
8'182 | ὐ | Greek small letter upsilon with psili | U+1F50 |
3'598 | ὑ | Greek small letter upsilon with dasia | U+1F51 |
9 | ὒ | Greek small letter upsilon with psili and varia | U+1F52 |
91 | ὓ | Greek small letter upsilon with dasia and varia | U+1F53 |
1'105 | ὔ | Greek small letter upsilon with psili and oxia | U+1F54 |
1'070 | ὕ | Greek small letter upsilon with dasia and oxia | U+1F55 |
485 | ὖ | Greek small letter upsilon with psili and perispomeni | U+1F56 |
393 | ὗ | Greek small letter upsilon with dasia and perispomeni | U+1F57 |
569 | Ὑ | Greek capital letter upsilon with dasia | U+1F59 |
2 | Ὓ | Greek capital letter upsilon with dasia and varia | U+1F5B |
148 | Ὕ | Greek capital letter upsilon with dasia and oxia | U+1F5D |
5 | Ὗ | Greek capital letter upsilon with dasia and perispomeni | U+1F5F |
376 | ὠ | Greek small letter omega with psili | U+1F60 |
877 | ὡ | Greek small letter omega with dasia | U+1F61 |
63 | ὢ | Greek small letter omega with psili and varia | U+1F62 |
26 | ὣ | Greek small letter omega with dasia and varia | U+1F63 |
98 | ὤ | Greek small letter omega with psili and oxia | U+1F64 |
422 | ὥ | Greek small letter omega with dasia and oxia | U+1F65 |
164 | ὦ | Greek small letter omega with psili and perispomeni | U+1F66 |
304 | ὧ | Greek small letter omega with dasia and perispomeni | U+1F67 |
121 | Ὠ | Greek capital letter omega with psili | U+1F68 |
20 | Ὡ | Greek capital letter omega with dasia | U+1F69 |
1 | Ὣ | Greek capital letter omega with dasia and varia | U+1F6B |
26 | Ὤ | Greek capital letter omega with psili and oxia | U+1F6C |
13 | Ὥ | Greek capital letter omega with dasia and oxia | U+1F6D |
29 | Ὦ | Greek capital letter omega with psili and perispomeni | U+1F6E |
23 | Ὧ | Greek capital letter omega with dasia and perispomeni | U+1F6F |
9'020 | ὰ | Greek small letter alpha with varia | U+1F70 |
5'826 | ὲ | Greek small letter epsilon with varia | U+1F72 |
6'385 | ὴ | Greek small letter eta with varia | U+1F74 |
20'191 | ὶ | Greek small letter iota with varia | U+1F76 |
13'695 | ὸ | Greek small letter omicron with varia | U+1F78 |
3'046 | ὺ | Greek small letter upsilon with varia | U+1F7A |
732 | ὼ | Greek small letter omega with varia | U+1F7C |
18 | ᾀ | Greek small letter alpha with psili and ypogegrammeni | U+1F80 |
1 | ᾁ | Greek small letter alpha with dasia and ypogegrammeni | U+1F81 |
48 | ᾄ | Greek small letter alpha with psili and oxia and ypogegrammeni | U+1F84 |
3 | ᾅ | Greek small letter alpha with dasia and oxia and ypogegrammeni | U+1F85 |
13 | ᾆ | Greek small letter alpha with psili and perispomeni and ypogegrammeni | U+1F86 |
2 | ᾇ | Greek small letter alpha with dasia and perispomeni and ypogegrammeni | U+1F87 |
9 | ᾐ | Greek small letter eta with psili and ypogegrammeni | U+1F90 |
7 | ᾑ | Greek small letter eta with dasia and ypogegrammeni | U+1F91 |
5 | ᾒ | Greek small letter eta with psili and varia and ypogegrammeni | U+1F92 |
1 | ᾓ | Greek small letter eta with dasia and varia and ypogegrammeni | U+1F93 |
17 | ᾔ | Greek small letter eta with psili and oxia and ypogegrammeni | U+1F94 |
3 | ᾕ | Greek small letter eta with dasia and oxia and ypogegrammeni | U+1F95 |
33 | ᾖ | Greek small letter eta with psili and perispomeni and ypogegrammeni | U+1F96 |
78 | ᾗ | Greek small letter eta with dasia and perispomeni and ypogegrammeni | U+1F97 |
59 | ᾠ | Greek small letter omega with psili and ypogegrammeni | U+1FA0 |
2 | ᾡ | Greek small letter omega with dasia and ypogegrammeni | U+1FA1 |
1 | ᾢ | Greek small letter omega with psili and varia and ypogegrammeni | U+1FA2 |
29 | ᾤ | Greek small letter omega with psili and oxia and ypogegrammeni | U+1FA4 |
1 | ᾥ | Greek small letter omega with dasia and oxia and ypogegrammeni | U+1FA5 |
8 | ᾦ | Greek small letter omega with psili and perispomeni and ypogegrammeni | U+1FA6 |
119 | ᾧ | Greek small letter omega with dasia and perispomeni and ypogegrammeni | U+1FA7 |
28 | ᾰ | Greek small letter alpha with vrachy | U+1FB0 |
131 | ᾱ | Greek small letter alpha with macron | U+1FB1 |
971 | ᾳ | Greek small letter alpha with ypogegrammeni | U+1FB3 |
77 | ᾴ | Greek small letter alpha with oxia and ypogegrammeni | U+1FB4 |
2'316 | ᾶ | Greek small letter alpha with perispomeni | U+1FB6 |
332 | ᾷ | Greek small letter alpha with perispomeni and ypogegrammeni | U+1FB7 |
21 | ᾽ | Greek koronis | U+1FBD |
49 | ᾿ | Greek psili | U+1FBF |
12 | ῂ | Greek small letter eta with varia and ypogegrammeni | U+1FC2 |
1'206 | ῃ | Greek small letter eta with ypogegrammeni | U+1FC3 |
55 | ῄ | Greek small letter eta with oxia and ypogegrammeni | U+1FC4 |
9'311 | ῆ | Greek small letter eta with perispomeni | U+1FC6 |
1'807 | ῇ | Greek small letter eta with perispomeni and ypogegrammeni | U+1FC7 |
1 | ῍ | Greek psili and varia | U+1FCD |
2 | ῎ | Greek psili and oxia | U+1FCE |
1 | ῏ | Greek psili and perispomeni | U+1FCF |
17 | ῐ | Greek small letter iota with vrachy | U+1FD0 |
41 | ῑ | Greek small letter iota with macron | U+1FD1 |
9 | ῒ | Greek small letter iota with dialytika and varia | U+1FD2 |
13'184 | ῖ | Greek small letter iota with perispomeni | U+1FD6 |
16 | ῗ | Greek small letter iota with dialytika and perispomeni | U+1FD7 |
8 | ῠ | Greek small letter upsilon with vrachy | U+1FE0 |
25 | ῡ | Greek small letter upsilon with macron | U+1FE1 |
1 | ῢ | Greek small letter upsilon with dialytika and varia | U+1FE2 |
156 | ῤ | Greek small letter rho with psili | U+1FE4 |
1'076 | ῥ | Greek small letter rho with dasia | U+1FE5 |
11'468 | ῦ | Greek small letter upsilon with perispomeni | U+1FE6 |
2 | Ὺ | Greek capital letter upsilon with varia | U+1FEA |
1'455 | Ῥ | Greek capital letter rho with dasia | U+1FEC |
10 | ῲ | Greek small letter omega with varia and ypogegrammeni | U+1FF2 |
3'565 | ῳ | Greek small letter omega with ypogegrammeni | U+1FF3 |
178 | ῴ | Greek small letter omega with oxia and ypogegrammeni | U+1FF4 |
14'081 | ῶ | Greek small letter omega with perispomeni | U+1FF6 |
2'690 | ῷ | Greek small letter omega with perispomeni and ypogegrammeni | U+1FF7 |
37 | ῾ | Greek dasia | U+1FFE |
257'417 | 159 | Greek Extended | |
Frequency | G | Block[21] General Punctuation | Hex |
• | Spaces | ||
115 |   | Thin space | U+2009 |
• | Format characters | ||
94 | | Zero width non-joiner | U+200C |
659 | | Left-to-right mark | U+200E |
• | Dashes | ||
46'119 | – | En dash | U+2013 |
1'599 | — | Em dash | U+2014 |
• | General punctuation | ||
153 | ‖ | Double vertical line | U+2016 |
• | Quotation marks and apostrophe | ||
14'608 | ‘ | Left single quotation mark | U+2018 |
11'069 | ’ | Right single quotation mark | U+2019 |
5'564 | ‚ | Single low-9 quotation mark | U+201A |
454 | “ | Left double quotation mark | U+201C |
2 | ” | Right double quotation mark | U+201D |
680 | „ | Double low-9 quotation mark | U+201E |
• | General punctuation | ||
467 | † | Dagger | U+2020 |
1 | ‡ | Double dagger | U+2021 |
548 | • | Bullet | U+2022 |
2'257 | … | Horizontal ellipsis | U+2026 |
• | Space | ||
1'763 |   | Narrow no-break space | U+202F |
• | General punctuation | ||
2 | ‰ | Per mille sign | U+2030 |
326 | ′ | Prime | U+2032 |
42 | ″ | Double prime | U+2033 |
4 | ‴ | Triple prime | U+2034 |
1 | ‶ | Reversed double prime | U+2036 |
• | Quotation marks | ||
304 | ‹ | Single left-pointing angle quotation mark | U+2039 |
304 | › | Single right-pointing angle quotation mark | U+203A |
• | General punctuation | ||
3 | ※ | Reference mark | U+203B |
1 | ‾ | Overline | U+203E |
2 | ‿ | Undertie | U+203F |
• | Archaic punctuation | ||
1 | ⁙ | Five dot punctuation | U+2059 |
4 | ⁚ | Two dot punctuation | U+205A |
7 | ⁝ | Tricolon | U+205D |
2 | ⁞ | Vertical four dots | U+205E |
87'155 | 31 | General Punctuation | |
Frequency | G | Block[22] Superscripts and Subscripts | Hex |
• | Superscripts | ||
28 | ⁰ | Superscript zero | U+2070 |
1'121 | ⁴ | Superscript four | U+2074 |
561 | ⁵ | Superscript five | U+2075 |
531 | ⁶ | Superscript six | U+2076 |
277 | ⁷ | Superscript seven | U+2077 |
119 | ⁸ | Superscript eight | U+2078 |
53 | ⁹ | Superscript nine | U+2079 |
• | Subscripts | ||
5 | ₁ | Subscript one | U+2081 |
6 | ₂ | Subscript two | U+2082 |
1 | ₆ | Subscript six | U+2086 |
1 | ₈ | Subscript eight | U+2088 |
2703 | 11 | Superscripts and Subscripts | |
Frequency | G | Block[23] Combining Diacritical Marks for Symbols | Hex |
• | Combining diacritical marks for symbols | ||
7 | ⃒ | Combining long vertical line overlay | U+20D2 |
• | Enclosing diacritics | ||
1 | ⃠ | Combining enclosing circle backslash | U+20E0 |
8 | 2 | Combining Diacritical Marks for Symbols | |
Frequency | G | Block[24] Letterlike Symbols | Hex |
• | Letterlike symbols | ||
75 | ℞ | Prescription take | U+211E |
5 | ℳ | Script capital m | U+2133 |
• | Hebrew letterlike math symbols | ||
2 | ℵ | Alef symbol | U+2135 |
82 | 3 | Letterlike Symbols | |
Frequency | G | Block[25] Number Forms | Hex |
• | Fractions | ||
1 | ⅒ | Vulgar fraction one tenth | U+2152 |
8 | ⅓ | Vulgar fraction one third | U+2153 |
8 | ⅔ | Vulgar fraction two thirds | U+2154 |
2 | ⅕ | Vulgar fraction one fifth | U+2155 |
2 | ⅘ | Vulgar fraction four fifths | U+2158 |
6 | ⅛ | Vulgar fraction one eighth | U+215B |
1 | ⅜ | Vulgar fraction three eighths | U+215C |
1 | ⅝ | Vulgar fraction five eighths | U+215D |
2 | ⅞ | Vulgar fraction seven eighths | U+215E |
• | Archaic Roman numerals | ||
2 | ↀ | Roman numeral one thousand c d | U+2180 |
2 | ↂ | Roman numeral ten thousand | U+2182 |
7 | Ↄ | Roman numeral reversed one hundred | U+2183 |
42 | 12 | Number Forms | |
Frequency | G | Block[26] Arrows | Hex |
• | Simple arrows | ||
1 | → | Rightwards arrow | U+2192 |
1 | ↓ | Downwards arrow | U+2193 |
2 | 2 | Arrows | |
Frequency | G | Block[27] Mathematical Operators | Hex |
• | Miscellaneous mathematical symbols | ||
1 | ∆ | Increment | U+2206 |
• | Operators | ||
17 | − | Minus sign | U+2212 |
28 | ∗ | Asterisk operator | U+2217 |
15 | √ | Square root | U+221A |
• | Miscellaneous mathematical symbol | ||
37 | ∞ | Infinity | U+221E |
• | Angles | ||
3 | ∟ | Right angle | U+221F |
• | Relations | ||
52 | ∣ | Divides | U+2223 |
112 | ∥ | Parallel to | U+2225 |
• | Logical and set operators | ||
1 | ∩ | Intersection | U+2229 |
• | Integrals | ||
2 | ∫ | Integral | U+222B |
• | Relations | ||
44 | ∼ | Tilde operator | U+223C |
1 | ≠ | Not equal to | U+2260 |
1 | ≡ | Identical to | U+2261 |
1 | ≶ | Less-than or greater-than | U+2276 |
1 | ⊏ | Square image of | U+228F |
• | Operators | ||
1 | ⊙ | Circled dot operator | U+2299 |
• | Relations | ||
9 | ⊦ | Assertion | U+22A6 |
• | Miscellaneous mathematical symbols | ||
3 | ⊿ | Right triangle | U+22BF |
• | Matrix ellipses | ||
1 | ⋮ | Vertical ellipsis | U+22EE |
330 | 19 | Mathematical Operators | |
Frequency | G | Block[28] Miscellaneous Technical | Hex |
• | Miscellaneous technical | ||
1 | ⌒ | Arc | U+2312 |
• | Bracket pieces | ||
1 | ⎥ | Right square bracket extension | U+23A5 |
• | Terminal graphic characters | ||
1 | ⎷ | Radical symbol bottom | U+23B7 |
• | Dentistry notation symbols | ||
1 | ⎿ | Dentistry symbol light vertical and bottom right | U+23BF |
1 | ⏉ | Dentistry symbol light down and horizontal | U+23C9 |
• | Metrical symbols | ||
11 | ⏑ | Metrical breve | U+23D1 |
1 | ⏓ | Metrical short over long | U+23D3 |
2 | ⏕ | Metrical two shorts over long | U+23D5 |
1 | ⏖ | Metrical two shorts joined | U+23D6 |
20 | 9 | Miscellaneous Technical | |
Frequency | G | Block[29] Box Drawing | Hex |
• | Light and heavy solid lines | ||
2 | ━ | Box drawings heavy horizontal | U+2501 |
49 | │ | Box drawings light vertical | U+2502 |
39 | ┃ | Box drawings heavy vertical | U+2503 |
• | Light and double line box components | ||
2 | ╚ | Box drawings double up and right | U+255A |
• | Light and heavy half lines | ||
1 | ╵ | Box drawings light up | U+2575 |
2 | ╷ | Box drawings light down | U+2577 |
95 | 6 | Box Drawing | |
Frequency | G | Block[30] Geometric Shapes | Hex |
• | Geometric shapes | ||
522 | ■ | Black square | U+25A0 |
25 | □ | White square | U+25A1 |
1 | ▭ | White rectangle | U+25AD |
1 | ▯ | White vertical rectangle | U+25AF |
168 | ◡ | Lower half circle | U+25E1 |
1 | ◺ | Lower left triangle | U+25FA |
8 | ◻ | White medium square | U+25FB |
1 | ◿ | Lower right triangle | U+25FF |
727 | 8 | Geometric Shapes | |
Frequency | G | Block[31] Miscellaneous Symbols | Hex |
• | Miscellaneous symbols | ||
31 | ☐ | Ballot box | U+2610 |
• | Astrological symbols | ||
1 | ♃ | Jupiter | U+2643 |
• | Dice | ||
1 | ⚀ | Die face-1 | U+2680 |
• | Symbols for closed captioning from ARIB STD B24 | ||
1 | ⚞ | Three lines converging right | U+269E |
1 | ⚟ | Three lines converging left | U+269F |
• | Genealogical symbols | ||
3 | ⚭ | Marriage symbol | U+26AD |
38 | 6 | Miscellaneous Symbols | |
Frequency | G | Block[32] Dingbats | Hex |
• | Miscellaneous | ||
128 | ✕ | Multiplication x | U+2715 |
• | Crosses | ||
3 | ✝ | Latin cross | U+271D |
• | Stars, asterisks and snowflakes | ||
1 | ❊ | Eight teardrop-spoked propeller asterisk | U+274A |
• | Ornamental brackets | ||
2 | ❰ | Heavy left-pointing angle bracket ornament | U+2770 |
134 | 4 | Dingbats | |
Frequency | G | Block[33] Miscellaneous Mathematical Symbols-A | Hex |
• | Operators | ||
1 | ⟓ | Lower right corner with dot | U+27D3 |
• | Mathematical brackets | ||
562 | ⟨ | Mathematical left angle bracket | U+27E8 |
564 | ⟩ | Mathematical right angle bracket | U+27E9 |
1127 | 3 | Miscellaneous Mathematical Symbols-A | |
Frequency | G | Block[34] Miscellaneous Mathematical Symbols-B | Hex |
• | Square symbols | ||
1 | ⧄ | Squared rising diagonal slash | U+29C4 |
• | Miscellaneous mathematical symbols | ||
1 | ⧠ | Square with contoured outline | U+29E0 |
2 | 2 | Miscellaneous Mathematical Symbols-B | |
Frequency | G | Block[35] Supplemental Mathematical Operators | Hex |
• | Relational operators | ||
1 | ⪌ | Greater-than above double-line equal above less-than | U+2A8C |
• | Forks | ||
1 | ⫙ | Element of opening downwards | U+2AD9 |
2 | 2 | Supplemental Mathematical Operators | |
Frequency | G | Block[36] Coptic | Hex |
• | Bohairic Coptic letters | ||
1 | ⲁ | Coptic small letter alfa | U+2C81 |
1 | ⲅ | Coptic small letter gamma | U+2C85 |
1 | ⲉ | Coptic small letter eie | U+2C89 |
2 | ⲓ | Coptic small letter iauda | U+2C93 |
1 | ⲕ | Coptic small letter kapa | U+2C95 |
4 | ⲟ | Coptic small letter o | U+2C9F |
2 | ⲡ | Coptic small letter pi | U+2CA1 |
2 | ⲥ | Coptic small letter sima | U+2CA5 |
2 | ⲧ | Coptic small letter tau | U+2CA7 |
2 | ⲩ | Coptic small letter ua | U+2CA9 |
• | Old Coptic and dialect letters | ||
1 | ⳓ | Coptic small letter old coptic hei | U+2CD3 |
19 | 11 | Coptic | |
Frequency | G | Block[37] Tifinagh | Hex |
• | Letters | ||
1 | ⴼ | Tifinagh letter yaf | U+2D3C |
1 | ⵓ | Tifinagh letter yu | U+2D53 |
2 | 2 | Tifinagh | |
Frequency | G | Block[38] CJK Symbols and Punctuation | Hex |
• | CJK angle brackets | ||
44 | 〈 | Left angle bracket | U+3008 |
44 | 〉 | Right angle bracket | U+3009 |
88 | 2 | CJK Symbols and Punctuation | |
Frequency | G | Block[39] Alphabetic Presentation Forms | Hex |
• | Latin ligatures | ||
533 | ﬁ | Latin small ligature fi | U+FB01 |
298 | ﬂ | Latin small ligature fl | U+FB02 |
831 | 2 | Alphabetic Presentation Forms | |
Frequency | G | Block[40] Halfwidth and Fullwidth Forms | Hex |
• | Fullwidth ASCII variants | ||
3 | ｜ | Fullwidth vertical line | U+FF5C |
Frequency | G | Block[41] Osage | Hex |
• | Uppercase letters | ||
1 | 𐓄 | Osage capital letter pa | U+104C4 |
Frequency | G | Block[42] Linear A | Hex |
• | Simple signs | ||
1 | 𐙙 | Linear a sign a305 | U+10659 |
Frequency | G | Block[43] Phoenician | Hex |
• | Letters | ||
2 | 𐤋 | Phoenician letter lamd | U+1090B |
Frequency | G | Block[44] Lydian | Hex |
1 | 𐤭 | Lydian letter r | U+1092D |
Frequency | G | Block[45] Old Turkic | Hex |
• | Consonants | ||
1 | 𐰩 | Old turkic letter yenisei enc | U+10C29 |
Frequency | G | Block[46] Ancient Greek Musical Notation | Hex |
• | Ancient Greek vocalic notation | ||
2 | 𝈂 | Greek vocal notation symbol-3 | U+1D202 |
Frequency | G | Block[47] Alchemical Symbols | Hex |
• | Symbols for other substances | ||
1 | 🜷 | Alchemical symbol for alkali-2 | U+1F737 |
Pauly's RE is digitised as 70'876'673 glyphs, made up of 810 distinct glyphs in 47 blocks, plus 13'469'874 whitespace characters (SPACE word boundary, TAB tabulator, NL newline),
of which 122'534 are paragraph breaks (NL).
|
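A tally of this kind can be approximated with the Python standard library. The following is a minimal sketch, not the procedure actually used for this page: the `BLOCKS` list is a hand-picked, hypothetical subset of block ranges (Python's `unicodedata` module does not expose block names), and `block_of`, `tally` and `format_row` are illustrative helper names. Whitespace (SPACE, TAB, NL) is counted separately from glyphs, as in the summary above.

```python
from collections import Counter
import unicodedata

# Assumption: a small hand-picked subset of Unicode block ranges (inclusive).
# A full run would need all blocks, e.g. taken from the Unicode Blocks.txt file.
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x1E00, 0x1EFF, "Latin Extended Additional"),
    (0x1F00, 0x1FFF, "Greek Extended"),
    (0x2000, 0x206F, "General Punctuation"),
]

WHITESPACE = {" ", "\t", "\n"}  # counted separately, not as glyphs


def block_of(ch):
    """Return the block name for a character, or 'Other' if not listed."""
    cp = ord(ch)
    for start, end, name in BLOCKS:
        if start <= cp <= end:
            return name
    return "Other"


def tally(text):
    """Count glyphs, whitespace, and (total, distinct) glyphs per block."""
    glyphs = Counter(ch for ch in text if ch not in WHITESPACE)
    spaces = sum(1 for ch in text if ch in WHITESPACE)
    per_block = {}
    for ch, n in glyphs.items():
        name = block_of(ch)
        total, distinct = per_block.get(name, (0, 0))
        per_block[name] = (total + n, distinct + 1)
    return glyphs, spaces, per_block


def format_row(ch, n):
    """Format one table row in the style 'count | glyph | name | U+XXXX |'."""
    count = f"{n:,}".replace(",", "'")  # apostrophe as thousands separator
    name = unicodedata.name(ch, "(unnamed)").capitalize()
    return f"{count} | {ch} | {name} | U+{ord(ch):04X} |"


# Sample text written with escapes so the code points are unambiguous.
sample = "\u1f00\u03c1\u03c7\u1f74 \u2013 \u1f00\u03c1\u03b5\u03c4\u03ae\n"
glyphs, spaces, per_block = tally(sample)

# Glyphs plus whitespace must add up to the total number of characters.
assert sum(glyphs.values()) + spaces == len(sample)

for ch, n in sorted(glyphs.items()):
    print(format_row(ch, n))
```

The same identity holds for the figures on this page: 70'876'673 glyphs plus 13'469'874 whitespace characters give the 84'346'547 characters of the source file.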
3. Notes
- ASCII character set, American Standard Code for Information Interchange. The Basic Latin (or C0 Controls and Basic Latin) Unicode block is the first block of the Unicode standard, and the only block which is encoded in one byte in UTF-8. The block contains all the letters and control codes of the ASCII encoding.
The Basic Latin block was included in its present form from version 1.0.0 of the Unicode Standard, without addition or alteration of the character repertoire.
The classical Latin alphabet, also known as the Roman alphabet, is a writing system that evolved from the visually similar Cumaean Greek version of the Greek alphabet. The Greek alphabet, including the Cumaean version, descended from the Phoenician abjad. The Etruscans who ruled early Rome adopted and modified the Cumaean Greek alphabet. The Etruscan alphabet was in turn adopted and further modified by the ancient Romans to write the Latin language.
During the Middle Ages scribes adapted the Latin alphabet for writing Romance languages, direct descendants of Latin, as well as Celtic, Germanic, Baltic, and some Slavic languages. With the age of colonialism and Christian evangelism, the Latin script spread beyond Europe, coming into use for writing indigenous American, Australian, Austronesian, Austroasiatic, and African languages. More recently, linguists have also tended to prefer the Latin script or the International Phonetic Alphabet (itself largely based on Latin script) when transcribing or creating written standards for non-European languages, such as the African reference alphabet.
The term Latin alphabet may refer to either the alphabet used to write Latin (as described in this article), or other alphabets based on the Latin script, which is the basic set of letters common to the various alphabets descended from the classical Latin one, such as the English alphabet. These Latin alphabets may discard letters, like the Rotokas alphabet, or add new letters, like the Danish and Norwegian alphabets. Letter shapes have evolved over the centuries, including the creation for Medieval Latin of lower-case forms which did not exist in the Classical period.
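The statement in this note that Basic Latin is the only block encoded in one byte in UTF-8 can be checked directly, and it also explains why the multibyte file is larger in bytes than in characters. The sketch below uses only the built-in `str.encode`; `utf8_len` is a hypothetical helper name, and the sample characters are taken from the tables on this page.

```python
def utf8_len(ch):
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


# One representative character from several blocks listed on this page.
samples = [
    "A",           # U+0041  Basic Latin
    "\u00df",      # U+00DF  Latin-1 Supplement (sharp s)
    "\u1f00",      # U+1F00  Greek Extended (alpha with psili)
    "\u2013",      # U+2013  General Punctuation (en dash)
    "\U00010659",  # U+10659 Linear A
]

for ch in samples:
    print(f"U+{ord(ch):04X}: {utf8_len(ch)} byte(s)")

# Every code point in Basic Latin (U+0000..U+007F) takes exactly one byte;
# the first code point of the next block already takes two.
assert all(utf8_len(chr(cp)) == 1 for cp in range(0x80))
assert utf8_len(chr(0x80)) == 2
```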
- Latin-1 Supplement (128 codes from 0080–00FF, symbl.cc)
The Latin-1 Supplement (also called C1 Controls and Latin-1 Supplement) is the second block in the Unicode Standard. It encodes the upper range of ISO 8859-1, 80 (U+0080) to FF (U+00FF). The C1 controls (0080–009F) are not graphic characters.
The C1 Controls and Latin-1 Supplement block has been included in its present form, with the same character repertoire, since version 1.0 of the Unicode Standard, where it was known as Latin 1.
- Latin Extended-A (128 codes from 0100–017F, Alphabet, Language: Celtic, Sami, Maltese, Turkish, symbl.cc)
Latin Extended-A is a block of the Unicode Standard.
It encodes Latin letters from the Latin ISO character sets other than Latin-1 (which is already encoded in the Latin-1 Supplement block) and also legacy characters from the ISO 6937 standard.
The Latin Extended-A block has been in the Unicode Standard since version 1.0, with its entire character repertoire, except for the Latin Small Letter Long S, which was added during unification with ISO 10646 in version 1.1.
- Latin Extended-B (208 codes from 0180–024F, Alphabet, Language: Slovenian, Croatian, symbl.cc)
Latin Extended-B is a block (0180-024F) of the Unicode Standard. It has been included since version 1.0, where it was only allocated to the code points U+0180..U+01FF and contained 113 characters. During unification with ISO 10646 for version 1.1, the block was expanded, and another 65 characters were added. In version 3.0, the last thirty available code points in the block were assigned.
The Latin Extended-B block contains ten subheadings for groups of characters: Non-European and historic Latin, African letters for clicks, Croatian digraphs matching Serbian Cyrillic letters, Pinyin diacritic-vowel combinations, Phonetic and historic letters, Additions for Slovenian and Croatian, Additions for Romanian, Miscellaneous additions, Additions for Livonian, and Additions for Sinology. The Non-European and historic, African clicks, Croatian digraphs, Pinyin, and the first part of the Phonetic and historic letters were present in Unicode 1.0; additional Phonetic and historic letters were added for version 3.0; and other Phonetic and historic, as well as the rest of the sub-blocks were the characters added for version 1.1.
- IPA Extensions (96 codes from 0250–02AF, Alphabet, symbl.cc)
IPA Extensions is a block (0250–02AF) of the Unicode standard that contains full size letters used in the International Phonetic Alphabet (IPA). Both modern and historical characters are included, as well as former IPA signs and non-IPA phonetic letters. Additional characters employed for phonetics, like the palatalization sign, are encoded in the blocks Phonetic Extensions (1D00–1D7F) and Phonetic Extensions Supplement (1D80–1DBF). Diacritics are found in the Spacing Modifier Letters (02B0–02FF) and Combining Diacritical Marks (0300–036F) blocks.
With the IPA's ability to use Unicode for the presentation of phonetic symbols, ASCII-based systems such as X-SAMPA or Kirshenbaum are being supplanted. Within the Unicode blocks there are also a few former IPA characters no longer in international use by linguists.
The IPA Extensions block has been present in Unicode since version 1.0, and was unchanged through the unification with ISO 10646. The block was filled out with extensions for representing disordered speech in version 3.0, and Sinology phonetic symbols in version 4.0.
The International Phonetic Alphabet (unofficially—though commonly—abbreviated IPA) is an alphabetic system of phonetic notation based primarily on the Latin alphabet. It was devised by the International Phonetic Association as a standardized representation of the sounds in oral language.
Who needs the IPA? A lot of people, in fact. The IPA is used by lexicographers, foreign language students and teachers, linguists, speech-language pathologists, singers, actors, constructed language creators, and translators.
The IPA is designed to represent only those qualities of speech that are part of oral language: phones, phonemes, intonation, and the separation of words and syllables. To represent additional qualities of speech, such as tooth gnashing, lisping, and sounds made with a cleft palate, an extended set of symbols called the Extensions to the IPA may be used.
IPA symbols consist of one or more elements of two basic types, letters and diacritics. For example, the sound of the English letter ‘t’ may be transcribed in IPA with a single letter, [t], or with a letter plus diacritics, [t̺ʰ], depending on how precisely its features need to be described in context. Slashes are often used to signal broad or phonemic transcription; thus, /t/ is less specific and could refer to either [t̺ʰ] or [t], depending on the context and language.
Letters or diacritics may be added, removed, or modified by the International Phonetic Association. As of the most recent change in 2005, there are 107 letters, 52 diacritics, and four prosodic marks in the IPA. These are shown in the current IPA chart on the website of the IPA.
Let's have some fun! You can take letters from this block and flip your text to entertain yourself and your friends.
- Spacing Modifier Letters (80 codes from 02B0–02FF, Alphabet, symbl.cc)
Spacing Modifier Letters is a Unicode block containing characters for the IPA (International Phonetic Alphabet), UPA (Uralic Phonetic Alphabet or Finno-Ugric transcription system), and other phonetic transcriptions. Included are the IPA tone marks, and modifiers for aspiration and palatalization.
- Combining Diacritical Marks (112 codes from 0300–036F, Alphabet, symbl.cc)
Combining Diacritical Marks is a Unicode block containing the most common combining characters. It also contains the Combining Grapheme Joiner, which prevents canonical reordering of combining characters, and despite the name, actually separates characters that would otherwise be considered a single grapheme in a given context.
A diacritic /daɪ.əˈkrɪtɨk/ – also diacritical mark, diacritical point, or diacritical sign – is a glyph added to a letter, or basic glyph. The term derives from the Greek διακριτικός (diakritikós, “distinguishing”, from ancient Greek διά (diá, through) and κρίνω (krínō, to separate)). Diacritic is primarily an adjective, though sometimes used as a noun, whereas diacritical is only ever an adjective. Some diacritical marks, such as the acute (´) and grave (`), are often called accents. Diacritical marks may appear above or below a letter, or in some other position such as within the letter or between two letters.
The main use of diacritical marks in the Latin script is to change the sound-values of the letters to which they are added. Examples from English are the diaereses in naïve and Noël, which show that the vowel with the diaeresis mark is pronounced separately from the preceding vowel; the acute and grave accents, which can indicate that a final vowel is to be pronounced, as in saké and poetic breathèd; and the cedilla under the “c” in the borrowed French word façade, which shows it is pronounced /s/ rather than /k/. In other Latin alphabets, they may distinguish between homonyms, such as the French là (“there”) versus la (“the”), which are both pronounced [la]. In Gaelic type, a dot over a consonant indicates lenition of the consonant in question.
In other alphabetic systems, diacritical marks may perform other functions. Vowel pointing systems, namely the Arabic harakat ( ـَ, ـُ, ـُ, etc.) and the Hebrew niqqud ( ַ, ֶ, ִ, ֹ , ֻ, etc.) systems, indicate sounds (vowels and tones) that are not conveyed by the basic alphabet. The Indic virama ( ् etc.) and the Arabic sukūn ( ـْـ ) mark the absence of a vowel. Cantillation marks indicate prosody. Other uses include the Early Cyrillic titlo ( ◌҃ ) and the Hebrew gershayim ( ״ ), which, respectively, mark abbreviations or acronyms, and Greek diacritical marks, which showed that letters of the alphabet were being used as numerals. In the Hanyu Pinyin official romanization system for Chinese, diacritics are used to mark the tones of the syllables in which the marked vowels occur.
In orthography and collation, a letter modified by a diacritic may be treated either as a new, distinct letter or as a letter–diacritic combination. This varies from language to language, and may vary from case to case within a language.
In some cases, letters are used as “in-line diacritics” in place of ancillary glyphs, because they modify the sound of the letter preceding them, as in the case of the “h” in English “sh” and “th”.
- ↑ a b
Greek and Coptic
(144 codes from 0370–03FF, Alphabet, Language: Greek, Coptic,
symbl.cc)
Greek and Coptic is the Unicode block for representing modern (monotonic) Greek. It was originally also used for writing Coptic, using the similar Greek letters in addition to the uniquely Coptic additions. Beginning with version 4.1 of the Unicode Standard, a separate Coptic block has been included in Unicode, allowing for mixed Greek/Coptic text that is stylistically contrastive, as is the convention in scholarly works. Writing polytonic Greek requires the use of combining characters or the precomposed vowel + tone characters in the Greek Extended character block.
The Greek alphabet is the script that has been used to write the Greek language since the 8th century BC. It was derived from the earlier Phoenician alphabet, and was the first alphabetic script to have distinct letters for vowels as well as consonants. As such, it became the ancestor of numerous other European and Middle Eastern alphabets, including Latin and Cyrillic. Apart from its use in writing the Greek language, both in its ancient and its modern forms, the Greek alphabet today also serves as a source of technical symbols and labels in many domains of mathematics, science and other fields.
In its classical and modern forms, the alphabet has 24 letters, ordered from alpha to omega. Like Latin and Cyrillic (0400–04FF), Greek originally had only a single form of each letter; it developed the letter case distinction between upper-case and lower-case forms in parallel with Latin during the modern era.
Sound values and conventional transcriptions for some of the letters differ between Ancient Greek and Modern Greek usage, owing to phonological changes in the language.
In traditional (“polytonic”) Greek orthography, vowel letters can be combined with several diacritics, including accent marks, so-called “breathing” marks, and the iota subscript. In common present-day usage for Modern Greek since the 1980s, this system has been simplified to a so-called “monotonic” convention.
The Coptic alphabet is the script used for writing the Coptic language. The repertoire of glyphs is based on the Greek alphabet augmented by letters borrowed from the Egyptian Demotic and is the first alphabetic script used for the Egyptian language. There are several alphabets, as the Coptic writing system may vary greatly among the various dialects and subdialects of the Coptic language.
- ↑ a b
Cyrillic
(256 codes from 0400–04FF, Alphabet, Language: Russian, Ukrainian, Bulgarian,
symbl.cc)
Cyrillic is a Unicode block containing the characters used to write the widely used languages with a Cyrillic orthography. The core of the block is based on the ISO 8859-5 standard, with additions for minority languages and historic orthographies.
The Cyrillic script /sɨˈrɪlɪk/ is an alphabetic writing system employed across Eastern Europe, North and Central Asian countries. It is based on the Early Cyrillic, which was developed in the First Bulgarian Empire during the 9th century AD at the Preslav Literary School. It is the basis of alphabets used in various languages, past and present, in parts of Southeastern Europe and Northern Eurasia, especially those of Slavic origin, and non-Slavic languages influenced by Russian. As of 2011, around 252 million people in Eurasia use it as the official alphabet for their national languages. About half of them are in Russia. Thus, Cyrillic is one of the most used writing systems in the world.
Cyrillic is derived from the Greek uncial script, augmented by letters from the older Glagolitic alphabet, including some ligatures. These additional letters were used for sounds not found in Greek. The script is named in honor of the two Byzantine brothers, Saints Cyril and Methodius, who created the Glagolitic alphabet earlier on. Modern scholars believe that Cyrillic was developed and formalized by early disciples of Cyril and Methodius.
With the accession of Bulgaria to the European Union on 1 January 2007, Cyrillic became the third official script of the European Union, following the Latin and Greek scripts.
- ↑ a b
Armenian
(96 codes from 0530–058F, Alphabet, Language: Armenian,
symbl.cc)
Armenian is a Unicode block containing characters for writing the Armenian language, both the traditional Western Armenian and reformed Eastern Armenian orthographies. Five Armenian ligatures are encoded in the Alphabetic Presentation Forms block.
The Armenian language (classical: հայերէն; reformed: հայերեն hayeren) is an Indo-European language spoken by the Armenians. It is the official language of the Republic of Armenia and the self-proclaimed Nagorno-Karabakh Republic. It has historically been spoken throughout the Armenian Highlands and today is widely spoken in the Armenian diaspora.
Armenian has its own unique script, the Armenian alphabet, invented in 405 AD by Mesrop Mashtots.
Scholars classify Armenian as an independent branch of the Indo-European language family. The area that linguists are especially interested in is the distinctive phonological developments within the Indo-European languages. Armenian shares a number of major innovations with Greek, and some linguists group these two languages with Phrygian and the Indo-Iranian family into a higher-level subgroup of Indo-European, which is defined by such shared changes as the augment. Recently other scholars have proposed a Balkan grouping including Greek, Phrygian, Armenian, and Albanian.
Armenia was a monolingual country by the second century BC. Its language has a long literary history, with a fifth-century Bible translation as its oldest surviving text.
There are two standardized modern literary forms, Eastern Armenian and Western Armenian, with which most contemporary dialects are mutually intelligible.
- ↑ a b
Hebrew
(112 codes from 0590–05FF, Alphabet, Language: Hebrew, Yiddish,
symbl.cc)
Hebrew is a Unicode block containing characters for writing the Hebrew, Yiddish, Ladino, and other Jewish diaspora languages.
Hebrew is a West Semitic language of the Afroasiatic language family. Historically, it is regarded as the language of the Hebrew Israelites and their ancestors, although the language was not referred to by the name Hebrew in the Tanakh.
The earliest examples of written Paleo-Hebrew date from the 10th century BC; those were primitive drawings. Since the language used in that inscription remained unknown, it was impossible to prove whether it was in fact Hebrew or another local language.
Hebrew had ceased to be an everyday spoken language somewhere between 200 and 400 CE, having declined since the Bar Kochba War. Aramaic and, to a lesser extent, Greek were already in use as international languages, especially among elites and immigrants. Hebrew survived into the medieval period as the language of Jewish liturgy, rabbinic literature, intra-Jewish commerce, and poetry. Then, in the 19th century, it was revived as a spoken and literary language. According to Ethnologue, it is now spoken by 9 million people worldwide, including 7 million in Israel. The United States has the second-largest Hebrew-speaking population, with about 221,593 fluent speakers, mostly from Israel.
Modern Hebrew is one of the two official languages of Israel (the other is Arabic). As for pre-modern Hebrew, it is used for prayers or studies in Jewish communities all around the world today. Ancient Hebrew is also the liturgical language of the Samaritans, while modern Hebrew or Arabic are their vernacular. As a foreign language, it is studied mostly by Jews and students of Judaism and Israel, and by archaeologists and linguists specializing in the Middle East and its civilizations, as well as by theologians in Christian seminaries.
The Torah (the first five books), and most of the rest of the Hebrew Bible, are written in Biblical Hebrew. Much of its present form is written in the dialect that scholars believe flourished around the 6th century BC, around the time of the Babylonian exile. For this reason, Hebrew has been referred to by Jews as Leshon HaKodesh (לשון הקדש), “The Holy Language”, since ancient times.
- ↑ a b
Arabic
(256 codes from 0600–06FF, Alphabet, Language: Arabic, Persian, Kurdish,
symbl.cc)
Arabic is a Unicode block, containing the standard letters and the most common diacritics of the Arabic script, and the Arabic-Indic digits.
The Arabic script is a writing system used for writing several languages of Asia and Africa, such as Arabic, the Sorani and Luri dialects of Kurdish, Persian, Pashto, and Urdu. Even until the 16th century, it was used to write some texts in Spanish. After the Latin script, Chinese characters, and Devanagari, it is the fourth-most widely used writing system in the world. The Arabic script is written from right to left in a cursive style. In most cases the letters transcribe consonants, or consonants and a few vowels, so most Arabic alphabets are abjads. The script was first used to write texts in Arabic, most notably the Qurʼān, the holy book of Islam. With the spread of Islam, it came to be used to write languages of many language families, leading to the addition of new letters and other symbols, with some versions, such as Kurdish, Uyghur, and old Bosnian being abugidas or true alphabets. It is also the basis for a rich tradition of Arabic calligraphy.
- ↑ a b
Devanagari
(128 codes from 0900–097F, Abugida, Language: Sanskrit, Hindi,
symbl.cc)
Devanagari is a Unicode block containing characters for writing Hindi, Marathi, Sindhi, Nepali and Sanskrit. In its original incarnation, the code points U+0900..U+0954 were a direct copy of the characters A0-F4 from the 1988 ISCII standard. The Bengali (0980–09FF), Gurmukhi (0A00–0A7F), Gujarati (0A80–0AFF), Oriya (0B00–0B7F), Tamil (0B80–0BFF), Telugu (0C00–0C7F), Kannada (0C80–0CFF), and Malayalam (0D00–0D7F) blocks were similarly all based on their ISCII encodings.
Devanagari, also called Nagari, is an abugida of India and Nepal. It is written from left to right, does not have distinct letter cases, and is recognisable (along with most other North Indic scripts, with a few exceptions like Gujarati (0A80–0AFF) and Oriya (0B00–0B7F)) by a horizontal line that runs along the top of full letters. Since the 19th century, it has been the most commonly used script for writing Sanskrit. Devanagari is used to write Hindi, Nepali, Marathi, Konkani, Bodo and Maithili among other languages and dialects. It was formerly used to write Gujarati. Because it is the standardised script for the Hindi, Nepali, Marathi, Konkani and Bodo languages, Devanagari is one of the most used and adopted writing systems in the world.
- ↑ a b
Cherokee
(96 codes from 13A0–13FF, Syllabary, Language: Cherokee,
symbl.cc)
The Cherokee script is a syllabic script invented for the Cherokee language in 1819 by Sequoyah (also known as George Guess or George Gist). His creation of the syllabary is particularly noteworthy because he could not read any script. He first experimented with logograms, but his system later developed into a syllabary.
The descendants of Sequoyah claim that the script was invented long before Sequoyah was born, and that he was merely the last member of a special clan that guarded it, but there is no confirmation or evidence of this.
A year later, in 1820, thousands of Cherokee had learned to read and write in this script. By 1830, 90% of the Cherokee were literate in it.
The Cherokee script was used for more than a hundred years. It was published in books, religious texts, almanacs and newspapers (in particular, the Cherokee Phoenix newspaper).
Today this script still exists and plays a very important role in the life of the Cherokee. For example, you need to speak and write Cherokee to get the status of a full member of the tribe. In addition, the authorities are trying to revive and popularize both the writing and the Cherokee language.
The writing system consists of 85 syllabic signs. Some of them resemble Latin letters, but have a completely different value (for example, the sign for /a/ resembles D).
Not all phonemic oppositions are marked in writing. For example, /g/ and /k/ are distinguished only in syllables with /a/. The script also has no marks for vowel length or for tonal differences, and there is no accepted way to express consonant combinations.
In this system, each symbol represents a syllable rather than a single phoneme. Some symbols resemble letters of the Latin, Greek and even Cyrillic scripts, but the sounds are completely different (for example, the sound /a/ is written with a sign that resembles the Latin letter D).
- ↑ a b
Unified Canadian Aboriginal Syllabics
(640 codes from 1400–167F, Abugida, Language: Cree,
symbl.cc)
Unified Canadian Aboriginal Syllabics is a Unicode block containing characters for writing Inuktitut, Carrier, several dialects of Cree, and Canadian Athabascan languages. Additions for some Cree dialects and for Ojibwe are in the Unified Canadian Aboriginal Syllabics Extended block.
Canadian Aboriginal syllabic writing, or simply syllabics, is a family of abugidas (consonant-based alphabets) used to write a number of Aboriginal Canadian languages of the Algonquian, Inuit, and (formerly) Athabaskan language families. They are valued for their distinctiveness from the Latin script of the dominant languages and for the ease with which literacy can be achieved. In fact, by the late 19th century the Cree had achieved one of the highest rates of literacy in the world.
Canadian syllabics are currently used to write all of the Cree languages from Naskapi (spoken in Quebec) to the Rocky Mountains, including Eastern Cree, Woods Cree, Swampy Cree and Plains Cree. They are also used to write Inuktitut in the eastern Canadian Arctic, where they are co-official with the Latin script in the territory of Nunavut.
Apart from that, the script is used regionally for the other large Canadian Algonquian language, Ojibwe in Western Canada, as well as for Blackfoot, where it is considered obsolete. Among the Athabaskan languages further to the west, the syllabics have been used to write Dakelh (Carrier), Chipewyan, Slavey, Tłı̨chǫ (Dogrib) and Dane-zaa (Beaver).
In the United States, this kind of writing is found in communities that straddle the border, but it is mostly a Canadian phenomenon.
- ↑ a b
Runic
(96 codes from 16A0–16FF, Alphabet, Language: Old Italic, Runic,
symbl.cc)
Runic is a Unicode block containing characters for writing Futhark runic inscriptions. Although many of the characters appear similar, they should not be confused with the J.R.R. Tolkien-designed Cirth, which has a separate ConScript Unicode Registry encoding. However, in Unicode 7.0 some additional Runic characters were added, including three Runic characters that were used only by Tolkien, for example in the maps of The Hobbit: these are different from Cirth.
Runes (Proto-Norse: ᚱᚢᚾᛟ (runo), Old Norse: rún) are the letters in a set of related alphabets known as runic alphabets, which were used to write various Germanic languages before the adoption of the Latin alphabet and for specialised purposes thereafter. The Scandinavian variants are also known as futhark or fuþark (derived from their first six letters of the alphabet: F, U, Þ, A, R, and K); the Anglo-Saxon variant is futhorc or fuþorc (due to sound changes undergone in Old English by the names of those six letters).
Runology is the study of the runic alphabets, runic inscriptions, runestones, and their history. Runology forms a specialised branch of Germanic linguistics.
The earliest runic inscriptions date from around 150 AD. The characters were generally replaced by the Latin alphabet as the cultures that had used runes underwent Christianisation, by approximately 700 AD in central Europe and 1100 AD in northern Europe. However, the use of runes persisted for specialized purposes in northern Europe. Until the early 20th century, runes were used in rural Sweden for decorative purposes in Dalarna and on Runic calendars.
The three best-known runic alphabets are the Elder Futhark (around 150–800 AD), the Anglo-Saxon Futhorc (400–1100 AD), and the Younger Futhark (800–1100 AD). The Younger Futhark is divided further into the long-branch runes (also called Danish, although they were also used in Norway and Sweden); short-branch or Rök runes (also called Swedish-Norwegian, although they were also used in Denmark); and the stavlösa or Hälsinge runes (staveless runes). The Younger Futhark developed further into the Marcomannic runes, the Medieval runes (1100–1500 AD), and the Dalecarlian runes (around 1500–1800 AD).
Historically, the runic alphabet is a derivation of the Old Italic alphabets of antiquity, with the addition of some innovations. Which variant of the Old Italic family in particular gave rise to the runes is uncertain. Suggestions include Raetic, Etruscan, or Old Latin as candidates. At the time, all of these scripts had the same angular letter shapes suited for epigraphy, which would become characteristic of the runes.
The process of transmission of the script is unknown. The oldest inscriptions are found in Denmark and northern Germany, not near Italy. A “West Germanic hypothesis” suggests transmission via Elbe Germanic groups, while a “Gothic hypothesis” presumes transmission via East Germanic expansion.
- ↑ a b
Phonetic Extensions
(128 codes from 1D00–1D7F,
symbl.cc)
Phonetic Extensions is a Unicode block containing phonetic characters used in the Uralic Phonetic Alphabet, Old Irish phonetic notation, the Oxford English dictionary and American dictionaries, and Americanist and Russianist phonetic notations. Its character set is continued in the following Unicode block, Phonetic Extensions Supplement.
- ↑ a b
Phonetic Extensions Supplement
(64 codes from 1D80–1DBF,
symbl.cc)
Phonetic Extensions Supplement is a Unicode block containing characters for specialized and deprecated forms of the International Phonetic Alphabet.
- ↑ a b
Latin Extended Additional
(256 codes from 1E00–1EFF,
symbl.cc)
Latin Extended Additional is a block of the Unicode standard.
The characters in this block are mostly precomposed combinations of Latin letters with one or more general diacritical marks. There are also a few Medievalist characters.
- ↑ a b
Greek Extended
(256 codes from 1F00–1FFF,
symbl.cc)
Greek Extended is a Unicode block containing the accented vowels necessary for writing polytonic Greek. The regular, unaccented Greek characters can be found in the Greek and Coptic block. Greek Extended was encoded in version 1.1 of the Unicode Standard as is, having had no additions up to 6.2. As an alternative to Greek Extended, combining characters can be used to represent the accents and breathing marks of polytonic Greek.
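The two alternatives mentioned here, a precomposed Greek Extended character versus a base letter plus combining characters, can be compared with Unicode normalization. The following is only a minimal sketch using Python's standard `unicodedata` module; the sample letter (alpha with psili and oxia) is chosen for illustration and is not taken from the RE text.

```python
import unicodedata

# Precomposed form from the Greek Extended block:
# U+1F04 GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA
precomposed = "\u1F04"

# NFD splits it into the base letter plus combining marks.
decomposed = unicodedata.normalize("NFD", precomposed)
code_points = [f"U+{ord(c):04X}" for c in decomposed]

# NFC puts the sequence back together into the precomposed character.
recomposed = unicodedata.normalize("NFC", decomposed)

print(code_points)
print(recomposed == precomposed)
```

The base letter comes from the Greek and Coptic block and the two marks from Combining Diacritical Marks, so both spellings are canonically equivalent even though they use different blocks.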
- ↑ a b
General Punctuation
(112 codes from 2000–206F,
symbl.cc)
General Punctuation is a Unicode block containing punctuation, spacing, and formatting characters for use with all scripts and writing systems. Included are the defined-width spaces, joining formats, directional formats, smart quotes, archaic and novel punctuation such as the interrobang, and invisible mathematical operators.
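The kinds of characters listed above differ in their Unicode general category, which is one way to tell spaces, format controls and visible punctuation apart. The sketch below looks up a few hand-picked examples with Python's `unicodedata` module; the selection of sample characters is an assumption made for illustration.

```python
import unicodedata

# One sample per kind of character named in the description above.
samples = {
    "EM SPACE": "\u2003",                    # defined-width space
    "ZERO WIDTH JOINER": "\u200D",           # joining format
    "RIGHT-TO-LEFT MARK": "\u200F",          # directional format
    "LEFT DOUBLE QUOTATION MARK": "\u201C",  # smart quote
    "INTERROBANG": "\u203D",                 # novel punctuation
    "INVISIBLE TIMES": "\u2062",             # invisible mathematical operator
}

# General category: Zs = space separator, Cf = format,
# Pi = initial quote punctuation, Po = other punctuation.
categories = {name: unicodedata.category(ch) for name, ch in samples.items()}

for name, cat in categories.items():
    print(f"{cat}  {name}")
```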
- ↑ a b
Superscripts and Subscripts
(48 codes from 2070–209F,
symbl.cc)
Superscripts and Subscripts is a Unicode block containing superscript and subscript numerals, mathematical operators, and letters used in mathematics and phonetics. Other superscript letters can be found in the Spacing Modifier Letters, Phonetic Extensions and Phonetic Extensions Supplement blocks, while the superscript 1, 2, and 3, inherited from ISO 8859-1, were included in the Latin-1 Supplement block.
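Because the superscript digits are split across two blocks, a run such as ⁰¹²³⁴ mixes code points from both. The sketch below illustrates this; the helper `block_hint` is hypothetical, and the Latin-1 Supplement range 0080–00FF is supplied here as an assumption since it is not stated in this note.

```python
import unicodedata

def block_hint(ch):
    """Name the block for the two ranges relevant here (hypothetical helper)."""
    cp = ord(ch)
    if 0x0080 <= cp <= 0x00FF:
        return "Latin-1 Supplement"
    if 0x2070 <= cp <= 0x209F:
        return "Superscripts and Subscripts"
    return "other"

# Superscript 0, 1, 2, 3, 4
digits = "\u2070\u00B9\u00B2\u00B3\u2074"
blocks = [block_hint(ch) for ch in digits]

# Compatibility normalization (NFKC) maps each superscript to its plain digit.
plain = unicodedata.normalize("NFKC", digits)

print(blocks)
print(plain)
```

NFKC discards the superscript distinction, so it is suitable for searching or counting but not for preserving the original typography.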
- ↑ a b
Combining Diacritical Marks for Symbols
(48 codes from 20D0–20FF,
symbl.cc)
Combining Diacritical Marks for Symbols is a Unicode block containing arrows, dots, enclosures, and overlays for modifying symbol characters.
From a linguistic point of view, diacritical marks are various subscript and superscript signs used in alphabets (including consonantal scripts and abugidas) and syllabaries. Their main feature is that they do not act as separate, independent symbols, but as additional marks that change or narrow the value of a particular sound or letter. Diacritics are usually smaller than the letter itself.
Synonymous names: accents (more specific), diacritics (professional discourse). The set of diacritical marks used in a given script or text is likewise referred to as its diacritics.
Sometimes one letter carries two or more diacritics at the same time, as in the following examples: ặ, ṩ, ᶑ.
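How many separable diacritics a precomposed letter carries can be checked by decomposing it. This is a minimal sketch with Python's `unicodedata` module; the helper name `combining_marks` is hypothetical. Note that not every visually modified letter decomposes: ᶑ (d with hook and tail) is encoded as an atomic character.

```python
import unicodedata

def combining_marks(ch):
    """Return the names of the combining marks in the canonical decomposition."""
    return [
        unicodedata.name(c)
        for c in unicodedata.normalize("NFD", ch)
        if unicodedata.combining(c)  # non-zero combining class = combining mark
    ]

for ch in ("\u1EB7", "\u1E69", "\u1D91"):  # the three examples from the text
    print(f"U+{ord(ch):04X}", combining_marks(ch))
```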
The vowel signs in scripts like Hebrew, Arabic, and Syriac are often confused with diacritics because of their similar appearance. However, they mostly act as a special type of letter, so they have a different function.
Diacritics come in handy when the letters of an alphabet are not enough to express certain sounds or meanings. The main alternative to diacritics is a combination of two letters (a digraph), three letters (a trigraph) or more that conveys one sound. For instance, the sound /ʃ/ is written with the digraph sh in English and ch in French, and with the trigraph sch in German. Czech writes the same sound with a single letter, š, and here it is the diacritic that signals the pronunciation.
Diacritics are used with both consonant and vowel letters. Their key drawback is that they fill the writing with small details that are nevertheless important: forgetting or skipping one can lead to serious mistakes. Some languages use diacritics rarely (English) or only a little (Russian). In some cases there is a tendency to replace letters with diacritics by digraphs. German ö, for instance, may be written oe in plain text, although since the introduction of the umlaut this practice has almost fallen out of use.
- ↑ a b
Letterlike Symbols
(80 codes from 2100–214F,
symbl.cc)
Letterlike Symbols is a Unicode block containing 80 characters which are mostly built of the glyphs of one or more letters. In addition to this block, Unicode includes stylized mathematical alphabets, although Unicode does not really categorise these characters as being “letterlike”.
Most of the symbols are perfect for decorating texts, posts, and making your nicknames or bio on various websites stand out.
- ↑ a b
Number Forms
(64 codes from 2150–218F,
symbl.cc)
Number Forms is a Unicode block containing characters which have specific meaning as numbers, but are built from other characters. They consist primarily of simple fractions and Roman numerals. In addition to the characters in the Number Forms block, three fractions were inherited from ISO 8859-1, which was incorporated as a whole as the Latin-1 Supplement block.
Unfortunately, here you won't find the fraction to talk about the mysterious platform from Harry Potter. Yes, you'll have to type 9 3/4 with your own fingers. However, if you want to say that for this particular pizza you'll need ⅖ glasses of flour, ⅔ plates of mozzarella crust and ⅞ spoons of patience, you've come to the right place. Apart from recipes, you can use these symbols for evaluating your classmate's performance in a school project: ↉ objectives achieved, ⅒ subjects passed.
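That these characters “have specific meaning as numbers” can be seen from their numeric property. The sketch below reads it with Python's `unicodedata.numeric` and converts the result to an exact fraction; the helper `value` and the denominator limit of 1000 are assumptions made for illustration. The last line uses ¾, one of the three fractions inherited in the Latin-1 Supplement block.

```python
import unicodedata
from fractions import Fraction

def value(ch):
    """Numeric value of a character as an exact fraction (hypothetical helper)."""
    # unicodedata.numeric returns a float; limit_denominator recovers 2/3 etc.
    return Fraction(unicodedata.numeric(ch)).limit_denominator(1000)

for ch in ("\u2156", "\u2154", "\u215E", "\u2189", "\u2152", "\u216B"):
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), value(ch))

# U+00BE VULGAR FRACTION THREE QUARTERS lives in Latin-1 Supplement.
platform = 9 + value("\u00BE")
print(platform)
```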
- ↑ a b
Arrows
(112 codes from 2190–21FF,
symbl.cc)
Arrows is a Unicode block containing lines, curves, and semicircle symbols terminating in barbs or arrows. Even arrows have categories: Unicode divides them into two main groups, simple arrows and arrows with modifications, not to mention the arrows with bent tips. Some arrows feel lonely when they travel alone, so they go in pairs.
The general objectives of arrows (both in real life and Unicode) are to mark directions, connections, relations, logical assumptions, implications, and computer buttons. The main directions include the key four: up, down, left, right. However, some signs are coded in eight variants.
What else can you use arrows for? Well, a lot of bloggers use such symbols to indicate that they are referring to the previous story on their profile ←. Besides, these arrows are often found in books to emphasize important information ↗. Plus, as usual, you can always express your creativity with these strange creatures: snake-like arrow ↝, an arrow doing yoga ↨, flash zig-zag arrow ↯, and two arrows that crashed into the walls ↹.
- ↑ a b
Mathematical Operators
(256 codes from 2200–22FF,
symbl.cc)
The Unicode Standard encodes almost all standard characters used in mathematics. Unicode Technical Report #25 provides comprehensive information about the character repertoire, their properties, and guidelines for implementation. Mathematical operators and symbols are situated in multiple Unicode blocks. Some of these blocks are dedicated to, or primarily contain, mathematical characters while others are a mix of mathematical and non-mathematical characters. This article covers all Unicode characters with a derived property of “Math”.
At least 6 Unicode blocks contain special mathematical symbols: Mathematical Operators (2200–22FF), Miscellaneous Mathematical Symbols-A (27C0–27EF), Miscellaneous Mathematical Symbols-B (2980–29FF), Supplemental Mathematical Operators (2A00–2AFF), Mathematical Alphanumeric Symbols (1D400–1D7FF), and Arabic Mathematical Alphabetic Symbols (1EE00–1EEFF).
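The six ranges listed above are enough to decide which of these blocks, if any, a given character belongs to. A minimal sketch, assuming only those ranges; the table `MATH_BLOCKS` and the function `math_block` are hypothetical names introduced for illustration.

```python
# (first code point, last code point, block name), as listed in the text
MATH_BLOCKS = [
    (0x2200, 0x22FF, "Mathematical Operators"),
    (0x27C0, 0x27EF, "Miscellaneous Mathematical Symbols-A"),
    (0x2980, 0x29FF, "Miscellaneous Mathematical Symbols-B"),
    (0x2A00, 0x2AFF, "Supplemental Mathematical Operators"),
    (0x1D400, 0x1D7FF, "Mathematical Alphanumeric Symbols"),
    (0x1EE00, 0x1EEFF, "Arabic Mathematical Alphabetic Symbols"),
]

def math_block(ch):
    """Return the name of the mathematical block containing ch, or None."""
    cp = ord(ch)
    for first, last, name in MATH_BLOCKS:
        if first <= cp <= last:
            return name
    return None

print(math_block("\u2211"))      # N-ARY SUMMATION
print(math_block("\u2A01"))      # N-ARY CIRCLED PLUS OPERATOR
print(math_block("\U0001D400"))  # MATHEMATICAL BOLD CAPITAL A
print(math_block("\u00D7"))      # MULTIPLICATION SIGN, outside all six blocks
```

Returning `None` for × illustrates the point that mathematical characters also occur in blocks that are not dedicated to mathematics.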
Math symbols are basically a set of graphic signs which serve for writing down math ideas and terms. Different cultures used to have their own symbols for such operations. Some, still do. However, in modern times the unified international system is more popular. It has been developing historically, like any native language. A lot of symbols were borrowed from other alphabets.
Numbers are written with Arabic digits, as a rule in the decimal system. Unicode code points, by contrast, are written in hexadecimal. Apart from digits, letters are used too, mostly Greek and Latin. Not only the case of a letter matters, but also the style in which it is written (font).
A small part of math symbols (mostly related to measurements) is included in the standard ISO 31-11. But, in general, there are no unified rules for designation. The multiplication sign can be written either as a dot ∙, or a star * or even a cross ×.
- ↑ a b
Miscellaneous Technical
(256 codes from 2300–23FF,
symbl.cc)
Miscellaneous Technical is the name of a Unicode block ranging from U+2300 to U+23FF, which contains various common symbols related to and used in technical, programming language, and academic professions. Symbol ⌂ (HTML hexadecimal code &#x2302;) represents a house or a home. Symbol ⌘ (&#x2318;) represents the Command key on a Mac keyboard. Symbol ⌚ (&#x231A;) is a watch (or clock). Symbol ⏏ (&#x23CF;) is the “Eject” button symbol found on electronic equipment. Symbol ⏚ (&#x23DA;) is the “Earth Ground” symbol found in electrical or electronic manuals and on tags and equipment. The block also includes most of the uncommon symbols used by the APL programming language.
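An HTML hexadecimal code of the kind mentioned above is simply the code point written in hexadecimal between `&#x` and `;`. The following sketch derives it for the five symbols and checks the round trip with the standard `html` module; the function name `hex_reference` is hypothetical.

```python
import html

def hex_reference(ch):
    """Build the HTML hexadecimal character reference for one character."""
    return f"&#x{ord(ch):X};"

symbols = ["\u2302", "\u2318", "\u231A", "\u23CF", "\u23DA"]
references = [hex_reference(ch) for ch in symbols]

for ch, ref in zip(symbols, references):
    # html.unescape turns the reference back into the character itself.
    print(ref, html.unescape(ref) == ch)
```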
- ↑ a b
Box Drawing
(128 codes from 2500–257F,
symbl.cc)
Box Drawing is a Unicode block containing characters for compatibility with legacy graphics standards that contained characters for making bordered charts and tables, i.e. box-drawing characters.
Box-drawing characters, also known as line-drawing characters, are a form of semigraphics widely used in text user interfaces to draw various geometric frames and boxes. In graphical user interfaces, these characters are much less useful as it is much simpler to draw lines and rectangles directly with graphical APIs. Box-drawing characters work only with monospaced fonts; however, they are still useful for plaintext comments on websites.
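As an illustration of the semigraphics described above, the sketch below frames a short text with six characters from the Box Drawing block. The function `draw_box` is a hypothetical helper, and the result lines up only when displayed in a monospaced font.

```python
def draw_box(text):
    """Frame a single line of text with light box-drawing characters."""
    inner = f" {text} "
    horizontal = "\u2500" * len(inner)         # BOX DRAWINGS LIGHT HORIZONTAL
    top = "\u250C" + horizontal + "\u2510"     # LIGHT DOWN AND RIGHT / DOWN AND LEFT
    middle = "\u2502" + inner + "\u2502"       # LIGHT VERTICAL on both sides
    bottom = "\u2514" + horizontal + "\u2518"  # LIGHT UP AND RIGHT / UP AND LEFT
    return "\n".join([top, middle, bottom])

print(draw_box("RE"))
```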
- ↑ a b
Geometric Shapes
(96 codes from 25A0–25FF,
symbl.cc)
Geometric Shapes is a Unicode block that consists of 96 geometric symbols in the code point range U+25A0–U+25FF: squares, triangles and rectangles, pointing left, right, up or down; outlined or filled shapes, striped or chequered ▦. It's absolutely up to you in what contexts to apply them.
For example, some look like buttons on a keyboard ▪ or road signs showing direction ▻. My favourite is this square ▮, because it looks smooth and serene. Perfect for copying and pasting on social media!
Seriously speaking, geometric shapes come in handy if you specialise in art, design, or engineering. They are basically figures which represent the forms of real-life objects. Some figures are two-dimensional, whereas others are three-dimensional. In our case, we're talking about two-dimensional shapes, of course. Although this quarter-eaten pie on a plate ◴ seems pretty three-dimensional to me.
- ↑ a b
Miscellaneous Symbols
(256 codes from 2600–26FF,
symbl.cc)
Miscellaneous Symbols is a Unicode block (U+2600–U+26FF) containing glyphs representing concepts from a variety of categories: astrological, astronomical, chess, dice, musical notation, political symbols, recycling, religious symbols, trigrams, warning signs, and weather, among others.
- ↑ a b
Dingbats
(192 codes from 2700–27BF,
symbl.cc)
A dingbat is an ornament, character, or spacer used in typesetting. It is more formally known as a printer's ornament or printer's character and is often employed for the creation of box frames. The term was later adopted in the computer industry to describe fonts that have symbols and shapes in the positions designated for alphabetical or numeric characters.
In 1977 the German typographer Hermann Zapf created more than a thousand drafts for glyphs. Of these, 360 were accepted by ITC, the International Typeface Corporation. That's how we got the font “Zapf dingbats”. It was divided into three sections: series 100, 200, and 300. This Unicode block in particular presents Zapf dingbats 100.
Among various arrows, stars and crosses, you will find snowflakes, cards, digits, and even some emojis. I personally adore the variety of scissors that this block offers: ✂ ✀ ✃.
- ↑ a b
Miscellaneous Mathematical Symbols-A
(48 codes from 27C0–27EF,
symbl.cc)
Miscellaneous Mathematical Symbols-A is a Unicode block containing characters for mathematical, logical, and database notation.
Mathematics, considered the language of all sciences, cannot do without a notation system. Numerous concepts and operators have entered it as the science has developed. Since these symbols are not included in the standard alphabets, typing them from the keyboard can be problematic. Nevertheless, these mathematical symbols can be copied and pasted.
The Unicode Consortium is no stranger to the problems of scientists, so many different signs were included in the table. If this is not what you need, use the search on the website or check the following sections: Arabic Mathematical Alphabetic Symbols, Miscellaneous Mathematical Symbols-B, Supplemental Mathematical Operators. Letters for formulas can be taken from the Mathematical Alphanumeric Symbols block.
- ↑ a b
Miscellaneous Mathematical Symbols-B
(128 codes from 2980–29FF,
symbl.cc)
Miscellaneous Mathematical Symbols-B is a Unicode block containing miscellaneous mathematical symbols, including brackets, angles, and circle symbols.
- ↑ a b
Supplemental Mathematical Operators
(256 codes from 2A00–2AFF,
symbl.cc)
Supplemental Mathematical Operators is a Unicode block containing various mathematical symbols, including N-ary operators, summations and integrals, intersections and unions, logical and relational operators, and subset/superset relations.
- ↑ a b
Coptic
(128 codes from 2C80–2CFF, Alphabet, Language: Coptic,
symbl.cc)
Coptic is a Unicode block used with the Greek and Coptic block to write the Coptic language. Prior to version 4.1 of the Unicode Standard, the Greek and Coptic block was used exclusively to write Coptic text. However, Greek and Coptic letter forms are contrastive in many scholarly works, so a separate encoding was needed. The specifically Coptic letters in the Greek and Coptic block are not reproduced in the Coptic Unicode block.
The Coptic alphabet is the script used for writing the Coptic language. The repertoire of glyphs is based on the Greek alphabet augmented by letters borrowed from the Egyptian Demotic. The borrowings included some Egyptian consonants, since they were missing from the Greek alphabet. It is the first alphabetic script used for the Egyptian language.
There are several Coptic alphabets, as the Coptic writing system may vary greatly among the various dialects and subdialects of the Coptic language.
- ↑ a b
Tifinagh
(80 codes from 2D30–2D7F, Abjad, Language: Tuareg,
symbl.cc)
Tifinagh is a Unicode block containing characters of the Tifinagh alphabet, used for writing Tuareg, Berber, and other languages of North Africa.
Tifinagh (also written Tifinaɣ in the Berber Latin alphabet, ⵜⵉⴼⵉⵏⴰⵖ in Neo-Tifinagh, and تيفيناغ in the Berber Arabic alphabet) is a series of abjad and alphabetic scripts used to write Berber languages. A modern derivative of the traditional script, known as Neo-Tifinagh, was introduced in the 20th century. A slightly modified version of the traditional script, called Tifinagh Ircam, is used in a number of Moroccan elementary schools to teach the Berber language to children, as well as in a number of publications. The word tifinagh is thought to be a Berberized feminine plural cognate of Punic, formed from the Berber feminine prefix ti- and Latin Punicus; thus tifinagh could possibly mean “the Phoenician (letters)” or “the Punic letters”.
- ↑ a b
CJK Symbols and Punctuation
(64 codes from 3000–303F,
symbl.cc)
CJK Symbols and Punctuation is a Unicode block containing symbols and punctuation in the unified Chinese, Japanese and Korean script.
Chinese punctuation uses a different set of punctuation marks from European languages, although the concept of punctuation was adapted in the written language during the 20th century from Western punctuation marks. Before that, punctuation in the modern sense was not part of East Asian writing. The first book to be printed with modern punctuation was Outline of the History of Chinese Philosophy (中國哲學史大綱) by Hu Shi (胡適), published in 1919. Scholars did, however, annotate texts with symbols resembling the modern 。 and 、 to indicate full stops and pauses, respectively. Traditional poetry and calligraphy maintain the punctuation-free style. The usage of punctuation is regulated by the Chinese national standard GB/T 15834–2011, “General rules for punctuation” (Chinese: 标点符号用法; pinyin: biāodiǎn fúhào yòngfǎ).
- ↑ a b
Alphabetic Presentation Forms
(80 codes from FB00–FB4F,
symbl.cc)
Alphabetic Presentation Forms is a Unicode block containing standard ligatures for the Latin, Armenian, and Hebrew scripts.
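When searching or counting text, it can help to expand these ligatures into their component letters. One common way, sketched below, is Unicode compatibility normalization (NFKC) from Python's standard `unicodedata` module. The helper names `in_presentation_forms` and `expand_ligatures` are illustrative assumptions, and NFKC also rewrites many other compatibility characters, so it is not a ligature-only transformation.

```python
import unicodedata

# Range of the Alphabetic Presentation Forms block, as quoted above.
PRESENTATION_FORMS = (0xFB00, 0xFB4F)


def in_presentation_forms(char):
    """True if char lies in the Alphabetic Presentation Forms block."""
    first, last = PRESENTATION_FORMS
    return first <= ord(char) <= last


def expand_ligatures(text):
    """Replace compatibility characters such as U+FB01 with their plain letters.

    Note: NFKC affects every compatibility character in the text,
    not only the ligatures of this block.
    """
    return unicodedata.normalize("NFKC", text)


# U+FB01 is LATIN SMALL LIGATURE FI; written as an escape so the
# source stays readable even in fonts that lack the glyph.
sample = "\ufb01nd"
expanded = expand_ligatures(sample)
```

Here `expanded` is the four-letter string `find`, while `sample` contains only three characters.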
- ↑ a b
Halfwidth and Fullwidth Forms
(240 codes from FF00–FFEF,
symbl.cc)
In CJK (Chinese, Japanese and Korean) computing, graphic characters are traditionally classed into fullwidth (in Taiwan and Hong Kong: 全形; in CJK and Japanese: 全角) and halfwidth (in Taiwan and Hong Kong: 半形; in CJK and Japanese: 半角) characters. With fixed-width fonts, a halfwidth character occupies half the width of a fullwidth character, hence the name.
In the days of computer terminals and text mode computing, characters were normally laid out in a grid, often 80 columns by 24 or 25 lines. Each character was displayed as a small dot matrix, often about 8 pixels wide, and an SBCS (single byte character set) was generally used to encode characters of western languages.
For a number of practical and aesthetic reasons, Han characters would need to be twice as wide as these fixed-width SBCS characters. These “fullwidth characters” were typically encoded in a DBCS (double byte character set), although less common systems used other variable-width character sets that used more bytes per character.
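The relationship between halfwidth and fullwidth characters can be sketched in Python. The fullwidth forms U+FF01–U+FF5E mirror the printable ASCII characters U+0021–U+007E at a fixed offset of 0xFEE0, and the standard `unicodedata.east_asian_width` function reports the width class of a character. The function names below are illustrative assumptions, and `display_columns` is a simplification that ignores combining marks and ambiguous-width characters.

```python
import unicodedata

# Distance between U+0021 (!) and its fullwidth form U+FF01.
FULLWIDTH_OFFSET = 0xFEE0


def to_fullwidth(text):
    """Map printable ASCII (U+0021–U+007E) to the fullwidth forms."""
    return "".join(
        chr(ord(ch) + FULLWIDTH_OFFSET) if 0x21 <= ord(ch) <= 0x7E else ch
        for ch in text
    )


def to_halfwidth(text):
    """Map fullwidth forms (U+FF01–U+FF5E) back to printable ASCII."""
    return "".join(
        chr(ord(ch) - FULLWIDTH_OFFSET) if 0xFF01 <= ord(ch) <= 0xFF5E else ch
        for ch in text
    )


def display_columns(text):
    """Rough column count in a fixed-width grid: wide and fullwidth count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("F", "W") else 1
        for ch in text
    )
```

With this sketch, `to_fullwidth("A1!")` yields the three characters U+FF21, U+FF11 and U+FF01, and a string made of one ASCII letter and one Han character occupies three columns. The ASCII space is left unchanged because its wide counterpart, U+3000, belongs to the CJK Symbols and Punctuation block rather than to this one.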
- ↑ a b
Osage
(80 codes from 104B0–104FF, Language: Osage,
symbl.cc)
This alphabet represents the Osage language, which was spoken by the Osage people. They used to live in the north of Oklahoma (US). The language is included in the Red Book of Endangered Languages (UNESCO). Today it is a dead language: the last native speaker passed away in 2005. Nevertheless, in 2006 Herman Mongrain Lookout created a script for Osage. In 2014 he improved it significantly, and in 2016 it was added to Unicode.
The design of the Osage script is based on the Latin alphabet. Words are written from left to right. It uses common European diacritics, punctuation, and numbers.
- ↑ a b
Linear A
(384 codes from 10600–1077F,
symbl.cc)
Linear A is a variety of the Cretan script which was developed in Ancient Greece. Along with Cretan hieroglyphic, it still remains undeciphered.
The script was discovered by the archaeologist Sir Arthur Evans. Linear A was the primary script used in palace and religious writings of the Minoan civilization. The vast majority of the inscriptions were written on tablets of unbaked clay, some of which survived because they were baked in fires, whether accidental or deliberate. Some inscriptions are inked on vessels and other objects. The shape of the signs suggests that the main writing material was not clay but parchment or a similar material.
Linear A was the origin of the Linear B script, which was later used by the Mycenaean civilization. In the 1950s, Linear B was largely deciphered and found to contain the early form of Greek. Although the two systems share many symbols, this did not lead to a subsequent decipherment of Linear A. Using the values associated with Linear B, Linear A mainly produces unintelligible words. If it uses the same or similar syllabic values as Linear B, then its underlying language appears unrelated to any known language. This has been dubbed the Minoan language.
- ↑ a b
Phoenician
(32 codes from 10900–1091F,
symbl.cc)
The Phoenician alphabet, called by convention the Proto-Canaanite alphabet for inscriptions older than around 1200 BC, is the oldest verified consonantal alphabet. It was used by the civilization of Phoenicia to write Phoenician, a Northern Semitic language. It is classified as an abjad because it records only consonantal sounds (although matres lectionis were used for some vowels in certain late varieties).
Matres lectionis are consonant letters used to indicate a vowel.
Global influencer
The Phoenician alphabet was derived from Egyptian hieroglyphs. Spread by Phoenician merchants across the Mediterranean world, it became one of the most widely used writing systems, and it evolved and was adapted by many other cultures. Scripts influenced by Phoenician include:
• The Paleo-Hebrew alphabet, which was built on the Phoenician.
• The Aramaic alphabet, a modified form of Phoenician, which was the ancestor of the modern Arabic script.
• The Modern Hebrew script, a stylistic variant of the Aramaic script.
• The Greek alphabet (and by extension its descendants such as the Latin, the Cyrillic, and the Coptic, 2C80–2CFF), a direct successor of Phoenician and the first full alphabet (with vowels rather than just consonants).
As the letters were originally incised with a stylus, most of the shapes are angular and straight, although more cursive versions are increasingly attested in later times, culminating in the Neo-Punic alphabet of Roman-era North Africa. Phoenician was usually written from right to left, although there are some texts written in boustrophedon.
Boustrophedon is a style of writing in which alternate lines run in opposite directions: the first line from left to right, the second from right to left, and so on.
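The alternating line direction can be illustrated with a few lines of Python. This is only a sketch of the layout idea, using Latin letters and a hypothetical helper named `boustrophedon`; it reverses the character order of every second line and makes no attempt to model Phoenician text itself.

```python
def boustrophedon(lines):
    """Return the lines with every second one reversed.

    Lines at even positions (0, 2, ...) keep their direction;
    lines at odd positions (1, 3, ...) run the opposite way.
    """
    return [
        line[::-1] if index % 2 == 1 else line
        for index, line in enumerate(lines)
    ]


laid_out = boustrophedon(["ABC", "DEF", "GHI"])
```

Here `laid_out` is `["ABC", "FED", "GHI"]`: the reader's eye continues from the end of one line directly into the start of the next.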
In 2005, UNESCO registered the Phoenician alphabet in the Memory of the World Programme as a heritage of Lebanon.
- ↑ a b
Lydian
(32 codes from 10920–1093F,
symbl.cc)
Lydian script was used to write the Lydian language. That the language preceded the script is indicated by names in Lydian, which must have existed before they were written. Like other scripts of Anatolia in the Iron Age, the Lydian alphabet is a modification of the East Greek alphabet, but it has unique features. The same Greek letter does not necessarily represent the same sound in Greek and Lydian, or in any other Anatolian language, although in some cases it does. The Lydian script is alphabetic.
Early Lydian texts are written both from left to right and from right to left. Later texts are exclusively written from right to left. One text is boustrophedon. Spaces separate words except that one text uses dots. Lydian uniquely features a quotation mark in the shape of a right triangle.
The first codification was made by Roberto Gusmani in 1964 in a combined lexicon (vocabulary), grammar, and text collection.
- ↑ a b
Old Turkic
(80 codes from 10C00–10C4F,
symbl.cc)
The Old Turkic script (variously known as the Göktürk script, Orkhon script, or Orkhon-Yenisey script) is the alphabet used by the Göktürk and other early Turkic Khanates during the 8th to 10th centuries to record the Old Turkic language.
The script is named after the Orkhon Valley in Mongolia where early 8th-century inscriptions were discovered in an 1889 expedition by Nikolay Yadrintsev. These Orkhon inscriptions were published by Vasily Radlov and deciphered by the Danish philologist Vilhelm Thomsen in 1893.
This writing-system was later used within the Uyghur Empire. Additionally, a Yenisei variant is known from 9th-century Kyrgyz inscriptions, and it has likely cousins in the Talas Valley of Turkestan and the Old Hungarian script of the 10th century. Words were usually written from right to left.
Thomsen characterized the script as “Turkish runes”, and it is still occasionally described as runic or “runiform” by comparison with the Old Germanic runic alphabets, which were in use during roughly the same period.
- ↑ a b
Ancient Greek Musical Notation
(80 codes from 1D200–1D24F,
symbl.cc)
Ancient Greek Musical Notation is a Unicode block containing various symbols used in Ancient Greece for composing and writing down music.
The system was popular in Greece from the 9th century BC to the 6th century AD. It consisted of symbols inscribed on stone or metal plates, which represented the pitch and duration of musical notes.
The notation evolved over a period of more than 500 years. It went from simple scales of tetrachords, or divisions of the perfect fourth, to The Perfect Immutable System, encompassing a span of fifteen pitch keys. The most famous example of ancient Greek musical notation is the Seikilos epitaph, a piece of music inscribed on a tombstone.
Any discussion of ancient Greek music, theoretical, philosophical or aesthetic, is fraught with two problems. First, there are few examples of written music, which makes the exploration difficult. Second, there are many theoretical and philosophical accounts, sometimes fragmentary. All in all, the notation was applied to both vocal and instrumental music, but much of it has been lost and is only partially understood today.
The symbols in this block include instrumental and vocalic notation, plus further inscriptions. Copy them to your history report and get an excellent mark from your teacher!
- ↑ a b
Alchemical Symbols
(128 codes from 1F700–1F77F,
symbl.cc)
This block includes alchemical symbols: signs for substances, processes, units of measurement, and various chemical elements.
What is alchemy? It is an early form of natural science that studied the properties of matter and their transformation. Alchemists sought to discover the “philosopher's stone”, which, according to their ideas, could turn ordinary metals into gold and grant immortality.
Alchemical symbols were used to denote chemical elements and compounds until the 18th century. The design of the symbols was largely standardized, although the symbols themselves and their style could vary.
How can alchemical symbols be used today? As decorative elements in design and fashion, as tattoo ideas, as metaphor in art and literature, in astrology and esotericism, and to decorate posts and texts on these topics.