Typesetting a Non-Latin Script With ConTeXt: Greek
Thomas A. Schmitz
ConTeXt user meeting, Epen, March 2007
Areas of Interest for Non-Latin Scripts:
- Input;
- Output.
Input Methods:
- ASCII;
- Unicode.
Converting ASCII into Greek:
>'Andra moi >'ennepe, Mo~usa, pol'utropon <`oc m'ala poll'a
- Ἄνδρα μοι ἔννεπε, Μοῦσα, πολύτροπον ὃς μάλα πολλά
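The conversion above can be imitated outside TeX. The following is only a sketch in Python, not the mechanism used in the talk (which relies on font ligatures): the letter mapping is read off the font table in these slides, and the choice of Unicode combining marks and the function name `ascii_to_greek` are assumptions made for illustration.

```python
import unicodedata

# Letter mapping read off the font table in these slides
# (c is final sigma, j is theta, q is chi, w is omega, y is psi).
LATIN = "abcdefghijklmnopqrstuwxyz"
GREEK = "αβςδεφγηιθκλμνοπχρστυωξψζ"
LETTERS = dict(zip(LATIN, GREEK))

# Diacritics typed before the vowel, mapped to Unicode combining marks
# (assumed correspondence).
PREFIX_MARKS = {
    ">": "\u0313",  # lenis (smooth breathing)
    "<": "\u0314",  # dasia (rough breathing)
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "~": "\u0342",  # circumflex (perispomeni)
}
IOTA_SUBSCRIPT = "\u0345"  # typed as "|" after the vowel


def ascii_to_greek(text):
    """Convert the ASCII transliteration to precomposed Unicode Greek."""
    out = []
    pending = []  # marks waiting for the next letter
    for ch in text:
        if ch in PREFIX_MARKS:
            pending.append(PREFIX_MARKS[ch])
        elif ch == "|":
            out.append(IOTA_SUBSCRIPT)
        elif ch.lower() in LETTERS:
            greek = LETTERS[ch.lower()]
            out.append(greek.upper() if ch.isupper() else greek)
            out.extend(pending)  # combining marks follow their base letter
            pending.clear()
        else:
            out.append(ch)  # spaces, punctuation, digits pass through
    # NFC composes letter + marks into single precomposed characters.
    return unicodedata.normalize("NFC", "".join(out))


print(ascii_to_greek(">'Andra moi >'ennepe, Mo~usa, pol'utropon <`oc m'ala poll'a"))
```

The sketch makes the drawback of ASCII input visible: the apostrophe doubles as the acute accent, so the source is ambiguous until it has been converted.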
Characteristics of ASCII Input
- Portability across platforms and editors;
- typing intuitive;
- lack of visual feedback; need to compile your source.
Characteristics of Unicode Input
- Immediate visual feedback: you see what you mean.
- Portable in theory, but older platforms and some editors have
problems handling Unicode;
- display of Unicode characters can be unreliable and/or ugly on some
platforms and editors;
- proper keyboard driver may be difficult to find; variety of input
methods;
- support for Unicode in TeX is incomplete; at its base, TeX is still an
8-bit system.
Unicode in Vim 7.0 under Mac OS X
Unicode in Emacs under Mac OS X
Unicode in Vim 7.0 under Fedora Linux fc6
Unicode in Emacs 23 (cvs) under Fedora Linux fc6
Unicode in Emacs 21.4 under Debian Linux
Using Fonts in TeX
.tfm
A .tfm File Converted to .pl format
(CHARACTER O 100
   (CHARWD R 0.5)
   (CHARHT R 0.6449995)
   )
(CHARACTER C A
   (CHARWD R 0.65)
   (CHARHT R 0.6842785)
   )
(CHARACTER C B
   (CHARWD R 0.617)
   (CHARHT R 0.6842785)
   (CHARDP R 0.249085)
   )
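To see what such a pl excerpt contains, it can be read into a small table of box dimensions per slot. This is a hedged sketch in Python that handles only the forms visible in the excerpt (`O` for octal codes, `C` for literal characters, `R` for real values); the function names are made up for illustration, and a full pl file has more property types than this covers.

```python
import re

PL_EXCERPT = """
(CHARACTER O 100
   (CHARWD R 0.5)
   (CHARHT R 0.6449995)
   )
(CHARACTER C A
   (CHARWD R 0.65)
   (CHARHT R 0.6842785)
   )
(CHARACTER C B
   (CHARWD R 0.617)
   (CHARHT R 0.6842785)
   (CHARDP R 0.249085)
   )
"""


def slot_number(kind, value):
    """'O' introduces an octal code, 'C' a literal character."""
    return int(value, 8) if kind == "O" else ord(value)


def parse_metrics(pl_text):
    """Return {slot: {"CHARWD": ..., "CHARHT": ..., "CHARDP": ...}}."""
    metrics = {}
    current = None
    for line in pl_text.splitlines():
        start = re.match(r"\s*\(CHARACTER ([OC]) (\S+)", line)
        if start:
            current = slot_number(*start.groups())
            metrics[current] = {}
            continue
        dim = re.match(r"\s*\((CHARWD|CHARHT|CHARDP) R ([-\d.]+)\)", line)
        if dim and current is not None:
            metrics[current][dim.group(1)] = float(dim.group(2))
    return metrics


for slot, dims in sorted(parse_metrics(PL_EXCERPT).items()):
    print(slot, dims)
```

The point of the exercise: the tfm knows only numbered boxes with width, height and depth, and nothing about what the glyphs look like.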
Using Fonts in TeX
.tfm .map
Excerpt from a map file
GreekGentiumAlt <genaltagr.enc <GenAR102.TTF
Using Fonts in TeX
.tfm .map .enc
Excerpt from an enc file
/mu % 109
/nu % 110
/omicron % 111
/pi % 112
/chi % 113
/rho % 114
/sigma % 115
/tau % 116
/upsilon % 117
/uni1FB3 % 118
/omega % 119
/xi % 120
/psi % 121
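The enc excerpt is simply a list that assigns a glyph name to each slot. A minimal Python sketch follows; it assumes that every entry carries its slot number in a trailing comment, as in the excerpt (in a real enc file the position within the vector decides the slot and the comments are optional), and `parse_enc` is a hypothetical helper name.

```python
import re

ENC_EXCERPT = """
/mu % 109
/nu % 110
/omicron % 111
/pi % 112
/chi % 113
/rho % 114
/sigma % 115
/tau % 116
/upsilon % 117
/uni1FB3 % 118
/omega % 119
/xi % 120
/psi % 121
"""


def parse_enc(enc_text):
    """Return {slot: glyph name}, trusting the '% number' comments."""
    pairs = re.findall(r"/(\S+)\s*%\s*(\d+)", enc_text)
    return {int(slot): name for name, slot in pairs}


slots = parse_enc(ENC_EXCERPT)
print(slots[ord("m")])  # glyph name in the slot of ASCII "m"
print(slots[ord("v")])  # glyph name in the slot of ASCII "v"
```

Looking up `ord("m")` shows why ASCII input works: the slot of the Latin letter holds the Greek glyph.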
Using Fonts in TeX
.tfm .map .enc .pfb .pdf
Another Look at Using Fonts
[Diagram, built up in three steps: font files (.pfb, .ttf, .otf); fonts a, b, c; each font with its own tfm (a.tfm, b.tfm, c.tfm) and enc (a.enc, b.enc, c.enc), leading to the pdf.]
Tools for Producing tfm Files
- fontforge: multi-purpose tool for manipulating fonts;
- afm2tfm: part of the TeX installation, converts afm files to tfm format;
- afm2pl: by Siep Kroonenberg, similar to afm2tfm, but produces pl which can then be converted by pltotf;
- ttf2afm: converts TrueType ttf to afm;
- otftotfm: by Eddie Kohler, converts OpenType otf to tfm.
Different Names for the Character ᾆ:
- uni1F86
- _F86
- alphaiotasubleniscircumflex
- alphatildelenisiota
- e
Preparing a Font for Use with ConTeXt
1. Write enc file;
2. use this enc and proper tool to create tfm;
3. register names in map file;
4. first test run:
\loadmapfile[my.map]
\starttext
\showfont[myfont]
\stoptext
A Greek Font in ConTeXt
[Output of \showfont for the Greek font: a grid giving, for each slot, the glyph and its decimal, octal and hexadecimal number. Glyphs that survive in this extraction are listed by decimal slot; "·" marks a slot whose glyph was lost.]
37 ϙ   39 >   40 (   41 )   42 *   44 ,   45 -   46 .
49-57: 1 2 3 4 5 6 7 8 9
58 :   59 ;   61 =   63 ?   64 ¨
65-90: Α Β · Δ Ε Φ Γ Η Ι Θ Κ Λ Μ Ν Ο Π Χ Ρ Σ Τ Υ · Ω Ξ Ψ Ζ
91 [   93 ]   96 <
97-122: α β ς δ ε φ γ η ι θ κ λ μ ν ο π χ ρ σ τ υ · ω ξ ψ ζ
123 ⌊   124 |   125 ⌋   126 ˆ   128 “   129 ”   130 !
133 ί   142 ϊ   143 ΐ   151 ΅   154 έ   163 ά   187 ή
Beyond the Pure ASCII Range:
>~h| ⇒ ᾖ
Ligatures in the pl File:
>~h| ⇒ ᾖ
(LIGTABLE
   (LABEL O 76)
   (LIG C o O 321)
   (LIG C i O 204)
   (LIG C h O 272)
   (LIG C e O 231)
   (LIG C a O 242)
   (LIG O 176 O 222)
   (LIG O 140 O 226)
   (LIG O 47 O 224)
   (STOP)
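How these ligature steps combine input characters can be traced with a small simulation. The Python sketch below is an illustration, not TeX's actual implementation: it covers only the plain LIG steps listed for the character labelled O 76 (decimal 62, ">") in the excerpt, and the name `apply_ligatures` is hypothetical.

```python
# Ligature steps for the character labelled O 76 (decimal 62, ">"),
# copied from the pl excerpt. Keys are (left slot, right slot),
# values are the slot that replaces both.
LIGS = {
    (0o76, ord("o")): 0o321,
    (0o76, ord("i")): 0o204,
    (0o76, ord("h")): 0o272,
    (0o76, ord("e")): 0o231,
    (0o76, ord("a")): 0o242,
    (0o76, 0o176): 0o222,  # ">" followed by "~"
    (0o76, 0o140): 0o226,  # ">" followed by "`"
    (0o76, 0o47): 0o224,   # ">" followed by "'"
}


def apply_ligatures(slots, ligs=LIGS):
    """Replace each pair found in the table by its ligature slot.

    The result of a step is compared with the next input slot again,
    which is how longer sequences can be built up pair by pair.
    """
    out = []
    for slot in slots:
        if out and (out[-1], slot) in ligs:
            out[-1] = ligs[(out[-1], slot)]
        else:
            out.append(slot)
    return out


print(apply_ligatures([ord(c) for c in ">h"]))   # one slot: lenis + eta
print(apply_ligatures([ord(c) for c in ">~h"]))  # stops after ">~" here
```

With only this excerpt, `>~h` stops after the first step: the pair `>~` becomes slot 146, but the further steps that combine slot 146 with `h` and then with `|` belong to parts of the ligature table not shown on the slide.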
Ligatures in the lig File:
>~h| ⇒ ᾖ
% LIGKERN lenis guilsinglleft =: lenisgrave ;
% LIGKERN lenis guilsinglright =: lenisacute ;
% LIGKERN lenis uni1FC0 =: tildelenis ;
% LIGKERN lenis alpha =: alphalenis ;
% LIGKERN lenis epsilon =: epsilonlenis ;
% LIGKERN lenis eta =: etalenis ;
% LIGKERN lenis iota =: iotalenis ;
% LIGKERN lenis omicron =: omicronlenis ;
% LIGKERN lenis upsilon =: upsilonlenis ;
% LIGKERN lenis omega =: omegalenis ;
Different Meanings of “Encoding”
xxx.enc              enco-xxx.tex
/uni1F85             \definecharacter alpha 161
generic TeX          ConTeXt-specific
names vary           names uniform
hidden from user     accessible to user
How ConTeXt Accesses Characters:
1. ConTeXt “sees” symbolic name \greekalphadasia;
2. since it uses font encoding agr, looks up name in enco-agr.tex and finds that it corresponds to character 161;
3. puts box with dimensions of character 161 of font in current use into its output; takes care of kerning and ligatures;
4. TeX now reads map file and sees that current font is tied to xxx.enc;
5. character 161 in current font is named uni1F86; pdfTeX extracts shape of glyph with this name and puts it into box.
Schematic Representation of Character Use:
Named character
enco-agr.tex ⇒ 161
box with dimensions of char 161
my.enc ⇒ actual name of char 161
draw glyph of char 161
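The chain in the schematic amounts to two table lookups in a row. A toy model in Python, with each table reduced to the single entry quoted on the slides; the dictionary and function names are placeholders, not ConTeXt internals.

```python
# One entry per table, with the values quoted on the slides.
ENCO_AGR = {"greekalphadasia": 161}  # ConTeXt encoding: symbolic name -> slot
MY_ENC = {161: "uni1F86"}            # enc file: slot -> glyph name in the font


def resolve(symbolic_name):
    """Follow a named character to its slot and to the font's glyph name."""
    slot = ENCO_AGR[symbolic_name]  # steps 1-2: name to slot number
    glyph_name = MY_ENC[slot]       # steps 4-5: slot to glyph name
    return slot, glyph_name


print(resolve("greekalphadasia"))
```

Keeping the two tables apart is what lets the user-visible names stay uniform while the glyph names differ from font to font.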
Excerpt from unic-031.tex
\startunicodevector 31
\expandafter\strippedcsname
\ifcase\numexpr#1\relax
\greekalphapsili \or %1f00
\greekalphadasia \or
\greekalphapsilivaria \or
\greekalphadasiavaria \or
\greekalphapsilitonos \or
\greekalphadasiatonos \or
\greekalphapsiliperispomeni \or
\greekalphadasiaperispomeni \or
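The excerpt suggests how a Unicode character finds its symbolic name: the vector number selects a block of 256 code points (31 is hexadecimal 1F) and the `\ifcase` picks the entry by position within that block. The Python sketch below models that reading; the split into high and low byte is an assumption drawn from the `%1f00` comment, and `symbolic_name` is a hypothetical helper.

```python
# Names in the order of the \ifcase branches in the excerpt
# (positions 0 to 7, i.e. U+1F00 to U+1F07).
VECTOR_31 = [
    "greekalphapsili",
    "greekalphadasia",
    "greekalphapsilivaria",
    "greekalphadasiavaria",
    "greekalphapsilitonos",
    "greekalphadasiatonos",
    "greekalphapsiliperispomeni",
    "greekalphadasiaperispomeni",
]


def symbolic_name(code_point):
    """Return the symbolic name for a code point, or None if not covered."""
    vector, index = divmod(code_point, 256)  # high byte, low byte
    if vector == 31 and index < len(VECTOR_31):
        return VECTOR_31[index]
    return None


print(symbolic_name(0x1F01))
```

From the symbolic name onwards, Unicode input and ASCII input share the same path through the ConTeXt encoding and the enc file.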
Summing it up:
1. develop/use input method;
2. write encoding vectors for your fonts (xxx.enc);
3. extract tfms from your font files;
4. write map file(s), organize fonts in typescript file(s);
5. think about ConTeXt encoding (enco-xxx);
6. prepare unicode vector (unic-xxx);
7. prepare files for hyphenation;
8. write module with user interface.
Future Developments:
1. Implement support for XeTeX. Problem: XeTeX’s mechanism for choosing fonts is not entirely compatible with ConTeXt.
2. Port functionality of Gianfranco Boggio-Togna’s metre package. The code is pretty complex; this may take some time.
The metre package:
Metrical symbols taken from standard fonts (Latin Modern):
¯˘¯ ˘ ¯ ˘
× ◦◦
Additional editorial symbols, from math fonts: [ [ ] ]
Ability to stack these symbols on top of letters or on top of each other [not implemented yet]