Typesetting a Non-Latin Script With ConTEXt: Greek

SLIDE 1

Typesetting a Non-Latin Script With ConTEXt: Greek Thomas A. Schmitz

ConTEXt user meeting, Epen, March 2007

SLIDE 2

Areas of Interest for Non-Latin Scripts:

  • Input;
  • Output.
SLIDE 3

Input Methods:

  • ASCII;
  • Unicode.
SLIDE 4

Converting ASCII into Greek:

>'Andra moi >'ennepe, Mo~usa, pol'utropon <`oc m'ala poll'a

⇒ Ἄνδρα μοι ἔννεπε, Μοῦσα, πολύτροπον ὃς μάλα πολλά
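The ASCII convention on this slide can be sketched in a few lines of Python. This is only an illustration, not the mechanism used in the talk, which relies on font encodings and ligatures. The function name `ascii_to_greek` is hypothetical; the letter assignments (`c` for final sigma, `j` for theta, `q` for chi, `w` for omega, `y` for psi) follow the font table shown later in the slides; and reading `>`, `<`, `'`, `` ` ``, `~` and `|` as smooth breathing, rough breathing, acute, grave, circumflex and iota subscript is an assumption inferred from the example line.

```python
import unicodedata

# Base letters in Greek alphabetical order; Latin keys as in the slides' font table.
LETTERS = dict(zip("abgdezhjiklmnxoprstufqyw", "αβγδεζηθικλμνξοπρστυφχψω"))
LETTERS.update({k.upper(): v.upper() for k, v in list(LETTERS.items())})
LETTERS["c"] = "ς"  # final sigma (added after the uppercase pass on purpose)

# Diacritics typed before the vowel, mapped to Unicode combining marks.
PREFIX_MARKS = {
    ">": "\u0313",  # smooth breathing
    "<": "\u0314",  # rough breathing
    "'": "\u0301",  # acute
    "`": "\u0300",  # grave
    "~": "\u0342",  # circumflex (perispomeni)
}
IOTA_SUBSCRIPT = "\u0345"  # typed as "|" after the vowel


def ascii_to_greek(text):
    """Convert the ASCII transliteration to precomposed Unicode Greek."""
    out = []
    pending = []  # combining marks waiting for their base letter
    for ch in text:
        if ch in PREFIX_MARKS:
            pending.append(PREFIX_MARKS[ch])
        elif ch == "|":
            out.append(IOTA_SUBSCRIPT)
        else:
            # Unknown characters (spaces, punctuation) pass through unchanged.
            out.append(LETTERS.get(ch, ch) + "".join(pending))
            pending.clear()
    out.extend(pending)
    # NFC composes base letter + marks into a single code point where one exists.
    return unicodedata.normalize("NFC", "".join(out))


print(ascii_to_greek(">'Andra moi >'ennepe, Mo~usa, pol'utropon <`oc m'ala poll'a"))
```

A converter like this treats every apostrophe as an acute accent, so it is only suitable for source text written entirely in this convention.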
SLIDE 5

Characteristics of ASCII Input

  • Portability across platforms and editors;
  • typing intuitive;
  • lack of visual feedback; need to compile your source.
SLIDE 6

Characteristics of Unicode Input

  • Immediate visual feedback: you see what you mean.
  • Portable in theory, but older platforms and some editors have problems handling Unicode;
  • display of Unicode characters can be unreliable and/or ugly on some platforms and editors;
  • proper keyboard driver may be difficult to find; variety of input methods;
  • support for Unicode in TEX is incomplete; at its base, TEX is still an 8-bit system.

SLIDE 7

Unicode in Vim 7.0 under Mac OS X

SLIDE 8

Unicode in emacs under Mac OS X

SLIDE 9

Unicode in Vim 7.0 under Fedora linux fc6

SLIDE 10

Unicode in emacs 23 (cvs) under Fedora linux fc6

SLIDE 11

Unicode in emacs 21.4 under debian linux

SLIDE 12

Characteristics of Unicode Input

  • Immediate visual feedback: you see what you mean.
  • Portable in theory, but older platforms and some editors have problems handling Unicode;
  • display of Unicode characters can be unreliable and/or ugly on some platforms and editors;
  • proper keyboard driver may be difficult to find; variety of input methods;
  • support for Unicode in TEX is incomplete; at its base, TEX is still an 8-bit system.

SLIDE 13

Using Fonts in TEX

.tfm

SLIDE 14

A .tfm File Converted to .pl Format

(CHARACTER O 100
   (CHARWD R 0.5)
   (CHARHT R 0.6449995)
   )
(CHARACTER C A
   (CHARWD R 0.65)
   (CHARHT R 0.6842785)
   )
(CHARACTER C B
   (CHARWD R 0.617)
   (CHARHT R 0.6842785)
   (CHARDP R 0.249085)
   )
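To make the structure of the property list concrete, here is a minimal Python sketch that reads the three entries above into a dictionary keyed by slot number. The function name `parse_pl_characters` is hypothetical, and the sketch handles only the `O` (octal) and `C` (character) notations and the `R` (real) values that appear in this excerpt; a full pl file contains more than this.

```python
import re

# One CHARACTER entry: slot notation, slot value, then any CHARWD/CHARHT/CHARDP items.
CHAR_RE = re.compile(
    r"\(CHARACTER\s+([OC])\s+(\S+)((?:\s*\(CHAR[A-Z]{2}\s+R\s+-?[\d.]+\))*)"
)
DIM_RE = re.compile(r"\((CHAR[A-Z]{2})\s+R\s+(-?[\d.]+)\)")


def parse_pl_characters(pl_text):
    """Return {slot: {property: value}} for the CHARACTER entries in pl_text."""
    chars = {}
    for kind, value, body in CHAR_RE.findall(pl_text):
        # "O 100" is octal 100 (decimal 64); "C A" is the code of the letter A.
        slot = int(value, 8) if kind == "O" else ord(value)
        chars[slot] = {name: float(num) for name, num in DIM_RE.findall(body)}
    return chars


EXCERPT = """
(CHARACTER O 100 (CHARWD R 0.5) (CHARHT R 0.6449995) )
(CHARACTER C A (CHARWD R 0.65) (CHARHT R 0.6842785) )
(CHARACTER C B (CHARWD R 0.617) (CHARHT R 0.6842785) (CHARDP R 0.249085) )
"""

metrics = parse_pl_characters(EXCERPT)
for slot in sorted(metrics):
    print(slot, metrics[slot])
```

The point of the exercise is that a tfm file carries only box dimensions per slot: nothing in it says what the glyph looks like or what it is called.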

SLIDE 15

Using Fonts in TEX

.tfm .map

SLIDE 16

Excerpt from a map file

GreekGentiumAlt <genaltagr.enc <GenAR102.TTF

SLIDE 17

Using Fonts in TEX

.tfm .map .enc

SLIDE 18

Excerpt from an enc file

/mu % 109
/nu % 110
/omicron % 111
/pi % 112
/chi % 113
/rho % 114
/sigma % 115
/tau % 116
/upsilon % 117
/uni1FB3 % 118
/omega % 119
/xi % 120
/psi % 121
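As a small illustration of what the encoding vector expresses, the following Python sketch turns the excerpt into a slot-to-glyph-name table. It is a simplification under a stated assumption: it trusts the numeric comments in the excerpt, whereas in a complete enc file the slot is given by a name's position in the array. The function name `parse_enc_excerpt` is hypothetical.

```python
import re


def parse_enc_excerpt(enc_text):
    """Map slot numbers to glyph names, using the '% nnn' comments as slot numbers."""
    return {int(slot): name for name, slot in re.findall(r"/(\S+)\s*%\s*(\d+)", enc_text)}


EXCERPT = """
/mu % 109 /nu % 110 /omicron % 111 /pi % 112 /chi % 113 /rho % 114
/sigma % 115 /tau % 116 /upsilon % 117 /uni1FB3 % 118 /omega % 119
/xi % 120 /psi % 121
"""

slots = parse_enc_excerpt(EXCERPT)
# Show which ASCII key selects which glyph, e.g. "q" for chi.
for slot in sorted(slots):
    print(chr(slot), slot, slots[slot])
```

Read this way, the excerpt shows how ASCII input reaches Greek glyphs: the slot of the ASCII letter `m` holds the glyph named `mu`, the slot of `q` holds `chi`, and so on.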

SLIDE 19

Using Fonts in TEX

.tfm .map .enc .pfb .pdf

SLIDE 20

Another Look at Using Fonts

.pfb .ttf .otf

SLIDE 21

Another Look at Using Fonts

font a font b font c

SLIDE 22

Another Look at Using Fonts

a.tfm b.tfm c.tfm font a font b font c a.enc b.enc c.enc pdf

SLIDE 23

Tools for Producing tfm Files

  • fontforge: multi-purpose tool for manipulating fonts;
  • afm2tfm: part of TEX installation, converts afm files to tfm format;
  • afm2pl: by Siep Kroonenberg; similar to afm2tfm, but produces pl which can then be converted by pltotf;
  • ttf2afm: converts truetype ttf to afm;
  • otftotfm: by Eddie Kohler, converts opentype otf to tfm.

SLIDE 24

Different Names for the Character ᾆ:

  • uni1F86
  • _F86
  • alphaiotasubleniscircumflex
  • alphatildelenisiota
  • e
SLIDE 25

Preparing a Font for Use with ConTEXt

1. Write enc file;
2. use this enc and proper tool to create tfm;
3. register names in map file;
4. first test run:

\loadmapfile[my.map]
\starttext
\showfont[myfont]
\stoptext

SLIDE 26

A Greek Font in ConTEXt

[Font table as produced by \showfont: each slot is shown with its glyph and its decimal, octal and hexadecimal code. Only part of the table survived extraction; the legible letter slots are listed below by decimal code.]

 65 Α   66 Β   69 Ε   70 Φ   71 Γ   72 Η   73 Ι   74 Θ   75 Κ   76 Λ   77 Μ
 78 Ν   79 Ο   80 Π   81 Χ   82 Ρ   83 Σ   84 Τ   85 Υ   88 Ξ   89 Ψ   90 Ζ

 97 α   98 β   99 ς  100 δ  101 ε  102 φ  103 γ  104 η  105 ι  106 θ  107 κ
108 λ  109 μ  110 ν  111 ο  112 π  113 χ  114 ρ  115 σ  116 τ  117 υ  119 ω
120 ξ  121 ψ  122 ζ

[Slots 128 to 191 hold accented and other composite characters; most of their glyphs are not legible in this copy.]

SLIDE 27

Beyond the Pure ASCII Range:

>~h| ⇒ ᾖ

SLIDE 28

Ligatures in the pl File:

>~h| ⇒ ᾖ

(LIGTABLE
   (LABEL O 76)
   (LIG C o O 321)
   (LIG C i O 204)
   (LIG C h O 272)
   (LIG C e O 231)
   (LIG C a O 242)
   (LIG O 176 O 222)
   (LIG O 140 O 226)
   (LIG O 47 O 224)
   (STOP)
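The ligature program above can be read as a lookup from a pair of slots to a replacement slot. The following Python sketch decodes it that way. It is a simplification for this excerpt only: the function name `parse_ligtable` is hypothetical, only `LABEL`, `LIG` and `STOP` with `O` (octal) and `C` (character) operands are handled, and each program is assumed to have a single label.

```python
import re


def _code(kind, value):
    # "O 76" is octal 76; "C h" is the code of the letter h.
    return int(value, 8) if kind == "O" else ord(value)


def parse_ligtable(pl_text):
    """Return {(first_slot, second_slot): ligature_slot} for simple LIG steps."""
    table = {}
    first = None
    # Matches only innermost items such as (LABEL O 76), (LIG C h O 272), (STOP).
    for keyword, args in re.findall(r"\(([A-Z]+)([^()]*)\)", pl_text):
        parts = args.split()
        if keyword == "LABEL":
            first = _code(parts[0], parts[1])
        elif keyword == "LIG" and first is not None:
            second = _code(parts[0], parts[1])
            table[(first, second)] = _code(parts[2], parts[3])
        elif keyword == "STOP":
            first = None
    return table


EXCERPT = """
(LIGTABLE (LABEL O 76) (LIG C o O 321) (LIG C i O 204) (LIG C h O 272)
(LIG C e O 231) (LIG C a O 242) (LIG O 176 O 222) (LIG O 140 O 226)
(LIG O 47 O 224) (STOP)
"""

ligatures = parse_ligtable(EXCERPT)
for (first, second), result in sorted(ligatures.items()):
    print(repr(chr(first)), repr(chr(second)), "->", result)
```

Octal 76 is the slot of `>`, so the program says what happens when `>` is followed by a vowel or by another diacritic key (`~`, `` ` ``, `'`): the pair is replaced by a single slot, which can in turn start its own ligature program.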

SLIDE 29

Ligatures in the lig File:

>~h| ⇒ ᾖ

% LIGKERN lenis guilsinglleft =: lenisgrave ;
% LIGKERN lenis guilsinglright =: lenisacute ;
% LIGKERN lenis uni1FC0 =: tildelenis ;
% LIGKERN lenis alpha =: alphalenis ;
% LIGKERN lenis epsilon =: epsilonlenis ;
% LIGKERN lenis eta =: etalenis ;
% LIGKERN lenis iota =: iotalenis ;
% LIGKERN lenis omicron =: omicronlenis ;
% LIGKERN lenis upsilon =: upsilonlenis ;
% LIGKERN lenis omega =: omegalenis ;

SLIDE 30

Different Meanings of “Encoding”

xxx.enc               enco-xxx.tex
/uni1F85              \definecharacter alpha 161
generic TEX           ConTEXt-specific
names vary            names uniform
hidden from user      accessible to user

SLIDE 31

How ConTEXt Accesses Characters:

1. ConTEXt “sees” symbolic name \greekalphadasia;
2. since it uses font encoding agr, looks up name in enco-agr.tex and finds that it corresponds to character 161;
3. puts box with dimensions of character 161 of font in current use into its output; takes care of kerning and ligatures;
4. TEX now reads map file and sees that current font is tied to xxx.enc;
5. character 161 in current font is named uni1F86; pdfTEX extracts shape of glyph with this name and puts it into box.
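The chain of lookups described in these steps can be mimicked with three small tables. This Python sketch is purely illustrative: the dictionaries stand in for enco-agr.tex, the tfm file and xxx.enc, the function name `resolve` is hypothetical, and the box dimensions are made-up values. Only the symbolic name, the slot number 161 and the glyph name are taken from the slide.

```python
# ConTEXt-level encoding (enco-agr.tex): symbolic name -> slot number.
ENCO_AGR = {"greekalphadasia": 161}

# Metrics from the tfm file: slot -> (width, height, depth). Values are invented.
TFM_METRICS = {161: (0.5, 0.65, 0.25)}

# Font-level encoding vector (xxx.enc): slot -> glyph name inside the font file.
ENC_VECTOR = {161: "uni1F86"}


def resolve(symbolic_name):
    """Follow a symbolic character name through the stages listed above."""
    slot = ENCO_AGR[symbolic_name]   # steps 1-2: name -> slot
    box = TFM_METRICS[slot]          # step 3: typesetting only needs the box
    glyph = ENC_VECTOR[slot]         # steps 4-5: glyph name is used at output time
    return {"slot": slot, "box": box, "glyph": glyph}


print(resolve("greekalphadasia"))
```

The separation matters in practice: the first table is what a ConTEXt user sees, while the last one differs from font to font and stays hidden.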

SLIDE 32

Schematic Representation of Character Use:

Named character

enco-agr.tex ⇒ 161

box with dimensions of char 161

my.enc ⇒ actual name of char 161

draw glyph of char 161

SLIDE 33

Excerpt from unic-031.tex

\startunicodevector 31
  \expandafter\strippedcsname\ifcase\numexpr#1\relax
    \greekalphapsili \or %1f00
    \greekalphadasia \or
    \greekalphapsilivaria \or
    \greekalphadasiavaria \or
    \greekalphapsilitonos \or
    \greekalphadasiatonos \or
    \greekalphapsiliperispomeni \or
    \greekalphadasiaperispomeni \or
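The excerpt amounts to an indexed list of symbolic names. A rough Python equivalent is sketched below, assuming that the vector number is the high byte of the code point (31 is hexadecimal 1F, which agrees with the `%1f00` comment) and that the `\ifcase` index is the low byte. The function name `symbolic_name` is hypothetical, and only the eight entries visible in the excerpt are included.

```python
# First entries of vector 31, in the order given in the excerpt.
VECTOR_31 = [
    "greekalphapsili",             # U+1F00
    "greekalphadasia",
    "greekalphapsilivaria",
    "greekalphadasiavaria",
    "greekalphapsilitonos",
    "greekalphadasiatonos",
    "greekalphapsiliperispomeni",
    "greekalphadasiaperispomeni",
]


def symbolic_name(code_point):
    """Return the symbolic name for a code point, or None if it is not covered here."""
    vector, index = divmod(code_point, 256)  # high byte selects the vector file
    if vector != 31 or index >= len(VECTOR_31):
        return None
    return VECTOR_31[index]


print(symbolic_name(0x1F01))
```

With such a vector in place, Unicode input and ASCII input end up at the same symbolic name, and everything after that point is shared.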

SLIDE 34

Summing it up:

1. develop/use input method;
2. write encoding vectors for your fonts (xxx.enc);
3. extract tfms from your font files;
4. write map file(s), organize fonts in typescript file(s);
5. think about ConTEXt encoding (enco-xxx);
6. prepare unicode vector (unic-xxx);
7. prepare files for hyphenation;
8. write module with user interface.

SLIDE 35

Future Developments:

1. Implement support for XeTEX. Problem: XeTEX’s mechanism for choosing fonts not entirely compatible with ConTEXt.
2. Port functionality of Gianfranco Boggio-Togna’s metre package. The code is pretty complex; this may take some time.

SLIDE 36

The metre package:

Metrical symbols taken from standard fonts (Latin Modern):

¯˘¯ ˘ ¯ ˘

× ◦◦

Additional editorial symbols, from math fonts: [ [ ] ]

Ability to stack these symbols on top of letters or on top of each other [not implemented yet]

SLIDE 37

Future Developments:

1. Implement support for XeTEX. Problem: XeTEX’s mechanism for choosing fonts not entirely compatible with ConTEXt.
2. Port functionality of Gianfranco Boggio-Togna’s metre package. The code is pretty complex; this may take some time.
3. The wonders of luaTEX!