Justification in Arabic typography

Arabic writing differs from Latin writing in several ways, some of which complicate the justification algorithms that can be used for Arabic. In particular, Arabic writing is cursive in print as well as in handwriting. Letters change shape according to their position in the word, the surrounding letters, and in some cases the word’s meaning (e.g., the word Allah (God), and the word Mohamed when it refers to the prophet Mohamed; see figure 3). The alternative forms are thus interdependent. The exit point of each glyph is attached to the entry point of the following glyph, and there is no hyphenation.

Figure 3: Special morphology

Ligatures

The cursive nature of Arabic writing implies, among other things, extensive use of ligatures (Haralambous 1995). Some ligatures are mandatory and obey grammatical and contextual rules (Haralambous 1995). Others are optional and exist only for aesthetic reasons, legibility, or justification. Moreover, the connection of letters can introduce implicit contextual ligatures. An explicit ligature is the fusion of two, three, or even more graphemes.

Generally, a ligature is narrower than the group of graphemes it fuses. In figure 4, the aesthetic ligature (right) is 9.65 pt wide, whereas the ligature produced by simple contextual substitutions (middle) is 14.75 pt wide.

Figure 4: Contextual and aesthetic transformations

Controlling ligature behavior by converting implicit ligatures into aesthetic ones gives some flexibility to fit a word to the space available on the line. The example in figure 5 shows three ligature levels: mandatory simple substitutions, aesthetic ligatures of the second degree, and finally, aesthetic ligatures of the third degree. The last two levels provide different possibilities for shrinking the same word.

Figure 5: Various levels of ligatures

The use of second and third degree aesthetic ligatures has to respect the constraints of legibility. A typesetting system should therefore offer three ligature levels. The first level contains only implicit contextual ligatures, together with mandatory grammatical ligatures such as the Lam-Alef ligature; it is recommended for textbooks or books for the general public, where confusion between letters and reading ambiguities must be avoided. At the second level, some liberty is possible and some aesthetic ligatures may be used. At the third level, aesthetic ligatures of higher degrees are allowed, and graphic expression is free.
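The three ligature levels can be read as a cap on how far a system may go when shrinking a word. The following is a minimal sketch in Python, not the method of any particular typesetting system. The function name `pick_ligature_level`, the `max_level` parameter, and the level 3 width are hypothetical; the two other widths are borrowed from the figure 4 measurements purely as sample data.

```python
from typing import Dict, Optional


def pick_ligature_level(widths_by_level: Dict[int, float],
                        available: float,
                        max_level: int) -> Optional[int]:
    """Return the lowest permitted ligature level whose width fits.

    widths_by_level maps a ligature level (1 = implicit contextual only,
    2 and 3 = aesthetic ligatures of higher degree) to the width in points
    of the word when set at that level. Returns None if no permitted
    level fits in the available width.
    """
    for level in sorted(widths_by_level):
        if level > max_level:
            break  # the kind of publication forbids higher levels
        if widths_by_level[level] <= available:
            return level  # lowest level that fits keeps legibility highest
    return None


# Sample data: 14.75 pt and 9.65 pt come from figure 4; 8.0 pt is invented.
SAMPLE_WIDTHS = {1: 14.75, 2: 9.65, 3: 8.0}
```

With these sample widths and 12 pt of available space, a level 1 publication (`max_level=1`) gets `None`, because the word does not fit at the only permitted level, whereas `max_level=2` selects level 2.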

The decision to use explicit ligatures to improve justification must take into account the graphic environment and the block regularity of the text. In calligraphy, using an aesthetic ligature once does not oblige the writer to use it for every occurrence in the text. In texts composed only of implicit ligatures, the justification problem can be handled with kashida.

The use of ligatures to justify lines is not limited to Arabic writing. Adolf Wild, conservator of the Gutenberg Museum in Mainz, examined the Gutenberg Bible from a typographical point of view and determined that, at the line level, Gutenberg in some cases justified the text by using ligatures instead of the variable spaces we are familiar with today (Wild 1995).

Kashida

The connections between Arabic letters are curvilinear bridges, and they are extensible. This property (called kashida, tamdid, madda, maT, tTwil, or aliTalat, among other names) is a feature of Arabic script that is rarely found or preserved in other writing systems (see figure 6). Kashida is used in various circumstances for different purposes:

  • emphasis: to stress an important word or to correspond to phonetic inflection;
  • legibility: to find a better letter layout on the baseline, and to correct the cluttering that can appear at the joint between two successive letters in the same word;
  • aesthetics: to embellish a word;
  • justification: to justify a text.

Kashida is not a character in itself. It stretches some parts of a letter while the body of the letter is kept rigid. The example in figure 6 shows compositions of the Arabic word Mohamed. The arrows indicate the use of kashida at two extensibility levels.

Figure 6: Various curvilinear kashida

There are three kinds of stretching: mandatory, allowed, and prohibited. The typographical strength of a text can be determined, among other factors, by whether it respects mandatory stretching and/or eschews prohibited stretching.

For Arabic text justification, kashida is a typographical effect that lengthens letters at carefully selected points on the line, within determined parameters, so that the paragraph can be justified. The Arabic term tansil refers to selecting good places to insert kashida.
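The selection of stretching points can be sketched as a small allocation problem. The Python code below is only an illustration under stated assumptions: the names `Kind`, `StretchPoint`, and `distribute_kashida`, the per-point `max_stretch` bound, and the `priority` ranking are all hypothetical, and serving mandatory points before allowed ones is one possible policy, not a rule taken from the calligraphy treatises.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Tuple


class Kind(Enum):
    MANDATORY = "mandatory"
    ALLOWED = "allowed"
    PROHIBITED = "prohibited"


@dataclass
class StretchPoint:
    index: int          # position of the joint on the line (hypothetical)
    kind: Kind          # one of the three kinds of stretching
    max_stretch: float  # upper bound in points (hypothetical parameter)
    priority: int = 0   # lower value = preferred place (hypothetical)


def distribute_kashida(points: List[StretchPoint],
                       shortfall: float) -> Tuple[Dict[int, float], float]:
    """Greedily spread the line shortfall over the usable points.

    Returns the stretch assigned to each joint index and the part of the
    shortfall that could not be absorbed.
    """
    # Prohibited points are never stretched.
    usable = [p for p in points if p.kind is not Kind.PROHIBITED]
    # Assumed policy: mandatory points first, then allowed ones by priority.
    usable.sort(key=lambda p: (p.kind is not Kind.MANDATORY, p.priority))
    stretches: Dict[int, float] = {}
    remaining = shortfall
    for point in usable:
        if remaining <= 0:
            break
        amount = min(point.max_stretch, remaining)
        stretches[point.index] = amount
        remaining -= amount
    return stretches, remaining
```

A non-zero second return value signals that kashida alone cannot justify the line, so another mechanism (ligatures or inter-word space) would have to absorb the rest.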

Current typesetting systems: Curvilinear kashida is generally still beyond what most typesetting systems can offer. As we have seen, kashida is not a character in itself, but an elongation of some letter parts. To implement kashida, most typesetting systems insert rectilinear segments between letters, and the resulting typographical quality is poor. For lack of adequate tools, the usual solution is to insert a glyph: rather than computing parameterized Bézier curves in real time, a ready-to-use glyph is inserted. Moreover, whenever stretching is performed by means of a parameterized glyph coming from an external dynamic font, the current font context is changed.
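In Unicode text, one common way to obtain such rectilinear segments is to insert the character U+0640 ARABIC TATWEEL at a joint. The Python sketch below shows this approach only to make its limits concrete. The function name `stretch_with_tatweel` and its parameters are hypothetical, the tatweel width would in practice come from the font, and the caller is assumed to have checked that the two letters around the joint actually connect.

```python
TATWEEL = "\u0640"  # ARABIC TATWEEL, a straight baseline extender


def stretch_with_tatweel(word: str, joint: int,
                         extra_width: float, tatweel_width: float) -> str:
    """Insert as many tatweel characters after word[joint] as fit.

    extra_width is the space to fill and tatweel_width the advance width
    of one tatweel glyph, both in points. The result can only grow in
    whole multiples of tatweel_width, and the inserted segment is
    straight rather than curvilinear.
    """
    if tatweel_width <= 0:
        raise ValueError("tatweel_width must be positive")
    count = int(extra_width // tatweel_width)
    return word[:joint + 1] + TATWEEL * count + word[joint + 1:]


# The word Mohamed: Meem, Hah, Meem, Dal.
MOHAMED = "\u0645\u062d\u0645\u062f"
```

For instance, `stretch_with_tatweel(MOHAMED, 2, 7.0, 2.0)` inserts three tatweel characters between the second Meem and the Dal and leaves 1 pt unfilled, which illustrates the coarse granularity of this method.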

Curvilinear extensibility of glyphs can be offered through the a priori generation of curvilinear glyphs for some predefined sizes. Beyond these sizes, the system combines a curvilinear primitive with linear fragments. Of course, this violates the curvilinear shape of letters and symbols when they are extended greatly.
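The lookup implied by this scheme can be sketched as follows. This Python fragment is an assumption-laden illustration: the list `PREGENERATED_SIZES`, the function name `plan_kashida`, and the choice of rounding down to the largest pre-generated size that does not exceed the request are all hypothetical.

```python
from bisect import bisect_right
from typing import Dict, Sequence

# Hypothetical stretch sizes (in points) for which curvilinear glyphs
# were generated in advance. Must be sorted in increasing order.
PREGENERATED_SIZES = [2.0, 4.0, 6.0, 8.0]


def plan_kashida(stretch: float,
                 sizes: Sequence[float] = PREGENERATED_SIZES) -> Dict[str, float]:
    """Split a requested stretch into a curvilinear and a linear part."""
    if stretch < sizes[0]:
        # Too small for any pre-generated glyph: no kashida at this joint.
        return {"curvilinear": 0.0, "linear": 0.0}
    if stretch <= sizes[-1]:
        # Largest pre-generated curvilinear glyph not exceeding the request.
        chosen = sizes[bisect_right(sizes, stretch) - 1]
        return {"curvilinear": chosen, "linear": 0.0}
    # Beyond the predefined sizes: largest curvilinear primitive plus
    # a straight fragment for the rest, which degrades the curve.
    return {"curvilinear": sizes[-1], "linear": stretch - sizes[-1]}
```

Under these assumptions, a request of 5 pt uses the 4 pt curvilinear glyph, while a request of 11 pt needs 3 pt of linear filling on top of the 8 pt glyph.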

A better approach consists of building a dynamic font (Berry 1999; Lazrek 2003; Sherif and Fahmy 2008; Bayar and Sami 2009) through parameterizations of the composition procedure of each letter. To handle the elongations, a letter is decomposed into two parts: the static body of the letter and its dynamic part, capable of stretching. The introduced parameters indicate the extensibility size or degree.
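The decomposition into a static body and a dynamic part can be illustrated with a cubic Bézier curve whose control points depend on the stretch parameter. The Python sketch below is not taken from any of the cited dynamic fonts: the class `DynamicLetter`, the placement of the control points, and the `depth` parameter are hypothetical choices made only to show the idea.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float]


@dataclass
class DynamicLetter:
    name: str
    body_width: float  # static body: never scaled or deformed

    def connection_curve(self, stretch: float, depth: float) -> List[Point]:
        """Control points of the stretchable part for a given stretch.

        The curve runs from the exit point of the body at (0, 0) to
        (stretch, 0) and dips below the baseline by an amount governed
        by depth.
        """
        return [(0.0, 0.0), (stretch / 3, -depth),
                (2 * stretch / 3, -depth), (stretch, 0.0)]

    def advance(self, stretch: float) -> float:
        """Total width: rigid body plus the dynamic part."""
        return self.body_width + stretch


def bezier_point(ctrl: List[Point], t: float) -> Point:
    """Evaluate a cubic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    weights = (u ** 3, 3 * u * u * t, 3 * u * t * t, t ** 3)
    x = sum(w * p[0] for w, p in zip(weights, ctrl))
    y = sum(w * p[1] for w, p in zip(weights, ctrl))
    return (x, y)
```

Because the curve is recomputed from the parameter instead of being scaled, the body keeps its proportions at any stretch, which is the property that horizontal scaling or linear fragments cannot provide.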

Allographs

Essentially, each Arabic letter has up to four different forms: isolated, initial, median, and final. Allographs are the various shapes that a letter can have while keeping its place in the word. For instance, the initial form of the letter Beh can have, in the same calligraphic style, more than one shape. Allometry is the study of the allograph phenomenon: shape, position, context, etc. Generally, allographs are chosen by the writer for aesthetic reasons. However, in Arabic calligraphy, an allograph is sometimes desired and even recommended. The shapes of letters may change according to the nature of the neighboring letters, and in some cases according to the presence of kashida. Some rules governing the use of allographs are:

  • the shape of the median form of the letter Beh should be more acute when it comes between two spine letters;
  • the initial form of the letter Beh can take one of three allograph shapes, determined by the letter that follows;
  • the initial form of the letter Hah should be a lawzi Hah if it precedes an ascending letter;
  • the initial form of the letter Ain should be a finjani Ain if it precedes an ascending letter;
  • the initial form of the letter Hah, as well as the final form of the letter Meem, change their morphologies in the presence of kashida;
  • the letter Kaf changes its morphological shape when stretched and should be changed into a zinadi Kaf.
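Rules of this kind lend themselves to a simple lookup on the letter, its positional form, and its context. The Python sketch below encodes only three of the rules above and is an illustration, not a complete model. The function name `choose_allograph`, the Latin letter names used as keys, the `"default"` return value, and in particular the contents of the `ASCENDING` set are assumptions.

```python
from typing import Optional

# Assumption: which letters count as "ascending" is a guess made for
# this example; the rules above do not enumerate them.
ASCENDING = {"alef", "lam"}


def choose_allograph(letter: str, form: str,
                     next_letter: Optional[str] = None,
                     stretched: bool = False) -> str:
    """Return the name of the allograph to use, or "default"."""
    # A stretched Kaf becomes a zinadi Kaf.
    if letter == "kaf" and stretched:
        return "zinadi"
    # Initial Hah or Ain before an ascending letter.
    if form == "initial" and next_letter in ASCENDING:
        if letter == "hah":
            return "lawzi"
        if letter == "ain":
            return "finjani"
    return "default"
```

Since the presence of kashida is itself an input (`stretched`), allograph selection has to be revisited after the justification step decides where to stretch.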

Back to Zapf and Thành

Adapting the approaches used to improve the justification of Latin text to Arabic typography does not seem to be the best solution. Zapf and Thành reasoned on the basis of Latin typography, where glyphs are individual and there is no tool similar to kashida for justifying lines. Indeed, kashida is not a simple horizontal scaling that enlarges letter width. In some cases, applying kashida to a letter can completely change the morphological shape of its glyph (see figure 7). The use of kashida is governed by rules and customs inspired by manuals and treatises on Arabic calligraphy (Benatia, Elyaakoubi, and Lazrek 2006).

Figure 7: Arabic stretching letter

Diacritic marks

A diacritical mark is a sign added to a letter, like the acute accent on the letter e, which produces é. Diacritics are placed above or below letters, or in some other position such as within the letter or between two letters. In many scripts, diacritical marks play phonetic and linguistic roles: they change the sound value of the letter to which they are added. In other alphabetic systems, however, diacritics may perform other functions. For example, Arabic vocalization marks indicate short vowels applied to consonantal base letters. This vocalization is sometimes omitted altogether in writing.

Arabic diacritical marks have an additional typographical role, which is to fill the void produced by position and juxtaposition of letters on the line. This task is influenced by the effects of Arabic text justification. If kashida is used to manage Arabic lines, it influences the positioning and the measurements of the Arabic diacritical marks (Hssini, Lazrek, and Benatia 2008) (see figure 8). Additionally, the presence of diacritical marks either above or below glyphs adds a vertical inter-line space that should be taken into account in vertical adjustments. Thus, the vertical document adjustment is more delicate.

In this paper, we consider text without taking the effect of vocalization into account, and we address only horizontal justification. Vertical justification, which is also an issue in Arabic, is not considered.

Figure 8: Diacritical marks with various sizes