3 Transliteration of Arabic
3.1 Romanization scheme
In my documents, I frequently need to transliterate and transcribe Arabic text in Latin characters. Various Romanization schemes exist, using dots, macrons, and other diacritics. The scheme I use for a few of my projects is tabulated below:
Arabic letter | Transliterated output | ASCII input |
---|---|---|
ء | ʾ | E |
ا | ā | A |
ب | b | b |
ت | t | t |
ث | t͡h | v |
ج | j | j |
ح | ḥ | H |
خ | k͡h | x |
د | d | d |
ذ | d͡h | p |
ر | r | r |
ز | z | z |
س | s | s |
ش | s͡h | c |
ص | ṣ | S |
ض | ḍ | D |
ط | ṭ | T |
ظ | ḍ͡h | P |
ع | ɛ | e |
غ | g͡h | g |
ف | f | f |
ق | q | q |
ك | k | k |
ل | l | l |
م | m | m |
ن | n | n |
ه | h | h |
و (C/V) | w/ū | w/U |
ي (C/V) | y/ī | y/I |
As you can see, I use digraphs (d͡h, g͡h, etc.) for some letters. This is because, for my current projects, I prefer readability over precision.
It is possible to input these special characters directly by modifying your keyboard layout or mapping, at either the operating system or the editor level. Andreas Hallberg has described a technique for inputting them in the vim editor here: https://andreasmhallberg.github.io/ergonomic-arabic-transcription/
For Quarto, I prefer to input the transliterated text as ASCII characters. I have written a Lua filter, transliteration-span.lua, to render them correctly. The mapping from ASCII input to transliterated output is shown in the table above and is encoded in the filter. So if I input:
[pahabtu maphaban]{.trn}
it is rendered as d͡hahabtu mad͡h·haban.
Note that the filter automatically inserts a dot character (·) between the digraph d͡h and the following h, so that the two can be told apart.
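To make the mapping concrete, here is a minimal Python sketch of the same idea. It is not the Lua filter itself, whose source is not shown here; the names `ASCII_TO_TRANSLIT` and `transliterate` are hypothetical. The table follows the ASCII input column above. The rule for the separator dot is an assumption generalized from the single example: a dot is inserted whenever any digraph is directly followed by a plain h.

```python
TIE = "\u0361"  # combining double inverted breve: the tie bar in the digraphs
DOT = "\u00b7"  # middle dot used as the disambiguating separator

# ASCII input -> transliterated output, following the table above.
# Letters not listed here (b, t, j, a, i, u, ...) pass through unchanged.
ASCII_TO_TRANSLIT = {
    "E": "\u02be",             # hamza
    "A": "\u0101",             # a with macron
    "v": "t" + TIE + "h",
    "H": "\u1e25",             # h with dot below
    "x": "k" + TIE + "h",
    "p": "d" + TIE + "h",
    "c": "s" + TIE + "h",
    "S": "\u1e63",             # s with dot below
    "D": "\u1e0d",             # d with dot below
    "T": "\u1e6d",             # t with dot below
    "P": "\u1e0d" + TIE + "h",
    "e": "\u025b",             # open e, used for ayn
    "g": "g" + TIE + "h",
    "U": "\u016b",             # u with macron
    "I": "\u012b",             # i with macron
}


def transliterate(text: str) -> str:
    """Map ASCII input to transliterated output, one character at a time."""
    out = []
    prev_was_digraph = False
    for ch in text:
        unit = ASCII_TO_TRANSLIT.get(ch, ch)
        # Assumed rule: separate a digraph from a directly following plain h.
        if prev_was_digraph and unit == "h":
            out.append(DOT)
        out.append(unit)
        prev_was_digraph = TIE in unit
    return "".join(out)


print(transliterate("pahabtu maphaban"))
```

The sketch works character by character because every ASCII key in the scheme is a single character, so no lookahead is needed on the input side.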
With {.trn} the output is in italic (as above). But sometimes I need non-italic output, as in the case of names. For that I use {.trn2}. For example:
[#eAEicah]{.trn2} and [E#Adam]{.trn2} are studying
the [#qurEAn]{.trn2} and [#HadIv]{.trn2}.
This is rendered as:
Ɛāʾis͡hah and ʾĀdam are studying the Qurʾān and Ḥadīt͡h.
Note how the hash character # is used to control capitalization: it capitalizes the letter that follows it.
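The capitalization marker can be sketched in the same way. This Python fragment is only an illustration under assumptions, not the filter's actual code: `TABLE` is a reduced, hypothetical mapping covering just the letters used in the example above, and `render` treats # as a flag that uppercases the first character of the next output unit. The separator dot is left out to keep the sketch short.

```python
# Reduced mapping: only the keys needed for the example words.
TABLE = {
    "E": "\u02be",     # hamza (has no uppercase form)
    "A": "\u0101",     # a with macron
    "e": "\u025b",     # open e, used for ayn
    "H": "\u1e25",     # h with dot below
    "I": "\u012b",     # i with macron
    "c": "s\u0361h",   # s-h digraph with tie bar
    "v": "t\u0361h",   # t-h digraph with tie bar
}


def render(text: str) -> str:
    """Transliterate ASCII input, capitalizing the unit that follows each '#'."""
    out = []
    capitalize_next = False
    for ch in text:
        if ch == "#":
            capitalize_next = True
            continue
        unit = TABLE.get(ch, ch)
        if capitalize_next:
            # Uppercase only the first character, so a digraph keeps its
            # tie bar and its lowercase second letter.
            unit = unit[0].upper() + unit[1:]
            capitalize_next = False
        out.append(unit)
    return "".join(out)


for word in ["#eAEicah", "E#Adam", "#qurEAn", "#HadIv"]:
    print(word, "->", render(word))
```

Placing # after the E in E#Adam matters because the hamza has no capital form; the marker has to sit directly before the letter that should carry the capital.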
3.2 Fonts
For the Latin font used in your main text, you will need to pick a font that supports the dots, macrons, breves, etc. needed for transliteration. For my transliteration scheme, the font will also need to support U+02BE for ʾ and U+025B for ɛ. Not all Latin fonts support these extra characters. In this document, I am using the Charis SIL font.
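When evaluating a font, it can help to have the full list of non-ASCII code points that the scheme produces. The Python sketch below derives that list from the test string in section 3.3, using only the standard library; `required_code_points` is a hypothetical helper name. The text is normalized to NFC first, which is an assumption about how the output is encoded. Capitalized forms such as Ɛ do not occur in the sample and would need to be added separately.

```python
import unicodedata

# Test string from section 3.3; it exercises every lowercase output form.
SAMPLE = "ʾabjd hwz ḥṭy klmn sɛfṣ qrs͡ht t͡hk͡hd͡h ḍḍ͡hg͡h āūī"


def required_code_points(text: str) -> list[str]:
    """Return the distinct non-ASCII characters in text, sorted by code point."""
    text = unicodedata.normalize("NFC", text)
    return sorted({ch for ch in text if ord(ch) > 0x7F})


for ch in required_code_points(SAMPLE):
    # Print the code point and its Unicode name, e.g. for checking a font's
    # coverage in a character map tool.
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")
```

The resulting list can then be compared against a candidate font's character coverage with whatever font inspection tool you prefer.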
Other fonts I have experimented with, that have varied support for these characters, are:
3.3 Test transliteration
ʾabjd hwz ḥṭy klmn sɛfṣ qrs͡ht t͡hk͡hd͡h ḍḍ͡hg͡h āūī
Dummy text