Preface

The Arabic fonts in this article may not render correctly in the Chrome and Chromium browsers. In this case please use Firefox or Safari instead.

بسم الله الرحمن الرحيم

الحمد لله والصلاة والسلام على نبينا محمد. أما بعد:

1 Introduction

The rules for writing hamza in the Arabic script are quite complex. The matter is further complicated by the limitations of metal and digital font technology. We attempt to give a comprehensive set of rules for writing hamza, together with some recommendations for typographers and typesetters.

This article is based on findings from the author’s research. Any errors reported as issues at https://github.com/adamiturabi/hamza-rules will be greatly appreciated.

2 Scope

We treat only the orthography of Standard Arabic. The orthography of the Qurʾān (الرسم العثماني) is not discussed.

3 Seats of hamza

Hamza is written in four different ways:

  1. Seated on an ʾalif: أ or إ
  2. Seated on a wāw: ؤ
  3. Seated on a yāʾ: ئ
  4. Unseated: ء

Here are some notes on writing hamza in the above four ways:

  • When unseated hamza comes between two letters that are joined, then it is written above the line that joins them, for example: خَطِيءَة k͡haṭīʾaḧ. In this word, the yāʾ ي joins to the tāʾ marbūṭaḧ ة.

    As a special case, when unseated hamza comes between joined lām and ʾalif (لا), it is positioned between them thus: لءا. (In most cases, this is replaced with لآ, as we explain in the next point below.) This is different from hamza seated on the ʾalif following the lām: لأ.

  • When unseated hamza is followed by an ʾalif: ءا, the combination of hamza and ʾalif is usually written as آ as a convention. Examples: آمَنَ ʾāmana, ظَمْآن ḍ͡hamʾān, شَنَآن s͡hanaʾān. However, when the ʾalif is a suffix or part of a suffix, or the hamza is doubled, or there is an ʾalif before the hamza then we will write ءا, not آ. Examples: شَيْءَانِ s͡hayʾāni, سَءَّال saʾʾāl, قِرَاءَات qirāʾāt.

  • When hamza is seated on ʾalif, if it has a kasraḧ, it is written below the ʾalif: إِ. Otherwise, it is written above the ʾalif: أَ, أُ, أْ.

  • When hamza is seated on yāʾ ئ the dots of the yāʾ are no longer written. Here’s how it will appear in different positions:

    Isolated End Middle Beginning
    ئ ـئ ـئـ ئـ

    Note that hamza seated on yāʾ in the middle position ـئـ is different from unseated hamza between two joining letters ـءـ.
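For readers who work with digital text, the written forms above correspond to distinct Unicode code points. The following Python sketch is an illustration only and not part of the orthographic rules (the names `HAMZA_FORMS` and `describe` are hypothetical). It lists each form and uses canonical decomposition to show that the seated forms are encoded as a seat letter plus a combining mark, whereas the unseated hamza stands alone.

```python
import unicodedata

# The written forms discussed in this section, as single code points:
# unseated, on alif (above), on alif (below), on waw, on ya, and alif with madda.
HAMZA_FORMS = ["\u0621", "\u0623", "\u0625", "\u0624", "\u0626", "\u0622"]

def describe(ch):
    """Return (code point, Unicode name, canonical decomposition) for ch."""
    decomposed = unicodedata.normalize("NFD", ch)
    parts = " + ".join(f"U+{ord(c):04X}" for c in decomposed)
    return f"U+{ord(ch):04X}", unicodedata.name(ch), parts

for form in HAMZA_FORMS:
    print(*describe(form), sep=" | ")
```

The decomposition column is useful when comparing or searching text, because some software stores the seated forms in decomposed form.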

So how do we know when to write hamza unseated and when seated? And how do we choose between its three different seats? There is a set of rules that we need to follow in order to write hamza correctly.

4 Rules for determining the seat of hamza

4.1 Without prefixes and suffixes

We will first learn how to determine the seat of hamza for a word without any prefix or suffix.

Hamza can occur in three positions in a word:

  1. At the beginning of the word
  2. In the middle of the word
  3. At the end of the word

We will treat each of these positions below.

4.1.1 At the beginning of the word

When hamza occurs at the beginning of a word, then:

  1. If the hamza carries a long-ā vowel, it is written unseated followed by an ʾalif and written as آ, for example آمَنَ ʾāmana.
  2. If the hamza carries any other vowel, it is written seated on an ʾalif, and is marked with the appropriate vowel mark, for example أَسْلَمَ ʾaslama, أُرِيدُ ʾurīdu, إِسْلَام ʾislām, إِيمَان ʾīmān, أُوخِذَ ʾūk͡hid͡ha.
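The two cases above can be sketched as a small Python function. This is a minimal illustration under stated assumptions: the vowel is passed in as a transliterated string, the function name `initial_hamza` is hypothetical, and the choice of hamza below the ʾalif for i and ī follows the note on kasraḧ in section 3.

```python
ALIF_MADDA = "\u0622"     # unseated hamza + alif, written as one letter
ON_ALIF_ABOVE = "\u0623"  # hamza above alif
ON_ALIF_BELOW = "\u0625"  # hamza below alif

def initial_hamza(vowel):
    """Return the written form of word-initial hamza for its vowel."""
    if vowel == "ā":
        return ALIF_MADDA
    if vowel in ("i", "ī"):
        # A kasrah places the hamza below the alif.
        return ON_ALIF_BELOW
    if vowel in ("a", "u", "ū"):
        return ON_ALIF_ABOVE
    raise ValueError(f"unexpected vowel: {vowel!r}")
```

For example, `initial_hamza("ā")` gives the first letter of ʾāmana and `initial_hamza("ī")` gives the first letter of ʾīmān.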

4.1.2 In the middle of the word

The most general case is when hamza is in the middle of a word.

Arabic has three short vowels, three long vowels, two diphthongs, and a sukūn. Each of these has an order of precedence and a hamza seat.

Precedence Vowel Seated hamza
1. ī/ay ء
2. i ئ
3. ū/aw ء
4. u ؤ
5. ā ء
6. a أ
7. ◌ْ ء

Main rule: Disregard any doubling mark ◌ّ and consider the vowel on the consonant before the hamza and the shortened vowel on the hamza itself. Determine which of the two vowels wins by being higher in precedence in the above table. The winning vowel’s seat will be the seat of the hamza.

Sub-rule: If the main rule determines that hamza is to be seated on ʾalif, and there is a long ā vowel on the hamza using an ʾalif, then hamza shall be unseated. And the combination of ءَا will usually be written as آ.

Examples:

Word Vowel on consonant before hamza Shortened vowel on hamza Winning vowel Seated hamza
هَيْءَة hayʾaḧ ay a ay ء
خَطِيءَة k͡haṭīʾaḧ ī a ī ء
اسْتِيءَاس ʾistīʾās ī a ī ء (Exception: ءَا is not written as آ when the preceding vowel is ī.)
تَوْءَم tawʾam aw a aw ء
سَائِل sāʾil ā i i ئ
تَسَاؤُل tasāʾul ā u u ؤ
تَسَاءَلَ tasāʾala ā a ā ء
قِرَاءَات qirāʾāt ā a ā ء
نُوآنٌ nūʾānun ū a ū ء
مَسْؤُول masʾūl ◌ْ u u ؤ
تَرْئِيس tarʾīs ◌ْ i i ئ
مِرْآة mirʾāḧ ◌ْ a a ء (Using sub-rule.)
ظَمْآن ḍ͡hamʾān ◌ْ a a ء (Using sub-rule.)
مَسْأَلَة masʾalaḧ ◌ْ a a أ
الْمَرْأَة almarʾaḧ ◌ْ a a أ
بِئْسَ biʾsa i ◌ْ i ئ
سُؤْل suʾl u ◌ْ u ؤ
کَأْس kaʾs a ◌ْ a أ
سُئِلَ suʾila u i i ئ
يَئِسَ yaʾisa a i i ئ
رَئِيس raʾīs a i i ئ
سُؤَال suʾāl u a u ؤ
رُؤُوس ruʾūs u u u ؤ
لُؤَيّ luʾayy u a u ؤ
شَنَآن s͡hanaʾān a a a ء (Using sub-rule.)
سَأَلَ saʾala a a a أ
رَأَىٰ raʾā a a a أ (Sub-rule doesn’t apply because ā vowel at end represented by ىٰ, not ʾalif.)
رَأَّسَ raʾʾasa a a a أ
يُرَئِّسُ yuraʾʾisu a i i ئ
رُئِّسَ ruʾʾisa u i i ئ
تَفَؤُّل tafaʾʾul a u u ؤ
سَءَّال saʾʾāl a a a ء (Using sub-rule.)
لَءَّال laʾʾāl a a a ء (Using sub-rule.)
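The main rule and sub-rule of this section can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: vowels are given in transliteration, an empty string stands for sukūn, and the names `PRECEDENCE`, `SHORTEN`, `middle_seat` and the `alif_follows` flag are hypothetical. Treating a diphthong on the hamza as a shortened a follows the luʾayy row of the examples table. Any doubling mark is ignored simply by not passing it in.

```python
UNSEATED = "\u0621"  # hamza with no seat
ON_ALIF = "\u0623"   # hamza on alif
ON_WAW = "\u0624"    # hamza on waw
ON_YA = "\u0626"     # hamza on ya

# Rank (1 = highest precedence) and seat for each vowel; "" stands for sukun.
PRECEDENCE = {
    "ī": (1, UNSEATED), "ay": (1, UNSEATED),
    "i": (2, ON_YA),
    "ū": (3, UNSEATED), "aw": (3, UNSEATED),
    "u": (4, ON_WAW),
    "ā": (5, UNSEATED),
    "a": (6, ON_ALIF),
    "": (7, UNSEATED),
}

# The vowel on the hamza itself is shortened before comparison.
SHORTEN = {"ā": "a", "ī": "i", "ū": "u", "ay": "a", "aw": "a"}

def middle_seat(before, on_hamza, alif_follows=True):
    """Seat of a word-medial hamza.

    before       -- vowel on the consonant before the hamza
    on_hamza     -- vowel on the hamza (long or short)
    alif_follows -- whether a long ā on the hamza is written with an alif
    """
    short = SHORTEN.get(on_hamza, on_hamza)
    # Main rule: the vowel with the higher precedence (lower rank) wins.
    _, seat = min(PRECEDENCE[before], PRECEDENCE[short])
    # Sub-rule: an alif seat followed by a long-ā alif becomes unseated.
    if seat == ON_ALIF and on_hamza == "ā" and alif_follows:
        return UNSEATED
    return seat
```

For instance, `middle_seat("", "ū")` reproduces the masʾūl row, and `middle_seat("a", "ā", alif_follows=False)` reproduces the raʾā row, where the sub-rule does not apply.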

4.1.3 At the end of the word

When hamza occurs at the end of a word, disregard the vowel on the hamza itself, and consider only the vowel on the preceding consonant. Look it up in the precedence table above to determine the seat of hamza.

Word Vowel on consonant before hamza Seated hamza
دُعَاءُ duɛāʾu ā ء
سُوءُ sūʾu ū ء
جِيءَ jīʾa ī ء
ضَوْءَ ḍawʾa aw ء
شَيْءَ s͡hayʾa ay ء
بُطْءُ buṭʾu ◌ْ ء
عِبْءُ ɛibʾu ◌ْ ء
شَطْءُ s͡haṭʾu ◌ْ ء
يُهَدِّئُ yuhaddiʾu i ئ
سَيِّئُ sayyiʾu i ئ
بَطُؤَ baṭuʾa u ؤ
يَهْدَأُ yahdaʾu a أ
مُبْتَدَإِ mubtadaʾi a إ

The exception to this rule is when the previous letter is a doubled wāw with a ḍammaḧ. In this case the hamza will again be unseated. Example: تَبَوُّءُ tabawwuʾu.

Note also that مُبْتَدَإِ mubtadaʾi can be written with the hamza below the ʾalif because of the i-mark on the hamza. But it is also common to write it as مُبْتَدَأ mubtadaʾ, especially when the hamza is unvoweled.
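The rule for final hamza, together with its exception and the optional hamza below the ʾalif, can be sketched in Python. This is a minimal illustration: the names `FINAL_SEAT` and `final_seat` and both keyword parameters are hypothetical, vowels are given in transliteration, and an empty string stands for sukūn.

```python
UNSEATED = "\u0621"    # hamza with no seat
ON_ALIF = "\u0623"     # hamza above alif
UNDER_ALIF = "\u0625"  # hamza below alif
ON_WAW = "\u0624"      # hamza on waw
ON_YA = "\u0626"       # hamza on ya

# Seat chosen from the vowel on the consonant before the hamza.
FINAL_SEAT = {
    "ī": UNSEATED, "ay": UNSEATED, "ū": UNSEATED, "aw": UNSEATED,
    "ā": UNSEATED, "": UNSEATED,
    "i": ON_YA, "u": ON_WAW, "a": ON_ALIF,
}

def final_seat(before, doubled_waw=False, hamza_vowel=""):
    """Seat of a word-final hamza.

    before      -- vowel on the consonant before the hamza
    doubled_waw -- True if that consonant is a doubled waw
    hamza_vowel -- vowel on the hamza, used only for the optional
                   hamza-below-alif spelling
    """
    # Exception: doubled waw with a dammah keeps the hamza unseated.
    if before == "u" and doubled_waw:
        return UNSEATED
    seat = FINAL_SEAT[before]
    # Optional: an i-mark on the hamza moves it below the alif.
    if seat == ON_ALIF and hamza_vowel == "i":
        return UNDER_ALIF
    return seat
```

Here `final_seat("u", doubled_waw=True)` corresponds to tabawwuʾu, and `final_seat("a", hamza_vowel="i")` corresponds to mubtadaʾi.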

4.2 Prefixes and suffixes

4.2.1 Prefixes

If hamza is at the beginning of a word, adding a prefix to the word will not alter the writing of the hamza. Examples:

  • لِ + أُسْتَاذِ = لِأُسْتَاذِ
  • الْ + آخِرَة = الْآخِرَة

4.2.2 Suffixes

If hamza is at the end of a word, adding a suffix to the word can, in general, alter the writing of the hamza. Hamza is now, generally, treated as if it is in the middle of the word, and the rules for hamza in the middle of a word apply. Examples:

Word Vowel on consonant before hamza Shortened vowel on hamza Winning vowel Seated hamza
بَرِيءُونَ barīʾūna ī u ī ء
بَرِيءَانِ barīʾāni ī a ī ء
بَرِيءِينَ barīʾīna ī i ī ء
بَرِيءَيْنِ barīʾayni ī a ī ء
شَيْءُهُ s͡hayʾuhu ay u ay ء
شَيْءَهُ s͡hayʾahu ay a ay ء
شَيْءِهِ s͡hayʾihi ay i ay ء
شَيْءَانِ s͡hayʾāni ay a ay ء
شَيْءَيْنِ s͡hayʾayni ay a ay ء
مَجِيءُهُ majīʾuhu ī u ī ء
مَجِيءَهُ majīʾahu ī a ī ء
مَجِيءِهِ majīʾihi ī i ī ء
سُوئِهِ sūʾihi ū i i ئ
ضَوْئِهِ ḍawʾihi aw i i ئ
سُوءُهُ sūʾuhu ū u ū ء
سُوءَهُ sūʾahu ū a ū ء
سُوءَانِ sūʾāni ū a ū ء
ضَوْءَهُ ḍawʾahu aw a aw ء
ضَوْءَانِ ḍawʾāni aw a aw ء
مُتَّکِئِينَ muttakiʾīna i i i ئ
يُبَرِّئُونَ yubarriʾūna i u i ئ
يُبَرَّؤُونَ yubarraʾūna a u u ؤ

There are some exceptions:

  • If the letter before the hamza has a sukūn and is not wāw or yāʾ, then the hamza will be written unseated. Examples:

    • جُزْءَانِ juzʾāni
    • عِبْءَانِ ɛibʾāni
    • عِبْءَيْنِ ɛibʾayni
    • بُطْءَهُ buṭʾahu
    • بُطْءُهُ buṭʾuhu
    • بُطْءِهِ buṭʾihi
  • If the hamza is after a long-ū vowel or an aw diphthong, and the hamza does not have an i-mark, then it is written unseated. Example:

    • يَسُوءُونَ yasūʾūna

    This is actually an acceptable variant in the general case, as we will discuss below in section 4.4.

(انِ, يْنِ, هُ, and هِ are suffixes.) Note that the combination ءا is not written as آ when the ʾalif is part of the suffix.

4.3 tanwīn on final hamza

tanwīn on a final hamza does not affect the writing of the hamza except in the case of tanwīn al-fat·ḥ. When writing tanwīn al-fat·ḥ on a hamza at the end of a word:

  1. If there is an ʾalif before an unseated hamza اء, then we don’t add a silent ʾalif when writing tanwīn al-fat·ḥ. For example دَاء becomes دَاءً dāʾan, not دَاءًا.

  2. Otherwise, we add the silent ʾalif after the hamza so that the hamza is now in the middle of the word with a suffix ʾalif after it. We now pretend that the hamza has a fat·ḥaḧ and that the ʾalif after it is a long-ā vowel. Then we go through the rules for writing hamza in the middle of a word (given above) to determine how hamza will be written. We then write the an-mark on the hamza. Examples:

  • مُبْتَدَأ becomes مُبْتَدَأٌ، مُبْتَدَءًا، مُبْتَدَإٍ
  • مَلْجَأ becomes مَلْجَأٌ، مَلْجَءًا، مَلْجَإٍ
  • جُزْء becomes جُزْءٌ، جُزْءًا، جُزْءٍ
  • شَيْء becomes شَيْءٌ، شَيْءًا، شَيْءٍ
  • سَيِّئ becomes سَيِّئٌ، سَيِّئًا، سَيِّئٍ
  • تَبَوُّء becomes تَبَوُّءٌ، تَبَوُّءًا، تَبَوُّءٍ
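The two cases for tanwīn al-fat·ḥ can be sketched as a string operation in Python. This is an illustration under stated assumptions: the input is a vocalised word that ends in a hamza with no vowel mark on it, and the function name `add_tanwin_fath` is hypothetical. It relies on an observation drawn from the examples above: a final hamza seated on ʾalif becomes unseated once the suffix ʾalif is added (the sub-rule for the middle of a word), while the other final forms keep their shape.

```python
import unicodedata

UNSEATED = "\u0621"                 # hamza with no seat
ALIF = "\u0627"                     # plain alif
FATHATAN = "\u064B"                 # the an-mark (tanwin al-fath)
ALIF_SEATS = ("\u0623", "\u0625")   # hamza above / below alif
HAMZA_FORMS = (UNSEATED, "\u0624", "\u0626") + ALIF_SEATS

def add_tanwin_fath(word):
    """Add tanwin al-fath to a word ending in an unvoweled hamza."""
    if not word or word[-1] not in HAMZA_FORMS:
        raise ValueError("word must end in a hamza")
    stem, hamza = word[:-1], word[-1]
    # Base letter before the hamza, skipping vowel marks.
    prev = next(
        (c for c in reversed(stem) if unicodedata.category(c) != "Mn"), ""
    )
    # Case 1: alif before an unseated hamza, so no silent alif is added.
    if hamza == UNSEATED and prev == ALIF:
        return word + FATHATAN
    # Case 2: add the silent alif; an alif seat becomes unseated.
    if hamza in ALIF_SEATS:
        hamza = UNSEATED
    return stem + hamza + FATHATAN + ALIF
```

Applied to the spellings of mubtadaʾ, dāʾ, juzʾ and sayyiʾ used in this section, the function yields the an-forms listed above.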

4.4 Variants

There are some historical and regional variants to the above rules. The main one is that when the letter before hamza has a sukūn, the hamza is written unseated. So with this variant, we write:

  • مَسْءُول instead of مَسْؤُول
  • أَسْءِلَة instead of أَسْئِلَة
  • مَسْءَلَة instead of مَسْأَلَة

However, this rule does not appear to be followed consistently. For example, nas͡hʾaḧ is almost always written نَشْأَة, never نَشْءَة.

A second variant is to avoid the repetition of vowel letters like و and ي. Writers who follow this variant write:

  1. رُءُوس instead of رُؤُوس.
  2. رَءِيس instead of رَئِيس.

5 Typographical limitations

Due to what appears to have been a limitation of typesetting technology in the days of typewriters, metal typography, and early digital typography, unseated hamza between two joining letters ـءـ was usually written as seated on yāʾ instead: ـئـ. Because of this limitation we are now accustomed to seeing:

  • شَيْئًا instead of شَيْءًا
  • خَطِيئَة instead of خَطِيءَة
  • هَيْئَة instead of هَيْءَة
  • عِبْئَيْنِ instead of عِبْءَيْنِ

and similar variants.

These variants have pervaded to such a degree that many modern explanations on the rules of hamza orthography present the above as the correct way of writing, and modify their rules with exceptions to allow this writing.

Fortunately, advancements in digital font technology now allow us to revert to the original rules. Unfortunately, very few computer fonts today actually implement this feature.

Two fonts that we are aware of that do allow the preferred orthography are:

  1. Dr Khaled Hosny’s “Amiri”: https://www.amirifont.org/
  2. DecoType “Naskh”: https://www.decotype.com/oneliner/

With these “hamza-safe fonts” you can always use U+0621 “Arabic Letter Hamza” to type unseated hamza no matter whether it is between joining or non-joining letters. With most other fonts U+0621 “Arabic Letter Hamza” will prevent the surrounding two letters from joining.

This issue has been discussed in detail by Thomas Milo in Unicode L2/14-109: https://unicode.org/L2/L2014/14109-inline-chars.pdf.

However, the problem remains that this is a font-specific hack. Unicode appears to have no official guidance in this regard.

Another solution, which works with most other fonts, is to fake the correct orthography using a combination of U+0640 “Arabic Tatweel” and U+0654 “Arabic Hamza Above”, thus ـٔ, to achieve the appearance of unseated hamza between two joining letters. However, this will not work when hamza is between lām and ʾalif in the mandatory lām-ʾalif ligature لا (correct: لءا, incorrect: لـٔا). Such words are rare in Standard Arabic but do exist, e.g. لَءَّال laʾʾāl meaning “pearl-seller”, and مِلْءًا. Perhaps the official solution ought to be that fonts absorb the tatweel and do not let it affect the lām-ʾalif ligature when it is input between lām and ʾalif.

Beware that using tatweel in this manner may also affect the searching of characters in a digital document.
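The tatweel workaround, and a way to undo it before searching, can be sketched in Python. This is a rough illustration and not a shaping engine. The set `NON_FORWARD_JOINING` is a hand-written assumption covering only the basic Arabic letters that do not join to a following letter, and the function names are hypothetical. The sketch replaces U+0621 only when it sits between two letters that would otherwise join, and leaves it alone between lām and ʾalif, where the workaround fails.

```python
import unicodedata

HAMZA = "\u0621"
TATWEEL = "\u0640"
HAMZA_ABOVE = "\u0654"
LAM = "\u0644"
ALIF = "\u0627"

# Assumption: basic letters that never join to the letter after them
# (hamza, the alif forms, waw with hamza, ta marbutah, dal, dhal, ra, zay, waw).
NON_FORWARD_JOINING = set(
    "\u0621\u0622\u0623\u0624\u0625\u0627\u0629\u062F\u0630\u0631\u0632\u0648"
)

def _is_letter(ch):
    # Basic Arabic letters only; tatweel is excluded.
    return "\u0621" <= ch <= "\u064A" and ch != TATWEEL

def _neighbor(text, i, step):
    """Nearest letter before (step=-1) or after (step=+1) index i, skipping marks."""
    j = i + step
    while 0 <= j < len(text) and unicodedata.category(text[j]) == "Mn":
        j += step
    if 0 <= j < len(text) and _is_letter(text[j]):
        return text[j]
    return None

def fake_unseated_hamza(text):
    """Replace U+0621 between joining letters with tatweel + hamza above."""
    out = []
    for i, ch in enumerate(text):
        if ch == HAMZA:
            prev, nxt = _neighbor(text, i, -1), _neighbor(text, i, +1)
            joins = (
                prev is not None and nxt is not None
                and prev not in NON_FORWARD_JOINING and nxt != HAMZA
            )
            # The workaround breaks the mandatory lam-alif ligature.
            if joins and not (prev == LAM and nxt == ALIF):
                out.append(TATWEEL + HAMZA_ABOVE)
                continue
        out.append(ch)
    return "".join(out)

def restore_hamza(text):
    """Undo the workaround, e.g. before searching or comparing text."""
    return text.replace(TATWEEL + HAMZA_ABOVE, HAMZA)
```

Calling `restore_hamza` on both the document and the search term is one way to limit the effect on searching. It will also rewrite any tatweel plus hamza-above sequence that was entered on purpose, so it is best applied to a copy of the text.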