Foundation model alignment for Arabic-speaking populations: the nuances
The “translate Anthropic’s Constitutional AI into Arabic” failure
As Annota8’s founder in Cairo, I see the idea pitched in almost every other product meeting with Gulf and Egyptian teams: “Let’s take Anthropic’s constitution and translate it, then prompt our model to follow it when responding in Arabic.” The idea sounds reasonable to anyone who has not built alignment data before. It is not reasonable.
The simple reason: a Western model’s ethical constitution embodies values rooted in a very specific regulatory and societal context — American First-Amendment-style speech, gender definitions drawn from contemporary Western academic work, intellectual-property limits framed by the DMCA, suicide-and-self-harm guidance built on APA clinical norms. Each clause carries value, and each clause carries deep assumptions about who the user is, where they live, and which legal regime governs them.
A serious Arabic alignment layer starts from a different question: which humans will this model serve, under which jurisdiction, embedded in which value system? The answer is not singular — and that pluralism is the core of the problem.
Axis 1: religious sensitivity is not one cell
Most Arabic RLHF teams I’ve seen treat “Islam” as a single block. That is a technical error before it is a cultural one.
Arabic-speaking populations include, at minimum:
- Sunnis across the four canonical madhhabs[^2] — Hanafi (Turkey, Pakistan, the Levant, parts of Egypt), Maliki (North and West Africa), Shafi’i (Egypt, Yemen, East Africa), Hanbali (predominantly Saudi Arabia, with mixed schools elsewhere in the Gulf). Fiqh differs in subtle but visible ways in conversation — the ruling on music, the bounds of imagery, the fiqh of financial transactions.
- Twelver Shia — Iraq (~55–64% per CIA Factbook estimates)[^3], Bahrain (~55–70% of citizens per US State Department / Library of Congress estimates)[^4], Lebanon (~25–30% of the total population; estimates vary and no formal census since 1932)[^5], the UAE (~10–15% of Emirati citizens)[^6]. Sources of taqlid differ (Sistani, Khamenei, others), and the fatwas diverge substantively from Sunni positions.
- Ibadis — the plurality / largest school in Oman, and the state madhhab[^7]. A distinct fiqh tradition.
- Druze — Lebanon, Syria, Israel (including the Golan Heights), Jordan. A creedal specificity that cannot be flattened into Levantine Sunni Islam.
- Coptic Orthodox — Egypt (~10%+ of the population per CIA Factbook; church estimates run higher)[^8]. The Ethiopian Orthodox Tewahedo Church is historically related but has been autocephalous since 1959 and is distinct from the Coptic Orthodox Church of Alexandria[^9]. “Am I fasting today?” requires the relevant church’s calendar, not the Hijri one.
- Maronites, Greek Orthodox, Greek Catholics — Lebanon, Syria, Jordan, Palestine.
A model that refuses to “advise on investing in conventional bank stocks” because its RLHF data encoded the Hanbali Saudi position, and then serves a Shafi’i user in Cairo, a Shia user in Basra, or a Copt in Alexandria, imposes on each of those users a position that is not theirs.
Serious alignment annotation requires at minimum four layers: (1) classify the question as religious / sectarian / fiqh-touching, (2) detect any sectarian cue from the user, (3) generate an answer that surfaces the disagreement rather than resolving it, (4) refuse to produce a binding fatwa and defer to qualified humans. Each layer needs a linguist annotator with fiqh literacy — not a $12/hour translation contractor.
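The four layers above can be sketched as a small routing function. This is a minimal illustration, not a production classifier: the keyword lists, the `Decision` fields, and the labels are all hypothetical, and real layers 1 and 2 would be models trained on annotated data rather than word lookups.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical keyword lists, for illustration only. A production system
# would replace both with classifiers trained on annotated data.
FIQH_TERMS = {"حلال", "حرام", "ربا", "فتوى", "زكاة"}
SECT_CUES = {
    "السيستاني": "twelver_shia",
    "الإباضية": "ibadi",
    "المالكي": "sunni_maliki",
}

DIACRITICS = re.compile(r"[\u064B-\u0652]")   # tanwin, short vowels, shadda, sukun
ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")  # runs of Arabic letters


@dataclass
class Decision:
    fiqh_touching: bool          # layer 1: is the question fiqh-touching?
    sect_cue: Optional[str]      # layer 2: sectarian cue given by the user
    surface_disagreement: bool   # layer 3: present the range of positions
    defer_to_scholar: bool       # layer 4: no binding fatwa, defer to humans


def tokens(text: str) -> set:
    # Match whole words so that "ربا" does not fire inside "كهرباء".
    return set(ARABIC_WORD.findall(DIACRITICS.sub("", text)))


def decide(prompt: str) -> Decision:
    words = tokens(prompt)
    fiqh = bool(words & FIQH_TERMS)
    cue = next((label for word, label in SECT_CUES.items() if word in words), None)
    # Layers 3 and 4 apply to every fiqh-touching prompt, cue or no cue.
    return Decision(fiqh, cue, surface_disagreement=fiqh, defer_to_scholar=fiqh)
```

Note that a detected cue never switches off layers 3 and 4 in this sketch: knowing the user's school narrows which positions to lead with, but the model still does not issue a ruling.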
Axis 2: register — Classical, MSA, dialect
“Arabic” is not one language in the user’s head. It is a continuous arc.
At the top: Classical Arabic, the language of al-Mutanabbi and al-Jahiz, used by literary writers, poets, and imams in formal sermons. In the middle: MSA — Modern Standard Arabic, the language of newspapers, broadcast news, and academic papers. At the bottom: the spoken dialect of each country with its internal variants (Beirut Levantine, Aleppo Levantine, Qatif Gulf, Najdi Gulf, Cairene, Sa’idi, Fessi Moroccan, Marrakshi Moroccan).
A well-aligned model understands that a user who writes “أبغى أعرف الفرق بين الفائدة المركّبة والبسيطة” (“I want to know the difference between compound and simple interest”) expects an answer in a Gulf register close to MSA, not a Classical Arabic poem and not a literal translation of ChatGPT’s English answer. The same user, writing “اشرح لي قصيدة المعلّقة لامرئ القيس” (“Explain Imru’ al-Qais’s Mu’allaqa to me”), expects a completely different register: Classical Arabic, with classical vocabulary and phrasing.
Most machine-translated Arabic RLHF produces a single tone: simplified-fusha that reads like a translated legal document. That tone alienates the colloquial user and embarrasses the literary one. The fix is not a smaller model — the fix is alignment annotation on graded samples across three registers and training the model to detect register from the prompt.
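To make "detect register from the prompt" concrete, here is a deliberately crude sketch. It assumes three output labels (`classical`, `msa`, `dialect`) and two tiny, hypothetical marker lists; a real detector would be a classifier trained on the graded samples described above, and would distinguish individual dialects rather than lumping them together.

```python
import re

# Hypothetical marker lists, for illustration only.
DIALECT_MARKERS = {"أبغى", "عاوز", "عايز", "بدي", "حابب", "ليش"}
CLASSICAL_MARKERS = {"قصيدة", "المعلقة", "المتنبي", "الجاحظ"}

DIACRITICS = re.compile(r"[\u064B-\u0652]")   # strip vowel marks and shadda
ARABIC_WORD = re.compile(r"[\u0621-\u064A]+")  # runs of Arabic letters


def expected_register(prompt: str) -> str:
    """Guess the register the user expects in the reply."""
    words = set(ARABIC_WORD.findall(DIACRITICS.sub("", prompt)))
    if words & CLASSICAL_MARKERS:
        return "classical"   # literary topic: answer in a classical register
    if words & DIALECT_MARKERS:
        return "dialect"     # colloquial phrasing: answer close to the dialect
    return "msa"             # default: Modern Standard Arabic


print(expected_register("أبغى أعرف الفرق بين الفائدة المركّبة والبسيطة"))
print(expected_register("اشرح لي قصيدة المعلّقة لامرئ القيس"))
```

The design point is the return value, not the heuristic: the register label becomes a conditioning signal for generation and a field on every annotated sample.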
Axis 3: code-switching tolerance
In every MENA market from Beirut to Riyadh to Cairo, users write things like: “حابب أعمل subscription على الخطّة السنوي بس عاوز أعرف لو فيه refund policy لو cancelled في الشهر الأوّل” (“I’d like to subscribe to the annual plan, but I want to know if there is a refund policy if I cancel in the first month”). Four English words inside one Arabic sentence: that is not the exception, it is the default in professional and technical communication across the region.
A well-aligned model accepts code-switching and responds in kind when appropriate. A badly-translated RLHF model treats code-switching as “incorrect usage,” silently rewrites the question into pure fusha before answering, and breaks the user’s flow.
Alignment annotation here is theoretically simple, operationally hard: thousands of samples annotated in their natural register, with annotators marking when code-switching is acceptable (technical term), preferred (sales or HR contexts), or replaceable (formal government context). That is a linguistics-PhD-level task, not a rule list copied from an Anthropic doc.
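A minimal sketch of what one annotated sample could look like. The three labels come from the paragraph above; the class names, field names, and the Latin-script heuristic for spotting switched tokens are assumptions made for illustration (the heuristic would miss English words typed in Arabic script, for example).

```python
import re
from dataclasses import dataclass, field
from enum import Enum


class SwitchLabel(Enum):
    ACCEPTABLE = "acceptable"    # e.g. a technical term
    PREFERRED = "preferred"      # e.g. sales or HR contexts
    REPLACEABLE = "replaceable"  # e.g. formal government context


LATIN_WORD = re.compile(r"[A-Za-z]+")


def switched_tokens(text: str) -> list:
    """Return Latin-script words embedded in the text, in order."""
    return LATIN_WORD.findall(text)


@dataclass
class SwitchAnnotation:
    text: str                                   # the prompt, left in its natural register
    context: str                                # hypothetical tag, e.g. "sales"
    labels: dict = field(default_factory=dict)  # token -> SwitchLabel

    def unlabeled(self) -> list:
        """Switched tokens the annotator has not judged yet."""
        return [t for t in switched_tokens(self.text) if t not in self.labels]


sample = "حابب أعمل subscription على الخطّة السنوي بس عاوز أعرف لو فيه refund policy لو cancelled في الشهر الأوّل"
print(switched_tokens(sample))
```

Keeping `text` untouched matters here: the annotation sits next to the original sentence instead of normalizing it, so the training signal preserves how users actually write.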
Axis 4: per-jurisdiction political sensitivity
This is the axis that breaks commercial deployment more than any other.
Saudi Arabia operates under the Anti-Cyber Crime Law of 2007 (Royal Decree No. M/17)[^10] plus SDAIA’s data and AI rules (PDPL, AI Ethics Principles, AI Adoption Framework, Generative AI Guidelines)[^11]. Content that touches the leadership, is judged to incite sedition, or engages border or territorial questions in ways inconsistent with the official position creates legal liability for the operator.
Egypt operates under Anti-Cybercrime Law No. 175 of 2018[^12]. The sensitivities are different: the armed forces, the events of June 30, 2013, regional alignments.
The UAE operates under Federal Decree-Law No. 34 of 2021 on Countering Rumors and Cybercrimes (effective 2 January 2022)[^13]. The sensitivity surface shifts again.
Qatar, Kuwait, Bahrain, Oman, Jordan, Iraq, and Lebanon each have their own legal frame and their own bounds on political debate.
One alignment layer with a single unified political boundary will fail in every market. The fix is not to refuse everything political — that makes the model useless to journalists and political analysts. The fix is alignment annotation split by geography, with a routing layer at deployment time that selects the right policy based on the customer’s jurisdiction.
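The routing layer can be as simple as a lookup that fails closed. The sketch below assumes ISO-style country codes as keys and invents the `PolicyPack` structure and the `policy-*-v1` identifiers; only the legal bases are taken from the text above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyPack:
    jurisdiction: str
    legal_basis: str
    policy_id: str  # hypothetical identifier for the annotated policy set


POLICY_PACKS = {
    "SA": PolicyPack("SA", "Anti-Cyber Crime Law of 2007 (Royal Decree No. M/17)", "policy-sa-v1"),
    "EG": PolicyPack("EG", "Anti-Cybercrime Law No. 175 of 2018", "policy-eg-v1"),
    "AE": PolicyPack("AE", "Federal Decree-Law No. 34 of 2021", "policy-ae-v1"),
}


class UnknownJurisdiction(LookupError):
    """Raised when no reviewed policy pack exists for a jurisdiction."""


def select_policy(customer_jurisdiction: str) -> PolicyPack:
    code = customer_jurisdiction.strip().upper()
    try:
        return POLICY_PACKS[code]
    except KeyError:
        # Fail closed: never fall back silently to another market's boundary.
        raise UnknownJurisdiction(f"no policy pack for {code!r}") from None
```

Raising instead of defaulting is the design choice worth copying: a deployment in a market without a reviewed policy pack should be blocked at configuration time, not discovered in production traffic.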
Axis 5: gender norms and modesty register
This is the axis most Western vendors avoid, because answering it requires a stance, and a stance creates opposition.
The reality: gender norms differ materially between Casablanca, Amman, Riyadh, and Cairo. Speaking in one tone everywhere produces a model that fails in half its markets.
Modesty register in responses requires careful annotation: when to use the “الفاضلة / الفاضل” honorifics and when to use just the name; how to refer to a married woman according to the convention of the user’s city (the conventions in Riyadh and Beirut differ, and neither is the default); and how to handle a relationship-advice request from a Riyadh user versus an identical one from Beirut, each on its own register’s terms.
The answer is neither “refuse” nor “uniform reply” — the answer is intelligent alignment to user context, and that requires geographically diverse annotator teams, not one team in one location.
Axis 6: AAOIFI boundaries
This is a purely commercial-technical axis: any model deployed inside an Islamic financial institution (a bank, takaful operator, sukuk issuer, Sharia-compliant lender) is, in many jurisdictions, indirectly subject to AAOIFI standards (Accounting and Auditing Organization for Islamic Financial Institutions)[^14]. AAOIFI standards are adopted as mandatory in some jurisdictions (Bahrain, Oman, Qatar, Sudan, Syria, parts of the UAE) and as best-practice guidance in others; Saudi Arabia generally follows its own Shari’ah governance through SAMA and individual institutional Sharia boards rather than mandating AAOIFI Shari’ah standards directly[^15]. AAOIFI does not govern the model — it governs the product that uses the model. That distinction matters.
A model deployed inside an Islamic bank must:
- Not propose a financing structure containing explicit or implicit riba (defining “implicit” is itself a fiqh debate).
- Not describe a “murabaha,” “ijara,” or “istisna’a” product in language that bleeds into conventional finance.
- Understand the gap between the bank’s internal Sharia Board fatwa and AAOIFI’s general standards, and not conflate them.
- Not assume the role of a Sharia advisor — that is a strictly human function; the model is a support layer.
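One way to enforce the second requirement mechanically is a terminology check that flags conventional-finance vocabulary in product copy and sends it to a human reviewer. This is only a sketch: the flagged-term list below is a hypothetical placeholder, and the real list would have to come from the institution's Sharia board, not from engineering.

```python
import re

# Hypothetical list for illustration; the authoritative list belongs to the
# institution's Sharia board and will differ per product and jurisdiction.
FLAGGED_TERMS = ("interest", "loan", "lender")


def flag_conventional_terms(product_copy: str) -> list:
    """Return flagged terms that appear as whole words, in list order."""
    lowered = product_copy.lower()
    return [
        term for term in FLAGGED_TERMS
        if re.search(rf"\b{re.escape(term)}\b", lowered)
    ]


def needs_sharia_review(product_copy: str) -> bool:
    # The check never rewrites the text; it only routes it to a human.
    return bool(flag_conventional_terms(product_copy))


print(flag_conventional_terms("This murabaha loan carries 5% interest."))
```

The check will produce false positives (for example "in the customer's interest"), which is acceptable when the only consequence is human review. It deliberately does not suggest replacement wording, since choosing that wording is the Sharia function the model must not assume.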
This requires an alignment annotator with current Sharia training on the latest AAOIFI releases — not an ML engineer who skimmed a summary. I worked with a Gulf bank last year and the first question in the SFT contract was: “Do you have annotators with Sharia certification?” The second: “Do you have a Sharia lawyer reviewing a random sample before delivery?” If you don’t have an answer, the deal closes with another vendor — or doesn’t close at all.
Why “translate Anthropic to Arabic” doesn’t work — the technical summary
If you fold the six axes together, serious Arabic alignment needs:
- RLHF data authored originally in Arabic, not translated. Prompts written naturally in Cairo, chosen and rejected responses written by Arab annotators, the preference judgment rendered with linguistic taste.
- Annotators with PhD-level Arabic linguistics for register, tone, and code-switching. Not translation contractors. Not undergraduates with surface fluency.
- Sharia advisors for each major madhhab, with credentialed certification.
- Counsel licensed in-jurisdiction for political rules in each market of deployment.
- Geographically diverse teams for modesty register and gender norms.
- AAOIFI review at any Islamic-financial deployment.
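The stack above can be expressed as a delivery gate on each preference record. The field names and sign-off labels in this sketch are assumptions chosen for illustration; the rule it encodes is the one in the list: originally-Arabic data, plus the reviews that the sample's content makes necessary.

```python
from dataclasses import dataclass, field


@dataclass
class PreferenceRecord:
    prompt: str
    chosen: str
    rejected: str
    authored_in_arabic: bool       # False means machine- or human-translated
    fiqh_touching: bool = False
    political: bool = False
    islamic_finance: bool = False
    signoffs: set = field(default_factory=set)  # hypothetical reviewer roles


def blocking_issues(rec: PreferenceRecord) -> list:
    """List the reasons a record cannot be delivered yet (empty means ready)."""
    issues = []
    if not rec.authored_in_arabic:
        issues.append("translated source")
    required = {"linguist"}                 # every record gets a linguist pass
    if rec.fiqh_touching:
        required.add("sharia_advisor")
    if rec.political:
        required.add("local_counsel")
    if rec.islamic_finance:
        required.add("aaoifi_review")
    issues += [f"missing sign-off: {role}" for role in sorted(required - rec.signoffs)]
    return issues
```

For example, a fiqh-touching record that only a linguist has reviewed comes back with `missing sign-off: sharia_advisor`, and stays out of the delivery batch until that review exists.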
This stack is not “nice to have” — it is the minimum bar for serious commercial deployment in the region. Every lab that cuts a corner ships a model that “speaks Arabic” but is not “aligned to Arabic-speaking populations” — and the difference shows up inside the first week of production traffic.
How we help at Annota8
We don’t build foundation models. We produce the alignment layer that lets you deploy yours in the Gulf, Egypt, and the Levant. We staff PhD-level Arabic linguists in Cairo, partner with Sharia advisors across four madhhabs, and have legal counsel in-jurisdiction in Riyadh, Cairo, and Abu Dhabi. We build originally-Arabic RLHF data, preference sets, and red-teaming samples with sectarian, political, and gender layering. See /solutions/foundation-models and the KSA sovereign-FM blueprint at /blueprints/ksa-sovereign-fm-lab-sft-rlhf-data-program.