Riyadh vs Cairo annotation work: cost, quality, sovereignty
Why this question matters now
In 2023, my answer to “where does Arabic AI annotation get done?” was short: Cairo. Easy economics, deep talent, and no meaningful sovereignty constraint from any single large buyer. In 2026, the answer is more complicated. Riyadh is no longer a cosmetic option — it has become a contractual requirement for a defined slice of buyers: HUMAIN, SDAIA, Aramco, SAMA-regulated banks, and government entities subject to NDMO data classification.[^10] At the same time, Egyptian tax reforms and the Executive Regulations issued November 2025 under Egypt’s Personal Data Protection Law (Law 151/2020)[^1] made Cairo more, not less, suitable for institutional governance than it was three years ago.
Before founding Annota8, I was a buyer of annotation services from V7, Kognic, and Scale AI on research projects. (I did not buy from Labelbox — that correction matters to anyone who assumes otherwise.) What I observed from the buyer side: cost models from American and European vendors collapse for Arabic. LATAM or Southeast Asia pricing does not transfer to a language requiring 10+ distinguishable dialects,[^11] right-to-left text handling, and cultural-diplomatic judgment on RLHF samples. When I started Annota8, the question “where do we put the teams?” had a realistic answer: no single city is enough.
This post takes a candid look at how five MENA cities actually differ as annotation labor markets, and at why a hybrid model is what a serious buyer actually needs.
The direct comparison across five cities
| Dimension | Riyadh | Cairo | Dubai | Alexandria | Beirut |
|---|---|---|---|---|---|
| Relative labor cost[^15] | 3-4x | 1.0x (baseline) | 3-3.5x | 0.8x | 1.2x |
| Linguistic talent depth | Mid, growing fast | Very high (3 major universities)[^12] | Low-mid (business hub, not academic) | High (Cairo capacity backup) | High (historical publishing legacy)[^13] |
| Dialect coverage | Strong Gulf | Comprehensive (Egyptian, Levantine, Gulf, Maghrebi) | Multi (but not dialect-deep) | Comprehensive (Cairo-similar) | Excellent Levantine + modern standard |
| Sovereignty / NDMO upper-classification eligible[^3] | Eligible | Not eligible (outside KSA) | Not eligible (outside KSA) | Not eligible | Not eligible |
| UAE sector data residency[^2] | Partial | No | Full | No | No |
| Egypt PDPL (Law 151/2020) data residency[^1] | No | Full | No | Full | No |
| Vendor tax framework | ZATCA + IKTVA[^6] + Saudization[^5] | ETA (Egyptian Tax Authority)[^9] | UAE Corporate Tax 9% + VAT[^7] | ETA + potential free-zone benefits | Volatile Lebanese tax frame |
| Setup speed | Mid (MISA + commercial) | Fast | Very fast (free zones) | Fast | Slow (variable) |
The first-row numbers are macro-economic estimates built from average Arabic annotator wages at two years’ experience plus benefits and social tax, not commercial pricing. They are framing tools, not quotes.[^15]
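To make the framing concrete, here is a minimal sketch that turns the first-row multipliers into a blended relative cost for a workload split across cities. The multipliers are copied from the table; the function name `blended_cost_range` and the 70/20/10 split are hypothetical, chosen only for illustration. The output is a relative index against the Cairo baseline, not a price.

```python
# Relative labor-cost multipliers from the first row of the table (Cairo = 1.0).
# Each entry is a (low, high) range; single-value cities repeat the same number.
COST_MULTIPLIER = {
    "Riyadh": (3.0, 4.0),
    "Cairo": (1.0, 1.0),
    "Dubai": (3.0, 3.5),
    "Alexandria": (0.8, 0.8),
    "Beirut": (1.2, 1.2),
}


def blended_cost_range(shares):
    """Return the (low, high) blended multiplier for a workload split.

    `shares` maps city name -> fraction of annotation hours (must sum to 1).
    """
    total = sum(shares.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"shares must sum to 1.0, got {total}")
    low = sum(COST_MULTIPLIER[city][0] * share for city, share in shares.items())
    high = sum(COST_MULTIPLIER[city][1] * share for city, share in shares.items())
    return low, high


# Hypothetical split: most volume in Cairo, a sovereign slice in Riyadh,
# and surge capacity in Alexandria.
hybrid = {"Cairo": 0.7, "Riyadh": 0.2, "Alexandria": 0.1}
low, high = blended_cost_range(hybrid)
print(f"Hybrid: {low:.2f}x to {high:.2f}x Cairo baseline")

riyadh_low, riyadh_high = blended_cost_range({"Riyadh": 1.0})
print(f"Riyadh-only: {riyadh_low:.2f}x to {riyadh_high:.2f}x Cairo baseline")
```

Under these assumed shares the blend lands at about 1.38x to 1.58x the Cairo baseline, against 3x to 4x for running everything in Riyadh. The point is the shape of the comparison, not the specific figures.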
Riyadh: sovereignty at its fair price
Riyadh is not a data annotation city in the classical sense. It is a contract city: the place where deals that must sit inside Saudi borders get signed. The advantage here is not labor cost or talent depth; it is legal access to workloads that simply cannot run from outside the Kingdom.
What makes Riyadh specifically required:
- NDMO upper-classification data (“Top Secret” / “Secret”):[^3] The top bands of NDMO’s data classification policy — Top Secret and Secret — require government and strategic data to remain inside the Kingdom. Any annotation vendor handling this category from Cairo is in breach of the regulatory framework, no matter how watertight the NDA.
- SAMA Cyber Security Framework and Outsourcing Rules:[^10] SAMA’s framework effectively requires in-Kingdom processing for sensitive banking workloads. In practice that covers serious banking NLP projects such as KYC, AML, and Gulf-Arabic customer service.
- Saudization / Nitaqat:[^5] A vendor pursuing multi-year government contracts needs real Saudi headcount, not a brass plaque. Green and platinum Nitaqat bands open procurement doors that stay closed otherwise.
- IKTVA:[^6] Aramco’s local content program rewards vendors that domesticate spend. It is not a legal requirement, but it is a genuine competitive preference factor in evaluation.
Talent depth in Riyadh is growing fast on the back of King Saud University’s computational linguistics programs and SDAIA’s hiring expansion. But the 2026 reality: the number of working Arabic NLP researchers in Riyadh is still well below Cairo’s. What you gain in sovereignty, you pay in labor cost and specialist depth.
Read more about KSA in our regional guide and our explainer on the NDMO data classification framework.
Cairo: linguistic depth at the lowest cost base
Cairo’s advantage is one thing, sharply defined: academic linguistic depth. Three major universities — Cairo University, Ain Shams, and the American University in Cairo (AUC)[^12] — continuously graduate researchers in Arabic linguistics, computational linguistics, and translation studies. The Cairo-based PhD linguist who reviews QA on an RLHF project costs ~10-15% of the Riyadh equivalent and ~25-30% of the Dubai equivalent, with the same academic qualification and often deeper credentials.[^15]
What Cairo enables operationally:
- PhD-level QA review on SFT and RLHF samples at volumes American teams cannot afford economically
- Comprehensive dialect coverage — Egyptian dominates locally but internal migration plus graduates from across the Arab world allow specialized teams in Levantine, Gulf, Maghrebi, and Sudanese
- Depth in translation, transliteration, and heritage text handling — Egyptian publishing history fed back into the universities across generations
- Economics that permit multi-pass annotation — where American projects can only afford one pass, the same budget supports 2-3 review passes for quality
What Cairo does not provide:
- Not eligible for NDMO Top Secret / Secret workloads[^3]
- Does not serve UAE sector data residency or KSA data residency
- Egyptian tax has tightened after the 2023-2025 reforms; ETA registration is now mandatory for independent contractors,[^9] and personal income tax applies on worldwide income for tax residents under Article 2 of Law 91/2005[^4]
- Internet and power infrastructure are less reliable than the Gulf; multiple UPS lines and redundant connectivity are operating requirements, not luxuries
Explore Egypt in more detail for the full operating environment.
Dubai: the buying-decision hub, not the production hub
Dubai is not usually where the actual work happens — it is where regional enterprise contracts get signed. The AI offices of Microsoft, Google, AWS, G42, Presight, Mubadala, and anyone allocating a MENA budget from a single decision point sit in DIFC, d3, Internet City, or ADGM in Abu Dhabi.
What forces a serious vendor to have a Dubai presence:
- Executive buyer access: A quarterly business review with a UAE energy company or a regional Gulf bank typically happens in a Dubai tower, not Riyadh or Cairo
- Free zones + structural flexibility: DIFC and ADGM offer internationally recognized legal structures (Common Law) that ease multi-jurisdiction enterprise contracting[^14]
- Logistical platform: In-person meetings with KSA, Kuwait, Bahrain, Oman, and Qatar teams happen in diplomatically neutral Dubai
- UAE sector data residency: If the customer is a regulated Emirati entity, sector-specific laws (UAE Health Data Law, Central Bank rules) and the UAE PDPL framework keep data inside the UAE.[^2]
But using Dubai as a production annotation hub runs into simple economics: labor costs ~3-3.5x Cairo without additional linguistic depth.[^15] The practical model: sales office + account management + contract signing in Dubai, actual production in Cairo / Riyadh / Alexandria depending on the workload profile. Further detail in our UAE regional guide.
Alexandria and Beirut: the strategic backups
Alexandria runs ~0.8x Cairo cost,[^15] shares the same academic and legal environment (Alexandria University graduates comparable linguistic researchers), and has the bonus of being away from Cairo’s congestion and infrastructure pressure. It works as natural capacity backup for Cairo on demand surges without replacing it as the primary hub.
Beirut is a different story. Cost runs ~1.2x Cairo,[^15] but the edge is qualitative: over decades, the legacy of major Arabic publishing houses (Dar Al-Ilm Lilmalayin, Dar Al-Saqi, Dar Al-Adab)[^13] produced a layer of editors, translators, and proofreaders at a level not easily reproduced. For projects requiring high-register Modern Standard Arabic (legal review, diplomatic translation, premium editorial), Beirut still carries real value despite the infrastructure instability and volatile tax frame. The practical model: small specialized teams, not high-volume operations.
Sovereignty as a contractual requirement, not a cosmetic feature
The most common buyer-side misunderstanding I see: treating “sovereignty” as an optional premium feature. The 2026 regulatory reality:
- KSA NDMO mandates strict data classification.[^3] Top Secret + Secret outside the Kingdom = regulatory breach.
- UAE Federal Decree-Law No. 45 of 2021 (PDPL) plus sector-specific laws (UAE Health Data Law, Central Bank rules) mandate data residency for healthcare and licensed financial institutions in the UAE — the residency obligations on health and financial data come from those sector laws, not the PDPL itself.[^2]
- Egypt Personal Data Protection Law 151 of 2020 with its Executive Regulations issued 1 November 2025 (Decree 816/2025) imposes specific conditions for transferring Egyptian personal data abroad, and requires consent from the Egyptian Personal Data Protection Center for certain categories.[^1]
- Wider GCC data residency policies are forming fast in Kuwait (CITRA), Bahrain (PDPL), Oman, and Qatar.
The vendor who promises a regulated Saudi customer “safe” processing from Cairo for NDMO Top Secret / Secret data is either unfamiliar with the framework or willing to expose the customer to regulatory risk. Read our guide, designed around PDPL principles, for a deeper breakdown, and see our glossary for data residency, sovereign cloud, and NDMO data classification.
Why hybrid beats single-city vendors
The single-city vendor loses three deals before winning one:
- Loses on sovereignty: A regulated KSA customer needs in-Kingdom processing. The Cairo-only vendor falls out of the long list before the first meeting.
- Loses on cost: A Gulf bank needs 10 million RLHF samples on a finite budget. The Dubai-only or US vendor falls out on collapsed economics.
- Loses on depth: An Arabic foundation model lab needs PhD-grade QA for ALLaM v3 or a competitor model. The vendor without access to Cairo University, Ain Shams, and AUC talent cannot price the project rationally.
The hybrid model we are building at Annota8 is designed to solve this equation through functional distribution:
- Cairo: The linguistic heart. PhD-grade QA review, annotation guideline development, comprehensive dialect coverage, RLHF + SFT work at volume.
- Riyadh: Sovereign processing. NDMO Top Secret / Secret workloads,[^3] SAMA-regulated banking contracts,[^10] the headcount that serves Saudization/Nitaqat,[^5] ZATCA invoicing.
- Dubai: Executive account management. Signing, QBR meetings, multi-jurisdiction regional contracts.[^14]
- Alexandria: Cairo capacity backup. Demand surges, specific specializations, infrastructure pressure relief.
- Beirut: Small specialized teams. High-register MSA, legal review, premium editorial.
This is not “more cities is better.” It is functional matching: assigning work to the city that serves it on the best combination of economics and governance. Read about our workforce architecture for the routing mechanics.
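The functional split can be sketched as a small rule-based router. This is an illustrative sketch under stated assumptions, not Annota8's actual routing system: the `Residency` enum, the `Workload` fields, and the `route` function are hypothetical names, and the rules only encode the city roles and the residency rows of the comparison table as described in this post. Residency is checked first, so governance always overrides economics.

```python
from dataclasses import dataclass
from enum import Enum


class Residency(Enum):
    """Hypothetical residency labels, loosely following the comparison table."""
    NONE = "none"
    KSA_UPPER = "ksa_upper"      # NDMO Top Secret / Secret, or SAMA-regulated banking
    UAE_SECTOR = "uae_sector"    # regulated Emirati health or financial data
    EGYPT_PDPL = "egypt_pdpl"    # Egyptian personal data the buyer keeps in Egypt


@dataclass(frozen=True)
class Workload:
    residency: Residency = Residency.NONE
    high_register_msa: bool = False   # legal review, diplomatic translation, editorial
    cairo_at_capacity: bool = False   # demand-surge flag (assumed input)


def route(w: Workload) -> str:
    """Pick a production city; residency rules come before cost or specialty."""
    if w.residency is Residency.KSA_UPPER:
        return "Riyadh"
    if w.residency is Residency.UAE_SECTOR:
        return "Dubai"
    # Beirut's specialist teams are only an option when no residency rule applies.
    if w.residency is Residency.NONE and w.high_register_msa:
        return "Beirut"
    # Everything else stays in Egypt: Cairo by default, Alexandria on surges.
    return "Alexandria" if w.cairo_at_capacity else "Cairo"


print(route(Workload()))
print(route(Workload(residency=Residency.KSA_UPPER, high_register_msa=True)))
print(route(Workload(high_register_msa=True)))
print(route(Workload(residency=Residency.EGYPT_PDPL, cairo_at_capacity=True)))
```

The ordering is the design choice worth noting: a high-register MSA task that also carries a KSA upper classification still goes to Riyadh, because no specialty or cost argument can move a workload out of its required jurisdiction.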
Three things we are not saying
Three honest notes before closing:
First, the numbers in the table are macro-economic estimates, not pricing. Final customer pricing depends on the task, data volume, contract terms, and required QA level. Do not use these numbers as a direct negotiating reference.
Second, Cairo’s talent-depth advantage is real but it is not permanent. Riyadh is investing heavily in building Saudi NLP talent, and SDAIA + King Saud University are producing growing cohorts. Over 5-7 years, the gap narrows. The hybrid model absorbs that shift rather than resisting it.
Third, Annota8 is a young company. I do not claim meaningful market share today. What I do claim: our geographic structure flows from a buyer-side understanding of why American and European vendor economics break for Arabic, and it holds up on paper in a way that a single-city structure does not.