26 May 2026 Crowd-density safety AI MENA

Crowd-density safety AI for Middle East operations teams (Fruin LOS, Hajj, mosque venues)

TL;DR

The crowd-density framework used in practice by Middle East operations teams is Fruin’s Levels of Service (LOS), a six-level academic scale (A–F) that maps density in people per square metre to walking behaviour, comfort, and risk.¹ Some practitioners collapse the upper half of the scale into operational “danger” and “critical” bands (densities above 4 ppl/m² and above 6 ppl/m² respectively), and that is the framing we use below. Operations teams running Hajj, Umrah, malls, stadiums, and mosque events need cameras plus computer-vision models that measure density second by second. Such a model cannot be trained without annotated ground-truth data.

This article covers: (1) what each density band means operationally, (2) what needs annotating in video to train a density model, (3) how to build a dataset that combines normal footage, pre-incident footage, and incident-time footage, (4) why models trained only on Western data fail in the Grand Mosque and the Prophet’s Mosque.

Core academic reference: Helbing, Johansson and Al-Abideen 2007 (Physical Review E 75:046109, “Dynamics of crowd disasters: An empirical study”).²

Note on terminology: earlier drafts of this piece used “ESCO” as a label for the density bands. That was incorrect: ESCO is the EU jobs taxonomy, not a crowd-safety framework. The right reference is Fruin LOS, and this piece has been corrected.

Annota8 does not sell safety models. We support the construction of computer-vision training data, annotated by Arabic-speaking teams who understand the operational context of the Grand Mosque, the Prophet’s Mosque, and large Gulf venues.

Context: why I am writing this now

Every Hajj season I get into the same conversations with operations teams in the Kingdom and their technology partners — the General Presidency for the Affairs of the Two Holy Mosques, Hajj authorities, large mall operators, stadium teams — around the same question: “How do we measure crowd density precisely enough to prevent a disaster?”

The question carries an assumption. Many teams imagine the problem is “a better camera” or “a newer computer-vision model.” The real problem is a lack of training data annotated for the Arabic operational context. Off-the-shelf open-source crowd-counting models are trained on public benchmark datasets that do not include the operational scenes that matter here: a person wearing ihram, a crowd moving in a circumambulation pattern, a density threshold in a Saudi mall corridor.

A note: I am not a certified crowd-safety expert. My background is AI data. This piece is written from the data-team angle: what needs to be annotated. For actual operational crowd planning, refer to licensed crowd-safety engineers and to the academic literature referenced later in this article.

Fruin LOS crowd density levels: operational walk-through

Fruin’s LOS is a six-level academic scale (A through F).¹ For operations-team work, Keith Still³ and UK HSE practice commonly collapse the upper half into a four-band danger grid. The table below shows that operational four-band view, so its letters A to D are operational band labels, not Fruin’s LOS letters; see Fruin 1971 for the original six-level scale.¹ Densities are in people per square metre (ppl/m²).

Operational band	Density	Operational state	Controllability
A	< 2 ppl/m²	Comfortable, normal flow	Full control, natural movement
B	2 - 4 ppl/m²	Crowded but acceptable, partial slowdown	Partial control, light intervention possible
C	4 - 6 ppl/m²	Dangerous packing, motion near-stopped	Urgent intervention required
D	> 6 ppl/m²	Critical, risk of crowd collapse, asphyxia, domino falls	Intervention often too late — disaster
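The band table above can be sketched as a small lookup. This is a minimal illustration, not a safety tool: the function name `operational_band` is hypothetical, and the handling of values that sit exactly on a boundary (2, 4, or 6 ppl/m²) is my assumption, since the table does not say which band owns the edge.

```python
from bisect import bisect_right

# Upper edges of bands A, B and C in people per square metre, from the table above.
BAND_EDGES = (2.0, 4.0, 6.0)
BAND_LABELS = ("A", "B", "C", "D")


def operational_band(density_ppl_m2: float) -> str:
    """Map a density in ppl/m² to the operational band A to D.

    Assumption: a value exactly on a boundary goes to the higher band,
    so the lookup errs on the side of caution.
    """
    if density_ppl_m2 < 0:
        raise ValueError("density cannot be negative")
    return BAND_LABELS[bisect_right(BAND_EDGES, density_ppl_m2)]


print(operational_band(1.5))  # A
print(operational_band(5.0))  # C
```

Keeping the thresholds in one tuple makes it easy to swap in whatever band edges a certified crowd-safety engineer specifies for a given site.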

Points that the public conversation underplays:

One: the B-to-C transition is fast and non-linear. Helbing and colleagues showed in “Crowd disasters as systemic failures: analysis of the Love Parade disaster” (Helbing & Mukerji 2012)⁴ and earlier in “Dynamics of crowd disasters: An empirical study” (Helbing et al. 2007)² that the move from crowded movement to “crowd turbulence” can happen on sub-minute timescales; see the timeline reconstructions in Helbing & Mukerji 2012.⁴

Two: density alone is not enough. The real risk feature is density × variance (how much movement direction fluctuates). A static crowd at 5 ppl/m² is less dangerous than one at the same density where some people push north and others push south. This shapes what must be annotated.

Three: local density diverges from average density. The model needs to operate at the cell level, not just at the camera level — averages hide hotspots.
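Points two and three can be made concrete with a per-cell feature. The sketch below multiplies local density by the variance of the velocity vectors tracked in that cell, in the spirit of the crowd “pressure” measure discussed in Helbing et al. 2007.² It assumes per-person velocities in metres per second are already available from tracking; the function names and the example numbers are hypothetical.

```python
def velocity_variance(velocities):
    """Mean squared deviation of 2D velocity vectors from their mean (m²/s²)."""
    n = len(velocities)
    if n < 2:
        return 0.0
    mean_vx = sum(vx for vx, _ in velocities) / n
    mean_vy = sum(vy for _, vy in velocities) / n
    return sum((vx - mean_vx) ** 2 + (vy - mean_vy) ** 2 for vx, vy in velocities) / n


def cell_pressure(head_count, cell_area_m2, velocities):
    """Local density (ppl/m²) multiplied by local velocity variance."""
    density = head_count / cell_area_m2
    return density * velocity_variance(velocities)


# Two cells at the same density of 5 ppl/m² (10 people in 2 m²).
same_direction = [(0.0, 0.3)] * 10                  # everyone drifts the same way
opposing = [(0.0, 0.5)] * 5 + [(0.0, -0.5)] * 5     # half push one way, half the other

print(cell_pressure(10, 2.0, same_direction))  # 0.0
print(cell_pressure(10, 2.0, opposing))        # 1.25
```

Velocity variance captures fluctuations in both speed and direction, which is why the two cells score so differently even though a density-only model would treat them as identical.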

The historical record: what disasters teach us

Reading the academic literature and the official reports on major Middle East crowd incidents is mandatory for anyone building technology in this space:

Jamaraat Bridge 2006 — the 12 January 2006 incident at the Jamaraat Bridge during the stoning ritual.⁵ Peer-reviewed analysis published by Helbing, Johansson and Al-Abideen in Physical Review E 75:046109 (2007) describes the crowd-dynamics conditions present at the time — high local density, opposing movement vectors at a constriction, the rapid onset of “crowd turbulence” — using video evidence collected on-site with the cooperation of the Hajj authorities.² The Saudi authorities subsequently delivered the New Jamaraat Bridge expansion project, completed in stages through 2010, which redesigned the geometry and capacity of the site.⁶
Mina 2015 — the 24 September 2015 incident near the intersection of streets 204 and 223.⁷ The official Saudi casualty figure published by the Hajj authorities is 769 (independent tallies by AP, AFP, and others are substantially higher).⁸ The peer-reviewed crowd-dynamics literature published after 2015 describes the same crowd-turbulence signature — opposing high-density flows converging — that the 2007 paper identified at Jamaraat.² The data and the academic discussion below are restricted to that engineering question.
Love Parade, Duisburg 2010: a European disaster (21 deaths) but a landmark analysis. Helbing & Mukerji 2012 (“Crowd disasters as systemic failures”) tracks the temporal flow and crowd density minute by minute, and offers “crowd turbulence” as a predictive indicator.⁴ ⁹
Stadium incidents — Hillsborough 1989, Port Said 2012, Kanjuruhan 2022 — all crush events at ingress/egress points.¹⁰ A reminder that Fruin LOS and crowd safety are not Hajj-only concerns.

All of these incidents confirm the same pattern: a non-linear shift from safe movement to collapse that can unfold in well under a minute. A useful safety-oriented computer-vision model must detect the transition, not the state after it.

Distinguishing pre-incident data from incident-time data

This is the most important operational detail when building a training set:

Incident-time data — Scenes of density D, chaotic motion, falls, asphyxia. These are rare, ethically sensitive, and mostly sourced from historical incident footage. Insufficient on their own to train a predictive model.
Pre-incident data — The 30 to 60 seconds before a disaster occurs, when density shifts from B to C to D. This is the golden data. If you train the model to see the transition, it can warn before the disaster.

The problem: pre-incident data is acutely scarce, especially for Arabic operational contexts. Serious teams therefore build their datasets this way:

Collect historical footage of high-risk areas (Grand Mosque entry points, Jamaraat Bridge pre- and post-expansion, Mina intersections, stadium gates)
Annotate every frame with the actual Fruin LOS level + per-cell density + dominant motion direction per cell
Compile published historical incident footage (Jamaraat 2006, Mina 2015, Love Parade, Hillsborough — available in the academic literature)
Annotate the temporal sequence with the transition event (the B → C moment, the C → D moment)
Build a model that recognizes pre-transition signals
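The transition-annotation step above can be sketched as follows. Given one operational band label per frame, the code finds every frame where the band rises (for example B to C) and returns the window of frames leading up to it. The names `Transition`, `find_escalations`, and `pre_transition_window` are hypothetical, and the 60-second default simply mirrors the 30 to 60 second range mentioned earlier.

```python
from dataclasses import dataclass

BAND_ORDER = "ABCD"  # operational bands from least to most dense


@dataclass(frozen=True)
class Transition:
    frame_index: int  # first frame labelled with the higher band
    from_band: str
    to_band: str


def find_escalations(frame_bands):
    """Return every frame where the band label rises relative to the previous frame."""
    events = []
    for i in range(1, len(frame_bands)):
        prev, cur = frame_bands[i - 1], frame_bands[i]
        if BAND_ORDER.index(cur) > BAND_ORDER.index(prev):
            events.append(Transition(i, prev, cur))
    return events


def pre_transition_window(event, fps, seconds_before=60.0):
    """Frame range [start, end) covering the seconds before a transition."""
    start = max(0, event.frame_index - int(round(seconds_before * fps)))
    return start, event.frame_index


bands = list("BBBBCCCD")  # one label per frame, sampled at 1 frame per second
for event in find_escalations(bands):
    print(event, pre_transition_window(event, fps=1.0, seconds_before=3.0))
```

Real per-frame labels tend to flicker near a threshold, so a production pipeline would probably smooth the label sequence before looking for escalations; that step is left out here.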

Video annotation modality, bounding box templates, and keypoint templates all feed into this.

What needs annotation in a crowd-safety video

From practical experience, the annotation layers that feed a production density and crowd-motion model:

Annotation layer	Description	Output	Tool
Head dots	Each person marked with a dot on their head	Count + density	keypoint
Bounding boxes for individuals	Each separable person in a box	Detection + tracking	bounding box
Density maps	2D distribution of density over a frame	Density model training	density heatmap
Cell-level segmentation	Frame divided into cells with density per cell	Hotspot detection	grid annotation
Motion vectors	Dominant motion direction per cell	Crowd turbulence detection	optical flow + manual review
Clothing + context labels	Ihram / regular clothing / stadium kit / mall wear	Scenario understanding + personalization	classification
Physical critical points	Pillar, staircase, gate, choke point	Structural alignment	semantic segmentation
Event labels	Moment of fall, point of collapse, motion stop	Ground truth for event-detection model	temporal annotation
Per-frame Fruin LOS label	A / B / C / D	Ground truth for the model	classification
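The head-dot and cell-level layers in the table combine naturally: count the dots that fall in each grid cell and divide by the cell area. The sketch below assumes the dots have already been projected from image pixels to ground-plane metres by a calibrated camera, which is a significant assumption in practice; the function name and grid parameters are hypothetical.

```python
from collections import Counter


def cell_densities(head_dots_m, cell_size_m, grid_cols, grid_rows):
    """Count head dots per square grid cell and convert to ppl/m².

    head_dots_m: (x, y) ground-plane positions in metres. Dots outside
    the grid are ignored.
    """
    counts = Counter()
    for x, y in head_dots_m:
        col = int(x // cell_size_m)
        row = int(y // cell_size_m)
        if 0 <= col < grid_cols and 0 <= row < grid_rows:
            counts[(row, col)] += 1
    cell_area = cell_size_m ** 2
    return [
        [counts[(row, col)] / cell_area for col in range(grid_cols)]
        for row in range(grid_rows)
    ]


# Seven people packed into one 1 m² cell, one person in the opposite corner.
dots = [(0.05 + 0.1 * i, 0.5) for i in range(7)] + [(1.5, 1.5)]
grid = cell_densities(dots, cell_size_m=1.0, grid_cols=2, grid_rows=2)

average = sum(sum(row) for row in grid) / 4
print(grid)     # [[7.0, 0.0], [0.0, 1.0]]
print(average)  # 2.0
```

The frame-level average of 2 ppl/m² looks manageable, while one cell sits at 7 ppl/m². That gap is the reason the per-cell layer is annotated separately.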

The devil is in the detail:

Head dots beat bounding boxes for high density. At 6 ppl/m², boxes overlap and annotation collapses. Dots do not overlap.
Substantive annotation must come from an annotator trained in crowd safety. A generic annotator cannot tell density 3 from density 5 — both look “a lot.” Training the annotator takes examples, references, and testing.
Religious and cultural context matters. Annotators of Grand Mosque scenes must understand the ritual movement patterns of the venue — the directional flow of circumambulation, the defined path of the sa’i, and the orientation of prayer rows — so that the model treats expected motion as the norm and unusual motion vectors as outliers to be reviewed. These are sacred-context operational facts, not visual heuristics; the annotator must be trained on the ritual before being trained on the model.
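As an illustration of treating expected ritual motion as the norm, the sketch below measures how far an observed motion vector deviates from the expected tawaf direction, which is counter-clockwise around the Kaaba. It assumes ground-plane coordinates seen from above with x to the right and y up; in raw image coordinates, where y usually points down, the sense of rotation flips. The function name and any review threshold are hypothetical.

```python
import math


def tawaf_deviation_deg(position, centre, motion):
    """Angle in degrees between observed motion and the expected tawaf direction.

    Assumption: right-handed ground-plane coordinates viewed from above, so
    counter-clockwise flow around `centre` has tangent (-dy, dx).
    0 means aligned with the expected flow, 180 means directly against it.
    """
    dx, dy = position[0] - centre[0], position[1] - centre[1]
    tx, ty = -dy, dx
    vx, vy = motion
    if (tx == 0 and ty == 0) or (vx == 0 and vy == 0):
        raise ValueError("position must differ from centre and motion must be non-zero")
    dot = tx * vx + ty * vy
    cross = tx * vy - ty * vx
    return abs(math.degrees(math.atan2(cross, dot)))


centre = (0, 0)
print(tawaf_deviation_deg((10, 0), centre, (0, 1)))   # 0.0, moving with the flow
print(tawaf_deviation_deg((10, 0), centre, (0, -1)))  # 180.0, moving against the flow
```

An annotation review tool could flag cells whose dominant motion deviates by more than some agreed angle, leaving the judgement about whether it matters to an annotator trained on the ritual.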

Why Western models fail in the Grand Mosque and the Prophet’s Mosque

Off-the-shelf open-source crowd-counting models are trained on public benchmark datasets dominated by stadiums, streets, and protests. A model trained on them fails on:

White ihram garments — uniform color and a draped silhouette confuse detection models
Overhead density — Grand Mosque cameras are typically drone-mounted or pole-mounted at sharp top-down angles. Western datasets are largely frontal.
Tawaf pattern — Organized circular motion does not appear in Western training data
Mosque lighting — Yellow lamps, moonlight, night-to-dawn transitions — a wide lighting range inside the same frame

The fix is not “a better model” — the fix is to build local training data with Grand Mosque, Prophet’s Mosque, and large Saudi venue context. That requires Arabic annotation teams, partnership with the General Presidency and the Presidency of Affairs for the Two Holy Mosques, and an academic partner. Read Hajj and crowd safety solutions for government for the operational model we recommend.

How to build a crowd density ground-truth dataset

Operational guidance from experience:

Collect data from four pilot sites before scaling — for example: the Grand Mosque plaza during prayer, Jamaraat Bridge on a stoning day, a Saudi mall entrance on peak day, a stadium entrance
Capture in different conditions — night/day, summer/winter, weekday/holiday, near-miss/normal day
Use multiple synchronized cameras — overhead camera, frontal camera, oblique camera. The final model must work from any angle.
Layer three annotation passes — head dots + individual bounding boxes for separable persons + a full density map per frame
Check annotation quality with a golden set: clips annotated by a certified crowd-safety expert, against which every annotator is checked monthly
Maintain a full audit trail — government agencies, Saudi authorities, and academic partners may later request a record of who annotated what, when, and under what guidance
Build held-out test partitions — clips the model never sees during training, used for independent evaluation
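The golden-set check in the list above can be sketched as a comparison of head counts per clip. The function name, the dictionary layout, and the 10% tolerance are all assumptions for illustration; the actual tolerance should come from the crowd-safety expert who owns the golden set.

```python
def golden_set_report(expert_counts, annotator_counts, max_relative_error=0.10):
    """Compare one annotator's head counts with expert counts on shared golden clips.

    Both arguments map clip ID to head count. The tolerance is a placeholder.
    """
    shared = sorted(expert_counts.keys() & annotator_counts.keys())
    if not shared:
        raise ValueError("no shared golden clips to compare")
    errors = {
        clip: abs(annotator_counts[clip] - expert_counts[clip]) / max(expert_counts[clip], 1)
        for clip in shared
    }
    return {
        "clips_checked": len(shared),
        "mean_relative_error": sum(errors.values()) / len(shared),
        "failed_clips": [clip for clip in shared if errors[clip] > max_relative_error],
    }


expert = {"clip_a": 100, "clip_b": 200}
annotator = {"clip_a": 105, "clip_b": 150}
print(golden_set_report(expert, annotator))
```

Head count is only one axis. The same pattern extends to per-frame band labels or per-cell densities if the golden set carries those layers.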

The glossary covers definitions of every technical term mentioned in this list.
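For the held-out partitions in the list above, one detail is easy to get wrong: adjacent frames from the same clip are nearly identical, so splitting at frame level leaks test data into training. A minimal sketch, assuming each frame carries a clip ID, assigns whole clips to a partition from a stable hash. The function names and the 20% test fraction are hypothetical.

```python
import hashlib


def partition_for_clip(clip_id: str, test_fraction: float = 0.2) -> str:
    """Assign a whole clip to 'train' or 'test' from a stable hash of its ID."""
    digest = hashlib.sha256(clip_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return "test" if bucket < test_fraction * 2**64 else "train"


def split_frames(frames, test_fraction=0.2):
    """frames: iterable of (clip_id, frame_index) pairs.

    All frames of one clip land in the same partition.
    """
    split = {"train": [], "test": []}
    for clip_id, frame_index in frames:
        split[partition_for_clip(clip_id, test_fraction)].append((clip_id, frame_index))
    return split


frames = [(f"clip_{c:03d}", f) for c in range(50) for f in range(3)]
split = split_frames(frames)
train_clips = {clip for clip, _ in split["train"]}
test_clips = {clip for clip, _ in split["test"]}
print(len(train_clips), len(test_clips), train_clips & test_clips)
```

Because the assignment depends only on the clip ID, re-running the split after adding new clips never moves an existing clip between partitions. A stricter variant would hash the site or the capture date instead of the clip ID.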

Responsibility and privacy

A point that cannot be skipped:

Saudi PDPL (Personal Data Protection Law, in force since 14 September 2023) regulates biometric personal data. A human face may qualify as biometric personal data.¹¹
Operational fix: annotating heads with dots does not require storing the face. Safety cameras work on count and density, not identity. Output models should be abstract, not identity-based.
Agency authorization — Any video capture inside the Two Holy Mosques requires a formal letter of authorization from the General Presidency for the Affairs of the Two Holy Mosques. There is no “hobbyist” route here.
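One way to keep annotation output abstract rather than identity-based is to fix the record schema up front and reject anything outside it. The sketch below is an illustration only: the `HeadDotRecord` fields are hypothetical, and an allow-list like this is a design aid, not evidence of PDPL compliance.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class HeadDotRecord:
    """One head dot: a position plus audit fields, with no image data and no identity."""
    clip_id: str
    frame_index: int
    x_px: int
    y_px: int
    annotator_id: str       # pseudonymous ID of the annotator, for the audit trail
    guideline_version: str  # which annotation guidance was in force
    annotated_at: str       # ISO 8601 timestamp


ALLOWED_FIELDS = frozenset(f.name for f in fields(HeadDotRecord))


def to_record(raw: dict) -> HeadDotRecord:
    """Build a record from a raw tool export, rejecting any field outside the schema."""
    extra = set(raw) - ALLOWED_FIELDS
    if extra:
        raise ValueError(f"unexpected fields, possibly identifying: {sorted(extra)}")
    return HeadDotRecord(**raw)


raw = {
    "clip_id": "clip_001",
    "frame_index": 12,
    "x_px": 640,
    "y_px": 360,
    "annotator_id": "ann_07",
    "guideline_version": "v1",
    "annotated_at": "2026-05-26T10:00:00+00:00",
}
print(to_record(raw))
```

An export that carries a face crop or a person identifier fails loudly at ingestion instead of quietly ending up in the training set.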

What Annota8 does — and does not do

We do:

Build training data for crowd density and motion problems, using a trained Arabic team
Support Fruin-LOS-aware video annotation with bounding box, head-point, and density-map templates
Provide audit trails and held-out test partitions
Work under PDPL constraints and Saudi data-sovereignty requirements
Combine our data with datasets from Saudi academic partners on request

We do not:

Sell pre-built safety models. We build the data. The model is built by the operator’s team or by a technology partner.
Provide crowd-safety consulting. Engage certified crowd-safety engineers (Keith Still³, G4S, Crowd Dynamics International, or appointed Saudi partners).
Make tactical decisions during a live event. Our work ends at a well-annotated dataset feeding a model.

What this means for the buyer

Start with the question: “Which pilot sites, how many hours of video, which annotation layers, on what timeline?” — not “Which model?”
Ask the vendor for an annotation log, QA evidence, and held-out test partitions — not just “data”
Use pre-incident data, not incident-time data alone
Do not rely on out-of-the-box Western models without local fine-tuning. Test them against local data before deployment
Engage a certified crowd-safety expert through the entire data and model project


References

1. Fruin J.J. “Designing for Pedestrians: A Level-of-Service Concept,” Highway Research Record 355 (1971), Transportation Research Board. Also published as the book “Pedestrian Planning and Design” (1971). https://onlinepubs.trb.org/Onlinepubs/hrr/1971/355/355-001.pdf
2. Helbing D., Johansson A., Al-Abideen H.Z. “Dynamics of crowd disasters: An empirical study,” Physical Review E 75:046109 (2007). DOI: 10.1103/PhysRevE.75.046109. https://arxiv.org/abs/physics/0701203
3. Keith Still: formerly Professor of Crowd Science at Manchester Metropolitan University (2014–2020); developed and led the MSc in Crowd Safety and Risk Analysis. https://www.gkstill.com/CV/References.html
4. Helbing D., Mukerji P. “Crowd disasters as systemic failures: analysis of the Love Parade disaster,” EPJ Data Science 1:7 (2012). https://epjdatascience.springeropen.com/articles/10.1140/epjds7
5. “2006 Hajj stampede”: 12 January 2006 incident at Jamaraat Bridge during the stoning ritual. https://en.wikipedia.org/wiki/2006_Hajj_stampede
6. “Jamaraat Bridge”: New Jamaraat Bridge expansion completed in stages through 2010 (ground and first levels operational by 2007 Hajj, full five-level completion by 2010). https://en.wikipedia.org/wiki/Jamaraat_Bridge
7. “2015 Mina stampede”: 24 September 2015 incident at the intersection of streets 204 and 223 leading to Jamaraat Bridge. https://en.wikipedia.org/wiki/2015_Mina_stampede
8. Daily Sabah, “At least 769 dead, 934 injured in stampede at Mina during Hajj pilgrimage in Saudi Arabia,” 24 September 2015. The 769 figure is the official Saudi government figure; independent tallies by AP (2,411), AFP (2,236), and others are substantially higher. https://www.dailysabah.com/mideast/2015/09/24/at-least-769-dead-934-injured-in-stampede-at-mina-during-hajj-pilgrimage-in-saudi-arabia
9. Love Parade disaster, Duisburg, 24 July 2010: 21 deaths and 500+ injured. See Helbing & Mukerji 2012 abstract.
10. Stadium disasters cited: Hillsborough (1989), Port Said (2012), Kanjuruhan (2022). See “Kanjuruhan Stadium disaster” and related entries. https://en.wikipedia.org/wiki/Kanjuruhan_Stadium_disaster
11. Saudi Personal Data Protection Law (PDPL) entered into force 14 September 2023; one-year transition period ended 14 September 2024. Biometric data is classified as “sensitive personal data” under PDPL. Morgan Lewis, “Saudi Arabia Personal Data Protection Law Transition Period Ends September 14, 2024.” https://www.morganlewis.com/pubs/2024/09/saudi-arabia-personal-data-protection-law-transition-period-ends-september-14

Limitations & disclaimer

Limitations of this analysis. This post reflects Annota8's reading of publicly available evidence as of its last-modified date. Vendor positioning, regulatory frameworks, benchmark numbers, and program scope can change without notice. Where numeric ranges are cited, those numbers are reproducible from the source linked in the post's References section — Annota8 has not independently re-run the benchmarks unless explicitly stated in the post.

Privacy & legal posture. Annota8 is an early-stage AI data operations company in soft launch. We do not currently hold SOC 2, ISO 27001, PDPL certification, or any other third-party security or privacy certification. We design with PDPL principles in mind and can sign a DPA modelled on the EU SCC template. Specific compliance posture for your engagement is available on request from [email protected].

Nothing in this post is legal, tax, or investment advice. Regulatory citations should be verified with counsel in your jurisdiction. Vendor names mentioned in this post are referenced as industry-landscape context only — Annota8 is not asserting a comparative product claim, a customer relationship, or any other affiliation with any platform named, unless that affiliation is explicitly stated.

Reach the team:[email protected] · annota8.ai