Crowd-density safety AI for Middle East operations teams (Fruin LOS, Hajj, mosque venues)
Context: why I am writing this now
Every Hajj season I get into the same conversations with operations teams in the Kingdom and their technology partners — the General Presidency for the Affairs of the Two Holy Mosques, Hajj authorities, large mall operators, stadium teams — around the same question: “How do we measure crowd density precisely enough to prevent a disaster?”
The question carries an assumption. Many teams imagine the problem is “a better camera” or “a newer computer-vision model.” The real problem is a lack of training data annotated for the Arabic operational context. Off-the-shelf open-source crowd-counting models are trained on public benchmark datasets that do not include the operational scenes that matter here: a person wearing ihram, a crowd moving in a circumambulation pattern, a density threshold in a Saudi mall corridor.
A note: I am not a certified crowd-safety expert. My background is AI data. This piece is written from the data-team angle: what needs to be annotated. For actual operational crowd planning, refer to licensed crowd-safety engineers and to the academic literature referenced later in this article.
Fruin LOS crowd density levels: operational walk-through
Fruin’s LOS is a six-level academic scale (A through F) [1]. For operations-team work, Keith Still [3] and UK HSE practice commonly collapse the upper half into a four-band danger grid. The numbers below are the operational four-band view; see Fruin 1971 for the original six-level scale [1]. The band letters A to D in the table are labels for this operational view, not Fruin’s original levels A to D. Densities are in people per square meter (ppl/m²).
| Operational band | Density | Operational state | Controllability |
|---|---|---|---|
| A | < 2 ppl/m² | Comfortable, normal flow | Full control, natural movement |
| B | 2 - 4 ppl/m² | Crowded but acceptable, partial slowdown | Partial control, light intervention possible |
| C | 4 - 6 ppl/m² | Dangerous packing, motion near-stopped | Urgent intervention required |
| D | > 6 ppl/m² | Critical, risk of crowd collapse, asphyxia, domino falls | Intervention often too late — disaster |
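As a rough illustration, the four-band table above can be expressed as a small lookup. This is a minimal sketch, not an operational tool: the function names are hypothetical, and the handling of the exact boundary values (2, 4 and 6 ppl/m²) is an assumption, since the table does not say which band a boundary value belongs to.

```python
def density_ppl_per_m2(head_count: int, area_m2: float) -> float:
    """People per square metre for a measured floor area."""
    if area_m2 <= 0:
        raise ValueError("area must be positive")
    return head_count / area_m2


def operational_band(density: float) -> str:
    """Map a density (ppl/m²) to the operational band A-D from the table.

    Assumption: 2.0 and 4.0 fall into the higher band (B and C), while
    6.0 stays in C because the table defines D as strictly above 6.
    """
    if density < 0:
        raise ValueError("density cannot be negative")
    if density < 2.0:
        return "A"
    if density < 4.0:
        return "B"
    if density <= 6.0:
        return "C"
    return "D"


# 90 people counted in a 20 m² corridor segment gives 4.5 ppl/m².
band = operational_band(density_ppl_per_m2(90, 20.0))
print(band)  # prints C
```

In practice the band would be computed per grid cell rather than per camera view, for the reason given in point Three below.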
Points that the public conversation underplays:
One: the B-to-C transition is fast and non-linear. Helbing and colleagues showed in “Crowd disasters as systemic failures: analysis of the Love Parade disaster” (Helbing & Mukerji 2012) [4] and earlier in “The Dynamics of Crowd Disasters: An Empirical Study” (Helbing et al. 2007) [2] that the move from crowded movement to “crowd turbulence” can happen on sub-minute timescales; see the timeline reconstructions in Helbing & Mukerji 2012 [4].
Two: density alone is not enough. The real risk feature is density × variance (how much movement direction fluctuates). A static crowd at 5 ppl/m² is less dangerous than one at the same density where some people push north and others push south. This shapes what must be annotated.
Three: local density diverges from average density. The model needs to operate at the cell level, not just at the camera level — averages hide hotspots.
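Points Two and Three can be made concrete with a few lines of code. The sketch below is one possible reading of “density × variance”: it uses the circular variance of motion directions inside a cell, which is an assumption on my part and not a formula taken from the cited papers. All function names and the toy numbers are hypothetical.

```python
import math
from statistics import mean


def circular_variance(directions_rad):
    """1 minus the mean resultant length of the unit direction vectors.

    Close to 0 when everyone moves the same way, close to 1 when
    directions cancel out (for example opposing flows).
    """
    if not directions_rad:
        return 0.0
    c = mean(math.cos(a) for a in directions_rad)
    s = mean(math.sin(a) for a in directions_rad)
    return 1.0 - math.hypot(c, s)


def cell_risk(head_count, cell_area_m2, directions_rad):
    """Density x directional variance for a single grid cell."""
    density = head_count / cell_area_m2
    return density * circular_variance(directions_rad)


def hotspots(cell_densities, threshold):
    """Return (row, col) of every cell whose density exceeds the threshold."""
    return [
        (r, c)
        for r, row in enumerate(cell_densities)
        for c, d in enumerate(row)
        if d > threshold
    ]


# Same density (5 ppl/m²), different motion patterns.
aligned = cell_risk(20, 4.0, [0.0, 0.0, 0.0, 0.0])
opposed = cell_risk(20, 4.0, [0.0, math.pi, 0.0, math.pi])

# A frame whose average looks moderate but hides one critical cell.
grid = [[1.0, 1.0], [1.0, 7.0]]
frame_average = mean(d for row in grid for d in row)
critical_cells = hotspots(grid, threshold=6.0)
```

With these toy inputs the aligned cell scores close to zero and the opposed cell scores close to 5, while the frame average of 2.5 ppl/m² gives no hint of the one cell sitting at 7 ppl/m².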
The historical record: what disasters teach us
Reading the academic literature and the official reports on major Middle East crowd incidents is mandatory for anyone building technology in this space:
- Jamaraat Bridge 2006: the 12 January 2006 incident at the Jamaraat Bridge during the stoning ritual [5]. Peer-reviewed analysis published by Helbing, Johansson and Al-Abideen in Physical Review E 75:046109 (2007) describes the crowd-dynamics conditions present at the time (high local density, opposing movement vectors at a constriction, the rapid onset of “crowd turbulence”) using video evidence collected on-site with the cooperation of the Hajj authorities [2]. The Saudi authorities subsequently delivered the New Jamaraat Bridge expansion project, completed in stages through 2010, which redesigned the geometry and capacity of the site [6].
- Mina 2015: the 24 September 2015 incident near the intersection of streets 204 and 223 [7]. The official Saudi casualty figure published by the Hajj authorities is 769 (independent tallies by AP, AFP, and others are substantially higher) [8]. The peer-reviewed crowd-dynamics literature published after 2015 describes the same crowd-turbulence signature, opposing high-density flows converging, that the 2007 paper identified at Jamaraat [2]. The data and the academic discussion below are restricted to that engineering question.
- Love Parade, Duisburg 2010: a European disaster (21 deaths) but a landmark analysis. Helbing & Mukerji 2012 (“Crowd disasters as systemic failures”) tracks the temporal flow and crowd density minute by minute, and offers “crowd turbulence” as a predictive indicator [4][9].
- Stadium incidents: Hillsborough 1989, Port Said 2012, Kanjuruhan 2022, all crush events at ingress/egress points [10]. A reminder that Fruin LOS and crowd safety are not Hajj-only concerns.
All of these incidents confirm the same pattern: a non-linear shift from safe movement to collapse in well under a minute. A computer-vision model that is useful for safety must detect the transition, not the state after it.
Distinguishing pre-incident data from incident-time data
This is the most important operational detail when building a training set:
- Incident-time data — Scenes of density D, chaotic motion, falls, asphyxia. These are rare, ethically sensitive, and mostly sourced from historical incident footage. Insufficient on their own to train a predictive model.
- Pre-incident data — The 30 to 60 seconds before a disaster occurs, when density shifts from B to C to D. This is the golden data. If you train the model to see the transition, it can warn before the disaster.
The problem: pre-incident data is acutely scarce, especially in Arabic context. Serious teams therefore build their datasets this way:
- Collect historical footage of high-risk areas (Grand Mosque entry points, Jamaraat Bridge pre- and post-expansion, Mina intersections, stadium gates)
- Annotate every frame with the actual Fruin LOS level + per-cell density + dominant motion direction per cell
- Compile published historical incident footage (Jamaraat 2006, Mina 2015, Love Parade, Hillsborough — available in the academic literature)
- Annotate the temporal sequence with the transition event (the B → C moment, the C → D moment)
- Build a model that recognizes pre-transition signals
Video annotation modality, bounding box templates, and keypoint templates all feed into this.
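The temporal annotation steps above can be sketched in code. This is a minimal example under stated assumptions: the per-frame band labels are already available as a list of letters, the function names are hypothetical, and the 60-second look-back simply takes the upper end of the 30 to 60 second range mentioned earlier.

```python
BAND_ORDER = {"A": 0, "B": 1, "C": 2, "D": 3}


def find_escalations(frame_bands):
    """Return (frame_index, from_band, to_band) for each upward band change."""
    events = []
    for i in range(1, len(frame_bands)):
        prev, cur = frame_bands[i - 1], frame_bands[i]
        if BAND_ORDER[cur] > BAND_ORDER[prev]:
            events.append((i, prev, cur))
    return events


def pre_incident_window(event_frame, fps, seconds_before=60.0):
    """Half-open frame range [start, event_frame) covering the lead-up."""
    start = max(0, event_frame - round(seconds_before * fps))
    return start, event_frame


# Toy clip at 1 frame per second: 90 s of B, 40 s of C, then D.
bands = ["B"] * 90 + ["C"] * 40 + ["D"] * 10
events = find_escalations(bands)

# First frame labelled D, and the 60 s that lead up to it.
d_frame = next(i for i, _, to_band in events if to_band == "D")
window = pre_incident_window(d_frame, fps=1.0)
```

Real per-frame labels are noisy, so a production pipeline would likely smooth them (for example with a majority vote over a short window) before extracting transition events. That step is left out here to keep the sketch short.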
What needs annotation in a crowd-safety video
From practical experience, the annotation layers that feed a production density and crowd-motion model:
| Annotation layer | Description | Output | Tool |
|---|---|---|---|
| Head dots | Each person marked with a dot on their head | Count + density | keypoint |
| Bounding boxes for individuals | Each separable person in a box | Detection + tracking | bounding box |
| Density maps | 2D distribution of density over a frame | Density model training | density heatmap |
| Cell-level segmentation | Frame divided into cells with density per cell | Hotspot detection | grid annotation |
| Motion vectors | Dominant motion direction per cell | Crowd turbulence detection | optical flow + manual review |
| Clothing + context labels | Ihram / regular clothing / stadium kit / mall wear | Scenario understanding + personalization | classification |
| Physical critical points | Pillar, staircase, gate, choke point | Structural alignment | semantic segmentation |
| Event labels | Moment of fall, point of collapse, motion stop | Ground truth for event-detection model | temporal annotation |
| Per-frame Fruin LOS label | A / B / C / D | Ground truth for the model | classification |
The devil is in the detail:
- Head dots beat bounding boxes for high density. At 6 ppl/m², boxes overlap and annotation collapses. Dots do not overlap.
- Substantive annotation must come from an annotator trained in crowd safety. A generic annotator cannot tell density 3 from density 5 — both look “a lot.” Training the annotator takes examples, references, and testing.
- Religious and cultural context matters. Annotators of Grand Mosque scenes must understand the ritual movement patterns of the venue — the directional flow of circumambulation, the defined path of the sa’i, and the orientation of prayer rows — so that the model treats expected motion as the norm and unusual motion vectors as outliers to be reviewed. These are sacred-context operational facts, not visual heuristics; the annotator must be trained on the ritual before being trained on the model.
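To show how the head-dot layer feeds the density-map layer, here is a minimal pure-Python sketch. Placing a Gaussian blob on each head dot is a common way to build density-map ground truth in the crowd-counting literature; the fixed `sigma`, the function name, and the toy coordinates are assumptions for illustration, and a real pipeline would use an array library rather than nested lists.

```python
import math


def density_map(head_dots, width, height, sigma=2.0):
    """Build a height x width density map from (x, y) head-dot pixels.

    Each dot contributes a Gaussian blob renormalised over the pixels
    that fall inside the frame, so the map sums to the head count.
    """
    dmap = [[0.0] * width for _ in range(height)]
    radius = max(1, int(3 * sigma))
    for x, y in head_dots:
        if not (0 <= x < width and 0 <= y < height):
            raise ValueError(f"dot {(x, y)} lies outside the frame")
        weights = []
        for yy in range(max(0, y - radius), min(height, y + radius + 1)):
            for xx in range(max(0, x - radius), min(width, x + radius + 1)):
                d2 = (xx - x) ** 2 + (yy - y) ** 2
                weights.append((yy, xx, math.exp(-d2 / (2 * sigma ** 2))))
        total = sum(w for _, _, w in weights)
        for yy, xx, w in weights:
            dmap[yy][xx] += w / total
    return dmap


dots = [(3, 3), (10, 4), (0, 0)]  # toy annotations, integer pixel coordinates
dmap = density_map(dots, width=16, height=8)
estimated_count = sum(sum(row) for row in dmap)  # approximately 3.0
```

Because every blob is normalised to sum to one, summing the map over any region gives an estimated head count for that region, which is what makes the same annotation usable for both frame-level counts and cell-level density.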
Why Western models fail in the Grand Mosque and the Prophet’s Mosque
Off-the-shelf open-source crowd-counting models are trained on public benchmark datasets dominated by stadiums, streets, and protests. A model trained on them fails on:
- White ihram garments — uniform color and a draped silhouette confuse detection models
- Overhead density — Grand Mosque cameras are typically drone-mounted or pole-mounted at sharp top-down angles. Western datasets are largely frontal.
- Tawaf pattern — Organized circular motion does not appear in Western training data
- Mosque lighting — Yellow lamps, moonlight, night-to-dawn transitions — a wide lighting range inside the same frame
The fix is not “a better model”; it is to build local training data with Grand Mosque, Prophet’s Mosque, and large Saudi venue context. That requires Arabic annotation teams, a partnership with the General Presidency for the Affairs of the Two Holy Mosques, and an academic partner. Read “Hajj and crowd safety solutions for government” for the operational model we recommend.
How to build a crowd density ground-truth dataset
Operational guidance from experience:
- Collect data from four pilot sites before scaling — for example: the Grand Mosque plaza during prayer, Jamaraat Bridge on a stoning day, a Saudi mall entrance on peak day, a stadium entrance
- Capture in different conditions — night/day, summer/winter, weekday/holiday, near-miss/normal day
- Use multiple synchronized cameras — overhead camera, frontal camera, oblique camera. The final model must work from any angle.
- Layer three annotation passes — head dots + individual bounding boxes for separable persons + a full density map per frame
- Check annotation quality with a golden set: clips annotated by a certified crowd-safety expert, against which every annotator is checked monthly
- Maintain a full audit trail — government agencies, Saudi authorities, and academic partners may later request a record of who annotated what, when, and under what guidance
- Build held-out test partitions — clips the model never sees during training, used for independent evaluation
The glossary covers definitions of every technical term mentioned in this list.
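Two of the steps in the list above, the golden-set check and the held-out partition, are simple enough to sketch. The example below is illustrative only: the function names, the 20% held-out fraction, and the clip ID format are hypothetical, and assigning whole clips (rather than individual frames) to the held-out side is a design assumption made to keep near-identical neighbouring frames from leaking across the split.

```python
import hashlib


def is_held_out(clip_id: str, held_out_fraction: float = 0.2) -> bool:
    """Deterministically assign a whole clip to the held-out partition.

    Hashing the clip ID keeps the assignment stable across runs and
    machines, so a clip never drifts from test into training.
    """
    digest = hashlib.sha256(clip_id.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big")
    return bucket < int(held_out_fraction * 2 ** 64)


def golden_agreement(annotator_labels, expert_labels):
    """Fraction of golden-set frames where the annotator matches the expert."""
    if len(annotator_labels) != len(expert_labels) or not expert_labels:
        raise ValueError("label lists must be non-empty and of equal length")
    matches = sum(a == e for a, e in zip(annotator_labels, expert_labels))
    return matches / len(expert_labels)


# Toy golden-set check: per-frame band labels from one annotator vs the expert.
agreement = golden_agreement(["A", "B", "C", "C"], ["A", "B", "C", "D"])

# Toy split over hypothetical clip IDs.
clips = ["site1_cam2_0001", "site1_cam2_0002", "site3_cam1_0001"]
held_out = [c for c in clips if is_held_out(c)]
```

Simple agreement treats a C-versus-D miss the same as an A-versus-D miss. A team that cares about the dangerous end of the scale may prefer a metric that weights errors by band distance.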
Responsibility and privacy
A point that cannot be skipped:
- Saudi PDPL (Personal Data Protection Law, in force since 14 September 2023) regulates biometric personal data. A human face may qualify as biometric personal data [11].
- Operational fix: annotating heads with dots does not require storing the face. Safety cameras work on count and density, not identity. Model outputs should be aggregate (counts and densities), not identity-based.
- Agency authorization — Any video capture inside the Two Holy Mosques requires a formal letter of authorization from the General Presidency for the Affairs of the Two Holy Mosques. There is no “hobbyist” route here.
What Annota8 does — and does not do
We do:
- Build training data for crowd density and motion problems, using a trained Arabic team
- Support Fruin-LOS-aware video annotation with bounding box, head-point, and density-map templates
- Provide audit trails and held-out test partitions
- Work under PDPL constraints and Saudi data-sovereignty requirements
- Compose our data with Saudi academic partners on request
We do not:
- Sell pre-built safety models. We build the data. The model is built by the operator’s team or by a technology partner.
- Provide crowd-safety consulting. Engage certified crowd-safety engineers (Keith Still [3], G4S, Crowd Dynamics International, or appointed Saudi partners).
- Make tactical decisions during a live event. Our work ends at a well-annotated dataset feeding a model.
What this means for the buyer
- Start with the question: “Which pilot sites, how many hours of video, which annotation layers, on what timeline?” — not “Which model?”
- Ask the vendor for an annotation log, QA evidence, and held-out test partitions — not just “data”
- Use pre-incident data, not incident-time data alone
- Do not rely on out-of-the-box Western models without local fine-tuning. Test them against local data before deployment
- Engage a certified crowd-safety expert through the entire data and model project
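For the “test them against local data” step, the usual crowd-counting metrics are mean absolute error (MAE) and root mean squared error (RMSE) on per-frame counts. The sketch below assumes you already have predicted counts from an off-the-shelf model and annotated counts from local held-out clips; the function name and the numbers are made up for illustration.

```python
import math


def count_errors(predicted, ground_truth):
    """Return (MAE, RMSE) between predicted and annotated per-frame counts."""
    if len(predicted) != len(ground_truth) or not ground_truth:
        raise ValueError("count lists must be non-empty and of equal length")
    diffs = [p - g for p, g in zip(predicted, ground_truth)]
    mae = sum(abs(d) for d in diffs) / len(diffs)
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mae, rmse


# Hypothetical counts on three local held-out frames.
mae, rmse = count_errors(predicted=[100, 220, 310], ground_truth=[110, 200, 300])
```

Reporting these numbers separately per site, camera angle, and lighting condition is more informative than a single global figure, since a model can look acceptable on average and still fail on the overhead or ihram-heavy scenes described earlier.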
References
[1] Fruin J.J. “Designing for Pedestrians: A Level-of-Service Concept,” Highway Research Record 355 (1971), Transportation Research Board. Also published as the book “Pedestrian Planning and Design” (1971). https://onlinepubs.trb.org/Onlinepubs/hrr/1971/355/355-001.pdf
[2] Helbing D., Johansson A., Al-Abideen H.Z. “Dynamics of crowd disasters: An empirical study,” Physical Review E 75:046109 (2007). DOI: 10.1103/PhysRevE.75.046109. https://arxiv.org/abs/physics/0701203
[3] Keith Still: formerly Professor of Crowd Science at Manchester Metropolitan University (2014–2020); developed and led the MSc in Crowd Safety and Risk Analysis. https://www.gkstill.com/CV/References.html
[4] Helbing D., Mukerji P. “Crowd disasters as systemic failures: analysis of the Love Parade disaster,” EPJ Data Science 1:7 (2012). https://epjdatascience.springeropen.com/articles/10.1140/epjds7
[5] “2006 Hajj stampede.” 12 January 2006 incident at Jamaraat Bridge during the stoning ritual. https://en.wikipedia.org/wiki/2006_Hajj_stampede
[6] “Jamaraat Bridge.” New Jamaraat Bridge expansion completed in stages through 2010 (ground and first levels operational by 2007 Hajj, full five-level completion by 2010). https://en.wikipedia.org/wiki/Jamaraat_Bridge
[7] “2015 Mina stampede.” 24 September 2015 incident at the intersection of streets 204 and 223 leading to Jamaraat Bridge. https://en.wikipedia.org/wiki/2015_Mina_stampede
[8] Daily Sabah, “At least 769 dead, 934 injured in stampede at Mina during Hajj pilgrimage in Saudi Arabia,” 24 September 2015. The 769 figure is the official Saudi government figure; independent tallies by AP (2,411), AFP (2,236), and others are substantially higher. https://www.dailysabah.com/mideast/2015/09/24/at-least-769-dead-934-injured-in-stampede-at-mina-during-hajj-pilgrimage-in-saudi-arabia
[9] Love Parade disaster, Duisburg, 24 July 2010: 21 deaths and 500+ injured. See Helbing & Mukerji 2012 abstract.
[10] Stadium disasters cited: Hillsborough (1989), Port Said (2012), Kanjuruhan (2022). See “Kanjuruhan Stadium disaster” and related entries. https://en.wikipedia.org/wiki/Kanjuruhan_Stadium_disaster
[11] Saudi Personal Data Protection Law (PDPL) entered into force 14 September 2023; one-year transition period ended 14 September 2024. Biometric data is classified as “sensitive personal data” under PDPL. Morgan Lewis, “Saudi Arabia Personal Data Protection Law Transition Period Ends September 14, 2024.” https://www.morganlewis.com/pubs/2024/09/saudi-arabia-personal-data-protection-law-transition-period-ends-september-14