Full Methodology

Version 2.2 · March 2026


1. Purpose & Scope

Mental Health Radar is a real-time, population-level index of psychosocial stress. It produces a composite Mental Health Risk Score, the Acute Stress Signal Index (ASSI), for regions worldwide, updated every six hours.

The tool answers a question that existing surveillance infrastructure does not answer in real time: where are conditions most acutely elevated right now, and across which life domains?

What the score is

The score reflects the degree to which current conditions elevate the risk of population-level mental health deterioration. It aggregates public data signals through a 12-domain wellbeing model, anchored against international standards. As the data signal stack grows, the score becomes progressively more precise, but the scoring architecture and domain model remain stable.

What the score is not

The score is not a clinical diagnostic instrument. It does not diagnose individuals, predict individual outcomes, or constitute medical advice. It is a population surveillance tool for the social and structural determinants of mental health.

An elevated score means conditions known to predict mental health deterioration are present. Whether deterioration materialises depends on protective factors, resilience, and the timeliness of intervention. The score identifies where intervention is most likely to have impact; it does not diagnose a population.

Audience

The primary audience is non-specialist: policymakers, public health practitioners, journalists, and engaged citizens who want to understand the mental health landscape of their region relative to others. Signal-level decomposition and domain breakdowns provide analytical depth for users who want to understand what is driving the score.


2. Scientific Foundation

The domain model, weighting decisions, and structural vulnerability components are grounded in a comprehensive literature review synthesising evidence across biological, psychological, social, occupational, environmental, lifestyle, and digital determinants of mental health. The evidence base spans every inhabited continent and is not specific to any single country or health system.

2.1 Risk Factor Evidence Base

The following table summarises the evidence strength for each major risk factor domain, rated on a 1–10 scale based on effect size, replication, and meta-analytic support in the published literature.

Risk Factor Domain | Evidence Weight | Key References
Adverse Childhood Experiences | 9/10 | Felitti et al. (1998); Varese et al. (2012); Norman et al. (2012)
Socioeconomic Deprivation | 9/10 | Lund et al. (2010); Patel et al. (2018); Reiss (2013)
Genetic Predisposition | 8/10 | Sullivan et al. (2012); Howard et al. (2019); Musliner et al. (2021)
Social Isolation / Loneliness | 8/10 | Holt-Lunstad et al. (2015); Cacioppo & Cacioppo (2018)
Attachment / Early Relational | 7/10 | Mikulincer & Shaver (2012); Main & Hesse (1990)
Discrimination / Marginalisation | 7/10 | Schmitt et al. (2014); Williams et al. (2019); Meyer (2003)
Relationship Quality / IPV | 7/10 | Ouellet-Morin et al. (2015); WHO (2021)
Work Stress / Burnout | 7/10 | Harvey et al. (2017); Bianchi et al. (2015)
Unemployment / Economic Insecurity | 7/10 | Paul & Moser (2009); Benach et al. (2014)
Substance Use | 7/10 | Kessler et al. (1996); Moore et al. (2007)
Neurobiological Factors | 6/10 | McEwen (2017); Miller & Raison (2016)
Physical Health Comorbidity | 6/10 | Katon et al. (2007); Liu et al. (2017)
Community Violence / Safety | 6/10 | Steel et al. (2009); Fowler et al. (2009)
Personality / Cognitive Style | 6/10 | Kotov et al. (2010); Nolen-Hoeksema et al. (2008)
Urban / Environmental Factors | 5/10 | Lederbogen et al. (2011); Mair et al. (2008)
Lifestyle (sleep, diet, exercise) | 5/10 | Baglioni et al. (2011); Schuch et al. (2018)
Digital / Social Media | 5/10 | Holt-Lunstad et al. (2015); Valkenburg et al. (2022)
Climate / Environmental Stress | 4/10 | Hickman et al. (2021); Cianconi et al. (2020)

2.2 Syndemic Framing

Risk factors rarely operate in isolation. Socioeconomic deprivation, for instance, increases exposure to adverse childhood experiences, neighbourhood violence, job insecurity, social isolation, and barriers to care, creating clustered, compounding risk profiles that are qualitatively different from, and more harmful than, the sum of individual factors (Bellis et al., 2017). This syndemic framing (Singer et al., 2017) underpins the SYNDEMIC detection system described in Section 6.

2.3 What the Literature Does Not Justify Including

Several factors that might seem intuitive are excluded from real-time scoring because the evidence does not support them as measurable population-level leading indicators:

Genetic predisposition (evidence weight 8/10): not measurable at the regional level in real time. Its effects are captured structurally in the BVI via childhood adversity indicators.

Social media use intensity: effect sizes are moderate and heterogeneous (Valkenburg et al., 2022). Social media is included as a language signal (what people are saying) rather than a usage signal (how much they're using it).

Individual-level neurobiological factors: not measurable at population level.


3. The 12-Domain Model

3.1 Design Principles

Three principles governed the domain design:

Signal-to-domain mapping must be explicit. Every weight in the contribution matrix is defensible. Each data source's contribution to each domain is documented with reasoning, preventing hidden assumptions.

Domain weights must reflect the literature, not the signal stack. The data sources available today are not the data sources we will have tomorrow. If we weighted domains in proportion to data availability, we would produce a biased composite that over-represents what we happen to have data for. The domain weights are set against the evidence base and remain stable as new signals are added.

Missing signals must not silently inflate surviving signals. The weighted average computation excludes absent signals from the denominator. A domain with only two of three signals present uses a normalised two-signal average, not a three-signal average with one term at zero.

3.2 Domains and Weights

Domain | Weight | Rationale
Economic Security | 16% | Socioeconomic deprivation rated 9/10 in the evidence base. Strongest documented population-level risk pathway.
Direct Mental Health | 13% | The most proximate signal: directly measured psychological distress indicators.
Social Connection | 11% | Loneliness rated 8/10. Holt-Lunstad et al. (2015) found perceived social isolation carries an independent 26% mortality risk increase. Higher than most surveillance tools weight it.
Family & Relationships | 10% | Relationship quality/IPV rated 7/10. Mechanistically distinct from social connection; captures domestic and relational distress specifically.
Mental Healthcare Access | 10% | Healthcare access degradation is a buffer erosion factor. When access fails, all other risk factors are amplified because the primary intervention pathway is blocked.
Physical Health | 9% | Physical health comorbidity rated 6/10, but anchored by hard clinical outcome measures where available.
Geopolitical & Political | 8% | Media-transmitted political and geopolitical anxiety. Holman et al. (2020) demonstrated that media exposure predicts PTSD and depression independently of direct event exposure.
Personal Safety | 7% | Community violence rated 6/10. Captures both direct violence exposure and perceived safety threat.
Purpose & Meaning | 5% | The literature strongly supports purpose/meaning as a protective factor whose erosion predicts depression onset (Frankl, 1959; Kim et al., 2022). Currently the hardest domain to measure with available public data.
Collective Safety | 5% | Disaster and mass-threat exposure. Steel et al. (2009) found approximately 30% PTSD prevalence in severely affected populations.
General Healthcare Access | 3% | Distinct from mental healthcare access. Captures general healthcare system degradation. Weight will increase as better data sources become available globally.
Environment & Place | 3% | Environmental health signals including air quality, extreme weather, and disaster exposure. Weight will increase as climate and environmental data sources expand.

3.3 Domain Contribution Matrix

Each domain's score is computed from a weighted combination of contributing data signals. The contribution matrix defines exactly which signals feed into each domain and at what proportion. Every row sums to 1.0.

The matrix is maintained separately for different region types to reflect the different data signals available in each geography. As new signals are added for a region, they are slotted into the appropriate matrix rows and the existing weights are re-normalised; the domain model itself does not change.
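One way to picture the re-normalisation step is the sketch below. It is a minimal illustration, not the project's code: the function name `add_signal`, the signal names, and the rule that existing weights are scaled by (1 - share) are all assumptions. The source only states that rows are re-normalised so they continue to sum to 1.0.

```python
def add_signal(row: dict[str, float], name: str, share: float) -> dict[str, float]:
    """Return a new matrix row that gives `name` the weight `share`.

    Assumed rule: existing weights are scaled by (1 - share), which keeps
    their relative proportions and keeps the row summing to 1.0.
    """
    if not 0.0 < share < 1.0:
        raise ValueError("share must be strictly between 0 and 1")
    if name in row:
        raise ValueError(f"signal {name!r} is already in the row")
    new_row = {signal: weight * (1.0 - share) for signal, weight in row.items()}
    new_row[name] = share
    return new_row


# Hypothetical row for one domain, before and after adding a new signal.
row = {"air_quality": 0.6, "disaster_exposure": 0.4}
updated = add_signal(row, "extreme_heat", 0.2)
```

In this example the two existing signals keep their 3:2 ratio and the updated row still sums to 1.0.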

For each domain and each region, the domain score is calculated as:

domain_score = sum(signal_score × weight) / sum(weight for signals actually present)

Missing signals are excluded from the weighted average entirely; they do not default to the baseline at this layer. This prevents a missing signal from pulling the domain score toward baseline when other signals are clearly elevated.
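The formula above can be sketched in a few lines of Python. The function name, the signal names, and the example weights are hypothetical; only the rule itself (absent signals drop out of both numerator and denominator) comes from the text.

```python
from typing import Mapping, Optional


def domain_score(
    signal_scores: Mapping[str, Optional[float]],
    weights: Mapping[str, float],
) -> Optional[float]:
    """Weighted average over the signals that are actually present.

    A signal counts as missing when it is absent from `signal_scores` or
    mapped to None. Missing signals are left out of both the numerator and
    the denominator, so they never pull the result toward 50.
    """
    present = {
        name: score
        for name, score in signal_scores.items()
        if score is not None and name in weights
    }
    total_weight = sum(weights[name] for name in present)
    if total_weight == 0:
        return None  # no usable signal for this domain
    weighted_sum = sum(weights[name] * score for name, score in present.items())
    return weighted_sum / total_weight


# Hypothetical matrix row for one domain (weights sum to 1.0).
row = {"labour_market": 0.5, "search_behaviour": 0.3, "housing_stress": 0.2}

# housing_stress is missing in this run.
scores = {"labour_market": 70.0, "search_behaviour": 60.0, "housing_stress": None}
result = domain_score(scores, row)
```

With these illustrative inputs the two surviving signals give (0.5 × 70 + 0.3 × 60) / 0.8 = 66.25. Substituting the baseline of 50 for the missing signal would have given 63, which is the pull toward baseline that the exclusion rule avoids.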

3.4 Per-Domain Design Rationale

Economic Security: Unemployment and labour market signals are the primary inputs because they are the most direct, high-frequency measures of economic stress available. Community-level economic distress language and search behaviour provide leading indicators of economic anxiety before it manifests in official statistics. Housing stress signals (eviction, affordability) capture a dimension of economic precarity that employment data alone misses.

Direct Mental Health: This domain captures the most proximate mental health signals available: community language in mental health forums, mental health search behaviour, and news coverage of mental health crises. It is the domain most directly measuring the construct the tool exists to track.

Social Connection: The evidence weight of 8/10 for loneliness justifies the third-highest domain weight. Available proxies include social isolation language in online communities and searches for loneliness and connection. Economic signals are excluded from this domain because economic stress and social isolation are sufficiently distinct pathways that cross-contamination should be minimised.

Family & Relationships: Relationship distress language in dedicated communities (divorce, abuse, relationship crisis forums) is the most direct available signal. Housing displacement signals contribute because eviction is a documented family disruption event.

Mental Healthcare Access: Search behaviour for mental health services captures unmet care demand in real time. Drug shortage data directly measures medication access failure. Community discussions of treatment barriers provide qualitative confirmation.

Physical Health: Where available, clinical outcome data (overdose mortality, substance-related deaths) serves as the hard anchor. Community health distress language and health-seeking search behaviour provide more responsive but softer signals.

Geopolitical & Political: News tone analysis dominates because media coverage is the primary transmission mechanism between political events and population-level anxiety. Community political distress language and political search behaviour contribute as corroborating signals.

Personal Safety: Where available, direct violence incident data is the lead indicator. Community safety language captures perceived threat at ground level. Conflict event data is critical in regions experiencing armed conflict. News coverage of crime and violence contributes as a secondary signal.

Purpose & Meaning: The most difficult domain to measure with public data. Community language about meaninglessness and hopelessness, and searches related to purpose and meaning, are weak but defensible proxies. This domain will strengthen significantly as employment quality data and community participation metrics become available.

Collective Safety: Disaster declarations and disaster event databases are the primary signals. News coverage of collective threat events and conflict data contribute in regions where these are relevant.

General Healthcare Access: Distinct from mental healthcare access. Captures general healthcare system degradation (hospital closures, wait times, uninsured rates). Available signals are weak proxies in most regions; weight will increase as better data sources are integrated.

Environment & Place: Air quality data provides genuine environmental health differentiation between regions. Disaster exposure data contributes. Climate signals (extreme heat, wildfire smoke) remain priority additions that would justify further weight increases.


4. Scoring Architecture

The scoring pipeline is a four-layer computation that applies identically to all regions worldwide.

Layer 1: Signal Ingestion → raw values per region
           ↓
Layer 2: Global Z-Score Normalisation → anchored [5, 95] scores
           ↓
Layer 3: Domain Contribution Matrix → 12 domain scores
           ↓
Layer 4: Domain Weights → ASSI → BVI Modifier → Final Score

4.1 Layer 1: Signal Ingestion

Signal fetchers run in sequence at each scoring cycle. Each returns a structured dataset of raw values per region. Errors are isolated: a failed fetcher contributes the global baseline (50.0) to all regions rather than blocking the pipeline. Geographic scope is declared per signal in a central registry; the pipeline discovers applicable signals for each region type at runtime.

As new data sources are added, they register into this layer without changes to the normalisation, domain model, or scoring formula downstream. The architecture is designed so that signal expansion is the primary path to improving score quality; the analytical layers are stable.
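The error isolation described in this section could look roughly like the sketch below. The function `score_signal`, its signature, and the example fetchers are hypothetical; the only behaviour taken from the text is that a failed fetcher yields the baseline score of 50.0 for every region instead of stopping the run.

```python
from typing import Callable, Dict, Iterable, Tuple

GLOBAL_BASELINE = 50.0  # score assigned when a signal cannot be fetched


def score_signal(
    fetch: Callable[[], Dict[str, float]],
    normalise: Callable[[float], float],
    regions: Iterable[str],
) -> Tuple[Dict[str, float], bool]:
    """Fetch one signal and normalise it; fall back to baseline on any error.

    Returns (scores per region, success flag). The flag lets a caller record
    that the signal is degraded without interrupting the other fetchers.
    """
    regions = list(regions)
    try:
        raw = fetch()
        # A region missing from `raw` raises KeyError and is treated here
        # as a malformed response for the whole signal (an assumption).
        scores = {region: normalise(raw[region]) for region in regions}
        return scores, True
    except Exception:
        # Broad catch is deliberate in this sketch: network errors, rate
        # limits, and malformed payloads are all isolated the same way.
        return {region: GLOBAL_BASELINE for region in regions}, False


def working_fetcher() -> Dict[str, float]:
    return {"A": 300.0, "B": 200.0}  # illustrative raw values


def broken_fetcher() -> Dict[str, float]:
    raise RuntimeError("simulated API rate limit")


def to_score(value: float) -> float:
    # Same shape as the Layer 2 formula, with illustrative constants.
    return max(5.0, min(95.0, 50.0 + (value - 200.0) / 100.0 * 15.0))


ok_scores, ok = score_signal(working_fetcher, to_score, ["A", "B"])
fallback_scores, fallback_ok = score_signal(broken_fetcher, to_score, ["A", "B"])
```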

4.2 Layer 2: Global Z-Score Normalisation

Every raw signal value is converted to a score on a common scale using globally-anchored Z-scores:

z     = (value − reference) / scale
score = clip(50 + z × 15,  min=5, max=95)

50 is the global developed-nation baseline. 15 is the scaling factor: one standard deviation from the global norm maps to ±15 score points. This is a deliberate calibration. A region one standard deviation above the reference scores 65 (in the lower part of the Severe band), while a region two standard deviations above (highly unusual) scores 80, the top of the Severe band. Only deviations beyond two standard deviations reach the Crisis band (81 and above), which preserves Crisis as a meaningful alarm level rather than a common occurrence.

Why clip to [5, 95]: Scores at 0 or 100 would imply certainty, namely that a region is exactly at the global floor or ceiling. Given signal noise and data lags, these extremes are not informationally meaningful. The clipping communicates that the score is an estimate with inherent uncertainty.

Reference Constants

Each signal is anchored against a published international reference value drawn from WHO, OECD, ILO, or equivalent global benchmarks. The reference represents the developed-nation baseline for that measure; the scale represents one standard deviation in the global distribution.

For example: if the OECD average unemployment claims rate is 200 per 100,000 during non-recessionary periods and the scale is 100, then a region at 300/100k scores 65: one standard deviation above baseline, appropriately flagged as Severe given the documented mental health impact. A region at 400/100k scores 80: two standard deviations above baseline, at the top of the Severe band and on the threshold of Crisis.
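The normalisation formula and the worked example can be checked with a short sketch. The function name `normalise` is hypothetical, and the reference of 200 with a scale of 100 is the illustrative figure from the example above, not a published constant.

```python
def normalise(value: float, reference: float, scale: float) -> float:
    """Convert a raw signal value to an anchored score in [5, 95]."""
    z = (value - reference) / scale
    score = 50.0 + z * 15.0
    return max(5.0, min(95.0, score))  # clip to [5, 95]


# Illustrative constants from the worked example.
REFERENCE = 200.0  # claims per 100,000
SCALE = 100.0      # one standard deviation

at_baseline = normalise(200.0, REFERENCE, SCALE)   # 50.0
one_sd_above = normalise(300.0, REFERENCE, SCALE)  # 65.0
two_sd_above = normalise(400.0, REFERENCE, SCALE)  # 80.0
extreme = normalise(1000.0, REFERENCE, SCALE)      # clipped to 95.0
```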

Dynamic Anchoring

Some signals, particularly those derived from social media language and search behaviour, cannot be meaningfully compared to a fixed clinical reference. For these signals, the normalisation uses the current scoring run's mean and standard deviation across all regions (with a minimum SD floor to prevent division by near-zero). This captures relative elevation: how much more distress signal is present in a given region versus the typical level across all regions in this run.

The globally-anchored signals (labour market data, clinical outcomes, disaster events, conflict data, etc.) do not shift with the batch, so the composite score is a mix of both anchoring approaches. Full global anchoring of all signals requires sufficient historical data to compute stable per-region baselines; this is a planned evolution as scoring history accumulates.
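A minimal sketch of batch-level anchoring follows, under stated assumptions: the function name is hypothetical, the SD floor of 1.0 is a placeholder (the source does not give the value), and the population standard deviation is used because the source does not say which estimator applies.

```python
from statistics import fmean, pstdev


def normalise_batch(raw: dict[str, float], min_sd: float = 1.0) -> dict[str, float]:
    """Score each region relative to the current run's mean and SD.

    `min_sd` is a placeholder floor that prevents division by a near-zero
    spread when all regions report almost the same value.
    """
    values = list(raw.values())
    mean = fmean(values)
    sd = max(pstdev(values), min_sd)
    return {
        region: max(5.0, min(95.0, 50.0 + (value - mean) / sd * 15.0))
        for region, value in raw.items()
    }


# Two regions one (population) standard deviation either side of the mean.
spread = normalise_batch({"A": 0.0, "B": 10.0})

# Every region equally elevated: batch anchoring reports no elevation.
uniform = normalise_batch({"A": 90.0, "B": 90.0, "C": 90.0})
```

The second call illustrates the property of this approach: when every region reports the same value, each one scores 50 regardless of how high that value is.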

4.3 Layer 3: Domain Contribution Matrix

See Section 3.3 for the matrix design and Section 3.4 for per-domain rationale.

4.4 Layer 4: ASSI, BVI, and Final Score

ASSI (Acute Stress Signal Index):

ASSI = Σ (domain_weight × domain_score)  across all 12 domains

BVI application:

Final Score = clip(ASSI × (1 + bvi_modifier),  min=5, max=95)

Where bvi_modifier ranges from −0.40 to +0.60. See Section 5 for BVI details.
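The two formulas above can be sketched as follows. The weights are the ones listed in Section 3.2; the function names are hypothetical, and requiring all 12 domain scores to be present is an assumption of this sketch, since the text does not say how the ASSI is formed when a domain has no data.

```python
DOMAIN_WEIGHTS = {
    "Economic Security": 0.16,
    "Direct Mental Health": 0.13,
    "Social Connection": 0.11,
    "Family & Relationships": 0.10,
    "Mental Healthcare Access": 0.10,
    "Physical Health": 0.09,
    "Geopolitical & Political": 0.08,
    "Personal Safety": 0.07,
    "Purpose & Meaning": 0.05,
    "Collective Safety": 0.05,
    "General Healthcare Access": 0.03,
    "Environment & Place": 0.03,
}


def assi(domain_scores: dict[str, float]) -> float:
    """Weighted sum of the 12 domain scores."""
    missing = set(DOMAIN_WEIGHTS) - set(domain_scores)
    if missing:
        # Assumption of this sketch: every domain must have a score.
        raise ValueError(f"missing domain scores: {sorted(missing)}")
    return sum(weight * domain_scores[name] for name, weight in DOMAIN_WEIGHTS.items())


def final_score(assi_value: float, bvi_modifier: float = 0.0) -> float:
    """Apply the BVI modifier to the ASSI and clip to [5, 95]."""
    if not -0.40 <= bvi_modifier <= 0.60:
        raise ValueError("bvi_modifier must lie in [-0.40, 0.60]")
    return max(5.0, min(95.0, assi_value * (1.0 + bvi_modifier)))


baseline_region = {name: 50.0 for name in DOMAIN_WEIGHTS}
baseline_assi = assi(baseline_region)        # every domain at 50 gives 50
amplified = final_score(60.0, 0.40)          # 60 x 1.4 = 84
capped = final_score(70.0, 0.60)             # 112 before clipping, 95 after
```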

Risk level classification:

Score | Risk Level | Colour
1 – 20 | Thriving | #4ade80
21 – 40 | Stable | #7dd3fc
41 – 60 | OECD Base | #fde68a
61 – 80 | Severe | #fb923c
81 – 100 | Crisis | #f87171

The bands are equal-interval (20 points each), with 50, the global OECD baseline, at the centre of the OECD Base band. A region scoring 50 is classified as "at the OECD baseline," not elevated.
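The band table translates into a simple lookup, sketched below. The table lists integer ranges, so the treatment of fractional scores is an assumption here: each band's upper edge is taken as inclusive, meaning a score of exactly 80 is Severe and anything above 80 is Crisis.

```python
# (inclusive upper edge, risk level, colour) in ascending order.
BANDS = [
    (20.0, "Thriving", "#4ade80"),
    (40.0, "Stable", "#7dd3fc"),
    (60.0, "OECD Base", "#fde68a"),
    (80.0, "Severe", "#fb923c"),
    (100.0, "Crisis", "#f87171"),
]


def classify(score: float) -> tuple[str, str]:
    """Return (risk level, colour) for a final score."""
    for upper, level, colour in BANDS:
        if score <= upper:
            return level, colour
    raise ValueError(f"score {score} is above 100")


baseline_level = classify(50.0)   # ("OECD Base", "#fde68a")
one_sd_level = classify(65.0)     # ("Severe", "#fb923c")
```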


5. Behavioral Vulnerability Index (BVI)

5.1 Purpose

The BVI encodes structural mental health risk: the pre-existing conditions that determine how much a given level of acute stress translates into actual mental health deterioration. Two regions with identical ASSI scores may have very different outcomes if one has substantially higher structural vulnerability.

The BVI is a modifier, not a separate score. It scales the ASSI: a highly vulnerable region sees its acute signal amplified; a structurally resilient region sees it dampened. The modifier ranges from −0.40 (dampening) to +0.60 (amplification).

5.2 Components

The BVI captures five dimensions of structural vulnerability, each supported by strong evidence as a moderating factor in population mental health outcomes:

Component | Scientific Basis
Childhood Adversity Prevalence | Evidence weight 9/10; strongest modifiable risk pathway (Felitti et al., 1998; Norman et al., 2012)
Poverty Rate | Evidence weight 9/10; foundational structural determinant (Lund et al., 2010; Ridley et al., 2020)
Healthcare Shortage Density | Healthcare access as buffer erosion; amplifies all other risk
Health Insurance Coverage | Insurance as protective buffer; particularly relevant for mental healthcare access
Income Inequality (Gini) | Independent mental health risk factor beyond absolute poverty (Wilkinson & Pickett, 2009)

The specific data sources used to measure these components vary by region. Where national statistical agencies provide direct measures (census data, health surveys, administrative records), those are used. Where direct measures are unavailable, international equivalents are drawn from WHO, World Bank, UNICEF, and ILO datasets.

5.3 Computation

Each component is normalised to a 0–1 scale and combined with equal weights. Components where higher values indicate worse conditions (childhood adversity, poverty, healthcare shortage, inequality) produce a higher positive modifier. Protective components (insurance coverage) are inverted.

bvi_raw      = weighted_average(normalised_components)
bvi_modifier = scale(bvi_raw, min=−0.40, max=+0.60)
Final Score  = clip(ASSI × (1 + bvi_modifier), 5, 95)
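The computation above might be sketched as follows. Several details are assumptions rather than documented behaviour: the component keys are invented names, the inputs are taken to be already normalised to 0–1, and `scale` is read as a linear map from [0, 1] onto [−0.40, +0.60]. The neutral fallback for incomplete data follows Section 5.6.

```python
from typing import Mapping

# Hypothetical component keys; higher values mean worse conditions.
HARMFUL = (
    "childhood_adversity",
    "poverty_rate",
    "healthcare_shortage",
    "income_inequality",
)
# Protective components are inverted before averaging.
PROTECTIVE = ("insurance_coverage",)

MODIFIER_MIN = -0.40
MODIFIER_MAX = 0.60


def bvi_modifier(components: Mapping[str, float]) -> float:
    """Equal-weight BVI from components already normalised to [0, 1]."""
    names = HARMFUL + PROTECTIVE
    if any(name not in components for name in names):
        return 0.0  # neutral modifier when component data is unavailable
    values = [components[name] for name in HARMFUL]
    values += [1.0 - components[name] for name in PROTECTIVE]
    bvi_raw = sum(values) / len(values)
    # Assumed linear scaling of [0, 1] onto [MODIFIER_MIN, MODIFIER_MAX].
    return MODIFIER_MIN + bvi_raw * (MODIFIER_MAX - MODIFIER_MIN)


most_resilient = bvi_modifier(
    {**{name: 0.0 for name in HARMFUL}, "insurance_coverage": 1.0}
)
most_vulnerable = bvi_modifier(
    {**{name: 1.0 for name in HARMFUL}, "insurance_coverage": 0.0}
)
no_data = bvi_modifier({"poverty_rate": 0.3})
```

Under the linear reading, a raw BVI of 0.4 maps to a modifier of zero, so the neutral point sits below the midpoint of the normalised scale.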

5.4 Why This Range

The modifier range [−0.40, +0.60] was chosen over a raw multiplier approach. Applied to a 50-baseline score, a naïve 1.4× multiplier on a score of 60 would yield 84, potentially pushing a region into Crisis that is merely Severe on acute signals. The percentage-based additive modifier is more conservative and directly interpretable: structural vulnerability amplifies the acute signal by up to 60%.

5.5 Update Cadence

BVI components are based on annual surveys and administrative data, typically with a 1–2 year publication lag. The BVI refreshes on a 30-day cycle, which is more than sufficient given the annual cadence of the underlying data. A known limitation is that sudden structural changes (emergency policy shifts, post-disaster healthcare changes) may not be reflected for up to a year.

5.6 Global BVI Coverage

BVI availability varies by region. Where all five components can be sourced from national or international datasets, a full BVI is computed. Where component data is unavailable, the BVI defaults to a neutral modifier (0.0), so the ASSI passes through unmodified. BVI coverage is expanding as international data sources are integrated. The goal is full BVI coverage for all scored regions.


6. Syndemic Detection

6.1 Definition

A ⚡ SYNDEMIC badge is displayed when three or more of the 12 domain scores are simultaneously ≥ 65.

The threshold of 65 was chosen because it falls in the lower portion of the Severe band: a domain score of 65 reflects a meaningful deviation from baseline that is not merely noise. A threshold in the OECD Base band would trigger too easily; a threshold at 70+ would miss early syndemic patterns.

6.2 Scientific Basis

Singer et al. (2017) in The Lancet define a syndemic as two or more epidemics interacting synergistically and contributing to excess burden of disease. The mental health application, supported by Bellis et al. (2017) and the ACE literature, is that co-occurring stressors produce compounding, not merely additive, effects on mental health outcomes.

A region with Economic Security score 70, Mental Healthcare Access score 72, and Social Connection score 68 is in a qualitatively different situation from a region with the same composite score driven by a single elevated signal. The composite score alone cannot encode this distinction; the badge does.
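The badge rule from Section 6.1 is small enough to sketch directly. The function name and return shape are hypothetical; the threshold of 65 and the minimum of three domains come from the definition, and the example scores are illustrative.

```python
SYNDEMIC_THRESHOLD = 65.0
SYNDEMIC_MIN_DOMAINS = 3


def syndemic_status(domain_scores: dict[str, float]) -> tuple[bool, list[str]]:
    """Return (badge shown?, domains at or above the threshold)."""
    elevated = [
        name for name, score in domain_scores.items() if score >= SYNDEMIC_THRESHOLD
    ]
    return len(elevated) >= SYNDEMIC_MIN_DOMAINS, elevated


# Three co-occurring elevated domains, as in the example above.
clustered = {
    "Economic Security": 70.0,
    "Mental Healthcare Access": 72.0,
    "Social Connection": 68.0,
    "Physical Health": 50.0,
}

# One very high domain with everything else near baseline.
single_driver = {
    "Economic Security": 95.0,
    "Mental Healthcare Access": 52.0,
    "Social Connection": 48.0,
    "Physical Health": 50.0,
}

clustered_badge, clustered_domains = syndemic_status(clustered)
single_badge, single_domains = syndemic_status(single_driver)
```

The first region receives the badge and the second does not, even though its single elevated domain is far higher.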

6.3 Display Only

The SYNDEMIC badge does not modify the composite score. This is an intentional conservative choice. The literature supports compounding effects, but the specific magnitude of a score adjustment would be intuitive rather than calibrated against empirical data. Adding uncalibrated bonuses to the composite would risk producing scores that users cannot interpret or verify. A calibrated syndemic score adjustment will be introduced once validation against clinical outcome data becomes possible.


7. Global Design

7.1 Architecture

The scoring formula, domain model, normalisation approach, BVI logic, and syndemic detection apply identically to all regions worldwide. There is no separate scoring pathway for different geographies. A unified pipeline handles all region types via a signal registry that declares which data sources apply to each geography.

7.2 Why the Architecture Is Geography-Agnostic

The system was designed for global deployment from the start:

Expanding coverage to new regions is primarily a data sourcing task, not a methodology change. New signals register into the existing pipeline and contribute to the domain model through the contribution matrix. The analytical architecture is stable.

7.3 The OECD Baseline in Global Context

The OECD-calibrated baseline (50 = developed-nation standard) is retained globally without modification. Regions in the Global South will structurally score above 50 on several signals; this is the honest and correct representation. The tool measures deviation from a human development standard, not from a regional norm.

This enables direct cross-region comparison. Observing that a high-income country's physical health signal scores similarly to a middle-income country is a meaningful and actionable finding, not a bug in the methodology.

7.4 Data Sufficiency Gating

Not all regions have identical signal coverage. The system handles this transparently:

Signal coverage is disclosed per region on the frontend: a data confidence indicator shows which signals are active and which are absent.

7.5 Signal Coverage by Country Income Tier

Country Tier | Expected Signal Coverage | Typical Domain Coverage
OECD High Income | Broadest signal stack; all domain types represented | All 12 domains
Upper-Middle Income | Most core signals available | 10 of 12 domains
Lower-Middle Income | Economic, conflict, disaster, and search signals | 8 of 12 domains
Low Income / Conflict-affected | Conflict and disaster signals are primary | 4–6 domains
Data-sparse / closed | News coverage signals only | 3 domains; confidence marked LOW

The signal stack is continuously expanding to close these coverage gaps. The goal is meaningful signal coverage across all 12 domains for every scored region.


8. Data Pipeline & Infrastructure

8.1 Pipeline Architecture

Scheduled Pipeline (every 6 hours)
   │
   ├── Run unified scoring function
   │     ├── Signal fetchers (sequential, error-isolated)
   │     ├── Normaliser (Z-score conversion)
   │     ├── Domain model (contribution matrix → 12 domain scores)
   │     ├── ASSI computation (weighted domain sum)
   │     ├── BVI fetch + modifier application
   │     └── SYNDEMIC detection
   │
   ├── Write unified scores cache (all regions)
   ├── Write seed file (cold-start fallback)
   ├── Append to rolling history (7-day window)
   └── Commit + deploy

API Backend
   │
   ├── Serves scores cache (auto-reloads on file change)
   ├── GET /scores: all regions (filterable by type)
   ├── GET /scores/{region}: single region detail
   ├── GET /history: 7-day rolling history
   ├── GET /signals: signal status and freshness
   └── GET /metadata: data freshness and distribution

Static Frontend
   │
   ├── Interactive choropleth maps
   ├── Region detail panel with 12 domain bars
   ├── Trend charts
   └── Auto-refresh matching pipeline cadence

8.2 Unified Pipeline

All regions are scored in a single pipeline run. There are no separate pipelines for different geographies. The workflow discovers which signals apply to each region type via a central signal registry, runs all applicable fetchers, and produces a single unified output.

8.3 Graceful Degradation

If any fetcher fails (network error, API rate limit, malformed response), that signal defaults to the global baseline (50.0) for all regions. The pipeline always produces a complete output even if some signals are temporarily unavailable. Stale cached data is used if fresh data cannot be obtained; a staleness flag is raised but scores are not penalised. This design ensures the tool remains available and scores remain interpretable even under imperfect data conditions.

8.4 Data Files

Unified scores cache: The primary live output. Contains a timestamp and an array of region records, each with: composite score, risk level, colour, all 12 domain scores, ASSI, BVI modifier, syndemic badge status, data confidence level, and top contributing signals.

Rolling history: Array of snapshots with per-region scores. Maximum 7-day (168-hour) rolling window. Used for trend charts and momentum analysis.

Seed file: Cold-start fallback copy of the most recent full scoring run, ensuring the tool is never empty on first load.

Signal-specific caches are stored separately with their own time-to-live values appropriate to each source's update frequency.
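The 7-day rolling history described above could be maintained with a prune-then-append step like the one sketched here. The snapshot layout (a `timestamp` string in ISO 8601 form with an explicit UTC offset, plus a `scores` mapping) is a hypothetical shape chosen for the example, not the actual file format.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

WINDOW = timedelta(days=7)  # 168-hour rolling window


def append_snapshot(
    history: list[dict], snapshot: dict, now: Optional[datetime] = None
) -> list[dict]:
    """Drop snapshots older than the window, then append the new one."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - WINDOW
    kept = [
        entry
        for entry in history
        if datetime.fromisoformat(entry["timestamp"]) >= cutoff
    ]
    kept.append(snapshot)
    return kept


# Illustrative history: one entry just outside the window, one inside it.
now = datetime(2026, 3, 8, 6, 0, tzinfo=timezone.utc)
history = [
    {"timestamp": "2026-03-01T00:00:00+00:00", "scores": {"A": 52.0}},
    {"timestamp": "2026-03-07T18:00:00+00:00", "scores": {"A": 55.0}},
]
new_snapshot = {"timestamp": now.isoformat(), "scores": {"A": 57.0}}
updated_history = append_snapshot(history, new_snapshot, now=now)
```

At a six-hour cadence a full window holds about 28 snapshots, so a linear scan on each run is inexpensive.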

8.5 Backend

The API backend reads the scores cache on startup and maintains an in-process cache. When the pipeline commits new data, the next request detects the file change and reloads automatically; no restart or webhook is required.
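One common way to get this behaviour is to compare the file's modification time on each request, as in the sketch below. This is an assumption about the mechanism: the source says only that the file change is detected, and the class name, JSON format, and path are hypothetical.

```python
import json
import os


class ScoresCache:
    """In-process cache that reloads when the backing file changes."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._mtime_ns = None  # modification time of the loaded copy
        self._data = None

    def get(self):
        # One stat() call per request; the file is re-read only when its
        # modification time differs from the copy held in memory.
        mtime_ns = os.stat(self._path).st_mtime_ns
        if mtime_ns != self._mtime_ns:
            with open(self._path, encoding="utf-8") as handle:
                self._data = json.load(handle)
            self._mtime_ns = mtime_ns
        return self._data


# Usage (hypothetical path): create once at startup, call get() per request.
# cache = ScoresCache("data/scores.json")
# payload = cache.get()
```

A modification-time check has no background thread to manage, at the cost of one file-system call per request. It can miss a rewrite that lands within the file system's timestamp resolution, so a content hash would be a stricter alternative.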

CORS is configured for the production domain, preview/staging URLs, and local development.

8.6 Frontend

The frontend is a static site served via CDN. Maps are rendered client-side. Regions are coloured by the pre-computed risk level colour from the API. Selecting a region renders a detail panel showing all 12 domain scores, SYNDEMIC badge status, and a trend sparkline from the history endpoint.

8.7 Auth Architecture (Planned)

The frontend sends an authentication header on every request from day one. For free-tier users the token is empty and ignored. When the paid tier launches, the backend stub becomes real authentication middleware, with zero frontend changes required.


9. Limitations & Caveats

9.1 Signal Coverage Inequalities

Signal quality varies by population size and data infrastructure maturity. Smaller regions and countries with less developed statistical systems have higher score variance and fewer active signals. Scores for these regions should be interpreted with greater caution, and the data confidence indicator should be checked.

9.2 Dynamic vs. Fixed Anchors

Signals derived from social media and search behaviour use batch-level anchoring (current run mean and SD). This captures relative elevation within the current batch, not absolute elevation against a global reference. A global batch in which every region is experiencing elevated distress will show no elevation for any individual region. The globally-anchored signals (labour market, clinical outcomes, disasters, conflict) partially correct for this. Full global anchoring of all signals requires sufficient historical data to compute stable per-region baselines.

9.3 Risk Conditions, Not Outcomes

An elevated score indicates conditions associated with risk. It does not mean deterioration has occurred or will necessarily occur. Protective factors (social cohesion, strong healthcare infrastructure, community resilience, timely intervention) can substantially buffer the effects even when risk conditions are present.

9.4 Structural Data Lag

BVI components and some clinical outcome signals are based on annual data with 1–2 year publication lags. These signals anchor the score against hard structural and clinical measures, but they do not respond to acute changes. They are confirmation signals, not leading indicators.

9.5 Social Media Representativeness

Online community language and search behaviour signals skew toward younger, more educated, more internet-connected populations. Rural communities, older adults, non-English speakers, and populations with lower internet access are systematically underrepresented. The BVI and labour market signals partially compensate by capturing structural factors affecting these populations. Full representativeness is an ongoing challenge that improves as the signal stack diversifies.

9.6 National-Level Signal Uniformity

Some signals (e.g., drug shortage data) are available only at the national level. In these cases, all sub-national regions within a country receive the same score for that signal. This is a known limitation disclosed in the signal metadata. These signals are best understood as national-level modifiers rather than region-differentiating inputs.


10. Validation Strategy

The composite score and its domain components require ongoing validation against clinical outcomes to be scientifically defensible. The validation strategy operates at multiple levels and applies globally, adapting to the health surveillance infrastructure available in each region.

10.1 Validation Data Sources

Validation compares Mental Health Radar scores against independent ground-truth measures of mental health outcomes. The specific data sources vary by region, but the categories are universal:

Emergency psychiatric presentation volume: Hospital emergency department data on mental health presentations, available with varying lag times depending on the national health reporting system. Useful for validating acute signal spikes.

Crisis helpline call volume: National and regional crisis line data validates the crisis signal layer. Available in most countries with established crisis services (e.g., 988 in the US, 116 123 in the EU, Lifeline services across Asia-Pacific).

Substance-related mortality: National drug and alcohol mortality surveillance validates Physical Health domain predictions. Typically available with 6–12 month lag depending on the national vital statistics system.

Population mental health surveys: Nationally representative surveys (BRFSS, Eurobarometer, WHO World Mental Health Survey, national health interview surveys) validate absolute burden scores against clinical prevalence estimates. Annual cadence.

Insurance and administrative claims data: Where available, anonymised mental health service utilisation data validates Healthcare Access domain scores.

10.2 Validation Protocol

  1. Log every composite score and domain score with timestamp and geography.
  2. Match to ground-truth outcome measures at the appropriate lag (2 weeks for emergency data, 6–12 months for mortality, annually for survey data).
  3. Compute correlation, sensitivity, and specificity metrics at each level (composite, domain, signal).
  4. Report accuracy metrics publicly in this methodology documentation.
  5. Use validation findings to recalibrate reference constants, domain weights, and signal contributions where the evidence warrants it.
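
Steps 2 and 3 of the protocol can be sketched as follows. This is a minimal illustration under stated assumptions, not the production validation code: the weekly indexing, the function names, the alert thresholds (`score_cut`, `outcome_cut`), and all data values are hypothetical. Scores are paired with outcomes observed a fixed number of periods later, then summarised with a Pearson correlation and with sensitivity and specificity at the chosen cut-offs.

```python
from math import sqrt


def lag_match(scores, outcomes, lag):
    """Pair the score at period t with the outcome observed at period t + lag."""
    return [(scores[t], outcomes[t + lag])
            for t in sorted(scores) if t + lag in outcomes]


def pearson(pairs):
    """Pearson correlation between scores and lagged outcomes."""
    xs, ys = zip(*pairs)
    n = len(pairs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def sensitivity_specificity(pairs, score_cut, outcome_cut):
    """Treat score >= score_cut as an alert and outcome >= outcome_cut as an event."""
    tp = sum(1 for s, o in pairs if s >= score_cut and o >= outcome_cut)
    fn = sum(1 for s, o in pairs if s < score_cut and o >= outcome_cut)
    tn = sum(1 for s, o in pairs if s < score_cut and o < outcome_cut)
    fp = sum(1 for s, o in pairs if s >= score_cut and o < outcome_cut)
    # Assumes at least one event and one non-event; otherwise a ratio is undefined.
    return tp / (tp + fn), tn / (tn + fp)


# Illustrative weekly data: composite score and emergency presentations by week index.
scores = {0: 40, 1: 45, 2: 70, 3: 75, 4: 50, 5: 42}
outcomes = {2: 100, 3: 110, 4: 180, 5: 170, 6: 120, 7: 105}

pairs = lag_match(scores, outcomes, lag=2)  # 2-week lag for emergency data
r = pearson(pairs)
sens, spec = sensitivity_specificity(pairs, score_cut=60, outcome_cut=150)
print(f"r={r:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

The same three functions could be applied at each level named in step 3 (composite, domain, signal) by swapping the score series, and at each lag named in step 2 by changing `lag` and the period unit.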

10.3 Validation Coverage

Validation data availability varies by region. High-income countries with mature health surveillance systems will be validated first and most rigorously. The validation framework is designed to incorporate new ground-truth sources as they become available in each region, ensuring that the tool's accuracy claims are always proportionate to the evidence behind them.


11. Roadmap

Signal Expansion

The signal stack is the primary lever for improving score quality. Planned additions include: WHO Global Health Observatory metrics, UNODC substance use data, World Bank poverty and inequality indicators, international social media analysis, climate and extreme weather signals, and mass layoff leading indicators. Each new signal slots into the existing domain model without changing the scoring architecture.

Global Coverage

The goal is meaningful signal coverage across all 12 domains for 190+ countries, followed by sub-national scoring (province/state/region level) for major countries. Sub-national expansion is prioritised by statistical infrastructure maturity, beginning with countries and blocs that have strong open-data APIs (Eurostat in the EU, the ONS in the UK, Statistics Canada, the ABS in Australia, and others).

Scoring Enhancements

Premium Features (Planned)


12. References

ACLED (Armed Conflict Location & Event Data). (2024). ACLED Methodology and Coding Decisions.

Alicandro, G., et al. (2022). Using Google search query surveillance to monitor depression trends. Journal of Affective Disorders, 296, 149–155.

Ayers, J.W., et al. (2013). Seasonality in seeking mental health information on Google. American Journal of Preventive Medicine, 44(5), 520–525.

Baglioni, C., et al. (2011). Insomnia as a predictor of depression: a meta-analytic evaluation. Journal of Affective Disorders, 135(1–3), 10–19.

Bellis, M.A., et al. (2017). Adverse childhood experiences and their impact on health-harming behaviours in the Welsh adult population. BMC Public Health, 17, 630.

Benach, J., et al. (2014). Precarious employment: understanding an emerging social determinant of health. Annual Review of Public Health, 35, 229–253.

Bianchi, R., et al. (2015). Burnout and depression: Evidence about their convergence but also their distinctiveness. Psychological Reports, 117(3), 931–947.

Cacioppo, J.T., & Cacioppo, S. (2018). The growing problem of loneliness. The Lancet, 391(10119), 426.

Cianconi, P., et al. (2020). The impact of climate change on mental health: A systematic descriptive review. Frontiers in Psychiatry, 11, 74.

Coppersmith, G., et al. (2014). Measuring post traumatic stress disorder in Twitter. ICWSM, 8(1), 579–582.

Desmond, M. (2016). Evicted: Poverty and profit in the American city. Crown Publishers.

Felitti, V.J., et al. (1998). Relationship of childhood abuse and household dysfunction to many of the leading causes of death in adults. American Journal of Preventive Medicine, 14(4), 245–258.

Fowler, P.J., et al. (2009). Community violence: A meta-analysis on the effect of exposure and mental health outcomes of children and adolescents. Development and Psychopathology, 21(1), 227–259.

Frankl, V. (1959). Man's Search for Meaning. Beacon Press.

Graetz, N. (2021). Estimating the effect of eviction on housing instability and mental health. Housing Policy Debate, 31(3–5), 710–730.

Harvey, S.B., et al. (2017). Can work make you mentally ill? A systematic meta-review of work-related risk factors for common mental health problems. Occupational and Environmental Medicine, 74(4), 301–310.

Hickman, C., et al. (2021). Climate anxiety in children and young people and their beliefs about government responses to climate change. The Lancet Planetary Health, 5(12), e863–e873.

Holman, E.A., et al. (2014). Media's role in broadcasting acute stress following the Boston Marathon bombings. PNAS, 111(1), 93–98.

Holt-Lunstad, J., et al. (2015). Loneliness and social isolation as risk factors for mortality: A meta-analytic review. Perspectives on Psychological Science, 10(2), 227–237.

Howard, D.M., et al. (2019). Genome-wide meta-analysis of depression identifies 102 independent variants. Nature Neuroscience, 22(3), 343–352.

Jahoda, M. (1982). Employment and Unemployment: A Social-Psychological Analysis. Cambridge University Press.

Katon, W.J., et al. (2004). The pathways study: a randomized trial of collaborative care in patients with diabetes and depression. Archives of General Psychiatry, 61(10), 1042–1049.

Kessler, R.C., et al. (1996). The epidemiology of co-occurring addictive and mental disorders. American Journal of Orthopsychiatry, 66(1), 17–31.

Kim, E.S., et al. (2013). Purpose in life and reduced incidence of stroke in older adults. Journal of Psychosomatic Research, 74(5), 427–432.

Kotov, R., et al. (2010). Linking "big" personality traits to anxiety, depressive, and substance use disorders: A meta-analysis. Psychological Bulletin, 136(5), 768–821.

Lederbogen, F., et al. (2011). City living and urban upbringing affect neural social stress processing in humans. Nature, 474(7352), 498–501.

Leetaru, K., & Schrodt, P.A. (2013). GDELT: Global data on events, location, and tone, 1979–2012. ISA Annual Convention, 2(4).

Liu, N.H., et al. (2017). Excess mortality in persons with severe mental disorders. World Psychiatry, 16(1), 30–40.

Lund, C., et al. (2010). Poverty and common mental disorders in low and middle income countries: A systematic review. Social Science & Medicine, 71(3), 517–528.

Main, M., & Hesse, E. (1990). Parents' unresolved traumatic experiences are related to infant disorganized attachment status. In M.T. Greenberg et al. (Eds.), Attachment in the Preschool Years. University of Chicago Press.

Mair, C., et al. (2008). Is neighbourhood racial/ethnic composition associated with depressive symptoms? Social Science & Medicine, 67(11), 1657–1665.

McEwen, B.S. (2017). Neurobiological and systemic effects of chronic stress. Chronic Stress, 1, 2470547017692328.

Meyer, I.H. (2003). Prejudice, social stress, and mental health in lesbian, gay, and bisexual populations. Psychological Bulletin, 129(5), 674–697.

Mikulincer, M., & Shaver, P.R. (2012). Attachment theory expanded. In K. Deaux & M. Snyder (Eds.), The Oxford Handbook of Personality and Social Psychology. Oxford University Press.

Miller, A.H., & Raison, C.L. (2016). The role of inflammation in depression. Nature Reviews Immunology, 16(1), 22–34.

Moore, T.H., et al. (2007). Cannabis use and risk of psychotic or affective mental health outcomes. The Lancet, 370(9584), 319–328.

Musliner, K.L., et al. (2021). Polygenic liability and the developmental and familial risk for major depression. Psychological Medicine, 51(6), 980–988.

Nolen-Hoeksema, S., et al. (2008). Rethinking rumination. Perspectives on Psychological Science, 3(5), 400–424.

Norman, R.E., et al. (2012). The long-term health consequences of child physical abuse, emotional abuse, and neglect. PLOS Medicine, 9(11), e1001349.

OECD. (2023). Health Statistics. OECD Publishing, Paris.

OECD. (2023). Labour Force Statistics. OECD Publishing, Paris.

Ouellet-Morin, I., et al. (2015). Intimate partner violence and mental health disorders in adulthood. Psychological Medicine, 45(4), 661–677.

Patel, V., et al. (2018). The Lancet Commission on global mental health and sustainable development. The Lancet, 392(10157), 1553–1598.

Paul, K.I., & Moser, K. (2009). Unemployment impairs mental health: Meta-analyses. Journal of Vocational Behavior, 74(3), 264–282.

Reiss, F. (2013). Socioeconomic inequalities and mental health problems in children and adolescents. Social Science & Medicine, 90, 24–31.

Ridley, M., et al. (2020). Poverty, depression, and anxiety: Causal evidence and mechanisms. Science, 370(6522), eaay0214.

Schmitt, M.T., et al. (2014). The consequences of perceived discrimination for psychological well-being. Psychological Bulletin, 140(4), 921–948.

Schuch, F.B., et al. (2018). Physical activity and incident depression: A meta-analysis. American Journal of Psychiatry, 175(7), 631–648.

Singer, M., et al. (2017). Syndemics and the biosocial conception of health. The Lancet, 389(10072), 941–950.

Steel, Z., et al. (2009). Association of torture and other potentially traumatic events with mental health outcomes. JAMA, 302(5), 537–549.

Sullivan, P.F., et al. (2012). Genetic epidemiology of schizophrenia. Schizophrenia Bulletin, 38(5), 892–941.

Valkenburg, P.M., et al. (2022). Social media use and its impact on adolescent mental health: An umbrella review. Current Opinion in Psychology, 44, 58–68.

Varese, F., et al. (2012). Childhood adversities increase the risk of psychosis. Schizophrenia Bulletin, 38(4), 661–671.

Vos, T., et al. (2020). Global burden of 369 diseases and injuries in 204 countries and territories, 1990–2019. The Lancet, 396(10258), 1204–1222.

Wilkinson, R., & Pickett, K. (2009). The Spirit Level: Why Equality is Better for Everyone. Allen Lane.

Williams, D.R., et al. (2019). Racism and health: evidence and needed research. Annual Review of Public Health, 40, 105–125.

World Health Organization. (2021). Violence against women prevalence estimates, 2018. WHO.

World Health Organization. (2022). World Mental Health Atlas. WHO.

World Health Organization. Global Health Observatory (GHO) Data Repository. https://www.who.int/data/gho
