Modern societies are increasingly shaped by rare events that are statistically infrequent yet catastrophically impactful. Financial crises, large-scale industrial accidents, major infrastructure failures, aviation disasters, and systemic blackouts occur with low probability, but their human, economic, and societal consequences are enormous and long-lasting. Such events often remain invisible in aggregated statistics, escape traditional risk metrics, and are poorly captured by average-based models, despite causing disproportionate losses when they occur.
More broadly, the modern industrial world operates close to performance and safety limits. Transportation systems, vehicle manufacturers, energy networks, chemical plants, and even medicine and pharmaceutical production rely on highly optimized processes where small deviations, rare interactions, or unexpected behavioral responses can trigger safety-critical outcomes. In such tightly coupled systems, failures rarely stem from a single cause; instead, they emerge from complex interactions between technology, environment, and human behavior, often under time pressure and uncertainty.
Within this context, modern transportation systems exemplify how safety-critical events emerge from subtle, dynamic interactions between traffic conditions, infrastructure, environment, and human behavior. Crashes and incidents are rarely the result of a single factor; instead, they arise from the accumulation of risk over time, before any visible disruption occurs.
Traffic crashes belong to the same class of low-probability, high-impact events. At the level of individual vehicles or road segments, crashes are rare, yet their cumulative effect is devastating. In Belgium alone, road crashes cost more than €11 billion per year, approximately 2% of national GDP, through loss of life, injuries, healthcare costs, congestion, infrastructure damage, and productivity loss. What is striking is that this economic damage is caused by only 314 crashes and 4 fatalities per 1 billion km traveled. Despite continuous technological progress in vehicles and infrastructure, traffic safety remains a systemic risk problem rather than a purely local or random phenomenon.
Risk analysis and profiling aim to uncover these latent processes, transforming raw traffic and contextual data into actionable knowledge about where, when, and why safety deteriorates.
Viewed as a means of understanding the mechanisms behind rare events, risk analysis and continuous risk indexing are not domain-specific techniques, but general tools for understanding and managing rare, high-impact events in complex systems. The same challenges encountered in traffic safety (extreme class imbalance, noisy and partially observed signals, interacting human and technical components, and delayed or indirect outcomes) are central to many other safety-critical domains. Industrial operations, energy networks, finance, cybersecurity, healthcare, and large-scale infrastructure systems all require methods that can fuse heterogeneous information sources, learn from rare but consequential events, and translate complex dynamics into interpretable and actionable risk signals. The combination of probabilistic modeling, machine and deep learning, time-series analysis, and explainable AI enables robust risk assessment under uncertainty and supports early, informed decision-making wherever anticipating failure before it materializes is essential.
This project investigates pedestrian crash severity as a complex risk outcome shaped by interacting human, environmental, and infrastructural factors. While pedestrian crashes represent a relatively small share of total traffic incidents, their injury severity and societal cost are disproportionately high, making them a critical safety concern. The motivation behind this work is to move beyond descriptive or single-model analyses and to rigorously assess how different modeling paradigms capture severity mechanisms, uncertainty, and nonlinear effects. By systematically comparing statistical and machine-learning approaches, the project aims to provide both methodological clarity and practical insight into how crash severity can be more accurately modeled and interpreted for safety analysis and policy design. The dataset comes from Salt Lake City, Utah, USA.
The study is built on a carefully structured modeling framework that combines theoretical grounding, statistical inference, and predictive validation. A Generalized Ordered Probit model is used to explicitly represent the ordinal nature of injury severity while relaxing restrictive parallel-regression assumptions. This is complemented by a stacking ensemble that integrates heterogeneous learners to improve robustness and generalization, and by TabNet, a deep learning architecture specifically designed for structured tabular data. The models are trained, validated, and compared under consistent experimental conditions, with careful attention to model stability, class imbalance, interpretability, and out-of-sample performance. This layered approach reflects a modeling philosophy in which explanatory power, predictive accuracy, and reliability under uncertainty are treated as complementary objectives, rather than trade-offs.
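As an illustration of what relaxing the parallel-regression assumption means in practice, the following minimal sketch computes severity-category probabilities under a generalized ordered probit in which each threshold may shift linearly with the covariates. The linear threshold form, the function name `gop_probabilities`, the two covariates, and all coefficient values are assumptions chosen for illustration; they are not the specification or the estimates used in the study.

```python
import math


def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gop_probabilities(x, beta, base_thresholds, threshold_shifts):
    """Category probabilities for one observation.

    x                : covariate values
    beta             : coefficients of the latent severity propensity
    base_thresholds  : J-1 threshold intercepts for J ordered categories
    threshold_shifts : J-1 coefficient lists; all zeros gives the
                       standard (parallel) ordered probit
    """
    xb = sum(xi * bi for xi, bi in zip(x, beta))
    # Each threshold is allowed to move with the covariates.
    cuts = [
        mu + sum(xi * gi for xi, gi in zip(x, gamma))
        for mu, gamma in zip(base_thresholds, threshold_shifts)
    ]
    if any(upper <= lower for lower, upper in zip(cuts, cuts[1:])):
        raise ValueError("thresholds must be strictly increasing")
    cdf = [0.0] + [norm_cdf(c - xb) for c in cuts] + [1.0]
    return [hi - lo for lo, hi in zip(cdf, cdf[1:])]


# Hypothetical example: three severity levels, two covariates
# (say, a night-time indicator and a scaled vehicle speed).
beta = [0.6, 0.8]
base_thresholds = [0.0, 1.2]
# The first covariate lowers only the upper threshold, so it raises the
# probability of the most severe category beyond its effect through beta.
threshold_shifts = [[0.0, 0.0], [-0.3, 0.0]]

probs = gop_probabilities([1.0, 0.5], beta, base_thresholds, threshold_shifts)
print([round(p, 4) for p in probs])
```

Setting every entry of `threshold_shifts` to zero recovers the standard ordered probit, which makes the sketch a convenient way to see how much a covariate-specific threshold changes the predicted severity distribution.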
From a methodological perspective, this project makes extensive use of modern machine learning and deep learning techniques tailored to safety-critical data. The stacking ensemble leverages model diversity to capture nonlinear interactions that are difficult to specify a priori, while TabNet employs sequential attention mechanisms to learn feature selection and interaction patterns directly from data, improving both performance and interpretability. Explainability tools are used to trace how different variables contribute to severity outcomes across models, allowing learned patterns to be critically assessed rather than treated as black-box results. These methods are particularly important in the context of crash severity analysis, where rare outcomes, noisy measurements, and correlated predictors are the norm rather than the exception (Transportation Research Board 103rd Annual Meeting Transportation Research - 2024, Data Science for Transportation - 2024).
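TabNet's feature-selection masks are produced with sparsemax, a projection onto the probability simplex that can assign exactly zero weight to some features. The sketch below implements sparsemax in plain Python and applies it over two decision steps with a simplified prior-scale update. The scores, the relaxation value `gamma`, and the helper names are hypothetical, and a real TabNet learns the scores with trainable layers rather than taking them as given.

```python
def sparsemax(scores):
    """Project scores onto the probability simplex; outputs can be exactly 0."""
    ordered = sorted(scores, reverse=True)
    cumulative = 0.0
    support_sum = 0.0
    k = 0
    for i, z in enumerate(ordered, start=1):
        cumulative += z
        # Keep the largest i for which the i-th sorted score stays in the support.
        if 1.0 + i * z > cumulative:
            k = i
            support_sum = cumulative
    tau = (support_sum - 1.0) / k
    return [max(z - tau, 0.0) for z in scores]


def sequential_masks(step_scores, gamma=1.3):
    """Compute one feature mask per decision step.

    After each step, the prior scale of a feature is multiplied by
    (gamma - mask), so features already used heavily are discouraged
    in later steps. gamma = 1 would allow each feature to be used once.
    """
    prior = [1.0] * len(step_scores[0])
    masks = []
    for scores in step_scores:
        mask = sparsemax([p * s for p, s in zip(prior, scores)])
        masks.append(mask)
        prior = [p * (gamma - m) for p, m in zip(prior, mask)]
    return masks


# Hypothetical raw scores for three features, identical at both steps.
masks = sequential_masks([[2.0, 1.0, 0.1], [2.0, 1.0, 0.1]])
for step, mask in enumerate(masks, start=1):
    print(step, [round(m, 3) for m in mask])
```

With identical raw scores at both steps, the first mask concentrates on the first feature, and the prior update then shifts most of the second mask to the second feature. This is the mechanism that lets the per-step masks be read as feature-importance information.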
Beyond pedestrian safety, the project addresses a class of problems common across industrial and safety-critical domains: modeling rare but severe outcomes, learning from imbalanced datasets, integrating heterogeneous predictors, and producing results that remain interpretable for decision-makers. The combination of ordinal modeling, ensemble learning, deep neural networks, and explainable AI is directly applicable to areas such as industrial risk assessment, reliability engineering, healthcare outcome modeling, insurance analytics, and operational safety monitoring. The study demonstrates how advanced models can be embedded within a disciplined analytical workflow, one that balances statistical validity, predictive performance, and transparency, making the underlying methods transferable to any domain where high-impact risks must be quantified, explained, and acted upon before failures occur.
This project investigates pedestrian–vehicle interaction dynamics at unsignalized intersections and midblock crosswalks, focusing on how pedestrians evaluate and accept or reject traffic gaps under real-world conditions. Using detailed video-based observations coupled with field interviews with a subset of the observed population, the project examines how individuals are exposed to unsafe gaps, how waiting time and environmental conditions influence risk-taking behavior, and how pre-crash decision processes unfold before a potential conflict materializes. Rather than analyzing only observed crashes, the project adopts a surrogate safety perspective, treating gap acceptance, rejected gaps, speed adaptation, and rolling-gap behavior as measurable indicators of latent crash risk. The objective is to move beyond descriptive statistics and develop a behavioral framework capable of capturing how safety risk accumulates dynamically during the crossing process, before an actual collision occurs.
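To make the surrogate safety idea concrete, the sketch below turns one pedestrian's sequence of offered gaps into a few simple indicators. The record layout, the function name `crossing_indicators`, the approximation of waiting time as the sum of rejected gaps, and the definition of an unsafe gap as one shorter than the required crossing time are all simplifying assumptions for illustration, not the coding scheme used in the study.

```python
def crossing_indicators(events, crossing_time_s):
    """Summarise one crossing attempt.

    events          : list of (gap_s, is_accepted) tuples in time order
    crossing_time_s : time the pedestrian needs to clear the conflict area
    """
    rejected = [gap for gap, is_accepted in events if not is_accepted]
    accepted = [gap for gap, is_accepted in events if is_accepted]
    accepted_gap = accepted[0] if accepted else None
    # Safety margin: how much longer the accepted gap is than the time needed.
    margin = None if accepted_gap is None else accepted_gap - crossing_time_s
    return {
        "rejected_gaps": len(rejected),
        "waiting_time_s": sum(rejected),  # assumption: wait spans rejected gaps
        "accepted_gap_s": accepted_gap,
        "safety_margin_s": margin,
        "unsafe": margin is not None and margin < 0.0,
    }


# Hypothetical observation: two rejected gaps, then a 5.0 s gap accepted
# on a 7.0 m crossing at an assumed walking speed of 1.2 m/s.
events = [(2.0, False), (3.5, False), (5.0, True)]
summary = crossing_indicators(events, crossing_time_s=7.0 / 1.2)
print(summary)
```

Indicators of this kind exist for every crossing attempt, not only for the rare ones that end in a collision, which is what makes them usable as exposure measures.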
The study integrates statistical analysis, structural equation modeling (SEM), and a hybrid binary mixed logit framework to model pedestrian decision-making at a granular level. A regression component estimates gap size as a function of traffic and geometric variables, while a latent variable, caution behavior, is constructed through SEM using observable indicators such as gender, speed adaptation, and rejected gaps. These components are embedded within a mixed logit model to capture heterogeneity in acceptance decisions, random taste variation, and nonlinear behavioral responses. The modeling framework allows gap size, waiting time, and behavioral traits to interact within a probabilistic utility structure, explicitly reflecting the stochastic and boundedly rational nature of crossing decisions. This layered approach, combining behavioral theory, latent constructs, and advanced discrete choice modeling, demonstrates a strong capability in handling complex human–infrastructure interactions under uncertainty (Transportation Research Board 94th Annual Meeting - 2015).
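The probabilistic utility structure can be illustrated with a minimal simulation sketch, assuming a binary mixed logit in which only the gap-size coefficient is random and normally distributed. The acceptance probability is then the logistic probability averaged over draws of that coefficient. All parameter values and names below (`params`, `simulated_accept_probability`, the caution score) are hypothetical and are not the estimates reported in the study; the caution score stands in for the latent variable that the SEM component would supply.

```python
import math
import random


def logistic(v):
    """Binary logit probability for systematic utility v."""
    return 1.0 / (1.0 + math.exp(-v))


def simulated_accept_probability(gap_s, wait_s, caution, params,
                                 n_draws=2000, seed=0):
    """Average the logit probability over random draws of the gap coefficient."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        # Random taste variation: each draw is one pedestrian's gap sensitivity.
        beta_gap = rng.gauss(params["gap_mean"], params["gap_sd"])
        utility = (
            params["intercept"]
            + beta_gap * gap_s
            + params["wait"] * wait_s
            + params["caution"] * caution
        )
        total += logistic(utility)
    return total / n_draws


# Hypothetical parameter values, for illustration only.
params = {
    "intercept": -4.0,
    "gap_mean": 0.8,   # mean sensitivity to gap size (per second)
    "gap_sd": 0.2,     # heterogeneity in that sensitivity
    "wait": 0.02,      # longer waits raise the acceptance propensity
    "caution": -0.5,   # a higher latent caution score lowers it
}

p_short = simulated_accept_probability(3.0, 10.0, 0.0, params)
p_long = simulated_accept_probability(6.0, 10.0, 0.0, params)
print(round(p_short, 3), round(p_long, 3))
```

Setting `gap_sd` to zero collapses the simulation to an ordinary binary logit, which gives a quick way to see how much of the predicted acceptance behavior is attributable to heterogeneity across pedestrians.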
Beyond pedestrian safety, this project addresses a general class of problems involving human decision-making under dynamic risk exposure. The ability to model rare but high-consequence events through surrogate indicators, construct latent behavioral variables, and estimate probabilistic decision structures under heterogeneity is directly transferable to domains such as industrial safety monitoring, human–machine interaction analysis, operational risk modeling, insurance analytics, and reliability engineering. The framework illustrates how noisy behavioral signals can be structured into interpretable latent constructs and embedded in predictive choice models that balance explanatory insight with forecasting capability. In industrial and economic contexts, similar methodologies can be applied to study operator behavior, equipment failure exposure, user risk perception, or system-level vulnerability, where decisions are made continuously under uncertainty and where rare events carry disproportionate costs. The project therefore reflects a modeling philosophy centered on risk emergence, behavioral heterogeneity, and probabilistic inference, extending naturally to complex safety-critical systems beyond transportation.