| | |
Last updated on June 4, 2026. This conference program is tentative and subject to change.
Technical Program for Thursday July 2, 2026
| |
| ThATR-28 Lecture session, TR-28 |
| TR-28 Special Sessions Aviation HF SVO Regular a - Morning Session |
|
| |
| |
| 10:15-10:25, Paper ThATR-28.1 |
| Adaptive Automation Levels in Air Traffic Control: Balancing Human Workload, Situation Awareness, and Performance (I) |
|
| Ramesh, Mercedes Premalatha | Nanyang Technological University |
| Li, Zhimin | Nanyang Technological University |
| Dhief, Imen | Air Traffic Management Research Institute |
| Hsieh, Meng-Hsueh | Nanyang Technological University |
| Feroskhan, Mir | Nanyang Technological University Singapore |
Keywords: Human Factors, Ergonomics, and Performance in Intelligent Systems, Autonomous Systems: Control, Safety, and Reliability, Digital Twins, Simulation, and Virtual Environments
Abstract: Air traffic control (ATC) must maintain separation and throughput as demand grows; International Civil Aviation Organization (ICAO) forecasts indicate global passenger and cargo traffic will more than double over the next two decades, increasing workload peaks and time-critical ATC configuration and decision authority. Decision-support automation can provide alerts, prioritization, and resolution guidance, yet many deployments rely on fixed levels of automation (LOA). Fixed LOA is often mismatched to changing complexity: low automation may not prevent overload, while high authority can reduce engagement and situation awareness (SA) and increase susceptibility to mode confusion and miscalibrated trust. Here, authority refers to the degree of control delegated to automation, and LOA provides the discrete mechanism through which this authority is allocated between the human operator and the system. The research problem is adaptive LOA selection for ATC configuration and decision authority that balances workload, situation awareness (SA), and operational performance. New methods are needed because (1) support demand must be inferred from non-intrusive indicators, (2) LOA transitions must be stable, transparent, and quickly overridable to limit automation surprise, and (3) validation is often restricted to system-level evidence rather than operational trials. An Adaptive LOA Manager is introduced that assigns ATC configuration and decision-support functions (monitoring, advisory generation, consent-based execution, and bounded automation) to four discrete LOA modes and selects LOA using a sliding-window Support Demand Index combining traffic-demand features such as conflict rate and time-to-conflict and interaction features (acknowledgment latency, intervention rate). 
Transition safety is enforced through hysteresis, minimum dwell time, intent preview, and a single-action veto, and the automation logic is consistent with recent ATM optimization and automated vectoring approaches. Evaluation uses scenario-based system logs and attention/interaction traces relevant to missed-warning risks and SA degradation.
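As a rough illustration of how hysteresis, minimum dwell time, and a single-action veto can stabilise switching between four discrete LOA modes, a minimal sketch follows. The threshold values, the `LoaSelector` class, the `run` helper, and the choice to make the veto drop to the lowest mode are assumptions made for this example; the abstract does not specify them, and the Support Demand Index (SDI) is treated here as an already-computed number in [0, 1].

```python
from dataclasses import dataclass

# Hypothetical SDI thresholds (not from the paper). Each step-up threshold is
# higher than the matching step-down threshold, which creates the hysteresis band.
UP_THRESHOLDS = [0.30, 0.55, 0.80]    # SDI needed to step up from mode 0, 1, 2
DOWN_THRESHOLDS = [0.20, 0.45, 0.70]  # SDI below which mode 1, 2, 3 steps down


@dataclass
class LoaSelector:
    mode: int = 0           # current LOA mode: 0 (lowest) .. 3 (highest)
    min_dwell: int = 3      # minimum number of updates spent in a mode
    steps_in_mode: int = 0  # updates elapsed since the last mode change

    def update(self, sdi: float, veto: bool = False) -> int:
        """Advance one time step and return the selected LOA mode."""
        self.steps_in_mode += 1
        if veto:
            # Assumed veto semantics: the operator forces the lowest mode at once.
            if self.mode != 0:
                self.mode, self.steps_in_mode = 0, 0
            return self.mode
        if self.steps_in_mode < self.min_dwell:
            return self.mode  # dwell time not yet satisfied, hold the mode
        if self.mode < 3 and sdi >= UP_THRESHOLDS[self.mode]:
            self.mode, self.steps_in_mode = self.mode + 1, 0
        elif self.mode > 0 and sdi < DOWN_THRESHOLDS[self.mode - 1]:
            self.mode, self.steps_in_mode = self.mode - 1, 0
        return self.mode


def run(sdi_trace, min_dwell=3):
    """Replay a sequence of SDI values and return the mode chosen at each step."""
    selector = LoaSelector(min_dwell=min_dwell)
    return [selector.update(s) for s in sdi_trace]
```

With a constant SDI of 0.9 and `min_dwell=3`, this selector climbs one mode every three updates rather than jumping straight to the highest mode. An SDI that oscillates between 0.25 and 0.32 steps up once and then holds, because 0.25 stays above the step-down threshold of 0.20.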
|
| |
| 10:25-10:35, Paper ThATR-28.2 |
| Adaptive HMI Design Strategies for Mitigating Pilot Spatial Attention-Loss Driven by Scanning Series Monitoring (I) |
|
| YUAN, Xin | The Hong Kong Polytechnic University |
| Li, Qinbiao | The Hong Kong Polytechnic University |
| YIU, Cho Yin | The Hong Kong Polytechnic University |
Keywords: Human Factors, Ergonomics, and Performance in Intelligent Systems, Human-AI Collaboration and Decision Support, Human-in-loop Systems and Architectures
Abstract: Enhancing pilots' cognitive resilience in dynamic flight environments requires cockpit systems capable of sensing, interpreting, and mitigating cognitive degradation within a closed-loop framework. Loss of Attentional Control (LoAC) remains a critical yet subtle threat to aviation safety, particularly in single-pilot operations, where early-stage attentional drift may precede overt performance errors. Detecting LoAC through deviations from normative visual scanning behaviour therefore represents a key challenge for adaptive human–machine interaction (HMI) design. This study proposes an attention-awareness interaction framework that integrates phase-specific scanning pattern modelling with Large Language Model (LLM)-driven cognitive mitigation. First, phase-dependent standard scanning patterns were modelled using Hidden Markov Models (HMM), capturing structured transitions among task-relevant Areas of Interest (AOIs) across flight phases. Second, a three-fold anomaly detection strategy was developed to identify abnormal scanning behaviour. Finally, the standard scanning patterns were operationalised as a structured knowledge base to develop an LLM-assisted LoAC mitigation agent via few-shot learning. A head-up display (HUD) serves as the interaction layer, providing context-sensitive visual cues that highlight critical AOIs and recommended actions to guide pilots back toward an optimal attentional state. Overall, this work presents a coherent closed-loop cognitive framework that unifies probabilistic attentional modelling, real-time deviation monitoring, and adaptive visual guidance, supporting a transition from reactive error correction to proactive cognitive support in intelligent cockpits.
|
| |
| 10:35-10:45, Paper ThATR-28.3 |
| Methodological Challenges in Human Factors Research for Air Traffic Management (I) |
|
| Lyu, Mengtao | Georgia Institute of Technology |
| Li, Qinbiao | The Hong Kong Polytechnic University |
Keywords: Human Factors, Ergonomics, and Performance in Intelligent Systems, Human-Machine Integration, Human-AI Collaboration and Decision Support
Abstract: Air traffic management (ATM) is shifting from localized, voice-based tactical control toward data-rich, highly automated strategic flow management. This shift, driven by programs such as the Single European Sky ATM Research (SESAR) and the United States' Next Generation Air Transportation System (NextGen), necessitates a corresponding evolution in human factors research methodologies. Many established human factors approaches rely on small-sample simulations and subjective questionnaires, which have proven insufficient for the validation of safety-critical autonomous systems in advancing ATM paradigms. This paper provides an exhaustive analysis of the methodological barriers currently impeding rigorous scientific inquiry in this domain. We examine (i) recruitment and statistical power limitations in expert controller populations; (ii) the shortage of time-aligned, defensible labels for physiological and behavioral data needed for machine learning; (iii) trade-offs between experimental control and ecological validity in simulation-based evaluation; and (iv) end-to-end latency constraints that hinder closed-loop, real-time cognitive state estimation. Furthermore, we propose practical mitigation strategies, including the deployment of generative methods for scenario and data augmentation, the utilization of transfer learning to reduce calibration burden, and the adoption of shadow-mode validation architectures that enable evaluation on live operational data without introducing operational risk. Overall, this work provides a structured methodological roadmap for conducting scalable and defensible human factors research that can support the design, evaluation, and certification of trustworthy human–autonomy teaming in future ATM systems.
|
| |
| 10:45-10:55, Paper ThATR-28.4 |
| Recognising Emotions in Air-Ground Communications with Deep Learning (I) |
|
| YIU, Cho Yin | The Hong Kong Polytechnic University |
| LI, Wen-Chin | Cranfield University |
Keywords: Human Factors, Ergonomics, and Performance in Intelligent Systems, Human-in-loop Systems and Architectures, Human-AI Collaboration and Decision Support
Abstract: Decision-making in air traffic operations often requires a stable emotional state to ensure decision quality. To ensure aeronautical decisions are free from emotional impacts, this research presents a novel dataset and deep learning model to identify the emotions of stakeholders, including pilots and air traffic controllers (ATCOs). We recorded 30 utterances in seven different emotions from 20 participants who are pilots or ATCOs. Features were extracted from the utterance recordings for emotion recognition. A long short-term memory (LSTM) model was constructed to perform emotion recognition. The proposed model yielded a test accuracy of 64.29%. The model demonstrates the potential for identifying the emotions of pilots and ATCOs via their communications.
|
| |
| 10:55-11:05, Paper ThATR-28.5 |
| The Transparency of Insights into Cockpit Evolution for Future Single-Pilot Operation: Evidence from Eye Tracker Patterns (I) |
|
| Li, Qinbiao | The Hong Kong Polytechnic University |
| YIU, Cho Yin | The Hong Kong Polytechnic University |
| YUAN, Xin | The Hong Kong Polytechnic University |
| Lyu, Mengtao | Georgia Institute of Technology |
Keywords: Human Factors, Ergonomics, and Performance in Intelligent Systems, Human-AI Collaboration and Decision Support, Human-Machine Integration
Abstract: While single-pilot operation (SPO) provides substantial operational and economic advantages, its practical implementation relies critically on achieving safety equivalence with conventional dual-pilot operations (DPO). A critical but under-investigated challenge stems from the mismatch between current cockpit interfaces, which were originally designed for crew coordination, and the cognitive requirements imposed on a solo pilot. This study investigates how conventional dual-pilot cockpit interaction patterns may inadvertently compromise pilot performance in SPO, focusing especially on the significant degradation of situation awareness (SA) compared with DPO. Using eye-tracking techniques, we analyzed the visual scanning behaviors of experienced airline captains across multiple flight phases under both SPO and DPO scenarios, and identified distinctive oculomotor signatures associated with SA degradation in single-pilot conditions. The findings emphasize the necessity of rethinking cockpit information architecture for SPO, shifting from simple automation augmentation to a holistic redesign that matches the perceptual and cognitive workflows of solo pilots. The extracted eye-movement indicators can serve as objective metrics for evaluating future SPO-adapted displays and provide actionable guidance for interface optimization. Ultimately, this research establishes a human-centered foundation for cockpit evolution, aiming to build an intuitive operational environment that actively supports the single pilot’s operational and decision-making demands.
|
| |
| 11:05-11:15, Paper ThATR-28.6 |
| Temporal and Spatial Constraints on Peripheral Signal Accessibility During Continuous Task Engagement (I) |
|
| Li, Zhimin | Nanyang Technological University |
| Li, Fan | Department of Aeronautical and Aviation Engineering, the Hong Kong Polytechnic University |
| YAN, YUQI | The University of Newcastle, School of Engineering |
| Dhief, Imen | Air Traffic Management Research Institute |
| Ramesh, Mercedes Premalatha | Nanyang Technological University |
| Feroskhan, Mir | Nanyang Technological University Singapore |
Keywords: Human Factors, Ergonomics, and Performance in Intelligent Systems, Human-in-loop Systems and Architectures
Abstract: In many real-world monitoring and control environments, operators perform continuous primary tasks while transient peripheral signals appear unpredictably. Although such peripheral signals are often visually salient, they are frequently missed. Prior research has shown that attentional demands impair peripheral signal detection, yet it remains unclear whether peripheral signal accessibility decreases gradually with primary task engagement or is disproportionately constrained at specific moments. This study examines the temporal and spatial constraints on peripheral signal accessibility during continuous attention tasks. A dual-task paradigm was employed in which participants performed a continuous primary task while brief peripheral signals appeared unpredictably. Peripheral signal accessibility was defined as the successful detection and report of these signals, and spatial engagement was quantified using eye-tracking measures confined to the primary-task area. Results show that post-signal spatial engagement is significantly associated with peripheral signal detection: detected trials exhibit high and stable spatial lock-in to the primary task, whereas missed trials show reduced and more variable engagement. In contrast, temporal proximity shows less systematic influence. Mixed-effects modeling further supports spatial engagement as a key determinant of detection outcomes and suggests that its influence remains stable across variations in temporal proximity. These findings clarify how peripheral signal accessibility is constrained during continuous task performance and provide a refined understanding of attentional tunneling by highlighting the central role of spatial engagement in shaping detection failures in dynamic monitoring environments.
|
| |
| 11:15-11:25, Paper ThATR-28.7 |
| From Misunderstanding to Alignment: LLM-Supported Training to Enhance Controller–Pilot Collaboration (I) |
|
| Li, Donglin | The Hong Kong Polytechnic University |
| Shi, Bin | Northwest Region Ningxia Air Traffic Management Sub-Bureau, Civil Aviation Administration of China |
| Li, Zhimin | Nanyang Technological University |
| Li, Fan | Department of Aeronautical and Aviation Engineering, the Hong Kong Polytechnic University |
Keywords: Human Factors, Ergonomics, and Performance in Intelligent Systems, Human-AI Collaboration and Decision Support
Abstract: Breakdowns in pilot–controller coordination often arise not from incorrect procedures but from differences in how instructions are interpreted during interaction. Conventional training emphasizes phraseology and procedural correctness, yet offers limited opportunities for operators to observe how their instructions are interpreted by a communication partner. In this study, we present an LLM-supported training platform in which a conversational agent acts as a simulated pilot. It serves as an interactive communication partner that externalizes its interpretation through dialogue. The platform was co-designed with practicing operators and incorporates representative misunderstanding scenarios and structured debriefing to support perspective-taking and reflection. An exploratory user study with air traffic controllers showed high usability and indicated that the interaction helped participants improve their training setting. Overall, this design-oriented insight injects new possibilities into traditional training practices and may serve as a practical approach to addressing communication challenges between pilots and air traffic controllers.
|
| |
| 11:25-11:35, Paper ThATR-28.8 |
| Evaluating the Impact of a Multi-Sector Planner Configuration on Air Traffic Controller Workload and Support Tool Requirements (I) |
|
| Hsieh, Meng-Hsueh | Nanyang Technological University |
| Wang, Chung-Hung John | Nanyang Technological University |
| Li, Zhimin | Nanyang Technological University |
| Ramesh, Mercedes Premalatha | Nanyang Technological University |
| Chen, Chun-Hsien | Nanyang Technological University |
| Feroskhan, Mir | Nanyang Technological University Singapore |
Keywords: Human Factors, Ergonomics, and Performance in Intelligent Systems, Human-in-loop Systems and Architectures, Human-AI Collaboration and Decision Support
Abstract: The objective of this exploratory study is to engage air traffic controllers (ATCOs) in Multi-Sector Planner (MSP) operations using the current system in order to identify MSP support tool requirements. This is in response to dynamic changes and the expected increase in air traffic volume. The purpose of the pretest is to assess the effect of MSP on air traffic controllers’ workload and situation awareness (SA) and to identify supporting tools that reduce workload and enhance SA. Overall, the results indicate that the planner controller (PC) experienced higher workload and lower situation awareness in the MSP condition, in which the PC was required to manage two sectors while coordinating with one tactical controller (TC) per sector. The findings of this exploratory study highlight the need for support tools to facilitate operations under the MSP configuration. The focus group findings serve as a reference for the development of support tools for the MSP position. Suggestions and potential extensions of this study are discussed in the conclusion.
|
| |
| 11:35-11:45, Paper ThATR-28.9 |
| A Quantum-Like Many-Body Wave Function–Based Modeling Approach for Dynamic Human–Machine Function Allocation (I) |
|
| Yu, Rourou | Nanjing University of Aeronautics and Astronautics |
| Sun, Youchao | Nanjing University of Aeronautics and Astronautics |
| Li, Yuhan | Nanjing University of Aeronautics and Astronautics |
| Guo, Chaochao | Nanjing University of Aeronautics and Astronautics |
Keywords: Human-AI Collaboration and Decision Support, Human-Machine Integration, Human-in-loop Systems and Architectures
Abstract: This paper proposes a quantum-like many-body wave function–based modeling approach for dynamic human–machine function allocation. The method is based on a quantum-like many-body wave function state modeling paradigm that maps the human operator, the automated system, and their associated functions into a unified tensor-product space. By introducing a situation modulation operator and an automation-level modulation operator, it describes the continuous evolution of human-machine collaborative function states, thereby constructing a probability-driven dynamic allocation mechanism.
|
| |
| 11:45-11:55, Paper ThATR-28.10 |
| Implementing Gaze Entropy to Evaluate the Design of an eVTOL with SVO Concept (I) |
|
| Zhan, Yuhan | Nanjing University of Aeronautics and Astronautics |
| Li, Yuhan | Nanjing University of Aeronautics and Astronautics |
| Sun, Youchao | Nanjing University of Aeronautics and Astronautics |
| Zhang, Shuguang | Beihang University |
Keywords: Human Factors, Ergonomics, and Performance in Intelligent Systems, Human-Machine Integration, Human-in-loop Systems and Architectures
Abstract: This study evaluates the design of an electric Vertical Take-Off and Landing (eVTOL) aircraft incorporating the Simplified Vehicle Operations (SVO) concept by analyzing operators’ gaze entropy and subjective workload ratings throughout a series of Mission Task Elements (MTEs). Twenty-three professional eVTOL operators participated in the study, completing nine distinct MTEs. Gaze entropy and subjective workload ratings, assessed using the Cooper-Harper Handling Qualities Rating (HQR) scale, were recorded throughout the tasks. The effects of the different MTEs on gaze entropy and subjective workload were analyzed to evaluate the efficacy of the aircraft design. The analysis revealed that gaze entropy increases with higher workload, and gaze patterns shift depending on the specific characteristics of the MTEs. Gaze entropy shows promise as an objective, non-intrusive indicator for evaluating aircraft design. It may serve as a supplementary tool in innovative aircraft design evaluation and training.
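The abstract does not state which gaze entropy formulation was used. As a sketch of the two measures commonly computed from a sequence of fixated Areas of Interest, the example below gives stationary entropy (how evenly gaze is spread across AOIs) and transition entropy (how unpredictable the next AOI is given the current one). The function names and the AOI labels in the usage note are hypothetical.

```python
from collections import Counter
from math import log2


def stationary_entropy(aoi_sequence):
    """Shannon entropy (bits) of the distribution of fixations over AOIs."""
    counts = Counter(aoi_sequence)
    total = len(aoi_sequence)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def transition_entropy(aoi_sequence):
    """Conditional entropy (bits) of the next AOI given the current AOI."""
    if len(aoi_sequence) < 2:
        return 0.0  # no transitions to measure
    pairs = Counter(zip(aoi_sequence, aoi_sequence[1:]))
    sources = Counter(aoi_sequence[:-1])
    total = len(aoi_sequence) - 1
    entropy = 0.0
    for (src, _dst), n in pairs.items():
        p_src = sources[src] / total  # empirical share of transitions leaving src
        p_cond = n / sources[src]     # empirical P(dst | src)
        entropy -= p_src * p_cond * log2(p_cond)
    return entropy
```

For a strictly alternating scan such as `["PFD", "ND", "PFD", "ND"]`, the stationary entropy is 1 bit (two AOIs used equally) while the transition entropy is 0 bits, since each next fixation is fully predictable.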
|
| |
| 11:55-12:05, Paper ThATR-28.11 |
| An MTE-Based Handling Quality Assessment Framework for SVO Fixed-Wing VTOL Aircraft (I) |
|
| Fan, Xinyu | Beihang University |
| Li, Yuhan | Nanjing University of Aeronautics and Astronautics |
| Zhang, Shuguang | Beihang University |
Keywords: Human-in-loop Systems and Architectures, Human Factors, Ergonomics, and Performance in Intelligent Systems, Autonomous Systems: Control, Safety, and Reliability
Abstract: The rapid evolution of Simplified Vehicle Operations (SVO) in fixed-wing Vertical Take-off and Landing (VTOL) aircraft presents significant challenges to traditional handling quality assessment methods, which are primarily designed for rotorcraft or general aviation. Current airworthiness specifications lack specialized Mission Task Elements (MTEs) tailored to the high-speed cruise and transition envelopes of these new configurations. To address this gap, this study proposes a comprehensive handling quality assessment framework based on tailored MTEs and deep learning techniques. An MTE-based evaluation model is developed through a Convolutional Neural Network (CNN) architecture. Multivariate flight time-series data describing both aircraft response and control execution are transformed into Gramian Angular Field (GAF) representations to preserve temporal dependencies and inter-variable coupling. A task identity embedding mechanism is introduced to enable MTE-aware evaluation, while an ordinal regression head is employed to ensure consistency with the ordered nature of the Cooper–Harper Rating (CHR) scale. Simulation experiments demonstrate that the proposed model delivers reliable evaluation results and enhances assessment efficiency by reducing subjective bias and dependency on professional test pilots. The framework exhibits robustness against pilot variability, objectivity in handling quality, and applicability across diverse operational scenarios.
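To make the Gramian Angular Field step concrete, here is a minimal sketch of the summation variant (GASF) for a single channel. The abstract does not say whether the summation or difference variant was used, so that choice is an assumption, and the function name is hypothetical. For multivariate flight data, one such image would typically be produced per channel.

```python
from math import acos, cos


def gramian_angular_summation_field(series):
    """Encode a 1-D time series as a GASF matrix with entries cos(phi_i + phi_j)."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("series must not be constant")
    # Rescale to [-1, 1] so that each value can be read as a cosine.
    scaled = [(2 * x - hi - lo) / (hi - lo) for x in series]
    # Clamp to guard against floating-point values slightly outside [-1, 1].
    angles = [acos(max(-1.0, min(1.0, s))) for s in scaled]
    return [[cos(a + b) for b in angles] for a in angles]
```

The resulting matrix is symmetric, and its row and column order follows the time order of the samples, so temporal structure is kept in the image.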
|
| |
| 12:05-12:15, Paper ThATR-28.12 |
| A Fast Computational Approach for Wing Lift Prediction in Tilt-Propeller Slipstream Toward Enhanced SVO Envelope Protection (I) |
|
| Xiong, Xinyu | Beihang University |
| Sun, Zhaohu | Sichuan Aerofugia Technology Development Co., Ltd |
| Huang, Zijian | Beihang University |
| Song, Lei | Beihang University |
Keywords: Autonomous Systems: Control, Safety, and Reliability, Human-in-loop Systems and Architectures, Digital Twins, Simulation, and Virtual Environments
Abstract: To meet the real-time aerodynamic prediction demands of envelope protection within a Simplified Vehicle Operations architecture for tilt-propeller aircraft, this paper proposes a fast lift analysis approach that integrates an advanced blade element momentum theory, a lifting line wake model, and a non-planar vortex lattice method. With second-level computational latency, the method enables real-time prediction of wing loads under propeller slipstream induction. Validation shows that, in the small angle of attack range, the proposed approach achieves CFD-comparable accuracy while improving computational speed by about four orders of magnitude. To mitigate the drop in prediction accuracy in the nonlinear region, an SVO envelope protection strategy based on the ‘stay-away-from-the-boundary’ principle is further proposed. The method provides physics-layer support for simplified piloting and safety assurance of novel aircraft such as eVTOL.
|
| |
| 12:15-12:25, Paper ThATR-28.13 |
| RunA-Fit Adaptive Feature Refinement and Alignment for CLIP-Based Few-Shot Airfield Runway Anomaly Classification |
|
| cao, yuhang | Chongqing University |
| Yu, Zhongliang | School of Automation, Chongqing University |
| Loja, Rene V. S | Salesian Polytechnic University |
| Su, Xiaojie | Chongqing University |
Keywords: Human Factors, Ergonomics, and Performance in Intelligent Systems, Human-in-loop Systems and Architectures, Human-AI Collaboration and Decision Support
Abstract: The combination of contrastive language image pre-training (CLIP) and few-shot learning techniques has demonstrated impressive classification and retrieval performance on diverse downstream tasks. However, there are many challenges in the classification of airport runway anomalies, such as limited or low-quality data and complex feature selection in dynamic environments, which might diminish the classifier's performance. In this paper, we present a joint solution consisting of three modules: the test image feature enhancement module, which filters local image features based on text characteristics to produce more accurate feature descriptions; the support set projection space module, a Fusion of Old and New Datasets method (RunA-Fit) for the CLIP pre-trained model, which generates additional samples using diffusion models and projects the features of these new samples into the original subspaces to achieve alignment; and the quantification of generated dataset similarity distribution module, which ensures consistency between features of the generated and original data. Moreover, we implement adaptive weight adjustment for each module based on the dataset. Extensive experiments demonstrate that our method significantly enhances classification performance across 11 datasets, yielding an average accuracy improvement of 0.67% over the baseline method APE. The code is available at https://github.com/hanxiao007/RunA-Fit
|
| |
| 12:25-12:35, Paper ThATR-28.14 |
| Evaluating Psychophysiological Variables for Real-Time Activity Recognition in Manned–Unmanned Teaming Missions |
|
| Tschurtschenthaler, Karl | University of the Bundeswehr Munich |
| Schulte, Axel | Bundeswehr University Munich |
Keywords: Human Factors, Ergonomics, and Performance in Intelligent Systems, Human-Machine Integration, Human-AI Collaboration and Decision Support
Abstract: This paper evaluates the suitability of automatically computed psychophysiological variables for recognizing pilot activity. Currently, our activity recognition (AR) model in [4] primarily relies on observations derived from cockpit interactions. However, recognizing activities from these observations is difficult because of the fragmented observational pattern caused by the pilots' rapid shifts in attention. In comparison, observations derived from psychophysiological signals could provide more continuous input. To investigate this possibility, we conducted an experiment with five military fighter pilots in a fast-jet research simulator. We used electrocardiographic and eye tracking data to compute various psychophysiological variables in real time. Then, we analyzed these variables for task-dependent differences to determine if they could generate meaningful observations. Our results showed that several psychophysiological variables are strongly task-dependent. However, these results are highly individualized, as only certain pilots exhibited changes in these variables that could generate task-specific observations. Therefore, prior calibration of the observation generation is crucial when using psychophysiological variables for an AR model. Our AR model aims to provide adaptive assistance based on information about the pilot’s current activity during Manned–Unmanned Teaming missions. Task awareness is essential for these systems to support pilots effectively.
|
| |
| 12:35-12:45, Paper ThATR-28.15 |
| Linking Vehicle Dynamics to User Experience in L2 Automated Urban Turning: A Driving Simulator Study |
|
| Dai, Siyi | Technical University of Munich (TUM) |
| Huemer, Jakob | BMW Group |
| Bengler, Klaus | Chair of Ergonomics, Technical University of Munich |
Keywords: Human Factors, Ergonomics, and Performance in Intelligent Systems, Human-in-loop Systems and Architectures, Autonomous Systems: Control, Safety, and Reliability
Abstract: We present a persistence-aware feature and modelling framework for relating vehicle dynamics to multidimensional user experience (UX) during SAE Level-2 automated traffic-free urban left turns. Using scenario-level descriptors that capture both intensity and temporal persistence, we examine comfort and relaxation, stimulation and fun, perceived safety, and a participant-weighted overall score (1-UX) in a motion-base driving simulator study. At a near-peak primary threshold (κ = 0.7), comfort and relaxation were most strongly associated with yaw-acceleration intensity and were largely driven by additive effects, i.e., mainly single-motion effects, whereas stimulation and fun were associated with sustained absolute lateral acceleration. In contrast, perceived safety and 1-UX were better captured by combined motion effects, indicating that overall and safety judgments integrate coupled patterns of speed and acceleration persistence, yaw dynamics, and lane-margin variability rather than single motion parameters. Expertise moderated selected associations, and strong participant-level clustering highlighted substantial individual differences. These results support using persistence-aware descriptors and selective motion-interaction modelling to evaluate and tune automated turning behaviour, and motivate further studies in on-road settings with other road users.
|
| |
| 12:45-12:55, Paper ThATR-28.16 |
| A Framework on Human–Autonomy Unified Control and Strategies for EVTOL |
|
| Wang, Nan | Beihang University |
| Li, Yuhan | Nanjing University of Aeronautics and Astronautics |
| Zhang, Shuguang | Beihang University |
Keywords: Human-Machine Integration, Autonomous Systems: Control, Safety, and Reliability, Human-AI Collaboration and Decision Support
Abstract: With the rapid development of Advanced Air Mobility (AAM), Urban Air Mobility (UAM), and electric Vertical Takeoff and Landing (eVTOL) technologies, relying solely on pilots or autonomy is insufficient to meet the requirements of safety, robustness, and social acceptability at the current stage. This paper proposes and systematically elaborates on the Human–Autonomy Unified Control (HAUC) framework applicable in UML–3 to 5, emphasizing the tight coupling of humans and autonomy to form a joint cognitive entity. The framework supports the continuity and stability of the entire flight envelope through shared tasks, dynamic authorization, and bidirectional understanding. This paper first distinguishes between the concepts of automation and autonomy, and then reviews several promising strategies for realizing HAUC. Subsequently, the concept of HAUC and its four core paradigms are proposed: Human–Autonomy Mutual Calibration (HAMC), Shared Situation Awareness (SSA), Dynamic Authority Allocation (DAA), and Bidirectional Perceptual Fusion Interface (BPFI). Finally, the three main challenges of achieving HAUC (technology, airworthiness, and public and ethical issues) are discussed, along with future research directions and suggestions. This paper aims to provide a systematic reference and research roadmap for the design, verification, and certification of Human–Autonomy hybrid eVTOL operation.
|
| |
| 12:55-13:15, Paper ThATR-28.17 |
| A Physiological Feature-Based Machine Learning Approach for Identification of Multiple Visual Monitoring Tasks (I) |
|
| Li, Yufei | Nanyang Technological University |
| Lye, Sun Woh | Nanyang Technological University |
| Liu, Bufan | Nanyang Technological University |
Keywords: Human-AI Collaboration and Decision Support, Human-Machine Integration, Human Factors, Ergonomics, and Performance in Intelligent Systems
Abstract: This paper explores the importance of visual physiological features in task identification by using the integration and application of Artificial Intelligence (AI) in monitoring and assisting human decision-making processes. Our methodology focuses on understanding human decision-making tasks through data capture of real-time visual monitoring. We integrated AI in task-based decision-making and proposed a 3-Phase methodology which improves Human-Machine Interaction. We achieve this by applying the decision tree classification algorithm to identify and classify visual patterns among air traffic controllers (ATCOs) engaged in 3 common operational monitoring tasks. Feature importance was derived with the algorithm applied. The proposed classifier model was able to achieve 90% matching with the actual tasks executed by ATCOs in the validation set. Besides model establishment, additional synthetic data were generated using the Synthetic Minority Over-sampling Technique (SMOTE) to address class imbalance in the training data. The classifier with SMOTE training data was also able to achieve 90% task matching accuracy. Lastly, an investigation was also conducted to evaluate three parameter set configurations: parameter data sets based on Whole Frame only (W), Aircraft of Interest (AOI) only, and both W and AOI. When both parameter data sets were considered together, the task matching accuracy was found to be significantly higher than when considering only one parameter data set at a time.
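For readers unfamiliar with SMOTE, the core idea is to create each synthetic minority-class sample by interpolating between an existing minority sample and one of its nearest minority-class neighbours. The sketch below is a simplified pure-Python illustration of that idea, not the authors' pipeline: the function name, the default `k`, and the sample points in the usage note are assumptions, and it expects at least two minority samples.

```python
import random


def smote(minority, n_synthetic, k=2, seed=0):
    """Generate synthetic points by interpolating between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest other minority samples by squared Euclidean distance.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic
```

For example, `smote([(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], 5)` returns five points that each lie on a line segment between two of the three inputs. Synthetic samples should be added only to the training split, never to the validation data.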
|
| |
| ThATR-29 Lecture session, TR-29 |
Add to My Program |
| TR-29 Special Sessions Hybrid Intel HM Coop Regular B - Morning Session |
|
| |
| |
| 10:15-10:25, Paper ThATR-29.1 | Add to My Program |
| AI-Enhanced Tactical Congestion Management Framework for Human-In-The-Loop Decision Support (I) |
|
| YANG, Huijuan | Ecole Nationale De l'Aviation Civile |
| Delahaye, Daniel | Ecole Nationale De l'Aviation Civile |
| Ma, Chunyao | Nanyang Technological University |
| Alam, Sameer | Nanyang Technological University |
Keywords: Human-AI Collaboration and Decision Support
Abstract: Tactical air traffic management requires balancing sector workloads, operational feasibility, and delay minimisation under highly dynamic traffic conditions. This paper presents an AI-enhanced optimisation framework for large-scale airspace congestion management designed to support human-in-the-loop decision making in European air traffic control environments. The proposed approach combines trajectory-informed sector-list option generation with a mixed-integer optimisation formulation and a Selective Simulated Annealing (SSA) search algorithm. Spatiotemporal interactions between flights and sectors are encoded through precomputed sector-entry indicators, enabling scalable optimisation without requiring full 4D trajectory re-planning. The optimisation process employs localised modification operators and overload-guided neighbourhood search to efficiently explore congestion mitigation strategies while preserving interpretability of the resulting tactical adjustments. A large-scale case study based on European traffic data involving 1,406 flights demonstrates that the framework converges within approximately four seconds. Comparative experiments show that the integrated optimisation eliminates sector-entry overloads primarily through sector-list adjustments (80.7% of flights), while ground delays are applied selectively (6.12% of flights). These results indicate that trajectory-informed optimisation can provide scalable tactical congestion mitigation while remaining compatible with controller-centred operational workflows.
|
| |
| 10:25-10:35, Paper ThATR-29.2 | Add to My Program |
| A Linguistics-Guided Hybrid Intelligence Framework for Conversational Decision Support Toward Smarter Air Traffic Management (I) |
|
| Pal Thamburaj, Kingston | Nanyang Technological University |
| Ramesh, Mercedes Premalatha | Nanyang Technological University |
| Dhief, Imen | Air Traffic Management Research Institute |
| Feroskhan, Mir | Nanyang Technological University Singapore |
Keywords: Human-in-loop Systems and Architectures, Human Factors, Ergonomics, and Performance in Intelligent Systems, Digital Twins, Simulation, and Virtual Environments
Abstract: Controller-pilot radiotelephony remains a primary coordination channel in air traffic management, yet small linguistic deviations and readback/hearback breakdowns continue to contribute to incidents. Rising traffic density and diverse speaker accents increase frequency congestion, raising the cost of rare but high-impact misinterpretations in complex sectors. State-of-the-art conversational systems built on automatic speech recognition, spoken language understanding, and large language models can transcribe and paraphrase reliably in general domains, but operational adoption in air traffic control is constrained by scarce operational data, strict phraseology requirements, and opaque failure modes under uncertainty. An interpretable decision-support objective is therefore required: convert transmissions into structured meaning, validate completeness and internal consistency, and surface ambiguity for human resolution rather than autonomous action. A linguistics-guided hybrid intelligence framework is described that combines phraseology-driven normalization, semantic frame parsing (intent, callsign, and parameter slots), rule-based constraint checking, and uncertainty-aware escalation. Decision-support outputs include structured clearance summaries, flagged inconsistencies, and compact rationales aligned with ICAO conventions. System-level evaluation uses a scenario-based corpus of 27,000 ATC utterances and 6,000 clearance-readback pairs with perturbations modeling omission, reordering, unit ambiguity, and ASR-like digit confusions. Relative to a parsing-only baseline, the full configuration improves slot-level F1 from 0.925 to 0.945 and increases mismatch-detection F1 from 0.747 to 0.913, while maintaining a 0.101 over-escalation rate on meaning-preserving messages. Constraint ablation confirms the contribution of unit validation to ambiguity capture and escalation behavior; residual error analysis highlights callsign confusability and partial readbacks as dominant stressors.
|
| |
| 10:35-10:45, Paper ThATR-29.3 | Add to My Program |
| Bridging the Experience Gap: Multimodal Agent-Driven AI for Knowledge Transfer in Air Traffic Management (I) |
|
| Yang, Tiance | The Hong Kong Polytechnic University |
| Zhang, Zhensheng | AVIC Leihua Electronic Technology Research Institute |
| Lee, Ching-Hung | Xi’an Jiaotong University |
| Li, Fan | Department of Aeronautical and Aviation Engineering, the Hong Kong Polytechnic University |
Keywords: Human-AI Collaboration and Decision Support, Multi-Agent Architectures, Human-in-loop Systems and Architectures
Abstract: Experience-based knowledge is essential for air traffic management (ATM), yet its implicit and multimodal nature makes effective modeling and transfer difficult. We propose EXGRAIL, an experience-aware graph inference framework that represents operational experience using a dynamic, multimodal knowledge graph. EXGRAIL abstracts textual expert knowledge into reusable decision subgraphs and leverages graph-based reasoning to support cross-task knowledge transfer and expert-level explanation. A case study in realistic ATM scenarios shows that EXGRAIL produces more accurate and operationally consistent solutions than GraphRAG-based and tool-augmented LLMs, demonstrating the potential of experience-driven graph inference for intelligent ATM decision support.
|
| |
| 10:45-10:55, Paper ThATR-29.4 | Add to My Program |
| System-Level Performance Analysis of AI-Assisted and Conventional Coordination in Air Traffic Management (I) |
|
| Ramesh, Mercedes Premalatha | Nanyang Technological University |
| Dhief, Imen | Air Traffic Management Research Institute |
| Li, Zhimin | Nanyang Technological University |
| Bai, Lu | Nanyang Technological University |
| Feroskhan, Mir | Nanyang Technological University Singapore |
Keywords: Human-AI Collaboration and Decision Support, Digital Twins, Simulation, and Virtual Environments, Human-in-loop Systems and Architectures
Abstract: Inter-unit coordination is a persistent bottleneck in air traffic management (ATM) because it links adjacent sectors and flight information regions through time-critical negotiation and transfer-of-control steps. Coordination demand is unlikely to ease: global passenger demand reached a record high in 2024 and long-term forecasts project continued growth, while the European network reported 10.7 million flights and 22.7 million minutes of en-route ATFM delay in 2024. Standardized ground-ground exchanges (e.g., AIDC and OLDI) and procedure sets (e.g., PANS-ATM) support structured notifications and coordination; in parallel, digital twins are emerging as a practical way to generate operational evidence without disrupting live operations, and AI decision support has been explored for ATM-related optimization and monitoring tasks. Yet, a system-level benchmark that quantifies the time-reliability trade-off between conventional and AI-assisted coordination across heterogeneous coordination patterns remains missing. Three challenges drive the need for a new method: (i) coordination episodes must be defined consistently despite differing local rules and message variants, (ii) reliability must be captured using system-observable indicators rather than controller-in-the-loop experiments, and (iii) repeatable evaluation must stress exception handling while preserving confidentiality. A digital-twin-based evaluation framework is therefore formulated. It defines coordination episodes from log events and compares a conventional baseline against three AI-assisted coordination modes (rule-based routine auto-coordination, predictive early-trigger recommendations, and optimization-based proposal generation). 
It reports completion time distributions and reliability indicators such as successful closure and escalation rate, while interaction effort is estimated from coordination turns, following prior measurement of digital input performance. Experimental results show that under stress demand, AI-assisted Modes B and C reduce tail completion time by approximately 18-20% relative to the conventional baseline and improve on-time closure from 76.8% to above 87%, while lowering escalation rates from 10.1% to below 7%.
|
| |
| 10:55-11:05, Paper ThATR-29.5 | Add to My Program |
| Evaluation of GenLLM-Based Air Traffic Complexity Assessment (I) |
|
| Göppel, Simon | University of the Bundeswehr Munich |
| Li, Max | University of Michigan |
| Schultz, Michael | University of the Bundeswehr Munich |
Keywords: Human-AI Collaboration and Decision Support
Abstract: Air traffic complexity is a key driver of controller workload and fundamentally constrains air traffic growth. While air traffic complexity metrics provide quantitative indicators of workload, they are typically calibrated for specific sectors or traffic scenarios, limiting their generalizability. This underscores the need for flexible, context-sensitive support for situation assessment. Accordingly, this study investigates whether GenLLMs can assess the complexity of air traffic situations. A survey among air traffic controllers is conducted to establish a human ground truth. These ratings serve as a benchmark for a systematic evaluation of several GenLLMs. The models are assessed using progressively structured prompting strategies, ranging from zero-shot prompting to multi-role reasoning and in-prompt learning. The results show that several models achieve an average deviation of less than one rating level from the controller benchmark. Assessment performance is strongly model-dependent, with larger models exhibiting closer agreement with human judgments. The effect of prompting strategies is not universal and is primarily observed for suitable models in the present application. Overall, the findings demonstrate the feasibility of GenLLM-based air traffic complexity assessment and highlight its potential for situation assessment support and future co-controller concepts.
|
| |
| 11:05-11:15, Paper ThATR-29.6 | Add to My Program |
| Retrieval-Augmented Large Language Models for Evidence-Based Hazard Log Generation in Emerging Aviation Systems (I) |
|
| Schultz, Michael | University of the Bundeswehr Munich |
| Göppel, Simon | University of the Bundeswehr Munich |
Keywords: Autonomous Systems: Control, Safety, and Reliability, Human-AI Collaboration and Decision Support, Human Factors, Ergonomics, and Performance in Intelligent Systems
Abstract: We introduce a retrieval-augmented synthesis pipeline for deriving structured hazard logs for emerging aviation concepts from historical aviation accident evidence. NTSB accident reports are transformed into a schema-consistent corpus combining coded findings and narrative mechanisms for semantic indexing. Mechanism-level retrieval uses sentence-transformer embeddings, a FAISS inner-product index, evidence-derived seed extraction, and maximal marginal relevance to obtain diversified, scenario-relevant cases. Hazard generation is constrained by strict JSON schema validation, one-to-one evidence binding, explicit causal sequencing, and enforced primary-mechanism uniqueness. A multipass strategy with critic-based filtering and deterministic de-duplication improves robustness against mechanism repetition and evidence drift. Evaluation of an urban eVTOL safety-landing scenario compares locally deployed open-weight models under identical constraints. Retrieval augmentation supports mechanism-specific and traceable hazard derivation compared to unconstrained scenario-based prompting. Mistral-7B requires multipass generation to achieve acceptable mechanism diversity and evidence consistency, whereas GPT-OSS-20b produces structurally valid and mechanism-differentiated hazard sets in a single pass. Scaling to GPT-OSS-120b yields only marginal improvements at substantially higher computational cost.
|
| |
| 11:15-11:25, Paper ThATR-29.7 | Add to My Program |
| Human-AI Collaboration for UAS Traffic Management: A Review of Decision Support, Trust, and Assurance (I) |
|
| Pauldurai, Arun Prasanth | Renault Nissan Technology and Business Center India Pvt Ltd |
| Ramesh, Mercedes Premalatha | Nanyang Technological University |
| Pal Thamburaj, Kingston | Nanyang Technological University |
Keywords: Autonomous Systems: Control, Safety, and Reliability, Human Factors, Ergonomics, and Performance in Intelligent Systems, Policy, Standards, and Societal Impact of
Abstract: Uncrewed aircraft systems (UAS) are rapidly transitioning from isolated missions to dense, heterogeneous operations in low-altitude airspace, increasingly interacting with conventional air traffic management (ATM), U-space frameworks, and emerging advanced air mobility (AAM) concepts. This transformation is driving the integration of artificial intelligence (AI) to support key functions such as demand prediction, intent-based deconfliction, trajectory planning, contingency management, and operator decision support. However, the introduction of learning-enabled automation in safety-critical environments raises challenges related to human roles, accountability, transparency, and operational trust. This paper presents a comprehensive review of human–AI collaboration in UAS Traffic Management (UTM) and its interaction with adjacent air traffic control contexts. The review adopts a socio-technical perspective, examining how AI-enabled decision support can scale to high-density operations without degrading situation awareness, increasing mode confusion, or undermining calibrated trust. It synthesizes four dimensions: (i) operational architectures and service roles in UTM frameworks, (ii) AI methods for conflict detection, trajectory prediction, and multi-agent coordination, (iii) human factors including workload, attention management, and mixed-initiative interaction, and (iv) assurance mechanisms such as explainability, uncertainty communication, monitoring, auditability, and governance. The review also highlights the growing role of communication-centric and multimodal interfaces, including speech-based and hybrid systems, designed to reduce cognitive load while preserving traceability and standard phraseology. A key contribution is the integration of technical, human, and governance perspectives into a unified analytical framework that identifies design tensions and research gaps in current UTM systems. 
The paper concludes with a focused research agenda on scalable simulation and digital twins, metrics for calibrated human–AI trust, human-centered certification pathways, and training approaches to support effective supervision of adaptive and autonomous systems in future airspace environments.
|
| |
| 11:25-11:35, Paper ThATR-29.8 | Add to My Program |
| A Hybrid Learning Control Framework for Safety-Constrained Multi-Agent Aerial Pursuit in Airport Airspace (I) |
|
| Deng, Yaosheng | Nanyang Technological University |
| Bai, Lu | Nanyang Technological University |
| Gao, Junjie | Nanyang Technological University |
| Xiao, Jiaping | Nanyang Technological University |
| Feroskhan, Mir | Nanyang Technological University Singapore |
Keywords: Autonomous Systems: Control, Safety, and Reliability, Human-Machine Integration, Multi-Agent Architectures
Abstract: Bird strikes remain a significant risk to aviation safety and airport operations, motivating the development of autonomous aerial wildlife deterrence systems. Although reinforcement learning (RL) enables adaptive and rapid decision-making for target pursuit, its direct deployment in safety-critical airspace is limited by the absence of explicit safety guarantees. In airport environments, autonomous pursuers must simultaneously maintain sensing continuity with agile biological targets while avoiding collisions with other aerial agents, aircraft, and static infrastructure. These coupled safety requirements cannot be reliably enforced through purely reward-based learning. This paper presents a hybrid learning control framework for safety-constrained multi-agent pursuit in airport airspace. A trained RL policy generates nominal pursuit commands, while a Control Barrier Function (CBF)-based safety layer enforces collision avoidance and sensing maintenance through a quadratic programming formulation. A supervisory switching mechanism activates safety correction only when nominal actions violate CBF conditions, preserving adaptive pursuit behavior whenever feasible. By separating performance-oriented learning from constraint enforcement, the proposed hybrid learning-control framework ensures forward invariance of the defined safety sets without restricting nominal operation unnecessarily. Simulation results demonstrate robust constraint satisfaction and stable pursuit performance under target maneuvers and disturbances, outperforming RL-only approaches and supporting the safe integration of autonomous deterrence systems into airport environments.
|
| |
| 11:35-11:45, Paper ThATR-29.9 | Add to My Program |
| City-Wide Low-Altitude Urban Air Mobility: A Scalable Global Path Planning Approach Via Risk-Aware Multi-Scale Cell Decomposition (I) |
|
| Rivera, Josue N | Nanyang Technological University |
| Sun, Dengfeng | Assistant Professor, Purdue University |
| Lv, Chen | Nanyang Technological University |
Keywords: Autonomous Systems: Control, Safety, and Reliability, Applications in Healthcare, Smart Infrastructure, and Environments, Digital Twins, Simulation, and Virtual Environments
Abstract: The realization of Urban Air Mobility (UAM) necessitates scalable global path planning algorithms capable of ensuring safe navigation within complex urban environments. This paper proposes a multi-scale risk-aware cell decomposition method that efficiently partitions city-scale airspace into variable-granularity sectors, assigning each cell an analytically estimated risk value based on obstacle proximity and expected risk. Unlike uniform grid approaches or sampling-based methods, our approach dynamically balances resolution with computational speed by bounding cell risk via Mahalanobis distance projections, eliminating exhaustive field sampling. Comparative experiments against classical A*, Artificial Potential Fields (APF), and Informed RRT* across five diverse urban topologies demonstrate that our method generates safer paths with lower cumulative risk while reducing computation time by orders of magnitude. The proposed framework, Larp Path Planner, is open-sourced and supports any map provider via its modified GeoJSON internal representation, with experiments conducted using OpenStreetMap data to facilitate reproducible research in city-wide aerial navigation.
|
| |
| 11:45-11:55, Paper ThATR-29.10 | Add to My Program |
| Beyond Transportation: Autonomous Vehicles As a Human-City Interface for Future Mobility (I) |
|
| Chen, Keqi | Nanyang Technological University |
| Lv, Chen | Nanyang Technological University |
Keywords: Human Factors, Ergonomics, and Performance in Intelligent Systems, Policy, Standards, and Societal Impact of, Trust, Transparency, and Ethical Governance in Human-Machine Systems
Abstract: Urban mobility is undergoing a profound transformation beyond mere efficiency and infrastructure. As autonomous vehicle technology advances, its most significant impact will be not on transportation logistics, but on how it reshapes relationships between people and cities. Yet urban planning, transportation engineering, and smart city initiatives typically frame autonomous mobility as a technical challenge, overlooking how it transforms human experiences. This Perspective introduces a new human-machine interaction framework, the Human-Mobile-Urban Interface (HMUI), to examine how emerging autonomous mobility technologies redefine urban life. By reconceptualizing mobility as an experiential interface rather than infrastructure alone, this approach opens new pathways for urban research, design, and policy, guiding cities toward more human-centered and equitable futures.
|
| |
| 11:55-12:05, Paper ThATR-29.11 | Add to My Program |
| Safe Driving for Human–Machine Shared Control: A Prediction-Informed Risk-Aware SMPC (I) |
|
| Zhou, Yangyang | The Hong Kong Polytechnic University |
| GUO, Jingrui | The Hong Kong Polytechnic University |
| Hu, Dong | The Hong Kong Polytechnic University |
| Pai, Zheng | The Hong Kong Polytechnic University |
| Huang, Chao | Adelaide University |
Keywords: Human-in-loop Systems and Architectures, Human-Machine Integration, Human Factors, Ergonomics, and Performance in Intelligent Systems
Abstract: With the rapid development of intelligent driving technology, human-machine shared control (HMSC) has emerged as a practical paradigm to improve driving safety and user acceptability. In HMSC, a shared controller coordinates driver intent and automation assistance to generate a compatible control command for vehicle execution. However, in dynamic traffic, effective shared control relies on anticipating surrounding vehicle behaviors, and overlooking their motion uncertainty may result in underestimated collision risk and insufficient safety margins during safety-critical interactions. This paper presents an innovative framework for HMSC that integrates LSTM-based trajectory prediction with an explicit uncertainty representation, enabling chance-based safety constraints in the controller optimization. Extensive driver-in-the-loop (DiL) experiments were conducted, and a large volume of data was collected for quantitative analysis. Finally, statistical results over multiple evaluation metrics demonstrate the effectiveness of the proposed method.
|
| |
| 12:05-12:15, Paper ThATR-29.12 | Add to My Program |
| LIT-Bench: A Multi-Level Evaluation Benchmark for Vision-Language Models in Intelligent Transportation Systems (I) |
|
| Zhang, Geyuan | Tongji University |
| Fang, Shiyu | Tongji University |
| Cui, Yiming | Tongji University |
| Zhang, Jiarui | Tongji University |
| Zheng, Liyong | Tongji University |
| Hang, Peng | Tongji University |
Keywords: Human-AI Collaboration and Decision Support, Human-Machine Integration, Autonomous Systems: Control, Safety, and Reliability
Abstract: Intelligent Transportation Systems (ITS) play a crucial role in enhancing road safety, traffic efficiency, and urban sustainability. However, existing studies largely focus on isolated tasks or scenario-specific applications. With the strong perception and reasoning capabilities of large models, an increasing number of works have begun exploring the integration of Vision-Language Models (VLMs) into ITS. Nevertheless, a unified benchmark and evaluation framework for systematically assessing VLM performance across diverse core ITS tasks is still lacking. To address this gap, this paper proposes the Language Models for Intelligent Transportation Benchmark (LIT-Bench), designed to comprehensively evaluate whether current VLMs are capable of handling complex ITS tasks. First, we construct a roadside visual dataset comprising real-world routine traffic scenes and simulated accident scenarios, covering both typical operational conditions and high-risk situations. Second, a multi-level annotation framework is developed to support four representative ITS tasks: environmental perception, event identification, risk recognition, and decision support. Finally, we establish a comprehensive evaluation framework that integrates single-task performance with a weighted aggregation strategy combining the Analytic Hierarchy Process (AHP) and the Entropy Weight Method (EWM), enabling unified capability quantification and model ranking. Experimental results reveal that although state-of-the-art VLMs achieve strong environmental perception (up to 0.77), their capability degrades significantly in event identification (0.16–0.67) and decision support (average 0.54), revealing clear hierarchical gaps. 
Autonomous driving-specialized models demonstrate superior risk recognition compared to general VLMs, yet exhibit systematically lower event identification performance—a pattern attributed to vehicle-centric training data emphasis and catastrophic forgetting during domain fine-tuning. Among specialized models, performance varies substantially (0.468–0.592), indicating that effective domain adaptation requires targeted architectural innovation rather than general fine-tuning alone.
|
| |
| 12:15-12:25, Paper ThATR-29.13 | Add to My Program |
| Vision-Conditioned Structured Trajectory Planning Via Flow Matching and Route-First Decoding (I) |
|
| Cai, Zhuohang | University of Electronic Science and Technology of China |
| He, Xiangkun | University of Electronic Science and Technology of China |
Keywords: Trust, Transparency, and Ethical Governance in Human-Machine Systems, Digital Twins, Simulation, and Virtual Environments, Autonomous Systems: Control, Safety, and Reliability
Abstract: End-to-end autonomous driving typically predicts future motion either by direct waypoint regression or by autoregressive token generation. Direct regression is simple, but it often struggles to preserve coherent long-horizon structure; autoregressive decoders are more expressive, but they incur sequential cost and are prone to drift. We propose a vision-conditioned structured trajectory planner based on Flow Matching over a structured latent plan. Conditioning is constructed from visual tokens, two target-point route tokens, and one ego-speed token. Instead of regressing final waypoints directly, the model predicts 30 latent plan tokens, split into a 10-token speed branch and a 20-token route branch. During inference, the route branch is decoded first into route points, after which the speed branch is converted into cumulative travel distance and sampled along the predicted route. Experiments on 7,861 held-out samples from a vision-only closed-loop driving benchmark compare an original autoregressive baseline with two variants of the proposed planner: DINOv3 + DiT and LLaVA + DiT. Under this evaluation protocol, the DINOv3 variant attains lower route ADE/FDE than the 377.0M-parameter baseline with only 65.9M parameters, highlighting substantial parameter savings for lighter deployment, while the LLaVA variant achieves the strongest overall route metrics together with the lowest longitudinal speed-control error among the three evaluated models. These results suggest that the structured planning head accounts for a substantial part of the improvement in this setting, while stronger visual features provide further gains once the planner is fixed.
|
| |
| 12:25-12:35, Paper ThATR-29.14 | Add to My Program |
| Distributed Swarm Deployment in Fourier Coordinates Via Riesz Energy Shaping |
|
| Zhang, Xiaozhen | The Hong Kong Polytechnic University |
| Huang, Chao | Adelaide University |
Keywords: Multi-Agent Architectures, Autonomous Systems: Control, Safety, and Reliability
Abstract: This paper studies distributed deployment of robot swarms. The objective is to drive all robots into a prescribed target region while achieving an even distribution. One key challenge is the difficulty in describing complex and irregular target regions. To address this issue, we propose a Fourier-based polar coordinate system that facilitates the description of target regions. In particular, under the proposed coordinate system, the target region can be concisely defined by a sup-level set. Furthermore, a convex optimization is proposed to establish such a Fourier-based polar coordinate system from a set of interest points. We enforce even distribution by defining a graph-weighted Riesz energy, and we prove that the swarm trajectories converge to the resulting graph-weighted Riesz energy points on the target region. To conclude, a scenario comprising a fleet of 50 UAVs is investigated, thereby substantiating the viability of the introduced distributed strategy.
|
| |
| 12:35-12:45, Paper ThATR-29.15 | Add to My Program |
| CBF-QP Based Collision-Constrained Shared Control for Multi Agent Robot Swarms |
|
| Ramesh, Mercedes Premalatha | Nanyang Technological University |
| Mohankumar, Karthikeyan | Kalasalingam University |
| Pal Thamburaj, Kingston | Nanyang Technological University |
Keywords: Autonomous Systems: Control, Safety, and Reliability, Digital Twins, Simulation, and Virtual Environments, Multi-Agent Architectures
Abstract: Multi-agent robot swarms offer scalable and flexible solutions for applications such as search and rescue, surveillance, and cooperative navigation; however, as swarm density increases, ensuring collision-free operation in shared and cluttered environments remains a fundamental challenge. In practical deployments, human supervision is often required to guide swarm behavior, yet human-specified commands can be imprecise, delayed, or unsafe, leading to frequent collisions and degraded performance. Existing swarm collision avoidance approaches predominantly rely on heuristic methods such as artificial potential fields and velocity obstacle-based techniques, which lack formal safety guarantees and often exhibit oscillatory or deadlock behavior in dense scenarios. Optimization-based safety frameworks, including Control Barrier Functions (CBFs), provide strong theoretical guarantees but are typically designed for fully autonomous control and do not directly accommodate human-in-the-loop operation. This work addresses the problem of integrating continuous human supervisory control with formally enforced collision constraints in multi-agent robot swarms. A collision-constrained shared control framework based on Control Barrier Function-based Quadratic Programming (CBF-QP) is proposed, in which human supervisory velocity commands and nominal swarm cohesion objectives are treated as desired inputs and filtered through a convex optimization layer. The resulting CBF-QP computes the closest admissible control action that satisfies inter-agent and robot-obstacle separation constraints while respecting actuator limits, thereby minimizing deviation from human intent. The approach addresses three key challenges: enforcing safety under potentially unsafe human commands, preserving command fidelity through minimal intervention, and maintaining computational scalability with increasing swarm size. 
The framework is validated entirely in simulation using single-integrator swarm dynamics, with additional experiments demonstrating applicability to differential-drive robot models. Extensive Monte Carlo evaluations across randomized initial conditions, dense multi-robot interactions.
|
| |
| 12:45-12:55, Paper ThATR-29.16 | Add to My Program |
| ART: Adaptive Relational Transformer for Pedestrian Trajectory Prediction with Temporal-Aware Relations |
|
| Li, Ruochen | Durham University |
| Chang, Ziyi | Durham University |
| Hu, Junyan | Durham University |
| Li, Jiannan | Singapore Management University |
| Atapour-Abarghouei, Amir | Durham University |
| Shum, Hubert P. H. | Durham University |
Keywords: Multi-Agent Architectures, Autonomous Systems: Control, Safety, and Reliability, Human-AI Collaboration and Decision Support
Abstract: Accurate prediction of real-world pedestrian trajectories is crucial for a wide range of robot-related applications. Recent approaches typically adopt graph-based or transformer-based frameworks to model interactions. Despite their effectiveness, these methods either introduce unnecessary computational overhead or struggle to represent the diverse and time-varying characteristics of human interactions. In this work, we present an Adaptive Relational Transformer (ART), which introduces a Temporal-Aware Relation Graph (TARG) to explicitly capture the evolution of pairwise interactions and an Adaptive Interaction Pruning (AIP) mechanism to reduce redundant computations efficiently. Extensive evaluations on ETH/UCY and NBA benchmarks show that ART delivers state-of-the-art accuracy with high computational efficiency.
|
| |
| ThBTR-28 Lecture session, TR-28 |
Add to My Program |
| TR-28 Special Sessions GenAI Privacy Regular C - Afternoon Session |
|
| |
| |
| 14:00-14:10, Paper ThBTR-28.1 | Add to My Program |
| Multimodal Voice Distress Detection for Real-Time Safety Monitoring in Schools (I) |
|
| Ritvik, Chawla | Delhi Public School, Faridabad |
| Kumar, Chirag | The LNM Institute of Information Technology |
| Malik, Anu | Gautam Buddha University |
Keywords: Human-Robot Interaction and Collaborative Robotics, Human-AI Collaboration and Decision Support, Human-Machine Integration
Abstract: Ensuring the safety of students and staff requires monitoring methods that remain discreet and protect individual dignity while still being effective. Audio-based techniques offer potential, but they often struggle with robustness, generalization, and interpretability. To address these issues, we present a multimodal voice distress detection system for real-time safety monitoring. The system integrates acoustic features extracted using TorchAudio, speech transcripts generated through Whisper, and schema-guided inference from a fine-tuned DialoGPT-medium model. Their outputs are organized to report distress level, confidence, the reasoning behind the decision, and the recommended action, supporting interpretability in safety-critical scenarios. Tests on four benchmark datasets (RAVDESS, CREMA-D, TESS, and IEMOCAP) show strong performance, with 90.65% accuracy and 98.72% recall for distress cases. When compared to a logistic regression baseline that uses only acoustic features, the multimodal fusion approach offers a clear improvement in sensitivity. These findings demonstrate the promise of the proposed framework as a privacy-preserving solution for real-time institutional distress detection.
|
| |
| 14:10-14:20, Paper ThBTR-28.2 | Add to My Program |
| Speech Emotion Recognition with Dual-Stream Channel Attention (I) |
|
| Mittal, Kanan | Indira Gandhi Delhi Technical University for Women |
| Agarwal, Divyanshi | Indira Gandhi Delhi Technical University for Women |
| Goel, Nidhi | IGDTUW |
Keywords: Human-Robot Interaction and Collaborative Robotics, Human-Machine Integration, Autonomous Systems: Control, Safety, and Reliability
Abstract: Speech emotion recognition (SER) in real-world environments remains challenging due to noise, diverse recording conditions, and the complexity of capturing expressive vocal cues from raw speech. This work proposes a robust SER framework that focuses on learning stable low-level and mid-level acoustic representations using MFCC features and enhancing them through complementary global statistical cues. A multi-head Dual-Stream Pooling Channel Attention mechanism enables the model to emphasize emotionally informative patterns while suppressing distortions by denoising and voice activity detection (VAD). Experiments on the IEMOCAP benchmark demonstrate that our method achieves competitive performance on the 8-class emotion classification task while remaining computationally efficient and deployment-friendly. Ablation studies further validate the contribution of each architectural component and emphasize the importance of reliability-aware processing in practical SER systems. In the comparison experiments, our method achieves an overall accuracy of 71.76% and UAR of 72.72%.
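Illustrative sketch (not taken from the paper): the abstract reports both overall accuracy and UAR (unweighted average recall). The two differ when classes are imbalanced, because UAR averages per-class recall with equal weight per class. The helper name `accuracy_and_uar` is hypothetical.

```python
from collections import Counter

def accuracy_and_uar(y_true, y_pred):
    """Return (overall accuracy, unweighted average recall) for label lists."""
    total = Counter(y_true)                                   # samples per class
    correct = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    accuracy = sum(correct.values()) / len(y_true)
    # Recall is computed per class, then averaged without class weighting.
    uar = sum(correct[c] / total[c] for c in total) / len(total)
    return accuracy, uar
```

A classifier that always predicts the majority class can score high accuracy but a low UAR, which is why SER work commonly reports both.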
|
| |
| 14:20-14:30, Paper ThBTR-28.3 | Add to My Program |
| Geo-NWM: A Theoretical Framework for Geometry-Consistent World Models in 6-DoF Humanoid Navigation (I) |
|
| Batra, Rupesh | Cluix |
| Chawla, Ritvik | Delhi Public School, Faridabad |
| Fortino, Giancarlo | University of Calabria |
Keywords: Autonomous Systems: Control, Safety, and Reliability, Digital Twins, Simulation, and Virtual Environments, Human-Machine Integration
Abstract: Generative world models facilitate robotic planning through future prediction, but existing navigation models (NWM) lack 3D geometric grounding, failing in complex 6-DoF humanoid environments. We present Geo-NWM (Geometric-Navigation World Model), a framework integrating conditional diffusion transformers with sparse visual SLAM priors to enforce geometric consistency. A consistency loss formalized on the SE(3) manifold penalizes physics-violating pixel motions, stabilizing diffusion gradients and mitigating trajectory hallucinations. The dual-stream architecture fuses a SLAM geometric anchor with a generative diffusion dreamer via a Geometric Gate layer. We introduce the Dream-to-Reality ATE (D-ATE) metric to quantify physical fidelity. Validation on the Unitree G1 in MuJoCo demonstrates an 80% reduction in trajectory drift, providing a robust foundation for safe human-robot interaction in shared environments.
|
| |
| 14:30-14:40, Paper ThBTR-28.4 | Add to My Program |
| Toward a Humanoid Conductor: Beat-Pattern Gesture Synthesis and Evaluation on the Unitree (I) |
|
| Khan, Akhtar | DIMES Department, University of Calabria, Italy |
| Chawla, Rashmi | JCBoseUST/Univ. of Calabria |
| Longo, Raffaele | Luigi Cherubini State Conservatory of Music, Florence, Italy |
| Francesco, Pupo | University of Calabria |
| Fortino, Giancarlo | University of Calabria |
Keywords: Human-Robot Interaction and Collaborative Robotics, Human-Machine Integration, Autonomous Systems: Control, Safety, and Reliability
Abstract: Humanoid robots are starting to emerge in the musical domain; however, most of the demonstrations focus on instrumental performance or on dance, instead of the central communicative role of the conductor. This work investigates the Unitree G1 humanoid as an accessible and compact robotic platform capable of executing canonical conducting beat patterns. This paper targets three standard conducting gestures (4-beat, 3-beat and 2-beat patterns, henceforth Gesture 4, Gesture 3, and Gesture 2) and generates upper-body motions by mapping textbook conducting diagrams to feasible trajectories within the kinematic workspace of the G1. Using the 23 Degrees of Freedom (DoF) morphology and position-controlled joints of the G1, this work generates smooth, tempo-synchronized joint-space trajectories and executes them in real time on the physical robot. Motion evaluation is performed through spatial trajectory analysis and kinematic assessment of temporal consistency. The results demonstrate stable ictus timing and structured reproduction of canonical beat geometry, providing preliminary evidence that the Unitree G1 is a promising platform for research in embodied musical communication and human–robot interaction.
|
| |
| 14:40-14:50, Paper ThBTR-28.5 | Add to My Program |
| Automated Diagnostic Evaluation of Vision-Guided Waste Sorting Via LLMs (I) |
|
| SONKER, MANISH | NTPC LTD |
| Gupta, Shyam Manohar | NTPC Limited |
| Kulshreshtha, Amit Kumar | NTPC Limited |
| Kumar, Manish | Zakir Husain Delhi College |
Keywords: Applications in Healthcare, Smart Infrastructure, and Environments, Multi-Agent Architectures, Human-Robot Interaction and Collaborative Robotics
Abstract: Robotic waste segregation systems achieve high accuracy in controlled settings but lack mechanisms to explain failures during industrial deployment. Current evaluation frameworks provide classification metrics without diagnostic insights, forcing engineers to manually analyze logs. This work proposes an LLM-based framework for automated, explainable evaluation of vision-guided robotic waste sorting systems. The framework processes structured logs from perception and manipulation modules to generate multi-criteria assessments, identify failure patterns, and provide natural-language recommendations. We validate the approach across six test scenarios including contamination, occlusion, and high clutter conditions. Results show strong correlation between LLM evaluations and human expert assessments (r = 0.85, p < 0.001) with 0.67-point mean absolute error. The LLM identifies systematic failure patterns matching expert analysis and generates actionable recommendations. Closed-loop experiments demonstrate 6.3% detection accuracy improvement and 5.8% manipulation success improvement after implementing LLM suggestions. These results constitute preliminary evidence of the framework’s utility for iterative system refinement, representing a promising first step toward LLM-assisted evaluation in unstructured waste management environments. Index Terms: E-Waste, LLM Models, YOLOv8, Robotic segregation
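Illustrative sketch (not taken from the paper): agreement between two sets of scores, such as LLM evaluations and expert assessments, is typically summarized with Pearson's r and the mean absolute error. The function names are hypothetical, and the paper's scoring scale and data are not reproduced here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def mean_abs_error(x, y):
    """Average absolute difference between paired scores."""
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)
```

The two measures are complementary: r captures whether the rankings agree, while the mean absolute error captures how far apart the scores are on the rating scale.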
|
| |
| 14:50-15:00, Paper ThBTR-28.6 | Add to My Program |
| Contactless Palmprint Identification with Programmable Multispectral Imaging and Foundation Models on the Edge (I) |
|
| Seyedmohammdi, Seyed Jamal | Singapore Institute of Technology |
| Lim, Vi Shean | Singapore Institute of Technology |
| Foo, Shi Quan Marcus | Singapore Institute of Technology |
| Ng, Pai Chet | Singapore Institute of Technology |
| Plataniotis, Konstantinos N. | University of Toronto |
Keywords: Applications in Healthcare, Smart Infrastructure, and Environments
Abstract: Palmprint biometrics offers strong discriminative power but existing contactless systems remain limited by unstable illumination, handcrafted features, and reliance on cloud computation. We introduce Palm-Auth, the first full-stack, multispectral, foundation-model-based palmprint authentication platform, implemented entirely on low-cost embedded hardware. Palm-Auth integrates (i) a programmable multispectral illumination module that captures seven spectrum modalities using RGB and IR camera modules; (ii) a lightweight ROI extractor and foundation model-based identification pipeline, deployed fully on an NVIDIA Jetson TX2 NX; and (iii) an edge-oriented control and user interface on a Raspberry Pi. Comprehensive experiments on the CASIA-MS dataset and our collected data show that Palm-Auth achieves 99.1%, 99.5%, and 88.58% rank-1 identification accuracy on the CASIA-MS, MPD-V2, and Palm-Auth datasets, respectively. The full pipeline delivers 2–3-second identification. Palm-Auth demonstrates that combining programmable multispectral imaging with efficient foundation-model inference enables practical, privacy-preserving, and deployable contactless palmprint authentication. The source code for Palm-Auth: https://github.com/jamal94sm/Palm-Auth
|
| |
| 15:00-15:10, Paper ThBTR-28.7 | Add to My Program |
| Efficient and Interpretable Tabular Learning Via Visual Transformation (I) |
|
| Hajjari, Amin | Yazd University |
| Seyedmohammdi, Seyed Jamal | Singapore Institute of Technology |
| Sadeghi, Mohammad Taghi | Yazd University |
| Abouei, Jamshid | Yazd University |
| Ng, Pai Chet | Singapore Institute of Technology |
| Mohammadi, Arash | Concordia University |
Keywords: Trust, Transparency, and Ethical Governance in Human-Machine Systems
Abstract: Tabular data remains prevalent in high-stakes applications, but Deep Learning (DL) models often underperform due to limited efficiency and interpretability. We introduce Tab2Vis, a computationally efficient tabular-to-image framework for classification that converts samples into class-discriminative 28 × 28 grayscale images, enabling Convolutional Neural Networks (CNNs) to leverage visual inductive biases. To address multicollinearity, we introduce a dataset-level Variance Inflation Factor (VIF) initialization strategy that eliminates repeated computations. We incorporate an integrated interpretability mechanism providing consistent explanations across both domains. Experiments on 67 OpenML-CC18 datasets show Tab2Vis achieves competitive performance with state-of-the-art methods while offering notable gains in efficiency and interpretability.
|
| |
| 15:10-15:20, Paper ThBTR-28.8 | Add to My Program |
| Governance-Aligned Chatbot Operationalizing the Fourth Edition of the Occupational Therapy Practice Framework (OTPF-4) (I) |
|
| Lee, Ke Yi | Singapore Institute of Technology |
| LIM, SI HUI | Singapore Institute of Technology (SIT) |
| Lim, Jinghao | Singapore Institute of Technology |
| Yeh, I-ling | Singapore Institute of Technology |
| Atmosukarto, Indriyati | Singapore Institute of Technology |
| Soh, Donny | Singapore Institute of Technology |
Keywords: Trust, Transparency, and Ethical Governance in Human-Machine Systems, Multi-Agent Architectures, Human-AI Collaboration and Decision Support
Abstract: The introduction of AI systems into human-in-the-loop, safety-critical settings, such as healthcare education, makes effective governance of human–AI collaboration crucial. Professional frameworks that guide human practice, such as the Fourth Edition of the Occupational Therapy Practice Framework: Domain and Process (OTPF-4), are designed to be narrative, interpretive, and value-driven. Although these qualities improve professional judgment, they create challenges for AI systems in achieving transparency, auditability, and alignment with ethical and institutional standards. This paper presents a progressive and exploratory implementation study aimed at translating OTPF-4 into a machine-interpretable representation for use in an AI-powered clinical education system. In this context, “machine-interpretable representation” does not denote a fully formal symbolic or ontological model; rather, it refers to a structured and operational encoding of OTPF-4 concepts through staged prompting, controlled workflows, and graph-based orchestration. Instead of replacing clinical reasoning, the method organizes OTPF-4 concepts into a structured knowledge representation that supports constrained reasoning, interpretability, and human oversight. Using a real-world example of a clinical reasoning chatbot, the team illustrates and preliminarily evaluates how governance requirements such as accountability, traceability, role delineation, and ethical safeguards can be operationalized within the system. Evaluation includes pilot user feedback and response-level analysis, highlighting both observed benefits and current limitations in alignment and transparency. The team argues that making professional frameworks computationally explicit is an important but often overlooked step toward enabling trustworthy human–AI collaboration, supporting responsible AI use while maintaining professional authority and intent.
This is a preliminary study towards the formalization of measurable and auditable framework-aligned AI behavior.
|
| |
| 15:20-15:30, Paper ThBTR-28.9 | Add to My Program |
| Adaptive Energy-Based Robot Control for Physical Human–Robot Interaction: A Less Conservative Approach |
|
| Dang, Van Trong | Nara Institute of Science and Technology |
| Kotake, Hiroki | Nara Institute of Science and Technology |
| Honji, Sumitaka | Nara Institute of Science and Technology |
| Wada, Takahiro | Nara Institute of Science and Technology |
Keywords: Human-Robot Interaction and Collaborative Robotics, Human-Machine Integration, Human-in-loop Systems and Architectures
Abstract: Passivity-based control theory has emerged as a promising framework for physical human–robot interaction by explicitly enforcing the energetically passive relation. However, maintaining passivity at all times may overly limit the task execution capability of the robot, potentially increasing human physical workload. Furthermore, interaction safety can be violated in conventional passivity-based control approaches because they ignore stored energy level and energy rate constraints. In this paper, an adaptive energy-based robot control is proposed for physical human–robot interaction by extending passivity-based control and integrating additional safety constraints. Specifically, this method alleviates the inherent conservatism of conventional passivity-based control by ensuring passivity in the closed-loop system only when the system’s energy exceeds a predefined threshold, while allowing more flexible behaviors otherwise. Additionally, adaptive control parameter laws, stored energy level saturation, and energy rate constraints are integrated into the control method to enhance task performance and safe interaction. A numerical simulation and a human-in-the-loop co-carrying experiment are conducted to validate the feasibility and effectiveness of the proposed approach.
|
| |
| 15:30-15:40, Paper ThBTR-28.10 | Add to My Program |
| Delay-Compensated Stiffness Estimation for Robot-Mediated Dyadic Interaction |
|
| Du, Mingtian | Nanyang Technological University |
| Raghavendra Kulkarni, Suhas | Nanyang Technological University |
| Noronha, Bernardo | Articares Pte. Ltd |
| Campolo, Domenico | Nanyang Technological University |
Keywords: Human-Robot Interaction and Collaborative Robotics, Human-in-loop Systems and Architectures, Applications in Healthcare, Smart Infrastructure, and Environments
Abstract: Robot-mediated human-human (dyadic) interactions enable therapists to provide physical therapy remotely, yet an accurate perception of patient stiffness remains challenging due to network-induced haptic delays. Conventional stiffness estimation methods, which neglect delay, suffer from temporal misalignment between force and position signals, leading to significant estimation errors as delays increase. To address this, we propose a robust, delay-compensated stiffness estimation framework by deriving an algebraic estimator based on quasi-static equilibrium that temporally aligns the expert's input with the novice's response. A Normalised Weighted Least Squares (NWLS) implementation is then introduced to robustly filter dynamic bias resulting from the algebraic derivation. Experiments using commercial rehabilitation robots (H-MAN) as the platform demonstrate that the proposed method significantly outperforms the standard estimator, maintaining consistent tracking accuracy under multiple introduced delays. These findings offer a promising solution for achieving high-fidelity haptic perception in remote dyadic interaction, potentially facilitating reliable stiffness assessment in therapeutic settings across networks.
|
| |
| 15:40-15:50, Paper ThBTR-28.11 | Add to My Program |
| Adaptive Observer-Based Control for Reduced-Sensor Dual-Parallel PMSMs with Stator Resistance Variation |
|
| Wang, Ziyao | Tongji University |
| Liu, Tianyi | Tongji University |
Keywords: Human-Machine Integration, Human-Robot Interaction and Collaborative Robotics
Abstract: Reduced Sensor Dual Parallel-PMSM (RSDPMSM) systems offer advantages in cost and hardware simplicity, but their control performance is sensitive to time-varying motor parameters, especially stator resistance variations in reduced-sensor configurations. This paper proposes an adaptive observer-based control strategy with online stator resistance identification for SIDP-PMSM systems without hardware modification. An error dynamic model is derived to show that the observer reconstruction error depends solely on the stator resistance of the master motor, achieving effective decoupling from the slave motor parameters. Based on this result, a recursive least squares (RLS) algorithm with a forgetting factor is employed to identify the master motor stator resistance in real time and adaptively update the observer and controller. Simulation results under feedback engagement, speed variation, and load disturbance conditions demonstrate improved current reconstruction accuracy, speed synchronization, and disturbance rejection. The proposed method enhances robustness and control accuracy for low-cost multi-motor drive applications with time-varying parameters.
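Illustrative sketch (not taken from the paper): a recursive least squares (RLS) update with a forgetting factor, shown for a single scalar parameter. The regression model y = theta * phi (for example, a resistive voltage drop equal to resistance times current) is an assumption for this example and stands in for the paper's error dynamic model, which is not given in the abstract. The name `rls_step` and the default `lam=0.98` are hypothetical.

```python
def rls_step(theta, P, phi, y, lam=0.98):
    """One scalar RLS update with forgetting factor lam (0 < lam <= 1).

    theta: current parameter estimate
    P:     current estimate covariance (large value = low confidence)
    phi:   regressor sample, y: measured output, assumed y ~= theta * phi
    """
    gain = P * phi / (lam + phi * P * phi)
    theta = theta + gain * (y - phi * theta)   # correct by the prediction error
    P = (P - gain * phi * P) / lam             # dividing by lam discounts old data
    return theta, P
```

A forgetting factor below 1 keeps the covariance from collapsing to zero, so the estimate can keep tracking a parameter that drifts over time, such as a resistance that changes with temperature.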
|
| |
| ThBTR-29 Lecture session, TR-29 |
Add to My Program |
| TR-29 Special Sessions Linguistic AI Regular D - Afternoon Session |
|
| |
| |
| 14:00-14:10, Paper ThBTR-29.1 | Add to My Program |
| AI-Supported Linguistic Reasoning: A Human-AI Decision Support Framework for Multilingual Meaning and Ambiguity (I) |
|
| Pal Thamburaj, Kingston | Nanyang Technological University |
| Ramesh, Mercedes Premalatha | Nanyang Technological University |
Keywords: Human-AI Collaboration and Decision Support, Human Factors, Ergonomics, and Performance in Intelligent Systems, Trust, Transparency, and Ethical Governance in Human-Machine Systems
Abstract: Multilingual meaning interpretation remains challenging because lexical polysemy, syntactic underspecification, and pragmatic inference interact with code-mixing and culturally grounded expressions. State-of-the-art multilingual pretrained models (e.g., mBERT, XLM-R, mT5) and large language models (LLMs) improve cross-lingual processing, yet many systems still commit to a single output and provide limited decision support when multiple readings are plausible. In multilingual decision-making settings, ambiguity should be made visible: alternatives should be ranked, uncertainty should be quantified, and unstable reasoning should be flagged. An AI-supported linguistic reasoning framework is introduced that combines lightweight cue extraction, multilingual sentence embeddings for retrieval and similarity, and LLM-based candidate generation and reranking. Outputs include ranked meaning hypotheses, a normalized-entropy ambiguity score, and concise evidence-linked explanations; self-consistency sampling and meaning-preserving perturbation tests provide stability indicators that trigger review flags. Evaluation uses a FLORES-101 benchmark subset covering English, Mandarin Chinese, Malay, and Tamil and three computational tasks: candidate interpretation ranking with hard distractors, semantic similarity robustness under perturbations, and cross-lingual meaning alignment. Across tasks, stability-aware scoring improves top-1 accuracy by 4-9 points over embedding-only baselines and increases top-hypothesis perturbation stability from 0.72 to 0.86, while preserving the richer explanatory traces associated with LLM reasoning. Error analysis highlights recurring failures involving idioms, discourse-dependent inference, and culturally grounded references. The resulting framework supports interpretable human-AI collaboration for multilingual linguistic analysis and decision support.
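Illustrative sketch (not taken from the paper): a normalized-entropy ambiguity score over ranked meaning hypotheses can be computed by dividing the Shannon entropy of the hypothesis probabilities by its maximum, log(n). How the paper obtains the probabilities is not stated in the abstract; the function name `normalized_entropy` is hypothetical.

```python
import math

def normalized_entropy(probs):
    """Shannon entropy of a probability list, scaled to the range [0, 1].

    0 means one hypothesis carries all the mass (no ambiguity);
    1 means all hypotheses are equally likely (maximal ambiguity).
    """
    n = len(probs)
    if n < 2:
        return 0.0                      # a single hypothesis is never ambiguous
    entropy = -sum(p * math.log(p) for p in probs if p > 0.0)
    return entropy / math.log(n)
```

Normalizing by log(n) makes scores comparable across inputs with different numbers of candidate readings.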
|
| |
| 14:10-14:20, Paper ThBTR-29.2 | Add to My Program |
| Joyful Learning with Digital Escape Rooms: A Human-Centered Framework for Teacher Education and AI-Enabled Extension (I) |
|
| BASKARAN, GANGA | NTU, NIE |
| Ramakrishnan, Umayalambigai | Nanyang Technological University |
| Pal Thamburaj, Kingston | Nanyang Technological University |
Keywords: Human Factors, Ergonomics, and Performance in Intelligent Systems, Human-AI Collaboration and Decision Support, Policy, Standards, and Societal Impact of
Abstract: Digital escape rooms are structured, game-based learning activities that combine narrative goals, collaborative problem solving, and rapid feedback. In language classrooms, these features can increase participation, sustain attention, and create repeated opportunities for authentic reading, vocabulary use, and form-focused practice. This paper presents a design and deployment approach for Tamil language teacher education using Genially to build interactive escape room experiences. The learning environment is treated as a human-machine system in which teachers, learners, and the platform coordinate decisions about pacing, hinting, and task completion. A human-centered framework is introduced to guide puzzle design, facilitation, and evaluation, with attention to cognitive load, transparency of rules, accessibility of multimedia, and responsible data practices. Pilot classroom observations in a teacher education setting indicate high engagement, peer explanation, and increased confidence among student teachers in authoring interactive materials. Design patterns are documented for vocabulary locks, short reading missions, grammar checkpoints, and culturally grounded prompts that maintain linguistic accuracy while supporting teamwork and clear assessment criteria during implementation. To support systematic refinement, an evaluation plan is proposed that combines observation protocols, short learning checks, learner perception measures, and interaction traces that inform reflective lesson improvement. A staged pathway for AI-enabled extension is outlined, including analytics-supported redesign, teacher-reviewed automated hints, and optional multilingual scaffolding for low-resourced language contexts. Governance considerations are summarized for safety, accountability, privacy, and bias mitigation when analytics or AI features are introduced. 
The approach offers a practical model for designing interactive learning technologies that preserve teacher oversight while enabling evidence-informed iteration.
|
| |
| 14:20-14:30, Paper ThBTR-29.3 | Add to My Program |
| AI Powered Interactive Role Play Simulator for Authentic Spoken Language Practice (I) |
|
| Ramakrishnan, Umayalambigai | Nanyang Technological University |
| BASKARAN, GANGA | NTU, NIE |
| Pal Thamburaj, Kingston | Nanyang Technological University |
Keywords: Human-AI Collaboration and Decision Support, Human-Machine Integration, Human-Robot Interaction and Collaborative Robotics
Abstract: Authentic spoken interaction remains a persistent challenge in Tamil language education within multilingual societies where English frequently dominates everyday communication. Although curricular frameworks emphasise communicative competence, many learners experience limited opportunities for spontaneous oral practice, elevated speaking anxiety, and delayed or inconsistent feedback during classroom activities. This paper presents the design and pilot implementation of an AI powered Tamil interactive role play simulator that provides voice to voice engagement in everyday communicative scenarios such as markets, transport hubs, libraries, clinics, and cultural events. The system integrates automatic speech recognition for Tamil, a scenario constrained dialogue engine, and text to speech technologies to enable responsive conversational interaction with non player characters. Personalised formative feedback is generated after each interaction, including pronunciation cues, fluency indicators, vocabulary suggestions, and task completion guidance. A teacher facing analytics dashboard summarises learner progress through interpretable metrics to support targeted instructional decisions without increasing assessment workload. A design based research methodology guides iterative refinement through classroom deployment, learner surveys, speaking assessments, and system analytics. Pilot findings indicate improved learner confidence, increased willingness to attempt spontaneous responses, and measurable gains in oral fluency for frequently practised communicative functions. The simulator demonstrates the potential of human centred AI systems to support transparent feedback, scalable speaking practice, and data informed pedagogy in mother tongue language education.
|
| |
| 14:30-14:40, Paper ThBTR-29.4 | Add to My Program |
| Discriminant Insights: Applying Linear Discriminant Analysis to Tamil Vowel Usage for Social Awareness and Emotional Lyrics in Education (I) |
|
| B, Sharmila | SRM Institute of Science and Technology |
| Gnana Prakasam Louis Raja, Saviour Prakash | SRM Institute of Science and Technology |
| Ganesan, Ramesh | Central University of Tamil Nadu |
| Natarajan, Kalaiyarasi | SRM Institute of Science and Technology |
| Vijayan, Nandhakumar | SRM Institute of Science and Technology |
Keywords: Human-Machine Integration, Human-AI Collaboration and Decision Support, Human-in-loop Systems and Architectures
Abstract: Tamil lyrics occupy a central role in language and literature classrooms, where they are used to explore phonology, stylistic variation, and the interaction between linguistic form and social meaning. This paper presents an exploratory baseline tool rather than a production classifier: it investigates whether a compact and statistically transparent model can support pedagogical analysis by summarizing vowel usage patterns across two lyric categories (Social Awareness and Emotional Lyrics) frequently discussed in teaching and cultural discourse. Each of 54 authors is represented by raw counts of the twelve Tamil vowels, Uyir Ezhuthukal. Linear Discriminant Analysis (LDA) is applied with equal class priors to derive a single canonical discriminant function. The model yields an eigenvalue of 0.411 and a canonical correlation of 0.540, indicating moderate separation. Wilks’ lambda is 0.709 (p = 0.198), confirming non-significant class separation and limiting strong inferential claims. Resubstitution accuracy is 74.1% and leave-one-out cross-validation accuracy is 55.6%, corroborating the baseline nature of the approach. Despite modest predictive performance, interpretable LDA coefficients, centroids, and diagnostic statistics constitute structured decision-support artifacts that augment, rather than replace, instructor judgment in classroom discussion, example selection, and hypothesis formation. Raw counts are used in this baseline to preserve stylistic intensity and enable classroom reproducibility; the pedagogical implications of this choice and directions toward normalized features are explicitly discussed. A teacher-oriented analytical workflow, a concrete classroom deployment example, and governance considerations for responsible educational deployment are outlined, drawing on principles from human-centered design and trustworthy AI.
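Illustrative sketch (not taken from the paper): with a single discriminant function, the canonical correlation and Wilks' lambda both follow from the eigenvalue through standard identities, r = sqrt(λ / (1 + λ)) and Λ = 1 / (1 + λ). The helper names are hypothetical; the check uses only the eigenvalue reported in the abstract.

```python
import math

def canonical_correlation(eigenvalue):
    """Canonical correlation for one discriminant function."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))

def wilks_lambda(eigenvalue):
    """Wilks' lambda when there is a single discriminant function."""
    return 1.0 / (1.0 + eigenvalue)

# Eigenvalue reported in the abstract.
print(round(canonical_correlation(0.411), 3))  # 0.54
print(round(wilks_lambda(0.411), 3))           # 0.709
```

Both values agree with the reported canonical correlation of 0.540 and Wilks' lambda of 0.709, so the three statistics are mutually consistent.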
|
| |
| 14:40-14:50, Paper ThBTR-29.5 | Add to My Program |
| Statistical Modeling of Tamil Vowel Usage for Human-AI Collaborative Language Education: A Comparative Study of LDA and Regularized Logistic Regression (I) |
|
| Vijayan, Nandhakumar | SRM Institute of Science and Technology |
| Gnana Prakasam Louis Raja, Saviour Prakash | SRM Institute of Science and Technology |
| Ganesan, Ramesh | Central University of Tamil Nadu |
| G L, Christina Martha | Jamal Mohamed College |
| Natarajan, Kalaiyarasi | SRM Institute of Science and Technology |
Keywords: Human-Machine Integration, Human-AI Collaboration and Decision Support, Human-in-loop Systems and Architectures
Abstract: Tamil is a classical, morphologically rich language whose vowel system (Uyir Ezhuthukal) is tightly linked to rhythm, prosody, and meaning in literary and lyrical writing. This paper examines whether simple, observable phonological statistics can support transparent decision support for language education, helping teachers and learners explore stylistic patterns across lyric genres and authors as an exploratory analytic aid rather than an automated classifier. A dataset of 54 Tamil lyric authors is organized into two semantic categories (Social Awareness and Emotional). For each author, raw counts of 12 vowels form 12-dimensional feature vectors, and two interpretable classifiers are compared: Linear Discriminant Analysis (LDA) and regularized logistic regression. LDA produces one discriminant function (eigenvalue = 0.411, canonical correlation = 0.54) and achieves 74.1% training accuracy and 55.6% leave-one-out cross-validated (LOOCV) accuracy. Regularized logistic regression attains comparable LOOCV accuracy (55.6%). The non-significant Wilks’ lambda (p = 0.199) and a total-vowel-count baseline (LOOCV = 55.6%) confirm that the modest signal reflects both phonological pattern and length confound, motivating length-aware reporting. Interpretable vowel loadings, notably ii, a, aa, and i as positive discriminators, constitute the primary contribution, enabling educators to inspect and discuss phonological patterns rather than relying on opaque predictions. This study frames phonological counts as auditable analytics complementing qualitative teaching in low-resource language settings.
|
| |
| 14:50-15:00, Paper ThBTR-29.6 | Add to My Program |
| Quantitative Analysis of Vallinam and Idaiyinam Patterns in Sangam-Era Tamil Texts for Interpretable Language Analytics (I) |
|
| Ganesan, Ramesh | Central University of Tamil Nadu |
| Gnana Prakasam Louis Raja, Saviour Prakash | SRM Institute of Science and Technology |
| Natarajan, Kalaiyarasi | SRM Institute of Science and Technology |
| G L, Christina Martha | Jamal Mohamed College |
| Vijayan, Nandhakumar | SRM Institute of Science and Technology |
Keywords: Human-Machine Integration, Human-AI Collaboration and Decision Support, Human-in-loop Systems and Architectures
Abstract: Classical Tamil consonant classes, Vallinam (stops) and Idaiyinam (approximants and laterals), carry significant phonetic and stylistic functions across literary registers and are central to prosody instruction in Tamil language education. This paper addresses a gap in quantitative Tamil phonology: the absence of reproducible, factor-decomposed analyses that can directly inform teacher-facing text selection and learner assessment. We present a deterministic consonant-counting procedure applied to a corpus of 60 Sangam-era poems (873 raw tokens, 645 analyzed after filtering) attributed to six Pulavarkal, followed by a one-way ANOVA on per-author normalized rates that tests whether consonant usage differences across poets are statistically reliable. Results confirm significant author effects for both Vallinam (F(5,107)=38.2, p<0.001, η²=0.64) and Idaiyinam (F(5,107)=41.5, p<0.001, η²=0.66). We then demonstrate two concrete classroom applications: a text-selection rubric for prosody instruction and an assessment item-alignment checklist, both grounded in the empirical factor profiles. Limitations, including corpus size, romanization sensitivity, and reproducibility requirements, are addressed with specific mitigation protocols. The work contributes an interpretable, auditable analytics pipeline suited to human-in-the-loop educational tools for low-resource classical languages.
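The effect sizes reported in this abstract can be cross-checked from the F statistics alone. For a one-way ANOVA, eta-squared equals SS_between / SS_total, which can be rewritten as (F × df_between) / (F × df_between + df_within). The short sketch below applies that identity to the reported values; the function name is illustrative and not taken from the paper.

```python
def eta_squared_from_f(f_value: float, df_between: int, df_within: int) -> float:
    """Recover eta-squared from a one-way ANOVA F statistic.

    Uses eta^2 = SS_between / SS_total
               = (F * df_between) / (F * df_between + df_within).
    """
    numerator = f_value * df_between
    return numerator / (numerator + df_within)


# F statistics and degrees of freedom as reported in the abstract.
vallinam = eta_squared_from_f(38.2, 5, 107)
idaiyinam = eta_squared_from_f(41.5, 5, 107)
print(f"Vallinam  eta^2 = {vallinam:.2f}")   # 0.64
print(f"Idaiyinam eta^2 = {idaiyinam:.2f}")  # 0.66
```

Both values agree with the effect sizes stated alongside the F tests.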
|
| |
| 15:00-15:10, Paper ThBTR-29.7 | Add to My Program |
| Action-Conditioned Prompting for Air Traffic Control: A Smart Prompt Producer That Converts Controller HMI Actions into Verified, Explainable LLM Outputs (I) |
|
| Ramesh, Mercedes Premalatha | Nanyang Technological University |
| Pal Thamburaj, Kingston | Nanyang Technological University |
| Dhief, Imen | Air Traffic Management Research Institute |
| Feroskhan, Mir | Nanyang Technological University Singapore |
Keywords: Human-in-loop Systems and Architectures, Human-AI Collaboration and Decision Support, Multi-Agent Architectures
Abstract: Air traffic control (ATC) is a safety-critical socio-technical system in which radiotelephony is both an operational instrument and a failure pathway. Readback/hearback deviations, call sign confusion, and non-standard phraseology persist under workload and local conventions. Large language models (LLMs) are increasingly explored for language support, yet chat-style prompting and free-form generation increase interaction burden, reduce traceability, and hinder verification. The objective is to convert routine controller interface activity into grounded, checkable, explainable language outputs for low-resource and cross-cultural settings. Action-conditioned prompting compiles controller human-machine interface (HMI) actions (track selection, label edits, clearance entry, coordination and handover) into a structured intent state that drives a smart prompt producer. Prompt plans route requests through a constrained multi-agent pipeline and retrieval grounding over ICAO phraseology plus locally permitted SOP artifacts. Novelty arises from treating action logs as a low-resource interaction language and elevating verification to an interface contract: JSON schemas, deterministic validators for numeric integrity and call-sign consistency, rule-based readback requirements, and disagreement-based uncertainty gating that degrades outputs to clarification or silence. A prototype verification layer was exercised through simulator-style log replay and a perturbation harness spanning 2,000 procedure-consistent scenarios. All injected structural and numeric violations were rejected with zero false positives on unperturbed cases; required-slot mismatches injected into 10% of scenarios produced zero unsafe suggestions, with 78% rejected and 22% degraded to clarification.
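The idea of deterministic validators that reject or degrade an LLM suggestion can be sketched roughly as follows. This is an illustration under stated assumptions, not the prototype described in the abstract: the slot names, the simplified call-sign pattern, and the rule that an incomplete intent state degrades to clarification are all hypothetical.

```python
import re

# Assumed, simplified call-sign shape: 3-letter designator, 1-4 digits, optional letters.
CALLSIGN_RE = re.compile(r"^[A-Z]{3}[0-9]{1,4}[A-Z]{0,2}$")
REQUIRED_SLOTS = ("callsign", "instruction", "value")  # hypothetical slot names


def check_suggestion(intent: dict, suggestion: dict) -> str:
    """Return 'accept', 'clarify', or 'reject' for a candidate language output."""
    # Structural check: every required slot must be present in the suggestion.
    if any(slot not in suggestion for slot in REQUIRED_SLOTS):
        return "reject"
    # Call-sign consistency: well-formed and identical to the selected track.
    callsign = suggestion["callsign"]
    if not isinstance(callsign, str) or not CALLSIGN_RE.match(callsign):
        return "reject"
    if callsign != intent.get("callsign"):
        return "reject"
    # If the logged HMI action is itself incomplete, ask instead of guessing.
    if any(slot not in intent for slot in REQUIRED_SLOTS):
        return "clarify"
    # Numeric integrity and instruction match against the intent state.
    if suggestion["value"] != intent["value"]:
        return "reject"
    if suggestion["instruction"] != intent["instruction"]:
        return "reject"
    return "accept"


# Hypothetical intent state compiled from controller actions.
intent = {"callsign": "SIA318", "instruction": "climb", "value": 350}
ok = check_suggestion(intent, {"callsign": "SIA318", "instruction": "climb", "value": 350})
bad_number = check_suggestion(intent, {"callsign": "SIA318", "instruction": "climb", "value": 530})
incomplete = check_suggestion({"callsign": "SIA318", "instruction": "climb"},
                              {"callsign": "SIA318", "instruction": "climb", "value": 350})
print(ok, bad_number, incomplete)
```

Because every check is a plain comparison, the same input always yields the same verdict, which is what makes such a layer auditable.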
|
| |
| 15:10-15:20, Paper ThBTR-29.8 | Add to My Program |
| FSC-CD: A Feature–Structure Coupled Approach for Community Deception in Networks |
|
| Yang, Yue | Xi'an University of Architecture and Technology |
| Yang, Zhuoyan | Xi'an University of Architecture and Technology |
| Liu, Yutong | Xi'an University of Architecture and Technology |
| Zhang, Chong | Xi'an Jiaotong University |
| Su, Xiaojie | Chongqing University |
Keywords: Trust, Transparency, and Ethical Governance in Human-Machine Systems
Abstract: Many real-world networks, such as social and biological networks, exhibit community structures. Community detection algorithms extract valuable insights from these networks by identifying densely connected groups, enabling applications such as recommendation, behavior understanding, and system optimization. However, growing concerns about data privacy and security have led to techniques that protect user information from being over-inferred within communities. This has given rise to community deception (CD), which introduces small, targeted perturbations to a network to obscure sensitive communities from detection algorithms. Most existing community deception approaches focus on modifying network topology, often neglecting the rich feature information embedded within communities. In this paper, we propose FSC-CD (Feature–Structure Coupled Community Deception), which couples feature-derived representations with structural cues to improve community concealment. FSC-CD is effective for both single-community deception and randomized multi-community hiding. A key innovation is a budget allocation strategy that optimizes the distribution of perturbations to maximize deception efficiency. Moreover, by exploiting feature similarity, FSC-CD designs an edge perturbation mechanism that improves stability under small perturbation budgets. Extensive experiments on three real-world network datasets across multiple community detectors show that FSC-CD is more stable and consistently outperforms baseline methods in hiding both single and multiple communities, reducing the detection accuracy by up to 17.6% compared to state-of-the-art approaches.
|
| |
| 15:20-15:30, Paper ThBTR-29.9 | Add to My Program |
| Closing the Context Gap: Community Rules As a Correction Layer for Algorithmic Moderation |
|
| Kuo, Tina | Technical University of Munich |
| Buchholz, Clara Lea | Technical University Munich (TUM) |
| Grossklags, Jens | Technical University of Munich |
Keywords: Trust, Transparency, and Ethical Governance in Human-Machine Systems, Policy, Standards, and Societal Impact of, Human-AI Collaboration and Decision Support
Abstract: As social media platforms increasingly rely on automated systems for content moderation, user trust and system transparency are threatened by a context gap where centralized algorithms fail to account for the nuanced social norms of diverse sub-communities. While platforms provide generalized policy baselines, they rely on volunteer moderators to bridge this gap by establishing community-specific rules. We conducted a mixed-methods study comprising a content analysis of 337 rules from 60 Facebook Groups and semi-structured interviews with 15 users to examine the utility of these governance artifacts. Our findings reveal that community rules function as a critical transparency interface, providing instructional scaffolding and contextual nuance that centralized algorithmic systems lack. We frame these rules not merely as social norms, but as human-in-the-loop interventions that augment platform-wide oversight by defining specific consequences and ethical boundaries. Users perceive these localized rules as more legitimate anchors for trust than platform-wide standards. We conclude by discussing how these human-authored governance artifacts democratize digital governance and offer design implications for hybrid human-machine moderation systems.
|
| |
| 15:30-15:40, Paper ThBTR-29.10 | Add to My Program |
| Towards Responsible AI in Safety-Critical Human-Machine Systems: A Literature-Based Governance Framework |
|
| Panner, Elias | TU Wien |
| Scheffer, Sara | TU Wien |
| Matyas, Kurt | TU Wien |
Keywords: Trust, Transparency, and Ethical Governance in Human-Machine Systems
Abstract: The increasing integration of Artificial Intelligence (AI) into safety-critical human–machine systems raises significant challenges for risk governance, accountability, and human oversight. Although numerous Responsible and Trustworthy AI principles and frameworks have been proposed, their operationalization in real-world, safety-critical settings remains uneven and insufficiently evaluated. This paper presents a systematic literature review (SLR), conducted in accordance with PRISMA guidelines, examining Responsible AI risk governance in safety-critical systems between 2015 and 2025. The review draws on peer-reviewed research, international standards, and policy documents from authoritative bodies, including the EU, NIST, IEEE, and ISO. A thematic synthesis of 36 sources organizes the literature into three interconnected themes: cross-domain governance frameworks, healthcare as a mature safety-critical domain, and firefighting and emergency response as an emerging domain. The findings reveal convergence on core governance dimensions (e.g., accountability, transparency and explainability, safety and robustness, and human agency and oversight) alongside persistent gaps in implementation and lifecycle governance. Based on these insights, a human-centered, lifecycle-oriented AI governance framework is proposed to support the design and governance of safety-critical human–machine systems.
|
| |
| 15:40-15:50, Paper ThBTR-29.11 | Add to My Program |
| Integrating Responsible AI into the Model Development Life Cycle: A Practical Framework for Predictive Maintenance |
|
| Barisua, Barile Smith | Southern Alberta Institute of Technology |
| Suman, Reeta | Southern Alberta Institute of Technology (SAIT) |
Keywords: Trust, Transparency, and Ethical Governance in Human-Machine Systems, Human-in-loop Systems and Architectures
Abstract: Safety-critical ML systems require human oversight, but data quality issues often go undetected, leading operators to trust invalid models. We present RAI-MDLC, a checkpoint framework that enables non-expert reviewers to validate ML pipelines using clear pass/fail criteria. Through controlled experiments on aerospace turbofan degradation (NASA C-MAPSS, n=31,818) and healthcare fall detection (Medical IoT, n=670), we systematically detect three common errors: data leakage (detected in <1 min), label misalignment (<5 min), and class imbalance collapse (immediate). An ablation analysis identifies a minimum viable subset of checkpoints (2.1, 4.1, and 4.2) that catches 100% of tested errors. After guided corrections, aerospace models achieve an R² of 0.88 (LSTM) and 0.82 (XGBoost). In healthcare, the framework prevents deployment of a model with 95.68% accuracy but 0% recall for falls. Cross-domain code reusability reaches 85–90%, debugging time drops from days to 2–4 hours, and transparent validation builds operator trust. RAI-MDLC complements existing automated tools by adding a layer of human-centered governance accessible to domain experts without ML training.
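The class imbalance collapse mentioned in this abstract, where accuracy is high but recall for the rare class is zero, can be shown with a small pass/fail check. The sketch below is illustrative only: the labels are synthetic, and the function names and the `min_recall` threshold are assumptions rather than the RAI-MDLC checkpoint definitions.

```python
def accuracy_and_recall(y_true, y_pred, positive=1):
    """Return (accuracy, recall for the positive class)."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    true_pos = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    false_neg = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    accuracy = correct / len(y_true)
    recall = true_pos / (true_pos + false_neg) if (true_pos + false_neg) else 0.0
    return accuracy, recall


def passes_recall_checkpoint(y_true, y_pred, min_recall=0.5):
    """Pass/fail criterion: fail when minority-class recall is below the threshold."""
    _, recall = accuracy_and_recall(y_true, y_pred)
    return recall >= min_recall


# Synthetic example: 95 non-falls, 5 falls, and a model that never predicts a fall.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

accuracy, recall = accuracy_and_recall(y_true, y_pred)
print(f"accuracy={accuracy:.2f}, recall={recall:.2f}, "
      f"pass={passes_recall_checkpoint(y_true, y_pred)}")
```

A reviewer without ML training can read the final pass/fail flag directly, while the accuracy figure alone would have suggested a usable model.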
|
| |
| 15:50-16:00, Paper ThBTR-29.12 | Add to My Program |
| A Task-Based Benchmark for Model Context Protocol-Driven Geospatial Dataset Discovery: LLM Agents vs Human Search |
|
| Morandini, Luca | University of Melbourne |
| Sinnott, Richard O. | University of Melbourne |
Keywords: Human-AI Collaboration and Decision Support, Human-Machine Integration, Human Factors, Ergonomics, and Performance in Intelligent Systems
Abstract: The use of Large Language Models to assist in querying geospatial metadata remains under-explored, with most research focused on querying geographic data within datasets that are assumed to be given. In this paper, we present a task-based benchmark comparing human analysts with Model Context Protocol (MCP)-enabled LLM agents in querying a collection of 7,000+ geospatial datasets with curated metadata. Across 10 dataset-discovery tasks, we evaluate performance using a range of criteria: task success, search effort (number of steps), and retrieval quality (precision and recall) of the final query. We also report time and cost measures where applicable. Results show that frontier models approach human performance on some retrieval-quality metrics and, under a strict exact-match success criterion, may match or surpass humans, although performance remains task-dependent. These findings illustrate the current accuracy–efficiency trade-offs of MCP-based LLM agents for geospatial data.
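The retrieval-quality and exact-match criteria named in this abstract can be expressed over sets of dataset identifiers. The sketch below is a generic illustration, not the benchmark's scoring code; the identifiers and the function names are hypothetical.

```python
def precision_recall(retrieved: set, relevant: set):
    """Precision and recall of a final query's result set against a gold set."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def exact_match(retrieved: set, relevant: set) -> bool:
    """Strict success criterion: the result set equals the gold set exactly."""
    return retrieved == relevant


# Hypothetical dataset identifiers for one discovery task.
gold = {"ds-a", "ds-b", "ds-e"}
agent_result = {"ds-a", "ds-b", "ds-c", "ds-d"}

precision, recall = precision_recall(agent_result, gold)
print(f"precision={precision:.2f}, recall={recall:.2f}, "
      f"exact={exact_match(agent_result, gold)}")
```

A result can score reasonably on precision and recall and still fail the exact-match criterion, which is why the two are reported separately.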
|
| |