| |
Last updated on October 11, 2025. This conference program is tentative and subject to change
Technical Program for Tuesday October 7, 2025
|
Tu-S1-T1 |
Hall F |
Deep Learning 3 |
Regular Papers - Cybernetics |
Chair: Zhang, Zhiyuan | Singapore Management University |
Co-Chair: Ji, Changqing | TOHOKU University |
|
08:30-08:45, Paper Tu-S1-T1.1 | |
Time-Frequency-Spatial Neural Architecture for Decoding Visual Signals from Macaque ECoG |
|
Ji, Changqing | TOHOKU University |
Kawasaki, Keisuke | Niigata University |
Hasegawa, Isao | Niigata University |
Okatani, Takayuki | TOHOKU University |
Keywords: Deep Learning, Computational Intelligence, Machine Learning
Abstract: Understanding how visual information is encoded in electrocorticography (ECoG) signals is essential for developing accurate and interpretable decoding models. In this study, we propose two novel approaches for multi-class visual classification based on ECoG data recorded from the inferior temporal cortex of macaque monkeys. The first model, MST-ECoGNet, combines traditional signal processing with neural networks by employing the Modified Stockwell Transform (MST) to map ECoG signals into a structured time-frequency-spatial domain. The second model, BiBand-3DECoGNet, replaces MST with a learnable convolutional module and utilizes a 3D spatial encoder to exploit the electrode array structure. Experimental results show that our models significantly outperform prior work, achieving up to a 12.87 percentage point improvement in classification accuracy while reducing model size by a factor of ten and increasing training speed sixfold. Analysis of feature dimensions reveals that spatial and low-frequency components carry the most relevant information for visual decoding. These findings provide a foundation for further exploration of the neural mechanisms underlying visual object representation in the brain.
|
|
08:45-09:00, Paper Tu-S1-T1.2 | |
From Traditional Methods to GPT-Based Models for 2D Video Game Level Procedural Content Generation: An Empirical Study |
|
Cerezo, Daniel | University of Granada |
Triguero, Isaac | University of Granada |
Keywords: Deep Learning, Evolutionary Computation, Transfer Learning
Abstract: Procedural level generation in video games has made significant strides, yet achieving high-quality automated level design remains a major challenge. Over the years, techniques have evolved from simple constructive algorithms to advanced Artificial Intelligence (AI) models like Generative Adversarial Networks and Large Language Models. However, the lack of a standardised evaluation framework has hindered direct numerical comparisons and the ability to gauge true progress in the field. To address this gap, we propose an evaluation methodology to benchmark key generation techniques and explore the potential of general-purpose AI models. As a case study, we present a Super Mario Bros level generator powered by ChatGPT, leveraging general-purpose natural language for design tasks. The results show the different strengths and weaknesses of existing models, indicating that traditional algorithms still outperform the most advanced AI methods in this domain, highlighting the need for further innovation to bridge the gap.
|
|
09:00-09:15, Paper Tu-S1-T1.3 | |
VAEGAN-Based Architecture for Missing Values Imputation in Sparse Datasets |
|
Charbti, Yanis | University of Gabès; National Engineering School of Gabès (ENIG) |
Njah, Hasna | University of Gabes |
Keywords: Deep Learning, Hybrid Models of Neural Networks, Fuzzy Systems, and Evolutionary Computing, Machine Learning
Abstract: In order to address missing-value prediction in sparse datasets, we present VAEGAIN, a novel imputation framework modeled after transformer architectures that combines the adversarial robustness of Generative Adversarial Networks (GANs) with the representational power of Variational Autoencoders (VAEs). To capture intricate feature dependencies and maintain statistical coherence, each linear transformation in the GAN discriminator and encoder–decoder pathways is parameterized. Because high missingness can skew underlying feature correlations and impair downstream model performance, sparse data poses serious imputation challenges in many real-world scenarios. In order to overcome these problems, VAEGAIN uses three dynamically generated matrices—data, mask, and stochastic noise—to direct an end-to-end, context-aware imputation procedure. In order to ensure accurate imputation and strong feature consistency, the framework is optimized using a composite loss that combines reconstruction error with adversarial objectives. An experimental evaluation shows that, at a fixed 20% missing rate, VAEGAIN achieves 12-18% higher imputation accuracy than state-of-the-art baselines. Our approach offers a trustworthy and interpretable solution for incomplete data reconstruction in high-stakes domains like healthcare and finance by providing trustworthy probabilistic estimates in addition to imputed values. By establishing a new benchmark for hybrid, uncertainty-aware imputation, VAEGAIN opens the door for probabilistic reasoning on sparse, structured datasets in the future.
|
|
09:15-09:30, Paper Tu-S1-T1.4 | |
Detection of Incomplete Root Canal Obturations in Dental X-Ray Images Via Spatial-Semantic Attention and Dynamic Feature Calibration |
|
Ren, Zhiqi | Chongqing Normal University |
Chai, Shanglei | Shenzhen University |
Zhang, Zhiyuan | Singapore Management University |
Zhang, Xueyang | Southern Medical University |
Tian, Yibin | Shenzhen University |
Zeng, Zhi | Chongqing Normal University |
Keywords: Deep Learning, Image Processing and Pattern Recognition
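Abstract: To address the challenges of low resolution, small-target feature loss, and interference from complex anatomical structures when detecting incomplete root canal obturations in dental periapical X-ray images, this paper proposes an improved YOLOv8 model. First, we design a convolution module with Space-to-Depth Transformation (SDT-Conv), which preserves feature-map resolution through space-to-depth and separable convolution, effectively mitigating the loss of small-target information caused by downsampling operations. Second, we construct a Dynamic Iterative Token Aggregator (DITA) architecture, which enhances global feature representation through super-token spatial aggregation and semantic association, while using a spatial-semantic dual-stream attention mechanism to strengthen multi-scale feature fusion, thereby providing richer feature information for the whole network. Finally, we embed an Efficient Multi-scale Attention (EMA) dynamic calibration mechanism in the detection head, which optimizes feature responses through cross-channel weight adaptation, enabling the model to …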
|
|
09:30-09:45, Paper Tu-S1-T1.5 | |
RSTnet: A Residual Hybrid CNN-Transformer Model for Enhanced Multi-Organ Segmentation |
|
Wang, Beiqi | Beijing University of Chemical Technology |
Miao, Deyu | Beijing University of Chemical Technology |
Keywords: Deep Learning, Image Processing and Pattern Recognition
Abstract: Multi-organ segmentation is essential in clinical applications but remains challenging due to diverse organ morphologies, image noise that degrades local features, and data scarcity limiting the representation of small organs. These challenges often cause poor boundary delineation and loss of fine microstructural details. Previous methods typically struggle to effectively balance global contextual information and fine local features, resulting in suboptimal segmentation performance, especially around complex organ boundaries. To address these limitations, we propose RSTnet, a hybrid CNN-Transformer architecture designed to explicitly capture and fuse multi-scale global and local information. RSTnet uses a dual-encoder structure: one path employs a Swin Transformer backbone enhanced with Convolutional Block Attention Modules (CBAM) to extract robust global context, while the other path is a dedicated Convolutional Neural Network (CNN) encoder that focuses on fine-grained local feature extraction. Rather than integrating the parallel encoder outputs directly, Residual Channel Attention Skip Connections (RCASC) are introduced as attention-enhanced skip connections between encoder and decoder layers to improve feature fusion during decoding. Experimental results on the Synapse multi-organ CT dataset demonstrate that RSTnet outperforms state-of-the-art methods, achieving an average Dice Similarity Coefficient (DSC) of 76.25% and an average Hausdorff Distance (HD95) of 25.24%. These results confirm that RSTnet effectively overcomes the limitations of prior methods, particularly in segmenting complex organ boundaries and capturing microstructural details under challenging conditions.
|
|
09:45-10:00, Paper Tu-S1-T1.6 | |
Optimizing Latent Factor Models for Recommender Systems |
|
Elrfaey, Mohamed | University of Victoria |
Gulliver, Aaron | University of Victoria |
Keywords: Machine Learning, Intelligent Internet Systems, Deep Learning
Abstract: This paper presents an approach to matrix factorization-based recommender systems that incorporates parameter tuning and noise modeling to improve robustness and generalization. Validation performance is best with a latent feature dimension of k = 2, which yields a validation mean squared error (MSE) of 0.464. The results show the impact of Gaussian noise on performance and demonstrate how implicit feedback integration can improve the accuracy of recommendations. They offer practical insights into building scalable and robust recommendation systems.
|
|
Tu-S1-T2 |
Hall N |
Application of Artificial Intelligence 3 |
Regular Papers - Cybernetics |
Chair: Huang, Linqing | Shanghai Jiao Tong University |
Co-Chair: Oliveira, Adriano, Adriano L.I.Oliveira | Universidade Federal De Pernambuco |
|
08:30-08:45, Paper Tu-S1-T2.1 | |
TripletAudit: Context-Aware Sensitive Content Detection for Reimbursement |
|
Qi, Gege | Computer Network Information Center, CAS, Beijing, China; Univer |
Chang, Wenjing | Computer Network Information Center, CAS, Beijing, China; Univer |
Wang, Yue | Computer Network Information Center, CAS |
Yu, Jianjun | Computer Network Information Center, Chinese Academy of Sciences, |
Keywords: Application of Artificial Intelligence, Deep Learning, Machine Learning
Abstract: Sensitive content detection in financial reimbursement documents is fundamental to corporate compliance management. Traditional keyword-based approaches suffer from high false-positive rates due to their inability to capture contextual semantics. We propose TripletAudit, a context-aware framework for sensitive content detection that leverages triplet learning, incorporating a hierarchical architecture combining BERT for semantic embedding, Bi-LSTM for sequential modeling, and multi-head attention for context understanding. Our triplet learning mechanism effectively distinguishes between sensitive and non-sensitive contexts through anchor-positive-negative samples. Experiments on real-world reimbursement datasets demonstrate that TripletAudit achieves state-of-the-art performance with 94.03% accuracy and 93.24% F1-score, significantly outperforming baseline methods in financial compliance scenarios.
|
|
08:45-09:00, Paper Tu-S1-T2.2 | |
CAMF: Collaborative Adversarial Multi-Agent Framework for Machine Generated Text Detection |
|
Yue, Wang | Upstart Holdings |
Wei, Liesheng | Shanghai Ocean University |
Wang, Yuxiang | Stevens Institute of Technology |
Keywords: Information Assurance and Intelligence, AI and Applications, Application of Artificial Intelligence
Abstract: Detecting machine-generated text (MGT) from contemporary Large Language Models (LLMs) is increasingly crucial amid risks like disinformation and threats to academic integrity. Existing zero-shot detection paradigms, despite their practicality, often exhibit significant deficiencies. Key challenges include: (1) superficial analyses focused on limited textual attributes, and (2) a lack of investigation into consistency across linguistic dimensions such as style, semantics, and logic. To address these challenges, we introduce the Collaborative Adversarial Multi-agent Framework (CAMF), a novel architecture using multiple LLM-based agents. CAMF employs specialized agents in a synergistic three-phase process: Multi-dimensional Linguistic Feature Extraction, Adversarial Consistency Probing, and Synthesized Judgment Aggregation. This structured collaborative-adversarial process enables a deep analysis of subtle, cross-dimensional textual incongruities indicative of non-human origin. Empirical evaluations demonstrate CAMF’s significant superiority over state-of-the-art zero-shot MGT detection techniques.
|
|
09:00-09:15, Paper Tu-S1-T2.3 | |
GTRNet: Graph Topology-Aware Refinement Network for User Role Classification in Social Networks |
|
Gu, Tingxuan | Shanghai Jiao Tong University |
Huang, Linqing | Shanghai Jiao Tong University |
Liu, Gongshen | Shanghai Jiao Tong University |
Keywords: Application of Artificial Intelligence, Deep Learning, Machine Learning
Abstract: With the rapid development of Internet technology, social networks have become essential platforms for communication, and in the process, they generate vast amounts of user data that reflect interests, relationships, and habits. User role classification in social networks is critical for personalized recommendations, precision marketing, and security. In this work, we investigate Graph Neural Networks (GNNs) for user role classification in social networks and design a novel GNN architecture, termed Graph Topology-aware Refinement Network (GTRNet). GTRNet comprises two modules: Graph Topology Encoding (GTE) and Node Representation Refinement (NRR). They combine the node features and network topological information via convolutional layers to enhance user role classification performance. The experimental results demonstrate that GTRNet usually outperforms the state-of-the-art methods on the Facebook, Cora, and Citeseer datasets, with micro-F1 score improvements of 0.021, 0.009, and 0.014, respectively. This verifies GTRNet's effectiveness in addressing the social network user role classification task.
|
|
09:15-09:30, Paper Tu-S1-T2.4 | |
An Adaptive Learning System for Anomaly Detection in SDN |
|
Ferdous, Ehtesham | Charles Sturt University |
Rahman, Md Geaur | Charles Sturt University |
Mahmood, Adnan | Macquarie University |
Rehman, Sabih ur | Charles Sturt University |
Islam, Md Zahidul | Charles Sturt University |
Keywords: Intelligent Internet Systems, Machine Learning, Application of Artificial Intelligence
Abstract: Significant growth of internet users signifies that more autonomous systems need to be designed. Software-Defined Network (SDN) replaces traditional networks due to their solubility and capability to integrate with artificial intelligence. It also makes networking future-proof. A technological shift often comes with shortcomings, and SDN is no exception. Anomaly detection is a crucial task. Existing anomaly detection methods use traditional statistical and machine learning-based models that may perform poorly due to two key issues: (i) the unavailability of all training network packets at the beginning, since these packets are received over time and the training dataset gradually grows, and (ii) the limited memory and computation capacity of the computing system for model building, which prevents the storage of all network packets that arrive over a prolonged period. To address these issues, this paper introduces a novel functionality in SDN called Adaptive anomaly Detection Method (ADM), which utilises an existing technique for facilitating incremental learning so that ADM can handle new anomaly types by keeping track of historical information. ADM is evaluated using four criteria by comparing its performance against four state-of-the-art methods on two publicly available datasets. Our experimental results demonstrate that ADM performs significantly better than existing techniques. The source code of this work has been made publicly available at https://github.com/eferdous/adm
|
|
09:30-09:45, Paper Tu-S1-T2.5 | |
Offline and Continual Just-In-Time Software Defect Prediction with Pre-Trained Language Models |
|
Monteiro, Monique Louise | Federal University of Pernambuco |
Cabral, George Gomes | Federal Rural University of Pernambuco |
Oliveira, Adriano, Adriano L.I.Oliveira | Universidade Federal De Pernambuco |
Keywords: Application of Artificial Intelligence, Deep Learning, Machine Learning
Abstract: Just-in-time Software Defect Prediction (JIT-SDP) aims to detect potential defects early, helping to prevent risky code from entering the repository during development. This study evaluates JIT-SDP using pre-trained language models in different architectures and settings. It compares open-source fine-tuned models such as CodeT5+ and UniXCoder with closed LLMs such as GPT and Gemini. This is the first known study to compare trainable open and prompt-based closed decoder-only models for JIT-SDP. The main results show that fine-tuned open models outperform closed models in zero-shot and few-shot scenarios without advanced prompt engineering techniques, and in cross-project tasks, CodeT5+ and UniXCoder surpass previous state-of-the-art results. The findings underscore the value of model architecture, fine-tuning, and expert features for effective defect prediction. Finally, we introduce CodeFlowLM – to our knowledge, the first framework for continual JIT-SDP using pre-trained language models.
|
|
09:45-10:00, Paper Tu-S1-T2.6 | |
MGAN: Multi-Granularity Action Network for Video Action Recognition |
|
Zhou, Tong | Hebei University |
Yang, Wenzhu | Hebei University |
Keywords: Application of Artificial Intelligence, Deep Learning, Machine Vision
Abstract: Some traditional action recognition methods are limited to using appearance features to identify human behaviors. However, in real-world environments, human activities often occur in complex and changeable scenes. Therefore, learning the action features in videos from complex environments is the key to action recognition. When performing action recognition tasks in complex spatio-temporal backgrounds, fusing action information from different granularity levels is often a necessary step. In response to this idea, we propose a Multi-Granularity Action Network (MGAN), considering the modeling and fusion of spatio-temporal information with regard to action granularity. The central component of this network is the Multi-Granularity Action Excitation Module (MGAE), which excites features in parallel along different channel groups through different granularities and is capable of modeling multi-scale spatio-temporal information. To enhance the motion information between frames, we propose a Local Motion Module (LMM) for extracting fine-grained motion features. We insert the MGAE and the LMM into ResNet-50 to form a simple yet effective Multi-Granularity Action Network. We have conducted a large number of experiments on three widely used datasets, namely Something-Something V1, Something-Something V2, and UCF-101. The experiments demonstrate that with only a small number of additional parameters and computational cost introduced, the proposed MGAN achieves competitive performance.
|
|
Tu-S1-T4 |
Room 0.12 |
Robotic Systems 1 |
Regular Papers - SSE |
Chair: Ellinas, Georgios | University of Cyprus |
Co-Chair: Zhang, Shuaiqi | Harbin Engineering University |
|
08:30-08:45, Paper Tu-S1-T4.1 | |
SkyNet: An Extensible Edge-Cloud Collaborative Framework for Robots in Long-Horizon Tasks |
|
Jiang, Xinyi | Zhejiang University |
Wang, Guoming | Zhejiang University |
Lu, Rongxing | Queen's University, Canada |
Tang, Siliang | Zhejiang University |
Keywords: Robotic Systems
Abstract: Large language models (LLMs) have shown promise in empowering robotics, but their widespread real-world application remains challenging due to two main issues: (1) existing research is “out-of-the-box”, struggling to generalize to new robots and tasks, especially long-horizon tasks, and (2) deploying more general and powerful LLMs exceeds the capabilities of commodity hardware. To address these challenges, we propose the edge-cloud collaborative framework, i.e., SkyNet. We deploy LLMs in the cloud to create initial plans, select executable skills from a predefined library, and send them to the edge-based robot. The robot integrates multiple modules to form a policy network to complete the skills and update the feedback history. Based on the feedback history, the cloud LLMs determine whether to replan. The edge-cloud collaborative approach alleviates the pressure of deploying LLMs on commodity hardware, while the modular design enables easy extension to different tasks or robots without reconfiguring everything. To address the lack of standardized real-world experimental setups, we set up two easily replicable long-horizon tasks on a mobile robot equipped with commodity hardware, analyze the performance of various modules, and demonstrate the effectiveness of SkyNet.
|
|
08:45-09:00, Paper Tu-S1-T4.2 | |
Attention Mechanism and Improved SAC for Basketball Shooting in Humanoid Robots |
|
Zhang, Shuaiqi | Harbin Engineering University |
Zhao, Guodong | Harbin Engineering University |
Liu, Mingshuo | Harbin Engineering University |
Dong, Jian Hua | Harbin Engineering University |
Keywords: Robotic Systems
Abstract: Basketball shooting for humanoid robots is a highly complex and challenging task that requires efficient perception, decision-making capabilities, and precise motion control. Traditional control approaches to humanoid robot basketball shooting often rely on extensive manual coding, which results in high development costs, poor adaptability, and limited capability to handle the complex variations of dynamic environments. To address these challenges, this paper divides the basketball shooting process into three sub-tasks: approach the ball, pick up the ball, and shoot the ball. We address the challenges of perception, decision-making, and control by introducing an attention mechanism in the perception module, proposing a hybrid prioritized experience replay method for the Soft Actor-Critic (SAC) algorithm in the decision-making module, and designing low-level actions for robot control. Through end-to-end training, the robot sequentially completes the three sub-tasks and masters the basketball shooting skill through autonomous learning. The experimental results demonstrate that the proposed method significantly outperforms traditional deep reinforcement learning algorithms in terms of convergence speed and shooting accuracy.
|
|
09:00-09:15, Paper Tu-S1-T4.3 | |
Reinforcement Learning Task Assignment for Multi-Pursuer Multi-Evader Reach-Avoid Games |
|
Mahfuz, Asif | Queen's University |
Cabral, Kleber | Queen's University |
Givigi, Sidney | Queen's University |
Keywords: Robotic Systems
Abstract: Multi-pursuer multi-evader (MPME) pursuit-evasion (PE) scenarios are common in real-world applications but pose significant computational challenges as the number of agents grows and dynamics become more complex. Hierarchical decomposition simplifies the problem by splitting it into high-level task assignment (TA) and low-level execution. TA, an NP-hard problem with an exponentially expanding solution space, is addressed here using a novel Reinforcement Learning (RL) approach with both static and dynamic TA in a 2D MPME reach-avoid game. The RL method matches the performance of a combinatorial optimization approach with homogeneous pursuers and outperforms it with heterogeneous ones, demonstrating its effectiveness in complex settings.
|
|
09:15-09:30, Paper Tu-S1-T4.4 | |
Interactive Identification of Granular Materials Using Force Measurements |
|
Hynninen, Samuli | Aalto University |
Nguyen Le, Tran | Aalto University |
Kyrki, Ville | Aalto University |
Keywords: Robotic Systems
Abstract: Although the ability to identify granular materials holds potential for applications such as robotic cooking or earth moving, granular material identification remains challenging, with existing methods mostly relying on shaking the materials in closed containers. This work presents an interactive material identification framework that enables robots to identify a wide range of granular materials using only force-torque measurements. Unlike prior works, the proposed approach uses direct interaction with the materials. The approach is evaluated through experiments with a real-world dataset comprising 11 granular materials, which we also make publicly available. Results show that our method can identify a wide range of granular materials with near-perfect accuracy while relying solely on force measurements obtained from direct interaction. Additionally, we performed a case study on granular material transportation, which shows that manipulation skills targeted for a particular material may fail with another one, highlighting the potential benefits of reliable material identification. Further, our comprehensive data analysis and experiments show that a high-performance feature space must combine features related to the force signal's time-domain dynamics and frequency spectrum. We account for this by proposing a combination of the raw signal and its high-frequency magnitude histogram as the suggested feature space representation. We show that the proposed feature space outperforms baselines by a significant margin. The code and data set are available at: https://irobotics.aalto.fi/identify_granular/.
|
|
09:30-09:45, Paper Tu-S1-T4.5 | |
Towards Real-Time Safe UAV Trajectory Planning Using Clustering of Nearby UAVs in Dense Urban Environments |
|
Exadaktylos, Stylianos | University of Cyprus |
Vitale, Christian | University of Cyprus |
Kolios, Panayiotis | University of Cyprus |
Ellinas, Georgios | University of Cyprus |
Keywords: Robotic Systems, Autonomous Vehicle
Abstract: Rapid urbanization trends have resulted in a significant increase in global urban population over the past few decades. This growth is projected to continue, further exacerbating traffic congestion in road networks. Urban air mobility presents a promising solution to alleviate this congestion, offering substantial environmental, economic, and societal benefits. However, to fully realize these benefits, autonomous flight and automation are crucial for safely managing the high density of aircraft expected in urban airspaces. To address this challenge, this work proposes an innovative model predictive control (MPC) framework for generating on-demand, safe 3D trajectories for fleets of unmanned aerial vehicles (UAVs) operating in dense urban environments. Specifically, it introduces a near real-time MPC method designed to enhance the airspace’s capacity to accommodate an increasing number of flights within a confined area, while consistently adhering to stringent safety standards. An extensive simulation study is conducted to demonstrate the effectiveness of the proposed framework in planning safe, on-demand trajectories in complex urban settings.
|
|
09:45-10:00, Paper Tu-S1-T4.6 | |
NMPC-Lander: Nonlinear MPC with Control Barrier Function for UAV Landing on a Moving Platform |
|
Batool, Amber | Skolkovo Institute of Science and Technology |
Batool, Faryal | Skolkovo Institute of Science and Technology |
Khan, Roohan Ahmed | Skolkovo Institute of Science and Technology |
Mustafa, Muhammad Ahsan | Skolkovo Institute of Science and Technology |
Fedoseev, Aleksey | Skolkovo Institute of Science and Technology |
Tsetserukou, Dzmitry | Skoltech |
Keywords: Robotic Systems, Autonomous Vehicle, Adaptive Systems
Abstract: Quadcopters are versatile aerial robots gaining popularity in numerous critical applications. However, their operational effectiveness is constrained by limited battery life and restricted flight range. To address these challenges, autonomous drone landing on stationary or mobile charging and battery-swapping stations has become an essential capability. In this study, we present NMPC-Lander, a novel control architecture that integrates Nonlinear Model Predictive Control (NMPC) with Control Barrier Functions (CBF) to achieve precise and safe autonomous landing on both static and dynamic platforms. Our approach employs NMPC for accurate trajectory tracking and landing, while simultaneously incorporating CBF to ensure collision avoidance with static obstacles. Experimental evaluations on real hardware demonstrate high precision in landing scenarios, with an average final position error of 9.0 cm and 11 cm for stationary and mobile platforms, respectively. Notably, NMPC-Lander outperforms the B-spline combined with the A* planning method by nearly threefold in terms of position tracking, underscoring its superior robustness and practical effectiveness.
|
|
Tu-S1-T5 |
Room 0.14 |
Human-Machine Interaction 1 |
Regular Papers - HMS |
Chair: Gu, Niu | Shanghai University of Electric Power |
Co-Chair: Menezes, Paulo | University of Coimbra |
|
08:30-08:45, Paper Tu-S1-T5.1 | |
DMCM Dual-Space Mapping Contrastive Meta Learning for Cold Start Recommendation |
|
Cao, Yukun | Shanghai University of Electric Power |
Gu, Niu | Shanghai University of Electric Power |
He, Yongcheng | Shanghai University of Electric Power |
Keywords: Human-Computer Interaction, Human-centered Learning
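Abstract: Methods based on meta-learning and contrastive learning have achieved promising results in addressing the cold-start problem in recommender systems by globally modeling user preferences. However, because historical interactions between cold-start items and users are sparse, it is difficult to accurately capture the hierarchical structural relationships between users and items. To alleviate this problem, we propose Dual-space Mapping Contrastive Meta-learning for cold-start recommendation (DMCM), which simultaneously maps user and item information into Poincaré space and Euclidean space to capture user-item interaction information more effectively. Through dual-space mapping contrastive learning, the representation capability for user-item associations is enhanced and user interaction information is expanded. Subsequently, a multi-level meta-learning path dynamic adjustment strategy is used to further improve the model's ability to personalize and adapt to new users and new items. Experiments conducted on three public datasets show that the proposed …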
|
|
08:45-09:00, Paper Tu-S1-T5.2 | |
Preliminary Study on Sense of Agency Assessment Via Intentional Binding in an EMG-Based Control Interface |
|
Nagai, Miwa | NTT Corporation |
Shindo, Masato | NTT Corporation |
Aoki, Ryosuke | NTT, Inc |
Keywords: Human-Computer Interaction, Human-Machine Interaction
Abstract: Wearable electromyography (EMG)-based control interfaces are widely used by various users, including individuals both with and without physical disabilities. Since EMG-based control interfaces are unfamiliar to most users and require individually defined input detection thresholds, differences in the sense of agency among users are to be expected. This study aimed to investigate individual trends in the sense of agency when operating an unfamiliar EMG-based control interface. To this end, we utilized a task that combines the intentional binding (IB) measurement paradigm with a perceptual-motor learning task to measure task performance and implicit sense of agency during EMG operation. To classify individual trends in the sense of agency, we conducted clustering based on variations in raw IB data. We found that the cluster with large variations in IB had a low implicit sense of agency, reflecting the difficulty of the interface operation in detecting muscle movements as input. In addition, an analysis of the cluster with low variability in IB suggests that participants who took longer to become familiar with the input operation exhibited higher implicit agency, while those who found the task easy exhibited lower implicit agency. These findings suggest that IB may serve as a useful measure for capturing differences in operability not reflected in task performance or explicit reports of sense of agency.
|
|
09:00-09:15, Paper Tu-S1-T5.3 | |
Designing Effective Human-Swarm Interaction Interfaces: Insights from a User Study on Task Performance |
|
Wattearachchi, Wasura | University of New South Wales |
Lakshika, Erandi | University of New South Wales |
Kasmarik, Kathryn | University of New South Wales |
Barlow, Michael | University of New South Wales |
Keywords: Human-Computer Interaction, User Interface Design, Human-Collaborative Robotics
Abstract: In this paper, we present a systematic method of design for human-swarm interaction interfaces, combining theoretical insights with empirical evaluation. We first derived ten design principles from existing literature, applied them to key information dimensions identified through goal-directed task analysis, and developed a tablet-based interface for a target search task. We then conducted a user study with 31 participants in which humans were required to guide a robotic swarm to a target in the presence of three types of hazards that pose a risk to the robots: Distributed, Moving, and Spreading. Performance was measured based on the proximity of the robots to the target and the number of deactivated robots at the end of the task. Results indicate that at least one robot was brought closer to the target in 98% of tasks, demonstrating the interface’s success in fulfilling the primary objective of the task. Additionally, in nearly 67% of tasks, more than 50% of the robots reached the target, with notably better performance under moving hazards. The interface also appeared to help minimise robot deactivation: in nearly 94% of tasks, participants kept more than 50% of the robots active, ensuring that most of the swarm remained operational. However, its effectiveness varied across hazards, with robot deactivation being lowest in distributed hazard scenarios, suggesting that the interface provided the most support in these conditions.
|
|
09:15-09:30, Paper Tu-S1-T5.4 | |
Comparison of Low-Cost Object Detection Models for Character Product Image Detection from SNS |
|
Funada, Maho | Meiji University |
Sakurai, Yoshitaka | Meiji University |
Keywords: Human-Machine Interaction, Human-Computer Interaction, Human-Machine Cooperation and Systems
Abstract: This study conducts a comparative evaluation of the YOLO and DETR models as an initial step toward the development of a cost-effective and accurate object detection system for images of character-themed products, such as stuffed animals, on social media platforms. The evaluation focuses on the detection methods and accuracy of each model, analyzing and identifying the strengths and characteristics of each model.
|
|
09:30-09:45, Paper Tu-S1-T5.5 | |
Orlock: A Modular Agent for Natural Interaction in Indoor Navigation |
|
Carnaz, Gonçalo | Institute of Systems and Robotics, Coimbra University |
Faria, Esmeralda | Universidade De Coimbra |
Menezes, Paulo | University of Coimbra |
Keywords: Human-Computer Interaction, Human-Machine Interface, Human-Machine Interaction
Abstract: Providing effective indoor spatial guidance through natural language remains a key challenge in human-computer interaction. This paper presents Orlock, a virtual agent designed to support wayfinding and spatial orientation within a university building. To ensure a consistent baseline, building maps were shown to participants unfamiliar with the space. Following their interaction with Orlock, participants completed an 18-item questionnaire covering five key dimensions: clarity, usefulness, ease of interaction, overall satisfaction, and design preferences. Results from the 7-point Likert-scale responses indicated generally high satisfaction, with a clear preference for multimodal feedback combining visual and verbal instructions. These findings highlight the importance of multimodal interaction in orientation systems and offer practical design insights for future development of intelligent spatial guidance agents.
|
|
09:45-10:00, Paper Tu-S1-T5.6 | |
Multimodal Appearance-Based Gaze-Controlled Virtual Keyboard with Synchronous–Asynchronous Interaction for Low-Resource Settings |
|
Meena, Yogesh | IIT Gandhinagar |
Salvi, Manish | Indian Institute of Technology Gandhinagar |
Keywords: Human-Computer Interaction, Multi-User Interaction, Assistive Technology
Abstract: Over the past decade, the demand for communication devices has increased among individuals with mobility and speech impairments. Eye-gaze tracking has emerged as a promising solution for hands-free communication; however, traditional appearance-based interfaces often face challenges such as accuracy issues, involuntary eye movements, and difficulties with extensive command sets. This work presents a multimodal appearance-based gaze-controlled virtual keyboard that utilises deep learning in conjunction with standard camera hardware, incorporating both synchronous and asynchronous modes for command selection. The virtual keyboard application supports menu-based selection with nine commands, enabling users to spell and type up to 56 English characters, including uppercase and lowercase letters, punctuation, and a delete function for corrections. The proposed system was evaluated with twenty able-bodied participants who completed specially designed typing tasks using three input modalities: (i) a mouse, (ii) an eye-tracker, and (iii) an unmodified webcam. Typing performance was measured in terms of speed and information transfer rate (ITR) at both command and letter levels. Average typing speeds were 18.3 ± 5.31 letters/min (mouse), 12.60 ± 2.99 letters/min (eye-tracker, synchronous), 10.94 ± 1.89 letters/min (webcam, synchronous), 11.15 ± 2.90 letters/min (eye-tracker, asynchronous), and 7.86 ± 1.69 letters/min (webcam, asynchronous). ITRs were approximately 80.29 ± 15.72 bits/min (command level) and 63.56 ± 11 bits/min (letter level) with webcam in synchronous mode. The system demonstrated good usability and low workload with webcam input, highlighting its user-centred design and promise as an accessible communication tool in low-resource settings.
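The abstract reports ITR in bits/min but does not say which definition was used. As a hedged illustration only, the sketch below uses Wolpaw's formula, a common choice for selection-based interfaces; the function names and the choice of formula are assumptions, not the authors' stated method.

```python
import math

def bits_per_selection(n_targets, accuracy):
    """Wolpaw's estimate of information per selection (assumed formula).

    n_targets: number of selectable items (e.g. commands on a menu).
    accuracy:  probability in [0, 1] that a selection is correct.
    """
    if n_targets < 2:
        raise ValueError("n_targets must be at least 2")
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must lie in [0, 1]")
    bits = math.log2(n_targets)
    if accuracy > 0.0:
        bits += accuracy * math.log2(accuracy)
    if accuracy < 1.0:
        bits += (1.0 - accuracy) * math.log2((1.0 - accuracy) / (n_targets - 1))
    return bits

def itr_bits_per_min(n_targets, accuracy, selections_per_min):
    """Information transfer rate: bits per selection times selection rate."""
    return bits_per_selection(n_targets, accuracy) * selections_per_min
```

Under this definition, perfect accuracy yields log2(N) bits per selection and chance-level accuracy (1/N) yields zero, so ITR rewards both speed and correctness.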
|
|
Tu-S1-T6 |
Room 0.16 |
System Modeling and Control 1 |
Regular Papers - SSE |
Chair: Lingras, Pawan | Saint Mary's University |
Co-Chair: Ishihara, Shinji | Hitachi, Ltd |
|
08:30-08:45, Paper Tu-S1-T6.1 | |
Analysis of Periodic Dynamics of the Closed-Loop CCM DC-DC Buck Converter with Linear Method |
|
Twu, Shih-Hsiung | Chung Yuan University |
Ma, Li-Shan | Chung Yuan Christian University |
Wang, Yu-Chieh | Chung Yuan Christian University |
Keywords: System Modeling and Control
Abstract: In this paper, we delve into the examination of periodic dynamics, specifically with periods one and two, exhibited by closed-loop DC-DC buck converters operating in continuous conduction mode (CCM). Our approach employs a novel linear theoretical method introduced by the authors. The investigation utilizes state sequence flow charts derived from this innovative linear method to analyze the periodic dynamics inherent in closed-loop CCM DC-DC buck converters. Our research not only establishes theoretical foundations for the general exact closed-form solutions to system states and duty cycles but also validates them through simulation, enhancing the confirmation of our findings. The simulation verification assesses various aspects, including system states, duty cycles, average output voltages, and output voltage ripples, considering variations in input voltages. In essence, our work presents a straightforward and precise methodology to unravel the intricate nonlinear characteristics embedded in closed-loop CCM DC-DC buck converters.
|
|
08:45-09:00, Paper Tu-S1-T6.2 | |
EL4S: Enhanced L4S Congestion Control for Low-Latency Real-Time Streaming in 5G Networks |
|
Du, WenJi | University of Chinese Academy of Sciences, CNIC |
Yang, Wanghong | Computer Network Information Center |
Zhao, Baosen | CNIC |
Li, Zhenya | University of Chinese Academy of Sciences, Computer Network Info |
Ren, Yongmao | CNIC |
Zhou, Xu | CNIC |
Keywords: System Modeling and Control, Communications, Control of Uncertain Systems
Abstract: Reliable low-latency media streaming is increasingly critical for delivering seamless interactive and immersive services. To meet this need, the IETF and 3GPP have introduced the Low Latency, Low Loss, Scalable Throughput (L4S) architecture, which mitigates queuing delays in IP traffic and supports latency-sensitive applications. This paper focuses on real-time video streaming and proposes an enhanced framework, Enhanced L4S (EL4S), which integrates a link load factor feedback mechanism to more accurately reflect real-time network conditions. We design and implement a cross-layer end-side transmission control algorithm based on EL4S to improve real-time video performance over 5G networks. In this design, EL4S encodes link load information into packets at the 5G core and uses ACK-based feedback to inform the sender about current network states. At the sender, a dynamic bitrate adaptation algorithm adjusts the transmission rate in response to the reported link load factor, balancing throughput and delay while avoiding network congestion. This adaptive mechanism enables precise, frame-level control over video encoding rates and transmission behavior. Extensive experimental evaluations demonstrate that EL4S achieves high link utilization and low latency, significantly outperforming existing baseline algorithms. Overall, EL4S provides an efficient and deployable solution for achieving low-latency, high-quality real-time streaming in dynamic 5G networks.
|
|
09:00-09:15, Paper Tu-S1-T6.3 | |
Extracting 3D Features from 2D Images Using Artificial Intelligence: A Survey |
|
Fisher, Andrew | York University |
Pillai, Arjun | York University |
Lingras, Pawan | Saint Mary's University |
Mago, Vijay | York University |
Keywords: System Modeling and Control, Consumer and Industrial Applications, Robotic Systems
Abstract: In computer vision, most data are captured in 2D formats, limiting spatial understanding in real-world applications. This presents a challenge for fields such as architecture, construction, and robotics, where interpreting spatial relationships from minimal visual input is increasingly essential. This survey reviews recent advancements in extracting 3D features from 2D imagery, a critical task in these domains, where spatial accuracy and object orientation directly impact performance. We focus on three core areas: (1) disposition estimation, determining object pose; (2) joint modeling, constructing skeletal representations; and (3) scene reconstruction, generating spatially accurate environments. Each category is evaluated based on input modalities, performance metrics, and code availability. By providing a unified overview of these techniques, this paper highlights their practical value in enabling 3D reasoning from conventional 2D data.
|
|
09:15-09:30, Paper Tu-S1-T6.4 | |
Model Predictive Allocation Control for Virtual Power Plants Reflecting Community Preferences |
|
Ishihara, Shinji | Hitachi, Ltd |
Ohtsuka, Toshiyuki | Kyoto University |
Keywords: System Modeling and Control, Cyber-physical systems, Intelligent Power Grid
Abstract: Virtual Power Plants (VPPs) are increasingly being used to achieve carbon neutrality in energy systems and to improve resilience. This study addresses a control scheme for small-scale community-driven VPPs, which have attracted much attention in recent years. In such community-driven VPPs, operations are not limited to maximizing economic value but also focus on community preferences. We propose Model Predictive Allocation Control (MPAC), which enables VPPs to operate in a way that appropriately reflects community preferences. MPAC formulates the energy resource allocation control as Model Predictive Control (MPC) and modifies the VPP operation by tuning the weight parameters of the MPC. Furthermore, the weight parameters can be tuned by a preference learning-based optimization algorithm to easily reflect the community's decisions. We also ensure the operational stability of the VPPs by using frequency stabilizing control in combination. The effectiveness of the proposed method was verified by numerical simulation experiments.
|
|
09:30-09:45, Paper Tu-S1-T6.5 | |
Multi-UAV Transportation Assignment Network System Research Considering Route Planning |
|
Teng, Yushan | Beihang University |
Wang, Xiaohong | Beihang University |
Wang, Lizhi | Beihang University |
Li, Ruyue | Beihang University |
Keywords: System Modeling and Control, Decision Support Systems, Intelligent Transportation Systems
Abstract: This study focuses on the multi-UAV assignment network system problem considering route planning. It first establishes a geographic environment model and designs the cost function and the flight constraints of the UAV during the transport task. Combined with the particle swarm optimization algorithm, it uses simulation software to solve the assignment network system problem and constructs a transportation assignment network with complete route information based on the optimal flight route generated by the algorithm. Based on relevant real-world scenarios, a transportation network simulation experiment and a multi-point distribution simulation experiment are carried out to obtain the optimal assignment planning scheme that meets the cost objective of the multi-UAV system under the flow restrictions and transportation point requirements.
|
|
09:45-10:00, Paper Tu-S1-T6.6 | |
Multi-Order Shipping Networks Via High-Order Dependencies: Construction and Node Influence Analysis |
|
Yu, Mengjun | National University of Defense Technology |
Ye, Xiongfei | National University of Defense Technology |
Han, Xu | China Electronics Technology Group Corporation No.7 Research Ins |
Yan, Liang | National University of Defense Technology |
Duan, XiaoJun | National University of Defense Technology |
Huang, PengQiZi | National University of Defense Technology |
Keywords: System Modeling and Control, Decision Support Systems, Intelligent Transportation Systems
Abstract: Maritime networks constitute a critical pillar of the global economic system and reflect intricate trade relationships between nations. Meanwhile, the application of complex network theory has increasingly highlighted the intrinsic link between node influence and dependencies among non-adjacent nodes. However, a central challenge lies in uncovering latent dependencies—mediated by intermediaries—between non-directly connected nations, which is crucial for the precise modeling of maritime networks and the analysis of indirect influence. In this study, we construct and optimize a national-level multi-order influence network by mining multi-order dependency rules through advanced network construction methods. Empirical analysis of real-world ship navigation trajectory data (August 2018 and August 2023) reveals increasingly interconnected global dependencies. The proposed method successfully quantifies multi-order influence, identifies strategic hubs, tracks the evolution of influence, and offers actionable insights into geopolitical shifts and strategies for trade resilience.
|
|
Tu-S1-T7 |
Room 0.31 |
Human-Centered Transportation |
Regular Papers - HMS |
Chair: Xue, Hengyu | International Innovation Institute of Beihang University |
Co-Chair: Wang, Yecan | Honda R&D Co., Ltd |
|
08:30-08:45, Paper Tu-S1-T7.1 | |
Optimizing Computational Efficiency of MPC-Based Motion Cueing Algorithm for Vehicle Driving Simulators |
|
Xue, Hengyu | International Innovation Institute of Beihang University |
Sun, Jiaxun | Swiss Federal Institute of Technology Zurich |
Tang, Kexin | Beihang University |
Zhang, Junli | Beihang University |
Lin, Qingfeng | Hangzhou International Innovation Institute of Beihang Universit |
Keywords: Human-Centered Transportation, Design Methods, Human-Machine Interaction
Abstract: Vehicle driving simulators have been widely used in fields including road design, automotive development, and driver training. As a core component of the simulators, motion cueing algorithms (MCAs) aim to reproduce realistic vehicle motion sensations while respecting workspace limitations for a high-fidelity driving experience. Recent advancements have identified model predictive control (MPC) as a promising approach for MCA. However, conventional MPC-based MCA faces computational bottlenecks that limit real-time performance. Although recent studies have employed swarm intelligence optimization algorithms such as the genetic algorithm and the grey wolf optimizer to enhance the performance of MPC-based MCA through horizon adjustments, improvement from the computational cost perspective has been overlooked. Therefore, this paper proposes a cost-optimization framework for MPC-based MCA (COMPC-based MCA), which integrates two key components: an Operator Splitting Quadratic Program (OSQP) solver and a Snow Ablation Optimizer (SAO) that considers computational costs and sensation errors for parameter selection. Experimental results validate the advantages of the proposed framework. COMPC-based MCA can achieve fast quadratic programming (QP) solving and select appropriate parameters for performance improvement while considering computational costs.
|
|
08:45-09:00, Paper Tu-S1-T7.2 | |
Quantitative Evaluation of Interactive Walking Behavior in Multiple-Pedestrian Environment |
|
Ito, Haruki | Nagoya University |
Okuda, Hiroyuki | Nagoya University |
Suzuki, Tatsuya | Nagoya University |
Keywords: Human-Centered Transportation, Human Factors, Human-Collaborative Robotics
Abstract: In this paper, the group walking behavior of four pedestrians is first observed. In the observation, not only the motion data but also the decision making of each pedestrian is collected using a special device. Then, three behavioral indicators (deceleration, detour amount, and decision entropy) are defined and calculated. These three indicators are found to successfully quantify the 'smoothness' of the group walking behavior. Finally, principal component analysis (PCA) is applied to the three-dimensional indicator data. As a result, the meaning of the three principal components is clearly explained. The discussion based on the PCA will be a basis for further analysis and classification of group walking behavior.
|
|
09:00-09:15, Paper Tu-S1-T7.3 | |
Real-Time Driving Risk Assessment Using LSTM-Attention and XGBoost with Physiological and Driving Behavior Data |
|
Wang, Yecan | Honda R&D Co., Ltd |
Guagnano, Michele | Politecnico Di Torino |
Taniguchi, Hiroki | Honda Motor R&D Co., Ltd |
Nagatani, Nozomi | Honda Motor R&D Co., Ltd |
Ono, Hiroshi | Honda Motor R&D Co., Ltd |
Shinkawa, Satoru | Honda Motor R&D Co., Ltd |
Miyamoto, Sumie | The University of Tokyo |
Hasumi, Eriko | The University of Tokyo |
Fujiu, Katsuhito | The University of Tokyo |
Takai, Madoka | The University of Tokyo |
Violante, Massimo | Politecnico Di Torino |
Mitsuzawa, Shigenobu | Honda R&D Co., Ltd |
Keywords: Human-Centered Transportation, Human Factors, Human-Computer Interaction
Abstract: Early identification of driving risks is crucial for driving safety. In this study, we propose a deep learning method based on Long Short-Term Memory (LSTM) with an attention mechanism and XGBoost to recognize risk levels during driving simulation for early warnings. We collected multimodal data (driving data, eye status, and basic physiological data) from driving simulators, eye trackers, and smartwatches for comprehensive analysis. By combining driving behavior and physiological features in each time window, LSTM networks were applied to analyze temporal features and predict the risk level (Level 0, Level 1, or Level 2) in the next 20 seconds. An attention mechanism was also introduced to understand the importance of multiple features. The output of the LSTM-attention model was then further refined using XGBoost, which demonstrated excellent performance in classification tasks. The integrated model achieved an overall accuracy above 89.97%, with an F1-score of 89.15%.
|
|
09:15-09:30, Paper Tu-S1-T7.4 | |
Modeling and Evaluation of Bike-Sharing Systems for Planning Sustainable Urban Mobility |
|
Dantas, Renata | Federal Institute of Pernambuco - IFPE |
Dagba, Akin | Federal Institute of Pernambuco - IFPE |
Rameh, Ioná | Federal Institute of Pernambuco - IFPE |
Dantas, Jamilson | UFPE |
Maciel, Paulo | UFPE |
Keywords: Human-Centered Transportation, Human-Computer Interaction, Information Systems for Design
Abstract: As urban centers face growing challenges from population growth, traffic congestion, and environmental degradation, sustainable mobility solutions have become essential. This paper presents a performance evaluation of a public bicycle-sharing system using Stochastic Petri Nets (SPNs) to model and simulate user behavior. Grounded in recent research on urban mobility and shared transportation, the proposed stochastic framework captures key operational parameters, including bicycles in transit, bicycles at stations, waiting probability, system utilization, and the availability of parking docks. A case study evaluates three high-demand stations under varying demand scenarios (baseline, −25%, and +25%). Results indicate that, while the system operates adequately under current demand, it is prone to saturation under increased usage and exhibits inefficiencies when demand decreases. The proposed model provides valuable insights to support operational improvements, infrastructure planning, and public policy formulation for integrated and sustainable urban mobility.
|
|
Tu-S1-T8 |
Room 0.32 |
Evolutionary Computation 1 |
Regular Papers - Cybernetics |
Chair: Ishibuchi, Hisao | Southern University of Science and Technology |
Co-Chair: Wei, Feng-Feng | South China University of Technology |
|
08:30-08:45, Paper Tu-S1-T8.1 | |
Mutation Probability Specification in Large-Scale Evolutionary Multi-Objective Optimization Algorithms |
|
Nan, Yang | Southern University of Science and Technology |
Ishibuchi, Hisao | Southern University of Science and Technology |
Shu, Tianye | Southern University of Science and Technology |
Chen, Longcan | Southern University of Science and Technology |
Keywords: Evolutionary Computation, Computational Intelligence
Abstract: In the community of evolutionary multi-objective optimization (EMO), large-scale multi-objective optimization problems (LSMOPs) with many decision variables have attracted much attention. The main difficulty of LSMOPs lies in their high-dimensional decision space, which slows down the convergence of EMO algorithms towards the Pareto front. To address this issue, many novel variation operators have been proposed to improve the efficiency of EMO algorithms. However, for both conventional EMO algorithms (e.g., NSGA-II) and recently proposed EMO algorithms (e.g., LERD), the polynomial mutation with the mutation probability 1/n, where n is the number of decision variables, is always used. For LSMOPs with a large number of decision variables, the mutation probability 1/n looks too small (e.g., 1/1000). In this paper, we examine different mutation probabilities and find that many existing EMO algorithms with a larger mutation probability (e.g., 10/n) are significantly better than the standard setting (i.e., 1/n) in handling LSMOPs.
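To make the effect of the mutation probability concrete, the following is a minimal sketch of the basic (unbounded-form) polynomial mutation operator with a configurable per-variable probability. The function names, the distribution index `eta=20`, and the clipping to bounds are illustrative assumptions rather than the settings used in the paper.

```python
import random

def polynomial_mutation(x, lower, upper, prob, eta=20.0, rng=random):
    """Basic polynomial mutation, applied independently to each variable.

    prob is the per-variable mutation probability, e.g. 1/n or 10/n.
    eta is the distribution index (assumed value; larger means smaller steps).
    """
    child = list(x)
    for i in range(len(child)):
        if rng.random() >= prob:
            continue  # variable i is left unchanged
        u = rng.random()
        if u < 0.5:
            delta = (2.0 * u) ** (1.0 / (eta + 1.0)) - 1.0
        else:
            delta = 1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta + 1.0))
        value = child[i] + delta * (upper[i] - lower[i])
        child[i] = min(max(value, lower[i]), upper[i])  # clip to bounds
    return child

def count_changed(n, prob, seed=0):
    """Count how many of n variables differ after one mutation call."""
    rng = random.Random(seed)
    lower, upper = [0.0] * n, [1.0] * n
    parent = [0.5] * n
    child = polynomial_mutation(parent, lower, upper, prob, rng=rng)
    return sum(1 for a, b in zip(parent, child) if a != b)
```

With n = 1000, a probability of 1/n perturbs about one variable per offspring on average, whereas 10/n perturbs about ten, which is the difference the abstract examines.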
|
|
08:45-09:00, Paper Tu-S1-T8.2 | |
How to Choose Solutions for Applying Momentum in Evolutionary Multi-Objective Optimization |
|
Chen, Longcan | Southern University of Science and Technology |
Pang, Lie Meng | Southern University of Science and Technology |
Zhang, Qingfu | City University of Hong Kong |
Ishibuchi, Hisao | Southern University of Science and Technology |
Keywords: Evolutionary Computation, Computational Intelligence, Heuristic Algorithms
Abstract: Momentum is a technique that adds the momentum moves from the earlier iterations into the current update to accelerate convergence. While the momentum technique has been widely used in single-objective optimization, its application in evolutionary multi-objective optimization (EMO) has not gained much attention. Since EMO algorithms are population-based algorithms, how to choose solutions for applying momentum becomes an important issue. Inspired by Polyak's momentum method and Nesterov's momentum method in single-objective optimization, we propose four different momentum methods for EMO. Our findings demonstrate that the performance of EMOAs with momentum is strongly affected by the choice of solutions to which momentum moves are applied.
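For readers unfamiliar with the two single-objective methods named above, here is a minimal sketch of Polyak's (heavy-ball) and Nesterov's momentum updates on a gradient function. It does not reproduce the four EMO variants proposed in the paper; the function names, step size, and momentum coefficient are illustrative assumptions.

```python
def polyak_step(x, v, grad, lr=0.1, beta=0.9):
    """Heavy-ball update: the gradient is evaluated at the current point."""
    g = grad(x)
    v_new = [beta * vi - lr * gi for vi, gi in zip(v, g)]
    x_new = [xi + vi for xi, vi in zip(x, v_new)]
    return x_new, v_new

def nesterov_step(x, v, grad, lr=0.1, beta=0.9):
    """Nesterov update: the gradient is evaluated at the look-ahead point."""
    lookahead = [xi + beta * vi for xi, vi in zip(x, v)]
    g = grad(lookahead)
    v_new = [beta * vi - lr * gi for vi, gi in zip(v, g)]
    x_new = [xi + vi for xi, vi in zip(x, v_new)]
    return x_new, v_new

def minimize(step, grad, x0, iterations=500):
    """Run a momentum method from x0 with zero initial velocity."""
    x, v = list(x0), [0.0] * len(x0)
    for _ in range(iterations):
        x, v = step(x, v, grad)
    return x

def quadratic_grad(x, curvature=(1.0, 10.0)):
    """Gradient of the toy objective f(x) = 0.5 * sum(c_i * x_i**2)."""
    return [c * xi for c, xi in zip(curvature, x)]
```

In both updates the velocity `v` carries the previous move into the current one; in a population-based EMO algorithm, the open question the abstract raises is which solutions should receive such a move.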
|
|
09:00-09:15, Paper Tu-S1-T8.3 | |
Crowdsourcing Knowledge Integration Evolutionary Transfer Optimization for Feature Selection |
|
Liang, Shurui | South China University of Technology |
Wei, Feng-Feng | South China University of Technology |
Chen, Wei-Neng | South China University of Technology |
Keywords: Evolutionary Computation, Computational Intelligence, Swarm Intelligence
Abstract: Crowdsourcing has become a powerful tool for data collection and problem-solving, harnessing collective intelligence to address complex tasks. Building knowledge bases through such intelligence is gaining attention as a way to enhance cross-domain model performance. As an essential technique for dimensionality reduction, the task of feature selection (FS) also emerges in the context of crowdsourcing, resulting in crowdsourcing feature selection (CFS). Due to the diversity of workers and the varying quality of data in a crowdsourcing environment, CFS faces challenges such as heterogeneous data reliability and dynamic participation. To tackle these issues, we present Crowd-Fed Evolutionary Transfer Optimization (CFETO), an efficient and crowdsourcing-enhanced framework for feature selection. CFETO comprises a central server and multiple distributed workers: workers perform local data collection and optimization, extracting knowledge from their datasets, while the server integrates this knowledge to build a knowledge base. An evolutionary transfer learning strategy is further employed to harness this knowledge, thereby improving both convergence speed and selection robustness. Experiments on 20 real-world datasets show that CFETO outperforms traditional centralized FS methods, underlining its potential for broader application in complex, distributed crowdsourcing scenarios.
|
|
09:15-09:30, Paper Tu-S1-T8.4 | |
A Tree-Based Broad Learning System-Assisted Evolutionary Algorithm with Incremental Learning for Expensive Optimization |
|
Dong, Fangchen | South China University of Technology |
Wei, Feng-Feng | South China University of Technology |
Geng, Mingcan | South China University of Technology |
Chen, Wei-Neng | South China University of Technology |
Keywords: Evolutionary Computation, Computational Intelligence, Swarm Intelligence
Abstract: The surrogate model-assisted evolutionary algorithm (SAEA) has become a general method for expensive optimization problems (EOPs) with high-cost evaluations. However, most existing SAEAs retrain surrogate models frequently during evolutionary iterations. Besides, the quality of sample data directly affects model accuracy, leading to difficulties in finding the optimal solution. To address these challenges, this paper introduces the tree-based broad learning system (TBLS) as the surrogate model into the SAEA framework and proposes a TBLS-assisted optimizer with incremental learning (TBLSO-IF). First, an incrementable TBLS is constructed for predicting the quality of the solution, and the model is incrementally updated by expanding the layer nodes when the samples increase. This avoids the high computational cost of retraining models from scratch and significantly improves the update efficiency of surrogate models. Second, an adaptive sample augmentation (ASA) strategy is designed to generate more samples for updating the TBLS using a variational autoencoder (VAE). Experiments on 15 problems in the CEC2017 benchmark functions show that the proposed TBLSO-IF algorithm is more effective and competitive than five other state-of-the-art SAEA methods.
|
|
09:30-09:45, Paper Tu-S1-T8.5 | |
An Evolutionary Multi-Objective Optimization Algorithm with Variable Consistency Strategy |
|
Guan, Weifeng | Guangdong University of Technology |
Gu, Fangqing | Guangdong University of Technology |
Chen, Xingyuan | Guangdong University of Technology |
Wang, Hailong | Guangdong University of Technology, Guangdong |
Liu, Hai-lin | Guangdong University of Technology |
Keywords: Evolutionary Computation, Computational Intelligence, Swarm Intelligence
Abstract: Evolutionary multi-objective optimization (EMO) algorithms generate diverse Pareto optimal solutions to comprehensively approximate the Pareto front. However, an excessive number of diverse solutions poses significant challenges for decision makers who must select a single or few solutions for implementation. To address this decision-making burden, we propose a novel approach that promotes consistency among decision variables while accepting controlled performance trade-offs. Specifically, we develop VCM2M, an EMO algorithm that integrates a variable consistency strategy into the MOEA/D framework. It decomposes a multi-objective optimization problem (MOP) into a number of subproblems. In each subproblem, we incorporate the logarithmic sum regularization term of the decision variables into objective functions, achieving consistency during both the optimization and selection phases. An epsilon-insensitive loss function is utilized to normalize the loss of population diversity in the objective space. Experimental evaluation on specially constructed test problems demonstrates that VCM2M successfully generates solution sets organized into distinct clusters, where solutions within each cluster exhibit significantly higher variable consistency compared to those produced by MOEA/D-M2M, AR-MOEA, and NSGA-III.
|
|
09:45-10:00, Paper Tu-S1-T8.6 | |
Approximating Hypervolume Contributions Using Grammatical Evolution |
|
Bernabé Rodríguez, Amín V. | Basque Center for Applied Mathematics |
Coello Coello, Carlos Artemio | CINVESTAV-IPN |
Keywords: Evolutionary Computation, Heuristic Algorithms, Optimization and Self-Organization Approaches
Abstract: The hypervolume (HV) indicator is widely used in multi-objective evolutionary algorithms (MOEAs) due to its Pareto compliance property. This property makes it very effective for assessing the quality of approximation sets from different MOEAs and for ranking solutions among a population of solutions using the individual hypervolume contribution (HVC). However, the computational cost of computing the HV increases exponentially with the number of objectives. Furthermore, this increase in computational cost is worse when adopting the HVC as a density estimator, making it prohibitive in many-objective optimization problems (MaOPs). In this work, we propose a novel approach to create HVC approximation functions using Grammatical Evolution (GE). We describe the grammar and fitness functions designed to identify the worst-contributing individual given a population of non-dominated solutions. Then, we use a GE implementation with training data generated from the DTLZ and WFG benchmark problems. The resulting approximation functions, tailored for dimensionalities ranging from 2 to 10, are evaluated against two state-of-the-art methods: HVC-Net and the R2-based HVC approximation. Experimental results on validation data also derived from benchmark problems show that our GE-generated functions consistently outperform both alternative approaches regarding worst-contributing individual identification for dimensions greater than two while maintaining competitive execution times. These results indicate that GE is a viable and effective tool for generating high-quality HVC approximations, particularly suitable for solving MaOPs.
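As background for the quantity being approximated, the sketch below computes exact hypervolume contributions in the two-objective case, where a simple sort suffices, and picks the worst-contributing individual. It assumes minimization, a mutually non-dominated input set, and a reference point dominated by every solution; the function names are illustrative and unrelated to the GE-generated functions in the paper.

```python
def hv_contributions_2d(points, ref):
    """Exact hypervolume contribution of each point (2 objectives, minimization).

    points: list of (f1, f2) tuples that are mutually non-dominated.
    ref:    reference point (r1, r2) that every point dominates.
    """
    # Sorted by f1 ascending, a non-dominated set has f2 descending.
    order = sorted(range(len(points)), key=lambda i: points[i][0])
    contrib = [0.0] * len(points)
    for rank, idx in enumerate(order):
        f1, f2 = points[idx]
        # Right neighbour bounds the exclusive region along f1.
        right_f1 = points[order[rank + 1]][0] if rank + 1 < len(order) else ref[0]
        # Left neighbour bounds the exclusive region along f2.
        upper_f2 = points[order[rank - 1]][1] if rank > 0 else ref[1]
        contrib[idx] = (right_f1 - f1) * (upper_f2 - f2)
    return contrib

def worst_contributor(points, ref):
    """Index of the point whose removal loses the least hypervolume."""
    contrib = hv_contributions_2d(points, ref)
    return min(range(len(points)), key=contrib.__getitem__)
```

In two dimensions each contribution is a single rectangle, so the cost is dominated by the sort; the exponential growth mentioned in the abstract arises as the number of objectives increases, which is what motivates approximation.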
|
|
Tu-S1-T9 |
Room 0.51 |
Cyber-Physical Systems |
Regular Papers - SSE |
Chair: David, Beserra | EPITA |
Co-Chair: Zhao, Huarong | Jiangnan University |
|
08:30-08:45, Paper Tu-S1-T9.1 | |
ACIL-SED: An Acoustic Clustering and Imbalance Learning Method for Sound Event Detection |
|
Chen, Zhichao | South China Agricultural University |
Zhong, Cankun | South China Agricultural University |
Liang, Yun | South China Agricultural University |
Luo, Tang | South China Agricultural University |
Zhang, Yihang | South China Agricultural University |
Ng, Wing Yin | South China University of Technology |
Keywords: Cyber-physical systems
Abstract: Sound Event Detection (SED) aims to identify and locate specific sound events within audio streams. Despite recent advancements, current methodologies exhibit two fundamental limitations: (1) Insufficient modeling of discriminative temporal-spectral characteristics induces compromised differentiation for acoustically similar events. (2) Failing to address both inter-class and intra-class imbalance problems induces biased classifications toward dominant classes and inactive frames. To tackle these challenges, we propose Acoustic Clustering and Imbalance Learning-based Sound Event Detection (ACIL-SED), which consists of Cluster-Specialized Convolutional Recurrent Neural Network (CS-CRNN) and Dual-Objective Adaptive Balance Loss (DOABLoss). The CS-CRNN employs acoustic clustering-guided specialized sub-models, where each cluster's sub-model focuses on discriminative feature learning in specific temporal-spectral characteristics, thereby resolving the compromised feature representation inherent in a shared single-model architecture. The DOABLoss adaptively combines mean squared error (MSE) and weighted binary cross-entropy (WBCE) to simultaneously address the issues of inter-class duration imbalance and intra-class active-inactive frame imbalance. Experimental results show that ACIL-SED outperforms state-of-the-art methods in complex acoustic environments.
|
|
08:45-09:00, Paper Tu-S1-T9.2 | |
Data-Driven Event-Triggered Sliding-Mode Control for Wind Turbine with Prescribed Performance and Quantized Information |
|
Zhao, Huarong | Jiangnan University |
Shan, Jinjun | York University |
Yan, Wentao | Jiangnan University |
Yu, Hongnian | Built Environment, Edinburgh Napier University |
Keywords: Cyber-physical systems, Control of Uncertain Systems, Adaptive Systems
Abstract: This paper studies a data-driven event-triggered sliding-mode control problem for wind turbines with prescribed performance and quantized information to maximize power generation efficiency. First, a partial-form dynamic linearization model is established for the controlled wind turbine system. A logarithmic quantizer is used to quantize data before transmission. Then, a data-driven event-triggered sliding-mode control approach is developed, where a smooth function is introduced to constrain the control error to converge within a prescribed range, and an event-triggered scheme is designed to reduce the communication frequency of the controlled plant. Finally, the convergence of the proposed method is rigorously proven, and its effectiveness is demonstrated through simulation studies.
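The abstract does not state the form of its logarithmic quantizer. As a rough illustration only, the sketch below assumes the common sector-bounded logarithmic quantizer with levels ±u0·ρ^i; the function name `log_quantize` and the parameters `rho` and `u0` are hypothetical and not taken from the paper.

```python
import math


def log_quantize(v: float, rho: float = 0.5, u0: float = 1.0) -> float:
    """Sector-bounded logarithmic quantizer (illustrative assumption).

    Levels are 0 and +/- u0 * rho**i for integer i, with 0 < rho < 1.
    For v > 0 the output is the level u_i satisfying
    u_i / (1 + delta) < v <= u_i / (1 - delta),
    where delta = (1 - rho) / (1 + rho).
    """
    if not (0.0 < rho < 1.0) or u0 <= 0.0:
        raise ValueError("need 0 < rho < 1 and u0 > 0")
    if v == 0.0:
        return 0.0
    if v < 0.0:
        return -log_quantize(-v, rho, u0)  # odd symmetry
    delta = (1.0 - rho) / (1.0 + rho)
    # Largest integer i with u0 * rho**i >= v * (1 - delta).
    i = math.floor(math.log(v * (1.0 - delta) / u0) / math.log(rho))
    return u0 * rho ** i
```

With this construction the quantization error stays within the sector bound |q(v) - v| ≤ δ|v| (up to floating-point rounding at interval boundaries), which is the property such quantizers are usually chosen for in control analyses.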
|
|
09:00-09:15, Paper Tu-S1-T9.3 | |
Long Short-Term Memory Network-Based H∞ Synchronization Control and Anomaly Detection for Cyber-Physical Systems |
|
Kwon, Hyoeun | Daegu Gyeongbuk Institute of Science & Technology, Korea Institu |
Lee, Suwoong | Korea Institute of Industrial Technology |
Kwon, Wookyong | ETRI |
Lim, Yongseob | Daegu Gyeongbuk Institute of Science and Technology (DGIST) |
Jin, Yongsik | Daegu Gyeongbuk Institute of Science and Technology (DGIST) |
Keywords: Cyber-physical systems, Digital Twin, System Modeling and Control
Abstract: In the synchronization of cyber-physical systems (CPSs), modeling the nonlinear dynamics of physical plants is a challenging task. To address this challenge, we propose a novel H∞ controller design method that leverages a data-driven approach to robustly synchronize CPSs and ensure their stability. In the proposed approach, the input-output relationship of the physical system is learned using long short-term memory (LSTM) networks to approximate the unknown dynamics of CPSs. Furthermore, we exploit a control scheme for trained LSTM networks that effectively handles the nonlinearity of activation functions. To ensure stability and performance in the convergence of synchronization error, a controller design criterion is derived for the trained LSTM network in terms of linear matrix inequalities, and the controller gain is computed using convex optimization techniques. In addition, we present an anomaly detection algorithm using the proposed method, which can synchronize CPSs and detect abnormal signals without requiring any prior physical model information. Consequently, the stability of the synchronization control system can be ensured, enabling its application to anomaly detection. Finally, the effectiveness of the proposed method is validated through an experiment on a motor control system, even under abnormal operating conditions.
|
|
09:15-09:30, Paper Tu-S1-T9.4 | |
Indoor Localization in a Collaborative BLE-Based System through Dynamic Path Loss Estimation Via Location Aware Autonomous Mobile Robot |
|
Moradbeikie, Azin | CITIN |
Azevedo, Rolando | CITIN |
Jesus, Cristiano | CITIN |
David, Beserra | EPITA |
Ivan Lopes, Sergio | CITIN |
Keywords: Cyber-physical systems, Robotic Systems, Smart Buildings, Smart Cities and Infrastructures
Abstract: Providing Artificial Intelligence as a Service (AIaaS) for the next generation of industry, known as Industry 5.0, requires an accurate indoor localization system. Providing indoor localization is challenging, particularly in dynamic industrial environments where operational efficiency relies on precise location information. This paper presents a Collaborative Indoor Positioning System (CIPS) facilitated by a location-aware autonomous mobile robot equipped with Bluetooth Low Energy (BLE) beacons to enhance indoor localization accuracy through dynamic path loss parameter estimation. The robot traverses the facility while periodically broadcasting its ID and real-time location to a central server. By correlating the robot’s ground-truth positions with Received Signal Strength Indicator (RSSI) measurements from fixed BLE receivers, the server continuously updates path loss parameters using a log-normal shadowing model and linear regression. The proposed system is evaluated on a real testbed with a deployed application system in an office environment. Experimental results demonstrate a significant improvement in distance estimation accuracy, ranging between 19% and 32%, and a 20% reduction in location estimation error compared to conventional methods, highlighting the potential of the proposed CIPS approach for enhancing indoor localization in industrial settings.
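The path loss update described in the abstract can be pictured with a small sketch. It assumes the usual log-distance form RSSI(d) = RSSI(1 m) - 10·n·log10(d) and an ordinary least-squares fit; the function names, the 1 m reference distance, and the batch (rather than continuous) fit are illustrative assumptions, not details from the paper.

```python
import math


def fit_path_loss(distances_m, rssi_dbm):
    """Least-squares fit of RSSI = rssi_at_1m - 10 * n * log10(d).

    distances_m: robot-to-receiver distances (ground truth), in metres.
    rssi_dbm:    RSSI readings taken at those distances, in dBm.
    Returns (rssi_at_1m, path_loss_exponent).
    """
    if len(distances_m) != len(rssi_dbm) or len(distances_m) < 2:
        raise ValueError("need at least two paired samples")
    xs = [math.log10(d) for d in distances_m]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(rssi_dbm) / len(rssi_dbm)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        raise ValueError("distances must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, rssi_dbm))
    slope = sxy / sxx                   # equals -10 * n
    intercept = mean_y - slope * mean_x  # RSSI at d = 1 m
    return intercept, -slope / 10.0


def estimate_distance(rssi, rssi_at_1m, exponent):
    """Invert the fitted model to turn an RSSI reading into a distance."""
    return 10.0 ** ((rssi_at_1m - rssi) / (10.0 * exponent))
```

Refitting on a sliding window of recent robot samples would be one way to keep the two parameters current as the environment changes; the window policy is a design choice left open here.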
|
|
09:30-09:45, Paper Tu-S1-T9.5 | |
Cyber-Attacks Detection in Timed Probabilistic DESs Via Artificial Neural Networks (I) |
|
Amri, Omar | Université Le Havre Normandie |
Seatzu, Carla | Univ. of Cagliari |
Giua, Alessandro | University of Cagliari |
Lefebvre, Dimitri | University Le Havre Normandie |
Keywords: Cyber-physical systems, Discrete Event Systems
Abstract: In this paper, the problem of cyber-attack detection in timed probabilistic discrete event systems via artificial neural networks is investigated. We extend the problem of state estimation for timed probabilistic discrete event systems using artificial neural networks to address attack detection. To this end, a detection strategy is implemented to determine whether the system is operating in normal mode or under attack, and to identify the potential type of attack. Two primary cases are examined: (i) attack detection over observations, in which the attack detector recognizes whether the system is in a normal mode or an attack mode after each new observation; and (ii) attack detection over time, in which the attack detector evaluates the system’s status at each clock time increment.
|
|
Tu-S1-T10 |
Room 0.90 |
Situation Awareness, Decision Making and Cognitive Situation Management &
Intelligent Industrial Environments and Cyber-Physical Industrial
Systems |
Special Sessions: Cyber |
Chair: Linares-Barranco, Alejandro | University of Seville |
Co-Chair: Pröstl Andrén, Filip | AIT Austrian Institute of Technology |
Organizer: Salfinger, Andrea | University of Udine |
Organizer: Snidaro, Lauro | University of Udine |
Organizer: Ruppert, Tamás | University of Pannonia |
Organizer: Pál, Darányi | University of Pannonia |
|
08:30-08:45, Paper Tu-S1-T10.1 | |
Driver Assistant: Persuading Drivers to Adjust Secondary Tasks Using Large Language Models (I) |
|
Xiang, Wei | Zhejiang University |
Li, Muchen | Zhejiang University&China Unicom Data Intelligence Co., LTD |
Yan, Jie | Zhejiang University |
Zheng, Manling | Zhejiang University |
Hanfei, Zhu | Zhejiang University |
Jiang, Mengyun | Zhejiang University |
Sun, Lingyun | Zhejiang University |
Keywords: AI and Applications, Application of Artificial Intelligence
Abstract: Level 3 automated driving systems allow drivers to engage in secondary tasks while diminishing their perception of risk. In the event of an emergency necessitating driver intervention, the system alerts the driver, leaving a limited window for reaction and imposing a substantial cognitive burden. To address this challenge, this study employs a Large Language Model (LLM) to assist drivers in maintaining appropriate attention to road conditions through “humanized” persuasive advice. Our tool leverages the road conditions encountered by Level 3 systems as triggers, proactively steering driver behavior via both visual and auditory routes. An empirical study indicates that our tool is effective in sustaining driver attention with reduced cognitive load and in coordinating secondary tasks with takeover behavior. Our work provides insights into the potential of using LLMs to support drivers during multi-task automated driving.
|
|
08:45-09:00, Paper Tu-S1-T10.2 | |
Boosting Exploration and Risk-Taking under Cognitive Load: The Interactive Role of tDCS and Curiosity (I) |
|
Singh, Ankit | Indian Institute of Technology (IIT) Mandi |
Govindaraji, Ramajayam | Indian Institute of Technology (IIT) Mandi |
Dutt, Varun | Indian Institute of Technology Mandi |
Keywords: Human Factors, Augmented Cognition
Abstract: Successful decision-making in uncertain contexts depends on balancing comfortable, tried-and-tested options with curious exploration. Yet, high cognitive load tends to disrupt this equilibrium, lowering risk-taking and exploratory responses. Although neuromodulation through transcranial direct current stimulation (tDCS) has been shown to boost cognitive flexibility, its combination with cognitive load in influencing curiosity is not well understood. This study investigates how anodal tDCS applied to the dorsolateral prefrontal cortex affects risk-taking and exploration under different cognitive loads. Thirty participants were assigned to one of three conditions: (1) tDCS with trivia-based cognitive load, (2) trivia without tDCS, and (3) no load and no tDCS (control). Participants performed a 50-trial decision-making task with three options per trial: a certain safe option, a risky high-reward option, and an exploratory option that provided probabilistic information. Behavioral measures were risk-taking, exploratory behavior, and cognitive flexibility in terms of strategy switching. Results showed that the tDCS with trivia condition had significantly more risk-taking and exploration compared to both control groups. Importantly, even without cognitive load, there was a facilitation of exploration. Cognitive flexibility was most pronounced for the tDCS group, which implies that neuromodulation can support adaptive behavior under load. These observations show that a decrease in cognitive load and neuromodulation, separately and interactively, improve decision making, curiosity, and risk acceptance. This implies real-world value in high-demand areas like healthcare, education, and aviation, where preserving exploration under pressure is paramount.
|
|
09:00-09:15, Paper Tu-S1-T10.3 | |
Explaining Autonomous Navigation to Human-In-The-Loop Operator in Multi-Task Rotorcraft Search & Rescue Operations (I) |
|
Nathaniel, Amadi | Cranfield Uni |
Sam, Cartwright | Cranfield Uni |
Claudel, Noe | Cranfield Uni |
Jamal, Mohammed | Cranfield Uni |
Kenechukwu, Agbo | Cranfield University |
Chatzithanos, Paraskevas | Cranfield University |
Wisniewski, Mariusz | Cranfield University |
Tsourdos, Antonios | Cranfield University |
Xing, Yang | Cranfield University |
Guo, Weisi | Cranfield University |
Keywords: Information Assurance and Intelligence, Application of Artificial Intelligence, Deep Learning
Abstract: Aerial search and rescue (SAR) rotorcraft currently need multiple specialist human operators, increasing cost and the risk of downtime due to crew unavailability and mental stress. Autonomy can help fewer operators perform multiple tasks, but the human operator must maintain situation awareness (SA) of crucial autonomous decisions. A key challenge is the cognitive stress on a multi-tasking human-in-the-loop (HITL) operator due to the AI agent making decisions without human understanding. Explainable AI (XAI) has often been proposed as a way to explain autonomy decisions, but current XAI solutions do not adapt to real-time human factors in high-stress, high-stakes situations. Here, we allow an AI agent to perform autonomous rotorcraft navigation, whilst the HITL operator has to perform two simultaneous tasks: (i) search for a target on the ground by toggling an onboard camera, and (ii) maintain SA of the autonomous navigation task through our novel XAI interface. Our novel XAI approach leverages dimensionality reduction techniques to visualize the reinforcement learning (RL) navigation's internal states, highlighting patterns in its decision-making process through intuitive interactive clustering on saliency maps. To ensure convergence on performance, we design a two-way interface that allows the human to interpret AI decisions and then give feedback via a Large Language Model to modify the autonomous navigation. Testing demonstrates increased task performance (+43%), with substantial reductions in the operator's physical demand (-53%), time pressure (-30%), effort (-23%), and frustration (-26%), but at the cost of slightly increased mental demand (+12%).
|
|
09:15-09:30, Paper Tu-S1-T10.4 | |
Virtual Reality Simulation of Landslide Risk: Investigating Behavioral and Neurophysiological Responses to Warning Systems (I) |
|
Mehra, Arjun | Applied Cognitive Science Lab and Centre for Human-Computer Inte |
Kumar, Ajoy | Indian Institute of Technology Mandi |
Devi, Arti | Applied Cognitive Science Lab, Indian Institute of Technology, M |
Uday, Kala Venkata | Geotechnical Engineering Lab, IIT Mandi |
Dutt, Varun | Indian Institute of Technology Mandi |
Keywords: Computational Intelligence, Machine Vision, Computational Intelligence in Information
Abstract: Landslide early warning systems (EWS) are critical to disaster preparedness but are frequently limited by prediction uncertainty and variability in user trust. This research presents a new dual-modality method combining virtual reality (VR) simulation and electroencephalography (EEG) to evaluate behavioral and neurophysiological reactions to probabilistic landslide warnings. Eighty drivers experienced a VR driving scenario with different warning accuracies (70% vs. 95%) and lighting conditions (day vs. night); behavioral measures (e.g., collisions, speed, trajectory deviation) and EEG-based cognitive measures (e.g., alpha/theta, alpha/gamma, beta/gamma ratios) were collected as dependent measures. Results revealed that decreased warning accuracy caused elevated collision rates, route deviations, and beta/gamma EEG activity, representing higher cognitive stress. Higher ratios of alpha/theta and alpha/gamma were related to driving performance and were more prominent under higher accuracy and daylight. These results stress the promise that neuroadaptive VR systems hold to improve disaster training by dynamically calibrating feedback according to the cognitive states of users, thereby providing useful insights into intelligent, human-orientated EWS technology design within the fields of systems, man, and cybernetics.
|
|
09:30-09:45, Paper Tu-S1-T10.5 | |
Toward Chiller Condenser Fouling Factor Prediction Using Spiking EdgeAI (I) |
|
Linares-Barranco, Alejandro | University of Seville |
Ávila-Gutiérrez, Miguel | University of Seville |
Pérez-Peña, Antonio Manuel | University of Seville |
Montes-Sánchez, Juan Manuel | University of Seville |
Salmerón-Lissen, José Manuel | University of Seville |
Keywords: AIoT, Neural Networks and their Applications, Application of Artificial Intelligence
Abstract: In this study, a recurrent spiking neural network (RSNN) has been implemented on an accelerator deployed on a reconfigurable circuit (FPGA) to predict the Condenser Fouling Factor (CFF) in a chiller as a predictive maintenance (PdM) measure. This system operates through a low-power device utilising edge computing, geared towards AIoT/EdgeAI applications. It can accurately identify CFF levels that exceed 25% based on data from several pressure and temperature sensors distributed in the chiller, achieving an accuracy greater than 90% with an architecture of 256 parallel and recurrent neurons in the hidden layer. The model was trained using a proprietary dataset that recorded sensor states during controlled experiments where the condenser was manually obstructed at varying coverage percentages. The primary advantage of employing RSNN techniques lies in their dual capability: first, they are designed to detect temporal signal patterns, and second, their trainability allows for adaptation to various applications across different contexts. The training for the particular accelerator used in this work is done on the FPGA and does not require power-hungry machines.
|
|
09:45-10:00, Paper Tu-S1-T10.6 | |
PowerTeams - a Novel Approach for Collaboration in Engineering Power System Applications (I) |
|
Brandauer, Christof | Salzburg Research Forschungsg.m.b.H |
Wohnig, Jonas | Salzburg Research Forschungsg.m.b.H |
Pröstl Andrén, Filip | AIT Austrian Institute of Technology |
Vettoretti, Denis | AIT Austrian Institute of Technology |
Strasser, Thomas | AIT Austrian Institute of Technology GmbH |
Veichtlbauer, Armin | University of Applied Sciences Upper Austria |
Steinmaurer, Gerald | University of Applied Sciences Upper Austria |
Resch, Jürgen | Ing. Punzenberger COPA-DATA GmbH |
Keywords: Cloud, IoT, and Robotics Integration, Knowledge Acquisition, Intelligent Internet Systems
Abstract: The transformation of the energy sector into a cyber-physical system of systems, driven by the integration of renewable energy sources and digitalization, demands new engineering and validation methodologies. Traditional monolithic approaches lack the flexibility, interoperability, and collaborative capabilities required for modern power system automation applications. This work introduces the PowerTeams approach, a cloud-native, service-oriented platform designed to support collaborative engineering across the entire lifecycle of power system automation applications. The platform integrates both collaboration and engineering services, supports dynamic service registration, and employs a microservice architecture with container-based deployment to ensure scalability and flexibility. A validation scenario involving the development and deployment of a power plant controller for a photovoltaic system demonstrates the platform’s capabilities. Initial evaluations with external users confirm the feasibility and benefits of the approach, highlighting its potential for real-world power and energy system projects.
|
|
Tu-S1-T11 |
Room 0.94 |
Digital Twins in Current and Future Smart Systems: Opportunities and
Innovations |
Special Sessions: SSE |
Chair: Nardone, Roberto | University of Naples Parthenope |
Co-Chair: Picone, Marco | University of Modena and Reggio Emilia |
Organizer: Nardone, Roberto | University of Naples Parthenope |
Organizer: Coppolino, Luigi | University of Naples Parthenope |
|
08:30-08:45, Paper Tu-S1-T11.1 | |
Building a Digital Twin Image of a Layered Architecture for Multi-Plant Manufacturing (I) |
|
Iannaccone, Antonio | University of Naples "Parthenope" |
Adinolfi, Francesco | Innovaway S.p.A |
Chianetta, Dario | Innovaway S.p.A |
Romano, Luigi | University of Naples Parthenope |
Keywords: Cyber-physical systems, Consumer and Industrial Applications, Decision Support Systems
Abstract: The increasing complexity of multi-plant manufacturing systems demands architectural models able to support coordinated, scalable, and secure industrial operations. This paper introduces a five-layer architecture for distributed smart manufacturing environments to enhance interoperability, standardization, and cyber-physical integration across multiple facilities. The architecture leverages Digital Twin technology to synchronize physical operations with digital models, enabling coordinated control, monitoring, and decision support. Its layered structure supports modular deployment across plant-level and enterprise systems, while ensuring compatibility with heterogeneous technologies and legacy infrastructures. Specific emphasis is placed on cross-site data aggregation and external information sharing in compliance with data space principles. A prototype implementation validates the applicability of the architecture and its ability to support scalable and secure industrial operations, laying the groundwork for broader adoption in interconnected manufacturing ecosystems.
|
|
08:45-09:00, Paper Tu-S1-T11.2 | |
Digital Twins & Zero-Conf AI: Structuring Automated Intelligent Pipelines for Industrial Applications (I) |
|
Picone, Marco | University of Modena and Reggio Emilia |
Turazza, Fabio | Università Di Modena E Reggio Emilia (DISMI) |
Martinelli, Matteo | Università Degli Studi Di Modena E Reggio Emilia (DISMI) |
Mamei, Marco | Università Degli Studi Di Modena E Reggio Emilia (DISMI) |
Keywords: Digital Twin, Cyber-physical systems, Consumer and Industrial Applications
Abstract: The increasing complexity of Cyber-Physical Systems (CPS), particularly in the industrial domain, has amplified the challenges associated with the effective integration of Artificial Intelligence (AI) and Machine Learning (ML) techniques. Fragmentation across IoT and IIoT technologies, manifested through diverse communication protocols, data formats and device capabilities, creates a substantial gap between low-level physical layers and high-level intelligent functionalities. Recently, Digital Twin (DT) technology has emerged as a promising solution, offering structured, interoperable and semantically rich digital representations of physical assets. Current approaches are often siloed and tightly coupled, limiting scalability and reuse of AI functionalities. This work proposes a modular and interoperable solution that enables seamless AI pipeline integration into CPS by minimizing configuration and decoupling the roles of DTs and AI components. We introduce the concept of Zero Configuration (ZeroConf) AI pipelines, where DTs orchestrate data management and intelligent augmentation. The approach is demonstrated in a MicroFactory scenario, showing support for concurrent ML models and dynamic data processing, effectively accelerating the deployment of intelligent services in complex industrial settings.
|
|
09:00-09:15, Paper Tu-S1-T11.3 | |
Memory Optimization for Convex Hull Support Point Queries |
|
Greer, Michael | Kinematic Labs |
Keywords: Digital Twin, Cyber-physical systems, Robotic Systems
Abstract: Support point queries are a critical part of many collision detection pipelines, including those for robotics and real-time graphical applications. This paper proposes several memory layout optimizations to speed up support point queries on convex hulls. These methods are implemented and tested on a variety of different hardware models, yielding up to a fivefold reduction in processing time compared to current approaches. The results in this paper can be integrated with existing physics libraries with minimal effort.
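For readers unfamiliar with the operation, a support point query returns the hull vertex that lies farthest along a given direction, that is, the vertex with the largest dot product against that direction. The sketch below is a plain linear scan over coordinates stored as separate x, y, and z arrays; this structure-of-arrays layout is shown only as one example of a memory layout choice and is not claimed to be among the optimizations the paper proposes. The names `to_soa` and `support_point` are hypothetical.

```python
def to_soa(vertices):
    """Split a list of (x, y, z) tuples into three coordinate lists."""
    xs = [v[0] for v in vertices]
    ys = [v[1] for v in vertices]
    zs = [v[2] for v in vertices]
    return xs, ys, zs


def support_point(xs, ys, zs, direction):
    """Return the vertex with the maximum dot product against direction.

    Linear scan over all vertices; ties keep the first vertex found.
    """
    if not xs:
        raise ValueError("hull has no vertices")
    dx, dy, dz = direction
    best_index = 0
    best_dot = xs[0] * dx + ys[0] * dy + zs[0] * dz
    for i in range(1, len(xs)):
        dot = xs[i] * dx + ys[i] * dy + zs[i] * dz
        if dot > best_dot:
            best_dot = dot
            best_index = i
    return (xs[best_index], ys[best_index], zs[best_index])
```

In a compiled language, keeping each coordinate contiguous tends to suit vectorized dot products; in Python the sketch only illustrates the query itself.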
|
|
09:15-09:30, Paper Tu-S1-T11.4 | |
Multidimensional Stochastic Petri Nets: A Novel Approach to Modeling and Simulation of Stochastic Discrete-Event Systems |
|
Khodadadi, Atieh | KIT (Karlsruhe Institute of Technology) |
Lazarova-Molnar, Sanja | KIT (Karlsruhe Institute of Technology) |
Keywords: Digital Twin, Discrete Event Systems, System Modeling and Control
Abstract: Process Mining (PM) has proven valuable for extracting process flows from data, also in the form of stochastic Petri net (SPN) models of systems. SPNs are widely recognized for their ability to model complex, stochastic systems and are extensively used in combination with PM. While SPNs provide an intuitive and straightforward way to model complex systems, representing changes across multiple dimensions, such as energy and waste, remains challenging in their standard frameworks. In this paper, we introduce an extension of stochastic Petri nets, termed Multidimensional SPNs (MDSPNs), by extending the SPN framework to capture dynamics along different dimensions. MDSPNs facilitate comprehensive modeling of systems’ behaviors from multiple perspectives, which can correspond to the diverse objectives of systems. To facilitate design and simulation of MDSPNs, we designed and developed MDPySPN, a Python library, which we also introduce in this paper. MDPySPN enables the simulation of MDSPNs by supporting alterations of multiple values at system events. With MDPySPN, we aim to provide researchers, engineers, and simulation professionals with a practical and extensible toolkit to model, simulate, and analyze MDSPNs, thereby supporting multi-objective optimization of stochastic processes in systems. Through a case study, we demonstrate the capabilities of modeling and simulation of MDSPNs using MDPySPN.
|
|
09:30-09:45, Paper Tu-S1-T11.5 | |
Physics Informed Neural Networks for Tool Condition Monitoring in Subtractive Manufacturing |
|
Rothe, Jakob F. | Siemens AG |
Yilmaz, Safa | Technical University of Munich |
Reisch, Raven T. | Siemens AG |
Runkler, Thomas A. | Siemens AG |
Keywords: Digital Twin, Manufacturing Automation and Systems, Fault Monitoring and Diagnosis
Abstract: In subtractive manufacturing, predicting the lifespan of tools offers significant advantages. However, data-driven methods for such predictions often fail to incorporate domain knowledge effectively. This paper presents a hybrid approach that combines machine learning with domain expertise. Firstly, we introduce a novel method to augment sparsely labeled datasets. Secondly, we propose a new loss function that integrates domain knowledge into a machine learning model. This approach enhances the accuracy and reliability of tool wear predictions, ultimately improving the efficiency of subtractive manufacturing processes. We evaluate the proposed methods on the NUAA Ideahouse Dataset. We achieve an R² score of up to 0.99 on unseen data.
|
|
09:45-10:00, Paper Tu-S1-T11.6 | |
Holistic Specification of the Human Digital Twin: Stakeholders, Users, Functionalities, and Applications |
|
Mandischer, Nils | University of Augsburg |
Atanasyan, Alexander | RWTH Aachen University |
Dahmen, Ulrich | RWTH Aachen University |
Schluse, Michael | RWTH Aachen University |
Rossmann, Juergen | RWTH Aachen University |
Mikelsons, Lars | University of Augsburg |
Keywords: Digital Twin, Technology Assessment
Abstract: The digital twin of humans is a relatively new concept. While many diverse definitions, architectures, and applications exist, a clear picture is missing of what, in fact, makes a human digital twin. Within this context, researchers and industrial use-case owners alike are unaware of the market potential of this currently rather theoretical construct. In this work, we draw a holistic vision of the human digital twin, and derive the specification of this holistic human digital twin in the form of requirements, stakeholders, and users. For each group of users, we define exemplary applications that fall into the six levels of functionality: store, analyze, personalize, predict, control, and optimize. The functionality levels facilitate an abstraction of the abilities of the human digital twin. From the manifold applications, we discuss three in detail to showcase the feasibility of the abstraction levels and the analysis of stakeholders and users. Based on this in-depth discussion, we derive a comprehensive list of requirements for the holistic human digital twin. These considerations shall be used as a guideline for research and industry for the implementation of human digital twins, particularly in the context of reusability in multiple target applications.
|
|
Tu-S1-T12 |
Room 0.95 |
Federated Intelligence: Synergies between Federated Learning and Collective
Intelligence & Next-Generation Computational Intelligence for Evolving
Systems and Applications |
Special Sessions: Cyber |
Chair: Badica, Costin | Universitatea Din Craiova |
Co-Chair: Hayashida, Tomohiro | Hiroshima University |
Organizer: Badica, Costin | Universitatea Din Craiova |
Organizer: Camacho, David | Universidad Autonoma De Madrid |
Organizer: Nguyen, Ngoc Thanh | Wroclaw University of Science and Technology |
|
08:30-08:45, Paper Tu-S1-T12.1 | |
MTF-Grasp: A Multi-Tier Federated Learning Approach for Robotic Grasping (I) |
|
Zaland, Obaidullah | Umeå University |
Elmroth, Erik | Umeå University |
Bhuyan, Monowar | Umeå University |
Keywords: Neural Networks and their Applications, Computational Intelligence, Image Processing and Pattern Recognition
Abstract: Federated Learning (FL) is a promising machine learning paradigm that enables participating devices to train privacy-preserving, collaborative models. FL has proven its benefits for robotic manipulation tasks. However, grasping tasks remain underexplored in such settings, where robots train a global model without moving data, thereby ensuring data privacy. The main challenge is that each robot learns from data that is non-independent and identically distributed (non-IID) and low in quantity. This leads to performance degradation, particularly in robotic grasping. Thus, in this work, we propose MTF-Grasp, a multi-tier FL approach for robotic grasping, acknowledging the unique challenges posed by the non-IID data distribution across robots, including quantitative skewness. MTF-Grasp harnesses data quality and quantity across robots to select a set of “top-level” robots with better data distribution and higher sample count. It then utilizes top-level robots to train initial seed models and distribute them to the remaining “low-level” robots, reducing the risk of model performance degradation in low-level robots. Our approach outperforms the conventional FL setup by up to 8% on the quantity-skewed Cornell and Jacquard grasping datasets.
|
|
08:45-09:00, Paper Tu-S1-T12.2 | |
Consensus without Authority: A Meta-Protocol Framework for Decentralized Collective Cognition (I) |
|
Ferenczi, Andras | University of Craiova |
Badica, Costin | Universitatea Din Craiova |
Keywords: Machine Learning, Swarm Intelligence, Intelligent Internet Systems
Abstract: Modern applications increasingly rely on knowledge and collective wisdom. While the internet offers abundant sources, it is untrustworthy, incomplete, and essentially a vast, heterogeneous database lacking current information. With innovation accelerating exponentially, harnessing collective intelligence is essential to drive progress. We present Consensus Without Authority, a general, ledger-backed meta-protocol framework that transforms any group of self-interested actors into a trusted collective-intelligence engine. Each actor \(i\) produces a local insight \(LI_i = f_1(\mathcal{Q}, \theta_i)\), evaluates its peers \(U_i = f_2(\{LI_j\})\), and enters a two-phase commit-reveal cycle. Harmonizers aggregate votes \(V = f_3(\{U_i\})\), synthesize candidate knowledge \(CK_j = f_4(V, \{LI_i\})\), and a smart contract finalizes each round \(r\) by \(CK^{(r)} = f_5(\{CK_j\})\). We argue that, under majority-honest assumptions and token-weighted incentives, honest play is a Nash equilibrium and the framework converges to a unique fixed point even in Byzantine settings. While first applied to federated learning, the framework is not limited to distributed ML training. It supports socially scoped cognition, where actors contribute local knowledge and reasoning to a shared space, enriching common knowledge iteratively. Two case studies illustrate its range: (i) crowd-sourced RLHF, where dispersed annotators guide policy updates without sharing labels; (ii) federated LoRA tuning of large language models across siloed hospitals, achieving near-centralized accuracy under HIPAA and GDPR.
We place the framework within an emerging “AI-native stack”: a Byzantine ledger for immutable state, Google’s A2A for agent-to-agent messaging, and Anthropic’s MCP for secure tool access. Together, these enable a Cognitive Internet where autonomous agents learn, trade, and verify knowledge without centralized trust, creating a scalable, interoperable substrate for next-generation human–AI collaboration.
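The two-phase commit-reveal cycle mentioned in the abstract is a general cryptographic pattern, which the following sketch illustrates in isolation. It assumes a salted SHA-256 hash as the commitment and a string-encoded vote; the abstract does not specify the hash function, the vote encoding, or the on-ledger format, so the names `commit` and `verify_reveal` and all of those choices are hypothetical.

```python
import hashlib
import hmac


def commit(vote: str, salt: bytes) -> str:
    """Phase 1: publish only the hash of (salt || vote).

    The salt is a secret random value chosen by the actor, so the
    commitment hides the vote until it is revealed.
    """
    return hashlib.sha256(salt + vote.encode("utf-8")).hexdigest()


def verify_reveal(commitment: str, vote: str, salt: bytes) -> bool:
    """Phase 2: the actor reveals (vote, salt); anyone can recheck the hash."""
    return hmac.compare_digest(commitment, commit(vote, salt))
```

In use, each actor would draw a fresh salt (for example with `secrets.token_bytes(16)`), submit the commitment during the commit phase, and disclose the vote and salt only after all commitments are recorded, so that no actor can copy or react to another's evaluation.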
|
|
09:00-09:15, Paper Tu-S1-T12.3 | |
Capsule-ConvKAN: A Hybrid Neural Approach for Medical Image Classification (I) |
|
Pitukova, Laura | Technical University of Kosice |
Sinčák, Peter | Technical University of Kosice |
Kovács, László József | University of Miskolc |
Wang, Peng | University of Connecticut |
Keywords: Neural Networks and their Applications, Deep Learning, Application of Artificial Intelligence
Abstract: This study conducts a comprehensive comparison of four neural network architectures: Convolutional Neural Network, Capsule Network, Convolutional Kolmogorov–Arnold Network, and the newly proposed Capsule-Convolutional Kolmogorov–Arnold Network. The proposed Capsule-ConvKAN architecture combines the dynamic routing and spatial hierarchy capabilities of Capsule Network with the flexible and interpretable function approximation of Convolutional Kolmogorov–Arnold Networks. This novel hybrid model was developed to improve feature representation and classification accuracy, particularly in challenging real-world biomedical image data. The architectures were evaluated on a histopathological image dataset, where Capsule-ConvKAN achieved the highest classification performance with an accuracy of 91.21%. The results demonstrate the potential of the newly introduced Capsule-ConvKAN in capturing spatial patterns, managing complex features, and addressing the limitations of traditional convolutional models in medical image classification.
|
|
09:15-09:30, Paper Tu-S1-T12.4 | |
Data Sharing Procedure for Effective Learnings in Deep Reinforcement Learning of Multiagent Systems (I) |
|
Hayashida, Tomohiro | Hiroshima University |
Asano, Kotaro | Hiroshima University |
Sekizaki, Shinya | Hiroshima University |
Nishizaki, Ichiro | Hiroshima University |
Keywords: Computational Intelligence, Application of Artificial Intelligence, Deep Learning
Abstract: In recent years, reinforcement learning has made significant progress in a wide range of domains, including autonomous driving, behavior analysis in electricity markets, and robotic control. Many of these studies have demonstrated the effectiveness of applying Multi-Agent Systems (MAS), where multiple agents act cooperatively or competitively. However, a major challenge in MAS lies in the increased environmental uncertainty caused by the influence of other agents, which can lead to instability in the learning process. To address this issue, various approaches have been proposed that aim to improve learning efficiency by leveraging the experience data of other agents. Previous studies have reported that sharing full experience data among agents can enhance learning efficiency in both cooperative and competitive settings. Furthermore, there are cases where sharing only partial experience data has also led to performance improvements. In this paper, we focus on the similarity between agents as a key criterion for selecting shared data and demonstrate its effectiveness in improving learning performance in MAS. Specifically, we propose a novel method that selectively shares experience data based on agent similarity and utilizes this data to complement an agent’s own experiences. Through simulations with heterogeneous (asymmetric) agents, we show that the proposed method enhances learning efficiency in multi-agent environments.
|
|
09:30-09:45, Paper Tu-S1-T12.5 | |
A Dual-Stage Dual-Population Evolutionary Algorithm for Distribution System Reconfiguration with Severe Constraints (I) |
|
Sekizaki, Shinya | Hiroshima University |
Hayashida, Tomohiro | Hiroshima University |
Keywords: Evolutionary Computation, Metaheuristic Algorithms, Swarm Intelligence
Abstract: The growing integration of renewable energy sources (RESs) within electric distribution systems poses substantial challenges, such as network constraint violations and increased monetary costs. Distribution system reconfiguration is a viable solution for addressing these problems. However, due to severe network constraints, distribution system operators often face challenges in finding optimal reconfiguration solutions within practical computational time. To tackle these difficulties, this paper proposes a constrained multiobjective evolutionary algorithm (CMOEA), based on dual-stage and dual-population strategies, that is specialized for solving reconfiguration problems. The proposed CMOEA employs specialized selection pressures for each population in each stage to guide the populations toward good objective values with diversity. The proposed CMOEA is applied to computational experiments on a reconfiguration problem characterized by severe constraints and uncertainties arising from RESs. The results demonstrate its effectiveness in identifying feasible solutions with good objective values, i.e., low investment cost.
|
|
09:45-10:00, Paper Tu-S1-T12.6 | |
LLM-Guided Evolutionary Strategy Generation for Quantitative Trading (I) |
|
Zhang, Di | Xi'an Jiaotong-Liverpool University |
Jiang, Zhengyong | Xi'an Jiaotong-Liverpool University |
Ji, Qiong | Xi'an Jiaotong-Liverpool University |
Liu, Hengyan | Xi'an Jiaotong-Liverpool University |
Wang, Tianshi | Xi'an Jiaotong-Liverpool University |
Stefanidis, Angelos | Bournemouth University |
Keywords: Application of Artificial Intelligence, Expert and Knowledge-Based Systems, Evolutionary Computation
Abstract: This paper proposes LLM-GA, a novel framework that integrates large language models (LLMs) with genetic algorithms (GA) for automated trading strategy generation. The system architecture comprises three synergistic modules: 1) a signal generator extracting technical, fundamental, and sentiment indicators; 2) an LLM-enhanced GA core that initializes seed strategies and performs semantically-aware crossover/mutation operations; and 3) an execution module forming a closed-loop adaptive system. Unlike a traditional GA, which randomly combines signals, our approach leverages LLMs' financial reasoning capability to maintain logical consistency during strategy evolution. Experiments on five years of historical data from the Chinese stock market (2020-2024) show that LLM-GA achieves superior risk-adjusted returns (Annualized Excess Return (AER)=12.3%, Maximum Drawdown (MDD)=35.2%) compared to baseline methods including vanilla GA, PSO, and ensemble learning. Ablation studies reveal that LLM-guided initialization improves starting strategy quality by 215%, while semantic crossover reduces invalid strategies by 83.5%. Despite performance gaps against RL methods (2-3% lower AER), our method provides unique advantages in strategy interpretability and diversity, addressing critical limitations in black-box approaches like reinforcement learning. The work establishes a new paradigm for human-AI collaborative quantitative strategy development.
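For readers unfamiliar with the Maximum Drawdown (MDD) metric reported above, the following is a minimal sketch of the conventional definition: the largest peak-to-trough decline of an equity curve, expressed as a fraction of the running peak. The abstract does not state how the authors compute MDD, so the function name `max_drawdown` and the sample equity values are illustrative assumptions.

```python
def max_drawdown(equity):
    """Largest decline from a running peak, as a fraction of that peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                    # highest value seen so far
        worst = max(worst, (peak - value) / peak)  # current drop from that peak
    return worst


# Hypothetical equity curve: the peak of 120 is followed by a trough of 80.
mdd = max_drawdown([100.0, 120.0, 90.0, 110.0, 80.0])
flat = max_drawdown([1.0, 2.0, 3.0])
```

In this example the drawdown is (120 - 80) / 120, about one third, while a curve that only rises has a drawdown of zero.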
|
|
Tu-S1-T13 |
Room 0.96 |
Fuzzy Systems |
Regular Papers - Cybernetics |
Chair: Grantner, Janos | WMU |
Co-Chair: Wang, Tao | Tsinghua University |
|
08:30-08:45, Paper Tu-S1-T13.1 | |
An FPGA-Based Accelerator for Fuzzy Grasper Assessment in the Laparoscopic Box Trainer Peg Transfer Task |
|
Bainbridge, Kenneth | Western Michigan University |
Grantner, Janos | WMU |
Abdel-Qader, Ikhlas | Western Michigan University |
Keywords: Fuzzy Systems and their applications, Hybrid Models of Neural Networks, Fuzzy Systems, and Evolutionary Computing, Neural Networks and their Applications
Abstract: The peg transfer task is a crucial exercise in the Fundamentals of Laparoscopic Surgery (FLS) program for developing the dexterity and coordination skills required for laparoscopic surgery. It involves a surgeon holding a grasper in each hand to transfer six pegs on a peg board from the left to the right (or vice versa) and then back again. This task must be accomplished using both graspers while adhering to various constraints on their movement and time. Automating the assessment of this task is crucial for providing objective, consistent, and real-time feedback to laparoscopic surgery trainees, helping them refine their skills more effectively. Previous work developed a neural network to identify grasper and peg objects in various states. The identified states were further used as inputs to a two-level, cascaded, fuzzy logic system to assess grasper movements. The system ran entirely on a PC but was too slow to provide real-time feedback. To address this limitation, this paper presents the development of a custom fuzzy grasper assessment hardware accelerator using VHDL and its implementation on an SoC-FPGA. The accelerator successfully performs the same task while being, on average, over 64 times faster than the previous PC-based implementation. It is capable of 9,211 fuzzy inferences per second. In future work, the neural network running on the PC will be accelerated using the same SoC-FPGA, and the accelerator can be further pipelined to improve performance. This will pave the way for fully integrated real-time feedback for the trainees, a promising advancement in laparoscopic surgery training.
|
|
08:45-09:00, Paper Tu-S1-T13.2 | |
Bridging Causal Discovery and Fuzzy Systems: An Efficient Rule-Based Modeling Approach |
|
Li, Yishen | Beijing University of Posts and Telecommunications |
Wang, Tao | Tsinghua University |
Li, Yuliang | Beijing University of Posts and Telecommunications |
Sun, Fuchun | Tsinghua University |
Keywords: Fuzzy Systems and their applications, Machine Learning, Neural Networks and their Applications
Abstract: Generating fuzzy rule bases from data is essential for building interpretable fuzzy systems. Traditional approaches like Wang-Mendel (WM) rely on correlations but often produce large, redundant rule bases, reducing interpretability and increasing computational cost. To address this, recent work has incorporated causal discovery into rule generation. Te Zhang et al. introduced a method using directed graphs within the Markov blanket, but their reliance on DirectLiNGAM limits applicability to linear data. This paper adopts the Causal Additive Model with Unobserved Variables (CAMUV) to identify a target variable’s Markov blanket and extract its direct causal features. These are then used as inputs to a Takagi-Sugeno-Kang (TSK) fuzzy system. Compared to causal learning algorithms like GRaSP, DECI, and BOSS, CAMUV better handles nonlinear and partially unobserved data, enhancing causal discovery and interpretability. Unlike WM, the TSK system generates dynamic causal if-then rules, improving both interpretability and modeling power. Experiments across seven datasets show an average accuracy improvement of approximately 5% over benchmark models. This work offers a novel, causality-driven approach to constructing interpretable fuzzy systems for complex data.
|
|
09:00-09:15, Paper Tu-S1-T13.3 | |
Enhancing the Dendritic Cell Algorithm through Automated Feature Reduction Techniques for Improved Anomaly Detection |
|
Pereira, Vitor | Faculty of Engineering, University of Porto |
Pinto, Rui | FEUP |
Gonçalves, Gil | Faculdade De Engenharia Da Universidade Do Porto |
Keywords: Artificial Immune Systems, Hybrid Models of Neural Networks, Fuzzy Systems, and Evolutionary Computing, Expert and Knowledge-Based Systems
Abstract: The Dendritic Cell Algorithm (DCA), inspired by the Human Immune System, is a promising Artificial Immune System (AIS) for anomaly detection. However, its pre-processing phase traditionally depends on expert manual intervention, limiting scalability and objectivity. This study investigates the impact of integrating automated feature reduction techniques to streamline this phase. We propose three approaches: Kernel Principal Component Analysis (KPCA), Autoencoders (AE), and a hybrid Autoencoder with KPCA (AEkPCA). The models were tested on seven datasets from cybersecurity, biology, and finance domains. KPCA, particularly with a Gaussian kernel, delivered the most consistent results, achieving high accuracy and strong MCAV separation. AEkPCA showed competitive performance in complex datasets, especially with polynomial kernels, though with greater variability. AE in isolation exhibited unstable behavior and inconsistent detection. These results support the viability of automated preprocessing in DCA, highlighting that performance depends heavily on the feature reduction method and kernel combination.
|
|
09:15-09:30, Paper Tu-S1-T13.4 | |
HO-DJ-GNN: A Laplacian-Based Hybrid Diffusion Jump Model for Node Classification in Graph Neural Networks |
|
Wang, Hexuan | Tiangong University |
Yan, Yang | School of Information Technology and Engineering, Tianjin Univer |
Wang, Qiuyan | School of Computer Science and Technology, Tiangong University, |
Chen, Hanning | College of Artificial Intelligence, Tianjin University of Scienc |
Liang, Xiaodan | School of Computer Science and Technology, Tiangong University, |
Keywords: Hybrid Models of Neural Networks, Fuzzy Systems, and Evolutionary Computing, Deep Learning, Machine Learning
Abstract: This paper proposes a graph neural network model, DJ-HO-GCN, which integrates high-order graph convolution with diffusion distance, addressing the performance limitations of traditional graph neural networks on heterogeneous graphs. This model employs a novel approach grounded in the rich mathematical theory of simplicial complexes (SCs), a robust tool for modeling high-order interactions. With this approach, the high-order convolution module effectively captures correlations between nodes at greater distances, showing clear advantages particularly on graphs with complex structures and significant heterogeneity. Meanwhile, by introducing the concept of a diffusion jump, the diffusion distance of K-step nodes is dynamically calculated during the training process, and a structural filter is generated in combination with the diffusion distance. The model can thus describe the similarity between nodes well, enhance its ability to model long-range dependencies, and effectively mitigate over-smoothing. To further improve the applicability of the model, DJ-HO-GCN combines these two mechanisms, avoiding the limitations of traditional methods and demonstrating strong robustness and scalability on different types of graph data. The experimental results show that DJ-HO-GCN achieves high accuracy on most datasets, with especially clear advantages on heterogeneous graphs (such as Chameleon and Squirrel), thereby verifying the effectiveness and innovation of this method in graph neural networks.
|
|
09:30-09:45, Paper Tu-S1-T13.5 | |
Revisiting and Improving the NEAT Algorithm |
|
Lyubchev, Dimitar | University of National and World Economy, Sofia, Bulgaria |
Marchev, Angel | UNWE |
Keywords: Hybrid Models of Neural Networks, Fuzzy Systems, and Evolutionary Computing, Evolutionary Computation
Abstract: Neuroevolution of Augmenting Topologies (NEAT) stands out as a pioneering framework for simultaneously evolving weights and architectures of artificial neural networks; however, several implicit design choices remain underexplored, particularly in large-scale or zero-knowledge scenarios. This paper revisits the canonical NEAT algorithm and the widely adopted neat Python library, presenting an extensive experimental campaign that spans synthetic control tasks and serves the dual goals of methodological critique and targeted enhancement. The aim is to introduce lightweight yet impactful extensions, including balanced crossover, rejuvenation-driven speciation, adaptive hyper-parameters, and GPU-accelerated batch evaluation, thus achieving improved convergence speed, solution compactness, and behavioral diversity. Empirical results demonstrate consistent performance gains across the examined domains, affirming the practical value of our refinements. These contributions advance NEAT toward a more scalable, adaptive, and high-performance solution for automated neural architecture discovery.
|
|
Tu-S1-T14 |
Room 0.97 |
Haptic Systems |
Regular Papers - HMS |
Chair: Funabora, Yuki | Nagoya University |
Co-Chair: Treichl, Tobias | German Aerospace Center |
|
08:30-08:45, Paper Tu-S1-T14.1 | |
Using the Internal Model Hypothesis to Model Human Motor Control in Multi-Domain Simulations |
|
Treichl, Tobias | German Aerospace Center |
Keywords: Haptic Systems, Human Factors, Human-Machine Interaction
Abstract: There is a trend in human-machine interaction towards a more physical interaction between humans and machines. Examples are shared control systems such as lane keeping assistants or human-robot collaboration. In such systems, the interplay of the sensorimotor control loops of the human operator with the machine plays an important role. State-of-the-art literature suggests that this interplay should be considered already in the design process of such systems. Therefore, the realistic modeling of the human motor control is crucial for a human model for physical human-machine interaction. The current state of research in neuromechanics suggests that humans use so-called internal models of their limbs and environment, located in the central nervous system, to perform accurate movements despite slow reaction times. This theory is called the internal model hypothesis. This paper presents an approach to implement the internal model hypothesis in a dynamic human model using inverse models. The inverse models are derived using modified Newton-Euler equations. Finally, the validity of the chosen approach is demonstrated by comparing numerical simulations with experimental data using the example of a human-steering wheel interaction. The numerical simulations show a similar qualitative behaviour compared to the mean subject trajectories. Furthermore, the simulated trajectories are within the standard deviation of the experimental data for most of the time.
|
|
08:45-09:00, Paper Tu-S1-T14.2 | |
Omnidirectional Grip Assessment and Tactile Feedback for Interactive Upper Limb Rehabilitation |
|
Stefanowicz, Katarzyna Anna | Management Center Innsbruck |
Pichler, Julia | Management Center Innsbruck |
Winkler, Simon | MCI Internationale Hochschule GmbH |
Kim, Yeongmi | MCI Internationale Hochschule GmbH |
Keywords: Haptic Systems, Human-Machine Interaction
Abstract: In rehabilitation of upper limb impairment, grip training is a crucial step and requires a universal approach. To address rehabilitation of somatosensory perception and grip control, we present a hand grip assessment and training device with integrated tactile feedback for immersive, interactive therapy. The grip is soft and designed to measure grip force omnidirectionally. Additionally, texture simulation is provided through a high-resolution vibration module, delivering tactile feedback. The mechanical design consists of a base platform and a precisely aligned steel support element that holds a grip measurement unit. The system supports grip measurements and offers a playful, engaging experience designed to enhance patient motivation during therapy by integrating VR game applications. To verify that the proposed device enables omnidirectional grip assessment, we measured the force values by applying the same force at six different angles. Each of the six force values was repeatedly measured across different angles. The device allows users to grip freely, making it suitable for rehabilitation training, and offers significant potential for personalized therapy applications.
|
|
09:00-09:15, Paper Tu-S1-T14.3 | |
Funabot-Suit for Upper Body: McKibben Actuated-Suit Inducing Seven Kinesthetic Perceptions in Elbow, Shoulder, and Trunk |
|
Fukatsu, Haru | Nagoya University |
Peng, Yanhong | Nagoya University |
Funabora, Yuki | Nagoya University |
Doki, Shinji | Nagoya University |
Keywords: Haptic Systems, Human-Machine Interaction, Wearable Computing
Abstract: This paper presents the Funabot-Suit for the Upper Body, enabling users to perceive seven upper body motions (shoulder abduction, elbow flexion/extension, and trunk rotation) through kinesthetic feedback induced by 80 McKibben artificial muscles embedded in elastic clothing. Unlike existing systems that rely on rigid structures or vibrotactile feedback, the Funabot-Suit provides soft, multi-joint kinesthetic feedback via garment deformation. Building on our previous work focused on trunk motion, this study extends perception to more localized joints such as the shoulders and elbows. The experimental results from nine participants indicate that larger body parts tend to induce stronger and clearer sensations, while certain localized motions, such as elbow extension, are perceived less clearly. These findings suggest that both the size of the body part and the placement of artificial muscles affect kinesthetic perception. The results provide design insights for future wearable devices targeting VR, rehabilitation, and human-robot interaction.
|
|
09:15-09:30, Paper Tu-S1-T14.4 | |
HapticVLM: VLM-Driven Texture Recognition Aimed at Intelligent Haptic Interaction |
|
Khan, Muhammad Haris | Intelligent Space Robotics Lab, SKOLTECH |
Altamirano Cabrera, Miguel | Skolkovo Institute of Science and Technology Skoltech |
Iarchuk, Dmitrii | Skolkovo Institute of Science and Technology |
Mahmoud, Yara | Skolkovo Institute of Science and Technology |
Trinitatova, Daria | Skolkovo Institute of Science and Technology |
Tokmurziyev, Issatay | Skolkovo Institute of Science and Technology |
Tsetserukou, Dzmitry | Skoltech |
Keywords: Haptic Systems, Intelligence Interaction, Human-Machine Interaction
Abstract: This paper introduces HapticVLM, a novel multimodal system that integrates vision-language reasoning with deep convolutional networks to enable real-time haptic feedback. HapticVLM leverages a ConvNeXt-based material recognition module to generate robust visual embeddings for accurate identification of object materials, while a state-of-the-art Vision-Language Model (Qwen2-VL-2B-Instruct) infers ambient temperature from environmental cues. The system synthesizes tactile sensations by delivering vibrotactile feedback through speakers and thermal cues via a Peltier module, thereby bridging the gap between visual perception and tactile experience. Experimental evaluations demonstrate an average recognition accuracy of 84.67% across five distinct auditory-tactile patterns and a temperature estimation accuracy of 86.7% based on a tolerance-based evaluation method with an 8°C margin of error across 15 scenarios. Although promising, the current study is limited by the use of a small set of prominent patterns and a modest participant pool. Future work will focus on expanding the range of tactile patterns and increasing user studies to further refine and validate the system's performance. Overall, HapticVLM presents a significant step toward context-aware, multimodal haptic interaction with potential applications in virtual reality and assistive technologies.
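The tolerance-based evaluation mentioned above can be read as counting an estimate as correct when its absolute error falls within the stated margin. The sketch below assumes an inclusive 8°C bound; the function name `tolerance_accuracy` and the sample temperatures are hypothetical and are not taken from the paper.

```python
def tolerance_accuracy(predicted, actual, tolerance=8.0):
    """Fraction of estimates whose absolute error is within the tolerance."""
    hits = sum(abs(p - a) <= tolerance for p, a in zip(predicted, actual))
    return hits / len(actual)


# Hypothetical estimates in degrees Celsius: absolute errors are 5, 20, and 0.
accuracy = tolerance_accuracy([20.0, 30.0, 5.0], [25.0, 10.0, 5.0])
```

Under this reading, 13 of 15 scenarios falling inside the margin would give 13/15, or about 86.7%; the abstract does not state the exact counts.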
|
|
09:30-09:45, Paper Tu-S1-T14.5 | |
Enhancement of Vibration Induced Illusory Wrist Movement by Presenting Finger Joint Extension |
|
Izumi, Kai Shunshi | The University of Tokyo |
Honda, Koki | The University of Tokyo |
Fukui, Rui | The University of Tokyo |
Keywords: Haptic Systems, Virtual and Augmented Reality Systems, Human-Machine Interaction
Abstract: In recent years, kinesthetic illusion has gained attention as a means of presenting a sense of movement to users and improving immersion in human-machine systems for entertainment, skill acquisition training, and rehabilitation. Kinesthetic illusion refers to the phenomenon in which a stationary body part is perceived to move when vibratory stimulation is applied to muscles. However, the intensity of the illusion varies among individuals and some people do not perceive the illusions. To address this issue, this study proposes a method to induce and enhance kinesthetic illusions in the wrist joint by combining finger joint extension with vibratory stimulation. Finger joint extension is closely associated with wrist movements in daily activities. Therefore, by simultaneously presenting finger joint extension with vibratory stimulation, this approach aims to evoke the user's motor imagery and strengthen the illusion. A psychophysical experiment confirmed that the proposed method induced stronger illusions than the vibratory stimulation alone. Furthermore, this study revealed that the angle to which the fingers are extended is a key parameter for enhancing the illusion using the proposed method.
|
|
09:45-10:00, Paper Tu-S1-T14.6 | |
Effect of Multimodal Haptic Feedback Combining Force and Vibrotactile Feedback During Pressing Motion |
|
Hayami, Natsuki | Chuo University |
Sawahashi, Ryunosuke | Chuo University |
Nishihama, Rie | Chuo University |
Nakamura, Taro | Chuo Univ |
Keywords: Haptic Systems, Virtual/Augmented/Mixed Reality, Virtual and Augmented Reality Systems
Abstract: In virtual reality (VR) environments, enhancing force feedback is crucial for achieving immersive and natural interactions. Force feedback devices are generally used to present stiffness and resistance, allowing users to perceive physical properties of virtual objects. However, even with force feedback, users may fail to perceive contact despite visual confirmation that the object is being touched. To address this issue, we developed a multimodal haptic presentation system that combines force feedback using a magnetorheological (MR) brake with vibrotactile feedback. The integration of vibration aims to complement force feedback, enhancing the sensation of contact. Experiments were conducted to evaluate the effectiveness of the proposed system by comparing four feedback conditions: no feedback, force feedback only, vibrotactile feedback only, and combined feedback. The results indicated that the combination of force and vibrotactile feedback significantly improved contact perception and pressing sensation, especially in upward and forward pressing tasks. Additionally, vibrotactile feedback alone was found to enhance contact sensation compared to force feedback alone, suggesting that vibration effectively compensates for contact perception when force feedback is insufficient. These findings demonstrate that combining force and vibrotactile feedback contributes to more realistic and intuitive haptic experiences in VR, providing valuable insights for designing advanced haptic interfaces.
|
|
Tu-S1-T15 |
Room 1.85 |
Intelligent Computing and Its Applications |
Special Sessions: SSE |
Chair: Sung, Guo-Ming | National Taipei University of Technology |
Co-Chair: Sung, Wen-Tsai | National Chin-Yi University of Technology |
Organizer: Sung, Guo-Ming | National Taipei University of Technology |
Organizer: Chou, Jen-Hsiang | National Taipei University of Technology |
|
08:30-08:45, Paper Tu-S1-T15.1 | |
Modified Direct Torque Control Application-Specific Integrated Circuit with a Speed Controller and Nine-Stage Flux/Torque Error Fuzzy Controller for a Three-Phase Induction Motor (I) |
|
Sung, Guo-Ming | National Taipei University of Technology |
Hsieh, Chia-Jung | National Taipei University of Technology |
Yu, Chih-Ping | National Taipei University of Technology |
Lee, Ching-Yin | Tungnan University |
Chen, Chao-Rong | National Taipei University of Technology |
Lin, Tzu-Chiao | National Taipei University of Technology |
Keywords: Electric Vehicles and Electric Vehicle Supply Equipment, Intelligent Transportation Systems, Large-Scale System of Systems
Abstract: This study developed an application-specific integrated circuit (ASIC) with a speed controller, a nine-stage error fuzzy controller, and a discrete multiple vector voltage (DMVV) system for modified direct torque control (MDTC). By using the nine-stage error fuzzy controller, the proposed system effectively stabilizes a motor’s flux and ensures high control precision by incorporating speed feedback. This feature enables the system to ensure that the flux and torque values are close to the designed values, which results in high motor performance. A DMVV switching table plays a crucial role in facilitating the appropriate six-switch signals on the basis of the modified flux and torque signals from the fuzzy controller. The proposed DMVV system considerably reduces ripples and enhances overall system stability by generating more vector voltages than those generated in the conventional DTC method. The proposed system architecture and functional modules were implemented using Verilog hardware description language. After the syntax and functionality of the designed ASIC were rigorously verified using a field-programmable gate array development board, the designed ASIC was fabricated through the 0.18-μm complementary metal–oxide–semiconductor process of Taiwan Semiconductor Manufacturing Company. This ASIC caters to the specific requirements of three-phase induction motors. Measurement results indicated that the fabricated ASIC had a chip area of 0.974 × 0.976 mm², a sampling frequency of 40 MHz, and power consumption of 0.5957 mW under a supply voltage of 1.8 V and an operating frequency of 10 MHz.
|
|
08:45-09:00, Paper Tu-S1-T15.2 | |
Adaptive Decision Feedback Equalization for High-Speed Serializer/Deserializer Communication System (I) |
|
Sung, Guo-Ming | National Taipei University of Technology |
Kohale, Sachin D. | National Taipei University of Technology |
Chang, Shu-Wen | National Taipei University of Technology |
Tung, Li-Fen | National Taipei University of Technology |
Tseng, Chwan-Lu | National Taipei University of Technology |
Chou, Jen-Hsiang | National Taipei University of Technology |
Keywords: Communications, Large-Scale System of Systems, Smart Sensor Networks
Abstract: An 8-tap feed-forward equalizer (FFE) and 10-tap decision feedback equalizer (DFE) were designed for the IEEE 802.3u 100Base-TX specification. The weights of these adaptive filters are updated with a sign–sign least-mean-square (SSLMS) algorithm. To ensure the flexibility of the circuits for various environments and channel lengths, the equalizers are designed such that the number of taps can be adjusted to between 1 and 8 for the FFE and 1 and 10 for the DFE. Results indicated that the equalizer can compensate for channel delays of more than 20 dB, and it achieved a bit error rate of 2 × 10⁻³. A high-speed serializer/deserializer communication application-specific integrated circuit was designed in Verilog Hardware Description Language and implemented on the TSMC 90-nm CMOS 1P9M standard cell process. In simulations, the chip area, delay cycle, and logic gate counts were 845.445 × 845.445 μm², 12 cycles, and 75,187 gates, respectively. The supplied voltage and frequency of the proposed adaptive DFE were 1.2 V and 250 MHz, respectively.
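The sign–sign LMS update named in this abstract replaces both the error and the input sample by their signs, so each tap weight moves by a fixed step per update, which is what makes it attractive in hardware. The sketch below shows one textbook update step; the error convention (desired minus output), the step size `mu`, and the function names are assumptions for illustration and do not describe the authors' circuit.

```python
def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def sslms_update(weights, taps, error, mu):
    """One sign-sign LMS step: w_k <- w_k + mu * sign(error) * sign(x_k)."""
    return [w + mu * sign(error) * sign(x) for w, x in zip(weights, taps)]


# Hypothetical 2-tap example with a negative error sample.
new_weights = sslms_update([0.0, 0.5], [0.3, -0.2], error=-0.1, mu=0.01)
```

Here the first weight decreases by `mu` and the second increases by `mu`, independent of the magnitudes of the error and the tap inputs.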
|
|
09:00-09:15, Paper Tu-S1-T15.3 | |
The Improved Multi-Scale Attention Module for Lightweight Mango Leaf Detection Model (I) |
|
Sung, Wen-Tsai | National Chin-Yi University of Technology |
Isa, Indra Griha Tofik | National Chin-Yi University of Technology |
Hsiao, Sung-Jung | Department of Information Technology, Takming University of Scie |
Keywords: Smart Sensor Networks, System Modeling and Control, Consumer and Industrial Applications
Abstract: Visual monitoring of mango leaves can support optimal growth through early detection of leaf disease. However, the high complexity of mango leaves, with multiple intersections and noisy backgrounds, makes model performance less sensitive to the actual condition of the leaves. To address this issue, the proposed model is constructed by integrating YOLOv10 and the improved multi-scale attention module. Specifically, the detection head structure of YOLO is fused with an attention module that integrates the efficient multi-scale attention (EMA) module and a non-local block mechanism. This mechanism not only improves model performance, especially for long-range dependencies and tiny objects, but also provides an adaptable lightweight model for edge computing systems. The experimental results indicate that the proposed model achieves an mAP50 of 0.944, whereas the original model reaches an mAP50 of 0.917. Across the ablation study, comparative model experiments, and practical application, the proposed model achieves the best overall detection performance.
|
|
09:15-09:30, Paper Tu-S1-T15.4 | |
Stock Prices Forecasting Using a Cerebellar Model Neural Network and Extreme Learning Machine (I) |
|
Zhang, Jin-Liang | Yuan Ze University |
Lin, Chih-Min | Yuan Ze University |
Keywords: Control of Uncertain Systems
Abstract: This paper presents an approach for the forecast of daily stock price. Stock prices are not constant over time and are becoming increasingly uncertain in modern financial markets, so that their forecasting is more important and challenging. A new structure, called a cerebellar model extreme learning machine (CMELM), is proposed, which includes a cerebellar model neural network used as a main predictor and an extreme learning machine used for parameter learning. In order to attain better accuracy, a wavelet is used to decompose the original stock price time series. This framework is tested using the data from the Taiwanese stock market, and the experimental results show that it outperforms the benchmarks that are established in this study. Because it is extremely fast and sufficiently accurate, the proposed method has great potential for practical applications.
|
|
09:30-09:45, Paper Tu-S1-T15.5 | |
A Dysarthric Speech Recognition System with Personal Style Embedding (I) |
|
Hsu, YuChen | National Taipei University |
Chang, Yue-Shan | National Taipei University |
Keywords: Adaptive Systems, Consumer and Industrial Applications, System Modeling and Control
Abstract: Conventional automatic speech recognition (ASR) systems often encounter difficulties in processing speech of dysarthric people, which is characterized by impaired neuromuscular control and atypical articulation. Existing commercial solutions, such as Google Speech to Text and Microsoft Azure Speech, exhibit high error rates when processing the speech of people with dysarthria, severely hindering the accessibility of communication. This study introduces a novel personalized ASR system that aims to address these challenges by integrating speaker adaptive modeling and advanced correction mechanisms. The system is based on the transformer-based SenseVoice mini-model, which utilizes the functionality of the transformer-based Large Language Model (LLM) to extract speaker-specific speech features from historical data and implement a dynamic history correction algorithm. By employing an LLM-based style extraction method, the system develops a personalized correction model that can be adapted to individual speech variations. Experimental evaluations show that the system provides substantial improvements in recognition accuracy and processing efficiency compared to traditional automatic speech recognition (ASR) techniques. This research highlights the potential of AI-driven personalization approaches in assistive communication technologies, providing a promising avenue for enhancing communication tools for people with speech impairments.
|
|
09:45-10:00, Paper Tu-S1-T15.6 | |
Multimodal Entity Alignment Via Siamese Network and Structural Attention (I) |
|
Wagner, Andrew | University of Cincinnati |
Anjum, Usman | University of Cincinnati |
Zhan, Justin | University of Cincinnati |
Keywords: Communications, Adaptive Systems, Decision Support Systems
Abstract: We consider the problem of entity alignment in multi-modal knowledge graphs (MMKGs). The explosion of interest in MMKGs has led to a wide array of techniques for aligning entities across multiple MMKGs. The nature of the different modalities available in an MMKG makes embedding them in a shared space challenging. This paper proposes using optimal transport (OT) as a means of embedding and aligning multiple modalities into one unified representation. After acquiring the unified representation, we propose using contrastive learning to train a model for better performance in accurately predicting links between entities. Experiments examining the effect of the additional modality and OT data fusion show poor performance that failed to meet, let alone exceed, the current state of the art.
|
|
Tu-S1-BMI.WS |
Room 0.49&0.50 |
BMI Workshop - Paper Session 1: Advances in BCI 1 |
BMI Workshop |
Chair: Volosyak, Ivan | Rhine-Waal University of Applied Sciences |
|
08:30-08:45, Paper Tu-S1-BMI.WS.1 | |
Preliminary Evaluation of an Augmentative and Alternative Communication System Operated Via a Brain-Computer Interface Based on Event-Related Potentials (I) |
|
Fernández-Rodríguez, Álvaro | Universidad De Málaga |
Arnaud, Axel | ENSC, Université De Bordeaux |
Dalphrase, Mathilde | ENSC |
Lespinet-Najib, Véronique | Bordeaux INP |
Velasco-Alvarez, Francisco | University of Malaga |
ANDRE Jean Marc, Andre | Bordeaux INP |
Ron-Angevin, Ricardo | University of Málaga |
Keywords: Active BMIs, BMI Emerging Applications, Other Neurotechnology and Brain-Related Topics
Abstract: A brain-computer interface (BCI) is a technology that enables direct communication between a user and external devices through brain activity. BCIs can be particularly beneficial for individuals with severe motor impairments who are unable to use conventional assistive technologies that require muscular control. The present study evaluates a preliminary ERP-based BCI designed as an augmentative and alternative communication (AAC) system for this population. Ten able-bodied participants were asked to operate the AAC-BCI system, which enabled stepwise selections through pictogram-based hierarchical menus to address various communication needs. Both objective performance metrics and subjective user feedback were collected. The results validate the feasibility of the proposed system as an AAC-BCI solution. However, further improvements are necessary to optimize its usability and effectiveness for the target population. Future research should focus on refining the system’s performance and adaptability to enhance its practical application in real-world scenarios.
|
|
08:45-09:00, Paper Tu-S1-BMI.WS.2 | |
Exploring New Territory II: Calibration-Free Decoding for ERP BCI (I) |
|
Thielen, Jordy | Radboud University |
Tangermann, Michael | Radboud University |
Keywords: Active BMIs, BMI Emerging Applications
Abstract: A brain-computer interface (BCI) typically requires a calibration phase to train its decoding model using supervised data, which can be time-consuming and impractical in real-world scenarios. This study eliminates the need for such calibration by investigating two zero-training approaches using a P300 event-related potential (ERP) dataset. We first evaluate reconvolution canonical correlation analysis (CCA) on ERP data, which marks its first application beyond its original domain of code-modulated visual evoked potentials (c-VEP). Second, we examine unsupervised mean-difference maximization (UMM), a recently proposed method for calibration-free ERP decoding. For both approaches, we assess performance under two conditions: instantaneous classification and cumulative learning across previously classified trials. Surprisingly, despite CCA originating from the c-VEP domain, our results demonstrate that both CCA and UMM achieve comparably high classification accuracy (97 % and 100 %) in decoding ERP data without requiring a calibration session. We highlight the unique advantages that each method offers and discuss implications for future research. By enabling reliable, calibration-free decoding, this work supports the development of more practical and accessible BCIs across various stimulus paradigms and applications. The potential integration of CCA and UMM presents a promising avenue to further enhance BCI usability.
|
|
09:00-09:15, Paper Tu-S1-BMI.WS.3 | |
Supervised and Semi-Supervised Machine Learning Networks Applied for Control of a Lower-Limb Exoskeleton (I) |
|
Bhambhani, Yash | Miguel Hernández University of Elche |
Ortiz, Mario | Universidad Miguel Hernández |
Polo-Hortigüela, Cristina | Universidad Miguel Hernández of Elche |
Quiles, Vicente | Miguel Hernandez University of Elche |
Cavaliere-Ballesta, Carlo | Miguel Hernández University of Elche |
Iáñez, Eduardo | Miguel Hernández University of Elche |
Azorin, Jose M. | Universidad Miguel Hernandez De Elche |
Keywords: BMI Emerging Applications, Active BMIs
Abstract: Brain–machine interfaces (BMI) for lower-limb exoskeletons are a state-of-the-art neurorehabilitation modality. They decode electroencephalographic (EEG) recordings during motor imagery (MI), the mental rehearsal of movement, to infer intent and drive exoskeleton control. Yet MI decoding suffers from low signal-to-noise ratio, EEG non-stationarity, and high inter-trial and inter-subject variability. Conventional machine-learning classifiers further struggle with limited training data and overfitting, undermining real-time robustness. In this preliminary, offline study on a single subject, a novel semi-supervised MI-classification network is implemented that includes an L2-normalized autoencoder with dual reconstruction and classification branches and that, to our knowledge, is the first tailored for closed-loop lower-limb exoskeleton control. This method is compared against four supervised approaches using a hybrid feature-extraction pipeline capturing spectral, spatial, and temporal EEG dynamics. Supervised models were evaluated via leave-one-out cross-validation, while the semi-supervised framework’s latent representations were examined with K-means clustering and t-distributed Stochastic Neighbor Embedding (t-SNE). Event-based false-positive ratios (FPR) and true-positive ratios (TPR) served as comparative metrics. All approaches achieved 61–67 % accuracy, with the semi-supervised network showing a lower FPR, suggesting its promise for more robust, data-efficient BMI-driven exoskeleton control.
|
|
09:15-09:30, Paper Tu-S1-BMI.WS.4 | |
Pre-Decision Feedback in Code-Modulated Visual Evoked Potentials Brain-Computer Interface for an 11-Class Keypad Typing Task |
|
Gomel, Jules | ISAE-Supaero |
Torre Tresols, Juan Jesus | ISAE-SUPAERO |
Cimarosto, Pietro | ISAE-Supaero |
Cabrera Castillos, Kalou | ISAE-Supaero |
Dehais, Frederic | ISAE-SUPAERO |
Keywords: Other Neurotechnology and Brain-Related Topics, BMI Emerging Applications
Abstract: This paper investigates the design of user-friendly reactive Brain-Computer Interfaces (rBCIs) based on Code-Modulated Visual Evoked Potentials (c-VEP). The BCI was implemented using the StAR-Burst paradigm, which features small, randomly-oriented texture patches designed to optimize foveal neural responses while enhancing user visual comfort. We extended this paradigm to support an 11-class selection scenario using dry EEG electrodes. In addition, the study explored the impact of predictive visual feedback on user experience. Two feedback types, Halo and Depth, were compared against a control condition with no feedback, aiming to enhance users' sense of control during interaction. Results demonstrated that the BCI achieved high classification accuracy with a dry EEG system across all three conditions (mean = 93.3%), using only 33 seconds of calibration data. Contrary to our expectations, predictive feedback did not lead to significant improvements in classification accuracy or decoding time. However, the Halo feedback significantly increased users' anticipation of success, though it also caused greater peripheral distraction. Interestingly, decoding time improved significantly with practice, underscoring the role of user adaptation in enhancing performance. Overall, the findings highlight the need for explicit training to help users effectively interpret and utilize predictive feedback.
|
|
09:30-09:45, Paper Tu-S1-BMI.WS.5 | |
Nonlinear Finite-Time Observer for Periodic Signal Estimation and Its Case Study for Electroencephalogram (I) |
|
Murakami, Madoka | Tokyo University of Science |
Nakamura, Hisakazu | Tokyo University of Science |
Keywords: Active BMIs, Other Neurotechnology and Brain-Related Topics
Abstract: Real-time brain-computer interfaces (BCIs) require fast and accurate alpha-band power estimation. However, alpha waves are low-frequency components, which makes their rapid extraction using conventional methods, such as bandpass filters (BPF), difficult. This study applies a nonlinear finite-time observer (FTO) to extract low-frequency features from electroencephalogram signals. Experimental results demonstrate that the proposed FTO achieves a convergence time of 4 ms compared to 700 ms with BPF. Moreover, the FTO captures -8 dB attenuation as motor imagery-related alpha-band power, twice the magnitude obtained with conventional methods (-4 dB). Thus, compared to conventional approaches, the FTO enhances both responsiveness and clarity in detecting motor imagery-related reductions in alpha-band power.
|
|
09:45-10:00, Paper Tu-S1-BMI.WS.6 | |
Towards Visual-Fatigue-Free BCI with Imperceptible Visual Evoked Potentials (I-VEP) (I) |
|
Fodor, Milan Andras | Rhine-Waal University of Applied Sciences |
Volosyak, Ivan | Rhine-Waal University of Applied Sciences |
Keywords: Active BMIs, BMI Emerging Applications, Other Neurotechnology and Brain-Related Topics
Abstract: A brain–computer interface (BCI) enables direct control of external devices through neural activity. Among BCI paradigms, visual evoked potentials (VEPs) are widely used because they harness the brain’s response to visual stimuli to deliver an intuitive and inclusive control interface. However, these systems typically rely on low-frequency flickering stimuli to evoke strong neural responses, which can cause discomfort and fatigue. While previous attempts have sought to mitigate this discomfort, eliminating it entirely would be ideal. This could be achieved with high-frequency flicker above the critical fusion frequency threshold, which is invisible to the human eye yet still evokes a distinct neural imprint. While this neural response is distinguishable from that of a non-flickering stimulus, reliably differentiating between two different high-frequency stimuli remains challenging. Our novel Imperceptible VEP (I-VEP) paradigm combines imperceptible high-frequency flicker segments with static (non-flickering) intervals to create patterns analogous to steady-state VEP (SSVEP) and code-modulated VEP (cVEP). From this paradigm, we define two variants: Imperceptible Steady-State VEP (I-SSVEP) and Imperceptible Code-Modulated VEP (I-cVEP). Our initial I-cVEP experiments demonstrate its feasibility, achieving a mean bit-wise accuracy of over 92% across 27 participants. These findings may represent a significant first step toward fully comfortable visual-stimulus BCIs, enabling broader adoption and improved usability.
|
|
Tu-KN2 |
Hall F |
Keynote 2 |
Keynote |
Chair: Kovacs, Levente | Obuda University |
|
10:30-11:30, Paper Tu-KN2.1 | |
Keynote Talk: Personal Data Privacy – Especially Location |
|
Krumm, John | University of Southern California |
Keywords: Big Data Computing
Abstract: John Krumm graduated from the School of Computer Science at Carnegie Mellon University in 1993 with a PhD in robotics and a thesis on texture analysis in images. He worked at the Robotics Center of Sandia National Laboratories in Albuquerque, New Mexico for the next four years. His main projects there were computer vision for object recognition for use in robots and vehicles. He was at Microsoft Research in Redmond, Washington, USA for 25 years, starting in 1997. He is currently an associate director of the Integrated Media Systems Center in the Viterbi School of Engineering at the University of Southern California. His research focuses on understanding people's location and personal data privacy. In 2017 he received a 10-year impact award for a paper on location privacy from the ACM UbiComp conference, and another from the same conference in 2021. He received the best paper award at the ACM SIGSPATIAL conference in 2022 and at the Mobile Data Management conference in the same year. His h-index on Google Scholar is 75. He is an inventor on 82 U.S. patents. Dr. Krumm was a PC chair for UbiComp 2007, ACM SIGSPATIAL 2013, and ACM SIGSPATIAL 2014. He is a past coeditor in chief of the Journal of Location Based Services and past associate editor for ACM Transactions on Spatial Algorithms and Systems. He currently serves on the editorial board of IEEE Pervasive Computing Magazine. He is the chair of the executive committee of ACM SIGSPATIAL and part of the Science Advisory Committee of the Geospatial Science and Human Security Division at Oak Ridge National Laboratory. He is an editorial fellow for the Paris Institute for Advanced Study.
|
|
Tu-S2-T1 |
Hall F |
Deep Learning 4 |
Regular Papers - Cybernetics |
Chair: Dey, Swarnava | TCS Research, Tata Consultancy Services Limited, Kolkata, India |
Co-Chair: Meng, Lin | Ritsumeikan University |
|
11:30-11:45, Paper Tu-S2-T1.1 | |
LA-GMF-Based Framework for Transparent Alzheimer's Diagnosis |
|
Braga de Albuquerque Maranhão, Giullia | Federal University of Pernambuco |
Mello, Carlos | Universidade Federal De Pernambuco |
Keywords: Deep Learning, Image Processing and Pattern Recognition, AI and Applications
Abstract: Alzheimer's disease (AD) is the leading cause of dementia in the elderly, with early diagnosis being a primary goal. Methods based on convolutional neural networks and attention, such as the LA-GMF model, offer strong performance and interpretability. We present an LA-GMF-based framework aimed at improving transparency and reliability. After determining the optimal number of attention heads through experiments with an ADNI dataset for cross-validation and the MIRIAD dataset as a hold-out set, we introduced a confidence threshold to reject uncertain predictions. Our proposed framework achieved 99.38% accuracy and 98.97% sensitivity on accepted samples, outperforming the original LA-GMF while retaining 75.7% of the test set, enhancing system transparency to potentially achieve stakeholder trust.
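The confidence-threshold rejection described in this abstract can be illustrated with a small Python sketch. The threshold value, the function name, and the use of the maximum class probability as the confidence score are assumptions for illustration; the abstract does not state how the authors compute confidence.

```python
def evaluate_with_rejection(probabilities, labels, threshold=0.9):
    """Return (coverage, accuracy on accepted samples).

    probabilities: list of per-class probability lists, one per sample
    labels:        list of true class indices
    threshold:     hypothetical minimum confidence needed to accept a prediction
    """
    # Keep only samples whose top class probability reaches the threshold.
    accepted = [(p, y) for p, y in zip(probabilities, labels)
                if max(p) >= threshold]
    coverage = len(accepted) / len(labels) if labels else 0.0
    if not accepted:
        return coverage, None
    correct = sum(1 for p, y in accepted if p.index(max(p)) == y)
    return coverage, correct / len(accepted)
```

Raising the threshold usually trades coverage (the retained share of the test set) for accuracy on the accepted samples, which is the trade-off the reported 75.7% retention figure reflects.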
|
|
12:00-12:15, Paper Tu-S2-T1.3 | |
Multi-Scale Token Pruning in Mask2Former for Semantic Segmentation |
|
Ishibashi, Ryuto | Ritsumeikan University |
Meng, Lin | Ritsumeikan University |
Deng, Mingcong | Tokyo University of Agriculture and Technology |
Keywords: Deep Learning, Image Processing and Pattern Recognition, Application of Artificial Intelligence
Abstract: Although Transformer has been successfully applied to Vision tasks in various fields, its large computational cost and performance degradation due to divergence from the language task are issues to be addressed. In this paper, we introduce token pruning to Mask2Former, a state-of-the-art segmentation method, to reduce computational cost and improve recognition accuracy without additional training. Multi-Scale Token Pruning (MSTP) works effectively on the multi-scale feature tokens of Mask2Former and can be universally implemented with various conventional token pruning methods. Experimental results show that introducing Top-K (norm+rand) MSTP into the Mask2Former of Swin-L backbone achieves +0.28 (56.31) mIoU on the ADE20K benchmark with +5.7% speed up. With this improvement, Mask2Former+MSTP can achieve mIoU equivalent to the large and powerful BEiT-UperNet with 1/4 of the computational complexity. In addition, +0.04 (57.86) PQ for COCO panoptic and +0.19 (63.36) mIoU for Mapillary Vistas are achieved, showing particular usefulness in complex semantic tasks with a large number of categories.
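As a rough illustration of Top-K token pruning, the sketch below keeps the `keep` tokens with the largest L2 norm and preserves their original order. It is a generic norm-based criterion written in plain Python, not the authors' MSTP implementation; in particular, the "norm+rand" variant named in the abstract is not specified there and is not reproduced here.

```python
import math


def topk_norm_prune(tokens, keep):
    """Keep the `keep` tokens with the largest L2 norm.

    tokens: list of feature vectors (lists of floats)
    keep:   number of tokens to retain (hypothetical parameter name)
    Returns (kept_tokens, kept_indices) in the original token order.
    """
    norms = [math.sqrt(sum(v * v for v in tok)) for tok in tokens]
    # Rank token indices by descending norm, then restore positional order.
    ranked = sorted(range(len(tokens)), key=lambda i: norms[i], reverse=True)
    kept = sorted(ranked[:keep])
    return [tokens[i] for i in kept], kept
```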
|
|
12:15-12:30, Paper Tu-S2-T1.4 | |
Anchor-Guided Contrastive Learning for User Identification Based on Video Preferences |
|
Iffath, Fariha | University of Calgary |
Hsu, Gee-Sern | National Taiwan University of Science and Technology |
Gavrilova, Marina | University of Calgary |
Keywords: Deep Learning, Image Processing and Pattern Recognition, Machine Learning
Abstract: User identification through aesthetic preferences has gained attention as a promising direction in social behavioral biometrics, offering a non-invasive and privacy-conscious alternative to traditional physiological identifiers. Unlike still images or single-modal inputs, video-based aesthetic preferences provide richer, temporally-aware insights into user behavior, enabling a deeper understanding of personal taste and style. Despite this potential, existing approaches often treat preference items independently and fail to capture the interrelationships within a user’s preference set. To address these limitations, this paper introduces Preference-Aware Set Encoding with Contrastive Personalization (PASE), a novel deep learning framework designed to model user identity based on structured preferred video sets. The proposed method integrates a Cross-Video Attention Encoder to learn co-preference patterns across videos, a User-Anchor Contrastive Loss to align personalized embeddings, and a Cross-Set Mixup Regularization technique to improve generalization by simulating diverse preference scenarios. Evaluation on a curated aesthetic dataset demonstrates that PASE achieves 98.38% identification accuracy, outperforming unimodal and multi-modal baselines. These findings highlight the unique advantages of leveraging video-based aesthetic information for biometric identification, particularly in applications demanding both accuracy and user-friendly privacy safeguards.
|
|
12:30-12:45, Paper Tu-S2-T1.5 | |
How Can One Choose the Best CAM-Based Explainability Method for a CNN Model? |
|
Costa, Daniel da Silva | Federal University of the State of Rio De Janeiro - UNIRIO |
de Souza Moura, Pedro Nuno | Federal University of the State of Rio De Janeiro - UNIRIO |
Alvim, Adriana Cesário de Faria | Federal University of the State of Rio De Janeiro - UNIRIO |
Keywords: Deep Learning, Neural Networks and their Applications, Image Processing and Pattern Recognition
Abstract: In recent years, several advances have been observed in Deep Learning with surprising results. Models in this area have been increasingly used in numerous applications, including those sensitive to human life, which require clear explanations and justifications. This has encouraged several types of research into the explainability of neural networks. Various explainability methods have been proposed, but not many metrics to evaluate these methods. The most commonly used metric is the Intersection over Union (IoU). It is applied between two bounding boxes to measure the similarity between them. However, because the results of the explainability methods, called saliency maps, do not have a known shape, we hypothesise that there must be a better metric for finding the explainability method whose results best resemble human perception. This work proposes using different metrics to assess the similarity between human perception and the explanation saliency maps to find a better metric. To this end, an investigation was conducted employing a subset of the ImageNet dataset consisting of the Chihuahua images. Several CAM-based explainability methods were used to generate saliency maps, which were compared with human perception of the most important parts of the Chihuahuas in each image. Alignment was measured by applying distance metrics between the bounding box of human annotations and the saliency maps produced by each explainability method. Rankings of the best saliency maps were created using the results of the distance metrics and compared to the ranking obtained using people's choice, collected through crowdsourcing, of the best explanation saliency maps for each selected image. Comparison between rankings was performed using the Rank-Biased Overlap (RBO) metric. The results indicate the feasibility of our method to find the explainability method that best resembles human perception. In our experiments, the two metrics that best resemble human perception were Manhattan and Correlation. Besides, the best explainability methods regarding human perception were LayerCAM, Score-CAM, and IS-CAM.
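For reference, the Intersection over Union (IoU) baseline that this abstract starts from can be computed for two axis-aligned boxes as in the sketch below. The `(x1, y1, x2, y2)` corner convention and the function name are assumptions for illustration; the abstract does not describe the authors' box format.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x1, y1, x2, y2) with x1 < x2 and y1 < y2 (assumed convention).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0
```

Because IoU needs a box on both sides, a free-form saliency map has to be reduced to a box before it can be scored, which is the limitation that motivates the distance metrics studied in the paper.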
|
|
12:45-13:00, Paper Tu-S2-T1.6 | |
Tuning Distillation to Generate Edge-Friendly All-Rounder Models |
|
Dey, Swarnava | TCS Research, Tata Consultancy Services Limited, Kolkata, India |
Mukherjee, Arijit | TCS Research, Tata Consultancy Services Limited, Kolkata, India |
Pal, Arpan | Tata Consultancy Services |
Keywords: Deep Learning, Representation Learning, Transfer Learning
Abstract: Deep learning models for edge deployments must be small, efficient, and robust. Knowledge distillation from large foundation models (FMs) can help, as they capture rich, transferable representations from large, multimodal datasets. However, two key challenges hinder this process: (1) extreme model compression based on a test dataset often leads to brittleness under distribution shifts, and (2) the distribution gap between an FM’s training data and a small model’s target dataset makes standard distillation methods ineffective. We propose a tunable dynamic loss curriculum for knowledge distillation to address these issues. Experiments show that small encoders trained with our approach achieve balanced transfer learning performance across both primary and out-of-distribution tasks. For instance, a tiny GPT-like model effectively transfers to sentiment classification and language modeling. Likewise, a tiny ResNet trained on CIFAR-10 achieves 10% higher accuracy on corrupted CIFAR-10 than state-of-the-art baselines for tiny models. While it trails dedicated robustness training by 4%, our method ensures superior adaptability across diverse datasets and tasks.
|
|
Tu-S2-T2 |
Hall N |
Application of Artificial Intelligence 4 |
Regular Papers - Cybernetics |
Chair: Souza Britto Jr, Alceu | Pontifícia Universidade Católica Do Paraná (PUCPR) |
Co-Chair: Hou, Yukun | Institute of Software Chinese Academy of Sciences |
|
11:30-11:45, Paper Tu-S2-T2.1 | |
Edge-Cloud Collaborative Multi-Axis Servo Coordination Control: A Reinforcement Q-Knapsack Approach |
|
Wu, Hao | Nanjing University of Aeronautics and Astronautics, Nanjing, Chi |
Zhu, Haifeng | Nanjing University of Aeronautics and Astronautics, Nanjing, Chi |
Chen, Jiayuan | Nanjing University of Aeronautics and Astronautics, Nanjing, Chi |
Zheng, Hao | Nanjing University of Aeronautics and Astronautics, Nanjing, Chi |
Yi, Changyan | Nanjing University of Aeronautics and Astronautics, Nanjing, Chi |
Keywords: Cloud, IoT, and Robotics Integration, AI and Applications, Deep Learning
Abstract: Multi-axis servo coordinated control enables multiple axes to track distinct target trajectories simultaneously. Through networked collaboration, these axes together can achieve complex tasks with enhanced adaptability and flexibility. In this paper, we introduce an edge-cloud collaborative multi-axis servo coordination control framework that exploits the advantages of both edge and cloud computing to optimize the coordinated control performance of a multi-axis servo system. Considering that axis states and control signal sequences are transmitted over a limited shared wireless channel, we formulate a long-term combinatorial decision problem under stringent communication resource constraints. A novel reinforcement Q-knapsack approach is proposed, which solves a grouped knapsack problem at each time step concerning the number of consumed slots and action Q-values, while deep reinforcement learning is utilized to optimize the action Q-value estimation in the long run. Simulation experiments demonstrate that the proposed approach is effective and outperforms its counterparts.
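The per-step grouped knapsack mentioned in this abstract can be sketched as a small dynamic program: each axis (group) picks exactly one action, every action consumes some channel slots and carries a Q-value, and the total slot count must stay within a budget. This is a textbook multiple-choice knapsack in Python under those assumptions, not the authors' algorithm; the data layout and names are hypothetical, and the Q-values would come from a learned estimator in the paper's setting.

```python
def grouped_knapsack(groups, budget):
    """Pick one (slots, q_value) option per group, maximizing total Q.

    groups: list of groups; each group is a list of (slots, q_value) tuples
    budget: total number of slots available at this time step
    Returns (best_total_q, chosen_option_index_per_group), or (None, [])
    when no feasible selection exists.
    """
    neg_inf = float("-inf")
    best = [neg_inf] * (budget + 1)   # best[s]: max Q using exactly s slots
    best[0] = 0.0
    history = []                      # back-pointers, one table per group
    for options in groups:
        new_best = [neg_inf] * (budget + 1)
        picked = [None] * (budget + 1)
        for used in range(budget + 1):
            if best[used] == neg_inf:
                continue
            for idx, (slots, q) in enumerate(options):
                total = used + slots
                if total <= budget and best[used] + q > new_best[total]:
                    new_best[total] = best[used] + q
                    picked[total] = (idx, used)
        best = new_best
        history.append(picked)
    end = max(range(budget + 1), key=lambda s: best[s])
    if best[end] == neg_inf:
        return None, []
    # Walk the back-pointers from the last group to the first.
    selection = []
    used = end
    for picked in reversed(history):
        idx, used = picked[used]
        selection.append(idx)
    selection.reverse()
    return best[end], selection
```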
|
|
11:45-12:00, Paper Tu-S2-T2.2 | |
Two-Stage Fire Detection and Analysis Method Based on the Collaboration of Large and Small Models |
|
Cao, Yuzhong | Beijing City University |
Hou, Yukun | Institute of Software Chinese Academy of Sciences |
Cao, YanQi | Meiji University |
Chen, Runqi | Beijing City University |
Cheng, Qiya | Beijing Forestry University |
Ma, Yulin | Beijing City University |
Zhang, Yifei | Beijing City University |
Zhao, ZeKang | Beijing City University |
Keywords: Application of Artificial Intelligence, Deep Learning, Machine Vision
Abstract: Fire detection faces challenges in complex scenarios (e.g., industrial plants, outdoor environments, and urban buildings), including blurred features of small targets, background interference, and shape variability. It also lacks support for false alarm verification and situation analysis, making it difficult to meet requirements for detection accuracy and emergency response. To address these issues, this paper proposes a two-stage fire detection and analysis method based on the collaboration between large and small models, aiming to achieve end-to-end optimization from efficient detection to intelligent emergency response. In the first stage, we propose MYGA-YOLO, a fire detection model based on YOLOv11n. It incorporates RFAConv to dynamically adapt to shape variability, the EMA attention mechanism to suppress background interference, and RepGFPN for efficient multi-scale feature fusion to enhance small target detection. These improvements significantly boost detection accuracy and robustness. Experimental results show that its accuracy, recall, and mAP@50 are improved by 3.7%, 2.9%, and 3.5%, respectively, compared to YOLOv11n. In the second stage, a multimodal large language model combined with Retrieval-Augmented Generation technology is employed. Through cross-modal reasoning, it integrates images, sensor data, and fire-fighting knowledge to perform false alarm verification, fire situation analysis, and generate emergency response strategies. By leveraging the collaboration between the computational efficiency of small models and the intelligent reasoning capabilities of large models, this approach addresses the limitations of existing methods in terms of detection accuracy and fire emergency support, offering an innovative solution for fire detection and emergency response in complex scenarios.
|
|
12:00-12:15, Paper Tu-S2-T2.3 | |
Adaptive Pooling and Dynamic Triplet Loss for Image-Text Retrieval |
|
Hu, Jinshuai | Shanghai University |
Wei, Xiao | Shanghai University |
Wang, Hongfei | Shanghai University |
Keywords: Application of Artificial Intelligence, Deep Learning, Multimedia Computation
Abstract: Image-text retrieval (ITR) is a fundamental task in multimodal learning. It aims to bridge image and text by constructing a shared embedding space that achieves accurate semantic alignment across the two modalities. Most existing work has been devoted to designing tailored cross-attention modules to pursue retrieval accuracy but ignores the learning potential of the model network architecture. This work introduces a dual encoder that uses global information to enhance the feature interaction between visual regions. An Adaptive Pooling (AP) module enables the model to automatically learn the optimal aggregation strategy from multiple perspectives based on the local features. We further propose a Dynamic Triplet Loss (DTL) to adjust the learning objective of the model network dynamically for efficient training. Experimental results on two benchmark datasets, Flickr30K and MS-COCO, demonstrate that our APDTL model achieves state-of-the-art performance. Our code is released at github.com/jinshuaihu/APDTL.
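For context, a standard hinge-based triplet loss with hardest-negative mining, the kind of objective commonly used in image-text retrieval, can be written as below. This Python sketch shows only the conventional static form; how the Dynamic Triplet Loss adjusts the objective is not described in the abstract and is not modeled here. The margin value and function name are assumptions.

```python
def triplet_loss(sim_pos, sim_negs, margin=0.2):
    """Hinge triplet loss on similarity scores.

    sim_pos:  similarity between a query and its matching item
    sim_negs: similarities between the query and non-matching items
    margin:   hypothetical margin by which the positive should win
    """
    # Use the hardest negative, i.e. the most similar non-matching item.
    hardest = max(sim_negs)
    return max(0.0, margin - sim_pos + hardest)
```

The loss is zero once the matching pair beats every negative by at least the margin, so only violating triplets contribute to training.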
|
|
12:15-12:30, Paper Tu-S2-T2.4 | |
RCD-DETR: A Lightweight Real-Time Detection Transformer for Conveyor Belt Egg Detection |
|
Shen, Yidong | Hangzhou City University |
Dai, Miaoyang | Hangzhou City University |
Cai, Yi | Hangzhou City University |
Cai, Jianping | Hangzhou City University |
Wei, Lina | Hangzhou City University |
Keywords: Application of Artificial Intelligence, Deep Learning, Neural Networks and their Applications
Abstract: Precise detection of eggs on conveyor belts in industrial automated production lines is of significant importance for reducing detection error rates and improving production efficiency. However, existing detection models face considerable challenges in detection accuracy and real-time processing when handling practical scenarios such as densely arranged eggs, complex background interference, and high-speed conveyor belts. This paper proposes a lightweight real-time detection Transformer model called RCD-DETR, consisting of three key innovative components: (1) a lightweight Reparam Context-aware Network (RCNet) that effectively balances feature extraction capability and computational efficiency; (2) a Context-Sensitive Refinement Feature Pyramid Network (CSRFPN) with enhanced contextual awareness and multi-scale feature representation capabilities, strengthening the model's ability to recognize small and dense objects; and (3) a Dilated Reparam Bottleneck C3 (DRepC3) module that expands the receptive field range while further reducing computational resource requirements. Experimental evaluation indicates that, compared to the RT-DETR baseline model, the proposed method achieves improved detection accuracy while reducing computational complexity by 54.1%, decreasing parameter count by 51.3%, and compressing model size by 50.8%. The method achieves an excellent balance between accuracy, inference speed, and resource consumption, making it suitable for deployment on resource-constrained edge computing devices for precise real-time egg detection on conveyor belts.
|
|
12:30-12:45, Paper Tu-S2-T2.5 | |
A Federated Learning Model for Privacy-Preserving and Cross-Domain Kidney Stone Detection in Medical Imaging |
|
Sotomaior, Lucas | Pontifícia Universidade Católica Do Paraná (PUCPR) |
Brasão da Fonseca, Luiz Fernando | Pontifícia Universidade Católica Do Paraná (PUCPR) |
Chiconelli Zangari, Matheus Antonio | Hospital Nossa Senhora Das Graças |
Krebs, Rodrigo | Hospital Nossa Senhora Das Graças |
Souza Britto Jr, Alceu | Pontifícia Universidade Católica Do Paraná (PUCPR) |
Kugler Viegas, Eduardo | Pontifícia Universidade Católica Do Paraná |
Keywords: Application of Artificial Intelligence, Deep Learning, Image Processing and Pattern Recognition
Abstract: Kidney stones significantly impact healthcare systems, with diagnosis typically requiring time-consuming Computed Tomography (CT) scan consultations between physicians and radiologists, often delaying patient care. Achieving a quick and accurate diagnosis is essential to ensure timely and effective treatment, which has motivated the development of Deep Neural Network (DNN)-based approaches for automated kidney stone detection. However, building effective models remains challenging, as it often requires access to large and diverse datasets that are siloed across institutions, and sharing such medical data is rarely feasible due to strict privacy regulations and patient confidentiality concerns. This paper proposes a privacy-preserving Federated Learning (FL) framework that enables multiple medical institutions to collaboratively train a DNN model without sharing sensitive patient data. Each institution trains a local model on its private dataset, and a centralized trusted server securely aggregates model parameters. We evaluate our approach using abdominal CT scan image datasets from two distinct institutions. Experimental results demonstrate that our proposed model achieves high classification accuracy within the same training environment, with an F1-score of up to 0.94. In addition, in cross-dataset evaluations, our approach outperforms traditional centralized baselines, showing significantly lower performance degradation while preserving patient privacy.
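Illustrative note: the server-side aggregation step described in the abstract can be sketched as a sample-size-weighted average of client parameters. The abstract does not name the aggregation rule, so the FedAvg-style weighting here is an assumption, as are the function name `federated_average` and the parameter layout (a dict of flat float lists). Secure transport and the local training loop are omitted.

```python
from typing import Dict, List

# Hypothetical parameter layout: layer name -> flat list of weights.
Params = Dict[str, List[float]]


def federated_average(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Weighted average of client parameters (FedAvg-style, assumed rule).

    Each client's contribution is weighted by the number of local samples.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one sample count per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    aggregated: Params = {}
    for name in client_params[0]:
        length = len(client_params[0][name])
        aggregated[name] = [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(length)
        ]
    return aggregated


if __name__ == "__main__":
    # Two institutions with 1 and 3 local samples respectively.
    hospital_a = {"w": [0.0, 4.0]}
    hospital_b = {"w": [4.0, 0.0]}
    print(federated_average([hospital_a, hospital_b], [1, 3]))
```

Only parameters and sample counts cross the institutional boundary in this sketch; the raw CT images stay local, which is the property the abstract emphasizes.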
|
|
12:45-13:00, Paper Tu-S2-T2.6 | |
A Multi-Scale Decomposition and Fusion Framework Utilizing Mamba for Enhanced Time Series Forecasting |
|
Yu, Wenjun | Shanghai University of International Business and Economics |
Li, Wen | Shanghai University of International Business and Economics |
Li, Jiyanglin | Guizhou University of Finance and Economics |
Zheng, Kun | Shanghai Lixin University of Accounting and Finance |
Du, Heming | The University of Queensland |
Du, Shouguo | Shanghai Municipal Big Data Center |
You, Jinhong | Shanghai University of Finance and Economics |
Tang, Yiming | Shanghai Lixin University of Accounting and Finance |
Keywords: Application of Artificial Intelligence, Deep Learning, Neural Networks and their Applications
Abstract: Multivariate time series forecasting presents a significant challenge across various fields, requiring accurate predictions of future values based on multiple interrelated time series. Recent research has shown that the Channel Independent (CI) approach, which processes each sequence independently, can improve prediction accuracy, but neglecting the relationships between sequences may result in inadequate generalization. Channel Dependent (CD) methods integrate information from all sequences, but may compromise prediction accuracy by mixing potentially unrelated data. In this paper, we propose a novel framework called MDF-Mamba, which employs a multi-scale decomposition strategy. At fine scales, it uses the CI approach to capture the unique characteristics of individual sequences, thereby enhancing model robustness. At coarse scales, it uses the CD approach to capture correlations between sequences, improving the model's generalization capabilities. MDF-Mamba thus considers both the individual characteristics of sequences and their interrelationships, balancing the Channel Independent and Channel Dependent approaches to improve multivariate time series forecasting performance. Extensive experimental results across multiple real-world time series datasets demonstrate that MDF-Mamba achieves state-of-the-art performance.
|
|
Tu-S2-T4 |
Room 0.12 |
Robotic Systems 2 |
Regular Papers - SSE |
Chair: Li, Tzuu-Hseng S. | National Cheng Kung University |
Co-Chair: Winkler, Simon | MCI Internationale Hochschule GmbH |
|
11:30-11:45, Paper Tu-S2-T4.1 | |
Development of a Robotic Manipulator for HMD-Guided Stereoscopic Vision Control |
|
Dittrich, Meno | Management Center Innsbruck |
Winkler, Simon | MCI Internationale Hochschule GmbH |
Kim, Yeongmi | MCI Internationale Hochschule GmbH |
Keywords: Robotic Systems, Mechatronics
Abstract: Current training and development platforms in robot-assisted surgery often lack an affordable vision system. In this paper, we present a cost-effective stereoscopic camera system for surgical training. Our approach utilizes a double parallelogram mechanism to achieve an approximate remote center of motion with three degrees of freedom: two revolute joints for orbital movement and one prismatic joint for depth positioning. The system incorporates two cameras with integrated illumination to provide depth perception. It can be steered hands-free using a head-mounted display, demonstrating the development potential the platform provides. A proof-of-concept evaluation focused on the control system, demonstrating the system’s ability to achieve precise positioning across all three axes, with errors below 1° for the revolute joints and below 1 mm for the prismatic joint. We hope that this desktop-compatible, modular platform significantly reduces barriers to surgical robotics in research, development and training.
|
|
11:45-12:00, Paper Tu-S2-T4.2 | |
Development of an Autonomous Mobile Robotic System for Efficient and Precise Disinfection |
|
Ou, Ting-Wei | Graduate Degree Program of Robotics, National Yang Ming Chiao Tu |
Jiang, Hia-Hao | Institute of Electrical and Control Engineering, National Yang M |
Huang, Guan-Lin | Department of Mechanical Engineering, National Yang Ming Chiao T |
Young, Kuu-Young | National Yang Ming Chiao Tung University |
Keywords: Robotic Systems, Modeling of Autonomous Systems, Mechatronics
Abstract: The COVID-19 pandemic has severely affected public health, healthcare systems, and daily life, especially amid shortages of resources and workers. This crisis has underscored the urgent need for automation in hospital environments, particularly for disinfection, which is crucial to controlling virus transmission and improving the safety of healthcare personnel and patients. Ultraviolet (UV) light disinfection, known for its high efficiency, has been widely adopted in hospital settings. However, most existing research focuses on maximizing UV coverage while paying little attention to the impact of human activity on virus distribution. To address this issue, we propose a mobile robotic system for UV disinfection that focuses on virus hotspots. The system prioritizes disinfection in high-risk areas and employs an approach for optimized UV dosage to ensure that all surfaces receive an adequate level of UV exposure while significantly reducing disinfection time. It not only improves disinfection efficiency, but also minimizes unnecessary exposure in low-risk areas. In two representative hospital scenarios, our method achieves the desired disinfection effectiveness while markedly reducing the time spent. The video of the experiment is available at: https://youtu.be/wHcWzOcoMPM.
|
|
12:00-12:15, Paper Tu-S2-T4.3 | |
A Hybrid Learning and Optimization Framework for Reactive Whole-Body Motion Planning of Mobile Manipulators |
|
Zhang, Chenyu | Institute of Automation, Chinese Academy of Sciences |
Shiying, Sun | Institute of Automation, Chinese Academy of Sciences(CASIA) |
Liu, Kuan | Institute of Automation, Chinese Academy of Sciences |
Zhou, Chuanbao | CASIA |
Zhao, Xiaoguang | Institute of Automation, Chinese Academy of Sciences |
Tan, Min | Institute of Automation, Chinese Academy of Sciences |
Huang, Yanlong | University of Leeds |
Keywords: Robotic Systems, Modeling of Autonomous Systems, System Modeling and Control
Abstract: As an important branch of embodied artificial intelligence, mobile manipulators are increasingly applied in intelligent services, but their redundant degrees of freedom also limit efficient motion planning in cluttered environments. To address this issue, this paper proposes a hybrid learning and optimization framework for reactive whole-body motion planning of mobile manipulators. We develop the Bayesian distributional soft actor-critic (Bayes-DSAC) algorithm to improve the quality of value estimation and the convergence of learning. Additionally, we use a quadratic programming method to calculate and constrain joint velocities, thereby improving the safety of the whole-body motion planning. We conduct experiments and compare against a standard benchmark. The experimental results verify that our proposed framework significantly improves the efficiency of reactive whole-body motion planning, reduces the planning time, and improves the success rate of motion planning. Additionally, the proposed reinforcement learning method ensures a rapid learning process in the whole-body planning task. The novel framework allows mobile manipulators to adapt to complex environments more safely and efficiently.
|
|
12:15-12:30, Paper Tu-S2-T4.4 | |
Better Safe Than Sorry: Enhancing Arbitration Graphs for Safe and Robust Autonomous Decision-Making |
|
Spieker, Piotr Franciszek | Dotscene GmbH |
Le Large, Nick | Karlsruhe Institute of Technology |
Lauer, Martin | Karlsruhe Institute of Technology |
Keywords: Robotic Systems, System Architecture, Autonomous Vehicle
Abstract: This paper introduces an extension to the arbitration graph framework designed to enhance the safety and robustness of autonomous systems in complex, dynamic environments. Building on the flexibility and scalability of arbitration graphs, the proposed method incorporates a verification step and structured fallback layers in the decision-making process. This ensures that only verified and safe commands are executed while enabling graceful degradation in the presence of unexpected faults or bugs. The approach is demonstrated using a Pac-Man simulation and further validated in the context of autonomous driving, where it shows significant reductions in accident risk and improvements in overall system safety. The bottom-up design of arbitration graphs allows for an incremental integration of new behavior components. The extension presented in this work enables the integration of experimental or immature behavior components while maintaining system safety by clearly and precisely defining the conditions under which behaviors are considered safe. The proposed method is implemented as a ready-to-use, header-only C++ library, published under the MIT License. Together with the Pac-Man demo, it is available at github.com/KIT-MRT/arbitration_graphs.
|
|
12:30-12:45, Paper Tu-S2-T4.5 | |
Implementation of RGBD-CNN and Autoencoder-Based Grasping with 3D Model-Aided Orientation Estimation on a Service Robot |
|
Chang, Kai-Chieh | National Cheng Kung University |
Lin, Han-Yu | National Cheng Kung University |
Li, Tzuu-Hseng S. | National Cheng Kung University |
Keywords: Robotic Systems, System Modeling and Control, Mechatronics
Abstract: This paper proposes a 3D object grasping point learning system consisting of object coordinate construction and grasping point learning. An RGBD Convolutional Neural Network (RGBD-CNN) is proposed to classify the orientation type of objects. An object model and the iterative closest point (ICP) algorithm are then applied to estimate the object pose, from which the object coordinate is constructed. To learn the object grasping region, the normal vector images and depth image of the object are obtained first. Then, the grasping range of the end effector (palm) is simulated on these images. Finally, a Convolutional Autoencoder (CAE) is applied to encode the physical characteristics of the simulated palm image. By comparing the features of the simulated palm against the database through a 3D KD-tree, the proposed method can evaluate the grasping points. By integrating object coordinates and learned grasping points, the robot plans a suitable grasping point based on the appointed task. It is worth mentioning that most research emphasizes either object orientation justification or grasp point generation alone, whereas this research considers object pose estimation and grasping point generation together. Therefore, the robot can adapt to different task situations in real time. Real experiments demonstrate that the proposed method can recognize various kinds of shapes.
|
|
Tu-S2-T5 |
Room 0.14 |
Human-Machine Interaction 2 |
Regular Papers - HMS |
Chair: Mukherjee, Ranjan | Michigan State University |
Co-Chair: Lee, Chang-Shing | National University of Tainan |
|
11:30-11:45, Paper Tu-S2-T5.1 | |
Maintaining Seam Allowance in Sewing Using Haptic Feedback |
|
Thin, Kai | Michigan State University |
Ghosh, Sneha | Michigan State University |
Rockwell, Kyle | Michigan State University |
Mukherjee, Ranjan | Michigan State University |
Owen, Charles | Michigan State University |
Ranganathan, Rajiv | Michigan State University |
Keywords: Human-Machine Interaction, Haptic Systems
Abstract: Haptic feedback is proposed in sewing machine operations with the objective of reducing errors in seam allowance. Similar to the lane departure warning system in modern cars, such modality of feedback is expected to alert the operator in real time and improve the performance of the human control system. A traditional sewing machine was outfitted with a camera, a single board computer, and an eccentric motor. Experimental results demonstrate the potential for deployment of such systems in apparel manufacturing industries and provide opportunities for greater inclusion of people with disabilities in the workforce.
|
|
11:45-12:00, Paper Tu-S2-T5.2 | |
Look Where You Are Going: Evaluating the Influence of Visual Translation on Motion Sickness in Automated Vehicles |
|
Elbertse, Mitchel | TU Delft |
Wijlens, Rowenna | TU Delft |
Takamatsu, Atsushi | Nissan Motor Co., Ltd |
Makita, Mitsuhiro | Nissan Motor Co., Ltd |
Sato, Hikaru | Ritsumeikan University |
Wada, Takahiro | Nara Institute of Science and Technology |
van Paassen, Marinus M | Delft University of Technology |
Mulder, Max | Delft University of Technology |
Keywords: Human-Machine Interaction, Human Factors
Abstract: The shift from driver to passenger may increase the risk of Motion Sickness (MS) for Automated Vehicle (AV) occupants. Motion anticipation, which is believed to mitigate MS, relies to a large extent on visual cues. Yet, the mechanisms through which visual information influences MS remain poorly understood. This paper investigates the effect of visual translation on MS in AVs. In a simulator study, eighteen participants experienced three ‘AV rides’ with identical repetitive braking and accelerating motion on a straight road, differing only in their ‘out-the-window’ view to manipulate the amount of global optic flow. A rural, low optic flow, ride and an urban, high optic flow, ride were compared to a baseline without visual movement. Results suggest that visual translation in both the central and peripheral view, that is congruent with inertial cues, may only slightly reduce MS, as MS seemed to be slightly less in the rural and urban rides compared to the baseline. The amount of global optic flow seems to have little effect, with minimal differences in MS between the rural and urban rides. Nonetheless, it remains uncertain to what extent the type of visual content has affected MS development. A study using more generic visuals could help isolate these effects, by eliminating any recognizable visual elements, while still manipulating the global optic flow rate.
|
|
12:00-12:15, Paper Tu-S2-T5.3 | |
Physiological Measures of the Mental Workload in Users of a Lower Limb Exosuit: A Comparison of Subjective and Objective Metrics |
|
Mariani, Giulia | Istituto Italiano Di Tecnologia (IIT) |
Lambranzi, Chiara | Istituto Italiano Di Tecnologia, Politecnico Di Milano |
Cartocci, Nicholas | Italian Institute of Technology (IIT) |
Barresi, Giacinto | University of the West of England |
Di Natali, Christian | Istituto Italiano Di Tecnologia (IIT) |
De Momi, Elena | Politecnico Di Milano |
Ortiz, Jesus | Istituto Italiano Di Tecnologia |
Keywords: Human-Machine Interaction, Human Factors, Assistive Technology
Abstract: Lower-limb exosuits are particularly relevant for individuals with some degree of mobility impairment, such as post-stroke patients or older adults with reduced movement capabilities. This study investigates the assessment of mental workload (MWL) in users of XoSoft, a lower-limb soft exoskeleton, using and comparing subjective and objective physiological metrics. The NASA-TLX questionnaire, the average percentage change in pupil size (APCPS), and the Baevsky stress index (SI) are compared. The experiments were conducted on 18 healthy subjects while walking and involved mathematical tasks to create a dual-task condition. The results show a complex interaction between task difficulty, exoskeleton activation, and pupillary dynamics, suggesting that the subject might reach a saturated condition under a high mental load. In addition, the data indicate that pupil diameter may be an objective mental workload indicator that correlates with subjective NASA-TLX questionnaires. The discordant indications from the stress index suggest that metrics at the ocular and cardiac levels respond differently to various stimuli and dynamics. The study also revealed ocular asymmetry, with the right eye more sensitive to cognitive load.
|
|
12:15-12:30, Paper Tu-S2-T5.4 | |
A Cross-Platform Study of Human Situational Awareness for Heterogeneous Low Altitude Autonomy |
|
Wang, Ziyue | Cranfield University |
Xing, Yang | Cranfield University |
Zolotas, Argyrios | Cranfield University |
Perrusquia, Adolfo | Cranfield University |
Guo, Weisi | Cranfield University |
Tsourdos, Antonios | Cranfield University |
Keywords: Human-Machine Interaction, Human Factors, Virtual/Augmented/Mixed Reality
Abstract: Human–autonomy teaming in the Low Altitude Economy (LAE) requires operators to manage both ground and aerial autonomous agents under time pressure, spatial uncertainty, and cognitive load. This study investigates how visual and haptic feedback affect operator situational awareness (SA) in simulated collision avoidance tasks involving cars and drones. A high-fidelity virtual environment was built using Unreal Engine 4 and AirSim, with haptic cues delivered through a wearable bHaptics vest. Twenty-two participants performed within-subject trials across visual-only and visual–haptic conditions. Results showed that haptic feedback significantly enhanced SA, particularly in dimensions related to information acquisition and spare mental capacity. Improvements were more consistent in car-based tasks, while drone scenarios exhibited greater inter-individual variability. These findings demonstrate the potential of multimodal interfaces to support cognitive performance and reduce platform-related disparities in operator SA. This work provides empirical evidence for designing adaptive, perception-aware interfaces in safety-critical human–autonomy teaming systems.
|
|
12:30-12:45, Paper Tu-S2-T5.5 | |
A Human-Machine-Environment Interaction System Based on VLA Model and Brain-Computer Interface |
|
Hua, Shaoyang | Eastern Institute for Advanced Study |
Zhang, Yuyang | Shanghai Jiao Tong University |
Zhang, Wenyao | Shanghai Jiao Tong University |
He, Qile | Shanghai Pinghe School |
Jin, Xin | Eastern Institute of Technology, Ningbo |
Keywords: Human-Machine Interaction, Human-Computer Interaction, Brain-Computer Interfaces
Abstract: The vision language action (VLA) model has shown extraordinary ability in robot motion planning, which has great value in human-machine-environment interaction (HMEI), especially in the fields of rehabilitation engineering and robotic arm teleoperation. This paper proposes an HMEI system that integrates the brain-computer interface (BCI) with a VLA model, enabling robots to interact with the surrounding environment according to human intentions. First, a neural decoding technique is employed to translate neural signals into intentions via a time and frequency fusion model. Then, the user's intentions are converted into a textual instruction, which subsequently guides the VLA model in generating corresponding motion trajectories for robotic arm teleoperation. Experimental results demonstrate smooth and compliant control of the robotic arm, indicating that the system can be a viable solution for HMEI applications.
|
|
12:45-13:00, Paper Tu-S2-T5.6 | |
LLM-Based Intelligent Evaluation Agent with Knowledge Graph Construction for Human-Machine Interactive Learning |
|
Lee, Chang-Shing | National University of Tainan |
Wang, Mei-Hui | National University of Tainan |
Tseng, Guan-Ying | National University of Tainan |
Yue, Chao-Cyuan | National University of Tainan |
Lin, Chun-Han | National University of Tainan |
Lin, Yi-Jun | National University of Tainan |
Kubota, Naoyuki | Tokyo Metropolitan University |
Keywords: Human-Machine Interaction, Human-Computer Interaction, Human-Machine Cooperation and Systems
Abstract: This paper proposes an Intelligent Evaluation Agent (IEA) with knowledge graph construction based on the Large Language Model (LLM) and Trustworthy AI Dialogue Engine (TAIDE) for personalized Human-Machine Interactive Learning (HMIL). The intelligent agent handles multiple tasks, such as learning data preparation and the learner’s data generation, preprocessing, analysis, and evaluation. Multi-modal data is collected from human-machine interactive activities and processed by an IEA to generate structured data stored in human learning repositories. The intelligent agent focuses on various temporal learning periods, such as macro, meso, and micro-level assessments, by integrating Human Intelligence (HI) and Machine Intelligence (MI) results, with the MI-based Genetic Algorithm and Neural Network (GANN) learning mechanism employed to optimize the intelligent evaluation model. The learning data evaluation phase aims to identify a model that best fits the group’s learning behavior through HI-based evaluation and to train it further using MI, ensuring that the trained GANN-IEA model closely approximates the HI-based model. An LLM-based knowledge graph agent also supports the evaluation process by helping teachers analyze and visualize students’ learning progress. Experimental results demonstrate that students who study diligently gain knowledge and exhibit increased interest in learning through HMIL. However, the evidence also suggests that some students who excessively rely on Generative AI (GAI) to reproduce learning content without modification become less inclined to engage in diligent study. Additionally, the proposed IEA effectively reduces teachers’ workload in assessing students’ learning status at the end of the semester and supports personalized learning through the designed HMIL model.
|
|
Tu-S2-T6 |
Room 0.16 |
System Modeling and Control 2 |
Regular Papers - SSE |
Chair: Messias, Johnnatan | MPI-SWS |
Co-Chair: Harissa, Meriam | University of Michigan-Dearborn |
|
11:30-11:45, Paper Tu-S2-T6.1 | |
A Spatiotemporal Machine Learning Framework for Ecologically-Informed Bird Sighting Prediction |
|
Harissa, Meriam | University of Michigan-Dearborn |
Amin, Jana | University of Michigan-Dearborn |
Das, Srijita | University of Michigan-Dearborn |
Song, Zheng | University of Michigan-Dearborn |
Keywords: System Modeling and Control, Decision Support Systems, Large-Scale System of Systems
Abstract: Fine-grained bird sighting prediction is crucial for advancing ecological research, informing conservation planning, and enhancing the birdwatching experience while fostering public awareness of biodiversity. The rapid expansion of citizen-based bird observation networks has led to an exponential accumulation of bird sighting records, which can be leveraged to train machine learning models for more precise predictions. However, general-purpose machine learning models often fail to incorporate ecological factors that influence bird activity, resulting in less accurate predictions. In this paper, we present an ecologically informed machine learning framework based on LightGBM that integrates spatiotemporal correlations, ecological context, and dynamic environmental variables to improve bird sighting predictions. The framework captures temporal trends using rolling windows, applies spatial smoothing to account for observation proximity, and models ecological dependencies—such as temperature-food interactions—through interaction terms. Key environmental factors, including habitat classifications, weather conditions, and seasonally adjusted food availability proxies, are dynamically incorporated to enhance ecological relevance. Evaluation results demonstrate significant improvements in predictive accuracy, with increased F1-scores compared to baseline methods. By embedding ecological principles into machine learning models, this framework enables data-driven insights that reflect real-world environmental complexities, providing a powerful tool for biodiversity monitoring and conservation strategies.
|
|
11:45-12:00, Paper Tu-S2-T6.2 | |
Unrolling the Performance of ZK-Rollups through Stochastic Modeling |
|
Melo, Carlos | Unicamp |
Messias, Johnnatan | MPI-SWS |
Miquéias, José | UFPI |
Gonçalves, Glauber | Federal University of Piauí (UFPI) |
Silva, Francisco Airton | Federal University of Piauí |
Castelo Branco Soares, André | Universidade Federal Do Piaui |
Keywords: System Modeling and Control, Distributed Intelligent Systems, Discrete Event Systems
Abstract: Sidechains offer partial solutions to Ethereum's scalability challenges; however, they introduce trade-offs related to security and implementation complexity. These limitations have been further addressed by Layer-2 solutions known as rollups, which combine off-chain computation with on-chain verification, preserving both security and decentralization on the Ethereum platform. This paper proposes a Stochastic Petri Net model to evaluate the feasibility of ZK-Rollups by analyzing their impact on throughput and latency. The results indicate that increased adoption of Layer-2 transactions can enhance system throughput by up to 20%. Conversely, latency may rise by more than 100% when larger batches are used, revealing a fundamental performance trade-off.
|
|
12:00-12:15, Paper Tu-S2-T6.3 | |
Dynamic Service Placement and Computation Resource Allocation for Cloud-Edge Computing: A Reinforcement Learning Approach |
|
Gao, Yu | Southeast University |
Tao, Jun | Southeast University |
Wang, Haotian | Southeast University |
Keywords: System Modeling and Control, Infrastructure Systems and Services, Communications
Abstract: By locating computational and storage resources at the edge of the network, the emerging paradigm of Mobile Edge Computing (MEC) yields a significant enhancement in user Quality of Experience (QoE). However, the limited resources at edge nodes, coupled with the dynamism of user requests, present a considerable challenge to decision-making in service placement and computational resource assignment. This paper investigates the resource management problem within an edge-cloud cooperative network. Aiming to minimize long-term network latency, the original problem is first modeled as a Markov Decision Process (MDP) featuring a hybrid discrete-continuous action space. To address dynamically arriving tasks and varying network conditions, we develop a Dynamic Service Placement and Computation Resource Allocation (DSPCRA) scheme based on deep reinforcement learning (DRL). DSPCRA integrates a deep deterministic policy gradient (DDPG) with a parameterized action mechanism for online decision-making. Numerous simulations confirm that the proposed scheme exhibits good convergence properties and achieves lower latency performance compared to the benchmark algorithms.
|
|
12:15-12:30, Paper Tu-S2-T6.4 | |
Monoidal Systems and Functional Programming to Reduce Software Complexity in Computer Simulation |
|
Mitsuhashi, Daichi | The University of Tokyo |
Kanno, Taro | The University of Tokyo |
Keywords: System Modeling and Control, Large-Scale System of Systems
Abstract: Modeling and computer simulation are effective tools for predicting system behavior. However, simulation software (i.e., source code), particularly for social systems, tends to be complicated and hard to reuse owing to the complexity, diversity, and ambiguity of the target systems. This makes it difficult to develop, validate, and explain simulation software. One possible solution is to formalize a mathematically rigorous simulation theory and incorporate it directly into a functional programming language. If the mathematical counterpart of the code is clear, coders can easily explain, validate, and extend it. In this study, we define a "monoidal system" as the basis for describing the physical and social systems to be modeled. We also conducted an experiment to demonstrate that introducing a monoidal system could reduce program complexity in software development.
|
|
12:30-12:45, Paper Tu-S2-T6.5 | |
Generating Heterogeneous Multi-Dimensional Data: A Comparative Study |
|
Corbeau, Michael | Institut De Recherche En Informatique De Toulouse |
Claeys, Emmanuelle | Institut De Recherche En Informatique De Toulouse |
Serrurier, Mathieu | Institut De Recherche En Informatique De Toulouse |
Zaraté, Pascale | Institut De Recherche En Informatique De Toulouse |
Keywords: Decision Support Systems, Consumer and Industrial Applications, Digital Twin
Abstract: Allocation of personnel and material resources is highly sensitive in the case of firefighter interventions. This allocation relies on simulations to experiment with various scenarios. The main objective of this allocation is the global optimization of the firefighters' response. Data generation is therefore mandatory to study various scenarios. In this study, we propose to compare different data generation methods. Methods such as Random Sampling, Tabular Variational Autoencoders, standard Generative Adversarial Networks, Conditional Tabular Generative Adversarial Networks and Diffusion Probabilistic Models are examined to ascertain their efficacy in capturing the intricacies of firefighter interventions. Traditional evaluation metrics often fall short in capturing the nuanced requirements of synthetic datasets for real-world scenarios. To address this gap, an evaluation of synthetic data quality is conducted using a combination of domain-specific metrics tailored to the firefighting domain and standard measures such as the Wasserstein distance. Domain-specific metrics include response time distribution, spatial-temporal distribution of interventions, and representation of accidents. These metrics are designed to assess data variability, the preservation of fine and complex correlations and of anomalies such as events with very low occurrence, the conformity with the initial statistical distribution, and the operational relevance of the synthetic data. The distribution has the particularity of being highly unbalanced, with none of the variables following a Gaussian distribution, which adds complexity to the data generation process.
|
|
Tu-S2-T7 |
Room 0.31 |
Brain-Based Information Communications 1 |
Regular Papers - HMS |
Chair: Ye, Li | Chinese Academy of Sciences |
Co-Chair: Guttmann-Flury, Eva | Shanghai Jiaotong University |
|
11:30-11:45, Paper Tu-S2-T7.1 | |
Obsessive Compulsive Disorder Diagnosing Based on Gramian Angular Field Combing Hemodynamic Response Features Measured by Functional Near-Infrared Spectroscopy |
|
Yang, Mingxi | Beihang University |
Wu, Di | Beihang University |
Liu, Haoran | Beihang University |
Wang, Daifa | Beihang University |
Keywords: Biometrics and Applications, Brain-based Information Communications, Affective Computing
Abstract: The diagnosis of obsessive-compulsive disorder (OCD) often relies on clinical interviews and other subjective assessments. Functional near-infrared spectroscopy (fNIRS) is a non-invasive optical technique widely used to monitor brain activity and assist clinicians in objective brain function evaluation. The advancement of time-series-to-image transformation techniques has opened new pathways for integrating fNIRS with deep learning; however, these methods often suffer from the loss of hemodynamic response function (HRF) information. To address this challenge, we propose a framework for converting fNIRS time-series data into virtual images while preserving HRF activation information. This framework utilizes globally min-max normalized Gramian Angular Fields (GAF) for virtual image transformation, which are subsequently used in deep learning classification models. In cross-validation experiments, the GAF virtual images incorporating HRF information achieved the highest average classification accuracy of 86.7%. Our findings indicate that GAF virtual images integrated with HRF features provide a novel approach for fNIRS-based OCD classification research, further promoting the clinical application of fNIRS technology.
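The time-series-to-image step described above can be sketched as follows. This is a minimal illustration of a summation-type Gramian Angular Field with min-max bounds supplied by the caller, so that one global pair of bounds can be shared across signals. The choice of the summation variant, the function name, and the way the bounds are obtained are assumptions; the abstract does not specify them, nor how the HRF features are incorporated.

```python
import math


def gramian_angular_field(series, lo, hi):
    """Summation-type Gramian Angular Field of a 1-D series.

    lo and hi are min-max bounds chosen by the caller; passing the same
    bounds for every signal gives a 'global' normalization (assumption).
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    # Rescale to [-1, 1] and clip values that fall outside the bounds.
    scaled = [max(-1.0, min(1.0, 2.0 * (x - lo) / (hi - lo) - 1.0)) for x in series]
    # Encode each value as an angle, then take cos(phi_i + phi_j).
    phi = [math.acos(v) for v in scaled]
    return [[math.cos(a + b) for b in phi] for a in phi]


# Hypothetical three-sample signal with global bounds 0 and 1.
image = gramian_angular_field([0.0, 0.5, 1.0], lo=0.0, hi=1.0)
for row in image:
    print([round(v, 3) for v in row])
```

The result is a symmetric matrix whose size equals the series length, which can then be treated as a single-channel image by a classification model.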
|
|
11:45-12:00, Paper Tu-S2-T7.2 | |
One Spiking Neuron Classification Based on Kolmogorov Complexity |
|
Lasserre, Ludivine | LAMIA, Université Des Antilles |
Doncescu, Andreï | LAMIA, Department of Mathematics and Computer Science, Universit |
Faux, Francis | IRIT, Université Paul Sabatier |
Keywords: Brain-based Information Communications, Cognitive Computing
Abstract: This paper investigates the potential of a minimalist spiking neural network for digit recognition tasks, using the MNIST dataset as a benchmark. The proposed model features a single spiking neuron utilizing the Izhikevich neuronal model, deliberately crafted without weights or a learning phase, embodying the minimalist approach of maximizing performance with minimal resources. Our approach integrates self-organizing maps with a novel optimization method for cluster selection, leveraging Kolmogorov complexity and prototype abstraction. The model achieves 90% accuracy on inverted grayscale Arabic numerals. On Roman numerals, it maintains strong 87.4% accuracy, excelling particularly on well-separated patterns. These results confirm the model's suitability for regular inputs and highlight the potential of minimalist spiking architectures for future neuromorphic systems.
|
|
12:00-12:15, Paper Tu-S2-T7.3 | |
16.9K-Parameters Lightweight Framework for Real-Time EEG-Based DoA Monitoring with Harmonic Mix and SimGate |
|
Huang, Zixuan | Chinese Academy of Sciences Shenzhen Institutes of Advanced Tech |
Ao, Licheng | Jinan University |
He, Zicong | Jinan University |
Kong, Lingyao | Jinan University |
Miao, Fen | Shenzhen Institute for Advanced Study, University of Electronic |
Ye, Li | Chinese Academy of Sciences |
Keywords: Brain-based Information Communications, Medical Informatics
Abstract: Monitoring the Depth of Anesthesia (DoA) using Electroencephalography (EEG) is essential for patient safety and optimal drug administration. However, existing methods face challenges like computational inefficiency and limited generalization, hindering real-time applicability. To address these, we propose a lightweight framework based on SimGate, a parameter-free RNN gating mechanism. SimGate simplifies gating by dynamically computing hidden states based on cosine similarity and requiring only a single linear layer and an initialization vector for training, thus reducing model complexity and enabling efficient parallel computation. To improve generalization, we introduce Harmonic Mix, a frequency augmentation strategy that enhances data diversity by applying harmonic-constrained low-pass filtering and convex combinations of EEG activities. This method preserves critical frequency bands and mitigates noise interference. Experimental results on the VitalDB dataset show that our model achieves 82.70% classification accuracy (ACC) and 5.79±0.64 root mean squared error (RMSE), outperforming existing models with only 16.9K parameters and an inference speed (IS) of 0.02ms. These results demonstrate the effectiveness of our model for real-time DoA monitoring in clinical settings.
|
|
12:15-12:30, Paper Tu-S2-T7.4 | |
Auditory Tagging: Improving Performance of Auditory Brain-Computer Interfaces by Modulating Stimuli |
|
Žák, Michal Robert | University of Vienna |
Grosse-Wentrup, Moritz | University of Vienna |
Keywords: Brain-Computer Interfaces, Brain-based Information Communications, Design Methods
Abstract: We propose auditory tagging, a novel method to enhance decoding performance in auditory brain-computer interface (BCI) paradigms. Drawing inspiration from steady-state visually evoked potentials (SSVEPs), auditory taggers embed a steady frequency onto an auditory stimulus with the goal of eliciting a detectable neuronal response. In this work, we introduce three such approaches and evaluate them on the auditory intention decoding (AID) paradigm. In AID, subjects are primed with a question, and potential target and non-target answer options are provided for this question. The BCI then decodes whether a given sample is a target or non-target. Despite the conceptual promise of the auditory taggers, our experimental results did not reveal statistically significant improvements in decoding accuracy using the proposed tagging approaches. We discuss potential explanations for this observation and highlight possible avenues of improvement for future research.
|
|
12:30-12:45, Paper Tu-S2-T7.5 | |
From Noise to Insight: Visualizing Neural Dynamics with Segmented SNR Topographies for Improved EEG-BCI Performance |
|
Guttmann-Flury, Eva | Shanghai Jiaotong University |
Zhao, Shan | Shanghai Jiao Tong University School of Medicine |
Zhao, Jian | Shanghai Jiao Tong University |
Sawan, Mohamad | Westlake University |
Keywords: Brain-Computer Interfaces, Brain-based Information Communications, Human-Machine Interface
Abstract: Electroencephalography (EEG)-based wearable brain-computer interfaces (BCIs) face challenges due to low signal-to-noise ratio (SNR) and non-stationary neural activity. We introduce in this manuscript a mathematically rigorous framework that combines data-driven noise interval evaluation with advanced SNR visualization to address these limitations. Analysis of the publicly available Eye-BCI multimodal dataset demonstrates the method's ability to recover canonical P300 characteristics across frequency bands (delta: 0.5-4 Hz, theta: 4-7.5 Hz, broadband: 1-15 Hz), with precise spatiotemporal localization of both P3a (frontocentral) and P3b (parietal) subcomponents. To the best of our knowledge, this is the first study to systematically assess the impact of noise interval selection on EEG signal quality. Cross-session correlations for four different choices of noise intervals spanning from early to late pre-stimulus phases also indicate that alertness and task engagement states modulate noise interval sensitivity, suggesting broader applications for adaptive BCI systems. While validated in healthy participants, our results represent a first step towards providing clinicians with an interpretable tool for detecting neurophysiological abnormalities and provide quantifiable metrics for system optimization.
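To make the role of the noise interval concrete, the sketch below computes a power-ratio SNR in decibels for one channel of one epoch, using a pre-stimulus window as noise and a post-stimulus window as signal. The power-ratio definition, the window boundaries, the function names, and the sample values are all assumptions for illustration; the abstract does not give the paper's exact formulation.

```python
import math


def mean_power(samples):
    """Mean squared amplitude of a list of samples."""
    return sum(s * s for s in samples) / len(samples)


def snr_db(epoch, times_ms, noise_window, signal_window):
    """SNR in dB: signal-window power over noise-window power.

    Windows are half-open intervals [start, end) in milliseconds
    relative to stimulus onset.
    """
    noise = [v for v, t in zip(epoch, times_ms) if noise_window[0] <= t < noise_window[1]]
    signal = [v for v, t in zip(epoch, times_ms) if signal_window[0] <= t < signal_window[1]]
    if not noise or not signal:
        raise ValueError("each window must contain at least one sample")
    noise_power = mean_power(noise)
    if noise_power == 0:
        raise ValueError("noise power is zero; SNR is undefined")
    return 10.0 * math.log10(mean_power(signal) / noise_power)


# Hypothetical single-channel epoch sampled every 100 ms (illustrative values).
times_ms = [-200, -100, 0, 100, 200, 300, 400]
epoch = [1.0, -1.0, 0.0, 2.0, 10.0, -10.0, 3.0]
value = snr_db(epoch, times_ms, noise_window=(-200, 0), signal_window=(200, 400))
print(f"SNR: {value:.1f} dB")
```

Changing `noise_window` while holding the signal window fixed is one simple way to see how strongly the resulting SNR depends on the choice of noise interval.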
|
|
Tu-S2-T8 |
Room 0.32 |
Evolutionary Computation 2 |
Regular Papers - Cybernetics |
Chair: Nakib, Amir | Universite Paris Est Creteil, |
Co-Chair: Pang, Lie Meng | Southern University of Science and Technology |
|
11:30-11:45, Paper Tu-S2-T8.1 | |
Genetic Algorithm-Based Deep Gradient Compression with Layer-Wise Adaptation for Distributed Training |
|
Sun, Feng | Shenzhen Institute for Advanced Study, University of Electronic |
Ke, Yan | Shenzhen Institute for Advanced Study, University of Electronic |
Li, Jintao | University of Electronic Science and Technology of China |
Li, Yun | Shenzhen Institute for Advanced Study, University of Electronic |
Keywords: Evolutionary Computation, Hybrid Models of Neural Networks, Fuzzy Systems, and Evolutionary Computing, Deep Learning
Abstract: Distributed deep learning faces a major bottleneck in communication efficiency due to the frequent exchange of gradient information among computing nodes. Gradient compression offers a viable solution to reduce communication overhead. However, existing methods often rely on uniform or limited adaptive strategies, which fail to fully exploit compression potential. To address this, we propose a layer-wise adaptive gradient compression method based on genetic algorithms. By dynamically adjusting compression parameters per layer using evolutionary search and incorporating multi-objective optimization, our approach achieves a better trade-off between compression rate and model accuracy. Experiments across various network architectures and datasets demonstrate that our method significantly improves communication efficiency without sacrificing accuracy.
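The idea of per-layer compression parameters can be sketched with top-k gradient sparsification, a common form of gradient compression in which only the largest-magnitude entries of each layer's gradient are transmitted. The abstract does not state which compression operator or parameters the paper uses, so the operator, the layer names, and the keep ratios below are hypothetical. The genetic search that would choose the ratios is not shown.

```python
def topk_sparsify(grad, keep_ratio):
    """Keep the largest-magnitude entries of a flat gradient.

    Returns a sparse {index: value} mapping with at least one entry.
    """
    k = max(1, int(round(len(grad) * keep_ratio)))
    largest = sorted(range(len(grad)), key=lambda i: abs(grad[i]), reverse=True)[:k]
    return {i: grad[i] for i in sorted(largest)}


def compress_layerwise(grads, keep_ratios):
    """Apply a separate keep ratio to each layer's gradient."""
    return {name: topk_sparsify(g, keep_ratios[name]) for name, g in grads.items()}


# Hypothetical gradients and per-layer ratios (in the paper's setting, values
# like these would be produced by the evolutionary search, not set by hand).
grads = {
    "conv1": [0.1, -2.0, 0.03, 1.5],
    "fc": [0.5, -0.4, 0.3, -0.2, 0.1, 0.05, -0.9, 0.8],
}
keep_ratios = {"conv1": 0.5, "fc": 0.25}
compressed = compress_layerwise(grads, keep_ratios)
print(compressed)
```

A lower keep ratio means less data exchanged per step for that layer, at the risk of discarding gradient information that matters for accuracy; that is the trade-off a multi-objective search would explore.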
|
|
11:45-12:00, Paper Tu-S2-T8.2 | |
Evolutionary Fractal Decomposition Based Search for Dynamic Optimization |
|
Llanza, Arcadi | University Paris Est Créteil, Laboratoire LISSI; Cyclope.ai |
Shvai, Nadiya | National University of Kyiv-Mohyla Academy; Cyclope.ai |
Nakib, Amir | Universite Paris Est Creteil, |
Keywords: Evolutionary Computation, Metaheuristic Algorithms
Abstract: Dynamic Optimization Problems (DOPs) pose significant challenges because of the evolving nature of their objective functions and constraints over time. These difficulties become more pronounced as the frequency of landscape changes and the dimension of the search space increase. In this work, a novel hybrid approach for dynamic optimization, called Evolutionary Fractal Decomposition based Search (EFDS), is proposed. EFDS uses fractal-based decomposition for space indexing and Evolutionary Algorithms (EAs) to select prominent regions while maintaining population diversity. Experimental results on the Moving Peak Benchmark (MPB) demonstrate the effectiveness of the proposed approach, outperforming competing methods in 21 out of 24 benchmark configurations.
|
|
12:00-12:15, Paper Tu-S2-T8.3 | |
Visual Tradeoff Analysis between Decision Space Diversity and Objective Space Diversity: Use of DTLZ Test Problems As Multi-Modal Multi-Objective Optimization Problems |
|
Pang, Lie Meng | Southern University of Science and Technology |
Shu, Tianye | Southern University of Science and Technology |
Ishibuchi, Hisao | Southern University of Science and Technology |
Keywords: Evolutionary Computation, Metaheuristic Algorithms, Computational Intelligence
Abstract: Multi-modal multi-objective optimization has become a hot research topic in the evolutionary multi-objective optimization (EMO) community. To support this line of study, many multi-modal multi-objective test problems have been developed to better understand the behavior of each algorithm in achieving a good balance between convergence and diversity in the decision space. However, the trade-off between decision space diversity and objective space diversity has not been well investigated. In this paper, first, we clearly explain that the widely used DTLZ1–4 test problems exhibit multi-modality (i.e., there exists a many-to-one mapping from the Pareto front to the Pareto set) despite their frequent use as standard benchmark problems for multi-objective optimization. Then, we demonstrate that the trade-off between decision space diversity and objective space diversity can be visually examined using the DTLZ1–4 problems with three or more objectives. Using these four test problems, we examine the search behavior of three standard EMO algorithms and three multi-modal multi-objective evolutionary algorithms (MMEAs). Experimental results show that uniformly distributed solutions obtained by the standard EMO algorithms in the objective space do not have good uniformity in the decision space. In contrast, solution sets obtained by different MMEAs show different trade-offs between decision space diversity and objective space diversity.
|
|
12:15-12:30, Paper Tu-S2-T8.4 | |
Dual Population-Based Objective Modification for Enhancing NSGA-II in Many-Objective Optimization |
|
Pang, Lie Meng | Southern University of Science and Technology |
Ishibuchi, Hisao | Southern University of Science and Technology |
Singh, Hemant Kumar | University of New South Wales |
Keywords: Evolutionary Computation, Metaheuristic Algorithms, Computational Intelligence
Abstract: NSGA-II, the most well-known evolutionary multi-objective optimization (EMO) algorithm, is widely believed to be ineffective at handling many-objective optimization problems (i.e., problems with four or more conflicting objectives). Since NSGA-II is a Pareto dominance-based EMO algorithm, it loses strong selection pressure as the percentage of non-dominated solutions increases. Additionally, if dominance-resistant solutions (DRSs) exist, they further impair the convergence ability of NSGA-II. Two approaches have been proposed to improve convergence and eliminate DRSs. One is to increase the dominated region of each solution, while the other is to increase the correlation among objectives. These two approaches work well for many-objective optimization and are equivalent under certain conditions. However, they require appropriate parameter values for different problem types. To avoid the difficulty of setting parameter values appropriately, we propose a dual population-based NSGA-II (called DP-MNSGA-II) that utilizes two different sub-populations for objective modification. One subpopulation uses a small parameter value (i.e., small modification), while the other uses a relatively large value for objective modification in NSGA-II. Experimental results show that the proposed DP-MNSGA-II works well on different problems without the need for careful parameter tuning.
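The effect of objective modification on dominance-resistant solutions can be illustrated with a small sketch. It assumes one common way of increasing the correlation among objectives, namely blending each objective with the mean of all objectives using a parameter alpha; this formula, the function names, and the example points are assumptions for illustration and are not confirmed by the abstract.

```python
def modify_objectives(f, alpha):
    """Blend each objective with the mean of all objectives (assumed form)."""
    mean = sum(f) / len(f)
    return tuple((1.0 - alpha) * fi + alpha * mean for fi in f)


def dominates(u, v):
    """Pareto dominance for minimization: u is no worse everywhere, better somewhere."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


# Hypothetical two-objective minimization example.
drs = (0.0, 100.0)      # excellent in one objective, very poor in the other
balanced = (1.0, 1.0)   # good in both objectives

for alpha in (0.0, 0.01, 0.5):
    removed = dominates(modify_objectives(balanced, alpha), modify_objectives(drs, alpha))
    print(f"alpha={alpha}: DRS dominated by balanced solution = {removed}")
```

With a small alpha the extreme solution remains non-dominated, while a larger alpha lets the balanced solution dominate it. Running two sub-populations with a small and a large value, as the abstract describes, avoids committing to a single setting.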
|
|
12:30-12:45, Paper Tu-S2-T8.5 | |
Comparisons of Some Evolutionary Computation Algorithms for Prompt Optimization in Large Language Models |
|
Li, Jian-Yu | South China University of Technology |
Jin, Tian-le | Nankai University |
Zhan, Zhi-Hui | South China University of Technology |
Kwong, Sam Tak Wu | Lingnan University |
Zhang, Jun | Hanyang University |
Keywords: Evolutionary Computation, Swarm Intelligence, Application of Artificial Intelligence
Abstract: Prompt engineering has become an effective tool for numerous large language model (LLM)-based tasks. However, it is still a challenge to adaptively optimize the prompt for the target task, where gradients are inaccessible via APIs and the search space is highly complex. Due to the ability to robustly handle complex and multimodal landscapes, and to adaptively explore complex search spaces without explicit gradient information, evolutionary computation (EC) algorithms have garnered significant attention in black-box optimization. However, the performance of different EC algorithms for prompt optimization remains uncertain, leaving little guidance for algorithm selection and design in prompt optimization. To alleviate this, a series of investigative experiments are conducted in this paper to evaluate the efficiency of EC algorithms in prompt optimization, which results in three novel findings. First, different EC algorithms, including XNES, CMA-ES, DE, and PSO, are compared on various few-shot learning tasks. The findings indicate that XNES can outperform other algorithms, achieving faster or smoother convergence and higher final accuracy under high-dimensional settings. Second, the role of truncation strategies is also investigated. Results show that moderate bounds effectively help the algorithm balance exploration and exploitation, whereas overly small or large bounds diminish performance. Third, the influence of different k-shot values is examined on the optimization performance for few-shot tasks, which shows that k=32 consistently provides the best trade-off between convergence speed and final model accuracy. In conclusion, these investigations validate the efficacy of the investigated EC algorithms and highlight the advantages of XNES for complex prompt optimization. They further underscore the importance of proper truncation boundaries and k-shot configurations, offering valuable guidance for future black-box prompt tuning scenarios. These insights pave the way for future work on more diverse and complex tasks with larger LLMs.
|
|
12:45-13:00, Paper Tu-S2-T8.6 | |
Adaptive Multi-Layer Prioritized Fictitious Self-Play in Multi-Agent Reinforcement Learning |
|
Yue, Chao | University of Chinese Academy of Sciences |
Xue, Jian | University of Chinese Academy of Sciences |
Zhao, Lin | University of Chinese Academy of Sciences |
Lei, Yuan | University of Chinese Academy of Sciences |
Mao, Shun | University of Chinese Academy of Sciences |
Lu, Ke | University of Chinese Academy of Sciences |
Keywords: Evolutionary Computation, Swarm Intelligence, Machine Learning
Abstract: In the field of Multi-Agent Reinforcement Learning (MARL), strategy selection and optimization are key challenges in improving agent performance. The Policy Space Response Oracles (PSRO) algorithm is widely used in MARL, and the meta-solver is one of its core components. However, existing meta-solvers may exhibit limitations in complex MARL environments, such as significant consumption of computing resources, poor convergence, and poor stability. Therefore, an adaptive multi-layer Prioritized Fictitious Self-Play (PFSP) is proposed in this paper as a meta-solver method for the PSRO algorithm, which further improves the effectiveness of strategy optimization by utilizing the game results of the meta-game in a more efficient and reasonable way. Adaptive multi-layer PFSP can flexibly make strategy choices based on payoff at different layers, thus overcoming the shortcomings of traditional meta-solvers in dealing with complex strategy spaces. The experimental results show that the proposed method significantly improves the convergence speed and performance of strategies in complex MARL environments, especially in the training and testing of the Google Research Football environment, demonstrating its potential and advantages in practical applications.
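For readers unfamiliar with PFSP, the sketch below shows the basic, single-layer prioritization idea: opponents against which the learner has a low win rate are sampled more often. It uses the weighting (1 - win_rate)^p as an assumed prioritization function. The adaptive multi-layer scheme proposed in the paper is not reproduced here, and the function name and win rates are hypothetical.

```python
def pfsp_probabilities(win_rates, p=2.0):
    """Opponent sampling probabilities that favor harder opponents.

    win_rates[i] is the learner's estimated win rate against opponent i.
    """
    weights = [(1.0 - w) ** p for w in win_rates]
    total = sum(weights)
    if total == 0:
        # The learner beats every opponent; fall back to uniform sampling.
        return [1.0 / len(win_rates)] * len(win_rates)
    return [w / total for w in weights]


# Hypothetical win rates against three past policies.
win_rates = [0.9, 0.5, 0.1]
probs = pfsp_probabilities(win_rates, p=1.0)
print([round(q, 3) for q in probs])
```

Raising `p` concentrates sampling on the hardest opponents, while `p = 0` recovers uniform sampling over the policy pool.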
|
|
Tu-S2-T9 |
Room 0.51 |
Infrastructure Systems and Services |
Regular Papers - SSE |
Chair: Araujo, Jean | Universidade Federal Do Agreste De Pernambuco |
Co-Chair: David, Beserra | EPITA |
|
11:30-11:45, Paper Tu-S2-T9.1 | |
Evaluating Software Aging Resistance in Serverless Computing under Stress Workloads |
|
Sousa, Antonio | Universidade Federal De Sergipe |
Beserra, David | École Pour l'Informatique Et Les Technologies Avancées (EPITA) |
Araujo, Jean | Universidade Federal Do Agreste De Pernambuco |
Keywords: Infrastructure Systems and Services, Fault Monitoring and Diagnosis, System Architecture
Abstract: Serverless computing abstracts infrastructure management, enabling cost-efficient and scalable application deployment. However, multiple operations can be performed simultaneously, leading to system degradation or resource exhaustion during prolonged executions. This study assesses resource utilization in serverless environments using Knative, testing four OS-container engine configurations (Ubuntu/Docker, Ubuntu/Podman, Debian/Docker, Debian/Podman) under stress workload. Results show Ubuntu paired with Docker achieves optimal resource efficiency, avoids software aging, and maintains consistent performance, making it the most effective configuration for scalable serverless deployments. The absence of progressive degradation in resource metrics underscores serverless architectures’ potential to resist software aging through autoscaling. Findings provide actionable insights for optimizing resource management in open-source serverless platforms.
|
|
11:45-12:00, Paper Tu-S2-T9.2 | |
Regulatory Policy Trends for the Energy Sustainability of Artificial Intelligence |
|
Guimarães, Júlio César | Universidade Federal Do Rio De Janeiro |
Argôlo, Matheus | Universidade Federal Do Rio De Janeiro |
Barbosa, Carlos Eduardo | Universidade Federal Do Rio De Janeiro |
Nóbrega, Lucas | Universidade Federal Do Rio De Janeiro |
Martinez, Luiz Felipe | Universidade Federal Do Rio De Janeiro |
de Almeida, Marcos Antonio | Ufrj |
Souza, Jano | Federal University of Rio De Janeiro |
Keywords: Infrastructure Systems and Services, Intelligent Green Production Systems, Technology Assessment
Abstract: In recent years, artificial intelligence (AI) has advanced rapidly, transforming various sectors of society and delivering value to governments, businesses, and end users. However, these advances have also increased computational costs and training times, resulting in significant energy consumption by AI systems. This situation highlights the urgent need for regulatory discussions and proposals that guide AI development and innovation toward energy sustainability. This work reviews, organizes, and categorizes recent regulatory policies for AI energy sustainability into four dimensions for trend analysis. The results show that most identified policies concentrate on transparency and energy efficiency, indicating a higher maturity of these measures within regulatory discussions. This work seeks to raise awareness among stakeholders involved with AI and sustainability, encouraging reflections on energy consumption and efficiency to build a more sustainable future.
|
|
12:00-12:15, Paper Tu-S2-T9.3 | |
EdgeWidgets: A Dual-Protocol IoT Platform for Resilient Environmental Monitoring Using Embedded Tilt Sensing |
|
Vanderlei de Oliveira, Gabriel | Universidade Federal De Pernambuco |
Ferreira, Joao | University of Coimbra |
Andrade, Ermeson | Federal Rural University of Pernambuco |
Balieiro, Andson | Federal University of Pernambuco |
Brito, Gilmar | Federal Institute of Pernambuco |
Dantas, Jamilson | UFPE |
Keywords: Infrastructure Systems and Services, Smart Buildings, Smart Cities and Infrastructures, System Architecture
Abstract: This paper introduces EdgeWidgets, a versatile IoT platform designed for environmental monitoring applications. The platform integrates a Bosch BNO055 inclinometer with an ESP32 C3 SuperMini microcontroller and implements a dual-protocol communication layer (HTTP and MQTT). Our experimental evaluation shows that MQTT achieves approximately 26.1% lower latency compared to HTTP, with significantly more consistent performance. These results demonstrate the substantial performance benefits of MQTT’s persistent connection design for monitoring in connectivity-challenged environments.
|
|
12:15-12:30, Paper Tu-S2-T9.4 | |
Performance Evaluation of Proxmox for High-Performance Computing in Resource-Shared Environments |
|
Izoulet, Aurélien | EPITA |
David, Beserra | EPITA |
Espie, Marc | EPITA |
Endo, Patricia Takako | Universidade De Pernambuco |
Araujo, Jean | Universidade Federal Do Agreste De Pernambuco |
Keywords: Infrastructure Systems and Services, System Architecture, Large-Scale System of Systems
Abstract: Virtualization platforms like Proxmox VE, which extend hypervisors such as KVM with orchestration tools, are increasingly adopted in high-performance computing (HPC) for improved flexibility and resource management. This study evaluates Proxmox VE against KVM and bare-metal environments to quantify virtualization overheads in CPU-bound and inter-process communication (IPC) workloads. Using HPL and NetPIPE benchmarks, competitive runs show Proxmox VE’s 0.5%-1.1% compute overhead versus bare-metal (KVM: 0.9%-1.2%), rising to 1.8%-2.0% for cooperative MPI-based HPL. Proxmox VE reduces RAM usage by 15%-20% over KVM via dynamic ballooning. Intra-VM IPC achieves native performance (~120 Gbps, <0.005 µs), while intra-host IPC sustains ~50 Gbps (vs. KVM’s 60 Gbps) with 20%-30% latency increases. Inter-host IPC under high contention suffers 30% bandwidth loss and 25%-30% latency degradation. CPU utilization remains stable (~50%) across platforms. Results demonstrate Proxmox VE’s viability for CPU-bound/single-host HPC, with network optimizations needed for distributed workflows.
|
|
Tu-S2-T10 |
Room 0.90 |
Big Data and AI-Driven Technologies in Biomedical Science |
Special Sessions: Cyber |
Chair: Zhao, Xingming | Fudan University, Shanghai |
Co-Chair: Liu, Dan | Institute of Automation, Chinese Academy of Sciences |
Organizer: Zhao, Xingming | Fudan University, Shanghai |
Organizer: Xiong, Yi | Shanghai Jiao Tong University |
|
11:30-11:45, Paper Tu-S2-T10.1 | |
MO2Tracker: Polyp Tracking by Multi-Objective Optimization Approach (I) |
|
Yang, Xiao | Huazhong University of Science and Technology |
Ma, Guangzhi | Huazhong University of Science and Technology |
Yu, Dongming | Huazhong University of Science and Technology |
Guo, Jia | Hubei University of Economics |
Qiu, Wanyu | Hubei University of Economics |
Wang, Xianyuan | Wuhan United Imaging Healthcare Surgical Technology Co., Ltd |
Song, Enmin | Huazhong University of Science and Technology |
Keywords: Image Processing and Pattern Recognition, Optimization and Self-Organization Approaches, Biometric Systems and Bioinformatics
Abstract: Multi-Object Tracking (MOT) of polyps in colonoscopy videos serves as a foundational component for intelligent medical workflows. In tracking-by-detection algorithms, managing the equilibrium among competing optimization objectives during data association presents a complex issue. Current approaches, including threshold-based weighted linear aggregation methodologies and hierarchical cascade matching schemes, demonstrate persistent limitations when deployed in endoscopic polyp monitoring scenarios characterized by vigorous polyp movements and complex morphological deformations. Thus, this study proposes a novel polyp tracking framework named MO2Tracker, which can simultaneously optimize multiple objectives during the data association phase. Specifically, MO2Tracker integrates three discriminative similarity cues in parallel: polyp coordinates, detection confidence scores, and appearance consistency, to establish associations between new detections and existing trajectories. To resolve this optimization challenge, we implement Pareto Front analysis via a multi-objective optimization algorithm, deriving Pareto Front solutions. Subsequently, we design two novel selection paradigms, knee-oriented selection and human-inspired selection, to filter optimal detector-trajectory pairings from the Pareto Front. Extensive experiments conducted on two public datasets and one private dataset validate the superior performance of MO2Tracker. On the public dataset SUN-SEG, MO2Tracker achieves improvements of +4.1% in Multi-object Tracking Accuracy (MOTA), +7.2% in Higher Order Tracking Accuracy (HOTA), and +5.8% in IDF1 Score.
|
|
11:45-12:00, Paper Tu-S2-T10.2 | |
Generalized Fetal Brain Segmentation and Identification from Multi-View MRI (I) |
|
Cai, Zhigao | Fudan University |
Zhao, Xingming | Fudan University, Shanghai |
Keywords: Biometric Systems and Bioinformatics, Image Processing and Pattern Recognition, Machine Vision
Abstract: Automatic fetal brain segmentation and localization from MRI remains challenging, as existing methods often struggle with multi-view consistency and lack generalization across different fetal types and gestational ages. In this paper, we propose a novel cascade framework, Co-UNext, for generalized multi-view fetal brain segmentation and identification. Our model is based on a two-step cascade architecture that can take multi-view stacks from the same maternal volume, combining depthwise separable convolutions, attention mechanisms, and other strategies. It achieves precise segmentation of the brain across a set of multi-view stacks and provides fetus-specific segmentation results for multiple pregnancies. We evaluated on 212 fetal MRI scans from 20-36 weeks gestation, including 124 singletons and 88 twins scanned on 1.5T Siemens with axial, coronal, and sagittal views. Results demonstrate that Co-UNext achieves better segmentation performance and can accurately localize all the fetal brains with fetus-specific labels in twin stacks, while also providing more continuous segmentation for data affected by motion and artifacts. On both datasets, the proposed method achieved the best results: 96.9% Dice and 95.3% MIoU on singletons, 96.3% Dice and 94.9% MIoU on twins. Co-UNext shows state-of-the-art generalization capabilities for fetal brain segmentation to facilitate various subsequent quantitative analyses.
|
|
12:00-12:15, Paper Tu-S2-T10.3 | |
ParticleDiff: A Conditional Diffusion Trajectory Generator for Enhancing Biological Particle Tracking |
|
Zhang, Yudong | University of Chinese Academy of Sciences |
Liu, Dan | Institute of Automation, Chinese Academy of Sciences |
Yang, Ge | University of Chinese Academy of Sciences |
Keywords: Computational Life Science, Deep Learning, Transfer Learning
Abstract: Accurate particle tracking is essential for studying intracellular dynamics. While deep learning has advanced tracking performance, its reliance on large labeled datasets hampers generalization to unlabeled biological data. To bridge this gap, we propose a novel particle tracking enhancement framework designed for annotation-scarce scenarios. At its core is ParticleDiff, a conditional diffusion-based trajectory generator tailored for biological motion. By leveraging historical trajectories as context, ParticleDiff conditionally predicts future positions in an autoregressive loop, ensuring both temporal coherence and biological plausibility. These generated trajectories are used to augment training for downstream tracking models, improving accuracy in data-scarce scenarios. Experiments demonstrate that trajectories generated by ParticleDiff closely resemble real biological data. As a result, tracking models trained on ParticleDiff-augmented data consistently outperform those using synthetic trajectories from other generators, achieving performance comparable to models trained on real annotations. Our framework significantly improves tracking accuracy while reducing reliance on labeled data, thereby enhancing the generalization of deep learning-based tracking models in life science applications. Code is available at https://github.com/imzhangyd/ParticleDiff.
|
|
12:15-12:30, Paper Tu-S2-T10.4 | |
Long-Term Disease Progression and Incidence of Complications in Type 2 Diabetes Mellitus |
|
Pósfai, Gergely | Óbuda University |
De Gaetano, Andrea | IASI-CNR |
Kovacs, Levente | Obuda University |
Eigner, Gyorgy | Obuda University |
Keywords: Cybernetics for Informatics, Computational Life Science
Abstract: Several prior studies focused on the estimation of the probability of Type 2 Diabetes Mellitus (T2DM) complications. None of the references used sophisticated mathematical models to simulate the long-term progression of the disease when estimating the probability of a complication. This study extends an existing T2DM model to predict the probability of Diabetic Retinopathy. The benefit of this approach is that we will need much less information to predict the probability of a complication than in a standalone model, as the progression of diabetes will be estimated with an identifiable model. To expand the model, a detailed qualitative analysis was required first to write the model equation along with relationships that can be supported by the pathogenesis of diabetes and the biological background of the development of complications. Then the data necessary for setting up the model were collected, which in the course of our present work was done by quantitative analysis of publicly available data releases. The next step was to write the model and identify it using the available data. Finally, we validated the model using a data set from a data source independent of the previous ones and evaluated the results. In the course of our present work, we focused on the probability of developing Diabetic Retinopathy. The completed model is suitable for supplementing the simulation of the initial model in such a way as to estimate the probability of retinopathy appearing in a patient during the progression of the disease.
|
|
Tu-S2-T11 |
Room 0.94 |
Online Health Monitoring and Optimal Control of Battery Via Trustworthy Artificial Intelligence & Intelligent Power and Energy Systems for Smart Cities |
Special Sessions: SSE |
Chair: Mo, Huadong | University of New South Wales |
Co-Chair: Dong, Daoyi | Australian National University |
Organizer: Mo, Huadong | University of New South Wales |
Organizer: Dong, Daoyi | Australian National University |
|
11:30-11:45, Paper Tu-S2-T11.1 | |
Time/Space Twist-Based Spatiotemporal Modeling under Incomplete Sampling (I) |
|
Xu, Zijie | City University of Hong Kong |
Li, Han-Xiong | City University of Hong Kong |
Keywords: System Modeling and Control
Abstract: Handling incomplete data in spatiotemporal systems is a critical challenge in scientific and engineering applications. This paper introduces a time/space twist-based spatiotemporal modeling framework to address incomplete sampling in distributed parameter systems (DPSs). By decomposing system dynamics into temporal basis functions and spatial coefficients, the proposed method reconstructs missing spatial information using neural networks while preserving physical principles. The framework integrates data-driven modeling with physical constraints, ensuring accurate reconstruction even under sparse observations. Experimental validation on a catalytic reaction process highlights the effectiveness and superiority of the approach compared to traditional methods, demonstrating its ability to capture complex spatiotemporal dynamics with limited data.
|
|
11:45-12:00, Paper Tu-S2-T11.2 | |
A Prescription-Centric Estimation Framework for Bi-Level Power System Operations with Analytical Representation (I) |
|
Jing, Yuhao | University of New South Wales |
Guo, Fusen | University of New South Wales |
Zhang, Chunyang | University of New South Wales |
Qiao, Li | University of New South Wales |
Mo, Huadong | University of New South Wales |
Dong, Daoyi | Australian National University |
Keywords: Decision Support Systems, Control of Uncertain Systems, System Modeling and Control
Abstract: The increasing integration of renewable energy sources and battery energy storage systems has amplified uncertainties in power system operations, necessitating advanced estimation methods that transcend traditional quality-oriented approaches. In this paper, we propose a prescription-centric estimation framework that embeds decision-making insights directly into the anticipation process. By leveraging parametric programming and implicit gradient representations, our approach establishes an analytical mapping between uncertain parameters and optimal operational decisions, thereby addressing the inherent asymmetry between uncertainty estimation and system re-balancing costs. Notably, the proposed method breaks through the limitations imposed by linearization constraints in conventional estimation networks and optimization models, paving the way for more accurate and robust decision-making. An iterative bi-level nonlinear optimization strategy is also introduced to overcome the shortcomings of purely data-driven methods. The effectiveness of the framework is demonstrated through case studies, underscoring its potential to enhance both decision accuracy and efficiency in power system operations.
|
|
12:00-12:15, Paper Tu-S2-T11.3 | |
A PINN-Centric Approach to Battery SOH: Harmonizing LSTM Dynamics with Kalman Filter Precision (I) |
|
Xu, Ke | The University of New South Wales |
Guo, Fusen | University of New South Wales |
Wang, Rui | Ningbo University |
Zhang, Rui | The University of New South Wales |
Mo, Huadong | University of New South Wales |
Keywords: Infrastructure Systems and Services, Intelligent Power Grid, Intelligent Green Production Systems
Abstract: This paper presents a new Physics-Informed Neural Network (PINN) framework to estimate the State of Health (SOH) of lithium-ion batteries. The proposed architecture, PINN-LSTM-KF, integrates Long Short-Term Memory (LSTM) networks with extended Kalman filtering under physics-based constraints. Conventional data-driven approaches often fail to generalize across different operating conditions due to nonlinear degradation patterns. Our method addresses these challenges by enforcing electrochemical constraints within a multi-scale architecture. It simultaneously captures microscopic physical processes, mesoscopic temporal dynamics, and macroscopic uncertainty quantification. Experiments on lithium-ion, lithium iron phosphate, and lithium-sulfur batteries demonstrate that the proposed framework achieves Mean Absolute Percentage Errors below 0.01% under physics-informed configurations. Model compression techniques further reduce memory overhead, enabling real-time deployment on embedded systems. The framework also supports feature-level interpretability by quantifying contributions of physical variables to degradation, offering practical insights for battery design and management. These results highlight the potential of combining physics-based modeling with learning-based estimation to improve reliability and safety in energy storage systems, particularly in critical domains such as electric vehicles and smart grids.
|
|
12:15-12:30, Paper Tu-S2-T11.4 | |
A Comparative Study of Battery SOH Prediction Models: Exploration of Transformer Method with Reversible Instance Normalisation (I) |
|
Wang, Runpu | The University of New South Wales Canberra |
Li, Zongjun | University of New South Wales |
Mo, Huadong | University of New South Wales |
Guo, Fusen | University of New South Wales |
Dong, Daoyi | Australian National University |
Keywords: Electric Vehicles and Electric Vehicle Supply Equipment, Intelligent Power Grid
Abstract: Accurate estimation of the state of health (SOH) of batteries is of great importance for the safe and efficient operation of energy storage systems. However, data-driven methods often generalise poorly because of sample distribution shift between different batteries, which makes it difficult to extend models to unknown domains. To this end, this study proposes a Transformer-based architecture integrated with Reversible Instance Normalisation (RevIN), which significantly enhances adaptability to distribution shifts in the input data. The normalisation and denormalisation process of RevIN reduces statistical bias whilst maintaining trend information. A comparative evaluation is conducted on the NASA battery datasets across six representative models, including Random Forest, eXtreme Gradient Boosting, Multilayer Perceptron, Long Short-Term Memory, Transformer, and the proposed RevIN-Transformer, under both intra-battery and cross-battery prediction settings. Moreover, analysis of variance is employed to assess the consistency of SOH prediction errors across different batteries. The results indicate that whilst deep learning methods generally outperform traditional models, the RevIN-Transformer method achieves superior accuracy and stability in cross-battery SOH prediction tasks under distribution shifts. They also validate the effectiveness of integrating lightweight normalisation modules into a Transformer-based architecture for battery SOH time-series forecasting.
|
|
12:30-12:45, Paper Tu-S2-T11.5 | |
Enhancing the Flexibility of Power Distribution Grids by Self-Organizing Electric Vehicles (I) |
|
Iuliano, Silvia | University of Sannio |
Delattre, Nicolas | Polytech Clermont |
Lai, Chun Sing | Brunel University London |
Vaccaro, Alfredo | Università Del Sannio |
Keywords: Smart Buildings, Smart Cities and Infrastructures, Electric Vehicles and Electric Vehicle Supply Equipment
Abstract: One of the most promising sources of flexibility in decarbonized power systems is electric vehicle fleets equipped with V2G technology, which can exchange bidirectional power flows with the grid to match system operator requests and provide valuable ancillary services, in particular frequency and voltage regulation. Providing these services requires deploying resilient, highly scalable, plug-and-play, and privacy-preserving computing frameworks that orchestrate charging and discharging according to the requested active and reactive power profiles at the points of connection. For this purpose, this paper proposes a decentralized and self-organizing scheme based on a Peer-to-Peer architecture that preserves privacy through cooperative consensus protocols. The effectiveness of the proposed scheme is demonstrated by both theoretical and experimental studies, the latter obtained on a hardware testbed simulating a variable number of grid-connected vehicles under realistic operation scenarios.
|
|
12:45-13:00, Paper Tu-S2-T11.6 | |
Large Language Model Based Data Augmentation for Peak Electricity Price Forecasting and Battery Energy Storage Arbitrage (I) |
|
Mao, Renjie | The University of Sydney |
Zheng, Zuqing | City University of Hong Kong |
Lai, Shuying | CAFEA Smart City Limited |
Tao, Yuechuan | City University of Hong Kong |
Dong, Zhaoyang | City University of Hong Kong |
Li, Tong | City University of Hong Kong |
Huang, Zuliang | City University of Hong Kong |
Qiu, Jing | The University of Sydney |
|
Tu-S2-T12 |
Room 0.95 |
Quantum Cybernetics and Machine Learning |
Special Sessions: Cyber |
Chair: Chen, Chunlin | Nanjing University |
Co-Chair: Kammueller, Florian | Middlesex University London |
Organizer: Chen, Chunlin | Nanjing University |
Organizer: Ma, Hailan | University of New South Wales |
Organizer: Pan, Yu | Zhejiang University |
|
11:30-11:45, Paper Tu-S2-T12.1 | |
Formalisation and Analysis of Decoy QKD in the Isabelle Infrastructure and Insider Framework Using Refinement and Attack Trees (I) |
|
Kammueller, Florian | Middlesex University London |
Nagarajan, Rajagopal | Quentangle |
Parker, Michael C | University of Essex |
White, Catherine | British Telecom Research |
Keywords: Quantum Machine Learning, Quantum Cybernetics, Agent-Based Modeling
Abstract: Quantum Key Distribution (QKD) leverages quantum effects to address the key distribution problem. Traditionally, its use has been limited to high-tech specialist networks for dedicated, trusted partners. However, advancements in hybrid quantum-classical networks now suggest the potential for QKD-level security on a wider scale. The Isabelle Infrastructure and Insider Framework (IIIf) has demonstrated the ability to formally verify security properties in proof-of-concept QKD models. In this paper, we build on these foundations to extend the formal analysis to real-world multi-photon QKD implementations. We enhance the existing model and show how IIIf’s attack tree analysis exposes threats such as the intercept-and-resend attack and the photon-number splitting (PNS) attack. Furthermore, refinement in IIIf enables the extraction of a formal specification for decoy-state QKD, strengthening security against vulnerabilities.
|
|
11:45-12:00, Paper Tu-S2-T12.2 | |
Constrained Policy Optimization with Approximately Monotonically Increasing Rewards (I) |
|
Lu, Yuanyang | Nanjing University |
Fu, Huiqiao | Nanjing University |
Tang, Kaiqiang | Nanjing University |
Chen, Chunlin | Nanjing University |
Keywords: Machine Learning, Agent-Based Modeling
Abstract: In reinforcement learning (RL), agents maximize accumulated rewards through trial-and-error in the environment to obtain high-performing policies. However, agents often exploit loopholes in purely synthetic reward signals, leading to unsafe behaviors, which necessitates the incorporation of safety constraints. In this paper, we propose a safe RL algorithm called Constrained Policy Optimization with Approximately Monotonically Increasing Rewards (CPO-AMIR) to address safe policy learning in different scenarios and provide practical solutions. We present a novel update formula to achieve a better balance between increasing the reward and decreasing the cost. Furthermore, a theoretical analysis demonstrates that our approach guarantees approximately monotonic improvement in rewards when learning constraint-satisfying policies. Our empirical results illustrate the effectiveness and superiority of CPO-AMIR on a set of constrained control tasks.
|
|
12:00-12:15, Paper Tu-S2-T12.3 | |
PDD: Planning Offline Meta-RL with Prompt Decision Diffuser (I) |
|
Zhang, Shilin | Nanjing University |
Hu, Zican | Nanjing University |
Wu, Wenhao | Nanjing University |
Xie, Xinyi | Nanjing University |
Tang, Jianxiang | Nanjing University |
Wang, Zhi | Nanjing University |
Keywords: Machine Learning
Abstract: Inspired by diffusion models that revolutionized image generation through language conditioning, offline reinforcement learning (RL) has been reformulated as a sequence modeling problem. Similar to how language descriptions guide image generation, RL sequence models require task-specific conditioning information to generalize to new tasks. We propose Prompt Decision Diffuser (PDD), which addresses the generalization challenge in offline meta-reinforcement learning (OMRL) by treating it as a sequence modeling problem, with demonstration-based prompting for few-shot adaptation. These prompts, encoded from few-shot demonstrations, effectively capture task-specific information to guide cross-task policy generation. Experimental results on MuJoCo and Point-Robot benchmarks demonstrate PDD's superior few-shot generalization capabilities compared to baseline approaches.
|
|
12:15-12:30, Paper Tu-S2-T12.4 | |
Charging Optimization for Lithium-Ion Battery Based on Robust Deep Reinforcement Learning (I) |
|
Zhipeng, Zhu | Harbin Institute of Technology, Shenzhen |
Dong, Guangzhong | Harbin Institute of Technology, Shenzhen |
Keywords: Deep Learning, Machine Learning
Abstract: Improving lithium battery charging technology is crucial for advancing the widespread adoption of electric vehicles. A well-designed charging strategy should achieve the optimal balance between charging time and battery safety. The learning-based charging method employs an agent to directly interact with the battery plant, iteratively updating the charging strategy based on observations at each time step. This eliminates the need for complex modeling, but strategy performance is highly dependent on the accuracy of observed data. Observations of a real battery are inevitably subject to measurement errors and external interference (e.g., adversarial attacks), which can lead the agent to make incorrect decisions, posing a threat to the health and safety of the battery. To address this issue, this article proposes a fast-charging method based on a robust deep reinforcement learning framework that considers adversarial perturbations of state observations. The proposed framework is implemented in the deep deterministic policy gradient (DDPG) charging algorithm, wherein the policy network is trained by adding a smooth and robust regularization term. Results show that the proposed method enhances both the smoothness and robustness of the strategy compared with the current DDPG algorithm.
|
|
12:30-12:45, Paper Tu-S2-T12.5 | |
Imitation Learning with Process Adversarial Diffusion (I) |
|
Qi, Yiming | Nanjing University |
Fu, Huiqiao | Nanjing University |
Tang, Kaiqiang | Nanjing University |
Chen, Chunlin | Nanjing University |
Keywords: Machine Learning, Deep Learning
Abstract: Generative Adversarial Imitation Learning (GAIL) replicates expert behaviors through adversarial training of a discriminator and a generator. In theory, GAIL can balance the discriminator and generator through considerable online interaction and learn well. In practice, however, adversarial training is difficult and unstable: early mistakes by the discriminator can lead to poor learning outcomes, or the generator may simply copy average expert actions (mode collapse) instead of learning diverse strategies. To address these issues, we propose Process Adversarial Diffusion Imitation Learning (PADIL). Employing a conditional diffusion model as the generator facilitates the generation of a multi-modal strategy, thereby reducing the likelihood of mode collapse and enhancing imitation learning performance with fewer online interactions. At the same time, it lessens the generator's vulnerability to the fluctuating signals or errors conveyed by the discriminator throughout the training process. Furthermore, we revise the sample extraction approach of the diffusion policy to address iterative instability under rapidly fluctuating reward signals. Experimental results demonstrate that our method achieves better performance than baseline methods.
|
|
Tu-S2-T13 |
Room 0.96 |
Cloud, IoT, and Complex Networks |
Regular Papers - Cybernetics |
Chair: Zhou, Kuang | Northwestern Polytechnical University |
Co-Chair: Chen, Jiayuan | Nanjing University of Aeronautics and Astronautics, Nanjing, China |
|
11:30-11:45, Paper Tu-S2-T13.1 | |
Bayesian Transformer-Based Fake News Detection System with Evidence Awareness |
|
Chen, Junhai | National University of Defense Technology |
Xiang, Fengtao | National University of Defense Technology |
Tuoxin, Li | National University of Defense Technology |
Wang, Chang | National University of Defense Technology |
Keywords: Cybernetics for Informatics, Expert and Knowledge-Based Systems, Information Assurance and Intelligence
Abstract: The proliferation of Internet users has accelerated the spread of fake news on social media, necessitating effective fake news detection systems. However, existing methods often focus solely on the features of claims without considering their uncertainty, limiting their reliability and generalizability. Inspired by Bayesian neural networks, this paper proposes a Bayesian Transformer-based Fake news Detection system with Evidence awareness (BTFDE). To validate the effectiveness of BTFDE, we conducted experiments on several datasets. The results demonstrate that BTFDE outperforms several baseline methods, improving the reliability of fake news detection by quantifying uncertainties and incorporating evidence awareness. This approach improves the generalizability of the model and provides a more rational basis for fake news detection.
|
|
11:45-12:00, Paper Tu-S2-T13.2 | |
Hand-Eye Calibration with Kernel Density and Decay Noise: An E-TD5 Reinforcement Learning Approach |
|
Li, Jiacheng | South China University of Technology |
Liu, Yican | South China University of Technology |
Chen, Wei | South China University of Technology |
Zeng, Delu | South China University of Technology |
Keywords: Cloud, IoT, and Robotics Integration, Cyborgs, Machine Learning
Abstract: Hand-eye calibration is a crucial step in the implementation of vision-based robotic arm systems. However, existing calibration methods struggle to adapt to scenarios where the robotic arm or vision system frequently changes position. To address this challenge, this paper proposes a novel calibration method based on an exploration-optimized Twin Delayed DDPG (TD3) algorithm enhanced with kernel density estimation and decaying noise, referred to as E-TD5. The proposed approach not only incorporates the E-TD5 algorithm to improve the exploration and exploitation capabilities of the TD3 reinforcement learning framework but also introduces an adaptive target-perception enhancement system to handle frequent variations in hand-eye positioning. Experimental results validate the effectiveness of the proposed method, its robustness to changes in hand-eye positions, and the significant advantages of the E-TD5 algorithm compared to the standard TD3 approach.
|
|
12:00-12:15, Paper Tu-S2-T13.3 | |
A Protocol for Secure Data Search in Cloud-Edge-End Collaboration with Formal Verification |
|
Li, Xuejian | Anhui University |
Lv, Hong | Anhui University |
Wang, Mingguang | Anhui University |
Xia, Hantao | Anhui University |
Keywords: Information Assurance and Intelligence, Complex Network
Abstract: Cloud computing and edge computing technologies offer effective solutions to the substantial storage and computational demands resulting from the rapid increase in edge network traffic. However, data outsourcing may lead to leakage of users' sensitive information because cloud and edge systems have limited trustworthiness, allowing malicious users to collude with these systems to access private data stored in the cloud. To mitigate the security risks that collusion attacks pose to existing ciphertext search protocols, we propose a collusion-resistant secure data search protocol for cloud-edge-end collaboration. This protocol enables fine-grained search of ciphertexts in the cloud while providing protection against collusion attacks. Furthermore, we formally analyze and verify the protocol, demonstrating that it meets its intended purpose.
|
|
12:15-12:30, Paper Tu-S2-T13.4 | |
Key Node Identification for Graphs Based on Graph Attention Networks |
|
Zhou, Kuang | Northwestern Polytechnical University |
Gao, Jiahui | Northwestern Polytechnical University |
Keywords: Complex Network, Machine Learning, Representation Learning
Abstract: Identifying key nodes in a graph is critical for the analysis and management of networked systems. Traditional centrality-based methods primarily focus on the intrinsic characteristics of nodes and the topological properties of graphs. However, these approaches depend heavily on handcrafted feature selection, resulting in poor generalization across networks with diverse structures. Graph Convolutional Networks (GCNs), as a classical deep learning model for graph-structured data, have demonstrated promising performance in key node identification. By aggregating features from a node and its neighbors, GCNs construct expressive node representations and enhance the model’s generalization ability. Nevertheless, existing GCN-based methods often fail to adequately capture the relative importance of neighboring nodes. To address this limitation, this paper introduces a novel key node identification model based on Graph Attention Networks (GAT), termed KeyGAT. By leveraging an attention mechanism, the proposed model enables adaptive and reliable fusion of node and neighbor features, thereby improving identification accuracy. Experimental results on different graph datasets validate the effectiveness of the proposed approach.
|
|
12:30-12:45, Paper Tu-S2-T13.5 | |
AMDCG: Joint Computation Offloading and Resource Allocation Via Metadata-Driven Centipede Game and Deep Reinforcement Learning in 6G SAGIN |
|
Dai, Lijun | Beijing University of Information Science and Technology |
Chen, Xin | Beijing Information Science and Technology University |
Jiao, Libo | Beijing Information Science and Technology University |
Dai, Xin | Beijing Information Science & Technology University |
Zhang, Ning | Beijing Information Science & Technology University |
Keywords: Complex Network, Optimization and Self-Organization Approaches, Deep Learning
Abstract: As 6G technology continues to evolve, future communication systems demand extremely low latency, improved energy efficiency, and support for massive device connectivity. To address these demands, the integration of Space-Air-Ground Integrated Networks (SAGIN) with Mobile Edge Computing (MEC) has emerged as a compelling strategy. This integration leverages edge nodes deployed on low Earth orbit (LEO) satellites, unmanned aerial vehicles (UAVs), and terrestrial small base stations (SBSs) to deliver distributed computational resources. Nevertheless, due to the inherent heterogeneity and limited capabilities of these edge devices, reliably offloading computing tasks from ground user equipment (UEs) to appropriate edge nodes (satellite-based, aerial, or terrestrial), or executing them locally, remains a significant challenge. In this paper, we design a metadata-driven intelligent offload prediction and global resource optimization framework based on centipede games, in which the metadata contains only the key information of a computing task, not the task data itself. We then envision a 6G smart city network architecture with complex computing scenarios, formulate the problem of minimizing the global average delay and energy consumption as a Markov Decision Process (MDP) in combination with Bellman's optimality equations, and propose an asynchronous metadata-driven centipede game method (AMDCG) based on deep reinforcement learning (DRL). Simulation results show that AMDCG significantly reduces the system offloading overhead in terms of latency and energy consumption compared to other benchmark algorithms.
|
|
12:45-13:00, Paper Tu-S2-T13.6 | |
Mission-Oriented Super-Network Modeling and Reliability Evaluation Method for UAV Swarm |
|
Tang, Hui | Beihang University |
Wang, Lizhi | Beihang University |
Wang, Jie | Beihang University |
Che, Haiyang | Beihang University |
Xu, Minze | Beihang University |
Fu, Jingcheng | Beihang University |
Ma, Tielin | Beihang University |
Keywords: Complex Network, Swarm Intelligence
Abstract: With the advancement of artificial intelligence and UAV technologies, UAV swarms have been increasingly applied in a wide range of missions. However, existing studies remain limited in their characterization of UAV swarms under dynamic mission demands and multi-layer complex interactions, as well as in the quantitative evaluation of their reliability. To address these gaps, this study proposes a mission-oriented multi-layer super-network modeling and reliability evaluation framework. First, four heterogeneous sub-networks are constructed from the dimensions of operation, mission, communication, and resource, capturing the swarm’s multi-layer interaction relationships. Then, considering inter-layer dependencies and cascading failures, a random failure and network reconfiguration strategy is introduced to quantitatively analyze the impact of key node failures on topological metrics and network vulnerability. Finally, a fire rescue case study is conducted to verify the effectiveness and accuracy of the proposed method.
|
|
Tu-S2-T14 |
Room 0.97 |
Cognitive Computing |
Regular Papers - HMS |
Chair: Flammini, Francesco | Mälardalen University |
Co-Chair: Nürnberger, Andreas | Otto-Von-Guericke-Universität Magdeburg |
|
11:30-11:45, Paper Tu-S2-T14.1 | |
CDCO: Cross-Domain Contrastive Optimization Framework for Enhancing Multi-Task Learning in Small Pre-Trained Language Models |
|
Cao, Yukun | Shanghai University of Electric Power |
He, Yongcheng | Shanghai University of Electric Power |
Keywords: Cognitive Computing, Augmented Cognition
Abstract: The adapter technique has significantly improved the performance of fine-tuning pre-trained language models (PLMs) in multi-task settings, especially for small models with limited resources. However, current adapter frameworks typically rely on single-task data, a fixed representation space, and a single training phase. This approach limits their ability to generalize across domains in multi-task scenarios, thus restricting performance improvements for smaller models. To address these challenges, we propose the cross-domain contrastive optimization (CDCO) framework to enhance performance in multi-task learning. CDCO improves model performance by asynchronously co-optimizing across diverse task data sources, representation spaces, and multi-stage structures. Specifically, CDCO introduces innovations in both data sample selection and training strategy. First, CDCO introduces out-of-domain manifold sampling (ODMS), which enhances training diversity by selecting challenging hard-negative samples from out-of-domain datasets through manifold learning. Second, CDCO employs multi-stage asynchronous co-optimization (MAC), mapping samples from ODMS to Euclidean and Poincaré spaces. Then, it constructs a cross-domain contrastive loss based on the spatial properties of these distributions to guide the optimization process. By sequentially optimizing adapter layers across different spatial distributions, CDCO maximizes the potential of the adapter while mitigating overfitting, thus improving the adaptability and stability of small models in multi-task environments. Experimental results demonstrate that CDCO significantly improves performance on in-domain (ID), out-of-domain (OOD), and knowledge-intensive (KI) tasks, confirming its broad applicability and effectiveness.
|
|
11:45-12:00, Paper Tu-S2-T14.2 | |
Cognitive Psychology Effects in Lexical Decision Tasks: Contrasting Human Behavior with Fine-Tuned Transformer-Based Models |
|
Dai, Ruichi | Jiangnan University |
Ding, Shengjian | School of Science, Jiangnan University |
Li, Weixuan | Jiangnan University |
Lyu, Ruimin | Jiangnan University |
Keywords: Cognitive Computing, Human Performance Modeling
Abstract: This study investigates the task of genuine-fake word classification in the Chinese language, with the objective of comparing human lexical decision-making with that of fine-tuned transformer-based language models. We propose an integrated framework that combines behavioral experiment data with multiple fine-tuned transformer-based models, enhanced with downstream classification layers. Experimental results show that these models consistently outperform human participants in accuracy, precision, and F1-score across all word frequency levels, with particularly greater stability in recognizing low-frequency words. From a cognitive perspective, the findings support the word frequency effect and information processing theory, indicating that low-frequency items impose higher cognitive load. Human participants may mitigate uncertainty through semantic compensation and risk-sensitive strategies. This study not only confirms the effectiveness of fine-tuned language models in lexical anomaly detection, but also provides empirical insights into the cognitive mechanisms underlying human word recognition.
|
|
12:00-12:15, Paper Tu-S2-T14.3 | |
A Multimodal Perception System for Predicting Restorative Effect in University Open Spaces |
|
Huang, Jiazhen | Sichuan University |
Qi, Ruoling | Sichuan University |
Liu, Jiayi | Sichuan University |
Han, Tengfei | Sichuan University |
Zhang, Fansheng | Sichuan University |
Sun, Jieqian | Tsinghua University |
Zhao, Wei | Sichuan University |
Ju, Wei | Sichuan University |
Keywords: Cognitive Computing, Human-centered Learning, Environmental Sensing
Abstract: Growing mental health problems among college students underscore the urgent need to enhance the restorative effect of university campuses. However, current evaluation tools lack systematic, scalable, and interpretable methods to quantify the psychological impact of open space design. This study proposes a multimodal perception system to predict the restorative effect of university open spaces by integrating visual, structural, and semantic features. We construct a large-scale dataset comprising 600 campus images and 12,147 subjective ratings collected through standardized psychological scales. Semantic segmentation is used to extract spatial visual indices from images, while semantic impressions are obtained via the Semantic Differential (SD) scale and converted into natural language descriptions. A dual-encoder alignment framework maps both index and text representations into a shared latent space, enabling cross-modal prediction of restorative scores. A Random Forest regressor is trained on this space to support score inference from either image- or text-based inputs. In addition, we apply a Rule-based Representation Learner (RRL) to extract interpretable spatial patterns associated with restorative outcomes. Experiments show that our method significantly outperforms traditional regression models, achieving an R^2 of 0.85 in predicting perceived restorative effect. The learned rules reveal both explicit visual drivers (e.g., greenery, sky openness) and implicit spatial logics (e.g., element interaction). This framework offers a lightweight and interpretable evaluation tool for health-oriented campus design, applicable across design and planning stages even without image data.
|
|
12:15-12:30, Paper Tu-S2-T14.4 | |
Morpho-Semantic Symbiosis in Chinese Characters: A Heterogeneous Graph Framework for Component Plasticity and Semantic Re-Creation |
|
Liu, Xuanhe | Jiangnan University |
Ni, Qiuxiao | Jiangnan University |
Qi, Siyu | Jiangnan University |
Wang, Qiuyue | Jiangnan University |
Lyu, Ruimin | Jiangnan University |
Keywords: Cognitive Computing, Human-Machine Interaction, Interactive and Digital Media
Abstract: Chinese characters represent a unique ideographic writing system where glyph structures encode rich semantics, yet current NLP models struggle to leverage these form-meaning associations. This paper introduces Radical-Graphormer, a novel computational framework addressing this gap by modeling the systemic relationship between character component structure and semantics. Using heterogeneous graph representations within a variational autoencoder (VAE) backbone, the framework uniquely captures the morpho-semantic symbiosis at the radical level. It enables bidirectional tasks: interpreting semantics from glyph structure and generating structurally-valid glyphs from semantic inputs. Supported by HanziFormSem, a new large-scale form-meaning dataset of 3,587 characters (details provided), our model effectively learns structure-to-meaning mapping (Macro-F1 0.81 for semantic classification). Crucially, it achieves controllable, semantics-driven glyph synthesis, generating characters with high structural validity (95%) and visual credibility (4.1/5.0 human rating). This work demonstrates the feasibility of computationally simulating Chinese character composition, offering a new methodology for modeling complex symbolic systems and opening avenues in computational linguistics, AI-driven design, and digital humanities.
|
|
12:30-12:45, Paper Tu-S2-T14.5 | |
Bridging the Gap: Multimodal Semantic Comparison of Human and AI-Generated Descriptions in Artistic Contexts |
|
Wang, Qiuyue | Jiangnan University |
Keywords: Human Perception in Multimedia, Intelligence Interaction, Cognitive Computing
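Abstract: Large language models (LLMs) show impressive fluency in describing art, but distinguishing LLM-generated text from human text in terms of semantic abstraction and creativity remains a challenge. This study systematically compares human- and LLM-generated descriptions of calligraphy through a multidimensional analysis covering word embeddings, syntax, sentiment, metaphor detection, and cross-modal alignment. An optimized BERT-based classifier and a BLIP2 closed-loop experiment are proposed to capture stylistic and semantic differences. The results show that AI-generated text exhibits higher syntactic complexity (mean sentence length: 18.2 vs. 13.0) and lexical diversity (10.07 vs. 5.26), but lags in creativity (intra-group similarity: 0.781 vs. 0.505) and in semantic reduction in cross-modal tasks (cycle accuracy: 21.7% vs. 35.3%). Although AI excels at structured expression and consistency, it remains constrained by its training patterns. By comparison,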
|
|
12:45-13:00, Paper Tu-S2-T14.6 | |
Surveying with AI: Simulating Human Responses Using Personalized LLM Agents and Social Media Data |
|
Liang, Shuxi | City University of Hong Kong |
Ren, Haolin | Southern University of Science and Technology |
Zhang, Tianze | City University of Hong Kong |
Liu, Xiao Fan | City University of Hong Kong |
Yin, Zhimeng | City University of Hong Kong |
Hu, Daning | Southern University of Science and Technology |
Keywords: Human Performance Modeling, Cognitive Computing, Human-Computer Interaction
Abstract: Survey research is a vital tool for understanding human opinions, behaviors, and preferences, but traditional methods face challenges such as high operational costs, limited sample diversity, and privacy concerns. Large Language Model (LLM)-based agents offer a novel approach to simulate human responses, generating high-quality synthetic data that preserves privacy while enabling researchers to explore patterns in public opinions. Our study introduces a flexible LLM-based survey simulation platform, which personalizes responses using real-world social media data to enhance realism and relevance. We conducted a validation case study using data from Steemit users, comparing the personalized simulated responses generated by LLM agents with actual survey responses collected from these users. This comparative analysis revealed key differences and potential biases, providing insights for refining survey methodologies. Personalized LLM simulations were highly effective for factual questions but faced limitations with subjective topics, often demonstrating lower variance compared to human responses. The quality of social media data and careful model selection also significantly influenced the accuracy of simulations. Our platform enables social science researchers to explore new methodologies for survey design, pre-test the impact of framing, and interact dynamically with LLM agents at reduced cost. This work provides valuable insights into the use of LLM-based agents for enhancing survey research, supporting their application in social science.
|
|
Tu-S2-BMI.WS |
Room 0.49&0.50 |
BMI Workshop - Paper Session 2: Passive BCIs |
BMI Workshop |
Chair: Hu, Yaoping | University of Calgary |
|
11:30-11:45, Paper Tu-S2-BMI.WS.1 | |
Classifier Model for Predicting Steering Intention in a Brain-Machine Interface |
|
Yamashita, Naoya | Institute of Science Tokyo |
Miura, Satoshi | Institute of Science Tokyo |
Keywords: BMI Emerging Applications, Passive BMIs, Active BMIs
Abstract: In recent years, the number of traffic accidents in Japan caused by the driving errors of older adults has been increasing. As we age, changes in reaction time can make it more challenging to drive safely. We aim to develop a system that estimates the driver’s intention and assists the driver accordingly. We investigated whether detecting the movement-related cortical potential (MRCP), an electroencephalography (EEG) component that occurs before a driving action, would enable us to detect a driver’s intention to steer. In this study, we conducted experiments to measure EEG signals while a participant drove on a simulator. We analyzed the results using machine learning and classified the EEG during straight driving and before and after steering. Because the optimal learning model for a classifier to detect the MRCP from EEG has not yet been established, we trained LightGBM, LDA, SVM, and LSTM models and identified an appropriate classifier. The results reveal that the best accuracy was obtained with LightGBM, which detected steering intentions with an accuracy of approximately 60%. Further improvements in estimation accuracy and the development of a driver assistance system based on steering intention remain future work.
|
|
11:45-12:00, Paper Tu-S2-BMI.WS.2 | |
Temporal Convolutional Networks for Driver Fatigue Classification from EEG Signals: A Novel Approach |
|
Jui, Most Julakha Jahan | Institute for Intelligent Systems Research and Innovation (IISRI |
Hettiarachchi, Imali Thanuja | Deakin University |
Keywords: Other Neurotechnology and Brain-Related Topics, BMI Emerging Applications, Passive BMIs
Abstract: Fatigue poses a significant challenge to safety and performance across various domains, making its accurate detection and classification essential for preventing accidents and enhancing operational efficiency. Fatigue classification using an electroencephalogram (EEG) plays a critical role in applications including occupational health, transportation safety, and performance monitoring in high-stakes environments. Deep learning models, including convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have shown potential for analysing EEG data but often face limitations in capturing both temporal dependencies and long-range patterns critical for fatigue detection. This study investigates the use of a transformer-based temporal convolutional network (TCN-Transformer) for EEG-based fatigue classification, leveraging the TCN's causal and dilated convolutional architecture in conjunction with transformer components to efficiently extract multi-scale temporal features and model long-range dependencies in sequential data. Comparative experiments demonstrate that the transformer-enhanced TCN framework outperforms CNN and RNN models for real-time fatigue classification, achieving an accuracy of 87.44%, a sensitivity of 93.27%, and an AUC of 0.9680. The evaluation strategy, employing 10-fold cross-validation repeated over 10 runs, validated the robustness of the proposed approach. This study establishes transformer-based TCNs as a promising tool for EEG-based fatigue classification, providing a streamlined and effective method for real-time fatigue monitoring in safety-critical environments.
|
|
12:00-12:15, Paper Tu-S2-BMI.WS.3 | |
Towards a Shift and Sustain-Based Decoding Architecture for Covert Visuospatial Attention BMI in a Natural Environment |
|
Forin, Paolo | University of Padua |
Tortora, Stefano | Intelligent Autonomous System Lab, Department of Information Eng |
Menegatti, Emanuele | University of Padua |
Tonin, Luca | University of Padova |
Keywords: BMI Emerging Applications, Other Neurotechnology and Brain-Related Topics
Abstract: Covert visuospatial attention (CVSA) enables users to direct focus without eye movements, offering a promising non-invasive control modality for brain-machine interface (BMI) systems. Traditional CVSA-based BMIs rely on decoding α-band electroencephalography (EEG) activity from whole-trial data, often overlooking the temporal dynamics of attentional processing. In this study, we propose a novel control framework that explicitly separates the CVSA task into two subcomponents: shift attention and sustained attention. EEG signals were recorded from nine participants performing a CVSA task in a real-world environment. Through spectral feature analysis and temporal segmentation, we identify distinct patterns associated with each subcomponent and train dedicated classifiers accordingly. Our pseudo-online analysis suggests that combining the outputs of two task-specific classifiers significantly improves both trial-level classification accuracy (by an average of +23.74%) and area under the curve (AUC) (by +4.49%) compared to the conventional approach. These findings suggest that modelling the temporal structure of CVSA can enhance decoding performance and offer a more robust foundation for attention-based BMI.
|
|
12:15-12:30, Paper Tu-S2-BMI.WS.4 | |
STESA-Net: A Hybrid Spatio-Temporal Self-Attentive Model for Attention-Inattention Classification from EEG |
|
Sekhar C S, Aswin | Singapore Institute of Technology |
Parashiva, Praveen Kumar | Singapore Institute of Technology |
A. P., Vinod | Singapore Institute of Technology |
Keywords: Passive BMIs, BMI Emerging Applications, Other Neurotechnology and Brain-Related Topics
Abstract: Attention is a critical cognitive function in daily life, and its impairment can lead to serious consequences. Electroencephalography (EEG) enables detection of brain activity related to attention, but decoding attentional states from EEG signals remains challenging due to noise artefacts and low spatial resolution. While deep learning approaches have shown promise, their generalizability across subjects remains limited. This work proposes a novel end-to-end hybrid deep learning architecture, STESA-Net, for classifying attention versus inattention states from multichannel EEG signals. STESA-Net integrates spatiotemporal convolutional layers, a self-attention mechanism, and Bidirectional Long Short-Term Memory (Bi-LSTM) units to effectively capture and weight relevant spatial and temporal features. The design enhances the model’s ability to learn subject-specific patterns while maintaining generalizability across individuals. The proposed method is validated on EEG data from 11 subjects recorded during a simulated driving task, using the leave-one-out validation (LOOV) method. It achieved an average attention versus inattention classification accuracy, precision, recall, and F1 score of 78.47%, 80.16%, 78.46%, and 78.14%, respectively. The improvement in average classification accuracy over existing methods is ~7%. The results highlight the effectiveness of combining convolutional layers, self-attention mechanisms, and Bi-LSTM networks for robust cross-subject attentional state classification. The proposed framework offers a promising solution for real-world applications in attention monitoring and may aid the early detection of attention-related cognitive impairments with reduced computational overhead and enhanced generalization.
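The cross-subject evaluation mentioned in the abstract above can be illustrated with a small splitting routine. This is only a sketch under an assumption: that LOOV here means holding out all epochs of one subject per fold, which the abstract does not spell out. The function name `leave_one_subject_out` and the epoch counts are hypothetical and are not taken from the paper.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out, train_idx, test_idx), holding out one subject per fold.

    subject_ids[i] is the subject that epoch i was recorded from.
    """
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx


# Hypothetical layout: 11 subjects with 4 EEG epochs each (placeholder counts).
subject_ids = [subject for subject in range(1, 12) for _ in range(4)]
folds = list(leave_one_subject_out(subject_ids))
print(len(folds))  # one fold per subject
```

Because no epoch from the held-out subject appears in the training indices, the score of each fold reflects performance on an unseen individual, which is the property a cross-subject claim depends on.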
|
|
12:30-12:45, Paper Tu-S2-BMI.WS.5 | |
A Transition Probability Matrix Approach to Brain Dynamics - a Quantitative Analysis of EEG Sequences |
|
Davis, Joshua | University of Auckland |
Kozma, Robert | University of Memphis, TN |
Keywords: BMI Emerging Applications, Passive BMIs
Abstract: EEG-based identification of brain states has advanced significantly, enabling cognitive monitoring during daily tasks. Brain dynamics can be studied as a Markovian, Semi-Markovian, or Non-Markovian stochastic process, prescribed by Transition Probability Matrices derived from EEG measurements. This study models brain dynamics as discrete Markov chains using second-by-second dominant frequencies derived from the power spectrum of high-density array EEG signals. By analyzing transition and limiting probabilities across modalities, we reveal distinct neural signatures that differentiate engaged from meditative states. Though preliminary, these findings highlight the method’s potential to enhance BCI systems and deepen our understanding of brain dynamics, brain health and the benefits of meditation in future studies.
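The Markov-chain view described in the abstract above can be sketched in a few lines: count one-step transitions in a sequence of discrete states, normalize each row into a Transition Probability Matrix, and iterate the chain to approximate its limiting probabilities. The state coding (0, 1, 2 as placeholder frequency bands), the example sequence, and both function names are hypothetical; the authors' actual state definition and estimation procedure are not given in the abstract.

```python
def transition_matrix(states, n_states):
    """Row-stochastic matrix of empirical one-step transition frequencies."""
    counts = [[0] * n_states for _ in range(n_states)]
    for current, following in zip(states, states[1:]):
        counts[current][following] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        if total:
            matrix.append([c / total for c in row])
        else:
            # A state with no observed exits gets a uniform row (an assumption).
            matrix.append([1.0 / n_states] * n_states)
    return matrix


def limiting_distribution(matrix, iterations=1000):
    """Approximate the limiting probabilities by repeated multiplication."""
    n = len(matrix)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dist


# Hypothetical second-by-second dominant-band labels (placeholder data).
sequence = [0, 1, 1, 2, 1, 0, 1, 1, 2, 2, 1, 0]
P = transition_matrix(sequence, 3)
pi = limiting_distribution(P)
```

Repeated multiplication converges only when the chain is irreducible and aperiodic, which holds for this toy sequence; comparing `P` and `pi` between two recording conditions is one simple way to contrast their dynamics.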
|
|
12:45-13:00, Paper Tu-S2-BMI.WS.6 | |
Decoding of Covert Spatial Attention to Rapidly Presented In-Phase and Antiphase Visual Stimuli |
|
Reichert, Christoph | Leibniz Institute for Neurobiology |
Sweeney-Reed, Catherine M. | Otto Von Guericke University |
Dürschmid, Stefan | Leibniz Institute for Neurobiology |
Keywords: Active BMIs
Abstract: Covert attention to peripherally presented visual stimuli is an effective cognitive task that can be used to operate a brain-computer interface (BCI) even if the user is completely paralyzed. Brain responses to single visual search display onsets show attention-related components emerging between 180 ms and 500 ms. This poses a challenge when using fast visual stimuli that induce steady-state visual evoked potentials (SSVEPs) for covert attention BCIs, since a new stimulus is presented before the current stimulus is processed. Here we investigated brain activity during covert attention by presenting visual stimuli inducing SSVEPs in the left and right visual field. We decoded the attention shift to in-phase stimuli in the left and right visual field with an average accuracy of 72.5% across all channels. When left and right stimuli were presented with a phase shift but constant frequency, the average decoding accuracy increased to 84.6%. While with in-phase stimuli, top-down attention is decoded, decoding of attention to antiphase stimuli additionally takes advantage of stimulus-related processing.
|
|
Tu-MPO |
Foyer F |
Human-Machine Systems WiP Poster Session |
Work in Progress |
Chair: D'Aniello, Giuseppe | University of Salerno |
|
11:30-13:00, Paper Tu-MPO.1 | |
Identification of Communication Patterns through Sequential Analysis of Meeting Utterance Data and Regression Analysis between Utterance Patterns and a Creativity Indicator of Meetings |
|
Kitagawa, Haruki | The University of Tokyo |
Kanno, Taro | The University of Tokyo |
Chen, Yingting | The University of Tokyo |
Yoshino, Yuta | Ricoh Company, Ltd |
Watanabe, Shuhei | Ricoh Company, Ltd |
Hachisuka, Satori | The University of Tokyo |
Keywords: Visual Analytics/Communication, Team Performance and Training Systems, Human Factors
Abstract: Meetings are important activities that influence the achievement of team objectives. The characteristics of creative meetings must be clarified to enhance creativity. This study conducted two analyses with the aim of identifying participant behaviors that contribute to creativity by quantitatively analyzing creativity in actual meeting data. First, utterances in meetings were annotated using 11 categories, and frequently occurring dialogue patterns were examined both overall and for each team through pattern analysis. Second, regression analysis was conducted to examine the relationship between the frequency of these frequent patterns and a quantitative creativity index constructed based on text sentiment. In the first analysis, the most frequently observed type of interaction in meetings was the exchange of opinions. Additionally, communication characteristics specific to each team were identified. However, no significant relationships were found between the creativity index and the dialogue patterns in the second analysis. Further analysis, including a reconsideration of the method of aggregating conversation patterns, is required in future work.
|
|
11:30-13:00, Paper Tu-MPO.2 | |
Multi-Modal AI-Based Pain Detection in Intermediate Care Patients in the Postoperative Phase |
|
Nienaber, Sören | Otto-Von-Guericke-University Magdeburg |
Wang, Huibin | Otto-Von-Guericke-University Magdeburg |
Hempel, Thorsten | Otto-Von-Guericke University |
Walter, Steffen | University Hospital Ulm |
Barth, Eberhard | University Hospital Ulm |
Al-Hamadi, Ayoub | Otto-Von-Guericke University |
Keywords: Medical Informatics, Affective Computing
Abstract: “Multi-modal AI-Based Pain Detection in Intermediate Care Patients in the Postoperative Phase” is an interdisciplinary research work that operates in the domain of automated pain detection. It aims to improve previous work, based on pain databases like BioVid and UNBC shoulder pain, as well as AI-based approaches using computer vision and signal processing to analyze available modalities. Thus, we present our basic research idea on how to improve automatic pain detection in three major steps. The first step focuses on collecting pain data from postoperative patients in intermediate care stations (IMC). In addition, patients who are not fully oriented should be included in a separate data collection as a second focus group. Then, improvements on the state-of-the-art models should not only advance general pain detection, but also help bridge the gap to the real-world setting of the IMC data. Improvements include transferability analysis, feature selection evaluation, and balancing of data distribution to deliver better classification performance. In a last step, we aim to test, verify and evaluate the classification performance on the IMC data with the support of medical practitioners.
|
|
11:30-13:00, Paper Tu-MPO.3 | |
Quantifying Motor Self-Efficacy Changes Following Motor Interventions |
|
Kobayashi, Akihiro | Graduate School of Frontier Sciences, the University of Tokyo |
Nakano, Nobuyasu | National Institute of Advanced Industrial Science and Technology |
Kikuchi, Ken | University of Tokyo |
Yamashita, Atsushi | The University of Tokyo |
An, Qi | The University of Tokyo |
Ueda, Sayako | Japan Women’s University |
Keywords: Human Factors, Human-Machine Interaction
Abstract: Self-efficacy is crucial for the effective application of assistive technology and rehabilitation. This study proposes a novel approach to assess the impact of motor interventions on motor self-efficacy, relevant for human-robot interaction in rehabilitation, by focusing on the perceived reachable space. Twelve healthy adults underwent an arm movement restriction intervention using a robotic arm (KINARM), and changes in the perceived reachable space and muscle activity were measured before and after the intervention. The results indicated a reduction in the perceived reachable space and an adaptive decrease in muscle activity for unreachable targets following motor restriction. This suggests that the perceived reachable space can serve as an objective proxy for task-specific motor self-efficacy, which is valuable for evaluating user adaptation to robotic interfaces. Furthermore, these findings imply that in rehabilitation using interactive robots, a patient's effort levels may be influenced by their perception of task achievability.
|
|
11:30-13:00, Paper Tu-MPO.4 | |
Study on Evaluation of Personal-Fit Control System for Vehicles |
|
Makino, Yasuhiro | Hiroshima University |
Miyakoshi, Minoru | Hiroshima University |
Wakitani, Shin | Hiroshima University |
Yamamoto, Toru | Hiroshima University |
Saeki, Kazuhiro | Mazda Motor Corporation |
Takeda, Yusaku | Mazda Motor Corporation |
Yano, Yasuhide | Mazda Motor Corporation |
Keywords: Human-Machine Cooperation and Systems, Human-centered Learning, Kansei (sense/emotion) Engineering
Abstract: In the automotive industry, improving vehicle operability tailored to individual drivers is essential for achieving a long and enjoyable driving experience. This study presents a demonstration of a personal-fit control system that adaptively adjusts vehicle characteristics to match each driver’s preferred operability. The experiment is conducted using a driving simulator. In addition, an evaluation method based on a Kalman filter is examined. The results indicate that optimal vehicle characteristics vary among drivers, and that the personal-fit control system enables adaptation to these individual preferences.
|
|
11:30-13:00, Paper Tu-MPO.5 | |
Personalized and Situation-Aware Microlearning in Moodle with the CONSALE Framework |
|
D'Aniello, Giuseppe | University of Salerno |
Falcone, Roberto | University of Salerno |
Gaeta, Matteo | University of Salerno |
Keywords: Human-centered Learning, Cognitive Computing, Human Factors
Abstract: In a rapidly evolving global context - driven by technological innovation and societal change - the need for continuous reskilling and upskilling in education and training is more urgent than ever. To address this challenge, CONSALE (Constructing Situation Awareness in microLearning Environments) offers a structured framework for adaptive microlearning that aligns instructional goals with cognitive processes. By integrating Understanding by Design with Situation Awareness-Oriented Design, CONSALE enables the creation of personalized, context-aware learning experiences. Its implementation in Moodle — via a plugin-based architecture — supports dynamic learner profiling (based on the Felder-Silverman model), behavioral adaptation, and cognitively tagged content delivery, enhancing engagement and learning outcomes.
|
|
11:30-13:00, Paper Tu-MPO.6 | |
Formal Analysis of Vulnerabilities in Mixed-Reality Systems |
|
Wang, Timothy | RTX Technology Research Center |
Amundson, Isaac | Collins Aerospace |
Babar, Junaid | Collins Aerospace |
Wu, Peggy | Raytheon Technologies Research Center |
Keywords: Virtual/Augmented/Mixed Reality, Systems Safety and Security, Human Factors
Abstract: With the proliferation of mixed-reality (MR) systems in aerospace and defense, there is increased potential for adversarial exploitation of system vulnerabilities and/or properties in the human cognitive process in order to reduce mission-effectiveness. This paper presents our preliminary work on the Modeling and Analysis Toolkit for Realizable Intrinsic Cognitive Security (MATRICS), a formal methods-based approach to provide a mathematically rigorous design and verification framework for protecting MR systems and operators in mission-critical applications from cognitive attacks. We describe our approach and present initial results, including formal models of the human operator, MR device, and mission environment, and apply existing formal methods tools to prove the holistic cognitive security of MR systems.
|
|
11:30-13:00, Paper Tu-MPO.7 | |
Motion Intention Decoding: The Role of Data Parameters in Motor Unit-Based Decoders |
|
Meng, Long | Penn State University |
Hu, Xiaogang | Penn State University |
Keywords: Human-Machine Interaction, Assistive Technology
Abstract: Accurate decoding of human motion intention from surface electromyography (sEMG) signals recorded non-invasively from the skin surface is critical for enabling intuitive control in assistive robotics and human–machine interactions. With the advancement of high-density sEMG (HD-sEMG), neural decoding methods based on motor unit (MU) activity have shown promise due to their potential to capture finely controlled movement information. However, the effects of data segmentation parameters on the decomposition and decoding accuracy remain underexplored. In this study, we systematically investigated how the segmentation length and data size of sEMG signals used for decomposition affect the performance of finger force decoding. Specifically, HD-sEMG signals were recorded from eight human participants during single- and multi-finger isometric force tasks. A neural decoding pipeline was developed for finger force predictions. We first evaluated the impact of four segmentation window lengths (10 s, 20 s, 40 s, and 80 s) on decoding accuracy, and found that a 20-second window was sufficient to ensure accurate decoding, with no additional benefit from using longer segments. Using this setting, we further examined the effect of training data size by comparing decoders trained with different data sizes. Our results showed that using the full training dataset significantly improved decoding performance compared to using only half of the training dataset. These findings offer practical guidelines for optimizing data usage in MU-based motion intention decoding systems.
|
|
11:30-13:00, Paper Tu-MPO.8 | |
Web-Based Human-Machine Interface for CNC Systems and LSTM-Based Security Enhancements |
|
Lee, Hongik | Hiscom |
Sung, Minyoung | University of Seoul |
Kim, Woonggy | Dept. of Mechanical and Information Engineering |
Keywords: Human-Machine Interface, Supervisory Control, User Interface Design
Abstract: With the ongoing digital transformation of industry, web-based human-machine interfaces (HMIs) are gaining increasing importance. In motion control systems, HMIs play a vital role in facilitating user interaction for monitoring and control. Recent advancements in web technologies have spurred interest in adopting modern web-based interfaces for motion control systems, as they not only enhance user experience but also facilitate the integration of advanced technologies such as machine learning. This paper presents a case study on the design and implementation of a web-based HMI for CNC (computer numerical control) and proposes machine learning-based security enhancements. The system is developed using Python and Flask for the backend service, and Node.js with Vue.js for the frontend client interface. LinuxCNC, an open-source motion control software, serves as the control engine, while communication is established via WebSocket and RESTful APIs to exchange commands and status. Experimental evaluation using industrial motor drives and controllers demonstrates that the proposed system offers performance comparable to the legacy native HMI. Although the web-based HMI exhibits slightly higher input latency due to network overhead, it significantly reduces output latency and improves responsiveness. To address security concerns, the paper introduces an AI-driven framework that integrates LSTM (long short-term memory) based anomaly detection and interaction biometrics for implicit user authentication. A customized design is proposed to ensure secure operation within web-based motion control environments.
|
|
11:30-13:00, Paper Tu-MPO.9 | |
Interpreting the Digital Mirror: Relational and Emotional Meaning-Making in Interactive Art |
|
Sobiech, Franciszek Jakub | Lodz University of Technology |
Walczak, Natalia | Institute of Applied Computer Science, Lodz University of Techno |
Ignatowicz, Filip | Academy of Fine Arts in Gdańsk |
Babout, Laurent | Lodz University of Technology |
Wróbel-Lachowska, Magdalena | Lodz University of Technology |
Keywords: Human-Computer Interaction, Interactive and Digital Media
Abstract: This paper investigates the interpretive processes within interactive art, focusing on post-participation interviews from an immersive installation featuring a mirror that displayed a previous participant instead of the viewer. The study, conducted through semi-structured interviews, explores how participants interpret and construct personal meaning from their engagement with the artwork. Thematic analysis revealed key areas of interest present throughout the participants’ responses. Additionally, participants found metaphorical discussions more insightful than titles in framing meaning. The findings are analyzed through a framework for the psychology of aesthetics, offering new insights into participant experience in interactive art.
|
|
11:30-13:00, Paper Tu-MPO.10 | |
Do LLMs Tell Us What We Want to Hear? Investigating Confirmation Bias in AI Responses to Health Queries |
|
Ris-Ala, Rafael | Inesc Tec; Utad; Ufrj |
Gonçalves, Gonçalo | INESC TEC |
Lopes, Leonardo | Engineering Department Universidade De Trás-Os-Montes E Alto Dou |
Dantas, Tiago | Engineering Department Universidade De Trás-Os-Montes E Alto Dou |
Paulino, Dennis | INESC TEC and University of Trás-Os-Montes E Alto Douro |
Netto, André Thiago | INESC TEC and University of Trás-Os-Montes E Alto Douro, UTAD |
Guimarães, Diogo | INESC TEC and University of Trás-Os-Montes E Alto Douro, UTAD |
Rocha, Artur | Human-Centered Computing and Information Science, INESC TEC |
Vivacqua, Adriana S | Universidade Federal Do Rio De Janeiro |
Paredes, Hugo | INESC TEC/UTAD |
Keywords: Intelligence Interaction, Human-Computer Interaction, Ethics of AI and Pervasive Systems
Abstract: Large Language Models (LLMs) are widely used today in virtual assistants and content generation. However, there are suspicions that LLMs exhibit confirmation bias, responding in a way that reinforces beliefs or assumptions embedded in users' questions, which can lead to erroneous decision-making, especially in sensitive areas such as healthcare. The objective of this research is to determine how often and under what conditions LLMs exhibit confirmation bias and to identify the causes of this effect. The methodology involves conducting an experiment in which 52 biased healthcare questions are presented to 10 of the most popular models and analyzing whether their responses are biased. This work demonstrates confirmation-bias behavior with statistical power. We show that confirmation bias occurs in all tested LLMs, in 20% to 60% of cases. The evidence suggests that the bias arises from the training database, the Transformer architecture itself, and the instructions in the fine-tuning phase by the companies behind the LLMs. This research explores pathways for the development of trustworthy LLMs.
|
|
11:30-13:00, Paper Tu-MPO.11 | |
Concept-Based Human-Machine Knowledge System |
|
Jakober, Lukas | FHNW |
Christen, Patrik | FHNW |
Keywords: Human-Machine Cooperation and Systems
Abstract: Despite recent advancements, Large Language Models (LLMs) face difficulties in providing reliable and verifiable information, especially in critical domains. It is challenging to find possible hallucinations or manage the intrinsic knowledge of modern models. This paper presents an approach leveraging human-machine cooperation inspired by LLM technologies and Knowledge Graph (KG) structures. The proposed framework enables users to create a hierarchical network of concepts from textual input. Concepts range from characters (low-level) through words (medium-level) to word chains (high-level). The system generates relationships mainly from word co-occurrences within sentences, while using Natural Language Processing (NLP) techniques and statistics to recommend entities and relations that users then validate and curate through an interactive interface. This cooperation between human and machine combines the strengths of computational accuracy and storage capability with human cognitive skills. The idea aims to empower users to effectively organise complex information and navigate the information space with ease through the help of strongly connected concepts.
|
|
11:30-13:00, Paper Tu-MPO.12 | |
An Auto-Labeling Tool for Occupancy Grid and BEV in Autonomous Driving Dataset |
|
Bea, Khean Thye | National Taipei University of Technology |
Yang, Yu Chen | National Taipei University of Technology |
Tseng, Shih Chi | National Taipei University of Technology |
Huang, Chih-Sheng | ELAN Microelectronic Corporation |
Chen, Yen-Lin | National Taipei University of Technology |
Keywords: Information Visualization, Human-Machine Cooperation and Systems, Cognitive Computing
Abstract: Accurate 3D semantic labeling is critical for scene understanding and navigation in autonomous driving systems. However, manual annotation of 3D point clouds is labor-intensive and difficult to scale across diverse environments. To address this challenge, we propose a modular automated labeling framework that leverages 2D semantic cues to facilitate 3D scene understanding. The system integrates state-of-the-art models (Grounding DINO, SAM2, YOLOv7, and Generative AI (GAI)) to perform high-quality 2D semantic segmentation, which is then processed into semantic LiDAR point clouds. Afterward, the framework employs multi-frame fusion and static-dynamic scene separation to construct dense 3D semantic occupancy grids. Additionally, GAI serves as a prompt-refinement module, improving segmentation accuracy at object boundaries and in distant regions. Experiments on real-world street-view datasets demonstrate that AutoLabel-2D significantly enhances labeling efficiency and segmentation completeness, offering strong generalization across scenes. This framework provides a scalable and effective solution for high-definition mapping and semantic perception in autonomous driving applications.
|
|
11:30-13:00, Paper Tu-MPO.13 | |
Binary-Phase Model and Pre-Evacuation Dynamics |
|
Wang, Peng | University of Connecticut |
Luh, Peter | University of Connecticut |
Sinčák, Peter | Technical University of Kosice |
Pitukova, Laura | Technical University of Kosice |
Keywords: Human Performance Modeling, Human Factors, Human-Centered Transportation
Abstract: Evacuation behavior is an important human factor in safety performance modeling, and the pre-evacuation phase is the interval between receiving an alarm signal and decisively escaping to safety. To describe human awareness and response during the pre-evacuation phase, this paper formulates a multi-agent process to simulate crowd pre-evacuation dynamics. The model mainly combines classical opinion dynamics with binary phase transition to describe how group pre-evacuation time emerges from individual interaction in a given social context. The model parameters are quantitatively meaningful to human factors research within a socio-psychological background, e.g., to what extent an individual is stubborn or open-minded, or what kind of social topology exists among the individuals and how it functions in aggregating individuals into groups. The modeling framework also describes collective motion of many agents in a planar space. The resulting multi-agent system is similar to the Vicsek flocking model, and it is meaningful for exploring complex behavior during phase transitions of non-equilibrium processes.
|
|
11:30-13:00, Paper Tu-MPO.14 | |
Identifying Pitch Errors in Music through a Musician's Brain Waves (EEG) |
|
Kim, Anthony | Torrey Pines High |
Uppal, Abhinav | University of California, San Diego |
Gano, Kameron | University of California, San Diego |
Cauwenberghs, Gert | University of California at San Diego |
Keywords: Human Perception in Multimedia, Brain-Computer Interfaces, Brain-based Information Communications
Abstract: Understanding how the brain processes musical pitch errors is key to advancing music perception and auditory rehabilitation. This pilot study (N=1) examined neural responses to pitch deviations in a trained musician using a portable dry-electrode electroencephalography (EEG) system. The participant listened passively to seven 10-second violin excerpts with systematically varied pitch errors: none, medium, or high. EEG data were recorded, preprocessed, and analyzed for oscillatory patterns in the beta band across temporal and central brain regions. Findings revealed a graded increase in right-temporal beta power with error magnitude, alongside distinct temporal dynamics suggesting a sequential process: initial detection of deviations, evaluation of error severity, and adjustment of predictive models. These results highlight a potential cascade in passive pitch-error processing, extending prior work on active performance to naturalistic listening contexts. This work offers preliminary insights into the neural basis of pitch perception and suggests applications in music education and auditory training, though further research with larger cohorts is needed to confirm these findings.
|
|
11:30-13:00, Paper Tu-MPO.15 | |
Emergency Vehicle Siren Sounds That Are Easier for Deaf and Hard-Of-Hearing People to Hear |
|
Yasu, Keiichi | Tsukuba University of Technology |
Morita, Yuiki | Tsukuba University of Technology |
Hiraga, Rumi | Tsukuba University of Technology |
Keywords: Assistive Technology
Abstract: With the advancement of hearing aids and cochlear implants, auditory perception among the Deaf and Hard-of-Hearing (D/HoH) people has improved, including the recognition of environmental and instrumental sounds. However, even with these improvements, it remains difficult for many deaf individuals to perceive emergency vehicle sirens. This study proposes more perceivable siren patterns by modifying pitch and tempo. Through awareness surveys and listening experiments with D/HoH individuals, we examined the perceived urgency and audibility of modified sirens. The results suggest that increasing the tempo improves both recognizability and perceived urgency, while higher octave shifts tend to be rated as noisy or annoying.
|
|
11:30-13:00, Paper Tu-MPO.16 | |
An Assistance Control System for Safe and Flexible Operation Toward Operator Upskilling |
|
Ishida, Ren | Keio University |
Inoue, Masaki | Keio University |
Ishihara, Shinji | Hitachi, Ltd |
Obara, Hiroki | Hitachi, Ltd |
Keywords: Assistive Technology, Human Enhancements, Design Methods
Abstract: This paper addresses the design of an assistance control system that enables safe and flexible operation while facilitating operator upskilling. A key factor in upskilling is increasing opportunities for operators to attempt control actions. To this end, we design an assistance system that reduces intervention frequency and promotes operator freedom. We particularly focus on designing a human assistance system for water level control in a tank system. Then, we characterize the set of admissible control actions for the human operator in order to design assistance control logic that ensures the safety of the overall system. Finally, we conduct a human-in-the-loop simulation of the tank system. Through the simulation, we verify that the proposed assistance system ensures the safety of the overall system with minimal intervention, demonstrating its potential as a foundational technology for effective operator upskilling.
|
|
11:30-13:00, Paper Tu-MPO.17 | |
Assessment of Mental Fatigue in Healthy Participants During Extended BCI-HMD Sessions |
|
Evetovic, Nina | Slovak Academy of Sciences |
Rosipal, Roman | Slovak Academy of Sciences |
Polyanskaya, Arina | Slovak Academy of Sciences |
Rostakova, Zuzana | Institute of Measurement Science, Slovak Academy of Sciences |
Trejo, Leonardo Jose | Pacific Development and Technology, LLC |
Keywords: BMI Emerging Applications, Passive BMIs, Other Neurotechnology and Brain-Related Topics
Abstract: Prolonged use of brain–computer interfaces (BCIs) with virtual reality (VR) via head-mounted displays (HMDs) induces mental fatigue, potentially impairing neurorehabilitation. This study examines EEG-based fatigue markers in healthy participants during extended BCI-HMD sessions. Fatigue was classified using N-way Partial Least Squares (N-PLS) with linear discriminant analysis, achieving 82.42% (±7.5) accuracy. N-PLS components revealed spatial–spectral patterns in occipital and sensorimotor alpha activity. Temporal trajectories indicated progressive fatigue accumulation during sessions. Results demonstrate the feasibility of EEG-based fatigue monitoring for optimizing BCI-HMD post-stroke neurorehabilitation.
|
|
Tu-PL2 |
Hall F |
Plenary 2 |
Plenary |
Chair: Eigner, György | Obuda University |
|
14:00-14:45, Paper Tu-PL2.1 | |
ATRO - the Future of Robotics Is Modular |
|
Morscher-Unger, Thomas | Beckhoff Automation |
Keywords: Robotic Systems, Manufacturing Automation and Systems
Abstract: Industrial robots are typically designed with a fixed, standardized structure, which often forces manufacturers to use a system that's over-dimensioned, or far larger and more powerful than needed for the task at hand. This not only results in unnecessary costs but also takes up valuable floor space. A new approach, known as modular robotics, offers complete freedom in robot configuration. By using scalable, easily pluggable motor and link modules instead of a rigid, one-size-fits-all design, machine builders can create customized robot solutions with the precise kinematics required for a specific job. Combined with an open, PC-based control technology that provides programming options for both beginners and experienced PLC programmers, the ATRO system from Beckhoff Automation is intended to be a foundational step toward this future of modular industrial robotics.
|
|
14:45-15:30, Paper Tu-PL2.2 | |
Conversational AI and Agents at Raiffeisen Bank International |
|
Nagy, Marian | Raiffeisen Bank Hungary |
Ivana, Despotovic | Raiffeisen Bank International |
Lorenzo, Tosi | Raiffeisen Bank International |
Keywords: Application of Artificial Intelligence, AI and Applications
Abstract: This plenary talk explores how Raiffeisen Bank International (RBI) leverages conversational AI and intelligent agents to transform customer interaction and internal processes. It will highlight RBI’s strategic approach to deploying AI-driven solutions, focusing on real-world applications in banking, automation of complex workflows, and enhancing customer experience through natural language technologies. The session will also address challenges in scalability, compliance, and trust, offering insights into the future of AI-powered financial services.
|
|
Tu-S3-T1 |
Hall F |
Deep Learning 5 |
Regular Papers - Cybernetics |
Chair: Pitakwatchara, Phongsaen | Chulalongkorn University |
Co-Chair: Alam, Mohammed Talha | Mohamed Bin Zayed University of Artificial Intelligence |
|
16:00-16:15, Paper Tu-S3-T1.1 | |
ADAM-Dehaze: Adaptive Density-Aware Multi-Stage Dehazing for Improved Object Detection in Foggy Conditions |
|
AlHindaassi, Fatmah | Mohamed Bin Zayed University of Artificial Intelligence |
Alam, Mohammed Talha | Mohamed Bin Zayed University of Artificial Intelligence |
Karray, Fakhreddine | University of Waterloo |
Keywords: Deep Learning, Image Processing and Pattern Recognition, Machine Vision
Abstract: Adverse weather conditions, particularly fog, pose a significant challenge to autonomous vehicles, surveillance systems, and other safety-critical applications by severely degrading visual information. We introduce ADAM‑Dehaze, an adaptive, density‑aware dehazing framework that jointly optimizes image restoration and object detection under varying fog intensities. First, a lightweight Haze Density Estimation Network (HDEN) classifies each input as light, medium, or heavy fog. Based on this score, the system dynamically routes the image through one of three CORUN branches—Light, Medium, or Complex—each tailored to its haze regime. A novel adaptive loss then balances physical‐model coherence and perceptual fidelity, ensuring both accurate defogging and preservation of fine details. On Cityscapes and the real‑world RTTS benchmark, ADAM‑Dehaze boosts PSNR by up to 2.1 dB, reduces FADE by 30%, and improves object detection mAP by up to 13 points, all while cutting inference time by 20%. These results demonstrate the necessity of intensity‑specific processing and seamless integration with downstream vision tasks for robust performance in foggy weather conditions.
|
|
16:15-16:30, Paper Tu-S3-T1.2 | |
DUO-Net: Joint End-To-End 2D Object Detection and Depth Estimation Via Uncertainty-Aware Multitask Learning |
|
Ghaffar, Fazal | Deakin University |
Khan, Burhan | Deakin University |
Jalali, Seyed Mohammad Jafar | Deakin University |
Chee Peng, Lim | IISRI |
Keywords: Deep Learning, Image Processing and Pattern Recognition, Machine Vision
Abstract: We propose DUO-Net, a unified multi-task learning framework for joint 2D object detection and depth estimation. The architecture employs a shared ResNet-based backbone with attention modules and task-specific heads to simultaneously perform bounding box localisation and dense depth prediction. A two-stage training method is adopted to sequentially pretrain each task and subsequently refine them through joint learning, improving both feature quality and convergence reliability. To address task imbalance and noisy supervision, we incorporate uncertainty-aware loss weighting, enabling the model to dynamically adjust task contributions during training. Evaluated on the KITTI and JRDB datasets, DUO-Net demonstrates robust performance across both tasks while maintaining efficiency and scalability.
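Editorial illustration of uncertainty-aware loss weighting: the abstract does not say which formulation DUO-Net uses, so the sketch below assumes the common homoscedastic-uncertainty form, in which each task loss L_i is scaled by exp(-s_i) and a penalty s_i is added, with s_i a learnable log-variance. The function name and the numeric values are hypothetical.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses as sum_i exp(-s_i) * L_i + s_i.

    task_losses: current loss value of each task (e.g. detection, depth).
    log_vars:    one log-variance s_i per task; in a real model these would be
                 trainable parameters (assumed form, not taken from the paper).
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need exactly one log-variance per task loss")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


# Hypothetical loss values for a detection task and a depth task.
total_equal = uncertainty_weighted_loss([0.8, 2.0], [0.0, 0.0])
# Raising the second task's log-variance shrinks its weight to exp(-1).
total_down = uncertainty_weighted_loss([0.8, 2.0], [0.0, 1.0])
```

With both log-variances at zero the result is the plain sum of the losses; a larger s_i reduces the influence of a noisy task, while the added s_i term discourages the weights from collapsing to zero.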
|
|
16:30-16:45, Paper Tu-S3-T1.3 | |
AI-Based Denoising and Interpolation of Magnetic UXO Data |
|
Kovalenko, Mykyta | Fraunhofer HHI |
Przewozny, David | Fraunhofer HHI |
Chojecki, Paul | Fraunhofer HHI |
Eisert, Peter | Fraunhofer HHI |
Hilsmann, Anna | Fraunhofer HHI |
Bosse, Sebastian | Fraunhofer HHI |
Keywords: Deep Learning, Machine Learning, Image Processing and Pattern Recognition
Abstract: Magnetic surveys are a key tool in detecting buried objects such as unexploded ordnance (UXO), where dense magnetic maps must be reconstructed from sparsely sampled gradiometer data. We present a deep learning-based approach that outperforms classical interpolation methods in both accuracy and speed. Trained on synthetic magnetic fields simulating realistic UXO signatures and measurement noise, our modified U-Net with ResNet-34 encoding reconstructs high-resolution magnetic maps from sparse inputs. Compared to state-of-the-art gridding methods, our model achieves 3–5% higher reconstruction accuracy on average while operating up to 80× faster than SOTA algorithms, enabling more efficient and interpretable UXO detection in real-world survey conditions.
|
|
16:45-17:00, Paper Tu-S3-T1.4 | |
FlipConcept: Tuning-Free Multi-Concept Personalization for Text-To-Image Generation |
|
Woo, Young Beom | Korea University |
Kim, Suneung | Korea University |
Lee, Seong-Whan | Korea University |
Keywords: Deep Learning, Machine Learning, Image Processing and Pattern Recognition
Abstract: Integrating multiple personalized concepts into a single image has recently gained attention in text-to-image (T2I) generation. However, existing methods often suffer from performance degradation in complex scenes due to distortions in non-personalized regions and the need for additional fine-tuning, limiting their practicality. To address this issue, we propose FlipConcept, a novel approach that seamlessly integrates multiple personalized concepts into a single image without requiring additional tuning. We introduce guided appearance attention to enhance the visual fidelity of personalized concepts. Additionally, we introduce mask-guided noise mixing to protect non-personalized regions during concept integration. Lastly, we apply background dilution to minimize concept leakage, i.e., the undesired blending of personalized concepts with other objects in the image. In our experiments, we demonstrate that the proposed method, despite not requiring tuning, outperforms existing models in both single and multiple personalized concept inference. These results demonstrate the effectiveness and practicality of our approach for scalable, high-quality multi-concept personalization.
|
|
17:00-17:15, Paper Tu-S3-T1.5 | |
LUMINA-Net: Low-Light Upgrade through Multi-Stage Illumination and Noise Adaptation Network for Image Enhancement |
|
Siddiqua, Namrah | Korea University |
Kim, Suneung | Korea University |
Lee, Seong-Whan | Korea University |
Keywords: Deep Learning, Machine Learning, Image Processing and Pattern Recognition
Abstract: Low-light image enhancement (LLIE) is a crucial task in computer vision aimed at enhancing the visual fidelity of images captured under low-illumination conditions. Conventional methods frequently struggle with noise, overexposure, and color distortion, leading to significant image quality degradation. To address these challenges, we propose LUMINA-Net, an unsupervised deep learning framework that learns adaptive priors from low-light image pairs by integrating multi-stage illumination and reflectance modules. To assist the Retinex decomposition, inappropriate features in the raw image can be removed using a simple self-supervised mechanism. First, the illumination module intelligently adjusts brightness and contrast while preserving intricate textural details. Second, the reflectance module incorporates a noise reduction mechanism that leverages spatial attention and channel-wise feature refinement to mitigate noise contamination. Through extensive experiments on LOL and SICE datasets, evaluated using PSNR, SSIM, and LPIPS metrics, LUMINA-Net surpasses state-of-the-art methods, demonstrating its efficacy in low-light image enhancement.
|
|
17:15-17:30, Paper Tu-S3-T1.6 | |
WS-DETR: Robust Water Surface Object Detection through Vision-Radar Fusion with Detection Transformer |
|
Yin, Huilin | College of Electronic and Information Engineering, Tongji Univer |
Wang, Pengyu | Tongji University |
Li, Senmao | College of Electronic and Information Engineering, Tongji Univer |
Yan, Jun | College of Electronic and Information Engineering, Tongji Univer |
Watzenig, Daniel | Graz University of Technology and the Virtual Vehicle Research, |
Keywords: Deep Learning, Machine Vision, Application of Artificial Intelligence
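Abstract: Robust object detection for unmanned surface vehicles (USVs) in complex water environments is crucial for reliable navigation and operation. Specifically, water surface object detection faces the challenges of blurred edges and varying object scales. Although vision-radar fusion offers a feasible solution, existing methods suffer from cross-modal feature conflicts, which negatively affect model robustness. To address this problem, we propose WS-DETR, a robust vision-radar fusion model. In particular, we first introduce a Multi-Scale Edge Information Integration (MSEII) module to enhance edge perception and a Hierarchical Feature Aggregator (HiFA) to enhance multi-scale object detection in the encoder. Then, we adopt self-moving point representations for continuous convolution and residual connections to efficiently extract irregular features from irregular point cloud data. To further alleviate cross-modal conflicts, an Adaptive Feature Interactive Fusion (AFIF) module is introduced to integrate visual and…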
|
|
17:30-17:45, Paper Tu-S3-T1.7 | |
Lightweight and Dynamic Content-Augmented Object Detection for UAVs |
|
Sadeghi Bakhi, Mahdi | University of Calgary |
Leung, Henry | University of Calgary |
Wang, Xin | University of Calgary |
Keywords: Deep Learning, Machine Vision, Machine Learning
Abstract: This paper presents a novel lightweight and robust object detection framework tailored for UAV-based applications. The core of the proposed method is the Dynamic Content-Augmented Feature Pyramid Network (DCA-FPN), which integrates a Global Content Extraction Module (GCEM), an Adaptive Branching Network (ABN), and a Linear Transformer (LT) to enhance multi-scale feature representation and contextual understanding. These components collectively improve detection performance for small and occluded objects while addressing the misalignment issues inherent in traditional feature pyramids. Built on a MobileNet backbone with depthwise separable convolutions, the framework offers low computational complexity and real-time readiness for edge devices. Experimental results on the VisDrone dataset show a state-of-the-art mean Average Precision (mAP) of 42.50%, while additional evaluations on the MS COCO benchmark confirm competitive performance across diverse object scales. Furthermore, testing on the GDIT Aerial Airport dataset demonstrates the model’s applicability in infrastructure monitoring tasks, particularly in detecting airplanes across varied sizes and conditions. These results highlight the robustness, efficiency, and deployment potential of the proposed framework in real-world, resource-constrained UAV scenarios.
|
|
Tu-S3-T2 |
Hall N |
Application of Artificial Intelligence 5 |
Regular Papers - Cybernetics |
Chair: Widl, Edmund | Austrian Institute of Technology |
Co-Chair: Li, Jiacheng | South China University of Technology |
|
16:00-16:15, Paper Tu-S3-T2.1 | |
A Wavelet-Enhanced Sparse Framework for Time Series Forecasting |
|
Li, Jiacheng | South China University of Technology |
Chen, Wei | South China University of Technology |
Liu, Yican | South China University of Technology |
Zeng, Delu | South China University of Technology |
Keywords: Application of Artificial Intelligence, Evolutionary Computation, Hybrid Models of Neural Networks, Fuzzy Systems, and Evolutionary Computing
Abstract: Time series forecasting is crucial in areas like smart grids, traffic flow management, and financial analysis, especially for Long-sequence Time Series Forecasting (LTSF) tasks. We propose WaveSparseTSF, a lightweight LTSF model designed to handle complex temporal dependencies while minimizing computational costs. It integrates wavelet transform and cross-period sparse forecasting, which separate the data into low and high frequency components and focus on periodic trends to reduce complexity. By processing raw sequences through wavelet decomposition and downsampling, WaveSparseTSF effectively captures both global trends and local fluctuations. Despite using only around 1k parameters, it achieves competitive or superior results compared to state-of-the-art models and demonstrates strong generalization capabilities. As a result, WaveSparseTSF is particularly well-suited for environments with limited resources, small datasets, or low-quality data.
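Editorial illustration of splitting a series into low- and high-frequency components with downsampling: the abstract does not name the wavelet WaveSparseTSF uses, so this sketch assumes a single-level Haar transform. The function names and the sample series are hypothetical.

```python
import numpy as np


def haar_decompose(x):
    """One-level Haar transform: returns (approximation, detail), each half as long as x."""
    x = np.asarray(x, dtype=float)
    if x.size % 2:
        raise ValueError("series length must be even")
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)   # low-frequency part (trend)
    detail = (even - odd) / np.sqrt(2.0)   # high-frequency part (local fluctuation)
    return approx, detail


def haar_reconstruct(approx, detail):
    """Invert haar_decompose by interleaving the recovered even and odd samples."""
    x = np.empty(2 * approx.size)
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x


series = np.array([2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 9.0, 7.0])
approx, detail = haar_decompose(series)
restored = haar_reconstruct(approx, detail)
```

Each component has half the original length, so a forecaster operating on them sees shorter inputs, and the transform is invertible, so no information is lost in the split.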
|
|
16:15-16:30, Paper Tu-S3-T2.2 | |
MambaControl: Anatomy Graph-Enhanced Mamba ControlNet with Fourier Refinement for Diffusion-Based Disease Trajectory Prediction |
|
Yang, Hao | Macao Polytechnic University |
Tan, Tao | Macao Polytechnic University |
Yang, Weiqin | University of Adelaide |
Tan, Shuai | Zhejiang University |
Cai, Kunyan | Macao Polytechnic University |
Chen, Calvin | University of Cambridge |
Sun, Yue | Macao Polytechnic University |
Keywords: Application of Artificial Intelligence, Image Processing and Pattern Recognition, Biometric Systems and Bioinformatics
Abstract: Modelling disease progression in precision medicine requires capturing complex spatio-temporal dynamics while preserving anatomical integrity. Existing methods often struggle with longitudinal dependencies and structural consistency in progressive disorders. To address these limitations, we introduce MambaControl, a novel framework that integrates selective state-space modelling with diffusion processes for high-fidelity prediction of medical image trajectories. To better capture subtle structural changes over time while maintaining anatomical consistency, MambaControl combines Mamba-based long-range modelling with graph-guided anatomical control to more effectively represent anatomical correlations. Furthermore, we introduce Fourier-enhanced spectral graph representations to capture spatial coherence and multiscale detail, enabling MambaControl to achieve state-of-the-art performance in Alzheimer’s disease prediction. Quantitative and regional evaluations demonstrate improved progression prediction quality and anatomical fidelity, highlighting its potential for personalised prognosis and clinical decision support.
|
|
16:30-16:45, Paper Tu-S3-T2.3 | |
Robustness Evaluation of Tactics, Techniques, and Procedures Knowledge in Large Language Models |
|
Chen, Yikai | Beihang University |
Lang, Bo | Beihang University |
Xiao, Nan | Beihang University |
Li, Xiangyu | Beihang University |
Chen, Ruibo | Beihang University |
Keywords: Application of Artificial Intelligence, Information Assurance and Intelligence, Expert and Knowledge-Based Systems
Abstract: Extracting Tactics, Techniques, and Procedures (TTP) from unstructured threat reports is a critical challenge in cyber threat intelligence (CTI). Although large language models (LLMs) may automate TTP extraction, their understanding of TTP knowledge and robustness remains unverified due to the lack of evaluation benchmarks. To address this, we propose TTP-RoB, a novel framework for probing the TTP knowledge and robustness inherent in LLMs. In TTP-RoB, by analyzing real-world CTI reports, we identify key interference factors: the absence of correct answers and the interference of similar technique names. Then, we develop a knowledge sampling algorithm to select representative examples from a relevant knowledge base. Finally, we construct a probing dataset that integrates the identified interferences and sampled knowledge. Our TTP-RoB evaluation of mainstream LLMs, including the DeepSeek, GPT, Llama, and Qwen series, demonstrates that while all models show strong understanding of basic technical concepts, their robustness varies significantly under different interference conditions. The models generally maintain stable performance when faced with the absence of correct answers, but display substantial vulnerability to the interference of similar technique names, suffering accuracy drops ranging from 15.4% to 59.1%. Overall, the robustness exhibits a significant decline as model scale decreases. Furthermore, the evaluation results reveal that the distilled models (e.g., DeepSeek-R1-Distill-Qwen) exhibit significantly lower robustness compared to their foundation models. These results confirm that rigorous evaluation for model selection is critical before using LLMs for TTP extraction and other CTI tasks.
|
|
16:45-17:00, Paper Tu-S3-T2.4 | |
A Jacobi-Based Conjugate Gradient Solver for Sparse Linear Systems on Multi-GPUs |
|
Benatia, Akrem | Ecole Militaire Polytechnique |
Amara, Yacine | Ecole Militaire Polytechnique |
Keywords: Application of Artificial Intelligence, Machine Learning, Big Data Computing
Abstract: The Preconditioned Conjugate Gradient (PCG) algorithm is a widely-used iterative method for solving sparse linear systems. The sparse matrix-vector multiplication (SpMV) operation involved in the PCG algorithm dominates the computing cost of each iteration. To accelerate the PCG computation on multi-GPU systems, the input sparse matrix has to be partitioned on different GPUs. In addition, an appropriate sparse format must be used for each partition. In this paper, we propose a new multi-GPU PCG implementation that relies on horizontally partitioning the input matrix into multiple block-rows so that different sparse formats can be used at a low granularity. We then use a mapping algorithm based on the Minimum Completion Time (MCT) policy to assign different block-rows to the GPUs available in the system. Our experimental results with real-world large sparse matrices reveal a noticeable performance improvement, especially for the sparse systems characterized by irregular sparsity structure.
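Editorial illustration of the underlying iteration: a minimal single-process sketch of conjugate gradient with a Jacobi (diagonal) preconditioner, using a dense NumPy matrix. It does not reproduce the paper's multi-GPU partitioning, per-block sparse formats, or MCT mapping; the function name and the small test system are hypothetical.

```python
import numpy as np


def jacobi_pcg(A, b, tol=1e-10, max_iter=1000):
    """Solve A x = b for symmetric positive-definite A with Jacobi-preconditioned CG."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    inv_diag = 1.0 / np.diag(A)      # Jacobi preconditioner: M^{-1} = diag(A)^{-1}
    x = np.zeros_like(b)
    b_norm = np.linalg.norm(b)
    if b_norm == 0.0:
        return x                     # the zero vector solves A x = 0
    r = b - A @ x                    # initial residual
    z = inv_diag * r                 # preconditioned residual
    p = z.copy()                     # initial search direction
    rz = r @ z
    for _ in range(max_iter):
        if np.linalg.norm(r) <= tol * b_norm:
            break
        Ap = A @ p                   # matrix-vector product, the dominant cost per iteration
        alpha = rz / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        z = inv_diag * r
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x


# Small symmetric positive-definite system, used only as a demonstration.
A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
b = np.array([1.0, 2.0, 3.0])
x = jacobi_pcg(A, b)
```

The product `A @ p` is the step that a sparse, multi-GPU implementation would distribute; everything else in the loop is vector arithmetic and dot products.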
|
|
17:00-17:15, Paper Tu-S3-T2.5 | |
An Efficient and Compact Network for Simultaneous Multi-Object Tracking and Behavior Monitoring in Pigeon Farming |
|
Xie, Jiefeng | Macao Polytechnic University |
Tan, Tao | Macao Polytechnic University |
Liu, Yaoji | Zhongkai University of Agriculture and Engineering |
Ye, Tao | Zhongkai University of Agriculture and Engineering |
Zhang, Jinyi | Zhongkai University of Agriculture and Engineering |
Feng, Dachun | College of Information Science and Technology, Zhongkai Universi |
Sun, Yue | Macao Polytechnic University |
Keywords: Application of Artificial Intelligence, Machine Vision, Multimedia Computation
Abstract: Animal monitoring plays a crucial role in agriculture, especially in improving animal welfare and farming efficiency. However, existing methods cannot simultaneously achieve video-based tracking for each individual while monitoring their behaviors. To overcome this limitation and meet the demand for automated surveillance in large-scale pigeon farming, this study proposes an efficient and compact network for simultaneous multi-object tracking and behavior monitoring in pigeon farming. The network is capable of tracking each pigeon while recognizing its behavior in a video sequence. The network combines a DETR detector built on the lightweight RepViT backbone with the BoT-SORT motion tracker, which tracks accurately while saving costs. A text-guided multimodal action recognition model is introduced in the action recognition stage, which combines spatiotemporal video features with semantic text embeddings to enhance classification accuracy. Experimental results show that the proposed method achieves the best tracking performance (IDF1: 96.58%) and action recognition accuracy (Top-1: 94.89%), while reducing the network complexity (13.6M parameters) and computational cost (45.67 GFLOPs). This method provides effective technical support to promote accurate management of poultry farming.
|
|
17:15-17:30, Paper Tu-S3-T2.6 | |
FIN-SIGN: A GNN-Based Learning Model for Online Lending Fraud Detection |
|
Song, Xiaodi | Bank of Shanghai |
Chen, Zhijian | Tongji University |
Wang, Xiaoguo | Tongji University |
Zhu, Hongming | Tongji University |
Keywords: Application of Artificial Intelligence, Neural Networks and their Applications
Abstract: With the rapid development of Fintech, increasing fraudulent behaviors have posed a threat to the smooth functioning of online lending platforms, leading to substantial losses for financial institutions. Enhancing the fraud detection capability of these platforms has thus become an urgent need. In the lending network of PPDAI, a well-known lending platform, two key characteristics are observed in the user data: (1) the presence of indicative missing values in node features; (2) the presence of weak homophily within the network structure. However, in previous work, the widely used GNNs rely on the homophily assumption, and popular GNN-based fraud detection methods overlook the utilization of missing values, which limits the performance of lending fraud detection. To address these problems, we propose FIN-SIGN, a GNN-based model for online lending fraud detection. The proposed model leverages a Missing Mask to capture indicative information from missing values, alleviates the weak homophily through structure augmentation, and finally identifies fraudulent users with a SIGN-based GNN detector. Experimental results on the PPDAI dataset DGraph-Fin demonstrate the effectiveness of FIN-SIGN.
|
|
17:30-17:45, Paper Tu-S3-T2.7 | |
SAM2v-BTR: Accelerating SAM 2 Training for 3D Medical Image Segmentation through Bootstrap and Memory Annealing |
|
Zhang, Enzhi | Hokkaido University |
Iwasawa, Junichiro | Preferred Networks Inc |
Oda, Keita | Preferred Networks |
Tokuoka, Yuta | Preferred Networks, Inc |
Keywords: Application of Artificial Intelligence, Deep Learning, Image Processing and Pattern Recognition
Abstract: Vision Transformers (ViTs) have revolutionized image classification, but their application in dense prediction tasks such as segmentation, particularly for 3D medical imaging, faces scalability challenges. To address these issues, there has been growing interest in utilizing foundation models like the Segment Anything Model (SAM) for medical image analysis. The recent introduction of SAM 2 has generated optimism regarding enhancements for tasks such as 3D MRI segmentation. However, SAM 2’s memory bank mechanism, designed for object tracking and segmentation, introduces computational complexity and memory overhead, limiting training efficiency. In this paper, we address these limitations by proposing a bootstrap and memory annealing approach. The bootstrap mechanism replaces the memory bank with ground truth data, significantly enhancing training speed without compromising performance, achieving a 6x improvement on 3D medical image benchmarks such as KiTS and ACDC. To counteract potential overfitting caused by the bootstrap approach, we introduce memory annealing, which adaptively adjusts the selection of ground truth frames based on validation loss, resulting in a 5.46% improvement on the KiTS dataset. Our approach accelerates model training while maintaining or improving segmentation performance, offering a robust solution for efficient 3D medical image segmentation.
|
|
17:45-18:00, Paper Tu-S3-T2.8 | |
Teacher-Guided Code Generation with API Update |
|
Sun, Tianze | Harbin Institute of Technology |
Wang, Zekun | OpenCSG |
Wang, Wei | OpenCSG |
Chen, Ran | OpenCSG |
Pei, Ji | OpenCSG |
Keywords: Application of Artificial Intelligence, AI and Applications, Deep Learning
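Abstract: Large language models have shown considerable potential in areas such as program writing and API invocation, significantly improving automation and efficiency. However, frequent API updates pose challenges to these models, particularly the high cost of training and the complexity of maintaining and updating the model's knowledge. Current strategies for handling these problems generally fall into two categories: augmented inference and parameter tuning. Among them, retrieval-augmented generation (RAG), the main augmented-inference method, assists code models in generating code during inference by modifying the prompt. However, because it does not change the model's parameters, the external knowledge may conflict with the knowledge stored in the model from training. Parameter-tuning methods, on the other hand, incorporate API updates into the training data, modify the code model's parameters, and enable re-inference. However, they struggle to train models effectively because high-quality samples are scarce. To address these...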
|
|
Tu-S3-T5 |
Room 0.14 |
Human-Machine Interaction 3 |
Regular Papers - HMS |
Chair: Peng, Guangzhu | Nanjing University of Information Science and Technology |
Co-Chair: Li, Bo | The Hong Kong Polytechnic University |
|
16:00-16:15, Paper Tu-S3-T5.1 | |
Adaptive Impedance Learning for Robots Interacting with Unknown Environments Via Streaming Sparse Gaussian Processes |
|
Fang, Zheng | Nanjing University of Information Science and Technology |
Peng, Guangzhu | Nanjing University of Information Science and Technology |
Zhong, Yanzhi | Nanjing University of Information Science & Technology |
Yang, Chenguang | University of Liverpool |
Keywords: Human-Machine Interaction, Human-Machine Cooperation and Systems, Environmental Sensing
Abstract: Impedance control with fixed parameters lacks the flexibility to adapt to dynamic and uncertain environments and may not meet the task requirements during robot-environment interaction. In this paper, a novel impedance learning control method is proposed to enhance robotic adaptability in unknown environments. First, an adaptive gradient learning strategy is designed to optimize the step size in the iterative learning process, leveraging historical gradients for dynamic adjustment. Then, a data-efficient adaptive model based on the Streaming Sparse Gaussian Process (SSGP) is employed to accelerate impedance learning convergence. It uses prior data to estimate impedance parameters, and it reduces computational complexity and improves generalization by removing redundant data points online. Simulation results demonstrate that the proposed method outperforms traditional iterative learning control approaches in convergence speed and generalization, verifying its feasibility and validity.
|
|
16:15-16:30, Paper Tu-S3-T5.2 | |
Hierarchical Procedural Framework for Low-Latency Robot-Assisted Hand-Object Interaction |
|
Yuan, Mingqi | The Hong Kong Polytechnic University |
Wang, Huijiang | University of Cambridge |
Chu, Kai-Fung | University of Cambridge |
Iida, Fumiya | University of Cambridge |
Li, Bo | The Hong Kong Polytechnic University |
Zeng, Wenjun | Eastern Institute of Technology, Ningbo |
Keywords: Human-Machine Interaction, Human-Machine Cooperation and Systems, Human-Machine Interface
Abstract: Advances in robotics have been driving the development of human-robot interaction (HRI) technologies. However, accurately perceiving human actions and achieving adaptive control remains a challenge in facilitating seamless coordination between human and robotic movements. In this paper, we propose a hierarchical procedural framework to enable dynamic robot-assisted hand-object interaction (HOI). An open-loop hierarchy leverages the RGB-based 3D reconstruction of the human hand, based on which motion primitives have been designed to translate hand motions into robotic actions. The low-level coordination hierarchy fine-tunes the robot's action by using the continuously updated 3D hand models. Experimental validation demonstrates the effectiveness of the hierarchical control architecture. The adaptive coordination between human and robot behavior has achieved a delay of ≤ 0.3 seconds in the tele-interaction scenario. A case study of ring-wearing tasks indicates the potential application of this work in assistive technologies such as healthcare and manufacturing.
|
|
16:30-16:45, Paper Tu-S3-T5.3 | |
Evaluation of Different Modalities for Interacting with a Tasking Agent in Manned-Unmanned Teaming Missions |
|
Künzel, Dominik | University of the Bundeswehr Munich |
Wuwer, Vivien | University of the Bundeswehr Munich |
Roth, Gunar | University of the Bundeswehr Munich |
Schulte, Axel | Bundeswehr University Munich |
Keywords: Human-Machine Interaction, Human-Machine Cooperation and Systems, Human-Machine Interface
Abstract: In our contribution, we investigate the impact of different modalities for tasking unmanned vehicles in Manned-Unmanned Teaming (MUM-T) scenarios. As pilots have to manage unmanned assets from the cockpit, human-machine interaction becomes critical to mission success. In this study we assessed the touch and voice modalities in a military helicopter simulator, measuring workload, usability, and mission efficiency. It has been shown that voice interaction reduces workload and improves usability, as well as mission performance, while touch input remains valuable as a backup. The findings underline the need for improved interaction design in future MUM-T systems to enhance safety and mission efficiency in high-demand flight environments.
|
|
16:45-17:00, Paper Tu-S3-T5.4 | |
Improving Model Generalization across Domains with Multi-Scale Feature Aggregation Filtering and Consistency Loss |
|
Tuoxin, Li | National University of Defense Technology |
Xiang, Fengtao | National University of Defense Technology |
Chen, Junhai | National University of Defense Technology |
Wang, Chang | National University of Defense Technology |
Keywords: Human-Machine Interaction, Human-Machine Cooperation and Systems, Networking and Decision-Making
Abstract: This work presents a novel Multi-Scale Feature Aggregation Filtering and Consistency Loss (MFAFC) method to improve the extraction of domain-invariant features, which is crucial for enhancing human-machine system adaptability in dynamic environments. Unlike existing techniques that primarily focus on domain-invariant features while neglecting the influence of domain-specific features, this approach considers both aspects. By capturing information across multiple scales, it suppresses domain-specific features and improves the quality of domain-invariant feature extraction. A momentum-based inference consistency loss function is also introduced, using category center consistency to boost model robustness. By combining multi-scale extraction with the momentum-based loss, the method effectively handles domain shift. Experiments on various public datasets show excellent performance in domain generalization, reducing the impact of domain-specific features and improving task ergonomics and cognitive performance in real-world systems.
|
|
17:00-17:15, Paper Tu-S3-T5.5 | |
Impact of Tasking Modalities on Pilot Flight Behavior in Manned-Unmanned Teaming Missions |
|
Wuwer, Vivien | University of the Bundeswehr Munich |
Künzel, Dominik | University of the Bundeswehr Munich |
Schulte, Axel | Bundeswehr University Munich |
Keywords: Human-Machine Interaction, Human-Machine Interface, Human-Machine Cooperation and Systems
Abstract: This study examines the impact of different tasking modalities - touch versus voice - on pilot performance in manned-unmanned teaming (MUM-T) mission scenarios. MUM-T operations place high cognitive and operational demands on military helicopter pilots, requiring simultaneous control of their own aircraft and coordination of unmanned aerial vehicles (UAVs). A simulator study was conducted analyzing subjective workload, head-down time, pilot flight behavior, and autopilot usage. Although pilots subjectively reported that voice tasking reduced head-down time and supported better flight performance, objective measurements showed only minor improvements or inconsistent results. Over 75 % head-down time was recorded across all conditions. Voice input alone does not notably mitigate visual demands. Task complexity, training, and context play a critical role. The findings underline the need for improved interaction design and more reliable flight automation in future MUM-T systems to enhance safety and mission efficiency in high-demand flight environments.
|
|
17:15-17:30, Paper Tu-S3-T5.6 | |
CareEmo: Supporting Caregivers with Personalized Communication Approaches to Enhance Older Adults’ Emotional Well-Being |
|
Li, Muchen | Zhejiang University & China Unicom Data Intelligence Co., LTD |
Xiang, Wei | Zhejiang University |
He, Yuyu | University of Waikato |
Jiang, Mengyun | Zhejiang University |
Wu, Xueting | Zhejiang University |
Keywords: Human-Computer Interaction
Abstract: Memory plays a crucial role in caregiving, supporting caregivers in addressing the emotional needs of the elderly and in arranging appropriate topics over multiple rounds of care. However, this is burdensome, especially when caregivers need to serve multiple elderly people. This study presents CareEmo, a caregiver assistant that provides memory and emotion support to caregivers through Bluetooth earbuds, facilitating personalized communication and enhancing care recipients' emotional well-being. Through a formative study that included field observations and interviews with caregivers, family members, and care recipients, we designed three modules (emotion recognition, user profile creation, and care advice provision) to address the challenges of diverse emotional needs in care recipients. CareEmo identifies emotional changes in care recipients, summarizes their emotional needs and interests, and then provides caregivers with appropriate care advice based on recipients' memories and caregiving histories. An empirical study involving 16 participants (8 caregivers and 8 care recipients) showed that CareEmo improved the emotional care experience of both caregivers and care recipients and contributed to high-quality emotional caregiving.
|
|
Tu-S3-T6 |
Room 0.16 |
System Modeling and Control 3 |
Regular Papers - SSE |
Chair: Araujo, Jean | Universidade Federal Do Agreste De Pernambuco |
Co-Chair: Dantas, Jamilson | UFPE |
|
16:00-16:15, Paper Tu-S3-T6.1 | |
Availability and Reliability Assessment of Tier I Data Center Infrastructure |
|
Souza, Lubnnia | Universidade Federal De Sergipe |
Camboim, Kadna | UFAPE |
Araujo, Jean | Universidade Federal Do Agreste De Pernambuco |
Keywords: System Modeling and Control, Quality and Reliability Engineering, Infrastructure Systems and Services
Abstract: Over the years, data centers have evolved to meet the most diverse demands of web services and applications, such as cloud computing, e-commerce, social networking, artificial intelligence, streaming, and healthcare services. These large data centers must meet several dependability requirements to ensure quality of service with high reliability and availability, reducing interoperability time, since this is an important competitive factor for companies. Data centers have three infrastructures: IT (Information Technology), electrical, and cooling. Therefore, for data centers to achieve high availability, these infrastructures must be designed according to redundancy specifications ranging from Tier I to Tier IV. This paper presents an SPN (Stochastic Petri Nets) model for planning the infrastructure of a Tier I data center, composed of the three subsystems mentioned. The evaluation will focus on key metrics, including availability, reliability, downtime, and uptime. A sensitivity analysis of the constructed model was performed to verify which subsystem impacts the system behavior.
|
|
16:15-16:30, Paper Tu-S3-T6.2 | |
SDN-Driven MEC Planning: Modeling for Capacity-Oriented Availability |
|
Barros Nascimento, Erick | Federal University of Pernambuco |
Araujo, Jean | Universidade Federal Do Agreste De Pernambuco |
Tavares, Eduardo | Universidade Federal De Pernambuco |
Dantas, Jamilson | UFPE |
Maciel, Paulo | UFPE |
Keywords: System Modeling and Control, System Architecture, Infrastructure Systems and Services
Abstract: The evolution of mobile technology from the fifth-generation (5G) radio access has amplified challenges. As new requirements emerge, there is a need for enhanced computational capacity for planning, operation, and availability posed by 5G. Multi-access Edge Computing (MEC) and Software-defined networks (SDN) are fundamental to addressing capacity planning and availability. This paper presents a hierarchical modeling approach using reliability block diagrams (RBD) and continuous-time Markov chains (CTMC) to estimate the MEC SDN-based availability and capacity-oriented availability (COA). The proposed models allow analytical evaluations for calculating dependability metrics, showing an availability increase from 97.33% to 99.58%, with only two clustered server nodes, effectively reducing annual downtime from 357.32 hours to 0.65 hours.
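The reliability block diagram (RBD) composition mentioned in the abstract can be illustrated with the standard series and parallel availability formulas. The sketch below is only an illustration under assumed inputs: the component names and availability values are hypothetical and are not taken from the paper, and it does not reproduce the paper's CTMC or COA models.

```python
from math import prod


def series(availabilities):
    """Availability of blocks in series: every block must be up."""
    return prod(availabilities)


def parallel(availabilities):
    """Availability of redundant blocks in parallel: at least one must be up."""
    return 1.0 - prod(1.0 - a for a in availabilities)


def annual_downtime_hours(availability, hours_per_year=8760.0):
    """Expected downtime per year implied by a steady-state availability."""
    return (1.0 - availability) * hours_per_year


# Hypothetical component availabilities (assumptions for illustration only).
server = 0.97
controller = 0.999

# One server in series with the controller.
single = series([server, controller])

# Two clustered servers in parallel, then in series with the controller.
clustered = series([parallel([server, server]), controller])

print(f"single:    A={single:.5f}, downtime={annual_downtime_hours(single):.1f} h/yr")
print(f"clustered: A={clustered:.5f}, downtime={annual_downtime_hours(clustered):.1f} h/yr")
```

With these assumed values, adding a second redundant server raises the composed availability, which is the qualitative effect the abstract reports for two clustered server nodes.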
|
|
16:30-16:45, Paper Tu-S3-T6.3 | |
Innovative Modeling Based Framework to Enhance the Safety and Stability of Motion Simulation |
|
Scheidel, Hendrik | Deakin University |
Asadi, Houshyar | Deakin University |
Bellmann, Tobias | Deutsches Zentrum Für Luft Und Raumfahrt E.V |
Seefried, Andreas | German Aerospace Center |
Mohamed, Shady | Senior Research Fellow, Deakin University |
Nahavandi, Saeid | Swinburne University of Technology |
Keywords: System Modeling and Control, Technology Assessment, Digital Twin
Abstract: Motion simulation can substantially improve the immersion of any form of vehicle simulation. However, inadequate motion simulation can fully break the immersion or even induce adverse effects, such as discomfort, motion sickness, or other harm to the simulator occupants. The selection of a stable and safe motion cueing algorithm (MCA) is therefore essential. In particular, complex and simultaneously real-time capable MCAs can carry the risk of instability. This phenomenon can be observed in non-linear model predictive control-based MCAs when the prediction horizon is defined too short, and in learning-based MCAs when there is insufficient utilization of training data. Specifically, the employment of artificial neural networks in the modeling of MCAs can lead to problems such as lack of generalization or overfitting, which, combined with the difficult interpretability due to the black-box character, makes analytical guarantees difficult. The problem is further intensified when the MCA is in a control loop with a motion platform that has a highly non-linear behavior. This work proposes a sample-based framework that utilizes simulative modeling of the deployed simulator platform to investigate the behavior of MCAs. The proposed framework is structured to initialize the system in random states, which allows for a comprehensive investigation of the behavior of any non-specific MCA. The framework is applied to two variations of one state-of-the-art MCA, and the results are compared. It is shown that the framework can identify deficiencies in the trajectory planning of MCAs. Thus, it is able to contribute significantly to the safety and stability of motion simulation.
|
|
16:45-17:00, Paper Tu-S3-T6.4 | |
Performance Hierarchical Modeling of Microservices Using Stochastic Petri Nets |
|
Pinheiro, Thiago | Federal University of Pernambuco |
Mialaret Júnior, Marco | Faculdade Senac |
Dantas, Jamilson | UFPE |
Maciel, Paulo | UFPE |
Keywords: Discrete Event Systems, System Architecture, System Modeling and Control
Abstract: In this paper, we present a hierarchical modeling strategy that combines stochastic Petri nets (SPNs) with an iterative algorithm to evaluate performance in containerized microservices. Our approach models both synchronous and asynchronous calls, bounded queues, and network constraints, and then dynamically adjusts the number of replicas to keep the discard probability within predefined thresholds. By analyzing each microservice in a modular fashion, we avoid exploding the state space and achieve accurate predictions for different load and bandwidth scenarios. In experiments with 21 containerized microservices under three bandwidth constraints (10, 40 and 100 MB/s), our method correctly predicted throughput, container usage and bandwidth consumption within 95% confidence intervals, confirming its effectiveness for resource allocation decisions in microservice platforms.
|
|
17:00-17:15, Paper Tu-S3-T6.5 | |
Trajectory Alignment: A Method for Extracting Main Routes from Large Trajectory Data |
|
Hiraishi, Kunihiko | Japan Advanced Inst. of Sci. and Tech |
Keywords: Discrete Event Systems, System Modeling and Control
Abstract: We propose a method for extracting important routes from large trajectory data. The method consists of two phases. In the first phase, we recognize an event occurrence whenever a predefined condition is satisfied in a trajectory and extract a time series of events. Further analysis is applied to the resulting event sequences, which are much smaller than the original trajectory data. The conditions for event extraction are determined by the purpose of the analysis. In the second phase, event patterns that appear frequently in the event sequences are discovered. For this purpose, we propose a new method called trajectory alignment. This method is an adaptation of sequence alignment, used in bioinformatics, to trajectory data. The proposed approach is applied to an artificial data set and two real data sets.
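For readers unfamiliar with sequence alignment, the following is a minimal sketch of classic global alignment (Needleman-Wunsch) applied to two event sequences. It is not the authors' trajectory alignment method; the event labels and the scoring values (`match`, `mismatch`, `gap`) are assumptions chosen for illustration.

```python
def align(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment of two event sequences (Needleman-Wunsch).

    Returns (score, aligned_a, aligned_b); None marks a gap.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
    # Trace back from the bottom-right cell to recover one optimal alignment.
    i, j = n, m
    out_a, out_b = [], []
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append(None)
            i -= 1
        else:
            out_a.append(None)
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], out_a[::-1], out_b[::-1]


# Hypothetical event sequences extracted from two trajectories.
route_1 = ["enter", "A", "B", "exit"]
route_2 = ["enter", "A", "exit"]
best, aligned_1, aligned_2 = align(route_1, route_2)
print(best, aligned_1, aligned_2)
```

Events that line up across many aligned sequences are candidates for a shared main route, while gaps mark events that occur in only some trajectories.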
|
|
17:15-17:30, Paper Tu-S3-T6.6 | |
A Two-Stage GNN for Joint UAV Positioning and Relay Routing |
|
Ren, Qianchen | Xiamen University |
Liu, Han | XiaMen University |
Tang, Yuliang | XiaMen University |
Li, Shao zi | XiaMen University |
Keywords: Communications, System Modeling and Control, Adaptive Systems
Abstract: In modern warfare, bionic robots are increasingly deployed to execute high-risk missions. To ensure robust remote control and command in complex urban environments, unmanned aerial vehicles (UAVs) are utilized as aerial relays to facilitate reliable data transmission. This paper investigates the joint optimization of UAV positioning and multi-hop relay path selection in UAV-assisted wireless networks. We formulate the problem as a graph-based optimization task and propose a two-stage Graph Neural Network (GNN) framework. In the first stage, a reinforcement learning (RL) enhanced Relay Path GNN (RPG) is developed to enable low-latency and efficient routing. In the second stage, a UAV Position GNN (UPG) determines near-optimal UAV deployment strategies. Both modules are designed to train without labeled data, relying instead on unsupervised and RL techniques tailored to the graph-structured problem, making the approach data-efficient and robust. Simulation results show that the proposed framework UPG-RPG achieves near-optimal performance with substantially reduced computational complexity and demonstrates superior scalability and adaptability compared to conventional heuristic or rule-based methods.
|
|
17:30-17:45, Paper Tu-S3-T6.7 | |
Temporal Deep Unrolling-Based MPC for Vehicle Trajectory Tracking (I) |
|
Sone, Taiga | Hiroshima University |
Ogura, Masaki | Hiroshima University |
Kishida, Masako | National Institute of Informatics |
Keywords: System Modeling and Control, Autonomous Vehicle
Abstract: This paper presents a trajectory tracking control method for autonomous vehicles based on Temporal Deep Unrolling-based Model Predictive Control (TDU-MPC). By temporally unrolling the state transitions of the vehicle dynamics to obtain a deep neural network and utilizing back-propagation, the proposed method enables efficient optimization of control inputs subject to complex nonlinearities that challenge conventional approaches. Comprehensive simulation experiments across diverse reference trajectories and disturbance conditions demonstrate that the proposed TDU-MPC consistently outperforms conventional Linear Time-Varying MPC (LTV-MPC), achieving superior tracking accuracy with smaller cumulative lateral error while exhibiting strong robustness to disturbance. Additional experiments using a hand-drawn, bird-shaped trajectory confirm the method's ability to stably track complex and highly nonlinear trajectories. These findings suggest that TDU-MPC offers a promising framework for achieving high-precision and robust trajectory tracking.
|
|
Tu-S3-T7 |
Room 0.31 |
Brain-Based Information Communications 2 |
Regular Papers - HMS |
Chair: Chan, Ho Tung Jeremy | Graz University of Technology |
Co-Chair: Wimmer, Michael | Know Center GmbH |
|
16:00-16:15, Paper Tu-S3-T7.1 | |
Informing EEG-Based Error Decoding with Explainable AI |
|
Chan, Ho Tung Jeremy | Graz University of Technology |
Wimmer, Michael | Know Center GmbH |
Šimić, Ilija | Know Center GmbH |
Müller-Putz, Gernot | Graz University of Technology |
Veas, Eduardo | Graz University of Technology |
Keywords: Brain-Computer Interfaces
Abstract: Human cognition involves intricate neural processes for error perception and correction. These processes are crucial in error-monitoring processes such as feedback, learning, control, and decision-making. We present a complete workflow using explainable artificial intelligence (XAI) to guide the feature extraction of electroencephalographic (EEG) signals in a classification task with error-related brain responses. The identification of relevant channels for classification problems has practical relevance, as a dense electrode setup reduces the usability of brain-computer interfaces (BCIs). Specialists can inspect and select seemingly important sensors based on knowledge of brain regions and neural patterns. However, reduced configurations often harm model performance. The contribution of this work lies in demonstrating that XAI can be used to inform the extraction of relevant temporal and spatial information with a strong connection to machine learning model sensitivity. We employed a local and a global XAI method to i) evaluate consistency with expert knowledge, ii) identify relevant time points for asynchronous error decoding, and iii) systematically reduce the EEG setup. This advances the integration of XAI in neuroscience, thus contributing to the design of practical BCIs.
|
|
16:15-16:30, Paper Tu-S3-T7.2 | |
Emotional EEG Classification Using Upscaled Connectivity Matrices |
|
Lee, Chae-Won | Yonsei University |
Lee, Jong-Seok | Yonsei University |
Keywords: Brain-Computer Interfaces, Affective Computing
Abstract: Recent studies have demonstrated the effectiveness of using connectivity matrices as input to convolutional neural networks (CNNs) for emotional EEG classification, as they can effectively capture interregional interaction patterns. However, these matrices often suffer from loss of critical spatial information during convolution operations. To address this issue, we propose a simple yet effective approach: upscaling connectivity matrices to enhance local patterns. Experimental results show that this technique significantly improves classification performance, highlighting the importance of preserving spatial structures in early processing stages.
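As a rough illustration of what upscaling a connectivity matrix can look like, the sketch below repeats every entry into a `factor` x `factor` block (nearest-neighbor upscaling). The abstract does not state which interpolation the authors use, so the method, the function name `upscale`, and the toy values are all assumptions.

```python
def upscale(matrix, factor):
    """Nearest-neighbor upscaling: each entry becomes a factor x factor block."""
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    out = []
    for row in matrix:
        # Repeat each value horizontally, then repeat the widened row vertically.
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Toy 2 x 2 connectivity matrix (hypothetical values).
connectivity = [[1.0, 0.2],
                [0.2, 1.0]]
upscaled = upscale(connectivity, 2)
for row in upscaled:
    print(row)
```

After upscaling, a small convolution kernel covers a single channel pair or its immediate neighbors rather than mixing many pairs at once, which is one plausible reading of how local patterns are enhanced.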
|
|
16:30-16:45, Paper Tu-S3-T7.3 | |
Leveraging Foundation Models for Calibration-Free C-VEP BCIs |
|
Behboodi, Mohammadreza | University of Calgary |
Kinney-Lang, Eli | University of Calgary |
Etemad, Ali | Queen's University |
Kirton, Adam | University of Calgary |
Abou-Zeid, Hatem | University of Calgary |
Keywords: Brain-Computer Interfaces, Assistive Technology
Abstract: Foundation Models (FMs) have surged in popularity over the past five years, with applications spanning fields from computer vision to natural language processing. At the same time, Brain-Computer Interfaces (BCIs) have also gained momentum due to their potential to support individuals with complex disabilities. Among various BCI paradigms, code-modulated Visual Evoked Potentials (c-VEPs) remain relatively understudied, despite offering high information transfer rates and large selection target capacities. However, c-VEP systems require lengthy calibration sessions, significantly limiting their practicality, particularly outside of laboratory settings. In this study, we use a FM for the first time to eliminate the need for lengthy calibration in c-VEP BCI systems. We evaluated two approaches: (1) a truly calibration-free approach requiring no subject-specific data, and (2) a limited calibration approach, where we assessed the benefit of incorporating incremental amounts of calibration data. In both cases, a classification head is trained on data from other subjects. For a new subject, no calibration data is required in the calibration-free setup, making the c-VEP system effectively plug-and-play. The proposed method was tested on two c-VEP datasets. For the calibration-free approach, the average accuracy on the first dataset (n = 17) was 68.8% ± 17.6%, comparable to the full-calibration performance reported in the original study (66.2% ± 13.8%), which required approximately 11 minutes of calibration. On the second dataset (n = 12), the calibration-free accuracy was 71.8% ± 20.2%, versus 93.7% ± 5.5% from the original study, which required around 3.5 minutes. A limited-calibration approach using only 20% of the subject's data (approximately 43 seconds) yielded 92% ± 5.2% accuracy. These results indicate that our FM-based approach can effectively eliminate or significantly reduce the need for lengthy calibration in c-VEP BCIs.
|
|
16:45-17:00, Paper Tu-S3-T7.4 | |
Toward Inclusive Powered Mobility: A Novel Protocol Utilizing a Hybrid EOG-EEG BCI in a VR-Based Wheelchair Driving Simulator |
|
Marcaccini, Kevin | Department of Electrical, Electronic, and Information Engineerin |
Pulvirenti, Francesca Rita | IRCCS Istituto Delle Scienze Neurologiche Di Bologna, Bologna It |
Pierotti, Francesco | Department of Electrical, Electronic, and Information Engineerin |
Arcobelli, Valerio Antonio | Department of Electrical, Electronic, and Information Engineerin |
Tonin, Luca | University of Padova |
Tortora, Stefano | Intelligent Autonomous System Lab, Department of Information Eng |
Groppi, Annalisa | With DATeR UA Ospedale Maggiore, Bologna Italy |
Giorgi, Federica | IRCCS Istituto Delle Scienze Neurologiche Di Bologna, Bologna It |
Dellarole, Laura | DATeR UA Ospedale Maggiore, Bologna Italy |
Chiari, Lorenzo | Department of Electrical, Electronic, and Information Engineerin |
Cersosimo, Antonella | IRCCS Istituto Delle Scienze Neurologiche Di Bologna, Bologna It |
Orlandi, Silvia | University of Bologna |
Keywords: Brain-Computer Interfaces, Assistive Technology, Virtual/Augmented/Mixed Reality
Abstract: Safe and effective powered wheelchair (PW) use requires training, which is typically conducted in clinical settings with specific clinical evaluation methods, such as the Powered Mobility Program (PMP). However, these methods lack objective performance assessment. Virtual Reality (VR) has emerged as a promising solution for safe, home-based training, also reducing the high costs for healthcare systems. Still, most existing VR-based solutions often rely on joystick or hand-tracking controls, making them inaccessible to individuals with severe upper limb impairments. To address this gap, we developed VR-PMP, a VR-based wheelchair driving simulator integrated with a hybrid Brain-Computer Interface (BCI) that combines electroencephalography (EEG) and electrooculography (EOG) signals for control. This system targets individuals with severe motor impairments but preserved cognitive function who could benefit from alternative, non-manual control methods. In this proof-of-concept study, we aimed to evaluate the feasibility of such a BCI system in a group of six adults without disabilities. Results showed that participants successfully navigated the virtual environment using the EOG-EEG BCI, achieving up to 100% accuracy in the classifier evaluation phase. These findings highlight the potential of our system as an innovative, alternative access solution for powered wheelchair mobility, paving the way for future clinical applications.
|
|
17:00-17:15, Paper Tu-S3-T7.5 | |
Aligning Humans and Robots Via Reinforcement Learning from Implicit Human Feedback |
|
Kim, Suzie | Korea University |
Shin, Hye-Bin | Korea University |
Lee, Seong-Whan | Korea University |
Keywords: Brain-Computer Interfaces, Human-Machine Interaction, Assistive Technology
Abstract: Conventional reinforcement learning (RL) approaches often struggle to learn effective policies under sparse reward conditions, necessitating the manual design of complex, task-specific reward functions. To address this limitation, reinforcement learning from human feedback (RLHF) has emerged as a promising strategy that complements hand-crafted rewards with human-derived evaluation signals. However, most existing RLHF methods depend on explicit feedback mechanisms such as button presses or preference labels, which disrupt the natural interaction process and impose a substantial cognitive load on the user. We propose a novel reinforcement learning from implicit human feedback (RLIHF) framework that utilizes non-invasive electroencephalography (EEG) signals, specifically error-related potentials (ErrPs), to provide continuous, implicit feedback without requiring explicit user intervention. The proposed method adopts a pre-trained decoder to transform raw EEG signals into probabilistic reward components, enabling effective policy learning even in the presence of sparse external rewards. We evaluate our approach in a simulation environment built on the MuJoCo physics engine, using a Kinova Gen2 robotic arm to perform a complex pick-and-place task that requires avoiding obstacles while manipulating target objects. The results show that agents trained with decoded EEG feedback achieve performance comparable to those trained with dense, manually designed rewards. These findings validate the potential of using implicit neural feedback for scalable and human-aligned reinforcement learning in interactive robotics.
|
|
17:15-17:30, Paper Tu-S3-T7.6 | |
EEG-Based Motor Task Classification Using Complex Short-Time Fourier and Hilbert Transform Features |
|
Mohan, Anand | IIT Roorkee |
Chopra, Ojaswi | IIT ROORKEE |
Singh, Ruchi | IIT ROORKEE |
Simpi, Pradnya Praveen | IIT ROORKEE |
Srivastava, Pratham Kumar | IIT ROORKEE |
Anand, Rs | Indian Institute of Technology Roorkee |
Keywords: Brain-Computer Interfaces, Human-Machine Interaction, Human-Computer Interaction
Abstract: Neurological disorders like stroke, epilepsy, and dementia can severely impair motor function, limiting independence and quality of life. Traditional assistive technologies, such as prosthetic limbs or voice-controlled systems, often require invasive procedures, expensive equipment, or physical effort. These may not be feasible for many individuals, highlighting the need for AI-driven EEG analysis for risk factor prediction, diagnosis, and rehabilitation. Brain-Computer Interfaces (BCIs) powered by EEG-based movement classification offer a non-invasive, affordable, and accessible alternative, enabling individuals to control devices using their thoughts alone. This study aims to develop a robust EEG-based system capable of classifying different types of hand movements, both real and imagined, using machine learning. By improving classification accuracy and reducing noise in EEG signals, this research contributes toward making hands-free interaction with technology a reality. This study will help in empowering people with disabilities and bridging the accessibility gap in assistive technologies. Feature extraction is performed using the proposed Complex STFT with Hilbert Transform (CSTFT-H) method. Machine learning models, including CNN, SVM, DT, RF, and MLP, are utilized to analyze EEG signals and classify them into motor imagery and motor movement. Hyperparameter tuning and k-fold validation are applied to optimize model performance. The proposed model achieved 86% accuracy for classifying all four motor activity tasks across all subjects. This would greatly benefit individuals with physical disabilities by successfully interpreting motor-related EEG data in real time.
|
|
17:30-17:45, Paper Tu-S3-T7.7 | |
RMAformer: A Transformer-Based Architecture for fNIRS Motion Artifact Removal Fusing Local and Long-Range Dependency |
|
Liu, Haoran | Beihang University |
Wu, Di | Beihang University |
Yang, Mingxi | Beihang University |
Wang, Daifa | Beihang University |
Keywords: Brain-Computer Interfaces, Medical Informatics, Human-Machine Interaction
Abstract: Functional near-infrared spectroscopy (fNIRS) is a promising noninvasive neuroimaging modality widely used in brain-computer interface (BCI) systems, but its reliability is often compromised by motion artifact (MA). Numerous methods have been proposed to mitigate MA contamination. However, traditional methods often struggle with complex and dynamic MA, while existing deep learning (DL)-based approaches primarily focus on local feature extraction, inadequately capturing global temporal structures, leading to suboptimal MA removal. To address these limitations, we propose RMAformer, a novel architecture designed for effective MA suppression and high-fidelity fNIRS signal preservation. Inspired by U-Net, RMAformer employs an encoder-decoder framework that integrates Transformer modules with hierarchical multi-head self-attention to model long-range temporal dependencies across multiple scales. To complement this global modeling, we develop a lightweight convolution-enhanced feed-forward module, combining depthwise separable convolutions with linear projections to efficiently capture fine-grained temporal patterns. Additionally, we introduce a customized loss function that combines mean squared error with a temporal continuity constraint, ensuring enhanced numerical accuracy and temporal smoothness in denoised signals. Extensive experiments on simulated and real task datasets demonstrate that RMAformer significantly outperforms state-of-the-art methods in MA suppression while preserving the fidelity and temporal consistency of fNIRS.
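The combined loss mentioned in the abstract (mean squared error plus a temporal continuity constraint) can be illustrated with a minimal sketch. The exact form of the continuity term is not given in the abstract; the version below assumes it penalizes mismatches between the first differences of the denoised and reference signals, and the name `denoising_loss` and the weight `smooth_weight` are hypothetical.

```python
def denoising_loss(pred, target, smooth_weight=0.1):
    """MSE plus an assumed continuity penalty on first differences.

    pred, target: equal-length sequences of floats (one fNIRS channel).
    smooth_weight: hypothetical weight balancing the two terms.
    """
    n = len(pred)
    # Pointwise numerical accuracy.
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / n
    # Sample-to-sample changes of each signal.
    d_pred = [pred[i + 1] - pred[i] for i in range(n - 1)]
    d_target = [target[i + 1] - target[i] for i in range(n - 1)]
    # Penalize denoised signals whose temporal evolution departs from the reference.
    continuity = sum((a - b) ** 2 for a, b in zip(d_pred, d_target)) / (n - 1)
    return mse + smooth_weight * continuity


clean = [0.0, 1.0, 2.0, 3.0]
shifted = [x + 1.0 for x in clean]  # constant offset: only the MSE term is non-zero
loss_same = denoising_loss(clean, clean)
loss_shifted = denoising_loss(shifted, clean)
```

With a constant offset the first differences match, so only the MSE term contributes; a spike-like artifact left in the output would raise both terms.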
|
|
17:45-18:00, Paper Tu-S3-T7.8 | |
FESNet: A Fine-Grained EMG Segmentation Network for Enhanced Finger Movement Analysis |
|
Li, Dian | The University of Electro-Communications |
Chen, Peiji | The University of Electro-Communications |
Togo, Shunta | Univ. Electro-Communications |
Yokoi, Hiroshi | The University of Electro-Communications |
Jiang, Yinlai | The University of Electro-Communications |
Keywords: Human-Machine Interface, Brain-Computer Interfaces
Abstract: The analysis of electromyographic (EMG) signals is crucial for advancing human-machine interaction. Despite recent progress, most methods still approach gesture intention prediction as a single classification task, which overlooks the complex temporal dynamics and channel-specific variations present in EMG signals. To address these shortcomings, we propose FESNet (Fine-grained Electromyography Segmentation Network), a novel segmentation-based network that temporally segments EMG signals, enabling a more fine-grained analysis of finger movements. Our approach utilizes a robust backbone network for feature extraction, followed by a functional head that adapts to different granularities or task objectives (classification or segmentation). We evaluate our method on the Ninapro DB8 dataset, where FESNet outperforms previous models, demonstrating its superior performance. The source code is publicly available at: https://github.com/Dianli97/FESNet
|
|
Tu-S3-T8 |
Room 0.32 |
Computational Intelligence |
Regular Papers - Cybernetics |
Chair: Rodríguez, Ismael | Universidad Complutense De Madrid |
Co-Chair: Mousavirad, Seyed Jalaleddin | Mid Sweden University |
|
16:00-16:15, Paper Tu-S3-T8.1 | |
CLIP-Guided Fusion of Image and Leaf Dictionary Features for Plant Disease Classification |
|
Imeraj, Gent | Hosei University |
Iyatomi, Hitoshi | Hosei University |
Keywords: Computational Intelligence in Information, Image Processing and Pattern Recognition, Deep Learning
Abstract: Plant disease classification from leaf images is often hindered by domain shift and a lack of interpretability. We propose a multimodal framework that integrates visual features with structured semantic information through image-specific alignment. Using LongCLIP (ViT-B/16), our system extracts image embeddings and aligns them with natural language descriptions derived from a self-curated, CLIP-compatible leaf feature dictionary. This dictionary includes 16 botanical categories that capture traits such as leaf margin, venation pattern, texture, and lesion color. CLIP-based similarity scoring selects the most relevant features from each category, which are then synthesized into modular natural language descriptions and converted into binary semantic vectors. These vectors are fused with image embeddings in a dual-encoder architecture, enhancing both classification performance and interpretability. Our method outperforms image-only baselines across four crop datasets—tomato, cucumber, eggplant, and strawberry—achieving average macro-F1 improvements of +10.89 points on validation and +12.52 points on test data. The approach generalizes effectively to unseen farms, linking predictions to biologically meaningful features and providing a modular, extensible framework for explainable AI in plant health monitoring and beyond.
|
|
16:15-16:30, Paper Tu-S3-T8.2 | |
Explainable AI (XAI) for Spectral Analysis Via Reinforcement Learning: Learning to Optimize |
|
Chu, Anqi | Karlsruhe Institute of Technology |
Xie, Xiang | Karlsruhe Institute of Technology |
Jin, Muen | Karlsruhe Institute of Technology |
Stork, Wilhelm | Karlsruhe Institute of Technology (KIT) |
Keywords: Computational Intelligence, Application of Artificial Intelligence, Neural Networks and their Applications
Abstract: Spectral analysis is extensively utilized in various industrial and academic domains to examine the characteristics of unknown samples. On one hand, classic methods usually construct complex physical models to mathematically formulate the problem and then exploit suitable optimizers to iteratively converge to a solution. While effective, these approaches often suffer from long computational time and convergence problems at high dimensions. On the other hand, artificial intelligence (AI), particularly deep learning-based methods, has recently been proposed to address such issues and proven to be fast, efficient, and accurate. However, the black-box nature of neural networks often raises concerns among domain experts regarding the interpretability and trustworthiness of the results. To overcome these challenges, we propose a reinforcement learning-based framework as an explainable AI (XAI) approach for spectral analysis. Instead of directly generating an end-to-end (E2E) solution from input data using neural networks, we train an agent to serve as an intelligent optimizer. During the optimization iterations, the agent observes the environment and makes sequential decisions to update parameters toward convergence. This process reveals the intermediate steps and the reasoning behind the optimization, making it both explainable and verifiable. Experimental results on the benchmark dataset demonstrate that our method consistently achieves faster convergence than other popular classical optimizers, delivering accurate results with fewer iterations. These properties make the framework a promising tool for real-world deployment in industrial and scientific applications.
|
|
16:30-16:45, Paper Tu-S3-T8.3 | |
AADNet: A Human-Mind-Inspired Multi-Modal Framework for Object Concept Learning |
|
Tang, Chao | Peking University |
Chang, Xinhai | Peking University |
Keywords: Computational Intelligence, Artificial Life, Machine Vision
Abstract: Object concept learning, the task of defining objects through visual perception, has seen significant progress with neural networks. However, existing approaches often focus solely on classification while overlooking the process of relating objects to explicit features such as affordance, specific attributes, and geometry. To bridge this gap, we redefine object concept learning to include the detection of explicit features and their relationships to object names, which together form what we call "concepts". We propose AADNet, a human-inspired framework that incorporates modules for affordance detection, attributes analysis, and depth estimation. These explicit features are integrated using transformer-based methods and transformed into a specific vector that encapsulates all the essential information defining the object’s concept. This vector is then used for object classification. AADNet’s innovative structure emulates human learning processes, offering a more holistic approach to concept formation. Extensive experiments validate its effectiveness, and we plan to open-source the code, model, and data to benefit the community.
|
|
16:45-17:00, Paper Tu-S3-T8.4 | |
C2L-DE-Lite: A Lightweight Solution to Clustering Complexity in Differential Evolution for Neural Network Training |
|
Mousavirad, Seyed Jalaleddin | Mid Sweden University |
Schaefer, Gerald | Loughborough University |
Oliva, Diego | Universidad De Guadalajara |
O'Nils, Mattias | Mittuniversitetet |
Keywords: Computational Intelligence, Evolutionary Computation, Metaheuristic Algorithms
Abstract: Determining optimal weights and biases for neural networks is a critical task. While gradient-based methods are widely used for training, they are sensitive to initialisation and susceptible to local optima. Population-based metaheuristics, such as differential evolution (DE), can offer a reliable alternative. Recently, clustering-based DE approaches have been proposed to further improve this process. However, they suffer from increased complexity, particularly with growing network sizes, leading to longer computation times. In this paper, we introduce strategies to reduce the time complexity of clustering-based DE, including clustering in the objective space, a two-tier clustering period, and one-step k-means clustering. We select one of the recent training algorithms, C2L-DE, as a representative method to incorporate our proposed strategies, leading to a lightweight version, C2L-DE-Lite. We show that C2L-DE-Lite decreases the complexity from $O(\sqrt{N_{pop}} \cdot N_{pop} \cdot d \cdot I)$, where $N_{pop}$ is the population size, $d$ is the dimensionality, and $I$ is the number of iterations, to $O\left(\frac{N_{pop} \cdot \sqrt{N_{pop}}}{CP}\right)$, where $CP$ is the clustering period. This means that the complexity remains constant for increasing sizes of networks. Extensive experiments demonstrate that while significantly reducing time complexity, C2L-DE-Lite maintains similar performance levels.
|
|
17:00-17:15, Paper Tu-S3-T8.5 | |
To Lie or Not to Lie... in Negotiations under Egalitarian Social Welfare |
|
Aranda, Jonathan | Universidad Complutense |
Godoy, Aitor | Universidad Complutense |
Rodríguez, Ismael | Universidad Complutense De Madrid |
Rubio, Fernando | Universidad Complutense |
Keywords: Computational Intelligence, Evolutionary Computation, Soft Computing, Socio-Economic Cybernetics
Abstract: When a set of agents (whether human or artificial) must reach an agreement on a series of measures, it is necessary to establish which criteria must be optimized to find the best agreement. In particular, in egalitarian social welfare, the aim is to maximize the benefit of the agent who is most disadvantaged. In this way, the aim is to ensure that no agent is too dissatisfied with the agreement reached, so that the probability of breaking the agreement is lower. Unfortunately, it is not (computationally) straightforward to compute the best agreements under egalitarian social welfare. Moreover, agents may try to lie about their true preferences to try to fool the optimization algorithm. In this paper we demonstrate the computational complexity of the problem and propose strategies to discourage agents from lying. In particular, we consider the case of political parties that have to reach an agreement about a given set of laws. Genetic algorithms are used to evaluate the usefulness of different strategies from an experimental point of view.
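The egalitarian criterion described in the abstract (maximize the benefit of the most disadvantaged agent) can be sketched as follows. This is an illustration under stated assumptions, not the authors' method: utilities are assumed additive over accepted laws, the numbers are invented, the function names are hypothetical, and exhaustive enumeration stands in for the genetic algorithms the paper uses.

```python
from itertools import product


def egalitarian_welfare(utilities, agreement):
    """Utility of the worst-off agent under an agreement.

    utilities[a][i]: assumed additive utility agent a gets if law i is accepted.
    agreement: tuple of 0/1 flags, one per law.
    """
    return min(
        sum(u[i] for i, accepted in enumerate(agreement) if accepted)
        for u in utilities
    )


def best_agreement(utilities):
    """Brute-force search over all 2^n agreements (only viable for small n)."""
    n_laws = len(utilities[0])
    return max(
        product((0, 1), repeat=n_laws),
        key=lambda a: egalitarian_welfare(utilities, a),
    )


# Three hypothetical parties, three hypothetical laws.
utilities = [
    [3, -1, 2],
    [-2, 4, 1],
    [1, 1, -3],
]
best = best_agreement(utilities)
worst_off = egalitarian_welfare(utilities, best)
```

In this toy instance, accepting the first two laws gives every party a utility of 2, which is the best achievable minimum. The search space doubles with every additional law, which is one reason heuristic search is attractive for larger instances.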
|
|
17:15-17:30, Paper Tu-S3-T8.6 | |
Improved VMD Based Remote Heartbeat Estimation Utilizing 60GHz mmWave Radar |
|
Gu, Boyuan | Glasgow College, University of Electronic Science and Technology |
Yang, Yanhui | Glasgow College, University of Electronic Science and Technology |
You, Siyu | University of Electronic Science and Technology of China |
Sun, Haiyang | University of Electronic Science and Technology of China |
Sun, Jiahui | University of Electronic Science and Technology of China |
Guo, Shisheng | University of Electronic Science and Technology of China |
Keywords: Computational Intelligence, Metaheuristic Algorithms, Image Processing and Pattern Recognition
Abstract: This study introduces an improved signal decomposition methodology for non-contact heartbeat estimation using millimeter-wave (mmWave) radar. With the increasing demand for non-invasive and continuous monitoring of vital signs, mmWave radar technology has become a promising alternative to traditional contact-based methods, such as electrocardiogram (ECG), due to its high sensitivity, robust penetration, and adaptability to diverse environments. Specifically, we first analyze the signal of the mmWave radar system to derive a model of cardiac signal extraction based on the radar echo signal. Variational Mode Decomposition (VMD) integrated with a Newton-Raphson-based optimizer (NRBO) algorithm is then utilized for the accurate reconstruction of the cardiac mechanic signal (CMS). The VMD method decomposes the signal into its intrinsic mode functions (IMFs), while the NRBO dynamically optimizes the decomposition parameters, including the penalty factor (alpha), to enhance the precision of the heartbeat estimation. The effectiveness and robustness of the proposed model are validated on an experimental dataset of 18 subjects, and the model shows significant improvements over three baselines in terms of accuracy and reliability of heartbeat detection.
|
|
17:30-17:45, Paper Tu-S3-T8.7 | |
Fed-SHA: An Efficient Hyperparameter Optimization Approach for Federated Learning |
|
Zheng, Yubin | Shanghai Jiao Tong University |
Tang, Peng | Shanghai Jiao Tong University |
Zhang, Xiheng | Shanghai Jiao Tong University |
Hong, Yijie | Shanghai Jiao Tong University |
Qiu, Weidong | Shanghai Jiao Tong University |
Keywords: Computational Intelligence, Optimization and Self-Organization Approaches, Machine Learning
Abstract: Federated Learning (FL) enables collaborative machine learning without exposing raw data, effectively addressing privacy concerns. However, FL still faces major challenges, particularly in communication efficiency. Among them, federated hyperparameter optimization emerges as a critical yet underexplored problem. Properly initialized hyperparameters play a vital role in accelerating convergence and significantly reducing communication overhead in FL settings. To address this, we propose a communication-efficient algorithm, Fed-SHA, based on continuous halving strategies. This work systematically analyzes the unique challenges of hyperparameter optimization in FL and introduces an alternative optimization objective for client-side tuning, which can be solved independently of federated training. Inspired by multi-fidelity optimization, Fed-SHA combines local search with global halving to improve efficiency. The algorithm leverages local computation to explore hyperparameter configurations and uses a central server to approximate federated loss and progressively reduce the search space. The experimental results show that Fed-SHA significantly reduces communication rounds and costs, while achieving better performance than existing baseline methods.
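The halving idea behind the abstract can be illustrated with a generic, non-federated successive-halving loop: evaluate all candidate configurations cheaply, keep the better fraction, and re-evaluate the survivors with a larger budget. This is a sketch under assumptions, not Fed-SHA itself; the function names, the `eta` reduction factor, and the toy loss are hypothetical, and the client-side search and server-side loss approximation described in the abstract are omitted.

```python
def successive_halving(configs, evaluate, min_budget=1, eta=2):
    """Generic successive halving over a finite set of configurations.

    evaluate(config, budget) returns a loss (lower is better) measured with
    the given budget, e.g. a number of training rounds.
    """
    survivors = list(configs)
    budget = min_budget
    while len(survivors) > 1:
        # Rank the current candidates at the current fidelity.
        ranked = sorted(survivors, key=lambda c: evaluate(c, budget))
        # Keep the best 1/eta fraction, at least one configuration.
        survivors = ranked[: max(1, len(survivors) // eta)]
        # Spend more budget on the fewer remaining candidates.
        budget *= eta
    return survivors[0]


def toy_loss(learning_rate, budget):
    """Hypothetical loss: best at learning_rate = 0.1, less noisy with more budget."""
    return abs(learning_rate - 0.1) + 1.0 / budget


best_lr = successive_halving([0.001, 0.01, 0.05, 0.1, 0.3, 1.0], toy_loss)
```

Each round discards the weaker half of the candidates, so most of the evaluation budget is spent on the few configurations that still look promising.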
|
|
17:45-18:00, Paper Tu-S3-T8.8 | |
IPPA: Information-Guided Pheromone Puzzle Algorithm for CTSP |
|
Li, Guo | Civil Aviation University of China |
Peng, Xiong | Civil Aviation University of China |
Liu, Lili | Civil Aviation University of China |
Ding, Jianli | Civil Aviation University of China |
Li, Jing | Civil Aviation University of China |
Keywords: Computational Intelligence, Soft Computing, Socio-Economic Cybernetics, Swarm Intelligence
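Abstract: This study proposes an Information-Guided Pheromone Puzzle Algorithm (IPPA) for solving the Traveling Salesman Problem (TSP) with enhanced scalability, convergence speed, and solution quality. The algorithm integrates spectral clustering, dynamic pheromone adjustment, and a sliding-window 2-opt local optimization strategy to improve both global search capability and local refinement. To evaluate its performance, extensive experiments were conducted on eight standard TSP benchmark datasets, covering city scales from 50 to 700. Comparative experiments were conducted against the classical genetic algorithm (GA), CTSP (GA with AP), and five representative ant colony optimization (ACO) algorithms. The results show that IPPA consistently achieves superior path quality and faster convergence. Across all benchmark datasets, IPPA shortens the tour length by 0.18% to 3.70% compared with the best-performing baseline algorithm, while maintaining a lower computational cost. These results validate the effectiveness and efficiency of IPPA in solving both small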
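The 2-opt local optimization named in the abstract can be illustrated with a plain textbook version: repeatedly reverse a segment of the tour whenever doing so shortens it. This sketch is an assumption-laden illustration, not IPPA; it omits the sliding window, the spectral clustering, and the pheromone update, and the function names and the four-city instance are hypothetical.

```python
import math


def tour_length(tour, points):
    """Length of the closed tour visiting points in the given order."""
    return sum(
        math.dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )


def two_opt(tour, points):
    """Plain 2-opt: accept any segment reversal that shortens the tour."""
    best = list(tour)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                # Reverse the segment best[i..j], keeping the start city fixed.
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if tour_length(candidate, points) < tour_length(best, points) - 1e-12:
                    best = candidate
                    improved = True
    return best


# Four corners of a unit square, visited in a self-crossing order.
points = [(0, 0), (1, 1), (1, 0), (0, 1)]
start = [0, 1, 2, 3]
result = two_opt(start, points)
```

On this instance the crossing tour has length 2 + 2√2, and removing the crossing yields the square's perimeter of 4. A sliding-window variant would restrict `j` to a bounded distance from `i` to cut the cost on large tours.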
|
|
Tu-S3-T9 |
Room 0.51 |
Quality and Reliability Engineering & Smart Sensor Networks |
Regular Papers - SSE |
Chair: Hassan, Mohammad Mehedi | King Saud University |
Co-Chair: Santana, Marcelo | Universidade Federal De Pernambuco, Centro De Informática |
|
16:00-16:15, Paper Tu-S3-T9.1 | |
Leveraging Industrial Automation Boundaries and Regulation for Scope Reduction in Software Validation |
|
Wang, Yizhi | Technical University of Munich, Institute of Automation and Info |
Vogel-Heuser, Birgit | Technical University of Munich |
Wilch, Jan | Technical University of Munich |
Wagner, Cedric | Technical University of Munich |
Bremer, Andreas | Karlsruhe Institute of Technology, Institute of Information Secu |
Weigl, Alexander | Karlsruhe Institute of Technology |
Beckert, Bernhard | Karlsruhe Institute of Technology, Institute of Information Secu |
Keywords: Manufacturing Automation and Systems, Quality and Reliability Engineering, Cyber-physical systems
Abstract: Automated Production Systems (aPS) in regulated industries such as pharmaceuticals, MedTech, or food and beverages must comply with the stringent validation and documentation requirements of Good Manufacturing Practice (GMP) regulations within the European Union (EU). These obligations create significant burdens for aPS manufacturers, particularly when changes require revalidation through manual integration and system testing. But these prerequisites also enable opportunities for lower-effort software verification, leveraging GMP documentation and boundaries of the automation domain to prove, e.g., that an implemented software change realizes the change specification without side effects. This paper proposes a methodical workflow for deriving software slices suitable for formal verification, while being aligned with automation engineering practices and GMP requirements. It defines assumptions for slice utility based on system modularity, interface expressiveness, and domain boundaries. The approach is validated through a real-world GMP-regulated change from a German MedTech aPS manufacturer, following GAMP 5 guidelines, demonstrating the utility for GMP-regulated aPS engineering and automatic verification of a sliced program segment.
|
|
16:15-16:30, Paper Tu-S3-T9.2 | |
Stochastic Battery Degradation Modeling Considering Cell-To-Cell Current Interactions |
|
Santana, Marcelo | Universidade Federal De Pernambuco, Centro De Informática |
Silva, Jonatas | Universidade Federal De Pernambuco |
Almeida, Vinícius | Centro De Informática, Universidade Federal De Pernambuco |
Maciel, Paulo | UFPE |
Keywords: Quality and Reliability Engineering, Discrete Event Systems, Consumer and Industrial Applications
Abstract: This paper introduces a stochastic model to characterize battery degradation using Stochastic Petri Nets (SPN). The model analyzes battery degradation under varying discharge currents, architectural configurations, state of health (SoH) thresholds, K-out-of-N (KooN) architectures, and cell imbalances. It assesses reliability based on individual cell data, distinguishing it from traditional methods by accounting for cell failure influence. A NASA dataset was used to study battery degradation trends under various imbalance levels. Probability distributions for the SoH decay process were incorporated into the models, and simulations were conducted to observe battery behavior across cell imbalance ranges from 5% to 80% of the SoH amplitude. Results confirm that the model effectively reflects degradation behavior related to battery reliability. Findings indicate the parallel-series (PS) configuration achieves up to 18.5% higher mean cycles to failure (MCTF) than series-parallel (SP) under significant imbalance, with modest advantages (around 3%) under balanced conditions. This shows the model accurately assesses reliability and can be adapted for battery packs.
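For orientation, the classical K-out-of-N (KooN) reliability that the abstract's architectures build on can be written in closed form when cells are independent and identical. That independence assumption is exactly what the paper's Stochastic Petri Net model relaxes (cell imbalance and failure influence), so the sketch below is a textbook baseline rather than the authors' model; the function name is hypothetical.

```python
from math import comb


def koon_reliability(k, n, p):
    """Probability that at least k of n independent cells are working.

    p: probability that a single cell is still above its SoH threshold.
    Assumes independent, identical cells (a simplification).
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


series = koon_reliability(3, 3, 0.9)       # all cells required
parallel = koon_reliability(1, 3, 0.9)     # any single cell suffices
two_of_three = koon_reliability(2, 3, 0.9)
```

The two extremes recover the series system (p^n) and the parallel system (1 - (1 - p)^n); intermediate k values interpolate between them.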
|
|
16:30-16:45, Paper Tu-S3-T9.3 | |
Priority-Aware Task Offloading for Latency and Energy Minimization in Healthcare IoT Systems |
|
Hasan, Md. Jamil | Green University of Bangladesh |
Rahman, Md. Hasibur | Green University of Bangladesh |
Hossen, Md Sajjad | Green University of Bangladesh |
Roy, Palash | University of Dhaka |
Razzaque, Md. Abdur | University of Dhaka |
Fortino, Giancarlo | University of Calabria |
Gravina, Raffaele | University of Calabria |
Hassan, Mohammad Mehedi | King Saud University |
Keywords: Quality and Reliability Engineering, Smart Sensor Networks, Distributed Intelligent Systems
Abstract: The Internet of Medical Things (IoMT) has emerged as a transformative technology platform in the healthcare sector, enabling real-time monitoring and intelligent decision-making through connected devices. However, prioritizing and offloading the massive volume of computational tasks generated by IoMT devices while minimizing latency and energy consumption poses significant challenges. Existing approaches often overlook dynamic real-time factors such as task urgency and data freshness, as well as the integration of local task processing via Device-to-Device (D2D) communication with offloading to Mobile Edge Computing (MEC) servers. In this paper, we develop a priority- and Age of Information (AoI)-aware task offloading framework for latency and energy optimization in healthcare IoT systems, namely PRALEIT, exploiting a Mixed Integer Linear Programming (MILP) formulation. The developed PRALEIT system introduces probabilistic classification of IoMT tasks based on vital signs and AoI value by leveraging a Bayesian classifier. The experimental results show that the PRALEIT system significantly reduces task execution delay and energy consumption compared to state-of-the-art models, ensuring reliable and sustainable healthcare services.
|
|
16:45-17:00, Paper Tu-S3-T9.4 | |
Optimal Fractional Order Kalman Consensus Filter Based on Historical Information |
|
Chen, Rui | Southeast University |
Zhang, Chengjia | Southeast University |
Jia, Chenxi | Southeast University |
Li, Chengkun | Southeast University |
Wei, Yiheng | Southeast University |
Keywords: Smart Sensor Networks, Cyber-physical systems, Distributed Intelligent Systems
Abstract: This paper presents an optimal fractional order Kalman consensus filter (OFOKCF) for distributed state estimation in fractional order systems. By leveraging historical state information, the algorithm captures long-term dynamics and models the system more precisely. A consensus mechanism ensures node consistency, while covariance intersection simplifies cross-correlation computations, using trace-based adaptive weights for multi-dimensional states. The simulation validates the effectiveness and accuracy of OFOKCF. The approach offers a scalable solution for distributed estimation in sensor networks and cooperative control, particularly for fractional order systems.
|
|
17:00-17:15, Paper Tu-S3-T9.5 | |
Estimating Significant Wave Height from IMU Data Using Transformer-Based Time-Series Regression Model |
|
Akiyama, Takeru | Tokyo Denki University |
Shinozuka, Ryohei | Tokyo Denki University |
Suzuki, Kaira | Tokyo Denki University |
Fujikawa, Taro | Tokyo Denki University |
Nakamura, Akio | Tokyo Denki University |
Keywords: Smart Sensor Networks, Consumer and Industrial Applications, Cyber-physical systems
Abstract: In this study, we propose a method for estimating ocean wave height using data acquired from an inertial measurement unit (IMU) mounted on a free-drifting buoy deployed at sea. The IMU captures three-axis acceleration and three-axis angular velocity, which are used as input for a time-series regression model based on the Transformer architecture. During training, ground truth wave height is derived by converting absolute altitude variations—measured via the Centimeter-Level Augmentation Service (CLAS), a high-precision positioning service of the Quasi-Zenith Satellite System, part of the Global Navigation Satellite System—into relative wave heights. The model is trained using both the converted wave height data and the corresponding IMU measurements. After training, the model estimates wave heights from new IMU input data, and its performance is evaluated by comparing the estimated values with the CLAS-derived ground truth. Results indicate that, when training and test data are obtained on the same measurement day, the maximum absolute error in significant wave height is 0.07 m. When training and test data are from different days, the maximum absolute error remains within 0.06 m. These results demonstrate that the proposed Transformer-based time-series regression model effectively estimates wave height from IMU data with high accuracy.
|
|
17:15-17:30, Paper Tu-S3-T9.6 | |
Multi-Stage Testing for Open Source IoT Frameworks |
|
Norbisrath, Ulrich | University of Tartu |
Rossi, Bruno | Masaryk University |
Jubeh, Ruben | OTH Regensburg |
Heydarov, Araz | University of Tartu |
Keywords: Smart Sensor Networks, Distributed Intelligent Systems, System Architecture
Abstract: The Internet of Things (IoT) has rapidly evolved, integrating networked intelligence into a web of things, servers, and cloudlets. Although there are various tools and approaches for software testing, the broader field of IoT testing presents unique challenges due to the heterogeneity of devices, large-scale deployments, dynamic environments, and real-time needs. This paper presents our approach to developing a multi-stage testing framework for the IoTempower framework, addressing the challenges of testing a versatile and evolving open-source Internet-of-Things framework used extensively in educational settings and beyond. The framework incorporates compilation testing, integration testing, and system testing with a focus on regression testing. This multi-stage testing approach allows us to validate the framework's functionality at various granularities, from the correct compilation of individual drivers to the seamless interaction of deployed hardware. This approach aims to proactively identify and prevent regressions, facilitating the integration of new features and enhancements without losing our scope of providing a real hands-on IoT experience in the classroom.
|
|
17:30-17:45, Paper Tu-S3-T9.7 | |
Robust Simultaneous UWB-Anchor Calibration and Robot Localization for Emergency Situations |
|
Liu, Xinghua | University of Groningen |
Cao, Ming | Yale University |
Keywords: Smart Sensor Networks, Robotic Systems, Cyber-physical systems
Abstract: In this work, we propose a factor graph optimization (FGO) framework to simultaneously solve the calibration problem for Ultra-WideBand (UWB) anchors and the robot localization problem. Calibrating UWB anchors manually can be time-consuming and even impossible in emergencies or in situations without special calibration tools. Therefore, automatic estimation of the anchor positions becomes a necessity. The proposed method enables the creation of a soft sensor providing the position information of the anchors in a UWB network. This soft sensor requires only UWB and LiDAR measurements collected from a moving robot. The proposed FGO framework is suitable for the calibration of an extendable large UWB network. Moreover, the anchor calibration problem and robot localization problem can be solved simultaneously, which saves time for UWB network deployment. The proposed framework also helps to avoid artificial errors in the UWB-anchor position estimation and improves the accuracy and robustness of the robot pose. The experimental results of the robot localization using LiDAR and a UWB network in a 3D environment are discussed, demonstrating the performance of the proposed method. More specifically, the anchor calibration problem with four anchors and the robot localization problem can be solved simultaneously and automatically within 30 seconds by the proposed framework. The supplementary video and codes can be accessed via https://github.com/LiuxhRobotAI/Simultaneous_calibration_localization.
|
|
17:45-18:00, Paper Tu-S3-T9.8 | |
Multi-Task Estimation of Tip Kinematics and External Force in a Continuum Robot Using Fused Proprioceptive Sensing |
|
Liang, Chendi | Beihang University |
Liu, Yanzhen | Beihang University |
Yibulayimu, Sutuke | Beihang University |
Yang, Qing | Qingdao Municipal Hospital |
Shi, Chao | Beihang University |
Wang, Yunning | Beihang University |
Wang, Yu | Beihang University |
Keywords: Soft Robotics, Robotic Systems, Smart Sensor Networks
Abstract: Kinematic modeling of soft continuum robots in constrained environments remains challenging due to their complex nonlinear dynamics. Data-driven processing of proprioceptive signals offers a promising pathway to enhance robot perception of both its own state and the external environment. This paper proposes a soft continuum robot structure with multiple integrated proprioceptive sensing modalities, including measurements of tendon tension and intersegmental pressure. A long short-term memory (LSTM) network is employed to jointly estimate multiple perception tasks, including tip position, orientation, and the magnitude and direction of external forces. Ablation studies demonstrate that fused proprioceptive inputs yield significantly higher accuracy in multi-task estimation than single-modality inputs. The proposed method provides a novel and effective approach for advancing perception and control in soft continuum robotics.
|
|
Tu-S3-T10 |
Room 0.90 |
Computational and Medical Cybernetics |
Special Sessions: Cyber |
Chair: Kisbenedek, Lilla | Obuda University |
Co-Chair: Drexler, Dániel András | Obuda University |
Organizer: Rudas, Imre | Obuda University |
Organizer: Kovacs, Levente | Obuda University |
Organizer: Eigner, György | Obuda University |
Organizer: Drexler, Dániel András | Obuda University |
Organizer: Kubota, Naoyuki | Tokyo Metropolitan University |
Organizer: Shi, Peng | University of Adelaide, Adelaide |
|
16:00-16:15, Paper Tu-S3-T10.1 | |
CD-Net: Context-Driven Ultrasound Image Enhancement, a 2.5D Approach for Scoliosis Assessment (I) |
|
Zhang, Chen | University of Technology Sydney |
Dana, Sumartini | University of Technology Sydney |
Jia, Wenjing | University of Technology Sydney |
Zheng, Yongping | The Hong Kong Polytechnic University |
Ling, Steve | University of Technology Sydney |
Keywords: Application of Artificial Intelligence, AI and Applications, Neural Networks and their Applications
Abstract: Artificial intelligence (AI) has advanced medical imaging, yet challenges persist in ultrasound-based scoliosis diagnosis due to image quality issues like speckle noise, low contrast, and inconsistent features. While ultrasound is safer and more accessible than X-rays, traditional single-slice enhancement approaches fail to capture crucial spatial relationships between adjacent slices, limiting the detection of spine-related features such as Thoracic Bony Features (TBF) and Lamellar Bone Features (LBF). To overcome these challenges, we introduce the Contextual-Driven Ultrasound Enhancement Network (CD-Net), which incorporates two innovative modules: the Contextual Cross-Attention Transfer (CCAT) module to capture inter-slice spatial relationships and the Localized Attention Contrast Enhancement (LoCE) module to selectively refine features. By employing a self-supervised CD-Net, we optimize pixel-wise accuracy while preserving contextual consistency. Experiments on a dataset of 109 patients show a remarkable improvement in the detection rate, rising from 78.25% to 93.18%, while achieving a high Structural Similarity Index (SSIM) of 89.2% (sigma = 0.045). The enhanced visibility of TBF and LBF and robust performance across varying image qualities demonstrate CD-Net's potential to significantly improve the reliability and efficiency of ultrasound-based scoliosis diagnosis in clinical settings.
|
|
16:15-16:30, Paper Tu-S3-T10.2 | |
Predicting Heart Failure Hospitalizations with LLMs from Health Insurance Data (I) |
|
Baro, Everton | Instituto Federal Do Paraná |
Oliveira, Luiz S. | UFPR |
Britto Jr, Alceu de Souza | State University of Ponta Grossa (UEPG) |
Keywords: Application of Artificial Intelligence, Machine Learning, Transfer Learning
Abstract: Heart failure (HF) represents a global clinical and economic challenge, with hospitalizations accounting for 65% of disease-related costs. This study proposes an approach to predict HF hospitalizations using Large Language Models (LLMs) trained on chronological data from Brazilian health insurance beneficiaries. By converting administrative records (consultations, medications, diagnoses) into temporal narratives, models like RoBERTa and Open-Cabrita3B were fine-tuned to identify clinical deterioration patterns. The HealthHistoryRoBERTa-pt model, trained with historical health insurance data and specifically adjusted for HF, achieved an AUC-ROC of 0.93-0.95 in prediction windows from 5 to 180 days, significantly outperforming other studies using static clinical data or basic demographics (AUC 0.63-0.76) and those combining clinical-administrative data (AUC 0.82). Notably, the model maintains an F1-score greater than 0.85 and a sensitivity of 0.87 in predictions of up to 150 days, indicating that administrative variables (e.g., history of hospitalizations, frequency of consultations) function as effective proxies for socioeconomic and behavioral factors that are traditionally neglected. Compared to other works, this study demonstrates that longitudinal health insurance data combined with NLP techniques capture non-linear risk trajectories, enabling precise predictions for strategic health planning.
|
|
16:30-16:45, Paper Tu-S3-T10.3 | |
Predicting Blood Glucose Trends with Deep Neural Networks: A Patient-Specific Approach (I) |
|
Simon, Barbara | Óbuda University |
Hartveg, Adam | Óbuda University |
Szasz, Laszlo | University Research and Innovation Center, Physiological Control |
Dénes-Fazakas, Lehel | Óbuda University |
Siket, Máté | Obuda University |
Eigner, György | Obuda University |
Kovacs, Levente | Obuda University |
Keywords: Deep Learning, Neural Networks and their Applications
Abstract: Diabetes mellitus is a chronic metabolic disorder requiring meticulous blood glucose regulation to minimize both acute complications and long-term vascular damage. Traditional glucose monitoring approaches—such as finger-prick tests and continuous glucose monitoring (CGM)—primarily support reactive interventions, often falling short in enabling proactive management. This study proposes a deep learning-based predictive framework for blood glucose level estimation using historical CGM data. The model's performance was evaluated using standard metrics including Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), and R-squared (R²) score. Experimental results across multiple patients reveal that the model achieved RMSE values ranging from 19.37 to 28.57, and MAPE values between 7.76 and 13.31. The highest predictive accuracy was observed for Patient 570 (RMSE: 20.38, MAPE: 7.76), while the model struggled with higher variability in Patient 559. These findings demonstrate the model's potential in delivering personalized, anticipatory glycemic control, thereby supporting more effective diabetes management strategies.
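The three evaluation metrics named in the abstract above can be written out directly. The following is a minimal sketch using their standard textbook definitions, not the authors' evaluation code; the function names and the sample glucose values are illustrative assumptions.

```python
import math

def rmse(y_true, y_pred):
    """Root Mean Squared Error, in the same units as the measurements."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent (requires non-zero targets)."""
    n = len(y_true)
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / n

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - residual SS / total SS."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical CGM readings and predictions (illustrative values only).
measured = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
print(rmse(measured, predicted), mape(measured, predicted), r2_score(measured, predicted))
```

RMSE penalizes large excursions more heavily than MAPE does, which is one reason the two are usually reported together.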
|
|
16:45-17:00, Paper Tu-S3-T10.4 | |
Development of Hardware-In-The-Loop Testing Framework for Artificial Pancreas Systems (I) |
|
Szasz, Laszlo | University Research and Innovation Center, Physiological Control |
Simon, Barbara | Óbuda University |
Dénes-Fazakas, Lehel | Óbuda University |
Siket, Máté | Obuda University |
Kovacs, Levente | Obuda University |
Eigner, György | Obuda University |
Keywords: Cybernetics for Informatics, Cloud, IoT, and Robotics Integration, Biometric Systems and Bioinformatics
Abstract: This paper introduces an integrated Hardware-in-the-Loop (HIL) testing framework, combining the UVA/Padova Type 1 Diabetes Simulator with AndroidAPS, an open-source artificial pancreas system. This integration forms a testing environment capable of evaluating insulin regulation algorithms under both virtual and real hardware conditions. The FDA-approved UVA/Padova Simulator models glucose-insulin dynamics and meal digestion. Paired with AndroidAPS, the system can actuate real-world insulin pumps to test insulin delivery control algorithms. The framework is tied together by various REST APIs and uses the Flask framework for efficient data exchange and system connectivity. The HIL approach provides a robust platform for functional and reliability testing of these algorithms. The developed APIs are open-source: https://github.com/OE-Diab/AP-HIL.
|
|
17:00-17:15, Paper Tu-S3-T10.5 | |
Multi-Stage Parameter Estimation for Nonlinear Mixed-Effects Modeling in a Tumor Growth Model (I) |
|
Puskás, Melánia | Obuda University |
Kisbenedek, Lilla | Obuda University |
Gombos, Balázs | Research Center for Natural Sciences |
Füredi, András | Research Center for Natural Sciences |
Kovacs, Levente | Obuda University |
Drexler, Dániel András | Obuda University |
Keywords: Computational Life Science, Heuristic Algorithms, Expert and Knowledge-Based Systems
Abstract: The future of healthcare increasingly depends on personalized treatments based on accurate modeling of patient-specific tumor dynamics. Tumor growth models play a key role in quantifying responses to chemotherapy. In preclinical studies, data collection is often constrained by ethical and practical limitations, resulting in sparse and heterogeneous measurements. Consequently, it is essential to extract the maximum possible information from the available data by focusing on model-consistent segments and minimizing the impact of measurement noise or biological variability that the model cannot capture. In this study, we estimate the parameters of an in vivo tumor model using nonlinear mixed-effects (NLME) modeling. A major challenge is that the mathematical model cannot describe resistant tumor phases, and NLME assumes inter-individual similarity, which may not hold in heterogeneous populations. Additionally, NLME fitting is sensitive to initial values, complicating the distinction between poor fit and poor initialization. In order to address these issues, we developed a multi-stage estimation workflow. We begin with a global NLME fit, followed by automatic exclusion of resistant segments and re-estimation on the trimmed data. Least squares fits are used to classify individuals into well- and poorly-fitting subgroups. Each group is then refitted using NLME with feedback-based parameter refinement. This approach improves estimation robustness and enables model-driven stratification that may reflect underlying biological heterogeneity.
|
|
17:15-17:30, Paper Tu-S3-T10.6 | |
Tumor Growth Model Fitting to in Vitro Measurements Using Markov Chain Monte Carlo Method (I) |
|
Dömény, Martin Ferenc | Obuda University |
Gergics, Borbála | Obuda University |
Füredi, András | Research Center for Natural Sciences |
Kovacs, Levente | Obuda University |
Drexler, Dániel András | Obuda University |
Keywords: Computational Life Science, Computational Intelligence, Biometric Systems and Bioinformatics
Abstract: In order to improve parameter identifiability in mathematical tumor models, we propose a Bayesian framework using Markov Chain Monte Carlo (MCMC) methods to fit pharmacodynamic parameters to in vitro tumor spheroid data. We conducted cytotoxicity experiments on Brca1-deficient murine mammary tumor cells, measured tumor volume changes via time-lapse fluorescence microscopy, and calibrated a modified tumor growth model to the resulting data using the No-U-Turn Sampler (NUTS) algorithm. Our results demonstrate that MCMC-based inference yields biologically meaningful posterior distributions, even under data uncertainty. We observe trends in parameter estimates across multiple drug concentrations and identify cases where parameter identifiability is limited. This approach offers a robust framework for model calibration and uncertainty quantification, which supports future efforts in therapy optimization based on in vitro experiments.
|
|
17:30-17:45, Paper Tu-S3-T10.7 | |
A Hierarchical Topological Approach for Extracting Motion Features in Patients with Unilateral Spatial Neglect (I) |
|
Obo, Takenori | Tokyo Metropolitan University |
Matsuda, Tadamitsu | Juntendo University |
Takasue, Naoyuki | Tokyo Metropolitan University |
Kubota, Naoyuki | Tokyo Metropolitan University |
Keywords: Computational Intelligence
Abstract: Extended Reality, artificial intelligence, and big data technologies offer new opportunities for advancing rehabilitation diagnosis and training. This study presents a method for extracting behavioral features from a visual search task conducted in an immersive virtual reality environment. To identify motion patterns specific to individual patients, we employ a topological mapping approach based on Growing Neural Gas (GNG), which adapts its structure dynamically using node activation and error-based edge management. While GNG effectively captures spatial characteristics, it lacks the ability to model temporal relationships and is sensitive to hyperparameter settings. To address these limitations, we introduce a spatiotemporal topological clustering method, along with a hierarchical framework that enables segmentation at multiple levels of granularity. Furthermore, to evaluate feasibility, we conducted a visual search task with three patients, including one with unilateral spatial neglect (USN), and performed a comparative analysis of their extracted motion features.
|
|
17:45-18:00, Paper Tu-S3-T10.8 | |
Learning Protein-Ligand Binding Affinities through an Uncertainty-Aware Attention Mechanism |
|
Wang, Dan | Hong Kong Metropolitan University |
Zhang, Tianlun | Southern University of Science and Technology |
Wang, Xizhao | Shenzhen University |
Keywords: Biometric Systems and Bioinformatics, Computational Life Science
Abstract: Predicting protein-ligand binding affinity at the structural level is a crucial but challenging problem in computational drug discovery (CDD). Although such work has benefited greatly from advances in machine learning and deep learning techniques, developing an approach that predicts well while remaining interpretable still needs further exploration. In this study, we propose a binding affinity prediction model (UAAM), which leverages the concept of the thermodynamic cycle and can be interpreted from the perspective of energy. Moreover, it adopts an attention mechanism to investigate important atom pairs in each protein-ligand binding structure and quantifies the uncertainty levels of those attention scores using entropy. A loss function with a regularization term on the uncertainty levels is employed in UAAM for better extraction of important atom pairs. Compared with baseline approaches on a benchmark database, the proposed UAAM model performed well and was able to generate valuable knowledge at the atom-pair level. This work will contribute concretely to the sustainable development of the CDD community.
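One common way to quantify the uncertainty of attention scores with entropy, as the abstract above describes, is to normalize the scores with a softmax and take the Shannon entropy of the result. The sketch below assumes that formulation; it is not the UAAM implementation, and the function names and scores are hypothetical.

```python
import math

def softmax(scores):
    """Turn raw attention scores into weights that sum to 1."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_entropy(weights):
    """Shannon entropy (in nats) of a normalized attention distribution.

    High entropy: attention is spread over many atom pairs (uncertain).
    Low entropy: attention concentrates on a few atom pairs (confident).
    """
    return -sum(w * math.log(w) for w in weights if w > 0.0)

# Hypothetical raw scores over four atom pairs.
spread = attention_entropy(softmax([0.0, 0.0, 0.0, 0.0]))   # uniform attention
peaked = attention_entropy(softmax([10.0, 0.0, 0.0, 0.0]))  # one dominant pair
print(spread, peaked)
```

A regularization term of the kind the abstract mentions could, under this assumption, add a weighted entropy value to the prediction loss so that training favors sharper attention.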
|
|
Tu-S3-T11 |
Room 0.94 |
Distributed Adaptive Systems |
Special Sessions: SSE |
Chair: Zhu, Haibin | Nipissing University |
Co-Chair: Shen, Weiming | Huazhong University of Science and Technology |
Organizer: Zhu, Haibin | Nipissing University |
Organizer: Shen, Weiming | Huazhong University of Science and Technology |
Organizer: Fortino, Giancarlo | University of Calabria |
Organizer: Xiong, Naixue | Northeastern State University |
|
16:00-16:15, Paper Tu-S3-T11.1 | |
A Distributed Adaptive System with Strong Tie Graphs for Trust-Aware Reasoning in Adversarial Graphs (I) |
|
Bu, Yu | The Hong Kong Polytechnic University |
Zhu, Yulin | Hong Kong Chu Hai College |
Yuni, Lai | The Hong Kong Polytechnic University |
Keywords: Distributed Intelligent Systems, Adaptive Systems, Trust in Autonomous Systems
Abstract: This study proposes a distributed adaptive system for robust reasoning over graph-structured data under structural poisoning attacks, where adversaries strategically manipulate edges to compromise predictive integrity. We introduce the Graph Adaptive Neural Network (GANN), a modular framework that treats trust and risk zones within the graph as semi-autonomous components capable of self-regulating their propagation behaviors based on adversarial feedback. Leveraging fuzzy-theoretic Strong Tie Graphs (STiG), GANN adaptively identifies and reinforces high-confidence regions to ensure resilient node classification and secure query handling. The system operates as a surrogate defense layer, dynamically managing zone-based structural decomposition and trust calibration in response to perturbations. Its selective validation-driven adaptation mechanism restricts the attacker’s ability to exploit unlabeled data, forcing them toward costly global strategies. A confidence-based regulation framework further enhances GANN’s robustness under bounded adversarial budgets, with demonstrated effectiveness in non-IID and directed graph settings. Experimental results validate GANN’s capability as a self-adjusting distributed system, advancing adaptive defenses in graph-based data management and adversarial query environments.
|
|
16:15-16:30, Paper Tu-S3-T11.2 | |
Establishing Role Networks by Providing a Crowdsourcing Platform (I) |
|
Zhu, Haibin | Nipissing University |
Peng, Chengyu | Nipissing University |
Yu, Kevin Zhe | Nipissing University |
Li, Jiajun | Nipissing University |
Keywords: Distributed Intelligent Systems, Decision Support Systems, Modeling of Autonomous Systems
Abstract: Social relations are complex. Traditional social network analysis encounters social networks that are too complex for identifying and solving social problems. Thanks to the Environments – Classes, Agents, Roles, Groups, and Objects (E-CARGO) model and the methodology of Role-Based Collaboration (RBC), it is possible to clarify and build role networks to abstract and simplify the social relationships among people or agents. This article clarifies the role relationships from the viewpoint of E-CARGO/RBC, and presents the engineering challenges that make the establishment of world role networks impossible for a single development team. Therefore, a crowdsourcing platform is required. To meet this requirement, this paper describes our practice in building a crowdsourcing platform that supports the final establishment of role networks for society. This work is the first effort to establish role networks in the world. The practice described in this paper offers a reference for researchers and practitioners conducting social network analysis and seeking to understand the complexity of social organizations.
|
|
16:30-16:45, Paper Tu-S3-T11.3 | |
Joint Scheduling-Maintenance Optimization for Non-Identical Parallel Batch Processing Machines with Discrete-State Degradation (I) |
|
Zheng, Xiong | Tongji University |
Yan, Guichen | Tongji University |
Qiao, Fei | Tongji University |
Wang, Junkai | Tongji University |
Keywords: Decision Support Systems, Manufacturing Automation and Systems, Adaptive Systems
Abstract: Batch scheduling optimization is critical for improving equipment utilization in high-value manufacturing. However, high-intensity continuous operations exacerbate machine degradation effects, leading to frequent unplanned downtime and surges in maintenance costs. Existing studies predominantly assume identical machines and idealized maintenance responses, failing to adapt to real-world production scenarios. To this end, this paper investigates a co-optimization problem integrating batch scheduling with maintenance, which fully considers machine differentiation in capacity and degradation rates. We establish a discrete degradation state model for machines and design state-dependent maintenance policies. For non-identical machine capacities, we develop an adaptive capacity batch formation heuristic (ACBFLPT). To address state observation latency in multi-machine synchronous decision-making, we propose a Sequential QMIX Adaptive Batching (SQAB) algorithm that integrates a sequential decision-making mechanism based on the QMIX framework with ACBFLPT. The performance of our method has been validated through extensive comparative and ablation experiments.
|
|
16:45-17:00, Paper Tu-S3-T11.4 | |
Solve the Aquaculture Imbalance Problem between Supply and Demand (I) |
|
Huang, Zigeng | Guangdong University of Technology |
Wang, Kangjin | School of Computer Science and Technology (School of Artificial |
Zhu, Haibin | Nipissing University |
Liu, Dongning | Guangdong University of Technology |
Keywords: Adaptive Systems, Distributed Intelligent Systems, Consumer and Industrial Applications
Abstract: Aquaculture, as a vital component of the fisheries industry, is assuming an increasingly significant role in meeting the growing global demand for aquatic products. However, the allocation of aquaculture resources has become increasingly complex. Overproduction of a single species can lead to market oversupply, resulting in sharp price declines and substantial profit losses for fishermen. The Environment - Classes, Agents, Roles, Groups, and Objects (E-CARGO) model has shown strong potential in addressing such socio-economic problems. This study extends the Group Multirole Allocation (GMRA) model to formalize and address the Aquaculture Imbalance Problem between Supply and Demand (AISDP). The objective is to maximize total profit while considering market demand, disaster risk, and the potential for oversupply. Extensive simulation experiments reveal that the maximum total profit does not occur at the threshold, but rather beyond it—meaning that even though the unit price decreases, total profit can still be increased by further increasing the stocking quantity. Furthermore, the results suggest that fishermen should select the regulatory parameter based on real-world market dynamics and their risk tolerance. This enables aquaculture enterprises to adopt optimal, diversified decision-making strategies aligned with resource availability and strategic development goals.
|
|
17:00-17:15, Paper Tu-S3-T11.5 | |
Solving the Flexible Task Allocations Problem Via Group Role Assignment in Crowdsourcing (I) |
|
Wang, Kangjin | School of Computer Science and Technology (School of Artificial |
Huang, Qilin | School of Computer Science and Technology (School of Artificial |
Zhu, Haibin | Nipissing University |
Liu, Dongning | Guangdong University of Technology |
Keywords: Distributed Intelligent Systems, Modeling of Autonomous Systems, Service Systems and Organizations
Abstract: Flexible employment remains an unavoidable topic for crowdsourcing platforms. Decision-makers always focus their attention on maximizing operational efficiency, yet rarely prioritize the work experience of crowdsourcing employees. However, uncomfortable working conditions can lead to user attrition on these platforms, ultimately reducing platform profitability. While the importance of worker experience in crowdsourcing platforms is recognized, the Flexible Task Allocations Problem (FTAP) has seen relatively little computational investigation due to the absence of powerful quantitative analytical tools. A notable exception is the Environment-Classes, Agents, Roles, Groups, and Objects (E-CARGO) model, a mature computational framework with demonstrated efficacy in resolving similar socio-technical challenges. Consequently, this study builds upon the E-CARGO model and its Group Role Assignment (GRA) sub-model to provide a formalization and systematic analysis of the FTAP. By incorporating adjustments for distance and role-switching, the enhanced GRA model can significantly optimize the work experience of crowdsourcing employees, albeit at a slight cost to overall performance. This improvement helps crowdsourcing platforms retain more users and expand their scale. Relevant simulation experiments are conducted in this study to rigorously evaluate the effectiveness of these optimizations.
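The base Group Role Assignment problem referenced in the abstract above is commonly stated as: given a qualification matrix Q (agents by roles) and a role range vector L (agents required per role), choose an assignment that maximizes total qualification with each agent holding at most one role. The sketch below is a brute-force illustration of that base formulation only; the distance and role-switching adjustments from the paper are not modeled, and the function name and data are hypothetical.

```python
from itertools import permutations

def group_role_assignment(Q, L):
    """Exhaustively solve a small GRA instance.

    Q[i][j]: qualification of agent i for role j.
    L[j]:    number of agents role j requires.
    Returns (best_total, [(agent, role), ...]) or (None, None) if infeasible.
    """
    m = len(Q)
    # Expand each role into as many slots as it needs agents.
    slots = [j for j, need in enumerate(L) for _ in range(need)]
    if len(slots) > m:
        return None, None  # not enough agents to fill every slot
    best_total, best_pairs = float("-inf"), None
    # Each ordered choice of distinct agents fills the slots one by one.
    for agents in permutations(range(m), len(slots)):
        total = sum(Q[i][j] for i, j in zip(agents, slots))
        if total > best_total:
            best_total, best_pairs = total, list(zip(agents, slots))
    return best_total, best_pairs

# Three workers, two task types, one worker needed per type (illustrative).
Q = [[0.9, 0.2],
     [0.8, 0.7],
     [0.1, 0.6]]
print(group_role_assignment(Q, [1, 1]))
```

Exhaustive search is only practical for toy sizes; larger instances are typically handled with an assignment solver such as the Hungarian (Kuhn-Munkres) algorithm or an integer programming package.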
|
|
17:15-17:30, Paper Tu-S3-T11.6 | |
ROBIN: A Distributed MARL Approach for Dynamic Bandwidth Optimization in Drone Ad-Hoc Networks |
|
Sun, Xueyu | National University of Defense Technology |
Shi, Weijia | National University of Defense Technology |
Zhao, Baokang | National University of Defense Technology |
Zhou, Huan | National University of Defense Technology |
Xuefeng, Huang | Guilin University of Electronic Technology |
Keywords: Communications, Distributed Intelligent Systems
Abstract: The extensive deployment of drone swarms in emergency response and logistics applications presents stringent requirements for mobile ad-hoc network protocols, particularly due to their large-scale and highly dynamic characteristics. As a widely adopted protocol in industrial applications, BATMAN-ADV (Better Approach To Mobile Ad-hoc Networking Advanced) employs a fixed-interval transmission mechanism for control packets, which leads to inefficient utilization of bandwidth resources. Therefore, we propose ROBIN, a distributed Multi-Agent Reinforcement Learning (MARL) optimization method. Our approach enables each drone to act as an independent agent, autonomously adjusting the control packet transmission interval and optimizing transmission strategies based on localized network states. The solution maintains the protocol's inherent millisecond end-to-end latency while establishing a dynamic balance between bandwidth consumption and latency performance. Moreover, our method features extremely low resource consumption, making it suitable for drones with limited resources. Experimental results show that ROBIN reduces control packet overhead by 75.9% across diverse network scales and mobility scenarios, while maintaining only 2.29% CPU utilization and 0.25% memory consumption per node on commercial drones.
|
|
17:30-17:45, Paper Tu-S3-T11.7 | |
Spectrum Sharing in V2X Networks Based on Multi-Agent Graph Emergent Communication |
|
Pi, Yue | Southern University of Science and Technology |
Zhang, Wang | Southern University of Science and Technology |
Ding, Yulong | Southern University of Science and Technology |
Zhang, Jin | Southern University of Science and Technology |
Liu, Yongheng | Peng Cheng Laboratory |
Yang, Shuang-Hua | Southern University of Science and Technology |
Keywords: Communications, Distributed Intelligent Systems, Cooperative Systems and Control
Abstract: This paper introduces a novel spectrum resource sharing framework in Vehicle-to-Everything (V2X) communication networks based on multi-agent reinforcement learning with graph emergent communication (MARLGEC). We formulate resource sharing as a distributed multi-agent reinforcement learning (MARL) problem, where each vehicle acts as an agent, interacting with the environment to optimize spectrum allocation strategies. The introduction of the emergent communication mechanism enhances cooperation among agents in the distributed framework. Meanwhile, the graph attention mechanism effectively reduces the communication overhead incurred by emergent communication. This enables reliable vehicle-to-vehicle (V2V) payload transmission and improves system performance under varying network conditions. The experimental results validate the method’s effectiveness, demonstrating its ability to adapt to the dynamic environment and outperform existing MARL approaches that lack communication mechanisms.
|
|
Tu-S3-T12 |
Room 0.95 |
Quantum Cybernetics, Machine Learning, and Applications |
Regular Papers - Cybernetics |
Chair: Meikang, Qiu | Augusta University |
Co-Chair: Tong, Yong Feng | National Chi Nan University |
|
16:00-16:15, Paper Tu-S3-T12.1 | |
Learning Individual Intrinsic Reward in Multi-Agent Reinforcement Learning Via Incorporating Generalized Human Expertise (I) |
|
Wu, Xuefei | Nanjing University |
Yin, Xiao | Nanjing University |
Zhu, Yuanyang | Nanjing University |
Chen, Chunlin | Nanjing University |
Keywords: AI and Applications
Abstract: Efficient exploration in multi-agent reinforcement learning (MARL) is a challenging problem when receiving only a team reward, especially in environments with sparse rewards. A powerful method to mitigate this issue involves crafting dense individual rewards to guide the agents toward efficient exploration. However, individual rewards generally rely on manually engineered shaping-reward functions that lack high-order intelligence, and thus they perform less effectively than humans at learning and generalization in complex problems. To tackle these issues, we combine the above two paradigms and propose a novel framework, LIGHT (Learning Individual Intrinsic reward via Incorporating Generalized Human experTise), which can integrate human knowledge into MARL algorithms in an end-to-end manner. LIGHT guides each agent to avoid unnecessary exploration by considering both individual action distribution and human expertise preference distribution. Then, LIGHT designs individual intrinsic rewards for each agent based on actionable representational transformation relevant to Q-learning so that the agents align their action preferences with the human expertise while maximizing the joint action value. Experimental results demonstrate the superiority of our method over representative baselines in both performance and knowledge reusability across different sparse-reward tasks in challenging scenarios.
|
|
16:15-16:30, Paper Tu-S3-T12.2 | |
A Quantum-Inspired Metaheuristic with Hierarchical Directional Strategy for Bi-Objective Cross-Market Investment Optimization |
|
Chou, Yao-Hsin | National Chi Nan University |
Tong, Yong Feng | National Chi Nan University |
Lin, Ping-I | National Chi Nan University |
Kuo, Shu-Yu | National Yunlin University of Science & Technology |
Keywords: Quantum Cybernetics, Computational Intelligence, Metaheuristic Algorithms
Abstract: Quantum-inspired metaheuristics have recently demonstrated strong potential in addressing complex optimization problems. Portfolio optimization is a representative real-world task involving two conflicting objectives: maximizing expected return and minimizing risk. Achieving minimum risk in portfolio optimization is a particularly challenging task, as it involves careful analysis of complex interactions among multiple stocks. To address this, this study introduces an enhanced variant of the Multi-objective Quantum-inspired Tabu Search algorithm, incorporating a hierarchical directional strategy that prioritizes exploration from the lowest-risk solutions. Intermediate non-dominated solutions are retained along this trajectory, and an entanglement move mechanism is subsequently applied to further expand the non-dominated solution set from both the high-return and low-risk ends. Experimental results in a cross-market setting involving the U.S. and Japan highlight the proposed model’s promising ability to navigate the enlarged decision space efficiently, delivering competitive results across standard multi-objective metrics with reduced runtime and computational cost. Overall, the proposed method effectively supports cross-market investment decisions with enhanced diversity and improved risk–return trade-offs.
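The notion of a non-dominated solution set used in the abstract above can be illustrated with a small Pareto filter over (expected return, risk) pairs, where return is maximized and risk is minimized. This is a generic sketch of Pareto dominance, not the quantum-inspired tabu search itself; the function name and portfolio values are illustrative assumptions.

```python
def non_dominated(portfolios):
    """Return the portfolios that no other portfolio dominates.

    Each portfolio is (expected_return, risk). Portfolio b dominates a when
    b has return >= a and risk <= a, with at least one strict inequality.
    """
    front = []
    for i, (ret_a, risk_a) in enumerate(portfolios):
        dominated = any(
            ret_b >= ret_a and risk_b <= risk_a and (ret_b > ret_a or risk_b < risk_a)
            for j, (ret_b, risk_b) in enumerate(portfolios)
            if j != i
        )
        if not dominated:
            front.append((ret_a, risk_a))
    return front

# Hypothetical candidate portfolios: (expected return, risk).
candidates = [(0.10, 0.20), (0.08, 0.10), (0.07, 0.15), (0.12, 0.30), (0.10, 0.25)]
print(non_dominated(candidates))
```

The filter is quadratic in the number of candidates, which is adequate for maintaining a modest archive between search iterations.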
|
|
16:30-16:45, Paper Tu-S3-T12.3 | |
Intrusion Detection of Power Systems with Quantum Neural Networks |
|
Peng, Lian | Augusta University |
Meikang, Qiu | Augusta University |
Li, Chong | Columbia University |
Keywords: Quantum Cybernetics, Quantum Machine Learning, Machine Learning
Abstract: The transition of power systems to smart grids demands scalable intrusion detection against evolving cyberthreats. While quantum neural networks (QNNs) promise exponential acceleration, they face challenges such as vanishing gradients and non-convergent oscillations as parameters increase, which limit their applicability in high-dimensional data tasks. To address these issues, we shift the primary objective of training from traditional loss minimization to the general trend convergence of the accuracy. By integrating a batch-accuracy-weighted cost function and a validation-accuracy-decisive module, we propose the Validation-Decisive Accuracy Optimization-QNN (VDAO-QNN) algorithm, achieving 96.8–99.5% detection accuracy on the IEC 60870-5-104 dataset and outperforming classical models overall in non-DoS scenarios. This work bridges quantum advantages with practical cybersecurity needs in critical infrastructure.
|
|
16:45-17:00, Paper Tu-S3-T12.4 | |
Quantum Contextual Bandits: Integrating Bandit Exploration into Quantum Neural Network |
|
Chen, Tianyi | Institute of Information Engineering, Chinese Academy of Science |
Pokhrel, Shiva Raj | School of IT, Deakin University, Australia |
Fang, Jiang | Institute of Information and Engineering, Chinese Academy of Sci |
Liu, Yinlong | Institute of Information and Engineering, Chinese Academy of Sci |
Sun, Jiyan | Institute of Information and Engineering, Chinese Academy of Sci |
Geng, Liru | Institute of Information and Engineering, Chinese Academy of Sci |
Li, Gang | School of IT, Deakin University, Australia |
Keywords: Quantum Machine Learning, Neural Networks and their Applications, AI and Applications
Abstract: Supervised quantum learning methods face notable limitations in dynamic, real-world environments due to their reliance on static labels and limited adaptability. To address these challenges, we propose a novel online learning framework, Quantum Contextual Bandit (QCB), that integrates quantum neural networks (QNNs) with contextual bandit (CB) algorithms. The QCB framework enables adaptive decision-making by incorporating bandit-based exploration into QNN training, making it particularly suitable for applications such as recommender systems. To mitigate the adverse effects of quantum noise (including depolarizing, Pauli, and shot noise), the framework leverages a gradient-free optimization approach, enhancing robustness and convergence stability. Experimental results on various datasets demonstrate that QCB consistently outperforms traditional QNN training methods with identical circuit architectures. Notably, the model achieves over 99% accuracy under ideal conditions and sustains high performance under noisy quantum environments. These results underscore the potential of QCB as a scalable, noise-resilient solution for adaptive learning in quantum machine learning systems.
|
|
17:00-17:15, Paper Tu-S3-T12.5 | |
QAS-BO: Quantum Architecture Search Based on Bayesian Optimization Applied to Variational Quantum Algorithms |
|
Chao, Shuyan | East China Normal University |
Deng, Yuxin | East China Normal University |
Liu, Zhanou | East China Normal University |
Zhang, Yuwei | East China Normal University |
Keywords: Quantum Machine Learning, Optimization and Self-Organization Approaches, Metaheuristic Algorithms
Abstract: In the era of Noisy Intermediate-Scale Quantum (NISQ) computing, traditional quantum algorithms face the challenges of a limited number of qubits, noise, and decoherence. To address these issues, we propose a Quantum Architecture Search (QAS) method driven by Bayesian Optimization (BO), which is applied to variational quantum algorithms. In this work, QAS is regarded as a fixed-scale sampling problem. We propose a quantum gate pool and use a parameterized probabilistic model to dynamically determine the optimal quantum gate for each position in the quantum circuit, thus optimizing the circuit structure. Using a gradient-free BO method based on radial basis functions, we adaptively design end-to-end quantum circuits, significantly reducing circuit depths and improving computational accuracy. We conducted experiments on ground state energy estimation in quantum chemistry and on combinatorial optimization problems. The experimental results show that our method is significantly superior to traditional methods and other meta-heuristic search methods in accuracy and efficiency. Our method not only reduces the depth of quantum circuits by up to 85% under a certain accuracy, but also improves the accuracy rate to nearly 100% in combinatorial optimization problems. This provides a powerful and efficient tool for designing optimal quantum circuits and promotes the practical application of quantum algorithms in the NISQ era.
|
|
17:15-17:30, Paper Tu-S3-T12.6 | |
Joint Black-Box Optimization of Warehouse Layout and Worker Assignment Using Quantum Annealing and Factorization Machines |
|
Takaki, Kairi | Nagoya University |
Asai, Yusuke | Nagoya University |
Katayama, Shin | Nagoya University |
Urano, Kenta | Nagoya University |
Yonezawa, Takuro | Nagoya University |
Kawaguchi, Nobuo | Nagoya University |
Keywords: Quantum Machine Learning, Quantum Cybernetics, Optimization and Self-Organization Approaches
Abstract: As global demand for logistics continues to grow, improving the efficiency of warehouse operations has become increasingly important. While many processes in logistics warehouses have been automated, the receiving area still relies heavily on manual work. Therefore, optimizing this area is essential. When optimizing operations in a logistics warehouse, various problems must be considered, such as layout design and worker assignment. Previous research has typically focused on these problems individually. However, because they are closely related, jointly optimizing them can further improve overall efficiency. In this study, we focus on the receiving area and conduct joint optimization of layout design and worker assignment. Specifically, we extend and apply a black-box optimization method called FMQA, which combines Factorization Machines (FM) with Quantum Annealing (QA). By using Factorization Machine regression, we build an objective function from simulator data. We then use Quantum Annealing to minimize it and improve the receiving area. A comparison with an actual warehouse environment, using the multi-agent simulator, shows that our approach can reduce mean package processing time by up to 22.1%. This result demonstrates the effectiveness of joint optimization of layout design and worker assignment.
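The FMQA loop described above relies on the fact that a second-order factorization machine over binary variables is already a quadratic unconstrained binary optimization (QUBO) problem. The sketch below shows that mapping under stated assumptions: the FM parameters `w0`, `w`, and `V` are made-up numbers rather than values fitted to simulator data, all function names are hypothetical, and an exhaustive search stands in for the quantum annealer.

```python
from itertools import product


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def fm_predict(x, w0, w, V):
    """Second-order factorization machine output for a binary vector x."""
    n = len(x)
    y = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            y += dot(V[i], V[j]) * x[i] * x[j]
    return y


def fm_to_qubo(w, V):
    """Upper-triangular Q such that x^T Q x == fm_predict(x) - w0."""
    n = len(w)
    Q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        Q[i][i] = w[i]  # linear term: x_i * x_i == x_i for binary x_i
        for j in range(i + 1, n):
            Q[i][j] = dot(V[i], V[j])  # pairwise interaction strength
    return Q


def qubo_energy(x, Q):
    n = len(x)
    return sum(Q[i][j] * x[i] * x[j] for i in range(n) for j in range(i, n))


def brute_force_minimize(Q):
    """Stand-in for the annealer: enumerate all binary vectors (small n only)."""
    return min(product((0, 1), repeat=len(Q)), key=lambda x: qubo_energy(x, Q))


# Illustrative (not fitted) FM parameters for three binary decision variables.
w0 = 0.5
w = [1.0, -2.0, 0.5]
V = [[1.0, 0.0], [0.5, 1.0], [-1.0, 2.0]]

Q = fm_to_qubo(w, V)
best_x = brute_force_minimize(Q)
```

In the iterative scheme, the minimizer found on the surrogate would be evaluated in the simulator, appended to the training data, and the FM refitted before the next round.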
|
|
17:30-17:45, Paper Tu-S3-T12.7 | |
Amplitude-Ensemble Quantum-Inspired Tabu Search Algorithm for Quantum Boolean Circuit Synthesis |
|
Tseng, Kuo-Chun | National Ilan University |
Sun, Zi-Yi | National Ilan University |
Ho, Pei-Lun | National Ilan University |
Cho, Yu-Chieh | National Ilan University |
Liu, Yu-Syuan | National Ilan University |
Lai, Wei-Chieh | National Ilan University |
Huang, Wei-Chun | National Ilan University |
Hong, Jen-Shin | National Chi Nan University |
Keywords: Evolutionary Computation, Metaheuristic Algorithms, Quantum Cybernetics
Abstract: Quantum circuit synthesis is a key technique in quantum computing. Traditional methods often generate circuits with high cost, making it crucial to produce correct and low-cost circuits efficiently. This study proposes the Amplitude-Ensemble Quantum-inspired Tabu Search (AE-QTS) algorithm for quantum Boolean circuit synthesis. AE-QTS demonstrates strong search capability with fewer parameters. Experimental results show that it outperforms other algorithms on low-bit problems and retains a certain level of synthesis ability on high-bit ones. However, the results also reveal that if the encoding frequently produces infeasible solutions, all algorithms will eventually fail, regardless of design. Therefore, future work should focus on developing encoding methods that consistently generate feasible solutions, which is essential for effectively applying metaheuristics to quantum Boolean circuit synthesis.
|
|
Tu-S3-T13 |
Room 0.96 |
AI and Applications 1 |
Regular Papers - Cybernetics |
Chair: Rodas-Silva, Jorge | UNEMI |
Co-Chair: Nishida, Yoshifumi | Institute of Science Tokyo |
|
16:00-16:15, Paper Tu-S3-T13.1 | |
A Unified Two-Stage Framework for Human-Aligned Information Extraction Using Large Language Models |
|
Ning, Zhiheng | Tongji University |
Jiang, Wenlin | Tongji University |
Lu, Han | Tongji University |
Li, Xiaojun | Tongji University |
Keywords: AI and Applications, Application of Artificial Intelligence, Deep Learning
Abstract: The information extraction (IE) task aims to transform unstructured text into structured knowledge. However, due to challenges in aligning with human needs, large language models (LLMs) often struggle in IE tasks. This paper proposes a unified two-stage framework to enhance LLMs in IE tasks. First, we introduce a structured schema representation using JSON-LD (JSON for Linked Data) to eliminate ambiguity and enhance model robustness. Second, we design a two-stage learning method: supervised fine-tuning (SFT) for schema understanding, followed by direct preference optimization (DPO), using an automatic negative sampling strategy to align the output with human preferences. Experiments on 19 datasets for general and domain-specific IE tasks show state-of-the-art (SoTA) performance in both zero-shot and supervised settings. Our method achieves relative improvements of 6.7% and 19.6% in IE ability for unseen schemas, and maintains robustness in fine-grained entity recognition, outperforming existing baselines. The results highlight the effectiveness of structured schema representation and preference alignment in bridging the gap between LLMs and complex IE requirements.
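For readers unfamiliar with the second training stage, the following is a minimal sketch of the standard DPO loss for a single preference pair. It assumes the summed token log-probabilities of the preferred and rejected outputs are already available from the policy and from a frozen reference model; the function name, argument names, and the example numbers are hypothetical, and the paper's automatic negative sampling strategy is not shown.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    loss = -log(sigmoid(beta * (chosen log-ratio - rejected log-ratio)))
    where each log-ratio is policy log-prob minus reference log-prob.
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch to avoid overflow.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


# With identical policy and reference, the loss is log(2).
neutral = dpo_loss(-10.0, -12.0, -10.0, -12.0)
# Raising the chosen output's log-probability lowers the loss.
improved = dpo_loss(-8.0, -12.0, -10.0, -12.0)
```

The `beta` value controls how strongly the policy is allowed to move away from the reference model; 0.1 is only a common illustrative default.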
|
|
16:15-16:30, Paper Tu-S3-T13.2 | |
A Comprehensive Approach for Ransomware Real-World Case Scenarios |
|
Falcao, Paulo | Federal University of Pernambuco |
Barros, Roberto Souto Maior de | Universidade Federal De Pernambuco-UFPE |
Keywords: AI and Applications, Application of Artificial Intelligence, Machine Learning
Abstract: Ransomware is a type of malware that can cause great damage to its victims. These programs can encrypt and/or lock files in the infected system and thus restrict the access of the owner and users to important files. The early detection of ransomware is important, as it can avoid troublesome situations that would otherwise require ransom payment or data backup strategies to regain access to potentially important data. Moreover, payment does not guarantee the release of the data and, if the files are truly lost, the damage can go even further, since this opens up the possibility of reputational damage involving customers and partners. One of the main difficulties in dealing with ransomware is its rapid and constant evolution, which creates methods that deceive outdated detectors and lets new ransomware go unnoticed. Bearing that in mind, this work proposes a data drift detection approach aiming to improve the detection of new ransomware released in the network, pointing out new patterns that would indicate the need for an update. In particular, semi-supervised learning is adopted as a more fitting way to deal with ransomware, because it requires less time and cost to be updated and deployed again. In addition, this work used publicly available open datasets containing data from malicious ransomware behavior mixed with legitimate system usage of some benign software. Finally, our proposal delivered results that are similar to those of the original papers, and even better in some cases, despite using less than half of the labeled data.
|
|
16:30-16:45, Paper Tu-S3-T13.3 | |
A Three-Dimensional Complex Covariance Tensor Network for Enhanced Motor Imagery EEG Decoding |
|
Huang, Shoulin | Guangxi Normal University |
Huang, Renhui | Guangdong University of Finance |
Zhang, Ruifeng | Guilin Institute of Information Technology |
Wen, Xiaohao | Guangxi Normal University |
Wang, Shouguang | Zhejiang Gongshang University |
Keywords: AI and Applications, Artificial Social Intelligence, Biometric Systems and Bioinformatics
Abstract: Accurate decoding of motor imagery (MI) EEG signals is critical for brain-computer interface (BCI) applications, yet remains challenging due to the high dimensionality and complex dependencies of amplitude and phase information. We propose a Three-Dimensional Complex Covariance Tensor Network (3D-CCTN) that fuses amplitude and phase into high-dimensional complex covariance features, capturing spatial, spectral, and multivariate characteristics of EEG signals. A fully complex-valued 3D convolutional neural network is then used for robust feature learning and MI classification. Experiments on two public datasets demonstrate that our method outperforms state-of-the-art approaches, achieving at least 2.49% and 1.85% higher mean accuracies. These results highlight the effectiveness of amplitude-phase fusion in the complex domain for MI-based EEG decoding.
|
|
16:45-17:00, Paper Tu-S3-T13.4 | |
Attention Mamba: Time Series Modeling with Adaptive Pooling Acceleration and Receptive Field Enhancements |
|
Xiong, Sijie | Kyushu University |
Liu, Shuqing | Kyushu University |
Tang, Cheng | Kyushu University |
Okubo, Fumiya | Kyushu University |
Xiong, Haoling | East China University of Science and Technology |
Shimada, Atsushi | Kyushu University |
Keywords: AI and Applications, Application of Artificial Intelligence, Neural Networks and their Applications
Abstract: Time series modeling serves as the cornerstone of real-world applications, such as weather forecasting and transportation management. Recently, Mamba has become a promising model that combines near-linear computational complexity with high prediction accuracy in time series modeling, while facing challenges such as insufficient modeling of nonlinear dependencies in attention and restricted receptive fields caused by convolutions. To overcome these limitations, this paper introduces an innovative framework, Attention Mamba, featuring a novel Adaptive Pooling block that accelerates attention computation and incorporates global information, effectively overcoming the constraints of limited receptive fields. Furthermore, Attention Mamba integrates a bidirectional Mamba block, efficiently capturing long-short features and transforming inputs into the Value representations for attention mechanisms. Extensive experiments conducted on diverse datasets underscore the effectiveness of Attention Mamba in extracting nonlinear dependencies and enhancing receptive fields, establishing superior performance among leading counterparts. Our code will be available on GitHub.
|
|
17:00-17:15, Paper Tu-S3-T13.5 | |
Three-Dimensional Situational Awareness for Everyday Safety: Integration of Epidemiological Data and Three-Dimensional Spatial and Behavioral Data |
|
Naoki, Nozaki | Institute of Science Tokyo |
Kawabe, Yuya | Institute of Science Tokyo |
Oono, Mikiko | National Institute of Advanced Industrial Science and Technology |
Kitamura, Koji | National Institute of Advanced Industrial Science and Technology |
Yamanaka, Tatsuhiro | Safe Kids Japan |
Nishida, Yoshifumi | Institute of Science Tokyo |
Keywords: AI and Applications, Big Data Computing, Cloud, IoT, and Robotics Integration
Abstract: An inclusive society has a growing demand for support systems that can accommodate the diversity of human physical and mental functions and the diversity of living environments. This study focused on preventing home accidents involving young children as a concrete example of methodologies for observing, modeling, and controlling "living situations," which are complex systems composed of people and their environments, where interactions between people and their environment change over time. Although previous research has proposed technologies for recognizing children's behaviors and detecting environmental risks, these technologies have not dealt with the integration of behavior and environment to predict accident pathways and how long before they will occur. In this study, the authors developed a system that integrates home data scanned in three dimensions by smartphones, child position data obtained using posture recognition technology, and epidemiological accident databases (N=358,715 cases) collected by the Tokyo Fire Department in Japan. The system estimates accident risks based on the relative positions of furniture and children, and visualizes the predicted paths and expected time to arrival at high-risk objects in real time. An evaluation based on national accident statistics confirmed that the system's object recognition covers approximately 70% of furniture types involved in serious accidents. By conducting experiments with a six-year-old child and 55 three-dimensional scans obtained in actual living environments, the authors also demonstrated the feasibility of risk visualization considering the individuality and dynamic nature of living environments.
|
|
17:15-17:30, Paper Tu-S3-T13.6 | |
Dynamic Wavelet Filtering Network for Time Series Forecasting |
|
Li, Jintao | University of Electronic Science and Technology of China |
Deng, Hui | University of Electronic Science and Technology of China |
Xiao, Jiang | University of Electronic Science and Technology of China |
Yu, Shui | Shen Zhen Institute for Advanced Study, UESTC |
Li, Yun | Shenzhen Institute for Advanced Study, University of Electronic |
Keywords: AI and Applications, Computational Intelligence, Deep Learning
Abstract: Recent advances in frequency-domain analysis and wavelets have positioned spectral decomposition as a crucial tool for time series forecasting. However, existing studies predominantly interpret low-frequency components as carriers of long-term trend information, ignoring high-frequency signals in the modeling. This approach risks erasing fine-grained temporal features, leading to persistent bottlenecks in predictive accuracy. To address this, we propose a dynamic wavelet filtering network (DWFN) that combines wavelet transforms with trainable filtering mechanisms. Specifically, DWFN first uses wavelet transforms to decompose time series signals into multiscale components, including high-frequency details and low-frequency approximations. These components are then adaptively reconstructed through a dynamic filtering module based on a multi-layer perceptron (MLP) network, which leverages trainable weighting mechanisms to suppress task-irrelevant noise while preserving cross-frequency features critical to prediction tasks. Further, we introduce a wavelet-informed loss function that explicitly aligns spectral reconstruction errors with forecasting objectives. Experiments on six time series forecasting benchmarks demonstrate the superior performance of DWFN in terms of effectiveness and efficiency compared with state-of-the-art methods. DWFN is available in this GitHub repository: https://github.com/Computing-Intelligent-Decision-Team/DWFN.
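The decompose, reweight, and reconstruct pattern described above can be sketched with a single-level Haar transform. This is only an illustration under stated assumptions: the abstract does not say which wavelet DWFN uses, the fixed scalar weights below stand in for the weights a trained MLP would produce, and the function names are hypothetical.

```python
import math


def haar_decompose(x):
    """Single-level Haar transform of an even-length sequence.

    Returns (approx, detail): low-frequency averages and
    high-frequency differences of adjacent samples.
    """
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_reconstruct(approx, detail):
    """Inverse of haar_decompose."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def filter_series(x, approx_weight, detail_weight):
    """Reweight each band, then reconstruct the series.

    The two scalar weights are placeholders for learned, input-dependent
    weights; setting detail_weight to 0 discards the high-frequency band.
    """
    approx, detail = haar_decompose(x)
    return haar_reconstruct(
        [approx_weight * a for a in approx],
        [detail_weight * d for d in detail],
    )


series = [1.0, 3.0, 2.0, 6.0]
unchanged = filter_series(series, 1.0, 1.0)   # both bands kept
smoothed = filter_series(series, 1.0, 0.0)    # details suppressed
```

Keeping both weights at 1 reproduces the input, which makes the effect of any learned attenuation of the detail band easy to inspect.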
|
|
17:30-17:45, Paper Tu-S3-T13.7 | |
Twin Prompt: An End-To-End Framework Inspired by Human Cognition for Navigating Language Model Reasoning |
|
Zhuang, Ren | Hangzhou Normal University |
Wang, Ben | Hangzhou Normal University |
Sun, Shuifa | Hangzhou Normal University |
Keywords: AI and Applications, Cybernetics for Informatics, Deep Learning
Abstract: Large language models (LLMs) provide essential capabilities for smart systems, yet navigating complex reasoning frontiers reliably remains challenging, hindering deployment in dynamic environments. Existing prompting methods often lack robustness or demand costly multi-step interaction. We introduce Twin Prompt, a novel automated framework inspired by human cognition, operationalizing structured problem reformulation and answer refinement within a single, end-to-end interaction requiring no manual examples. This cognitively grounded structure guides the LLM's internal reasoning, enhancing analysis and leveraging latent self-correction capabilities to improve accuracy and reliability. Evaluations on challenging mathematical and general reasoning benchmarks (GSM8K, MATH, MMLU, BBH) demonstrate Twin Prompt significantly boosts performance over standard baselines across diverse LLMs. These findings highlight the potential of structured, single-pass prompting to advance LLM reasoning, enabling more capable and dependable AI components for navigating a dynamic world.
|
|
17:45-18:00, Paper Tu-S3-T13.8 | |
Enhanced Helmet Detection Using a CNN and Genetic Algorithm for Personal Protective Equipment Compliance |
|
Rodas-Silva, Jorge | UNEMI |
Parraga-Alava, Jorge | Universidad Tecnica De Manabi |
Keywords: AI and Applications, Deep Learning
Abstract: Automated helmet detection is essential for verifying compliance with personal protective equipment (PPE) regulations. Real-time detection of helmet use can significantly reduce the risk of injuries caused by falling objects, machinery accidents, or other hazards in high-risk environments. In this paper, we explore the impact of Genetic Algorithm (GA) optimization on the computational efficiency and performance of Convolutional Neural Networks (CNNs) for automated helmet detection, ensuring compliance with PPE regulations. Our approach includes a robust data collection and annotation stage, a CNN architecture, and GA-based optimization to fine-tune hyperparameters for improved accuracy and computational efficiency. Experimental results demonstrate that the GA-optimized CNN outperforms the baseline model in accuracy and robustness, particularly under challenging conditions such as normal lighting and angle variation. With 94% accuracy and a 98% AUC, an inference time of 23.5 ms, and a memory footprint of 245 MB, the GA-CNN excels at helmet detection.
|
|
Tu-S3-T14 |
Room 0.97 |
Artificial Intelligence of Things for Nuanced Human-Machine Interaction & Living Memory: AI-Driven Preservation of Human Memory and Knowledge Inheritance |
Special Sessions: HMS |
Chair: D'Aniello, Giuseppe | University of Salerno |
Co-Chair: Gravina, Raffaele | University of Calabria |
Organizer: D'Aniello, Giuseppe | University of Salerno |
Organizer: Gravina, Raffaele | University of Calabria |
Organizer: Nahavandi, Saeid | Swinburne University of Technology |
Organizer: Nürnberger, Andreas | Otto-Von-Guericke-Universität Magdeburg |
|
16:00-16:15, Paper Tu-S3-T14.1 | |
Assessing the Short-Term Impact of Air Pollution and Socioeconomic Factors on the Overall Mortality of Taranto: An eXplainable Machine Learning Approach (I) |
|
Lofù, Domenico | Dept. of Electrical and Information Engineering (DEI), Politecni |
Colonna, Gianluca | Politecnico Di Bari |
Sorino, Paolo | Politecnico Di Bari |
Castellana, Fabio | Data Sciences and Innovation, N.I. of Gastroenterology ”S. De Be |
Lombardi, Angela | Dept. of Electrical and Information Engineering (DEI), Politecni |
Ragone, Azzurra | Università Degli Studi Di Bari |
Sardone, Rodolfo | Data Sciences and Innovation, N.I. of Gastroenterology ”S. De Be |
Keywords: Human-centered Learning, Human-Machine Cooperation and Systems
Abstract: Increasing emissions due to urbanization pose a significant threat to human health. Air pollutants such as NO2, PM2.5, and PM10 have a well-documented impact on quality of life and mortality risk. Numerous studies are currently investigating the correlation between air pollution and adverse health effects, highlighting the risks associated with exposure to such emissions. In this context, models capable of predicting mortality by leveraging information on pollution and socioeconomic factors are essential for prevention. This work contributes to this field by using Machine Learning (ML) to predict mortality in the Municipality of Taranto (Italy). We propose an eXtreme Gradient Boosting (XGBoost) regression model that uses air pollution levels and socioeconomic data to assess mortality risk over a grid-based map of the examined city. Our study aligns with state-of-the-art findings, further expanding the understanding of the correlation between emissions, economic conditions, and mortality. The proposed model achieves a Root Mean Square Error (RMSE) of 1.61 on the test set, demonstrating the effectiveness of this approach. Additionally, eXplainable Artificial Intelligence (XAI) analysis is conducted using Shapley values to gain insights into the model’s decision-making process and better understand the importance of each feature in predicting mortality risk.
|
|
16:15-16:30, Paper Tu-S3-T14.2 | |
Explainable Machine Learning for Clinical Prediction of High-Flow Nasal Cannula Therapy in Pediatric Bronchiolitis (I) |
|
Sorino, Paolo | Politecnico Di Bari |
Lofù, Domenico | Dept. of Electrical and Information Engineering (DEI), Politecni |
Narducci, Fedelucio | Politecnico Di Bari |
Di Noia, Tommaso | Politecnico Di Bari, Bari (Italy) |
Di Sciascio, Eugenio | Politecnico Di Bari |
Lofù, Ignazio | Pediatric Unit, Maternal and Child Health Department, “S. Giacomo |
Keywords: Cognitive Computing, Wearable Computing, Medical Informatics
Abstract: High Flow Nasal Cannula (HFNC) therapy is commonly used in infants hospitalized with bronchiolitis, but its early indication remains challenging. This study proposes an explainable machine learning framework to predict the need for HFNC at admission using real-world clinical data from 109 pediatric patients. After preprocessing, class balancing with SMOTE, and feature selection via RFECV, four classifiers were trained and compared. The Multi-Layer Perceptron (MLP) achieved the best performance, significantly outperforming other models in terms of F1-score. To ensure interpretability, SHAP values were used to identify key predictive features such as SpO2, CRP levels, and hospital stay duration, while counterfactual explanations highlighted actionable decision boundaries. The proposed model combines predictive accuracy with clinical transparency, offering a promising tool to support early respiratory management in pediatric bronchiolitis.
|
|
16:30-16:45, Paper Tu-S3-T14.3 | |
Interacting with Political Narratives through LLMs: An Approach Based on Ontologies and Graph Embeddings (I) |
|
Pascuzzo, Antonella | University of Salerno |
Orciuoli, Francesco | University of Salerno |
Senatore, Sabrina | University of Salerno |
Damiano, Emanuele | University of Salerno |
Keywords: Human-Machine Interaction, Cognitive Computing, Design Methods
Abstract: Narratives are essential tools through which politicians and public figures construct shared meanings and shape public perception, both locally and globally. This paper introduces a computational approach for systematically identifying and analyzing narrative structures in political speeches, aiming to enhance our understanding of how politicians try to convey their messages. A novel ontology, OntoNarr (Ontology for Narrative Representation), is defined and used to identify narrative schemas within the full text of the speeches. The core contribution is a more interpretable and conceptually coherent method for comparing political speeches based on their underlying narrative structures. This is achieved by converting ontology-based representations into graph embeddings and visualizing them using scatterplots. Unlike traditional NLP pipelines that rely primarily on lexical and syntactic features, this method incorporates a formal semantic structure, addressing key limitations in conventional analysis. A case study involving speeches from four politicians demonstrates how historical context influences the choice of narrative schema while also revealing some cross-temporal and cross-ideological similarities. Lastly, a method from granular computing is used to quantitatively evaluate the ontology-based approach.
|
|
16:45-17:00, Paper Tu-S3-T14.4 | |
Chronic Stress Recognition through Multimodal Fusion of EEG Data and Personality Metrics (I) |
|
Riaz, Majid | University of Calabria |
Gravina, Raffaele | University of Calabria |
Fortino, Giancarlo | University of Calabria |
Keywords: Affective Computing, Brain-based Information Communications, Wearable Computing
Abstract: Chronic stress significantly undermines cognitive function and well-being, yet its reliable detection remains elusive due to inter-individual psychobehavioral heterogeneity. To the best of our knowledge, this study is the first to introduce a multimodal framework combining psychometric traits (conscientiousness, neuroticism) and 64-channel electroencephalography (EEG) for precision-driven chronic stress classification. We collected psychophysiological data from 21 subjects, including Big Five personality inventories, Perceived Stress Scale (PSS) scores, and EEG recordings. Statistical analysis revealed a strong inverse correlation between conscientiousness and perceived stress (r = -0.59), while neuroticism showed a weaker positive association (r = 0.12). Power spectral density (PSD) features from artifact-corrected EEG were fused with trait metrics to train SVM and KNN classifiers. The proposed framework achieved state-of-the-art accuracy (SVM: 94.7%; KNN: 89.4%), with neuroticism emerging as a critical predictor alongside beta/gamma-band spectral markers, a novel finding underscoring its role in stress pathophysiology. This work pioneers the integration of personality-aware analytics with neurophysiological biomarkers for stress phenotyping, establishing a new paradigm for personalized mental healthcare. By demonstrating the feasibility of trait-guided machine learning, our contributions advance scalable, individualized interventions, addressing a critical gap in precision psychiatry. These results lay the groundwork for adaptive digital health systems that leverage multimodal data to mitigate chronic stress and enhance quality of life.
|
|
17:00-17:15, Paper Tu-S3-T14.5 | |
Modeling Information Diffusion in Social Media with a Wildfire-Inspired PDE System (I) |
|
D'Aniello, Giuseppe | University of Salerno |
Gaeta, Matteo | University of Salerno |
Moccia, Sabato | University of Salerno |
Zampoli, Vittorio | University of Salerno |
Keywords: Information Visualization, Visual Analytics/Communication, Information Systems for Design and Marketing
Abstract: Information diffusion in online social networks (OSNs) is a complex phenomenon influenced by user interactions and contextual factors. Traditional models fail to generalize across platforms or incorporate the structural properties of the network. This paper presents an approach that adapts a wildfire propagation model, formulated through a system of Partial Differential Equations (PDEs), to simulate the spread of information on social media. Focusing on the structural analysis of Reddit conversations related to the League of Legends World Championship, we investigate how network metrics can be mapped to the physical parameters of the PDE model. Our goal is to evaluate whether such a parallel enables a meaningful prediction of information diffusion in terms of engagement, measured as the number of comments per post.
|
|
17:15-17:30, Paper Tu-S3-T14.6 | |
Improving Situation Awareness and Self-Adaptation in Autonomous Wheelchair-Drone Systems through Floor Surface Anomaly Detection (I) |
|
Gaeta, Rosario | University of Salerno |
Corradini, Franca | University of Applied Sciences and Arts of Southern Switzerland |
De Santo, Massimo | University of Salerno |
Flammini, Francesco | Mälardalen University |
Ge, Hangli | The University of Tokyo |
Keywords: Assistive Technology, Cognitive Computing
Abstract: Autonomous wheelchair-drone systems represent a promising advancement in assistive mobility, enabling enhanced navigation in complex and dynamic environments. However, floor surface anomalies, such as uneven terrain, obstacles, and hazardous floor conditions, pose significant challenges to safe and efficient operation. This paper presents a novel approach to improving situation awareness and self-adaptation by integrating floor surface anomaly detection in autonomous wheelchair-drone systems. A specific architecture for situation awareness is proposed, combining machine learning-based anomaly detection with adaptive motion planning to enhance the system's resilience and responsiveness. Experimental results in simulated scenarios using a YOLO-based architecture on real-world datasets demonstrate improved anomaly detection performance compared to the state of the art, reducing the risk of instability and improving user safety. Experiments show a mAP50 of 0.764 and an F1 of 0.742 using a YoloV11s architecture. The research presented in this paper has been developed within the European project named REXASI-PRO, which aims to develop trustworthy AI solutions to assist individuals with reduced mobility.
|
|
17:30-17:45, Paper Tu-S3-T14.7 | |
Blind Sidewalk Segmentation and Navigation for Visually Impaired People Using Wearable Camera (I) |
|
Ma, Congcong | Wuhan University of Technology |
Sun, Lvyuan | Academy for Electronic Information Discipline Studies, Nanyang I |
Du, Xinchen | Academy for Electronic Information Discipline Studies, Nanyang I |
Gravina, Raffaele | University of Calabria |
Keywords: Wearable Computing
Abstract: Visually Impaired People (VIP) face various challenges in their daily life, especially while they navigate in outdoor environments. Portable and user-friendly devices would facilitate their independent living. In this work, we designed a wearable camera-based system to help VIP detect and navigate on blind sidewalks. We proposed an adaptive image mask selection method to automatically segment the blind sidewalk; a walk deviation method is used to help the VIP navigate on such tactile paving. We implemented the algorithm on a Raspberry Pi to make the system standalone and cost-effective. Performance is evaluated in real-life scenarios, and our results show better recognition accuracy and computational efficiency. Experimental results showed that for each video frame our algorithm's average execution time is as low as 49.5 ms, while the image mask renewal step takes 50.4 ms.
|
|
Tu-S3-BMI.WS |
Room 0.49&0.50 |
BMI Special Event |
BMI Workshop |
Chair: Falk, Tiago H. | INRS-EMT |
|
16:00-16:15, Paper Tu-S3-BMI.WS.1 | |
Integrative Approaches to EEG Signal Analysis: Collaborative Perspectives from Multiple Disciplines |
|
Narducci, Fedelucio | Politecnico Di Bari |
Lombardi, Angela | Dept. of Electrical and Information Engineering (DEI), Politecni |
Lofù, Domenico | Dept. of Electrical and Information Engineering (DEI), Politecni |
Sorino, Paolo | Politecnico Di Bari |
Colafiglio, Tommaso | Dept. of Electrical and Information Engineering (DEI), Politecni |
Keywords: Passive BMIs, Active BMIs, BMI Emerging Applications
Abstract: 'Introduction to BCI devices and EEG signals' (20 min), Prof. Fedelucio Narducci; 'Emotion Recognition with low-cost sparse electrode devices' (20 min), Prof. Angela Lombardi; 'Critical ethical and privacy considerations that arise from using sensitive biometric information' (20 min), Prof. Domenico Lofù; 'Brain Computer Interfaces for Neural Games' (30 min), Paolo Sorino; 'EEG Scope: a tool for EEG Signal/BCI Signal analysis' (30 min), Tommaso Colafiglio. More information: https://sites.google.com/view/smc-bmi-workshop2025/program
|
|
Tu-Online |
Online Room |
Online Session SSE |
Regular Papers - SSE |
|
08:30-18:00, Paper Tu-Online.1 | |
PFL-SA: Personalized Federated Learning with Switchable Aggregation Strategy |
|
Jia, Weidong | Capital Normal University |
Ren, Chang-E | Capital Normal University |
Cheng, Siyao | Capital Normal University |
Keywords: Distributed Intelligent Systems, Cyber-physical systems, Decision Support Systems
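Abstract: Federated learning (FL) supports training across clients and a server without transmitting data, enhancing user privacy. In this study, we adopt an asynchronous training approach that allows the server to proceed without waiting for crashed clients, thereby improving round efficiency. Furthermore, because data are distributed in a highly heterogeneous manner across clients, we propose a dynamic personalized federated learning method, which helps the global model better fit each client's unique data distribution. In addition, we argue that outdated models degrade the accuracy of the global model because they cannot capture the latest user preferences. We design a switchable aggregation algorithm. If there is a large difference in data distribution between the most recently updated clients and other outdated clients, this means that user preferences have changed considerably. To address this challenge, their model versions are taken into account during aggregation. If there is only a small difference,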
|
|
08:30-18:00, Paper Tu-Online.2 | |
A Channel-Level Multi-Sensor Feature Fusion Fault Diagnosis Framework for AUV Thruster |
|
Meng, Qingxu | Ocean University of China |
Ni, Chenxing | Ocean University of China |
Feng, Chen | Ocean University of China |
Keywords: Fault Monitoring and Diagnosis, Robotic Systems
Abstract: The thruster is a critical power component of an autonomous underwater vehicle (AUV), and its fault diagnosis is essential for enhancing AUV reliability. Currently, in data-driven fault diagnosis, features obtained from a single sensor are insufficient for fault detection, while features from multiple sensors often contain redundancy. To address these challenges, this research presents a channel-level multi-sensor feature fusion framework for AUV thrusters, integrating a channel aggregation approach with adaptive feature allocation and a channel attention mechanism to efficiently fuse multi-channel data. Additionally, a two-channel enhanced graph attention network training strategy is introduced to further improve performance. To address the lack of diverse AUV thruster fault datasets, this paper presents a new multi-fault AUV thruster dataset, covering various fault types and multi-sensor data. Experimental results indicate that the proposed method achieves fault diagnostic accuracy and precision between 99.85% and 99.96% across different working conditions, demonstrating strong robustness and fast convergence.
|
|
08:30-18:00, Paper Tu-Online.3 | |
Ultrasonic-MYOLO: A Fault Detection Algorithm for Aerospace Components Based on Ultrasonic C-Scan Imaging (I) |
|
Xu, Jin | Shenyang Aerospace University |
Gao, Beihang | Shenyang Aerospace University |
Zhang, Senyue | Shenyang Aerospace University |
Keywords: Fault Monitoring and Diagnosis
Abstract: The aerospace field demands high structural integrity of critical components, and accurate, efficient fault detection is crucial for flight safety. This paper proposes a fault detection algorithm based on ultrasonic C-scan imaging, called Ultrasonic-MYOLO. The method combines ultrasonic nondestructive testing techniques with an improved YOLO deep learning framework to achieve high-precision recognition of defects. First, Ultrasonic-MYOLO constructs a detection model based on the state-space model, which achieves global information perception at a low computational cost. Second, a clustering self-attention mechanism is designed to further enhance the model’s ability to express non-local features. Finally, a hybrid expert mechanism is introduced to optimize the feature extraction unit based on the state-space model, which alleviates the drop in detection accuracy caused by category imbalance. Compared with traditional image processing-based methods, Ultrasonic-MYOLO extracts defect features in ultrasonic C-scan images more effectively and uses deep learning models for automatic detection, improving both detection accuracy and efficiency. Experimental results show that the algorithm performs well in several defect detection tasks for aerospace components. Specifically, Ultrasonic-MYOLO processes a single ultrasonic image with a resolution of 224 × 224 within 1 millisecond and achieves an mAP50 of 96.4%, demonstrating both high detection speed and accuracy and providing an efficient, intelligent solution for aerospace structural health monitoring.
|
|
08:30-18:00, Paper Tu-Online.4 | |
A Linked Gauss-Seidel Stochastic Kriging for Multi-Layer Systems with Noisy Response |
|
Xie, Shan | Sun Yat-Sen University |
Huang, Hanyan | Sun Yat-Sen University |
Chen, Hongbo | Sun Yat-Sen University |
Keywords: System Modeling and Control
Abstract: Systems consisting of multiple coupled sub-models, called multi-layer systems, are common in engineering design and optimization, calling for effective and efficient surrogate models for such systems. Existing methods favor structural approaches over the all-in-one (AIO) approach, which treats the system as a black box. Among them, the Linked Gaussian Process (LGP) has attracted attention because of its strong theoretical basis and the robust performance of its underlying model, Kriging. However, the existing LGP cannot handle noisy responses within multi-layer systems, since ordinary Kriging is designed for deterministic simulation. To solve this, this paper proposes a novel linked emulator for multi-layer stochastic simulation, called Linked Gauss-Seidel Stochastic Kriging (LinkedGSK). The method replaces ordinary Kriging with stochastic Kriging within the linked emulator and improves stochastic Kriging with a Gauss-Seidel iterative optimization of its parameters. Numerical experiments show that the proposed LinkedGSK delivers strong and robust performance compared with existing methods.
|
|
08:30-18:00, Paper Tu-Online.5 | |
Denoising Multi-Physiological Signals of Power Personnel in Multi-Operation Scenarios Using CBAM-Unet |
|
Xi, Yang | Northeast Electric Power University |
Wang, Hao | Northeast Electric Power University |
Wang, Wenjing | Northeast Electric Power University |
Zhang, Zihao | Northeast Electric Power University |
Keywords: Fault Monitoring and Diagnosis, System Modeling and Control, Smart Sensor Networks
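Abstract: [...] operations involve complex and variable conditions, in which physiological signals such as ECG, PPG, EEG, and EDA are essential for health monitoring and anomaly detection. However, these signals are often contaminated by noise such as power-frequency interference, EMG artifacts, and baseline drift, which hinders accurate analysis. To address this, we propose CBAM-Unet, a multimodal denoising model that integrates a context contrast block (CCB) for multi-scale feature extraction and a channel-spatial attention module (CBAM) for dynamic feature focusing. The CCB captures local and global noise patterns through coarse and fine context branches, while CBAM enhances key signal regions through channel and spatial attention. Experimental results show that CBAM-Unet outperforms Unet and FCN in terms of SNR and RMSE. In the ECG denoising task under the power maintenance scenario, SNR improves from 7.58 dB to 17.91 dB and RMSE decreases from 0.2066 to 0.0687. For PPG signals, SNR increases from 22.16 dB to 38.43 dB and RMSE decreases to 0.0532. These results demonstrate the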
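The SNR and RMSE figures quoted above can be read against the usual definitions of those metrics. The following is a minimal sketch, assuming SNR is taken as reference-signal power over residual-error power; the abstract does not state the exact formula the authors use, and the function names and toy data are hypothetical.

```python
import math

def rmse(clean, denoised):
    """Root-mean-square error between a reference signal and its estimate."""
    n = len(clean)
    return math.sqrt(sum((c - d) ** 2 for c, d in zip(clean, denoised)) / n)

def snr_db(clean, denoised):
    """SNR in dB: reference signal power divided by residual-error power."""
    signal_power = sum(c * c for c in clean)
    noise_power = sum((c - d) ** 2 for c, d in zip(clean, denoised))
    return 10.0 * math.log10(signal_power / noise_power)

# Hypothetical toy data: four periods of a sine wave, with a constant
# 0.05 offset standing in for the error left after denoising.
clean = [math.sin(2 * math.pi * k / 50) for k in range(200)]
denoised = [c + 0.05 for c in clean]

print(round(rmse(clean, denoised), 4))
print(round(snr_db(clean, denoised), 2))
```

Under these definitions a lower RMSE and a higher SNR both indicate that the denoised signal is closer to the reference.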
|
|
08:30-18:00, Paper Tu-Online.6 | |
An Improved Distributed SpMV with Cache-Aware Optimization and Load Balancing |
|
Shang, Yunkun | Qilu University of Technology (Shandong Academy of Sciences) |
Zhuang, Yuan | Qilu University of Technology (Shandong Academy of Sciences) |
Zeng, Yunhui | Qilu University of Technology (Shandong Academy of Sciences) |
Keywords: System Architecture, Large-Scale System of Systems, Distributed Intelligent Systems
Abstract: Sparse Matrix-Vector Multiplication (SpMV) is a critical operation in scientific and high-performance computing, playing a key role in large-scale numerical simulations. Handling extremely large sparse matrices on a single node is challenging, making distributed computing an effective solution. Existing distributed SpMV optimizations often overlook data locality and suffer from communication bottlenecks. This paper proposes two optimization strategies for distributed SpMV based on the Compressed Sparse Row (CSR) format to improve multi-node efficiency. First, combining cache-aware memory access with binary search, we partition thread tasks across nodes to enable fine-grained non-zero element distribution, reducing memory conflicts, load imbalance, and communication overhead. Second, we design a distributed SpMV kernel incorporating vectorization and load balancing to enhance performance. These strategies improve coordination across nodes and resource utilization. Experiments show average speedups of 81.82× and 7.99× over classical and graph-partitioned distributed SpMV, with peak speedups of 462.34× and 45.85×. Compared to recent distributed SpMV algorithms that balance computation and communication, our method achieves average and peak speedups of 1.23× and 4.27×. The algorithm scales well on large irregular sparse matrices, demonstrating the effectiveness of multi-node collaborative optimization.
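The idea of balancing non-zeros with binary search over a CSR matrix can be illustrated as follows. This is a simplified sketch, not the authors' kernel: it splits at row granularity rather than at the finer non-zero level the abstract describes, runs the parts sequentially, and uses hypothetical function names and a toy matrix.

```python
from bisect import bisect_left

def spmv_csr_rows(row_ptr, col_idx, values, x, row_start, row_end):
    """Compute y[row_start:row_end] for y = A @ x with A stored in CSR form."""
    y = []
    for r in range(row_start, row_end):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += values[k] * x[col_idx[k]]
        y.append(acc)
    return y

def balanced_row_splits(row_ptr, parts):
    """Split rows so that each part holds roughly nnz / parts non-zeros.

    row_ptr is non-decreasing, so the row at which each cumulative
    non-zero target is reached can be located by binary search.
    """
    n_rows = len(row_ptr) - 1
    nnz = row_ptr[-1]
    splits = [0]
    for p in range(1, parts):
        target = nnz * p // parts
        row = bisect_left(row_ptr, target)
        splits.append(min(max(row, splits[-1]), n_rows))
    splits.append(n_rows)
    return splits

# Toy irregular matrix: one dense row followed by three single-entry rows.
# [[1 2 3 4], [0 5 0 0], [0 0 6 0], [0 0 0 7]]
row_ptr = [0, 4, 5, 6, 7]
col_idx = [0, 1, 2, 3, 1, 2, 3]
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
x = [1.0, 1.0, 1.0, 1.0]

splits = balanced_row_splits(row_ptr, parts=2)
y = []
for p in range(len(splits) - 1):
    # In a distributed setting each part would run on its own node or thread.
    y.extend(spmv_csr_rows(row_ptr, col_idx, values, x, splits[p], splits[p + 1]))
print(splits, y)
```

Splitting by cumulative non-zeros instead of by row count keeps a worker that owns a dense row from receiving the same number of rows as a worker that owns sparse ones.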
|
|
08:30-18:00, Paper Tu-Online.7 | |
A Variational Bayesian Framework for Simultaneous Input and State Estimation |
|
Ren, Kunpeng | Southwest University |
Xu, Zihan | Southwest University |
Wang, Yizhen | Southwest University |
Yin, Le | Southwest University |
Keywords: System Modeling and Control, Adaptive Systems, Control of Uncertain Systems
Abstract: In this paper, a variational Bayesian framework for simultaneous input and state estimation (VBSISE) is proposed. In VBSISE, probability density functions (PDFs) are assigned to inputs, with their parameters adaptively updated to enhance input estimation. This offers greater flexibility than existing methods that only handle fixed input conditions. Specifically, inputs are modeled using Gaussian and inverse Wishart distributions, and the joint PDF of the estimated variables is derived and approximated via the variational Bayesian approach to obtain tractable distributions. The input and state estimates are then obtained as the means of the approximated PDFs. The proposed VBSISE effectively adapts to input variations, unlike conventional methods that rely on fixed statistical moments. Experimental results demonstrate that the proposed VBSISE outperforms existing filters in both input and state estimation accuracy.
|
|
08:30-18:00, Paper Tu-Online.8 | |
Online Sparse Streaming Feature Selection with Feature Correlation |
|
Tian, Hao | Southwest University |
Wu, Di | Southwest University |
Chen, Jia | Beihang University |
Zhou, Min | Southwest University |
Luo, Xin | Chinese Academy of Sciences |
Keywords: Adaptive Systems, Decision Support Systems
Abstract: Online streaming feature selection (OSFS) performs feature selection in an online manner and has distinct advantages for big data. Existing methods focus on feature selection with complete data, whereas missing data in real-world applications degrades model performance. To address this issue, we focus on the online selection of sparse streaming features and the interactions between them, proposing Online Sparse Streaming Feature Selection with Feature Correlation (OS2FSFC). It consists of two main parts: 1) predictive completion of missing data using a latent feature analysis aligned with the feature selection objective, and 2) feature selection applied to the completed streaming features, considering feature interaction. Experimental results demonstrate that OS2FSFC outperforms several state-of-the-art OSFS algorithms.
|
|
08:30-18:00, Paper Tu-Online.9 | |
PATVTN: Period-Aware Time-Varying Topological Graph Neural Network for Traffic Flow Forecasting |
|
Ao, Zhang | Xinjiang University |
Qin, Xizhong | Xinjiang University |
Wang, Ben | Xinjiang University |
Jia, Zhenhong | Xinjiang University |
Keywords: Intelligent Transportation Systems, Smart Buildings, Smart Cities and Infrastructures, Smart Sensor Networks
Abstract: Traffic flow forecasting has become a key technology for alleviating urban congestion. The core challenge is to accurately model the coupled spatial-temporal correlations in the data. Many methods have made great progress, but most still face two important challenges: (i) the internal periodic pattern of traffic flow has not been accurately modeled, and (ii) models fail to adapt effectively to the dynamic nature of traffic flow data. Accurate modeling of periodic patterns helps uncover the evolution of traffic flow, while improved adaptability to dynamic topologies enhances the capture of changing spatial dependencies. Therefore, this paper proposes a Period-Aware Time-Varying Topological Network (PATVTN) to enhance the precision of traffic flow prediction. In particular, PATVTN employs a Period-Aware Time Encoder to capture periodic traffic patterns and introduces a Time-Varying Topological Space Encoder to adapt to the time-varying topological structure. Experiments conducted on four real-world traffic datasets show that our model consistently outperforms existing state-of-the-art methods.
|
|
08:30-18:00, Paper Tu-Online.10 | |
Trust Management and Information Reliability in IoV: A Blockchain-Bayesian Collaborative Framework |
|
Zha, Qixin | Yunnan University |
Tang, Li | Yunnan University |
Xu, Hongxing | Yunnan University |
Yao, Shaowen | School of Software Yunnan University |
Khalid, Adam | Maldives National University |
Zahir, Ahmed | Yunnan University |
|
|
08:30-18:00, Paper Tu-Online.11 | |
WhereRU: A Multimodal Indoor Localization System Via Deep Learning-Based RFID and Ultrasonic Fusion |
|
Liu, Wenrui | Institute of Information Engineering, Chinese Academy of Science |
Wang, Shuai | Chinese Academy of Sciences |
Zhang, Jinqing | University of Southern California |
Bai, Guangxuan | Institute of Information Engineering, Chinese Academy of Science |
Pan, Bofan | Institute of Information Engineering, Chinese Academy of Science |
Wang, Siye | Institute of Information Engineering, Chinese Academy of Science |
Li, Jing | University of Chinese Academy of Sciences |
Keywords: Smart Sensor Networks
Abstract: Indoor localization systems face critical trade-offs between accuracy, coverage, robustness, and privacy. Current technologies either achieve high precision at the cost of privacy (camera-based) or preserve privacy while compromising accuracy (signal-based). This limitation is particularly problematic in high-security environments that require precise tracking within confined spaces (typically 2m × 2m) while prohibiting camera surveillance. We propose WhereRU, a novel multimodal indoor localization system integrating Radio Frequency Identification (RFID) with ultrasonic technologies through a fusion framework. Unlike existing approaches that treat modalities as separate components, our system implements deep algorithmic integration in which RFID-based coarse-grained localization directly constrains ultrasonic measurements. The system employs WhereNet, a lightweight residual network for RFID processing, achieving 99.74% accuracy at the grid level. This coarse position estimate guides ultrasonic ranging optimization through the Limited-memory Broyden-Fletcher-Goldfarb-Shanno Bound (L-BFGS-B) algorithm, effectively mitigating the edge-region inaccuracies typical of ultrasonic-only systems. Experimental results demonstrate that WhereRU achieves an average localization error of only 0.107 meters, an 81.86% improvement over traditional Time of Arrival methods, while maintaining complete privacy by eliminating biometric data collection. This approach provides a practical solution for secure facilities requiring both centimeter-level localization accuracy and strict privacy preservation.
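The way a coarse grid cell can constrain a range-based position fit can be sketched as a box-constrained least-squares problem. This is only an illustration under stated assumptions: it substitutes plain projected gradient descent for the L-BFGS-B solver named in the abstract, and the anchor layout, cell bounds, step size, and function name are all hypothetical.

```python
import math

def locate(anchors, ranges, bounds, steps=200, lr=0.1):
    """Projected gradient descent on the squared range residuals.

    bounds = ((x_min, x_max), (y_min, y_max)) plays the role of the coarse
    grid cell; the estimate is clipped back into it after every step.
    """
    (x_lo, x_hi), (y_lo, y_hi) = bounds
    x, y = (x_lo + x_hi) / 2.0, (y_lo + y_hi) / 2.0  # start at the cell centre
    for _ in range(steps):
        gx = gy = 0.0
        for (ax, ay), d in zip(anchors, ranges):
            r = math.hypot(x - ax, y - ay)
            if r == 0.0:
                continue  # gradient is undefined exactly at an anchor
            gx += 2.0 * (r - d) * (x - ax) / r
            gy += 2.0 * (r - d) * (y - ay) / r
        # Gradient step, then projection back into the box.
        x = min(max(x - lr * gx, x_lo), x_hi)
        y = min(max(y - lr * gy, y_lo), y_hi)
    return x, y

# Hypothetical setup: four ultrasonic anchors at the corners of a 2 m x 2 m
# area, noise-free ranges to a target, and a 0.5 m grid cell as the bound.
anchors = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
target = (0.7, 1.2)
ranges = [math.hypot(target[0] - ax, target[1] - ay) for ax, ay in anchors]
estimate = locate(anchors, ranges, bounds=((0.5, 1.0), (1.0, 1.5)))
print(estimate)
```

The box keeps the fit from drifting to a distant local minimum when ranges are noisy, which is the role the coarse estimate plays in the description above.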
|
|
08:30-18:00, Paper Tu-Online.12 | |
Memory-Priority Scheduling for Digital Simulation Model of Security and Stability Control Systems on a Multi-Core System |
|
Dingyu, Hu | Northwestern Polytechnical University |
Zhi, Wang | Computer Science and Engineering, Northwestern Polytechnical Uni |
Shen, Bo | Northwestern Polytechnical University |
Meikang, Qiu | Augusta University |
Keywords: Intelligent Power Grid, Cyber-physical systems, Large-Scale System of Systems
Abstract: The increasing integration of renewable energy sources, such as wind and solar, into power grids has amplified the demand for effective management of Security and Stability Control Systems (SSCS). To address cost and scalability challenges, digital simulations have become indispensable for evaluating control strategies. As grid complexity grows, precise simulation and efficient task scheduling in multi-core environments have become increasingly important. This paper introduces a memory-priority scheduling algorithm for simulating SSCS models, designed to optimize memory access and enhance task handling efficiency. We also derive an upper bound on the response time. Comparative analysis reveals its superior performance over existing algorithms, particularly in dynamic power system environments.
|
|
08:30-18:00, Paper Tu-Online.13 | |
Detection and Localization of False Data Injection Attacks in Power Systems Based on TGRU |
|
Wang, Xin | Qilu University of Technology |
Li, Xiaolong | Qilu University of Technology |
Liu, Chensheng | Qilu University of Technology |
Wu, Fazong | Qilu University of Technology |
Ming, Yang | Qilu University of Technology |
Wu, Xiaoming | Qilu University of Technology, Shandong Computer Science Center |
Keywords: Intelligent Power Grid, Cyber-physical systems, Fault Monitoring and Diagnosis
Abstract: Modern power systems benefit from enhanced reliability and efficiency due to advanced information technologies, but face increased cyber vulnerabilities, particularly false data injection attacks (FDIAs) that threaten grid stability. We propose a Transformer-Gated Recurrent Unit (TGRU) framework, integrating Transformer encoders’ feature extraction with GRUs’ temporal modeling. A novel Euclidean distance-based threshold selection method distinguishes legitimate from malicious data, and an FDILocator module accurately identifies attack locations by analyzing discrepancies between TGRU predictions and actual measurements. Experiments on the IEEE 14-bus system validate the approach’s effectiveness and accuracy.
|
|
08:30-18:00, Paper Tu-Online.14 | |
Empowering LLM-Based Software Defect Prediction with Chain-Of-Thought and In-Context Learning |
|
Duan, Xinhong | Beijing Information Science and Technology University |
Zhang, Yafei | Office of the Central Cyberspace Affairs Commission |
Cui, Zhanqi | Beijing Information Science and Technology University |
Chen, Jingjing | Beijing Information Science and Technology University |
Keywords: Fault Monitoring and Diagnosis, Quality and Reliability Engineering, Decision Support Systems
Abstract: Software defect prediction aims to identify potential defects by analyzing historical project data, thereby reducing development costs and improving software quality. Existing approaches rely on manually designed features or deep learning models, which struggle to capture complex semantics and contextual dependencies and often require large volumes of labeled data. To address these limitations, this paper proposes ELASTIC (Empowering LLM-based Software Defect Prediction with Chain-of-Thought and In-context Learning), a novel software defect prediction method. ELASTIC employs MinHash and Locality Sensitive Hashing algorithms to retrieve similar code snippets and constructs enhanced prompts that combine code change context with stepwise reasoning paths. This approach enables effective defect prediction without fine-tuning LLMs. Experimental results on the FunctionSStuBs4J dataset, which contains 21,047 samples, demonstrate that ELASTIC achieves an F1 improvement of 0.9% and a 31.9% increase in Recall compared to the state-of-the-art methods, outperforming baseline models such as COMPDEFECT.
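The retrieval step mentioned above, MinHash signatures with Locality Sensitive Hashing, can be sketched in a few lines. This is a generic illustration rather than the ELASTIC implementation: the whitespace tokenization, shingle size, number of hashes, band count, and sample snippets are all assumptions made for the example.

```python
import hashlib

def shingles(code, k=3):
    """Set of k-token shingles from a whitespace-tokenized snippet."""
    tokens = code.split()
    return {" ".join(tokens[i:i + k]) for i in range(max(1, len(tokens) - k + 1))}

def minhash_signature(items, num_hashes=64):
    """One minimum per seeded hash; the seeds emulate independent hash functions."""
    return [
        min(
            int.from_bytes(hashlib.sha1(f"{seed}:{s}".encode()).digest()[:8], "big")
            for s in items
        )
        for seed in range(num_hashes)
    ]

def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching positions estimates the Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

def lsh_bands(sig, bands=16):
    """Band the signature; two snippets are candidates if any band matches."""
    rows = len(sig) // bands
    return {(b, tuple(sig[b * rows:(b + 1) * rows])) for b in range(bands)}

# Hypothetical snippets: a and b differ in one identifier, c is unrelated.
a = "if ( x != null ) return x . size ( ) ;"
b = "if ( y != null ) return y . size ( ) ;"
c = "for item in basket : total += item . price"
sig_a, sig_b, sig_c = (minhash_signature(shingles(s)) for s in (a, b, c))

print(estimated_jaccard(sig_a, sig_b), estimated_jaccard(sig_a, sig_c))
print(bool(lsh_bands(sig_a) & lsh_bands(sig_c)))
```

Banding trades recall for speed: more bands with fewer rows each make it likelier that moderately similar snippets share a bucket and are retrieved as candidates.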
|
|
08:30-18:00, Paper Tu-Online.15 | |
A Particle Swarm Optimization Based Residual Negative Magnitude Shaping Scheme for Vibration Control |
|
Yang, Weiyi | University of Chinese Academy of Sciences |
Yuan, Ye | Southwest University |
Shang, Mingsheng | Chinese Academy of Sciences |
Keywords: Robotic Systems, System Modeling and Control, Control of Uncertain Systems
Abstract: As modern manufacturing advances, suppressing residual vibrations in flexible structures and underactuated systems becomes crucial. Input shaping has garnered attention for its effectiveness in mitigating vibrations and enhancing motion performance. However, existing input shapers typically suffer from unavoidable time delays, modeling inaccuracies, and limited adaptability to uncertain systems, leading to suboptimal performance. To address these critical issues, this paper proposes a Particle Swarm Optimization-based Residual Negative Magnitude (PRV) vibration control scheme with two innovations: a) utilizing the zero vibration shaper to collect system data and employing a data-driven particle swarm optimization (PSO) algorithm to estimate system errors; and b) designing a robust PSO-based Residual negative magnitude (PR) shaper to reduce time delays, address modeling errors, and adapt to diverse system configurations. To validate its performance, two real-world datasets from two laboratory platforms have been established and made publicly available. Empirical results demonstrate that the proposed PR shaper outperforms state-of-the-art methods, and the proposed PRV scheme achieves significant vibration control effects.
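For context, the zero vibration (ZV) shaper that the abstract uses for data collection has a standard closed form. The sketch below shows that textbook two-impulse shaper and the usual residual-vibration measure for a second-order mode; it is not the proposed PR shaper, and the natural frequency and damping ratio are hypothetical values.

```python
import math

def zv_shaper(omega_n, zeta):
    """Two-impulse zero-vibration shaper as a list of (time, amplitude)."""
    omega_d = omega_n * math.sqrt(1.0 - zeta ** 2)  # damped natural frequency
    k = math.exp(-zeta * math.pi / math.sqrt(1.0 - zeta ** 2))
    return [(0.0, 1.0 / (1.0 + k)), (math.pi / omega_d, k / (1.0 + k))]

def residual_vibration(impulses, omega_n, zeta):
    """Residual amplitude relative to a single unit impulse applied at t = 0."""
    omega_d = omega_n * math.sqrt(1.0 - zeta ** 2)
    t_last = impulses[-1][0]
    c = sum(a * math.exp(zeta * omega_n * t) * math.cos(omega_d * t) for t, a in impulses)
    s = sum(a * math.exp(zeta * omega_n * t) * math.sin(omega_d * t) for t, a in impulses)
    return math.exp(-zeta * omega_n * t_last) * math.hypot(c, s)

# Hypothetical mode: 1 Hz natural frequency, damping ratio 0.1.
omega_model, zeta = 2.0 * math.pi, 0.1
shaper = zv_shaper(omega_model, zeta)

print(residual_vibration(shaper, omega_model, zeta))        # exact model
print(residual_vibration(shaper, 1.2 * omega_model, zeta))  # 20% frequency error
```

The second call illustrates the sensitivity to modeling error noted in the abstract: the shaper cancels vibration when the model is exact, but a frequency mismatch leaves a noticeable residual.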
|
|
08:30-18:00, Paper Tu-Online.16 | |
FedPAG: Enhancing Personalized Federated Learning Via Pseudo-Data-Driven Similarity-Aware Aggregation |
|
Li, Yuanfeng | Civil Aviation University of China |
Xiaoning, Ma | Civil Aviation University of China |
Hui, Zhi | TravelSky Technology Limited |
Keywords: Distributed Intelligent Systems, Cooperative Systems and Control
Abstract: Federated Learning (FL) is an emerging distributed learning paradigm that enables collaborative model training among multiple clients without requiring access to their private local data. However, in practical settings, FL commonly encounters both model heterogeneity and data heterogeneity. Variations in network architectures and patterns of non-independent and non-identically distributed (non-IID) data across clients often lead to significant degradation in global model performance. Existing approaches that utilize knowledge distillation to address heterogeneity in FL typically rely on access to public datasets or partial sharing of private data. Such reliance increases communication overhead while introducing privacy concerns, which impede practical deployment. In this work, we propose a novel heterogeneous federated learning framework, referred to as FedPAG. Our method extracts statistical information from Batch Normalization layers to synthesize pseudo-data aligned with local distributions, enabling similarity-aware aggregation of student models on the server while circumventing the need for external or shared data. Meanwhile, each client preserves and transfers its local knowledge via local knowledge distillation, enabling personalized optimization. By eliminating the dependence on public datasets and reducing privacy risk, FedPAG provides a scalable and privacy-preserving solution for heterogeneous FL. Rigorous experiments on three real-world datasets under diverse non-IID settings validate FedPAG’s superior performance and robustness over baseline methods in heterogeneous FL environments.
|
|
08:30-18:00, Paper Tu-Online.17 | |
Bearing Fault Diagnosis Method Based on Multi-Scale Dynamic Adversarial Transfer Learning |
|
Hao, Huijuan | Qilu University of Technology |
Wen, Lijun | Qilu University of Technology |
Liang, Hu | Qilu University of Technology |
Ding, Qingyan | Qilu University of Technology |
Bai, Jinqiang | Shandong Computer Science Center (National Supercomputer Center |
Tang, Yongwei | Qilu University of Technology |
Xu, Lei | Qilu University of Technology |
|
|
08:30-18:00, Paper Tu-Online.18 | |
Automated Penetration on Multi-Subnet Environments with Dual-Stage DRL Models |
|
Bu, Haoyu | Institute of Information Engineering, Chinese Academy of Science |
Wen, Hui | Institute of Information Engineering Chinese Academy of Science |
Zhu, Hongsong | Institute of Information Engineering Chinese Academy of Science |
Li, Hong | Chinese Academy of Sciences |
Song, Xirui | Institute of Information Engineering,Chinese Academy of S |
Yimo, Ren | Institute of Information Engineering, Chinese Academy of Sciences |
Keywords: Fault Monitoring and Diagnosis
Abstract: With the advent of artificial intelligence techniques, the field of Network Attack Defense (NAD) has witnessed a surge in research efforts towards automating penetration testing (PenTest). Our work presents a dual-stage PenTest model that uses deep reinforcement learning models to predict attack paths in the network topology and to determine payloads for vulnerabilities on hosts. To construct training environments, our approach integrates real-world vulnerability environments with virtual network topologies. Compared with existing work based on fully virtualized targets, this allows the model to account for the vulnerability validation process and its success rate, while retaining the deployment and training efficiency provided by virtualization. We also introduce a method that simulates hierarchical network topologies with randomized subnets to model complex network environments, challenging the agent to adapt and learn effective policies across diverse configurations of the target networks. Our experiments demonstrate the effectiveness of our model across various network sizes. In addition, the results indicate that our approach not only achieves high performance but also remains stable under different success rates of vulnerability exploitation, demonstrating its robustness and adaptability. Our work contributes to the advancement of automated PenTest by providing a more generalized and efficient solution.
|
|
08:30-18:00, Paper Tu-Online.19 | |
StarLight: Multi-Scale Spatial Attention-Guided Representation Learning for Traffic Signal Control |
|
Yan, Yongjun | Hunan University of Technology |
Tong, Lian | Changsha University |
Meng, Zhigang | Changsha University |
Zhu, Xiaoyu | Changsha University |
Wen, Zhiqiang | Hunan University of Technology |
Keywords: Intelligent Transportation Systems, Cooperative Systems and Control, Cyber-physical systems
Abstract: Effective Traffic Signal Control (TSC) in large-scale, dynamic urban networks hinges on informative state representations, a key challenge for reinforcement learning (RL). Existing methods often struggle to capture complex long-range spatial dependencies and multi-scale traffic patterns efficiently. We introduce StarLight, a novel framework that learns enhanced global traffic state representations using a bespoke Variational Autoencoder (VAE). StarLight's VAE encoder uniquely integrates spatial self-attention to model non-local interactions and multi-scale convolutions to capture heterogeneous traffic phenomena across diverse receptive fields. The resulting richer global embeddings significantly improve the value function estimation within a decentralized Proximal Policy Optimization (PPO) framework, leading to better context-aware local control decisions. Comprehensive experiments on benchmark simulations demonstrate that StarLight consistently outperforms state-of-the-art methods under both static and dynamic conditions, achieving substantial reductions in delay and queue lengths with improved convergence stability. StarLight offers a more effective representation learning paradigm for complex TSC problems.
|
|
08:30-18:00, Paper Tu-Online.20 | |
A Lightweight Federated Learning Architecture Approach for Short-Term Load Prediction |
|
Jingdong, Wang | Northeast Electric Power University |
Yang, Yang | Northeast Electric Power University |
Fanqi, Meng | Northeast Electric Power University |
Keywords: Intelligent Power Grid, Distributed Intelligent Systems, System Architecture
Abstract: To address the challenges of distributed power load forecasting with limited resources and the need to protect user privacy, this paper proposes a lightweight federated learning architecture, Deep Separable Temporal Convolutional Network (DS-TCN). The model consists of three core modules: parallel multi-branch dilated causal convolutions capture short-term mutations and long-term trends simultaneously; an adaptive scale fusion module dynamically weights the features of each branch; and a dynamics-aware prediction head automatically adjusts its response strength to signal fluctuations through a gating mechanism. In experiments on the public HUE dataset, all DS-TCN clients perform 5 rounds of local iterations and 50 rounds of federated training, which dramatically reduces communication and computation overhead compared with the 500 rounds of federated training required by the LSTM model. Moreover, the DS-TCN federated global model achieves RMSE=0.1382, MAE=0.1267, and MSE=0.0194, matching or exceeding the LSTM; the average local RMSE is reduced from 0.0796 for the LSTM to 0.0699, with the largest single-household reduction close to 49%, which demonstrates the advantage of the proposed method.
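The multi-branch causal convolution described above (commonly called dilated causal convolution) can be illustrated without any framework. This is a minimal sketch under stated assumptions: the kernel, the dilation rates, and the fixed mixing weights are hypothetical, whereas the abstract describes branch weights that are computed adaptively.

```python
def dilated_causal_conv1d(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k * dilation], with zero left-padding.

    The output at time t depends only on x[t] and earlier samples (causal).
    """
    y = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        y.append(acc)
    return y

def multi_branch(x, weights, dilations, branch_weights):
    """Run parallel branches with different dilations and mix them linearly."""
    outs = [dilated_causal_conv1d(x, weights, d) for d in dilations]
    return [sum(bw * o[t] for bw, o in zip(branch_weights, outs)) for t in range(len(x))]

# Hypothetical load series and a two-tap kernel.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
short = dilated_causal_conv1d(x, [1.0, 1.0], dilation=1)  # looks back 1 step
long_ = dilated_causal_conv1d(x, [1.0, 1.0], dilation=2)  # looks back 2 steps
mixed = multi_branch(x, [1.0, 1.0], dilations=[1, 2], branch_weights=[0.5, 0.5])
print(short, long_, mixed)
```

A small dilation reacts to abrupt short-term changes, while a larger one widens the receptive field toward longer trends without adding kernel taps.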
|
|
08:30-18:00, Paper Tu-Online.21 | |
A Novel Contrastive Learning Based Data Augmentation Method for Electricity Consumption Pattern Discovering |
|
Zhang, Youlong | Chongqing Normal University |
Wang, Chao | Chongqing Normal University |
Wang, Bin | Chongqing Normal University |
Wang, Zonghui | Chongqing Normal University |
Zhu, Houyi | Chongqing Normal University |
Jiang, Canghai | Chongqing Normal University |
Keywords: Intelligent Power Grid, Consumer and Industrial Applications
Abstract: With the continuous development of smart sensing technology and the energy internet, demand for optimized energy resource management continues to grow on both the power system operation side and the user side, and in-depth mining of power users’ electricity consumption behavior patterns can provide valuable insights for optimizing energy management and improving user-centric services. However, energy consumption behavior is highly complex, dynamic, and nonlinear. Traditional methods, relying on linear assumptions, fixed feature engineering, and labeled data, lack the adaptability needed to effectively capture such complex patterns. To address this problem, we propose an adaptive data augmentation method based on the attention mechanism for contrastive learning. Contrastive learning can automatically learn effective feature representations under unlabeled conditions, while the attention mechanism improves the expressiveness of the model by focusing on key features. Combining these two strategies for joint training helps the model better identify users’ electricity consumption behavior patterns in different time periods. Comparative experiments with four data augmentation strategies and three models show that the proposed data augmentation method, combined with the joint training strategy, significantly improves the representation ability of the feature extractor, especially in the recognition and feature learning of complex electricity consumption behavior patterns.
|
|
08:30-18:00, Paper Tu-Online.22 | |
Analysis and Research on Fault Propagation of Rotating Machinery System Based on Multiple Key Influencing Factors |
|
Huang, Xianming | Hunan University of Technology |
Si, Pengju | Hunan University of Technology |
Wang, Mingxi | Hunan University of Technology |
Yu, Hongxiang | Hunan University of Technology |
Liang, Ainan | Hunan University of Technology |
|
|
08:30-18:00, Paper Tu-Online.23 | |
OMATE-ZT: Optimized Multi-Attribute Trust Evaluation Model for Zero-Trust in Industrial Internet of Things |
|
Zhang, Meng | Shenyang Aerospace University |
Cheng, Yuliang | Shenyang Aerospace University |
Wang, Hangyu | Institute of Information Engineering, Chinese Academy of Science |
Lv, Fei | Institute of Information Engineering, Chinese Academy of Science |
Keywords: Cyber-physical systems, Consumer and Industrial Applications, Homeland Security
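Abstract: With the development of Industrial Internet of Things (IIoT) devices and their deeper integration with the Internet, the risk of cyberattacks is increasing. Traditional perimeter-based security architectures rely on fixed trust boundaries. Once these boundaries are breached, attackers can move laterally with little resistance, exposing the system to further threats. Zero Trust (ZT) breaks with the traditional perimeter-based security approach. It adopts a resource-centric paradigm and incorporates dynamic, multi-dimensional trust evaluation to prevent lateral movement across the attack surface. Achieving high-accuracy, real-time ZT trust evaluation under complex, multi-dimensional attributes has become a key challenge for the security of IIoT environments. Therefore, we propose OMATE-ZT, an optimized multi-attribute trust evaluation model based on ZT principles. It incorporates an enhanced FastKAN algorithm to dynamically evaluate multiple trust-related dimensions, including subject, object, and environment attributes, network traffic, and physical indicators. By leveraging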
|
|
08:30-18:00, Paper Tu-Online.24 | |
SMACmix-FCN: A Network for Generating Fluid Velocity Field in Turbulent Scenes |
|
Guo, Shuqiang | Northeast Electric Power University |
Xie, ZhiCheng | Northeast Electric Power University |
Xiao, Bin | Northeast Electric Power University |
Keywords: Consumer and Industrial Applications
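Abstract: The study of fluid flow phenomena has long been a popular topic in fluid mechanics. Particle image velocimetry (PIV), a non-contact method, is widely used in fluid mechanics experiments to obtain velocity field data. The key to PIV is extracting displacement vector information from particle image pairs in order to construct a two-dimensional velocity field. To address the low accuracy of current deep learning methods for generating two-dimensional velocity fields, which leads to [...] in the generated velocity field, this study, built entirely on a convolutional neural network as the basic framework, proposes a network named SMACmix-FCN for turbulent scenes. The convolutional neural network is designed with four encoder layers, a bottleneck layer, and four decoders, which upsample the feature maps to match the output size of the encoder modules. Then, a feature extraction and aggregation module named SMACmix, which combines self-regulated attention and convolution, is proposed and designed. This module first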
|
|
08:30-18:00, Paper Tu-Online.25 | |
A Value Decomposition Multi-Agent Reinforcement Learning Framework for Multi-Echelon Inventory Management in Supply Chain Network |
|
Luo, Ziqiang | School of Computer Science, Sichuan University, Chengdu, 610065, |
Jiang, Yuming | School of Computer Science, Sichuan University, Chengdu, 610065, |
Zhang, Jianxiong | School of Computer Science, Sichuan University, Chengdu, 610065, |
Chen, Yuhan | SICHUAN UNIVERSITY |
Hu, Dasha | School of Computer Science, Sichuan University, Chengdu, 610065, |
Guo, Bing | School of Computer Science, Sichuan University, Chengdu, 610065, |
Zhang, Jinbo | Chengdu Jwell Group Co., Ltd., Chengdu, 610305, Sichuan, China |
Teng, Lina | Chengdu Jwell Group Co., Ltd., Chengdu, 610305, Sichuan, China |
Deng, Yifei | Chengdu Jwell Group Co., Ltd., Chengdu, 610305, Sichuan, China |
Wang, Hao | Chengdu Jwell Group Co., Ltd., Chengdu, 610305, Sichuan, China |
Ding, Xuefeng | School of Computer Science, Sichuan University, Chengdu, 610065, |
Keywords: System Modeling and Control, Distributed Intelligent Systems, Cooperative Systems and Control
Abstract: Deep reinforcement learning (DRL) has been widely applied to address inventory management problems. To tackle the challenges posed by factors such as backlog, multi-source replenishment, and demand priority in multi-echelon inventory systems, this paper proposes a value decomposition-based multi-agent reinforcement learning (MARL) framework. The framework utilizes independent DQN networks, embedded with self-attention and GRU modules, to facilitate distributed learning of local action value functions, thus reducing the complexity of the action space. Additionally, a hybrid network based on the multi-head attention mechanism is constructed to approximate the joint action value function, aiming to optimize the overall system cost. Experiments have been conducted on various types of supply chain networks, and the results demonstrate the effectiveness and scalability of the proposed framework.
|
|
08:30-18:00, Paper Tu-Online.26 | |
Deep Reinforcement Learning-Based Collaborative Optimization for Multi-Echelon Supply Chains |
|
Chen, Yuhan | SICHUAN UNIVERSITY |
Jiang, Yuming | School of Computer Science, Sichuan University, Chengdu, 610065, |
Zhang, Jianxiong | School of Computer Science, Sichuan University, Chengdu, 610065, |
Luo, Ziqiang | School of Computer Science, Sichuan University, Chengdu, 610065, |
Hu, Dasha | School of Computer Science, Sichuan University, Chengdu, 610065, |
Guo, Bing | School of Computer Science, Sichuan University, Chengdu, 610065, |
Zhang, Jinbo | Chengdu Jwell Group Co., Ltd., Chengdu, 610305, Sichuan, China |
Deng, Yifei | Chengdu Jwell Group Co., Ltd., Chengdu, 610305, Sichuan, China |
Wang, Hao | Chengdu Jwell Group Co., Ltd., Chengdu, 610305, Sichuan, China |
Jv, Yang | Chengdu Jwell Group |
Ding, Xuefeng | School of Computer Science, Sichuan University, Chengdu, 610065, |
Keywords: Manufacturing Automation and Systems, Decision Support Systems, Intelligent Green Production Systems
Abstract: A typical supply chain consists of suppliers, manufacturers, distributors, and customers, with supply, production, and distribution being the key links. Coordinating and optimizing these stages to reduce waste and shorten delivery times poses a significant challenge. However, most existing research relies on heuristic algorithms and focuses primarily on the production and distribution stages. In large-scale, complex scenarios, the computational efficiency and adaptability of heuristic algorithms often fall short. This paper investigates a combined scheduling problem and heterogeneous vehicle routing problem. Unlike traditional heuristic approaches, we propose a novel deep reinforcement learning model. For different scenarios, customized encoder-decoder architectures are designed. Finally, a Multi-rollout algorithm is employed for collaborative training. Experimental results demonstrate that the proposed algorithm delivers competitive performance compared to heuristic methods while significantly outperforming them in computation time for individual instances.
|
|
08:30-18:00, Paper Tu-Online.27 | |
Heterogeneous Traffic Study for a Cellular Automata Model Considering Progressive LC and Different Driving Style |
|
Gong, WangHan | Southwest University |
Zhang, Geng | Southwest University |
Song, Boyu | Southwest University |
Keywords: Technology Assessment, Intelligent Transportation Systems, Autonomous Vehicle
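Abstract: The emergence and development of connected and automated vehicles (CAVs) has the potential to improve the efficiency of traffic systems. CAVs and human-driven vehicles (HDVs) with different driving styles will coexist on multi-lane roads for a long time, and lane-changing (LC) maneuvers are common in such an environment. To study the impact of LC maneuvers and the different driving styles of HDVs on heterogeneous traffic flow, a three-lane heterogeneous traffic flow cellular automata (CA) model is proposed. In this model, vehicles complete the LC process progressively, and different human driving styles are considered. Simulation experiments are conducted to show the efficiency and congestion rate of the proposed model under different CAV penetration rates and traffic densities. The results indicate that the proposed model performs well in efficiency and congestion simulation. Finally, the impact of different driving styles on traffic flow is also investigated, showing that an aggressive driving style significantly increases traffic flow, whereas calm and moderate driving styles affect traffic flow only slightly.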
|
|
08:30-18:00, Paper Tu-Online.28 | |
SCoVerLLM: Smart Contract Vulnerability Detection Via LLM-Based In-Context and Chain-Of-Thought Prompts |
|
Yang, Kaiqi | Beijing Information Science and Technology University |
Gu, Xiguo | Beijing Information Science and Technology University |
Xu, WeiLi | Beijing Information Science and Technology University |
Cui, Zhanqi | Beijing Information Science and Technology University |
Zheng, Liwei | Beijing Information Science and Technology University |
Keywords: Trust in Autonomous Systems, Fault Monitoring and Diagnosis, Distributed Intelligent Systems
Abstract: As a key application of blockchain technology, smart contracts have been adopted in various domains such as finance and the Internet of Things. However, their potential vulnerabilities can lead to significant economic losses, so efficient and accurate vulnerability detection methods are needed to guarantee their security. Existing detection methods mostly rely on predefined rules or discriminative models, which suffer from high maintenance costs and limited semantic understanding. Moreover, with the increasing diversity of contract functionalities, traditional detection methods struggle to effectively identify vulnerabilities. Although Large Language Models (LLMs) have demonstrated strong capabilities in the field of software engineering, their training on general-purpose corpora limits their effectiveness in specialized tasks, such as vulnerability detection. To address this issue, this paper proposes SCoVerLLM (Smart Contract Vulnerability Detection via LLM-Based In-Context and Chain-of-Thought Prompts), which is designed to enhance the performance of smart contract vulnerability detection by using LLMs. SCoVerLLM combines prediction information generated by deep learning models with similar contract examples, and leverages In-Context Learning prompts and structured Chain-of-Thought templates to guide LLMs in analyzing contract logic step by step for vulnerability detection. Experimental results show that SCoVerLLM outperforms four existing methods, including MANDO and Mythril, across multiple performance metrics, with improvements of 10.72% to 19.20% in Accuracy, 8.70% to 18.51% in Precision, and 10.09% to 25.08% in F1.
|
|
08:30-18:00, Paper Tu-Online.29 | |
Formal Modeling and Quantitative Evaluation for Online Monitoring Systems in Nuclear Facilities |
|
Fang, Letian | East China Normal University |
Tang, Wenbing | Nanyang Technological University |
Wang, Xin | East China Normal University |
Liu, Jing | East China Normal University |
Wang, Shengyuan | Tsinghua University |
Keywords: Cyber-physical systems, Control of Uncertain Systems, System Modeling and Control
Abstract: Advanced online execution monitoring is an essential system for ensuring the safety of nuclear facilities. Formal modeling and quantitative evaluation of these systems offer a promising approach to verifying their behaviors and identifying potential vulnerabilities. However, existing modeling languages often lack the capability to represent the system’s control flow logic. Additionally, the absence of automated transformation rules hinders the verification of generated models using available verification tools. Hence, in this paper, we propose a novel synchronous modeling language, Hybrid SynLong, which integrates data flow and control flow to effectively describe the real-time dynamic behaviors of online monitoring systems. Additionally, we present a method for converting the Hybrid SynLong language model into a network of stochastic hybrid automata, enabling direct verification with existing statistical model checkers. In consequence, the performance of an online monitoring system can be quantitatively evaluated by executing well-defined queries. The experimental results illustrate the effectiveness and efficiency of the proposed modeling language and transformation algorithms, as demonstrated through their application in an online monitoring system for a nuclear plant.
|
|
08:30-18:00, Paper Tu-Online.30 | |
Dynamic Multi-Scale Adaptive Graph Convolutional Network for Traffic Flow Prediction |
|
Ren, Bin | Dongguan University of Technology |
Zhang, Hao | Dongguan University of Technology |
Wang, Jiawei | Dongguan University of Technology |
Luo, Haocheng | Dongguan University of Technology |
Wen, Ya | Shenzhen University |
He, Chunhong | Dongguan University of Technology |
Keywords: Intelligent Transportation Systems
Abstract: Traffic flow prediction presents significant challenges due to complex spatio-temporal dependencies. Conventional static road network models fail to adequately capture dynamic traffic patterns and struggle with multi-scale feature extraction, limiting prediction accuracy. To address these problems, we present DMAGCN (Dynamic Multi-scale Adaptive Graph Convolutional Network), an innovative architecture combining MGTCN (Multi-scale Gated Temporal Convolution Network) and ADMGCN (Adaptive Dynamic Multi-Graph Convolutional Network) Modules. MGTCN extracts multi-scale temporal features by combining temporal attention mechanisms with gated convolutional networks, while ADMGCN enhances spatial representation through graph convolutions with spatial attention layers and adaptive adjacency matrices. Comprehensive experimental evaluations conducted on the PEMS04 and PEMS08 datasets demonstrate that DMAGCN consistently outperforms existing state-of-the-art methods in traffic flow prediction tasks.
|
|
08:30-18:00, Paper Tu-Online.31 | |
A Universal Vehicle-Trailer Navigation System with Neural Kinematics and Online Residual Learning |
|
Chen, Yanbo | Tsinghua University |
Tan, Yunzhe | Department of Automation, School of Mechanical Engineering and A |
Wang, Yaojia | Harbin Institute of Technology, Shenzhen |
Xu, Zhengzhe | The University of Hong Kong |
Tan, Junbo | Tsinghua University |
Wang, Xueqian | Tsinghua University |
Keywords: System Modeling and Control, Modeling of Autonomous Systems, Intelligent Transportation Systems
Abstract: Autonomous navigation of vehicle-trailer systems is crucial in environments like airports, supermarkets, and concert venues, where various types of trailers are needed to navigate with different payloads and conditions. However, accurately modeling such systems remains challenging, especially for trailers with castor wheels. In this work, we propose a novel universal vehicle-trailer navigation system that integrates a hybrid nominal kinematic model—combining classical nonholonomic constraints for vehicles and neural network-based trailer kinematics—with a lightweight online residual learning module to correct real-time modeling discrepancies and disturbances. Additionally, we develop a model predictive control framework with a weighted model combination strategy that improves long-horizon prediction accuracy and ensures safer motion planning. Our approach is validated through extensive real-world experiments involving multiple trailer types and varying payload conditions, demonstrating robust performance without manual tuning or trailer-specific calibration.
|
|
08:30-18:00, Paper Tu-Online.32 | |
CDMP: Enhancing Spatio-Temporal WaveNet with Critical Node-Based Dynamic Mask Pre-Training for Traffic Flow Forecasting |
|
Xu, FeiFei | Shanghai University of Electric Power |
Wu, Zixi | Shanghai University of Electric Power |
Du, Qinghan | Shanghai University of Electric Power |
Bi, HaoRan | Shanghai University of Electric Power |
Keywords: Intelligent Transportation Systems, Smart Buildings, Smart Cities and Infrastructures, Smart Sensor Networks
Abstract: Traffic flow forecasting is a classical research topic in spatio-temporal data mining with many real-world applications. Traditional spatio-temporal graph neural networks (STGNNs) jointly model the spatial and temporal patterns of traffic flow data through graph neural networks and sequential models. Limited by model complexity, STGNNs only consider short-term historical time series data, such as data over the past one hour. Recently, various methods based on mask pre-training, which learn temporal patterns from very long-term historical time series, have been proposed and achieve remarkable improvements. Current masked pre-training models primarily adopt a simple random mask strategy for the nodes during training. However, some nodes contain critical information due to their geographical location and fluctuating traffic and have more extensive connections with other nodes. Insufficient masking of these critical nodes may hinder the learning of their complex dependencies through reconstruction, while excessive masking of critical nodes may prevent the model from perceiving the overall structure of the traffic network, resulting in lower prediction accuracy. To address this issue, we propose a novel pre-training framework with a critical node-based dynamic mask strategy (CDMP). CDMP comprises the following innovative components: 1) Critical Node Scoring Module: Critical nodes are defined and evaluated through our proposed Dynamic Spatio-Temporal Flow Importance Metric. 2) Critical Node-Based Dynamic Mask Pre-training: We conduct mask pre-training using a strategy that retains the majority of critical nodes in the early stage and progressively increases their masking proportion as training advances. 3) Spatio-Temporal Enhanced WaveNet: Short-term spatio-temporal features are fused with long-term spatio-temporal features from the pre-training through a multi-layer perceptron to achieve high-precision predictions. Experiments on six benchmark datasets validate the state-of-the-art performance of our CDMP.
|
|
08:30-18:00, Paper Tu-Online.33 | |
FusionNav: Enhancing Zero-Shot Object-Goal Navigation Via 3D Semantic Fusion and Farsight Value Reasoning |
|
Liu, Shugao | Institute of Automation, Chinese Academy of Sciences |
Zhang, Qichao | Institute of Automation, Chinese Academy of Sciences |
Haoran, Li | Institute of Automation, Chinese Academy of Sciences |
Dongbin, Zhao | Institute of Automation, Chinese Academy of Sciences |
Keywords: Robotic Systems, Autonomous Vehicle, Modeling of Autonomous Systems
Abstract: Zero-Shot Object-Goal Navigation (ZSON) tasks require agents to efficiently find target objects in unfamiliar environments, demanding strong semantic understanding and generalization. We propose FusionNav, a novel zero-shot navigation framework that combines 3D point cloud semantics with farsight value estimation. By integrating both local semantic information and semantic cues from regions outside the agent’s current field of view, FusionNav enables reasoning about unexplored areas and improves navigation efficiency. Experiments on the HM3D and MP3D benchmarks show that FusionNav outperforms strong baselines in both success rate and path efficiency. Moreover, FusionNav can be directly deployed on real-world robots without additional training or fine-tuning, and operates efficiently with moderate computational resources. These results demonstrate the effectiveness and practicality of FusionNav for real-world zero-shot object-goal navigation.
|
|
08:30-18:00, Paper Tu-Online.34 | |
Testing Autonomous Driving System Via Object-Level Replacement |
|
Xie, Songcheng | Beijing Information Science and Information Technology Universit |
He, Qifan | Beijing Information Science and Technology University |
Cui, Zhanqi | Beijing Information Science and Technology University |
Keywords: Autonomous Vehicle, Trust in Autonomous Systems
Abstract: With the rapid advancement of autonomous driving technology, ensuring the robustness and reliability of decision-making modules has become a critical challenge for the safety of autonomous driving systems (ADSs). In this paper, we propose a novel method, SOGR (Semantic-Guided Object Replacement), to evaluate the decision consistency of ADSs by constructing highly misleading test images that preserve the semantic integrity of the original driving scenes. SOGR identifies important objects using Grad-CAM and YOLOv8, and replaces them with semantically equivalent objects generated via Stable Diffusion. Experiments conducted on the BDD100K dataset demonstrate that SOGR outperforms the pixel-level perturbation baseline DeepIA, achieving higher misleading rates while maintaining lower perceptual similarity (LPIPS) scores. These results indicate that SOGR can effectively expose model vulnerabilities while maintaining high visual realism, offering a practical and semantically grounded approach for robustness testing in real-world autonomous driving scenarios.
|
|
08:30-18:00, Paper Tu-Online.35 | |
Implicit Device Tracking under the Multimedia Technology Wave |
|
Wang, Xiaoxi | Institute of Information Engineering, Chinese Academy of Science |
Liu, Xinyu | Institute of Information Engineering, Chinese Academy of Science |
Huang, Kerui | Institute of Information Engineering, Chinese Academy of Science |
Zheng, Chunyang | Institute of Information Engineering, Chinese Academy of Science |
Ren, Jinhe | Institute of Information Engineering, Chinese Academy of Science |
Liu, Wei | Institute of Information Engineering, Chinese Academy of Science |
Liu, Yuling | Institute of Information Engineering, Chinese Academy of Science |
Liu, Qixu | Institute of Information Engineering, Chinese Academy of Science |
Keywords: Infrastructure Systems and Services, System Architecture, Technology Assessment
Abstract: The proliferation of Android multimedia applications highlights the critical role of mobile sensors. Inherent manufacturing defects enable implicit device identification without consent, facilitating covert tracking. This bolsters security through reliable tracking of malicious actors, unlike spoof-vulnerable explicit methods. However, prior sensor-based identification suffers from signal noise and device degradation, compromising robustness. In this paper, we propose AMSensorFP, a novel Android implicit device tracking framework. We develop an application to collect device information and multi-sensor data to construct device fingerprints. Subsequently, we build a dataset by performing pairwise difference calculations on the collected fingerprints, and then augment it with Gaussian noise to improve data diversity and robustness. An autoencoder reduces feature dimensionality, and the processed features are fed into a BiLSTM model with a multi-head attention mechanism, enabling effective fingerprint recognition. Experimental results show that AMSensorFP achieves 99.88% accuracy and a 97.34% true positive rate (TPR), significantly outperforming existing methods. Ablation analysis further highlights the contributions of each module and feature in the framework. AMSensorFP delivers a reliable solution for device tracking and security enhancement.
|
|
08:30-18:00, Paper Tu-Online.36 | |
FE-HGAT: Frequency-Enhanced Hybrid Graph Attention Network for Traffic Prediction |
|
Ren, Yao | Sichuan University |
Zhu, Wujiang | Sichuan University |
Lan, Shiyong | Sichuan University |
Zhou, Xinyuan | Sichuan University |
Yang, Hongyu | Sichuan University |
Hou, Zhiang | Sichuan University |
Keywords: Intelligent Transportation Systems
Abstract: Traffic flow prediction is crucial for urban traffic management and planning. However, although existing research has achieved promising results, most existing methods primarily focus on time-domain processing, with insufficient exploration of frequency-domain signal characteristics. Moreover, existing approaches often fail to effectively distinguish and simultaneously model the spatial dependencies between nearby and distant nodes. To address these issues, this paper proposes a Frequency-Enhanced Hybrid Graph Attention Network (FE-HGAT) for traffic flow prediction. Our approach employs a dynamic filter in the temporal dimension, which utilizes Fast Fourier Transform (FFT) to enhance key frequency-domain features, thereby better characterizing the temporal dependencies in traffic data. Besides, to capture spatial dependencies between nodes at varying distances, we design a dynamic threshold module to distinguish between nearby and distant nodes, employing external attention (EA) and a mixture-of-experts-enhanced graph attention network (MOE-GAT) to model local dependencies and long-distance semantic similarities, respectively. Experiments demonstrate that FE-HGAT outperforms existing baseline models on several public transportation datasets, validating its effectiveness in traffic forecasting.
|
|
08:30-18:00, Paper Tu-Online.37 | |
Optimizing Tensor Completion on GPU Heat Conduction-Based Load Balancing and Shared Memory Acceleration (I) |
|
Chen, Yuxiang | Hunan University of Science and Technology |
Yin, Guotong | Hunan University of Science and Technology |
Liang, Wei | Hunan University of Science and Technology |
Xie, Kun | Hunan University |
Wen, Jigang | Hunan University of Science and Technology |
Xiao, Jiahong | Hunan University of Science and Technology |
Tang, Yuanqiang | Xinyu University |
Liu, Tianxiong | Hunan Aerospace Hospital |
Keywords: Distributed Intelligent Systems, Adaptive Systems
Abstract: Large-scale tensor completion plays a crucial role in data analysis and anomaly detection, with Alternating Least Squares (ALS)-based CANDECOMP/PARAFAC (CP) decomposition being widely adopted due to its convergence properties and computational stability. However, when accelerating ALS computation on GPUs, load imbalance caused by data partitioning significantly affects efficiency. Due to the sparsity and heterogeneity of data, different thread blocks handle varying amounts of non-zero values, leading to suboptimal utilization of computational resources. Moreover, updating factor matrices in ALS involves frequent global memory accesses, where the latency is 100 times higher than that of shared memory. Efficient utilization of shared memory is therefore critical for improving computational performance. To address these challenges, we propose a GPU-optimized ALS framework that incorporates a heat conduction-based load balancing strategy and a shared memory acceleration mechanism. The load balancing strategy dynamically adjusts subtensor partitioning based on the distribution of non-zero values, ensuring balanced workload allocation across GPU resources. Meanwhile, the shared memory acceleration mechanism caches frequently accessed factor matrices and employs element-wise implicit computation, eliminating explicit intermediate matrix storage and thereby reducing memory overhead and global memory access latency. We compare the proposed framework with three other methods on four datasets. While maintaining accuracy, it greatly reduces running time and memory usage, providing a practical solution for efficient tensor completion.
|
|
08:30-18:00, Paper Tu-Online.38 | |
A High-Performance and Memory-Efficient RISC-V Operating System Optimization for AIoT |
|
Cheng, Limin | Institute of Software, Chinese Academy of Sciences; University O |
Gao, Ke | Institute of Software, Chinese Academy of Sciences |
Yu, Jiageng | Institute of Software, Chinese Academy of Sciences |
Chen, Ruizhi | Institute of Software, Chinese Academy of Sciences |
Li, Weijia | Institute of Software, Chinese Academy of Sciences; University O |
Li, Ling | Institute of Software, Chinese Academy of Sciences; University O |
Wu, Yanjun | Institute of Software, Chinese Academy of Sciences |
Keywords: Infrastructure Systems and Services, System Architecture, System Modeling and Control
Abstract: The openness and flexibility of the RISC-V instruction set architecture (ISA) have driven its widespread adoption in AIoT (Artificial Intelligence of Things) devices. However, existing operating systems (OSes) for real RISC-V hardware often suffer from poor application performance and large memory footprints. To address these issues, we propose an OS optimization scheme tailored for RISC-V in AIoT devices. First, we introduce an application-transparent performance enhancement mechanism that leverages both coarse- and fine-grained process management to improve the performance of applications, particularly in AI inference. Second, we design a low-memory-footprint software stack through theoretical analysis and careful trade-offs in the adoption of software components. Lastly, we develop a lightweight OS image construction strategy algorithm tailored for RISC-V in AIoT. Using our OS optimization scheme, we build PolyOS from scratch to reduce the OS image size, thereby further lowering memory footprint. Across four real RISC-V hardware platforms, PolyOS achieves up to a 142% overall system performance improvement and up to 5.90× speedup in AI inference applications compared to baseline OSes (Armbian, Nucleisys, etc.). It also significantly reduces the runtime memory footprint of the standard C library, OpenCV, QuickJS, and AI inference applications, while shrinking the OS image size to 1/3.14–1/23.48 of its baseline OS.
|
|
08:30-18:00, Paper Tu-Online.39 | |
Fusing Contextual Clustering with State Space Models for Dzi Beads Image Classification |
|
Xia, Jianjun | Tibet University |
Gao, Dingguo | Tibet University |
Xu, Songtao | Tibet University |
Zhao, Qijun | Sichuan University, Tibet University |
Keywords: Consumer and Industrial Applications
Abstract: Dzi beads patterns fall into many diverse categories, yet the differences between categories are small, which makes their classification challenging. To address the problems of weak generalisation performance, low accuracy, and high computational complexity, this paper proposes FasterVim, a method that takes FasterNet as the baseline and uses the designed FasterVimBlock module to improve overall generalisation performance on dzi beads images. Firstly, FasterNet is used as a baseline for data enhancement of dzi beads images to improve the overall generalisation performance of the classification model. Secondly, contextual clustering and state space models are fused to effectively implement the interaction between local features of dzi beads images and the linear representation of image features in long sequences, in order to balance the computational complexity and accuracy of the model. Next, point convolution is introduced to achieve channel attention at different scales, efficiently extracting the global and local features of the dzi beads image and improving the accuracy of the model. Finally, experimental validation is carried out on the dzi beads image classification dataset, where the proposed method achieves an accuracy of 93.8%, outperforming the baseline model by 1.91% in accuracy and by a factor of 1.5 in FLOPs. The experimental results show that the proposed method is competitive in terms of accuracy and number of computational parameters, and can effectively meet the deployment requirements of the dzi beads image classification model.
|
|
08:30-18:00, Paper Tu-Online.40 | |
An Imbalanced Fault Diagnosis Framework Based on Subdomain Adaptation Mechanism of Dual-Branch Network |
|
Li, Chuanying | Zhejiang University |
Gong, Qing | Zhejiang University |
Yu, Zhuoyu | Zhejiang University |
Zhai, Yajing | Ningbo Institute of Digital Twin, Eastern Institute of Technolog |
Keywords: Fault Monitoring and Diagnosis, Distributed Intelligent Systems, Quality and Reliability Engineering
Abstract: Imbalanced data and label scarcity pose significant challenges to gearbox fault diagnosis, often limiting the accuracy and robustness of diagnostic models. To address these issues, this paper proposes an imbalanced fault diagnosis framework based on an unsupervised subdomain adaptation mechanism with a dual-branch network. The framework first constructs a dual-branch feature extraction structure consisting of a lightweight backbone network (ResNet-18) and a deep backbone network (ResNet-50), which are used to capture low-level local features and high-level semantic representations, respectively. After the last convolutional layer of each branch, a long short-term memory network and a multi-head self-attention mechanism are embedded to integrate temporal dependencies and spatial contextual information, thereby enhancing the discriminability and generalization ability of the extracted features. Furthermore, a dynamic simulation model is established based on the nonlinear dynamic characteristics of the gearbox system to enrich the feature distribution and improve subdomain adaptability. Experimental results on multiple real-world gearbox fault datasets demonstrate that the proposed method achieves superior performance in terms of diagnostic accuracy and domain adaptability compared with existing approaches.
|
|
08:30-18:00, Paper Tu-Online.41 | |
Spatiotemporal Trend Fusion Feature Graph Convolution Network for Spatial Interpolation in Traffic Scenes |
|
Hou, Zhiang | Sichuan University |
Lan, Shiyong | Sichuan University |
Zhou, Xinyuan | Sichuan University |
Zhu, Wujiang | Sichuan University |
Ren, Yao | Sichuan University |
Keywords: Intelligent Transportation Systems
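Abstract: Sensors are always sparsely distributed in traffic networks due to high deployment costs, sensor damage, and other reasons. Insufficient data may impair our perception of traffic scenes, hindering intelligent transportation systems (ITS) from efficiently performing traffic monitoring and scenario decision-making. Spatial interpolation methods are used to infer data at locations without sensors. However, existing methods still have the following limitations: (1) They mainly interpolate unobserved nodes by extracting spatio-temporal dependencies between nodes, but ignore the feature dimension. (2) Recent methods commonly use a standard TCN to extract temporal correlations, which is affected by anomalous data. (3) When extracting spatial correlations, using deep GCN layers can lead to the over-smoothing problem. To mitigate these limitations, we propose a novel spatial interpolation model, namely the Spatiotemporal Trend Fusion Feature Graph Convolution Network (STFGCN). Specifically, a novel feature graph convolution network is used to capture complex interaction patterns. Its ...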
|
|
08:30-18:00, Paper Tu-Online.42 | |
Electricity Theft Detection Method Based on Semi-Supervised Domain Adaptation with Minimax Entropy |
|
Xia, Zhuoqun | Changsha University of Science and Technology |
Qiu, Han | Changsha University of Science and Technology, China |
Tan, Jingjing | Changsha University of Science and Technology |
Su, Ze | Changsha University of Science and Technology |
Xie, Yutong | Changsha University of Science and Technology |
Keywords: Intelligent Power Grid, Cyber-physical systems, Smart Metering
Abstract: Due to significant regional variations in electricity consumption, transfer learning methods for detecting electricity theft can effectively address challenges such as limited labeled data and domain shifts in emerging areas. However, these methods often encounter issues related to inadequate boundary feature characterization, underutilization of available data, and substantial computational overhead. To address these challenges, we propose a semi-supervised domain adaptation method for electricity theft detection based on minimax entropy. Specifically, domain-invariant prototype vectors align the feature distributions of labeled data across source and target regions, thereby enhancing cross-domain consistency. This alignment reduces inter-domain differences and improves detection, particularly when labeled data in target regions is scarce. Additionally, the minimax entropy strategy adjusts the prediction confidence of unlabeled data, thereby strengthening intra-class aggregation, enhancing inter-class separability, and optimizing feature distributions to capture discriminative boundary features. Experimental results demonstrate that our method significantly outperforms existing approaches in both detection performance and training efficiency.
|
|
08:30-18:00, Paper Tu-Online.43 | |
Efficient Strategy Learning by Decoupling Searching and Pathfinding for Object Navigation |
|
Zheng, Yanwei | Shandong University |
Feng, Shaopu | Shandong University |
Huang, Bowen | Shandong University |
Lan, Chuanlin | Shandong University |
Zhang, Xiao | Shandong University |
Yu, Dongxiao | Shandong University |
Keywords: Autonomous Vehicle, Robotic Systems, Adaptive Systems
Abstract: Human navigation proceeds in two stages: searching, in which unknown areas are explored until the target is discovered, and pathfinding, in which one moves towards the discovered target. Inspired by this behavior, recent studies design parallel submodules to serve different functions in the searching and pathfinding stages, but ignore the differences in reward signals between the two stages. As a result, these models often cannot be fully trained or overfit to the training scenes. Another bottleneck that restricts agents from learning two-stage strategies is spatial perception ability, since these studies use generic visual encoders without considering the depth information of navigation scenes. To release the potential of the model for strategy learning, we propose the Two-Stage Reward Mechanism (TSRM) for object navigation, which decouples the searching and pathfinding behaviors in an episode, enabling the agent to explore a larger area in the searching stage and seek the optimal path in the pathfinding stage. We also propose a pretraining method, Depth Enhanced Masked Autoencoders (DE-MAE), that enables the agent to determine explored and unexplored areas during the searching stage, and to locate the target object and plan paths more accurately during the pathfinding stage. In addition, we propose a new metric, Searching Success weighted by Searching Path Length (SSSPL), that assesses the agent's searching ability and exploring efficiency. Finally, we evaluate our method extensively on AI2-THOR and RoboTHOR and demonstrate that it outperforms state-of-the-art (SOTA) methods in both success rate and navigation efficiency.
|
|
08:30-18:00, Paper Tu-Online.44 | |
Mutual Information-Driven Graph Neural Network for Distributed Photovoltaic Power Forecasting |
|
Xu, Xiran | Southwest University |
Lu, Gang | The Energy Strategy and Planning Research Department, State Grid |
Yan, Xiaoqing | The Energy Strategy and Planning Research Department, State Grid |
Xia, Peng | The Energy Strategy and Planning Research Department, State Grid |
Wu, Di | Southwest University |
Keywords: Intelligent Power Grid
Abstract: As global population growth and industrialization accelerate, energy demand continues to rise. While fossil fuels currently dominate supply, their non-renewability and environmental impact highlight the urgency of transitioning to cleaner energy sources. Among renewables, photovoltaic (PV) energy stands out for its mature technology, cost-effectiveness, and wide applicability. However, traditional PV forecasting models often focus on single-site predictions, neglecting interactions between multiple sites. Additionally, PV generation is inherently unstable due to weather and geographical variations, impacting grid reliability. To address these challenges, this paper proposes the mutual information-driven graph neural network for distributed photovoltaic power forecasting (MIPF) model, which: (1) constructs a mutual information network from historical power data and integrates it with a Graph Convolutional Network (GCN) to capture inter-site dependencies, and (2) combines Long Short-Term Memory (LSTM) and Extreme Gradient Boosting (XGBoost) to incorporate diverse influencing factors for improved prediction accuracy. Extensive experiments on real PV datasets demonstrate that MIPF outperforms six state-of-the-art models in prediction accuracy.
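As a rough illustration of step (1), the sketch below estimates pairwise mutual information between sites' historical power series with a simple equal-width histogram and thresholds it into an adjacency matrix that a GCN could consume. The binning scheme, the threshold, and all function names are assumptions made for illustration; the abstract does not say how MIPF estimates mutual information or builds edges.

```python
import math
from collections import Counter


def discretize(series, bins):
    """Map each value to an equal-width bin index in [0, bins - 1]."""
    lo, hi = min(series), max(series)
    if hi == lo:
        return [0] * len(series)
    width = (hi - lo) / bins
    return [min(int((x - lo) / width), bins - 1) for x in series]


def mutual_information(x, y, bins=4):
    """Histogram estimate of I(X; Y) in nats for two equal-length series."""
    xs, ys = discretize(x, bins), discretize(y, bins)
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (a, b), count in pxy.items():
        # p(a,b) * log(p(a,b) / (p(a) * p(b))), written with raw counts
        mi += (count / n) * math.log((count * n) / (px[a] * py[b]))
    return mi


def mi_adjacency(site_series, threshold, bins=4):
    """Symmetric 0/1 adjacency: connect sites whose MI reaches the threshold."""
    k = len(site_series)
    adj = [[0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            if mutual_information(site_series[i], site_series[j], bins) >= threshold:
                adj[i][j] = adj[j][i] = 1
    return adj
```

Unlike a correlation-based graph, a mutual information graph can also link sites whose outputs are related nonlinearly, which is one plausible reason for the choice.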
|
|
08:30-18:00, Paper Tu-Online.45 | |
Diff-BMFL: A Diffusion Bridge-Based Multi-Dimensional Features Learning for POI Recommendation |
|
Chen, Yuhan | Wenzhou University |
Hu, Jie | Wenzhou University |
Zheng, Jianwei | Zhejiang University of Technology |
Zhang, Xiaoqin | Zhejiang University of Technology |
Keywords: Decision Support Systems, Intelligent Transportation Systems, Service Systems and Organizations
Abstract: Point-of-Interest (POI) recommendation systems enhance user experience and promote smart mobility by delivering personalized exploration suggestions. While existing studies have advanced in modeling users' long-term and short-term preferences, traditional methods fail to comprehensively capture latent stochastic factors in user behaviors, and they often suffer from insufficient mining of long-tail interests. To address these issues, we propose a Diffusion Bridge-based Multi-Dimensional Features Learning (Diff-BMFL), which innovatively integrates diffusion probabilistic bridge modeling with multi-granularity graph neural networks. Specifically, this network employs a two-stage progressive learning approach, including (1) the Multi-Dimensional Features Learning stage and (2) the User Preference Distribution Space Sampling stage. In the first stage, we construct a global POI spatial graph to capture geographic correlations, incorporate a temporal module to model time-dependent patterns, and build user-specific transition graphs to learn personalized behavior, enabling deep integration of spatio-temporal and behavioral information. In the second stage, we leverage a diffusion bridge-based sampling module to overcome the limitations of pure-noise sampling in SDE-based diffusion. By simulating potential influencing factors through diffusion, we enhance the alignment between the predicted POI distribution and users' actual preferences. Extensive evaluations on two real-world LBSN datasets demonstrate the effectiveness of our Diff-BMFL model. The results indicate that our method significantly enhances POI recommendation performance.
|
|
08:30-18:00, Paper Tu-Online.46 | |
Categorical Policies: Multimodal Policy Learning and Exploration in Continuous Control |
|
Islam, SM Mazharul | University of Texas at Arlington |
Huber, Manfred | The University of Texas at Arlington |
Keywords: System Modeling and Control, Robotic Systems
Abstract: A policy in deep reinforcement learning (RL), either deterministic or stochastic, is commonly parameterized as a Gaussian distribution alone, limiting the learned behavior to be unimodal. However, the nature of many practical decision-making problems favors a multimodal policy that facilitates robust exploration of the environment and thus addresses learning challenges arising from sparse rewards, complex dynamics, or the need for strategic adaptation to varying contexts. This issue is exacerbated in continuous control domains where exploration usually takes place in the vicinity of the predicted optimal action, either through an additive Gaussian noise or the sampling process of a stochastic policy. In this paper, we introduce Categorical Policies to model multimodal behavior modes with an intermediate categorical distribution, and then generate an output action that is conditioned on the sampled mode. We explore two sampling schemes that ensure differentiable discrete latent structure while maintaining efficient gradient-based optimization. By utilizing a latent categorical distribution to select the behavior mode, our approach naturally expresses multimodality while remaining fully differentiable via the sampling tricks. We evaluate our multimodal policy on a set of DeepMind Control Suite environments, demonstrating that through better exploration, our learned policies converge faster and outperform standard Gaussian policies. Our results indicate that the Categorical distribution serves as a powerful tool for structured exploration and multimodal behavior representation in continuous control.
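The abstract does not name its two sampling schemes. One common way to sample from a categorical distribution while keeping a differentiable relaxation is the Gumbel-softmax trick, sketched below in plain Python (no autodiff) purely to show the arithmetic. Treating it as one of the paper's schemes is an assumption, and the function names are hypothetical.

```python
import math
import random


def gumbel_softmax(logits, temperature=1.0, rng=random):
    """Relaxed one-hot sample over behavior modes.

    Adds Gumbel(0, 1) noise to each logit and applies a temperature-scaled
    softmax. Lower temperatures push the output towards a one-hot vector.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    noisy = []
    for logit in logits:
        u = max(rng.random(), 1e-12)  # avoid log(0)
        gumbel = -math.log(-math.log(u))
        noisy.append((logit + gumbel) / temperature)
    peak = max(noisy)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    return [e / total for e in exps]


def sampled_mode(weights):
    """Index of the selected behavior mode (argmax of the relaxed sample)."""
    return max(range(len(weights)), key=lambda i: weights[i])


weights = gumbel_softmax([0.2, 1.5, -0.3], temperature=0.5, rng=random.Random(1))
mode = sampled_mode(weights)
```

In an actual policy network the relaxed weights would be produced inside an autodiff framework so that gradients flow back to the logits, and the action head would then be conditioned on the selected mode.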
|
|
08:30-18:00, Paper Tu-Online.47 | |
Periodic Selection Reordering Algorithm for Extending Truck Ranking Driving Mileage |
|
Yang, Zhikai | Nanjing Tech University |
Guo, Shaopan | Nanjing Tech University |
Wang, Xiaoyu | Nanjing Tech University |
Liu, Miao | Nanjing Tech University |
Xiao, Long | Nanjing Tech University |
Keywords: Intelligent Transportation Systems, Cooperative Systems and Control, Autonomous Vehicle
Abstract: This study addresses the limitations of traditional truck platoon cooperative control methods in optimizing dynamic fuel efficiency. The fixed-order truck platoon has a key flaw: it is unable to dynamically respond to real-time vehicle state changes. To address this, we introduce three innovative methods based on deep reinforcement learning. Compared with fixed-cycle formation transformation strategies, the number of formation transformations is reduced to varying degrees for truck platoons of different sizes. For truck platoons with identical specifications, the impact of different cycle sizes on driving mileage is found to be minimal. This study proves that the dynamic decision mechanism based on deep reinforcement learning can effectively balance formation transformation costs and long-term fuel-saving benefits. The core value lies in establishing an intelligent control paradigm with environmental adaptability. The new algorithm significantly improves fuel economy indicators through real-time state perception and probabilistic decision-making while maintaining formation stability. This method provides a new technical route for energy-saving control in complex transportation scenarios. The core framework can be extended to multi-objective collaborative optimization fields.
|
|
08:30-18:00, Paper Tu-Online.48 | |
Dynamic Re-Sequencing of EV Platoons Using Noisy Dueling DQN for Energy Fairness |
|
Zheng, BaiWenJie | Nanjing Tech University |
Guo, Shaopan | Nanjing Tech University |
Liu, Miao | Nanjing Tech University |
Xiao, Long | Nanjing Tech University |
Keywords: Intelligent Transportation Systems, Electric Vehicles and Electric Vehicle Supply Equipment, Autonomous Vehicle
Abstract: Improving energy efficiency in electric vehicle (EV) platoons remains a critical challenge for sustainable transportation systems. Traditional methods typically maintain fixed vehicle sequences, which leads to persistent energy consumption imbalances due to unequal aerodynamic benefits. This disparity not only reduces driving range but also accelerates battery degradation, limiting overall fleet efficiency. To address this issue, we formulate the problem as an Optimal Re-Sequencing (ORS) task and propose a dynamic platooning strategy where vehicles adjust their positions at predefined points along the route to balance energy usage. We develop and evaluate five deep reinforcement learning (DRL) algorithms—including DQN, Double DQN, Dueling DQN, Noisy DQN, and a hybrid Noisy Dueling DQN—under a simulated real-world transportation setting. Our results demonstrate that the proposed Noisy Dueling DQN achieves the lowest standard deviation in final state-of-charge (SOC), improving energy fairness by 37% compared to Dueling DQN and significantly outperforming traditional methods such as brute-force search and SOC-based ranking. Despite a slight increase in computation time, the model achieves convergence faster and delivers more stable training performance. These findings suggest that dynamic reordering strategies, when powered by advanced DRL models, hold strong promise for enhancing energy efficiency and operational stability in future intelligent transportation systems.
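Two ingredients mentioned above can be written down compactly: the dueling aggregation that combines a state value with per-action advantages, and the standard deviation of final SOC used as the fairness indicator. The sketch below uses the usual mean-subtracted dueling form and a population standard deviation. Both choices, and the function names, are assumptions, since the abstract does not give the exact formulas.

```python
import statistics


def dueling_q_values(state_value, advantages):
    """Standard dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def soc_spread(final_socs):
    """Population standard deviation of final state-of-charge across vehicles.

    A smaller value means energy use was shared more evenly.
    """
    return statistics.pstdev(final_socs)


# Hypothetical numbers for a three-action re-sequencing decision.
q_values = dueling_q_values(2.0, [1.0, 0.0, -1.0])
spread = soc_spread([0.62, 0.60, 0.61, 0.59])
```

The noisy variant additionally replaces epsilon-greedy exploration with learned noise on the network weights, which is not shown here.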
|
|
08:30-18:00, Paper Tu-Online.49 | |
A Novel Automation Method of Cybersecurity Alerts Analysis and Response in Satellite Cloud Systems |
|
Geng, Liru | Institute of Information and Engineering, Chinese Academy of Sci |
Hu, Tian | Institute of Information Engineering, Chinese Academy of Science |
Fang, Jiang | Institute of Information and Engineering, Chinese Academy of Sci |
Sun, Jiyan | Institute of Information and Engineering, Chinese Academy of Sci |
Liu, Yinlong | Institute of Information and Engineering, Chinese Academy of Sci |
Ma, Wei | Institute of Information Engineering Chinese Academy of Sciences |
Keywords: Fault Monitoring and Diagnosis, Communications
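Abstract: Security alert fatigue remains a persistent challenge for cybersecurity professionals. This paper proposes a novel method for cybersecurity alert analysis and response (CAAR) based on a large language model (LLM) framework for satellite cloud systems. The method is built around three core components: first, efficient retrieval and compression of alerts is achieved by filtering out redundant and homogeneous entries; second, urgent alerts are redefined and classified using a fine-tuned large language model; third, response actions are automated through a security orchestration, automation, and response (SOAR) platform. Extensive experimental results show that the proposed method achieves an average compression rate of 26.5% and an automatic alert classification coverage of 99.93%. In addition, the fine-tuned LLM surpasses the state-of-the-art GPT-4 Turbo model in handling security alert responses involving Chinese content.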
|
|
08:30-18:00, Paper Tu-Online.50 | |
A Fine-Grained Troubleshooting Method in 6G NTN Systems Based on Signaling Messages |
|
Geng, Liru | Institute of Information and Engineering, Chinese Academy of Sci |
Guo, Zhaorui | Institute of Information and Engineering, Chinese Academy of Sci |
Sun, Jiyan | Institute of Information and Engineering, Chinese Academy of Sci |
Fu, Jiadong | Institute of Information and Engineering, Chinese Academy of Sci |
Fang, Jiang | Institute of Information and Engineering, Chinese Academy of Sci |
Liu, Yinlong | Institute of Information and Engineering, Chinese Academy of Sci |
Ma, Wei | Institute of Information Engineering Chinese Academy of Sciences |
Keywords: Fault Monitoring and Diagnosis, Communications
Abstract: The signaling collected in mobile communication networks can intuitively display the operational status of the system and can be used to locate faults. This paper proposes a novel signaling-based end-to-end fine-grained troubleshooting (SimFGT) method for 6G NTN networks. First, the signaling collected from the core network is analyzed to extract multi-dimensional KPIs and attribute information. Second, a root cause localization algorithm is employed for fine-grained fault localization. Third, a lightweight data-driven ensemble learning method is adopted, with the abnormal KPIs of the localized root cause node as inputs, and precise fault classification is achieved through data-driven weight optimization. Experimental results show that the proposed lightweight SimFGT method achieves the best balance between precision and recall, resulting in the highest F1 score and outperforming current state-of-the-art solutions.
|
|
08:30-18:00, Paper Tu-Online.51 | |
ACR: Adaptive Computation Reuse for Video Analytics in Collaborative Edge Computing |
|
Yu, Jiale | East China Normal University |
Zhu, Minghua | East China Normal University |
Keywords: Communications, Adaptive Systems, Smart Buildings, Smart Cities and Infrastructures
Abstract: Video analytics typically requires substantial computational resources and energy. In edge computing application scenarios (e.g., smart cities), multiple users and devices may be spatially proximate, leading to offloaded tasks with high similarity and redundant computation. Computational results from previously executed tasks can be cached and reused for subsequent tasks based on similarity to enhance system efficiency. However, existing computation reuse methods often employ fixed similarity thresholds tailored to specific tasks, struggling to adapt to dynamically changing scenarios. This results in low reuse accuracy under low similarity thresholds and high latency under high similarity thresholds. Additionally, while edge servers possess larger storage capacities, they significantly increase real-time retrieval overhead. To address these issues, this paper proposes ACR, an adaptive computation reuse framework based on edge-cloud collaboration. ACR leverages the localized deployment of edge gateways to reduce cache query latency and introduces a deep Q-Network (DQN) algorithm with n-step temporal difference (TD) to adaptively adjust similarity thresholds. Our evaluation results demonstrate that, on typical urban surveillance datasets, ACR effectively balances overall system latency and reuse accuracy compared to other computation reuse methods.
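The n-step temporal difference target that such a DQN would regress towards can be sketched as follows. This is the textbook n-step return with a bootstrapped greedy value. How ACR defines states, actions (threshold adjustments), and rewards is not stated in the abstract, so the inputs here are placeholders and the function name is hypothetical.

```python
def n_step_td_target(rewards, bootstrap_q_values, gamma=0.99):
    """n-step return: sum_k gamma^k * r_{t+k} + gamma^n * max_a Q(s_{t+n}, a).

    rewards: the n rewards observed after the current step.
    bootstrap_q_values: Q-value estimates for every action in the state
        reached after n steps; pass an empty list if that state is terminal.
    """
    target = 0.0
    for k, reward in enumerate(rewards):
        target += (gamma ** k) * reward
    if bootstrap_q_values:
        target += (gamma ** len(rewards)) * max(bootstrap_q_values)
    return target


# Hypothetical 3-step rollout with two candidate actions at the end.
target = n_step_td_target([1.0, 1.0, 1.0], [2.0, 4.0], gamma=0.5)
```

Compared with a one-step target, looking n steps ahead propagates reward information faster, at the cost of higher variance in the target.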
|
|
08:30-18:00, Paper Tu-Online.52 | |
PSformer: Periodic-Aware Semantic Transformer for Traffic Prediction |
|
He, Lihua | Macao Polytechnic University |
Yu, Ziyue | Macao Polytechnic University |
Luo, Wuman | Macao Polytechnic University |
Keywords: Intelligent Transportation Systems, Smart Sensor Networks, Decision Support Systems
Abstract: Traffic prediction plays an important role in Intelligent Transportation Systems (ITS). The main challenge lies in effectively capturing the dynamic multiple temporal periodic correlations and the long-range spatial correlation of traffic data. Despite the significant progress of many existing works, these methods often have two major limitations: 1) They mined the dynamic multi-period properties by using raw traffic sequences or the fixed periodicity strategy (e.g., hours, days, weeks), which failed to capture the dynamic multi-period characteristics of temporal correlation. 2) They mined the long-range spatial correlation of traffic data by stacking multilayer networks or directly using traditional similarity algorithms (e.g., conventional DTW). However, DTW has its own limitations leading to sub-optimal similarity assessment. To address these issues, we propose a periodic-aware spatial semantic transformer called PSformer for traffic prediction. Specifically, we propose the Periodic-aware Embedding Module (PAEmbed) to capture the dynamic multi-period properties by decoupling the traffic sequence into the multilevel frequency components via Fast Fourier Transform (FFT). In addition, we propose a Semantic Spatial Attention Mechanism (SSAM) to capture the long-range spatial correlation. In SSAM, we propose Time-weighted Dynamic Time Warping (TDTW) to model spatial correlations in semantically identical but geographically distant regions, which avoids considering two traffic patterns with large time spans as similar. Finally, to evaluate the performance of PSformer, we conduct extensive experiments on four real datasets. Experimental results show that our model achieves better performance than other state-of-the-art methods.
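To make the frequency-decoupling idea concrete, the sketch below finds the dominant periods of a traffic series from its discrete Fourier spectrum. It uses a naive O(n^2) DFT from the standard library so that it runs without dependencies; an FFT routine would be used in practice. It illustrates only the general idea of reading periods off the spectrum, not PAEmbed itself, and the function name is hypothetical.

```python
import cmath
import math


def dominant_periods(series, top_k=2):
    """Return the top_k periods (in samples) with the largest spectral amplitude."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]  # remove the zero-frequency component
    amplitudes = []
    for freq in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * freq * t / n)
            for t, x in enumerate(centered)
        )
        amplitudes.append((abs(coeff), freq))
    amplitudes.sort(reverse=True)
    return [n / freq for _, freq in amplitudes[:top_k]]


# Synthetic series with a period of 8 samples.
demo = [math.sin(2 * math.pi * t / 8) for t in range(64)]
periods = dominant_periods(demo, top_k=1)
```

Selecting periods from the data in this way, instead of fixing them to hours, days, or weeks, is what lets an embedding adapt to series whose periodicity shifts over time.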
|
|
08:30-18:00, Paper Tu-Online.53 | |
Privacy-Preserving Estimated Time of Arrival Prediction with Lightweight Multi-Task Federated Learning (I) |
|
Zhai, Jiahui | Beijing University of Technology |
Bi, Jing | Beijing University of Technology |
Yuan, Haitao | Beihang University |
Wang, Ziqi | Zhejiang University |
Ma, Hongyao | Beijing University of Technology |
Wang, Chen | Beijing University of Technology |
Zhang, Jia | Southern Methodist University |
Keywords: Adaptive Systems, Decision Support Systems
Abstract: Accurate estimated time of arrival (ETA) prediction for long vehicular trips remains challenging in intelligent transportation systems (ITS) due to heterogeneous traffic patterns and limited local data availability. While federated learning (FL) addresses privacy concerns by decentralizing data training, traditional FL frameworks often struggle with high computational costs and poor adaptability to multi-task scenarios. To overcome these limitations, this paper proposes a Lightweight Multi-task Federated Learning (LMFL) framework for efficient and privacy-preserving ETA prediction. LMFL integrates a novel SE-CIFG, combining a Squeeze-Excitation (SE) attention module to prioritize critical spatio-temporal features and a Coupled Input and Forget Gate (CIFG) to simplify long-term traffic dependency modeling. Additionally, LMFL employs a Federated Gradient Compression Algorithm (Fed-GCA) to reduce communication overhead between edge and cloud using adaptive thresholding and sparse tensor encoding. Experiments on a real-world traffic simulation dataset demonstrate that LMFL achieves significantly higher predictive accuracy than existing methods, with an average 17.1% improvement in prediction precision and a 4.1% reduction in training time.
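The combination of thresholding and sparse encoding can be illustrated with a generic magnitude-based sparsifier: keep only the largest-magnitude gradient entries and transmit them as (index, value) pairs. The sketch below is such a generic scheme, not Fed-GCA itself; the keep ratio, the way the threshold is derived from it, and the function names are all assumptions.

```python
def compress_gradient(grad, keep_ratio=0.1):
    """Keep entries whose magnitude reaches the k-th largest magnitude.

    Returns (sparse, length), where sparse is a list of (index, value) pairs.
    Ties at the threshold may keep slightly more than k entries.
    """
    k = max(1, int(len(grad) * keep_ratio))
    threshold = sorted((abs(g) for g in grad), reverse=True)[k - 1]
    sparse = [(i, g) for i, g in enumerate(grad) if abs(g) >= threshold]
    return sparse, len(grad)


def decompress_gradient(sparse, length):
    """Rebuild a dense gradient, filling dropped entries with zeros."""
    dense = [0.0] * length
    for index, value in sparse:
        dense[index] = value
    return dense


# Hypothetical flattened gradient from an edge client.
grad = [0.1, -2.0, 0.05, 3.0, -0.2, 0.0, 0.3, -0.01, 0.02, 1.0]
sparse, length = compress_gradient(grad, keep_ratio=0.3)
restored = decompress_gradient(sparse, length)
```

Practical schemes usually also accumulate the dropped residuals locally and add them to the next round's gradient so that small updates are not lost; that step is omitted here.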
|
|
08:30-18:00, Paper Tu-Online.54 | |
Efficient Multi-Task Modeling through Automated Fusion of Trained Models |
|
Zhou, Jingxuan | National University of Defense Technology |
Bao, Weidong | National University of Defense Technology |
Wang, Ji | National University of Defense Technology |
Zhang, Dayu | National University of Defense Technology |
Zhong, Zhengyi | National University of Defense Technology |
Keywords: System Modeling and Control, Modeling of Autonomous Systems, Smart Buildings, Smart Cities and Infrastructures
Abstract: Although multi-task learning is widely applied in intelligent services, traditional multi-task modeling methods often require customized designs based on specific task combinations, resulting in a cumbersome modeling process. Inspired by the rapid development and excellent performance of single-task models, this paper proposes an efficient multi-task modeling method that can automatically fuse trained single-task models with different structures and tasks to form a multi-task model. As a general framework, this method allows modelers to simply prepare trained models for the required tasks, simplifying the modeling process while fully utilizing the knowledge contained in the trained models. This eliminates the need for excessive focus on task relationships and model structure design. To achieve this goal, we consider the structural differences among various trained models and employ model decomposition techniques to hierarchically decompose them into multiple operable model components. Furthermore, we design an Adaptive Knowledge Fusion (AKF) module based on Transformer, which adaptively integrates intra-task and inter-task knowledge based on model components. Through the proposed method, we achieve efficient and automated construction of multi-task models, and its effectiveness is verified through extensive experiments on three datasets.
|
|
08:30-18:00, Paper Tu-Online.55 | |
Large Language Model-Driven Lightweight Method for IoT Malicious Traffic Detection |
|
Chen, Ruoshui | Information Engineering University, He’nan Province Key Laborato |
Liu, Aodi | Information Engineering University |
Keywords: Communications, Smart Sensor Networks, Cyber-physical systems
Abstract: To address security issues caused by limited computing resources, weak security protection, and sparse malicious sample data in IoT environments, this paper proposes a large language model-driven lightweight method for IoT malicious traffic detection (LLMD-IMTD). First, a large language model-driven small sample data enhancement algorithm is used to solve the problem of high detection false positive rate caused by too few malicious traffic samples, so as to alleviate the impact of uneven data distribution on malicious traffic detection performance. After that, a robust traffic feature selection algorithm based on meta-learning is used to select the feature subset that has the greatest impact on the malicious traffic detection task in a specific scenario, and the feature subset is dynamically updated according to the characteristics of the attack traffic, so as to quickly adapt to lightweight attack traffic detection tasks. Finally, combining the malicious traffic detection algorithm based on grid search and multi-classifier detection, the malicious traffic classifier is constructed, and the multi-model collaborative voting mechanism is used to predict the detection results, so that the algorithm can achieve more stable and accurate traffic detection under the complex network traffic environment. The experimental results show that the proposed method can achieve the F1 Score of 99.25 on the BoTIoT-2018 dataset, which has better detection performance than the benchmark method.
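The multi-model collaborative voting step can be pictured as a plain majority vote over the labels predicted by several classifiers for the same flow. The sketch below shows only that simple form. Whether LLMD-IMTD weights the classifiers or breaks ties differently is not stated in the abstract, and the labels and function name are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the most classifiers.

    predictions: one label per classifier for a single traffic flow.
    On a tie, the label that appears first in the input wins.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical outputs of three classifiers for one flow.
label = majority_vote(["malicious", "benign", "malicious"])
```

Voting across heterogeneous classifiers tends to reduce the variance of individual models, which is consistent with the stability benefit the abstract describes.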
|
|
08:30-18:00, Paper Tu-Online.56 | |
Dual-GNN-Assisted Cooperative Hunting Optimizer for Dynamic Job Shop Scheduling (I) |
|
Wang, Chen | Beijing University of Technology |
Bi, Jing | Beijing University of Technology |
Wang, Ziqi | Zhejiang University |
Zhang, Junqi | Beijing University of Technology |
Yuan, Haitao | Beihang University |
Zhang, Jia | Southern Methodist University |
Keywords: Cooperative Systems and Control, Large-Scale System of Systems
Abstract: The Dynamic Job Shop Scheduling Problem (DJSP), a critical challenge in 3C manufacturing, requires efficient resource allocation under dynamically changing production conditions where jobs arrive unpredictably. Traditional optimization methods struggle to provide scalable solutions due to the high computational cost of searching for optimal schedules in large and complex environments. To solve this problem, this work proposes the Dual-Graph convolutional networks assisted Dynamic Cooperative Hunting Optimizer (DG-DCHO), which integrates graph convolutional networks (GCN) with metaheuristic optimization to generate high-quality schedules while significantly improving computational efficiency. A GCN generator processes graph representations of the job-shop environment and captures complex dependencies among jobs and machines to construct high-quality schedules that serve as initial solutions for the optimization process. A GCN evaluator estimates makespan values directly from schedule representations, replacing costly fitness evaluation to minimize computational overhead and improve optimization speed. The Dynamic Cooperative Hunting Optimizer (DCHO) serves as the base optimizer and generates scheduling solutions by balancing global exploration with local exploitation through an adaptive search strategy. Experimental results across various DJSP instances demonstrate that DG-DCHO consistently outperforms state-of-the-art scheduling algorithms by producing superior solutions while requiring fewer computational resources, making it a scalable and effective framework for real-time dynamic scheduling in large-scale 3C manufacturing systems.
|
|
08:30-18:00, Paper Tu-Online.57 | |
Privacy-Preserving Multi-Source Data-Driven Optimization for Intelligent EV Charging |
|
Nahid, Emama | Rajshahi University of Engineering & Technology |
Zhao, Chen | Kennesaw State University |
Amirgholy, Mahyar | Kennesaw State University |
Zheng, Danyang | Southwest Jiaotong University |
Xu, Honghui | Kennesaw State University |
Keywords: Intelligent Transportation Systems, Electric Vehicles and Electric Vehicle Supply Equipment, Trust in Autonomous Systems
Abstract: The increasing adoption of Electric Vehicles (EVs) is driving the need for secure, efficient, and intelligent charging systems. While EVs offer a sustainable alternative to conventional vehicles, challenges such as charging station availability, range anxiety, and the privacy risks associated with sharing sensitive data—like location and energy usage—remain significant barriers to broader adoption. To address these challenges, this paper introduces a novel privacy-preserving multi-source data-driven framework for intelligent EV charging optimization. The proposed system combines a hybrid optimization strategy incorporating an enhanced Hungarian Matching Algorithm for cost-efficient EV-to-charging station assignment, a Random Forest regression model for accurate EV range prediction using contextual data, and a Laplace mechanism-based differential privacy module to protect user location data. This unified framework not only improves charging efficiency and predictive accuracy but also provides formal privacy guarantees against inference attacks. Extensive experiments conducted on synthetic datasets demonstrate the framework's effectiveness in reducing charging costs, enhancing range prediction accuracy, and preserving EV user privacy. The results suggest strong potential for real-world deployment in future intelligent transportation systems.
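The Laplace mechanism referred to above adds noise with scale sensitivity/epsilon to each released value. The sketch below applies it to a coordinate pair. The sensitivity value, the coordinate units, and the function names are illustrative assumptions, with `sensitivity` standing for the L1 sensitivity of the whole pair. The noise is drawn as the difference of two exponential variates, which follows a Laplace distribution.

```python
import random


def laplace_noise(scale, rng=random):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def privatize_location(lat, lon, epsilon, sensitivity=1.0, rng=random):
    """Perturb a location with the Laplace mechanism.

    sensitivity is assumed to be the L1 sensitivity of (lat, lon) in the
    same units as the coordinates; smaller epsilon means more noise.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    scale = sensitivity / epsilon
    return lat + laplace_noise(scale, rng), lon + laplace_noise(scale, rng)


# Hypothetical coordinates and privacy budget.
noisy = privatize_location(34.03, -84.58, epsilon=2.0, sensitivity=0.01,
                           rng=random.Random(0))
```

The privacy budget epsilon directly trades location accuracy against protection, so a charging assignment computed from perturbed locations will generally cost somewhat more than one computed from exact locations.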
|
|
08:30-18:00, Paper Tu-Online.58 | |
HPCsim: A High-Level Simulation and Workload Data Schema Framework for HPC Workload Management Research |
|
Wang, Lingfei | The University of Melbourne |
Rodriguez, Maria A. | The University of Melbourne |
Lipovetzky, Nir | The University of Melbourne |
Keywords: System Modeling and Control, Distributed Intelligent Systems, Large-Scale System of Systems
Abstract: This paper presents HPCsim, a modular and extensible simulation framework designed to support research in High-Performance Computing workload management. HPCsim focuses on modeling the full scheduling cycle — including job submission, queuing, job selection, resource allocation, execution, and evaluation — using real traces and structured cluster metadata. Most of the existing simulation tools focus on low-level system behavior but lack the abstractions needed for HPC workload management research. They do not natively support multi-resource job models, batch queuing, or system-level scheduling policies. Moreover, their focus on low-level execution details, such as hardware timing, message passing, and architectural interactions, introduces significant computational overhead, making them inefficient for large-scale experiments in high-level workload management, especially in learning-based scheduling where rapid simulation and frequent policy updates are critical. The framework is built based on Slurm, one of the most widely deployed workload managers, and introduces a structured workload data schema encompassing job traces, node configurations, and switch-based topologies. HPCsim is fully compatible with Gymnasium, enabling integration with reinforcement learning workflows, while also supporting other scheduling strategies. We validate HPCsim through experiments using real traces from a GPU-accelerated HPC cluster and demonstrate its utility for comparing scheduling policies and visualizing topology-sensitive placement behavior. HPCsim is released as an open-source tool to facilitate reproducibility and accelerate research in HPC scheduling.
|
|
08:30-18:00, Paper Tu-Online.59 | |
Leading Attackers Astray: Mitigating Link Flooding Attacks through Stub Node Relocation and Insertion |
|
Wang, Bin | Harbin Institute of Technology |
Liu, Kehong | Harbin Institute of Technology |
Zhang, Yu | Harbin Institute of Technology |
Shi, Jiantao | Harbin Institute of Technology |
Zhu, Guopu | Harbin Institute of Technology |
Fang, Binxing | Guangzhou University |
Keywords: Infrastructure Systems and Services
Abstract: Link Flooding Attacks (LFA) exploit network topology knowledge to disrupt connectivity by targeting critical links and nodes. Existing defenses often presuppose an attacker with complete topological awareness and overlook the concentration of attack traffic on specific routers. Furthermore, many countermeasures rely on SDN, which can suffer from performance degradation due to the limited packet processing capabilities of switches. To address these issues, we introduce the GateLFA attacker model, which assumes that attackers lack complete topology knowledge and guide their attacks based on traffic density analysis. We propose the EqualFlow algorithm, which utilizes stub node relocation and insertion to minimize adversarial impact, balance attack traffic, and reduce defense costs. Additionally, we present the Network Topology Obfuscation System, leveraging XDP for high-speed packet processing at the network boundary to overcome the performance challenges of SDN-based solutions. Our experimental results demonstrate that EqualFlow computes high-quality virtual topologies, outperforming existing algorithms across small, medium, and large-scale networks. Moreover, the Network Topology Obfuscation System effectively disrupts prominent topology probing tools through explicit information interference at a 10 Gbps line rate. For implicit interference, the system increases the packet rate of typical traceroute probes by approximately 17% compared to traffic control methods. This research provides an efficient and practical solution for defending against LFA.
|
|
08:30-18:00, Paper Tu-Online.60 | |
Assessing Redundancy Strategies to Improve Availability in Virtualized System Architectures |
|
Silva, Alison | UPE |
Callou, Gustavo | Federal Rural University of Pernambuco |
Keywords: Fault Monitoring and Diagnosis, Quality and Reliability Engineering, Distributed Intelligent Systems
Abstract: Cloud-based storage platforms are becoming more common in both academic and business settings due to their flexible access to data and support for collaborative functionalities. As reliability becomes a vital requirement, particularly for organizations looking for alternatives to public cloud services, assessing the dependability of these systems is crucial. This paper presents a methodology for analyzing the availability of a file server (Nextcloud) hosted in a private cloud environment using Apache CloudStack. The analysis is based on a modeling approach through Stochastic Petri Nets (SPNs) that allows the evaluation of different redundancy strategies to enhance the availability of such systems. Four architectural configurations were modeled, including the baseline, host-level redundancy, virtual machine (VM) redundancy, and a combination of both. The results show that redundancy at both the host and VM levels significantly improves availability and reduces expected downtime. The proposed approach provides a method to evaluate the availability of a private cloud and support infrastructure design decisions.
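For intuition about why redundancy helps, steady-state availability can be approximated in closed form when components fail and are repaired independently. The sketch below uses that simplification, which is much coarser than the Stochastic Petri Net models the paper evaluates. All MTTF and MTTR figures are made-up placeholders, and the function names are hypothetical.

```python
def availability(mttf_hours, mttr_hours):
    """Steady-state availability of a single repairable component."""
    return mttf_hours / (mttf_hours + mttr_hours)


def series(*avails):
    """All components are required: availabilities multiply."""
    result = 1.0
    for a in avails:
        result *= a
    return result


def parallel(*avails):
    """Any one replica suffices: the system fails only if all replicas fail."""
    unavailable = 1.0
    for a in avails:
        unavailable *= 1.0 - a
    return 1.0 - unavailable


def annual_downtime_hours(avail):
    """Expected downtime over a 8760-hour year."""
    return (1.0 - avail) * 8760.0


# Placeholder figures, not taken from the paper.
host = availability(2000.0, 8.0)
vm = availability(1500.0, 1.0)
baseline = series(host, vm)
both_redundant = series(parallel(host, host), parallel(vm, vm))
```

Under these independence assumptions the redundant configuration always has higher availability than the baseline. SPN models are needed once repairs, failover delays, or shared dependencies make the components interact.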
|
|
08:30-18:00, Paper Tu-Online.62 | |
Few-Shot Substation Anomaly Object Detection Via Variational Aggregation and Multi-Head RPN |
|
Bi, Zhongqin | Shanghai University of Electric Power |
Guo, Yikun | Shanghai University of Electric Power |
Zhang, Weina | Shanghai University of Electric Power |
Dai, Dan | Aston University |
Keywords: Fault Monitoring and Diagnosis, Intelligent Power Grid
Abstract: Due to the rare occurrence of abnormal targets in substations, detecting these anomalies can be framed as a few-shot learning problem, for which recent few-shot object detection (FSOD) methods offer promising solutions. However, the complexity of substation scenarios makes current FSOD approaches struggle to control class centers, resulting in trained models typically exhibiting bias towards base classes and tending to confuse novel classes with base classes. To address these issues, we propose a feature aggregation method based on Variational Autoencoder (VAE) that transforms instance-level features into class distributions, thereby enhancing the robustness of feature aggregation. Additionally, to improve the quality of region proposal generation in sample-limited and complex environments, we integrate a multi-head attention mechanism into the Region Proposal Network (RPN), enabling it to produce more relevant region proposals. Extensive experiments on the Substation Abnormal Target Dataset (SATD2024) demonstrate that our approach consistently outperforms existing methods across various settings. Furthermore, we validate the effectiveness and generalization of our method on the public Pascal VOC dataset.
|
|
08:30-18:00, Paper Tu-Online.63 | |
Robust Composite Control Strategy for Constrained Continuous-Time Nonlinear Systems |
|
Zhao, Ruotong | Beijing Institute of Technology |
Meng, Huan | Beijing Institute of Technology |
Zhang, Jinhui | Beijing Institute of Technology |
Tsukada, Manabu | The University of Tokyo |
Keywords: Control of Uncertain Systems, System Modeling and Control
Abstract: In this letter, we propose a robust composite control strategy for constrained continuous-time nonlinear systems. The proposed composite control consists of sliding mode control (SMC) and model predictive control (MPC), which are integrated to achieve complementary advantages. First, the SMC method is employed to design a controller capable of effectively handling system disturbances. To address the inherent difficulty of handling constraints with SMC, an optimal control problem (OCP) is formulated based on the SMC input. The resulting control sequence is then obtained using the MPC approach, ensuring effective adjustment of the SMC input. The final composite control input not only enhances the robustness of the closed-loop system but also ensures constraint satisfaction throughout the entire control process. Moreover, the MPC solves the OCP only at the sampling instants, which effectively reduces the computational efficiency disparity resulting from the integration of the controllers. Furthermore, the recursive feasibility of OCP and the stability of the closed-loop system are rigorously analyzed. The effectiveness of the proposed composite control strategy is demonstrated through the cart-damper-spring system.
|
|
08:30-18:00, Paper Tu-Online.64 | |
A Personalized Lane-Change Safety Verification Framework Based on Driving Style and Formal Modeling |
|
Wang, Xin | East China Normal University |
Fang, Letian | East China Normal University |
Liu, Jing | East China Normal University |
Hou, Rongbin | Nuclear Power Institute of China |
Keywords: Autonomous Vehicle, Trust in Autonomous Systems
Abstract: Ensuring the safety of lane-change maneuvers remains a critical challenge in autonomous driving, especially given the variability in individual driving behaviors. However, most existing decision-making models fail to account for driver heterogeneity, resulting in overly generalized and potentially unsafe strategies. In this paper, we propose a personalized lane-change risk verification framework that integrates unsupervised driving style classification with formal stochastic modeling. We first propose a volatility-based feature extraction method and employ k-means++ clustering to identify three representative driving styles (aggressive, normal, and conservative) from naturalistic trajectory data. We then construct a modular Network of Stochastic Timed Automata (NSTA) to represent individualized driving dynamics and enforce TTC-based safety constraints, enabling probabilistic safety verification. Finally, we propose a data-driven runtime verification pipeline, which evaluates the lane-change safety of individual maneuvers using real-world inputs. Experiments on 205 lane-change cases from the highD dataset demonstrate the framework’s ability to quantify safety probabilities across different driving styles. Results show that aggressive behaviors significantly increase the risk of unsafe lane changes, underscoring the importance of behavior-aware modeling. This work provides a structured and interpretable alternative to black-box risk models for autonomous vehicle decision-making.
|
|
08:30-18:00, Paper Tu-Online.65 | |
CLEAR: Contextual Learning for Enhanced Scientific Articles Recommendation (I) |
|
Yanes, Nacim | University of Gabes, ISGGB, University of Manouba, ENSI, RIADI ( |
ELayadi, Khawla | University of Gabes |
Keywords: Decision Support Systems
Abstract: The exponential growth of scientific publications creates significant challenges for delivering personalized recommendations in academic environments, where user preferences are highly dynamic and context-dependent. This paper introduces a novel hybrid context-aware framework for scientific articles recommendation (CLEAR), designed to integrate static user profiles with multi-dimensional contextual signals, such as temporal, behavioral, and domain-specific contexts. Our hybrid framework utilizes a multi-source data pipeline from scientific literature databases, employing advanced preprocessing and contextual embedding techniques to capture the content of scientific articles and user interactions. The framework combines content-based filtering, collaborative filtering based on interaction duration, sequential modeling with LSTM (Long Short-Term Memory), and interpretable natural language explanations generated by a fine-tuned LLM (Large Language Model). By continuously monitoring real-time user interactions, our framework dynamically refines contextual embeddings to adapt recommendations across diverse academic domains. Key contributions include: (1) robust adaptation to temporal and contextual shifts in user preferences; (2) enhanced accuracy through the fusion of complementary recommendation techniques; and (3) improved trust via transparent natural language explanations. Evaluated on a synthetic dataset of 1,000 users and 100,000 articles, our scalable framework achieves superior recommendation accuracy, diversity, and contextual relevance, making it ideal for academic digital libraries.
|
|
08:30-18:00, Paper Tu-Online.66 | |
An Improved Bi-RRT* Algorithm for UAV Path Planning |
|
Ren, Yifan | Beihang University |
Wu, Min | Beihang University |
Cheng, Gong | Beihang University |
Yu, Xinlong | Jianghuai Advance Technology Center |
Wang, Ziwei | Beihang University |
Keywords: Autonomous Vehicle
Abstract: Unmanned Aerial Vehicle (UAV) path planning is a critical task that directly affects the efficiency and safety of UAV operations in various fields. This paper proposes an improved Bi-RRT* algorithm to enhance the efficiency of path planning while ensuring feasible and smooth navigation. The proposed algorithm integrates a goal point switching mechanism and dynamic ellipsoid sampling with a goal-bias strategy to improve the sampling process, an enhanced expansion strategy combining goal-point-oriented growth with an improved artificial potential field (APF) mechanism to accelerate convergence, and a density-aware node rewiring strategy with greedy pruning and B-spline smoothing to generate an optimized path. Extensive simulations in five complex 3D environments demonstrate that the proposed method significantly outperforms five existing RRT* variants, achieving shorter path lengths, fewer iterations, reduced computation time, and a lower number of nodes while maintaining high robustness and adaptability in obstacle-dense environments.
|
|
08:30-18:00, Paper Tu-Online.67 | |
Enhancing Small Object Detection in Aerial Images Via Transformer Scaling and Dynamic Fusion |
|
Zheng, Kai | UESTC |
Liu, Weixuan | National University of Singapore |
Li, Jintao | University of Electronic Science and Technology of China |
Yu, Shui | Shen Zhen Institute for Advanced Study, UESTC |
Li, Yun | Shenzhen Institute for Advanced Study, University of Electronic |
Keywords: Distributed Intelligent Systems, System Modeling and Control
Abstract: At present, accurate detection of small objects in aerial imagery remains a challenge in remote sensing due to limited pixel resolution, background clutter, and scale variations. To address these issues for high-precision detection in complex remote sensing scenes, we propose a novel detection framework based on RepViT Dynamic Fusion and YOLOv11, termed RDF-YOLO. RDF-YOLO introduces two core innovations: (1) a Dynamic Scale RepViT module that integrates lightweight Transformer operations into the backbone to enhance global context modeling and semantic discrimination under noisy conditions, and (2) a dynamic fusion module that incorporates spatially aware dilated convolutions and channel-adaptive fusion strategies to enable flexible, scale-aware feature interaction. Extensive experiments on the challenging AI-TOD dataset show that RDF-YOLO outperforms state-of-the-art methods by substantial margins. In particular, RDF-YOLO improves AP50:95 by 6.9% and AP50 by 8.6% over the YOLOv11 baseline, and it also improves the small-object metrics APvt, APt, and APs. These results verify the effectiveness of the RDF-YOLO architecture for robust and efficient detection of small objects in remote sensing imagery. The source code is available at https://github.com/AssiiKk/RDF-YOLO.
|
|
08:30-18:00, Paper Tu-Online.68 | |
Evaluating Prompt Strategies for Multi-Robot Planning with ChatGPT: Batch, Interactive, and Algorithmic Modes |
|
Hashimoto, Takumi | University of Aizu |
Yuichi, Yaguchi | University of Aizu |
Keywords: Robotic Systems, Autonomous Vehicle, Infrastructure Systems and Services
Abstract: This paper explores using ChatGPT, a large language model (LLM), as a high-level planner for multi-robot systems. We investigate whether ChatGPT can coordinate multiple service robots under spatial and operational constraints by generating movement plans and task allocations through prompt interaction alone. To this end, we design five prompt strategies, such as batch, interactive, and algorithmic, and apply them to a simulated cocktail-serving task on a 5×8 grid. Robots must serve fixed guests while avoiding collisions and following adjacency and return-home rules. ChatGPT receives only natural language prompts and is evaluated on rule compliance, coordination quality, and planning efficiency. Results show that batch prompts produce generally valid but uneven plans, while interactive prompts suffer from inconsistency and loss of shared context, especially in multi-agent settings. Algorithmic prompting, where the model generates symbolic logic, enables more structured and cooperative behavior. These findings indicate that LLMs can act as high-level reasoning agents when appropriately prompted and are best suited for generating symbolic strategies rather than executing reactive control. The study highlights the potential of LLMs as cognitive modules in hybrid or cloud-based robot systems.
|
|
08:30-18:00, Paper Tu-Online.69 | |
AI-Powered Pavement Moduli Prediction: Advancing Smart City Infrastructure with Scalable and Interpretable Solutions |
|
Guo, Xinyu | University Kembangan Malaysia |
Sun, Nan | UNSW Canberra |
Chen, Yue | UNSW Canberra |
Xue, Jianfeng | UNSW Canberra |
Keywords: Smart Buildings, Smart Cities and Infrastructures, Quality and Reliability Engineering, Infrastructure Systems and Services
Abstract: With the growth of smart cities, data-driven approaches are transforming infrastructure management by enabling efficient and accurate decision-making. Pavement condition evaluation, a critical aspect of transportation infrastructure, plays a vital role in ensuring mobility, safety, and economic productivity. This study compares traditional machine learning (ML) and deep learning (DL) models for predicting pavement layer moduli, which are key indicators of structural health, using Falling Weight Deflectometer (FWD) data from the Long-Term Pavement Performance (LTPP) database. Ensemble machine learning models, such as Random Forest and XGBoost, offer high predictive accuracy with minimal training time, making them suitable for scenarios with limited data. In contrast, advanced DL models—particularly hybrid ResRNN architectures with Wide & Deep (W&D), LSTM, and GRU components—demonstrate superior scalability and predictive power, which aligns with the growing availability of data in smart infrastructure systems. To enhance interpretability, SHapley Additive exPlanations (SHAP) analysis is employed, uncovering feature importance patterns that align with pavement engineering principles and informing future sensor placement strategies. The findings support the selection of AI models that balance accuracy, efficiency, and transparency in smart infrastructure systems.
|
|
08:30-18:00, Paper Tu-Online.70 | |
Pricing Strategy for On-Demand Content Exclusive to Members under the Word-Of-Mouth Effect (I) |
|
Liu, Xuwang | Henan University |
Xu, Ya | Henan University |
Qi, Wei | Henan University |
Guo, Xiwang | Liaoning Petrochemical University |
Wang, Jiacun | Monmouth University |
Tang, Ying | Rowan University |
Keywords: System Modeling and Control, Decision Support Systems, Service Systems and Organizations
Abstract: In recent years, with the rapid development of artificial intelligence and social media, the influence of word-of-mouth (WOM) on the diffusion of online content has become increasingly evident. Video platforms can use artificial intelligence to collect WOM data on programs and formulate corresponding pricing strategies. Against this background, and considering the impact of online WOM effects on the diffusion of on-demand content exclusive to members, this study constructs a two-stage product provision model for online video platforms, consisting of the premiere and follow-up broadcast stages. Based on expected utility theory, this research explores the pricing strategies for member-exclusive on-demand content under two profit models and analyzes the influence of program WOM attributes and program quality on optimal decision-making. The findings reveal the following. When the premiere-stage WOM for a program is either highly positive or negative, the video platform should adopt an "advertising-dominant strategy". When the premiere-stage WOM is moderate, a "fee-dominant strategy" is preferable. Higher program quality increases the platform's inclination toward the "fee-dominant strategy". The better the premiere-stage WOM and program quality, the more users tend to watch during the premiere stage. Accordingly, both the program price and the platform's expected profit will vary to different degrees depending on these conditions.
|
|
08:30-18:00, Paper Tu-Online.71 | |
FOFL: Dynamic Function Output-Based Software Fault Localization Via Deep Learning |
|
Sun, Qingyuan | Beijing Institute of Technology |
Peng, Tu | Beijing Institute of Technology |
Yang, Yating | School of Cyberspace Science and Technology, Beijing Institute O |
Song, Tian | School of Cyberspace Science and Technology, Beijing Institute O |
Keywords: Fault Monitoring and Diagnosis, Quality and Reliability Engineering
Abstract: Learning-based fault localization has become a prominent research direction in software engineering due to its ability to leverage diverse program artifacts such as execution traces and coverage data for precise fault identification. However, current techniques face two fundamental limitations: (1) oversimplified representations of program behavior through basic coverage metrics, and (2) high computational cost associated with collecting fine-grained runtime data. In this work, we present FOFL, a novel function output-based fault localization approach. Our intuition is that programs can be modeled as complex dynamical systems, where faults manifest as perturbations observable through function outputs. Our method first captures function-level output patterns during execution, then encodes them into a structured matrix representation that preserves system-level behavioral signatures. These matrices are analyzed through a hybrid deep learning architecture combining convolutional neural networks for spatial feature extraction and attention mechanisms for critical pattern recognition, ultimately producing a ranked list of suspicious locations. Experiments on the widely used Defects4J benchmark show that FOFL outperforms existing techniques by localizing 204 bugs in the Top-1 ranking, which corresponds to a 25-bug improvement over state-of-the-art methods, while also enhancing MFR and MAR by 4.5% and 4.3%, respectively. Furthermore, ablation studies confirm the positive contribution of the listwise loss function and the feature aggregation design. The results establish FOFL as an effective solution that advances fault localization through principled system modeling and targeted learning of informative data representations.
|
|
08:30-18:00, Paper Tu-Online.72 | |
PantoPoseNet: A Two-Stage Framework for Real-Time Pantograph-Catenary Keypoint Detection |
|
Ma, Yixuan | Beijing Jiaotong University |
Xu, Shuai | Beijing Jiaotong University |
Zhao, Yutai | Beijing Jiaotong University |
Wu, Xiaorui | Beijing Jiaotong University |
Wang, Qiyuan | Beijing Jiaotong University |
Keywords: Intelligent Transportation Systems
Abstract: Pantograph-catenary attitude detection is crucial for high-speed railway safety, but existing methods struggle with sparse keypoints, small target regions, and complex environments. To address these specific challenges, we present PantoPoseNet, a novel two-stage framework for real-time pantograph-catenary keypoint detection. PantoPoseNet's first stage introduces three key innovations: (1) integration of VanillaNet with SIAF activation as the backbone network, which reduces model parameters by 55.6% while maintaining detection accuracy; (2) replacement of the original RepC3 module with a CSP module to enhance multi-scale feature fusion; and (3) implementation of a hybrid loss function combining GIoU and NWD metrics, specifically designed to address gradient vanishing when detecting small keypoint regions. The second stage employs a specialized VanillaNet-KP network that processes 32×32 pixel regions for precise keypoint localization. Experimental results across diverse railway scenarios demonstrate that PantoPoseNet achieves 98.2% mAP for keypoint region detection and 93.6% PCK for overall keypoint localization at 25.94 FPS, significantly outperforming current state-of-the-art methods. These results indicate the application potential of our method in pantograph-catenary monitoring systems.
|
|
08:30-18:00, Paper Tu-Online.73 | |
SafeSim: An Open-Source Platform for Safety-Critical Driving Scenario Simulation and Curriculum-Based Adversarial Training |
|
Li, Yizhe | Tsinghua University |
Zhang, Linrui | Tsinghua University |
Lin, Jiuzhou | Tsinghua University |
Junlong, Wu | Tsinghua University |
Yang, Qi | Tsinghua University |
Zheng, Han | Tsinghua University |
Wang, Xueqian | Tsinghua University |
Liu, Houde | Tsinghua University |
Keywords: Autonomous Vehicle
Abstract: We present SafeSim, a comprehensive benchmarking platform and a unified end-to-end framework for safety-critical driving scenario simulation and curriculum-based adversarial training. SafeSim integrates 8 classic adversarial environment generation algorithms, enabling scenario generation at any trigger moment and for any duration based on various naturalistic traffic datasets. During the autonomous vehicle (AV) training process, SafeSim dynamically adjusts the risk level of scenarios based on the current capability of the AV, progressively enhancing its ability to handle accident-prone situations. The platform features rich interfaces and extensibility, providing a complete workflow and tool-chain for risk scenario generation, risk assessment, AV training, and algorithm evaluation. Additionally, SafeSim includes multiple baseline results, serving as a standardized benchmark for future research.
|
|
08:30-18:00, Paper Tu-Online.74 | |
Engineering the Uncertain in Automated Systems |
|
Borth, Michael | TNO |
van der Ploeg, Chris | TNO |
Keywords: System Modeling and Control, Trust in Autonomous Systems, Autonomous Vehicle
Abstract: Systems Engineering seeks to preclude uncertainty, pursuing instead to design and realize predictable systems that perform without flaws and undue risk. Yet, automated systems that adapt their behavior to the situation at hand do so by using AI in core functionalities, like perception and planning – and today’s embedded AI brings inherent uncertainty with it, as it may or may not deliver results according to functional specifications and will often show performance degradation under adverse conditions. Consequently, engineers of smart control or automated systems need to handle uncertainty in all their main tasks, like systems architecting, the design and selection of suitable functions and components, but also system validation and risk analysis. For this, we propose a methodology that analyzes a system’s information flow and its qualities and timeliness. Altogether assessing a system's fitness for its purpose under various environmental conditions, component health states, and other impact factors, our methodology uses probabilistic reasoning to support decision-making in design as well as in risk analysis, which we illustrate in the automated driving domain.
|
|
08:30-18:00, Paper Tu-Online.74 | |
Optimizing Economic Policy Design through Reinforcement Learning: A Evolutionary Approach in Intelligent Economies |
|
Li, Yuanbai | Nanjing University |
Wang, Minghao | Nanjing University |
Zhang, Xinlei | Department of Control and Systems Engineering, School of Managem |
Sun, Yuxiang | Nanjing University |
Zhou, Xianzhong | Nanjing University |
Keywords: Cooperative Systems and Control, Decision Support Systems, Consumer and Industrial Applications
Abstract: Traditionally, economic models have often been based on fixed assumptions and analyzed within static scenarios. Economists have long sought more flexible and dynamically adaptive models that represent socio-economic laws more precisely. In this study, we built an economic engine grounded in multi-agent intelligent game simulations and employed reinforcement learning (RL) techniques to simulate economic activities. By establishing diverse agent behaviors across three industries and iterating taxation adjustment policies, we simulated and deduced socio-economic activities. Our simulation experiments confirmed that multi-agent intelligent game simulations, underpinned by reinforcement learning, can to some extent capture valuable economic operational patterns. The AI tax table, derived from the evolutionary economic engine, proves to be more realistic; it can boost overall societal production while ensuring maximal social fairness, thus finding a valuable local Nash equilibrium.
|
|
08:30-18:00, Paper Tu-Online.75 | |
Lightweight and Tamper-Resilient Data Aggregation through Reversible Watermarking and Homomorphic Encryption |
|
Song, Lei | China University of Petroleum (East China) |
Shi, Leyi | China University of Petroleum (East China) |
Keywords: System Architecture, Smart Sensor Networks
Abstract: In cyber-physical systems (CPS) and Internet of Things (IoT) environments, the security and efficiency of data aggregation are critical under resource constraints and partial trust conditions. This paper presents DA-RWFHE, a dual-layer framework that integrates Fully Homomorphic Encryption (FHE) with Reversible Watermarking (RW) to support aggregation in the encrypted domain and integrity verification during transit. The design emphasizes system robustness and adaptability, ensuring reliable performance in dynamic and failure-prone environments. Security analysis confirms that DA-RWFHE is resilient against manipulation, replay, collusion, and denial-of-service attacks. Extensive OMNeT++ simulation results show that DA-RWFHE outperforms existing schemes, such as APPA, MFEG, ECBDA, and BAMDD, improving bandwidth by 26.18 Mbps, reducing latency by 20.42 ms, and lowering energy consumption by 1.050 W. These results demonstrate that DA-RWFHE provides a lightweight, verifiable, and scalable solution for secure aggregation in distributed systems, particularly in resource-constrained environments. While DA-RWFHE performs well in simulations, it still faces challenges in real-world applications, particularly regarding encryption computational complexity on low-power devices and node topology changes. To address these challenges, the paper proposes optimization strategies, with future work planned for hardware validation to further enhance the feasibility of this approach.
|
|
08:30-18:00, Paper Tu-Online.76 | |
Cooperative Consensus Q-Learning for Micro Multi-Agent Tumor Targeting |
|
Elhami Fard, Neshat | Concordia University |
Merikhi, Behnaz | Concordia University |
Selmic, Rastko | Concordia University |
Keywords: Cooperative Systems and Control, Control of Uncertain Systems, System Modeling and Control
Abstract: This paper introduces a cooperative consensus Q-learning method that targets tumors using micro multi-agent reinforcement learning (MMARL) systems. The micro-agents in this system navigate toward cancerous tumors autonomously in a simulated 2-D vascular environment using Q-learning and cooperative position consensus. In the proposed method, the agents share Q-values and align their movements with each other, which results in coordinated behavior. Under bounded perturbations, theoretical analyses show convergence guarantees for both Q-value consensus and cooperative position consensus. The simulation results demonstrate that the proposed technique boosts convergence speed, improves stability, and enhances cumulative rewards compared to the standard consensus method, highlighting its significance for biomedical applications.
|
|
08:30-18:00, Paper Tu-Online.77 | |
Explainability-Driven Adaptation: A Collaborative Framework for Autonomous Service Systems (I) |
|
Cao, Jiashuo | Nanjing University |
Yang, Haitao | Nanjing University |
Sun, Yuxiang | Nanjing University |
Huaxiong, Li | Nanjing University |
Zhou, Xianzhong | Nanjing University |
Keywords: Trust in Autonomous Systems, Autonomous Vehicle, Adaptive Systems
Abstract: This paper introduces a novel paradigm for building trustworthy autonomous systems through explanation-driven adaptation. We propose a co-adaptive architecture where black-box neural policies, interpretable policy distillation models and explanation engines evolve synergistically through bidirectional feedback loops. The framework fundamentally transforms explainability from passive post-analysis to active system guidance, enabling continuous policy optimization while maintaining human-aligned transparency. By embedding explanation consistency as a core adaptation objective, our approach establishes dynamic equilibrium between environmental responsiveness and operational verifiability. This work advances the design of safety-critical autonomous systems through formalized principles for explainability-guided adaptation, creating new pathways for resilient human-machine collaboration in complex service environments.
|
|
08:30-18:00, Paper Tu-Online.78 | |
Smart Grocery Shopping: A Utility-Based Mathematical Framework and Optimal Strategies |
|
Li, David | Yeshiva University |
Keywords: Smart Buildings, Smart Cities and Infrastructures, Consumer and Industrial Applications, Decision Support Systems
Abstract: Smart grocery shopping presents a complex decision-making challenge involving dynamic pricing, real-time discounts, budget flexibility, and trip cost optimization. This paper proposes a mathematical framework that models shopper behavior through a stochastic partial differential equation (PDE), capturing utility evolution under uncertainty. The model integrates shopper preferences, inferred discounts from nearby consumers, and stochastic price fluctuations to formulate a utility maximization problem subject to soft budget constraints and trip cost penalties. A novel assignment matrix ensures each item is purchased from exactly one store, enabling the derivation of a closed-form solution for optimal item quantities and store selections. The Lagrangian formulation enforces budget flexibility while a penalty function captures psychological aversion to overspending. Designed for efficient implementation on mobile devices, the model is suitable for deployment in real-time consumer applications. This framework offers a principled, interpretable, and deployable tool for intelligent grocery shopping in data-rich environments.
|
|
08:30-18:00, Paper Tu-Online.79 | |
Dynamic Incentive Design Via Reinforcement Learning Stochastic Control Optimization |
|
Li, David | Yeshiva University |
Keywords: Smart Buildings, Smart Cities and Infrastructures, System Modeling and Control, Decision Support Systems
Abstract: We propose a reinforcement-aware stochastic control framework for real-time reward optimization in sharing economy platforms. Unlike traditional static incentive schemes, our model dynamically allocates incentives by integrating belief-driven user behavior modeling, Nash equilibrium assumptions, and constrained utility maximization. The core framework unifies dynamic programming with reinforcement learning (RL) approximations to handle partial observability and large-scale deployment. A novel pricing-based calibration method is introduced to quantify the marginal value of a successful transaction, enabling budget-aligned incentive strategies. We further address theoretical assumptions, computational complexity, and practical implementation, providing a scalable path toward intelligent reward systems for real-world digital platforms.
|
| |