Last updated on October 3, 2023. This conference program is tentative and subject to change.
Technical Program for Tuesday October 3, 2023
|
Tu-P0-P Late-Breaking Session, Lanai |
Industrial Applications |
|
|
|
08:00-09:00, Paper Tu-P0-P.1 |
Sentiment Analysis Based Human-Machine Teaming Dynamics Modelling for Improved Situational Awareness in Simulation Environments |
|
Menon, Vineetha | University of Alabama in Huntsville |
Weger, Kristin | University of Alabama in Huntsville |
Mesmer, Bryan | University of Alabama in Huntsville |
Gholston, Sampson | University of Alabama in Huntsville |
Keywords: Human-Machine Interaction, Human Factors, Human Performance Modeling
Abstract: Sentiment analysis and AI/machine learning (ML) based modeling of human-machine interactions can serve as an excellent guide for assessing the current strengths and weaknesses of human teams in a given paradigm and for recommending targeted areas for improvement in team performance dynamics and situational awareness. This case study is conducted on human-machine interaction data from a team-based fire-fighting simulation environment. The primary goal of this research is to apply sentiment analysis and ML to team communication data to gain a deeper understanding of how teams behave and respond in stressful scenarios, and how the levels of individual and team situational awareness influence team dynamics and decision-making capabilities and, in turn, team performance (high vs. low) in the context of fire-extinguishing tasks. The AI/ML analysis of the relationship between the team performance variables and situational awareness scores reveals the contributing factors that were the best predictors of team performance.
|
|
08:00-09:00, Paper Tu-P0-P.2 |
Preliminary Results of an Augmented Reality Tool for Supporting Remote Site Exploration |
|
Vergara, Francesca | University of Pisa |
Ryals, Andrea Dan | University of Pisa |
Arenella, Antonio | University of Pisa |
Pollini, Lorenzo | University of Pisa |
Keywords: Virtual and Augmented Reality Systems, Virtual/Augmented/Mixed Reality, Human-Machine Interface
Abstract: This brief paper describes an augmented reality tool as a proof of concept of a system for increasing situational awareness during remote site exploration. The system allows the operator to see an avatar of their vehicle through obstacles, as if they were transparent, and adds nearby obstacles, as seen by a Lidar sensor, as 3D features to facilitate the perception of depth, a key point in making the whole concept usable. The paper briefly presents the idea, highlights the need for and importance of a calibration procedure, and shows preliminary results achieved in a controlled laboratory environment using a Meta Oculus Quest 2 and an indoor motion tracking system.
|
|
08:00-09:00, Paper Tu-P0-P.3 |
Towards Edge-Computing Assessment of Cognitive Workload Using fNIRS Data |
|
Paul, Tanya Sarah | Thales Research and Technology Canada |
Salvan, Laura | Thales Research and Technology Canada |
Kopf, Maelle | Thales Research and Technology Canada |
Benesch, Danielle | Thales Research and Technology Canada |
Marois, Alexandre | Thales Research and Technology Canada |
Keywords: Passive BMIs, BMI Emerging Applications, Other Neurotechnology and Brain-Related Topics
Abstract: Situations of high cognitive workload can induce errors. This is widely prevalent in safety-critical domains where operators must make high-stakes decisions while facing complex, dynamic situations. Therefore, it seems critical to detect instances of high cognitive workload to provide timely assistance to operators when they face such error-prone situations. This, however, requires capacities for real-time evaluation, which can be performed using brain-computer interface systems driven by measures of the central nervous system. Functional near-infrared spectroscopy (fNIRS) measures have been shown to be associated with variations in cognitive workload, but these are typically processed and analyzed offline in a post-hoc fashion. Moreover, to be used in operational situations, workload evaluation should be performed with mobile devices, supported by edge-computing devices. The goal of this paper is to present the first steps toward a real-time, mobile assessment of cognitive workload using fNIRS data. We present two models developed using the Open Neural Network Exchange format, which allows edge inference of cognitive workload with fNIRS data. We present the prediction performance of the models and show that a model can be integrated into the Sensor Hub platform, a real-time, sensor-agnostic data integration, synchronization, and processing nexus that allows sampling data from multiple sensors and users simultaneously for operational use cases. More specifically, we show that the model can produce workload inference from the fNIRS signal within the one-second latency requirement of the Sensor Hub platform.
|
|
08:00-09:00, Paper Tu-P0-P.4 |
Explainable Markov Chain Based Pattern Forecasting |
|
Paul, Debdeep | Panasonic Industrial Devices Singapore |
Wijaya, Chandra Suwandi | Panasonic Industrial Devices Singapore Pte. Ltd |
Yamaura, Sahim | Panasonic Industry |
Miura, Koji | Panasonic Industry Co., Ltd |
Tajika, Yosuke | Panasonic Industry Co., Ltd |
Keywords: Decision Support Systems, Consumer and Industrial Applications, Enterprise Information Systems
Abstract: We consider trend or pattern forecasting for demand time series in a business-to-business supply chain where demand exhibits high volatility, non-stationarity, and skewness. We develop a pattern forecasting system by designing a data-driven, feature-dependent Markov chain-based framework. To increase adoption of AI-based techniques among the various stakeholders, we address the aspect of explainability. We define two metrics to evaluate the quality of explainability. To provide guidelines on selecting different attributes of our pipeline, we compare feature selection methods from two families, one advanced and one traditional. We evaluate the proposed strategy on a real dataset and observe that a sparsity-promoting feature selection method outperforms a conventional decision tree-based feature selection method.
|
|
08:00-09:00, Paper Tu-P0-P.5 |
Experimental Design for Fatigue Detection in Varied Ambient Lighting Conditions |
|
Woodruff, Katharine | Collins Aerospace |
Dutra, Stephanie | Collins Aerospace |
Danna, Adelaide | Collins |
Wu, Peggy | Raytheon Technologies Research Center |
Ferris, Thomas | Texas A&M University |
Matthews, Cheyenne | Collins Aerospace |
Keywords: Human Performance Modeling, Human Factors, Human-Machine Interaction
Abstract: Technological advances in aircraft continue to improve aircraft performance and extend the duration of operational missions; however, the physiological demand on pilots increases as well. With the increasing length and complexity of missions, monitoring constructs such as fatigue, attention, and stress becomes safety-critical. This paper specifically looks at measuring fatigue and at the experimental design used to yield various levels of fatigue. Three lighting conditions were used during data collection to represent the varied lighting that occurs in operations. The experimental tasks and design are described, as well as the behavioral coding used during data collection. Lastly, current progress and future work are discussed.
|
|
08:00-09:00, Paper Tu-P0-P.6 |
Consumer Reactions to E-Government Services: The Influence of Personal Information Sharing |
|
Frank, Björn | Waseda University |
Nishikawa, Akira | Waseda University |
Hu, Yingfei | Waseda University |
Keywords: Service Systems and Organizations, Consumer and Industrial Applications, Communications
Abstract: Due to the global trend of digital government transformation, it is essential to understand the predictors of consumer adoption of e-government services and the underlying mechanisms to accelerate the digital transformation process. Although e-government services often collect personal information, no research examines its impact on e-government service adoption. Drawing on the stimulus-organism-response (S-O-R) theory, this research investigates the influence of the level of personal information sharing on consumers’ attitudes and intentions to use e-government services. Based on two experimental studies with 669 participants, this research finds that a high level of personal information sharing hinders the service adoption both directly and indirectly by worsening consumers’ attitudes toward service usability and the government. However, when consumers view the service as highly useful, their sharing of personal information enhances their service adoption by improving their attitudes toward the government’s personal information management. For female consumers, favorable attitudes toward service usability and personal information management and the availability of diverse functions lead to higher adoption intentions. These findings provide valuable insights for government officials and policymakers in designing effective e-government services that strike a balance between personal information requirements and consumer acceptance.
|
|
Tu-P1-P Late-Breaking Session, Lanai |
Machine Learning V |
|
|
|
11:00-12:00, Paper Tu-P1-P.2 |
A Comparative Evaluation of Deep Learning Based Region-Of-Interest Detection of Images in Biomedical Literature |
|
Rahman, Md | Morgan State University |
Regmi, Bikesh | Morgan State University |
Keywords: Image Processing and Pattern Recognition, Deep Learning, Application of Artificial Intelligence
Abstract: Biomedical images are frequently used in articles to illustrate medical concepts and highlight regions of interest (ROIs). In many cases, multiple annotation markers, such as different arrows, letters, or symbols overlaid on figures, point to different ROIs relevant to the article. This work presents a proof-of-concept comparative evaluation of three popular deep learning (DL) based object detection techniques (Mask R-CNN, YOLOv5, and YOLOv7) for identifying such ROIs in a dataset of 450 chest CT images that appeared in biomedical articles. The results demonstrate that all three DL-based object detection techniques achieved high accuracy in recognizing and localizing the ROIs in biomedical images, with YOLOv7 exhibiting the highest precision of 92.5%, indicating its ability to accurately identify ROIs.
|
|
11:00-12:00, Paper Tu-P1-P.3 |
Mixture Model-Based Approach for Accurate Finance Table Image Classification |
|
Kim, Ho-Jung | Hallym University |
Jeon, Yeong-Eun | Hallym University |
Jung, Won-Seok | Yonhap Infomax Inc |
Bae, Dae-Hyun | Yonhap Infomax Inc |
Park, Yung-Il | Yonhap Infomax Inc |
Won, Dong-Ok | Hallym University |
Keywords: AI and Applications, Neural Networks and their Applications, Deep Learning
Abstract: Financial analysis reports contain a wealth of information, including details about a company's assets and performance, as well as analyst assessments. For subsequent analysis and valuation, information regarding a company's assets and performance is crucial. As a result, numerous securities firms seek to leverage artificial intelligence technology to profit from this data. We intend to resolve the issue of extracting and classifying table data from financial analysis reports by converting them to images. Financial report table images contain less information that can be extracted than typical images used in image classification research, and their sizes vary, making classification difficult with simple models. As a result, we tested a variety of classification models to determine which ones performed the best, and we propose a new model that combines their strengths. This new model achieved 94.5% accuracy and an F1-Score of 0.9414 on our self-collected financial analysis table image dataset, demonstrating its effectiveness in classifying table images in financial analysis reports.
|
|
11:00-12:00, Paper Tu-P1-P.4 |
Interactive Driving Control Design for Autonomous Vehicles Using Deep Reinforcement Learning |
|
Huang, Mei-Lin | National Yang Ming Chiao Tung University |
Lee, Ching-Hung | National Yang Ming Chiao Tung University |
Chiang, Hsin-Han | National Taipei University of Technology |
Keywords: Autonomous Vehicle, Control of Uncertain Systems, Intelligent Transportation Systems
Abstract: Decision making and motion planning continue to pose many challenges to the reliability and safety of autonomous vehicles (AVs), especially under complex and nondeterministic traffic. A growing body of AV research attempts to solve this problem by means of reinforcement learning paradigms, in which machines learn how to behave through interaction with the environment. In this study, a deep reinforcement learning approach with a modified version of the deep deterministic policy gradient (DDPG) is proposed to implement interactive driving control for AVs. Under the developed co-simulation framework, a deep controller that achieves interactive driving safety for AVs can be generated automatically with the aid of model-free deep reinforcement learning.
|
|
11:00-12:00, Paper Tu-P1-P.5 |
Deep SIRMs Fuzzy Inference Model and Its Application to Estimating Roles in a Werewolf Game |
|
Kita, Shuhei | Osaka University |
Haruta, Suguru | Osaka University |
Seki, Hirosato | Osaka University |
|
|
11:00-12:00, Paper Tu-P1-P.6 |
Mesh Normal Vectors: A New Texture Feature for Single-View 3D Object Classification |
|
Que, Yufei | Soochow University |
Xie, Jie | Soochow University |
Zhang, Jin | Soochow University |
Wu, Cheng | Soochow University |
Keywords: Intelligent Transportation Systems, Autonomous Vehicle, Decision Support Systems
Abstract: The depth information provided by LiDAR 3D point clouds helps object detection when the depth direction is occluded. However, the large amount of data in LiDAR point clouds requires substantial computing resources. Inspired by images and 3D reconstruction, we propose to use the normal vectors of the triangular mesh as a new texture feature of point clouds. Our current work indicates that the texture features we created contribute to the classification of single-view point clouds. Key words: classification, single-view, reconstruction, normal vectors.
|
|
Tu-P2-P Late-Breaking Session, Lanai |
Medical Applications I |
|
|
|
12:45-13:45, Paper Tu-P2-P.1 |
Driver Fatigue State Estimation Using Time Series Analysis of Body Acceleration |
|
Yuda, Emi | Tohoku University |
Yoshida, Yutaka | Tohoku University |
Sugita, Norihiro | Tohoku University |
Yoshizawa, Makoto | Tohoku University |
Sakai, Masao | Tohoku University |
Keywords: Cybernetics for Informatics, Computational Life Science, AIoT
Abstract: The detection of driver fatigue is a longstanding issue in the field of biometric measurements. In this study, we examined the possibility of detecting the fatigue state of drivers using their body accelerations. Nine healthy subjects operated a driving simulator for one hour in the morning and one hour in the evening. A Holter ECG was worn and body acceleration was measured using a built-in triaxial accelerometer. The comparison of body acceleration in the morning and evening showed that the variance of the data was larger in the evening for most subjects. By visualizing and comparing the time series within the same subject, this study showed that the variance of the driver's body acceleration increases with fatigue. It is expected that this method will be applied to improve driver safety in the future.
|
|
12:45-13:45, Paper Tu-P2-P.2 |
Personal and Situational Determinants of Patients’ Adoption of Robotic Surgery |
|
Frank, Björn | Waseda University |
Shimoura, Kentaro | Waseda University |
Hu, Yingfei | Waseda University |
Keywords: Medical Informatics, Assistive Technology, Human-Machine Interaction
Abstract: The global market for surgical robots is expected to experience rapid growth in the next decade, primarily attributed to the increasing demand for healthcare services driven by population aging and improved living standards. Although robotic surgery offers many advantages compared to traditional methods, lack of public awareness of these benefits poses challenges to its widespread adoption, necessitating an understanding of factors that influence patients’ choice of robotic surgery. Many studies in the literature examine the facilitators and barriers to adopting robotic surgery, focusing on surgical techniques and surgeons’ perspectives. However, limited research explores patients’ perceptions and reactions, mostly emphasizing personal characteristics. Drawing on the unified theory of acceptance and use of technology (UTAUT) model, this study investigates two situational determinants and three personal determinants of patients’ robotic surgery adoption. Analyses of Japanese data reveal that patients’ interest in cutting-edge tech, trust in AI, and surgery hospital access improve their attitude toward robotic surgery and, thereby, stimulate their robotic surgery adoption. The availability of a trusted surgeon influences robotic surgery adoption only when patients are interested in cutting-edge technology. In addition, the need for social connections directly impedes the choice of robotic surgery. These findings extend scholars’ understanding of the UTAUT model to the robotic surgery context, and they guide both surgical robotics firms in market segmentation and hospitals in decisions on the adoption of robotic surgery equipment.
|
|
12:45-13:45, Paper Tu-P2-P.3 |
Revealing Trend Clusters Influenced by a Social Event from COVID-19 Tweets in Japan |
|
Harakawa, Ryosuke | Nagaoka University of Technology |
Iwahashi, Masahiro | Nagaoka University of Technology |
Keywords: Media Computing
Abstract: This paper proposes graphical lasso-guided principal component analysis (GLIPCA) with interrupted time series analysis (ITSA), that is, GLIPCA-ITSA. This enables us to reveal trend clusters influenced by a social event (intervention) from Japanese tweets related to coronavirus disease (COVID-19). We regard daily changes in the frequencies of each word in tweets as trends and define trends with similar wave forms as trend clusters. First, we need to identify trends influenced by the intervention. ITSA, which can quantify the effect of the intervention even if its counterfactual does not exist, is useful for this aim. However, ITSA cannot exclude trends influenced by an event on a day close to the intervention. To overcome this difficulty, we newly adopt GLIPCA that combines sparse structural learning with network clustering. We can divide multiple trends, which may have been influenced by the intervention, into trend clusters with different peaks. This helps us judge the trend cluster strongly influenced by the intervention. In the experiment, we verify the effectiveness of GLIPCA-ITSA through considerations on the state-of-emergency declaration.
|
|
12:45-13:45, Paper Tu-P2-P.4 |
Improved Generalized Performance of Hemodynamics Scenarios Prediction with Digital Biomarkers by Conv1D Approach |
|
Shirotori, Momo | Chugai Pharmaceutical Co., Ltd |
Kato, Kosuke | Japan Data Science Consortium Co. Ltd |
Hondo, Takaya | Japan Data Science Consortium Co. Ltd |
Kim, Kijun | Japan Data Science Consortium Co. Ltd |
Tokuyama, Kento | Chugai Pharmaceutical Co., Ltd |
Keywords: Human Performance Modeling, Wearable Computing, Medical Informatics
Abstract: Digital biomarkers (dB) provide valuable information for the continuous assessment of disease status in clinical practice. Hemodynamics is an important endpoint for evaluating a patient's status and, therefore, a continuous monitoring system using dB needs to be developed. In this study, we developed supervised learning modeling approaches for estimating hemodynamic scenarios of new patients using a time series dataset obtained from contact and contact-free sensors.
|
|
12:45-13:45, Paper Tu-P2-P.5 |
Clean Your Hands: Using Computational Modeling to Improve Infection Rates in Anesthesia Induction |
|
Rose, Olivia Claire | University of Virginia |
Bolton, Matthew | University of Virginia |
Miller, Michael | University of Virginia |
Keywords: Human Performance Modeling, Human Factors
Abstract: In United States healthcare, there are persistent problems associated with infections, especially in relation to hand hygiene in anesthesia. While the sterile field of an operating room is well defined, cross contamination can still occur due to the complexity of the anesthesia work environment. This study aims to address this deficiency using a novel computational technique. Formal methods will be used to verify the cleanliness and properties associated with the anesthesia induction process as they dynamically evolve. Our hypothesis is that formal verification will be able to predict and identify how infection can spread in the operating room during the anesthesia induction process. This paper describes preliminary work demonstrating the feasibility of the modeling underlying this approach.
|
|
Tu-P3-P Late-Breaking Session, Lanai |
Medical Applications II |
|
|
|
13:45-14:45, Paper Tu-P3-P.1 |
Chinese Sentences Composition for People with Dysarthria |
|
Jhong, Ren-Sheng | National Taipei University |
Hsu, Chun-Yao | National Taipei University |
Chang, Yue-Shan | National Taipei University |
Huang, Hung-Shing | National Taipei University |
Keywords: Human-Machine Interface, Human-Computer Interaction, Human-Machine Interaction
Abstract: Dysarthria is a kind of speech impairment. People with dysarthria cannot speak clearly, or their speech intelligibility is poor, which causes them to seldom communicate with others. In addition, since each dysarthria patient has a different degree of language nerve damage, it is not easy to collect enough voice files to train an ASR (Automatic Speech Recognition) system for a given patient, even though deep learning-based ASR systems are very mature. It is therefore not easy to construct a good ASR system for people with dysarthria, and the speech recognition accuracy of such systems is not very good. How to compose meaningful Chinese sentences in this situation has become an important issue. In this paper, based on our previous work, an N-gram-based Chinese sentence composition method is proposed to compose meaningful Chinese sentences for people with dysarthria. To address this issue, we introduce an N-gram repository that includes many N-word terms, and an HSWD (History Successive Word Dictionary) that records the follow-up terms for a given term. When an ASR model recognizes the voice of a sentence and generates an N-gram candidate matrix based on the N-gram repository, the system composes possible candidate sentences according to the HSWD and the recognized N-gram candidate matrix. In this work, we propose a framework and algorithm for inferring and composing the possible candidate sentences. The experimental results show that the proposed approach achieves an accuracy rate of over 80% for composed sentences within ten candidate sentences.
|
|
13:45-14:45, Paper Tu-P3-P.2 |
A Hybrid YOLOv8 and Instance Segmentation to Distinguish Sealed Tissue and Detect Tools’ Tips in FLS Laparoscopic Box Trainer |
|
Mohaidat, Mohsen | Electrical & Computer Engineering Western Michigan University |
Grantner, Janos | WMU |
A Shebrain, Saad | Western Michigan University |
Abdel-Qader, Ikhlas | Western Michigan University |
Keywords: Machine Vision, Image Processing and Pattern Recognition, Biometric Systems and Bioinformatics
Abstract: Intracorporeal suturing is one of the most crucial skills in the Fundamentals of Laparoscopic Surgery. Surgical residents are evaluated by their supervisory surgeon, but surgical assessment demands a substantial amount of time from the surgeons and can result in biased evaluations. In addition, distinguishing sealed tissue among multiple trainees would be a subjective decision. Therefore, we propose an autonomous assessment support system, which can supervise the execution of the suturing task by using YOLOv8 instance segmentation and object detection, tracking the tip of the suturing instruments and monitoring tissue seals and knots. We used mean average precision and inference time metrics to evaluate the performance of the instance segmentation for our proposed suturing assessment system. It was found that the precision of all suturing instruments was 95% and that the mask precision of the tissue was 98.8%. Our proposed autonomous laparoscopic training system saves the supervisor surgeons' time, and the outcomes of the proposed methodology may also be utilized in the development of surgical robots.
|
|
13:45-14:45, Paper Tu-P3-P.3 |
Autoencoder Learning and Variational Gaussian Inference for Predicting Mean Arterial Pressure in Fluid Resuscitation |
|
Estiri, Elham | Kent State University |
Mirinejad, Hossein | Kent State University |
Keywords: Digital Twin, Modeling of Autonomous Systems
Abstract: This paper introduces a novel method, called robust nonlinear state space modeling (RNSSM), for predicting hemodynamic responses in fluid resuscitation. The RNSSM approach integrates autoencoder learning and Gaussian inference in a unified framework to address the challenges associated with identifying reliable models with limited and noisy critical care data. Simulation results demonstrate the initial feasibility and performance evidence of the RNSSM approach, which serves as a digital twin of an animal study, in fluid resuscitation scenarios.
|
|
13:45-14:45, Paper Tu-P3-P.4 |
Managing Nurse Cognitive Load in Emergency Patient Care Situations with Decision Making Aids |
|
Goodman, Clare | Purdue University |
Anton, Nicholas | Purdue University |
Ray, Poushali | Purdue University |
Chen, Haozhi | Purdue University |
Yu, Denny | Purdue University |
Keywords: Virtual/Augmented/Mixed Reality, Assistive Technology, Augmented Cognition
Abstract: In an emergency setting, patient status can change in a matter of seconds, and failure to react in a timely manner can pose significant risks to patient outcomes. Nurses are responsible for processing large volumes of complex information in a limited time frame, which can result in high cognitive load (CL). With the advent of new technology, tools such as augmented reality (AR) can be utilized as an aid to nurses in decision-making (DM). Previous interviews indicated experienced nurses thought AR could be beneficial in nursing education. Based on this feedback, the team constructed a virtual checklist to leverage clinical judgment for novice nurses. This pilot study aims to determine whether using the Microsoft HoloLens, an AR device, to display a virtual checklist will reduce CL in novice nurses.
|
|
13:45-14:45, Paper Tu-P3-P.5 |
A Comparative Study Via Explainable AI Methods of Preterm Birth Prediction Model Trained on Ultrasound Images with Slight Changes |
|
Jeon, Yeong-Eun | Hallym University |
Son, Ga-Hyun | Hallym University |
Kim, Ho-Jung | Hallym University |
Lee, Jae-Jun | Hallym University |
Won, Dong-Ok | Hallym University |
Keywords: Biometrics and Applications, Medical Informatics
Abstract: Preterm birth (PTB), the leading cause of infant death, is mainly predicted by cervical length (CL). However, using CL alone, even medical experts have difficulty predicting PTB. Therefore, we aimed to identify new predictors of PTB other than CL using eXplainable Artificial Intelligence (XAI). Moreover, we conducted a comparative study to investigate how the model's learning pattern and features are affected by differences in expert guidance and XAI methods on the same ultrasound images. As a result, we found that these influences may be a hurdle to finding new prediction factors for preterm birth.
|
|
Tu-P4-P Late-Breaking Session, Lanai |
Robotic Systems |
|
|
|
15:00-16:00, Paper Tu-P4-P.1 |
Semantic Segmentation Model for Marine Pollution Detection |
|
Heng, Wei Bin | National Taipei University of Technology |
Yi-Kai, Chiu | National Taipei University of Technology |
Chen, Xiu-Zhi | National Taipei University of Technology |
Chen, Yen-Lin | National Taipei University of Technology |
Keywords: Robotic Systems, Autonomous Vehicle, Consumer and Industrial Applications
Abstract: Unmanned aerial vehicles (UAVs) are nowadays an important piece of equipment in search and rescue missions owing to their flexibility and convenience. Computer vision algorithms based on machine learning or deep learning are often used in such tasks. However, such approaches rely on large amounts of aerial-view data to train the machines, while most of the existing publicly available aerial datasets cover land or traffic scenes, and the few maritime aerial datasets focus on object detection, tiny object detection, and so on. Therefore, this paper introduces a model for marine pollution detection. We use a video of a stranded cargo ship to extract 142 images as a dataset and use PIDNet[1] as the semantic segmentation model for training and validation to demonstrate the applicability to maritime oil pollution detection. Source code is released at https://github.com/HengWeiBin/Oil-Polution-Dataset-with-PID Net.
|
|
15:00-16:00, Paper Tu-P4-P.2 |
Modeling and Tracking Control of Differential-Drive Mobile Robots Using Artificial Fuzzy Neural Networks |
|
Huang, Hsu-Chih | National Ilan University |
Keywords: Autonomous Vehicle, Robotic Systems, Infrastructure Systems and Services
Abstract: This paper contributes to the development of mechanical modeling and motion control of autonomous mobile robots (AMRs) using artificial fuzzy neural networks. An interval type-2 fuzzy neural network (IT2FNN) algorithm is incorporated with the mechanical modeling of mobile robots to develop intelligent control schemes. The metaheuristic genetic algorithm is employed to determine the initial IT2FNN structure. Using the mechanical model, the genetic algorithm, and the IT2FNN control strategy, two-wheeled nonholonomic mobile robots are steered to accomplish time-varying trajectory tracking tasks. Numerical simulations were conducted to illustrate the efficiency and superiority of the proposed intelligent IT2FNN control method for autonomous mobile robots.
|
|
15:00-16:00, Paper Tu-P4-P.3 |
Intelligent Tracking Control for AGV Navigation in Industry Environment Based on Adaptive Fuzzy-Neural Approach |
|
He, Jyun-Hong | National Taipei University of Technology |
Chen, Yen-Lin | National Taipei University of Technology |
Chiang, Hsin-Han | National Taipei University of Technology |
Hsu, Pei-En | National Taiwan Normal University |
Lin, Cheng-Hung | National Taiwan Normal University |
Keywords: Manufacturing Automation and Systems, Robotic Systems, Intelligent Transportation Systems
Abstract: This study proposes an intelligent control strategy for automatic guided vehicle (AGV) trajectory tracking control for application in manufacturing facilities. Based on the fuzzy-neural network (FNN) method combined with the newly proposed error algorithm in the design of the loss function, virtual models of the working environment and the AGV platform can be used to learn the controller's parameters in advance from the simulation environment. Finally, the real-time trajectory tracking task of navigation is evaluated to verify the effectiveness of the proposed control strategy integrated with a reflector-assisted localization algorithm.
|
|
15:00-16:00, Paper Tu-P4-P.4 |
Design and Implementation of Virtual and Real Kawaii Companion Robots by Affective Evaluation Using EEG and ECG |
|
Ohkura, Michiko | Shibaura Institute of Technology |
Laohakangvalvit, Tipporn | Shibaura Institute of Technology |
Sripian, Peeraya | Shibaura Institute of Technology |
Sugaya, Midori | Shibaura Institute of Technology |
Natsuko, Noda | Shibaura Institute of Technology |
Berque, Dave | DePauw University |
Chiba, Hiroko | DePauw University |
Keywords: Affective Computing, Human-Collaborative Robotics, Human-Machine Interface
Abstract: Companion robots are becoming more familiar year by year. This manuscript describes our collaboration project on the design and implementation of virtual and real kawaii companion robots by affective evaluation using EEG and ECG, carried out by Japanese and American university students. Each group, consisting of a combination of Japanese and American students, designed and implemented several robots that cause different affective reactions, as estimated by EEG and ECG.
|
|
15:00-16:00, Paper Tu-P4-P.5 |
Unraveling the Connection: How Cognitive Workload Shapes Intent Recognition in Robot-Assisted Surgery |
|
Sharma, Mansi | German Research Center for Artificial Intelligence |
Krüger, Antonio | German Research Center for Artificial Intelligence (DFKI) |
Keywords: Human-Collaborative Robotics, Human-Machine Interaction, Human-Machine Cooperation and Systems
Abstract: Robot-assisted surgery has revolutionized the healthcare industry by providing surgeons with greater precision, reducing invasiveness, and improving patient outcomes. However, the success of these surgeries depends heavily on the robotic system's ability to accurately interpret the intentions of the surgical trainee or even surgeons. One critical factor impacting intent recognition is the cognitive workload experienced during the procedure. In our recent research project, we are building an intelligent adaptive system to monitor cognitive workload and improve learning outcomes in robot-assisted surgery. The project will focus on achieving a semantic understanding of surgeon intents and monitoring their mental state through an intelligent multi-modal assistive framework. This system will utilize brain activity, heart rate, muscle activity, and eye tracking to enhance intent recognition, even in mentally demanding situations. By improving the robotic system's ability to interpret the surgeon's intentions, we can further enhance the benefits of robot-assisted surgery and improve surgery outcomes.
|
|
Tu-P5-P Late-Breaking Session, Lanai |
Transportation Applications |
|
|
|
16:00-17:00, Paper Tu-P5-P.1 |
Fuel-Efficient Interval Management for Air Traffic Descending Operation |
|
Ishii, Minami | Keio University |
Inoue, Masaki | Keio University |
Toratani, Daichi | Electronic Navigation Research Institute |
Keywords: Intelligent Transportation Systems, System Modeling and Control, Cooperative Systems and Control
Abstract: To improve the operational efficiency of civil aviation, advanced technology for the interval management of aircraft needs to be developed. This manuscript addresses the design of a fuel-efficient interval management algorithm that ensures proper intervals between arriving aircraft through altitude control. The effectiveness of the presented algorithm is verified in a simulation with arriving aircraft at Kansai International Airport.
|
|
16:00-17:00, Paper Tu-P5-P.2 |
Increasing EV Integration with Reinforcement Learning and Distribution Network Reconfiguration |
|
Gholizadeh, Nastaran | University of Alberta |
Musilek, Petr | University of Alberta |
Keywords: Intelligent Power Grid, Electric Vehicles and Electric Vehicle Supply Equipment, Cooperative Systems and Control
Abstract: The rapid increase in penetration of electric vehicles demands the widespread installation of fast charging stations. These stations require a very high level of electrical power, drastically changing the electrical load profile by increasing its peak. This results in increased system losses and voltage drops throughout the network and limits the number of electric vehicles that can charge at the same time. This paper presents a reinforcement learning-based optimization of vehicle charging location. This novel approach uses optimal distribution network reconfiguration to train an electric vehicle charging coordinator, implemented as a reinforcement learning agent.
|
|
16:00-17:00, Paper Tu-P5-P.3 |
Cooperative Decision-Making in Mixed Urban Traffic Scenarios |
|
Varga, Balint | Karlsruhe Institute of Technology (KIT), Campus South |
Yang, Dongxu | Institute of Control Systems, Karlsruhe Institute of Technology |
Hohmann, Sören | KIT |
Keywords: Human-Machine Cooperation and Systems, Human-Machine Interaction, Systems Safety and Security
Abstract: As the number of highly automated and autonomous vehicles on public roads continues to rise, gaining trust becomes a crucial aspect for their acceptance in society. Particularly challenging situations arise when these vehicles drive at low speeds, as they frequently encounter interactions with vulnerable road users. Consequently, addressing these interactions becomes essential for the vehicle's automation. This abstract has two key contributions: the introduction of an approximate vulnerable road user model and the comparison of the model-based and model-free decision-making algorithms. The simulation results showed the effectiveness and practicality of the proposed model-based algorithm even for real-world applications due to the traceability of this decision-making algorithm.
|
|
16:00-17:00, Paper Tu-P5-P.4 |
Multi UAV Network Restoration Scheme Using Recovery UAV |
|
Jang, Min-Hui | Kumoh National Institute of Technology |
Kim, Hyeong-Jin | NSLab Co., Ltd |
Lee, Jae-Min | Kumoh National Institute of Technology |
Kim, Dong-Seong | Kumoh National Institute of Technology |
Keywords: Fault Monitoring and Diagnosis, System Modeling and Control, System Architecture
Abstract: Unmanned aerial vehicles (UAVs), which offer flexibility, cost-effectiveness, ease of deployment, and risk avoidance, are emerging as communication relays for building network infrastructure not only in the military sector but also in the civilian sector. However, these networks lack countermeasures for situations in which network disconnection occurs after construction due to UAV defects, such as variables that occur in nature, shooting down by enemy forces in battlefield environments, operation shutdown, and malfunction. Therefore, in this paper, a recovery UAV is additionally deployed in a multi-UAV-based FANET (Flying Ad-Hoc Network) environment composed of UAV base stations (UAV-BSs). As a result, when a defect occurs in a UAV-BS, the recovery UAV moves to the corresponding coordinates and restores communications on behalf of the UAV-BS.
|
|
16:00-17:00, Paper Tu-P5-P.5 |
Can Information Presentation Using a Mixed Reality Device Reduce Anxiety of Autonomous Driving? A Preliminary Study in Simulator Environment |
|
Yoshitake, Hiroshi | The University of Tokyo |
Harada, Ryunosuke | The University of Tokyo |
Shino, Motoki | Tokyo Institute of Technology |
Keywords: Human-Machine Interaction, Virtual/Augmented/Mixed Reality, Human-Centered Transportation
Abstract: It is important that passengers of autonomous personal mobility vehicles (PMVs) do not feel anxious while moving. Previous works have shown that presenting information about the PMV’s future behavior and intentions effectively reduces anxiety. Mixed reality (MR) devices are promising tools for realizing this information presentation. However, the limitations of current devices may influence the effect of the visual information. Therefore, the effect of information presentation using an MR device on reducing the anxiety of autonomous PMV passengers was investigated using a PMV simulator. The results showed that the anxiety of autonomous PMV passengers could be reduced by visual information presented through an MR device.
|
|
16:00-17:00, Paper Tu-P5-P.6 |
AI-Based Autonomous Diagnostics and Maintenance System for Cyber Physical Systems |
|
Zhang, Fan | Georgia Institute of Technology |
Keywords: Cyber-physical systems
Abstract: Cyber-physical systems (CPSs) have been deployed rapidly and widely for Industry 4.0. This work develops an AI-based autonomous diagnostics and maintenance system for CPSs to reduce operations and maintenance costs by detecting artefacts such as faults and equipment degradation, eliminating unnecessary downtime, and performing uncomplicated maintenance. This system is made possible by recent rapid technological advancements in robotics and computing, which enable dynamic sensing and remote manipulation to reduce repetitive work and maintenance tasks, especially in harsh environments that are dangerous or hostile to humans. This autonomous system aims to: (1) use mobile robots with portable sensors to dynamically sense in-the-field data, (2) use machine learning (ML) and AI to analyze the dynamic sensor data integrated with data from sensors installed in CPSs, and (3) accomplish ancillary maintenance tasks via robots. The current work progress includes a power plant simulation for robotic navigation and manipulation simulations, the development of trajectory optimization and task planning algorithms, and an ML-based diagnostics system. This work also discusses the research challenges that need to be solved to achieve the proposed autonomous diagnosis and maintenance system.
|
|
Tu-PS10-T4 Regular Session, Hawaii 2 |
AI and Applications I |
|
|
|
08:15-08:30, Paper Tu-PS10-T4.2 |
Deploying a Machine Translation Model on a Mobile Device with Improved Latency Constraints |
|
Chang, Wei-Chien | National Chung Cheng University |
Liu, Alan | National Chung Cheng University |
Wang, Hua-Ye | National Chung Cheng University |
Keywords: AI and Applications, Evolutionary Computation, Transfer Learning
Abstract: This paper presents a method of deploying a complex machine learning model on a mobile device which has limited computing resources. With the rapid advancement of machine learning and deep learning, Natural Language Processing (NLP) has made considerable progress in recent years. The transformer approach has been widely used in NLP tasks with the ability of parallel processing in recurrent neural networks through the use of a self-attention mechanism. However, the excessive pursuit of accuracy leads to higher complexity in a model, and it becomes difficult to deploy such a model on mobile devices. To solve the problem of high computational cost, this research uses Neural Architecture Search (NAS) with Genetic Algorithm as the search strategy. It trains the supernet for performance evaluation to automatically design efficient architecture for different hardware platforms while considering network compression. We further reduce the size of the model with little impact on accuracy by applying K-means clustering. To show our method’s effectiveness, we have deployed a model on a mobile device to perform machine translation. We use BLEU, FLOPs, and Latency as evaluation metrics. BLEU is used as an evaluation of sequence generative tasks while FLOPs and latency are used to observe whether the architecture found by the NAS is suitable for the hardware.
|
|
08:30-08:45, Paper Tu-PS10-T4.3 |
Deep Reinforcement Learning Based Upper Limb Neuromusculoskeletal Simulator for Modelling Human Motor Control |
|
Fu, Jirui | University of Central Florida |
Choudhury, Renoa | University of Central Florida |
Park, Joon-Hyuk | University of Central Florida |
Keywords: AI and Applications, Computational Intelligence, Application of Artificial Intelligence
Abstract: Neuromusculoskeletal modeling and simulators (NMMS) have been widely utilized in various fields and applications. The deep reinforcement learning (DRL) algorithm is a promising method to study human motor controls and movement biomechanics via NMMS without experimental data. However, existing research lacks exploration of the DRL implementation for controlling neuromusculoskeletal simulators, and only a few have presented myoelectric control systems applied to the DRL-based NMMS. In this work, an off-policy DRL algorithm, Deep Deterministic Policy Gradient (DDPG), was implemented on an upper limb NMMS with two different types of action space (direct muscle activation output and PD-based internal model), and their control performance was compared. In addition, we evaluated the performance of proportional myoelectric control systems implemented on the DRL-based upper limb NMMS. The results indicate that the DRL-based NMMS can execute upper limb movements accurately, and the proportional myoelectric control system reduced the muscle activation under both types of action space. Moreover, the PD-based internal model action space shows better learning and error-tracking performance than the direct muscle activation output action space.
|
|
08:45-09:00, Paper Tu-PS10-T4.4 |
Contour Detection from Ultrasound Kidney Images with a Coarse-To-Fine Approach |
|
Tao, Peng | Soochow University |
Gu, Yidong | Suzhou Municipal Hospital |
Xu, Yanqing | UT Southwestern Medical Center |
Wang, Caishan | The Second Affiliated Hospital of Soochow University |
Zhang, Lei | Duke Kunshan University |
Cai, Jing | Hong Kong Polytechnic University |
Keywords: AI and Applications, Image Processing and Pattern Recognition
Abstract: Ultrasound kidney image segmentation presents significant challenges due to missing or ambiguous boundaries. In this study, we introduce a coarse-to-refinement approach incorporating four novel aspects. Firstly, we leverage the properties of a principal curve (PC) to automatically fine-tune the curve shape and employ a neural network's learning ability to reduce model error. Secondly, a deep fusion learning network is utilized for the coarse segmentation step, incorporating a parallel architecture to enhance deep-learning performance. Thirdly, addressing the limitation of standard PC-based methods in determining the number of vertices automatically, we propose an automatic searching polygon tracking method using a mean shift clustering-based approach to replace the projection and vertex extension step in standard PC-based methods. Lastly, we develop an explainable mathematical map function for the kidney contour, as denoted by the neural network output (i.e., optimized vertices), which aligns well with the ground truth contour. We conducted various experiments to evaluate our method's performance, demonstrating its effectiveness in ultrasound kidney image segmentation.
|
|
Tu-PS10-T5 Special Session, Honolulu |
Computational and Medical Cybernetics II |
|
|
Organizer: Rudas, Imre | Obuda University |
Organizer: Kovacs, Levente | Obuda University |
Organizer: Eigner, György | Obuda University |
Organizer: Szilágyi, László | Obuda University |
Organizer: Kubota, Naoyuki | Tokyo Metropolitan University |
Organizer: Kozma, Robert | University of Memphis, TN |
|
08:00-08:15, Paper Tu-PS10-T5.1 |
Stepwise Search Transition-Based Hybrid Optimization for 3D Pose Estimation (I) |
|
Eguchi, Masatoshi | Tokyo Metropolitan University |
Obo, Takenori | Tokyo Metropolitan University |
Kubota, Naoyuki | Tokyo Metropolitan University |
Keywords: Hybrid Models of Computational Intelligence, Metaheuristic Algorithms, Swarm Intelligence
Abstract: We aim to develop a simple motion capture system for home environments. As users need to install their own cameras, a calibration-free system is required. Therefore, we propose a 3D pose estimation method based on 3D joint angles of humans using multiple smart devices with a hybrid optimization method that combines Particle Swarm Optimization and steepest descent method. We also estimate the relative angles between humans and cameras to facilitate camera calibration. In this paper, we discuss the impact of the combination of global and local search capabilities of the optimization method on the system's performance. Specifically, we propose an optimization method that gradually changes the number of iterations of Particle Swarm Optimization and Steepest Descent Method and compare it with a simple sequential combination.
|
|
08:15-08:30, Paper Tu-PS10-T5.2 |
SRPE and ACWR to Control Fatigue Levels and Minimize Injuries in Performance Sports (I) |
|
Biró, Attila | George Emil Palade University of Medicine, Pharmacy, Science, An |
Cuesta-Vargas, Antonio Ignacio | University of Malaga |
Szilágyi, László | Sapientia - Hungarian University of Transylvania |
Keywords: Application of Artificial Intelligence, Machine Learning, Biometric Systems and Bioinformatics
Abstract: sRPE and ACWR are valuable tools for controlling fatigue levels and minimizing injuries in performance sports. Their ability to provide individualized assessment, integrate subjective and objective measures, and inform data-driven decision-making makes them essential components of a comprehensive sports safety and performance monitoring system. The goal of this study was to provide first a computer-assisted solution to predict the fatigue level and then to expand the currently existing manual calculation models with an artificial intelligence-supported solution for a more advanced pipeline in performance sports to control fatigue and minimize the injury level.
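For readers unfamiliar with the two measures, the widely used manual calculation can be sketched as follows. This is a minimal illustration, not the authors' pipeline: it assumes sRPE load is the session rating of perceived exertion multiplied by the session duration in minutes, and that ACWR is the rolling-average variant (mean daily load over the last 7 days divided by the mean daily load over the last 28 days). The function names and window lengths are assumptions chosen for illustration.

```python
from statistics import mean


def session_load(rpe, duration_min):
    """sRPE training load: perceived exertion (e.g. 0-10 scale) times minutes."""
    return rpe * duration_min


def acwr(daily_loads, acute_days=7, chronic_days=28):
    """Rolling-average acute:chronic workload ratio for the most recent day.

    daily_loads: list of daily sRPE loads, oldest first (rest days as 0).
    Returns None when there is too little history or the chronic load is zero.
    """
    if len(daily_loads) < chronic_days:
        return None
    acute = mean(daily_loads[-acute_days:])      # short-term (fatigue) window
    chronic = mean(daily_loads[-chronic_days:])  # long-term (fitness) window
    if chronic == 0:
        return None
    return acute / chronic
```

A ratio well above 1 indicates that recent load has risen sharply relative to what the athlete is accustomed to; the threshold at which this is treated as a risk is a policy choice that the sketch deliberately leaves out.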
|
|
08:30-08:45, Paper Tu-PS10-T5.3 |
Research in Collaborative Space Using Complex Experimental Model (I) |
|
Horváth, László | Óbuda University |
Keywords: Expert and Knowledge-Based Systems, Hybrid Models of Computational Intelligence, AI and Applications
Abstract: Theories, methodologies, practices, and software for problem solving by model definition, analysis and simulation, and communication were essentially developed in engineering during the past decades. This paper first gives an analysis of the above main advances, which are tied to paradigm shifts and essential for PhD and other engineering-related research programs. Following this, a scenario for a proposed new style of research is introduced. This research relies upon methods and software tools offered by a high-end engineering modeling platform. Another main contribution of this paper is full research in a purposefully configured, autonomous, and reactive research centered experimental model (RCEM), which is continuously developed during the lifecycle of the research. The RCEM collects and organizes model and simulation representations which accommodate the results of the research. The research process is also controlled by contexts from industrial and other application environments. Essential representations in the RCEM are behaviors to achieve active knowledge and intelligent, computing-intensive solutions. A new concentration area in this paper is model representation for medical problem related solutions. The Virtual Research Laboratory (VRL) was founded to realize globally participated, collaborative-space-organized, high-level PhD research on the cloud utilizing a world-level engineering modeling platform. Communication with other collaborative spaces and outside-world sources and controls is also a main concern. Implementation and initial application of the results in this paper are done at the VRL.
|
|
08:45-09:00, Paper Tu-PS10-T5.4 |
LSTM-Based Motion Trajectory Prediction in a Perceiving-Acting Cycle System (I) |
|
Sekiguchi, Takuro | Tokyo Metropolitan University |
Obo, Takenori | Tokyo Metropolitan University |
Kubota, Naoyuki | Tokyo Metropolitan University |
Matsuda, Tadamitsu | Juntendo University |
Keywords: Deep Learning, Machine Learning, Knowledge Acquisition
Abstract: The aim of this study is to model the cognitive processes based on a perceiving-acting cycle in patients with unilateral spatial neglect (USN). USN is the inability to perceive features of the environment, body, or objects on one side. To extract the cognitive characteristics of USN patients in a multifaceted manner, we constructed a multimodal sensing system using immersive VR. In this paper, we present a system that predicts movement of a subject while performing a search task using the measurement results and an LSTM neural network.
|
|
Tu-PS10-T6 Special Session, Kahuku |
Results and Applications of Discrete Event System Models and Artificial Intelligence I |
|
|
Organizer: Fanti, Maria Pia | Polytechnic of Bari, Italy |
Organizer: Li, Zhiwu | Xidian University |
|
08:00-08:15, Paper Tu-PS10-T6.1 |
Collision Avoidance Strategy for Autonomous Intersection Management by a Central Optimizer Algorithm (I) |
|
Paparella, Francesco | Polytechnic University of Bari |
Volpe, Gaetano | Polytechnic University of Bari |
Mangini, Agostino Marcello | Polytechnic of Bari |
Fanti, Maria Pia | Polytechnic of Bari, Italy |
Keywords: Autonomous Vehicle, Intelligent Transportation Systems
Abstract: The increasing volume of traffic worldwide highlights the problem of ensuring driving safety and preventing collisions at unsignalized intersections. In this regard, with the advent of Connected Autonomous Vehicles (CAVs), Collision Avoidance (CA) and Autonomous Intersection Management (AIM) problems have been intensively studied, and many cooperative and optimization-based approaches have been proposed. In this paper, we introduce a double-level collision-free control system, performed by a Central Optimizer (CO) and local CAV controllers, that allows CAVs to safely cross the intersection at the same time. The CO collects data from CAVs and imposes waiting times at specific waypoints that are then used by the low-level controllers to regulate the vehicle speed. The main advantage of the presented method is that a less complex optimization problem is formulated by imposing waiting times rather than determining the speed profile. A simulation campaign conducted in MATLAB shows the performance of the proposed approach.
|
|
08:15-08:30, Paper Tu-PS10-T6.2 |
K-Protection of Global Secret in Discrete Event Systems Using Supervisor Control (I) |
|
Liu, Ruotian | Polytechnic University of Bari |
Duan, Wei | Xidian University |
Mangini, Agostino Marcello | Polytechnic of Bari |
Fanti, Maria Pia | Polytechnic of Bari, Italy |
Keywords: Discrete Event Systems
Abstract: This work addresses the security problem of protecting secrets in discrete-event systems modeled by deterministic finite automata. We first characterize a global secret that is composed of one or several states, in which each state is associated with a security level. In addition, we assume that the protected event labels must be recovered within a bounded number of consecutive protected events (called K-protection). The aim of this work is to design a K-protection event policy such that an event sequence from the initial state that reaches a secret state piece contains a number of protected events no less than the required level of security, and such that the protected secret pieces satisfy the protection setting value. To this end, a security automaton that integrates the information of the system state and its current security level is given, and then a K-protection automaton that lists all the possible protections of event sequences is established. Then, by using the supervisor control theory technique, a valid protecting policy to enforce the security requirement is obtained. Finally, examples are used to illustrate the proposed protection method.
|
|
08:30-08:45, Paper Tu-PS10-T6.3 |
Quantification of Reliable Detection Range Using Lidar-Based Object Detection Approaches and Varying Process Parameters |
|
Boschmann, Waldemar | Universität Duisburg |
Soeffker, Dirk | University of Duisburg-Essen |
Keywords: Adaptive Systems, Autonomous Vehicle, Decision Support Systems
Abstract: Highly automated or autonomous systems utilize machine learning-based approaches to perceive the environment and make decisions based on the obtained information. These approaches are highly dependent on the model used, the training data, and the environmental conditions. While high performance can be expected in certain and suitable situations, unknown or uncertain situations might lead to undetected underperformance. In the case of object detection, the reliability of a particular prediction is usually quantified by a detection score predicted by the trained model. While a higher score indicates higher confidence, it does not reflect the actual uncertainty of the prediction considering situational variations. In this paper, the predicted detection score is converted into a true-positive rate to decide whether to accept or reject a prediction. Furthermore, the detection rate is used as a function over distance to determine the dynamic performance expectations using situational knowledge and reliability requirements.
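The abstract does not state how the detection score is converted into a true-positive rate. One simple possibility, shown here purely as an assumed illustration, is to bin the scores of a labeled validation set and use the empirical fraction of correct detections in each bin. The function names, the equal-width binning, and the `required_tpr` threshold are all hypothetical.

```python
def fit_score_to_tpr(scores, is_true_positive, n_bins=10):
    """Estimate the empirical true-positive rate per detection-score bin.

    scores: detection scores in [0, 1] from a labeled validation set.
    is_true_positive: matching booleans (True if the detection was correct).
    Returns a list of per-bin rates; bins without data hold None.
    """
    hits = [0] * n_bins
    counts = [0] * n_bins
    for s, ok in zip(scores, is_true_positive):
        b = min(int(s * n_bins), n_bins - 1)  # a score of 1.0 falls in the last bin
        counts[b] += 1
        hits[b] += 1 if ok else 0
    return [h / c if c else None for h, c in zip(hits, counts)]


def accept(score, tpr_by_bin, required_tpr):
    """Accept a prediction only if its bin's estimated rate meets the requirement."""
    n_bins = len(tpr_by_bin)
    rate = tpr_by_bin[min(int(score * n_bins), n_bins - 1)]
    return rate is not None and rate >= required_tpr
```

Fitting one such table per distance range would give a rate that varies with distance, which is the kind of quantity the abstract describes using to derive a reliable detection range.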
|
|
Tu-PS10-T8 Special Session, Hawaii 3 |
Intelligent Learning in Control Systems |
|
|
Organizer: Tsai, Ching-Chih | National Chung Hsing University |
Organizer: Hwang, Kao-Shing | National Sun Yat-Sen University |
Organizer: Yu, Gwo-Ruey | National Chung Cheng University |
|
08:00-08:15, Paper Tu-PS10-T8.1 |
Data-Driven Safe Formation Control for Multi-Agent Systems and Its Applications in Multi-UGV Systems (I) |
|
Yan, Bing | The University of Adelaide |
Shi, Peng | University of Adelaide, Adelaide |
Zhang, Daotong | The University of Adelaide |
Yang, Yize | The University of Adelaide |
Keywords: Cooperative Systems and Control, Autonomous Vehicle, Control of Uncertain Systems
Abstract: In this paper, we propose a reinforcement learning-based safe control strategy for uncertain heterogeneous multi-agent systems. The objective is to achieve collision-free time-varying formations under switching topologies and with limited network resources. Without requiring global communication information, an event-triggered observer is designed to decouple the heterogeneous dynamics from switching networks and reduce data transmission frequency. A data-driven off-policy reinforcement learning algorithm is developed for addressing the robust safe formation control problem. The algorithm is capable of solving non-quadratic optimization problems without requiring model information. The proposed strategy is applied to a multi-unmanned ground vehicle (UGV) system for a patrolling mission, and the experimental results verified the effectiveness of the proposed strategy.
|
|
08:15-08:30, Paper Tu-PS10-T8.2 |
Data-Driven Vehicle Dynamic Model for Autonomous Vehicle Applications (I) |
|
Gregory, Jack | Deakin University |
Pappu, Mohammad Rokonuzzaman | Deakin University |
Mohajer, Navid | Deakin University |
Ghafarian, Mohammadali | Deakin University |
Keywords: Autonomous Vehicle, Modeling of Autonomous Systems, System Modeling and Control
Abstract: Vehicle Dynamics Models (VDMs) face a trade-off between accuracy and speed. More complex models can generate more accurate predictions of vehicle state but are computationally slower. Additionally, many VDMs rely on the explicit estimation of unknown parameters. To avoid these limitations, we propose a feed-forward Time Delay Neural Network (TDNN) which surpasses the physics-based VDMs in accuracy and speed without explicit estimation of unknown quantities. Notably, the TDNN model was able to accurately predict the vehicle state on various road surfaces despite having no knowledge of the tyre-road friction. The TDNN predicts the longitudinal and yaw accelerations of the vehicle with a Root Mean Square Error (RMSE) as low as 0.0387 rad/s².
|
|
08:30-08:45, Paper Tu-PS10-T8.3 |
Design and Implementation of Intuitive Human Robot Interface System by DDPG with HER and RCA (I) |
|
Yang, Jie-Yao | National Cheng Kung University |
Li, Tzuu-Hseng S. | National Cheng Kung University |
Keywords: Robotic Systems, Mechatronics, System Modeling and Control
Abstract: This paper presents an intuitive human-robot interface system (iHRIS), where the deep deterministic policy gradient (DDPG) with hindsight experience replay (HER) training method is proposed to accelerate model training. The system consists of a motion capture system and a motion learning network. The motion capture system includes an RGB-D camera and an operating glove. The position of the human operator's hand is estimated by Openpose using the RGB-D images, while the hand's posture is determined by the information captured by the glove. An inertial measurement unit (IMU) and a microprocessor are equipped on the glove. The IMU data is calibrated using the Recursive Least Squares (RLS) method and computed using Madgwick's algorithm. Based on the observed position and posture of the human operator's hand, a motion can be generated. This motion is then trained using the DDPG network with the Reverse Curriculum Generation (RCG) method. The network has an Actor-Critic structure and a replay experience buffer, which makes it more feasible and helps avoid overfitting. Furthermore, HER is integrated into the network to enhance convergence speed and performance. Finally, the experiments demonstrate that the proposed iHRIS enables real-time imitation of the human operator by the robot, and the imitated motion can be learned by the DDPG network.
|
|
Tu-PS10-T9 Special Session, Hawaii 4 |
Real-World Applications of Intelligent Systems |
|
|
Co-Chair: Pei, Yan | University of Aizu |
|
08:00-08:15, Paper Tu-PS10-T9.1 |
Fitness Landscape Approximation with Dimensionality Reduction Using Multi-Dimensional Scaling (I) |
|
Zhao, Ying | University of Aizu |
Yu, Jun | Niigata University |
Pei, Yan | University of Aizu |
Keywords: Evolutionary Computation, Computational Intelligence
Abstract: We propose a method that combines multi-dimensional scaling (MDS) with a regression model to approximate fitness landscapes and enhance the efficiency of evolutionary search algorithms. The approach involves projecting the original population onto a low-dimensional space using MDS, thereby preserving pairwise distances between data points. The optimal individual is determined using the approximated fitness landscape in the low-dimensional space and subsequently mapped back to the original space using our designed rules. Furthermore, several prediction points are generated in the proximity of the optimal individual to replace certain individuals. This iterative process is repeated multiple times to accelerate the search speed of evolutionary computation (EC). Experimental results demonstrate the superiority of our proposed method over the other two algorithms across most tested functions.
|
|
08:30-08:45, Paper Tu-PS10-T9.3 |
Autonomous Decision Making with Reinforcement Learning in Multi-UAV Air Combat (I) |
|
Feng, Xutao | Beihang University |
Ma, Yaofei | Beihang University |
Zhao, Liping | Beihang University |
Yang, Hanbo | National Key Laboratory of Modeling and Simulation for Complex S |
Keywords: AI and Applications, Agent-Based Modeling, Deep Learning
Abstract: A multi-agent decision network based on QMIX is proposed in this paper to cope with the coordination decision problem of multiple UAV air combat missions. To speed up the training process, three improvements are introduced: 1) An improved epsilon-decaying method that enables a tutor to help in action selection at the early stage of the training. This measure greatly improves the exploration efficiency when the network is far from being fully trained. 2) State pruning and action mask measures are applied during the training. The former improves the effectiveness of the input state information, and the latter reduces unnecessary action exploration. 3) A gradual training configuration is used to make the training process more robust, where the combat adversaries are configured as static targets, randomly maneuvering vehicles, and Min-Max strategy vehicles, respectively. Multi-UAV air combat scenarios are built and experiments are conducted. The results show that these improvements significantly improve training efficiency.
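The first two improvements can be pictured with a small action-selection sketch. This is an assumed illustration rather than the authors' implementation: the abstract does not specify how the tutor is consulted or how epsilon decays, so `tutor_action`, `tutor_prob`, and the linear decay schedule below are hypothetical choices.

```python
import random


def select_action(q_values, action_mask, epsilon, tutor_action=None,
                  tutor_prob=0.5, rng=random):
    """Epsilon-greedy selection restricted to the actions allowed by the mask.

    q_values: list of action values for one agent.
    action_mask: list of booleans, True where the action is currently allowed.
    tutor_action: optional suggestion from a hand-written tutor (hypothetical).
    """
    allowed = [a for a, ok in enumerate(action_mask) if ok]
    if not allowed:
        raise ValueError("action mask excludes every action")
    if rng.random() < epsilon:
        # Exploration step: sometimes follow the tutor instead of a random action.
        if tutor_action in allowed and rng.random() < tutor_prob:
            return tutor_action
        return rng.choice(allowed)
    # Exploitation step: best-valued action among the allowed ones.
    return max(allowed, key=lambda a: q_values[a])


def decayed_epsilon(step, start=1.0, end=0.05, decay_steps=10000):
    """Linear decay from start to end over decay_steps, then constant."""
    frac = min(step / decay_steps, 1.0)
    return start + frac * (end - start)
```

Because epsilon is large early in training, the tutor's suggestions are followed most often when the value estimates are least reliable, and the mask ensures that neither exploration nor exploitation ever picks a disallowed action.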
|
|
08:45-09:00, Paper Tu-PS10-T9.4 | Add to My Program |
Fine-Tuning Vision Transformer for Arabic Sign Language Video Recognition on Augmented Small-Scale Dataset (I) |
|
Gochoo, Munkhjargal | United Arab Emirates University |
Batnasan, Ganzorig | United Arab Emirates University |
Ahmed, Ahmed | United Arab Emirates University |
Otgonbold, Munkh-Erdene | United Arab Emirates University |
Shih, Timothy K. | National Central University |
Tan, Tan-Hsu | National Taipei University of Technology |
Alnajjar, Fady | United Arab Emirates University |
Lai, Khin wee | University Malaya |
Keywords: Deep Learning, Machine Vision, Transfer Learning
Abstract: With the rise of AI, the recognition of Sign Language (SL) through sign-to-text has gained significance in the field of computer vision and deep machine learning. However, there are only a few medium to large open datasets available for this task, as it requires a vast dataset of thousands of signs for words/phrases in different environments, which is a time-consuming and tedious process. Furthermore, there has been very little effort towards Arabic Sign Language Recognition (ArSLR). This research paper presents the results of fine-tuning the Vision Transformer (ViT) model on a small-scale in-house dataset of ArSL. The main goal is to attain satisfactory results by utilizing minimal computing power and a small dataset involving fewer than 10 individuals, with only one recording made for each sign in every environment. The dataset comprises 49 classes/signs, all of which were made with two hands and belong to the Level I category in terms of popularity. To enhance the dataset, three types of augmentation (translation, shear, and rotation) were employed. The ViT model, pre-trained on the Kinetics dataset, was trained on variations of the augmented dataset with 2 to 10 times as many samples as the original videos, where the training set includes solely augmented videos of 8 volunteers and the test set includes all original videos of one particular volunteer. Experimental results reveal that the combination of rotation and shear outperformed the others, achieving an accuracy of 85% on the 10 times augmented dataset. This study sheds light on small-scale dataset-based SLR tasks and video/action recognition in general.
|
|
Tu-PS20-T1 Regular Session, Hawaii 1 |
Add to My Program |
Human-Centered Learning |
|
|
|
10:45-11:00, Paper Tu-PS20-T1.1 | Add to My Program |
Dynamic Equilibrium-Based Continual Learning Model with Disentangled Meta-Features |
|
Zhang, Mingyi | Institute of Automation, Chinese Academy of Sciences |
Zhang, Junge | Institute of Automation, Chinese Academy of Sciences |
Keywords: Cognitive Computing, Human-centered Learning
Abstract: The field of artificial intelligence research has witnessed remarkable advancements in recent decades. However, conventional approaches in AI research primarily depend on fixed datasets and stationary settings, which have limited applicability to real-world scenarios. In order to address this limitation, there is an increasing need to develop and study algorithms and methods for continual learning, which enables artificial systems to learn from a continuous stream of data. One of the key challenges in continual learning is to strike a balance between transfer and interference, and to identify an equilibrium solution that can effectively learn from nonstationary data. This paper presents a canonical model that is specifically designed for continual learning, utilizing the derivative of the loss function to evaluate parameter changes between tasks and achieve dynamic equilibrium. Additionally, to improve the efficiency of limited training samples in continual tasks, a feature learning method based on meta-feature disentangling is proposed. By leveraging the second derivative term of the canonical model, the parameter vector can be decoupled and meta-features can be discovered. Experimental results demonstrate the superiority of the proposed method over state-of-the-art methods in continual lifelong supervised learning benchmarks. The validity of the proposed canonical model is further supported by these experimental results. As the demands of available settings become increasingly stringent, the advantages of disentangling meta-features become more prominent, resulting in a significant performance gap with other continual learning methods.
|
|
11:00-11:15, Paper Tu-PS20-T1.2 | Add to My Program |
Reconstructive Approach for Detection and Analysis of Hemiplegia in Human Walking Motion Based on Musculoskeletal Model |
|
Fatunmbi, Daniel Tunmise | Hokkaido University |
Tanaka, Takayuki | Hokkaido University |
Keywords: Medical Informatics, Human-centered Learning, Assistive Technology
Abstract: This study aims to explore the potential of a reconstructive approach in detecting and analyzing hemiplegia in human walking motion using a musculoskeletal model. Hemiplegia, a neurological disorder characterized by partial paralysis of one side of the body, is a common complication of stroke that can significantly impact its victim’s mobility and overall quality of life. We employed a combination of surface electromyography (sEMG) and motion capture data collection techniques to study healthy individuals’ muscle activation and kinematics both in normal conditions and while wearing a hemiplegia simulator. The collected data were analyzed using a non-negative matrix factorization (NMF) algorithm to decompose the muscle activation into distinct muscle synergies and associated weights. We then compared the number of muscle synergies used between the healthy and simulated hemiplegia conditions to identify changes in muscle coordination patterns. Using the information obtained from the comparison, an attempt was made to construct the synergies of the hemiplegia gait from the synergies of the healthy gait by adjusting the weights. The results of this study suggest that a reconstructive approach is a feasible tool for detecting and analyzing hemiplegia in a human walking motion. Furthermore, this study provides a framework for understanding the underlying muscle coordination patterns in hemiplegia and can serve as a basis for developing new rehabilitation strategies. This study is expected to contribute to the advancement of research in this field by providing insights into the neural control of human walking motion.
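The decomposition step mentioned in the abstract above can be illustrated with a small, self-contained non-negative matrix factorization. This is a sketch under assumptions rather than the authors' pipeline: it uses the standard Lee-Seung multiplicative updates, treats the sEMG envelope matrix as muscles by time samples, and the names `nmf` and `variance_accounted_for` are hypothetical. Choosing the number of synergies by a variance-accounted-for threshold is a common convention, not something the abstract states.

```python
import numpy as np


def nmf(V, n_synergies, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (muscles x time) as W @ H.

    W (muscles x n_synergies) holds the synergy vectors and
    H (n_synergies x time) holds their activation weights over time.
    Uses Lee-Seung multiplicative updates for the Frobenius-norm objective.
    """
    rng = np.random.default_rng(seed)
    n_muscles, n_samples = V.shape
    W = rng.random((n_muscles, n_synergies)) + eps
    H = rng.random((n_synergies, n_samples)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def variance_accounted_for(V, W, H):
    """Fraction of the signal energy reproduced by the factorization."""
    return 1.0 - np.sum((V - W @ H) ** 2) / np.sum(V ** 2)
```

Running `nmf` for increasing values of `n_synergies` on the healthy and simulated-hemiplegia recordings, and comparing where `variance_accounted_for` levels off, is one way the synergy counts of the two conditions could be compared.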
|
|
11:15-11:30, Paper Tu-PS20-T1.3 | Add to My Program |
SWDPM: A Social Welfare-Optimized Data Pricing Mechanism |
|
Yu, Yi | Shanghai Artificial Intelligence Laboratory |
Yao, Shengyue | Shanghai AI Laboratory |
Li, Juanjuan | Institute of Automation, Chinese Academy of Sciences |
Wang, Fei-Yue | Institute of Automation, Chinese Academy of Sciences |
Lin, Yilun | Shanghai Artificial Intelligence Laboratory |
Keywords: Multi-User Interaction, Human-centered Learning
Abstract: Data trading has been hindered by privacy concerns associated with user-owned data and the infinite reproducibility of data, making it challenging for data owners to retain exclusive rights over their data once it has been disclosed. Traditional data pricing models relied on uniform pricing or subscription-based models. However, with the development of Privacy-Preserving Computing techniques, the market can now protect privacy and complete transactions using progressively disclosed information, which creates a technical foundation for generating greater social welfare through data usage. In this study, we propose a novel approach to modeling multi-round data trading with progressively disclosed information using a matchmaking-based Markov Decision Process (MDP) and introduce a Social Welfare-optimized Data Pricing Mechanism (SWDPM) to find optimal pricing strategies. To the best of our knowledge, this is the first study to model multi-round data trading with progressively disclosed information. Numerical experiments demonstrate that the SWDPM can increase social welfare 3 times by up to 54% in trading feasibility, 43% in trading efficiency, and 25% in trading fairness by encouraging better matching of demand and price negotiation among traders.
|
|
Tu-PS20-T2 Regular Session, Hawaii 5 |
Add to My Program |
Best Student Paper Finalists |
|
|
|
10:45-11:00, Paper Tu-PS20-T2.1 | Add to My Program |
Automated Machine Learning for Remaining Useful Life Predictions (I) |
|
Zoeller, Marc-Andre | USU Software AG |
Mauthe, Fabian | Esslingen University of Applied Sciences |
Zeiler, Peter | Esslingen University of Applied Sciences |
Lindauer, Marius | Leibniz University Hannover |
Huber, Marco | University of Stuttgart |
Keywords: Machine Learning, Neural Networks and their Applications, Deep Learning
Abstract: Being able to predict the remaining useful life (RUL) of an engineering system is an important task in prognostics and health management. Recently, data-driven approaches to RUL prediction have become more prevalent than model-based approaches since no underlying physical knowledge of the engineering system is required. Yet, this just replaces required expertise of the underlying physics with machine learning (ML) expertise, which is often also not available. Automated machine learning (AutoML) promises to build end-to-end ML pipelines automatically, enabling domain experts without ML expertise to create their own models. This paper introduces AutoRUL, an AutoML-driven end-to-end approach for automatic RUL predictions. AutoRUL combines fine-tuned standard regression methods into an ensemble with high predictive power. By evaluating the proposed method on eight real-world and synthetic datasets against state-of-the-art hand-crafted models, we show that AutoML provides a viable alternative to hand-crafted data-driven RUL predictions. Consequently, creating RUL predictions can be made more accessible for domain experts using AutoML by eliminating ML expertise from data-driven model construction.
|
|
11:00-11:15, Paper Tu-PS20-T2.2 | Add to My Program |
Evaluation of User Interfaces for Cooperation between Driver and Automated Driving System |
|
Steckhan, Lorenz | Technical University of Munich |
Spiessl, Wolfgang | BMW Group |
Bengler, Klaus | Chair of Ergonomics, Technical University of Munich |
Keywords: Human-Centered Transportation, User Interface Design, Human-Machine Cooperation and Systems
Abstract: Cooperation between driver and vehicle automation can be utilized to overcome failures and problems of the automated system. Failures do not always indicate mandatory intervention scenarios but can be within the optional decision space and have a negative impact on user experience. In this paper, we evaluate one user interface (UI) based on a joystick and one based on a touchscreen that enable an increased degree of cooperation compared to state-of-the-art UIs in automated vehicles. Both UIs allow users to adjust parameters and initiate as well as abort maneuvers of the vehicle motion within the optional space of action. In a driving simulator experiment, participants experienced the new concepts in comparison to a state-of-the-art UI. Participants rated both newly developed concepts positively regarding usability and user experience. Overall, they preferred the novel concepts to the baseline interface. An objective evaluation of interaction issues and visual distraction revealed flaws that can be resolved by adapting the designs. Participants preferred an increased level of driver-vehicle cooperation (DVC) and indicated a high intention to make use of it for optional interventions. The indicated usage intention decreased with increasing automation level.
|
|
11:15-11:30, Paper Tu-PS20-T2.3 | Add to My Program |
Monocular Visual-Inertial System Based on Adaptive Scale Recovery of Structured Features |
|
Hsu, Chih-Ming | National Taipei University of Technology |
Ma, Chih-Han | National Taipei University of Technology |
Keywords: Robotic Systems, Autonomous Vehicle, Intelligent Transportation Systems
Abstract: With the development of technology, the automation industry has steadily advanced, and visual-inertial systems are increasingly widely applied. Self-driving cars and autonomous mobile robots often use a visual-inertial system as a state estimator, and the image information can also be used in the artificial intelligence part to provide more information about the robot. The combination of a monocular camera and an inertial sensor is lightweight and consumes little power. Monocular Visual-Inertial SLAM is a system that relies on monocular cameras and inertial sensors to perform localization and mapping. Monocular Visual-Inertial SLAM restores its vector to scale with the help of known extrinsic parameter information. However, compared with other methods, it is necessary to be careful in estimating the gyroscope bias of the inertial sensor. The bias and noise of the inertial sensor directly affect the results of pre-integration, as well as the accuracy of the visual-inertial scale recovery. This phenomenon is called scale uncertainty, which results in feature points with uncertain scale or depth, rapidly reducing the accuracy of pose estimation. This paper proposes a visual-inertial architecture based on Structure-Scale Adaptivity. In this study, the tightly coupled residual between the structure scale and the pose is designed to provide adaptivity according to the feature conditions, providing system structure and scale constraints, and thus suppressing the problem of scale uncertainty. The proposed method is compared with state-of-the-art algorithms and succeeds in scenarios where other algorithms have failed. It also obtains the best accuracy results in the overall test. Finally, we evaluate our proposed method on the VCU-RVI and KITTI datasets, demonstrating the proposed algorithm's effectiveness and feasibility for monocular visual-inertial SLAM.
|
|
11:30-11:45, Paper Tu-PS20-T2.4 | Add to My Program |
Shoe-Type-Force-Feedback Device and Falling Sensation with Two-Step Dropping |
|
Ishida, Yuki | Chuo University |
Sawahashi, Ryunosuke | Department of Precision Mechanics, Chuo University |
Nishihama, Rie | Chuo University |
Nakamura, Taro | Department of Precision Mechanics, Chuo University |
Keywords: Kansei (sense/emotion) Engineering, Virtual/Augmented/Mixed Reality, Entertainment Engineering
Abstract: In this study, a two-step dropping model was proposed to replicate a long-distance fall in a virtual reality (VR) space, while realizing only a short drop distance in reality. In a previous study, a two-step dropping model representing a falling motion was proposed and presented to humans in real space. This concept replicates the sensation of a long-distance fall by presenting the dropping motion to the wearer only at the start and landing of the fall in the VR space. A two-step dropping device was fabricated using an air cylinder. A two-step dropping motion was realized by combining the open/close control of the conduit with a direct-acting solenoid valve and pressure control with a proportional solenoid valve. In the sensitivity evaluation experiment, participants were presented with a combination of the device's operation and VR images. The evaluation scores for the sense of reality of falling in the two-step dropping condition tended to be higher than those in the conditions in which the device was not operated or only one-step dropping was presented. This confirmed that the two-step drop was an effective operation for experiencing a free-fall sensation.
|
|
11:45-12:00, Paper Tu-PS20-T2.5 | Add to My Program |
Robust State Estimation for Satellite Formations in the Presence of Unreliable Measurements |
|
Pedari, Yasaman | University of Vermont |
Waleed, Danial | University of Vermont |
Duffaut Espinosa, Luis Augusto | University of Vermont |
Ossareh, Hamid | University of Vermont |
Keywords: System Modeling and Control, Cooperative Systems and Control, Fault Monitoring and Diagnosis
Abstract: This paper presents a new framework for robust relative position and velocity estimation of satellite swarms in Low Earth Orbit (LEO) in the presence of statistical outliers in sensor measurements. The proposed framework is based on the Robust Generalized Maximum Likelihood Kalman Filter (RGMKF) and uses a minimal sensor configuration and inter-satellite communication. The satellites are assumed to be equipped with GPS receivers that obtain signals from high-altitude navigation satellites, as well as relative positioning sensors based on radar, cameras, or lasers. Measurements from the latter set of sensors will aid in keeping reasonable accuracy when GPS is unavailable or unreliable and improve estimation accuracy when GPS is available and reliable. The paper highlights that practical scenarios may present scaling challenges when applying RGMKF, which could be effectively addressed by properly reordering data in the measurement matrix. The proposed framework and the outlier detection scheme are evaluated through simulations demonstrating their effectiveness in addressing challenges in state estimation of satellite swarms with sensor outliers.
|
|
Tu-PS20-T3 Regular Session, Hawaii 6 |
Add to My Program |
Cybernetics General II |
|
|
|
10:45-11:00, Paper Tu-PS20-T3.1 | Add to My Program |
An Analysis of Evolutionary Migration Models for Multi-Objective, Multi-Fidelity AutoML |
|
Campero Jurado, Israel | Eindhoven University of Technology |
Vanschoren, Joaquin | Eindhoven University of Technology |
Keywords: Machine Learning, Metaheuristic Algorithms, Evolutionary Computation
Abstract: Methods have been proposed to maximize the efficiency of Automated Machine Learning (AutoML) systems while simplifying the search for solutions. Multi-fidelity approaches have been shown as an alternative to achieve this since they are straightforward to implement and reduce the computational cost when modeling large datasets. However, they are not suited to shifting data distributions, and they remove configurations too fast. Therefore, they must be combined with other techniques. Combining multi-fidelity methods with evolutionary algorithms and island models helps to adapt better and can contribute to maintaining diversity. This paper presents a comparative analysis of 10 network topologies distributed over a generalized island model for AutoML. A dynamic migration model with multi-objective and multi-fidelity evaluation is proposed to reduce the complexity of the tasks. This proposal is compared against state-of-the-art AutoML frameworks. It was found that Hypercube, Grid 2-dim, and Grid 3-dim topologies have the best performance as they maintain a balance in the number of connections. Furthermore, these topologies were shown to be competitive against other frameworks in the state of the art.
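To make the topology comparison in the abstract above more concrete, the sketch below computes migration neighbors for two of the named layouts. It is an illustration only: the island numbering, the function names `hypercube_neighbors` and `grid2d_neighbors`, and the choice of a 2-dimensional grid without wrap-around are assumptions, since the abstract does not say how the paper defines its topologies.

```python
def hypercube_neighbors(island, n_dims):
    """Islands are numbered 0 .. 2**n_dims - 1; neighbors differ in one bit."""
    return [island ^ (1 << bit) for bit in range(n_dims)]


def grid2d_neighbors(island, rows, cols):
    """4-connected neighbors on a rows x cols grid without wrap-around."""
    r, c = divmod(island, cols)
    candidates = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [rr * cols + cc for rr, cc in candidates
            if 0 <= rr < rows and 0 <= cc < cols]
```

In a hypercube every island has exactly `n_dims` neighbors, while interior grid islands have four and corner islands two, which is one way to read the abstract's remark about balance in the number of connections.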
|
|
11:15-11:30, Paper Tu-PS20-T3.3 | Add to My Program |
Data Efficient Incremental Learning Via Attentive Knowledge Replay |
|
Lee, Yi-Lun | National Yang Ming Chiao Tung University |
Chen, Dian-Shan | National Yang Ming Chiao Tung University |
Lee, Chen-Yu | Google |
Tsai, Yi-Hsuan | Google |
Chiu, Wei-Chen | National Yang Ming Chiao Tung University |
Keywords: Machine Vision, Deep Learning, Representation Learning
Abstract: Class-incremental learning (CIL) tackles the problem of continuously optimizing a classification model to support a growing number of classes, where the data of novel classes arrive in streams. Recent works propose to use representative exemplars of learnt classes, and replay the knowledge of them afterward under certain memory constraints. However, training on a fixed set of exemplars with an imbalanced proportion to the new data leads to strong biases in the trained models. In this paper, we propose an attentive knowledge replay framework to refresh the knowledge of previously learnt classes during incremental learning, which generates virtual training samples by blending between pairs of data. In particular, we design an attention module that learns to predict the adaptive blending weights in accordance with their relative importance to the overall objective, where the importance is derived from the change of the image features over incremental phases. Our strategy of attentive knowledge replay encourages the model to learn smoother decision boundaries and thus improves its generalization beyond memorizing the exemplars. We validate our design in a standard class-incremental learning setup and demonstrate its flexibility in various settings.
|
|
11:30-11:45, Paper Tu-PS20-T3.4 | Add to My Program |
Prompt-Assisted Relation Fusion in Knowledge Graph Acquisition |
|
Jing, Xiaonan | Purdue University |
Rayz, Julia | Purdue University |
Keywords: Knowledge Acquisition, Expert and Knowledge-Based Systems
Abstract: This paper investigated how prompt-based learning techniques can assist with relation fusion in Knowledge Graph (KG) acquisition. We created an unsupervised framework to generate a KG from a real-world dataset. The framework incorporates prompting with knowledge entity metadata and generating predicate embeddings with the pretrained Masked Language Model (MLM) RoBERTa. Predicate embeddings were clustered to form conceptual groups and feature tokens were used to derive relation labels. In addition, we conducted a comparative study on the effects of different prompting templates. The resulting relation labels were evaluated by human annotators, which indicated that prompt-based learning, if applied appropriately, can help with deducing conceptualized relations. Our framework proposed a way to improve the quality of KGs acquired using traditional Relation Extraction (RE). It can also assist human experts effectively in semi-automated knowledge acquisition.
|
|
11:45-12:00, Paper Tu-PS20-T3.5 | Add to My Program |
An Innovative Quantum-Inspired Hybrid Strategy and In-Depth Analysis of Cross-Market Portfolio Optimization |
|
Jiang, Yu-Chi | National Taiwan University, Academia Sinica |
Lai, Yun-Ting | National Chi Nan University |
Chen, Po-Chun | National Chi Nan University |
Wu, Kun-Min | National Chi Nan University |
Chang, Yu-Yu | National Chi Nan University |
Tong, Yong Feng | National Chi Nan University |
Kuo, Shu-Yu | National Taiwan University |
Chou, Yao-Hsin | National Chi Nan University |
Keywords: Quantum Cybernetics, Expert and Knowledge-Based Systems, Metaheuristic Algorithms
Abstract: With the growing popularity of online trading platforms, investors can consider investing in the global market without regard to their geographical location. This study analyzes the investment performance of Group of Seven (G7) members, who greatly influence the global economy and stock market. Regardless of the market in which they invest, investors seek long-term and stable gains in the stock market; therefore, investing in a risk diversification option is paramount. By constructing a portfolio, investment risk can be decreased. The relationship between return and risk is crucial. In this study, the proposed intelligent investing system with a novel trend ratio indicator aids in constructing an efficient portfolio. In a volatile stock market, the daily ups and downs are all profit opportunities; therefore, the proposed intelligent system constructs long-selling and short-selling portfolios. This study expands the investment discussion market to the G7 market and combines trading strategies into an innovative hybrid trading strategy to evaluate the G7 market's overall performance. As the trading strategies and markets under consideration are more inclusive, the proposed quantum-inspired algorithm with the entanglement technique can address the vast portfolio construction solution space. This study generalizes trend ratios in long-selling and short-selling portfolios and G7 markets. The experiment's result reveals that the hybrid strategy can increase portfolio profitability and efficiency.
|
|
Tu-PS20-T4 Regular Session, Hawaii 2 |
Add to My Program |
AI and Applications II |
|
|
|
10:45-11:00, Paper Tu-PS20-T4.1 | Add to My Program |
Application of a Dual-Stage Deep Learning Framework to Detect Left Atrial Enlargement for Pet Heart Failure |
|
Oh, Junyoung | Chungbuk National University |
Lee, In-Gyu | Chungbuk National University |
Chang, Hyun-Ho | Animal Hospital DOCTORS |
Lee, Euijong | Chungbuk National University |
Jeong, Ji-Hoon | Chungbuk National University |
Keywords: AI and Applications, Image Processing and Pattern Recognition, Deep Learning
Abstract: Artificial intelligence (AI) has transformed medical diagnosis and improved quality of life. In veterinary medicine, however, its use has been limited by the scarcity of training data and the difficulty of obtaining high-quality data. In this study, we propose a framework for diagnosing left atrial enlargement in dogs using AI techniques. Our framework involves generating X-ray image data and utilizing the UNet model for segmentation. The results of our experiments show excellent performance, with a mean Dice score of 0.9186 for segmentation. The highest classification accuracy was achieved in trial 1 for normal and overall cases, with 0.9200 and 0.8478, respectively, while trial 2 had the highest abnormal heart classification accuracy of 0.8095. Our findings indicate that generating data and training the model with a certain percentage of the generated data can lead to high classification accuracy. We conclude that the proposed framework has the potential for clinical application in veterinary medicine.
|
|
11:00-11:15, Paper Tu-PS20-T4.2 | Add to My Program |
Learning Resume Embeddings with Search Data and Transformers |
|
Hourany, Jonathan | Hired |
Zira, Aaron | Hired |
Avas, Ignacio | Hired |
Thiebaut, Nicolas | Hired |
Keywords: AI and Applications, Deep Learning, Representation Learning
Abstract: In typical search engines, users are exposed to an interface that combines free-form textual fields and hard filters. After the initial query, users implicitly express their preferences through clicks, purchases, or connection requests. In this paper, we propose leveraging implicit user feedback to learn similarity metrics between search results that receive attention together. The learned similarity model can notably be used to serve recommendations. It captures a form of latent similarity that is not known to the search engine but learned from implicit user feedback. We introduce a method to create a dataset suited for similarity learning from search data. Using this dataset, we use contrastive learning and train a similarity model that outputs large scores for pairs of results that tend to receive the same attention from users and low scores otherwise. Our experiments show that BERT models and their siamese equivalents (Sentence BERT) produce meaningful similarity metrics when fine-tuned on the dataset built from search data.
|
|
11:15-11:30, Paper Tu-PS20-T4.3 | Add to My Program |
Image Generation with Diffusion Model by Interactive Evolutionary Computation |
|
Kobayashi, Haruka | University of Tokyo |
Pindur, Adam Kotaro | The University of Tokyo |
Nagar Anthel Venkatesh, Suryanarayanan | The University of Tokyo |
Iba, Hitoshi | University of Tokyo |
Keywords: AI and Applications, Deep Learning, Evolutionary Computation
Abstract: Text-to-image generation using deep learning based models has become a popular research topic, allowing users to generate custom artworks from specified text input. However, generating suitable prompts that produce creative and desirable images remains a significant challenge. To address this challenge, we propose a novel method that incorporates interactive evolutionary computation (IEC) to evolve the latent array. By integrating human perception into the system, our approach enables users to search for and generate images that align with their desired specifications through interaction with the system. We demonstrate the effectiveness of our proposed method in generating images that align with the users' mental images from an initial image using genetic algorithms through a series of experiments. Furthermore, the results from our user studies show that our proposed method enables users to generate images that match their desired mental images with less effort and in less time compared to conventional generation methods. Overall, this study contributes to the field of text-to-image generation by introducing a human-in-the-loop approach that enhances user control and specificity in the image generation process.
|
|
Tu-PS20-T6 Special Session, Kahuku |
Add to My Program |
Discrete Event System Models and Intelligent Learning |
|
|
Organizer: Fanti, Maria Pia | Polytechnic of Bari, Italy |
Organizer: Li, Zhiwu | Xidian University |
|
10:45-11:00, Paper Tu-PS20-T6.1 | Add to My Program |
Terminal Iterative Learning Control for an Electrical Powertrain System with Backlash |
|
Kim, ByungJun | KAIST |
Choi, Seibum | KAIST |
Keywords: Electric Vehicles and Electric Vehicle Supply Equipment, System Modeling and Control, Intelligent Transportation Systems
Abstract: This paper proposes a backlash control algorithm using Terminal Iterative Learning Control (TILC) in the angle domain. The proposed method addresses the issue of control time variation in the Iterative Learning Control (ILC) method, which makes it impractical for vehicle control. By controlling backlash in the angle domain, the control interval remains the same for each iteration. The backlash impact is proportional to the velocity at the end of the backlash mode. However, by bringing the reference value nearer to zero, the impact was mitigated. Additionally, the utilization of TILC enhances its resilience to sensor noise. The proposed method is evaluated through simulations and experimental results, demonstrating its practical applicability to vehicles and high accuracy in various initial conditions. This paper provides a novel approach to backlash control in vehicle systems, contributing to the advancement of control methods for improved ride comfort and safety.
|
|
11:00-11:15, Paper Tu-PS20-T6.2 | Add to My Program |
Using Knowledge Awareness to Improve Safety of Autonomous Driving |
|
Calvagna, Andrea | University of Catania |
Ghosh, Arabinda | Max Planck Institute for Software Systems |
Soudjani, Sadegh | Newcastle University |
Keywords: Autonomous Vehicle, Intelligent Transportation Systems
Abstract: We present a method that incorporates knowledge awareness into the symbolic computation of discrete controllers for reactive cyber-physical systems, to improve decision making about the unknown operating environment under uncertain/incomplete inputs. Assuming an abstract model of the system and the environment, we translate the knowledge awareness of the operating context into linear temporal logic formulas and incorporate them into the system specifications to synthesize a controller. The knowledge base is built upon an ontology model of the environment objects and behavioural rules, which also includes symbolic models of partial input features. The resulting symbolic controller supports smoother, earlier reactions, which improves the security of the system over existing approaches based on incremental symbolic perception. A motion planning case study for an autonomous vehicle has been implemented to validate the approach, and the presented results show significant safety improvements over state-of-the-art symbolic controllers for reactive systems.
|
|
11:15-11:30, Paper Tu-PS20-T6.3 | Add to My Program |
A Modified VGG Architecture for Brain Tumor Classification (I) |
|
Dénes-Fazakas, Lehel | Óbuda University |
Kovacs, Levente | Óbuda University |
Eigner, György | Óbuda University |
Szilágyi, László | Óbuda University |
Keywords: Image Processing and Pattern Recognition, Neural Networks and their Applications, AI and Applications
Abstract: Automated brain tumor classification is an intensively investigated problem, which recently attracted significant attention. Convolutional neural networks (CNN) and deep learning represent the standard for the foundation of any recent solution. This paper proposes two simplified VGG architectures and investigates their capabilities and limitations, in comparison with state-of-the-art CNN networks deployed via transfer learning. Various parameter settings are involved in the evaluation process, including different kernel sizes, dropout rules, loss functions, etc. Networks are trained and tested on a public brain tumor classification data set consisting of 3064 images and three tumor classes (meningioma, glioma and pituitary tumor). The thorough evaluation process revealed that the proposed CNN models can achieve competitive performances with regard to state-of-the-art methods in several scenarios. The best achieved accuracy benchmarks are 98.2% overall Dice similarity score and correct decision rate, and AUC values over 99.6% for each of the three tumor classes.
|
|
11:30-11:45, Paper Tu-PS20-T6.4 | Add to My Program |
Output Recurrent Fuzzy Neural LSTM-BLS Controller for Nonlinear Digital Time-Delay Dynamic Systems (I) |
|
Rospawan, Ali | National Chung Hsing University |
Tsai, Ching-Chih | National Chung Hsing University |
Hung, Chi-Chih | National Chung Hsing University |
Keywords: System Modeling and Control, Control of Uncertain Systems, Adaptive Systems
Abstract: In this paper, a novel control architecture is presented by integrating an Output Recurrent Fuzzy Neural Long Short-Term Memory (ORFNLSTM) and a Broad Learning System (BLS) for a class of single-input-single-output (SISO) nonlinear dynamic systems. This new controller, abbreviated as the ORFNLSTM-BLS controller, is especially proposed by combining the techniques of deep learning and broad learning method to establish an adaptive intelligent controller with an online deepest gradient descent learning algorithm to online update its weights. The ORFNLSTM-BLS controller aims to improve the performance of the ORFBLS controller by incorporating the memory of LSTM to handle time-series data more effectively. A sufficient condition of the proposed controller is established to accomplish its uniformly asymptotical stability. The effectiveness and superiority of the ORFNLSTM-BLS controller are well exemplified by carrying out one comparative simulation in comparison with a fixed-gain proportional-integral-derivative (PID) controller, an adaptive predictive PID controller augmented with ORFBLS (ORFBLS-APPID), and an existing ORFBLS controller in terms of two types of control performance indexes: the overall performance indexes and transient state performance indexes. The results show that the proposed ORFNLSTM-BLS controller outperforms the three existing control methods. The developed techniques would provide useful references for professionals working in the fields of process and servomechanism control.
|
|
11:45-12:00, Paper Tu-PS20-T6.5 | Add to My Program |
Traffic Risk Assessment from Driving Scene Images (I) |
|
Wu, Wun-Jia | National Chung Cheng University |
Lin, Huei-Yung | National Taipei University of Technology |
Keywords: Intelligent Transportation Systems
Abstract: Different applications for advanced driver assistance systems (ADAS) have been increasing rapidly. With the advances of machine learning techniques, traffic risk assessment is becoming possible by understanding the complex driving scenes. This paper presents a technique to classify possible risks in four levels using deep neural networks. The important regions in road scene images are first predicted, followed by the risk assessment using a series of weighted RGB and optical flow images obtained from the proposed network model. To achieve better understanding of driving scenes, driver visual attention is incorporated to enhance the feature extraction. An LSTM model is then employed to learn the temporal information from road scene video clips. In addition to the experiments using the public TRA dataset, we also present a new LTDR dataset for performance evaluation. Compared with existing techniques, the result has demonstrated the effectiveness of the proposed method.
|
|
Tu-PS20-T7 Special Session, Oahu |
Add to My Program |
IoT and AI-Driven Sustainable Manufacturing Systems |
|
|
Organizer: Zhou, Mengchu | New Jersey Institute of Technology |
Organizer: Li, Zhiwu | Xidian University |
Organizer: Wisniewski, Remigiusz | University of Zielona Gora |
Organizer: Bazydło, Grzegorz | University of Zielona Gora |
|
10:45-11:00, Paper Tu-PS20-T7.1 | Add to My Program |
Design Mechanisms to Improve Data Packet Forwarding Efficiency in BT Mesh Networks |
|
Chuang, Yue-Ru | Fu Jen Catholic University |
Lin, Yi-Dong | Fu Jen Catholic University |
Lin, Yu-Chieh | Fu Jen Catholic University |
Keywords: Communications, Smart Sensor Networks, Technology Assessment
Abstract: Based on IoT network security considerations, the BT mesh specification defines two encryptions for each data packet. One of the encryptions is in the Network layer: it uses NetKey to separately encrypt DST (destination address) and Lower Transport PDU with AES encryption, and obfuscates SEQ (sequence number), SRC (source address), and TTL (time to live). However, the security mechanism may increase the packet processing overhead of all relay nodes in mesh networks and reduce packet forwarding efficiency. This paper designs BT mesh routing, tag switching, and CRC-24 chained computing mechanisms that operate concurrently to avoid the packet processing overhead caused by the security mechanism and to improve the forwarding efficiency of data packets at all relay nodes.
|
|
11:15-11:30, Paper Tu-PS20-T7.3 | Add to My Program |
Conceptualization and Preliminary Development of Statistical Digital Twin and Cyber-Thermophysical System for Advanced Analysis, Monitoring, and Control of the Laser Remelting Process |
|
Bordatchev, Evgueni | National Research Council of Canada |
Cvijanovic, Srdjan | Western University |
Wu, Honghe | Western University |
Gorski, Adam | Western University |
Beyfuss, Daniel | Western University |
Tutunea-Fatan, Remus O. | Western University |
Keywords: Cyber-physical systems, Digital Twin, System Modeling and Control
Abstract: Laser remelting (LRM) is one of very few universal technologies (i.e., no material removal and no material addition) used in a wide range of manufacturing applications, spanning from surface polishing to functional structuring. Given that LRM is a nonlinear, non-stationary thermodynamic process, its analysis, monitoring, control, and optimization require a comprehensive and fundamental understanding of multiple interlinked laser-material interaction phenomena. Such an understanding is derived from on-line information gained from measurements made during the process by various sensors. Toward this end, this study presents a concept and recent achievement in the development of a statistical digital twin and cyber-thermophysical system towards their use to stabilize, control, and optimize the process. In particular, the digital twin statistically describes the transformation of the initial surface into the LRM topography in terms of thermodynamic transfer functions of remelting and resolidification of surface topography and bulk material. In addition, a cyber-thermophysical system was developed interconnecting the thermodynamic heat-transfer model with the laser beam position, thermal-emission distribution, and overall temperature of the laser-material interaction zone measured and synchronized on-line in space-time position coordinates. The efficiency of the developed cyber-thermophysical system was demonstrated by analyzing the surface formation and process thermodynamics during LRM of H13 tool steel. The preliminary results open new research directions in thermophysics-supported advanced analysis, monitoring, and optimization of LRM, including the implementation of artificial-intelligence methods.
|
|
11:30-11:45, Paper Tu-PS20-T7.4 | Add to My Program |
Add-If-Silent Rule-Based Growing Neural Gas with Amount of Movement for High-Density Topological Structure Generation of Dynamic Object |
|
Shoji, Masaya | ROBOTIS Co., Ltd. / AIIT / Tokyo Metropolitan University |
Obo, Takenori | Tokyo Metropolitan University |
Kubota, Naoyuki | Tokyo Metropolitan University |
Keywords: Computational Intelligence, Neural Networks and their Applications, Hybrid Models of Computational Intelligence
Abstract: In order to realize a super-smart society (Society 5.0) where humans and robots coexist, there is a need for a perceptual system that can recognize the environment quickly and flexibly in an environment that changes from moment to moment. In an unknown environment, the characteristics of objects cannot be known in advance, and thus prior learning-based recognition methods such as deep reinforcement learning may not be able to cope with this situation. In this study, we construct a 3D topological map of the environment in real time using Growing Neural Gas (GNG), which can learn 3D topological structures even for unlearned objects. However, conventional GNG cannot generate nodes with high density for distant objects and cannot identify whether an unknown object is static or dynamic. Therefore, by directly adding useful input data as a new node (reference vector) based on the object category labels of the winner nodes (nearest nodes) to the input vector (3D point cloud), it is possible to generate high-density topological structures even for distant objects. We propose the Add-if-Silent rule-based GNG with Amount of Movement (AiS-GNG-AM), which can distinguish between static and dynamic objects based on the past amount of movement of a node. The effectiveness of the proposed method is verified through experiments using a 3D dynamics simulator.
|
|
11:45-12:00, Paper Tu-PS20-T7.5 | Add to My Program |
A New Data Structure - Satellite List for Solving Colored Traveling Salesman Problems (I) |
|
Duan, Yaxing | Southeast University |
Li, Jun | Southeast University |
Keywords: Decision Support Systems, Discrete Event Systems, Large-Scale System of Systems
Abstract: This paper presents an improved Delaunay-Triangulation-based Variable Neighborhood Search (DVNS) using a new solution data structure, Satellite List (SL), to solve large-scale colored traveling salesman problem instances. SL encoding can reduce the time complexity of three basic operators in DVNS, i.e., insertion, swap, and flip to O(1), dramatically promoting the iteration efficiency of DVNS. Extensive experiments are conducted to validate the improved DVNS by comparing the presented method with the state-of-the-art algorithm DVNS. The results show that the improved DVNS outperforms the original in algorithm convergence, iteration efficiency, and solution quality.
|
|
Tu-PS30W Workshop Session, Puna |
Add to My Program |
Emerging Topics and Applications of BMIs |
|
|
|
13:00-13:15, Paper Tu-PS30W.1 | Add to My Program |
Adversarial Stimuli: Attacking Brain-Computer Interfaces Via Perturbed Sensory Events (I) |
|
Upadhayay, Bibek | University of New Haven |
Vahid, Behzadan | University of New Haven |
Keywords: Brain-Computer Interfaces, Systems Safety and Security, Human-Machine Interaction
Abstract: Machine learning models are known to be vulnerable to adversarial perturbations in the input domain, causing incorrect predictions. Inspired by this phenomenon, we explore the feasibility of manipulating EEG-based Motor Imagery (MI) Brain Computer Interfaces (BCIs) via perturbations in sensory stimuli. Similar to adversarial examples, these adversarial stimuli aim to exploit the limitations of the integrated brain-sensor-processing components of the BCI system in handling shifts in participants' response to changes in sensory stimuli. In this paper, we first define adversarial stimuli and enumerate the characteristics of such stimuli. Second, we study the feasibility of adversarial stimuli as an attack vector in a series of human subject experiments, and report the findings on the impact of visual adversarial stimuli in the form of random partial flickers on the MI task of playing a video game. Our findings suggest that adversarial stimuli can significantly deteriorate the performance of MI BCIs across all participants, and also that such attacks are more effective in conditions with induced stress. Furthermore, we provide a preliminary analysis on the underlying dynamics of adversarial stimuli attacks by investigating the variations in Alpha, Beta, and Gamma bands. Lastly, we present a discussion on the impact of our findings, and enumerate directions of further research in this area.
|
|
13:15-13:30, Paper Tu-PS30W.2 | Add to My Program |
Towards Privacy Preserving BCIs: Profiling the Feasibility of Federated Learning for Motor Imagery Brain-Computer Interfaces (I) |
|
Floreani, Erica | University of Toronto |
Chau, Tom | University of Toronto |
Keywords: BMI Emerging Applications, Active BMIs, Passive BMIs
Abstract: Brain-computer interfaces (BCIs) are positioned to help individuals with physical disabilities, yet training data for these systems are time-consuming and expensive to collect. BCIs could thus benefit from cross-user data sharing, but as neurophysiological signals used in BCIs are personal health information, the protection of user privacy and data sovereignty will be of the utmost importance for end-user BCI applications. Federated learning is a novel privacy-preserving machine learning technique that decentralizes training, leaving end-user data on the device/institution of origin. In this work, we explored the feasibility of federated learning for BCIs, profiling convergence and performance for two federated learning algorithms (FedAvg and FedDC) on both identically and independently distributed (IID) and non-IID partitions of data. Federated learning for a 4-class motor imagery BCI decoding task was found to be feasible, although it came at a cost of reduced performance (longer convergence rate and reduced accuracy). The FedDC algorithm, which introduces drift correction for heterogeneous data, clearly outperformed the FedAvg algorithm. Larger datasets with more subjects and data per subject would be beneficial for further investigations, and future work should explore federated transfer learning techniques for combating inter-subject data heterogeneity and improving global model performance.
|
|
13:45-14:00, Paper Tu-PS30W.4 | Add to My Program |
Brain-eNet: Towards an Enabling Technology for BCI-IoT Systems |
|
Gonzalez-Espana, Juan Jose | University of Houston |
Sánchez-Rodríguez, Lianne | University of Houston |
Craik, Alexander | University of Houston |
Wong, Sarah | University of Houston |
Feng, Jeff | University of Houston |
Contreras-Vidal, Jose | University of Houston |
Keywords: BMI Emerging Applications, Other Neurotechnology and Brain-Related Topics, Passive BMIs
Abstract: Brain-Computer Interface (BCI) and Internet of Things (IoT) systems have recently been amalgamated to create BCIoT. Most of the early applications have focused on the healthcare sector and, more recently, on education, virtual reality, smart homes, and smart vehicles, amongst others. While there are many transversal development stages that can be satisfied by a single system, no common enabling technology or standards exist. These challenges are addressed in the proposed platform, Brain-eNet. This technology was developed considering the constraint space defined by BCIoT real-time mobile applications. It is expected to enable the development of BCIoT systems by providing modular hardware and software resources. Two instances of this platform implementation are provided: a motor intent detection system for rehabilitation and an emotion recognition system.
|
|
Tu-PS30W1 Workshop Session, Hilo |
Add to My Program |
Paper Talks 2 |
|
|
Organizer: Stoica, Adrian | NASA Jet Propulsion Laboratory |
|
13:00-13:15, Paper Tu-PS30W1.1 | Add to My Program |
Challenges and Advancements in Telepresence Frameworks: From Medicine to Autonomous Systems (I) |
|
Mohsenzadeh Kebria, Parham | Deakin University |
Nahavandi, Saeid | Swinburne University of Technology |
Homaifar, Abdollah | North Carolina A&T State University |
Keywords: Telepresence, Human-Machine Interaction, Human-Machine Cooperation and Systems
Abstract: Telepresence is the technology that allows human beings to remotely monitor, supervise, and control a wide variety of tasks. These tasks range from medical examinations to space discovery. Teleoperation, as the underlying foundation of telepresence, saves money and lives by replacing and relocating human elements from dangerous or harmful environments such as factories’ production lines, mines, ports, logistic centres, and so on. Teleoperation also finds its place in autonomous systems, including training an autonomous mobile robot via imitation learning (learning from demonstrations) and human-in-the-loop performance evaluation. On the other hand, this technology can improve life quality by offering healthcare services to remote and rural areas, where access to clinical experts is dangerously limited. However, teleoperation frameworks generally suffer several threats and challenges, including time delays, uncertainties, safety, and cyber-attacks. Time delay is one of the most critical factors that should be taken into account in the design and development of teleoperation systems of any kind. Transmission of command/control and sensory information signals through a communication medium between the two main sides of a teleoperation platform, being the operator and teleoperator, is critically vulnerable to latency and uncertainties. Delays in those transmissions will negatively affect the stability and performance of the teleoperation task. Moreover, designing a safe and secure teleoperation framework is another challenge considering the topology of the network on which the teleoperation is being conducted. 
In the case of remotely commanding an automated vehicle, it is down to the teleoperation strategy to determine the safety and feasibility of the desired path and manoeuvring before driving it; in clinical and telehealth applications, cyber-attacks can maliciously steal and manipulate patients’ data or jeopardize the clinical teleoperation (remote surgery, medical imaging, rehabilitation, etc.). In all these cases, a robust and reliable teleoperation strategy is a must to guarantee the stable, safe, and effective performance of the desired tasks in a remote environment.
|
|
13:15-13:30, Paper Tu-PS30W1.2 | Add to My Program |
Prototype of Augmented Avatar with Mobile Smart Device Operation (I) |
|
Haruna, Masaki | Mitsubishi Electric Corporation |
Ogino, Masaki | Kansai University |
Tagashira, Shigeaki | Kansai University |
Morita, Susumu | Mitsubishi Electric |
Keywords: Human-Machine Interaction, User Interface Design, Telepresence
Abstract: The world population exceeded 8 billion by 2020. On the other hand, there are concerns about a shortage of labor in many developed countries due to population decline driven by a falling birthrate and an aging population. AVATAR technology is expected to solve the social problem of uneven population distribution by enabling everyone to live and work where and with whom they want. There are many functional avatars in the world. Because they share the human body structure, humanoid avatars are suitable for taking over human tasks, especially when they are remotely controlled by operators wearing HMDs to perform a variety of tasks. However, humanoid robots are currently too complex and expensive, and wearing HMDs may hinder the spread of tele-operation services. We at LAST MILE aim to realize a tele-operation service that enables anyone to live and work where they want and with whom they want. In order to popularize the service, it is important to develop flexible design technology that enables low cost and interface technology that can be operated by mobile smart devices. In this paper, we introduce two main strategies developed to make this service come true. i. The first is the proposed Augmented Avatar technology, which can be flexibly designed for each service by separating the "mind" and the "body" of the avatar. By selecting modularly designed communication avatars and manipulation avatars in accordance with the service, the combined avatar, which we named “Augmented Avatar”, can communicate with people and manipulate objects in the surrounding area at a distance. ii. The second is a mobile smart device interface technology that operates our newly prototyped Augmented Avatar. It lets an operator communicate and manipulate remotely using only a smartphone.
|
|
13:30-13:45, Paper Tu-PS30W1.3 | Add to My Program |
Passivity Control with Energy Reflection and Haptic Data Reduction for Delayed Robust Bilateral Teleoperation (I) |
|
Singh, Harsimran | DLR |
Michael, Panzirsch | DLR |
Hulin, Thomas | Deutsches Zentrum für Luft- und Raumfahrt |
Xu, Xiao | Technical University of Munich |
Steinbach, Eckehard | Technical University of Munich |
Albu-Schäffer, Alin | DLR - German Aerospace Center |
|
|
13:45-14:00, Paper Tu-PS30W1.5 | Add to My Program |
IEEE Telepresence Roadmap: Current Status and Call for Participation (I) |
|
van Erp, Jan | University of Twente |
Takayama, Leila | Hoku Labs and Robust.AI |
Fong, Terry | NASA |
Lee, Johnny | Google |
Mason, Julian | Third Wave Automation |
Falk, Tiago H. | INRS-EMT |
Niemeyer, Günter | Caltech |
|
Tu-PS30-T1 Regular Session, Hawaii 1 |
Add to My Program |
Human Performance Modeling I |
|
|
|
13:00-13:15, Paper Tu-PS30-T1.1 | Add to My Program |
Decoding Neural Activity for Part-Of-Speech Tagging (POS) |
|
Ahmed, Salman | Ulster University |
Singh, Muskaan | Ulster University |
Bhattacharyya, Saugat | Ulster University |
Keywords: Human Performance Modeling
Abstract: Decoding Part of Speech (POS) tagging directly from electroencephalography (EEG) signals whilst the user overtly spoke (voiced speech) sentences could improve direct speech brain-computer interfaces (BCIs) using imagined or inner speech. To our knowledge, earlier work uses a machine learning approach using 74,953 sentences/tokens recorded in 75 EEG sessions. The tokens can be found in 4,479 phrases consisting of terms from the English Online treebank, which contains the record of weblogs, newsgroups, reviews, and Yahoo Answers. The results demonstrated the feasibility of POS decoding from EEG based on word class, word frequency, and word length with an accuracy of 71%, 86%, and 89%, respectively. We believe there is significant room for improvement with more advanced artificial intelligence. In this paper, we further extend the existing work with end-to-end transformers. Our results show that the transformer model outperforms the benchmark traditional ML results by +20% for length, +13% for open vs. closed class, and +12% for frequency. In our empirical analysis, we find that the decoding performance was better when using multi-electrode recordings as compared to single-electrode recordings.
|
|
13:15-13:30, Paper Tu-PS30-T1.2 | Add to My Program |
Engineering Analysis of One-Sided State Transitions for the Derivation of Dual-Task Conditions |
|
Tsuji, Ayumu | Waseda University |
Oh, Joi | Waseda University |
Iwasaki, Yukiko | CNRS |
Kato, Fumihiro | Waseda University |
Iwata, Hiroyasu | Waseda University |
Keywords: Human Performance Modeling, Augmented Cognition, Information Systems for Design
Abstract: Dual tasks with extended and natural bodies can enhance and complement human body functions and improve productivity. In recent years, methods for sharing information between two points have been investigated to realize dual tasks. However, even with these methods, when the required attentional load increases due to time and environmental factors, dual tasks may not be possible once human cognitive ability reaches its limits. Therefore, the purpose of this study is to determine the intervention method for each individual by analyzing and categorizing, through a validation test, how performance on a dual task decreases when the required attentional load increases. The validation test consisted of a dual task with two different tasks: the Button task, which needed to be performed with priority, and the Tracking task, which needed to be performed with lower priority but with a gradually increasing amount of attention. As a result of the validation test, we hypothesized that there exists a one-sided state in which response delays occur in the low-priority task during the transition from the dual-task state, where the task is performed stably, to the saturation state, where all tasks are disrupted. Therefore, considering the one-sided state, we classified participants' attention distribution during the dual task into four states and analyzed the transitions of the collapse model for each individual. As a result, we were able to identify the one-sided state, classify the collapse models of the dual task into three patterns, and propose appropriate timing of intervention methods for each pattern. The appropriateness of the timing of intervention methods needs to be verified, which is a topic for future research.
|
|
13:30-13:45, Paper Tu-PS30-T1.3 | Add to My Program |
Estimating Individual Growth Processes During Pilot Training Using Software Reliability Growth Models |
|
Yamada, Kento | Japan Aerospace Exploration Agency |
Ikeshita, Harumi | Japan Airlines Co., Ltd |
Kyoya, Yuta | Japan Airlines Co., Ltd |
Ueno, Makoto | Japan Aerospace Exploration Agency |
Keywords: Human Performance Modeling, Team Performance and Training Systems, Resilience Engineering
Abstract: This paper aims to classify and quantify individual growth processes during pilot training to understand the variations in their growth processes. Various software reliability growth models were fitted to growth processes, and the best-fit model, which showed the minimum Akaike information criterion, was clarified for each applicant. As a result, it was shown that the individual growth processes were classified into concave and S-shaped processes. The concave and S-shaped processes were attributed to the cases where an applicant completed the training without and with wandering, respectively. Since the best-fit models included models with imperfect debugging and testing efforts, these factors were discussed from the viewpoint of flight training. It was also indicated that the estimated growth models after the training can be utilized to monitor the applicant's competencies depending on operation-service flights done up to recurrent training.
|
|
Tu-PS30-T3 Regular Session, Hawaii 6 |
Add to My Program |
Cybernetics General III |
|
|
|
13:00-13:15, Paper Tu-PS30-T3.1 | Add to My Program |
Network Anomaly Detection with Stacked Sparse Shrink Autoencoders and Improved XGBoost |
|
Bi, Jing | Beijing University of Technology |
Guan, Ziyue | Beijing University of Technology |
Yuan, Haitao | Beihang University |
Zhang, Jia | Southern Methodist University |
Keywords: Intelligent Internet Systems, Neural Networks and their Applications, Hybrid Models of Neural Networks, Fuzzy Systems, and Evolutionary Computing
Abstract: Efficient and accurate identification of network anomalies is of great significance to the construction of network security systems in the information age. It is highly challenging to accurately detect abnormal behaviors in the increasing network data. Currently, classification methods based on feature extraction of autoencoders have been proven to be suitable for network anomaly detection. However, traditional detection models with autoencoders have poor detection accuracy in the face of massive network features. In addition, the hyperparameter optimization of their models cannot be effectively solved. For network anomaly detection, a new network anomaly detection method named SAXP is proposed in this work. SAXP integrates Stacked sparse shrink Autoencoders and an unbalanced XGBoost model based on genetic simulated annealing Particle swarm optimization (GSPSO). Specifically, features extracted by stacked sparse shrink autoencoders are introduced into the XGBoost model based on improved unbalance parameters for classification, and GSPSO is used to optimize the hyperparameters of XGBoost. Experimental results based on two real-life data sets demonstrate that the proposed SAXP achieves higher recognition accuracy than several state-of-the-art algorithms.
|
|
13:15-13:30, Paper Tu-PS30-T3.2 | Add to My Program |
Cost-Effective and Dynamic Migration for Microservices in Hybrid Mobile Cloud-Edge System |
|
Zhai, Jiahui | Beijing University of Technology |
Bi, Jing | Beijing University of Technology |
Yuan, Haitao | Beihang University |
Zhang, Jia | Southern Methodist University |
Keywords: Intelligent Internet Systems, Cloud, IoT, and Robotics Integration, Evolutionary Computation
Abstract: Mobile edge computing (MEC), as a promising paradigm, delivers computation and storage capacities at the edge of the network. It supports delay-sensitive services for mobile users (MUs). However, dynamic and stochastic characteristics of MEC networks necessitate constant migration of installed services across edge servers to keep up with the mobility of MUs. As a result, the cost of maintaining the network increases significantly. Existing studies of MEC rarely consider the cost of service migration due to MU mobility. To minimize the long-term cost for microservices in a hybrid cloud-edge system comprising MUs, small base stations (SBSs), and a cloud data center (CDC), the total cost minimization is formulated as a constrained mixed-integer nonlinear program. To solve it, this work designs a novel meta-heuristic optimization algorithm called Multi-swarm Gray-wolf-optimizer based on Genetic-learning (MGG), which effectively combines strong local search capabilities of gray wolf optimizer with superior global search capabilities of genetic algorithm. MGG simultaneously optimizes service request routing among MUs, SBSs, and CDC, CPU speeds of SBSs, service deployment of SBSs, service migration cost of SBSs, as well as MUs’ transmission power and channel bandwidth allocation. Simulation results with Google cluster trace demonstrate that MGG outperforms several state-of-the-art peers with respect to the overall cost of the hybrid system.
|
|
13:30-13:45, Paper Tu-PS30-T3.3 | Add to My Program |
Active Semantic Mapping for Household Robots: Rapid Indoor Adaptation and Reduced User Burden |
|
Ishikawa, Tomochika | Ritsumeikan University |
Taniguchi, Akira | Ritsumeikan University |
Hagiwara, Yoshinobu | Ritsumeikan University |
Taniguchi, Tadahiro | Ritsumeikan University |
Keywords: Knowledge Acquisition, Machine Learning
Abstract: Active semantic mapping is essential for service robots to quickly capture both the map of an environment and its spatial meaning, while also minimizing the burden on users during robot operation and data collection. SpCoSLAM, a method of semantic mapping with place categorization and simultaneous localization and mapping (SLAM), is well suited to environmental adaptation, as it is not limited to predefined labels. However, SpCoSLAM presents two issues that increase the burden on users: 1) users struggle to efficiently determine a destination for the robot's quick adaptation, and 2) providing instructions to the robot becomes repetitive and cumbersome. To address these challenges, we propose Active-SpCoSLAM, which enables the robot to actively explore uncharted areas and employs CLIP for image captioning to provide a flexible vocabulary that replaces human instructions. The robot determines its actions by calculating information gain integrated from both semantics and SLAM uncertainties. We conducted experiments in a simulated environment, comparing the proposed method to other methods in terms of efficiency and applicability to object discovery tasks. Additionally, we tested the proposed method, which combines user instruction and CLIP, in a real environment. Our results demonstrated that the robot explored its environment with approximately five fewer iterations and 11 minutes faster compared to the case of random exploration. Moreover, our method achieved a higher success rate in object discovery tasks during earlier stages of learning compared to other methods. In conclusion, the proposed method rapidly covers an environment while gathering useful data for object discovery tasks, thus reducing the burden on users and enhancing the robot's adaptability. The project website is https://tomochika-ishikawa.github.io/Active-SpCoSLAM/.
|
|
13:45-14:00, Paper Tu-PS30-T3.4 | Add to My Program |
Temporal Feature Mining in Dynamic Graph of Brain Connectivity Data |
|
Liu, Tao | Qilu University of Technology |
Zhang, Guangwei | Qilu University of Technology |
Jing, Ming | Big Data Institute |
Zhang, Li | Qilu University of Technology |
Yu, Jiguo | Qilu University of Technology |
Keywords: Knowledge Acquisition, Machine Learning
Abstract: In recent years, graph-theoretic feature mining of brain connectivity data has become a popular and widely used technique in neuroscience, and how to mine valuable information from such data has become a research hotspot. Current research suggests that attention deficit hyperactivity disorder (ADHD) may be caused by abnormal connections between brain network structures. To identify the pathogenic factors of ADHD, we carried out frequent subgraph mining on the connectivity graph data of the brain functional network. By repeatedly adjusting the support threshold, we mined all the subgraphs of ADHD patients and the healthy control group and found differences in brain region connectivity. By combining the recently introduced neural document embedding model with traditional pattern mining techniques, we treat the brain network connection structure graph as the document and each frequent subgraph as the atomic unit of the embedding process. By learning the mapping, each graph can be mapped to a D-dimensional continuous vector; the mapping needs to capture the similarity between graphs. The resulting feature vectors can be used as direct input to many traditional machine learning methods for graph classification. Finally, a support vector machine is used to verify the classification accuracy, and the results show that the accuracy is high.
|
|
Tu-PS30-T4 Regular Session, Hawaii 2 |
AI and Applications III |
|
|
|
13:00-13:15, Paper Tu-PS30-T4.1 |
Robotic Information Gathering Via Deep Generative Inpainting |
|
Khatib, Tamim | University of North Florida |
Kreidl, O. Patrick | University of North Florida |
Dutta, Ayan | University of North Florida |
Boloni, Ladislau | University of Central Florida |
Roy, Swapnoneel | University of North Florida |
Keywords: AI and Applications, Application of Artificial Intelligence, Deep Learning
Abstract: In today’s era of automation, mobile robots are being used to collect meaningful information about an ambient phenomenon such as temperature or moisture distribution in an agricultural field. Most of the studies in the literature assume that the underlying information field is Gaussian, and therefore, Gaussian Process (GP)-based models are extremely popular. Furthermore, we have found that due to the inherent computational complexity of such naive GP-based techniques, most studies in the literature do not scale well beyond small-size environments, i.e., where the number of informative points n < 1000. These limitations render such predictive models more or less useless in many practical applications. In this paper, we posit that a different technique, Generative Adversarial Network-based inpainting, can be useful for robotic information gathering. The state-of-the-art inpainting techniques 1) do not assume that the underlying data is Gaussian, and 2) easily scale to n ≫ 1000. Thus, they eliminate the two bottlenecks posed by the GP-based solutions. We have tested our hypothesis on a synthetic and a real-world crop dataset. Results show that while the inpainting technique easily scales to 1024 × 1024, GP-based predictions cannot. On the other hand, their solution qualities are shown to be comparable.
|
|
13:15-13:30, Paper Tu-PS30-T4.2 |
Contextual and Nonstationary Multi-Armed Bandits Using the Linear Gaussian State Space Model for the Meta-Recommender System |
|
Miyake, Yusuke | GMO Pepabo, Inc |
Mine, Tsunenori | Kyushu University |
Keywords: AI and Applications, Intelligent Internet Systems
Abstract: Selecting an optimal recommendation method is crucial for an electronic commerce (EC) site. However, the effectiveness of recommendation methods cannot be known in advance. Although continuous comparative evaluation in an actual environment is essential, it results in opportunity loss. To overcome this problem, opportunity loss reduction has been studied as a multi-armed bandit (MAB) problem, and adaptive meta-recommender systems (meta-RS) were devised to automatically and continuously select the best recommendation method according to a policy. The following three factors cause opportunity loss: the context of the recommendation method, temporal variation, and response time. However, studies are yet to formulate an MAB policy that considers all three factors. Thus, reducing opportunity loss remains a problem. We propose an MAB policy that considers all three causes of opportunity loss by using a Kalman filter for a linear Gaussian state space model. We conducted extensive experiments to select the best recommendation method using data from a real EC site. The results revealed that the proposed policy is highly effective for reducing opportunity loss of the meta-RS during evaluation and increasing cumulative clicks compared with baseline policies.
|
|
13:30-13:45, Paper Tu-PS30-T4.3 |
Learning with Local Gradients at the Edge |
|
Lomnitz, Michael | SRI International |
Daniels, Zachary | SRI International |
Farkya, Saurabh | SRI International |
Isnardi, Michael | SRI International |
Zhang, David | SRI International |
Piacentino, Michael | SRI International |
Keywords: AIoT, Machine Learning, Deep Learning
Abstract: To enable domain adaptation of AI on edge devices with fast convergence and low memory, we present a novel backpropagation-free optimization algorithm dubbed Target Projection Stochastic Gradient Descent (tpSGD). tpSGD uses layer-wise stochastic gradient descent (SGD) and local targets generated via random projections of the labels to train the network layer-by-layer with only forward passes. It doesn’t require retaining gradients during optimization, thus greatly reducing memory allocation compared to SGD backpropagation (BP) methods. Compared to other target propagation methods, tpSGD generalizes the concept to arbitrary local layer-wise loss functions. Our method performs comparably to BP gradient-descent within ∼5% accuracy on relatively shallow networks of fully connected layers, convolutional layers, transformers, and recurrent layers. tpSGD also outperforms other state-of-the-art gradient-free algorithms with competitive accuracy and less memory and compute time.
|
|
Tu-PS30-T5 Regular Session, Honolulu |
Human-Machine Cooperation and Systems I |
|
|
|
13:00-13:15, Paper Tu-PS30-T5.1 |
Improved YOLOv7 Based on Transformer for Object Detection in UAV-Captured Images |
|
Luo, Yuefan | Hunan University |
Zhu, Qing | Hunan University |
Zhou, Zhen | Hunan University |
Chen, Lin | Chongqing Institute of Green and Intelligent Technology, Chinese |
Zhou, Jiaming | Hunan University |
Jiang, Tianjian | Hunan University |
Li, Yijiang | Hunan University |
Wang, Danwei | Nanyang Technological University |
Wang, Yaonan | Hunan University |
Keywords: Human-Machine Interaction, Human-Machine Cooperation and Systems, Information Visualization
Abstract: As the drone captures image targets at different flying altitudes, their scales may vary significantly, which can pose challenges for the object detection model to accurately detect them. Additionally, tiny objects in the image contain minimal information, making them difficult to distinguish from the background. To overcome these two challenges, we propose a network architecture that aims to improve the accuracy of tiny object detection in drone images. Specifically, we designed a tiny object detector (TOD) that can effectively extract features of tiny objects and distinguish between tiny object features and image background. Furthermore, this TOD module contains a Convolutional Visual Attention Network (CVAN) to better focus on the regions of tiny objects. Experimental results demonstrate that the proposed method achieves mAP@.5 accuracy of 53.9% on the VisDrone2021-test-dev dataset and improves by 2.8% compared to YOLOv7.
|
|
13:15-13:30, Paper Tu-PS30-T5.2 |
Simulation of the Human Arm Dynamics and Motor Control in Unstable Environments: A Comparison with Experimental Data |
|
Treichl, Tobias | German Aerospace Center |
Keywords: Human-Machine Interaction, Human-Machine Cooperation and Systems, Human Factors
Abstract: The simulation of the human body dynamics offers a powerful way to investigate the interaction between humans and systems. This so-called physical human-machine interaction (HMI) is useful to consider during the development of new products like tools, cars or aircraft, in order to increase comfort and safety. In this work, the Modelica Human Body Library is presented first. It is the only digital human model (DHM) for physical HMI available in the multi-domain modeling language Modelica so far. The model considers the human body dynamics as well as the motor control of the limbs. A special feature of the model is the human-inspired control scheme, which takes into account effects such as human reaction time and limb stiffness adaptation. After an introduction to the DHM, its capabilities are evaluated by comparing simulation results with data from an experimental study. In the study, different unstable force fields were applied to the hands of subjects as they approached a target, and the resulting hand trajectories were measured. As many control tasks for humans, especially while using tools, are characterized by unstable behavior, this study is well suited to evaluate a DHM. Comparison with the experimental data shows that the resulting hand trajectories in the simulation are comparable to the ones the subjects produced due to the force fields. This applies to both the initial tries with big deviations and the tries after the subjects adapted their arm stiffness to the force field.
|
|
13:30-13:45, Paper Tu-PS30-T5.3 |
Interplay of Human and AI Solvers on a Planning Problem |
|
Ameri Ekhtiarabadi, Afshin | Mälardalen University |
Miloradovic, Branko | Mälardalen University |
Çürüklü, Baran | Mälardalen University |
Papadopoulos, Alessandro | Mälardalen University |
Ekström, Mikael | Mälardalen University |
Dreo, Johann | Computational Biology Dept., Universite Paris Cite, Institut Pas |
Keywords: Human-Machine Interaction, Human-Computer Interaction, Human-Machine Cooperation and Systems
Abstract: With the rapidly growing use of Multi-Agent Systems (MASs), which can exponentially increase the system complexity, the problem of planning a mission for MASs has become more intricate. In some MASs, human operators are still involved in various decision-making processes, including manual mission planning, which can be an ineffective approach for any non-trivial problem. Mission planning and re-planning can be represented as a combinatorial optimization problem. Computing a solution to these types of problems is notoriously difficult and not scalable, posing a challenge even to cutting-edge solvers. As time is usually considered an essential resource in MASs, automated solvers have a limited time to provide a solution. The downside of this approach is that it can take a substantial amount of time for the automated solver to provide even a sub-optimal solution. In this work, we are interested in the interplay between a human operator and an automated solver and whether it is more efficient to let a human or an automated solver handle the planning and re-planning problems, or if the combination of the two is a better approach. We thus propose an experimental setup to evaluate the effect of having a human operator included in the mission planning and re-planning process. Our tests are performed on a series of instances with gradually increasing complexity and involve a group of human operators and a metaheuristic solver based on a genetic algorithm. We measure the effect of the interplay on both the quality and structure of the output solutions. Our results show that the best setup is to let the operator come up with a few solutions, before letting the solver improve them.
|
|
13:45-14:00, Paper Tu-PS30-T5.4 |
Human-Object-Robot Interaction through Playing Behaviour for Social Robots |
|
Vincze, David | Chuo University |
Niitsuma, Mihoko | Chuo University |
Keywords: Human-Machine Interaction, Human-Machine Interface, Human-Machine Cooperation and Systems
Abstract: Inspired by a common behaviour pattern from the dog-human relationship, we present a behaviour model combined with a method for controlling a real mobile robot, effectively realizing physical Human-Object-Robot Interaction. Translating social behaviours from the dog-human relationship into social robots has already been found to be a viable approach. Playing, a cardinal behaviour element in the dog-human relationship, turned out to be one of the highest priority factors in forming a successful and lasting relationship between dogs and humans. Accordingly, we modeled the playing behaviour in which a human repeatedly initiates play with a dog by picking up and throwing an object (toy), expecting the dog to go for the object and bring it back to the person throwing it, commonly known as playing fetch. We also developed a robot control system able to execute the commands from the behaviour model, and to gather the poses of the participants in the real physical space we used the What-You-See-Is-What-You-Get Indoor Localization (WIL) system. The integration of these components allowed us to create a social robotics system able to offer real physical human-object-robot interaction fast enough to be enjoyable for human participants, as a tool to be used in future HRI experiments, which was our main goal. The proposed behaviour model is intended to be embedded into social robots for applications such as therapy, child-rearing, child-robot interaction scenarios, human behaviour assessment, and recreational or entertainment scenarios.
|
|
Tu-PS30-T6 Regular Session, Kahuku |
Human Enhancements I |
|
|
|
13:15-13:30, Paper Tu-PS30-T6.2 |
Pursuing Equilibrium of Medical Resources Via Data Empowerment in Parallel Healthcare System |
|
Yu, Yi | Shanghai Artificial Intelligence Laboratory |
Yao, Shengyue | Shanghai AI Laboratory |
Wang, Kexin | Nanjing Medical University |
Chen, Yan | Nanjing Medical University |
Wang, Fei-Yue | Institute of Automation, Chinese Academy of Sciences |
Lin, Yilun | Shanghai Artificial Intelligence Laboratory |
Keywords: Human Enhancements, Cognitive Computing
Abstract: The imbalance between the supply and demand of healthcare resources is a global challenge, which is particularly severe in developing countries. Governments and academic communities have made various efforts to increase healthcare supply and improve resource allocation. However, these efforts often remain passive and inflexible. Alongside these issues, the emergence of the parallel healthcare system has the potential to solve these problems by unlocking the data value. The parallel healthcare system comprises Medicine-Oriented Operating Systems (MOOS), Medicine-Oriented Scenario Engineering (MOSE), and Medicine-Oriented Large Models (MOLMs), which could collect, circulate, and empower data. In this paper, we propose that achieving equilibrium in medical resource allocation is possible through parallel healthcare systems via data empowerment. The supply-demand relationship can be balanced in parallel healthcare systems by (1) increasing the supply provided by digital and robotic doctors in MOOS, (2) identifying individual and potential demands by proactive diagnosis and treatment in MOSE, and (3) improving supply-demand matching using large models in MOLMs. To illustrate the effectiveness of this approach, we present a case study optimizing resource allocation from the perspective of facility accessibility. Results demonstrate that the parallel healthcare system could result in up to 300% improvement in accessibility.
|
|
13:30-13:45, Paper Tu-PS30-T6.3 |
Gait Changes with Powered Wear for Walking Aid: Verification Experiment Measuring Walking Motion and Physical Characteristics |
|
Imamura, Yumeko | National Institute of Advanced Industrial Science and Technology |
Sumitani, Masahiko | The University of Tokyo Hospital |
Otake, Yuko | Bunkyo Gakuin University |
Hanzawa, Fumiya | Hantech |
Kishimoto, Kazuaki | SHIN-JIGEN Inc |
Keywords: Assistive Technology, Human Enhancements
Abstract: This study aims to clarify the relationship between the user's characteristics and the assist effect of a wearable robot for walking aid, thereby elucidating the scope of application of robot-assistive technology. In this paper, we report on the changes in walking movements when assistive forces are applied by a powered wear. The powered wear developed in the study assists the hip joint during walking with a wire-driven system. An experiment was conducted to measure walking movements and physical characteristics. Thirteen middle-aged and elderly persons took part in the experiment. The participants were instructed to walk at their preferred walking speed, but some of them showed a slight decrease in cadence and an increase in stride length during assisted walking. In particular, participants with average cadence tended to increase their stride length when assisted. Furthermore, the participants with higher knee extensor muscle tended to improve their propulsive force during assisted walking. The experimental results showed changes in gait specific to the assistive robot and their relationship to the user's physical ability and gait characteristics.
|
|
13:45-14:00, Paper Tu-PS30-T6.4 |
Human EEG Beta Band Power Is Related to Tibialis Anterior Muscle Activation Reductions During Walking with an Ankle Exoskeleton |
|
Song, Seongmi | Texas A&M University |
Haynes, Courtney | US ARMY DEVCOM Army Research Laboratory |
Naeem, Jasim | DCS Corporation/U.S. Army DEVCOM Army Research Laboratory |
Bradford, J. Cortney | US DEVCOM Army Research Laboratory |
Keywords: Human-Machine Interaction, Human Enhancements, Human Factors
Abstract: To achieve efficient and effortless movement with exoskeletons, it is essential to optimize human-exoskeleton fluency. Current methods for measuring fluency include energetic cost, muscle activation, and spatiotemporal gait parameters, but cortical brain dynamics have not been well studied. Understanding individual cortical responses could improve the application of personalized exoskeleton assistance through human-in-the-loop (HIL) optimization approaches. Our research aimed to enhance the understanding of individual responses and identify a potential objective function based on electrocortical activity. We found that electrocortical responses show more individual variability than muscle activations, emphasizing the need for personalized cortical optimization to maximize exoskeleton assistance. We also found a significant negative relationship between changes in TA muscle activation and EEG beta power in the motor area. These findings suggest that EEG beta power changes could be a possible objective function for HIL applications.
|
|
Tu-PS30-T7 Special Session, Oahu |
Computational Intelligence and Big Data Science for Bioinformatics |
|
|
Organizer: Zhu, Xin | The University of Aizu |
Organizer: Liu, Bo | Massey University |
Organizer: Li, Jianqiang | Beijing University of Technology |
Organizer: Pei, Yan | University of Aizu |
|
13:00-13:15, Paper Tu-PS30-T7.1 |
DDPM-SKDNet: A Deep Learning Method for ICG Image Classification (I) |
|
Wang, YuHao | Beijing University of Technology |
Liu, Bo | Massey University |
Yang, Bin | Beijing University of Technology |
Li, Jianqiang | Beijing University of Technology |
Li, Yong | Beijing University of Technology |
Pei, Yan | University of Aizu |
Keywords: Deep Learning, Application of Artificial Intelligence, Biometric Systems and Bioinformatics
Abstract: Over the past several years, deep learning technologies have made tremendous progress in medical image tasks including classification, segmentation, and object detection. However, indocyanine green (ICG) images, which are often used in breast cancer-related lymphedema (BCRL), have two main limitations: insufficient sample numbers and low image quality. Consequently, conventional deep learning-based classification methods such as ResNet have faced challenges in achieving satisfactory results. To address this problem, this paper puts forward a deep learning method named Denoising Diffusion Probabilistic Model Self-supervised Knowledge Distillation Net (DDPM-SKDNet) for the ICG image classification task, incorporating a contrastive learning-based approach as the network architecture and using DDPM as the image generator in the contrastive module to expand the dataset size. Furthermore, a knowledge distillation approach is utilized to increase the effectiveness of the network. The proposed method was validated on ICG datasets and achieved a significant improvement in classification accuracy, increasing it from 66.7% in the baseline method to 82.1% in the proposed method.
|
|
13:15-13:30, Paper Tu-PS30-T7.2 |
Anatomy-Guided Weakly Supervised Breast Lesion Segmentation Fusing Contour and Semantic Information (I) |
|
Liu, Xiaoling | Beijing University of Technology |
Li, Jianqiang | Beijing University of Technology |
Zhao, Linna | Beijing University of Technology |
Liu, Zhaolei | Beijing University of Technology |
Zhu, Chujie | Beijing University of Technology |
Ma, Tianbao | Beijing University of Technology |
Xu, Xi | Beijing University of Technology |
Zhao, Qing | Beijing University of Technology |
Keywords: Image Processing and Pattern Recognition, Biometric Systems and Bioinformatics, Deep Learning
Abstract: Accurate lesion segmentation on breast ultrasound (BUS) images is a crucial procedure in computer-aided ultrasonic diagnosis. Owing to the privacy of BUS data and the complexity of acquiring pixel-level labels, numerous studies attempt to achieve breast lesion segmentation with predefined feature-based and deep learning-based methods in an unsupervised or weakly supervised scenario. Although the former can typically extract more reliable contour information of the lesion, it is severely disturbed by irrelevant tissues due to its inability to capture any semantic information. In contrast, weakly supervised deep learning segmentation based on the class activation map (CAM) can explore the semantic information but fails to provide precise contour information. In view of the above observations, we present a weakly supervised framework merging complementary contour and semantic information for early lesion segmentation in BUS images. Specifically, guided by the prior knowledge of breast anatomy, we first extract and filter the contour information of suspected lesions located in the breast parenchyma layer by clustering and morphological characteristics, respectively. Afterward, semantic information extraction is performed by a classification network to automatically explore the category information of the lesion. Finally, we selectively fuse the complementary information to improve lesion segmentation performance with more comprehensive features. Extensive experiments are conducted on the public dataset BUSI, and the results confirm the validity of our approach.
|
|
13:30-13:45, Paper Tu-PS30-T7.3 |
Automated ICD Coding for Primary Diagnosis Based on Graph Convolution Network (I) |
|
Ma, Zerui | BJUT |
Li, Jianqiang | Beijing University of Technology |
Sufyan, Muhammad | Beijing University of Technology |
Li, Jing | Beijing University of Technology |
Keywords: Deep Learning, Neural Networks and their Applications, AI and Applications
Abstract: International Classification of Diseases (ICD) coding is an internationally unified diagnostic system, which sets a unique code for each patient who carries a particular disease. The primary diagnosis indicates the most crucial disease for the patient during the hospitalization, and its coding is very important for both the patient and the hospital. Current ICD coding methods mostly use text as the input of the model, and the unstructured nature of text data results in a relatively scattered feature distribution, which is not conducive to feature extraction and interpretability research of the models. In this paper, we utilized a knowledge graph to transform text into graph-structured data, and used an improved graph convolutional model to extract features from the transformed graph, achieving automatic ICD coding for the primary diagnosis of disease. The method was tested on a Chinese dataset with a macro-averaged F1 score of 0.862, and the comparative experiments show that the performance of the method based on graph convolutional networks is generally better than that of ICD coding models at the text level and node level.
|
|
Tu-PS30-T8 Regular Session, Hawaii 3 |
Wearable Computing I |
|
|
|
13:00-13:15, Paper Tu-PS30-T8.1 |
A Sleeve Device Using Electrical Impedance for Coaching Jump Shots in Basketball |
|
Takaishi, Kazuma | Artificial Intelligence Laboratory |
Saiki, Hayato | University of Tsukuba |
Hirokawa, Masakazu | NEC Corporation |
Hassan, Modar | University of Tsukuba |
Suzuki, Kenji | University of Tsukuba |
Keywords: Team Performance and Training Systems, Wearable Computing, Human-Machine Interface
Abstract: In basketball free-throw practice, learning the appropriate amount and timing of applying force is crucial. However, since the throwing motion is done in a short time and the applied force cannot be directly observed, it is hard for coaches to evaluate the shooter's motion and give instructions. In this study, we propose a wearable device that measures the wearer's force control during throwing based on electrical impedance and acceleration measurements, and provides feedback to help the wearer adjust their motion. Through experiments, we identified the temporal features of the measured electrical impedance and acceleration during throwing that differentiate experts and beginners. Based on these features, we also proposed a training system for force control using the proposed device, and verified that it is effective in improving the free-throw skill.
|
|
13:15-13:30, Paper Tu-PS30-T8.2 |
Study on Gait Stabilization Method Using Wearable Cyborg HAL Trunk-Unit for Parkinson’s Disease and Parkinsonism with Freezing of Gait |
|
Ikeda, Kaosu | University of Tsukuba |
Uehara, Akira | University of Tsukuba |
Kawamoto, Hiroaki | University of Tsukuba |
Sankai, Yoshiyuki | University of Tsukuba |
Keywords: Assistive Technology, Human Enhancements, Wearable Computing
Abstract: Freezing of Gait (FOG) is one of the typical parkinsonian gait disturbances of progressive neurological disorders such as Parkinson’s disease (PD) and progressive supranuclear palsy. Previous studies showed that the wearable cyborg HAL trunk-unit improved patients' gait disturbances through only an autonomous sway control that provided lateral swing with constant frequency. To promote the improvement by establishing an interactive Bio-Feedback loop, it is necessary to realize synchronization between the wearer’s intention and gait states for stabilizing gait, i.e. reducing variation of the gait cycle. In this study, we developed a voluntary control method that responded to an intentional change in gait cycle and a method that switched between two kinds of control, the voluntary control method and the autonomous sway control method, for stabilizing gait. The voluntary swing control method synchronized the HAL’s force with the wearer’s gait in a stable gait state less prone to FOG. The autonomous sway control method provided lateral swing with constant frequency as feedback to the patients to achieve a gait state less prone to FOG. These controls were switched according to the gait stability calculated from the wearer’s gait cycle. Through gait experiments with an able-bodied participant, we confirmed that the HAL’s lateral cyclic sway synchronized with the participant’s gait and that each control switched based on the gait stability. These results showed that the proposed methods had the feasibility of stabilizing gait.
|
|
13:30-13:45, Paper Tu-PS30-T8.3 |
Mind Indriya: A System for Simultaneous Assessment of Cognitive Load, Anxiety and Visual Attention |
|
B S, Mithun | TCS |
Karmakar, Somnath | TCS Research |
Varghese, Tince | TCS |
Jaiswal, Dibyanshu | TCS Research |
Chatterjee, Debatri | Research Scientist, Tata Consultancy Services Ltd |
Gavas, Rahul | Tata Consultancy Services Limited |
Ramakrishnan, Ramesh Kumar | TCS Research |
Pal, Arpan | Tata Consultancy Services |
Keywords: Wearable Computing, Human Factors, Human-Computer Interaction
Abstract: Every human being is unique and behaves differently in any given context. Standard approaches like surveys that are used today to assess human behaviour often generate subjective responses and can be administered only before or after a task or activity. This presents an opportunity to develop solutions that can monitor a human being’s cognitive, affective and mental state in their current context continuously and in real time, during activities and in an unobtrusive manner. We have developed a composite system called Mind Indriya that can unobtrusively measure cognitive load (CL), anxiety and visual attention using a combination of frugal sensors such as a wrist wearable and a webcam. The accuracy of each of the individual algorithms has been established, with cognitive load at an accuracy of 69.5%, anxiety at an accuracy of 86%, and the detection of eye blinks as an index of visual attention at an F-score of 0.91. Moreover, our system is person and task independent, while state-of-the-art techniques are either task dependent or require heavy personalization. The proposed system can be used in various real-life applications such as a) neuromarketing, to understand customer behavior towards new products or advertisements, b) online learning, to personalize content or understand the learning outcome, and c) website/user interface design, to understand the effect of a new design on people’s cognitive load, affect and attention span, and therefore how effective the design is.
|
|
Tu-PS50W Workshop Session, Puna |
Passive BMIs and Applications |
|
|
|
16:00-16:15, Paper Tu-PS50W.1 |
Effect of Machine Reliability on the Cognitive Processes of Task Performance |
|
Zenia, Nusrat Zerin | University of Calgary |
Ghaemi Dizaji, Lida | University of Calgary |
Hu, Yaoping | University of Calgary |
Keywords: Passive BMIs, Other Neurotechnology and Brain-Related Topics
Abstract: Brain machine interfaces (BMIs) are becoming increasingly prevalent in diverse applications including motor rehabilitation, virtual reality training, etc. Two critical aspects of an effective BMI are machine reliability and cognitive workload (CWL). Previous studies have reported a notable effect of machine reliability on the six factors of the CWL. However, it remains unclear whether this effect can be detected in cognitive processes. Electroencephalography (EEG) is a widely used technique to explore cognitive processes by recording brain activities as signals. Therefore, we utilized the event-related spectral power (ERSP) feature of EEG signals to determine the cognitive processes regarding the effect of machine reliability. The results revealed that machine reliability affected the CWL factor of performance, which was reflected in the Gamma band activities of the right prefrontal cortex. The findings indicate the potential of cognitive processes in detecting the effect of machine reliability. The detection could pave the way for designing adaptive BMIs that balance machine reliability and the CWL.
|
|
16:15-16:30, Paper Tu-PS50W.2 |
Restoring Engagement in Human-Robot Interaction: A Brain-Computer Interface for Adaptive Learning with Robots |
|
Pruss, Ethel | Tilburg University |
Prinsen, Jos | Tilburg University |
Ceccato, Caterina | Tilburg University |
Vrins, Anita | Tilburg University |
Ziadeh, Hamzah | Aalborg University |
Knoche, Hendrik | Aalborg University |
Alimardani, Maryam | Tilburg University |
Keywords: Passive BMIs, BMI Emerging Applications
Abstract: This paper investigates the efficacy of a passive Brain-Computer Interface (BCI) in enabling a robot tutor to adaptively respond to a user's engagement level in real time. The BCI system extracted the EEG Engagement Index from the user's electroencephalography (EEG) signals as an indicator of engagement during Human-Robot Interaction (HRI). A within-subjects study was conducted in which the robot performed attention-recapturing behavior during a learning task under two conditions: either in an adaptive manner whenever a lapse in the user's engagement level was detected by the BCI system (Adaptive condition) or at random intervals regardless of the user's mental state (Random condition). In both conditions, users completed an information retention test following the interaction. The study found no significant difference in the post-interaction test results or mean EEG Engagement Index values between the Adaptive and Random conditions. However, analysis of 10-second time windows following robot interventions showed that adaptively timed gestures were significantly more effective in restoring user engagement to an optimal level than randomly timed gestures. This finding provides evidence for the potential of passive BCIs in improving user experience in pedagogical HRI settings.
|
|
16:30-16:45, Paper Tu-PS50W.3 |
Gender-Sensitive EEG Channel Selection for Emotion Recognition Using Enhanced Genetic Algorithm |
|
Duan, Danting | Key Laboratory of Media Audio & Video, Communication University |
Sun, Bing | College of Computer and Information, Henan Normal University |
Yang, Qiang | Nanjing University of Information Science and Technology |
Zhong, Wei | State Key Laboratory of Media, Convergence and Communication, Com |
Ye, Long | State Key Laboratory of Media, Convergence and Communication, Com |
Zhang, Qin | State Key Laboratory of Media, Convergence and Communication, Com |
Zhang, Jun | Hanyang University |
Keywords: Brain-Computer Interfaces, Affective Computing
Abstract: EEG channel selection can reduce data redundancy, thereby improving the utility and efficiency of emotion recognition. Previous studies on EEG channel selection have not considered the influence of gender, despite the long-standing belief in gender differences with respect to emotion analysis. In this paper, we collected EEG signals from 20 subjects (10 male and 10 female) while they watched short emotional videos. Then, to reduce data redundancy, we propose an enhanced genetic algorithm that incorporates a novel evolution operation to select the optimal channel subsets separately for male and female subjects. Experimental results show that the proposed algorithm achieves higher emotion recognition accuracy than several compared methods while using a smaller channel subset. The experimental results also indicate that gender differences in neural patterns do exist. Through this study, gender-sensitive channel selection offers a new avenue for the further development of EEG-based emotion recognition.
|
|
16:45-17:00, Paper Tu-PS50W.4 |
Optimization of Electrode Configuration for the Removal of Eye Artifacts with Adaptive Noise Cancellation |
|
Gonzalez-Espana, Juan Jose | University of Houston |
Craik, Alexander | University of Houston |
Ramirez, Carolina | Texas A&M University |
Alamir, Ayman | University of Houston |
Contreras-Vidal, Jose | University of Houston |
Keywords: BMI Emerging Applications, Other Neurotechnology and Brain-Related Topics, Passive BMIs
Abstract: Scalp electroencephalography (EEG) is a neural source signal that is extensively used in neuroengineering due to its non-invasive nature and ease of collection. However, a drawback to the use of EEG is the prevalence of physiological artifacts generated by eye movements and eye blinks that contaminate the brain signals. Previously, we have proposed and validated an H∞-based Adaptive Noise Cancellation (ANC) technique for the real-time identification, learning and removal of eye blinks, eye motions, amplitude drifts and recording biases from EEG simultaneously. However, the standard electrooculography (EOG) electrode configuration requires four electrodes for EOG measurement, which limits its applicability for reduced-channel mobile applications, such as brain-computer interfaces (BCI). Here, we assess multiple configurations with varying numbers of EOG electrodes and compare the ANC effectiveness of these configurations to the ideal four-electrode configuration. From an analysis of the root mean squared error (RMSE) and differences in signal-to-noise ratios (SNR) between the ideal four-electrode case and the alternative configurations, we report that several three-electrode alternative configurations were effective in essentially replicating the ability to remove EOG artifacts in an experimental cohort of ten healthy subjects. For nine subjects, only two to three EOG electrodes were needed to achieve performance similar to that of the four-electrode case. This study demonstrates that the typical four-electrode configuration for EOG recordings for adaptive noise cancellation of ocular artifacts may not be necessary; by using the proposed new EOG configurations it is possible to improve electrode allocation efficiency for EOG measurements in mobile EEG applications.
|
|
17:00-17:15, Paper Tu-PS50W.5 |
Robust Emotion Recognition in EEG Signals Based on a Combination of Multiple Domain Adaptation Techniques |
|
Mirzaee, Seyed Alireza | Islamic Azad University, Dariun Branch, Shiraz, Iran |
Kordestani, Mojtaba | University of Windsor |
Rueda, Luis | University of Windsor |
Saif, Mehrdad | University of Windsor |
Keywords: BMI Emerging Applications, Other Neurotechnology and Brain-Related Topics
Abstract: Conventional classification approaches for EEG-based emotion recognition often cannot adapt to different domains, such as cross-subject or cross-dataset scenarios, leading to poor performance. To handle this challenge, we introduce a novel fusion method that combines multiple domain adaptation techniques to improve the classification accuracy of emotional states in EEG datasets. To this end, our proposed approach exploits domain adaptation approaches such as Transfer Component Analysis (TCA), Correlation Alignment (CORAL), Transfer Joint Matching (TJM), Geodesic Flow Kernel (GFK), and Joint Distribution Adaptation (JDA) to enhance the overall classification performance. Then, a new fusion approach called Multiple Domain Adaptation based on a Neuro-Fuzzy Inference System (MDA-NF) is applied to combine the classifiers using proper fuzzy membership functions and deliver maximum separation between classes. The main contribution is that applying the fusion approach with the MDA-NF technique sufficiently enhances adaptability. Another advantage is the use of multiple adaptation techniques, which improves separation between classes. In experiments conducted with cross-subject and cross-dataset scenarios, the MDA-NF approach demonstrates superior accuracy for both the valence and arousal aspects, as observed on two public datasets, DEAP and DREAMER.
|
|
Tu-PS50-T1 Special Session, Hawaii 1 |
Conflict Resolution |
|
|
Organizer: Fang, Liping | Toronto Metropolitan University |
Organizer: Hipel, Keith | University of Waterloo |
|
16:00-16:15, Paper Tu-PS50-T1.1 |
An Evacuation Navigation Model Considering Fire's Impact on Escape Capability of Evacuees (I) |
|
Golshani, Feze | Toronto Metropolitan University |
Fang, Liping | Toronto Metropolitan University |
Keywords: Conflict Resolution, Decision Support Systems
Abstract: An evacuation navigation model incorporating a fire’s detrimental effect on evacuees’ escape capability is presented. Fire is one of the most common hazards in buildings that severely impacts evacuees' escape capability. A fire dynamics model is used to simulate dynamics of a fire at various locations in a building. An index, called Capability Deterioration Rate (CDR), is introduced to model how fire effluents, such as smoke, heat, and toxic gases, impact escape capability spatiotemporally. This index is incorporated into a modified Dijkstra algorithm by changing weights of navigation edges. An agent-based model taking into account the fire dynamics and escape capability is used to simulate an agent’s evacuation from a location to minimize the total travel cost from the location to an egress door. A three-floor building is used as a case study to demonstrate the application and effectiveness of the proposed model. The proposed model can be used to evaluate and select the best navigation plan in terms of escape capability under various fire scenarios.
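The idea of folding a deterioration index into edge weights can be illustrated with a minimal sketch. This is not the authors' model: the graph layout, the `cdr` lookup, the penalty factor `alpha`, and the weighting rule `length * (1 + alpha * cdr)` are all assumptions chosen for illustration, since the abstract does not state how the CDR changes the weights.

```python
import heapq

def cdr_weighted_dijkstra(graph, cdr, source, exits, alpha=1.0):
    """Cheapest route from source to any exit on a CDR-penalized graph.

    graph: {node: [(neighbor, length), ...]}  (hypothetical layout)
    cdr:   {(u, v): deterioration rate on that edge}  (hypothetical)
    alpha: assumed factor scaling how strongly the CDR inflates a weight
    """
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u in exits:
            # Walk predecessors back to the source to recover the route.
            path = [u]
            while path[-1] != source:
                path.append(prev[path[-1]])
            return d, path[::-1]
        for v, length in graph.get(u, []):
            # Assumed rule: edges exposed to fire effluents cost more.
            weight = length * (1.0 + alpha * cdr.get((u, v), 0.0))
            nd = d + weight
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    return float("inf"), []

# Toy floor plan: two corridors from room A to exit E.
floor = {"A": [("B", 1.0), ("C", 2.0)], "B": [("E", 1.0)], "C": [("E", 1.0)]}
clear_cost, clear_path = cdr_weighted_dijkstra(floor, {}, "A", {"E"})
smoke_cost, smoke_path = cdr_weighted_dijkstra(floor, {("A", "B"): 5.0}, "A", {"E"})
```

In this toy case the shorter corridor through B is preferred when no fire is present, while a high deterioration rate on edge A-B shifts the route to the longer corridor through C.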
|
|
16:15-16:30, Paper Tu-PS50-T1.2 |
Diffusion of Electric Vehicles and Public and Home Charging Stations in a Two-Sided Market (I) |
|
Kishi, Shinnosuke | Kyoto University |
Kotani, Hitomu | Kyoto University |
Matsushima, Kakuya | Kyoto University |
Keywords: Electric Vehicles and Electric Vehicle Supply Equipment, Conflict Resolution, Smart Buildings, Smart Cities and Infrastructures
Abstract: The diffusion of electric vehicles (EVs) is affected by the spread of EV public charging stations (PCSs) and vice versa. Their interactions are often referred to as "indirect network effects" and analyzed with a two-sided market model. However, the consumer adoption of home charging stations (HCSs), which are considered key drivers of EV deployment, is often ignored. By modeling a two-sided market of EVs and PCSs (i.e., indirect network effects) under the adoption of HCSs, this study explores the EV diffusion process and considers effective strategies for its spread. We examine two cases: in the first, consumer adoption is determined exogenously; in the second, it is determined endogenously. In each case, we vary the strength of the indirect network effects, which correspond to the drivers' concern, referred to as "range anxiety." Through numerical simulations, we evaluate the market shares of EVs, HCSs, and PCSs, as well as social welfare. Our findings have strategic implications for policymakers seeking to increase the market share of EVs in the presence of different types of charging stations without negative social impacts.
|
|
16:30-16:45, Paper Tu-PS50-T1.3 |
Obtaining Lower and Upper Probabilistic Preferences in the Graph Model for Conflict Resolution through Multicriteria Evaluation (I) |
|
Rêgo, Leandro Chaves | Federal University of Ceará |
Silva, Maisa | Federal University of Pernambuco |
Keywords: Conflict Resolution
Abstract: This paper proposes a procedure to elicit lower and upper probabilistic preferences within the Graph Model for Conflict Resolution (GMCR) through a multicriteria perspective. This imprecise probability approach is especially recommended in situations where there is not enough information regarding how decision makers (DMs) evaluate or compare the feasible conflict states. Within this proposal, the elicitation process is carried out through three questions in which DMs are asked to evaluate the performance of each feasible state in each criterion. Uncertainty in this performance generates lower and upper probabilistic preferences among states. To illustrate the applicability of the new elicitation procedure, we present a real-world example from a construction project in Brazil.
|
|
16:45-17:00, Paper Tu-PS50-T1.4 |
A Large-Scale Group Decision Making Model Based on Adaptive Subgroup Rescue Mechanism |
|
Zhang, Yingying | University College London |
Shiyu, Gong | NUS |
Chai, Junyi | Beijing Normal University - Hong Kong Baptist University United |
Keywords: Conflict Resolution, Decision Support Systems
Abstract: This paper proposes an adaptive subgroup rescue mechanism to better balance efficiency and information loss in large-scale group decision-making. We calculate the consensus for different clusters through linguistic preferences and trust relations, find the group with the lowest consensus level, and advise the group to exit from the process. A rescue process is triggered once the group has low cohesion or can propose a new idea. A trust-based feedback mechanism is designed to ensure that enough decision makers remain in the consensus-reaching process. Finally, we conduct an empirical study to verify the feasibility of our mechanism.
|
|
17:00-17:15, Paper Tu-PS50-T1.5 |
Social-Aware Planning and Control for Automated Vehicles Based on Driving Risk Field and Model Predictive Contouring Control: Driving through Roundabouts As a Case Study |
|
Zhang, Li | City University of Hong Kong |
Dong, Yongqi | Delft University of Technology |
Farah, Haneen | Delft University of Technology |
van Arem, Bart | TU Delft |
Keywords: Autonomous Vehicle, Intelligent Transportation Systems, Robotic Systems
Abstract: The gradual deployment of automated vehicles (AVs) results in mixed traffic where AVs will interact with human-driven vehicles (HDVs). Thus, social-aware motion planning and control that considers interactions with HDVs on the road is critical for AVs’ deployment and safe driving under various maneuvers. Previous research mostly focuses on the trajectory planning of AVs using Model Predictive Control or other relevant methods, while seldom considering the integrated planning and control of AVs altogether to simplify the whole pipeline architecture. Furthermore, there are very limited studies on social-aware driving that makes AVs understandable to and expected by human drivers, and none when it comes to the challenging maneuver of driving through roundabouts. To fill these research gaps, this paper develops an integrated social-aware planning and control algorithm for AVs driving through roundabouts based on Driving Risk Field (DRF), Social Value Orientation (SVO), and Model Predictive Contouring Control (MPCC), i.e., DRF-SVO-MPCC. The proposed method is tested and verified in simulation on the open-source highway-env platform. Compared with the baseline method using purely Nonlinear Model Predictive Control, DRF-SVO-MPCC achieves better performance under various maneuvers of driving through roundabouts with and without surrounding HDVs.
|
|
Tu-PS50-T2 Regular Session, Hawaii 5 |
Best Paper Finalists |
|
|
|
16:00-16:15, Paper Tu-PS50-T2.1 |
Bayesian Approach for Adaptive EMG Pattern Classification Via Semi-Supervised Sequential Learning |
|
Yoneda, Seitaro | Hiroshima University |
Furui, Akira | Hiroshima University |
Keywords: Human-Machine Interface, Assistive Technology, Medical Informatics
Abstract: Intuitive human-machine interfaces may be developed using pattern classification to estimate executed human motions from electromyogram (EMG) signals generated during muscle contraction. The continual use of EMG-based interfaces gradually alters signal characteristics owing to electrode shift and muscle fatigue, leading to a gradual decline in classification accuracy. This paper proposes a Bayesian approach for adaptive EMG pattern classification using semi-supervised sequential learning. The proposed method uses a Bayesian classification model based on Gaussian distributions to predict the motion class and estimate its confidence. Pseudo-labels are subsequently assigned to data with high-prediction confidence, and the posterior distributions of the model are sequentially updated within the framework of Bayesian updating, thereby achieving adaptive motion recognition to alterations in signal characteristics over time. Experimental results on six healthy adults demonstrated that the proposed method can suppress the degradation of classification accuracy over time and outperforms conventional methods. These findings demonstrate the validity of the proposed approach and its applicability to practical EMG-based control systems.
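The loop described above (predict with confidence, pseudo-label confident samples, update the posterior sequentially) can be sketched under strong simplifications. The following is an illustrative toy, not the authors' model: it assumes equal class priors, a known isotropic noise variance, and a conjugate Normal prior on each class mean only. The class name, parameters, and threshold value are hypothetical.

```python
import math

class SequentialGaussianClassifier:
    """Toy Gaussian classifier with pseudo-label Bayesian updating."""

    def __init__(self, prior_means, prior_precision=1.0, noise_var=1.0, threshold=0.9):
        self.means = [[float(v) for v in m] for m in prior_means]
        # Precision (inverse variance) of the belief about each class mean.
        self.precisions = [float(prior_precision)] * len(self.means)
        self.noise_var = float(noise_var)   # assumed known and isotropic
        self.threshold = float(threshold)   # confidence needed to pseudo-label

    def predict_proba(self, x):
        # Posterior predictive per class: N(mean, (noise_var + 1/precision) I).
        log_p = []
        for mean, prec in zip(self.means, self.precisions):
            var = self.noise_var + 1.0 / prec
            sq = sum((xi - mi) ** 2 for xi, mi in zip(x, mean))
            log_p.append(-0.5 * sq / var - 0.5 * len(mean) * math.log(var))
        top = max(log_p)
        weights = [math.exp(lp - top) for lp in log_p]
        total = sum(weights)
        return [w / total for w in weights]

    def update(self, x):
        """Classify x; if confident, treat the prediction as a label and update."""
        probs = self.predict_proba(x)
        c = max(range(len(probs)), key=probs.__getitem__)
        if probs[c] < self.threshold:
            return c, False  # low confidence: leave the model unchanged
        lik_prec = 1.0 / self.noise_var
        new_prec = self.precisions[c] + lik_prec
        # Conjugate Normal update of the class mean with one observation.
        self.means[c] = [(self.precisions[c] * m + lik_prec * xi) / new_prec
                         for m, xi in zip(self.means[c], x)]
        self.precisions[c] = new_prec
        return c, True

clf = SequentialGaussianClassifier([[0.0, 0.0], [4.0, 4.0]])
confident = clf.update([0.5, 0.5])   # close to class 0, so the mean drifts toward it
fresh = SequentialGaussianClassifier([[0.0, 0.0], [4.0, 4.0]])
ambiguous = fresh.update([2.0, 2.0])  # equidistant, so no update is made
```

Gating updates on confidence is what keeps the model from reinforcing its own mistakes: ambiguous samples are classified but do not move the class means.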
|
|
16:15-16:30, Paper Tu-PS50-T2.2 |
Fast Verification of Petri Net-Based Model of Industrial Decision-Making Systems: A Case Study (I) |
|
Wisniewski, Remigiusz | University of Zielona Góra |
Patalas-Maliszewska, Justyna | University of Zielona Góra |
Wojnakowski, Marcin | University of Zielona Góra |
Topczak, Marcin | University of Zielona Góra |
Zhou, Mengchu | New Jersey Institute of Technology |
Keywords: System Modeling and Control, Manufacturing Automation and Systems, Cyber-physical systems
Abstract: This work deals with the verification of a decision-making system for additive manufacturing (AM) technology adoption specified by a Petri net. An innovative verification technique of a Petri net-based system is oriented toward practical applications, and can detect errors at the early design and modelling stage. The idea is illustrated by a real-life case study of supporting decision making in AM technology adoption affecting supply chain management (SCM). Two main issues are addressed. Firstly, making optimal decisions about AM technology requires models rarely possessed within a company. Therefore, this work proposes a model supporting the decision making related to the implementation of AM technology, based on a Petri net, by utilizing its main advantages: graphical modelling, and strong mathematical support of formal verification techniques. Contrary to the most popular analysis methods (which are bounded exponentially in a general case), it is proved that the presented method is bounded by a cubic polynomial with net size. Secondly, an investment in AM technology is often financially assessed and does not affect other processes, as in our SCM case. Hence, strong connectivity within the proposed Petri net-based model is examined.
|
|
16:30-16:45, Paper Tu-PS50-T2.3 |
Think BIG: Brain-Computer Interface Goals for Children with Quadriplegic Cerebral Palsy |
|
Kelly, Dion | University of Calgary |
Rowley, Danette | Alberta Children's Hospital |
Floreani, Erica Danielle | University of Toronto |
Kinney-Lang, Eli | University of Calgary |
Robu, Ion | Alberta Children's Hospital |
Kirton, Adam | University of Calgary |
Keywords: Brain-Computer Interfaces, Assistive Technology, Cooperative Work in Design
Abstract: There is a pressing need for alternative access technologies that support children with severe physical disabilities, as current options often require some degree of controlled movement to be used efficiently. Brain-computer interfaces (BCIs) hold significant potential for improving the lives of children with severe physical disabilities; however, research must prioritize user-centered approaches and real-world applications to maximize benefits. This paper examines the integration of home-based BCIs for children with quadriplegic cerebral palsy through user-centered design, focusing on the feasibility, usability, and impact on personal goals and activities of daily living. Seven children aged 6-15 years with quadriplegic cerebral palsy and their families participated in this pilot study, using personalized BCI packages and home-based virtual sessions to help them achieve individualized goals in self-care, productivity, and leisure. We utilized a collaborative goal-setting approach and assessed satisfaction and performance changes using the Canadian Occupational Performance Measure (COPM) and the BCI-adapted Quebec User Evaluation of Satisfaction with Assistive Technology (e-QUEST2.0). Significant improvements in performance and satisfaction were observed in the COPM scores, while parents were most satisfied with professional services and least satisfied with the adjustability of the BCI system, as per the e-QUEST2.0 questionnaire. Despite no significant improvement in BCI consistency across nine sessions, the intervention positively impacted participants' perceived performance and satisfaction in goal-oriented activities. Future research should focus on enhancing BCI design, comfort, and effectiveness while considering user priorities and feedback for personalized goal achievement.
|
|
16:45-17:00, Paper Tu-PS50-T2.4 |
FGRL-Net: Fine-Grained Personalized Patient Representation Learning for Clinical Risk Prediction Based on EHRs |
|
Chio, Ka Kit | Macao Polytechnic University |
Zhu, Wenhao | University of Electronic Science and Technology of China |
He, Lihua | Macao Polytechnic University |
Zhang, Dian | Shenzhen University |
Yang, Xu | Macao Polytechnic University |
Luo, Wuman | Macao Polytechnic University |
Keywords: Medical Informatics
Abstract: Personalized patient representation learning (PPRL) is a critical element in clinical risk prediction. It aims to obtain a complete portrait of each patient based on Electronic Health Records (EHR). Although existing works have achieved remarkable progress in healthcare prediction, there are still three major issues. First, feature correlation is crucial for risk prediction, but it has not yet been fully exploited by existing works. Second, the variation pattern of dynamic features contains useful information about a patient's physical status, but adaptive pattern recognition is still a challenge. Third, existing works usually adopt a two-stage embedding process to process each dimension of the EHR data. However, some useful low-level information for PPRL will be lost. To address these issues, in this paper, we propose a fine-grained PPRL architecture named FGRL-Net for clinical risk prediction based on EHR. Specifically, we propose a Medical Feature Correlation Detection Module (FCM) to effectively learn the feature correlations for each patient and a Temporal Variation Pattern Recognition Module (TVM) to effectively detect the variation patterns of each dynamic feature. Moreover, we design a Fine-Grained Representation Mechanism (FGRM) to preserve the low-level information (from both feature and visit dimensions) useful for risk prediction. In addition, in the data preprocessing stage, we utilize generic medical classification knowledge to classify numerical dynamic data. We conduct the in-hospital mortality experiment and the decompensation experiment on a real-world dataset. The experimental results show that FGRL-Net outperforms state-of-the-art approaches. The source code is available on GitHub at https://github.com/JackyChio/FGRL-Net.
|
|
17:00-17:15, Paper Tu-PS50-T2.5 |
An Automated Detection of Amyotrophic Lateral Sclerosis from Resting State MEG Data Using 3D Deep Convolutional Neural Network |
|
Samanta, Kaniska | Ulster University |
Roy, Sujit | Brainalive Research Pvt. Ltd |
Marchand-Pauvert, Veronique | Sorbonne Universités, UPMC Univ Paris 06, CNRS, Inserm, Laborat |
Dora, Shirin | Loughborough University |
Duguez, Stephanie | Ulster University |
Singh, Muskaan | Ulster University |
Prasad, Girijesh | University of Ulster |
Bhattacharyya, Saugat | Ulster University |
Keywords: Other Neurotechnology and Brain-Related Topics, BMI Emerging Applications
Abstract: A novel 3D deep convolutional neural network (3D-CNN) model called MEGNet3D is proposed in this paper. MEGNet3D is designed to differentiate between individuals with amyotrophic lateral sclerosis (ALS) and healthy individuals from their resting-state (eyes-open and eyes-closed conditions) sensor-level magnetoencephalography (MEG) data. The raw MEG data are first transformed into time-frequency representations, which are then used as inputs to MEGNet3D. Magnetometer and gradiometer recordings have been investigated separately. The proposed model exhibits an accuracy of over 75% for most classification conditions. Thus, MEGNet3D is capable of handling high subject variability and shows that the spectral-temporal representation of resting-state MEG data yields relevant neural markers related to the existence of ALS. Furthermore, it has also been observed that the resting state with eyes closed yields better classification accuracy than the resting state with eyes open.
|
|
Tu-PS50-T3 Regular Session, Hawaii 6 |
System Modeling and Control |
|
|
|
16:00-16:15, Paper Tu-PS50-T3.1 |
Adaptive Estimated Inverse Control for Uncertain Nonlinear Systems with Hysteresis Effect |
|
Zhou, Ning | Nanjing University of Science and Technology |
Deng, Wenxiang | Nanjing University of Science and Technology |
Yao, Jianyong | Nanjing University of Science and Technology |
Keywords: Control of Uncertain Systems, Adaptive Systems
Abstract: An adaptive estimated inverse control methodology for a class of uncertain nonlinear systems subject to unknown Prandtl-Ishlinskii (PI) hysteresis nonlinearity is studied in this paper. The main difficulty in the inverse hysteresis compensation mechanism is that its parameters are hard to determine. To overcome this obstacle, the unknown hysteresis parameters are estimated online by adaptive techniques together with the uncertain system parameters. Thus, hysteresis effects as well as parameter uncertainties can both be suppressed without time-consuming off-line identification. Meanwhile, a tracking differentiator is utilized to solve the "explosion of complexity" problem in backstepping control by calculating the derivatives of the virtual control laws. Based on Lyapunov stability analysis, it is shown that all system signals are bounded and that the tracking error and estimation errors converge to a compact set around zero. Comparative simulations are performed to demonstrate the theoretical findings of this control algorithm.
|
|
16:15-16:30, Paper Tu-PS50-T3.2 |
Decentralized Leader-Follower Control for Centroid and Formation Tracking |
|
Sileo, Monica | University of Basilicata |
Karayiannidis, Yiannis | Lund University |
Pierri, Francesco | University of Basilicata |
Caccavale, Fabrizio | University of Basilicata |
Keywords: Autonomous Vehicle, Robotic Systems, Distributed Intelligent Systems
Abstract: In this paper, a novel decentralized leader-follower control scheme for multi-agent systems is devised, where each agent communicates only with a subset of neighboring mates. The goal is to track assigned trajectories for the centroid and the formation of the system. The desired trajectories are known only by a subset of agents, named leaders; the other agents, the followers, are required to estimate the desired trajectories based on a dynamic consensus scheme. Then, the desired trajectories to be tracked by each agent are computed from the estimated trajectories for the centroid and the formation, and a simple local control loop is adopted to track the former. Stability and performance are analyzed, and experiments are run on the Robotarium platform to show the effectiveness of the approach and the effect of different parameters on the achieved performance.
|
|
16:30-16:45, Paper Tu-PS50-T3.3 |
Level Plane SLAM: Out-Of-Plane Motion Compensation in a Globally Stabilized Coordinate Frame for 2D SLAM |
|
Lovett, Samuel | Carleton University |
Paquette, Tyler | Hibou Systems Inc |
DeBoon, Brayden | Hibou Systems Inc |
Rajan, Sreeraman | Carleton University |
Rossa, Carlos | Carleton University |
Keywords: Autonomous Vehicle, Digital Twin
Abstract: Two-dimensional (2D) simultaneous localization and mapping (SLAM) using a LIDAR is a method used to track the position and orientation of a moving platform. 2D-SLAM assumes that the platform translates in a 2D plane and can only rotate about an axis perpendicular to that plane. However, the assumption of no out-of-plane (OOP) motion does not hold true for platforms experiencing motion in six degrees-of-freedom (6-DOF), such as wearable technologies that have no 3D LIDAR. This paper proposes a new algorithm, called the Level Plane for SLAM (LPS) for removing OOP motion from 2D-LIDAR scans generated on platforms experiencing 6-DOF without requiring scan-matching in 3D. Like other existing methods, an IMU is combined with a 2D-LIDAR to determine the platform’s orientation, capture OOP motion, and generate a scan in 3D. Unlike other methods, OOP motion is removed by projecting scans onto a globally stabilized coordinate frame in 2D where both scan matching and map alignment take place. The proposed algorithm is validated over a series of experiments with different levels of induced and observed OOP motion. Experimental results show that LPS is able to handle more OOP motion than other algorithms and run in real-time.
|
|
16:45-17:00, Paper Tu-PS50-T3.4 |
Optimal Input Distribution Over Multiple Control Objectives for Adaptive High-Rise Structures |
|
Dakova, Spasena | University of Stuttgart |
Kohl, Katharina | University of Stuttgart |
Heidingsfeld, Julia | Institute for System Dynamics |
Sawodny, Oliver | University of Stuttgart |
Böhm, Michael | University of Stuttgart |
Keywords: System Modeling and Control, Smart Buildings, Smart Cities and Infrastructures, Adaptive Systems
Abstract: Adaptive high-rise buildings use multiple sensor systems, actuators integrated into the structure, and a control unit in order to actively counteract external disturbances. In civil engineering, a distinction is made between static loads such as snow, and dynamic loads, e.g. wind and earthquakes. These result in two control objectives, each of which is achieved by a separate controller. A static load compensation method is employed to minimize static displacements, while a model predictive controller induces additional damping into the structure to suppress structural vibrations. Here, both controllers use the same set of actuators with limited forces. This paper presents an investigation of the control input requirement for static load compensation and vibration damping. An algorithm for optimal control input distribution over the different control objectives is implemented to achieve good performance of the overall system. The method is tested in simulations considering a wind disturbance. By applying the introduced control strategy, the closed loop achieves a performance improvement of 14% with regard to the building's displacements and velocities compared to the application of solely a model predictive controller.
|
|
17:00-17:15, Paper Tu-PS50-T3.5 |
Stability Conditions and Control Designs Via Improved Lyapunov Function for Takagi-Sugeno Fuzzy Descriptor Systems |
|
Asai, Yuto | Aoyama Gakuin University |
Itami, Taku | Aoyama Gakuin University |
Yoneyama, Jun | Aoyama Gakuin University |
Keywords: System Modeling and Control
Abstract: This paper is concerned with control design methods via a new Lyapunov function for fuzzy descriptor systems. Takagi-Sugeno fuzzy descriptor models, which can represent nonlinear systems, are divided into two families: one with the same membership functions on both sides of the fuzzy system and another with different membership functions on the right and left sides. Although many papers propose control designs for these two families derived from Lyapunov stability theory, the designs are very conservative because they use a Lyapunov function that includes a single Lyapunov matrix. In this paper, to obtain less conservative conditions, we propose new Lyapunov functions in which all Lyapunov matrices are multiple. Finally, numerical examples are given to show the effectiveness of our methods.
|
|
17:15-17:30, Paper Tu-PS50-T3.6 |
Steep Turn of a Tilt-Rotor UAV with Redundancy in Control Inputs |
|
Urakubo, Takateru | Kobe University |
Nakamura, Ryota | Kobe University |
Kikumoto, Chihiro | Kobe University |
Sabe, Kohtaro | Aerosense Inc |
Hirai, Shinji | Aerosense Inc |
Keywords: System Modeling and Control, Mechatronics, Robotic Systems
Abstract: This paper reveals that the turning radius of a tilt-rotor UAV during high-speed flight can be significantly reduced by utilizing the redundancy in control inputs. Based on a dynamic model of the UAV, the flight conditions of steady level turns are analyzed to numerically search for the minimum turning radius when all the actuators installed in the UAV are utilized. Numerical results indicate that adding lift with sub-rotors installed for rotary-wing mode makes it possible to increase the roll angle during level turns while reducing the turning radius. The tilt angle of the main rotor can also be chosen within a certain range to reduce the turning radius. The feasibility of such steep turns is checked by numerical simulations using a feedback control.
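The link between extra lift, roll angle, and turning radius described in this abstract can be illustrated with a textbook point-mass model of a coordinated steady level turn. This is only a minimal sketch and not the authors' dynamic model: it assumes all available lift acts along the body vertical axis, and the function names, the mass, and the lift values are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def max_bank_angle(mass_kg, max_lift_n):
    """Largest roll angle (rad) at which the vertical lift component still balances weight.

    Level flight requires L * cos(phi) = m * g, so cos(phi_max) = m * g / L_max.
    """
    weight = mass_kg * G
    if max_lift_n < weight:
        raise ValueError("available lift cannot sustain level flight")
    return math.acos(weight / max_lift_n)


def min_turn_radius(speed_mps, mass_kg, max_lift_n):
    """Minimum radius (m) of a coordinated steady level turn: R = V^2 / (g * tan(phi))."""
    phi = max_bank_angle(mass_kg, max_lift_n)
    if phi == 0.0:
        return math.inf  # no lift margin left for turning
    return speed_mps ** 2 / (G * math.tan(phi))


# Hypothetical numbers: a 4 kg vehicle at 25 m/s, with and without extra lift.
weight_n = 4.0 * G
radius_wing_only = min_turn_radius(25.0, 4.0, 1.5 * weight_n)
radius_with_extra_lift = min_turn_radius(25.0, 4.0, 2.0 * weight_n)
```

In this simplified model, raising the available lift from 1.5 to 2.0 times the weight allows a steeper bank and therefore a smaller radius, which is the qualitative effect the abstract attributes to the sub-rotors.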
|
|
Tu-PS50-T4 Special Session, Hawaii 2 |
Intelligent Industrial Environments and Cyber-Physical Industrial Systems |
|
|
Chair: Strasser, Thomas | AIT Austrian Institute of Technology GmbH |
Co-Chair: Wang, Jiacun | Monmouth University |
Organizer: Strasser, Thomas | AIT Austrian Institute of Technology GmbH |
Organizer: Farid, Amro | Stevens Institute of Technology |
|
16:00-16:15, Paper Tu-PS50-T4.1 |
Optimally Scheduling Single-Arm Multicluster Tools for Manufacturing Hybrid-Type Wafers (I) |
|
Wang, GengHong | Guangdong University of Technology |
Zhu, QingHua | Guangdong University of Technology |
Hou, Yan | Guangdong University of Technology |
Qiao, Yan | Macau University of Science and Technology |
Wu, Nai Qi | Macau University of Science and Technology |
Zhou, Mengchu | New Jersey Institute of Technology |
Keywords: Optimization and Self-Organization Approaches
Abstract: In semiconductor manufacturing, multicluster tools are widely employed for most wafer fabrication processes. With the demand for high-mix integrated circuit chips and the shrinkage of circuit width, wafer foundries adopt a scheme in which multiple wafer types are fabricated inside a multicluster tool to earn more profit. Multiple wafer types, multiple robots, and wafer residency time constraints make these scheduling problems challenging. This work focuses on scheduling a single-arm multicluster tool to process two wafer types concurrently subject to wafer residency time constraints, a setting in which a conventional one-wafer cycle and backward strategy are not efficient. To the best of our knowledge, this is the first work to propose a one-wafer-per-type cycle and a two-wafer-type backward sequence for such a scenario. With such properties, several necessary and sufficient conditions are presented to check the feasibility of a periodic schedule. Further, polynomial time algorithms are proposed to match the schedulability and coordinate multiple robots to handle wafers for the schedulable scenarios. The cycle time of an obtained schedule can reach the lower bound. Finally, a practical example is used to show the effectiveness of the proposed algorithm.
|
|
16:15-16:30, Paper Tu-PS50-T4.2 |
ReACT: Reinforcement Learning for Controller Parametrization Using B-Spline Geometries (I) |
|
Rudolf, Thomas | KIT Karlsruhe Institute of Technology, FZI Research Center for I |
Flögel, Daniel | FZI Forschungszentrum Informatik |
Schürmann, Tobias | FZI Research Center for Information Technology |
Süß, Simon | FZI Forschungszentrum Informatik |
Schwab, Stefan | FZI Research Center for Information Technology |
Hohmann, Sören | KIT |
Keywords: AI and Applications, Deep Learning, Optimization and Self-Organization Approaches
Abstract: Robust and performant controllers are essential for industrial applications. However, deriving controller parameters for complex and nonlinear systems is challenging and time-consuming. To facilitate automatic controller parametrization, this work presents a novel approach using deep reinforcement learning (DRL) with N-dimensional B-spline geometries (BSGs). We focus on the control of parameter-variant systems, a class of systems with complex behavior which depends on the operating conditions. For this system class, gain-scheduling control structures are widely used in applications across industries due to well-known design principles. To facilitate the expensive controller parametrization task for these control structures, we deploy a DRL agent. Based on control system observations, the agent autonomously decides how to adapt the controller parameters. We make the adaptation process more efficient by introducing BSGs to map the controller parameters, which may depend on numerous operating conditions. To preprocess time-series data and extract a fixed-length feature vector, we use a long short-term memory (LSTM) neural network. Furthermore, this work contributes actor regularizations that are relevant to real-world environments which differ from training. Accordingly, we apply dropout layer normalization to the actor and critic networks of the truncated quantile critic (TQC) algorithm. To show our approach's working principle and effectiveness, we train and evaluate the DRL agent on the parametrization task of an industrial control structure with parameter lookup tables.
|
|
16:30-16:45, Paper Tu-PS50-T4.3 |
Carousel Storage and Picking Scheduling Issues: A Review (I) |
|
Qin, Shujin | Shangqiu Normal University |
Yang, Xinyi | Shangqiu Normal University |
Wang, Jiacun | Monmouth University |
Liu, Shixin | Northeastern University |
Guo, Xiwang | Liaoning Petrochemical University |
Qi, Liang | Shandong University of Science and Technology |
Keywords: Artificial Immune Systems, Evolutionary Computation, Swarm Intelligence
Abstract: This paper classifies and summarises the literature of recent years on carousel systems in automated storage and retrieval systems. As an automated storage and retrieval system for distribution centers and production facilities, carousels facilitate the storage and dispatching of goods, significantly improving warehouse turnover efficiency. Their performance has been investigated by many scholars and experts. As carousels evolve and upgrade, more and more innovative algorithms have been used to improve the efficiency of outbound carousel storage. In this paper, we collate articles investigating how goods in carousel systems are stored inbound versus retrieved outbound. We then discuss articles on the dual-command model of automatic storage retrieval systems as a whole. By reviewing over 50 papers, we summarise research on how to store and unload goods, focusing on the performance of automatic storage retrieval systems under dual-command conditions. On this basis, we review the limitations of current research and suggest future research directions.
|
|
Tu-PS50-T5 Regular Session, Honolulu |
Anomaly Detection, Computer Vision and Image Processing |
|
|
|
16:00-16:15, Paper Tu-PS50-T5.1 |
A Deep Learning Based Detection of Bird Droppings and Cleaning Method for Photovoltaic Solar Panels |
|
Kshetrimayum, Satchidanand | National Taipei University of Technology |
Liou, James | National Taipei University of Technology, Industrial Engineering |
Huang, Yo-Ping | National Taipei University of Technology |
Keywords: Consumer and Industrial Applications, System Modeling and Control, Modeling of Autonomous Systems
Abstract: The accumulation of bird droppings on photovoltaic (PV) farms reduces power generation efficiency and necessitates manual cleaning on a regular basis, which is a challenge in large power plants. To solve this problem, this paper proposes an automatic Unmanned Aerial Vehicle (UAV) based method for detecting, localizing, and cleaning bird droppings on large PV power plants. An automated flight route is first created, and a UAV flies over the solar farm to capture images of the solar panels. The captured images are then stitched together to create a high-resolution orthomosaic image of the solar farm, which makes it possible to precisely locate the bird droppings on the solar farm. An improved YOLOv7-based model is proposed to detect the bird droppings because they are quite small in comparison to the stitched image. Then, using the ground sample distance, we calculate the distance between each of the bird droppings and the drone’s takeoff point, which is used to clean the bird droppings from the solar panel. Last, the proposed model is verified on high-resolution orthomosaic images, and the experimental outcomes show that it is successful in detecting and cleaning bird droppings on PV farms.
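The ground-sample-distance step mentioned in this abstract can be sketched as follows. This is a hedged illustration that uses the standard photogrammetric formula for a nadir-pointing camera, not the authors' implementation; the function names and all camera parameters are hypothetical.

```python
import math


def ground_sample_distance(sensor_width_mm, focal_length_mm, altitude_m, image_width_px):
    """Ground sample distance in metres per pixel for a nadir-pointing camera.

    GSD = (sensor width * flight altitude) / (focal length * image width in pixels).
    The millimetre units of sensor width and focal length cancel out.
    """
    return (sensor_width_mm * altitude_m) / (focal_length_mm * image_width_px)


def ground_distance_m(pixel_a, pixel_b, gsd_m_per_px):
    """Metric distance between two (x, y) pixel locations in the orthomosaic."""
    dx = pixel_b[0] - pixel_a[0]
    dy = pixel_b[1] - pixel_a[1]
    return math.hypot(dx, dy) * gsd_m_per_px


# Hypothetical values: takeoff point and one detected dropping, in orthomosaic pixels.
gsd = ground_sample_distance(sensor_width_mm=10.0, focal_length_mm=10.0,
                             altitude_m=100.0, image_width_px=1000)
distance = ground_distance_m((0, 0), (300, 400), gsd)
```

An orthomosaic has an approximately uniform scale, which is why a single GSD value can convert any pixel offset into metres in this sketch.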
|
|
16:30-16:45, Paper Tu-PS50-T5.3 |
Data and Model-Based Approaches in Fault Detection and Identification for Connected Vehicles |
|
Jalali, Mahsa | University of Windsor |
Coulter, Nolan | Embry-Riddle Aeronautical University |
Jado Puente, Rocio | Embry-Riddle Aeronautical University |
Gutierrez, Tatiana A | Embry-Riddle Aeronautical University |
Moncayo, Hever | Embry-Riddle Aeronautical University |
Moradi Heydarloo, Milad | University of Windsor |
Saif, Mehrdad | University of Windsor |
Keywords: Fault Monitoring and Diagnosis, Cooperative Systems and Control, Distributed Intelligent Systems
Abstract: In recent years, significant progress has been made in the application of data-driven, learning-based approaches to fault detection in distributed networks. These methods are optimized for quickly detecting and identifying faulty instruments, whether originating from within a single vehicle or from a network of connected vehicles. This paper provides a preliminary review of typical Fault Detection and Identification (FDI) techniques, with a focus on platoons of vehicles arranged in a rectilinear formation using a leader-follower architecture. Specifically, this paper discusses the advantages and disadvantages of data-driven versus model-based methods for addressing the FDI problem. In particular, the main characteristics of a novel immunity-based bio-inspired data-driven technique are highlighted, and numerical simulations of a multi-vehicle system under normal and faulty conditions are presented to support the discussion.
|
|
16:45-17:00, Paper Tu-PS50-T5.4 |
Deep Learning Detection of Tiny Wood Splinters on Gymnasium Floor |
|
Saisho, Koji | Tokyo University of Science |
Petrilli, Alberto | Tokyo University of Science |
Sumiya, Shigeki | Senoh Co |
Yamamoto, Masataka | Tokyo University of Science |
Takemura, Hiroshi | Tokyo University of Science |
Keywords: Robotic Systems, Autonomous Vehicle, Mechatronics
Abstract: Injuries during the practice of sports in gymnasiums have been reported, and one of the causes is environmental factors such as tiny wood splinters on the gymnasium floor. Although it is important to regularly inspect gymnasium floors, it is difficult for humans to inspect the entire floor, as the inspection is done manually and visually and requires a lot of time and manpower. We have developed an automatic inspection system to detect tiny splinters on the gymnasium floor. The system attaches cotton to tiny splinters and detects the attached cotton by using an image processing technique. Using this system, the entire gymnasium floor can be inspected automatically simply by creating a 2D map. After the inspection, the system can show where splinters are located on the map. In this paper, a method for detecting cotton attached to splinters using the deep learning object detector YOLO is proposed. The detection ratio of the proposed method was improved by 25.0% compared to the conventional method based on threshold color segmentation. In an inspection of an entire gymnasium, the proposed method detected 33 markers and was able to detect splinters that could cause injury.
|
|
17:00-17:15, Paper Tu-PS50-T5.5 |
Exploring Global and Local Information for Anomaly Detection with Normal Samples |
|
Xu, Fan | University of Science and Technology of China |
Wang, Nan | Beijing Jiaotong University |
Zhao, Xibin | Tsinghua University |
Keywords: Fault Monitoring and Diagnosis, System Modeling and Control
Abstract: Anomaly detection aims to detect data that do not conform to regular patterns; such data are also called outliers. The anomalies to be detected are often tiny in proportion but contain crucial information, and anomaly detection is suitable for application scenarios such as intrusion detection, fraud detection, fault diagnosis, and e-commerce platforms. However, in many realistic scenarios, only the samples following normal behavior are observed, while we can hardly obtain any anomaly information. To address this problem, we propose an anomaly detection method, GALDetector, which combines global and local information based on observed normal samples. The proposed method consists of three stages. Firstly, the global similar normal scores and the local sparsity scores of unlabeled samples are computed separately. Secondly, potential anomaly samples are separated from the unlabeled samples according to these two scores, and corresponding weights are assigned to the selected samples. Finally, a weighted anomaly detector is trained on a large number of samples, and the detector is then used to identify other anomalies. To evaluate the effectiveness of the proposed method, we conducted experiments on three categories of real-world datasets from diverse domains, and experimental results show that our method achieves better performance when compared with other state-of-the-art methods.
|
|
17:15-17:30, Paper Tu-PS50-T5.6 |
Multivariate Beta Normality Scores Approach for Deep Anomaly Detection in Images Using Transformations |
|
Sghaier, Oussama | Concordia University |
Amayri, Manar | Concordia University |
Bouguila, Nizar | Concordia University |
Keywords: System Architecture, System Modeling and Control, Decision Support Systems
Abstract: In this work, we propose a novel anomaly detection approach in images based on normality scores using transformations. By applying various transformations to the input image such as rotation and flipping, we train a classifier to predict the transformation label applied to the images. Then, we represent the output of the classifier by a softmax vector. Thanks to the flexibility of the multivariate Beta distribution in fitting the data compared to other conventional distributions such as the Dirichlet distribution, we approximate the softmax vector by this general form of the Beta distribution to construct the normality scores. Moreover, we use maximum likelihood to estimate the parameters of the proposed distribution. To show the power and the effectiveness of our approach, we conduct experiments on detecting anomalies in various public datasets. Furthermore, the proposed method is compared with state-of-the-art techniques, and the results demonstrate its superiority in terms of Area Under the Receiver Operating Characteristic curve (AUROC).
|
|
Tu-PS50-T6 Regular Session, Kahuku |
Artificial Intelligence and Robotics in Advanced Systems |
|
|
|
16:15-16:30, Paper Tu-PS50-T6.2 |
Robust H∞ Estimation of Sideslip Angle of Vehicles with Fading Measurements (I) |
|
Hedayati, Mohammad | Institute for Intelligent Systems Research and Innovation (IISRI |
Mohajer, Navid | Deakin University |
Pappu, Mohammad Rokonuzzaman | Deakin University |
Nahavandi, Saeid | Swinburne University of Technology |
Keywords: Autonomous Vehicle, Intelligent Transportation Systems
Abstract: This study reports the robust sideslip angle estimation of vehicles with an uncertain tire cornering stiffness and fading measurements. The missing measurement and possible inaccuracy in the measurement of the vehicle’s yaw rate are considered by using a random variable distributed over [0, 1]. Norm-bounded uncertainties are considered in the vehicle’s tire cornering stiffness. Next, the Lyapunov stability theory is used to design a sideslip angle estimator such that the filtering error dynamics is stochastically stable and the H∞ performance criterion is met. The desired parameters of the proposed H∞ sideslip angle estimator are obtained by solving a linear matrix inequality (LMI) problem. Simulation results show that the proposed novel estimator can efficiently estimate the sideslip angle while remaining robust to uncertainties and fading measurements.
|
|
16:45-17:00, Paper Tu-PS50-T6.4 |
HR-Chain: A Blockchain-Based Solution for Managing and Securing Heterogeneous Robots |
|
Tang, Kailei | Fudan University |
Dong, Zhiyan | Fudan University |
Shi, Wenxiang | Fudan University |
Gan, Zhongxue | Fudan University |
Keywords: Robotic Systems, Trust in Autonomous Systems, System Architecture
Abstract: In modern factories and daily life, the application of heterogeneous robot groups is becoming more and more widespread. However, there are still many areas for improvement in the management and communication of heterogeneous robot swarms, including the need for different connection interfaces for heterogeneous robots and the use of heterogeneous robot action logs to identify possible bottlenecks in the production line or record unplanned behaviors, whether malicious or not. In order to better manage heterogeneous robot swarms, this paper introduces the Heterogeneous Robots Chain (HR-Chain), a blockchain-based solution that can manage different types of robots and prevent unnecessary changes in robot operation logs to help improve production efficiency or other management requirements. HR-Chain is a Tezos-based blockchain project that securely stores robot logs in the blockchain using smart contracts and designs a new consensus algorithm, Delegated Proof of Stake with node's Resource and Behavior (DPoRB), which is tailored to heterogeneous robot swarms to improve the efficiency and fairness of the consensus process. Finally, this paper conducts experimental research on HR-Chain, and the simulation results show that the new consensus algorithm has better performance in terms of throughput and so on. The real experimental results show that the robot system based on HR-Chain has better response speed and execution force, showing its potential application prospects in the industrial and consumer fields.
|
|
17:00-17:15, Paper Tu-PS50-T6.5 |
MoCArU: Low-Cost Wireless Portable Robot Localization System Using IoT |
|
Assabumrungrat, Rawin | Tohoku University |
Bezerra, Ranulfo | Tohoku University |
Pereira Barros, Iuri | Tohoku University |
Kojima, Shotaro | Tohoku University |
Okada, Yoshito | Tohoku University |
Konyo, Masashi | Tohoku University |
Ohno, Kazunori | Tohoku University |
Tadokoro, Satoshi | Tohoku University |
Keywords: Robotic Systems, System Modeling and Control, Smart Sensor Networks
Abstract: Localization is crucial for various automation systems to provide awareness of the robot's position and orientation. Additionally, a localization system that offers portability, flexibility, and low computational and economic costs is required by a variety of robotics applications. However, no existing system can offer all the aforementioned features suitable for motion capture tasks involving ground swarm robots. In this study, we propose MoCArU, a novel Motion Capture system based on odometry and ArUco, robustly recognized through image data with low computational cost. We have evaluated the system's performance by comparing it with the ground truth trajectory and adopting different numbers of cameras. The results show that MoCArU can achieve a root mean square error of 0.1345±0.0065 m using ten cameras. Our findings add to previous knowledge by presenting a robust and cost-effective alternative to existing localization methods. Here, we show that MoCArU's use of lightweight camera stands and wireless communication ensures ease of installation, portability, and low computational cost, making it suitable for tracking swarm ground robot systems. We anticipate this system to be used in various applications, such as robot position control, navigation, and obstacle avoidance control. Overall, MoCArU provides a reliable and cost-effective solution for the real-time localization of robots, so its wider applicability in various environments is a significant advantage in robotics. An open-source implementation of MoCArU, as well as its related details, is open for public use at https://www.rm.is.tohoku.ac.jp/MoCArU.
|
|
17:15-17:30, Paper Tu-PS50-T6.6 |
Node Placement to Maximize Reliability of a Communication Network with Application to Satellite Swarms |
|
Buchanan, Calum | University of Vermont |
Bagrow, James | University of Vermont |
Rombach, Puck | University of Vermont |
Ossareh, Hamid | University of Vermont |
Keywords: Cooperative Systems and Control, Communications, Distributed Intelligent Systems
Abstract: The structure of a mobile ad hoc network changes dynamically based on node positioning. We consider a setting in which nodes can communicate if they are within a prescribed distance of one another, giving rise to a communication network. An example is a swarm of small satellites that cooperate to perform tasks; such swarms are likely to become commonplace in space missions. In this paper, we consider the problem of adding a new node or repositioning a current node in the network while optimizing a given network parameter such as network reliability. Although there are infinitely many locations to place the new node in space, there are only finitely many possible changes to the communication network. We provide an algorithm that enumerates all possible network changes in time O(n^2 log(n)) or O(n^3 log(n)), for networks in 2- or 3-dimensional Euclidean space, respectively. We apply the proposed algorithm to a satellite swarm formation planning problem, where the goal is to maximize network reliability.
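The distance-based communication model in this abstract, and the observation that many candidate positions induce the same change to the network, can be illustrated with a small sketch. It is not the enumeration algorithm of the paper; the function names, coordinates, and communication range are hypothetical.

```python
import math
from itertools import combinations


def communication_graph(positions, comm_range):
    """Adjacency sets of the network: two nodes are linked if within comm_range."""
    adjacency = {i: set() for i in range(len(positions))}
    for i, j in combinations(range(len(positions)), 2):
        if math.dist(positions[i], positions[j]) <= comm_range:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


def neighbors_of_candidate(positions, candidate, comm_range):
    """Set of existing nodes that a new node placed at `candidate` could reach."""
    return frozenset(i for i, p in enumerate(positions)
                     if math.dist(p, candidate) <= comm_range)


def is_connected(adjacency):
    """Depth-first search check that every node is reachable from node 0."""
    if not adjacency:
        return True
    seen, stack = {0}, [0]
    while stack:
        for neighbor in adjacency[stack.pop()]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return len(seen) == len(adjacency)


# Hypothetical 2-D swarm: node 2 is out of range of the others.
swarm = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0)]
same_a = neighbors_of_candidate(swarm, (2.0, 0.0), comm_range=1.5)
same_b = neighbors_of_candidate(swarm, (2.1, 0.1), comm_range=1.5)
repaired = communication_graph(swarm + [(2.0, 0.0)], comm_range=1.5)
```

Here the two nearby candidate positions yield the same neighbor set, so they produce the same network change. Only the distinct neighbor sets need to be compared when optimizing a network parameter.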
|
|
Tu-PS50-T7 Regular Session, Oahu |
Vehicle Automation and Control |
|
|
|
16:00-16:15, Paper Tu-PS50-T7.1 |
A Beta-Less Approach for Vehicle Cornering Stiffness Estimation under Varying Road Friction |
|
Wittmer, Kelvin | Institute for System Dynamics, University of Stuttgart |
Henning, Kay-Uwe | AUDI AG |
Sawodny, Oliver | University of Stuttgart |
Keywords: Autonomous Vehicle, Intelligent Transportation Systems, System Modeling and Control
Abstract: Modern vehicles are equipped with a growing number of advanced driver assistance systems (ADAS), such as adaptive cruise control or collision avoidance systems. These functions heavily rely on the current state of the vehicle. Particularly for lateral vehicle dynamics functions, the state of the tires has a crucial influence. For moderate driving scenarios, the vehicle lateral tire forces are mainly described by the tire cornering stiffnesses, which give a linear relation between tire slip angles and the resulting lateral tire forces. However, the tire cornering stiffnesses heavily depend on the type of tire and on many parameters, most significantly the tire vertical load, the tire wear and the tire pressure. As such changes in tire cornering stiffness can lead to a completely different vehicle behavior, it is essential to obtain current cornering stiffness estimates for the application of ADAS. Thus, this paper proposes a combined beta-less approach for vehicle cornering stiffness and road-friction coefficient estimation based on a nonlinear tire model, which solely makes use of typical vehicle in-series sensors. Vehicle measurements on dry road as well as on ice show the effectiveness of the proposed method.
|
|
16:30-16:45, Paper Tu-PS50-T7.3 |
A Novel Scenario-Based Testing Approach for Cooperative-Automated Driving Systems |
|
Zhang, Xizhe | University of Warwick |
Mo, Yuen Kwan | University of Warwick |
Chodowiec, Emil | University of Warwick |
Tang, Yun | University of Warwick |
Higgins, Matthew | University of Warwick |
Khastgir, Siddartha | WMG, University of Warwick, UK |
Jennings, Paul | WMG, University of Warwick |
Keywords: Autonomous Vehicle, Communications, Trust in Autonomous Systems
Abstract: This paper presents a scenario-based safety assurance process for Automated Driving Systems (ADSs) combined with Vehicle-to-Everything (V2X) connectivity. A system with connectivity capability is benchmarked against a system without such capability. In addition, a novel approach to V2X modelling is introduced and implemented to obtain the required configuration of the V2X parameters for an individual system. This modelling approach ensures that the V2X is effective within a system during testing by using a distance-based V2X parameter that correlates to the speed of the system. This parameter is then used as the configuration of the V2X model when carrying out the combined ADS and V2X test. Using a pedestrian crossing scenario, a reduction in failure cases is demonstrated when combining the V2X and the ADS. Therefore, a synchronised approach to the vehicle, sensor and communication sub-systems can improve the overall safety function of the system.
|
|
16:45-17:00, Paper Tu-PS50-T7.4 |
Feed-Forward Control of a Construction Vehicle's Hydro-Mechanical Powertrain to Prevent Engine Stalling |
|
Parlapanis, Christos | University of Stuttgart |
Frontull, Matthias | Liebherr-Werk Telfs GmbH |
Sawodny, Oliver | University of Stuttgart |
Keywords: Mechatronics, System Modeling and Control
Abstract: Automation functionalities of mobile construction site vehicles continue to improve constantly with the growth of the industry. Advanced sensor technology and enhanced computational power allow for the development of innovative and robust assistance systems. By applying control algorithms to the machines, they are enabled to operate on a construction site autonomously. Whereas such use cases expand the utilization of these vehicles, existing features can be improved as well. For instance, load limit control avoids engine stalling of a mobile machine in critical scenarios by regulation of the engine dynamics. This work presents a feed-forward control approach for a telescopic handler's driving functionality, which is actuated by a hydro-mechanical powertrain and powered by a diesel engine. Application of the proposed method to the real system yields a quality of life improvement through a more reliable prevention of engine stalling due to a lower reaction time to external loads. Furthermore, the engine speed decrease is reduced by efficiently regulating the drive pump.
|
|
17:00-17:15, Paper Tu-PS50-T7.5 |
FlyTransformer: A Cross-Modal Fusion Policy for UAV End-To-End Trajectory Planning |
|
Shi, Wenxiang | Fudan University |
Zhao, Chen | Fudan University |
Tang, Kailei | Fudan University |
Sheng, Junru | Fudan University |
Dong, Zhiyan | Fudan University |
Zhang, Lihua | Fudan University |
Kang, Xiaoyang | Fudan University |
Cao, Kai | Fudan University |
Keywords: Autonomous Vehicle, Robotic Systems, System Architecture
Abstract: The ability to perform efficient trajectory planning is crucial for UAVs to carry out tasks autonomously. However, existing research on UAV trajectory planning often employs a cascaded process that involves high-precision maps, real-time positioning and path planning. These methods have limitations such as high computational complexity and time delay, which hinder the efficiency of trajectory planning. End-to-end trajectory planning methods offer a promising solution to this problem. As the core of these end-to-end methods, the perception end plays a decisive role in trajectory planning. However, current multimodal perception fusion is limited to post-fusion: it lacks intermediate feature-level fusion and pays no attention to global visuospatial information. To solve these problems, we propose a new network architecture called FlyTransformer, which fuses the proprioceptive state and visual perception at the feature level for end-to-end trajectory planning. The architecture can also attend to key visuospatial information. We evaluate our method in forest and cuboid scenarios and their corresponding outdoor scenarios. The results show that FlyTransformer outperforms other baseline algorithms in terms of efficiency and performance.
|
|
17:15-17:30, Paper Tu-PS50-T7.6 |
Vehicular Teamwork for Better Positioning |
|
Famili, Alireza | Virginia Tech |
Slyusar, Vladyslav | Ford Greenfield Labs |
Lee, Yun Ho | Ford Greenfield Labs |
Stavrou, Angelos | Virginia Tech |
Keywords: Autonomous Vehicle, Communications, Cooperative Systems and Control
Abstract: Recent developments in the autonomous vehicle industry have increased the significance of accurate positioning. Popular techniques for localization include the Global Positioning System (GPS). However, owing to the presence of obstructions, GPS signals are unavailable in dense urban environments. Moreover, in indoor environments (such as an underground parking garage), GPS signals are inaccessible to users. In this article, we introduce a novel technique for accurate indoor vehicular positioning. The first step in our proposed system is localization based on received signal strength (RSS) fingerprints of 5G New Radio (NR) downlink signals. Furthermore, to compensate for the high susceptibility of RSS fingerprinting techniques to varying environments, we propose a real-time collaborative localization scheme based on 5G sidelink device-to-device (D2D) communication. We develop extensive test campaigns to assess the efficacy of our proposed two-step scheme. According to test results, our proposed algorithm outperforms scenarios that rely solely on 5G RSS fingerprints.
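The first step named in this abstract, localization from RSS fingerprints, is commonly implemented as a nearest-neighbor lookup in signal space. The sketch below shows that generic technique under stated assumptions; it is not the authors' algorithm and does not cover the sidelink-based collaborative step. The function name, the survey database, and all RSS values are hypothetical.

```python
import math


def locate_by_fingerprint(fingerprints, observed_rss, k=3):
    """Estimate an (x, y) position by averaging the k closest survey points.

    `fingerprints` is a list of ((x, y), rss_vector) pairs recorded offline;
    closeness is the Euclidean distance between RSS vectors in dBm.
    """
    if not fingerprints or k <= 0:
        raise ValueError("need at least one fingerprint and k >= 1")
    ranked = sorted(fingerprints,
                    key=lambda entry: math.dist(entry[1], observed_rss))
    nearest = ranked[:k]
    x = sum(position[0] for position, _ in nearest) / len(nearest)
    y = sum(position[1] for position, _ in nearest) / len(nearest)
    return (x, y)


# Hypothetical survey of a garage: RSS (dBm) from two base stations per point.
database = [
    ((0.0, 0.0), (-50.0, -80.0)),
    ((10.0, 0.0), (-80.0, -50.0)),
    ((0.0, 10.0), (-60.0, -60.0)),
    ((10.0, 10.0), (-90.0, -90.0)),
]
estimate = locate_by_fingerprint(database, (-51.0, -79.0), k=2)
```

Averaging over several neighbors smooths the estimate, but the result still depends on how well the offline survey matches current conditions, which is the susceptibility the abstract's collaborative second step is meant to offset.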
|
|
Tu-PS50-T8 Special Session, Hawaii 3 |
Emerging Manufacturing Technologies and Their Applications |
|
|
Organizer: Qiao, Yan | Macau University of Science and Technology |
Organizer: Zhou, Mengchu | New Jersey Institute of Technology |
Organizer: Liu, Bin | IKAS Industries |
Organizer: Ghahramani, Mohammadhossein | Birmingham City University |
|
16:00-16:15, Paper Tu-PS50-T8.1 |
Grasping in Uncertain Environments: A Case Study for Industrial Robotic Recycling (I) |
|
Daniels, Annalena | Technical University of Munich |
Kerz, Sebastian | Technical University of Munich |
Bari, Salman | Technical University of Munich |
Gabler, Volker | Technical University of Munich |
Wollherr, Dirk | Technische Universität München |
Keywords: Consumer and Industrial Applications, Robotic Systems, Manufacturing Automation and Systems
Abstract: Autonomous robotic grasping of uncertain objects in uncertain environments is an impactful open challenge for the industries of the future. One such industry is the recycling of Waste Electrical and Electronic Equipment (WEEE) materials, in which electric devices are disassembled and readied for the recovery of raw materials. Since devices may contain hazardous materials and their disassembly involves heavy manual labor, robotic disassembly is a promising avenue. However, because devices may be damaged, dirty and unidentified, robotic disassembly is challenging: object models are unavailable or cannot be relied upon. This case study explores grasping strategies for industrial robotic disassembly of WEEE devices with uncertain vision data. We propose three grippers and appropriate tactile strategies for force-based manipulation that improve grasping robustness. For each proposed gripper, we develop corresponding strategies that can perform effectively in different grasping tasks and leverage the gripper's design and unique strengths. Through experiments conducted in lab and factory settings for four different WEEE devices, we demonstrate how object uncertainty may be overcome by tactile sensing and compliant techniques, significantly increasing grasping success rates.
|
|
16:15-16:30, Paper Tu-PS50-T8.2 | Add to My Program |
Development of a 3D Mushroom Cultivation Medium with Drop Harvesting Mechanism Based on a 3D Printed Elastic Structure |
|
Saito, Kouki | Yamagata University |
Ogawa, Jun | Yamagata University |
Watanabe, Yosuke | Yamagata University |
Shiblee, MD Nahin Islam | Yamagata University |
Furukawa, Hidemitsu | Yamagata University |
Keywords: Control of Uncertain Systems, Adaptive Systems, Soft Robotics
Abstract: Harvesting mushrooms while maintaining their quality remains a challenging task due to their fast growth rate, uneven developmental posture, and soft, easily damaged characteristics. This study proposes a 3D cultivation medium that enables harvesting by deformation to eliminate the mismatch between harvesting and cultivation. The proposed medium embeds a 3D-printed anisotropic elastic support structure in the covering soil, allowing mushrooms to be generated from the sides and bottom. Furthermore, a mechanism is developed to detach the successfully grown mushrooms by uniaxial contraction of the medium, induced by the out-of-plane pressure generated during free-fall. We also discuss how our method can potentially improve yield compared to conventional flat plate mushroom cultivation.
|
|
16:30-16:45, Paper Tu-PS50-T8.3 | Add to My Program |
Digital Twin Modeling Framework for Manual Warehouses |
|
Drissi Elbouzidi, Adnane | LAMIH CNRS, Arts Et Métiers ParisTech |
Pellerin, Robert | CIRRELT, IVADO, Polytechnique Montréal |
Ait El Cadi, Abdessamad | Univ. Polytechnique Hauts-De-France, CNRS, UMR 8201 - LAMIH / In |
Lamouri, Samir | LAMIH CNRS, Arts Et Métiers ParisTech |
Boubaker, Selmen | Square Research Center |
Keywords: Digital Twin, Adaptive Systems, System Modeling and Control
Abstract: This paper introduces a novel framework for replicating manual processes in digital twins. The proposed approach allows the modeling of manual process variability, a key aspect often overlooked in the literature. Our research highlights the potential of AI to tailor the digital twin to specific contexts influenced by human factors. The model's accuracy can be improved with each simulation synchronization cycle through supervised machine learning, which enhances the alignment of the virtual and physical processes. The proposed digital twin framework aims to avoid discrepancies from the physical counterpart, disregarding decisions based on fixed parameters that do not evolve over time and ensuring a higher degree of realism of the virtual replica. Ultimately, we aim to design the digital twin as a realistic and effective decision-making aid for all human participants engaged in the digital twin loop.
|
|
16:45-17:00, Paper Tu-PS50-T8.4 | Add to My Program |
Gig Work Systems Design for Dynamic Work Management with Freelancers |
|
Asanaka, Riko | Keio University |
Inoue, Masaki | Keio University |
Keywords: Service Systems and Organizations, System Modeling and Control, Decision Support Systems
Abstract: Gig work is a work style in which freelancers undertake one-off jobs at their own convenience. A gig work system can be seen as a management system that matches one-off jobs with registered freelancers, subject to their free decision-making. This paper addresses a systematic design of the gig work system. To this end, we first model the decision-making of the freelancers and the evolution of the remaining workloads. We express the probability that at least one of the registered freelancers gets the opportunity of a one-off job based on their utility. Next, we design a model-based controller that matches one-off jobs with the freelancers. Finally, the effectiveness of the proposed work management system is verified in numerical experiments.
|
|
17:00-17:15, Paper Tu-PS50-T8.5 | Add to My Program |
Multi-Parametric Model Predictive Control Strategies for a Rotary Tablet Press in Pharmaceutical Industry |
|
Nascu, Ioana | Technical University of Cluj Napoca |
Diangelakis, Nikolaos | Technical University of Crete |
Huang, Yan-Shu | Purdue University |
Nagy, Zoltan | Purdue University |
Birs, Isabela | Ghent University, FWO |
Nascu, Ioan | Universitatea Tehnica Din Cluj Napoca |
Keywords: System Modeling and Control, Control of Uncertain Systems, Digital Twin
Abstract: The pharmaceutical manufacturing industry has been undergoing a paradigm change from standard batch manufacturing towards continuous manufacturing, which is a quicker, more effective approach. Even so, most work in continuous tablet manufacturing has focused on classic model predictive control strategies. Multi-parametric model predictive control strategies are a better fit for assuring a quality-by-control (QbC) approach and progressing towards embracing the Industry 4.0 revolution. In this work, an advanced multi-parametric model-based predictive control strategy is developed for the control of a continuous manufacturing process for solid dosage forms in the pharmaceutical industry. The model used to develop the advanced control strategies is validated and calibrated using real data from the Pilot Plant. This will lead to better process performance, including increased process efficiency, expanded process flexibility, and decreased environmental impact, despite the presence of process uncertainties, measurement noise, and disturbances.
|
|
17:15-17:30, Paper Tu-PS50-T8.6 | Add to My Program |
Robust Optimization for Bilevel Production Planning Problems under Customer's Uncertainties |
|
Nakao, Jun | Okayama University |
Nishi, Tatsushi | Okayama University |
Liu, Ziang | Okayama University |
Keywords: Enterprise Information Systems, Discrete Event Systems, Decision Support Systems
Abstract: In recent years, a form of production known as high-mix low-volume production has attracted attention. Mass customization is a production strategy that manufactures products to meet diverse customer demands while maintaining production efficiency close to that of mass production. Modular production is one of the production methods used to achieve mass customization: products are manufactured by combining modules. This production method allows a manufacturer to increase its total profit by manufacturing products suited to individual customers while reducing costs. In actual production planning, production costs and customer demands are uncertain. By considering these uncertainties, robust production planning can be achieved. Various robust production planning methods have been proposed in previous studies; however, no previous studies have reported cases of modular production that reflect customer decisions. Therefore, in this study, the production planning problem is formulated as a bilevel production planning problem consisting of customers and a manufacturer, and robust optimization is applied. Finally, we verify the usefulness of the proposed method under customer demand uncertainty by comparing the conventional model with the robust optimization model.
|
|
Tu-PS50-T9 Special Session, Hawaii 4 |
Add to My Program |
Advanced System with Advanced Computational Techniques to the Real-Life Applications |
|
|
Organizer: Hirotani, Daisuke | Prefectural University of Hiroshima |
Organizer: Hayashida, Tomohiro | Hiroshima University |
Organizer: Tamura, Keiichi | Hiroshima City University |
|
16:00-16:15, Paper Tu-PS50-T9.1 | Add to My Program |
Two-Swarm Cooperative Particle Swarm Optimization Including Prediction Using Gaussian Process Regression (I) |
|
Hayashida, Tomohiro | Hiroshima University |
Nishizaki, Ichiro | Hiroshima University |
Sekizaki, Shinya | Hiroshima University |
Kashihara, Yuki | Hiroshima University |
Keywords: Decision Support Systems, Soft Robotics
Abstract: Particle Swarm Optimization (PSO) (Kennedy and Eberhart, 1995) is an evolutionary computation technique that imitates the foraging behavior of flocks of birds. Particles, each with position and velocity information, explore the search space and efficiently search for solutions by sharing information among themselves. However, PSO has limitations such as early convergence to a local solution and difficulty in finding a global solution due to insufficient search over the wide range of the target space. To address this issue, Two-swarm Cooperative Particle Swarm Optimization (TCPSO) (Sun and Li, 2014) employs two types of particles: slave particles for intensive search and master particles for global search. However, even TCPSO may struggle with high-dimensional and complex problems and fall into a local solution. This study aims to improve the performance of TCPSO by estimating the function of the target problem using Gaussian process regression during the solution search process.
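The position-and-velocity search described in this abstract can be illustrated with a minimal sketch of plain single-swarm PSO. This is not the TCPSO variant or the Gaussian process regression extension the paper proposes; the function name `pso`, the inertia and acceleration constants, and the sphere test objective are all assumptions chosen for illustration.

```python
import random


def pso(objective, dim, lower, upper, n_particles=30, iters=200, seed=0):
    """Plain single-swarm PSO (illustrative only, not the paper's TCPSO)."""
    rng = random.Random(seed)
    # Assumed, commonly used inertia weight and acceleration coefficients.
    w, c1, c2 = 0.729, 1.49445, 1.49445

    pos = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]

    # Each particle remembers its personal best; the swarm shares a global best.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity: inertia + pull to personal best + pull to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp the particle to the search bounds.
                pos[i][d] = min(upper, max(lower, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Hypothetical test objective with its minimum of 0 at the origin."""
    return sum(v * v for v in x)


best_pos, best_val = pso(sphere, dim=2, lower=-5.0, upper=5.0)
print(best_pos, best_val)
```

On a smooth unimodal objective like this one the swarm converges quickly; the early-convergence limitation named in the abstract appears on multimodal objectives, where all particles can collapse onto a local optimum.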
|
|
16:15-16:30, Paper Tu-PS50-T9.2 | Add to My Program |
A Computationally-Efficient Rollout-Based Approach for Bathymetric Mapping with Multiple Low-Cost Unmanned Surface Vehicles |
|
Macesker, Matthew | University of Connecticut |
Pattipati, Krishna | University of Connecticut |
Licht, Stephen | University of Rhode Island |
Gilboa, Roy | University of Rhode Island |
Keywords: Adaptive Systems, Cooperative Systems and Control, Autonomous Vehicle
Abstract: This paper proposes an integrated software-hardware approach for automated bathymetry mapping using low-cost unmanned surface vehicles (USVs). The solution takes inspiration from adaptive sampling, aiming to minimize the time, hardware, and training costs compared to traditional coverage path planning (CPP) methods. The proposed approach implements a scalable Markov Decision Process (MDP)-based model and a planning method based on multi-agent approximate dynamic programming (ADP), which directs USVs to seek the most informative samples in bandwidth-limited environments with uncertain and incomplete prior map information. The control structure offloads surrogate model updates and path planning to the control station and ensures mission completion even with communication dropouts. Computer simulation results on both real-world and computer-generated bathymetry data demonstrate the effectiveness of the proposed approach in terms of predictive accuracy and efficiency. The proposed solution has the potential to significantly aid coastal development by streamlining access to accurate models of near-shore bathymetry.
|
|
16:30-16:45, Paper Tu-PS50-T9.3 | Add to My Program |
Data-Driven Linear Predictive Control of Nonlinear Processes Based on Reduced-Order Koopman Operator |
|
Zhang, Xuewen | Nanyang Technological University |
Han, Minghao | Nanyang Technological University |
Yin, Xunyuan | Nanyang Technological University |
Keywords: System Modeling and Control, Control of Uncertain Systems
Abstract: In this paper, we propose an efficient data-driven predictive control approach for general nonlinear processes based on a reduced-order Koopman operator. A Kalman-based sparse identification of nonlinear dynamics method is employed to select lifting functions for Koopman identification. The selected lifting functions are used to project the original nonlinear state space into a higher-dimensional linear function space, in which Koopman-based linear models may be constructed for the underlying nonlinear process. To address the potential issue of a significant increase in the dimensionality of the resulting full-order Koopman models caused by the use of lifting functions, we propose a reduced-order Koopman modeling approach based on proper orthogonal decomposition. A computationally efficient linear robust predictive control scheme is established based on the reduced-order Koopman model. A case study on a benchmark chemical process is conducted to illustrate the proposed framework.
|
|
16:45-17:00, Paper Tu-PS50-T9.4 | Add to My Program |
Small Pipe Inspection Robots with Wireless Communication Using Microwave Guided Modes Propagating Along a Pipe Wall |
|
Mizukami, Masato | Muroran Institute of Technology |
Yamaguchi, Masami | Muroran Institute of Technology |
Murata, Hiroshi | Mie University |
Hirata, Akihiko | Chiba Institute of Technology |
Keywords: Mechatronics, Infrastructure Systems and Services, Robotic Systems
Abstract: In recent years, infrastructure facilities have been noticeably deteriorating and need to be inspected. The use of pipe-inspection robots has been investigated, but the robots developed so far are too large for smaller pipes. As the robot is downsized, however, the cables used for communication and power transmission disturb the movement of the robot. In this report, we utilized microwave guided modes propagating along a pipe wall as a communication method. An experimental robot platform for small-diameter pipe inspection was constructed. We confirmed the feasibility of the communication method by conducting a 3D electromagnetic field simulation and experiments on the transmission properties. In particular, the experiments on data communication and picture transmission demonstrated the possibility of communication with the robot. A small pipe-travelling robot platform for experiments on wireless communication was designed and fabricated. We confirmed that the prototype robot was able to communicate wirelessly by using microwave guided modes propagating along the pipe wall.
|
|
17:00-17:15, Paper Tu-PS50-T9.5 | Add to My Program |
Stability-Guaranteed Control Systems with Min-Max Constraints and Machine Learning-Based Virtual Sensors |
|
Hilgert, Eric | Graduate School for Applied Research in North Rhine-Westphalia |
Schwung, Andreas | Southwestphalia University of Applied Science |
Keywords: System Modeling and Control, Smart Sensor Networks, Control of Uncertain Systems
Abstract: In this paper, we present a comprehensive approach for designing and analyzing control systems with min-max constraint controllers and machine learning-based virtual sensors. By leveraging the Standard Nonlinear Operator Form (SNOF), we establish the necessary conditions for global asymptotic stability and demonstrate their applicability through an illustrative example based on a modified plant model from the literature. The proposed methodology effectively handles nonlinearities and constraints, ensuring stability while providing a systematic procedure for constructing a well-formed SNOF by integrating the plant, virtual sensor, and controller. The successful application of this method in the example highlights its potential for addressing complex control problems involving min-max constraints and virtual sensors in real-world scenarios. This paper contributes to the growing body of knowledge in this area and sets the stage for future advancements, including the exploration of transforming other machine learning architectures into the SNOF and extending the stability analysis to accommodate different types of nonlinearities and constraints.
|
|
Tu-S4T1 Virtual Session, Room T1 |
Add to My Program |
Human-Machine Systems Special Session |
|
|
|
16:00-17:00, Paper Tu-S4T1.1 | Add to My Program |
Collaborative Estimating Multiple Gaussian Graphical Models on Resource Constrained Devices in IoT Networks (I) |
|
Zhan, Ying | Southeast University |
Wang, Beilun | Southeast University |
Keywords: Medical Informatics, Systems Safety and Security, Environmental Sensing
Abstract: In recent years, the rapid development of the Internet of Things (IoT) has attracted significant interest in smart healthcare. However, such collaborative IoT applications still face major challenges: multi-task data heterogeneity, high communication cost, decentralized computation, and privacy preservation. As a result, efficiently integrating and processing such complex data has become a crucial problem to solve. In this paper, we propose COFFE, a collaborative federated learning framework that allows edge devices to participate in model training. Our framework leverages the local computing power of edge devices and utilizes the communication resources of the IoT network to enable distributed learning. We apply a compression algorithm that reduces the communication cost by minimizing the amount of data exchanged between the edge devices and the cloud server. We demonstrate the effectiveness of our approach through a set of experiments on a simulated IoT network. Our results show that the proposed COFFE approach achieves significant performance gains compared to traditional centralized learning approaches. Overall, our work provides a promising direction in smart healthcare for applying federated learning to IoT settings and enabling edge device collaboration.
|
|
16:00-17:00, Paper Tu-S4T1.2 | Add to My Program |
DAAN: A Dictionary-Based Adaptive Attention Network for Biomedical Named Entity Recognition with Chinese Electronic Medical Records (I) |
|
Zhu, Zhichao | Beijing University of Technology |
Li, Jianqiang | Beijing University of Technology |
Xu, Chun | Xinjiang University of Finance and Economics |
Zhao, Qing | Beijing University of Technology |
Keywords: Medical Informatics, Biometrics and Applications
Abstract: Biomedical named entity recognition (BNER) is a basic task in the extraction of medical information. Existing deep learning-based approaches usually represent medical text using words or characters. However, most biomedical terms consist of many words (characters). Splitting them into many fragments (words or characters) while leveraging the attention mechanism to assign an attention score to each fragment may disperse the importance weight and cause a lower attention score for the biomedical terms. Therefore, this paper presents a dictionary-based adaptive attention network for BNER. Specifically, a biomedical dictionary is first constructed by integrating multiple existing medical resources. Second, guidance vectors are built by matching the electronic medical record (EMR) text to the constructed dictionary. Then, an adaptive attention strategy is presented that uses the guidance vectors to guide the attention mechanism to assign higher attention to the medical term as a whole. We conduct extensive experiments on a real-world dataset; the results show that our method outperforms all baselines.
|
|
16:00-17:00, Paper Tu-S4T1.3 | Add to My Program |
Towards a Multi-Agent Simulation of Cyber-Attackers and Cyber-Defenders Battles (I) |
|
Soulé, Julien | Univ. Grenoble Alpes, Grenoble INP, LCIS, 26000 Valence, France |
Jamont, Jean-Paul | Univ. Grenoble Alpes |
Occello, Michel | LCIS |
Théron, Paul | AICA IWG |
Traonouez, Louis-Marie | AICA IWG |
Keywords: Systems Safety and Security
Abstract: As cyber-attacks become more and more complex and coordinated, multi-agent cyber-defense strategies could be key to countering attacks as close as possible to the entry points of a networked system. This paper presents a Markovian model, implemented in a simulator, of cyber-attacker agents fighting cyber-defender agents deployed on host network nodes. It aims to provide an experimental framework for implementing realistic coordinated cyber-attack scenarios while assessing dynamic organizations of cyber-defenders. We abstract network nodes as sets of properties, including those of the agents. Actions applied by agents model how the network reacts in a given state and which properties change. The collective choice of actions brings the whole environment closer to or farther from the respective goals of the cyber-attackers and cyber-defenders. Using the simulator, we implemented a realistically inspired scenario with several behavior implementation approaches for cyber-defenders and cyber-attackers.
|
|
16:00-17:00, Paper Tu-S4T1.4 | Add to My Program |
Deep Positional-Representation-Based Local Information Retention Networks for Mammography Classification (I) |
|
Han, Bowen | China University of Petroleum (East China) |
Sun, Luhao | Shandong First Medical University |
Li, Chao | Shandong First Medical University |
Yu, Zhiyong | Shandong First Medical University |
Jiang, Wenzong | China University of Petroleum (East China) |
Liu, Weifeng | China University of Petroleum (East China) |
Tao, Dapeng | Yunnan University |
Liu, Baodi | College of Information and Control Engineering, China University |
Keywords: Medical Informatics
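Abstract: Because the tumor usually occupies only a small part of the whole image in the most common mammography images, deep learning models often lose focus on the tumor region. In previous work, most models addressed this problem by using ROI annotations to train the model, which is expensive and difficult to apply widely. Some recent ROI-free methods use multi-scale features or multi-stage training to remove the model's dependence on ROI, but they greatly increase computational complexity and deployment difficulty, limiting the potential of deep neural networks. Therefore, a deep positional-representation-based local information retention network (PR-LIR) is proposed. PR-LIR is a single-feature, end-to-end mammography image classification method that uses positional representation (PR) and multi-scale region pooling (MRP) modules to localize tumor regions and retain the regional semantic information of small-target tumors at different scales, without ROI labels or multi-stage training, and with almost no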
|
|
16:00-17:00, Paper Tu-S4T1.5 | Add to My Program |
A Simple yet Effective 2D-3D Lifting Method for Monocular 3D Human Pose Estimation |
|
Fang, Qin | Tongji University |
Xu, Zihan | Tongji University |
Hu, Mengxian | Tongji University |
Liu, Chengju | Tongji University |
Chen, Qijun | Tongji University |
Keywords: Human Performance Modeling
Abstract: Monocular single-frame 3D human pose estimation (HPE) has garnered significant interest, particularly in the domains of human-computer interaction and human action recognition. Several remarkable works have achieved excellent results in obtaining accurate 2D human poses. Building upon the foundation laid by previous advancements, this paper focuses on 2D-3D lifting. The proposed 2D-3D lifting algorithm in this paper is a transformer-based model, which demonstrates its unique advantages in processing sequential data. The self-attention mechanism in the transformer can handle global information without being limited by the receptive field. The transformer's superior ability to process global information enables it to adaptively learn the relationships between human joints across different human behaviors. However, directly estimating 3D human pose from a single 2D image is a complex task due to depth ambiguity and joint occlusion. This ill-posed problem arises from the limited information available in a 2D image, making it challenging to determine the precise 3D pose. Additionally, occlusions further complicate the accurate estimation of joint positions. This article introduces a transformer-based network that enhances the algorithm's robustness by utilizing multi-layer dual-stream blocks. One path concatenates the 2D coordinates and the focal length of the camera as input, while the other path takes the difference between the 2D coordinates as input. The two paths undergo an information fusion process at the front end of the block. Experiments conducted on various datasets verify the effectiveness of the algorithm and achieve state-of-the-art performance on Human 3.6M and MPI-INF-3DHP benchmarks. Our code will be made publicly available on GitHub.
|
|
16:00-17:00, Paper Tu-S4T1.6 | Add to My Program |
Cross-Domain Based Deep Neural Network for Obstructive Sleep Apnea Detection Via Piezoelectric Ceramic Sensor Array |
|
Shao, Yingying | Beijing University of Posts and Telecommunications |
Hu, Dikun | Beijing University of Posts and Telecommunications |
Liu, Yi | Beijing University of Posts and Telecommunications |
Li, Zhengdong | Beijing University of Posts and Telecommunications |
Fan, Xiaomao | Shenzhen Technology University |
Gao, Weidong | Beijing University of Posts and Telecommunications |
Keywords: Biometrics and Applications, Design Methods, Networking and Decision-Making
Abstract: Obstructive sleep apnea (OSA) is considered one of the most common sleep disorders, causing multi-organ, multi-system dysfunction and leading to complications such as depression, insomnia, and stroke. A piezoelectric ceramic sensor array (PCSA), which can easily be embedded into a mattress, is a promising tool for monitoring OSA events at home. Previous efforts achieved promising results for PCSA-based OSA detection; however, two major challenges remain open: (1) how to select the high-quality PCSA signals; (2) how to alleviate the difference among PCSA signals from different individuals. To address these challenges, we propose a cross-domain deep neural network for OSA event detection named CDDNNet. To obtain high-quality PCSA signals, we propose a new dynamic channel-selection algorithm that maximizes the energy and minimizes the variance of the PCSA signals. To give CDDNNet cross-domain learning ability, we employ the gradient reverse learning (GRL) technique to alleviate the difference among PCSA signals from different individuals. Furthermore, we conduct a pilot study to collect overnight PCSA signals from the Peking Union Medical College Hospital. Experimental results show that CDDNNet achieves competitive results of 54.20% sensitivity, 90.30% specificity, and 84.68% accuracy for OSA event detection.
|
|
16:00-17:00, Paper Tu-S4T1.7 | Add to My Program |
When You Were Old: Exploring a Virtual Reality Older Adults Experience Simulation System |
|
Liu, Zicheng | Southeast University |
Ding, Ding | Southeast University |
Zheng, Yuchen | Southeast University of China |
Li, Zhuying | Southeast University |
Xiong, Runqun | Southeast University |
Keywords: Virtual and Augmented Reality Systems, Virtual/Augmented/Mixed Reality, Human-Computer Interaction
Abstract: The mental health of older adults is a vital issue in the era of global aging. Empathy towards older adults constitutes a crucial component of social interaction, as it engenders awareness of the physical and mental obstacles encountered by this population. Empathy prompts individuals to be more friendly, considerate, and prosocial when interacting with older adults on various occasions, such as volunteering and professional caring. Various approaches have been used to promote people's empathy, but disadvantages such as low participation enthusiasm, high cost, and long time requirements cannot be ignored and are not easy to address. To facilitate people's empathy towards older adults, we developed EmpathiaVR, a virtual reality system that simulates the experience of older adults. It provides a multi-sensory mixed reality experience, including vision, hearing, and kinesthesis, from an older person's perspective, which is beneficial for provoking users' empathy towards older adults. To investigate the effectiveness of the system, an empirical study was conducted with 24 participants. The experiment employed a between-subjects design with two groups. The results from both the subjective reports and the behavioral variables indicated that the system enhanced people's empathy.
|
|
Tu-S4T2 Virtual Session, Room T2 |
Add to My Program |
Cloud, IoT, and Robotics Integration, and Big Data Computing |
|
|
|
16:00-17:00, Paper Tu-S4T2.1 | Add to My Program |
A CNN-Based Deep Learning Approach in Anomaly-Based Intrusion Detection Systems |
|
Babaei, Aptin | Deakin University |
Mohsenzadeh Kebria, Parham | Deakin University |
Moradi Dalvand, Mohsen | Deakin University |
Nahavandi, Saeid | Swinburne University of Technology |
Keywords: Cloud, IoT, and Robotics Integration, Application of Artificial Intelligence, Cybernetics for Informatics
Abstract: The growing prevalence of cybersecurity threats has increased the demand for robust intrusion detection systems (IDSs). Deep learning techniques have shown promising results in detecting and mitigating these threats, making them an increasingly popular choice in IDS design. However, evaluating the performance of deep learning-based IDSs can be challenging due to the complexity of the models and the lack of standardized evaluation metrics. This review paper presents an overview of the most common evaluation metrics used in deep learning-based IDSs, including precision, confusion matrix, accuracy, F1 score, Area Under Curve (AUC), and recall. Several studies have applied classic machine-learning algorithms such as Random Forest, Decision Tree, and Logistic Regression, but for this paper we used a Convolutional Neural Network (CNN) that is independent of the features in the dataset. The studied papers did not report AUC, and none of them balanced the dataset according to the proportions of its features. The dataset used in this study is the CSE-CIC-IDS2018 dataset, which underwent meticulous cleansing and normalization procedures to ensure the inclusion of legitimate and useful data. Furthermore, a weighting mechanism was introduced to balance the dataset and mitigate the potential for bias in the machine learning process.
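The threshold-based metrics listed in this abstract can all be derived from the four counts of a binary confusion matrix, as the minimal sketch below shows. The counts and class labels are made-up illustrative values, not results from the paper. The inverse-frequency weighting is only one common way to balance classes and is an assumption here, since the abstract does not specify the paper's weighting mechanism.

```python
from collections import Counter


def binary_metrics(tp, fp, fn, tn):
    """Precision, recall, F1, and accuracy from confusion-matrix counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}


def inverse_frequency_weights(labels):
    """Assumed balancing scheme: weight = n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


# Hypothetical counts for an imbalanced intrusion-detection test set.
m = binary_metrics(tp=80, fp=20, fn=10, tn=890)
weights = inverse_frequency_weights(["benign"] * 900 + ["attack"] * 100)
print(m)
print(weights)
```

With these made-up counts the accuracy is 0.97 while precision is only 0.80, which illustrates why accuracy alone can be misleading on imbalanced traffic data.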
|
|
16:00-17:00, Paper Tu-S4T2.2 | Add to My Program |
GC-SALM: Multi-Task Runoff Prediction Using Spatial-Temporal Attention Graph Convolution Networks |
|
Lu, Jin | Hohai University |
Xie, Zaipeng | Hohai University |
Chen, Jiayu | Hohai University |
Li, Maohua | Hohai University |
Xu, Chenghong | Hohai University |
Cao, Hongli | Southeast University |
Keywords: Big Data Computing, Computational Intelligence, Machine Learning
Abstract: Runoff prediction is essential for flood forecasting, irrigation planning, and sustainable water resource management. However, accurate predictions can be challenging due to the involvement of multiple variables. This paper presents a novel Graph Convolution-based Spatial-temporal Attention LSTM Multi-Task learning (GC-SALM) model for accurate runoff predictions. Our approach combines a multilayer neural network and an attention mechanism for enhanced generalization performance. The GC-SALM model employs spatial attention and graph convolutional networks to discern local and global spatial patterns, while temporal attention and LSTM are utilized to capture temporal characteristics within extended sequences. Experimental results reveal that the proposed model outperforms six state-of-the-art methods in runoff prediction and flow calibration, emphasizing its potential for real-world hydrological applications.
|
|
16:00-17:00, Paper Tu-S4T2.3 | Add to My Program |
MACEdge: Real-Time Video Analytics Based on Multi-Access Collaborative Edge Computing |
|
Zhong, Dian | East China Normal University |
Zhu, Minghua | East China Normal University |
Huang, Binbin | ECNU |
Keywords: Cloud, IoT, and Robotics Integration, AI and Applications, AIoT
Abstract: Video analysis typically requires a significant amount of computing resources and energy. Traditional cloud-based video analysis relies on concentrating computing resources in the cloud, which puts a tremendous load on network bandwidth and degrades user experience due to latency. Edge computing offers a solution by offloading a portion of the video analysis tasks to the edge, significantly reducing the pressure on bandwidth and latency. However, the fixed video configurations (resolution, frame rate, etc.) uploaded to the edge may not be optimal for specific situations. Moreover, using a low video configuration makes it difficult for edge video analysis systems to recognize environmental changes, while using a high video configuration poses challenges to latency and bandwidth. Therefore, in this paper, we propose MACEdge, an edge video analysis framework that leverages the multi-access capability of the edge gateway to enhance edge perception by accessing heterogeneous sensors. We offer an algorithm based on Q-learning for dynamic adaptive adjustment of sensor thresholds and video configurations. In experiments using real-world data, our collaborative solution improved accuracy and reduced high bandwidth costs compared to other video configuration adjustment benchmarks.
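As a rough illustration of how a Q-learning agent could adapt a video configuration, the sketch below shows the standard tabular Q-learning update with epsilon-greedy action selection. The state names, the three configuration actions, the reward value, and the learning parameters are hypothetical and are not taken from the MACEdge paper.

```python
import random
from collections import defaultdict

# Hypothetical action set: candidate video configurations.
ACTIONS = ["low", "medium", "high"]


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular update: Q += alpha * (r + gamma * max Q' - Q)."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next
                                   - q[(state, action)])
    return q[(state, action)]


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(state, a)])


q = defaultdict(float)  # unseen (state, action) pairs start at 0.0

# Assumed scenario: on a congested link, the low configuration earns reward 1.0.
first = q_update(q, "congested", "low", reward=1.0, next_state="clear")
second = q_update(q, "congested", "low", reward=1.0, next_state="clear")
greedy = choose_action(q, "congested", epsilon=0.0, rng=random.Random(0))
print(first, second, greedy)
```

In a real system the reward would have to trade off analysis accuracy against bandwidth and latency, and the state would encode sensor readings; both are design choices the abstract does not detail.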
|
|
16:00-17:00, Paper Tu-S4T2.4 | Add to My Program |
RACCS: Real-Time Cloud-Edge Collaborative Anomaly Analysis System with Compressive Sensing Joint Optimization |
|
Huang, Binbin | ECNU |
Zhu, Minghua | East China Normal University |
Zhong, Dian | East China Normal University |
Keywords: Cloud, IoT, and Robotics Integration, AIoT, Deep Learning
Abstract: In the context of the Internet of Things (IoT), ensuring public safety relies heavily on the real-time detection of anomalies in monitoring systems. Thus, video anomaly analysis has garnered significant attention. We propose a collaborative anomaly analysis system (RACCS) that leverages cloud-edge-end computing. The system receives real-time video packets generated by cameras through edge gateways and uses load-balancing algorithms to determine the destination node, reducing system latency. Using the MQTT protocol of the IoT, the video packets are transmitted to edge nodes or the cloud for inference. To prevent video packets with high-confidence results from being transmitted to the cloud for secondary inference, we deploy a lightweight model on the edge nodes for initial inference, thereby reducing bandwidth consumption and latency. To achieve this goal, we adopt specific quantization methods for both the input video packets and the complex cloud model parameters. This converts the complex cloud model into a lightweight model, reducing model storage and improving inference speed. We also propose a joint optimization framework for compressive sensing and anomaly classification that achieves both high compression rates for video packets and high accuracy for anomaly classification. We further conduct a series of experiments to determine the optimal configuration and compare our approach with other mainstream methods, demonstrating the superiority of our system.
|
|
16:00-17:00, Paper Tu-S4T2.5 |
QueryEdge: Real-Time Multi-Video Query in Edge-Cloud Collaborative System |
|
Zhong, Jihua | East China Normal University |
Niu, Yannian | East China Normal University |
Zhu, Minghua | East China Normal University |
Keywords: Cloud, IoT, and Robotics Integration, AIoT, Machine Vision
Abstract: The real-time query of surveillance video plays a significant role in many fields such as public safety, smart city, and abnormality monitoring. However, with the exponential growth of surveillance video data, traditional cloud-based intelligent video processing faces significant challenges in terms of latency and bandwidth, while the pure edge computing approach is deficient in query accuracy due to its lack of computational power. Existing edge cloud collaboration approaches, such as SurveilEdge, focus on real-time target queries within a single video stream and do not show promising results during target queries across multiple video streams. For this reason, this paper proposes QueryEdge, an edge-cloud collaborative real-time query system for multiple video streams. Specifically, we design a real-time query system based on an edge-cloud collaboration framework to achieve highly accurate and low-latency target query services in multiple video streams. In addition, we introduce a prioritization mechanism and a load-balancing strategy in the query task scheduling process to further improve query efficiency. The evaluation proves that QueryEdge has a significant improvement in query latency and bandwidth consumption compared with pure cloud computing, pure edge computing, and SurveilEdge.
|
|
16:00-17:00, Paper Tu-S4T2.6 |
Enhancing Out-Of-Domain Detection for Speech Spoofing Countermeasure Via Supervised Contrastive Learning |
|
Ji, Jianan | Zhejiang University |
Xie, Yang | Zhejiang University |
Yingchun, Yang | Zhejiang University |
Keywords: Biometric Systems and Bioinformatics, AI and Applications, Deep Learning
Abstract: High-performance anti-spoofing countermeasures (CMs) have been widely used to protect automatic speaker verification systems by identifying and filtering spoofing speech. However, their performance degrades severely when confronted with out-of-domain (OOD) samples. To solve this issue, recent papers investigate the strategy that a CM can opt for abstention when it is not confident about the decision. In this paper, we develop an effective approach to enhance the performance of CMs with abstention. Specifically, we introduce supervised contrastive learning to group samples into known classes more tightly and employ data augmentation to enhance the diversity of known samples. The experiments are conducted on ASVspoof 2019 logical access corpus and another test set consisting of different OOD samples from other databases. With our scheme, CMs with abstention can achieve an equal error rate (EER) of 1.84% on the LA test set and 0.86% on the other test set. These results demonstrate that CMs trained with our approach show better OOD detection performance and can make more confident decisions.
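For readers unfamiliar with the metric quoted above, the equal error rate is the operating point where the false acceptance rate and the false rejection rate coincide. The sketch below is a simple approximation over observed scores, assuming that a higher score means "more likely bona fide"; the function name and the score convention are assumptions, and this is not the evaluation code used by the authors.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate EER: error rate at the threshold where FAR and FRR are closest.

    Assumes higher scores indicate bona fide speech.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = None, None
    for th in thresholds:
        # False rejection: bona fide sample scored below the threshold.
        frr = sum(s < th for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance: spoofed sample scored at or above the threshold.
        far = sum(s >= th for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer
```

A countermeasure with abstention would additionally withhold a decision for low-confidence samples; the abstract does not specify how that confidence is computed, so it is left out here.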
|
|
16:00-17:00, Paper Tu-S4T2.7 |
Consistency Inspection for Assembly of Bolt on Engine Using Multi-View Stereo |
|
Liu, Shuo | Xi'an Jiaotong University |
Jiang, Minglv | Xi'an Jiaotong University |
Lu, Yuanliang | Xi'an Jiaotong University |
Wang, Jianji | Xi'an Jiaotong University |
Zheng, Nan-Ning | Xi'an Jiao Tong University |
Keywords: Cloud, IoT, and Robotics Integration
Abstract: Assembly inspection is a crucial aspect of smart manufacturing to guarantee the quality of products. To address the challenges posed by complex workshop scenes, an assembly inspection algorithm based on multi-view stereo is proposed. Specifically, to enhance the accuracy of reconstruction, the multi-view stereo method utilizes segmentation attention-assisted depth estimation and point cloud fusion to mitigate the noise in the reconstructed point cloud. The proposed method also introduces a normalized depth loss to improve the reconstruction ability of foreground pixels. Furthermore, this paper presents a novel 3D point cloud-based size estimation algorithm capable of accurately estimating the size of small objects in complex working scenes. Experimental results demonstrate the efficacy of the improved multi-view stereo algorithm in assembly scenes and verify the superiority of the proposed size estimation algorithm.
|
|
Tu-S4T3 Virtual Session, Room T3 |
Machine Learning for Intelligent Imaging Systems VII |
|
|
Organizer: Tang, Jinshan | George Mason University |
Organizer: Agaian, Sos | New York City University |
|
16:00-17:00, Paper Tu-S4T3.1 |
A Multi-Distance Feature Dissimilarity-Guided Encoder-Decoder Network for Polyp Segmentation (I) |
|
Zhang, Xianchao | Sichuan Normal University |
Guo, Jinjia | Chongqing University |
Mu, Nan | Sichuan Normal University |
Jiang, Jingfeng | Michigan Technological University |
Keywords: Neural Networks and their Applications, Deep Learning, Computational Life Science
Abstract: Most colorectal cancers originate from adenomatous polyps, which start as single asymptomatic polyps and develop into malignant tumors. In clinical practice, colonoscopy is an extremely effective method for detecting polyps, and it provides important visual information for the accurate identification and removal of polyps. However, it is highly challenging to achieve accurate segmentation of various polyps due to the complex and variable size, shape, color, number, and growth background of polyps at different periods. To address these dilemmas, we propose a multi-distance feature dissimilarity-guided encoder-decoder network for automatic polyp segmentation, mainly consisting of the Multi-Distance Differential Module (MDDM) and the Hybrid Loss Module (HLM). The former mainly utilizes the multilayer feature subtraction (MLFS) operations to extract the difference information between short-distance adjacent layer features and short-distance and long-distance cross-layer features. Given this, the pyramid-inspired MDDM obtains discriminative features continuously between adjacent/cross layers, enhancing complementary features between different layers. The latter supervises the feature maps extracted at each network level to achieve finer predictions. Experiments on four challenge datasets confirm that the proposed model outperforms most state-of-the-art methods in six evaluation metrics while yielding reasonably accurate segmentation results.
|
|
16:00-17:00, Paper Tu-S4T3.2 |
Visual Tracking Based on Efficient Dual-Branch Siamese Network (I) |
|
Zhou, Wenjun | Southwest Petroleum University |
Wang, Nan | Southwest Petroleum University |
Liang, Dong | Nanjing University of Aeronautics and Astronautics |
Peng, Bo | Southwest Petroleum University |
Keywords: Image Processing and Pattern Recognition, Machine Vision, Machine Learning
Abstract: In this paper, we propose a dual-branch Siamese network for visual object tracking. The proposed network consists of two distinct branches: a shallow network branch and a deep network branch. The shallow network branch focuses on precisely locating the target object and improving the anti-interference ability to similar objects, while the deep network branch focuses on capturing the more abstract semantic features of the object. Additionally, a multi-scale key feature fusion module is embedded into the shallow network, enabling the model to accurately locate the target object. Furthermore, we leverage the attention mechanism to further enhance the robustness of the model. Experimental results on three different public datasets demonstrate that our method outperforms state-of-the-art tracking algorithms.
|
|
16:00-17:00, Paper Tu-S4T3.3 |
Machine Learning-Based Context Space Theory (I) |
|
D'Aniello, Giuseppe | University of Salerno |
Gaeta, Matteo | University of Salerno |
Policastro, Pasquale | University of Salerno |
Keywords: Cognitive Computing, Human Factors, Human-Machine Interaction
Abstract: Situation awareness of human and artificial agents can be improved by the recognition and adequate representation of real-life situations. The lack of easy-to-understand, easy-to-use, and effective computational models of situations has hindered the adoption and diffusion of situation awareness approaches in modern human-machine systems. Context Space Theory is a context awareness approach that uses a geometric metaphor to provide integrated mechanisms for representing contexts and situations. A drawback of this approach is the expert-based definition of context and situation spaces. This process can be time-consuming and expensive. In this paper, we propose a novel approach, namely Machine Learning-based Context Space Theory, which adopts machine learning techniques and, in particular, decision trees, to semi-automatically define context spaces and situation spaces with a data-driven approach. A case study related to the monitoring and control of the Covid-19 pandemic in Italy is proposed to demonstrate the feasibility and benefits of the proposed approach.
|
|
16:00-17:00, Paper Tu-S4T3.4 |
Neural Implicit 3D Reconstruction with Double Supervision (I) |
|
Wang, Yifan | Southwest Petroleum University |
Zhou, Tong | Southwest Petroleum University |
Yang, Ping | Southwest Petroleum University |
Zhou, Wenjun | Southwest Petroleum University |
Peng, Bo | Southwest Petroleum University |
Keywords: Machine Vision, Image Processing and Pattern Recognition, Neural Networks and their Applications
Abstract: As human life continues improving, applications like virtual reality (VR) and augmented reality (AR) necessitate increasingly higher-quality 3D reconstructions. With the advancements in neural implicit 3D surface and volume rendering, multi-view 3D reconstruction has garnered significant attention. A prevailing issue in this domain is that the direct combination of neural implicit surface and volume rendering typically considers only photometric consistency loss, leading to an under-constrained surface problem. To address this problem, we develop a double-supervised approach and an integrated network for implicit surface and volume rendering, enabling the generation of a 3D surface model of an object from a set of multi-view images. In our approach, the object surface is represented by a signed distance function, and a 3D surface reconstruction model is jointly trained with photometric consistency constraints and geometry constraints. Experiments demonstrate that by enhancing prior geometry supervision and integrating the double-supervised network architecture into neural implicit 3D reconstruction, we can achieve accurate and high-quality 3D reconstruction.
|
|
16:00-17:00, Paper Tu-S4T3.5 |
Exposing Computer-Generated Images Via Amplified Texture Differences Learning (I) |
|
Xu, Qiang | City University of Hong Kong |
Wang, Zhe | City University of Hong Kong |
Mi, Zhongjie | Shanghai Jiao Tong University |
Yan, Hong | City University of Hong Kong |
Keywords: Neural Networks and their Applications, Deep Learning, Image Processing and Pattern Recognition
Abstract: Many Computer-Generated (CG) images are spreading widely on the Internet, which may deliberately misinform or deceive the public. Therefore, distinguishing CG images from natural photographic (PG) images has become a frontier research topic in the field of image forensics. Although many algorithms have been proposed, it is still very challenging to detect CG images generated by the recent cutting-edge generative methods. Besides, most existing algorithms tend to generalize poorly when facing different unseen multimodal generative models. To address these issues, a novel method based on amplified texture differences learning is proposed. We first design a deep texture enhancement module for discriminative texture amplification. Specifically, a semantic segmentation module is utilized to generate a semantic segmentation map that guides the affine transformation operation, which can be further used to recover the texture in different regions of the input image. Then, the combination of the original image and the high-frequency components of the original and enhanced images is fed into a hybrid neural network equipped with attention mechanisms, which refines intermediate features and facilitates trace exploration in the spatial and channel dimensions, respectively. Experiments on several commonly used benchmark datasets and a newly constructed dataset with more realistic and diverse images demonstrate that the proposed approach outperforms some existing methods.
|
|
16:00-17:00, Paper Tu-S4T3.6 |
Holistic ARDS Prognosis Evaluation Framework Utilizing Data Governance and Ensemble Feature Selection (I) |
|
Han, Xiaodong | Yunnan University |
Cai, Fei | Yunnan University |
Xiu, Guanghui | Affiliated Hospital of Yunnan University (The Second People's Ho |
Tao, Dapeng | Yunnan University |
Keywords: Medical Informatics
Abstract: Acute respiratory distress syndrome (ARDS) prognosis has become integral to modern critical care models aimed at determining expected patient outcomes, optimizing clinical pathways, and improving resource allocation. However, existing clinical studies of ARDS prognosis face severe challenges due to the redundancy of multiple sources of heterogeneous data in current healthcare datasets, the high dimensionality of patient features, and the wide variation in the importance of each feature. In this paper, we propose a holistic ARDS prognostic framework for assessing ARDS prognosis through a standardized data governance process and an integrated feature selection approach based on embedded algorithms. Specifically, we extract ARDS patient data from medical data sources by various criteria and normalize the features. Then we apply multiple embedded feature selection methods to obtain a decision matrix based on the data-governed dataset. We conducted extensive experiments to demonstrate the efficiency and superiority of our proposed framework. The experimental results show that our proposed framework performs well in both data governance and feature selection and has wide clinical application potential.
|
|
16:00-17:00, Paper Tu-S4T3.7 |
Cross-Machine Few-Shot Fault Diagnosis for Rotating Machinery with Asymmetric Distribution Measure Network |
|
Zhang, Jianxiang | Tsinghua University |
Zhang, Linxuan | Tsinghua University |
Keywords: Fault Monitoring and Diagnosis
Abstract: In practical industrial scenarios, it is difficult to collect sufficient labeled fault data for diverse machine types. How to train a robust intelligent fault diagnosis model with limited labeled samples and work well on different machine types remains a challenging issue. This paper proposes a novel cross-machine few-shot fault diagnosis method based on asymmetric distribution measure. The proposed method first constructs a large number of meta-diagnostic tasks from source machines with sufficient labeled fault data and then trains the meta-diagnostic model based on meta-learning to learn task-independent diagnostic knowledge, which will be transferred for the diagnostic task of the target machine to achieve accurate few-shot fault diagnosis. A series of cross-machine few-shot fault diagnosis experiments based on three rotating machinery datasets have verified the effectiveness of the proposed method.
|
|
Tu-S4T4 Virtual Session, Room T4 |
BMI Workshop Virtual Session |
|
|
|
16:00-17:00, Paper Tu-S4T4.1 |
A Discussion of Statistical Criteria for Assessing Awareness with SMR BCI after Brain Injury |
|
Amaunam, Idorenyin | University of Essex |
Schneider, Christoph | CHUV |
Da Silva, Marina Lopes | Department of Clinical Neurosciences, Centre Hospitalier Univers |
Jane, Johr | Department of Clinical Neurosciences, Centre Hospitalier Univers |
Diserens, Karin | Department of Clinical Neurosciences, Centre Hospitalier Univers |
Perdikis, Serafeim | University of Essex |
Keywords: Active BMIs, BMI Emerging Applications
Abstract: This work discusses the implications of selecting particular statistical metrics and thresholds as criteria to diagnose awareness through Brain-Computer Interface (BCI) technology in patients with Disorders of Consciousness (DOC). We report a first analysis of a novel dataset collected to investigate whether a motor attempt electroencephalography (EEG) paradigm coupled with Functional Electrical Stimulation (FES) can detect command following and, therefore, signs of conscious awareness in DOC. We assessed 22 DOC patients admitted to the acute rehabilitation unit after a brain lesion over one or more sessions. We extracted EEG sensorimotor rhythms and performed a standard open-loop BCI pipeline evaluation, classifying motor attempt against resting-state trials. We validate this approach by correlating classification accuracy with the established clinical scale Coma Recovery Scale Revised. We employ a machine learning (ML)-inspired diagnostic criterion based on confidence intervals over chance-level classification accuracy and show that it yields more conservative and, arguably, reliable inference of Cognitive Motor Dissociation (CMD) by means of command-following, neuroimaging-based tools, compared to diagnoses based on clinical assessments or criteria examining the statistical significance of brain features across different mental states.
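The abstract does not state the exact formula behind its chance-level criterion. One commonly used formulation, shown here purely as an illustrative sketch, treats the number of correct predictions of a guessing classifier as binomially distributed and asks how high the accuracy can get by chance at a given significance level. The function names, the default alpha, and the trial counts are assumptions, not values from the paper.

```python
from math import comb


def binomial_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def chance_accuracy_threshold(n_trials, n_classes=2, alpha=0.05):
    """Smallest accuracy k/n whose binomial CDF under pure guessing reaches 1 - alpha.

    Under this criterion a classifier counts as above chance only if its
    accuracy exceeds the returned value.
    """
    p = 1.0 / n_classes
    for k in range(n_trials + 1):
        if binomial_cdf(k, n_trials, p) >= 1 - alpha:
            return k / n_trials
    return 1.0
```

The threshold moves further above the nominal 50% as the number of trials shrinks, which is one reason such a criterion tends to be more conservative for short recording sessions.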
|
|
16:00-17:00, Paper Tu-S4T4.2 |
Enhancing Subject-Independent EEG-Based Auditory Attention Decoding with WGAN and Pearson Correlation Coefficient |
|
Pahuja, Saurav | University of Bremen |
Ivucic, Gabriel | University of Bremen |
Putze, Felix | University of Bremen |
Cai, Siqi | National University of Singapore |
Haizhou, Li | National University of Singapore |
Schultz, Tanja | University of Bremen |
Keywords: Passive BMIs, Other Neurotechnology and Brain-Related Topics, BMI Emerging Applications
Abstract: Electroencephalography (EEG) related research faces a significant challenge of subject independence due to the variation in brain signals and responses among individuals. While deep learning models hold promise in addressing this challenge, their effectiveness depends on large datasets for training and generalization across participants. To overcome this limitation, we propose increasing the size and quality of training data for subject-independent auditory attention decoding (AAD) using EEG with deep learning. Specifically, our method employs a Wasserstein Generative Adversarial Network (WGAN) to generate synthetic data, with the Pearson correlation coefficient used to filter for the most realistic samples. We evaluated this method on a publicly available dataset of selective auditory attention experiments and showed superior subject-independent AAD performance. The mixed training set, consisting of both real and artificial data generated by the WGAN+Pearson Correlation Coefficient, demonstrated approximately 4% improvement in AAD accuracy for a 1-second window. These results demonstrate that deep learning remains a viable approach to overcoming data scarcity in subject-independent AAD tasks based on EEG. Moreover, the proposed method has the potential to improve the generalization and reliability of EEG classification tasks.
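The filtering step could look roughly like the sketch below, which keeps only synthetic segments that correlate strongly with a real reference segment. What the paper actually correlates against, and which cut-off it uses, is not stated in the abstract; the reference signal, the `min_r` threshold, and the function names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        # Correlation is undefined for a constant signal; treat it as no match.
        return 0.0
    return cov / (sx * sy)


def filter_synthetic(synthetic, reference, min_r=0.5):
    """Keep synthetic segments whose correlation with the reference reaches min_r."""
    return [s for s in synthetic if pearson(s, reference) >= min_r]
```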
|
|
16:00-17:00, Paper Tu-S4T4.3 |
Sentence Reconstruction Leveraging Contextual Meaning from Speech-Related Brain Signals (I) |
|
Lee, Jiwon | Korea University |
Lee, Seo-Hyun | Korea University |
Lee, Young-Eun | Korea University |
Kim, Soowon | Korea University |
Lee, Seong-Whan | Korea University |
Keywords: BMI Emerging Applications, Other Neurotechnology and Brain-Related Topics, Active BMIs
Abstract: Brain-to-speech systems, which enable communication through neural activity, have garnered significant attention as potential neuroprostheses for patients and as novel communication tools for a broader population. To date, most non-invasive brain-to-speech research has focused on word-level decoding, while sentence-level reconstruction remains challenging. In this study, we introduce a sentence reconstruction method using a restricted range of 16 unique words and compare two different approaches: word-in-sentence reconstruction and natural sentence generation. The focus is on efficiently generating sentences by utilizing the temporal convolutional network model to extract features from EEG signals and create word embeddings that consider the contextual relevance between words. A language model and keyword density measurement are applied to evaluate the sentence reconstruction performance for each approach. The results show that the word-in-sentence approach with language model leads to a significant reduction in the word error rate of 31.58 ± 18.58 % for spoken speech and 56.01 ± 7.57 % for imagined speech. The natural sentence generation approach significantly improved the words per minute performance, enabling a more natural mode of brain-to-speech. We conducted an online demo to verify the potential of the proposed approaches, generating audible speech from brain signals in real-time. These findings demonstrate the feasibility of natural brain-to-speech systems by considering the contextual relevance, allowing users to freely communicate natural sentences in real life.
|
|
16:00-17:00, Paper Tu-S4T4.4 |
An Empirical Study to Evaluate Feature Extraction Approaches CSP, TSM, and CSP-TSM on a MI-BCI under Distraction |
|
Moufassih, Mustapha | IBN ZOHR University |
Tarahi, Ousama | IBN ZOHR University |
Hamou, Soukaina | IBN ZOHR University |
Agounad, Said | IBN ZOHR University |
Idrissi Azami, Hafida | IBN ZOHR University |
Keywords: Active BMIs, BMI Emerging Applications, Other Neurotechnology and Brain-Related Topics
Abstract: Common spatial patterns (CSP) and tangent space mapping (TSM) are frequently used approaches for Motor Imagery Brain-Computer Interface (MI-BCI). These two methods are used in the feature extraction block to transform the EEG time series into a set of features to facilitate the classification learning process. CSP is based on spatial filtering, while TSM is based on covariance matrix estimation and the Riemannian geometry framework. A third approach, called CSP-TSM, can be obtained by combining CSP and TSM. In this paper, we experimentally compare the classification accuracy and the computational time of these three approaches (CSP, TSM, and CSP-TSM) on a MI-BCI dataset acquired under five types of distractions that simulate a pseudo-realistic environment. The obtained results help us to explore the pros and cons of using each approach in a MI-BCI operated in an out-of-lab environment.
|
|
16:00-17:00, Paper Tu-S4T4.5 |
Development of a BCI-Controlled Lower Limb Exoskeleton Simulator |
|
Yousefi Koma, Amir Hossein | Simon Fraser University |
Park, Edward J. | Simon Fraser University |
Arzanpour, Siamak | Simon Fraser University |
Keywords: Brain-Computer Interfaces
Abstract: Brain-computer interface (BCI) technology, particularly when based on electroencephalography (EEG), holds significant potential for controlling powered lower limb exoskeletons in rehabilitation contexts. This study introduces an EEG-based BCI system designed to decode anticipated gait direction, thereby enabling command of a self-balancing, overground exoskeleton. To evaluate the performance of the proposed system safely and effectively, we utilized a dynamic simulator (or a digital twin) of the physical exoskeleton, developed commercially for individuals with limited or no walking ability. Six healthy participants, wearing an EEG device, were instructed to initiate gait movements toward the direction indicated by on-screen arrow triggers (forward, backward, left, and right). A Convolutional Neural Network (CNN), operating on an 80%-20% train-test ratio, was used to evaluate the system. The results demonstrated low error rates for the exoskeleton simulator, and an overall system accuracy of 0.75, reflecting the performance of the EEG-based BCI. Notably, the system had an average delay of 5 minutes in a real-time control setting, primarily attributed to its signal processing step.
|
|
Tu-S4T5 Virtual Session, Room T5 |
General Cybernetics II |
|
|
|
16:00-17:00, Paper Tu-S4T5.1 |
StGAN: A Novel Symbolic Signal Decomposition Base on GANs and Swin Transformer |
|
Sun, Ming | Qingdao University |
Guo, Li | Qingdao University |
Chen, Long | University of Macau |
Keywords: Transfer Learning, Machine Vision, Machine Learning
Abstract: Symbolic imagery signal decomposition is a common problem in digital signal processing. Its main purpose is to divide the symbolic imagery signal into different parts. However, in real-world applications, symbolic imagery signals captured by the camera are usually affected by complex adverse lighting conditions such as highlights, reflections, drop shadows, or information loss. To overcome those problems, a generative adversarial network with Swin Transformer (StGAN) is proposed and applied to symbolic imagery signal decomposition and semantic segmentation tasks. In addition, a realistic image dataset captured under complex lighting conditions, with edge information loss, is proposed for symbolic imagery signal decomposition; we name it the MLT dataset. We demonstrate that StGAN brings significant improvements over some existing methods in the accuracy of symbolic imagery signal decomposition on the MLT dataset. Further experiments on the Sky and Facades datasets show that StGAN works well in other tasks and maintains good accuracy even when less data is involved in training.
|
|
16:00-17:00, Paper Tu-S4T5.2 |
Cooperative Knowledge Elicitation for Formal Ontology Design: An Exploratory Study Applied in Industry for Knowledge Management (I) |
|
Labbani Narsis, Ouassila | CIAD UMR 7533, Université De Bourgogne, UB |
Nicolle, Christophe | Université De Bourgogne |
Keywords: Cooperative Work in Design, Design Methods, Networking and Decision-Making
Abstract: Building formal ontologies remains a complex process for companies. In the literature, this process is based on the technical knowledge and expertise of domain experts, without further details on the used methodologies. Possible problems of disagreements between experts, expression of implicit knowledge related to high level know-how rarely verbalized, qualification of results by use cases, or simply adhesion of the group of experts, remain currently unsolved. This paper proposes a methodological approach based on knowledge elicitation for the conception of formal, consensual, and shared ontologies. The proposed approach is experimentally tested on industrial collaboration projects in the field of manufacturing (associating knowledge sources from multinational companies) and in the field of viticulture (associating explicit knowledge and implicit knowledge acquired through observation).
|
|
16:00-17:00, Paper Tu-S4T5.3 |
Data-Driven Identification and Optimal Control of a Biomechanical Triple-Link Inverted Pendulum for Sit-To-Stand Movement |
|
Haras, Muhammad | University of Arkansas at Little Rock |
Iqbal, Kamran | University of Arkansas at Little Rock |
Keywords: System Modeling and Control, Robotic Systems, Adaptive Systems
Abstract: Data-driven methods are becoming popular for identifying large, complex, and nonlinear systems and are replacing equation-based methods. Dynamic mode decomposition methods are utilized to identify the linear dynamics of an unforced reference generator system and a highly nonlinear triple-link inverted pendulum biomechanical model around an equilibrium point. The identification methods successfully revealed the underlying linear dynamics of the systems. Feedforward and feedback optimal control policies are designed using a policy iteration method based on adaptive dynamic programming. The designed controllers exhibited satisfactory simulation performance in tracking the sit-to-stand movement of the nonlinear model.
|
|
16:00-17:00, Paper Tu-S4T5.4 |
Moisture Content Prediction of Sugi Wood Drying Using Deep LSTM AE Minimizing Perturbed Error |
|
Wang, Ting | South China University of Technology |
Ng, Wing Yin | South China University of Technology |
Zhang, Mingyang | Tencent |
Xueli, Zhang | South China University of Technology Guangzhou |
Zhang, Jianjun | South China University of Technology |
Deng, Mingcong | Tokyo University of Agriculture and Technology |
Keywords: Computational Intelligence, Machine Learning
Abstract: Wood drying technology plays a key role in extending the service lifetime of wood, as the moisture content has a great influence on the wood quality. This paper presents a moisture content prediction model based on the deep long short-term memory (LSTM) autoencoders with stochastic sensitivity (DLASS) to extract a hidden representation of input data. The DLASS uses multiple LSTM encoders to learn more informative hidden representations from unseen samples, which are then decoded using multiple LSTM decoders. The DLASS is trained by minimizing the perturbed error from historical moisture content data. Furthermore, a nonlinear fully connected feedforward neural network as a regression layer is applied to predict moisture content using hidden representations learned by the DLASS. The DLASS is applied to real-world industrial data of Sugi wood processed by a drying kiln made by SECEA from August 4 to 19, 2008. Multiple test cases and comparisons with existing classical and state-of-the-art models show that the DLASS model yields more accurate moisture content prediction results and has high generalization capability. To be specific, the DLASS yields the lowest MAE (0.026), MAPE (0.280), and RMSE (0.058) for predicting moisture content during the wood drying process.
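For reference, the three error measures quoted above are conventionally defined as in the sketch below. The abstract does not say whether its MAPE is reported as a fraction or a percentage; this sketch returns a fraction, which is an assumption, and the function names are illustrative.

```python
from math import sqrt


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error as a fraction; true values must be non-zero."""
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large deviations more than MAE."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))
```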
|
|
16:00-17:00, Paper Tu-S4T5.5 |
Improving 6D Object Pose Estimation Based on Semantic Segmentation (I) |
|
Gao, Fang | Guangxi University |
Li, Qiujun | Guangxi University |
Sun, Qingyi | Guangxi University |
Keywords: Neural Networks and their Applications, Machine Vision, Deep Learning
Abstract: The performance of 6D pose estimation, which is important for scene understanding, can be improved by more accurate object segmentation. RGB-D data including depth maps can provide more accurate position information than RGB data for semantic segmentation. In this work, we propose a novel two-stage RGBD-based pose estimation network, which can provide more precise semantic segmentation and effective point cloud features. First, we use a lightweight semantic segmentation head to process the RGB-D data to get the pixel-level clustered mask, and then use a multi-scale and attention-based backbone to extract the point cloud features for pose estimation. We analyze the performance of our network on the YCB-Video dataset, and the results show that our method is comparable to current state-of-the-art methods after optimization.
|
|
16:00-17:00, Paper Tu-S4T5.6 |
Sophisticated Swarm Reinforcement Learning by Incorporating Inverse Reinforcement Learning |
|
Kuroe, Yasuaki | Doshisha University |
Takeuchi, Kenya | Kansai University |
Keywords: Machine Learning, Computational Intelligence, Swarm Intelligence
Abstract: In the last decades, the reinforcement learning method has attracted a great deal of attention and many studies have been done. However, the method is basically a trial-and-error scheme and it takes much computational time to acquire optimal strategies. Furthermore, optimal strategies may not be obtained for large and complicated problems with many states. To resolve these problems, we have proposed the swarm reinforcement learning method, which is inspired by multi-point search optimization methods. In this paper, we propose a sophisticated swarm reinforcement learning method incorporating inverse reinforcement learning, which can improve learning speed and obtain better solutions. The proposed method is developed especially for partially observable Markov decision processes (POMDPs). We evaluate the proposed method through experiments and compare the results with those of the conventional swarm reinforcement learning, Q-learning, and Hierarchical Q-learning (HQ-learning). It is confirmed that the proposed method makes it possible to obtain better solutions with less learning time than the existing methods, especially for problems in POMDPs.
|
|
16:00-17:00, Paper Tu-S4T5.7 |
GeAE: GAE-Embedded Autoencoder Based Causal Representation for Robust Domain Adaptation |
|
Zhou, Kuang | Northwestern Polytechnical University |
Jiang, Ming | Northwestern Polytechnical University |
Gabrys, Bogdan | University of Technology Sydney |
Keywords: Transfer Learning, Representation Learning, Machine Learning
Abstract: In this work, we study the unsupervised robust domain adaptation problem where only a single well-labeled source domain is available during the learning process. A new causal representation method based on a Graph autoencoder embedded AutoEncoder, named GeAE, is introduced to learn invariant representations across domains for robust domain adaptation. The proposed method can handle nonlinear causal relations in the data through a causal structure learning process similar to a graph autoencoder. Moreover, the cross-entropy loss as well as the causal structure loss and the reconstruction loss are incorporated in the objective function designed in a united autoencoder to improve the quality of predictions using causal representations. Experimental results on one generated dataset and three real-world datasets demonstrate the effectiveness of GeAE in comparison with the state-of-the-art methods.
|
|
Tu-S4T6 Virtual Session, Room T6 |
Additional CYB I |
|
|
|
16:00-17:00, Paper Tu-S4T6.1 |
DehazeDM: Image Dehazing Via Patch Autoencoder Based on Diffusion Models (I) |
|
Yang, Yuming | Chongqing University |
Zou, Dongsheng | Chongqing University |
Song, Xinyi | Chongqing University |
Zhang, Xiaotong | Chongqing University |
Keywords: Machine Vision, Image Processing and Pattern Recognition, Deep Learning
Abstract: Image dehazing is a crucial computer vision application with the primary objective of estimating haze-free images from hazy images. Deep neural network architectures have emerged as the dominant approaches and achieved remarkable progress. However, owing to the intricacy of the task, existing dehazing methods struggle to train large deep learning networks. This work proposes a novel image dehazing network based on the Diffusion Model (DehazeDM). First, by segmenting the image into patches during the sampling procedure, we can dehaze images of arbitrary size. Then we compress the image into the latent space via the auto-encoder model and conduct the diffusion operation in the latent space, significantly decreasing the computational complexity of the task with negligible effect on the perceptual fidelity of the resulting images. Extensive experiments verify the effectiveness and the superior performance of DehazeDM in image dehazing.
|
|
16:00-17:00, Paper Tu-S4T6.2 |
A Spatial-Temporal Transformer Based on Domain Generalization for Motor Imagery Classification |
|
Liu, Shaozhe | Peking University |
Leike, An | Peking University |
Zhang, Chi | Peking University |
Jia, Ziyu | Institute of Automation, Chinese Academy of Sciences |
Keywords: Brain-Computer Interfaces
Abstract: Motor imagery (MI) has emerged as a classical paradigm in brain-computer interface (BCI) research. In recent years, advancements in deep learning techniques, such as the application of Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs), have enabled the use of MI classification. Despite their success, CNNs and RNNs are not capable of effectively extracting brain spatial and temporal information necessary for MI classification. Additionally, differences in individual subjects further complicate the classification process. To address these limitations, a novel Spatial-Temporal Transformer based on Domain Generalization (ST-DG) has been proposed for MI classification using EEG signals. This framework utilizes a spatial-temporal transformer architecture to capture essential spatiotemporal characteristics of the brain, while also employing Domain Generalization techniques to account for cross-subject variability and improve the model's generalization performance. Experimental results on two public datasets demonstrate the state-of-the-art classification performance.
|
|
16:00-17:00, Paper Tu-S4T6.3 |
PUW-Feat: A Progressive and Unified Method for Weakly Supervised Local Feature Learning |
|
Zhou, Xun | Tongji University |
Yan, Qingqing | Tongji University |
Liu, Chengju | Tongji University |
Chen, Qijun | Tongji University |
Keywords: Neural Networks and their Applications, Machine Vision, AI and Applications
Abstract: Local feature extraction is a fundamental module for downstream tasks in computer vision applications. Weakly supervised learning methods are getting more attention because their datasets are convenient to collect, but they have not yet achieved a proper trade-off among training costs, accuracy and speed. In this paper, we propose PUW-Feat, a progressive and unified weakly supervised learnable local feature extractor. We design a progressive describe-then-detect learning pipeline to save training costs, which partly decouples the training process yet ensures its consistency by sharing the loss function structure. We build a unified keypoint location training framework that predicts keypoint locations with a learnable network branch to avoid slow post-processing, thus increasing feature extraction speed while keeping accuracy. Our method achieves the best balance of training costs, accuracy and real-time performance in experiments on different datasets and tasks.
|
|
16:00-17:00, Paper Tu-S4T6.4 |
Classification of Audience Comprehension During Math Presentations Using EEG Brain Activity |
|
Morita, Takahiro | SoftBank Corporation |
Zhang, Liang | SoftBank Corporation |
Nagate, Atsushi | SoftBank Corp |
Alimardani, Maryam | Tilburg University |
Nishio, Shuichi | Osaka University |
Keywords: Brain-based Information Communications, Human Enhancements, Kansei (sense/emotion) Engineering
Abstract: This study aims to estimate an audience's comprehension of the content in a presentation and investigate the relationship between the difficulty of the presentation topics, the quality of explanations, comprehension, and brain activity patterns. Four types of videos with different presentation characteristics were prepared, and brain activity during video viewing was measured using electroencephalography (EEG) sensors. Subsequently, multiple features were generated from the acquired EEG data, and differential evaluations between two types of features were conducted in all six cases. We generated classifiers by features from 16 participants and evaluated them. In the identification of videos with difficult topics explained poorly and videos with easy topics explained well, a maximum accuracy of 71% (21% above the chance level) was recorded. Furthermore, although the identification accuracy between videos with difficult topics explained well and videos with easy topics explained poorly was approximately 60%, applying network analysis to the generated features improved the accuracy up to a maximum of 70%. These results suggest that the meaning of words we usually use ambiguously (difficult/easy, good/bad explanation, and understood/not understood) can be clarified using brain activities.
|
|
16:00-17:00, Paper Tu-S4T6.5 |
Augmented Transport Data-Free Class Incremental Learning |
|
Feng, Yu | Shanxi Agricultural University |
Zhao, Pengcheng | Shanxi Agricultural University |
Wang, Kun | Shanxi Agricultural University |
Hao, Wangli | Shanxi Agricultural University |
Zhao, Ran | China Agricultural University |
Li, Fuzhong | Shanxi Agricultural University |
Keywords: Deep Learning, Machine Learning, Representation Learning
Abstract: Although deep neural networks exhibit high performance on several specific tasks, they suffer from a severe issue of catastrophic forgetting when learning new tasks incrementally. Several incremental learning methods have been recently proposed to address this problem, some of which rely on storing data or constructing generative models to enhance the performance of incremental models. However, storing old data from previous tasks may be limited by memory or privacy issues, and generative models may often be unstable and inefficient during model training. In this study, we present a non-exemplar-based class-incremental learning method to mitigate catastrophic forgetting in incremental learning when saving old class samples is not possible, while distinguishing between old and new classes. We tackle the problem of representation bias and classifier bias that are inherent in class-incremental learning. Specifically, we introduce class enhancement, which enables the model to encounter more classes during training to acquire transferable and diverse representation features, mitigating representation bias. Furthermore, we use backtracking transfer, which employs semantic mapping to limit inter-class relationships by learning class-level semantic relationships across two tasks, to preserve discriminative old knowledge and prevent catastrophic forgetting due to classifier bias. We perform comprehensive experiments on three benchmark datasets that demonstrate the superiority of our proposed method over state-of-the-art methods.
|
|
16:00-17:00, Paper Tu-S4T6.6 |
Effect of Smile on Facial Landmark Based Face Recognition |
|
Dey, Swapnil | Bangladesh University of Engineering and Technology |
Hossain, Md. Musharaf | Bangladesh University of Engineering and Technology |
Hassan, Asif Mustafa | Bangladesh University of Engineering and Technology |
Rahman, Ashikur | Bangladesh University of Engineering and Technology |
Tarin, Tamima | Bangladesh University of Engineering and Technology |
Keywords: Machine Learning, Application of Artificial Intelligence, AI and Applications
Abstract: A smile reveals distinctive facial features that make a face more recognizable. However, little is known about the impact of smiling on face recognition accuracy, especially in the context of automated face recognition systems using traditional machine learning algorithms. This is an important topic to study because understanding the influence of smiling on face recognition could help improve the performance of various applications that rely on face recognition. In this paper, we propose a handful of novel features to capture the effect of a smile on faces. With the new feature set, we then apply several machine learning algorithms to a data set that includes a comprehensive set of both smiling and neutral facial images of people of different genders, ethnicities and age groups. Our experimental results suggest that smiling faces produce high discriminating power that can be used effectively by machine learning algorithms to improve their accuracy and fairness.
|
|
16:00-17:00, Paper Tu-S4T6.7 |
A Pareto-Optimality-Based Approach for Selecting the Best Machine Learning Models in Mild Cognitive Impairment Prediction (I) |
|
Sorino, Paolo | Politecnico Di Bari |
Paparella, Vincenzo | Dept. of Electrical and Information Engineering (DEI), Politecni |
Lofù, Domenico | Dept. of Electrical and Information Engineering (DEI), Politecni |
Colafiglio, Tommaso | Dept. of Electrical and Information Engineering (DEI), Politecni |
Di Sciascio, Eugenio | Politecnico Di Bari |
Narducci, Fedelucio | Politecnico Di Bari |
Sardone, Rodolfo | Data Sciences and Innovation, N.I. of Gastroenterology ”S. De Be |
Di Noia, Tommaso | Politecnico Di Bari, Bari (Italy) |
Keywords: Medical Informatics
Abstract: Mild Cognitive Impairment (MCI) is a syndrome characterized by cognitive impairment that is greater than expected for a subject’s age and level of education. Nevertheless, it does not interfere with daily activity. Prevalence in epidemiological and population-based studies ranges from 3% to 19% in adults older than 65 years. A very interesting approach in this area is the identification of an Artificial Intelligence (AI)-based model and a subset of relevant features to predict the MCI clinical outcome. In our study, we propose a Pareto-optimality-based approach to identify the best model for predicting MCI. The best model achieves an Accuracy and a Recall on Yes MCI of 71% and 80%, respectively. With this approach, it is possible to select the best model for predicting Yes MCI (the highest risk class). Our study presents a new model selection approach that can be applied to identify the best model in various disease classification problems.
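As a rough illustration of what Pareto-optimality-based model selection can look like, the sketch below keeps only the candidate models that no other candidate beats on both accuracy and recall. It is an assumption-laden example, not the authors' procedure: the function name `pareto_front`, the model names, and all scores are hypothetical and invented for illustration.

```python
from typing import Dict, List, Tuple


def pareto_front(scores: Dict[str, Tuple[float, float]]) -> List[str]:
    """Return the models that are not dominated on (accuracy, recall).

    A model is dominated if another model is at least as good on both
    metrics and strictly better on at least one. Higher is better.
    """
    front = []
    for name, (acc, rec) in scores.items():
        dominated = any(
            (a >= acc and r >= rec) and (a > acc or r > rec)
            for other, (a, r) in scores.items()
            if other != name
        )
        if not dominated:
            front.append(name)
    return front


# Hypothetical (accuracy, recall) pairs, invented for this example only.
candidates = {
    "model_a": (0.70, 0.82),
    "model_b": (0.75, 0.66),
    "model_c": (0.68, 0.79),
    "model_d": (0.69, 0.60),
}

front = pareto_front(candidates)
print(front)
```

The front only narrows the field; choosing one model from it still needs a preference, for example favoring recall on the highest-risk class.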
|
|
Tu-S4T7 Virtual Session, Room T7 |
Additional CYB II |
|
|
|
16:00-17:00, Paper Tu-S4T7.1 |
A Multipopulation Ant Colony System Algorithm for Multiobjective Trip Planning |
|
Sun, Meng-Meng | South China Normal University |
Chen, Zong-Gan | South China Normal University |
Jiang, Yuncheng | South China Normal University |
Zhan, Zhi-Hui | South China University of Technology |
Zhang, Jun | Hanyang University |
Keywords: Evolutionary Computation, Swarm Intelligence
Abstract: Trip planning services can save tourists the time and energy of preparing a trip and provide a more comfortable and satisfying travel experience. This paper particularly considers the planning of transportation modes between points of interest (POIs) and formulates a multiobjective trip planning model to simultaneously maximize the visit time in POIs, minimize the travel time between POIs, and minimize the travel fare needed for the trip. To simulate the real-world environment, the formulated model incorporates real-world POI and transportation data crawled from Tripadvisor and the Baidu Map API, respectively. To obtain efficient trip planning schemes, a multipopulation ant colony system algorithm for trip planning, abbreviated as MACS-TP, is proposed. First, MACS-TP uses two colonies to optimize the time-related objective and the fare-related objective respectively, which enhances the search efficiency. Second, an archive is employed to store the nondominated solutions found by both colonies, and a new pheromone global update rule is designed based on the archive to help colonies optimize their corresponding objectives sufficiently. Third, an elite learning strategy is proposed to further enhance the quality of solutions in the archive. Experimental results on a real-world dataset of Guangzhou, China illustrate the effectiveness of MACS-TP.
|
|
16:00-17:00, Paper Tu-S4T7.2 |
EPO-S: A Constrained RL Method to Enhance UAV Safety with Spatial Representation |
|
Zhang, Qin | Tsinghua University |
Zhang, Linrui | Tsinghua University |
Yang, ZaiHui | Tsinghua Shenzhen International Graduate School |
Wang, Haoyu | Tsinghua |
Wang, Xueqian | Tsinghua University |
Chang, Yongzhe | Tsinghua University |
Keywords: Robotic Systems, Control of Uncertain Systems, Adaptive Systems
Abstract: Path planning and collision avoidance are critical components of UAV control algorithms that play a crucial role in executing UAV missions. As scenarios become increasingly complex, traditional control methods can no longer meet the requirements. Reinforcement learning is an emerging decision-making control algorithm that attempts to address these issues as an alternative to traditional methods and has made significant advances. Unfortunately, standard RL approaches only aim to maximize rewards; balancing task performance and safety in completing UAV tasks poses a challenge, since these two objectives sometimes conflict, leading to a trade-off that is often difficult to manage. This paper proposes three techniques to address this problem. First, we model the path planning and collision avoidance issue in a constrained RL framework, eliminating the need for complex reward engineering. Second, we extend our previous work to the UAV setting and introduce an exact penalty optimization (EPO) algorithm to provide stricter constraint guarantees. Third, we propose a novel spatial information representation method for the UAV scenario to help UAVs better understand environmental information. The experimental results demonstrate the effectiveness of the EPO and spatial representation modules proposed in this paper, through a significant reduction in collisions as well as a strong improvement in the rate of reaching the destination.
|
|
16:00-17:00, Paper Tu-S4T7.3 |
Automatic Market Making System with Offline Reinforcement Learning |
|
Guo, Hong | Tsinghua University |
Zhao, Yue | Tsinghua University |
Lin, Jianwu | Tsinghua University |
Keywords: Agent-Based Modeling, Application of Artificial Intelligence, Hybrid Models of Neural Networks, Fuzzy Systems, and Evolutionary Computing
Abstract: Market making is an important research topic in quantitative finance. Market makers need to continuously optimize their ask and bid prices to provide liquidity and make profits, which can be viewed as a continuous control problem. Reinforcement learning is a common method for solving sequential decision-making problems, in which an agent learns from reward signals through interactions with the environment to maximize the cumulative return. However, traditional online reinforcement learning methods can be inefficient in practice as they require the agent to interact with the environment to collect training data, which could be unstable. Additionally, exploration in financial trading can be very expensive. To address these issues, we apply offline reinforcement learning methods which use historical experience to train agents. In this paper, we present ORL4MM (Offline Reinforcement Learning for Market Making), a novel market making agent using offline training and online fine-tuning to mitigate potential losses and instabilities. We demonstrate the effectiveness of our method through experiments, where our agent outperforms all baseline models, including traditional models and online RL agents. To the best of our knowledge, we are the first to explore the application of offline reinforcement learning in market-making tasks, and we provide valuable practical experience for the deployment of reinforcement learning in financial scenarios.
|
|
16:00-17:00, Paper Tu-S4T7.4 |
Combining Mental States Recognition and Machine Learning for Neurorehabilitation (I) |
|
Colafiglio, Tommaso | Dept. of Electrical and Information Engineering (DEI), Politecni |
Sorino, Paolo | Politecnico Di Bari |
Lofù, Domenico | Dept. of Electrical and Information Engineering (DEI), Politecni |
Lombardi, Angela | Dept. of Electrical and Information Engineering (DEI), Politecni |
Narducci, Fedelucio | Politecnico Di Bari |
Di Noia, Tommaso | Politecnico Di Bari, Bari (Italy) |
Keywords: Human-Collaborative Robotics, Human-Computer Interaction
Abstract: Brain-computer interfaces (BCIs) are widely used to control machines using Electroencephalography (EEG) signals. Several low-cost electroencephalographs that achieve good-quality EEG signals are available on the market. One of the most intriguing issues for developing biofeedback systems is classifying users’ emotional states using EEG signals and Machine Learning (ML) methods. In our study, we propose a novel ML-based biofeedback tool using a BCI to detect two different user mental states: Focus and Relaxation. We compared several ML algorithms, achieving an average accuracy on the Test Set of 0.90 by using SVM. Finally, we propose a prototype for music generation according to the classification output that could be adopted in neurorehabilitation scenarios.
|
|
16:00-17:00, Paper Tu-S4T7.5 |
LoST: A Mental Health Dataset of Low Self-Esteem in Reddit Posts |
|
Garg, Muskan | Mayo Clinic |
Gaur, Manas | University of Maryland, Baltimore County |
Goswami, Raxit | HealthcareNLP LLC |
Sohn, Sunghwan | Mayo Clinic |
Keywords: Computational Intelligence, Machine Learning, Deep Learning
Abstract: Low self-esteem and interpersonal needs (i.e., thwarted belongingness (TB) and perceived burdensomeness (PB)) have a major impact on depression and suicide attempts. Individuals seek social connectedness on social media to boost and alleviate their loneliness. Social media platforms allow people to express their thoughts, experiences, beliefs, and emotions. Prior studies on mental health from social media have focused on symptoms, causes, and disorders. However, an initial screening of social media content for interpersonal risk factors and low self-esteem may raise early alerts and assign therapists to users at risk of mental disturbance. Standardized scales measure self-esteem and interpersonal needs from questions created using psychological theories. In the current research, we introduce a psychology-grounded and expertly annotated dataset, LoST: Low Self esTeem, to study and detect low self-esteem on Reddit. Through an annotation approach involving checks on coherence, correctness, consistency, and reliability, we ensure a gold standard for supervised learning. We present results from different deep language models tested using two data augmentation techniques. Our findings suggest developing a class of language models that infuses psychological and clinical knowledge.
|
|
16:00-17:00, Paper Tu-S4T7.6 |
Multi-Objective ADAM Optimizer (MAdam) |
|
Nikbakhtsarvestani, Farzaneh | Ontario Tech University |
Ebrahimi, Mehran | Ontario Tech University |
Rahnamayan, Shahryar | Brock University |
Keywords: Hybrid Models of Neural Networks, Fuzzy Systems, and Evolutionary Computing, Metaheuristic Algorithms, Optimization and Self-Organization Approaches
Abstract: Multi-objective optimization is a prevalent challenge in the area of deep learning. There is a lack of robust multi-objective optimization methods applicable in deep learning that are capable of training networks by simultaneously optimizing multiple conflicting loss functions. Its applications include a wide range of deep neural network branches such as multi-loss, multi-task, multi-modal, and cross-modal learning. In this paper, we develop MAdam as a multi-objective extension of the well-known Adam optimization algorithm. MAdam is a classical population-based approach that uses the gradient information of multiple objectives to accelerate population convergence toward an optimal minimum. The method applies a non-dominated sorting algorithm to keep selected population members and improve the diversity across the landscape. The performance of MAdam is evaluated on the standard ZDT test functions as a proof of concept. Promising results show the capability of this approach to converge towards an estimated Pareto front and to generate a well-distributed set of non-dominated solutions.
|
|
16:00-17:00, Paper Tu-S4T7.7 |
Compact NSGA-II for Multi-Objective Feature Selection (I) |
|
Zanjani Miyandoab, Sevil | Ontario Tech University |
Rahnamayan, Shahryar | Brock University |
Asilian Bidgoli, Azam | Wilfrid Laurier University |
Keywords: Evolutionary Computation, Optimization and Self-Organization Approaches, Machine Learning
Abstract: Feature selection is an expensive challenging task in machine learning and data mining aimed at removing irrelevant and redundant features. This contributes to an improvement in classification accuracy, as well as the budget and memory requirements for classification, or any other post-processing task conducted after feature selection. In this regard, we define feature selection as a multi-objective binary optimization task with the objectives of maximizing classification accuracy and minimizing the number of selected features. In order to select optimal features, we have proposed a binary Compact NSGA-II (CNSGA-II) algorithm. Compactness represents the population as a probability distribution to enhance evolutionary algorithms not only to be more memory-efficient but also to reduce the number of fitness evaluations. Instead of holding two populations during the optimization process, our proposed method uses several Probability Vectors (PVs) to generate new individuals. Each PV efficiently explores a region of the search space to find non-dominated solutions instead of generating candidate solutions from a small population as is the common approach in most evolutionary algorithms. To the best of our knowledge, this is the first compact multi-objective algorithm proposed for feature selection. The reported results for expensive optimization cases with a limited budget on five datasets show that the CNSGA-II performs more efficiently than the well-known NSGA-II method in terms of the hypervolume (HV) performance metric requiring less memory. The proposed method and experimental results are explained and analyzed in detail.
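The idea of replacing a population with a probability vector can be sketched with the classic single-objective compact genetic algorithm. This is not the CNSGA-II described above, which uses several PVs and two objectives; it is a minimal sketch under stated assumptions, and the toy fitness function, the `RELEVANT` feature indices, and all parameter values are hypothetical.

```python
import random
from typing import Callable, List

# Hypothetical indices of the "useful" features in a toy 8-feature problem.
RELEVANT = {0, 2, 5}


def toy_fitness(mask: List[int]) -> int:
    """Reward selected relevant features, penalize every selected feature."""
    hits = sum(mask[i] for i in RELEVANT)
    return 2 * hits - sum(mask)


def sample(pv: List[float], rng: random.Random) -> List[int]:
    """Draw one binary feature mask; bit i is 1 with probability pv[i]."""
    return [1 if rng.random() < p else 0 for p in pv]


def compact_ga(
    fitness: Callable[[List[int]], int],
    n_bits: int,
    virtual_pop: int = 50,
    iterations: int = 2000,
    seed: int = 0,
) -> List[float]:
    """Evolve a probability vector instead of storing a population."""
    rng = random.Random(seed)
    pv = [0.5] * n_bits          # start with no preference for any bit
    step = 1.0 / virtual_pop     # update size mimics a population of this size
    for _ in range(iterations):
        a, b = sample(pv, rng), sample(pv, rng)
        winner, loser = (a, b) if fitness(a) >= fitness(b) else (b, a)
        for i in range(n_bits):
            if winner[i] != loser[i]:
                # Shift the probability toward the winner's value for this bit.
                pv[i] += step if winner[i] == 1 else -step
                pv[i] = min(1.0, max(0.0, pv[i]))
    return pv


pv = compact_ga(toy_fitness, n_bits=8)
print([round(p, 2) for p in pv])
```

Only the vector of `n_bits` probabilities is stored between iterations, which is the source of the memory saving; each iteration costs two fitness evaluations.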
|
|
Tu-S4T8 Virtual Session, Room T8 |
Additional SSE I |
|
|
|
16:00-17:00, Paper Tu-S4T8.1 |
Analysis of Self-Organized Criticality in Complex Manufacturing Systems (I) |
|
Zhang, Yulin | Tongji University |
Yu, Qingyun | Tongji University |
Ji, Pengcheng | Tongji University |
Yu, TingYi | Tongji University |
Li, Li | Tongji University |
Keywords: Manufacturing Automation and Systems, System Modeling and Control
Abstract: The theory of criticality has been applied exploratively to the industry in recent years since complex systems in critical states often have a high degree of uncertainty. As a first step towards examining the evolutionary process in complex manufacturing systems, this paper verifies that the self-organized criticality theory also exists in complex manufacturing systems. Based on six real production lines in Shanghai, a semiconductor manufacturing system model is constructed, which serves as the verification object. Several preprocessing techniques are used to prepare the simulation results, including principal component analysis and Pearson correlation coefficients method. An analysis of the manufacturing system based on mathematical derivation and statistical analysis confirms the phenomenon of self-organized criticality. In addition, the results of the experiments demonstrate that the theory of self-organized criticality is applicable to complex manufacturing systems as well.
|
|
16:00-17:00, Paper Tu-S4T8.2 |
Population-Based Multi-Agent Evaluation for Large-Scale Voltage Control |
|
Jin, Cheng | Zhejiang University |
Zhang, Senlin | Zhejiang University |
Liu, Meiqin | Xi'an Jiaotong University |
Zheng, Ronghao | Zhejiang University |
Dong, Shanling | Zhejiang University |
Keywords: Systems Safety and Security, Multi-User Interaction, Shared Control
Abstract: To achieve the optimal voltage control strategy in power grid systems, multi-agent evaluation algorithms such as α-rank are widely used. However, in large-scale systems with massive numbers of agents and strategies, these methods are not feasible in terms of time. Therefore, a two-stage population-based multi-agent evaluation algorithm is proposed to solve the voltage control problem in large-scale power grid systems. In stage one, a population is first established for each agent. Then, individuals in the populations are randomly combined to form joint strategies. Based on the max and mean reward from the interaction between joint strategies and the environment, populations evolve to a near-optimal joint strategy. Stage two takes the above near-optimal joint strategy as the starting point and uses a strategy search algorithm with maximum transfer possibility to find the Markov-Conley chain in the system. Finally, the two-stage method is simulated in 10- and 32-agent power grid systems to verify its effectiveness.
|
|
16:00-17:00, Paper Tu-S4T8.3 |
SWOT Analysis of Extended Reality in Architecture Engineering and Construction Organizations |
|
Aziz, Ferzon | George Brown College |
Morris, Alexis | OCAD University |
Keywords: Virtual/Augmented/Mixed Reality, Virtual and Augmented Reality Systems, Information Visualization
Abstract: This paper proposes SWOT (strength, weakness, opportunity, and threat) based criteria to evaluate the usefulness of Extended Reality (XR) in Architecture, Engineering, and Construction (AEC) organizations, using XR literature in the AEC industry for the period 2018 to 2022, inclusive. A SWOT matrix is developed using thematic analysis to highlight the strengths, weaknesses, opportunities, and threats associated with XR in AEC organizations. A total of 105 articles were identified from the literature and analyzed using a structured process in a spreadsheet matrix. A total of 22 criteria were developed and described: 5 for strengths, 6 for weaknesses, 6 for opportunities, and 5 for threats. The developed criteria are proposed to aid AEC organizations in achieving their strategic goals for the implementation of XR technologies. This paper identifies criteria that may be useful to XR researchers and industry practitioners.
|
|
16:00-17:00, Paper Tu-S4T8.4 |
Integrated Tractor and Trailer Scheduling for Airport Baggage Transport Service |
|
Wang, Xinyue | The Hong Kong Polytechnic University |
Long, Yuying | The Hong Kong Polytechnic University |
Xu, Gangyan | The Hong Kong Polytechnic University |
Liu, Yuxuan | Harbin Institute of Technology, Shenzhen |
Keywords: Intelligent Transportation Systems, Decision Support Systems, Cooperative Systems and Control
Abstract: Airport baggage transport is a key aspect of airport ground handling which involves transporting passenger baggage between the baggage handling center and aircraft stands. Typically, tractors and trailers are used to implement the baggage transport service under the drop-and-pull mode in practice. Efficient airport baggage transport service is essential to reduce flight delays, save airport ground handling costs, and guarantee aviation safety. However, the existing methods for scheduling tractors and trailers perform poorly when dealing with high flight volumes at busy hub airports, leading to delayed responses to baggage transport demands. Besides, few works have explored the significance of the drop-and-pull mode in improving airport baggage transport service quality. Thus, to enhance the efficiency of airport baggage transport service and reduce airport ground handling costs, this paper presents an integrated tractor and trailer scheduling problem under the drop-and-pull mode. Then, an improved Genetic Algorithm (GA) is developed to solve this problem. Finally, simulation experiments are conducted to verify the effectiveness of the model and algorithm.
|
|
16:00-17:00, Paper Tu-S4T8.5 |
Properly Scoring Users’ Mainstreamness to Evaluate Recommendation Bias |
|
Hara, Yamato | University of Tsukuba |
Sato, Masahiro | Independent Researcher |
Nobuhara, Hajime | University of Tsukuba |
Keywords: Ethics of AI and Pervasive Systems, Information Systems for Design and Marketing, Information Systems for Design
Abstract: In recommender systems, a mainstream bias (MS bias) exists: the recommendation quality for users who prefer mainstream items is higher than that for users who prefer niche items. The MS bias degrades fairness in the recommender system and might decrease user satisfaction. To address the MS bias, we first need to quantify the degree of bias. However, the mainstreamness of users (the MS score) is difficult to measure properly, and we reveal that previous MS scores are greatly confounded by the number of user interactions that are irrelevant to mainstreamness. To address this issue, we propose a novel MS score that is unaffected by the number of interactions. Specifically, we introduce a new similarity metric that differs from the conventional Jaccard similarity measure to eliminate the effect of interaction numbers. We confirm the validity of our proposed MS score through a simulation study that discriminates mainstream users from niche users under varying numbers of user interactions. The proposed unconfounded MS score enables the proper evaluation of MS biases and the selection of a recommendation model with less MS bias. We evaluate the MS biases of two popular recommendation models using seven real-world datasets. We demonstrate that MS biases exist in all datasets and that the better model differs depending on the dataset. Furthermore, we confirm that the previous MS scores fail to select recommendation models with small biases. The results show the effectiveness of our new MS score in improving recommendations for fairness among users.
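To make the confounding concern concrete, the sketch below shows one way a Jaccard-style comparison can depend on how many items a user has interacted with. It is only an illustration under assumptions: the item IDs, the "popular" set, and the two users are hypothetical, and neither the previous MS scores nor the proposed metric from the paper is reproduced here.

```python
from typing import Set


def jaccard(a: Set[int], b: Set[int]) -> float:
    """Conventional Jaccard similarity: |A ∩ B| / |A ∪ B|."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical set of the ten most popular item IDs.
popular = set(range(1, 11))

# Two hypothetical users who interact ONLY with popular items,
# but with different numbers of interactions.
light_user = {1, 2}
heavy_user = {1, 2, 3, 4, 5, 6, 7, 8}

light_score = jaccard(light_user, popular)
heavy_score = jaccard(heavy_user, popular)
print(light_score, heavy_score)
```

Both users have purely mainstream tastes in this toy setup, yet the user with fewer interactions receives a much lower similarity, which is the kind of dependence on interaction count that an unconfounded score would need to remove.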
|
|
16:00-17:00, Paper Tu-S4T8.6 | Add to My Program |
MAPPO-Based Optimal Reciprocal Collision Avoidance for Autonomous Mobile Robots in Crowds |
|
Liu, Zhihao | Tongji University |
Yao, Chenpeng | Tongji University |
Na, Wenjie | Tongji University |
Liu, Chengju | Tongji University |
Chen, Qijun | Tongji University |
Keywords: Neural Networks and their Applications, Deep Learning, Machine Learning
Abstract: This paper proposes an improved Optimal Reciprocal Collision Avoidance (ORCA) algorithm for robot navigation in crowds using a deep reinforcement learning algorithm, Multi-agent Proximal Policy Optimization (MAPPO). The original ORCA algorithm allows the robot to compute escape velocities to avoid collisions with humans and then obtains a set of collision-free velocities based on the reciprocal assumption. However, humans usually do not follow this assumption, and thus ORCA performs badly. In this paper, we propose a MAPPO-based method to explore the optimal escape velocity of the robot with respect to each human and use an estimation module to decide the effect of each escape velocity on the robot's new velocity. The proposed method is evaluated in simulated crowds of humans, and the original ORCA algorithm with different configurations is used as the baseline. Simulation results show that the proposed method achieves a significantly higher navigation success rate with almost the same time consumption compared with the original ORCA algorithm in robot navigation among crowds.
|
|
16:00-17:00, Paper Tu-S4T8.7 | Add to My Program |
A Double-Norm Aggregated Latent Factorization of Tensors Model for Temporal-Aware QoS Prediction |
|
Wang, Juan | China West Normal University |
Wu, Hao | Chongqing Institute of Green and Intelligent Technology, Chinese |
He, Chunlin | China West Normal University |
Keywords: Big Data Computing, Knowledge Acquisition
Abstract: Time-varying Quality-of-Service (QoS) data describe the non-functional characteristics of Web services and play a key role in service selection. However, QoS data are often high-dimensional and incomplete (HDI) because users cannot request all services. A Latent Factorization of Tensors (LFT)-based QoS predictor proves to be efficient in predicting time-varying QoS data. However, current LFT models mostly use an L2-norm-oriented loss. The L2 norm is sensitive to outliers, so model robustness cannot be guaranteed. Moreover, although the L1 norm is intrinsically robust, it is less sensitive to error. To address these problems, this paper proposes a Double-norm Aggregated Latent factorization of tensors (DAL) model. Its main idea is to aggregate the L2 norm and the smooth L1 norm to form its loss, giving it both high accuracy and strong robustness in predicting unobserved time-varying QoS data. Empirical studies on two time-varying QoS datasets show that the proposed model has higher prediction accuracy and a better convergence rate than state-of-the-art models.
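One way to picture a loss that aggregates an L2 term with a smooth L1 term is sketched below. The convex-combination weight `alpha`, the threshold `delta`, and the function names are assumptions introduced for illustration; the abstract does not state how the DAL model actually combines the two norms.

```python
def smooth_l1(err, delta=1.0):
    """Smooth L1 penalty: quadratic for |err| < delta, linear beyond it."""
    a = abs(err)
    return 0.5 * a * a / delta if a < delta else a - 0.5 * delta

def double_norm_loss(errors, alpha=0.5, delta=1.0):
    """Hypothetical aggregation: alpha * squared error + (1 - alpha) * smooth L1."""
    l2_term = sum(e * e for e in errors)
    sl1_term = sum(smooth_l1(e, delta) for e in errors)
    return alpha * l2_term + (1.0 - alpha) * sl1_term

# Prediction errors on observed entries; 3.0 plays the role of an outlier.
loss = double_norm_loss([0.5, 3.0], alpha=0.5)
```

The squared term reacts strongly to every error, while the smooth L1 term grows only linearly for large residuals, so the outlier contributes less than it would under a pure L2 loss.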
|
|
Tu-S4T9 Virtual Session, Room T9 |
Add to My Program |
New Session for Latest Online Requests IV |
|
|
|
16:00-17:00, Paper Tu-S4T9.1 | Add to My Program |
Deep Reinforcement Learning Based Control of Rotation Floating Space Robots for Proximity Operations in PyBullet |
|
Srivastava, Raunak | Tata Consultancy Services |
Lima, Rolif | Tata Consultancy Services |
Sah, Roshan | Tata Consultancy Services (TCS) - Research |
Das, Kaushik | TCS Research |
Keywords: Autonomous Vehicle, Robotic Systems, Control of Uncertain Systems
Abstract: This paper presents a model-free learning-based controller for whole-body control of an orbiting space robot during proximity operations. Proximity control methods for space robots (robotic manipulators mounted on a floating satellite base) enable the robotic arm to reach the target body in order to perform autonomous tasks such as in-orbit servicing and debris capture. However, coupled motion control of such robots is difficult due to the floating nature of the satellite base. Although conventional controllers have been employed for coupled control of such nonlinear systems, their modeling and control become increasingly difficult as the number of degrees of freedom of the robot grows. Model-free Deep Reinforcement Learning (RL) has been successful in learning complex policies in the field of robotic manipulation. However, most of the research in this domain has focused only on the control of the space robotic arm, and not the satellite base, with the majority addressing only arm position control. A coupled controller that simultaneously controls the satellite base orientation is essential for the proper functioning of onboard sensors and equipment that have pointing requirements. This paper uses the Proximal Policy Optimization (PPO) algorithm to control the position (3 DOF) and orientation (3 DOF) of the end-effector while also controlling the orientation of the base satellite (3 DOF). To the best knowledge of the authors, a model-free Deep RL method has not yet been used for simultaneous 9 DOF control of a floating space robot. We also improve upon the standard reward functions used in Deep RL algorithms to obtain better learning performance. The policy is trained using the PyBullet physics simulator, and a comparison of the learning performance against the standard reward functions is presented.
|
|
16:00-17:00, Paper Tu-S4T9.2 | Add to My Program |
Cybernetic Telepresence Humanoid Surgeon Avatar Robotic Astronaut (I) |
|
Jewell, Susan | Avatarmedic Inc |
Jewell, Emmy | MMAARS |
Keywords: AI and Applications, Cloud, IoT, and Robotics Integration, Expert and Knowledge-Based Systems
Abstract: This paper discusses current research and projects focusing on the potential for creating a humanoid Cybernetic Surgeon Avatar Robotic Astronaut for future space exploration. To date, technological innovations in telepresence avatars and telerobotic research have enabled the vision of remote, real-time Cybernetic Telepresence Humanoid Avatar (CTHA) technology, which could become the future “Space Robotic Astronauts” for planetary missions and for missions too dangerous for human astronauts. A CTHA is a physical robot that can be substituted for the physical presence of a person. These “humanoid” avatars are integrated with frontier technologies such as Augmented Reality (AR) and Extended Reality (XR), for example spatial computing headset devices such as HoloLens 2, and are equipped with sensors, haptics, and cameras that allow them to perceive their surroundings, move, and interact with the environment in a way similar to a human being. The key difference is that the person controlling the CTHA is typically located remotely, for example in a remote office or at a different geographical site. The technology behind CTHA is made potentially possible by the convergence of several frontier technologies and advances in robotic development. The avatar is typically controlled by a human operator who can see what the CTHA sees through cameras and other sensors and can control its movements and interactions via remote controls or joysticks. Advances in facial recognition and deep Artificial Intelligence (AI) can allow the operator to control the facial expressions of the avatar, allowing it to emote human-like expressions and convey a sense of authentic interaction with the person.
This review explores the concept of CTHA and provides examples of how this technology could be used in different contexts, such as space exploration, space medicine, space psychiatry, and applications for social impact and sustainability.
|
|
16:00-17:00, Paper Tu-S4T9.3 | Add to My Program |
Dynamic Adaptive Individual Weighting Model for Opinion Diffusion in Social Networks |
|
Lv, Yifan | Wuhan Textile University |
Xu, Han | Huazhong University of Science and Technology |
Keywords: Agent-Based Modeling, Complex Network, Cybernetics for Informatics
Abstract: Opinion dynamics, which concerns how opinions evolve and spread in social networks, has been widely studied in recent years, and many classical models have been proposed to describe the opinion diffusion process. However, most existing models based on leader-follower relationships ignore the influence of normal individuals and do not consider the feedback effect of opinion difference on individuals, both of which are important for opinion spread. In this paper, inspired by two well-known social theories, Emotional Mobilization and Social Judgement Theory, we first propose a method to identify the individual influence factor based on both network topology and personal behavior attributes. Then, we propose a novel opinion evolution model named the Dynamic Adaptive Individual Weighting model, which focuses on individual heterogeneity and considers the opinion difference assimilation effect. In this model, the influence weight of an agent's neighbour on the agent can be dynamically affected by their opinion difference and adaptively adjusted based on the neighbour's relative influence factor. Moreover, environmental noise is introduced to simulate the uncertainty of realistic situations. Experimental results on 12 real and 2 generated network datasets show that our proposed model can precisely reflect the evolution process and trend of opinions over different social networks. The study can enable decision-makers to better understand the fundamental processes of opinion diffusion and design more efficient strategies for political or business activities.
|
|
16:00-17:00, Paper Tu-S4T9.4 | Add to My Program |
Experimental Evaluation of Model Predictive Mixed-Initiative Variable Autonomy Systems Applied to Human-Robot Teams (I) |
|
Ramesh, Aniketh | Extreme Robotics Lab, University of Birmingham |
Braun, Christian | Karlsruhe Institute of Technology |
Ruan, Tianshu | University of Birmingham |
Rothfuss, Simon | Karlsruhe Institute of Technology (KIT) |
Hohmann, Sören | KIT |
Stolkin, Rustam | Extreme Robotics Lab, NCNR, University of Birmingham |
Chiou, Manolis | Extreme Robotics Lab, NCNR, University of Birmingham |
Keywords: Shared Control, Human Factors, Human-Machine Interaction
Abstract: Adjusting the level of autonomy (LoA) in human-machine systems (e.g., human-robot systems) holds great potential for achieving high system performance while maintaining operator involvement. To support operators with the task of setting the proper LoA, we present a novel approach to realise a Model Predictive Controller that determines the optimal LoA for each tessellation in the robot's path plan based on the estimated performance degradation due to environmental adversities. We also report on an experimental evaluation of a mixed-initiative system where both the operator and the Model Predictive Controller are in charge of dynamically and cooperatively adjusting the LoA while performing a challenging navigational task with a mobile ground robot in a high-fidelity simulation. To this end, we conducted a user study with 15 participants comparing the performance and user experience of the model predictive system with a state-of-the-art system. The results show significant benefits of the model predictive system in terms of a reduction of conflicts for control and an improved user experience. Additionally, there are indications of benefits in terms of robot health and, consequently, performance for the model predictive system.
|
|
16:00-17:00, Paper Tu-S4T9.5 | Add to My Program |
Enhancing Hand and Object Detection for Monitoring Patients with Upper-Limb Impairment: A Study on the Impact of Input Size in Foundation Models |
|
Izadmehr, Yasaman | Swiss Federal Institute of Technology Lausanne (EPFL) and Univer |
Aminian, Kamiar | Swiss Federal Institute of Technology Lausanne (EPFL) |
Perez-Uribe, Andres | HEIG VD |
Keywords: Assistive Technology, Human-Computer Interaction, Wearable Computing
Abstract: The choice of input image size can have a significant impact on the performance of state-of-the-art algorithms. We can always customize the algorithms by training and fine-tuning them on our datasets, but this is time consuming. Nowadays there is a trend to use foundation models, but in our application of monitoring patients, hand detection, object detection, and hand-object interaction detection all resulted in mediocre performance. This study aimed to investigate the significance of input size for detecting hand-object interaction in two datasets: the patient dataset (captured in super view mode) and the EpicKitchen dataset (captured in normal view mode). The results showed that using different input sizes with the same foundation model can lead to a significant improvement in performance. In the patient dataset, using frames with input sizes of 300×300 pixels (px) and 256×256 px after cropping and resizing the original images led to more successful hand detection results. Furthermore, using video processing tools like FFmpeg for resizing frames instead of passing the original images to the MediaPipe model for resizing resulted in a 33% improvement. In the EpicKitchen dataset with normal view mode, successful hand detection results were obtained by resizing frames into a rectangle of 256 px and 300 px after padding and cropping the original images. Overall, the study emphasizes the significance of input size for detecting hand-object interaction for the purpose of monitoring patients with upper-limb impairment. The combination analysis within each dataset showed that the most effective combination for hand-object interaction detection is achieved by applying the MediaPipe model to an input image size of 300×300 px (for super view mode) or 256×256 px (for normal view mode) along with the result of the YOLOv7 model with an input image size of 1920×1920 px. By using this combination, a 100% success rate was achieved for both datasets.
|
|
16:00-17:00, Paper Tu-S4T9.6 | Add to My Program |
Towards an Edge Intelligence-Based Traffic Monitoring System (I) |
|
Barbuto, Vincenzo | Università Della Calabria |
Savaglio, Claudio | Italian National Research Council (CNR)-ICAR, University of Cala |
Minerva, Roberto | Institut Polytechnique De Paris |
Crespi, Noel | Institut Polytechnique De Paris |
Fortino, Giancarlo | University of Calabria |
Keywords: Smart Sensor Networks, Distributed Intelligent Systems, Digital Twin
Abstract: Cities have undergone significant changes due to the rapid increase in urban population, heightened demand for resources, and growing concerns over climate change. To address these challenges, digital transformation has become a necessity. Recent advancements in Artificial Intelligence (AI) and sensing techniques, such as synthetic sensing, can elevate Digital Twins (DTs) from digital copies of physical objects to effective and efficient platforms for data collection and in-situ processing. In such a scenario, this paper presents a comprehensive approach for developing a Traffic Monitoring System (TMS) based on Edge Intelligence (EI), specifically designed for smart cities. Our approach prioritizes the placement of intelligence as close as possible to data sources and leverages an “opportunistic” interpretation of DT (ODT), resulting in a novel and interdisciplinary strategy for re-engineering large-scale distributed smart systems. The preliminary results of the proposed system show that moving computation to the edge of the network provides several benefits, including (i) enhanced inference performance, (ii) reduced bandwidth and power consumption, and (iii) decreased latencies with respect to the classic cloud-centric approach.
|
|
16:00-17:00, Paper Tu-S4T9.7 | Add to My Program |
TRecX: Text-Based Recommender with EXplanations |
|
Pérez-Núñez, Pablo | Artificial Intelligence Center, University of Oviedo |
Díez, Jorge | Artificial Intelligence Center, University of Oviedo |
Bahamonde, Antonio | Artificial Intelligence Center, University of Oviedo |
Luaces, Oscar | Artificial Intelligence Center, University of Oviedo |
Keywords: Application of Artificial Intelligence, Machine Learning, Neural Networks and their Applications
Abstract: Recommender systems have proven their usefulness both for companies and customers. The former increase their sales and the latter get a more satisfying shopping experience. These systems can benefit from the advent of explainable artificial intelligence since a well-explained recommendation will be more convincing and may broaden the customer's purchasing options. Many approaches offer justifications for their recommendations based on the similarity (in some sense) between users, past purchases, etc., which requires some knowledge of the users. In this paper, we present a recommender system with explanatory capabilities which can deal with the so-called cold-start problem since it does not require any previous knowledge of the user. Our method learns the relationship between the products and some relevant words appearing in the textual reviews from previous customers for those products. Then, starting from the textual query of a user’s request for a recommendation, our approach elaborates a list of products and explains each recommendation based on the compatibility between the query's words and the relevant terms for each product.
|
|
Tu-S5T1 Virtual Session, Room T1 |
Add to My Program |
Integration of Engineering Disciplines and Resource Management |
|
|
|
17:00-18:00, Paper Tu-S5T1.1 | Add to My Program |
Relevance of ISO/SAE 21434 in Vehicular Architecture Development |
|
V, Kethareswaran | Indian Institute of Information Technology Guwahati |
Moulik, Sanjay | IIIT Guwahati |
Keywords: Autonomous Vehicle, System Architecture, Cyber-physical systems
Abstract: Research in terms of security in vehicular architecture has been a focal point of deliberation in the automotive sector. The Automotive industry development has continuously evolved the scenario of adding more integration to networked vehicles or semi-autonomous vehicles. The end product of such manufacturing practices is a vehicle becoming a networked computer on wheels. In spite of providing numerous advantages, these developments in the automotive industry are also bringing more challenges in terms of security. Hence, the industry has been forced to emphasise the principle of Security by Design. Efforts in this direction have resulted in the formulation of the ISO/SAE 21434 standard. In this work, we focus on summarising the ISO/SAE 21434 standard and analysing its relevance for automotive architecture development. Related works in the standard's approach of Security by Design are summarised. Additionally, we propose an application work of the standard in the context of automotive Architecture development as future work.
|
|
17:00-18:00, Paper Tu-S5T1.2 | Add to My Program |
Research on Key Influencing Factors of Platform Economy Empowering Value Enhancement of the Whole Agriculture Industry Chain |
|
Zheng, Xin | Northwestern Polytechnical University |
Zhu, Yuming | Northwestern Polytechnical University |
Liu, Caihong | Northwestern Polytechnical University |
He, Lei | Northwestern Polytechnical University |
Mu, Bingxu | Northwestern Polytechnical University |
Keywords: Conflict Resolution, Decision Support Systems, System Modeling and Control
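Abstract: As an organic whole in which multiple links and entities are coupled and coordinated, the agricultural industry chain stands to benefit greatly from the introduction of a network-collaborative platform economy, so it is important to investigate the key factors through which the platform economy enhances the value of the whole agricultural industry chain. Through literature research, meta-analysis, and expert interviews, this study identifies 34 initial influencing factors for the value enhancement of the whole agricultural industry chain empowered by the platform economy, and divides these initial factors into three dimensions based on general systems theory, namely industry chain link elements, industry chain participants, and the industry chain system and environment. We then construct a key influencing factor identification model based on the fuzzy AHP and fuzzy DEMATEL methods and finally identify seven key influencing factors: intelligent management of production, organization of agricultural production, digital literacy and skills of agricultural entities, willingness to cooperate and collaborative capability of the various entities, rural digital infrastructure construction, agricultural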
|
|
17:00-18:00, Paper Tu-S5T1.3 | Add to My Program |
Research on the Evaluation System of the Effectiveness of the Construction of First-Class Engineering Disciplines in Chinese Universities |
|
Zong, Fan | Northwestern Polytechnical University |
Xie, Xiaoxiao | Northwestern Polytechnical University |
Keywords: System Modeling and Control, Decision Support Systems, Cooperative Systems and Control
Abstract: The existing evaluation system for the construction of first-class disciplines is mostly focused on the disciplinary output indicators such as personnel training, scientific research and social services, and does not fully consider the impact of the disciplinary foundation and process management and other related indicators. In this paper, a solid literature study was conducted to filter the set of three-level evaluation indicators. The reliability and validity tests have verified that process management and disciplinary foundation are two important evaluation dimensions in the effectiveness of the construction of first-class engineering disciplines in Chinese universities, thus enriching the existing research on relevant evaluation theories. The entropy weighting method is used to assign weights to all indicators, and a complete three-level progressive evaluation system for the effectiveness of the construction of first-class engineering disciplines in Chinese universities is constructed. The results of the study verify the scientific validity of the constructed evaluation system and expand the scope of the existing research on the evaluation of the effectiveness of the construction of first-class disciplines.
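The entropy weighting step mentioned in the abstract can be sketched with the textbook formulation below: indicators whose values vary more across the evaluated units carry more information and receive larger weights. The input matrix, the function name, and the assumption that all indicator values are positive and already comparable are illustrative choices, not details taken from the paper.

```python
import math

def entropy_weights(matrix):
    """Entropy weights for the columns (indicators) of a positive score matrix.

    Rows are evaluated units, columns are indicators. Assumes at least two
    rows and that not every column is perfectly uniform.
    """
    m = len(matrix)
    n = len(matrix[0])
    k = 1.0 / math.log(m)          # scales entropy into [0, 1]
    divergences = []
    for j in range(n):
        column = [row[j] for row in matrix]
        total = sum(column)
        entropy = 0.0
        for x in column:
            p = x / total          # share of unit i on indicator j
            if p > 0:
                entropy -= k * p * math.log(p)
        divergences.append(1.0 - entropy)   # higher means more informative
    norm = sum(divergences)
    return [d / norm for d in divergences]

# Hypothetical scores: indicator 0 is identical across units, indicator 1 varies.
weights = entropy_weights([[1.0, 1.0], [1.0, 3.0]])
```

In this toy input the first indicator does not distinguish the units, so nearly all of the weight goes to the second indicator.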
|
|
17:00-18:00, Paper Tu-S5T1.4 | Add to My Program |
Research on the Factors Influencing the Integration of Traditional Characteristic and Advantageous Engineering Majors with Artificial Intelligence in Colleges and Universities |
|
Xie, Xiaoxiao | Northwestern Polytechnical University |
Zhu, Yuming | Northwestern Polytechnical University |
Zong, Fan | Northwestern Polytechnical University |
Wang, Xu | Northwestern Polytechnical University |
Keywords: System Modeling and Control, Decision Support Systems, Control of Uncertain Systems
Abstract: Facing the new round of technological revolution and industrial transformation, digital education represented by artificial intelligence has brought unprecedented opportunities and challenges to the reform of higher engineering education. How to promote the deep integration of traditional characteristic and advantageous engineering majors with artificial intelligence is an important issue currently faced by higher education worldwide. Based on this, the factors influencing the integration of engineering majors with traditional characteristics and artificial intelligence are systematically analyzed. Five primary influencing factors and 19 secondary influencing factors are identified through literature research and expert interviews, and the interaction between the factors is analyzed using a structural equation model. The results indicate that the factors, ranked from highest to lowest degree of influence on the integration of traditional characteristic and advantageous engineering majors with artificial intelligence, are curriculum system, faculty construction, practical teaching, educational positioning, and quality assurance. Among these, teaching objectives, faculty structure, industry-university cooperation, research and education, professional orientation, and teaching evaluation are the most important indicators corresponding to each first-level influencing factor, respectively. Based on the research results, countermeasures for promoting the integration of traditional characteristic and advantageous engineering majors with artificial intelligence are proposed from five aspects.
|
|
17:00-18:00, Paper Tu-S5T1.5 | Add to My Program |
Research on the Key Influencing Factors and Guarantee Strategies of Comprehensive Land Consolidation Based on the Optimization of Fuzzy-BWM |
|
Liu, Caihong | Northwestern Polytechnical University |
Zhu, Yuming | Northwestern Polytechnical University |
Mu, Bingxu | Northwestern Polytechnical University |
He, Lei | Northwestern Polytechnical University |
Keywords: Conflict Resolution, Decision Support Systems, System Modeling and Control
Abstract: Comprehensive land consolidation (CLC) is a more integrated, synergistic, and sustainable land management model that can better meet the requirements of socio-economic development and ecological environmental protection. However, CLC involves multiple fields, stakeholders, and complex socio-economic environments, which make it challenging to grasp the main points and difficulties in understanding CLC, resulting in insufficient scientific and targeted work. Therefore, identifying the key influencing factors (KIFs) of CLC is crucial for its successful implementation. This study adopts a multivariate system theory perspective and identifies 43 factors that can be categorized into three groups: consolidation subject, consolidation element, and consolidation system and environment. Furthermore, the Hamming distance optimization traditional fuzzy best and worst method (F-BWM) is introduced to identify eight KIFs, including A6 comprehensive quality of professionals, A7 importance of local government, B9 financial support, B18 land use planning, B20 comprehensive consolidation cost, C2 benefit distribution mechanism, C5 whole process supervision and control mechanism, and C8 multi-departmental coordination mechanism. This study provides a new theoretical framework and methodology for research on the influencing factors of CLC, and helps policy makers and decision makers understand the interrelationships and influences of different factors in the complex consolidation work, clarifying the management priorities for CLC implementation. This can lead to the formulation of more scientific and reasonable policies and measures to promote the sustainable use and comprehensive management of land resources. Moreover, the introduction of Hamming distance optimization to the traditional fuzzy BWM method enhances the reliability and scientific rigor of the research results, and can serve as a reference for similar studies.
|
|
17:00-18:00, Paper Tu-S5T1.6 | Add to My Program |
A Safety-Critical Integrated Planning and Control Method for Autonomous Ground Vehicles |
|
Pan, Wei | Tongji University |
Zhao, Yongpo | Great Wall Motor Company Limited |
Sun, Huiyun | Great Wall Motor Company Limited |
Zhang, Lin | Tongji University |
Chen, Zhitao | Beijing Institute of Space Launch Technology |
Chen, Hong | Tongji University |
|
|
17:00-18:00, Paper Tu-S5T1.7 | Add to My Program |
Exploring a Self-Attentive Multilayer Cross-Stacking Fusion Model for Video Salient Object Detection (I) |
|
Yang, Heng | Sichuan Normal University |
Mu, Nan | Sichuan Normal University |
Guo, Jinjia | Chongqing University |
Wang, Rong | Sichuan Normal University |
Keywords: Image Processing and Pattern Recognition, Deep Learning, Neural Networks and their Applications
Abstract: As an effective measure to capture the object of interest in a video sequence, video salient object detection (VSOD) requires the processing of information from spatial-motion modalities. Although plenty of traditional VSOD models were dedicated to developing efficient spatial and motion features to obtain salient objects of global consistency, the highly redundant spatial information brought by consecutive identical objects inevitably reduces the generalization ability of these VSOD models. Although exploring the integration of spatial and motion information can improve the inter-frame correlation of salient objects to some extent, previous models tend to focus only on simple spatio-temporal fusion, which can also generate redundant information and result in poor detection performance. Therefore, it is necessary to focus on effectively fusing the feature information of different modalities to eliminate the effect of redundant information. In this research, we propose a VSOD model based on self-attentive multilayer cross-stacking fusion, which extracts the multimodal features for two-way information transfer, fully utilizes the spatial and temporal knowledge to complement each other, and refines the cross-stacking of the interacted information and spatial features for local and global saliency optimization. As a result, the redundant spatial information can be largely eliminated, reducing the misidentification of salient objects due to blurred backgrounds or moving objects, and adaptively activating more weights of the salient object to achieve globally consistent saliency. Comprehensive experiments on four publicly available VSOD datasets demonstrate that the model has superior performance compared with multiple recent VSOD models.
|
|
17:00-18:00, Paper Tu-S5T1.8 | Add to My Program |
Design of a Quantum Self-Attention Neural Network on Quantum Circuits (I) |
|
Zheng, Jin | Beihang University |
Gao, Qing | Beihang University |
Miao, Zibo | Harbin Institute of Technology |
Keywords: Quantum Machine Learning
Abstract: This paper proposes a quantum self-attention neural network (QSAN) model that is implementable on quantum circuits, providing a novel avenue to processing text classification tasks in natural language processing (NLP). The QSAN framework is established by integrating four basic blocks: the data preprocessing block, the quantum encoding block, the model design block, and the network optimization block. Simulation results demonstrate remarkable convergence and accuracy on various text classification datasets. In particular, the proposed QSAN surpasses the current state-of-the-art quantum NLP (QNLP) model in terms of test accuracy.
|
|
17:00-18:00, Paper Tu-S5T1.9 | Add to My Program |
Digital Twin Integration for Software Defined Vehicles: Decoupling Hardware and Software in Automotive System Development (I) |
|
Purohit, Shatad | University of Southern California |
Madni, Ayesha | University of Southern California |
Adiththan, Arun | City University of New York |
Madni, Azad | University of Southern California |
Keywords: Digital Twin, Autonomous Vehicle, System Architecture
Abstract: The ongoing evolution of Software Defined Vehicles (SDVs) in the automotive industry has drawn attention to the importance of decoupling hardware and software to enable greater flexibility, upgradability, and adaptability in vehicle design and functionality. Digital twin technology, which involves creating virtual models of physical systems with bidirectional communication with the physical system, presents a promising approach for optimizing and validating various aspects of SDVs within this new paradigm. This paper proposes a novel framework for applying digital twins in the context of SDVs and discusses the benefits derived from decoupling hardware and software.
|
|
Tu-S5T2 Virtual Session, Room T2 |
Add to My Program |
Machine Learning for Intelligent Imaging Systems VI |
|
|
Organizer: Tang, Jinshan | George Mason University |
Organizer: Agaian, Sos | New York City University |
|
17:00-18:00, Paper Tu-S5T2.1 | Add to My Program |
Semantic Segmentation of 3D Liver Image Based on Multi-Path Features Attention Mechanism (I) |
|
Jiang, Zhihui | Wuhan University of Science and Technology |
Zhang, Xiaolong | Wuhan University of Science and Technology |
Deng, He | Wuhan University of Science and Technology |
Ren, Hongwei | Wuhan University of Science and Technology |
Keywords: Image Processing and Pattern Recognition
Abstract: It is challenging to precisely segment the liver from surrounding organs in medical images because of the poor contrast between them. A method of semantic segmentation of 3D liver images based on multi-path features attention mechanism is proposed to address this issue. It integrates three-dimensional spatial information and feature information from several paths in the model to automatically segment the liver area. The model in this paper uses the LiTS dataset for training, testing, and ablation experiments, and compares the results with previous models. The experimental results demonstrate that the model in this paper has reached 0.965 in the DICE similarity coefficient, and has also improved in evaluation indicators such as volume overlap error (VOE) and root mean square symmetric surface distance (RMSD). It also has better segmentation performance when tested on the CHAOS dataset. Cross-validation was carried out on the clinical MRI dataset of a hospital, and the DICE similarity coefficient reached 0.971. The results show that the model has good performance on the multi-modal datasets of CT and MRI.
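For readers unfamiliar with the overlap metrics named in the abstract, the sketch below computes the DICE similarity coefficient and the volume overlap error (VOE) from two binary masks using their standard definitions. The flattened list representation, the function name, and the toy masks are assumptions for illustration; they are unrelated to the paper's data or code.

```python
def dice_and_voe(pred, truth):
    """Return (Dice, VOE) for two equal-length binary masks.

    Dice = 2|A ∩ B| / (|A| + |B|), VOE = 1 - |A ∩ B| / |A ∪ B|.
    Assumes at least one voxel is set in either mask.
    """
    inter = sum(1 for p, t in zip(pred, truth) if p and t)
    size_pred = sum(1 for p in pred if p)
    size_truth = sum(1 for t in truth if t)
    union = size_pred + size_truth - inter
    dice = 2.0 * inter / (size_pred + size_truth)
    voe = 1.0 - inter / union
    return dice, voe

# Toy flattened masks: three voxels each, two of them shared.
dice, voe = dice_and_voe([1, 1, 1, 0], [0, 1, 1, 1])
```

A Dice value closer to 1 and a VOE closer to 0 both indicate better agreement between the predicted and reference segmentations.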
|
|
17:00-18:00, Paper Tu-S5T2.2 | Add to My Program |
CPR-Net: A Novel Joint Learning Network for Pulmonary Nodule Evaluation (I) |
|
Chen, Gong | Wuhan University of Science and Technology |
Liu, Jun | Wuhan University of Science and Technology |
Li, Chen-Qian | Wuhan University of Science and Technology |
Keywords: Deep Learning, Image Processing and Pattern Recognition, Machine Vision
Abstract: Lung cancer is the most common malignant tumor worldwide, with high mortality rates. Pulmonary nodules are a common manifestation of lung cancer. Accurate segmentation and detection of pulmonary nodules from CT scans are essential for proper assessment of patient prognosis. However, this task remains challenging due to various factors, such as class imbalance and the need for detailed characterization of lung nodule segmentation. To address these issues, we propose a novel network, CPR-Net (Convolutional Balance Residuals-Net), that combines segmentation and benign-malignant classification for CT images of lung nodules. Our approach utilizes a PBMM module to expand the perception field and enhance the representation capability of the model, allowing it to learn more detailed information. We evaluate the classification performance using accuracy, precision, recall, F1 score, and AUC, and the segmentation performance using the Dice coefficient. Experiments on both publicly available datasets and self-constructed datasets demonstrate that our proposed method outperforms other methods in terms of classification and segmentation performance under limited labeled data conditions. Moreover, our results suggest that this model has great potential for improving lung nodule diagnosis in radiology for heterogeneous intranodal images.
|
|
17:00-18:00, Paper Tu-S5T2.3 | Add to My Program |
3D U-Net3+ Based Microbubble Filtering for Ultrasound Localization Microscopy (I) |
|
Han, Wenzhao | School of Computer Science, Southwest Petroleum University |
Zhang, YuTing | Southwest Petroleum University |
Zhao, Yachuan | Southwest Petroleum University |
Luo, Anguo | Sichuan Provincial Key Laboratory of Ultrasound Cardiac Electrop |
Peng, Bo | Southwest Petroleum University |
Keywords: Application of Artificial Intelligence, Image Processing and Pattern Recognition
Abstract: Ultrasound localization microscopy (ULM) is an innovative imaging technique that employs microbubbles (MBs) to improve the spatial resolution of ultrasound (US) imaging. Accurately extracting the MB signals from the original ultrasound data is essential for successful ULM. Traditional MB filtering methods, such as SVD, have high complexity and computational intensity. Due to the sensitivity to spatiotemporal information, 3D convolutional neural networks (3D CNN) have been utilized in MB filtering. However, the large number of parameters in 3D convolutional layers and complex network architectures affect the real-time performance of ULM. To optimize the network structure of 3D CNN and reduce parameters, this study proposes a novel MB filtering method based on 3D CNN and U-Net3+ named 3D U-Net3+. It adopts a full-scale connection strategy to reduce network parameters, while combining low-level semantics and high-level semantics to capture fine-grained semantics and coarse-grained semantics at full scale. The experimental results demonstrate that the proposed MB filtering method can effectively preserve the spatiotemporal information of MBs in ultrasound sequence images. The SSIM and PSNR values of the MB image processed by the proposed method achieve 0.9141 and 30.881 dB, respectively. The obtained ULM image shows the microvessels as small as 20 µm. Furthermore, compared to the 3D U-Net MB filtering method, this proposed method achieves lower time complexity with an average processing time per frame of 29.49 ms for an image size of 256×256.
|
|
17:00-18:00, Paper Tu-S5T2.4 | Add to My Program |
Three-Dimensional Reconstruction of Vascular Model in Intravascular Ultrasound Images Using Semantic Segmentation (I) |
|
Jiang, Yuqi | Southwest Petroleum University |
Zhang, Pengfei | Qilu Hospital, Shandong University |
Peng, Bo | Southwest Petroleum University |
Keywords: Image Processing and Pattern Recognition, Application of Artificial Intelligence
Abstract: Coronary atherosclerotic disease is a major cause of myocardial infarction and usually causes partial or complete coronary artery disorders, which can be life-threatening in severe cases. Currently, the diagnosis of coronary atherosclerosis is mainly achieved through intravascular ultrasound (IVUS). By using intravascular ultrasound, the location and morphology of lesions can be identified early, which is crucial for the early detection and accurate diagnosis of coronary artery disease. Intravascular ultrasound is a widely used imaging technique for diagnosing and treating cardiovascular diseases. In this paper, we propose a new method for three-dimensional (3D) reconstruction of IVUS vascular models based on semantic segmentation. The proposed method utilizes state-of-the-art deep learning techniques to accurately segment the blood vessels in IVUS images. Subsequently, the segmented vessels are used to generate 3D reconstructions of the vascular models. Our method achieves high accuracy and robustness, and it has the potential to enhance the accuracy of IVUS-based diagnosis and treatment planning. Experimental results on a dataset of real-world IVUS images demonstrate the effectiveness of our method.
|
|
17:00-18:00, Paper Tu-S5T2.5 | Add to My Program |
Graph Convolutional Networks with Feature Enhancement for Choroidal Neovascularization Segmentation in OCT Images (I) |
|
Liu, Xiaoming | Wuhan University of Science and Technology |
Wang, Rui | Wuhan University of Science and Technology |
Tang, Jinshan | George Mason University |
Keywords: Image Processing and Pattern Recognition, Deep Learning, AI and Applications
Abstract: Choroidal neovascularization (CNV) is a prevalent retinal disease that can result in vision loss and blindness. Therefore, accurate segmentation of CNV is crucial for ophthalmologists to effectively treat patients with CNV. However, due to the complex pathological features of CNV, there exist significant variations in the size and shape of different CNV lesions. As a result, the challenge of segmenting CNV in optical coherence tomography (OCT) images remains unresolved. In this paper, we propose a Graph Convolutional Network with Feature Enhancement (GCFE-Net) for CNV segmentation. Our approach introduces a Graph Attention Module (GAM) on top of the encoder to extract pixel characteristics and enhance the model's space utilization. Additionally, we propose a Dynamic Fusion Module (DFM) in the decoder to address the issue of semantic misalignment when the CNV scale undergoes substantial changes. The effectiveness of the proposed method is demonstrated through experiments conducted on the Cell public dataset.
|
|
17:00-18:00, Paper Tu-S5T2.6 | Add to My Program |
Weakly Semi-Supervised Object Detection with Point Annotations in Retinal OCT Images (I) |
|
Liu, Xiaoming | Wuhan University of Science and Technology |
Zhu, Xin | Wuhan University of Science and Technology |
Tang, Jinshan | George Mason University |
Keywords: Image Processing and Pattern Recognition, Deep Learning, AI and Applications
Abstract: Optical coherence tomography (OCT) is a widely used ophthalmic imaging technique, and accurate detection of retinal biomarkers in OCT images can help physicians diagnose diseases. However, OCT images are not easy to obtain, and annotating them is time-consuming and laborious. In addition, the size of biomarkers varies widely. Past deep learning-based methods can hardly solve the above problems well. Thus, to overcome the above challenges, we propose a weakly semi-supervised method called PO-Net for the detection of retinal biomarkers in OCT images. We use a small number of images with bounding box labels, and a large number of weakly annotated images with only one point annotation per biomarker. The training of the net is composed of several steps. In the first step, we use the weakly annotated images to train a point-to-box regression network. In the second step, the point-annotated images are used to generate pseudo-bounding boxes. In the third step, the images with bounding box annotations and the generated images with pseudo-bounding box labels are used as inputs to the detection network. Furthermore, we propose a multi-scale feature fusion module to deal with the problem of biomarker appearance changes. The effectiveness of the proposed method is evaluated on a local dataset, and the state-of-the-art performance of our method is achieved in all datasets with different percentages of bounding box annotations.
|
|
17:00-18:00, Paper Tu-S5T2.7 | Add to My Program |
Towards AI-Based Accessible Digital Media: Image Analysis Pipelines and Blind User Studies |
|
Salous, Mazen | OFFIS Institute for Information Technology |
Lange, Daniel | OFFIS Institute for Information Technology |
Von Reeken, Timo | OFFIS Institute for Information Technology |
Heuten, Wilko | OFFIS Institute for Information Technology |
Boll, Susanne | OFFIS Institute for Information Technology |
Abdenebaoui, Larbi | OFFIS Institute for Information Technology |
Keywords: Assistive Technology, Intelligence Interaction, Human-Computer Interaction
Abstract: We report from our work in progress within our project ABILITY, in which we implemented AI-based image analysis pipelines to improve the accessibility of digital media for Blind and Visually Impaired (BVI) users. In addition, we conducted two user studies with BVIs, the first one as a preliminary study, and the current one to evaluate different types of AI-based automatic description. Based on our current progress, multimodal assistant (namely speech, tactile representations and braille) will be transformed to a dedicated BVI-tablet.
|
|
17:00-18:00, Paper Tu-S5T2.8 | Add to My Program |
Discovering Reliable Information Extraction Patterns with Pre-Trained Model for Text with Writing Style |
|
Bu, Chenyang | Hefei University of Technology |
Liu, Jiacheng | Hefei University of Technology |
Liu, Jiaxuan | Hefei University of Technology |
Ji, Shengwei | Hefei University |
Yang, Hongbin | Hefei University of Technology |
Keywords: Deep Learning, Machine Learning, AI and Applications
Abstract: Large-scale pre-trained models such as GPT and BERT have demonstrated remarkable performance in information extraction tasks. However, their black-box nature poses challenges for reliability and interpretability. In contrast, rule-based extraction methods have better interpretability, but typically require domain experts to manually establish rules, limiting their generalization ability. In industry, there is often a demand for reliable knowledge extraction to reduce the time spent on manual verification of each piece of knowledge. In this paper, we explore the use of GPT-based approaches to automatically discover trustworthy extraction patterns in text with a particular writing style. This method leverages the characteristics of high information density and similar writing patterns in text with a specific writing style to generate verifiable and reliable patterns. We conduct experiments on two datasets with a specific writing style to demonstrate its effectiveness, validating the idea of combining large models for reliable information extraction pattern discovery in the tested datasets.
|
|
17:00-18:00, Paper Tu-S5T2.9 | Add to My Program |
The Consequences of Preference Correlation for Fair Division of Indivisible Items |
|
Ziaei, Fahimeh | Wilfrid Laurier University |
Kilgour, Marc | Wilfrid Laurier University |
Keywords: Conflict Resolution
Abstract: Fair allocation of indivisible items among multiple individuals is a fundamental problem in collective choice. We consider the problem of allocating two of four indivisible items to each of two players, where the only known preference information is each player's strict ranking of the items. How is the rank correlation of preferences, as measured by Kendall Tau, related to properties that facilitate fair allocation such as the availability of envy-free, Pareto-optimal, maximin, and max BordaSum allocations? We also examine the relationship between the ranked correlation and features of Fallback Bargaining such as the depth of agreement and the probability of a (two-way) tie. We further categorize the players into two types, risk-averse and risk-acceptant, and analyze how player type affects various fair division properties. Our results suggest that increasing similarity of preferences tends to increase the number of Pareto-optimal and maximin allocations but to decrease the number of envy-free allocations. Higher rank correlation also makes Fallback Bargaining less compelling. Understanding how similarity of preference rankings influences the trade-offs among allocation properties gives new insight into the difficulties of fair allocation of indivisible items.
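The setting in this abstract (two players, four items, two items each, strict rankings only) is small enough to enumerate. The sketch below is an illustration, not the authors' analysis: it computes the Kendall tau rank correlation of two rankings and lists the Borda totals of all six 2+2 allocations. The item labels, the 3-2-1-0 Borda scale, and the function names are assumptions made for this example. Envy-freeness, Pareto-optimality and Fallback Bargaining are not implemented here.

```python
from itertools import combinations

ITEMS = ("a", "b", "c", "d")  # hypothetical item labels


def kendall_tau(rank_x, rank_y):
    """Kendall tau between two strict rankings given as best-to-worst tuples."""
    pos_x = {item: i for i, item in enumerate(rank_x)}
    pos_y = {item: i for i, item in enumerate(rank_y)}
    concordant = discordant = 0
    for u, v in combinations(rank_x, 2):
        # A pair is concordant when both rankings order it the same way.
        if (pos_x[u] - pos_x[v]) * (pos_y[u] - pos_y[v]) > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (concordant + discordant)


def borda_scores(ranking):
    """Borda points: n-1 for the top item down to 0 for the bottom item."""
    n = len(ranking)
    return {item: n - 1 - i for i, item in enumerate(ranking)}


def allocations(rank_1, rank_2):
    """Yield (bundle_1, bundle_2, borda_1, borda_2) for every 2+2 split."""
    s1, s2 = borda_scores(rank_1), borda_scores(rank_2)
    for bundle_1 in combinations(ITEMS, 2):
        bundle_2 = tuple(i for i in ITEMS if i not in bundle_1)
        yield (bundle_1, bundle_2,
               sum(s1[i] for i in bundle_1), sum(s2[i] for i in bundle_2))


player_1 = ("a", "b", "c", "d")
player_2 = ("b", "a", "d", "c")
print("Kendall tau:", round(kendall_tau(player_1, player_2), 3))
all_splits = list(allocations(player_1, player_2))
best_sum = max(all_splits, key=lambda r: r[2] + r[3])      # max BordaSum
best_min = max(all_splits, key=lambda r: min(r[2], r[3]))  # Borda maximin
print("max BordaSum allocation:", best_sum)
print("maximin (Borda) allocation:", best_min)
```

Repeating the enumeration over all pairs of rankings, grouped by their Kendall tau value, is one way to explore how preference similarity relates to the availability of such allocations.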
|
|
Tu-S5T3 Virtual Session, Room T3 |
Add to My Program |
General Cybernetics I |
|
|
|
17:00-18:00, Paper Tu-S5T3.1 | Add to My Program |
A Pairwise Surrogate Model Using GNN for Evolutionary Optimization |
|
Gharavian, Vida | Ontario Tech University |
Rahnamayan, Shahryar | Brock University |
Asilian Bidgoli, Azam | Wilfrid Laurier University |
Makrehchi, Masoud | Ontario Tech University |
Keywords: Evolutionary Computation, Computational Intelligence, Neural Networks and their Applications
Abstract: Optimization problems widely arise in various science and engineering fields and can be computationally expensive in many real-world applications. Evaluation of the fitness function to assess a candidate solution is the main operation in all optimization procedures and can be heavily compute-intensive. Machine learning-based surrogate models can learn the specific pattern among the decision variables and objective values and consequently reduce the computation time of fitness evaluation. In this study, we propose a novel pairwise surrogate model that identifies the superior of two candidate solutions in a pairwise comparison, whereas most surrogate models try to predict the exact fitness value. The proposed idea can significantly help the optimizer to reach better results in a shorter period of time. Comparing two candidate solutions for a greedy selection appears to be much easier than approximating fitness values for both. We use a Graph Neural Network (GNN) for this purpose, trained on a limited number of pairwise ranks and then utilized to compare a pair of candidate solutions. In order to examine the efficacy of our model, we utilized different well-known single-objective optimization benchmarks in dimensions 10, 20, and 30. Moreover, the results of the learning-based evaluation are compared with the results from the real fitness evaluation. The results, assessed in terms of the number of fitness calls and the best-found solution, showed that the proposed method is able to decrease the computing cost of fitness evaluation significantly while achieving a comparable solution. Our model can be tested with any optimization algorithm that employs a comparison-based mechanism among its candidate solutions.
|
|
17:00-18:00, Paper Tu-S5T3.2 | Add to My Program |
A Neuronal Circuit Simulation Highlights the Role of Neuroglia in Modulating Information Transmission (I) |
|
Pompa, Marcello | National Research Council (IASI CNR) Rome |
Tartarisco, Gennaro | National Research Council of Italy |
Panunzi, Simona | IASI-CNR |
D'Orsi, Laura | IASI CNR |
Borri, Alessandro | IASI-CNR |
De Gaetano, Andrea | IASI-CNR |
Keywords: Computational Life Science
Abstract: A simple model of a neuronal circuit based on a stochastic discrete-time difference equation is described. The model assumes loosely connected clusters of densely connected neurons within each cluster, and a circuit topology is produced by connecting the clusters in sequence. The action of the glia is simulated by two parameters referring to its trophic action in restoring neuronal energy levels after firing and to its scavenging action at the synaptic level affecting the probability that an impulse is transmitted. It is shown that only glial trophic support within a limited range allows ordinate cyclic functioning of the circuit. It is also shown that changes in local action at the synapse determine changes in the frequency of the cycling. This kind of model paves the way to a quantitative description of changes in neurodegenerative diseases, so as to potentially predict the evolution of quality of life in these conditions.
|
|
17:00-18:00, Paper Tu-S5T3.3 | Add to My Program |
A Preference-Based Multi-Objective Evolutionary Algorithm with Local Pareto Front Modeling |
|
Zhao, Pei pei | Zhejiang University of Technology |
Liping, Wang | Zhejiang University of Technology |
Hui, Wang | Zhejiang University of Technology |
Keywords: Evolutionary Computation, Optimization and Self-Organization Approaches, Computational Intelligence
Abstract: Most existing multi-objective evolutionary algorithms (MOEAs) have difficulties in approximating the whole Pareto Fronts with complicated geometries. However, the decision maker (DM) may only be interested in a small portion of the front, referred to as the region of interest (ROI). Bearing this in mind, this paper develops a preference-based MOEA with local Pareto Front (PF) estimation to address the above issues. We first modify the r-dominance relation, which is not affected by the position of the user reference point. The modified non-r-dominated solutions are then used as training data during the optimization process to model the PF of the ROI. The estimated local front can be used to drive the search toward the preferred PF region. Experimental results demonstrate the effectiveness of our proposed algorithm on a variety of benchmark problems with different kinds of PFs.
|
|
17:00-18:00, Paper Tu-S5T3.4 | Add to My Program |
Optimizing Strategies for Design of 3D Chip Layout Using Games Approach and Swarm Intelligence |
|
Grzesiak-Kopeć, Katarzyna | Jagiellonian University in Cracow |
Ogorzalek, Maciej | Jagiellonian University |
Keywords: Swarm Intelligence, Application of Artificial Intelligence, Heuristic Algorithms
Abstract: Heuristic approaches are often used to efficiently find solutions to complex engineering problems such as 3D chip layout design. When considering numerous constraints, multiple evaluation functions can be used. It is not straightforward how to combine these various functions to obtain a satisfying solution. This paper proposes a particle swarm optimization (PSO) algorithm as a method of finding the right blending for a given set of heuristics. Such an approach not only modifies some parameters, but redefines the entire design process by defining it as a game. The designer plays the role of the game master who proposes heuristic evaluation functions. Design components are autonomous intelligent agents moving in a virtual world. The PSO algorithm develops blending factors for given functions to find a universal design strategy for a given design problem. A case study of a 3D chip layout design is presented and discussed. The consequences and benefits of such an approach are also shown.
|
|
17:00-18:00, Paper Tu-S5T3.5 | Add to My Program |
FATPaSE: Proposing a Computationally Intelligent Framework for Automated Target Profiling and Social Engineering (I) |
|
Bischof, Mario | University of Fribourg, Human-IST Institute, Infoguard AG |
Portmann, Edy | University of Fribourg |
Keywords: Communications, Technology Assessment, Homeland Security
Abstract: Social engineering is recurrently listed as a prime threat according to major cyber security agencies. Over the course of the last decades, various sources expressed serious concerns about the potential impact of artificial intelligence on automating criminal cyber activities. With the recent developments and skyrocketing popularity of large language models, this threat forecast drastically worsened within an overwhelmingly short time period. In this paper, we propose the framework FATPaSE which pursues the goal of maximizing automation of digital profiling and social engineering aided by computational intelligence techniques. We discuss the design of the framework in detail, argue for possible technological choices to realize specific parts and elaborate on already existing proof of concept work. The interconnection of all components should enable a working, fully automated kill chain. During the subsequent design science oriented implementation, the achievable degree of automation will be intensely studied, resulting in novel, innovative research artifacts. Our progress shall be directly validated in the field based on a realistic, industrial test setup. We will continuously publish our results during the development process and expect to gain deepened insights on the potentials of the idea to contribute to the toolbox of cyber security experts.
|
|
17:00-18:00, Paper Tu-S5T3.6 | Add to My Program |
MLGNet: A Multi-Period Local and Global Temporal Dynamic Pattern Integration Network for Long-Term Forecasting |
|
Hui, Wang | Zhejiang University of Technology |
Liping, Wang | Zhejiang University of Technology |
Qicang, Qiu | Zhejiang Lab |
Zhao, Pei pei | Zhejiang University of Technology |
Keywords: Deep Learning, Representation Learning, AI and Applications
Abstract: Long-term forecasting is widely used in meteorology, hydrology, and finance. However, non-stationary time series make it hard to make accurate long-term predictions because of their complicated multi-period local-global temporal dynamic patterns. Currently, state-of-the-art methods use transformers or temporal convolutions to obtain global and local temporal dynamic patterns. Nevertheless, the former suffers from the computational complexity of self-attention mechanisms despite having a global temporal receptive field. Despite being able to catch local temporal patterns, the latter requires additional layers to capture global temporal patterns. Moreover, the present research disregards integrating multi-period patterns into long-term forecasting. In this paper, we propose MLGNet to tackle the mentioned challenges, which integrates local and global temporal dynamic patterns with multiple periods for long-term forecasting. In particular, we suggest using the maximal overlap discrete wavelet transform (MODWT) as a multi-period decoupling method to decompose non-stationary time series and apply it for the first time to long-term forecasting. In addition, we suggest a multi-scale encoder-decoder framework to capture and fuse local-global temporal dynamic patterns in each decomposed period. Inception dilated causal convolutions-based encoder and a lightweight MLP-based decoder in the framework capture local and global temporal dynamic patterns in series while avoiding the high computational complexity of self-attention mechanisms. Lastly, we suggest time-separable convolutions for aggregating information on temporal dynamic patterns among multiple periods. The above method helps MLGNet better balance the representation ability of time series in 1D and 2D space. Evaluation of five benchmark datasets shows that MLGNet outperforms traditional and state-of-the-art methods, with relative improvements of 13.8% and 21.9% for multivariate and univariate long-term forecasting, respectively.
|
|
17:00-18:00, Paper Tu-S5T3.7 | Add to My Program |
Collision Avoidance of Autonomous Vehicles with E-Bike at Un-Signalized Occluded Intersections Based on Reinforcement Learning |
|
Zhang, Delei | Shandong University of Science and Technology |
Qi, Liang | Shandong University of Science and Technology |
Luan, Wenjing | Shandong University of Science and Technology |
Guo, Xiwang | Liaoning Petrochemical University |
Keywords: Machine Learning, Agent-Based Modeling, Application of Artificial Intelligence
Abstract: Un-signalized occluded intersections are residential road intersections with narrow lanes and surrounding buildings, which are prone to traffic accidents. This work uses deep reinforcement learning to design driving strategies for Autonomous Vehicles (AVs) for avoiding collision and reducing damage to electric bicycles (e-bikes) with dangerous behaviors at un-signalized occluded intersections. The conflict-avoidance behavior of e-bikes is modeled. It adopts a multi-objective reward function that considers the injury severity of e-bike riders and the driving safety and comfort of AVs. A deep deterministic policy gradient method is used to train the model to control the acceleration and steering of AVs. The performance of the proposed method is compared with that of an autonomous emergency braking system and a risk-aware high-level decision strategy by simulation experiments. Experimental results show that the driving strategy can reduce the collision probability by 26.38% on average, and the injury can be reduced by 14.05% on average when the collision is unavoidable. To our knowledge, this is the first paper that employs reinforcement learning to model and design driving strategies for AVs conflicting with e-bikes. It can be used to improve the state of the art in AV control and safety at intersections.
|
|
17:00-18:00, Paper Tu-S5T3.8 | Add to My Program |
Controlled Dropout for Uncertainty Estimation |
|
Hasan, Md Mehedi | Deakin University |
Hossain, Ibrahim | Deakin University |
Rahman, Ashikur | Bangladesh University of Engineering and Technology |
Nahavandi, Saeid | Swinburne University of Technology |
Keywords: Trust in Autonomous Systems
Abstract: Uncertainty quantification in a neural network is one of the most discussed topics for safety-critical applications. Though Neural Networks (NNs) have achieved state-of-the-art performance for many applications, they still provide unreliable point predictions, which lack information about uncertainty estimates. Among various methods to enable neural networks to estimate uncertainty, Monte Carlo (MC) dropout has gained much popularity in a short period due to its simplicity. In this study, we present a new version of the traditional dropout layer where we are able to fix the number of dropout configurations. As such, each layer can take and apply the new dropout layer in the MC method to quantify the uncertainty associated with NN predictions. We conduct experiments on both toy and realistic datasets and compare the results with the MC method using the traditional dropout layer. Performance analysis utilizing uncertainty evaluation metrics corroborates that our dropout layer offers better performance in most cases.
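As background for this abstract, the sketch below illustrates the standard Monte Carlo (MC) dropout baseline that the authors compare against: dropout stays active at prediction time, and the spread of repeated stochastic forward passes serves as an uncertainty estimate. It does not implement the paper's controlled dropout layer, whose details are not given here. The tiny network, its weights, and the function names are hypothetical.

```python
import random
import statistics

# Hypothetical fixed weights for a 1-input, 4-hidden-unit, 1-output network.
W_HIDDEN = [0.5, -1.0, 1.5, 2.0]
W_OUT = [1.0, 0.5, -0.5, 0.25]


def forward(x, drop_prob, rng):
    """One stochastic forward pass with dropout applied to the hidden layer."""
    keep_prob = 1.0 - drop_prob
    output = 0.0
    for w_h, w_o in zip(W_HIDDEN, W_OUT):
        hidden = max(0.0, w_h * x)  # ReLU activation
        if rng.random() < keep_prob:
            # Inverted dropout: rescale kept units so the expectation is unchanged.
            output += w_o * hidden / keep_prob
    return output


def mc_dropout_predict(x, drop_prob=0.5, passes=200, seed=0):
    """Return (mean, standard deviation) over repeated stochastic passes."""
    rng = random.Random(seed)
    samples = [forward(x, drop_prob, rng) for _ in range(passes)]
    return statistics.mean(samples), statistics.stdev(samples)


mean, spread = mc_dropout_predict(1.0)
print(f"prediction = {mean:.3f}, uncertainty (std) = {spread:.3f}")
```

With `drop_prob=0.0` every pass is identical and the reported spread collapses to zero, which is the deterministic point prediction the abstract describes as lacking uncertainty information.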
|
|
17:00-18:00, Paper Tu-S5T3.9 | Add to My Program |
Empirical Studies of Resampling Strategies in Noisy Evolutionary Multi-Objective Optimization |
|
Zhou, Shasha | University of Electronic Science and Technology of China |
Li, Ke | University of Exeter |
Keywords: Evolutionary Computation, Optimization and Self-Organization Approaches, AI and Applications
Abstract: Optimization problems are ubiquitous in real-world engineering scenarios where the goals are to enhance aspects of interest such as efficiency, productivity, and profitability. However, solving practical optimization problems could be non-trivial, partly due to the presence of a wide range of noises, including environmental noises, model biases, time-domain variations, measurement uncertainties and many other uncontrolled variables. In this paper, we empirically study the effect of noise range, sample size and resampling type on the solution quality of multi-objective evolutionary algorithms (MOEAs) when noise is added to decision variables. Our empirical results, conducted on three commonly used MOEAs, i.e., NSGA-II, MOEA/D and IBEA, demonstrate that noise range has a more significant impact on the robustness of optimization algorithms than sample size and resampling type. In addition, we introduce the concept of a bad point, which is able to illustrate how noise affects the performance of different MOEAs.
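The basic idea of resampling under decision-variable noise can be sketched as follows: a solution is evaluated several times, each time with its decision variables perturbed, and the objective vectors are averaged. This is a generic illustration, not the authors' experimental setup. The toy bi-objective function, the uniform noise model, and all names are assumptions, with `noise_range` and `sample_size` standing in for the factors named in the abstract.

```python
import random


def objectives(x):
    """Toy bi-objective problem (Schaffer-style); a stand-in, not from the paper."""
    return (x[0] ** 2, (x[0] - 2.0) ** 2)


def noisy_objectives(x, noise_range, rng):
    """Evaluate once after perturbing each decision variable uniformly."""
    perturbed = [v + rng.uniform(-noise_range, noise_range) for v in x]
    return objectives(perturbed)


def resampled_objectives(x, noise_range, sample_size, rng):
    """Average `sample_size` noisy evaluations, objective by objective."""
    totals = [0.0, 0.0]
    for _ in range(sample_size):
        for k, value in enumerate(noisy_objectives(x, noise_range, rng)):
            totals[k] += value
    return tuple(t / sample_size for t in totals)


rng = random.Random(42)
single = noisy_objectives([1.0], 0.5, rng)
averaged = resampled_objectives([1.0], 0.5, 50, rng)
print("one noisy evaluation:", single)
print("mean of 50 evaluations:", averaged)
```

A larger `sample_size` reduces the variance of the averaged estimate at the cost of more function evaluations, which is the trade-off that resampling strategies have to manage.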
|
|
Tu-S5T4 Virtual Session, Room T4 |
Add to My Program |
General HMS VI |
|
|
|
17:00-18:00, Paper Tu-S5T4.1 | Add to My Program |
Optimize the Accessibility of Healthcare Facilities Via ACP-Based Approach |
|
Lv, Xinyi | Nanjing Medical University |
Yu, Yi | Shanghai Artificial Intelligence Laboratory |
Xie, Xinzhao | Nanjing Medical University |
Wang, Fei-Yue | Institute of Automation, Chinese Academy of Sciences |
Lin, Yilun | Shanghai Artificial Intelligence Laboratory |
Chen, Yan | Nanjing Medical University |
Keywords: Human Factors, Human-centered Learning
Abstract: Rational allocation of medical resources can help more people access medical services. To allocate medical resources efficiently and fairly, we need to optimize the layout of healthcare facilities, which helps improve residents' health levels. In this paper, we use the theory of Artificial Scenarios, Computational Experiments, and Parallel Execution (ACP) and Artificial Intelligence (AI) techniques to optimize the healthcare facilities' accessibility in a parallel healthcare system. To illustrate the feasibility of our proposed method, a case study in Gulou, Nanjing, is taken. The results reveal that the existing healthcare facilities are scattered with limited continuity, which fails to meet the demands adequately. With the layout optimization based on the Particle Swarm Optimization (PSO) algorithm, approximately 95% of residential areas can be covered by adding 15 additional facilities. It not only reduces the standard deviation of accessibility from 0.06 to 0.04 but also significantly improves the fairness and efficiency of healthcare facilities. From an accessibility perspective, optimizing the layout of healthcare facilities ensures fair access to public services and aligns the supply with the demand. The research also provides a reference for the utilization and optimization of public service facilities.
|
|
17:00-18:00, Paper Tu-S5T4.2 | Add to My Program |
Boosting Intelligent Diagnostic Process in Internet Hospital: A Conversational-AI-Enhanced Framework |
|
Wang, Kexin | Nanjing Medical University |
Yao, Shengyue | Shanghai AI Laboratory |
Wu, Ziyi | Nanjing Medical University |
Wang, Fei-Yue | Institute of Automation, Chinese Academy of Sciences |
Lin, Yilun | Shanghai Artificial Intelligence Laboratory |
Chen, Yan | Nanjing Medical University |
Keywords: Human Factors, Human-centered Learning
Abstract: The development of Internet Hospital attracts growing attention worldwide to improve medical service quality and efficiency. However, the existing Internet Hospital failed to fully allocate the patients' consultation demands and improve the level of satisfaction, which is mainly caused by an overlong online waiting time during the diagnostic process. The emergence of Large-Language-Model (LLM) technology provides an opportunity to improve the existing synchronous and sequential online diagnostic process towards an asynchronous and concurrent process. With the consideration of applying LLM technology in developing the Internet Hospital, a conversational-AI-enhanced intelligent diagnostic process framework is proposed in this paper. By hierarchically decomposing the online diagnostic service into three layers, namely the AI doctor, the rotating doctor, and the expert doctor, the diagnostic process is capable of providing instant treatments with a lower misdiagnosis rate, meanwhile relieving the workload of human doctors. In addition, a case study of the Internet Hospital operated by Jiangsu Provincial Hospital is conducted, which reveals the importance of boosting the diagnostic progress by the proposed framework. Further, the proposed framework is examined by a numerical experiment based on statistical data. The experiment results indicate that both the patient waiting time and the misdiagnosis rate can be significantly reduced, which suggests a great potential for applying the proposed framework in practice.
|
|
17:00-18:00, Paper Tu-S5T4.3 | Add to My Program |
A Novel Industrial Robot Calibration Method Based on Multi-Planar Constraints |
|
Chen, Tinghui | Southwest University |
Shuai, Li | University of Oulu, Technology Research Center of Finland (VTT) |
Keywords: Computational Intelligence
Abstract: Absolute positioning accuracy is the critical factor affecting the performance of an industrial robot. For its improvement, researchers commonly adopt calibration techniques to optimize robot kinematic parameters. However, an industrial robot’s working space is mostly restricted in real working environments, so the collected samples fail to cover the actual working space, which results in an overall migration of the data. To address this vital issue, this work proposes a novel industrial robot calibrator that integrates a measurement configurations selection (MCS) method and an alternation-direction-method-of-multipliers with multiple planes constraints (AMPC) algorithm into its working process, whose ideas are three-fold: a) selecting a group of optimal measurement configurations based on the observability index to suppress the measurement noises, b) developing an AMPC algorithm that evidently enhances the calibration accuracy and suppresses the long-tail convergence, c) proposing an industrial robot calibration algorithm that incorporates MCS and AMPC to optimize an industrial robot’s kinematic parameters efficiently. For validating its performance, a publicly available dataset (HRS-P) is established on an HRS-JR680 industrial robot. Extensive experimental results demonstrate that the proposed calibrator outperforms several state-of-the-art models in calibration accuracy.
|
|
17:00-18:00, Paper Tu-S5T4.4 | Add to My Program |
EgoFormer: Transformer-Based Motion Context Learning for Ego-Pose Estimation |
|
Li, Tianyi | Xi'an Jiaotong University |
Zhang, Chi | Xi'an Jiaotong University |
Su, Wei | Xi'an Jiaotong University |
Liu, Yuehu | Xi'an Jiaotong University |
Keywords: Wearable Computing, Human-Machine Interaction, Human Perception in Multimedia
Abstract: Ego-pose estimation, i.e. predicting 3D pose of the camera wearer, has an essential value in AR and VR applications. First-person video has an ambiguity in that similar video frames may correspond to totally different body poses because of the invisible body part. However, exploiting the context of a video and establishing a long-term temporal relationship can alleviate this ambiguity. To this end, this paper proposes EgoFormer, a Transformer-based model, to learn the motion context from egocentric videos. Moreover, dynamic features commonly used to characterize first-person video do not provide sufficient temporal information to remove the ambiguity inherent in such videos. Therefore, we present a method that can effectively extract temporal features in first-person videos. Results on real-scene and synthetic datasets show that our method could estimate a sequence of human poses with high accuracy and coherence.
|
|
17:00-18:00, Paper Tu-S5T4.5 | Add to My Program |
A Neighbor-Induced Graph Convolution Network for Undirected Weighted Network Representation |
|
Chen, Jiufang | Southwest University |
Yuan, Ye | Southwest University |
Keywords: Networking and Decision-Making
Abstract: Precise representation learning for an undirected weighted network (UWN) is the foundation of understanding its connection patterns and functional mechanisms. A graph convolution network (GCN) is frequently utilized to tackle this issue. However, it only considers the neighbor information in the forward propagation process, which unfortunately impairs its representation learning ability. Motivated by this discovery, this work proposes a novel Neighbor-induced Graph Convolution Network (N-GCN) that adopts two-fold ideas: a) employing the weighted forward propagation process, which aggregates the neighbor information by considering the interaction strength of each node pair; b) incorporating a neighbor-regularizer into the loss function, which induces the neighbor information to illustrate a UWN’s intrinsic symmetry, thereby boosting the representation learning ability. Experimental results on four UWNs validate that the proposed N-GCN outperforms the state-of-the-art models in achieving a highly accurate representation of UWNs emerging from real applications.
|
|
17:00-18:00, Paper Tu-S5T4.6 | Add to My Program |
AI4S Based on DeSci: Reference Model and Research Issues |
|
Ding, Wenwen | Macau University of Science and Technology |
Li, Juanjuan | Institute of Automation, Chinese Academy of Sciences |
Qin, Rui | Institute of Automation, Chinese Academy of Sciences |
Guan, Sangtian | Macau University of Science and Technology |
Wang, Fei-Yue | Institute of Automation, Chinese Academy of Sciences |
Keywords: Human Enhancements, Human Factors
Abstract: The rise of Artificial Intelligence for Science (AI4S) has highlighted the importance and urgency of ensuring openness, fairness, impartiality, diversity, and sustainability in scientific systems. Existing scientific systems, referred to as Centralized Science (CeSci), are built on centralized organizational structures and top-down institutional frameworks, which are lagging behind the development and practical requirements of AI4S. To address these limitations, AI4S needs to embrace a new scientific organizational and operational paradigm, namely Decentralized Science (DeSci). It can provide strong support to AI4S by effectively addressing issues such as information silos, biases, unfair distribution, and monopolies, and by promoting multidisciplinary, interdisciplinary, and transdisciplinary cooperation in science. Based on these considerations, this paper presents the framework of AI4S based on DeSci and explores its potential application scenarios and research issues. The research can provide effective guidance for the development of scientific systems. Index Terms: Intelligent Science, Decentralized Science
|
|
17:00-18:00, Paper Tu-S5T4.7 | Add to My Program |
Multi-Constrained Symmetric Nonnegative Latent Factor Analysis for Accurately Representing Undirected Weighted Networks |
|
Zhong, Yurong | Dongguan University of Technology |
Xie, Zhe | School of Computer Science and Technology, Dongguan University O |
Li, Weiling | Dongguan University of Technology |
Luo, Xin | Chinese Academy of Sciences |
Keywords: Representation Learning, Knowledge Acquisition, Big Data Computing
Abstract: An Undirected Weighted Network (UWN) is frequently encountered in a big-data-related application concerning the complex interactions among numerous nodes. A Symmetric High-Dimensional and Incomplete (SHDI) matrix can smoothly illustrate such a UWN, which contains rich knowledge like node interaction behaviors and local complexes. To extract desired knowledge from an SHDI matrix, an analysis model should carefully consider its topology for describing a UWN’s intrinsic symmetry precisely. Representation learning to a UWN borrows the success of a pyramid of symmetry-aware models like a Symmetric Nonnegative Matrix Factorization (SNMF) model whose objective function utilizes a sole Latent Factor (LF) matrix for representing SHDI’s symmetry precisely. However, they suffer from the following drawbacks: 1) their computational complexity is high; and 2) their modeling strategy narrows their representation features, making them suffer from low learning ability. Aiming at addressing the above critical issues, this paper proposes a Multi-constrained Symmetric Nonnegative Latent-factor-analysis (MSNL) model with two-fold ideas: 1) introducing multi-constraints composed of multiple LF matrices, i.e., inequality and equality ones into a data-density-oriented objective function for precisely representing the intrinsic symmetry of an SHDI matrix with broadened feature space; and 2) implementing an alternating direction method of multipliers (ADMM)-incorporated learning scheme for efficiently solving such a multi-constrained model. Empirical studies on three SHDI matrices from a real bioinformatics or industrial application demonstrate that the proposed MSNL model achieves higher representation accuracy than state-of-the-art models do, as well as promising computational efficiency.
|
|
17:00-18:00, Paper Tu-S5T4.8 | Add to My Program |
Parallel Reasoning Based on ACP Method for Power Grid Dispatching |
|
Xu, Yancai | Institute of Automation, Chinese Academy of Sciences |
Yang, Linyao | Institute of Automation, Chinese Academy of Sciences |
Zhu, Fenghua | Institute of Automation, Chinese Academy of Sciences |
Wang, Xiao | Institute of Automation, Chinese Academy of Sciences |
Wang, Fei-Yue | Institute of Automation, Chinese Academy of Sciences |
Keywords: Cognitive Computing, Systems Safety and Security, Supervisory Control
Abstract: Multi-source heterogeneous knowledge collaboration is the technical foundation for establishing a complete knowledge base. A parallel reasoning framework based on the ACP method is proposed. First, it provides a virtual experimental platform for generating the artificial data needed for missing knowledge extraction by constructing an artificial system. Second, it generates artificial big data and organizes it into a knowledge graph to achieve structured representation and storage of system control knowledge by carrying out calculation experiments related to missing scene knowledge. Finally, it achieves unbiased application and update of knowledge through parallel execution, completing the optimization and control of the actual system. Experiments are carried out and the effectiveness of parallel reasoning is verified.
|
|
17:00-18:00, Paper Tu-S5T4.9 | Add to My Program |
An Inverse Chance-Constrained Approach to the Calibration of Robust Models |
|
Crespo, Luis | NASA |
Kenny, Sean | NASA |
Keywords: System Modeling and Control, Control of Uncertain Systems
Abstract: This paper proposes a strategy to calibrate computational models according to uncertain input-output data. To this end, uncertainty in the data is first characterized using adversarial data sets. Samples drawn from such sets are then mapped from the input-output space to the parameter space using an inverse mapping. This mapping minimizes the collective output spread of an ensemble of point predictions while satisfying a set of individual data-matching requirements. The distribution of the resulting parameter points, which often exhibits strong parameter dependencies, is then modeled using Sliced Normal distributions. The chance-constrained formulation used to learn this distribution enables the analyst to trade off a greater likelihood for most of the data against a lower likelihood for some of the data, thereby relaxing the conservatism of the calibrated model. This formulation neglects the worst-performing quantiles of each adversarial distribution and eliminates the detrimental effects that outliers have on the resulting model. This calibration approach not only has a considerably lower computational cost than the standard forward approach but also allows for the identification of suitable distribution classes, which in turn yield better calibrated models.
|
|
Tu-S5T5 Virtual Session, Room T5 |
Add to My Program |
General Cybernetics III |
|
|
|
17:00-18:00, Paper Tu-S5T5.1 | Add to My Program |
Facial Expression Recognition in the Wild Using FAM, RRDB and Vision Transformer Based Convolutional Neural Networks |
|
Liu, Kuan-Hsien | National Taichung University of Science and Technology |
Shih, Xiang-Kun | National Taichung University of Science and Technology |
Liu, Tsung-Jung | National Chung Hsing University |
Liu, Wen-Ren | National Taichung University of Science and Technology |
Keywords: Deep Learning, Application of Artificial Intelligence, Neural Networks and their Applications
Abstract: Facial expression recognition in the wild is quite a challenging task because facial images captured in natural settings are often affected by various factors such as irregular occlusions, inconsistent face angles and varying light levels. To overcome these factors, we present a new triple-branch deep neural network model containing a facial attention module, a residual-in-residual dense block and a vision transformer to deal with the facial expression recognition problem. The facial attention module can help the backbone network extract more useful features. The residual-in-residual dense block and the vision transformer can further extract more detailed features. In addition to the image problems caused by the above factors, the number of images per expression in the dataset is also highly imbalanced. We propose a new loss function to tackle this uneven ratio problem. Experimental results on two benchmark in-the-wild datasets show that our model is indeed helpful for images in the wild. We also created a new expressions dataset called Fairfaceplus, which is built by adding expression categories to the original labels of the FairFace dataset. The code of our proposed method will be made available on GitHub https://github.com/dreampledge/fairfaceplus.
|
|
17:00-18:00, Paper Tu-S5T5.2 | Add to My Program |
Scoring Protein-Ligand Complex Structures by HybridNet |
|
Wang, Dan | Hong Kong Metropolitan University |
Wang, Ran | Shenzhen University |
Keywords: Biometric Systems and Bioinformatics, Computational Life Science, Artificial Life
Abstract: Scoring the binding for a protein-ligand complex structure is a widely-discussed and open problem in structure-based drug design. As deep learning and artificial intelligence continue to rapidly advance, developing deep-learning scoring models is currently an active area of research. Intermolecular-contact features are fast-to-generate and can be efficiently handled by deep-learning models, while they are oversimplified to characterize the binding between a ligand and its target protein. In this work, we have developed the HybridNet model that profiles multi-range intermolecular contacts and deals with the heterogeneous channels of such features using a hybrid deep-learning architecture. The intermolecular-contact profiles keep the simplicity of original features but describe the interactions more deeply. Besides, compared to individual learning architectures and classical scoring models, HybridNet performed more favorably in protein-ligand scoring tasks. The proposed method of featurization and scoring will prospectively benefit related tasks like molecular docking and virtual screening in the long term.
|
|
17:00-18:00, Paper Tu-S5T5.3 | Add to My Program |
Bidformer: A Transformer-Based Model Via Bidirectional Sparse Self-Attention Mechanism for Long Sequence Time-Series Forecasting |
|
Li, Wei | Harbin Engineering University |
Meng, Xiangxu | Harbin Engineering University |
Chen, Chuhao | Harbin Engineering University |
Mi, Hailin | Harbin Engineering University |
Wang, Huiqiang | Harbin Engineering University |
Keywords: Neural Networks and their Applications, Machine Learning, Deep Learning
Abstract: Long Sequence Time-Series Forecasting (LSTF) is an important and challenging research problem with broad applications. Recent studies have shown that Transformer-based models can be effective in solving correlation problems in time-series data, but they also introduce issues of quadratic time and memory complexity, which make them unsuitable for LSTF problems. As a response, we investigate the impact of the long-tail distribution of attention scores on prediction accuracy and propose a Bis-Attention mechanism based on the mean measurement to bi-directionally sparsify the self-attention matrix as a way to enhance the differentiation of attention scores and to reduce the complexity of the Transformer-based models from O(L^2) to O((log L)^2). Moreover, we reduce memory consumption and optimize the model architecture through the use of a sharedQK method. The effectiveness of the proposed method is verified by theoretical analysis and visualisation. Extensive experiments on three benchmarks demonstrate that our method achieves better performance compared to other state-of-the-art methods, including an average reduction of 19.2% in MSE and 12% in MAE compared to Informer.
|
|
17:00-18:00, Paper Tu-S5T5.4 | Add to My Program |
High-Precision Indoor Fingerprint Localization Based on Graph Neural Network |
|
Meng, Xiangxu | Harbin Engineering University |
Li, Wei | Harbin Engineering University |
Zhao, Zheng | Harbin Engineering University |
Cai, Yinan | Harbin Engineering University |
Liu, Guoqing | Harbin Engineering University |
Keywords: AI and Applications, Neural Networks and their Applications, Deep Learning
Abstract: Thanks to the rapid development of machine learning and deep learning, the fingerprint localization community has achieved high localization performance through advanced models. However, current approaches mainly focus on machine learning or pure convolution algorithms, for example k-Nearest Neighbor (KNN) and Convolutional Neural Network (CNN). Despite their success, their localization ability is mainly derived from the development of model learning ability without in-depth study of the physical meaning of Channel State Information (CSI), such as the varying degrees of correlation and interference between subcarriers of antennas located at different locations in a wireless communication system. In particular, we develop a novel graph neural network (GNN)-based method to embed subcarriers from different antennas at different positions of the CSI as nodes and connect them to K-nearest neighbours (KNNs) to obtain a cohesive graph structure. Furthermore, we introduce a dilated KNN to perform graph-level learning using graph convolution, effectively modeling the correlation and interference between subcarriers. Finally, to prevent the degradation of positioning performance caused by reduced node feature diversity, we introduce a feedforward neural network (FFN) module for node feature transformation. Experimental results on real datasets indicate that our method achieves an average localization error of 0.17 m, which is almost 39.3% better than the existing optimal baseline.
|
|
17:00-18:00, Paper Tu-S5T5.5 | Add to My Program |
Top-R Influential Communities Identification Over Attributed Networks |
|
Li, Wei | Harbin Engineering University |
Zhao, Zheng | Harbin Engineering University |
Wang, Xiao | Harbin Engineering University |
Meng, Xiangxu | Harbin Engineering University |
Lv, Hongwu | Harbin Engineering University |
Keywords: Big Data Computing, Complex Network, Heuristic Algorithms
Abstract: In recent years, the problem of discovering influential communities has received much attention. However, in practical applications, vertices are often associated with attributes that are significant for understanding communities. This paper explores the problem of computing top-r influential communities in attributed networks. We present a new community model called AAC that aims to promote cohesiveness in both structure and attributes. In order to measure the impact of a community C, we have developed an influential score function called iScore(C) by balancing attribute and structure cohesiveness. We propose two baseline approaches with effective pruning techniques and an index-based approach to efficiently report communities in an attributed network. The experimental results on real attributed networks indicate the effectiveness of our model and the efficiency of our proposed online and index-based algorithms.
|
|
17:00-18:00, Paper Tu-S5T5.6 | Add to My Program |
PatchNF: A Patching-Based Method for Network Latency Forecasting in URLLC Scenarios |
|
Li, Wei | Harbin Engineering University |
Meng, Xiangxu | Harbin Engineering University |
Chen, Chuhao | Harbin Engineering University |
Mi, Hailin | Harbin Engineering University |
Zhao, Zheng | Harbin Engineering University |
Keywords: Application of Artificial Intelligence, AI and Applications, AIoT
Abstract: Ultra-Reliable Low Latency Communication (URLLC) has become crucial in various communication services. It encompasses a broad range of fields, including Internet of Things (IoT) and Internet of Vehicles (IoV), where precise network latency prediction analysis is required to ensure reliability. However, current network latency forecasting models are insufficient in capturing the temporal periodicity and statistical features of network latency. Therefore, we propose a Patching-based Method for Network Latency Forecasting (PatchNF) in the URLLC scenarios. Specifically, to preserve local semantic information, we segment the latency sequence into subsequence-level patches that are used as input tokens for the encoder. Next, we employ dilated convolutions to capture different characteristics of network delay periodicity by utilizing varying receptive fields. To make full use of features with different network delay semantics and improve prediction performance, we introduce a hierarchical approach that exploits network delay semantics at different levels before the execution of the fully-connected mapping. Experimental results on real datasets collected in a URLLC scenario indicate that the proposed method outperforms TS2Vec by 55.2% and Informer by 181.9% in terms of mean squared error (MSE). These results confirm the significant advantage of PatchNF in network latency forecasting.
|
|
17:00-18:00, Paper Tu-S5T5.7 | Add to My Program |
Enhancing Human Self-Regulation with Controllable Robot Swarms Acting As Extended Bodies |
|
Rockbach, Jonas David | Fraunhofer Institute for Communication, Information Processing A |
Keywords: Swarm Intelligence, Cyborgs, Artificial Life
Abstract: Human-swarm interaction investigates the integration of human capabilities with the benefits of robot swarms. In this context, we are interested in how the natural self-regulatory capabilities of humans can be enhanced by engineered robot swarms over different applications. We formulated a grid world survival game as an experimental framework that has implications for search and rescue, in which a human must be stabilized against environmental disturbances such as falling objects. Building upon the abstract survival game, controllable swarm behaviours based on virtual pheromones that protect the human from environmental disturbances are proposed. Here, human and swarm are treated as part of the same hybrid superorganism in which the swarm acts as an extended and controllable body of the human. We evaluated basic protective behaviours in the context of the survival game and used the results to synthesize a hybrid superorganism design.
|
|
17:00-18:00, Paper Tu-S5T5.8 | Add to My Program |
Dense Depth Estimation for Monocular Endoscope Robot with an Adaptive Baseline |
|
Song, Rihui | Sun Yat-Sen University |
Tan, Zhidong | Sun Yat-Sen University |
Liang, Hongli | Sun Yat-Sen University |
Ling, Yehua | Sun Yat-Sen University |
Chen, Gang | Sun Yat-Sen University |
Gong, Jin | The Third Affiliated Hospital of Sun Yat-Sen University |
Huang, Kai | School of Data and Computer Science |
Keywords: Machine Vision, Image Processing and Pattern Recognition
Abstract: Depth information is useful to surgeons and surgical assistance systems. However, it is a challenging task to estimate the depth of various surgical scenes based on a monocular endoscope. We propose a depth estimation approach for a monocular endoscope with a stereo matching algorithm. The monocular endoscope is moved horizontally by a robotic endoscope holder to simulate a stereo vision system. The main challenge is how to obtain a proper baseline for better depth information generation as the depth range of a surgical scene is unknown beforehand. We design a baseline evaluation and selection algorithm to search for suitable baselines for surgical scenes with different depth ranges. Experimental results show that our approach improves the average accuracy of different depth scenarios by 10.8% when the error range is 2 mm.
|
|
17:00-18:00, Paper Tu-S5T5.9 | Add to My Program |
A CNN-LSTM Based Model to Predict Trajectory of Human-Driven Vehicle |
|
Alsanwy, Shehab | Deakin University |
Asadi, Houshyar | Deakin University |
Qazani, Mohammad Reza Chalak | Deakin University |
Mohamed, Shady | Senior Research Fellow, Deakin University |
Nahavandi, Saeid | Swinburne University of Technology |
Keywords: Human-Machine Cooperation and Systems, Human Performance Modeling, Networking and Decision-Making
Abstract: Vehicle trajectory prediction is essential in ensuring the safe and efficient operation of advanced driver assistance systems (ADAS) and autonomous vehicles (AVs), as it enables highly efficient collision avoidance, path planning, and traffic control. However, existing models for vehicle trajectory prediction predominantly focus on limited driving scenarios, resulting in limited applicability. To address this limitation, we present a novel vehicle trajectory prediction approach that employs a Convolutional Long Short-Term Memory (CNN-LSTM) model, incorporating simulated environments and vehicle dynamic time series data, including longitudinal, vertical, and lateral position and acceleration. Our approach is distinguished by its ability to handle diverse urban driving scenarios, such as highways, roundabouts, intersections, and turns, which enhances its applicability and generalizability. We experimented and collected vehicle data from 17 drivers using a stationary driving simulator and the Euro Truck Simulator software. For the model implementation and validation, we utilized Python 3.9 and Google Colab, as well as the Scikit-learn library for deep learning algorithms. The proposed CNN-LSTM model leverages a convolutional layer to learn local patterns and an LSTM layer to capture long-term temporal dependencies, improving performance in predicting vehicle trajectories. The experimental results demonstrate that the CNN-LSTM model provides more accurate predictions for longitudinal and lateral positions compared to traditional vehicle trajectory prediction methods that employ LSTM and Recurrent Neural Network (RNN). This research contributes to developing robust and reliable vehicle trajectory prediction systems vital for ADAS and AVs' safe and efficient operation. The proposed approach broadens the applicability of trajectory prediction models, enabling better-informed decision-making in various driving conditions and ultimately improving road safety and efficiency in the rapidly evolving field of autonomous transportation.
|
|
Tu-S5T6 Virtual Session, Room T6 |
Add to My Program |
Human-Machine Interaction VII |
|
|
Chair: Fan, Hongfei | Tongji University |
|
17:00-18:00, Paper Tu-S5T6.1 | Add to My Program |
Building Temporary Isolated Workspace in Real-Time Collaborative Programming Environment |
|
Jiang, Jinfeng | Tongji University |
Xie, Yuxiang | Tongji University |
Fang, Bicheng | Tongji University |
Wang, Mingjie | Tongji University |
Fan, Hongfei | Tongji University |
Keywords: Cooperative Work in Design, Human-Computer Interaction, Multi-User Interaction
Abstract: Real-time collaborative programming supports a team of programmers to concurrently view and edit the same set of source code at the same time, which is beneficial in meeting particular collaboration needs. However, during the collaboration process, programmers are not able to compile and debug the source code with syntactic errors as it is being continuously edited by other collaborators. To address this challenge, we propose a novel approach named Reversion of Error-free Code with Workspace Isolation (RECON) and contribute supporting techniques with prototype implementation. In this approach, the system continuously monitors the files being collaboratively edited, detects source code without syntactic error, and maintains additional error-free source code copies. Whenever a programmer attempts to compile and debug the code, the system creates a temporary isolated workspace and replaces the source code files with the latest error-free copies. The proposed approach and solution have been implemented in a prototype system named CoIDEA, which has indicated the feasibility of the scheme and techniques.
|
|
17:00-18:00, Paper Tu-S5T6.2 | Add to My Program |
Mental Models of AI Performance and Bias of Nontechnical Users |
|
Walsh, Sarah | Georgia Institute of Technology |
Feigh, Karen | Georgia Institute of Technology |
Keywords: Human-Computer Interaction, User Interface Design, Team Performance and Training Systems
Abstract: Understanding human mental models of AI is critical for designing human-centered AI. Examining mental models provides depth in understanding of how users want to interact with AI, when users may need additional explanation of the system, and what knowledge is shared between the user and the AI. This work investigates users' mental models of an AI-decision aid. An experiment was designed to mimic a realistic emergency preparedness scenario in which a resource must be allocated into 1 of 100 possible locations based on a variety of dynamic visual heat maps. The users are assisted in resource placement by an AI-decision aid. The experiment was divided into two experimental blocks. The first was used to determine mental model accuracy. The second was used to examine preferences in meeting the individual and human-AI team goals. The users are asked to determine whether the AI is satisfying a set of constraints to best serve the affected population. Users are also asked to provide an overall score for the AI performance in resource placement. It was found that users tended to exhibit a binary bias in which they categorized performance into discrete bins rather than on a continuous scale; however, the users were able to distinguish between individual and team goals in a human-AI team decision task and did not exhibit a bias towards the human goals.
|
|
17:00-18:00, Paper Tu-S5T6.3 | Add to My Program |
Identify and Characterize Fall-Risk in Older Adults: A Data-Driven Approach |
|
Fu, Enqi | Tsinghua University |
Tang, Huimin | Tsinghua University |
Xie, Xiaolei | Tsinghua University |
Kang, Lin | Peking Union Medical College Hospital |
Keywords: Human-centered Learning, Human Factors
Abstract: In this study we introduce a precise fall-risk screening method for large-scale older populations. Based on a dataset including 7084 older adults across 30 provinces in China, we developed a data-driven method to identify the fall-risk group and determine the major characteristics in older adults. First, the entire sample was divided into two groups by gender based on analysis of Cluster Feature Tree. Extreme Gradient Boosting models confirmed that patient clustering can improve the performance of fall-risk prediction, and pinpointed the common and different important features for different patient groups. The findings provide evidence for future behavioral trait indicators for geriatric rehabilitation and have potential to enhance geriatric health management in primary care.
|
|
17:00-18:00, Paper Tu-S5T6.4 | Add to My Program |
Effects of Different Levels of Self-Representation on Spatial Awareness, Self-Presence and Spatial Presence During Virtual Locomotion |
|
Zhao, Jingbo | China Agricultural University |
Wang, Zhetao | China Agricultural University |
Wang, Yaojun | College of Information and Electrical Engineering China Agricul |
Keywords: Virtual/Augmented/Mixed Reality, Virtual and Augmented Reality Systems
Abstract: Recently, there has been growing interest in investigating the effects of self-representation on user experience and perception in virtual environments. However, few studies investigated the effects of levels of body representation (full-body, lower-body and viewpoint) on locomotion experience in terms of spatial awareness, self-presence and spatial presence during virtual locomotion. Understanding such effects is essential for building new virtual locomotion systems with better locomotion experience. In the present study, we first built a walking-in-place (WIP) virtual locomotion system that can represent users using avatars at three levels (full-body, lower-body and viewpoint) and is capable of rendering walking animations during in-place walking of a user. We then conducted a virtual locomotion experiment using three levels of representation to investigate the effects of body representation on spatial awareness, self-presence and spatial presence during virtual locomotion. Experimental results showed that the full-body representation provided better virtual locomotion experience in these three factors compared to that of the lower-body representation and the viewpoint representation. The lower-body representation also provided better experience than the viewpoint representation. These results suggest that self-representation of users in virtual environments using a full-body avatar is critical for providing better locomotion experience. Using full-body avatars for self-representation of users should be considered when building new virtual locomotion systems and applications.
|
|
17:00-18:00, Paper Tu-S5T6.5 | Add to My Program |
Multimodal Speech Emotion Recognition Using Modality-Specific Self-Supervised Frameworks |
|
Patamia, Rutherford Agbeshi | UESTC |
Santos, Paulo E. | Flinders University Tonsley |
Acheampong, Kingsley Nketia | University of Electronic Science and Technology of China |
Ekong, Favour | University of Electronic Science and Technology of China |
Sarpong, Kwabena | University of Electronic Science and Technology of China |
She, Kun | UESTC |
Keywords: Affective Computing, Cognitive Computing, Intelligence Interaction
Abstract: Emotion recognition is a topic of significant interest in assistive robotics due to the need to equip robots with the ability to comprehend human behavior, facilitating their effective interaction in our society. Consequently, efficient and dependable emotion recognition systems supporting optimal human-machine communication are required. Multi-modality (including speech, audio, text, images, and videos) is typically exploited in emotion recognition tasks. Much relevant research is based on merging multiple data modalities and training deep learning models utilizing low-level data representations. However, most existing emotion databases are not large (or complex) enough to allow machine learning approaches to learn detailed representations. This paper explores modality-specific pre-trained transformer frameworks for self-supervised learning of speech and text representations for data-efficient emotion recognition while achieving state-of-the-art performance in recognizing emotions. This model applies feature-level fusion using nonverbal cue data points from motion capture to provide multimodal speech emotion recognition. The model was trained using the publicly available IEMOCAP dataset, achieving an overall accuracy of 77.58% for four emotions, outperforming state-of-the-art approaches.
|
|
17:00-18:00, Paper Tu-S5T6.6 | Add to My Program |
Haptic-Guided Assisted Telemanipulation Approach for Grasping Desired Objects from Heaps |
|
Adjigble, Maxime | University of Birmingham |
Stolkin, Rustam | Extreme Robotics Lab, NCNR, University of Birmingham |
Marturi, Naresh | University of Birmingham |
Keywords: Shared Control, Haptic Systems, Human-Collaborative Robotics
Abstract: This paper presents an assisted telemanipulation framework for reaching and grasping desired objects from clutter. Specifically, the developed system allows an operator to select an object from a cluttered heap and effortlessly grasp it, with the system assisting in selecting the best grasp and guiding the operator to reach it. To this end, we propose an object pose estimation scheme, a dynamic grasp re-ranking strategy, and a reach-to-grasp hybrid force/position trajectory guidance controller. We integrate them, along with our previous SpectGRASP grasp planner, into a classical bilateral teleoperation system that allows the operator to control the robot using a haptic device while receiving force feedback. For a user-selected object, our system first identifies the object in the heap and estimates its full six degrees of freedom (DoF) pose. Then, SpectGRASP generates a set of ordered, collision-free grasps for this object. Based on the current location of the robot gripper, the proposed grasp re-ranking strategy dynamically updates the best grasp. In assisted mode, the hybrid controller generates a zero force-torque path along the reach-to-grasp trajectory while automatically controlling the orientation of the robot. We conducted real-world experiments using a haptic device and a 7-DoF cobot with a 2-finger gripper to validate individual components of our telemanipulation system and its overall functionality. The results demonstrate the effectiveness of our system in assisting humans to clear cluttered scenes.
|
|
17:00-18:00, Paper Tu-S5T6.7 | Add to My Program |
A Probabilistic-Based Approach to Phase Variable Estimation for Lower-Limb Prostheses Control |
|
Jeevagan, Lakshmi Priya | San Francisco State University |
Selly, George | San Francisco State University |
Sebbo, Anthony | San Francisco State University |
Aceves, Rodrigo | San Francisco State University |
Quintero, David | San Francisco State University |
Keywords: Assistive Technology, Human Performance Modeling, Biometrics and Applications
Abstract: A mechanical phase variable represents human gait progression that can parameterize the joint kinematic trajectories for lower-limb prostheses control. Current phase variables use unactuated states, such as the human thigh angle and its derivative or integral, to compute the thigh phase angle to estimate gait phase during stride. With time-dependent states, the phase variable can potentially produce abnormal behavior for the prosthesis controller when the user instantaneously changes speed or inclination. We propose a probabilistic-based approach that uses a maximum likelihood estimation technique from only the thigh angle to derive a holonomic phase variable for continuous gait phase estimation. Since it does not depend on a time-wise parameter (either derivative or integral), the phase variable can respond to instantaneous changes (e.g., unwanted disturbances) during locomotion. We evaluated the proposed phase variable algorithm across various walking speeds using able-bodied subject datasets, and introduced start and stop transitions to evaluate robustness for non-rhythmic behaviors. The analysis demonstrates a probabilistic adaptation for correcting gait phase abnormalities that can drive locomotion progression for lower-limb prostheses control.
|
|
17:00-18:00, Paper Tu-S5T6.8 | Add to My Program |
How Important Is the Temporal Context to Anticipate Oncoming Vehicles at Night? |
|
Ewecker, Lukas | Dr. Ing. H.c. F. Porsche AG |
Winkler, Timo | Technische Universität Würzburg-Schweinfurt |
Väth, Philipp | Technical University of Applied Sciences Würzburg-Schweinfurt |
Schwager, Robin | Dr. Ing. H.c. F. Porsche AG |
Brühl, Tim | Dr. Ing. H.c. F. Porsche AG |
Schleif, Frank-Michael | Technical University Wuerzburg-Schweinfurt |
Keywords: Human Perception in Multimedia, Human Factors, Human Performance Modeling
Abstract: Driving at night is a challenging task for humans due to low-light conditions and often low concentration caused by drowsiness. Here, advanced driver assistance systems can come to the rescue and support the driver to increase comfort as well as safety. For that, the vehicle needs a sound and complete understanding of its environment, especially other road participants. The earlier this information is available, the more proactive decisions can be made by the system. To detect oncoming vehicles as soon as possible, even before they are directly visible, the light reflections caused by their headlamps can be leveraged. Previous work showed that algorithms can perform this task on rural roads. However, for more complex urban scenarios no approach exists so far. Yet, before starting the dataset and algorithm development for a computer vision system able to solve the task, a sound understanding of the problem is required. In a recent study we already showed that humans can anticipate oncoming vehicles based on their light reflections in urban scenarios as well. Still, it remains unclear whether spatial information alone is sufficient to perform the task, or if the temporal context is required. This understanding is essential when designing labeling pipelines for a dataset, as well as for algorithm development. Therefore, in this paper we perform a large experiment to evaluate the importance of temporal context for the human ability to anticipate oncoming vehicles at night in urban scenarios. We present participants with scenes containing different numbers of frames and measure their anticipation performance. We show that providing temporal context significantly increases human detection accuracy as well as decision confidence. With this we provide additional insights into the task of anticipatory vehicle detection at night, which can be taken into consideration when designing a dataset as well as algorithms.
|
|
17:00-18:00, Paper Tu-S5T6.9 | Add to My Program |
Photovoltaic Power Forecast Based on Gated Recurrent Unit and Wavelet Transform (I) |
|
Chang, Yu Ming | Industrial Technology Research Institute (ITRI) |
Chen, Chao-Rong | National Taipei University of Technology |
Brice, Ouedraogo | National Taipei University of Technology |
Chou, Chih-Ju | National Taipei University of Technology |
Lee, Ching-Yin | Tungnan University |
Keywords: Deep Learning, Machine Learning, Neural Networks and their Applications
Abstract: Solar power generation varies with time and several meteorological parameters, making its prediction difficult. This paper proposes a hybrid model, called Wavelet-GRU, that combines gated recurrent units (GRU) with the wavelet transform for short-term solar power generation prediction. The proposed method is optimized with the hyperband algorithm and implemented with TensorRT for effective inference on GPU. Forecasts from ten minutes to two hours ahead, using actual data collected by a power plant in central Taiwan, demonstrate that the proposed hybrid model can effectively predict upcoming solar power generation over short-term horizons. The proposed approach achieves a 5.28% error rate for 60-minute-ahead forecasting and provides an important reference for power dispatching.
|
|
Tu-S5T7 Virtual Session, Room T7 |
Add to My Program |
Real-World Applications of Intelligent Systems VI |
|
|
|
17:00-18:00, Paper Tu-S5T7.1 | Add to My Program |
Construction and Analysis of Deep Functional Corticomuscular Coupling Effect (I) |
|
Liu, Jinbiao | Research Center for Human-Machine Augmented Intelligence |
Li, Xinhang | Zhejiang Lab |
Wang, Lijie | Zhejiang Lab |
Luo, Manli | Zhejiang Lab |
Feng, Linqing | Zhejiang Lab |
Tang, Tao | Zhejiang Lab |
Liu, Honghai | Shanghai Jiao Tong University |
Wei, Yina | Zhejiang Lab, Research Center for Augmented Intelligence |
Keywords: Mechatronics, Robotic Systems
Abstract: Based on electroencephalogram (EEG) and electromyogram (EMG) signals, the functional coupling between the cerebral cortex and muscles has been widely studied to evaluate motor function and reveal various motor control and pathological mechanisms in healthy individuals or patients with movement disorders. However, the effect of the different signal sources on the functional corticomuscular coupling remains unclear. In this study, four different signal source combinations were constructed from EEG and high-density surface EMG (HD-sEMG) signals as well as their reconstructed source signals to analyze the corticomuscular coupling during isometric index finger contraction tasks at different levels of maximum voluntary contraction. A nonparametric coupling model was used to study the effect of deep source signals on the coherence magnitude indicators of corticomuscular coupling related to hand movements. The results showed that the reconstruction of EEG and HD-sEMG signals significantly improved the coherence peak and coherence strength under low-level force. However, as the force level increased, only the reconstructed brain source signals demonstrated a more substantial effect. In addition, the coherence peak was positively correlated with the finger force level. This study demonstrated the importance and positive impact of the reconstructed EEG and EMG signals for estimating corticomuscular coherence.
|
|
17:00-18:00, Paper Tu-S5T7.2 | Add to My Program |
Ranking Center-Based NSGA-II (I) |
|
Khosrowshahli, Rasa | Ontario Tech University |
Rahnamayan, Shahryar | Brock University |
Ibrahim, Amin | Department of Electrical, Computer, and Software Engineering |
Makrehchi, Masoud | Ontario Tech University |
Keywords: Evolutionary Computation, Optimization and Self-Organization Approaches, Computational Intelligence
Abstract: Multi-objective optimization is a branch of computation that solves mathematical optimization problems in which multiple conflicting objective functions must be optimized simultaneously. Many population-based algorithms, such as NSGA-II, are used to find an optimal Pareto set of solutions in a short time. However, previous research showed that the performance of classic NSGA-II degrades when solving problems with many objectives (M ≥ 5). In this work, we aim to take advantage of a center-based sampling scheme to increase the exploration and exploitation capability of the NSGA-II algorithm. This sampling strategy has demonstrated promising results on several single-objective evolutionary algorithms such as GA, DE, and PSO. Recently, a novel clustering center-based strategy has been proposed, which motivated us to utilize a center-based sampling scheme in NSGA-II to solve multi-objective optimization problems. The outcomes confirm that the proposed clustering center-based NSGA-II is able to effectively solve CEC-2017 multi-objective benchmark problems with 2, 3, 5, 10, and 15 objectives.
|
|
17:00-18:00, Paper Tu-S5T7.3 | Add to My Program |
Self-Supervised Learning Using Noisy-Latent Augmentation (I) |
|
Khosrowshahli, Rasa | Ontario Tech University |
Rahnamayan, Shahryar | Brock University |
Asilian Bidgoli, Azam | Wilfrid Laurier University |
Makrehchi, Masoud | Ontario Tech University |
Keywords: Machine Learning, Deep Learning, Neural Networks and their Applications
Abstract: Generally speaking, labeled data is difficult and expensive to provide for applications in machine learning and data mining. One of the earliest approaches to tackle this problem is semi-supervised self-training, which takes advantage of labeled and unlabeled data to create pseudo-labeled data. However, reaching a high level of confidence when a classifier predicts unseen data from a limited number of labeled samples is a challenging task. Generating pseudo-labeled data can collaboratively improve self-training by increasing its confidence for further predictions. This paper proposes a collaborative framework between augmentation and self-training to accurately train a model with very limited labeled data. Our framework includes two components: the first is an unsupervised technique that augments labeled data and feeds it to the second component, which is self-training. The first component uses a custom variational autoencoder (VAE) architecture to generate new samples by adding randomly generated noise to the encoded latent representation. As a result, the augmentation component can generate unique and unexplored images with respect to the limited input data distribution. We evaluated the proposed framework on the handwritten MNIST image dataset. The conducted experiment shows that the generative component can be helpful in overcoming the problem of inaccurate self-training prediction when sufficient labeled data is not accessible.
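The noisy-latent augmentation step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: `encode` and `decode` are hypothetical stand-ins for a trained VAE encoder and decoder, and `copies`, `sigma`, and `seed` are illustrative parameters that the abstract does not specify.

```python
import random
from typing import Callable, List, Sequence

Vector = List[float]


def augment_with_latent_noise(
    samples: Sequence[Vector],
    encode: Callable[[Vector], Vector],   # hypothetical: sample -> latent code
    decode: Callable[[Vector], Vector],   # hypothetical: latent code -> sample
    copies: int = 3,                      # assumed number of variants per sample
    sigma: float = 0.1,                   # assumed noise scale in latent space
    seed: int = 0,
) -> List[Vector]:
    """Encode each sample, add Gaussian noise to its latent code, decode."""
    rng = random.Random(seed)
    augmented: List[Vector] = []
    for x in samples:
        z = encode(x)
        for _ in range(copies):
            noisy_z = [zi + rng.gauss(0.0, sigma) for zi in z]
            augmented.append(decode(noisy_z))
    return augmented


def toy_encode(x: Vector) -> Vector:
    # Toy encoder (not a VAE): average neighbouring pairs of values.
    return [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x), 2)]


def toy_decode(z: Vector) -> Vector:
    # Toy decoder: repeat each latent value twice.
    return [zi for zi in z for _ in range(2)]
```

In a self-training pipeline of the kind the abstract outlines, the decoded variants would be added to the small labeled set before the classifier is retrained; how labels are attached to them is left open here.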
|
|
17:00-18:00, Paper Tu-S5T7.4 | Add to My Program |
Multi-Objective Binary Coordinate Search for Feature Selection |
|
Zanjani Miyandoab, Sevil | Ontario Tech University |
Rahnamayan, Shahryar | Brock University |
Asilian Bidgoli, Azam | Wilfrid Laurier University |
Keywords: Decision Support Systems
Abstract: A supervised feature selection method selects an appropriate but concise set of features to differentiate classes, which is highly expensive for large-scale datasets. Therefore, feature selection should aim at both minimizing the number of selected features and maximizing the accuracy of classification, or any other task. However, this crucial task is computationally highly demanding on many real-world datasets and requires a very efficient algorithm to reach a set of optimal features with a limited number of fitness evaluations. For this purpose, we have proposed the binary multi-objective coordinate search (MOCS) algorithm to solve large-scale feature selection problems. To the best of our knowledge, the proposed algorithm in this paper is the first multi-objective coordinate search algorithm. In this method, we generate new individuals by flipping a variable of the candidate solutions on the Pareto front. This enables us to investigate the effectiveness of each feature in the corresponding subset. In fact, this strategy can play the role of crossover and mutation operators to generate distinct subsets of features. The reported results indicate the significant superiority of our method over NSGA-II, on five real-world large-scale datasets, particularly when the computing budget is limited. Moreover, this simple hyper-parameter-free algorithm can solve feature selection much faster and more efficiently than NSGA-II.
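The variable-flipping idea in this abstract can be sketched as follows. This is an illustration under stated assumptions rather than the MOCS algorithm itself: both objectives are minimized, `toy_evaluate` is a hypothetical stand-in for a classifier-based error measure, and `one_sweep` flips every position of every front member, which may differ from the paper's actual schedule.

```python
from typing import Callable, Iterable, List, Tuple

Mask = Tuple[int, ...]            # 1 = feature selected, 0 = feature dropped
Objectives = Tuple[float, float]  # (number of features, error), both minimized


def flip_neighbors(mask: Mask) -> List[Mask]:
    """Return every mask that differs from `mask` in exactly one position."""
    return [mask[:i] + (1 - mask[i],) + mask[i + 1:] for i in range(len(mask))]


def dominates(a: Objectives, b: Objectives) -> bool:
    """True if `a` is no worse than `b` everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(masks: Iterable[Mask],
                 evaluate: Callable[[Mask], Objectives]) -> List[Mask]:
    """Keep the masks whose objective vectors no other candidate dominates."""
    scored = {m: evaluate(m) for m in set(masks)}
    return sorted(m for m, f in scored.items()
                  if not any(dominates(g, f) for g in scored.values()))


def one_sweep(front: List[Mask],
              evaluate: Callable[[Mask], Objectives]) -> List[Mask]:
    """Flip each variable of each front member, then re-filter the front."""
    candidates = list(front)
    for m in front:
        candidates.extend(flip_neighbors(m))
    return pareto_front(candidates, evaluate)


def toy_evaluate(mask: Mask) -> Objectives:
    # Hypothetical error in percent: only features 0 and 2 are informative.
    error = 100 - 40 * mask[0] - 40 * mask[2]
    return (sum(mask), error)
```

Starting from the all-features mask `(1, 1, 1, 1)`, one sweep of this toy problem keeps only the two three-feature masks that retain both informative features, since dropping an uninformative feature lowers the feature count without raising the error.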
|
|
17:00-18:00, Paper Tu-S5T7.5 | Add to My Program |
Balanced Supervised Contrastive Learning for Skin Lesion Classification (I) |
|
Yan, Lan | Hunan University |
Li, Kenli | Hunan University |
Keywords: Deep Learning, Machine Vision
Abstract: Deep neural networks have emerged as an important tool for computer-aided diagnosis. However, deep models for skin lesion classification still face the challenges of intra-class variation and inter-class similarity, as well as data imbalance. To address these challenges, in this paper, we propose a balanced supervised contrastive learning (BSCL) approach for the skin lesion classification task. Our model consists of two branches for supervised contrastive learning and classification, respectively. The introduced supervised contrastive learning branch helps the network to learn more discriminative representations. Moreover, we design both a category-averaging strategy, which averages the instances of every class in a mini-batch, and a category-complement strategy, which makes all categories appear in each mini-batch, to balance the influence of different skin lesion categories. In addition, we introduce a multi-weighted classification loss to learn a balanced classifier. Extensive experiments on two benchmarks demonstrate that our approach is able to learn strong feature representations and achieve state-of-the-art skin lesion classification performance.
|
|
17:00-18:00, Paper Tu-S5T7.6 | Add to My Program |
Online Sparse Streaming Feature Selection Via Decision Risk |
|
Xu, Ruiyang | Chongqing University of Posts and Telecommunications |
Wu, Di | Southwest University |
Luo, Xin | Chinese Academy of Sciences |
Keywords: Control of Uncertain Systems, Decision Support Systems
Abstract: Online streaming feature selection (OSFS) is an effective approach to addressing high-dimensional data. In real big data-related applications, streaming features commonly have massive missing data due to various uncertain factors. The missing data may cause some uncertain relationships between sparse features and labels. However, existing OSFS methods tend to select sparse streaming features based on certain relevance and redundancy analysis, which may erroneously discard some weakly relevant but non-redundant features. As a result, some essential information is discarded. Motivated by this, this paper proposes a Decision-Risk-incorporated OSFS (DRO) algorithm. Its main idea is two-fold: 1) the missing data of sparse streaming features are pre-estimated by using Latent Factor Analysis (LFA), and 2) the decision risk of relevance and redundancy analysis is minimized on the estimated complete streaming features via the three-way decision (3WD). Extensive empirical studies are conducted on eight real-world datasets. The results show that DRO significantly outperforms five state-of-the-art competitors.
|
|
17:00-18:00, Paper Tu-S5T7.7 | Add to My Program |
Optimization of a Robotaxi Dispatch Problem in Pandemic Era |
|
Li, Mengqi | Shandong University of Science and Technology |
Qi, Liang | Shandong University of Science and Technology |
Luan, Wenjing | Shandong University of Science and Technology |
Zhang, Rongyan | Shandong University of Science and Technology |
Guo, Xiwang | Liaoning Petrochemical University |
Keywords: Optimization and Self-Organization Approaches, Application of Artificial Intelligence, Computational Intelligence
Abstract: Autonomous driving has been successfully realized in particular areas such as logistics distribution centers, container terminals, and university campuses. Robotaxis could be another potential application in the near future. This work studies a robotaxi dispatch problem during a pandemic. It proposes a multi-objective optimization model to minimize the number, waiting time, and driving distance of the robotaxis. In addition, a dispatch strategy is designed according to a defined severity degree of the pandemic. The virus infection rate can be decreased by reducing contact among passengers. A two-stage nondominated sorting genetic algorithm (NSGA-TS) is proposed to solve the problem. Three operations are used to generate offspring solutions, which can ensure the diversity of the population and speed up the convergence of the algorithm. The effectiveness of NSGA-TS is verified by comparison with two popular multi-objective optimization algorithms, i.e., the multi-objective evolutionary algorithm based on decomposition (MOEA/D) and the nondominated sorting genetic algorithm II (NSGA-II). Experimental results show that the proposed model performs well on the studied problem. It can reduce the virus infection rate by decreasing contact among passengers at different risk levels of the pandemic while accomplishing passenger orders. This work is conducive to building intelligent transportation in the post-pandemic era.
|
|
17:00-18:00, Paper Tu-S5T7.8 | Add to My Program |
Unmasking Deception: A Comparative Study of Tree-Based and Transformer-Based Models for Fake Review Detection on Yelp |
|
Wang, Pengqi | The Hong Kong University of Science and Technology (Guangzhou) |
Lin, Yue | The University of Hong Kong (HKU) |
Chai, Junyi | Beijing Normal University - Hong Kong Baptist University United |
Keywords: Application of Artificial Intelligence, Media Computing, Artificial Social Intelligence
Abstract: The increasing prevalence of fake online reviews jeopardizes firms' profits, consumers' well-being, and the trustworthiness of e-commerce ecosystems. We face the significant challenge of accurately detecting fake reviews. In this paper, we undertake a comprehensive investigation of traditional and state-of-the-art machine learning classification models, based on textual features, to detect fake online reviews. We examine existing and noteworthy models for fake online review detection in terms of the effectiveness of textual features, the efficiency of sampling methods, and their detection performance. Adopting a quantitative and data-driven approach, we scrutinize both tree-based and transformer-based detection models. Our comparative studies show that transformer-based models (specifically BERT and GPT-3) outperform tree-based models (i.e., Random Forest and XGBoost) in terms of accuracy, precision, and recall metrics. We use real data from online reviews on Yelp.com for implementation. The results demonstrate that our proposed approach can identify fraudulent reviews effectively and efficiently. Synthesizing ChatGPT-3, tree-based, and transformer-based models for fake online review detection is rather new but promising; this paper highlights their potential for better detection of fake online reviews.
|
|
17:00-18:00, Paper Tu-S5T7.9 | Add to My Program |
ICE-YoloX: An Effective Face Mask Detection Method |
|
Chen, Jiaxin | Hangzhou Dianzi University |
Zhang, Xuguang | Hangzhou Dianzi University |
Tang, Yinggan | Yanshan University |
Yu, Hui | University of Portsmouth |
Keywords: Cognitive Computing, Visual Analytics/Communication
Abstract: Deep learning technologies such as YoloX have achieved impressive progress in face mask detection recently. However, the neck network used in the YoloX network may lead to a severe confounding effect in feature mapping due to the inherent defect of channel reduction in hybrid fusion, which affects its ability to precisely localize mask-wearing targets. To tackle this issue, we present a new FPN network structure (ICE-FPN) based on the channel-enhanced feature pyramid network (CE-FPN) in this paper, which can mitigate the YoloX network confounding effect while reducing the number of parameters and computational effort caused by CE-FPN. Experiments conducted on the WMD dataset show that the mAP0.5 of the model improves from 99.54% to 99.62% and the mAP0.75 improves from 89.47% to 91.35%. The ablation and comparison experiments demonstrate that the proposed ICE-YoloX achieves superior performance over existing methods.
|
|
Tu-S5T8 Virtual Session, Room T8 |
Add to My Program |
Additional SSE II |
|
|
|
17:00-18:00, Paper Tu-S5T8.1 | Add to My Program |
Efficient Extended Neighborhoods Dynamic Selection Re-Ranking for Person Re-Identification |
|
Chao, Wang | Wuhan University |
Zhongyuan, Wang | Wuhan University |
Xiaochen, Wang | Center For Multimedia Software School of Computer Science Wuhan University Wuhan |
Ruimin, Hu | Center For Multimedia Software School of Computer Science Wuhan University Wuhan |
Mithun, Mukherjee | Nanjing University of Information Science and Technology Nanjing |
|
17:00-18:00, Paper Tu-S5T8.2 | Add to My Program |
HyIntent: Hybrid Intention-Proposal Network for Human Trajectory Prediction |
|
Liu, Chunyu | Computer Network Information Center, Chinese Academy of Sciences, |
Yu, Jianjun | Computer Network Information Center, Chinese Academy of Sciences, |
Keywords: Smart Buildings, Smart Cities and Infrastructures, Intelligent Transportation Systems, Autonomous Vehicle
Abstract: Pedestrian trajectory prediction is a challenging task due to its inherent uncertainty and the multi-modal nature of human intention. There are multiple possible trajectories based on the same historical locations. Our key insight is that humans are intention-driven, so future trajectories can be effectively captured by a set of intention proposals. This leads to our Hybrid Intention-proposal trajectory prediction (HyIntent) framework. HyIntent is a hybrid network architecture that acts solely on historically observed locations. We use an LSTM as our recurrent backbone to memorize the sequential features and a transformer decoder to capture the long-term dependency with intention. Given some orthogonal learned intention proposals, HyIntent reasons about the relations between the intention proposals and the historical observations to generate multiple tailored predictions in parallel. HyIntent demonstrates improved performance on public datasets (i.e., ETH/UCY and SDD) and outperforms state-of-the-art methods while minimizing complexity.
|
|
17:00-18:00, Paper Tu-S5T8.3 | Add to My Program |
Refining Multi-Teacher Distillation for Multi-Modality COVID-19 Medical Images: Make the Best Decision (I) |
|
Pan, Xiaoyu | Chongqing Medical University |
Jia, Yuanyuan | Chongqing Medical University |
Keywords: Image Processing and Pattern Recognition, AI and Applications, Biometric Systems and Bioinformatics
Abstract: Accurate COVID-19 lesion segmentation is vital for diagnosing lung infections. Currently, most segmentation networks achieve a high level of accuracy at the cost of considerable parameter complexity, making it challenging to apply them in resource-limited hospitals. To tackle this problem, we design a multi-teacher distillation framework based on the CNN-Transformer. The student acquires knowledge from the pre-trained strong and weak teachers to continuously improve its learning ability. Our proposed framework has three main benefits: 1) We design GMDiff (Generate Medical Diffusion Model) to solve the scarcity problem of high-quality COVID-19 medical images and to adapt to few-shot learning. 2) The DWD (Dynamic Weight Distribution) can adaptively adjust the weight of a weak teacher to reduce incorrect guidance on student learning. When the weak teacher has enough confidence in its decisions, the student can achieve satisfactory results. 3) To bridge the semantic gap between the student and the multiple teachers, we propose the MSL (Multi-Scale Loss), AGL (Attention Gradient Loss) and EPL (Edge Pixel Loss) strategies to supervise student learning. Before utilizing the above strategies, we adopt the IPM (Information Processing Module) to standardize the feature representation shared by the student and teachers. Extensive experimentation shows that our proposed distillation framework achieves state-of-the-art segmentation results with a fine-tuned balance between parameters and complexity on multi-modality datasets.
|
|
17:00-18:00, Paper Tu-S5T8.4 | Add to My Program |
Learn to Coordinate: A Whole-Body Learning from Demonstration Framework for Differential Drive Mobile Manipulator |
|
Yang, Yuqiang | South China University of Technology |
Huang, Darong | South China University of Technology |
Chen, Chen | Huawei Technologies Co., Ltd |
Zeng, Chao | Universität Hamburg |
He, Yanong | Huawei Technologies Co., Ltd |
Yang, Chenguang | University of the West of England |
Keywords: Robotic Systems, Cooperative Systems and Control, Modeling of Autonomous Systems
Abstract: This paper proposes a whole-body learning from demonstration (LfD) framework that enables differential drive mobile manipulators to learn coordinated motion and disturbance rejection. First, an efficient kinesthetic teaching method is devised based on the weighted least-norm (WLN) inverse kinematics solution and an admittance controller, which helps human users guide the mobile manipulator to perform tasks. Second, we propose a whole-body LfD framework based on Gaussian Processes, which endows the mobile manipulator's skill learning process with large-scale convergence, coordinated motion and disturbance rejection after just a few human demonstrations. The proposed learning framework also allows for human-in-the-loop correction while the whole body is conducting a task. Finally, the effectiveness of the proposed framework is verified via two simulations and a pick-and-place experiment.
|
|
17:00-18:00, Paper Tu-S5T8.5 | Add to My Program |
Fine-Grained Cross-Modal Graph Convolution for Multimodal Aspect-Oriented Sentiment Analysis |
|
Zhao, Zefang | Computer Network Information Center, Chinese Academy of Sciences |
Liu, Yuyang | Chinese Academy of Medical Sciences and Peking Union Medical Col |
Song, Liujing | University of Chinese Academy of Sciences, Beijing |
Li, Jun | Computer Network Information Center |
Keywords: Affective Computing, Cognitive Computing
Abstract: Aspect-oriented multimodal sentiment analysis aims to identify the sentiment associated with a given aspect using text and image inputs. Existing methods have focused on the interaction between aspects, text, and images, achieving significant progress through cross-modal transformers. However, they still suffer from three problems: (1) Ignoring the dependency relationships between objects within the image modality; (2) Failing to consider the role of syntactic dependency relationships within the text modality in capturing aspect-related opinion words; (3) Neglecting the inherent dependency relationships between modalities. To address these issues, we propose a fine-grained cross-modal graph convolutional network model (FCGCN). Specifically, we construct intra-modality dependency relationships using syntactic and spatial relationships and fuse the two modalities through semantic similarity calculation. We then design a GCN-Attention layer to capture richer multimodal fusion information. Additionally, an aspect-oriented transformer module is introduced to capture aspect features interactively. Experimental results on the Twitter datasets show that our FCGCN model consistently outperforms state-of-the-art methods.
|
|
17:00-18:00, Paper Tu-S5T8.6 | Add to My Program |
On the Collaborative Object Transportation Using Leader Follower Approach |
|
Ghosh, Sumanta | TCS Research |
Nath, Subhajit | TCS Research |
Sortee, Sarvesh | TCS Research |
Kumar, Lokesh | TCS Research |
Bera, Titas | TCS Research |
Keywords: Cooperative Systems and Control, Robotic Systems, Cyber-physical systems
Abstract: In this paper we address the multi-agent collaborative object transportation problem in a partially known environment with obstacles under a specified goal condition. We propose a leader follower approach for two mobile manipulators collaboratively transporting an object along specified desired trajectories. The proposed approach treats the mobile manipulation system as two independent subsystems, a mobile platform and a manipulator arm, and uses their kinematics models for trajectory tracking. In this work we consider the mobile platform to be subject to non-holonomic constraints, with a manipulator carrying a rigid load. The desired trajectories of the end points of the load are obtained from a Probabilistic RoadMap-based planning approach. Our method combines a Proportional Navigation Guidance-based approach with a proposed Stop-and-Sync algorithm to reach sufficiently close to the desired trajectory; the deviation due to the non-holonomic constraints is compensated by the manipulator arm. A leader follower approach for computing the inverse kinematics solution for the position of the end-effector of the manipulator arm is proposed to maintain the load rigidity. Further, we compare the proposed approach with other approaches to analyse the efficacy of our algorithm.
|
|
17:00-18:00, Paper Tu-S5T8.7 | Add to My Program |
Uncertainty-Aware Deep Learning for Segmenting Ultrasound Images of Breast Tumours |
|
Munia, Afsana Ahmed | Deakin University |
Hossain, Ibrahim | Deakin University |
Jalali, Seyed Mohammad Jafar | Institute for Intelligent Systems Research and Innovation, Deaki |
Tabarisaadi, Pegah | Institute for Intelligent Systems Research and Innovation (IISRI |
Rahman, Ashikur | Bangladesh University of Engineering and Technology |
Nahavandi, Saeid | Swinburne University of Technology |
Keywords: Control of Uncertain Systems
Abstract: Precise image segmentation is one of the dominant factors in disease diagnosis. A typical application is the segmentation of breast ultrasound images, allowing radiologists to suggest what to do next. With the emergence of deep learning technology, especially convolutional neural networks (CNNs), image segmentation models have achieved state-of-the-art performance in various medical applications such as cancer detection and classification, lung node segmentation, cell segmentation and so on. However, despite these successes, a big question arises: to what extent is the model certain about the predicted result? Generally, most deep learning models focus on high accuracy but not on the uncertainty of predicted results, which is not enough to make a critical real-life decision such as a disease diagnosis, where a wrong decision can be life-threatening. Hence, for making a crucial decision, it is essential that the predicted result is not only accurate but also accompanied by an estimate of model uncertainty. Our contribution to this research is to build a system that predicts pixel-wise semantic segmentation and provides uncertainty estimation of the predicted results. It is achieved by adding a dropout layer during training and using Monte Carlo dropout in inference. We evaluate our model with the breast ultrasound image dataset (BUSI) and compare the results with a few other state-of-the-art methods, where our method outperforms the others in terms of IoU.
|
|
17:00-18:00, Paper Tu-S5T8.8 | Add to My Program |
A SINS Error Correction Approach Based on Dual-Threshold ZV Detection and Cubature Kalman Filter |
|
Xu, Ruijie | Beijing University of Chemical Technology |
Chen, Shichao | Institute of Automation, Chinese Academy of Sciences |
Sun, Wenqiao | Transportation and Economics Research Institute, and the Center |
Lv, Yisheng | Institute of Automation, Chinese Academy of Sciences |
Luo, Jialiang | China University of Geosciences Beijing |
Tang, Ying | Rowan University |
Keywords: Consumer and Industrial Applications, Infrastructure Systems and Services, Cyber-physical systems
Abstract: Global Navigation Satellite Systems (GNSS) can provide real-time positioning information for outdoor users, but not in indoor or heavily occluded outdoor scenarios. Strap-down Inertial Navigation Systems (SINS) are widely used to locate people in complex indoor or heavily occluded outdoor scenarios due to their light weight and low power consumption. However, the IMU of a SINS is noisy, and its sampling error is large and diverges with time. It therefore generates an accumulated positioning error, which affects the final positioning accuracy. The problem of cumulative IMU errors is usually dealt with by Zero-Velocity Update (ZUPT). The zero-velocity detection part of the basic ZUPT method usually uses a single threshold to determine the gait of a pedestrian, which often leads to gait misjudgment and omission. To address these problems, this paper proposes a composite conditional detection method to solve the problem of misjudgment in the zero-velocity interval. In addition, we redesign the zero-velocity update algorithm and use the Cubature Kalman filter (CKF) for pedestrian positioning error correction. The experimental results demonstrate that the proposed ZUPT method based on dual-threshold detection can better distinguish intervals of pedestrian motion from stationary intervals than methods with a single threshold. The zero-velocity update algorithm based on CKF performs better than conventional EKF and UKF methods, constraining the cumulative error of SINS to about 0.2% of the whole walking distance.
|
|
17:00-18:00, Paper Tu-S5T8.9 | Add to My Program |
Self-Supervised Learning Based on Similar Users for Sequential Recommendation |
|
Shu, Xiaomei | Sichuan University |
He, Jun | Sichuan University |
Huang, Feihu | Sichuan University |
Peng, Jian | Sichuan University |
Keywords: Representation Learning, Deep Learning, Neural Networks and their Applications
Abstract: Sequential Recommendation (SR) predicts the next interaction behavior by modeling the interaction between the user and the item over a time sequence. A series of works applied Self-Supervised Learning (SSL) in SR to obtain better user representations. Although these efforts proved effective, they focused only on the information of the user itself and ignored self-supervised signals from other users. Due to the widely observed homogeneity in recommender systems, these signals from other users are also vital for user representation. To this end, we propose a novel framework, Self-Supervised Learning based on Similar users for Sequential Recommendation (SSLSRec). We present a contrastive learning objective in SSLSRec to consider augmented views from the same user and similar users as positive samples. Moreover, we propose novel Insert and Substitute augmentation methods to construct more reasonable augmentation views for user sequences. Extensive experiments demonstrate the effectiveness of SSLSRec.
|
|
Tu-S5T9 Virtual Session, Room T9 |
Add to My Program |
New Session for Latest Online Requests V |
|
|
|
17:00-18:00, Paper Tu-S5T9.1 | Add to My Program |
Spatio-Temporal Traffic Data Recovery Via Latent Factorization of Tensors Based on Tucker Decomposition |
|
Mi, Jiajia | Dongguan University of Technology |
Wu, Hao | Chongqing Institute of Green and Intelligent Technology, Chinese |
Li, Weiling | Dongguan University of Technology |
Luo, Xin | Chinese Academy of Sciences |
Keywords: Intelligent Transportation Systems
Abstract: Complete and valid spatio-temporal traffic data play a vital role in intelligent transportation systems applications, such as congestion avoidance and route guidance. However, traffic data from real-world scenarios are usually incomplete or corrupted due to communication or sensor malfunctions, which makes traffic analytics difficult. Since traffic data contain complex spatio-temporal patterns, it is very challenging to develop an efficient learning model that can accurately recover incomplete traffic data. To tackle this issue, this work proposes a Tucker Decomposition-based Latent factorization of tensors (TDL) model with two interesting ideas: 1) modeling spatio-temporal traffic data as an incomplete third-order tensor and building a Tucker decomposition based learning objective according to the density-oriented principle for precisely recovering missing traffic data; and 2) adopting a proportional-integral-derivative (PID) control principle-incorporated parameter learning scheme for achieving high computational efficiency. Empirical studies on four traffic speed datasets generated from different cities demonstrate that the proposed TDL model achieves significant performance gain in both accuracy and computational efficiency compared with state-of-the-art models.
|
|
17:00-18:00, Paper Tu-S5T9.2 | Add to My Program |
From DAO to TAO: Finding the Essence of Decentralization |
|
Li, Juanjuan | Institute of Automation, Chinese Academy of Sciences |
Liang, Xiaolong | Macau University of Science and Technology |
Qin, Rui | Institute of Automation, Chinese Academy of Sciences |
Wang, Fei-Yue | Institute of Automation, Chinese Academy of Sciences |
Keywords: Systems Safety and Security, Cognitive Computing, Multi-User Interaction
Abstract: Decentralized Autonomous Organizations (DAOs) have been gaining popularity in recent years due to their promise of realizing the decentralized Web3. However, most DAOs rely heavily on token-centric value systems and allocate decision-making authority and yield-sharing rights according to the held tokens, which often leads to monopolization of power and rights. To address this issue, this paper proposes a truly decentralized organization model, named True Autonomous Organizations (TAOs), that does not count upon tokens and is guided by principles of contribution-based and on-demand allocation. We first discuss the design of TAOs, including their infrastructures, power structures, and value systems, and then provide a technical roadmap for implementing TAOs in the DeSci context. This research can provide valuable guidance for the construction and application of TAOs.
|
|
17:00-18:00, Paper Tu-S5T9.3 | Add to My Program |
Management-Oriented Operating Systems: Harnessing the Power of DAOs and Foundation Models |
|
Qin, Rui | Institute of Automation, Chinese Academy of Sciences |
Li, Juanjuan | Institute of Automation, Chinese Academy of Sciences |
Liang, Xiaolong | Macau University of Science and Technology |
Wang, Fei-Yue | Institute of Automation, Chinese Academy of Sciences |
Keywords: Human Enhancements, Cognitive Computing, Multi-User Interaction
Abstract: This paper presents Management-Oriented Operating Systems (M2OS) that leverage the power of parallel intelligence theory, Decentralized Autonomous Organizations (DAOs) and foundation models to revolutionize the manner of management in Cyber-Physical-Social Systems (CPSS). The parallel architecture of M2OS is proposed, including the parallel interactive actual M2OS and artificial M2OS. Among them, the artificial M2OSs provide digital infrastructures for organizations to operate, collaborate, and make decisions in the virtual space, and conduct computational experiments to evaluate management decisions and predict future states of the actual M2OS. Through parallel execution and closed-loop feedback between the artificial and actual M2OSs, the management and control, experimentation and evaluation, as well as learning and training of the actual M2OS can be realized. Moreover, the functional layers of M2OS, including the infrastructure layer, data layer, scenario layer, modeling layer, decision layer, and application layer, are discussed. These layers work together to support the intelligent, autonomous, collaborative, and adaptive nature of the M2OS, and facilitate data-driven decision-making, optimize business operations, and empower managers with real-time actionable insights. The proposed M2OS paradigm has great potential to transform the management paradigm and opens up new possibilities for intelligent and collaborative decision-making.
|
|
17:00-18:00, Paper Tu-S5T9.4 | Add to My Program |
Enhancing Human Action Recognition with Asymmetric Generalized Gaussian Mixture Model-Based Hidden Markov Models and Bounded Support |
|
Al-Bazzaz, Hussein | Concordia University |
Azam, Muhammad | Concordia University |
Amayri, Manar | Concordia University |
Bouguila, Nizar | Concordia University |
Keywords: Modeling of Autonomous Systems, Robotic Systems, Consumer and Industrial Applications
Abstract: Human action recognition (HAR) is a crucial research field that necessitates the implementation of advanced mathematical concepts to recognize human activities from sequences of observations. This paper presents a novel framework employing a mixture-based Hidden Markov Model (HMM) that capitalizes on the advantages of asymmetric modeling, bounded support, and robustness in sensor-based HAR. To accommodate variations in observations within each human activity class, we propose an asymmetric generalized Gaussian mixture model (AGGM) to model the emission probabilities. Subsequently, we propose incorporating the bounded asymmetric generalized Gaussian mixture model (BAGGM) to address the constraints inherent in real-life data. The parameters of the corresponding HMM are estimated using the Baum-Welch algorithm, and the most probable sequence of hidden states is inferred using the Viterbi algorithm. We validate our proposed framework using four datasets for human activities. Experimental results demonstrate that our proposed model outperforms all state-of-the-art HAR models that are HMM-based, thereby emphasizing the superiority of our proposed frameworks in HAR systems to understand behaviours, predict potential actions, and facilitate sports applications for the general population and independent living applications for vulnerable populations.
|
|
17:00-18:00, Paper Tu-S5T9.5 | Add to My Program |
Explainable Robust Smart Meter Data Clustering for Improved Energy Management |
|
Al-Bazzaz, Hussein | Concordia University |
Azam, Muhammad | Concordia University |
Amayri, Manar | Concordia University |
Bouguila, Nizar | Concordia University |
Keywords: Consumer and Industrial Applications, Infrastructure Systems and Services, Intelligent Power Grid
Abstract: The widespread deployment of smart meters in residential settings has led to a wealth of high-resolution electrical power consumption data, providing the opportunity to discover valuable insights into energy consumption patterns. Mixture models are essential for revealing hidden patterns in data, enabling accurate insights and informed decision-making across diverse applications. In this paper, we introduce the mixture of mixtures of bounded asymmetric generalized Gaussian and Uniform distributions (BAGGUMM) and investigate its potential for characterizing residential energy users, thereby enhancing energy management applications, including demand response and energy efficiency programs. We investigate the potential for improved clustering efficacy by incorporating an inner mixture containing the Uniform distribution to enhance robustness against outliers. Additionally, we integrate a decision tree algorithm for model explainability to define pattern boundaries using if-then statements. We validate our proposed model using three real-life datasets. The performance of BAGGUMM is also compared against several state-of-the-art mixture models.
|
|
17:00-18:00, Paper Tu-S5T9.6 | Add to My Program |
Refining Nonparametric Mixture Models with Explainability for Smart Building Applications |
|
Al-Bazzaz, Hussein | Concordia University |
Saravanakumar, Kumar Prabhakaran | Concordia University |
Amayri, Manar | Concordia University |
Bouguila, Nizar | Concordia University |
Keywords: Adaptive Systems, Control of Uncertain Systems, Consumer and Industrial Applications
Abstract: Nonparametric mixture models are a powerful and flexible approach to data clustering; they account for uncertainty by using Bayesian inference to obtain a posterior distribution over the model's parameters and a better fit to the data. Additionally, nonparametric mixture models can adaptively adjust the number of components to fit the data. Incorporating asymmetric generalized Gaussian distribution (AGGD) within the mixture framework extends the capabilities of the widely used Gaussian mixture model (GMM) by adding parameters that control the shape and the skewness of the per-component distribution, which enables the model to capture diverse and complex data patterns. Furthermore, we incorporate explainability within our proposed infinite asymmetric generalized Gaussian mixture model (IAGGMM) to provide interpretable insights into the clustering results, enhancing the model's practicality and transparency. This integration facilitates a deeper understanding of the underlying data structures and the rationale behind the model's decisions, fostering trust and promoting the adoption of our approach in various real-world scenarios. In this study, we explore the application of occupancy estimation for optimizing energy efficiency and facility management in smart buildings. Our approach demonstrates superior performance in modelling complex and asymmetric data distributions, resulting in improved accuracy and adaptability for occupancy level estimation. Therefore, we achieve the optimal trade-off between model complexity and accuracy.
|
|
17:00-18:00, Paper Tu-S5T9.7 | Add to My Program |
A Novel Approach for Smoothing the Path of Emergency Vehicles in Urban Areas |
|
Yu, Weiqi | Shandong University of Science and Technology |
Qi, Liang | Shandong University of Science and Technology |
Bai, Weichen | Shandong University of Science and Technology |
Luan, Wenjing | Shandong University of Science and Technology |
Guo, Xiwang | Liaoning Petrochemical University |
Keywords: Intelligent Transportation Systems, Cooperative Systems and Control, Adaptive Systems
Abstract: Emergency vehicles (EVs) are crucial in responding to time-critical events such as traffic accidents, medical emergencies, and fires in urban areas. Most traffic control approaches try to reduce the travel time of EVs by giving them the highest road-use priority, which may cause delays for other nearby traffic participants and reduce the smoothness of normal traffic. This work proposes a novel approach to reduce both the travel time of an EV and the negative impact on normal traffic by dynamically evacuating traffic adjacent to an emergency path. The approach periodically acquires a subnet for each road segment of the emergency path based on dynamic traffic conditions. Regular vehicles on the subnet are restricted from using the emergency path, which minimizes the time for emergency service delivery. The experimental results show that the proposed approach outperforms the existing approach in many metrics, such as the travel time of the EV and the additional delay of normal traffic. In addition, this work performs sensitivity analysis on regular vehicles’ compliance rate to evacuation. The experimental results show the superiority of the proposed approach at different compliance rates.
|
|
17:00-18:00, Paper Tu-S5T9.8 | Add to My Program |
Dynamic Tracking with Fuzzy Rules for Evolutionary Dynamic Constrained Optimization |
| |