| |
Last updated on October 20, 2023. This conference program is tentative and subject to change
Technical Program for Tuesday October 24, 2023
|
TU-PS00T1 Room T1 |
Supplementary Session 11 |
|
|
|
08:00-09:00, Paper TU-PS00T1.1 |
A GAN-Based FDI and Civil Attack Detection Framework for Digital Relays |
|
Aflaki, Arshia | University of Calgary Calgary, Canada |
Karimipour, Hadis | University of Calgary |
Namavar Jahromi, Amir | University of Guelph |
Keywords: Deep Learning, Neural Networks and their Applications, Expert and Knowledge-Based Systems
Abstract: Digital relays are critical components of smart power grids; therefore, their security is paramount for the proper operation of the grid. This paper proposes a cyber-attack detection method for digital relays using a modified generative adversarial network combined with the extra tree classifier for dimensionality reduction. The proposed method is evaluated on an IEEE 39-bus transmission network. The results show that the proposed method can detect false data injection attacks and civil attacks with more than 97 percent accuracy and F1 score, and more than 96 percent sensitivity.
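The metrics named in this abstract (accuracy, F1 score, sensitivity) all derive from a binary confusion matrix. The following is a minimal sketch of those definitions; the function name and the counts are hypothetical and are not the paper's results.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard detection metrics from binary confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # also called recall or true-positive rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return {"accuracy": accuracy, "sensitivity": sensitivity, "f1": f1}


# Made-up counts for illustration only (attack = positive class).
metrics = detection_metrics(tp=90, fp=5, tn=95, fn=10)
print(metrics)
```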
|
|
08:00-09:00, Paper TU-PS00T1.2 |
Multi-Sensory Visual-Auditory Fusion of Wearable Navigation Assistance for People with Impaired Vision |
|
Li, Guoxin | The Institute of Artificial Intelligence, Hefei Comprehensive Na |
Li, Zhijun | South China Univ. of Tech |
Xia, Haisheng | University of Science and Technology of China |
Feng, Ying | South China University of Technology |
Keywords: Human-Computer Interaction, Augmented Cognition, Wearable Computing
Abstract: Navigating independently is a challenge for visually impaired people because it requires avoiding obstacles, recognizing desired objects, and wayfinding in complicated environments. In this paper, we present augmented wearable E-Glasses with a set of sensors, in which an object detection neural network based on a visual-auditory fusion method is employed to search for desired targets, thus addressing navigation challenges and improving the mobility and independence of the visually impaired. We demonstrate advanced navigation capabilities: indoor wayfinding, recognizing desired goals and steering users to them, and completing a sequence of indoor challenges. The fusion network adopts a feature-level fusion strategy, which can align the two modalities automatically and effectively integrate visual and audio features. Across all experiments, the developed fusion algorithm has a 94.67% success rate. The wearable E-Glasses provide a platform that helps to improve the mobility and quality of life of people with impaired vision.
|
|
08:00-09:00, Paper TU-PS00T1.3 |
Self-Adaptive Facial Expression Recognition Based on Local Feature Augmentation and Global Information Correlation (I) |
|
Yan, Lingyu | Hubei University of Technology |
Xia, Jinyao | Hubei University of Technology |
Wang, Chunzhi | Hubei University of Technology |
Keywords: Control of Uncertain Systems, Decision Support Systems, Discrete Event Systems
Abstract: Facial expression recognition (FER) is an important research topic in computer vision and has been widely applied in human-computer interaction, education, healthcare, transportation, etc. However, the wide application of FER technology also brings new challenges: occlusion and pose variation are two of the worst factors that disturb facial expression recognition in the wild. We propose a facial expression recognition method based on local feature augmentation and multi-scale global correlation, which adaptively extracts robust local and global features at the feature level to suppress the disturbances that occlusion and pose variation cause. The experimental results show that our method performs well on the RAF-DB dataset and is more robust than other algorithms.
|
|
08:00-09:00, Paper TU-PS00T1.4 |
Deep and Spatio-Temporal Detection for Abnormal Traffic in Cloud Data Centers (I) |
|
Yuan, Haitao | Beihang University |
Wang, Shen | Beihang University |
Bi, Jing | Beijing University of Technology |
Zhang, Jia | Southern Methodist University |
Keywords: Intelligent Internet Systems, Cloud, IoT, and Robotics Integration, Neural Networks and their Applications
Abstract: Interactions of network traffic through cloud data centers have become an important part of network services. Precise and real-time detection and prediction of network traffic can assist system operators in effectively allocating resources, assessing network performance based on actual service requirements, and analyzing network health. However, the sources and distribution of network traffic differ, which makes accurate warning of network attack traffic difficult. In recent years, neural networks have been proven to be effective in predicting time series data, particularly long short-term memory networks for capturing temporal features and convolutional methods for capturing spatial features. This work proposes a Deep Hybrid Spatio-Temporal (DHST) network method for abnormal traffic detection in cloud data centers, which combines a cooperative temporal convolutional network, an attention mechanism and a random inactivation method to capture the network traffic data’s spatio-temporal features. It improves the accuracy of abnormal traffic detection and classifies traffic as normal or abnormal. It achieves higher accuracy than typical detection methods when applied to a real-life dataset collected from Yahoo Webscope S5.
|
|
08:00-09:00, Paper TU-PS00T1.5 |
Frame-Level Smooth Motion Learning for Human Mesh Recovery |
|
Zhang, Zhaobing | Xi'an Jiaotong University |
Liu, Yuehu | Xi'an Jiaotong University |
Li, Shasha | Xi'an Jiaotong University |
Keywords: Virtual and Augmented Reality Systems, Virtual/Augmented/Mixed Reality, Information Visualization
Abstract: Reconstructing an accurate and smooth 3D human mesh from a monocular video is still a challenge, due to the temporal consistency requirement of body movements. This work proposes a frame-level smooth motion learning approach called SLMR. Specifically, we first design a temporal encoding with a multi-headed attention mechanism, which captures the global and local temporal context relations of motion to retain more dynamic motion features and improve mesh recovery accuracy. We also propose a probabilistic generative model consisting of a conditional variational autoencoder, which learns the distribution of pose changes in each frame and resolves the temporal inconsistency of body movement. Compared with existing approaches, SLMR can take full advantage of inter-frame motion contexts. Experiments validate the effectiveness and smoothness of the proposed approach for human mesh recovery in the wild.
|
|
08:00-09:00, Paper TU-PS00T1.6 |
Optimal PD Control for Robots Using GAN and LSTM |
|
Hernandez, Ivan | CINVESTAV-IPN |
Yu, Wen | CINVESTAV-IPN |
Keywords: Deep Learning
Abstract: PD (proportional-derivative) control is a widely used model-free method for controlling robots. However, it does not guarantee optimal performance. Model-based optimal control methods, such as the linear quadratic regulator (LQR), can achieve the desired control performance, but they are only suitable for linear systems that are well understood. In this paper, we propose a novel approach to designing an optimal PD controller for unknown robot systems using conditional generative adversarial networks (C-GAN) and long short-term memory (LSTM) to approximate LQR PD control. This new control mechanism ensures both stability and optimal performance. We apply this method to control lower limb prostheses, and our results demonstrate that the optimal PD control using GAN and LSTM outperforms classical controllers.
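For readers unfamiliar with the baseline, the classical PD law computes a command from the tracking error and its rate of change. The sketch below applies it to a unit point mass; the plant, the hand-picked gains, and the function name are assumptions for illustration and are unrelated to the GAN/LSTM-tuned controller of the paper.

```python
def simulate_pd(kp: float, kd: float, target: float,
                dt: float = 0.01, steps: int = 1000) -> float:
    """Drive a unit point mass to `target` with a PD law; return the final position."""
    pos, vel = 0.0, 0.0
    for _ in range(steps):
        error = target - pos
        d_error = -vel                       # target is constant, so d(error)/dt = -velocity
        force = kp * error + kd * d_error    # PD control law
        vel += force * dt                    # unit mass: acceleration equals force
        pos += vel * dt                      # semi-implicit Euler integration step
    return pos


# Hand-picked gains (hypothetical), simulated for 10 seconds.
final_position = simulate_pd(kp=20.0, kd=8.0, target=1.0)
print(final_position)
```

The gains here are fixed by hand; no optimality is claimed for them.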
|
|
08:00-09:00, Paper TU-PS00T1.7 |
Towards Energy-Efficient Scheduling of UAV-Enabled Mobile Edge Computing Systems (I) |
|
Yuan, Haitao | Beihang University |
Wang, Meijia | Beihang University |
Bi, Jing | Beijing University of Technology |
Zhang, Jia | Southern Methodist University |
Keywords: Evolutionary Computation, Swarm Intelligence, Intelligent Internet Systems
Abstract: Mobile edge computing (MEC) provides cloud resources at the network edge, which enables low-latency mobile services. In addition to fixed MEC servers, MEC proxy servers with certain mobility and limited computing, e.g., flying unmanned aerial vehicles (UAVs) and vehicles, have emerged as competitors in providing services. In this work, aiming at a task offloading problem of a UAV-assisted MEC system, a hybrid network environment with multiple mobile devices (MDs) and multiple UAVs is established. A constrained mixed integer nonlinear program of the UAV-assisted hybrid cloud-edge system is formulated. A novel hybrid metaheuristic algorithm called Genetic Simulated annealing-based Particle Swarm Optimization (GSPSO) is presented to solve the program. Then, a task offloading and resource scheduling method is designed to intelligently minimize the total energy consumption of the hybrid system. Simulation results verify the superiority of GSPSO over three benchmark algorithms, thus demonstrating that the proposed method significantly improves the energy efficiency of the UAV-enabled hybrid system.
|
|
08:00-09:00, Paper TU-PS00T1.8 |
Joint Optimization of Cache-Assisted Offloading and Resource Allocation in Mobile Edge Computing (I) |
|
Bi, Jing | Beijing University of Technology |
Zhe, Sun | Beijing University of Technology |
Yuan, Haitao | Beihang University |
Zhang, Jia | Southern Methodist University |
Keywords: Cloud, IoT, and Robotics Integration, Swarm Intelligence, Evolutionary Computation
Abstract: Edge computing is a new architectural model that aims to offer computing, storage, and networking resources to support the Internet of Things. Its primary strategy involves transferring computational tasks to the edge of the network, which is closer to end-users. This paradigm facilitates offloading of computation, resulting in reduced latency and improved system performance. However, nodes located at the network edge have restricted energy and resources. As a result, running tasks entirely at the edge leads to higher energy consumption. This work proposes a novel three-tier offloading framework comprising multiple mobile vehicles (MVs), a base station (BS), and a cloud data center (CDC). It jointly optimizes offloading rates of tasks, CPU computation rates of MVs, BS, and CDC, and the allocation of wireless bandwidth resources at MVs during partial computation offloading of tasks. It also considers limits of maximum computational resources and maximum delay of task execution. To further reduce the total system energy consumption, this work actively caches execution codes of tasks in MEC servers, which lowers the data transmission energy of MVs. This work formulates a mixed integer nonlinear program and designs a mixed metaheuristic algorithm with a multi-strategy adaptive particle swarm optimizer. Simulation results demonstrate that it outperforms various state-of-the-art algorithms by achieving lower energy consumption in fewer iterations.
|
|
TU-PS00T2 Room T2 |
Supplementary Session 12 |
|
|
|
08:00-09:00, Paper TU-PS00T2.1 |
Latency-Minimized Computation Offloading in Vehicle Fog Computing with Improved Whale Optimization Algorithm (I) |
|
Bi, Jing | Beijing University of Technology |
Xue, Xiangdong | Beijing University of Technology |
Yuan, Haitao | Beihang University |
Zhang, Jia | Southern Methodist University |
Keywords: Intelligent Internet Systems, Metaheuristic Algorithms, Evolutionary Computation
Abstract: Fog computing provides lower latency and higher bandwidth compared to cloud computing and is widely used in the Internet of Vehicles (IoV). Vehicles cannot compute all tasks locally due to their limited computing power and battery capacity, so it is useful to offload some of their tasks to resource-rich servers. However, due to the high mobility of vehicles, computing results may fail to be returned. It is therefore a challenge to minimize the latency of tasks while meeting the constraint of energy consumption. This work proposes a vehicle-fog offloading system that offloads tasks to fog servers or idle vehicles, together with an improved optimization algorithm called an adaptive Lévy flight-based Whale optimization algorithm with Hierarchical learning (LWH) to solve this problem. Simulation experiments show that LWH has a strong global search capability and outperforms five typical and widely used algorithms.
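Lévy flights give metaheuristics occasional long jumps that help them escape local optima. One common way to draw such steps is Mantegna's algorithm, sketched below. Whether LWH uses this sampler or how it adapts the step is not stated in the abstract, so the function names and the choice of beta = 1.5 are assumptions for illustration.

```python
import math
import random


def mantegna_sigma(beta: float) -> float:
    """Scale of the numerator Gaussian in Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    return (num / den) ** (1 / beta)


def levy_step(rng: random.Random, beta: float = 1.5) -> float:
    """Draw one heavy-tailed step length: u / |v|^(1/beta)."""
    u = rng.gauss(0.0, mantegna_sigma(beta))
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


rng = random.Random(0)            # seeded only to make the sketch repeatable
steps = [levy_step(rng) for _ in range(5)]
print(steps)
```

In a position update, such a step is typically multiplied by a small scale factor and added to the current candidate solution.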
|
|
08:00-09:00, Paper TU-PS00T2.2 |
Web Traffic Anomaly Detection Using a Hybrid Spatio-Temporal Neural Network (I) |
|
Bi, Jing | Beijing University of Technology |
Xu, Lifeng | Beijing University of Technology |
Yuan, Haitao | Beihang University |
Zhang, Jia | Southern Methodist University |
Keywords: Intelligent Internet Systems, Neural Networks and their Applications, Deep Learning
Abstract: Nowadays, the rapid development of the Internet has brought a sharp increase in traffic data. Abnormal traffic has a serious impact on network security. Traffic anomaly detection can be achieved by extracting characteristics of network traffic to detect anomalous intrusions, and therefore, anomaly detection algorithms are of great significance to the maintenance of network security. This work proposes a hybrid spatio-temporal neural network with attention named CTGA to effectively identify anomalous traffic. CTGA combines a Convolutional neural network (CNN), a Temporal convolutional network (TCN), a bidirectional Gated recurrent unit network (BiGRU), and a self-Attention mechanism. It automatically extracts temporal and spatial features of sequences from raw data by sliding window preprocessing followed by CNN, TCN, BiGRU, and the self-attention mechanism to detect anomalous data. CNN is used to extract spatial features of time sequences and reduce the loss of spatial information. TCN obtains short-term features in the sequence. Long-term dependencies in the data are captured by BiGRU, and the self-attention mechanism obtains important information in the sequence. Finally, experiments with the real-life Yahoo S5 dataset show that CTGA outperforms other approaches substantially.
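Sliding window preprocessing turns one long series into overlapping fixed-length segments that a network can consume as samples. A minimal sketch follows; the function name, window length, and stride are hypothetical, as the abstract does not give the paper's settings.

```python
def sliding_windows(series, window: int, stride: int = 1):
    """Split a sequence into overlapping windows of length `window`."""
    last_start = len(series) - window
    return [series[i:i + window] for i in range(0, last_start + 1, stride)]


# Toy series for illustration.
windows = sliding_windows([1, 2, 3, 4, 5], window=3)
print(windows)  # [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
```

A series shorter than the window yields an empty list, so callers may want to pad or skip such inputs.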
|
|
08:00-09:00, Paper TU-PS00T2.3 |
HR-Chain: A Blockchain-Based Solution for Managing and Securing Heterogeneous Robots |
|
Tang, Kailei | Fudan University |
Dong, Zhiyan | Fudan University |
Shi, Wenxiang | Fudan University |
Gan, Zhongxue | Fudan University |
Keywords: Robotic Systems, Trust in Autonomous Systems, System Architecture
Abstract: In modern factories and daily life, heterogeneous robot groups are increasingly widespread. However, the management and communication of heterogeneous robot swarms still leave much room for improvement, including the need for different connection interfaces for heterogeneous robots and the use of robot action logs to identify possible bottlenecks in the production line or to record unplanned behaviors, whether malicious or not. To better manage heterogeneous robot swarms, this paper introduces the Heterogeneous Robots Chain (HR-Chain), a blockchain-based solution that can manage different types of robots and prevent unnecessary changes in robot operation logs to help improve production efficiency or meet other management requirements. HR-Chain is a Tezos-based blockchain project that securely stores robot logs in the blockchain using smart contracts and introduces a new consensus algorithm, Delegated Proof of Stake with node's Resource and Behavior (DPoRB), which is tailored to heterogeneous robot swarms to improve the efficiency and fairness of the consensus process. Finally, this paper evaluates HR-Chain experimentally. The simulation results show that the new consensus algorithm performs better in terms of throughput and other metrics, and the experiments on real robots show that the HR-Chain-based robot system has better response speed and execution force, indicating its potential in the industrial and consumer fields.
|
|
08:00-09:00, Paper TU-PS00T2.4 |
FlyTransformer: A Cross-Modal Fusion Policy for UAV End-To-End Trajectory Planning |
|
Shi, Wenxiang | Fudan University |
Zhao, Chen | Fudan University |
Tang, Kailei | Fudan University |
Sheng, Junru | Fudan University |
Dong, Zhiyan | Fudan University |
Zhang, Lihua | Fudan University |
Kang, Xiaoyang | Fudan University |
Cao, Kai | Fudan University |
Keywords: Autonomous Vehicle, Robotic Systems, System Architecture
Abstract: The ability to perform efficient trajectory planning is crucial for UAVs to carry out tasks autonomously. However, existing research on UAV trajectory planning often employs a cascaded process that involves high-precision maps, real-time positioning and path planning. These methods have limitations such as high computational complexity and time delay, which hinder the efficiency of trajectory planning. End-to-end trajectory planning methods offer a promising solution to this problem. As the core of these end-to-end methods, the perception end plays a decisive role in trajectory planning. However, current multimodal perception fusion is only post-fusion: it lacks intermediate feature-level fusion and pays little attention to global visuospatial information. To solve these problems, we propose a new network architecture called FlyTransformer, which fuses the proprioceptive state and visual perception at the feature level for end-to-end trajectory planning and attends to key visuospatial information. We evaluate our method in forest and cuboid scenarios and their corresponding outdoor scenarios. The results show that FlyTransformer outperforms other baseline algorithms in terms of efficiency and performance.
|
|
08:00-09:00, Paper TU-PS00T2.5 |
Question-Guided Graph Convolutional Network for Visual Question Answering Based on Object-Difference |
|
Minchang, Huangfu | Qilu University of Technology |
Geng, Yushui | Qilu University of Technology (Shandong Academy of Sciences) |
|
08:00-09:00, Paper TU-PS00T2.6 |
Network Anomaly Detection with Stacked Sparse Shrink Autoencoders and Improved XGBoost |
|
Bi, Jing | Beijing University of Technology |
Guan, Ziyue | Beijing University of Technology |
Yuan, Haitao | Beihang University |
Zhang, Jia | Southern Methodist University |
Keywords: Intelligent Internet Systems, Neural Networks and their Applications, Hybrid Models of Neural Networks, Fuzzy Systems, and Evolutionary Computing
Abstract: Efficient and accurate identification of network anomalies is of great significance to the construction of network security systems in the information age. It is highly challenging to accurately detect abnormal behaviors in ever-increasing network data. Classification methods based on autoencoder feature extraction have been proven to be suitable for network anomaly detection. However, traditional autoencoder-based detection models have poor detection accuracy in the face of massive network features. In addition, the hyperparameter optimization of these models cannot be effectively solved. This work proposes a new network anomaly detection method named SAXP. SAXP integrates Stacked sparse shrink Autoencoders and an unbalanced XGBoost model based on genetic simulated annealing Particle swarm optimization (GSPSO). Specifically, features extracted by stacked sparse shrink autoencoders are fed into the XGBoost model with improved unbalance parameters for classification, and GSPSO is used to optimize the hyperparameters of XGBoost. Experimental results on two real-life data sets demonstrate that the proposed SAXP achieves higher recognition accuracy than several state-of-the-art algorithms.
|
|
08:00-09:00, Paper TU-PS00T2.7 |
Group Lasso with Checkpoints Selection for Biological Data Regression |
|
Zhan, Huixin | Texas Tech University |
Yifan, Wang | Texas Tech University |
Keywords: Biometric Systems and Bioinformatics, Machine Learning, AI and Applications
Abstract: Two unique characteristics of biological data are that (1) they are always High-Dimension and Low-Sample-Size (HDLSS) and (2) their distribution changes, e.g., through class imbalance and distribution and covariate shifts. In this paper, we propose a Group Lasso with Checkpoints SElection (GL CSE) algorithm to tackle both issues. To address the first issue, we utilize a group Lasso regression model tailored for HDLSS data to perform feature selection on predefined groups of features, alleviating overfitting and being invariant under group-wise orthogonal reparameterizations. To address the second issue, we propose a checkpoint selection method that extracts important model checkpoints while training on group Lasso via two proposed metrics, i.e., the average KL-divergence between training and validation features and the Frobenius error of the covariance matrices between training and validation features. Both metrics aim to select model checkpoints with minimal drifts between the training and validation features. The results of our experiments indicate that our proposed GL CSE algorithm achieves better performance than other baseline methods in terms of the MSE and R2 measurements. Specifically, on the biological age dataset, our GL CSE method achieves 0.8799 and 0.9883 for the MSE and R2 measurements, respectively. We also show that our proposed checkpoint selection method performs better than regular K-fold cross-validation. Specifically, on the biological age dataset, GL CSE (Q2) achieves 0.9045 MSE and 0.9880 R2, which outperforms the regular K-fold cross-validation results of 1.0612 MSE and 0.9871 R2.
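The two drift metrics named in this abstract can be sketched in a few lines. The version below is an assumption-laden illustration, not the authors' implementation: it fits an independent Gaussian to each feature before taking the KL divergence, uses population covariances, and all function names are hypothetical.

```python
import math


def column_stats(rows):
    """Per-feature mean and population variance of equal-length rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    variances = [sum((r[j] - means[j]) ** 2 for r in rows) / n for j in range(d)]
    return means, variances


def mean_gaussian_kl(train, val, eps=1e-8):
    """Average over features of KL(N_train || N_val) for univariate Gaussians."""
    (m1, v1), (m2, v2) = column_stats(train), column_stats(val)
    total = 0.0
    for a, s1, b, s2 in zip(m1, v1, m2, v2):
        s1, s2 = s1 + eps, s2 + eps   # eps guards against zero variance
        total += 0.5 * (math.log(s2 / s1) + (s1 + (a - b) ** 2) / s2 - 1.0)
    return total / len(m1)


def covariance(rows):
    """Population covariance matrix as a list of lists."""
    n, d = len(rows), len(rows[0])
    means, _ = column_stats(rows)
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / n
             for j in range(d)] for i in range(d)]


def frobenius_cov_error(train, val):
    """Frobenius norm of the difference between the two covariance matrices."""
    c1, c2 = covariance(train), covariance(val)
    return math.sqrt(sum((x - y) ** 2
                         for r1, r2 in zip(c1, c2) for x, y in zip(r1, r2)))


# Toy data: validation features are the training features shifted by +1.
train = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
val = [[x + 1.0 for x in row] for row in train]
kl_drift = mean_gaussian_kl(train, val)
cov_drift = frobenius_cov_error(train, val)
print(kl_drift, cov_drift)
```

A pure shift moves the means but not the covariances, so only the KL score reacts to it; the two scores are therefore complementary. A checkpoint selector would keep the checkpoints whose scores are smallest.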
|
|
08:00-09:00, Paper TU-PS00T2.8 |
Contour Detection from Ultrasound Kidney Images with a Coarse-To-Fine Approach |
|
Tao, Peng | Soochow University |
Gu, Yidong | Suzhou Municipal Hospital |
Xu, Yanqing | UT Southwestern Medical Center |
Wang, Caishan | The Second Affiliated Hospital of Soochow University |
Zhang, Lei | Duke Kunshan University |
Cai, Jing | Hong Kong Polytechnic University |
Keywords: AI and Applications, Image Processing and Pattern Recognition
Abstract: Ultrasound kidney image segmentation presents significant challenges due to missing or ambiguous boundaries. In this study, we introduce a coarse-to-fine approach incorporating four novel aspects. Firstly, we leverage the properties of a principal curve (PC) to automatically fine-tune the curve shape and employ a neural network's learning ability to reduce model error. Secondly, a deep fusion learning network is utilized for the coarse segmentation step, incorporating a parallel architecture to enhance deep-learning performance. Thirdly, addressing the limitation of standard PC-based methods in determining the number of vertices automatically, we propose an automatic searching polygon tracking method using a mean shift clustering-based approach to replace the projection and vertex extension step in standard PC-based methods. Lastly, we develop an explainable mathematical map function for the kidney contour, as denoted by the neural network output (i.e., optimized vertices), which aligns well with the ground truth contour. We conducted various experiments to evaluate our method's performance, demonstrating its effectiveness in ultrasound kidney image segmentation.
|
|
TU-PS00T3 Room T3 |
Supplementary Session 13 |
|
|
|
08:00-09:00, Paper TU-PS00T3.1 |
Cost-Effective and Dynamic Migration for Microservices in Hybrid Mobile Cloud-Edge System |
|
Zhai, Jiahui | Beijing University of Technology |
Bi, Jing | Beijing University of Technology |
Yuan, Haitao | Beihang University |
Zhang, Jia | Southern Methodist University |
Keywords: Intelligent Internet Systems, Cloud, IoT, and Robotics Integration, Evolutionary Computation
Abstract: Mobile edge computing (MEC), as a promising paradigm, delivers computation and storage capacities at the edge of the network. It supports delay-sensitive services for mobile users (MUs). However, dynamic and stochastic characteristics of MEC networks necessitate constant migration of installed services across edge servers to keep up with the mobility of MUs. As a result, the cost of maintaining the network increases significantly. Existing studies of MEC rarely consider the cost of service migration due to MU mobility. To minimize the long-term cost for microservices in a hybrid cloud-edge system comprising MUs, small base stations (SBSs), and a cloud data center (CDC), the total cost minimization is formulated as a constrained mixed-integer nonlinear program. To solve it, this work designs a novel meta-heuristic optimization algorithm called Multi-swarm Gray-wolf-optimizer based on Genetic-learning (MGG), which effectively combines the strong local search capabilities of the gray wolf optimizer with the superior global search capabilities of the genetic algorithm. MGG simultaneously optimizes service request routing among MUs, SBSs, and CDC, CPU speeds of SBSs, service deployment of SBSs, service migration cost of SBSs, as well as MUs’ transmission power and channel bandwidth allocation. Simulation results with Google cluster trace demonstrate that MGG outperforms several state-of-the-art peers with respect to the overall cost of the hybrid system.
|
|
08:00-09:00, Paper TU-PS00T3.2 |
Thyroid Nodule Classification in Ultrasound Videos by Combining 3D CNN and Video Transformer |
|
Huang, Jing | Wuhan University of Technology |
Chen, Tianyu | Wuhan University of Technology |
Jiang, Wen | SonoScape |
Zhang, Hewei | Wuhan University of Technology |
Wang, Ruoqi | Wuhan University of Technology |
Keywords: Medical Informatics
Abstract: Diagnosing thyroid nodules with computer-aided techniques remains a challenging task. Using ultrasound videos for the classification of benign and malignant nodules can provide valuable timing and change information that is consistent with clinical diagnosis. In this paper, we propose a novel thyroid nodule classification model based on ultrasound video. To capture different semantic information, we sample video frames at various time intervals and extract local and global features using two different feature extraction branches. Our experimental results show that our method outperforms existing state-of-the-art methods, demonstrating its effectiveness in accurately diagnosing thyroid nodules with ultrasound videos.
|
|
08:00-09:00, Paper TU-PS00T3.3 |
AFPN: Asymptotic Feature Pyramid Network for Object Detection |
|
Yang, Guoyu | Zhejiang University of Technology |
Lei, Jie | Zhejiang University of Technology |
Zhu, Zhikuan | Zhejiang University of Technology |
Cheng, Siyu | Zhejiang University of Technology |
Zunlei, Feng | Zhejiang University |
Liang, Ronghua | Zhejiang University of Technology |
Keywords: Image Processing and Pattern Recognition, Machine Vision, Neural Networks and their Applications
Abstract: Multi-scale features are of great importance in encoding objects with scale variance in object detection tasks. A common strategy for multi-scale feature extraction is adopting the classic top-down and bottom-up feature pyramid networks. However, these approaches suffer from the loss or degradation of feature information, impairing the fusion effect of non-adjacent levels. This paper proposes an asymptotic feature pyramid network (AFPN) to support direct interaction at non-adjacent levels. AFPN is initiated by fusing two adjacent low-level features and asymptotically incorporates higher-level features into the fusion process. In this way, the larger semantic gap between non-adjacent levels can be avoided. Given the potential for multi-object information conflicts to arise during feature fusion at each spatial location, adaptive spatial fusion operation is further utilized to mitigate these inconsistencies. We incorporate the proposed AFPN into both two-stage and one-stage object detection frameworks and evaluate with the MS-COCO 2017 validation and test datasets. Experimental evaluation shows that our method achieves more competitive results than other state-of-the-art feature pyramid networks. The code is available at https://github.com/gyyang23/AFPN.
|
|
08:00-09:00, Paper TU-PS00T3.4 |
Comparative Study on Different Types of Surrogate-Assisted Evolutionary Algorithms for High-Dimensional Expensive Problems |
|
Qiao, Zhuo-Yin | Nanjing University of Information Science and Technology |
Yang, Qiang | Nanjing University of Information Science and Technology |
Gao, Xu-Dong | Nanjing University of Information Science and Technology |
Xu, Peilan | Nanjing University of Information Science and Technology |
Lu, Zhen-Yu | Nanjing University of Information Science and Technology |
Keywords: Evolutionary Computation, Computational Intelligence, Swarm Intelligence
Abstract: Expensive optimization problems (EOPs) are becoming more and more ubiquitous nowadays. To effectively solve such problems, surrogate-assisted evolutionary algorithms (SAEAs) have been developed. Specifically, an SAEA usually maintains a surrogate model to simulate the real objective function of an EOP. Such a surrogate model is trained on real-evaluated solutions. Then, it is used to evaluate the fitness of individuals in the EA in place of the real expensive fitness evaluation. Though many SAEAs have been designed, they mainly concentrate on low-dimensional EOPs with fewer than 300 dimensions. Their performance on large-scale EOPs with more than 300 dimensions is unknown. To fill this gap, this paper conducts a comparative study of two types of state-of-the-art SAEAs, with a total of four algorithms, on four classical EOPs. To make comprehensive comparisons, we vary the dimension size from 50 to 1000. As far as we know, this is the first assessment of SAEAs on EOPs with such a wide range of dimension sizes and such high dimensionality. The comparison results show that the optimization performance of the four compared SAEAs on high-dimensional EOPs with more than 500 dimensions is not as satisfactory as their performance on low-dimensional EOPs because of their slow convergence. Therefore, research on large-scale SAEAs for high-dimensional EOPs still deserves intensive attention.
|
|
08:00-09:00, Paper TU-PS00T3.5 |
Comparative Study on Different Encoding Strategies for Multiple Traveling Salesmen Problem |
|
Dou, Xin-Ai | School of Artificial Intelligence, Nanjing University of Informa |
Yang, Qiang | Nanjing University of Information Science and Technology |
Xu, Peilan | Nanjing University of Information Science and Technology |
Gao, Xu-Dong | Nanjing University of Information Science and Technology |
Lu, Zhen-Yu | Nanjing University of Information Science and Technology |
Keywords: Evolutionary Computation, Computational Intelligence, AI and Applications
Abstract: The multiple traveling salesmen problem (MTSP) is an extension of the traditional traveling salesman problem (TSP). It involves both the city assignment optimization and the route optimization of each salesman. Genetic algorithms (GAs) have been widely used to solve MTSP thanks to their ease of implementation and good global search ability. To help GAs effectively solve MTSP, researchers have developed various encoding schemes. However, there is no systematic and comparative study on the effectiveness of these encoding strategies. To fill this gap, this paper compares four popular encoding strategies for MTSP, namely the one-chromosome encoding, the two-chromosome encoding, the two-part-chromosome encoding and the multi-chromosome encoding. Experimental results on different MTSP instances with different numbers of cities and salesmen show that the multi-chromosome encoding is far better than the other encoding strategies.
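As an illustration of one of the four encodings, a two-part chromosome is usually described as a permutation of the cities followed by one count per salesman that says how many consecutive cities that salesman visits. The sketch below decodes such a chromosome and scores it; the function names, the single shared depot, and the toy data are assumptions, not details from the paper.

```python
import math


def decode_two_part(cities, counts):
    """Split the city permutation into one route per salesman."""
    if sum(counts) != len(cities):
        raise ValueError("counts must sum to the number of cities")
    routes, start = [], 0
    for count in counts:
        routes.append(cities[start:start + count])
        start += count
    return routes


def total_length(routes, coords, depot):
    """Sum of closed-tour lengths, each starting and ending at the depot."""
    total = 0.0
    for route in routes:
        stops = [depot] + [coords[c] for c in route] + [depot]
        total += sum(math.dist(a, b) for a, b in zip(stops, stops[1:]))
    return total


# Toy chromosome: 5 cities, 2 salesmen visiting 2 and 3 cities.
routes = decode_two_part([3, 1, 4, 2, 5], [2, 3])
print(routes)  # [[3, 1], [4, 2, 5]]
```

In a multi-chromosome encoding, by contrast, each salesman's route is stored as its own chromosome, so no count vector is needed.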
|
|
08:00-09:00, Paper TU-PS00T3.6 | Add to My Program |
Unmasking Deception: A Comparative Study of Tree-Based and Transformer-Based Models for Fake Review Detection on Yelp |
|
Wang, Pengqi | The Hong Kong University of Science and Technology (Guangzhou) |
Lin, Yue | The University of Hong Kong (HKU) |
Chai, Junyi | Beijing Normal University - Hong Kong Baptist University United |
Keywords: Application of Artificial Intelligence, Media Computing, Artificial Social Intelligence
Abstract: The increasing prevalence of fake online reviews jeopardizes firms' profits, consumers' well-being, and the trustworthiness of e-commerce ecosystems. We face the significant challenge of accurately detecting fake reviews. In this paper, we undertake a comprehensive investigation of traditional and state-of-the-art machine learning models in classification, based on textual features, to detect fake online reviews. We attempt to examine existing and noteworthy models for fake online review detection, in terms of the effectiveness of textual features, the efficiency of sampling methods, and their detection performance. Adopting a quantitative and data-driven approach, we scrutinize both tree-based and transformer-based detection models. Our comparative studies show that transformer-based models (specifically BERT and GPT-3) outperform tree-based models (i.e., Random Forest and XGBoost) in terms of accuracy, precision, and recall metrics. We use real data from online reviews on Yelp.com for implementation. The results demonstrate that our proposed approach can identify fraudulent reviews effectively and efficiently. Synthesizing ChatGPT-3, tree-based, and transformer-based models for fake online review detection is rather new but promising, and this paper highlights their potential for better detection of fake online reviews.
|
|
08:00-09:00, Paper TU-PS00T3.7 | Add to My Program |
Delay-Aware and Energy-Efficient Task Offloading Based on Adaptive Large Neighborhood Search (I) |
|
Jiang, MingZhong | Guangdong University of Technology |
Lu, AnBang | Guangdong University of Technology |
Zhu, QingHua | Guangdong University of Technology |
Fei, Lunke | Guangdong University of Technology |
Keywords: Cloud, IoT, and Robotics Integration
Abstract: Mobile edge computing boosts the application performance on mobile devices by collaborating with cloud platforms. This paper studies the task offloading and computing resource allocation problem in a multibase, multiserver, and multiuser scenario subject to resource constraints. The goal is to maximize the users’ task offloading utility, including improvements in task completion time, energy consumption, and communication cost. The addressed problem is formulated as a mixed integer nonlinear programming (MINLP) model. In this paper, we decompose the MINLP, and the optimal computing resource allocation policy under a deterministic offloading strategy is obtained by the Karush-Kuhn-Tucker conditions. Then, a hybrid adaptive large neighborhood search (HALNS) algorithm is proposed to conduct task offloading. The adaptive large neighborhood search and the variable neighborhood descent stages are jointly employed in HALNS. The proposed algorithm, an improved simulated annealing algorithm, and a modified variable neighborhood search algorithm are executed to evaluate their performances. Numerical experimental results show that our proposed algorithm achieves higher system utility, lower delays, and less energy consumption.
|
|
08:00-09:00, Paper TU-PS00T3.8 | Add to My Program |
Surpass Teacher: Enlightenment Structured Knowledge Distillation of Transformer |
|
Yang, Ran | Wuhan University of Science and Technology |
Deng, Chunhua | Wuhan University of Science and Technology |
Keywords: Deep Learning, Image Processing and Pattern Recognition, Machine Vision
Abstract: It is difficult to train a trustworthy transformer model on a small image classification dataset. This research proposes a sophisticated structured knowledge distillation algorithm that uses CNNs as Transformer's sophisticated teachers, significantly lowering the number of training datasets needed. To better develop the potential of CNN tutors, this research configures a public data set for CNN teaching as an enlightenment textbook to guide Transformer’s training and avoid falling into local optimization prematurely. The distillation process then employs a "learn-digest-self-distillation" learning strategy to enable the Transformer to assimilate CNN knowledge in a structured manner. Sufficient experiments show that the proposed method is significantly better than the directly trained Transformer under the condition of limited data sets. Moreover, in order to show the practical application value, this research contributes a practical data set for the classification of smoking and calling. The corresponding code and dataset will be released at https://gitee.com/wustdch/surpass-teacher if this paper is accepted.
|
|
TU-PS10T1 Room T1 |
Add to My Program |
Supplementary Session 21 |
|
|
|
-, Paper TU-PS10T1.1 | Add to My Program |
Hybrid Intelligent-Annotation Organ Segmentation on Medical Datasets |
|
Tao, Peng | Soochow University |
Zhao, Jing | Beijing Tsinghua Changgung Hospital |
Gu, Yidong | Suzhou Municipal Hospital |
Di, Gongye | The Affiliated Taizhou People’s Hospital of Nanjing Medical Univ |
Zhang, Lei | Duke Kunshan University |
Cai, Jing | Hong Kong Polytechnic University |
Keywords: Intelligence Interaction, Multimedia Systems, Medical Informatics
Abstract: Ultrasound image segmentation is crucial for early disease detection and treatment planning but remains a challenging task due to the low contrast of organ boundaries and varying image quality. Current methods often require manual intervention or have limited accuracy. In this paper, we propose a novel hybrid framework that combines an automatic option polygon segment (AOPS) algorithm and a distributed- and memory-based evolution (DME) algorithm for precise ultrasound organ segmentation. Our pipeline consists of two cascaded stages: (1) a coarse segmentation step using the AOPS algorithm, which determines the number of vertices/clusters without human intervention, and (2) a refinement step using the DME algorithm to search for the optimal neural network, which is then used to represent a smooth, explainable mathematical expression of the organ boundary. We employ the fractional backpropagation learning network with L2 regularization (FBLN) for training and use the scaled exponential linear unit (SELU) activation function to address the vanishing gradient problem. This is a new attempt in which such a hybrid framework is applied to ultrasound organ segmentation tasks, and it demonstrates significant contributions in terms of accuracy, smoothness, and computational efficiency.
|
|
09:00-10:00, Paper TU-PS10T1.2 | Add to My Program |
Cybernetic Telepresence Humanoid Surgeon Avatar Robotic Astronaut (I) |
|
Jewell, Susan | Avatarmedic Inc |
Jewell, Emmy | MMAARS |
Keywords: AI and Applications, Cloud, IoT, and Robotics Integration, Expert and Knowledge-Based Systems
Abstract: This paper will discuss the current research and projects focusing on the potential for creating a humanoid Cybernetic Surgeon Avatar Robotic Astronaut for future space exploration. To date, exciting technological innovations in telepresence avatars and telerobotic research have enabled the vision of remote, real-time Cybernetic Telepresence Humanoid Avatar (CTHA) technology, which could potentially provide the future “Space Robotic Astronauts” for planetary missions and for missions that are too dangerous for human astronauts. A CTHA is a physical robot that can be substituted for the physical presence of a person. These “humanoid” avatars are integrated with frontier technologies such as Augmented Reality (AR) and Extended Reality (XR), for example spatial computing headset devices such as HoloLens 2, and are equipped with sensors, haptics, and cameras that allow them to perceive their surroundings, move, and interact with the environment in a way similar to a human being. The key difference is that the person controlling the CTHA is typically located remotely, for example in a remote office or at a different geographical site. The technology behind CTHA is made potentially possible by the convergence of several frontier technologies and advances in robotic development. The avatar is typically controlled by a human operator who can see what the CTHA sees through cameras and other sensors and can control its movements and interactions via remote controls or joysticks. Advances in facial recognition and deep Artificial Intelligence (AI) can allow the operator to control the facial expressions of the avatar, allowing it to show human-like expressions and convey a sense of authentic interaction with the person.
This review will explore the concept of CTHA and provide examples of how this technology could be used in different contexts, such as, space exploration, space medicine, space psychiatry and applications for social impact and sustainability.
|
|
09:00-10:00, Paper TU-PS10T1.3 | Add to My Program |
Towards Intelligent Training Systems for Customer Service |
|
Song, Shuangyong | China Telecom Corporation Ltd |
Liu, Shixuan | Australian National University |
Keywords: Human-Computer Interaction, Human-centered Learning
Abstract: Customer service is very important in many industrial fields, and service quality is essential. However, customer service practitioners have a high turnover rate, and it usually takes months for a new customer service employee to become an experienced one. If the training of new employees is conducted by experienced employees, resource consumption is high. Therefore, intelligent training systems for customer service can be designed to replace manual training. In this paper, we define the task of intelligent training for customer service and propose an architecture of intelligent training systems. Dialogue scripts are prepared offline, and a dialogue simulation module and a service evaluation module are separately intended for the online service training and the service quality evaluation. We evaluate state-of-the-art models with respect to the ability to provide service training, and the experimental results show that our proposed system is effective on this task.
|
|
09:00-10:00, Paper TU-PS10T1.4 | Add to My Program |
Binomial Distribution Assisted Individual Selection for Differential Evolution |
|
Ji, Jiawei | Nanjing University of Information Science and Technology |
Yang, Qiang | Nanjing University of Information Science and Technology |
Gao, Xu-Dong | Nanjing University of Information Science and Technology |
Xu, Peilan | Nanjing University of Information Science and Technology |
Lu, Zhen-Yu | Nanjing University of Information Science and Technology |
Keywords: Evolutionary Computation, Metaheuristic Algorithms, Computational Intelligence
Abstract: Mutation plays a crucial role in assisting differential evolution (DE) to effectively solve optimization problems. The key to mutation lies in the selection of parent individuals participating in the mutation. Along this road, this paper devises a binomial distribution-assisted individual selection strategy for DE. Specifically, this paper takes advantage of the probability distribution function of the binomial distribution to assign weights to individuals based on their fitness rankings. In this way, the selection of individuals focuses more on medium better individuals instead of the top best ones. Therefore, high mutation diversity can be preserved and thus it is likely that falling into local regions can be effectively avoided. Embedding this selection strategy into DE, a novel DE variant called binomial distribution assisted DE (BDDE) is developed. Experiments conducted on the CEC2017 benchmark suite have verified the effectiveness of BDDE in solving optimization problems. Particularly, BDDE gains much better performance against the well-known and representative mutation strategies.
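The rank-weighting idea in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' BDDE implementation: ranks run from 0 (best) to NP-1 (worst), the weight of rank r is the binomial probability of r successes in NP-1 trials, and the success probability `p = 0.2` is a hypothetical setting chosen so that the weight peaks slightly behind the very best individuals. The function names are likewise hypothetical.

```python
import math
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def binomial_rank_weights(pop_size: int, p: float = 0.2) -> List[float]:
    """Weight for rank r (0 = best) given by the Binomial(pop_size - 1, p) pmf.

    `p` is an illustrative parameter, not a value taken from the paper.
    """
    n = pop_size - 1
    return [math.comb(n, r) * p ** r * (1.0 - p) ** (n - r) for r in range(pop_size)]


def select_parents(ranked_population: Sequence[T], k: int, p: float = 0.2,
                   rng: Optional[random.Random] = None) -> List[T]:
    """Draw k parents from a population sorted from best to worst.

    Sampling is with replacement; a real DE mutation would usually also
    enforce that the chosen parents are mutually distinct.
    """
    rng = rng or random.Random()
    weights = binomial_rank_weights(len(ranked_population), p)
    return rng.choices(list(ranked_population), weights=weights, k=k)
```

With 11 individuals and `p = 0.2`, the largest weight falls on rank 2 rather than rank 0, which matches the abstract's description of favoring "medium better" individuals over the top ones.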
|
|
09:00-10:00, Paper TU-PS10T1.5 | Add to My Program |
Random Pairwise Competition Based Ant Selection for Pheromone Updating in Ant Colony Optimization |
|
Cao, Hao | Nanjing University of Information Science and Technology |
Yang, Qiang | Nanjing University of Information Science and Technology |
Gao, Xu-Dong | Nanjing University of Information Science and Technology |
Xu, Peilan | Nanjing University of Information Science and Technology |
Lu, Zhen-Yu | Nanjing University of Information Science and Technology |
Zhang, Jun | Hanyang University |
Keywords: Evolutionary Computation, Metaheuristic Algorithms, Computational Intelligence
Abstract: Ant Colony Optimization (ACO) has shown very promising performance in solving the Traveling Salesman Problem (TSP). However, most existing ACO algorithms utilize either the absolutely best ants or all ants to update the pheromone matrix. This leads to either serious diversity loss or slow convergence. To alleviate these predicaments, this paper designs a random pairwise competition based ant selection for pheromone updating. Specifically, a number of ants are randomly selected from the ant colony and then are randomly paired together. Subsequently, the better one in each pair is selected to update the pheromone matrix. In this way, a good balance between search diversity and search convergence is potentially maintained. Integrating this selection strategy along with a local search scheme into the ACO framework, a new ACO algorithm called random pairwise competition based ACO (RPCACO) is developed. Experiments conducted on 8 TSP instances from the TSPLIB benchmark set demonstrate that RPCACO is more effective and efficient than the five classical ACO algorithms in solving TSP.
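The selection step described in this abstract can be illustrated with a short sketch. It is not the RPCACO code: the function name, the requirement that the number of selected ants be even, and the tie-breaking rule are assumptions made for the example, and shorter tour length is taken to mean a better ant.

```python
import random
from typing import List, Optional, Sequence


def pairwise_competition_winners(tour_lengths: Sequence[float], n_selected: int,
                                 rng: Optional[random.Random] = None) -> List[int]:
    """Return indices of the ants allowed to update the pheromone matrix.

    n_selected ants are drawn at random without replacement, paired at random,
    and the ant with the shorter tour in each pair wins (ties go to the first).
    """
    if n_selected % 2 != 0 or n_selected > len(tour_lengths):
        raise ValueError("n_selected must be even and at most the colony size")
    rng = rng or random.Random()
    # A random sample is already in random order, so consecutive
    # entries form random pairs.
    chosen = rng.sample(range(len(tour_lengths)), n_selected)
    winners = []
    for i in range(0, n_selected, 2):
        a, b = chosen[i], chosen[i + 1]
        winners.append(a if tour_lengths[a] <= tour_lengths[b] else b)
    return winners
```

Because winners are decided within random pairs, a mediocre ant can still contribute when it is paired with a worse one, while the worst ant among those selected never does.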
|
|
09:00-10:00, Paper TU-PS10T1.6 | Add to My Program |
Investigation of Using Large-Scale Swarm Optimizers to Optimize Sub-Problems in Cooperative Coevolution |
|
Lu, Ming-Yuan | Henan Normal University |
Yang, Qiang | Nanjing University of Information Science and Technology |
Liu, Dong | Henan Normal University |
Ma, Yuan-Yuan | Henan Normal University |
Li, Tao | Henan Normal University |
Zhang, Jun | Hanyang University |
Keywords: Swarm Intelligence, Evolutionary Computation, Metaheuristic Algorithms
Abstract: Cooperative co-evolutionary algorithms (CCEAs) have witnessed giant success in solving large-scale optimization problems (LSOPs). However, most existing CCEAs use low-dimensional EAs to optimize the decomposed sub-problems. Such utilization of low-dimensional EAs may limit the effectiveness of CCEAs because some of the decomposed sub-problems may still be high-dimensional. Since there exist many non-decomposition based large-scale EAs, it is interesting to investigate the optimization effectiveness of CCEAs by using these non-decomposition based large-scale EAs to solve the decomposed sub-problems. To this end, this paper incorporates two state-of-the-art large-scale swarm optimizers into CCEAs with five state-of-the-art decomposition strategies to solve LSOPs. Experiments conducted on the CEC'2010 and CEC'2013 LSOP benchmark sets have shown that the two large-scale swarm optimizers help CCEAs with the five decomposition strategies achieve much better performance than the most widely used low-dimensional EA.
|
|
09:00-10:00, Paper TU-PS10T1.7 | Add to My Program |
Stochastic Dominant Cognitive Experience Guided Particle Swarm Optimization |
|
Pan, Hanyang | Henan Normal University |
Yang, Qiang | Nanjing University of Information Science and Technology |
Li, Ming | Henan Normal University |
Zhang, En | College of Computer and Information Engineering, Henan Normal Un |
Ma, Yuan-Yuan | Henan Normal University |
Li, Tao | Henan Normal University |
Liu, Dong | Henan Normal University |
Zhang, Jun | Hanyang University |
Keywords: Swarm Intelligence, Evolutionary Computation, Computational Intelligence
Abstract: This paper proposes a stochastic dominant cognitive experience-guided learning framework for particle swarm optimization (SDCEGPSO) to enhance its search ability in complex environments. Specifically, different from classical PSOs, SDCEGPSO randomly selects dominant cognitive experiences to guide the learning of particles. To this end, the cognitive experiences of all particles, namely their personal best positions, are sorted from the best to the worst. Then, each particle randomly chooses a personal best position better than its own to learn from. For the cognitive experience selection, this paper designs three selection methods, namely the random selection, the roulette wheel selection, and the tournament selection. With this learning framework, particles have diverse guiding exemplars to learn from and thus high search diversity is expected to be maintained. Experiments conducted on the 50-D and 100-D CEC2014 problem suite have verified the effectiveness of SDCEGPSO. Compared with the classical global PSO (GPSO) and local PSO (LPSO), SDCEGPSO with the three selection schemes achieves significantly better performance. Besides, among the three selection schemes, the binary tournament selection is the most effective one to help SDCEGPSO solve optimization problems.
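The exemplar selection described in this abstract might look like the sketch below. It covers only two of the three schemes (random and binary tournament) and assumes minimization. The handling of the best-ranked particle, which has no better personal best to learn from, is not stated in the abstract; letting it learn from itself is an assumption made here. All function names are hypothetical.

```python
import random
from typing import List, Optional, Sequence


def rank_particles(pbest_fitness: Sequence[float]) -> List[int]:
    """Particle indices sorted from best to worst personal best (minimization)."""
    return sorted(range(len(pbest_fitness)), key=lambda i: pbest_fitness[i])


def choose_exemplar_rank(own_rank: int, method: str = "tournament",
                         rng: Optional[random.Random] = None) -> int:
    """Pick the rank of a personal best that dominates the particle's own.

    own_rank is the 0-based position of the particle in the best-to-worst order.
    """
    rng = rng or random.Random()
    if own_rank == 0:
        return 0  # assumption: the best particle keeps its own personal best
    if method == "random":
        return rng.randrange(own_rank)
    if method == "tournament":
        # Binary tournament among the dominant personal bests: lower rank wins.
        return min(rng.randrange(own_rank), rng.randrange(own_rank))
    raise ValueError(f"unknown method: {method}")
```

A particle at position `own_rank` in the ordering would then learn from the personal best of particle `order[choose_exemplar_rank(own_rank)]`, where `order = rank_particles(pbest_fitness)`.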
|
|
09:00-10:00, Paper TU-PS10T1.8 | Add to My Program |
Temporal Aggregation with Context Focusing for Few-Shot Video Object Detection |
|
Han, Wentao | Zhejiang University of Technology |
Lei, Jie | Zhejiang University of Technology |
Wang, Fahong | Zhejiang University of Technology |
Zunlei, Feng | Zhejiang University |
Liang, Ronghua | Zhejiang University of Technology |
Keywords: Image Processing and Pattern Recognition, Representation Learning, Neural Networks and their Applications
Abstract: Few-shot video object detection focuses on finding all the objects in a given query video that belong to the same class, given only a few support images of the target object in an unseen class. Unfortunately, due to the object blur or occlusion in video frames, using single-frame object detection directly will greatly limit the accuracy. The issue is significantly worse in few-shot settings due to insufficient support and time-domain information. In this paper, we propose a temporal aggregation with context focusing framework (TACF) for few-shot video object detection, which aims to fully use the information between support images and adjacent video frames. The context focusing module effectively encodes the target object in adjacent frames according to the support images. Afterward, the temporal aggregation module implicitly extracts the most similar ROI features from these adjacent frames to obtain the target proposals. In the end, the matching network determines the category and bounding box by calculating the distance with the support images. Extensive experimental evaluations on FSVOD and FSYTV databases show that our method achieves more competitive results than image-based methods, naive video-based extensions, and the state-of-the-art few-shot video object detection method.
|
|
TU-PS10T2 Room T2 |
Add to My Program |
Supplementary Session 22 |
|
|
|
-, Paper TU-PS10T2.1 | Add to My Program |
DS-Point: A Dual-Scale 3D Framework for Point Cloud Understanding |
|
Zhang, Renrui | The Chinese University of Hong Kong |
Zeng, Ziyao | ShanghaiTech University |
Guo, Ziyu | Peking University |
Chen, Borui | University of Electronic Science and Technology of China |
Zhang, Guangnan | Baoji University of Arts and Science |
Liu, Xilan | Baoji University of Arts and Science |
Keywords: Deep Learning, Machine Vision, Representation Learning
Abstract: Compared with grid-based 2D images, processing 3D point clouds is more challenging due to their irregular distribution and intricate spatial information. Most prior works introduce delicate designs on either local feature aggregators or global geometric architecture, but few combine two scales effectively. Therefore, to better incorporate the advantages of both local and global processing, we propose DS-Point, a dual-scale 3D framework for point cloud understanding. DS-Point firstly disentangles 3D features from channel dimension for concurrent dual-scale modeling, i.e., point-wise convolution for local fine-grained geometry parsing, and voxel-wise attention for global long-range spatial exploration. Upon that, an HF-fusion module is proposed to enhance the cross-modal interaction and thoroughly blend the dual-scale features. Then, with task-specific heads for different downstream tasks, DS-Point serves as an effective 3D framework for feature extraction. By the dual-scale paradigm, DS-Point achieves superior performance on multiple downstream tasks, e.g., 93.8% for shape classification on ModelNet40, 84.9% on ScanObjectNN, and 84.3% on ShapeNetPart.
|
|
09:00-10:00, Paper TU-PS10T2.2 | Add to My Program |
Blockchain-Based Multi-Cloud Data Storage System Disaster Recovery |
|
Wang, Feiyu | Inner Mongolia University |
Zhou, Jiantao | Inner Mongolia University |
Keywords: System Architecture, Distributed Intelligent Systems, Large-Scale System of Systems
Abstract: Cloud storage services have been used by most businesses and individual users. However, data loss, service interruptions and cyber attacks often lead to cloud storage services not being provided properly, and these incidents have caused financial losses to users. Moreover, traditional and single-cloud model disaster recovery services are no longer suitable for the current complex cloud storage systems. Therefore, a scheme to provide disaster recovery for cloud storage services in a multi-cloud storage environment is needed in real production. In this paper, we propose a disaster recovery scheme based on blockchain technology. The proposed scheme aims to address the issue of data availability within the cloud storage landscape. It achieves this goal by dividing data into hot and cold categories, verifying the integrity of copy data via blockchain technology, and utilizing blockchain networks to manage multi-cloud storage systems. Experimental findings demonstrate that the proposed scheme yields superior results in terms of computation and time overheads.
|
|
09:00-10:00, Paper TU-PS10T2.3 | Add to My Program |
Gender-Sensitive EEG Channel Selection for Emotion Recognition Using Enhanced Genetic Algorithm |
|
Duan, Danting | Key Laboratory of Media Audio & Video, Communication University |
Sun, Bing | College of Computer and Information, Henan Normal University |
Yang, Qiang | Nanjing University of Information Science and Technology |
Zhong, Wei | State Key Laboratory of Media, Convergence and Communication, Com |
Ye, Long | State Key Laboratory of Media, Convergence and Communication, Com |
Zhang, Qin | State Key Laboratory of Media, Convergence and Communication, Com |
Zhang, Jun | Hanyang University |
Keywords: Brain-Computer Interfaces, Affective Computing
Abstract: EEG channel selection can reduce data redundancy, thereby beneficial for improving the utility and efficiency of emotion recognition. Previous studies on EEG channel selection have not considered the influence of genders despite long-standing belief in gender differences with respect to emotion analysis. In this paper, we collected EEG signals from 20 subjects containing 10 males and 10 females by letting them watch short emotional videos. Then, to reduce data redundancy, we propose an enhanced genetic algorithm to select the optimal channel subsets separately for male and female subjects by incorporating a novel evolution operation. Experimental results show that the proposed algorithm achieves higher accuracy in terms of emotion recognition than several compared methods with a smaller channel subset. Besides, experimental results also indicate that the gender differences in neural patterns indeed exist. Through this study, the gender-sensitive channel selection offers a new avenue for further development of EEG based emotion recognition.
|
|
09:00-10:00, Paper TU-PS10T2.4 | Add to My Program |
Design and Visualization of a Knowledge Graph Based on Hematology Data: Management of Anemia in Adults (I) |
|
Despres, Sylvie | Sorbonne Paris Nord University |
Hodroj, Soulaymane | Université Sorbonne Paris Nord, LIMICS, INSERM UMRS 1142 |
Hamadi Piriou, Chiraz | Université Sorbonne Paris Nord, LIMICS, INSERM UMRS 1142 |
Keywords: Design Methods, Interactive Design Science and Engineering, Medical Informatics
Abstract: Several clinical decision support systems (CDSSs) have been developed to help practitioners in their diagnostic procedures in order to achieve the best care. This article presents the construction work of a knowledge graph within the framework of a CDSS dedicated to non-hematologist physicians whose main objective is to determine the urgent situations in hematology. First, we recall the basic notions concerning the different approaches of CDSSs. After having identified the skills issues and specified the need, we describe our first work on the construction of this graph based on the latest scientific data concerning the management of anemia and the proposed topology. We finally pass to the validation of the graph as well as its visualization.
|
|
09:00-10:00, Paper TU-PS10T2.5 | Add to My Program |
Knowledge Graph and Ontology for Representing CLL Data (I) |
|
Despres, Sylvie | Sorbonne Paris Nord University |
Hodroj, Soulaymane | Université Sorbonne Paris Nord, LIMICS, INSERM UMRS 1142 |
Hamadi Piriou, Chiraz | Université Sorbonne Paris Nord, LIMICS, INSERM UMRS 1142 |
Keywords: Design Methods, Medical Informatics, Interactive Design Science and Engineering
Abstract: This work is part of a project aiming to identify a profile of patients with an indolent form of chronic lymphocytic leukemia (CLL) using the MTS assay as a predictive marker. We describe the first results related to the construction of a knowledge graph representing data from heterogeneous data sources. After identifying the competency questions defining the prediction of the evolution of CLL, we propose a model to represent patient data in a knowledge graph, and we write the first expert rules to predict the disease progression.
|
|
09:00-10:00, Paper TU-PS10T2.6 | Add to My Program |
Secure and Efficient Group Decision-Making with Blockchain-Based Consensus and Trust Management |
|
Hassani, Hossein | University of Windsor |
Razavi-Far, Roozbeh | University of New Brunswick |
Saif, Mehrdad | University of Windsor |
Herrera Viedma, Enrique | University of Granada (Spain) |
Keywords: Computational Intelligence, Expert and Knowledge-Based Systems, Application of Artificial Intelligence
Abstract: Trust-building is of paramount importance for managing and improving consensus in group decision-making (GDM). This mechanism usually involves a trust propagation process for estimating the level of trust among decision-makers (DMs). However, this process is computationally expensive and hinders the speed of consensus reaching. To address this issue, this work proposes a novel trust-building mechanism that does not rely on the trust propagation process to quantify DMs' level of trust. Instead, it makes use of Blockchain technology to facilitate communication between the moderator and the group of DMs. This novel trust-building mechanism does not rely on trust propagation, which makes it computationally efficient for building trust among DMs while also providing a secure and efficient communication protocol to accelerate the consensus-reaching process. The proposed GDM model is illustrated through an example, and the sensitivity of the model to various assumptions is analyzed, demonstrating the practical applicability of this approach.
|
|
09:00-10:00, Paper TU-PS10T2.7 | Add to My Program |
Autonomous Decision Making with Reinforcement Learning in Multi-UAV Air Combat (I) |
|
Feng, Xutao | Beihang University |
Ma, Yaofei | Beihang University |
Zhao, Liping | Beihang University |
Yang, Hanbo | National Key Laboratory of Modeling and Simulation for Complex S |
Keywords: AI and Applications, Agent-Based Modeling, Deep Learning
Abstract: A multi-agent decision network based on QMIX is proposed in this paper to cope with the coordination decision problem of multiple UAV air combat missions. To speed up the training process, three improvements are introduced: 1) An improved epsilon-decaying method that enables a tutor to help in action selection at the early stage of the training. This measure greatly improves the exploration efficiency when the network is far from being fully trained; 2) State pruning and action mask measures are applied during the training. The former improves the effectiveness of the input state information, and the latter reduces unnecessary action exploration; 3) A gradual training configuration is used to make the training process more robust, where the combat adversaries are configured as static targets, randomly maneuvering vehicles, and Min-Max strategy vehicles respectively. The multi-UAV air combat scenarios are built and the experiments are conducted. The results show that these improvements have significantly improved training efficiency.
|
|
09:00-10:00, Paper TU-PS10T2.8 | Add to My Program |
Photovoltaic Power Forecast Based on Gated Recurrent Unit and Wavelet Transform (I) |
|
Chang, Yu Ming | Industrial Technology Research Institute (ITRI) |
Chen, Chao-Rong | National Taipei University of Technology |
Brice, Ouedraogo | National Taipei University of Technology |
Chou, Chih-Ju | National Taipei University of Technology |
Lee, Ching-Yin | Tungnan University |
Keywords: Deep Learning, Machine Learning, Neural Networks and their Applications
Abstract: Solar power generation varies with time and several meteorological parameters, making its prediction difficult. This paper proposes a hybrid model combining gated recurrent units (GRU) with wavelet transform, called Wavelet-GRU, for short-term solar power generation prediction. The proposed method is optimized with the hyperband algorithm and implemented with TensorRT for effective inference on GPU. Forecasts from ten minutes to two hours ahead, using actual data collected by a power plant in central Taiwan, demonstrate that the proposed hybrid model can effectively predict upcoming solar power generation over short-term horizons. The proposed approach achieves a 5.28% error rate for 60-min-ahead forecasting and provides a valuable contribution to accurate solar power forecasting as well as an important reference for dispatching.
|
|
TU-PS10T3 Room T3 |
Add to My Program |
Supplementary Session 23 |
|
|
|
-, Paper TU-PS10T3.1 | Add to My Program |
CMM: Code-Switching with Manifold Mixup for Cross-Lingual Spoken Language Understanding |
|
Mao, Tianjun | Fudan University |
Zhang, Chenghong | Fudan University |
Keywords: Human-Machine Interaction, Human Perception in Multimedia, Human-Computer Interaction
Abstract: Spoken language understanding (SLU) is a task that typically involves intent detection and slot filling. Although it has achieved great success in high-resource languages, it remains challenging in low-resource languages due to the lack of labeled training data. Consequently, there is growing interest in code-switching methods for zero-shot cross-lingual SLU to tackle the challenge in low-resource languages. However, despite the success of existing models with code-switching methods, most of them do not address the difficulty of learning from code-switched utterances. To tackle this issue, we propose a framework called Code-Switching with Manifold Mixup for zero-shot cross-lingual spoken language understanding (CMM) that simplifies the learning task for the model. Specifically, we apply both mixup and curriculum learning to dynamically combine information from pure utterances and code-switched utterances. Our experimental results show that the proposed framework significantly improves performance compared to strong baselines and achieves state-of-the-art performance on the MultiATIS++ dataset, with a relative improvement of 3.0% in overall accuracy over the previous best model.
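The mixup operation mentioned in this abstract is, in its general form, a convex combination of two representations. The sketch below shows that interpolation on plain Python lists standing in for the hidden states of a pure utterance and its code-switched counterpart. Which layers CMM mixes, how the mixing coefficient is scheduled by the curriculum, and whether labels are mixed are not stated in the abstract; drawing the coefficient from a Beta(alpha, alpha) distribution follows common mixup practice and is an assumption, as are the function names and `alpha = 0.2`.

```python
import random
from typing import List, Optional, Sequence


def mix_hidden(h_pure: Sequence[float], h_switched: Sequence[float], lam: float) -> List[float]:
    """Interpolate two hidden vectors: lam * h_pure + (1 - lam) * h_switched."""
    if len(h_pure) != len(h_switched):
        raise ValueError("hidden vectors must have the same length")
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return [lam * a + (1.0 - lam) * b for a, b in zip(h_pure, h_switched)]


def sample_lambda(alpha: float = 0.2, rng: Optional[random.Random] = None) -> float:
    """Draw a mixing coefficient from Beta(alpha, alpha); alpha is illustrative."""
    rng = rng or random.Random()
    return rng.betavariate(alpha, alpha)
```

With `lam = 1.0` the result is the pure-utterance representation, and with `lam = 0.0` it is the code-switched one, so a schedule over `lam` is one possible way to move gradually between the two.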
|
|
09:00-10:00, Paper TU-PS10T3.2 | Add to My Program |
A Static Multi-Class Malicious Office Document Detection Method Via Multi-Feature Fusion |
|
Chen, Jia | Beihang University |
Hu, Yang | Beihang University |
Luo, Xin | Chinese Academy of Sciences |
Keywords: Systems Safety and Security
Abstract: Microsoft Office documents have become hackers’ preferred tool to construct malicious documents. However, current research on detecting malicious Office documents has not covered all document formats and various types of malicious attacks. To address this issue, this paper proposes a Static Multi-class Malicious Office document Detection Method (SM2ODM) for multiple versions of Office documents. The focus of this research is to design a unified static feature representation method for multiple versions of Office documents via multi-feature fusion, including VBA (Visual Basic for Applications) code keywords, DDE (Dynamic Data Exchange) instructions, embedded files, OLE (Object Linking and Embedding) objects, external links, and other relevant features. In addition, this research identifies eight new types of malicious features and embedding locations. Then, this paper proposes a multi-class detection method for malicious Office documents that can detect five common types of malicious documents. Through analyzing 20,000 samples provided by Topsec Technologies Group, the proposed SM2ODM achieves high accuracy in multi-classification detection and identifies 185 malicious Office samples that common antivirus software failed to detect.
|
|
09:00-10:00, Paper TU-PS10T3.3 | Add to My Program |
Temporal Feature Mining in Dynamic Graph of Brain Connectivity Data |
|
Liu, Tao | Qilu University of Technology |
Zhang, Guangwei | Qilu University of Technology |
Jing, Ming | Big Data Institute |
Zhang, Li | Qilu University of Technology |
Yu, Jiguo | Qilu University of Technology |
Keywords: Knowledge Acquisition, Machine Learning
Abstract: In recent years, the graph feature mining method of brain connection data based on graph theory has been regarded as a popular and universal technology in the field of neuroscience. How to mine valuable information from brain connection data has become a research hotspot. Current research shows that the pathogenic factors of attention deficit and hyperactivity disorder (ADHD) may be caused by the abnormal connection between brain network structures. In order to find out the pathogenic factors of ADHD patients, we also carried out frequent subgraph mining on the connectivity graph data of brain functional network. By constantly adjusting the support threshold, all the subgraphs of ADHD patients and healthy control group were mined, and the differences in brain region connectivity were successfully found out. By combining the recently introduced neural document embedding model with traditional pattern mining techniques, we regard the brain network connection structure graph as the document and frequent subgraph as the atomic unit of the embedding process. By learning the mapping, each graph can be mapped to a D-dimensional continuous vector. The mapping needs to capture the similarity between the graphs. Feature vectors can be used as the direct input of graph classification in many traditional machine learning methods. Finally, support vector machine in machine learning is used to verify the accuracy of classification, and the results show that the accuracy is high.
|
|
09:00-10:00, Paper TU-PS10T3.4 | Add to My Program |
Exploring Global and Local Information for Anomaly Detection with Normal Samples |
|
Xu, Fan | University of Science and Technology of China |
Wang, Nan | Beijing Jiaotong University |
Zhao, Xibin | Tsinghua University |
Keywords: Fault Monitoring and Diagnosis, System Modeling and Control
Abstract: Anomaly detection aims to detect data that do not conform to regular patterns; such data are also called outliers. The anomalies to be detected often make up a tiny proportion of the data yet contain crucial information, and anomaly detection suits application scenarios such as intrusion detection, fraud detection, fault diagnosis, and e-commerce platforms. However, in many realistic scenarios, only samples following normal behavior are observed, and hardly any anomaly information can be obtained. To address this problem, we propose GALDetector, an anomaly detection method that combines global and local information based on observed normal samples. The proposed method has three stages. First, the global similar normal scores and the local sparsity scores of unlabeled samples are computed separately. Second, potential anomaly samples are separated from the unlabeled samples according to these two scores, and corresponding weights are assigned to the selected samples. Finally, a weighted anomaly detector is trained on a large number of samples and is then used to identify other anomalies. To evaluate the effectiveness of the proposed method, we conducted experiments on three categories of real-world datasets from diverse domains, and experimental results show that our method achieves better performance when compared with other state-of-the-art methods.
|
|
09:00-10:00, Paper TU-PS10T3.5 | Add to My Program |
Agent Based Fetal Face Segmentation for Standard Plane Localization in 3D Ultrasound |
|
Huang, Jing | Wuhan University of Technology |
Wang, Ruoqi | Wuhan University of Technology |
Jiang, Wen | SonoScape |
Shao, Sen | Wuhan University of Technology |
Chen, Tianyu | Wuhan University of Technology |
Keywords: Medical Informatics, Information Visualization, Visual Analytics/Communication
Abstract: In practice, fetal 3D ultrasound can have difficulty in accurately detecting labels for auxiliary standard cut plane localization because of mass loss. Therefore, in this paper, we propose a new segmentation-based reinforcement learning framework for automatically localizing the standard plane of the face: in 3D fetal ultrasound, the initial plane is localized based on anatomical landmarks of mass and geometric relationships, agents navigate through visual segmentation to automatically localize the standard plane, and bound ultrasound views are presented to show the resultant plane. This study was extensively validated on an in-house large dataset. The accuracy of this automatic localization of 3D ultrasound standard planes with sonographer-calibrated median sagittal views of the face, horizontal transverse views of both eyeballs, and coronal views of the nasolabial was 6.64°/5.65 mm, 7.04°/3.58 mm, and 5.14°/4.26 mm, respectively, with success rates of 66.67%, 78.38%, and 80.41%, respectively. The experimental results verify that this system can effectively improve navigation performance.
|
|
09:00-10:00, Paper TU-PS10T3.6 | Add to My Program |
EEG-Based Emotion Analysis Using Person-Event Network |
|
Tang, Liwei | Tongji University |
He, Lianghua | Tongji University |
Keywords: Brain-Computer Interfaces
Abstract: Brain-computer interface (BCI) technology has attracted a lot of attention in recent years. Emotion recognition based on electroencephalography (EEG) is a typical application of BCI. Traditional emotion recognition methods mainly focus on time-domain and frequency-domain features, while spatial information is often ignored. In this paper, to make use of spatial features, we propose a new convolutional neural network that uses not only temporal features but also person-related and event-related features. Depthwise convolution and separable convolution are also used for feature extraction. To verify the effectiveness of our method, we conduct extensive experiments on the public datasets DEAP and DREAMER. Compared with other methods, our method achieves state-of-the-art performance.
|
|
09:00-10:00, Paper TU-PS10T3.7 | Add to My Program |
Where Characteristics Effect Night-To-Day Translation Performance? (I) |
|
Yan, Lan | Hunan University |
Zheng, Wenbo | Wuhan University of Technology |
Li, Kenli | Hunan University |
Keywords: Deep Learning, Machine Vision, Representation Learning
Abstract: Inspired by the huge success of generative adversarial networks (GANs), GAN-based night-to-day translation methods have achieved excellent results. However, these methods have not been well visualized or understood, so the characteristics that affect their night-to-day translation performance have not been identified. To this end, we present a simple clustering-based parsing approach to effectively understand the internal representations of the GAN-based night-to-day translator. In particular, we first cluster the internal representations of specific layers of the translator into a number of classes. Then, according to the proposed selection strategy, the class that has the most significant impact on the translation performance can be identified. Through experiments on three publicly available datasets, we find that the answer to the question in the title is that the characteristics at the junction of bright and dark regions affect the performance of night-to-day translation.
|
|
09:00-10:00, Paper TU-PS10T3.8 | Add to My Program |
Balanced Supervised Contrastive Learning for Skin Lesion Classification (I) |
|
Yan, Lan | Hunan University |
Li, Kenli | Hunan University |
Keywords: Deep Learning, Machine Vision
Abstract: Deep neural networks have emerged as an important tool for computer-aided diagnosis. However, deep models for skin lesion classification still face the challenges of intra-class variation and inter-class similarity, as well as data imbalance. To address these challenges, in this paper, we propose a balanced supervised contrastive learning (BSCL) approach for the skin lesion classification task. Our model consists of two branches for supervised contrastive learning and classification, respectively. The introduced supervised contrastive learning branch helps the network to learn more discriminative representations. Moreover, we design both a category-averaging strategy, which averages the instances of every class in a mini-batch, and a category-complement strategy, which makes all categories appear in each mini-batch, to balance the influence of different skin lesion categories. In addition, we introduce a multi-weighted classification loss to learn a balanced classifier. Extensive experiments on two benchmarks demonstrate that our approach is able to learn strong feature representations and achieve state-of-the-art skin lesion classification performance.
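The category-averaging strategy described above (averaging the instances of every class in a mini-batch) can be sketched as follows. How the averaged representations enter the contrastive loss is not stated in the abstract, so this example only shows the averaging step; the function name `class_means` and the toy labels are hypothetical.

```python
from collections import defaultdict


def class_means(features, labels):
    """Average the feature vectors of each class present in a mini-batch.

    Each class contributes a single mean vector regardless of how many
    instances it has, so frequent classes do not dominate the batch.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        for i, value in enumerate(vec):
            sums[label][i] += value
        counts[label] += 1
    return {label: [v / counts[label] for v in total] for label, total in sums.items()}


if __name__ == "__main__":
    # Toy mini-batch: two instances of class "a", one of class "b".
    batch = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0]]
    print(class_means(batch, ["a", "a", "b"]))
```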
|
|
TU-PS20T1 Room T1 |
Add to My Program |
Supplementary Session 31 |
|
|
|
-, Paper TU-PS20T1.1 | Add to My Program |
Learning Quantum Distributions Based on Normalizing Flow (I) |
|
Li, Li | Tongji University |
Wang, Yong | Tongji University |
Cheng, Shuming | Tongji University |
Liu, Lijun | Shanxi Normal University |
Keywords: Quantum Cybernetics, Quantum Machine Learning
Abstract: Learning many-body quantum systems is of fundamental importance in quantum information processing; however, it is a challenging task that typically requires estimating quantum distributions whose dimensionality scales exponentially with the system size. As generative models have shown great scalability in learning high-dimensional distributions and have found wide applications in the image and text domains, they can be a powerful tool for accomplishing these challenging quantum tasks. In this work, we propose using normalizing flow (NF) models with fast sampling to learn discrete quantum distributions for quantum state tomography. In particular, three NF models, namely denoising flow, argmax flow, and tree flow, are first adapted to the task of explicit quantum probability density estimation. We then perform extensive experiments on a large scale of quantum systems, and our numerical results demonstrate that these discrete NFs admit excellent sampling efficiency, in the sense that they are insensitive to the system size when learning high-dimensional quantum distributions, without compromising learning performance. Finally, in comparison to other generative models, such as autoregressive models, the NFs avoid the problem of slow sequential sampling.
|
|
-, Paper TU-PS20T1.2 | Add to My Program |
Multioutput Surrogate Assisted Evolutionary Algorithm for Expensive Multi-Modal Optimization Problems |
|
Chen, Renzhi | Defense Innovation Institute |
Li, Ke | University of Exeter |
Keywords: Optimization and Self-Organization Approaches, Evolutionary Computation
Abstract: Real-world optimization problems are often computationally expensive and feature multi-modal objective functions. Surrogate-assisted evolutionary optimization has proven to be an effective approach for addressing expensive black-box optimization challenges, but the technique has not been adequately studied in multi-modal situations. In this paper, we propose a simple but effective multi-output surrogate-based approach for empowering surrogate-assisted evolutionary optimization to address expensive multi-modal optimization problems. Specifically, our proposed approach employs a multi-output Gaussian process to capture correlations between data collected from different local areas. Experiments on synthetic benchmark test problems demonstrate the effectiveness of our proposed algorithm against five state-of-the-art peer algorithms.
|
|
10:00-11:00, Paper TU-PS20T1.3 | Add to My Program |
Classification of Haptic Handshake Data for the Control of Human-Telerobot Social Contact Interactions (I) |
|
Brunken, Tomma | Southern Illinois University Edwardsville |
Gorlewicz, Jenna L. | Saint Louis University |
Butts-Wilmsmeyer, Carolyn | Southern Illinois University Edwardsville |
Weinberg, Jerry B. | Southern Illinois University Edwardsville |
Keywords: Telepresence, Shared Control, Human-Computer Interaction
Abstract: Mobile Remote Presence (MRP) robots have emerged out of the need for telepresence in various settings such as the workplace and hospitals. As with face-to-face experiences, these robot mediated encounters have social aspects that current commercially available MRP robots lack the capabilities to incorporate. In previous work, we integrated a manipulator onto a commercial telerobotic platform to enable expressive gestures and demonstrated that the gesturing capabilities enhanced the social connection between remote and local users. However, we also found that controlling the robot for complex interactions, such as a handshake, diminishes the remote user’s social experience. This paper presents the discovery of models for handshakes in different social contexts, which can be used in a shared-control architecture to reduce the effort on the remote user. Using a haptic measurement glove, force and inertia data was collected for human-human handshakes in various social contexts. By applying a k-nearest neighbor algorithm in combination with dynamic time warping and a support vector machine algorithm, two classification models are derived that predict the social context and can be used in an intelligent shared-control robot architecture.
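The pairing of k-nearest neighbor classification with dynamic time warping (DTW) mentioned above can be sketched briefly. DTW aligns two sequences of different lengths before measuring their distance, which suits handshakes of varying duration. The sketch uses single-channel sequences and made-up context labels; the paper's glove data contain force and inertia channels, and its features, labels, and value of k are not given in the abstract, so everything specific here is an assumption.

```python
from collections import Counter


def dtw_distance(a, b):
    """Classic dynamic time warping distance between two 1-D sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] is the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def knn_predict(query, train, k=1):
    """Label a query sequence by majority vote among its k DTW-nearest neighbors.

    train is a list of (sequence, label) pairs.
    """
    ranked = sorted(train, key=lambda item: dtw_distance(query, item[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical force traces; the labels are placeholders, not the paper's.
    train = [([0.0, 1.0, 0.0], "context_a"), ([0.0, 5.0, 0.0], "context_b")]
    print(knn_predict([0.0, 0.0, 1.0, 0.0], train, k=1))
```

The quadratic cost table is acceptable for short recordings; longer sequences would usually call for a warping-window constraint.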
|
|
10:00-11:00, Paper TU-PS20T1.4 | Add to My Program |
Autoencoder and Teaching-Learning-Based Optimizer for Mobile Edge Computing System Optimization Problems (I) |
|
Xu, Dian | Macau University of Science and Technology |
Zhou, Mengchu | New Jersey Institute of Technology |
Yuan, Haitao | Beihang University |
Keywords: Heuristic Algorithms, Hybrid Models of Computational Intelligence, Optimization and Self-Organization Approaches
Abstract: By using an autoencoder as a dimension reduction tool, Autoencoder-embedded Teaching-Learning-Based Optimization (ATLBO) has been shown to be effective in solving high-dimensional computationally expensive problems on several widely used function problems. However, the following two crucial issues have not been resolved: 1) ATLBO should be verified by solving real-life optimization problems; and 2) how autoencoder parameters and structures impact AEO’s performance. In this work, ATLBO is verified on an energy consumption minimization (ECM) problem in mobile edge computing systems. To design an effective autoencoder for ATLBO, this work proposes a parameter tuning optimization strategy for autoencoders. By using the proposed Autoencoder Parameter Tuning (APT) strategy, ATLBO can enjoy higher robustness than those without it. The experimental results show that it is three to six times better than state-of-the-art methods in solving ECM. We consider the strategy-induced overhead and take the execution time as the primary criterion to evaluate them. In addition, the experimental results show that, against the conventional wisdom that higher-accuracy autoencoders bring higher system performance, lower-accuracy ones can actually assist ATLBO in locating the best solutions. This work promotes a novel application of autoencoders in optimization theory and practice.
|
|
10:00-11:00, Paper TU-PS20T1.5 | Add to My Program |
Novel 3D-Aware Composition Images Synthesis for Object Display with Diffusion Model |
|
Chen, Tianrun | Zhejiang University |
Tao, Xu | Huzhou University School of Information Engineering |
Ye, Yiyu | KOKONI, Moxin (Huzhou) Technology Co., LTD |
Mao, Papa | KOKONI, Moxin (Huzhou) Technology Co., LTD |
Zang, Ying | Huzhou University |
Sun, Lingyun | Zhejiang University |
Keywords: Multimedia Systems, Human-Computer Interaction, Design Methods
Abstract: Designing attractive images for object display can be a time-consuming and skill-intensive process. The emergence of advanced algorithms, particularly the Diffusion Model, has made it possible to synthesize attractive images using AI. However, the existing diffusion models are mostly used to generate entire images and lack control over specific objects for object display. Here, to the best of our knowledge, we are the first to extend the application of the diffusion model to synthesize novel images for specific objects. By encoding the input images of objects into a NeRF representation and synthesizing the desired backgrounds using diffusion models with rendered object images and text prompts as input, our method can generate 3D-aware object display images at arbitrary angles and with arbitrary backgrounds. We have conducted extensive experiments to demonstrate that our method generates high-quality, photo-realistic images more than 6 times faster than the conventional photomontage approach. Moreover, our generated images have higher compositional scores, image quality scores, and aesthetics scores in our user experiments. By significantly reducing the need for human effort and producing higher quality generated images, our approach opens up exciting possibilities for creating versatile novel images of specific objects.
|
|
10:00-11:00, Paper TU-PS20T1.6 | Add to My Program |
Hand Gesture Classification Model for Intelligent Wheelchair with Improved Gesture Variance Compensation |
|
Bandara, H.M. Ravindu T. | University of Moratuwa |
Priyanayana, Kodikarage Sahan | University of Moratuwa |
Rajendran, Hoshalarajh | University of Moratuwa |
Pathirana, Chandima | University of Moratuwa |
Jayasekara, Buddhika | University of Moratuwa |
Keywords: Intelligence Interaction, Assistive Technology
Abstract: The rapid increase in the elderly and disabled population has been identified as a growing socioeconomic problem. Due to reasons such as a lack of reliable caretakers and the need to empower the elderly and disabled population, it is important to have assistive devices. The interactive capabilities of these devices should match the nature of the interaction that prospective users would have with their companions. Humans communicate with each other in many modalities, such as speech, hand gestures, head gestures, gaze, etc. Hand gestures have been a popular modality that has been used in these interactive devices for speech and mobility-impaired wheelchair users. There have been many gesture models that have been developed recently for hand gesture-controlled wheelchair navigation. Natural hand gestures that are used in human-human interactions include both static and dynamic gestures. Therefore, it was logical to include both of these gestures in a navigational gesture model or hand gesture-controlled navigational system. However, hand tremors that are prevalent among the elderly and disabled community could affect the nature of the hand gesture. These tremors can vary from person to person, and hence the fixed ranges cannot be used for hand features. Hand features such as palm velocity, fingertip velocity, and others will have different ranges from person to person. Due to these reasons, a static hand gesture intended by the human user could be identified as a dynamic gesture. Further, this could lead to the misrecognition of gestures defined in gesture models. Therefore, a system is proposed in this paper to validate the gestures by considering the activity of hand features in 3D regions defined for the gesture. The accuracies of the improved system for static and dynamic gestures were 0.9849 and 0.9840, which were improvements from the accuracies of 0.8994 and 0.8479.
|
|
TU-PS20T2 Room T2 |
Add to My Program |
Supplementary Session 32 |
|
|
|
-, Paper TU-PS20T2.1 | Add to My Program |
TDID: Transparent and Efficient Decentralized Identity Management with Blockchain |
|
Hao, Jiakun | Peking University |
Gao, Jianbo | Peking University |
Xiang, Peng | Peking University |
Zhang, Jiashuo | Peking University |
Chen, Ziming | Peking University |
Hu, Hao | Nanjing University |
Chen, Zhong | Peking University |
Keywords: Cybernetics for Informatics, Big Data Computing, Cloud, IoT, and Robotics Integration
Abstract: Decentralized identity (DID) is an identity management framework aiming to return the ownership of an identity to its corresponding user. Recent studies propose to store the identifiers of DID issuers and implement identity management systems based on blockchain. However, existing systems cannot avoid identity tampering and verifiable credential abuse of decentralized identities, which makes the identity management opaque. In this paper, we propose TDID, a Transparent and efficient Decentralized IDentity management system with blockchain. The key insight behind TDID is to manage the registration and authentication of DIDs via smart contracts, and design Structured Merkle Patricia Tree (SMPT) as an underlying data structure to store identity data on blockchain. The smart contract based processes can improve transparency of decentralized identity management, while the SMPT data structure can realize efficient storage of DID data. We implement and evaluate TDID on different identity management operations, and the experimental results show that TDID can achieve about 3.1 times for write operation and 6.3 times for read operation while improving the transparency of DID management.
|
|
-, Paper TU-PS20T2.2 | Add to My Program |
Deep3DSketch++: High-Fidelity 3D Modeling from Single Free-Hand Sketches |
|
Zang, Ying | Huzhou University |
Ding, Chaotao | Huzhou University |
Chen, Tianrun | Zhejiang University |
Mao, Papa | KOKONI, Moxin (Huzhou) Technology Co., LTD |
Wenjun, Hu | Huzhou University School of Information Engineering |
Keywords: Multimedia Systems, Design Methods, Virtual/Augmented/Mixed Reality
Abstract: The rise of AR/VR has led to an increased demand for 3D content. However, the traditional method of creating 3D content using Computer-Aided Design (CAD) is a labor-intensive and skill-demanding process, making it difficult for novice users. Sketch-based 3D modeling provides a promising solution by leveraging the intuitive nature of human-computer interaction. However, generating high-quality content that accurately reflects the creator's ideas can be challenging due to the sparsity and ambiguity of sketches. Furthermore, novice users often find it challenging to create accurate drawings from multiple perspectives or follow step-by-step instructions in existing methods. To address this, we introduce Deep3DSketch++, an end-to-end approach that enables 3D modeling from a single free-hand sketch. The issue of sparsity and ambiguity in a single sketch is resolved in our approach by leveraging a symmetry prior and a structure-aware shape discriminator. We conducted comprehensive experiments on diverse datasets, including both synthetic and real data, to validate the efficacy of our approach and demonstrate its state-of-the-art (SOTA) performance. Users are also more satisfied with results generated by our approach according to our user study. We believe our approach has the potential to revolutionize the process of 3D modeling by offering an intuitive and easy-to-use solution for novice users.
|
|
10:00-11:00, Paper TU-PS20T2.3 | Add to My Program |
SSC3OD: Sparsely Supervised Collaborative 3D Object Detection from LiDAR Point Clouds (I) |
|
Han, Yushan | Beijing Jiaotong University |
Zhang, Hui | Beijing Jiaotong University |
Zhang, Honglei | Beijing Jiaotong University |
Li, Yidong | Beijing Jiaotong University |
Keywords: Visual Analytics/Communication
Abstract: Collaborative 3D object detection, with its improved interaction advantage among multiple agents, has been widely explored in autonomous driving. However, existing collaborative 3D object detectors in a fully supervised paradigm heavily rely on large-scale annotated 3D bounding boxes, which is labor-intensive and time-consuming. To tackle this issue, we propose a sparsely supervised collaborative 3D object detection framework SSC3OD, which only requires each agent to randomly label one object in the scene. Specifically, this model consists of two novel components, i.e., the pillar-based masked autoencoder (Pillar-MAE) and the instance mining module. The Pillar-MAE module aims to reason over high-level semantics in a self-supervised manner, and the instance mining module generates high-quality pseudo labels for collaborative detectors online. By introducing these simple yet effective mechanisms, the proposed SSC3OD can alleviate the adverse impacts of incomplete annotations. We generate sparse labels based on collaborative perception datasets to evaluate our method. Extensive experiments on three large-scale datasets reveal that our proposed SSC3OD can effectively improve the performance of sparsely supervised collaborative 3D object detectors.
|
|
10:00-11:00, Paper TU-PS20T2.4 | Add to My Program |
BlindSpotEliminator: Collaborative Point Cloud Perception in Cellular-V2X Networks (I) |
|
Chen, Ziyue | Beijing University of Posts and Telecommunications |
Luo, Guiyang | Beijing University of Posts and Telecommunications |
Shao, Congzhang | Beijing University of Posts and Telecommunications |
Yuan, Quan | Beijing University of Posts and Telecommunications |
Li, Jinglin | Beijing University of Posts and Telecommunications |
Keywords: Visual Analytics/Communication
Abstract: Multi-agent collaborative perception depends on sharing sensory information to improve perception accuracy and robustness, as well as to extend coverage. However, most collaborative perception methods ignore the limitations of communication networks, such as limited bandwidth and the possibility of wireless conflicts. To fill this gap, this paper proposes BlindSpotEliminator, a conflict-free scheduler over cellular-V2X networks that supports practical collaborative point cloud perception to eliminate blind spots. BlindSpotEliminator first identifies the blind spots for each vehicle, then lists the corresponding conflict relationships based on the distribution of the blind spots and communication conflicts, and finally designs an optimized point cloud data transmission strategy to eliminate the blind spots of each vehicle. Extensive experiments show that, compared with greedy and random methods, BlindSpotEliminator achieves better efficiency, i.e., it transmits 20% more point cloud data.
|
|
10:00-11:00, Paper TU-PS20T2.5 | Add to My Program |
MS-Transformer: Masked and Sparse Transformer for Point Cloud Registration (I) |
|
Jia, Qingyuan | Beijing University of Posts and Telecommunications |
Luo, Guiyang | Beijing University of Posts and Telecommunications |
Yuan, Quan | Beijing University of Posts and Telecommunications |
Li, Jinglin | Beijing University of Posts and Telecommunications |
Shao, Congzhang | Beijing University of Posts and Telecommunications |
Chen, Ziyue | Beijing University of Posts and Telecommunications |
Keywords: Visual Analytics/Communication
Abstract: In this paper, we propose a masked and sparse transformer to address the problem of point cloud registration with low overlap. The mask mechanism reduces the overall data, increasing the corresponding point ratio in the overlap region, while also reducing the computational cost to accelerate the algorithm’s execution speed. Moreover, we combine spatial position encoding and sparse self-attention to establish relationships within the source point cloud, as well as the relationships and attention scores between the source and target point clouds. This approach is specifically designed for the task of point cloud registration. Finally, we search for the maximum overlap area by matching the spatial consistency between points and calculate the 3D transformation matrix to complete the registration process. Our method achieves an improvement in the inlier ratio and performs well on the 3DMatch and 3DLoMatch datasets, demonstrating high registration efficiency.
|
|
10:00-11:00, Paper TU-PS20T2.6 | Add to My Program |
Rule Renew Based on Learning Classifier System and Its Application to UAVs Swarm Adversarial Strategy Design |
|
Li, Xuanlu | Southeast University |
Zhang, Ya | Southeast University |
Keywords: Team Performance and Training Systems, Systems Safety and Security, Shared Control
Abstract: This paper studies how to renew and improve expert rules and applies the rule update mechanism to optimize UAV swarm adversarial strategy. A rule update approach is proposed that uses the classifier subsystem to construct a training model based on expert experience and further trains the model through rule evaluation mechanisms and rule discovery subsystems to improve and enhance the rule base. A UAV swarm confrontation strategy model is further proposed based on the learning classifier system (LCS). In a simulated aerial engagement environment of island capture between the red and blue sides, simulation experiments show that the model has robust combat effectiveness and offers significant practical utility for agent decision-making.
|
|
TU-PS20T3 Room T3 |
Add to My Program |
Supplementary Session 33 |
|
|
|
-, Paper TU-PS20T3.1 | Add to My Program |
ISAR: In-Sample Advantage-Regulated Offline Reinforcement Learning |
|
Yang, Deyu | Xi'an Jiaotong University |
Ma, Chengzhong | Xi'an Jiaotong University |
Liu, Zeyang | Xi'an Jiaotong University |
Lan, Xuguang | Xi'an Jiaotong University |
Keywords: Cognitive Computing, Networking and Decision-Making
Abstract: Offline reinforcement learning (RL) enables learning policies from fixed datasets, avoiding the potential safety risks and cost issues of online interaction with the environment. By collecting data from the real environment, offline reinforcement learning can also alleviate the cost of policy transfer in online learning, thus achieving higher learning efficiency and practicality. However, current offline reinforcement learning algorithms that use value functions to improve policies suffer from the problem of distributional shift, which makes it difficult to accurately evaluate state-action pairs within the dataset, and they pay little attention to the balance between reinforcement learning and imitation learning when improving policies. In this paper, we propose ISAR, a novel offline learning algorithm that makes use of in-sample value function learning and advantage-regulated policy improvement. By learning the in-sample state value function, we avoid out-of-distribution action evaluation. We also introduce an advantage-weighted behavior cloning term during policy improvement to balance reinforcement learning against behavior cloning. The experimental results show that the ISAR algorithm achieves results comparable to current state-of-the-art algorithms in various robot tasks without complex parameter tuning.
|
|
-, Paper TU-PS20T3.2 | Add to My Program |
Dynamic Equilibrium-Based Continual Learning Model with Disentangled Meta-Features |
|
Zhang, Mingyi | Institute of Automation, Chinese Academy of Sciences |
Zhang, Junge | Institute of Automation, Chinese Academy of Sciences |
Keywords: Cognitive Computing, Human-centered Learning
Abstract: The field of artificial intelligence research has witnessed remarkable advancements in recent decades. However, conventional approaches in AI research primarily depend on fixed datasets and stationary settings, which have limited applicability to real-world scenarios. In order to address this limitation, there is an increasing need to develop and study algorithms and methods for continual learning, which enables artificial systems to learn from a continuous stream of data. One of the key challenges in continual learning is to strike a balance between transfer and interference, and to identify an equilibrium solution that can effectively learn from nonstationary data. This paper presents a canonical model that is specifically designed for continual learning, utilizing the derivative of the loss function to evaluate parameter changes between tasks and achieve dynamic equilibrium. Additionally, to improve the efficiency of limited training samples in continual tasks, a feature learning method based on meta-feature disentangling is proposed. By leveraging the second derivative term of the canonical model, the parameter vector can be decoupled and meta-features can be discovered. Experimental results demonstrate the superiority of the proposed method over state-of-the-art methods in continual lifelong supervised learning benchmarks. The validity of the proposed canonical model is further supported by these experimental results. As the demands of available settings become increasingly stringent, the advantages of disentangling meta-features become more prominent, resulting in a significant performance gap with other continual learning methods.
|
|
10:00-11:00, Paper TU-PS20T3.3 | Add to My Program |
Wind Power Scenario Generation Based on Denoising Diffusion Probabilistic Model |
|
Xu, Chenglong | Wuhan University |
Dai, Yuxin | Wuhan University |
Xu, Peidong | Wuhan University |
Gao, Tianlu | Wuhan University |
Zhang, Jun | Wuhan University |
Keywords: Cognitive Computing, Human Factors, Intelligence Interaction
Abstract: The intermittency and randomness of wind power output have a negative impact on the stable operation of the power grid. Accurately modeling the uncertainty of wind power output is essential, and the primary method to achieve this is through scenario generation. Traditional scenario generation methods suffer from limitations such as low accuracy and high computational complexity. In this paper, a novel generation framework based on the denoising diffusion probabilistic model is proposed for wind power scenario generation. This method can overcome the limitations of traditional methods and learn the distribution of real data to generate reliable wind power scenarios. Compared to a homogeneous generative model, the proposed method shows improved performance in precisely capturing features of wind power scenarios.
|
|
10:00-11:00, Paper TU-PS20T3.4 | Add to My Program |
Application Analysis and Exploration of Hybrid-Augmented Intelligence in Power System |
|
Fan, Shixiong | China Electric Power Research Institute |
Zhao, Zening | China Electric Power Research Institute |
Ma, Shicong | China Electric Power Research Institute |
Guo, Jianbo | China Electric Power Research Institute |
Wang, Guozheng | China Electric Power Research Institute |
Xu, Haotian | China Electric Power Research Institute |
Keywords: Cognitive Computing, Systems Safety and Security, Supervisory Control
Abstract: The new generation of artificial intelligence (AI) technology will play an important role in promoting the digitalization, informatization and intelligence of the future power grid due to its high-dimensional state intelligent perception and rapid decision-making capabilities. However, its inherent shortcomings, such as poor interpretability and fragility, also limit the further application of AI technology in power systems. This paper first introduces hybrid-augmented intelligence (HAI) technology and its application development in the fields of autonomous driving and industrial robots. Combining the characteristics of the power system and AI technology, the requirements of the power system for HAI are analyzed and summarized. Secondly, the key technologies involved in human-machine collaborative HAI are analyzed in terms of data processing, model training and model application. On this basis, the application of HAI technology in typical scenarios such as power flow section regulation is designed and analyzed, which provides a reference for subsequent engineering applications. Finally, the challenges facing the application of HAI in power systems are analyzed and future prospects are discussed, aiming to promote and enrich the development of basic theories and key technologies of hybrid intelligence in power systems.
|
|
10:00-11:00, Paper TU-PS20T3.5 | Add to My Program |
A Transferable Multi-Agent Reinforcement Learning Method for Distribution Service Restoration |
|
Si, Ruiqi | Wuhan University |
Qiao, Ji | China Electric Power Research Institute |
Wang, Xiaohui | China Electric Power Research Institute |
Ji, Kaixuan | China Electric Power Research Institute |
Wang, Zibo | China Electric Power Research Institute |
Zhang, Jun | Wuhan University |
Pan, Xuanying | Wuhan University |
Zhang, Zhengyan | Wuhan University |
Keywords: Cognitive Computing, Multi-User Interaction
Abstract: The occurrence of extreme events has increased the risk of major outages in the grid, making quick and efficient load recovery in the distribution network a key issue. The data-driven deep reinforcement learning method has great potential in providing fast decision-making. However, a large number of agents leads to the curse of dimensionality, making it inefficient to obtain effective control strategies. When tackling similar tasks in different power grids, retraining multiple agents incurs great cost. To solve this problem, we propose a transferable multi-agent reinforcement learning framework that employs model reload and buffer reuse methods to transfer control strategies from small-scale simple scenes to large-scale complex scenes. It also utilizes attention mechanisms to aggregate observation features and handles the problem of variable observation dimensions. Finally, the distribution service restoration problem is modeled as a Markov decision process and solved using the QMIX algorithm. The performance of the proposed method has been verified in IEEE 34-node and IEEE 123-node distribution systems.
|
| |