Last updated on October 16, 2022. This conference program is tentative and subject to change
Technical Program for Wednesday October 12, 2022
|
We-PS1-T1 Regular Session, MERIDIAN |
Deep Learning and Its Applications |
|
|
Chair: Kadera, Petr | Czech Technical University in Prague |
|
08:00-08:20, Paper We-PS1-T1.1 |
COVID-19 Self-Test Guidance System for Swab Collection Using Deep Learning |
|
Abdelkareem, Youssef Abdelrahman Fathi Ahmed | University of Waterloo |
Nasr, Islam Mohamed Mahmoud | University of Waterloo |
Nassar, Lobna | University of Waterloo |
Karray, Fakhreddine | University of Waterloo |
Keywords: Neural Networks and their Applications, Image Processing and Pattern Recognition, Deep Learning
Abstract: The COVID-19 rapid antigen self-test kits are widely administered in several countries to increase the testing frequency and reduce the load on clinics for in-person tests. Yet, the telehealth worker supervision is mandatory to ensure proper sampling procedure is followed and high-quality swab samples are taken. To reduce the load on the health workers in telehealth, we propose a system that eliminates the need for any human supervision by guiding the testers throughout the self-test to ensure the collection of high-quality swab samples. The proposed system takes a live video stream of the frontal face of a user as input and provides real-time instructions to do the self-test correctly with corrective actions when detecting wrong steps. This is mainly done using a collection of deep learning (DL) models. The system uses a novel swab position classification model, Small-MobileNetV2 with Depth-Wise Attention (S-MBNV2-DWAtt), to detect whether a swab is in one of the nostrils or not, which is an optimized version of MobileNetV2 in terms of parameter count and inference speed. The depth-wise attention block allows it to focus on specific parts of the images where the swabs would possibly lie. Lastly, a large-scale synthetic dataset is created to increase the generalization to a variety of swabs and users and a small real dataset is collected to finetune the model on scenes that are similar to the deployment scenarios. The proposed swab position classification model is found to have outstanding performance in terms of both accuracy and speed; it outperforms the ResNet and VGG architectures by 22.83% and 35.11% respectively on a real-world test set while operating at 25 FPS on CPU.
|
|
08:20-08:40, Paper We-PS1-T1.2 |
A System for Hybrid 4DVar-EnKF Data Assimilation Based on Deep Learning |
|
Dong, Renze | National University of Defense Technology |
Leng, Hongze | National University of Defense Technology |
Keywords: Deep Learning, Neural Networks and their Applications, Application of Artificial Intelligence
Abstract: The accuracy of the initial field is crucial to the forecast results of numerical weather prediction (NWP). Data assimilation (DA) is a method to provide the initial field to the NWP. Currently, the hybrid 4DVar-EnKF DA method is the primary DA method used by operational NWP centres. The technique requires the derivation of the tangent linear and adjoint models for the nonlinear model, but it’s challenging to get the tangent linear and the adjoint models. Furthermore, this method usually adopts empirical coefficients to combine the four-dimensional variational assimilation (4DVar) and the ensemble Kalman filter (EnKF), which reduces the accuracy of assimilation results. This paper builds a hybrid DA system based on a deep learning model (DL-HDA) in response to the above problems. First, we establish a forecast model based on the bilinear neural network (BNN) and use the tangent linear and adjoint models of the BNN for the 4DVar. Then, we utilize the ResNet model to combine the analysis of the 4DVar and the EnKF. The experiments are carried out on the Lorenz-96 model, and then the DL-HDA is compared with the traditional method. The experimental results show that the DL-HDA can improve the precision of assimilation results and decrease the system’s running time.
|
|
08:40-09:00, Paper We-PS1-T1.3 |
Deep Learning Based Audio and Video Cross-Modal Recommendation |
|
Tie, Yun | Zhengzhou University |
Li, Xiaobing | Central Conservatory of Music |
Zhang, Tian | Zhengzhou University |
Jin, Cong | Communication University of China |
Zhao, Xin | Zhengzhou University |
Tie, Jiessie | University of Toronto |
Keywords: Deep Learning, Multimedia Computation
Abstract: With the rapid development of the Internet multimedia industry, audio and video cross-modal retrieval and recommendation have become a compelling topic in deep learning. However, most methods still have shortcomings. For example, finding suitable background music for a video takes considerable effort, and the recommendation modes of audio-visual database platforms are too limited. This article takes the matching of video and music as the starting point for optimizing the music retrieval mode. To improve the recommendation mechanism of multimedia websites, a video and audio cross-domain recommendation model, KATLN, is proposed. By incorporating domain adaptation and an attention mechanism, the model obtains rich item feature representations, captures users’ finer-grained interest preferences, and improves modeling accuracy. Experiments show that, compared with existing mainstream algorithms, KATLN performs strongly and captures user preferences more accurately.
|
|
09:00-09:20, Paper We-PS1-T1.4 |
2D Skeleton-Based Action Recognition Using Action-Snippets and Sequential Deep Learning |
|
Aizada, Askar | Nazarbayev University |
Lee, Min-Ho | Nazarbayev University |
Huynh-The, Thien | Kumoh National Institute of Technology |
Anh Tu, Nguyen | Nazarbayev University |
Keywords: Image Processing and Pattern Recognition, Deep Learning, Machine Learning
Abstract: Human action recognition (HAR) is an active and crucial field of computer vision due to its various applications, such as smart surveillance and human-computer interaction. Recently, the human skeleton, which is compact and intuitive for representing actions and body movements, has been widely used in numerous HAR frameworks. Despite the great success of the skeleton-based HAR, several challenges remain, such as intra-class variability and inter-class similarity. In this paper, we address this task by first proposing a discriminative representation of the action-snippet (i.e., the very short sequence) that captures meaningful characteristics of human pose and body transition. We then employ adequate deep sequential neural networks (DSNNs) to thoroughly learn the temporal relation of action-snippets in a whole sequence. In experiments, the results show that the proposed approach achieves high recognition rates on benchmark datasets while maintaining good computational efficiency (i.e., lightweight networks and high recognition speed).
|
|
09:20-09:40, Paper We-PS1-T1.5 |
Clothing Retrieval from Vision Transformer and Class Aware Attention to Deep Embedding Distance Learning |
|
Liu, Kuan-Hsien | National Taichung University of Science and Technology |
Wu, Yu-Hsiang | National Taichung University of Science and Technology |
Liu, Tsung-Jung | National Chung Hsing University |
Keywords: Deep Learning, Machine Vision, Neural Networks and their Applications
Abstract: Clothing is an indispensable item in human life. Because clothing comes in many types and names, text alone is sometimes not enough to find the desired item. Image-based search (searching pictures with pictures) is widely used on online shopping platforms. To help customers find similar clothes accurately, we crawl images from online shopping websites to build a new dataset for the clothing retrieval task. We also propose a new clothing retrieval neural network model, which integrates the advantages of the Triplet Network, Perceptual Loss and the Capsule Network, and extends the triplet sample learning mode by adding multiple negative samples for training. In our proposed model, a Vision Transformer and two newly proposed attention mechanisms are applied to the Capsule Network to strengthen the features in each capsule. The perceptual concept helps the model extract color and texture features and connect them. Our newly modified loss function helps the model retrieve better results. Training does not need landmark annotation, which would increase model complexity. In the experiments, our model is compared with other state-of-the-art models and shows better performance.
|
|
We-PS1-T2 Regular Session, ZENIT |
Medical Informatics I |
|
|
Chair: Drexler, Dániel András | Óbuda University |
Co-Chair: Eigner, György | Óbuda University |
|
08:00-08:20, Paper We-PS1-T2.1 |
HM-MDS: A Human-Machine Collaboration Based Online Medical Diagnosis System |
|
Chen, Yixuan | Northwestern Polytechnical University |
Liu, Jiaqi | Northwestern Polytechnical University |
Yu, Zhiwen | Northwestern Polytechnical University |
Wang, Hui | Northwestern Polytechnical University |
Wang, Liang | Northwestern Polytechnical University |
Guo, Bin | Northwestern Polytechnical University |
Keywords: Human-Machine Cooperation and Systems, Medical Informatics, Design Methods
Abstract: Online medical diagnosis refers to diagnosing diseases and providing treatment suggestions on websites. It is developing rapidly and has become a new way for patients to seek medical treatment. Although manual online medical diagnosis is reliable, it suffers from low efficiency, a heavy burden on doctors, and long waiting times for patients. Relying on machines for automatic disease diagnosis is highly efficient but has low accuracy and reliability. In general, online medical diagnosis has two stages: inquiry and diagnosis. The inquiry stage consists of asking about the patient's physiological condition; the questions are usually streamlined and can thus be handled by the machine. The diagnosis stage consists of diagnosing the disease and providing medical recommendations; it has strict requirements for accuracy and safety and should thus be handled by a human. Inspired by this, we propose a human-machine collaboration based online medical diagnosis system, HM-MDS. In the inquiry stage, the system employs the machine: it uses BERT+CRF to identify symptoms in the patient's dialogue and a DQN-based method to ask about symptoms. In the diagnosis stage, the system employs both the machine and the human. The machine generates a pre-diagnosis result by calculating disease probabilities. The human doctor then gives the final diagnosis by checking the pre-diagnosis result and revising it if necessary. HM-MDS can thus save time for both doctors and patients while ensuring the accuracy of the diagnosis. We conduct experiments on a real-world dataset. The results show that our approach improves the reliability of online medical diagnosis as well as patient satisfaction, and ensures diagnosis accuracy. The time cost of a reliable medical diagnosis is reduced to 36% of that of purely manual work.
|
|
08:20-08:40, Paper We-PS1-T2.2 |
Classifying Situations for Collaboration among Medical Staff from Operating Room Videos |
|
Hayashi, Takahiro | Kwansei Gakuin University |
Kakusho, Koh | Kwansei Gakuin University |
Kitahashi, Tadahiro | Osaka University |
Matsumoto, Takashi | Medi Plus, Inc |
Sugano, Naoya | Medi Plus, Inc |
Keywords: Medical Informatics, Human-Computer Interaction, Human Factors
Abstract: This article discusses the feasibility of classifying various situations for collaboration among medical staff engaged in surgery from videos taken in operating rooms. In video processing for medical use, especially for supporting surgery, many previous studies have attempted to recognize, from intraoperative videos, the surgical phases that constitute the workflows of specific surgical operations. However, it is also important to analyze, from videos captured by observation cameras in operating rooms, how medical staff engaged in surgeries collaborate, with the aim of improving the efficiency and safety of that collaboration. The types of collaborative situations that occur are not pre-determined but depend on the actual behavior of medical staff during surgeries. Therefore, this study attempts to segment videos of operating rooms into different collaborative situations by clustering the video frames based on similarities in the arrangements of medical staff. The arrangements are represented by the distributions of medical staff in the operating rooms as well as their postures observed in the video frames. In our experiment with the video of a real surgery, some meaningful clusters of collaborative situations for medical staff were obtained.
|
|
08:40-09:00, Paper We-PS1-T2.3 |
A Benchmark and Transformer-Based Approach for Automated Hyperparathyroidism Detection |
|
Guo, Qing | Beijing University of Chemical Technology |
Xiao, Yufeng | Beijing University of Chemical Technology |
Wang, Huaqing | Beijing University of Chemical Technology |
Yu, Mingan | China-Japan Friendship Hospital |
Wei, Ying | China-Japan Friendship Hospital |
Zhao, Zhenlong | China-Japan Friendship Hospital |
Keywords: Medical Informatics, Human-centered Learning
Abstract: Hyperparathyroidism (HPT) is a symptom of hypertrophy and enlargement of the parathyroids due to the overproduction of parathyroid hormones. Because parathyroid glands can be ectopic, that is, their position is not fixed, the clinical diagnosis and localization of HPT are difficult, which can easily result in underdiagnosis and misdiagnosis. This paper collected and produced two versions of datasets for the computer-aided diagnosis (CAD) of HPT, in PASCAL VOC and COCO formats, respectively. Statistical analysis shows that 84.5% of the HPT lesions in the ultrasonic images are medium-scale and large-scale targets. We proposed a hybrid detection network that employs CSPDarknet, a spatial pyramid pooling net (SPPnet) and a deformable transformer for automatically detecting HPT. We further introduced several attention networks combined with the CNN backbone to improve the network structure. Extensive experiments have been carried out on the datasets by combining various attention networks with the hybrid detection network. The experimental results illustrate that including channel attention mechanisms improves the performance of HPT lesion detection more effectively than other hybrid networks, especially for large lesion objects. This implies that channel-wise features are more critical for HPT detection than global features and spatial location features.
|
|
09:00-09:20, Paper We-PS1-T2.4 |
Enhancing Chinese Medical Named Entity Recognition with Auto-Mined Lexicon (I) |
|
Xiao, Yinlong | Beijing University of Technology |
Li, Jianqiang | Beijing University of Technology |
Zhao, Qing | Beijing University of Technology |
Zhu, Qing | Beijing University of Technology |
Wei, Yu-Chih | National Taipei University of Technology |
Keywords: Medical Informatics
Abstract: Recently, lexicon-based Chinese Named Entity Recognition (NER) models have achieved state-of-the-art performance by benefiting from the rich boundary and semantic information contained in the lexicon. However, in the Chinese medical domain, it's difficult to obtain the medical lexicon related to the target medical corpus. In this paper, we propose a new paradigm, enhancing Chinese medical NER with Auto-mined Lexicon (ALNER), which alleviates the difficulty of obtaining the medical lexicon by designing a data-driven automatic lexicon construction method. We define medical lexicon construction as a high-quality phrase mining task. We perform secondary annotation on the NER annotated data and use the secondary annotated data to train a deep learning-based phrase tagger. Experimental results show that our method can be combined with different lexicon-based Chinese NER models to improve performance and that the method does not require an external medical lexicon.
|
|
09:20-09:40, Paper We-PS1-T2.5 |
Pharmacodynamics Modeling Based on in Vitro 2D Cell Culture Experiments (I) |
|
Gergics, Borbála | Óbuda University |
Gombos, Balázs | Research Center for Natural Sciences |
Vajda, Flóra | Research Center for Natural Sciences |
Füredi, András | Research Center for Natural Sciences |
Szakács, Gergely | Research Center for Natural Sciences |
Drexler, Dániel András | Óbuda University |
Keywords: Medical Informatics
Abstract: With the advancement of technology and medicine, personalized therapies are emerging that may provide a promising solution to the problems of chemotherapy, such as its many side effects. Personalized therapies can be tailored to the individual needs of patients, which can make the healthcare of cancer patients safer and more cost-effective. However, personalized therapy requires a reliable model of tumor dynamics and of the effect of the drug applied during the therapy. Examining the effect of certain drugs on in vitro cell cultures helps us to understand the drug mechanism and create pharmacodynamics models. In vitro experiments provide a more cost-effective, ethical and controllable way to examine the effect of chemotherapeutic agents. In this work, we carried out parameter identification and validation of several versions of a tumor growth model based on in vitro cytotoxicity measurements in two-dimensional tumor cell cultures. In vitro experiments were carried out on the Brca1-/-; p53 mouse breast cancer cell line. The chemotherapeutic agents used were Doxil and Doxorubicin at different concentrations. Parameter identification was performed by fitting to data measured at 12 time points over 5 days. We examined how the presence or absence of the necrotic rate n and an additional Hill coefficient affects the value of the sum of squared errors (SSE). We showed that the models that contain tumor cell necrosis and a general Hill function have the best performance for Doxil.
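As a reader's aid, the two quantities named in this abstract, a general Hill function and the sum of squared errors, can be sketched as follows. This is a minimal illustration, not the authors' model: the parameter names `e_max`, `ec50` and `hill_coefficient` are assumptions, and the tumor growth model with the necrotic rate n is not reproduced here.

```python
def hill_effect(concentration, e_max, ec50, hill_coefficient=1.0):
    """General Hill function: drug effect as a function of concentration.

    All parameter names are hypothetical. With hill_coefficient fixed at 1
    this reduces to the simpler saturating form.
    """
    if concentration <= 0.0:
        return 0.0
    powered = concentration ** hill_coefficient
    return e_max * powered / (ec50 ** hill_coefficient + powered)


def sum_squared_error(measured, predicted):
    """SSE between measurements and model predictions at the same time points."""
    if len(measured) != len(predicted):
        raise ValueError("measured and predicted must have the same length")
    return sum((m - p) ** 2 for m, p in zip(measured, predicted))
```

By construction the effect equals half of `e_max` when the concentration equals `ec50`, whatever the Hill coefficient; the coefficient only controls how steep the curve is around that point.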
|
|
We-PS1-T3 Regular Session, NADIR |
3D Object Detection and Processing |
|
|
Chair: Kreinovich, Vladik | University of Texas at El Paso |
|
08:00-08:20, Paper We-PS1-T3.1 |
3D Object Detection Based on Multi-Scale Feature Fusion and Contrastive Learning |
|
Hu, Xing | Hohai University |
Wang, Min | Hohai University |
Khuyag, Tulga | Hohai University |
Keywords: Machine Vision, Deep Learning, Application of Artificial Intelligence
Abstract: 3D object detection plays an increasingly important role in the understanding of real natural scenes. In recent years, methods based on Hough voting have attracted more and more attention because of their compact models and high efficiency. However, the current voting strategy uses only poor local proposal information, which is not conducive to model optimization and performance improvement. A 3D object detection model based on multi-scale feature fusion and contrastive learning is proposed in this paper. The proposal stage focuses on voting to generate proposals, including the coordinates of the proposals and the corresponding feature vectors. To obtain local structure information, calculate multi-scale attention, and trace the multi-scale features of each proposal, an additional proposal coding contrast branch is introduced, which uses a proposal feature coding contrast loss to jointly optimize the feature representation and the multi-scale attention modules. We have obtained competitive results on two large datasets, SUN RGBD and ScanNet, which shows the effectiveness of our method.
|
|
08:20-08:40, Paper We-PS1-T3.2 |
Multi-Stream Feature Aggregation Network for 3D Object Detection in Point Cloud |
|
Hou, YingJie | Qingdao University |
Zhang, Xiaowei | Qingdao University |
Keywords: Image Processing and Pattern Recognition, Machine Vision, Deep Learning
Abstract: In recent 3D object detection methods for point clouds, the combination of point-based methods and voxel-based methods is gradually becoming a trend. Point-based methods retain the accurate position and pose information in the raw points, and voxel-based methods get multi-scale structure information through the 3D backbone. However, because of the sparsity and irregularity of point clouds, both representations ignore the context information, which is important for the detection of sparse and small objects. To solve this problem, we propose a multi-stream feature aggregation network to extract features from three representations of the point cloud for object detection. Specifically, we exploit multi-stream features extracted in parallel from the point, voxel, and perspective view (PV) representations, where the complementary information between different perspectives can be used to enrich the feature representations, especially for the perspective view containing rich semantic context information. Secondly, to eliminate redundant information and better exploit the correlation between different feature representations, we design an attention-based multi-stream feature fusion module (MSFF) to combine features from the three information streams. Besides, we introduce a new voxel RoI pooling with self-attention in the second refinement stage, which can further strengthen the connection between local features in the proposal to obtain accurate classification and localization predictions. Our method achieves progressive results on the KITTI dataset, especially in the cyclist category, where it improves the baseline H23DR-CNN significantly by 4.38%, 2.97%, 2.88% AP on the test set for the easy, moderate, and hard levels respectively. Code will be available at https://github.com/june2678/MRF.
|
|
08:40-09:00, Paper We-PS1-T3.3 |
Dual-Branch Point Cloud Feature Learning for 3D Object Detection |
|
Lun, Hengsheng | University of Chinese Academy of Sciences |
Xue, Jian | University of Chinese Academy of Sciences |
Lu, Ke | University of Chinese Academy of Sciences |
Keywords: Deep Learning, Application of Artificial Intelligence, Neural Networks and their Applications
Abstract: Incomplete feature information is a key problem that limits 3D point cloud object detection and its applications. Many state-of-the-art detectors address this problem from different perspectives, but a comprehensive solution has not yet been obtained. In this paper, a solution is proposed that consists of two branches, one for channel-wise local feature learning and one for spatial-wise global feature learning. The combination of the local features, global contextual features, channel-wise attention features, and spatial attention features of the 3D point cloud is obtained through the two branches. Specifically, a generic spatial self-attention model is proposed that uses skeleton convolution to enhance the extraction of spatial features and combines it with a self-attention mechanism to improve the learning of global features. Further, the proposed skeleton attention mechanism focuses on object contour and rotation invariance of the point cloud. Through sufficient experimental validation, all modules proposed in this paper are shown to have a substantial effect on performance, and the visualization results demonstrate the importance of our approach.
|
|
09:00-09:20, Paper We-PS1-T3.4 |
Loc-VAE: Learning Structurally Localized Representation from 3D Brain MR Images for Content-Based Image Retrieval |
|
Nishimaki, Kei | Hosei University |
Ikuta, Kumpei | Hosei University |
Onga, Yuto | Hosei University |
Iyatomi, Hitoshi | Hosei University |
Oishi, Kenichi | Johns Hopkins University School of Medicine |
Keywords: Representation Learning, Neural Networks and their Applications, Deep Learning
Abstract: Content-based image retrieval (CBIR) systems are an emerging technology that supports reading and interpreting medical images. Since 3D brain MR images are high dimensional, dimensionality reduction is necessary for CBIR using machine learning techniques. In addition, for a reliable CBIR system, each dimension in the resulting low-dimensional representation must be associated with a neurologically interpretable region. We propose a localized variational autoencoder (Loc-VAE) that provides neuroanatomically interpretable low-dimensional representation from 3D brain MR images for clinical CBIR. Loc-VAE is based on beta-VAE with the additional constraint that each dimension of the low-dimensional representation corresponds to a local region of the brain. The proposed Loc-VAE is capable of acquiring representation that preserves disease features and is highly localized, even under high-dimensional compression ratios (4096:1). The low-dimensional representation obtained by Loc-VAE improved the locality measure of each dimension by 4.61 points compared to naive beta-VAE, while maintaining comparable brain reconstruction capability and information about the diagnosis of Alzheimer's disease.
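For orientation, the beta-VAE objective that this abstract builds on can be sketched in a few lines. This is a generic sketch under stated assumptions, not the Loc-VAE implementation: the locality constraint described in the abstract is not shown, the encoder is assumed to output a diagonal Gaussian, and the default `beta=4.0` is an arbitrary illustrative value.

```python
import math


def gaussian_kl(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to the standard normal N(0, I).

    Uses the closed form 0.5 * sum(mu^2 + sigma^2 - log(sigma^2) - 1),
    summed over the latent dimensions.
    """
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0
                     for m, lv in zip(mu, log_var))


def beta_vae_loss(reconstruction_error, mu, log_var, beta=4.0):
    """beta-VAE objective: reconstruction term plus beta-weighted KL term.

    beta=4.0 is a hypothetical default chosen only for this example.
    """
    return reconstruction_error + beta * gaussian_kl(mu, log_var)
```

Setting `beta` to 1 recovers the ordinary VAE objective; larger values put more pressure on the latent code, which is the lever beta-VAE uses to encourage disentangled dimensions.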
|
|
09:20-09:40, Paper We-PS1-T3.5 |
Fast 3D Point Cloud Target Tracking Based on Polar-Voxel Encoding |
|
Ouyang, Zhenchao | Beihang University |
Dong, Xiaoyun | Beihang University |
Zhang, Changjie | Beihang University |
Cui, Jiahe | Beihang University |
Hu, Qinglei | Beihang University |
Niu, Jianwei | Beihang University |
Keywords: Deep Learning, Application of Artificial Intelligence
Abstract: The century-old development of the automotive industry has spawned one of the greatest Cyber-Physical Systems (CPSs) in the future--unmanned vehicles. The vehicle can obtain environmental information through different sensors, map it to the virtual coordinate system of the vehicle body to make decisions, and finally generate control instructions. However, a series of factors, such as complex road scenes, defective and irregular target sparse sampling, and large coding space, pose challenges to accurate, efficient, and stable perception results. To overcome the most challenging problem of dynamic target tracking, this paper designs a two-stage detection model based on non-uniform polar voxelization sampling of irregular 3D point cloud, which is used with local registration-based search to achieve efficient multi-target tracking. Non-uniform voxelization not only balances the spatial sampling and encoding efficiency of the point cloud for the backbone, but also adapts to the feature aggregation of the detection head, thereby achieving double acceleration. Finally, we tested our model on KITTI Tracking data. The comparison results show that the calculation speed of the final model is greatly improved and the tracking accuracy is competitive in all categories.
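To make the idea of non-uniform polar voxelization concrete, here is a minimal sketch of mapping one LiDAR point to a polar voxel index. It is an assumption-laden illustration, not the paper's encoder: the function name, the use of explicit radial bin edges to obtain non-uniform bins, and the uniform azimuth and height bins are all hypothetical choices.

```python
import math
from bisect import bisect_right


def polar_voxel_index(x, y, z, radial_edges, n_azimuth_bins,
                      z_min, z_max, n_z_bins):
    """Map a Cartesian point to (radial_bin, azimuth_bin, z_bin), or None.

    radial_edges is a sorted list of bin boundaries in metres; spacing the
    edges further apart at long range gives coarser voxels where points
    are sparse (the non-uniform part, assumed here for illustration).
    """
    r = math.hypot(x, y)
    if r < radial_edges[0] or r >= radial_edges[-1]:
        return None  # outside the radial range
    if not (z_min <= z < z_max):
        return None  # outside the height range
    radial_bin = bisect_right(radial_edges, r) - 1
    azimuth = math.atan2(y, x) % (2.0 * math.pi)  # wrap into [0, 2*pi)
    azimuth_bin = min(int(azimuth / (2.0 * math.pi) * n_azimuth_bins),
                      n_azimuth_bins - 1)  # clamp guards against rounding
    z_bin = int((z - z_min) / (z_max - z_min) * n_z_bins)
    return radial_bin, azimuth_bin, z_bin
```

For example, with `radial_edges = [0, 2, 5, 10, 20, 40]` the innermost voxels are 2 m deep while the outermost are 20 m deep, so distant, sparsely sampled regions are covered by far fewer cells than a uniform grid would need.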
|
|
We-PS1-T4 Regular Session, AQUARIUS |
Feature Engineering |
|
|
Chair: Jiang, Lei | Loughborough University |
Co-Chair: Fernandes, Ricardo | Federal University of São Carlos |
|
08:00-08:20, Paper We-PS1-T4.1 |
Combining Multi-Layer Features for Plant Species Classification in a Siamese Network |
|
Moresco, Matheus | State University of Ponta Grossa (UEPG) |
Britto Jr, Alceu de Souza | State University of Ponta Grossa (UEPG) |
Yandre, Costa | State University of Maringá |
Senger, Luciano J | State University of Ponta Grossa (UEPG) |
Hochuli, Andre Gustavo | Pontifícia Universidade Católica Do Paraná (PPGIA/PUCPR) |
Keywords: Image Processing and Pattern Recognition, Machine Vision, Machine Learning
Abstract: Plant species classification from leaf images is challenging due to the lack of annotation, imbalanced classes and similarities in the data representation. Siamese Neural Networks (SNNs) have been used to overcome these bottlenecks in several contexts. In light of this, this work evaluates different architectures trained in a Siamese manner for classifying plant species from leaf images. In addition, we combine features from the intermediate convolutional layers to improve the representations. Experiments on the well-known Flavia and MalayaKew databases show that the fusion of intermediate features results in a relevant gain in performance.
|
|
08:20-08:40, Paper We-PS1-T4.2 |
Multi-View Feature Fusion for Few-Shot Remote Sensing Image Scene Classification |
|
Han, Anxun | China University of Petroleum (East China) |
Xing, Lei | China University of Petroleum (East China) |
Liu, Weifeng | China University of Petroleum (East China) |
Liu, Baodi | College of Information and Control Engineering, China University |
Keywords: Machine Vision, Deep Learning, Machine Learning
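Abstract: Compared with deep learning methods, few-shot learning methods do not require a large number of labeled images. Few-shot remote sensing image scene classification has therefore been studied extensively. However, extracting effective information from a limited number of labeled samples is a great challenge. Most methods extract features from only a single view of the remote sensing image. Such information is scarce and can even be misleading. To solve this problem, we propose a multi-view feature fusion method. Specifically, first, we train two feature extractor networks, one on an original image dataset and one on a remote sensing image dataset. Each model extracts features before and after average pooling, yielding four kinds of features for the remote sensing image. Second, we use the multi-head feature collaboration (MHFC) method and four classifiers to obtain fusion weights from the support set features. Third, we use the weights to fuse the predicted probability matrices, and through the new matrix obtain ...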
|
|
08:40-09:00, Paper We-PS1-T4.3 |
An Improved Novel View Synthesis Approach Based on Feature Fusion and Channel Attention |
|
Jiang, Lei | Loughborough University |
Schaefer, Gerald | Loughborough University |
Meng, Qinggang | Loughborough University |
Keywords: Image Processing and Pattern Recognition, Deep Learning, Neural Networks and their Applications
Abstract: Single image novel view synthesis allows the generation of target images with different views from a single input image. Pixel generation methods are one of the main approaches for novel view synthesis, with previous methods typically using the input images to infer the target image in the new view. However, only features from input images in the source view might not be sufficient to generate a good target image, especially when only a single input image is available. In this paper, we present a deep learning-based novel view synthesis approach that fuses features from an input and a warped image to collaboratively generate pixels in the new view. The warped image here is an intermediate output generated by projecting pixels of the input image onto the target view via an estimated depth. Since the estimated depth and the generated warped image are not perfect, errors will be introduced when generating target pixels. To alleviate these and to ensure better channel information between the features from input and warped image, channel attention blocks are employed. Experimental results on standard benchmark datasets show that our method produces excellent view synthesis results and outperforms other state-of-the-art methods.
|
|
09:00-09:20, Paper We-PS1-T4.4 |
An Approach Based on Feature Engineering and Machine Learning for the Prediction of Child Violence |
|
Anibal, Osses | Federal University of São Carlos |
Fernandes, Ricardo | Federal University of São Carlos |
Keywords: Application of Artificial Intelligence, Expert and Knowledge-Based Systems, Computational Intelligence
Abstract: According to technical reports prepared by the United Nations International Children's Emergency Fund and the World Health Organization, a high percentage of children around the world suffer violence. This is even more evident in developing countries. However, it is difficult to determine the type of violence a child is suffering, since the problem depends on the victims' reports, which can be imprecise. Decision support tools are therefore of utmost importance for agents of social assistance programs. This paper proposes an approach based on feature engineering and machine learning to classify the violence suffered by a child. For this purpose, a dataset obtained from a Chilean social assistance organization for children was used. After the training and validation processes, the best machine learning models reached areas under the curve between 0.84 and 0.97, surpassing the state-of-the-art results.
|
|
09:20-09:40, Paper We-PS1-T4.5 | Add to My Program |
Evolutionary Feature Selection Method Via a Chaotic Binary Dragonfly Algorithm |
|
Liu, Zhao | Jilin University |
Wang, Aimin | Jilin University |
Sun, Geng | Jilin University |
Li, Jiahui | Jilin University |
Bao, Haiming | Changguang Satellite Technology |
Li, Hongjuan | Jilin University |
Keywords: Machine Learning, Swarm Intelligence, Evolutionary Computation
Abstract: Feature selection aims at reducing the number of attributes while achieving a high classification accuracy in machine learning. In this paper, we design a fitness function to jointly reduce the number of the selected features and enhance the accuracy. Then, we propose a chaotic binary dragonfly algorithm (CBDA) with several improved factors on the conventional dragonfly algorithm (DA) for developing a wrapper-based feature selection method to solve the fitness function. Specifically, the CBDA introduces three improved factors that are the chaotic map, evolutionary population dynamics mechanism and binarization strategy to make the algorithm more suitable for the problem. Experiments are conducted to evaluate the performance of the proposed CBDA on 24 well-known data sets from the UCI repository, and the results demonstrate that the proposed CBDA outperforms other comparative algorithms on the majority of the tested data sets.
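The abstract above does not give the fitness function itself. As a rough illustration only, the sketch below uses a weighted sum of classification error and the fraction of features kept, a form often seen in wrapper-based feature selection. The name `selection_fitness`, the weight `alpha`, and the candidate values are assumptions, not taken from the paper.

```python
def selection_fitness(error_rate, mask, alpha=0.99):
    """Weighted sum of classification error and fraction of features kept.

    Lower is better. `alpha` is a hypothetical trade-off weight; the paper's
    actual fitness function is not stated in the abstract.
    """
    if not mask:
        raise ValueError("mask must not be empty")
    selected = sum(1 for bit in mask if bit)
    if selected == 0:
        # An empty feature subset cannot train a classifier.
        return float("inf")
    ratio = selected / len(mask)
    return alpha * error_rate + (1.0 - alpha) * ratio


# Hypothetical candidates: (error rate of a wrapped classifier, binary feature mask).
candidates = [
    (0.10, [1, 1, 1, 1]),
    (0.10, [1, 0, 1, 0]),
    (0.25, [1, 0, 0, 0]),
]
best = min(candidates, key=lambda c: selection_fitness(c[0], c[1]))
print(best[1])
```

With equal error rates the smaller subset wins, while a large loss in accuracy outweighs a further reduction in features.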
|
|
We-PS1-T5 Regular Session, TAURUS |
Add to My Program |
Intelligent Operation, Maintenance, and Scheduling of Industrial Processes |
|
|
Chair: Strasser, Thomas | AIT Austrian Institute of Technology |
Co-Chair: Zhu, Haibin | Nipissing University |
|
08:00-08:20, Paper We-PS1-T5.1 | Add to My Program |
Distributed Operating Performance Assessment for Hot Strip Mill Process Based on Probabilistic Support Tensor Data Description with Feature Tensor (I) |
|
Zhang, Chuanfang | University of Science and Technology Beijing |
Peng, Kaixiang | University of Science and Technology Beijing |
Dong, Jie | University of Science and Technology Beijing |
Zhang, Xueyi | University of Science and Technology Beijing |
Keywords: Machine Learning, Application of Artificial Intelligence
Abstract: In industrial processes, operating performance assessment is of great practical significance for guiding the production adjustment for operators. From the perspective of classification, operating performance assessment is considered as a multi-class classification problem. As a well-known one-class classifier, support vector data description (SVDD) is oriented to vector data and cannot deal with tensor data directly. Moreover, SVDD gives the target data set a spherically shaped description, which is a binary output. However, practical industrial data of different operating performance grades may have overlapping regions, which is a knotty problem for classification. To handle the above issues, a distributed operating performance assessment method based on probabilistic support tensor data description (PSTDD) is proposed in this work. First, the plant-wide process variables are selected and divided into several blocks. Then, a PSTDD model is developed in each block. Based on the assessment results of different blocks, a global assessment index is designed. If the process is running at a non-optimal condition, the root causes are traced by variable contributions. Experimental results on a real hot strip mill process (HSMP) illustrate the effectiveness of the proposed method compared to the traditional distributed SVDD.
|
|
08:20-08:40, Paper We-PS1-T5.2 | Add to My Program |
A Two-Stage Pricing Study of Product Line Considering Value-Added Services (I) |
|
Qi, Wei | Henan University |
Li, Ziwei | Henan University |
Liu, Xuwang | Henan University |
Guo, Xiwang | Liaoning Petrochemical University |
Wang, Jiacun | Monmouth University |
Tang, Ying | Rowan University |
Keywords: Optimization and Self-Organization Approaches, Computational Intelligence
Abstract: With the advancement of society and technology, consumers are becoming more personalized and more willing to buy new products. To meet the diverse needs of consumers, the design and development of product lines have become an important strategic issue for enterprises. Based on the consumer choice model, this paper addresses the design and development of product lines and the purchase behavior of consumers. A two-stage pricing model is constructed under the condition of bundled sales of products and services. This paper analyzes the impact of value-added services and consumers' purchasing behavior on two-stage product line pricing. Research shows that the level of product value-added services and the degree of enterprise strategy will have an impact on the price of the product line and the enterprise's profit. When the service level is higher, the enterprise's product line price and profit will increase, and when the enterprise discount is higher, the enterprise's total profit and product line price will decrease.
|
|
08:40-09:00, Paper We-PS1-T5.3 | Add to My Program |
Discrete Shuffled Frog Leading Algorithm for Multiple-Product Human-Robot Collaborative Disassembly Line Balancing Problem |
|
Fan, Chenyang | Liaoning Petrochemical University |
Guo, Xiwang | Liaoning Petrochemical University |
Zhou, Meng-Chu | New Jersey Institute of Technology |
Wang, Jiacun | Monmouth University |
Qin, Shujin | Shangqiu Normal University |
Qi, Liang | Shandong University of Science and Technology |
Keywords: Computational Intelligence, Heuristic Algorithms, Swarm Intelligence
Abstract: With the rapid development of recycling and remanufacturing technologies, the disassembly line balancing problem (DLBP) has drawn great attention. Considering the limitation of disassembly by humans or robots alone, this paper focuses on human-robot collaborative disassembly lines. Specifically, this work proposes a multi-product human-robot collaborative disassembly line balancing model to tackle the inflexibility of single product disassembly and inconsistency in recovery values of different product components. The objective is to maximize disassembly profit. IBM’s CPLEX is used to obtain the exact solution of DLBP and verify the correctness of the proposed mathematical model. A discrete shuffled frog leading algorithm is then designed to solve the sizable problems. Experimental results show that the proposed algorithm has a fast convergence rate and can find solutions consistent with those with CPLEX but requires much less time than the latter.
|
|
09:00-09:20, Paper We-PS1-T5.4 | Add to My Program |
Improved LSSVM to Predict the Elongation of Strip Steel in Annealing Furnace |
|
Lv, Chun-peng | Liaoning Petrochemical University |
Shi, Yuan-bo | Liaoning Petrochemical University |
Huang, Yue-yang | Liaoning Petrochemical University |
Wang, Jian-hui | Northeastern University |
Keywords: Cybernetics for Informatics, Heuristic Algorithms, Swarm Intelligence
Abstract: It is difficult to predict the elongation of strip steel in an annealing furnace due to the influence of temperature, tension, roll speed and data noise. Thus, a strip elongation prediction method based on a least squares support vector machine (LSSVM) optimized by the artificial bee colony (ABC) algorithm is proposed. In order to improve the convergence speed and accuracy of the algorithm, a new adaptive step update formula, an adaptive probability selection formula and a global search factor were introduced to improve the standard artificial bee colony algorithm. The parameters of the LSSVM were optimized by the improved artificial bee colony (IABC) algorithm, which overcame the subjectivity of human selection and gave the LSSVM better generalization and prediction accuracy. Numerical simulation results in MATLAB show that the relative error and root mean square error of IABC-LSSVM are better than those of ABC-LSSVM and LSSVM, which indicates that the improvement effectively raises the convergence speed and prediction accuracy of the algorithm. Under real working conditions, IABC-LSSVM provides theoretical support for the prediction of strip elongation in an annealing furnace and has a certain engineering application value.
|
|
09:20-09:40, Paper We-PS1-T5.5 | Add to My Program |
A Method of Remaining Useful Life Prediction of Multi-Source Signals Aero-Engine Based on RF-Transformer-LSTM* |
|
Mu, Hanshuo | Tongji University |
Zhai, Xiaodong | Tongji University |
Yin, Debin | Shanghai Institute of Process Automation & Instrumentation |
Qiao, Fei | Tongji University |
Keywords: Deep Learning, Application of Artificial Intelligence, Neural Networks and their Applications
Abstract: The aeroengine remaining useful life (RUL) prediction problem of prognostics and health management (PHM) is becoming more complicated and challenging due to the large scale, high dimensionality and difficult feature extraction of sensor data. This paper proposes an RUL prediction method with multi-source signals based on Random Forest (RF)-Transformer-Long Short-Term Memory (LSTM) to enhance the prediction accuracy. The proposed method consists of the following steps: First, RF is used to select features from the multi-sensor data. Second, a Transformer is used to extract features from the selected data. Third, the extracted features are fed into the LSTM model to obtain the aeroengine RUL. Finally, a case study is conducted to validate the superiority of the proposed RF-Transformer-LSTM method based on the C-MAPSS aeroengine dataset. The results show that across the four data sets, the root mean square error (RMSE) of the proposed method can reach a minimum of 10.23, which is far lower than that of other common methods. Thus, the proposed RF-Transformer-LSTM method has higher accuracy than other common methods.
|
|
We-PS1-T6 Regular Session, LEO |
Add to My Program |
Learning Approaches in Intelligent Systems |
|
|
Chair: Etemadi Idgahi, Reza | University of Texas at Arlington |
|
08:00-08:20, Paper We-PS1-T6.1 | Add to My Program |
Solving Quadratic Traveling Salesman Problem with Deep Reinforcement Learning (I) |
|
Zhang, Hang | Sun Yat-Sen University |
Zhang, Zizhen | Sun Yat-Sen University |
Chen, Jinbiao | Sun Yat-Sen University |
Wang, Jiahai | Sun Yat-Sen University |
Keywords: Neural Networks and their Applications, Deep Learning, Application of Artificial Intelligence
Abstract: There are many combinatorial optimization problems derived from the classic traveling salesman problem (TSP). The quadratic traveling salesman problem (QTSP) is one of them. It needs to consider the relationship between three successive nodes rather than two successive nodes. In literature, there are exact methods based on integer programming and approximate methods based on heuristics for solving QTSP. In this paper, we try to adopt deep reinforcement learning to tackle QTSP. We consider two classic QTSPs studied in the previous literature, namely the angular-metric TSP and the angular-distance-metric TSP. Both of them consider the turning angle for each node, and the angular-distance-metric TSP further considers the total traveling distance in the original TSP. The experimental results show that our method is superior to some typical heuristic methods in terms of solution quality, and better than the exact methods in terms of time.
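To make the quadratic cost structure concrete, the sketch below scores a closed tour by the turning angle at each node, which depends on three successive nodes, and optionally adds a weighted travel distance. The function names and the `distance_weight` parameter are illustrative assumptions; the abstract does not specify how angle and distance are combined.

```python
import math


def turning_angle(a, b, c):
    """Angle in radians by which the heading changes at b when going a -> b -> c."""
    v1 = (b[0] - a[0], b[1] - a[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return abs(math.atan2(cross, dot))


def angular_tour_cost(points, tour, distance_weight=0.0):
    """Sum of turning angles over a closed tour, plus an optional weighted length.

    distance_weight = 0 gives a purely angular cost; a positive value adds the
    tour length (this particular weighting is an assumption for illustration).
    """
    n = len(tour)
    total = 0.0
    for i in range(n):
        a = points[tour[i - 1]]        # previous node (wraps around)
        b = points[tour[i]]            # current node
        c = points[tour[(i + 1) % n]]  # next node (wraps around)
        total += turning_angle(a, b, c)
        total += distance_weight * math.dist(b, c)
    return total


square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
perimeter_cost = angular_tour_cost(square, [0, 1, 2, 3])
crossing_cost = angular_tour_cost(square, [0, 2, 1, 3])
print(perimeter_cost, crossing_cost)
```

On the unit square, the perimeter tour makes four right-angle turns, while the self-crossing tour makes four sharper turns and so scores worse under the angular cost.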
|
|
08:20-08:40, Paper We-PS1-T6.2 | Add to My Program |
A Deep Averaged Reinforcement Learning Approach for the Traveling Salesman Problem |
|
Parasteh, Sirvan | University |
Khorram, Amin | Department of Industrial Systems Engineering, University of Regi |
Mouhoub, Malek | University of Regina |
Sadaoui, Samira | University of Regina |
Keywords: Deep Learning, Heuristic Algorithms, Transfer Learning
Abstract: This work presents a deep averaged reinforcement-learning approach to learn improvement heuristics for route planning. The proposed method is tested on the Traveling Salesman Problem (TSP). While learning improvement heuristics with machine learning models is promising, these methods suffer from poor generalization and from forgetting by the agents during the training process. To address these issues, we apply the stochastic weight averaging method during the training phase, which smooths the training convergence and prevents the forgetting of optimized learned policies, and consequently provides better results. The agent can learn the optimized policy while holding a moving average of the previously learned policies during the training epochs. To assess the performance of our proposed approach, we conducted comparative experiments against other known methods from the literature. The results demonstrate our proposed method's superiority in training trends and optimization.
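The averaging idea can be sketched as an incremental mean over parameter snapshots. This is a minimal illustration with plain lists standing in for policy weights; the helper name `update_running_average` and the snapshot values are hypothetical, and the paper's training loop is not described in the abstract.

```python
def update_running_average(avg_weights, new_weights, num_averaged):
    """Mean of num_averaged + 1 snapshots, given the mean of the first num_averaged."""
    return [
        (a * num_averaged + w) / (num_averaged + 1)
        for a, w in zip(avg_weights, new_weights)
    ]


# Hypothetical parameter snapshots taken at the end of three training epochs.
snapshots = [[1.0, 4.0], [3.0, 0.0], [2.0, 2.0]]

avg = list(snapshots[0])
for count, weights in enumerate(snapshots[1:], start=1):
    avg = update_running_average(avg, weights, count)
print(avg)
```

Updating incrementally means only the current average and a counter need to be stored, rather than every past snapshot.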
|
|
08:40-09:00, Paper We-PS1-T6.3 | Add to My Program |
Learning Hierarchical Traversability Representations for Efficient Multi-Resolution Path Planning |
|
Etemadi Idgahi, Reza | University of Texas at Arlington |
Huber, Manfred | The University of Texas at Arlington |
Keywords: Heuristic Algorithms, Application of Artificial Intelligence, Representation Learning
Abstract: Path planning on grid-based obstacle maps is an essential and much-studied problem with applications in robotics and autonomy. Traditionally, in the AI community, heuristic search methods (e.g., based on Dijkstra, A*, or random trees) are used to solve this problem. This search, however, incurs a high computational cost that grows with the size and resolution of the obstacle grid and has to be mitigated with effective heuristics to allow path planning in real-time. This work introduces a learning framework using a deep neural network with a stackable convolution kernel to establish a hierarchy of directional traversability representations with decreasing resolution that can serve as an efficient heuristic to guide a multi-resolution path planner. This path planner finds paths efficiently, starting on the lowest resolution traversability representation and then refining the path incrementally through the hierarchy until it addresses the original obstacle constraints. We demonstrate the benefits and applicability of this approach on datasets of maps created to represent both indoor and outdoor environments, reflecting different real-world applications. The conducted experiments show that our method can accelerate path planning by 40% in indoor environments and 65% in outdoor environments compared to the same heuristic search method applied to the original obstacle map, which demonstrates the effectiveness of this method.
|
|
09:00-09:20, Paper We-PS1-T6.4 | Add to My Program |
Advances in Preference-Based Reinforcement Learning: A Review |
|
Abdelkareem, Youssef Abdelrahman Fathi Ahmed | University of Waterloo |
Shehata, Shady | Mohamed Bin Zayed University of Artificial Intelligence (MBZUAI) |
Karray, Fakhreddine | University of Waterloo |
Keywords: Machine Learning, Computational Intelligence, Deep Learning
Abstract: Reinforcement Learning (RL) algorithms suffer from the dependency on accurately engineered reward functions to properly guide the learning agents to do the required tasks. Preference-based reinforcement learning (PbRL) addresses that by utilizing human preferences as feedback from the experts instead of numeric rewards. Due to its promising advantage over traditional RL, PbRL has gained more focus in recent years with many significant advances. In this survey, we present a unified PbRL framework to include the newly emerging approaches that improve the scalability and efficiency of PbRL. In addition, we give a detailed overview of the theoretical guarantees and benchmarking work done in the field, while presenting its recent applications in complex real-world tasks. Lastly, we go over the limitations of the current approaches and the proposed future research directions.
|
|
09:20-09:40, Paper We-PS1-T6.5 | Add to My Program |
Attentive Reinforcement Learning for Scheduling Problem with Node Auto-Scaling |
|
Guan, Yanxia | National University of Defense Technology |
Xie, Min | National University of Defense Technology |
Yuan, Li | Academy of Military Sciences |
Xu, Xinhai | Academy of Military Sciences |
Keywords: Application of Artificial Intelligence, Deep Learning
Abstract: The distributed system Ray has attracted much attention in decision-making applications, which could greatly accelerate the training efficiency for intelligent algorithms. Task scheduling is one of the critical technologies in Ray, in which the number of resource nodes could be auto-scaling, i.e., automatically increasing or decreasing according to the workload. The adopted scheduling strategy is simple, which leaves much space to be optimized. In this paper, we consider designing a reinforcement learning method to optimize the scheduling problem in Ray. We propose an attentive reinforcement learning method, designing an attention-based state encoder that could efficiently extract the system state in the situation of the varying number of resource nodes. At the same time, an action mask mechanism filters invalid actions. Further, to improve the learning efficiency in the environment with the varied number of nodes, we design a curriculum learning method, which trains the method by gradually increasing the number of nodes in the scheduling process. Finally, we use the real data generated by the Alibaba Cluster Trace Program to test in the simulation platform CloudSim. The experimental results show that the proposed method effectively scales down the completion time of tasks compared to the original algorithm in Ray.
|
|
We-PS1-T7 Regular Session, VIRGO |
Add to My Program |
Novel Network-Based Architectures and Applications |
|
|
Chair: Saga, Ryosuke | Osaka Metropolitan University |
Co-Chair: Dong, Anming | Qilu University of Technology |
|
08:00-08:20, Paper We-PS1-T7.1 | Add to My Program |
A Multi-Scale Disperse Dynamic Routing Capsule Network Knowledge Graph Embedding Model Based on Relational Memory |
|
Ma, Haoxiang | Qilu University of Technology |
Jiang, Xuesong | Qilu University of Technology (Shandong Academy of Sciences) |
Wei, Xiumei | Qilu University of Technology |
Chai, Huihui | Qilu University of Technology (Shandong Academy of Sciences) |
Keywords: Knowledge Acquisition, Neural Networks and their Applications, Deep Learning
Abstract: Knowledge graphs use triples containing head entity h, tail entity t, and relation r to represent real-world entities and their intrinsic relationships. To predict the missing triples in a knowledge graph, we propose a knowledge graph embedding model named RDMCapsE, which combines the strong triple representation ability of the relational memory network with the powerful feature processing ability of the capsule network and uses disperse dynamic routing to improve the performance of the capsule network. First, the embedding vector is formed by encoding potential dependencies between entities and relations; then, different feature maps are generated using convolutional kernels with different window sizes and reorganized into corresponding capsules; finally, the connection from the parent capsule to the child capsule is specified by the squash function and the disperse dynamic routing, and the credibility of the current triple is judged based on the score of the inner product of the child capsule and the weights. The experimental results show that, compared with other models, the proposed model effectively improves knowledge graph completion performance and the classification accuracy of triples on the WN18RR, FB15K-237, WN11, and FB13 datasets.
|
|
08:20-08:40, Paper We-PS1-T7.2 | Add to My Program |
Pooling Method Based on Edge Contraction for Graph Convolution Networks |
|
Saga, Ryosuke | Osaka Metropolitan University |
Keywords: Deep Learning, Machine Vision, Neural Networks and their Applications
Abstract: In recent years, various graph pooling methods have been proposed, but the existing edge pooling methods have some problems. Edge pooling aggregates nodes by removing edges while considering some node characteristics. However, edge pooling ignores the surrounding node features and graph topology. To address this problem, we propose a novel graph pooling method that builds a reasonable pooling graph topology, considers the structure and feature information of the graph, improves the objectivity of node selection, and uses the edge pooling method to select edges after considering this structure and feature information. Experimental results on the dataset show that our method is effective in graph classification and outperforms state-of-the-art graph pooling methods.
|
|
08:40-09:00, Paper We-PS1-T7.3 | Add to My Program |
Integrated Spatio-Temporal Prediction for Water Quality with Graph Attention Network and WaveNet |
|
Bi, Jing | Beijing University of Technology |
Zhang, Jun | Beijing University of Technology |
Yuan, Haitao | Beihang University |
Qiao, Junfei | Beijing University of Technology |
Keywords: Neural Networks and their Applications, Application of Artificial Intelligence, Deep Learning
Abstract: Water quality prediction is of great significance for water environmental protection and management. Traditional water quality prediction methods are mainly based on linear models, and they fail to extract nonlinear relationships. Recurrent neural network-based ones have shortcomings such as being difficult for long-term prediction and unable to capture spatial dependencies. To address these issues, this work proposes an improved spatio-temporal prediction model called GATWNet, which combines a Graph ATtention network with a WaveNet model based on dilated causal convolution to predict water quality at multiple sites over a future time period. GATWNet jointly captures the spatial information of river networks and the temporal information of each water quality monitoring sensor. Two real-world datasets-based experimental results demonstrate the proposed GATWNet achieves higher prediction accuracy than several baseline models.
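For readers unfamiliar with the dilated causal convolution mentioned above, the following minimal sketch shows the operation on a one-dimensional series: each output depends only on the current and earlier inputs, spaced `dilation` steps apart. It is a generic illustration with assumed names and zero padding on the left, not the GATWNet implementation.

```python
def dilated_causal_conv1d(signal, kernel, dilation=1):
    """out[t] = sum_k kernel[k] * signal[t - k * dilation], with zeros before t = 0."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # positions before the start of the series count as zero
                acc += w * signal[idx]
        out.append(acc)
    return out


series = [1.0, 2.0, 3.0, 4.0, 5.0]
result = dilated_causal_conv1d(series, kernel=[1.0, 1.0], dilation=2)
print(result)
```

Stacking such layers with growing dilation widens the span of past values each output can see without looking into the future.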
|
|
09:00-09:20, Paper We-PS1-T7.4 | Add to My Program |
MGC-GAN: Multi-Graph Convolutional Generative Adversarial Networks for Accurate Citywide Traffic Flow Prediction |
|
Li, Lincan | Zhejiang University |
Bi, Jichao | Zhejiang University |
Yang, Kaixiang | Zhejiang University |
Luo, Fengji | The University of Sydney |
Yang, Lu-Xing | Deakin University |
Keywords: Application of Artificial Intelligence, Complex Network, Neural Networks and their Applications
Abstract: Accurate citywide traffic flow prediction is of great importance to intelligent transportation systems. Existing methods typically assume that complete citywide traffic data can be obtained in real time, which is impossible in practical applications. Furthermore, many recent works consider only a single kind of spatial correlation in the traffic network when building graph representations. This work proposes an adversarial learning framework named Multi-Graph Convolutional Generative Adversarial Networks (MGC-GAN) to address the aforementioned challenges. To generate citywide traffic flow predictions using limited traffic data, we construct three kinds of graphs using easily accessed geographical and semantic information to model the complex spatial correlations in citywide transportation networks. Following that, a parallel GCN layer is designed to separately process multiple graphs. In addition, we design the Parallel Graph Convolution and Temporal Convolution Module (PGTCM) to effectively capture the heterogeneous spatial-temporal dependencies. Extensive experiments are carried out on two citywide traffic datasets, demonstrating that MGC-GAN outperforms several state-of-the-art baseline methods.
|
|
09:20-09:40, Paper We-PS1-T7.5 | Add to My Program |
Speech Enhancement Generative Adversarial Network Architecture with Gated Linear Units and Dual-Path Transformers |
|
Zhang, Dehui | Qilu University of Technology |
Dong, Anming | Qilu University of Technology |
Yu, Jiguo | Qilu University of Technology |
Cao, Yi | University of Jinan |
Zhang, Chuanting | University of Bristol |
Zhou, You | Shandong HiCon New Media Institute Co., Ltd |
Keywords: Deep Learning, Machine Learning, Neural Networks and their Applications
Abstract: Generative Adversarial Networks (GANs) have been used in the field of speech enhancement due to their great potential for reducing the noise mixed in the signals. Most existing GAN-based speech enhancement approaches either operate in the time domain or exploit the magnitude spectra in the time-frequency domain, but lack consideration of direct optimization of the phase. In this paper, we propose a GAN architecture for speech enhancement based on gated linear units (GLUs) and Dual-Path Transformers (DPTs), which simultaneously deals with the amplitude and phase information in the time-frequency domain. The generator of the proposed GAN architecture is designed following an autoencoder structure fed by the real and imaginary parts of the time-frequency frames. The encoder of the generator is constructed from multiple cascaded convolutional GLUs (ConvGLUs), while the decoder consists of two groups of cascaded deconvolutional GLUs (DeconvGLUs), one for the real part of the spectrogram and the other for the imaginary part. The GLUs are adopted since they can help avoid the vanishing gradient issue in deep architectures by providing a linear path for the gradients while retaining non-linear capabilities. To capture the long-range dependent features in speech, we place DPTs between the encoder and the decoder of the generator, which contain multi-head attention modules and Bi-directional Gated Recurrent Units (BiGRUs). Moreover, the DPT structure is also merged with multiple one-dimensional convolutional layers in the discriminator of the GAN. Such a design not only improves the speech enhancement performance of the GAN by focusing on multiple features of speech, but also reduces the number of model parameters. Experimental results suggest that the proposed GAN architecture outperforms the existing benchmark GANs in terms of both objective speech intelligibility and quality with less computational complexity.
|
|
We-PS1-T8 Regular Session, QUADRANT |
Add to My Program |
Multi-Agent Safety Systems |
|
|
Chair: Mohajer, Navid | Deakin University |
|
08:00-08:20, Paper We-PS1-T8.1 | Add to My Program |
A Home for Principal Component Analysis (PCA) As Part of a Multi-Agent Safety System (MASS) for Human Robot Collaboration (HRC) within the Industry 5.0 Enterprise Architecture (EA) |
|
Rajendran, Anushri | Deakin University |
Kebria, Parham | Deakin University |
Mohajer, Navid | Deakin University |
Khosravi, Abbas | Deakin University |
Nahavandi, Saeid | Deakin University |
Keywords: Machine Learning, Application of Artificial Intelligence, Computational Intelligence
Abstract: Industry 5.0 is here, and human interaction experts claim that in the process of augmenting a high production/manufacturing workplace, a safety critical situation is created with the introduction of "Cobots". A Multi-Agent Safety System (MASS) is presented as a solution in this paper which uses commercial, wearable technologies with high data sharing acceptance rates such as the Apple Watch to collect and share real time ECG signals with the Cobot. Principal Component Analysis (PCA) is selected as a dimension reduction tool because it is well established and meets the requirements for reliability in the development of a human-centric, safety system. Five Machine Learning (ML) classifiers (KNN, NB, RF, DT and GBM) are used with binary classification to predict whether the human is Distracted (Event 1) or Not Distracted (Event 0) to determine if this will pose a safety risk to the Human Robot Collaboration (HRC) System. Decision Tree (DT) classifier with 4 Principal Components (PCs) is evaluated at 98% Accuracy and 99% AUC and is the recommended model for future development of the MASS. A road map is also presented to ensure the longevity of MASS while signifying the inclusion of real time data which can close the demographic data gap and help to improve the privacy, efficiency and contextual reliability of the MASS model in the Industry 5.0 workplace.
|
|
08:20-08:40, Paper We-PS1-T8.2 | Add to My Program |
Interpretable Navigation Agents Using Attention-Augmented Memory |
|
Qu, Jia | Mitsubishi Electric Corporation |
Miwa, Shotaro | Mitsubishi Electric Corp |
Domae, Yukiyasu | AIST |
Keywords: Application of Artificial Intelligence, Deep Learning, Machine Learning
Abstract: Deep reinforcement learning (DRL) has achieved remarkable success in various domains, from games to complex tasks. In several DRL applications, the agents must perform long-horizon tasks in partially observable environments. However, owing to the high performance of DRL, the decision-making process for the long-horizon task is unclear and difficult to interpret. Although memory-based models and the Transformer model have been proposed to overcome the interpretability issue, challenges of limited flexibility, lack of stability, and high computational cost remain. To address these concerns, we propose a low-computational-complexity and scalable DRL model that uses attention-augmented memory (AAM) to interpret the long-horizon decision-making process. AAM adds long short-term memory states of past observations to the memory and uses a soft attention bottleneck to combine them into a single contextual vector. The agent is then trained to make decisions based on this AAM together with the current observation. The AAM model is applied to the navigation problem of the Labyrinth [1], and attention and saliency maps are generated to show the areas highlighted by the agent in the current observation and the areas it attends to from memory. The resiliency of the model was evaluated using saliency and it was discovered that the proposed method is more resilient to the noise of visual observations compared with the baseline model. In summary, the proposed method demonstrates an interpretable and noise-robust DRL approach for long-horizon tasks.
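The step of combining stored memory states into a single contextual vector can be illustrated with a generic soft attention readout. The dot-product scoring, the function name `soft_attention`, and the toy vectors below are assumptions for illustration; the abstract does not state how AAM computes its attention scores.

```python
import math


def soft_attention(query, memory):
    """Return softmax attention weights over memory slots and the weighted context vector."""
    # Dot-product score between the query and each memory slot (an assumed scoring rule).
    scores = [sum(q * m for q, m in zip(query, slot)) for slot in memory]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(memory[0])
    context = [sum(w * slot[i] for w, slot in zip(weights, memory)) for i in range(dim)]
    return weights, context


# Toy memory of two stored states; the query strongly matches the first one.
memory = [[1.0, 0.0], [0.0, 1.0]]
weights, context = soft_attention([10.0, 0.0], memory)
print(weights, context)
```

Because the weights are non-negative and sum to one, they can be inspected directly to see which past states the readout relied on, which is the kind of signal an attention map visualizes.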
|
|
08:40-09:00, Paper We-PS1-T8.3 | Add to My Program |
Covert Attack and Detection through Deep Neural Network on Vision-Based Navigation Systems of Multi-Agent Autonomous Vehicles |
|
Moradi Sizkouhi, Amir Mohammad | Concordia University |
Rahimifard, Mahshid | Concordia University |
Selmic, Rastko | Concordia University |
Keywords: Artificial Immune Systems, Application of Artificial Intelligence, Deep Learning
Abstract: Autonomous vehicles are prone to worms of intelligent cyber-attacks that can use novel deep neural networks to adapt themselves to hosts and remain stealthy for a long period of time. This paper introduces a new, vision-based, covert attack and detection method, called VCAD-GAN, which can be applied to the navigation system of various autonomous vehicles. VCAD-GAN injects a faulty signal into the actuator channel to drive the vehicle out of the lane. To conceal the deviation effects, VCAD-GAN manipulates the camera output using a generative adversarial network such that the vehicle is shown on the road's center-line and aligned with it. The generator of VCAD-GAN reconstructs a synthesized top view of the road, while the discriminator classifies it as authentic or fake and sends feedback to the generator. A hybrid adversary detection system is also developed for VCAD-GAN using a customized deep neural network, global positioning system data, and an offline map. To evaluate the performance of VCAD-GAN, various 3D simulations were conducted. The simulation results show the validity and effectiveness of the proposed methods.
|
|
09:00-09:20, Paper We-PS1-T8.4 | Add to My Program |
Efficient Crowd Evacuation Guidance with Multiple Visual Signage Using a Middle-Range Agent Model and Black-Box Optimization |
|
Tsurushima, Akira | SECOM Co., Ltd. Intelligent Systems Laboratory |
Keywords: Agent-Based Modeling, Application of Artificial Intelligence, Optimization and Self-Organization Approaches
Abstract: To facilitate efficient crowd evacuations, we propose evacuation guidance systems with multiple visual signs to effectively convey helpful evacuation information to the evacuees. However, conventional signage systems use signs with fixed evacuation information that cannot respond to dynamically changing evacuation environments. Therefore, to address this issue, adaptive evacuation signage systems that alter evacuation information depending on situational changes have been proposed. This study addresses this problem by employing the black-box optimization technique using multi-agent simulation as the objective function. We assume a realistic evacuation environment consisting of central cores with complex aisles and broad surrounding space. Then, we simulate guiding a crowd of evacuees. The two problem settings for the visual evacuation signage assignment are guiding a crowd to a specific target and minimizing the total evacuation time. Our approach generated near-optimal solutions for the first setting and a reasonably good solution for the second setting.
|
|
09:20-09:40, Paper We-PS1-T8.5 | Add to My Program |
An Agent-Based Metaheuristic with Cooperation Approach Applied for Patients’ Scheduling in Hospital Emergency Department |
|
Faiza, Ajmi | Ecole Centrale De Lille |
Ajmi, Faten | University of Lille |
Ben Othman, Sarah | Ecole Centrale De Lille |
Zgaya-Biau, Hayfa | Lille University |
Renard, Jean-Marie | University of Lille |
Gregoire, Smith | CHU of Lille |
Hammadi, Slim | Ecole Centrale De Lille |
Keywords: Agent-Based Modeling, Optimization and Self-Organization Approaches, Evolutionary Computation
Abstract: In this paper, we propose an innovative metaheuristic characterized by a multi-dimensional chromosome where each dimension is driven by a rational agent. These agents have to communicate in order to implement evolving and adaptive genetic operators to accelerate the convergence towards the optimal solution. This cooperative approach is applied to solve the patient scheduling problem in the emergency department (ED). This problem is NP-hard due to the permanent interference between three types of arrival: already programmed patients, non-programmed patients and urgent non-programmed patients. Our scheduling problem has to integrate several dimensions, such as the medical, patient, and temporal dimensions. The multi-dimensional aspect of the chromosome is crucial to model the different dimensions of the ED. The main goal of the simulations is to assess the performance of the proposed agent-driven multi-dimensional chromosome. The simulation results confirm that the intra- and inter-chromosomal interactions make it possible to avoid the blind aspect of the genetic operators and affect the quality of solutions. The agents' cooperation and its ability to efficiently improve the quality of the solutions by intelligently exploring the search space are confirmed by the drop in average total patient waiting time by 15.09%.
|
|
We-PS1-T9 Regular Session, KEPLER |
Add to My Program |
System Modeling and Control IV |
|
|
Chair: Christen, Patrik | FHNW |
|
08:00-08:20, Paper We-PS1-T9.1 | Add to My Program |
Curb Your Self-Modifying Code |
|
Christen, Patrik | FHNW |
Keywords: System Modeling and Control, Discrete Event Systems
Abstract: Self-modifying code has many intriguing applications in a broad range of fields including software security, artificial general intelligence, and open-ended evolution. Having control over self-modifying code, however, is still an open challenge since it is a balancing act between providing as much freedom as possible so as not to limit possible solutions, while at the same time imposing restrictions to avoid security issues and invalid code or solutions. In the present study, I provide a prototype implementation of how one might curb self-modifying code by introducing control mechanisms for code modifications within specific regions and for specific transitions between code and data. I show that this is possible to achieve with the so-called allagmatic method - a framework to formalise, model, implement, and interpret complex systems inspired by Gilbert Simondon's philosophy of individuation and Alfred North Whitehead's philosophy of organism. Thereby, the allagmatic method serves as guidance for self-modification based on concepts defined in a metaphysical framework. I conclude that the allagmatic method seems to be a suitable framework for control mechanisms in self-modifying code and that there are intriguing analogies between the presented control mechanisms and gene regulation.
|
|
08:20-08:40, Paper We-PS1-T9.2 | Add to My Program |
Research on Low Energy Task Allocation and Scheduling Algorithm Based on Imprecise Heterogeneous Multi-Core Technology |
|
Gu, Haonan | College of Computer Science and Technology, Wuhan University Of |
Wu, Jing | Wuhan University of Science and Technology |
Hu, Wei | Wuhan University of Science and Technology |
Ma, Tianao | Wuhan University of Science and Technology |
Keywords: System Modeling and Control, Conflict Resolution, System Architecture
Abstract: In heterogeneous embedded systems, we try to find a new non real-time scheduling algorithm, which can not only meet the reliability requirements, but also achieve low energy consumption. The "Resource reservation time deterministic cyclic scheduling (RREDS)" algorithm is a kind of scheduling for offline cache pre-allocation. This scheduling algorithm combines time reservation and priority strategy. The algorithm can dynamically adjust according to different task sets. It can also adjust its own scheduling process to adapt to different scheduling environments. The periodic cyclic strategy enables low power consumption in task allocation. In the experimental part, we compare the algorithm proposed in this paper with the current mainstream real-time scheduling scheme. Experimental results show that the RREDCS algorithm has better performance.
|
|
08:40-09:00, Paper We-PS1-T9.3 | Add to My Program |
Biogeography-Based Optimisation for Weight Tuning of a Linear Time-Varying Model Predictive Control Approach for Autonomous Vehicles |
|
Chalak Qazani, Mohamad Reza | Institute for Intelligent Systems Research and Innovation (IISRI |
Asadi, Houshyar | Deakin University |
Karkoub, Mansour | Texas A&M University at Qatar |
Lim, Chee Peng | Deakin University |
Liew, Alan Wee-Chung | Griffith University |
Nahavandi, Saeid | Deakin University |
Keywords: System Modeling and Control, Modeling of Autonomous Systems, Intelligent Transportation Systems
Abstract: Self-driving vehicles, also known as Autonomous Vehicles (AVs), are steadily becoming very popular due to their huge benefits. They can improve safety, convenience and transport interconnectivity as well as reduce congestion, pollution and emissions. The generation of a comfortable motion signal for AV passengers via the calculation of accurate motion cues with lower motion discomfort is important to promote the adoption of AVs in society. Model predictive control (MPC) is currently used in AVs for tracking the motion signal with good accuracy. However, the higher efficiency of MPC is directly related to the right setting of the weights. In addition, the tracking of time-varying longitudinal velocity is not possible without using linear time-varying (LTV) MPC. In this study, an LTV MPC system is designed and developed as a highly efficient motion tracking mechanism for AVs to reduce the motion tracking error and motion discomfort. In addition, biogeography-based optimisation (BBO) is employed to determine the optimal weights of the LTV MPC controller, which further reduces the motion tracking error and increases the motion comfort for users. The empirical study demonstrates that a BBO-tuned LTV MPC controller decreases the mean square error of motion tracking by 4.79% as compared with that of a manually-tuned version. Moreover, the mean square errors of the lateral deviation and relative yaw decrease by 91.22% and 19.14% as compared with those from a manually-tuned LTV MPC counterpart, respectively.
|
|
09:00-09:20, Paper We-PS1-T9.4 | Add to My Program |
Adaptive Vibration Control of Vehicle Semi-Active Suspension System Based on Ensemble Fuzzy Logic and Reinforcement Learning |
|
Liang, Tong | University of Jinan |
Han, Shiyuan | University of Jinan |
Zhou, Jin | University of Jinan |
Chen, Yuehui | University of Jinan |
Yang, Jun | Shandong Jiaotong University |
Zhao, Jia | Zhongtong Bus Holding Co |
Keywords: System Modeling and Control, Mechatronics
Abstract: The integration of reinforcement learning with fuzzy logic can be effective in compensating for external disturbances and complex dynamics when designing the control strategy for vehicle suspension. The main contribution of this paper is a learning-based adaptive vibration control strategy for semi-active suspension systems, which combines fuzzy logic with the reward function of reinforcement learning to improve the robustness and feasibility of the vibration control strategy. Moreover, an improved proximal policy optimization algorithm combined with fuzzy logic is proposed for realizing the trial-and-error reinforcement learning. Specifically, the reward function with fuzzy logic is formulated to meet the requirements of suspension performance under different road conditions, in which the fuzzy logic is designed to fuzzily process the collected road information, update the weight matrix coefficients in real time, and adjust the optimization objectives adaptively. Finally, numerical simulation results are given to demonstrate the effectiveness of the proposed vibration control strategy.
|
|
09:20-09:40, Paper We-PS1-T9.5 | Add to My Program |
Optimal Robust Control for Tremor Suppression in Parkinson’s Disease |
|
Saeedi, Mobin | Shiraz University of Technology |
Zarei, Jafar | Shiraz University of Technology |
Balouchi, Hoda | Jundishapur University of Medical Sciences |
Razavi-Far, Roozbeh | University of Windsor |
Saif, Mehrdad | University of Windsor |
Keywords: System Modeling and Control, Service Systems and Organizations
Abstract: Deep brain stimulation (DBS) is an effective and promising therapy to control Parkinson's tremor movement in patients with advanced Parkinson's disease (PD). This paper proposes a new alternative medication that has several advantages, including compatibility with individual needs and low side effects. There has been a rapid improvement in the literature on the development of the dynamic computational model of neuroscience, alongside the development of DBS. A combination of DBS and model-based control strategies opens up a new vision for Parkinson's disease treatment. Despite the numerous studies on basal ganglia (BG) modeling, researchers are required to employ adaptive and robust strategies to eliminate Parkinson's patients' tremors. This paper proposes a new adaptive optimal fast terminal sliding mode control (AOFTSMC) method to mitigate tremors by tuning GABA through DBS. This approach provides a finite-time convergence law and a new method to stimulate the inner nuclei of the BG in a robust and optimal manner, which removes tremors from the PD fluctuation signal in the presence of uncertainties. Finally, simulation results of the basal ganglia model under the addressed approach are presented to demonstrate the effectiveness of the proposed method.
|
|
We-PS1-T10 Regular Session, TYCHO |
Add to My Program |
Practical Applications of BMIs |
|
|
Chair: Volosyak, Ivan | Rhine-Waal University of Applied Sciences |
Co-Chair: Putze, Felix | University of Bremen |
|
08:00-08:20, Paper We-PS1-T10.1 | Add to My Program |
Is That Real? A Multifaceted Evaluation of the Quality of Simulated EEG Signals for Passive BCI |
|
Eilts, Hendrik | University of Bremen |
Putze, Felix | University of Bremen |
Keywords: Passive BMIs
Abstract: In this work, the quality of simulated EEG signals is investigated and subsequently compared with real EEG measurements. For this purpose, the EEG signals are considered both in the time and frequency domain and analyzed as to how realistic they appear. In addition, the quality of the simulated signals will be assessed via a survey. Finally, it will be tested whether augmentation of EEG data with generated data improves the classification. For the generation of the EEG signals, progressive Wasserstein GANs with gradient penalty were trained on the attention recognition data. Interviewing experts in the field of EEG signal analysis and/or processing will be used to evaluate the similarity of simulated and real EEG signals. To investigate the effect of data augmentation, the Shallow FBCSPNet model was trained on different ratios of the data. The results show that the simulated EEG signals appear realistic in both time and frequency domains. The survey also showed a positive result: no significant differences were detected by the participants between the simulated and the real data. One exception, however, was the perceived noise level, which was rated higher in the simulated data. The augmentation of the data with the simulated data showed a moderate improvement in accuracy and F-score, which is more pronounced when the original data set is reduced in size. Overall, the evaluation of the results showed that the quality of the simulated EEG signals is comparable to the quality of the real signals. This is promising for a variety of applications in brain-computer interfaces or in the analysis of cognitive processes that could benefit from the simulated signals.
|
|
08:20-08:40, Paper We-PS1-T10.2 | Add to My Program |
Evaluation of a Proposal for Sustained Attention Training through BCI with an Estimate of Effective Connectivity |
|
Dias Casagrande, Wagner | Federal University of Espirito Santo - UFES |
Delisle-Rodriguez, Denis | Santos Dumont Institute |
Nakamura-Palacios, Ester Miyuki | Federal University of Espirito Santo - UFES |
Frizera-Neto, Anselmo | Federal University of Espirito Santo UFES |
Keywords: Active BMIs, Other Neurotechnology and Brain-Related Topics
Abstract: Many assistive and rehabilitation Brain Computer Interface (BCI) systems have been developed by using brain imaging techniques through electroencephalogram (EEG). This study aims to estimate the effective connectivity of subjects that received attention training through a BCI-based game system. For this purpose, mathematical methods and computational tools were used here to evaluate the effect of using the BCI for sustained attention training, programmed with different levels of difficulty for two groups of subjects. As a result, it was possible to identify differences in connectivity patterns between both groups, mainly the flow outflow characteristic in the left temporal region of subjects who performed the training using the game programmed with greater difficulty level. With this preliminary result, we concluded that the proposed strategy for sustained attention training by applying different speed control in a BCI-based game resulted in the generation of different patterns, capable of distinguishing the users in the two groups. Our approach and findings can contribute to improving the development of EEG-based BCI technologies in identifying patterns related to the state of attention.
|
|
08:40-09:00, Paper We-PS1-T10.3 | Add to My Program |
Improving Silent Speech BCI Training Procedures through Transfer from Overt to Silent Speech |
|
Rekrut, Maurice | German Research Center for Artificial Intelligence (DFKI) |
Selim, Abdulrahman Mohamed | Saarland Informatics Campus |
Krüger, Antonio | German Research Center for Artificial Intelligence (DFKI) |
Keywords: Active BMIs, BMI Emerging Applications
Abstract: Silent speech Brain-Computer Interfaces (BCIs) try to decode imagined speech from brain activity. One of the most common and comfortable ways to establish BCIs is by measuring the electrical activity of the brain at the scalp surface via Electroencephalography (EEG). EEG-based silent speech BCIs require a tremendous amount of training data usually collected in tedious training sessions in which participants silently repeat words presented on a screen. Such sessions usually last several hours and require participants to remain focused and reduce their movement to a minimum, which makes those procedures mentally and physically exhausting. Within this work we present an approach to overcome those exhausting sessions by training a silent speech classifier on data recorded while speaking certain words and transferring this classifier to EEG data recorded during silent repetition of the same words. This approach allows not only for a less mentally and physically exhausting training procedure but also for a more productive one, as the overt speech output can be used for interaction while the classifier for silent speech is trained simultaneously. We evaluated our approach in a study in which 15 participants navigated a virtual robot on a screen in a game-like scenario through a maze, once with 5 overtly spoken and once with the same 5 silently spoken command words. In an offline analysis we trained a classifier on overt speech data and let it predict silent speech data. Our classification results show not only successful results for the transfer (61.78%) significantly above chance level but also comparable results to a standard silent speech classifier (71.48%) trained and tested on the same data. These results illustrate the potential of the method to replace the currently tedious training procedures for silent speech BCIs with a more comfortable, engaging and productive approach by a transfer from overt to silent speech.
|
|
09:00-09:20, Paper We-PS1-T10.4 | Add to My Program |
Passive BCI Oddball Paradigm for Dementia Digital Neuro-Biomarker Elucidation from Attended and Inhibited ERPs Utilizing Information Geometry Classification Approaches |
|
Rutkowski, Tomasz M. | RIKEN |
Abe, Masato S. | Doshisha University |
Tokunaga, Seiki | RIKEN AIP |
Sugimoto, Hikaru | RIKEN AIP |
Komendzinski, Tomasz | Nicolaus Copernicus University |
Otake-Matsuura, Mihoko | RIKEN AIP |
Keywords: Passive BMIs, BMI Emerging Applications, Active BMIs
Abstract: Brain-computer interface (BCI) and efficient machine learning (ML) algorithms belonging to the so-called ‘AI for social good’ domain contribute to the well-being improvement of patients with limited mobility or communication skills. We report preliminary results from a project focusing on developing a dementia digital neuro-biomarker for early-onset prognosis of a possible cognitive decline utilizing a passive BCI approach. We also report findings from two elderly volunteer pilot study groups in oddball paradigm EEG responses to attended (target) and inhibited (ignored) images in a classical short-term-memory evaluating oddball paradigm. We propose applying an information geometry approach employing Riemannian geometry tools for EEG covariance matrix-derived features used in subsequent shallow machine learning classification. The reported pilot study showcases the vital application of artificial intelligence (AI) for an early-onset mild cognitive impairment (MCI) prediction in the elderly.
|
|
09:20-09:40, Paper We-PS1-T10.5 | Add to My Program |
Human Intracortical Responses to Varying Electrical Stimulation Conditions Are Separable in Low-Dimensional Subspaces |
|
Sun, Samantha | University of Washington |
Levinson, Lila | University of Washington |
Paschall, Courtnie | University of Washington |
Herron, Jeffrey | University of Washington |
Weaver, Kurt | University of Washington |
Hauptman, Jason | Seattle Children's Hospital |
Ko, Andrew | University of Washington |
Ojemann, Jeffrey | University of Washington |
Rao, Rajesh | University of Washington |
Keywords: Human-Machine Interface
Abstract: Electrical stimulation is a powerful tool for targeted neurorehabilitation, and recent work in adaptive stimulation where stimulation can be adjusted in real-time has shown promise in improving stimulation outcomes and reducing stimulation-induced side effects. Mapping the relationship between electrical stimulation input and neural activity response can help reveal interactions between stimulation and underlying neural activity and can give us tools to iterate and improve on our stimulation protocols. Here, we introduce methods for identifying low-dimensional subspaces of human intracortical responses to electrical stimulation in invasive electroencephalography. In epilepsy patients (n=4) undergoing clinical monitoring, we applied a stimulation protocol of varying stimulation amplitude and frequency in 5-second intervals to capture a range of responses to different stimulation conditions. We characterized these responses using time-frequency spectral power, applied baseline subtraction and outlier removal procedures, and performed principal component analysis across frequencies. We identified that intracortical responses to different stimulation conditions can be represented in a 3-dimensional subspace, accounting for more than 95% of the variance. Using pairwise support vector machine classification, we demonstrated separability of intracortical responses to different stimulation conditions across subjects, where this separability was contingent on performing baseline subtraction and outlier removal. Our results represent a first step towards building a mapping or predictive model from stimulation input to neural response, an important prerequisite for adaptive closed-loop stimulation for targeted neurorehabilitation.
|
|
We-PS1-T11 Regular Session, STELLA |
Add to My Program |
New Frontiers of Intelligent Pervasive Healthcare Systems |
|
|
Chair: Casalino, Gabriella | Università Degli Studi Di Bari, "A. Moro" |
Co-Chair: Rijcken, Emil | Eindhoven University of Technology |
|
08:00-08:20, Paper We-PS1-T11.1 | Add to My Program |
Exploring Embedding Spaces for More Coherent Topic Modeling in Electronic Health Records (I) |
|
Rijcken, Emil | Eindhoven University of Technology |
Zervanou, Kalliopi | LUMC |
Spruit, Marco | LUMC |
Mosteiro, Pablo | Utrecht University |
Scheepers, Floortje | UMCU |
Kaymak, Uzay | Eindhoven University of Technology |
Keywords: Application of Artificial Intelligence, Hybrid Models of Neural Networks, Fuzzy Systems, and Evolutionary Computing, Machine Learning
Abstract: The written notes in the Electronic Health Records contain a vast amount of information about patients. Implementing automated approaches for text classification tasks requires the automated methods to be well-interpretable, and topic models can be used for this goal as they can indicate what topics in a text are relevant to making a decision. We propose a new topic modeling algorithm, FLSA-E, and compare it with another state-of-the-art algorithm FLSA-W. In FLSA-E, topics are found by fuzzy clustering in a word embedding space. Since we use word embeddings as the basis for our clustering, we extend our evaluation with word-embeddings-based evaluation metrics. We find that different evaluation metrics favour different algorithms. Based on the results, there is evidence that FLSA-E has fewer outliers in its topics, a desirable property, given that within-topic words need to be semantically related.
|
|
08:20-08:40, Paper We-PS1-T11.2 | Add to My Program |
A Mobile App for Contactless Measurement of Vital Signs through Remote Photoplethysmography (I) |
|
Casalino, Gabriella | Università Degli Studi Di Bari, "A. Moro" |
Castellano, Giovanna | University of Bari |
Nisio, Andrea | University of Roma Sapienza |
Pasquadibisceglie, Vincenzo | Università Degli Studi Di Bari |
Zaza, Gianluca | Università Degli Studi Di Bari "Aldo Moro" |
Keywords: Application of Artificial Intelligence, Computational Intelligence, Machine Vision
Abstract: The healthcare domain has undergone a huge transformation thanks to the availability of new technologies. In particular, health monitoring systems have entered everyday life without interfering with the daily routine. Mobile phones are increasingly used as health monitoring systems by means of ad-hoc applications. In this work, we propose a mobile app for contactless monitoring of vital signs, such as heart rate and blood oxygen saturation. Unlike other devices in the literature, it is able to measure vital signs from the analysis of short videos through the use of remote photoplethysmography technology. A client-server architecture has been developed to run the signal and video processing on the server while implementing video acquisition and communication with the user on the smartphone. Experiments have shown the effectiveness of the developed app in accurately measuring vital parameters.
|
|
08:40-09:00, Paper We-PS1-T11.3 | Add to My Program |
Evolving Fuzzy Neural Network Based on Null-Unineurons for the Identification of Coronary Artery Disease (I) |
|
Guimarães, Augusto Júnio | MRV Engenharia |
de Campos Souza, Paulo Vitor de | Johannes Kepler University Linz |
Rodrigues Batista, Huoston | University of Applied Sciences Upper Austria |
Lughofer, Edwin | Johannes Kepler University of Linz |
Keywords: Hybrid Models of Neural Networks, Fuzzy Systems, and Evolutionary Computing, Fuzzy Systems and their applications, Expert and Knowledge-Based Systems
Abstract: Coronary diseases affect a large part of the world population and have become the target of significant research in the academic field. The creation and use of intelligent models to facilitate the diagnosis of these diseases can allow treatments to be performed promptly to avoid further problems for patients. This paper applies an innovative evolving fuzzy neural network model to solve the problem of coronary heart disease diagnosis and extract valuable insights from the evaluated dataset. The null-unineurons that compose the model's architecture can extract fuzzy rules, representing linguistic knowledge about the target problem. A dataset that condenses the most famous data sources on this problem to classify coronary heart disease was applied to state-of-the-art models of evolving fuzzy systems. The results obtained by the model applied in this study are similar to the state-of-the-art results. Furthermore, the model provides relevant interpretations about the evolution of the problem evaluation.
|
|
09:00-09:20, Paper We-PS1-T11.4 | Add to My Program |
Brain Computer Interface: Deep Learning Approach to Predict Human Emotion Recognition (I) |
|
Ardito, Carmelo | Politecnico Di Bari |
Bortone, Ilaria | N.I. of Gastroenterology ”S. De Bellis” |
Colafiglio, Tommaso | Dept. of Electrical and Information Engineering (DEI), Politecni |
Di Noia, Tommaso | Dept. of Electrical and Information Engineering (DEI), Politecni |
Di Sciascio, Eugenio | Politecnico Di Bari |
Lofù, Domenico | Dept. of Electrical and Information Engineering (DEI), Politecni |
Narducci, Fedelucio | Politecnico Di Bari |
Sardone, Rodolfo | Data Sciences and Innovation, N.I. of Gastroenterology ”S. De Be |
Sorino, Paolo | Politecnico Di Bari |
Keywords: Application of Artificial Intelligence, Deep Learning
Abstract: Brain Computer Interfaces allow controlling machines through signals coming from Electroencephalography (EEG) analysis. Nowadays, there are several cheap electroencephalographs available on the market that guarantee good quality EEG signals. A very interesting approach in this area is related to detecting the emotional states of a user through the analysis of her EEG signal. In our study, we tried to detect the emotional polarity (Valence), the state of emotional excitement (Arousal), and the level of emotion control (Dominance). Through metric interpolation and Russell’s circumplex model, it is possible to characterize and define the current emotional state of the user who wears the device. Our study presents a prototype of an EEG-based emotion recognizer that provides the user’s emotional state exploitable as bio-feedback.
|
|
09:20-09:40, Paper We-PS1-T11.5 | Add to My Program |
Deep Knowledge Reasoning Guided Disease Prediction (I) |
|
Liang, Yuanzhi | Fudan University |
Wang, Haofen | Tongji University |
Zhang, Wenqiang | Fudan University |
Keywords: Application of Artificial Intelligence, Knowledge Acquisition
Abstract: Disease prediction, which aims to predict possible future diseases for patients, is a fundamental research problem in medical informatics. Many studies have proposed introducing external knowledge to enhance deep learning models and have achieved good results, but most of these studies only consider single-hop relationship information from knowledge graphs or simply introduce partial knowledge graphs, and thus fail to effectively mine knowledge paths to understand the relationships between diseases. To this end, we propose a new approach, which uses existing medical knowledge graphs for multi-hop reasoning to guide the self-attention based transformer model for disease prediction. Specifically, we design a reinforcement learning algorithm to perform path reasoning in the knowledge graph to obtain explicit disease progression paths and fuse them with the original Electronic Health Records (EHR) data. After embedding, we capture the implicit relationships between the deep knowledge and the original EHR information through several transformer encoders based on the self-attention mechanism to better extract features. At the same time, multi-hop knowledge is more interpretable than single-hop knowledge in terms of disease prediction. Experimental results on the real-world medical dataset MIMIC-III show the superiority of the proposed approach compared to a series of state-of-the-art baselines.
|
|
We-PS1-T12 Regular Session, ZODIAC |
Add to My Program |
Reinforcement Learning and Its Applications |
|
|
Co-Chair: Fellner, David | AIT Austrian Institute of Technology |
|
08:00-08:20, Paper We-PS1-T12.1 | Add to My Program |
Independent Multi-Agent Reinforcement Learning Using Common Knowledge |
|
Hu, Haomeng | National University of Defense Technology |
Shi, Dianxi | National Innovation Institute of Defense Technology |
Yang, Huanhuan | National University of Defense Technology |
Peng, Yingxuan | National University of Defense Technology |
Zhou, Yating | National University of Defense Technology |
Yang, Shaowu | National University of Defense Technology |
Keywords: Agent-Based Modeling, Deep Learning, Neural Networks and their Applications
Abstract: Many recent multi-agent reinforcement learning algorithms use centralized training with decentralized execution (CTDE), which results in a training process that relies on global information and suffers from dimensional explosion. The independent learning (IL) approaches are simple in structure and can be more easily deployed to a wider range of multi-agent scenarios, but they can only solve relatively simple problems due to environment non-stationarity and partial observability. With this motivation, we let IL agents compute common knowledge information and fuse it with observation to explicitly exploit common knowledge. In addition, we chose a suitable network structure according to the characteristics of IL, using convolutional layers and GRU layers. Based on the above two improvements, we implement two IL algorithms. In our experiments, the algorithms we implemented show significant performance improvements compared to original IL algorithms and further approach CTDE while outperforming multi-agent common knowledge reinforcement learning.
|
|
08:20-08:40, Paper We-PS1-T12.2 | Add to My Program |
Sub-Optimal Policy Aided Multi-Agent Reinforcement Learning for Flocking Control |
|
Qiu, Yunbo | Tsinghua University |
Jin, Yue | Tsinghua University |
Wang, Jian | Tsinghua University |
Zhang, Xudong | Tsinghua University |
Keywords: Agent-Based Modeling, Application of Artificial Intelligence, Machine Learning
Abstract: Flocking control is a challenging problem, where multiple agents, such as drones or vehicles, need to reach a target position while maintaining the flock and avoiding collisions with obstacles and collisions among agents in the environment. Multi-agent reinforcement learning has achieved promising performance in flocking control. However, methods based on traditional reinforcement learning require a considerable number of interactions between agents and the environment. This paper proposes a sub-optimal policy aided multi-agent reinforcement learning algorithm (SPA-MARL) to boost sample efficiency. SPA-MARL directly leverages a prior policy that can be manually designed or solved with a non-learning method to aid agents in learning, where the performance of the policy can be sub-optimal. SPA-MARL recognizes the difference in performance between the sub-optimal policy and itself, and then imitates the sub-optimal policy if the sub-optimal policy is better. We leverage SPA-MARL to solve the flocking control problem. A traditional control method based on artificial potential fields is used to generate a sub-optimal policy. Experiments demonstrate that SPA-MARL can speed up the training process and outperform both the MARL baseline and the used sub-optimal policy.
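The two ingredients named in this abstract, a prior policy from artificial potential fields and a rule that imitates the prior only while it performs better, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' SPA-MARL implementation: the function names, the gains `k_att` and `k_rep`, the influence radius `d0`, and the use of scalar returns for the comparison are all hypothetical.

```python
import math

def apf_action(pos, goal, obstacles, k_att=1.0, k_rep=1.0, d0=1.0):
    """Return a 2-D force vector from a basic artificial potential field."""
    # Attractive term pulls the agent straight toward the goal.
    fx = k_att * (goal[0] - pos[0])
    fy = k_att * (goal[1] - pos[1])
    for ox, oy in obstacles:
        dx, dy = pos[0] - ox, pos[1] - oy
        d = math.hypot(dx, dy)
        # Repulsion only acts inside the (assumed) influence radius d0.
        if 0.0 < d < d0:
            mag = k_rep * (1.0 / d - 1.0 / d0) / (d * d)
            fx += mag * dx / d
            fy += mag * dy / d
    return fx, fy

def choose_target(own_return, prior_return):
    """Imitate the prior only while it scores better than the learned policy."""
    return "imitate_prior" if prior_return > own_return else "follow_own_policy"
```

With no obstacles the force points directly at the goal; an obstacle inside `d0` adds a push away from it. How the performance gap is actually estimated in the paper is not stated in the abstract, so the scalar comparison here is only a stand-in.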
|
|
08:40-09:00, Paper We-PS1-T12.3 | Add to My Program |
Generating Manipulation Sequences Using Reinforcement Learning and Behavior Trees for Peg-In-Hole Task |
|
Xu, Jiahua | Wuhan University of Science and Technology |
Lin, Yunhan | Wuhan University of Science and Technology |
Zhou, Haotian | Wuhan University of Science and Technology |
Min, Huasong | Wuhan University of Science and Technology |
Keywords: Machine Learning, Neural Networks and their Applications, Application of Artificial Intelligence
Abstract: Reinforcement Learning (RL), a method of learning skills through trial and error, has been successfully used in many robotics applications in recent years. We combine manipulation primitives (MPs), behavior trees (BTs), and reinforcement learning to propose an algorithm for peg-in-hole tasks, which speeds up the convergence of the RL model and enhances adaptability to dynamic environments. Manipulation primitives are used as actions for RL, which reduces the gap between control instructions and robotic actions and speeds up the convergence of the RL model. Behavior trees are used for robot behavior control, which lets robots actively adapt to changes in the environment. In experiments, RL-BT, the combination of RL and BT, is applied to the peg-in-hole task in the Gazebo simulation environment using a UR5 as the actuator. The experiments are conducted on a simple peg and a complex multi-hole peg and cover three aspects: convergence speed, adaptability to a dynamic environment, and algorithm robustness. The experimental results show that RL-BT can speed up convergence and adapt to changes in the environment.
|
|
09:00-09:20, Paper We-PS1-T12.4 | Add to My Program |
Deep Reinforcement Learning for Object Detection with the Updatable Target Network |
|
Yu, Wenwu | University of Chinese Academy of Sciences |
Wang, Rui | Institute of Software, Chinese Academy of Sciences |
Hu, Xiaohui | Institute of Software, Chinese Academy of Sciences |
Keywords: Agent-Based Modeling, Application of Artificial Intelligence, Deep Learning
Abstract: We propose a reinforcement learning method that sets up object detection within an image as a game. Unlike some traditional image detection methods, which produce a large number of candidate boxes to detect objects, our approach allows an agent to use a small set of image locations to detect a visual object effectively. We train the agent to identify useless information at the four edges of the image and discard it by representing predefined candidate areas in a tree-like hierarchy. This series of procedures, which builds the game environment, is designed to locate target objects. We also show that the updatable target network, which makes use of good samples, helps the agent reach stability faster and markedly improves training results. Extensive comparison experiments on the Pascal VOC benchmark dataset verify the superior performance of the proposed method.
|
|
09:20-09:40, Paper We-PS1-T12.5 | Add to My Program |
Deep Reinforcement Learning-Based UAV-Assisted Mobile Edge Computing Offloading and Resource Allocation Decision |
|
Du, Shougang | Beijing Information Science and Technology University |
Chen, Xin | Beijing Information Science and Technology University |
Jiao, Libo | Beijing Information Science and Technology University |
马, 卓 | Beijing Information Science and Technology University |
Wang, Yijie | Beijing Information Science and Technology University |
Keywords: Cloud, IoT, and Robotics Integration, Deep Learning, Computational Intelligence
Abstract: The surge of user data traffic has brought great challenges to the computing and energy capacity of mobile terminals (MTs). Mobile edge computing (MEC) is regarded as an efficient way to alleviate this problem: it can transfer tasks to an MEC server and improve quality of service (QoS). In case of network failure, an unmanned aerial vehicle (UAV) is deployed as a data transmission hub connecting to the MEC server to restore the network. In this article, we consider a UAV transmission hub (UTH) that communicates with the macro base station (MBS). MTs can offload tasks to the MBS for processing through the UTH, and the MEC server in the MBS allocates computing resources to MTs. We propose a computation offloading and resource allocation decision scheme based on deep deterministic policy gradient (DDPG). The scheme considers the continuous generation of dynamic tasks, and the optimization objective is to minimize the long-term average system cost. Simulation results verify the performance of the DDPG-based offloading and resource allocation decision scheme, which effectively optimizes the average system cost in a random dynamic environment.
|
|
We-PS2-T1 Regular Session, MERIDIAN |
Add to My Program |
Image Classification and Processing |
|
|
Chair: Lee, Min-Ho | Nazarbayev University |
Co-Chair: Sekanina, Lukas | Brno University of Technology |
|
10:00-10:20, Paper We-PS2-T1.1 | Add to My Program |
Meta Pseudo Labels for Chest X-Ray Image Classification |
|
Abu, Assanali | Nazarbayev University |
Abdukarimov, Yerkin | Nazarbayev University |
Anh Tu, Nguyen | Nazarbayev University |
Lee, Min-Ho | Nazarbayev University |
Keywords: Deep Learning, Machine Learning, Transfer Learning
Abstract: Deep learning methods are increasingly applied to medical imaging tasks. Nevertheless, medical images are very often unlabelled, which makes it difficult for AI algorithms to use image features for classification. Such limitations make it almost impossible to develop robust and accurate algorithms for medical image classification. In this study, we use Meta Pseudo Labels, a semi-supervised learning method, which allows us to train models with a limited amount of labelled data extracted from chest X-ray images. The approach demonstrates promising results, achieving 92.5% accuracy when only 16% of the data is labelled. Additionally, we implement a transfer learning approach to obtain higher accuracy when only 0.5% of the data is labelled. The approach involves initializing the model with the weights obtained from training it on a dataset with a higher portion of labelled data. It proves successful, increasing model accuracy with 0.5% labelled data by 26 percent on average.
|
|
10:20-10:40, Paper We-PS2-T1.2 | Add to My Program |
Adaptive Weights and Sample's Distribution for Few Shot Classification |
|
Yang, Tengyu | Soochow University |
Li, Fanzhang | Soochow University |
Keywords: Machine Learning, Deep Learning, Image Processing and Pattern Recognition
Abstract: Few-shot classification algorithms have developed steadily in recent years, but many of them are now facing bottlenecks. After studying the prototype network, we found that its prototype calculation method and loss function are relatively simple and can be improved. We present a few-shot classification algorithm inspired by the prototype network and the center loss function. Based on the prototype network, we propose the Adaptive Weights Model (AWM), which gives each sample a better weight parameter so that we obtain a more reasonable prototype. Based on the center loss function, we propose the Adaptive Sample's Distribution Model (ASDM), which enables us to optimize the distribution of samples. Extensive experiments show that the proposed model is effective. Few-shot learning is becoming more and more important in machine learning, yet most few-shot learning algorithms rely only on deep neural networks to process samples. Here, we offer a different idea: giving more reasonable weights to the samples so that the prototype is more reasonable. We believe this is a promising direction for future work.
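The idea of replacing the plain mean prototype with a weighted one can be illustrated with a short sketch. It assumes the per-sample weights are already available; how AWM learns them is not described in the abstract, and the function names below are hypothetical.

```python
import math

def weighted_prototype(embeddings, weights):
    """Weighted mean of support embeddings; weights are normalised to sum to 1."""
    total = sum(weights)
    dim = len(embeddings[0])
    return [
        sum(w * e[i] for w, e in zip(weights, embeddings)) / total
        for i in range(dim)
    ]

def classify(query, prototypes):
    """Return the label of the nearest prototype (Euclidean distance)."""
    return min(prototypes, key=lambda label: math.dist(query, prototypes[label]))
```

With uniform weights this reduces to the standard prototype-network class mean, so the weighting can be read as a strict generalisation of the baseline.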
|
|
10:40-11:00, Paper We-PS2-T1.3 | Add to My Program |
Adaptive Metric-Weight and Bias Network for Few-Shot Learning |
|
Wang, Jiaming | Soochow University |
Li, Fanzhang | Soochow University |
Keywords: Machine Learning, Deep Learning, Image Processing and Pattern Recognition
Abstract: Few-shot learning aims to reduce the dependence of deep learning on large amounts of data. It learns new concepts through a large number of training tasks instead of a large amount of data, so it can quickly adapt to new tasks given a small amount of training data. Among few-shot learning algorithms, metric-based methods perform classification by finding a robust data embedding space and applying a distance metric, which makes them an efficient approach to few-shot learning. Despite the success of metric-based methods, their adaptability to the data distribution is still insufficient, so classification results depend more on the accuracy of the encoder network and are susceptible to noisy features. In this paper, we filter noisy features and correct the data distribution through adaptive learning of metric weights and data distribution biases, and we propose corresponding loss functions to evaluate and update our adaptive module and encoding network. Experimental results show that our proposed method achieves significant improvement on standard few-shot image classification datasets.
|
|
11:00-11:20, Paper We-PS2-T1.4 | Add to My Program |
Multi-Relational Semantic Distillation for Few-Shot Object Detection |
|
Xie, Qingtao | China University of Petroleum |
Pan, Xiangshuai | China University of Petroleum (East China) |
Liu, Weifeng | China University of Petroleum (East China) |
Liu, Baodi | College of Information and Control Engineering, China University |
Keywords: Deep Learning, Image Processing and Pattern Recognition, Neural Networks and their Applications
Abstract: While few-shot object detection (FSOD) has been developed to a certain extent, it is still far from practical application. Most existing methods use traditional object detection methods as the basic framework and improve on them only to a limited extent. Previous methods often ignore the special characterization relationship between support and query images. This paper fully investigates the effect of support images on detection performance and proposes a new FSOD method called Multi-relational Semantic Distillation (MSD). Our approach aims to improve FSOD performance by building a multi-relational semantic representation model with support and query features. In addition, we propose a support enhancement (SE) module based on the self-attention mechanism to enhance the useful information in the support features and mitigate the negative impact of low-quality support images. To verify the effectiveness of MSD, we conduct extensive experiments on the Pascal VOC and MS-COCO datasets. Experiments show that MSD achieves competitive results at low shots compared to other state-of-the-art few-shot detectors.
|
|
11:20-11:40, Paper We-PS2-T1.5 | Add to My Program |
Evolutionary Approximation in Non-Local Means Image Filters |
|
Valek, Matej | Brno University of Technology |
Sekanina, Lukas | Brno University of Technology |
Keywords: Computational Intelligence, Evolutionary Computation, Image Processing and Pattern Recognition
Abstract: The non-local means image filter is a non-trivial denoising algorithm for color images utilizing floating-point arithmetic operations in its reference software implementation. In order to simplify this algorithm for an on-chip implementation, we investigate the impact of various number representations and approximate arithmetic operators on the quality of image filtering. We employ Cartesian Genetic Programming (CGP) to evolve approximate implementations of a 20-bit signed multiplier which is then applied in the image filter instead of the conventional 32-bit floating-point multiplier. In addition to using several techniques that reduce the huge design cost, we propose a new mutation operator for CGP to improve the search quality and obtain better approximate multipliers than with CGP utilizing the standard mutation operator. Image filters utilizing evolved approximate multipliers can save 35% in power consumption of multiplication operations for a negligible drop in the image filtering quality.
|
|
We-PS2-T2 Regular Session, ZENIT |
Add to My Program |
Medical Informatics II |
|
|
Chair: Strasser, Thomas | AIT Austrian Institute of Technology |
|
10:00-10:20, Paper We-PS2-T2.1 | Add to My Program |
Automatic Audio-Based Screening System for Alzheimer's Disease Detection |
|
Lin, Sheng-Ya | National Taiwan University |
Chang, Ho-Ling | National Taiwan University |
Hwang, Jwu-Jia | National Taiwan University |
Wai, Thiri | National Taiwan University |
Chang, Yu-Ling | National Taiwan University |
Fu, Li-Chen | National Taiwan University |
Keywords: Medical Informatics
Abstract: Alzheimer's disease (AD) and other types of dementia have become a public health priority worldwide. To lessen the burden of AD diagnosis, an automatic screening system that can be deployed at large scale and at low cost is needed. This paper presents a speech assessment system for cognitive impairment detection, which detects whether elders have AD or suffer from mild cognitive impairment (MCI) based on audio recordings taken during neuropsychological tests. The audio waveform is first transformed into a Mel-spectrogram and then downsampled. By combining a Transformer with a convolutional neural network (CNN) architecture, we extract features and obtain a better representation for the classifier. We conducted experiments on 120 subjects with a balanced distribution of ordinary aging, MCI, and AD patients to validate our study. Our experiments achieve accuracies of 91% and 79% for distinguishing the AD and MCI groups from ordinary aging people, respectively.
|
|
10:20-10:40, Paper We-PS2-T2.2 | Add to My Program |
Learning Label Independence and Relevance for Multi-Label Biomedical Text Classification |
|
Chen, Zihao | Wuhan University of Technology |
Peng, Jing | Wuhan University of Technology |
Keywords: Medical Informatics
Abstract: The rapidly growing biomedical literature requires accurate and robust automatic computational methods to quickly select, from candidate sets of biomedical topic labels, the labels most relevant to specific biomedical documents. Such methods can facilitate hypothesis generation and knowledge discovery. Many multi-label biomedical text classification methods have been proposed in the last few years, such as DeepMeSH, MeSHProbeNet, and BERTMeSH. However, these methods encode the labels as one-hot vectors, which ignores the semantic relevance between the labels. Moreover, the loss functions they employ do not consider the unbalanced label distribution, resulting in overfitting to labels with high frequencies. To alleviate these problems and improve the performance of multi-label biomedical document classification, we propose a model that learns Label Independence And Relevance (LIAR) and integrates them efficiently. LIAR uses BioBERT to fully extract information from biomedical literature and generate textual representations, which it uses to construct a one-hot vector and to learn label embeddings, respectively. Meanwhile, we construct a new loss function that adaptively weights and integrates the one-hot vector distribution and label semantic similarity to compute the loss value and assigns weights (AWLoss) to labels of different frequencies, alleviating the shortcomings of the loss functions used by the above methods. LIAR outperformed the state-of-the-art method by more than 1% on all three benchmark datasets.
|
|
10:40-11:00, Paper We-PS2-T2.3 | Add to My Program |
Combining Deep Graph Convolutional Networks and PRSA to Enhance Protein–protein Interaction Site Prediction |
|
Li, Zhouhan | Wuhan University of Technology |
Peng, Jing | Wuhan University of Technology |
Keywords: Medical Informatics
Abstract: Protein-protein interaction (PPI) site prediction is a deep-level exploration of the mechanism of life activity, but relying solely on experimental methods to identify PPI sites is hugely costly. Among the developed computational methods, those using structural information are advantageous. For relative solvent accessibility (RSA), one kind of protein structural information, the absolute solvent accessibility values computed by the DSSP program (Kabsch and Sander, 1983) have primarily been used and then normalized by the previously determined maximum exposed area of the amino acid type. It is difficult to obtain suitable RSA when protein structure information cannot be obtained by homologous transfer, and thus the use of RSA is limited. We used the latest deep learning prediction tools to mine potentially valuable information from long-range interactions inside protein sequences and used it for protein RSA prediction. In a deep graph convolutional neural network, we incorporate the predicted relative solvent accessibility (PRSA) into the original structural information and then combine the sequence information and evolutionary information to form graph node features. We show that our proposed method significantly improves AUPRC and MCC, by over 9.5% and 21% respectively, compared to other sequence-based and structure-based methods. Furthermore, an analysis of the method demonstrates that the PRSA plays a crucial role in PPI site prediction.
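The normalisation step described in this abstract, dividing an absolute solvent accessibility by a per-residue-type maximum, is simple enough to sketch. The function name and the maxima in `EXAMPLE_MAX_ASA` are hypothetical placeholders chosen for illustration; they are not values from the paper or from DSSP, and a real pipeline would substitute a published reference table.

```python
def relative_solvent_accessibility(residues, asa_values, max_asa):
    """Normalise absolute accessibility (in square angstroms) by a per-residue maximum."""
    rsa = []
    for res, asa in zip(residues, asa_values):
        # Clip at 1.0: a measured ASA can exceed the reference maximum.
        rsa.append(min(asa / max_asa[res], 1.0))
    return rsa

# Illustrative placeholder maxima only, NOT taken from the paper or any reference table.
EXAMPLE_MAX_ASA = {"A": 120.0, "G": 100.0}
```

For instance, an alanine with 60 square angstroms exposed would get an RSA of 0.5 under these placeholder maxima. Clipping at 1.0 is a common convention but is itself an assumption here.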
|
|
11:00-11:20, Paper We-PS2-T2.4 | Add to My Program |
Predicting Hospitalization from Health Insurance Data |
|
Baro, Everton | Instituto Federal Do Paraná |
Oliveira, Luiz | UFPR |
Souza Britto Jr., Alceu | Pontifícia Universidade Católica Do Paraná |
Keywords: Medical Informatics
Abstract: Hospitalizations represent a significant share of total health costs; reducing the number of hospitalizations, when possible, can therefore generate both economic gains and enhanced quality of life for patients. Several works have been striving to use machine learning to create models for hospitalization prediction. Most of them require specialized knowledge in the health area, mainly in the stages of data preparation and feature selection. This feature engineering is not always perfect and may fail to select relevant features for the model training process. In this paper, to fill this gap, we explore three sources of information to extract features: medical specialty, event description, and the International Classification of Diseases. In addition, we introduce a dataset composed of 38,524 records of medical events from 34,930 patients. To assess and set a baseline for this new dataset, we use two well-known ensemble methods (Random Forest and Gradient Boosting). The best result, AUC = 0.82, was achieved by combining the models generated from the three feature sets tested, using gradient boosting. We believe that researchers will find this dataset a valuable tool in their work on hospitalization prediction. It will also make future benchmarking and evaluation possible.
|
|
11:20-11:40, Paper We-PS2-T2.5 | Add to My Program |
Contrast-Enhanced Automatic Cognitive Impairment Detection System with Pause-Encoder |
|
Lin, Sheng-Ya | National Taiwan University |
Chang, Ho-Ling | National Taiwan University |
Wai, Thiri | National Taiwan University |
Fu, Li-Chen | National Taiwan University |
Chang, Yu-Ling | National Taiwan University |
Keywords: Medical Informatics
Abstract: As the elderly population grows globally, healthcare systems face a growing burden from the rise in Alzheimer's patients and the increased demand for early diagnosis. Therefore, more people have started developing systems that help doctors diagnose Alzheimer's, such as cognitive impairment detection systems. This paper presents a contrast-enhanced automatic cognitive impairment screening system that incorporates a pause-encoder based on automatic transcription. We adapt a pre-trained automatic speech recognition model to generate transcripts of elderly speakers' speech. The pattern of pauses in speech is a commonly studied acoustic feature, since it provides the model with information beyond the semantic content. Back-translation with contrastive learning is used to further improve the encoding model. The model is also fine-tuned with the pause-encoded transcriptions to detect cognitive impairment. Our results show strong performance, with an accuracy of 81% in detecting Alzheimer's disease. The accuracy is also acceptable on the more challenging task of detecting mild cognitive impairment, the intermediate stage between healthy aging and Alzheimer's. In addition to its strong performance, our system is fully automatic and easy to use.
|
|
We-PS2-T3 Regular Session, NADIR |
Add to My Program |
Context Awareness in Connected Society & Systems, Human Machine Interfaces and Haptics |
|
|
Chair: Abel, Marie-Hélène | Sorbonne Universités, Université De Technologie De Compiègne, CNRS UMR 7253 Heudiasyc |
|
10:00-10:20, Paper We-PS2-T3.1 | Add to My Program |
Cooperative Behaviors of Connected Autonomous Vehicles and Pedestrians to Provide Safe and Efficient Traffic in Industrial Sites (I) |
|
Zhang, Meng | Université De Technologie De Belfort Montbéliard (UTBM) |
Keywords: Cooperative Systems and Control, Intelligent Transportation Systems, Distributed Intelligent Systems
Abstract: The technology of Connected and Autonomous Vehicles (CAV) is a hot topic in transportation systems, especially regarding platooning and the interaction with other road users. Considering traffic safety, many studies have been devoted to the exchange of information among various road users, such as CAVs and pedestrians. In a platooning scenario, when a pedestrian is detected by a CAV, the leader CAV uses its connectivity to share the information with its followers and provide a safe and courteous environment. However, the possibility of improving traffic efficiency while meeting the safety requirements has rarely been addressed in current research. Yet, in industrial areas, where automated vehicles and pedestrians frequently interact, combining safety and efficiency is crucial. The present paper addresses this challenge by first analyzing the intersection of CAVs and pedestrians in no-traffic-signal scenarios. The optimal state is proposed to reduce the time loss. Then, the paper uses a reinforcement learning based method to make CAVs arrive at the optimal state and thereby improve traffic efficiency. The experimental results based on virtual reality show that the proposed method increases traffic efficiency while ensuring traffic safety.
|
|
10:20-10:40, Paper We-PS2-T3.2 | Add to My Program |
Dynamic Context Awareness in Autonomous Navigation (I) |
|
Chefchaouni Moussaoui, Sélim | Université De Technologie De Compiègne (UTC) |
Pousseur, Hugo | Université De Technologie De Compiègne, CNRS, Heudiasyc (Heurist |
Victorino, Alessandro Correa | Sorbonne Universités - Université De Technologie De Compiègne - |
Abel, Marie-Hélène | Sorbonne Universités, Université De Technologie De Compiègne, CN |
Keywords: Intelligent Transportation Systems, System Modeling and Control
Abstract: Many studies have addressed the problem of autonomous vehicle navigation in different fields, but only a few of them use all the implicit information coming from the context in which the navigation occurs. This results in a large potential loss of information that prevents us from adapting the vehicle's behavior to each situation it may be in. In previous work, we defined a method to model the static context of navigation using ontologies and take it into account in the command law when performing a local navigation task. In this paper, we extend our model of the context of navigation and define a software architecture able to update the context dynamically using sensor information. The method is tested with real-time experiments on a driving simulator. They show that the Context of Navigation can be effectively updated during navigation and leads to smarter vehicle behavior on the road.
|
|
10:40-11:00, Paper We-PS2-T3.3 | Add to My Program |
Weighting Sliding Tiles for Writer Identification in Handwritten Musical Scores (I) |
|
Beltran Beltran, Lady Viviana | La Rochelle Université |
Coustaty, Mickael | University of La Rochelle |
Agresta, Rosalba | The Bibliothèque Nationale De France |
Doucet, Antoine | La Rochelle Université |
Keywords: System Architecture, Technology Assessment, Enterprise Information Systems
Abstract: In this paper, we propose an approach, along with an extended ablation study, to address the writer identification task in handwritten scores. To train machine learning methods on these types of images, traditional approaches tend to apply standard transformations, such as reshaping the image or randomly choosing a small image patch. This process can seriously degrade the information received by the model, which then risks learning non-discriminative or useless information. In a more realistic scenario, the databases held by organizations that preserve these types of documents contain images with non-standard formats, large dimensionality, and a high and diverse level of noise, such as fully white pages or non-music-related data. This is due to the digitization process followed by most organizations that preserve this data. To address these problems, we propose a two-stage sliding tile-based approach for writer identification. The first stage uses symbol detection for the sole purpose of identifying optimal regions containing musical information. The second stage uses this music content information to compute the final classification at the full-page level. We present an ablation study together with a new database, REMDM Autographs, containing musical scores from the music department of the National Library of France (BnF). This database contains rich images of original musical compositions from different writers, with the main difficulty being a high level of noise. We present the results of our explorations on two databases: the new corpus and the well-known public CVC-MUSCIMA database. Comparing the tile approach with the full-page approach, we see a performance improvement of more than 45% on both databases.
|
|
11:00-11:20, Paper We-PS2-T3.4 | Add to My Program |
Multi-Objective NSGA-II for Weight Tuning of a Nonlinear Model Predictive Controller in Autonomous Vehicles (I) |
|
Chalak Qazani, Mohamad Reza | Institute for Intelligent Systems Research and Innovation (IISRI |
Karkoub, Mansour | Texas A&M University at Qatar |
Asadi, Houshyar | Deakin University |
Lim, Chee Peng | Deakin University |
Liew, Alan Wee-Chung | Griffith University |
Nahavandi, Saeid | Deakin University |
Keywords: Modeling of Autonomous Systems, System Modeling and Control, Intelligent Transportation Systems
Abstract: The control system of an autonomous vehicle (AV) should generate motion signals that maximize motion comfort for the users. Nonlinear model predictive control (MPC) has recently been used in AVs to achieve this critical task. However, nonlinear MPC has many hyperparameters, including weights and MPC horizons, that should be tuned systematically for the system to reach high efficiency. Energy usage and motion comfort are directly related: generating high-fidelity motion cues for AV users leads to higher energy usage. Hence, a multi-objective optimisation technique is needed to tune the weights so as to balance energy usage and motion comfort for AV users. In this study, multi-objective NSGA-II is employed, for the first time, to tune the weights of a nonlinear MPC-based controller in AVs. The proposed method is designed and developed using MATLAB/SIMULINK software. The simulation results show minimum energy usage through the generation of smooth motion signals, delivering maximum comfort to AV users.
|
|
11:20-11:40, Paper We-PS2-T3.5 | Add to My Program |
Experimental Validation of a High-G Centrifuge System Using an Advanced Wireless Human Dummy (I) |
|
Mohajer, Navid | Deakin University |
Winter, Asher | Deakin University |
Gregory, Timothy Mark | Institute for Intelligent Systems Research and Innovation, Deaki |
Nahavandi, Darius | Deakin University |
Watson, Matthew | IISRI, Deakin University |
Nahavandi, Saeid | Deakin University |
Keywords: System Modeling and Control, Robotic Systems, Mechatronics
Abstract: High-G Centrifuge Systems (HCSs) are valuable tools for training aircrews and for research on aviation medicine. By providing a safe and controlled environment, they help protect aircrews and air assets. Despite their wide range of applications, developing a human-rated HCS is a costly and challenging engineering project. One of the most critical steps in the development of HCSs is experimental validation, which has not received enough attention in the published research. This study reports the evaluation and validation of an operational HCS located at the Institute for Intelligent Systems Research and Innovation (IISRI) at Deakin University, Australia. The system, which has a low-cost structure with an effective arm length of over 5 m, is capable of generating a maximum sustained acceleration of 9G with an onset rate of 5G/sec. The experimental validation of the system is carried out using a fully instrumented Advanced Wireless Human Dummy (AWHD). The results show that the system can reliably generate the reference centripetal acceleration, with the lowest error at the human spine location.
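The quoted figures can be related through the standard centripetal relation a = ω²r. The sketch below is a back-of-the-envelope calculation, not part of the paper: it assumes an arm length of exactly 5 m (the abstract says "over 5m"), treats the 9G as purely centripetal, and uses hypothetical function names.

```python
import math

G0 = 9.80665  # standard gravity, m/s^2

def angular_speed_for_g(g_level, arm_length_m):
    """Angular speed (rad/s) at which centripetal acceleration equals g_level * G0."""
    return math.sqrt(g_level * G0 / arm_length_m)

def onset_time(g_start, g_end, onset_rate_g_per_s):
    """Seconds needed to ramp between two G levels at a constant onset rate."""
    return (g_end - g_start) / onset_rate_g_per_s

omega = angular_speed_for_g(9.0, 5.0)  # the 5.0 m arm length is an assumption
rpm = omega * 60.0 / (2.0 * math.pi)   # same speed in revolutions per minute
```

Under these assumptions the arm turns at roughly 40 revolutions per minute at 9G, and a longer arm would need a lower speed for the same acceleration.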
|
|
We-PS2-T4 Regular Session, AQUARIUS |
Add to My Program |
Learning to Optimize in Intelligent Systems I |
|
|
Chair: Mandischer, Nils | RWTH Aachen University |
|
10:00-10:20, Paper We-PS2-T4.1 | Add to My Program |
Two-Stage Online Product Pricing Optimization Based on Consumer Decision Factors (I) |
|
Liu, Xuwang | Henan University |
Wang, Junjia | Henan University |
Qi, Wei | Henan University |
Guo, Xiwang | Liaoning Petrochemical University |
Wang, Jiacun | Monmouth University |
Tang, Ying | Rowan University |
Keywords: Optimization and Self-Organization Approaches, Computational Intelligence
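Abstract: In the platform economy, price, reviews, and sales volume are the three purchase decision factors that consumers care about most. However, different customers have different sensitivities to the same decision factors. It is therefore important to study consumers' sensitivity to reviews, price, and sales. Based on the multinomial logit (MNL) model, this paper constructs a two-stage pricing model for new products of platform enterprises and studies the impact of price, reviews, and sales on enterprise profit. It then analyzes the mechanism by which consumers' sensitivity to price changes, product cost, and consumers' evaluation of product quality affect product pricing and enterprise profit. A two-stage optimal pricing strategy for product sales is then formulated. The study shows that, when setting a pricing strategy, enterprises should not only consider consumers' sensitivity to reviews, price, and sales volume but also draw on past sales experience. The results can provide a theoretical basis for the product pricing and operations management of platform enterprises.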
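The multinomial logit (MNL) model that this abstract builds on assigns each option a choice probability proportional to the exponential of its utility. The sketch below shows only that textbook formula; the linear utility and its coefficients `b_price`, `b_review`, and `b_sales` are hypothetical stand-ins for the consumer sensitivities and are not taken from the paper.

```python
import math

def mnl_choice_probabilities(utilities):
    """Multinomial logit: P_i = exp(u_i) / sum_j exp(u_j)."""
    m = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def utility(price, review_score, sales, b_price=-1.0, b_review=0.5, b_sales=0.1):
    """Hypothetical linear utility; the coefficients stand for consumer sensitivities."""
    return b_price * price + b_review * review_score + b_sales * sales
```

With a negative price coefficient, lowering one product's price raises its choice probability at the expense of the others. This sketch omits a no-purchase option, which a full pricing model would normally include.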
|
|
10:20-10:40, Paper We-PS2-T4.2 | Add to My Program |
Weight-Specific-Decoder Attention Model to Solve Multiobjective Combinatorial Optimization Problems (I) |
|
Ye, Te | Sun Yat-Sen University |
Zhang, Zizhen | Sun Yat-Sen University |
Chen, Jinbiao | Sun Yat-Sen University |
Wang, Jiahai | Sun Yat-Sen University |
Keywords: Neural Networks and their Applications, Deep Learning, Application of Artificial Intelligence
Abstract: The multiobjective combinatorial optimization problems (MOCOPs) have a wide range of real-world applications. Designing an effective algorithm has an important and practical significance. Due to the huge search space and limited time, it is generally difficult to obtain the optimal solution of this kind of problem by traditional exact and heuristic algorithms. Recently, learning-based algorithms have achieved good results in solving MOCOPs, but the quality and diversity of found solutions can be further improved. In this paper, we propose a Weight-Specific-Decoder Attention Model (WSDAM) to better approximate the whole Pareto set. It embeds a weight-adaptive layer into the decoder to concentrate on the information of different weight vectors. During the model training, the weight vector is sampled from the Dirichlet distribution, which can further strengthen the learning of boundary solutions. We evaluate our method on two classic MOCOPs, i.e., the multiobjective traveling salesman problem (MOTSP) and multiobjective capacitated vehicle routing problem (MOCVRP). The experimental results show that our proposed method outperforms current state-of-the-art learning-based methods in both solution quality and generalization ability.
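The weight-vector sampling mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' WSDAM code: the concentration parameters (0.5, 0.5), the weighted-sum scalarization, and the example objective values are all assumptions. Concentration values below 1 place more probability mass near the edges of the simplex, which is one way sampled weights can emphasize boundary solutions.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one weight vector from Dirichlet(alphas) by normalizing Gamma draws."""
    draws = [rng.gammavariate(alpha, 1.0) for alpha in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def weighted_sum(objectives, weights):
    """Scalarize a vector of objective values (to be minimized) with one weight vector."""
    return sum(w * f for w, f in zip(weights, objectives))


rng = random.Random(0)                        # fixed seed for reproducibility
weights = sample_dirichlet([0.5, 0.5], rng)   # hypothetical concentration parameters
tour_costs = (12.0, 30.0)                     # hypothetical bi-objective tour costs
scalar_cost = weighted_sum(tour_costs, weights)
```

Each sampled weight vector defines one scalarized subproblem; sweeping many weight vectors traces an approximation of the Pareto front.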
|
|
10:40-11:00, Paper We-PS2-T4.3 | Add to My Program |
Data-Driven Suboptimal Control for Nonlinear Systems Using State-Dependent Riccati Equation (I) |
|
Zhu, Liao | Beijing Normal University at Zhuhai |
Xu, Jingsheng | Henan University of Urban Construction |
Guo, Ping | Beijing Normal University |
Keywords: Computational Intelligence, Heuristic Algorithms, Evolutionary Computation
Abstract: In this paper, the approximate optimal control design for continuous-time nonlinear systems with partially unknown dynamics is studied. Based on the state-dependent coefficient parameterization, the dynamics of nonlinear systems are represented in a linear-like form. In this case, the Hamilton-Jacobi-Bellman equation can be recast in the form of the state-dependent Riccati equation (SDRE). Based on integral reinforcement learning, an online policy iteration algorithm is developed to iteratively solve the SDRE using the information of the system state and the control input. In addition, it is proved that the proposed algorithm is equivalent to the traditional iterative solution of the SDRE. A suboptimal control policy can be attained under proper conditions. The stability of the closed-loop system can be guaranteed by the iterative feedback control policy. The effectiveness of the presented algorithm is validated by simulation results.
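For readers unfamiliar with the SDRE, its textbook form is sketched below. The symbols A(x), B(x), Q, R, and P(x) follow common usage and are assumptions here; the paper's own notation and problem setup may differ.

```latex
% State-dependent coefficient (SDC) parameterization of an input-affine system
\[
\dot{x} = f(x) + g(x)\,u, \qquad f(x) = A(x)\,x, \quad g(x) = B(x)
\]
% Quadratic cost with weighting matrices Q \succeq 0 and R \succ 0
\[
J = \int_0^{\infty} \left( x^{\top} Q\, x + u^{\top} R\, u \right) dt
\]
% State-dependent Riccati equation, solved pointwise for P(x)
\[
A^{\top}(x) P(x) + P(x) A(x) - P(x) B(x) R^{-1} B^{\top}(x) P(x) + Q = 0
\]
% Resulting suboptimal feedback law
\[
u(x) = -R^{-1} B^{\top}(x) P(x)\, x
\]
```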
|
|
11:00-11:20, Paper We-PS2-T4.4 | Add to My Program |
Cost-Minimized and Multi-Plant Scheduling in Distributed Industrial Systems (I) |
|
Yuan, Haitao | Beihang University |
Hu, Qinglong | Beihang University |
Bi, Jing | Beijing University of Technology |
Keywords: Cybernetics for Informatics, Intelligent Internet Systems, Evolutionary Computation
Abstract: As a new paradigm, the industrial Internet provides information sharing of various elements and resources in a whole industrial production process. It makes industrial production processes intelligent and provides low-cost and efficient scheduling. Manufacturing planning for multi-plant enterprises in the industrial Internet brings many challenges due to numerous optimization variables and limits on the manufacturing capacities of plants, production resources, etc. Current studies fail to jointly consider the cost of different products in multiple heterogeneous plants, and ignore machine-level scheduling of manufacturing tasks. This work designs an improved framework for multi-plant enterprises, based on which a constrained nonlinear integer program is formulated to reduce the total cost, including production and transportation costs. It jointly considers many complex nonlinear constraints, e.g., limits of replacement times, storage space, substitution, and pairing production. It investigates machine-level task scheduling where different machines have heterogeneous manufacturing capacities. To solve it, this work proposes an algorithm named Genetic Simulated annealing-based Particle Swarm Optimization (GSPSO). Realistic data-based experiments demonstrate that GSPSO reduces the cost of a multi-plant system by at least 23% compared with its typical peers.
|
|
11:20-11:40, Paper We-PS2-T4.5 | Add to My Program |
Moth-Flame Optimizer for Multi-Product Human-Robot Collaborative Parallel Disassembly Line Balancing Problem (I) |
|
Lu, Fayang | Liaoning Petrochemical University |
Liu, Peisheng | Liaoning Petrochemical University |
Guo, Xiwang | Liaoning Petrochemical University |
Wang, Jiacun | Monmouth University |
Qin, Shujin | Shangqiu Normal University |
Qi, Liang | Shandong University of Science and Technology |
Zhao, Jian | University of Science and Technology Liaoning |
Keywords: Computational Intelligence, Evolutionary Computation, Heuristic Algorithms
Abstract: With the rapid development and upgrade of electronics and related technologies, more and more discarded and end-of-life products are generated and must be properly handled and recycled. Disassembly lines are a key to their efficient recycling process. A parallel disassembly line offers high profit, low energy consumption, and high efficiency. In this paper, a linear programming model for optimal human-robot collaborative disassembly is established. The goal is to maximize disassembly profit. An improved Moth-Flame Optimizer (MFO) is proposed, and the crossover part of the algorithm is improved based on this problem's characteristics. Experiments with practical cases involving the disassembly of multiple products are used to test the model and algorithm. The results show that MFO has obvious advantages over a commonly used algorithm in solving parallel disassembly line balancing problems.
|
|
We-PS2-T5 Regular Session, TAURUS |
Add to My Program |
Novel Image Processing Methods and Applications |
|
|
Co-Chair: Dahmane, Mohamed | Computer Research Institute of Montreal |
|
10:00-10:20, Paper We-PS2-T5.1 | Add to My Program |
Enhancing Fresh Produce Yield Forecasting Using Vegetation Indices from Satellite Images |
|
Nasr, Islam Mohamed Mahmoud | University of Waterloo |
Nassar, Lobna | University of Waterloo |
Karray, Fakhreddine | University of Waterloo |
Keywords: Neural Networks and their Applications, Deep Learning, Application of Artificial Intelligence
Abstract: Developing a fresh produce yield forecasting service is essential for estimating fair prices to protect against overpriced agricultural commodities and minimize the bid-ask spread, which not only benefits retailers and customers but also protects farmers. Forecasting the fresh produce yield is achieved using state-of-the-art deep learning (DL) models. Those models are trained and built using data retrieved from the Santa Barbara region in California using an ensemble of Attention Deep Feedforward Neural Network with Gated Recurrent Units (GRU) and Deep Feedforward Neural Network with embedded GRU units. The ensemble takes as input the soil moisture and temperature parameters as well as vegetation indices (VIs) calculated from images retrieved from multiple satellites. The effect of adding the VIs as input parameters on the forecasting performance of the deep learning model is assessed and the most effective VIs are selected. In addition, interpolation techniques are used to estimate the VIs that are missing due to the low frequency at which the satellites capture images. A comparative analysis is conducted to choose the most effective technique, which is found to be Cubic Spline interpolation. One VI, the Normalized Difference Vegetation Index (NDVI), proves to be the most effective index in forecasting the yield. Based on the aggregated error measure (AGM) score, the yield forecasting performance of the DL ensemble is enhanced by 12.51% after adding the complete interpolated NDVI to the input parameters used in training the model.
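For context, NDVI is conventionally computed per pixel from near-infrared (NIR) and red reflectance as (NIR - Red) / (NIR + Red). The sketch below shows only that standard definition; the function name, the zero-denominator convention, and the sample reflectance values are illustrative assumptions and do not come from the paper.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index for one pixel.

    Returns 0.0 when both bands are zero (an assumed convention) to avoid
    dividing by zero.
    """
    denominator = nir + red
    if denominator == 0:
        return 0.0
    return (nir - red) / denominator


# Hypothetical reflectance pairs (NIR, red) for three pixels.
pixels = [(0.5, 0.1), (0.3, 0.3), (0.0, 0.0)]
ndvi_values = [ndvi(nir, red) for nir, red in pixels]
```

Values near 1 typically indicate dense green vegetation, while values near 0 or below indicate bare soil, water, or cloud.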
|
|
10:20-10:40, Paper We-PS2-T5.2 | Add to My Program |
Pattern Spotting and Image Retrieval in Historical Documents Using Deep Hashing |
|
da Silva Dias, Caio | Pontifícia Universidade Católica Do Paraná |
Souza Britto Jr., Alceu | Pontifícia Universidade Católica Do Paraná |
Barddal, Jean Paul | Pontifícia Universidade Católica Do Paraná |
Heutte, Laurent | University of Rouen Normandy |
Lameiras Koerich, Alessandro | Ecole De Technologie Superieure (ETS) |
Keywords: Deep Learning, Image Processing and Pattern Recognition, Application of Artificial Intelligence
Abstract: This paper presents a deep learning approach for image retrieval and pattern spotting in digital collections of historical documents. First, a region proposal algorithm detects object candidates in the document page images. Next, deep learning models are used for feature extraction, considering two distinct variants, which provide either real-valued or binary code representations. Finally, candidate images are ranked by computing the feature similarity with a given input query. A robust experimental protocol evaluates the proposed approach considering each representation scheme (real-valued and binary code) on the DocExplore image database. The experimental results show that the proposed deep models compare favorably to the state-of-the-art image retrieval approaches for images of historical documents, outperforming other deep models by 2.56 percentage points using the same techniques for pattern spotting. Besides, the proposed approach also reduces the search time up to 200x, and the storage cost up to 6,000x when compared to related works based on real-valued representations.
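The speed and storage gains of binary codes come from comparing compact bit strings instead of real-valued vectors. A minimal sketch of ranking candidates by Hamming distance follows; the 8-bit codes and region names are made up for illustration, and the paper's actual code length and ranking pipeline are not specified here.

```python
def hamming_distance(code_a: int, code_b: int) -> int:
    """Count the bit positions at which two binary codes differ."""
    return bin(code_a ^ code_b).count("1")


def rank_by_hamming(query_code, candidates):
    """Sort (name, code) pairs from most to least similar to the query."""
    return sorted(candidates, key=lambda item: hamming_distance(query_code, item[1]))


# Hypothetical 8-bit hash codes for three candidate regions of a page image.
candidates = [
    ("region_a", 0b10110010),
    ("region_b", 0b10110110),
    ("region_c", 0b01001101),
]
query = 0b10110011
ranking = rank_by_hamming(query, candidates)
```

Because each comparison is an XOR followed by a bit count, the cost per candidate is small and each code occupies only a few bytes.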
|
|
10:40-11:00, Paper We-PS2-T5.3 | Add to My Program |
On Attacking Deep Image Quality Evaluator Via Spatial Transform |
|
Ning, Lu | Ningbo University |
Li, Dong | Ningbo University |
Diqun, Yan | Ningbo University |
Xianliang, Jiang | Ningbo University |
Keywords: Multimedia Computation
Abstract: Adversarial examples fool neural networks by adding slight perturbations to the original image, which hinders the usability of deep models. Most prior work has focused on adversarial attacks on the classification task. In this work, we attempt to develop an adversarial example generation method for attacking neural-based image quality assessment (IQA). Specifically, instead of employing conventional additive adversarial noise generation methods, we propose an image content deformation approach, avoiding the loss of adversarial noise after compression. The deformation component is designed as neural layers. The given image is first deformed and then undergoes compression; an existing IQA model evaluates the compressed image. The deformation layers are trained by back-propagating the differences between the targeted IQA score and the originally evaluated one. Experimental results demonstrate that the proposed method can produce compression-resistant adversarial images for image quality evaluators. The generated adversarial examples can effectively attack neural-based image quality evaluators with less distortion. The data and code of this work are available at https://github.com/luning409/Attack IQA.
|
|
11:00-11:20, Paper We-PS2-T5.4 | Add to My Program |
Joint Water-Filling Algorithm with Adaptive Chroma Adjustment for Shadow Removal from Text Document Images |
|
Wang, Ze | Northwestern Polytechnical University |
Wang, Bingshu | Northwestern Polytechnical University, Taicang Campus |
Zheng, Jiangbin | Northwestern Polytechnical University |
Chen, C. L. Philip | University of Macau |
Keywords: Image Processing and Pattern Recognition, Machine Vision, Application of Artificial Intelligence
Abstract: With the growing use and popularity of smart portable devices such as smartphones and tablets, people are more willing to use these devices to scan and save digitized documents. However, when capturing document images, shadows are inevitable and reduce clarity and readability. Removing the shadows from document images is therefore an important and meaningful task. In this paper, we propose a water-filling method using chroma adjustment for shadow removal. First, a joint global and local water-filling approach is designed to estimate the shading map. Then, we design an adaptive global brightness adjustment strategy to optimize the global luminance of the output image. Since adjusting only the brightness can cause color distortion in output images, we propose an adaptive chroma adjustment strategy to ensure color consistency across all areas of output images. A series of experiments show that our method can remove shadows from digitized documents, outperforming some state-of-the-art methods. Moreover, the proposed method can keep the brightness and color as consistent as possible with the non-shadow area.
|
|
11:20-11:40, Paper We-PS2-T5.5 | Add to My Program |
Infinitesimal Confidence Residuals-Based Image Authentication |
|
Dahmane, Mohamed | Computer Research Institute of Montreal |
Keywords: Image Processing and Pattern Recognition, Machine Vision, Media Computing
Abstract: In this work, we propose an image authentication method based on image matting. We demonstrate that an infinitesimal confidence interval computed from an α-matte of a reference image is able to expose manipulated images at the pixel level. The critical bounds are infinitesimally conditional on the spatial and intensity image affinity that is determined by a constrained soft-matting. A unique reconstruction parameter is embedded in the image rather than a watermark, which is known to cause image distortions. The experiments show a higher sensitivity of the reconstructed image residuals with respect to the bounds of the infinitesimal confidence interval. The test images have shown a very high visual quality, achieving a distinctive peak signal-to-noise ratio. Also, the approach implements a visual assessment for image authentication.
|
|
We-PS2-T6 Regular Session, LEO |
Add to My Program |
Soft Computing Methods in Real-World Application |
|
|
Chair: Leon-Garza, Hugo | University of Essex |
Co-Chair: Sovatzidi, Georgia | University of Thessaly |
|
10:00-10:20, Paper We-PS2-T6.1 | Add to My Program |
A Hand-Gesture Recognition Based Interpretable Type-2 Fuzzy Rule-Based System for Extended Reality (I) |
|
Leon-Garza, Hugo | University of Essex |
Hagras, Hani | University of Essex |
Pena-Rios, Anasol | BT Labs |
Bahceci, Ozkan | BT Labs |
Conway, Anthony | BT Labs |
Keywords: Fuzzy Systems and their applications
Abstract: In recent years, technologies such as Augmented Reality (AR) and Virtual Reality (VR) have become more popular and available to broader audiences, which has led to the research and development of a myriad of extended applications. A modality for end-users to interact with these applications is through hand gestures, hence the importance of detecting the different gestures in real-time. This paper presents an interval type-2 Fuzzy Rule-based System (FRBS) optimised by the Big Bang-Big Crunch (BB-BC) algorithm that uses the fingers’ position from the hand-tracking technology in extended reality (XR) headsets (namely HoloLens 2 and Oculus Quest 2) to classify the user’s hand gestures. This approach achieved an accuracy of 96.4%, and it is an interpretable model that can be understood and adjusted by end-users. The interval type-2 FRBS was tested against a type-1 FRBS and a k-nearest neighbours (KNN) model. It outperformed the type-1 FRBS and was close to the 98.9% accuracy performance of the KNN model, making our suggested approach a competitive alternative to opaque models.
|
|
10:20-10:40, Paper We-PS2-T6.2 | Add to My Program |
A Machine Learning Based Approach to Detect Fault Injection Attacks in IoT Software Systems |
|
Gangolli, Aakash Anil | Ontario Tech University |
Mahmoud, Qusay | Ontario Tech University |
Azim, Akramul | University of Ontario Institute of Technology (UOIT) |
Keywords: Machine Learning, Application of Artificial Intelligence
Abstract: With the rapid growth of Internet of Things (IoT) applications, the security of these systems has become critical. Fault injection attacks are a type of physical attack on the hardware components of an IoT system. These attacks cause the IoT system software to behave abnormally, which the adversaries exploit. Typically, these attacks have been detected through the use of a separate hardware detection mechanism, which is expensive and itself vulnerable to attack. The purpose of this paper is to propose a machine learning based approach to detect the attacks by monitoring specific run-time software parameters in the live environment of an IoT system. The proposed approach generates a labelled dataset by injecting instruction-level faults into the software executable, which is then used to train a machine learning model that can predict whether the IoT software system is currently being affected by a fault injection attack. Using a software fault injection tool to create a labelled dataset enables the use of supervised machine learning techniques, which produce more accurate prediction results than unsupervised techniques. The machine learning model can be used in the live environment of an IoT software system to monitor specific run-time software properties in order to detect the effects of a fault injection attack on the software. Additionally, the model classifies the type of fault introduced into the software as a result of the attack, which can be used to determine the necessary corrective action.
|
|
10:40-11:00, Paper We-PS2-T6.3 | Add to My Program |
Solving Vehicle Routing Problem with Drones Based on a Bi-Level Heuristic Approach (I) |
|
Yang, Jian | Southern University of Science and Technology |
Yang, Haobin | Southern University of Science and Technology |
He, Zh | Southern University of Science and Technology |
Zhao, Qi | Southern University of Science and Technology |
Shi, Yuhui | Southern University of Science and Technology |
Keywords: Swarm Intelligence, Heuristic Algorithms, Computational Intelligence
Abstract: Unmanned Aerial Vehicles (UAVs), or drones, have the potential to be applied to delivery services, which are expected to bring economic benefits. One of the key issues is planning routes for vehicles and drones with specific constraints and objectives, known as Vehicle Routing Problem with Drones (VRPD). This paper considers a scenario involving multiple trucks, multiple UAV stations, and UAVs within each station to serve the customers. A bi-level approach that combines the Brain Storm Optimization algorithm and Adaptive Large Neighborhood Search is proposed by designing the solution representation, new solution generation mechanism, and other operations. The experimental results show that the proposed method has the ability to solve the problem and deserves further development.
|
|
11:00-11:20, Paper We-PS2-T6.4 | Add to My Program |
Towards the Interpretation of Convolutional Neural Networks for Image Classification Using Fuzzy Sets |
|
Vasilakakis, Michael | University of Thessaly |
Sovatzidi, Georgia | University of Thessaly |
Dimas, George | University of Thessaly |
Iakovidis, Dimitris | University of Thessaly |
Keywords: Fuzzy Systems and their applications, Machine Learning, Application of Artificial Intelligence
Abstract: Convolutional Neural Networks (CNNs) have demonstrated an outstanding performance on a range of image classification problems in various domains. However, their major drawback is that they are “black box” and opaque classifiers. Taking into consideration the increasing demand for interpretable classification models, this paper introduces a novel meta-feature extraction scheme. This scheme is based on fuzzy sets, and it can be applied on the feature maps of a CNN. Initially, representative image prototypes are selected based on their deep feature map representation. Then, it constructs information granules from the feature maps, describing the content of each image class. It uses fuzzy sets to linguistically characterize the similarity between the deep feature maps of an image and the deep feature maps of the image prototypes. Thus, a classification outcome can be interpreted based on the features characterizing the different image classes involved in a classification problem. The experimental evaluation of the proposed scheme is performed on five publicly available image datasets. The results indicate that the proposed scheme outperforms other state-of-the-art classifiers, while providing an understandable interpretation of the classification result.
|
|
11:20-11:40, Paper We-PS2-T6.5 | Add to My Program |
CNAS: Constrained Neural Architecture Search |
|
Gambella, Matteo | Politecnico Di Milano |
Falcetta, Alessandro | Politecnico Di Milano |
Roveri, Manuel | Politecnico Di Milano |
Keywords: Deep Learning, Neural Networks and their Applications, Cloud, IoT, and Robotics Integration
Abstract: Neural Architecture Search (NAS) paves the way for the automatic definition of neural network architectures. The research interest in this field is steadily growing, with several solutions available in the literature. This study introduces, for the first time in the literature, a NAS solution, called Constrained NAS (CNAS), able to take into account constraints on the search of the designed neural architecture. Specifically, CNAS is able to consider both functional constraints (i.e., the type of operations that can be carried out in the neural network) and technological constraints (i.e., constraints on the computational and memory demand of the designed neural network). CNAS has been successfully applied to Tiny Machine Learning and Privacy-Preserving Deep Learning with Homomorphic Encryption, two relevant and challenging application scenarios in which functional and technological constraints matter in the neural network search.
|
|
We-PS2-T7 Regular Session, VIRGO |
Add to My Program |
Novel Detection Methods and Applications |
|
|
Chair: Muhuri, Pranab K. | South Asian University |
Co-Chair: Wu, Yujin | University of Lille |
|
10:00-10:20, Paper We-PS2-T7.1 | Add to My Program |
BNeSiFC: The Boosted NeSiFC Algorithm for Fast Fuzzy Community Detection Based on Neighbors' Similarity |
|
Roy, Uttam K. | South Asian University |
Muhuri, Pranab K. | South Asian University |
Biswas, Sajib K. | South Asian University |
Keywords: Complex Network, Fuzzy Systems and their applications, Cybernetics for Informatics
Abstract: This paper reports a novel fuzzy community detection (FCD) algorithm, which we term ‘Boosted NeSiFC (bNeSiFC)’, based on an improvement of the recently proposed NeSiFC approach. Similar to the basic NeSiFC approach, the proposed bNeSiFC also computes the similarity between two neighbors using the modified local random walk (mLRW). In the proposed bNeSiFC, a new similarity metric termed EDS is introduced to compute the pair-similarity for constructing the transition probability matrix of mLRW. By incorporating the newly proposed metric EDS, the boosted NeSiFC finds the most similar neighbors faster than the basic NeSiFC. Also, we introduce a novel fuzzy membership degree computation method for the proposed bNeSiFC, which is clearer and easier to interpret than the one used for the basic NeSiFC. Comparative analysis of the experimental results with eight different real-life datasets establishes the superiority of the bNeSiFC over the NeSiFC and other existing approaches.
|
|
10:20-10:40, Paper We-PS2-T7.2 | Add to My Program |
RDDP: Reliable Detection and Description of Interest Points |
|
Gao, Yuning | Zhengzhou University |
Qi, Lin | Zhengzhou University |
Tie, Yun | Zhengzhou University |
Keywords: Image Processing and Pattern Recognition, Deep Learning, Machine Vision
Abstract: Recent works have paid much attention to learning repeatable maps for detection and description of interest points. However, in practice there are always some image components that are repeatable but not discriminative. Training on whole images that contain these components leads to poor matching performance. Thus, we propose a self-supervised network with a new branch that can eliminate the negative influence of unreliable image components. We also design a loss function that includes a term calculated with Average Precision (AP) to make full use of the reliability branch. Evaluations on the HPatches dataset show that the proposed method achieves competitive results compared with state-of-the-art methods.
|
|
10:40-11:00, Paper We-PS2-T7.3 | Add to My Program |
Using ALBERT and Multi-Modal Circulant Fusion for Fake News Detection |
|
Wang, Xingang | Qilu University of Technology (Shandong Academy of Sciences) |
Li, Xiaomin | Qilu University of Technology (Shandong Academy of Sciences) |
Liu, Xiaoyu | Qilu University of Technology (Shandong Academy of Sciences) |
Cheng, Honglu | Qilu University of Technology |
Keywords: Application of Artificial Intelligence, Deep Learning, Machine Learning
Abstract: Fake news that combines text and images has a better story-telling ability than text-only fake news, making it more deceptive and easier to spread maliciously. Therefore, multi-modal fake news detection has become a new hot topic. There are two main challenges in this task. First, the traditional pre-training model BERT has more parameters and a relatively slow training speed, which limits the level of extracted text features. Second, in the multi-modal form, the fusion process is only a simple splicing of visual and textual features of news, and the obtained multi-modal features are insufficient to express the complementarity between multi-modal data and may have redundant information, potentially leading to biased detection results. To address these issues, we propose ALB-MCF, which uses ALBERT and Multi-modal Circulant Fusion (MCF) for fake news detection. ALB-MCF consists of four main modules: a multi-modal feature extractor, a multi-modal feature fusion module, a fake news detector, and a domain classifier. Specifically, the multi-modal feature extractor uses a pre-trained ALBERT model to extract text features and a pre-trained VGG-19 model to extract visual features. Then, the text features and the visual features are fused into a multi-modal feature representation by MCF, which improves the fusion capability while avoiding an increase in parameters and computational cost. Finally, the multi-modal features are fed into the detector to detect fake news. The role of the domain classifier is mainly to map multi-modal features of different events to the same feature space. We have conducted extensive experiments on two real-world datasets. The results demonstrate that our model can handle multi-modal data more effectively, thus improving the accuracy of fake news detection.
|
|
11:00-11:20, Paper We-PS2-T7.4 | Add to My Program |
Research on Multi-Level Image-Text Fusion Method for Rumor Detection |
|
Su, Mengli | Qilu University of Technology |
Sun, Tao | Qilu University of Technology (Shandong Academy of Scienc |
Quan, Zhibang | Qilu University of Technology (Shandong Academy of Sciences) |
Wei, Jishu | Qilu University of Technology |
Yin, Xy | Qilu University of Technology (Shandong |
Zhongshenjie, Zhongshenjie | Qilu University of Technology |
Keywords: Deep Learning, Multimedia Computation, Image Processing and Pattern Recognition
Abstract: In recent years, public opinion events such as "fake news" and "news reversal" have occurred frequently, and spreading rumors through images has become a new form of rumor circulation in the digital age. Most existing methods consider only the text content, ignoring the role of the information in the accompanying images; when fusing multiple modalities, they do not fully exploit the available information, and the image and text information do not fully interact. Therefore, we propose a multi-level image-text fusion method (MLFRD), which can effectively obtain local and global information about events, improve the connection between text and images, and improve the performance of rumor detection. MLFRD consists of three parts: a multimodal feature extractor that extracts textual and visual features from posts, a multilevel feature fusion network that fuses the extracted features, and a rumor detector that performs rumor discrimination. We conduct extensive experiments on two real datasets; the results show that MLFRD better fuses features across modalities for rumor detection and outperforms state-of-the-art methods.
|
|
11:20-11:40, Paper We-PS2-T7.5 | Add to My Program |
Fusion of Physiological and Behavioural Signals on SPD Manifolds with Application to Stress and Pain Detection |
|
Wu, Yujin | University of Lille |
Daoudi, Mohamed | IMT Nord Europe |
Amad, Ali | Lille University |
Sparrow, Laurent | University De Lille |
D'Hondt, Fabien | Université De Lille |
Keywords: Application of Artificial Intelligence, Deep Learning, Multimedia Computation
Abstract: Existing multimodal stress/pain recognition approaches generally extract features from different modalities independently and thus ignore cross-modality correlations. This paper proposes a novel geometric framework for multimodal stress/pain detection utilizing Symmetric Positive Definite (SPD) matrices as a representation that incorporates the correlation relationship of physiological and behavioural signals from covariance and cross-covariance. Because SPD matrices lie on a non-linear Riemannian manifold, well-known machine learning techniques are not suited to classifying them directly. Therefore, a tangent space mapping method is adopted to map the derived SPD matrix sequences to vector sequences in the tangent space, where an LSTM-based network can be applied for classification. The proposed framework has been evaluated on two public multimodal datasets, achieving state-of-the-art results on both the stress and pain detection tasks.
|
|
We-PS2-T8 Regular Session, QUADRANT |
Add to My Program |
System Modeling and Analysis Methods |
|
|
Co-Chair: Kuroe, Yasuaki | Doshisha University |
|
10:00-10:20, Paper We-PS2-T8.1 | Add to My Program |
TiBERT: Tibetan Pre-Trained Language Model |
|
Liu, Sisi | Minzu University of China |
Deng, Junjie | Minzu University of China |
Sun, Yuan | Minzu University of China; Minority Languages Branch, National La |
Zhao, Xiaobing | Minzu University of China; Minority Languages Branch, National La |
Keywords: Deep Learning, Machine Learning
Abstract: Pre-trained language models are trained on large-scale unlabeled text and can achieve state-of-the-art results in many different downstream tasks. However, current pre-trained language models mainly focus on Chinese and English. For low-resource languages such as Tibetan, there is a lack of monolingual pre-trained models. To promote the development of Tibetan natural language processing tasks, this paper collects large-scale training data from Tibetan websites and constructs a vocabulary that covers 99.95% of the words in the corpus by using Sentencepiece. Then, we train a Tibetan monolingual pre-trained language model named TiBERT on the data and vocabulary. Finally, we apply TiBERT to the downstream tasks of text classification and question generation, and compare it with classic models and multilingual pre-trained models. The experimental results show that TiBERT achieves the best performance. Our model is available at http://tibert.cmli-nlp.com/.
|
|
10:20-10:40, Paper We-PS2-T8.2 | Add to My Program |
Portfolio Selection for SAT Instances |
|
Sadreddin, Armin | University of Regina |
Mouhoub, Malek | University of Regina |
Sadaoui, Samira | University of Regina |
Keywords: Deep Learning, Neural Networks and their Applications, Heuristic Algorithms
Abstract: SAT problems are fundamental in representing and solving combinatorial applications. Over the past years, many sophisticated SAT solvers have been proposed. Due to the topic's relevance, a SAT competition is scheduled yearly to promote solving hard SAT instances. There is no unique solver to tackle all SAT problems efficiently. Indeed, some solvers work best for some SAT instances but perform poorly for others. This limitation has been addressed, in the literature, by identifying a pool of solvers that complement each other for efficiently tackling a given set of SAT instances. This pool of solvers is called a portfolio. Several studies have been conducted to find the optimal portfolio maximizing the number of solved SAT instances, minimizing the overall running time, or a trade-off between both. In this context, we present a new approach that first finds the suitable portfolio meeting each of these objectives. Then, the approach predicts the best solver for any new SAT instance. Our approach is based on Greedy search techniques, clustering, and deep learning. More precisely, we investigate two different scenarios. In the first one, our goal is to find the best portfolio capable of solving the largest number of instances within a given time limit. Both the Greedy-based method and clustering are used in this case. The second scenario aims to find the optimal portfolio to minimize the penalized average running time. The latter objective captures a good trade-off between the objective in the first scenario and the overall average running time. In addition to Greedy search and clustering, we consider a variant of the Beam-search technique to address this scenario. To assess the performance of our approach regarding the two scenarios, we conduct multiple experiments on the SAT2021 competition datasets that include SAT instances together with participants' solvers' results for each instance. The outcomes from the conducted experiments are encouraging and promising.
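As an illustration of the first scenario in this abstract (maximizing the number of instances solved within a time limit), a greedy selection can be sketched as follows. This is a generic greedy coverage heuristic under assumed inputs, not the authors' implementation; the name `greedy_portfolio` and the input layout are hypothetical.

```python
def greedy_portfolio(solved_by, k):
    """Pick up to k solvers greedily.

    solved_by: dict mapping a solver name to the set of instance ids it
               solves within the time limit (assumed input format).
    Returns the chosen solvers and the set of instances they cover.
    """
    portfolio, covered = [], set()
    for _ in range(k):
        candidates = [s for s in solved_by if s not in portfolio]
        # Choose the solver that adds the most not-yet-covered instances.
        best = max(candidates, key=lambda s: len(solved_by[s] - covered),
                   default=None)
        if best is None or not (solved_by[best] - covered):
            break  # no remaining solver improves coverage
        portfolio.append(best)
        covered |= solved_by[best]
    return portfolio, covered
```

With `{"A": {1, 2, 3}, "B": {3, 4}, "C": {1, 2}}` and `k=2`, the sketch picks A first and then B, because C adds nothing once A is in the portfolio.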
|
|
10:40-11:00, Paper We-PS2-T8.3 | Add to My Program |
CBPGM: A Cache Based Piecewise Geometric Model Index |
|
Xu, Xiaopei | East China Normal University |
Cao, Guitao | East China Normal University |
Li, Yan | East China Normal University |
Keywords: Application of Artificial Intelligence, Cloud, IoT, and Robotics Integration, Machine Learning
Abstract: Recent works on learned indexes have changed the way we look at the decades-old field of database management system indexing. However, they suffer from too many hyperparameters, long model construction times, and a failure to take full advantage of the CPU cache and hardware acceleration. In this paper, we propose a Cache Based Piecewise Geometric Model (CBPGM) index that addresses these issues with only one hyperparameter. It combines a sampling approach, which reduces the training dataset size and accelerates construction, with alignment of models and data to the CPU cache line, which improves search performance. Experimental results show that the CBPGM index improves construction speed by up to 8x and query speed by 30% compared with the PGM index.
|
|
11:00-11:20, Paper We-PS2-T8.4 | Add to My Program |
Throughput-Efficient Communication Device Driver for IoT Gateways |
|
Niu, Yannian | East China Normal University |
Zhu, Minghua | East China Normal University |
Keywords: Cloud, IoT, and Robotics Integration
Abstract: With the rise of IoT and edge computing, terminal data generated by end devices is also growing explosively. The limited data transmission links between edge data centers and the end devices are becoming a performance bottleneck of IoT edge-based architectures. This paper proposes Assembler, a module integrated into network drivers of the gateway to realize high-throughput forwarding in small-packet-intensive scenarios. Motivated by observations that transmitting large packets obtains far more data throughput than transmitting small ones, Assembler assembles small packets received by the driver into large ones before the driver transmits them. Meanwhile, an adaptive assembling size algorithm is introduced to balance data throughput and packet throughput in scenarios with abrupt changes in data traffic. Our evaluation shows that the network driver incorporated with Assembler can achieve 1.75x the data throughput in small-packet-intensive scenarios compared to the state-of-the-art work. Furthermore, the adaptive assembling size algorithm can help the driver adapt to scenarios with changing data traffic and keep steady throughput.
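The idea of assembling small packets into larger ones can be sketched in user-space Python. This is only an illustration with a fixed assembling size: the framing (a 2-byte length prefix per packet) and the names `assemble` and `disassemble` are assumptions, and the adaptive sizing algorithm from the abstract is not modeled.

```python
import struct

def assemble(packets, max_size):
    """Concatenate small packets into frames of at most max_size bytes.
    Each packet is stored as a 2-byte big-endian length followed by payload."""
    frames, current = [], b""
    for p in packets:
        record = struct.pack("!H", len(p)) + p
        # Flush the current frame if this record would overflow it.
        if current and len(current) + len(record) > max_size:
            frames.append(current)
            current = b""
        current += record
    if current:
        frames.append(current)
    return frames

def disassemble(frame):
    """Recover the original packets from one assembled frame."""
    packets, offset = [], 0
    while offset < len(frame):
        (n,) = struct.unpack_from("!H", frame, offset)
        offset += 2
        packets.append(frame[offset:offset + n])
        offset += n
    return packets
```

A record larger than `max_size` is still emitted as its own frame in this sketch; a real driver would have to respect the link MTU.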
|
|
11:20-11:40, Paper We-PS2-T8.5 | Add to My Program |
Analysis Method of Period Sensitivities for Rhythm Phenomena |
|
Kuroe, Yasuaki | Doshisha University |
Mori, Yoshihiro | Kyoto Institute of Technology |
Keywords: Biometric Systems and Bioinformatics, Computational Intelligence
Abstract: Sensitivity analysis is fundamental to the analysis and design of any system. This paper discusses a method for the sensitivity analysis of rhythm phenomena, which are found in various systems such as physical systems, biological systems, and human societies. Sensitivity analysis of rhythm phenomena is very difficult because rhythms appear autonomously as periodic phenomena in nonlinear systems, and only a few studies have been done. We deal with the periods and propose a method for analyzing the sensitivities of periods of periodic phenomena. We first derive a strict expression for period sensitivities by introducing the Poincaré map. Based on this expression, we derive an efficient computer algorithm to calculate period sensitivities. It is shown that the proposed analysis method makes it possible to obtain period sensitivities not only of stable periodic orbits but also of unstable periodic orbits embedded in chaotic attractors.
|
|
We-PS2-T9 Regular Session, KEPLER |
System Modeling and Control V |
|
|
Chair: Neuvonen, Markus | University of Oulu |
Co-Chair: Christen, Patrik | FHNW |
|
10:00-10:20, Paper We-PS2-T9.1 | Add to My Program |
Heat Exchanger Fouling Estimation for Combustion–Thermal Power Plants Including Load Level Dynamics |
|
Neuvonen, Markus | University of Oulu |
Selek, Istvan | University of Oulu |
Ikonen, Enso | University of Oulu |
Aho, Lauri | University of Oulu |
Keywords: System Modeling and Control, Fault Monitoring and Diagnosis, Decision Support Systems
Abstract: This paper presents a robust soft sensor for estimating heat exchanger fouling in the context of combustion–thermal power plants. The approach is data-driven and focuses on identifying the effect of plant load changes on fouling estimation. The proposed method is applied to real process measurements, and results are presented. The method consists of two blocks: a static energy balance calculation block for “traditional” fouling indicator calculation, and a dynamic subspace identification block for finding the sootblowing and load level dynamics components of the static fouling indicator signal. Results from applying the proposed method to real plant data show that load level dynamics can be decoupled from the fouling estimate.
|
|
10:20-10:40, Paper We-PS2-T9.2 | Add to My Program |
Exploring Decision Patterns for Supporting DoDAF Based Architecture Design |
|
Fang, Zhemei | Huazhong University of Science and Technology |
Jin, Wenjing | CyberInsight Technology Co. Ltd |
Keywords: Large-Scale System of Systems, System Architecture
Abstract: System-of-systems architecture is an important factor leading to successful system integration and capability delivery. However, current architecture model development based on architecture frameworks relies heavily on expert experience. The lack of quantitative decision support not only increases the human workload but also misses the chance to build better models. Thus, this paper proposes to identify and use decision patterns during the process of developing architecture description models. Five decision patterns built upon the meta-model and associated decision points are identified: the downselecting, partitioning, connecting, permuting, and assigning patterns. Each provides a mathematical framework to shape the important decision-making elements. A mixed use of decision patterns for an air and missile defense SoS architecture design is briefly illustrated at the end. As a preliminary study, this paper demonstrates the potential of developing decision patterns to support and ease architecture model development.
|
|
10:40-11:00, Paper We-PS2-T9.3 | Add to My Program |
Programming Data Structures for Large-Scale Desktop Simulations of Complex Systems |
|
Christen, Patrik | FHNW |
Keywords: Large-Scale System of Systems, System Modeling and Control, Discrete Event Systems
Abstract: The investigation of complex systems requires running large-scale simulations over many temporal iterations. It is therefore important to provide efficient implementations. The present study borrows philosophical concepts from Gilbert Simondon to identify data structures and algorithms that have the biggest impact on running time and memory usage. These are the entity e-tuple E and the intertwined update function phi. Focusing on implementing data structures in C#, E is implemented as a list of objects according to current software engineering practice and as an array of pointers according to theoretical considerations. Cellular automaton simulations with 10^9 entities over one iteration reveal that the object-list with dynamic typing and multi-state readiness has a drastic effect on running time and memory usage, especially dynamic typing as it has a big impact on the evolution time. Pointer-arrays are possible to implement in C# and are more running time and memory efficient as compared to the object-list implementation, however, they are cumbersome to implement. In conclusion, avoiding dynamic typing in object-list based implementations or using pointer-arrays gives evolution times that are acceptable in practice, even on desktop computers.
|
|
11:00-11:20, Paper We-PS2-T9.4 | Add to My Program |
Identification and Control of Linear Systems with Piece-Wise Constant Parameters |
|
Esfandiari, Kasra | Yale University |
Narendra, Kumpati | Yale Univ |
Keywords: Control of Uncertain Systems, System Modeling and Control, Modeling of Autonomous Systems
Abstract: The paper deals with the adaptive control of linear systems whose parameters can vary in a piece-wise constant fashion. The principal aim of the paper is to discuss the questions that arise when dealing with such systems and describe the methods used to identify and control them. These include the use of many models, the choice of their location, and how they are to be activated. Second level adaptation, which incorporates many of these features, is found to be the best method for tracking piece-wise constant systems from the point of view of speed, accuracy, and stability. Simulation results are included to indicate the improvement in performance at every stage.
|
|
11:20-11:40, Paper We-PS2-T9.5 | Add to My Program |
On Transformations among Opacity Notions |
|
Balun, Jiri | Palacky University |
Masopust, Tomas | Faculty of Science, Palacky University in Olomouc |
Keywords: Discrete Event Systems
Abstract: Opacity is a property asking whether a system may reveal its secret to a passive observer who knows the structure of the system but has only limited observations of its behavior. Several notions of opacity have been studied. Similarities among the opacity notions have been investigated via transformations, which have many potential applications. We investigate K-step opacity (K-SO), a notion that generalizes both current-state opacity and infinite-step opacity, and asks whether the intruder cannot decide, at any instant, whether or when the system was in a secret state during the last K observable steps. We provide new polynomial-time transformations among K-SO and other opacity notions. Our results lead, among others, to the general solution of an open problem concerning the computational complexity of the verification of K-SO.
|
|
We-PS2-T10 Regular Session, TYCHO |
Novel Optimization Methods and Applications |
|
|
Co-Chair: Novak, Petr | Czech Technical University in Prague - CIIRC |
|
10:00-10:20, Paper We-PS2-T10.1 | Add to My Program |
A Distributed Cooperative Co-Evolutionary Algorithm Based on Ring Network for Distributed Large-Scale Optimization |
|
Ou, Wen-Jie | South China University of Technology |
Shi, Xuan-Li | South China University of Technology |
Chen, Wei-Neng | South China University of Technology |
Keywords: Evolutionary Computation, Swarm Intelligence, Computational Intelligence
Abstract: With the rapid development of distributed computing paradigms such as edge computing and the Internet of Things (IoT), many distributed edge nodes are involved in data collection and decision making, giving rise to many distributed optimization problems (DOPs). In this paper, we consider DOPs with the following features. First, the decision variables of a problem are naturally distributed across several spatially distributed edge nodes. Each computing node is responsible for one subproblem, and it can only access its corresponding local data and perform local objective evaluation. Second, some decision variables appear in different groups simultaneously; these are called overlapping variables. Third, the computing nodes can only communicate following a certain network topology. They need to work together to solve the overall problem. Because of its divide-and-conquer nature, cooperative coevolution (CC) has good potential for handling such distributed problems. Therefore, we develop a new distributed CC framework to solve them. First, a new CC architecture based on a ring network without any central node is designed. Second, an asynchronous communication strategy with low communication frequency and volume is proposed. Third, a competitive selection strategy is adopted to achieve consistency in asynchronous evolution. We define a set of distributed benchmark problems, and the experimental results validate the effectiveness of the proposed approach.
|
|
10:20-10:40, Paper We-PS2-T10.2 | Add to My Program |
ASAN: An Extendable Approach for Automatic Step-Sizes Adjustment for Newton-Raphson Consensus Optimization Algorithms |
|
Aminian, Behdad | Norwegian University of Science and Technology (NTNU) |
Davari benam, Karim | Norwegian University of Science and Technology (NTNU) |
Varagnolo, Damiano | University of Padova |
Keywords: Optimization and Self-Organization Approaches, Heuristic Algorithms
Abstract: We propose a novel approach for solving the Automatic Step-size Adjustment problem in Newton-Raphson Consensus based distributed optimization algorithms (abbreviated as ASAN). The approach leads the agents in the network to autonomously, continuously and automatically choose their local step-sizes as the distributed optimization process unfolds. In practice, it is based on first evaluating the reliability of the next predicted local optimum by using the information collected at each node as the optimization process is being executed, and then using this reliability assessment to locally adjust the step-size accordingly. By letting each node adjust its own step-size separately by means of local inspection of local variables, the approach does not add communication overheads (a feature that is beneficial especially for systems with limited communication possibilities). Moreover, the strategy requires neither information about the current topology of the communication network nor information about the local cost functions, and thus serves situations in which both the network structure and the local cost functions are time-varying. Besides the overall concept, the paper introduces different heuristic reliability evaluation algorithms to analyse the temporal dynamics of the local data, and compares the approach against an oracle-based implementation that selects the best constant step-size for each specific network and set of local costs by means of Monte Carlo analyses. The paper statistically shows that the proposed heuristic leads to convergence rates that are most often better (and in general not worse) than those of the oracle-based optimization scheme, though without the need for information about the communication topology or local costs.
|
|
10:40-11:00, Paper We-PS2-T10.3 | Add to My Program |
Genetic Algorithm with Adapted Crossover Operators for Multiple Traveling Salesmen Problem with Visiting Constraints |
|
Bao, Cong | Nanjing University of Information Science and Technology |
Yang, Qiang | Nanjing University of Information Science and Technology |
Gao, Xu-Dong | Nanjing University of Information Science and Technology |
Lu, Zhen-Yu | Nanjing University of Information Science and Technology |
Keywords: Swarm Intelligence, Evolutionary Computation, Computational Intelligence
Abstract: The multiple traveling salesmen problem with visiting constraints (VCMTSP) is a general version of the classical multiple traveling salesmen problem (MTSP), where each city can be accessed only by certain salesmen. To cope with this new problem, we adapt the genetic algorithm (GA) for MTSP by using a dual-chromosome representation scheme, with one chromosome denoting the visiting sequence of cities and the other representing the assignment of cities to salesmen. To further promote the effectiveness of GA in solving VCMTSP, we modify three popular crossover operators, namely the cycle crossover (CX), the order crossover (OX), and the partially mapped crossover (PMX). As in their execution for the traditional TSP, the three crossover operators are all executed on the city sequence chromosome, while their adaptation lies in the modification of the salesman assignment in the second chromosome. To this end, a correction mechanism based on the accessibility matrix is applied to make the solutions generated by crossover feasible. Extensive experiments conducted on a total of 16 VCMTSP instances generated from the benchmark TSPLIB set demonstrate that the adapted GA can effectively cope with VCMTSP, and that the GA with the modified PMX achieves the best overall performance.
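For readers unfamiliar with the operators named in this abstract, the following sketch shows a textbook partially mapped crossover (PMX) on the city-sequence chromosome, followed by a simple repair of the salesman-assignment chromosome. The repair rule (reassign to the lowest-numbered permitted salesman) and the names `pmx`, `repair`, and `allowed` are assumptions for illustration; the abstract does not specify the authors' correction mechanism.

```python
def pmx(p1, p2, a, b):
    """Textbook PMX: the child keeps p1[a:b] and fills the remaining
    positions from p2, resolving duplicates through the segment mapping."""
    n = len(p1)
    child = [None] * n
    child[a:b] = p1[a:b]
    segment = set(p1[a:b])
    pos_in_p1 = {city: i for i, city in enumerate(p1)}
    for i in list(range(a)) + list(range(b, n)):
        city = p2[i]
        # If the city is already in the copied segment, follow the mapping
        # p1[j] -> p2[j] until a city outside the segment is found.
        while city in segment:
            city = p2[pos_in_p1[city]]
        child[i] = city
    return child

def repair(assignment, allowed):
    """Hypothetical correction step: allowed[c] is the set of salesmen
    permitted to visit city c; infeasible genes are reassigned."""
    return [s if s in allowed[c] else min(allowed[c])
            for c, s in enumerate(assignment)]
```

With parents `[1, 2, 3, 4, 5, 6, 7, 8, 9]` and `[9, 3, 7, 8, 2, 6, 5, 1, 4]` and cut points 3 and 7, `pmx` returns `[9, 3, 2, 4, 5, 6, 7, 1, 8]`.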
|
|
11:00-11:20, Paper We-PS2-T10.4 | Add to My Program |
PL-TD3: A Dynamic Path Planning Algorithm of Mobile Robot |
|
Tan, Yijian | Wuhan University of Science and Technology |
Lin, Yunhan | Wuhan University of Science and Technology |
Liu, Tong | Wuhan University of Science and Technology |
Min, Huasong | Wuhan University of Science and Technology |
Keywords: Machine Learning, Neural Networks and their Applications, Application of Artificial Intelligence
Abstract: In this paper, a Prioritized Experience Replay (PER) strategy and a Long Short Term Memory (LSTM) neural network are introduced into the path planning process of mobile robots to address the slow convergence and inaccurate perception of dynamic obstacles of the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm. We dub this new method PL-TD3. First, we improve the convergence speed of the algorithm by introducing the PER strategy. Second, we use the LSTM neural network to improve the algorithm's perception of dynamic obstacles. To verify the proposed method, we design static-environment, dynamic-environment, and dynamic-adaptability experiments to compare and analyze the method before and after improvement. The experimental results show that PL-TD3 outperforms TD3 in terms of execution time and execution path length in all environments.
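The PER strategy named in this abstract is usually formulated with proportional priorities. The sketch below follows that standard formulation, with sampling probability proportional to |TD error|^alpha and importance-sampling weights (N * P(i))^(-beta). The default values of `alpha`, `beta`, and `eps` are illustrative assumptions, and the abstract does not say which PER variant or settings PL-TD3 uses.

```python
import numpy as np

def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization: P(i) = p_i^alpha / sum_k p_k^alpha,
    with p_i = |TD error_i| + eps so no transition has zero probability."""
    priorities = (np.abs(td_errors) + eps) ** alpha
    return priorities / priorities.sum()

def importance_weights(probs, beta=0.4):
    """Importance-sampling weights that correct the sampling bias,
    normalized by the largest weight for stability."""
    weights = (len(probs) * probs) ** (-beta)
    return weights / weights.max()
```

Setting `alpha=0` recovers uniform sampling, which is the replay scheme of plain TD3.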
|
|
11:20-11:40, Paper We-PS2-T10.5 | Add to My Program |
A Routing Algorithm Based on Optimistic Path Analysis |
|
Liu, Jing | Wuhan University of Science and Technology |
Yu, Haijian | Wuhan University of Science and Technology |
Zhang, Yifu | Wuhan University of Science and Technology |
Hu, Wei | Wuhan University of Science and Technology |
Keywords: Heuristic Algorithms, Complex Network
Abstract: To solve the problems of low delivery rate, high network load, and high average forwarding delay caused by the blind forwarding of messages in mobile social networks, while considering the selfish attribute of nodes, the optimistic path analysis (OPA) algorithm is proposed. OPA puts forward the concept of encounter intensity based on historical encounter records and gives its calculation formula. Encounter intensity uses time as an important basis, which can more accurately reflect the possibility of two nodes meeting next time. At the same time, based on encounter intensity, the “small world principle” is used to limit the number of hops of routing paths, and finally a set of preferred paths for the message from the source node to the destination node is constructed, which is used as a basis to determine whether the message is forwarded or not. Simulation results show that, compared with classic routing algorithms, OPA can effectively increase the message delivery rate and reduce network load and forwarding delay.
|
|
We-PS2-T11 Regular Session, STELLA |
System Modeling and Evaluation Methods |
|
|
Chair: Preucil, Libor | Czech Technical University in Prague |
|
10:00-10:20, Paper We-PS2-T11.1 | Add to My Program |
Construction Methods for Error Correcting Output Codes Using Constructive Coding and Their System Evaluations |
|
Hirasawa, Shigeichi | Waseda University |
Kumoi, Gendo | Waseda University |
Yagi, Hideki | University of Electro-Communications |
Kobayashi, Manabu | Waseda University |
Goto, Masayuki | Waseda University |
Inazumi, Hiroshige | Aoyama Gakuin University |
Keywords: Machine Learning
Abstract: Consider M-valued (M ≥ 3) classification systems realized by a combination of N (N ≥ ⌈log2 M⌉) binary classifiers. Such a construction method is called an Error Correcting Output Code (ECOC). First, focusing on a Reed-Muller (RM) code, we derive a modified RM (mRM) code to make it suitable for the ECOC. Using the mRM code and the Hadamard matrix, we introduce a simplex code, which is one of the powerful equidistant codes. Next, from the viewpoint of a system evaluation model, we evaluate the ECOC by using the constructive coding described above. We show that they have desirable properties such as Flexible, Elastic, and Effective Elastic as M becomes large, by employing analytical formulas and experiments.
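The equidistance property mentioned in this abstract can be checked on a small example. The sketch below builds a Hadamard matrix with the standard Sylvester construction and drops the constant first column, which yields 2^m binary codewords of length 2^m - 1 with pairwise Hamming distance 2^(m-1). It does not reproduce the authors' mRM-based derivation, and the function names are hypothetical.

```python
import numpy as np

def sylvester_hadamard(m):
    """Hadamard matrix of order 2**m with entries +1/-1."""
    H = np.array([[1]])
    for _ in range(m):
        H = np.block([[H, H], [H, -H]])
    return H

def simplex_codewords(m):
    """Binary codewords (one row per class): map +1 -> 0 and -1 -> 1,
    then drop the first column, which is constant across all rows."""
    H = sylvester_hadamard(m)
    return (1 - H[:, 1:]) // 2
```

In an ECOC setting, each row would be the target code for one class and each column would define one binary classifier.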
|
|
10:20-10:40, Paper We-PS2-T11.2 | Add to My Program |
Performance Analysis for Biometric Identification Systems with Nonlegitimate Users |
|
Yagi, Hideki | University of Electro-Communications |
Hirasawa, Shigeichi | Waseda University |
Keywords: Information Assurance and Intelligence, Biometric Systems and Bioinformatics
Abstract: The biometric identification system, introduced by Willems et al., is a mathematical model for identifying users based on their physical features. Although the maximum rate of the number of users that can be reliably handled by the system (identification capacity) and the exponential behavior of the average error probability (error exponents) of the legitimate users have been revealed via information theoretic approaches, optimum error exponents have not been shown for the case in which a nonlegitimate user exists in the system. In this paper, we formally define the reliability function as the optimum error exponent for legitimate users for a given rate of the number of legitimate users and a given error exponent for the nonlegitimate users. It is shown that the reliability function can be completely characterized by the well-known random coding exponent and the hypothesis testing error exponent.
|
|
10:40-11:00, Paper We-PS2-T11.3 | Add to My Program |
An Opinion Dynamics Model with Cross-Link Interaction |
|
Zhang, Haoming | National University of Defense and Technology |
Wenjie, Tang | National University of Defense and Technology |
Yao, Yiping | National University of Defense and Technology |
Jiefan, Zhu | National University of Defense and Technology |
Keywords: Agent-Based Modeling, Complex Network
Abstract: At present, online social networks have become the main platform for people to express and exchange their opinions, which has a great impact on the evolution of opinions. Therefore, research on the evolution law of opinions in online social networks has become a current hotspot. Opinion dynamics is an important tool for studying the evolution law of opinions in a network. In opinion dynamics models for online social networks, agents can often interact only with others to whom they are linked in the network. However, in reality, the interaction of agents' opinions is not limited to agents with links in the network. Therefore, due to the lack of sufficient interaction, the traditional model cannot reflect the true final state of public opinion evolution. In view of this situation, we propose a novel cross-link interaction mechanism that enables agents to exchange opinions with others without being limited by network links, and we use a machine learning methodology to obtain the cross-link interaction distance. We then introduce the mechanism into the bounded confidence opinion dynamics model. With this mechanism, the interaction of the agents is more reasonable and closer to social behavior. The simulation results show that the proposed model fits the real data better than traditional models and that, even under a very small bounded confidence value, agents' opinions will still gather around the opinions of high-influence agents.
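As background for the bounded confidence model referred to in this abstract, the sketch below shows one synchronous update of a Hegselmann-Krause style model restricted to network neighbors. The choice of this particular bounded confidence variant is an assumption, the cross-link mechanism proposed in the paper is not modeled, and `hk_step` is a hypothetical name.

```python
def hk_step(opinions, neighbors, eps):
    """One synchronous bounded-confidence update.

    opinions:  list of floats, one opinion per agent
    neighbors: dict mapping agent index -> list of linked agent indices
    eps:       confidence bound; neighbors further away are ignored
    """
    updated = []
    for i, x in enumerate(opinions):
        # Keep only linked agents whose opinion lies within the bound.
        close = [opinions[j] for j in neighbors[i]
                 if abs(opinions[j] - x) <= eps]
        close.append(x)  # an agent always counts its own opinion
        updated.append(sum(close) / len(close))
    return updated
```

In this baseline an agent never hears from unlinked agents, which is the restriction the abstract argues against.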
|
|
11:00-11:20, Paper We-PS2-T11.4 | Add to My Program |
A Tool to Certify Dynamic Benchmarks |
|
Carrero, Jonathan | Universidad Complutense |
Rodríguez, Ismael | Universidad Complutense De Madrid |
Rubio, Fernando | Universidad Complutense |
Keywords: Computational Intelligence, Evolutionary Computation, Swarm Intelligence
Abstract: Benchmarks are useful to allow evaluating the usefulness of new algorithms. However, care has to be taken to avoid cheating in case the users know the benchmarks in advance. In this paper, we present a blockchain-based tool that allows the generation of dynamic benchmarks. Moreover, it provides a verifiable certification about the moment when the benchmark was created, the researcher who asked for it, how much time was spent before solving it, etc. By using it, we can safely deal with the use of dynamic benchmarks without requiring a trusted third party.
|
|
11:20-11:40, Paper We-PS2-T11.5 | Add to My Program |
A Denoisable Super Resolution Method: A Way to Improve Structure from Motion's Performance against CMOS's Noise |
|
Zhang, Kaihang | University of Tsukuba |
Hajime, Nobuhara | University of Tsukuba |
Keywords: Deep Learning, Image Processing and Pattern Recognition, Neural Networks and their Applications
Abstract: The quality of the three-dimensional (3D) reconstruction algorithm Structure from Motion (SfM) is affected by the resolution and noise level of the input images. We propose a denoisable Super Resolution (SR) method that improves resolution while reducing noise, used to preprocess SfM input images taken by a CMOS device and thereby improve SfM's performance on noisy images. Conventional deep learning-based SR algorithms do not consider denoising during the learning process, so they cannot simultaneously reduce noise and improve resolution. In our methods (Add Noise before Downsampling (An-Ds) and Downsampling before Adding Noise (Ds-An)), instead of expanding the training data, we extract the noise from a real-world noise dataset and selectively add it to the low resolution (LR) images of the SR training set. The SR algorithm can then simultaneously improve resolution and reduce noise by learning the relationship between noisy LR images and noise-free high resolution (HR) images. Moreover, adding noise selectively preserves the SR algorithm's performance on clean images. We trained two representative SR algorithms (SRCNN and EDSR) using traditional and our designed methods to process both clean and noisy images; thereafter we calculated the peak signal-to-noise ratio (PSNR). Without changing the SR network's structure, improvements of 0.17 dB by Ds-An and 0.14 dB by An-Ds (approximately 20% of the improvement obtained in three years by reforming the network's structure) were observed in the noisy-image experiments with EDSR, while there was only a small loss (less than 0.01 dB) in the clean-image experiments with EDSR. Subsequently, we input to SfM noisy images processed by the conventional and proposed methods, as well as noisy images without preprocessing. Using our methods, a better 3D model is reconstructed. Compared to no preprocessing and conventional preprocessing, key metrics such as Mean Reprojection Error (MRE) were reduced by 51.9% and 12.6%, and the 2D key-point matching rate improved by 41.7% and 217%, respectively. These results demonstrate the superiority of our proposed methods.
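The PSNR figures quoted in this abstract follow the standard definition PSNR = 10 log10(MAX^2 / MSE). A minimal sketch, assuming 8-bit images (MAX = 255) and a hypothetical function name `psnr`:

```python
import numpy as np

def psnr(reference, test, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    ref = np.asarray(reference, dtype=np.float64)
    tst = np.asarray(test, dtype=np.float64)
    mse = np.mean((ref - tst) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

Because the scale is logarithmic, a gain of 0.17 dB corresponds to roughly a 4% reduction in mean squared error.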
|
|
We-PS2-T12 Regular Session, ZODIAC |
Recent Progress in Representation Learning |
|
|
Chair: Putze, Felix | University of Bremen |
|
10:00-10:20, Paper We-PS2-T12.1 | Add to My Program |
Multimodal Sentiment Analysis Based on Nonverbal Representation Optimization Network and Contrastive Interaction Learning |
|
Quan, Zhibang | Qilu University of Technology (Shandong Academy of Sciences) |
Sun, Tao | Qilu University of Technology (Shandong Academy of Scienc |
Su, Mengli | Qilu University of Technology |
Wei, Jishu | Qilu University of Technology |
ZhangXiang, ZhangXiang | Qilu University of Technology |
Zhongshenjie, Zhongshenjie | Qilu University of Technology |
Keywords: Deep Learning, Multimedia Computation, Representation Learning
Abstract: Multimodal sentiment analysis is an active subfield of natural language processing. It aims to extract and integrate semantic information gathered from multiple modalities to identify the sentiments expressed by users. Indeed, the complementary and heterogeneous information between modalities influences the prediction results. Recent research proposals employ a single neural network to obtain mutually independent representations of all modalities. However, this approach does not take into account the heterogeneity that exists between different modalities, which may limit the performance of previous work. This in turn may lead to the presence of additional noise in the representations before modal fusion. For this reason, we propose a new framework, MICS, which adopts a suitable strategy for each modality and provides a better representation for fusion. Also, we design a multimodal comparative learning interaction module for the fusion phase, which plays a crucial role in the information interaction between modalities. Our extensive experiments on two publicly available and popular benchmarks, MOSI and MOSEI, show that MICS can better focus on the characteristics of different modal data and has significant advantages over previous complex baselines.
|
|
10:20-10:40, Paper We-PS2-T12.2 | Add to My Program |
Semantic Knowledge Representation for Long-Range Action Anticipation |
|
Koch, Jan-Hendrik | University of Bremen |
Putze, Felix | University of Bremen |
Keywords: Application of Artificial Intelligence, Representation Learning, Deep Learning
Abstract: Action anticipation is an important capability for systems interacting with humans in their everyday environments, such as robots or digital assistants. Action anticipation already works remarkably well for short periods of time; however, it is still an unsolved challenge for larger time gaps. In this paper, we propose a semantic representation of previous actions for the prediction of a distribution across possible future actions. We show that this approach is able to beat a baseline prediction for as much as 5 minutes into the future.
|
|
10:40-11:00, Paper We-PS2-T12.3 | Add to My Program |
UniMoCo: Unsupervised, Semi-Supervised and Fully-Supervised Visual Representation Learning |
|
Dai, Zhigang | South China University of Technology |
Cai, Bolun | Tencent |
Chen, Junying | South China University of Technology |
Keywords: Representation Learning, Transfer Learning, Deep Learning
Abstract: Momentum Contrast (MoCo) achieves great success for unsupervised visual representation learning. However, there are a lot of supervised and semi-supervised datasets, which are already labeled. To fully utilize the label annotations, we propose Unified Momentum Contrast (UniMoCo), which extends MoCo to support arbitrary ratios of labeled data and unlabeled data training. Compared with MoCo, UniMoCo has two modifications as follows: (1) Different from a single positive pair in MoCo, we maintain multiple positive pairs on-the-fly by comparing the query label to a label queue. (2) We propose a Unified Contrastive (UniCon) loss to support an arbitrary number of positives and negatives in a unified pair-wise optimization perspective. Our UniCon is more reasonable and powerful than the supervised contrastive loss in theory and practice. In our experiments, we pre-train multiple UniMoCo models with different ratios of ImageNet labels and evaluate the performance on various downstream tasks. Experiment results show that UniMoCo generalizes well for unsupervised, semi-supervised and fully-supervised visual representation learning. Besides, we surprisingly find that UniMoCo performs best with 60% ImageNet labels for COCO and VOC transfer learning. The code is available: https://github.com/dddzg/unimoco.
|
|
11:00-11:20, Paper We-PS2-T12.4 | Add to My Program |
A Representation Learning Method of Knowledge Graph Integrating Ordered Relation Path and Entity Description Information |
|
Ma, Haoxiang | Qilu University of Technology |
Jiang, Xuesong | Qilu University of Technology (Shandong Academy of Sciences) |
Chai, Huihui | Qilu University of Technology (Shandong Academy of Sciences) |
Wei, Xiumei | Qilu University of Technology |
Keywords: Representation Learning, Knowledge Acquisition, Deep Learning
Abstract: Knowledge graph representation learning aims to obtain its vector representation by mapping entities and relations in knowledge graphs to a continuous low-dimensional vector space by learning methods. Most of the existing knowledge graph representation learning methods only consider the single-step relation between entities from the perspective of triples and fail to effectively utilize important information such as ordered multi-step relation paths and entity descriptions, thus affecting the ability of knowledge representation learning. We propose a knowledge graph representation learning model that integrates ordered relation paths and entity descriptions in response to the above problems. The model can integrate the triple representation in the knowledge graph, the semantic representation of entity description, and the representation of ordered relation paths for training. On the FB15K, WN18, FB15K-237, and WN18RR datasets, the proposed model and baselines are run on the link prediction task. Experimental results show that the model has higher accuracy than existing baselines, demonstrating the effectiveness and superiority of the method.
|
|
11:20-11:40, Paper We-PS2-T12.5 | Add to My Program |
Learning and Estimation of Latent Structural Models Based on Between-Data Metrics |
|
Mikawa, Kenta | Tokyo City University |
Kobayashi, Manabu | Waseda University |
Goto, Masayuki | Waseda University |
Hirasawa, Shigeichi | Waseda University |
Keywords: Machine Learning
Abstract: With the development of information technology, a wide variety of data have been accumulated, and there are many methods for analyzing such data. In this study, we model the input data and the metrics between the data based on the assumption that each metric is generated from a continuous latent variable. Specifically, we assume that the input data are generated using low-dimensional latent variables and their projection matrices. We describe a method for estimating the latent variables. Because the generative model defined in this study cannot obtain the Q function analytically, we use the Monte Carlo EM algorithm to approximate the Q function and investigate an efficient parameter estimation method. Experiments using artificial data and the 20 newsgroups dataset demonstrate the effectiveness of the proposed method.
|
|
We-PS3-T1 Regular Session, MERIDIAN |
Add to My Program |
Machine Learning for Intelligent Imaging Systems |
|
|
Co-Chair: Barhoum, Alaa | University of New South Wales |
|
13:00-13:20, Paper We-PS3-T1.1 | Add to My Program |
MBNet: Detecting Salient Object in Low-Light Scenes (I) |
|
Zhang, Yu | Sichuan Normal University |
Hu, Yiyue | Sichuan Normal University |
Mu, Nan | Sichuan Normal University |
Keywords: Image Processing and Pattern Recognition, Deep Learning, Machine Vision
Abstract: Benefiting from the powerful features created by deep learning techniques, salient object detection has recently made significant progress. Compared with the task of detecting salient objects in well-lit images, the detection of salient objects in low-light scenes requires not only acquiring the spatial visual saliency of images under low-light conditions, but also accurately identifying the multi-scale objects of interest. Mountain Basin Network (MBNet) is proposed for salient object detection to discriminate the pixel-level saliency of low-light images. To further refine the object localization and pixel classification performance, the proposed model integrates a high-low feature aggregation module (HLFA) to synergize the information from a high-level branch (named Bal-Net) and a low-level branch (named Mol-Net) to fuse the global and local context, and hierarchical supervision modules (HSM) are embedded to assist in obtaining accurate salient objects, especially the small ones. Furthermore, a multi-supervised integration strategy is leveraged to optimize the structure and boundaries of salient objects. Meanwhile, to facilitate further research and evaluation of visual saliency models, we construct a new low-light dataset, which includes 13 categories with a total of 1000 low-light images. The experimental results show that the proposed model has state-of-the-art low-light saliency detection performance compared with seven existing methods.
|
|
13:20-13:40, Paper We-PS3-T1.2 | Add to My Program |
Dual Generative Adversarial Network for Ultrasound Localization Microscopy (I) |
|
赵, 亚川 | Southwest Petroleum University |
Liu, Shuyan | Southwest Petroleum University |
Luo, Anguo | Sichuan Provincial Key Laboratory of Ultrasound Cardiac Electrop |
Peng, Bo | Southwest Petroleum University |
Keywords: Image Processing and Pattern Recognition, Application of Artificial Intelligence
Abstract: Ultrasound localization microscopy (ULM) is a new imaging technique that uses microbubbles (MBs) to improve the spatial resolution of ultrasound (US) imaging. For ULM, it is critical to localize MB positions accurately. Recently, deep learning-based methods have been adopted for MB localization, showing promising performance and efficient computation. However, detection of high-concentration MBs is still a challenging task. To further improve the localization accuracy, a dual generative adversarial network (DualGAN)-based ULM imaging method (DualGAN-ULM) is proposed in this paper to overcome the problems of long data processing time and low parameter robustness in current ULM imaging methods. This method is trained using simulated data generated by point spread function (PSF) convolution and uses a dual generative adversarial strategy to enable the generator to perform accurate localization under high-concentration MB conditions. Meanwhile, the localization and reconstruction capabilities of five ULM methods, namely Centroid, CS-ULM, mUNET-ULM, mSPCN-ULM and DualGAN-ULM, are evaluated in this paper. The experimental results reveal that DL-based ULM methods (DualGAN-ULM, mSPCN-ULM, and mUNET-ULM) outperform compressed sensing-based localization methods (CS-ULM) and Centroid in terms of localization accuracy and localization dependability. DualGAN-ULM performs better than mSPCN-ULM and mUNET-ULM, making it a more realistic ULM method.
|
|
13:40-14:00, Paper We-PS3-T1.3 | Add to My Program |
Raindrop Removal for In-Vehicle Camera Images with Generative Adversarial Network (I) |
|
Zhao, Zihao | Wuhan University of Science and Technology |
Jiang, Min | Wuhan University of Science and Technology |
Guo, Jia | Wuhan University of Science and Technology |
Yang, Xiaoyu | Wuhan University of Science and Technology |
Hu, Yudie | Wuhan University of Science and Technology |
周, 贤龙 | Wuhan University of Science and Technology |
Keywords: Image Processing and Pattern Recognition
Abstract: Raindrop removal for in-vehicle camera images is useful for surveillance and analysis systems such as intelligent driving or case diagnosis. On rainy days, images taken by in-vehicle cameras such as monitors or drive recorders often suffer from noticeable degradation, which makes it difficult to identify objects on the road. Because of the uncertainty of raindrop distribution and the complexity of raindrop status, raindrops attached to in-vehicle camera images can have different effects. Large-diameter raindrops greatly reduce image quality and are harder to remove. Most popular deraining algorithms succeed in recovering images with small raindrop noise and slight image distortion, but fail to restore those with raindrops covering large areas. We propose a single-image raindrop removal network based on a generative adversarial network for in-vehicle camera images. We incorporate two overlapping attention layers into the deraining network, which adopt task-driven visual attention and a content perception mechanism. The former obtains features through a recursive neural network to guide the network toward the raindrop regions and the surrounding structures, while the content perception schema extracts the features far from raindrops. Moreover, we collect a new benchmark for in-vehicle camera image deraining (IVCID), which selects in-vehicle camera images from various deraining datasets and also includes a few in-vehicle images collected by ourselves. The experimental results show that the proposed deraining network outperforms other state-of-the-art raindrop removal methods in the image recovery test on the IVCID dataset and also has the best performance in the traffic object detection experiments.
|
|
14:00-14:20, Paper We-PS3-T1.4 | Add to My Program |
Age Estimation of Caenorhabditis Elegans with Label Distribution Learning (I) |
|
Zhao, Zi-Kang | Wuhan University of Science and Technology |
Liu, Jun | Wuhan University of Science and Technology |
Wang, Jun-Ji | Wuhan University of Science and Technology |
Chen, Gong | Wuhan University of Science and Technology |
Li, Chen-Qian | Wuhan University of Science and Technology |
Zhao, Zihao | Wuhan University of Science and Technology |
Jiang, Min | Wuhan University of Science and Technology |
Keywords: Application of Artificial Intelligence, Deep Learning, Image Processing and Pattern Recognition
Abstract: Caenorhabditis elegans (C. elegans) is widely used in life science research as a model organism. Accurate measurement of its age has practical implications for follow-up research. However, because nematodes in different age groups have similar characteristics, general methods struggle to estimate age accurately. To address this problem, this paper proposes a label distribution learning method that uses fluorescence microscopy images to estimate the age of C. elegans. First, the logical age labels are mapped to a label distribution. Second, a distribution learning loss function is used to make the neural network learn the relationship between the label distribution and the corresponding image. Finally, the distribution learning loss function and the classification loss function are combined to strengthen the learning of the neural network. Experiments show that the proposed method improves on the baseline in both accuracy and mean absolute error (MAE). In the experiment on estimating age from different body parts, we also conclude that fluorescence images of the torso of C. elegans reflect age characteristics better than those of the head and tail.
|
|
14:20-14:40, Paper We-PS3-T1.5 | Add to My Program |
Feasibility Study of a Lens-Based SPECT with a Tiled Lens and Detector Geometry for Animal Research: Simulation Results (I) |
|
Barhoum, Alaa | University of New South Wales |
Tahtali, Murat | UNSW Canberra |
Riccardo, Camattari | University of Ferrara |
Keywords: Computational Intelligence, Image Processing and Pattern Recognition, Machine Learning
Abstract: The considerable potential of gamma cameras in Nuclear Medicine Imaging has sparked attention and facilitated early cancer screening and several investigations. By leveraging recent advances in high-energy astrophysics, focusing X-ray and gamma-ray optics enable us to break the paradigm of low resolution and demonstrate a methodology for imaging small organisms with high resolution. This work aims to develop a method for SPECT imaging that requires neither parallel collimators nor pinholes and conducts a first-of-its-kind Monte Carlo analysis. We examine the performance parameters of a multi-lens-based SPECT system equipped with modular partially curved detectors and Laue lens arrays. The Laue lenses and their detectors are placed in tilted planes in a modular fashion such that their fields of view converge. The SPECT gamma camera environment was simulated using a tracking Monte Carlo simulation to determine the distribution of diffracted high-energy photons. The absorption and attenuation events were studied by incorporating possible physical interactions. We explored the resolutions and possibilities that the proposed device can offer to obtain images of tumours in the sub-millimetre range. For this, we simulated a phantom sphere of 0.03 mm radius with a total activity of 0.1 mCi. The simulation results demonstrated the modular SPECT’s capacity to discriminate between two adjacent volumes as small as 0.00013 cc placed 0.1 mm apart centre to centre, which is significantly better than any existing SPECT or PET system. The modular lens-based SPECT exhibits superior resolution and comparable sensitivity to existing LEHR. Moreover, the sensitivity was boosted with the modified geometry compared to the single lens-based SPECT and to LEHR, with three hits detected per 42 source photons, corresponding to a sensitivity of 81 cps/MBq.
The system's novelty lies in its ability to view the object from each lens separately, with each lens capturing a specific view of it; these views can be integrated to generate a three-dimensional image from the three modules, including accurate information about the object's size and location.
|
|
We-PS3-T2 Regular Session, ZENIT |
Add to My Program |
Medical Informatics III |
|
|
Chair: Hallgarten, Philipp | Dr. Ing. H.c. F. Porsche AG / University of Tübingen |
Co-Chair: Fasciglione, Andrea | University of Genova |
|
13:00-13:20, Paper We-PS3-T2.1 | Add to My Program |
EEG2Vec: Learning Affective EEG Representations Via Variational Autoencoders |
|
Bethge, David | Dr. Ing. H. C. F. Porsche AG, LMU Munich |
Hallgarten, Philipp | Dr. Ing. H.c. F. Porsche AG / University of Tübingen |
Grosse-Puppendahl, Tobias | Dr. Ing. H.c. F. Porsche AG |
Kari, Mohamed | Dr. Ing. H.c. F. Porsche AG |
Chuang, Lewis | LMU Munich |
Ozdenizci, Ozan | Graz University of Technology |
Schmidt, Albrecht | LMU Munich |
Keywords: Human-Machine Interface, Medical Informatics
Abstract: There is a growing need for sparse representational formats of human affective states that can be utilized in scenarios with limited computational memory resources. We explore whether representing neural data, in response to emotional stimuli, in a latent vector space can serve both to predict emotional states and to generate synthetic EEG data that are participant- and/or emotion-specific. We propose a conditional variational autoencoder based framework, EEG2Vec, to learn generative-discriminative representations from EEG data. Experimental results on affective EEG recording datasets demonstrate that our model is suitable for unsupervised EEG modeling, that classification of three distinct emotion categories (positive, neutral, negative) based on the latent representation achieves a robust performance of 68.49%, and that generated synthetic EEG sequences resemble real EEG data inputs, particularly in reconstructing low-frequency signal components. Our work advances areas where affective EEG representations can be useful, e.g., in generating artificial (labeled) training data or alleviating manual feature extraction, and provides efficiency for memory-constrained edge computing applications.
|
|
13:20-13:40, Paper We-PS3-T2.2 | Add to My Program |
Emotion-Related Awareness Detection for Patients with Disorders of Consciousness Via Graph Isomorphic Network |
|
He, Zhipeng | South China Normal University |
Zhong, Yongshi | South China Normal University |
Pan, Jiahui | South China Normal University |
Keywords: Assistive Technology, Medical Informatics, Brain-based Information Communications
Abstract: The clinical diagnosis of patients with disorders of consciousness (DOC) mainly relies on behavioral scales. However, patients with DOC often have severe dyskinesia, which may lead to misdiagnosis of patients in the minimally conscious state (MCS) as patients in a vegetative state (VS). In this paper, we propose an emotion-induced paradigm based on audio-visual stimulation, which can collect electroencephalogram (EEG) signals for consciousness detection without requiring behavioral expression tasks. This paradigm stimulates patients with emotional video clips, which is more comfortable than event-related potential (ERP), steady-state visual evoked potential (SSVEP) and other paradigms. It effectively reduces the mental burden placed on the patients. In terms of algorithms, a graph isomorphic network (GIN) is adopted to automatically classify VS and MCS, using emotional EEG signals from patients with DOC. The accuracy, precision, specificity, and sensitivity of our method are 93.70%, 93.77%, 89.44%, and 95.21%, respectively. Compared with the existing consciousness detection methods, our method has superior performance in consciousness detection for patients with DOC. The experimental results show that our method is feasible in distinguishing MCS and VS, and it is an effective extension of consciousness level detection in patients with DOC.
|
|
13:40-14:00, Paper We-PS3-T2.3 | Add to My Program |
Mental Disorders Prediction with Heterogeneous Graph Convolutional Network |
|
Lin, Haocai | Ningbo University |
Pan, Jiacheng | Ningbo University |
Dong, Yihong | Ningbo University |
Keywords: Medical Informatics, Assistive Technology, Design Methods
Abstract: In the medical imaging field, Computer-Aided Detection (CADe) has greatly benefited from the recent development of Graph Convolutional Networks (GCNs). GCN-based predictive models require building a population graph to detect the disease states of each subject, based on imaging and non-imaging data. Until now, all existing population-level methods are homogeneous, failing to consider sex differences. To address this issue, we present a heterogeneous population graph convolutional network with hierarchical attention mechanisms, including intra-level and inter-level attention. Specifically, the intra-level attention layer is aimed at learning differences and similarities between the sexes, while the inter-level attention layer is responsible for information integration by assigning weights to different features. The objective is to obtain node embeddings describing individual characteristics completely and provide discriminative inputs to classifiers. Compared to benchmark models, our proposal achieves satisfying prediction results on three datasets, illustrating the framework's ability to extract predictive attributes from medical multimodal data.
|
|
14:00-14:20, Paper We-PS3-T2.4 | Add to My Program |
Prediction of Lymph Node Metastasis in CT Based on Multi-Scale Attention Fully Convolutional Network |
|
He, Weiping | East China Normal University |
Hu, Wenxin | East China Normal University |
Yu, Kun | East China Normal University |
Lu, Changquan | East China Normal University |
Keywords: Medical Informatics, Assistive Technology
Abstract: With the application of artificial intelligence technology in the medical field, the use of deep learning methods to predict lymph node (LN) metastasis from Computed Tomography (CT) images has become an important line of study in adjuvant cancer treatment. Although many studies have made some progress, the large variation in LN size has always limited the performance of traditional convolutional neural network methods. In this paper, we propose a multi-scale attention fully convolutional network for LN metastasis prediction in CT images. We aim to make the network compatible with LNs of different sizes to improve its overall performance. First, the network extracts image features from multi-scale input data. Then, we use an attention-based fusion module to study the relationship between features at different scales and adaptively fuse image features. Furthermore, we propose a feature consistency loss to enhance the similarity between image features at different scales. The experimental results show that our proposed network achieves the best performance, with an accuracy of 92.37%, a sensitivity of 90.21% and a specificity of 95.71%, outperforming several state-of-the-art methods.
|
|
14:20-14:40, Paper We-PS3-T2.5 | Add to My Program |
Reproducibility in Activity Recognition Based on Wearable Devices: A Focus on Used Datasets |
|
Fasciglione, Andrea | University of Genova |
Leotta, Maurizio | University of Genova |
Verri, Alessandro | University of Genova |
Keywords: Medical Informatics, Wearable Computing
Abstract: Reproducibility of proposed approaches is a crucial element in scientific fields, in order to let other researchers trust published works. Moreover, in order to let authors compare the effectiveness of a novel method to the state of the art, benchmark datasets should be commonly used. Concentrating on the task of activity recognition using data coming from wearable devices with inertial sensors, we have analyzed the reproducibility of proposed approaches with a focus on the datasets used. In this work, through a literature review, we have measured what percentage of works in the literature verified their approach using public datasets or shared the ones created on purpose. At the same time, we have also examined the characteristics of the datasets considered, with attention to the amount of data recorded, the population involved, and the activities studied. Starting from 1289 works retrieved on Scopus, we analyzed in detail 146 of them and found that approximately one out of three (33%) used public datasets and that less than one out of three (28%) of the specially made datasets were shared with the public. Moreover, across all the datasets considered, 13% had restricted access (e.g. requiring requests to authors or subscriptions to websites for a fee) or were offline.
|
|
We-PS3-T3 Regular Session, NADIR |
Add to My Program |
Conflict Resolution |
|
|
Chair: Zhu, Manli | Northumbria University |
Co-Chair: Kato, Yukiko | Tokyo Institute of Technology |
|
13:00-13:20, Paper We-PS3-T3.1 | Add to My Program |
State Definition for Conflict Analysis with Four-Valued Logic |
|
Kato, Yukiko | Tokyo Institute of Technology |
Keywords: Conflict Resolution, Decision Support Systems, Homeland Security
Abstract: We examined a four-valued logic method for state settings in conflict resolution models. Decision-making models of conflict resolution, such as game theory and graph model for conflict resolution (GMCR), assume the description of a state to be the outcome of a combination of strategies or the consequence of option selection by the decision-makers. However, for a framework to function as a decision-making system, unless a clear definition of the task of placing information out of an infinite world exists, logical consistency cannot be ensured, and thus, the function may be incomputable. The introduction of paraconsistent four-valued logic can prevent incorrect state setting and analysis with insufficient information and provide logical validity to analytical methods that vary the analysis resolution depending on the degree of coarseness of the available information. This study proposes a GMCR stability analysis with state configuration based on Belnap's four-valued logic.
|
|
13:20-13:40, Paper We-PS3-T3.2 | Add to My Program |
A Fault-Tolerant Scheduling Algorithm Based on Local Maximum Reliability Replication Strategy in Real-Time Heterogeneous Systems |
|
Mao, Dengfeng | Wuhan University of Science and Technology |
Hu, Wei | Wuhan University of Science and Technology |
Gan, Yu | Wuhan University of Science and Technology |
Liu, Jing | Wuhan University of Science and Technology |
Gu, Haonan | College of Computer Science and Technology, Wuhan University Of |
Keywords: System Architecture, Conflict Resolution, System Modeling and Control
Abstract: High reliability and low latency are conflicting goals in task scheduling. Scheduling of parallel applications with data dependencies in heterogeneous systems is an NP-complete problem. Using replication to improve system reliability can lead to increased application execution time. To increase the reliability of real-time heterogeneous systems while considering communication overhead and timing requirements, this paper proposes a fault-tolerant scheduling algorithm based on a local maximum reliability replication strategy (FTSA-BLMR). Our algorithm first sets the maximum number of replications for each task. Then it continuously replicates the task with the highest system reliability for the current task set to obtain a new task set. The tasks in the new task set are scheduled and the scheduling results are recorded. Finally, the scheduling result that has maximum system reliability and meets the deadline is selected as the final scheduling sequence. The experimental results indicate that our algorithm can improve the system reliability by 40% compared with DB-FTSA when the deadline constraint is strict.
|
|
13:40-14:00, Paper We-PS3-T3.3 | Add to My Program |
A Novel Conflict Measurement Method Based on Cosine Similarity and Deng Entropy in Dempster-Shafer Evidence Theory |
|
Zhang, Xu | Chongqing University |
Tang, Yongchuan | Northwestern Polytechnical University |
Zhou, Deyun | Northwestern Polytechnical University |
Keywords: Conflict Resolution, Decision Support Systems
Abstract: As a generalization of probability theory, Dempster-Shafer evidence theory is superior in dealing with uncertain information. However, a counter-intuitive result is often obtained when combining highly conflicting evidence. In this paper, a new method based on similarity and Deng entropy of the evidence is proposed to measure the conflict, and a new framework for fusing conflicting evidence is built based on the proposed method. When most evidence has the same view, this evidence is given a higher weight. Moreover, the lower the entropy of the evidence, the stronger its ability to provide accurate information, so it should be given more attention. Experiments on real data show that this method can effectively solve the combination problem of conflicting evidence and that it has a higher accuracy rate in the classification problem compared with other methods.
|
|
14:00-14:20, Paper We-PS3-T3.4 | Add to My Program |
A New Correlation Belief Transfer Method in the Evidence Theory |
|
Zhang, Xu | Chongqing University |
Tang, Yongchuan | Northwestern Polytechnical University |
Zhou, Deyun | Northwestern Polytechnical University |
Keywords: Conflict Resolution, Decision Support Systems, Control of Uncertain Systems
Abstract: Dempster-Shafer evidence theory (D-S evidence theory) is an effective method for dealing with uncertain information. However, it may yield counterintuitive results when the traditional Dempster's combination rule is used directly to fuse highly conflicting data. How to manage conflict in data fusion is still an open issue in D-S evidence theory. In this paper, a new correlation belief function is proposed to modify the basic belief assignment before combination in the closed world. The method transfers the belief from a certain proposition to another related proposition to avoid the loss of information when data are fused, which effectively solves the problem of conflict management in D-S evidence theory. The advantage of the proposed method is that it does not lose belief value in main propositions related to decision-making and also expresses the conflict information effectively. Several numerical examples and experiments with real data sets from the University of California Irvine Machine Learning Repository are adopted to verify the rationality and validity of the proposed method.
|
|
14:20-14:40, Paper We-PS3-T3.5 | Add to My Program |
An Evolutionary Learning Approach for Anti-Jamming Game in Cognitive Radio Confrontation |
|
Zou, Mingwo | National University of Defense Technology |
Chen, Shaofei | National University of Defense Technology |
Luo, Junren | National University of Defense Technology |
Hu, Zhenzhen | National University of Defense Technology |
Chen, Jing | National University of Defense Technology |
Keywords: Decision Support Systems, Communications
Abstract: In cognitive radio systems, there are cases where malicious users transmit interference power to prevent secondary users from transmitting information. Due to the destructive behavior of malicious users, spectrum utilization efficiency will be reduced. Applying game theory to study this confrontation is therefore reasonable. Based on the Continuous Blotto Game (CBG) model under the condition of one-shot perfect information confrontation, this paper constructs an anti-jamming game model for power allocation confrontation between secondary users and malicious users, and simulates the competition between two players under fixed resource constraints. An evolutionary learning approach is proposed, which improves the performance of the power allocation strategy through the repeated game of two players, and can obtain the same effect as the Nash equilibrium strategy under certain conditions. Our approach realizes the optimal power allocation on the information transmission channel in interference countermeasures without knowing the power allocation strategy of malicious users, thereby realizing the optimal utilization of spectrum resources. The simulation results show that, compared with the greedy algorithm and random algorithm, our algorithm has a more obvious effect on improving the spectrum utilization of secondary users.
|
|
We-PS3-T4 Regular Session, AQUARIUS |
Add to My Program |
Learning to Optimize in Intelligent Systems II |
|
|
Chair: Sorino, Paolo | Politecnico Di Bari |
|
13:00-13:20, Paper We-PS3-T4.1 | Add to My Program |
A Q-Learning-Based Selective Disassembly Sequence Planning Method (I) |
|
Bi, Zhiliang | Liaoning Petrochemical University |
Guo, Xiwang | Liaoning Petrochemical University |
Wang, Jiacun | Monmouth University |
Qin, Shujin | Shangqiu Normal University |
Qi, Liang | Shandong University of Science and Technology |
Zhao, Jian | University of Science and Technology Liaoning |
Keywords: Computational Intelligence, Evolutionary Computation, Heuristic Algorithms
Abstract: Disassembly planning and sequencing play an important role in recycling a fast-growing number of end-of-life products. Optimal sequences can effectively reduce carbon emissions and save natural resources in the remanufacturing industry. Considering the development of intelligent manufacturing technology, this work deals with the optimization problem of selective disassembly sequences with the objective of maximizing disassembly profit. Disassembly sequences are generated based on AND/OR graphs. After setting up an environment matrix based on such graphs, this work proposes a Q-learning technique to find an optimal selective disassembly sequence. The algorithm is applied to real-life disassembly cases. Experimental results show that the algorithm is superior to a widely used genetic algorithm (GA) in both computing speed and solution quality across various comparisons.
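As a rough illustration of the Q-learning idea mentioned in the abstract above, the following is a minimal tabular sketch and not the authors' implementation. It assumes a hypothetical environment matrix `reward[s][a]` holding the profit of moving from disassembly state `s` to state `a` (`None` when the move is infeasible), an acyclic state graph, and illustrative values for `alpha`, `gamma`, and `epsilon`; building the matrix from an AND/OR graph is not shown.

```python
import random


def q_learning(reward, terminal, episodes=500, alpha=0.5, gamma=0.9,
               epsilon=0.2, seed=0):
    """Tabular Q-learning over a hypothetical environment matrix.

    reward[s][a] is the assumed profit of moving from state s to state a,
    or None when that move is infeasible. Every episode starts in state 0
    and ends when a state in `terminal` is reached.
    """
    rng = random.Random(seed)
    n = len(reward)
    q = [[0.0] * n for _ in range(n)]
    for _ in range(episodes):
        s = 0
        while s not in terminal:
            actions = [a for a in range(n) if reward[s][a] is not None]
            # Epsilon-greedy choice among the feasible moves.
            if rng.random() < epsilon:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda x: q[s][x])
            # Best value reachable from the next state (0 if none).
            nxt = [q[a][b] for b in range(n) if reward[a][b] is not None]
            target = reward[s][a] + gamma * (max(nxt) if nxt else 0.0)
            # Standard Q-learning update.
            q[s][a] += alpha * (target - q[s][a])
            s = a
    return q


def greedy_sequence(q, reward, terminal, start=0):
    """Follow the highest-valued feasible move until a terminal state."""
    seq = [start]
    s = start
    while s not in terminal:
        actions = [a for a in range(len(q)) if reward[s][a] is not None]
        s = max(actions, key=lambda x: q[s][x])
        seq.append(s)
    return seq
```

On a toy four-state matrix, calling `greedy_sequence(q_learning(reward, {3}), reward, {3})` returns the state sequence with the highest discounted profit.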
|
|
13:20-13:40, Paper We-PS3-T4.2 | Add to My Program |
Learning to Schedule Job-Shop Problems Via Hierarchical Reinforcement Learning (I) |
|
Liao, Zijun | Jinan University |
Li, Qiwen | Jinan University |
Dai, Yuanzhi | Jinan University |
Zhang, Zizhen | Sun Yat-Sen University |
Keywords: Machine Learning
Abstract: The job-shop scheduling problem (JSSP) is a classic combinatorial optimization problem in the areas of computer science and operations research. It is closely associated with many industrial scenarios. In today’s society, the demand for efficient and stable scheduling algorithms has significantly increased, and more and more researchers have recently tried new methods to solve JSSP. In this paper, we formulate the scheduling process of JSSP as a Semi-Markov Decision Process. We then propose a method that uses hierarchical reinforcement learning with graph neural networks to solve JSSP. We also demonstrate that larger instances require the support of a larger number of sub-policies and that different scheduling phases require different sub-policies.
|
|
13:40-14:00, Paper We-PS3-T4.3 | Add to My Program |
An Artificial Neural Network Model to Assess Nutritional Factors Associated with Frailty in the Aging Population from Southern Italy (I) |
|
Castellana, Fabio | Data Sciences and Innovation, N.I. of Gastroenterology ”S. De Be |
Aresta, Simona | Data Sciences and Innovation |
Sorino, Paolo | Politecnico Di Bari |
Bortone, Ilaria | N.I. of Gastroenterology ”S. De Bellis” |
Lofù, Domenico | Dept. of Electrical and Information Engineering (DEI), Politecni |
Narducci, Fedelucio | Politecnico Di Bari |
Di Noia, Tommaso | Dept. of Electrical and Information Engineering (DEI), Politecni |
Di Sciascio, Eugenio | Politecnico Di Bari |
Sardone, Rodolfo | Data Sciences and Innovation, N.I. of Gastroenterology ”S. De Be |
Keywords: Application of Artificial Intelligence, Computational Life Science
Abstract: Machine Learning could help the healthcare industry manage huge amounts of data and discover hidden trends and patterns that could help us better understand disease development and treatment. The goal is to define a Neural Network (NN) model to classify physical frailty in an aging cohort and to identify the food and clinical profile of frail subjects. In a cohort of 1,929 older adults from Southern Italy, Food Frequency Questionnaire (FFQ) responses and clinical data were collected along with blood tests. An NN was built with a hyperparameter tuning technique using accuracy as the performance measure to select the best model. Confusion matrices and Garson's and Olden’s variable importance were evaluated. Older age, female gender, high BMI, and high blood pressure were associated with physical frailty. In frail subjects, the lipid profile and RBC levels were significantly lower than in their non-frail counterparts. In contrast, serum levels of interleukin-6 and CRP were higher in the frail group. Frail subjects showed higher consumption of spaghetti soup, pecorino cheese, fennel and chocolate, and lower consumption of ham. The NN model has training and testing accuracies of 86.49% and 85.77%, respectively. The NN performs well on the training set, makes few mistakes on the test set, and can predict healthy subjects with high specificity. According to Garson's method, age, gender, foods rich in fats, and smoking habits are essential in predicting the frailty condition. In contrast, Olden’s method underlined the higher consumption of legumes and unrefined cereals.
|
|
14:00-14:20, Paper We-PS3-T4.4 | Add to My Program |
Task Offloading for Multi-Gateway-Assisted Mobile Edge Computing Based on Deep Reinforcement Learning |
|
Chu, Xianyang | East China Normal University |
Zhu, Minghua | East China Normal University |
Qiu, Yunzhou | Shanghai Institute of Microsystem and Information Technology , C |
Mao, Hongyan | East China Normal University |
Keywords: Cloud, IoT, and Robotics Integration, Neural Networks and their Applications, Application of Artificial Intelligence
Abstract: An effective task offloading strategy in mobile edge computing enables terminals to migrate their tasks to the edge server, accelerating the execution of terminal tasks. However, most research on task offloading is limited to a single edge server, while in practice it is difficult for a single edge server to support joint offloading requests from multiple terminals. Edge gateways can be flexibly deployed around terminal devices to further reduce the computing load of edge servers. Therefore, we study the task offloading problem in the multi-gateway-assisted mobile edge computing scenario. Constrained by discrete environmental variables, the offloading process jointly optimizes user scheduling, task offloading rate, and gateway resource allocation, with evaluation indexes defined by the average task delay and energy consumption. Aiming at minimizing the long-term cost of the whole system, we design a deep reinforcement learning algorithm that dynamically adjusts offloading strategies and allocates resources using only partial state information. The simulation results demonstrate that the algorithm can obtain the optimal computation offloading policy in an uncontrollable dynamic environment. Compared with four benchmark algorithms, it achieves better system cost performance and converges quickly to the optimum. Meanwhile, to ensure relative load balance across multiple gateways, we design a low-complexity balanced offloading strategy among the gateways and verify its performance.
|
|
14:20-14:40, Paper We-PS3-T4.5 | Add to My Program |
Fusional Modality and Distribution Alignment Learning for Visible-Infrared Person Re-Identification |
|
Sun, Yuxiang | Guangzhou University |
Qi, Ke | Guangzhou University |
Chen, Wenbin | Guangzhou University |
Xiong, Wei | Guangzhou University |
Peiyue, Li | Guangzhou University |
Zhuxian, Liu | Guangzhou University |
Keywords: Machine Vision, Deep Learning, Application of Artificial Intelligence
Abstract: The Visible-Infrared Person Re-Identification (VI-ReID) task aims to retrieve pedestrian images with the same labels across different modalities. VI-ReID is a very challenging task due to the huge intra-modality variation and cross-modality gap. Existing methods are mainly based on feature alignment to mitigate the modality gap. However, using only feature-level constraints does not mitigate the cross-modality gap well. We propose a fusional modality and distribution alignment learning network (FMADALNet) to mitigate the modality gap and align the modality distributions in order to learn modality-shared feature representations. FMADALNet contains a lightweight fusional modality generation module (FMGM). FMGM constructs a fusional modality that incorporates heterogeneous image features and contains only modality-shared information to mitigate the modality gap at the pixel level. In addition, to mitigate the differences in the distributions of the different modalities, we design a Hetero-center Maximum Mean Discrepancy loss (Hc-MMD), which reduces these distribution differences in an explicit manner. Extensive experimental results on two public datasets show that our proposed method achieves impressive performance compared to state-of-the-art methods.
|
|
We-PS3-T5 Regular Session, TAURUS |
Video Processing and Applications |
|
|
Co-Chair: Kremen, Vaclav | Mayo Clinic |
|
13:00-13:20, Paper We-PS3-T5.1 | Add to My Program |
SCSE-E2VID: Improved Event-Based Video Reconstruction with an Event Camera |
|
Lu, Yue | National University of Defense Technology |
Shi, Dianxi | National Innovation Institute of Defense Technology |
Li, Ruihao | Defense Innovation Institute |
Zhang, Yi | National Innovation Institute of Defense Technology |
Jing, Luoxi | National University of Defense Technology |
Yang, Shaowu | National University of Defense Technology |
Keywords: Application of Artificial Intelligence, Machine Vision, Deep Learning
Abstract: The recently emerging event camera has grown into a new type of sensor in the realm of vision, with benefits such as low power consumption, high dynamic range (HDR), microsecond time resolution, and no motion blur. While event cameras offer numerous advantages over conventional cameras, they only capture changes in intensity and give up many environmental details. This paper proposes an end-to-end UNet network called SCSE-E2VID to synthesize gray images from asynchronous events. We design an event fusion block to feed more related events to the encoder, allowing the network to extract more valuable features. The well-known attention module called Spatial and Channel 'Squeeze & Excitation' Block (SCSE) is utilized to remove artifacts and better extract spatiotemporal features for the decoder. In addition, we add parallel convolutions in the upsampling block and refine the output features, which supplements content in reduced channels. To evaluate the performance of our proposed SCSE-E2VID, we carry out quantitative and qualitative comparisons on the public IJRR and HQF datasets. The results show that our method achieves better performance in terms of perceptual similarity and structural similarity when compared with state-of-the-art methods and demonstrates comparable performance in terms of squared error.
|
|
13:20-13:40, Paper We-PS3-T5.2 | Add to My Program |
OTPose: Occlusion-Aware Transformer for Pose Estimation in Sparsely-Labeled Videos |
|
Jin, Kyung Min | Korea University |
Lee, Gun-Hee | Korea University |
Lee, Seong-Whan | Korea University |
Keywords: Image Processing and Pattern Recognition, Deep Learning, Neural Networks and their Applications
Abstract: Although many approaches for multi-human pose estimation in videos have shown profound results, they require densely annotated data, which entails excessive manual labor. Furthermore, occlusion and motion blur inevitably lead to poor estimation performance. To address these problems, we propose a method that leverages an attention mask for occluded joints and encodes temporal dependency between frames using transformers. First, our framework composes different combinations of sparsely annotated frames that denote the track of the overall joint movement. We propose an occlusion attention mask from these combinations that enables encoding occlusion-aware heatmaps as a semi-supervised task. Second, the proposed temporal encoder employs a transformer architecture to effectively aggregate the temporal relationship and keypoint-wise attention from each time step and accurately refine the target frame's final pose estimation. We achieve state-of-the-art pose estimation results on the PoseTrack2017 and PoseTrack2018 datasets and demonstrate the robustness of our approach to occlusion and motion blur in sparsely annotated video data.
|
|
13:40-14:00, Paper We-PS3-T5.3 | Add to My Program |
Learning Temporal Context of Normality for Unsupervised Anomaly Detection in Videos |
|
Hyun, Wooyeol | Korea University |
Nam, Woo Jeoung | Korea University |
Lee, Jooyeon | Korea University |
Lee, Seong-Whan | Korea University |
Keywords: Deep Learning, Machine Vision, Neural Networks and their Applications
Abstract: Incomplete reconstruction of abnormal samples using convolutional autoencoders trained only on normal samples has been the key principle of anomaly detection. Such detection mechanisms utilize reconstruction error differences between normal and abnormal frames. These differences are not consistent, however, which makes normal and abnormal samples indistinguishable. To handle this problem, we propose a shuffle-and-sort strategy for learning the temporal context of normality. The purpose of the strategy is to reconstruct shuffled input frames into an output with the correct order using a self-attention mechanism. Consequently, the proposed method can model the temporal context of normal events, which prevents the convolutional layers from successfully reconstructing anomalies. We demonstrate the detection efficiency of the proposed method using public benchmark datasets: UCSD Pedestrian 2, CUHK Avenue, and ShanghaiTech Campus.
|
|
14:00-14:20, Paper We-PS3-T5.4 | Add to My Program |
Character Animation and Retargeting from Video Streams |
|
Chen, Gong-Bin | Shenzhen Institutes of Advanced Technology, Chinese Academy of Sc |
Wang, Lei | Shenzhen Institute of Advanced Technology, Chinese Academy of Sc |
Liu, Xun-Yu | Shenzhen University |
Hu, Long-Hua | Shenzhen University |
Cheng, Jun | Shenzhen Institute of Advanced Technology, Chinese Academy of Sc |
Keywords: Neural Networks and their Applications, Machine Vision, Image Processing and Pattern Recognition
Abstract: Virtual character animation is widely used in 3D games and virtual reality. Traditional character animation can be achieved through key-frame animation or motion capture technology. These methods have limited applications due to expensive equipment or sophisticated operations. Aiming at a lower-cost solution, in this paper we propose a method for virtual character animation and retargeting from RGB video streams based on human pose reconstruction. We conduct extensive experiments with different videos and virtual characters, and the resulting character animation is well represented in the virtual scene. The proposed method greatly reduces the production cost of character animation and has potential applications in virtual reality.
|
|
14:20-14:40, Paper We-PS3-T5.5 | Add to My Program |
Temporal-Aware Mechanism with Bidirectional Complementarity for Video Q&A |
|
Luo, Yuanmao | Sun Yat-Sen University |
Wang, Ruomei | Sun Yat-Sen University |
Zhang, Fuwei | Sun Yat-Sen University |
Zhou, Fan | Sun Yat-Sen University |
Lin, Shujin | Sun Yat-Sen University |
Keywords: Application of Artificial Intelligence, Multimedia Computation, Deep Learning
Abstract: Video question answering (Video Q&A) is a challenging task as it requires a sufficient understanding of the video and question information. A video is composed of a frame sequence, which contains multi-scale temporal relationships and corresponding contextual information. A model that competently tackles the Video Q&A task needs to be able to: 1) construct long-term and neighborhood dependencies in frame sequences to extract global and local contextual features that can reflect multi-scale temporal dependencies, and deduce the temporal-aware refined features, and 2) identify static and dynamic features from pertinent moments of a video, while filtering away question-irrelevant dependencies of feature sequences, to yield the most precise and reasonable temporal-aware overall contextual features. In response to the above requirements, we propose a novel Video Q&A mechanism which consists of a Bidirectional Complementary Attention (BCA) module and an Adaptive Temporal-aware (ATA) module. The bidirectional complementary attention module stacks a multi-head self-attention layer and a convolutional layer in different orders to design two kinds of attention units, which are able to perform bidirectional multi-step reasoning based on complete global information and accurate local information to obtain temporal-aware refined features. The adaptive temporal-aware module is used to filter away question-irrelevant dependencies in the feature sequence to yield the most precise and reasonable temporal-aware overall contextual features. Comprehensive comparative experiments are conducted on publicly available benchmark datasets. An extended ablation study is further conducted to show the usefulness of each module of the solution in acquiring its computational Q&A capabilities.
|
|
We-PS3-T6 Regular Session, LEO |
Learning from Streaming Data - Advances and Challenges |
|
|
Chair: Basterrech, Sebastian | VSB-Technical University of Ostrava, Ostrava, Czech Republic |
Co-Chair: Barddal, Jean Paul | Pontificia Universidade Catolica Do Parana |
|
13:00-13:20, Paper We-PS3-T6.1 | Add to My Program |
Tracking Changes Using Kullback-Leibler Divergence for the Continual Learning (I) |
|
Basterrech, Sebastian | VSB-Technical University of Ostrava, Ostrava, Czech Republic |
Wozniak, Michal | Wroclaw University of Science and Technology |
Keywords: Expert and Knowledge-Based Systems, Knowledge Acquisition, Representation Learning
Abstract: Recently, continual learning has received a lot of attention. One of the significant problems is the occurrence of concept drift, which consists of changes in the probabilistic characteristics of the incoming data. In the case of the classification task, this phenomenon destabilizes the model's performance and negatively affects the achieved prediction quality. Most current methods apply statistical learning and similarity analysis over the raw data. However, similarity analysis in streaming data remains a complex problem due to time limitations, imprecise values, the need for fast decisions, scalability, etc. This article introduces a novel method for monitoring changes in the probabilistic distribution of multi-dimensional data streams. As a measure of the rapidity of changes, we analyze the popular Kullback-Leibler divergence. During the experimental study, we show how to use this metric to predict the concept drift occurrence and understand its nature. The obtained results encourage further work on the proposed method and its application to real tasks where the prediction of the future appearance of concept drift plays a crucial role, such as predictive maintenance.
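To make the role of the Kullback-Leibler divergence in the abstract above concrete, here is a minimal sketch, assuming a one-dimensional stream compared window by window through shared-bin histograms. The function names, the bin count, and the smoothing constant `eps` are illustrative assumptions, and this is not the monitoring method proposed in the paper, which targets multi-dimensional streams.

```python
import math


def histogram(values, lo, hi, bins):
    """Count values into `bins` equal-width bins spanning [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        # Clamp so that v == hi falls into the last bin.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


def kl_divergence(p_counts, q_counts, eps=1e-9):
    """KL(P || Q) for two histograms; eps smoothing avoids log(0)."""
    p_total = sum(p_counts) + eps * len(p_counts)
    q_total = sum(q_counts) + eps * len(q_counts)
    kl = 0.0
    for pc, qc in zip(p_counts, q_counts):
        p = (pc + eps) / p_total
        q = (qc + eps) / q_total
        kl += p * math.log(p / q)
    return kl


def window_kl(reference, current, bins=10):
    """KL(current || reference) using bins shared by both windows."""
    lo = min(min(reference), min(current))
    hi = max(max(reference), max(current))
    if hi == lo:
        return 0.0  # both windows hold one identical constant value
    p = histogram(current, lo, hi, bins)
    q = histogram(reference, lo, hi, bins)
    return kl_divergence(p, q)
```

A value of `window_kl` that stays near zero suggests the current window resembles the reference window, while a sustained rise is one possible signal of drift; any alarm threshold would have to be chosen for the application.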
|
|
13:20-13:40, Paper We-PS3-T6.2 | Add to My Program |
Improving Data Stream Classification Using Incremental Yeo-Johnson Power Transformation |
|
Tieppo, Eduardo | Pontifícia Universidade Católica Do Paraná (PUCPR) |
Barddal, Jean Paul | Pontificia Universidade Catolica Do Parana |
Nievola, Julio | Pontifical Catholic University of Parana (PUCPR) |
Keywords: Machine Learning
Abstract: Data transformation plays an essential role as a preprocessing step in learning models. Several classification techniques make assumptions about the underlying data distribution, such as the normal distribution assumed in Bayesian classifiers. However, applying data transformation in a streaming setting requires processing an infinite and continuous flow of data. In this paper, we propose the Incremental Yeo-Johnson Power Transformation, a variant of the well-known batch Yeo-Johnson transformation that is tailored for streaming settings, i.e., it supports streaming data via statistical sampling and hypothesis testing. Experimental results show that our proposal achieves the same data normality as its batch counterpart. In addition, it improves the prediction performance of a data stream classifier based on Bayesian statistical models. Overall, the learning models improved by 3 percentage points.
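For reference, the batch Yeo-Johnson transformation that the abstract above builds on can be sketched for a single value as follows. This shows only the standard four-case formula for a given parameter `lam`; the incremental variant proposed in the paper (sampling and hypothesis testing) and the estimation of `lam` are not shown, and the tolerance used to detect the special cases is an assumption.

```python
import math


def yeo_johnson(y, lam):
    """Standard Yeo-Johnson power transformation of one value y."""
    if y >= 0:
        if abs(lam) < 1e-12:
            # lam == 0 and y >= 0: log(y + 1)
            return math.log1p(y)
        # lam != 0 and y >= 0: ((y + 1)^lam - 1) / lam
        return ((y + 1.0) ** lam - 1.0) / lam
    if abs(lam - 2.0) < 1e-12:
        # lam == 2 and y < 0: -log(-y + 1)
        return -math.log1p(-y)
    # lam != 2 and y < 0: -((-y + 1)^(2 - lam) - 1) / (2 - lam)
    return -(((-y + 1.0) ** (2.0 - lam)) - 1.0) / (2.0 - lam)
```

With `lam` equal to 1 the transformation leaves values unchanged, which makes a convenient sanity check before applying it to a window of stream data.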
|
|
13:40-14:00, Paper We-PS3-T6.3 | Add to My Program |
Feature Importance Identification for Time Series Classifiers |
|
Meng, Han | University of Nottingham |
Wagner, Christian | University of Nottingham |
Triguero, Isaac | University of Nottingham |
Keywords: Machine Learning, Deep Learning, Application of Artificial Intelligence
Abstract: Time series classification is a challenging research area where machine learning techniques such as deep learning perform well, yet lack interpretability. Identifying the most important features for such classifiers provides a pathway to improving their interpretability. Several Feature Importance (FI) identification methods remove the contributions of features, i.e. observations at certain time steps, from the input and evaluate the change in the classification result to measure the importance of features. As time series features cannot simply be deleted, current techniques generally rely on replacing features with constant or random values. While effective, this approach risks unexpected results in the classification and thus in the feature importance estimation, as the replacements used may be different to what the classifier encountered in the training phase. This is referred to as the Out-Of-Distribution (OOD) problem. The OOD problem has been recognised in image and language models but has not received much attention in the context of time series classification. This work addresses the OOD problem in FI identification for time series classifiers. Specifically, we propose a method based on a Conditional Variational Autoencoder to generate possible sets of within-distribution inputs, which are used to evaluate feature importance through marginalisation. Experiments on publicly accessible datasets show that the method identifies the most important features with higher accuracy than existing methods, providing the basis for improved explainability of time series classifiers.
|
|
14:00-14:20, Paper We-PS3-T6.4 | Add to My Program |
Time-Series Forecasting with Shape Attention |
|
Huang, Feihu | Sichuan University |
Yi, Peiyu | Sichuan University |
Wang, Jince | Sichuan University |
Li, Mengshi | Sichuan University |
Peng, Jian | Sichuan University |
Keywords: Neural Networks and their Applications, Deep Learning, Machine Learning
Abstract: The study of time series forecasting is significant and useful in a variety of scenarios. However, due to the high degree of randomness and the complex contextual factors, it remains a difficult challenge. While several works based on machine learning and deep neural networks have been proposed in recent years to address these challenges, most of them mine sequence features based on discrete points and overlook the fact that shape similarity plays an important role in inferring future values. In this paper, we propose a seq2seq model with Shape Attention and Dilated Convolution (SADC) to tackle this problem. SADC contains two important phases: (1) Embedding with multi-scale dilated convolution. We first define the shape as a set of discrete points in a fixed-length window. The features hidden in the shape are then learned using multiple dilated convolutions with different kernels. (2) Inferring with shape attention. During this phase, we first present the shape attention, which aims to provide support information for inferring future values by generating the embedding vector of each prediction window based on shape similarity. The PreNet network is then built to predict the values using the embedding vector for each prediction window. Experiments conducted on two datasets show that the SADC model outperforms the state-of-the-art models on time series forecasting.
|
|
14:20-14:40, Paper We-PS3-T6.5 | Add to My Program |
Improving Time Series Generation of GANs through Soft Dynamic Time Warping Loss |
|
Yu, Xiaozhuo | University of Waterloo |
Karray, Fakhreddine | University of Waterloo |
Keywords: Machine Learning, Deep Learning, Application of Artificial Intelligence
Abstract: Generative Adversarial Networks (GANs) are increasingly popular for generating synthetic data, and time series are no exception to this trend. In this work, we propose two novel loss functions, sDTW-p and sDTW-m, based on Soft-Dynamic Time Warping, that can be used to improve the generated time series without modifications to the existing architecture. We also present the first evaluation of the generated samples across different sequence lengths. Lastly, we show empirically that leveraging our loss functions can lead to a 9% improvement according to our metric.
|
|
We-PS3-T7 Regular Session, VIRGO |
Perception, Control and Optimization for Complex Systems |
|
|
Chair: Miletić, Mladen | Faculty of Transport and Traffic Sciences, University of Zagreb |
|
13:00-13:20, Paper We-PS3-T7.1 | Add to My Program |
Impact of Connected Vehicles on Learning Based Adaptive Traffic Control Systems (I) |
|
Miletić, Mladen | Faculty of Transport and Traffic Sciences, University of Zagreb |
Čakija, Dino | Faculty of Transport and Traffic Sciences |
Vrbanić, Filip | Faculty of Transport and Traffic Sciences, University of Zagreb |
Ivanjko, Edouard | Faculty of Transport and Traffic Sciences, University of Zagreb |
Keywords: Agent-Based Modeling, Application of Artificial Intelligence, Machine Learning
Abstract: Adaptive Traffic Signal Control (ATSC) systems can be implemented to reduce travel times at urban intersections by changing the signal program according to real-time traffic situations. Modern approaches to ATSC are based on Reinforcement Learning (RL), which can allow the controller to learn the control policy independently. By including the concept of Connected Vehicles (CVs), the RL-based ATSC system can use data gathered from CVs instead of traditional traffic sensors. In this paper, an RL-based ATSC system is implemented and the impact of a varying CV penetration rate on it is evaluated in a simulated environment. The obtained results show that with a sufficient CV penetration rate, RL-based ATSC systems can significantly reduce the delay of all vehicles in the traffic network.
|
|
13:20-13:40, Paper We-PS3-T7.2 | Add to My Program |
Implementation of the Grasshopper Optimisation Algorithm to Optimize Prediction and Control Horizons in Model Predictive Control-Based Motion Cueing Algorithm |
|
Al-serri, Sari | Deakin University |
Asadi, Houshyar | Deakin University |
Qazani, Mohammad Reza Chalak | Deakin University |
Al-ashmori, Mohammed | Deakin University |
Mohamed, Shady | Senior Research Fellow, Deakin University |
Arogbonlo, Adetokunbo | Deakin University |
Alsanwy, Shehab | Deakin University |
Abu Alqumsan, Ahmad | Deakin University |
Lim, Chee Peng | Deakin University |
Nahavandi, Saeid | Deakin University |
Keywords: Heuristic Algorithms, Swarm Intelligence
Abstract: Advances in utilising motion simulators for skill training and related applications have yielded numerous benefits, such as safety, availability, serviceability, environmental friendliness, and economic savings. To give simulator users a realistic sense of driving, an accurate motion cueing algorithm (MCA) is essential, in order to respect the simulator platform limitations and avoid motion sickness. The use of Model Predictive Control (MPC) in MCA designs makes it possible to respect the constraints and consider the future dynamic behaviors of the simulator. However, the tuning process of the MPC prediction horizon and control horizon still needs to be improved. These horizons are normally selected manually by the designer. Previous studies on meta-heuristic algorithms produce a large prediction horizon with a heavy computational load, or a small prediction horizon that sacrifices stability and accuracy of the simulator system. In this study, the Grasshopper Optimization Algorithm (GOA) is adopted to yield optimal prediction and control horizons in MPC-based MCA models. The results are compared with those from the Butterfly Optimization Algorithm (BOA) and Genetic Algorithm (GA) in terms of sensation error and computation time. The GOA technique achieves the fastest processing time in finding proper MPC horizons. It does not affect the simulator's efficiency in utilising the workspace, as evidenced by the correlation coefficient and root mean square error between the sensation from a real-world vehicle and that from the simulator.
|
|
13:40-14:00, Paper We-PS3-T7.3 | Add to My Program |
METRIC: Toward a Drone-Based Cyber-Physical Traffic Management System |
|
Ma, Xiaoliang | KTH Royal Inst of Tech |
Liang, Xinyu | KTH Royal Institute of Technology |
Ning, Mang | KTH Royal Institute of Technology |
Radu, Andrei | KTH Royal Institute of Technology |
Keywords: Image Processing and Pattern Recognition, Application of Artificial Intelligence, Neural Networks and their Applications
Abstract: Drone-based systems have great potential for traffic monitoring and other advanced applications in Intelligent Transport Systems (ITS). This paper introduces our latest efforts to digitalise road traffic using various types of sensing systems, among which visual detection by drones provides a promising technical solution. A platform called METRIC is under development to carry out real-time traffic measurement and prediction using drone-based data collection. The current system is designed as a cyber-physical system (CPS) with essential functions for visual traffic detection and analysis, real-time traffic estimation and prediction, and decision support based on simulation. In addition to the computer vision functions developed in the earlier stage, this paper also presents the CPS architecture and the current implementation of the drone front-end system and a simulation-based system used for further drone operations.
|
|
14:00-14:20, Paper We-PS3-T7.4 | Add to My Program |
Traffic Flow Prediction Based on Federated Learning with Joint PCA Compression and Bayesian Optimization |
|
Zang, Lu | Harbin Institute of Technology |
Qin, Yang | Harbin Institute of Technology (Shenzhen) |
Li, Ruonan | Harbin Institute of Technology, Shenzhen |
Keywords: Deep Learning, Neural Networks and their Applications, Machine Learning
Abstract: Traffic flow prediction (TFP) is of great significance for traffic congestion mitigation in the Internet of Vehicles (IoV). To balance data privacy protection and accurate prediction, we introduce a training paradigm based on Federated Learning (FL). However, the implementation of federated learning in practice is confronted with high communication cost and data heterogeneity. In this paper, Principal Component Analysis (PCA) is introduced to minimize the scale of data transmission on both the client and the server. Because errors arise from the compression and reversion of the transmitted model, we add an additional error term to the local objective function. To address the imbalanced data distribution and to accelerate the convergence of federated learning, we then propose a mechanism that incorporates Bayesian optimization to dynamically determine the weights of clients during aggregation. Extensive experiments on real data demonstrate that communication costs can be reduced by 60-70% while ensuring fewer errors.
|
|
14:20-14:40, Paper We-PS3-T7.5 | Add to My Program |
An Improved Advantage Actor-Critic Algorithm for Disassembly Line Balancing Problems Considering Tools Deterioration Rate (I) |
|
Cai, Weibiao | Liaoning Petrochemical University |
Guo, Xiwang | Liaoning Petrochemical University |
Wang, Jiacun | Monmouth University |
Qin, Shujin | Shangqiu Normal University |
Zhao, Jian | University of Science and Technology Liaoning |
Tan, Yuanyuan | Shenyang University of Technology |
Keywords: Computational Intelligence, Evolutionary Computation, Heuristic Algorithms
Abstract: With more and more waste products being discarded, how to recycle them has become an urgent issue. Disassembling these discarded products is a critical step. With disassembly, we can maximize resource utilization and greatly reduce manufacturing costs. There are many influencing factors in a disassembly process. In this paper, we consider the impact of the deterioration rate of disassembly tools on disassembly time and establish a mathematical model to minimize the disassembly time. This paper uses the advantage actor-critic algorithm in reinforcement learning to solve this model. The correctness and superiority of the algorithm are verified by comparison with the actor-critic algorithm.
|
|
We-PS3-T8 Regular Session, QUADRANT |
AI for Human Performance Monitoring, Collaborative Technologies, and Applications |
|
|
Chair: Kiam, Jane Jean | Universität Der Bundeswehr München |
Co-Chair: Algumaei, Mohammed | Deakin University |
|
13:00-13:20, Paper We-PS3-T8.1 |
Learning Decision-Making Patterns in the Context of Manned-Unmanned Teaming |
|
Kiam, Jane Jean | Universität Der Bundeswehr München |
Fröhlich, Lukas | Universität Der Bundeswehr München |
Schulte, Axel | Bundeswehr University Munich |
Keywords: Human-Machine Cooperation and Systems, Team Performance and Training Systems
Abstract: Manned-Unmanned Teaming (MUM-T) is an ensemble of manned and unmanned vehicles operating as a team to achieve the same set of goals. Such teaming is highly beneficial, notably in overcoming the limits of the direct communication link to the unmanned vehicles, as well as in enhancing the capabilities of a manned vehicle by leveraging multiple accompanying unmanned vehicles. However, this can at times overstrain the command and control capacity of the human operator(s) on board the manned vehicle(s), unless the unmanned vehicles possess some "understanding" of the human operators' decision-making behaviors, in which case they can act like real "team players" to proactively support the manned vehicle instead of waiting passively for successive commands. In this study, we investigate the possibility of learning the decision-making behaviors of a human operator on board a manned vehicle in charge of commanding multiple accompanying unmanned vehicles. We base our investigation on a rescue mission involving a manned helicopter and several unmanned aerial vehicles to collect training and validation data using software-in-the-loop simulations. By extracting meaningful features and clustering these features on the training dataset, we validate and analyze the learned pattern of commanding the unmanned vehicles.
|
|
13:20-13:40, Paper We-PS3-T8.2 |
Physiological Compliance During a Three Member Collaborative Computer Task (I) |
|
Algumaei, Mohammed | Deakin University |
Hettiarachchi, Imali Thanuja | Deakin University |
Veerabhadrappa, Rakesh | Deakin University |
Bhatti, Asim | Deakin University |
Keywords: Human Factors, Human Performance Modeling
Abstract: Measuring team performance and physiological compliance (PC) within a team has gained interest in the last few decades. A team's performance or functioning is governed by attributes such as collaboration, coordination, attitudes and motivation. Team emergent states such as situational awareness, trust, emotions and mutual understanding influence the attributes of teams. This study examines the relationship between PC and collaboration. It also investigates how the cognitive state influences this relationship. Seventeen teams, each with three members, participated in a collaborative simulated task, while their electrocardiogram (ECG) activity was recorded via a chest strap device. Short-term time-domain measures of heart rate variability (HRV) were derived for each participant. PC was established using the mean of the cross-correlation (CC) between dyads within a team. A linear regression model was employed to examine the relationship between PC and self-reported measures of team collaboration. This in turn will be applied to investigate how the cognitive state of team members influences that relationship. The results show a statistically significant (p<0.05) positive relationship between PC and collaboration. In conclusion, PC has the potential to be an objective method to quantify teamwork effectiveness where it can be assessed via self-reported collaboration.
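Illustration only (not the authors' code): the PC measure described in the abstract above, the mean of the cross-correlation between dyads within a team, could be sketched in Python as follows. The sketch assumes zero-lag Pearson correlation as the cross-correlation, which the abstract does not specify; the HRV values and the function names `pearson` and `team_pc` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Zero-lag normalized cross-correlation (Pearson r) of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def team_pc(series_by_member):
    """Mean of the pairwise correlations over all dyads in a team."""
    pairs = list(combinations(series_by_member.values(), 2))
    return sum(pearson(x, y) for x, y in pairs) / len(pairs)


# Made-up HRV series for a three-member team (three dyads: A-B, A-C, B-C).
team = {
    "A": [0.1, 0.2, 0.3, 0.4],
    "B": [0.2, 0.4, 0.6, 0.8],
    "C": [0.4, 0.3, 0.2, 0.1],
}
pc = team_pc(team)
```

In this toy team, A and B move together while C moves against both, so one dyad correlates perfectly and two anti-correlate, giving a negative team-level PC.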
|
|
13:40-14:00, Paper We-PS3-T8.3 |
The Role of Wannabes in the Digital Nomad Ecosystem in Times of Pandemic (I) |
|
de Almeida, Marcos Antonio | UFRJ |
Correia, António | UTAD / INESC TEC / University of Kent |
Souza, Jano | Federal University of Rio De Janeiro |
Schneider, Daniel | UFRJ |
Keywords: Human Factors, Human-Computer Interaction, Intelligence Interaction
Abstract: In this paper, we report new findings from an empirical study that investigates how the activities of so-called “wannabe” digital nomads contribute to the sustainability of the digital nomad ecosystem. To do this, we collected textual data from posts in a Reddit community. We argue that one way to understand how to design technical solutions for the digital nomad ecosystem is to understand how digital nomads are being impacted in their Personal Knowledge Management Ecology practices and routines, and how they see the future of their technology-mediated work-life space. Finally, we show how evidence collected from digital nomads about the wannabe/how-to-be-digital-nomad symbiotic ecosystem can inform researchers worldwide about future design-oriented strands.
|
|
14:00-14:20, Paper We-PS3-T8.4 |
A Study on the Driver-Vehicle Interaction System in Autonomous Vehicle Considering Driver’s Attention Status (I) |
|
Song, Yein | Seoul National University |
Kim, Soo Yeon | Seoul National University |
Lee, Joong Hee | Seoul National University |
Yun, Myung Hwan | Seoul National University |
Nam, Chang | North Carolina State University |
Keywords: Human-Computer Interaction, Human-Machine Interface, Assistive Technology
Abstract: Before fully autonomous driving technology is developed, drivers are not free from the responsibility of a Take-Over Request (TOR). Even with level 5 automation, a driver can still play a role as the final decision maker for various driving and non-driving functions, even though most of the tasks will be conducted by the vehicle. In this regard, the driver’s performance in reacting to requests from the car is crucial throughout autonomous driving, and it is highly dependent on the driver’s attention. It is important to understand the state of the driver during autonomous driving and utilize this information in the driver-vehicle interaction system to enhance safety. Accordingly, it is important to extract features from information about the driver and the surrounding environment in order to increase the accuracy of the model that predicts the driver's state. This paper aims to examine a method of predicting attentional status based on the driver's emotional state and to explore the possibility of applying brain-computer interaction (BCI) with consideration of the driver’s type to increase accuracy. As a result, a prediction model for attentional state with emotional information was developed, and the driver’s characteristics and types were investigated. Based on the results, a driver-vehicle interaction system was proposed for the context of future autonomous vehicles. Although the prediction model for attentional status was not powerful, it can be improved by considering more critical features such as the driver’s characteristics and types. In addition, it can be used as a constraint for effective interaction when a driver uses the BCI system to deliver his/her decision to a vehicle.
|
|
14:20-14:40, Paper We-PS3-T8.5 |
Image Saliency Prediction in Novel Production Scenarios (I) |
|
Zhou, Hailing | Accenture |
Doggett, Erika | Walt Disney Pictures |
Qi, Keyu | Accenture |
Tang, Binghao | Accenture |
Wolak, Anna | Walt Disney Studio Technology |
Nahavandi, Saeid | Deakin University |
Nguyen, David | Accenture |
Keywords: Human-centered Learning, Multimedia Systems, Entertainment Engineering
Abstract: Predicting image saliency has many potentially useful applications across several industries, including film production, creative marketing content, product design, and manufacturing quality control. Although image saliency prediction in general has made substantial progress on benchmark datasets with the advent of deep learning, it still has room to improve when applied to novel scenarios. This paper presents our experimental findings in dataset building and model development, where challenges in predicting saliency on images with large variations are addressed. To evaluate the proposed model properly, we conduct experiments on both benchmark datasets (i.e., SALICON, MIT1003, MIT300, CAT2000) and our own private dataset. The results demonstrate the superior performance of the proposed method. We highlight a film production use-case, but the model and methods explored here may also generalize to other areas relevant to image saliency.
|
|
We-PS3-T9 Regular Session, KEPLER |
Virtual and Augmented Reality Systems |
|
|
Co-Chair: Takazawa, Saki | Waseda University |
|
13:00-13:20, Paper We-PS3-T9.1 |
Virtual Reality Revolution: Strategies for Treating Mental and Emotional Disorders |
|
Nava, Eleonora | Norwegian University of Science and Technology, NTNU |
Jalote-Parmar, Ashis | Norwegian University of Science and Technology, NTNU |
Keywords: Virtual and Augmented Reality Systems, Interactive and Digital Media, Assistive Technology
Abstract: Virtual Reality (VR) is increasingly gaining recognition in healthcare, especially as a treatment tool for psychological interventions. This paper reviews current advances in immersive VR-based therapies to explore different strategies designed to treat mental and emotional disorders with Virtual Reality Therapy (VRT). The study contributes to the VR community by exemplifying the application of various psychological treatment strategies in designing VR therapies, such as Cognitive-Behavioral, Distraction, Perspective-Taking and Exposure. For higher adoption of VR by clinicians, greater quality control of these strategies and well-defined user experiences are required, followed by clinical validation.
|
|
13:20-13:40, Paper We-PS3-T9.2 |
Development of a Mixed Reality Rehabilitation System for Real-Life Environment in Stroke Patients with Unilateral Spatial Neglect |
|
Takazawa, Saki | Waseda University |
Yasuda, Kazuhiro | Waseda University |
Sabu, Rikushi | Waseda University |
Kawaguchi, Shuntaro | Sonodakai Rehabilitation Hospital |
Iwata, Hiroyasu | Waseda University |
Keywords: Virtual and Augmented Reality Systems, Wearable Computing
Abstract: Unilateral spatial neglect (USN) is a higher cognitive dysfunction that can occur after a stroke. It is defined as an impairment in finding, reporting, reacting to, and orienting toward stimuli presented on the side opposite to the damaged side of the brain. So far, we have developed an immersive virtual reality system that releases the attention biased to the non-neglected side and guides it to the neglected side to improve the symptoms of neglect. However, although it improved neglect symptoms evaluated using a paper-and-pencil test on a desk (i.e., the Behavioral Inattention Test), it did not improve neglect in daily life evaluated using a standardized evaluation tool, the Catherine Bergego Scale (CBS), which is a rating index of neglect in actual daily life. Therefore, establishing a method for improving neglect in daily life is necessary. This study was designed to develop a platform for improving neglect in daily life as an initial investigation of rehabilitation methods for patients with USN. Three elements that should be included in rehabilitation to improve neglect were derived from previous studies: object search by moving around, reproduction of the complexity of daily life, and own body movement in the neglected side with visual confirmation. The patient moves around the room and visually explores and touches the objects presented in the mixed reality (MR) space. Using this system, we conducted a validity test on a patient with USN. The results showed that the patient improved from moderate to mild neglect in the CBS. Thus, the results suggest the applicability of the proposed MR system for improving neglect in patients with USN in a daily life environment.
|
|
13:40-14:00, Paper We-PS3-T9.3 |
Motion Arc Analysis in Virtual Reality Environment |
|
Qu, Chenxin | Beijing Jiaotong University |
Chen, Ruiling | Beihang University |
Che, Xiaoping | Beijing Jiaotong University |
Keywords: Virtual and Augmented Reality Systems, Human-Computer Interaction, Interactive Design Science and Engineering
Abstract: Despite the popularity of VR, body interaction in VR has not received enough attention, and interaction designs that violate the laws of human body movement are common. Therefore, we study action interaction in virtual reality through the motion arc, an important parameter of human action, to identify the factors that affect the user's body interaction in virtual reality. A within-subject experiment (n=18) was conducted, in which all participants played three VR games and responded to a post-game questionnaire. From video recordings of their front and right side, 3D skeleton models were reconstructed using OpenPose and the conversion relationship between the four coordinate systems in computer vision. After operations such as skeleton standardization and motion segmentation, clustering algorithms are used to cluster similar users, and Spearman’s Rank Correlation Coefficient is used to study the influence of user characteristics on motion arc. Our results indicate that in an unconstrained game, the tutorial increases the motion arc, while in a constrained game, the change of motion arc is more complex. We also find that the participants’ learning of instructions has the greatest impact on the average motion arc, followed by individual factors (including age, gender, etc.) and sports experience.
|
|
14:00-14:20, Paper We-PS3-T9.4 |
Spatial Augmented Respiratory Cardiofeedback Design for Prosthetic Embodiment Training: A Pilot Study |
|
Salatino, Laura | Istituto Italiano Di Tecnologia |
Demarzi, Giorgio | Università Degli Studi Di Genova |
de Zambotti, Massimiliano | SRI International |
Deshpande, Nikhil | Istituto Italiano Di Tecnologia |
Berta, Riccardo | Università Degli Studi Di Genova |
Boccardo, Nicolò | Istituto Italiano Di Tecnologia |
Freddolini, Marco | Istituto Italiano Di Tecnologia |
Laffranchi, Matteo | Istituto Italiano Di Tecnologia |
De Michieli, Lorenzo | Istituto Italiano Di Tecnologia |
Barresi, Giacinto | Istituto Italiano Di Tecnologia |
Keywords: Virtual and Augmented Reality Systems, Human-Computer Interaction, Human-Machine Interface
Abstract: Recent literature suggests that self-regulation techniques like biofeedback can be used to enhance the embodiment of artificial limbs. In this study, we developed and preliminarily tested an embodiment training protocol based on a Spatial Augmented Respiratory Cardiofeedback (SARC) implemented through a computer screen - visualizing a 3D model of a prosthetic hand (Hannes) - and a thoracic band for monitoring the Heart Rate Variability (HRV) of the users. The feedback was based on the respiratory-driven modulation of a composite index of the individuals’ cardiac autonomic state after an initial calibration based on slow breathing (at a rate perceived as "comfortable"). Alongside the assessment of the SARC use feasibility, this pilot study evaluates the virtual hand embodiment obtained in two task conditions. In both conditions, the virtual limb gradually appears when the cardiofeedback exercise is performed correctly. Otherwise, the virtual limb parts gradually disappear (“unstable” condition) or they remain visible (“cumulative” condition). In the latter case, the virtual hand maintains its “reality-based” stability, supporting the subject’s motivation. Ten volunteers without disabilities were presented both conditions on 10 trials each (2 min per trial). Their experience and their proprioceptive drift (estimating their real hand position as close to the artificial one) were assessed as measures of virtual prosthesis embodiment. The questionnaire results preliminarily highlight the feasibility of the SARC. Furthermore, a significantly stronger drift for the virtual prosthesis occurred in the cumulative condition, orienting further investigations.
|
|
14:20-14:40, Paper We-PS3-T9.5 |
UHRCS: A High-Throughput Platform for Real-Time Cameras-Sampling Based on UE4 |
|
Shi, Zhijiang | National University of Defense Technology |
Shuai, Ye | Academy of Military Science |
Zhu, Chengzhang | Academy of Military Science |
Zhu, Yuqi | Academy of Military Science |
Li, Hao | Academy of Military Science |
Xu, Xinhai | Academy of Military Science |
Keywords: Virtual and Augmented Reality Systems, Information Visualization, Multimedia Systems
Abstract: Nowadays, simulation environments are widely used in different research fields. As the complexity of the problems increases, Cameras-Sampling in a simulation environment becomes important since it can provide highly realistic training data for online machine learning, reinforcement learning, etc. This paper attempts to improve throughput and efficiency when multiple cameras are sampled simultaneously. We implement a high-throughput platform for Real-time Cameras-Sampling based on Unreal Engine 4. The platform uses multiple graphics queues to support sampling from multiple cameras in parallel. We construct a virtual desert scene to verify the correctness and effectiveness of the proposed platform. The experiments show that the proposed platform can generate eight 960x640-pixel pictures at a frequency of 30 Hz. The throughput and GPU utilization are increased by 2.57x and 1.43x, respectively, compared with Unreal Engine 4.
|
|
We-PS3-T10 Regular Session, TYCHO |
Recommendation Systems |
|
|
Chair: Buarque de Lima Neto, Fernando | University of Pernambuco |
Co-Chair: Saptono, Ristu | Kyushu University |
|
13:00-13:20, Paper We-PS3-T10.1 |
Recommending View Bundles in Data Marketplaces |
|
Buenos Aires de Carvalho, Thiego | University of Pernambuco |
Lima Martins, Denis Mayr | University of Muenster |
Buarque de Lima Neto, Fernando | University of Pernambuco |
Vossen, Gottfried | University of Muenster |
Keywords: Computational Intelligence, Evolutionary Computation, Intelligent Internet Systems
Abstract: For companies to have a competitive advantage, they need to extract relevant information from data and for that, they need to complement their own data with other data sources. Data marketplaces are platforms on which data providers and data consumers do business. However, every data interaction incurs a monetary cost. Therefore, data consumers are interested in buying a set of interesting views that fit their (goal and) budget. Views allow data to be represented visually, enabling users to make sense of patterns and insights. Besides allowing for easier cost control, buying a set of views bundled together increases the chance of finding what consumers want over buying them view-by-view. Selecting the suitable views to compose an interesting bundle is non-trivial, due to the vast number of view combinations that potentially meet the data consumer's needs. In this paper, we address the problem of view bundle recommendation in data marketplaces, in which the utility of a bundle depends on the interplay among candidate views. We propose the use of Self-Organizing Maps as a means to compute this interplay and use a Genetic Algorithm to design near-optimal bundles. Our empirical results demonstrate that our approach can effectively aid data consumers to find relevant view bundles under budget constraints.
|
|
13:20-13:40, Paper We-PS3-T10.2 |
Interpretable Educational Recommendation: An Open Framework Based on Bayesian Principal Component Analysis |
|
Yun, Yue | Northwestern Polytechnical University |
Dai, Huan | Northwestern Polytechnical University |
Zhang, Yupei | Northwestern Polytechnical University |
Wei, Shuangshuang | Northwestern Polytechnical University |
Shang, Xuequn | Northwestern Polytechnical University |
Keywords: Expert and Knowledge-Based Systems, Machine Learning, Application of Artificial Intelligence
Abstract: Recommendation in the educational environment aims to help learners meet their personalized demands efficiently. Unlike commodity recommendation, educational recommendation is constrained by the ethics of pedagogy and the high cost of bad recommendations, so the credibility and interpretability of an education recommendation system deserve particular attention alongside recommendation accuracy. However, few studies have focused on the interpretability of recommendations. Thus, this study proposes ORec4Int, an Open Recommendation framework for Interpretability based on Bayesian principal component analysis (PPCA). ORec4Int helps learners understand a recommendation by building a mapping between educational resources and the latent factors/features of learners. This interpretability enhances learners' trust in the education recommendation system. Finally, we not only evaluate the recommendation performance of ORec4Int on one real-world dataset but also compare its interpretability performance with that of the education expert solution. Results show that ORec4Int can approach the performance of education expert solutions. Ultimately, ORec4Int is faster, more efficient, and less costly.
|
|
13:40-14:00, Paper We-PS3-T10.3 |
Research on Social Recommendation Model Based on Enhanced Neighbor Perception |
|
Jia, Zihe | Qilu University of Technology (Shandong Academy of Sciences) |
Gao, Qian | Qilu University of Technology (Shandong Academy of Sciences) |
Fan, Jun | Business-Intelligence of Oriental Nations Corporation Ltd |
Keywords: Neural Networks and their Applications, Complex Network, Deep Learning
Abstract: In recent years, people have been increasingly influenced by online socialization. Graph Neural Networks (GNNs) have shown great advantages in mining implicit data and expressing node relationships. However, because neighbor relationships can be established arbitrarily, judging the reliability of trust relationships between neighbors is difficult. Therefore, this paper proposes an approach to obtain valid neighbor relationships based on real user-item interactions and friend trust relationships in the dataset. The proposed method addresses the impact of invalid neighbors on the social model and enhances neighbor perception. For neighbor interaction in the graph neural network model, this paper first establishes a direct connection between users and items through a multilayer perceptron mapping. It then integrates neighbor similarity and neighbor sampling to mitigate the interference of invalid neighbors and enhance the perceived information of neighbor interactions. Finally, this paper establishes an item social space and a user social space with enhanced neighbor perception according to their dyadic nature and integrates them to enrich the social data. In comparison experiments on two publicly available datasets, Epinions and Ciao, the proposed method performs better than other social recommendation models, improving MAE and RMSE by 0.81%-1.09% and 1.15%-1.41%, respectively.
|
|
14:00-14:20, Paper We-PS3-T10.4 |
Knowledge-Enhanced Graph Transformer Network for Multi-Behavior and Item-Knowledge Session-Based Recommendation |
|
Chai, Huihui | Qilu University of Technology (Shandong Academy of Sciences) |
Wei, Xiumei | Qilu University of Technology |
Ma, Haoxiang | Qilu University of Technology |
Jiang, Xuesong | Qilu University of Technology (Shandong Academy of Sciences) |
Keywords: Knowledge Acquisition, Deep Learning, Machine Learning
Abstract: Session-based recommendation, which aims to predict the next interaction item based on a given session, already plays an important role on platforms such as e-commerce and streaming media. Most current recommendation models only use the interaction sequence of the session to capture the potential conversion patterns between items, often ignoring the user’s multi-type interaction behavior, which reflects the user’s fine-grained preferences. At present, most models of multi-type interaction behaviors learn user-item multi-type interaction behaviors and item-item dependencies relatively independently, and ignore the problems of item cold start and data sparsity. These issues motivate us to propose a new model, MKGTN. We apply multi-type user-item interaction behaviors and item-item dependencies to session recommendation via an MLP. Simultaneously, we use a multi-task learning (MTL) paradigm in which learning knowledge embeddings serves as an auxiliary task to facilitate the main task of session recommendation (SR). Evaluations on three datasets show that MKGTN outperforms state-of-the-art multi-action interaction models, demonstrating the superiority of our model’s performance.
|
|
14:20-14:40, Paper We-PS3-T10.5 |
Best Approximate Distribution-Based Model for Helpful Vote of Customer Review Prediction |
|
Saptono, Ristu | Kyushu University |
Mine, Tsunenori | Kyushu University |
Keywords: Machine Learning, Application of Artificial Intelligence
Abstract: Product reviews are increasingly important for potential customers deciding on purchases in electronic commerce. The helpful vote is a critical indicator of how much impact a review has on other customers. Therefore, the prediction of helpful votes is an essential task. Linear and Tobit Regression are common methods for this prediction. Those methods share the same objective function and rest on the initial assumption that the helpful votes on any dataset follow a normal distribution. However, this assumption often does not hold, and the helpful votes often follow other distributions. Consequently, the prediction results might not be fully appropriate. This paper proposes a model that follows the best approximate distribution of helpful votes to predict the number of helpful votes. On top of that, considering the elapsed time since reviews were written, we propose an adaptive window size sampling method to evaluate the model on review datasets sorted chronologically. To validate the proposed model, we conducted extensive experiments on real-world datasets. Experimental results illustrate the validity of the proposed model.
|
|
We-KN4K Keynote Session, MERIDIAN |
Ilya Kolmanovsky: Perspectives, Challenges and Opportunities in Control of Safety Critical Systems for Increased Autonomy |
|
|
Chair: Strasser, Thomas | AIT Austrian Institute of Technology |
|
15:00-16:00, Paper We-KN4K.1 |
Perspectives, Challenges and Opportunities in Control of Safety Critical Systems for Increased Autonomy |
|
Kolmanovsky, Ilya V. | University of Michigan |
Keywords: System Modeling and Control, Cooperative Systems and Control, Modeling of Autonomous Systems
Abstract: Autonomous systems need to satisfy increasingly stringent requirements informed by regulations, expanding mission objectives, human-machine interactions, and safety considerations. In particular, constraints on state, output and control variables need to be enforced during system operation. The speaker will present perspectives on control of systems with constraints drawn from his experience with the relevant theory and applications, and he will highlight underlying challenges and opportunities. In particular, the development of add-on governor schemes to protect systems from constraint violation, the interplay between closed-loop stability, performance and computations in model predictive control, integration of constrained control and learning, and approaches to maximizing system operating life when safety critical constraint violation cannot be avoided will be discussed. The potential for applications to control of autonomous spacecraft, very flexible aircraft and automotive vehicles will be highlighted.
|
|
We-KN5K Keynote Session, MERIDIAN |
Tomas Mikolov: Complex Systems for AI |
|
|
Chair: Marik, Vladimir | Czech Tech |
|
16:00-17:00, Paper We-KN5K.1 |
Complex Systems for AI |
|
Mikolov, Tomas | CVUT |
Keywords: Application of Artificial Intelligence, Machine Learning, Computational Intelligence
Abstract: In this talk, I will describe some of our recent efforts to develop mathematical models which can spontaneously evolve and increase in complexity. We hope such models can be a basis for stronger AI models, which could possibly learn, adapt and develop in time without the need for supervision or even rewards. This would allow us to solve tasks which are currently too challenging for the mainstream machine learning algorithms, such as smart chatbots or other applications where learning on the fly without supervision is necessary.
|
| |