Abstract: Knowledge graphs can provide a rich resource for constructing question answering and recommendation systems. However, most knowledge graphs still suffer from incompleteness. The path-based approach predicts the unknown relation between a pair of entities from existing path facts and is one of the most promising approaches to knowledge graph completion. A critical challenge for such approaches is integrating path sequence information to support better reasoning. Existing research focuses more on the features between neighboring entities and relations in a path, ignoring the semantics of whole triples. A single path consists of entities and relations, but its triples contain valuable semantic information, and the importance of different triples on each path is disparate. To address these problems, we propose a convolutional network with hierarchical attention for knowledge graph completion. First, we use a convolutional network and a bidirectional long short-term memory to extract the features of each triple in the path. Then, we employ a novel hierarchical attention network, including triple-level attention and path-level attention, to pick up path features at multiple granularities. In addition, we elaborate a multistep reasoning component that interacts repeatedly with the hierarchical attention module to obtain more plausible inference evidence. Finally, we predict the relation between the query entities and provide the most dominant path to explain our answer. The experimental results show that our method outperforms existing approaches by 1–3% on four datasets. PubDate: 2023-07-01
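As an illustration of the two-level attention described above, a minimal sketch, assuming PyTorch; the module name HierarchicalPathAttention and the tensor shapes are hypothetical, not taken from the paper:

    import torch
    import torch.nn as nn

    class HierarchicalPathAttention(nn.Module):
        """Aggregate triple features into path features, then paths into a query vector."""
        def __init__(self, dim):
            super().__init__()
            self.triple_score = nn.Linear(dim, 1)   # triple-level attention
            self.path_score = nn.Linear(dim, 1)     # path-level attention

        def forward(self, triple_feats):
            # triple_feats: (num_paths, path_len, dim) features of each triple on each path
            a = torch.softmax(self.triple_score(triple_feats), dim=1)   # weight triples within a path
            path_feats = (a * triple_feats).sum(dim=1)                  # (num_paths, dim)
            b = torch.softmax(self.path_score(path_feats), dim=0)       # weight whole paths
            return (b * path_feats).sum(dim=0), b                       # query vector, path weights

    # usage: vec, path_w = HierarchicalPathAttention(64)(torch.randn(5, 4, 64))

The path weights b are what would let the model expose the most dominant path as an explanation.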
Abstract: Achieving interpretable embeddings of real networks has a significant impact on network analysis tasks. However, the majority of node embedding-based methods seldom consider the rationality and interpretability of node embeddings. Although graph attention network-based approaches have been employed to improve the interpretability of node embeddings, they only implicitly assign different weights to different nodes in a neighborhood. In this study, we present node embedding with a capsule generation-embedding network (CapsGE), a novel capsule network-based architecture that uses node density, defined from the uncertainty of a node's community membership, to explicitly assign different weights to different nodes in a neighborhood. In addition, the model applies the proposed cognitive reasoning mechanism to the weighted features to achieve rational and interpretable node embeddings. The performance of the method is assessed on the node classification task, and the experimental results demonstrate its advantages over other methods. PubDate: 2023-07-01
Abstract: Thermal images can help visual images improve object detection performance under low illumination. On the other hand, the complementary fusion of visual and thermal features can be challenging. In RGB-T object detection, the two-stream network structure has been widely used, in which addition and concatenation operations merge feature maps. However, addition compacts the two-stream features with inevitable distortion, while direct concatenation may introduce redundancy. In this paper, we show that addition is more suitable for features common to RGB and thermal, while concatenation is more suitable for features specific to RGB or thermal. We then take a divide-and-conquer strategy and propose an RGB-T detector named the Divide-and-Conquer Fusion Network (DaCFN), which divides RGB and thermal features into common and specific ones and applies category-customized operations to them. Specifically, we design the Partial Coupling Net Block (PCNB), in which common features are extracted by coupled parameters and specific features by independent ones. The Selective Common Addition (SCA) and the Independent Specific Concatenation (ISC) are then designed to fuse common and specific features, respectively. Experiments on the FLIR and KAIST datasets demonstrate that our approach achieves high accuracy at high speed compared with other state-of-the-art RGB-T detectors. PubDate: 2023-07-01
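A minimal sketch of the divide-and-conquer fusion idea, assuming PyTorch; the channel split into common and specific parts and the function name are illustrative, not the paper's exact PCNB/SCA/ISC definitions:

    import torch

    def divide_and_conquer_fuse(rgb_feat, thermal_feat, common_channels):
        # split each modality's feature map into a common part and a modality-specific part
        rgb_common, rgb_specific = rgb_feat[:, :common_channels], rgb_feat[:, common_channels:]
        th_common, th_specific = thermal_feat[:, :common_channels], thermal_feat[:, common_channels:]
        fused_common = rgb_common + th_common                       # addition for shared information
        fused_specific = torch.cat([rgb_specific, th_specific], 1)  # concatenation keeps modality-specific cues
        return torch.cat([fused_common, fused_specific], dim=1)

    # usage: f = divide_and_conquer_fuse(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32), 32)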
Abstract: A rule is an effective representation of knowledge in formal concept analysis (FCA) and can express the relations between concepts. One of the main research directions of FCA is to develop rule-based classification algorithms. However, rule-based algorithms in FCA lack effective methods for analyzing their generalization capability, which would provide an effective learning guarantee for the algorithm. To solve this problem and improve the classification performance of rule-based algorithms in terms of both speed and accuracy, this paper combines formal concept analysis with online learning theory to design an online rule fusion model based on FCA, named ORFM. First, the weak granular decision rule is proposed based on rule confidence. Second, each iteration aims to reduce the difference between the prediction rules extracted from ORFM and the weak granular decision rules as much as possible, so that the classifier is adjusted toward the minimum regret growth rate, which is 0 in the ideal state at the end of the iterations. Third, it is proven that the regret of ORFM has an upper bound; that is, in the ideal state, the regret growth rate decreases rapidly as the number of iterations increases, so the regret of the model eventually stops growing. This provides an effective learning guarantee for ORFM. Finally, experimental results on 16 datasets show that ORFM has better classification performance than other classifier models. PubDate: 2023-07-01
Abstract: Deep learning models have demonstrated excellent performance in fitting data and knowledge. For hyperspectral images, however, accurate classification remains difficult with limited samples and high-dimensional correlations. In this paper, we propose a collaboratively optimized parallel convolutional network consisting of 3D and 2D CNNs for hyperspectral image classification. One branch of the parallel network is a 3D CNN consisting of three blocks for extracting spectral features and spectral correlation: a 3D bottleneck block (convolution), an SE block (attention), and a spatial-spectral convolution module. Second, a diverse-region feature extraction network is employed as the spatial-spectral feature computing module. Finally, the classification predictions from the two branches are fused to obtain the classification results. Experimental results on three datasets show that the proposed method performs significantly better than the compared state-of-the-art methods and has better generalization capability. PubDate: 2023-07-01
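For reference, a minimal SE (squeeze-and-excitation) attention block of the kind mentioned above, written for 3D feature maps and assuming PyTorch; this is a generic SE block, not the paper's exact module:

    import torch
    import torch.nn as nn

    class SEBlock3D(nn.Module):
        """Channel attention: squeeze the spatial-spectral dims, excite per-channel weights."""
        def __init__(self, channels, reduction=8):
            super().__init__()
            self.fc = nn.Sequential(
                nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels), nn.Sigmoid(),
            )

        def forward(self, x):             # x: (batch, channels, depth, height, width)
            w = x.mean(dim=(2, 3, 4))     # squeeze: global average pooling
            w = self.fc(w)                # excite: per-channel weights in (0, 1)
            return x * w.view(x.size(0), x.size(1), 1, 1, 1)

    # usage: y = SEBlock3D(32)(torch.randn(2, 32, 8, 16, 16))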
Abstract: Haze severely degrades the definition of images captured in outdoor scenes. The goal of image dehazing is to restore clear images from hazy ones. This problem has been significantly advanced by deep neural networks, but the performance gains mainly depend on large-capacity models, which inevitably increase memory consumption and hinder deployment on mobile devices. In contrast, we propose an effective image dehazing method based on a multi-scale recursive network that does not simply stack deep neural networks to improve dehazing performance. The proposed network consists of both internal and external recursions and several residual blocks. In addition, an auxiliary network is developed to train collaboratively with the primary network and guide its training process; its contribution is termed the auxiliary loss. To better train the proposed network, we develop a smooth \(L_{1}\)-norm-based content loss, a perceptual loss, and the auxiliary loss to regularize the network. Extensive experiments demonstrate that the multi-scale recursive network achieves favorable performance against state-of-the-art image dehazing methods. PubDate: 2023-07-01
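A minimal sketch of how the three losses named above might be combined, assuming PyTorch; the perceptual_net argument, aux_out, and the loss weights are illustrative placeholders, not the paper's exact formulation:

    import torch.nn.functional as F

    def dehazing_loss(primary_out, aux_out, target, perceptual_net, w_perc=0.1, w_aux=0.5):
        content = F.smooth_l1_loss(primary_out, target)            # smooth L1 content loss
        perceptual = F.l1_loss(perceptual_net(primary_out),        # distance in a fixed feature space
                               perceptual_net(target))
        auxiliary = F.smooth_l1_loss(aux_out, target)              # auxiliary network guides training
        return content + w_perc * perceptual + w_aux * auxiliary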
Abstract: Feature selection, a commonly used data preprocessing technique, focuses on improving model performance and efficiency by removing redundant or irrelevant features. However, traditional feature selection approaches implicitly assume that data are independent and identically distributed (IID). To obtain more complex and significant information, an effective feature selection method should consider the couplings (non-IIDness) within feature values and the relevance between features. Hence, drawing on rough set theory, this paper first introduces a new coupled similarity measure to discover value-to-feature-to-class coupling information, which is used to compute object neighborhoods and update feature weights. Second, using mutual information, a new coupled relevance measure is defined to capture feature-to-feature coupling relationships. On this basis, an effective feature selection algorithm based on coupling learning is developed for categorical data. To evaluate the proposed algorithm, four common classifiers and 12 UCI data sets are employed in the experiments. The experimental results confirm the feasibility and effectiveness of the new algorithm. PubDate: 2023-07-01
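As a small illustration of a mutual-information-based relevance measure between categorical features, assuming scikit-learn; this is a generic pairwise score, not the paper's coupled relevance measure:

    from sklearn.metrics import mutual_info_score

    def pairwise_relevance(column_a, column_b):
        # mutual information between two categorical feature columns (lists of category labels)
        return mutual_info_score(column_a, column_b)

    # usage: pairwise_relevance(["red", "red", "blue"], ["s", "s", "m"])  # higher for more coupled features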
Abstract: The fast expansion of Internet of Things (IoT) networks raises the possibility of further network threats. In today’s world, network traffic analysis has become an increasingly critical and useful tool for monitoring network traffic in general and analyzing attack patterns in particular. A few years ago, distributed denial-of-service (DDoS) attacks on IoT networks were considered the most pressing problem that needed to be addressed. The absence of high-quality datasets is one of the main obstacles to applying machine learning-based DDoS detection systems. Researchers have developed numerous methods to extract and analyze information from recorded capture files, but a literature review shows that most of these tools share similar drawbacks. In this study, we propose an intelligent raw network data extractor and labeler tool that takes into account the limitations of the available tools for transforming PCAP to CSV. To generate and process a high-quality DDoS attack dataset suitable for machine learning models, we employed several data preprocessing operations on the selected network intrusion dataset. To confirm the validity and acceptability of the dataset, we tested different models; among them, the random forest was the most accurate in detecting DDoS attacks. PubDate: 2023-07-01
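A minimal sketch of the kind of validation described above, training a random forest on a labeled flow-feature CSV and assuming scikit-learn and pandas; the file name "flows.csv" and the "label" column are hypothetical, not the paper's dataset:

    import pandas as pd
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    df = pd.read_csv("flows.csv")                            # features extracted from PCAP, one row per flow
    X, y = df.drop(columns=["label"]), df["label"]           # 'label' marks benign vs. DDoS traffic
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y)
    clf = RandomForestClassifier(n_estimators=200).fit(X_train, y_train)
    print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))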
Abstract: With the development of convolutional neural network (CNN) technology, no-reference image quality assessment (NR-IQA) based on CNNs has attracted the attention of many scholars. However, most previous methods improve evaluation performance by increasing network depth and adding feature extraction mechanisms, which may cause problems such as insufficient feature extraction, detail loss, and vanishing gradients given the limited labeled samples in existing databases. To learn feature representations more effectively, this paper proposes a Two-channel Deep Recursive Multi-Scale Network Based on Multi-Attention (ATDRMN), which can accurately evaluate image quality without relying on reference images. The network is a two-channel convolutional network with the original image and its gradient image as inputs. In the two sub-branch networks, the Multi-scale Feature Extraction Block based on Attention (AMFEB) and the Improved Atrous Spatial Pyramid Pooling Network (IASPP-Net) are proposed to extend the attention-relevant feature information and obtain hierarchical feature information at different levels. Specifically, each AMFEB makes full use of image features from convolution kernels of different sizes to expand the feature information and feeds these features into an attention mechanism to learn their corresponding weights. The output of each AMFEB is extended by atrous convolution in IASPP-Net to obtain more context information and learn its hierarchical features. Finally, multiple AMFEBs and IASPP-Nets are deeply and recursively fused to obtain the most effective feature information, and the output features are passed to a regression network for the final quality evaluation. Experimental results on seven databases show that the proposed method is robust and superior to state-of-the-art NR-IQA methods. PubDate: 2023-07-01
Abstract: Existing adversarial attack methods usually add perturbations directly to the pixel space of an image, resulting in significant local noise. Moreover, the performance of existing attack methods is affected by various pixel-space-based defense strategies. In this paper, we propose a novel method that generates adversarial examples by adding perturbations in the feature space. Specifically, the feature-space perturbation is induced by a style-shifting-based network architecture called AdvAdaIN. We expose the feature space to the attacker via an encoder, and the perturbation is then injected into the feature space by AdvAdaIN. Because of the specific nature of feature-space perturbations, we also train a decoder that maps the changes in feature space back to pixel space and ensures that the perturbations are not easily detected. Meanwhile, we align the original image with another image in the feature space, adding further adversarial information to the model. In addition, we can generate diverse adversarial samples by varying the perturbation parameters, which mainly change the overall color and brightness of the image. Experiments demonstrate that the proposed method outperforms existing methods and produces more natural adversarial samples when facing defense strategies. PubDate: 2023-07-01
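For context, a minimal adaptive instance normalization (AdaIN) style shift of the kind that style-shifting architectures such as AdvAdaIN build on, assuming PyTorch; this shows the standard AdaIN operation, not the paper's full attack pipeline:

    import torch

    def adain(content_feat, style_feat, eps=1e-5):
        # re-normalize content features to carry the channel-wise statistics of the style features
        c_mean = content_feat.mean(dim=(2, 3), keepdim=True)
        c_std = content_feat.std(dim=(2, 3), keepdim=True) + eps
        s_mean = style_feat.mean(dim=(2, 3), keepdim=True)
        s_std = style_feat.std(dim=(2, 3), keepdim=True) + eps
        return (content_feat - c_mean) / c_std * s_std + s_mean

    # usage: out = adain(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))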
Abstract: Although transfer learning has been employed successfully with pre-trained models based on large convolutional neural networks, the demand for huge storage space makes it unattractive to deploy these solutions on edge devices with limited storage and computational power. A number of researchers have proposed convolutional neural network compression models to address such issues. In this paper, a genetic algorithm-based approach is employed to reduce the size of a convolutional neural network model by selecting a subset of convolutional filters and nodes in the dense layers, while maintaining the accuracy of the original model. Specifically, the AlexNet, VGG16, and ResNet50 architectures are taken up for model reduction, and it is shown that large savings in storage space can be made without compromising accuracy. The paper also shows that this approach achieves an additional reduction in storage space of around 38% even for SqueezeNet, which is an already compressed model. The paper further reports a substantial reduction in inference time on standard datasets such as MNIST, CIFAR-10, and CIFAR-100 for all the compressed models mentioned above. For CIFAR-100, the reduction in time is almost double that of other results reported in the literature. PubDate: 2023-07-01
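A minimal sketch of genetic-algorithm-based filter selection, in plain Python/NumPy; the fitness function balancing accuracy against the number of kept filters is a placeholder the reader would supply, not the paper's exact objective:

    import numpy as np

    def evolve_filter_mask(fitness, num_filters, pop_size=20, generations=50, mutate_p=0.05):
        """fitness(mask) -> float; mask is a 0/1 vector marking which filters to keep."""
        rng = np.random.default_rng(0)
        pop = rng.integers(0, 2, size=(pop_size, num_filters))
        for _ in range(generations):
            scores = np.array([fitness(ind) for ind in pop])
            parents = pop[np.argsort(scores)[-pop_size // 2:]]          # keep the fitter half
            cuts = rng.integers(1, num_filters, size=pop_size)
            children = np.array([np.concatenate([parents[rng.integers(len(parents))][:c],
                                                 parents[rng.integers(len(parents))][c:]])
                                 for c in cuts])                         # one-point crossover
            flips = rng.random(children.shape) < mutate_p                # random bit-flip mutation
            pop = np.where(flips, 1 - children, children)
        return pop[np.argmax([fitness(ind) for ind in pop])]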
Abstract: Modeling the Hawkes process with deep learning is superior to traditional statistical methods in goodness of fit. However, methods based on RNNs or self-attention are deficient in long-term dependence and recursive induction, respectively. The Universal Transformer (UT) is an advanced framework that integrates these two requirements simultaneously through its recurrent application of self-attention in depth at each position. However, migrating the UT framework raises the problem of matching it effectively to Hawkes process modeling. Thus, in this paper, an iterative convolution-enhanced self-attention Hawkes process with relative time position encoding (ICAHP-TR), based on an improved UT, is proposed. First, embedding maps implemented with dense layers are applied to the sequences of arrival times and markers to enrich the event representation. Second, the deep network composed of UT layers extracts hidden historical information from the event representation, with the characteristics of recursion and a global receptive field. Third, two designed mechanisms, relative positional encoding on the time step and convolution-enhanced perceptual attention, are adopted to avoid losing dependencies between relative and adjacent positions in the Hawkes process. Finally, the hidden historical information is mapped by dense layers to the parameters of the Hawkes process intensity function, from which the likelihood function is obtained as the network loss. The experimental results on synthetic and real-world datasets demonstrate the effectiveness of the proposed method in terms of both goodness of fit and predictive ability compared with other baseline methods. PubDate: 2023-07-01
Abstract: Labels play a central role in text classification tasks. However, most studies have a lossy label-encoding problem, in which the label is represented by a meaningless and independent one-hot vector. This paper proposes a novel strategy that dynamically generates a soft pseudo label based on the prediction at each training step. This history-based soft pseudo label is taken as the target, and parameters are optimized by minimizing the distance between the target and the prediction. In addition, we augment the training data with Mix-up, a widely used method, to prevent overfitting on small datasets. Extensive experimental results demonstrate that the proposed dynamic soft label strategy significantly improves the performance of several widely used deep learning classification models on binary and multi-class text classification tasks. Not only is our simple and efficient strategy much easier to implement and train, it also exhibits substantial improvements (up to 2.54% relative improvement on the FDCNews dataset with an LSTM encoder) over Label Confusion Learning (LCM), a state-of-the-art label smoothing model, under the same experimental setting. The experimental results also show that Mix-up improves our method's performance on smaller datasets but introduces excess noise on larger datasets, which diminishes the model's performance. PubDate: 2023-07-01
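A minimal sketch of building a history-based soft pseudo label as a training target, assuming PyTorch; the momentum-style mixing of the one-hot label with past predictions is an illustrative reading of the idea, not the paper's exact formulation:

    import torch
    import torch.nn.functional as F

    def soft_pseudo_label(one_hot_label, history_prob, current_prob, alpha=0.9, beta=0.5):
        # keep a running average of the model's own predictions, then blend it with the hard label
        history_prob = alpha * history_prob + (1 - alpha) * current_prob.detach()
        target = beta * one_hot_label + (1 - beta) * history_prob
        return target, history_prob

    # the loss is the distance between the prediction and this soft target, e.g.
    # loss = F.kl_div(current_prob.log(), target, reduction="batchmean")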
Abstract: Recently, the application of convolutional neural networks (CNNs) to single image super-resolution (SISR) has developed rapidly. Although many CNN-based methods have achieved splendid performance, their oversized model complexity hinders their application in real life. In response, lightweight and efficient designs are becoming the development trend for SR models. The residual feature distillation network (RFDN) is one of the state-of-the-art lightweight SR networks. However, the shallow residual block (SRB) in RFDN still uses ordinary convolution to extract features, leaving considerable room for reducing network parameters. In this paper, we propose the Group-convolutional Feature Enhanced Distillation Network (GFEDNet), constructed by stacking feature distillation and aggregation blocks (FDABs). Benefiting from the residual learning of the residual feature aggregation (RFA) framework and the feature distillation strategy of RFDN, the FDAB obtains more diverse and detailed feature representations, thereby improving SR capability. Furthermore, we propose the multi-scale group convolution block (MGCB) to replace the SRB. Thanks to group convolution and a multi-branch parallel structure, the MGCB substantially reduces parameters while maintaining SR performance. Extensive experiments show the strong performance of our proposed GFEDNet compared with other state-of-the-art methods. PubDate: 2023-07-01
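A brief sketch of why group convolution saves parameters, as exploited by the MGCB described above, assuming PyTorch; the block below is a generic multi-branch group-convolution block with illustrative names, not the paper's exact MGCB:

    import torch
    import torch.nn as nn

    # an ordinary 3x3 conv has C_in * C_out * 9 weights; with groups=g it has (C_in * C_out * 9) / g
    branch_3x3 = nn.Conv2d(64, 64, kernel_size=3, padding=1, groups=4)  # roughly 1/4 of the parameters
    branch_5x5 = nn.Conv2d(64, 64, kernel_size=5, padding=2, groups=4)

    def multi_scale_group_block(x):
        # parallel branches at different kernel sizes, fused by addition with a residual connection
        return torch.relu(branch_3x3(x) + branch_5x5(x) + x)

    # usage: y = multi_scale_group_block(torch.randn(1, 64, 48, 48))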
Abstract: In classification problems, abnormal observations are often encountered, and how to obtain a stable model that copes with outliers has long been a subject of widespread concern. In this article, we draw on the ideas of the AdaBoost algorithm and propose an asymptotically linear loss function, which makes the output function more stable on contaminated samples, and we design two boosting algorithms, based on two different update schemes, to handle outliers. In addition, a technique for overcoming the instability of Newton's method when dealing with weak convexity is introduced. Several examples with artificially added outliers show that the Discrete L-AdaBoost and Real L-AdaBoost algorithms find the boundary of each category consistently even when the data are contaminated. Extensive experiments on real-world datasets are used to test the robustness of the proposed algorithms to noise. PubDate: 2023-07-01
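To illustrate what an asymptotically linear loss buys over AdaBoost's exponential loss, a small comparison on the margin m = y f(x) using NumPy; the logistic-style loss below is a standard example of a loss that grows only linearly for large negative margins, not the paper's specific function:

    import numpy as np

    margins = np.array([2.0, 0.0, -2.0, -8.0])       # -8 mimics a badly misclassified outlier
    exp_loss = np.exp(-margins)                       # exponential loss: about 2981 at m = -8, the outlier dominates
    lin_loss = np.log1p(np.exp(-margins))             # grows roughly like -m for very negative margins
    print(exp_loss.round(2), lin_loss.round(2))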
Abstract: Class hierarchical structures play a significant role in large and complex machine learning tasks. Existing studies on constructing such structures follow a two-stage strategy: category similarities are first computed under a certain assumption, and a group partition algorithm is then performed with some hyper-parameters to control the shape of the class hierarchy. Despite their effectiveness in many cases, these methods suffer from two problems: (1) optimizing the two-stage objective yields a sub-optimal structure; (2) the hyper-parameters make the search space too large to find the optimal structure efficiently. In this paper, we propose a unified and dynamic framework to address these problems, which can (1) jointly optimize the category similarity and the group partition, and (2) obtain the class hierarchical structure dynamically without any hyper-parameters. The framework replaces the traditional category similarity with a sample similarity and constrains samples from the same atomic category to be partitioned into the same super-category. We theoretically prove that, within our framework, the sample similarity is equivalent to the category similarity and can balance the partitions in terms of the number of samples. Further, we design a modularity-based partition optimization algorithm that automatically determines the number of partitions at each level. Extensive experimental results on multiple image classification datasets show that the hierarchical structure constructed by the proposed method achieves better accuracy and efficiency than existing methods. Additionally, the hierarchy obtained by the proposed method can benefit long-tail learning scenarios due to the balanced partition of samples. PubDate: 2023-07-01
Abstract: Despite the great progress in action recognition made by deep neural networks, visual tempo may be overlooked in the feature learning process of existing methods. Visual tempo is the dynamic and temporal scale variation of actions. Existing models usually understand spatiotemporal scenes using temporal and spatial convolutions that are limited in both dimensions, so they cannot cope with differences in visual tempo. To address these issues, we propose a multi-receptive-field spatiotemporal (MRF-ST) network to effectively model the spatial and temporal information of different receptive fields. In the proposed network, dilated convolution is utilized to obtain different receptive fields, and dynamic weighting over the different dilation rates is designed based on an attention mechanism. Thus, the proposed MRF-ST network can directly capture various tempos in the same network layer without any additional cost. Moreover, the network can improve action recognition accuracy by learning more visual tempos of different actions. Extensive evaluations show that MRF-ST reaches the state of the art on three popular action recognition benchmarks: UCF-101, HMDB-51, and Diving-48. Further analysis also indicates that MRF-ST significantly improves performance in scenes with large variance in visual tempo. PubDate: 2023-07-01
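A minimal sketch of attention-weighted dilated-convolution branches, the mechanism described above, assuming PyTorch; a 1D temporal version is shown for brevity, with illustrative names rather than the paper's exact MRF-ST layer:

    import torch
    import torch.nn as nn

    class MultiReceptiveField1D(nn.Module):
        """Parallel dilated convolutions whose outputs are mixed by learned attention weights."""
        def __init__(self, channels, dilations=(1, 2, 4)):
            super().__init__()
            self.branches = nn.ModuleList(
                nn.Conv1d(channels, channels, 3, padding=d, dilation=d) for d in dilations)
            self.attn = nn.Linear(channels, len(dilations))

        def forward(self, x):                                            # x: (batch, channels, time)
            outs = torch.stack([b(x) for b in self.branches], dim=1)     # (batch, branches, channels, time)
            w = torch.softmax(self.attn(x.mean(dim=2)), dim=1)           # per-sample weight for each dilation rate
            return (w.view(*w.shape, 1, 1) * outs).sum(dim=1)

    # usage: y = MultiReceptiveField1D(32)(torch.randn(2, 32, 16))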
Abstract: Social emotion classification is important for better capturing the preferences and perspectives of individual users, monitoring public opinion, and editing news. However, news reports have a strong domain dependence, and training data in the target domain are usually insufficient, with only a small amount labeled. To address these problems, we develop a cluster-level method for social emotion classification across domains. By discovering both source and target clusters and weighting each source cluster according to the similarity between its distribution and that of the target clusters, we can discover common patterns between the source and target domains and thus use both source and target data more effectively. Extensive experiments on 12 cross-domain tasks using the ChinaNews dataset show that our model outperforms existing methods. PubDate: 2023-07-01
Abstract: With the progressive development of ubiquitous computing, wearable human activity recognition is playing an increasingly important role in many fields, such as health monitoring, disease-assisted diagnosis and rehabilitation, and exercise assessment. Inertial measurement units in wearable devices provide a rich representation of motion, and human activity recognition based on sensor sequences has proven to be crucial in machine learning research. The key challenge is to extract powerful representational features from multi-sensor data to capture subtle differences between human activities. Beyond this challenge, critical information is often lost during feature extraction because the temporal and spatial dependence of the data is neglected. Few previous papers jointly address these two challenges. In this paper, we propose an efficient Bilinear Spatial-Temporal Attention Network (Bi-STAN). First, a multi-scale ResNet backbone is used to extract multimodal signal features and jointly optimize the feature extraction process. Then, to adaptively focus on what and where is important in the original data and to mine the discriminative parts of the features, we design a spatial-temporal attention network. Finally, a bilinear pooling with low redundancy is introduced to efficiently obtain second-order information. Experiments on three public datasets and our real-world dataset demonstrate that the proposed Bi-STAN is superior to existing methods in terms of both accuracy and efficiency. The code and models are publicly available at https://github.com/ilovesea/Bi-STAN. PubDate: 2023-07-01
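For reference, a minimal bilinear pooling step of the kind mentioned above, producing second-order feature statistics and assuming PyTorch; this is plain outer-product pooling with standard normalization, not the paper's low-redundancy variant:

    import torch
    import torch.nn.functional as F

    def bilinear_pool(feat_a, feat_b):
        # feat_a: (batch, Ca, T), feat_b: (batch, Cb, T); average the outer products over time
        second_order = torch.einsum("bct,bdt->bcd", feat_a, feat_b) / feat_a.size(-1)
        vec = second_order.flatten(1)                                 # (batch, Ca*Cb)
        vec = torch.sign(vec) * torch.sqrt(vec.abs() + 1e-8)          # signed square-root normalization
        return F.normalize(vec, dim=1)                                # L2 normalization

    # usage: v = bilinear_pool(torch.randn(2, 16, 50), torch.randn(2, 16, 50))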
Abstract: Multimodal sentiment classification of social media is attracting increasing attention; its main purpose is to predict the sentiment toward the target mentioned in a post. Current research mainly focuses on integrating multimodal data but fails to consider the impact of the target. In this work, we propose a target-oriented multimodal sentiment classification model. Specifically, our model starts by exploiting the target-oriented topic within the text. Then, a multi-head attention network is established to learn the multimodal interaction among textual, visual, and topic information, from which target-oriented representations of the topic, the text, and the image are obtained. Moreover, a gating unit is built to fuse the multimodal information. For the task of target-oriented multimodal sentiment classification, experiments are carried out on a manually annotated multimodal dataset. Experimental results reveal that our method significantly reduces the gap on each given target, laying a foundation for achieving state-of-the-art sentiment classification results. PubDate: 2023-07-01
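A minimal sketch of a gating unit for fusing two modality representations, as described above, assuming PyTorch; this is a generic gated fusion layer with illustrative names, not the paper's exact unit:

    import torch
    import torch.nn as nn

    class GatedFusion(nn.Module):
        """Learn a per-dimension gate that decides how much of each modality to keep."""
        def __init__(self, dim):
            super().__init__()
            self.gate = nn.Linear(2 * dim, dim)

        def forward(self, text_repr, image_repr):
            g = torch.sigmoid(self.gate(torch.cat([text_repr, image_repr], dim=-1)))
            return g * text_repr + (1 - g) * image_repr

    # usage: fused = GatedFusion(128)(torch.randn(4, 128), torch.randn(4, 128))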