Subjects -> ENGINEERING (Total: 2688 journals)
    - CHEMICAL ENGINEERING (229 journals)
    - CIVIL ENGINEERING (237 journals)
    - ELECTRICAL ENGINEERING (176 journals)
    - ENGINEERING (1325 journals)
    - ENGINEERING MECHANICS AND MATERIALS (452 journals)
    - HYDRAULIC ENGINEERING (56 journals)
    - INDUSTRIAL ENGINEERING (98 journals)
    - MECHANICAL ENGINEERING (115 journals)

ENGINEERING (1325 journals)

Showing 1 - 200 of 1205 Journals sorted by number of followers
Composite Structures     Hybrid Journal   (Followers: 247)
Composites Part B : Engineering     Hybrid Journal   (Followers: 221)
IEEE Spectrum     Full-text available via subscription   (Followers: 219)
ACS Nano     Hybrid Journal   (Followers: 183)
Composites Part A : Applied Science and Manufacturing     Hybrid Journal   (Followers: 175)
IEEE Geoscience and Remote Sensing Letters     Hybrid Journal   (Followers: 151)
Composites Science and Technology     Hybrid Journal   (Followers: 150)
IEEE Instrumentation & Measurement Magazine     Hybrid Journal   (Followers: 148)
IEEE Communications Magazine     Full-text available via subscription   (Followers: 140)
IEEE Engineering Management Review     Full-text available via subscription   (Followers: 117)
IEEE Antennas and Propagation Magazine     Hybrid Journal   (Followers: 112)
IEEE Transactions on Control Systems Technology     Hybrid Journal   (Followers: 111)
IEEE Transactions on Instrumentation and Measurement     Hybrid Journal   (Followers: 106)
IEEE Transactions on Signal Processing     Hybrid Journal   (Followers: 92)
IEEE Antennas and Wireless Propagation Letters     Hybrid Journal   (Followers: 88)
IEEE Industry Applications Magazine     Full-text available via subscription   (Followers: 82)
IEEE Transactions on Antennas and Propagation     Full-text available via subscription   (Followers: 79)
IEEE Transactions on Engineering Management     Hybrid Journal   (Followers: 74)
Engineering Failure Analysis     Hybrid Journal   (Followers: 68)
IEEE Microwave Magazine     Full-text available via subscription   (Followers: 63)
IEEE Signal Processing Letters     Hybrid Journal   (Followers: 60)
IEEE Transactions on Reliability     Hybrid Journal   (Followers: 53)
Experimental Techniques     Hybrid Journal   (Followers: 51)
IET Radar, Sonar & Navigation     Open Access   (Followers: 50)
IEEE Transactions on Microwave Theory and Techniques     Hybrid Journal   (Followers: 49)
Control Engineering Practice     Hybrid Journal   (Followers: 46)
IEEE Journal of Selected Topics in Signal Processing     Hybrid Journal   (Followers: 43)
Biotechnology Progress     Hybrid Journal   (Followers: 42)
IEEE Potentials     Full-text available via subscription   (Followers: 42)
IEEE Journal on Selected Areas in Communications     Hybrid Journal   (Followers: 39)
Heat Transfer Engineering     Hybrid Journal   (Followers: 36)
IET Microwaves, Antennas & Propagation     Open Access   (Followers: 35)
International Journal for Numerical Methods in Engineering     Hybrid Journal   (Followers: 35)
IEEE Microwave and Wireless Components Letters     Hybrid Journal   (Followers: 35)
Digital Signal Processing     Hybrid Journal   (Followers: 34)
IEEE Transactions on Knowledge and Data Engineering     Hybrid Journal   (Followers: 32)
AIChE Journal     Hybrid Journal   (Followers: 31)
Computing in Science & Engineering     Full-text available via subscription   (Followers: 31)
Computers & Geosciences     Hybrid Journal   (Followers: 30)
Flow, Turbulence and Combustion     Hybrid Journal   (Followers: 30)
Coastal Management     Hybrid Journal   (Followers: 29)
Canadian Geotechnical Journal     Hybrid Journal   (Followers: 28)
GPS Solutions     Hybrid Journal   (Followers: 28)
Fluid Dynamics     Hybrid Journal   (Followers: 27)
Bell Labs Technical Journal     Hybrid Journal   (Followers: 27)
Géotechnique     Hybrid Journal   (Followers: 27)
IEEE Transactions on Information Theory     Hybrid Journal   (Followers: 27)
IEEE Transactions on Power Delivery     Hybrid Journal   (Followers: 26)
Applied Energy     Partially Free   (Followers: 26)
Advances in Engineering Software     Hybrid Journal   (Followers: 26)
IEEE Journal of Solid-State Circuits     Full-text available via subscription   (Followers: 24)
Corrosion Science     Hybrid Journal   (Followers: 23)
Engineering & Technology     Hybrid Journal   (Followers: 22)
IET Image Processing     Open Access   (Followers: 22)
Intermetallics     Hybrid Journal   (Followers: 21)
Combustion, Explosion, and Shock Waves     Hybrid Journal   (Followers: 21)
IEEE Transactions on Electronics Packaging Manufacturing     Hybrid Journal   (Followers: 21)
IET Signal Processing     Open Access   (Followers: 21)
IEEE Transactions on Circuits and Systems II: Express Briefs     Hybrid Journal   (Followers: 20)
Advanced Synthesis & Catalysis     Hybrid Journal   (Followers: 20)
Implementation Science     Open Access   (Followers: 20)
International Journal for Numerical Methods in Fluids     Hybrid Journal   (Followers: 19)
Engineering Optimization     Hybrid Journal   (Followers: 19)
International Communications in Heat and Mass Transfer     Hybrid Journal   (Followers: 19)
Electrophoresis     Hybrid Journal   (Followers: 18)
IET Circuits, Devices & Systems     Open Access   (Followers: 18)
IEEE/ACM Transactions on Computational Biology and Bioinformatics     Hybrid Journal   (Followers: 18)
International Journal of Adhesion and Adhesives     Hybrid Journal   (Followers: 18)
IEEE Transactions on Intelligent Transportation Systems     Hybrid Journal   (Followers: 17)
Experiments in Fluids     Hybrid Journal   (Followers: 17)
Computational Geosciences     Hybrid Journal   (Followers: 17)
Integration     Hybrid Journal   (Followers: 16)
IEEE Transactions on Energy Conversion     Hybrid Journal   (Followers: 16)
Engineering Geology     Hybrid Journal   (Followers: 16)
European Journal of Mass Spectrometry     Hybrid Journal   (Followers: 16)
Energy Conversion and Management     Hybrid Journal   (Followers: 15)
Bulletin of Engineering Geology and the Environment     Hybrid Journal   (Followers: 15)
Coastal Engineering     Hybrid Journal   (Followers: 15)
IEEE Transactions on Magnetics     Hybrid Journal   (Followers: 14)
IEEE Journal of Biomedical and Health Informatics     Hybrid Journal   (Followers: 14)
IEEE Transactions on Automation Science and Engineering     Full-text available via subscription   (Followers: 13)
IEEE Transactions on Evolutionary Computation     Hybrid Journal   (Followers: 13)
Electromagnetics     Hybrid Journal   (Followers: 13)
Computers and Geotechnics     Hybrid Journal   (Followers: 12)
IEEE Transactions on Semiconductor Manufacturing     Hybrid Journal   (Followers: 12)
IET Renewable Power Generation     Open Access   (Followers: 12)
Human Factors in Ergonomics & Manufacturing     Hybrid Journal   (Followers: 12)
IEEE Transactions on Professional Communication     Hybrid Journal   (Followers: 11)
Biomedical Engineering     Hybrid Journal   (Followers: 11)
IEEE Transactions on Education     Hybrid Journal   (Followers: 11)
CIRP Annals - Manufacturing Technology     Hybrid Journal   (Followers: 11)
Heat Transfer - Asian Research     Hybrid Journal   (Followers: 11)
IEEE Journal of Oceanic Engineering     Hybrid Journal   (Followers: 11)
International Journal of Antennas and Propagation     Open Access   (Followers: 10)
Proceedings of the Institution of Civil Engineers - Geotechnical Engineering     Hybrid Journal   (Followers: 10)
IEEE Transactions on Nuclear Science     Hybrid Journal   (Followers: 10)
IEEE Transactions on Plasma Science     Hybrid Journal   (Followers: 10)
Computers & Mathematics with Applications     Full-text available via subscription   (Followers: 9)
Fuel Cells Bulletin     Full-text available via subscription   (Followers: 9)
Computational Optimization and Applications     Hybrid Journal   (Followers: 9)
Annals of Science     Hybrid Journal   (Followers: 9)
European Journal of Engineering Education     Hybrid Journal   (Followers: 9)
Applied Catalysis B: Environmental     Hybrid Journal   (Followers: 9)
Biomedical Microdevices     Hybrid Journal   (Followers: 8)
IEEE Technology and Society Magazine     Full-text available via subscription   (Followers: 8)
Fuel Cells     Hybrid Journal   (Followers: 8)
Adaptive Behavior     Hybrid Journal   (Followers: 8)
Proceedings of the Institution of Civil Engineers - Bridge Engineering     Hybrid Journal   (Followers: 8)
Energy Engineering     Full-text available via subscription   (Followers: 8)
IEEE Transactions on Advanced Packaging     Full-text available via subscription   (Followers: 8)
Clay Minerals     Hybrid Journal   (Followers: 8)
Continuum Mechanics and Thermodynamics     Hybrid Journal   (Followers: 8)
Applied Catalysis A: General     Hybrid Journal   (Followers: 7)
International Journal of Applied Ceramic Technology     Hybrid Journal   (Followers: 7)
Basin Research     Hybrid Journal   (Followers: 7)
Discrete Optimization     Full-text available via subscription   (Followers: 7)
Designs, Codes and Cryptography     Hybrid Journal   (Followers: 7)
IEEE Journal of Selected Topics in Quantum Electronics     Hybrid Journal   (Followers: 7)
Environmental and Ecological Statistics     Hybrid Journal   (Followers: 7)
Biomicrofluidics     Open Access   (Followers: 7)
Geothermics     Hybrid Journal   (Followers: 7)
Fuel and Energy Abstracts     Full-text available via subscription   (Followers: 7)
IEEE Vehicular Technology Magazine     Full-text available via subscription   (Followers: 7)
Catalysis Communications     Hybrid Journal   (Followers: 7)
Computers and Electronics in Agriculture     Hybrid Journal   (Followers: 7)
Computer Applications in Engineering Education     Hybrid Journal   (Followers: 6)
Computing and Visualization in Science     Hybrid Journal   (Followers: 6)
Fusion Engineering and Design     Hybrid Journal   (Followers: 6)
Applied Clay Science     Hybrid Journal   (Followers: 6)
Composite Interfaces     Hybrid Journal   (Followers: 6)
Formal Methods in System Design     Hybrid Journal   (Followers: 6)
Acta Geotechnica     Hybrid Journal   (Followers: 6)
Advances in OptoElectronics     Open Access   (Followers: 6)
International Journal of Adaptive Control and Signal Processing     Hybrid Journal   (Followers: 5)
IEEE Transactions on Vehicular Technology     Hybrid Journal   (Followers: 5)
IET Science, Measurement & Technology     Open Access   (Followers: 5)
IEEE Transactions on Applied Superconductivity     Hybrid Journal   (Followers: 5)
International Journal of Architectural Computing     Full-text available via subscription   (Followers: 5)
Finite Fields and Their Applications     Full-text available via subscription   (Followers: 5)
Focus on Powder Coatings     Full-text available via subscription   (Followers: 5)
Engineering With Computers     Hybrid Journal   (Followers: 5)
Proceedings of the Institution of Civil Engineers - Engineering Sustainability     Hybrid Journal   (Followers: 5)
Archives of Computational Methods in Engineering     Hybrid Journal   (Followers: 5)
Active and Passive Electronic Components     Open Access   (Followers: 5)
Proceedings of the Institution of Civil Engineers - Ground Improvement     Hybrid Journal   (Followers: 4)
Frontiers in Energy     Hybrid Journal   (Followers: 4)
Adsorption     Hybrid Journal   (Followers: 4)
Catalysis Today     Hybrid Journal   (Followers: 4)
Applied Numerical Mathematics     Hybrid Journal   (Followers: 4)
Current Applied Physics     Full-text available via subscription   (Followers: 4)
Fluid Phase Equilibria     Hybrid Journal   (Followers: 4)
Graphs and Combinatorics     Hybrid Journal   (Followers: 4)
Filtration & Separation     Full-text available via subscription   (Followers: 4)
Annals of Pure and Applied Logic     Open Access   (Followers: 4)
Grass and Forage Science     Hybrid Journal   (Followers: 4)
Catalysis Surveys from Asia     Hybrid Journal   (Followers: 4)
Informatik-Spektrum     Hybrid Journal   (Followers: 3)
Engineering Computations     Hybrid Journal   (Followers: 3)
European Journal of Combinatorics     Full-text available via subscription   (Followers: 3)
Applicable Algebra in Engineering, Communication and Computing     Hybrid Journal   (Followers: 3)
Chaos : An Interdisciplinary Journal of Nonlinear Science     Hybrid Journal   (Followers: 3)
Concurrent Engineering     Hybrid Journal   (Followers: 3)
Focus on Pigments     Full-text available via subscription   (Followers: 3)
Annals of Combinatorics     Hybrid Journal   (Followers: 3)
Frontiers of Environmental Science & Engineering     Hybrid Journal   (Followers: 3)
Fuzzy Sets and Systems     Hybrid Journal   (Followers: 3)
Catalysis Letters     Hybrid Journal   (Followers: 3)
IET Generation, Transmission & Distribution     Open Access   (Followers: 2)
Historical Records of Australian Science     Hybrid Journal   (Followers: 2)
IET Optoelectronics     Open Access   (Followers: 2)
Assembly Automation     Hybrid Journal   (Followers: 2)
International Journal of Abrasive Technology     Hybrid Journal   (Followers: 2)
Aerobiologia     Hybrid Journal   (Followers: 2)
Cellular and Molecular Neurobiology     Hybrid Journal   (Followers: 2)
Comptes Rendus : Mécanique     Open Access   (Followers: 2)
Chinese Journal of Catalysis     Full-text available via subscription   (Followers: 2)
IEEE Latin America Transactions     Full-text available via subscription   (Followers: 2)
Communications in Numerical Methods in Engineering     Hybrid Journal   (Followers: 2)
ESAIM: Control Optimisation and Calculus of Variations     Open Access   (Followers: 2)
Focus on Surfactants     Full-text available via subscription   (Followers: 2)
Engineering Analysis with Boundary Elements     Hybrid Journal   (Followers: 2)
Chaos, Solitons & Fractals     Hybrid Journal   (Followers: 1)
Foundations of Science     Hybrid Journal   (Followers: 1)
Forschung     Hybrid Journal   (Followers: 1)
European Journal of Lipid Science and Technology     Hybrid Journal   (Followers: 1)
Antarctic Science     Hybrid Journal   (Followers: 1)
Épités - Épitészettudomány     Full-text available via subscription   (Followers: 1)
Dyes and Pigments     Hybrid Journal   (Followers: 1)
Bautechnik     Hybrid Journal   (Followers: 1)
Biointerphases     Open Access   (Followers: 1)
Designed Monomers and Polymers     Open Access   (Followers: 1)
Color Research & Application     Hybrid Journal   (Followers: 1)
Abstract and Applied Analysis     Open Access   (Followers: 1)
Focus on Catalysts     Full-text available via subscription  
ESAIM: Proceedings     Open Access  
Environmetrics     Hybrid Journal  
COMBINATORICA     Hybrid Journal  
Chinese Science Bulletin     Open Access  
Calphad     Hybrid Journal  
Boundary Value Problems     Open Access  

IEEE Transactions on Knowledge and Data Engineering
Journal Prestige (SJR): 1.133
Citation Impact (citeScore): 5
Number of Followers: 32  
 
  Hybrid Journal (it can contain Open Access articles)
ISSN (Print) 1041-4347
Published by IEEE [228 journals]
  • A Data-Characteristic-Aware Latent Factor Model for Web Services QoS Prediction

      Authors: Di Wu; Xin Luo; Mingsheng Shang; Yi He; Guoyin Wang; Xindong Wu
      Pages: 2525 - 2538
      Abstract: How to accurately predict unknown quality-of-service (QoS) data based on observed ones is a hot yet thorny issue in Web service-related applications. Recently, a latent factor (LF) model has shown its efficiency in addressing this issue owing to its high accuracy and scalability. An LF model can be improved by identifying user and service neighborhoods based on user and service geographical information. However, such information can be difficult to acquire in most applications with the considerations of information security, identity privacy, and commercial interests in a real system. Besides, the existing LF model-based QoS predictors mostly ignore the reliability of given QoS data where noises commonly exist to cause accuracy loss. To address the above issues, this paper proposes a data-characteristic-aware latent factor (DCALF) model to implement highly accurate QoS predictions, where ‘data-characteristic-aware’ indicates that it can appropriately implement QoS prediction according to the characteristics of given QoS data. Its main idea is two-fold: a) it detects the neighborhoods and noises of users and services based on the dense LFs extracted from the original sparse QoS data, b) it incorporates a density peaks-based clustering method into its modeling process for achieving the simultaneous detections of both neighborhoods and noises of QoS data. With such designs, it precisely represents the given QoS data in spite of their sparsity, thereby achieving highly accurate predictions for unknown ones. Experimental results on two QoS datasets generated by real-world Web services demonstrate that the proposed DCALF model outperforms state-of-the-art QoS predictors, making it highly competitive in addressing the issue of Web service selection and recommendation.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
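
      Illustration (not part of the abstract): the abstract builds on a latent factor (LF) model that predicts unknown QoS values from a sparse set of observed ones. The sketch below shows only a plain LF model trained with stochastic gradient descent; it does not include the neighborhood detection, noise detection, or density peaks-based clustering that DCALF adds. The function names, hyperparameters, and the (user, service, value) triple format are assumptions made for illustration.

```python
import math
import random


def train_latent_factors(observed, n_users, n_services, rank=2,
                         lr=0.05, reg=0.01, epochs=300, seed=0):
    """Fit user and service factor matrices to observed (user, service, value) triples.

    This is a generic regularized matrix-factorization sketch, not the DCALF model.
    """
    rng = random.Random(seed)
    # Small positive initial factors; each row is one user's or service's latent vector.
    P = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    Q = [[rng.uniform(0.0, 0.1) for _ in range(rank)] for _ in range(n_services)]
    for _ in range(epochs):
        for u, s, value in observed:
            err = value - predict(P, Q, u, s)
            for k in range(rank):
                pu, qs = P[u][k], Q[s][k]
                # Gradient step on the squared error with L2 regularization.
                P[u][k] += lr * (err * qs - reg * pu)
                Q[s][k] += lr * (err * pu - reg * qs)
    return P, Q


def predict(P, Q, user, service):
    """Predicted QoS value is the inner product of the two latent vectors."""
    return sum(p * q for p, q in zip(P[user], Q[service]))


def rmse(observed, P, Q):
    """Root-mean-square error over a list of (user, service, value) triples."""
    total = sum((value - predict(P, Q, u, s)) ** 2 for u, s, value in observed)
    return math.sqrt(total / len(observed))


if __name__ == "__main__":
    # Hypothetical toy data: 3 users, 3 services, 6 observed entries.
    data = [(0, 0, 1.0), (0, 1, 0.5), (1, 0, 0.8),
            (1, 2, 0.3), (2, 1, 0.6), (2, 2, 0.9)]
    P, Q = train_latent_factors(data, n_users=3, n_services=3)
    print("training RMSE:", rmse(data, P, Q))
    print("predicted value for unobserved (user 0, service 2):", predict(P, Q, 0, 2))
```

      The unobserved entry is filled in purely from the learned factors, which is the sense in which an LF model handles sparse data.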
       
  • A Deep Multi-View Framework for Anomaly Detection on Attributed Networks

      Authors: Zhen Peng; Minnan Luo; Jundong Li; Luguo Xue; Qinghua Zheng
      Pages: 2539 - 2552
      Abstract: The explosion of modeling complex systems using attributed networks boosts the research on anomaly detection in such networks, which can be applied in various high-impact domains. Many existing attempts, however, do not seriously tackle the inherent multi-view property in attribute space but concatenate multiple views into a single feature vector, which inevitably ignores the incompatibility between heterogeneous views caused by their own statistical properties. Actually, the distinct but complementary information brought by multi-view data promises the potential for more effective anomaly detection than the efforts only based on single-view data. Furthermore, the abnormal patterns naturally behave diversely in different views, which coincides with people’s desire to discover specific abnormality according to their preferences for views (attributes). Most existing methods cannot adapt to people’s requirements as they fail to consider the idiosyncrasy of user preferences. Therefore, we propose a multi-view framework Alarm to incorporate user preferences into anomaly detection and simultaneously tackle heterogeneous attribute characteristics through multiple graph encoders and a well-designed aggregator that supports self-learning and user-guided learning. Experiments on synthetic and real-world datasets, e.g., Disney, Books, and Enron, corroborate the improvement of Alarm in detection accuracy evaluated by the AUC metric and its effectiveness in supporting user-oriented anomaly detection.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • A Review for Weighted MinHash Algorithms

      Authors: Wei Wu; Bin Li; Ling Chen; Junbin Gao; Chengqi Zhang
      Pages: 2553 - 2573
      Abstract: Data similarity (or distance) computation is a fundamental research topic which underpins many high-level applications based on similarity measures in machine learning and data mining. However, in large-scale real-world scenarios, the exact similarity computation has become daunting due to the “3V” nature (volume, velocity and variety) of big data. In this case, the hashing techniques have been verified to efficiently conduct similarity estimation in terms of both theory and practice. Currently, MinHash is a popular technique for efficiently estimating the Jaccard similarity of binary sets and furthermore, weighted MinHash is generalized to estimate the generalized Jaccard similarity of weighted sets. This review focuses on categorizing and discussing the existing works of weighted MinHash algorithms. In this review, we mainly categorize the weighted MinHash algorithms into quantization-based approaches, “active index”-based ones and others, and show the evolution and inherent connection of the weighted MinHash algorithms, from the integer weighted MinHash ones to the real-valued weighted MinHash ones. Also, we have developed a Python toolbox for the algorithms, and released it on our GitHub. We experimentally conduct a comprehensive study of the standard MinHash algorithm and the weighted MinHash ones in the similarity estimation error and the information retrieval task.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
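
      Illustration (not part of the abstract, and not the authors' toolbox): the abstract says MinHash estimates the Jaccard similarity of binary sets and that weighted MinHash targets the generalized Jaccard similarity of weighted sets. The sketch below shows standard MinHash, plus the simplest integer-weight idea of expanding each element into as many sub-elements as its weight, so that the ordinary Jaccard similarity of the expanded sets equals the generalized Jaccard similarity. All function names and the SHA-1 based hash family are assumptions chosen for illustration.

```python
import hashlib


def _hash(seed, item):
    """Deterministic 64-bit hash of (seed, item); one seed stands in for one hash function."""
    digest = hashlib.sha1(f"{seed}:{item}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def minhash_signature(items, num_hashes=256):
    """Keep the minimum hash value under each hash function (items must be non-empty)."""
    return [min(_hash(seed, item) for item in items) for seed in range(num_hashes)]


def estimate_jaccard(sig_a, sig_b):
    """Fraction of positions where the two signatures agree."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def exact_jaccard(a, b):
    """|A intersect B| / |A union B| for two sets."""
    return len(a & b) / len(a | b)


def generalized_jaccard(wa, wb):
    """sum of element-wise minima divided by sum of element-wise maxima of the weights."""
    keys = set(wa) | set(wb)
    num = sum(min(wa.get(k, 0), wb.get(k, 0)) for k in keys)
    den = sum(max(wa.get(k, 0), wb.get(k, 0)) for k in keys)
    return num / den


def expand(weights):
    """Turn an integer-weighted set into a binary set of (element, copy index) pairs."""
    return {(item, i) for item, w in weights.items() for i in range(w)}


if __name__ == "__main__":
    a = {f"x{i}" for i in range(100)}
    b = {f"x{i}" for i in range(50, 150)}
    print("exact:", exact_jaccard(a, b))
    print("estimated:", estimate_jaccard(minhash_signature(a), minhash_signature(b)))

    wa, wb = {"x": 3, "y": 1}, {"x": 1, "y": 2, "z": 1}
    print("generalized Jaccard:", generalized_jaccard(wa, wb))
    print("Jaccard of expanded sets:", exact_jaccard(expand(wa), expand(wb)))
```

      The expansion only works for integer weights and grows with the weight values, which is one reason more refined weighted schemes exist.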
       
  • A Survey on Large-Scale Machine Learning

      Authors: Meng Wang; Weijie Fu; Xiangnan He; Shijie Hao; Xindong Wu
      Pages: 2574 - 2594
      Abstract: Machine learning can provide deep insights into data, allowing machines to make high-quality predictions, and has been widely used in real-world applications, such as text mining, visual classification, and recommender systems. However, most sophisticated machine learning approaches suffer from huge time costs when operating on large-scale data. This issue calls for Large-scale Machine Learning (LML), which aims to learn patterns from big data with comparable performance efficiently. In this paper, we offer a systematic survey on existing LML methods to provide a blueprint for the future developments of this area. We first divide these LML methods according to the ways of improving the scalability: 1) model simplification on computational complexities, 2) optimization approximation on computational efficiency, and 3) computation parallelism on computational capabilities. Then we categorize the methods in each perspective according to their targeted scenarios and introduce representative methods in line with intrinsic strategies. Lastly, we analyze their limitations and discuss potential directions as well as open issues that are promising to address in the future.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Adaptive Lower-Level Driven Compaction to Optimize LSM-Tree Key-Value Stores

      Authors: Yunpeng Chai; Yanfeng Chai; Xin Wang; Haocheng Wei; Yangyang Wang
      Pages: 2595 - 2609
      Abstract: Log-structured merge (LSM) tree key-value (KV) stores have been widely deployed in many NoSQL and SQL systems, serving online big data applications such as social networking, graph processing, machine learning, etc. The batch processing of sorted data merging (i.e., compaction) in LSM-tree key-value stores improves the write efficiency, and some lazy compaction methods have been proposed to accumulate more data within a batch. However, these batched writing methods lead to significant tail latency, which is unacceptable for online processing. Aiming to optimize both latency and throughput, we propose a novel Lower-level Driven Compaction (LDC) method which breaks the limitations of the traditional upper-level driven compaction manner and triggers practical compaction actions bottom-up, with the benefits of both decreasing the compaction granularity for smaller latency and reducing write amplification for higher throughput. Furthermore, we extend LDC to Adaptive LDC (ALDC) by adding an adaptive policy to adjust the key compaction threshold to fit the changes of workloads’ features. The experimental results indicate that ALDC reduces the tail latency significantly and meanwhile achieves a much higher and stable throughput compared with existing approaches.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • An Experimental Study of State-of-the-Art Entity Alignment Approaches

      Authors: Xiang Zhao; Weixin Zeng; Jiuyang Tang; Wei Wang; Fabian M. Suchanek
      Pages: 2610 - 2625
      Abstract: Entity alignment (EA) finds equivalent entities that are located in different knowledge graphs (KGs), which is an essential step to enhance the quality of KGs, and hence of significance to downstream applications (e.g., question answering and recommendation). Recent years have witnessed a rapid increase of EA approaches, yet the relative performance of them remains unclear, partly due to the incomplete empirical evaluations, as well as the fact that comparisons were carried out under different settings (i.e., datasets, information used as input, etc.). In this paper, we fill in the gap by conducting a comprehensive evaluation and detailed analysis of state-of-the-art EA approaches. We first propose a general EA framework that encompasses all the current methods, and then group existing methods into three major categories. Next, we judiciously evaluate these solutions on a wide range of use cases, based on their effectiveness, efficiency and robustness. Finally, we construct a new EA dataset to mirror the real-life challenges of alignment, which were largely overlooked by existing literature. This study strives to provide a clear picture of the strengths and weaknesses of current EA approaches, so as to inspire quality follow-up research.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Anomaly Detection in Quasi-Periodic Time Series Based on Automatic Data Segmentation and Attentional LSTM-CNN

      Authors: Fan Liu; Xingshe Zhou; Jinli Cao; Zhu Wang; Tianben Wang; Hua Wang; Yanchun Zhang
      Pages: 2626 - 2640
      Abstract: Quasi-periodic time series (QTS) exists widely in the real world, and it is important to detect the anomalies of QTS. In this paper, we propose an automatic QTS anomaly detection framework (AQADF) consisting of a two-level clustering-based QTS segmentation algorithm (TCQSA) and a hybrid attentional LSTM-CNN model (HALCM). TCQSA first automatically splits the QTS into quasi-periods which are then classified by HALCM into normal periods or anomalies. Notably, TCQSA integrates a hierarchical clustering and the k-means technique, making itself highly universal and noise-resistant. HALCM hybridizes LSTM and CNN to simultaneously extract the overall variation trends and local features of QTS for modeling its fluctuation pattern. Furthermore, we embed a trend attention gate (TAG) into the LSTM, a feature attention mechanism (FAM) and a location attention mechanism (LAM) into the CNN to finely tune the extracted variation trends and local features according to their true importance to achieve a better representation of the fluctuation pattern of the QTS. On four public datasets, HALCM exceeds four state-of-the-art baselines and obtains at least 97.3 percent accuracy; TCQSA outperforms two cutting-edge QTS segmentation algorithms and can be applied to different types of QTSs. Additionally, the effectiveness of the attention mechanisms is quantitatively and qualitatively demonstrated.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Approximate Algorithms for Data-Driven Influence Limitation

      Authors: Sourav Medya; Arlei Silva; Ambuj Singh
      Pages: 2641 - 2652
      Abstract: Online social networks have become major battlegrounds for political campaigns, viral marketing, and the dissemination of news. As a consequence, “bad actors” are increasingly exploiting these platforms, which is a key challenge for their administrators, businesses and society in general. The spread of fake news is a classical example of the abuse of social networks by these bad actors. While some have advocated for stricter policies to control the spread of misinformation in social networks, this often happens to the detriment of their democratic and organic structure. In this paper, we aim to limit the influence of a target group in a social network via the removal of a few users/links. We formulate the influence limitation problem in a data-driven fashion, by taking into account past propagation traces. More specifically, our algorithms find critical edges to be removed in order to decrease the influence of a target group based on past data. The idea is to control the diffusion processes while minimizing the amount of disturbance in the network structure. Moreover, we consider two types of constraints over edge removals: a budget constraint and a more general set of matroid constraints. These problems lead to interesting challenges in terms of algorithm design. For instance, we are able to show that influence limitation is APX-hard and propose deterministic and probabilistic approximation algorithms for the budgeted and the matroid version of the problem, respectively. Experiments show that the proposed approaches outperform several baselines.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • AutoHash: Learning Higher-Order Feature Interactions for Deep CTR Prediction

      Authors: Niannan Xue; Bin Liu; Huifeng Guo; Ruiming Tang; Fengwei Zhou; Stefanos Zafeiriou; Yuzhou Zhang; Jun Wang; Zhenguo Li
      Pages: 2653 - 2666
      Abstract: Feature combinations are essential for the success of many web applications, such as personalised recommendation and online advertising. State-of-the-art methods usually model explicit feature interactions to help neural networks reduce the number of parameters and achieve better performance. However, their explicit feature interactions are often restricted to the second-order due to computational complexity. In this work, we propose efficient ways to represent explicit high-order feature combinations as well as prune redundant features in the meantime. To begin with, we make novel use of the Count Sketch algorithm within a DNN classifier such that high-order feature combinations can be compactly represented. After that, to combat the problem of redundant features which degrade the prediction performance, we introduce an adaptive hashing algorithm, AutoHash, which can automatically select meaningful features to interact at high orders according to the specific dataset in question. This is an AutoML approach. Experiments on three well-known public datasets demonstrate that AutoHash is significantly superior to state-of-the-art methods. Meanwhile, due to its efficient scheme of automatically selecting useful high-order feature interactions, AutoHash has less model complexity and can be trained in an end-to-end manner with less training time than state-of-the-art methods.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
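
      Illustration (not part of the abstract): the abstract mentions the Count Sketch algorithm as a way to represent feature combinations compactly. The sketch below shows only the generic Count Sketch data structure for approximate counting, where each key is hashed to one bucket per row with a random sign and the estimate is the median across rows. It does not reproduce how AutoHash embeds the sketch in a DNN classifier; the class name, table sizes, key format, and SHA-1 based hashing are assumptions for illustration.

```python
import hashlib
from statistics import median


class CountSketch:
    """Fixed-size table that approximates per-key totals using signed hashing."""

    def __init__(self, depth=5, width=64):
        self.depth = depth  # number of independent rows (odd, so the median is a table value)
        self.width = width  # buckets per row
        self.table = [[0] * width for _ in range(depth)]

    def _bucket_and_sign(self, row, key):
        # Derive both the bucket index and the +1/-1 sign from one digest per (row, key).
        digest = hashlib.sha1(f"{row}:{key}".encode("utf-8")).digest()
        bucket = int.from_bytes(digest[:4], "big") % self.width
        sign = 1 if digest[4] & 1 else -1
        return bucket, sign

    def update(self, key, count=1):
        """Add `count` occurrences of `key` to every row."""
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(row, key)
            self.table[row][bucket] += sign * count

    def estimate(self, key):
        """Median of the signed bucket values; colliding keys tend to cancel out."""
        values = []
        for row in range(self.depth):
            bucket, sign = self._bucket_and_sign(row, key)
            values.append(sign * self.table[row][bucket])
        return median(values)


if __name__ == "__main__":
    sketch = CountSketch()
    # Hypothetical key naming a second-order feature combination.
    sketch.update("city=Paris&device=mobile", 3)
    sketch.update("city=Paris&device=mobile", 2)
    print(sketch.estimate("city=Paris&device=mobile"))
```

      The table size is fixed in advance, so memory does not grow with the number of distinct feature combinations, at the cost of approximate values when keys collide.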
       
  • Collaborative List-and-Pairwise Filtering From Implicit Feedback

      Authors: Runlong Yu; Qi Liu; Yuyang Ye; Mingyue Cheng; Enhong Chen; Jianhui Ma
      Pages: 2667 - 2680
      Abstract: The implicit feedback based collaborative filtering (CF) has attracted much attention in recent years, mainly because users implicitly express their preferences in many real-world scenarios. The current mainstream pairwise methods optimize the Area Under the Curve (AUC) and are empirically proved to be helpful to exploit binary relevance data, but they either do not address the ranking problem or do not specifically focus on top-$k$ recommendation. Although there exists a listwise method that maximizes the Mean Reciprocal Rank (MRR), it has low efficiency and is not particularly adequate for general implicit feedback situations. To that end, in this paper, we propose a new framework, namely Collaborative List-and-Pairwise Filtering (CLAPF), which aims to introduce pairwise thinking into listwise methods. Specifically, we smooth another well-known rank-biased measure called Mean Average Precision (MAP), and respectively combine two rank-biased metrics (MAP, MRR) with the pairwise objective function to capture the performance of top-$k$ recommendation. Furthermore, the sampling scheme for CLAPF is discussed to accelerate the convergence speed. Our CLAPF framework is a new hybrid model that provides an idea of utilizing rank-biased measures in a pairwise way on implicit feedback. Empirical studies demonstrate that CLAPF outperforms state-of-the-art approaches on real-world datasets.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Deep Learning for Adverse Event Detection From Web Search

      Authors: Faizan Ahmad;Ahmed Abbasi;Brent Kitchens;Donald Adjeroh;Daniel Zeng;
      Pages: 2681 - 2695
      Abstract: Adverse event detection is critical for many real-world applications including timely identification of product defects, disasters, and major socio-political incidents. In the health context, adverse drug events account for countless hospitalizations and deaths annually. Since users often begin their information seeking and reporting with online searches, examination of search query logs has emerged as an important detection channel. However, search context, including query intent and heterogeneity in user behaviors, is extremely important for extracting information from search queries, and yet the challenge of measuring and analyzing these aspects has precluded their use in prior studies. We propose DeepSAVE, a novel deep learning framework for detecting adverse events based on user search query logs. DeepSAVE uses an enriched variational autoencoder encompassing a novel query embedding and user modeling module that work in concert to address the context challenge associated with search-based detection of adverse events. Evaluation results on three large real-world event datasets show that DeepSAVE outperforms existing detection methods as well as comparison deep learning autoencoders. Ablation analysis reveals that each component of DeepSAVE significantly contributes to its overall performance. Collectively, the results demonstrate the viability of the proposed architecture for detecting adverse events from search query logs.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Detecting Hierarchical and Overlapping Network Communities Based on
           Opinion Dynamics

      Authors: Ren Ren;Jinliang Shao;Yuhua Cheng;Xiaofan Wang;
      Pages: 2696 - 2710
      Abstract: It is common for communities in real-world networks to possess hierarchical and overlapping structures, which make community detection even more challenging. In this paper, by investigating consensus process of the classical DeGroot model in opinion dynamics, we propose a novel method based on the cumulative opinion distance (COD) to discover hierarchical and overlapping communities. It is shown that this method is different from those classical algorithms relying on static fitness metrics that depict the inhomogeneous connectivity across the network. The proposed method is validated from two aspects. First, by estimating the eigenvectors of adjacency matrices, we investigate the detectability limit of our algorithms on random networks, which together with the results concerning the convergence speed of consensus guarantees the performance of our method theoretically. Second, experiments on both large scale real-world networks and artificial benchmarks show that our method is very effective and competitive on hierarchical modular graphs. In particular, it outperforms the state-of-the-art algorithms on overlapping community detection.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Detecting Statistically Significant Communities

      Authors: Zengyou He;Hao Liang;Zheng Chen;Can Zhao;Yan Liu;
      Pages: 2711 - 2725
      Abstract: Community detection is a key data analysis problem across different fields. During the past decades, numerous algorithms have been proposed to address this issue. However, most work on community detection does not address the issue of statistical significance. Although some research efforts have been made towards mining statistically significant communities, deriving an analytical solution of the $p$-value for one community under the configuration model is still a challenging mission that remains unsolved. The configuration model is a widely used random graph model in community detection, in which the degree of each node is preserved in the generated random networks. To partially fill this void, we present a tight upper bound on the $p$-value of a single community under the configuration model, which can be used for quantifying the statistical significance of each community analytically. Meanwhile, we present a local search method to detect statistically significant communities in an iterative manner. Experimental results demonstrate that our method is comparable with the competing methods on detecting statistically significant communities.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Efficient Algorithms for Kernel Aggregation Queries

      Authors: Tsz Nam Chan;Leong Hou U;Reynold Cheng;Man Lung Yiu;Shivansh Mittal;
      Pages: 2726 - 2739
      Abstract: Kernel functions support a broad range of applications that require tasks like density estimation, classification, regression or outlier detection. For these tasks, a common online operation is to compute the weighted aggregation of kernel function values with respect to a set of points. However, scalable aggregation methods are still unknown for typical kernel functions (e.g., Gaussian kernel, polynomial kernel, sigmoid kernel and additive kernels) and weighting schemes. In this paper, we propose a novel and effective bounding technique, by leveraging index structures, to speed up the computation of kernel aggregation. In addition, we extend our technique to additive kernel functions, including $\chi^2$, intersection, JS and Hellinger kernels, which are widely used in different communities, e.g., computer vision, medical science, Geoscience etc. To handle the additive kernel functions, we further develop the novel and effective bound functions to efficiently evaluate the kernel aggregation. Experimental studies on many real datasets reveal that our proposed solution KARL achieves at least one order of magnitude speedup over the state-of-the-art for different types of kernel functions.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Fast Streaming $k$-Means Clustering With Coreset Caching

      Authors: Yu Zhang;Kanat Tangwongsan;Srikanta Tirthapura;
      Pages: 2740 - 2754
      Abstract: We present new algorithms for $k$-means clustering on a data stream with a focus on providing fast responses to clustering queries. Compared to the state-of-the-art, our algorithms provide substantial improvements in the query time for cluster-center queries while retaining the desirable properties of provably small approximation error and low space usage. Our proposed clustering algorithms systematically reuse the “coresets” (summaries of data) computed for recent queries in answering the current clustering query, a novel technique which we refer to as coreset caching. We also present an algorithm called OnlineCC that integrates the coreset caching idea with a simple sequential streaming $k$-means algorithm. In practice, the OnlineCC algorithm can provide constant query time. We present both theoretical analysis and detailed experiments demonstrating the correctness, accuracy, and efficiency of all our proposed clustering algorithms.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Fine-Grained Urban Flow Inference

      Authors: Kun Ouyang;Yuxuan Liang;Ye Liu;Zekun Tong;Sijie Ruan;Yu Zheng;David S. Rosenblum;
      Pages: 2755 - 2770
      Abstract: Spatially fine-grained urban flow data is critical for smart city efforts. Though fine-grained information is desirable for applications, it demands much more resources for the underlying storage system compared to coarse-grained data. To bridge the gap between storage efficiency and data utility, in this paper, we aim to infer fine-grained flows throughout a city from their coarse-grained counterparts. This task exhibits two challenges: the spatial correlations between coarse- and fine-grained urban flows, and the complexities of external impacts. To tackle these issues, we develop a model entitled UrbanFM which consists of two major parts: 1) an inference network to generate fine-grained flow distributions from coarse-grained inputs that uses a feature extraction module and a novel distributional upsampling module; 2) a general fusion subnet to further boost the performance by considering the influence of different external factors. This structure provides outstanding effectiveness and efficiency for small scale upsampling. However, the single-pass upsampling used by UrbanFM is insufficient at higher upscaling rates. Therefore, we further present UrbanPy, a cascading model for progressive inference of fine-grained urban flows by decomposing the original tasks into multiple subtasks. Compared to UrbanFM, such an enhanced structure demonstrates favorable performance for larger-scale inference tasks.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Hypergraph Partitioning With Embeddings

      Authors: Justin Sybrandt;Ruslan Shaydulin;Ilya Safro;
      Pages: 2771 - 2782
      Abstract: Problems in scientific computing, such as distributing large sparse matrix operations, have analogous formulations as hypergraph partitioning problems. A hypergraph is a generalization of a traditional graph wherein “hyperedges” may connect any number of nodes. As a result, hypergraph partitioning is an NP-Hard problem to both solve and approximate. State-of-the-art algorithms that solve this problem follow the multilevel paradigm, which begins by iteratively “coarsening” the input hypergraph to smaller problem instances that share key structural features. Once an approximate problem small enough to be solved directly has been identified, its solution can be interpolated and refined to the original problem. While this strategy represents an excellent trade-off between quality and running time, it is sensitive to the coarsening strategy. In this work we propose using graph embeddings of the initial hypergraph in order to ensure that coarsened problem instances retain key structural features. Our approach prioritizes coarsening within self-similar regions of the input graph, and leads to significantly improved solution quality across a range of considered hypergraphs. Reproducibility: All source code, plots and experimental data are available at https://sybrandt.com/2019/partition.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • IncGraph: An Improved Distributed Incremental Graph Computing Model and
           Framework Based on Spark GraphX

      Authors: Zhuo Tang;Mengsi He;Zhongming Fu;Li Yang;
      Pages: 2783 - 2797
      Abstract: The excavated information will become obsolete when the data changes in dynamic graphs. To compute the up-to-date results, the graph algorithm has to re-compute the entire data from scratch, which will consume huge computation time and resources. To reduce the cost of such calculations, this paper proposes a model called IncGraph to support incremental iterative computation over dynamic graphs. Different from the way of traditional iteration, IncGraph executes the graph algorithm through reusing the results of the previous graph and performs computation on the part of the graph that has changed. IncGraph has two critical components: (1) an incremental iterative computation model that consists of two steps: an incremental step to calculate the results on the changed vertices of the graph, and a merge step to calculate the results on the entire graph by using the results of the previous graph and the incremental step; and (2) an incremental update method to accelerate the iterative process within the iterative graph algorithm. We implement the IncGraph model on GraphX and evaluate its performance by using several representative iterative graph algorithms: PageRank, Connected Components, and Single Source Shortest Path. The results show that, compared with the traditional iteration, when adding 100k vertices to data sets of different sizes, the performance optimization ratio of IncGraph is 31.79 percent on average and 50.2 percent at maximum; and when the percentage of added vertices varies from 0.01 to 10 percent in different data sets, the performance optimization ratio of IncGraph varies from 19.9 to 66.1 percent. Moreover, the result errors of IncGraph are small and can be neglected.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Item Recommendation for Word-of-Mouth Scenario in Social E-Commerce

      Authors: Chen Gao;Chao Huang;Donghan Yu;Haohao Fu;Tzu-Heng Lin;Depeng Jin;Yong Li;
      Pages: 2798 - 2809
      Abstract: Social commerce, which is different from traditional e-commerce where people purchase products via initiative searching or recommendations from the platform, transforms a social community into an inclusive place to do business by enabling people to share products with their friends. A user (sharer), can share a link of a product to their social-connected friends (receiver). Once a receiver purchases the product, the sharer can earn commission provided by the platform. To promote sales, the platform can also assist sharers by providing product candidates which are more likely to be purchased during the social sharing. We define this task of generating sharing suggestions as item recommendation for word-of-mouth scenario, and to the best of our knowledge, this is a new task that has never been explored. In this article, we propose a TriM (short for Triad based word-of-Mouth recommendation) model that can capture both the sharer’s influence and the receiver’s interest at the same time, which are two significant factors that determine whether the receiver will buy the product or not. Furthermore, with joint learning on two parts of interaction data to address data sparsity issue, our proposed TriM-Joint further improves the recommendation performance. By conducting experiments, we show that our proposed models achieve the best results compared to state-of-the-art models with significant improvements by at least $7.4\% \sim 14.4\%$ respectively.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Iterative Refinement for Multi-Source Visual Domain Adaptation

      Authors: Hanrui Wu;Yuguang Yan;Guosheng Lin;Min Yang;Michael K. Ng;Qingyao Wu;
      Pages: 2810 - 2823
      Abstract: One of the main challenges in multi-source domain adaptation is how to reduce the domain discrepancy between each source domain and a target domain, and then evaluate the domain relevance to determine how much knowledge should be transferred from different source domains to the target domain. However, most prior approaches barely consider both discrepancies and relevance among domains. In this paper, we propose an algorithm, called Iterative Refinement based on Feature Selection and the Wasserstein distance (IRFSW), to solve semi-supervised domain adaptation with multiple sources. Specifically, IRFSW aims to explore both the discrepancies and relevance among domains in an iterative learning procedure, which gradually refines the learning performance until the algorithm stops. In each iteration, for each source domain and the target domain, we develop a sparse model to select features in which the domain discrepancy and training loss are reduced simultaneously. Then a classifier is constructed with the selected features of the source and labeled target data. After that, we exploit optimal transport over the selected features to calculate the transferred weights. The weight values are taken as the ensemble weights to combine the learned classifiers to control the amount of knowledge transferred from source domains to the target domain. Experimental results validate the effectiveness of the proposed method.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • More Than Privacy: Applying Differential Privacy in Key Areas of
           Artificial Intelligence

      Authors: Tianqing Zhu;Dayong Ye;Wei Wang;Wanlei Zhou;Philip S. Yu;
      Pages: 2824 - 2843
      Abstract: Artificial Intelligence (AI) has attracted a great deal of attention in recent years. However, alongside all its advancements, problems have also emerged, such as privacy violations, security issues and model fairness. Differential privacy, as a promising mathematical model, has several attractive properties that can help solve these problems, making it quite a valuable tool. For this reason, differential privacy has been broadly applied in AI but to date, no study has documented which differential privacy mechanisms can or have been leveraged to overcome its issues or the properties that make this possible. In this paper, we show that differential privacy can do more than just privacy preservation. It can also be used to improve security, stabilize learning, build fair models, and impose composition in selected areas of AI. With a focus on regular machine learning, distributed machine learning, deep learning, and multi-agent systems, the purpose of this article is to deliver a new view on many possibilities for improving AI performance with differential privacy techniques.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Neighborhood Matters: Influence Maximization in Social Networks With
           Limited Access

      Authors: Chen Feng;Luoyi Fu;Bo Jiang;Haisong Zhang;Xinbing Wang;Feilong Tang;Guihai Chen;
      Pages: 2844 - 2859
      Abstract: Influence maximization (IM) aims at maximizing the spread of influence by offering discounts to influential users (called seeding). In many applications, due to user’s privacy concern, overwhelming network scale etc., it is hard to target any user in the network as one wishes. Instead, only a small subset of users is initially accessible. Such access limitation would significantly impair the influence spread, since IM often relies on seeding high degree users, which are particularly rare in such a small subset due to the power-law structure of social networks. In this paper, we attempt to solve the limited IM in real-world scenarios by the adaptive approach with seeding and diffusion uncertainty considered. Specifically, we consider fine-grained discounts and assume users accept the discount probabilistically. The diffusion process is depicted by the independent cascade model. To overcome the access limitation, we prove the set-wise friendship paradox (FP) phenomenon that neighbors have higher degree in expectation, and propose a two-stage seeding model with the FP embedded, where neighbors are seeded. On this basis, for comparison we formulate the non-adaptive case and adaptive case, both proven to be NP-hard. In the non-adaptive case, discounts are allocated to users all at once. We show the monotonicity of influence spread w.r.t. discount allocation and design a two-stage coordinate descent framework to decide the discount allocation. In the adaptive case, users are sequentially seeded based on observations of existing seeding and diffusion results. We prove the adaptive submodularity and submodularity of the influence spread function in two stages. Then, a series of adaptive greedy algorithms are proposed with constant approximation ratio. Extensive experiments on real-world datasets show that our adaptive algorithms achieve larger influence spread than non-adaptive and other adaptive algorithms (up to a maximum of 116 percent).
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Optimal Estimation of Low-Rank Factors via Feature Level Data Fusion of
           Multiplex Signal Systems

      Authors: Hui-Jia Li;Zhen Wang;Jie Cao;Jian Pei;Yong Shi;
      Pages: 2860 - 2871
      Abstract: The design of fusion engines is a subject of great importance in a variety of fields. In this paper, we focus on the problem of linear fusion at the feature level for multiple signal matrices with noises, with the features being extremal eigenvectors. When given multiple similarity matrices, the objective is to find an estimate of the latent signal eigenspace. The concentration result for the inner product of features from different matrix samples is developed, utilizing the random matrix theory. Based on the theoretical results, we propose an efficient algorithm, EigFuse, to solve the constrained data-driven optimization problem with different levels of noise. Compared with state-of-the-art baseline approaches at multiple noise levels, our method is highly efficient. Comprehensive experiments on several synthetic as well as real-life networks demonstrate our method’s superior performance.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Optimal Neighborhood Multiple Kernel Clustering With Adaptive Local
           Kernels

      Authors: Jiyuan Liu;Xinwang Liu;Jian Xiong;Qing Liao;Sihang Zhou;Siwei Wang;Yuexiang Yang;
      Pages: 2872 - 2885
      Abstract: Multiple kernel clustering (MKC) algorithm aims to group data into different categories by optimally integrating information from a group of pre-specified kernels. Though demonstrating superiorities in various applications, we observe that existing MKC algorithms usually do not sufficiently consider the local density around individual data samples and excessively limit the representation capacity of the learned optimal kernel, leading to unsatisfying performance. In this paper, we propose an algorithm, called optimal neighborhood MKC with adaptive local kernels (ON-ALK), to address the two issues. Specifically, we construct adaptive local kernels to sufficiently consider the local density around individual data samples, where different numbers of neighbors are discriminatingly selected on each sample. Further, the proposed ON-ALK algorithm boosts the representation of the learned optimal kernel via relaxing it into the neighborhood area of weighted combination of the pre-specified kernels. To solve the resultant optimization problem, a three-step iterative algorithm is designed and theoretically proven to be convergent. After that, we also study the generalization bound of the proposed algorithm. Extensive experiments have been conducted to evaluate the clustering performance. As indicated, the algorithm significantly outperforms state-of-the-art methods in the recent literature on six challenging benchmark datasets, verifying its advantages and effectiveness.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Representation Learning From Limited Educational Data With Crowdsourced
           Labels

      Authors: Wentao Wang;Guowei Xu;Wenbiao Ding;Gale Yan Huang;Guoliang Li;Jiliang Tang;Zitao Liu;
      Pages: 2886 - 2898
      Abstract: Representation learning has been proven to play an important role in the unprecedented success of machine learning models in numerous tasks, such as machine translation, face recognition and recommendation. The majority of existing representation learning approaches often require a large number of consistent and noise-free labels. However, due to various reasons such as budget constraints and privacy concerns, labels are very limited in many real-world scenarios. Directly applying standard representation learning approaches on small labeled data sets will easily run into over-fitting problems and lead to sub-optimal solutions. Even worse, in some domains such as education, the limited labels are usually annotated by multiple workers with diverse expertise, which yields noise and inconsistency in such crowdsourcing settings. In this paper, we propose a novel framework which aims to learn effective representations from limited data with crowdsourced labels. Specifically, we design a grouping based deep neural network to learn embeddings from a limited number of training samples and present a Bayesian confidence estimator to capture the inconsistency among crowdsourced labels. Furthermore, to expedite the training process, we develop a hard example selection procedure to adaptively pick up training examples that are misclassified by the model. Extensive experiments conducted on three real-world data sets demonstrate the superiority of our framework on learning representations from limited data with crowdsourced labels, compared with various state-of-the-art baselines. In addition, we provide a comprehensive analysis on each of the main components of our proposed framework and also introduce the promising results it achieved in our real production to fully understand the proposed framework. To encourage reproducible results, we make our code available online at https://github.com/tal-ai/RECLE.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • sCOs: Semi-Supervised Co-Selection by a Similarity Preserving Approach

      Authors: Khalid Benabdeslem;Dou El Kefel Mansouri;Raywat Makkhongkaew;
      Pages: 2899 - 2911
      Abstract: In this paper, we focus on co-selection of instances and features in the semi-supervised learning scenario. In this context, co-selection becomes a more challenging problem as data contain labeled and unlabeled examples sampled from the same population. To carry out such semi-supervised co-selection, we propose a unified framework, called sCOs, which efficiently integrates labeled and unlabeled parts into the co-selection process. The framework is based on introducing both a sparse regularization term and a similarity preserving approach. It evaluates the usefulness of features and instances in order to select the most relevant ones, simultaneously. We propose two efficient algorithms that work for both convex and nonconvex functions. To the best of our knowledge, this paper offers, for the first time ever, a study utilizing nonconvex penalties for the co-selection of semi-supervised learning tasks. Experimental results on some known benchmark datasets are provided for validating sCOs and comparing it with some representative state-of-the-art methods.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Semi-Supervised Learning With the EM Algorithm: A Comparative Study
           Between Unstructured and Structured Prediction

      Authors: Wenchong He;Zhe Jiang;
      Pages: 2912 - 2920
      Abstract: Semi-supervised learning aims to learn prediction models from both labeled and unlabeled samples. There has been extensive research in this area. Among existing work, generative mixture models with Expectation-Maximization (EM) are a popular method due to their clear statistical properties. However, existing literature on EM-based semi-supervised learning largely focuses on unstructured prediction, assuming that samples are independent and identically distributed. Studies on EM-based semi-supervised approaches in structured prediction are limited. This article aims to fill the gap through a comparative study between unstructured and structured methods in EM-based semi-supervised learning. Specifically, we compare their theoretical properties and find that both methods can be considered as a generalization of self-training with soft class assignment of unlabeled samples, but the structured method additionally considers structural constraint in soft class assignment. We conducted a case study on real-world flood mapping datasets to compare the two methods. Results show that structured EM is more robust to class confusion caused by noise and obstacles in features in the context of the flood mapping application.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Social Recommendation With Characterized Regularization

      Authors: Chen Gao;Nian Li;Tzu-Heng Lin;Dongsheng Lin;Jun Zhang;Yong Li;Depeng Jin;
      Pages: 2921 - 2933
      Abstract: Social recommendation, which utilizes social relations to enhance recommender systems, has been gaining increasing attention recently with the rapid development of online social networks. Existing social recommendation methods are based on the assumption, so-called social-trust, that users’ preference or decision is influenced by their social-connected friends’ purchase behaviors. However, they assume that the influences of social relationships are always the same, which violates the fact that users are likely to share preference on different products with different friends. More precisely, friends’ behaviors do not necessarily affect a user’s preferences, and the influence is diverse among different items. In this paper, we contribute a new solution, CSR (short for Characterized Social Regularization) model by designing a universal regularization term for modeling variable social influence. This regularization term captures the finely grained similarity of social-connected friends. We further introduce two variants of our model with different optimization manners. Our proposed model can be applied to both explicit and implicit interaction due to its high generality. Extensive experiments on three real-world datasets demonstrate that our CSR can outperform state-of-the-art social recommendation methods. Further experiments show that CSR can improve recommendation performance for those users with sparse social relations or behavioral interactions.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Spiral of Silence and Its Application in Recommender Systems

      Authors: Chen Lin;Dugang Liu;Hanghang Tong;Yanghua Xiao;
      Pages: 2934 - 2947
      Abstract: It is crucial to model missing ratings in recommender systems since user preferences learnt from only observed ratings are biased. One possible explanation for missing ratings is motivated by the spiral of silence theory. When the majority opinion is formed, a spiral process is triggered where users are more and more likely to show their ratings if they perceive that they are supported by the opinion climate. In this paper we first verify the existence of the spiral process in recommender systems by using a variety of different real-life datasets. We then study the characteristics of two key factors in the spiral process: opinion climate and the hardcore users who will give ratings even when they are minority opinion holders. Based on our empirical findings, we develop four variants to model missing ratings. They mimic different components of the spiral of silence based on the spiral process with global opinion climate, local opinion climate, hardcore users, relationships between hardcore users and items, respectively. We experimentally show that, the presented variants all outperform state-of-the-art recommendation models with missing rating components.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Tensor Canonical Correlation Analysis Networks for Multi-View Remote
           Sensing Scene Recognition

      Authors: Xinghao Yang;Weifeng Liu;Wei Liu;
      Pages: 2948 - 2961
      Abstract: The convolutional neural network (CNN) has proven to be an effective way to extract high-level features from remote sensing (RS) images automatically. Many variants of the CNN model have been proposed, including principal component analysis network (PCANet), canonical correlation analysis network (CCANet), multiple scale CCANet (MS-CCANet) and multiview CCANet (MCCANet). The PCANet is specialized for single view feature abstraction, while in many real-world practices, the RS data are frequently observed from many more views. Although CCANet, MS-CCANet and MCCANet can be applied to two or more view data, they consider only the pair-wise correlation by calculating a series of two-order covariance matrices. However, the high-order consistency, which can only be explored by collectively and simultaneously examining all views, remains undiscovered. In this paper, we propose the tensor canonical correlation analysis network (TCCANet) to tackle this problem. Particularly, TCCANet learns filter banks by simultaneously maximizing the high-order correlation among an arbitrary number of views and solves the optimization problem by decomposing a covariance tensor. After the convolutional stage, we utilize binarization and block-wise histogram strategies to generate the final feature. Furthermore, we also develop a Multiple Scale version of TCCANet, i.e., MS-TCCANet, to extract enriched representation of the RS data by incorporating all previous convolutional layers. Numerical experiment results on RSSCN7 and SAT-6 datasets demonstrate the advantages of TCCANet and MS-TCCANet for RS scene recognition.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • The Dynamic Privacy-Preserving Mechanisms for Online Dynamic Social
           Networks

      Authors: Tianqing Zhu;Jin Li;Xiangyu Hu;Ping Xiong;Wanlei Zhou;
      Pages: 2962 - 2974
      Abstract: Networks that constantly transmit information and change structure are becoming increasingly prevalent. However, traditional privacy models are designed to protect static information, such as records in a database or a person’s profile information, which seldom changes. This conflict between static models and dynamic environments is dramatically hindering the effectiveness and efficiency of privacy preservation in today’s dynamic world. Hence, in this paper, we formally define the concept of dynamic privacy, present two novel perspectives, privacy propagation and accumulation, on the way private information can spread through dynamic cyberspace, and develop associated theories and mechanisms for preserving privacy in advanced complex networks, such as social networking sites where data are constantly being released, shared, and exchanged.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Toward Predicting Active Participants in Tweet Streams: A Case Study on
           Two Civil Rights Events

      Authors: Xiao-Kun Wu;Tian-Fang Zhao;Wei-Neng Chen;Jun Zhang;
      Pages: 2975 - 2987
      Abstract: Online social media have aroused much research interest in recent years. In contrast to previous work that focused on the detection of emerging topics, this article undertakes the prediction of active users in online social events, which is so far rarely explored. This prediction task is formulated as a binary classification problem that is built on real-world tweet streams, taking the Ferguson event and the New York chokehold event as examples. Then, a comprehensive user feature system is designed to characterize the events’ online participants, which includes not only basic statistical characteristics and image-pixel-level features, but also some emotional features and personality features. Next, the Weighted Random Forest (Weighted-RF) classifier is adopted to solve the classification problem. Based on the user feature system and the classifier, the experience of a previous event can be archived and applied to the prediction of later similar events. Experimental results show that the Weighted-RF trained on samples from the Ferguson event can effectively predict active users in the NYC event, with an AUC value around 0.8392. Besides, the image-content based personality model provides a new tool for depicting user portraits, which further contributes to the quantitative analysis of online social events.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • T-PAIR: Temporal Node-Pair Embedding for Automatic Biomedical Hypothesis
           Generation

      Authors: Uchenna Akujuobi;Michael Spranger;Sucheendra K. Palaniappan;Xiangliang Zhang;
      Pages: 2988 - 3001
      Abstract: In this paper, we study an automatic hypothesis generation (HG) problem, which refers to the discovery of meaningful implicit connections between scientific terms, including but not limited to diseases, chemicals, drugs, and genes extracted from databases of biomedical publications. Most prior studies of this problem focused on the use of static information of terms and largely ignored the temporal dynamics of scientific term relations. Even when the dynamics were considered in a few recent studies, they learned the representations for the scientific terms, rather than focusing on the term-pair relations. Since the HG problem is to predict term-pair connections, it is not enough to know with whom the terms are connected; it is more important to know how the connections have been formed (in a dynamic process). We formulate this HG problem as a future connectivity prediction in a dynamic attributed graph. The key is to capture the temporal evolution of node-pair (term-pair) relations. We propose an inductive edge (node-pair) embedding method named T-PAIR, utilizing both the graphical structure and node attribute to encode the temporal node-pair relationship. We demonstrate the efficiency of the proposed model on three real-world datasets, which are three graphs constructed from PubMed papers published until 2019 in Neurology, Immunotherapy, and Virology, respectively. Evaluations were conducted on predicting future term-pair relations between millions of seen terms (in the transductive setting), as well as on the relations involving unseen terms (in the inductive setting). Experiment results and case study analyses show the effectiveness of the proposed model.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Unsupervised Feature Learning Architecture With Multi-Clustering
           Integration RBM

      Authors: Jielei Chu;Hongjun Wang;Jing Liu;Zhiguo Gong;Tianrui Li;
      Pages: 3002 - 3015
      Abstract: In this paper, we present a novel unsupervised feature learning architecture, which consists of a multi-clustering integration module and a variant of RBM termed multi-clustering integration RBM (MIRBM). In the multi-clustering integration module, we apply three clusterers (K-means, affinity propagation and spectral clustering algorithms) to obtain three different clustering partitions (CPs) without any background knowledge or label. Then, a unanimous voting strategy is used to generate a local clustering partition (LCP). The novel MIRBM model is a core feature encoding part of the proposed unsupervised feature learning architecture. Its novelty is that the LCP, as an unsupervised guidance, is integrated into one-step contrastive divergence ($\mathtt{CD}_{1}$) learning to guide the distribution of the hidden layer features. For instances in the same LCP cluster, the hidden and reconstructed hidden layer features of the MIRBM model in the proposed architecture tend to constrict together in the training process. Meanwhile, the LCP centers tend to disperse from each other as much as possible in the hidden and reconstructed hidden layers during training. The experiments demonstrate that the proposed unsupervised feature learning architecture has more powerful feature representation and generalization capability than the state-of-the-art models for clustering tasks in the Microsoft Research Asia Multimedia (MSRA-MM) 2.0 dataset.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
  • Unsupervised Spectral Feature Selection With Dynamic Hyper-Graph Learning

      Authors: Xiaofeng Zhu;Shichao Zhang;Yonghua Zhu;Pengfei Zhu;Yue Gao;
      Pages: 3016 - 3028
      Abstract: Unsupervised spectral feature selection (USFS) methods could output interpretable and discriminative results by embedding a Laplacian regularizer in the framework of sparse feature selection to keep the local similarity of the training samples. To do this, USFS methods usually construct the Laplacian matrix using either a general-graph or a hyper-graph on the original data. Usually, a general-graph could measure the relationship between two samples while a hyper-graph could measure the relationship among no less than two samples. Obviously, the general-graph is a special case of the hyper-graph and the hyper-graph may capture more complex structure of samples than the general-graph. However, in previous USFS methods, the construction of the Laplacian matrix is separated from the process of feature selection. Moreover, the original data usually contain noise. Each of these issues makes it difficult to output reliable feature selection models. In this paper, we propose a novel feature selection method by dynamically constructing a hyper-graph based Laplacian matrix in the framework of sparse feature selection. Experimental results on real datasets showed that our proposed method outperformed the state-of-the-art methods in terms of both clustering and segmentation tasks.
      PubDate: June 1 2022
      Issue No: Vol. 34, No. 6 (2022)
       
 
JournalTOCs
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762
 

