
  Subjects -> ELECTRONICS (Total: 207 journals)
Showing 1 - 200 of 207 Journals sorted by number of followers
IEEE Transactions on Aerospace and Electronic Systems     Hybrid Journal   (Followers: 313)
Control Systems     Hybrid Journal   (Followers: 253)
IEEE Transactions on Geoscience and Remote Sensing     Hybrid Journal   (Followers: 201)
Journal of Guidance, Control, and Dynamics     Hybrid Journal   (Followers: 197)
Electronics     Open Access   (Followers: 138)
Advances in Electronics     Open Access   (Followers: 133)
Electronic Design     Partially Free   (Followers: 129)
Electronics For You     Partially Free   (Followers: 128)
IEEE Antennas and Propagation Magazine     Hybrid Journal   (Followers: 120)
IEEE Power Electronics Magazine     Full-text available via subscription   (Followers: 91)
IEEE Transactions on Power Electronics     Hybrid Journal   (Followers: 89)
IEEE Antennas and Wireless Propagation Letters     Hybrid Journal   (Followers: 88)
IEEE Transactions on Software Engineering     Hybrid Journal   (Followers: 84)
IEEE Transactions on Industrial Electronics     Hybrid Journal   (Followers: 84)
IEEE Transactions on Antennas and Propagation     Full-text available via subscription   (Followers: 81)
IET Power Electronics     Open Access   (Followers: 70)
IEEE Transactions on Automatic Control     Hybrid Journal   (Followers: 67)
Selected Topics in Applied Earth Observations and Remote Sensing, IEEE Journal of     Hybrid Journal   (Followers: 63)
IEEE Embedded Systems Letters     Hybrid Journal   (Followers: 62)
IEEE Transactions on Industry Applications     Hybrid Journal   (Followers: 58)
IEEE Journal of Emerging and Selected Topics in Power Electronics     Hybrid Journal   (Followers: 53)
Canadian Journal of Remote Sensing     Full-text available via subscription   (Followers: 53)
Advances in Power Electronics     Open Access   (Followers: 49)
IEEE Nanotechnology Magazine     Hybrid Journal   (Followers: 45)
IEEE Transactions on Consumer Electronics     Hybrid Journal   (Followers: 45)
Journal of Electrical and Electronics Engineering Research     Open Access   (Followers: 41)
IEEE Transactions on Biomedical Engineering     Hybrid Journal   (Followers: 35)
IEEE Transactions on Circuits and Systems for Video Technology     Hybrid Journal   (Followers: 34)
IET Microwaves, Antennas & Propagation     Open Access   (Followers: 34)
Journal of Physics B: Atomic, Molecular and Optical Physics     Hybrid Journal   (Followers: 32)
American Journal of Electrical and Electronic Engineering     Open Access   (Followers: 30)
IEEE Transactions on Information Theory     Hybrid Journal   (Followers: 28)
Electronics Letters     Open Access   (Followers: 28)
Bell Labs Technical Journal     Hybrid Journal   (Followers: 27)
Microelectronics and Solid State Electronics     Open Access   (Followers: 27)
International Journal of Power Electronics     Hybrid Journal   (Followers: 24)
International Journal of Aerospace Innovations     Full-text available via subscription   (Followers: 24)
Journal of Sensors     Open Access   (Followers: 23)
International Journal of Image, Graphics and Signal Processing     Open Access   (Followers: 22)
IEEE Reviews in Biomedical Engineering     Hybrid Journal   (Followers: 20)
IEEE/OSA Journal of Optical Communications and Networking     Hybrid Journal   (Followers: 19)
IEEE Transactions on Electron Devices     Hybrid Journal   (Followers: 18)
Journal of Artificial Intelligence     Open Access   (Followers: 18)
Journal of Power Electronics & Power Systems     Full-text available via subscription   (Followers: 17)
IET Wireless Sensor Systems     Open Access   (Followers: 17)
Circuits and Systems     Open Access   (Followers: 16)
Archives of Electrical Engineering     Open Access   (Followers: 15)
International Journal of Control     Hybrid Journal   (Followers: 14)
IEEE Transactions on Signal and Information Processing over Networks     Hybrid Journal   (Followers: 14)
International Journal of Advanced Research in Computer Science and Electronics Engineering     Open Access   (Followers: 14)
IEEE Women in Engineering Magazine     Hybrid Journal   (Followers: 13)
Advances in Microelectronic Engineering     Open Access   (Followers: 13)
IEEE Solid-State Circuits Magazine     Hybrid Journal   (Followers: 13)
Machine Learning with Applications     Full-text available via subscription   (Followers: 12)
Intelligent Transportation Systems Magazine, IEEE     Full-text available via subscription   (Followers: 12)
IEEE Transactions on Broadcasting     Hybrid Journal   (Followers: 12)
IEEE Transactions on Learning Technologies     Full-text available via subscription   (Followers: 12)
IEICE - Transactions on Electronics     Full-text available via subscription   (Followers: 11)
International Journal of Sensors, Wireless Communications and Control     Hybrid Journal   (Followers: 11)
International Journal of Microwave and Wireless Technologies     Hybrid Journal   (Followers: 11)
International Journal of Advanced Electronics and Communication Systems     Open Access   (Followers: 11)
Journal of Low Power Electronics     Full-text available via subscription   (Followers: 11)
Open Journal of Antennas and Propagation     Open Access   (Followers: 10)
Solid-State Electronics     Hybrid Journal   (Followers: 10)
International Journal of Advances in Telecommunications, Electrotechnics, Signals and Systems     Open Access   (Followers: 10)
IETE Journal of Research     Open Access   (Followers: 10)
Batteries     Open Access   (Followers: 9)
Electronics and Communications in Japan     Hybrid Journal   (Followers: 9)
International Journal of Wireless and Microwave Technologies     Open Access   (Followers: 9)
IETE Technical Review     Open Access   (Followers: 9)
Nature Electronics     Hybrid Journal   (Followers: 9)
Journal of Signal and Information Processing     Open Access   (Followers: 9)
APSIPA Transactions on Signal and Information Processing     Open Access   (Followers: 8)
IEEE Journal of the Electron Devices Society     Open Access   (Followers: 8)
International Journal of Electronics and Telecommunications     Open Access   (Followers: 8)
Journal of Electromagnetic Waves and Applications     Hybrid Journal   (Followers: 8)
China Communications     Full-text available via subscription   (Followers: 8)
Superconductivity     Full-text available via subscription   (Followers: 8)
IEEE Transactions on Autonomous Mental Development     Hybrid Journal   (Followers: 8)
Journal of Low Power Electronics and Applications     Open Access   (Followers: 8)
International Journal of Antennas and Propagation     Open Access   (Followers: 8)
Journal of Electronic Design Technology     Full-text available via subscription   (Followers: 8)
Advances in Electrical and Electronic Engineering     Open Access   (Followers: 8)
Universal Journal of Electrical and Electronic Engineering     Open Access   (Followers: 7)
Power Electronic Devices and Components     Open Access   (Followers: 7)
Foundations and Trends® in Signal Processing     Full-text available via subscription   (Followers: 7)
Nanotechnology, Science and Applications     Open Access   (Followers: 7)
IEEE Magnetics Letters     Hybrid Journal   (Followers: 7)
Progress in Quantum Electronics     Full-text available via subscription   (Followers: 7)
Foundations and Trends® in Communications and Information Theory     Full-text available via subscription   (Followers: 6)
Metrology and Measurement Systems     Open Access   (Followers: 6)
Advances in Biosensors and Bioelectronics     Open Access   (Followers: 6)
International Journal of Systems, Control and Communications     Hybrid Journal   (Followers: 6)
Kinetik : Game Technology, Information System, Computer Network, Computing, Electronics, and Control     Open Access   (Followers: 6)
International Journal of Electronics     Hybrid Journal   (Followers: 6)
IEICE - Transactions on Information and Systems     Full-text available via subscription   (Followers: 6)
Research & Reviews : Journal of Embedded System & Applications     Full-text available via subscription   (Followers: 6)
Journal of Power Electronics     Hybrid Journal   (Followers: 6)
Annals of Telecommunications     Hybrid Journal   (Followers: 6)
Electronic Markets     Hybrid Journal   (Followers: 6)
Energy Storage Materials     Full-text available via subscription   (Followers: 6)
IEEE Transactions on Services Computing     Hybrid Journal   (Followers: 5)
International Journal of Computational Vision and Robotics     Hybrid Journal   (Followers: 5)
Journal of Optoelectronics Engineering     Open Access   (Followers: 5)
Journal of Electromagnetic Analysis and Applications     Open Access   (Followers: 5)
Journal of Field Robotics     Hybrid Journal   (Followers: 5)
Journal of Electronics (China)     Hybrid Journal   (Followers: 5)
Batteries & Supercaps     Hybrid Journal   (Followers: 5)
IEEE Pulse     Hybrid Journal   (Followers: 5)
Journal of Microelectronics and Electronic Packaging     Hybrid Journal   (Followers: 4)
Networks: an International Journal     Hybrid Journal   (Followers: 4)
EPE Journal : European Power Electronics and Drives     Hybrid Journal   (Followers: 4)
Advanced Materials Technologies     Hybrid Journal   (Followers: 4)
Frontiers in Electronics     Open Access   (Followers: 4)
Wireless and Mobile Technologies     Open Access   (Followers: 4)
Synthesis Lectures on Power Electronics     Full-text available via subscription   (Followers: 4)
Journal of Energy Storage     Full-text available via subscription   (Followers: 4)
IEEE Transactions on Haptics     Hybrid Journal   (Followers: 4)
Journal of Electrical Engineering & Electronic Technology     Hybrid Journal   (Followers: 4)
Journal of Circuits, Systems, and Computers     Hybrid Journal   (Followers: 4)
International Journal of Review in Electronics & Communication Engineering     Open Access   (Followers: 4)
Electronic Materials Letters     Hybrid Journal   (Followers: 4)
Journal of Biosensors & Bioelectronics     Open Access   (Followers: 4)
Biomedical Instrumentation & Technology     Hybrid Journal   (Followers: 4)
IJEIS (Indonesian Journal of Electronics and Instrumentation Systems)     Open Access   (Followers: 3)
Informatik-Spektrum     Hybrid Journal   (Followers: 3)
IEEE Journal on Exploratory Solid-State Computational Devices and Circuits     Hybrid Journal   (Followers: 3)
International Journal of Numerical Modelling: Electronic Networks, Devices and Fields     Hybrid Journal   (Followers: 3)
Advancing Microelectronics     Hybrid Journal   (Followers: 3)
International Journal of Applied Electronics in Physics & Robotics     Open Access   (Followers: 3)
IETE Journal of Education     Open Access   (Followers: 3)
Superconductor Science and Technology     Hybrid Journal   (Followers: 3)
Sensors International     Open Access   (Followers: 3)
e-Prime : Advances in Electrical Engineering, Electronics and Energy     Open Access   (Followers: 3)
EPJ Quantum Technology     Open Access   (Followers: 3)
Frontiers of Optoelectronics     Hybrid Journal   (Followers: 3)
Transactions on Electrical and Electronic Materials     Hybrid Journal   (Followers: 2)
ACS Applied Electronic Materials     Open Access   (Followers: 2)
IET Smart Grid     Open Access   (Followers: 2)
Energy Storage     Hybrid Journal   (Followers: 2)
Journal of Microwave Power and Electromagnetic Energy     Hybrid Journal   (Followers: 2)
Australian Journal of Electrical and Electronics Engineering     Hybrid Journal   (Followers: 2)
Journal of Information and Telecommunication     Open Access   (Followers: 2)
TELKOMNIKA (Telecommunication, Computing, Electronics and Control)     Open Access   (Followers: 2)
Journal of Semiconductors     Full-text available via subscription   (Followers: 2)
Radiophysics and Quantum Electronics     Hybrid Journal   (Followers: 2)
International Transaction of Electrical and Computer Engineers System     Open Access   (Followers: 2)
Journal of Intelligent Procedures in Electrical Technology     Open Access   (Followers: 2)
Sensing and Imaging : An International Journal     Hybrid Journal   (Followers: 2)
Security and Communication Networks     Hybrid Journal   (Followers: 2)
Journal of Nuclear Cardiology     Hybrid Journal   (Followers: 2)
ECTI Transactions on Electrical Engineering, Electronics, and Communications     Open Access   (Followers: 1)
IET Energy Systems Integration     Open Access   (Followers: 1)
Majalah Ilmiah Teknologi Elektro : Journal of Electrical Technology     Open Access   (Followers: 1)
International Journal of Granular Computing, Rough Sets and Intelligent Systems     Hybrid Journal   (Followers: 1)
IEEE Letters on Electromagnetic Compatibility Practice and Applications     Hybrid Journal   (Followers: 1)
Journal of Computational Intelligence and Electronic Systems     Full-text available via subscription   (Followers: 1)
Електротехніка і Електромеханіка     Open Access   (Followers: 1)
Open Electrical & Electronic Engineering Journal     Open Access   (Followers: 1)
IEEE Journal of Electromagnetics, RF and Microwaves in Medicine and Biology     Hybrid Journal   (Followers: 1)
Journal of Advanced Dielectrics     Open Access   (Followers: 1)
Transactions on Cryptographic Hardware and Embedded Systems     Open Access   (Followers: 1)
International Journal of Hybrid Intelligence     Hybrid Journal   (Followers: 1)
Ural Radio Engineering Journal     Open Access   (Followers: 1)
IET Cyber-Physical Systems : Theory & Applications     Open Access   (Followers: 1)
Edu Elektrika Journal     Open Access   (Followers: 1)
Power Electronics and Drives     Open Access   (Followers: 1)
Automatika : Journal for Control, Measurement, Electronics, Computing and Communications     Open Access  
npj Flexible Electronics     Open Access  
Elektronika ir Elektrotechnika     Open Access  
Emitor : Jurnal Teknik Elektro     Open Access  
IEEE Solid-State Circuits Letters     Hybrid Journal  
IEEE Open Journal of Industry Applications     Open Access  
IEEE Open Journal of the Industrial Electronics Society     Open Access  
IEEE Open Journal of Circuits and Systems     Open Access  
Journal of Electronic Science and Technology     Open Access  
Solid State Electronics Letters     Open Access  
Industrial Technology Research Journal Phranakhon Rajabhat University     Open Access  
Journal of Engineered Fibers and Fabrics     Open Access  
Jurnal Teknologi Elektro     Open Access  
IET Nanodielectrics     Open Access  
Elkha : Jurnal Teknik Elektro     Open Access  
JAREE (Journal on Advanced Research in Electrical Engineering)     Open Access  
Jurnal Teknik Elektro     Open Access  
IACR Transactions on Symmetric Cryptology     Open Access  
Acta Electronica Malaysia     Open Access  
Bioelectronics in Medicine     Hybrid Journal  
Chinese Journal of Electronics     Open Access  
Problemy Peredachi Informatsii     Full-text available via subscription  
Technical Report Electronics and Computer Engineering     Open Access  
Jurnal Rekayasa Elektrika     Open Access  
Facta Universitatis, Series : Electronics and Energetics     Open Access  
Visión Electrónica : algo más que un estado sólido     Open Access  
Telematique     Open Access  
International Journal of Nanoscience     Hybrid Journal  
International Journal of High Speed Electronics and Systems     Hybrid Journal  
Semiconductors and Semimetals     Full-text available via subscription  

IEEE Transactions on Circuits and Systems for Video Technology
Journal Prestige (SJR): 0.977
Citation Impact (CiteScore): 5
Number of Followers: 34  
 
  Hybrid Journal (it can contain Open Access articles)
ISSN (Print) 1051-8215
Published by IEEE  [228 journals]
  • IEEE Transactions on Circuits and Systems for Video Technology publication
           information

      Abstract: Presents a listing of the editorial board, board of governors, current staff, committee members, and/or society editors for this issue of the publication.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • IEEE Circuits and Systems Society information for authors

      Abstract: These instructions give guidelines for preparing papers for this publication. Presents information for authors publishing in this journal.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Effective Tensor Completion via Element-Wise Weighted Low-Rank Tensor
           Train With Overlapping Ket Augmentation

      Authors: Yang Zhang;Yao Wang;Zhi Han;Xi’ai Chen;Yandong Tang;
      Pages: 7286 - 7300
      Abstract: Tensor completion methods based on the tensor train (TT) have the issues of inaccurate weight assignment and ineffective tensor augmentation pre-processing. In this work, we propose a novel tensor completion approach via the element-wise weighted technique. Accordingly, a novel formulation for tensor completion and an effective optimization algorithm, called tensor completion by parallel weighted matrix factorization via tensor train (TWMac-TT), is proposed. In addition, we specifically consider the recovery quality of edge elements from adjacent blocks. Different from traditional reshaping and ket augmentation, we utilize a new tensor augmentation technique called overlapping ket augmentation, which can further avoid blocking artifacts. We then conduct extensive performance evaluations on synthetic data and several real image data sets. Our experimental results demonstrate that the proposed algorithm TWMac-TT outperforms several other competing tensor completion methods. The code is available at https://github.com/yzcv/TWMac-TT-OKA.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
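      As a point of reference for the tensor-train machinery the paper builds on, here is a minimal NumPy sketch of the classical TT-SVD decomposition and its reconstruction (an illustrative reimplementation, not the authors' TWMac-TT code; max_rank is an assumed truncation parameter):

          import numpy as np

          def tt_svd(tensor, max_rank):
              # Split a d-way array into tensor-train (TT) cores by sequential SVD.
              dims, d = tensor.shape, tensor.ndim
              cores, r = [], 1
              mat = tensor
              for k in range(d - 1):
                  mat = mat.reshape(r * dims[k], -1)
                  U, S, Vt = np.linalg.svd(mat, full_matrices=False)
                  r_new = min(max_rank, len(S))
                  cores.append(U[:, :r_new].reshape(r, dims[k], r_new))
                  mat = np.diag(S[:r_new]) @ Vt[:r_new]
                  r = r_new
              cores.append(mat.reshape(r, dims[-1], 1))
              return cores

          def tt_to_full(cores):
              # Contract the TT cores back into a full tensor.
              full = cores[0]
              for core in cores[1:]:
                  full = np.tensordot(full, core, axes=([full.ndim - 1], [0]))
              return full.reshape([c.shape[1] for c in cores])

          x = np.random.rand(4, 5, 6, 7)
          cores = tt_svd(x, max_rank=50)  # ranks untruncated here, so exact
          assert np.allclose(tt_to_full(cores), x)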
       
  • Convolutional Neural Networks for Omnidirectional Image Quality
           Assessment: A Benchmark

      Authors: Abderrezzaq Sendjasni;Mohamed-Chaker Larabi;Faouzi Alaya Cheikh;
      Pages: 7301 - 7316
      Abstract: In this paper, we conduct an extensive study on the use of pre-trained convolutional neural networks (CNNs) for omnidirectional image quality assessment (IQA). To cope with the lack of available IQA databases, transfer learning from seven pre-trained CNN models is investigated over retraining on standard 2D databases. In addition, we explore the influence of various image representations and training strategies on the model’s performance. A comparison of the use of projected versus radial content, and multichannel CNN versus patch-wise training is also covered. The experimental results on two publicly available databases are used to draw conclusions about which strategy best fits the visual quality prediction and at which computational cost. The analysis shows that retraining CNN models on 2D IQA databases improves the prediction accuracy. The latter and the required computational time are found to be significantly affected by the training strategy. Cross-database evaluations demonstrate that the nature and variety of the content impact the generalization ability of the models. Finally, we show that conclusions coming from other image processing communities may not hold for IQA. The provided discussion shall provide insights and recommendations when using pre-trained CNNs for omnidirectional IQA.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
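      The retraining strategy evaluated above can be pictured with a short PyTorch sketch (a hypothetical setup: resnet18 stands in for whichever pre-trained CNN is chosen, and the API assumes a recent torchvision): swap the classification head for a one-output regression head and fine-tune on subjective scores.

          import torch
          import torch.nn as nn
          from torchvision import models

          model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)
          model.fc = nn.Linear(model.fc.in_features, 1)  # quality regression head
          optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
          criterion = nn.MSELoss()

          def train_step(images, mos):
              # images: (B, 3, H, W) patches; mos: (B,) subjective quality scores
              optimizer.zero_grad()
              loss = criterion(model(images).squeeze(1), mos)
              loss.backward()
              optimizer.step()
              return loss.item()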
       
  • Propagating Facial Prior Knowledge for Multitask Learning in Face
           Super-Resolution

      Authors: Chenyang Wang;Junjun Jiang;Zhiwei Zhong;Xianming Liu;
      Pages: 7317 - 7331
      Abstract: Existing face hallucination methods always achieve improved performance by regularizing the model with a facial prior. Most of them first estimate facial prior information and then leverage it to help the prediction of the target high-resolution face image. However, the accuracy of prior estimation is difficult to guarantee, especially for the low-resolution face image. Once the estimated prior is inaccurate or wrong, the following face super-resolution performance is unavoidably influenced. A natural question arises: how can facial priors be incorporated effectively and efficiently without prior estimation? To achieve this goal, we propose to learn facial prior knowledge at the training stage, but test only with the low-resolution face image, which can overcome the difficulty of estimating an accurate prior. In addition, instead of estimating facial prior, we directly explore the potential of high-quality facial prior in the training phase and progressively propagate the facial prior knowledge from the teacher network (trained with the low-resolution face/high-quality facial prior and high-resolution face image pairs) to the student network (trained with the low-resolution face and high-resolution face image pairs). Quantitative and qualitative comparisons on benchmark face datasets demonstrate that our method outperforms the state-of-the-art face super-resolution methods. The source codes of the proposed method will be available at https://github.com/wcy-cs/KDFSRNet.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
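      The teacher-to-student propagation can be summarized by a schematic distillation objective (placeholder tensors and weighting, not KDFSRNet's actual loss): the student fits the high-resolution target while mimicking the prior-informed teacher's features.

          import torch.nn.functional as F

          def distill_loss(student_feat, teacher_feat, student_sr, hr_target, alpha=0.5):
              # Reconstruction term plus a feature-mimicking term; the teacher
              # is frozen, so its features are detached from the graph.
              rec = F.l1_loss(student_sr, hr_target)
              kd = F.l1_loss(student_feat, teacher_feat.detach())
              return rec + alpha * kd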
       
  • Self-Supervised Low-Light Image Enhancement Using Discrepant Untrained
           Network Priors

      Authors: Jinxiu Liang;Yong Xu;Yuhui Quan;Boxin Shi;Hui Ji;
      Pages: 7332 - 7345
      Abstract: This paper proposes a deep learning method for low-light image enhancement, which exploits the generation capability of Neural Networks (NNs) while requiring no training samples except the input image itself. Based on the Retinex decomposition model, the reflectance and illumination of a low-light image are parameterized by two untrained NNs. The ambiguity between the two layers is resolved by the discrepancy between the two NNs in terms of architecture and capacity, while the complex noise with spatially-varying characteristics is handled by an illumination-adaptive self-supervised denoising module. The enhancement is done by jointly optimizing the Retinex decomposition and the illumination adjustment. Extensive experiments show that the proposed method not only outperforms existing non-learning-based and unsupervised-learning-based methods, but also competes favorably with some supervised-learning-based methods in extreme low-light conditions.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
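      The underlying Retinex model can be written schematically (a generic form in which the regularizer \Phi and weight \lambda are placeholder notation, not the paper's exact objective):

          I = R \odot L, \qquad
          \min_{\theta_R,\,\theta_L} \bigl\| R_{\theta_R} \odot L_{\theta_L} - I \bigr\|_2^2
            + \lambda\,\Phi(L_{\theta_L})

      where \odot is element-wise multiplication and \theta_R, \theta_L parameterize the two untrained networks whose architectural discrepancy disambiguates the two layers.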
       
  • Memory-Efficient Deformable Convolution Based Joint Denoising and
           Demosaicing for UHD Images

      Authors: Juntao Guan;Rui Lai;Yang Lu;Yangang Li;Huanan Li;Lichen Feng;Yintang Yang;Lin Gu;
      Pages: 7346 - 7358
      Abstract: This paper introduces deformable convolution into deep learning based joint denoising and demosaicing (JDD), which yields more adaptable representations and larger receptive fields in feature extraction for superior restoration performance. However, deformable convolution generally leads to a considerable computational load and an irregular memory access bottleneck, limiting its extensive deployment on edge devices. To address this issue, we develop a grouping strategy and assign independent offsets to each kernel group to reduce the computation latency while keeping the accuracy. Motivated by the exploration of the aggregate distribution characteristics of deformable offsets, we present an offset sharing methodology to simplify the memory access complexity of deformable convolution. As for hardware acceleration, we specially design a novel deformable matrix multiplication workflow incorporated with a deformable memory mapping unit to boost the computational throughput. Verification experiments on FPGA demonstrate that the proposed deformable convolution based JDD can restore 4K Ultra High Definition (UHD) images at 70 FPS and yields significant improvement in visual effect and objective quality assessment.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
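      For orientation, the base operator is available off the shelf; a minimal torchvision sketch of one deformable convolution follows (the paper's grouping, offset sharing, and FPGA workflow are not reproduced here):

          import torch
          from torchvision.ops import deform_conv2d

          x = torch.randn(1, 8, 32, 32)
          weight = torch.randn(16, 8, 3, 3)           # out_ch, in_ch, kH, kW
          offset = torch.zeros(1, 2 * 3 * 3, 32, 32)  # zero offsets = plain conv
          y = deform_conv2d(x, offset, weight, padding=1)  # -> (1, 16, 32, 32)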
       
  • Vector-Based Efficient Data Hiding in Encrypted Images via Multi-MSB
           Replacement

      Authors: Yike Zhang;Wenbin Luo;
      Pages: 7359 - 7372
      Abstract: As an essential technique for data privacy protection, reversible data hiding in encrypted images (RDHEI) methods have drawn intensive research interest in recent years. In response to the increasing demand for protecting data privacy, novel methods that perform RDHEI are continually being developed. We propose two effective multi-MSB (most significant bit) replacement-based approaches that yield comparably high data embedding capacity, improve overall processing speed, and enhance reconstructed images’ quality. Our first method, Efficient Multi-MSB Replacement-RDHEI (EMR-RDHEI), obtains higher data embedding rates (DERs, also known as payloads) and better visual quality in reconstructed images when compared with many other state-of-the-art methods. Our second method, Lossless Multi-MSB Replacement-RDHEI (LMR-RDHEI), can losslessly recover original images after an information embedding process is performed. To verify the accuracy of our methods, we compared them with other recent RDHEI techniques and performed extensive experiments using the widely accepted BOWS-2 dataset. Our experimental results showed that the DER of our EMR-RDHEI method ranged from 1.2087 bit per pixel (bpp) to 6.2682 bpp with an average of 3.2457 bpp. For the LMR-RDHEI method, the average DER was 2.5325 bpp, with a range between 0.2129 bpp and 6.0168 bpp. Our results demonstrate that these methods outperform many other state-of-the-art RDHEI algorithms. Additionally, the multi-MSB replacement-based approach provides a clean design and efficient vectorized implementation.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
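      The core multi-MSB replacement step can be illustrated on a single 8-bit pixel (a toy sketch of the bit-level operation only, not the EMR-RDHEI/LMR-RDHEI pipelines, which also handle encryption and recovery):

          def embed_msb(pixel: int, bits: int, k: int) -> int:
              # Overwrite the k most significant bits of an 8-bit pixel with payload.
              assert 0 <= pixel <= 255 and 0 <= bits < (1 << k)
              return (pixel & ((1 << (8 - k)) - 1)) | (bits << (8 - k))

          def extract_msb(marked: int, k: int) -> int:
              return marked >> (8 - k)

          assert extract_msb(embed_msb(0b10110011, 0b101, k=3), k=3) == 0b101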
       
  • Invertible Color-to-Grayscale Conversion Using Lossy Compression and
           High-Capacity Data Hiding

      Authors: Qiaoyi Liang;Shijun Xiang;
      Pages: 7373 - 7385
      Abstract: Invertible color-to-grayscale conversion is a research issue in grayscale image colorization, which is a complex problem resulting from information loss. This paper presents an innovative invertible color-to-grayscale conversion idea by independently compressing the chromaticity plane and hiding it in the corresponding luminance plane as a watermark. Since the chromaticity and luminance planes are orthogonal in the proposed method, they can be processed efficiently without mutual influence. By using an efficient lossy compression operation, we can save more chromatic information. By using two high-capacity data hiding techniques (reversible watermarking (RW) and least significant bit (LSB) substitution), we can embed the compressed chromatic information into the luminance plane perfectly. For the purpose of integrity authentication, the luminance plane is hashed as part of the embedded information before embedding. Experimental results have shown that higher quality of reconstructed color images can be achieved by using RW, but the quality of the synthesized grayscale drops sharply. By using LSB substitution, we can obtain high-quality synthesized grayscale and reconstructed color images simultaneously. Furthermore, we have compared the proposed LSB-based scheme with several recently reported state-of-the-art methods to validate the superiority of the proposed approach.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Distilling Knowledge From Object Classification to Aesthetics Assessment

      Authors: Jingwen Hou;Henghui Ding;Weisi Lin;Weide Liu;Yuming Fang;
      Pages: 7386 - 7402
      Abstract: In this work, we point out that the major dilemma of image aesthetics assessment (IAA) comes from the abstract nature of aesthetic labels. That is, a vast variety of distinct contents can correspond to the same aesthetic label. On the one hand, during inference, the IAA model is required to relate various distinct contents to the same aesthetic label. On the other hand, when training, it would be hard for the IAA model to learn to distinguish different contents merely with supervision from aesthetic labels, since aesthetic labels are not directly related to any specific content. To deal with this dilemma, we propose to distill knowledge on semantic patterns for a vast variety of image contents from multiple pre-trained object classification (POC) models to an IAA model. Since the combination of multiple POC models is expected to provide sufficient knowledge on various image contents, the IAA model can more easily learn to relate various distinct contents to a limited number of aesthetic labels. By supervising an end-to-end single-backbone IAA model with the distilled knowledge, the performance of the IAA model is significantly improved by 4.8% in SRCC compared to the version trained only with ground-truth aesthetic labels. On specific categories of images, the SRCC improvement brought by the proposed method can reach up to 7.2%. Peer comparison also shows that our method outperforms 10 previous IAA methods.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Multiscale Low-Light Image Enhancement Network With Illumination
           Constraint

      Authors: Guo-Dong Fan;Bi Fan;Min Gan;Guang-Yong Chen;C. L. Philip Chen;
      Pages: 7403 - 7417
      Abstract: Images captured under low-light environments typically have poor visibility, affecting many advanced computer vision tasks. In recent years, some low-light image enhancement models based on deep learning have appeared, but they have not been able to effectively mine the deep multiscale features in the image, resulting in poor generalization performance and instability of the model. The disadvantages are mainly reflected in color distortion, color desaturation, and artifacts. Current methods are also unable to adjust exposure effectively, resulting in uneven exposure or partial overexposure. To address these issues, we propose an end-to-end low-light image enhancement model, called the multiscale low-light image enhancement network with illumination constraint (MLLEN-IC), to achieve preferable generalization ability and stable performance. On the one hand, we use the squeeze-and-excitation-Res2Net block (SE-Res2block) as a base unit to enhance the model’s ability to extract deep multiscale features. On the other hand, to make the model more adaptable to low-light image enhancement tasks, we calculate the illumination constraint from the low-light image itself to prevent overexposure, uneven exposure, and unsaturated colors. Extensive experiments demonstrate that MLLEN-IC not only adjusts light levels, but also produces a more natural visual effect and avoids problems such as color distortion, artifacts, and uneven exposure. In particular, MLLEN-IC shows strong generalization and stability. The source code and supplementary material are available at https://github.com/CCECfgd/MLLEN-IC.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • DSRGAN: Detail Prior-Assisted Perceptual Single Image Super-Resolution via
           Generative Adversarial Networks

      Authors: Ziyang Liu;Zhengguo Li;Xingming Wu;Zhong Liu;Weihai Chen;
      Pages: 7418 - 7431
      Abstract: The generative adversarial network (GAN) is successfully applied to study the perceptual single image super-resolution (SISR). However, since the GAN is data-driven, it has a fundamental limitation on restoring real high frequency information for an unknown instance (or image) during test. On the other hand, the conventional model-based methods have a superiority to achieve instance adaptation as they operate by considering the statistics of each instance (or image) only. Motivated by this, we propose a novel model-based algorithm, which can extract the detail layer of an image efficiently. The detail layer represents the high frequency information of image and it is constituted of image edges and fine textures. It is seamlessly incorporated into the GAN and serves as a prior knowledge to assist the GAN in generating more realistic details. The proposed method, named DSRGAN, takes advantages from both the model-based conventional algorithm and the data-driven deep learning network. Experimental results demonstrate that the DSRGAN outperforms the state-of-the-art SISR methods on perceptual metrics, meanwhile achieving comparable results in terms of fidelity metrics. Following the DSRGAN, it is feasible to incorporate other conventional image processing algorithms into a deep learning network to form a model-based deep SISR.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Generative Memory-Guided Semantic Reasoning Model for Image Inpainting

      Authors: Xin Feng;Wenjie Pei;Fengjun Li;Fanglin Chen;David Zhang;Guangming Lu;
      Pages: 7432 - 7447
      Abstract: The critical challenge of single image inpainting stems from accurate semantic inference via limited information while maintaining image quality. Typical methods for semantic image inpainting train an encoder-decoder network by learning a one-to-one mapping from the corrupted image to the inpainted version. While such methods perform well on images with small corrupted regions, it is challenging for these methods to deal with images with large corrupted areas due to two potential limitations. 1) Such a one-to-one mapping paradigm tends to overfit each single training pair of images; 2) The inter-image prior knowledge about the general distribution patterns of visual semantics, which can be transferred across images sharing similar semantics, is not explicitly exploited. In this paper, we propose the Generative Memory-guided Semantic Reasoning Model (GM-SRM), which infers the content of corrupted regions based on not only the known regions of the corrupted image, but also the learned inter-image reasoning priors characterizing the generalizable semantic distribution patterns between similar images. In particular, the proposed GM-SRM first pre-learns a generative memory from the whole training data to explicitly learn the distribution of different semantic patterns. Then the learned memory is leveraged to retrieve the matching semantics for the current corrupted image to perform semantic reasoning during image inpainting. While the encoder-decoder network is used for guaranteeing the pixel-level content consistency, our generative priors are favorable for performing high-level semantic reasoning, which is particularly effective for inferring semantic content for large corrupted areas. Extensive experiments on Paris Street View, CelebA-HQ, and Places2 benchmarks demonstrate that our GM-SRM outperforms the state-of-the-art methods for image inpainting in terms of both visual quality and quantitative metrics.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Variational Hyperparameter Inference for Few-Shot Learning Across Domains

      Authors: Lei Zhang;Liyun Zuo;Baoyan Wang;Xin Li;Xiantong Zhen;
      Pages: 7448 - 7459
      Abstract: The focus of few-shot learning research has recently been on the development of meta-learning, where a meta-learner is trained on a variety of tasks in the hope of being generalizable to new tasks. Tasks in meta-training and meta-testing are usually assumed to be from the same domain, which does not necessarily hold in real-world scenarios. In this paper, we propose variational hyperparameter inference for few-shot learning across domains. Based on the especially successful algorithm known as model-agnostic meta-learning, the proposed variational hyperparameter inference integrates meta-learning and variational inference into the optimization of hyperparameters, which endows the meta-learner with adaptivity for generalization across domains. In particular, we choose to learn adaptive hyperparameters, including the learning rate and weight decay, to avoid failure in the face of few labeled examples across domains. Moreover, we model hyperparameters as distributions instead of fixed values, which further enhances generalization ability by capturing uncertainty. Extensive experiments are conducted on two benchmark datasets covering few-shot learning both within domains and across domains. The results demonstrate that our method consistently outperforms previous approaches, and comprehensive ablation studies further validate its effectiveness on few-shot learning both within and across domains.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • RD-IWAN: Residual Dense Based Imperceptible Watermark Attack Network

      Authors: Chunpeng Wang;Qixian Hao;Shujiang Xu;Bin Ma;Zhiqiu Xia;Qi Li;Jian Li;Yun-Qing Shi;
      Pages: 7460 - 7472
      Abstract: Digital watermarking technology and watermark attack methods are mutually reinforcing and complementary. Currently, traditional watermark attack methods are relatively mature, but these traditional attack methods will inevitably damage the visual quality of original images (OIs). Therefore, this paper proposes a covert attack method called residual dense based imperceptible watermark attack network (RD-IWAN). First, this paper designs a watermark attack residual dense network (WARDN) based on the residual dense network (RDN), which can effectively remove the watermark information in the middle and low frequency features of the watermarked image (WMI). Second, to improve the attack ability of the network, this paper innovatively proposes a progressive preprocessing method based on the information enhancement preprocessing method. Concurrently, to ensure the imperceptibility of this watermark attack method, a comprehensive loss function that combines the perceptual loss and the mean square error (MSE) loss of the OI and the attacked watermarked image (AWMI) is designed in this study. Finally, attack experiments are designed and performed on watermarks with different embedding strengths and sizes. Experimental results show that, compared to traditional attack methods, the watermark attack method proposed in this paper exhibits stronger attack ability and higher imperceptibility.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Autonomous Generation of Service Strategy for Household Tasks: A
           Progressive Learning Method With A Priori Knowledge and Reinforcement
           Learning

      Authors: Mengyang Zhang;Guohui Tian;Huanbing Gao;Ying Zhang;
      Pages: 7473 - 7488
      Abstract: Human beings tend to learn unknown knowledge in a gradual process, from the basic to the complex. Based on this point, we propose a progressive learning method for producing service strategies according to requests, with hierarchical a priori knowledge and reinforcement learning. A service strategy aims to guide how to perform home services and takes into consideration the relationship between actions and objects in the home environment. In this paper, strategy generation is regarded as a text generation problem in question answering (QA). Firstly, hierarchical a priori knowledge with service-object correlation at the bottom and action-object correlation at the top is constructed to assist understanding of the relationship of objects and actions in service strategies. Service-object correlation guides how to select proper objects in the correct order, while action-object correlation associates actions in strategies according to selected objects. Based on this hierarchical a priori knowledge, a progressive learning method is proposed to make the model produce effective strategies with a sequential cognition, from service-object correlation (objects) to action-object correlation (actions). After that, reinforcement learning is employed to enhance the progressive guidance, by designing rewards in terms of the hierarchical a priori knowledge. Finally, the proposed method is tested with both comparative experiments and ablation studies, and the experimental results demonstrate its superiority in producing comprehensive and logical strategies, indicating that the progressive learning method in our paper can further improve the QA performance.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Unsupervised Deep Event Stereo for Depth Estimation

      Authors: S. M. Nadim Uddin;Soikat Hasan Ahmed;Yong Ju Jung;
      Pages: 7489 - 7504
      Abstract: Bio-inspired event cameras have been considered effective alternatives to traditional frame-based cameras for stereo depth estimation, especially in challenging conditions such as low-light or high-speed environments. Recently, deep learning-based supervised event stereo matching methods have achieved significant performance improvements over the traditional event stereo methods. However, the supervised methods depend on ground-truth disparity maps for training, and it is difficult to secure a large amount of ground-truth disparity maps. A feasible alternative is to devise an unsupervised event stereo method that can be trained without ground-truth disparity maps. To this end, we propose the first unsupervised event stereo matching method that can predict dense disparity maps, and is trained by transforming the depth estimation problem into a warping-based reconstruction problem. We propose a novel unsupervised loss function that enforces the network to minimize the feature-level epipolar correlation difference between the ground-truth intensity images and warped images. Moreover, we propose a novel event embedding mechanism that utilizes both temporal and spatial neighboring events to capture spatio-temporal relationships among the events for stereo matching. Experimental results reveal that the proposed method outperforms the baseline unsupervised methods by significant margins (e.g., up to 16.88% improvement) and achieves comparable results with the existing supervised methods. Extensive ablation studies validate the efficacy of the proposed modules and architectural choices.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • PSCC-Net: Progressive Spatio-Channel Correlation Network for Image
           Manipulation Detection and Localization

      Authors: Xiaohong Liu;Yaojie Liu;Jun Chen;Xiaoming Liu;
      Pages: 7505 - 7517
      Abstract: To defend against manipulation of image content, such as splicing, copy-move, and removal, we develop a Progressive Spatio-Channel Correlation Network (PSCC-Net) to detect and localize image manipulations. PSCC-Net processes the image in a two-path procedure: a top-down path that extracts local and global features and a bottom-up path that detects whether the input image is manipulated, and estimates its manipulation masks at multiple scales, where each mask is conditioned on the previous one. Different from the conventional encoder-decoder and no-pooling structures, PSCC-Net leverages features at different scales with dense cross-connections to produce manipulation masks in a coarse-to-fine fashion. Moreover, a Spatio-Channel Correlation Module (SCCM) captures both spatial and channel-wise correlations in the bottom-up path, which endows features with holistic cues, enabling the network to cope with a wide range of manipulation attacks. Thanks to the light-weight backbone and progressive mechanism, PSCC-Net can process 1080P images at 50+ FPS. Extensive experiments demonstrate the superiority of PSCC-Net over the state-of-the-art methods on both detection and localization. Codes and models are available at https://github.com/proteus1991/PSCC-Net.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • DACNN: Blind Image Quality Assessment via a Distortion-Aware Convolutional
           Neural Network

      Authors: Zhaoqing Pan;Hao Zhang;Jianjun Lei;Yuming Fang;Xiao Shao;Nam Ling;Sam Kwong;
      Pages: 7518 - 7531
      Abstract: Deep neural networks have achieved great performance on blind Image Quality Assessment (IQA), but it is still challenging to use one network to accurately predict the quality of images with different distortions. In this paper, a Distortion-Aware Convolutional Neural Network (DACNN) is proposed for blind IQA, which works effectively for not only synthetically distorted images but also authentically distorted images. The proposed DACNN consists of a distortion aware module, a distortion fusion module, and a quality prediction module. In the distortion aware module, a Siamese network-based pretraining strategy is proposed to design a synthetic distortion-aware network that fully learns the synthetic distortions, and an authentic distortion-aware network is used for extracting the authentic distortions. To efficiently fuse the learned distortion features and make the network pay more attention to the essential features, a weight-adaptive fusion network is proposed to adaptively adjust the weight of each distortion. Finally, the quality prediction module is adopted to map the fused features to a quality score. Extensive experiments on four authentic IQA databases and four synthetic IQA databases have proved the effectiveness of the proposed DACNN.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • UPHDR-GAN: Generative Adversarial Network for High Dynamic Range Imaging
           With Unpaired Data

      Authors: Ru Li;Chuan Wang;Jue Wang;Guanghui Liu;Heng-Yu Zhang;Bing Zeng;Shuaicheng Liu;
      Pages: 7532 - 7546
      Abstract: The paper proposes a method to effectively fuse multi-exposure inputs and generate high-quality high dynamic range (HDR) images with unpaired datasets. Deep learning-based HDR image generation methods rely heavily on paired datasets. The ground truth images play a leading role in generating reasonable HDR images. Datasets without ground truth are hard to apply to the training of deep neural networks. Recently, Generative Adversarial Networks (GANs) have demonstrated their potential for translating images from a source domain $X$ to a target domain $Y$ in the absence of paired examples. In this paper, we propose a GAN-based network for solving such problems while generating enjoyable HDR results, named UPHDR-GAN. The proposed method relaxes the constraint of the paired dataset and learns the mapping from the LDR domain to the HDR domain. Although paired data are missing, UPHDR-GAN can properly handle the ghosting artifacts caused by moving objects or misalignments with the help of the modified GAN loss, the improved discriminator network and the useful initialization phase. The proposed method preserves the details of important regions and improves the total image perceptual quality. Qualitative and quantitative comparisons against the representative methods demonstrate the superiority of the proposed UPHDR-GAN.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Knowledge-Based Visual Question Generation

      Authors: Jiayuan Xie;Wenhao Fang;Yi Cai;Qingbao Huang;Qing Li;
      Pages: 7547 - 7558
      Abstract: Visual question generation task aims to generate meaningful questions about an image targeting an answer. Existing methods focus on the visual concepts in the image for question generation. However, humans inevitably use their knowledge related to visual objects in images to construct questions. In this paper, we propose a knowledge-based visual question generation model that can integrate visual concepts and non-visual knowledge to generate questions. To obtain visual concepts, we utilize a pre-trained object detection model to obtain object-level features of each object in the image. To obtain useful non-visual knowledge, we first retrieve the knowledge from the knowledge-base related to the visual objects in the image. Considering that not all retrieved knowledge is helpful for this task, we introduce an answer-aware module to capture the candidate knowledge related to the answer from the retrieved knowledge, which ensures that the generated content can be targeted at the answer. Finally, object-level representations containing visual concepts and non-visual knowledge are sent to a decoder module to generate questions. Extensive experiments on the FVQA and KBVQA datasets show that the proposed model outperforms the state-of-the-art models.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Perceptual Hashing With Complementary Color Wavelet Transform and
           Compressed Sensing for Reduced-Reference Image Quality Assessment

      Authors: Mengzhu Yu;Zhenjun Tang;Xianquan Zhang;Bineng Zhong;Xinpeng Zhang;
      Pages: 7559 - 7574
      Abstract: Image quality assessment (IQA) is an important task of image processing and has diverse applications, such as image super-resolution reconstruction, image transmission and monitoring systems. This paper proposes a perceptual hashing algorithm with complementary color wavelet transform (CCWT) and compressed sensing (CS) for reduced-reference (RR) IQA. The CCWT is exploited to decompose the input color image into different sub-bands. Since the calculation of CCWT uses all color channels without discarding any information, the distortions introduced by digital operations on color channels are preserved in the CCWT sub-bands. Block-based CS is used to extract features from the CCWT sub-bands. As the Euclidean distance between the block-based CS features is only slightly influenced by content-preserving operations, perceptual features constructed from Euclidean distances are robust, discriminative and compact. The hash sequence is finally determined by quantizing the perceptual features. The effectiveness of the proposed hashing is verified by various experiments on four open image databases. Experimental results demonstrate that the proposed hashing is superior to some state-of-the-art algorithms in terms of classification and RR IQA application.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • IV-PSNR—The Objective Quality Metric for Immersive Video
           Applications

      Authors: Adrian Dziembowski;Dawid Mieloch;Jakub Stankowski;Adam Grzelka;
      Pages: 7575 - 7591
      Abstract: This paper presents a new objective quality metric adapted to the complex characteristics of immersive video (IV), which is prone to errors caused by the processing and compression of multiple input views and by virtual view synthesis. The proposed metric, IV-PSNR, contains two techniques that allow for the evaluation of quality loss for typical immersive video distortions: corresponding pixel shift and global component difference. The performed experiments compared the proposal with 31 state-of-the-art quality metrics, showing their performance in the assessment of quality in immersive video coding and processing, and in other applications, using the commonly used image quality assessment databases TID2013 and CVIQ. As presented, IV-PSNR outperforms other metrics in immersive video applications and can still be used efficiently in the evaluation of other images and videos. Moreover, basing the metric on the calculation of PSNR keeps the computational complexity low. A publicly available, efficient implementation of the IV-PSNR software was provided by the authors of this paper and is used by ISO/IEC MPEG for evaluation and research on the upcoming MPEG Immersive Video (MIV) coding standard.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
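      Since IV-PSNR keeps classic PSNR as its backbone, the base computation is worth restating; a minimal NumPy version (the corresponding-pixel-shift and global-component-difference extensions are not reproduced here):

          import numpy as np

          def psnr(ref: np.ndarray, test: np.ndarray, peak: float = 255.0) -> float:
              # Peak signal-to-noise ratio in dB between reference and test images.
              mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
              return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)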
       
  • Blind Image Quality Assessment for Authentic Distortions by Intermediary
           Enhancement and Iterative Training

      Authors: Tianshu Song;Leida Li;Pengfei Chen;Hantao Liu;Jiansheng Qian;
      Pages: 7592 - 7604
      Abstract: With the boom of deep neural networks, blind image quality assessment (BIQA) has made great progress. However, the current BIQA metrics are limited when evaluating low-quality images as compared to medium-quality and high-quality images, which restricts their applications in real-world problems. In this paper, we first identify that two challenges caused by distribution shift and long-tailed distribution lead to the compromised performance on low-quality images. Then, we propose an intermediary enhancement-based bilateral network with an iterative training strategy for solving these two challenges. Drawing on the experience of transitive transfer learning, the proposed metric adaptively introduces enhanced intermediary images to transfer more information to low-quality images for mitigating the distribution shift. Our metric also adopts an iterative training strategy to deal with the long-tailed distribution. This strategy decouples feature extraction and score regression for better representation learning and regressor training. It not only transfers the knowledge learned from the earlier stage to the latter stage, but also makes the model pay more attention to long-tailed low-quality images. We conduct extensive experiments on five authentically distorted image quality datasets. The results show that our metric significantly improves the evaluating performance on low-quality images and delivers state-of-the-art intra-dataset results. During generalization tests, our metric also achieves the best cross-dataset performance.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Reversible Data Hiding With Brightness Preserving Contrast Enhancement by
           Two-Dimensional Histogram Modification

      Authors: Hao-Tian Wu;Xin Cao;Ruoyan Jia;Yiu-Ming Cheung;
      Pages: 7605 - 7617
      Abstract: Recently, contrast enhancement with reversible data hiding (CE-RDH) has been proposed for digital images to hide useful data into contrast-enhanced images. In existing schemes, a one-dimensional (1D) or two-dimensional (2D) histogram is equalized during the process of CE-RDH so that the original image can be exactly recovered from its contrast-enhanced version. However, noticeable brightness change and color distortion may be introduced by applying these schemes, especially in the case of over-enhancement. To preserve image quality, this paper presents a new 2D-histogram-based CE-RDH scheme that takes brightness preservation (BP) into account. In particular, the row or column of histogram bins with the maximum total height is chosen to be expanded at each histogram modification step, while the row or column of bins to be expanded next is adaptively chosen according to the change of image brightness. Experimental results on three color image sets demonstrate the efficacy and reversibility of the proposed scheme. Compared with schemes using a 1D histogram, image brightness can be preserved more finely by modifying the generated 2D histogram. Moreover, our proposed scheme preserves image color and brightness while achieving better image quality than the existing schemes.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
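      The bin-selection rule described above can be pictured with a small NumPy sketch (illustrative only; the actual expansion, data embedding, and brightness-preservation bookkeeping are omitted):

          import numpy as np

          hist2d = np.random.default_rng(0).integers(0, 50, size=(256, 256))
          row_sums = hist2d.sum(axis=1)  # total height of each row of bins
          col_sums = hist2d.sum(axis=0)  # total height of each column of bins
          if row_sums.max() >= col_sums.max():
              expand = ("row", int(row_sums.argmax()))
          else:
              expand = ("column", int(col_sums.argmax()))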
       
  • No-Reference Quality Assessment for 3D Colored Point Cloud and Mesh Models

      Authors: Zicheng Zhang;Wei Sun;Xiongkuo Min;Tao Wang;Wei Lu;Guangtao Zhai;
      Pages: 7618 - 7631
      Abstract: To improve the viewer’s Quality of Experience (QoE) and optimize computer graphics applications, 3D model quality assessment (3D-QA) has become an important task in the multimedia area. Point cloud and mesh are the two most widely used digital representation formats of 3D models, the visual quality of which is quite sensitive to lossy operations like simplification and compression. Therefore, many related studies such as point cloud quality assessment (PCQA) and mesh quality assessment (MQA) have been carried out to measure the visual quality of distorted 3D models. However, most previous studies utilize full-reference (FR) metrics, which indicates they can not predict the quality level in the absence of the reference 3D model. Furthermore, few 3D-QA metrics consider color information, which significantly restricts their effectiveness and scope of application. In this paper, we propose a no-reference (NR) quality assessment metric for colored 3D models represented by both point cloud and mesh. First, we project the 3D models from 3D space into quality-related geometry and color feature domains. Then, the 3D natural scene statistics (3D-NSS) and entropy are utilized to extract quality-aware features. Finally, a support vector regression (SVR) model is employed to regress the quality-aware features into visual quality scores. Our method is validated on the colored point cloud quality assessment database (SJTU-PCQA), the Waterloo point cloud assessment database (WPC), and the colored mesh quality assessment database (CMDM). The experimental results show that the proposed method outperforms most compared NR 3D-QA metrics with competitive computational resources and greatly reduces the performance gap with the state-of-the-art FR 3D-QA metrics. The code of the proposed model is publicly available now at https://github.com/zzc-1998/NR-3DQA.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
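
      A minimal sketch of the tail of the pipeline described in the abstract above: scalar quality-aware features (here simple per-channel histogram entropy, standing in for the paper's 3D-NSS features) are regressed to quality scores with support vector regression. Toy data and function names are assumptions, not the authors' released code.

          import numpy as np
          from scipy.stats import entropy
          from sklearn.svm import SVR

          def channel_entropies(image):
              """Shannon entropy of each channel's 256-bin intensity histogram."""
              feats = []
              for ch in range(image.shape[-1]):
                  hist, _ = np.histogram(image[..., ch], bins=256, range=(0, 255))
                  feats.append(entropy(hist + 1e-12))   # epsilon avoids log(0)
              return np.array(feats)

          rng = np.random.default_rng(0)
          images = rng.integers(0, 256, size=(50, 64, 64, 3))   # toy 2D projections
          mos = rng.uniform(1, 5, size=50)                      # toy quality scores
          X = np.stack([channel_entropies(im) for im in images])
          svr = SVR(kernel="rbf").fit(X, mos)                   # regress features to scores
          print(svr.predict(X[:3]))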
       
  • MoADNet: Mobile Asymmetric Dual-Stream Networks for Real-Time and
           Lightweight RGB-D Salient Object Detection

      Authors: Xiao Jin;Kang Yi;Jing Xu;
      Pages: 7632 - 7645
      Abstract: RGB-D Salient Object Detection (RGB-D SOD) aims at detecting remarkable objects using complementary information from RGB images and depth cues. Although many outstanding prior arts have been proposed for RGB-D SOD, most of them focus on performance enhancement while giving little consideration to practical deployment on mobile devices. In this paper, we propose mobile asymmetric dual-stream networks (MoADNet) for real-time and lightweight RGB-D SOD. First, motivated by the intrinsic discrepancy between the RGB and depth modalities, we observe that depth maps can be represented by fewer channels than RGB images. Thus, we design asymmetric dual-stream encoders based on MobileNetV3. Second, we develop an inverted bottleneck cross-modality fusion (IBCMF) module to fuse multimodality features, which adopts an inverted bottleneck structure to compensate for the information loss in the lightweight backbones. Third, we present an adaptive atrous spatial pyramid (A2SP) module to speed up inference while maintaining performance by appropriately selecting multiscale features in the decoder. Extensive experiments are conducted to compare our method with 15 state-of-the-art approaches. Our MoADNet obtains competitive results on five benchmark datasets under four evaluation metrics. In terms of efficiency, the proposed method outperforms the other baselines by a large margin: MoADNet contains only 5.03 M parameters and runs at 80 FPS when testing a $256\times 256$ image on a single NVIDIA 2080Ti GPU.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
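
      A minimal PyTorch sketch of the asymmetry motivated in the abstract above: the depth stream is given far fewer channels than the RGB stream before the two are fused. The layer widths and the name AsymmetricDualStream are illustrative; the paper's encoders are built on MobileNetV3.

          import torch
          import torch.nn as nn

          class AsymmetricDualStream(nn.Module):
              def __init__(self, rgb_ch=32, depth_ch=8):
                  super().__init__()
                  # RGB stream is wide; the depth stream needs far fewer channels
                  self.rgb = nn.Sequential(nn.Conv2d(3, rgb_ch, 3, padding=1), nn.ReLU())
                  self.depth = nn.Sequential(nn.Conv2d(1, depth_ch, 3, padding=1), nn.ReLU())
                  self.fuse = nn.Conv2d(rgb_ch + depth_ch, rgb_ch, 1)  # cross-modality fusion

              def forward(self, rgb, depth):
                  return self.fuse(torch.cat([self.rgb(rgb), self.depth(depth)], dim=1))

          model = AsymmetricDualStream()
          out = model(torch.rand(1, 3, 64, 64), torch.rand(1, 1, 64, 64))
          print(out.shape)   # torch.Size([1, 32, 64, 64])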
       
  • Cross-Collaborative Fusion-Encoder Network for Robust RGB-Thermal Salient
           Object Detection

      Authors: Guibiao Liao;Wei Gao;Ge Li;Junle Wang;Sam Kwong;
      Pages: 7646 - 7661
      Abstract: With the prevalence of thermal cameras, RGB-T multi-modal data have become more available for salient object detection (SOD) in complex scenes. Most RGB-T SOD works first extract RGB and thermal features individually from two separate encoders and directly integrate them, paying little attention to the issue of defective modalities. Such an indiscriminate feature extraction strategy may produce contaminated features and thus lead to poor SOD performance. To address this issue, we propose a novel CCFENet that performs robust and accurate multi-modal representation encoding. First, we propose an essential cross-collaboration enhancement strategy (CCE), which concentrates on facilitating interactions across the encoders and encouraging the different modalities to complement each other during encoding. Such a cross-collaborative-encoder paradigm induces our network to collaboratively suppress the negative feature responses of defective modality data and to effectively exploit modality-informative features. Moreover, as the network goes deeper, we embed several CCEs into the encoder, further enabling more representative and robust feature generation. Second, benefiting from the proposed robust encoding paradigm, a simple yet effective cross-scale cross-modal decoder (CCD) is designed to aggregate multi-level complementary multi-modal features, encouraging efficient and accurate RGB-T SOD. Extensive experiments reveal that our CCFENet outperforms state-of-the-art models on three RGB-T datasets with a fast inference speed of 62 FPS. In addition, the advantages of our approach in complex scenarios (e.g., bad weather, motion blur) and in RGB-D SOD further verify its robustness and generality. The source code will be publicly available via our project page: https://git.openi.org.cn/OpenVision/CCFENet.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • A Novel Long-Term Iterative Mining Scheme for Video Salient Object
           Detection

      Authors: Chenglizhao Chen;Hengsen Wang;Yuming Fang;Chong Peng;
      Pages: 7662 - 7676
      Abstract: Existing state-of-the-art (SOTA) video salient object detection (VSOD) models have widely followed a short-term methodology, which dynamically determines the balance between spatial and temporal saliency fusion by considering only the current, limited run of consecutive frames. However, the short-term methodology has one critical limitation: it conflicts with the real mechanism of our visual system, which is a typical long-term methodology. As a result, failure cases keep showing up in the results of current SOTA models, and the short-term methodology has become the major technical bottleneck. To solve this problem, this paper proposes a novel VSOD approach that performs VSOD in a completely long-term way. Our approach converts VSOD, a sequential task, into a data mining problem: the input video sequence is decomposed into object proposals in advance, and salient object proposals are then mined in an easy-to-hard way. Since all object proposals are simultaneously available, the proposed approach is a completely long-term approach, which can alleviate some difficulties rooted in conventional short-term approaches. In addition, we devise an online updating scheme that can grasp the most representative and trustworthy pattern profile of the salient objects, outputting frame-wise saliency maps with rich details and both spatial and temporal smoothness. The proposed approach outperforms almost all SOTA models on five widely used benchmark datasets.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Anomaly Detection of Metro Station Tracks Based on Sequential Updatable
           Anomaly Detection Framework

      Authors: Zhongxing Zheng;Weiming Liu;Ruikang Liu;Liang Wang;Liang Mao;Qisheng Qiu;Guangzheng Ling;
      Pages: 7677 - 7691
      Abstract: The intrusion of foreign objects onto tracks, one of the sources of injuries and fatalities in metro stations, can be addressed as an anomaly detection task. However, existing anomaly detection methods rarely consider the importance of learning from false-alarm data, resulting in repeated mistakes. These methods are also impractical for edge devices that cannot afford a high computational cost. A sequential updatable anomaly detection (SUAD) framework is proposed to tackle these problems. The framework is based on the Robbins-Monro algorithm and a fast version of the Mahalanobis distance. A well-trained SUAD model can continue to learn new knowledge through a sequential knowledge update module based on the Robbins-Monro algorithm, without revisiting the old data. SUAD also utilizes a new Mahalanobis distance calculation method based on principal component analysis, which exhibits fast inference with a lighter model size than before. SUAD is evaluated on a self-built Metro Anomaly Detection (MAD) dataset and three public datasets. It achieves an average area under the receiver operating characteristic curve of 99.4% at the image level and 99.6% at the pixel level on MAD, while reducing model size by at least 78% and memory usage by at least 60%. Competitive results are also achieved on the public datasets, including MVTec AD, beanTech Anomaly Detection, and CIFAR-10.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
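
      A minimal sketch of the two ingredients named in the abstract above: a Robbins-Monro-style stochastic update of a running mean and covariance from streaming normal samples, and a Mahalanobis distance evaluated in a PCA-reduced subspace to cut computation. Class and parameter names are assumptions, not the SUAD implementation.

          import numpy as np

          class SequentialGaussianScorer:
              def __init__(self, dim, n_components=8, lr=0.01):
                  self.mu = np.zeros(dim)
                  self.cov = np.eye(dim)
                  self.lr = lr                  # Robbins-Monro step size
                  self.k = n_components

              def update(self, x):
                  """Stochastic-approximation update from one new normal sample."""
                  d = x - self.mu
                  self.mu += self.lr * d
                  self.cov += self.lr * (np.outer(d, d) - self.cov)

              def score(self, x):
                  """Mahalanobis distance in the top-k principal subspace."""
                  vals, vecs = np.linalg.eigh(self.cov)        # ascending eigenvalues
                  vals, vecs = vals[-self.k:], vecs[:, -self.k:]
                  z = vecs.T @ (x - self.mu)
                  return float(np.sqrt(np.sum(z * z / np.maximum(vals, 1e-8))))

          rng = np.random.default_rng(0)
          scorer = SequentialGaussianScorer(dim=32)
          for _ in range(500):                                 # stream of normal data
              scorer.update(rng.normal(size=32))
          print(scorer.score(rng.normal(size=32)))             # in-distribution: small
          print(scorer.score(rng.normal(size=32) + 5.0))       # anomalous: large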
       
  • Monocular Robust 3D Human Localization by Global and Body-Parts Depth
           Awareness

      Authors: Haolun Li;Chi-Man Pun;
      Pages: 7692 - 7705
      Abstract: Learning human depth localization in camera coordinate space plays a crucial role in understanding the behavior and activities of multiple persons in 3D scenes. However, existing monocular methods rarely combine global image features and human body-part features effectively, resulting in a large gap from the actual location in some cases, e.g., for persons of unusual body size or under mutual occlusion between humans in the image. This paper presents a novel Robust 3D Human Localization (R3HL) network consisting of two stages, global depth awareness and body-parts depth awareness, to significantly improve the robustness and accuracy of 3D localization. In the first stage, multi-person front-back and far-near relationship estimation modules are proposed so that the network extracts depth features from a global perspective. In the second stage, the network focuses on the target human. We propose a Pose-guided Multi-person Repulsion (PMR) module to enhance the target human's features and reduce the interference produced by the background and other people. In addition, an Adaptive Body-parts Attention (ABA) module is designed to assign different feature weights to each joint. Finally, the human's absolute depth is obtained through global pooling and fully connected layers. The experimental results show that shifting attention from the whole image to a single person helps find the absolute location of people of different body sizes and poses in diverse scenes. Our method achieves better performance than other state-of-the-art methods on both indoor and outdoor 3D multi-person datasets.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Double-Stream Position Learning Transformer Network for Image Captioning

      Authors: Weitao Jiang;Wei Zhou;Haifeng Hu;
      Pages: 7706 - 7718
      Abstract: Image captioning has made significant progress through the development of feature extractors and model architectures. Recently, image region features extracted by object detectors have prevailed in most existing models. However, region features are criticized for lacking background and full contextual information. This problem can be remedied by providing complementary visual information from patch features. In this paper, we propose a Double-Stream Position Learning Transformer Network (DSPLTN) that exploits the advantages of both region features and patch features. Specifically, the region-stream encoder utilizes a Transformer encoder with a Relative Position Learning (RPL) module to enhance the representations of region features by modeling the relationships between regions and positions, respectively. As for the patch-stream encoder, we introduce a convolutional neural network into the vanilla Transformer encoder and propose a novel Convolutional Position Learning (CPL) module to encode the position relationships between patches. CPL improves relationship modeling by combining the positions and visual content of patches. Incorporating CPL into the Transformer encoder combines the benefits of convolution in local relation modeling and self-attention in global feature fusion, thereby compensating for the information loss caused by flattening 2D feature maps into 1D patches. Furthermore, an Adaptive Fusion Attention (AFA) mechanism is proposed to balance the contributions of the enhanced region and patch features. Extensive experiments on MSCOCO demonstrate the effectiveness of the double-stream encoder and CPL, and show the superior performance of DSPLTN.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • HFF6D: Hierarchical Feature Fusion Network for Robust 6D Object Pose
           Tracking

      Authors: Jian Liu;Wei Sun;Chongpei Liu;Xing Zhang;Shimeng Fan;Wei Wu;
      Pages: 7719 - 7731
      Abstract: Tracking the 6-degree-of-freedom (6D) object pose in video sequences is gaining attention because it has wide applications in multimedia and robotic manipulation. However, current methods often perform poorly in challenging scenes, such as those with an incorrect initial pose, sudden re-orientation, or severe occlusion. In contrast, we present a robust 6D object pose tracking method with a novel hierarchical feature fusion network, referred to as HFF6D, which predicts the object's relative pose between adjacent frames. Instead of extracting features from adjacent frames separately, HFF6D establishes sufficient spatial-temporal information interaction between adjacent frames. We also propose a novel subtraction feature fusion (SFF) module with an attention mechanism to leverage feature subtraction during feature fusion. It explicitly highlights the feature differences between adjacent frames, thus improving the robustness of relative pose estimation in challenging scenes. Besides, we leverage data augmentation to make HFF6D applicable to the real world while training only with synthetic data, thereby reducing the manual effort of data annotation. We evaluate HFF6D on the well-known YCB-Video and YCBInEOAT datasets. Quantitative and qualitative results demonstrate that HFF6D outperforms state-of-the-art (SOTA) methods in both accuracy and efficiency. Moreover, it is also shown to achieve highly robust tracking in the above-mentioned challenging scenes.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • KTN: Knowledge Transfer Network for Learning Multiperson 2D-3D
           Correspondences

      Authors: Xuanhan Wang;Lianli Gao;Yixuan Zhou;Jingkuan Song;Meng Wang;
      Pages: 7732 - 7745
      Abstract: Human densepose estimation, which aims at establishing dense correspondences between 2D pixels of the human body and a 3D human body template, is a key technique for enabling machines to understand people in images. It still poses several challenges due to practical scenarios in which real-world scenes are complex and only partial annotations are available, leading to incomplete or false estimations. In this work, we present a novel framework to detect the densepose of multiple people in an image. The proposed method, which we refer to as the Knowledge Transfer Network (KTN), tackles two main problems: 1) how to refine the image representation to alleviate incomplete estimations, and 2) how to reduce false estimations caused by low-quality training labels (i.e., limited annotations and class-imbalanced labels). Unlike existing works that directly propagate the pyramidal features of regions for densepose estimation, the KTN uses a refined pyramidal representation that simultaneously maintains feature resolution and suppresses background pixels, and this strategy results in a substantial increase in accuracy. Moreover, the KTN enhances the ability of 3D-based body parsing with external knowledge, casting 2D-based body parsers trained on sufficient annotations as a 3D-based body parser through a structural body knowledge graph. In this way, it significantly reduces the adverse effects of low-quality annotations. The effectiveness of KTN is demonstrated by its superior performance over state-of-the-art methods on the DensePose-COCO dataset. Extensive ablation studies and experimental results on representative tasks (e.g., human body segmentation, human part segmentation, and keypoint detection) and two popular densepose estimation pipelines (i.e., RCNN and fully convolutional frameworks) further indicate the generalizability of the proposed method.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • MOTFR: Multiple Object Tracking Based on Feature Recoding

      Authors: Jun Kong;Ensen Mo;Min Jiang;Tianshan Liu;
      Pages: 7746 - 7757
      Abstract: The stable continuation of trajectories across different targets has always been key to the tracking performance of multi-object tracking (MOT). If target features are aggregated and classified naively, the discriminative features of the target are ignored, which affects the robustness of the trajectories generated by the model. Meanwhile, many popular models execute the detection and feature extraction tasks in parallel, but these two tasks conflict with each other when optimized separately. We therefore propose our tracker, MOTFR, to solve the above problems. In this paper, we propose a Locally Shared Information Decoupling Module (LSIDM) to reduce task-optimization conflicts while ensuring the necessary information sharing. Meanwhile, a feature recoding module for the deep extraction of identity-discriminative features is proposed, called the Feature Purification Module (FPM). By combining the LSIDM and FPM modules, the model utilizes discriminative appearance features to guide the optimization of detection and further improves performance. To handle targets disappearing due to various abnormal occlusions, a Short-term Trajectory Online Complement Strategy (STOCS) is proposed to continue the trajectories of these targets during the tracking stage. Through extensive experiments, we demonstrate the superior performance of MOTFR, which guarantees high-quality detection while achieving stable target trajectories.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Multitask Multigranularity Aggregation With Global-Guided Attention for
           Video Person Re-Identification

      Authors: Dengdi Sun;Jiale Huang;Lei Hu;Jin Tang;Zhuanlian Ding;
      Pages: 7758 - 7771
      Abstract: The goal of video-based person re-identification (Re-ID) is to identify the same person across multiple non-overlapping cameras. The key to accomplishing this challenging task is to sufficiently exploit both spatial and temporal cues in video sequences. However, most current methods are incapable of accurately locating semantic regions or efficiently filtering discriminative spatio-temporal features, so it is difficult to handle issues such as spatial misalignment and occlusion. Thus, we propose a novel feature aggregation framework, multi-task and multi-granularity aggregation with global-guided attention (MMA-GGA), which aims to adaptively generate more representative spatio-temporal aggregation features. Specifically, we develop a multi-task multi-granularity aggregation (MMA) module to extract features at different locations and scales to identify key semantic-aware regions that are robust to spatial misalignment. Then, to determine the importance of the multi-granular semantic information, we propose a global-guided attention (GGA) mechanism to learn weights based on the global features of the video sequence, allowing our framework to identify stable local features while ignoring occlusions. The MMA-GGA framework can therefore efficiently and effectively capture more robust and representative features. Extensive experiments on four benchmark datasets demonstrate that our MMA-GGA framework outperforms current state-of-the-art methods. In particular, our method achieves a rank-1 accuracy of 91.0% on the MARS dataset, the most widely used database, significantly outperforming existing methods.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • DHNet: Salient Object Detection With Dynamic Scale-Aware Learning and
           Hard-Sample Refinement

      Authors: Chenhao Zhang;Shanshan Gao;Deqian Mao;Yuanfeng Zhou;
      Pages: 7772 - 7782
      Abstract: During the annotation procedure for salient object detection, researchers usually locate the approximate position of the salient objects first and then process the pixels that need to be finely annotated. Following this idea, we find that existing methods offer limited exploration of the problem of positioning salient objects. Furthermore, no effective solution has been proposed for the hard-sample problem in this task. Therefore, we propose dynamic scale-aware learning, which learns dynamic scale weights that vary across images, to solve the first problem. Second, we design a dense sampling strategy for hard samples to construct a graph representation with samples from different classes and different confidence levels. We then achieve targeted feature aggregation on the constructed graph with the help of a graph attention mechanism. We conduct extensive experiments on five benchmark datasets using comprehensive evaluation metrics. The results show that our method outperforms current state-of-the-art approaches.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Incremental Translation Averaging

      Authors: Xiang Gao;Lingjie Zhu;Bin Fan;Hongmin Liu;Shuhan Shen;
      Pages: 7783 - 7795
      Abstract: Translation averaging is known to be more difficult than rotation averaging due to scale ambiguity, estimation sensitivity, and solution uncertainty. Existing approaches have exposed their limitations in terms of accuracy, robustness, simplicity, or efficiency. To tackle this tough problem, a simple yet effective translation averaging pipeline, termed Incremental Translation Averaging (ITA), is proposed in this paper. It combines the high accuracy and robustness of the incremental parameter-estimation pipeline with the simplicity and efficiency of the global motion averaging approach. Unlike traditional translation averaging methods, which estimate all absolute camera locations simultaneously and suffer from inaccurate parameter estimation and incomplete scene reconstruction, our ITA computes them incrementally with higher accuracy and robustness. Thanks to the introduction of incremental parameter estimation into the translation averaging pipeline, 1) ITA is robust to measurement outliers and accurate in parameter estimation; and 2) ITA is simple and efficient because it depends less on complicated optimization, carefully designed preprocessing, or additional information. Comprehensive evaluations on the 1DSfM dataset demonstrate the effectiveness of ITA and its advantages over several state-of-the-art translation averaging approaches.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Uncertainty Guided Multi-View Stereo Network for Depth Estimation

      Authors: Wanjuan Su;Qingshan Xu;Wenbing Tao;
      Pages: 7796 - 7808
      Abstract: Deep learning has greatly promoted the development of multi-view stereo in recent years. However, how to measure the reliability of the estimated depth map for practical applications, and how to sample depth hypotheses reasonably when building the cost volume in a coarse-to-fine architecture, remain crucial unresolved problems. To this end, an Uncertainty Guided multi-view Network (UGNet) is proposed in this paper. To enable the network to perceive uncertainty, an uncertainty-aware loss function is introduced, which can not only infer uncertainty implicitly in an unsupervised manner but also reduce the adverse impact of high-uncertainty regions and erroneous labels in the training set during training. Moreover, an uncertainty-based depth hypothesis sampling strategy is proposed to adaptively determine the depth search range of each pixel for the finer stages, which helps generate more rational depth intervals than other methods and build more compact cost volumes without redundancy. Experimental results on the DTU dataset, the BlendedMVS dataset, the Tanks and Temples dataset, and the ETH3D high-res benchmark show that our method achieves promising reconstruction results compared with other state-of-the-art methods.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
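
      A minimal sketch of one common form of uncertainty-aware regression loss of the kind the abstract above describes: a learned per-pixel log-variance attenuates the depth error, so high-uncertainty regions contribute less, while a log penalty discourages predicting large uncertainty everywhere. This formulation is an assumption, not necessarily UGNet's exact loss.

          import torch

          def uncertainty_aware_l1(pred_depth, gt_depth, log_var):
              """L1 depth loss attenuated by a predicted per-pixel log-variance."""
              return (torch.exp(-log_var) * (pred_depth - gt_depth).abs() + log_var).mean()

          pred = torch.rand(2, 1, 32, 32, requires_grad=True)
          log_var = torch.zeros(2, 1, 32, 32, requires_grad=True)
          gt = torch.rand(2, 1, 32, 32)
          loss = uncertainty_aware_l1(pred, gt, log_var)
          loss.backward()                      # gradients flow to depth and uncertainty
          print(float(loss))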
       
  • Multilevel Spatial-Temporal Feature Aggregation for Video Object Detection

      Authors: Chao Xu;Jiangning Zhang;Mengmeng Wang;Guanzhong Tian;Yong Liu;
      Pages: 7809 - 7820
      Abstract: Video object detection (VOD) focuses on detecting objects in each frame of a video, which is challenging due to appearance deterioration in certain video frames. Recent works usually distill crucial information from multiple support frames to improve the reference features, but they operate only at the frame level or proposal level, which cannot integrate spatial-temporal features sufficiently. To deal with this challenge, we treat VOD as a process of interaction among hierarchical spatial-temporal features and introduce a Multi-level Spatial-Temporal (MST) feature aggregation framework to fully exploit frame-level, proposal-level, and instance-level information in a unified framework. Specifically, MST first measures context similarity in pixel space to enhance all frame-level features rather than updating only the reference features. Proposal-level feature aggregation then models object relations to augment the reference object proposals. Furthermore, to filter out irrelevant information from other classes and backgrounds, we introduce an instance ID constraint to boost instance-level features by leveraging support object-proposal features that belong to the same object. Besides, we propose a Deformable Feature Alignment (DAlign) module before MST to achieve more accurate pixel-level spatial alignment for better feature aggregation. Extensive experiments on the ImageNet VID and UAVDT datasets demonstrate the superiority of our method over state-of-the-art (SOTA) methods. Our method achieves 83.3% and 62.1% with ResNet-101 on the two datasets, outperforming the SOTA MEGA by 0.4% and 2.7%.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • IHEM Loss: Intra-Class Hard Example Mining Loss for Robust Face
           Recognition

      Authors: Degui Xiao;Jiazhi Li;Jianfang Li;Shiping Dong;Tao Lu;
      Pages: 7821 - 7831
      Abstract: Recently, angular margin-based methods have become the mainstream approach for unconstrained face recognition, with remarkable success. However, robust face recognition remains a challenge, as faces are subject to variations in pose, age, expression, occlusion, and illumination, especially in unconstrained scenarios. Since training datasets are typically collected in unconstrained scenarios, a significant number of hard examples inevitably appear during training. In this paper, we design a hard example selection function to effectively identify hard examples during training under the supervision of angular margin-based losses. Furthermore, a novel Intra-class Hard Example Mining (IHEM) loss function is proposed, which penalizes the cosine distance between hard examples and their class centers to enhance the discriminative power of face representations. To ensure high face recognition performance, we combine the supervision of an angular margin-based loss and the IHEM loss for model training. Specifically, during training, the angular margin-based loss guarantees discriminative features for face recognition, while the IHEM loss further encourages the intra-class compactness of hard examples. Extensive results demonstrate the superiority of our approach.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
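
      A minimal sketch of the idea in the abstract above: mine hard examples with a simple selection rule, then penalize the cosine distance between their embeddings and their class centers. The fixed threshold and all names are assumptions, not the paper's exact selection function.

          import torch
          import torch.nn.functional as F

          def ihem_loss(feats, labels, centers, hard_thresh=0.4):
              """Cosine-distance penalty applied only to mined hard examples."""
              feats = F.normalize(feats, dim=1)
              centers = F.normalize(centers, dim=1)
              cos = (feats * centers[labels]).sum(dim=1)   # cos(feature, own center)
              hard = cos < hard_thresh                     # hard-example selection
              if not hard.any():
                  return feats.new_zeros(())
              return (1.0 - cos[hard]).mean()              # pull hard examples inward

          feats = torch.randn(16, 128)
          labels = torch.randint(0, 10, (16,))
          centers = torch.randn(10, 128)
          print(float(ihem_loss(feats, labels, centers)))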
       
  • Adaptive Weighted Losses With Distribution Approximation for Efficient
           Consistency-Based Semi-Supervised Learning

      Authors: Di Li;Yang Liu;Liang Song;
      Pages: 7832 - 7842
      Abstract: Recent semi-supervised learning (SSL) algorithms such as FixMatch achieve state-of-the-art performance by exploiting consistency regularization and entropy minimization techniques. However, many consistency-based SSL algorithms extract pseudo-labels from unlabeled data using a fixed threshold and ignore the different learning progress of each category, so the easy-to-learn categories have more examples contributing to the loss, resulting in a class-imbalance problem and reduced training efficiency. To improve training reliability, we propose adaptive weighted losses (AWL). By evaluating the class-wise learning progress, the loss contribution of the pseudo-labeled data of each category is continuously and dynamically adjusted during learning, and the pseudo-label discrimination ability of the model can be steadily improved. Moreover, to improve training efficiency, we propose a bidirectional distribution approximation (DA) method, which introduces the consistency information of below-threshold predictions into the loss calculation and significantly improves the model convergence speed. Through the combination of AWL and DA, our method surpasses other algorithms on multiple benchmarks with faster convergence, especially when labeled data are extremely limited. For example, AWL&DA achieves 95.29% test accuracy on the CIFAR-10-40-labels experiment and 92.56% accuracy in a faster experimental setting with only $2^{18}$ iterations.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
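
      A minimal sketch of adaptive class-wise loss weighting in the spirit of the abstract above: classes whose confident pseudo-labels are over-represented (the easy-to-learn classes) receive smaller weights, countering the class-imbalance effect. The weighting formula is an illustrative assumption, not the paper's AWL definition.

          import numpy as np

          def class_weights(pseudo_labels, confidences, n_classes, thresh=0.95):
              """Weight each class inversely to its share of confident pseudo-labels."""
              counts = np.zeros(n_classes)
              for y, c in zip(pseudo_labels, confidences):
                  if c >= thresh:
                      counts[y] += 1
              progress = counts / max(counts.max(), 1.0)   # per-class learning progress
              return 1.0 - 0.5 * progress                  # easy classes weighted less

          rng = np.random.default_rng(0)
          labels = rng.integers(0, 10, size=1000)          # toy pseudo-labels
          conf = rng.random(1000)                          # toy confidences
          print(class_weights(labels, conf, n_classes=10))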
       
  • Learning Channel-Aware Correlation Filters for Robust Object Tracking

      Authors: Ke Nai;Zhiyong Li;Haidong Wang;
      Pages: 7843 - 7857
      Abstract: Correlation filters with Convolutional Neural Network (CNN) features have attracted tremendous attention and success in visual tracking. However, redundant and noisy feature channels in CNN features may cause severe over-fitting and greatly limit the discriminative power of the tracking model. To tackle this issue, in this paper we develop a new and effective channel-aware correlation filters (CACF) method to boost tracking performance. Our CACF method dynamically selects representative and discriminative feature channels from high-dimensional CNN features to reduce model complexity and better distinguish the target object from the background. The CACF model is solved by the alternating direction method of multipliers (ADMM) to learn the correlation filters. By retaining reliable feature channels, our CACF tracker attains better generalization and discriminative ability to accurately localize the target object. Comprehensive experiments are conducted on challenging tracking datasets, and the experimental results show that our CACF method obtains favorable tracking accuracy compared to several popular tracking methods.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • SSAT: Self-Supervised Associating Network for Multiobject Tracking

      Authors: Tae-Young Chung;MyeongAh Cho;Heansung Lee;Sangyoun Lee;
      Pages: 7858 - 7868
      Abstract: Multi-object tracking (MOT), which is crucial for computer vision and video processing, has immense potential for improvement. Traditional tracking-by-detection approaches include feature-based object re-identification methods that use trained features, but these methods suffer from a lack of suitable training data. In training datasets used for MOT, every object in a video sequence must have its own location and ID. However, assigning IDs to each object in every sequence is considerably labor-intensive, and hence current MOT datasets are unsuitable for training re-identification networks. To resolve this issue, this paper proposes a novel self-supervised learning method using several short videos that contain no human-added labels, based on the idea that each video is a set of temporally corresponding image frames. We then describe how to improve tracking performance using a re-identification network trained in a self-supervised manner. In addition, ablation studies were conducted in order to define the optimal parameters, such as number of clips, data augmentation, and appropriate matching algorithms. The proposed approach achieved competitive performance compared with current best-practice methods including supervised methods, achieving MOT accuracy = 62.0% and ID F1-score = 62.7% on the MOT17 benchmark.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • RSDet++: Point-Based Modulated Loss for More Accurate
           Rotated Object Detection

      Authors: Wen Qian;Xue Yang;Silong Peng;Xiujuan Zhang;Junchi Yan;
      Pages: 7869 - 7879
      Abstract: We classify the discontinuity of the loss in both five-param and eight-param rotated object detection methods as rotation sensitivity error (RSE), which results in performance degradation. We introduce a novel modulated rotation loss to alleviate the problem, and a rotation sensitivity detection network (RSDet) which consists of an eight-param single-stage rotated object detector and the modulated rotation loss. Our proposed RSDet has several advantages: 1) it reformulates rotated object detection as predicting the corners of objects, while most previous methods employ five-param-based regression with inconsistent measurement units; 2) the modulated rotation loss achieves consistent improvement on both five-param and eight-param rotated object detection methods by resolving the loss discontinuity. To further improve the accuracy of our method on objects smaller than 10 pixels, we introduce RSDet++, which consists of a point-based anchor-free rotated object detector and a modulated rotation loss. Extensive experiments demonstrate the effectiveness of both RSDet and RSDet++, which achieve competitive results on rotated object detection on the challenging benchmarks DOTA-v1.0, DOTA-v1.5, and DOTA-v2.0. We hope the proposed method provides a new perspective for designing rotated object detection algorithms and draws more attention to tiny objects. The codes and models are available at: https://github.com/yangxue0827/RotationDetection.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
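
      A minimal sketch of a corner-regression loss modulated against ordering ambiguity, in the spirit of the abstract above: the eight-param (four-corner) loss is taken as the minimum smooth-L1 over cyclic re-orderings of the predicted corners, removing the discontinuity that a fixed corner ordering causes. The details are assumptions, not RSDet's released loss.

          import torch
          import torch.nn.functional as F

          def modulated_corner_loss(pred, target):
              """pred, target: (N, 4, 2) corner sets; min smooth-L1 over cyclic shifts."""
              losses = []
              for s in range(4):
                  shifted = torch.roll(pred, shifts=s, dims=1)
                  losses.append(
                      F.smooth_l1_loss(shifted, target, reduction="none").sum(dim=(1, 2)))
              return torch.stack(losses, dim=0).min(dim=0).values.mean()

          pred = torch.rand(8, 4, 2, requires_grad=True)
          target = torch.roll(pred.detach(), shifts=1, dims=1)  # same box, re-ordered corners
          print(float(modulated_corner_loss(pred, target)))     # 0: ordering absorbed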
       
  • UrbanLF: A Comprehensive Light Field Dataset for Semantic Segmentation of
           Urban Scenes

      Authors: Hao Sheng;Ruixuan Cong;Da Yang;Rongshan Chen;Sizhe Wang;Zhenglong Cui;
      Pages: 7880 - 7893
      Abstract: As one of the fundamental technologies for scene understanding, semantic segmentation has been widely explored in the last few years. Light field cameras encode geometric information by simultaneously recording the spatial and angular information of light rays, which provides a new way to approach this problem. In this paper, we propose a high-quality and challenging urban scene dataset containing 1074 samples composed of real-world and synthetic light field images, together with pixel-wise annotations for 14 semantic classes. To the best of our knowledge, it is the largest and most diverse light field dataset for semantic segmentation. We further design two new semantic segmentation baselines tailored to light fields and compare them with state-of-the-art RGB-, video-, and RGB-D-based methods on the proposed dataset. The superior results of our baselines demonstrate the advantages of the geometric information in light fields for this task. We also provide evaluations of super-resolution and depth estimation methods, showing that the proposed dataset presents new challenges and supports detailed comparisons among different methods. We expect this work to inspire new research directions and stimulate scientific progress in related fields. The complete dataset is available at https://github.com/HAWKEYE-Group/UrbanLF.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • JEDE: Universal Jersey Number Detector for Sports

      Authors: Hengyue Liu;Bir Bhanu;
      Pages: 7894 - 7909
      Abstract: The rapid progress in deep learning-based computer vision has opened unprecedented possibilities in computing various high-level analytics for sports. Artificial intelligence techniques such as predictive analysis, automatic highlight generation, and assistant coaching have been applied to improve performance and decision-making for teams and players. To perform any high-level analysis from a game match, collecting the locations (where) and identities (who) of players is crucial and challenging. In this paper, a universal JErsey number DEtector (JEDE) for player identification is presented that predicts players’ bounding boxes and keypoints, along with bounding boxes and classes of jersey digits and numbers in an end-to-end manner. Instead of generating digit proposals from pre-defined anchors, JEDE predicts more robust proposals guided by players’ features and pose estimation. Moreover, a dataset is collected from soccer and basketball matches with annotations on players’ bounding boxes and body keypoints, and jersey digits’ bounding boxes and labels. Extensive experimental results and ablation studies on the collected dataset show that the proposed method outperforms the state-of-the-art methods by a large margin. Both quantitative and qualitative results also demonstrate JEDE’s superior practicality and generalizability over different sports.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Selective Intra-Image Similarity for Personalized Fixation-Based Object
           Segmentation

      Authors: Huajun Zhou;Lingxiao Yang;Xiaohua Xie;Jianhuang Lai;
      Pages: 7910 - 7923
      Abstract: Personalized Fixation-based Object Segmentation (PFOS) aims at segmenting the gazed objects in images conditioned on personalized fixations. However, the performance of existing PFOS methods degrades when facing anomalous fixation maps (where some fixations fall in the background) or very large objects, because of their poor localization ability. In this paper, we propose a novel Selective Intra-image Similarity Network (SISNet) that achieves strong performance by precisely localizing the gazed objects. First, we propose a Response Purifying Module (RPM) to eliminate the false response regions caused by anomalous fixations in the background; by suppressing these false responses, we significantly reduce their negative impact. Second, we propose an intra-image similarity module (ISM) to better localize large objects by integrating more long-range information. In addition, we propose a new Discriminative Intersection-over-Union metric that evaluates whether PFOS methods can produce distinctive predictions for varying fixations. Experiments on the PFOS and our proposed OSIE-CFPS-UN datasets show that our network achieves remarkable improvements and outperforms existing state-of-the-art methods. Code has been published at https://www.github.com/moothes/SISNet.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Joint Sample Enhancement and Instance-Sensitive Feature Learning for
           Efficient Person Search

    • Free pre-print version: Loading...

      Authors: Xiao Ke;Hao Liu;Wenzhong Guo;Baitao Chen;Yuhang Cai;Weibin Chen;
      Pages: 7924 - 7937
      Abstract: Person search, consisting of jointly or separately trained person detection and person re-identification (Re-ID) stages, suffers from significant challenges such as inefficiency and difficulty in acquiring discriminative features. Prior work has either turned to end-to-end frameworks, whose performance is limited by task conflicts, or has focused on obtaining more accurate bounding boxes (Bboxes). Few studies have considered the impact of sample specificity in person search datasets when training a fine-grained Re-ID model, and few have considered obtaining discriminative Re-ID features from Bboxes in a more efficient way. In this paper, a novel sample-enhanced and instance-sensitive (SEIE) framework is designed to boost performance. By analyzing the structure of the person search framework, our method refines the two stages separately. For the detection stage, we re-design the usage of Bboxes, and a sample enhancement combination (SEC) is proposed to further improve the quality and quantity of Bboxes: SEC suppresses false positive detections and randomly generates high-quality positive samples. For the Re-ID stage, we contribute an instance similarity loss to exploit the similarity between class-less instances, and an Omni-scale Re-ID backbone is employed to learn more discriminative features. We obtain a more efficient and discriminative person search framework by concatenating the two stages. Extensive experiments demonstrate that our method achieves state-of-the-art performance at high speed and significantly outperforms other existing methods.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Exploiting Multiperspective Driven Hierarchical Content-Aware Network for
           Finger Vein Verification

      Authors: Pengyang Zhao;Shuping Zhao;Luyang Chen;Wenming Yang;Qingmin Liao;
      Pages: 7938 - 7950
      Abstract: The finger vein trait has attracted widespread attention for personal authentication in recent years. However, most finger vein verification methods operate on a single perspective, captured by a monocular near-infrared camera fixed at one side of the finger. Consequently, a single perspective captures few details of the spatial network structure of the finger vein and shows noticeable differences even when the posture of the same finger differs slightly; both effects impair verification performance. Hence, finger vein images captured from different viewpoints are considered in this work. We first design a low-cost multi-perspective dorsal finger vein imaging device for data collection. A deep neural network named the Hierarchical Content-Aware Network (HCAN) is then proposed to extract discriminative hierarchical features of the finger vein. Specifically, HCAN is composed of a Global Stem Network (GSN) and a Local Perception Module (LPM). The GSN extracts the latent global 3D feature from all perspectives through a recurrent neural network, retaining the details of previous hidden states by incorporating a memory weighting strategy. The LPM is designed to perceive each perspective from the viewpoint of image entropy: guided by an entropy loss, it captures prominent local features and improves the discriminability and robustness of the hierarchical feature. Experimental results on the newly collected THU-MFV database demonstrate the superiority of the proposed method over other multi-perspective and single-perspective methods.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Video Person Re-Identification Using Attribute-Enhanced Features

      Authors: Tianrui Chai;Zhiyuan Chen;Annan Li;Jiaxin Chen;Xinyu Mei;Yunhong Wang;
      Pages: 7951 - 7966
      Abstract: In this work we propose to boost video-based person re-identification (Re-ID) using attribute-enhanced feature representation. To this end, we not only use the ID-relevant attributes more effectively, but also, for the first time in the literature, harness the ID-irrelevant attributes to help model training. The former mainly include gender, age, clothing characteristics, etc., which contain rich and complementary information about the pedestrian; the latter include viewpoint, action, etc., which were seldom used for identification previously. In particular, we use the attributes to enhance the significant areas of the image with a novel Attribute Salient Region Enhance (ASRE) module that attends more accurately to the body of the pedestrian, so as to better separate the target from the background. Furthermore, we find that many ID-irrelevant but subject-relevant factors, such as the view angle and movement of the target pedestrian, have a great impact on the two-dimensional appearance of a pedestrian. We therefore propose to exploit both the ID-relevant and ID-irrelevant attributes via a novel triplet loss, the Viewpoint and Action-Invariant (VAI) triplet loss. Based on the above, we design an Attribute Salience Assisted Network (ASA-Net) that performs attribute recognition along with identity recognition and uses the attributes for feature enhancement and hard-sample mining. Extensive experiments on the MARS and DukeMTMC-VideoReID datasets show that our method outperforms the state-of-the-art methods. Visualizations of the learned results further prove the effectiveness of the proposed method.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Topology-Aware Flow-Based Point Cloud Generation

      Authors: Takumi Kimura;Takashi Matsubara;Kuniaki Uehara;
      Pages: 7967 - 7982
      Abstract: Point clouds have attracted attention as a representation of an object’s surface. Deep generative models have typically used a continuous map from a dense set in a latent space to express their variations. However, a continuous map cannot adequately express the varying numbers of holes. That is, previous approaches disregarded the topological structure of point clouds. Furthermore, a point cloud comprises several subparts, making it difficult to express it using a continuous map. This paper proposes ChartPointFlow, a flow-based deep generative model that forms a map conditioned on a label. Similar to a manifold chart, a map conditioned on a label is assigned to a continuous subset of a point cloud. Thus, ChartPointFlow is able to maintain the topological structure with clear boundaries and holes, whereas previous approaches generated blurry point clouds with fuzzy holes. The experimental results show that ChartPointFlow achieves state-of-the-art performance in various tasks, including generation, reconstruction, upsampling, and segmentation.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Overview of the Low Complexity Enhancement Video Coding (LCEVC) Standard

      Authors: Stefano Battista;Guido Meardi;Simone Ferrara;Lorenzo Ciccarelli;Florian Maurer;Massimo Conti;Simone Orcioni;
      Pages: 7983 - 7995
      Abstract: The Low Complexity Enhancement Video Coding (LCEVC) specification is a recent standard approved by ISO/IEC JTC 1/SC 29/WG 4 (MPEG Video Coding). The main goal of LCEVC is to provide a standalone toolset for the enhancement of any other existing codec. It works on top of other coding schemes, resulting in a multi-layer video coding technology, but unlike existing scalable video codecs it adds enhancement layers that are completely independent of the base codec. The LCEVC technology takes as input the decoded video at lower resolution and adds up to two enhancement sub-layers of residuals encoded with specialized low-complexity coding tools, such as simple temporal prediction, a frequency transform, quantization, and entropy encoding. This paper provides an overview of the main features of the LCEVC standard: high compression efficiency, low complexity, and minimal memory and processing-power requirements.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Rate-Distortion Optimal Transform Coefficient Selection for Unoccupied
           Regions in Video-Based Point Cloud Compression

      Authors: Christian Herglotz;Nils Genser;André Kaup;
      Pages: 7996 - 8009
      Abstract: This paper presents a novel method to determine rate-distortion-optimized transform coefficients for the efficient compression of videos generated from point clouds. The method exploits a generalized frequency-selective extrapolation approach that iteratively determines rate-distortion-optimized coefficients for all basis functions of the two-dimensional discrete cosine and sine transforms. The method is applied to blocks containing both occupied and unoccupied pixels in video-based point cloud compression with HEVC encoding. In the proposed algorithm, only the values of the transform coefficients are changed, so the resulting bit streams remain compliant with the V-PCC standard. For all-intra coded point clouds, bitrate savings of more than 4% for geometry and more than 6% for texture error metrics can be observed with respect to standard encoding. These savings are more than twice as high as those obtained by competing methods from the literature. In the random-access case, our proposed method outperforms competing V-PCC methods by more than 0.5%.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
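
      A minimal sketch of the rate-distortion decision underlying such coefficient selection: a quantized coefficient is kept only when the distortion it removes outweighs its assumed bit cost, i.e. the per-coefficient cost J = D + lambda * R is minimized. The bit-cost model and all constants are illustrative assumptions, not the V-PCC or HEVC rate model.

          import numpy as np

          def rd_select_coefficients(coeffs, qstep, lam, bits_per_nonzero=4.0):
              """Return quantized levels with RD-rejected coefficients zeroed."""
              q = np.round(coeffs / qstep)
              recon = q * qstep
              d_keep = (coeffs - recon) ** 2                   # distortion if kept
              d_drop = coeffs ** 2                             # distortion if zeroed
              keep = d_keep + lam * bits_per_nonzero * (q != 0) < d_drop
              return np.where(keep, q, 0.0)

          coeffs = np.random.default_rng(0).normal(scale=10.0, size=64)
          print(rd_select_coefficients(coeffs, qstep=8.0, lam=5.0))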
       
  • A Feature Transformation Framework With Selective Pseudo-Labeling for 2D
           Image-Based 3D Shape Retrieval

      Authors: Nian Hu;Heyu Zhou;Xiangdong Huang;Xuanya Li;An-An Liu;
      Pages: 8010 - 8021
      Abstract: 2D image-based 3D shape retrieval (2D-to-3D) aims at searching for the corresponding 3D shapes (unlabeled) given a 2D image (labeled); it is a fundamental task in computer vision and has gained a surge of attention in recent years. However, extensive prior work is limited in two respects: 1) it reduces domain discrepancy while ignoring 3D shape style, and 2) 3D shapes are simply and crudely pseudo-annotated by the 2D image-supervised classifier, neglecting the structural information underlying the 3D shape domain. To remedy these issues, we propose a feature transformation framework with selective pseudo-labeling (FTSPL) for the 2D-to-3D task. Specifically, we first employ CNNs to produce both 2D image and 3D shape (described as multiple views) features; we then enforce class-wise inter-domain centroid alignment to reduce the overall domain discrepancy. In addition, we exploit the intra-category attribute variation (covariance) of the 3D shape features to transform the 2D image features, equipping them with 3D shape style. Since the centroid and covariance estimates of 3D shape features require accurate label predictions, we put forward a selective pseudo-labeling module, which assigns reliable pseudo-labels to 3D shapes via the nearest category centroid and cluster analysis, respectively, while preserving the structural information of 3D shapes. Comprehensive experiments validate that our model surpasses the state-of-the-art methods on standard 2D-to-3D benchmarks (MI3DOR and MI3DOR-2).
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Discrete Joint Semantic Alignment Hashing for Cross-Modal Image-Text
           Search

      Authors: Song Wang;Huan Zhao;Keqin Li;
      Pages: 8022 - 8036
      Abstract: Supervised cross-modal image-text hashing has attracted extensive attention for comprehending the correspondence between vision and language in data search tasks. Existing methods learn compact hash codes by leveraging given image-text data pairs or supervised information to explore this correspondence. However, they still suffer from obvious drawbacks. First, multiple kinds of semantic information are not jointly engaged, which yields suboptimal search performance. Second, most of them adopt a continuous relaxation strategy that discards the discrete constraints, resulting in large binary quantization errors. To deal with these problems, we propose a novel supervised hashing method, termed Discrete Joint Semantic Alignment Hashing (DJSAH). Specifically, it builds a connection between semantics (i.e., class labels and pairwise similarities) through joint semantic alignment learning, so that high-level discriminative semantics are preserved in the hash codes. Besides, a well-designed discrete optimization algorithm with linear computation and memory cost is developed to reduce the information loss of the hash codes without any need for relaxation. Extensive experiments and analyses on three benchmark datasets validate the superiority of the proposed DJSAH over several state-of-the-art hashing methods.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Dual-Level Representation Enhancement on Characteristic and Context for
           Image-Text Retrieval

      Authors: Song Yang;Qiang Li;Wenhui Li;Xuanya Li;An-An Liu;
      Pages: 8037 - 8050
      Abstract: Image-text retrieval is a fundamental and vital task in multimedia retrieval and has received growing attention since it connects heterogeneous data. Previous methods that perform well on image-text retrieval mainly focus on the interaction between image regions and text words, but they lack a joint exploration of the characteristics and contexts of regions and words, which causes semantic confusion between similar objects and a loss of contextual understanding. To address these issues, a dual-level representation enhancement network (DREN) is proposed to strengthen the characteristic and contextual representations through innovative block-level and instance-level representation enhancement modules, respectively. The block-level module focuses on mining the potential relations between multiple blocks within each instance representation, while the instance-level module concentrates on learning the contextual relations between different instances. To facilitate the accurate matching of image-text pairs, we propose graph correlation inference and weighted adaptive filtering to conduct local and global matching. Extensive experiments on two challenging datasets (i.e., Flickr30K and MSCOCO) verify the superiority of our method for image-text retrieval.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • SentiStory: A Multi-Layered Sentiment-Aware Generative Model for Visual
           Storytelling

      Authors: Wei Chen;Xuefeng Liu;Jianwei Niu;
      Pages: 8051 - 8064
      Abstract: The visual storytelling (VIST) task aims at generating reasonable, human-like, and coherent stories from image streams. Although many deep learning models have achieved promising results, most of them do not directly leverage the sentiment information of stories. In this paper, we propose a sentiment-aware generative model for VIST called SentiStory. The key component of SentiStory is a multi-layered sentiment extraction module (MLSEM): for a given image stream, the higher layer gives coarse-grained but accurate sentiments, while the lower layer extracts fine-grained but usually unreliable ones, and the two layers are combined strategically to generate coherent and rich visual sentiment concepts for the VIST task. Results from both automatic and human evaluations demonstrate that, with the help of the MLSEM, SentiStory generates more coherent and human-like stories.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Selective Element and Two Orders Vectorization Networks for Automatic
           Depression Severity Diagnosis via Facial Changes

      Authors: Mingyue Niu;Ziping Zhao;Jianhua Tao;Ya Li;Björn W. Schuller;
      Pages: 8065 - 8077
      Abstract: Physiological studies have shown that healthy and depressed individuals present different facial changes. Thus, many researchers have attempted to use Convolutional Neural Networks (CNNs) to extract high-level facial dynamic representations for predicting depression severity. However, the max-pooling (or average-pooling) layers in a CNN lead to the loss of subtle depression cues, while without pooling layers the CNN cannot extract multi-scale information and has difficulty with tensor vectorization. To this end, we propose a Selective Element and Two Orders Vectorization (SE-TOV) network. In the SE-TOV network, an SE block is constructed to adaptively select the effective elements from tensors obtained by receptive fields of different sizes. Moreover, we propose a TOV block for vectorizing a high-dimensional tensor: on the one hand, it feeds the tensor into a global average pooling layer to obtain the first-order vectorization result; on the other hand, it takes the principal components of the correlation matrix of the channels in the tensor as the second-order vectorization result. Experimental results on the AVEC 2013 (RMSE = 7.42, MAE = 6.09) and AVEC 2014 (RMSE = 7.39, MAE = 5.87) depression databases illustrate the superiority of our approach over previous works.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
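
      A minimal sketch of the two vectorization orders described in the abstract above: the first-order result is global average pooling over spatial positions, and the second-order result takes the leading principal components of the channel correlation matrix. The toy tensor and function name are assumptions.

          import numpy as np

          def two_orders_vectorize(feat, k=4):
              """feat: (C, H, W) tensor -> concatenated first/second-order vector."""
              C = feat.shape[0]
              flat = feat.reshape(C, -1)
              first = flat.mean(axis=1)                 # first order: global average pooling
              corr = np.corrcoef(flat)                  # C x C channel correlation matrix
              vals, vecs = np.linalg.eigh(corr)
              second = vecs[:, -k:].reshape(-1)         # second order: top-k principal comps
              return np.concatenate([first, second])

          feat = np.random.default_rng(0).normal(size=(8, 16, 16))
          print(two_orders_vectorize(feat).shape)       # (8 + 8*4,) = (40,)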
       
  • Ownership Verification of DNN Architectures via Hardware Cache Side
           Channels

      Authors: Xiaoxuan Lou;Shangwei Guo;Jiwei Li;Tianwei Zhang;
      Pages: 8078 - 8093
      Abstract: Deep Neural Networks (DNNs) are gaining higher commercial value in computer vision applications, e.g., image classification and video analytics. This creates an urgent demand for intellectual property (IP) protection of DNN models. In this paper, we present a novel watermarking scheme to achieve ownership verification of DNN architectures. Existing works all embed watermarks into the model parameters while treating the architecture as public property; these solutions have proven vulnerable to adversaries who detect or remove the watermarks. In contrast, we claim the model architecture as an important IP for model owners and propose implanting watermarks into architectures. We design new algorithms based on Neural Architecture Search (NAS) to generate watermarked architectures that are unique enough to represent ownership while maintaining high model usability. Such watermarks can be extracted via side-channel-based model extraction techniques with high fidelity. We conduct comprehensive experiments on watermarked CNN models for image classification tasks; the experimental results show that our scheme has negligible impact on model performance and exhibits strong robustness against various model transformations and adaptive attacks.
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
  • Call for IEEE T-CSVT Associate Editors Nomination

      Pages: 8094 - 8094
      Abstract: Reports on news of interest to CAS members
      PubDate: Nov. 2022
      Issue No: Vol. 32, No. 11 (2022)
       
 