Subjects -> LIBRARY AND INFORMATION SCIENCES (Total: 392 journals)
    - DIGITAL CURATION AND PRESERVATION (13 journals)
    - LIBRARY ADMINISTRATION (1 journal)
    - LIBRARY AND INFORMATION SCIENCES (378 journals)

LIBRARY AND INFORMATION SCIENCES (378 journals)

Showing 1 - 200 of 379 Journals sorted by number of followers
Library & Information Science Research     Hybrid Journal   (Followers: 1821)
Journal of Librarianship and Information Science     Hybrid Journal   (Followers: 1337)
Library Hi Tech     Hybrid Journal   (Followers: 1140)
Journal of Information Science     Hybrid Journal   (Followers: 1112)
Journal of Academic Librarianship     Hybrid Journal   (Followers: 1100)
Library Management     Hybrid Journal   (Followers: 977)
The Electronic Library     Hybrid Journal   (Followers: 976)
Library Quarterly     Full-text available via subscription   (Followers: 941)
Global Knowledge, Memory and Communication     Hybrid Journal   (Followers: 882)
Journal of Information Literacy     Open Access   (Followers: 858)
Library Hi Tech News     Hybrid Journal   (Followers: 788)
Information Technology and Libraries     Open Access   (Followers: 736)
New Library World     Hybrid Journal   (Followers: 684)
Journal of Library & Information Services in Distance Learning     Hybrid Journal   (Followers: 635)
Information Retrieval     Hybrid Journal   (Followers: 616)
Information Sciences     Hybrid Journal   (Followers: 602)
International Journal on Digital Libraries     Hybrid Journal   (Followers: 580)
Information Processing & Management     Hybrid Journal   (Followers: 567)
Information Systems Research     Full-text available via subscription   (Followers: 557)
College & Research Libraries     Open Access   (Followers: 528)
Evidence Based Library and Information Practice     Open Access   (Followers: 461)
Journal of Library and Information Science     Open Access   (Followers: 444)
International Information & Library Review     Hybrid Journal   (Followers: 437)
The Information Society: An International Journal     Hybrid Journal   (Followers: 406)
Library Trends     Full-text available via subscription   (Followers: 390)
Library and Information Research     Open Access   (Followers: 363)
Forensic Science International: Digital Investigation     Full-text available via subscription   (Followers: 344)
Annals of Library and Information Studies (ALIS)     Open Access   (Followers: 337)
International Journal of Library Science     Open Access   (Followers: 303)
Canadian Journal of Information and Library Science     Full-text available via subscription   (Followers: 289)
College & Research Libraries News     Partially Free   (Followers: 286)
Bioinformatics     Hybrid Journal   (Followers: 283)
The Reference Librarian     Hybrid Journal   (Followers: 267)
College & Undergraduate Libraries     Hybrid Journal   (Followers: 261)
IFLA Journal     Hybrid Journal   (Followers: 261)
Library Leadership & Management     Open Access   (Followers: 261)
Journal of Electronic Resources Librarianship     Hybrid Journal   (Followers: 259)
Journal of Library Administration     Hybrid Journal   (Followers: 254)
Library Collections, Acquisitions, and Technical Services     Hybrid Journal   (Followers: 253)
Communications in Information Literacy     Open Access   (Followers: 244)
Data Technologies and Applications     Hybrid Journal   (Followers: 236)
American Libraries     Partially Free   (Followers: 223)
Journal of the Medical Library Association     Open Access   (Followers: 222)
Code4Lib Journal     Open Access   (Followers: 218)
Journal of Information & Knowledge Management     Hybrid Journal   (Followers: 214)
International Journal of Information Management     Hybrid Journal   (Followers: 212)
Cataloging & Classification Quarterly     Hybrid Journal   (Followers: 207)
Journal of Library Metadata     Hybrid Journal   (Followers: 206)
Australian Library Journal     Full-text available via subscription   (Followers: 198)
Journal of Documentation     Hybrid Journal   (Followers: 195)
portal: Libraries and the Academy     Full-text available via subscription   (Followers: 189)
Ariadne Magazine     Open Access   (Followers: 185)
Journal of Hospital Librarianship     Hybrid Journal   (Followers: 184)
Behavioral & Social Sciences Librarian     Hybrid Journal   (Followers: 179)
Aslib Proceedings     Hybrid Journal   (Followers: 172)
Library & Information History     Hybrid Journal   (Followers: 165)
American Archivist     Hybrid Journal   (Followers: 161)
EDUCAUSE Review     Full-text available via subscription   (Followers: 161)
Research Library Issues     Free   (Followers: 159)
The Serials Librarian     Hybrid Journal   (Followers: 156)
The Library : The Transactions of the Bibliographical Society     Hybrid Journal   (Followers: 154)
New Review of Academic Librarianship     Hybrid Journal   (Followers: 151)
Book History     Full-text available via subscription   (Followers: 149)
Against the Grain     Partially Free   (Followers: 143)
Library Technology Reports     Full-text available via subscription   (Followers: 141)
Journal of eScience Librarianship     Open Access   (Followers: 134)
DESIDOC Journal of Library & Information Technology     Open Access   (Followers: 105)
Archives and Museum Informatics     Hybrid Journal   (Followers: 99)
Australian Academic & Research Libraries     Full-text available via subscription   (Followers: 99)
European Journal of Information Systems     Hybrid Journal   (Followers: 95)
Online Information Review     Hybrid Journal   (Followers: 91)
Journal of Librarianship and Scholarly Communication     Open Access   (Followers: 88)
International Journal of Digital Curation     Open Access   (Followers: 85)
Information Technologies & International Development     Open Access   (Followers: 84)
Journal of Electronic Publishing     Open Access   (Followers: 77)
Serials Review     Hybrid Journal   (Followers: 75)
Journal of Education in Library and Information Science - JELIS     Full-text available via subscription   (Followers: 74)
International Journal of Digital Library Systems     Full-text available via subscription   (Followers: 74)
Journal of Interlibrary Loan Document Delivery & Electronic Reserve     Hybrid Journal   (Followers: 69)
LIBER Quarterly : The Journal of the Association of European Research Libraries     Open Access   (Followers: 68)
Archival Science     Hybrid Journal   (Followers: 66)
Ethics and Information Technology     Hybrid Journal   (Followers: 66)
Journal of the Canadian Health Libraries Association / Journal de l'Association des bibliothèques de la santé du Canada     Open Access   (Followers: 66)
Library Philosophy and Practice     Open Access   (Followers: 66)
Insights : the UKSG journal     Open Access   (Followers: 65)
Practical Academic Librarianship : The International Journal of the SLA Academic Division     Open Access   (Followers: 65)
MIS Quarterly : Management Information Systems Quarterly     Hybrid Journal   (Followers: 63)
Journal of Management Information Systems     Full-text available via subscription   (Followers: 60)
Science & Technology Libraries     Hybrid Journal   (Followers: 59)
Journal of Information Technology     Hybrid Journal   (Followers: 56)
The Bottom Line: Managing Library Finances     Hybrid Journal   (Followers: 56)
Alexandria : The Journal of National and International Library and Information Issues     Full-text available via subscription   (Followers: 56)
Journal of Health & Medical Informatics     Open Access   (Followers: 54)
Partnership : the Canadian Journal of Library and Information Practice and Research     Open Access   (Followers: 54)
Archives and Manuscripts     Hybrid Journal   (Followers: 52)
International Journal of Legal Information     Full-text available via subscription   (Followers: 51)
Library & Archival Security     Hybrid Journal   (Followers: 49)
Bangladesh Journal of Library and Information Science     Open Access   (Followers: 47)
OCLC Systems & Services     Hybrid Journal   (Followers: 46)
Community & Junior College Libraries     Hybrid Journal   (Followers: 45)
Information Discovery and Delivery     Hybrid Journal   (Followers: 44)
Journal of Access Services     Hybrid Journal   (Followers: 40)
Medical Reference Services Quarterly     Hybrid Journal   (Followers: 40)
VINE Journal of Information and Knowledge Management Systems     Hybrid Journal   (Followers: 40)
Journal of the Society of Archivists     Hybrid Journal   (Followers: 36)
Scholarly and Research Communication     Open Access   (Followers: 36)
Public Library Quarterly     Hybrid Journal   (Followers: 32)
Journal of Archival Organization     Hybrid Journal   (Followers: 31)
Information & Culture : A Journal of History     Full-text available via subscription   (Followers: 31)
Australasian Public Libraries and Information Services     Full-text available via subscription   (Followers: 31)
Journal of the Association for Information Systems     Open Access   (Followers: 31)
Research Evaluation     Hybrid Journal   (Followers: 30)
Foundations and Trends® in Information Retrieval     Full-text available via subscription   (Followers: 30)
Information     Open Access   (Followers: 29)
International Journal of Information Retrieval Research     Full-text available via subscription   (Followers: 29)
Information Systems Frontiers     Hybrid Journal   (Followers: 27)
International Journal of Intellectual Property Management     Hybrid Journal   (Followers: 26)
International Journal of Information Privacy, Security and Integrity     Hybrid Journal   (Followers: 26)
Proceedings of the American Society for Information Science and Technology     Hybrid Journal   (Followers: 26)
Health Information Management Journal     Hybrid Journal   (Followers: 26)
Journal of the Institute of Conservation     Hybrid Journal   (Followers: 25)
Access     Full-text available via subscription   (Followers: 24)
Nordic Journal of Information Literacy in Higher Education     Open Access   (Followers: 24)
South African Journal of Libraries and Information Science     Open Access   (Followers: 23)
Sci-Tech News     Open Access   (Followers: 23)
LASIE : Library Automated Systems Information Exchange     Free   (Followers: 22)
Journal of Information, Communication and Ethics in Society     Hybrid Journal   (Followers: 22)
NASIG Newsletter     Open Access   (Followers: 21)
InCite     Full-text available via subscription   (Followers: 20)
Georgia Library Quarterly     Open Access   (Followers: 20)
LOEX Quarterly     Full-text available via subscription   (Followers: 20)
RBM : A Journal of Rare Books, Manuscripts, and Cultural Heritage     Open Access   (Followers: 20)
Urban Library Journal     Open Access   (Followers: 19)
El Profesional de la Informacion     Full-text available via subscription   (Followers: 18)
Journal of Research on Libraries and Young Adults     Open Access   (Followers: 18)
International Journal of Web Portals     Full-text available via subscription   (Followers: 17)
Communication Booknotes Quarterly     Hybrid Journal   (Followers: 16)
Theological Librarianship : An Online Journal of the American Theological Library Association     Open Access   (Followers: 16)
Perspectives in International Librarianship     Open Access   (Followers: 16)
Biblioteca Universitaria     Open Access   (Followers: 16)
Collection and Curation     Hybrid Journal   (Followers: 15)
Manuscripta     Full-text available via subscription   (Followers: 15)
Bibliotheca Orientalis     Full-text available via subscription   (Followers: 14)
International Journal of Business Information Systems     Hybrid Journal   (Followers: 14)
International Journal of Information Technology, Communications and Convergence     Hybrid Journal   (Followers: 14)
Notes     Full-text available via subscription   (Followers: 14)
Online Journal of Public Health Informatics     Open Access   (Followers: 14)
Alexandría : Revista de Ciencias de la Información     Open Access   (Followers: 14)
Anales de Documentacion     Open Access   (Followers: 14)
Journal of Educational Media, Memory, and Society     Full-text available via subscription   (Followers: 13)
Biblios     Open Access   (Followers: 13)
International Journal of Intercultural Information Management     Hybrid Journal   (Followers: 12)
Alsic : Apprentissage des Langues et Systèmes d'Information et de Communication     Open Access   (Followers: 12)
Journal of Information Technology Teaching Cases     Hybrid Journal   (Followers: 12)
Journal of Religious & Theological Information     Hybrid Journal   (Followers: 11)
Universal Access in the Information Society     Hybrid Journal   (Followers: 11)
InterActions: UCLA Journal of Education and Information     Open Access   (Followers: 11)
International Journal of Information and Decision Sciences     Hybrid Journal   (Followers: 11)
Journal of Information Systems     Full-text available via subscription   (Followers: 11)
Kansas Library Association College & University Libraries Section Proceedings     Open Access   (Followers: 11)
Journal of Information Engineering and Applications     Open Access   (Followers: 10)
Journal of Global Information Management     Full-text available via subscription   (Followers: 9)
Southeastern Librarian     Open Access   (Followers: 9)
e & i Elektrotechnik und Informationstechnik     Hybrid Journal   (Followers: 8)
JLIS.it     Open Access   (Followers: 8)
International Journal of Multicriteria Decision Making     Hybrid Journal   (Followers: 8)
JISTEM : Journal of Information Systems and Technology Management     Open Access   (Followers: 8)
International Journal of Multimedia Information Retrieval     Partially Free   (Followers: 8)
BIBLOS - Revista do Departamento de Biblioteconomia e História     Open Access   (Followers: 7)
New Review of Information Networking     Hybrid Journal   (Followers: 7)
Idaho Librarian     Free   (Followers: 7)
Slavic & East European Information Resources     Hybrid Journal   (Followers: 6)
Egyptian Informatics Journal     Open Access   (Followers: 6)
Informaatiotutkimus     Open Access   (Followers: 5)
Revista Interamericana de Bibliotecología     Open Access   (Followers: 5)
CIC. Cuadernos de Informacion y Comunicacion     Open Access   (Followers: 5)
Bridgewater Review     Open Access   (Followers: 5)
Bilgi Dünyası     Open Access   (Followers: 5)
Open Systems & Information Dynamics     Hybrid Journal   (Followers: 4)
ProInflow : Journal for Information Sciences     Open Access   (Followers: 4)
Nordic Journal of Library and Information Studies     Open Access   (Followers: 4)
International Journal of Cooperative Information Systems     Hybrid Journal   (Followers: 4)
OJS på dansk     Open Access   (Followers: 4)
Investigación Bibliotecológica     Open Access   (Followers: 4)
Revista Española de Documentación Científica     Open Access   (Followers: 4)
International Journal of Organisational Design and Engineering     Hybrid Journal   (Followers: 3)
Journal of Information Systems Teaching Notes     Hybrid Journal   (Followers: 3)
HLA News     Full-text available via subscription   (Followers: 3)
Encontros Bibli : revista eletrônica de biblioteconomia e ciência da informação     Open Access   (Followers: 3)
SLIS Student Research Journal     Open Access   (Followers: 3)
VRA Bulletin     Open Access   (Followers: 3)
Türk Kütüphaneciliği : Turkish Librarianship     Open Access   (Followers: 2)
Información, Cultura y Sociedad     Open Access   (Followers: 2)
Revista General de Información y Documentación     Open Access   (Followers: 2)
Informação & Informação     Open Access   (Followers: 2)
In Monte Artium     Full-text available via subscription   (Followers: 1)
Knjižnica : Revija za Področje Bibliotekarstva in Informacijske Znanosti     Open Access   (Followers: 1)
Documentación de las Ciencias de la Información     Open Access   (Followers: 1)
Palabra Clave (La Plata)     Open Access  
Liinc em Revista     Open Access  


Similar Journals
International Journal of Multimedia Information Retrieval
Journal Prestige (SJR): 0.268
Citation Impact (citeScore): 1
Number of Followers: 8  
 
  Partially Free Journal
ISSN (Print) 2192-6611 - ISSN (Online) 2192-662X
Published by Springer-Verlag
  • MemeTector: enforcing deep focus for meme detection

      Abstract: Image memes, and specifically their widely known variation image macros, are a special new media type that combines text with images and is used in social media to playfully or subtly express humor, irony, sarcasm, and even hate. It is important to accurately retrieve image memes from social media to better capture the cultural and social aspects of online phenomena and to detect potential issues (hate speech, disinformation). Essentially, the background image of an image macro is a regular image, easily recognized as such by humans but cumbersome for a machine to recognize due to feature-map similarity with the complete image macro. Hence, accumulating suitable feature maps in such cases can lead to a deep understanding of the notion of image memes. To this end, we propose a methodology, called visual part utilization, that utilizes the visual part of image memes as instances of the regular-image class and the initial image memes as instances of the image-meme class, forcing the model to concentrate on the critical parts that characterize an image meme. Additionally, we employ a trainable attention mechanism on top of a standard ViT architecture to enhance the model’s ability to focus on these critical parts and make the predictions interpretable. Several training and test scenarios involving web-scraped regular images with controlled text presence are considered to evaluate the model in terms of robustness and accuracy. The findings indicate that light visual part utilization combined with sufficient text presence during training provides the best and most robust model, surpassing the state of the art. Source code and dataset are available at https://github.com/mever-team/memetector.
      PubDate: 2023-05-13
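The trainable attention mechanism described above can be sketched as a learned pooling over the ViT's patch tokens. The sketch below is a minimal NumPy illustration, not the paper's implementation: a single assumed query vector `w` stands in for the attention head, and the resulting weights both pool the tokens and expose which patches the model focused on.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_pool(tokens, w):
    """Pool ViT patch tokens with a learned attention query.

    tokens: (n_tokens, dim) patch embeddings from the backbone
    w:      (dim,) trainable query vector scoring each token
    Returns the attention-weighted summary vector and the weights;
    the weights are what make the prediction interpretable (they
    highlight the patches the classifier relied on).
    """
    scores = tokens @ w          # (n_tokens,) one score per patch
    alpha = softmax(scores)      # attention distribution over patches
    pooled = alpha @ tokens      # (dim,) weighted combination
    return pooled, alpha
```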
       
  • Maximizing mutual information inside intra- and inter-modality for
           audio-visual event retrieval

      Abstract: The human brain processes sound and visual information in overlapping areas of the cerebral cortex, which means that audio and visual information are deeply correlated with each other when we explore the world. To simulate this function of the human brain, audio-visual event retrieval (AVER) has been proposed: using data from one modality (e.g., audio) to query data from another. In this work, we aim to improve the performance of audio-visual event retrieval. To achieve this goal, we first propose a novel network, InfoIIM, which enhances the accuracy of intra-modal feature representation and inter-modal feature alignment. The backbone of this network is a parallel connection of two VAE models with two different encoders and a shared decoder. Secondly, to enable the VAE to learn better feature representations and to improve intra-modal retrieval performance, we use InfoMax-VAE instead of the vanilla VAE model. Additionally, we study the influence of modality-shared features on the effectiveness of audio-visual event retrieval. To verify the effectiveness of the proposed method, we validate our model on the AVE dataset; the results show that it outperforms several existing algorithms on most metrics. Finally, we present our future research directions, hoping to inspire relevant researchers.
      PubDate: 2023-05-04
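The parallel-VAE backbone described above can be sketched in miniature: two modality-specific encoders produce Gaussian latents, and one shared decoder reconstructs from either, forcing both modalities into a common space. Everything here (linear encoders, the dimensions, the absence of the InfoMax term) is a simplifying assumption, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x, W):
    """Modality-specific linear encoder producing mean and log-variance."""
    h = x @ W
    d = h.shape[-1] // 2
    return h[..., :d], h[..., d:]          # mu, logvar

def reparameterize(mu, logvar):
    """VAE reparameterization trick: z = mu + sigma * eps."""
    return mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)

# Two encoders (audio, visual) feeding one shared decoder, so both
# modalities are mapped into the same latent and output spaces.
W_audio = rng.standard_normal((128, 32))   # audio features -> 16-d latent
W_visual = rng.standard_normal((512, 32))  # visual features -> 16-d latent
W_dec = rng.standard_normal((16, 64))      # shared decoder

def forward(x, W_enc):
    mu, logvar = encode(x, W_enc)
    return reparameterize(mu, logvar) @ W_dec
```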
       
  • Multiple feedback based adversarial collaborative filtering with
           aesthetics

      Abstract: Visual-aware personalized recommendation systems can estimate potential demand by evaluating consumers' personalized preferences. In general, consumer feedback data is deduced from either explicit or implicit feedback. However, both kinds of feedback raise the chance of malicious operation or misoperation, which can bias recommendation outcomes. Adversarial learning, a regularization approach that resists disturbances, is a promising choice for enhancing model resilience. We propose a novel adversarial collaborative filtering with aesthetics (ACFA) for visual recommendation that uses adversarial learning to improve resilience and performance under perturbation. The ACFA algorithm applies three types of input to visual Bayesian personalized ranking: negative, unobserved, and positive feedback. Through feedback at these various levels, it uses a probabilistic approach to obtain consumers' personalized preferences. Since aesthetic data is critical in determining consumer preferences for products in visual recommendation, we construct the consumer preference model with aesthetic elements and use them to enhance sampling quality when training the algorithm. To mitigate the negative effects of feedback noise, we use minimax adversarial learning to optimize the ACFA objective function. Experiments on two datasets demonstrate that the ACFA model outperforms state-of-the-art algorithms on two metrics.
      PubDate: 2023-04-29
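The minimax adversarial step can be illustrated on plain Bayesian personalized ranking: perturb the item embeddings in the worst-case (gradient) direction, then penalize the loss evaluated under that perturbation. This is an APR-style sketch under simplifying assumptions (one user/item triple, analytic gradient, no aesthetic features), not the exact ACFA objective.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bpr_apr_loss(u, i, j, eps=0.5, lam=1.0):
    """BPR loss plus an adversarial regularizer.

    u, i, j: user, positive-item, and negative-item embedding vectors.
    The item embeddings are perturbed in the gradient (worst-case)
    direction, FGSM-style, and the BPR loss is re-evaluated on the
    perturbed embeddings to encourage robustness to small disturbances.
    """
    x = u @ (i - j)                       # preference margin
    loss = -np.log(sigmoid(x))            # plain BPR loss
    # gradient of the loss w.r.t. the positive item embedding
    g = -(1.0 - sigmoid(x)) * u           # (= -d loss / d j as well)
    d_i = eps * g / (np.linalg.norm(g) + 1e-12)
    d_j = -d_i
    x_adv = u @ ((i + d_i) - (j + d_j))   # margin under worst-case shift
    return loss + lam * (-np.log(sigmoid(x_adv)))
```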
       
  • A deep image retrieval network using Max-m-Min pooling and morphological
           feature generating residual blocks

      Abstract: The textural and structural information contained in the images is very important for generating highly discriminative features for the task of image retrieval. Morphological operations are nonlinear mathematical operations that can provide such textural and structural information. In this work, a new residual block based on a module using morphological operations coupled with an edge extraction module is proposed. A novel pooling operation focusing on the edges of the images is also proposed. A deep convolutional network is then designed using the proposed residual block and the new pooling operation that significantly improves its representational capacity. Extensive experiments are carried out to show the effectiveness of the ideas used in the design of the proposed deep image retrieval network. The proposed network is shown to significantly outperform existing state-of-the-art image retrieval networks on various benchmark datasets.
      PubDate: 2023-04-26
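Morphological operations on images can be written as max/min pooling: dilation is a local maximum, erosion a local minimum, and their difference (the morphological gradient) highlights exactly the textural and structural information the paper targets. The paper's Max-m-Min pooling itself is not reproduced here; this NumPy sketch only shows the underlying max/min machinery.

```python
import numpy as np

def pool2d(img, k, op):
    """Apply op (np.max or np.min) over k-by-k windows, stride 1, valid padding."""
    h, w = img.shape
    out = np.empty((h - k + 1, w - k + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = op(img[y:y + k, x:x + k])
    return out

def morph_gradient(img, k=3):
    """Dilation minus erosion: responds only where local structure
    (edges, texture) exists, and is zero on flat regions."""
    return pool2d(img, k, np.max) - pool2d(img, k, np.min)
```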
       
  • Study of Alzheimer’s disease brain impairment and methods for its early
           diagnosis: a comprehensive survey

      Abstract: Alzheimer’s disease (AD) is one of the most severe kinds of dementia affecting the elderly population. Since the disease is incurable and changes in brain sub-regions begin decades before symptoms are observed, early detection is especially challenging. Discriminating similar brain patterns for AD classification is difficult, as minute changes in biomarkers are detected across different neuroimaging modalities and image projections. Deep learning models have provided excellent performance in analyzing various neuroimaging and clinical data. In this survey, we perform a comparative analysis of 134 papers published between 2017 and 2022 to give a 360° view of the AD problem and of the work done to examine and analyze its causative factors. Different pre-processing tools and techniques, various datasets, and the brain sub-regions mainly affected by AD are reviewed, followed by a deeper analysis of biomarkers, feature extraction techniques, and deep learning and machine learning architectures. The latest research articles with valuable findings are summarized in multiple tables; biomarkers, pre-processing techniques, and AD detection methods are classified in figures; and a table contrasts binary and multi-class accuracies for stage-based AD classification. We conclude by addressing challenges faced during classification and provide recommendations for future research in diagnosing the various stages of AD.
      PubDate: 2023-03-17
       
  • Video anomaly detection with memory-guided multilevel embedding

      Abstract: Playing a vitally important role in intelligent video surveillance systems and smart cities, video anomaly detection (VAD) has been widely practiced and studied in both industry and academia. In the present study, a new anomaly detection method based on multi-level memory embedding is proposed. The feature prototypes of samples are stored in a memory pool, which enhances the diversity of the sample feature prototype paradigm. The memory is embedded into the decoder in a hierarchical, integrating manner, which makes the feature information of the object more complete and improves feature quality. At the end of the model, the relationships between object features are modeled along the channel dimension, making the model capable of more efficient anomaly detection. The method is verified by evaluation on three publicly available datasets: UCSD Ped2, CUHK Avenue, and ShanghaiTech.
      PubDate: 2023-03-15
      DOI: 10.1007/s13735-023-00272-x
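The memory-pool idea above can be sketched as a soft read: compare an encoder feature against stored prototypes of normal behavior and reconstruct it from them; a feature that cannot be reconstructed well from the pool signals an anomaly. The cosine similarity and softmax temperature are illustrative assumptions, not the paper's exact read mechanism.

```python
import numpy as np

def memory_read(query, memory, temperature=0.1):
    """Read from a memory pool of normal-feature prototypes.

    query:  (dim,) encoder feature for the current frame/object
    memory: (n_items, dim) stored prototypes
    Returns the similarity-weighted combination of prototypes and the
    weights; a large gap between query and read-out marks an anomaly.
    """
    q = query / (np.linalg.norm(query) + 1e-12)
    m = memory / (np.linalg.norm(memory, axis=1, keepdims=True) + 1e-12)
    sim = m @ q                       # cosine similarity per prototype
    w = np.exp(sim / temperature)     # sharpen into attention weights
    w /= w.sum()
    return w @ memory, w
```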
       
  • Nested-Net: a deep nested network for background subtraction

      Abstract: Background subtraction is one of the most highly regarded steps in computer vision, especially in video surveillance applications. Although various approaches have been proposed to cope with the different difficulties of this field, many of these methods have not been able to fully tackle complicated situations in realistic scenes due to their sensitivity to many challenges. This paper presents a deep nested background subtraction algorithm based on residual micro-autoencoder blocks. Hence, our method is implemented as a U-net like architecture with more skip connections. The nested network uses residual connections between these micro-autoencoders that can extract significant multi-scale features of a complex scene. We also test and prove that the proposed method can work in various challenging situations. A small set of training samples is included to train this end-to-end network. The experimental results demonstrate that our model outperforms other state-of-the-art methods on two well-known benchmark datasets: CDNet 2014 and SBI 2015.
      PubDate: 2023-03-07
      DOI: 10.1007/s13735-023-00270-z
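The residual micro-autoencoder block, the unit the nested network stacks, can be sketched as a bottleneck with a skip connection. The linear layers and ReLU here are simplifying assumptions; the actual blocks are convolutional.

```python
import numpy as np

def micro_autoencoder_block(x, W_enc, W_dec):
    """Encode to a narrow bottleneck, decode back, and add the input:
    the residual (skip) connection lets stacked blocks refine
    multi-scale features instead of re-learning the identity."""
    h = np.maximum(0.0, x @ W_enc)   # bottleneck (encoder half)
    return x + h @ W_dec             # decoder half + skip connection
```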
       
  • LG-MLFormer: local and global MLP for image captioning

      Abstract: Self-attention-based image captioning models suffer from the loss of visual features’ spatial information; introducing relative position encoding can mitigate the problem to some extent, but it brings additional parameters and greater computational complexity. To solve this, we propose a novel local–global MLFormer (LG-MLFormer) with a specifically designed encoder module, the local–global multi-layer perceptron (LG-MLP). The LG-MLP can capture latent correlations between different images, and its linear stacking calculation mode reduces computational complexity. It consists of two independent local MLP (LM) modules and a cross-domain global MLP (CDGM) module. The LM specially designs the mapping dimension between linear layers to realize self-compensation of the visual features’ spatial information without introducing relative position encoding. The CDGM module aggregates cross-domain potential correlations between grid-based and region-based features to realize the complementary advantages of these global and local semantic associations. Experiments on the Karpathy test split and the online test server reveal that our approach provides superior or comparable performance to the state of the art (SOTA). Trained models and code for reproducing the experiments are publicly available at: https://github.com/wxx1921/LGMLFormer-local-and-global-mlp-for-image-captioning.
      PubDate: 2023-02-23
      DOI: 10.1007/s13735-023-00266-9
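The "linear stacking calculation mode" that replaces quadratic self-attention can be sketched as a token-mixing MLP: transpose the features so the linear layers act across the token dimension, letting every token interact with every other at linear cost per layer. The shapes and two-layer form below are illustrative assumptions, not the LG-MLP's actual design.

```python
import numpy as np

def token_mixing_mlp(tokens, W1, W2):
    """MLP applied across the token dimension (after a transpose),
    so tokens exchange information without pairwise attention scores.

    tokens: (n_tokens, dim) feature matrix
    W1:     (n_tokens, hidden), W2: (hidden, n_tokens)
    """
    x = tokens.T                      # (dim, n_tokens): mix over tokens
    h = np.maximum(0.0, x @ W1)       # (dim, hidden)
    return (h @ W2).T                 # back to (n_tokens, dim)
```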
       
  • Deep learning for video-text retrieval: a review

      Abstract: Video-text retrieval (VTR) aims to search for the video most relevant to the semantics of a given sentence, and vice versa. In general, this retrieval task is composed of four successive steps: video and textual feature representation extraction, feature embedding and matching, and objective functions. In the last step, a list of samples retrieved from the dataset is ranked by their matching similarity to the query. In recent years, significant and flourishing progress has been achieved by deep learning techniques; however, VTR remains challenging due to problems such as learning an efficient spatio-temporal video feature and narrowing the cross-modal gap. In this survey, we review and summarize over 100 research papers related to VTR, report state-of-the-art performance on several commonly benchmarked datasets, and discuss potential challenges and directions, with the expectation of providing some insights for researchers in the field of video-text retrieval.
      PubDate: 2023-02-23
      DOI: 10.1007/s13735-023-00267-8
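The final matching-and-ranking step shared by most VTR pipelines in the survey reduces to cosine similarity in the joint embedding space; a minimal sketch, assuming the video and text encoders have already produced embeddings of the same dimension:

```python
import numpy as np

def rank_videos(text_emb, video_embs):
    """Rank candidate videos by cosine similarity to a text query.

    text_emb:   (dim,) embedded query sentence
    video_embs: (n_videos, dim) embedded candidate videos
    Returns indices sorted best-first, plus the similarity scores.
    """
    t = text_emb / np.linalg.norm(text_emb)
    v = video_embs / np.linalg.norm(video_embs, axis=1, keepdims=True)
    sims = v @ t                      # cosine similarity per video
    return np.argsort(-sims), sims
```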
       
  • CLIP-based fusion-modal reconstructing hashing for large-scale
           unsupervised cross-modal retrieval

      Abstract: As multi-modal data proliferates, people are no longer content with a single mode of data retrieval for access to information. Deep hashing retrieval algorithms have attracted much attention for their advantages of efficient storage and fast query speed. Existing unsupervised hashing methods generally have two limitations: (1) they fail to adequately capture the latent semantic relevance and coexistent information from the different modality data, resulting in the lack of effective feature and hash encoding representations to bridge the heterogeneous and semantic gaps in multi-modal data; (2) they typically construct a similarity matrix to guide hash code learning, which suffers from inaccurate similarity problems, resulting in sub-optimal retrieval performance. To address these issues, we propose a novel CLIP-based fusion-modal reconstructing hashing method for large-scale unsupervised cross-modal retrieval. First, we use CLIP to encode cross-modal features of visual modalities, and learn the common representation space of the hash code using modality-specific autoencoders. Second, we propose an efficient fusion approach to construct a semantically complementary affinity matrix that can maximize the potential semantic relevance of different modal instances. Furthermore, to retain the intrinsic semantic similarity of all similar pairs in the learned hash codes, an objective function for similarity reconstruction based on semantic complementation is designed to learn high-quality hash code representations. Extensive experiments were carried out on four multi-modal benchmark datasets (WIKI, MIRFLICKR, NUS-WIDE, and MS COCO), and the proposed method achieves state-of-the-art image-text retrieval performance compared to several representative unsupervised cross-modal hashing methods.
      PubDate: 2023-02-22
      DOI: 10.1007/s13735-023-00268-7
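Hashing's storage and query-speed advantage comes from binarizing the continuous embeddings and comparing codes with Hamming distance. The sketch below shows that final step only; the sign binarization is a standard stand-in, not the paper's learned hash function.

```python
import numpy as np

def to_hash(embeddings):
    """Binarize continuous (e.g. CLIP-derived) embeddings into +/-1 codes."""
    return np.sign(embeddings + 1e-12).astype(np.int8)

def hamming(a, b):
    """Hamming distance between +/-1 codes of length k: (k - a.b) / 2,
    since matching bits contribute +1 to the dot product and
    mismatching bits contribute -1."""
    return int(a.shape[-1] - a @ b) // 2
```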
       
  • End-to-end residual learning-based deep neural network model deployment
           for human activity recognition

      Abstract: Human activity recognition is a theme commonly explored in computer vision, with applications in monitoring systems, video processing, robotics, healthcare, and other domains. Activity recognition is a difficult task, since there are structural changes among subjects as well as inter-class and intra-class correlations between activities. As a result, a continuous intelligent control system that detects human behavior by grouping maximal information is necessary. Therefore, in this paper, a novel automatic system to identify human activity on the UTKinect dataset is implemented using the residual learning-based network ResNet-50 and transfer learning, in order to represent more complicated features and improve model robustness. The experimental results show excellent generalization capability on the validation set, with a high accuracy of 98.60 per cent and a 0.02 loss score. The designed residual learning-based system proves efficient in comparison with other state-of-the-art models.
      PubDate: 2023-02-19
      DOI: 10.1007/s13735-023-00269-6
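The residual learning idea behind ResNet-50 can be sketched in a few lines of numpy. This is a generic illustration of an identity-shortcut block, not the paper's model: `w1` and `w2` are assumed toy weight matrices standing in for the convolutional layers.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_block(x, w1, w2):
    """Identity-shortcut residual block: the layers learn a residual F(x)
    and the output is relu(F(x) + x), which eases optimization of deep nets."""
    fx = relu(x @ w1) @ w2   # two-layer residual branch F(x)
    return relu(fx + x)      # skip connection adds the input back

rng = np.random.default_rng(1)
x = rng.normal(size=(2, 16))
w1 = rng.normal(size=(16, 16)) * 0.1
w2 = rng.normal(size=(16, 16)) * 0.1
y = residual_block(x, w1, w2)
```

When the residual branch contributes nothing (zero weights), the block reduces to the identity followed by the nonlinearity, which is exactly why very deep stacks of such blocks remain trainable.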
       
  • Special issue on cross-modal retrieval and analysis

      PubDate: 2022-12-03
      DOI: 10.1007/s13735-022-00265-2
       
  • Semantic-aware visual scene representation

      Abstract: Scene classification is a mature and active computer vision task that remains challenging due to the inherent ambiguity of visual scenes. The task aims to classify visual scene images into predefined categories based on the ambient content, objects, and layout of the images. Inspired by human visual scene understanding, visual scenes can be divided into two categories: (1) object-based scenes, which consist of scene-specific objects and can be understood through those objects, and (2) layout-based scenes, which are understandable from the layout and ambient content of the scene images. Scene-specific objects semantically help to understand object-based scenes, whereas the layout and ambient content are effective for understanding layout-based scenes by representing their visual appearance. Hence, one of the main challenges in scene classification is to create a discriminative representation that provides a high-level perception of visual scenes. Accordingly, we present a discriminative hybrid representation of visual scenes in which semantic features extracted from scene-specific objects are fused with visual features extracted from a deep CNN. The proposed scene representation is used for the scene classification task and applied to three benchmark scene datasets: MIT67, SUN397, and UIUC Sports. Moreover, a new scene dataset, called "Scene40," is introduced, and results of the proposed method are also reported on it. Experimental results show that the proposed method achieves remarkable performance in scene classification and outperforms state-of-the-art methods.
      PubDate: 2022-12-01
      DOI: 10.1007/s13735-022-00246-5
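A hybrid representation of the kind the abstract describes can be sketched as a normalized concatenation of the two feature types. This is an assumed, minimal fusion scheme for illustration only; the actual feature dimensions and fusion rule in the paper may differ.

```python
import numpy as np

def hybrid_scene_representation(semantic_feats, visual_feats):
    """Fuse object-level semantic features with deep visual features by
    L2-normalizing each part and concatenating them, so neither feature
    type dominates the joint descriptor. (Illustrative scheme.)"""
    s = semantic_feats / (np.linalg.norm(semantic_feats) + 1e-12)
    v = visual_feats / (np.linalg.norm(visual_feats) + 1e-12)
    return np.concatenate([s, v])

sem = np.array([0.0, 3.0, 4.0])   # e.g. scene-specific object scores
vis = np.arange(5, dtype=float)   # e.g. CNN penultimate-layer activations
h = hybrid_scene_representation(sem, vis)
```

Because each half is unit-normalized, the joint descriptor has constant norm regardless of the raw feature scales, which keeps a downstream classifier from being biased toward one feature type.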
       
  • TCKGE: Transformers with contrastive learning for knowledge graph
           embedding

      Abstract: Representation learning of knowledge graphs has emerged as a powerful technique for various downstream tasks. In recent years, numerous research efforts have been made in knowledge graph embedding. However, previous approaches usually have difficulty dealing with complex multi-relational knowledge graphs due to their shallow network architecture. In this paper, we propose a novel framework named Transformers with Contrastive learning for Knowledge Graph Embedding (TCKGE), which aims to learn complex semantics in multi-relational knowledge graphs with deep architectures. To effectively capture the rich semantics of knowledge graphs, our framework leverages the powerful Transformers to build a deep hierarchical architecture that dynamically learns the embeddings of entities and relations. To obtain more robust knowledge embeddings with our deep architecture, we design a contrastive learning scheme to facilitate optimization, exploring the effectiveness of several different data augmentation strategies. The experimental results on two benchmark datasets show the superiority of TCKGE over state-of-the-art models.
      PubDate: 2022-11-27
      DOI: 10.1007/s13735-022-00256-3
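The contrastive scheme the abstract mentions is commonly instantiated as an InfoNCE-style loss over augmented views of the same embedding. The sketch below is a generic numpy version of that loss, not TCKGE's exact objective; the temperature value and the additive-noise "augmentation" are illustrative assumptions.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE contrastive loss: each anchor should score its own
    positive higher than all other (negative) samples in the batch."""
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # pairwise similarity logits
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))           # matching pairs on the diagonal

rng = np.random.default_rng(2)
emb = rng.normal(size=(8, 32))                   # entity/relation embeddings
aug = emb + 0.01 * rng.normal(size=(8, 32))      # a mild augmented "view"
loss = info_nce(emb, aug)
```

Minimizing this loss pulls the two views of each embedding together while pushing apart embeddings of different elements, which is what makes the learned representations more robust.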
       
  • Your heart rate betrays you: multimodal learning with spatio-temporal
           fusion networks for micro-expression recognition

      Abstract: Micro-expressions can convey feelings that people are trying to hide. Most existing studies on micro-expression recognition use only the temporal or spatial information in the image and neglect its intrinsic features. To address this problem, we detect the subject’s heart rate in long micro-expression videos; we extract the image’s spatio-temporal features through a spatio-temporal network and then extract the heart rate feature using a heart rate network. A multimodal learning method that combines the heart rate and spatio-temporal features is used to recognize micro-expressions. The experimental results on CASMEII, SAMM, and SMIC show that the proposed method’s unweighted F1-score and unweighted average recall are 0.8867 and 0.8962, respectively. The spatio-temporal fusion network combined with heart rate information provides an essential reference for multimodal approaches to micro-expression recognition.
      PubDate: 2022-10-09
      DOI: 10.1007/s13735-022-00250-9
       
  • Multimodal Quasi-AutoRegression: forecasting the visual popularity of new
           fashion products

      Abstract: Estimating the preferences of consumers is of utmost importance for the fashion industry, as appropriately leveraging this information can be beneficial in terms of profit. Trend detection in fashion is a challenging task due to the fast pace of change in the fashion industry. Moreover, forecasting the visual popularity of new garment designs is even more demanding due to the lack of historical data. To this end, we propose MuQAR, a Multimodal Quasi-AutoRegressive deep learning architecture that combines two modules: (1) a multimodal multilayer perceptron processing categorical, visual and textual features of the product and (2) a Quasi-AutoRegressive neural network modelling the “target” time series of the product’s attributes along with the “exogenous” time series of all other attributes. We utilize computer vision (image classification and image captioning) to automatically extract visual features and textual descriptions from the images of new products. Product design in fashion is initially expressed visually, and these features represent the products’ unique characteristics without interfering with the creative process of the designers by requiring additional inputs (e.g. manually written texts). We employ the time series of the product’s target attributes as a proxy for temporal popularity patterns, mitigating the lack of historical data, while exogenous time series help capture trends among interrelated attributes. We perform an extensive ablation analysis on two large-scale image fashion datasets, Mallzee-P and SHIFT15m, to assess the adequacy of MuQAR, and also use the Amazon Reviews: Home and Kitchen dataset to assess generalization to other domains. A comparative study on the VISUELLE dataset shows that MuQAR is capable of competing with and surpassing the domain’s current state of the art by 4.65% and 4.8% in terms of WAPE and MAE, respectively.
      PubDate: 2022-10-08
      DOI: 10.1007/s13735-022-00262-5
       
  • Prototype local–global alignment network for image–text
           retrieval

      Abstract: Image–text retrieval is a challenging task due to the requirement of thorough multimodal understanding and precise inter-modality relationship discovery. However, most previous approaches resort to global image–text alignment and neglect fine-grained correspondence. Although some works explore local region–word alignment, they usually suffer from a heavy computing burden. In this paper, we propose a prototype local–global alignment (PLGA) network for image–text retrieval that jointly performs fine-grained local alignment and high-level global alignment. Specifically, our PLGA contains two key components: a prototype-based local alignment module and a multi-scale global alignment module. The former enables efficient fine-grained local matching by combining region–prototype alignment and word–prototype alignment, and the latter helps perceive hierarchical global semantics by exploring multi-scale global correlations between the image and text. Overall, the local and global alignment modules boost each other’s performance within the unified model. Quantitative and qualitative experimental results on the Flickr30K and MS-COCO benchmarks demonstrate that our proposed approach performs favorably against state-of-the-art methods.
      PubDate: 2022-10-06
      DOI: 10.1007/s13735-022-00258-1
       
  • Visual and semantic ensemble for scene text recognition with gated dual
           mutual attention

      Abstract: Scene text recognition is a challenging task in computer vision due to the significant differences in text appearance, such as image distortion and rotation. However, linguistic priors help individuals reason about text in images even if some characters are missing or blurry. This paper investigates the fusion of visual cues and linguistic dependencies to boost recognition performance. We introduce a relational attention module to leverage visual patterns and word representations. We embed linguistic dependencies from a language model into the optimization framework to ensure that the predicted sequence captures the contextual dependencies within a word. We propose a dual mutual attention transformer that promotes cross-modality interactions such that the inter- and intra-correlations between visual and linguistic representations can be fully explored. The introduced gate function enables the model to learn to determine the contribution of each modality and further boosts model performance. Extensive experiments demonstrate that our method enhances the recognition performance of low-quality images and achieves state-of-the-art performance on datasets of texts from regular and irregular scenes.
      PubDate: 2022-10-06
      DOI: 10.1007/s13735-022-00253-6
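The gate function described in the abstract can be sketched as a learned sigmoid gate that blends the two modalities per dimension. This is a generic gated-fusion sketch under assumed shapes, not the paper's exact module; the weight matrix `w_g` is an illustrative stand-in for the learned gate parameters.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(visual, linguistic, w_g):
    """Gated fusion: a learned gate decides, per dimension, how much each
    modality contributes to the fused representation."""
    gate = sigmoid(np.concatenate([visual, linguistic], axis=-1) @ w_g)
    return gate * visual + (1.0 - gate) * linguistic

rng = np.random.default_rng(3)
v = rng.normal(size=(4, 16))        # visual features
l = rng.normal(size=(4, 16))        # linguistic features
w = rng.normal(size=(32, 16)) * 0.1 # toy gate weights
fused = gated_fusion(v, l, w)
```

Because the gate lies in (0, 1), each fused value is a convex combination of the two modalities, so the model can smoothly interpolate between trusting the visual evidence and trusting the language prior.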
       
  • Tri-RAT: optimizing the attention scores for image captioning

      Abstract: Attention mechanisms and grid features are widely used in current visual language tasks like image captioning. The attention scores are the key factor in the success of the attention mechanism. However, the connection between attention scores in different layers is not strong enough, since the Transformer is a hierarchical structure. Additionally, geometric information is inevitably lost when grid features are flattened to be fed into a transformer model. Therefore, bias scores encoding geometric position information should be added to the attention scores. Considering that there are three different kinds of attention modules in the transformer architecture, we build three independent paths (residual attention paths, RAPs) to propagate the attention scores from the previous layer as a prior for attention computation. This operation is like a residual connection between attention scores, which can strengthen the connection and give each attention layer a global comprehension. Then, we replace the traditional attention module with a novel residual attention with relative position module in the encoder to incorporate relative position scores with attention scores. Residual attention may increase the internal covariate shift. To optimize the data distribution, we introduce a residual attention with layer normalization on query vectors module in the decoder. Finally, we build our Residual Attention Transformer with three RAPs (Tri-RAT) for the image captioning task. The proposed model achieves performance competitive with all the state-of-the-art models on the MS COCO benchmark. We gain 135.8% CIDEr on the MS COCO “Karpathy” offline test split and 135.3% CIDEr on the online testing server.
      PubDate: 2022-10-06
      DOI: 10.1007/s13735-022-00260-7
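The residual attention path idea can be sketched as adding the previous layer's raw attention scores to the current layer's scores before the softmax. This is a minimal single-head numpy illustration under assumed shapes, not the paper's full module (no relative-position bias or layer normalization).

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def residual_attention(q, k, prev_scores=None):
    """Attention whose raw scores are augmented with the previous layer's
    scores (a residual attention path) before the softmax. Returns the
    attention weights and the raw scores to pass to the next layer."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    if prev_scores is not None:
        scores = scores + prev_scores   # prior from the earlier layer
    return softmax(scores), scores

rng = np.random.default_rng(4)
q = rng.normal(size=(5, 8))
k = rng.normal(size=(5, 8))
attn1, raw1 = residual_attention(q, k)               # first layer
attn2, raw2 = residual_attention(q, k, prev_scores=raw1)  # with the RAP prior
```

Passing `raw1` forward acts like a residual connection between score matrices, so later layers start from the earlier layer's attention pattern rather than from scratch.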
       
  • Generative adversarial networks for 2D-based CNN pose-invariant face
           recognition

      Abstract: The computer vision community considers pose-invariant face recognition (PIFR) one of its most challenging applications. Many works have been devoted to enhancing face recognition performance on profile samples, mainly focusing on 2D- and 3D-based frontalization techniques that try to synthesize frontal views from profile ones. In the same context, we propose in this paper a new 2D PIFR technique based on Generative Adversarial Network image translation. The GAN used is the Pix2Pix paired architecture, covering many generator and discriminator models that are comprehensively evaluated on a new benchmark proposed in this paper, referred to as the Combined-PIFR database, which is composed of four datasets that provide profile images and their corresponding frontal ones. The paired architecture is based on computing the L1 distance between the generated image and the ground-truth one (pairs); therefore, both generator and discriminator architectures are paired. The Combined-PIFR database is partitioned under person-independent constraints to fairly evaluate the frontalization and classification sub-systems of our proposed framework. Thanks to the GAN-based frontalization, the recorded results demonstrate an important improvement of 33.57% compared to the baseline.
      PubDate: 2022-09-15
      DOI: 10.1007/s13735-022-00249-2
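The paired (L1-regularized) objective the abstract describes can be sketched as the standard Pix2Pix-style generator loss: an adversarial term plus a weighted L1 term against the ground-truth pair. The numpy sketch below is a generic illustration; `lam=100.0` follows the common Pix2Pix default and the inputs are toy arrays.

```python
import numpy as np

def pix2pix_generator_loss(d_fake, fake, target, lam=100.0):
    """Paired generator objective: a non-saturating adversarial term that
    pushes the discriminator's score on fakes toward 1, plus a weighted
    L1 term between the generated image and its ground-truth pair."""
    eps = 1e-12
    adv = -np.mean(np.log(d_fake + eps))   # adversarial (GAN) term
    l1 = np.mean(np.abs(fake - target))    # pixel-wise L1 distance
    return adv + lam * l1

rng = np.random.default_rng(5)
fake = rng.uniform(size=(8, 8))        # generator output (toy "image")
target = rng.uniform(size=(8, 8))      # ground-truth frontal pair
d_out = np.full(1, 0.5)                # discriminator score on the fake
loss = pix2pix_generator_loss(d_out, fake, target)
```

The large `lam` makes the L1 term dominate early in training, anchoring the generated frontal view to its paired ground truth while the adversarial term sharpens the output.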
       
 
JournalTOCs
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762