Subjects -> LIBRARY AND INFORMATION SCIENCES (Total: 392 journals)
    - DIGITAL CURATION AND PRESERVATION (13 journals)
    - LIBRARY ADMINISTRATION (1 journal)
    - LIBRARY AND INFORMATION SCIENCES (378 journals)

LIBRARY AND INFORMATION SCIENCES (378 journals)

Showing 1 - 200 of 379 Journals sorted alphabetically
027.7 Zeitschrift für Bibliothekskultur / Journal for Library Culture     Open Access   (Followers: 61)
Access     Full-text available via subscription   (Followers: 23)
Acervo : Revista do Arquivo Nacional     Open Access   (Followers: 1)
African Journal of Library, Archives and Information Science     Full-text available via subscription   (Followers: 67)
Against the Grain     Partially Free   (Followers: 119)
AIB Studi     Full-text available via subscription   (Followers: 10)
Alexandría : Revista de Ciencias de la Información     Open Access   (Followers: 11)
Alexandria : The Journal of National and International Library and Information Issues     Full-text available via subscription   (Followers: 55)
Alsic : Apprentissage des Langues et Systèmes d'Information et de Communication     Open Access   (Followers: 12)
American Archivist     Hybrid Journal   (Followers: 128)
American Libraries     Partially Free   (Followers: 187)
Anales de Documentacion     Open Access   (Followers: 13)
Anuari de l'Observatori de Biblioteques, Llibres i Lectura     Open Access   (Followers: 2)
ANZTLA EJournal     Full-text available via subscription  
Archeion Online     Open Access   (Followers: 3)
Archimag     Full-text available via subscription   (Followers: 3)
Archival Science     Hybrid Journal   (Followers: 64)
Archivaria     Open Access   (Followers: 32)
Archives     Full-text available via subscription   (Followers: 6)
Archives and Manuscripts     Hybrid Journal   (Followers: 50)
Archives and Museum Informatics     Hybrid Journal   (Followers: 97)
Ariadne Magazine     Open Access   (Followers: 145)
Art Libraries Journal     Hybrid Journal   (Followers: 10)
Aslib Journal of Information Management     Hybrid Journal   (Followers: 32)
Aslib Proceedings     Hybrid Journal   (Followers: 147)
AtoZ : novas práticas em informação e conhecimento     Open Access  
Australasian Journal of Information Systems     Open Access   (Followers: 16)
Australasian Public Libraries and Information Services     Full-text available via subscription   (Followers: 31)
Australian Academic & Research Libraries     Full-text available via subscription   (Followers: 93)
Australian Library Journal     Full-text available via subscription   (Followers: 146)
Baca : Jurnal Dokumentasi dan Informasi     Open Access   (Followers: 1)
Bangladesh Journal of Library and Information Science     Open Access   (Followers: 44)
Behavioral & Social Sciences Librarian     Hybrid Journal   (Followers: 143)
Berkala Ilmu Perpustakaan dan Informasi     Open Access  
Biblios     Open Access   (Followers: 11)
Biblioteca Escolar em Revista     Open Access  
Biblioteca Universitaria     Open Access   (Followers: 14)
Bibliotecas : Revista de la Escuela de Bibliotecología, Documentación e Información     Open Access   (Followers: 3)
Bibliotecas Universitárias : pesquisas, experiências e perspectivas     Open Access   (Followers: 1)
Bibliotecas. Anales de Investigacion     Open Access  
Biblioteka     Open Access   (Followers: 2)
Biblioteka i Edukacja     Open Access   (Followers: 4)
Bibliotheca Orientalis     Full-text available via subscription   (Followers: 14)
BIBLIOTIKA : Jurnal Kajian Perpustakaan dan Informasi     Open Access  
BIBLOS - Revista do Departamento de Biblioteconomia e História     Open Access   (Followers: 6)
BiD : textos universitaris de biblioteconomia i documentació     Open Access   (Followers: 6)
Bilgi Dünyası     Open Access   (Followers: 5)
Biodiversity Information Science and Standards     Open Access   (Followers: 1)
Bioinformatics     Hybrid Journal   (Followers: 216)
Biuletyn EBIB     Open Access  
Boletín Cultural y Bibliográfico     Open Access   (Followers: 2)
Book History     Full-text available via subscription   (Followers: 112)
Bridgewater Review     Open Access   (Followers: 4)
Bulletin des bibliotheques de France     Full-text available via subscription   (Followers: 7)
Bulletin of the Association for Information Science and Technology     Open Access   (Followers: 22)
Bulletin of the John Rylands Library     Hybrid Journal   (Followers: 21)
Canadian Journal of Academic Librarianship     Open Access   (Followers: 20)
Canadian Journal of Information and Library Science     Full-text available via subscription   (Followers: 245)
Cataloging & Classification Quarterly     Hybrid Journal   (Followers: 168)
CERN IdeaSquare Journal of Experimental Innovation     Open Access  
Children and Libraries : The Journal of the Association for Library Service to Children     Full-text available via subscription   (Followers: 16)
CIC. Cuadernos de Informacion y Comunicacion     Open Access   (Followers: 4)
Ciência da Informação em Revista     Open Access   (Followers: 1)
Code4Lib Journal     Open Access   (Followers: 171)
Collaborative Librarianship     Open Access   (Followers: 53)
Collection and Curation     Hybrid Journal   (Followers: 11)
College & Research Libraries     Open Access   (Followers: 452)
College & Research Libraries News     Partially Free   (Followers: 243)
College & Undergraduate Libraries     Hybrid Journal   (Followers: 220)
Communicate : Journal of Library and Information Science     Full-text available via subscription   (Followers: 63)
Communication Booknotes Quarterly     Hybrid Journal   (Followers: 15)
Communications in Information Literacy     Open Access   (Followers: 193)
Community & Junior College Libraries     Hybrid Journal   (Followers: 42)
Cuadernos de Gestión de Información     Open Access  
Data Curation Profiles Directory     Open Access   (Followers: 5)
Data Technologies and Applications     Hybrid Journal   (Followers: 207)
DESIDOC Journal of Library & Information Technology     Open Access   (Followers: 96)
Digital Library Perspectives     Hybrid Journal   (Followers: 39)
Digital Platform: Information Technologies in Sociocultural Sphere     Open Access   (Followers: 1)
Documentación de las Ciencias de la Información     Open Access  
Documentation et bibliothèques     Full-text available via subscription   (Followers: 9)
e & i Elektrotechnik und Informationstechnik     Hybrid Journal   (Followers: 8)
e-Ciencias de la Información     Open Access   (Followers: 1)
Eastern Librarian     Open Access   (Followers: 11)
Edulib : Journal of Library and Information Science     Open Access   (Followers: 26)
Egyptian Informatics Journal     Open Access   (Followers: 5)
El Profesional de la Informacion     Full-text available via subscription   (Followers: 17)
eLucidate     Open Access   (Followers: 7)
Emerging Library & Information Perspectives     Open Access   (Followers: 29)
Encontros Bibli : revista eletrônica de biblioteconomia e ciência da informação     Open Access   (Followers: 3)
Ethics and Information Technology     Hybrid Journal   (Followers: 64)
European Journal of Information Systems     Hybrid Journal   (Followers: 85)
European Science Editing     Open Access  
Evidence Based Library and Information Practice     Open Access   (Followers: 386)
Florida Libraries     Open Access   (Followers: 1)
Folia Bibliologica     Open Access  
Forensic Science International: Digital Investigation     Full-text available via subscription   (Followers: 317)
Foundations and Trends® in Information Retrieval     Full-text available via subscription   (Followers: 30)
Georgia Library Quarterly     Open Access   (Followers: 21)
Ghana Library Journal     Full-text available via subscription   (Followers: 16)
Global Knowledge, Memory and Communication     Hybrid Journal   (Followers: 806)
GSI Journals Serie C : Advancements in Information Sciences and Technologies     Open Access   (Followers: 1)
Health Information Management Journal     Hybrid Journal   (Followers: 23)
Hipertext.net : Anuario Académico sobre Documentación Digital y Comunicación Interactiva     Open Access  
HLA News     Full-text available via subscription   (Followers: 2)
IASSIST Quarterly     Open Access  
Idaho Librarian     Free   (Followers: 8)
IFLA Journal     Hybrid Journal   (Followers: 217)
In Monte Artium     Full-text available via subscription   (Followers: 1)
In the Library with the Lead Pipe     Open Access   (Followers: 122)
InCID : Revista de Ciência da Informação e Documentação     Open Access  
InCite     Full-text available via subscription   (Followers: 19)
Informaatiotutkimus     Open Access   (Followers: 3)
Informação & Informação     Open Access   (Followers: 2)
Informação em Pauta     Open Access  
Informacijos mokslai     Open Access  
Información, Cultura y Sociedad     Open Access   (Followers: 2)
Informatio. Revista del Instituto de Información de la Facultad de Información y Comunicación     Open Access  
Information     Open Access   (Followers: 30)
Information & Culture : A Journal of History     Full-text available via subscription   (Followers: 31)
Information Discovery and Delivery     Hybrid Journal   (Followers: 43)
Information Manager (The)     Open Access   (Followers: 29)
Information Processing & Management     Hybrid Journal   (Followers: 123)
Information Retrieval     Hybrid Journal   (Followers: 186)
Information Sciences     Hybrid Journal   (Followers: 168)
Information Systems Frontiers     Hybrid Journal   (Followers: 27)
Information Systems Research     Full-text available via subscription   (Followers: 127)
Information Technologies & International Development     Open Access   (Followers: 81)
Information Technologist (The)     Full-text available via subscription   (Followers: 17)
Information Technology and Libraries     Open Access   (Followers: 292)
Information Today     Full-text available via subscription   (Followers: 34)
Informationspraxis     Open Access   (Followers: 12)
Informationswissenschaft : Theorie, Methode und Praxis     Open Access   (Followers: 4)
iNFOTEZY     Open Access  
Insaniyat : Journal of Islam and Humanities     Open Access   (Followers: 1)
Insights : the UKSG journal     Open Access   (Followers: 62)
InterActions: UCLA Journal of Education and Information     Open Access   (Followers: 11)
Interdisciplinary Journal of e-Skills and Lifelong Learning     Open Access   (Followers: 3)
Interdisciplinary Journal of Information, Knowledge, and Management     Open Access   (Followers: 12)
International Association of School Librarianship Conference Proceedings     Open Access  
International Information & Library Review     Hybrid Journal   (Followers: 396)
International Journal of Bibliometrics in Business and Management     Hybrid Journal   (Followers: 2)
International Journal of Business Information Systems     Hybrid Journal   (Followers: 14)
International Journal of Cooperative Information Systems     Hybrid Journal   (Followers: 4)
International Journal of Digital Curation     Open Access   (Followers: 82)
International Journal of Digital Library Systems     Full-text available via subscription   (Followers: 73)
International Journal of Doctoral Studies     Open Access   (Followers: 6)
International Journal of Information and Decision Sciences     Hybrid Journal   (Followers: 10)
International Journal of Information Management     Hybrid Journal   (Followers: 153)
International Journal of Information Privacy, Security and Integrity     Hybrid Journal   (Followers: 25)
International Journal of Information Retrieval Research     Full-text available via subscription   (Followers: 28)
International Journal of Information Science and Management     Open Access   (Followers: 5)
International Journal of Information Technology, Communications and Convergence     Hybrid Journal   (Followers: 14)
International Journal of Information, Diversity, & Inclusion     Open Access   (Followers: 3)
International Journal of Intellectual Property Management     Hybrid Journal   (Followers: 26)
International Journal of Intercultural Information Management     Hybrid Journal   (Followers: 12)
International Journal of Legal Information     Full-text available via subscription   (Followers: 48)
International Journal of Librarianship     Open Access   (Followers: 25)
International Journal of Library and Information Science     Open Access   (Followers: 229)
International Journal of Library Science     Open Access   (Followers: 263)
International Journal of Library Science     Full-text available via subscription   (Followers: 55)
International Journal of Multicriteria Decision Making     Hybrid Journal   (Followers: 8)
International Journal of Multimedia Information Retrieval     Partially Free   (Followers: 8)
International Journal of Organisational Design and Engineering     Hybrid Journal   (Followers: 3)
International Journal of Web Portals     Full-text available via subscription   (Followers: 16)
International Journal on Digital Libraries     Hybrid Journal   (Followers: 544)
InULA Notes : Indiana University Librarians Association     Open Access  
Investigación Bibliotecológica     Open Access   (Followers: 4)
IRIS - Revista de Informação, Memória e Tecnologia     Open Access  
Issues in Informing Science and Information Technology     Open Access   (Followers: 2)
Issues in Science and Technology Librarianship     Open Access   (Followers: 2)
JISTEM : Journal of Information Systems and Technology Management     Open Access   (Followers: 6)
JLIS.it     Open Access   (Followers: 7)
JMIR Medical Informatics     Open Access   (Followers: 9)
Journal of Academic Librarianship     Hybrid Journal   (Followers: 1012)
Journal of Access Services     Hybrid Journal   (Followers: 39)
Journal of Advancements in Library Sciences     Open Access   (Followers: 47)
Journal of Adventist Libraries and Archives     Open Access  
Journal of Altmetrics     Open Access   (Followers: 7)
Journal of Archival Organization     Hybrid Journal   (Followers: 28)
Journal of Copyright in Education & Librarianship     Open Access   (Followers: 29)
Journal of Creative Library Practice     Open Access   (Followers: 98)
Journal of Data Mining and Digital Humanities     Open Access   (Followers: 39)
Journal of Documentation     Hybrid Journal   (Followers: 161)
Journal of East Asian Libraries     Open Access   (Followers: 7)
Journal of Education in Library and Information Science - JELIS     Full-text available via subscription   (Followers: 71)
Journal of Educational Media & Library Sciences     Open Access   (Followers: 9)
Journal of Educational Media, Memory, and Society     Full-text available via subscription   (Followers: 12)
Journal of Electronic Publishing     Open Access   (Followers: 76)
Journal of Electronic Resources Librarianship     Hybrid Journal   (Followers: 224)
Journal of eScience Librarianship     Open Access   (Followers: 112)
Journal of Global Information Management     Full-text available via subscription   (Followers: 9)
Journal of Health & Medical Informatics     Open Access   (Followers: 49)
Journal of Hospital Librarianship     Hybrid Journal   (Followers: 152)
Journal of Information & Knowledge Management     Hybrid Journal   (Followers: 139)
Journal of Information and Data Management     Open Access   (Followers: 14)
Journal of Information Engineering and Applications     Open Access   (Followers: 10)
Journal of Information Literacy     Open Access   (Followers: 773)
Journal of Information Science     Hybrid Journal   (Followers: 1013)
Journal of Information Studies & Technology     Open Access   (Followers: 1)

International Journal on Digital Libraries
Journal Prestige (SJR): 0.441
Citation Impact (citeScore): 2
Number of Followers: 544  
 
  Hybrid Journal (it can contain Open Access articles)
ISSN (Print) 1432-1300 - ISSN (Online) 1432-5012
Published by Springer-Verlag  [2469 journals]
  • CNN-based framework for classifying temporal relations with question encoder

      Abstract: Temporal-relation classification plays an important role in the field of natural language processing. Various deep learning-based classifiers, which can generate better models using sentence embedding, have been proposed to address this challenging task. These approaches, however, do not work well due to the lack of task-related information. To overcome this problem, we propose a novel framework that incorporates prior information by employing awareness of events and time expressions (time–event entities) with various window sizes to focus on context words around the entities as a filter. We refer to this module as “question encoder.” In our approach, this kind of prior information can extract task-related information from simple sentence embedding. Our experimental results on a publicly available Timebank-Dense corpus demonstrate that our approach outperforms some state-of-the-art techniques, including CNN-, LSTM-, and BERT-based temporal relation classifiers.
      PubDate: 2022-06-01
       
  • Assessing the impact of OCR noise on multilingual event detection over digitised documents

      Abstract: Event detection is a crucial task for natural language processing and it involves the identification of instances of specified types of events in text and their classification into event types. The detection of events from digitised documents could enable historians to gather and combine a large amount of information into an integrated whole, a panoramic interpretation of the past. However, the level of degradation of digitised documents and the quality of the optical character recognition (OCR) tools might hinder the performance of an event detection system. While several studies have been performed in detecting events from historical documents, the transcribed documents needed to be hand-validated, which required considerable human expertise and labour-intensive manual work. Thus, in this study, we explore the robustness of two different language-independent event detection models to OCR noise, over two datasets that cover different event types and multiple languages. We aim at analysing their ability to mitigate problems caused by the low quality of the digitised documents and we simulate the existence of transcribed data, synthesised from clean annotated text, by injecting synthetic noise. For creating the noisy synthetic data, we chose to utilise four main types of noise that commonly occur after the digitisation process: Character Degradation, Bleed Through, Blur, and Phantom Character. Finally, we conclude that the imbalance of the datasets, the richness of the different annotation styles, and the language characteristics are the most important factors that can influence event detection in digitised documents.
      PubDate: 2022-04-04
       
  • An exploratory approach to archaeological knowledge production

      Abstract: The current scientific context is characterized by intensive digitization of the research outcomes and by the creation of data infrastructures for the systematic publication of datasets and data services. Several relationships can exist among these outcomes. Some of them are explicit, e.g. the relationships of spatial or temporal similarity, whereas others are hidden, e.g. the relationship of causality. By materializing these hidden relationships through a linking mechanism, several patterns can be established. These knowledge patterns may lead to the discovery of information previously unknown. A new approach to knowledge production can emerge by following these patterns. This new approach is exploratory because by following these patterns, a researcher can get new insights into a research problem. In the paper, we report our effort to depict this new exploratory approach using Linked Data and Semantic Web technologies (RDF, OWL). As a use case, we apply our approach to the archaeological domain.
      PubDate: 2022-03-19
       
  • Special Issue on Selected Papers from ICADL 2020

      PubDate: 2022-03-16
       
  • CSO Classifier 3.0: a scalable unsupervised method for classifying documents in terms of research topics

      Abstract: Classifying scientific articles, patents, and other documents according to the relevant research topics is an important task, which enables a variety of functionalities, such as categorising documents in digital libraries, monitoring and predicting research trends, and recommending papers relevant to one or more topics. In this paper, we present the latest version of the CSO Classifier (v3.0), an unsupervised approach for automatically classifying research papers according to the Computer Science Ontology (CSO), a comprehensive taxonomy of research areas in the field of Computer Science. The CSO Classifier takes as input the textual components of a research paper (usually title, abstract, and keywords) and returns a set of research topics drawn from the ontology. This new version includes a new component for discarding outlier topics and offers improved scalability. We evaluated the CSO Classifier on a gold standard of manually annotated articles, demonstrating a significant improvement over alternative methods. We also present an overview of applications adopting the CSO Classifier and describe how it can be adapted to other fields.
      PubDate: 2022-03-01
       
  • Analysing the requirements for an Open Research Knowledge Graph: use cases, quality requirements, and construction strategies

      Abstract: Current science communication has a number of drawbacks and bottlenecks which have been the subject of discussion lately: among others, the rising number of published articles makes it nearly impossible to get a full overview of the state of the art in a certain field, and reproducibility is hampered by fixed-length, document-based publications which normally cannot cover all details of a research work. Recently, several initiatives have proposed knowledge graphs (KG) for organising scientific information as a solution to many of the current issues. The focus of these proposals is, however, usually restricted to very specific use cases. In this paper, we aim to transcend this limited perspective and present a comprehensive analysis of requirements for an Open Research Knowledge Graph (ORKG) by (a) collecting and reviewing daily core tasks of a scientist, (b) establishing their consequential requirements for a KG-based system, (c) identifying overlaps and specificities, and their coverage in current solutions. As a result, we map necessary and desirable requirements for successful KG-based science communication, derive implications, and outline possible solutions.
      PubDate: 2022-03-01
       
  • Current Research on Theory and Practice of Digital Libraries: Best Papers from TPDL 2019 & 2020

      Abstract: This volume presents a special issue on selected papers from the 2019 & 2020 editions of the International Conference on Theory and Practice of Digital Libraries (TPDL). They cover different research areas within Digital Libraries, from Ontology and Linked Data to quality in Web Archives and Topic Detection. We first provide a brief overview of both TPDL editions, and we introduce the selected papers.
      PubDate: 2022-02-28
      DOI: 10.1007/s00799-022-00322-5
       
  • Mapping audiovisual content providers and resources in Greece

      Abstract: In Greece, there are many audiovisual resources available on the Internet that interest scientists and the general public. Although freely available, finding such resources often becomes a challenging task, because they are hosted on scattered websites and in different types/formats. These websites usually offer limited search options; at the same time, there is no aggregation service for audiovisual resources, nor a national registry for such content. To meet this need, the Open AudioVisual Archives project was launched and the first step in its development is to create a dataset with open access audiovisual material. The current research creates such a dataset by applying specific selection criteria in terms of copyright and content, form/use and process/technical characteristics. The results reported in this paper show that libraries, archives, museums, universities, mass media organizations, governmental and non-governmental organizations are the main types of providers, but the vast majority of resources are open courses offered by universities under the “Creative Commons” license. Providers have significant differences in terms of their collection management capabilities. Most of them do not own any kind of publishing infrastructure and use commercial streaming services, such as YouTube. In terms of metadata policy, most of the providers use application profiles instead of international metadata schemas.
      PubDate: 2022-02-03
      DOI: 10.1007/s00799-022-00321-6
       
  • Correction to: MELHISSA: a multilingual entity linking architecture for historical press articles

      PubDate: 2022-01-18
      DOI: 10.1007/s00799-021-00320-z
       
  • Cross-lingual citations in English papers: a large-scale analysis of prevalence, usage, and impact

      Abstract: Citation information in scholarly data is an important source of insight into the reception of publications and the scholarly discourse. Outcomes of citation analyses and the applicability of citation-based machine learning approaches heavily depend on the completeness of such data. One particular shortcoming of scholarly data nowadays is that non-English publications are often not included in data sets, or that language metadata is not available. Because of this, citations between publications of differing languages (cross-lingual citations) have only been studied to a very limited degree. In this paper, we present an analysis of cross-lingual citations based on over one million English papers, spanning three scientific disciplines and a time span of three decades. Our investigation covers differences between cited languages and disciplines, trends over time, and the usage characteristics as well as impact of cross-lingual citations. Among our findings are an increasing rate of citations to publications written in Chinese, citations being primarily to local non-English languages, and consistency in citation intent between cross- and monolingual citations. To facilitate further research, we make our collected data and source code publicly available.
      PubDate: 2021-12-23
      DOI: 10.1007/s00799-021-00312-z
       
  • Children’s query formulation and search result exploration

      Abstract: Our research aims at understanding children’s information search and their use of information search tools during educational pursuits. We conducted an observation study with 50 New Zealand school children between the ages of 9 and 13 years old. In particular, we studied the way that children constructed search queries and interacted with the Google search engine when undertaking a range of educationally appropriate inquiry tasks. As a result of this in situ study, we identified typical query-creation and query-reformulation strategies that children use. The children worked through 250 tasks, and created a total of 550 search queries. 64.4% of the successful queries made were natural language queries compared to only 35.6% keyword queries. Only three children used the related searches feature of the search engine, while 46 children used query suggestions. We gained insights into the information search strategies children use during their educational pursuits. We observed a range of issues that children encountered when interacting with a search engine to create searches as well as to triage and explore information in the search engine results page lists. We found that search tasks posed as questions were more likely to result in query constructions based on natural language questions, while tasks posed as instructions were more likely to result in query constructions using natural language sentences or keywords. Our findings have implications for both educators and search engine designers.
      PubDate: 2021-12-01
      DOI: 10.1007/s00799-021-00316-9
       
  • Unified approach to retrospective event detection for event-based epidemic intelligence

      Abstract: Inferring the magnitude and occurrence of real-world events from natural language text is a crucial task in various domains. Particularly in the domain of public health, the state-of-the-art document and token centric event detection approaches have not kept pace with the growing need for more robust event detection. In this paper, we propose UPHED, a unified approach, which combines both the document and token centric event detection techniques in an unsupervised manner such that events which are rare (aperiodic) or reoccurring (periodic) can be detected using a generative model for the domain of public health. We evaluate the efficiency of our approach as well as its effectiveness for two real-world case studies with respect to the quality of document clusters. Our results show that we are able to achieve a precision of 60% and a recall of 71% analyzed using manually annotated real-world data. Finally, we also make a comparative analysis of our work with the well-established rule-based system of MedISys and find that UPHED can be used in a cooperative way with MedISys not only to detect similar anomalies, but also to deliver more information about the specific outbreak of reported diseases.
      PubDate: 2021-12-01
      DOI: 10.1007/s00799-021-00308-9
       
  • Improving data quality in large-scale repositories through conflict resolution

      Abstract: Digital repositories rely on technical metadata to manage their objects. The output of characterization tools is aggregated and analyzed through content profiling. The accuracy and correctness of characterization tools vary; they frequently produce contradicting outputs, resulting in metadata conflicts. The resulting metadata conflicts limit scalable preservation risk assessment and repository management. This article presents and evaluates a rule-based approach to improving data quality in this scenario through expert-conducted conflict resolution. We characterize the data quality challenges and present a method for developing conflict resolution rules to improve data quality. We evaluate the method and the resulting data quality improvements in an experiment on a publicly available document collection. The results demonstrate that our approach enables the effective resolution of conflicts by producing rules that reduce the number of conflicts in the data set from 17% to 3%. This replicable method presents a significant improvement in content profiling technology for digital repositories, since the enhanced data quality can improve risk assessment and preservation management in digital repository systems.
      PubDate: 2021-12-01
      DOI: 10.1007/s00799-021-00311-0
       
  • MELHISSA: a multilingual entity linking architecture for historical press articles

      Abstract: Digital libraries have a key role in cultural heritage as they provide access to our culture and history by indexing books and historical documents (newspapers and letters). Digital libraries use natural language processing (NLP) tools to process these documents and enrich them with meta-information, such as named entities. Despite recent advances in these NLP models, most of them are built for specific languages and contemporary documents that are not optimized for handling historical material that may for instance contain language variations and optical character recognition (OCR) errors. In this work, we focused on the entity linking (EL) task that is fundamental to the indexation of documents in digital libraries. We developed a Multilingual Entity Linking architecture for HIstorical preSS Articles that is composed of multilingual analysis, OCR correction, and filter analysis to alleviate the impact of historical documents in the EL task. The source code is publicly available. Experimentation has been done over two historical documents covering five European languages (English, Finnish, French, German, and Swedish). Results have shown that our system improved the global performance for all languages and datasets by achieving an F-score@1 of up to 0.681 and an F-score@5 of up to 0.787.
      PubDate: 2021-11-29
      DOI: 10.1007/s00799-021-00319-6
       
  • SchenQL: in-depth analysis of a query language for bibliographic metadata

      Abstract: Information access to bibliographic metadata needs to be uncomplicated, as users may not benefit from complex and potentially richer data that may be difficult to obtain. Sophisticated research questions including complex aggregations could be answered with complex SQL queries. However, this comes at the cost of high complexity, which requires a high level of expertise even from trained programmers. A domain-specific query language could provide a straightforward solution to this problem. Although less generic, it can support users not familiar with query construction in the formulation of complex information needs. In this paper, we present and evaluate SchenQL, a simple and applicable query language that is accompanied by a prototypical GUI. SchenQL focuses on querying bibliographic metadata using the vocabulary of domain experts. The easy-to-learn domain-specific query language is suitable for domain experts as well as casual users while still providing the possibility to answer complex information demands. Query construction and information exploration are supported by a prototypical GUI. We present an evaluation of the complete system: different variants for executing SchenQL queries are benchmarked; interviews with domain experts and a bipartite quantitative user study demonstrate SchenQL’s suitability and high level of users’ acceptance.
      PubDate: 2021-11-23
      DOI: 10.1007/s00799-021-00317-8
       
  • VeTo+: improved expert set expansion in academia

      Abstract: Expanding a set of known domain experts with new individuals who share similar expertise is a problem that has various applications, such as adding new members to a conference program committee or finding new referees to review funding proposals. In this work, we focus on applications of the problem in the academic world and we introduce VeTo+, a novel approach to effectively deal with it by exploiting scholarly knowledge graphs. VeTo+ expands a given set of experts by identifying scholars whose publishing habits are similar to theirs. Our experiments show that VeTo+ outperforms, in terms of accuracy, previous approaches to recommending expansions to a given set of academic experts.
      PubDate: 2021-11-15
      DOI: 10.1007/s00799-021-00318-7
       
  • Correspondence as the primary measure of information quality for web
           archives: a human-centered grounded theory study

      Abstract: Creating an archived website that is as close as possible to the original, live website remains one of the most difficult challenges in the field of web archiving. Failing to adequately capture a website might mean an incomplete historical record or, worse, no evidence that the site ever even existed. This paper presents a grounded theory of quality for web archives created using data from web archivists. In order to achieve this, I analyzed support tickets submitted by clients of the Internet Archive’s Archive-It (AIT), a subscription-based web archiving service that helps organizations build and manage their own web archives. Overall, 305 tickets were analyzed, comprising 2544 interactions. The resulting theory comprises three dimensions of quality in a web archive: correspondence, relevance, and archivability. The dimension of correspondence, defined as the degree of similarity or resemblance between the original website and the archived website, is the most important facet of quality in web archives, and it is the focus of this work. This paper presents the first theory created specifically for web archives and lays the groundwork for future theoretical developments in the field. Furthermore, the theory is human-centered and grounded in how users and creators of web archives perceive their quality. By clarifying the notion of quality in a web archive, this research will be of benefit to web archivists and cultural heritage institutions.
      PubDate: 2021-11-03
      DOI: 10.1007/s00799-021-00314-x
       
  • Evaluating BERT-based scientific relation classifiers for scholarly
           knowledge graph construction on digital library collections

      Abstract: The rapid growth of research publications has placed great demands on digital libraries (DL) for advanced information management technologies. To cater to these demands, techniques relying on knowledge-graph structures are being advocated. In such graph-based pipelines, inferring semantic relations between related scientific concepts is a crucial step. Recently, BERT-based pre-trained models have been popularly explored for automatic relation classification. Despite significant progress, most of them were evaluated in different scenarios, which limits their comparability. Furthermore, existing methods are primarily evaluated on clean texts, which ignores the digitization context of early scholarly publications in terms of machine scanning and optical character recognition (OCR). In such cases, the texts may contain OCR noise, in turn creating uncertainty about existing classifiers’ performances. To address these limitations, we started by creating OCR-noisy texts based on three clean corpora. Given these parallel corpora, we conducted a thorough empirical evaluation of eight BERT-based classification models by focusing on three factors: (1) BERT variants; (2) classification strategies; and (3) OCR noise impacts. Experiments on clean data show that the domain-specific pre-trained BERT is the best variant to identify scientific relations. The strategy of predicting a single relation each time generally outperforms the one simultaneously identifying multiple relations. The optimal classifier’s performance can decline by around 10% to 20% in F-score on the noisy corpora. Insights discussed in this study can help DL stakeholders select techniques for building optimal knowledge-graph-based systems.
      PubDate: 2021-11-02
      DOI: 10.1007/s00799-021-00313-y
       
  • Multi-label classification of legislative contents with hierarchical label
           attention networks

      Abstract: EuroVoc is a thesaurus maintained by the European Union Publication Office, used to describe and index legislative documents. The EuroVoc concepts are organized following a hierarchical structure, with 21 domains, 127 micro-thesauri terms, and more than 6,700 detailed descriptors. The large number of concepts in the EuroVoc thesaurus makes the manual classification of legal documents highly costly. In order to facilitate this classification work, we present two main contributions. The first one is the development of a hierarchical deep learning model to address the classification of legal documents according to the EuroVoc thesaurus. Instead of training a classifier for each hierarchy level, our model allows the simultaneous prediction of the three levels of the EuroVoc thesaurus. Our second contribution concerns the proposal of a new legal corpus for evaluating the classification of documents written in Portuguese. This corpus, named EUR-Lex PT, contains more than 220k documents, labeled under the three EuroVoc hierarchical levels. Comparative experiments with other state-of-the-art models indicate that our approach has competitive results, while also offering the ability to interpret predictions through attention weights.
      PubDate: 2021-10-30
      DOI: 10.1007/s00799-021-00307-w
       
  • An extended analysis of the persistence of persistent identifiers of the
           scholarly web

      Abstract: Scholarly resources, just like any other resources on the web, are subject to reference rot as they frequently disappear or significantly change over time. Digital Object Identifiers (DOIs) are commonplace to persistently identify scholarly resources and have become the de facto standard for citing them. This paper is an extended version of work previously published in the proceedings of the 2020 International Conference on Theory and Practice of Digital Libraries (TPDL). We investigate the notion of persistence of DOIs by conducting a series of experiments to analyze a DOI’s resolution on the web, with this work presenting a set of novel investigations to expand on our previous work. We derive confidence in the persistence of these identifiers in part from the assumption that dereferencing a DOI will consistently return the same response, regardless of which HTTP request method we use or from which network environment we send the requests. Our experiments show, however, that persistence, according to our interpretation, is not warranted. We find that scholarly content providers respond differently to varying request methods and network environments, change their response to requests against the same DOI, and even return inconsistent results over a period of time. We present the results of our quantitative analysis that is aimed at informing the scholarly communication community about this disconcerting lack of consistency.
      PubDate: 2021-10-22
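The consistency check the abstract above describes, dereferencing the same DOI with different HTTP request methods and comparing the responses, can be sketched as follows. This is an illustration under stated assumptions, not the authors' experimental setup: the helper names `fetch_status` and `compare_methods` are hypothetical, only status codes are compared, and how the request method is carried across redirects depends on the HTTP client and its version.

```python
import urllib.error
import urllib.request


def fetch_status(url, method, timeout=10):
    """Send one request with the given HTTP method and return the final status code."""
    req = urllib.request.Request(url, method=method)
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # 4xx/5xx responses are raised as exceptions; keep their status code.
        return err.code


def compare_methods(doi, methods=("HEAD", "GET"), fetch=fetch_status):
    """Dereference a DOI once per method and report whether the status codes agree.

    `fetch` is injectable so the comparison logic can be exercised without network access.
    """
    url = "https://doi.org/" + doi
    statuses = {method: fetch(url, method) for method in methods}
    consistent = len(set(statuses.values())) == 1
    return statuses, consistent


# Offline illustration with a stand-in fetcher that answers HEAD and GET differently.
def fake_fetch(url, method):
    return 200 if method == "GET" else 403


statuses, consistent = compare_methods("10.1000/example", fetch=fake_fetch)
```

A fuller comparison would also record redirect chains and response bodies, and would repeat the requests from several network environments and over time, as the abstract indicates.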
       
 
JournalTOCs
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762
 

