International Journal on Digital Libraries
  [SJR: 0.375]   [H-I: 28]
   Hybrid journal (it can contain Open Access articles)
   ISSN (Print) 1432-5012 - ISSN (Online) 1432-1300
   Published by Springer-Verlag
  • Results of a digital library curriculum field test
    • Authors: Sanghee Oh; Seungwon Yang; Jeffrey P. Pomerantz; Barbara M. Wildemuth; Edward A. Fox
      Pages: 273 - 286
      Abstract: The DL Curriculum Development project was launched in 2006, responding to an urgent need for consensus on DL curriculum across the fields of computer science and information and library science. Over the course of several years, 13 modules of a digital libraries (DL) curriculum were developed and were ready for field testing. The modules were evaluated in DL courses in real classroom environments in 37 classes by 15 instructors and their students. Interviews with instructors and questionnaires completed by their students were used to collect evaluative feedback. Findings indicate that the modules have been well designed to educate students on important topics and issues in DLs, in general. Suggestions to improve the modules based on the interviews and questionnaires were discussed as well. After the field test, module development has been continued, not only for the DL community but also others associated with DLs, such as information retrieval, big data, and multimedia. Currently, 56 modules are readily available for use through the project website or the Wikiversity site.
      PubDate: 2016-11-01
      DOI: 10.1007/s00799-015-0151-5
      Issue No: Vol. 17, No. 4 (2016)
  • Systems integration of heterogeneous cultural heritage information systems in museums: a case study of the National Palace Museum
    • Authors: Shao-Chun Wu
      Pages: 287 - 304
      Abstract: This study addresses the process of information systems integration in museums. Research emphasis has concentrated on systems integration in the business community after restructuring of commercial enterprises. Museums fundamentally differ from commercial enterprises and thus cannot wholly rely on the business model for systems integration. A case study of the National Palace Museum in Taiwan was conducted to investigate its systems integration of five legacy systems into one information system for museum and public use. Participatory observation methods were used to collect data for inductive analysis. The results suggested that museums are motivated to integrate their systems by internal cultural and administrative operations, external cultural and creative industries, public expectations, and information technology attributes. Four factors related to the success of the systems integration project: (1) the unique attributes of a museum’s artifacts, (2) the attributes and needs of a system’s users, (3) the unique demands of museum work, and (4) the attributes of existing information technology resources within a museum. The results provide useful reference data for other museums when they carry out systems integration.
      PubDate: 2016-11-01
      DOI: 10.1007/s00799-015-0154-2
      Issue No: Vol. 17, No. 4 (2016)
  • Research-paper recommender systems: a literature survey
    • Authors: Joeran Beel; Bela Gipp; Stefan Langer; Corinna Breitinger
      Pages: 305 - 338
      Abstract: In the last 16 years, more than 200 research articles were published about research-paper recommender systems. We reviewed these articles and present some descriptive statistics in this paper, as well as a discussion about the major advancements and shortcomings and an overview of the most common recommendation concepts and approaches. We found that more than half of the recommendation approaches applied content-based filtering (55 %). Collaborative filtering was applied by only 18 % of the reviewed approaches, and graph-based recommendations by 16 %. Other recommendation concepts included stereotyping, item-centric recommendations, and hybrid recommendations. The content-based filtering approaches mainly utilized papers that the users had authored, tagged, browsed, or downloaded. TF-IDF was the most frequently applied weighting scheme. In addition to simple terms, n-grams, topics, and citations were utilized to model users’ information needs. Our review revealed some shortcomings of the current research. First, it remains unclear which recommendation concepts and approaches are the most promising. For instance, researchers reported different results on the performance of content-based and collaborative filtering. Sometimes content-based filtering performed better than collaborative filtering and sometimes it performed worse. We identified three potential reasons for the ambiguity of the results. (A) Several evaluations had limitations. They were based on strongly pruned datasets, few participants in user studies, or did not use appropriate baselines. (B) Some authors provided little information about their algorithms, which makes it difficult to re-implement the approaches. Consequently, researchers use different implementations of the same recommendation approaches, which might lead to variations in the results. (C) We speculated that minor variations in datasets, algorithms, or user populations inevitably lead to strong variations in the performance of the approaches. Hence, finding the most promising approaches is a challenge. As a second limitation, we noted that many authors neglected to take into account factors other than accuracy, for example overall user satisfaction. In addition, most approaches (81 %) neglected the user-modeling process and did not infer information automatically but let users provide keywords, text snippets, or a single paper as input. Information on runtime was provided for 10 % of the approaches. Finally, few research papers had an impact on research-paper recommender systems in practice. We also identified a lack of authority and long-term research interest in the field: 73 % of the authors published no more than one paper on research-paper recommender systems, and there was little cooperation among different co-author groups. We concluded that several actions could improve the research landscape: developing a common evaluation framework, agreement on the information to include in research papers, a stronger focus on non-accuracy aspects and user modeling, a platform for researchers to exchange information, and an open-source framework that bundles the available recommendation approaches.
      PubDate: 2016-11-01
      DOI: 10.1007/s00799-015-0156-0
      Issue No: Vol. 17, No. 4 (2016)
  • Location-triggered mobile access to a digital library of audio books using
    • Authors: Annika Hinze; David Bainbridge
      Pages: 339 - 365
      Abstract: This paper explores the role of audio as a means to access books while being at locations referred to within the books, through a mobile app, called Tipple. The books are sourced from a digital library—either self-contained on the mobile phone, or else over the network—and can either be accompanied by pre-recorded audio or synthesized using text-to-speech. The paper details the functional requirements, design and implementation of Tipple. The developed concept was explored and evaluated through three field studies.
      PubDate: 2016-11-01
      DOI: 10.1007/s00799-015-0165-z
      Issue No: Vol. 17, No. 4 (2016)
  • Creating knowledge maps using Memory Island
    • Authors: Bin Yang; Jean-Gabriel Ganascia
      Abstract: Knowledge maps are useful tools, now beginning to be widely applied to the management and sharing of large-scale hierarchical knowledge. In this paper, we discuss how knowledge maps can be generated using Memory Island. Memory Island is our cartographic visualization technique, which was inspired by the ancient “Art of Memory”. It consists of automatically creating the spatial cartographic representation of a given hierarchical knowledge (e.g., ontology). With the help of its interactive functions, users can navigate through an artificial landscape, to learn and retrieve information from the knowledge. We also present some preliminary results of representing different hierarchical knowledge to show how the knowledge maps created by our technique work.
      PubDate: 2016-10-21
      DOI: 10.1007/s00799-016-0196-0
  • Expressing reliability with CIDOC CRM
    • Authors: Franco Niccolucci; Sorin Hermon
      Abstract: The paper addresses the issue of documenting and communicating the reliability of evidence interpretation in archaeology and, in general, in heritage science. It is proposed to express reliability with fuzzy logic, and model it using an extension of CIDOC CRM classes and properties. This proposed extension is compared with other CRM extensions.
      PubDate: 2016-10-07
      DOI: 10.1007/s00799-016-0195-1
  • Inheriting library cards to Babel and Alexandria: contemporary metaphors for the digital library
    • Authors: Paul Gooding; Melissa Terras
      Abstract: Librarians have been consciously adopting metaphors to describe library concepts since the nineteenth century, helping us to structure our understanding of new technologies. As a profession, we have drawn extensively on these figurative frameworks to explore issues surrounding the digital library, yet very little has been written to date which interrogates how these metaphors have developed over the years. Previous studies have explored library metaphors, using either textual analysis or ethnographic methods to investigate their usage. However, this is to our knowledge the first study to use bibliographic data, corpus analysis, qualitative sentiment weighting and close reading to study particular metaphors in detail. It draws on a corpus of over 450 articles to study the use of the metaphors of the Library of Alexandria and Babel, concluding that both have been extremely useful as framing metaphors for the digital library. However, their longstanding use has seen them become stretched as metaphors, meaning that the field’s figurative framework now fails to represent the changing technologies which underpin contemporary digital libraries.
      PubDate: 2016-09-22
      DOI: 10.1007/s00799-016-0194-2
  • Harmonizing the CRMba and CRMarchaeo models
    • Authors: Paola Ronzino
      Abstract: This work presents initial thoughts towards the harmonization of the CRMba and CRMarchaeo models, two extensions of the CIDOC CRM. The former was developed to model the complexity of a built structure from the perspective of buildings archaeology, while the latter was developed to model the processes involved in the investigation of subsurface archaeological deposits. The paper describes the modelling principles of CRMba and CRMarchaeo, and identifies common concepts that will allow the two ontological models to be merged.
      PubDate: 2016-08-19
      DOI: 10.1007/s00799-016-0193-3
  • CRMgeo: A spatiotemporal extension of CIDOC-CRM
    • Authors: Gerald Hiebel; Martin Doerr; Øyvind Eide
      Abstract: CRMgeo is a formal ontology intended to be used as a global schema for integrating spatiotemporal properties of temporal entities and persistent items. Its primary purpose is to provide a schema consistent with the CIDOC CRM to integrate geoinformation using the conceptualizations, formal definitions, encoding standards and topological relations defined by the Open Geospatial Consortium in GeoSPARQL. To build the ontology, the same ontology engineering methodology was used as in the CIDOC CRM. CRMgeo first introduced the concept of Spacetime volume that was subsequently included in the CIDOC CRM and provides a differentiation between phenomenal and declarative Spacetime volume, Place and Time-Span. Phenomenal classes derive their identity from real world phenomena like events or things, and declarative classes derive their identity from human declarations like dates or coordinates. This differentiation is an essential conceptual background to link CIDOC CRM to the classes, topological relations and encodings provided by GeoSPARQL, thus allowing spatiotemporal analysis offered by geoinformation systems based on the semantic distinctions of the CIDOC CRM. CRMgeo introduces the classes and relations necessary to model the spatiotemporal properties of real world phenomena and their topological and semantic relations to spatiotemporal information about these phenomena that was derived from historic sources, maps, observations or measurements. It is able to model the full chain of approximating and finding again a phenomenal place, like the actual site of a ship wreck, by a declarative place, like a mark on a sea chart.
      PubDate: 2016-08-13
      DOI: 10.1007/s00799-016-0192-4
  • What’s news? Encounters with news in everyday life: a study of behaviours and attitudes
    • Authors: Sally Jo Cunningham; David M. Nichols; Annika Hinze; Judy Bowen
      Abstract: As the news landscape changes, for many users the nature of news itself is changing as well. Insights into the changing news behaviour of users can inform the design of access tools and news archives. We analysed a set of 35 autoethnographies of news encounters, created by students in New Zealand. These comprise rich descriptions of the news sources, modalities, topics of interest, and news ‘routines’ by which the students keep in touch with friends and maintain awareness of personal, local, national, and international events. We explore the implications of these insights into news behaviour for digital news systems.
      PubDate: 2016-08-10
      DOI: 10.1007/s00799-016-0187-1
  • Off-the-shelf CRM with Drupal: a case study of documenting decorated
    • Authors: Athanasios Velios; Aurelie Martin
      Abstract: We present a method of setting up a website using the Drupal CMS to publish CRM data. Our setup requires basic technical expertise by researchers who are then able to publish their records in both a human accessible way through HTML and a machine friendly format through RDFa. We begin by examining previous work on Drupal and the CRM and identifying useful patterns. We present the Drupal modules that are required by our setup and we explain why these are sustainable. We continue by giving guidelines for setting up Drupal to serve CRM data easily and we describe a specific installation for our case study which is related to decorated papers alongside our CRM mapping. We finish with highlighting the benefits of our method (i.e. speed and user-friendliness) and we refer to a number of issues which require further work (i.e. automatic validation, UI improvements and the provision for SPARQL endpoints).
      PubDate: 2016-08-08
      DOI: 10.1007/s00799-016-0191-5
  • Editorial for the TPDL 2015 special issue
    • Authors: Sarantos Kapidakis; Cezary Mazurek; Marcin Werla
      PubDate: 2016-07-26
      DOI: 10.1007/s00799-016-0190-6
  • WW1LOD: an application of CIDOC-CRM to World War 1 linked data
    • Authors: Eetu Mäkelä; Juha Törnroos; Thea Lindquist; Eero Hyvönen
      Abstract: The CIDOC-CRM standard indicates that common events, actors, places and timeframes are important in linking together cultural material, and provides a framework for describing them. However, merely describing entities in this way in two datasets does not yet interlink them. To do that, the identities of instances still need to be either reconciled, or be based on a shared vocabulary. The WW1LOD dataset presented in this paper was created to facilitate both of these approaches for collections dealing with the First World War. For this purpose, the dataset includes events, places, agents, times, keywords, and themes related to the war, based on over ten different authoritative data sources from providers such as the Imperial War Museum. The content is harmonized into RDF, and published as a Linked Open Data service. While generally based on CIDOC-CRM, some modeling choices used also deviate from it where our experience dictated such. In the article, these deviations are discussed in the hope that they may serve as examples where CIDOC-CRM itself may warrant further examination. As a demonstration of use, the dataset and online service have been used to create a contextual reader application that is able to link together and pull in information related to WW1 from, e.g., 1914–1918 Online, Wikipedia, WW1 Discovery, Europeana and the Digital Public Library of America.
      PubDate: 2016-07-26
      DOI: 10.1007/s00799-016-0186-2
  • Scripta manent: a CIDOC CRM semiotic reading of ancient texts
    • Authors: Achille Felicetti; Francesca Murano
      Abstract: This paper tries to identify the most important concepts involved in the study of ancient texts and proposes the use of CIDOC CRM to encode them and to model the scientific process of investigation related to the study of ancient texts to foster integration with other cultural heritage research fields. After identifying the key concepts, assessing the available technologies and analysing the entities provided by CIDOC CRM and by its extensions, we introduce more specific classes to be used as the basis for creating a new extension, CRMtex, which is more responsive to the specific needs of the various disciplines involved (including papyrology, palaeography, codicology and epigraphy).
      PubDate: 2016-07-22
      DOI: 10.1007/s00799-016-0189-z
  • Characteristics of social media stories
    • Authors: Yasmin AlNoamany; Michele C. Weigle; Michael L. Nelson
      Abstract: An emerging trend in social media is for users to create and publish “stories”, or curated lists of  Web resources, with the purpose of creating a particular narrative of interest to the user. While some stories on the Web are automatically generated, such as Facebook’s “Year in Review”, one of the most popular storytelling services is “Storify”, which provides users with curation tools to select, arrange, and annotate stories with content from social media and the Web at large. We would like to use tools, such as Storify, to present (semi-)automatically created summaries of archival collections. To support automatic story creation, we need to better understand as a baseline the structural characteristics of popular (i.e., receiving the most views) human-generated stories. We investigated 14,568 stories from Storify, comprising 1,251,160 individual resources, and found that popular stories (i.e., top 25 % of views normalized by time available on the Web) have the following characteristics: 2/28/1950 elements (min/median/max), a median of 12 multimedia resources (e.g., images, video), 38 % receive continuing edits, and 11 % of their elements are missing from the live Web. We also checked the population of Archive-It collections (3109 collections comprising 305,522 seed URIs) for better understanding the characteristics of the collections that we intend to summarize. We found that the resources in human-generated stories are different from the resources in Archive-It collections. In summarizing a collection, we can only choose from what is archived (e.g., is popular in Storify, but rare in Archive-It). However, some other characteristics of human-generated stories will be applicable, such as the number of resources.
      PubDate: 2016-07-21
      DOI: 10.1007/s00799-016-0185-3
  • Detecting off-topic pages within TimeMaps in Web archives
    • Authors: Yasmin AlNoamany; Michele C. Weigle; Michael L. Nelson
      Abstract: Web archives have become a significant repository of our recent history and cultural heritage. Archival integrity and accuracy are a precondition for future cultural research. Currently, there are no quantitative or content-based tools that allow archivists to judge the quality of the Web archive captures. In this paper, we address the problem of detecting when a particular page in a Web archive collection has gone off-topic relative to its first archived copy. We do not delete off-topic pages (they remain part of the collection), but they are flagged as off-topic so they can be excluded from consideration for downstream services, such as collection summarization and thumbnail generation. We propose different methods (cosine similarity, Jaccard similarity, intersection of the 20 most frequent terms, Web-based kernel function, and the change in size using the number of words and content length) to detect when a page has gone off-topic. Those predicted off-topic pages will be presented to the collection’s curator for possible elimination from the collection or cessation of crawling. We created a gold standard data set from three Archive-It collections to evaluate the proposed methods at different thresholds. We found that combining cosine similarity at threshold 0.10 and change in size using word count at threshold −0.85 performs the best with accuracy = 0.987, F1 score = 0.906, and AUC = 0.968. We evaluated the performance of the proposed method on several Archive-It collections. The average precision of detecting off-topic pages in the collections is 0.89.
      PubDate: 2016-07-18
      DOI: 10.1007/s00799-016-0183-5
  • Web archive profiling through CDX summarization
    • Authors: Sawood Alam; Michael L. Nelson; Herbert Van de Sompel; Lyudmila L. Balakireva; Harihar Shankar; David S. H. Rosenthal
      Abstract: With the proliferation of public web archives, it is becoming more important to better profile their contents, both to understand their immense holdings and to support routing of requests in the Memento aggregator. To save time, the Memento aggregator should only poll the archives that are likely to have a copy of the requested URI. Using the crawler index files produced after crawling, we can generate profiles of the archives that summarize their holdings and can be used to inform routing of the Memento aggregator’s URI requests. Previous work in profiling ranged from using full URIs (no false positives, but with large profiles) to using only top-level domains (TLDs) (smaller profiles, but with many false positives). This work explores strategies in between these two extremes. In our experiments, we correctly identified about 78 % of the URIs that were present or not present in the archive with less than 1 % relative cost as compared to the complete knowledge profile, and 94 % of the URIs with less than 10 % relative cost, without any false negatives. With respect to the TLD-only profile, the registered domain profile doubled the routing precision, while complete hostname and one path segment gave a tenfold increase in the routing precision.
      PubDate: 2016-07-16
      DOI: 10.1007/s00799-016-0184-4
  • Using a file history graph to keep track of personal resources across devices and services
    • Authors: Matthias Geel; Moira C. Norrie
      Abstract: Personal digital resources now tend to be stored, managed and shared using a variety of devices and online services. As a result, different versions of resources are often stored in different places, and it has become increasingly difficult for users to keep track of them. We introduce the concept of a file history graph that can be used to provide users with a global view of resource provenance and enable them to track specific versions across devices and services. We describe how this has been used to realise a version-aware environment, called Memsy, and report on a lab study used to evaluate the proposed workflow. We also describe how reconciliation services can be used to fill in missing links in the file history graph and present a detailed study for the case of images as a proof of concept.
      PubDate: 2016-07-07
      DOI: 10.1007/s00799-016-0181-7
  • A semantic architecture for preserving and interpreting the information contained in Irish historical vital records
    • Authors: Christophe Debruyne; Oya Deniz Beyan; Rebecca Grant; Sandra Collins; Stefan Decker; Natalie Harrower
      Abstract: Irish Record Linkage 1864–1913 is a multi-disciplinary project that started in 2014 aiming to create a platform for analyzing events captured in historical birth, marriage, and death records by applying semantic technologies for annotating, storing, and inferring information from the data contained in those records. This enables researchers to, among other things, investigate to what extent maternal and infant mortality rates were underreported. We report on the semantic architecture, provide motivation for the adoption of RDF and Linked Data principles, and elaborate on the ontology construction process that was influenced by the requirements of both the digital archivists and the historians. Concerns of digital archivists include the preservation of the archival record and following best practices in preservation, cataloguing, and data protection. The historians in this project wish to discover certain patterns in those vital records. An important aspect of the semantic architecture is the clear separation of concerns that reflects those distinct requirements (the transcription and archival authenticity of the register pages on the one hand, and the interpretation of the transcribed data on the other), which led to the creation of two distinct ontologies and knowledge bases. The advantage of this clear separation is that the transcription of register pages resulted in a reusable data set fit for other research purposes. These transcriptions were enriched with metadata according to best practices in archiving for ingestion in suitable long-term digital preservation platforms.
      PubDate: 2016-07-01
      DOI: 10.1007/s00799-016-0180-8
  • Evaluating unsupervised thesaurus-based labeling of audiovisual content in an archive production environment
    • Authors: Victor de Boer; Roeland J. F. Ordelman; Josefien Schuurman
      Abstract: In this paper we report on a two-stage evaluation of unsupervised labeling of audiovisual content using collateral text data sources to investigate how such an approach can provide acceptable results for given requirements with respect to archival quality, authority and service levels to external users. We conclude that with parameter settings that are optimized using a rigorous evaluation of precision and accuracy, the quality of automatic term-suggestion is sufficiently high. We furthermore provide an analysis of the term extraction after being taken into production, where we focus on performance variation with respect to term types and television programs. Having implemented the procedure in our production work-flow allows us to gradually develop the system further and to also assess the effect of the transformation from manual to automatic annotation from an end-user perspective. Additional future work will be on deploying different information sources including annotations based on multimodal video analysis such as speaker recognition and computer vision.
      PubDate: 2016-06-23
      DOI: 10.1007/s00799-016-0182-6
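
The abstract of "Detecting off-topic pages within TimeMaps in Web archives" above reports that combining cosine similarity (threshold 0.10) with the relative change in word count (threshold −0.85) worked best. A minimal sketch of such a check might look like the following. Everything beyond the two thresholds is an assumption made for illustration: the function names are hypothetical, the tokenizer is a simple regular expression, the vectors use raw term frequencies, and a page is flagged when either test fails, which may differ from how the authors weight terms and combine the two measures.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(tokens_a, tokens_b):
    """Cosine similarity between raw term-frequency vectors of two token lists."""
    tf_a, tf_b = Counter(tokens_a), Counter(tokens_b)
    dot = sum(tf_a[t] * tf_b[t] for t in tf_a.keys() & tf_b.keys())
    norm_a = math.sqrt(sum(c * c for c in tf_a.values()))
    norm_b = math.sqrt(sum(c * c for c in tf_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def word_count_change(first_tokens, later_tokens):
    """Relative change in word count of a later capture versus the first capture."""
    if not first_tokens:
        return 0.0
    return (len(later_tokens) - len(first_tokens)) / len(first_tokens)


def is_off_topic(first_text, later_text, cos_threshold=0.10, wc_threshold=-0.85):
    """Flag a later capture as off-topic relative to the first archived copy.

    Assumption: the capture is flagged if EITHER measure falls below its threshold.
    """
    first, later = tokenize(first_text), tokenize(later_text)
    return (
        cosine_similarity(first, later) < cos_threshold
        or word_count_change(first, later) < wc_threshold
    )
```

For example, `is_off_topic(first_capture_text, later_capture_text)` would be called once per later capture in a TimeMap, always comparing against the text of the first capture; flagged captures stay in the collection and are only excluded from downstream services.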