Journal of Information Science
Journal Prestige (SJR): 0.674
Citation Impact (CiteScore): 2
Number of Followers: 1216  
Hybrid journal (contains 2 Open Access articles in this issue)
ISSN (Print) 0165-5515 - ISSN (Online) 1741-6485
Published by Sage Publications [1088 journals]
  • Intelligent detection of hate speech in Arabic social network: A machine
           learning approach
    • Authors: Ibrahim Aljarah, Maria Habib, Neveen Hijazi, Hossam Faris, Raneem Qaddoura, Bassam Hammo, Mohammad Abushariah, Mohammad Alfawareh
      Abstract: Journal of Information Science, Ahead of Print.
      Nowadays, cyber hate speech is increasingly growing, which forms a serious problem worldwide by threatening the cohesion of civil societies. Hate speech relates to using expressions or phrases that are violent, offensive or insulting for a person or a minority of people. In particular, in the Arab region, the number of Arab social media users is growing rapidly, which is accompanied with high increasing rate of cyber hate speech. This drew our attention to aspire healthy online environments that are free of hatred and discrimination. Therefore, this article aims to detect cyber hate speech based on Arabic context over Twitter platform, by applying Natural Language Processing (NLP) techniques, and machine learning methods. The article considers a set of tweets related to racism, journalism, sports orientation, terrorism and Islam. Several types of features and emotions are extracted and arranged in 15 different combinations of data. The processed dataset is experimented using Support Vector Machine (SVM), Naive Bayes (NB), Decision Tree (DT) and Random Forest (RF), in which RF with the feature set of Term Frequency-Inverse Document Frequency (TF-IDF) and profile-related features achieves the best results. Furthermore, a feature importance analysis is conducted based on RF classifier in order to quantify the predictive ability of features in regard to the hate class.
      Citation: Journal of Information Science
      PubDate: 2020-05-18T07:50:11Z
      DOI: 10.1177/0165551520917651
  • Museum libraries in Spain: A case study at state level
    • Authors: Silvia Cobo-Serrano, Rosario Arquero-Avilés, Gonzalo Marco-Cuenca
      Abstract: Journal of Information Science, Ahead of Print.
      Special libraries are essential information and documentation centres for university teachers and researchers due to the quality and richness of their collections. In Spain, it is estimated that there are 2456 special libraries, although many are unknown either generally or among information professionals. These include museum libraries, which are important centres with valuable collections of bibliographic heritage for the area of Humanities and Social Sciences. The aim of this research is to gain an understanding of the real state of these information units and promote the social value of museum libraries in Spain. To do this, a survey was sent to the libraries of state-owned and -managed museums under the General Directorate of Fine Arts and Cultural Property (Ministry of Culture and Sports) of the Government of Spain. This general objective will be accompanied by a review of the scientific literature on various aspects of museum libraries at national and international level. After addressing the research methodology, the results obtained will be discussed and will include the following topics: collection management, library services and staff, economic and technological resources and finally, library management. Conclusions include recommendations for museum librarians and reveal that institutional cooperation is a strategic issue to improve both museum libraries visibility and their social recognition as cultural and research centre.
      Citation: Journal of Information Science
      PubDate: 2020-05-15T07:08:07Z
      DOI: 10.1177/0165551520917652
  • Performance-based evaluation of academic libraries in the big data era
    • Authors: A Y M Atiquil Islam, Khurshid Ahmad, Muhammad Rafi, Zheng JianMing
      Abstract: Journal of Information Science, Ahead of Print.
      The concept of big data has been extensively considered as a technological modernisation in organisations and educational institutes. Thus, the purpose of this study is to determine whether the modified technology acceptance model (MTAM) is viable for evaluating the performance of librarians in the use of big data analytics in academic libraries. This study used an empirical research method for collecting data from 211 librarians working in Pakistan’s universities. On the basis of the findings of the MTAM analysis by structural equation modelling, the performances of the academic libraries were comprehended through the process of big data. The main influential components of the performance analysis in this study were the big data analytics capabilities, perceived ease of access and the usefulness of big data practices in academic libraries. Subsequently, the utilisation of big data was significantly affected by skills, perceived ease of access and the usefulness of academic libraries. The results also suggested that the various components of the academic libraries lead to effective organisational performance when linked to big data analytics.
      Citation: Journal of Information Science
      PubDate: 2020-05-13T04:34:42Z
      DOI: 10.1177/0165551520918516
  • Semantics-preserving optimisation of mapping multi-column key constraints
           for RDB to RDF transformation
    • Authors: Hee-Gook Jun, Dong-Hyuk Im, Hyoung-Joo Kim
      Abstract: Journal of Information Science, Ahead of Print.
      The relational database (RDB) to resource description framework (RDF) transformation is a major semantic information extraction method because most web data are managed by RDBs. Existing automatic RDB-to-RDF transformation methods generate RDF data without losing the semantics of original relational data. However, two major problems have been observed during the mapping of multi-column key constraints: repetitive data generation and semantic information loss. In this article, we propose an improved RDB-to-RDF transformation method that ensures mapping without the aforementioned problems. Optimised rules are defined to generate an accurate semantic data structure for a multi-column key constraint and to reduce repetitive constraint data. Experimental results show that the proposed method achieves better accuracy in transforming multi-column key constraints and generates compact semantic results without repetitive data.
      Citation: Journal of Information Science
      PubDate: 2020-05-12T05:14:54Z
      DOI: 10.1177/0165551520920804
  • Topic extraction to provide an overview of research activities: The case
           of the high-temperature superconductor and simulation and modelling
    • Authors: Ritsuko Nakajima, Nobuyuki Midorikawa
      Abstract: Journal of Information Science, Ahead of Print.
      For those who are not experts in a particular scientific field, it is difficult to understand scientific research trends. Although studies on the extraction of research trends have been conducted, most focus on extracting global trends from large-scale data, and the methods are often complicated. The purpose of this study is to develop a method of obtaining overviews of a scientific field for non-experts by capturing research trends simply and then to verify the method. To extract research topics which should express research trends, text analysis was performed using abstracts over 12 years of articles on high-temperature superconductors. We characterised three topics for the extracted word groups that frequently occurred. For these topics, we studied their appropriateness using a method that has been little used: examining research articles, review literature and co-citations among research articles used to extract the words, comparisons with controlled index terms assigned to the articles and confirming that there were no contradictions. Based on the established method, we have also applied this method to another research field: ‘simulation and modelling’. Although the method used in this article is simple, important topics were extracted, and the relations with the original articles are clear, which can lead to further investigation of the extracted topics.
      Citation: Journal of Information Science
      PubDate: 2020-05-06T05:11:55Z
      DOI: 10.1177/0165551520920794
  • Partitioning highly, medium and lowly cited publications
    • Authors: Yong Huang, Yi Bu, Ying Ding, Wei Lu
      Abstract: Journal of Information Science, Ahead of Print.
      Dividing papers based on their numbers of citations into several groups constitutes one of the most common research practices in bibliometrics and beyond. However, existing dividing methods are both arbitrary and subject to bias. This article proposes a novel approach to partition highly, medium and lowly cited publications based on their citation distribution. We utilise the whole Web of Science (WoS) dataset to demonstrate how to apply this approach to scholarly datasets and examine the robustness of our algorithm in each of the six disciplines under the WoS dataset. The codes that underlie the algorithm are available online.
      Citation: Journal of Information Science
      PubDate: 2020-04-27T04:30:29Z
      DOI: 10.1177/0165551520917655
  • Which are the influential publications in the Web of Science subject
           categories over a long period of time? CRExplorer software used for
           big-data analyses in bibliometrics
    • Authors: Andreas Thor, Lutz Bornmann, Robin Haunschild, Loet Leydesdorff
      Abstract: Journal of Information Science, Ahead of Print.
      What are the landmark papers in scientific disciplines? Which papers are indispensable for scientific progress? These are typical questions which are of interest not only for researchers (who frequently know the answers – or guess to know them) but also for the interested general public. Citation counts can be used to identify very useful papers since they reflect the wisdom of the crowd – in this case, the scientists using published results for their research. In this study, we identified with recently developed methods for the program CRExplorer landmark publications in nearly all Web of Science subject categories (WoS-SCs). These are publications which belong more frequently than other publications during the citing years to the top-1‰ in their subject area. As examples, we show the results of five subject categories: ‘Information Science & Library Science’, ‘Computer Science, Information Systems’, ‘Computer Science, Software Engineering’, ‘Psychology, Social’ and, ‘Chemistry, Physical’. The results of the other WoS-SCs can be found online at An analyst of the results should keep in mind that the identification of landmark papers depends on the used methods and data. Small differences in methods and/or data may lead to other results.
      Citation: Journal of Information Science
      PubDate: 2020-04-24T05:31:55Z
      DOI: 10.1177/0165551520913817
  • An ensemble clustering approach for topic discovery using implicit text
    • Authors: Muhammad Qasim Memon, Yu Lu, Penghe Chen, Aasma Memon, Muhammad Salman Pathan, Zulfiqar Ali Zardari
      Abstract: Journal of Information Science, Ahead of Print.
      Text segmentation (TS) is the process of dividing multi-topic text collections into cohesive segments using topic boundaries. Similarly, text clustering has been renowned as a major concern when it comes to multi-topic text collections, as they are distinguished by sub-topic structure and their contents are not associated with each other. Existing clustering approaches follow the TS method which relies on word frequencies and may not be suitable to cluster multi-topic text collections. In this work, we propose a new ensemble clustering approach (ECA) is a novel topic-modelling-based clustering approach, which induces the combination of TS and text clustering. We improvised a LDA-onto (LDA-ontology) is a TS-based model, which presents a deterioration of a document into segments (i.e. sub-documents), wherein each sub-document is associated with exactly one sub-topic. We deal with the problem of clustering when it comes to a document that is intrinsically related to various topics and its topical structure is missing. ECA is tested through well-known datasets in order to provide a comprehensive presentation and validation of clustering algorithms using LDA-onto. ECA exhibits the semantic relations of keywords in sub-documents and resultant clusters belong to original documents that they contain. Moreover, present research sheds the light on clustering performances and it indicates that there is no difference over performances (in terms of F-measure) when the number of topics changes. Our findings give above par results in order to analyse the problem of text clustering in a broader spectrum without applying dimension reduction techniques over high sparse data. Specifically, ECA provides an efficient and significant framework than the traditional and segment-based approach, such that achieved results are statistically significant with an average improvement of over 10.2%. 
      For the most part, proposed framework can be evaluated in applications where meaningful data retrieval is useful, such as document summarization, text retrieval, novelty and topic detection.
      Citation: Journal of Information Science
      PubDate: 2020-04-14T08:30:03Z
      DOI: 10.1177/0165551520911590
  • Cross-lingual text similarity exploiting neural machine translation models
    • Authors: Kazuhiro Seki
      Abstract: Journal of Information Science, Ahead of Print.
      This article studies cross-lingual text similarity using neural machine translation models. A straightforward approach based on machine translation is to use translated text so as to make the problem monolingual. Another possible approach is to use intermediate states of machine translation models as recently proposed in the related work, which could avoid propagation of translation errors. We aim at improving both approaches independently and then combine the two types of information, that is, translations and intermediate states, in a learning-to-rank framework to compute cross-lingual text similarity. To evaluate the effectiveness and generalisability of our approach, we conduct empirical experiments on English–Japanese and English–Hindi translation corpora for a cross-lingual sentence retrieval task. It is demonstrated that our approach using translations and intermediate states outperforms other neural network–based approaches and is even comparable with a strong baseline based on a state-of-the-art machine translation system.
      Citation: Journal of Information Science
      PubDate: 2020-03-19T04:39:20Z
      DOI: 10.1177/0165551520912676
  • Semisupervised sentiment analysis method for online text reviews
    • Authors: Gyeong Taek Lee, Chang Ouk Kim, Min Song
      Abstract: Journal of Information Science, Ahead of Print.
      Sentiment analysis plays an important role in understanding individual opinions expressed in websites such as social media and product review sites. The common approaches to sentiment analysis use the sentiments carried by words that express opinions and are based on either supervised or unsupervised learning techniques. The unsupervised learning approach builds a word-sentiment dictionary, but it requires lengthy time periods and high costs to build a reliable dictionary. The supervised learning approach uses machine learning models to learn the sentiment scores of words; however, training a classifier model requires large amounts of labelled text data to achieve a good performance. In this article, we propose a semisupervised approach that performs well despite having only small amounts of labelled data available for training. The proposed method builds a base sentiment dictionary from a small training dataset using a lasso-based ensemble model with minimal human effort. The scores of words not in the training dataset are estimated using an adaptive instance-based learning model. In a pretrained word2vec model space, the sentiment values of the words in the dictionary are propagated to the words that did not exist in the training dataset. Through two experiments, we demonstrate that the performance of the proposed method is comparable to that of supervised learning models trained on large datasets.
      Citation: Journal of Information Science
      PubDate: 2020-03-02T05:46:53Z
      DOI: 10.1177/0165551520910032
  • A qualitative–quantitative study of science mapping by different
           algorithms: The Polish journals landscape
    • Authors: Veslava Osinska
      Abstract: Journal of Information Science, Ahead of Print.
      By applying different clustering algorithms, the author strived to construct the best visual representation of scientific domains and disciplines in Poland. Journals and their disciplinary categories constituted a data set. A comparative analysis of maps was based on both qualitative and quantitative approaches. Complex patterns of eight maps were evaluated taking into account both the local proximity of disciplines and the whole structure of presented domains. Final clustering quality value was introduced and calculated in reference to the knowledge domains. The authors underlined the role of quantitative and qualitative methods in combination in the mapping evaluation. The best results were obtained with the T-distributed stochastic neighbour embedding (t-SNE) algorithm. This youngest technique may have the biggest potential for semantic information studies and in the scope of broadly understood semantic solutions.
      Citation: Journal of Information Science
      PubDate: 2020-02-03T09:30:59Z
      DOI: 10.1177/0165551520902738
  • Online news media website ranking using user-generated content
    • Authors: Samaneh Karimi, Azadeh Shakery, Rakesh Verma
      Abstract: Journal of Information Science, Ahead of Print.
      News media websites are important online resources that have drawn great attention of text mining researchers. The main aim of this study is to propose a framework for ranking online news websites from different viewpoints. The ranking of news websites provides useful information, which can benefit many news-related tasks such as news retrieval and news recommendation. In the proposed framework, the ranking of news websites is obtained by calculating three measures introduced in the article and based on user-generated content (UGC). Each proposed measure is concerned with the performance of news websites from a particular viewpoint including the completeness of news reports, the diversity of events being covered by the website and its speed. The use of UGC in this framework, as a partly unbiased, real-time and low cost content on the web distinguishes the proposed news website ranking framework from the literature. The results obtained for three prominent news websites, British Broadcasting Corporation (BBC), Cable News Network (CNN) and New York Times (NYTimes), show that BBC has the best performance in terms of news completeness and speed, and NYTimes has the best diversity in comparison with the other two websites.
      Citation: Journal of Information Science
      PubDate: 2020-02-03T09:08:28Z
      DOI: 10.1177/0165551519894928
  • Group trip planning and information seeking behaviours by mobile social
           media users: A study of tourists in Australia, Bangladesh and China
    • Authors: Jannatul Fardous, Jia Tina Du, Preben Hansen, Kim-Kwang Raymond Choo, Songshan (Sam) Huang
      Abstract: Journal of Information Science, Ahead of Print.
      Social media plays an increasingly important role in travel information seeking and decision-making. However, there is limited understanding of how a group of tourists use social media to plan trips collaboratively and the different practices between countries. In this study, we investigated the collaborative information seeking (CIS) and sharing behaviours of mobile social media users from Australia, Bangladesh and China. Specifically, we surveyed a total of 219 participants to explore the differences in CIS behaviours when people were planning a group trip. The findings suggest significant differences among three countries in terms of the motivations of using social media, CIS activities and social interactions outside the group. Key findings include Bangladeshi and Chinese travellers preferred known contacts on social media, while Australian tourists intended to use both known contacts and user-generated contents for seeking information. The findings also show that social interactions employed by individuals are considered as an important complement of and are interwoven with in-group CIS; both contribute to tourism information seeking. Finally, we propose a framework for CIS research in the tourism domain.
      Citation: Journal of Information Science
      PubDate: 2019-12-10T06:39:23Z
      DOI: 10.1177/0165551519890515
  • The association between professional stratification and use of online
           sources: Evidence from the National Dental Practice-Based Research Network
    • Authors: Simone Rosenblum, Kimberley R Isett, Julia Melkers, Ellen Funkhouser, Diana Hicks, Gregg Gilbert, Michael J Melkers, Deborah McEdward, Meredith Buchberg-Trejo
      Abstract: Journal of Information Science, Ahead of Print.
      The use of online information sources in most professions is widespread and well researched. Less understood is how the use of these sources varies across the strata within a single profession, and how question context affects search behaviour. Using the dental profession as a case of a highly stratified discipline, we examine search preferences for sources by professional strata among dentists in a practice-based network. Results show that variation exists in information search behaviour across professional strata of dental clinicians. This study highlights the importance of addressing information literacy across different levels of a profession. Findings also underscore that search behaviour and source preference vary with perceived question relevance.
      Citation: Journal of Information Science
      PubDate: 2019-12-09T12:43:23Z
      DOI: 10.1177/0165551519890519
  • Topic modelling and social network analysis of publications and patents in
           humanoid robot technology
    • Authors: Richa Kumari, Jae Yun Jeong, Byeong-Hee Lee, Kwang-Nam Choi, Kiseok Choi
      Abstract: Journal of Information Science, Ahead of Print.
      This article presents an analysis of data from scientific articles and patents to identify the evolving trends and underlying topics in research on humanoid robots. We used topic modelling based on latent Dirichlet allocation analysis to identify underlying topics in sub-areas in the field. We also used social network analysis to measure the centrality indices of publication keywords to detect important and influential sub-areas and used co-occurrence analysis of keywords to visualise relationships among subfields. The research results are useful for identifying evolving topics and areas of current focus in the field of humanoid technology. The results help to identify valuable research patterns from publications and to increase understanding of the hidden knowledge themes that are revealed by patents.
      Citation: Journal of Information Science
      PubDate: 2019-12-09T12:43:22Z
      DOI: 10.1177/0165551519887878
  • Indicator of quality for environmental articles on Wikipedia at the higher
           education level
    • Authors: Eduard Petiška, Bedřich Moldan
      Abstract: Journal of Information Science, Ahead of Print.
      Wikipedia is important in higher education because students and scholars often use it. Nevertheless, the issue of Wikipedia’s quality is an obstacle for its use at the higher education level. In order to contribute to this discussion, we have proposed ‘Verifiability by respected sources’ as an indicator for assessing the quality of Wikipedia articles at the higher education level and conducted an analysis of the most frequently visited articles in the category of Environment on Wikipedia. Results show that these articles contain many unreferenced statements, so their usage at the higher education level is problematic. Therefore, we also propose specific steps for relevant actors that could help to improve the quality of Wikipedia.
      Citation: Journal of Information Science
      PubDate: 2019-12-09T12:43:19Z
      DOI: 10.1177/0165551519888607
  • DeepLink: A novel link prediction framework based on deep learning
    • Authors: Mohammad Mehdi Keikha, Maseud Rahgozar, Masoud Asadpour
      Abstract: Journal of Information Science, Ahead of Print.
      Recently, link prediction has attracted more attention from various disciplines such as computer science, bioinformatics and economics. In link prediction, numerous information such as network topology, profile information and user-generated contents are considered to discover missing links between nodes. Whereas numerous previous researches had focused on the structural features of the networks for link prediction, recent studies have shown more interest in profile and content information, too. So, some of these researches combine structural and content information. However, some issues such as scalability and feature engineering need to be investigated to solve a few remaining problems. Moreover, most of the previous researches are presented only for undirected and unweighted networks. In this article, a novel link prediction framework named ‘DeepLink’ is presented, which is based on deep learning techniques. While deep learning has the advantage of extracting automatically the best features for link prediction, many other link prediction algorithms need manual feature engineering. Moreover, in the proposed framework, both structural and content information are employed. The framework is capable of using different structural feature vectors that are prepared by various link prediction methods. It learns all proximity orders that are presented on a network during the structural feature learning. We have evaluated the effectiveness of DeepLink on two real social network datasets, Telegram and irBlogs. On both datasets, the proposed framework outperforms several other structural and hybrid approaches for link prediction.
      Citation: Journal of Information Science
      PubDate: 2019-12-08T07:50:04Z
      DOI: 10.1177/0165551519891345
  • Karyon: A scalable and easy to integrate ontology summarisation framework
    • Authors: Tugba Ozacar, Ovunc Ozturk
      Abstract: Journal of Information Science, Ahead of Print.
      In the current Semantic Web Community, as the size and complexity of ontologies increase, ontology summarisation is becoming more important. There are many studies in the literature that use different approaches and metrics. However, many of these studies are not effective in terms of performance or have integration issues with current technologies. In this study, the popular ontology summarisation metrics are examined focusing on their performance in terms of time, and a number of metrics have been selected accordingly. To increase the accuracy of selections made with chosen metrics, we propose a novel metric: ‘name inclusion’. This metric promotes a concept if its name is subsumed by the name of another concept. As the existing summarisation applications have integration issues, we have implemented our summarisation framework to integrate easily with the latest web technologies. Therefore, the algorithm is implemented using Rust language, which performs well and easily integrates with other languages.
      Citation: Journal of Information Science
      PubDate: 2019-12-03T06:01:28Z
      DOI: 10.1177/0165551519887873
  • Using microdata for international e-Government data exchange: The case of
           social security domain
    • Authors: Francisco Delgado, José R Hilera, Raul Ruggia, Salvador Otón, Héctor R Amado-Salvatierra
      Abstract: Journal of Information Science, Ahead of Print.
      Semantic interoperability issues of international e-Government data exchanges have not been solved up until now. In the case of social security institutions, the data exchange operations have some particularities that make that the non-ambiguous definition of core concepts used in the institutions has a key impact on the success and quality of system interconnections. In this article, we present the result of a research to implement a new metadata specification based in Dublin Core elements for international social security exchanges, named Exchange Social Security Information Metadata (ESSIM). This proposal is based in a semantic approach using Linked Data for Interoperability, with technologies, such as RDF(S), SPARQL, Microdata and JSON-LD, in order to ensure interoperability between social security institutions from different countries. This will help to strengthen the protection of the social security rights of mobile workers by automating the application of international agreements on social security and to improve cross border communication between social security institutions of different countries. For the near future, the goal is to include this specification as part of information and communication technology Guidelines under development by International Social Security Association with the participation of authors of this article. This will facilitate a future adoption of the specification as an international standard.
      Citation: Journal of Information Science
      PubDate: 2019-12-03T06:01:27Z
      DOI: 10.1177/0165551519891361
  • A review of author name disambiguation techniques for the PubMed
           bibliographic database
    • Authors: Debarshi Kumar Sanyal, Plaban Kumar Bhowmick, Partha Pratim Das
      Abstract: Journal of Information Science, Ahead of Print.
      Author names in bibliographic databases often suffer from ambiguity owing to the same author appearing under different names and multiple authors possessing similar names. It creates difficulty in associating a scholarly work with the person who wrote it, thereby introducing inaccuracy in credit attribution, bibliometric analysis, search-by-author in a digital library and expert discovery. A plethora of techniques for disambiguation of author names has been proposed in the literature. In this article, we focus on the research efforts targeted to disambiguate author names specifically in the PubMed bibliographic database. We believe this concentrated review will be useful to the research community because it discusses techniques applied to a very large real database that is actively used worldwide. We make a comprehensive survey of the existing author name disambiguation (AND) approaches that have been applied to the PubMed database: we organise the approaches into a taxonomy; describe the major characteristics of each approach including its performance, strengths, and limitations; and perform a comparative analysis of them. We also identify the datasets from PubMed that are publicly available for researchers to evaluate AND algorithms. Finally, we outline a few directions for future work.
      Citation: Journal of Information Science
      PubDate: 2019-12-01T11:54:06Z
      DOI: 10.1177/0165551519888605
  • Integrated framework for criminal network extraction from Web
    • Authors: Salim Afra, Reda Alhajj
      Abstract: Journal of Information Science, Ahead of Print.
      Extracting criminals’ information and discovering their network are techniques that investigators often rely on to get extra information about criminal incidents and potential criminals. With the recent advances of the Web, a.k.a. Web 2.0, it has become a rich source of data which provides a variety of information sources. In this article, we propose an integrated framework that combines a variety of available components and makes use of different sources of information provided on the Web to get a better knowledge about criminals or terrorists (we will use criminals to cover all terrorists in the rest of this article). Our system extracts criminals’ information and their corresponding network using Web sources, such as online newspapers, official reports, and social media. It uses text analysis to identify key persons and topics from crawled Web documents. We build a criminal graph from the analysed text based on the co-occurrence of mentioning of criminals. Further analysis is applied on the constructed graph to get key people, hidden relationships and interactions between criminals, as well as hierarchical criminal groups within a network. For every process in the framework, we analysed various available works and implementations that could be used in the process. While analysing social media posts, we identified several challenges which show what solutions could be used for that purpose. Finally, we provide a Web application which implements the proposed framework. It also shows how helpful and efficient the system is in extracting and analysing criminal information.
      Citation: Journal of Information Science
      PubDate: 2019-11-28T08:57:59Z
      DOI: 10.1177/0165551519888606
  • What makes a tweet be retweeted? A Bayesian trigram analysis of tweet
           propagation during the 2015 Colombian political campaign
    • Authors: Roberto Casarin, Juan C Correa, Jorge E Camargo, Silvana Dakduk, Enrique ter Horst, German Molina
      Abstract: Journal of Information Science, Ahead of Print.
      This article proposes the use of a computationally efficient inverse regression Bayesian method for the analysis of tweet propagation of political messages. Our example focuses on the Colombian case, though our method can be used in any election where social media messaging has a direct impact on political outcomes. We find strong evidence that politicians were able to identify the combination of sensitive words that enhances the probability of a message being retweeted, which, in turn, had an impact on political outcomes. The contributions of our work entail: (a) an examination of a neglected unit of analysis (the trigram) in a less studied language (i.e. Spanish), (b) an innovative, efficient Bayesian approach and (c) the exploitation of the predictive power that retweets have on electoral results as an informational diffusion tool in social media. A practical implication of this new methodology is the possibility of adjusting political messages as a means to increase voters' engagement in political campaigns.
      Citation: Journal of Information Science
      PubDate: 2019-11-20T08:05:20Z
      DOI: 10.1177/0165551519886056
  • Digital identity and the online self: Footprint strategies – An
           exploratory and comparative research study
    • Authors: Katalin Feher
      Abstract: Journal of Information Science, Ahead of Print.
      Reflecting on the thousands of diverse research studies of social media representation and digital privacy, this article presents a comprehensive summary of online personal strategies. First, the evolution of academic concepts about digital identity and the online self is summarised. Then, the article investigates the key dynamics of personal strategies and control issues in detail with ideas, experiences, stories and metaphors taken from 60 qualitative interviews from Central and Eastern Europe and Southeast Asia. According to the key findings of this article, the universal patterns of online personal strategies follow mostly conscious decisions, resulting in users maintaining 70% control of their digital footprints. However, the remaining 30% of online activities are unconscious floating with digital dynamics and resulting in a wide range of non-expected consequences from identity theft to kidnapping. In summary, an intercultural and intergenerational model highlights the complexity and diversity of the studied field, providing a reference framework for future studies. The closing section presents a discussion of those findings of this study that are inconsistent with commonplace assumptions and conclusions present in the academic literature, promoting for study those subjects that still need to be extended or explored.
      Citation: Journal of Information Science
      PubDate: 2019-10-17T11:07:04Z
      DOI: 10.1177/0165551519879702
  • A deep learning-based quality assessment model of collaboratively edited
           documents: A case study of Wikipedia
    • Authors: Ping Wang, Xiaodan Li, Renli Wu
      Abstract: Journal of Information Science, Ahead of Print.
      Wikipedia is becoming increasingly critical in helping people obtain information and knowledge. Its leading advantage is that users can not only access information but also modify it. However, this presents a challenging issue: how can we measure the quality of a Wikipedia article? The existing approaches assess Wikipedia quality by statistical models or traditional machine learning algorithms. However, their performance is not satisfactory. Moreover, most existing models fail to extract complete information from articles, which degrades the model’s performance. In this article, we first survey related works and summarise a comprehensive feature framework. Then, state-of-the-art deep learning models are introduced and applied to assess Wikipedia quality. Finally, a comparison among deep learning models and traditional machine learning models is conducted to validate the effectiveness of the proposed model. The models are compared extensively in terms of their training and classification performance. Moreover, the importance of each feature and the importance of different feature sets are analysed separately.
      Citation: Journal of Information Science
      PubDate: 2019-09-30T02:57:05Z
      DOI: 10.1177/0165551519877646
  • Development of a classification system for Mathematical Logic
    • Authors: Antonio Sarasa Cabezuelo
      Abstract: Journal of Information Science, Ahead of Print.
      The number of digital resources that exist in repositories and on the Internet in general is enormous. Recovering resources that fit with the user’s specific needs poses a problem. To solve this problem, metainformation is added to the resources. One type of metainformation is the classification of a resource using a classification system that is widely recognised and agreed upon by its users. In this way, each resource is assigned a precise place within the classification system, thus facilitating its location. This article proposes a taxonomy for the classification of the resources (notes, exercises, exams or programmes that could be stored within a digital repository) that are generated within the scope of a Mathematical Logic course for a computer science degree programme. It also describes how to represent the proposed taxonomy using the IMS-VDEX standard and how to integrate it in the LOM and Dublin Core metadata specifications and proposes a set of controlled vocabularies that make it possible to refine the taxonomic metainformation.
      Citation: Journal of Information Science
      PubDate: 2019-09-24T03:12:22Z
      DOI: 10.1177/0165551519877644
  • A bibliometric analysis of topic modelling studies (2000–2017)
    • Authors: Xin Li, Lei Lei
      Abstract: Journal of Information Science, Ahead of Print.
      Topic modelling is a powerful text mining tool that has been applied in many fields such as software engineering, political and linguistic sciences. To evaluate the development of topic modelling studies, the present study reports a bibliometric analysis of SCIE, SSCI and A&HCI listed articles published from 2000 to 2017. Bibliometric indices for productive authors, countries and institutions are analysed. In addition, thematic changes concerning topic modelling are also examined. Results show that China plays a leading role in this field. Topic modelling has established itself as an important technique in not only natural and formal sciences but also social sciences. LDA, social networks and text analysis are the topics with increasing popularity, while certain models (e.g. pLSA) and applications (e.g. topic detection) are declining in popularity. The findings could help researchers optimise research topic choices, seek collaboration with appropriate partners and stay up-to-date with the development of the field.
      Citation: Journal of Information Science
      PubDate: 2019-09-20T08:41:40Z
      DOI: 10.1177/0165551519877049
  • Deep learning in Arabic sentiment analysis: An overview
    • Authors: Amal Alharbi, Mounira Taileb, Manal Kalkatawi
      Abstract: Journal of Information Science, Ahead of Print.
      Sentiment analysis became a very motivating area in both academic and industrial fields due to the exponential increase of the online published reviews and recommendations. To solve the problem of analysing and classifying those reviews and recommendations, several techniques have been proposed. Lately, deep neural networks showed promising outcomes in sentiment analysis. The growing number of Arab users on the Internet along with the increasing amount of published Arabic reviews and comments encouraged researchers to apply deep learning to analyse them. This article is a comprehensive overview of research works that utilised the deep learning approach for Arabic sentiment analysis.
      Citation: Journal of Information Science
      PubDate: 2019-09-18T04:59:37Z
      DOI: 10.1177/0165551519865488
  • Clickbait detection using multiple categorisation techniques
    • Authors: Abinash Pujahari, Dilip Singh Sisodia
      Abstract: Journal of Information Science, Ahead of Print.
      Clickbaits are online articles with deliberately misleading titles designed to lure more readers into opening the intended web page. Clickbaits are used to tempt visitors to click on a particular link, either to monetise the landing page or to spread false news for sensationalisation. The presence of clickbaits on any news aggregator portal may lead to an unpleasant experience for readers. Automatic detection of clickbait headlines among news headlines has been a challenging issue for the machine learning community. Many methods have been proposed for preventing clickbait articles in the recent past. However, the recent techniques available for detecting clickbaits are not very robust. This article proposes a hybrid categorisation technique for separating clickbait and non-clickbait articles by integrating different features, sentence structure and clustering. During preliminary categorisation, the headlines are separated using 11 features. After that, the headlines are recategorised using sentence formality and syntactic similarity measures. In the last phase, the headlines are again recategorised by applying clustering using word vector similarity based on the t-distributed stochastic neighbour embedding (t-SNE) approach. After categorisation of these headlines, machine learning models are applied to the dataset to evaluate machine learning algorithms. The obtained experimental results indicate that the proposed hybrid model is more robust, reliable and efficient than any individual categorisation technique for the dataset we have used.
      Citation: Journal of Information Science
      PubDate: 2019-09-16T10:54:18Z
      DOI: 10.1177/0165551519871822
  • An efficient attribute reduction algorithm using MapReduce
    • Authors: Linzi Yin, Jing Li, Zhaohui Jiang, Jiafeng Ding, Xuemei Xu
      Abstract: Journal of Information Science, Ahead of Print.
      Classical attribute reduction algorithms based on attribute significance initiate too many jobs (O(|C|^2)) when they run in MapReduce. To improve the efficiencies of these algorithms, we proposed a novel reduction algorithm. Instead of focusing on attribute significance, the notion of a core attribute was applied to construct a new heuristic reduction algorithm, and only |C| jobs were considered to obtain a reduct. The algorithm only included two basic operations: compare and sort. The latter was optimised using the shuffle mechanism in MapReduce, which provided an efficient sorting ability for big data. In particular, we connected jobs in an iterative form to transfer the processing result of the former job to the latter job. Finally, experimental results demonstrated that the proposed attribute reduction algorithm was efficient and significantly improved upon the classical algorithms in runtime and number of jobs.
      Citation: Journal of Information Science
      PubDate: 2019-09-11T04:18:15Z
      DOI: 10.1177/0165551519874617
  • Loops in publication citation networks
    • Authors: Yi Bu, Yong Huang, Wei Lu
      Abstract: Journal of Information Science, Ahead of Print.
      Traditionally, publication citation networks are regarded as acyclic, that is, no loops in the network as an earlier published article cannot cite a later published article. However, due to the accessibility of pre-print versions of articles, there might be some loops in a publication citation network. This article presents a descriptive statistic on loops in publication citation networks of computer science and physics by employing a network-based indicator, namely, strongly connected component (SCC). By employing computer science and physics disciplines publications from the Web of Science database as examples, this article examines the count of loops, how the count changes over time and how the count relates to the published year difference between publications within the loop in the citation network. Some common structural patterns are also extracted and analysed; we observe that the two disciplines share the most frequent patterns though there exist some minor differences. Moreover, we find that self-citations in terms of authors, authors’ institutions and journals contribute to the formation of loops in publication citation networks.
      Citation: Journal of Information Science
      PubDate: 2019-09-06T07:56:40Z
      DOI: 10.1177/0165551519871826
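
      The abstract above identifies loops through strongly connected components (SCCs). As a minimal illustrative sketch, and not the authors' implementation, the following Python code finds SCCs with Kosaraju's two-pass algorithm and treats every component containing more than one publication as a loop. The function names, the edge format (citing, cited) and the sample identifiers are assumptions made for illustration only.

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Return the SCCs of a directed graph given as (citing, cited) pairs.

    Uses Kosaraju's algorithm with explicit stacks, so deep citation
    chains do not hit Python's recursion limit.
    """
    graph = defaultdict(list)
    reverse = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        reverse[dst].append(src)
        nodes.update((src, dst))

    # Pass 1: record nodes in order of DFS completion on the original graph.
    visited = set()
    order = []
    for start in sorted(nodes):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # All neighbours explored: the node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: traverse the reversed graph in decreasing finishing order.
    assigned = set()
    components = []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component = []
        todo = [start]
        while todo:
            node = todo.pop()
            component.append(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    todo.append(prev)
        components.append(sorted(component))
    return components


def citation_loops(edges):
    """Keep only SCCs with at least two publications (hypothetical loop definition)."""
    return [c for c in strongly_connected_components(edges) if len(c) > 1]


# Hypothetical example: A and B cite each other (e.g. via pre-prints).
sample_edges = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "D")]
loops = citation_loops(sample_edges)
```

      In this sketch a publication that cites only itself forms a single-node component and is not reported as a loop; whether such cases should count is a design choice the abstract does not settle.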
  • Using dates as contextual information for personalised cultural heritage
    • Authors: Ahmed Dahroug, Andreas Vlachidis, Antonios Liapis, Antonis Bikakis, Martín López-Nores, Owen Sacco, José Juan Pazos-Arias
      Abstract: Journal of Information Science, Ahead of Print.
      We present semantics-based mechanisms that aim to promote reflection on cultural heritage by means of dates (historical events or annual commemorations), owing to their connections to a collection of items and to the visitors’ interests. We argue that links to specific dates can trigger curiosity, increase retention and guide visitors around the venue following new appealing narratives in subsequent visits. The proposal has been evaluated in a pilot study on the collection of the Archaeological Museum of Tripoli (Greece), for which a team of humanities experts wrote a set of diverse narratives about the exhibits. A year-round calendar was crafted so that certain narratives would be more or less relevant on any given day. Expanding on this calendar, personalised recommendations can be made by sorting out those relevant narratives according to personal events and interests recorded in the profiles of the target users. Evaluation of the associations by experts and potential museum visitors shows that the proposed approach can discover meaningful connections, while many others that are more incidental can still contribute to the intended cognitive phenomena.
      Citation: Journal of Information Science
      PubDate: 2019-09-06T07:56:20Z
      DOI: 10.1177/0165551519871823
  • DelibAnalysis: Understanding the quality of online political discourse
           with machine learning
    • Authors: Eleonore Fournier-Tombs, Giovanna Di Marzo Serugendo
      Abstract: Journal of Information Science, Ahead of Print.
      This article proposes an automated methodology for the analysis of online political discourse. Drawing from the discourse quality index (DQI) by Steenbergen et al., it applies a machine learning–based quantitative approach to measuring the discourse quality of political discussions online. The DelibAnalysis framework aims to provide an accessible, replicable methodology for the measurement of discourse quality that is both platform and language agnostic. The framework uses a simplified version of the DQI to train a classifier, which can then be used to predict the discourse quality of any non-coded comment in a given political discussion online. The objective of this research is to provide a systematic framework for the automated discourse quality analysis of large datasets and, in applying this framework, to yield insight into the structure and features of political discussions online.
      Citation: Journal of Information Science
      PubDate: 2019-09-04T07:17:11Z
      DOI: 10.1177/0165551519871828
  • Low-cost similarity calculation on ontology fusion in knowledge bases
    • Authors: Wen Lou, Ruofan Pi, Hui Wang, Yuan Ju
      Abstract: Journal of Information Science, Ahead of Print.
      Ontology fusion in knowledge bases has become more difficult, due to the massive capacity involved in the process of semantic similarity calculation. Many similarity calculation methods have been developed, although they are hardly unified. This article contributes a low-cost similarity calculation method for ontology fusion, inspired by binary metrics, with the aim of reducing the size of similarity calculations both spatially and logically. By introducing the definitions of a heterogeneous ontology, entities of ontologies and rules of ontology fusion on the basis of concept fusion and relationship fusion, we put forward the main traversal algorithm, which is calculated to have the lowest cost in time and space in comparison with traditional methods. We conducted three experiments to test the usability of our approach from the perspective of actual library resources, small datasets and large datasets. In Experiment 1, the bibliographic data from East China Normal University Library were used to show the feasibility and capability of our proposal and present the process of the algorithm. In both Experiments 2 and 3, our approach had at least 88% confidence in detecting accurate merging mappings and also decreased time cost. The test demonstrated a good fusion result. The problem of lower recalls caused by error analysis results from the conflict between the complex structures in ontologies and the recursive functions, which will be improved in the future.
      Citation: Journal of Information Science
      PubDate: 2019-09-04T06:22:16Z
      DOI: 10.1177/0165551519870456
  • Change point detection in social networks using a multivariate
           exponentially weighted moving average chart
    • Authors: Ali Salmasnia, Mohammadreza Mohabbati, Mohammadreza Namdar
      Abstract: Journal of Information Science, Ahead of Print.
      Although the significant role of social networks in communications between individuals has attracted researchers’ attention, only a few authors have investigated social network monitoring in their studies. Most of the existing studies in this context suffer from the following three main drawbacks: (1) using case-based network attributes such as personal experience and departments instead of the main attributes such as network density and centrality attributes, (2) monitoring the social attributes separately with the assumption that they are independent of each other and (3) ignoring detection of the actual time of change in the network. To overcome the above-mentioned disadvantages, this research develops a statistical method for monitoring the connections among actors in social networks with the four most important network attributes, consisting of (1) network density, (2) degree centrality, (3) betweenness centrality and (4) closeness centrality. To this end, a multivariate exponentially weighted moving average (MEWMA) control chart is used for simultaneous monitoring of these four correlated attributes. Furthermore, since the control chart usually does not signal at the exact time of change due to type II error, this study presents a change point detection method to reduce the cost and time required for diagnosing the control chart signal. Finally, the efficiency of the proposed approach in comparison with the existing methods is evaluated through a simulation procedure. The results indicate that the suggested method has better performance than the univariate approach in detecting the change point.
      Citation: Journal of Information Science
      PubDate: 2019-08-30T07:28:47Z
      DOI: 10.1177/0165551519863351
  • Corrigendum to Spam Profiles Detection on Social Networks Using
           Computational Intelligence Methods: The Effect of The Lingual Context
    • Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-08-30T05:28:49Z
      DOI: 10.1177/0165551519876373
  • Mapping the social landscape through social media
    • Authors: Girish Chandra Joshi, Mayuri Paul, Bhrigu Kumar Kalita, Vikram Ranga, Jiwan Singh Rawat, Pinkesh Singh Rawat
      Abstract: Journal of Information Science, Ahead of Print.
      As part of the global village, every place has established connections through the strength and power of social media, cutting across political boundaries. Social media is a digital platform where people across the world can interact. Compared with direct interaction, it has a number of advantages: it is universal, anonymous, easily accessible and indirect, and it allows information to be gathered and shared. The easy access to social networking sites (SNSs) such as Facebook, Twitter and blogs has brought about unprecedented opportunities for citizens to voice their opinions loaded with emotions/sentiments. Furthermore, social media can influence human thoughts. A recent incident of public importance presented an opportunity to map the sentiments surrounding it. Sentiments were extracted from tweets for a week. These sentiments were classified as positive, negative and neutral and were mapped in a geographic information system (GIS) environment. It was found that the number of tweets diminished by 91% over a week from 25 August 2017 to 31 August 2017. Most tweets emerged from places near the origin of the case (Haryana, Delhi and Punjab). The trend of sentiments was found to be neutral (47.4%), negative (30%) and positive (22.6%). Interestingly, tweets were also coming from unexpected places such as the United States, the United Kingdom and West Asia. The result can also be used to assess the spatial distribution of digital penetration in India. The highest concentration was found around metropolitan cities, that is, Mumbai and Delhi, and the lowest in North East India and Jammu & Kashmir, indicating the penetration of SNSs.
      Citation: Journal of Information Science
      PubDate: 2019-08-13T07:30:43Z
      DOI: 10.1177/0165551519865487
  • Evaluating the performance of government websites: An automatic assessment
           system based on the TFN-AHP methodology
    • Authors: Xudong Cai, Shengli Li, Gengzhong Feng
      Abstract: Journal of Information Science, Ahead of Print.
      Government websites are currently important for providing information and services to citizens. It is a crucial task to evaluate the performance of each government website. The traditional evaluation processes based on experts are criticised as being subjective and cannot work in real time. This article proposes a framework and automatic assessment system for evaluating the performance of government websites in real time. To test the proposed framework and the automatic assessment system, we evaluate and classify 70 websites from Shaanxi Province of China. This article provides guidance for government agencies, managers of government websites and researchers.
      Citation: Journal of Information Science
      PubDate: 2019-08-12T09:17:43Z
      DOI: 10.1177/0165551519866548
  • Exploring the dominant features of social media for depression detection
    • Authors: Jamil Hussain, Fahad Ahmed Satti, Muhammad Afzal, Wajahat Ali Khan, Hafiz Syed Muhammad Bilal, Muhammad Zaki Ansaar, Hafiz Farooq Ahmad, Taeho Hur, Jaehun Bang, Jee-In Kim, Gwang Hoon Park, Hyonwoo Seung, Sungyoung Lee
      Abstract: Journal of Information Science, Ahead of Print.
      Recently, social media have been used by researchers to detect depressive symptoms in individuals using linguistic data from users’ posts. In this study, we propose a framework to identify social information as a significant predictor of depression. Using the proposed framework, we develop an application called the Socially Mediated Patient Portal (SMPP), which detects depression-related markers in Facebook users by applying a data-driven approach with machine learning classification techniques. We examined a data set of 4350 users who were evaluated for depression using the Center for Epidemiological Studies Depression (CES-D) scale. From this analysis, we identified a set of features that can distinguish between individuals with and without depression. Finally, we identified the dominant features that adequately assess individuals with and without depression on social media. The model trained on these features will be helpful to physicians in diagnosing mental diseases and psychiatrists in analysing patient behaviour.
      Citation: Journal of Information Science
      PubDate: 2019-08-12T09:13:23Z
      DOI: 10.1177/0165551519860469
  • An overview of systematic literature reviews in social media marketing
    • Authors: Jennifer Rowley, Brendan Keegan
      Abstract: Journal of Information Science, Ahead of Print.
      Systematic literature reviews (SLRs) adopt a specified and transparent approach in order to scope the literature in a field or sub-field. However, there has been little critical comment on their purpose and processes in practice. By undertaking an overview of SLRs in the field of social media (SM) marketing, this article undertakes a critical evaluation of the SLR purposes and processes in a set of recent SLRs and presents a future research agenda for social media marketing. The overview shows that the purposes of SLRs include the following: making sense (of research in a field), developing a concept matrix/taxonomy and supporting research and practice. On SLR processes, while there is some consensus on the stages of the process, there is considerable variation in how these processes are executed. This article offers a resource to inform practice and acts as a platform for further critical debate regarding the nature and value of SLRs.
      Citation: Journal of Information Science
      PubDate: 2019-08-12T09:00:39Z
      DOI: 10.1177/0165551519866544
  • An intrinsic evaluation of the Waterloo spam rankings of the ClueWeb09 and
           ClueWeb12 datasets
    • Authors: İbrahim Barış Yılmazel, Ahmet Arslan
      Abstract: Journal of Information Science, Ahead of Print.
      The ClueWeb09 dataset and its successor, the ClueWeb12 dataset, are two of the largest collections of Web pages released by the Text REtrieval Conference (TREC). The ClueWeb datasets were used in various TREC tracks run from 2009 to 2017. Each year, approximately 50 new queries are released and a pool of Web pages is judged against these queries by human assessors as relevant, non-relevant or spam. In this article, a ground truth for binary classification (spam vs non-spam) is constructed from Web pages that are judged as spam or relevant, under the assumption that a Web page judged as relevant for any query cannot be spam. Based on this ground truth, we evaluate the classification performance of the Waterloo spam rankings (Fusion, Britney, GroupX and UK2006), which have been traditionally used to identify and filter spam pages in retrieval systems. The experimental results in terms of the universal binary classification evaluation measures suggest that Fusion (with threshold = 11%) is the best for the ClueWeb09 dataset. Analysis of the frequency distributions of relevant/spam documents over spam scores reveals that GroupX is the most powerful at identifying relevant documents, whereas Fusion is the most powerful at identifying spam documents. It is also confirmed that the Fusion spam ranking is less effective on the ClueWeb12 dataset than on ClueWeb09.
      Citation: Journal of Information Science
      PubDate: 2019-08-08T12:13:36Z
      DOI: 10.1177/0165551519866551
  • Spam profiles detection on social networks using computational
           intelligence methods: The effect of the lingual context
    • Authors: Ala’ M Al-Zoubi, Hossam Faris, Ja’far Alqatawna, Mohammad A Hassonah
      Abstract: Journal of Information Science, Ahead of Print.
      In online social networks, spam profiles represent one of the most serious security threats over the Internet; if they do not stop producing bad advertisements, they can be exploited by criminals for various purposes. This article addresses the nature and the characteristics of spam profiles in a social network like Twitter to improve spam detection, based on a number of publicly available language-independent features. In order to investigate the effectiveness of these features in spam detection, four datasets are extracted for four different language contexts (i.e. Arabic, English, Korean and Spanish), and a fifth is formed by combining them all. We conduct our experiments using a set of five well-known classification algorithms in spam detection field, k-Nearest Neighbours (k-NN), Random Forest (RF), Naive Bayes (NB), Decision Tree (DT) (J48) and Multilayer Perceptron (MLP) classifiers, along with five filter-based feature selection methods, namely, Information Gain, Chi-square, ReliefF, Correlation and Significance. The results show oscillating performance of each classifier across all datasets, but improved classification results with feature selection. In addition, detailed analysis and comparisons are carried out on two different levels: in the first level, we compare the selected features’ importance among the feature selection methods, whereas in the second level, we observe the relations and the importance of the selected features across all datasets. The findings of this article lead to a better understanding of social spam and improving detection methods by considering the various important features resulting from the different lingual contexts.
      Citation: Journal of Information Science
      PubDate: 2019-08-07T12:05:26Z
      DOI: 10.1177/0165551519861599
  • An investigation of cultural objects in conflict zones through the lens of
           TripAdvisor reviews: A case of South Caucasus
    • Authors: Lala Hajibayova
      Abstract: Journal of Information Science, Ahead of Print.
      This study is an investigation of how cultural sites and objects in the former conflict zones of South Caucasus are constructed in user-generated narratives in TripAdvisor reviews and images. An analysis of these reviews and images was found to demonstrate the embodied orientation of reviewers’ narrations, wherein the disputed nature of the cultural sites is mainly voiced in the form of dissatisfaction with the socio-economical situation and services. This study suggests that the forgotten nature of frozen conflicts engendered an erosion of and disconnect from cultural heritage, ties and significance for those who fled the contested areas.
      Citation: Journal of Information Science
      PubDate: 2019-08-05T10:18:36Z
      DOI: 10.1177/0165551519867545
  • Chemistry research in Europe: A publication analysis (2006–2016)
    • Authors: Hakan Kaygusuz
      Abstract: Journal of Information Science, Ahead of Print.
      In this article, chemistry research in 51 different European countries between years 2006 and 2016 was studied using statistical methods. This study consists of two parts: In the first part, different economical, institutional and citation parameters were correlated with the number of publications, citations and chemical industry numbers using principal components analysis and hierarchical cluster analysis. The results of the first part indicated that economical and geographical parameters directly affect the chemistry research outcome. In the second part, research in branches of chemistry and related disciplines such as analytical chemistry, polymer science and physical chemistry were analysed using principal components analysis and hierarchical cluster analysis for each country. Publication data were collected as the number of chemistry publications (in Science Citation Index–Expanded (SCI-E)) between years 2006 and 2016 in different chemistry subdisciplines and related scientific areas. Results of the second part of the study produced geographical and economical clusters of countries, interestingly, without addition of any geographical data.
      Citation: Journal of Information Science
      PubDate: 2019-07-29T09:31:41Z
      DOI: 10.1177/0165551519865491
  • Contextual weighting approach to compute term weight in layered vector
           space model
    • Authors: Jayant Gadge, Sunil Bhirud
      Abstract: Journal of Information Science, Ahead of Print.
      The World Wide Web (WWW) is the largest available repository of information. This huge amount of information put forward the challenges of retrieval of trustworthy information from WWW. It defies researchers with new issues of diversity and complexity while retrieving the web information. Information retrieval from the web demands approaches that span beyond conventional information retrieval. Heterogeneity, complexity and the huge volume of web information requires a unique approach to retrieve information. Besides, end-users introduce some difficulties in the retrieval process. Sometimes queries submitted by the user are subtle and ambiguous. The primary concern in information retrieval is the issue of predicting the relevance of documents. In this article, a new approach is proposed that rationally separates web document into five layers, namely, title, header, hyperlink, meta tag and body layer. The proposed method effectively combines the textual information and structural evidence of web document for retrieving information from Web. In the proposed layered vector space model, each layer has an allocated priority which is used to compute weight factor for these layers. The proposed method deduces equation that effectively combines priority of the layer and length of the layer to calculate the weight of the layer.
      Citation: Journal of Information Science
      PubDate: 2019-07-29T09:31:10Z
      DOI: 10.1177/0165551519860043
  • Knowledge discovery using SPARQL property path: The case of disease data
    • Authors: Enayat Rajabi, Salvador Sanchez-Alonso
      Abstract: Journal of Information Science, Ahead of Print.
      The Semantic Web allows knowledge discovery on graph-based data sets and facilitates answering complex queries that are extremely difficult to achieve using traditional database approaches. Intuitively, the Semantic Web query language (SPARQL) has a ‘property path’ feature that enables knowledge discovery in a knowledgebase using its reasoning engine. In this article, we utilise the property path of SPARQL and the other Semantic Web technologies to answer sophisticated queries posed over a disease data set. To this aim, we transform data from a disease web portal to a graph-based data set by designing an ontology, present a template to define the queries and provide a set of conjunctive queries on the data set. We illustrate how the reasoning engine of ‘property path’ feature of SPARQL can retrieve the results from the designed knowledgebase. The results of this study were verified by two domain experts as well as authors’ manual exploration on the disease web portal.
      Citation: Journal of Information Science
      PubDate: 2019-07-22T10:44:14Z
      DOI: 10.1177/0165551519865495
  • Does open access citation advantage depend on paper topics?
    • Authors: Hajar Sotudeh
      Abstract: Journal of Information Science, Ahead of Print.
      Research topics vary in their citation potential. In a metric-wise scientific milieu, it would be probable that authors tend to select citation-attractive topics especially when choosing open access (OA) outlets that are more likely to attract citations. Applying a matched-pairs study design, this research aims to examine the role of research topics in the citation advantage of OA papers. Using a comparative citation analysis method, it investigates a sample of papers published in 47 Elsevier article processing charges (APC)-funded journals in different access models including non-open access (NOA), APC, Green and mixed Green-APC. The contents of the papers are analysed using natural language processing techniques at the title and abstract level and served as a basis to match the NOA papers to their peers in the OA models. The publication years and journals are controlled for in order to avoid their impacts on the citation numbers. According to the results, the OA citation advantage that is observed in the whole sample still holds even for the highly similar OA and NOA papers. This implies that the OA citation surplus is not an artefact of the OA and NOA papers’ differences in their topics and, therefore, in their citation potential. This leads to the conclusion that OA authors’ self-selectivity, if it exists at all, is not responsible for the OA citation advantage, at least as far as selection of topics with probably higher citation potentials is concerned.
      Citation: Journal of Information Science
      PubDate: 2019-07-22T10:17:05Z
      DOI: 10.1177/0165551519865489
  • A cohort study of how faculty in LIS schools perceive and engage with
           open-access publishing
    • Authors: Wilhelm Peekhaus
      Abstract: Journal of Information Science, Ahead of Print.
      This article presents results from a survey of faculty in North American Library and Information Studies (LIS) schools about their attitudes towards and experience with open-access publishing. As a follow-up to a similar survey conducted in 2013, the article also outlines the differences in beliefs about and engagement with open access that have occurred between 2013 and 2018. Although faculty in LIS schools are proponents of free access to research, journal publication choices remain informed by traditional considerations such as prestige and impact factor. Engagement with open access has increased significantly, while perceptions of open access have remained relatively stable between 2013 and 2018. Nonetheless, those faculty who have published in an open-access journal or are more knowledgeable about open access tend to be more convinced about the quality of open-access publications and less apprehensive about open-access publishing than those who have no publishing experience with open-access journals or who are less knowledgeable about various open-access modalities. Willingness to comply with gold open-access mandates has increased significantly since 2013.
      Citation: Journal of Information Science
      PubDate: 2019-07-22T07:36:55Z
      DOI: 10.1177/0165551519865481
  • Music-search behaviour on a social Q&A site: A cross-gender comparison
    • Authors: Shengli Deng, Anqi Zhao, Shaoxiong Fu, Yong Liu, Wenjie Fan, Yuting Jiang
      Abstract: Journal of Information Science, Ahead of Print.
      While there have been numerous studies of music-search behaviour, little is known about gendered aspects of how it is carried out on social question and answer sites. The article examines gender differences manifested on one such site with regard to (a) the motivations of the person posing the question, (b) intervening variables that influence music-search behaviour and (c) the formulation of the questions. Results from manual categorisation and other analysis of 17,380 music-relevant questions collected from the site show that males who asked questions did so more often, provided more answers and had more followers than female question-posters. Males tended to include music context information in questions asking for ready reference, whereas females often asked questions in a second-person pronoun aiming for promoting discussion. Such research results add to the current understanding of music-search behaviour and contribute new insights that can inform development of better music services/systems.
      Citation: Journal of Information Science
      PubDate: 2019-07-17T12:51:07Z
      DOI: 10.1177/0165551519861605
  • Real-time feedback query expansion technique for supporting scholarly
           search using citation network analysis
    • Authors: Shah Khalid, Shengli Wu, Aftab Alam, Irfan Ullah
      Abstract: Journal of Information Science, Ahead of Print.
      Scholars routinely search relevant papers to discover and put a new idea into proper context. Despite ongoing advances in scholarly retrieval technologies, locating relevant papers through keyword queries is still quite challenging due to the massive expansion in the size of the research paper repository. To tackle this problem, we propose a novel real-time feedback query expansion technique, which is a two-stage interactive scholarly search process. Upon receiving the initial search query, the retrieval system provides a ranked list of results. In the second stage, a user selects a few relevant papers, from which useful terms are extracted for query expansion. The newly expanded query is run against the index in real time to generate the final list of research papers. In both stages, citation analysis is involved in further improving the quality of the results. The novelty of the approach lies in the combined exploitation of query expansion and citation analysis that may bring the most relevant papers to the top of the search results list. The experimental results on the Association of Computational Linguistics (ACL) Anthology Network data set demonstrate that this technique is effective and robust for locating relevant papers regarding normalised discounted cumulative gain (nDCG), precision and recall rates than several state-of-the-art approaches.
      Citation: Journal of Information Science
      PubDate: 2019-07-17T01:01:10Z
      DOI: 10.1177/0165551519863346
  • Bucketed common vector scaling for authorship attribution in heterogeneous
           web collections: A scaling approach for authorship attribution
    • Authors: Hayri Volkan Agun, Ozgur Yilmazel
      Abstract: Journal of Information Science, Ahead of Print.
      Domain, genre and topic influences on author style adversely affect the performance of authorship attribution (AA) in multi-genre and multi-domain data sets. Although recent approaches to AA tasks focus on suggesting new feature sets and sampling techniques to improve the robustness of a classification system, they do not incorporate domain-specific properties to reduce the negative impact of irrelevant features on AA. This study presents a novel scaling approach, namely, bucketed common vector scaling, to efficiently reduce negative domain influence without reducing the dimensionality of existing features; therefore, this approach is easily transferable and applicable in a classification system. Classification performances on English-language competition data sets consisting of emails and articles and Turkish-language web documents consisting of blogs, articles and tweets indicate that our approach is very competitive to top-performing approaches in English competition data sets and is significantly improving the top classification performance in mixed-domain experiments on blogs, articles and tweets.
      Citation: Journal of Information Science
      PubDate: 2019-07-11T12:50:07Z
      DOI: 10.1177/0165551519863350
  • A hybrid recommender system for the mining of consumer preferences from
           their reviews
    • Authors: Li Chen Cheng, Ming-Chan Lin
      Abstract: Journal of Information Science, Ahead of Print.
      Product review sites are widespread on the Internet and are rapidly gaining in popularity among consumers. This already large volume of user-generated content is dramatically growing every day, making it hard for consumers to filter out the worthwhile information which appears on the various review sites. The recommendation system plays a significant role in solving the problem of information overload. This study proposes a framework which integrates a collaborative filtering approach and an opinion mining technique for movie recommendation. Within the proposed framework, sentiment analysis is first applied to the users’ reviews to detect consumer opinions about the movie they have watched and to explore the individual’s preference profile. Traditional recommendation models are overly dependent on preference ratings and often suffer from the problem of ‘data sparsity’. Experimental results obtained from real online reviews show that our proposed method is effective in dealing with insufficient data and is more accurate and efficient than existing traditional methods.
      Citation: Journal of Information Science
      PubDate: 2019-07-10T01:55:17Z
      DOI: 10.1177/0165551519849510
  • On classification of abstracts obtained from medical journals
    • Authors: Bekir Parlak, Alper Kürşat Uysal
      Abstract: Journal of Information Science, Ahead of Print.
      Classification of medical documents was mostly carried out on English data sets and these studies were performed on hospital records rather than academic texts. The main reasons behind this situation are the lack of publicly available data sets and the tasks being costly and time-consuming. As the first contribution of this study, two data sets including Turkish and English counterparts of the same abstracts published in Turkish medical journals were constructed. Turkish is one of the widely used agglutinative languages worldwide and English is a good example of non-agglutinative languages. While English abstracts were obtained automatically from MEDLINE database with a computer program, Turkish counterparts of these documents were collected manually from the Internet. As the second contribution of this study, an extensive comparison on classification of abstracts obtained from Turkish medical journals was made by using these two equivalent data sets. Features were extracted from text documents with three different approaches: unigram, bigram and hybrid. Hybrid approach includes a combination of unigram and bigram features. In the experiments, three different feature selection methods and seven different classifiers were utilised. According to the results on both data sets, classification performance of the English abstracts outperformed the Turkish counterparts. Maximum accuracies were obtained from the combination of unigram features, distinguishing feature selector (DFS) and multinomial naïve Bayes (MNB) classifier for both data sets. Unigram features were generally more efficient than bigram and hybrid features. However, analysis of top-10 features indicated that nearly half of the features were translations of each other for Turkish and English data sets.
      Citation: Journal of Information Science
      PubDate: 2019-07-09T01:19:03Z
      DOI: 10.1177/0165551519860982
  • Knowledge-sharing and collaborative behaviour: An empirical study on a
           Portuguese higher education institution
    • Authors: Marcello Chedid, Ana Caldeira, Helena Alvelos, Leonor Teixeira
      Abstract: Journal of Information Science, Ahead of Print.
      Collaboration has been considered a way to address the challenges of the 21st century, fostering the necessary innovation, growth and productivity for all parties involved. Several studies reveal that collaboration can be strongly influenced by knowledge sharing. The literature suggests that this topic is quite relevant and that there is an evident lack of empirical studies that properly investigate the relationship between knowledge-sharing and collaborative behaviour in Higher Education Institutions (HEIs). In this context, the purpose of this work is to examine whether knowledge-sharing intention has a positive relationship with collaborative behaviour among professors and researchers in a public Portuguese HEI, taking into account other constructs that can have effect on the knowledge-sharing intention. In order to reach this objective, a conceptual research model was developed based on the theory of reasoned action. The empirical study was conducted based on a questionnaire, and the data analysis was performed using partial least squares. The results indicate that intrinsic motivation and networking are the factors that positively affect the attitude towards knowledge sharing. Nevertheless, it is concluded that trust is the variable that more strongly affects the knowledge-sharing intention. Finally, the study identified that knowledge-sharing intention has a positive influence in collaborative behaviour. It is considered that this study can contribute to support institutions’ management in defining strategies and developing actions in order to promote an organisational culture based on knowledge management that significantly leads to knowledge-sharing and collaboration relationships.
      Citation: Journal of Information Science
      PubDate: 2019-07-03T12:51:50Z
      DOI: 10.1177/0165551519860464
  • WeChat knowledge service system of university library based on SoLoMo: A
           holistic design framework
    • Authors: Mang Chen, Wei Zhang
      Abstract: Journal of Information Science, Ahead of Print.
      In this study, we develop a WeChat knowledge service system (WKSS) in university library based on SoLoMo. The aim is to build a comprehensive, open, mobile and smart knowledge service environment. It can realise the interaction between the three users, library and knowledge, and promote the dissemination and sharing of knowledge. By referencing the Internet frontier concept SoLoMo, this study designs a new mobile smart service system, including the system architecture design, the content design and the data association design. Then, this study develops the system, including the running environment configuration, the development of workflow, the core module and the system implementation. This system enables the provision of accurate, specific and more personalised service to each user. It also includes a portable mobile terminal to increase the accuracy of context awareness and enhance user convenience. This study makes up for the shortcomings of the library and increases the functions of personalisation, mobility and intelligence. It extends the way of mobile service in libraries and provides readers with better library mobile services, which was liked by readers.
      Citation: Journal of Information Science
      PubDate: 2019-07-03T12:35:41Z
      DOI: 10.1177/0165551519860045
  • Using Bayesian networks with hidden variables for identifying trustworthy
           users in social networks
    • Authors: Xu Chen, Yuyu Yuan, Mehmet Ali Orgun
      Abstract: Journal of Information Science, Ahead of Print.
      The popularity and broad accessibility of online social networks (OSNs) have facilitated effective communication among people, but such networks also pose potential risks that should not be ignored. Interaction through OSNs is complex and can be unsafe, as individuals can be contacted by strangers at any time. This makes the notion of trust a crucial issue in the use of OSNs. However, compared with decision-making processes associated with whether to trust a stranger encountered in everyday life, this task is more difficult to address with regard to OSNs due to the lack of face-to-face communication and prior knowledge between people. In this article, trust evaluation is formalised as a classification problem. We demonstrate how user profiles and historical records can be organised into a logical structure based on Bayesian networks to recognise the trustworthy people without the need to build trust relationships in OSNs. This is possible when a more detailed description of features denoted by hidden variables is considered. We compare the performance of our method with those of six other machine learning methods using Facebook and Twitter datasets, and our results show that our method achieves higher values in accuracy, recall and F1 score.
      Citation: Journal of Information Science
      PubDate: 2019-07-02T10:41:41Z
      DOI: 10.1177/0165551519857590
  • Spatial information extraction from travel narratives: Analysing the
           notion of co-occurrence indicating closeness of tourist places
    • Authors: Erum Haris, Keng Hoon Gan, Tien-Ping Tan
      Abstract: Journal of Information Science, Ahead of Print.
      Recent advancements in social media have generated a myriad of unstructured geospatial data. Travel narratives are among the richest sources of such spatial clues. They are also a reflection of writers’ interaction with places. One of the prevalent ways to model this interaction is a points of interest (POIs) graph depicting popular POIs and routes. A relevant notion is that frequent pairwise occurrences of POIs indicate their geographic proximity. This work presents an empirical interpretation of this theory and constructs spatially enriched POI graphs, a clear augmentation to popularity-based POI graphs. A triplet pattern, rule-based spatial relation extraction technique SpatRE is proposed and compared with standard relation extraction systems Ollie and Stanford OpenIE. A travel blogs data set is also contributed containing labelled spatial relations. The performance is further evaluated on SemEval 2013 benchmark data sets. Finally, spatially enriched POI graphs are qualitatively compared with TripAdvisor and Google Maps to visualise information accuracy.
      Citation: Journal of Information Science
      PubDate: 2019-06-10T07:57:02Z
      DOI: 10.1177/0165551519837188
  • ASA: A framework for Arabic sentiment analysis
    • Authors: Ahmed Oussous, Fatima-Zahra Benjelloun, Ayoub Ait Lahcen, Samir Belfkih
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-05-21T01:20:49Z
      DOI: 10.1177/0165551519849516
  • Capture and visualisation of text understanding through semantic
           annotations and semantic networks for teaching and learning
    • Authors: Roberto Willrich, Adiel Mittmann, Renato Fileto, Alckmar Luiz dos Santos
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-05-21T01:10:55Z
      DOI: 10.1177/0165551519849514
  • Finding top performers through email patterns analysis
    • Authors: Qi Wen, Peter A Gloor, Andrea Fronzetti Colladon, Praful Tickoo, Tushar Joshi
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-05-20T02:11:01Z
      DOI: 10.1177/0165551519849519
  • A study on first citations of patents through a combination of
           Bradford’s distribution, Cox regression and life tables method
    • Authors: Mohammad Tavakolizadeh-Ravari, Faramarz Soheili, Fatemeh Makkizadeh, Fatemeh Akrami
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-05-08T02:27:10Z
      DOI: 10.1177/0165551519845848
  • University students’ mobile news consumption activities and
           evaluative/affective reactions to political news during election
           campaigns: A diary study
    • Authors: Rong Tang, Kyong Eun Oh
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-04-29T08:14:17Z
      DOI: 10.1177/0165551519845855
  • Mapping the efficiency of international scientific collaboration between
           cities worldwide
    • Authors: György Csomós, Balázs Lengyel
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-04-10T12:43:13Z
      DOI: 10.1177/0165551519842128
  • Understanding data search as a socio-technical practice
         This is an Open Access Article

    • Authors: Kathleen M Gregory, Helena Cousijn, Paul Groth, Andrea Scharnhorst, Sally Wyatt
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-04-02T11:00:00Z
      DOI: 10.1177/0165551519837182
  • Multimodal ensemble approach to identify and rank top-k influential nodes
           of scholarly literature using Twitter network
    • Authors: Bharat Tidke, Rupa Mehta, Jenish Dhanani
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-18T08:51:21Z
      DOI: 10.1177/0165551519837190
  • Exploring the characteristics of crowdsourcing: An online observational
    • Authors: Harpreet Bassi, Christopher J Lee, Laura Misener, Andrew M Johnson
      First page: 291
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-02-21T10:55:34Z
      DOI: 10.1177/0165551519828626
  • Twitter speaks: A case of national disaster situational awareness
    • Authors: Amir Karami, Vishal Shah, Reza Vaezi, Amit Bansal
      First page: 313
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-04T02:06:25Z
      DOI: 10.1177/0165551519828620
  • Decision tree classification: Ranking journals using IGIDI
    • Authors: Muhammad Shaheen, Tanveer Zafar, Sajid Ali Khan
      First page: 325
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-18T08:45:01Z
      DOI: 10.1177/0165551519837176
  • An improved evidence-based aggregation method for sentiment analysis
    • Authors: Parisa Jamadi Khiabani, Mohammad Ehsan Basiri, Hamid Rastegari
      First page: 340
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-18T08:39:07Z
      DOI: 10.1177/0165551519837187
  • Visual analysis of information world maps: An exploration of four methods
    • Authors: Devon Greyson, Heather O’Brien, Saguna Shankar
      First page: 361
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-22T01:48:47Z
      DOI: 10.1177/0165551519837174
  • A semantic web methodological framework to evaluate the support of
           integrity in thesaurus tools
    • Authors: M Mercedes Martínez-González, María-Luisa Alvite-Díez
      First page: 378
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-27T09:33:42Z
      DOI: 10.1177/0165551519837195
  • How to identify the roots of broad research topics and fields? The
           introduction of RPYS sampling using the example of climate change research

         This is an Open Access Article

    • Authors: Robin Haunschild, Werner Marx, Andreas Thor, Lutz Bornmann
      First page: 392
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-04-02T10:52:00Z
      DOI: 10.1177/0165551519837175
  • Concept-LDA: Incorporating Babelfy into LDA for aspect extraction
    • Authors: Ekin Ekinci, Sevinç İlhan Omurca
      First page: 406
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-04-29T08:05:17Z
      DOI: 10.1177/0165551519845854
  • A case study for block-based linked data generation: Recipes as jigsaw
    • Authors: Övünç Öztürk, Tuğba Özacar
      First page: 419
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-06-05T02:08:48Z
      DOI: 10.1177/0165551519849518