Journal of Information Science
Journal Prestige (SJR): 0.674
Citation Impact (citeScore): 2
Hybrid journal   * Containing 3 Open Access article(s) in this issue *
ISSN (Print) 0165-5515 - ISSN (Online) 1741-6485
Published by Sage Publications  [1085 journals]
  • Analysis and shortcomings of e-recruitment systems: Towards a
           semantics-based approach addressing knowledge incompleteness and limited
           domain coverage
    • Authors: Mohammed Maree, Aseel B Kmail, Mohammed Belkhatir
      Pages: 713 - 735
      Abstract: Journal of Information Science, Volume 45, Issue 6, Page 713-735, December 2019.
      The rapid development of the Internet has led to the introduction of new methods for e-recruitment and human resources management. These methods aim to systematically address the limitations of conventional recruitment procedures by incorporating natural language processing tools and semantics-based methods. In this context, for a given job post, applicant resumes (usually uploaded as free-text unstructured documents in different formats such as .pdf, .doc or .rtf) are matched/screened out using the conventional keyword-based model enriched by additional resources such as occupational categories and semantics-based techniques. Employing these techniques has proved to be effective in reducing the cost, time and effort required in traditional recruitment and candidate selection methods. However, bridging the skill gap - that is, the propensity to precisely detect and extract relevant skills in applicant resumes and job posts - and highlighting the hidden semantic dimensions encoded in applicant resumes are still challenging issues in the process of devising effective e-recruitment systems. This is because the resources exploited by current e-recruitment systems are obtained from generic domain-independent sources, resulting in knowledge incompleteness and a lack of domain coverage. In this article, we review state-of-the-art e-recruitment approaches and highlight recent advancements in this domain. An e-recruitment framework addressing current shortcomings through the use of multiple cooperative semantic resources, feature extraction techniques and skill relatedness measures is detailed. An instantiation of the framework is presented, and an experimental validation using a real-world recruitment dataset from two employment portals demonstrates the effectiveness of the proposed approach.
      Citation: Journal of Information Science
      PubDate: 2019-12-01T08:00:00Z
      DOI: 10.1177/0165551518811449
  • Integrating word status for joint detection of sentiment and aspect in

         This is an Open Access Article

    • Authors: Ayoub Bagheri
      Pages: 736 - 755
      Abstract: Journal of Information Science, Volume 45, Issue 6, Page 736-755, December 2019.
      A crucial task in sentiment analysis is aspect detection: the step of selecting the aspects on which opinions are expressed. This step precedes the step of determining whether the opinions on aspects are positive or negative. This article proposes a novel probabilistic generative topic model for aspect-based sentiment analysis which is able to discover the latent structure of a large collection of review documents. The proposed joint sentiment-aspect detection model (SAM) is a generative topic model that incorporates the structure of review sentences for detecting aspects and sentiments simultaneously. The intuitions behind SAM are to generate documents from latent single- and multi-word topics, to model the word distribution for each topic and to learn the prior distribution over topics in the sentences of documents. SAM introduces word status so that the model can decide when to sample from a bigram distribution or a unigram distribution and integrates all these components into one combined model for aspect-based sentiment analysis. We evaluate SAM both qualitatively and quantitatively to show that the model is indeed able to perform the task effectively and improves significantly over standard joint sentiment-aspect models. The proposed model can easily be transferred between domains or languages and can detect the polarity of text data at various levels. However, for the quantitative analysis, we mainly focus on presenting the results for the document-level sentiment classification.
      Citation: Journal of Information Science
      PubDate: 2019-12-01T08:00:00Z
      DOI: 10.1177/0165551518811458
  • A linked open data framework to enhance the discoverability and impact of
           culture heritage
    • Authors: Gustavo Candela, Pilar Escobar, Rafael C Carrasco, Manuel Marco-Such
      Pages: 756 - 766
      Abstract: Journal of Information Science, Volume 45, Issue 6, Page 756-766, December 2019.
      Cultural heritage institutions have recently begun to consider the benefits of sharing their collections using linked open data to disseminate and enrich their metadata. As datasets become very large, challenges appear, such as ingestion, management, querying and enrichment. Furthermore, each institution has particular features related to important aspects such as vocabularies and interoperability, which make it difficult to generalise this process and provide one-for-all solutions. In order to improve the user experience as regards information retrieval systems, researchers have identified that further refinements are required for the recognition and extraction of implicit relationships expressed in natural language. We introduce a framework for the enrichment and disambiguation of locations in text using open knowledge bases such as Wikidata and GeoNames. The framework has been successfully used to publish a dataset based on information from the Biblioteca Virtual Miguel de Cervantes, thus illustrating how semantic enrichment can help information retrieval. The methods applied in order to automate the enrichment process, which build upon open source software components, are described herein.
      Citation: Journal of Information Science
      PubDate: 2019-12-01T08:00:00Z
      DOI: 10.1177/0165551518812658
  • Data hiding technique in steganography for information security using
           number theory
    • Authors: Amjad Rehman, Tanzila Saba, Toqeer Mahmood, Zahid Mehmood, Mohsin Shah, Adeel Anjum
      Pages: 767 - 778
      Abstract: Journal of Information Science, Volume 45, Issue 6, Page 767-778, December 2019.
      In the current era, due to the widespread availability of the Internet, it is extremely easy for people to communicate and share multimedia contents with each other. However, at the same time, secure transfer of personal and copyrighted material has become a critical issue. Consequently, secure means of data transfer are the most urgent need of the time. Steganography is the science and art of protecting secret data from unauthorised access. Steganographic approaches conceal secret data in a cover file such as audio, video, text and/or image. The actual challenge in steganography is to achieve high robustness and capacity without compromising the imperceptibility of the cover file. In this article, an efficient steganography method is proposed for the transfer of secret data in digital images using number theory. For this purpose, the proposed method represents the cover image using the Fibonacci sequence. The representation of an image in the Fibonacci sequence allows increasing the number of bit planes from 8 to 12. The experimental results of the proposed method in comparison with other existing steganographic methods show that our method not only achieves high embedding of secret data but also gives high quality of stego images in terms of peak signal-to-noise ratio (PSNR). Furthermore, the robustness of the technique is also evaluated in the presence of salt and pepper noise attack on the cover images.
      Citation: Journal of Information Science
      PubDate: 2019-12-01T08:00:00Z
      DOI: 10.1177/0165551518816303
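      As an illustration of the Fibonacci representation mentioned in the abstract above, the following is a minimal sketch, assuming a greedy (Zeckendorf-style) decomposition over the 12 Fibonacci weights 1, 2, 3, ..., 233, which is enough to cover 8-bit pixel values 0 to 255. The function names and the greedy decomposition are illustrative assumptions, not the authors' implementation, and the embedding step itself is not shown.

```python
# Hypothetical sketch: 12 Fibonacci weights, one per bit plane (assumed, not from the paper).
FIB = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233]


def to_fibonacci_planes(pixel):
    """Decompose an 8-bit value into 12 bits, lowest Fibonacci weight first."""
    if not 0 <= pixel <= 255:
        raise ValueError("pixel must be in the range 0..255")
    bits = [0] * len(FIB)
    remaining = pixel
    # Greedy choice: always take the largest weight that still fits.
    for i in range(len(FIB) - 1, -1, -1):
        if FIB[i] <= remaining:
            bits[i] = 1
            remaining -= FIB[i]
    return bits


def from_fibonacci_planes(bits):
    """Rebuild the pixel value from its 12 Fibonacci bit planes."""
    return sum(weight for weight, bit in zip(FIB, bits) if bit)


if __name__ == "__main__":
    planes = to_fibonacci_planes(200)
    print(planes, from_fibonacci_planes(planes))
```

      With 11 weights (up to 144) the largest value representable without adjacent weights is 232, so a twelfth plane is needed for the full 8-bit range.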
  • Mining layered technological information in scientific papers: A
           semi-supervised method
    • Authors: Xiaoyu Wang, Yujia Zhai, Yuanhai Lin, Fang Wang
      Pages: 779 - 793
      Abstract: Journal of Information Science, Volume 45, Issue 6, Page 779-793, December 2019.
      Tech mining is the application of text mining tools to science and technology information resources. The ever-increasing volume of scientific outputs is a boon to technological innovation, but it also complicates efforts to obtain useful and concise information for problem solving. This challenge extends to tech mining, where the development of techniques compatible with big data is an urgent issue. This article introduces a semi-supervised method for extracting layered technological information from scientific papers in order to extend the reach of tech mining. Our method starts with several pre-set seed patterns used to extract candidate phrases by matching the dependency tree of each sentence. Then, after a series of judgements, phrases are divided into two categories: ‘main technique’ and ‘tech-component’. (A technique, for the purposes of this study, is a method or tool used in the article being analysed.) In order to generate new patterns for subsequent iterations, a weighted pattern learning method is also adopted. Finally, multiple iterations of the method are applied to extract technological information from each paper. A dataset from the field of optical switcher is used to verify the method’s effectiveness. Our findings are that (1) with two extraction loops in each iteration, our method achieves layered technological information extraction, which captures the ‘part–whole’ relationships between main techniques and tech-components; (2) the recall rate for main techniques is superior to the baseline after iterating 23 rounds; (3) when layering is disregarded, the new method achieves higher precision and extracts a larger volume of techniques than the baseline; and (4) adjusting another two parameters can optimise the efficiency – however, the effect is neither pronounced nor straightforward.
      Citation: Journal of Information Science
      PubDate: 2019-12-01T08:00:00Z
      DOI: 10.1177/0165551518816941
  • A meta-heuristic framework based on clustering and preprocessed datasets
           for solving the link prediction problem
    • Authors: Reham Shawqi Barham, Ahmad Sharieh, Azzam Sleit
      Pages: 794 - 817
      Abstract: Journal of Information Science, Volume 45, Issue 6, Page 794-817, December 2019.
      This study presents a solution to a problem commonly known as the link prediction problem. The link prediction problem concerns predicting the likelihood that a connection will appear between two nodes of a network that are not connected in the present state of the network. Finding a solution to the link prediction problem attracts a variety of computer science fields such as data mining and machine learning. This attraction is due to its importance for many applications such as social networks, bioinformatics and co-authorship networks. Towards solving this problem, the Evolutionary Link Prediction (EVO-LP) framework is proposed, presented, analysed and tested. EVO-LP is a framework that includes a dataset preprocessing approach and a meta-heuristic algorithm based on clustering for prediction. EVO-LP is divided into a preprocessing stage and a link prediction stage. Feature extraction, data under-sampling and feature selection are utilised in the preprocessing stage, while in the prediction stage, a meta-heuristic algorithm based on clustering is used as an optimiser. Experimental results on a number of real networks show that EVO-LP improves the prediction quality with low time complexity.
      Citation: Journal of Information Science
      PubDate: 2019-12-01T08:00:00Z
      DOI: 10.1177/0165551518816296
  • DIC-DOC-K-means: Dissimilarity-based Initial Centroid selection for
           DOCument clustering using K-means for improving the effectiveness of text
           document clustering
    • Authors: R Lakshmi, S Baskar
      Pages: 818 - 832
      Abstract: Journal of Information Science, Volume 45, Issue 6, Page 818-832, December 2019.
      In this article, a new initial centroid selection for a K-means document clustering algorithm, namely, Dissimilarity-based Initial Centroid selection for DOCument clustering using K-means (DIC-DOC-K-means), to improve the performance of text document clustering is proposed. The first centroid is the document having the minimum standard deviation of its term frequency. Each of the other subsequent centroids is selected based on the dissimilarities of the previously selected centroids. For comparing the performance of the proposed DIC-DOC-K-means algorithm, the results of the K-means, K-means++ and weighted average of terms-based initial centroid selection + K-means (Weight_Avg_Initials + K-means) clustering algorithms are considered. The results show that the proposed DIC-DOC-K-means algorithm performs significantly better than the K-means, K-means++ and Weight_Avg_Initials+ K-means clustering algorithms for Reuters-21578 and WebKB with respect to purity, entropy and F-measure for most of the cluster sizes. The cluster sizes used for Reuters-8 are 8, 16, 24 and 32 and those for WebKB are 4, 8, 12 and 16. The results of the proposed DIC-DOC-K-means give a better performance for the number of clusters that are equal to the number of classes in the data set.
      Citation: Journal of Information Science
      PubDate: 2019-12-01T08:00:00Z
      DOI: 10.1177/0165551518816302
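      The centroid selection described in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions: documents are plain term-frequency lists, dissimilarity is taken to be cosine dissimilarity, and each subsequent centroid is the document whose smallest dissimilarity to the already chosen centroids is largest. The abstract only says that later centroids depend on dissimilarities to earlier ones, so the measure, the selection rule and all names here are hypothetical.

```python
import math
from statistics import pstdev


def cosine_dissimilarity(a, b):
    """1 - cosine similarity of two term-frequency vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


def select_initial_centroids(tf_vectors, k):
    """Return the indices of k documents to use as initial centroids."""
    if not 1 <= k <= len(tf_vectors):
        raise ValueError("k must be between 1 and the number of documents")
    # First centroid: document with the minimum standard deviation of its term frequencies.
    first = min(range(len(tf_vectors)), key=lambda i: pstdev(tf_vectors[i]))
    chosen = [first]
    while len(chosen) < k:
        candidates = [i for i in range(len(tf_vectors)) if i not in chosen]
        # Assumed rule: pick the document farthest from its nearest chosen centroid.
        best = max(
            candidates,
            key=lambda i: min(
                cosine_dissimilarity(tf_vectors[i], tf_vectors[c]) for c in chosen
            ),
        )
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    docs = [[2, 2, 2, 2], [4, 1, 0, 0], [0, 0, 0, 5], [3, 3, 2, 2]]
    print(select_initial_centroids(docs, 3))
```

      The returned indices would then seed an ordinary K-means run in place of random initial centroids.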
  • A semantic-based video scene segmentation using a deep neural network
    • Authors: Hyesung Ji, Danial Hooshyar, Kuekyeng Kim, Heuiseok Lim
      Pages: 833 - 844
      Abstract: Journal of Information Science, Volume 45, Issue 6, Page 833-844, December 2019.
      Video scene segmentation is a very important research topic in the field of computer vision, because it helps in efficient storage, indexing and retrieval of videos. This kind of scene segmentation cannot be achieved by just calculating the similarity of low-level features presented in the video; high-level features should also be considered to achieve a better performance. Even though much research has been conducted on video scene segmentation, most of these studies failed to semantically segment a video into scenes. Thus, in this study, we propose a Deep-learning Semantic-based Scene-segmentation model (called DeepSSS) that considers image captioning to segment a video into scenes semantically. First, DeepSSS performs shot boundary detection by comparing colour histograms and then employs maximum-entropy-applied keyframe extraction. Second, for semantic analysis, it uses deep-learning-based image captioning to generate a semantic text description of the keyframes. Finally, by comparing and analysing the generated texts, it assembles the keyframes into a scene grouped under a semantic narrative. In this way, DeepSSS considers both low- and high-level features of videos to achieve a more meaningful scene segmentation. By applying DeepSSS to data sets from MS COCO for caption generation and evaluating its semantic scene-segmentation task results with the data sets from TRECVid 2016, we demonstrate quantitatively that DeepSSS outperforms other existing scene-segmentation methods using shot boundary detection and keyframes. In addition, the experiments compared scenes segmented by humans with scenes segmented by DeepSSS. The results verified that the segmentation produced by DeepSSS resembled that of humans. This is a new kind of result that was enabled by semantic analysis, which was impossible by just using low-level features of videos.
      Citation: Journal of Information Science
      PubDate: 2019-12-01T08:00:00Z
      DOI: 10.1177/0165551518819964
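      The first step named in the abstract above, shot boundary detection by comparing histograms of consecutive frames, can be sketched as follows. This is a simplified illustration, not the DeepSSS implementation: frames are assumed to be non-empty flat lists of 8-bit greyscale intensities rather than colour images, and the bin count, distance measure and threshold are hypothetical choices.

```python
def histogram(frame, bins=4, max_value=256):
    """Normalised intensity histogram of one frame (a non-empty flat list of 0..255 values)."""
    counts = [0] * bins
    for value in frame:
        counts[value * bins // max_value] += 1
    total = len(frame)
    return [count / total for count in counts]


def shot_boundaries(frames, threshold=0.5):
    """Return the indices of frames that start a new shot (assumed rule)."""
    if not frames:
        return []
    boundaries = []
    previous = histogram(frames[0])
    for index in range(1, len(frames)):
        current = histogram(frames[index])
        # Half the L1 distance between normalised histograms lies in [0, 1].
        distance = sum(abs(a - b) for a, b in zip(previous, current)) / 2
        if distance > threshold:
            boundaries.append(index)
        previous = current
    return boundaries


if __name__ == "__main__":
    dark = [10, 20, 30, 40]
    bright = [200, 210, 220, 230]
    print(shot_boundaries([dark, dark, bright, bright]))
```

      A colour version would compute one histogram per channel (or a joint histogram) and compare those in the same way.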
  • Karyon: A scalable and easy to integrate ontology summarisation framework
    • Authors: Tugba Ozacar, Ovunc Ozturk
      Abstract: Journal of Information Science, Ahead of Print.
      In the current Semantic Web Community, as the size and complexity of ontologies increase, ontology summarisation is becoming more important. There are many studies in the literature that use different approaches and metrics. However, many of these studies are not effective in terms of performance or have integration issues with current technologies. In this study, the popular ontology summarisation metrics are examined focusing on their performance in terms of time, and a number of metrics have been selected accordingly. To increase the accuracy of selections made with chosen metrics, we propose a novel metric: ‘name inclusion’. This metric promotes a concept if its name is subsumed by the name of another concept. As the existing summarisation applications have integration issues, we have implemented our summarisation framework to integrate easily with the latest web technologies. Therefore, the algorithm is implemented using Rust language, which performs well and easily integrates with other languages.
      Citation: Journal of Information Science
      PubDate: 2019-12-03T06:01:28Z
      DOI: 10.1177/0165551519887873
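      The 'name inclusion' idea in the abstract above (a concept is promoted if its name is subsumed by the name of another concept) can be sketched as below. This is a hedged illustration only: subsumption is assumed to mean case-insensitive substring containment, the score is assumed to be a simple count, and the function name is hypothetical. It is written in Python for consistency with the other sketches in this listing, whereas the framework itself is implemented in Rust.

```python
def name_inclusion_scores(concept_names):
    """Count, for each concept, how many other concept names contain its name.

    Assumes concept names are unique and that containment is a
    case-insensitive substring test (an illustrative choice).
    """
    lowered = [name.lower() for name in concept_names]
    scores = {}
    for i, name in enumerate(concept_names):
        scores[name] = sum(
            1
            for j, other in enumerate(lowered)
            if j != i and lowered[i] and lowered[i] in other
        )
    return scores


if __name__ == "__main__":
    print(name_inclusion_scores(["Wine", "RedWine", "WhiteWine", "Grape"]))
```

      In this example "Wine" is promoted because two more specific concept names contain it, while the other concepts receive a score of zero.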
  • Using microdata for international e-Government data exchange: The case of
           social security domain
    • Authors: Francisco Delgado, José R Hilera, Raul Ruggia, Salvador Otón, Héctor R Amado-Salvatierra
      Abstract: Journal of Information Science, Ahead of Print.
      Semantic interoperability issues of international e-Government data exchanges have not been solved up until now. In the case of social security institutions, the data exchange operations have some particularities that make the non-ambiguous definition of the core concepts used in the institutions key to the success and quality of system interconnections. In this article, we present the result of research to implement a new metadata specification based on Dublin Core elements for international social security exchanges, named Exchange Social Security Information Metadata (ESSIM). This proposal is based on a semantic approach using Linked Data for Interoperability, with technologies such as RDF(S), SPARQL, Microdata and JSON-LD, in order to ensure interoperability between social security institutions from different countries. This will help to strengthen the protection of the social security rights of mobile workers by automating the application of international agreements on social security and to improve cross-border communication between social security institutions of different countries. For the near future, the goal is to include this specification as part of the information and communication technology Guidelines under development by the International Social Security Association with the participation of authors of this article. This will facilitate a future adoption of the specification as an international standard.
      Citation: Journal of Information Science
      PubDate: 2019-12-03T06:01:27Z
      DOI: 10.1177/0165551519891361
  • A review of author name disambiguation techniques for the PubMed
           bibliographic database
    • Authors: Debarshi Kumar Sanyal, Plaban Kumar Bhowmick, Partha Pratim Das
      Abstract: Journal of Information Science, Ahead of Print.
      Author names in bibliographic databases often suffer from ambiguity owing to the same author appearing under different names and multiple authors possessing similar names. It creates difficulty in associating a scholarly work with the person who wrote it, thereby introducing inaccuracy in credit attribution, bibliometric analysis, search-by-author in a digital library and expert discovery. A plethora of techniques for disambiguation of author names has been proposed in the literature. In this article, we focus on the research efforts targeted to disambiguate author names specifically in the PubMed bibliographic database. We believe this concentrated review will be useful to the research community because it discusses techniques applied to a very large real database that is actively used worldwide. We make a comprehensive survey of the existing author name disambiguation (AND) approaches that have been applied to the PubMed database: we organise the approaches into a taxonomy; describe the major characteristics of each approach including its performance, strengths, and limitations; and perform a comparative analysis of them. We also identify the datasets from PubMed that are publicly available for researchers to evaluate AND algorithms. Finally, we outline a few directions for future work.
      Citation: Journal of Information Science
      PubDate: 2019-12-01T11:54:06Z
      DOI: 10.1177/0165551519888605
  • Integrated framework for criminal network extraction from Web
    • Authors: Salim Afra, Reda Alhajj
      Abstract: Journal of Information Science, Ahead of Print.
      Extracting criminals’ information and discovering their network are techniques that investigators often rely on to get extra information about criminal incidents and potential criminals. With the recent advances of the Web, a.k.a. Web 2.0, it has become a rich source of data which provides a variety of information sources. In this article, we propose an integrated framework that combines a variety of available components and makes use of different sources of information provided on the Web to gain better knowledge about criminals or terrorists (we will use criminals to cover all terrorists in the rest of this article). Our system extracts criminals’ information and their corresponding network using Web sources, such as online newspapers, official reports, and social media. It uses text analysis to identify key persons and topics from crawled Web documents. We build a criminal graph from the analysed text based on the co-occurrence of mentions of criminals. Further analysis is applied on the constructed graph to get key people, hidden relationships and interactions between criminals, as well as hierarchical criminal groups within a network. For every process in the framework, we analysed various available works and implementations that could be used in the process. While analysing social media posts, we identified several challenges and the solutions that could be used to address them. Finally, we provide a Web application which implements the proposed framework. It also shows how helpful and efficient the system is in extracting and analysing criminal information.
      Citation: Journal of Information Science
      PubDate: 2019-11-28T08:57:59Z
      DOI: 10.1177/0165551519888606
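      The graph construction step in the abstract above (building a graph from the co-occurrence of mentions) can be sketched as follows. This is a minimal illustration that assumes person names have already been extracted per document by an upstream text-analysis step. The edge weight is assumed to be the number of documents in which two names appear together, and all names in the example are placeholders.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(doc_mentions):
    """Build weighted undirected edges from per-document lists of mentioned names.

    Returns a Counter mapping (name_a, name_b) with name_a < name_b to the
    number of documents mentioning both (an assumed weighting scheme).
    """
    edges = Counter()
    for names in doc_mentions:
        # De-duplicate within a document and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(names)), 2):
            edges[(a, b)] += 1
    return edges


if __name__ == "__main__":
    mentions = [
        ["person_a", "person_b"],
        ["person_b", "person_a", "person_c"],
        ["person_c"],
    ]
    for pair, weight in sorted(build_cooccurrence_graph(mentions).items()):
        print(pair, weight)
```

      The resulting weighted edge list is the kind of input on which centrality or community detection could then be run to look for key people and groups.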
  • What makes a tweet be retweeted? A Bayesian trigram analysis of tweet
           propagation during the 2015 Colombian political campaign
    • Authors: Roberto Casarin, Juan C Correa, Jorge E Camargo, Silvana Dakduk, Enrique ter Horst, German Molina
      Abstract: Journal of Information Science, Ahead of Print.
      This article proposes the use of a computationally efficient inverse regression Bayesian method for the analysis of tweet propagation of political messages. Our example focuses on the Colombian case, though our method can be used in any election where social media messaging has a direct impact on political outcomes. We find strong evidence that politicians were able to identify the combination of sensitive words to enhance the probability of retweet of the message, which, in turn, had an impact on political outcomes. The contributions of our work entail: (a) an examination of a neglected unit of analysis (trigram) in a language less studied (i.e. Spanish), (b) based on an innovative Bayesian efficient approach and (c) exploiting the predictive power that retweets have on electoral results as an informational diffusion tool in social media. A practical implication of this new methodology is the possibility to adjust political messages as a means to increase voter engagement in political campaigns.
      Citation: Journal of Information Science
      PubDate: 2019-11-20T08:05:20Z
      DOI: 10.1177/0165551519886056
  • Digital identity and the online self: Footprint strategies – An
           exploratory and comparative research study
    • Authors: Katalin Feher
      Abstract: Journal of Information Science, Ahead of Print.
      Reflecting on the thousands of diverse research studies of social media representation and digital privacy, this article presents a comprehensive summary of online personal strategies. First, the evolution of academic concepts about digital identity and the online self is summarised. Then, the article investigates the key dynamics of personal strategies and control issues in detail with ideas, experiences, stories and metaphors taken from 60 qualitative interviews from Central and Eastern Europe and Southeast Asia. According to the key findings of this article, the universal patterns of online personal strategies follow mostly conscious decisions, resulting in users maintaining 70% control of their digital footprints. However, the remaining 30% of online activities are unconscious floating with digital dynamics and resulting in a wide range of non-expected consequences from identity theft to kidnapping. In summary, an intercultural and intergenerational model highlights the complexity and diversity of the studied field, providing a reference framework for future studies. The closing section presents a discussion of those findings of this study that are inconsistent with commonplace assumptions and conclusions present in the academic literature, promoting for study those subjects that still need to be extended or explored.
      Citation: Journal of Information Science
      PubDate: 2019-10-17T11:07:04Z
      DOI: 10.1177/0165551519879702
  • A deep learning-based quality assessment model of collaboratively edited
           documents: A case study of Wikipedia
    • Authors: Ping Wang, Xiaodan Li, Renli Wu
      Abstract: Journal of Information Science, Ahead of Print.
      Wikipedia is becoming increasingly critical in helping people obtain information and knowledge. Its leading advantage is that users can not only access information but also modify it. However, this presents a challenging issue: how can we measure the quality of a Wikipedia article? The existing approaches assess Wikipedia quality by statistical models or traditional machine learning algorithms. However, their performance is not satisfactory. Moreover, most existing models fail to extract complete information from articles, which degrades the model’s performance. In this article, we first survey related works and summarise a comprehensive feature framework. Then, state-of-the-art deep learning models are introduced and applied to assess Wikipedia quality. Finally, a comparison among deep learning models and traditional machine learning models is conducted to validate the effectiveness of the proposed model. The models are compared extensively in terms of their training and classification performance. Moreover, the importance of each feature and the importance of different feature sets are analysed separately.
      Citation: Journal of Information Science
      PubDate: 2019-09-30T02:57:05Z
      DOI: 10.1177/0165551519877646
  • Development of a classification system for Mathematical Logic
    • Authors: Antonio Sarasa Cabezuelo
      Abstract: Journal of Information Science, Ahead of Print.
      The number of digital resources that exist in repositories and on the Internet in general is enormous. Recovering resources that fit with the user’s specific needs poses a problem. To solve this problem, metainformation is added to the resources. One type of metainformation is the classification of a resource using a classification system that is widely recognised and agreed upon by its users. In this way, each resource is assigned a precise place within the classification system, thus facilitating its location. This article proposes a taxonomy for the classification of the resources (notes, exercises, exams or programmes that could be stored within a digital repository) that are generated within the scope of a Mathematical Logic course for a computer science degree programme. It also describes how to represent the proposed taxonomy using the IMS-VDEX standard and how to integrate it in the LOM and Dublin Core metadata specifications and proposes a set of controlled vocabularies that make it possible to refine the taxonomic metainformation.
      Citation: Journal of Information Science
      PubDate: 2019-09-24T03:12:22Z
      DOI: 10.1177/0165551519877644
  • A bibliometric analysis of topic modelling studies (2000–2017)
    • Authors: Xin Li, Lei Lei
      Abstract: Journal of Information Science, Ahead of Print.
      Topic modelling is a powerful text mining tool that has been applied in many fields such as software engineering, political and linguistic sciences. To evaluate the development of topic modelling studies, the present study reports a bibliometric analysis of SCIE, SSCI and A&HCI listed articles published from 2000 to 2017. Bibliometric indices for productive authors, countries and institutions are analysed. In addition, thematic changes concerning topic modelling are also examined. Results show that China plays a leading role in this field. Topic modelling has established itself as an important technique in not only natural and formal sciences but also social sciences. LDA, social networks and text analysis are the topics with increasing popularity, while certain models (e.g. pLSA) and applications (e.g. topic detection) are declining in popularity. The findings could help researchers optimise research topic choices, seek collaboration with appropriate partners and stay up-to-date with the development of the field.
      Citation: Journal of Information Science
      PubDate: 2019-09-20T08:41:40Z
      DOI: 10.1177/0165551519877049
  • Deep learning in Arabic sentiment analysis: An overview
    • Authors: Amal Alharbi, Mounira Taileb, Manal Kalkatawi
      Abstract: Journal of Information Science, Ahead of Print.
      Sentiment analysis became a very motivating area in both academic and industrial fields due to the exponential increase of the online published reviews and recommendations. To solve the problem of analysing and classifying those reviews and recommendations, several techniques have been proposed. Lately, deep neural networks showed promising outcomes in sentiment analysis. The growing number of Arab users on the Internet along with the increasing amount of published Arabic reviews and comments encouraged researchers to apply deep learning to analyse them. This article is a comprehensive overview of research works that utilised the deep learning approach for Arabic sentiment analysis.
      Citation: Journal of Information Science
      PubDate: 2019-09-18T04:59:37Z
      DOI: 10.1177/0165551519865488
  • Clickbait detection using multiple categorisation techniques
    • Authors: Abinash Pujahari, Dilip Singh Sisodia
      Abstract: Journal of Information Science, Ahead of Print.
      Clickbaits are online articles with deliberately designed misleading titles for luring more and more readers to open the intended web page. Clickbaits are used to tempt visitors to click on a particular link either to monetise the landing page or to spread false news for sensationalisation. The presence of clickbaits on any news aggregator portal may lead to an unpleasant experience for readers. Automatic detection of clickbait headlines from news headlines has been a challenging issue for the machine learning community. Many methods have been proposed for preventing clickbait articles in the recent past. However, the recent techniques available for detecting clickbaits are not very robust. This article proposes a hybrid categorisation technique for separating clickbait and non-clickbait articles by integrating different features, sentence structure and clustering. During preliminary categorisation, the headlines are separated using 11 features. After that, the headlines are recategorised using sentence formality and syntactic similarity measures. In the last phase, the headlines are again recategorised by applying clustering using word vector similarity based on t-stochastic neighbourhood embedding (t-SNE) approach. After categorisation of these headlines, machine learning models are applied to the dataset to evaluate machine learning algorithms. The obtained experimental results indicate that the proposed hybrid model is more robust, reliable and efficient than any individual categorisation techniques for the dataset we have used.
      Citation: Journal of Information Science
      PubDate: 2019-09-16T10:54:18Z
      DOI: 10.1177/0165551519871822
  • An efficient attribute reduction algorithm using MapReduce
    • Authors: Linzi Yin, Jing Li, Zhaohui Jiang, Jiafeng Ding, Xuemei Xu
      Abstract: Journal of Information Science, Ahead of Print.
      Classical attribute reduction algorithms based on attribute significance initiate too many jobs (O(|C|^2)) when they run in MapReduce. To improve the efficiencies of these algorithms, we proposed a novel reduction algorithm. Instead of focusing on attribute significance, the notion of a core attribute was applied to construct a new heuristic reduction algorithm, and only |C| jobs were considered to obtain a reduct. The algorithm only included two basic operations: compare and sort. The latter was optimised using the shuffle mechanism in MapReduce, which provided an efficient sorting ability for big data. In particular, we connected jobs in an iterative form to transfer the processing result of the former job to the latter job. Finally, experimental results demonstrated that the proposed attribute reduction algorithm was efficient and significantly improved upon the classical algorithms in runtime and number of jobs.
      Citation: Journal of Information Science
      PubDate: 2019-09-11T04:18:15Z
      DOI: 10.1177/0165551519874617
  • Loops in publication citation networks
    • Authors: Yi Bu, Yong Huang, Wei Lu
      Abstract: Journal of Information Science, Ahead of Print.
      Traditionally, publication citation networks are regarded as acyclic, that is, no loops in the network as an earlier published article cannot cite a later published article. However, due to the accessibility of pre-print versions of articles, there might be some loops in a publication citation network. This article presents a descriptive statistic on loops in publication citation networks of computer science and physics by employing a network-based indicator, namely, strongly connected component (SCC). By employing computer science and physics disciplines publications from the Web of Science database as examples, this article examines the count of loops, how the count changes over time and how the count relates to the published year difference between publications within the loop in the citation network. Some common structural patterns are also extracted and analysed; we observe that the two disciplines share the most frequent patterns though there exist some minor differences. Moreover, we find that self-citations in terms of authors, authors’ institutions and journals contribute to the formation of loops in publication citation networks.
      Citation: Journal of Information Science
      PubDate: 2019-09-06T07:56:40Z
      DOI: 10.1177/0165551519871826
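      The strongly connected component (SCC) indicator mentioned in the abstract above can be illustrated with a minimal sketch. It assumes citations are given as (citing, cited) pairs and treats any SCC with more than one paper as a loop; the function names and the toy data are hypothetical, and Kosaraju's algorithm is used here only as one standard way to find SCCs, not necessarily the authors' implementation.

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Return the SCCs (as sets) of a directed graph given as (citing, cited) pairs.

    Uses Kosaraju's two-pass algorithm with explicit stacks (no recursion).
    """
    graph = defaultdict(list)
    reverse = defaultdict(list)
    nodes = set()
    for src, dst in edges:
        graph[src].append(dst)
        reverse[dst].append(src)
        nodes.update((src, dst))

    # Pass 1: depth-first search on the original graph, recording finish order.
    visited = set()
    order = []
    for start in sorted(nodes):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # All neighbours explored: the node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: search the reversed graph in decreasing finish order.
    assigned = set()
    components = []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            component.add(node)
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    stack.append(prev)
        components.append(component)
    return components


def citation_loops(edges):
    """Keep only SCCs with at least two papers, i.e. groups that cite each other in a cycle."""
    return [c for c in strongly_connected_components(edges) if len(c) > 1]


# Toy example: A and B cite each other (e.g. via pre-prints); C and D do not form a loop.
toy_edges = [("A", "B"), ("B", "A"), ("B", "C"), ("C", "D")]
loops = citation_loops(toy_edges)
```

      In this sketch a paper that cites only itself would form a single-node SCC and is not counted as a loop; whether such cases should count is a design choice.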
  • Using dates as contextual information for personalised cultural heritage
    • Authors: Ahmed Dahroug, Andreas Vlachidis, Antonios Liapis, Antonis Bikakis, Martín López-Nores, Owen Sacco, José Juan Pazos-Arias
      Abstract: Journal of Information Science, Ahead of Print.
      We present semantics-based mechanisms that aim to promote reflection on cultural heritage by means of dates (historical events or annual commemorations), owing to their connections to a collection of items and to the visitors’ interests. We argue that links to specific dates can trigger curiosity, increase retention and guide visitors around the venue following new appealing narratives in subsequent visits. The proposal has been evaluated in a pilot study on the collection of the Archaeological Museum of Tripoli (Greece), for which a team of humanities experts wrote a set of diverse narratives about the exhibits. A year-round calendar was crafted so that certain narratives would be more or less relevant on any given day. Expanding on this calendar, personalised recommendations can be made by sorting out those relevant narratives according to personal events and interests recorded in the profiles of the target users. Evaluation of the associations by experts and potential museum visitors shows that the proposed approach can discover meaningful connections, while many others that are more incidental can still contribute to the intended cognitive phenomena.
      Citation: Journal of Information Science
      PubDate: 2019-09-06T07:56:20Z
      DOI: 10.1177/0165551519871823
  • DelibAnalysis: Understanding the quality of online political discourse
           with machine learning
    • Authors: Eleonore Fournier-Tombs, Giovanna Di Marzo Serugendo
      Abstract: Journal of Information Science, Ahead of Print.
      This article proposes an automated methodology for the analysis of online political discourse. Drawing from the discourse quality index (DQI) by Steenbergen et al., it applies a machine learning–based quantitative approach to measuring the discourse quality of political discussions online. The DelibAnalysis framework aims to provide an accessible, replicable methodology for the measurement of discourse quality that is both platform and language agnostic. The framework uses a simplified version of the DQI to train a classifier, which can then be used to predict the discourse quality of any non-coded comment in a given political discussion online. The objective of this research is to provide a systematic framework for the automated discourse quality analysis of large datasets and, in applying this framework, to yield insight into the structure and features of political discussions online.
      Citation: Journal of Information Science
      PubDate: 2019-09-04T07:17:11Z
      DOI: 10.1177/0165551519871828
  • Low-cost similarity calculation on ontology fusion in knowledge bases
    • Authors: Wen Lou, Ruofan Pi, Hui Wang, Yuan Ju
      Abstract: Journal of Information Science, Ahead of Print.
      Ontology fusion in knowledge bases has become more difficult owing to the massive capacity required by semantic similarity calculation. Many similarity calculation methods have been developed, although they are hardly unified. This article contributes a low-cost similarity calculation method for ontology fusion, inspired by binary metrics, with the aim of reducing the size of similarity calculations both spatially and logically. By introducing the definitions of a heterogeneous ontology, entities of ontologies and rules of ontology fusion on the basis of concept fusion and relationship fusion, we put forward the main traversal algorithm and calculate that it has the lowest cost in time and space in comparison with traditional methods. We conducted three experiments to test the usability of our approach from the perspective of actual library resources, small datasets and large datasets. In Experiment 1, the bibliographic data from East China Normal University Library were used to show the feasibility and capability of our proposal and present the process of the algorithm. In both Experiments 2 and 3, our approach had at least 88% confidence in detecting accurate merging mappings and also decreased time cost. The test demonstrated a good fusion result. According to the error analysis, the lower recall results from the conflict between the complex structures in ontologies and the recursive functions, which will be improved in the future.
      Citation: Journal of Information Science
      PubDate: 2019-09-04T06:22:16Z
      DOI: 10.1177/0165551519870456
  • Change point detection in social networks using a multivariate
           exponentially weighted moving average chart
    • Authors: Ali Salmasnia, Mohammadreza Mohabbati, Mohammadreza Namdar
      Abstract: Journal of Information Science, Ahead of Print.
      Although the significant role of social networks in communications between individuals has attracted researchers’ attention, only a few authors have investigated social network monitoring in their studies. Most of the existing studies in this context suffer from the following three main drawbacks: (1) using case-based network attributes such as personal experience and department instead of the main attributes such as network density and centrality attributes, (2) monitoring the social attributes separately with the assumption that they are independent of each other and (3) ignoring detection of the actual time of change in the network. To overcome these drawbacks, this research develops a statistical method for monitoring the connections among actors in social networks with the four most important network attributes: (1) network density, (2) degree centrality, (3) betweenness centrality and (4) closeness centrality. To this end, a multivariate exponentially weighted moving average (MEWMA) control chart is used for simultaneous monitoring of these four correlated attributes. Furthermore, since the control chart usually does not signal at the exact time of change due to type II error, this study presents a change point detection method to reduce the cost and time required for diagnosing the control chart signal. Finally, the efficiency of the proposed approach in comparison with the existing methods is evaluated through a simulation procedure. The results indicate that the suggested method performs better than the univariate approach in detecting the change point.
      Citation: Journal of Information Science
      PubDate: 2019-08-30T07:28:47Z
      DOI: 10.1177/0165551519863351
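      For orientation, the textbook form of the MEWMA chart named in the abstract above can be written as follows. This is the standard formulation, not necessarily the exact variant used by the authors, and all notation is assumed here: x_t is the vector of the four monitored attributes at time t, mu_0 and Sigma are the in-control mean vector and covariance matrix, lambda is the smoothing constant and h is the control limit.

```latex
\begin{align}
\mathbf{Z}_t &= \lambda\,(\mathbf{x}_t - \boldsymbol{\mu}_0) + (1-\lambda)\,\mathbf{Z}_{t-1},
  \qquad \mathbf{Z}_0 = \mathbf{0}, \quad 0 < \lambda \le 1 \\
\boldsymbol{\Sigma}_{Z_t} &= \frac{\lambda}{2-\lambda}\left[1-(1-\lambda)^{2t}\right]\boldsymbol{\Sigma} \\
T_t^2 &= \mathbf{Z}_t^{\top}\,\boldsymbol{\Sigma}_{Z_t}^{-1}\,\mathbf{Z}_t,
  \qquad \text{signal if } T_t^2 > h
\end{align}
```

      Because Z_t accumulates past observations, the chart can signal some time after the shift actually occurred, which is why a separate change point estimate is useful.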
  • Corrigendum to Spam Profiles Detection on Social Networks Using
           Computational Intelligence Methods: The Effect of The Lingual Context
    • Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-08-30T05:28:49Z
      DOI: 10.1177/0165551519876373
  • Mapping the social landscape through social media
    • Authors: Girish Chandra Joshi, Mayuri Paul, Bhrigu Kumar Kalita, Vikram Ranga, Jiwan Singh Rawat, Pinkesh Singh Rawat
      Abstract: Journal of Information Science, Ahead of Print.
      As part of the global village, every place has established connections through the strength and power of social media, which cut across political boundaries. Social media is a digital platform where people across the world can interact. Compared with direct interaction, it has a number of advantages: it is universal, anonymous and easily accessible, and it supports indirect interaction and the gathering and sharing of information. The easy access to social networking sites (SNSs) such as Facebook, Twitter and blogs has brought about unprecedented opportunities for citizens to voice their opinions loaded with emotions/sentiments. Furthermore, social media can influence human thoughts. A recent incident of public importance presented an opportunity to map the sentiments surrounding it. Sentiments were extracted from tweets for a week. These sentiments were classified as positive, negative and neutral and were mapped in a geographic information system (GIS) environment. It was found that the number of tweets diminished by 91% over a week from 25 August 2017 to 31 August 2017. Most tweets came from places near the origin of the case (Haryana, Delhi and Punjab). The distribution of sentiments was found to be neutral (47.4%), negative (30%) and positive (22.6%). Interestingly, tweets also came from unexpected places such as the United States, the United Kingdom and West Asia. The result can also be used to assess the spatial distribution of digital penetration in India. The highest concentration was found around metropolitan cities (that is, Mumbai and Delhi) and the lowest in North East India and Jammu & Kashmir, indicating the penetration of SNSs.
      Citation: Journal of Information Science
      PubDate: 2019-08-13T07:30:43Z
      DOI: 10.1177/0165551519865487
  • Evaluating the performance of government websites: An automatic assessment
           system based on the TFN-AHP methodology
    • Authors: Xudong Cai, Shengli Li, Gengzhong Feng
      Abstract: Journal of Information Science, Ahead of Print.
      Government websites are currently important for providing information and services to citizens. It is a crucial task to evaluate the performance of each government website. The traditional evaluation processes based on experts are criticised as being subjective and cannot work in real time. This article proposes a framework and automatic assessment system for evaluating the performance of government websites in real time. To test the proposed framework and the automatic assessment system, we evaluate and classify 70 websites from Shaanxi Province of China. This article provides guidance for government agencies, managers of government websites and researchers.
      Citation: Journal of Information Science
      PubDate: 2019-08-12T09:17:43Z
      DOI: 10.1177/0165551519866548
  • Exploring the dominant features of social media for depression detection
    • Authors: Jamil Hussain, Fahad Ahmed Satti, Muhammad Afzal, Wajahat Ali Khan, Hafiz Syed Muhammad Bilal, Muhammad Zaki Ansaar, Hafiz Farooq Ahmad, Taeho Hur, Jaehun Bang, Jee-In Kim, Gwang Hoon Park, Hyonwoo Seung, Sungyoung Lee
      Abstract: Journal of Information Science, Ahead of Print.
      Recently, social media have been used by researchers to detect depressive symptoms in individuals using linguistic data from users’ posts. In this study, we propose a framework to identify social information as a significant predictor of depression. Using the proposed framework, we develop an application called the Socially Mediated Patient Portal (SMPP), which detects depression-related markers in Facebook users by applying a data-driven approach with machine learning classification techniques. We examined a data set of 4350 users who were evaluated for depression using the Center for Epidemiological Studies Depression (CES-D) scale. From this analysis, we identified a set of features that can distinguish between individuals with and without depression. Finally, we identified the dominant features that adequately assess individuals with and without depression on social media. The model trained on these features will be helpful to physicians in diagnosing mental diseases and psychiatrists in analysing patient behaviour.
      Citation: Journal of Information Science
      PubDate: 2019-08-12T09:13:23Z
      DOI: 10.1177/0165551519860469
  • An overview of systematic literature reviews in social media marketing
    • Authors: Jennifer Rowley, Brendan Keegan
      Abstract: Journal of Information Science, Ahead of Print.
      Systematic literature reviews (SLRs) adopt a specified and transparent approach in order to scope the literature in a field or sub-field. However, there has been little critical comment on their purpose and processes in practice. By undertaking an overview of SLRs in the field of social media (SM) marketing, this article undertakes a critical evaluation of the SLR purposes and processes in a set of recent SLRs and presents a future research agenda for social media marketing. The overview shows that the purposes of SLRs include the following: making sense (of research in a field), developing a concept matrix/taxonomy and supporting research and practice. On SLR processes, while there is some consensus on the stages of the process, there is considerable variation in how these processes are executed. This article offers a resource to inform practice and acts as a platform for further critical debate regarding the nature and value of SLRs.
      Citation: Journal of Information Science
      PubDate: 2019-08-12T09:00:39Z
      DOI: 10.1177/0165551519866544
  • An intrinsic evaluation of the Waterloo spam rankings of the ClueWeb09 and
           ClueWeb12 datasets
    • Authors: İbrahim Barış Yılmazel, Ahmet Arslan
      Abstract: Journal of Information Science, Ahead of Print.
      The ClueWeb09 dataset and its successor, the ClueWeb12 dataset, are two of the largest collections of Web pages released by the Text REtrieval Conference (TREC). The ClueWeb datasets were used in various TREC tracks run from 2009 to 2017. Each year, approximately 50 new queries are released and a pool of Web pages is judged against these queries by human assessors as relevant, non-relevant or spam. In this article, a ground truth for binary classification (spam vs non-spam) is constructed from Web pages that are judged as spam or relevant, under the assumption that a Web page judged as relevant for any query cannot be spam. Based on this ground truth, we evaluate the classification performance of the Waterloo spam rankings (Fusion, Britney, GroupX and UK2006), which have been traditionally used to identify and filter spam pages in retrieval systems. The experimental results in terms of the universal binary classification evaluation measures suggest that Fusion (with threshold = 11%) is the best for the ClueWeb09 dataset. Analysis of the frequency distributions of relevant/spam documents over spam scores reveals that GroupX is the most powerful at identifying relevant documents, whereas Fusion is the most powerful at identifying spam documents. It is also confirmed that the Fusion spam ranking is less effective on the ClueWeb12 dataset than on ClueWeb09.
      Citation: Journal of Information Science
      PubDate: 2019-08-08T12:13:36Z
      DOI: 10.1177/0165551519866551
  • Spam profiles detection on social networks using computational
           intelligence methods: The effect of the lingual context
    • Authors: Ala’ M Al-Zoubi, Hossam Faris, Ja’far Alqatawna, Mohammad A Hassonah
      Abstract: Journal of Information Science, Ahead of Print.
      In online social networks, spam profiles represent one of the most serious security threats over the Internet; if they do not stop producing bad advertisements, they can be exploited by criminals for various purposes. This article addresses the nature and the characteristics of spam profiles in a social network like Twitter to improve spam detection, based on a number of publicly available language-independent features. In order to investigate the effectiveness of these features in spam detection, four datasets are extracted for four different language contexts (i.e. Arabic, English, Korean and Spanish), and a fifth is formed by combining them all. We conduct our experiments using a set of five well-known classification algorithms in spam detection field, k-Nearest Neighbours (k-NN), Random Forest (RF), Naive Bayes (NB), Decision Tree (DT) (J48) and Multilayer Perceptron (MLP) classifiers, along with five filter-based feature selection methods, namely, Information Gain, Chi-square, ReliefF, Correlation and Significance. The results show oscillating performance of each classifier across all datasets, but improved classification results with feature selection. In addition, detailed analysis and comparisons are carried out on two different levels: in the first level, we compare the selected features’ importance among the feature selection methods, whereas in the second level, we observe the relations and the importance of the selected features across all datasets. The findings of this article lead to a better understanding of social spam and improving detection methods by considering the various important features resulting from the different lingual contexts.
      Citation: Journal of Information Science
      PubDate: 2019-08-07T12:05:26Z
      DOI: 10.1177/0165551519861599
  • An investigation of cultural objects in conflict zones through the lens of
           TripAdvisor reviews: A case of South Caucasus
    • Authors: Lala Hajibayova
      Abstract: Journal of Information Science, Ahead of Print.
      This study is an investigation of how cultural sites and objects in the former conflict zones of South Caucasus are constructed in user-generated narratives in TripAdvisor reviews and images. An analysis of these reviews and images was found to demonstrate the embodied orientation of reviewers’ narrations, wherein the disputed nature of the cultural sites is mainly voiced in the form of dissatisfaction with the socio-economical situation and services. This study suggests that the forgotten nature of frozen conflicts engendered an erosion of and disconnect from cultural heritage, ties and significance for those who fled the contested areas.
      Citation: Journal of Information Science
      PubDate: 2019-08-05T10:18:36Z
      DOI: 10.1177/0165551519867545
  • Chemistry research in Europe: A publication analysis (2006–2016)
    • Authors: Hakan Kaygusuz
      Abstract: Journal of Information Science, Ahead of Print.
      In this article, chemistry research in 51 different European countries between years 2006 and 2016 was studied using statistical methods. This study consists of two parts: In the first part, different economical, institutional and citation parameters were correlated with the number of publications, citations and chemical industry numbers using principal components analysis and hierarchical cluster analysis. The results of the first part indicated that economical and geographical parameters directly affect the chemistry research outcome. In the second part, research in branches of chemistry and related disciplines such as analytical chemistry, polymer science and physical chemistry were analysed using principal components analysis and hierarchical cluster analysis for each country. Publication data were collected as the number of chemistry publications (in Science Citation Index–Expanded (SCI-E)) between years 2006 and 2016 in different chemistry subdisciplines and related scientific areas. Results of the second part of the study produced geographical and economical clusters of countries, interestingly, without addition of any geographical data.
      Citation: Journal of Information Science
      PubDate: 2019-07-29T09:31:41Z
      DOI: 10.1177/0165551519865491
  • Contextual weighting approach to compute term weight in layered vector
           space model
    • Authors: Jayant Gadge, Sunil Bhirud
      Abstract: Journal of Information Science, Ahead of Print.
      The World Wide Web (WWW) is the largest available repository of information. This huge amount of information makes retrieving trustworthy information from the WWW challenging. It confronts researchers with new issues of diversity and complexity when retrieving web information. Information retrieval from the web demands approaches that go beyond conventional information retrieval. The heterogeneity, complexity and huge volume of web information require a unique approach to retrieval. Besides, end-users introduce some difficulties in the retrieval process. Sometimes queries submitted by the user are subtle and ambiguous. The primary concern in information retrieval is the issue of predicting the relevance of documents. In this article, a new approach is proposed that logically separates a web document into five layers, namely, title, header, hyperlink, meta tag and body layer. The proposed method effectively combines the textual information and structural evidence of a web document for retrieving information from the Web. In the proposed layered vector space model, each layer has an allocated priority, which is used to compute the weight factor for that layer. The proposed method derives an equation that effectively combines the priority of the layer and the length of the layer to calculate the weight of the layer.
      Citation: Journal of Information Science
      PubDate: 2019-07-29T09:31:10Z
      DOI: 10.1177/0165551519860043
  • Knowledge discovery using SPARQL property path: The case of disease data
    • Authors: Enayat Rajabi, Salvador Sanchez-Alonso
      Abstract: Journal of Information Science, Ahead of Print.
      The Semantic Web allows knowledge discovery on graph-based data sets and facilitates answering complex queries that are extremely difficult to achieve using traditional database approaches. Intuitively, the Semantic Web query language (SPARQL) has a ‘property path’ feature that enables knowledge discovery in a knowledgebase using its reasoning engine. In this article, we utilise the property path of SPARQL and the other Semantic Web technologies to answer sophisticated queries posed over a disease data set. To this aim, we transform data from a disease web portal to a graph-based data set by designing an ontology, present a template to define the queries and provide a set of conjunctive queries on the data set. We illustrate how the reasoning engine of ‘property path’ feature of SPARQL can retrieve the results from the designed knowledgebase. The results of this study were verified by two domain experts as well as authors’ manual exploration on the disease web portal.
      Citation: Journal of Information Science
      PubDate: 2019-07-22T10:44:14Z
      DOI: 10.1177/0165551519865495
  • Does open access citation advantage depend on paper topics?
    • Authors: Hajar Sotudeh
      Abstract: Journal of Information Science, Ahead of Print.
      Research topics vary in their citation potential. In a metric-wise scientific milieu, it would be probable that authors tend to select citation-attractive topics especially when choosing open access (OA) outlets that are more likely to attract citations. Applying a matched-pairs study design, this research aims to examine the role of research topics in the citation advantage of OA papers. Using a comparative citation analysis method, it investigates a sample of papers published in 47 Elsevier article processing charges (APC)-funded journals in different access models including non-open access (NOA), APC, Green and mixed Green-APC. The contents of the papers are analysed using natural language processing techniques at the title and abstract level and served as a basis to match the NOA papers to their peers in the OA models. The publication years and journals are controlled for in order to avoid their impacts on the citation numbers. According to the results, the OA citation advantage that is observed in the whole sample still holds even for the highly similar OA and NOA papers. This implies that the OA citation surplus is not an artefact of the OA and NOA papers’ differences in their topics and, therefore, in their citation potential. This leads to the conclusion that OA authors’ self-selectivity, if it exists at all, is not responsible for the OA citation advantage, at least as far as selection of topics with probably higher citation potentials is concerned.
      Citation: Journal of Information Science
      PubDate: 2019-07-22T10:17:05Z
      DOI: 10.1177/0165551519865489
  • A cohort study of how faculty in LIS schools perceive and engage with
           open-access publishing
    • Authors: Wilhelm Peekhaus
      Abstract: Journal of Information Science, Ahead of Print.
      This article presents results from a survey of faculty in North American Library and Information Studies (LIS) schools about their attitudes towards and experience with open-access publishing. As a follow-up to a similar survey conducted in 2013, the article also outlines the differences in beliefs about and engagement with open access that have occurred between 2013 and 2018. Although faculty in LIS schools are proponents of free access to research, journal publication choices remain informed by traditional considerations such as prestige and impact factor. Engagement with open access has increased significantly, while perceptions of open access have remained relatively stable between 2013 and 2018. Nonetheless, those faculty who have published in an open-access journal or are more knowledgeable about open access tend to be more convinced about the quality of open-access publications and less apprehensive about open-access publishing than those who have no publishing experience with open-access journals or who are less knowledgeable about various open-access modalities. Willingness to comply with gold open-access mandates has increased significantly since 2013.
      Citation: Journal of Information Science
      PubDate: 2019-07-22T07:36:55Z
      DOI: 10.1177/0165551519865481
  • Music-search behaviour on a social Q&A site: A cross-gender comparison
    • Authors: Shengli Deng, Anqi Zhao, Shaoxiong Fu, Yong Liu, Wenjie Fan, Yuting Jiang
      Abstract: Journal of Information Science, Ahead of Print.
      While there have been numerous studies of music-search behaviour, little is known about gendered aspects of how it is carried out on social question and answer sites. The article examines gender differences manifested on one such site with regard to (a) the motivations of the person posing the question, (b) intervening variables that influence music-search behaviour and (c) the formulation of the questions. Results from manual categorisation and other analysis of 17,380 music-relevant questions collected from the site show that males who asked questions did so more often, provided more answers and had more followers than female question-posters. Males tended to include music context information in questions asking for ready reference, whereas females often phrased questions using second-person pronouns, aiming to promote discussion. Such research results add to the current understanding of music-search behaviour and contribute new insights that can inform development of better music services/systems.
      Citation: Journal of Information Science
      PubDate: 2019-07-17T12:51:07Z
      DOI: 10.1177/0165551519861605
  • Real-time feedback query expansion technique for supporting scholarly
           search using citation network analysis
    • Authors: Shah Khalid, Shengli Wu, Aftab Alam, Irfan Ullah
      Abstract: Journal of Information Science, Ahead of Print.
      Scholars routinely search for relevant papers to discover and put a new idea into proper context. Despite ongoing advances in scholarly retrieval technologies, locating relevant papers through keyword queries is still quite challenging due to the massive expansion in the size of the research paper repository. To tackle this problem, we propose a novel real-time feedback query expansion technique, which is a two-stage interactive scholarly search process. Upon receiving the initial search query, the retrieval system provides a ranked list of results. In the second stage, a user selects a few relevant papers, from which useful terms are extracted for query expansion. The newly expanded query is run against the index in real time to generate the final list of research papers. In both stages, citation analysis is involved in further improving the quality of the results. The novelty of the approach lies in the combined exploitation of query expansion and citation analysis that may bring the most relevant papers to the top of the search results list. The experimental results on the Association for Computational Linguistics (ACL) Anthology Network data set demonstrate that this technique is more effective and robust for locating relevant papers, in terms of normalised discounted cumulative gain (nDCG), precision and recall, than several state-of-the-art approaches.
      Citation: Journal of Information Science
      PubDate: 2019-07-17T01:01:10Z
      DOI: 10.1177/0165551519863346
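      The nDCG measure cited in the abstract above can be sketched as follows. This is a minimal illustration assuming graded relevance judgements listed in ranked order and the common rel / log2(rank + 1) discount; the function names are hypothetical, and the ideal ranking is computed from the judged list only, which may differ from the evaluation setup used by the authors.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1), ranks starting at 1."""
    return sum(rel / math.log2(rank + 1) for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances, k=None):
    """DCG of the ranking divided by the DCG of the same judgements sorted best-first.

    If k is given, both rankings are cut off at the top k results.
    """
    ideal = sorted(relevances, reverse=True)
    if k is not None:
        relevances = relevances[:k]
        ideal = ideal[:k]
    ideal_dcg = dcg(ideal)
    return dcg(relevances) / ideal_dcg if ideal_dcg > 0 else 0.0


# Toy example: a ranking that places the most relevant paper second scores below 1.
perfect = ndcg([3, 2, 1])
swapped = ndcg([1, 3])
```

      A score of 1 means the system's ordering matches the best possible ordering of the judged results; lower scores penalise relevant papers that appear further down the list.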
  • Bucketed common vector scaling for authorship attribution in heterogeneous
           web collections: A scaling approach for authorship attribution
    • Authors: Hayri Volkan Agun, Ozgur Yilmazel
      Abstract: Journal of Information Science, Ahead of Print.
      Domain, genre and topic influences on author style adversely affect the performance of authorship attribution (AA) in multi-genre and multi-domain data sets. Although recent approaches to AA tasks focus on suggesting new feature sets and sampling techniques to improve the robustness of a classification system, they do not incorporate domain-specific properties to reduce the negative impact of irrelevant features on AA. This study presents a novel scaling approach, namely, bucketed common vector scaling, to efficiently reduce negative domain influence without reducing the dimensionality of existing features; therefore, this approach is easily transferable and applicable in a classification system. Classification performances on English-language competition data sets consisting of emails and articles and Turkish-language web documents consisting of blogs, articles and tweets indicate that our approach is very competitive to top-performing approaches in English competition data sets and is significantly improving the top classification performance in mixed-domain experiments on blogs, articles and tweets.
      Citation: Journal of Information Science
      PubDate: 2019-07-11T12:50:07Z
      DOI: 10.1177/0165551519863350
  • A hybrid recommender system for the mining of consumer preferences from
           their reviews
    • Authors: Li Chen Cheng, Ming-Chan Lin
      Abstract: Journal of Information Science, Ahead of Print.
      Product review sites are widespread on the Internet and are rapidly gaining in popularity among consumers. This already large volume of user-generated content is dramatically growing every day, making it hard for consumers to filter out the worthwhile information which appears on the various review sites. The recommendation system plays a significant role in solving the problem of information overload. This study proposes a framework which integrates a collaborative filtering approach and an opinion mining technique for movie recommendation. Within the proposed framework, sentiment analysis is first applied to the users’ reviews to detect consumer opinions about the movies they have watched and to explore the individual’s preference profile. Traditional recommendation models are overly dependent on preference ratings and often suffer from the problem of ‘data sparsity’. Experimental results obtained from real online reviews show that our proposed method is effective in dealing with insufficient data and is more accurate and efficient than existing traditional methods.
      Citation: Journal of Information Science
      PubDate: 2019-07-10T01:55:17Z
      DOI: 10.1177/0165551519849510
  • On classification of abstracts obtained from medical journals
    • Authors: Bekir Parlak, Alper Kürşat Uysal
      Abstract: Journal of Information Science, Ahead of Print.
      Classification of medical documents was mostly carried out on English data sets and these studies were performed on hospital records rather than academic texts. The main reasons behind this situation are the lack of publicly available data sets and the tasks being costly and time-consuming. As the first contribution of this study, two data sets including Turkish and English counterparts of the same abstracts published in Turkish medical journals were constructed. Turkish is one of the widely used agglutinative languages worldwide and English is a good example of non-agglutinative languages. While English abstracts were obtained automatically from MEDLINE database with a computer program, Turkish counterparts of these documents were collected manually from the Internet. As the second contribution of this study, an extensive comparison on classification of abstracts obtained from Turkish medical journals was made by using these two equivalent data sets. Features were extracted from text documents with three different approaches: unigram, bigram and hybrid. Hybrid approach includes a combination of unigram and bigram features. In the experiments, three different feature selection methods and seven different classifiers were utilised. According to the results on both data sets, classification performance of the English abstracts outperformed the Turkish counterparts. Maximum accuracies were obtained from the combination of unigram features, distinguishing feature selector (DFS) and multinomial naïve Bayes (MNB) classifier for both data sets. Unigram features were generally more efficient than bigram and hybrid features. However, analysis of top-10 features indicated that nearly half of the features were translations of each other for Turkish and English data sets.
      Citation: Journal of Information Science
      PubDate: 2019-07-09T01:19:03Z
      DOI: 10.1177/0165551519860982
  • Knowledge-sharing and collaborative behaviour: An empirical study on a
           Portuguese higher education institution
    • Authors: Marcello Chedid, Ana Caldeira, Helena Alvelos, Leonor Teixeira
      Abstract: Journal of Information Science, Ahead of Print.
      Collaboration has been considered a way to address the challenges of the 21st century, fostering the necessary innovation, growth and productivity for all parties involved. Several studies reveal that collaboration can be strongly influenced by knowledge sharing. The literature suggests that this topic is quite relevant and that there is an evident lack of empirical studies that properly investigate the relationship between knowledge-sharing and collaborative behaviour in Higher Education Institutions (HEIs). In this context, the purpose of this work is to examine whether knowledge-sharing intention has a positive relationship with collaborative behaviour among professors and researchers in a public Portuguese HEI, taking into account other constructs that can have an effect on the knowledge-sharing intention. In order to reach this objective, a conceptual research model was developed based on the theory of reasoned action. The empirical study was conducted based on a questionnaire, and the data analysis was performed using partial least squares. The results indicate that intrinsic motivation and networking are the factors that positively affect the attitude towards knowledge sharing. Nevertheless, it is concluded that trust is the variable that most strongly affects the knowledge-sharing intention. Finally, the study identified that knowledge-sharing intention has a positive influence on collaborative behaviour. It is considered that this study can help institutions’ management to define strategies and develop actions that promote an organisational culture based on knowledge management, one that significantly leads to knowledge-sharing and collaboration relationships.
      Citation: Journal of Information Science
      PubDate: 2019-07-03T12:51:50Z
      DOI: 10.1177/0165551519860464
  • WeChat knowledge service system of university library based on SoLoMo: A
           holistic design framework
    • Authors: Mang Chen, Wei Zhang
      Abstract: Journal of Information Science, Ahead of Print.
      In this study, we develop a WeChat knowledge service system (WKSS) for university libraries based on SoLoMo. The aim is to build a comprehensive, open, mobile and smart knowledge service environment. It can realise the interaction between three elements (users, library and knowledge) and promote the dissemination and sharing of knowledge. By referencing the Internet frontier concept SoLoMo, this study designs a new mobile smart service system, including the system architecture design, the content design and the data association design. Then, this study develops the system, including the running environment configuration, the development of workflow, the core module and the system implementation. This system enables the provision of accurate, specific and more personalised service to each user. It also includes a portable mobile terminal to increase the accuracy of context awareness and enhance user convenience. This study makes up for the shortcomings of the library and adds functions for personalisation, mobility and intelligence. It extends the way of mobile service in libraries and provides readers with better library mobile services, which were well received by readers.
      Citation: Journal of Information Science
      PubDate: 2019-07-03T12:35:41Z
      DOI: 10.1177/0165551519860045
  • Using Bayesian networks with hidden variables for identifying trustworthy
           users in social networks
    • Authors: Xu Chen, Yuyu Yuan, Mehmet Ali Orgun
      Abstract: Journal of Information Science, Ahead of Print.
      The popularity and broad accessibility of online social networks (OSNs) have facilitated effective communication among people, but such networks also pose potential risks that should not be ignored. Interaction through OSNs is complex and can be unsafe, as individuals can be contacted by strangers at any time. This makes the notion of trust a crucial issue in the use of OSNs. However, compared with decision-making processes associated with whether to trust a stranger encountered in everyday life, this task is more difficult to address with regard to OSNs due to the lack of face-to-face communication and prior knowledge between people. In this article, trust evaluation is formalised as a classification problem. We demonstrate how user profiles and historical records can be organised into a logical structure based on Bayesian networks to recognise the trustworthy people without the need to build trust relationships in OSNs. This is possible when a more detailed description of features denoted by hidden variables is considered. We compare the performance of our method with those of six other machine learning methods using Facebook and Twitter datasets, and our results show that our method achieves higher values in accuracy, recall and F1 score.
      Citation: Journal of Information Science
      PubDate: 2019-07-02T10:41:41Z
      DOI: 10.1177/0165551519857590
  • Spatial information extraction from travel narratives: Analysing the
           notion of co-occurrence indicating closeness of tourist places
    • Authors: Erum Haris, Keng Hoon Gan, Tien-Ping Tan
      Abstract: Journal of Information Science, Ahead of Print.
      Recent advancements in social media have generated a myriad of unstructured geospatial data. Travel narratives are among the richest sources of such spatial clues. They are also a reflection of writers’ interaction with places. One of the prevalent ways to model this interaction is a points of interest (POIs) graph depicting popular POIs and routes. A relevant notion is that frequent pairwise occurrences of POIs indicate their geographic proximity. This work presents an empirical interpretation of this theory and constructs spatially enriched POI graphs, a clear augmentation to popularity-based POI graphs. A triplet pattern, rule-based spatial relation extraction technique SpatRE is proposed and compared with standard relation extraction systems Ollie and Stanford OpenIE. A travel blogs data set is also contributed containing labelled spatial relations. The performance is further evaluated on SemEval 2013 benchmark data sets. Finally, spatially enriched POI graphs are qualitatively compared with TripAdvisor and Google Maps to visualise information accuracy.
      Citation: Journal of Information Science
      PubDate: 2019-06-10T07:57:02Z
      DOI: 10.1177/0165551519837188
  • A case study for block-based linked data generation: Recipes as jigsaw
    • Authors: Övünç Öztürk, Tuğba Özacar
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-06-05T02:08:48Z
      DOI: 10.1177/0165551519849518
  • ASA: A framework for Arabic sentiment analysis
    • Authors: Ahmed Oussous, Fatima-Zahra Benjelloun, Ayoub Ait Lahcen, Samir Belfkih
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-05-21T01:20:49Z
      DOI: 10.1177/0165551519849516
  • Capture and visualisation of text understanding through semantic
           annotations and semantic networks for teaching and learning
    • Authors: Roberto Willrich, Adiel Mittmann, Renato Fileto, Alckmar Luiz dos Santos
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-05-21T01:10:55Z
      DOI: 10.1177/0165551519849514
  • Finding top performers through email patterns analysis
    • Authors: Qi Wen, Peter A Gloor, Andrea Fronzetti Colladon, Praful Tickoo, Tushar Joshi
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-05-20T02:11:01Z
      DOI: 10.1177/0165551519849519
  • A study on first citations of patents through a combination of
           Bradford’s distribution, Cox regression and life tables method
    • Authors: Mohammad Tavakolizadeh-Ravari, Faramarz Soheili, Fatemeh Makkizadeh, Fatemeh Akrami
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-05-08T02:27:10Z
      DOI: 10.1177/0165551519845848
  • University students’ mobile news consumption activities and
           evaluative/affective reactions to political news during election
           campaigns: A diary study
    • Authors: Rong Tang, Kyong Eun Oh
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-04-29T08:14:17Z
      DOI: 10.1177/0165551519845855
  • Concept-LDA: Incorporating Babelfy into LDA for aspect extraction
    • Authors: Ekin Ekinci, Sevinç İlhan Omurca
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-04-29T08:05:17Z
      DOI: 10.1177/0165551519845854
  • Mapping the efficiency of international scientific collaboration between
           cities worldwide
    • Authors: György Csomós, Balázs Lengyel
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-04-10T12:43:13Z
      DOI: 10.1177/0165551519842128
  • Understanding data search as a socio-technical practice
         This is an Open Access Article

    • Authors: Kathleen M Gregory, Helena Cousijn, Paul Groth, Andrea Scharnhorst, Sally Wyatt
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-04-02T11:00:00Z
      DOI: 10.1177/0165551519837182
  • How to identify the roots of broad research topics and fields? The
           introduction of RPYS sampling using the example of climate change research

         This is an Open Access Article

    • Authors: Robin Haunschild, Werner Marx, Andreas Thor, Lutz Bornmann
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-04-02T10:52:00Z
      DOI: 10.1177/0165551519837175
  • A semantic web methodological framework to evaluate the support of
           integrity in thesaurus tools
    • Authors: M Mercedes Martínez-González, María-Luisa Alvite-Díez
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-27T09:33:42Z
      DOI: 10.1177/0165551519837195
  • Visual analysis of information world maps: An exploration of four methods
    • Authors: Devon Greyson, Heather O’Brien, Saguna Shankar
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-22T01:48:47Z
      DOI: 10.1177/0165551519837174
  • Research diversification and its relationship with publication counts and
           impact: A case study based on Australian professors
    • Authors: Hamid R Jamali, Alireza Abbasi, Lutz Bornmann
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-18T09:08:00Z
      DOI: 10.1177/0165551519837191
  • Multimodal ensemble approach to identify and rank top-k influential nodes
           of scholarly literature using Twitter network
    • Authors: Bharat Tidke, Rupa Mehta, Jenish Dhanani
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-18T08:51:21Z
      DOI: 10.1177/0165551519837190
  • Decision tree classification: Ranking journals using IGIDI
    • Authors: Muhammad Shaheen, Tanveer Zafar, Sajid Ali Khan
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-18T08:45:01Z
      DOI: 10.1177/0165551519837176
  • An improved evidence-based aggregation method for sentiment analysis
    • Authors: Parisa Jamadi Khiabani, Mohammad Ehsan Basiri, Hamid Rastegari
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-18T08:39:07Z
      DOI: 10.1177/0165551519837187
  • The citation advantage for open access science journals with and without
           article processing charges
    • Authors: Mohammad Reza Ghane, Mohammad Reza Niazmand, Ameneh Sabet Sarvestani
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-18T08:36:42Z
      DOI: 10.1177/0165551519837183
  • A study of the determinants of postgraduate students’ satisfaction of
           using online research databases
    • Authors: A Y M Atiquil Islam, Arslan Sheikh
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-11T02:37:29Z
      DOI: 10.1177/0165551519834714
  • Effect of knowledge management on software product experience with
           mediating effect of perceived software process improvement: An empirical
           study for Indian software industry
    • Authors: Mitali Chugh, Nitin Chanderwal, Rajesh Upadhyay, Devendra Kumar Punia
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-06T01:59:45Z
      DOI: 10.1177/0165551519833610
  • Twitter speaks: A case of national disaster situational awareness
    • Authors: Amir Karami, Vishal Shah, Reza Vaezi, Amit Bansal
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-03-04T02:06:25Z
      DOI: 10.1177/0165551519828620
  • LAZY R-tree: The R-tree with lazy splitting algorithm
    • Authors: Yang Yang, Pengwei Bai, Ningling Ge, Zhipeng Gao, Xuesong Qiu
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-02-26T10:43:25Z
      DOI: 10.1177/0165551519828616
  • Twitter sentiment analysis using fuzzy integral classifier fusion
    • Authors: Mehdi Emadi, Maseud Rahgozar
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-02-21T11:07:38Z
      DOI: 10.1177/0165551519828627
  • Examining the classification and evolution of novice users’ mental
           models of an academic database in the search task completion process
    • Authors: Zhengbiao Han, Preben Hansen, Haiyun Xu, Rui Luo
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-02-21T11:00:54Z
      DOI: 10.1177/0165551519828621
  • Exploring the characteristics of crowdsourcing: An online observational
    • Authors: Harpreet Bassi, Christopher J Lee, Laura Misener, Andrew M Johnson
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-02-21T10:55:34Z
      DOI: 10.1177/0165551519828626
  • The classification of rumour standpoints in online social network based on
           combinatorial classifiers
    • Authors: Jing Ma, Yongcong Luo
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-02-21T10:50:34Z
      DOI: 10.1177/0165551519828619
  • Aspect-based summarisation using distributed clustering and
           single-objective optimisation
    • Authors: V Priya, K Umamaheswari
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-02-21T03:56:36Z
      DOI: 10.1177/0165551519827896
  • OPPCAT: Ontology population from tabular data
    • Authors: Ovunc Ozturk
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-02-21T03:32:52Z
      DOI: 10.1177/0165551519827892
  • A novel approach to provenance management for privacy preservation
    • Authors: Ozgu Can, Dilek Yilmazer
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-02-21T03:27:49Z
      DOI: 10.1177/0165551519827882
  • HOMPer: A new hybrid system for opinion mining in the Persian language
    • Authors: Mohammad Ehsan Basiri, Arman Kabiri
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-02-06T03:36:56Z
      DOI: 10.1177/0165551519827886
  • ‘No comment’? A study of commenting on PLOS articles
    • Authors: Simon Wakeling, Peter Willett, Claire Creaser, Jenny Fry, Stephen Pinfield, Valerie Spezi, Marc Bonne, Christina Founti, Itzelle Medina Perea
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-01-24T09:57:02Z
      DOI: 10.1177/0165551518819965
  • Deception detection methods incorporating discourse network metrics in
           synchronous computer-mediated communication
    • Authors: Jiang Wu, Yangyang Liu
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-01-22T11:26:16Z
      DOI: 10.1177/0165551518823176
  • Open-access policy and data-sharing practice in UK academia
    • Authors: Yimei Zhu
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-01-21T02:23:38Z
      DOI: 10.1177/0165551518823174
  • Hotel recommendation system by bipartite networks and link prediction
    • Authors: Buket Kaya
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-01-21T02:15:18Z
      DOI: 10.1177/0165551518824577
  • iLDA: An interactive latent Dirichlet allocation model to improve topic
    • Authors: Yezheng Liu, Fei Du, Jianshan Sun, Yuanchun Jiang
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-01-09T03:43:16Z
      DOI: 10.1177/0165551518822455
  • Factors influencing the information needs and information access channels
           of farmers: An empirical study in Guangdong, China
    • Authors: Yongshan Chen, Yonghe Lu
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-01-08T01:47:33Z
      DOI: 10.1177/0165551518819970
  • Document recommendation based on the analysis of group trust and user
    • Authors: Chin-Hui Lai, Yu-Chieh Chang
      First page: 845
      Abstract: Journal of Information Science, Ahead of Print.

      Citation: Journal of Information Science
      PubDate: 2019-01-04T11:43:32Z
      DOI: 10.1177/0165551518819973
  • Group trip planning and information seeking behaviours by mobile social
           media users: A study of tourists in Australia, Bangladesh and China
    • Authors: Jannatul Fardous, Jia Tina Du, Preben Hansen, Kim-Kwang Raymond Choo, Songshan (Sam) Huang
      Abstract: Journal of Information Science, Ahead of Print.
      Social media plays an increasingly important role in travel information seeking and decision-making. However, there is limited understanding of how a group of tourists use social media to plan trips collaboratively and the different practices between countries. In this study, we investigated the collaborative information seeking (CIS) and sharing behaviours of mobile social media users from Australia, Bangladesh and China. Specifically, we surveyed a total of 219 participants to explore the differences in CIS behaviours when people were planning a group trip. The findings suggest significant differences among the three countries in terms of the motivations of using social media, CIS activities and social interactions outside the group. Key findings include that Bangladeshi and Chinese travellers preferred known contacts on social media, while Australian tourists intended to use both known contacts and user-generated contents for seeking information. The findings also show that social interactions employed by individuals are considered an important complement of, and are interwoven with, in-group CIS; both contribute to tourism information seeking. Finally, we propose a framework for CIS research in the tourism domain.
      Citation: Journal of Information Science
      DOI: 10.1177/0165551519890515
  • The association between professional stratification and use of online
           sources: Evidence from the National Dental Practice-Based Research Network
    • Authors: Simone Rosenblum, Kimberley R Isett, Julia Melkers, Ellen Funkhouser, Diana Hicks, Gregg Gilbert, Michael J Melkers, Deborah McEdward, Meredith Buchberg-Trejo
      Abstract: Journal of Information Science, Ahead of Print.
      The use of online information sources in most professions is widespread and well researched. Less understood is how the use of these sources varies across the strata within a single profession, and how question context affects search behaviour. Using the dental profession as a case of a highly stratified discipline, we examine search preferences for sources by professional strata among dentists in a practice-based network. Results show that variation exists in information search behaviour across professional strata of dental clinicians. This study highlights the importance of addressing information literacy across different levels of a profession. Findings also underscore that search behaviour and source preference vary with perceived question relevance.
      Citation: Journal of Information Science
      DOI: 10.1177/0165551519890519
  • Topic modelling and social network analysis of publications and patents in
           humanoid robot technology
    • Authors: Richa Kumari, Jae Yun Jeong, Byeong-Hee Lee, Kwang-Nam Choi, Kiseok Choi
      Abstract: Journal of Information Science, Ahead of Print.
      This article presents an analysis of data from scientific articles and patents to identify the evolving trends and underlying topics in research on humanoid robots. We used topic modelling based on latent Dirichlet allocation analysis to identify underlying topics in sub-areas in the field. We also used social network analysis to measure the centrality indices of publication keywords to detect important and influential sub-areas and used co-occurrence analysis of keywords to visualise relationships among subfields. The research result is useful for identifying evolving topics and areas of current focus in the field of humanoid technology. The results contribute to identifying valuable research patterns from publications and to increasing understanding of the hidden knowledge themes that are revealed by patents.
      Citation: Journal of Information Science
      DOI: 10.1177/0165551519887878
  • Indicator of quality for environmental articles on Wikipedia at the higher
           education level
    • Authors: Eduard Petiška, Bedřich Moldan
      Abstract: Journal of Information Science, Ahead of Print.
      Wikipedia is important in higher education because students and scholars often use it. Nevertheless, the issue of Wikipedia’s quality is an obstacle for its use at the higher education level. In order to contribute to this discussion, we have proposed ‘Verifiability by respected sources’ as an indicator for assessing the quality of Wikipedia articles at the higher education level and conducted an analysis of the most frequently visited articles in the category of Environment on Wikipedia. Results show that these articles contain many unreferenced statements, so their usage at the higher education level is problematic. Therefore, we also propose specific steps for relevant actors that could help to improve the quality of Wikipedia.
      Citation: Journal of Information Science
      DOI: 10.1177/0165551519888607
  • DeepLink: A novel link prediction framework based on deep learning
    • Authors: Mohammad Mehdi Keikha, Maseud Rahgozar, Masoud Asadpour
      Abstract: Journal of Information Science, Ahead of Print.
      Recently, link prediction has attracted more attention from various disciplines such as computer science, bioinformatics and economics. In link prediction, various kinds of information such as network topology, profile information and user-generated contents are considered to discover missing links between nodes. Whereas numerous previous studies focused on the structural features of the networks for link prediction, recent studies have shown more interest in profile and content information, too. So, some of these studies combine structural and content information. However, some issues such as scalability and feature engineering need to be investigated to solve a few remaining problems. Moreover, most of the previous studies address only undirected and unweighted networks. In this article, a novel link prediction framework named ‘DeepLink’ is presented, which is based on deep learning techniques. While deep learning has the advantage of automatically extracting the best features for link prediction, many other link prediction algorithms need manual feature engineering. Moreover, in the proposed framework, both structural and content information are employed. The framework is capable of using different structural feature vectors that are prepared by various link prediction methods. It learns all proximity orders that are present in a network during the structural feature learning. We have evaluated the effectiveness of DeepLink on two real social network datasets, Telegram and irBlogs. On both datasets, the proposed framework outperforms several other structural and hybrid approaches for link prediction.
      Citation: Journal of Information Science
      DOI: 10.1177/0165551519891345