  Subjects -> SCIENCES: COMPREHENSIVE WORKS (Total: 426 journals)
Showing 1 - 200 of 265 Journals sorted alphabetically
AAS Open Research     Open Access   (Followers: 1)
ABC Journal of Advanced Research     Open Access  
Accountability in Research: Policies and Quality Assurance     Hybrid Journal   (Followers: 22)
Acta Materialia Transilvanica     Open Access  
Acta Nova     Open Access   (Followers: 1)
Acta Scientifica Malaysia     Open Access  
Acta Scientifica Naturalis     Open Access   (Followers: 3)
Adıyaman University Journal of Science     Open Access  
Advanced Science     Open Access   (Followers: 13)
Advanced Science, Engineering and Medicine     Partially Free   (Followers: 11)
Advanced Theory and Simulations     Hybrid Journal   (Followers: 5)
Advances in Research     Open Access  
Advances in Science and Technology     Full-text available via subscription   (Followers: 17)
African Journal of Science, Technology, Innovation and Development     Hybrid Journal   (Followers: 8)
Afrique Science : Revue Internationale des Sciences et Technologie     Open Access   (Followers: 2)
AFRREV STECH : An International Journal of Science and Technology     Open Access   (Followers: 4)
American Academic & Scholarly Research Journal     Open Access   (Followers: 6)
American Journal of Applied Sciences     Open Access   (Followers: 27)
American Journal of Humanities and Social Sciences     Open Access   (Followers: 17)
ANALES de la Universidad Central del Ecuador     Open Access   (Followers: 4)
Anales del Instituto de la Patagonia     Open Access  
Applied Mathematics and Nonlinear Sciences     Open Access   (Followers: 1)
Apuntes de Ciencia & Sociedad     Open Access  
Arab Journal of Basic and Applied Sciences     Open Access  
Arabian Journal for Science and Engineering     Hybrid Journal   (Followers: 5)
Archives Internationales d'Histoire des Sciences     Partially Free   (Followers: 7)
Archives of Current Research International     Open Access  
ARO. The Scientific Journal of Koya University     Open Access  
ARPHA Conference Abstracts     Open Access   (Followers: 6)
ARPHA Proceedings     Open Access   (Followers: 5)
ArtefaCToS : Revista de estudios sobre la ciencia y la tecnología     Open Access   (Followers: 1)
Asia-Pacific Journal of Science and Technology     Open Access  
Asian Journal of Advanced Research and Reports     Open Access   (Followers: 2)
Asian Journal of Applied Science and Engineering     Open Access   (Followers: 2)
Asian Journal of Scientific Research     Open Access   (Followers: 3)
Asian Journal of Technology Innovation     Hybrid Journal   (Followers: 7)
Australian Field Ornithology     Full-text available via subscription   (Followers: 4)
Australian Journal of Social Issues     Hybrid Journal   (Followers: 7)
AZimuth     Full-text available via subscription   (Followers: 2)
Bangladesh Journal of Scientific Research     Open Access   (Followers: 1)
Beni-Suef University Journal of Basic and Applied Sciences     Open Access   (Followers: 3)
Berichte Zur Wissenschaftsgeschichte     Hybrid Journal   (Followers: 10)
Berkeley Scientific Journal     Full-text available via subscription  
BIBECHANA     Open Access   (Followers: 2)
BibNum     Open Access  
Bilge International Journal of Science and Technology Research     Open Access   (Followers: 1)
Bioethics Research Notes     Full-text available via subscription   (Followers: 16)
Bistua : Revista de la Facultad de Ciencias Básicas     Open Access  
BJHS Themes     Open Access   (Followers: 1)
Black Sea Journal of Engineering and Science     Open Access  
Borneo Journal of Resource Science and Technology     Open Access  
Brazilian Journal of Science and Technology     Open Access   (Followers: 2)
Bulletin de la Société Royale des Sciences de Liège     Open Access  
Bulletin of the National Research Centre     Open Access  
Butlletí de la Institució Catalana d'Història Natural     Open Access  
Central European Journal of Clinical Research     Open Access  
Chain Reaction     Full-text available via subscription  
Ciencia & Natura     Open Access   (Followers: 1)
Ciencia Amazónica (Iquitos)     Open Access   (Followers: 1)
Ciencia en Desarrollo     Open Access   (Followers: 3)
Ciencia en su PC     Open Access   (Followers: 1)
Ciencia Ergo Sum     Open Access  
Ciência ET Praxis     Open Access  
Ciencia y Tecnología     Open Access  
Ciencia, Docencia y Tecnología     Open Access  
Ciencias Holguin     Open Access   (Followers: 2)
CienciaUAT     Open Access   (Followers: 1)
Citizen Science : Theory and Practice     Open Access   (Followers: 2)
Communications Faculty of Sciences University of Ankara Series A2-A3 Physical Sciences and Engineering     Open Access  
Communications in Applied Sciences     Open Access  
Comprehensive Therapy     Hybrid Journal   (Followers: 3)
Comunicata Scientiae     Open Access   (Followers: 1)
ConCiencia     Open Access  
Conference Papers in Science     Open Access   (Followers: 2)
Configurations     Full-text available via subscription   (Followers: 10)
COSMOS     Hybrid Journal   (Followers: 1)
Crea Ciencia Revista Científica     Open Access   (Followers: 2)
Cuadernos de Investigación UNED     Open Access  
Current Issues in Criminal Justice     Hybrid Journal   (Followers: 15)
Current Research in Geoscience     Open Access   (Followers: 8)
Dalat University Journal of Science     Open Access  
Data     Open Access   (Followers: 5)
Data Curation Profiles Directory     Open Access   (Followers: 5)
Dhaka University Journal of Science     Open Access  
Dharmakarya     Open Access  
Diálogos Interdisciplinares     Open Access  
Digithum     Open Access   (Followers: 2)
Discover Sustainability     Open Access   (Followers: 3)
Einstein (São Paulo)     Open Access  
Ekaia : EHUko Zientzia eta Teknologia aldizkaria     Open Access  
Elkawnie : Journal of Islamic Science and Technology     Open Access  
Emergent Scientist     Open Access  
Enhancing Learning in the Social Sciences     Open Access   (Followers: 9)
Enseñanza de las Ciencias : Revista de Investigación y Experiencias Didácticas     Open Access  
Entramado     Open Access  
Entre Ciencia e Ingeniería     Open Access   (Followers: 1)
Epiphany     Open Access   (Followers: 4)
Ergo     Open Access  
Estação Científica (UNIFAP)     Open Access   (Followers: 1)
Ethiopian Journal of Education and Sciences     Open Access   (Followers: 6)
Ethiopian Journal of Science and Technology     Open Access  
Ethiopian Journal of Sciences and Sustainable Development     Open Access   (Followers: 5)
European Online Journal of Natural and Social Sciences     Open Access   (Followers: 12)
European Scientific Journal     Open Access   (Followers: 10)
Evidência - Ciência e Biotecnologia - Interdisciplinar     Open Access  
Exchanges : the Warwick Research Journal     Open Access   (Followers: 2)
Experimental Results     Open Access   (Followers: 1)
Extensionismo, Innovación y Transferencia Tecnológica     Open Access   (Followers: 3)
Facets     Open Access  
Fides et Ratio : Revista de Difusión Cultural y Científica     Open Access   (Followers: 1)
Fırat University Turkish Journal of Science & Technology     Open Access  
Fontanus     Open Access  
Forensic Science Policy & Management: An International Journal     Hybrid Journal   (Followers: 378)
Frontiers for Young Minds     Open Access  
Frontiers in Climate     Open Access   (Followers: 3)
Frontiers in Science     Open Access   (Followers: 1)
Fundamental Research     Open Access  
Futures & Foresight Science     Hybrid Journal   (Followers: 4)
Gaudium Sciendi     Open Access   (Followers: 1)
Gazi University Journal of Science     Open Access  
Ghana Studies     Full-text available via subscription   (Followers: 15)
Global Journal of Pure and Applied Sciences     Full-text available via subscription  
Global Journal of Science Frontier Research     Open Access   (Followers: 2)
Globe, The     Full-text available via subscription   (Followers: 4)
HardwareX     Open Access  
Heidelberger Jahrbücher Online     Open Access  
Heliyon     Open Access  
Himalayan Journal of Science and Technology     Open Access   (Followers: 1)
History of Science and Technology     Open Access   (Followers: 1)
Hoosier Science Teacher     Open Access  
Iberoamerican Journal of Science Measurement and Communication     Open Access  
Impact     Open Access   (Followers: 2)
Indian Journal of History of Science     Hybrid Journal   (Followers: 2)
Indonesian Journal of Fundamental Sciences     Open Access  
Indonesian Journal of Science and Mathematics Education     Open Access   (Followers: 4)
Indonesian Journal of Science and Technology     Open Access  
Ingenieria y Ciencia     Open Access   (Followers: 1)
Innovare : Revista de ciencia y tecnología     Open Access  
Instruments     Open Access  
Integrated Research Advances     Open Access  
Interciencia     Open Access   (Followers: 1)
Interface Focus     Full-text available via subscription  
International Annals of Science     Open Access  
International Archives of Science and Technology     Open Access  
International Journal of Academic Research in Business, Arts & Science     Open Access   (Followers: 2)
International Journal of Advanced Multidisciplinary Research and Review     Open Access  
International Journal of Applied Science     Open Access  
International Journal of Basic and Applied Sciences     Open Access   (Followers: 4)
International Journal of Computational and Experimental Science and Engineering (IJCESEN)     Open Access  
International Journal of Culture and Modernity     Open Access   (Followers: 3)
International Journal of Engineering, Science and Technology     Open Access  
International Journal of Innovation and Applied Studies     Open Access   (Followers: 12)
International Journal of Innovative Research and Scientific Studies     Open Access   (Followers: 6)
International Journal of Network Science     Hybrid Journal   (Followers: 3)
International Journal of Recent Contributions from Engineering, Science & IT     Open Access   (Followers: 1)
International Journal of Research in Science     Open Access   (Followers: 2)
International Journal of Science & Emerging Technologies     Open Access   (Followers: 1)
International Journal of Social Sciences and Management     Open Access   (Followers: 3)
International Journal of Technology Policy and Law     Hybrid Journal   (Followers: 7)
International Letters of Social and Humanistic Sciences     Open Access   (Followers: 1)
International Review of Applied Sciences     Open Access  
InterSciencePlace     Open Access   (Followers: 1)
Investiga : TEC     Open Access  
Investigación Joven     Open Access  
Investigación Valdizana     Open Access  
Investigacion y Ciencia     Open Access   (Followers: 1)
Iranian Journal of Science and Technology, Transactions A : Science     Hybrid Journal  
iScience     Open Access   (Followers: 2)
Issues in Science & Technology     Free   (Followers: 7)
Ithaca : Viaggio nella Scienza     Open Access  
J : Multidisciplinary Scientific Journal     Open Access  
Jaunujų mokslininkų darbai     Open Access   (Followers: 1)
Journal de la Recherche Scientifique de l'Universite de Lome     Full-text available via subscription   (Followers: 2)
Journal of Chromatography & Separation Techniques     Open Access   (Followers: 12)
Journal of Advanced Research     Open Access   (Followers: 3)
Journal of Al-Qadisiyah for Pure Science     Open Access   (Followers: 1)
Journal of Alasmarya University     Open Access   (Followers: 3)
Journal of Analytical Science & Technology     Open Access   (Followers: 6)
Journal of Applied Science and Technology     Full-text available via subscription   (Followers: 1)
Journal of Applied Sciences and Environmental Management     Open Access   (Followers: 3)
Journal of Big History     Open Access   (Followers: 3)
Journal of Composites Science     Open Access   (Followers: 3)
Journal of Critical Thought and Praxis     Open Access   (Followers: 2)
Journal of Deliberative Mechanisms in Science     Open Access  
Journal of Diversity Management     Open Access   (Followers: 6)
Journal of Indian Council of Philosophical Research     Hybrid Journal  
Journal of Institute of Science and Technology     Open Access  
Journal of Integrated Science and Technology     Open Access  
Journal of Interaction Science     Open Access   (Followers: 1)
Journal of Kerbala University     Open Access   (Followers: 1)
Journal of King Saud University - Science     Open Access   (Followers: 1)
Journal of Mathematical and Fundamental Sciences     Open Access  
Journal of Natural Sciences and Mathematics Research     Open Access  
Journal of Natural Sciences Research     Open Access   (Followers: 4)
Journal of Negative and No Positive Results     Open Access  
Journal of Responsible Technology     Open Access   (Followers: 1)
Journal of Science (JSc)     Open Access  
Journal of Science and Engineering     Open Access  
Journal of Science and Technology     Open Access   (Followers: 2)
Journal of Science and Technology     Open Access   (Followers: 1)

Data
Number of Followers: 5  

  This is an Open Access journal
ISSN (Online) 2306-5729
Published by MDPI  [238 journals]
  • Data, Vol. 6, Pages 95: TRIPOD—A Treadmill Walking Dataset with IMU,
           Pressure-Distribution and Photoelectric Data for Gait Analysis

    • Authors: Justin Trautmann, Lin Zhou, Clemens Markus Brahms, Can Tunca, Cem Ersoy, Urs Granacher, Bert Arnrich
      First page: 95
      Abstract: Inertial measurement units (IMUs) enable easy-to-operate, low-cost data recording for gait analysis. When combined with treadmill walking, a large number of steps can be collected in a controlled environment without the need for a dedicated gait analysis laboratory. In order to evaluate existing and novel IMU-based gait analysis algorithms for treadmill walking, a reference dataset that includes IMU data as well as reliable ground truth measurements for multiple participants and walking speeds is needed. This article provides a reference dataset consisting of 15 healthy young adults who walked on a treadmill at three different speeds. Data were acquired using seven IMUs placed on the lower body, two different reference systems (Zebris FDMT-HQ and OptoGait), and two RGB cameras. Additionally, in order to validate an existing IMU-based gait analysis algorithm using the dataset, an adaptable modular data analysis pipeline was built. Our results show agreement between the pressure-sensitive Zebris and the photoelectric OptoGait system (r = 0.99), demonstrating the quality of our reference data. As a use case, the performance of an algorithm originally designed for overground walking was tested on treadmill data using the data pipeline. The accuracy of stride length and stride time estimations was comparable to that reported in other studies with overground data, indicating that the algorithm is equally applicable to treadmill data. The Python source code of the data pipeline is publicly available, and the dataset will be provided by the authors upon request, enabling future evaluations of IMU gait analysis algorithms without the need to record new data.
      Citation: Data
      PubDate: 2021-08-26
      DOI: 10.3390/data6090095
      Issue No: Vol. 6, No. 9 (2021)
       
  • Data, Vol. 6, Pages 96: Lessons Learnt from Engineering Science Projects
           Participating in the Horizon 2020 Open Research Data Pilot

    • Authors: Timothy Austin, Kyriaki Bei, Theodoros Efthymiadis, Elias P. Koumoulos
      First page: 96
      Abstract: Trends in the sciences are indicative of data management becoming established as a feature of the mainstream research process. In this context, the European Commission introduced an Open Research Data pilot at the start of the Horizon 2020 research programme. This initiative followed the success of the Open Access pilot implemented in the prior (FP7) research programme, which thereafter became an integral component of Horizon 2020. While the Open Access phenomenon can reasonably be argued to be one of many instances of web technologies disrupting established business models (namely publication practices and workflows established over several centuries in the case of Open Access), initiatives designed to promote research data management have no established foundation on which to build. For Open Data to become a reality and, more importantly, to contribute to the scientific process, data management best practices and workflows are required. Furthermore, with the scientific community having operated to good effect in the absence of data management, there is a need to demonstrate the merits of data management. This circumstance is complicated by the lack of the necessary ICT infrastructures, especially interoperability standards, required to facilitate the seamless transfer, aggregation and analysis of research data. Any activity aiming to promote Open Data thus needs to overcome a number of cultural and technological challenges. It is in this context that this paper examines the data management activities and outcomes of a number of projects participating in the Horizon 2020 Open Research Data pilot. The result has been to identify a number of commonly encountered benefits and issues; to assess the utilisation of data management plans; and through the close examination of specific cases, to gain insights into obstacles to data management and potential solutions. 
Although primarily anecdotal and difficult to quantify, the experiences reported in this paper tend to favour developing data management best practices rather than doggedly pursuing the Open Data mantra. While Open Data may prove valuable in certain circumstances, there is good reason to claim that managed access to scientific data of high inherent intellectual and financial value will prove more effective in driving knowledge discovery and innovation.
      Citation: Data
      PubDate: 2021-09-06
      DOI: 10.3390/data6090096
      Issue No: Vol. 6, No. 9 (2021)
       
  • Data, Vol. 6, Pages 97: BioCPR–A Tool for Correlation Plots

    • Authors: Vidal Fey, Dhanaprakash Jambulingam, Henri Sara, Samuel Heron, Csilla Sipeky, Johanna Schleutker
      First page: 97
      Abstract: A gene is a sequence of DNA bases through which genetic information is passed on to the next generation. Most genes encode proteins that ultimately control cellular function. Understanding the interrelation between genes without the application of statistical methods can be a daunting task. Correlation analysis is a powerful approach to determine the strength of association between two variables (e.g., gene-wise expression). Moreover, it becomes essential to visualize these data to establish patterns and derive insight. The most common method for gene expression visualization is to use correlation heatmaps in which the colors of the plot represent strength of co-expression. In order to address this requirement, we developed a visualization tool called BioCPR: Biological Correlation Plots in R. This tool performs both correlation analysis and subsequent visualization in the form of an interactive heatmap, improving both usability and interpretation of the data. BioCPR is an R Shiny-based application and can be run locally in RStudio or a web browser.
      Citation: Data
      PubDate: 2021-09-08
      DOI: 10.3390/data6090097
      Issue No: Vol. 6, No. 9 (2021)
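As a rough illustration of the correlation analysis described in the BioCPR abstract, the sketch below computes pairwise correlations between gene-wise expression vectors. It is written in Python rather than R and does not reproduce BioCPR's code. The choice of the Pearson coefficient, the function names, the gene names, and the expression values are all assumptions made for the example.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(expression):
    """Map every gene pair to its correlation (nested dict, symmetric)."""
    genes = list(expression)
    return {g: {h: pearson(expression[g], expression[h]) for h in genes}
            for g in genes}

# Hypothetical expression values for three genes across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],   # rises with geneA
    "geneC": [4.0, 3.0, 2.0, 1.0],   # falls as geneA rises
}
matrix = correlation_matrix(expression)
```

A heatmap would then color each cell of `matrix` by its value, from -1 (opposite expression patterns) to +1 (co-expression).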
       
  • Data, Vol. 6, Pages 98: Seismic Envelopes of Coda Decay for Q-coda
           Attenuation Studies of the Gargano Promontory (Southern Italy) and
           Surrounding Regions

    • Authors: Marilena Filippucci, Salvatore Lucente, Salvatore de Lorenzo, Edoardo Del Pezzo, Giacomo Prosser, Andrea Tallarico
      First page: 98
      Abstract: Here, we describe the dataset of seismic envelopes used to study the S-wave Q-coda attenuation quality factor Qc of the Gargano Promontory (Southern Italy). With this dataset, we investigated the crustal seismic attenuation by the Qc parameter. We collected this dataset starting from two different earthquake catalogues: the first regarding the period from April 2013 to July 2014; the second regarding the period from July 2015 to August 2018. Visual inspection of the envelopes was carried out on recordings filtered with a two-pole Butterworth filter with central frequency fc = 6 Hz. The obtained seismic envelopes of coda decay can be linearly fitted in a bilogarithmic diagram in order to obtain a series of single source-receiver measures of Qc for each seismogram component at different frequencies fc. The analysis of the trend Qc(fc) gives important insights into the heterogeneity and the anelasticity of the sampled Earth medium.
      Citation: Data
      PubDate: 2021-09-13
      DOI: 10.3390/data6090098
      Issue No: Vol. 6, No. 9 (2021)
       
  • Data, Vol. 6, Pages 99: Technical Data of Heterologous Expression and
           Purification of SARS-CoV-2 Proteases Using Escherichia coli System

    • Authors: Rafida Razali, Vijay Kumar Subbiah, Cahyo Budiman
      First page: 99
      Abstract: The SARS-CoV-2 coronavirus expresses two essential proteases: firstly, the 3-chymotrypsin-like protease (3CLpro) or main protease (Mpro), and secondly, the papain-like protease (PLpro), both of which are considered viable drug targets for the inhibition of viral replication. In order to perform drug discovery assays for SARS-CoV-2, it is imperative that efficient methods are established for the production and purification of 3CLpro and PLpro of SARS-CoV-2, designated as 3CLpro-CoV2 and PLpro-CoV2, respectively. This article expands on the data collected in attempts to express SARS-CoV-2 proteases under different conditions and purify them by single-step chromatography. Data showed that the use of E. coli BL21(DE3) strain was sufficient to express 3CLpro-CoV2 in a fully soluble form. Nevertheless, the single affinity chromatography step was only applicable for 3CLpro-CoV2 expressed at 18 °C, with a yield and purification fold of 92% and 49, respectively. Meanwhile, PLpro-CoV2 was successfully expressed in a fully soluble form in either BL21(DE3) or BL21-CodonPlus(DE3) strains. In contrast, the single affinity chromatography step was only applicable for PLpro-CoV2 expressed using E. coli BL21-CodonPlus(DE3) at 18 or 37 °C, with a yield and purification fold of 86% (18 °C) or 83.36% (37 °C) and 112 (18 °C) or 71 (37 °C), respectively. The findings provide a guide for optimizing the production of SARS-CoV-2 proteases in E. coli host cells.
      Citation: Data
      PubDate: 2021-09-16
      DOI: 10.3390/data6090099
      Issue No: Vol. 6, No. 9 (2021)
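The abstract above reports a yield and a purification fold for each protease without saying how they were computed. The sketch below uses the usual textbook definitions, which are an assumption here: yield is the share of total enzyme activity recovered, and purification fold is the ratio of specific activities (activity per milligram of protein) after and before the step. The function names and all numbers are hypothetical and are not taken from the article.

```python
def specific_activity(total_activity_units, total_protein_mg):
    """Enzyme activity per milligram of total protein (U/mg)."""
    return total_activity_units / total_protein_mg

def purification_summary(crude, purified):
    """Return (yield_percent, purification_fold) for one purification step.

    Each argument is a dict with keys "activity" (total units) and
    "protein" (total milligrams). Textbook definitions, assumed here.
    """
    yield_percent = 100.0 * purified["activity"] / crude["activity"]
    fold = (specific_activity(purified["activity"], purified["protein"])
            / specific_activity(crude["activity"], crude["protein"]))
    return yield_percent, fold

# Hypothetical measurements, for illustration only.
crude = {"activity": 1000.0, "protein": 200.0}     # 5 U/mg
purified = {"activity": 800.0, "protein": 4.0}     # 200 U/mg
yield_percent, fold = purification_summary(crude, purified)
```

With these made-up inputs, 80% of the activity is recovered and the specific activity rises 40-fold.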
       
  • Data, Vol. 6, Pages 100: Dataset of Flow-Induced Vibrations on a Pipe
           Conveying Cold Water

    • Authors: Francisco Villa, Cherlly Sánchez, Marcela Vallejo, Juan S. Botero-Valencia, Edilson Delgado-Trejos
      First page: 100
      Abstract: Analysis of flow-induced pipe vibrations has been used in a variety of applications, such as flowrate inference and leak detection. These applications are based on a functional relationship between the vibration features estimated in the pipe walls and the dynamics related to the flow of the substance. The dataset described in this document comprises signals acquired using an accelerometer attached to a pipe conveying cold water at specific flowrate values. Tests were carried out under clauses of the ISO 4064-1/2:2016 standard and were performed in two measurement benches designed for flowmeter calibration, and a total of 80 flowrate values, from 25 L/h to 20,000 L/h, were considered. For each flowrate value, 3 to 6 samples were taken, so that the resulting dataset has a total of 382 signals that contain acceleration values in three axes and a timestamp in microseconds.
      Citation: Data
      PubDate: 2021-09-17
      DOI: 10.3390/data6090100
      Issue No: Vol. 6, No. 9 (2021)
       
  • Data, Vol. 6, Pages 79: Temporal Changes in Delaware Waters Using
           Long-Term (1967–2019) Water Temperature Data

    • Authors: Bhanu Paudel, Lori M. Brown
      First page: 79
      Abstract: The present article provides long-term (1967–2019) water temperature data collected from Delaware water quality monitoring sites. In Delaware, there are approximately 140 water quality monitoring sites in Piedmont, Delaware Bay, Chesapeake Bay, and Inland Bay drainage basins. Long-term quarterly (i.e., four times a year: Q1: January–February–March; Q2: April–May–June; Q3: July–August–September; Q4: October–November–December) water temperature data were calculated from each water quality monitoring site's continuous monthly data. This study focuses on water quality monitoring sites with significant (p-value identifying linear regression model) increasing or decreasing trends of water temperature. Quarterly water temperature data, statistical analysis, and maps showing increasing and decreasing trends from water quality monitoring sites with significant trends are presented in this article.
      Citation: Data
      PubDate: 2021-07-24
      DOI: 10.3390/data6080079
      Issue No: Vol. 6, No. 8 (2021)
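The quarterly aggregation and linear trend described in the abstract above can be sketched as follows. This is a minimal illustration under assumptions, not the authors' procedure: it averages monthly values into the four quarters defined in the abstract and fits an ordinary least-squares slope per quarter. The significance test mentioned in the abstract is omitted, and the function names and temperature values are synthetic.

```python
from collections import defaultdict

def quarterly_means(monthly):
    """Average monthly values into quarters.

    `monthly` maps (year, month) -> temperature; the result maps
    (year, quarter) -> mean temperature, with quarter in 1..4.
    """
    buckets = defaultdict(list)
    for (year, month), value in monthly.items():
        quarter = (month - 1) // 3 + 1   # Jan-Mar -> 1, ..., Oct-Dec -> 4
        buckets[(year, quarter)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}

def linear_trend(xs, ys):
    """Ordinary least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic monthly temperatures warming by 0.5 degrees per year.
monthly = {
    (year, month): 10.0 + 0.5 * (year - 2000) + month
    for year in range(2000, 2005)
    for month in range(1, 13)
}
quarters = quarterly_means(monthly)
years = list(range(2000, 2005))
q1_series = [quarters[(year, 1)] for year in years]
q1_slope = linear_trend(years, q1_series)   # degrees per year for Q1
```

A positive slope indicates a warming trend for that quarter at that site; a real analysis would also need a significance test before calling the trend increasing or decreasing.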
       
  • Data, Vol. 6, Pages 80: Machine-Learning-Based Prediction of Corrosion
           Behavior in Additively Manufactured Inconel 718

    • Authors: O. V. Mythreyi, M. Rohith Srinivaas, Tigga Amit Kumar, R. Jayaganthan
      First page: 80
      Abstract: This research work focuses on machine-learning-assisted prediction of the corrosion behavior of laser-powder-bed-fused (LPBF) and postprocessed Inconel 718. Corrosion testing data of these specimens were collected and fitted with the following machine learning algorithms: polynomial regression, support vector regression, decision tree, and extreme gradient boosting. The model performance, after hyperparameter optimization, was evaluated using a set of established metrics: R², mean absolute error, and root mean square error. Among the algorithms, the extreme gradient boosting algorithm performed best in predicting the corrosion behavior, closely followed by other algorithms. Feature importance analysis was executed in order to determine the postprocessing parameters that most strongly influenced the corrosion behavior in Inconel 718 manufactured by LPBF.
      Citation: Data
      PubDate: 2021-07-26
      DOI: 10.3390/data6080080
      Issue No: Vol. 6, No. 8 (2021)
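The three evaluation metrics named in the abstract above have standard definitions, which the following sketch spells out in plain Python. It is illustrative only: the function names are chosen for this example, and the observed and predicted values are made up rather than taken from the corrosion dataset.

```python
import math

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between observations and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def root_mean_square_error(y_true, y_pred):
    """Square root of the average squared difference."""
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Made-up observations and model predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.5, 4.0]

mae = mean_absolute_error(y_true, y_pred)
rmse = root_mean_square_error(y_true, y_pred)
r2 = r_squared(y_true, y_pred)
```

Lower MAE and RMSE and an R² closer to 1 indicate a better fit, which is the sense in which one algorithm can be said to outperform another on these metrics.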
       
  • Data, Vol. 6, Pages 81: Dataset of Gravity-Induced Landforms and Sinkholes
           of the Northeast Coast of Malta (Central Mediterranean Sea)

    • Authors: Stefano Devoto, Linley J. Hastewell, Mariacristina Prampolini, Stefano Furlani
      First page: 81
      Abstract: This study investigates gravity-induced landforms that populate the North-Eastern coast of Malta. Attention is focused on tens of persistent joints and thousands of boulders associated with deep-seated gravitational slope deformations (DGSDs), such as lateral spreads and block slides. Lateral spreads produce deep and long joints, which partially isolate limestone boulders along the edge of wide plateaus. These lateral spreads evolve into large block slides that detach thousands of limestone boulders from the cliffs and transport them towards the sea. These boulders are grouped in large slope-failure deposits surrounding limestone plateaus and cover downslope terrains. Gravity-induced joints (n = 124) and downslope boulders (n = 39,861) were identified and categorized using Google Earth (GE) images and later validated by field surveys. The datasets were digitized in QGIS and stored using ESRI shapefiles, which are common digital formats for storing vector GIS data. These types of landslides are characterized by slow-moving mechanisms, which evolve into destructive failures and present an elevated level of risk to coastal populations and infrastructure. Hundreds of blocks identified along the shore also provide evidence of sinkholes; for this reason, the paper also provides a catalogue of sinkholes. The outputs from this research can provide coastal managers with important information regarding the occurrence of coastal geohazards and represent a key resource for future landslide hazard assessment.
      Citation: Data
      PubDate: 2021-07-31
      DOI: 10.3390/data6080081
      Issue No: Vol. 6, No. 8 (2021)
       
  • Data, Vol. 6, Pages 82: Factors Influencing Business Analytics Solutions
           and Views on Business Problems

    • Authors: Martin Potančok, Jan Pour, Wui Ip
      First page: 82
      Abstract: The main aim of this paper is to identify and specify factors that influence business analytics. A factor in this context refers to any significant characteristic that defines the environment in which business analytics and business in general are conducted. Factors and their understanding are essential for the quality of final business analytics solutions, given their complexity and interconnectedness. Factors play an extremely important role in analytic thinking and business analysts’ skills and knowledge. These factors determine effective approaches and procedures for business analytics, and, in some cases, they also aid in the decision to delay a business analytics solution given a situation. This paper has used the case study method, a qualitative research method, due to the need to carry out investigation within the actual business (company) environment, in order to be able to fully understand and verify factors affecting analytics from the viewpoint of all stakeholders. This study provides a set of 15 factors from business, company, and market environments, including their importance in business analytics.
      Citation: Data
      PubDate: 2021-08-04
      DOI: 10.3390/data6080082
      Issue No: Vol. 6, No. 8 (2021)
       
  • Data, Vol. 6, Pages 83: A Global Book Reading Dataset

    • Authors: Nazanin Sabri, Ingmar Weber
      First page: 83
      Abstract: The choice of what to read is both influenced by and indicative of such factors as a person’s beliefs, culture, gender, and socioeconomic status. However, obtaining data including such personal attributes, as well as detailed reading habits and activities of individuals, is difficult and would usually require either (i) data from e-readers, such as the Amazon Kindle, or from library checkouts, both of which are hard to obtain, or (ii) distributing questionnaires and conducting interviews, which can be expensive and suffer from recall bias. In this study, we present a dataset of over 40 million reading instances of 1,872,677 unique individuals collected from Goodreads. Goodreads is a book-cataloging social media platform with millions of users, where users share comments on the books they have read, while creating and maintaining social connections. We enrich the dataset with gender and location information. The dataset presented in this study can be used to perform cross-national and cross-gender analyses of reading behavior among book enthusiasts.
      Citation: Data
      PubDate: 2021-08-04
      DOI: 10.3390/data6080083
      Issue No: Vol. 6, No. 8 (2021)
       
  • Data, Vol. 6, Pages 84: The Automatic Detection of Dataset Names in
           Scientific Articles

    • Authors: Jenny Heddes, Pim Meerdink, Miguel Pieters, Maarten Marx
      First page: 84
      Abstract: We study the task of recognizing named datasets in scientific articles as a Named Entity Recognition (NER) problem. Noticing that available annotated datasets were not adequate for our goals, we annotated 6000 sentences extracted from four major AI conferences, with roughly half of them containing one or more named datasets. A distinguishing feature of this set is the many sentences using enumerations, conjunctions and ellipses, resulting in long BI+ tag sequences. On all measures, the SciBERT NER tagger performed best and most robustly. Our baseline rule-based tagger performed remarkably well and better than several state-of-the-art methods. The gold standard dataset, with links and offsets from each sentence to the (open access available) articles together with the annotation guidelines and all code used in the experiments, is available on GitHub.
      Citation: Data
      PubDate: 2021-08-04
      DOI: 10.3390/data6080084
      Issue No: Vol. 6, No. 8 (2021)
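
      The "BI+ tag sequences" mentioned in the abstract can be illustrated with a minimal sketch, assuming a simple B/I/O scheme in which each dataset mention is one B tag followed by I tags. The function name, the sentence, and the span indices below are invented for illustration and are not taken from the authors' corpus or code.

```python
def bio_tags(tokens, spans):
    """Return one B/I/O tag per token.

    spans: list of (start, end) token indices, end exclusive,
    each marking one dataset mention (hypothetical input format).
    """
    tags = ["O"] * len(tokens)
    for start, end in spans:
        tags[start] = "B"              # first token of the mention
        for i in range(start + 1, end):
            tags[i] = "I"              # continuation tokens
    return tags


# Made-up example sentence with two dataset mentions.
tokens = "We evaluate on the Stanford Question Answering Dataset and ImageNet .".split()
spans = [(4, 8), (9, 10)]
example_tags = bio_tags(tokens, spans)
```

      Enumerations such as the one above produce several B/I runs inside a single sentence, which is the pattern the abstract describes as typical of the annotated set.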
       
  • Data, Vol. 6, Pages 85: VISEMURE: A Visual Analytics System for Making
           Sense of Multimorbidity Using Electronic Medical Record Data

    • Authors: Maede S. Nouri, Daniel J. Lizotte, Kamran Sedig, Sheikh S. Abdullah
      First page: 85
      Abstract: Multimorbidity is a growing healthcare problem, especially for aging populations. Traditional single disease-centric approaches are not suitable for multimorbidity, and a holistic framework is required for health research and for enhancing patient care. Patterns of multimorbidity within populations are complex and difficult to communicate with static visualization techniques such as tables and charts. We designed a visual analytics system called VISEMURE that facilitates making sense of data collected from patients with multimorbidity. With VISEMURE, users can interactively create different subsets of electronic medical record data to investigate multimorbidity within different subsets of patients with pre-existing chronic diseases. It also allows the creation of groups of patients based on age, gender, and socioeconomic status for investigation. VISEMURE can use a range of statistical and machine learning techniques and can integrate them seamlessly to compute prevalence and correlation estimates for selected diseases. It presents results using interactive visualizations to help healthcare researchers in making sense of multimorbidity. Using a case study, we demonstrate how VISEMURE can be used to explore the high-dimensional joint distribution of random variables that describes the multimorbidity present in a patient population.
      Citation: Data
      PubDate: 2021-08-04
      DOI: 10.3390/data6080085
      Issue No: Vol. 6, No. 8 (2021)
       
  • Data, Vol. 6, Pages 86: Contemporary Business Analytics: An Overview

    • Authors: Raghupathi, Raghupathi
      First page: 86
      Abstract: We examine the state of the art of the business analytics field by identifying and describing the four types of analytics and the three pillars of modeling. Further, we offer a framework of the interplay between the types of analytics and those pillars of modeling. The article describes the architectural framework and outlines an analytics methodology life cycle. Additionally, key contemporary design issues and challenges are highlighted. In this paper, we offer researchers and practitioners a contemporary overview of business analytics. As business analytics has emerged as a distinct discipline with the key objective to gain insight to make informed decisions, this state-of-the-art survey sheds light on recent developments in the business analytics discipline.
      Citation: Data
      PubDate: 2021-08-04
      DOI: 10.3390/data6080086
      Issue No: Vol. 6, No. 8 (2021)
       
  • Data, Vol. 6, Pages 87: A Dataset of Photos and Videos for Digital
           Forensics Analysis Using Machine Learning Processing

    • Authors: Sara Ferreira, Mário Antunes, Manuel E. Correia
      First page: 87
      Abstract: Deepfake and manipulated digital photos and videos are being increasingly used in a myriad of cybercrimes. Ransomware, the dissemination of fake news, and digital kidnapping-related crimes are the most recurrent, in which tampered multimedia content has been the primordial disseminating vehicle. Digital forensic analysis tools are being widely used by criminal investigations to automate the identification of digital evidence in seized electronic equipment. The number of files to be processed and the complexity of the crimes under analysis have highlighted the need to employ efficient digital forensics techniques grounded on state-of-the-art technologies. Machine Learning (ML) researchers have been challenged to apply techniques and methods to improve the automatic detection of manipulated multimedia content. However, the implementation of such methods has not yet been massively incorporated into digital forensic tools, mostly due to the lack of realistic and well-structured datasets of photos and videos. The diversity and richness of the datasets are crucial to benchmark the ML models and to evaluate their appropriateness to be applied in real-world digital forensics applications. An example is the development of third-party modules for the widely used Autopsy digital forensic application. This paper presents a dataset obtained by extracting a set of simple features from genuine and manipulated photos and videos, which are part of state-of-the-art existing datasets. The resulting dataset is balanced, and each entry comprises a label and a vector of numeric values corresponding to the features extracted through a Discrete Fourier Transform (DFT). The dataset is available in a GitHub repository, and the total amount of photos and video frames is 40,588 and 12,400, respectively. The dataset was validated and benchmarked with deep learning Convolutional Neural Networks (CNN) and Support Vector Machines (SVM) methods; however, a plethora of other existing ones can be applied. In general, the results show a better F1-score for CNN than for SVM, for both photo and video processing. CNN achieved an F1-score of 0.9968 and 0.8415 for photos and videos, respectively. Regarding SVM, the results obtained with 5-fold cross-validation are 0.9953 and 0.7955, respectively, for photo and video processing. A set of methods written in Python is available for the researchers, namely to preprocess and extract the features from the original photo and video files and to build the training and testing sets. Additional methods are also available to convert the original PKL files into CSV and TXT, which gives more flexibility for the ML researchers to use the dataset on existing ML frameworks and tools.
      Citation: Data
      PubDate: 2021-08-05
      DOI: 10.3390/data6080087
      Issue No: Vol. 6, No. 8 (2021)
       
  • Data, Vol. 6, Pages 88: VHR-REA_IT Dataset: Very High Resolution Dynamical
           Downscaling of ERA5 Reanalysis over Italy by COSMO-CLM

    • Authors: Mario Raffa, Alfredo Reder, Gian Franco Marras, Marco Mancini, Gabriella Scipione, Monia Santini, Paola Mercogliano
      First page: 88
      Abstract: This work presents a new dataset for recent climate developed within the Highlander project by dynamically downscaling ERA5 reanalysis, originally available at ≃31 km horizontal resolution, to ≃2.2 km resolution (i.e., convection permitting scale). Dynamical downscaling was conducted through the COSMO Regional Climate Model (RCM). The temporal resolution of output is hourly (like for ERA5). Runs cover the whole Italian territory (and neighboring areas according to the necessary computation boundary) to provide a very detailed (in terms of space–time resolution) and comprehensive (in terms of meteorological fields) dataset of climatological data for at least the last 30 years (01/1989-12/2020). These types of datasets can be used for (applied) research and downstream services (e.g., for decision support systems).
      Citation: Data
      PubDate: 2021-08-09
      DOI: 10.3390/data6080088
      Issue No: Vol. 6, No. 8 (2021)
       
  • Data, Vol. 6, Pages 89: Geodatabase of Publicly Available Information
           about Czech Municipalities’ Local Administration

    • Authors: Vít Pászto, Jiří Pánek, Jaroslav Burian
      First page: 89
      Abstract: In this data description, we introduce a unique (geo)dataset with publicly available information about the municipalities focused on (geo)participatory aspects of local administration. The dataset comprises 6258 Czech municipalities linked with their respective administrative boundaries. In total, 55 attributes were prepared for each municipality. We also describe the process of data collection, processing, verification, and publication as open data. The uniqueness of the dataset is that such a complex dataset regarding geographical coverage with a high level of detail (municipalities) has never been collected in Czechia before. Besides, it could be applied in various research agendas in public participation and local administration and used thematically using selected indicators from various participation domains. The dataset is available freely in the Esri geodatabase, geospatial services using API (REST, GeoJSON), and other common non-spatial formats (MS Excel and CSV).
      Citation: Data
      PubDate: 2021-08-10
      DOI: 10.3390/data6080089
      Issue No: Vol. 6, No. 8 (2021)
       
  • Data, Vol. 6, Pages 90: Canadian Dental Patients with a Single-Unit
           Implant-Supported Restoration in the Aesthetic Region of the Mouth:
           Qualitative and Quantitative Patient-Reported Outcome Measures (PROMs)

    • Authors: Afrashtehfar, Igarashi, Bryant
      First page: 90
      Abstract: This article contains quantitative and qualitative patient-reported outcome measures (PROMs) collected from nine dental patients, with a single-implant in the maxillary anterior region of the mouth, recruited after obtaining consent documents. The quantitative data were obtained from participants’ demographics, frontal extraoral digital photographs, intraoral scans (IOS) of the maxillary arch, and self-administered questionnaires (where patients judged the overall, appearance, function, and comfort of their single-implant-supported crowns). Objective single-implant aesthetic index mean scores (Pink Esthetic Score/White Esthetic Score [PES/WES]) were obtained after two experienced calibrated clinicians analyzed the photographs and the three-dimensional models generated from the IOS. The self-administered questionnaires used a visual analogue scale (VAS) to obtain the patients’ subjective perceptions. The qualitative data were obtained from in-depth, semi-structured one-to-one interviews. The transcriptions from audio-recorded interview data were managed and coded, with the aid of a Computer-Assisted Qualitative Data Analysis Software (CAQDAS). These data were stored in a public repository that can be easily downloaded from a Mendeley data repository (DOI: 10.17632/sv8t6tkvjv.1).
      Citation: Data
      PubDate: 2021-08-11
      Issue No: Vol. 6, No. 8 (2021)
       
  • Data, Vol. 6, Pages 91: NagareDB: A Resource-Efficient Document-Oriented
           Time-Series Database

    • Authors: Carlos Garcia Calatrava, Yolanda Becerra Fontal, Fernando M. Cucchietti, Carla Diví Cuesta
      First page: 91
      Abstract: Recent technological advances have led to a broad proliferation of Monitoring Infrastructures, which typically keep track of specific assets over time, ranging from factory machinery and device location to people. Gathering these data has become crucial for a wide range of applications, like exploration dashboards or Machine Learning techniques, such as Anomaly Detection. Time-Series Databases, designed to handle these data, grew in popularity, becoming the fastest-growing database type from 2019. As a consequence, keeping track of and mastering those rapidly evolving technologies became increasingly difficult. This paper introduces the holistic design approach followed for building NagareDB, a Time-Series database built on top of MongoDB, the most popular NoSQL Database, which is typically discouraged in the Time-Series scenario. The goal of NagareDB is to ease access to three of the essential resources needed to build time-dependent systems: Hardware, since it is able to work on commodity machines; Software, as it is built on top of an open-source solution; and Expert Personnel, as its foundation database is considered the most popular NoSQL DB, lowering its learning curve. Concretely, NagareDB is able to outperform MongoDB’s recommended implementation by up to 4.7 times when retrieving data, while also offering stream ingestion up to 35% faster than InfluxDB, the most popular Time-Series database. Moreover, by relaxing some requirements, NagareDB is able to reduce disk space usage by up to 40%.
      Citation: Data
      PubDate: 2021-08-13
      DOI: 10.3390/data6080091
      Issue No: Vol. 6, No. 8 (2021)
       
  • Data, Vol. 6, Pages 92: Country-Specific Interests towards Fall Detection
           from 2004–2021: An Open Access Dataset and Research Questions

    • Authors: Nirmalya Thakur, Chia Y. Han
      First page: 92
      Abstract: Falls, which are increasing at an unprecedented rate in the global elderly population, are associated with a multitude of needs such as healthcare, medical, caregiver, and economic, and they are posing various forms of burden on different countries across the world, specifically in the low- and middle-income countries. For these respective countries to anticipate, respond, address, and remedy these diverse needs either by using their existing resources, or by developing new policies and initiatives, or by seeking support from other countries or international organizations dedicated to global public health, the timely identification of these needs and their associated trends is highly necessary. This paper addresses this challenge by presenting a study that uses the potential of the modern Internet of Everything lifestyle, where relevant Google Search data originating from different geographic regions can be interpreted to understand the underlying region-specific user interests towards a specific topic, which further demonstrates the public health need towards the same. The scientific contributions of this study are two-fold. First, it presents an open-access dataset that consists of the user interests towards fall detection for all the 193 countries of the world studied from 2004–2021. In the dataset, the user interest data is available for each month for all these countries in this time range. Second, based on the analysis of potential and emerging research directions in the interrelated fields of Big Data, Data Mining, Information Retrieval, Natural Language Processing, Data Science, and Pattern Recognition, in the context of fall detection research, this paper presents 22 research questions that may be studied, evaluated, and investigated by researchers using this dataset.
      Citation: Data
      PubDate: 2021-08-15
      DOI: 10.3390/data6080092
      Issue No: Vol. 6, No. 8 (2021)
       
  • Data, Vol. 6, Pages 93: A Sustainable Method for Publishing Interoperable
           Open Data on the Web

    • Authors: Raf Buyle, Brecht Van de Vyvere, Julián Rojas Meléndez, Dwight Van Lancker, Eveline Vlassenroot, Mathias Van Compernolle, Stefan Lefever, Pieter Colpaert, Peter Mechant, Erik Mannens
      First page: 93
      Abstract: Smart cities need (sensor) data for better decision-making. However, while there are vast amounts of data available about and from cities, an intermediary is needed that connects and interprets (sensor) data on a Web-scale. Today, governments in Europe are struggling to publish open data in a sustainable, predictable and cost-effective way. Our research question considers what methods for publishing Linked Open Data time series, in particular air quality data, are suitable in a sustainable and cost-effective way. Furthermore, we demonstrate the cross-domain applicability of our data publishing approach through a different use case on railway infrastructure—Linked Open Data. Based on scenarios co-created with various governmental stakeholders, we researched methods to promote data interoperability, scalability and flexibility. The results show that applying a Linked Data Fragments-based approach on public endpoints for air quality and railway infrastructure data, lowers the cost of publishing and increases availability due to better Web caching strategies.
      Citation: Data
      PubDate: 2021-08-19
      DOI: 10.3390/data6080093
      Issue No: Vol. 6, No. 8 (2021)
       
  • Data, Vol. 6, Pages 94: Do the European Data Portal Datasets in the
           Categories Government and Public Sector, Transport, and Education, Culture
           and Sport Meet the Data on the Web Best Practices?

    • Authors: Andrade, Cunha, Figueiredo, Baptista
      First page: 94
      Abstract: (1) Background: The European Data Portal is one of the worldwide initiatives that aggregates and makes open data available. (2) Methods: This is a case study with a qualitative approach that aims to determine to what extent the datasets from the Government and Public Sector, Transport, and Education, Culture and Sport categories published on the portal meet the Data on the Web Best Practices (W3C). With the datasets sorted by last modified and filtered by the ratings Excellent and Good+, we analyzed 50 different datasets from each category. (3) Results: The analysis revealed that the Government and Transport categories have the best-rated datasets, followed by Transportation and, lastly, Education. (4) Conclusions: This analysis revealed that the Government and Transport categories have the best-rated datasets and Education the least. The most observed BPs were: BP1, BP2, BP4, BP5, BP10, BP11, BP12, BP13C, BP16, BP17, BP19, BP29, and BP34, while the least observed were: BP3, BP7H, BP7C, BP13H, BP14, BP15, BP21, BP32, and BP35. These results fill a gap in the literature on the quality of the data made available by this portal and provide insights for European data managers on which best practices are most observed and which ones need more attention.
      Citation: Data
      PubDate: 2021-08-19
      DOI: 10.3390/data6080094
      Issue No: Vol. 6, No. 8 (2021)
       
  • Data, Vol. 6, Pages 69: Transitioning to Society 5.0 in Africa: Tools to
           Support ICT Infrastructure Sharing

    • Authors: Kennedy Nomamidobo Amadasun, Michael Short, Rajesh Shankar-Priya, Tracey Crosbie
      First page: 69
      Abstract: Society 5.0 represents an opportunity to transform the economy and create a digital society with the goal of long-term sustainable development and economic growth. There is a growing importance of boosting ICT as an effective and efficient means of achieving this transformation, and Target 9c of the UN Sustainable Development Goals is to ‘Significantly increase access to information and communications technology and strive to provide universal and affordable access to the Internet in least developed countries’. Mobile telecommunication systems have become the most effective and convenient means of communicating in the world, and as such, they are revolutionizing business operations. Nigeria is the fastest growing telecommunication market in Africa, with approximately 298 million subscribers accommodated by over 53,000 base transceiver stations (BTSs) which are largely concentrated in urban areas. As a result of increasing subscribers, all mobile network service providers in Nigeria are building new BTSs, often without considering existing infrastructure. This has led to a proliferation of masts, defacing the environment and causing unnecessary environmental pollution as BTSs are largely powered by diesel generators. It is therefore becoming paramount for the telecommunication regulatory body in Nigeria to enforce principles of infrastructure sharing and the colocation of sites for all mobile network service provider BTSs to improve network availability, reliability, scalability, customer satisfaction and sustainability. This paper argues, through the development of ICT tools and their application to a case study, that infrastructure sharing and colocation of sites is not only feasible if supported correctly but also offers the potential to reduce operational and capital expenditure, reduce the number of BTSs required for the rapidly growing mobile telecoms industry in Nigeria and in doing so reduce environmental pollution.
      Citation: Data
      PubDate: 2021-06-25
      DOI: 10.3390/data6070069
      Issue No: Vol. 6, No. 7 (2021)
       
  • Data, Vol. 6, Pages 70: An AI-Enabled Approach in Analyzing Media Data: An
           Example from Data on COVID-19 News Coverage in Vietnam

    • Authors: Vuong, La, Nguyen, Nguyen, Le, Ho
      First page: 70
      Abstract: This method article presents the nuts and bolts of an AI-enabled approach to extracting and analyzing social media data. The method is based on our previous rapidly cited COVID-19 research publication, working on a dataset of more than 14,000 news articles from Vietnamese newspapers, to provide a comprehensive picture of how Vietnam has been responding to this unprecedented pandemic. This same method is behind our IUCN-supported research regarding the social aspects of environmental protection missions, now appearing in print in Wiley’s Corporate Social Responsibility and Environmental Management. Homemade AI-enabled software was the backbone of the study. The software has provided a fast and automatic approach in collecting and analyzing social data. Moreover, the tool also allows manually sorting the data, AI-generated word tokenizing in the Vietnamese language, and powerful visualization. The method hopes to provide an effective but low-cost method for social scientists to gather a massive amount of data and analyze them in a short amount of time.
      Citation: Data
      PubDate: 2021-06-25
      DOI: 10.3390/data6070070
      Issue No: Vol. 6, No. 7 (2021)
       
  • Data, Vol. 6, Pages 71: An Annotated Corpus of Crime-Related Portuguese
           Documents for NLP and Machine Learning Processing

    • Authors: Gonçalo Carnaz, Mário Antunes, Vitor Beires Nogueira
      First page: 71
      Abstract: Criminal investigations collect and analyze the facts related to a crime, from which the investigators can deduce evidence to be used in court. It is a multidisciplinary and applied science, which includes interviews, interrogations, evidence collection, preservation of the chain of custody, and other methods and techniques of investigation. These techniques produce both digital and paper documents that have to be carefully analyzed to identify correlations and interactions among suspects, places, license plates, and other entities that are mentioned in the investigation. The computerized processing of these documents is a helping hand to the criminal investigation, as it allows the automatic identification of entities and their relations, some of which are difficult to identify manually. There exists a wide set of dedicated tools, but they have a major limitation: they are unable to process criminal reports in the Portuguese language, as an annotated corpus for that purpose does not exist. This paper presents an annotated corpus, composed of a collection of anonymized crime-related documents, which were extracted from official and open sources. The dataset was produced as the result of an exploratory initiative to collect crime-related data from websites and conditioned-access police reports. The dataset was evaluated and a mean precision of 0.808, recall of 0.722, and F1-score of 0.733 were obtained with the classification of the annotated named-entities present in the crime-related documents. This corpus can be employed to benchmark Machine Learning (ML) and Natural Language Processing (NLP) methods and tools to detect and correlate entities in the documents. Some examples are sentence detection, named-entity recognition, and identification of terms related to the criminal domain.
      Citation: Data
      PubDate: 2021-06-26
      DOI: 10.3390/data6070071
      Issue No: Vol. 6, No. 7 (2021)
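
      The reported mean F1-score (0.733) is lower than the harmonic mean of the reported mean precision and recall. One possible reading, which the abstract does not confirm, is that all three figures are averages over entity classes, so the F1-score is a mean of per-class F1 values. The sketch below shows the arithmetic; the per-class values are hypothetical and chosen only to illustrate the effect.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract.
reported_precision, reported_recall, reported_f1 = 0.808, 0.722, 0.733

# F1 computed from the two reported means (not the reported F1 itself).
f1_of_means = f1(reported_precision, reported_recall)

# Hypothetical per-class (precision, recall) pairs, for illustration only.
per_class = [(0.9, 0.9), (0.7, 0.5)]
macro_f1 = sum(f1(p, r) for p, r in per_class) / len(per_class)
mean_p = sum(p for p, _ in per_class) / len(per_class)
mean_r = sum(r for _, r in per_class) / len(per_class)
f1_of_class_means = f1(mean_p, mean_r)
```

      In the toy data the macro-averaged F1 is below the F1 of the averaged precision and recall, the same direction as the gap between 0.733 and the value computed from 0.808 and 0.722.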
       
  • Data, Vol. 6, Pages 72: BROAD—A Benchmark for Robust Inertial
           Orientation Estimation

    • Authors: Daniel Laidig, Marco Caruso, Andrea Cereatti, Thomas Seel
      First page: 72
      Abstract: Inertial measurement units (IMUs) enable orientation, velocity, and position estimation in several application domains ranging from robotics and autonomous vehicles to human motion capture and rehabilitation engineering. Errors in orientation estimation greatly affect any of those motion parameters. The present work explains the main challenges in inertial orientation estimation (IOE) and presents an extensive benchmark dataset that includes 3D inertial and magnetic data with synchronized optical marker-based ground truth measurements, the Berlin Robust Orientation Estimation Assessment Dataset (BROAD). The BROAD dataset consists of 39 trials that are conducted at different speeds and include various types of movement. Thereof, 23 trials are performed in an undisturbed indoor environment, and 16 trials are recorded with deliberate magnetometer and accelerometer disturbances. We furthermore propose error metrics that allow for IOE accuracy evaluation while separating the heading and inclination portions of the error and introduce well-defined benchmark metrics. Based on the proposed benchmark, we perform an exemplary case study on two widely used openly available IOE algorithms. Due to the broad range of motion and disturbance scenarios, the proposed benchmark is expected to provide valuable insight and useful tools for the assessment, selection, and further development of inertial sensor fusion methods and IMU-based application systems.
      Citation: Data
      PubDate: 2021-06-27
      DOI: 10.3390/data6070072
      Issue No: Vol. 6, No. 7 (2021)
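
      A minimal sketch of how an orientation error might be split into heading and inclination portions, assuming unit quaternions in (w, x, y, z) order that rotate from the sensor frame into a global frame with a vertical z-axis. It uses a swing-twist decomposition about the vertical axis; this is an illustrative assumption and not necessarily the exact metric defined in the paper. All function names are hypothetical.

```python
import math


def quat_multiply(a, b):
    """Hamilton product of two quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def quat_conjugate(q):
    w, x, y, z = q
    return (w, -x, -y, -z)


def axis_angle(axis, angle_deg):
    """Unit quaternion for a rotation of angle_deg about a unit axis."""
    half = math.radians(angle_deg) / 2.0
    s = math.sin(half)
    return (math.cos(half), axis[0] * s, axis[1] * s, axis[2] * s)


def orientation_errors(q_est, q_ref):
    """Return (total, heading, inclination) errors in degrees."""
    # Error rotation expressed in the global frame.
    w, x, y, z = quat_multiply(q_est, quat_conjugate(q_ref))
    total = 2.0 * math.acos(min(1.0, abs(w)))
    # Twist about the vertical axis: the heading portion.
    heading = 2.0 * math.atan2(abs(z), abs(w))
    # Remaining swing away from vertical: the inclination portion.
    inclination = 2.0 * math.acos(min(1.0, math.hypot(w, z)))
    return tuple(math.degrees(v) for v in (total, heading, inclination))


# Estimate differs from the reference by a pure 30 degree yaw.
q_ref = axis_angle((1.0, 0.0, 0.0), 90.0)
q_est = quat_multiply(axis_angle((0.0, 0.0, 1.0), 30.0), q_ref)
yaw_total, yaw_heading, yaw_inclination = orientation_errors(q_est, q_ref)
```

      With this split, a magnetometer disturbance would be expected to show up mainly in the heading value, while accelerometer disturbances would mainly affect inclination.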
       
  • Data, Vol. 6, Pages 73: A Robust Distributed Clustering of Large Data Sets
           on a Grid of Commodity Machines

    • Authors: Salah Taamneh, Mo’taz Al-Hami, Hani Bani-Salameh, Alaa E. Abdallah
      First page: 73
      Abstract: Distributed clustering algorithms have proven to be effective in dramatically reducing execution time. However, distributed environments are characterized by a high rate of failure. Nodes can easily become unreachable. Furthermore, it is not guaranteed that messages are delivered to their destination. As a result, fault tolerance mechanisms are of paramount importance to achieve resiliency and guarantee continuous progress. In this paper, a fault-tolerant distributed k-means algorithm is proposed on a grid of commodity machines. Machines in such an environment are connected in a peer-to-peer fashion and managed by a gossip protocol with the actor model used as the concurrency model. The fact that no synchronization is needed makes it a good fit for parallel processing. Using the passive replication technique for the leader node and the active replication technique for the workers, the system exhibited robustness against failures. The results showed that the distributed k-means algorithm with no fault-tolerant mechanisms achieved up to a 34% improvement over the Hadoop-based k-means algorithm, while the robust one achieved up to a 12% improvement. The experiments also showed that the overhead, using such techniques, was negligible. Moreover, the results indicated that losing up to 10% of the messages had no real impact on the overall performance.
      Citation: Data
      PubDate: 2021-07-07
      DOI: 10.3390/data6070073
      Issue No: Vol. 6, No. 7 (2021)
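
      The division of labor in a distributed k-means iteration can be sketched as follows: each worker reduces its shard to per-cluster sums and counts, and a leader merges those partial results into new centroids. This single-process sketch only illustrates that data flow; it does not model the gossip protocol, actor model, or replication described in the abstract, and all names are hypothetical.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda k: sum((p - c) ** 2 for p, c in zip(point, centroids[k])),
    )


def worker_step(shard, centroids):
    """Summarize one shard as per-cluster coordinate sums and point counts."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for point in shard:
        k = nearest(point, centroids)
        counts[k] += 1
        for d, value in enumerate(point):
            sums[k][d] += value
    return sums, counts


def leader_merge(partials, centroids):
    """Combine worker summaries into updated centroids."""
    dim = len(centroids[0])
    updated = []
    for k, old in enumerate(centroids):
        total = sum(counts[k] for _, counts in partials)
        if total == 0:
            updated.append(list(old))  # keep a centroid that received no points
        else:
            updated.append(
                [sum(sums[k][d] for sums, _ in partials) / total for d in range(dim)]
            )
    return updated


def distributed_kmeans(shards, centroids, iterations=10):
    for _ in range(iterations):
        partials = [worker_step(shard, centroids) for shard in shards]
        centroids = leader_merge(partials, centroids)
    return centroids


shards = [[(0, 0), (0, 2)], [(10, 10), (10, 12)]]
final_centroids = distributed_kmeans(shards, [(1, 1), (9, 9)])
```

      Because the leader only needs sums and counts, a lost worker message removes that shard's contribution for one iteration rather than stalling the computation, which gives some intuition for why limited message loss might have little effect on the result.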
       
  • Data, Vol. 6, Pages 74: Performing Learning Analytics via Generalised
           Mixed-Effects Trees

    • Authors: Luca Fontana, Chiara Masci, Francesca Ieva, Anna Maria Paganoni
      First page: 74
      Abstract: Nowadays, the importance of educational data mining and learning analytics in higher education institutions is being recognised. The analysis of university careers and of student dropout prediction is one of the most studied topics in the area of learning analytics. From the perspective of estimating the likelihood of a student dropping out, we propose an innovative statistical method that is a generalisation of mixed-effects trees for a response variable in the exponential family: generalised mixed-effects trees (GMET). We performed a simulation study in order to validate the performance of our proposed method and to compare GMET to classical models. In the case study, we applied GMET to model undergraduate student dropout in different courses at Politecnico di Milano. The model was able to identify discriminating student characteristics and estimate the effect of each degree-based course on the probability of student dropout.
      Citation: Data
      PubDate: 2021-07-09
      DOI: 10.3390/data6070074
      Issue No: Vol. 6, No. 7 (2021)
       
  • Data, Vol. 6, Pages 75: Preprocessing of Public RNA-Sequencing Datasets to
           Facilitate Downstream Analyses of Human Diseases

    • Authors: Naomi Rapier-Sharman, John Krapohl, Ethan J. Beausoleil, Kennedy T. L. Gifford, Benjamin R. Hinatsu, Curtis S. Hoffmann, Makayla Komer, Tiana M. Scott, Brett E. Pickett
      First page: 75
      Abstract: Publicly available RNA-sequencing (RNA-seq) data are a rich resource for elucidating the mechanisms of human disease; however, preprocessing these data requires considerable bioinformatic expertise and computational infrastructure. Analyzing multiple datasets with a consistent computational workflow increases the accuracy of downstream meta-analyses. This collection of datasets represents the human intracellular transcriptional response to disorders and diseases such as acute lymphoblastic leukemia (ALL), B-cell lymphomas, chronic obstructive pulmonary disease (COPD), colorectal cancer, lupus erythematosus; as well as infection with pathogens including Borrelia burgdorferi, hantavirus, influenza A virus, Middle East respiratory syndrome coronavirus (MERS-CoV), Streptococcus pneumoniae, respiratory syncytial virus (RSV), severe acute respiratory syndrome coronavirus (SARS-CoV), and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). We calculated the statistically significant differentially expressed genes and Gene Ontology terms for all datasets. In addition, a subset of the datasets also includes results from splice variant analyses, intracellular signaling pathway enrichments as well as read mapping and quantification. All analyses were performed using well-established algorithms and are provided to facilitate future data mining activities, wet lab studies, and to accelerate collaboration and discovery.
      Citation: Data
      PubDate: 2021-07-15
      DOI: 10.3390/data6070075
      Issue No: Vol. 6, No. 7 (2021)
       
  • Data, Vol. 6, Pages 76: Impact of COVID-19 on Electricity Demand: Deriving
           Minimum States of System Health for Studies on Resilience

    • Authors: Smruti Manjunath, Madhura Yeligeti, Maria Fyta, Jannik Haas, Hans-Christian Gils
      First page: 76
      Abstract: To assess the resilience of energy systems, i.e., the ability to recover after an unexpected shock, the system’s minimum state of service is a key input. Quantitative descriptions of such states are inherently elusive. The measures adopted by governments to contain COVID-19 have provided empirical data, which may serve as a proxy for such states of minimum service. Here, we systematize the impact of the adopted COVID-19 measures on the electricity demand. We classify the measures into three phases of increasing stringency, ranging from working from home to soft and full lockdowns, for four major electricity consuming countries of Europe. We use readily accessible data from the European Network of Transmission System Operators for Electricity as a basis. For each country and phase, we derive representative daily load profiles with hourly resolution obtained by k-medoids clustering. The analysis could unravel the influence of the different measures on energy consumption and the differences among the four countries. It is observed that the daily peak load is considerably flattened and the total electricity consumption decreases by up to 30% under the circumstances brought about by the COVID-19 restrictions. These demand profiles are useful for the energy planning community, especially when designing future electricity systems with a focus on system resilience and a more digitalised society in terms of working from home.
      Citation: Data
      PubDate: 2021-07-16
      DOI: 10.3390/data6070076
      Issue No: Vol. 6, No. 7 (2021)
       
  • Data, Vol. 6, Pages 77: Dealing with Randomness and Concept Drift in Large
           Datasets

    • Authors: Kassim S. Mwitondi, Raed A. Said
      First page: 77
      Abstract: Data-driven solutions to societal challenges continue to bring new dimensions to our daily lives. For example, while good-quality education is a well-acknowledged foundation of sustainable development, innovation and creativity, variations in student attainment and general performance remain commonplace. Developing data-driven solutions hinges on two fronts: technical and application. The former relates to the modelling perspective, where two of the major challenges are the impact of data randomness and general variations in definitions, typically referred to as concept drift in machine learning. The latter relates to devising data-driven solutions to address real-life challenges such as identifying potential triggers of pedagogical performance, which aligns with the Sustainable Development Goal (SDG) #4, Quality Education. A total of 3145 pedagogical data points were obtained from the central data collection platform for the United Arab Emirates (UAE) Ministry of Education (MoE). Using simple data visualisation and machine learning techniques via a generic algorithm for sampling, measuring and assessing, the paper highlights research pathways for educationists and data scientists to attain unified goals in an interdisciplinary context. Its novelty derives from embedded capacity to address data randomness and concept drift by minimising modelling variations and yielding consistent results across samples. Results show that intricate relationships among data attributes describe the invariant conditions that practitioners in the two overlapping fields of data science and education must identify.
      Citation: Data
      PubDate: 2021-07-19
      DOI: 10.3390/data6070077
      Issue No: Vol. 6, No. 7 (2021)
       
  • Data, Vol. 6, Pages 78: Multi-Layout Invoice Document Dataset (MIDD): A
           Dataset for Named Entity Recognition

    • Authors: Dipali Baviskar, Swati Ahirrao, Ketan Kotecha
      First page: 78
      Abstract: The day-to-day working of an organization produces a massive volume of unstructured data in the form of invoices, legal contracts, mortgage processing forms, and many more. Organizations can utilize the insights concealed in such unstructured documents for their operational benefit. However, analyzing and extracting insights from such numerous and complex unstructured documents is a tedious task. Hence, the research in this area is encouraging the development of novel frameworks and tools that can automate the key information extraction from unstructured documents. However, the availability of standard, best-quality, and annotated unstructured document datasets is a serious challenge for accomplishing the goal of extracting key information from unstructured documents. This work expedites the researcher’s task by providing a high-quality, highly diverse, multi-layout, and annotated invoice documents dataset for extracting key information from unstructured documents. Researchers can use the proposed dataset for layout-independent unstructured invoice document processing and to develop an artificial intelligence (AI)-based tool to identify and extract named entities in the invoice documents. Our dataset includes 630 invoice document PDFs with four different layouts collected from diverse suppliers. As far as we know, our invoice dataset is the only openly available dataset comprising high-quality, highly diverse, multi-layout, and annotated invoice documents.
      Citation: Data
      PubDate: 2021-07-20
      DOI: 10.3390/data6070078
      Issue No: Vol. 6, No. 7 (2021)
       
  • Data, Vol. 6, Pages 54: Data on the Quantification of Aspartate, GABA and
           Glutamine Levels in the Spinal Cord of Larval Sea Lampreys after a
           Complete Spinal Cord Injury

    • Authors: Blanca Fernández-López, Natividad Pereiro, Anunciación Lafuente, María Celina Rodicio, Antón Barreiro-Iglesias
      First page: 54
      Abstract: We used high-performance liquid chromatography (HPLC) methods to quantify aspartate, GABA, and glutamine levels in the spinal cord of larval sea lampreys following a complete spinal cord injury. Mature larval sea lampreys recover spontaneously from a complete spinal cord transection and the changes in neurotransmitter systems after spinal cord injury might be related to their amazing regenerative capabilities. The data presented here show the concentration of the aminoacidergic neurotransmitters GABA (and its precursor glutamine) and aspartate in the spinal cord of control (non-injured) and 2-, 4-, and 10-week post-lesion animals. Statistical analyses showed that GABA and aspartate levels significantly increase in the spinal cord four weeks after a complete spinal cord injury and that glutamine levels decrease 10 weeks after injury as compared to controls. These data might be of interest to those studying the role of neurotransmitters and neuromodulators in recovery from spinal cord injury in vertebrates.
      Citation: Data
      PubDate: 2021-05-24
      DOI: 10.3390/data6060054
      Issue No: Vol. 6, No. 6 (2021)
       
  • Data, Vol. 6, Pages 55: Machine Learning-Based Algorithms to Knowledge
           Extraction from Time Series Data: A Review

    • Authors: Giuseppe Ciaburro, Gino Iannace
      First page: 55
      Abstract: To predict the future behavior of a system, we can exploit the information collected in the past, trying to identify recurring structures in what happened to predict what could happen, if the same structures repeat themselves in the future as well. A time series represents a time sequence of numerical values observed in the past for a measurable variable. The values are sampled at equidistant time intervals, according to an appropriate granular frequency, such as the day, week, or month, and measured according to physical units of measurement. In machine learning-based algorithms, the information underlying the knowledge is extracted from the data themselves, which are explored and analyzed in search of recurring patterns or to discover hidden causal associations or relationships. The prediction model extracts knowledge through an inductive process: the input is the data and, possibly, a first example of the expected output; the machine will then learn the algorithm to follow to obtain the same result. This paper reviews the most recent work that has used machine learning-based techniques to extract knowledge from time series data.
      Citation: Data
      PubDate: 2021-05-25
      DOI: 10.3390/data6060055
      Issue No: Vol. 6, No. 6 (2021)
       
  • Data, Vol. 6, Pages 56: Automation of Work Processes and Night Work

    • Authors: Urška Kosem, Mirko Markič, Annmarie Gorenc Zoran
      First page: 56
      Abstract: Background: Automation of production processes is not just a simple replacement of a person in production, but it should lead to the success of an organization and contribute to the sustainable development of society and the natural environment. The aim of our study was to find out whether the level of automation of production processes affects the proportion of night work hours of production workers and whether employers are willing to automate production processes to achieve a lower number of night work hours. Methods: We used a quantitative approach to collect primary data through the survey method. The questionnaire was completed by 502 large and medium-sized manufacturing companies in Slovenia. Results: We found no statistically significant correlation between the level of automation of production processes and the percentage of night work hours of production workers. We also found that the reduction of the proportion of night work does not appear to be the main motivator for the introduction of automation of production processes. Conclusions: Based on the results, we rejected the assumption that automation of production processes has a direct impact on the proportion of night work. Moreover, our study will benefit all those who are concerned with the automation of production processes and night work.
      Citation: Data
      PubDate: 2021-05-26
      DOI: 10.3390/data6060056
      Issue No: Vol. 6, No. 6 (2021)
       
  • Data, Vol. 6, Pages 57: Dataset for the Solar Incident Radiation and
           Electricity Production BIPV/BAPV System on the Northern/Southern Façade
           in Dense Urban Areas

    • Authors: Hassan Gholami, Harald Nils Røstvik
      First page: 57
      Abstract: The prosperous implementation of Building Integrated Photovoltaics (BIPV), as well as Building Attached Photovoltaics (BAPV), needs an accurate and detailed assessment of the potential of solar irradiation and electricity production of various commercialised technologies in different orientations on the outer skins of the building. This article presents a dataset for the solar incident radiation and electricity production of PV systems in the north and south orientations in a dense urban area (in the northern hemisphere). The solar incident radiation and the electricity production of two back-to-back PV panels with a ten-centimetre gap for one year are monitored and logged as primary data sources. Using Microsoft Excel, both panels’ efficiency is also presented as a secondary source of data. The implemented PV panels are composed of polycrystalline silicon cells with an efficiency of 16.9%. The results showed that the actual efficiency of the south-facing panel (13–15%) is always closer to the standard efficiency of the panel compared to the actual efficiency of the north-facing panel (8–12%). Moreover, although the efficiency of the south-facing panel on sunny days of the year is almost constant, the efficiency of the north-facing panel decreases significantly in winter. This phenomenon might be linked to the spectral response of the polycrystalline silicon cells and the different incident solar radiation spectra on the panels. While the monitored data cover the radiation and system electricity production in various air conditions, the analysis is mainly conducted for sunny days, and more investigation is needed to analyse the system performance in other weather conditions (like cloudy and overcast skies). The presented database could be used to analyse the performance of polycrystalline silicon PV panels and their operational efficiency in a dense urban area and for different orientations.
      Citation: Data
      PubDate: 2021-05-26
      DOI: 10.3390/data6060057
      Issue No: Vol. 6, No. 6 (2021)
       
  • Data, Vol. 6, Pages 58: A Large-Scale Dataset of Barley, Maize and Sorghum
           Variety Identification Using DNA Fingerprinting in Ethiopia

    • Authors: Frederic Kosmowski, Alemayehu Ambel, Asmelash Tsegay, Alemayehu Negawo, Jason Carling, Andrzej Kilian, The Central Statistics Agency
      First page: 58
      Abstract: The data described in this paper were part of a large-scale nationally representative household survey, the Ethiopian Socioeconomic Survey (ESS 2018/19). Grain samples of barley, maize and sorghum were collected in six regions in Ethiopia. Variety identification was assessed by matching samples to a reference library composed of released improved materials, using approximately 50,000 markers from DArTseq platforms. These data were part of a study documenting the reach of CGIAR-related germplasms in Ethiopia. These objective measures of crop varietal adoption, unique in the public domain, can be analyzed along with a large set of variables related to agro-ecologies, household characteristics and plot management practices, available in the Ethiopian Socioeconomic Survey 2018/19.
      Citation: Data
      PubDate: 2021-06-03
      DOI: 10.3390/data6060058
      Issue No: Vol. 6, No. 6 (2021)
       
  • Data, Vol. 6, Pages 59: APIs for EU Governments: A Landscape Analysis on
           Policy Instruments, Standards, Strategies and Best Practices

    • Authors: Lorenzino Vaccari, Monica Posada, Mark Boyd, Mattia Santoro
      First page: 59
      Abstract: Application Programming Interfaces (APIs) could greatly facilitate the exchange of data and functionalities between software applications in a flexible, controlled and secure way, especially on the web. Private companies, from startups to enterprises, have been using APIs for several years now, but it is only recently that APIs have seen increased interest in the public sector. API adoption in the public sector faces organisational, technical, legal and economic obstacles, and to overcome these barriers, proposed methods from the private sector and early adopters in the public sector provide a way forward. The available documentation is often sparse, difficult to find and to reuse for new contexts. No past efforts to collect and analyse these resources have been made. To address this shortcoming, this paper describes a landscape analysis in four areas: the main European Commission policy instruments on the adoption of APIs, the available web API standards, a set of European government API strategies and cases, and a list of government proposed methods distilled from more than 3900 documents. Our results reveal that European policy legislation and associated instruments promote, and in some cases mandate, the use of APIs, and that governments’ API strategies in the European Union are rather young, but also that there are well-known web API standards and proposed methods ready to support the digital transformation of governments through rapid, harmonised and successful adoption of APIs.
      Citation: Data
      PubDate: 2021-06-08
      DOI: 10.3390/data6060059
      Issue No: Vol. 6, No. 6 (2021)
       
  • Data, Vol. 6, Pages 60: Information Quality Assessment for Data Fusion
           Systems

    • Authors: Miguel A. Becerra, Catalina Tobón, Andrés Eduardo Castro-Ospina, Diego H. Peluffo-Ordóñez
      First page: 60
      Abstract: This paper provides a comprehensive description of the current literature on data fusion, with an emphasis on Information Quality (IQ) and performance evaluation. This literature review highlights recent studies that reveal existing gaps, the need to find a synergy between data fusion and IQ, several research issues, and the challenges and pitfalls in this field. First, the main models, frameworks, architectures, algorithms, solutions, problems, and requirements are analyzed. Second, a general data fusion engineering process is presented to show how complex it is to design a framework for a specific application. Third, an IQ approach, as well as the different methodologies and frameworks used to assess IQ in information systems are addressed; in addition, data fusion systems are presented along with their related criteria. Furthermore, information on the context in data fusion systems and its IQ assessment are discussed. Subsequently, the issue of data fusion systems’ performance is reviewed. Finally, some key aspects and concluding remarks are outlined, and some future lines of work are gathered.
      Citation: Data
      PubDate: 2021-06-08
      DOI: 10.3390/data6060060
      Issue No: Vol. 6, No. 6 (2021)
       
  • Data, Vol. 6, Pages 61: A Framework Using Contrastive Learning for
           Classification with Noisy Labels

    • Authors: Madalina Ciortan, Romain Dupuis, Thomas Peel
      First page: 61
      Abstract: We propose a framework using contrastive learning as a pre-training task to perform image classification in the presence of noisy labels. Recent strategies, such as pseudo-labeling, sample selection with Gaussian Mixture models, and weighted supervised contrastive learning, have been combined into a fine-tuning phase following the pre-training. In this paper, we provide an extensive empirical study showing that a preliminary contrastive learning step brings a significant gain in performance when using different loss functions: non-robust, robust, and early-learning regularized. Our experiments performed on standard benchmarks and real-world datasets demonstrate that: (i) the contrastive pre-training increases the robustness of any loss function to noisy labels and (ii) the additional fine-tuning phase can further improve accuracy, but at the cost of additional complexity.
      Citation: Data
      PubDate: 2021-06-09
      DOI: 10.3390/data6060061
      Issue No: Vol. 6, No. 6 (2021)
       
  • Data, Vol. 6, Pages 62: Measurements of LoRaWAN Technology in Urban
           Scenarios: A Data Descriptor

    • Authors: Pavel Masek, Martin Stusek, Ekaterina Svertoka, Jan Pospisil, Radim Burget, Elena Simona Lohan, Ion Marghescu, Jiri Hosek, Aleksandr Ometov
      First page: 62
      Abstract: This work is a data descriptor paper for measurements related to various operational aspects of LoRaWAN communication technology collected in Brno, Czech Republic. This paper also provides data characterizing the long-term behavior of the LoRaWAN channel collected during the two-month measurement campaign. It covers two measurement locations, one at the university premises, and the second situated near the city center. The dataset’s primary goal is to provide the researchers lacking LoRaWAN devices with an opportunity to compare and analyze the information obtained from 303 different outdoor test locations transmitting to up to 20 gateways operating in the 868 MHz band in a varying metropolitan landscape. To collect the data, we developed a prototype equipped with a Microchip RN2483 Low-Power Wide-Area Network (LPWAN) LoRaWAN technology transceiver module for the field measurements. As an example of data utilization, we showed the Signal-to-noise Ratio (SNR) and Received Signal Strength Indicator (RSSI) in relation to the closest gateway distance.
      Citation: Data
      PubDate: 2021-06-10
      DOI: 10.3390/data6060062
      Issue No: Vol. 6, No. 6 (2021)
       
  • Data, Vol. 6, Pages 63: A Disease Control-Oriented Land Cover Land Use Map
           for Myanmar

    • Authors: Dong Chen, Varada Shevade, Allison Baer, Jiaying He, Amanda Hoffman-Hall, Qing Ying, Yao Li, Tatiana V. Loboda
      First page: 63
      Abstract: Malaria is a serious infectious disease that leads to massive casualties globally. Myanmar is a key battleground for the global fight against malaria because it is where the emergence of drug-resistant malaria parasites has been documented. Controlling the spread of malaria in Myanmar thus carries global significance, because the failure to do so would lead to devastating consequences in vast areas where malaria is prevalent in tropical/subtropical regions around the world. Thanks to its wide and consistent spatial coverage, remote sensing has become increasingly used in the public health domain. Specifically, remote sensing-based land cover/land use (LCLU) maps present a powerful tool that provides critical information on population distribution and on the potential human-vector interaction interfaces on a large spatial scale. Here, we present a 30-meter LCLU map that was created specifically for the malaria control and eradication efforts in Myanmar. This bottom-up approach can be modified and customized to other vector-borne infectious diseases in Myanmar or other Southeastern Asian countries.
      Citation: Data
      PubDate: 2021-06-13
      DOI: 10.3390/data6060063
      Issue No: Vol. 6, No. 6 (2021)
       
  • Data, Vol. 6, Pages 64: A Geo-Tagged COVID-19 Twitter Dataset for 10 North
           American Metropolitan Areas over a 255-Day Period

    • Authors: Sara Melotte, Mayank Kejriwal
      First page: 64
      Abstract: One of the unfortunate findings from the ongoing COVID-19 crisis is the disproportionate impact the crisis has had on people and communities who were already socioeconomically disadvantaged. It has, however, been difficult to study this issue at scale and in greater detail using social media platforms like Twitter. Several COVID-19 Twitter datasets have been released, but they have very broad scope, both topically and geographically. In this paper, we present a more controlled and compact dataset that can be used to answer a range of potential research questions (especially pertaining to computational social science) without requiring extensive preprocessing or tweet-hydration from the earlier datasets. The proposed dataset comprises tens of thousands of geotagged (and in many cases, reverse-geocoded) tweets originally collected over a 255-day period in 2020 over 10 metropolitan areas in North America. Since there are socioeconomic disparities within these cities (sometimes to an extreme extent, as witnessed in ‘inner city neighborhoods’ in some of these cities), the dataset can be used to assess such socioeconomic disparities from a social media lens, in addition to comparing and contrasting behavior across cities.
      Citation: Data
      PubDate: 2021-06-16
      DOI: 10.3390/data6060064
      Issue No: Vol. 6, No. 6 (2021)
       
  • Data, Vol. 6, Pages 65: Sustainability of Urbanization, Non-Agricultural
           Output and Air Pollution in the World’s Top 20 Polluting Countries

    • Authors: Ramesh Chandra Das, Tonmoy Chatterjee, Enrico Ivaldi
      First page: 65
      Abstract: Rapid urbanization is being increasingly recognized as a significant factor of environmental pollution across the world. However, the significance of sustainable urbanization in controlling both pollution and population remains either limited in scope, in the case of developed countries, or less researched, in the case of developing nations. To fill this gap, the present study employed both theoretical and empirical tools to investigate the significant link between sustainable urbanization, pollution and non-agricultural output. In order to empirically examine the supposed link among the key variables mentioned above, the present study considered a panel of the world’s top 20 polluting countries for the 1991–2018 period, which significantly includes both developed and developing nations. Panel vector error correction model and panel co-integration techniques were employed to derive the possible correlation between the variables through sustainable urbanization. Empirical findings show an absence of equilibrium relations among the three variables in the panel of developed countries. However, the study clearly finds that all the three indicators maintain long-run associations for the panel of developing countries. Furthermore, in the short run, the results determine unambiguously that there are significant causal interplays between any two sets of variables and the remaining one variable for both the panel data of developed and developing countries. On the other hand, short-run interplays among the variables we considered exist for both developed and developing economies. From the perspective of policy formulation, the present study shows that policy makers from both the developed and developing nations should be cautious before encouraging urbanization, at least in the short term. However, the combined effects in the short and long term suggest policy makers should be more careful before encouraging urbanization in developing economies.
      Citation: Data
      PubDate: 2021-06-17
      DOI: 10.3390/data6060065
      Issue No: Vol. 6, No. 6 (2021)
       
  • Data, Vol. 6, Pages 66: The NCAR Airborne 94-GHz Cloud Radar: Calibration
           and Data Processing

    • Authors: Ulrike Romatschke, Michael Dixon, Peisang Tsai, Eric Loew, Jothiram Vivekanandan, Jonathan Emmett, Robert Rilling
      First page: 66
      Abstract: The 94-GHz airborne HIAPER Cloud Radar (HCR) has been deployed in three major field campaigns, sampling clouds over the Pacific between California and Hawaii (2015), over the cold waters of the Southern Ocean (2018), and characterizing tropical convection in the Western Caribbean and Pacific waters off Panama and Costa Rica (2019). An extensive set of quality assurance and quality control procedures were developed and applied to all collected data. Engineering measurements yielded calibration characteristics for the antenna, reflector, and radome, which were applied during flight, to produce the radar moments in real-time. Temperature changes in the instrument during flight affect the receiver gains, leading to some bias. Post project, we estimate the temperature-induced gain errors and apply gain corrections to improve the quality of the data. The reflectivity calibration is monitored by comparing sea surface cross-section measurements against theoretically calculated model values. These comparisons indicate that the HCR is calibrated to within 1–2 dB of the theory. A radar echo classification algorithm was developed to identify “cloud echo” and distinguish it from artifacts. Model reanalysis data and digital terrain elevation data were interpolated to the time-range grid of the radar data, to provide an environmental reference.
      Citation: Data
      PubDate: 2021-06-19
      DOI: 10.3390/data6060066
      Issue No: Vol. 6, No. 6 (2021)
       
  • Data, Vol. 6, Pages 67: Semantic Partitioning and Machine Learning in
           Sentiment Analysis

    • Authors: Ebaa Fayyoumi, Sahar Idwan
      First page: 67
      Abstract: This paper investigates sentiment analysis in Arabic tweets that have the presence of Jordanian dialect. A new dataset was collected during the coronavirus disease (COVID-19) pandemic. We demonstrate two models: the Traditional Arabic Language (TAL) model and the Semantic Partitioning Arabic Language (SPAL) model to envisage the polarity of the collected tweets by invoking several well-known classifiers. The extraction and allocation of numerous Arabic features, such as lexical features, writing style features, grammatical features, and emotional features, have been used to analyze and classify the collected tweets semantically. The partitioning concept was performed on the original dataset by utilizing the hidden semantic meaning between tweets in the SPAL model before invoking various classifiers. The experimentation reveals that the overall performance of the SPAL model is better than that of the TAL model due to imposing the genuine idea of semantic partitioning on the collected dataset.
      Citation: Data
      PubDate: 2021-06-21
      DOI: 10.3390/data6060067
      Issue No: Vol. 6, No. 6 (2021)
       
  • Data, Vol. 6, Pages 68: Analyses of Li-Rich Minerals Using Handheld LIBS
           Tool

    • Authors: Cécile Fabre, Nour Eddine Ourti, Julien Mercadier, Joana Cardoso-Fernandes, Filipa Dias, Mônica Perrotta, Friederike Koerting, Alexandre Lima, Friederike Kaestner, Nicole Koellner, Robert Linnen, David Benn, Tania Martins, Jean Cauzid
      First page: 68
      Abstract: Lithium (Li) is one of the latest metals to be added to the list of critical materials in Europe and, thus, lithium exploration in Europe has become a necessity to guarantee its mid- to long-term stable supply. Laser-induced breakdown spectroscopy (LIBS) is a powerful analysis technique that allows for simultaneous multi-elemental analysis with an excellent coverage of light elements (Z < 13). This data paper provides more than 4000 LIBS spectra obtained using a handheld LIBS tool on approximately 140 Li-content materials (minerals, powder pellets, and rocks) and their Li concentrations. The high resolution of the spectrometers combined with the low detection limits for light elements make the LIBS technique a powerful option to detect Li and trace elements of first interest, such as Be, Cs, F, and Rb. The LIBS spectra dataset combined with the Li content dataset can be used to obtain quantitative estimation of Li in Li-rich matrices. This paper can be utilized as technical and spectroscopic support for Li detection in the field using a portable LIBS instrument.
      Citation: Data
      PubDate: 2021-06-21
      DOI: 10.3390/data6060068
      Issue No: Vol. 6, No. 6 (2021)
       
  • Data, Vol. 6, Pages 43: Data for Interaction Diagrams of Geopolymer FRC
           Slender Columns with Double-Layer GFRP and Steel Reinforcement

    • Authors: Mohammad AlHamaydeh, Fouad Amin
      First page: 43
      Abstract: This article provides data of axial load-bending moment capacities of plain and fiber-reinforced geopolymer concrete (GPC, FRGPC) columns. The columns were reinforced by double layers of longitudinal and transverse reinforcement using steel and/or glass-fiber-reinforced polymer (GFRP) bars. The concrete fiber-reinforcing materials included steel and synthetic fibers. The column data included different parameters like the longitudinal reinforcement ratio, the applied load eccentricity, and the columns’ slenderness ratio. The data were collected from different analysis output files then sorted and tabulated in usable formatted tables. The data can support the development of design axial load-bending moment interactions. In addition, further processing of the data can yield analytical strength curves which are useful in determining the columns’ stability under different structural loading configurations. Researchers and educators can make use of these data for illustrations and prospective new research suggestions.
      Citation: Data
      PubDate: 2021-04-26
      DOI: 10.3390/data6050043
      Issue No: Vol. 6, No. 5 (2021)
       
  • Data, Vol. 6, Pages 44: Collection of a Bacterial Community Reconstructed
           from Marine Metagenomes Derived from Jinhae Bay, South Korea

    • Authors: Jae-Hyun Lim, Il-Nam Kim
      First page: 44
      Abstract: Marine bacteria are known to play significant roles in marine biogeochemical cycles regarding the decomposition of organic matter. Despite the increasing attention paid to the study of marine bacteria, research has been too limited to fully elucidate the complex interaction between marine bacterial communities and environmental variables. Jinhae Bay, the study area in this work, is the most anthropogenically eutrophied coastal bay in South Korea, and while its physical and biogeochemical characteristics are well described, less is known about the associated changes in microbial communities. In the present study, we reconstructed metagenomics data based on the 16S rRNA gene to investigate temporal and vertical changes in microbial communities at three depths (surface, middle, and bottom) during a seven-month period from June to December 2016 at one sampling site (J1) in Jinhae Bay. Of all the bacterial data, Proteobacteria, Bacteroidetes, and Cyanobacteria were predominant from June to November, whereas Firmicutes were predominant in December, especially at the middle and bottom depths. These results show that the composition of the microbial community is strongly associated with temporal changes. Furthermore, the community compositions were markedly different between the surface, middle, and bottom depths in summer, when water column stratification and bottom water hypoxia (low dissolved oxygen level) were strongly developed. Metagenomics data contribute to improving our understanding of important relationships between environmental characteristics and microbial community change in eutrophication-induced and deoxygenated coastal areas.
      Citation: Data
      PubDate: 2021-04-26
      DOI: 10.3390/data6050044
      Issue No: Vol. 6, No. 5 (2021)
       
  • Data, Vol. 6, Pages 45: Data from Smartphones and Wearables

    • Authors: Joaquín Torres-Sospedra, Aleksandr Ometov
      First page: 45
      Abstract: Wearables are wireless devices that we “wear” on our bodies [...]
      Citation: Data
      PubDate: 2021-04-28
      DOI: 10.3390/data6050045
      Issue No: Vol. 6, No. 5 (2021)
       
  • Data, Vol. 6, Pages 46: IntelliRehabDS (IRDS)—A Dataset of Physical
           Rehabilitation Movements

    • Authors: Alina Miron, Noureddin Sadawi, Waidah Ismail, Hafez Hussain, Crina Grosan
      First page: 46
      Abstract: In this article, we present a dataset that comprises different physical rehabilitation movements. The dataset was captured as part of a research project intended to provide automatic feedback on the execution of rehabilitation exercises, even in the absence of a physiotherapist. A Kinect motion sensor camera was used to record gestures. The dataset contains repetitions of nine gestures performed by 29 subjects, out of which 15 were patients and 14 were healthy controls. The data are presented in an easily accessible format, provided as 3D coordinates of 25 body joints along with the corresponding depth map for each frame. Each movement was annotated with the gesture type, the position of the person performing the gesture (sitting or standing) as well as a correctness label. The data are publicly available and were released to provide a comprehensive dataset that can be used for assessing the performance of different patients while performing simple movements in a rehabilitation setting and for comparing these movements with a control group of healthy individuals.
      Citation: Data
      PubDate: 2021-04-30
      DOI: 10.3390/data6050046
      Issue No: Vol. 6, No. 5 (2021)
       
  • Data, Vol. 6, Pages 47: Industry 4.0 and Proactive Works Council Members

    • Authors: Mari Božič, Annmarie Gorenc Zoran, Matej Jevšček
      First page: 47
      Abstract: Background: Integrating Industry 4.0 technologies in organizations affects employees’ workplaces and working conditions. Works Council members play an essential role in this because as intermediaries of information between employees and management, they increase mutual trust and help introduce changes in the work environment. This article discusses the Works Council members’ autopoietic endowments that are necessary for their proactive activity, which we discuss as building blocks for creating constructive relationships with management and quality energy in an organization. As such, we were interested in examining whether the autopoietic endowments of Works Council members influenced the type of relationship with the Works Council and management, and whether this relationship affected Works Council members’ organizational energy. Methods: A questionnaire was developed, piloted and distributed to Works Council Members, and 220 completed questionnaires were returned. Results: We found that the higher the level of self-awareness, the better the relationship between Works Council members and management. Moreover, poor energy represented poor relationships, and poor relationships signified a higher degree of resigned inertia and corrosive energy. Conclusions: Our research provides management with insights into the relationship between employees and management, and the quality of their organizational energy.
      Citation: Data
      PubDate: 2021-04-30
      DOI: 10.3390/data6050047
      Issue No: Vol. 6, No. 5 (2021)
       
  • Data, Vol. 6, Pages 48: Designing Knowledge Sharing System for Statistical
           Activities in BPS-Statistics Indonesia

    • Authors: Dana Indra Sensuse, Viktor Suwiyanto, Sofian Lusa, Arfive Gandhi, Muhammad Mishbah, Damayanti Elisabeth
      First page: 48
      Abstract: Statistics of Indonesia’s (BPS) performance is not optimal since there is a lack of integration among business processes. This has resulted in unsynchronized data, unstandardized business processes, and inefficient IT investment. To encourage more qualified and integrated business processes, BPS should optimize the knowledge sharing process (KSP) among government employees in statistical areas. This study designed a Knowledge Sharing System (KSS) to facilitate KSP in BPS towards knowledge sharing improvement. The KSS manifested a hypothesis that the design of qualified knowledge management can facilitate an organization to overcome the lack of integration among business processes. Hence, BPS can avoid repetitive mistakes, improve work efficiency, and reduce the risk of failure. This study generated a business process-oriented KSS by combining soft system methodology with the B-KIDE (Business process-oriented Knowledge Infrastructure Development) Framework. It delivered research artifacts (a rich picture, CATWOE analysis (customer, actor, transformation, weltanschauung, owner, environment), and conceptual model) to capture eight mechanisms of knowledge, map them into the knowledge process, and define the applicable technology. The KSS model obtained a score of 0.40 using the Kappa formula, which indicates the stakeholders’ acceptance. Therefore, BPS can leverage a qualified KSS towards the integrated business processes statistically while the hypothesis was accepted.
      Citation: Data
      PubDate: 2021-05-12
      DOI: 10.3390/data6050048
      Issue No: Vol. 6, No. 5 (2021)
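
The abstract above reports a Kappa score of 0.40 but does not say which kappa statistic was used. As an illustration only, the following minimal sketch assumes Cohen's kappa for two raters; the function name `cohen_kappa` and the sample ratings are hypothetical and do not come from the study.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items.

    Assumption: the abstract's "Kappa formula" is Cohen's kappa; the
    source does not confirm which variant was actually applied.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: share of items given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical ratings, purely for illustration.
ratings_a = ["yes", "yes", "no", "no"]
ratings_b = ["yes", "no", "no", "no"]
print(cohen_kappa(ratings_a, ratings_b))  # 0.5
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is usually lower than the simple percentage of matching answers.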
       
  • Data, Vol. 6, Pages 49: Factors That Affect E-Learning Platforms after the
           Spread of COVID-19: Post Acceptance Study

    • Authors: Rana Saeed Al-Maroof, Khadija Alhumaid, Iman Akour, Said Salloum
      First page: 49
      Abstract: The fear of vaccines has led to population rejection due to various reasons. Students have had their own inquiries towards the effectiveness of the vaccination, which leads to vaccination hesitancy. Vaccination hesitancy can affect students’ perception, hence, acceptance of e-learning platforms. Therefore, this research attempts to explore the post-acceptance of e-learning platforms based on a conceptual model that has various variables. Each variable contributes differently to the post-acceptance of the e-learning platform. The research investigates the moderating role of vaccination fear on the post-acceptance of e-learning platforms among students. Thus, the study aims at exploring students’ perceptions about their post-acceptance of e-learning platforms where vaccination fear functions as a moderator. The current study depends on an online questionnaire that is composed of 29 items. The total number of respondents is 630. The collected data were used to test the study model and the proposed constructs and hypotheses with the Smart PLS Software. Fear of vaccination has a significant impact on the acceptance of e-learning platforms, and it is a strong mediator in the conceptual model. The findings indicate a positive effect of the fear of vaccination as a mediator in the variables: perceived ease of use and usefulness, perceived daily routine, perceived critical mass and perceived self-efficiency. The implication gives a deep insight to take effective steps in reducing the level of fear of vaccination, supporting the vaccination confidence among educators, teachers and students who will, in turn, affect the society as a whole.
      Citation: Data
      PubDate: 2021-05-12
      DOI: 10.3390/data6050049
      Issue No: Vol. 6, No. 5 (2021)
       
  • Data, Vol. 6, Pages 50: Dataset on the Effects of Anti-Insect Nets of
           Different Porosity on Mineral and Organic Acids Profile of Cucurbita pepo
           L. Fruits and Leaves

    • Authors: Luigi Formisano, Michele Ciriello, Christophe El-Nakhel, Stefania De Pascale, Youssef Rouphael
      First page: 50
      Abstract: The growing interest in healthy foods has driven the agricultural sector towards eco-friendly implementation to manage biotic and abiotic factors in protected environments. In this perspective, anti-insect nets are an effective tool for controlling harmful insect populations concomitantly with reducing chemicals’ interference. However, the low porosity of nets necessary to ensure high exclusion efficiency for a designated insect leads to reduced airflow, impacting the productivity and quality attributes of vegetables. The evidence presented in this dataset pertains to the content of total nitrogen, minerals (i.e., NO3, K, PO4, SO4, Ca, Mg, Cl, and Na), and organic acids (i.e., malate and citrate) of zucchini squash (Cucurbita pepo L. cv. Zufolo F1) in leaves and fruits grown with two anti-insect nets with different porosities (Biorete® 50 mesh and Biorete® 50 mesh AirPlus), analyzed by the Kjeldahl method and ion chromatography (ICS3000), respectively. Data of total nitrogen concentration, macronutrients, and organic acids provide in-depth information about plants’ physiological response to microclimate changes induced by anti-insect nets.
      Citation: Data
      PubDate: 2021-05-13
      DOI: 10.3390/data6050050
      Issue No: Vol. 6, No. 5 (2021)
       
  • Data, Vol. 6, Pages 51: LeLePhid: An Image Dataset for Aphid Detection and
           Infestation Severity on Lemon Leaves

    • Authors: Jorge Parraga-Alava, Roberth Alcivar-Cevallos, Jéssica Morales Carrillo, Magdalena Castro, Shabely Avellán, Aaron Loor, Fernando Mendoza
      First page: 51
      Abstract: Aphids are small insects that feed on plant sap, and they belong to a superfamily called Aphidoidea. They are among the major pests causing damage to citrus crops in most parts of the world. Precise and automatic identification of aphids is needed to understand citrus pest dynamics and management. This article presents a dataset that contains 665 healthy and unhealthy lemon leaf images. The latter are leaves with the presence of aphids, and visible white spots characterize them. Moreover, each image includes a set of annotations that identify the leaf, its health state, and the infestation severity according to the percentage of the affected area on it. Images were collected manually in real-world conditions in a lemon plant field in Junín, Manabí, Ecuador, during the winter, by using a smartphone camera. The dataset is called LeLePhid: lemon (Le) leaf (Le) image dataset for aphid (Phid) detection and infestation severity. The data can facilitate evaluating models for image segmentation, detection, and classification problems related to plant disease recognition.
      Citation: Data
      PubDate: 2021-05-17
      DOI: 10.3390/data6050051
      Issue No: Vol. 6, No. 5 (2021)
       
  • Data, Vol. 6, Pages 52: The Modern Greek Language on the Social Web: A
           Survey of Data Sets and Mining Applications

    • Authors: Maria Nefeli Nikiforos, Yorghos Voutos, Anthi Drougani, Phivos Mylonas, Katia Lida Kermanidis
      First page: 52
      Abstract: Mining social web text has been at the heart of the Natural Language Processing and Data Mining research community in the last 15 years. Though most of the reported work is on widely spoken languages, such as English, the significance of approaches that deal with less commonly spoken languages, such as Greek, is evident for reasons of preserving and documenting minority languages, cultural and ethnic diversity, and identifying intercultural similarities and differences. The present work aims at identifying, documenting and comparing social text data sets, as well as mining techniques and applications on social web text that target Modern Greek, focusing on the arising challenges and the potential for future research in the specific less widely spoken language.
      Citation: Data
      PubDate: 2021-05-17
      DOI: 10.3390/data6050052
      Issue No: Vol. 6, No. 5 (2021)
       
  • Data, Vol. 6, Pages 53: Recursive Genetic Micro-Aggregation Technique:
           Information Loss, Disclosure Risk and Scoring Index

    • Authors: Ebaa Fayyoumi, Omar Alhuniti
      First page: 53
      Abstract: This research investigates the micro-aggregation problem in secure statistical databases by integrating the divide and conquer concept with a genetic algorithm. This is achieved by recursively dividing a micro-data set into two subsets based on the proximity distance similarity. On each subset the genetic operation “crossover” is performed until the convergence condition is satisfied. The recursion will be terminated if the size of the generated subset is satisfied. Eventually, the genetic operation “mutation” will be performed over all generated subsets that satisfied the variable group size constraint in order to maximize the objective function. Experimentally, the proposed micro-aggregation technique was applied to recommended real-life data sets. Results demonstrated a remarkable reduction in the computational time, which sometimes exceeded 70% compared to the state-of-the-art. Furthermore, a good equilibrium value of the Scoring Index (SI) was achieved by involving a linear combination of the General Information Loss (GIL) and the General Disclosure Risk (GDR).
      Citation: Data
      PubDate: 2021-05-20
      DOI: 10.3390/data6050053
      Issue No: Vol. 6, No. 5 (2021)
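
The abstract above describes the Scoring Index (SI) as a linear combination of the General Information Loss (GIL) and the General Disclosure Risk (GDR) but does not state the weights. The sketch below assumes a convex combination with a single weight; the function name `scoring_index` and the equal default weighting are illustrative assumptions, not the authors' definition.

```python
def scoring_index(gil, gdr, weight=0.5):
    """Weighted linear combination of information loss and disclosure risk.

    Assumption: SI = weight * GIL + (1 - weight) * GDR. The abstract only
    says "linear combination"; the equal default weight is a placeholder.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * gil + (1.0 - weight) * gdr


# Hypothetical values: 20% information loss, 40% disclosure risk.
print(scoring_index(20.0, 40.0))  # 30.0
```

Under this form, a lower SI reflects a better trade-off, since both information loss and disclosure risk are quantities a micro-aggregation method tries to keep small.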
       
  • Data, Vol. 6, Pages 35: A Sentinel-2 Dataset for Uganda

    • Authors: Jonas Ardö
      First page: 35
      Abstract: Earth observation data provide useful information for the monitoring and management of vegetation- and land-related resources. The Framework for Operational Radiometric Correction for Environmental monitoring (FORCE) was used to download, process and composite Sentinel-2 data from 2018–2020 for Uganda. Over 16,500 Sentinel-2 data granules were downloaded and processed from top of the atmosphere reflectance to bottom of the atmosphere reflectance and higher-level products, totalling > 9 TB of input data. The output data include the number of clear sky observations per year, the best available pixel composite per year and vegetation indices (mean of EVI and NDVI) per quarter. The study intention was to provide analysis-ready data for all of Uganda from Sentinel-2 at 10 m spatial resolution, allowing users to bypass some basic processing and, hence, facilitate environmental monitoring.
      Citation: Data
      PubDate: 2021-03-30
      DOI: 10.3390/data6040035
      Issue No: Vol. 6, No. 4 (2021)
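
The abstract above mentions quarterly means of NDVI and EVI. As a rough illustration of how these two indices are computed from surface reflectance, the sketch below uses the standard NDVI formula and the commonly used EVI coefficients (gain 2.5, C1 = 6, C2 = 7.5, L = 1). It assumes reflectance values scaled to the range 0 to 1 and Sentinel-2 bands B8 (near infrared), B4 (red) and B2 (blue); the function names and sample values are hypothetical and are not taken from the FORCE software.

```python
import math
from statistics import mean


def ndvi(nir, red):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red)."""
    denom = nir + red
    return (nir - red) / denom if denom != 0 else math.nan


def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, soil=1.0):
    """Enhanced Vegetation Index with the commonly used coefficients."""
    denom = nir + c1 * red - c2 * blue + soil
    return gain * (nir - red) / denom if denom != 0 else math.nan


# Hypothetical clear-sky reflectances (NIR, red, blue) for one pixel in one quarter.
observations = [(0.50, 0.10, 0.05), (0.45, 0.12, 0.06), (0.55, 0.08, 0.04)]

quarterly_ndvi = mean(ndvi(n, r) for n, r, _ in observations)
quarterly_evi = mean(evi(n, r, b) for n, r, b in observations)
print(round(quarterly_ndvi, 3), round(quarterly_evi, 3))
```

Averaging per-observation index values, rather than computing one index from averaged reflectances, is a design choice made here for simplicity; the source does not say which order the dataset uses.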
       
  • Data, Vol. 6, Pages 36: Targeted Chemometrics Investigations of Source-,
           Age- and Gender-Dependencies of Oral Cavity Malodorous Volatile Sulphur
           Compounds

    • Authors: Kerry L. Grootveld, Victor Ruiz-Rodado, Martin Grootveld
      First page: 36
      Abstract: Halitosis is a highly distressing, socially unaesthetic condition, with a very high incidence amongst the adult population. It predominantly arises from excessive oral cavity volatile sulphur compound (VSC) concentrations, which have either oral or extra-oral etiologies (90–95% and 5–10% of cases, respectively). However, reports concerning age- and gender-related influences on the patterns and concentrations of these malodorous agents remain sparse; therefore, this study’s first objective was to explore the significance and impact of these potential predictor variables on the oral cavity levels of these malodorants. Moreover, because non-oral etiologies for halitosis may represent avatars of serious extra-oral diseases, the second objective was to distinguish between etiology- (source-) dependent patterns of oral cavity VSCs. Oral cavity VSC determinations were performed on 116 healthy human participants using a non-stationary gas chromatographic facility, and following a 4 h period of abstention from all non-respiratory oral activities. Participants were grouped according to ages or age bands, and gender. Statistical analyses of VSC level data acquired featured both univariate/correlation and multivariate (MV) approaches. Factorial analysis-of-variance and MV analyses revealed that the levels of all VSCs monitored were independent of both age and gender. Principal component analysis (PCA) and a range of further MV analysis techniques, together with an agglomerative hierarchical clustering strategy, demonstrated that VSC predictor variables were partitioned into two components, the first arising from orally-sourced H2S and CH3SH, the second from extra-orally-sourced (CH3)2S alone (about 55% and 30% of total variance respectively). In conclusion, oral cavity VSC concentrations appear not to be significantly influenced by age and gender. Furthermore, (CH3)2S may serve as a valuable biomarker for selected extra-oral conditions.
      Citation: Data
      PubDate: 2021-04-06
      DOI: 10.3390/data6040036
      Issue No: Vol. 6, No. 4 (2021)
       
  • Data, Vol. 6, Pages 37: FastFix Albatross Data: Snapshots of Raw GPS L-1
           Data from Southern Royal Albatross

    • Authors: Timothy C. A. Molteno, Keith W. Payne
      First page: 37
      Abstract: This dataset contains 4-millisecond snapshots of the GPS radio spectrum stored by wildlife tracking tags deployed on adult Southern Royal Albatross (Diomedea epomophora) in New Zealand. Approximately 60,000 snapshots were recovered from nine birds over two southern-hemisphere summers in 2012 and 2013. The data can be post-processed using snapshot positioning algorithms, and are made available as a test dataset for further development of these algorithms. Included are post-processed position estimates for reference, as well as test data from stationary tags positioned under various test conditions for the purposes of characterizing tag performance.
      Citation: Data
      PubDate: 2021-04-07
      DOI: 10.3390/data6040037
      Issue No: Vol. 6, No. 4 (2021)
       
  • Data, Vol. 6, Pages 38: Hand-Washing Video Dataset Annotated According to
           the World Health Organization’s Hand-Washing Guidelines

    • Authors: Martins Lulla, Aleksejs Rutkovskis, Andreta Slavinska, Aija Vilde, Anastasija Gromova, Maksims Ivanovs, Ansis Skadins, Roberts Kadikis, Atis Elsts
      First page: 38
      Abstract: Washing hands is one of the most important ways to prevent infectious diseases, including COVID-19. The World Health Organization (WHO) has published hand-washing guidelines. This paper presents a large real-world dataset with videos recording medical staff washing their hands as part of their normal job duties in the Pauls Stradins Clinical University Hospital. There are 3185 hand-washing episodes in total, each of which is annotated by up to seven different persons. The annotations classify the washing movements according to the WHO guidelines by marking each frame in each video with a certain movement code. The intention of this “in-the-wild” dataset is two-fold: to serve as a basis for training machine-learning classifiers for automated hand-washing movement recognition and quality control, and to allow investigation of the real-world quality of washing performed by working medical staff. We demonstrate how the data can be used to train a machine-learning classifier that achieves classification accuracy of 0.7511 on a test dataset.
      Citation: Data
      PubDate: 2021-04-07
      DOI: 10.3390/data6040038
      Issue No: Vol. 6, No. 4 (2021)
       
  • Data, Vol. 6, Pages 39: Exploring Inner-City Residents’ and
           Foreigners’ Commitment to Improving Air Pollution: Evidence from a Field
           Survey in Hanoi, Vietnam

    • Authors: Quan-Hoang Vuong, Tri Vu Phu, Tuyet-Anh T. Le, Quy Van Khuc
      First page: 39
      Abstract: Solutions for mitigating and reducing environmental pollution are important priorities for many developed and developing countries. This study was conducted to better understand the degree to which inner-city citizens and foreigners perceive air pollution and respond to it, particularly how much they willingly contribute to improving air quality in Vietnam, a lower-middle-income nation in Southeast Asia. During mid-December 2019, a stratified random sampling technique and a contingent valuation method (CVM) were employed to survey 199 inhabitants and 75 foreigners who reside and travel within the inner-city of Hanoi. The data comprises four major groups of information on: (1) perception of air pollution and its impacts, (2) preventive measures used to mitigate polluted air, (3) commitments on willingness-to-pay (WTP) for reducing air pollution alongside reasons for the yes-or-no-WTP decision, and (4) demographic information of interviewees. The findings and data of this study could offer many policy implications for better environmental management in the study area and beyond.
      Citation: Data
      PubDate: 2021-04-10
      DOI: 10.3390/data6040039
      Issue No: Vol. 6, No. 4 (2021)
       
  • Data, Vol. 6, Pages 40: Isolation of Microsatellite Markers from De Novo
           Whole Genome Sequences of Coptotermes gestroi (Wasmann) (Blattodea:
           Rhinotermitidae)

    • Authors: Li Yang Lim, Shawn Cheng, Abdul Hafiz Ab Majid
      First page: 40
      Abstract: Coptotermes gestroi (Wasmann) (Blattodea: Rhinotermitidae) is a subterranean termite species from Southeast Asia which has been unintentionally introduced to many parts of the world through commerce and modern transportation. Known for causing extensive damage to timber used in the built environment, the termite also has a habit of nesting in carton nests in wood and wooden structures in buildings. As so little is known of its breeding system, colony, and genetic structure, we initiated work to sequence its genome with an Illumina HiSeq™ 2000 sequencer. In this publication, we announce our paired-end sequencing data and report the isolation of 119,190 microsatellite markers from our DNA assembly. The microsatellite markers reported in this publication can be used to elucidate the mating system and genetic structure of this highly invasive termite species. Additionally, in this announcement the study authors make the Bio Project sequence accession number SRR13105492 accessible from the Sequence Read Archive database.
      Citation: Data
      PubDate: 2021-04-10
      DOI: 10.3390/data6040040
      Issue No: Vol. 6, No. 4 (2021)
       
  • Data, Vol. 6, Pages 41: AMAΛΘΕΙA: A Dish-Driven
           Ontology in the Food Domain

    • Authors: Stella Markantonatou, Katerina Toraki, Panagiotis Minos, Anna Vacalopoulou, Vivian Stamou, George Pavlidis
      First page: 41
      Abstract: We present AΜAΛΘΕΙA (AMALTHIA), an application ontology that models the domain of dishes as they are presented in 112 menus collected from restaurants/taverns/patisseries in East Macedonia and Thrace in Northern Greece. AΜAΛΘΕΙA supports a tourist mobile application offering multilingual translation of menus, dietary and cultural information about the dishes and their ingredients, as well as information about the geographical dispersion of the dishes. In this document, we focus on the food/dish dimension that constitutes the ontology’s backbone. Its dish-oriented perspective differentiates AΜAΛΘΕΙA from other food ontologies and thesauri, such as Langual, enabling it to codify information about the dishes served, particularly considering the fact that they are subject to wide variation due to the inevitable evolution of recipes over time, to geographical and cultural dispersion, and to the chef’s creativity. We argue for the adopted design decisions by drawing on semantic information retrieved from the menus, as well as other social and commercial facts, and compare AMAΛΘΕΙA with other important taxonomies in the food field. To the best of our knowledge, AΜAΛΘΕΙA is the first ontology modeling (i) dish variation and (ii) Greek (commercial) cuisine (a component of the Mediterranean diet).
      Citation: Data
      PubDate: 2021-04-14
      DOI: 10.3390/data6040041
      Issue No: Vol. 6, No. 4 (2021)
       
  • Data, Vol. 6, Pages 42: BOOSTR: A Dataset for Accelerator Control Systems

    • Authors: Diana Kafkes, Jason St. John
      First page: 42
      Abstract: The Booster Operation Optimization Sequential Time-series for Regression (BOOSTR) dataset was created to provide a cycle-by-cycle time series of readings and settings from instruments and controllable devices of the Booster, Fermilab’s Rapid-Cycling Synchrotron (RCS) operating at 15 Hz. BOOSTR provides a time series from 55 device readings and settings that pertain most directly to the high-precision regulation of the Booster’s gradient magnet power supply (GMPS). To our knowledge, this is one of the first well-documented datasets of accelerator device parameters made publicly available. We are releasing it in the hopes that it can be used to demonstrate aspects of artificial intelligence for advanced control systems, such as reinforcement learning and autonomous anomaly detection.
      Citation: Data
      PubDate: 2021-04-16
      DOI: 10.3390/data6040042
      Issue No: Vol. 6, No. 4 (2021)
       
  • Data, Vol. 6, Pages 23: Information System for Selection of Conditions and
           Equipment for Mammalian Cell Cultivation

    • Authors: Natalia Menshutina, Elena Guseva, Diana Batyrgazieva, Igor Mitrofanov
      First page: 23
      Abstract: Over the past few decades, animal cell culture technology has advanced significantly. It is now considered a reliable, functional, and relatively well-developed technology. At present, biotherapeutic drugs are synthesized using cell culture techniques by large manufacturing enterprises that produce products for commercial use and clinical research. The reliable implementation of mammalian cell culture technology requires the optimization of a number of variables, including the culture environment and bioreactor conditions, suitable cell lines, operating costs, efficient process management and, most importantly, quality. Successful implementation also requires an appropriate process development strategy, industrial scale, and characteristics, as well as the certification of sustainable procedures that meet the requirements of current regulations. All of this has led to a trend of increasing research in the field of biotechnology and, as a result, to a great accumulation of scientific information which, however, remains fragmentary and non-systematic. The development of information and network technologies allows us to solve this problem. Information system creation allows for implementation of the modern concept of integrating various structured and unstructured data, as well as the collection of information from internal and external sources. We propose and develop an information system which contains the conditions and various parameters of cultivation processes. The associated ranking system is the result of the set of recommendations (from both technological and hardware solutions) which allow for choosing the optimal conditions for the cultivation of mammalian cells at the stage of scientific research, thereby significantly reducing the time and cost of work. The proposed information system allows for the accumulation of experience regarding existing technologies for the cultivation of mammalian cells, along with application to the development of new technologies. The main goal of the present work is to discuss information systems, the organizational support of scientific research in the field of mammalian cell cultivation, and to provide a detailed description of the developed system and its main modules, including the conceptual and logical scheme of the database.
      Citation: Data
      PubDate: 2021-02-25
      DOI: 10.3390/data6030023
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 24: A Data Resource for Sulfuric Acid Reactivity of
           Organic Chemicals

    • Authors: William Bains, Janusz Jurand Petkowski, Sara Seager
      First page: 24
      Abstract: We describe a dataset of the quantitative reactivity of organic chemicals with concentrated sulfuric acid. As well as being a key industrial chemical, sulfuric acid is of environmental and planetary importance. In the absence of measured reaction kinetics, the reaction rate of a chemical with sulfuric acid can be estimated from the reaction rate of structurally related chemicals. To allow an approximate prediction, we have collected 589 sets of kinetic data on the reaction of organic chemicals with sulfuric acid from 262 literature sources and used a functional group-based approach to build a model of how the functional groups would react in any sulfuric acid concentration from 60–100%, and between −20 °C and 100 °C. The data set provides the original reference data and kinetic measurements, parameters, intermediate computation steps, and a set of first-order rate constants for the functional groups across the range of conditions −20 °C–100 °C and 60–100% sulfuric acid. The dataset will be useful for a range of studies in chemistry and atmospheric sciences where the reaction rate of a chemical with sulfuric acid is needed but has not been measured.
      Citation: Data
      PubDate: 2021-02-25
      DOI: 10.3390/data6030024
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 25: FIKWaste: A Waste Generation Dataset from Three
           Restaurant Kitchens in Portugal

    • Authors: Lucas Pereira, Vitor Aguiar, Fábio Vasconcelos
      First page: 25
      Abstract: In the era of big data and artificial intelligence, public datasets are becoming increasingly important for researchers to build and evaluate their models. This paper presents the FIKWaste dataset, which contains time series data for the volume of waste produced in three restaurant kitchens in Portugal. Organic (undifferentiated) and inorganic (glass, paper, and plastic) waste bins were monitored for a consecutive period of four weeks. In addition to the time series measurements, the FIKWaste dataset contains labels for waste disposal events, i.e., when the waste bins are emptied, and technical and non-technical details of the monitored kitchens.
      Citation: Data
      PubDate: 2021-02-26
      DOI: 10.3390/data6030025
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 26: FIKWater: A Water Consumption Dataset from Three
           Restaurant Kitchens in Portugal

    • Authors: Lucas Pereira, Vitor Aguiar, Fábio Vasconcelos
      First page: 26
      Abstract: With the advent of the IoT and low-cost sensing technologies, the availability of data has reached levels never imagined before by the research community. However, independently of their size, data are only as valuable as the ability to have access to them. This paper presents the FIKWater dataset, which contains time series data for hot and cold water demand collected from three restaurant kitchens in Portugal for consecutive periods between two and four weeks. The measurements were taken using ultrasonic flow meters, at a sampling frequency of 0.2 Hz. Additionally, some details of the monitored spaces are also provided.
      Citation: Data
      PubDate: 2021-03-02
      DOI: 10.3390/data6030026
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 27: Collection of Environmental Variables and
           Bacterial Community Compositions in Marian Cove, Antarctica, during Summer
           2018

    • Authors: Kim, Lim, Kim, Kim
      First page: 27
      Abstract: Marine bacteria, which are known as key drivers for marine biogeochemical cycles and Earth’s climate system, are mainly responsible for the decomposition of organic matter and production of climate-relevant gases (i.e., CO₂, N₂O, and CH₄). However, research is still required to fully understand the correlation between environmental variables and bacterial community composition. Marine bacteria living in the Marian Cove, where the inflow of freshwater has been rapidly increasing due to substantial glacial retreat, must be undergoing significant environmental changes. During the summer of 2018, we conducted a hydrographic survey to collect environmental variables and bacterial community composition data at three different layers (i.e., the seawater surface, middle, and bottom layers) from 15 stations. In the bacterial data, 17 phyla and 21 classes were found; Proteobacteria accounted for 50.3% at the phylum level, followed by Bacteroidetes. Gammaproteobacteria and Alphaproteobacteria, which belong to Proteobacteria, had the highest proportions at the class level. Gammaproteobacteria showed the highest relative abundance in all three seawater layers. The collection of environmental variables and bacterial composition data contributes to improving our understanding of the significant relationships between marine Antarctic regions and the marine bacteria that live in the Antarctic.
      Citation: Data
      PubDate: 2021-03-05
      DOI: 10.3390/data6030027
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 28: Stark Width Data for Tb II, Tb III and Tb IV
           Spectral Lines

    • Authors: Milan S. Dimitrijević
      First page: 28
      Abstract: A dataset of Stark widths for Tb II, Tb III and Tb IV is presented. To data obtained before, the results of new calculations for 62 Tb III lines from 5d to 6pj(6,j)o, a transition array, have been added. Calculations have been performed by using the simplified modified semiempirical method for temperatures from 5000 to 80,000 K for an electron density of 10¹⁷ cm⁻³. The results were also used to discuss the regularities within multiplets and a supermultiplet.
      Citation: Data
      PubDate: 2021-03-08
      DOI: 10.3390/data6030028
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 29: LeafLive-DB: Classification and Data Storage of
           Botanical Studies

    • Authors: Jorge Rodolfo Beingolea, Diego Ramos-Pires, Jorge Rendulich, Milagros Zegarra, Juan Borja-Murillo, Simone A. Siqueira da Fonseca
      First page: 29
      Abstract: The development of studies, projects, and technologies that contribute to the understanding and preservation of plant biodiversity is becoming highly necessary, as are tools and software platforms that enable the storage and classification of information resulting from studies on biodiversity. This work presents LeafLive-DB, a software platform that helps map and characterize species from the Brazilian plant biodiversity, offering the possibility of worldwide distribution. Developed by Brazilian and Peruvian researchers, this platform, which is available in its first version, features some functions for consulting and registering plant species and their taxonomy, among other information, through intuitive interfaces and an environment that promotes collaboration and data and research sharing. The platform innovates in data processing, functionality, and development architecture. It has ten thousand registers, and it should start to be distributed in partnership with schools and higher education institutions.
      Citation: Data
      PubDate: 2021-03-09
      DOI: 10.3390/data6030029
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 30: Dataset of the Optimization of a Low Power
           Chemoresistive Gas Sensor: Predictive Thermal Modelling and Mechanical
           Failure Analysis

    • Authors: Gaiardo, Novel, Scattolo, Bucciarelli, Bellutti, Pepponi
      First page: 30
      Abstract: Over the last few years, employment of the standard silicon microfabrication techniques for the gas sensor technology has allowed for the development of ever-smaller, low-cost, and low-power consumption devices. Specifically, the development of silicon microheaters (MHs) has become well established to produce MOS gas sensors. Therefore, the development of predictive models that help to define a priori the optimal design and layout of the device has become crucial, in order to achieve both low power consumption and high mechanical stability. In this research dataset, we present the experimental data collected to develop a specific and useful predictive thermal-mechanical model for high performing silicon MHs. To this aim, three MH layouts over three different membrane sizes were developed by using the standard silicon microfabrication process. Thermal and mechanical performances of the produced devices were experimentally evaluated, by using probe stations and mechanical failure analysis, respectively. The measured thermal curves were used to develop the predictive thermal model towards low power consumption. Moreover, a statistical analysis was finally introduced to cross-correlate the mechanical failure results and the thermal predictive model, aiming at MH design optimization for gas sensing applications. All the data collected in this investigation are shown.
      Citation: Data
      PubDate: 2021-03-09
      DOI: 10.3390/data6030030
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 31: KazNewsDataset: Single Country Overall Digital
           Mass Media Publication Corpus

    • Authors: Kirill Yakunin, Maksat Kalimoldayev, Ravil I. Mukhamediev, Rustam Mussabayev, Vladimir Barakhnin, Yan Kuchin, Sanzhar Murzakhmetov, Timur Buldybayev, Ulzhan Ospanova, Marina Yelis, Akylbek Zhumabayev, Viktors Gopejenko, Zhazirakhanym Meirambekkyzy, Alibek Abdurazakov
      First page: 31
      Abstract: Mass media is one of the most important elements influencing the information environment of society. The mass media is not only a source of information about what is happening but is often the authority that shapes the information agenda, the boundaries, and forms of discussion on socially relevant topics. A multifaceted and, where possible, quantitative assessment of mass media performance is crucial for understanding their objectivity, tone, thematic focus, and quality. The paper presents a corpus of Kazakhstan media, which contains over 4 million publications from 36 primary sources (each of which has at least 500 publications). The corpus also includes more than 2 million texts of Russian media for comparative analysis of publication activity of the countries, as well as about 4000 sections of state policy documents. The paper briefly describes the natural language processing and multiple-criteria decision-making methods, which are the algorithmic basis of the text and mass media evaluation method, and describes the results of several research cases, such as identification of propaganda, assessment of the tone of publications, calculation of the level of socially relevant negativity, and comparative analysis of publication activity in the field of renewable energy. Experiments confirm the general possibility of evaluating the socially significant news, identifying texts with propagandistic content, and evaluating the sentiment of publications using the topic model of the text corpus, since area under the receiver operating characteristic curve (ROC AUC) values of 0.81, 0.73, and 0.93 were achieved on the abovementioned tasks. The described cases do not exhaust the possibilities of thematic, tonal, dynamic, etc., analysis of the considered corpus of texts. The corpus will be interesting to researchers considering both multiple publications and mass media analysis, including comparative analysis and identification of common patterns inherent in the media of different countries.
      Citation: Data
      PubDate: 2021-03-14
      DOI: 10.3390/data6030031
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 32: A High-Accuracy GNSS Dataset of Ground Truth
           Points Collected within Îles-de-Boucherville National Park, Quebec,
           Canada

    • Authors: Kathryn Elmer, Margaret Kalacska
      First page: 32
      Abstract: A new ground truth dataset generated with high-accuracy Global Navigation Satellite Systems (GNSS) positional data of the invasive reed Phragmites australis subsp. australis within Îles-de-Boucherville National Park (Quebec, Canada) is described. The park is one of five study sites for the Canadian Airborne Biodiversity Observatory (CABO) and has stands of invasive P. australis spread throughout the park. Previously, within the context of CABO, no ground truth data had been collected within the park consolidating the locations of P. australis. This dataset was collected to serve as training and validation data for CABO airborne hyperspectral imagery acquired in 2019 to assist with the detection and mapping of P. australis. The locations of the ground truth points were found to be accurate within one pixel of the hyperspectral imagery. Overall, 320 ground truth points were collected, representing 158 locations where P. australis was present and 162 locations where it was absent. Auxiliary data includes field photographs and digitized field notes that provide context for each point.
      Citation: Data
      PubDate: 2021-03-14
      DOI: 10.3390/data6030032
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 33: Tools for Remote Exploration: A Lithium (Li)
           Dedicated Spectral Library of the Fregeneda–Almendra Aplite–Pegmatite
           Field

    • Authors: Joana Cardoso-Fernandes, João Silva, Filipa Dias, Alexandre Lima, Ana C. Teodoro, Odile Barrès, Jean Cauzid, Mônica Perrotta, Encarnación Roda-Robles, Maria Anjos Ribeiro
      First page: 33
      Abstract: The existence of diagnostic features in the visible and infrared regions makes it possible to use reflectance spectra not only to identify mineral assemblages but also for calibration and classification of satellite images, considering lithological and/or mineral mapping. For this purpose, a consistent spectral library with the target spectra of minerals and rocks is needed. Currently, there is big market pressure for raw materials including lithium (Li) that has driven new satellite image applications for Li exploration. However, there are no reference spectra for petalite (a Li mineral) in large, open spectral datasets. In this work, a spectral library was built exclusively dedicated to Li minerals and Li pegmatite exploration through satellite remote sensing. The database includes field and laboratory spectra collected in the Fregeneda–Almendra region (Spain–Portugal) from (i) distinct Li minerals (spodumene, petalite, lepidolite); (ii) several Li pegmatites and other outcropping lithologies to allow satellite-based lithological mapping; (iii) areas previously misclassified as Li pegmatites using machine learning algorithms to allow comparisons between these regions and the target areas. Ancillary data include (i) sample location and coordinates, (ii) sample conditions, (iii) sample color, (iv) type of face measured, (v) equipment used, and for the laboratory spectra, (vi) sample photographs, (vii) continuum removed spectra files, and (viii) statistics on the main absorption features automatically extracted. The potential future uses of this spectral library are reinforced by its major advantages: (i) data is provided in a universal file format; (ii) it allows users to compare field and laboratory spectra; (iii) a large number of complementary data allow the comparison of shape, asymmetry, and depth of the absorption features of the distinct Li minerals.
      Citation: Data
      PubDate: 2021-03-16
      DOI: 10.3390/data6030033
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 34: A Data Descriptor for Black Tea Fermentation
           Dataset

    • Authors: Gibson Kimutai, Alexander Ngenzi, Rutabayiro Ngoga Said, Rose C. Ramkat, Anna Förster
      First page: 34
      Abstract: Tea is currently the most popular beverage after water. Tea contributes to the livelihood of more than 10 million people globally. There are several categories of tea, but black tea is the most popular, accounting for about 78% of total tea consumption. Processing of black tea involves the following steps: plucking, withering, crushing, tearing and curling, fermentation, drying, sorting, and packaging. Fermentation is the most important step in determining the final quality of the processed tea. Fermentation is a time-bound process and it must take place under certain temperature and humidity conditions. During fermentation, tea color changes from green to coppery brown to signify the attainment of optimum fermentation levels. These parameters are currently manually monitored. At present, there is only one existing dataset on tea fermentation images. This study makes a tea fermentation dataset available, composed of tea fermentation conditions and tea fermentation images.
      Citation: Data
      PubDate: 2021-03-19
      DOI: 10.3390/data6030034
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 11: The Effect of Preprocessing Techniques, Applied to
           Numeric Features, on Classification Algorithms’ Performance

    • Authors: Esra’a Alshdaifat, Doa’a Alshdaifat, Ayoub Alsarhan, Fairouz Hussein, Subhieh Moh’d Faraj S. El-Salhi
      First page: 11
      Abstract: It is recognized that the performance of any prediction model is a function of several factors. One of the most significant factors is the choice of preprocessing techniques. In other words, preprocessing is an essential process to generate an effective and efficient classification model. This paper investigates the impact of the most widely used preprocessing techniques, with respect to numerical features, on the performance of classification algorithms. The effect of combining various normalization techniques and missing-value handling strategies is assessed on eighteen benchmark datasets using two well-known classification algorithms and adopting different performance evaluation metrics and statistical significance tests. According to the reported experimental results, the impact of the adopted preprocessing techniques varies from one classification algorithm to another. In addition, a statistically significant difference between the considered data preprocessing techniques is demonstrated.
      Citation: Data
      PubDate: 2021-01-21
      DOI: 10.3390/data6020011
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 12: A Systematic Survey of ML Datasets for Prime CV
           Research Areas—Media and Metadata

    • Authors: Helder F. Castro, Jaime S. Cardoso, Maria T. Andrade
      First page: 12
      Abstract: The ever-growing capabilities of computers have enabled pursuing Computer Vision through Machine Learning (i.e., MLCV). ML tools require large amounts of information to learn from (ML datasets). These are costly to produce but have received reduced attention regarding standardization. This prevents the cooperative production and exploitation of these resources, impedes countless synergies, and hinders ML research. No global view exists of the MLCV dataset tissue. Acquiring it is fundamental to enable standardization. We provide an extensive survey of the evolution and current state of MLCV datasets (1994 to 2019) for a set of specific CV areas as well as a quantitative and qualitative analysis of the results. Data were gathered from online scientific databases (e.g., Google Scholar, CiteSeerX). We reveal the heterogeneous plethora that comprises the MLCV dataset tissue; their continuous growth in volume and complexity; the specificities of the evolution of their media and metadata components regarding a range of aspects; and that MLCV progress requires the construction of a global standardized (structuring, manipulating, and sharing) MLCV “library”. Accordingly, we formulate a novel interpretation of this dataset collective as a global tissue of synthetic cognitive visual memories and define the immediately necessary steps to advance its standardization and integration.
      Citation: Data
      PubDate: 2021-01-22
      DOI: 10.3390/data6020012
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 13: Acknowledgment to Reviewers of Data in 2020

    • Authors: Data Editorial Office
      First page: 13
      Abstract: Peer review is the driving force of journal development, and reviewers are gatekeepers who ensure that Data maintains its standards for the high quality of its published papers [...]
      Citation: Data
      PubDate: 2021-02-01
      DOI: 10.3390/data6020013
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 14: Retinal Fundus Multi-Disease Image Dataset
           (RFMiD): A Dataset for Multi-Disease Detection Research

    • Authors: Samiksha Pachade, Prasanna Porwal, Dhanshree Thulkar, Manesh Kokare, Girish Deshmukh, Vivek Sahasrabuddhe, Luca Giancardo, Gwenolé Quellec, Fabrice Mériaudeau
      First page: 14
      Abstract: The world faces difficulties in terms of eye care, including treatment, quality of prevention, vision rehabilitation services, and scarcity of trained eye care experts. Early detection and diagnosis of ocular pathologies would help forestall visual impairment. One challenge that limits the adoption of computer-aided diagnosis tools by ophthalmologists is that a number of sight-threatening rare pathologies, such as central retinal artery occlusion, anterior ischemic optic neuropathy, and others, are usually ignored. In the past two decades, many publicly available datasets of color fundus images have been collected with a primary focus on diabetic retinopathy, glaucoma, age-related macular degeneration, and a few other frequent pathologies. To enable development of methods for automatic ocular disease classification of frequent diseases along with the rare pathologies, we have created a new Retinal Fundus Multi-disease Image Dataset (RFMiD). It consists of 3200 fundus images captured using three different fundus cameras with 46 conditions annotated through adjudicated consensus of two senior retinal experts. To the best of our knowledge, our dataset, RFMiD, is the only publicly available dataset that constitutes such a wide variety of diseases that appear in routine clinical settings. This dataset will enable the development of generalizable models for retinal screening.
      Citation: Data
      PubDate: 2021-02-03
      DOI: 10.3390/data6020014
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 15: Repository Approaches to Improving Quality of
           Shared Data and Code

    • Authors: Ana Trisovic, Katherine Mika, Ceilyn Boyd, Sebastian Feger, Mercè Crosas
      First page: 15
      Abstract: Sharing data and code for reuse have become increasingly important in scientific work over the past decade. However, in practice, shared data and code may be unusable, or published results obtained from them may be irreproducible. Data repository features and services contribute significantly to the quality, longevity, and reusability of datasets. This paper presents a combination of original and secondary data analysis studies focusing on computational reproducibility, data curation, and gamified design elements that can be employed to indicate and improve the quality of shared data and code. The findings of these studies are sorted into three approaches that can be valuable to data repositories, archives, and other research dissemination platforms.
      Citation: Data
      PubDate: 2021-02-03
      DOI: 10.3390/data6020015
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 16: Investigating the Adoption of Big Data Management
           in Healthcare in Jordan

    • Authors: Hani Bani-Salameh, Mona Al-Qawaqneh, Salah Taamneh
      First page: 16
      Abstract: Software developers and data scientists use and deal with big data to easily discover useful knowledge and find better solutions to improve healthcare services and patient safety. Big data analytics (BDA) is getting attention due to its role in decision-making across the healthcare field. Therefore, this article examines the adoption mechanism of big data analytics and management in healthcare organizations in Jordan. Additionally, it discusses the characteristics of health big data and the challenges and limitations for health big data analytics and management in Jordan. This article proposes a conceptual framework that allows utilizing health big data. The proposed conceptual framework suggests a way to merge the existing health information system with the National Health Information Exchange (HIE), which might play a role in extracting insights from our massive datasets, increase data availability, and reduce waste in resources. When applying the framework, the collected data are processed to develop knowledge and support decision-making, which helps improve the health care quality for both the community and individuals by improving diagnosis, treatment, and other services.
      Citation: Data
      PubDate: 2021-02-06
      DOI: 10.3390/data6020016
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 17: Agricultural Crop Change in the Willamette Valley,
           Oregon, from 2004 to 2017

    • Authors: Bogdan M. Strimbu, George Mueller-Warrant, Kristin Trippe
      First page: 17
      Abstract: The Willamette Valley, bounded to the west by the Coast Range and to the east by the Cascade Mountains, is the largest river valley completely confined to Oregon. The fertile valley soils combined with a temperate, marine climate create ideal agronomic conditions for seed production. Historically, seed cropping systems in the Willamette Valley have focused on the production of grass and forage seeds. In addition to growing over two-thirds of the nation’s cool-season grass seed, cropping systems in the Willamette Valley include a diverse rotation of over 250 commodities for forage, seed, food, and cover cropping applications. Tracking the sequence of crop rotations that are grown in the Willamette Valley is paramount to answering a broad spectrum of agronomic, environmental, and economical questions. Landsat imagery covering approximately 25,303 km² was used to identify agricultural crops in production from 2004 to 2017. The agricultural crops were distinguished by classifying images primarily acquired by three platforms: Landsat 5 (2003–2013), Landsat 7 (2003–2017), and Landsat 8 (2013–2017). Before conducting maximum likelihood remote sensing classification, the images acquired by Landsat 7 were pre-processed to reduce the impact of the scan line corrector failure. The corrected images were subsequently used to classify 35 different land-use classes and 137 unique two-year-long sequences of 57 classes of non-urban and non-forested land-use categories from 2004 through 2014. Our final data product uses new and previously published results to classify the western Oregon landscape into 61 different land use classes, including four majority-rule-over-time super-classes and 57 regular classes of annually disturbed agricultural crops (19 classes), perennial crops (20 classes), forests (13 classes), and urban developments (5 classes). These publicly available data can be used to inform and support environmental and agricultural land-use studies.
      Citation: Data
      PubDate: 2021-02-07
      DOI: 10.3390/data6020017
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 18: The State of the Art in Methodologies of Course
           Recommender Systems—A Review of Recent Research

    • Authors: Deepani B. Guruge, Rajan Kadel, Sharly J. Halder
      First page: 18
      Abstract: In recent years, education institutions have offered a wide range of course selections with overlaps. This presents significant challenges to students in selecting successful courses that match their current knowledge and personal goals. Although many studies have been conducted on Recommender Systems (RS), the methodologies used in course RS have not yet been sufficiently reviewed. To fill this literature gap, this paper presents the state of the art of methodologies used in course RS along with a summary of the types of data sources used to evaluate these techniques. This review aims to recognize emerging trends in course RS techniques in recent research literature to deliver insights for researchers for further investigation. We provide a systematic review process followed by research findings on the current methodologies implemented in different course RS in selected research journals, such as collaborative, content-based, knowledge-based, Data Mining (DM), hybrid, statistical, and Conversational RS (CRS). This study analyzed publications between 2016 and June 2020 in three repositories: IEEE Xplore, ACM, and Google Scholar. These papers were explored and classified based on the methodology used in recommending courses. This review has revealed that hybrid course RS are growing in popularity in recent publications, followed by DM techniques. However, few CRS-based course RS were present in the selected publications. Finally, we discussed future avenues based on the research outcome, which might lead to next-generation course RS.
      Citation: Data
      PubDate: 2021-02-11
      DOI: 10.3390/data6020018
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 19: High-to-Low (Regional) Fertility Transitions in a
           Peripheral European Country: The Contribution of Exploratory Time Series
           Analysis

    • Authors: Jesus Rodrigo-Comino, Gianluca Egidi, Luca Salvati, Giovanni Quaranta, Rosanna Salvia, Antonio Gimenez-Morera
      First page: 19
      Abstract: Diachronic variations in demographic rates have frequently reflected social transformations and a (more or less evident) impact of sequential economic downturns. By assessing changes over time in Total Fertility Rate (TFR) at the regional scale in Italy, our study investigates the long-term transition (1952–2019) characteristic of Mediterranean fertility, showing a continuous decline of births since the late 1970s and marked disparities between high- and low-fertility regions along the latitude gradient. Together with a rapid decline in the country TFR, the spatiotemporal evolution of regional fertility in Italy—illustrated through an exploratory time series statistical approach—outlines the marked divide between (wealthier) Northern regions and (economically disadvantaged) Southern regions. Non-linear fertility trends and increasing spatial heterogeneity in more recent times indicate the role of individual behaviors leveraging a generalized decline in marriage and childbearing propensity. Assuming differential responses of regional fertility to changing socioeconomic contexts, these trends are more evident in Southern Italy than in Northern Italy. Reasons at the base of such fertility patterns were extensively discussed focusing—among others—on the distinctive contribution of internal and international migrations to regional fertility rates. Based on these findings, Southern Italy, an economically disadvantaged, peripheral region in Mediterranean Europe, is taken as a paradigmatic case of demographic shrinkage—whose causes and consequences can be generalized to wider contexts in (and outside) Europe.
      Citation: Data
      PubDate: 2021-02-16
      DOI: 10.3390/data6020019
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 20: Dataset of Two-Dimensional Gel Electrophoresis
           Images of Acute Myeloid Leukemia Patients before and after Induction
           Therapy

    • Authors: Juan E. Urrea, Luisa F. Restrepo, Jeanette Prada-Arismendy, Erwing Castillo, Manuel M. Goez, Maria C. Torres-Madronero, Edilson Delgado-Trejos, Sarah Röthlisberger
      First page: 20
      Abstract: Acute myeloid leukemia (AML) is a malignant disorder of the hematopoietic stem and progenitor cells, which results in the build-up of immature blasts in the bone marrow and eventually in the peripheral blood of affected patients. Accurately assessing a patient’s prognosis is very important for clinical management of the disease, which is why there are several prognostic factors such as age, performance status at diagnosis, platelet count, serum creatinine and albumin that are taken into account by the clinician when deciding the course of treatment. However, proteomic changes related to treatment response in this patient group have not been widely explored. Here, we make available a set of 22 two-dimensional gel electrophoresis (2DGE) images obtained from the peripheral blood samples of 11 patients with AML, taken at the time of diagnosis and after induction therapy (approximately 21–28 days after starting treatment). The same set of 2DGE images is also made available after a preprocessing stage (an additional 22 pre-processed 2DGE images), which was performed using algorithms developed in Python, in order to improve the visualization of characteristic spots and facilitate proteomic analysis of this type of image.
      Citation: Data
      PubDate: 2021-02-18
      DOI: 10.3390/data6020020
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 21: An Open GMNS Dataset of a Dynamic Multi-modal
           Transportation Network Model of Melbourne, Australia

    • Authors: Nourmohammadi, Mansourianfar, Shafiei, Gu, Saberi
      First page: 21
      Abstract: Simulation-based dynamic traffic assignment models are increasingly used in urban transportation systems analysis and planning. They replicate traffic dynamics across transportation networks by capturing the complex interactions between travel demand and supply. However, their applications particularly for large-scale networks have been hindered by the challenges associated with the collection, parsing, development, and sharing of data-intensive inputs. In this paper, we develop and share an open dataset for reproduction of a dynamic multi-modal transportation network model of Melbourne, Australia. The dataset is developed consistently with the General Modeling Network Specification (GMNS), enabling software-agnostic human and machine readability. GMNS is a standard readable format for sharing routable transportation network data that is designed to be used in multimodal static and dynamic transportation operations and planning models.
      Citation: Data
      PubDate: 2021-02-19
      DOI: 10.3390/data6020021
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 22: A Long-Term, Real-Life Parkinson Monitoring
           Database Combining Unscripted Objective and Subjective Recordings

    • Authors: Jeroen G. V. Habets, Margot Heijmans, Albert F. G. Leentjens, Claudia J. P. Simons, Yasin Temel, Mark L. Kuijf, Pieter L. Kubben, Christian Herff
      First page: 22
      Abstract: Accurate real-life monitoring of motor and non-motor symptoms is a challenge in Parkinson’s disease (PD). The unobtrusive capturing of symptoms and their naturalistic fluctuations within or between days can improve evaluation and titration of therapy. First-generation commercial PD motion sensors are promising to augment clinical decision-making in general neurological consultation, but concerns remain regarding their short-term validity, and long-term real-life usability. In addition, tools monitoring real-life subjective experiences of motor and non-motor symptoms are lacking. The dataset presented in this paper constitutes a combination of objective kinematic data and subjective experiential data, recorded parallel to each other in a naturalistic, long-term real-life setting. The objective data consists of accelerometer and gyroscope data, and the subjective data consists of data from ecological momentary assessments. Twenty PD patients were monitored without daily life restrictions for fourteen consecutive days. The two types of data can be used to address hypotheses on naturalistic motor and/or non-motor symptomatology in PD.
      Citation: Data
      PubDate: 2021-02-23
      DOI: 10.3390/data6020022
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 109: Neglected Theories of Business
           Cycle—Alternative Ways of Explaining Economic Fluctuations

    • Authors: Klára Čermáková, Michal Bejček, Jan Vorlíček, Helena Mitwallyová
      First page: 109
      Abstract: The business cycle is a frequent topic in economic research; however, the approach based on individual strategies often remains neglected. The aspiration of this study is to prove that the behavior of individuals can originate and fuel an economic cycle. For this purpose, we are using an algorithm based on a repeated dove–hawk game. The results reveal that the sum of output in a society is affected by the ratio of individual strategies. Cyclical changes in this ratio will be translated into fluctuations of the total product of society. We present game theory modelling of a strategic behavioral approach as a valid theoretical foundation for explaining economic fluctuations. This article offers an unusual insight into the business cycle’s causes and growth theories.
      Citation: Data
      PubDate: 2021-10-20
      DOI: 10.3390/data6110109
      Issue No: Vol. 6, No. 11 (2021)
       
  • Data, Vol. 6, Pages 110: Dataset of Students’ Performance Using Student
           Information System, Moodle and the Mobile Application “eDify”

    • Authors: Raza Hasan, Sellappan Palaniappan, Salman Mahmood, Ali Abbas, Kamal Uddin Sarker
      First page: 110
      Abstract: The data presented in this article comprise an educational dataset collected from the student information system (SIS), the learning management system (LMS) called Moodle, and video interactions from the mobile application called “eDify.” The dataset, from a higher educational institution (HEI) in the Sultanate of Oman, comprises five modules of data from Spring 2017 to Spring 2021. The dataset consists of 326 student records with 40 features in total, including the students’ academic information from SIS (24 features), the students’ activities performed on Moodle within and outside the campus (10 features), and the students’ video interactions collected from eDify (six features). The dataset is useful for researchers who want to explore students’ academic performance in online learning environments, and will help them build educational data mining models. Moreover, it can serve as an input for predicting students’ academic performance within the module for educational data mining and learning analytics. Researchers are strongly encouraged to refer to the original papers for more details.
      Citation: Data
      PubDate: 2021-10-22
      DOI: 10.3390/data6110110
      Issue No: Vol. 6, No. 11 (2021)
       
  • Data, Vol. 6, Pages 101: Long-Term Dataset of Tidal Residuals in New South
           Wales, Australia

    • Authors: Cristina N. A. Viola, Danielle C. Verdon-Kidd, David J. Hanslow, Sam Maddox, Hannah E. Power
      First page: 101
      Abstract: Continuous water level records are required to detect long-term trends and analyse the climatological mechanisms responsible for extreme events. This paper compiles nine ocean water level records from gauges located along the New South Wales (NSW) coast of Australia. These gauges represent the longest and most complete records of hourly—and in five cases 15-min—water level data for this region. The datasets were adjusted to the vertical Australian Height Datum (AHD) and had the rainfall-related peaks removed from the records. The Unified Tidal Analysis and Prediction (Utide) model was subsequently used to predict tides for datasets with at least 25 years of records to obtain the associated tidal residuals. Finally, we provide a series of examples of how this dataset can be used to analyse trends in tidal anomalies as well as extreme events and their causal processes.
      Citation: Data
      PubDate: 2021-09-23
      DOI: 10.3390/data6100101
      Issue No: Vol. 6, No. 10 (2021)
       
  • Data, Vol. 6, Pages 102: Multiple Image Splicing Dataset (MISD): A Dataset
           for Multiple Splicing

    • Authors: Kalyani Dhananjay Kadam, Swati Ahirrao, Ketan Kotecha
      First page: 102
      Abstract: Image forgery has grown in popularity due to easy access to abundant image editing software. Forged images can be so deceptive that the forgery is impossible to detect with the naked eye. Such images are used to spread misleading information in society with the help of social media platforms such as Facebook and Twitter. Hence, there is an urgent need for effective forgery detection techniques. In order to validate the credibility of these techniques, publicly available and more credible standard datasets are required. A few datasets are available for image splicing, such as Columbia, Carvalho, and CASIA V1.0, along with a few custom datasets such as Modified CASIA and AbhAS. A study of these existing datasets reveals that they are limited to image splicing only and do not contain multiple spliced images. This research work presents a Multiple Image Splicing Dataset, which consists of a total of 300 multiple spliced images. We are the first to develop a publicly available Multiple Image Splicing Dataset containing high-quality, annotated, realistic multiple spliced images. In addition, we are providing a ground truth mask for these images. This dataset will open up opportunities for researchers working in this significant area.
      Citation: Data
      PubDate: 2021-09-28
      DOI: 10.3390/data6100102
      Issue No: Vol. 6, No. 10 (2021)
       
  • Data, Vol. 6, Pages 103: Experimental Data of Bottom Pressure and Free
           Surface Elevation including Wave and Current Interactions

    • Authors: Roman Gabl, Samuel Draycott, Ajit C. Pillai, Thomas Davey
      First page: 103
      Abstract: Force plates are commonly used in tank testing to measure loads acting on the foundation of a structure. These targeted measurements are overlaid by the hydrostatic and dynamic pressure acting on the force plate induced by the waves and currents. This paper presents a dataset of bottom force measurements with a six degree-of-freedom force plate (AMTI OR6-7 1000, surface area 0.464 m × 0.508 m) combined with synchronised measurements of surface elevation and current velocity. The data cover wave frequencies between 0.2 and 0.7 Hz and wave directions between 0° and 180°. These variations are provided for current speeds of 0 and 0.2 m/s, together with a variation of the current in the absence of waves covering 0 to 0.45 m/s. The dataset can be utilised as a validation dataset for models predicting bottom pressure based on free surface elevation. Additionally, the dataset provides the wave- and current-induced load acting on the specific load cell at a fixed water depth of 2 m, which can subsequently be removed to obtain the often-desired measurement of structural loads.
      Citation: Data
      PubDate: 2021-09-30
      DOI: 10.3390/data6100103
      Issue No: Vol. 6, No. 10 (2021)
       
  • Data, Vol. 6, Pages 104: Human Activity Vibrations

    • Authors: Sakdirat Kaewunruen, Jessada Sresakoolchai, Junhui Huang, Satoru Harada, Wisinee Wisetjindawat
      First page: 104
      Abstract: We present a unique, comprehensive dataset that provides the pattern of five activities: walking, cycling, and taking a train, a bus, or a taxi. The measurements are carried out by accelerometers embedded in smartphones. The dataset offers dynamic responses of subjects carrying smartphones in varied styles as they perform the five activities, captured as vibrations acquired by the accelerometers. The dataset contains time stamps and the corresponding vibrations in three directions (longitudinal, horizontal, and vertical), stored in Excel Macro-Enabled Workbook (xlsm) format. It can be used to train an AI model in a smartphone that has the potential to collect people’s vibration data and decide what movement is being conducted. Moreover, as more data are received, the database can be updated and used to train the model with a larger dataset. The prevalence of the smartphone opens the door to crowdsensing, which makes it possible to understand the patterns of people taking public transport. Furthermore, the time consumed in each activity is available in the dataset. Therefore, with a better understanding of people using public transport, services and schedules can be planned perceptively.
      Citation: Data
      PubDate: 2021-09-30
      DOI: 10.3390/data6100104
      Issue No: Vol. 6, No. 10 (2021)
       
  • Data, Vol. 6, Pages 105: Experimental Data of a Hexagonal Floating
           Structure under Waves

    • Authors: Roman Gabl, Robert Klar, Thomas Davey, David M. Ingram
      First page: 105
      Abstract: Floating structures have a wide range of applications and shapes. This experimental investigation observes a hexagonal floating structure under wave conditions for three different draft configurations. Regular waves as well as a range of white noise tests were conducted to quantify the response amplitude operator (RAO). Further irregular wave tests focused on the survivability of the floating structure. The presented dataset includes wave gauge data as well as a six degree-of-freedom motion measurement to quantify the response, which is restricted only by a soft mooring system. Additional analyses include the measurement of the mass properties of the individual configurations, the natural frequency of the mooring system, and the comparison between requested and measured wave heights. This allows the provided dataset to be used as a validation experiment.
      Citation: Data
      PubDate: 2021-09-30
      DOI: 10.3390/data6100105
      Issue No: Vol. 6, No. 10 (2021)
       
  • Data, Vol. 6, Pages 106: Mobile Apps to Fight the COVID-19 Crisis

    • Authors: Chrisa Tsinaraki, Irena Mitton, Marco Minghini, Marina Micheli, Alexander Kotsev, Lorena Hernandez Quiros, Fabiano-Antonio Spinelli, Alessandro Dalla Benetta, Sven Schade
      First page: 106
      Abstract: The COVID-19 pandemic led to a multi-faceted global crisis, which triggered the diverse and quickly emerging use of old and new digital tools. We have developed a multi-channel approach for the monitoring and analysis of a subset of such tools, the COVID-19-related mobile applications (apps). Our approach builds on the information available in the two most prominent app stores (i.e., Google Play for Android-powered devices and Apple’s App Store for iOS-powered devices), as well as on relevant tweets and digital media outlets. The dataset presented here, one of the outcomes of this approach, uses the content of the app stores and enriches it, providing aggregated information about 837 mobile apps published across the world to fight the COVID-19 crisis. This information includes: (a) information available in the mobile app stores between 20 April 2020 and 2 August 2020; (b) complementary information obtained from manual analysis performed until mid-September 2020; and (c) status information about app availability on 28 February 2021, when we last collected data from the mobile app stores. We highlight our findings with a series of descriptive statistics, which depict both the activities in the app stores and the qualitative information that was revealed by the manual analysis.
      Citation: Data
      PubDate: 2021-10-08
      DOI: 10.3390/data6100106
      Issue No: Vol. 6, No. 10 (2021)
       
  • Data, Vol. 6, Pages 107: The Retreat of Mountain Glaciers since the Little
           Ice Age: A Spatially Explicit Database

    • Authors: Silvio Marta, Roberto Sergio Azzoni, Davide Fugazza, Levan Tielidze, Pritam Chand, Katrin Sieron, Peter Almond, Roberto Ambrosini, Fabien Anthelme, Pablo Alviz Gazitúa, Rakesh Bhambri, Aurélie Bonin, Marco Caccianiga, Sophie Cauvy-Fraunié, Jorge Luis Ceballos Lievano, John Clague, Justiniano Alejo Cochachín Rapre, Olivier Dangles, Philip Deline, Andre Eger, Rolando Cruz Encarnación, Sergey Erokhin, Andrea Franzetti, Ludovic Gielly, Fabrizio Gili, Mauro Gobbi, Alessia Guerrieri, Sigmund Hågvar, Norine Khedim, Rahab Kinyanjui, Erwan Messager, Marco Aurelio Morales-Martínez, Gwendolyn Peyre, Francesca Pittino, Jerome Poulenard, Roberto Seppi, Milap Chand Sharma, Nurai Urseitova, Blake Weissling, Yan Yang, Vitalii Zaginaev, Anaïs Zimmer, Guglielmina Adele Diolaiuti, Antoine Rabatel, Gentile Francesco Ficetola
      First page: 107
      Abstract: Most of the world’s mountain glaciers have been retreating for more than a century in response to climate change. Glacier retreat is evident on all continents, and the rate of retreat has accelerated during recent decades. Accurate, spatially explicit information on the position of glacier margins over time is useful for analyzing patterns of glacier retreat and measuring reductions in glacier surface area. This information is also essential for evaluating how mountain ecosystems are evolving due to climate warming and the attendant glacier retreat. Here, we present a non-comprehensive, spatially explicit dataset showing multiple positions of glacier fronts since the Little Ice Age (LIA) maxima, including many data from the pre-satellite era. The dataset is based on multiple historical archival records including topographical maps; repeated photographs, paintings, and aerial or satellite images with a supplement of geochronology; and our own field data. We provide ESRI shapefiles showing 728 past positions of 94 glacier fronts from all continents, except Antarctica, covering the period between the Little Ice Age maxima and the present. On average, the time series span the past 190 years. From 2 to 46 past positions per glacier are depicted (on average: 7.8).
      Citation: Data
      PubDate: 2021-10-09
      DOI: 10.3390/data6100107
      Issue No: Vol. 6, No. 10 (2021)
       
  • Data, Vol. 6, Pages 108: A Principal Components Analysis-Based Method for
           the Detection of Cannabis Plants Using Representation Data by Remote
           Sensing

    • Authors: Carmine Gambardella, Rosaria Parente, Alessandro Ciambrone, Marialaura Casbarra
      First page: 108
      Abstract: The research proposed by the multidisciplinary group of the Benecon University Consortium focuses on integrating the representation of the territory, through airborne remote sensing activities with hyperspectral and visible sensors, with the management of complex data through dimensionality reduction, in order to identify cannabis plantations in Albania. In this study, principal components analysis (PCA) was used to remove redundant spectral information from multiband datasets. This makes it easier to identify the most prevalent spectral characteristics in most bands and those that are specific to only a few bands. The survey and airborne monitoring by hyperspectral sensors are carried out with an Itres CASI 1500 sensor owned by Benecon, characterized by a spectral range of 380–1050 nm and 288 configurable channels. The spectral configuration adopted for the research was developed specifically to maximize the spectral separability of cannabis. The sensor is installed in the aerial platform of an aircraft of the Italian Guardia di Finanza, and the ground resolution of the georeferenced cartographic data varies according to the flight planning, in relation to the orography of the sites under investigation. The geodatabase, wherein the processing of hyperspectral and visible images converges, contains ancillary data such as digital aeronautical maps, digital terrain models, color orthophotos, and topographic data, amounting to a significant volume of data that can be processed synergistically. The goal is to create maps and predictive scenarios, through the application of the spectral angle mapper algorithm, of the cannabis plantations scattered throughout the area. The protocol consists of comparing the spectral data acquired with the CASI 1500 airborne sensor and the spectral signature of cannabis leaves acquired in the laboratory with ASD Fieldspec PRO FR spectrometers. These scientific studies have demonstrated how it is possible to achieve ex ante control of the evolution of the phenomenon itself for monitoring the cultivation of cannabis plantations.
      Citation: Data
      PubDate: 2021-10-13
      DOI: 10.3390/data6100108
      Issue No: Vol. 6, No. 10 (2021)
       
 
JournalTOCs
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762
 

