
  Subjects -> SCIENCES: COMPREHENSIVE WORKS (Total: 434 journals)
Showing 1 - 200 of 265 Journals sorted alphabetically
AAS Open Research     Open Access   (Followers: 2)
ABC Journal of Advanced Research     Open Access  
Academic Voices : A Multidisciplinary Journal     Open Access   (Followers: 2)
Accountability in Research: Policies and Quality Assurance     Hybrid Journal   (Followers: 18)
Acta Materialia Transilvanica     Open Access  
Acta Nova     Open Access   (Followers: 1)
Acta Scientifica Malaysia     Open Access  
Acta Scientifica Naturalis     Open Access   (Followers: 3)
Adıyaman University Journal of Science     Open Access  
Advanced Science     Open Access   (Followers: 13)
Advanced Science, Engineering and Medicine     Partially Free   (Followers: 10)
Advanced Theory and Simulations     Hybrid Journal   (Followers: 5)
Advances in Research     Open Access   (Followers: 1)
Advances in Science and Technology     Full-text available via subscription   (Followers: 17)
African Journal of Science, Technology, Innovation and Development     Hybrid Journal   (Followers: 8)
Afrique Science : Revue Internationale des Sciences et Technologie     Open Access   (Followers: 2)
AFRREV STECH : An International Journal of Science and Technology     Open Access   (Followers: 4)
Akademik Platform Mühendislik ve Fen Bilimleri Dergisi     Open Access   (Followers: 2)
American Academic & Scholarly Research Journal     Open Access   (Followers: 6)
American Journal of Applied Sciences     Open Access   (Followers: 27)
American Journal of Humanities and Social Sciences     Open Access   (Followers: 15)
Anadolu University Journal of Science and Technology B : Theoritical Sciences     Open Access  
Anadolu University Journal of Science and Technology A : Applied Sciences and Engineering     Open Access  
ANALES de la Universidad Central del Ecuador     Open Access   (Followers: 3)
Anales del Instituto de la Patagonia     Open Access  
Annali dell'Istituto e Museo di storia della scienza di Firenze     Hybrid Journal  
Applied Mathematics and Nonlinear Sciences     Open Access   (Followers: 1)
Apuntes de Ciencia & Sociedad     Open Access  
Arab Journal of Basic and Applied Sciences     Open Access  
Arabian Journal for Science and Engineering     Hybrid Journal   (Followers: 5)
Archives Internationales d'Histoire des Sciences     Partially Free   (Followers: 6)
Archives of Current Research International     Open Access  
ARO. The Scientific Journal of Koya University     Open Access  
ARPHA Conference Abstracts     Open Access   (Followers: 7)
ARPHA Proceedings     Open Access   (Followers: 4)
ArtefaCToS : Revista de estudios sobre la ciencia y la tecnología     Open Access   (Followers: 1)
Asia-Pacific Journal of Science and Technology     Open Access  
Asian Journal of Advanced Research and Reports     Open Access   (Followers: 2)
Asian Journal of Applied Science and Engineering     Open Access   (Followers: 2)
Asian Journal of Scientific Research     Open Access   (Followers: 3)
Asian Journal of Technology Innovation     Hybrid Journal   (Followers: 7)
Australian Field Ornithology     Full-text available via subscription   (Followers: 4)
Australian Journal of Social Issues     Hybrid Journal   (Followers: 7)
Avances en Ciencias e Ingeniería     Open Access  
Avrasya Terim Dergisi     Open Access  
AZimuth     Full-text available via subscription   (Followers: 2)
Bangladesh Journal of Scientific Research     Open Access   (Followers: 1)
Beni-Suef University Journal of Basic and Applied Sciences     Open Access   (Followers: 3)
Berichte Zur Wissenschaftsgeschichte     Hybrid Journal   (Followers: 10)
Berkeley Scientific Journal     Full-text available via subscription  
BIBECHANA     Open Access   (Followers: 2)
BibNum     Open Access  
Bilge International Journal of Science and Technology Research     Open Access   (Followers: 1)
Bioethics Research Notes     Full-text available via subscription   (Followers: 16)
Bistua : Revista de la Facultad de Ciencias Básicas     Open Access  
Bitlis Eren Üniversitesi Fen Bilimleri Dergisi     Open Access  
Bitlis Eren University Journal of Science and Technology     Open Access   (Followers: 1)
Black Sea Journal of Engineering and Science     Open Access  
Borneo Journal of Resource Science and Technology     Open Access  
Brazilian Journal of Science and Technology     Open Access   (Followers: 2)
Bulletin de la Société Royale des Sciences de Liège     Open Access  
Bulletin of the National Research Centre     Open Access  
Butlletí de la Institució Catalana d'Història Natural     Open Access  
Celal Bayar Üniversitesi Fen Bilimleri Dergisi     Open Access   (Followers: 1)
Central European Journal of Clinical Research     Open Access  
Chain Reaction     Full-text available via subscription  
Ciencia & Natura     Open Access   (Followers: 1)
Ciencia Amazónica (Iquitos)     Open Access   (Followers: 1)
Ciencia en Desarrollo     Open Access   (Followers: 2)
Ciencia en su PC     Open Access   (Followers: 1)
Ciencia Ergo Sum     Open Access  
Ciência ET Praxis     Open Access  
Ciencia y Tecnología     Open Access  
Ciencia, Docencia y Tecnología     Open Access  
Ciencias Holguin     Open Access   (Followers: 2)
CienciaUAT     Open Access   (Followers: 1)
Citizen Science : Theory and Practice     Open Access   (Followers: 1)
Communications Faculty of Sciences University of Ankara Series A2-A3 Physical Sciences and Engineering     Open Access  
Communications in Applied Sciences     Open Access  
Comprehensive Therapy     Hybrid Journal   (Followers: 3)
Comunicata Scientiae     Open Access   (Followers: 1)
ConCiencia     Open Access  
Conference Papers in Science     Open Access   (Followers: 2)
Configurations     Full-text available via subscription   (Followers: 10)
COSMOS     Hybrid Journal  
Crea Ciencia Revista Científica     Open Access   (Followers: 3)
Cuadernos de Investigación UNED     Open Access  
Cumhuriyet Science Journal     Open Access  
Current Issues in Criminal Justice     Hybrid Journal   (Followers: 15)
Current Research in Geoscience     Open Access   (Followers: 8)
Dalat University Journal of Science     Open Access  
Data     Open Access   (Followers: 4)
Data Curation Profiles Directory     Open Access   (Followers: 4)
Dhaka University Journal of Science     Open Access  
Dharmakarya     Open Access  
Diálogos Interdisciplinares     Open Access  
Digithum     Open Access   (Followers: 2)
Discover Sustainability     Open Access  
Eastern Anatolian Journal of Science     Open Access  
Einstein (São Paulo)     Open Access  
Ekaia : EHUko Zientzia eta Teknologia aldizkaria     Open Access  
Elkawnie : Journal of Islamic Science and Technology     Open Access  
Emergent Scientist     Open Access  
Enhancing Learning in the Social Sciences     Open Access   (Followers: 9)
Enseñanza de las Ciencias : Revista de Investigación y Experiencias Didácticas     Open Access  
Entramado     Open Access  
Entre Ciencia e Ingeniería     Open Access   (Followers: 1)
Epiphany     Open Access   (Followers: 4)
Episteme Transversalis     Open Access  
Ergo     Open Access  
Estação Científica (UNIFAP)     Open Access   (Followers: 1)
Ethiopian Journal of Education and Sciences     Open Access   (Followers: 6)
Ethiopian Journal of Science and Technology     Open Access  
Ethiopian Journal of Sciences and Sustainable Development     Open Access   (Followers: 7)
European Online Journal of Natural and Social Sciences     Open Access   (Followers: 12)
European Scientific Journal     Open Access   (Followers: 10)
Evidência - Ciência e Biotecnologia - Interdisciplinar     Open Access  
Exchanges : the Warwick Research Journal     Open Access   (Followers: 2)
Extensionismo, Innovación y Transferencia Tecnológica     Open Access   (Followers: 5)
Facets     Open Access  
Fides et Ratio : Revista de Difusión Cultural y Científica     Open Access   (Followers: 1)
Fırat Üniversitesi Fen Bilimleri Dergisi     Open Access  
Fırat University Turkish Journal of Science & Technology     Open Access  
Fontanus     Open Access  
Forensic Science Policy & Management: An International Journal     Hybrid Journal   (Followers: 372)
Frontiers for Young Minds     Open Access  
Frontiers in Climate     Open Access   (Followers: 3)
Frontiers in Science     Open Access   (Followers: 1)
Futures & Foresight Science     Hybrid Journal   (Followers: 3)
Gaudium Sciendi     Open Access   (Followers: 1)
Gazi University Journal of Science     Open Access  
Gaziosmanpaşa Bilimsel Araştırma Dergisi     Open Access  
Ghana Studies     Full-text available via subscription   (Followers: 15)
Global Journal of Pure and Applied Sciences     Full-text available via subscription  
Global Journal of Science Frontier Research     Open Access   (Followers: 2)
Globe, The     Full-text available via subscription   (Followers: 4)
HardwareX     Open Access  
Heidelberger Jahrbücher Online     Open Access  
Heliyon     Open Access  
Himalayan Journal of Science and Technology     Open Access   (Followers: 1)
Hoosier Science Teacher     Open Access  
Iğdır Üniversitesi Fen Bilimleri Enstitüsü Dergisi / Journal of the Institute of Science and Technology     Open Access  
Impact     Open Access   (Followers: 2)
Indonesian Journal of Fundamental Sciences     Open Access  
Indonesian Journal of Science and Mathematics Education     Open Access   (Followers: 4)
Indonesian Journal of Science and Technology     Open Access  
Ingenieria y Ciencia     Open Access   (Followers: 1)
Innovare : Revista de ciencia y tecnología     Open Access  
Instruments     Open Access  
Integrated Research Advances     Open Access  
Interciencia     Open Access   (Followers: 1)
Interface Focus     Full-text available via subscription  
International Annals of Science     Open Access  
International Archives of Science and Technology     Open Access  
International Journal of Academic Research in Business, Arts & Science     Open Access   (Followers: 2)
International Journal of Advanced Multidisciplinary Research and Review     Open Access  
International Journal of Advancement in Education and Social Sciences     Open Access   (Followers: 1)
International Journal of Advances in Engineering, Science and Technology     Open Access   (Followers: 3)
International Journal of Applied Science     Open Access  
International Journal of Basic and Applied Sciences     Open Access   (Followers: 4)
International Journal of Computational and Experimental Science and Engineering (IJCESEN)     Open Access  
International Journal of Engineering, Science and Technology     Open Access  
International Journal of Engineering, Technology and Natural Sciences     Open Access  
International Journal of Innovation and Applied Studies     Open Access   (Followers: 12)
International Journal of Innovative Research and Scientific Studies     Open Access   (Followers: 5)
International Journal of Innovative Research in Social and Natural Sciences     Open Access   (Followers: 2)
International Journal of Network Science     Hybrid Journal   (Followers: 3)
International Journal of Recent Contributions from Engineering, Science & IT     Open Access   (Followers: 1)
International Journal of Research in Science     Open Access   (Followers: 2)
International Journal of Science & Emerging Technologies     Open Access   (Followers: 1)
International Journal of Sciences : Basic and Applied Research     Open Access  
International Journal of Social Sciences and Management     Open Access   (Followers: 2)
International Journal of Technology Policy and Law     Hybrid Journal   (Followers: 7)
International Letters of Social and Humanistic Sciences     Open Access   (Followers: 1)
International Review of Applied Sciences     Open Access  
International Scientific and Vocational Studies Journal     Open Access   (Followers: 2)
InterSciencePlace     Open Access   (Followers: 1)
Investiga : TEC     Open Access  
Investigación Joven     Open Access  
Investigación Valdizana     Open Access  
Investigacion y Ciencia     Open Access   (Followers: 1)
Iranian Journal of Science and Technology, Transactions A : Science     Hybrid Journal  
iScience     Open Access  
Issues in Science & Technology     Free   (Followers: 7)
İstanbul Ticaret Üniversitesi Fen Bilimleri Dergisi / İstanbul Commerce University Journal of Science     Open Access  
Istituto Lombardo - Accademia di Scienze e Lettere - Rendiconti di Scienze     Open Access  
Ithaca : Viaggio nella Scienza     Open Access  
J : Multidisciplinary Scientific Journal     Open Access  
Journal de la Recherche Scientifique de l'Universite de Lome     Full-text available via subscription   (Followers: 2)
Journal for New Generation Sciences     Open Access   (Followers: 3)
Journal of Chromatography & Separation Techniques     Open Access   (Followers: 12)
Journal of Advanced Research     Open Access   (Followers: 3)
Journal of Al-Qadisiyah for Pure Science     Open Access   (Followers: 1)
Journal of Analytical Science & Technology     Open Access   (Followers: 6)
Journal of Applied Science and Technology     Full-text available via subscription   (Followers: 1)
Journal of Applied Sciences and Environmental Management     Open Access   (Followers: 3)
Journal of Big History     Open Access   (Followers: 3)
Journal of Composites Science     Open Access   (Followers: 3)
Journal of Critical Thought and Praxis     Open Access   (Followers: 2)
Journal of Deliberative Mechanisms in Science     Open Access  

Data
Number of Followers: 4  

  This is an Open Access journal
ISSN (Online) 2306-5729
Published by MDPI  [233 journals]
  • Data, Vol. 6, Pages 23: Information System for Selection of Conditions and
           Equipment for Mammalian Cell Cultivation

    • Authors: Natalia Menshutina, Elena Guseva, Diana Batyrgazieva, Igor Mitrofanov
      First page: 23
      Abstract: Over the past few decades, animal cell culture technology has advanced significantly. It is now considered a reliable, functional, and relatively well-developed technology. At present, biotherapeutic drugs are synthesized using cell culture techniques by large manufacturing enterprises that produce products for commercial use and clinical research. The reliable implementation of mammalian cell culture technology requires the optimization of a number of variables, including the culture environment and bioreactor conditions, suitable cell lines, operating costs, efficient process management and, most importantly, quality. Successful implementation also requires an appropriate process development strategy, industrial scale, and characteristics, as well as the certification of sustainable procedures that meet the requirements of current regulations. All of this has led to a trend of increasing research in the field of biotechnology and, as a result, to a great accumulation of scientific information which, however, remains fragmentary and non-systematic. The development of information and network technologies allows us to solve this problem. Creating an information system makes it possible to implement the modern concept of integrating various structured and unstructured data, as well as collecting information from internal and external sources. We propose and develop an information system which contains the conditions and various parameters of cultivation processes. The associated ranking system is the result of a set of recommendations, drawn from both technological and hardware solutions, which allow for choosing the optimal conditions for the cultivation of mammalian cells at the stage of scientific research, thereby significantly reducing the time and cost of work. The proposed information system allows for the accumulation of experience regarding existing technologies for the cultivation of mammalian cells, along with application to the development of new technologies. The main goal of the present work is to discuss information systems, the organizational support of scientific research in the field of mammalian cell cultivation, and to provide a detailed description of the developed system and its main modules, including the conceptual and logical scheme of the database.
      Citation: Data
      PubDate: 2021-02-25
      DOI: 10.3390/data6030023
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 24: A Data Resource for Sulfuric Acid Reactivity of
           Organic Chemicals

    • Authors: William Bains, Janusz Jurand Petkowski, Sara Seager
      First page: 24
      Abstract: We describe a dataset of the quantitative reactivity of organic chemicals with concentrated sulfuric acid. As well as being a key industrial chemical, sulfuric acid is of environmental and planetary importance. In the absence of measured reaction kinetics, the reaction rate of a chemical with sulfuric acid can be estimated from the reaction rate of structurally related chemicals. To allow an approximate prediction, we have collected 589 sets of kinetic data on the reaction of organic chemicals with sulfuric acid from 262 literature sources and used a functional group-based approach to build a model of how the functional groups would react in any sulfuric acid concentration from 60–100%, and between −20 °C and 100 °C. The data set provides the original reference data and kinetic measurements, parameters, intermediate computation steps, and a set of first-order rate constants for the functional groups across the range of conditions −20 °C–100 °C and 60–100% sulfuric acid. The dataset will be useful for a range of studies in chemistry and atmospheric sciences where the reaction rate of a chemical with sulfuric acid is needed but has not been measured.
      Citation: Data
      PubDate: 2021-02-25
      DOI: 10.3390/data6030024
      Issue No: Vol. 6, No. 3 (2021)
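
      The abstract above mentions first-order rate constants for functional groups. As a minimal sketch of how such a constant could be used, the Python snippet below computes a half-life and the fraction of a chemical remaining after a given time under first-order decay. The rate constant `k` is a hypothetical value chosen for illustration and is not taken from the dataset; the function names are likewise assumptions.

```python
import math


def fraction_remaining(k_per_s, t_s):
    """Fraction of the starting chemical left after t_s seconds,
    assuming simple first-order decay with rate constant k_per_s."""
    return math.exp(-k_per_s * t_s)


def half_life_s(k_per_s):
    """Half-life in seconds for a first-order rate constant."""
    return math.log(2) / k_per_s


# Hypothetical rate constant (1/s); NOT a value from the dataset.
k = 1e-4
print("half-life (s):", half_life_s(k))
print("fraction left after 1 h:", fraction_remaining(k, 3600))
```

      In practice the rate constant would be looked up for the functional group, acid concentration, and temperature of interest before applying these formulas.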
       
  • Data, Vol. 6, Pages 25: FIKWaste: A Waste Generation Dataset from Three
           Restaurant Kitchens in Portugal

    • Authors: Lucas Pereira, Vitor Aguiar, Fábio Vasconcelos
      First page: 25
      Abstract: In the era of big data and artificial intelligence, public datasets are becoming increasingly important for researchers to build and evaluate their models. This paper presents the FIKWaste dataset, which contains time series data for the volume of waste produced in three restaurant kitchens in Portugal. Organic (undifferentiated) and inorganic (glass, paper, and plastic) waste bins were monitored for a consecutive period of four weeks. In addition to the time series measurements, the FIKWaste dataset contains labels for waste disposal events, i.e., when the waste bins are emptied, and technical and non-technical details of the monitored kitchens.
      Citation: Data
      PubDate: 2021-02-26
      DOI: 10.3390/data6030025
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 26: FIKWater: A Water Consumption Dataset from Three
           Restaurant Kitchens in Portugal

    • Authors: Lucas Pereira, Vitor Aguiar, Fábio Vasconcelos
      First page: 26
      Abstract: With the advent of the IoT and low-cost sensing technologies, the availability of data has reached levels never imagined before by the research community. However, independently of their size, data are only as valuable as the ability to have access to them. This paper presents the FIKWater dataset, which contains time series data for hot and cold water demand collected from three restaurant kitchens in Portugal for consecutive periods between two and four weeks. The measurements were taken using ultrasonic flow meters, at a sampling frequency of 0.2 Hz. Additionally, some details of the monitored spaces are also provided.
      Citation: Data
      PubDate: 2021-03-02
      DOI: 10.3390/data6030026
      Issue No: Vol. 6, No. 3 (2021)
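
      The sampling frequency of 0.2 Hz quoted in the abstract above corresponds to one reading every 5 seconds. The sketch below works through that arithmetic and estimates how many readings a single meter would produce over two and four weeks, assuming uninterrupted logging (an assumption made here for illustration, not a claim about the dataset). The function names are hypothetical.

```python
SAMPLING_HZ = 0.2  # sampling frequency stated in the abstract


def sampling_period_s(freq_hz):
    """Seconds between consecutive readings."""
    return 1.0 / freq_hz


def samples_in(days, freq_hz=SAMPLING_HZ):
    """Expected number of readings over `days` of gap-free logging."""
    return round(days * 24 * 3600 * freq_hz)


print("period (s):", sampling_period_s(SAMPLING_HZ))
print("readings in 2 weeks:", samples_in(14))
print("readings in 4 weeks:", samples_in(28))
```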
       
  • Data, Vol. 6, Pages 27: Collection of Environmental Variables and
           Bacterial Community Compositions in Marian Cove, Antarctica, during Summer
           2018

    • Authors: Kim, Lim, Kim, Kim
      First page: 27
      Abstract: Marine bacteria, which are known as key drivers for marine biogeochemical cycles and Earth’s climate system, are mainly responsible for the decomposition of organic matter and production of climate-relevant gases (i.e., CO₂, N₂O, and CH₄). However, research is still required to fully understand the correlation between environmental variables and bacterial community composition. Marine bacteria living in Marian Cove, where the inflow of freshwater has been rapidly increasing due to substantial glacial retreat, must be undergoing significant environmental changes. During the summer of 2018, we conducted a hydrographic survey to collect environmental variables and bacterial community composition data at three different layers (i.e., the seawater surface, middle, and bottom layers) from 15 stations. In the bacterial data, 17 different phyla and 21 different classes were found; Proteobacteria accounted for 50.3% at the phylum level, followed by Bacteroidetes. Gammaproteobacteria and Alphaproteobacteria, which belong to Proteobacteria, made up the highest proportions at the class level. Gammaproteobacteria showed the highest relative abundance in all three seawater layers. The collection of environmental variables and bacterial composition data contributes to improving our understanding of the significant relationships between marine Antarctic regions and the marine bacteria that live in the Antarctic.
      Citation: Data
      PubDate: 2021-03-05
      DOI: 10.3390/data6030027
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 28: Stark Width Data for Tb II, Tb III and Tb IV
           Spectral Lines

    • Authors: Milan S. Dimitrijević
      First page: 28
      Abstract: A dataset of Stark widths for Tb II, Tb III and Tb IV is presented. To data obtained before, the results of new calculations for 62 Tb III lines from 5d to 6pj(6,j)o, a transition array, have been added. Calculations have been performed by using the simplified modified semiempirical method for temperatures from 5000 to 80,000 K for an electron density of 10¹⁷ cm⁻³. The results were also used to discuss the regularities within multiplets and a supermultiplet.
      Citation: Data
      PubDate: 2021-03-08
      DOI: 10.3390/data6030028
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 29: LeafLive-DB: Classification and Data Storage of
           Botanical Studies

    • Authors: Jorge Rodolfo Beingolea, Diego Ramos-Pires, Jorge Rendulich, Milagros Zegarra, Juan Borja-Murillo, Simone A. Siqueira da Fonseca
      First page: 29
      Abstract: The development of studies, projects, and technologies that contribute to the understanding and preservation of plant biodiversity is becoming highly necessary, as is the development of tools and software platforms that enable the storage and classification of information resulting from studies on biodiversity. This work presents LeafLive-DB, a software platform that helps map and characterize species from the Brazilian plant biodiversity, offering the possibility of worldwide distribution. Developed by Brazilian and Peruvian researchers, this platform, which is available in its first version, features some functions for consulting and registering plant species and their taxonomy, among other information, through intuitive interfaces and an environment that promotes collaboration and data and research sharing. The platform innovates in data processing, functionality, and development architecture. It has ten thousand registers, and it should start to be distributed in partnership with schools and higher education institutions.
      Citation: Data
      PubDate: 2021-03-09
      DOI: 10.3390/data6030029
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 30: Dataset of the Optimization of a Low Power
           Chemoresistive Gas Sensor: Predictive Thermal Modelling and Mechanical
           Failure Analysis

    • Authors: Gaiardo, Novel, Scattolo, Bucciarelli, Bellutti, Pepponi
      First page: 30
      Abstract: Over the last few years, employment of the standard silicon microfabrication techniques for gas sensor technology has allowed for the development of ever-smaller, low-cost, and low-power consumption devices. Specifically, the development of silicon microheaters (MHs) has become well established to produce MOS gas sensors. Therefore, the development of predictive models that help to define a priori the optimal design and layout of the device has become crucial, in order to achieve both low power consumption and high mechanical stability. In this research dataset, we present the experimental data collected to develop a specific and useful predictive thermal-mechanical model for high performing silicon MHs. To this aim, three MH layouts over three different membrane sizes were developed by using the standard silicon microfabrication process. Thermal and mechanical performances of the produced devices were experimentally evaluated, by using probe stations and mechanical failure analysis, respectively. The measured thermal curves were used to develop the predictive thermal model towards low power consumption. Moreover, a statistical analysis was finally introduced to cross-correlate the mechanical failure results and the thermal predictive model, aiming at MH design optimization for gas sensing applications. All the data collected in this investigation are shown.
      Citation: Data
      PubDate: 2021-03-09
      DOI: 10.3390/data6030030
      Issue No: Vol. 6, No. 3 (2021)
       
  • Data, Vol. 6, Pages 7: Data for Sustainable Platform Economy: Connections
           between Platform Models and Sustainable Development Goals

    • Authors: Mayo Fuster Morell, Ricard Espelt, Enric Senabre Hidalgo
      First page: 7
      Abstract: In recent years, the platform economy has been recognised by researchers and governments around the world for its potential to contribute to the sustainable development of society. Yet, platform economy cases such as Uber, Airbnb, and Deliveroo have created a huge controversy over their socioeconomic impact, while other alternative models have been associated with a new form of cooperativism. In parallel, the United Nations are advocating global sustainable development by promoting Sustainable Development Goals (SDGs), considering elements such as decent work, inclusive and sustainable economic growth, and fostering innovation. In any case, the SDGs have also been criticised for their lack of a digital perspective. This dataset draws from the data collections of two 2020 European projects (DECODE and PLUS) and presents the possibility to compare different platform economy models and their connections with the SDGs.
      Citation: Data
      PubDate: 2021-01-20
      DOI: 10.3390/data6020007
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 8: Characteristics of Recent Aftershocks Sequences
           (2014, 2015, 2018) Derived from New Seismological and Geodetic Data on the
           Ionian Islands, Greece

    • Authors: Moshou, Argyrakis, Konstantaras, Daverona, Sagias
      First page: 8
      Abstract: In 2014–2018, four strong earthquakes occurred in the Ionian Sea, Greece. After these events, a rich aftershock sequence followed. More specifically, according to the manual solutions of the National Observatory of Athens, the first event occurred on 26 January 2014 in Cephalonia Island with magnitude ML = 5.8, followed by another in the same region on 3 February 2014 with magnitude ML = 5.7. The third event occurred on 17 November 2015, ML = 6.0 in Lefkas Island and the last on 25 October 2018, ML = 6.6 in Zakynthos Island. The first three of these earthquakes caused moderate structural damage, mainly in houses, and caused particular unrest among the local population. This work determines a seismic moment tensor for both large and intermediate magnitude earthquakes (M > 4.0). Geodetic data from permanent GPS stations were analyzed to investigate the displacement due to the earthquakes.
      Citation: Data
      PubDate: 2021-01-20
      DOI: 10.3390/data6020008
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 9: On Linear and Circular Approach to GPS Data
           Processing: Analyses of the Horizontal Positioning Deviations Based on the
           Adriatic Region IGS Observables

    • Authors: Davor Šakan, Serdjo Kos, Biserka Drascic Ban, David Brčić
      First page: 9
      Abstract: Global and regional positional accuracy assessment is of the highest importance for any satellite navigation system, including the Global Positioning System (GPS). Although positioning error can be expressed as a vector quantity with direction and magnitude, most of the research focuses on error magnitude only. The positional accuracy can be evaluated in terms of navigational quadrants as a further refinement of error distribution, as shown here. This research was conducted in the wider area of the Northern Adriatic Region, employing the International Global Navigation Satellite Systems (GNSS) Service (IGS) data and products. Similarities of positional accuracy and deviations distributions for Single Point Positioning (SPP) were addressed in terms of magnitudes. Data were analyzed over an 11-day period. Linear and circular statistical methods were used to quantify regional positional accuracy and error behavior. This was conducted in terms of both scalar and vector values, with assessment of the underlying probability distributions. Navigational quadrantal positioning error subset analysis was carried out. Similarity in the positional accuracy and positioning deviations behavior, with uneven positional distribution between quadrants, indicated the directionality of the total positioning error. The underlying distributions for latitude and longitude deviations followed approximately normal distributions, while the radius was approximated by the Rayleigh distribution. The Weibull and gamma distributions were considered, as well. Possible causes of the analyzed positioning deviations were not investigated, but the ultimate positioning products were obtained as in standard, single-frequency positioning scenarios.
      Citation: Data
      PubDate: 2021-01-21
      DOI: 10.3390/data6020009
      Issue No: Vol. 6, No. 2 (2021)
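
      The abstract above reports approximately normal latitude and longitude deviations and a radius approximated by the Rayleigh distribution. The sketch below illustrates why those two observations fit together: if the two deviations are independent, zero-mean, and share one standard deviation (assumptions made here for illustration, not statements about the paper's data), the radial error is Rayleigh distributed. The data are simulated, and all function names are hypothetical.

```python
import math
import random


def simulate_radii(n, sigma, seed=0):
    """Radial errors from independent zero-mean normal deviations
    in two axes that share the standard deviation sigma."""
    rng = random.Random(seed)
    return [math.hypot(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))
            for _ in range(n)]


def rayleigh_sigma_mle(radii):
    """Maximum-likelihood estimate of the Rayleigh scale parameter."""
    return math.sqrt(sum(r * r for r in radii) / (2 * len(radii)))


def rayleigh_mean(sigma):
    """Theoretical mean of a Rayleigh distribution with scale sigma."""
    return sigma * math.sqrt(math.pi / 2)


radii = simulate_radii(20000, sigma=2.0)
sigma_hat = rayleigh_sigma_mle(radii)
print("estimated scale:", sigma_hat)
print("empirical mean radius:", sum(radii) / len(radii))
print("Rayleigh mean for estimated scale:", rayleigh_mean(sigma_hat))
```

      With real positioning data the two axes may be correlated or have unequal spread, in which case the Rayleigh fit is only approximate, which is consistent with the abstract's wording.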
       
  • Data, Vol. 6, Pages 10: Balancing Plurality and Educational Essence:
           Higher Education Between Data-Competent Professionals and Data
           Self-Empowered Citizens

    • Authors: Nils Hachmeister, Katharina Weiß, Juliane Theiß, Reinhold Decker
      First page: 10
      Abstract: Data are increasingly important in central facets of modern life: academics, professions, and society at large. Educating aspiring minds to meet the highest standards in these facets is the mandate of institutions of higher education. This, naturally, includes the preparation for excelling in today’s data-driven world. In recent years, an intensive academic discussion has resulted in the distinction between two different modes of data-related education: data science and data literacy education. As a large number of study programs and offers are emerging around the world, data literacy in higher education is a particular focus of this paper. These programs, despite sharing the same name, differ substantially in their educational content, i.e., a high plurality can be observed. This paper explores this plurality, comments on the role it might play and suggests ways it can be dealt with by maintaining a high degree of adaptiveness and plurality while simultaneously establishing a consistent educational “essence”. It identifies a skill set, data self-empowerment, as a potential part of this essence. Data science and literacy education are still experiencing changeability in their emergence as fields of study, while additionally being stirred up by rapid developments, bringing about a need for flexibility and dialectic.
      Citation: Data
      PubDate: 2021-01-21
      DOI: 10.3390/data6020010
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 11: The Effect of Preprocessing Techniques, Applied to
           Numeric Features, on Classification Algorithms’ Performance

    • Authors: Esra’a Alshdaifat, Doa’a Alshdaifat, Ayoub Alsarhan, Fairouz Hussein, Subhieh Moh’d Faraj S. El-Salhi
      First page: 11
      Abstract: It is recognized that the performance of any prediction model is a function of several factors. One of the most significant factors is the adopted preprocessing techniques. In other words, preprocessing is an essential process to generate an effective and efficient classification model. This paper investigates the impact of the most widely used preprocessing techniques, with respect to numerical features, on the performance of classification algorithms. The effect of combining various normalization techniques and missing-value handling strategies is assessed on eighteen benchmark datasets using two well-known classification algorithms and adopting different performance evaluation metrics and statistical significance tests. According to the reported experimental results, the impact of the adopted preprocessing techniques varies from one classification algorithm to another. In addition, a statistically significant difference between the considered data preprocessing techniques is demonstrated.
      Citation: Data
      PubDate: 2021-01-21
      DOI: 10.3390/data6020011
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 12: A Systematic Survey of ML Datasets for Prime CV
           Research Areas—Media and Metadata

    • Authors: Helder F. Castro, Jaime S. Cardoso, Maria T. Andrade
      First page: 12
      Abstract: The ever-growing capabilities of computers have enabled pursuing Computer Vision through Machine Learning (i.e., MLCV). ML tools require large amounts of information to learn from (ML datasets). These are costly to produce but have received reduced attention regarding standardization. This prevents the cooperative production and exploitation of these resources, impedes countless synergies, and hinders ML research. No global view exists of the MLCV dataset tissue. Acquiring it is fundamental to enable standardization. We provide an extensive survey of the evolution and current state of MLCV datasets (1994 to 2019) for a set of specific CV areas as well as a quantitative and qualitative analysis of the results. Data were gathered from online scientific databases (e.g., Google Scholar, CiteSeerX). We reveal the heterogeneous plethora that comprises the MLCV dataset tissue; their continuous growth in volume and complexity; the specificities of the evolution of their media and metadata components regarding a range of aspects; and that MLCV progress requires the construction of a global standardized (structuring, manipulating, and sharing) MLCV “library”. Accordingly, we formulate a novel interpretation of this dataset collective as a global tissue of synthetic cognitive visual memories and define the immediately necessary steps to advance its standardization and integration.
      Citation: Data
      PubDate: 2021-01-22
      DOI: 10.3390/data6020012
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 13: Acknowledgment to Reviewers of Data in 2020

    • Authors: Data Editorial Office
      First page: 13
      Abstract: Peer review is the driving force of journal development, and reviewers are gatekeepers who ensure that Data maintains its standards for the high quality of its published papers [...]
      Citation: Data
      PubDate: 2021-02-01
      DOI: 10.3390/data6020013
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 14: Retinal Fundus Multi-Disease Image Dataset
           (RFMiD): A Dataset for Multi-Disease Detection Research

    • Authors: Samiksha Pachade, Prasanna Porwal, Dhanshree Thulkar, Manesh Kokare, Girish Deshmukh, Vivek Sahasrabuddhe, Luca Giancardo, Gwenolé Quellec, Fabrice Mériaudeau
      First page: 14
      Abstract: The world faces difficulties in terms of eye care, including treatment, quality of prevention, vision rehabilitation services, and scarcity of trained eye care experts. Early detection and diagnosis of ocular pathologies would help forestall visual impairment. One challenge that limits the adoption of computer-aided diagnosis tools by ophthalmologists is that a number of sight-threatening rare pathologies, such as central retinal artery occlusion or anterior ischemic optic neuropathy, are usually ignored. In the past two decades, many publicly available datasets of color fundus images have been collected with a primary focus on diabetic retinopathy, glaucoma, age-related macular degeneration and a few other frequent pathologies. To enable development of methods for automatic ocular disease classification of frequent diseases along with the rare pathologies, we have created a new Retinal Fundus Multi-disease Image Dataset (RFMiD). It consists of 3200 fundus images captured using three different fundus cameras with 46 conditions annotated through adjudicated consensus of two senior retinal experts. To the best of our knowledge, our dataset, RFMiD, is the only publicly available dataset that covers such a wide variety of diseases that appear in routine clinical settings. This dataset will enable the development of generalizable models for retinal screening.
      Citation: Data
      PubDate: 2021-02-03
      DOI: 10.3390/data6020014
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 15: Repository Approaches to Improving Quality of
           Shared Data and Code

    • Authors: Ana Trisovic, Katherine Mika, Ceilyn Boyd, Sebastian Feger, Mercè Crosas
      First page: 15
      Abstract: Sharing data and code for reuse has become increasingly important in scientific work over the past decade. However, in practice, shared data and code may be unusable, or published results obtained from them may be irreproducible. Data repository features and services contribute significantly to the quality, longevity, and reusability of datasets. This paper presents a combination of original and secondary data analysis studies focusing on computational reproducibility, data curation, and gamified design elements that can be employed to indicate and improve the quality of shared data and code. The findings of these studies are sorted into three approaches that can be valuable to data repositories, archives, and other research dissemination platforms.
      Citation: Data
      PubDate: 2021-02-03
      DOI: 10.3390/data6020015
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 16: Investigating the Adoption of Big Data Management
           in Healthcare in Jordan

    • Authors: Hani Bani-Salameh, Mona Al-Qawaqneh, Salah Taamneh
      First page: 16
      Abstract: Software developers and data scientists use and deal with big data to easily discover useful knowledge and find better solutions to improve healthcare services and patient safety. Big data analytics (BDA) is getting attention due to its role in decision-making across the healthcare field. Therefore, this article examines the adoption mechanism of big data analytics and management in healthcare organizations in Jordan. Additionally, it discusses the characteristics of health big data and the challenges and limitations of health big data analytics and management in Jordan. This article proposes a conceptual framework for utilizing health big data. The proposed conceptual framework suggests a way to merge the existing health information system with the National Health Information Exchange (HIE), which might play a role in extracting insights from our massive datasets, increasing data availability, and reducing wasted resources. When applying the framework, the collected data are processed to develop knowledge and support decision-making, which helps improve health care quality for both the community and individuals by improving diagnosis, treatment, and other services.
      Citation: Data
      PubDate: 2021-02-06
      DOI: 10.3390/data6020016
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 17: Agricultural Crop Change in the Willamette Valley,
           Oregon, from 2004 to 2017

    • Authors: Bogdan M. Strimbu, George Mueller-Warrant, Kristin Trippe
      First page: 17
      Abstract: The Willamette Valley, bounded to the west by the Coast Range and to the east by the Cascade Mountains, is the largest river valley completely confined to Oregon. The fertile valley soils combined with a temperate, marine climate create ideal agronomic conditions for seed production. Historically, seed cropping systems in the Willamette Valley have focused on the production of grass and forage seeds. In addition to growing over two-thirds of the nation’s cool-season grass seed, cropping systems in the Willamette Valley include a diverse rotation of over 250 commodities for forage, seed, food, and cover cropping applications. Tracking the sequence of crop rotations that are grown in the Willamette Valley is paramount to answering a broad spectrum of agronomic, environmental, and economic questions. Landsat imagery covering approximately 25,303 km² was used to identify agricultural crops in production from 2004 to 2017. The agricultural crops were distinguished by classifying images primarily acquired by three platforms: Landsat 5 (2003–2013), Landsat 7 (2003–2017), and Landsat 8 (2013–2017). Before conducting maximum likelihood remote sensing classification, the images acquired by Landsat 7 were pre-processed to reduce the impact of the scan line corrector failure. The corrected images were subsequently used to classify 35 different land-use classes and 137 unique two-year-long sequences of 57 classes of non-urban and non-forested land-use categories from 2004 through 2014. Our final data product uses new and previously published results to classify the western Oregon landscape into 61 different land-use classes, including four majority-rule-over-time super-classes and 57 regular classes of annually disturbed agricultural crops (19 classes), perennial crops (20 classes), forests (13 classes), and urban developments (5 classes). These publicly available data can be used to inform and support environmental and agricultural land-use studies.
      Citation: Data
      PubDate: 2021-02-07
      DOI: 10.3390/data6020017
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 18: The State of the Art in Methodologies of Course
           Recommender Systems—A Review of Recent Research

    • Authors: Deepani B. Guruge, Rajan Kadel, Sharly J. Halder
      First page: 18
      Abstract: In recent years, education institutions have offered a wide range of course selections with overlaps. This presents significant challenges to students in selecting successful courses that match their current knowledge and personal goals. Although many studies have been conducted on Recommender Systems (RS), the methodologies used in course RS have not yet been sufficiently reviewed. To fill this literature gap, this paper presents the state of the art of methodologies used in course RS along with a summary of the types of data sources used to evaluate these techniques. This review aims to recognize emerging trends in course RS techniques in recent research literature to deliver insights for researchers for further investigation. We provide a systematic review process followed by research findings on the current methodologies implemented in different course RS in selected research journals, including collaborative, content-based, knowledge-based, Data Mining (DM), hybrid, statistical and Conversational RS (CRS). This study analyzed publications between 2016 and June 2020 in three repositories: IEEE Xplore, ACM, and Google Scholar. These papers were explored and classified based on the methodology used in recommending courses. This review has revealed that hybrid course RS are growing in popularity in recent publications, followed by DM techniques. However, few CRS-based course RS were present in the selected publications. Finally, we discuss future avenues based on the research outcome, which might lead to next-generation course RS.
      Citation: Data
      PubDate: 2021-02-11
      DOI: 10.3390/data6020018
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 19: High-to-Low (Regional) Fertility Transitions in a
           Peripheral European Country: The Contribution of Exploratory Time Series
           Analysis

    • Authors: Jesus Rodrigo-Comino, Gianluca Egidi, Luca Salvati, Giovanni Quaranta, Rosanna Salvia, Antonio Gimenez-Morera
      First page: 19
      Abstract: Diachronic variations in demographic rates have frequently reflected social transformations and a (more or less evident) impact of sequential economic downturns. By assessing changes over time in Total Fertility Rate (TFR) at the regional scale in Italy, our study investigates the long-term transition (1952–2019) characteristic of Mediterranean fertility, showing a continuous decline of births since the late 1970s and marked disparities between high- and low-fertility regions along the latitude gradient. Together with a rapid decline in the country TFR, the spatiotemporal evolution of regional fertility in Italy—illustrated through an exploratory time series statistical approach—outlines the marked divide between (wealthier) Northern regions and (economically disadvantaged) Southern regions. Non-linear fertility trends and increasing spatial heterogeneity in more recent times indicate the role of individual behaviors leveraging a generalized decline in marriage and childbearing propensity. Assuming differential responses of regional fertility to changing socioeconomic contexts, these trends are more evident in Southern Italy than in Northern Italy. Reasons at the base of such fertility patterns were extensively discussed focusing—among others—on the distinctive contribution of internal and international migrations to regional fertility rates. Based on these findings, Southern Italy, an economically disadvantaged, peripheral region in Mediterranean Europe, is taken as a paradigmatic case of demographic shrinkage—whose causes and consequences can be generalized to wider contexts in (and outside) Europe.
      Citation: Data
      PubDate: 2021-02-16
      DOI: 10.3390/data6020019
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 20: Dataset of Two-Dimensional Gel Electrophoresis
           Images of Acute Myeloid Leukemia Patients before and after Induction
           Therapy

    • Authors: Juan E. Urrea, Luisa F. Restrepo, Jeanette Prada-Arismendy, Erwing Castillo, Manuel M. Goez, Maria C. Torres-Madronero, Edilson Delgado-Trejos, Sarah Röthlisberger
      First page: 20
      Abstract: Acute myeloid leukemia (AML) is a malignant disorder of the hematopoietic stem and progenitor cells, which results in the build-up of immature blasts in the bone marrow and eventually in the peripheral blood of affected patients. Accurately assessing a patient’s prognosis is very important for clinical management of the disease, which is why there are several prognostic factors such as age, performance status at diagnosis, platelet count, serum creatinine and albumin that are taken into account by the clinician when deciding the course of treatment. However, proteomic changes related to treatment response in this patient group have not been widely explored. Here, we make available a set of 22 two-dimensional gel electrophoresis (2DGE) images obtained from the peripheral blood samples of 11 patients with AML, taken at the time of diagnosis and after induction therapy (approximately 21–28 days after starting treatment). The same set of 2DGE images is also made available after a preprocessing stage (an additional 22 pre-processed 2DGE images), which was performed using algorithms developed in Python, in order to improve the visualization of characteristic spots and facilitate proteomic analysis of this type of image.
      Citation: Data
      PubDate: 2021-02-18
      DOI: 10.3390/data6020020
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 21: An Open GMNS Dataset of a Dynamic Multi-modal
           Transportation Network Model of Melbourne, Australia

    • Authors: Nourmohammadi, Mansourianfar, Shafiei, Gu, Saberi
      First page: 21
      Abstract: Simulation-based dynamic traffic assignment models are increasingly used in urban transportation systems analysis and planning. They replicate traffic dynamics across transportation networks by capturing the complex interactions between travel demand and supply. However, their applications particularly for large-scale networks have been hindered by the challenges associated with the collection, parsing, development, and sharing of data-intensive inputs. In this paper, we develop and share an open dataset for reproduction of a dynamic multi-modal transportation network model of Melbourne, Australia. The dataset is developed consistently with the General Modeling Network Specification (GMNS), enabling software-agnostic human and machine readability. GMNS is a standard readable format for sharing routable transportation network data that is designed to be used in multimodal static and dynamic transportation operations and planning models.
      Citation: Data
      PubDate: 2021-02-19
      DOI: 10.3390/data6020021
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 22: A Long-Term, Real-Life Parkinson Monitoring
           Database Combining Unscripted Objective and Subjective Recordings

    • Authors: Jeroen G. V. Habets, Margot Heijmans, Albert F. G. Leentjens, Claudia J. P. Simons, Yasin Temel, Mark L. Kuijf, Pieter L. Kubben, Christian Herff
      First page: 22
      Abstract: Accurate real-life monitoring of motor and non-motor symptoms is a challenge in Parkinson’s disease (PD). The unobtrusive capturing of symptoms and their naturalistic fluctuations within or between days can improve evaluation and titration of therapy. First-generation commercial PD motion sensors are promising to augment clinical decision-making in general neurological consultation, but concerns remain regarding their short-term validity, and long-term real-life usability. In addition, tools monitoring real-life subjective experiences of motor and non-motor symptoms are lacking. The dataset presented in this paper constitutes a combination of objective kinematic data and subjective experiential data, recorded parallel to each other in a naturalistic, long-term real-life setting. The objective data consists of accelerometer and gyroscope data, and the subjective data consists of data from ecological momentary assessments. Twenty PD patients were monitored without daily life restrictions for fourteen consecutive days. The two types of data can be used to address hypotheses on naturalistic motor and/or non-motor symptomatology in PD.
      Citation: Data
      PubDate: 2021-02-23
      DOI: 10.3390/data6020022
      Issue No: Vol. 6, No. 2 (2021)
       
  • Data, Vol. 6, Pages 2: The Use of National Strategic Reference Framework
           Data in Knowledge Graphs and Data Mining to Identify Red Flags

    • Authors: Charalampos Bratsas, Evangelos Chondrokostas, Kleanthis Koupidis, Ioannis Antoniou
      First page: 2
      Abstract: Red Flags in fiscal projects are warning signs that may indicate underlying problems with their implementation. In this paper, we present how National Strategic Reference Framework Open Data can be used to take full advantage of semantic web technologies and data mining techniques to build a knowledge-based system that identifies Red Flags. We collected the data from the Open Data API provided by the Greek Ministry of Economy and Finance. The data model consists of two ontologies: the Vocabulary of Fiscal Projects, describing the fiscal projects, and the National Strategic Reference Framework Greece Vocabulary, describing the Greek National Strategic Reference Framework data. We transformed the data into RDF triples and uploaded them onto an OpenLink Virtuoso Server, so that we could retrieve them via SPARQL queries. Performance indicators were defined to assess the state of the project, and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) was used to identify Red Flags. The user requirement is that rejected projects should raise Red Flags, to avoid project failure and help the auditor organize the monitoring process efficiently by not having to examine most of the non-problematic projects. We performed a use case scenario in which an auditor has to examine NSRF projects approximately 12 months before the end of the programming period. The system retrieved the fiscal information, calculated the performance indicators and identified the Red Flags. The last update of the projects’ status after the end of the programming period was then retrieved, and the number of rejected projects was extracted, to test whether the user requirements were satisfied. Rejected projects constitute 3.8% of the total projects. The results of the use case scenario show that the RedFlags platform is more likely to identify project failures and not raise Red Flags on projects that were not rejected. Therefore, the RedFlags platform, using open data, helps the auditor organize the monitoring process better.
      Citation: Data
      PubDate: 2021-01-04
      DOI: 10.3390/data6010002
      Issue No: Vol. 6, No. 1 (2021)
       
  • Data, Vol. 6, Pages 3: Drugs, Active Ingredients and Diseases Database in
           Spanish. Augmenting the Resources for Analyses on Drug–Illness
           Interactions

    • Authors: Irene López-Rodríguez, César F. Reyes-Manzano, Israel Reyes-Ramírez, Tania J. Contreras-Uribe, Lev Guzmán-Vargas
      First page: 3
      Abstract: Quantitative and qualitative data on active-ingredient drug composition are essential information for characterizing near-field exposure of consumers to product-related chemicals, among other things. Equally important is the characterization of the relationship between one or many active ingredients in terms of the diseases they are prescribed for. Such evaluations, however, require quantitative information at different anatomical levels. To complement the available sources of information on active substances and diseases, we have designed a database with enough versatility to potentially be used in a variety of analyses. By using information provided by a well-established online pharmacological dictionary, we present a database with 11 tables which are easy to access and manipulate. Specifically, we present datasets containing the details of 12,827 marketed drug products, 40,164 diseases, 6231 active pharmaceutical ingredients and 4093 side effects. We exemplify the usefulness of our database with three simple visualizations, which confirm the importance of the data for quantifying the complexity in the associations among active substances, diseases and side effects. Although there are databases with detailed information on active substances and diseases, none of them can be found in Spanish. Our work presents an option that contributes substantially to obtaining well-classified information in order to evaluate the roles of active pharmaceutical ingredients, diseases and side effects. These datasets also provide information about clinical and pharmacological groupings which may be useful for clinical and academic researchers. The database will be regularly updated and extended with the newly available Virtual Medicinal Products.
      Citation: Data
      PubDate: 2021-01-09
      DOI: 10.3390/data6010003
      Issue No: Vol. 6, No. 1 (2021)
       
  • Data, Vol. 6, Pages 4: No-z Model for Magnetic Fields of Different
           Astrophysical Objects and Stability of the Solutions

    • Authors: Evgeny Mikhailov, Daniela Boneva, Maria Pashentseva
      First page: 4
      Abstract: A wide range of astrophysical objects, such as the Sun, galaxies, stars, planets, accretion discs etc., have large-scale magnetic fields. Their generation is often based on the dynamo mechanism, which is connected with the joint action of the alpha-effect and differential rotation. These compete with turbulent diffusion. If the dynamo is intensive enough, the magnetic field grows; otherwise it decays. The magnetic field evolution is described by the Steenbeck–Krause–Raedler equations, which are quite difficult to solve. So, for different objects, specific two-dimensional models are used. For thin discs (this shape corresponds to galaxies and accretion discs), the no-z approximation is usually used. Some of the partial derivatives are replaced by algebraic expressions, and the solenoidality condition is taken into account as well. The field generation is restricted by the equipartition value and saturates if the field becomes comparable with it. From the point of view of mathematical physics, these saturation values can be characterized as stable points of the equations. The field can approach these values monotonically or with oscillations. This depends on the type of stability of these points, that is, whether each is a node or a focus. Here, we study the stability of such points and give examples for astrophysical applications.
      Citation: Data
      PubDate: 2021-01-10
      DOI: 10.3390/data6010004
      Issue No: Vol. 6, No. 1 (2021)
       
  • Data, Vol. 6, Pages 5: Aircraft Engine Run-to-Failure Dataset under Real
           Flight Conditions for Prognostics and Diagnostics

    • Authors: Manuel Arias Chao, Chetan Kulkarni, Kai Goebel, Olga Fink
      First page: 5
      Abstract: A key enabler of intelligent maintenance systems is the ability to predict the remaining useful lifetime (RUL) of their components, i.e., prognostics. The development of data-driven prognostics models requires datasets with run-to-failure trajectories. However, large representative run-to-failure datasets are often unavailable in real applications because failures are rare in many safety-critical systems. To foster the development of prognostics methods, we develop a new realistic dataset of run-to-failure trajectories for a fleet of aircraft engines under real flight conditions. The dataset was generated with the Commercial Modular Aero-Propulsion System Simulation (CMAPSS) model developed at NASA. The damage propagation modelling used in this dataset builds on the modelling strategy from previous work and incorporates two new levels of fidelity. First, it considers real flight conditions as recorded on board a commercial jet. Second, it extends the degradation modelling by relating the degradation process to its operation history. This dataset also provides the health or fault class. Therefore, besides its applicability to prognostics problems, the dataset can be used for fault diagnostics.
      Citation: Data
      PubDate: 2021-01-13
      DOI: 10.3390/data6010005
      Issue No: Vol. 6, No. 1 (2021)
       
  • Data, Vol. 6, Pages 6: The Hierarchical Classifier for COVID-19 Resistance
           Evaluation

    • Authors: Nataliya Shakhovska, Ivan Izonin, Nataliia Melnykova
      First page: 6
      Abstract: Finding dependencies in the data requires the analysis of relations between dozens of parameters of the studied process and hundreds of possible sources of influence on this process. Dependencies are nondeterministic and therefore modeling requires the use of statistical methods for analyzing random processes. Part of the information is often hidden from observation or not monitored. That is why many difficulties arise in the process of analyzing the collected information. The paper aims to find frequent patterns and parameters affected by COVID-19. The novelty of the paper is a hierarchical architecture comprising supervised and unsupervised methods. It allows the development of an ensemble of methods based on k-means clustering and classification. The best classifiers from the ensemble are random forest with 500 trees and XGBoost. Classification on separated clusters gives accuracy 4% higher than analysis of the whole dataset. The proposed approach can also be used for personalized medicine decision support in other domains. Feature selection allows us to identify the following features as having the highest impact on COVID-19: age, sex, blood group, and whether the patient had influenza.
      Citation: Data
      PubDate: 2021-01-15
      DOI: 10.3390/data6010006
      Issue No: Vol. 6, No. 1 (2021)
       
  • Data, Vol. 6, Pages 1: OFCOD: On the Fly Clustering Based Outlier
           Detection Framework

    • Authors: Ahmed Elmogy, Hamada Rizk, Amany M. Sarhan
      First page: 1
      Abstract: In data mining, outlier detection is a major challenge as it has an important role in many applications such as medical data, image processing, fraud detection, intrusion detection, and so forth. An extensive variety of clustering-based approaches have been developed to detect outliers. However, they are by nature time-consuming, which restricts their use in real-time applications. Furthermore, outlier detection requests are handled one at a time, which means that each request is initiated individually with a particular set of parameters. In this paper, the first clustering-based outlier detection framework, On the Fly Clustering Based Outlier Detection (OFCOD), is presented. OFCOD enables analysts to find outliers effectively and in a timely manner on request, even within huge datasets. The proposed framework has been tested and evaluated using two real-world datasets with different features and applications: one with 699 records, and another with five million records. The experimental results show that the proposed framework outperforms other existing approaches across several evaluation metrics.
      Citation: Data
      PubDate: 2020-12-30
      DOI: 10.3390/data6010001
      Issue No: Vol. 6, No. 1 (2020)
       
  • Data, Vol. 5, Pages 87: Survey of Decentralized Solutions with Mobile
           Devices for User Location Tracking, Proximity Detection, and Contact
           Tracing in the COVID-19 Era

    • Authors: Viktoriia Shubina, Sylvia Holcer, Michael Gould, Elena Simona Lohan
      First page: 87
      Abstract: Some of the recent developments in data science for worldwide disease control have involved research of large-scale feasibility and usefulness of digital contact tracing, user location tracking, and proximity detection on users’ mobile devices or wearables. A centralized solution relying on collecting and storing user traces and location information on a central server can provide more accurate and timely actions than a decentralized solution in combating viral outbreaks, such as COVID-19. However, centralized solutions are more prone to privacy breaches and privacy attacks by malevolent third parties than decentralized solutions, storing the information in a distributed manner among wireless networks. Thus, it is of timely relevance to identify and summarize the existing privacy-preserving solutions, focusing on decentralized methods, and analyzing them in the context of mobile device-based localization and tracking, contact tracing, and proximity detection. Wearables and other mobile Internet of Things devices are of particular interest in our study, as not only privacy, but also energy-efficiency, targets are becoming more and more critical to the end-users. This paper provides a comprehensive survey of user location-tracking, proximity-detection, and digital contact-tracing solutions in the literature from the past two decades, analyses their advantages and drawbacks concerning centralized and decentralized solutions, and presents the authors’ thoughts on future research directions in this timely research field.
      Citation: Data
      PubDate: 2020-09-23
      DOI: 10.3390/data5040087
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 88: Volatile Compounds Emitted from the Cat Urine
           Contaminated Carpet before and after Treatment with Marketed Cleaning
           Products: A Simultaneous Chemical and Sensory Analysis

    • Authors: Chumki Banik, Jacek A. Koziel, Elizabeth Flickinger
      First page: 88
      Abstract: Urination on carpet and subflooring can develop into a persistent and challenging problem when trying to mitigate odor. Very little or no information is published on how volatile organic compounds (VOCs) change over time when urine is deposited on a carpet covering a plywood-type subflooring. This research has investigated the VOCs emitted from carpet + subflooring (control), carpet + subflooring sprayed with water (control with moisture), and cat urine-contaminated carpet + subflooring (treatment) over time (day 0 and 15). In addition, the study has recorded the effect of four popular cleaning product applications on VOCs emitted from carpet and evaluated their efficacy in eliminating cat urine related indoor odors over time (days 0 and 15). Carpet-subflooring with all treatments were also contaminated with Micrococcus luteus, a nonmotile obligate aerobe commonly found in household dust, to observe the impact of the aerobe on carpet-subflooring VOCs emission. VOCs emitted from carpet + subflooring receiving different treatments were collected from headspace using solid-phase microextraction (SPME). The VOCs were analyzed using a gas chromatography-mass spectrometry olfactometry (GC-MS-O). Many common VOCs were released from the carpet on day 0 and day 15, specifically from urine contamination. Cleaning products were effective in masking several potent odors of cat urine contaminated carpet VOCs on day 0 but were unable to remove the odor that appeared on day 15 in most cases.
      Citation: Data
      PubDate: 2020-09-24
      DOI: 10.3390/data5040088
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 89: Emidec: A Database Usable for the Automatic
           Evaluation of Myocardial Infarction from Delayed-Enhancement Cardiac MRI

    • Authors: Alain Lalande, Zhihao Chen, Thomas Decourselle, Abdul Qayyum, Thibaut Pommier, Luc Lorgis, Ezequiel de la Rosa, Alexandre Cochet, Yves Cottin, Dominique Ginhac, Michel Salomon, Raphaël Couturier, Fabrice Meriaudeau
      First page: 89
      Abstract: One crucial parameter for evaluating the state of the heart after myocardial infarction (MI) is the viability of the myocardial segment, i.e., whether the segment recovers its functionality upon revascularization. MRI performed several minutes after the injection of a contrast agent (delayed enhancement-MRI or DE-MRI) is a method of choice to evaluate the extent of MI and, by extension, to assess viable tissues after an injury. The Emidec dataset is composed of a series of exams with DE-MR images in short-axis orientation covering the left ventricle, from normal cases or patients with myocardial infarction, with the contouring of the myocardium and diseased areas (if present) by experts in the domain. Moreover, the classical clinical parameters available when the patient is managed by an emergency department are provided for each case. To the best of our knowledge, the Emidec dataset is the first one where annotated DE-MRI are combined with clinical characteristics of the patient, allowing the development of methodologies for exam classification as well as for exam quantification.
      Citation: Data
      PubDate: 2020-09-24
      DOI: 10.3390/data5040089
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 90: Towards a Contextual Approach to Data Quality

    • Authors: Stefano Canali
      First page: 90
      Abstract: In this commentary, I propose a framework for thinking about data quality in the context of scientific research. I start by analyzing conceptualizations of quality as a property of information, evidence and data and reviewing research in the philosophy of information, the philosophy of science and the philosophy of biomedicine. I identify a push for purpose dependency as one of the main results of this review. On this basis, I present a contextual approach to data quality in scientific research, whereby the quality of a dataset is dependent on the context of use of the dataset as much as the dataset itself. I exemplify the approach by discussing current critiques and debates of scientific quality, thus showcasing how data quality can be approached contextually.
      Citation: Data
      PubDate: 2020-09-25
      DOI: 10.3390/data5040090
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 91: A Public Dataset of 24-h Multi-Levels
           Psycho-Physiological Responses in Young Healthy Adults

    • Authors: Alessio Rossi, Eleonora Da Pozzo, Dario Menicagli, Chiara Tremolanti, Corrado Priami, Alina Sîrbu, David A. Clifton, Claudia Martini, Davide Morelli
      First page: 91
      Abstract: Wearable devices now make it possible to record large quantities of physiological data, which can be used to obtain a clearer view of a person’s health status and behavior. However, to the best of our knowledge, there are no open datasets in the literature that provide psycho-physiological data. The Multilevel Monitoring of Activity and Sleep in Healthy people (MMASH) dataset presented in this paper provides 24 h of continuous psycho-physiological data, that is, inter-beat intervals data, heart rate data, wrist accelerometry data, sleep quality index, physical activity (i.e., number of steps per second), psychological characteristics (e.g., anxiety status, stressful events, and emotion declaration), and sleep hormone levels for 22 participants. The MMASH dataset will enable the investigation of possible relationships between the physical and psychological characteristics of people in daily life. Data were validated through different analyses that showed their compatibility with the literature.
      Citation: Data
      PubDate: 2020-09-25
      DOI: 10.3390/data5040091
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 92: Dataset of Search Results Organized as Learning
           Paths Recommended by Experts to Support Search as Learning

    • Authors: Verónica Proaño-Ríos, Roberto González-Ibáñez
      First page: 92
      Abstract: In this article, we introduce a dataset of curated learning paths (LPs) to support search as learning. LPs were obtained through an online survey delivered to experts in different domains. Data were then analyzed and described in terms of a set of variables. The resulting dataset comprised 83 LPs, each containing three web pages, for an overall collection consisting of 249 documents. The dataset is intended to provide information scientists, education researchers, and industry professionals, who provide information services in educational contexts, a valuable resource to (i) investigate patterns in the order of LPs, (ii) improve ranking models and/or re-ranking methods, (iii) explain the structure of the recommended LPs, and (iv) investigate alternative approaches to display search results based on the features of LPs.
      Citation: Data
      PubDate: 2020-09-27
      DOI: 10.3390/data5040092
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 93: Classification of Actual Sensor Network
           Deployments in Research Studies from 2013 to 2017

    • Authors: Janis Judvaitis, Artis Mednis, Valters Abolins, Ansis Skadins, Didzis Lapsa, Raimonds Rava, Maksims Ivanovs, Krisjanis Nesenbergs
      First page: 93
      Abstract: Technologies, such as Wireless Sensor Networks (WSN) and Internet of Things (IoT), have captured the imagination of researchers, businesses, and general public, due to breakthroughs in embedded system development, sensing technologies, and ubiquitous connectivity in recent years. That resulted in the emergence of an enormous, difficult-to-navigate body of work related to WSN and IoT. In an ongoing research effort to highlight trends and developments in these technologies and to see whether they are actually deployed rather than subjects of theoretical research with presumed potential use cases, we gathered and codified a dataset of scientific publications from a five-year period from 2013 to 2017 involving actual sensor network deployments, which will serve as a basis for future in-depth analysis of the field. In the first iteration, 15,010 potentially relevant articles were identified in SCOPUS and Web of Science databases; after two iterations, 3059 actual sensor network deployments were extracted from those articles and classified in a consistent way according to different categories, such as type of nodes, field of application, communication types, etc. We publish the resulting dataset with the intent that its further analysis may identify prospective research fields and future trends in WSN and IoT.
      Citation: Data
      PubDate: 2020-09-30
      DOI: 10.3390/data5040093
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 94: Visual Analytics Approach to Comprehensive
           Meteorological Time-Series Analysis

    • Authors: Milena Vuckovic, Johanna Schmidt
      First page: 94
      Abstract: In some of the domain-specific sectors, such as the climate domain, the provision of publicly available present-day high-resolution meteorological time series is often quite limited or completely lacking. This repeatedly leads to excessive deployment of synthetically generated (historical) meteorological time series (TMY) to support thermal performance assessments on both building and urban scales. These datasets are generally a misrepresentation of current weather variability, which may lead to erroneous inferences drawn from modelling results. In this regard, we outline the application potential of a visual analytics approach in the context of data quality assessment and validation of TMYs. For this purpose, we deployed a standalone visual analytics tool, Visplore, enriched with interlinked dashboards, customizable visualizations, and intuitive workflows, to support continuous interaction and early visual feedback. Driven by such integrated visual representations and visual interactions to enhance the analytical reasoning process, we were able to detect critical multifaceted discrepancies, on different levels of granularity, between TMY and present-day meteorological time series and synthesize them into cohesive patterns and insights. These mainly entailed diverging temporal trends and event time lags, under- and overestimation of warming and cooling regimes, respectively, and seasonal discrepancies, in particular meteorological parameters, to name a few.
      Citation: Data
      PubDate: 2020-09-30
      DOI: 10.3390/data5040094
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 95: Digital Psychological Platform for Mass
           Web-Surveys

    • Authors: Evgeny Nikulchev, Dmitry Ilin, Anastasiya Silaeva, Pavel Kolyasnikov, Vladimir Belov, Andrey Runtov, Pavel Pushkin, Nikolay Laptev, Anna Alexeenko, Shamil Magomedov, Alexander Kosenkov, Ilya Zakharov, Victoria Ismatullina, Sergey Malykh
      First page: 95
      Abstract: Web surveys are one of the most popular forms of primary data collection in research. However, mass surveys involve some challenges. It is necessary to account for different platforms and browsers, as well as different data transfer rates over connections in different regions of the country. Ensuring guaranteed data delivery under these conditions should determine the choice of technologies for implementing web surveys. The paper describes a solution that transfers a questionnaire to the client side in the form of an archive. This technological solution ensures independence from the data transfer rate and from the stability of the connection when filling out the survey takes a significant amount of time. The conducted survey benefited the service of education psychologists under the federal Ministry of Education. School psychologists consciously took part in the survey, realizing the importance of their opinion for organizing and improving their professional activities. The desire to answer open-ended questions in detail produced a part of the dataset in which answers consist of several sentences about different aspects of professional activity. An important challenge is the Russian language, for which there are not as many tools as for languages that are more widespread in the world. The survey involved 20,443 school psychologists from all regions of the Russian Federation, from both urban and rural areas. The answers did not contain spam, runaround answers, and so on, as evidenced by the average response time. For the surveys, an authoring development tool, DigitalPsyTools.ru, was used.
      Citation: Data
      PubDate: 2020-10-05
      DOI: 10.3390/data5040095
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 96: ASDToolkit: A Novel MATLAB Processing Toolbox for
           ASD Field Spectroscopy Data

    • Authors: Kathryn Elmer, Raymond J. Soffer, J. Pablo Arroyo-Mora, Margaret Kalacska
      First page: 96
      Abstract: Over the past 30 years, the use of field spectroscopy has risen in importance in remote sensing studies for the characterization of the surface reflectance of materials in situ within a broad range of applications. Potential uses range from measurements of individual targets of interest (e.g., vegetation, soils, validation targets) to characterizing the contributions of different materials within larger spatially mixed areas as would be representative of the spatial resolution captured by a sensor pixel (UAV to satellite scale). As such, it is essential that a complete and rigorous assessment of both the data acquisition procedures and the suitability of the derived data product be carried out. The measured energy from solar-reflective range spectroradiometers is influenced by the viewing and illumination geometries and the illumination conditions, which vary due to changes in solar position and atmospheric conditions. By applying corrections, the estimated absolute reflectance (Rabs) of targets can be calculated. This property is independent of illumination intensity or conditions, and is the metric commonly suggested to be used to compare spectra even when data are collected by different sensors or acquired under different conditions. By standardizing the process of estimated Rabs, as is provided in the described toolkit, consistency and repeatability in processing are ensured and the otherwise labor-intensive and error-prone processing steps are streamlined. The resultant end data product (Rabs) represents our current best effort to generate consistent and comparable ground spectra that have been corrected for viewing and illumination geometries as well as other factors such as the individual characteristics of the reference panel used during acquisition.
      Citation: Data
      PubDate: 2020-10-08
      DOI: 10.3390/data5040096
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 97: An Eddy Covariance Mesonet For Measuring
           Greenhouse Gas Fluxes in Coastal South Carolina

    • Authors: Jeremy D. Forsythe, Thomas L. O’Halloran, Michael A. Kline
      First page: 97
      Abstract: Coastal ecosystems are vulnerable to climate change and have been identified as sources of uncertainty in the global carbon budget. Here we introduce a recently established mesonet of eddy covariance towers in South Carolina and describe the sensor arrays and data workflow used to produce three site-years of flux observations in coastal ecosystems. The tower sites represent tidal salt marsh (US-HB1), mature longleaf pine forest (US-HB2), and longleaf pine restoration (replanted clearcut; US-HB3). Coastal ecosystems remain less represented in climate studies despite their potential to sequester large amounts of carbon. Our goal in publishing this open access dataset is to contribute observations in understudied coastal ecosystems to facilitate synthesis and modeling analyses that advance carbon cycle science.
      Citation: Data
      PubDate: 2020-10-15
      DOI: 10.3390/data5040097
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 98: The Role of Administrative and Secondary Data in
           Estimating the Costs and Effects of School and Workplace Closures due to
           the COVID-19 Pandemic

    • Authors: Auliya A. Suwantika, Neily Zakiyah, Ajeng Diantini, Rizky Abdulah, Maarten J. Postma
      First page: 98
      Abstract: As a part of mitigation strategies during a COVID-19 pandemic, the WHO currently recommends social distancing measures through school closures (SC) and work closures (WC) to control the infection spread and reduce the illness attack rate. Focusing on the use of administrative and secondary data, this study aimed to estimate the costs and effects of alternative strategies for mitigating the COVID-19 pandemic in Jakarta, Indonesia, by comparing the baseline (no intervention) with SC + WC for 2, 4, and 8 weeks as respective scenarios. A modified Susceptible-Exposed-Infected-Recovered (SEIR) compartmental model accounting for the spread of infection during the latent period was applied by taking into account a 1-year time horizon. To estimate the total pandemic cost of all scenarios, we took into account the cost of healthcare, SC, and productivity loss due to WC and illness. Next to costs, averted deaths were considered as the effect measure. In comparison with the baseline, the result showed that total savings in scenarios of SC + WC for 2, 4, and 8 weeks would be approximately $24 billion, $25 billion, and $34 billion, respectively. In addition, increasing the duration of SC and WC would increase the number of averted deaths. Scenarios of SC + WC for 2, 4, and 8 weeks would result in approximately 159,075, 173,963, and 250,842 averted deaths, respectively. A sensitivity analysis showed that the wage per day, infectious period, basic reproduction number, incubation period, and case fatality rate were found to be the most influential parameters affecting the savings and number of averted deaths. It can be concluded that all the mitigation scenarios were considered to be cost-saving, and increasing the duration of SC and WC would increase both the savings and the number of averted deaths.
      Citation: Data
      PubDate: 2020-10-18
      DOI: 10.3390/data5040098
      Issue No: Vol. 5, No. 4 (2020)
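
      Note: The abstract above names a modified SEIR compartmental model that accounts for infection spread during the latent period, but it does not give the equations. The following is only a minimal sketch of a generic SEIR model with forward-Euler time stepping. The function name, every parameter, and the `exposed_infectivity` term (one possible way to let exposed individuals transmit) are assumptions made for illustration and are not taken from the paper.

```python
def simulate_seir(population, beta, sigma, gamma, days,
                  exposed_infectivity=0.0, initial_infected=1.0, dt=0.1):
    """Integrate a basic SEIR model with forward-Euler steps.

    beta  : transmission rate per day (hypothetical value supplied by caller)
    sigma : rate of leaving the latent (exposed) state, 1 / latent period
    gamma : recovery rate, 1 / infectious period
    exposed_infectivity : relative infectiousness of exposed individuals;
        0.0 gives the textbook SEIR model, values > 0 are an assumed way
        to model transmission during the latent period.
    Returns a list of (time, S, E, I, R) tuples.
    """
    s = population - initial_infected
    e = 0.0
    i = initial_infected
    r = 0.0
    history = [(0.0, s, e, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        # Force of infection: contribution from I plus a scaled one from E.
        force = beta * (i + exposed_infectivity * e) / population
        new_exposed = force * s * dt
        new_infectious = sigma * e * dt
        new_recovered = gamma * i * dt
        s -= new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered
        history.append((step * dt, s, e, i, r))
    return history
```

      In a sketch like this, a closure scenario could be imitated by lowering `beta` for the weeks in which the closure applies. How the paper actually models school and workplace closures is not stated in the abstract.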
       
  • Data, Vol. 5, Pages 99: Specialization of Business Process Model and
           Notation Applications in Medicine—A Review

    • Authors: Hana Tomaskova, Martin Kopecky
      First page: 99
      Abstract: Process analysis and process modeling are a current topic that extends to many areas. This trend of using optimization and modeling techniques in various specific areas has led to the question of how widespread these approaches are overall in medical specializations. We compiled a list of 272 medical disciplines that we used as a search string with the Business Process Model and Notation (BPMN) for a Web of Science database search. Thus, we found a total of 485 documents that we subjected to the exclusion criteria. We analyzed the remaining 108 articles using bibliometric and content analyses to find answers to three research questions. This systematic review was carried out using the procedure proposed by Kitchenham and following the Preferred Items of the Systematic Review and Meta-Analysis Report (PRISMA). Due to the broad scope of the medical field, it was no surprise that for almost 85% of the sought-after medical specializations, we could not identify any publications in the given database when applying the BPMN. We analyzed the impact of upgrades to the BPMN on publishing. The keyword analysis showed a diametrical difference between the authors’ keywords and the so-called “Keywords Plus”, and we categorized the publications according to the purpose of applying the BPMN. However, the growing interest in combining BPMN with other approaches brings new challenges in practice.
      Citation: Data
      PubDate: 2020-10-19
      DOI: 10.3390/data5040099
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 100: Essential Variables for Environmental Monitoring:
           What are the Possible Contributions of Earth Observation Data Cubes?

    • Authors: Gregory Giuliani, Elvire Egger, Julie Italiano, Charlotte Poussin, Jean-Philippe Richard, Bruno Chatenoux
      First page: 100
      Abstract: Environmental sustainability is nowadays a major global issue that requires efficient and effective responses from governments. Essential variables (EV) have emerged in different scientific communities as a means to characterize and follow environmental changes through a set of measurements required to support policy evidence. To help track these changes, our planet has been under continuous observation from satellites since 1972. Currently, petabytes of satellite Earth observation (EO) data are freely available. However, the full information potential of EO data has not yet been realized because many big data challenges and complexity barriers hinder their effective use. Consequently, facilitating the production of EVs using the wealth of satellite EO data can be beneficial for environmental monitoring systems. In response to this issue, a comprehensive list of EVs that can take advantage of consistent time-series satellite data has been derived. In addition, a set of use cases, using an Earth Observation Data Cube (EODC) to process large volumes of satellite data, has been implemented to demonstrate the practical applicability of EODC to produce EVs. The proposed approach has been successfully tested, showing that EODC can facilitate the production of EVs at different scales while benefiting from the spatial and temporal dimensions of satellite EO data for enhanced environmental monitoring.
      Citation: Data
      PubDate: 2020-10-21
      DOI: 10.3390/data5040100
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 101: Dataset for Assessing the Economic Performance of
           a Residential PV Plant: The Analysis of a New Policy Proposal

    • Authors: Idiano D’Adamo, Massimo Gastaldi, Piergiuseppe Morone
      First page: 101
      Abstract: This data article describes the data behind the manuscript entitled “The post COVID-19 green recovery in practice: assessing the profitability of a policy proposal on residential photovoltaic plants”. Defining a business plan is a complex decision because the choice of input data significantly influences the economic assessment of a project. An Excel file is used to construct an economic model based on the Discounted Cash Flow (DCF) methodology, using Net Present Value (NPV) as an indicator. The input data are chosen through literature analysis, and the policy proposals are taken from the Revival Decree adopted by the Italian Government to counter the human and economic shock caused by COVID-19. The aggregation of these data enabled us to obtain both baseline and alternative scenarios to determine whether the realization of a residential photovoltaic (PV) plant is economically feasible. Similar data can be obtained for other countries according to the policy actions adopted, and this work can be easily replicated in different geographical contexts and for varying categories of stakeholders (e.g., consumers, who are called upon to implement a green transition).
      Citation: Data
      PubDate: 2020-10-28
      DOI: 10.3390/data5040101
      Issue No: Vol. 5, No. 4 (2020)
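
      Note: The abstract above refers to the Discounted Cash Flow (DCF) methodology with Net Present Value (NPV) as the indicator. In its common textbook form, NPV is the sum over years t of CF_t / (1 + r)^t, where CF_t is the net cash flow in year t and r is the discount rate. The sketch below illustrates that general formula only. The helper `pv_plant_cash_flows` and its constant-saving assumption are hypothetical simplifications and do not reproduce the authors' Excel model.

```python
def net_present_value(rate, cash_flows):
    """Discount a sequence of yearly cash flows to the present.

    cash_flows[0] is the flow at year 0 (undiscounted), cash_flows[1] the
    flow at the end of year 1, and so on.
    """
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(cash_flows))


def pv_plant_cash_flows(investment, annual_net_saving, lifetime_years):
    """Hypothetical cash-flow profile: one up-front investment followed by
    a constant net saving each year (a deliberate simplification)."""
    return [-investment] + [annual_net_saving] * lifetime_years


def is_profitable(rate, cash_flows):
    """Under the usual DCF decision rule, a project is acceptable if NPV > 0."""
    return net_present_value(rate, cash_flows) > 0
```

      A baseline and an alternative policy scenario could then be compared by building one cash-flow list per scenario and comparing the resulting NPVs at the same discount rate.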
       
  • Data, Vol. 5, Pages 102: Data for Heuristic Optimization of Electric
           Vehicles’ Charging Configuration Based on Loading Parameters

    • Authors: Sajjad Haider, Peter Schegner
      First page: 102
      Abstract: This dataset includes multiple files related to optimization of electric vehicles to minimize overloading in low voltage grids by varying the locations available to charge the EVs. The data include lognormally sampled hourly sorted scenarios across 11 charging locations for a stochastics-based Monte Carlo simulation. This simulation runs through 2 million scenarios based on actual probabilities to incorporate most possible situations. It also includes samples from normally distributed household electricity use scenarios based on agent-based modeling. The article includes the test grid parameters for simulation, which were used to create a benchmark grid in DigSilent Powerfactory software, as well as intermediate outputs defining worst case scenarios when electric vehicles were charged and results from three different optimization approaches involving a reduction in voltage drops, cable overloading and total line losses. The outputs from the benchmark grid were used to train a machine learning algorithm, the weights and codes for which are also attached. This trained network acted as the grid for subsequent iterative optimization procedures. Outputs are presented as a comparison between pre-optimization and post-optimization scenarios. The above dataset and procedure were repeated while varying the number of EVs between 0 and 100 in increments of 20, data for which are also attached. The data article supports a related submission titled “Minimization of Overloading Caused by Electric Vehicle (EV) Charging in Low Voltage Networks”.
      Citation: Data
      PubDate: 2020-10-29
      DOI: 10.3390/data5040102
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 103: Comparison of 3D Point Clouds Obtained by
           Terrestrial Laser Scanning and Personal Laser Scanning on Forest Inventory
           Sample Plots

    • Authors: Christoph Gollob, Tim Ritter, Arne Nothdurft
      First page: 103
      Abstract: In forest inventory, trees are usually measured using handheld instruments; among the most relevant are calipers, inclinometers, ultrasonic devices, and laser range finders. Traditional forest inventory has been redesigned since modern laser scanner technology became available. Laser scanners generate massive data in the form of 3D point clouds. We have developed a novel methodology to provide estimates of the tree positions, stem diameters, and tree heights from these 3D point clouds. This dataset was made publicly accessible to test new software routines for the automatic measurement of forest trees using laser scanner data. Benchmark studies with performance tests of different algorithms are welcome. The dataset contains co-registered raw 3D point-cloud data collected on 20 forest inventory sample plots in Austria. The data were collected by two different laser scanning systems: (1) A mobile personal laser scanner (PLS) (ZEB Horizon, GeoSLAM Ltd., Nottingham, UK) and (2) a static terrestrial laser scanner (TLS) (Focus3D X330, Faro Technologies Inc., Lake Mary, FL, USA). The data also contain digital terrain models (DTMs), field measurements as reference data (ground-truth), and the output of recent software routines for the automatic tree detection and the automatic stem diameter measurement.
      Citation: Data
      PubDate: 2020-10-31
      DOI: 10.3390/data5040103
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 104: Distinct Two-Stream Convolutional Networks for
           Human Action Recognition in Videos Using Segment-Based Temporal Modeling

    • Authors: Ashok Sarabu, Ajit Kumar Santra
      First page: 104
      Abstract: The two-stream convolutional neural network (CNN) has proven very successful for action recognition in videos. The main idea is to train two CNNs to learn spatial and temporal features separately and to combine the two scores to obtain the final scores. In the literature, we observed that most methods use similar CNNs for the two streams. In this paper, we design a two-stream CNN architecture with different CNNs for the two streams to learn spatial and temporal features. Temporal Segment Networks (TSN) are applied in order to retrieve long-range temporal features and to differentiate similar types of sub-action in videos. Data augmentation techniques are employed to prevent over-fitting. Advanced cross-modal pre-training is discussed and introduced to the proposed architecture in order to enhance the accuracy of action recognition. The proposed two-stream model is evaluated on two challenging action recognition datasets: HMDB-51 and UCF-101. The results show a significant performance increase, and the proposed architecture outperforms the existing methods.
      Citation: Data
      PubDate: 2020-11-11
      DOI: 10.3390/data5040104
      Issue No: Vol. 5, No. 4 (2020)
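
      Note: The abstract above says that the scores of the spatial and temporal streams are combined into final scores and that segment-based temporal modeling is used. The sketch below shows one common way such a combination can be done: average the per-segment class scores within each stream, apply softmax, and take a weighted average of the two streams. The averaging consensus, the `temporal_weight` default, and all function names are assumptions for illustration. The abstract does not specify the fusion rule the authors use.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities (numerically stable)."""
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    return [x / total for x in exps]


def segment_consensus(segment_scores):
    """Average class scores over video segments (one row per segment)."""
    n = len(segment_scores)
    return [sum(col) / n for col in zip(*segment_scores)]


def fuse_streams(spatial_segments, temporal_segments, temporal_weight=1.5):
    """Late fusion: weighted average of the two streams' class probabilities.

    temporal_weight is a hypothetical tuning parameter, not a value taken
    from the paper.
    """
    spatial = softmax(segment_consensus(spatial_segments))
    temporal = softmax(segment_consensus(temporal_segments))
    norm = 1.0 + temporal_weight
    return [(sp + temporal_weight * tp) / norm
            for sp, tp in zip(spatial, temporal)]


def predict(spatial_segments, temporal_segments, temporal_weight=1.5):
    """Return the index of the class with the highest fused probability."""
    fused = fuse_streams(spatial_segments, temporal_segments, temporal_weight)
    return max(range(len(fused)), key=fused.__getitem__)
```

      Fusing after softmax keeps the two streams on a comparable scale even when their raw scores differ in magnitude, which is one reason late fusion of probabilities is a frequent default.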
       
  • Data, Vol. 5, Pages 105: An Anti-Nucleocapsid Antigen Sars-Cov-2 Total
           Antibody Assay Finds Comparable Results in Edta-Anticoagulated Whole Blood
           Obtained from Capillary and Venous Blood Sampling

    • Authors: Martin Risch, Marc Kovac, Corina Risch, Dorothea Hillmann, Michael Ritzler, Nadia Wohlwend, Thomas Lung, Michael Allmann, Christoph Seger, Lorenz Risch
      First page: 105
      Abstract: Although SARS-CoV-2 antibody assays have been found to provide valid results in EDTA-anticoagulated whole blood, so far, it has not been demonstrated that antibody levels in whole blood originating from capillary blood samples are comparable to antibody levels measured in blood of venous origin. Here, blood is drawn simultaneously by capillary and venous blood sampling. Antibody titers are determined by an assay employing electrochemiluminescence (ECLIA), and SARS-CoV-2 total immunoglobulins are detected with specificity directed against the nucleocapsid antigen. Six individuals with confirmed COVID-19 and six individuals without COVID-19 are analyzed. Antibody titers in capillary and venous whole blood did not show significant differences, and when corrected for hematocrit, they did not differ from the results obtained from serum. In conclusion, capillary sampled EDTA-anticoagulated whole blood seems to be an attractive alternative matrix for the evaluation of SARS-CoV-2 antibodies when employing ECLIA for detecting total antibodies directed against the nucleocapsid antigen.
      Citation: Data
      PubDate: 2020-11-12
      DOI: 10.3390/data5040105
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 106: On the Stark Broadening of Be II Spectral Lines

    • Authors: Milan S. Dimitrijević, Magdalena Christova, Sylvie Sahal-Bréchot
      First page: 106
      Abstract: Calculated Stark broadening parameters of singly ionized beryllium spectral lines have been reported. Three spectral series have been studied within semiclassical perturbation theory. The plasma conditions cover temperatures from 2500 to 50,000 K and perturber densities of 10^11 cm^-3 and 10^13 cm^-3. The influence of the temperature and the role of the perturbers (electrons, protons and He+ ions) on the Stark width and shift have been discussed. Results could be useful for plasma diagnostics in astrophysics, laboratory, and industrial plasmas.
      Citation: Data
      PubDate: 2020-11-23
      DOI: 10.3390/data5040106
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 107: Municipalities in the Czech
           Republic—Compilation of “a Universal” Dataset

    • Authors: Vít Pászto, Rostislav Nétek, Alena Vondráková, Vít Voženílek
      First page: 107
      Abstract: There have been many changes in the spatial composition and formal delimitation of administrative boundaries of Czech municipalities over the past 30 years. Many municipalities have changed their official status; they split into more independent ones, were merged with existing ones, or formally redrew their boundaries due to advances in mapping technology. Such changes have made it almost impossible to analyze and visualize the temporal development of selected socioeconomic indicators in a way that delivers spatially coherent and time-comparable results. In this data description, we present the evolution of a unique (geo)dataset comprising the administrative borders of the Czech municipalities. The uniqueness lies in temporally and topologically justified spatial data resulting in a common division of the administrative units at the LAU2 level, valid from 1995 to 2019. Besides the topologically correct spatial representations of municipalities in Czechia, we also provide correspondence tables for each year in the mentioned period, which allow joining tabular statistics to spatial data. The dataset is available as a base layer for further temporal and spatial analyses and visualization of various socioeconomic statistical data.
      Citation: Data
      PubDate: 2020-11-24
      DOI: 10.3390/data5040107
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 108: Dataset of User Reactions When Filling Out Web
           Questionnaires

    • Authors: Shamil Magomedov, Dmitry Ilin, Anastasiya Silaeva, Evgeny Nikulchev
      First page: 108
      Abstract: This paper presents the dataset and the results of the analysis of user reactions when filling out questionnaires. Based on the analysis of 1980 results of users’ responses to simple questionnaire questions, patterns in user reactions were revealed. Data analysis shows that a user is characterized by reactions when answering a variety of questions, reflecting the individual skills of the interface, reading speed, speed of choosing an answer, which can be used to supplement personal verification in information systems. The built-in reaction time does not significantly load the data volumes for logging and transferring and does not contain confidential information. The data would be of interest for further research by specialists in the field of psychology, information security, and information systems design.
      Citation: Data
      PubDate: 2020-11-25
      DOI: 10.3390/data5040108
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 109: Long-Term, Gridded Standardized Precipitation
           Index for Hawai’i

    • Authors: Matthew P. Lucas, Clay Trauernicht, Abby G. Frazier, Tomoaki Miura
      First page: 109
      Abstract: Spatially explicit, wall-to-wall rainfall data provide foundational climatic information but alone are inadequate for characterizing meteorological, hydrological, agricultural, or ecological drought. The Standardized Precipitation Index (SPI) is one of the most widely used indicators of drought and defines localized conditions of both drought and excess rainfall based on period-specific (e.g., 1-month, 6-month, 12-month) accumulated precipitation relative to multi-year averages. A 93-year (1920–2012), high-resolution (250 m) gridded dataset of monthly rainfall available for the State of Hawai’i was used to derive gridded, monthly SPI values for 1-, 3-, 6-, 9-, 12-, 24-, 36-, 48-, and 60-month intervals. Gridded SPI data were validated against independent, station-based calculations of SPI provided by the National Weather Service. The gridded SPI product was also compared with the U.S. Drought Monitor during the overlapping period. This SPI product provides several advantages over currently available drought indices for Hawai’i in that it has statewide coverage over a long historical period at high spatial resolution to capture fine-scale climatic gradients and monitor changes in local drought severity.
      Citation: Data
      PubDate: 2020-11-26
      DOI: 10.3390/data5040109
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 110: Data Employed in the Construction of a Composite
           Protein Database for Proteogenomic Analyses of Cephalopods Salivary
           Apparatus

    • Authors: Daniela Almeida, Dany Domínguez-Pérez, Ana Matos, Guillermin Agüero-Chapin, Yuselis Castaño, Vitor Vasconcelos, Alexandre Campos, Agostinho Antunes
      First page: 110
      Abstract: Here we provide all datasets and details applied in the construction of a composite protein database required for the proteogenomic analyses of the article “Putative Antimicrobial Peptides of the Posterior Salivary Glands from the Cephalopod Octopus vulgaris Revealed by Exploring a Composite Protein Database”. All data, subdivided into six datasets, are deposited at the Mendeley Data repository as follows. Dataset_1 provides our composite database “All_Databases_5950827_sequences.fasta” derived from six smaller databases composed of (i) protein sequences retrieved from public databases related to cephalopods’ salivary glands, (ii) proteins identified with Proteome Discoverer software using our original data obtained by shotgun proteomic analyses of posterior salivary glands (PSGs) from three Octopus vulgaris specimens (provided as Dataset_2) and (iii) a non-redundant antimicrobial peptide (AMP) database. Dataset_3 includes the transcripts obtained by de novo assembly of 16 transcriptomes from cephalopods’ PSGs using CLC Genomics Workbench. Dataset_4 provides the proteins predicted by the TransDecoder tool from the de novo assembly of 16 transcriptomes of cephalopods’ PSGs. Further details about database construction, as well as the scripts and command lines used to construct them, are deposited within Dataset_5 and Dataset_6. The data provided in this article will assist in unravelling the role of cephalopods’ PSGs in feeding strategies, toxins and AMP production.
      Citation: Data
      PubDate: 2020-11-27
      DOI: 10.3390/data5040110
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 111: In Silico Estimation of the Abundance and
           Phylogenetic Significance of the Composite Oct4-Sox2 Binding Motifs within
           a Wide Range of Species

    • Authors: Arman Kulyyassov, Ruslan Kalendar
      First page: 111
      Abstract: High-throughput sequencing technologies have greatly accelerated the progress of genomics, transcriptomics, and metagenomics. Currently, a large amount of genomic data from various organisms is being generated, the volume of which is increasing every year. Therefore, the development of methods that allow the rapid search and analysis of DNA sequences is urgent. Here, we present a novel motif-based high-throughput sequence scoring method that generates genome information. We found and identified Utf1-like, Fgf4-like, and Hoxb1-like motifs, which are cis-regulatory elements for the pluripotency transcription factors Sox2 and Oct4 within the genomes of different eukaryotic organisms. The genome-wide analysis of these motifs was performed to understand the impact of their diversification on mammalian genome evolution. Utf1-like, Fgf4-like, and Hoxb1-like motif diversity was evaluated across genomes from multiple species.
      Citation: Data
      PubDate: 2020-11-29
      DOI: 10.3390/data5040111
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 112: First Draft Genome Assembly of the Malaysian
           Stingless Bee, Heterotrigona itama (Apidae, Meliponinae)

    • Authors: Chien-Yeong Wee, Amin-Asyraf Tamizi, Nazrul-Hisham Nazaruddin, Siuk-Mun Ng, Jia-Shiun Khoo, Rosliza Jajuli
      First page: 112
      Abstract: The Malaysian stingless bee industry is hugely dependent on wild colonies. Nevertheless, the availability of new queens to establish new colonies is insufficient to meet the growing demand for hives in the industry. Heterotrigona itama is primarily utilized for honey production in the region and the major source of stingless bee colonies comes from the wild. To propagate new colonies domestically, a fundamental understanding of the biology of queen development, especially from the genomics aspect, is necessary. The whole genome was sequenced using a paired-end 150 strategy on the Illumina HiSeq X platform. The shotgun sequencing generated approximately 89 million raw pair-end reads with a total output of 13.37 Gb and a GC content of 37.31%. The genome size of the species was estimated to be approximately 272 Mb. Phylogenetic analysis showed H. itama are much more closely related to the bumble bee (Bombus spp.) than they are to the modern honey bee (Apis spp.). The genome data provided here are expected to contribute to a better understanding of the genetic aspect of queen differentiation as well as of important molecular pathways which are crucial for stingless bee biology, management and conservation.
      Citation: Data
      PubDate: 2020-11-30
      DOI: 10.3390/data5040112
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 113: Mid-Cycle Observations of CR Boo and Estimation
           of the System

    • Authors: Daniela Boneva, Svetlana Boeva, Yanko Nikolov, Zorica Cvetković, Radoslav Zamanov
      First page: 113
      Abstract: We present observations (with NAO Rozhen and AS Vidojevica telescopes) of the AM Canum Venaticorum (AM CVn) binary star CR Bootis (CR Boo) in the UBV bands. The data were obtained in two nights in July 2019, when the V band brightness was in the range of 16.1–17.0 mag. In both nights, a variability for a period of 25 ± 1 min and amplitude of about 0.2 magnitudes was visible. These brightness variations are most likely indications of “humps”. During our observational time, they appear for a period similar to the CR Boo orbital period. A possible reason of their origin is the phase rotation of the bright spot, placed in the contact point of the infalling matter and the outer disc edge. We estimated some of the parameters of the binary system, on the base of the observational data.
      Citation: Data
      PubDate: 2020-12-02
      DOI: 10.3390/data5040113
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 114: A Compendium of Chemical Class and Use Type Open
           Access Databases

    • Authors: Niklas Heinemann, Sascha Bub, Jakob Wolfram, Sebastian Stehle, Lara L. Petschick, Ralf Schulz
      First page: 114
      Abstract: With an ever-increasing production and registration of chemical substances, obtaining reliable and up to date information on their use types (UT) and chemical class (CC) is of crucial importance. We evaluated the current status of open access chemical substance databases (DBs) regarding UT and CC information using the “Meta-analysis of the Global Impact of Chemicals” (MAGIC) graph as a benchmark. A decision tree-based selection process was used to choose the most suitable out of 96 databases. To compare the DB content for 100 weighted, randomly selected chemical substances, an extensive quantitative and qualitative analysis was performed. It was found that four DBs yielded more qualitative and quantitative UT and CC results than the current MAGIC graph: The European Bioinformatics Institute DB, ChemSpider, the English Wikipedia page, and the National Center for Biotechnology Information (NCBI). The NCBI, along with its subsidiary DBs PubChem and Medical Subject Headings (MeSH), showed the best performance according to the defined criteria. To analyse large datasets, harmonisation of the available information might be beneficial, as the available DBs mostly aggregate information without harmonising them.
      Citation: Data
      PubDate: 2020-12-04
      DOI: 10.3390/data5040114
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 115: BLE-GSpeed: A New BLE-Based Dataset to Estimate
           User Gait Speed

    • Authors: Emilio Sansano-Sansano, Fernando J. Aranda, Raúl Montoliu, Fernando J. Álvarez
      First page: 115
      Abstract: To estimate the user gait speed can be crucial in many topics, such as health care systems, since the presence of difficulties in walking is a core indicator of health and function in aging and disease. Methods for non-invasive and continuous assessment of the gait speed may be key to enable early detection of cognitive diseases such as dementia or Alzheimer’s disease. Wearable technologies can provide innovative solutions for healthcare problems. Bluetooth Low Energy (BLE) technology is excellent for wearables because it is very energy efficient, secure, and inexpensive. In this paper, the BLE-GSpeed database is presented. The dataset is composed of several BLE RSSI measurements obtained while users were walking at a constant speed along a corridor. Moreover, a set of experiments using a baseline algorithm to estimate the gait speed are also presented to provide baseline results to the research community.
      Citation: Data
      PubDate: 2020-12-07
      DOI: 10.3390/data5040115
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 116: Seized Ecstasy Pills: Infrared Spectra and Image
           Datasets

    • Authors: Luc Patiny, Michaël Zasso, Pierre Esseiva, Julien Wist
      First page: 116
      Abstract: According to the World Drug Report 2020, cocaine and ecstasy are the most consumed stimulant drugs, with 19 and 27 million estimated users in 2018. In this context, large efforts are being made to design fast and cost-effective analytical methods to track and monitor the distribution networks of these synthetic drugs. Here, we share two datasets of ecstasy pills seized in the northeast of Switzerland between 2010 and 2011. The first contains 621 forensic-grade images of pills, while the second one consists of 486 mid-infrared (mIR) spectra. While both sets are not covering the same seizure, both provide high-quality data with orthogonal information to evaluate clustering and dimension reduction methods.
      Citation: Data
      PubDate: 2020-12-09
      DOI: 10.3390/data5040116
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 117: First 1-M Resolution Land Cover Map Labeling the
           Overlap in the 3rd Dimension: The 2018 Map for Wallonia

    • Authors: Céline Bassine, Julien Radoux, Benjamin Beaumont, Taïs Grippa, Moritz Lennert, Céline Champagne, Mathilde De Vroey, Augustin Martinet, Olivier Bouchez, Nicolas Deffense, Eric Hallot, Eléonore Wolff, Pierre Defourny
      First page: 117
      Abstract: Land cover maps contribute to a large diversity of geospatial applications, including but not limited to land management, hydrology, land use planning, climate modeling and biodiversity monitoring. In densely populated and highly fragmented landscapes as observed in the Walloon region (Belgium), very high spatial resolution is required to depict all the infrastructures, buildings and most of the structural elements of the semi-natural landscapes (like hedges and small water bodies). Because of the resolution, the vertical dimension needs explicit handling to avoid discontinuities incompatible with many applications. For example, how to map a river flowing under a bridge? The particularity of our data is to provide a two-digit land cover code to label all the overlapping items. The identification of all the overlaps resulted from the combination of remote sensing image analysis and decision rules involving ancillary data. The final product is therefore semantically precise and accurate in terms of land cover description thanks to the addition of 24 classes on top of the 11 pure land cover classes. The quality of the map has been assessed using a state-of-the-art validation scheme. Its overall accuracy is as high as 91.5%, with an average producer’s accuracy of 86% and an average user’s accuracy of 91%.
      Citation: Data
      PubDate: 2020-12-11
      DOI: 10.3390/data5040117
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 118: A State-Level Socioeconomic Data Collection of
           the United States for COVID-19 Research

    • Authors: Sha, Malarvizhi, Liu, Tian, Zhou, Ruan, Dong, Carte, Lan, Wang, Yang
      First page: 118
      Abstract: The outbreak of COVID-19 from late 2019 not only threatens the health and lives of humankind but impacts public policies, economic activities, and human behavior patterns significantly. To understand the impact and better prepare for future outbreaks, socioeconomic factors play significant roles in (1) determinant analysis with health care, environmental exposure and health behavior; (2) human mobility analyses driven by policies; (3) economic pressure and recovery analyses for decision making; and (4) short to long term social impact analysis for equity, justice and diversity. To support these analyses for rapid impact responses, state level socioeconomic factors for the United States of America (USA) are collected and integrated into topic-based indicators, including (1) the daily quantitative policy stringency index; (2) dynamic economic indices with multiple time frequency of GDP, international trade, personal income, employment, the housing market, and others; (3) the socioeconomic determinant baseline of the demographic, housing financial situation and medical resources. This paper introduces the measurements and metadata of relevant socioeconomic data collection, along with the sharing platform, data warehouse framework and quality control strategies. Different from existing COVID-19 related data products, this collection recognized the geospatial and dynamic factor as essential dimensions of epidemiologic research and scaled down the spatial resolution of socioeconomic data collection from country level to state level of the USA with a standard data format and high quality.
      Citation: Data
      PubDate: 2020-12-11
      DOI: 10.3390/data5040118
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 119: Dataset on the Effects of Different Pre-Harvest
           

    • Authors: Giandomenico Corrado, Luigi Lucini, Begoña Miras-Moreno, Leilei Zhang, Biancamaria Senizza, Boris Basile, Youssef Rouphael
      First page: 119
      Abstract: The study of the relationship between cultivated plants and environmental factors can provide information ranging from a deeper understanding of the plant biological system to the development of more effective management strategies for improving yield, quality, and sustainability of the produce. In this article, we present a comprehensive metabolomics dataset of two phytochemically divergent lettuce (Lactuca sativa L.) butterhead varieties under different growing conditions. Plants were cultivated in hydroponics in a growth chamber with ambient control. The pre-harvest factors that were independently investigated were light intensity (two levels), the ionic strength of the nutrient solutions (three levels), and the molar ratio of three macroelements (K, Mg, and Ca) in the nutrient solution (three levels). We used an untargeted, mass-spectrometry-based approach to characterize the metabolomics profiles of leaves harvested 19 days after transplant. The data revealed the ample impact on both primary and secondary metabolism and its range of variation. Moreover, our dataset is useful for uncovering the complex effects of the genotype, the environmental factor(s), and their interaction, which may deserve further investigation.
      Citation: Data
      PubDate: 2020-12-15
      DOI: 10.3390/data5040119
      Issue No: Vol. 5, No. 4 (2020)
       
  • Data, Vol. 5, Pages 57: Experimental Force Data of a Restrained ROV
           under Waves and Current

    • Authors: Gabl, Davey, Cao, Li, Li, Walker, Giorgio-Serchi, Aracri, Kiprakis, Stokes, Ingram
      First page: 57
      Abstract: Hydrodynamic forces are an important input value for the design, navigation and stationkeeping of underwater Remotely Operated Vehicles (ROVs). The experiment investigated the forces imparted by currents (with representative real world turbulence) and waves on a commercially available ROV, namely the BlueROV2 (Blue Robotics, Torrance, USA). Three different distances of a simplified cylindrical obstacle (shading effects) were investigated in addition to the free stream cases. Eight tethers held the ROV in the middle of the 2 m water depth to minimise the influence of the support structure without completely restricting the degrees of freedom (DoF). Each tether was equipped with a load cell and small motions and rotations were documented with an underwater video motion capture system. The paper describes the experimental set-up, input values (current speed and wave definitions) and initial processing of the data. In addition to the raw data, a processed dataset is provided, which includes forces in all three main coordinate directions for each mounting point synchronised with the 6DoF results and the free surface elevations. The provided dataset can be used as a validation experiment as well as for testing and development of an algorithm for position control of comparable ROVs.
      Citation: Data
      PubDate: 2020-06-30
      DOI: 10.3390/data5030057
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 58: An Update to the TraVA Database: Time Series of
           Capsella bursa-pastoris Shoot Apical Meristems during Transition to
           Flowering

    • Authors: Klepikova, Kasianov
      First page: 58
      Abstract: Transition to flowering is a crucial part of plant life directly affecting the fitness of a plant. Time series of transcriptomes is a useful tool for the investigation of process dynamics and can be used for the identification of novel genes and gene networks involved in the process. We present a detailed time series of polyploid Capsella bursa-pastoris shoot apical meristems created with RNA-seq. The time series covers transition to flowering and can be used for thorough analysis of the process. To make the data easy to access, we uploaded them in our database Transcriptome Variation Analysis (TraVA), which provides a convenient depiction of the gene expression profiles, the differential expression analysis between the homeologs and quick data extraction.
      Citation: Data
      PubDate: 2020-06-30
      DOI: 10.3390/data5030058
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 59: The Dataset of the Experimental Evaluation of
           Software Components for Application Design Selection Directed by the
           Artificial Bee Colony Algorithm

    • Authors: Alexander Gusev, Dmitry Ilin, Evgeny Nikulchev
      First page: 59
      Abstract: The paper presents the swarm intelligence approach to the selection of a set of software components based on computational experiments simulating the desired operating conditions of the software system being developed. A mathematical model is constructed, aimed at the effective selection of components from the available alternative options using the artificial bee colony algorithm. The model and process of component selection are introduced and applied to the case of selecting Node.js components for the development of a digital platform. The aim of the development of the platform is to facilitate countrywide simultaneous online psychological surveys in schools in the conditions of unstable internet connection and the large variety of desktop and mobile client devices, running different operating systems and browsers. The module whose development is considered in the paper should provide functionality for the archiving and checksum verification of the survey forms and graphical data. With the swarm intelligence approach proposed in the paper, the effective set of components was identified through a directional search based on fuzzy assessment of the three experimental quality indicators. To simulate the desired operating conditions and to guarantee the reproducibility of the experiments, the virtual infrastructure was configured. The application of swarm intelligence led to reproducible results for component selection after 312 experiments instead of the 1080 experiments needed by the exhaustive search algorithm. The suggested approach can be widely used for the effective selection of software components for distributed systems operating in the given conditions at this stage of their development.
      Citation: Data
      PubDate: 2020-07-08
      DOI: 10.3390/data5030059
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 60: An Arabic Dataset for Disease Named Entity
           Recognition with Multi-Annotation Schemes

    • Authors: Nasser Alshammari, Saad Alanazi
      First page: 60
      Abstract: This article outlines a novel data descriptor that provides the Arabic natural language processing community with a dataset dedicated to named entity recognition tasks for diseases. The dataset comprises more than 60 thousand words, which were annotated manually by two independent annotators using the inside–outside (IO) annotation scheme. To ensure the reliability of the annotation process, the inter-annotator agreements rate was calculated, and it scored 95.14%. Due to the lack of research efforts in the literature dedicated to studying Arabic multi-annotation schemes, a distinguishing and a novel aspect of this dataset is the inclusion of six more annotation schemes that will bridge the gap by allowing researchers to explore and compare the effects of these schemes on the performance of the Arabic named entity recognizers. These annotation schemes are IOE, IOB, BIES, IOBES, IE, and BI. Additionally, five linguistic features, including part-of-speech tags, stopwords, gazetteers, lexical markers, and the presence of the definite article, are provided for each record in the dataset.
      Citation: Data
      PubDate: 2020-07-13
      DOI: 10.3390/data5030060
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 61: Single-Beam Acoustic Doppler Profiler and
           Co-Located Acoustic Doppler Velocimeter Flow Velocity Data

    • Authors: Marilou Jourdain de Thieulloy, Mairi Dorward, Chris Old, Roman Gabl, Thomas Davey, David M. Ingram, Brian G. Sellar
      First page: 61
      Abstract: Acoustic Doppler Profilers (ADPs) are routinely used to measure flow velocity in the ocean, enabling multi-points measurement along a profile while Acoustic Doppler Velocimeters (ADVs) are laboratory instruments that provide very precise point velocity measurement. The experimental set-up allows laboratory comparison of measurement from these two instruments. Simultaneous multi-point measurements of velocity along the horizontal tank profile from Single-Beam Acoustic Doppler Profiler (SB-ADP) were compared against multiple co-located point measurements from an ADV. Measurements were performed in the FloWave Ocean Energy Research Facility at the University of Edinburgh at flow velocities between 0.6 m s⁻¹ and 1.2 m s⁻¹. This paper describes the data; the analysis of the inter-instrument comparison is presented in an associated Sensors paper by the same authors. This data-set contains (a) time series of raw SB-ADP uni-directional velocity measurements along a 10 m tank profile binned into 54 measurements cells and (b) ADV point measurements of three-directional velocity time series recorded in beam coordinates at selected locations along the profile. Associated with the data are instrument generated quality data, metadata and user-derived quality flags. An analysis of the quality of SB-ADP data along the profile is presented. This data-set provides multiple contemporaneous velocity measurements along the tank profile, relevant for correlation statistics, length-scale calculations and validation of numerical models simulating flow hydrodynamics in circular test facilities.
      Citation: Data
      PubDate: 2020-07-14
      DOI: 10.3390/data5030061
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 62: Luxembourg Fund Data Repository

    • Authors: Angeliki Skoura, Julian Presber, Jang Schiltz
      First page: 62
      Abstract: In this paper, we introduce the Luxembourg Fund Data Repository, a novel database of investment funds available for academic research that was created at the Department of Finance of the University of Luxembourg. The database contains the population of Undertakings for Collective Investment in Transferable Securities funds domiciled in Luxembourg from the starting month of their existence (March 1988) to October 2016. The fund characteristics are organized in a comprehensive database architecture encompassing static and dynamic data over the entire life of the funds. The characteristics include fund identifiers, official name, status information, management company and other service providers, daily and monthly performance time-series, portfolio holdings, classification of investment objective, fees, dividends, and cash flows. The database was constructed after collecting and assembling complementary historical information from three data providers. Importantly, funds no longer in existence due to liquidation or mergers are included in the database, preventing survivorship bias. The database has been constructed to serve as a research dataset of high accuracy due to the maximization of population coverage, the maximization of historical coverage, and validation by using information acquired from the supervisory authority of the financial sector of Luxembourg. A license is currently available to researchers of the Department of Finance of the University of Luxembourg, and there are plans to extend accessibility to the global academic community.
      Citation: Data
      PubDate: 2020-07-19
      DOI: 10.3390/data5030062
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 63: Genotyping by Sequencing Reads of 20 Vicia faba
           Lines with High and Low Vicine and Convicine Content

    • Authors: Felix Heinrich, Mehmet Gültas, Wolfgang Link, Armin Otto Schmitt
      First page: 63
      Abstract: The grain faba bean (Vicia faba) which belongs to the family of the Leguminosae, is a crop that is grown worldwide for consumption by humans and livestock. Despite being a rich source of plant-based protein and various agro-ecological advantages its usage is limited due to its anti-nutrients in the form of the seed-compounds vicine and convicine (V+C). While markers for a low V+C content exist the underlying pathway and the responsible genes have remained unknown for a long time and only recently a possible pathway and enzyme were found. Genetic research into Vicia faba is difficult due to the lack of a reference genome and the near exclusivity of V+C to the species. Here, we present sequence reads obtained through genotyping-by-sequencing of 20 Vicia faba lines with varying V+C contents. For each line, ∼3 million 150 bp paired end reads are available. This data can be useful in the genomic research of Vicia faba in general and its V+C content in particular.
      Citation: Data
      PubDate: 2020-07-20
      DOI: 10.3390/data5030063
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 64: A Dataset to Evaluate IEEE 802.15.4g SUN for
           Dependable Low-Power Wireless Communications in Industrial Scenarios

    • Authors: Pere Tuset-Peiró, Ruan D. Gomes, Pascal Thubert, Eva Cuerva, Eduard Egusquiza, Xavier Vilajosana
      First page: 64
      Abstract: This article presents a dataset obtained from the deployment of an IEEE 802.15.4g SUN (Smart Utility Network) single-hop network (11 nodes) in a large industrial scenario (110,044 m²) for a long period of time (99 days). The dataset contains ∼11 M entries with RSSI (Received Signal Strength Indicator), CCA (Clear Channel Assessment), and PDR (Packet Delivery Ratio) values. The analyzed results show a high variability in the average RSSI (i.e., between −82.1 dBm and −101.7 dBm) and CCA (i.e., between −111.2 dBm and −119.9 dBm) values, which is caused by the effects of multi-path propagation and external interference. Despite being above the sensitivity limit for each modulation, these values result in poor average PDR values (i.e., from 65.9% to 87.4%), indicating that additional schemes are needed to meet the link reliability requirements of industrial applications. Hence, the presented dataset will allow researchers and practitioners to propose novel mechanisms and evaluate their performance using realistic conditions, enabling the dependability vision of the RAW (Reliable and Available Wireless) WG (Working Group) at the IETF (Internet Engineering Task Force).
      Citation: Data
      PubDate: 2020-07-23
      DOI: 10.3390/data5030064
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 65: Novel Molecular Resources to Facilitate Future
           Genetics Research on Freshwater Mussels (Bivalvia: Unionidae)

    • Authors: Nathan A. Johnson, Chase H. Smith
      First page: 65
      Abstract: Molecular data have been an integral tool in the resolution of the evolutionary relationships and systematics of freshwater mussels, despite the limited number of nuclear markers available for Sanger sequencing. To facilitate future studies, we evaluated the phylogenetic informativeness of loci from the recently published anchored hybrid enrichment (AHE) probe set Unioverse and developed novel Sanger primer sets to amplify two protein-coding nuclear loci with high net phylogenetic informativeness scores: fem-1 homolog C (FEM1) and UbiA prenyltransferase domain-containing protein 1 (UbiA). We report the methods used for marker development, along with the primer sequences and optimized PCR and thermal cycling conditions. To demonstrate the utility of these markers, we provide haplotype networks, DNA alignments, and summary statistics regarding the sequence variation for the two protein-coding nuclear loci (FEM1 and UbiA). Additionally, we compare the DNA sequence variation of FEM1 and UbiA to three loci commonly used in freshwater mussel genetic studies: the mitochondrial genes cytochrome c oxidase subunit 1 (CO1) and NADH dehydrogenase subunit 1 (ND1), and the nuclear internal transcribed spacer 1 (ITS1). All five loci distinguish among the three focal species (Potamilus fragilis, Potamilus inflatus, and Potamilus purpuratus), and the sequence variation was highest for ND1, followed by CO1, ITS1, UbiA, and FEM1, respectively. The newly developed Sanger PCR primers and methodologies for extracting additional loci from AHE probe sets have great potential to facilitate molecular investigations targeting supraspecific relationships in freshwater mussels, but may be of limited utility at shallow taxonomic scales.
      Citation: Data
      PubDate: 2020-07-30
      DOI: 10.3390/data5030065
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 66: Measurements of Mobile Blockchain Execution Impact
           on Smartphone Battery

    • Authors: Yulia Bardinova, Konstantin Zhidanov, Sergey Bezzateev, Mikhail Komarov, Aleksandr Ometov
      First page: 66
      Abstract: This is a data descriptor paper for a set of battery output measurements taken during the display-on discharge process caused by the execution of modern mobile blockchain projects on Android devices. The measurements were executed for Proof-of-Work (PoW) and Proof-of-Activity (PoA) consensus algorithms. In this descriptor, we give examples of Samsung Galaxy S9 operation, while a broader range of measurements is available in the dataset. The examples provide data on battery output current, output voltage, temperature, and status. We also show the measurements obtained utilizing short-range (IEEE 802.11n) and cellular (LTE) networks. This paper describes the proposed dataset and the method employed to gather the data. To provide a further understanding of the dataset’s nature, an analysis of the collected data is also briefly presented. This dataset may be of interest to both researchers from information security and human–computer interaction fields and industrial distributed ledger/blockchain developers.
      Citation: Data
      PubDate: 2020-07-30
      DOI: 10.3390/data5030066
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 67: Multi-Slot BLE Raw Database for Accurate
           Positioning in Mixed Indoor/Outdoor Environments

    • Authors: Fernando J. Aranda, Felipe Parralejo, Fernando J. Álvarez, Joaquín Torres-Sospedra
      First page: 67
      Abstract: The technologies and sensors embedded in smartphones have contributed to the spread of disruptive applications built on top of Location Based Services (LBSs). Among them, Bluetooth Low Energy (BLE) has been widely adopted for proximity and localization, as it is a simple but efficient positioning technology. This article presents a database of received signal strength measurements (RSSIs) on BLE signals in a real positioning system. The system was deployed in two buildings belonging to the campus of the University of Extremadura in Badajoz. The database is divided into three different deployments, each with a different number of measurement points and a different configuration of the BLE beacons. The beacons used in this work can broadcast up to six emission slots simultaneously. Fingerprinting positioning experiments are presented in this work using multiple slots, improving positioning accuracy when compared with the traditional single slot approach.
      Citation: Data
      PubDate: 2020-07-30
      DOI: 10.3390/data5030067
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 68: An Environmental Data Collection for COVID-19
           Pandemic Research

    • Authors: Qian Liu, Wei Liu, Dexuan Sha, Shubham Kumar, Emily Chang, Vishakh Arora, Hai Lan, Yun Li, Zifu Wang, Yadong Zhang, Zhiran Zhang, Jackson T. Harris, Srikar Chinala, Chaowei Yang
      First page: 68
      Abstract: The COVID-19 viral disease surfaced at the end of 2019 and quickly spread across the globe. To rapidly respond to this pandemic and offer data support for various communities (e.g., decision-makers in health departments and governments, researchers in academia, public citizens), the National Science Foundation (NSF) spatiotemporal innovation center constructed a spatiotemporal platform with various task forces including international researchers and implementation strategies. Compared to similar platforms that only offer viral and health data, this platform views virus-related environmental data collection (EDC) as an important component for the geospatial analysis of the pandemic. The EDC contains environmental factors either proven or with potential to influence the spread of COVID-19 and virulence or influence the impact of the pandemic on human health (e.g., temperature, humidity, precipitation, air quality index and pollutants, nighttime light (NTL)). In this platform/framework, environmental data are processed and organized across multiple spatiotemporal scales for a variety of applications (e.g., global mapping of daily temperature, humidity, precipitation, correlation of the pandemic to the mean values of climate and weather factors by city). This paper introduces the raw input data, construction and metadata of reprocessed data, and data storage, as well as the sharing and quality control methodologies of the COVID-19 related environmental data collection.
      Citation: Data
      PubDate: 2020-08-03
      DOI: 10.3390/data5030068
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 69: Towing Test Data Set of the Kyushu University Kite
           System

    • Authors: Mostafa A. Rushdi, Tarek N. Dief, Shigeo Yoshida, Roland Schmehl
      First page: 69
      Abstract: Kites can be used to harvest wind energy with substantially lower material and environmental footprints and a higher capacity factor than conventional wind turbines. In this paper, we present measurement data from seven individual tow tests with the kite system developed by Kyushu University. This system was designed for 7 kW traction power and comprises an inflatable wing of 6 m² surface area with a suspended kite control unit that is towed on a relatively short tether of 0.4 m by a truck driving at constant speed along a straight runway. To produce a controlled relative flow environment, the experiment was conducted only when the background wind speed was negligible. We recorded the time-series of 11 different sensor values acquired on the kite, the control unit and the truck. The measured data can be used to assess the effects of the towing speed, the flight mode and the lengths of the control lines on the tether force.
      Citation: Data
      PubDate: 2020-08-03
      DOI: 10.3390/data5030069
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 70: A Multi-Annotator Survey of Sub-km Craters on Mars

    • Authors: Alistair Francis, Jonathan Brown, Thomas Cameron, Reuben Crawford Clarke, Romilly Dodd, Jennifer Hurdle, Matthew Neave, Jasmine Nowakowska, Viran Patel, Arianne Puttock, Oliver Redmond, Aaron Ruban, Damien Ruban, Meg Savage, Wiggert Vermeer, Alice Whelan, Panagiotis Sidiropoulos, Muller
      First page: 70
      Abstract: We present here a dataset of nearly 5000 small craters across roughly 1700 km² of the Martian surface, in the MC-11 East quadrangle. The dataset covers twelve 2000-by-2000 pixel Context Camera images, each of which is comprehensively labelled by six annotators, whose results are combined using agglomerative clustering. Crater size-frequency distributions are centrally important to the estimation of planetary surface ages, in lieu of in-situ sampling. Older surfaces are exposed to meteoritic impactors for longer and, thus, are more densely cratered. However, whilst populations of larger craters are well understood, the processes governing the production and erosion of small (sub-km) craters are more poorly constrained. We argue that, by surveying larger numbers of small craters, the planetary science community can reduce some of the current uncertainties regarding their production and erosion rates. To this end, many have sought to use state-of-the-art object detection techniques utilising Deep Learning, which, although powerful, require very large amounts of labelled training data to perform optimally. This survey gives researchers a large dataset to analyse small crater statistics over MC-11 East, and allows them to better train and validate their crater detection algorithms. The collection of these data also demonstrates a multi-annotator method for the labelling of many small objects, which produces an estimated confidence score for each annotation and annotator.
      Citation: Data
      PubDate: 2020-08-03
      DOI: 10.3390/data5030070
      Issue No: Vol. 5, No. 3 (2020)
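
The abstract above says the six annotators' labels are combined using agglomerative clustering and that each annotation receives a confidence score. The following is only a minimal sketch of that general idea, not the authors' procedure: it assumes single-linkage clustering on crater centres with a hypothetical distance cut-off (`max_dist`), and it assumes a cluster's confidence is the fraction of annotators who marked it. The tuple layout and all names are illustrative.

```python
from math import hypot

def cluster_annotations(annotations, max_dist):
    """Group (annotator, x, y) marks by single-linkage clustering.

    Cutting a single-linkage dendrogram at max_dist is equivalent to taking
    connected components of the graph that links marks at most max_dist apart,
    so a union-find structure is enough here.
    """
    parent = list(range(len(annotations)))

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (_, xi, yi) in enumerate(annotations):
        for j in range(i + 1, len(annotations)):
            _, xj, yj = annotations[j]
            if hypot(xi - xj, yi - yj) <= max_dist:
                parent[find(i)] = find(j)  # merge the two clusters

    clusters = {}
    for i, mark in enumerate(annotations):
        clusters.setdefault(find(i), []).append(mark)
    return list(clusters.values())

def confidence(cluster, n_annotators):
    """Assumed score: share of annotators with at least one mark in the cluster."""
    return len({annotator for annotator, _, _ in cluster}) / n_annotators

# Hypothetical marks in pixel coordinates, invented for illustration.
marks = [("a", 10.0, 10.0), ("b", 11.0, 10.0), ("c", 10.5, 10.5),
         ("a", 100.0, 100.0), ("d", 101.0, 100.0)]
for cluster in cluster_annotations(marks, max_dist=3.0):
    print(len(cluster), confidence(cluster, n_annotators=6))
```

A real pipeline would likely also compare crater radii and use a size-dependent threshold; those details are not given in the abstract and are left out here.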
       
  • Data, Vol. 5, Pages 71: Displacements of an Active Moderately Rapid
           Landslide—A Dataset Retrieved by Continuous GNSS Arrays

    • Authors: Mulas, Ciccarese, Truffelli, Corsini
      First page: 71
      Abstract: This paper describes a dataset of continuous GNSS positioning solutions referring to slope movements in the Ca’ Lita landslide (Northern Apennines, Italy). The dataset covers the period from 24 March 2016 to 17 July 2019 and includes time-series of the daily position of three GNSS rovers located in different parts of the landslide: head zone, upper track zone, and lower track zone. Two different types of continuous GNSS arrays have been used: one is based on high-end Leica geodetic receivers, and the other is based on low-cost Emlid receivers. Displacements captured in the dataset are up to more than a hundred meters and are characterized by prolonged phases of slow movement and moderately rapid acceleration phases. The data presented in this contribution were used to underline slope processes and validate displacements retrieved by the application of digital image correlation to a stack of satellite images.
      Citation: Data
      PubDate: 2020-08-08
      DOI: 10.3390/data5030071
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 72: Data Analysis of Land Use Change and Urban and
           Rural Impacts in Lagos State, Nigeria

    • Authors: Olalekan O. Onilude, Eric Vaz
      First page: 72
      Abstract: This study examines land use change and impacts on urban and rural activity in Lagos State, Nigeria. To achieve this, multi-temporal land use and land cover (LULC) datasets derived from the GlobeLand30 product of years 2000 and 2010 for urban and rural areas of Lagos State were imported into ArcMap 10.6 and converted to raster files (raster thematic maps) for spatial analysis with FRAGSTATS in Patch Analyst. Different landscape metrics were then computed to generate statistical results. The results show that fragmentation of cultivated lands increased in the rural areas but decreased in the urban areas. The findings also show that land-use change resulted in increased fragmentation of forest in the urban areas and reduced fragmentation in the rural areas. The fragmentation measure of diversity increased in the urban areas, while it decreased in the rural areas during the period of study. These results suggest that cultivated land fragmentation is a complex process connected with socio-economic trends at regional and local levels. In addition, this study has shown that landscape metrics can be used to understand the spatial pattern of LULC change in an urban-rural context. Finally, the outcomes of this study will help policymakers at the three levels of government in Nigeria to make crucial informed decisions about sustainable land use.
      Citation: Data
      PubDate: 2020-08-11
      DOI: 10.3390/data5030072
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 73: Forty Years of the Applications of Stark
           Broadening Data Determined with the Modified Semiempirical Method

    • Authors: Dimitrijević
      First page: 73
      Abstract: The aim of this paper is to analyze the various uses of Stark broadening data for non-hydrogenic lines emitted from plasma, obtained with the modified semiempirical method formulated 40 years ago (1980), which are continuously implemented in the STARK-B database. In such a way one can identify research fields where they are applied and better see the needs of users in order to better plan future work. This is done by analysis of citations of the modified semiempirical method and the corresponding data in international scientific journals, excluding cases when they are used for comparison with other experimental or theoretical Stark broadening data or for development of the theory of Stark broadening. On the basis of our analysis, one can conclude that the principal applications of such data are in astronomy (white dwarfs, A and B stars, and opacity), investigations of laser produced plasmas, laser design and optimization and their applications in industry and technology (ablation, laser melting, deposition, plasma during electrolytic oxidation, laser micro sintering), as well as for the determination of radiative properties of various plasmas, plasma diagnostics, and investigations of regularities and systematic trends of Stark broadening parameters.
      Citation: Data
      PubDate: 2020-08-23
      DOI: 10.3390/data5030073
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 74: Stark Broadening of Co II Lines in Stellar
           Atmospheres

    • Authors: Zlatko Majlinger, Milan S. Dimitrijević, Vladimir A. Srećković
      First page: 74
      Abstract: Data for Stark full widths at half maximum for 46 Co II multiplets were calculated using a modified semiempirical method. In order to show the applicability and usefulness of this set of data for research into white dwarf and A type star atmospheres, the obtained results were used to investigate the significance of the Stark broadening mechanism for Co II lines in the atmospheres of these objects. We examined the influence of surface gravity (log g), effective temperature and the wavelength of the spectral line on the importance of the inclusion of Stark broadening contribution in the profiles of the considered Co II spectral lines, for plasma conditions in atmospheric layers corresponding to different optical depths.
      Citation: Data
      PubDate: 2020-08-27
      DOI: 10.3390/data5030074
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 75: High-Resolution Surface Water Classifications of
           the Xingu River, Brazil, Pre and Post Operationalization of the Belo Monte
           Hydropower Complex

    • Authors: Margaret Kalacska, Oliver Lucanus, Leandro Sousa, J. Pablo Arroyo-Mora
      First page: 75
      Abstract: We describe a new high spatial resolution surface water classification dataset generated for the Xingu river, Brazil, from its confluence with the Iriri river to the Pimental dam prior to construction of the Belo Monte hydropower complex, and after its operationalization. This river is well-known for its exceptionally high diversity and endemism in ichthyofauna. Pre-existing datasets generated from moderate resolution satellite imagery (e.g., 30 m) do not adequately capture the extent of the river. Accurate measurements of water extent are important for a range of applications utilizing surface water data, including greenhouse gas emission estimation, land cover change mapping, and habitat loss/change estimates, among others. We generated the new classifications from RapidEye imagery (5 m pixel size) for 2011 and PlanetScope imagery (3 m pixel size) for 2019 using a Geographic Object Based Image Analysis (GEOBIA) approach.
      Citation: Data
      PubDate: 2020-08-29
      DOI: 10.3390/data5030075
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 76: Non-Spatial Data towards Spatially Located News
           about COVID-19: A Semi-Automated Aggregator of Pandemic Data from (Social)
           Media within the Olomouc Region, Czechia

    • Authors: Konicek, Netek, Burian, Novakova, Kaplan
      First page: 76
      Abstract: The article describes the process of aggregation of media-based data about the coronavirus pandemic in the Olomouc region, the Czech Republic. Originally non-spatially located news from different sources and various platforms (government, social media, news portals) were automatically aggregated into a centralized database. The application “COVID-map” is an interactive web map solution which visualizes records from the database in a spatial way. The COVID-map has been developed within the Ad hoc online hackathon as an academic project at the Department of Geoinformatics, Palacký University Olomouc, Czech Republic. Alongside spatially localized data, the map application collects statistical data from official sources, e.g., from the governmental crisis management office. The impact of the application was immediate. Within a few days after the launch, tens of thousands of users per day visited the COVID-map. It has been published by regional and national media. The COVID-map solution could be considered as a suitable implementation of the correctly used cartographical method for the example of the coronavirus pandemic.
      Citation: Data
      PubDate: 2020-08-30
      DOI: 10.3390/data5030076
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 77: Dataset of Nile Red Fluorescence Readings with
           Different Yeast Strains, Solvents, and Incubation Times

    • Authors: Mauricio Ramirez-Castrillon, Victoria Jaramillo-Garcia, Helio Barros, João Henriques, Valter Stefani, Patricia Valente
      First page: 77
      Abstract: We used Nile red to estimate lipid content in oleaginous yeasts using a high-throughput approach. We measured the fluorescence intensity of Nile red using different solvents, yeast strains, and incubation times in optimized excitation/emission wavelengths. The data show the relative fluorescence units (RFU) for Nile red excitation, using 1× PBS, 1× PBS and 5% v/v isopropyl alcohol, 50% v/v glycerol, culture medium A-gly broth, and A-gly broth supplemented with 5% v/v DMSO. In addition, we showed the RFU for the Nile red dye for different oleaginous and non-oleaginous yeast strains, such as Meyerozyma guilliermondii BI281A, Yarrowia lipolytica QU21 and Saccharomyces cerevisiae MRC164. Other measurements of lipid accumulation kinetics were shown for the above and additional yeast strains. These datasets provide guidelines for obtaining the optimal solvent system and the minimal interaction time for the Nile red dye to enter the cells and give a stable readout.
      Citation: Data
      PubDate: 2020-09-01
      DOI: 10.3390/data5030077
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 78: 13C NMR Dataset Qualitative Analysis of Grecian
           Wines

    • Authors: Alberto Mannu, Ioannis K. Karabagias, Salvatore Baldino, Cristina Prandi, Vassilios K. Karabagias, Anastasia V. Badeka
      First page: 78
      Abstract: The development of analytical techniques for characterizing food samples, especially for the wine industry, is a main topic of research. Regarding the classification of wines based on their geographical origin, nuclear magnetic resonance (NMR) spectroscopy represents a fast and effective tool for determining chemical fingerprints. Herein, a 13C NMR dataset, which was acquired for classification of Grecian wines through multivariate statistics, is reported and described. Thus, the main qualitative differences between grapes of the same geographical origin, observable by the visual analysis of the 13C NMR data, are discussed.
      Citation: Data
      PubDate: 2020-09-05
      DOI: 10.3390/data5030078
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 79: Assessing Sustainability of the Capital and
           Emerging Secondary Cities of Cambodia Based on the 2018 Commune Database

    • Authors: Puthearath Chan
      First page: 79
      Abstract: The world is rapidly urbanizing, and 68% of its population is expected to live in urban areas by 2050. Likewise, secondary cities of Cambodia are rapidly emerging while the capital is the largest city with a population of more than two million. Improving urban sustainability is, therefore, necessary for the world, as well as Cambodia. Thus, Cambodia has launched clean city standard indicators, proposed sectoral green city indicators, and adapted one target of global sustainable development goal 11 (UN SDG 11), to improve its urban quality and sustainability. However, using these indicators is not sufficient towards achieving urban sustainability because these indicators are limited in social and economic dimensions. Hence, this study aims to develop all dimensional indicators of sustainability based on all targets of UN SDG 11 with the above indicators. This study focused on the priorities of indicators in Cambodia verified and prioritized by Delphi and analytic hierarchy process (AHP) techniques. Then, a priority-based urban sustainability index for Cambodia was formed based on the concept of sustainability in developing countries. Finally, the standard scores were applied to comparatively assess the sustainability of capital and emerging secondary cities of Cambodia based on the 2018 Commune Database. Through this application, the study also sought to find out whether the priority weights of indicators are necessary for the comparative assessment. The results showed that the sustainability levels of Phnom Penh and Sihanoukville were found to be strong in all environmental, social, and economic dimensions. Battambang is also strong although economic sustainability is slightly lower than the average. Siem Reap is low in economic sustainability level while Poi Pet is remarkably low in environmental and social sustainability. Furthermore, the ranks of sustainability levels of the five cities based on weighted scores are different from their ranks based on unweighted scores. Therefore, this study confirms that priority weights of indicators are necessary for the comparative assessment towards improving the accuracy of the comparison.
      Citation: Data
      PubDate: 2020-09-07
      DOI: 10.3390/data5030079
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 80: The Interaction between Internet, Sustainable
           Development, and Emergence of Society 5.0

    • Authors: Vasja Roblek, Maja Meško, Mirjana Pejić Bach, Oshane Thorpe, Polona Šprajc
      First page: 80
      Abstract: (1) Background: The aim of this article is to analyze the technological developments in the field of the Internet and Internet technologies and to determine their significance for sustainable development, which will result in the emergence of Society 5.0. (2) Methods: The authors used automated content analysis for the analysis of 552 articles published in 306 scientific journals indexed by SSCI and/or SCI-EXPANDED (Web of Science (WOS) platform). The goal of the research was to present the relationship between the Internet and sustainable development. (3) Results: The results of the analysis show that the top four most important themes in the selected journals were “development”, “information”, “data”, and “business and services”. (4) Conclusions: Our research approach emphasizes the importance of the culmination of scientific innovation with the conceptual, technological and contextual frameworks of the Internet and Internet technology usage and its impact on sustainable development and the emergence of Society 5.0.
      Citation: Data
      PubDate: 2020-09-08
      DOI: 10.3390/data5030080
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 81: SARS-CoV-2 Persistence: Data Summary up to Q2 2020

    • Authors: Gabriele Cervino, Luca Fiorillo, Giovanni Surace, Valeria Paduano, Maria Teresa Fiorillo, Rosa De Stefano, Riccardo Laudicella, Sergio Baldari, Michele Gaeta, Marco Cicciù
      First page: 81
      Abstract: The coronavirus pandemic is causing confusion in the world. This confusion also affects the different guidelines adopted by each country. The persistence of the coronavirus responsible for coronavirus disease 2019 (COVID-19) has been evaluated by different articles, but it is still not well-defined, and the method of diffusion is unclear. The aim of this manuscript is to underline new coronavirus persistence features on different environments and surfaces. The scientific literature is still poor on this topic and research is mainly focused on therapy and diagnosis, rather than the characteristics of the virus. These data could be an aid to summarize virus features and formulate new guidelines and anti-spread strategies.
      Citation: Data
      PubDate: 2020-09-09
      DOI: 10.3390/data5030081
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 82: Extraction of Missing Tendency Using Decision Tree
           Learning in Business Process Event Log

    • Authors: Hiroki Horita, Yuta Kurihashi, Nozomi Miyamori
      First page: 82
      Abstract: In recent years, process mining has been attracting attention as an effective method for improving business operations by analyzing event logs that record what is done in business processes. The event log may contain missing data due to technical or human error, and if the data are missing, the analysis results will be inadequate. Traditional methods mainly use prediction completion when there are missing values, but accurate completion is not always possible. In this paper, we propose a method for understanding the tendency of missing values in the event log using decision tree learning without supplementing the missing values. We conducted experiments using data from the incident management system and confirmed the effectiveness of our method.
      Citation: Data
      PubDate: 2020-09-09
      DOI: 10.3390/data5030082
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 83: Data on Vietnamese Students’ Acceptance of Using
           VCTs for Distance Learning during the COVID-19 Pandemic

    • Authors: Duc-Hoa Pho, Xuan-An Nguyen, Dinh-Hai Luong, Hoai-Thu Nguyen, Thi-Phuong-Thao Vu, Thi-Thuong-Thuong Nguyen
      First page: 83
      Abstract: The outbreak of COVID-19 at the beginning of 2020 has heavily influenced education all around the world. In Vietnam, educational institutes were suspended, and distance learning was conducted to ensure students’ learning process, with distance learning occurring mainly via video conferencing tools (VCTs). The purpose of this paper is to provide data on Vietnamese students’ acceptance of using VCTs in distance learning during the COVID-19 pandemic through an extended technology acceptance model (TAM) and structural equation modeling (SEM) method. This study used the TAM of Venkatesh and Davis. The questionnaire was designed based on Venkatesh and Davis and Salloum et al.’s scale. An online survey with snowball sampling was selected in April. The final dataset consisted of 277 valid records. This data descriptor presented descriptive statistics (mean, standard deviation), internal consistency (Cronbach’s alpha), reliability and validity measures (composite reliability, average value extracted test), and factor loading of items of eight factors: output quality, computer playfulness, subjective norm, perceived usefulness, perceived ease of use, attitude towards use, behavioral intention to use, and actual system use. Results indicated that external factors such as subjective norm and computer playfulness had a significant impact on most TAM constructs. Furthermore, output quality was found to have a positive influence on students’ perceived usefulness and acceptance of VCTs in distance learning.
      Citation: Data
      PubDate: 2020-09-11
      DOI: 10.3390/data5030083
      Issue No: Vol. 5, No. 3 (2020)
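
The abstract above reports internal consistency as Cronbach’s alpha. As a reminder of what that statistic computes, here is a minimal sketch using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores). The response values are invented for illustration and are not taken from the dataset; the function name is hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a sequence of k item scores."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least two items and two respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical 5-point Likert responses for one three-item factor.
responses = [
    (4, 5, 4),
    (3, 3, 2),
    (5, 5, 5),
    (2, 3, 2),
    (4, 4, 5),
]
print(round(cronbach_alpha(responses), 3))
```

Values closer to 1 indicate that the items of a factor vary together; when every item gives the same score for each respondent, the formula returns exactly 1.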
       
  • Data, Vol. 5, Pages 84: Information Loss due to the Data Reduction of
           Sample Data from Discrete Distributions

    • Authors: Maryam Moghimi, H. W. Corley
      First page: 84
      Abstract: In this paper, we study the information lost when a real-valued statistic is used to reduce or summarize sample data from a discrete random variable with a one-dimensional parameter. We compare the probability that a random sample gives a particular data set to the probability of the statistic’s value for this data set. We focus on sufficient statistics for the parameter of interest and develop a general formula independent of the parameter for the Shannon information lost when a data sample is reduced to such a summary statistic. We also develop a measure of entropy for this lost information that depends only on the real-valued statistic but neither the parameter nor the data. Our approach would also work for non-sufficient statistics, but the lost information and associated entropy would involve the parameter. The method is applied to three well-known discrete distributions to illustrate its implementation.
      Citation: Data
      PubDate: 2020-09-13
      DOI: 10.3390/data5030084
      Issue No: Vol. 5, No. 3 (2020)
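
The comparison described in the abstract above, between the probability of a particular sample and the probability of the statistic’s value, can be made concrete with a small case. The sketch below assumes an i.i.d. Bernoulli sample reduced to its sum (a sufficient statistic) and reads "information lost" as minus log2 of the ratio of the two probabilities; this reading and all function names are assumptions for illustration, not the paper’s own definitions.

```python
from math import comb, log2

def sample_prob(x, p):
    """Probability of one ordered Bernoulli(p) sample x, given as a tuple of 0/1."""
    t = sum(x)
    return p**t * (1 - p) ** (len(x) - t)

def statistic_prob(t, n, p):
    """Probability that n Bernoulli(p) draws sum to t (binomial pmf)."""
    return comb(n, t) * p**t * (1 - p) ** (n - t)

def lost_information_bits(x, p):
    """-log2 P(sample | sum): bits needed to recover the sample from its sum."""
    n, t = len(x), sum(x)
    return -log2(sample_prob(x, p) / statistic_prob(t, n, p))

x = (1, 0, 1, 1, 0)
losses = [lost_information_bits(x, p) for p in (0.2, 0.5, 0.9)]
print(losses)  # each value is approximately log2(comb(5, 3)) = log2(10)
```

Because the sum is sufficient, the factors involving p cancel in the ratio and the result depends only on n and t, which matches the abstract’s statement that the formula is independent of the parameter.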
       
  • Data, Vol. 5, Pages 85: Bryan’s Maximum Entropy Method—Diagnosis of a
           Flawed Argument and Its Remedy

    • Authors: Alexander Rothkopf
      First page: 85
      Abstract: The Maximum Entropy Method (MEM) is a popular data analysis technique based on Bayesian inference, which has found various applications in the research literature. While the MEM itself is well-grounded in statistics, I argue that its state-of-the-art implementation, suggested originally by Bryan, artificially restricts its solution space. This restriction leads to a systematic error often unaccounted for in contemporary MEM studies. The goal of this paper is to carefully revisit Bryan’s train of thought, point out its flaw in applying linear algebra arguments to an inherently nonlinear problem, and suggest possible ways to overcome it.
      Citation: Data
      PubDate: 2020-09-17
      DOI: 10.3390/data5030085
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 86: Large-Scale Dataset of Local Java Software Build
           Results

    • Authors: Matúš Sulír, Michaela Bačíková, Matej Madeja, Sergej Chodarev, Ján Juhár
      First page: 86
      Abstract: When a person decides to inspect or modify a third-party software project, the first necessary step is its successful compilation from source code using a build system. However, such attempts often end in failure. In this data descriptor paper, we provide a dataset of build results of open source Java software systems. We tried to automatically build a large number of Java projects from GitHub using their Maven, Gradle, and Ant build scripts in a Docker container simulating a standard programmer’s environment. The dataset consists of the output of two executions: 7264 build logs from a study executed in 2016 and 7233 logs from the 2020 execution. In addition to the logs, we collected exit codes, file counts, and various project metadata. The proportion of failed builds in our dataset is 38% in the 2016 execution and 59% in the 2020 execution. The published data can be helpful for multiple purposes, such as correlation analysis of factors affecting build success, build failure prediction, and research in the area of build breakage repair.
      Citation: Data
      PubDate: 2020-09-21
      DOI: 10.3390/data5030086
      Issue No: Vol. 5, No. 3 (2020)
       
  • Data, Vol. 5, Pages 50: Data Wrangling in Database Systems: Purging of
           Dirty Data

    • Authors: Otmane Azeroual
      First page: 50
      Abstract: Researchers need to be able to integrate ever-increasing amounts of data into their institutional databases, regardless of the source, format, or size of the data. It is then necessary to use the increasing diversity of data to derive greater value from data for their organization. The processing of electronic data plays a central role in modern society. Data constitute a fundamental part of operational processes in companies and scientific organizations. In addition, they form the basis for decisions. Bad data quality can negatively affect decisions and have a negative impact on results. The quality of the data is crucial. This includes the new theme of data wrangling, sometimes referred to as data munging or data crunching, to find the dirty data and to transform and clean them. The aim of data wrangling is to prepare a lot of raw data in their original state so that they can be used for further analysis steps. Only then can knowledge be obtained that may bring added value. This paper shows how the data wrangling process works and how it can be used in database systems to clean up data from heterogeneous data sources during their acquisition and integration.
      Citation: Data
      PubDate: 2020-06-05
      DOI: 10.3390/data5020050
      Issue No: Vol. 5, No. 2 (2020)
       
  • Data, Vol. 5, Pages 51: An Interdisciplinary Review of Camera Image
           Collection and Analysis Techniques, with Considerations for Environmental
           Conservation Social Science

    • Authors: Little, Perry, Fefer, Brownlee, Sharp
      First page: 51
      Abstract: Camera-based data collection and image analysis are integral methods in many research disciplines. However, few studies are specifically dedicated to trends in these methods or opportunities for interdisciplinary learning. In this systematic literature review, we analyze published sources (n = 391) to synthesize camera use patterns and image collection and analysis techniques across research disciplines. We frame this inquiry with interdisciplinary learning theory to identify cross-disciplinary approaches and guiding principles. Within this, we explicitly focus on trends within and applicability to environmental conservation social science (ECSS). We suggest six guiding principles for standardized, collaborative approaches to camera usage and image analysis in research. Our analysis suggests that ECSS may offer inspiration for novel combinations of data collection, standardization tactics, and detailed presentations of findings and limitations. ECSS can correspondingly incorporate more image analysis tactics from other disciplines, especially in regard to automated image coding of pertinent attributes.
      Citation: Data
      PubDate: 2020-06-06
      DOI: 10.3390/data5020051
      Issue No: Vol. 5, No. 2 (2020)
       
  • Data, Vol. 5, Pages 52: Large-Scale Dataset for Radio Frequency-Based
           Device-Free Crowd Estimation

    • Authors: Abdil Kaya, Stijn Denis, Ben Bellekens, Maarten Weyn, Rafael Berkvens
      First page: 52
      Abstract: Organisers of events attracting many people have the important task of ensuring the safety of the crowd on their venue premises. Measuring the size of the crowd is a critical first step, but it is often challenging because of occlusions, noise, and the dynamics of the crowd. We have been working on a passive Radio Frequency (RF) sensing technique for crowd size estimation, and we now present three datasets of measurements collected at the Tomorrowland music festival in environments containing thousands of people. All datasets have reference data, either based on payment transactions or an access control system, and we provide an example analysis script. We hope that future analyses can lead to added value for crowd safety experts.
      Citation: Data
      PubDate: 2020-06-09
      DOI: 10.3390/data5020052
      Issue No: Vol. 5, No. 2 (2020)
       
  • Data, Vol. 5, Pages 53: Charge Recombination Kinetics of Bacterial
           Photosynthetic Reaction Centres Reconstituted in Liposomes: Deterministic
           Versus Stochastic Approach

    • Authors: Emiliano Altamura, Paola Albanese, Pasquale Stano, Massimo Trotta, Francesco Milano, Fabio Mavelli
      First page: 53
      Abstract: In this theoretical work, we analyse the kinetics of the charge recombination reaction after light excitation of the Reaction Centres extracted from the photosynthetic bacterium Rhodobacter sphaeroides and reconstituted in small unilamellar phospholipid vesicles. Due to the compartmentalized nature of liposomes, vesicles may exhibit a random distribution of both ubiquinone molecules and the Reaction Centre protein complexes that can produce significant differences in the local concentrations from the average expected values. Moreover, since the amount of reacting species is very low in compartmentalized lipid systems, the stochastic approach is more suitable to unveil deviations of the average time behaviour of vesicles from the deterministic time evolution.
      Citation: Data
      PubDate: 2020-06-12
      DOI: 10.3390/data5020053
      Issue No: Vol. 5, No. 2 (2020)
       
  • Data, Vol. 5, Pages 54: Emissions from Swine Manure Treated with Current
           Products for Mitigation of Odors and Reduction of NH3, H2S, VOC, and GHG
           Emissions

    • Authors: Baitong Chen, Jacek A. Koziel, Chumki Banik, Hantian Ma, Myeongseong Lee, Jisoo Wi, Zhanibek Meiirkhanuly, Daniel S. Andersen, Andrzej Białowiec, David B. Parker
      First page: 54
      Abstract: Odor and gaseous emissions from the swine industry are of concern for the wellbeing of humans and livestock. Additives applied to the swine manure surface are popular marketed products intended to solve this problem, and they are relatively inexpensive and easy for farmers to use. However, there are no scientific data evaluating the effectiveness of many of these products. We evaluated the effectiveness of 12 currently marketed manure additive products in mitigating odor and gaseous emissions from swine manure. We used a pilot-scale system simulating the storage of swine manure with a controlled ventilation of headspace and periodic addition of manure. This dataset contains measured concentrations and estimated emissions of target gases in manure headspace above treated and untreated swine manure. These include ammonia (NH3), hydrogen sulfide (H2S), greenhouse gases (CO2, CH4, and N2O), volatile organic compounds (VOC), and odor. The experiment to test each manure additive product lasted for two months; the measurements of NH3 and H2S were completed twice a week; others were conducted weekly. The manure for each test was collected from three different farms in central Iowa to provide the necessary variety in stored swine manure properties. This dataset is useful for further analyses of gaseous emissions from swine manure under simulated storage conditions and for performance comparison of marketed products for the mitigation of gaseous emissions. Ultimately, swine farmers, the regulatory community, and the public need to have scientific data informing decisions about the usefulness of manure additives.
      Citation: Data
      PubDate: 2020-06-18
      DOI: 10.3390/data5020054
      Issue No: Vol. 5, No. 2 (2020)
       
  • Data, Vol. 5, Pages 55: A Database for the Radio Frequency Fingerprinting
           of Bluetooth Devices

    • Authors: Emre Uzundurukan, Yaser Dalveren, Ali Kara
      First page: 55
      Abstract: Radio frequency fingerprinting (RFF) is a promising physical layer protection technique which can be used to defend wireless networks from malicious attacks. It is based on the use of the distinctive features of the physical waveforms (signals) transmitted from wireless devices in order to classify authorized users. The most important requirement to develop an RFF method is the existence of a precise, robust, and extensive database of the emitted signals. In this context, this paper introduces a database consisting of Bluetooth (BT) signals collected at different sampling rates from 27 different smartphones (six manufacturers with several models for each). Firstly, the data acquisition system to create the database is described in detail. Then, the two well-known methods based on transient BT signals are experimentally tested by using the provided data to check their solidity. The results show that the created database may be useful for many researchers working on the development of the RFF of BT devices.
      Citation: Data
      PubDate: 2020-06-21
      DOI: 10.3390/data5020055
      Issue No: Vol. 5, No. 2 (2020)
       
  • Data, Vol. 5, Pages 56: A Probabilistic Bag-to-Class Approach to
           Multiple-Instance Learning

    • Authors: Kajsa Møllersen, Jon Yngve Hardeberg, Fred Godtliebsen
      First page: 56
      Abstract: Multi-instance (MI) learning is a branch of machine learning, where each object (bag) consists of multiple feature vectors (instances)—for example, an image consisting of multiple patches and their corresponding feature vectors. In MI classification, each bag in the training set has a class label, but the instances are unlabeled. The instances are most commonly regarded as a set of points in a multi-dimensional space. Alternatively, instances are viewed as realizations of random vectors with corresponding probability distribution, where the bag is the distribution, not the realizations. By introducing the probability distribution space to bag-level classification problems, dissimilarities between probability distributions (divergences) can be applied. The bag-to-bag Kullback–Leibler information is asymptotically the best classifier, but the typical sparseness of MI training sets is an obstacle. We introduce bag-to-class divergence to MI learning, emphasizing the hierarchical nature of the random vectors that makes bags from the same class different. We propose two properties for bag-to-class divergences, and an additional property for sparse training sets, and propose a dissimilarity measure that fulfils them. Its performance is demonstrated on synthetic and real data. The probability distribution space is valid for MI learning, both for the theoretical analysis and applications.
      Citation: Data
      PubDate: 2020-06-26
      DOI: 10.3390/data5020056
      Issue No: Vol. 5, No. 2 (2020)
       
 
JournalTOCs
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762
 

