
Publisher: Oxford University Press   (Total: 397 journals)



Showing 1 - 200 of 397 Journals sorted alphabetically
ACS Symposium Series     Full-text available via subscription   (SJR: 0.189, CiteScore: 0)
Acta Biochimica et Biophysica Sinica     Hybrid Journal   (Followers: 5, SJR: 0.79, CiteScore: 2)
Adaptation     Hybrid Journal   (Followers: 9, SJR: 0.143, CiteScore: 0)
Advances in Nutrition     Hybrid Journal   (Followers: 53, SJR: 2.196, CiteScore: 5)
Aesthetic Surgery J.     Hybrid Journal   (Followers: 6, SJR: 1.434, CiteScore: 1)
African Affairs     Hybrid Journal   (Followers: 66, SJR: 1.869, CiteScore: 2)
Age and Ageing     Hybrid Journal   (Followers: 92, SJR: 1.989, CiteScore: 4)
Alcohol and Alcoholism     Hybrid Journal   (Followers: 19, SJR: 1.376, CiteScore: 3)
American Entomologist     Full-text available via subscription   (Followers: 8)
American Historical Review     Hybrid Journal   (Followers: 161, SJR: 0.467, CiteScore: 1)
American J. of Agricultural Economics     Hybrid Journal   (Followers: 45, SJR: 2.113, CiteScore: 3)
American J. of Clinical Nutrition     Hybrid Journal   (Followers: 167, SJR: 3.438, CiteScore: 6)
American J. of Epidemiology     Hybrid Journal   (Followers: 190, SJR: 2.713, CiteScore: 3)
American J. of Hypertension     Hybrid Journal   (Followers: 25, SJR: 1.322, CiteScore: 3)
American J. of Jurisprudence     Hybrid Journal   (Followers: 19, SJR: 0.281, CiteScore: 1)
American J. of Legal History     Full-text available via subscription   (Followers: 8, SJR: 0.116, CiteScore: 0)
American Law and Economics Review     Hybrid Journal   (Followers: 27, SJR: 1.053, CiteScore: 1)
American Literary History     Hybrid Journal   (Followers: 16, SJR: 0.391, CiteScore: 0)
Analysis     Hybrid Journal   (Followers: 22, SJR: 1.038, CiteScore: 1)
Animal Frontiers     Hybrid Journal   (Followers: 1)
Annals of Behavioral Medicine     Hybrid Journal   (Followers: 16, SJR: 1.423, CiteScore: 3)
Annals of Botany     Hybrid Journal   (Followers: 37, SJR: 1.721, CiteScore: 4)
Annals of Oncology     Hybrid Journal   (Followers: 56, SJR: 5.599, CiteScore: 9)
Annals of the Entomological Society of America     Full-text available via subscription   (Followers: 10, SJR: 0.722, CiteScore: 1)
Annals of Work Exposures and Health     Hybrid Journal   (Followers: 34, SJR: 0.728, CiteScore: 2)
Antibody Therapeutics     Open Access  
AoB Plants     Open Access   (Followers: 4, SJR: 1.28, CiteScore: 3)
Applied Economic Perspectives and Policy     Hybrid Journal   (Followers: 18, SJR: 0.858, CiteScore: 2)
Applied Linguistics     Hybrid Journal   (Followers: 58, SJR: 2.987, CiteScore: 3)
Applied Mathematics Research eXpress     Hybrid Journal   (Followers: 1, SJR: 1.241, CiteScore: 1)
Arbitration Intl.     Full-text available via subscription   (Followers: 20)
Arbitration Law Reports and Review     Hybrid Journal   (Followers: 14)
Archives of Clinical Neuropsychology     Hybrid Journal   (Followers: 30, SJR: 0.731, CiteScore: 2)
Aristotelian Society Supplementary Volume     Hybrid Journal   (Followers: 3)
Arthropod Management Tests     Hybrid Journal   (Followers: 2)
Astronomy & Geophysics     Hybrid Journal   (Followers: 44, SJR: 0.146, CiteScore: 0)
Behavioral Ecology     Hybrid Journal   (Followers: 52, SJR: 1.871, CiteScore: 3)
Bioinformatics     Hybrid Journal   (Followers: 324, SJR: 6.14, CiteScore: 8)
Biology Methods and Protocols     Hybrid Journal  
Biology of Reproduction     Full-text available via subscription   (Followers: 9, SJR: 1.446, CiteScore: 3)
Biometrika     Hybrid Journal   (Followers: 20, SJR: 3.485, CiteScore: 2)
BioScience     Hybrid Journal   (Followers: 29, SJR: 2.754, CiteScore: 4)
Bioscience Horizons : The National Undergraduate Research J.     Open Access   (Followers: 1, SJR: 0.146, CiteScore: 0)
Biostatistics     Hybrid Journal   (Followers: 17, SJR: 1.553, CiteScore: 2)
BJA : British J. of Anaesthesia     Hybrid Journal   (Followers: 179, SJR: 2.115, CiteScore: 3)
BJA Education     Hybrid Journal   (Followers: 65)
Brain     Hybrid Journal   (Followers: 68, SJR: 5.858, CiteScore: 7)
Briefings in Bioinformatics     Hybrid Journal   (Followers: 52, SJR: 2.505, CiteScore: 5)
Briefings in Functional Genomics     Hybrid Journal   (Followers: 3, SJR: 2.15, CiteScore: 3)
British J. for the Philosophy of Science     Hybrid Journal   (Followers: 36, SJR: 2.161, CiteScore: 2)
British J. of Aesthetics     Hybrid Journal   (Followers: 25, SJR: 0.508, CiteScore: 1)
British J. of Criminology     Hybrid Journal   (Followers: 599, SJR: 1.828, CiteScore: 3)
British J. of Social Work     Hybrid Journal   (Followers: 85, SJR: 1.019, CiteScore: 2)
British Medical Bulletin     Hybrid Journal   (Followers: 6, SJR: 1.355, CiteScore: 3)
British Yearbook of Intl. Law     Hybrid Journal   (Followers: 33)
Bulletin of the London Mathematical Society     Hybrid Journal   (Followers: 4, SJR: 1.376, CiteScore: 1)
Cambridge J. of Economics     Hybrid Journal   (Followers: 65, SJR: 0.764, CiteScore: 2)
Cambridge J. of Regions, Economy and Society     Hybrid Journal   (Followers: 11, SJR: 2.438, CiteScore: 4)
Cambridge Quarterly     Hybrid Journal   (Followers: 10, SJR: 0.104, CiteScore: 0)
Capital Markets Law J.     Hybrid Journal   (Followers: 2, SJR: 0.222, CiteScore: 0)
Carcinogenesis     Hybrid Journal   (Followers: 2, SJR: 2.135, CiteScore: 5)
Cardiovascular Research     Hybrid Journal   (Followers: 14, SJR: 3.002, CiteScore: 5)
Cerebral Cortex     Hybrid Journal   (Followers: 46, SJR: 3.892, CiteScore: 6)
CESifo Economic Studies     Hybrid Journal   (Followers: 18, SJR: 0.483, CiteScore: 1)
Chemical Senses     Hybrid Journal   (Followers: 1, SJR: 1.42, CiteScore: 3)
Children and Schools     Hybrid Journal   (Followers: 6, SJR: 0.246, CiteScore: 0)
Chinese J. of Comparative Law     Hybrid Journal   (Followers: 5, SJR: 0.412, CiteScore: 0)
Chinese J. of Intl. Law     Hybrid Journal   (Followers: 22, SJR: 0.329, CiteScore: 0)
Chinese J. of Intl. Politics     Hybrid Journal   (Followers: 10, SJR: 1.392, CiteScore: 2)
Christian Bioethics: Non-Ecumenical Studies in Medical Morality     Hybrid Journal   (Followers: 10, SJR: 0.183, CiteScore: 0)
Classical Receptions J.     Hybrid Journal   (Followers: 27, SJR: 0.123, CiteScore: 0)
Clean Energy     Open Access   (Followers: 1)
Clinical Infectious Diseases     Hybrid Journal   (Followers: 70, SJR: 5.051, CiteScore: 5)
Communication Theory     Hybrid Journal   (Followers: 24, SJR: 2.424, CiteScore: 3)
Communication, Culture & Critique     Hybrid Journal   (Followers: 27, SJR: 0.222, CiteScore: 1)
Community Development J.     Hybrid Journal   (Followers: 27, SJR: 0.268, CiteScore: 1)
Computer J.     Hybrid Journal   (Followers: 9, SJR: 0.319, CiteScore: 1)
Conservation Physiology     Open Access   (Followers: 3, SJR: 1.818, CiteScore: 3)
Contemporary Women's Writing     Hybrid Journal   (Followers: 9, SJR: 0.121, CiteScore: 0)
Contributions to Political Economy     Hybrid Journal   (Followers: 5, SJR: 0.906, CiteScore: 1)
Critical Values     Full-text available via subscription  
Current Developments in Nutrition     Open Access   (Followers: 2)
Current Legal Problems     Hybrid Journal   (Followers: 29)
Current Zoology     Full-text available via subscription   (Followers: 3, SJR: 1.164, CiteScore: 2)
Database : The J. of Biological Databases and Curation     Open Access   (Followers: 8, SJR: 1.791, CiteScore: 3)
Digital Scholarship in the Humanities     Hybrid Journal   (Followers: 14, SJR: 0.259, CiteScore: 1)
Diplomatic History     Hybrid Journal   (Followers: 20, SJR: 0.45, CiteScore: 1)
DNA Research     Open Access   (Followers: 5, SJR: 2.866, CiteScore: 6)
Dynamics and Statistics of the Climate System     Open Access   (Followers: 4)
Early Music     Hybrid Journal   (Followers: 16, SJR: 0.139, CiteScore: 0)
Economic Policy     Hybrid Journal   (Followers: 42, SJR: 3.584, CiteScore: 3)
ELT J.     Hybrid Journal   (Followers: 24, SJR: 0.942, CiteScore: 1)
English Historical Review     Hybrid Journal   (Followers: 54, SJR: 0.612, CiteScore: 1)
English: J. of the English Association     Hybrid Journal   (Followers: 15, SJR: 0.1, CiteScore: 0)
Environmental Entomology     Full-text available via subscription   (Followers: 11, SJR: 0.818, CiteScore: 2)
Environmental Epigenetics     Open Access   (Followers: 3)
Environmental History     Hybrid Journal   (Followers: 27, SJR: 0.408, CiteScore: 1)
EP-Europace     Hybrid Journal   (Followers: 3, SJR: 2.748, CiteScore: 4)
Epidemiologic Reviews     Hybrid Journal   (Followers: 9, SJR: 4.505, CiteScore: 8)
ESHRE Monographs     Hybrid Journal  
Essays in Criticism     Hybrid Journal   (Followers: 19, SJR: 0.113, CiteScore: 0)
European Heart J.     Hybrid Journal   (Followers: 63, SJR: 9.315, CiteScore: 9)
European Heart J. - Cardiovascular Imaging     Hybrid Journal   (Followers: 9, SJR: 3.625, CiteScore: 3)
European Heart J. - Cardiovascular Pharmacotherapy     Full-text available via subscription   (Followers: 2)
European Heart J. - Quality of Care and Clinical Outcomes     Hybrid Journal  
European Heart J. : Case Reports     Open Access  
European Heart J. Supplements     Hybrid Journal   (Followers: 8, SJR: 0.223, CiteScore: 0)
European J. of Cardio-Thoracic Surgery     Hybrid Journal   (Followers: 9, SJR: 1.681, CiteScore: 2)
European J. of Intl. Law     Hybrid Journal   (Followers: 196, SJR: 0.694, CiteScore: 1)
European J. of Orthodontics     Hybrid Journal   (Followers: 4, SJR: 1.279, CiteScore: 2)
European J. of Public Health     Hybrid Journal   (Followers: 20, SJR: 1.36, CiteScore: 2)
European Review of Agricultural Economics     Hybrid Journal   (Followers: 10, SJR: 1.172, CiteScore: 2)
European Review of Economic History     Hybrid Journal   (Followers: 30, SJR: 0.702, CiteScore: 1)
European Sociological Review     Hybrid Journal   (Followers: 42, SJR: 2.728, CiteScore: 3)
Evolution, Medicine, and Public Health     Open Access   (Followers: 12)
Family Practice     Hybrid Journal   (Followers: 16, SJR: 1.018, CiteScore: 2)
Fems Microbiology Ecology     Hybrid Journal   (Followers: 15, SJR: 1.492, CiteScore: 4)
Fems Microbiology Letters     Hybrid Journal   (Followers: 28, SJR: 0.79, CiteScore: 2)
Fems Microbiology Reviews     Hybrid Journal   (Followers: 32, SJR: 7.063, CiteScore: 13)
Fems Yeast Research     Hybrid Journal   (Followers: 13, SJR: 1.308, CiteScore: 3)
Food Quality and Safety     Open Access   (Followers: 1)
Foreign Policy Analysis     Hybrid Journal   (Followers: 25, SJR: 1.425, CiteScore: 1)
Forest Science     Hybrid Journal   (Followers: 7, SJR: 0.89, CiteScore: 2)
Forestry: An Intl. J. of Forest Research     Hybrid Journal   (Followers: 16, SJR: 1.133, CiteScore: 3)
Forum for Modern Language Studies     Hybrid Journal   (Followers: 6, SJR: 0.104, CiteScore: 0)
French History     Hybrid Journal   (Followers: 33, SJR: 0.118, CiteScore: 0)
French Studies     Hybrid Journal   (Followers: 21, SJR: 0.148, CiteScore: 0)
French Studies Bulletin     Hybrid Journal   (Followers: 10, SJR: 0.152, CiteScore: 0)
Gastroenterology Report     Open Access   (Followers: 2)
Genome Biology and Evolution     Open Access   (Followers: 14, SJR: 2.578, CiteScore: 4)
Geophysical J. Intl.     Hybrid Journal   (Followers: 36, SJR: 1.506, CiteScore: 3)
German History     Hybrid Journal   (Followers: 23, SJR: 0.161, CiteScore: 0)
GigaScience     Open Access   (Followers: 5, SJR: 5.022, CiteScore: 7)
Global Summitry     Hybrid Journal   (Followers: 1)
Glycobiology     Hybrid Journal   (Followers: 13, SJR: 1.493, CiteScore: 3)
Health and Social Work     Hybrid Journal   (Followers: 57, SJR: 0.388, CiteScore: 1)
Health Education Research     Hybrid Journal   (Followers: 16, SJR: 0.854, CiteScore: 2)
Health Policy and Planning     Hybrid Journal   (Followers: 24, SJR: 1.512, CiteScore: 2)
Health Promotion Intl.     Hybrid Journal   (Followers: 22, SJR: 0.812, CiteScore: 2)
History Workshop J.     Hybrid Journal   (Followers: 31, SJR: 1.278, CiteScore: 1)
Holocaust and Genocide Studies     Hybrid Journal   (Followers: 28, SJR: 0.105, CiteScore: 0)
Human Communication Research     Hybrid Journal   (Followers: 15, SJR: 2.146, CiteScore: 3)
Human Molecular Genetics     Hybrid Journal   (Followers: 9, SJR: 3.555, CiteScore: 5)
Human Reproduction     Hybrid Journal   (Followers: 72, SJR: 2.643, CiteScore: 5)
Human Reproduction Open     Open Access   (Followers: 1)
Human Reproduction Update     Hybrid Journal   (Followers: 20, SJR: 5.317, CiteScore: 10)
Human Rights Law Review     Hybrid Journal   (Followers: 62, SJR: 0.756, CiteScore: 1)
ICES J. of Marine Science: J. du Conseil     Hybrid Journal   (Followers: 56, SJR: 1.591, CiteScore: 3)
ICSID Review     Hybrid Journal   (Followers: 10)
ILAR J.     Hybrid Journal   (Followers: 2, SJR: 1.732, CiteScore: 4)
IMA J. of Applied Mathematics     Hybrid Journal   (SJR: 0.679, CiteScore: 1)
IMA J. of Management Mathematics     Hybrid Journal   (SJR: 0.538, CiteScore: 1)
IMA J. of Mathematical Control and Information     Hybrid Journal   (Followers: 2, SJR: 0.496, CiteScore: 1)
IMA J. of Numerical Analysis - advance access     Hybrid Journal   (SJR: 1.987, CiteScore: 2)
Industrial and Corporate Change     Hybrid Journal   (Followers: 10, SJR: 1.792, CiteScore: 2)
Industrial Law J.     Hybrid Journal   (Followers: 39, SJR: 0.249, CiteScore: 1)
Inflammatory Bowel Diseases     Hybrid Journal   (Followers: 48, SJR: 2.511, CiteScore: 4)
Information and Inference     Free  
Integrative and Comparative Biology     Hybrid Journal   (Followers: 8, SJR: 1.319, CiteScore: 2)
Interacting with Computers     Hybrid Journal   (Followers: 11, SJR: 0.292, CiteScore: 1)
Interactive CardioVascular and Thoracic Surgery     Hybrid Journal   (Followers: 7, SJR: 0.762, CiteScore: 1)
Intl. Affairs     Hybrid Journal   (Followers: 65, SJR: 1.505, CiteScore: 3)
Intl. Data Privacy Law     Hybrid Journal   (Followers: 25)
Intl. Health     Hybrid Journal   (Followers: 6, SJR: 0.851, CiteScore: 2)
Intl. Immunology     Hybrid Journal   (Followers: 3, SJR: 2.167, CiteScore: 4)
Intl. J. for Quality in Health Care     Hybrid Journal   (Followers: 36, SJR: 1.348, CiteScore: 2)
Intl. J. of Constitutional Law     Hybrid Journal   (Followers: 64, SJR: 0.601, CiteScore: 1)
Intl. J. of Epidemiology     Hybrid Journal   (Followers: 241, SJR: 3.969, CiteScore: 5)
Intl. J. of Law and Information Technology     Hybrid Journal   (Followers: 5, SJR: 0.202, CiteScore: 1)
Intl. J. of Law, Policy and the Family     Hybrid Journal   (Followers: 28, SJR: 0.223, CiteScore: 1)
Intl. J. of Lexicography     Hybrid Journal   (Followers: 10, SJR: 0.285, CiteScore: 1)
Intl. J. of Low-Carbon Technologies     Open Access   (Followers: 1, SJR: 0.403, CiteScore: 1)
Intl. J. of Neuropsychopharmacology     Open Access   (Followers: 3, SJR: 1.808, CiteScore: 4)
Intl. J. of Public Opinion Research     Hybrid Journal   (Followers: 11, SJR: 1.545, CiteScore: 1)
Intl. J. of Refugee Law     Hybrid Journal   (Followers: 38, SJR: 0.389, CiteScore: 1)
Intl. J. of Transitional Justice     Hybrid Journal   (Followers: 11, SJR: 0.724, CiteScore: 2)
Intl. Mathematics Research Notices     Hybrid Journal   (Followers: 1, SJR: 2.168, CiteScore: 1)
Intl. Political Sociology     Hybrid Journal   (Followers: 40, SJR: 1.465, CiteScore: 3)
Intl. Relations of the Asia-Pacific     Hybrid Journal   (Followers: 23, SJR: 0.401, CiteScore: 1)
Intl. Studies Perspectives     Hybrid Journal   (Followers: 9, SJR: 0.983, CiteScore: 1)
Intl. Studies Quarterly     Hybrid Journal   (Followers: 48, SJR: 2.581, CiteScore: 2)
Intl. Studies Review     Hybrid Journal   (Followers: 25, SJR: 1.201, CiteScore: 1)
ISLE: Interdisciplinary Studies in Literature and Environment     Hybrid Journal   (Followers: 2, SJR: 0.15, CiteScore: 0)
ITNOW     Hybrid Journal   (Followers: 1, SJR: 0.103, CiteScore: 0)
J. of African Economies     Hybrid Journal   (Followers: 16, SJR: 0.533, CiteScore: 1)
J. of American History     Hybrid Journal   (Followers: 46, SJR: 0.297, CiteScore: 1)
J. of Analytical Toxicology     Hybrid Journal   (Followers: 14, SJR: 1.065, CiteScore: 2)
J. of Antimicrobial Chemotherapy     Hybrid Journal   (Followers: 15, SJR: 2.419, CiteScore: 4)
J. of Antitrust Enforcement     Hybrid Journal   (Followers: 1)
J. of Applied Poultry Research     Hybrid Journal   (Followers: 5, SJR: 0.585, CiteScore: 1)
J. of Biochemistry     Hybrid Journal   (Followers: 40, SJR: 1.226, CiteScore: 2)
J. of Burn Care & Research     Hybrid Journal   (Followers: 10, SJR: 0.768, CiteScore: 2)
J. of Chromatographic Science     Hybrid Journal   (Followers: 18, SJR: 0.36, CiteScore: 1)
J. of Church and State     Hybrid Journal   (Followers: 12, SJR: 0.139, CiteScore: 0)
J. of Communication     Hybrid Journal   (Followers: 55, SJR: 4.411, CiteScore: 5)
J. of Competition Law and Economics     Hybrid Journal   (Followers: 37, SJR: 0.33, CiteScore: 0)
J. of Complex Networks     Hybrid Journal   (Followers: 2, SJR: 1.05, CiteScore: 4)
J. of Computer-Mediated Communication     Open Access   (Followers: 29, SJR: 2.961, CiteScore: 6)
J. of Conflict and Security Law     Hybrid Journal   (Followers: 13, SJR: 0.402, CiteScore: 0)
J. of Consumer Research     Full-text available via subscription   (Followers: 47, SJR: 5.856, CiteScore: 5)


Database : The Journal of Biological Databases and Curation
Journal Prestige (SJR): 1.791
Citation Impact (citeScore): 3
Number of Followers: 8  

  This is an Open Access journal
ISSN (Online) 1758-0463
Published by Oxford University Press  [397 journals]
  • A scalable, aggregated genotypic–phenotypic database for human
           disease variation

    • Authors: Barrett R; Neben C, Zimmer A, et al.
      Abstract: Next generation sequencing multi-gene panels have greatly improved the diagnostic yield and cost effectiveness of genetic testing and are rapidly being integrated into the clinic for hereditary cancer risk. With this technology comes a dramatic increase in the volume, type and complexity of data. This invaluable data though is too often buried or inaccessible to researchers, especially to those without strong analytical or programming skills. To effectively share comprehensive, integrated genotypic–phenotypic data, we built Color Data, a publicly available, cloud-based database that supports broad access and data literacy. The database is composed of 50 000 individuals who were sequenced for 30 genes associated with hereditary cancer risk and provides useful information on allele frequency and variant classification, as well as associated phenotypic information such as demographics and personal and family history. Our user-friendly interface allows researchers to easily execute their own queries with filtering, and the results of queries can be shared and/or downloaded. The rapid and broad dissemination of these research results will help increase the value of, and reduce the waste in, scientific resources and data. Furthermore, the database is able to quickly scale and support integration of additional genes and human hereditary conditions. We hope that this database will help researchers and scientists explore genotype–phenotype correlations in hereditary cancer, identify novel variants for functional analysis and enable data-driven drug discovery and development.
      PubDate: Wed, 13 Feb 2019 00:00:00 GMT
      DOI: 10.1093/database/baz013
      Issue No: Vol. 2019 (2019)
  • LIVE: a manually curated encyclopedia of experimentally validated
           interactions of lncRNAs

    • Authors: An G; Sun J, Ren C, et al.
      Abstract: Advances in studies of long noncoding RNAs (lncRNAs) have provided data regarding the regulatory roles of lncRNAs, which perform functional roles through interactions with other functional elements. To track the underlying relationships among lncRNAs, various databases have been developed as repositories for lncRNA data. However, the ability to comprehensively explore the diverse interactions between lncRNAs and other functional elements is limited. To this end, we developed LIVE (LncRNA Interaction Validated Encyclopaedia), an interactive resource to integrate the diverse interactions of functional elements with lncRNAs. LIVE is a manually curated database of experimentally validated interactions of lncRNAs with genes, proteins and other various functional elements. By mining publications, we constructed LIVE with the following three interaction networks: a binding interaction network, a regulation network and a disease network; then, we combined them to form a comprehensive lncRNA interaction network. The current release of LIVE contains the validated interactions of 572 lncRNAs in humans and mice with 103 proteins, 209 genes, 56 transcription factors and 194 diseases. LIVE provides an interactive interface with charts and figures to aid users in searching and browsing interactions with lncRNAs. LIVE will greatly facilitate further investigation into the regulatory roles of lncRNAs and is freely available.
      PubDate: Wed, 13 Feb 2019 00:00:00 GMT
      DOI: 10.1093/database/baz011
      Issue No: Vol. 2019 (2019)
  • Increased interactivity and improvements to the GigaScience database,

    • Authors: Xiao S; Armit C, Edmunds S, et al.
      Abstract: With a large increase in the volume and type of data archived in GigaScience Database (GigaDB) since its launch in 2011, we have studied the metrics and user patterns to assess the important aspects needed to best suit current and future use. This has led to new front-end developments and enhanced interactivity and functionality that greatly improve user experience. In this article, we present an overview of the current practices including the Biocurational role of the GigaDB staff, the broad usage metrics of GigaDB datasets and an update on how the GigaDB platform has been overhauled and enhanced to improve the stability and functionality of the codebase. Finally, we report on future directions for the GigaDB resource.
      PubDate: Mon, 11 Feb 2019 00:00:00 GMT
      DOI: 10.1093/database/baz016
      Issue No: Vol. 2019 (2019)
  • RiceMetaSysB: a database of blast and bacterial blight responsive genes in
           rice and its utilization in identifying key blast-resistant WRKY genes

    • Authors: Sureshkumar V; Dutta B, Kumar V, et al.
      Abstract: Nearly two decades of revolution in the area of genomics serves as the basis of present-day molecular breeding in major food crops such as rice. Here we report an open source database on two major biotic stresses of rice, named RiceMetaSysB, which provides detailed information about rice blast and bacterial blight (BB) responsive genes (RGs). Meta-analysis of microarray data from different blast- and BB-related experiments across 241 and 186 samples identified 15135 unique genes for blast and 7475 for BB. A total of 9365 and 5375 simple sequence repeats (SSRs) in blast and BB RGs were identified for marker development. Retrieval of candidate genes using different search options like genotypes, tissue, developmental stage of the host, strain, hours/days post-inoculation, physical position and SSR marker information is facilitated in the database. Search options like ‘common genes among varieties’ and ‘strains’ have been enabled to identify robust candidate genes. A 2D representation of the data can be used to compare expression profiles across genes, genotypes and strains. To demonstrate the utility of this database, we queried for blast-responsive WRKY genes (fold change ≥5) using their gene IDs. The structural variations in the 12 WRKY genes so identified and their promoter regions were explored in two rice genotypes contrasting for their reaction to blast infection. Expression analysis of these genes in panicle tissue infected with a virulent and an avirulent strain of Magnaporthe oryzae could identify WRKY7, WRKY58, WRKY62, WRKY64 and WRKY76 as potential candidate genes for resistance to panicle blast, as they showed higher expression only in the resistant genotype against the virulent strain. Thus, we demonstrated that RiceMetaSysB can play an important role in providing robust candidate genes for rice blast and BB.
      PubDate: Mon, 11 Feb 2019 00:00:00 GMT
      DOI: 10.1093/database/baz015
      Issue No: Vol. 2019 (2019)
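
The fold-change query described in the RiceMetaSysB abstract above can be pictured with a minimal sketch. Only the threshold (fold change ≥5) and the WRKY family filter come from the abstract; the record layout, gene IDs and expression values are hypothetical and are not taken from the database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneRecord:
    gene_id: str        # hypothetical identifier, not a real RiceMetaSysB ID
    family: str         # gene family label, e.g. "WRKY"
    fold_change: float  # illustrative expression ratio, infected vs. control


def responsive_genes(records, family, min_fold_change=5.0):
    """Return IDs of genes in `family` whose fold change meets the threshold."""
    return [
        r.gene_id
        for r in records
        if r.family == family and r.fold_change >= min_fold_change
    ]


# Made-up records for illustration only
records = [
    GeneRecord("gene_A", "WRKY", 7.2),
    GeneRecord("gene_B", "WRKY", 3.1),
    GeneRecord("gene_C", "NAC", 9.4),
    GeneRecord("gene_D", "WRKY", 5.0),
]

print(responsive_genes(records, "WRKY"))  # ['gene_A', 'gene_D']
```

The threshold is inclusive, matching the "≥5" wording, so a gene with a fold change of exactly 5.0 is kept.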
  • Integrated curation and data mining for disease and phenotype models at
           the Rat Genome Database

    • Authors: Wang S; Laulederkind S, Zhao Y, et al.
      Abstract: Rats have been used as research models in biomedical research for over 150 years. These disease models arise from naturally occurring mutations, selective breeding and, more recently, genome manipulation. Through the innovation of genome-editing technologies, genome-modified rats provide precision models of disease by disrupting or complementing targeted genes. To facilitate the use of these data produced from rat disease models, the Rat Genome Database (RGD) organizes rat strains and annotates these strains with disease and qualitative phenotype terms as well as quantitative phenotype measurements. From the curated quantitative data, the expected phenotype profile ranges were established through a meta-analysis pipeline using inbred rat strains in control conditions. The disease and qualitative phenotype annotations are propagated to their associated genes and alleles if applicable. Currently, RGD has curated nearly 1300 rat strains with disease/phenotype annotations and about 11% of them have known allele associations. All of the annotations (disease and phenotype) are integrated and displayed on the strain, gene and allele report pages. Finding disease and phenotype models at RGD can be done by searching for terms in the ontology browser, browsing the disease or phenotype ontology branches or entering keywords in the general search. Use cases are provided to show different targeted searches of rat strains at RGD.
      PubDate: Mon, 11 Feb 2019 00:00:00 GMT
      DOI: 10.1093/database/baz014
      Issue No: Vol. 2019 (2019)
  • Using deep learning to identify translational research in genomic medicine
           beyond bench to bedside

    • Authors: Hsu Y; Clyne M, Wei C, et al.
      Abstract: Tracking scientific research publications on the evaluation, utility and implementation of genomic applications is critical for the translation of basic research to impact clinical and population health. In this work, we utilize state-of-the-art machine learning approaches to identify translational research in genomics beyond bench to bedside from the biomedical literature. We apply convolutional neural networks (CNNs) and support vector machines (SVMs) to bench/bedside article classification on the weekly manual annotation data of the Public Health Genomics Knowledge Base database. Both classifiers employ salient features to determine the probability of curation-eligible publications, which can effectively reduce the workload of the manual triage and curation process. We applied the CNNs and SVMs to an independent test set (n = 400), and the models achieved F-measures of 0.80 and 0.74, respectively. We further tested the CNNs, which performed better, on the routine annotation pipeline for 2 weeks; this significantly reduced the effort and retrieved more appropriate research articles. Our approaches provide direct insight into the automated curation of genomic translational research beyond bench to bedside. The machine learning classifiers are found to be helpful for annotators to enhance the efficiency of manual curation.
      PubDate: Fri, 08 Feb 2019 00:00:00 GMT
      DOI: 10.1093/database/baz010
      Issue No: Vol. 2019 (2019)
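
For readers unfamiliar with the F-measure reported in the abstract above, the following minimal sketch shows how it is commonly computed from precision and recall. The abstract does not give the underlying precision and recall values or state which weighting was used, so the balanced form (beta = 1) and the input numbers below are assumptions for illustration only.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # avoid division by zero when the classifier finds nothing
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical precision/recall pairs, not values from the paper
print(round(f_measure(0.5, 1.0), 3))
print(round(f_measure(0.9, 0.6), 3))
```

Because the harmonic mean is dominated by the smaller of the two inputs, a classifier cannot reach a high F-measure by maximizing recall alone, which matters in a triage setting where most candidate articles are not curation-eligible.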
  • ImmunoSPdb: an archive of immunosuppressive peptides

    • Authors: Usmani S; Agrawal P, Sehgal M, et al.
      Abstract: Immunosuppression has proved to be a captivating therapy in several autoimmune disorders, in asthma and in organ transplantation. Immunosuppressive peptides specifically reduce the efficacy of the immune system and have a wide range of therapeutic applications. ‘ImmunoSPdb’ is a comprehensive, manually curated database of around 500 experimentally verified immunosuppressive peptides compiled from 79 research articles and 32 patents. The current version comprises 553 entries providing extensive information including peptide name, sequence, chirality, chemical modification, origin, nature of peptide, its target as well as mechanism of action, amino acid frequency and composition, etc. Data analysis revealed that most of the immunosuppressive peptides are linear (91%), are short, i.e. up to 20 amino acids in length (62%), and have the L form of amino acids (81%). About 30% of the peptides are either chemically modified or have end-terminal modifications. Most of the peptides are either derived from proteins (41%) or exist naturally (27%). Blockage of potassium ion channels (24%) is a major target for immunosuppressive peptides. In addition, we have annotated tertiary structure by using PEPstrMOD and I-TASSER. Many user-friendly, web-based tools have been integrated to facilitate searching, browsing and analyzing the data. We have developed a user-friendly responsive website to assist a wide range of users.
      PubDate: Fri, 08 Feb 2019 00:00:00 GMT
      DOI: 10.1093/database/baz012
      Issue No: Vol. 2019 (2019)
  • Enhanced taxonomy annotation of antiviral activity data from ChEMBL

    • Authors: Nikitina A; Orlov A, Kozlovskaya L, et al.
      Abstract: The discovery of antiviral drugs is a rapidly developing area of medicinal chemistry research. The emergence of resistant variants and outbreaks of poorly studied viral diseases keep this area in constant development. The amount of antiviral activity data available in ChEMBL consistently grows, but virus taxonomy annotation of these data is not sufficient for thorough studies of antiviral chemical space. We developed a procedure for semi-automatic extraction of antiviral activity data from ChEMBL and mapped them to the virus taxonomy developed by the International Committee for Taxonomy of Viruses (ICTV). The procedure is based on the lists of virus-related values of ChEMBL annotation fields and a dictionary of virus names and acronyms mapped to ICTV taxa. Application of this data extraction procedure allows retrieving from ChEMBL 1.6 times more assays linked to 2.5 times more compounds and data points than the ChEMBL web interface allows. Mapping of these data to ICTV taxa allows analyzing all the compounds tested against each viral species. Activity values and structures of the compounds were standardized, and the antiviral activity profile was created for each standard structure. The data set compiled using this algorithm was called ViralChEMBL. As case studies, we compared descriptor and scaffold distributions for the full ChEMBL and its ‘viral’ and ‘non-viral’ subsets, identified the most studied compounds and created a self-organizing map for ViralChEMBL. Our approach to data annotation appeared to be a very efficient tool for the study of antiviral chemical space.
      PubDate: Fri, 08 Feb 2019 00:00:00 GMT
      DOI: 10.1093/database/bay139
      Issue No: Vol. 2019 (2019)
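
The dictionary-based mapping of virus names and acronyms to taxa mentioned in the abstract above might look roughly like the sketch below. This is not the authors' procedure: the dictionary entries, the taxon labels (`taxon:example-1`, `taxon:example-2`) and the helper names `normalize` and `map_to_taxon` are hypothetical placeholders chosen only to illustrate the lookup idea.

```python
import re

# Hypothetical name/acronym dictionary; labels are placeholders, not ICTV taxa
NAME_TO_TAXON = {
    "hiv-1": "taxon:example-1",
    "human immunodeficiency virus 1": "taxon:example-1",
    "hcv": "taxon:example-2",
    "hepatitis c virus": "taxon:example-2",
}


def normalize(text):
    """Lower-case the text and collapse runs of whitespace."""
    return re.sub(r"\s+", " ", text.strip().lower())


def map_to_taxon(annotation):
    """Return the taxon label for a free-text annotation, or None if unmapped."""
    key = normalize(annotation)
    if key in NAME_TO_TAXON:
        return NAME_TO_TAXON[key]
    # Fall back to the longest dictionary name that occurs as a whole phrase
    for name in sorted(NAME_TO_TAXON, key=len, reverse=True):
        pattern = r"(?<![\w-])" + re.escape(name) + r"(?![\w-])"
        if re.search(pattern, key):
            return NAME_TO_TAXON[name]
    return None


print(map_to_taxon("  HIV-1 "))
print(map_to_taxon("Hepatitis C  virus replicon assay"))
print(map_to_taxon("Influenza A"))
```

Trying longer names first keeps a short acronym from shadowing a more specific full name, and the whole-phrase check avoids matching an acronym embedded inside an unrelated word.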
  • The radish genome database (RadishGD): an integrated information resource
           for radish genomics

    • Authors: Yu H; Baek S, Lee Y, et al.
      Abstract: Radish (Raphanus sativus L.) is an important root vegetable crop in the family Brassicaceae, which provides diverse nutrients for human health and is closely related to the Brassica crop species. Recently, we sequenced and assembled the radish genome into nine chromosome pseudomolecules. In addition, we developed diverse genomic resources, including genetic maps, molecular markers, transcriptome, genome-wide methylation and variome data. In this study, we describe the radish genome database (RadishGD), including details of data sets that we generated and the web interface that allows access to these data. RadishGD comprises six major units that enable researchers and general users to search, browse and analyze the radish genomic data in an integrated manner. The Search unit provides gene structures and sequences for gene models through keyword or BLAST searches. The Genome browser displays graphic representations of gene models, mRNAs, repetitive sequences, genome-wide methylation and variomes among various genotypes. The Functional annotation unit offers gene ontology, plant ontology, pathway and gene family information for gene models. The Genetic map unit provides information about markers and their genetic locations using two types of genetic maps. The Expression unit presents transcriptional characteristics and methylation levels for each gene in 18 tissues. All sequence data incorporated into RadishGD can be downloaded from the Data resources unit. RadishGD will be continually updated to serve as a community resource for radish genomics and breeding research.
      PubDate: Tue, 05 Feb 2019 00:00:00 GMT
      DOI: 10.1093/database/baz009
      Issue No: Vol. 2019 (2019)
  • ZincBind—the database of zinc binding sites

    • Authors: Ireland S; Martin A.
      Abstract: Zinc is one of the most important biologically active metals. Ten per cent of the human genome is thought to encode a zinc binding protein and its uses encompass catalysis, structural stability, gene expression and immunity. At present, there is no specific resource devoted to identifying and presenting all currently known zinc binding sites. Here we present ZincBind, a database of zinc binding sites and its web front-end. Using the structural data in the Protein Data Bank, ZincBind identifies every instance of zinc binding to a protein, identifies its binding site and clusters sites based on 90% sequence identity. There are currently 24 992 binding sites, clustered into 7489 unique sites. The data are available over the web where they can be browsed and downloaded, and via a REST API. ZincBind is regularly updated and will continue to be updated with new data and features.
      PubDate: Tue, 05 Feb 2019 00:00:00 GMT
      DOI: 10.1093/database/baz006
      Issue No: Vol. 2019 (2019)
  • Integration of macromolecular complex data into the Saccharomyces Genome

    • Authors: Wong E; Skrzypek M, Weng S, et al.
      Abstract: Proteins seldom function individually. Instead, they interact with other proteins or nucleic acids to form stable macromolecular complexes that play key roles in important cellular processes and pathways. One of the goals of the Saccharomyces Genome Database (SGD) is to provide a complete picture of budding yeast biological processes. To this end, we have collaborated with the Molecular Interactions team that provides the Complex Portal database at EMBL-EBI to manually curate the complete yeast complexome. These data, from a total of 589 complexes, were previously available only in SGD’s YeastMine data warehouse and the Complex Portal. We have now incorporated these macromolecular complex data into the SGD core database and designed complex-specific reports to make these data easily available to researchers. These web pages contain referenced summaries focused on the composition and function of individual complexes. In addition, detailed information about how subunits interact within the complex, their stoichiometry and the physical structure are displayed when such information is available. Finally, we generate network diagrams displaying subunits and Gene Ontology annotations that are shared between complexes. Information on macromolecular complexes will continue to be updated in collaboration with the Complex Portal team and curated as more data become available.
      PubDate: Mon, 04 Feb 2019 00:00:00 GMT
      DOI: 10.1093/database/baz008
      Issue No: Vol. 2019 (2019)
  • CircFunBase: a database for functional circular RNAs

    • Authors: Meng X; Hu D, Zhang P, et al.
      Abstract: Increasing evidence reveals that circular RNAs (circRNAs) are widespread in eukaryotes and play important roles in diverse biological processes. However, a comprehensive functionally annotated circRNA database is still lacking. CircFunBase is a web-accessible database that aims to provide a high-quality functional circRNA resource including experimentally validated and computationally predicted functions. The current version of CircFunBase documents more than 7000 manually curated functional circRNA entries, mainly from species such as Homo sapiens and Mus musculus. CircFunBase provides visualized circRNA-miRNA interaction networks. In addition, a genome browser is provided to visualize the genome context of circRNAs. As a biological information platform for circRNAs, CircFunBase will contribute to circRNA studies and bridge the gap between circRNAs and their functions.
      PubDate: Mon, 04 Feb 2019 00:00:00 GMT
      DOI: 10.1093/database/baz003
      Issue No: Vol. 2019 (2019)
  • Annotation of gene product function from high-throughput studies using the
           Gene Ontology

    • Authors: Attrill H; Gaudet P, Huntley R, et al.
      Abstract: High-throughput studies constitute an essential and valued source of information for researchers. However, high-throughput experimental workflows are often complex, with multiple data sets that may contain large numbers of false positives. The representation of high-throughput data in the Gene Ontology (GO) therefore presents a challenging annotation problem, when the overarching goal of GO curation is to provide the most precise view of a gene's role in biology. To address this, representatives from annotation teams within the GO Consortium reviewed high-throughput data annotation practices. We present an annotation framework for high-throughput studies that will facilitate good standards in GO curation and, through the use of new high-throughput evidence codes, increase the visibility of these annotations to the research community.
      PubDate: Fri, 01 Feb 2019 00:00:00 GMT
      DOI: 10.1093/database/baz007
      Issue No: Vol. 2019 (2019)
  • APID database: redefining protein–protein interaction experimental
           evidences and binary interactomes

    • Authors: Alonso-López D; Campos-Laborie F, Gutiérrez M, et al.
      Abstract: The collection and integration of all the known protein–protein physical interactions within a proteome framework are critical to allow proper exploration of the protein interaction networks that drive biological processes in cells at molecular level. APID Interactomes is a public resource of biological data that provides a comprehensive and curated collection of ‘protein interactomes’ for more than 1100 organisms, including 30 species with more than 500 interactions, derived from the integration of experimentally detected protein-to-protein physical interactions (PPIs). We have performed an update of APID database including a redefinition of several key properties of the PPIs to provide a more precise data integration and to avoid false duplicated records. This includes the unification of all the PPIs from five primary databases of molecular interactions (BioGRID, DIP, HPRD, IntAct and MINT), plus the information from two original systematic sources of human data and from experimentally resolved 3D structures (i.e. PDBs, Protein Data Bank files, where more than two distinct proteins have been identified). Thus, APID provides PPIs reported in published research articles (with traceable PMIDs) and detected by valid experimental interaction methods that give evidences about such protein interactions (following the ‘ontology and controlled vocabulary’ developed by ‘HUPO PSI-MI’). Within this data mining framework, all interaction detection methods have been grouped into two main types: (i) ‘binary’ physical direct detection methods and (ii) ‘indirect’ methods. As a result of these redefinitions, APID provides unified protein interactomes including the specific ‘experimental evidences’ that support each PPI, indicating whether the interactions can be considered ‘binary’ (i.e. supported by at least one binary detection method) or not.
      PubDate: Thu, 31 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/baz005
      Issue No: Vol. 2019 (2019)
  • Meta-omics data and collection objects (MOD-CO): a conceptual schema and
           data model for processing sample data in meta-omics research

    • Authors: Rambold G; Yilmaz P, Harjes J, et al.
      Abstract: With the advent of advanced molecular meta-omics techniques and methods, a new era commenced for analysing and characterizing historic collection specimens, as well as recently collected environmental samples. Nucleic acid and protein sequencing-based analyses are increasingly applied to determine the origin, identity and traits of environmental (biological) objects and organisms. In this context, the need for new data structures is evident and former approaches for data processing need to be expanded according to the new meta-omics techniques and operational standards. Existing schemas and community standards in the biodiversity and molecular domain concentrate on terms important for data exchange and publication. Detailed operational aspects of origin and laboratory as well as object and data management issues are frequently neglected. Meta-omics Data and Collection Objects (MOD-CO) has therefore been set up as a new schema for meta-omics research, with a hierarchical organization of the concepts describing collection samples, as well as products and data objects being generated during operational workflows. It is focussed on object trait descriptions as well as on operational aspects and thereby may serve as a backbone for R&D laboratory information management systems with functions of an electronic laboratory notebook. The schema in its current version 1.0 includes 653 concepts and 1810 predefined concept values, being equivalent to descriptors and descriptor states, respectively. It is published in several representations, like a Semantic Media Wiki publication with 2463 interlinked Wiki pages for concepts and concept values, being grouped in 37 concept collections and subcollections. The SQL database application DiversityDescriptions, a generic tool for maintaining descriptive data and schemas, has been applied for setting up and testing MOD-CO and for concept mapping on elements of corresponding schemas.
      PubDate: Thu, 31 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/baz002
      Issue No: Vol. 2019 (2019)
  • One tool to find them all: a case of data integration and querying in a
           distributed LIMS platform

    • Authors: Grand A; Geda E, Mignone A, et al.
      Abstract: In recent years, Laboratory Information Management Systems (LIMS) have been growing from mere inventory systems into increasingly comprehensive software platforms, spanning functionalities as diverse as data search, annotation and analysis. In 2011 our institution started a LIMS project named the Laboratory Assistant Suite with the purpose of assisting researchers throughout all of their laboratory activities, providing graphical tools to support decision-making tasks and building complex analyses on integrated data. The modular architecture of the system exploits multiple databases with different technologies. To provide an efficient and easy tool for retrieving information of interest, we developed the Multi-Dimensional Data Manager (MDDM). By means of intuitive interfaces, scientists can execute complex queries without any knowledge of query languages or database structures, and easily integrate heterogeneous data stored in multiple databases. Together with the other software modules making up the platform, the MDDM has helped improve the overall quality of the data, substantially reduced the time spent on manual data entry and retrieval and ultimately broadened the spectrum of interconnections among the data, offering novel perspectives to biomedical analysts.
      PubDate: Wed, 30 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/baz004
      Issue No: Vol. 2019 (2019)
  • Automatic identification of relevant chemical compounds from patents

    • Authors: Akhondi S; Rey H, Schwörer M, et al.
      Abstract: In commercial research and development projects, public disclosure of new chemical compounds often takes place in patents. Only a small proportion of these compounds are published in journals, usually a few years after the patent. Patent authorities make available the patents but do not provide systematic continuous chemical annotations. Content databases such as Elsevier’s Reaxys provide such services mostly based on manual excerptions, which are time-consuming and costly. Automatic text-mining approaches help overcome some of the limitations of the manual process. Different text-mining approaches exist to extract chemical entities from patents. The majority of them have been developed using sub-sections of patent documents and focus on mentions of compounds. Less attention has been given to relevancy of a compound in a patent. Relevancy of a compound to a patent is based on the patent’s context. A relevant compound plays a major role within a patent. Identification of relevant compounds reduces the size of the extracted data and improves the usefulness of patent resources (e.g. supports identifying the main compounds). Annotators of databases like Reaxys only annotate relevant compounds. In this study, we design an automated system that extracts chemical entities from patents and classifies their relevance. The gold-standard set contained 18 789 chemical entity annotations. Of these, 10% were relevant compounds, 88% were irrelevant and 2% were equivocal. Our compound recognition system was based on proprietary tools. The performance (F-score) of the system on compound recognition was 84% on the development set and 86% on the test set. The relevancy classification system had an F-score of 86% on the development set and 82% on the test set. Our system can extract chemical compounds from patents and classify their relevance with high performance. This enables the extension of the Reaxys database by means of automation.
      PubDate: Wed, 30 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/baz001
      Issue No: Vol. 2019 (2019)
  • Overview of the BioCreative VI Precision Medicine Track: mining protein
           interactions and mutations for precision medicine

    • Authors: Islamaj Doğan R; Kim S, Chatr-aryamontri A, et al.
      Abstract: The Precision Medicine Initiative is a multicenter effort aiming at formulating personalized treatments leveraging on individual patient data (clinical, genome sequence and functional genomic data) together with the information in large knowledge bases (KBs) that integrate genome annotation, disease association studies, electronic health records and other data types. The biomedical literature provides a rich foundation for populating these KBs, reporting genetic and molecular interactions that provide the scaffold for the cellular regulatory systems and detailing the influence of genetic variants in these interactions. The goal of BioCreative VI Precision Medicine Track was to extract this particular type of information and was organized in two tasks: (i) document triage task, focused on identifying scientific literature containing experimentally verified protein–protein interactions (PPIs) affected by genetic mutations and (ii) relation extraction task, focused on extracting the affected interactions (protein pairs). To assist system developers and task participants, a large-scale corpus of PubMed documents was manually annotated for this task. Ten teams worldwide contributed 22 distinct text-mining models for the document triage task, and six teams worldwide contributed 14 different text-mining systems for the relation extraction task. When comparing the text-mining system predictions with human annotations, for the triage task, the best F-score was 69.06%, the best precision was 62.89%, the best recall was 98.0% and the best average precision was 72.5%. For the relation extraction task, when taking homologous genes into account, the best F-score was 37.73%, the best precision was 46.5% and the best recall was 54.1%. Submitted systems explored a wide range of methods, from traditional rule-based, statistical and machine learning systems to state-of-the-art deep learning methods. Given the level of participation and the individual team results we find the precision medicine track to be successful in engaging the text-mining research community. In the meantime, the track produced a manually annotated corpus of 5509 PubMed documents developed by BioGRID curators and relevant for precision medicine. The data set is freely available to the community, and the specific interactions have been integrated into the BioGRID data set. In addition, this challenge provided the first results of automatically identifying PubMed articles that describe PPI affected by mutations, as well as extracting the affected relations from those articles. Still, much progress is needed for computer-assisted precision medicine text mining to become mainstream. Future work should focus on addressing the remaining technical challenges and incorporating the practical benefits of text-mining tools into real-world precision medicine information-related curation.
      PubDate: Mon, 28 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay147
      Issue No: Vol. 2019 (2019)
  • EnhancerDB: a resource of transcriptional regulation in the context of

    • Authors: Kang R; Zhang Y, Huang Q, et al.
      Abstract: Enhancers can act as cis-regulatory elements to control transcriptional regulation by recruiting DNA-binding transcription factors (TFs) in a tissue-specific manner. Recent studies show that enhancers regulate not only protein-coding genes but also microRNAs (miRNAs), and mutations within the TF binding sites (TFBSs) located on enhancers will cause a variety of diseases such as cancer. However, a comprehensive resource to integrate these regulation elements for revealing transcriptional regulations in the context of enhancers is not currently available. Here, we introduce EnhancerDB, a web-accessible database to provide a resource to browse and search regulatory relationships identified in this study, including 131 054 581 TF–enhancer, 17 059 enhancer–miRNAs, 318 993 enhancer–genes, 4 639 558 TF–miRNAs, 1 059 695 TF–genes, 11 439 394 enhancer–single-nucleotide polymorphisms (SNPs) and 23 334 genes associated with expression quantitative trait loci (eQTL) SNP and expression profile of TF/gene/miRNA across multiple human tissues/cell lines. We also developed a tool that further allows users to define tissue-specific enhancers by setting the threshold score of tissue specificity of enhancers. In addition, links to external resources are also available at EnhancerDB.
      PubDate: Thu, 24 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay141
      Issue No: Vol. 2019 (2019)
  • Towards comprehensive annotation of Drosophila melanogaster enzymes in

    • Authors: Garapati P; Zhang J, Rey A, et al.
      Abstract: The catalytic activities of enzymes can be described using Gene Ontology (GO) terms and Enzyme Commission (EC) numbers. These annotations are available from numerous biological databases and are routinely accessed by researchers and bioinformaticians to direct their work. However, enzyme data may not be congruent between different resources, while the origin, quality and genomic coverage of these data within any one resource are often unclear. GO/EC annotations are assigned either manually by expert curators or inferred computationally, and there is potential for errors in both types of annotation. If such errors remain unchecked, false positive annotations may be propagated across multiple resources, significantly degrading the quality and usefulness of these data. Similarly, the absence of annotations (false negatives) from any one resource can lead to incorrect inferences or conclusions. We are systematically reviewing and enhancing the functional annotation of the enzymes of Drosophila melanogaster, focusing on improvements within the FlyBase database. We have reviewed four major enzyme groups to date: oxidoreductases, lyases, isomerases and ligases. Herein, we describe our review workflow, the improvement in the quality and coverage of enzyme annotations within FlyBase and the wider impact of our work on other related databases.
      PubDate: Wed, 23 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay144
      Issue No: Vol. 2019 (2019)
  • ccPDB 2.0: an updated version of datasets created and compiled from
           Protein Data Bank

    • Authors: Agrawal P; Patiyal S, Kumar R, et al.
      Abstract: ccPDB 2.0 is an updated version of the manually curated database ccPDB that maintains datasets required for developing methods to predict the structure and function of proteins. The number of datasets compiled from literature increased from 45 to 141 in ccPDB 2.0. Similarly, the number of protein structures used for creating datasets also increased from ~74 000 to ~137 000 (PDB March 2018 release). ccPDB 2.0 provides the same web services and flexible tools which were present in the previous version of the database. In the updated version, links to a number of methods developed in the past few years have also been incorporated. This updated resource is built on responsive templates that are compatible with smartphones (mobile, iPhone, iPad, tablets etc.) and large screen gadgets. In summary, ccPDB 2.0 is a user-friendly web-based platform that provides comprehensive as well as updated information about datasets.
      PubDate: Wed, 23 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay142
      Issue No: Vol. 2019 (2019)
  • A manual corpus of annotated main findings of clinical case reports

    • Authors: Smalheiser N; Luo M, Addepalli S, et al.
      Abstract: Clinical case reports are the ‘eyewitness reports’ of medicine and provide a valuable, unique, albeit noisy and underutilized type of evidence. Generally a case report has a single main finding that represents the reason for writing up the report in the first place. In the present study, we present the results of manual annotation carried out by two individuals on 500 randomly sampled case reports. This corpus contains main finding sentences extracted from title, abstract and full-text of the same article that can be regarded as semantically related and are often paraphrases. The final reconciled corpus of 416 articles comprises an open resource for further study. This is the first step in establishing text mining models and tools that can identify main finding sentences in an automated fashion, and in measuring quantitatively how similar any two main findings are. We envision that case reports in PubMed may be automatically indexed by main finding, so that users can carry out information queries for specific main findings (rather than general topics), and given one case report, a user can retrieve those having the most similar main findings. The metric of main finding similarity may also potentially be relevant to the modeling of paraphrasing, summarization and entailment within the biomedical literature.
      PubDate: Thu, 17 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay143
      Issue No: Vol. 2019 (2019)
  • RRMdb—an evolutionary-oriented database of RNA recognition motif

    • Authors: Nowacka M; Boccaletto P, Jankowska E, et al.
      Abstract: The RNA-recognition motif (RRM) is an RNA-interacting protein domain that plays an important role in the processes of RNA metabolism such as splicing, editing, export, degradation and regulation of translation. Here, we present the RNA-recognition motif database (RRMdb), which affords rapid identification and annotation of RRM domains in a given protein sequence. The RRMdb database is compiled from ~57 000 collected representative RRM domain sequences, classified into 415 families. Whenever possible, the families are associated with the available literature and structural data. Moreover, the RRM families are organized into a network of sequence similarities that allows for the assessment of the evolutionary relationships between them.
      PubDate: Wed, 16 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay148
      Issue No: Vol. 2019 (2019)
  • Restructured GEO: restructuring Gene Expression Omnibus metadata for
           genome dynamics analysis

    • Authors: Chen G; Ramírez J, Deng N, et al.
      Abstract: Motivation: Gene Expression Omnibus (GEO) and other publicly available data store their metadata in the format of unstructured English text, which is very difficult for automated reuse. Results: We employed text mining techniques to analyze the metadata of GEO and developed Restructured GEO database (ReGEO). ReGEO reorganizes and categorizes GEO series and makes them searchable by two new attributes extracted automatically from each series’ metadata. These attributes are the number of time points tested in the experiment and the disease being investigated. ReGEO also makes series searchable by other attributes available in GEO, such as platform organism, experiment type, associated PubMed ID as well as general keywords in the study’s description. Our approach greatly expands the usability of GEO data, demonstrating a credible approach to improve the utility of vast amount of publicly available data in the era of Big Data research.
      PubDate: Wed, 16 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay145
      Issue No: Vol. 2019 (2019)
  • Involving community in genes and pathway curation

    • Authors: Naithani S; Gupta P, Preece J, et al.
      Abstract: Biocuration plays a crucial role in building databases and complex systems-level platforms required for processing, annotating and analyzing ‘Big Data’ in biology. However, biocuration efforts cannot keep pace with a dramatic increase in the production of omics data; this presents one of the bottlenecks in genomics. In two pathway curation jamborees, Plant Reactome curators tested strategies for introducing researchers to pathway curation tools, harnessing biologists’ expertise in curating plant pathways and developing a network of community biocurators. We summarize the strategy, workflow and outcomes of these exercises, and discuss the role of community biocuration in advancing databases and genomic resources.
      PubDate: Wed, 16 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay146
      Issue No: Vol. 2019 (2019)
  • Extracting chemical–protein interactions from literature using sentence
           structure analysis and feature engineering

    • Authors: Lung P; He Z, Zhao T, et al.
      Abstract: Information about the interactions between chemical compounds and proteins is indispensable for understanding the regulation of biological processes and the development of therapeutic drugs. Manually extracting such information from biomedical literature is time-consuming and resource-intensive. In this study, we propose a computational method to automatically extract chemical–protein interactions (CPIs) from a given text. Our method extracts CPI pairs and CPI triplets from sentences, where a CPI pair consists of a chemical compound and a protein name, and a CPI triplet consists of a CPI pair along with an interaction word describing their relationship. We extracted a diverse set of features from sentences that were used to build multiple machine learning models. Our models contain both simple features, which can be directly computed from sentences, and more sophisticated features derived using sentence structure analysis techniques. For example, one set of features was extracted based on the shortest paths between the CPI pairs or among the CPI triplets in the dependency graphs obtained from sentence parsing. We designed a three-stage approach to predict the multiple categories of CPIs. Our method performed the best among systems that use non-deep learning methods and outperformed several deep-learning-based systems in track 5 of the BioCreative VI challenge. The features we designed in this study are informative and can be applied to other machine learning methods including deep learning.
      PubDate: Tue, 08 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay138
      Issue No: Vol. 2019 (2019)
  • TP53LNC-DB, the database of lncRNAs in the p53 signalling network

    • Authors: Khan M; Bukhari I, Khan R, et al.
      Abstract: The TP53 gene product, p53, is a pleiotropic transcription factor induced by stress, which functions to promote cell cycle arrest, apoptosis and senescence. Genome-wide profiling has revealed an extensive system of long noncoding RNAs (lncRNAs) that is integral to the p53 signalling network. As a research tool, we implemented a public access database called TP53LNC-DB that annotates currently available information relating lncRNAs to p53 signalling in humans.
      PubDate: Tue, 08 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay136
      Issue No: Vol. 2019 (2019)
  • RAEdb: a database of enhancers identified by high-throughput reporter

    • Authors: Cai Z; Cui Y, Tan Z, et al.
      Abstract: High-throughput reporter assays have been recently developed to directly and quantitatively assess enhancer activity for thousands of regulatory elements. However, there is still no database to collect these enhancers. We developed RAEdb, the first database to collect enhancers identified by high-throughput reporter assays. RAEdb includes 538 320 enhancers derived from eight studies, most of which were from six human cell lines. An activity score was assigned to each enhancer based on reporter assays. Based on these enhancers, 7658 epromoters (promoters with enhancer activity) were identified and stored in the database. RAEdb provides two search modes: the first searches studies by species and cell line; the second searches enhancers or epromoters by position, activity score, sequence and gene. RAEdb also provides a genome browser to query, visualize and compare enhancers. All data in RAEdb are freely available for download.
      PubDate: Tue, 08 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay140
      Issue No: Vol. 2019 (2019)
  • PubTerm: a web tool for organizing, annotating and curating genes,
           diseases, molecules and other concepts from PubMed records

    • Authors: Garcia-Pelaez J; Rodriguez D, Medina-Molina R, et al.
      Abstract: Background and objective: Analysis, annotation and curation of biomedical scientific literature is a recurrent task in biomedical research, database curation and clinics. Commonly, the reading is centered on concepts such as genes, diseases or molecules. Database curators may also need to annotate published abstracts related to a specific topic. However, few free and intuitive tools exist to assist users in this context. Therefore, we developed PubTerm, a web tool to organize, categorize, curate and annotate a large number of PubMed abstracts related to biological entities such as genes, diseases, chemicals, species, sequence variants and other related information. Methods: A variety of interfaces were implemented to facilitate curation and annotation, including the organization of abstracts by terms, by the co-occurrence of terms or by specific phrases. Information includes statistics on the occurrence of terms. The abstracts, terms and other related information can be annotated and categorized using user-defined categories. The session information can be saved and restored, and the data can be exported to other formats. Results: The pipeline in PubTerm starts by specifying a PubMed query or list of PubMed identifiers. Then, the user can specify three lists of categories and specify what information will be highlighted in which colors. The user then utilizes the ‘term view’ to organize the abstracts by gene, disease, species or other information to facilitate the annotation and categorization of terms or abstracts. Other views also facilitate the exploration of abstracts and connections between terms. We have used PubTerm to quickly and efficiently curate collections of more than 400 abstracts that mention more than 350 genes to generate revised lists of susceptibility genes for diseases. An example is provided for pulmonary arterial hypertension. Conclusions: PubTerm saves time for literature revision by assisting with annotation organization and knowledge acquisition.
      PubDate: Tue, 08 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay137
      Issue No: Vol. 2019 (2019)
  • TogoGenome/TogoStanza: modularized Semantic Web genome database

    • Authors: Katayama T; Kawashima S, Okamoto S, et al.
      Abstract: TogoGenome is a genome database that is purely based on Semantic Web technology, which enables the integration of heterogeneous data and flexible semantic searches. All the information is stored as Resource Description Framework (RDF) data, and the reporting web pages are generated on the fly using SPARQL Protocol and RDF Query Language (SPARQL) queries. TogoGenome provides a semantic-faceted search system by gene functional annotation, taxonomy, phenotypes and environment based on the relevant ontologies. TogoGenome also serves as an interface to conduct semantic comparative genomics by which a user can observe pan-organism or organism-specific genes based on the functional aspect of gene annotations and the combinations of organisms from different taxa. The TogoGenome database exhibits a modularized structure, and each module in the report pages is separately served as a TogoStanza. TogoStanza is a generic framework for rendering an information block as IFRAME/Web Components, which, unlike the modules of several other monolithic databases, can also be reused to construct other databases. TogoGenome and TogoStanza have been under development since 2012 and are freely available, along with their source code, in GitHub repositories under the MIT license.
      PubDate: Tue, 08 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay132
      Issue No: Vol. 2019 (2019)
  • The integrated National NeuroAIDS Tissue Consortium database: a rich
           platform for neuroHIV research

    • Authors: Heithoff A; Totusek S, Le D, et al.
      Abstract: Herein we present major updates to the National NeuroAIDS Tissue Consortium (NNTC) database. The NNTC's ongoing multisite clinical research study was established to facilitate access to ante-mortem and post-mortem data, tissues and biofluids for the neurohuman immunodeficiency virus (HIV) research community. Recently, the NNTC has expanded to include data from the central nervous system HIV Antiretroviral Therapy Effects Research (CHARTER) study. The data and biospecimens from CHARTER and NNTC cohorts are available to qualified researchers upon request. Data generated by requestors using NNTC biospecimens and tissues are returned to the NNTC upon the conclusion of requestors' work, and these external experimental data are annotated and curated in the publicly accessible NNTC database, thereby extending the utility of each case. A flexible and extensible database ontology allows the integration of disparate data sets, including external experimental data, clinical neuropsychological and neuromedical testing data, tissue pathology and neuroimaging data.
      PubDate: Tue, 08 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay134
      Issue No: Vol. 2019 (2019)
  • Combining relation extraction with function detection for BEL statement

    • Authors: Liu S; Cheng W, Qian L, et al.
      Abstract: The BioCreative-V community proposed a challenging task of automatic extraction of causal relation networks in Biological Expression Language (BEL) from the biomedical literature. Previous studies on this task largely used models induced from other related tasks and then transformed intermediate structures to BEL statements, which left the given training corpus unexplored. To make full use of the BEL training corpus, in this work, we propose a deep learning-based approach to extract BEL statements. Specifically, we decompose the problem into two subtasks: entity relation extraction and entity function detection. First, two attention-based bidirectional long short-term memory network models are used to extract entity relations and entity functions, respectively. Then entity relations and their functions are combined into a BEL statement. In order to boost the overall performance, a strategy of threshold filtering is applied to improve the precision of identified entity functions. We evaluate our approach on the BioCreative-V Track 4 corpus with or without gold entities. The experimental results show that our method achieves state-of-the-art performance with an overall F1-measure of 46.9% in stage 2 and 21.3% in stage 1, respectively.
      PubDate: Tue, 08 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay133
      Issue No: Vol. 2019 (2019)
  • AtFusionDB: a database of fusion transcripts in Arabidopsis thaliana

    • Authors: Singh A; Zahra S, Das D, et al.
      Abstract: Fusion transcripts are chimeric RNAs generated as a result of fusion at either the DNA or RNA level. These novel transcripts have been extensively studied in the case of human cancers but still remain underexamined in plants. In this study, we introduce the first plant-specific database of fusion transcripts, named AtFusionDB. This is a comprehensive database that contains detailed information about fusion transcripts identified in the model plant Arabidopsis thaliana. A total of 82 969 fusion transcript entries generated from 17 181 different genes of A. thaliana are available in this database. Apart from the basic information consisting of the Ensembl gene names, official gene name, tissue type, EricScore, fusion type, AtFusionDB ID and sample ID (e.g. Sequence Read Archive ID), additional information like UniProt, gene coordinates (together with the function of parental genes), junction sequence, expression level of both parent genes and fusion transcript may be of high utility to the user. Two different types of search modules, viz. ‘Simple Search’ and ‘Advanced Search’, in addition to the ‘Browse’ option with data download facility, are provided in this database. Three different modules for mapping and alignment of the query sequences, viz. BLASTN, SW Align and Mapping, are incorporated in AtFusionDB. This database is a head start for exploring the complex and unexplored domain of gene/transcript fusion in plants.
      PubDate: Tue, 08 Jan 2019 00:00:00 GMT
      DOI: 10.1093/database/bay135
      Issue No: Vol. 2019 (2019)