Nucleic Acids Research
Journal Prestige (SJR): 9.025
Citation Impact (citeScore): 11
Number of Followers: 68  

  This is an Open Access journal
ISSN (Print) 0305-1048 - ISSN (Online) 1362-4962
Published by Oxford University Press  [412 journals]
  • Editorial: the 18th annual Nucleic Acids Research web server issue 2020

    • Abstract: The 2020 version of Nucleic Acids Research's web server issue marks a change. This 18th web server issue is the first one to be assembled without the editorial supervision of Gary Benson, who served as its executive editor for more than a decade and whose work has made the issue the most prominent resource for scientific web-based applications.
      PubDate: Sun, 21 Jun 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa528
      Issue No: Vol. 48, No. W1 (2020)
  • RNAProbe: a web server for normalization and analysis of RNA structure
           probing data

    • Authors: Wirecki T; Merdas K, Bernat A, et al.
      Abstract: RNA molecules play key roles in all living cells. Knowledge of the structural characteristics of RNA molecules allows for a better understanding of the mechanisms of their action. RNA chemical probing allows us to study the susceptibility of nucleotides to chemical modification, and the information obtained can be used to guide secondary structure prediction. These experimental results can be analyzed using various computational tools, which, however, requires additional, tedious steps (e.g., further normalization of the reactivities and visualization of the results), for which there are no fully automated methods. Here, we introduce RNAProbe, a web server that facilitates normalization, analysis, and visualization of the low-pass SHAPE, DMS and CMCT probing results with the modification sites detected by capillary electrophoresis. RNAProbe automatically analyzes chemical probing output data and turns tedious manual work into a one-minute assignment. RNAProbe performs normalization based on a well-established protocol, utilizes recognized secondary structure prediction methods, and generates high-quality images with structure representations and reactivity heatmaps. It summarizes the results in the form of a spreadsheet, which can be used for comparative analyses between experiments. Results of predictions with normalized reactivities are also collected in text files, providing interoperability with bioinformatics workflows. RNAProbe is available at
      PubDate: Sat, 06 Jun 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa396
      Issue No: Vol. 48, No. W1 (2020)
  • SNPnexus: a web server for functional annotation of human genome sequence
           variation (2020 update)

    • Authors: Oscanoa J; Sivapalan L, Gadaleta E, et al.
      Abstract: SNPnexus is a web-based annotation tool for the analysis and interpretation of both known and novel sequencing variations. Since its last release, SNPnexus has received continual updates to expand the range and depth of annotations provided. SNPnexus has undergone a complete overhaul of the underlying infrastructure to accommodate faster computational times. The scope for data annotation has been substantially expanded to enhance biological interpretations of queried variants. This includes the addition of pathway analysis for the identification of enriched biological pathways and molecular processes. We have further expanded the range of user directed annotation fields available for the study of cancer sequencing data. These new additions facilitate investigations into cancer driver variants and targetable molecular alterations within input datasets. New user directed filtering options have been coupled with the addition of interactive graphical and visualization tools. These improvements streamline the analysis of variants derived from large sequencing datasets for the identification of biologically and clinically significant subsets in the data. SNPnexus is the most comprehensible web-based application currently available and this new set of updates ensures that it remains a state-of-the-art tool for researchers. SNPnexus is freely available at
      PubDate: Thu, 04 Jun 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa420
      Issue No: Vol. 48, No. W1 (2020)
  • Tox21BodyMap: a webtool to map chemical effects on the human body

    • Authors: Borrel A; Auerbach S, Houck K, et al.
      Abstract: To support rapid chemical toxicity assessment and mechanistic hypothesis generation, here we present an intuitive webtool allowing a user to identify target organs in the human body where a substance is estimated to be more likely to produce effects. This tool, called Tox21BodyMap, incorporates results of 9,270 chemicals tested in the United States federal Tox21 research consortium in 971 high-throughput screening (HTS) assays whose targets were mapped onto human organs using organ-specific gene expression data. Via Tox21BodyMap's interactive tools, users can visualize chemical target specificity by organ system, and implement different filtering criteria by changing gene expression thresholds and activity concentration parameters. Dynamic network representations, data tables, and plots with comprehensive activity summaries across all Tox21 HTS assay targets provide an overall picture of chemical bioactivity. Tox21BodyMap webserver is available at
      PubDate: Wed, 03 Jun 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa433
      Issue No: Vol. 48, No. W1 (2020)
  • miRNet 2.0: network-based visual analytics for miRNA functional analysis
           and systems biology

    • Authors: Chang L; Zhou G, Soufan O, et al.
      Abstract: miRNet is an easy-to-use, web-based platform designed to help elucidate microRNA (miRNA) functions by integrating users' data with existing knowledge via network-based visual analytics. Since its first release in 2016, miRNet has been accessed by >20 000 researchers worldwide, with ∼100 users on a daily basis. While version 1.0 was focused primarily on miRNA-target gene interactions, it has become clear that in order to obtain a global view of miRNA functions, it is necessary to bring other important players into the context during analysis. Driven by this concept, in miRNet version 2.0, we have (i) added support for transcription factors (TFs) and single nucleotide polymorphisms (SNPs) that affect miRNAs, miRNA-binding sites or target genes, whilst also greatly increased (>5-fold) the underlying knowledgebases of miRNAs, ncRNAs and disease associations; (ii) implemented new functions to allow creation and visual exploration of multipartite networks, with enhanced support for in situ functional analysis and (iii) revamped the web interface, optimized the workflow, and introduced microservices and web application programming interface (API) to sustain high-performance, real-time data analysis. The underlying R package is also released in tandem with version 2.0 to allow more flexible data analysis for R programmers. The miRNet 2.0 website is freely available at
      PubDate: Tue, 02 Jun 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa467
      Issue No: Vol. 48, No. W1 (2020)
  • HomolWat: a web server tool to incorporate ‘homologous’ water
           molecules into GPCR structures

    • Authors: Mayol E; García-Recio A, Tiemann J, et al.
      Abstract: Internal water molecules play an essential role in the structure and function of membrane proteins including G protein-coupled receptors (GPCRs). However, technical limitations severely influence the number and certainty of observed water molecules in 3D structures. This may compromise the accuracy of further structural studies such as docking calculations or molecular dynamics simulations. Here we present HomolWat, a web application for incorporating water molecules into GPCR structures by using template-based modelling of homologous water molecules obtained from high-resolution structures. While there are various tools available to predict the positions of internal waters using energy-based methods, the approach of borrowing lacking water molecules from homologous GPCR structures makes HomolWat unique. The tool can incorporate water molecules into a protein structure in about a minute with around 85% of water recovery. The web server is freely available at
      PubDate: Tue, 02 Jun 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa440
      Issue No: Vol. 48, No. W1 (2020)
  • mirnaQC: a webserver for comparative quality control of miRNA-seq data

    • Authors: Aparicio-Puerta E; Gómez-Martín C, Giannoukakos S, et al.
      Abstract: Although miRNA-seq is extensively used in many different fields, its quality control is frequently restricted to a PhredScore-based filter. Other important quality related aspects like microRNA yield, the fraction of putative degradation products (such as rRNA fragments) or the percentage of adapter-dimers are hard to assess using absolute thresholds. Here we present mirnaQC, a webserver that relies on 34 quality parameters to assist in miRNA-seq quality control. To improve their interpretability, quality attributes are ranked using a reference distribution obtained from over 36 000 publicly available miRNA-seq datasets. Accepted input formats include FASTQ and SRA accessions. The results page contains several sections that deal with putative technical artefacts related to library preparation, sequencing, contamination or yield. Different visualisations, including PCA and heatmaps, are available to help users identify underlying issues. Finally, we show the usefulness of this approach by analysing two publicly available datasets and discussing the different quality issues that can be detected using mirnaQC.
      PubDate: Tue, 02 Jun 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa452
      Issue No: Vol. 48, No. W1 (2020)
  • CReSCENT: CanceR Single Cell ExpressioN Toolkit

    • Authors: Mohanraj S; Díaz-Mejía J, Pham M, et al.
      Abstract: CReSCENT: CanceR Single Cell ExpressioN Toolkit is an intuitive and scalable web portal incorporating a containerized pipeline execution engine for standardized analysis of single-cell RNA sequencing (scRNA-seq) data. While scRNA-seq data for tumour specimens are readily generated, subsequent analysis requires high-performance computing infrastructure and user expertise to build analysis pipelines and tailor interpretation for cancer biology. CReSCENT uses public data sets and preconfigured pipelines that are accessible to computational biology non-experts and are user-editable to allow optimization, comparison, and reanalysis for specific experiments. Users can also upload their own scRNA-seq data for analysis and results can be kept private or shared with other users.
      PubDate: Mon, 01 Jun 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa437
      Issue No: Vol. 48, No. W1 (2020)
  • The Galaxy platform for accessible, reproducible and collaborative
           biomedical analyses: 2020 update

    • Authors: Jalili V; Afgan E, Gu Q, et al.
      Abstract: Galaxy is a web-based computational workbench used by tens of thousands of scientists across the world to analyze large biomedical datasets. Since 2005, the Galaxy project has fostered a global community focused on achieving accessible, reproducible, and collaborative research. Together, this community develops the Galaxy software framework, integrates analysis tools and visualizations into the framework, runs public servers that make Galaxy available via a web browser, performs and publishes analyses using Galaxy, leads bioinformatics workshops that introduce and use Galaxy, and develops interactive training materials for Galaxy. Over the last two years, all aspects of the Galaxy project have grown: code contributions, tools integrated, users, and training materials. Key advances in Galaxy's user interface include enhancements for analyzing large dataset collections as well as interactive tools for exploratory data analysis. Extensions to Galaxy's framework include support for federated identity and access management and increased ability to distribute analysis jobs to remote resources. New community resources include large public servers in Europe and Australia, an increasing number of regional and local Galaxy communities, and substantial growth in the Galaxy Training Network.
      PubDate: Mon, 01 Jun 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa434
      Issue No: Vol. 48, No. W1 (2020)
  • TopMatch-web: pairwise matching of large assemblies of protein and nucleic
           acid chains in 3D

    • Authors: Wiederstein M; Sippl M.
      Abstract: Frequently, the complete functional units of biological molecules are assemblies of protein and nucleic acid chains. Stunning examples are the complex structures of ribosomes. Here, we present TopMatch-web, a computational tool for the study of the three-dimensional structure, function and evolution of such molecules. The unique feature of TopMatch is its ability to match the protein as well as nucleic acid chains of complete molecular assemblies simultaneously. The resulting structural alignments are visualized instantly using the high-performance molecular viewer NGL. We use the mitochondrial ribosomes of human and yeast as an example to demonstrate the capabilities of TopMatch-web. The service responds immediately, enabling the interactive study of many pairwise alignments of large molecular assemblies in a single session. TopMatch-web is freely accessible at
      PubDate: Mon, 01 Jun 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa366
      Issue No: Vol. 48, No. W1 (2020)
  • FATCAT 2.0: towards a better understanding of the structural diversity of

    • Authors: Li Z; Jaroszewski L, Iyer M, et al.
      Abstract: The FATCAT 2.0 server provides access to a flexible protein structure alignment algorithm developed in our group. In such an alignment, rotations and translations between elements in the structure are allowed to minimize the overall root mean square deviation (RMSD) between the compared structures. This allows protein structures to be compared effectively even if they underwent structural rearrangements in different functional forms, different crystallization conditions or as a result of mutations. The major update for the server introduces a new graphical interface, much faster database searches and several new options for visualization of the structural differences between proteins.
      PubDate: Fri, 29 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa443
      Issue No: Vol. 48, No. W1 (2020)
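The RMSD mentioned in the FATCAT 2.0 abstract can be illustrated with a short calculation. This is a minimal sketch, assuming the two structures are already superimposed and given as equal-length lists of matched (x, y, z) coordinates; it does not reproduce FATCAT's flexible alignment, and the function name `rmsd` is chosen here for illustration only.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two matched coordinate lists.

    Assumes the structures are already superimposed and that
    coords_a[i] corresponds to coords_b[i].
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        squared += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(squared / len(coords_a))

# Hypothetical coordinates, for illustration only.
print(rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
           [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]))
```

A flexible aligner would additionally search for hinge points and per-segment rotations and translations before evaluating this quantity.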
  • mCSM-membrane: predicting the effects of mutations on transmembrane

    • Authors: Pires D; Rodrigues C, Ascher D.
      Abstract: Significant efforts have been invested into understanding and predicting the molecular consequences of mutations in protein coding regions; however, nearly all approaches have been developed using globular, soluble proteins. These methods have been shown to translate poorly to studying the effects of mutations in membrane proteins. To fill this gap, here we report mCSM-membrane, a user-friendly web server that can be used to analyse the impacts of mutations on membrane protein stability and the likelihood of them being disease associated. mCSM-membrane derives from our well-established mutation modelling approach that uses graph-based signatures to model protein geometry and physicochemical properties for supervised learning. Our stability predictor achieved correlations of up to 0.72 and 0.67 (on cross validation and blind tests, respectively), while our pathogenicity predictor achieved a Matthews correlation coefficient (MCC) of up to 0.77 and 0.73, outperforming previously described methods in both predicting changes in stability and in identifying pathogenic variants. mCSM-membrane will be an invaluable and dedicated resource for investigating the effects of single-point mutations on membrane proteins through a freely available, user-friendly web server at
      PubDate: Fri, 29 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa416
      Issue No: Vol. 48, No. W1 (2020)
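For readers unfamiliar with the Matthews correlation coefficient quoted in the mCSM-membrane abstract, the following minimal sketch computes it from binary confusion-matrix counts. The counts in the example call are made up for illustration and are not taken from the paper; returning 0.0 when the denominator is zero is a common convention assumed here, not something the abstract specifies.

```python
import math

def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """MCC from true/false positive/negative counts.

    Ranges from -1 (total disagreement) to +1 (perfect prediction).
    Returns 0.0 when the denominator is zero (assumed convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom

# Hypothetical counts, for illustration only.
print(matthews_corrcoef(tp=45, tn=40, fp=10, fn=5))
```

Unlike plain accuracy, MCC uses all four cells of the confusion matrix, which makes it informative when pathogenic and benign classes are imbalanced.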
  • piNET: a versatile web platform for downstream analysis and visualization
           of proteomics data

    • Authors: Shamsaei B; Chojnacki S, Pilarczyk M, et al.
      Abstract: Rapid progress in proteomics and large-scale profiling of biological systems at the protein level necessitates the continued development of efficient computational tools for the analysis and interpretation of proteomics data. Here, we present the piNET server that facilitates integrated annotation, analysis and visualization of quantitative proteomics data, with emphasis on PTM networks and integration with the LINCS library of chemical and genetic perturbation signatures in order to provide further mechanistic and functional insights. The primary input for the server consists of a set of peptides or proteins, optionally with PTM sites, and their corresponding abundance values. Several interconnected workflows can be used to generate: (i) interactive graphs and tables providing comprehensive annotation and mapping between peptides and proteins with PTM sites; (ii) high resolution and interactive visualization for enzyme-substrate networks, including kinases and their phospho-peptide targets; (iii) mapping and visualization of LINCS signature connectivity for chemical inhibitors or genetic knockdown of enzymes upstream of their target PTM sites. piNET has been built using a modular Spring-Boot JAVA platform as a fast, versatile and easy to use tool. The Apache Lucene indexing is used for fast mapping of peptides into UniProt entries for the human, mouse and other commonly used model organism proteomes. PTM-centric network analyses combine PhosphoSitePlus, iPTMnet and SIGNOR databases of validated enzyme-substrate relationships, for kinase networks augmented by DeepPhos predictions and sequence-based mapping of PhosphoSitePlus consensus motifs. Concordant LINCS signatures are mapped using iLINCS. For each workflow, a RESTful API counterpart can be used to generate the results programmatically in the json format. The server is available at, and it is free and open to all users without login requirement.
      PubDate: Fri, 29 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa436
      Issue No: Vol. 48, No. W1 (2020)
  • PaCRISPR: a server for predicting and visualizing anti-CRISPR proteins

    • Authors: Wang J; Dai W, Li J, et al.
      Abstract: Anti-CRISPRs are widespread amongst bacteriophage and promote bacteriophage infection by inactivating the bacterial host's CRISPR–Cas defence system. Identifying and characterizing anti-CRISPR proteins opens an avenue to explore and control CRISPR–Cas machineries for the development of new CRISPR–Cas based biotechnological and therapeutic tools. Past studies have identified anti-CRISPRs in several model phage genomes, but a challenge exists to comprehensively screen for anti-CRISPRs accurately and efficiently from genome and metagenome sequence data. Here, we have developed an ensemble learning based predictor, PaCRISPR, to accurately identify anti-CRISPRs from protein datasets derived from genome and metagenome sequencing projects. PaCRISPR employs different types of feature recognition united within an ensemble framework. Extensive cross-validation and independent tests show that PaCRISPR achieves a significantly more accurate performance compared with homology-based baseline predictors and an existing toolkit. The performance of PaCRISPR was further validated in discovering anti-CRISPRs that were not part of the training for PaCRISPR, but which were recently demonstrated to function as anti-CRISPRs for phage infections. Data visualization on anti-CRISPR relationships, highlighting sequence similarity and phylogenetic considerations, is part of the output from the PaCRISPR toolkit, which is freely available at
      PubDate: Wed, 27 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa432
      Issue No: Vol. 48, No. W1 (2020)
  • ShiftCrypt: a web server to understand and biophysically align proteins
           through their NMR chemical shift values

    • Authors: Orlando G; Raimondi D, Kagami L, et al.
      Abstract: Nuclear magnetic resonance (NMR) spectroscopy data provides valuable information on the behaviour of proteins in solution. The primary data to determine when studying proteins are the per-atom NMR chemical shifts, which reflect the local environment of atoms and provide insights into amino acid residue dynamics and conformation. Within an amino acid residue, chemical shifts present multi-dimensional and complexly cross-correlated information, making them difficult to analyse. The ShiftCrypt method, based on neural network auto-encoder architecture, compresses the per-amino acid chemical shift information in a single, interpretable, amino acid-type independent value that reflects the biophysical state of a residue. We here present the ShiftCrypt web server, which makes the method readily available. The server accepts chemical shifts input files in the NMR Exchange Format (NEF) or NMR-STAR format, executes ShiftCrypt and visualises the results, which are also accessible via an API. It also enables the “biophysically-based” pairwise alignment of two proteins based on their ShiftCrypt values. This approach uses Dynamic Time Warping and can optionally include their amino acid code information, and has applications in, for example, the alignment of disordered regions. The server uses a token-based system to ensure the anonymity of the users and results. The web server is available at
      PubDate: Wed, 27 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa391
      Issue No: Vol. 48, No. W1 (2020)
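The Dynamic Time Warping step named in the ShiftCrypt abstract can be sketched as follows. This is the textbook DTW recurrence applied to two sequences of per-residue values, with absolute difference as an assumed local cost; the server's actual cost function and its optional use of amino acid codes are not described in the abstract and are not reproduced here.

```python
def dtw_distance(a, b):
    """Classic dynamic time warping cost between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the minimal cumulative cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] matched again with an earlier b
                cost[i][j - 1],      # b[j-1] matched again with an earlier a
                cost[i - 1][j - 1],  # one-to-one match
            )
    return cost[n][m]

# Hypothetical per-residue values, for illustration only.
print(dtw_distance([0.1, 0.5, 0.9], [0.1, 0.5, 0.5, 0.9]))
```

Because one value may be matched to several consecutive values in the other sequence, sequences of different lengths can still align at low cost, which is why the technique suits regions with insertions or deletions.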
  • EpiRegio: analysis and retrieval of regulatory elements linked to genes

    • Authors: Baumgarten N; Hecker D, Karunanithi S, et al.
      Abstract: A current challenge in genomics is to interpret non-coding regions and their role in transcriptional regulation of possibly distant target genes. Genome-wide association studies show that a large part of genomic variants are found in those non-coding regions, but their mechanisms of gene regulation are often unknown. An additional challenge is to reliably identify the target genes of the regulatory regions, which is an essential step in understanding their impact on gene expression. Here we present the EpiRegio web server, a resource of regulatory elements (REMs). REMs are genomic regions that exhibit variations in their chromatin accessibility profile associated with changes in expression of their target genes. EpiRegio incorporates both epigenomic and gene expression data for various human primary cell types and tissues, providing an integrated view of REMs in the genome. Our web server allows the analysis of genes and their associated REMs, including the REM’s activity and its estimated cell type-specific contribution to its target gene’s expression. Further, it is possible to explore genomic regions for their regulatory potential, investigate overlapping REMs and by that the dissection of regions of large epigenomic complexity. EpiRegio allows programmatic access through a REST API and is freely available at
      PubDate: Wed, 27 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa382
      Issue No: Vol. 48, No. W1 (2020)
  • CVCDAP: an integrated platform for molecular and clinical analysis of
           cancer virtual cohorts

    • Authors: Guan X; Cai M, Du Y, et al.
      Abstract: Recent large-scale multi-omics studies resulted in quick accumulation of an overwhelming amount of cancer-related data, which provides an unprecedented resource to interrogate diverse questions. While certain existing web servers are valuable and widely used, analysis and visualization functions with regard to re-investigation of these data at cohort level are not adequately addressed. Here, we present CVCDAP, a web-based platform to deliver an interactive and customizable toolbox off the shelf for cohort-level analysis of TCGA and CPTAC public datasets, as well as user uploaded datasets. CVCDAP allows flexible selection of patients sharing common molecular and/or clinical characteristics across multiple studies as a virtual cohort, and provides dozens of built-in customizable tools for seamless genomic, transcriptomic, proteomic and clinical analysis of a single virtual cohort, as well as for comparison of two relevant virtual cohorts. The flexibility and analytic competence of CVCDAP empower experimental and clinical researchers to identify new molecular mechanisms and develop potential therapeutic approaches, by building and analyzing virtual cohorts for their subjects of interest. We demonstrate that CVCDAP can conveniently reproduce published findings and reveal novel insights by two applications. The CVCDAP web server is freely available at
      PubDate: Mon, 25 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa423
      Issue No: Vol. 48, No. W1 (2020)
  • ASAP 2020 update: an open, scalable and interactive web-based portal for
           (single-cell) omics analyses

    • Authors: David F; Litovchenko M, Deplancke B, et al.
      Abstract: Single-cell omics enables researchers to dissect biological systems at a resolution that was unthinkable just 10 years ago. However, this analytical revolution also triggered new demands in ‘big data’ management, forcing researchers to stay up to speed with increasingly complex analytical processes and rapidly evolving methods. To render these processes and approaches more accessible, we developed the web-based, collaborative portal ASAP (Automated Single-cell Analysis Portal). Our primary goal is thereby to democratize single-cell omics data analyses (scRNA-seq and more recently scATAC-seq). By taking advantage of a Docker system to enhance reproducibility, and novel bioinformatics approaches that were recently developed for improving scalability, ASAP meets challenging requirements set by recent cell atlasing efforts such as the Human (HCA) and Fly (FCA) Cell Atlas Projects. Specifically, ASAP can now handle datasets containing millions of cells, integrating intuitive tools that allow researchers to collaborate on the same project synchronously. ASAP tools are versioned, and researchers can create unique access IDs for storing complete analyses that can be reproduced or completed by others. Finally, ASAP does not require any installation and provides a full and modular single-cell RNA-seq analysis pipeline. ASAP is freely available at
      PubDate: Mon, 25 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa412
      Issue No: Vol. 48, No. W1 (2020)
  • PseudoChecker: an integrated online platform for gene inactivation

    • Authors: Alves L; Ruivo R, Fonseca M, et al.
      Abstract: The rapid expansion of high-quality genome assemblies, exemplified by ongoing initiatives such as the Genome-10K and i5k, demands novel automated methods to approach comparative genomics. Of these, the study of inactivating mutations in the coding region of genes, or pseudogenization, as a source of evolutionary novelty is mostly overlooked. Thus, to address such evolutionary/genomic events, a systematic, accurate and computationally automated approach is required. Here, we present PseudoChecker, the first integrated online platform for gene inactivation inference. Unlike the few existing methods, our comparative genomics-based approach displays full automation, a built-in graphical user interface and a novel index, PseudoIndex, for an empirical evaluation of the gene coding status. As a multi-platform online service, PseudoChecker simplifies access and usability, allowing a fast identification of disruptive mutations. An analysis of 30 genes previously reported to be eroded in mammals, and 30 viable genes from the same lineages, demonstrated that PseudoChecker was able to correctly infer 97% of loss events and 95% of functional genes, confirming its reliability. PseudoChecker is freely available, without login required, at
      PubDate: Mon, 25 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa408
      Issue No: Vol. 48, No. W1 (2020)
  • Fluxer: a web application to compute, analyze and visualize genome-scale
           metabolic flux networks

    • Authors: Hari A; Lobo D.
      Abstract: Next-generation sequencing has paved the way for the reconstruction of genome-scale metabolic networks as a powerful tool for understanding metabolic circuits in any organism. However, the visualization and extraction of knowledge from these large networks comprising thousands of reactions and metabolites is a current challenge in need of user-friendly tools. Here we present Fluxer, a free and open-access novel web application for the computation and visualization of genome-scale metabolic flux networks. Any genome-scale model based on the Systems Biology Markup Language can be uploaded to the tool, which automatically performs Flux Balance Analysis and computes different flux graphs for visualization and analysis. The major metabolic pathways for biomass growth or for biosynthesis of any metabolite can be interactively knocked-out, analyzed and visualized as a spanning tree, dendrogram or complete graph using different layouts. In addition, Fluxer can compute and visualize the k-shortest metabolic paths between any two metabolites or reactions to identify the main metabolic routes between two compounds of interest. The web application includes >80 whole-genome metabolic reconstructions of diverse organisms from bacteria to human, readily available for exploration. Fluxer enables the efficient analysis and visualization of genome-scale metabolic models toward the discovery of key metabolic pathways.
      PubDate: Fri, 22 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa409
      Issue No: Vol. 48, No. W1 (2020)
  • NanoSPC: a scalable, portable, cloud compatible viral nanopore metagenomic
           data processing pipeline

    • Authors: Xu Y; Yang-Turner F, Volk D, et al.
      Abstract: Metagenomic sequencing combined with Oxford Nanopore Technology has the potential to become a point-of-care test for infectious disease in public health and clinical settings, providing rapid diagnosis of infection, guiding individual patient management and treatment strategies, and informing infection prevention and control practices. However, publicly available, streamlined, and reproducible pipelines for analyzing Nanopore metagenomic sequencing data are still lacking. Here we introduce NanoSPC, a scalable, portable and cloud compatible pipeline for analyzing Nanopore sequencing data. NanoSPC can identify potentially pathogenic viruses and bacteria simultaneously to provide comprehensive characterization of individual samples. The pipeline can also detect single nucleotide variants and assemble high quality complete consensus genome sequences, permitting high-resolution inference of transmission. We implement NanoSPC using Nextflow manager within Docker images to allow reproducibility and portability of the analysis. Moreover, we deploy NanoSPC to our scalable pathogen pipeline platform, enabling elastic computing for high throughput Nanopore data on HPC clusters as well as multiple cloud platforms, such as Google Cloud, Amazon Elastic Computing Cloud, Microsoft Azure and OpenStack. Users can either access our web interface to run cloud-based analyses, monitor processes, and visualize results, or download Docker images and run the command line to analyse data locally.
      PubDate: Fri, 22 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa413
      Issue No: Vol. 48, No. W1 (2020)
  • SYNERGxDB: an integrative pharmacogenomic portal to identify synergistic
           drug combinations for precision oncology

    • Authors: Seo H; Tkachuk D, Ho C, et al.
      Abstract: Drug-combination data portals have recently been introduced to mine huge amounts of pharmacological data with the aim of improving current chemotherapy strategies. However, these portals have only been investigated for isolated datasets, and molecular profiles of cancer cell lines are lacking. Here we developed a cloud-based pharmacogenomics portal called SYNERGxDB that integrates multiple high-throughput drug-combination studies with molecular and pharmacological profiles of a large panel of cancer cell lines. This portal enables the identification of synergistic drug combinations through harmonization and unified computational analysis. We integrated nine of the largest drug combination datasets from both academic groups and pharmaceutical companies, resulting in 22 507 unique drug combinations (1977 unique compounds) screened against 151 cancer cell lines. This data compendium includes metabolomics, gene expression, copy number and mutation profiles of the cancer cell lines. In addition, SYNERGxDB provides analytical tools to discover effective therapeutic combinations and predictive biomarkers across cancer, including specific types. Combining molecular and pharmacological profiles, we systematically explored the large space of univariate predictors of drug synergism. SYNERGxDB constitutes a comprehensive resource that opens new avenues of research for exploring the mechanism of action for drug synergy with the potential of identifying new treatment strategies for cancer patients.
      PubDate: Fri, 22 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa421
      Issue No: Vol. 48, No. W1 (2020)
  • TIMER2.0 for analysis of tumor-infiltrating immune cells

    • Authors: Li T; Fu J, Zeng Z, et al.
      Abstract: Tumor progression and the efficacy of immunotherapy are strongly influenced by the composition and abundance of immune cells in the tumor microenvironment. Due to the limitations of direct measurement methods, computational algorithms are often used to infer immune cell composition from bulk tumor transcriptome profiles. These estimated tumor immune infiltrate populations have been associated with genomic and transcriptomic changes in the tumors, providing insight into tumor–immune interactions. However, such investigations on large-scale public data remain challenging. To lower the barriers for the analysis of complex tumor–immune interactions, we significantly improved our previous web platform TIMER. Instead of just using one algorithm, TIMER2.0 provides more robust estimation of immune infiltration levels for The Cancer Genome Atlas (TCGA) or user-provided tumor profiles using six state-of-the-art algorithms. TIMER2.0 provides four modules for investigating the associations between immune infiltrates and genetic or clinical features, and four modules for exploring cancer-related associations in the TCGA cohorts. Each module can generate a functional heatmap table, enabling the user to easily identify significant associations in multiple cancer types simultaneously. Overall, the TIMER2.0 web server provides comprehensive analysis and visualization functions of tumor infiltrating immune cells.
      PubDate: Fri, 22 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa407
      Issue No: Vol. 48, No. W1 (2020)
  • 3D-GNOME 2.0: a three-dimensional genome modeling engine for predicting
           structural variation-driven alterations of chromatin spatial structure in
           the human genome

    • Authors: Wlasnowolski M; Sadowski M, Czarnota T, et al.
      Abstract: Structural variants (SVs) that alter DNA sequence emerge as a driving force involved in the reorganisation of DNA spatial folding, thus affecting gene transcription. In this work, we describe an improved version of our integrated web service for structural modeling of three-dimensional genome (3D-GNOME), which now incorporates all types of SVs to model changes to the reference 3D conformation of chromatin. In 3D-GNOME 2.0, the default reference 3D genome structure is generated using ChIA-PET data from the GM12878 cell line and SVs data are sourced from the population-scale catalogue of SVs identified by the 1000 Genomes Consortium. However, users may also submit their own structural data to set a customized reference genome structure, and/or a custom input list of SVs. 3D-GNOME 2.0 provides novel tools to inspect, visualize and compare 3D models for regions that differ in terms of their linear genomic sequence. Contact diagrams are displayed to compare the reference 3D structure with the one altered by SVs. In our opinion, 3D-GNOME 2.0 is a unique online tool for modeling and analyzing conformational changes to the human genome induced by SVs across populations. It can be freely accessed at
      PubDate: Fri, 22 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa388
      Issue No: Vol. 48, No. W1 (2020)
  • mmCSM-AB: guiding rational antibody engineering through multiple point

    • Authors: Myung Y; Pires D, Ascher D.
      Abstract: While antibodies are becoming an increasingly important therapeutic class, especially in personalized medicine, their development and optimization has been largely through experimental exploration. While there have been many efforts to develop computational tools to guide rational antibody engineering, most approaches are of limited accuracy when applied to antibody design, and have largely been limited to analysing a single point mutation at a time. To overcome this gap, we have curated a dataset of 242 experimentally determined changes in binding affinity upon multiple point mutations in antibody-target complexes (89 increasing and 153 decreasing binding affinity). Here, we have shown that by using our graph-based signatures and atomic interaction information, we can accurately analyse the consequence of multi-point mutations on antigen binding affinity. Our approach outperformed other available tools across cross-validation and two independent blind tests, achieving Pearson's correlations of up to 0.95. We have implemented our new approach, mmCSM-AB, as a web-server that can help guide the process of affinity maturation in antibody design. mmCSM-AB is freely available at
      PubDate: Wed, 20 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa389
      Issue No: Vol. 48, No. W1 (2020)
  • ARTS 2.0: feature updates and expansion of the Antibiotic Resistant Target
           Seeker for comparative genome mining

    • Authors: Mungan M; Alanjary M, Blin K, et al.
      Abstract: Multi-drug resistant pathogens have become a major threat to human health and new antibiotics are urgently needed. Most antibiotics are derived from secondary metabolites produced by bacteria. In order to avoid suicide, these bacteria usually encode resistance genes, in some cases within the biosynthetic gene cluster (BGC) of the respective antibiotic compound. Modern genome mining tools enable researchers to computationally detect and predict BGCs that encode the biosynthesis of secondary metabolites. The major challenge now is the prioritization of the most promising BGCs encoding antibiotics with novel modes of action. A recently developed target-directed genome mining approach allows researchers to predict the mode of action of the encoded compound of an uncharacterized BGC based on the presence of resistant target genes. In 2017, we introduced the ‘Antibiotic Resistant Target Seeker’ (ARTS). ARTS allows for specific and efficient genome mining for antibiotics with interesting and novel targets by rapidly linking housekeeping and known resistance genes to BGC proximity, duplication and horizontal gene transfer (HGT) events. Here, we present ARTS 2.0. ARTS 2.0 now includes options for automated target directed genome mining in all bacterial taxa as well as metagenomic data. Furthermore, it enables comparison of similar BGCs from different genomes and their putative resistance genes.
      PubDate: Tue, 19 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa374
      Issue No: Vol. 48, No. W1 (2020)
  • RiboToolkit: an integrated platform for analysis and annotation of
           ribosome profiling data to decode mRNA translation at codon resolution

    • Authors: Liu Q; Shvarts T, Sliz P, et al.
      Abstract: Ribosome profiling (Ribo-seq) is a powerful technology for globally monitoring RNA translation, ranging from codon occupancy profiling and identification of actively translated open reading frames (ORFs) to the quantification of translational efficiency under various physiological or experimental conditions. However, analyzing and decoding translation information from Ribo-seq data is not trivial. Although there are many existing tools to analyze Ribo-seq data, most of these tools are designed for specific or limited functionalities and an easy-to-use integrated tool to analyze Ribo-seq data is lacking. Fortunately, the small size (26–34 nt) of ribosome protected fragments (RPFs) in Ribo-seq and the relatively small amount of sequencing data greatly facilitate the development of such a web platform, which is easy to manipulate for users with or without bioinformatic expertise. Thus, we developed RiboToolkit, a convenient, freely available, web-based service to centralize Ribo-seq data analyses, including data cleaning and quality evaluation, expression analysis based on RPFs, codon occupancy, translation efficiency analysis, differential translation analysis, functional annotation, translation metagene analysis, and identification of actively translated ORFs. Besides, easy-to-use web interfaces were developed to facilitate data analysis and intuitively visualize results. Thus, RiboToolkit will greatly facilitate the study of mRNA translation based on ribosome profiling.
      PubDate: Tue, 19 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa395
      Issue No: Vol. 48, No. W1 (2020)
  • webPSN v2.0: a webserver to infer fingerprints of structural communication
           in biomacromolecules

    • Authors: Felline A; Seeber M, Fanelli F.
      Abstract: A mixed Protein Structure Network (PSN) and Elastic Network Model-Normal Mode Analysis (ENM-NMA)-based strategy (i.e. PSN-ENM) was developed to investigate structural communication in bio-macromolecules. Protein Structure Graphs (PSGs) are computed on a single structure, whereas information on system dynamics is supplied by ENM-NMA. The approach was implemented in a webserver (webPSN), which was significantly updated herein. The webserver now handles both proteins and nucleic acids and relies on an internal upgradable database of network parameters for ions and small molecules in all PDB structures. Apart from the radical restyle of the server and some changes in the calculation setup, other major novelties concern the possibility to: a) compute the differences in nodes, links, and communication pathways between two structures (i.e. network difference) and b) infer links, hubs, communities, and metapaths from consensus networks computed on a number of structures. These new features are useful to identify commonalities and differences between two different functional states of the same system or structural-communication signatures in homologous or analogous systems. The output analysis relies on 3D-representations, interactive tables and graphs, also available for download. Speed and accuracy make this server suitable to comparatively investigate structural communication in large sets of bio-macromolecular systems. URL:
      PubDate: Tue, 19 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa397
      Issue No: Vol. 48, No. W1 (2020)
  • IRIS3: integrated cell-type-specific regulon inference server from
           single-cell RNA-Seq

    • Authors: Ma A; Wang C, Chang Y, et al.
      Abstract: A group of genes controlled as a unit, usually by the same repressor or activator gene, is known as a regulon. The ability to identify active regulons within a specific cell type, i.e., cell-type-specific regulons (CTSR), provides an extraordinary opportunity to pinpoint crucial regulators and target genes responsible for complex diseases. However, the identification of CTSRs from single-cell RNA-Seq (scRNA-Seq) data is computationally challenging. We introduce IRIS3, the first-of-its-kind web server for CTSR inference from scRNA-Seq data for human and mouse. IRIS3 is an easy-to-use server empowered by over 20 functionalities to support comprehensive interpretations and graphical visualizations of identified CTSRs. CTSR data can be used to reliably characterize and distinguish the corresponding cell type from others and can be combined with other computational or experimental analyses for biomedical studies. CTSRs can, therefore, aid in the discovery of major regulatory mechanisms and allow reliable constructions of global transcriptional regulation networks encoded in a specific cell type. The broader impact of IRIS3 includes, but is not limited to, investigation of complex disease hierarchies and heterogeneity, causal gene regulatory network construction, and drug development. IRIS3 is freely accessible with no login requirement.
      PubDate: Mon, 18 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa394
      Issue No: Vol. 48, No. W1 (2020)
  • FGviewer: an online visualization tool for functional features of human
           fusion genes

    • Authors: Kim P; Yiya K, Zhou X.
      Abstract: Among the diverse locations of the breakpoints (BPs) of structural variants (SVs), the breakpoints of fusion genes (FGs) are located in the gene bodies. This broken gene context provides aberrant functional clues for studying disease genesis. Many tumorigenic fusion genes have retained or lost functional or regulatory domains, and these features impact tumorigenesis. Full annotation of fusion genes, aided by a visualization tool based on two gene bodies, will be helpful for studying the functional aspects of fusion genes. To date, a specialized tool with effective visualization of the functional features of fusion genes is not available. In this study, we built FGviewer, a tool for visualizing functional features of human fusion genes. FGviewer takes as input fusion gene symbols, breakpoint information, or structural variants from whole-genome sequence (WGS) data. For any combination of gene pairs/breakpoints to be involved in fusion genes, the users can search the functional/regulatory aspect of the fusion gene in the three bio-molecular levels (DNA-, RNA-, and protein-levels) and one clinical level (pathogenic-level). FGviewer will be a unique online tool in disease research communities.
      PubDate: Mon, 18 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa364
      Issue No: Vol. 48, No. W1 (2020)
  • mRNALoc: a novel machine-learning based in-silico tool to predict mRNA
           subcellular localization

    • Authors: Garg A; Singhal N, Kumar R, et al.
      Abstract: Recent evidence suggests that the localization of mRNAs near the subcellular compartment of the translated proteins is a more robust cellular tool, which optimizes protein expression post-transcriptionally. Retention of mRNA in the nucleus can regulate the amount of protein translated from each mRNA, thus allowing a tight temporal regulation of translation or buffering of protein levels from bursty transcription. Besides, mRNA localization performs a variety of additional roles like long-distance signaling, facilitating assembly of protein complexes and coordination of developmental processes. Here, we describe a novel machine-learning based tool, mRNALoc, to predict five sub-cellular locations of eukaryotic mRNAs using cDNA/mRNA sequences. During five-fold cross-validation, the maximum overall accuracy was 65.19, 75.36, 67.10, 99.70 and 73.59% for the extracellular region, endoplasmic reticulum, cytoplasm, mitochondria, and nucleus, respectively. Assessment on independent datasets revealed prediction accuracies of 58.10, 69.23, 64.55, 96.88 and 69.35% for the extracellular region, endoplasmic reticulum, cytoplasm, mitochondria, and nucleus, respectively. The corresponding values of AUC were 0.76, 0.75, 0.70, 0.98 and 0.74 for the extracellular region, endoplasmic reticulum, cytoplasm, mitochondria, and nucleus, respectively. The mRNALoc standalone software and web-server are freely available for academic use under GNU GPL at
      PubDate: Mon, 18 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa385
      Issue No: Vol. 48, No. W1 (2020)
  • InterPred: a webtool to predict chemical autofluorescence and luminescence

    • Authors: Borrel A; Mansouri K, Nolte S, et al.
      Abstract: High-throughput screening (HTS) research programs for drug development or chemical hazard assessment are designed to screen thousands of molecules across hundreds of biological targets or pathways. Most HTS platforms use fluorescence and luminescence technologies, representing more than 70% of the assays in the US Tox21 research consortium. These technologies are subject to interferent signals largely explained by chemicals interacting with light spectrum. This phenomenon results in up to 5–10% of false positive results, depending on the chemical library used. Here, we present the InterPred webserver (version 1.0), a platform to predict such interference chemicals based on the first large-scale chemical screening effort to directly characterize chemical-assay interference, using assays in the Tox21 portfolio specifically designed to measure autofluorescence and luciferase inhibition. InterPred combines 17 quantitative structure activity relationship (QSAR) models built using optimized machine learning techniques and allows users to predict the probability that a new chemical will interfere with different combinations of cellular and technology conditions. InterPred models have been applied to the entire Distributed Structure-Searchable Toxicity (DSSTox) Database (∼800,000 chemicals). The InterPred webserver is available at
      PubDate: Mon, 18 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa378
      Issue No: Vol. 48, No. W1 (2020)
  • ToxicoDB: an integrated database to mine and visualize large-scale
           toxicogenomic datasets

    • Authors: Nair S; Eeles C, Ho C, et al.
      Abstract: In the past few decades, major initiatives have been launched around the world to address chemical safety testing. These efforts aim to innovate and improve the efficacy of existing methods with the long-term goal of developing new risk assessment paradigms. The transcriptomic and toxicological profiling of mammalian cells has resulted in the creation of multiple toxicogenomic datasets and corresponding tools for analysis. To enable easy access and analysis of these valuable toxicogenomic data, we have developed ToxicoDB, a free and open cloud-based platform integrating data from large in vitro toxicogenomic studies, including gene expression profiles of primary human and rat hepatocytes treated with 231 potential toxicants. To efficiently mine these complex toxicogenomic data, ToxicoDB provides users with harmonized chemical annotations, time- and dose-dependent plots of compounds across datasets, as well as the toxicity-related pathway analysis. The data in ToxicoDB have been generated using our open-source R package, ToxicoGx. Altogether, ToxicoDB provides a streamlined process for mining highly organized, curated, and accessible toxicogenomic data that can be ultimately applied to preclinical toxicity studies and further our understanding of adverse outcomes.
      PubDate: Mon, 18 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa390
      Issue No: Vol. 48, No. W1 (2020)
  • PlaToLoCo: the first web meta-server for visualization and annotation of
           low complexity regions in proteins

    • Authors: Jarnot P; Ziemska-Legiecka J, Dobson L, et al.
      Abstract: Low complexity regions (LCRs) in protein sequences are characterized by a less diverse amino acid composition compared to typically observed sequence diversity. Recent studies have shown that LCRs may co-occur with intrinsically disordered regions, are highly conserved in many organisms, and often play important roles in protein functions and in diseases. In previous decades, several methods have been developed to identify regions with LCRs or amino acid bias, but most of them are stand-alone applications, and currently there is no web-based tool which allows users to explore LCRs in protein sequences with additional functional annotations. We aim to fill this gap by providing PlaToLoCo (PLAtform of TOols for LOw COmplexity), a meta-server that integrates and collects the output of five different state-of-the-art tools for discovering LCRs and provides functional annotations such as domain detection, transmembrane segment prediction, and calculation of amino acid frequencies. In addition, the union or intersection of the results of the search on a query sequence can be obtained. By developing the PlaToLoCo meta-server, we provide the community with a fast and easily accessible tool for the analysis of LCRs with additional information included to aid the interpretation of the results. The PlaToLoCo platform is available at:
      PubDate: Mon, 18 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa339
      Issue No: Vol. 48, No. W1 (2020)
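The union and intersection of LCR predictions mentioned in the PlaToLoCo abstract amount to set operations on residue intervals. The sketch below shows one way to compute them, assuming each tool reports regions as inclusive 1-based (start, end) residue positions; the function names and the example regions are hypothetical and do not reflect the server's actual code or output format.

```python
def union(intervals):
    """Merge overlapping or directly adjacent inclusive (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous region: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def intersection(a, b):
    """Return the residue ranges covered by both interval lists."""
    a, b = union(a), union(b)
    result = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            result.append((lo, hi))
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return result


# Hypothetical LCR calls from two tools on the same query sequence.
tool_a = [(5, 20), (40, 55)]
tool_b = [(15, 30), (50, 60)]
lcr_union = union(tool_a + tool_b)
lcr_common = intersection(tool_a, tool_b)
```

The union gives a permissive annotation (any tool flags the residue), while the intersection keeps only residues on which both tools agree; extending the intersection to more than two tools is a matter of folding the function over the list of results.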
  • NetMHCpan-4.1 and NetMHCIIpan-4.0: improved predictions of MHC antigen
           presentation by concurrent motif deconvolution and integration of MS MHC
           eluted ligand data

    • Authors: Reynisson B; Alvarez B, Paul S, et al.
      Abstract: Major histocompatibility complex (MHC) molecules are expressed on the cell surface, where they present peptides to T cells, which gives them a key role in the development of T-cell immune responses. MHC molecules come in two main variants: MHC Class I (MHC-I) and MHC Class II (MHC-II). MHC-I predominantly presents peptides derived from intracellular proteins, whereas MHC-II predominantly presents peptides from extracellular proteins. In both cases, the binding between MHC and antigenic peptides is the most selective step in the antigen presentation pathway. Therefore, the prediction of peptide binding to MHC is a powerful utility to predict the possible specificity of a T-cell immune response. Commonly, MHC binding prediction tools are trained on binding affinity or mass spectrometry-eluted ligands. Recent studies have, however, demonstrated how the integration of both data types can boost predictive performance. Inspired by this, we here present NetMHCpan-4.1 and NetMHCIIpan-4.0, two web servers created to predict binding between peptides and MHC-I and MHC-II, respectively. Both methods exploit tailored machine learning strategies to integrate different training data types, resulting in state-of-the-art performance and outperforming their competitors. The servers are available at and
      PubDate: Thu, 14 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa379
      Issue No: Vol. 48, No. W1 (2020)
  • AnnoLnc2: the one-stop portal to systematically annotate novel lncRNAs for
           human and mouse

    • Authors: Ke L; Yang D, Wang Y, et al.
      Abstract: With the abundant mammalian lncRNAs identified recently, a comprehensive annotation resource for these novel lncRNAs is an urgent need. Since its first release in November 2016, AnnoLnc has been the only online server for comprehensively annotating novel human lncRNAs on-the-fly. Here, with significant updates to multiple annotation modules, backend datasets and the code base, AnnoLnc2 continues the effort to provide the scientific community with a one-stop online portal for systematically annotating novel human and mouse lncRNAs with a comprehensive functional spectrum covering sequences, structure, expression, regulation, genetic association and evolution. In response to numerous requests from multiple users, a standalone package is also provided for large-scale offline analysis. We believe that updated AnnoLnc2 will help both computational and bench biologists identify lncRNA functions and investigate underlying mechanisms.
      PubDate: Thu, 14 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa368
      Issue No: Vol. 48, No. W1 (2020)
  • Conserved unique peptide patterns (CUPP) online platform: peptide-based
           functional annotation of carbohydrate active enzymes

    • Authors: Barrett K; Hunt C, Lange L, et al.
      Abstract: The CUPP platform includes a web server for functional annotation and sub-grouping of carbohydrate active enzymes (CAZymes) based on a novel peptide-based similarity assessment algorithm, i.e. protein grouping according to Conserved Unique Peptide Patterns (CUPP). This online platform is open to all users and there is no login requirement. The web server allows the user to perform genome-based annotation of carbohydrate active enzymes to CAZy families, CAZy subfamilies, CUPP groups and EC numbers (function) via assessment of peptide-motifs by CUPP. The web server is intended for functional annotation assessment of the CAZy inventory of prokaryotic and eukaryotic organisms from genomic DNA (up to 30MB compressed) or directly from amino acid sequences (up to 10MB compressed). The custom query sequences are assessed using the CUPP annotation algorithm, and the outcome is displayed in interactive summary result pages of CAZymes. The results displayed allow for inspection of members of the individual CUPP groups and include information about experimentally characterized members. The web server and the other resources on the CUPP platform can be accessed from
      PubDate: Thu, 14 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa375
      Issue No: Vol. 48, No. W1 (2020)
  • MISCAST: MIssense variant to protein StruCture Analysis web SuiTe

    • Authors: Iqbal S; Hoksza D, Pérez-Palma E, et al.
      Abstract: Human genome sequencing efforts have greatly expanded, and a plethora of missense variants identified both in patients and in the general population is now publicly accessible. Interpretation of the molecular-level effect of missense variants, however, remains challenging and requires a particular investigation of amino acid substitutions in the context of protein structure and function. Answers to questions like ‘Is a variant perturbing a site involved in key macromolecular interactions and/or cellular signaling?’, or ‘Is a variant changing an amino acid located at the protein core or part of a cluster of known pathogenic mutations in 3D?’ are crucial. Motivated by these needs, we developed MISCAST (missense variant to protein structure analysis web suite). MISCAST is an interactive and user-friendly web server to visualize and analyze missense variants in protein sequence and structure space. Additionally, a comprehensive set of protein structural and functional features has been aggregated in MISCAST from multiple databases, and displayed on structures alongside the variants to provide users with the biological context of the variant location in an integrated platform. We further made the annotated data and protein structures readily downloadable from MISCAST to foster advanced offline analysis of missense variants by a wide biological community.
      PubDate: Wed, 13 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa361
      Issue No: Vol. 48, No. W1 (2020)
  • Atomic Charge Calculator II: web-based tool for the calculation of partial
           atomic charges

    • Authors: Raček T; Schindler O, Toušek D, et al.
      Abstract: Partial atomic charges serve as a simple model for the electrostatic distribution of a molecule that drives its interactions with its surroundings. Since partial atomic charges are frequently used in computational chemistry, chemoinformatics and bioinformatics, many computational approaches for calculating them have been introduced. The most applicable are fast and reasonably accurate empirical charge calculation approaches. Here, we introduce Atomic Charge Calculator II (ACC II), a web application that enables the calculation of partial atomic charges via all the main empirical approaches and for all types of molecules. ACC II implements 17 empirical charge calculation methods, including the highly cited (QEq, EEM), the recently published (EQeq, EQeq+C), and the old but still often used (PEOE). ACC II enables the fast calculation of charges even for large macromolecular structures. The web server also offers charge visualization, courtesy of the powerful LiteMol viewer. The calculation setup of ACC II is very straightforward and enables the quick calculation of high-quality partial charges. The application is available at
      PubDate: Wed, 13 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa367
      Issue No: Vol. 48, No. W1 (2020)
  • PaccMann: a web service for interpretable anticancer compound sensitivity

    • Authors: Cadow J; Born J, Manica M, et al.
      Abstract: The identification of new targeted and personalized therapies for cancer requires the fast and accurate assessment of the drug efficacy of potential compounds against a particular biomolecular sample. It has been suggested that the integration of complementary sources of information might strengthen the accuracy of a drug efficacy prediction model. Here, we present a web-based platform for the Prediction of AntiCancer Compound sensitivity with Multimodal Attention-based Neural Networks (PaccMann). PaccMann is trained on public transcriptomic cell line profiles, compound structure information and drug sensitivity screenings, and outperforms state-of-the-art methods on anticancer drug sensitivity prediction. On the open-access web service, users can select a known drug compound or design their own compound structure in an interactive editor, perform in-silico drug testing and investigate compound efficacy on publicly available or user-provided transcriptomic profiles. PaccMann leverages methods for model interpretability and outputs confidence scores as well as attention heatmaps that highlight the genes and chemical sub-structures that were more important to make a prediction, hence facilitating the understanding of the model’s decision making and the involved biochemical processes. We hope to serve the community with a toolbox for fast and efficient validation in drug repositioning or lead compound identification regimes.
      PubDate: Wed, 13 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa327
      Issue No: Vol. 48, No. W1 (2020)
  • InteractomeSeq: a web server for the identification and profiling of
           domains and epitopes from phage display and next generation sequencing

    • Authors: Puccio S; Grillo G, Consiglio A, et al.
      Abstract: High-Throughput Sequencing technologies are transforming many research fields, including the analysis of phage display libraries. The phage display technology coupled with deep sequencing was introduced more than a decade ago and holds the potential to circumvent the traditional laborious picking and testing of individual phage rescued clones. However, from a bioinformatics point of view, the analysis of this kind of data was always performed by adapting tools designed for other purposes, thus not considering the noise background typical of the ‘interactome sequencing’ approach and the heterogeneity of the data. InteractomeSeq is a web server allowing data analysis of protein domains (‘domainome’) or epitopes (‘epitome’) from either Eukaryotic or Prokaryotic genomic phage libraries generated and selected by following an Interactome sequencing approach. InteractomeSeq allows users to upload raw sequencing data and to obtain an accurate characterization of domainome/epitome profiles after setting the parameters required to tune the analysis. The release of this tool is relevant for the scientific and clinical community, because InteractomeSeq will fill an existing gap in the field of large-scale biomarkers profiling, reverse vaccinology, and structural/functional studies, thus contributing essential information for gene annotation or antigen identification. InteractomeSeq is freely available at
      PubDate: Wed, 13 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa363
      Issue No: Vol. 48, No. W1 (2020)
  • AcrFinder: genome mining anti-CRISPR operons in prokaryotes and their

    • Authors: Yi H; Huang L, Yang B, et al.
      Abstract: Anti-CRISPR (Acr) proteins encoded by (pro)phages/(pro)viruses have great potential to enable more controllable genome editing. However, genome mining for new Acr proteins is challenging due to the lack of a conserved functional domain and the low sequence similarity among experimentally characterized Acr proteins. We introduce here AcrFinder, a web server that combines three well-accepted ideas used by previous experimental studies to pre-screen genomic data for Acr candidates. These ideas include homology search, guilt-by-association (GBA), and CRISPR-Cas self-targeting spacers. Compared to existing bioinformatics tools, AcrFinder has the following unique functions: (i) it is the first online server specifically mining genomes for Acr-Aca operons; (ii) it provides a most comprehensive Acr and Aca (Acr-associated regulator) database (populated by GBA-based Acr and Aca datasets); (iii) it combines homology-based, GBA-based, and self-targeting approaches in one software package; and (iv) it provides a user-friendly web interface to take both nucleotide and protein sequence files as inputs, and output a result page with graphic representation of the genomic contexts of Acr-Aca operons. The leave-one-out cross-validation on experimentally characterized Acr-Aca operons showed that AcrFinder had 100% recall. AcrFinder will be a valuable web resource to help experimental microbiologists discover new Anti-CRISPRs.
      PubDate: Wed, 13 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa351
      Issue No: Vol. 48, No. W1 (2020)
  • Oviz-Bio: a web-based platform for interactive cancer genomics data

    • Authors: Jia W; Li H, Li S, et al.
      Abstract: Genetics data visualization plays an important role in the sharing of knowledge from cancer genome research. Many types of visualization are widely used, most of which are static and require sufficient coding experience to create. Here, we present Oviz-Bio, a web-based platform that provides interactive and real-time visualizations of cancer genomics data. Researchers can interactively explore visual outputs and export high-quality diagrams. Oviz-Bio supports a diverse range of visualizations on common cancer mutation types, including annotation and signatures of small scale mutations, haplotype view and focal clusters of copy number variations, split-reads alignment and heatmap view of structural variations, transcript junction of fusion genes and genomic hotspot of oncovirus integrations. Furthermore, Oviz-Bio allows landscape view to investigate multi-layered data in samples cohort. All Oviz-Bio visual applications are freely available at
      PubDate: Mon, 11 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa371
      Issue No: Vol. 48, No. W1 (2020)
  • EnzymeMiner: automated mining of soluble enzymes with diverse structures,
           catalytic properties and stabilities

    • Authors: Hon J; Borko S, Stourac J, et al.
      Abstract: Millions of protein sequences are being discovered at an incredible pace, representing an inexhaustible source of biocatalysts. Despite genomic databases growing exponentially, classical biochemical characterization techniques are time-demanding, cost-ineffective and low-throughput. Therefore, computational methods are being developed to explore the unmapped sequence space efficiently. Selection of putative enzymes for biochemical characterization based on rational and robust analysis of all available sequences remains an unsolved problem. To address this challenge, we have developed EnzymeMiner—a web server for automated screening and annotation of diverse family members that enables selection of hits for wet-lab experiments. EnzymeMiner prioritizes sequences that are more likely to preserve the catalytic activity and are heterologously expressible in a soluble form in Escherichia coli. The solubility prediction employs the in-house SoluProt predictor developed using machine learning. EnzymeMiner reduces the time devoted to data gathering, multi-step analysis, sequence prioritization and selection from days to hours. The successful use case for the haloalkane dehalogenase family is described in a comprehensive tutorial available on the EnzymeMiner web page. EnzymeMiner is a universal tool applicable to any enzyme family that provides an interactive and easy-to-use web interface freely available at
      PubDate: Mon, 11 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa372
      Issue No: Vol. 48, No. W1 (2020)
  • MutaRNA: analysis and visualization of mutation-induced changes in RNA

    • Authors: Miladi M; Raden M, Diederichs S, et al.
      Abstract: RNA molecules fold into complex structures as a result of intramolecular interactions between their nucleotides. The function of many non-coding RNAs and some cis-regulatory elements of messenger RNAs depends strongly on their fold. Single-nucleotide variants (SNVs) and other types of mutations can disrupt the native function of an RNA element by altering its base pairing pattern. Identifying the effect of a mutation on an RNA’s structure is, therefore, a crucial step in evaluating the impact of mutations on the post-transcriptional regulation and function of RNAs within the cell. Even though a single nucleotide variation can have striking impacts on the structure formation, interpreting and comparing the impact usually requires expertise and meticulous effort. Here, we present MutaRNA, a web server for visualization and interpretation of mutation-induced changes on the RNA structure in an intuitive and integrative fashion. To this end, probabilities of base pairing and position-wise unpaired probabilities of wildtype and mutated RNA sequences are computed and compared. Differential heatmap-like dot plot representations in combination with circular plots and arc diagrams help to identify local structure aberrations, which are otherwise hidden in standard outputs. Eventually, MutaRNA provides a comprehensive and comparative overview of the mutation-induced changes in base pairing potentials and accessibility. The MutaRNA web server is freely available at
      PubDate: Mon, 11 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa331
      Issue No: Vol. 48, No. W1 (2020)
  • CoCoCoNet: conserved and comparative co-expression across a diverse set of

    • Authors: Lee J; Shah M, Ballouz S, et al.
      Abstract: Co-expression analysis has provided insight into gene function in organisms from Arabidopsis to zebrafish. Comparison across species has the potential to enrich these results, for example by prioritizing among candidate human disease genes based on their network properties or by finding alternative model systems where their co-expression is conserved. Here, we present CoCoCoNet as a tool for identifying conserved gene modules and comparing co-expression networks. CoCoCoNet is a resource for both data and methods, providing gold standard networks and sophisticated tools for on-the-fly comparative analyses across 14 species. We show how CoCoCoNet can be used in two use cases. In the first, we demonstrate deep conservation of a nucleolus gene module across very divergent organisms, and in the second, we show how the heterogeneity of autism mechanisms in humans can be broken down by functional groups and translated to model organisms. CoCoCoNet is free to use and available to all at, with data and R scripts available at
      PubDate: Mon, 11 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa348
      Issue No: Vol. 48, No. W1 (2020)
  • BIOMEX: an interactive workflow for (single cell) omics data
           interpretation and visualization

    • Authors: Taverna F; Goveia J, Karakach T, et al.
      Abstract: The amount of biological data, generated with (single cell) omics technologies, is rapidly increasing, thereby exacerbating bottlenecks in the data analysis and interpretation of omics experiments. Data mining platforms that facilitate non-bioinformatician experimental scientists to analyze a wide range of experimental designs and data types can alleviate such bottlenecks, aiding in the exploration of (newly generated or publicly available) omics datasets. Here, we present BIOMEX, a browser-based software, designed to facilitate the Biological Interpretation Of Multi-omics EXperiments by bench scientists. BIOMEX integrates state-of-the-art statistical tools and field-tested algorithms into a flexible but well-defined workflow that accommodates metabolomics, transcriptomics, proteomics, mass cytometry and single cell data from different platforms and organisms. The BIOMEX workflow is accompanied by a manual and video tutorials that provide the necessary background to navigate the interface and get acquainted with the employed methods. BIOMEX guides the user through omics-tailored analyses, such as data pretreatment and normalization, dimensionality reduction, differential and enrichment analysis, pathway mapping, clustering, marker analysis, trajectory inference, meta-analysis and others. BIOMEX is fully interactive, allowing users to easily change parameters and generate customized plots exportable as high-quality publication-ready figures. BIOMEX is open source and freely available at
      PubDate: Mon, 11 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa332
      Issue No: Vol. 48, No. W1 (2020)
  • AlloSigMA 2: paving the way to designing allosteric effectors and to
           exploring allosteric effects of mutations

    • Authors: Tan Z; Guarnera E, Tee W, et al.
      Abstract: The AlloSigMA 2 server provides an interactive platform for exploring the allosteric signaling caused by ligand binding and/or mutations, for analyzing the allosteric effects of mutations and for detecting potential cancer drivers and pathogenic nsSNPs. It can also be used for searching latent allosteric sites and for computationally designing allosteric effectors for these sites with required agonist/antagonist activity. The server is based on the implementation of the Structure-Based Statistical Mechanical Model of Allostery (SBSMMA), which allows one to evaluate the allosteric free energy as a result of the perturbation at per-residue resolution. The Allosteric Signaling Map (ASM) providing a comprehensive residue-by-residue allosteric control over the protein activity can be obtained for any structure of interest. The Allosteric Probing Map (APM), in turn, allows one to perform the fragment-based-like computational design experiment aimed at finding leads for potential allosteric effectors. The server can be instrumental in elucidating allosteric mechanisms and the actions of allosteric mutations, and in efforts to design new elements of allosteric control. The server is freely available at:
      PubDate: Mon, 11 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa338
      Issue No: Vol. 48, No. W1 (2020)
  • CausalMGM: an interactive web-based causal discovery tool

    • Authors: Ge X; Raghu V, Chrysanthis P, et al.
      Abstract: High-throughput sequencing and the availability of large online data repositories (e.g. The Cancer Genome Atlas and Trans-Omics for Precision Medicine) have the potential to revolutionize systems biology by enabling researchers to study interactions between data from different modalities (i.e. genetic, genomic, clinical, behavioral, etc.). Currently, data mining and statistical approaches are confined to identifying correlates in these datasets, but researchers are often interested in identifying cause-and-effect relationships. Causal discovery methods were developed to infer such cause-and-effect relationships from observational data. Though these algorithms have had demonstrated successes in several biomedical applications, they are difficult to use for non-experts. So, there is a need for web-based tools to make causal discovery methods accessible. Here, we present CausalMGM, the first web-based causal discovery tool that enables researchers to find cause-and-effect relationships from observational data. Web-based CausalMGM consists of three data analysis tools: (i) feature selection and clustering; (ii) automated identification of cause-and-effect relationships via a graphical model; and (iii) interactive visualization of the learned causal (directed) graph. We demonstrate how CausalMGM enables an end-to-end exploratory analysis of biomedical datasets, giving researchers a clearer picture of its capabilities.
      PubDate: Mon, 11 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa350
      Issue No: Vol. 48, No. W1 (2020)
  • AWSEM-Suite: a protein structure prediction server based on
           template-guided, coevolutionary-enhanced optimized folding landscapes

    • Authors: Jin S; Contessoto V, Chen M, et al.
      Abstract: The accurate and reliable prediction of the 3D structures of proteins and their assemblies remains difficult even though the number of solved structures soars and prediction techniques improve. In this study, a free and open access web server, AWSEM-Suite, whose goal is to predict monomeric protein tertiary structures from sequence is described. The model underlying the server’s predictions is a coarse-grained protein force field which has its roots in neural network ideas that has been optimized using energy landscape theory. Employing physically motivated potentials and knowledge-based local structure biasing terms, the addition of homologous template and co-evolutionary restraints to AWSEM-Suite greatly improves the predictive power of pure AWSEM structure prediction. From the independent evaluation metrics released in the CASP13 experiment, AWSEM-Suite proves to be a reasonably accurate algorithm for free modeling, standing at the eighth position in the free modeling category of CASP13. The AWSEM-Suite server also features a front end with a user-friendly interface. The AWSEM-Suite server is a powerful tool for predicting monomeric protein tertiary structures that is most useful when a suitable structure template is not available. The AWSEM-Suite server is freely available at:
      PubDate: Fri, 08 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa356
      Issue No: Vol. 48, No. W1 (2020)
  • TeamTat: a collaborative text annotation tool

    • Authors: Islamaj R; Kwon D, Kim S, et al.
      Abstract: Manually annotated data is key to developing text-mining and information-extraction algorithms. However, human annotation requires considerable time, effort and expertise. Given the rapid growth of biomedical literature, it is paramount to build tools that facilitate speed and maintain expert quality. While existing text annotation tools may provide user-friendly interfaces to domain experts, limited support is available for figure display, project management, and multi-user team annotation. In response, we developed TeamTat, a web-based annotation tool (local setup available), equipped to manage team annotation projects engagingly and efficiently. TeamTat is a novel tool for managing multi-user, multi-label document annotation, reflecting the entire production life cycle. Project managers can specify annotation schema for entities and relations and select annotator(s) and distribute documents anonymously to prevent bias. Document input format can be plain text, PDF or BioC (uploaded locally or automatically retrieved from PubMed/PMC), and output format is BioC with inline annotations. TeamTat displays figures from the full text for the annotator's convenience. Multiple users can work on the same document independently in their workspaces, and the team manager can track task completion. TeamTat provides corpus quality assessment via inter-annotator agreement statistics, and a user-friendly interface convenient for annotation review and inter-annotator disagreement resolution to improve corpus quality.
      PubDate: Fri, 08 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa333
      Issue No: Vol. 48, No. W1 (2020)
  • ARIAweb: a server for automated NMR structure calculation

    • Authors: Allain F; Mareuil F, Ménager H, et al.
      Abstract: Nuclear magnetic resonance (NMR) spectroscopy is a method of choice to study the dynamics and determine the atomic structure of macromolecules in solution. The standalone program ARIA (Ambiguous Restraints for Iterative Assignment) for automated assignment of nuclear Overhauser enhancement (NOE) data and structure calculation is well established in the NMR community. To ultimately provide a perfectly transparent and easy to use service, we designed an online user interface to ARIA with additional functionalities. Data conversion, structure calculation setup and execution, followed by interactive visualization of the generated 3D structures are all integrated in ARIAweb and freely accessible at
      PubDate: Fri, 08 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa362
      Issue No: Vol. 48, No. W1 (2020)
  • SIB Literature Services: RESTful customizable search engines in biomedical
           literature, enriched with automatically mapped biomedical concepts

    • Authors: Gobeill J; Caucheteur D, Michel P, et al.
      Abstract: Thanks to recent efforts by the text mining community, biocurators now have access to plenty of good tools and Web interfaces for identifying and visualizing biomedical entities in literature. Yet, many of these systems start with a PubMed query, which is limited by strong Boolean constraints. Some semantic search engines exploit entities for Information Retrieval, and/or deliver relevance-based ranked results. Yet, they are not designed for supporting a specific curation workflow, and allow very limited control over the search process. The Swiss Institute of Bioinformatics Literature Services (SIBiLS) provide personalized Information Retrieval in the biological literature. Indeed, SIBiLS allow fully customizable search in semantically enriched contents, based on keywords and/or mapped biomedical entities from a growing set of standardized and legacy vocabularies. The services have been used and favourably evaluated to assist the curation of genes and gene products, by delivering customized literature triage engines to different curation teams. SIBiLS are freely accessible via REST APIs and are ready to empower any curation workflow, built on modern technologies scalable with big data: MongoDB and Elasticsearch. They cover MEDLINE and PubMed Central Open Access enriched by nearly 2 billion mapped biomedical entities, and are updated daily.
      PubDate: Thu, 07 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa328
      Issue No: Vol. 48, No. W1 (2020)
  • GeneTrail 3: advanced high-throughput enrichment analysis

    • Authors: Gerstner N; Kehl T, Lenhof K, et al.
      Abstract: We present GeneTrail 3, a major extension of our web service GeneTrail that offers rich functionality for the identification, analysis, and visualization of deregulated biological processes. Our web service provides a comprehensive collection of biological processes and signaling pathways for 12 model organisms that can be analyzed with a powerful framework for enrichment and network analysis of transcriptomic, miRNomic, proteomic, and genomic data sets. Moreover, GeneTrail offers novel workflows for the analysis of epigenetic marks, time series experiments, and single cell data. We demonstrate the capabilities of our web service in two case-studies, which highlight that GeneTrail is well equipped for uncovering complex molecular mechanisms. GeneTrail is freely accessible at:
      PubDate: Thu, 07 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa306
      Issue No: Vol. 48, No. W1 (2020)
  • COVTree: Coevolution in OVerlapped sequences by Tree analysis server

    • Authors: Teppa E; Zea D, Oteri F, et al.
      Abstract: Overlapping genes are commonplace in viruses and play an important role in their function and evolution. For these genes, molecular coevolution may be seen as a mechanism to decrease the evolutionary constraints of amino acid positions in the overlapping regions and to tolerate or compensate unfavorable mutations. Tracing these mutational sites could help to gain insight into the direct or indirect effect of the mutations in the corresponding overlapping proteins. In the past, coevolution analysis has been used to identify residue pairs and coevolutionary signatures within or between proteins that served as markers of physical interactions and/or functional relationships. Coevolution in OVerlapped sequences by Tree analysis (COVTree) is a web server providing the online analysis of coevolving amino-acid pairs in overlapping genes, where residues might be located inside or outside the overlapping region. COVTree is designed to handle protein families with various characteristics, including those that typically display a small number of highly conserved sequences. It is based on BIS2, a fast version of the coevolution analysis tool Blocks in Sequences (BIS). COVTree provides a rich and interactive graphical interface to ease biological interpretation of the results and it is openly accessible at
      PubDate: Wed, 06 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa330
      Issue No: Vol. 48, No. W1 (2020)
  • The omics discovery REST interface

    • Authors: Dass G; Vu M, Xu P, et al.
      Abstract: The Omics Discovery Index is an open source platform that can be used to access, discover and disseminate omics datasets. OmicsDI integrates proteomics, genomics, metabolomics, models and transcriptomics datasets. Using an efficient indexing system, OmicsDI integrates different biological entities including genes, transcripts, proteins, metabolites and the corresponding publications from PubMed. In addition, it implements a group of pipelines to estimate the impact of each dataset by tracing the number of citations, reanalysis and biological entities reported by each dataset. Here, we present the OmicsDI REST interface to enable programmatic access to any dataset in OmicsDI or all the datasets for a specific provider (database). Clients can perform queries on the API using different metadata information such as sample details (species, tissues, etc.), instrumentation (mass spectrometer, sequencer), keywords and other provided annotations. In addition, we present two different libraries in R and Python to facilitate the development of tools that can programmatically interact with the OmicsDI REST interface.
      PubDate: Wed, 06 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa326
      Issue No: Vol. 48, No. W1 (2020)
  • miEAA 2.0: integrating multi-species microRNA enrichment analysis and
           workflow management systems

    • Authors: Kern F; Fehlmann T, Solomon J, et al.
      Abstract: Gene set enrichment analysis has become one of the most frequently used applications in molecular biology research. Originally developed for gene sets, the same statistical principles are now available for all omics types. In 2016, we published the miRNA enrichment analysis and annotation tool (miEAA) for human precursor and mature miRNAs. Here, we present miEAA 2.0, supporting miRNA input from ten frequently investigated organisms. To facilitate inclusion of miEAA in workflow systems, we implemented an Application Programming Interface (API). Users can perform miRNA set enrichment analysis using either the web-interface, a dedicated Python package, or custom remote clients. Moreover, the number of category sets was raised by an order of magnitude. We implemented novel categories like annotation confidence level or localisation in biological compartments. In combination with the miRBase miRNA-version and miRNA-to-precursor converters, miEAA supports research settings where older releases of miRBase are in use. The web server also offers novel comprehensive visualizations such as heatmaps and running sum curves with background distributions. We demonstrate the new features with case studies for human kidney cancer, a biomarker study on Parkinson’s disease from the PPMI cohort, and a mouse model for breast cancer. The tool is freely accessible at:
      PubDate: Wed, 06 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa309
      Issue No: Vol. 48, No. W1 (2020)
  • The Quest for Orthologs benchmark service and consensus calls in 2020

    • Authors: Altenhoff A; Garrayo-Ventas J, Cosentino S, et al.
      Abstract: The identification of orthologs—genes in different species which descended from the same gene in their last common ancestor—is a prerequisite for many analyses in comparative genomics and molecular evolution. Numerous algorithms and resources have been conceived to address this problem, but benchmarking and interpreting them is fraught with difficulties (need to compare them on a common input dataset, absence of ground truth, computational cost of calling orthologs). To address this, the Quest for Orthologs consortium maintains a reference set of proteomes and provides a web server for continuous orthology benchmarking. Furthermore, consensus ortholog calls derived from public benchmark submissions are provided on the Alliance of Genome Resources website, the joint portal of NIH-funded model organism databases.
      PubDate: Wed, 06 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa308
      Issue No: Vol. 48, No. W1 (2020)
  • miRSwitch: detecting microRNA arm shift and switch events

    • Authors: Kern F; Amand J, Senatorov I, et al.
      Abstract: Arm selection, the preferential expression of a 3′ or 5′ mature microRNA (miRNA), is a highly dynamic and tissue-specific process. Time-dependent expression shifts or switches between the arms are also relevant for human diseases. We present miRSwitch, a web server to facilitate the analysis and interpretation of arm selection events. Our species-independent tool evaluates pre-processed small non-coding RNA sequencing (sncRNA-seq) data, i.e. expression matrices or output files from miRNA quantification tools (miRDeep2, miRMaster, sRNAbench). miRSwitch highlights potential changes in the distribution of mature miRNAs from the same precursor. Group comparisons from one or several user-provided annotations (e.g. disease states) are possible. Results can be dynamically adjusted by choosing from a continuous range of highly specific to very sensitive parameters. Users can compare potential arm shifts in the provided data to a human reference map of pre-computed arm shift frequencies. We created this map from 46 tissues and 30 521 samples. As case studies we present novel arm shift information in an Alzheimer’s disease biomarker data set and from a comparison of tissues in Homo sapiens and Mus musculus. In summary, miRSwitch offers a broad range of customized arm switch analyses along with comprehensive visualizations, and is freely available at:
      PubDate: Fri, 01 May 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa323
      Issue No: Vol. 48, No. W1 (2020)
  • LIST-S2: taxonomy based sorting of deleterious missense mutations across

    • Authors: Malhis N; Jacobson M, Jones S, et al.
      Abstract: The separation of deleterious from benign mutations remains a key challenge in the interpretation of genomic data. Computational methods used to sort mutations based on their potential deleteriousness rely largely on conservation measures derived from sequence alignments. Here, we introduce LIST-S2, a successor to our previously developed approach LIST, which aims to exploit local sequence identity and taxonomy distances in quantifying the conservation of human protein sequences. Unlike its predecessor, LIST-S2 is not limited to human sequences but can assess conservation and make predictions for sequences from any organism. Moreover, we provide a web-tool and downloadable software to compute and visualize the deleteriousness of mutations in user-provided sequences. This web-tool contains an HTML interface and a RESTful API to submit and manage sequences as well as a browsable set of precomputed predictions for a large number of UniProtKB protein sequences of common taxa. LIST-S2 is available at:
      PubDate: Thu, 30 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa288
      Issue No: Vol. 48, No. W1 (2020)
  • MetaPhOrs 2.0: integrative, phylogeny-based inference of orthology and
           paralogy across the tree of life

    • Authors: Chorostecki U; Molina M, Pryszcz L, et al.
      Abstract: Inferring homology relationships across genes in different species is a central task in comparative genomics. Therefore, a large number of resources and methods have been developed over the years. Some public databases include phylogenetic trees of homologous gene families which can be used to further differentiate homology relationships into orthology and paralogy. MetaPhOrs is a web server that integrates phylogenetic information from different sources to provide orthology and paralogy relationships based on a common phylogeny-based predictive algorithm and associated with a consistency-based confidence score. Here we describe the latest version of the web server which includes major new implementations and provides orthology and paralogy relationships derived from ∼8.2 million gene family trees—from 13 different source repositories across ∼4000 species with sequenced genomes. MetaPhOrs server is freely available, without registration, at
      PubDate: Tue, 28 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa282
      Issue No: Vol. 48, No. W1 (2020)
  • PDBMD2CD: providing predicted protein circular dichroism spectra from
           multiple molecular dynamics-generated protein structures

    • Authors: Drew E; Janes R.
      Abstract: PDBMD2CD is a new web server capable of predicting circular dichroism (CD) spectra for multiple protein structures derived from molecular dynamics (MD) simulations, enabling predictions from thousands of protein atomic coordinate files (e.g. MD trajectories) and generating spectra for each of these structures provided by the user. Using MD enables exploration of systems that cannot be monitored by direct experimentation. Validation of MD-derived data from these types of trajectories can be difficult via conventional structure-determining techniques such as crystallography or nuclear magnetic resonance spectroscopy. CD is an experimental technique that can provide protein structure information from such conditions. The website utilizes a much faster (minimum ∼1000×) and more accurate approach for calculating CD spectra than its predecessor, PDB2CD (1). As well as improving on the speed and accuracy of current methods, new analysis tools are provided to cluster predictions or compare them against experimental CD spectra. By identifying a subset of the closest predicted CD spectra derived from PDBMD2CD to an experimental spectrum, the associated cluster of structures could be representative of those found under the conditions in which the MD studies were undertaken, thereby offering an analytical insight into the results. PDBMD2CD is freely available at:
      PubDate: Tue, 28 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa296
      Issue No: Vol. 48, No. W1 (2020)
  • MetagenoNets: comprehensive inference and meta-insights for microbial
           correlation networks

    • Authors: Nagpal S; Singh R, Yadav D, et al.
      Abstract: Microbial association networks are frequently used for understanding and comparing community dynamics from microbiome datasets. Inferring microbial correlations for such networks and obtaining meaningful biological insights, however, requires a lengthy data management workflow, choice of appropriate methods, statistical computations, followed by a different pipeline for suitably visualizing, reporting and comparing the associations. The complexity is further increased with the added dimension of multi-group ‘meta-data’ and ‘inter-omic’ functional profiles that are often associated with microbiome studies. This necessitates not only categorical networks, but also integrated and bi-partite networks. Multiple options of network inference algorithms further add to the efforts required for performing correlation-based microbiome interaction studies. We present MetagenoNets, a web-based application, which accepts multi-environment microbial abundance as well as functional profiles, intelligently segregates ‘continuous and categorical’ meta-data and allows inference as well as visualization of categorical, integrated (inter-omic) and bi-partite networks. The modular structure of MetagenoNets ensures a logical flow of analysis (inference, integration, exploration and comparison) in an intuitive and interactive personalized dashboard driven framework. Dynamic choice of filtration, normalization, data transformation and correlation algorithms ensures that end users get a one-stop solution for microbial network analysis. MetagenoNets is freely available at
      PubDate: Mon, 27 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa254
      Issue No: Vol. 48, No. W1 (2020)
  • VarFish: comprehensive DNA variant analysis for diagnostics and research

    • Authors: Holtgrewe M; Stolpe O, Nieminen M, et al.
      Abstract: VarFish is a user-friendly web application for the quality control, filtering, prioritization, analysis, and user-based annotation of DNA variant data with a focus on rare disease genetics. It is capable of processing variant call files with single or multiple samples. The variants are automatically annotated with population frequencies, molecular impact, and presence in databases such as ClinVar. Further, it provides support for pathogenicity scores including CADD, MutationTaster, and phenotypic similarity scores. Users can filter variants based on these annotations and presumed inheritance pattern and sort the results by these scores. Variants passing the filter are listed with their annotations and many useful link-outs to genome browsers, other gene/variant data portals, and external tools for variant assessment. VarFish allows users to create their own annotations including support for variant assessment following ACMG-AMP guidelines. In close collaboration with medical practitioners, VarFish was designed for variant analysis and prioritization in diagnostic and research settings as described in the software's extensive manual. The user interface has been optimized for supporting these protocols. Users can install VarFish on their own in-house servers where it provides additional lab notebook features for collaborative analysis and allows re-analysis of cases, e.g. after update of genotype or phenotype databases.
      PubDate: Mon, 27 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa241
      Issue No: Vol. 48, No. W1 (2020)
  • NOREVA: enhanced normalization and evaluation of time-course and
           multi-class metabolomic data

    • Authors: Yang Q; Wang Y, Zhang Y, et al.
      Abstract: Biological processes (such as microbial growth and physiological response) are usually dynamic and require the monitoring of metabolic variation at different time-points. Moreover, there is a clear shift from case-control (N=2) studies to multi-class (N>2) problems in current metabolomics, which is crucial for revealing the mechanisms underlying certain physiological processes, disease metastasis, etc. These time-course and multi-class metabolomics studies have attracted great attention, and data normalization is essential for removing unwanted biological/experimental variations in these studies. However, no tool (including NOREVA 1.0, which focuses only on case-control studies) is available for effectively assessing the performance of normalization methods on time-course/multi-class metabolomic data. Thus, NOREVA was updated to version 2.0 by (i) realizing normalization and evaluation of both time-course and multi-class metabolomic data, (ii) integrating 144 normalization methods of a recently proposed combination strategy and (iii) identifying the well-performing methods by comprehensively assessing the largest set of normalizations (168 in total, significantly larger than the 24 in NOREVA 1.0). The significance of this update was extensively validated by case studies on benchmark datasets. All in all, NOREVA 2.0 is distinguished by its capability to identify well-performing normalization method(s) for time-course and multi-class metabolomics, which makes it an indispensable complement to other available tools. NOREVA can be accessed at
      PubDate: Thu, 23 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa258
      Issue No: Vol. 48, No. W1 (2020)
  • MusiteDeep: a deep-learning based webserver for protein post-translational
           modification site prediction and visualization

    • Authors: Wang D; Liu D, Yuchi J, et al.
      Abstract: MusiteDeep is an online resource providing a deep-learning framework for protein post-translational modification (PTM) site prediction and visualization. The predictor uses only protein sequences as input and needs no complex features, which results in real-time prediction for a large number of proteins. It takes less than three minutes to predict for 1000 sequences per PTM type. The output is presented at the amino acid level for the user-selected PTM types. The framework has been benchmarked by other researchers and has demonstrated competitive performance in PTM site prediction. In this webserver, we updated the previous framework by utilizing more advanced ensemble techniques and by providing prediction and visualization for multiple PTMs simultaneously, so that users can analyze potential PTM cross-talks directly. Besides prediction, users can interactively review the predicted PTM sites in the context of known PTM annotations and protein 3D structures through homology-based search. In addition, the server maintains a local database providing pre-processed PTM annotations from UniProt/Swiss-Prot for users to download. This database will be updated every three months. The MusiteDeep server is available at The stand-alone tools for locally using MusiteDeep are available at
      PubDate: Thu, 23 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa275
      Issue No: Vol. 48, No. W1 (2020)
  • TFmotifView: a webserver for the visualization of transcription factor
           motifs in genomic regions

    • Authors: Leporcq C; Spill Y, Balaramane D, et al.
      Abstract: Transcription factors (TFs) regulate gene expression. The binding specificities of many TFs have been deciphered and summarized as position-weight matrices, also called TF motifs. Despite the availability of hundreds of known TF motifs in databases, it remains non-trivial to quickly query and visualize the enrichment of known TF motifs in genomic regions of interest. Towards this goal, we developed TFmotifView, a web server that allows users to study the distribution of known TF motifs in genomic regions. Based on input genomic regions and selected TF motifs, TFmotifView performs an overlap of the genomic regions with TF motif occurrences identified using a dynamic P-value threshold. TFmotifView generates three different outputs: (i) an enrichment table and scatterplot calculating the significance of TF motif occurrences in genomic regions compared to control regions, (ii) a genomic view of the organisation of TF motifs in each genomic region and (iii) a metaplot summarizing the position of TF motifs relative to the center of the regions. TFmotifView will contribute to the integration of TF motif information with a wide range of genomic datasets, with the goal of better understanding the regulation of gene expression by transcription factors. TFmotifView is freely available at
      PubDate: Thu, 23 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa252
      Issue No: Vol. 48, No. W1 (2020)
  • miRViz: a novel webserver application to visualize and interpret microRNA

    • Authors: Giroux P; Bhajun R, Segard S, et al.
      Abstract: MicroRNAs (miRNAs) are small non-coding RNAs that are involved in the regulation of major pathways in eukaryotic cells through their binding to and repression of multiple mRNAs. With high-throughput methodologies, various outcomes can be measured that produce long lists of miRNAs that are often difficult to interpret. A common question is: after differential expression or phenotypic screening of miRNA mimics, which miRNA should be chosen for further investigation? Here, we present miRViz, a webserver application designed to visualize and interpret large miRNA datasets, with no need for programming skills. MiRViz has two main goals: (i) to help biologists raise data-driven hypotheses and (ii) to share miRNA datasets in a straightforward way through publication-quality data representation, with emphasis on relevant groups of miRNAs. MiRViz can currently handle datasets from 11 eukaryotic species. We present real-case applications of miRViz, and provide both datasets and procedures to reproduce the corresponding figures. MiRViz offers rapid identification of miRNA families, as demonstrated here for the miRNA-320 family, which is significantly exported in exosomes of colon cancer cells. We also visually highlight a group of miRNAs associated with pluripotency that is particularly active in control of a breast cancer stem-cell population in culture.
      PubDate: Wed, 22 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa259
      Issue No: Vol. 48, No. W1 (2020)
  • Zebra2: advanced and easy-to-use web-server for bioinformatic analysis of
           subfamily-specific and conserved positions in diverse protein

    • Authors: Suplatov D; Sharapova Y, Geraseva E, et al.
      Abstract: Zebra2 is a highly automated web-tool to search for subfamily-specific and conserved positions (i.e. the determinants of functional diversity as well as the key catalytic and structural residues) in protein superfamilies. The bioinformatic analysis is facilitated by Mustguseal, a companion web-server that automatically collects and superimposes a large representative set of functionally diverse homologs with high structure similarity but low sequence identity to the selected query protein. The results are automatically prioritized and provided at four information levels to facilitate the knowledge-driven expert selection of the most promising positions on-line: as a sequence similarity network; interfaces to sequence-based and 3D-structure-based analysis of conservation and variability; and accompanied by the detailed annotation of proteins accumulated from the integrated databases with links to the external resources. The integration of the Zebra2 and Mustguseal web-tools provides the first out-of-the-box open-access solution of its kind to conduct a systematic analysis of evolutionarily related proteins implementing different functions within a shared 3D-structure of the superfamily, to determine common and specific patterns of function-associated local structural elements, and to assist in selecting hot-spots for rational design and preparing focused libraries for directed evolution. The web-servers are free and open to all users, with no login required.
      PubDate: Mon, 20 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa276
      Issue No: Vol. 48, No. W1 (2020)
  • SPEED2: inferring upstream pathway activity from differential gene

    • Authors: Rydenfelt M; Klinger B, Klünemann M, et al.
      Abstract: Extracting signalling pathway activities from transcriptome data is important to infer mechanistic origins of transcriptomic dysregulation, for example in disease. A popular method for doing so is enrichment analysis of signature genes in, e.g., differentially regulated genes. Previously, we derived signatures for signalling pathways by integrating public perturbation transcriptome data and generated a signature database called SPEED (Signalling Pathway Enrichment using Experimental Datasets), for which we here present a substantial upgrade as SPEED2. This web server hosts consensus signatures for 16 signalling pathways that are derived from a large number of transcriptomic signalling perturbation experiments. When provided with a gene list of, e.g., differentially expressed genes, the web server allows users to infer the signalling pathways that likely caused these genes to be deregulated. In addition to signature lists, we derive ‘continuous’ gene signatures, in a transparent and automated fashion without any fine-tuning, and describe a new algorithm to score these signatures.
      PubDate: Mon, 20 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa236
      Issue No: Vol. 48, No. W1 (2020)
  • novoPathFinder: a webserver of designing novel-pathway with integrating

    • Authors: Ding S; Tian Y, Cai P, et al.
      Abstract: To increase the number of value-added chemicals that can be produced by metabolic engineering and synthetic biology, constructing metabolic space with novel reactions/pathways is crucial. However, given the large number of reactions that exist in the metabolic space and the complicated metabolism within hosts, identifying novel pathways linking two molecules, or heterologous pathways when engineering a host to produce a target molecule, is an arduous task. Hence, we built a user-friendly web server, novoPathFinder, which has several features: (i) enumerating novel pathways between two specified molecules without considering hosts; (ii) constructing heterologous pathways with known or putative reactions for producing a target molecule within Escherichia coli or yeast without a given precursor; (iii) evaluating novel pathways by considering several categories, including enzyme promiscuity, Synthetic Complex Score (SCScore) and LD50 of intermediates, overall stoichiometric conversions, pathway length, theoretical yields and thermodynamic feasibility. According to the results, novoPathFinder is better able to recover experimentally validated pathways than other rule-based web server tools. In addition, more efficient pathways with novel reactions can be retrieved for further experimental exploration. novoPathFinder is available at
      PubDate: Mon, 20 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa230
      Issue No: Vol. 48, No. W1 (2020)
  • OligoMinerApp: a web-server application for the design of genome-scale
           oligonucleotide in situ hybridization probes through the flexible
           OligoMiner environment

    • Authors: Passaro M; Martinovic M, Bevilacqua V, et al.
      Abstract: Fluorescence in situ hybridization (FISH) is a powerful single-cell technique that harnesses nucleic acid base pairing to detect the abundance and positioning of cellular RNA and DNA molecules in fixed samples. Recent technology development has paved the way to the construction of FISH probes entirely from synthetic oligonucleotides (oligos), allowing the optimization of thermodynamic properties together with the opportunity to design probes against any sequenced genome. However, comparatively little progress has been made in the development of computational tools to facilitate oligo design, and even less has been done to extend their accessibility. OligoMiner is an open-source and modular pipeline written in Python that introduces a novel method of assessing probe specificity, which employs supervised machine learning to predict probe binding specificity from genome-scale sequence alignment information. However, because it lacks a Graphical User Interface (GUI), its use is restricted to people who are comfortable with command line interfaces, potentially excluding many researchers from this technology. Here, we present OligoMinerApp, a web-based application that aims to extend the OligoMiner framework through the implementation of a smart and easy-to-use GUI and the introduction of new functionalities specially designed to make effective probe mining available to everyone.
      PubDate: Mon, 20 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa251
      Issue No: Vol. 48, No. W1 (2020)
  • Galaxy HiCExplorer 3: a web server for reproducible Hi-C, capture Hi-C and
           single-cell Hi-C data analysis, quality control and visualization

    • Authors: Wolff J; Rabbani L, Gilsbach R, et al.
      Abstract: The Galaxy HiCExplorer provides a web service that enables the integrative analysis of chromosome conformation by providing tools and computational resources to pre-process, analyse and visualize Hi-C, Capture Hi-C (cHi-C) and single-cell Hi-C (scHi-C) data. Since the last publication, Galaxy HiCExplorer has been expanded considerably with new tools to facilitate the analysis of cHi-C and to provide an in-depth analysis of Hi-C data. Moreover, it supports the analysis of scHi-C data by offering a broad range of tools. With the help of the standard graphical user interface of Galaxy, the presented workflows, extensive documentation and tutorials, novices as well as Hi-C experts are supported in their Hi-C data analysis with Galaxy HiCExplorer.
      PubDate: Fri, 17 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa220
      Issue No: Vol. 48, No. W1 (2020)
  • ProteinsPlus: interactive analysis of protein–ligand binding

    • Authors: Schöning-Stierand K; Diedrich K, Fährrolfes R, et al.
      Abstract: Due to the increasing number of publicly available protein structures, searching, enriching and investigating these data still poses a challenging task. The ProteinsPlus web service offers a broad range of tools addressing these challenges. The web interface to the tool collection, which focuses on protein–ligand interactions, has been geared towards easy and intuitive access to a large variety of functionality for life scientists. Since our last publication, the ProteinsPlus web service has been extended by additional services and has undergone substantial infrastructural improvements. A keyword search functionality was added on the start page of ProteinsPlus, enabling users to work on structures without knowing their PDB code. The tool collection has been augmented by three tools: StructureProfiler validates ligands and active sites using selection criteria of well-established protein–ligand benchmark data sets, WarPP places water molecules in the ligand binding sites of a protein, and METALizer calculates, predicts and scores coordination geometries of metal ions based on surrounding complex atoms. Additionally, all tools provided by ProteinsPlus are available through a REST service, enabling automated integration in structure processing and modeling pipelines.
      PubDate: Thu, 16 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa235
      Issue No: Vol. 48, No. W1 (2020)
  • rMAPS2: an update of the RNA map analysis and plotting server for
           alternative splicing regulation

    • Authors: Hwang J; Jung S, Kook T, et al.
      Abstract: The rMAPS2 (RNA Map Analysis and Plotting Server 2) web server, which is freely available, has provided the high-throughput sequencing data research community with curated tools for the identification of RNA binding protein sites. rMAPS2 analyzes differential alternative splicing or CLIP peak data obtained from high-throughput sequencing data analysis tools like MISO, rMATS, Piranha, PIPE-CLIP and PARalyzer, and then graphically displays enriched RNA-binding protein target sites. The initial release of rMAPS focused only on the most common alternative splicing event, skipped exon or exon skipping. However, there was a high demand for the analysis of other major types of alternative splicing events, especially retained intron events, since this is the most common type of alternative splicing in plants such as Arabidopsis thaliana. Here, we expanded the implementation of rMAPS2 to facilitate analyses for all five major types of alternative splicing events: skipped exon, mutually exclusive exons, alternative 5′ splice site, alternative 3′ splice site and retained intron. In addition, by employing multi-threading, rMAPS2 has vastly improved the user experience with significant reductions in running time, to ∼3.5 min for the analysis of all five major alternative splicing types at once.
      PubDate: Tue, 14 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa237
      Issue No: Vol. 48, No. W1 (2020)
  • TREND: a platform for exploring protein function in prokaryotes based on
           phylogenetic, domain architecture and gene neighborhood analyses

    • Authors: Gumerov V; Zhulin I.
      Abstract: Key steps in a computational study of protein function involve analysis of (i) relationships between homologous proteins, (ii) protein domain architecture and (iii) the gene neighborhoods in which the corresponding proteins are encoded. Each of these steps requires a separate computational task and set of tools. Currently, in order to relate protein features and gene neighborhood information to phylogeny, researchers need to prepare all the necessary data and combine them by hand, which is time-consuming and error-prone. Here, we present a new platform, TREND (tree-based exploration of neighborhoods and domains), which can perform all the necessary steps in an automated fashion and put the derived information into phylogenomic context, thus making evolution-based protein function analysis more efficient. A rich set of adjustable components allows users to run the computational steps specific to their task. TREND is freely available at
      PubDate: Mon, 13 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa243
      Issue No: Vol. 48, No. W1 (2020)
  • Prediction of synonymous corrections by the BE-FF computational tool
           expands the targeting scope of base editing

    • Authors: Rabinowitz R; Abadi S, Almog S, et al.
      Abstract: Base editing is a genome-editing approach that employs the CRISPR/Cas system to precisely install point mutations within the genome. A deaminase enzyme is fused to a deactivated Cas and enables transition conversions. The diversified repertoire of base editors provides a wide range of base editing possibilities. However, existing base editors cannot induce transversion substitutions and are active only within a specified region relative to the binding site; thus, they cannot precisely correct every point mutation. Here, we present BE-FF (Base Editors Functional Finder), a novel computational tool that identifies suitable base editors to correct the translated sequence altered by a point mutation. When a precise correction is impossible, BE-FF aims to mutate bystander nucleotides in order to induce synonymous corrections that will correct the coding sequence. To measure the practicality of BE-FF, we analysed a database of human pathogenic point mutations. Out of the transition mutations, 60.9% of coding sequences could be corrected. Notably, 19.4% of the feasible corrections were not achieved by precise corrections but only by synonymous corrections. Moreover, 298 cases of transversion-derived pathogenic mutations were found to be potentially repairable by base editing via synonymous corrections, although base editing is considered impractical for such mutations.
      PubDate: Tue, 07 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa215
      Issue No: Vol. 48, No. W1 (2020)
  • SynergyFinder 2.0: visual analytics of multi-drug combination synergies

    • Authors: Ianevski A; Giri A, Aittokallio T.
      Abstract: SynergyFinder is a stand-alone web-application for interactive analysis and visualization of drug combination screening data. Since its first release in 2017, SynergyFinder has become a widely used web-tool both for the discovery of novel synergistic drug combinations in pre-clinical model systems (e.g. cell lines or primary patient-derived cells), and for better understanding of mechanisms of combination treatment efficacy or resistance. Here, we describe the latest version of SynergyFinder (release 2.0), which has been extensively upgraded through the addition of novel features supporting especially higher-order combination data analytics and exploratory visualization of multi-drug synergy patterns, along with an automated outlier detection procedure, extended curve-fitting functionality and statistical analysis of replicate measurements. A number of additional improvements were also implemented based on user requests, including new visualization and export options, an updated user interface, and enhanced stability and performance of the web-tool. With these improvements, SynergyFinder 2.0 is expected to greatly extend its potential applications in various areas of multi-drug combinatorial screening and precision medicine.
      PubDate: Sat, 04 Apr 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa216
      Issue No: Vol. 48, No. W1 (2020)
  • LINbase: a web server for genome-based identification of prokaryotes as
           members of crowdsourced taxa

    • Authors: Tian L; Huang C, Mazloom R, et al.
      Abstract: High-throughput DNA sequencing in combination with efficient algorithms could provide the basis for a highly resolved, genome phylogeny-based and digital prokaryotic taxonomy. However, current taxonomic practice continues to rely on cumbersome journal publications for the description of new species, which still constitute the smallest taxonomic units. In response, we introduce LINbase, a web server that allows users to genomically circumscribe any group of prokaryotes with measurable DNA similarity and that uses the individual isolate as the smallest unit. Since LINbase leverages the concept of Life Identification Numbers (LINs), which are codes assigned to individual genomes based on reciprocal average nucleotide identity, we refer to groups circumscribed in LINbase as LINgroups. Users can associate with each LINgroup a name, a short description, and a URL to a peer-reviewed publication. As soon as a LINgroup is circumscribed, any user can immediately identify query genomes as members and submit comments about the LINgroup. Most genomes currently in LINbase were imported from GenBank, but users can upload their own genome sequences as well. In conclusion, LINbase combines the resolution of LINs with the power of crowdsourcing in support of a highly resolved, genome phylogeny-based digital taxonomy. LINbase is available at
      PubDate: Mon, 30 Mar 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa190
      Issue No: Vol. 48, No. W1 (2020)
  • SuperCYPsPred—a web server for the prediction of cytochrome activity

    • Authors: Banerjee P; Dunkel M, Kemmler E, et al.
      Abstract: Cytochrome P450 enzyme (CYP)-mediated drug metabolism influences drug pharmacokinetics and results in adverse outcomes in patients through drug–drug interactions (DDIs). Absorption, distribution, metabolism, excretion and toxicity (ADMET) issues are the leading causes of drug failure in clinical trials. As details on metabolism are known for just half of the approved drugs, a tool for reliable prediction of CYP specificity is needed. The SuperCYPsPred web server is currently focused on five major CYP isoenzymes, namely CYP1A2, CYP2C19, CYP2D6, CYP2C9 and CYP3A4, which are responsible for more than 80% of the metabolism of clinical drugs. The prediction models for classification of CYP inhibition are based on well-established machine learning methods. The models were validated on both cross-validation and external validation sets and achieved good performance. The web server takes a 2D chemical structure as input and reports the CYP inhibition profile of the chemical for 10 models using different molecular fingerprints, along with confidence scores, similar compounds, known CYP information on drugs published in the literature, a detailed interaction profile of individual cytochromes including a DDIs table and an overall CYPs prediction radar chart. The web server does not require log in or registration and is free to use.
      PubDate: Tue, 17 Mar 2020 00:00:00 GMT
      DOI: 10.1093/nar/gkaa166
      Issue No: Vol. 48, No. W1 (2020)