Bioinformatics Advances : Journal of the International Society for Computational Biology
  Open Access journal
ISSN (Online) 2635-0041
Published by Oxford University Press
  • GraphQL for the delivery of bioinformatics web APIs and application to

    • Authors: Ireland S; Martin A, Zhang Z.
      Abstract: Motivation: Many bioinformatics resources are provided as ‘web services’, with large databases and analysis software stored on a central server, and clients interacting with them using the hypertext transfer protocol (HTTP). While some provide only a visual HTML interface, requiring a web browser to use them, many provide programmatic access using a web application programming interface (API) which returns XML, JSON or plain text that computer programs can interpret more easily. This allows access to be automated. Initially, many bioinformatics APIs used the ‘simple object access protocol’ (SOAP) and, more recently, representational state transfer (REST). Results: GraphQL is a novel, increasingly prevalent alternative to REST and SOAP that represents the available data in the form of a graph to which any conceivable query can be submitted, and which is seeing increasing adoption in industry. Here, we review the principles of GraphQL, outline its particular suitability to the delivery of bioinformatics resources and describe its implementation in our ZincBind resource. Availability and implementation: […] Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Wed, 29 Sep 2021 00:00:00 GMT
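As a rough illustration of the contrast drawn in the abstract above: a GraphQL client typically sends one query document that names exactly the fields it wants, commonly as a JSON body in an HTTP POST. The sketch below only builds such a request body and does not contact any server. The type and field names (`zincSite`, `residues`, `name`, `chain`) and the identifier are hypothetical and are not taken from the ZincBind schema.

```python
import json

# Hypothetical query: the type and field names are illustrative assumptions,
# not the actual ZincBind schema.
QUERY = """
query ($id: String!) {
  zincSite(id: $id) {
    id
    residues { name chain }
  }
}
"""


def build_graphql_payload(query: str, variables: dict) -> str:
    """Serialise a GraphQL request in the common JSON-over-POST form."""
    return json.dumps({"query": query.strip(), "variables": variables})


# "EXAMPLE-1" is a placeholder identifier, not a real record.
payload = build_graphql_payload(QUERY, {"id": "EXAMPLE-1"})
print(payload)
```

With REST, the same data might require several endpoint calls; here the shape of the response is decided by the query itself.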
  • Omics Notebook: robust, reproducible and flexible automated multiomics
           exploratory analysis and reporting

    • Authors: Blum B; Emili A, Zhang Z.
      Abstract: Summary: Mass spectrometry is an increasingly important tool for the global interrogation of diverse biomolecules. Unfortunately, the complexity of downstream data analysis is a major challenge for the routine use of these data by investigators from broader training backgrounds. Omics Notebook is an open-source framework for exploratory analysis, reporting and integrating multiomic data that are automated, reproducible and customizable. Built-in functions allow the processing of proteomic data from MaxQuant and metabolomic data from XCMS, along with other omics data in standardized input formats as specified in the documentation. In addition, the use of containerization manages R package installation requirements and is tailored for shared high-performance computing or cloud environments. Availability and implementation: Omics Notebook is implemented in Python and R and is available for download from […] with additional documentation under a GNU GPLv3 license. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Tue, 21 Sep 2021 00:00:00 GMT
  • RibDif: can individual species be differentiated by 16S sequencing?

    • Authors: Strube M; Stamatakis A.
      Abstract: Motivation: Metataxonomic analysis is now routinely used to profile the microbiome of virtually every ecological niche on planet Earth. The use of amplicon sequence variants (ASVs), proposing to be the exact biological 16S rRNA amplicon sequences of a given biological system, is now considered the gold standard. However, the 16S rRNA genes, and in particular the amplicons derived from them, are not unique for most species nor are they necessarily unique within individual genomes. Despite these restrictions, individual ASVs are often used to make inferences on the state of a given ecosystem, which may cause erroneous conclusions on the effects of a given species on a specific host phenotype or ecosystem. Results: To support researchers working with metataxonomics, we have developed RibDif, which easily and rapidly evaluates the feasibility of using metataxonomics to profile individual species. We use RibDif to demonstrate that the genus Pseudoalteromonas contains species that are impossible to distinguish with 16S amplicons and that this is a common motif in bacterial genera. We propose that researchers consult RibDif when making conclusions on individual species from metataxonomic data. Availability and implementation: RibDif is freely available along with source code and detailed documentation at […].
      PubDate: Sun, 19 Sep 2021 00:00:00 GMT
  • Balanced Functional Module Detection in genomic data

    • Authors: Tritchler D; Towle-Miller L, Miecznikowski J, et al.
      Abstract: Motivation: High-dimensional genomic data can be analyzed to understand the effects of variables on a target variable such as a clinical outcome. For understanding the underlying biological mechanism affecting the target, it is important to discover the complete set of relevant variables. Thus variable selection is a primary goal, which differs from a prediction criterion. Of special interest are functional modules, cooperating sets of variables affecting the target which can be characterized by a graph. In applications such as social networks, the concept of balance in undirected signed graphs characterizes the consistency of associations within the network. This property requires that the module variables have a joint effect on the target outcome with no internal conflict, an efficiency that may be applied to biological networks. Results: In this paper, we model genomic variables in signed undirected graphs for applications where the set of predictor variables influences an outcome. Consequences of the balance property are exploited to implement a new module discovery algorithm, balanced Functional Module Detection (bFMD), which selects a subset of variables from high-dimensional data that compose a balanced functional module. Our bFMD algorithm performed favorably in simulations as compared to other module detection methods. Additionally, bFMD detected interpretable results in an application using RNA-seq data obtained from subjects with Uterine Corpus Endometrial Carcinoma using the percentage of tumor invasion as the outcome of interest. The variables selected by bFMD have improved interpretability due to the logical consistency afforded by the balance property. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Thu, 16 Sep 2021 00:00:00 GMT
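The balance property mentioned in the abstract above has a standard textbook characterization: an undirected signed graph is balanced when its nodes can be split into two camps so that every positive edge stays inside a camp and every negative edge crosses between camps. The sketch below checks that property with a breadth-first search. It is only an illustration of the balance concept, not the bFMD algorithm, and the function name and edge encoding are assumptions made for this example.

```python
from collections import deque


def is_balanced(n_nodes, signed_edges):
    """Return True if an undirected signed graph is structurally balanced.

    signed_edges: iterable of (u, v, sign) with sign in {+1, -1} and
    nodes numbered 0 .. n_nodes - 1.
    """
    adjacency = {node: [] for node in range(n_nodes)}
    for u, v, sign in signed_edges:
        adjacency[u].append((v, sign))
        adjacency[v].append((u, sign))

    camp = {}  # node -> +1 or -1
    for start in range(n_nodes):
        if start in camp:
            continue  # already labelled as part of an earlier component
        camp[start] = 1
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, sign in adjacency[u]:
                expected = camp[u] * sign  # same camp if +, other camp if -
                if v not in camp:
                    camp[v] = expected
                    queue.append(v)
                elif camp[v] != expected:
                    # Some cycle contains an odd number of negative edges.
                    return False
    return True


# Triangle with two negative edges: balanced.
balanced = is_balanced(3, [(0, 1, 1), (1, 2, -1), (0, 2, -1)])
# Triangle with exactly one negative edge: not balanced.
unbalanced = is_balanced(3, [(0, 1, 1), (1, 2, 1), (0, 2, -1)])
print(balanced, unbalanced)
```

In the first triangle, nodes 0 and 1 agree with each other and both disagree with node 2, so there is no internal conflict; in the second, the single negative edge contradicts the two positive ones.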
  • Identifying anti-TNF response biomarkers in ulcerative colitis using a
           diffusion-based signalling model

    • Authors: Singh A; Anderssen E, Fenton C, et al.
      Abstract: Motivation: Resistance to anti-TNF therapy in subgroups of ulcerative colitis (UC) patients is a major challenge and incurs significant treatment costs. Identification of patients at risk of nonresponse to anti-TNF is of major clinical importance. To date, no quantitative computational framework exists to develop a complex biomarker for the prognosis of UC treatment. Modelling patient-wise receptor to transcription factor (TF) network connectivity may enable personalized treatment. Results: We present an approach for quantitative diffusion analysis between receptors and TFs using gene expression data. Key TFs were identified using pandaR. Network connectivities between immune-specific receptor-TF pairs were quantified using network diffusion in UC patients and controls. The patient-specific network could be considered a complex biomarker that separates anti-TNF treatment-resistant and responder patients both in the gene expression dataset used for model development and separate independent test datasets. The model was further validated in rheumatoid arthritis where it successfully discriminated resistant and responder patients to tocilizumab treatment. Our model may contribute to prognostic biomarkers that may identify treatment-resistant and responder subpopulations of UC patients. Availability and implementation: Software is available at […]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Wed, 18 Aug 2021 00:00:00 GMT
  • BIONDA: a free database for a fast information on published biomarkers

    • Authors: Turewicz M; Frericks-Zipper A, Stepath M, et al.
      Abstract: Summary: Because of the steadily increasing and already manually unmanageable total number of biomarker-related articles in biomedical research, there is a need for intelligent systems that extract all relevant information from biomedical texts and provide it as structured information to researchers in a user-friendly way. To address this, BIONDA was implemented as a free text mining-based online database for molecular biomarkers including genes, proteins and miRNAs and for all kinds of diseases. The contained structured information on published biomarkers is extracted automatically from Europe PMC publication abstracts and high-quality sources like UniProt and Disease Ontology. This allows frequent content updates. Availability and implementation: BIONDA is freely accessible via a user-friendly web application at […]. The current BIONDA code is available at GitHub via […]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Wed, 18 Aug 2021 00:00:00 GMT
  • MSABrowser: dynamic and fast visualization of sequence alignments,
           variations and annotations

    • Authors: Torun F; Bilgin H, Kaplan O, et al.
      Abstract: Summary: Sequence alignment is an excellent way to visualize the similarities and differences between DNA, RNA or protein sequences, yet it is currently difficult to jointly view sequence alignment data with genetic variations, modifications such as post-translational modifications and annotations (i.e. protein domains). Here, we present the MSABrowser tool that makes it easy to co-visualize genetic variations, modifications and annotations on the respective positions of amino acids or nucleotides in pairwise or multiple sequence alignments. MSABrowser is developed entirely in JavaScript and works on any modern web browser at any platform, including Linux, Mac OS X and Windows systems without any installation. MSABrowser is also freely available for the benefit of the scientific community. Availability and implementation: MSABrowser is released as open-source and web-based software under MIT License. The visualizer, documentation, all source codes and examples are available at […] and GitHub repository […]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Sat, 07 Aug 2021 00:00:00 GMT
  • cblaster: a remote search tool for rapid identification and visualization
           of homologous gene clusters

    • Authors: Gilchrist C; Booth T, van Wersch B, et al.
      Abstract: Motivation: Genes involved in coordinated biological pathways, including metabolism, drug resistance and virulence, are often colocalized as gene clusters. Identifying homologous gene clusters aids in the study of their function and evolution; however, existing tools are limited to searching local sequence databases. Tools for remotely searching public databases are necessary to keep pace with the rapid growth of online genomic data. Results: Here, we present cblaster, a Python-based tool to rapidly detect collocated genes in local and remote databases. cblaster is easy to use, offering both a command line and a user-friendly graphical user interface. It generates outputs that enable intuitive visualizations of large datasets and can be readily incorporated into larger bioinformatic pipelines. cblaster is a significant update to the comparative genomics toolbox. Availability and implementation: cblaster source code and documentation are freely available from GitHub under the MIT license ([…]). Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Thu, 05 Aug 2021 00:00:00 GMT
  • Identifying and classifying goals for scientific knowledge

    • Authors: Boguslav M; Salem N, White E, et al.
      Abstract: Motivation: Science progresses by posing good questions, yet work in biomedical text mining has not focused on them much. We propose a novel idea for biomedical natural language processing: identifying and characterizing the questions stated in the biomedical literature. Formally, the task is to identify and characterize statements of ignorance, statements where scientific knowledge is missing or incomplete. The creation of such technology could have many significant impacts, from the training of PhD students to ranking publications and prioritizing funding based on particular questions of interest. The work presented here is intended as the first step towards these goals. Results: We present a novel ignorance taxonomy driven by the role statements of ignorance play in research, identifying specific goals for future scientific knowledge. Using this taxonomy and reliable annotation guidelines (inter-annotator agreement above 80%), we created a gold standard ignorance corpus of 60 full-text documents from the prenatal nutrition literature with over 10 000 annotations and used it to train classifiers that achieved over 0.80 F1 scores. Availability and implementation: Corpus and source code freely available for download at […]. The source code is implemented in Python.
      PubDate: Wed, 28 Jul 2021 00:00:00 GMT
  • A novel method of literature mining to identify candidate COVID-19 drugs

    • Authors: Muramatsu T; Tanokura M, Lengauer T.
      Abstract: Summary: COVID-19 is a serious infectious disease that has recently emerged and continues to spread worldwide. Its spreading rate is too high to expect that new specific drugs will be developed in sufficient time. As an alternative, drugs already developed for other diseases have been tested for use in the treatment of COVID-19 (drug repositioning). However, to select candidate drugs from a large number of compounds, numerous inhibition assays involving viral infection of cultured cells are required. For efficiency, it would be useful to narrow the list of candidates down using logical considerations prior to performing these assays. We have developed a powerful tool to predict candidate drugs for the treatment of COVID-19 and other diseases. This tool is based on the concatenation of events/substances, each of which is linked to a KEGG (Kyoto Encyclopedia of Genes and Genomes) code based on a relationship obtained from text mining of the vast literature in the PubMed database. By analyzing 21 589 326 records with abstracts from PubMed, 98 556 KEGG codes with NAME/DEFINITION fields were connected. Among them, 9799 KEGG drug codes were connected to COVID-19, of which 7492 codes had no direct connection to COVID-19. Although this report focuses on COVID-19, the program developed here can be applied to other infectious diseases and used to quickly identify drug candidates when new infectious diseases appear in the future. Availability and implementation: The programs and data underlying this article will be shared on reasonable request to the corresponding authors. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Thu, 22 Jul 2021 00:00:00 GMT
  • Chemsearch: collaborative compound libraries with structure-aware browsing

    • Authors: Gaffney S; Smaga S, Schepartz A, et al.
      Abstract: Summary: Chemsearch is a cross-platform server application for developing and managing a chemical compound library and associated data files, with an interface for browsing and search that allows for easy navigation to a compound of interest, similar compounds or compounds that have desired structural properties. With provisions for access control and centralized document and data storage, Chemsearch supports collaboration by distributed teams. Availability and implementation: Chemsearch is a free and open-source Flask web application that can be linked to a Google Workspace account. Source code is available at […] (GPLv3 license). A Docker image allowing rapid deployment is available at […].
      PubDate: Fri, 16 Jul 2021 00:00:00 GMT
  • PathBIX—a web server for network-based pathway annotation with
           adaptive null models

    • Authors: Castresana-Aguirre M; Persson E, Sonnhammer E, et al.
      Abstract: Motivation: Pathway annotation is a vital tool for interpreting and giving meaning to experimental data in life sciences. Numerous tools exist for this task, where the most recent generation of pathway enrichment analysis tools, network-based methods, utilize biological networks to gain a richer source of information as a basis of the analysis than merely the gene content. Network-based methods use the network crosstalk between the query gene set and the genes in known pathways, and compare this to a null model of random expectation. Results: We developed PathBIX, a novel web application for network-based pathway analysis, based on the recently published ANUBIX algorithm which has been shown to be more accurate than previous network-based methods. The PathBIX website performs pathway annotation for 21 species, and utilizes prefetched and preprocessed network data from FunCoup 5.0 networks and pathway data from three databases: KEGG, Reactome, and WikiPathways. Availability: […] Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Thu, 01 Jul 2021 00:00:00 GMT
  • Improved prediction of conopeptide superfamilies with ConoDictor 2.0

    • Authors: Koua D; Ebou A, Dutertre S, et al.
      Abstract: Motivation: Cone snails are among the richest sources of natural peptides with promising pharmacological and therapeutic applications. With the reduced costs of RNAseq, scientists now heavily rely on venom gland transcriptomes for the mining of novel bioactive conopeptides, but the bioinformatic analyses often hamper the discovery process. Results: Here, we present ConoDictor 2.0 as a standalone and user-friendly command-line program. We have updated the program originally published as a web server 10 years ago using novel and updated tools and algorithms and improved our classification models with new and higher quality sequences. ConoDictor 2.0 is now more accurate, faster, multiplatform and able to deal with a whole cone snail venom gland transcriptome (raw reads or contigs) in a very short time. The new version of ConoDictor also improves the identification and subsequent classification for entirely novel or relatively distant conopeptides. We conducted various tests on known conopeptides from public databases and on the published venom duct transcriptome of Conus geographus, and compared previous results with the output of ConoDictor 2.0, ConoSorter and BLAST. Overall, ConoDictor 2.0 is 4 to 8 times faster for the analysis of a whole transcriptome on a single core computer and performed better at predicting gene superfamily. Availability and implementation: ConoDictor 2.0 is available as a Python 3 git folder at […]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Thu, 17 Jun 2021 00:00:00 GMT
  • Aquila_stLFR: diploid genome assembly based structural variant calling
           package for stLFR linked-reads

    • Authors: Liu Y; Grubbs G, Zhang L, et al.
      Abstract: Motivation: Identifying structural variants (SVs) is critical in health and disease; however, detecting them remains a challenge. Several linked-read sequencing technologies, including 10X Genomics, TELL-Seq and single tube long fragment read (stLFR), have been recently developed as cost-effective approaches to reconstruct multi-megabase haplotypes (phase blocks) from sequence data of a single sample. These technologies provide an optimal sequencing platform to characterize SVs, though few computational algorithms can utilize them. Thus, we developed Aquila_stLFR, an approach that resolves SVs through haplotype-based assembly of stLFR linked-reads. Results: Aquila_stLFR first partitions long fragment reads into two haplotype-specific blocks with the assistance of the high-quality reference genome, by taking advantage of the potential phasing ability of the linked-read itself. Each haplotype is then assembled independently, to achieve a complete diploid assembly to finally reconstruct the genome-wide SVs. We benchmarked Aquila_stLFR on a well-studied sample, NA24385, and showed Aquila_stLFR can detect medium to large size deletions (50 bp–10 kb) with high sensitivity and medium-size insertions (50 bp–1 kb) with high specificity. Availability and implementation: Source code and documentation are available on […]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Wed, 16 Jun 2021 00:00:00 GMT
  • CLARINET: efficient learning of dynamic network models from literature

    • Authors: Ahmed Y; Telmer C, Miskov-Zivanov N, et al.
      Abstract: Motivation: Creating or extending computational models of complex systems, such as intra- and intercellular biological networks, is a time- and labor-intensive task, often limited by the knowledge and experience of modelers. Automating this process would enable rapid, consistent, comprehensive and robust analysis and understanding of complex systems. Results: In this work, we present CLARINET (CLARIfying NETworks), a novel methodology and a tool for automatically expanding models using the information extracted from the literature by machine reading. CLARINET creates collaboration graphs from the extracted events and uses several novel metrics for evaluating these events individually, in pairs, and in groups. These metrics are based on the frequency of occurrence and co-occurrence of events in literature, and their connectivity to the baseline model. We tested how well CLARINET can reproduce manually built and curated models, when provided with varying amounts of information in the baseline model and in the machine reading output. Our results show that CLARINET can recover all relevant interactions that are present in the reading output and it automatically reconstructs manually built models with average recall of 80% and average precision of 70%. CLARINET is highly scalable: its average runtime is on the order of ten seconds when processing several thousand interactions, outperforming other similar methods. Availability and implementation: The data underlying this article are available in Bitbucket at […]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Thu, 03 Jun 2021 00:00:00 GMT
  • Comparative genome analysis using sample-specific string detection in
           accurate long reads

    • Authors: Khorsand P; Denti L, et al.
      Abstract: Motivation: Comparative genome analysis of two or more whole-genome sequenced (WGS) samples is at the core of most applications in genomics. These include the discovery of genomic differences segregating in populations, case-control analysis in common diseases and diagnosing rare disorders. With the current progress of accurate long-read sequencing technologies (e.g. circular consensus sequencing from PacBio sequencers), we can dive into studying repeat regions of the genome (e.g. segmental duplications) and hard-to-detect variants (e.g. complex structural variants). Results: We propose a novel framework for comparative genome analysis through the discovery of strings that are specific to one genome (‘sample-specific’ strings). We have developed a novel, accurate and efficient computational method for the discovery of sample-specific strings between two groups of WGS samples. The proposed approach will give us the ability to perform comparative genome analysis without the need to map the reads and is not hindered by shortcomings of the reference genome and mapping algorithms. We show that the proposed approach is capable of accurately finding sample-specific strings representing nearly all variation (>98%) reported across pairs or trios of WGS samples using accurate long reads (e.g. PacBio HiFi data). Availability and implementation: Data, code and instructions for reproducing the results presented in this manuscript are publicly available at […]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Mon, 31 May 2021 00:00:00 GMT
  • More practical differentially private publication of key statistics in

    • Authors: Yamamoto A; Shibuya T, Mulder N.
      Abstract: Motivation: Analyses of datasets that contain personal genomic information are very important for revealing associations between diseases and genomes. Genome-wide association studies, which are large-scale genetic statistical analyses, often involve tests with contingency tables. However, if the statistics obtained by these tests are made public as they are, sensitive information of individuals could be leaked. Existing studies have proposed privacy-preserving methods for statistics in the χ2 test with a 3 × 2 contingency table, but they do not cover all the tests used in association studies. In addition, existing methods for releasing differentially private P-values are not practical. Results: In this work, we propose methods for releasing statistics in the χ2 test, the Fisher’s exact test and the Cochran–Armitage’s trend test while preserving both personal privacy and utility. Our methods for releasing P-values are the first to achieve practicality under the concept of differential privacy by considering their base 10 logarithms. We make theoretical guarantees by showing the sensitivity of the above statistics. From our experimental results, we evaluate the utility of the proposed methods and show appropriate thresholds with high accuracy for using the private statistics in actual tests. Availability and implementation: A Python implementation of our experiments is available at […]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Tue, 18 May 2021 00:00:00 GMT
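To make the idea of releasing a statistic under differential privacy more concrete, the sketch below applies the generic Laplace mechanism to the base 10 logarithm of a P-value: noise drawn from a Laplace distribution with scale sensitivity/epsilon is added before release. This is the textbook mechanism and is not claimed to be the authors' method. The function names are invented for this example, and the sensitivity value used in the usage lines is a placeholder assumption, not a bound derived in the paper.

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_log10_p(p_value, sensitivity, epsilon, rng):
    """Release log10(p) with Laplace noise of scale sensitivity / epsilon.

    The privacy guarantee only holds if `sensitivity` truly bounds how much
    log10(p) can change when one individual's record changes; the caller is
    responsible for supplying such a bound.
    """
    return math.log10(p_value) + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)  # seeded only to make the example repeatable
# sensitivity=1.0 is a placeholder assumption for illustration.
released = private_log10_p(1e-8, sensitivity=1.0, epsilon=1.0, rng=rng)
# A very large epsilon means almost no noise (and almost no privacy).
near_exact = private_log10_p(1e-8, sensitivity=1.0, epsilon=1e9, rng=random.Random(1))
print(released, near_exact)
```

Smaller values of epsilon give stronger privacy but noisier releases, which is the privacy and utility trade-off the abstract refers to.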
  • Hierarchical Meta-Storms enables comprehensive and rapid comparison of
           microbiome functional profiles on a large scale using hierarchical
           dissimilarity metrics and parallel computing

    • Authors: Zhang Y; Jing G, Chen Y, et al.
      Abstract: Functional beta-diversity analysis on numerous microbiomes interprets the linkages between metabolic functions and their meta-data. To evaluate the microbiome beta-diversity, widely used distance metrics only count overlapped gene families but omit their inherent relationships, resulting in erroneous distances due to the sparsity of high-dimensional function profiles. Here we propose Hierarchical Meta-Storms (HMS) to tackle this problem. HMS contains two core components: (i) a dissimilarity algorithm that comprehensively measures functional distances among microbiomes using multi-level metabolic hierarchy and (ii) a fast Principal Co-ordinates Analysis (PCoA) implementation that deduces the beta-diversity pattern optimized by parallel computing. Results showed HMS can detect the variations of microbial functions in upper-level metabolic pathways, which are, however, always missed by other methods. In addition, HMS accomplished the pairwise distance matrix and PCoA for 20 000 microbiomes in 3.9 h on a single computing node, which was 23 times faster and consumed 80% less RAM than existing methods, enabling in-depth data mining among microbiomes at high resolution. HMS takes microbiome functional profiles as input and produces their pairwise distance matrix and PCoA coordinates. Availability and implementation: It is coded in C/C++ with parallel computing and released in two alternative forms: a standalone software ([…]) and an equivalent R package ([…]). Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Wed, 12 May 2021 00:00:00 GMT
  • Letter by the ISCB President

    • Authors: Orengo C.
      Abstract: This year is the 21st anniversary of the founding of the International Society for Computational Biology (ISCB) and the launch of Bioinformatics Advances is an exciting opportunity to mark this occasion.
      PubDate: Wed, 12 May 2021 00:00:00 GMT
  • Editorial

    • Authors: Bateman A; Lengauer T.
      Abstract: Welcome to Bioinformatics Advances, an interdisciplinary journal on bioinformatics and computational biology. (In the following, we will use the term bioinformatics to stand for both bioinformatics and computational biology.) The journal represents a joint endeavor of the International Society for Computational Biology (ISCB) and Oxford University Press (OUP).
      PubDate: Wed, 12 May 2021 00:00:00 GMT