COMPUTER SCIENCE (1305 journals)

Showing 1 - 200 of 872 Journals sorted alphabetically
3D Printing and Additive Manufacturing     Full-text available via subscription   (Followers: 27)
Abakós     Open Access   (Followers: 3)
ACM Computing Surveys     Hybrid Journal   (Followers: 29)
ACM Inroads     Full-text available via subscription   (Followers: 1)
ACM Journal of Computer Documentation     Free   (Followers: 4)
ACM Journal on Computing and Cultural Heritage     Hybrid Journal   (Followers: 5)
ACM Journal on Emerging Technologies in Computing Systems     Hybrid Journal   (Followers: 11)
ACM SIGACCESS Accessibility and Computing     Free   (Followers: 2)
ACM SIGAPP Applied Computing Review     Full-text available via subscription  
ACM SIGBioinformatics Record     Full-text available via subscription  
ACM SIGEVOlution     Full-text available via subscription  
ACM SIGHIT Record     Full-text available via subscription  
ACM SIGHPC Connect     Full-text available via subscription  
ACM SIGITE Newsletter     Open Access   (Followers: 1)
ACM SIGMIS Database: the DATABASE for Advances in Information Systems     Hybrid Journal  
ACM SIGUCCS plugged in     Full-text available via subscription  
ACM SIGWEB Newsletter     Full-text available via subscription   (Followers: 3)
ACM Transactions on Accessible Computing (TACCESS)     Hybrid Journal   (Followers: 3)
ACM Transactions on Algorithms (TALG)     Hybrid Journal   (Followers: 13)
ACM Transactions on Applied Perception (TAP)     Hybrid Journal   (Followers: 3)
ACM Transactions on Architecture and Code Optimization (TACO)     Hybrid Journal   (Followers: 9)
ACM Transactions on Asian and Low-Resource Language Information Processing (TALLIP)     Hybrid Journal  
ACM Transactions on Autonomous and Adaptive Systems (TAAS)     Hybrid Journal   (Followers: 10)
ACM Transactions on Computation Theory (TOCT)     Hybrid Journal   (Followers: 11)
ACM Transactions on Computational Logic (TOCL)     Hybrid Journal   (Followers: 5)
ACM Transactions on Computer Systems (TOCS)     Hybrid Journal   (Followers: 19)
ACM Transactions on Computer-Human Interaction     Hybrid Journal   (Followers: 15)
ACM Transactions on Computing Education (TOCE)     Hybrid Journal   (Followers: 9)
ACM Transactions on Computing for Healthcare     Hybrid Journal  
ACM Transactions on Cyber-Physical Systems (TCPS)     Hybrid Journal   (Followers: 1)
ACM Transactions on Design Automation of Electronic Systems (TODAES)     Hybrid Journal   (Followers: 5)
ACM Transactions on Economics and Computation     Hybrid Journal  
ACM Transactions on Embedded Computing Systems (TECS)     Hybrid Journal   (Followers: 4)
ACM Transactions on Information Systems (TOIS)     Hybrid Journal   (Followers: 18)
ACM Transactions on Intelligent Systems and Technology (TIST)     Hybrid Journal   (Followers: 11)
ACM Transactions on Interactive Intelligent Systems (TiiS)     Hybrid Journal   (Followers: 6)
ACM Transactions on Internet of Things     Hybrid Journal   (Followers: 2)
ACM Transactions on Modeling and Performance Evaluation of Computing Systems (ToMPECS)     Hybrid Journal  
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)     Hybrid Journal   (Followers: 10)
ACM Transactions on Parallel Computing     Full-text available via subscription  
ACM Transactions on Reconfigurable Technology and Systems (TRETS)     Hybrid Journal   (Followers: 6)
ACM Transactions on Sensor Networks (TOSN)     Hybrid Journal   (Followers: 9)
ACM Transactions on Social Computing     Hybrid Journal  
ACM Transactions on Spatial Algorithms and Systems (TSAS)     Hybrid Journal   (Followers: 1)
ACM Transactions on Speech and Language Processing (TSLP)     Hybrid Journal   (Followers: 11)
ACM Transactions on Storage     Hybrid Journal  
ACS Applied Materials & Interfaces     Hybrid Journal   (Followers: 39)
Acta Informatica Malaysia     Open Access  
Acta Universitatis Cibiniensis. Technical Series     Open Access   (Followers: 1)
Ad Hoc Networks     Hybrid Journal   (Followers: 12)
Adaptive Behavior     Hybrid Journal   (Followers: 8)
Additive Manufacturing Letters     Open Access   (Followers: 3)
Advanced Engineering Materials     Hybrid Journal   (Followers: 32)
Advanced Science Letters     Full-text available via subscription   (Followers: 9)
Advances in Adaptive Data Analysis     Hybrid Journal   (Followers: 9)
Advances in Artificial Intelligence     Open Access   (Followers: 31)
Advances in Catalysis     Full-text available via subscription   (Followers: 7)
Advances in Computational Mathematics     Hybrid Journal   (Followers: 20)
Advances in Computer Engineering     Open Access   (Followers: 13)
Advances in Computer Science : an International Journal     Open Access   (Followers: 18)
Advances in Computing     Open Access   (Followers: 3)
Advances in Data Analysis and Classification     Hybrid Journal   (Followers: 52)
Advances in Engineering Software     Hybrid Journal   (Followers: 26)
Advances in Geosciences (ADGEO)     Open Access   (Followers: 19)
Advances in Human-Computer Interaction     Open Access   (Followers: 19)
Advances in Image and Video Processing     Open Access   (Followers: 20)
Advances in Materials Science     Open Access   (Followers: 19)
Advances in Multimedia     Open Access   (Followers: 1)
Advances in Operations Research     Open Access   (Followers: 13)
Advances in Remote Sensing     Open Access   (Followers: 59)
Advances in Science and Research (ASR)     Open Access   (Followers: 8)
Advances in Technology Innovation     Open Access   (Followers: 5)
AEU - International Journal of Electronics and Communications     Hybrid Journal   (Followers: 8)
African Journal of Information and Communication     Open Access   (Followers: 6)
African Journal of Mathematics and Computer Science Research     Open Access   (Followers: 5)
AI EDAM     Hybrid Journal   (Followers: 2)
Air, Soil & Water Research     Open Access   (Followers: 6)
AIS Transactions on Human-Computer Interaction     Open Access   (Followers: 5)
Al-Qadisiyah Journal for Computer Science and Mathematics     Open Access   (Followers: 2)
AL-Rafidain Journal of Computer Sciences and Mathematics     Open Access   (Followers: 3)
Algebras and Representation Theory     Hybrid Journal  
Algorithms     Open Access   (Followers: 13)
American Journal of Computational and Applied Mathematics     Open Access   (Followers: 8)
American Journal of Computational Mathematics     Open Access   (Followers: 6)
American Journal of Information Systems     Open Access   (Followers: 4)
American Journal of Sensor Technology     Open Access   (Followers: 2)
Analog Integrated Circuits and Signal Processing     Hybrid Journal   (Followers: 15)
Animation Practice, Process & Production     Hybrid Journal   (Followers: 4)
Annals of Combinatorics     Hybrid Journal   (Followers: 3)
Annals of Data Science     Hybrid Journal   (Followers: 14)
Annals of Mathematics and Artificial Intelligence     Hybrid Journal   (Followers: 16)
Annals of Pure and Applied Logic     Open Access   (Followers: 4)
Annals of Software Engineering     Hybrid Journal   (Followers: 12)
Annual Reviews in Control     Hybrid Journal   (Followers: 7)
Anuario Americanista Europeo     Open Access  
Applicable Algebra in Engineering, Communication and Computing     Hybrid Journal   (Followers: 3)
Applied and Computational Harmonic Analysis     Full-text available via subscription  
Applied Artificial Intelligence: An International Journal     Hybrid Journal   (Followers: 17)
Applied Categorical Structures     Hybrid Journal   (Followers: 4)
Applied Clinical Informatics     Hybrid Journal   (Followers: 4)
Applied Computational Intelligence and Soft Computing     Open Access   (Followers: 16)
Applied Computer Systems     Open Access   (Followers: 6)
Applied Computing and Geosciences     Open Access   (Followers: 3)
Applied Mathematics and Computation     Hybrid Journal   (Followers: 31)
Applied Medical Informatics     Open Access   (Followers: 11)
Applied Numerical Mathematics     Hybrid Journal   (Followers: 4)
Applied Soft Computing     Hybrid Journal   (Followers: 13)
Applied Spatial Analysis and Policy     Hybrid Journal   (Followers: 5)
Applied System Innovation     Open Access   (Followers: 1)
Archive of Applied Mechanics     Hybrid Journal   (Followers: 4)
Archive of Numerical Software     Open Access  
Archives and Museum Informatics     Hybrid Journal   (Followers: 97)
Archives of Computational Methods in Engineering     Hybrid Journal   (Followers: 5)
arq: Architectural Research Quarterly     Hybrid Journal   (Followers: 7)
Array     Open Access   (Followers: 1)
Artifact : Journal of Design Practice     Open Access   (Followers: 8)
Artificial Life     Hybrid Journal   (Followers: 7)
Asian Journal of Computer Science and Information Technology     Open Access   (Followers: 3)
Asian Journal of Control     Hybrid Journal  
Asian Journal of Research in Computer Science     Open Access   (Followers: 4)
Assembly Automation     Hybrid Journal   (Followers: 2)
Automatic Control and Computer Sciences     Hybrid Journal   (Followers: 6)
Automatic Documentation and Mathematical Linguistics     Hybrid Journal   (Followers: 5)
Automatica     Hybrid Journal   (Followers: 13)
Automatika : Journal for Control, Measurement, Electronics, Computing and Communications     Open Access  
Automation in Construction     Hybrid Journal   (Followers: 8)
Balkan Journal of Electrical and Computer Engineering     Open Access  
Basin Research     Hybrid Journal   (Followers: 7)
Behaviour & Information Technology     Hybrid Journal   (Followers: 32)
BenchCouncil Transactions on Benchmarks, Standards, and Evaluations     Open Access   (Followers: 3)
Big Data and Cognitive Computing     Open Access   (Followers: 5)
Big Data Mining and Analytics     Open Access   (Followers: 10)
Biodiversity Information Science and Standards     Open Access   (Followers: 1)
Bioinformatics     Hybrid Journal   (Followers: 216)
Bioinformatics Advances : Journal of the International Society for Computational Biology     Open Access   (Followers: 1)
Biomedical Engineering     Hybrid Journal   (Followers: 11)
Biomedical Engineering and Computational Biology     Open Access   (Followers: 11)
Briefings in Bioinformatics     Hybrid Journal   (Followers: 43)
British Journal of Educational Technology     Hybrid Journal   (Followers: 93)
Bulletin of Taras Shevchenko National University of Kyiv. Series: Physics and Mathematics     Open Access  
c't Magazin fuer Computertechnik     Full-text available via subscription   (Followers: 1)
Cadernos do IME : Série Informática     Open Access  
CALCOLO     Hybrid Journal  
CALICO Journal     Full-text available via subscription   (Followers: 1)
Calphad     Hybrid Journal  
Canadian Journal of Electrical and Computer Engineering     Full-text available via subscription   (Followers: 14)
Catalysis in Industry     Hybrid Journal  
CCF Transactions on High Performance Computing     Hybrid Journal  
CCF Transactions on Pervasive Computing and Interaction     Hybrid Journal  
CEAS Space Journal     Hybrid Journal   (Followers: 6)
Cell Communication and Signaling     Open Access   (Followers: 3)
Central European Journal of Computer Science     Hybrid Journal   (Followers: 4)
CERN IdeaSquare Journal of Experimental Innovation     Open Access  
Chaos, Solitons & Fractals     Hybrid Journal   (Followers: 1)
Chaos, Solitons & Fractals : X     Open Access   (Followers: 1)
Chemometrics and Intelligent Laboratory Systems     Hybrid Journal   (Followers: 13)
ChemSusChem     Hybrid Journal   (Followers: 7)
China Communications     Full-text available via subscription   (Followers: 8)
Chinese Journal of Catalysis     Full-text available via subscription   (Followers: 2)
Chip     Full-text available via subscription   (Followers: 2)
Ciencia     Open Access  
CIN : Computers Informatics Nursing     Hybrid Journal   (Followers: 11)
Circuits and Systems     Open Access   (Followers: 16)
CLEI Electronic Journal     Open Access  
Clin-Alert     Hybrid Journal   (Followers: 1)
Clinical eHealth     Open Access  
Cluster Computing     Hybrid Journal   (Followers: 1)
Cognitive Computation     Hybrid Journal   (Followers: 2)
Cognitive Computation and Systems     Open Access  
COMBINATORICA     Hybrid Journal  
Combinatorics, Probability and Computing     Hybrid Journal   (Followers: 4)
Combustion Theory and Modelling     Hybrid Journal   (Followers: 18)
Communication Methods and Measures     Hybrid Journal   (Followers: 12)
Communication Theory     Hybrid Journal   (Followers: 29)
Communications in Algebra     Hybrid Journal   (Followers: 1)
Communications in Partial Differential Equations     Hybrid Journal   (Followers: 2)
Communications of the ACM     Full-text available via subscription   (Followers: 59)
Communications of the Association for Information Systems     Open Access   (Followers: 15)
Communications on Applied Mathematics and Computation     Hybrid Journal   (Followers: 1)
COMPEL: The International Journal for Computation and Mathematics in Electrical and Electronic Engineering     Hybrid Journal   (Followers: 4)
Complex & Intelligent Systems     Open Access   (Followers: 1)
Complex Adaptive Systems Modeling     Open Access  
Complex Analysis and Operator Theory     Hybrid Journal   (Followers: 2)
Complexity     Hybrid Journal   (Followers: 8)
Computación y Sistemas     Open Access  
Computation     Open Access   (Followers: 1)
Computational and Applied Mathematics     Hybrid Journal   (Followers: 3)
Computational and Mathematical Methods     Hybrid Journal  
Computational and Mathematical Methods in Medicine     Open Access   (Followers: 2)
Computational and Mathematical Organization Theory     Hybrid Journal   (Followers: 1)
Computational and Structural Biotechnology Journal     Open Access   (Followers: 1)
Computational and Theoretical Chemistry     Hybrid Journal   (Followers: 11)
Computational Astrophysics and Cosmology     Open Access   (Followers: 6)
Computational Biology and Chemistry     Hybrid Journal   (Followers: 13)
Computational Biology Journal     Open Access   (Followers: 6)
Computational Brain & Behavior     Hybrid Journal   (Followers: 1)
Computational Chemistry     Open Access   (Followers: 3)
Computational Communication Research     Open Access   (Followers: 1)
Computational Complexity     Hybrid Journal   (Followers: 5)
Computational Condensed Matter     Open Access   (Followers: 1)

Bioinformatics Advances : Journal of the International Society for Computational Biology
Number of Followers: 1  

  This is an Open Access journal
ISSN (Online) 2635-0041
Published by Oxford University Press  [419 journals]
  • Pollock: fishing for cell states

    • Authors: Storrs E; Zhou D, Wendl M, et al.
      Abstract: Motivation: The use of single-cell methods is expanding at an ever-increasing rate. While there are established algorithms that address cell classification, they are limited in terms of cross platform compatibility, reliance on the availability of a reference dataset and classification interpretability. Here, we introduce Pollock, a suite of algorithms for cell type identification that is compatible with popular single-cell methods and analysis platforms, provides a set of pretrained human cancer reference models, and reports interpretability scores that identify the genes that drive cell type classifications. Results: Pollock performs comparably to existing classification methods, while offering easily deployable pretrained classification models across a wide variety of tissue and data types. Additionally, it demonstrates utility in immune pan-cancer analysis. Availability and implementation: Source code and documentation are available at [...]. Pretrained models and datasets are available for download at [...]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Fri, 13 May 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac028
      Issue No: Vol. 2, No. 1 (2022)
  • Iris Antes 1969–2021

    • Authors: Kapurniotu A; Lengauer T.
      Abstract: Dr Iris Antes, a professor at the Technical University of Munich, passed away on August 4, 2021 in Murnau near Munich. She was a productive scientist, a great mentor and an engaged activist for the causes of our field. Her untimely early death fills her friends, colleagues, mentees and students with great sadness.
      PubDate: Wed, 04 May 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac024
      Issue No: Vol. 2, No. 1 (2022)
  • MAGI-MS: multiple seed-centric module discovery

    • Authors: Chow J; Zhou R, Hormozdiari F, et al.
      Abstract: Summary: Complex disorders manifest by the interaction of multiple genetic and environmental factors. Through the construction of genetic modules that consist of highly coexpressed genes, it is possible to identify genes that participate in common biological pathways relevant to specific phenotypes. We have previously developed tools MAGI and MAGI-S for genetic module discovery by incorporating coexpression and protein interaction networks. Here, we introduce an extension to MAGI-S, denoted as Merging Affected Genes into Integrated Networks—Multiple Seeds (MAGI-MS), which permits the user to further specify a disease pathway of interest by selecting multiple seed genes likely to function in the same molecular mechanism. By providing MAGI-MS with seed genes involved in processes underlying certain classes of neurodevelopmental disorders, such as epilepsy, we demonstrate that MAGI-MS can reveal modules enriched in genes relevant to chemical synaptic transmission, glutamatergic synapse and other functions associated with the provided seed genes. Availability and implementation: MAGI-MS is free and available at [...]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Fri, 29 Apr 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac025
      Issue No: Vol. 2, No. 1 (2022)
  • ODEbase: a repository of ODE systems for systems biology

    • Authors: Lüders C; Sturm T, Radulescu O, et al.
      Abstract: Summary: Recently, symbolic computation and computer algebra systems have been successfully applied in systems biology, especially in chemical reaction network theory. One advantage of symbolic computation is its potential for qualitative answers to biological questions. Qualitative methods analyze dynamical input systems as formal objects, in contrast to investigating only part of the state space, as is the case with numerical simulation. However, corresponding tools and libraries have a different set of requirements for their input data than their numerical counterparts. A common format used in mathematical modeling of biological processes is Systems Biology Markup Language (SBML). We illustrate that the use of SBML data in symbolic computation requires significant pre-processing, incorporating external biological and mathematical expertise. ODEbase provides suitable input data derived from established existing biomodels, covering in particular the BioModels database. Availability and implementation: ODEbase is available free of charge at [...].
      PubDate: Tue, 26 Apr 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac027
      Issue No: Vol. 2, No. 1 (2022)
  • Federated horizontally partitioned principal component analysis for
           biomedical applications

    • Authors: Hartebrodt A; Röttger R, Lengauer T.
      Abstract: Motivation: Federated learning enables privacy-preserving machine learning in the medical domain because the sensitive patient data remain with the owner and only parameters are exchanged between the data holders. The federated scenario introduces specific challenges related to the decentralized nature of the data, such as batch effects and differences in study population between the sites. Here, we investigate the challenges of moving classical analysis methods to the federated domain, specifically principal component analysis (PCA), a versatile and widely used tool, often serving as an initial step in machine learning and visualization workflows. We provide implementations of different federated PCA algorithms and evaluate them regarding their accuracy for high-dimensional biological data using realistic sample distributions over multiple data sites, and their ability to preserve downstream analyses. Results: Federated subspace iteration converges to the centralized solution even for unfavorable data distributions, while approximate methods introduce error. Larger sample sizes at the study sites lead to better accuracy of the approximate methods. Approximate methods may be sufficient for coarse data visualization, but are vulnerable to outliers and batch effects. Before the analysis, the PCA algorithm, as well as the number of eigenvectors should be considered carefully to avoid unnecessary communication overhead. Availability and implementation: Simulation code and notebooks for federated PCA can be found at [...]; the code for the federated app is available at [...]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Tue, 26 Apr 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac026
      Issue No: Vol. 2, No. 1 (2022)
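
The abstract above states that federated subspace iteration converges to the centralized solution. The following is a minimal sketch of why that can hold, not the authors' implementation. It assumes that "horizontally partitioned" means each site holds different samples over the same features, and that the data are already centered. All function names and the toy data are hypothetical. Each site sends only the small matrix X_i^T (X_i V); because the global X^T X V is the sum of those per-site terms, the aggregated update matches the one computed on pooled data.

```python
import random

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]

def orthonormalize(v):
    """Gram-Schmidt on the columns of a (features x k) matrix."""
    basis = []
    for col in transpose(v):
        for q in basis:
            proj = sum(c * qi for c, qi in zip(col, q))
            col = [c - proj * qi for c, qi in zip(col, q)]
        norm = sum(c * c for c in col) ** 0.5
        basis.append([c / norm for c in col])
    return transpose(basis)

def local_update(x_site, v):
    """Runs at one site: X_i^T (X_i V). Only this (features x k) matrix is shared."""
    return matmul(transpose(x_site), matmul(x_site, v))

def federated_subspace_iteration(sites, k, iterations=200, seed=0):
    """Aggregator loop: sum the site updates, re-orthonormalize, repeat."""
    rng = random.Random(seed)
    d = len(sites[0][0])
    v = orthonormalize([[rng.gauss(0, 1) for _ in range(k)] for _ in range(d)])
    for _ in range(iterations):
        updates = [local_update(x, v) for x in sites]
        total = [[sum(u[i][j] for u in updates) for j in range(k)] for i in range(d)]
        v = orthonormalize(total)
    return v

# Hypothetical, already-centered toy data split across three sites (rows = samples).
sites = [
    [[3.0, 3.0, 0.0], [-3.0, -3.0, 0.0]],
    [[1.0, -1.0, 0.0], [-1.0, 1.0, 0.0], [0.0, 0.0, 0.5]],
    [[0.0, 0.0, -0.5]],
]
components = federated_subspace_iteration(sites, k=2)

# Reference run on the pooled data, treated as a single site.
pooled = federated_subspace_iteration([sum(sites, [])], k=2)
```

In this toy case the leading direction is (1, 1, 0)/sqrt(2) up to sign, and `components` agrees with `pooled` to within floating-point rounding. A real deployment would also need a federated centering step and a convergence criterion, both omitted here.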
  • Prediction of RNA–protein interactions using a nucleotide language

    • Authors: Yamada K; Hamada M, Arighi C.
      Abstract: Motivation: The accumulation of sequencing data has enabled researchers to predict the interactions between RNA sequences and RNA-binding proteins (RBPs) using novel machine learning techniques. However, existing models are often difficult to interpret and require additional information to sequences. Bidirectional encoder representations from transformer (BERT) is a language-based deep learning model that is highly interpretable. Therefore, a model based on BERT architecture can potentially overcome such limitations. Results: Here, we propose BERT-RBP as a model to predict RNA–RBP interactions by adapting the BERT architecture pretrained on a human reference genome. Our model outperformed state-of-the-art prediction models using the eCLIP-seq data of 154 RBPs. The detailed analysis further revealed that BERT-RBP could recognize both the transcript region type and RNA secondary structure only based on sequence information. Overall, the results provide insights into the fine-tuning mechanism of BERT in biological contexts and provide evidence of the applicability of the model to other RNA-related problems. Availability and implementation: Python source codes are freely available at [...]. The datasets underlying this article were derived from sources in the public domain: RBPsuite ([...]), Ensembl Biomart ([...]). Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Thu, 07 Apr 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac023
      Issue No: Vol. 2, No. 1 (2022)
  • Mining hidden knowledge: embedding models of cause–effect relationships
           curated from the biomedical literature

    • Authors: Krämer A; Green J, Billaud J, et al.
      Abstract: Motivation: We explore the use of literature-curated signed causal gene expression and gene–function relationships to construct unsupervised embeddings of genes, biological functions and diseases. Our goal is to prioritize and predict activating and inhibiting functional associations of genes and to discover hidden relationships between functions. As an application, we are particularly interested in the automatic construction of networks that capture relevant biology in a given disease context. Results: We evaluated several unsupervised gene embedding models leveraging literature-curated signed causal gene expression findings. Using linear regression, we show that, based on these gene embeddings, gene–function relationships can be predicted with about 95% precision for the highest scoring genes. Function embedding vectors, derived from parameters of the linear regression model, allow inference of relationships between different functions or diseases. We show for several diseases that gene and function embeddings can be used to recover key drivers of pathogenesis, as well as underlying cellular and physiological processes. These results are presented as disease-centric networks of genes and functions. To illustrate the applicability of our approach to other machine learning tasks, we also computed embeddings for drug molecules, which were then tested using a simple neural network to predict drug–disease associations. Availability and implementation: Python implementations of the gene and function embedding algorithms operating on a subset of our literature-curated content as well as other code used for this paper are made available as part of the Supplementary data. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Thu, 07 Apr 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac022
      Issue No: Vol. 2, No. 1 (2022)
  • LMPred: predicting antimicrobial peptides using pre-trained language
           models and deep learning

    • Authors: Dee W; Gromiha M.
      Abstract: Motivation: Antimicrobial peptides (AMPs) are increasingly being used in the development of new therapeutic drugs in areas such as cancer therapy and hypertension. Additionally, they are seen as an alternative to antibiotics due to the increasing occurrence of bacterial resistance. Wet-laboratory experimental identification, however, is both time-consuming and costly, so in silico models are now commonly used in order to screen new AMP candidates. Results: This paper proposes a novel approach for creating model inputs; using pre-trained language models to produce contextualized embeddings, representing the amino acids within each peptide sequence, before a convolutional neural network is trained as the classifier. The results were validated on two datasets—one previously used in AMP prediction research, and a larger independent dataset created by this paper. Predictive accuracies of 93.33% and 88.26% were achieved, respectively, outperforming previous state-of-the-art classification models. Availability and implementation: All codes are available and can be accessed here: [...]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Thu, 31 Mar 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac021
      Issue No: Vol. 2, No. 1 (2022)
  • RCX—an R package adapting the Cytoscape Exchange format for
           biological networks

    • Authors: Auer F; Kramer F, Kuijjer M.
      Abstract: Motivation: The Cytoscape Exchange (CX) format is a JSON-based data structure designed for the transmission of biological networks using standard web technologies. It was developed by the network data exchange, which itself serves as online commons to share and collaborate on biological networks. Furthermore, the Cytoscape software for the analysis and visualization of biological networks contributes structure elements to capture the visual layout within the CX format. However, there is a fundamental difference between data handling in web standards and R. A manual conversion requires detailed knowledge of the CX format to reproduce and work with the networks. Results: Here, we present a software package to create, handle, validate, visualize and convert networks in CX format to standard data types and objects within R. Networks in this format can serve as a source for biological knowledge and also capture the results of the analysis of those while preserving the visual layout across all platforms. The RCX package connects the R environment for statistical computing with outside platforms for storage and collaboration, as well as further analysis and visualization of biological networks. Availability: RCX is a free and open-source R package, available on Bioconductor from release 3.15 ([...]) and via GitHub ([...]). Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Thu, 31 Mar 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac020
      Issue No: Vol. 2, No. 1 (2022)
  • FlexDotPlot: a universal and modular dot plot visualization tool for
           complex multifaceted data

    • Authors: Leonard S; Lardenois A, Tarte K, et al.
      Abstract: Motivation: Dot plots are heatmap-like charts that provide a compact way to simultaneously display two quantitative information by means of dots of different sizes and colors. Despite the popularity of this visualization method, particularly in single-cell RNA-sequencing (scRNA-seq) studies, existing tools used to make dot plots are limited in terms of functionality and usability. Results: We developed FlexDotPlot, an R package for generating dot plots from multifaceted data, including scRNA-seq data. It provides a universal and easy-to-use solution with a high versatility. An interactive R Shiny application is also available allowing non-R users to easily generate dot plots with several tunable parameters. Availability and implementation: Source code and detailed manual are available on CRAN (stable version) and at [...] (development version). Code to reproduce figures is available at [...]. A Shiny app is available as a stand-alone application within the package. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Wed, 23 Mar 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac019
      Issue No: Vol. 2, No. 1 (2022)
  • LIQUORICE: detection of epigenetic signatures in liquid biopsies based on
           whole-genome sequencing data

    • Authors: Peneder P; Bock C, Tomazou E, et al.
      Abstract: Summary: Fragmentation patterns of cell-free DNA reflect the chromatin structure of the cells from which these fragments are derived. Nucleosomes protect the DNA from fragmentation, resulting in decreased sequencing coverage in regions of open chromatin. LIQUORICE is a user-friendly software tool that takes aligned whole-genome sequencing data as input and calculates bias-corrected coverage signatures for predefined, application-specific sets of genomic regions. The tool thereby enables a blood-based analysis of cell death in the body, and it provides a minimally invasive assessment of tumor chromatin states and cell-of-origin. With user-defined sets of regions that exhibit tissue-specific or disease-specific open chromatin, LIQUORICE can be applied to a wide range of detection, classification and quantification tasks in the analysis of liquid biopsies. Availability and implementation: LIQUORICE is freely and openly available as a Python package and command-line tool for UNIX-based systems from bioconda. Documentation, examples and usage instructions are provided at [...]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Wed, 23 Mar 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac017
      Issue No: Vol. 2, No. 1 (2022)
  • decoupleR: ensemble of computational methods to infer biological
           activities from omics data

    • Authors: Badia-i-Mompel P; Vélez Santiago J, Braunger J, et al.
      Abstract: Summary: Many methods allow us to extract biological activities from omics data using information from prior knowledge resources, reducing the dimensionality for increased statistical power and better interpretability. Here, we present decoupleR, a Bioconductor and Python package containing computational methods to extract these activities within a unified framework. decoupleR allows us to flexibly run any method with a given resource, including methods that leverage mode of regulation and weights of interactions, which are not present in other frameworks. Moreover, it leverages OmniPath, a meta-resource comprising over 100 databases of prior knowledge. Using decoupleR, we evaluated the performance of methods on transcriptomic and phospho-proteomic perturbation experiments. Our findings suggest that simple linear models and the consensus score across top methods perform better than other methods at predicting perturbed regulators. Availability and implementation: decoupleR’s open-source code is available in Bioconductor for R and on GitHub for Python. The code to reproduce the results is on GitHub and the data are on Zenodo. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Tue, 08 Mar 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac016
      Issue No: Vol. 2, No. 1 (2022)
  • AbAdapt: an adaptive approach to predicting antibody–antigen complex
           structures from sequence

    • Authors: Davila A; Xu Z, Li S, et al.
      Abstract: Motivation: The scoring of antibody–antigen docked poses starting from unbound homology models has not been systematically optimized for a large and diverse set of input sequences. Results: To address this need, we have developed AbAdapt, a webserver that accepts antibody and antigen sequences, models their 3D structures, predicts epitope and paratope, and then docks the modeled structures using two established docking engines (Piper and Hex). Each of the key steps has been optimized by developing and training new machine-learning models. The sequences from a diverse set of 622 antibody–antigen pairs with known structure were used as inputs for leave-one-out cross-validation. The final set of cluster representatives included at least one ‘Adequate’ pose for 550/622 (88.4%) of the queries. The median (interquartile range) rank of these ‘Adequate’ poses was 22 (5–77). Similar results were obtained on a holdout set of 100 unrelated antibody–antigen pairs. When epitopes were repredicted using docking-derived features for specific antibodies, the median ROC AUC increased from 0.679 to 0.720 in cross-validation and from 0.694 to 0.730 in the holdout set. Availability and implementation: AbAdapt and related data are available at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Mon, 07 Mar 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac015
      Issue No: Vol. 2, No. 1 (2022)
  • Shuffle & untangle: novel untangle methods for solving the
           tanglegram layout problem

    • Authors: Nguyen N; Chawshin K, Berg C, et al.
      Abstract: Motivation: A tanglegram is a plot of two tree-like diagrams, one facing the other, with their labels connected by inter-tree edges. These two trees, which could be either phylogenetic trees or dendrograms stemming from hierarchical clusterings, thus have identically labelled leaves but different topologies. As a result, the inter-tree edges of a tanglegram can be intricately tangled and difficult for human readers to analyse and explain. To better visualize the tanglegram (and thus compare the two dendrograms) one may try to untangle it, i.e. search for the series of flips of the various branches of the two trees that minimizes the number of crossings among the inter-tree edges. The untanglement problem has received significant interest in the past decade, and several techniques have been proposed to address it. These techniques are computationally efficient but tend to fail at finding the globally optimal configuration, i.e. the least tangled tanglegram. Results: We leverage the existing results to propose untanglement methods that converge more slowly overall than those in the literature, but that produce tanglegrams with lower entanglement. Availability and implementation: One of the algorithms is implemented in Python, and available from [URL missing].
      PubDate: Mon, 28 Feb 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac014
      Issue No: Vol. 2, No. 1 (2022)
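
The objective described in the tanglegram abstract above, the number of crossings among inter-tree edges, can be illustrated with a minimal sketch. It assumes each tree is reduced to its left-to-right leaf order and that every label appears exactly once on each side; the function name `count_crossings` and the example labels are hypothetical and are not taken from the authors' implementation.

```python
from itertools import combinations


def count_crossings(left_order, right_order):
    """Count crossings among edges joining identically labelled leaves."""
    # Position of every label in the right-hand tree's leaf order.
    right_pos = {label: i for i, label in enumerate(right_order)}
    # Right-hand positions, read in the left-hand tree's leaf order.
    seq = [right_pos[label] for label in left_order]
    # Two edges cross exactly when their relative order differs between
    # the two sides, so every inversion in seq is one crossing.
    return sum(1 for a, b in combinations(seq, 2) if a > b)


left = ["A", "B", "C", "D"]
right = ["B", "A", "D", "C"]
print(count_crossings(left, right))  # 2
```

An untangling method would search over the leaf orders reachable by flipping branches and keep the one with the smallest count. This quadratic pairwise check is chosen for readability only.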
  • Gene and drug landing page aggregator

    • Authors: Clarke D; Kuleshov M, Xie Z, et al.
      Abstract: Motivation: Many biological and biomedical researchers commonly search for information about genes and drugs to gather knowledge from these resources. For the most part, such information is served as landing pages in disparate data repositories and web portals. Results: The Gene and Drug Landing Page Aggregator (GDLPA) provides users with access to 50 gene-centric and 19 drug-centric repositories, enabling them to retrieve landing pages corresponding to their gene and drug queries. Bringing these resources together into one dashboard that directs users to the landing pages across many resources can help centralize gene- and drug-centric knowledge, as well as raise awareness of available resources that may be missed when using standard search engines. To demonstrate the utility of GDLPA, case studies for the gene klotho and the drug remdesivir were developed. The first case study highlights the potential role of klotho as a drug target for aging and kidney disease, while the second study gathers knowledge regarding approval, usage, and safety for remdesivir, the first approved coronavirus disease 2019 therapeutic. Finally, based on our experience, we provide guidelines for developing effective landing pages for genes and drugs. Availability and implementation: GDLPA is open source and is available from: [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Mon, 28 Feb 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac013
      Issue No: Vol. 2, No. 1 (2022)
  • KaruBioNet: a network and discussion group for a better collaboration and
           structuring of bioinformatics in Guadeloupe (French West Indies)

    • Authors: Couvin D; Dereeper A, Meyer D, et al.
      Abstract: Summary: Sequencing and other biological data are now more frequently available and at a lower price. Mutual tools and strategies are needed to analyze the huge amount of heterogeneous data generated by several research teams and devices. Bioinformatics represents a growing field in the scientific community globally. This multidisciplinary field provides a great number of tools and methods that can be used to conduct scientific studies in a more strategic way. Coordinated actions and collaborations are needed to find more innovative and accurate methods for a better understanding of real-life data. A wide variety of organizations are contributing to KaruBioNet in Guadeloupe (French West Indies), a Caribbean archipelago. The purpose of this group is to foster collaboration and mutual aid among people from different disciplines using a ‘one health’ approach, for better comprehension and surveillance of the health and diseases of humans, plants and animals. The KaruBioNet network particularly aims to help researchers in their studies related to ‘omics’ data, but also with more general aspects of biological data analysis. This transdisciplinary network is a platform for discussion, sharing, training and support between scientists interested in bioinformatics and related fields. Starting from a little archipelago in the Caribbean, we envision facilitating exchange between other Caribbean partners in the future, knowing that the Caribbean is a region with non-negligible biodiversity which should be preserved and protected. Joining forces with other Caribbean countries or territories would strengthen scientific collaborative impact in the region. Information related to this network can be found at: [URL missing]. Furthermore, a dedicated ‘Galaxy KaruBioNet’ platform is available at: [URL missing]. Availability and implementation: Information about KaruBioNet is available at: [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Fri, 18 Feb 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac010
      Issue No: Vol. 2, No. 1 (2022)
  • scMoC: single-cell multi-omics clustering

    • Authors: Eltager M; Abdelaal T, Mahfouz A, et al.
      Abstract: Motivation: Single-cell multi-omics assays simultaneously measure different molecular features from the same cell. A key question is how to benefit from the complementary data available and perform cross-modal clustering of cells. Results: We propose Single-Cell Multi-omics Clustering (scMoC), an approach to identify cell clusters from data with comeasurements of scRNA-seq and scATAC-seq from the same cell. We overcome the high sparsity of the scATAC-seq data by using an imputation strategy that exploits the less-sparse scRNA-seq data available from the same cell. Subsequently, scMoC identifies clusters of cells by merging clusterings derived from both data domains individually. We tested scMoC on datasets generated using different protocols with variable data sparsity levels. We show that scMoC (i) is able to generate informative scATAC-seq data due to its RNA-guided imputation strategy and (ii) results in integrated clusters based on both RNA and ATAC information that are biologically meaningful either from the RNA or from the ATAC perspective. Availability and implementation: The data used in this manuscript is publicly available, and we refer to the original manuscript for their description and availability. For convenience, sci-CAR data is available at NCBI GEO under accession number GSE117089. SNARE-seq data is available at NCBI GEO under accession number GSE126074. The 10X multiome data is available at the following link: [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Tue, 15 Feb 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac011
      Issue No: Vol. 2, No. 1 (2022)
  • Compositional Data Analysis using Kernels in mass cytometry data

    • Authors: Rudra P; Baxter R, Hsieh E, et al.
      Abstract: Motivation: Cell-type abundance data arising from mass cytometry experiments are compositional in nature. Classical association tests do not apply to compositional data due to their non-Euclidean nature. Existing methods for analysis of cell-type abundance data suffer from several limitations for high-dimensional mass cytometry data, especially when the sample size is small. Results: We proposed a new multivariate statistical learning methodology, Compositional Data Analysis using Kernels (CODAK), based on the kernel distance covariance (KDC) framework to test the association of the cell-type compositions with important predictors (categorical or continuous) such as disease status. CODAK scales well for high-dimensional data and provides satisfactory performance for small sample sizes (n < 25). We conducted simulation studies to compare the performance of the method with existing methods of analyzing cell-type abundance data from mass cytometry studies. The method is also applied to a high-dimensional dataset containing different subgroups of populations including Systemic Lupus Erythematosus (SLE) patients and healthy control subjects. Availability and implementation: CODAK is implemented using R. The codes and the data used in this manuscript are available on the web at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Fri, 11 Feb 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac003
      Issue No: Vol. 2, No. 1 (2022)
  • SangeR: the high-throughput Sanger sequencing analysis pipeline

    • Authors: Schmid K; Dohmen H, Ritschel N, et al.
      Abstract: Summary: In the era of next generation sequencing and beyond, the Sanger technique is still widely used for variant verification of inconclusive or ambiguous high-throughput sequencing results or as a low-cost molecular genetic analysis tool for single targets in many fields of study. Many analysis steps need time-consuming manual intervention. Therefore, we present here a pipeline-capable high-throughput solution with an optional Shiny web interface that provides a binary mutation decision of hotspots together with plotted chromatograms including annotations via flat files. Availability and implementation: SangeR is freely available at [URLs missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Mon, 31 Jan 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac009
      Issue No: Vol. 2, No. 1 (2022)
  • qmotif: determination of telomere content from whole-genome sequence data

    • Authors: Holmes O; Nones K, Tang Y, et al.
      Abstract: Motivation: Changes in telomere length have been observed in cancer and can be indicative of mechanisms involved in carcinogenesis. Most methods used to estimate telomere length require laboratory analysis of DNA samples. Here, we present qmotif, a fast and easy tool that determines telomeric repeat sequence content as an estimate of telomere length directly from whole-genome sequencing. Results: qmotif shows similar results to quantitative PCR, the standard method for high-throughput clinical telomere length quantification. qmotif output correlates strongly with the output of other tools for determining telomere sequence content, TelSeq and TelomereHunter, but can run in a fraction of the time, usually under a minute. Availability and implementation: qmotif is implemented in Java and source code is available at [URL missing], with instructions on how to build and use the application available from [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Mon, 31 Jan 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac005
      Issue No: Vol. 2, No. 1 (2022)
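
The idea in the qmotif abstract above, using telomeric repeat content in sequencing reads as a proxy for telomere length, can be illustrated with a small sketch. This is not qmotif's algorithm (qmotif is a Java tool whose internals are not described here); the function name `telomeric_read_fraction`, the threshold `min_repeats` and the toy reads are assumptions made for illustration. The only biological fact relied on is that the canonical human telomeric repeat is TTAGGG, with reverse complement CCCTAA.

```python
import re


def telomeric_read_fraction(reads, min_repeats=3):
    """Fraction of reads with at least min_repeats consecutive telomeric hexamers."""
    # Canonical human telomeric repeat and its reverse complement.
    pattern = re.compile(
        f"(?:TTAGGG){{{min_repeats},}}|(?:CCCTAA){{{min_repeats},}}"
    )
    if not reads:
        return 0.0
    hits = sum(1 for read in reads if pattern.search(read.upper()))
    return hits / len(reads)


reads = [
    "TTAGGGTTAGGGTTAGGGAC",   # three forward repeats: counted
    "ACGTACGTACGT",           # no repeats
    "CCCTAACCCTAACCCTAA",     # three reverse-complement repeats: counted
    "TTAGGGTTAGGGAC",         # only two repeats: below the threshold
]
print(telomeric_read_fraction(reads))  # 0.5
```

A real tool would read aligned reads from a BAM file and normalize by coverage; the sketch only shows the motif-counting step on plain strings.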
  • MODalyseR—a novel software for inference of disease module hub
           regulators identified a putative multiple sclerosis regulator supported by
           independent eQTL data

    • Authors: de Weerd H; Åkesson J, Guala D, et al.
      Abstract: Motivation: Network-based disease modules have proven to be a powerful concept for extracting knowledge about disease mechanisms, predicting, for example, disease risk factors and side effects of treatments. Plenty of tools exist for the purpose of module inference, but less effort has been put into simultaneously utilizing knowledge about regulatory mechanisms for predicting disease module hub regulators. Results: We developed MODalyseR, a novel software for identifying disease module regulators and reducing modules to the most disease-associated genes. This pipeline integrates and extends the previously published software packages MODifieR and ComHub and thereby provides a user-friendly network medicine framework combining the concepts of disease modules and hub regulators for precise disease gene identification from transcriptomics data. To demonstrate the usability of the tool, we designed a case study for multiple sclerosis that revealed IKZF1 as a promising hub regulator, which was supported by independent ChIP-seq data. Availability and implementation: MODalyseR is available as a Docker image at [URL missing] with user guide and installation instructions found at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Tue, 25 Jan 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac006
      Issue No: Vol. 2, No. 1 (2022)
  • From pairwise to multiple spliced alignment

    • Authors: Jammali S; Djossou A, Ouédraogo W, et al.
      Abstract: Motivation: Alternative splicing is a ubiquitous process in eukaryotes that allows distinct transcripts to be produced from the same gene. Yet, the study of transcript evolution within a gene family is still in its infancy. One prerequisite for this study is the availability of methods to compare sets of transcripts while accounting for their splicing structure. In this context, we generalize the concept of pairwise spliced alignments (PSpAs) to multiple spliced alignments (MSpAs). MSpAs have several important purposes in addition to empowering the study of the evolution of transcripts. For instance, they are key to improving the prediction of gene models, which is important for solving the growing problem of genome annotation. Despite this essential role, a formal definition of the concept and methods to compute MSpAs are still lacking. Results: We introduce the MSpA problem and the SplicedFamAlignMulti (SFAM) method to compute the MSpA of a gene family. Like most multiple sequence alignment (MSA) methods, which are generally greedy heuristics that assemble pairwise alignments, SFAM combines all PSpAs of coding DNA sequences and gene sequences of a gene family into an MSpA. It produces a single structure that represents the superstructure and models of the gene family. Using real vertebrate and simulated gene family data, we illustrate the utility of SFAM for computing accurate gene family superstructures and MSAs, inferring splicing orthologous groups and improving gene-model annotations. Availability and implementation: The supporting data and implementation of SFAM are freely available at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Wed, 05 Jan 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbab044
      Issue No: Vol. 2, No. 1 (2022)
  • Hagit Shatkay-Reshef 1965–2022

    • Authors: Arighi C.
      Abstract: Professor Hagit Shatkay in 2012. Courtesy of Kathy F. Atkinson/University of Delaware.
      PubDate: Fri, 04 Mar 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac012
  • N6-methyladenosine enhances post-transcriptional gene regulation by

    • Authors: Kanoria S; Rennie W, Carmack C, et al.
      Abstract: Motivation: N6-methyladenosine (m6A) is the most prevalent modification in eukaryotic messenger RNAs. MicroRNAs (miRNAs) are abundant post-transcriptional regulators of gene expression. Correlation between m6A and miRNA-targeting sites has been reported to suggest possible involvement of m6A in miRNA-mediated gene regulation. However, it is unknown what the regulatory effects might be. In this study, we performed comprehensive analyses of high-throughput data on m6A and miRNA target binding and regulation. Results: We found that the level of miRNA-mediated target suppression is significantly enhanced when m6A is present on target mRNAs. The evolutionary conservation for miRNA-binding sites with m6A modification is significantly higher than that for miRNA-binding sites without modification. These findings suggest functional significance of m6A modification in post-transcriptional gene regulation by miRNAs. We also found that methylated targets have more stable structure than non-methylated targets, as indicated by significantly higher GC content. Furthermore, miRNA-binding sites that can be potentially methylated are significantly less accessible without methylation than those that do not possess potential methylation sites. Since either RNA-binding proteins or m6A modification by itself can destabilize RNA structure, we propose a model in which m6A alters local target secondary structure to increase accessibility for efficient binding by Argonaute proteins, leading to enhanced miRNA-mediated regulation. Availability and implementation: N/A.
      PubDate: Tue, 18 Jan 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbab046
  • MSPypeline: a python package for streamlined data analysis of mass
           spectrometry-based proteomics

    • Authors: Heming S; Hansen P, Vlasov A, et al.
      Abstract: Summary: Mass spectrometry-based proteomics is increasingly employed in biology and medicine. To generate reliable information from large datasets and ensure comparability of results, it is crucial to implement and standardize the quality control of the raw data, the data processing steps and the statistical analyses. MSPypeline provides a platform for importing MaxQuant output tables, generating quality control reports, data preprocessing including normalization and performing exploratory analyses by statistical inference plots. These standardized steps assess data quality, provide customizable figures and enable the identification of differentially expressed proteins to reach biologically relevant conclusions. Availability and implementation: The source code is available under the MIT license at [URL missing] with documentation at [URL missing]. Benchmark mass spectrometry data are available on ProteomeXchange (PXD025792). Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Mon, 17 Jan 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac004
  • Biclique extension as an effective approach to identify missing links in
           metabolic compound–protein interaction networks

    • Authors: Thieme S; Walther D, Kuijjer M.
      Abstract: Motivation: Metabolic networks are complex systems of chemical reactions proceeding via physical interactions between metabolites and proteins. We aimed to predict previously unknown compound–protein interactions (CPI) in metabolic networks by applying biclique extension, a network-structure-based prediction method. Results: We developed a workflow, named BiPredict, to predict CPIs based on biclique extension and applied it to Escherichia coli and human using their respective known CPI networks as input. Depending on the chosen biclique size and using a STITCH-derived E. coli CPI network as input, a sensitivity of 39% and an associated precision of 59% were reached. For the larger human STITCH network, a sensitivity of 78% with a false-positive rate of <5% and precision of 75% was obtained. High performance was also achieved when using KEGG metabolic-reaction networks as input. Prediction performance significantly exceeded that of randomized controls and compared favorably to state-of-the-art deep-learning methods. Regarding metabolic process involvement, TCA-cycle and ribosomal processes were found enriched among predicted interactions. BiPredict can be used for network curation, may help increase the efficiency of experimental testing of CPIs, and can readily be applied to other species. Availability and implementation: BiPredict and related datasets are available at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Wed, 12 Jan 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbac001
  • Folding the unfoldable: using AlphaFold to explore spurious proteins

    • Authors: Monzon V; Haft D, Bateman A, et al.
      Abstract: Motivation: The release of AlphaFold 2.0 has revolutionized our ability to determine protein structures from sequences. This tool also inadvertently opens up many unanticipated opportunities. In this article, we investigate the AntiFam resource, which contains 250 protein sequence families that we believe to be spurious protein translations. We would not expect proteins belonging to these families to fold into well-ordered globular structures. To test this hypothesis, we have attempted to computationally determine the structure of a representative sequence from all AntiFam 6.0 families. Results: Although the large majority of families showed no evidence of globular structure, we have identified one example for which a globular structure is predicted. Proteins in this AntiFam entry indeed seem likely to be bona fide proteins, based on additional considerations, and thus AlphaFold provides a useful quality control for the AntiFam database. Conversely, known spurious proteins offer a useful set of quality controls for AlphaFold. We have identified a trend that the mean structure prediction confidence score pLDDT is higher for shorter sequences. Of the 131 AntiFam representative sequences <100 amino acids in length, AlphaFold predicts a mean pLDDT of 80 or greater for six of them. Thus, particular care should be taken when applying AlphaFold to short protein sequences. Availability and implementation: The AlphaFold predictions for representative sequences can be found at the following URL: [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Sun, 09 Jan 2022 00:00:00 GMT
      DOI: 10.1093/bioadv/vbab043
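
The mean pLDDT statistic discussed in the AlphaFold abstract above can be computed from a predicted structure with a short sketch. It relies on the fact that AlphaFold writes per-residue pLDDT into the B-factor column of its PDB output; everything else (the function names `mean_plddt` and `atom_line`, and the three-atom demo structure) is hypothetical and only serves to make the example self-contained. It is not the authors' analysis code.

```python
def mean_plddt(pdb_text):
    """Mean per-residue pLDDT, read from the B-factor column of C-alpha atoms."""
    scores = []
    for line in pdb_text.splitlines():
        # PDB fixed columns (1-based): atom name in 13-16, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    if not scores:
        raise ValueError("no C-alpha ATOM records found")
    return sum(scores) / len(scores)


def atom_line(serial, name, res, resseq, plddt):
    """Build a minimal fixed-width ATOM record (hypothetical demo helper)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} A{resseq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}")


demo = "\n".join([
    atom_line(1, "N", "MET", 1, 90.0),   # not a C-alpha atom: ignored
    atom_line(2, "CA", "MET", 1, 90.0),
    atom_line(3, "CA", "GLY", 2, 70.0),
])
print(mean_plddt(demo))  # 80.0
```

Taking one value per C-alpha atom gives a per-residue average, which is the quantity a threshold such as "mean pLDDT of 80 or greater" refers to.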
  • The developmentally dynamic microRNA transcriptome of Glossina pallidipes
           tsetse flies, vectors of animal trypanosomiasis

    • Authors: Naitore C; Villinger J, Kibet C, et al.
      Abstract: Summary: MicroRNAs (miRNAs) are single-stranded gene regulators of 18–25 bp in length. They play a crucial role in regulating several biological processes in insects. However, the functions of miRNA in Glossina pallidipes, one of the biological vectors of African animal trypanosomosis in sub-Saharan Africa, remain poorly characterized. We used a combination of both molecular biology and bioinformatics techniques to identify miRNA genes at different developmental stages (larvae, pupae, teneral and reproductive unmated adults, gravid females) and sexes of G. pallidipes. We identified 157 mature miRNA genes, including 12 novel miRNAs unique to G. pallidipes. Moreover, we identified 93 miRNA genes that were differentially expressed by sex and/or in specific developmental stages. By combining both miRanda and RNAhybrid algorithms, we identified 5550 of their target genes. Further analyses with Gene Ontology terms and KEGG pathways for these predicted target genes suggested that the miRNAs may be involved in key developmental biological processes. Our results provide the first repository of G. pallidipes miRNAs across developmental stages, some of which appear to play crucial roles in tsetse fly development. Hence, our findings provide a better understanding of tsetse biology and a baseline for exploring miRNA genes in tsetse flies. Availability and implementation: Raw sequence data are available from NCBI Sequence Read Archives (SRA) under Bioproject accession number PRJNA590626. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Tue, 28 Dec 2021 00:00:00 GMT
      DOI: 10.1093/bioadv/vbab047
  • LYRUS: a machine learning model for predicting the pathogenicity of
           missense variants

    • Authors: Lai J; Yang J, Gamsiz Uzun E, et al.
      Abstract: Summary: Single amino acid variations (SAVs) are a primary contributor to variations in the human genome. Identifying pathogenic SAVs can provide insights into the genetic architecture of complex diseases. Most approaches for predicting the functional effects or pathogenicity of SAVs rely on either sequence or structural information. This study presents 〈Lai Yang Rubenstein Uzun Sarkar〉 (LYRUS), a machine learning method that uses an XGBoost classifier to predict the pathogenicity of SAVs. LYRUS incorporates five sequence-based, six structure-based and four dynamics-based features. Uniquely, LYRUS includes a newly proposed sequence co-evolution feature called the variation number. LYRUS was trained using a dataset that contains 4363 protein structures corresponding to 22 639 SAVs from the ClinVar database, and tested using the VariBench testing dataset. Performance analysis showed that LYRUS achieved comparable performance to current variant effect predictors. LYRUS’s performance was also benchmarked against six Deep Mutational Scanning datasets for PTEN and TP53. Availability and implementation: LYRUS is freely available and the source code can be found at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Sat, 25 Dec 2021 00:00:00 GMT
      DOI: 10.1093/bioadv/vbab045
  • A random forest classifier for protein–protein docking models

    • Authors: Barradas-Bautista D; Cao Z, Vangone A, et al.
      Abstract: Herein, we present the results of a machine learning approach we developed to single out correct 3D docking models of protein–protein complexes obtained by popular docking software. To this aim, we generated 3×10^4 docking models for each of the 230 complexes in the protein–protein benchmark, version 5, using three different docking programs (HADDOCK, FTDock and ZDOCK), for a cumulative set of ≈7×10^6 docking models. Three different machine learning approaches (Random Forest, Support Vector Machine and Perceptron) were used to train classifiers with 158 different scoring functions (features). The Random Forest algorithm outperformed the other two algorithms and was selected for further optimization. Using a feature selection algorithm and optimizing the random forest hyperparameters allowed us to train and validate a random forest classifier, named COnservation Driven Expert System (CoDES). Testing of CoDES on independent datasets, as well as results of its comparative performance with machine learning methods recently developed in the field for the scoring of docking decoys, confirm its state-of-the-art ability to discriminate correct from incorrect decoys both in terms of global parameters and in terms of decoys ranked at the top positions. Supplementary information: Supplementary data are available at Bioinformatics Advances online. Software and data availability statement: The docking models are available at [URL missing]. The programs underlying this article will be shared on request to the corresponding authors.
      PubDate: Fri, 10 Dec 2021 00:00:00 GMT
      DOI: 10.1093/bioadv/vbab042
  • protti: an R package for comprehensive data analysis of peptide- and
           protein-centric bottom-up proteomics data

    • Authors: Quast J; Schuster D, Picotti P, et al.
      Abstract: Summary: We present a flexible, user-friendly R package called protti for comprehensive quality control, analysis and interpretation of quantitative bottom-up proteomics data. protti supports the analysis of protein-centric data such as those associated with protein expression analyses, as well as peptide-centric data such as those resulting from limited proteolysis-coupled mass spectrometry analysis. Due to its flexible design, it supports analysis of label-free, data-dependent, data-independent and targeted proteomics datasets. protti can be run on the output of any search engine and software package commonly used for bottom-up proteomics experiments such as Spectronaut, Skyline, MaxQuant or Proteome Discoverer, adequately exported to table format. Availability and implementation: protti is implemented as an open-source R package. Release versions are available via CRAN and work on all major operating systems. The development version is maintained on GitHub. Full documentation including examples is provided in the form of vignettes on our package website.
      PubDate: Fri, 10 Dec 2021 00:00:00 GMT
      DOI: 10.1093/bioadv/vbab041
  • wTAM: a web server for annotation of weighted human microRNAs

    • Authors: Cui C; Fan R, Zhou Y, et al.
      Abstract: It is well known that some microRNAs (miRNAs) are more important than others for life, hinting at a wide range of miRNA essentiality or importance. Functional enrichment analysis is a pervasive method for uncovering the underlying biological pathways for a given gene list, and several tools for miRNA set enrichment analysis have been developed. However, all those tools treat each miRNA equally and neglect the importance score of the miRNA itself, which could be an obstacle for researchers seeking more insightful biological processes. Here, we developed wTAM, a tool for annotation of weighted human miRNAs, introducing miRNA importance scores into enrichment analysis. In addition, the annotation repository has been enlarged compared to TAM. Finally, the case study demonstrated the availability and flexibility of wTAM. Availability and implementation: wTAM is freely available at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
      PubDate: Tue, 07 Dec 2021 00:00:00 GMT
      DOI: 10.1093/bioadv/vbab040
  • Cutevariant: a standalone GUI-based desktop application to explore genetic
           variations from an annotated VCF file

    • Authors: Schutz S; Monod-Broca C, Bourneuf L, et al.
      Abstract: Summary: Cutevariant is a graphical user interface (GUI)-based desktop application designed to filter variations from an annotated VCF file. The application imports data into a local SQLite database where complex filter queries can be built either from GUI controllers or using a domain-specific language called Variant Query Language. Cutevariant provides more features than existing applications and is fully customizable thanks to a complete plugin architecture. Availability and implementation: Cutevariant is distributed as multiplatform client-side software under an open-source license.
      PubDate: Thu, 25 Nov 2021 00:00:00 GMT
      DOI: 10.1093/bioadv/vbab028
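
The general approach the Cutevariant abstract describes (load variants into a local SQLite database, then filter them with queries) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `variants` table, its columns, the helper names `load_variants` and `filter_variants`, and the toy rows are hypothetical and do not reflect Cutevariant's actual schema or its Variant Query Language.

```python
import sqlite3


def load_variants(conn, rows):
    """Create a hypothetical 'variants' table and insert the given rows."""
    # Illustrative schema only; not Cutevariant's real database layout.
    conn.execute(
        "CREATE TABLE variants ("
        "chrom TEXT, pos INTEGER, ref TEXT, alt TEXT, gene TEXT, impact TEXT)"
    )
    conn.executemany("INSERT INTO variants VALUES (?, ?, ?, ?, ?, ?)", rows)
    conn.commit()


def filter_variants(conn, gene, impact):
    """Return (chrom, pos, ref, alt) for variants matching gene and impact."""
    cur = conn.execute(
        "SELECT chrom, pos, ref, alt FROM variants "
        "WHERE gene = ? AND impact = ? ORDER BY pos",
        (gene, impact),
    )
    return cur.fetchall()


# Toy rows standing in for annotated VCF records (made-up values).
conn = sqlite3.connect(":memory:")
load_variants(
    conn,
    [
        ("chr7", 1000, "A", "G", "GENE_A", "HIGH"),
        ("chr7", 2000, "C", "T", "GENE_A", "LOW"),
        ("chr17", 3000, "G", "A", "GENE_B", "HIGH"),
    ],
)
hits = filter_variants(conn, "GENE_A", "HIGH")
print(hits)
```

Parameterized queries (`?` placeholders) keep user-supplied filter values separate from the SQL text, which is the usual choice when filters are assembled from a GUI or a query language.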
  • MUNDO: protein function prediction embedded in a multispecies world

    • Authors: Arsenescu V; Devkota K, Erden M, et al.
      Abstract: Motivation: Leveraging cross-species information in protein function prediction can add significant power to network-based protein function prediction methods, because so much functional information is conserved across at least close scales of evolution. We introduce MUNDO, a new cross-species co-embedding method that combines a single-network embedding method with a co-embedding method to predict functional annotations in a target species, leveraging also functional annotations in a model species network. Results: Across a wide range of parameter choices, MUNDO performs best at predicting annotations in the mouse network, when trained on mouse and human protein–protein interaction (PPI) networks, in the human network, when trained on human and mouse PPIs, and in Baker’s yeast, when trained on Fission and Baker’s yeast, as compared to competitor methods. MUNDO also outperforms all the cross-species methods when predicting in Fission yeast when trained on Fission and Baker’s yeast; however, in this single case, discarding the information from the other species and using annotations from the Fission yeast network alone usually performs best. Availability and implementation: All code is available. Supplementary information: Supplementary data are available at Bioinformatics Advances online. Additional experimental results are on our GitHub site.
      PubDate: Wed, 29 Sep 2021 00:00:00 GMT
      DOI: 10.1093/bioadv/vbab025
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Tel: +00 44 (0)131 4513762
