Subjects -> BIOLOGY (Total: 3134 journals)
    - BIOCHEMISTRY (239 journals)
    - BIOENGINEERING (143 journals)
    - BIOLOGY (1491 journals)
    - BIOPHYSICS (53 journals)
    - BIOTECHNOLOGY (243 journals)
    - BOTANY (220 journals)
    - CYTOLOGY AND HISTOLOGY (32 journals)
    - ENTOMOLOGY (67 journals)
    - GENETICS (152 journals)
    - MICROBIOLOGY (265 journals)
    - MICROSCOPY (13 journals)
    - ORNITHOLOGY (26 journals)
    - PHYSIOLOGY (73 journals)
    - ZOOLOGY (117 journals)

BIOLOGY (1491 journals)

Showing 1 - 200 of 1720 Journals sorted alphabetically
AAPS Journal     Hybrid Journal   (Followers: 29)
ACS Pharmacology & Translational Science     Hybrid Journal   (Followers: 3)
ACS Synthetic Biology     Hybrid Journal   (Followers: 39)
Acta Biologica Hungarica     Full-text available via subscription   (Followers: 6)
Acta Biologica Marisiensis     Open Access   (Followers: 5)
Acta Biologica Sibirica     Open Access   (Followers: 2)
Acta Biologica Turcica     Open Access   (Followers: 2)
Acta Biomaterialia     Hybrid Journal   (Followers: 32)
Acta Biotheoretica     Hybrid Journal   (Followers: 3)
Acta Chiropterologica     Full-text available via subscription   (Followers: 6)
acta ethologica     Hybrid Journal   (Followers: 7)
Acta Fytotechnica et Zootechnica     Open Access   (Followers: 3)
Acta Ichthyologica et Piscatoria     Open Access   (Followers: 5)
Acta Médica Costarricense     Open Access   (Followers: 4)
Acta Scientiarum. Biological Sciences     Open Access   (Followers: 2)
Acta Scientifica Naturalis     Open Access   (Followers: 4)
Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis     Open Access   (Followers: 2)
Actualidades Biológicas     Open Access   (Followers: 1)
Advanced Biology     Hybrid Journal   (Followers: 2)
Advanced Health Care Technologies     Open Access   (Followers: 12)
Advanced Journal of Graduate Research     Open Access   (Followers: 2)
Advanced Membranes     Open Access   (Followers: 8)
Advanced Quantum Technologies     Hybrid Journal   (Followers: 5)
Advances in Biological Regulation     Hybrid Journal   (Followers: 4)
Advances in Biology     Open Access   (Followers: 16)
Advances in Biomarker Sciences and Technology     Open Access   (Followers: 2)
Advances in Biosensors and Bioelectronics     Open Access   (Followers: 8)
Advances in Cell Biology/ Medical Journal of Cell Biology     Open Access   (Followers: 28)
Advances in Ecological Research     Full-text available via subscription   (Followers: 47)
Advances in Environmental Sciences - International Journal of the Bioflux Society     Open Access   (Followers: 17)
Advances in Enzyme Research     Open Access   (Followers: 11)
Advances in High Energy Physics     Open Access   (Followers: 27)
Advances in Life Science and Technology     Open Access   (Followers: 14)
Advances in Life Sciences     Open Access   (Followers: 6)
Advances in Marine Biology     Full-text available via subscription   (Followers: 29)
Advances in Virus Research     Full-text available via subscription   (Followers: 9)
Adversity and Resilience Science : Journal of Research and Practice     Hybrid Journal   (Followers: 4)
African Journal of Ecology     Hybrid Journal   (Followers: 18)
African Journal of Range & Forage Science     Hybrid Journal   (Followers: 12)
AFRREV STECH : An International Journal of Science and Technology     Open Access   (Followers: 3)
Ageing Research Reviews     Hybrid Journal   (Followers: 13)
Aggregate     Open Access   (Followers: 3)
Aging Cell     Open Access   (Followers: 23)
Agrokémia és Talajtan     Full-text available via subscription   (Followers: 2)
AJP Cell Physiology     Hybrid Journal   (Followers: 14)
AJP Endocrinology and Metabolism     Hybrid Journal   (Followers: 14)
AJP Lung Cellular and Molecular Physiology     Hybrid Journal   (Followers: 3)
Al-Kauniyah : Jurnal Biologi     Open Access  
Alasbimn Journal     Open Access   (Followers: 1)
Alces : A Journal Devoted to the Biology and Management of Moose     Open Access  
Alfarama Journal of Basic & Applied Sciences     Open Access   (Followers: 12)
All Life     Open Access   (Followers: 1)
AMB Express     Open Access   (Followers: 1)
Ambix     Hybrid Journal   (Followers: 3)
American Journal of Agricultural and Biological Sciences     Open Access   (Followers: 7)
American Journal of Bioethics     Hybrid Journal   (Followers: 17)
American Journal of Human Biology     Hybrid Journal   (Followers: 19)
American Journal of Plant Sciences     Open Access   (Followers: 24)
American Journal of Primatology     Hybrid Journal   (Followers: 17)
American Naturalist     Full-text available via subscription   (Followers: 83)
Amphibia-Reptilia     Hybrid Journal   (Followers: 5)
Anaerobe     Hybrid Journal   (Followers: 3)
Analytical Methods     Hybrid Journal   (Followers: 7)
Analytical Science Advances     Open Access   (Followers: 2)
Anatomia     Open Access   (Followers: 16)
Anatomical Science International     Hybrid Journal   (Followers: 3)
Animal Cells and Systems     Hybrid Journal   (Followers: 4)
Animal Microbiome     Open Access   (Followers: 7)
Animal Models and Experimental Medicine     Open Access  
Annales françaises d'Oto-rhino-laryngologie et de Pathologie Cervico-faciale     Full-text available via subscription   (Followers: 2)
Annales Henri Poincaré     Hybrid Journal   (Followers: 2)
Annales Universitatis Mariae Curie-Sklodowska, sectio C – Biologia     Open Access   (Followers: 1)
Annals of Applied Biology     Hybrid Journal   (Followers: 7)
Annals of Biomedical Engineering     Hybrid Journal   (Followers: 18)
Annals of Human Biology     Hybrid Journal   (Followers: 6)
Annals of Science and Technology     Open Access   (Followers: 2)
Annual Research & Review in Biology     Open Access   (Followers: 1)
Annual Review of Biomedical Engineering     Full-text available via subscription   (Followers: 19)
Annual Review of Cell and Developmental Biology     Full-text available via subscription   (Followers: 40)
Annual Review of Food Science and Technology     Full-text available via subscription   (Followers: 13)
Annual Review of Genomics and Human Genetics     Full-text available via subscription   (Followers: 33)
Antibiotics     Open Access   (Followers: 12)
Antioxidants     Open Access   (Followers: 4)
Antonie van Leeuwenhoek     Hybrid Journal   (Followers: 3)
Anzeiger für Schädlingskunde     Hybrid Journal   (Followers: 1)
Apidologie     Hybrid Journal   (Followers: 4)
Apmis     Hybrid Journal   (Followers: 1)
APOPTOSIS     Hybrid Journal   (Followers: 5)
Applied Biology     Open Access  
Applied Bionics and Biomechanics     Open Access   (Followers: 4)
Applied Phycology     Open Access   (Followers: 1)
Applied Vegetation Science     Full-text available via subscription   (Followers: 9)
Aquaculture Environment Interactions     Open Access   (Followers: 7)
Aquaculture International     Hybrid Journal   (Followers: 25)
Aquaculture Reports     Open Access   (Followers: 3)
Aquaculture, Aquarium, Conservation & Legislation - International Journal of the Bioflux Society     Open Access   (Followers: 9)
Aquatic Biology     Open Access   (Followers: 9)
Aquatic Ecology     Hybrid Journal   (Followers: 45)
Aquatic Ecosystem Health & Management     Hybrid Journal   (Followers: 16)
Aquatic Science and Technology     Open Access   (Followers: 4)
Aquatic Toxicology     Hybrid Journal   (Followers: 26)
Arabian Journal of Scientific Research / المجلة العربية للبحث العلمي     Open Access  
Archaea     Open Access   (Followers: 3)
Archiv für Molluskenkunde: International Journal of Malacology     Full-text available via subscription   (Followers: 1)
Archives of Biological Sciences     Open Access  
Archives of Microbiology     Hybrid Journal   (Followers: 9)
Archives of Natural History     Hybrid Journal   (Followers: 9)
Archives of Oral Biology     Hybrid Journal   (Followers: 2)
Archives of Virology     Hybrid Journal   (Followers: 6)
Archivum Immunologiae et Therapiae Experimentalis     Hybrid Journal   (Followers: 2)
Arid Ecosystems     Hybrid Journal   (Followers: 2)
Arquivos do Museu Dinâmico Interdisciplinar     Open Access  
Arthropod Structure & Development     Hybrid Journal   (Followers: 1)
Arthropod Systematics & Phylogeny     Open Access   (Followers: 12)
Artificial DNA: PNA & XNA     Hybrid Journal   (Followers: 2)
Artificial Intelligence in the Life Sciences     Open Access   (Followers: 1)
Asian Bioethics Review     Full-text available via subscription   (Followers: 2)
Asian Journal of Biological Sciences     Open Access   (Followers: 2)
Asian Journal of Biology     Open Access  
Asian Journal of Biotechnology and Bioresource Technology     Open Access  
Asian Journal of Cell Biology     Open Access   (Followers: 4)
Asian Journal of Developmental Biology     Open Access   (Followers: 1)
Asian Journal of Medical and Biological Research     Open Access   (Followers: 3)
Asian Journal of Nematology     Open Access   (Followers: 4)
Asian Journal of Poultry Science     Open Access   (Followers: 3)
Atti della Accademia Peloritana dei Pericolanti - Classe di Scienze Medico-Biologiche     Open Access  
Australian Life Scientist     Full-text available via subscription   (Followers: 2)
Australian Mammalogy     Hybrid Journal   (Followers: 8)
Autophagy     Hybrid Journal   (Followers: 8)
Avian Biology Research     Hybrid Journal   (Followers: 4)
Avian Conservation and Ecology     Open Access   (Followers: 19)
Bacterial Empire     Open Access   (Followers: 1)
Bacteriology Journal     Open Access   (Followers: 2)
Bacteriophage     Full-text available via subscription   (Followers: 2)
Bangladesh Journal of Bioethics     Open Access  
Bangladesh Journal of Scientific Research     Open Access  
Between the Species     Open Access   (Followers: 2)
BIO Web of Conferences     Open Access  
BIO-SITE : Biologi dan Sains Terapan     Open Access  
Biocatalysis and Biotransformation     Hybrid Journal   (Followers: 4)
BioCentury Innovations     Full-text available via subscription   (Followers: 2)
Biochemistry and Cell Biology     Hybrid Journal   (Followers: 18)
Biochimie     Hybrid Journal   (Followers: 2)
BioControl     Hybrid Journal   (Followers: 2)
Biocontrol Science and Technology     Hybrid Journal   (Followers: 5)
Biodemography and Social Biology     Hybrid Journal   (Followers: 1)
BIODIK : Jurnal Ilmiah Pendidikan Biologi     Open Access  
BioDiscovery     Open Access   (Followers: 2)
Biodiversity : Research and Conservation     Open Access   (Followers: 30)
Biodiversity Data Journal     Open Access   (Followers: 7)
Biodiversity Informatics     Open Access   (Followers: 3)
Biodiversity Information Science and Standards     Open Access   (Followers: 5)
Biodiversity Observations     Open Access   (Followers: 2)
Bioeksperimen : Jurnal Penelitian Biologi     Open Access  
Bioelectrochemistry     Hybrid Journal   (Followers: 1)
Bioelectromagnetics     Hybrid Journal   (Followers: 1)
Bioenergy Research     Hybrid Journal   (Followers: 3)
Bioengineering and Bioscience     Open Access   (Followers: 1)
BioEssays     Hybrid Journal   (Followers: 10)
Bioethics     Hybrid Journal   (Followers: 20)
BioéthiqueOnline     Open Access   (Followers: 1)
Biogeographia : The Journal of Integrative Biogeography     Open Access   (Followers: 2)
Biogeosciences (BG)     Open Access   (Followers: 19)
Biogeosciences Discussions (BGD)     Open Access   (Followers: 3)
Bioinformatics     Hybrid Journal   (Followers: 324)
Bioinformatics Advances : Journal of the International Society for Computational Biology     Open Access   (Followers: 5)
Bioinformatics and Biology Insights     Open Access   (Followers: 15)
Biointerphases     Open Access   (Followers: 1)
Biojournal of Science and Technology     Open Access  
Biologia     Hybrid Journal   (Followers: 1)
Biologia Futura     Hybrid Journal  
Biologia on-line : Revista de divulgació de la Facultat de Biologia     Open Access  
Biological Bulletin     Partially Free   (Followers: 6)
Biological Control     Hybrid Journal   (Followers: 6)
Biological Invasions     Hybrid Journal   (Followers: 24)
Biological Journal of the Linnean Society     Hybrid Journal   (Followers: 18)
Biological Procedures Online     Open Access  
Biological Psychiatry     Hybrid Journal   (Followers: 60)
Biological Psychology     Hybrid Journal   (Followers: 5)
Biological Research     Open Access   (Followers: 1)
Biological Rhythm Research     Hybrid Journal  
Biological Theory     Hybrid Journal   (Followers: 3)
Biological Trace Element Research     Hybrid Journal  
Biologicals     Full-text available via subscription   (Followers: 5)
Biologics: Targets & Therapy     Open Access   (Followers: 1)
Biologie Aujourd'hui     Full-text available via subscription  
Biologie in Unserer Zeit (Biuz)     Hybrid Journal   (Followers: 2)
Biologija     Open Access  
Biology     Open Access   (Followers: 5)
Biology and Philosophy     Hybrid Journal   (Followers: 19)
Biology Bulletin     Hybrid Journal   (Followers: 1)
Biology Bulletin Reviews     Hybrid Journal  
Biology Direct     Open Access   (Followers: 9)
Biology Methods and Protocols     Open Access  
Biology of Sex Differences     Open Access   (Followers: 1)
Biology of the Cell     Full-text available via subscription   (Followers: 8)
Biology, Medicine, & Natural Product Chemistry     Open Access   (Followers: 2)
Biomacromolecules     Hybrid Journal   (Followers: 21)
Biomarker Insights     Open Access   (Followers: 1)
Biomarkers     Hybrid Journal   (Followers: 6)

Bioinformatics
Journal Prestige (SJR): 6.14
Citation Impact (citeScore): 8
Number of Followers: 324  
 
  Hybrid Journal (it can contain Open Access articles)
ISSN (Print) 1367-4803 - ISSN (Online) 1460-2059
Published by Oxford University Press  [425 journals]
  • grenedalf: population genetic statistics for the next generation of pool
           sequencing

      First page: btae508
      Abstract: Summary: Pool sequencing is an efficient method for capturing genome-wide allele frequencies from multiple individuals, with broad applications such as studying adaptation in Evolve-and-Resequence experiments, monitoring of genetic diversity in wild populations, and genotype-to-phenotype mapping. Here, we present grenedalf, a command line tool written in C++ that implements common population genetic statistics such as θ, Tajima’s D, and FST for Pool sequencing. It is orders of magnitude faster than current tools, and is focused on providing usability and scalability, while also offering a plethora of input file formats and convenience options. Availability and implementation: grenedalf is published under the GPL-3, and freely available at github.com/lczech/grenedalf.
      PubDate: Mon, 26 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae508
      Issue No: Vol. 40, No. 8 (2024)
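
      The abstract above names Tajima's D among the statistics grenedalf computes. As a rough illustration only, the following Python sketch evaluates the classic Tajima (1989) estimator for individually sequenced samples. It is not grenedalf's code and does not include the corrections needed for pooled data; the function name `tajimas_d` and its arguments are hypothetical.

```python
import math


def tajimas_d(n, seg_sites, pi):
    """Classic Tajima's D for n individually sequenced samples.

    n         -- number of sequences (must be at least 4, since the
                 variance term is zero for smaller samples)
    seg_sites -- number of segregating sites S
    pi        -- mean number of pairwise differences

    Assumption: this is the textbook estimator, not the pool-sequencing
    variant that a tool such as grenedalf would use.
    """
    if n < 4:
        raise ValueError("Tajima's D needs at least 4 sequences")
    if seg_sites < 1:
        raise ValueError("Tajima's D needs at least 1 segregating site")

    # Harmonic-number constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)

    # Watterson's estimator of theta, based on segregating sites.
    theta_w = seg_sites / a1

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - theta_w) / math.sqrt(variance)


if __name__ == "__main__":
    # Worked example commonly quoted from Tajima (1989):
    # n = 10, S = 16, pi = 3.888889 gives D of about -1.446.
    print(round(tajimas_d(10, 16, 3.888889), 3))
```

      A negative D, as in the usage example, means that pairwise diversity is lower than expected from the number of segregating sites.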
       
  • cypress: an R/Bioconductor package for cell-type-specific differential
           expression analysis power assessment

      First page: btae511
      Abstract: Summary: Recent methodology advances in computational signal deconvolution have enabled bulk transcriptome data analysis at a finer cell-type level. Through deconvolution, identifying cell-type-specific differentially expressed (csDE) genes is drawing increasing attention in clinical applications. However, researchers still face a number of difficulties in adopting csDE genes detection methods in practice, especially in their experimental design. Here we present cypress, the first experimental design and statistical power analysis tool in csDE genes identification. This tool can reliably model purified cell-type-specific (CTS) profiles, cell-type compositions, biological and technical variations, offering a high-fidelity simulator for bulk RNA-seq convolution and deconvolution. cypress conducts simulation and evaluates the impact of multiple influencing factors, by various statistical metrics, to help researchers optimize experimental design and conduct power analysis. Availability and implementation: cypress is an open-source R/Bioconductor package at https://bioconductor.org/packages/cypress/.
      PubDate: Sat, 17 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae511
      Issue No: Vol. 40, No. 8 (2024)
       
  • WatFinder: a ProDy tool for protein–water interactions

      First page: btae516
      Abstract: Summary: We introduce WatFinder, a tool designed to identify and visualize protein–water interactions (water bridges, water-mediated associations, or water channels, fluxes, and clusters) relevant to protein stability, dynamics, and function. WatFinder is integrated into ProDy, a Python API broadly used for structure-based prediction of protein dynamics. WatFinder provides a suite of functions for generating raw data as well as outputs from statistical analyses. The ProDy framework facilitates comprehensive automation and efficient analysis of the ensembles of structures resolved for a given protein or the time-evolved conformations from simulations in explicit water, as illustrated in five case studies presented in the Supplementary Material. Availability and implementation: ProDy is open-source and freely available under MIT License from https://github.com/ProDy/ProDy.
      PubDate: Sat, 17 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae516
      Issue No: Vol. 40, No. 8 (2024)
       
  • An Ensemble Spectral Prediction (ESP) model for metabolite annotation

      First page: btae490
      Abstract: Motivation: A key challenge in metabolomics is annotating measured spectra from a biological sample with chemical identities. Currently, only a small fraction of measurements can be assigned identities. Two complementary computational approaches have emerged to address the annotation problem: mapping candidate molecules to spectra, and mapping query spectra to molecular candidates. In essence, the candidate molecule with the spectrum that best explains the query spectrum is recommended as the target molecule. Despite candidate ranking being fundamental in both approaches, limited prior works incorporated rank learning tasks in determining the target molecule. Results: We propose a novel machine learning model, Ensemble Spectral Prediction (ESP), for metabolite annotation. ESP takes advantage of prior neural network-based annotation models that utilize multilayer perceptron (MLP) networks and Graph Neural Networks (GNNs). Based on the ranking results of the MLP- and GNN-based models, ESP learns a weighting for the outputs of MLP and GNN spectral predictors to generate a spectral prediction for a query molecule. Importantly, training data is stratified by molecular formula to provide candidate sets during model training. Further, baseline MLP and GNN models are enhanced by considering peak dependencies through label mixing and multi-tasking on spectral topic distributions. When trained on the NIST 2020 dataset and evaluated on the relevant candidate sets from PubChem, ESP improves average rank by 23.7% and 37.2% over the MLP and GNN baselines, respectively, demonstrating performance gain over state-of-the-art neural network approaches. However, MLP approaches remain strong contenders when considering top five ranks. Importantly, we show that annotation performance is dependent on the training dataset, the number of molecules in the candidate set and candidate similarity to the target molecule. Availability and implementation: The ESP code, a trained model, and a Jupyter notebook that guides users through the ESP tool are available at https://github.com/HassounLab/ESP.
      PubDate: Sat, 17 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae490
      Issue No: Vol. 40, No. 8 (2024)
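
      The abstract above describes combining MLP and GNN spectral predictions with a weighting and then ranking candidate molecules against a query spectrum. The Python sketch below illustrates that general idea under strong simplifying assumptions: the weight is a fixed scalar rather than learned, spectra are short binned intensity lists, and similarity is plain cosine similarity. None of the names (`ensemble_spectrum`, `rank_candidates`) come from the ESP code base.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length intensity vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def ensemble_spectrum(mlp_pred, gnn_pred, weight):
    """Convex combination of two predicted spectra.

    Assumption: `weight` is a fixed scalar in [0, 1]; the published
    model learns its weighting, which is not reproduced here.
    """
    return [weight * m + (1.0 - weight) * g for m, g in zip(mlp_pred, gnn_pred)]


def rank_candidates(query, candidates, weight=0.5):
    """Rank candidate molecules by similarity to the query spectrum.

    candidates -- dict mapping a candidate name to a pair
                  (mlp_predicted_spectrum, gnn_predicted_spectrum)
    Returns candidate names, best match first.
    """
    scored = []
    for name, (mlp_pred, gnn_pred) in candidates.items():
        combined = ensemble_spectrum(mlp_pred, gnn_pred, weight)
        scored.append((cosine(query, combined), name))
    scored.sort(reverse=True)
    return [name for _, name in scored]


if __name__ == "__main__":
    # Toy, made-up three-bin spectra for two hypothetical candidates.
    query = [1.0, 0.0, 0.5]
    candidates = {
        "candidate_A": ([1.0, 0.0, 0.4], [0.9, 0.1, 0.6]),
        "candidate_B": ([0.0, 1.0, 0.0], [0.1, 0.9, 0.2]),
    }
    print(rank_candidates(query, candidates, weight=0.5))
```

      The rank of the true molecule in the returned list is the quantity that the "average rank" figures in the abstract summarize.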
       
  • Publication-ready single nucleotide polymorphism visualization with snipit

      First page: btae510
      Abstract: Summary: Snipit is an analysis and visualization tool designed for summarizing single nucleotide polymorphisms in sequences in comparison to a reference sequence. This tool efficiently catalogues nucleotide and amino acid differences, enabling clear comparisons through customizable, publication-ready figures. With features such as configurable colour palettes, customizable record sorting, and the ability to output figures in multiple formats, snipit offers a user-friendly interface for researchers across diverse disciplines. In addition, snipit includes a specialized recombi-mode for illustrating recombination patterns, which can highlight otherwise often difficult-to-detect relationships between sequences. Availability and implementation: Snipit is an open-source python-based tool that is hosted on GitHub under a GNU-GPL 3.0 licence (https://github.com/aineniamh/snipit). It can be installed from PyPi using pip. Source code and additional documentation can be found on the GitHub repository.
      PubDate: Tue, 13 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae510
      Issue No: Vol. 40, No. 8 (2024)
       
  • Interactive visualization of nanopore sequencing signal data with
           Squigualiser

      First page: btae501
      Abstract: Motivation: Nanopore sequencing current signal data can be ‘basecalled’ into sequence information or analysed directly, with the capacity to identify diverse molecular features, such as DNA/RNA base modifications and secondary structures. However, raw signal data is large and complex, and there is a need for improved visualization strategies to facilitate signal analysis, exploration and tool development. Results: Squigualiser (Squiggle visualiser) is a toolkit for intuitive, interactive visualization of sequence-aligned signal data, which currently supports both DNA and RNA sequencing data from Oxford Nanopore Technologies instruments. Squigualiser is compatible with a wide range of alternative signal-alignment software packages and enables visualization of both signal-to-read and signal-to-reference aligned data at single-base resolution. Squigualiser generates an interactive signal browser view (HTML file), in which the user can navigate across a genome/transcriptome region and customize the display. Multiple independent reads are integrated into a ‘signal pileup’ format and different datasets can be displayed as parallel tracks. Although other methods exist, Squigualiser provides the community with a software package purpose-built for raw signal data visualization, incorporating a range of new and existing features into a unified platform. Availability and implementation: Squigualiser is an open-source package under an MIT licence: https://github.com/hiruna72/squigualiser. The software was developed using Python 3.8 and can be installed with pip or bioconda or executed directly using prebuilt binaries provided with each release.
      PubDate: Tue, 13 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae501
      Issue No: Vol. 40, No. 8 (2024)
       
  • Improved allele-specific single-cell copy number estimation in
           low-coverage DNA-sequencing

      First page: btae506
      Abstract: Motivation: Advances in whole-genome single-cell DNA sequencing (scDNA-seq) have led to the development of numerous methods for detecting copy number aberrations (CNAs), a key driver of genetic heterogeneity in cancer. While most of these methods are limited to the inference of total copy number, some recent approaches now infer allele-specific CNAs using innovative techniques for estimating allele-frequencies in low coverage scDNA-seq data. However, these existing allele-specific methods are limited in their segmentation strategies, a crucial step in the CNA detection pipeline. Results: We present SEACON (Single-cell Estimation of Allele-specific COpy Numbers), an allele-specific copy number profiler for scDNA-seq data. SEACON uses a Gaussian Mixture Model to identify latent copy number states and breakpoints between contiguous segments across cells, filters the segments for high-quality breakpoints using an ensemble technique, and adopts several strategies for tolerating noisy read-depth and allele frequency measurements. Using a wide array of both real and simulated datasets, we show that SEACON derives accurate copy numbers and surpasses existing approaches under numerous experimental conditions, and identify its strengths and weaknesses. Availability and implementation: SEACON is implemented in Python and is freely available open-source from https://github.com/NabaviLab/SEACON and https://doi.org/10.5281/zenodo.12727008.
      PubDate: Mon, 12 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae506
      Issue No: Vol. 40, No. 8 (2024)
       
  • ceas: an R package for Seahorse data analysis and visualization

      First page: btae503
      Abstract: Summary: Measuring cellular energetics is essential to understanding a matrix’s (e.g. cell, tissue, or biofluid) metabolic state. The Agilent Seahorse machine is a common method to measure real-time cellular energetics, but existing analysis tools are highly manual or lack functionality. The Cellular Energetics Analysis Software (ceas) R package fills this analytical gap by providing modular and automated Seahorse data analysis and visualization. Availability and implementation: ceas is available on CRAN (https://cran.r-project.org/package=ceas). Source code and installable tarballs are freely available for download at https://github.com/jamespeapen/ceas/releases/ under the MIT license. Package documentation may be found at https://jamespeapen.github.io/ceas/. ceas is implemented in R and is supported on macOS, Windows and Linux.
      PubDate: Mon, 12 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae503
      Issue No: Vol. 40, No. 8 (2024)
       
  • ExpOmics: a comprehensive web platform empowering biologists with robust
           multi-omics data analysis capabilities

      First page: btae507
      Abstract: Motivation: High-throughput technologies yield a broad spectrum of multi-omics datasets, which offer unparalleled insights into complex biological systems. However, effectively analyzing this diverse array of data presents challenges, considering factors such as species diversity, data types, costs, and limitations of the available tools. Results: Herein, we present ExpOmics, a comprehensive web platform featuring 7 applications and 4 toolkits, with 28 customizable analysis functions spanning various analyses of differential expression, co-expression, Weighted Gene Co-expression Network Analysis (WGCNA), feature selection, and functional enrichment. ExpOmics allows users to upload and explore multi-omics data without organism restrictions, supporting various expression data, including genes, mRNAs, lncRNAs, miRNAs, circRNAs, piRNAs, and proteins, and is compatible with diverse gene nomenclatures and expression values. Moreover, ExpOmics enables users to analyze 22 427 transcriptomic datasets of 196 cancer subtypes sourced from 63 projects of The Cancer Genome Atlas Program (TCGA) to identify cancer biomarkers. The analysis results from ExpOmics are presented in high-quality graphical formats suitable for publication and are available for free download. A case study using ExpOmics identified two potential oncogenes, SERPINE1 and SLC43A1, that may regulate colorectal cancer through distinct biological processes. In summary, ExpOmics can serve as a robust platform for global researchers to explore multi-omics data, gain biological insights, and formulate testable hypotheses. Availability and implementation: ExpOmics is available at http://www.biomedical-web.com/expomics.
      PubDate: Sat, 10 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae507
      Issue No: Vol. 40, No. 8 (2024)
       
  • pyTWMR: transcriptome-wide Mendelian randomization in python

      First page: btae505
      Abstract: Motivation: Mendelian randomization (MR) is a widely used approach to estimate the causal effect of variation in gene expression on complex traits. Among several MR-based algorithms, the transcriptome-wide summary statistics-based Mendelian Randomization approach (TWMR) enables the use of multiple SNPs as instruments and multiple gene expression traits as exposures to facilitate causal inference in observational studies. Results: Here we present a Python-based implementation of TWMR and revTWMR. Our implementation offers GPU computational support for faster computations and a robust computation mode resilient to highly correlated gene expressions and genetic variants. Availability and implementation: pyTWMR is available at github.com/soreshkov/pyTWMR.
      PubDate: Sat, 10 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae505
      Issue No: Vol. 40, No. 8 (2024)
       
  • PGAT-ABPp: harnessing protein language models and graph attention networks
           for antibacterial peptide identification with remarkable accuracy

      First page: btae497
      Abstract: Motivation: The emergence of drug-resistant pathogens represents a formidable challenge to global health. Using computational methods to identify the antibacterial peptides (ABPs), an alternative antimicrobial agent, has demonstrated advantages in further drug design studies. Most of the current approaches, however, rely on handcrafted features and underutilize structural information, which may affect prediction performance. Results: To present an ultra-accurate model for ABP identification, we propose a novel deep learning approach, PGAT-ABPp. PGAT-ABPp leverages structures predicted by AlphaFold2 and a pretrained protein language model, ProtT5-XL-U50 (ProtT5), to construct graphs. Then the graph attention network (GAT) is adopted to learn global discriminative features from the graphs. PGAT-ABPp outperforms the other fourteen state-of-the-art models in terms of accuracy, F1-score and Matthews Correlation Coefficient on the independent test dataset. The results show that ProtT5 has significant advantages in the identification of ABPs and the introduction of spatial information further improves the prediction performance of the model. The interpretability analysis of key residues in known active ABPs further underscores the superiority of PGAT-ABPp. Availability and implementation: The datasets and source codes for the PGAT-ABPp model are available at https://github.com/moonseter/PGAT-ABPp/.
      PubDate: Fri, 09 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae497
      Issue No: Vol. 40, No. 8 (2024)
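
      The abstract above compares models by accuracy, F1-score and Matthews Correlation Coefficient. For readers unfamiliar with these metrics, the Python sketch below computes all three from a binary confusion matrix using their standard definitions. The function name `classification_metrics` and the example counts are made up for illustration and are not taken from the paper.

```python
import math


def classification_metrics(tp, tn, fp, fn):
    """Accuracy, F1-score and MCC from binary confusion-matrix counts.

    tp, tn, fp, fn -- true positives, true negatives, false positives
                      and false negatives (non-negative integers)
    Any metric whose denominator is zero is reported as 0.0, which is
    a convention chosen for this sketch.
    """
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0

    # F1 is the harmonic mean of precision and recall, which
    # simplifies to 2*TP / (2*TP + FP + FN).
    f1_denominator = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denominator if f1_denominator else 0.0

    # MCC uses all four cells, so it stays informative on
    # imbalanced datasets.
    mcc_denominator = math.sqrt(
        (tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)
    )
    mcc = (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0

    return {"accuracy": accuracy, "f1": f1, "mcc": mcc}


if __name__ == "__main__":
    # Hypothetical counts for a 100-peptide test set.
    for name, value in classification_metrics(40, 45, 5, 10).items():
        print(f"{name}: {value:.4f}")
```

      MCC ranges from -1 to 1, with 0 corresponding to chance-level prediction, whereas accuracy and F1 range from 0 to 1.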
       
  • TransTEx: novel tissue-specificity scoring method for grouping human
           transcriptome into different expression groups

      First page: btae475
      Abstract: Motivation: Although human tissues carry out common molecular processes, gene expression patterns can distinguish different tissues. Traditional informatics methods, primarily at the gene level, overlook the complexity of alternative transcript variants and protein isoforms produced by most genes, changes in which are linked to disease prognosis and drug resistance. Results: We developed TransTEx (Transcript-level Tissue Expression), a novel tissue-specificity scoring method, for grouping transcripts into expression groups. TransTEx applies sequential cut-offs to tissue-wise transcript probability estimates, subsampling-based P-values and fold-change estimates. Application of TransTEx on GTEx mRNA-seq data divided 199 166 human transcripts into different groups as 17 999 tissue-specific (TSp), 7436 tissue-enhanced, 36 783 widely expressed (Wide), 79 191 lowly expressed (Low), and 57 757 no expression (Null) transcripts. Testis has the most (13 466) TSp isoforms followed by liver (890), brain (701), pituitary (435), and muscle (420). We found that the tissue specificity of alternative transcripts of a gene is predominantly influenced by alternate promoter usage. By overlapping brain-specific transcripts with the cell-type gene-markers in scBrainMap database, we found that 63% of the brain-specific transcripts were enriched in nonneuronal cell types, predominantly astrocytes followed by endothelial cells and oligodendrocytes. In addition, we found 61 brain cell-type marker genes encoding a total of 176 alternative transcripts as brain-specific and 22 alternative transcripts as testis-specific, highlighting the complex TSp and cell-type specific gene regulation and expression at isoform-level. TransTEx can be adapted to the analysis of bulk RNA-seq or scRNA-seq datasets to find tissue- and/or cell-type specific isoform-level gene markers. Availability and implementation: TransTEx database: https://bmi.cewit.stonybrook.edu/transtexdb/ and the R package is available via GitHub: https://github.com/pallavisurana1/TransTEx.
      PubDate: Fri, 09 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae475
      Issue No: Vol. 40, No. 8 (2024)
       
  • Attention-based approach to predict drug–target interactions across
           seven target superfamilies

      First page: btae496
      Abstract: Motivation: Drug–target interactions (DTIs) hold a pivotal role in drug repurposing and elucidation of drug mechanisms of action. While single-targeted drugs have demonstrated clinical success, they often exhibit limited efficacy against complex diseases, such as cancers, whose development and treatment are dependent on several biological processes. Therefore, a comprehensive understanding of primary, secondary and even inactive targets becomes essential in the quest for effective and safe treatments for cancer and other indications. The human proteome offers over a thousand druggable targets, yet most FDA-approved drugs bind to only a small fraction of these targets. Results: This study introduces an attention-based method (called MMAtt-DTA) to predict drug–target bioactivities across human proteins within seven superfamilies. We meticulously examined nine different descriptor sets to identify optimal signature descriptors for predicting novel DTIs. Our testing results demonstrated Spearman correlations exceeding 0.72 (P < 0.001) for six out of seven superfamilies. The proposed method outperformed fourteen state-of-the-art machine learning, deep learning and graph-based methods and maintained relatively high performance for most target superfamilies when tested with independent bioactivity data sources. We computationally validated 185 676 drug–target pairs from ChEMBL-V33 that were not available during model training, achieving a reasonable performance with Spearman correlation >0.57 (P < 0.001) for most superfamilies. This underscores the robustness of the proposed method for predicting novel DTIs. Finally, we applied our method to predict missing bioactivities among 3492 approved molecules in ChEMBL-V33, offering a valuable tool for advancing drug mechanism discovery and repurposing existing drugs for new indications. Availability and implementation: https://github.com/AronSchulman/MMAtt-DTA.
      PubDate: Thu, 08 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae496
      Issue No: Vol. 40, No. 8 (2024)
       
  • Pseudobulk with proper offsets has the same statistical properties as
           generalized linear mixed models in single-cell case-control studies

      First page: btae498
      Abstract: Motivation: Generalized linear mixed models (GLMMs), such as the negative-binomial or Poisson linear mixed model, are widely applied to single-cell RNA sequencing data to compare transcript expression between different conditions determined at the subject level. However, the model is computationally intensive, and its statistical performance relative to pseudobulk approaches is poorly understood. Results: We propose offset-pseudobulk as a lightweight alternative to GLMMs. We prove that a count-based pseudobulk equipped with a proper offset variable has the same statistical properties as GLMMs in terms of both point estimates and standard errors. We confirm our findings using simulations based on real data. Offset-pseudobulk is substantially faster (>10×) and numerically more stable than GLMMs. Availability and implementation: Offset-pseudobulk can be easily implemented in any generalized linear model software by tweaking a few options. The code can be found at https://github.com/hanbin973/pseudobulk_is_mm.
      PubDate: Thu, 08 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae498
      Issue No: Vol. 40, No. 8 (2024)
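The general idea of a count-based pseudobulk with an offset can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a Poisson model, a single binary case/control covariate, and an offset equal to the log of each subject's summed library size (the abstract does not specify which offset is "proper"). The function names and the toy data are hypothetical. In this special case the Poisson maximum-likelihood estimate has a closed form, so no GLM library is needed; the sketch covers only the point estimate, not standard errors.

```python
import math
from collections import defaultdict


def pseudobulk(cells):
    """Sum per-cell counts and library sizes within each subject.

    `cells` is an iterable of (subject_id, gene_count, library_size) tuples.
    Returns {subject_id: (summed_count, summed_library_size)}.
    """
    totals = defaultdict(lambda: [0, 0])
    for subject, count, lib_size in cells:
        totals[subject][0] += count
        totals[subject][1] += lib_size
    return {s: (c, n) for s, (c, n) in totals.items()}


def poisson_log_fold_change(bulk, condition):
    """Case-vs-control log fold change from a Poisson GLM with an offset.

    Model (assumed for illustration): log E[y_s] = log(n_s) + b0 + b1 * x_s,
    where n_s is the subject's summed library size and x_s is 0 or 1.
    With one binary covariate the score equations give each group's rate as
    (total count) / (total exposure), so b1 is the log ratio of those rates.
    """
    count = {0: 0, 1: 0}
    exposure = {0: 0, 1: 0}
    for subject, (c, n) in bulk.items():
        group = condition[subject]
        count[group] += c
        exposure[group] += n
    rate_control = count[0] / exposure[0]
    rate_case = count[1] / exposure[1]
    return math.log(rate_case / rate_control)


# Toy data: (subject, count of one gene in a cell, that cell's library size).
cells = [("s1", 2, 100), ("s1", 4, 200), ("s2", 3, 300),
         ("s3", 12, 300), ("s4", 6, 150)]
condition = {"s1": 0, "s2": 0, "s3": 1, "s4": 1}  # 0 = control, 1 = case

bulk = pseudobulk(cells)
log_fc = poisson_log_fold_change(bulk, condition)
```

With more covariates or a negative-binomial likelihood, the same aggregated counts and offsets would be passed to a GLM routine rather than solved in closed form.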
       
  • Benchmarking of AlphaFold2 accuracy self-estimates as indicators of
           empirical model quality and ranking: a comparison with independent model
           quality assessment programmes

      First page: btae491
      Abstract: Motivation: Despite an increase in protein modelling accuracy following the development of AlphaFold2, there remains an accuracy gap between predicted and observed model quality assessment (MQA) scores. In CASP15, variations in AlphaFold2 model accuracy prediction were noticed for quaternary models of very similar observed quality. In this study, we compare plDDT and pTM to their observed counterparts, the local distance difference test (lDDT) and TM-score, for both tertiary and quaternary models to examine whether reliability is retained across the scoring range under normal modelling conditions and in situations where AlphaFold2 functionality is customized. We also explore plDDT and pTM ranking accuracy in comparison with the published independent MQA programmes ModFOLD9 and ModFOLDdock. Results: plDDT was found to be an accurate descriptor of tertiary model quality compared to observed lDDT-Cα scores (Pearson r = 0.97), and achieved a ranking agreement true positive rate (TPR) of 0.34 with observed scores, which ModFOLD9 could not improve. However, quaternary structure accuracy was reduced (plDDT r = 0.67, pTM r = 0.70) and significant overprediction was seen with both scores for some lower quality models. Additionally, ModFOLDdock was able to improve upon AF2-Multimer model ranking compared to TM-score (TPR 0.34) and oligo-lDDT score (TPR 0.43). Finally, evidence is presented for increased variability in plDDT and pTM when using custom template recycling, which is more pronounced for quaternary structures. Availability and implementation: The ModFOLD9 and ModFOLDdock quality assessment servers are available at https://www.reading.ac.uk/bioinf/ModFOLD/ and https://www.reading.ac.uk/bioinf/ModFOLDdock/, respectively. A docker image is available at https://hub.docker.com/r/mcguffin/multifold.
      PubDate: Thu, 08 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae491
      Issue No: Vol. 40, No. 8 (2024)
       
  • popDMS infers mutation effects from deep mutational scanning data

      First page: btae499
      Abstract: Summary: Deep mutational scanning (DMS) experiments provide a powerful method to measure the functional effects of genetic mutations at massive scales. However, the data generated from these experiments can be difficult to analyze, with significant variation between experimental replicates. To overcome this challenge, we developed popDMS, a computational method based on population genetics theory, to infer the functional effects of mutations from DMS data. Through extensive tests, we found that the functional effects of single mutations and epistasis inferred by popDMS are highly consistent across replicates, comparing favorably with existing methods. Our approach is flexible and can be widely applied to DMS data that includes multiple time points, multiple replicates, and different experimental conditions. Availability and implementation: popDMS is implemented in Python and Julia, and is freely available on GitHub at https://github.com/bartonlab/popDMS.
      PubDate: Thu, 08 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae499
      Issue No: Vol. 40, No. 8 (2024)
       
  • R2Dtool: integration and visualization of isoform-resolved RNA features

      First page: btae495
      Abstract: Motivation: Long-read RNA sequencing enables the mapping of RNA modifications, structures, and protein-interaction sites at the resolution of individual transcript isoforms. To understand the functions of these RNA features, it is critical to analyze them in the context of transcriptomic and genomic annotations, such as open reading frames and splice junctions. Results: We have developed R2Dtool, a bioinformatics tool that integrates transcript-mapped information with transcript and genome annotations, allowing for the isoform-resolved analytics and graphical representation of RNA features in their genomic context. We illustrate R2Dtool’s capability to integrate and expedite RNA feature analysis using epitranscriptomics data. R2Dtool facilitates the comprehensive analysis and interpretation of alternative transcript isoforms. Availability and implementation: R2Dtool is freely available under the MIT license at github.com/comprna/R2Dtool.
      PubDate: Wed, 07 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae495
      Issue No: Vol. 40, No. 8 (2024)
       
  • ModDotPlot—rapid and interactive visualization of tandem repeats

      First page: btae493
      Abstract: Motivation: A common method for analyzing genomic repeats is to produce a sequence similarity matrix visualized via a dot plot. Innovative approaches such as StainedGlass have improved upon this classic visualization by rendering dot plots as a heatmap of sequence identity, enabling researchers to better visualize multi-megabase tandem repeat arrays within centromeres and other heterochromatic regions of the genome. However, computing the similarity estimates for heatmaps requires high computational overhead and can suffer from decreasing accuracy. Results: In this work, we introduce ModDotPlot, an interactive and alignment-free dot plot viewer. By approximating average nucleotide identity via a k-mer-based containment index, ModDotPlot produces accurate plots orders of magnitude faster than StainedGlass. We accomplish this through the use of a hierarchical modimizer scheme that can visualize the full 128 Mb genome of Arabidopsis thaliana in under 5 min on a laptop. ModDotPlot is bundled with a graphical user interface supporting real-time interactive navigation of entire chromosomes. Availability and implementation: ModDotPlot is available at https://github.com/marbl/ModDotPlot.
      PubDate: Wed, 07 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae493
      Issue No: Vol. 40, No. 8 (2024)
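A k-mer containment index of the kind mentioned in this abstract can be sketched in a few lines. The code below is an illustration under stated assumptions, not ModDotPlot's implementation: the function names are hypothetical, the sketching step is a plain single-level mod filter (keep k-mers whose hash is divisible by a modulus) rather than the hierarchical scheme the abstract describes, and the conversion from containment to identity uses the common approximation identity ≈ C^(1/k), which assumes independent point mutations.

```python
import hashlib


def kmer_set(seq, k):
    """Return the set of all length-k substrings of `seq`."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def stable_hash(kmer):
    """Deterministic 64-bit hash (Python's built-in hash() is salted per run)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def mod_sketch(kmers, modulus):
    """Keep only k-mers whose hash is divisible by `modulus` (about 1/modulus of them)."""
    return {km for km in kmers if stable_hash(km) % modulus == 0}


def containment(a, b):
    """Fraction of the k-mers in `a` that also occur in `b`."""
    return len(a & b) / len(a) if a else 0.0


def identity_estimate(c, k):
    """Approximate sequence identity from containment `c` for k-mer size `k`."""
    return c ** (1.0 / k)


# Toy example: two short windows that differ by a single substitution.
k = 4
window_a = kmer_set("ACGTTGCAAGCTTACG", k)
window_b = kmer_set("ACGTTGCATGCTTACG", k)
c = containment(window_a, window_b)
identity = identity_estimate(c, k)
```

A dot plot would repeat this comparison for every pair of windows along the sequence and colour each cell by the estimated identity; sketching with `mod_sketch` trades a little accuracy for far fewer set operations.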
       
  • Accelerated dimensionality reduction of single-cell RNA sequencing data
           with fastglmpca

      First page: btae494
      Abstract: Summary: Motivated by theoretical and practical issues that arise when applying principal component analysis (PCA) to count data, Townes et al. introduced “Poisson GLM-PCA”, a variation of PCA adapted to count data, as a tool for dimensionality reduction of single-cell RNA sequencing (scRNA-seq) data. However, fitting GLM-PCA is computationally challenging. Here we study this problem, and show that a simple algorithm, which we call “Alternating Poisson Regression” (APR), produces better quality fits, and in less time, than existing algorithms. APR is also memory-efficient and lends itself to parallel implementation on multi-core processors, both of which are helpful for handling large scRNA-seq datasets. We illustrate the benefits of this approach in three publicly available scRNA-seq datasets. The new algorithms are implemented in an R package, fastglmpca. Availability and implementation: The fastglmpca R package is released on CRAN for Windows, macOS and Linux, and the source code is available at github.com/stephenslab/fastglmpca under the open source GPL-3 license. Scripts to reproduce the results in this paper are also available in the GitHub repository and on Zenodo.
      PubDate: Wed, 07 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae494
      Issue No: Vol. 40, No. 8 (2024)
       
  • Galaxy Helm chart: a standardized method for deploying production Galaxy
           servers

      First page: btae486
      Abstract: Motivation: The Galaxy application is a popular open-source framework for data intensive sciences, counting thousands of monthly users across more than 100 public servers. To support a growing number of users and a greater variety of use cases, the complexity of a production-grade Galaxy installation has also grown, requiring more administration effort. There is a need for a rapid and reproducible Galaxy deployment method that can be maintained at high-availability with minimal maintenance. Results: We describe the Galaxy Helm chart that codifies all elements of a production-grade Galaxy installation into a single package. Deployable on Kubernetes clusters, the chart encapsulates supporting software services and implements the best-practices model for running Galaxy. It is also the most rapid method available for deploying a scalable, production-grade Galaxy instance on one’s own infrastructure. The chart is highly configurable, allowing systems administrators to swap dependent services if desired. Notable uses of the chart include on-demand, fully-automated deployments on AnVIL, providing training infrastructure for the Bioconductor project, and as the AWS-recommended solution for running Galaxy on the Amazon cloud. Availability and implementation: The source code for Galaxy Helm is available at https://github.com/galaxyproject/galaxy-helm, the corresponding Helm package at https://github.com/CloudVE/helm-charts, and the required Galaxy container image at https://github.com/galaxyproject/galaxy-docker-k8s.
      PubDate: Tue, 06 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae486
      Issue No: Vol. 40, No. 8 (2024)
       
  • Locuaz: an in silico platform for protein binders optimization

      First page: btae492
      Abstract: Motivation: Engineering high-affinity binders targeting specific antigenic determinants remains a challenging and often daunting task, requiring extensive experimental screening. Computational methods have the potential to accelerate this process, reducing costs and time, but only if they demonstrate broad applicability and efficiency in exploring mutations, evaluating affinity, and pruning unproductive mutation paths. Results: In response to these challenges, we introduce a new computational platform for optimizing protein binders towards their targets. The platform is organized as a series of modules, performing mutation selection and application, molecular dynamics simulations to sample conformations around interaction poses, and mutation prioritization using suitable scoring functions. Notably, the platform supports parallel exploration of different mutation streams, enabling in silico high-throughput screening on High Performance Computing (HPC) systems. Furthermore, the platform is highly customizable, allowing users to implement their own protocols. Availability and implementation: The source code is available at https://github.com/pgbarletta/locuaz and documentation is at https://locuaz.readthedocs.io/. The data underlying this article are available at https://github.com/pgbarletta/suppl_info_locuaz
      PubDate: Tue, 06 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae492
      Issue No: Vol. 40, No. 8 (2024)
       
  • BertSNR: an interpretable deep learning framework for single-nucleotide
           resolution identification of transcription factor binding sites based on
           DNA language model

      First page: btae461
      Abstract: Motivation: Transcription factors are pivotal in the regulation of gene expression, and accurate identification of transcription factor binding sites (TFBSs) at high resolution is crucial for understanding the mechanisms underlying gene regulation. The task of identifying TFBSs from DNA sequences is a significant challenge in the field of computational biology today. To address this challenge, a variety of computational approaches have been developed. However, these methods face limitations in their ability to achieve high-resolution identification and often lack interpretability. Results: We propose BertSNR, an interpretable deep learning framework for identifying TFBSs at single-nucleotide resolution. BertSNR integrates sequence-level and token-level information by multi-task learning based on pre-trained DNA language models. Benchmarking comparisons show that our BertSNR outperforms the existing state-of-the-art methods in TFBS predictions. Importantly, we enhanced the interpretability of the model through attentional weight visualization and motif analysis, and discovered the subtle relationship between attention weight and motif. Moreover, BertSNR effectively identifies TFBSs in promoter regions, facilitating the study of intricate gene regulation. Availability and implementation: The BertSNR source code can be found at https://github.com/lhy0322/BertSNR.
      PubDate: Tue, 06 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae461
      Issue No: Vol. 40, No. 8 (2024)
       
  • Cellular proliferation biases clonal lineage tracing and trajectory
           inference

      First page: btae483
      Abstract: Motivation: Lineage tracing and trajectory inference from single-cell RNA-sequencing data hold tremendous potential for uncovering the genetic programs driving development and disease. Single cell datasets are thought to provide an unbiased view on the diverse cellular architecture of tissues. Sampling bias, however, can skew single cell datasets away from the cellular composition they are meant to represent. Results: We demonstrate a novel form of sampling bias, caused by a statistical phenomenon related to repeated sampling from a growing, heterogeneous population. Relative growth rates of cells influence the probability that they will be sampled in clones observed across multiple time points. We support our probabilistic derivations with a simulation study and an analysis of a real time-course of T-cell development. We find that this bias can impact fate probability predictions, and we explore how to develop trajectory inference methods which are robust to this bias. Availability and implementation: Source code for the simulated datasets and to create the figures in this manuscript is freely available in Python at https://github.com/rbonhamcarter/simulate-clones. A Python implementation of the extension of the LineageOT method is freely available at https://github.com/rbonhamcarter/LineageOT/tree/multi-time-clones.
      PubDate: Mon, 05 Aug 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae483
      Issue No: Vol. 40, No. 8 (2024)
       
  • New GO-based measures in multiple network alignment

      First page: btae476
      Abstract: Motivation: Protein–protein interaction (PPI) networks provide valuable insights into the function of biological systems. Aligning multiple PPI networks may expose relationships beyond those observable by pairwise comparisons. However, assessing the biological quality of multiple network alignments is a challenging problem. Results: We propose two new measures to evaluate the quality of multiple network alignments using functional information from Gene Ontology (GO) terms. When aligning multiple real PPI networks across species, we observe that both measures are highly correlated with objective quality indicators, such as common orthologs. Additionally, our measures strongly correlate with an alignment’s ability to predict novel GO annotations, which is a unique advantage over existing GO-based measures. Availability and implementation: The scripts and the links to the raw and alignment data can be accessed at https://github.com/kimiayazdani/GO_Measures.git
      PubDate: Wed, 31 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae476
      Issue No: Vol. 40, No. 8 (2024)
       
  • Enabling population protein dynamics through Bayesian modeling

      First page: btae484
      Abstract: Motivation: The knowledge of protein dynamics, or turnover, in patients provides invaluable information related to certain diseases, drug efficacy, or biological processes. A great corpus of experimental and computational methods has been developed, including by us, in the case of human patients followed in vivo. Moving one step further, we propose a novel modeling approach to capture population protein dynamics using Bayesian methods. Results: Using two datasets, we demonstrate that models inspired by population pharmacokinetics can accurately capture protein turnover within a cohort and account for inter-individual variability. Such models pave the way for comparative studies searching for altered dynamics or biomarkers in diseases. Availability and implementation: R code and preprocessed data are available from zenodo.org. Raw data are available from panoramaweb.org.
      PubDate: Tue, 30 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae484
      Issue No: Vol. 40, No. 8 (2024)
       
  • RawHash2: mapping raw nanopore signals using hash-based seeding and
           adaptive quantization

      First page: btae478
      Abstract: Summary: Raw nanopore signals can be analyzed while they are being generated, a process known as real-time analysis. Real-time analysis of raw signals is essential to utilize the unique features that nanopore sequencing provides, enabling the early stopping of the sequencing of a read or the entire sequencing run based on the analysis. The state-of-the-art mechanism, RawHash, offers the first hash-based efficient and accurate similarity identification between raw signals and a reference genome by quickly matching their hash values. In this work, we introduce RawHash2, which provides major improvements over RawHash, including more sensitive quantization and chaining algorithms, weighted mapping decisions, frequency filters to reduce ambiguous seed hits, minimizers for hash-based sketching, and support for the R10.4 flow cell version and POD5 and SLOW5 file formats. Compared to RawHash, RawHash2 provides better F1 accuracy (on average by 10.57% and up to 20.25%) and better throughput (on average by 4.0× and up to 9.9×). Availability and implementation: RawHash2 is available at https://github.com/CMU-SAFARI/RawHash. We also provide the scripts to fully reproduce our results on our GitHub page.
      PubDate: Tue, 30 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae478
      Issue No: Vol. 40, No. 8 (2024)
       
  • GRIEVOUS: your command-line general for resolving cross-dataset genotype
           inconsistencies

      First page: btae489
      Abstract: Summary: Harmonizing variant indexing and allele assignments across datasets is crucial for data integrity in cross-dataset studies such as multi-cohort genome-wide association studies, meta-analyses, and the development, validation, and application of polygenic risk scores. Ensuring this indexing and allele consistency is a laborious, time-consuming, and error-prone process requiring a certain degree of computational proficiency. Here, we introduce GRIEVOUS, a command-line tool for cross-dataset variant homogenization. By means of an internal database and a custom indexing methodology, GRIEVOUS identifies, formats, and aligns all biallelic single nucleotide polymorphisms (SNPs) across all summary statistic and genotype files of interest. Upon completion of dataset harmonization, GRIEVOUS can also be used to extract the maximal set of biallelic SNPs common to all datasets. Availability and implementation: GRIEVOUS and all supporting documentation and tutorials can be found at https://github.com/jvtalwar/GRIEVOUS. It is freely and publicly available under the MIT license and can be installed via pip.
      PubDate: Tue, 30 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae489
      Issue No: Vol. 40, No. 8 (2024)
       
  • AssemblyQC: a Nextflow pipeline for reproducible reporting of assembly
           quality

      First page: btae477
      Abstract: Summary: Genome assembly projects have grown exponentially due to breakthroughs in sequencing technologies and assembly algorithms. Evaluating the quality of genome assemblies is critical to ensure the reliability of downstream analysis and interpretation. To fulfil this task, we have developed the AssemblyQC pipeline that performs file-format validation, contaminant checking, contiguity measurement, gene- and repeat-space completeness quantification, telomere inspection, taxonomic assignment, synteny alignment, scaffold examination through Hi-C contact-map visualization, and assessments of completeness, consensus quality and phasing through k-mer analysis. It produces a comprehensive HTML report with method descriptions, tables, and visualizations. Availability and implementation: The pipeline uses Nextflow for workflow orchestration and adheres to the best practices established by the nf-core community. This pipeline offers a reproducible, scalable, and portable method to assess the quality of genome assemblies. The code is available online at GitHub: https://github.com/Plant-Food-Research-Open/assemblyqc.
      PubDate: Tue, 30 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae477
      Issue No: Vol. 40, No. 8 (2024)
       
  • Genome-wide detection of somatic mosaicism at short tandem repeats

      First page: btae485
      Abstract: Motivation: Somatic mosaicism has been implicated in several developmental disorders, cancers, and other diseases. Short tandem repeats (STRs) consist of repeated sequences of 1–6 bp and comprise >1 million loci in the human genome. Somatic mosaicism at STRs is known to play a key role in the pathogenicity of loci implicated in repeat expansion disorders and is highly prevalent in cancers exhibiting microsatellite instability. While a variety of tools have been developed to genotype germline variation at STRs, a method for systematically identifying mosaic STRs is lacking. Results: We introduce prancSTR, a novel method for detecting mosaic STRs from individual high-throughput sequencing datasets. prancSTR is designed to detect loci characterized by a single high-frequency mosaic allele, but can also detect loci with multiple mosaic alleles. Unlike many existing mosaicism detection methods for other variant types, prancSTR does not require a matched control sample as input. We show that prancSTR accurately identifies mosaic STRs in simulated data, demonstrate its feasibility by identifying candidate mosaic STRs in Illumina whole genome sequencing data derived from lymphoblastoid cell lines for individuals sequenced by the 1000 Genomes Project, and evaluate the use of prancSTR on Element and PacBio data. In addition to prancSTR, we present simTR, a novel simulation framework which simulates raw sequencing reads with realistic error profiles at STRs. Availability and implementation: prancSTR and simTR are freely available at https://github.com/gymrek-lab/trtools. Detailed documentation is available at https://trtools.readthedocs.io/.
      PubDate: Tue, 30 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae485
      Issue No: Vol. 40, No. 8 (2024)
       
  • Poincaré and SimBio: a versatile and extensible Python ecosystem for
           modeling systems

      First page: btae465
      Abstract: Motivation: Chemical reaction networks (CRNs) play a pivotal role in diverse fields such as systems biology, biochemistry, chemical engineering, and epidemiology. High-level definitions of CRNs enable the use of various simulation approaches, including deterministic and stochastic methods, from the same model. However, existing Python tools for simulation of CRNs typically wrap external C/C++ libraries for model definition, translation into equations and/or numerically solving them, limiting their extensibility and integration with the broader Python ecosystem. Results: In response, we developed Poincaré and SimBio, two novel Python packages for simulation of dynamical systems and CRNs. Poincaré serves as a foundation for dynamical systems modeling, while SimBio extends this functionality to CRNs, including support for the Systems Biology Markup Language (SBML). Poincaré and SimBio are developed as pure Python packages, enabling users to easily extend their simulation capabilities by writing new or leveraging other Python packages. Moreover, this does not compromise the performance, as code can be just-in-time compiled with Numba. Our benchmark tests using curated models from the BioModels repository demonstrate that these tools may offer superior performance compared to other existing tools. In addition, to ensure a user-friendly experience, our packages use standard typed modern Python syntax that provides a seamless integration with integrated development environments. Our Python-centric approach significantly enhances code analysis, error detection, and refactoring capabilities, positioning Poincaré and SimBio as valuable tools for the modeling community. Availability and implementation: Poincaré and SimBio are released under the MIT license. Their source code is available on GitHub (https://github.com/maurosilber/poincare and https://github.com/hgrecco/simbio) and can be installed from PyPI or conda-forge.
      PubDate: Tue, 30 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae465
      Issue No: Vol. 40, No. 8 (2024)
       
  • HDXBoxeR: an R package for statistical analysis and visualization of
           multiple Hydrogen–Deuterium Exchange Mass-Spectrometry datasets of
           different protein states

      First page: btae479
      Abstract: Summary: Hydrogen–Deuterium Exchange Mass Spectrometry (HDX-MS) is a powerful protein characterization technique that provides insights into protein dynamics and flexibility at the peptide level. However, analyzing HDX-MS data presents a significant challenge due to the wealth of information it generates. Each experiment produces data for hundreds of peptides, often measured in triplicate across multiple time points. Comparisons between different protein states create distinct datasets containing thousands of peptides that require matching, rigorous statistical evaluation, and visualization. Our open-source R package, HDXBoxeR, is a comprehensive tool designed to facilitate statistical analysis and comparison of multiple sets among samples and time points for different protein states, along with data visualization. Availability and implementation: HDXBoxeR is accessible as an R package (https://cran.r-project.org/web//packages/HDXBoxeR) and on GitHub: mkajano/HDXBoxeR.
      PubDate: Tue, 30 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae479
      Issue No: Vol. 40, No. 8 (2024)
       
  • Epigenomics coverage data extraction and aggregation in R with
           tidyCoverage

      First page: btae487
      Abstract: Summary: The tidyCoverage R package provides a framework for intuitive investigation of collections of genomic tracks over genomic features, relying on the principle of tidy data manipulation. It defines two data structures, CoverageExperiment and AggregatedCoverage classes, directly extending the SummarizedExperiment fundamental class, and introduces a principled approach to exploring genome-wide data. This infrastructure facilitates the extraction and manipulation of genomic coverage track data across individual or multiple sets of thousands of genomic loci. This allows the end user to rapidly visualize track coverage at individual genomic loci or aggregated coverage profiles over sets of genomic loci. tidyCoverage seamlessly combines with the existing Bioconductor ecosystem to accelerate the integration of genome-wide track data in epigenomic analysis workflows. tidyCoverage emerges as a valuable tool, contributing to the advancement of epigenomics research by promoting consistency, reproducibility, and accessibility in data analysis. Availability and implementation: tidyCoverage is an R package freely available from Bioconductor ≥ 3.19 (https://www.bioconductor.org/packages/tidyCoverage) for R ≥ 4.4. The software is distributed under the MIT License and is accompanied by example files and data.
      PubDate: Mon, 29 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae487
      Issue No: Vol. 40, No. 8 (2024)
       
  • ClusterMatch aligns single-cell RNA-sequencing data at the multi-scale
           cluster level via stable matching

      First page: btae480
      Abstract: Motivation: Unsupervised clustering of single-cell RNA sequencing (scRNA-seq) data holds the promise of characterizing known and novel cell types in various biological and clinical contexts. However, intrinsic multi-scale clustering resolutions pose challenges in dealing with multiple sources of variability in the high-dimensional and noisy data. Results: We present ClusterMatch, a stable match optimization model to align scRNA-seq data at the cluster level. On the one hand, ClusterMatch leverages the mutual correspondence by canonical correlation analysis and multi-scale Louvain clustering algorithms to identify clusters with optimized resolutions. On the other hand, it utilizes a stable matching framework to align scRNA-seq data in the latent space while maintaining interpretability with overlapped marker gene sets. Through extensive experiments, we demonstrate the efficacy of ClusterMatch in data integration, cell type annotation, and cross-species/timepoint alignment scenarios. Our results show ClusterMatch’s ability to utilize both global and local information of scRNA-seq data, set the appropriate resolution of multi-scale clustering, and offer interpretability by utilizing marker genes. Availability and implementation: The code of ClusterMatch software is freely available at https://github.com/AMSSwanglab/ClusterMatch.
      PubDate: Mon, 29 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae480
      Issue No: Vol. 40, No. 8 (2024)
       
  • DeepCRISTL: deep transfer learning to predict CRISPR/Cas9 on-target
           editing efficiency in specific cellular contexts

      First page: btae481
      Abstract: Motivation: CRISPR/Cas9 technology has been revolutionizing the field of gene editing. Guide RNAs (gRNAs) enable Cas9 proteins to target specific genomic loci for editing. However, editing efficiency varies between gRNAs and so computational methods were developed to predict editing efficiency for any gRNA of interest. High-throughput datasets of Cas9 editing efficiencies were produced to train machine-learning models to predict editing efficiency. However, these high-throughput datasets have a low correlation with functional and endogenous datasets, which are too small to train accurate machine-learning models on. Results: We developed DeepCRISTL, a deep-learning model to predict the editing efficiency in a specific cellular context. DeepCRISTL takes advantage of high-throughput datasets to learn general patterns of gRNA editing efficiency and then fine-tunes the model on functional or endogenous data to fit a specific cellular context. We tested two state-of-the-art models trained on high-throughput datasets for editing efficiency prediction, our newly improved DeepHF and CRISPRon, combined with various transfer-learning approaches. The combination of CRISPRon and fine-tuning all model weights was the overall best performer. DeepCRISTL outperformed state-of-the-art methods in predicting editing efficiency in a specific cellular context on functional and endogenous datasets. Using saliency maps, we identified and compared the important features learned by DeepCRISTL across cellular contexts. We believe DeepCRISTL will improve prediction performance in many other CRISPR/Cas9 editing contexts by leveraging transfer learning to utilize both high-throughput datasets and smaller and more biologically relevant datasets. Availability and implementation: DeepCRISTL is available via https://github.com/OrensteinLab/DeepCRISTL.
      PubDate: Mon, 29 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae481
      Issue No: Vol. 40, No. 8 (2024)
       
  • PhysioFit: a software to quantify cell growth parameters and extracellular
           fluxes

      First page: btae488
      Abstract: Summary: Quantification of growth parameters and extracellular uptake and production fluxes is central in systems and synthetic biology. Fluxes can be estimated using various mathematical models by fitting time–course measurements of the concentration of cells and extracellular substrates and products. A single tool is available to non-computational biologists to calculate extracellular fluxes, but it is hardly interoperable and is limited to a single hard-coded growth model. We present our open-source flux calculation software, PhysioFit, which can be used with any growth model and is interoperable by design. PhysioFit includes some of the most common growth models, and advanced users can implement additional models to calculate extracellular fluxes and other growth parameters for metabolic systems or experimental setups that follow alternative kinetics. PhysioFit can be used as a Python library and offers a graphical user interface for intuitive use by end-users and a command-line interface to streamline integration into existing pipelines. Availability and implementation: PhysioFit v3 is implemented in Python 3 and was tested on Windows, Unix, and MacOS platforms. The source code and the documentation are freely distributed under GPL3 license at https://github.com/MetaSys-LISBP/PhysioFit/ and https://physiofit.readthedocs.io/.
      PubDate: Mon, 29 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae488
      Issue No: Vol. 40, No. 8 (2024)
       
  • MuCoCP: a priori chemical knowledge-based multimodal contrastive learning
           pre-trained neural network for the prediction of cyclic peptide membrane
           penetration ability

      First page: btae473
      Abstract: Motivation: There has been a burgeoning interest in cyclic peptide therapeutics due to their various outstanding advantages and strong potential for drug formation. However, it is undoubtedly costly and inefficient to use traditional wet lab methods to clarify their biological activities. Using artificial intelligence instead is a more energy-efficient and faster approach. MuCoCP aims to build a complete pre-trained model for extracting potential features of cyclic peptides, which can be fine-tuned to accurately predict cyclic peptide bioactivity on various downstream tasks. To maximize its effectiveness, we use a novel data augmentation method based on a priori chemical knowledge and multiple unsupervised training objective functions to greatly improve the information-grabbing ability of the model. Results: To assay the efficacy of the model, we conducted validation on the membrane permeability of cyclic peptides, which achieved an accuracy of 0.87 and R-squared of 0.503 on CycPeptMPDB using semi-supervised training and obtained an accuracy of 0.84 and R-squared of 0.384 using a model with frozen parameters on an external dataset. This result is state-of-the-art, which substantiates the stability and generalization capability of MuCoCP. It means that MuCoCP can fully explore the high-dimensional information of cyclic peptides and make accurate predictions on downstream bioactivity tasks, which will serve as a guide for the future de novo design of cyclic peptide drugs and promote the development of cyclic peptide drugs. Availability and implementation: All code used in our proposed method can be found at https://github.com/lennonyu11234/MuCoCP.
      PubDate: Sat, 27 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae473
      Issue No: Vol. 40, No. 8 (2024)
       
  • Best practices to evaluate the impact of biomedical research
           software—metric collection beyond citations

      First page: btae469
      Abstract: Motivation: Software is vital for the advancement of biology and medicine. Impact evaluations of scientific software have primarily emphasized traditional citation metrics of associated papers, despite these metrics inadequately capturing the dynamic picture of impact and despite challenges with improper citation. Results: To understand how software developers evaluate their tools, we conducted a survey of participants in the Informatics Technology for Cancer Research (ITCR) program funded by the National Cancer Institute (NCI). We found that although developers realize the value of more extensive metric collection, they find a lack of funding and time hindering. We also investigated software among this community for how often infrastructure that supports more nontraditional metrics was implemented and how this impacted rates of papers describing usage of the software. We found that infrastructure such as social media presence, more in-depth documentation, the presence of software health metrics, and clear information on how to contact developers seemed to be associated with increased mention rates. Analysing more diverse metrics can enable developers to better understand user engagement, justify continued funding, identify novel use cases, pinpoint improvement areas, and ultimately amplify their software’s impact. Challenges are associated, including distorted or misleading metrics, as well as ethical and security concerns. More attention to nuances involved in capturing impact across the spectrum of biomedical software is needed. For funders and developers, we outline guidance based on experience from our community. By considering how we evaluate software, we can empower developers to create tools that more effectively accelerate biological and medical research progress. Availability and implementation: More information about the analysis, as well as access to data and code, is available at https://github.com/fhdsl/ITCR_Metrics_manuscript_website.
      PubDate: Sat, 27 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae469
      Issue No: Vol. 40, No. 8 (2024)
       
  • BELHD: improving biomedical entity linking with homonym disambiguation

      First page: btae474
      Abstract: Motivation: Biomedical entity linking (BEL) is the task of grounding entity mentions to a given knowledge base (KB). Recently, neural name-based methods, i.e. systems that identify the most appropriate name in the KB for a given mention using a neural network (either via dense retrieval or autoregressive modeling), achieved remarkable results for the task, without requiring manual tuning or definition of domain/entity-specific rules. However, as name-based methods directly return KB names, they cannot cope with homonyms, i.e. different KB entities sharing the exact same name. This significantly affects their performance for KBs where homonyms account for a large number of entity mentions (e.g. UMLS and NCBI Gene). Results: We present BELHD (Biomedical Entity Linking with Homonym Disambiguation), a new name-based method that copes with this challenge. BELHD builds upon the BioSyn model with two crucial extensions. First, it performs pre-processing of the KB, during which it expands homonyms with a specifically constructed disambiguating string, thus enforcing unique linking decisions. Second, it introduces candidate sharing, a novel strategy that strengthens the overall training signal by including similar mentions from the same document as positive or negative examples, according to their corresponding KB identifier. Experiments with 10 corpora and 5 entity types show that BELHD improves upon current neural state-of-the-art approaches, achieving the best results in 6 out of 10 corpora with an average improvement of 4.55pp recall@1. Furthermore, the KB preprocessing is orthogonal to the prediction model and thus can also improve other neural methods, which we exemplify for GenBioEL, a generative name-based BEL approach. Availability and implementation: The code to reproduce our experiments can be found at: https://github.com/sg-wbi/belhd.
      PubDate: Sat, 27 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae474
      Issue No: Vol. 40, No. 8 (2024)
       
  • GUANIN: an all-in-one GUi-driven analyzer for NanoString interactive
           normalization

      First page: btae462
      Abstract: Summary: Most tools for normalizing NanoString gene expression data, apart from the default NanoString nCounter software, are R packages that focus on technical normalization and lack configurable parameters. However, content normalization is the most sensitive, experiment-specific, and relevant step to preprocess NanoString data. Currently this step requires the use of multiple tools and a deep understanding of data management by the researcher. We present GUANIN, a comprehensive normalization tool that integrates both new and well-established methods, offering a wide variety of options to introduce, filter, choose, and evaluate reference genes for content normalization. GUANIN allows the introduction of genes from an endogenous subset as reference genes, addressing housekeeping-related selection problems. It performs a specific and straightforward normalization approach for each experiment, using a wide variety of parameters with suggested default values. GUANIN provides a large number of informative output files that enable the iterative refinement of the normalization process. In terms of normalization, GUANIN matches or outperforms other available methods. Importantly, it allows researchers to interact comprehensively with the data preprocessing step without programming knowledge, thanks to its easy-to-use Graphical User Interface (GUI). Availability and implementation: GUANIN can be installed with pip install GUANIN and it is available at https://pypi.org/project/guanin/. Source code, documentation, and case studies are available at https://github.com/julimontoto/guanin under the GPLv3 license.
      PubDate: Thu, 25 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae462
      Issue No: Vol. 40, No. 8 (2024)
       
  • SpatialQC: automated quality control for spatial transcriptome data

      First page: btae458
      Abstract: Summary: The advent of spatial transcriptomics has revolutionized our understanding of the spatial heterogeneity in tissues, providing unprecedented insights into the cellular and molecular mechanisms underlying biological processes. Although quality control (QC) is critical for downstream data analyses, there is currently a lack of specialized tools for one-stop spatial transcriptome QC. Here, we introduce SpatialQC, a one-stop QC pipeline, which generates comprehensive QC reports and produces clean data in an interactive fashion. SpatialQC is widely applicable to spatial transcriptomic techniques. Availability and implementation: Source code and user manuals are available via https://github.com/mgy520/spatialQC, and deposited on Zenodo (https://doi.org/10.5281/zenodo.12634669).
      PubDate: Thu, 25 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae458
      Issue No: Vol. 40, No. 8 (2024)
       
  • Representing core gene expression activity relationships using the latent
           structure implicit in Bayesian networks

      First page: btae463
      Abstract: Motivation: Many types of networks, such as co-expression or ChIP-seq-based gene-regulatory networks, provide useful information for biomedical studies. However, they are often too full of connections and difficult to interpret, forming “indecipherable hairballs.” Results: To address this issue, we propose that a Bayesian network can summarize the core relationships between gene expression activities. This network, which we call the LatentDAG, is substantially simpler than conventional co-expression networks and ChIP-seq networks (by two orders of magnitude). It provides clearer clusters, without extraneous cross-cluster connections, and clear separators between modules. Moreover, one can find a number of clear examples showing how it bridges the connection between steps in the transcriptional regulatory network and other networks (e.g. RNA-binding protein). In conjunction with a graph neural network, the LatentDAG works better than other biological networks in a variety of tasks, including prediction of gene conservation and clustering genes. Availability and implementation: Code is available at https://github.com/gersteinlab/LatentDAG
      PubDate: Thu, 25 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae463
      Issue No: Vol. 40, No. 8 (2024)
       
  • TSpred: a robust prediction framework for TCR–epitope interactions using
           paired chain TCR sequence data

      First page: btae472
      Abstract: Motivation: Prediction of T-cell receptor (TCR)–epitope interactions is important for many applications in biomedical research, such as cancer immunotherapy and vaccine design. The prediction of TCR–epitope interactions remains challenging especially for novel epitopes, due to the scarcity of available data. Results: We propose TSpred, a new deep learning approach for the pan-specific prediction of TCR binding specificity based on paired chain TCR data. We develop a robust model that generalizes well to unseen epitopes by combining the predictive power of CNN and the attention mechanism. In particular, we design a reciprocal attention mechanism which focuses on extracting the patterns underlying TCR–epitope interactions. Upon a comprehensive evaluation of our model, we find that TSpred achieves state-of-the-art performances in both seen and unseen epitope specificity prediction tasks. Also, compared to other predictors, TSpred is more robust to bias related to peptide imbalance in the dataset. In addition, the reciprocal attention component of our model allows for model interpretability by capturing structurally important binding regions. Results indicate that TSpred is a robust and reliable method for the task of TCR–epitope binding prediction. Availability and implementation: Source code is available at https://github.com/ha01994/TSpred.
      PubDate: Thu, 25 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae472
      Issue No: Vol. 40, No. 8 (2024)
       
  • PostFocus: automated selective post-acquisition high-throughput focus
           restoration using diffusion model for label-free time-lapse microscopy

      First page: btae467
      Abstract: Motivation: High-throughput time-lapse imaging is a fundamental tool for efficient living cell profiling at single-cell resolution. Label-free phase-contrast video microscopy enables noninvasive, nontoxic, and long-term imaging. The tradeoff between speed and throughput, however, implies that despite the state-of-the-art autofocusing algorithms, out-of-focus cells are unavoidable due to the migratory nature of immune cells (velocities >10 μm/min). Here, we propose PostFocus to (i) identify out-of-focus images within time-lapse sequences with a classifier, and (ii) deploy a de-noising diffusion probabilistic model to yield reliable in-focus images. Results: The de-noising diffusion probabilistic model outperformed deep discriminative models, with superior performance on the whole image and around cell boundaries. In addition, PostFocus improves the accuracy of image analysis (cell and contact detection) and the yield of usable videos. Availability and implementation: Open-source code and sample data are available at: https://github.com/kwu14victor/PostFocus.
      PubDate: Tue, 23 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae467
      Issue No: Vol. 40, No. 8 (2024)
       
  • FAIRsoft—a practical implementation of FAIR principles for research
           software

      First page: btae464
      Abstract: Motivation: Software plays a crucial and growing role in research. Unfortunately, the computational component in Life Sciences research is often challenging to reproduce and verify. It could be undocumented, opaque, contain unknown errors that affect the outcome, or be directly unavailable and impossible to use for others. These issues are detrimental to the overall quality of scientific research. One step to address this problem is the formulation of principles that research software in the domain should meet to ensure its quality and sustainability, resembling the FAIR (findable, accessible, interoperable, and reusable) data principles. Results: We present here a comprehensive series of quantitative indicators based on a pragmatic interpretation of the FAIR Principles and their implementation on OpenEBench, ELIXIR’s open platform providing both support for scientific benchmarking and an active observatory of quality-related features for Life Sciences research software. The results serve to understand the current practices around research software quality-related features and provide objective indications for improving them. Availability and implementation: Software metadata, from 11 different sources, collected, integrated, and analysed in the context of this manuscript are available at https://doi.org/10.5281/zenodo.7311067. Code used for software metadata retrieval and processing is available in the following repository: https://gitlab.bsc.es/inb/elixir/software-observatory/FAIRsoft_ETL.
      PubDate: Mon, 22 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae464
      Issue No: Vol. 40, No. 8 (2024)
       
  • Genome-wide analysis and visualization of copy number with CNVpytor in
           igv.js

      First page: btae453
      Abstract: Summary: Copy number variation (CNV) and alteration (CNA) analysis is a crucial component in many genomic studies and its applications span from basic research to clinical diagnostics and personalized medicine. CNVpytor is a tool featuring a read depth-based caller and a combined read depth and B-allele frequency (BAF) based 2D caller to find CNVs and CNAs. The tool stores processed intermediate data and CNV/CNA calls in a compact HDF5 file (the pytor file). Here, we describe a new track in igv.js that utilizes pytor and whole genome variant files as input for on-the-fly read depth and BAF visualization, CNV/CNA calling and analysis. Embedding into HTML pages and Jupyter Notebooks enables convenient remote data access and visualization, simplifying interpretation and analysis of omics data. Availability and implementation: The CNVpytor track is integrated with igv.js and available at https://github.com/igvteam/igv.js. The documentation is available at https://github.com/igvteam/igv.js/wiki/cnvpytor. Usage can be tested in the IGV-Web app at https://igv.org/app and also on https://github.com/abyzovlab/CNVpytor.
      PubDate: Wed, 17 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae453
      Issue No: Vol. 40, No. 8 (2024)
       
  • Fine-tuning protein embeddings for functional similarity evaluation

      First page: btae445
      Abstract: Motivation: Proteins with unknown function are frequently compared to better characterized relatives, either using sequence similarity, or recently through similarity in a learned embedding space. Through comparison, protein sequence embeddings allow for interpretable and accurate annotation of proteins, as well as for downstream tasks such as clustering for unsupervised discovery of protein families. However, it is unclear whether embeddings can be deliberately designed to improve their use in these downstream tasks. Results: We find that for functional annotation of proteins, as represented by Gene Ontology (GO) terms, direct fine-tuning of language models on a simple classification loss has an immediate positive impact on protein embedding quality. Fine-tuned embeddings show stronger performance as representations for K-nearest neighbor classifiers, reaching stronger performance for GO annotation than even directly comparable fine-tuned classifiers, while maintaining interpretability through protein similarity comparisons. They also maintain their quality in related tasks, such as rediscovering protein families with clustering. Availability and implementation: github.com/mofradlab/go_metric
      PubDate: Wed, 10 Jul 2024 00:00:00 GMT
      DOI: 10.1093/bioinformatics/btae445
      Issue No: Vol. 40, No. 8 (2024)
       
 
JournalTOCs
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762
 


 
Asian Journal of Poultry Science     Open Access   (Followers: 3)
Atti della Accademia Peloritana dei Pericolanti - Classe di Scienze Medico-Biologiche     Open Access  
Australian Life Scientist     Full-text available via subscription   (Followers: 2)
Australian Mammalogy     Hybrid Journal   (Followers: 8)
Autophagy     Hybrid Journal   (Followers: 8)
Avian Biology Research     Hybrid Journal   (Followers: 4)
Avian Conservation and Ecology     Open Access   (Followers: 19)
Bacterial Empire     Open Access   (Followers: 1)
Bacteriology Journal     Open Access   (Followers: 2)
Bacteriophage     Full-text available via subscription   (Followers: 2)
Bangladesh Journal of Bioethics     Open Access  
Bangladesh Journal of Scientific Research     Open Access  
Between the Species     Open Access   (Followers: 2)
BIO Web of Conferences     Open Access  
BIO-SITE : Biologi dan Sains Terapan     Open Access  
Biocatalysis and Biotransformation     Hybrid Journal   (Followers: 4)
BioCentury Innovations     Full-text available via subscription   (Followers: 2)
Biochemistry and Cell Biology     Hybrid Journal   (Followers: 18)
Biochimie     Hybrid Journal   (Followers: 2)
BioControl     Hybrid Journal   (Followers: 2)
Biocontrol Science and Technology     Hybrid Journal   (Followers: 5)
Biodemography and Social Biology     Hybrid Journal   (Followers: 1)
BIODIK : Jurnal Ilmiah Pendidikan Biologi     Open Access  
BioDiscovery     Open Access   (Followers: 2)
Biodiversity : Research and Conservation     Open Access   (Followers: 30)
Biodiversity Data Journal     Open Access   (Followers: 7)
Biodiversity Informatics     Open Access   (Followers: 3)
Biodiversity Information Science and Standards     Open Access   (Followers: 5)
Biodiversity Observations     Open Access   (Followers: 2)
Bioeksperimen : Jurnal Penelitian Biologi     Open Access  
Bioelectrochemistry     Hybrid Journal   (Followers: 1)
Bioelectromagnetics     Hybrid Journal   (Followers: 1)
Bioenergy Research     Hybrid Journal   (Followers: 3)
Bioengineering and Bioscience     Open Access   (Followers: 1)
BioEssays     Hybrid Journal   (Followers: 10)
Bioethics     Hybrid Journal   (Followers: 20)
BioéthiqueOnline     Open Access   (Followers: 1)
Biogeographia : The Journal of Integrative Biogeography     Open Access   (Followers: 2)
Biogeosciences (BG)     Open Access   (Followers: 19)
Biogeosciences Discussions (BGD)     Open Access   (Followers: 3)
Bioinformatics     Hybrid Journal   (Followers: 324)
Bioinformatics Advances : Journal of the International Society for Computational Biology     Open Access   (Followers: 5)
Bioinformatics and Biology Insights     Open Access   (Followers: 15)
Biointerphases     Open Access   (Followers: 1)
Biojournal of Science and Technology     Open Access  
Biologia     Hybrid Journal   (Followers: 1)
Biologia Futura     Hybrid Journal  
Biologia on-line : Revista de divulgació de la Facultat de Biologia     Open Access  
Biological Bulletin     Partially Free   (Followers: 6)
Biological Control     Hybrid Journal   (Followers: 6)
Biological Invasions     Hybrid Journal   (Followers: 24)
Biological Journal of the Linnean Society     Hybrid Journal   (Followers: 18)
Biological Procedures Online     Open Access  
Biological Psychiatry     Hybrid Journal   (Followers: 60)
Biological Psychology     Hybrid Journal   (Followers: 5)
Biological Research     Open Access   (Followers: 1)
Biological Rhythm Research     Hybrid Journal  
Biological Theory     Hybrid Journal   (Followers: 3)
Biological Trace Element Research     Hybrid Journal  
Biologicals     Full-text available via subscription   (Followers: 5)
Biologics: Targets & Therapy     Open Access   (Followers: 1)
Biologie Aujourd'hui     Full-text available via subscription  
Biologie in Unserer Zeit (Biuz)     Hybrid Journal   (Followers: 2)
Biologija     Open Access  
Biology     Open Access   (Followers: 5)
Biology and Philosophy     Hybrid Journal   (Followers: 19)
Biology Bulletin     Hybrid Journal   (Followers: 1)
Biology Bulletin Reviews     Hybrid Journal  
Biology Direct     Open Access   (Followers: 9)
Biology Methods and Protocols     Open Access  
Biology of Sex Differences     Open Access   (Followers: 1)
Biology of the Cell     Full-text available via subscription   (Followers: 8)
Biology, Medicine, & Natural Product Chemistry     Open Access   (Followers: 2)
Biomacromolecules     Hybrid Journal   (Followers: 21)
Biomarker Insights     Open Access   (Followers: 1)
Biomarkers     Hybrid Journal   (Followers: 6)

JournalTOCs
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762
 



JournalTOCs © 2009-