Subjects -> BIOLOGY (Total: 3134 journals)
    - BIOCHEMISTRY (239 journals)
    - BIOENGINEERING (143 journals)
    - BIOLOGY (1491 journals)
    - BIOPHYSICS (53 journals)
    - BIOTECHNOLOGY (243 journals)
    - BOTANY (220 journals)
    - CYTOLOGY AND HISTOLOGY (32 journals)
    - ENTOMOLOGY (67 journals)
    - GENETICS (152 journals)
    - MICROBIOLOGY (265 journals)
    - MICROSCOPY (13 journals)
    - ORNITHOLOGY (26 journals)
    - PHYSIOLOGY (73 journals)
    - ZOOLOGY (117 journals)

BIOLOGY (1491 journals)

Showing 1 - 200 of 1720 Journals sorted alphabetically
AAPS Journal     Hybrid Journal   (Followers: 29)
ACS Pharmacology & Translational Science     Hybrid Journal   (Followers: 3)
ACS Synthetic Biology     Hybrid Journal   (Followers: 39)
Acta Biologica Hungarica     Full-text available via subscription   (Followers: 6)
Acta Biologica Marisiensis     Open Access   (Followers: 5)
Acta Biologica Sibirica     Open Access   (Followers: 2)
Acta Biologica Turcica     Open Access   (Followers: 2)
Acta Biomaterialia     Hybrid Journal   (Followers: 32)
Acta Biotheoretica     Hybrid Journal   (Followers: 3)
Acta Chiropterologica     Full-text available via subscription   (Followers: 6)
acta ethologica     Hybrid Journal   (Followers: 7)
Acta Fytotechnica et Zootechnica     Open Access   (Followers: 3)
Acta Ichthyologica et Piscatoria     Open Access   (Followers: 5)
Acta Médica Costarricense     Open Access   (Followers: 2)
Acta Scientiarum. Biological Sciences     Open Access   (Followers: 2)
Acta Scientifica Naturalis     Open Access   (Followers: 4)
Acta Universitatis Agriculturae et Silviculturae Mendelianae Brunensis     Open Access   (Followers: 2)
Actualidades Biológicas     Open Access   (Followers: 1)
Advanced Biology     Hybrid Journal   (Followers: 1)
Advanced Health Care Technologies     Open Access   (Followers: 12)
Advanced Journal of Graduate Research     Open Access   (Followers: 2)
Advanced Membranes     Open Access   (Followers: 9)
Advanced Quantum Technologies     Hybrid Journal   (Followers: 5)
Advances in Biological Regulation     Hybrid Journal   (Followers: 4)
Advances in Biology     Open Access   (Followers: 16)
Advances in Biomarker Sciences and Technology     Open Access   (Followers: 2)
Advances in Biosensors and Bioelectronics     Open Access   (Followers: 8)
Advances in Cell Biology/ Medical Journal of Cell Biology     Open Access   (Followers: 28)
Advances in Ecological Research     Full-text available via subscription   (Followers: 47)
Advances in Environmental Sciences - International Journal of the Bioflux Society     Open Access   (Followers: 17)
Advances in Enzyme Research     Open Access   (Followers: 11)
Advances in High Energy Physics     Open Access   (Followers: 27)
Advances in Life Science and Technology     Open Access   (Followers: 14)
Advances in Life Sciences     Open Access   (Followers: 6)
Advances in Marine Biology     Full-text available via subscription   (Followers: 29)
Advances in Virus Research     Full-text available via subscription   (Followers: 8)
Adversity and Resilience Science : Journal of Research and Practice     Hybrid Journal   (Followers: 3)
African Journal of Ecology     Hybrid Journal   (Followers: 18)
African Journal of Range & Forage Science     Hybrid Journal   (Followers: 12)
AFRREV STECH : An International Journal of Science and Technology     Open Access   (Followers: 3)
Ageing Research Reviews     Hybrid Journal   (Followers: 13)
Aggregate     Open Access   (Followers: 3)
Aging Cell     Open Access   (Followers: 23)
Agrokémia és Talajtan     Full-text available via subscription   (Followers: 2)
AJP Cell Physiology     Hybrid Journal   (Followers: 13)
AJP Endocrinology and Metabolism     Hybrid Journal   (Followers: 14)
AJP Lung Cellular and Molecular Physiology     Hybrid Journal   (Followers: 3)
Al-Kauniyah : Jurnal Biologi     Open Access  
Alasbimn Journal     Open Access   (Followers: 1)
Alces : A Journal Devoted to the Biology and Management of Moose     Open Access  
Alfarama Journal of Basic & Applied Sciences     Open Access   (Followers: 12)
All Life     Open Access   (Followers: 2)
AMB Express     Open Access   (Followers: 1)
Ambix     Hybrid Journal   (Followers: 3)
American Journal of Agricultural and Biological Sciences     Open Access   (Followers: 7)
American Journal of Bioethics     Hybrid Journal   (Followers: 17)
American Journal of Human Biology     Hybrid Journal   (Followers: 19)
American Journal of Plant Sciences     Open Access   (Followers: 24)
American Journal of Primatology     Hybrid Journal   (Followers: 17)
American Naturalist     Full-text available via subscription   (Followers: 82)
Amphibia-Reptilia     Hybrid Journal   (Followers: 5)
Anaerobe     Hybrid Journal   (Followers: 3)
Analytical Methods     Hybrid Journal   (Followers: 7)
Analytical Science Advances     Open Access   (Followers: 2)
Anatomia     Open Access   (Followers: 15)
Anatomical Science International     Hybrid Journal   (Followers: 3)
Animal Cells and Systems     Hybrid Journal   (Followers: 4)
Animal Microbiome     Open Access   (Followers: 7)
Animal Models and Experimental Medicine     Open Access  
Annales françaises d'Oto-rhino-laryngologie et de Pathologie Cervico-faciale     Full-text available via subscription   (Followers: 2)
Annales Henri Poincaré     Hybrid Journal   (Followers: 2)
Annales Universitatis Mariae Curie-Sklodowska, sectio C – Biologia     Open Access   (Followers: 1)
Annals of Applied Biology     Hybrid Journal   (Followers: 7)
Annals of Biomedical Engineering     Hybrid Journal   (Followers: 18)
Annals of Human Biology     Hybrid Journal   (Followers: 6)
Annals of Science and Technology     Open Access   (Followers: 2)
Annual Research & Review in Biology     Open Access   (Followers: 1)
Annual Review of Biomedical Engineering     Full-text available via subscription   (Followers: 19)
Annual Review of Cell and Developmental Biology     Full-text available via subscription   (Followers: 40)
Annual Review of Food Science and Technology     Full-text available via subscription   (Followers: 13)
Annual Review of Genomics and Human Genetics     Full-text available via subscription   (Followers: 32)
Antibiotics     Open Access   (Followers: 12)
Antioxidants     Open Access   (Followers: 4)
Antonie van Leeuwenhoek     Hybrid Journal   (Followers: 3)
Anzeiger für Schädlingskunde     Hybrid Journal   (Followers: 1)
Apidologie     Hybrid Journal   (Followers: 4)
Apmis     Hybrid Journal   (Followers: 1)
APOPTOSIS     Hybrid Journal   (Followers: 5)
Applied Biology     Open Access  
Applied Bionics and Biomechanics     Open Access   (Followers: 4)
Applied Phycology     Open Access   (Followers: 1)
Applied Vegetation Science     Full-text available via subscription   (Followers: 9)
Aquaculture Environment Interactions     Open Access   (Followers: 7)
Aquaculture International     Hybrid Journal   (Followers: 25)
Aquaculture Reports     Open Access   (Followers: 3)
Aquaculture, Aquarium, Conservation & Legislation - International Journal of the Bioflux Society     Open Access   (Followers: 9)
Aquatic Biology     Open Access   (Followers: 9)
Aquatic Ecology     Hybrid Journal   (Followers: 45)
Aquatic Ecosystem Health & Management     Hybrid Journal   (Followers: 16)
Aquatic Science and Technology     Open Access   (Followers: 4)
Aquatic Toxicology     Hybrid Journal   (Followers: 26)
Arabian Journal of Scientific Research / المجلة العربية للبحث العلمي     Open Access  
Archaea     Open Access   (Followers: 3)
Archiv für Molluskenkunde: International Journal of Malacology     Full-text available via subscription   (Followers: 1)
Archives of Biological Sciences     Open Access  
Archives of Microbiology     Hybrid Journal   (Followers: 9)
Archives of Natural History     Hybrid Journal   (Followers: 8)
Archives of Oral Biology     Hybrid Journal   (Followers: 2)
Archives of Virology     Hybrid Journal   (Followers: 6)
Archivum Immunologiae et Therapiae Experimentalis     Hybrid Journal   (Followers: 2)
Arid Ecosystems     Hybrid Journal   (Followers: 2)
Arquivos do Museu Dinâmico Interdisciplinar     Open Access  
Arthropod Structure & Development     Hybrid Journal   (Followers: 1)
Arthropod Systematics & Phylogeny     Open Access   (Followers: 13)
Artificial DNA: PNA & XNA     Hybrid Journal   (Followers: 2)
Artificial Intelligence in the Life Sciences     Open Access   (Followers: 1)
Asian Bioethics Review     Full-text available via subscription   (Followers: 2)
Asian Journal of Biological Sciences     Open Access   (Followers: 2)
Asian Journal of Biology     Open Access  
Asian Journal of Biotechnology and Bioresource Technology     Open Access  
Asian Journal of Cell Biology     Open Access   (Followers: 4)
Asian Journal of Developmental Biology     Open Access   (Followers: 1)
Asian Journal of Medical and Biological Research     Open Access   (Followers: 3)
Asian Journal of Nematology     Open Access   (Followers: 4)
Asian Journal of Poultry Science     Open Access   (Followers: 3)
Atti della Accademia Peloritana dei Pericolanti - Classe di Scienze Medico-Biologiche     Open Access  
Australian Life Scientist     Full-text available via subscription   (Followers: 2)
Australian Mammalogy     Hybrid Journal   (Followers: 8)
Autophagy     Hybrid Journal   (Followers: 8)
Avian Biology Research     Hybrid Journal   (Followers: 4)
Avian Conservation and Ecology     Open Access   (Followers: 19)
Bacterial Empire     Open Access   (Followers: 1)
Bacteriology Journal     Open Access   (Followers: 2)
Bacteriophage     Full-text available via subscription   (Followers: 2)
Bangladesh Journal of Bioethics     Open Access  
Bangladesh Journal of Scientific Research     Open Access  
Between the Species     Open Access   (Followers: 2)
BIO Web of Conferences     Open Access  
BIO-SITE : Biologi dan Sains Terapan     Open Access  
Biocatalysis and Biotransformation     Hybrid Journal   (Followers: 4)
BioCentury Innovations     Full-text available via subscription   (Followers: 2)
Biochemistry and Cell Biology     Hybrid Journal   (Followers: 18)
Biochimie     Hybrid Journal   (Followers: 2)
BioControl     Hybrid Journal   (Followers: 2)
Biocontrol Science and Technology     Hybrid Journal   (Followers: 5)
Biodemography and Social Biology     Hybrid Journal   (Followers: 1)
BIODIK : Jurnal Ilmiah Pendidikan Biologi     Open Access  
BioDiscovery     Open Access   (Followers: 2)
Biodiversity : Research and Conservation     Open Access   (Followers: 30)
Biodiversity Data Journal     Open Access   (Followers: 7)
Biodiversity Informatics     Open Access   (Followers: 3)
Biodiversity Information Science and Standards     Open Access   (Followers: 3)
Biodiversity Observations     Open Access   (Followers: 2)
Bioeksperimen : Jurnal Penelitian Biologi     Open Access  
Bioelectrochemistry     Hybrid Journal   (Followers: 1)
Bioelectromagnetics     Hybrid Journal   (Followers: 1)
Bioenergy Research     Hybrid Journal   (Followers: 3)
Bioengineering and Bioscience     Open Access   (Followers: 1)
BioEssays     Hybrid Journal   (Followers: 10)
Bioethics     Hybrid Journal   (Followers: 20)
BioéthiqueOnline     Open Access   (Followers: 1)
Biogeographia : The Journal of Integrative Biogeography     Open Access   (Followers: 2)
Biogeosciences (BG)     Open Access   (Followers: 19)
Biogeosciences Discussions (BGD)     Open Access   (Followers: 3)
Bioinformatics     Hybrid Journal   (Followers: 307)
Bioinformatics Advances : Journal of the International Society for Computational Biology     Open Access   (Followers: 4)
Bioinformatics and Biology Insights     Open Access   (Followers: 14)
Biointerphases     Open Access   (Followers: 1)
Biojournal of Science and Technology     Open Access  
Biologia     Hybrid Journal   (Followers: 1)
Biologia Futura     Hybrid Journal  
Biologia on-line : Revista de divulgació de la Facultat de Biologia     Open Access  
Biological Bulletin     Partially Free   (Followers: 6)
Biological Control     Hybrid Journal   (Followers: 6)
Biological Invasions     Hybrid Journal   (Followers: 24)
Biological Journal of the Linnean Society     Hybrid Journal   (Followers: 18)
Biological Procedures Online     Open Access  
Biological Psychiatry     Hybrid Journal   (Followers: 59)
Biological Psychology     Hybrid Journal   (Followers: 5)
Biological Research     Open Access   (Followers: 1)
Biological Rhythm Research     Hybrid Journal  
Biological Theory     Hybrid Journal   (Followers: 3)
Biological Trace Element Research     Hybrid Journal  
Biologicals     Full-text available via subscription   (Followers: 5)
Biologics: Targets & Therapy     Open Access   (Followers: 1)
Biologie Aujourd'hui     Full-text available via subscription  
Biologie in Unserer Zeit (Biuz)     Hybrid Journal   (Followers: 2)
Biologija     Open Access  
Biology     Open Access   (Followers: 5)
Biology and Philosophy     Hybrid Journal   (Followers: 19)
Biology Bulletin     Hybrid Journal   (Followers: 1)
Biology Bulletin Reviews     Hybrid Journal  
Biology Direct     Open Access   (Followers: 9)
Biology Methods and Protocols     Open Access  
Biology of Sex Differences     Open Access   (Followers: 1)
Biology of the Cell     Full-text available via subscription   (Followers: 8)
Biology, Medicine, & Natural Product Chemistry     Open Access   (Followers: 2)
Biomacromolecules     Hybrid Journal   (Followers: 21)
Biomarker Insights     Open Access   (Followers: 1)
Biomarkers     Hybrid Journal   (Followers: 5)


Bioinformatics Advances : Journal of the International Society for Computational Biology
Number of Followers: 4  

  This is an Open Access journal
ISSN (Online) 2635-0041
Published by Oxford University Press  [425 journals]
  • Assessing the merits: an opinion on the effectiveness of simulation
           techniques in tumor subclonal reconstruction

    • First page: vbae094
      Abstract: Summary: Neoplastic tumors originate from a single cell, and their evolution can be traced through lineages characterized by mutations, copy number alterations, and structural variants. These lineages are reconstructed and mapped onto evolutionary trees with algorithmic approaches. However, without ground truth benchmark sets, the validity of an algorithm remains uncertain, limiting potential clinical applicability. With a growing number of algorithms available, there is urgent need for standardized benchmark sets to evaluate their merits. Benchmark sets rely on in silico simulations of tumor sequence, but there are no accepted standards for simulation tools, presenting a major obstacle to progress in this field. Availability and implementation: All analysis done in the paper was based on publicly available data from the publication of each accessed tool.
      PubDate: Wed, 26 Jun 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae094
      Issue No: Vol. 4, No. 1 (2024)
       
  • Perspectives on computational modeling of biological systems and the
           significance of the SysMod community

    • First page: vbae090
      Abstract: Motivation: In recent years, applying computational modeling to systems biology has caused a substantial surge in both discovery and practical applications and a significant shift in our understanding of the complexity inherent in biological systems. Results: In this perspective article, we briefly overview computational modeling in biology, highlighting recent advancements such as multi-scale modeling due to the omics revolution, single-cell technology, and integration of artificial intelligence and machine learning approaches. We also discuss the primary challenges faced: integration, standardization, model complexity, scalability, and interdisciplinary collaboration. Lastly, we highlight the contribution made by the Computational Modeling of Biological Systems (SysMod) Community of Special Interest (COSI) associated with the International Society of Computational Biology (ISCB) in driving progress within this rapidly evolving field through community engagement (via both in person and virtual meetings, social media interactions), webinars, and conferences. Availability and implementation: Additional information about SysMod is available at https://sysmod.info.
      PubDate: Wed, 26 Jun 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae090
      Issue No: Vol. 4, No. 1 (2024)
       
  • racoon_clip—a complete pipeline for single-nucleotide analyses of
           iCLIP and eCLIP data

    • First page: vbae084
      Abstract: Motivation: A vast variety of biological questions connected to RNA-binding proteins can be tackled with UV crosslinking and immunoprecipitation (CLIP) experiments. However, the processing and analysis of CLIP data are rather complex. Moreover, different types of CLIP experiments like iCLIP or eCLIP are often processed in different ways, reducing comparability between multiple experiments. Therefore, we aimed to build an easy-to-use computational tool for the processing of CLIP data that can be used for both iCLIP and eCLIP data, as well as data from other truncation-based CLIP methods. Results: Here, we introduce racoon_clip, a sustainable and fully automated pipeline for the complete processing of iCLIP and eCLIP data to extract RNA binding signal at single-nucleotide resolution. racoon_clip is easy to install and execute, with multiple pre-settings and fully customizable parameters, and outputs a conclusive summary report with visualizations and statistics for all analysis steps. Availability and implementation: racoon_clip is implemented as a Snakemake-powered command line tool (Snakemake version ≥7.22, Python version ≥3.9). The latest release can be downloaded from GitHub (https://github.com/ZarnackGroup/racoon_clip/tree/main) and installed via pip. A detailed documentation, including installation, usage, and customization, can be found at https://racoon-clip.readthedocs.io/en/latest/. The example datasets can be downloaded from the Short Read Archive (SRA; iCLIP: SRR5646576, SRR5646577, SRR5646578) or the ENCODE Project (eCLIP: ENCSR202BFN).
      PubDate: Wed, 26 Jun 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae084
      Issue No: Vol. 4, No. 1 (2024)
       
  • Evergene: an interactive webtool for large-scale gene-centric analysis of
           primary tumours

    • First page: vbae092
      Abstract: Motivation: The data sharing of large comprehensive cancer research projects, such as The Cancer Genome Atlas (TCGA), has improved the availability of high-quality data to research labs around the world. However, due to the volume and inherent complexity of high-throughput omics data, analysis of this is limited by the capacity for performing data processing through programming languages such as R or Python. Existing webtools lack functionality that supports large-scale analysis; typically, users can only input one gene, or a gene list condensed into a gene set, instead of individual gene-level analysis. Furthermore, analysis results are usually displayed without other sample-level molecular or clinical annotations. To address these gaps in the existing webtools, we have developed Evergene using R and Shiny. Results: Evergene is a user-friendly webtool that utilizes RNA-sequencing data, alongside other sample and clinical annotation, for large-scale gene-centric analysis, including principal component analysis (PCA), survival analysis (SA), and correlation analysis (CA). Moreover, Evergene achieves in-depth analysis of cancer transcriptomic data which can be explored through dimensional reduction methods, relating gene expression with clinical events or other sample information, such as ethnicity, histological classification, and molecular indices. Lastly, users can upload custom data to Evergene for analysis. Availability and implementation: Evergene webtool is available at https://bshihlab.shinyapps.io/evergene/. The source code and example user input dataset are available at https://github.com/bshihlab/evergene.
      PubDate: Tue, 18 Jun 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae092
      Issue No: Vol. 4, No. 1 (2024)
       
  • Discovering genomic islands in unannotated bacterial genomes using
           sequence embedding

    • First page: vbae089
      Abstract: Motivation: Genomic islands (GEIs) are clusters of genes in bacterial genomes that are typically acquired by horizontal gene transfer. GEIs play a crucial role in the evolution of bacteria by rapidly introducing genetic diversity and thus helping them adapt to changing environments. Specifically of interest to human health, many GEIs contain pathogenicity and antimicrobial resistance genes. Detecting GEIs is, therefore, an important problem in biomedical and environmental research. There have been many previous studies for computationally identifying GEIs. Still, most of these studies rely on detecting anomalies in the unannotated nucleotide sequences or on a fixed set of known features on annotated nucleotide sequences. Results: Here, we present TreasureIsland, which uses a new unsupervised representation of DNA sequences to predict GEIs. We developed a high-precision boundary detection method featuring an incremental fine-tuning of GEI borders, and we evaluated the accuracy of this framework using a new comprehensive reference dataset, Benbow. We show that TreasureIsland’s accuracy rivals other GEI predictors, enabling efficient and faster identification of GEIs in unannotated bacterial genomes. Availability and implementation: TreasureIsland is available under an MIT license at: https://github.com/FriedbergLab/GenomicIslandPrediction.
      PubDate: Mon, 17 Jun 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae089
      Issue No: Vol. 4, No. 1 (2024)
       
  • scVIC: deep generative modeling of heterogeneity for scRNA-seq data

    • First page: vbae086
      Abstract: Motivation: Single-cell RNA sequencing (scRNA-seq) has become a valuable tool for studying cellular heterogeneity. However, the analysis of scRNA-seq data is challenging because of inherent noise and technical variability. Existing methods often struggle to simultaneously explore heterogeneity across cells, handle dropout events, and account for batch effects. These drawbacks call for a robust and comprehensive method that can address these challenges and provide accurate insights into heterogeneity at the single-cell level. Results: In this study, we introduce scVIC, an algorithm designed to account for variational inference, while simultaneously handling biological heterogeneity and batch effects at the single-cell level. scVIC explicitly models both biological heterogeneity and technical variability to learn cellular heterogeneity in a manner free from dropout events and the bias of batch effects. By leveraging variational inference, we provide a robust framework for inferring the parameters of scVIC. To test the performance of scVIC, we employed both simulated and biological scRNA-seq datasets, either including, or not, batch effects. scVIC was found to outperform other approaches because of its superior clustering ability and circumvention of the batch effects problem. Availability and implementation: The code of scVIC and replication for this study are available at https://github.com/HiBearME/scVIC/tree/v1.0.
      PubDate: Thu, 13 Jun 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae086
      Issue No: Vol. 4, No. 1 (2024)
       
  • Demultiplexing of single-cell RNA-sequencing data using interindividual
           variation in gene expression

    • First page: vbae085
      Abstract: Motivation: Pooled designs for single-cell RNA sequencing, where many cells from distinct samples are processed jointly, offer increased throughput and reduced batch variation. This study describes expression-aware demultiplexing (EAD), a computational method that employs differential co-expression patterns between individuals to demultiplex pooled samples without any extra experimental steps. Results: We use synthetic sample pools and show that the top interindividual differentially co-expressed genes provide a distinct cluster of cells per individual, significantly enriching the regulation of metabolism. Our application of EAD to samples of six isogenic inbred mice demonstrated that controlling genetic and environmental effects can solve interindividual variations related to metabolic pathways. We utilized 30 samples from both sepsis and healthy individuals in six batches to assess the performance of classification approaches. The results indicate that combining genetic and EAD results can enhance the accuracy of assignments (Min. 0.94, Mean 0.98, Max. 1). The results were enhanced by an average of 1.4% when EAD and barcoding techniques were combined (Min. 1.25%, Median 1.33%, Max. 1.74%). Furthermore, we demonstrate that interindividual differential co-expression analysis within the same cell type can be used to identify cells from the same donor in different activation states. By analysing single-nuclei transcriptome profiles from the brain, we demonstrate that our method can be applied to nonimmune cells. Availability and implementation: EAD workflow is available at https://isarnassiri.github.io/scDIV/ as an R package called scDIV (acronym for single-cell RNA-sequencing data demultiplexing using interindividual variations).
      PubDate: Sat, 08 Jun 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae085
      Issue No: Vol. 4, No. 1 (2024)
       
  • OM2Seq: learning retrieval embeddings for optical genome mapping

    • First page: vbae079
      Abstract: Motivation: Genomics-based diagnostic methods that are quick, precise, and economical are essential for the advancement of precision medicine, with applications spanning the diagnosis of infectious diseases, cancer, and rare diseases. One technology that holds potential in this field is optical genome mapping (OGM), which is capable of detecting structural variations, epigenomic profiling, and microbial species identification. It is based on imaging of linearized DNA molecules that are stained with fluorescent labels, that are then aligned to a reference genome. However, the computational methods currently available for OGM fall short in terms of accuracy and computational speed. Results: This work introduces OM2Seq, a new approach for the rapid and accurate mapping of DNA fragment images to a reference genome. Based on a Transformer-encoder architecture, OM2Seq is trained on acquired OGM data to efficiently encode DNA fragment images and reference genome segments to a common embedding space, which can be indexed and efficiently queried using a vector database. We show that OM2Seq significantly outperforms the baseline methods in both computational speed (by 2 orders of magnitude) and accuracy. Availability and implementation: https://github.com/yevgenin/om2seq.
      PubDate: Wed, 05 Jun 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae079
      Issue No: Vol. 4, No. 1 (2024)
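
      The retrieval step described in the OM2Seq abstract, in which queries and reference segments share one embedding space and the closest reference is looked up, can be sketched roughly as follows. This is a minimal illustration and not the OM2Seq implementation: the vectors are made-up toy embeddings rather than Transformer-encoder outputs, a brute-force cosine-similarity scan stands in for the vector database, and the names `cosine_similarity`, `nearest_segment`, and the segment labels are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_segment(query, reference_index):
    """Return (label, score) of the reference embedding most similar to query.

    reference_index maps a hypothetical segment label to its embedding.
    A real system would replace this linear scan with a vector database.
    """
    best_label, best_score = None, -2.0  # cosine similarity is always >= -1
    for label, vector in reference_index.items():
        score = cosine_similarity(query, vector)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


# Toy reference embeddings (assumed values, for illustration only).
reference_index = {
    "segment_A": [1.0, 0.0, 0.0],
    "segment_B": [0.0, 1.0, 0.0],
    "segment_C": [0.0, 0.0, 1.0],
}

# Toy query embedding that lies closest to segment_B.
query = [0.1, 0.9, 0.2]
label, score = nearest_segment(query, reference_index)
print(label, round(score, 3))
```

      The linear scan costs one similarity computation per reference segment, which is why an indexed vector store, as the abstract mentions, matters at genome scale.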
       
  • Hapsolutely: a user-friendly tool integrating haplotype phasing, network
           construction, and haploweb calculation

    • First page: vbae083
      Abstract: Motivation: Haplotype networks are a routine approach to visualize relationships among alleles. Such visual analysis of single-locus data is still of importance, especially in species diagnosis and delimitation, where a limited amount of sequence data usually are available and sufficient, along with other datasets in the framework of integrative taxonomy. In diploid organisms, this often requires separating (phasing) sequences with heterozygotic positions, and typically separate programs are required for phasing, reformatting of input files, and haplotype network construction. We therefore developed Hapsolutely, a user-friendly program with an ergonomic graphical user interface that integrates haplotype phasing from single-locus sequences with five approaches for network/genealogy reconstruction. Results: Among the novel options implemented, Hapsolutely integrates phasing and graphical reconstruction steps of haplotype networks, supports input of species partition data in the common SPART and SPART-XML formats, and calculates and visualizes haplowebs and fields for recombination, thus allowing graphical comparison of allele distribution and allele sharing among subsets for the purpose of species delimitation. The new tool has been specifically developed with a focus on the workflow in alpha-taxonomy, where exploring fields for recombination across alternative species partitions may help species delimitation. Availability and implementation: Hapsolutely is written in Python, and integrates code from Phase, SeqPHASE, and PopART in C++ and Haxe. Compiled stand-alone executables for MS Windows and Mac OS along with a detailed manual can be downloaded from https://www.itaxotools.org; the source code is openly available on GitHub (https://github.com/iTaxoTools/Hapsolutely).
      PubDate: Wed, 05 Jun 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae083
      Issue No: Vol. 4, No. 1 (2024)
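
      The haploweb idea mentioned in the Hapsolutely abstract can be illustrated with a small sketch. It assumes the usual reading of a haploweb: two haplotypes are linked when they co-occur in a heterozygous individual, and each connected group of haplotypes is treated as one field for recombination. This is not Hapsolutely's code; the function name `fields_for_recombination` and the toy individuals are hypothetical, and the input is assumed to be already phased.

```python
def fields_for_recombination(individuals):
    """Group haplotypes into connected components of a haploweb.

    individuals maps an individual ID to its pair of phased haplotype
    labels. Haplotypes sharing an individual end up in the same group.
    """
    parent = {}

    def find(hap):
        # Register unseen haplotypes as their own root, then walk to the root.
        parent.setdefault(hap, hap)
        while parent[hap] != hap:
            parent[hap] = parent[parent[hap]]  # path halving
            hap = parent[hap]
        return hap

    def union(hap_a, hap_b):
        root_a, root_b = find(hap_a), find(hap_b)
        if root_a != root_b:
            parent[root_b] = root_a

    for hap_a, hap_b in individuals.values():
        # A homozygote (hap_a == hap_b) only registers its haplotype.
        union(hap_a, hap_b)

    groups = {}
    for hap in list(parent):
        groups.setdefault(find(hap), set()).add(hap)
    return sorted(sorted(group) for group in groups.values())


# Toy phased data (assumed for illustration): ind1 and ind2 share H2,
# ind3 is homozygous, ind4 carries two haplotypes seen nowhere else.
individuals = {
    "ind1": ("H1", "H2"),
    "ind2": ("H2", "H3"),
    "ind3": ("H4", "H4"),
    "ind4": ("H5", "H6"),
}
fields = fields_for_recombination(individuals)
print(fields)
```

      A union-find structure is used here because it keeps the grouping close to linear in the number of individuals; any connected-components routine would serve equally well.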
       
  • CellsFromSpace: a fast, accurate, and reference-free tool to deconvolve
           and annotate spatially distributed omics data

    • First page: vbae081
      Abstract: Motivation: Spatial transcriptomics enables the analysis of cell crosstalk in healthy and diseased organs by capturing the transcriptomic profiles of millions of cells within their spatial contexts. However, spatial transcriptomics approaches also raise new computational challenges for the multidimensional data analysis associated with spatial coordinates. Results: In this context, we introduce a novel analytical framework called CellsFromSpace based on independent component analysis (ICA), which allows users to analyze various commercially available technologies without relying on a single-cell reference dataset. The ICA approach deployed in CellsFromSpace decomposes spatial transcriptomics data into interpretable components associated with distinct cell types or activities. ICA also enables noise or artifact reduction and subset analysis of cell types of interest through component selection. We demonstrate the flexibility and performance of CellsFromSpace using real-world samples to demonstrate ICA’s ability to successfully identify spatially distributed cells as well as rare diffuse cells, and quantitatively deconvolute datasets from the Visium, Slide-seq, MERSCOPE, and CosMX technologies. Comparative analysis with a current alternative reference-free deconvolution tool also highlights CellsFromSpace’s speed, scalability and accuracy in processing complex, even multisample datasets. CellsFromSpace also offers a user-friendly graphical interface enabling non-bioinformaticians to annotate and interpret components based on spatial distribution and contributor genes, and perform full downstream analysis. Availability and implementation: CellsFromSpace (CFS) is distributed as an R package available from github at https://github.com/gustaveroussy/CFS along with tutorials, examples, and detailed documentation.
      PubDate: Thu, 30 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae081
      Issue No: Vol. 4, No. 1 (2024)
       
  • Prediction of tumor-specific splicing from somatic mutations as a source
           of neoantigen candidates

    • First page: vbae080
      Abstract: Motivation: Neoantigens are promising targets for cancer immunotherapies and might arise from alternative splicing. However, detecting tumor-specific splicing is challenging because many non-canonical splice junctions identified in tumors also appear in healthy tissues. To increase tumor-specificity, we focused on splicing caused by somatic mutations as a source for neoantigen candidates in individual patients. Results: We developed the tool splice2neo with multiple functionalities to integrate predicted splice effects from somatic mutations with splice junctions detected in tumor RNA-seq and to annotate the resulting transcript and peptide sequences. Additionally, we provide the tool EasyQuant for targeted RNA-seq read mapping to candidate splice junctions. Using a stringent detection rule, we predicted 1.7 splice junctions per patient as splice targets with a false discovery rate below 5% in a melanoma cohort. We confirmed tumor-specificity using independent, healthy tissue samples. Furthermore, using tumor-derived RNA, we confirmed individual exon-skipping events experimentally. Most target splice junctions encoded neoepitope candidates with predicted major histocompatibility complex (MHC)-I or MHC-II binding. Compared to neoepitope candidates from non-synonymous point mutations, the splicing-derived MHC-I neoepitope candidates had lower self-similarity to corresponding wild-type peptides. In conclusion, we demonstrate that identifying mutation-derived, tumor-specific splice junctions can lead to additional neoantigen candidates to expand the target repertoire for cancer immunotherapies. Availability and implementation: The R package splice2neo and the python package EasyQuant are available at https://github.com/TRON-Bioinformatics/splice2neo and https://github.com/TRON-Bioinformatics/easyquant, respectively.
      PubDate: Wed, 29 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae080
      Issue No: Vol. 4, No. 1 (2024)
       
  • Uncover spatially informed variations for single-cell spatial
           transcriptomics with STew

    • First page: vbae064
      Abstract: Motivation: Recent spatial transcriptomics (ST) technologies have enabled characterization of gene expression patterns and spatial information, advancing our understanding of cell lineages within diseased tissues. Several analytical approaches have been proposed for ST data, but effectively utilizing spatial information to unveil the shared variation with gene expression remains a challenge. Results: We introduce STew, a Spatial Transcriptomic multi-viEW representation learning method, to jointly analyze spatial information and gene expression in a scalable manner, followed by a data-driven statistical framework to measure the goodness of model fit. Through benchmarking using human dorsolateral prefrontal cortex and mouse main olfactory bulb data with true manual annotations, STew achieved superior performance in both clustering accuracy and continuity of identified spatial domains compared with other methods. STew is also robust, generating consistent results that are insensitive to model parameters, including sparsity constraints. We next applied STew to various ST data acquired from 10× Visium, Slide-seqV2, and 10× Xenium, encompassing single-cell and multi-cellular resolution ST technologies, which revealed spatially informed cell type clusters and biologically meaningful axes. In particular, we identified a proinflammatory fibroblast spatial niche using ST data from psoriatic skin. Moreover, STew scales almost linearly with the number of spatial locations, guaranteeing its applicability to datasets with thousands of spatial locations to capture disease-relevant niches in complex tissues. Availability and implementation: Source code and the R software tool STew are available from github.com/fanzhanglab/STew.
      PubDate: Wed, 29 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae064
      Issue No: Vol. 4, No. 1 (2024)
       
  • Improve-RRBS: a novel tool to correct the 3′ trimming of reduced
           representation sequencing reads

    • First page: vbae076
      Abstract: Motivation: Reduced Representation Bisulfite Sequencing (RRBS) is a popular approach to determine DNA methylation of the CpG-rich regions of the genome. However, we observed that false positive differentially methylated sites (DMS) are also identified using the standard computational analysis. Results: During RRBS library preparation, the MspI-digested DNA undergoes end-repair by a cytosine at the 3′ end of the fragments. After sequencing, Trim Galore cuts these end-repaired nucleotides. However, Trim Galore fails to detect end-repair when it overlaps with the 3′ end of the sequencing reads. We found that these non-trimmed cytosines bias methylation calling and can thus cause DMS to be identified erroneously. To circumvent this problem, we developed improve-RRBS, which efficiently identifies and hides these cytosines from methylation calling with a false positive rate of at most 0.5%. To test improve-RRBS, we investigated four datasets from four laboratories and two different species. We found non-trimmed 3′ cytosines in all datasets analyzed and as much as >50% of false positive DMS under certain conditions. By applying improve-RRBS, these DMS completely disappeared from all comparisons. Availability and implementation: Improve-RRBS is a freely available Python package (https://pypi.org/project/iRRBS/ or https://github.com/fothia/improve-RRBS) to be implemented in RRBS pipelines.
      PubDate: Fri, 24 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae076
      Issue No: Vol. 4, No. 1 (2024)
       
  • Predmoter—cross-species prediction of plant promoter and enhancer
           regions

    • First page: vbae074
      Abstract: Motivation: Identifying cis-regulatory elements (CREs) is crucial for analyzing gene regulatory networks. Next generation sequencing methods were developed to identify CREs but represent a considerable expenditure for targeted analysis of few genomic loci. Thus, predicting the outputs of these methods would significantly cut costs and time investment. Results: We present Predmoter, a deep neural network that predicts base-wise Assay for Transposase Accessible Chromatin using sequencing (ATAC-seq) and histone Chromatin immunoprecipitation DNA-sequencing (ChIP-seq) read coverage for plant genomes. Predmoter uses only the DNA sequence as input. We trained our final model on 21 species for 13 of which ATAC-seq data and for 17 of which ChIP-seq data was publicly available. We evaluated our models on Arabidopsis thaliana and Oryza sativa. Our best models showed accurate predictions in peak position and pattern for ATAC- and histone ChIP-seq. Annotating putatively accessible chromatin regions provides valuable input for the identification of CREs. In conjunction with other in silico data, this can significantly reduce the search space for experimentally verifiable DNA–protein interaction pairs. Availability and implementation: The source code for Predmoter is available at: https://github.com/weberlab-hhu/Predmoter. Predmoter takes a fasta file as input and outputs h5, and optionally bigWig and bedGraph files.
      PubDate: Fri, 24 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae074
      Issue No: Vol. 4, No. 1 (2024)
       
  • WGCCRR: a web-based tool for genome-wide screening of convergent indels
           and substitutions of amino acids

    • First page: vbae070
      Abstract: Summary: Genome-wide analyses of protein-coding gene sequences are being employed to examine the genetic basis of adaptive evolution in many organismal groups. Previous studies have revealed that convergent/parallel adaptive evolution may be caused by convergent/parallel amino acid changes. Similarly, detailed analysis of lineage-specific amino acid changes has shown correlations with certain lineage-specific traits. However, experimental validation remains the ultimate measure of causality. With the increasing availability of genomic data, a streamlined tool for such analyses would facilitate and expedite the screening of genetic loci that hold potential for adaptive evolution, while alleviating the bioinformatic burden for experimental biologists. In this study, we present a user-friendly web-based tool called WGCCRR (Whole Genome Comparative Coding Region Read) designed to screen both convergent/parallel and lineage-specific amino acid changes on a genome-wide scale. Our tool allows users to replicate previous analyses with just a few clicks, and the exported results are straightforward to interpret. In addition, we have included amino acid indels, which were usually neglected in previous work. Our website provides an efficient platform for screening candidate loci for downstream experimental tests. Availability and implementation: The tool is available at: https://fishevo.xmu.edu.cn/.
      PubDate: Fri, 24 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae070
      Issue No: Vol. 4, No. 1 (2024)
       
  • The axes of biology: a novel axes-based network embedding paradigm to
           decipher the functional mechanisms of the cell

    • First page: vbae075
      Abstract: Summary: Common approaches for deciphering biological networks involve network embedding algorithms. These approaches strictly focus on clustering the genes’ embedding vectors and interpreting such clusters to reveal the hidden information of the networks. However, the difficulty in interpreting the genes’ clusters and the limitations of the functional annotations’ resources hinder the identification of the cell’s currently unknown functioning mechanisms. We propose a new approach that shifts this functional exploration from the embedding vectors of genes in space to the axes of the space itself. Our methodology better disentangles biological information from the embedding space than the classic gene-centric approach. Moreover, it uncovers new data-driven functional interactions that are unregistered in the functional ontologies, but biologically coherent. Furthermore, we exploit these interactions to define new higher-level annotations that we term Axes-Specific Functional Annotations and validate them through literature curation. Finally, we leverage our methodology to discover evolutionary connections between cellular functions and the evolution of species. Availability and implementation: Data and source code can be accessed at https://gitlab.bsc.es/sdoria/axes-of-biology.git
      PubDate: Thu, 23 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae075
      Issue No: Vol. 4, No. 1 (2024)
       
  • Assessing the validity of driver gene identification tools for targeted
           genome sequencing data

    • First page: vbae073
      Abstract: Motivation: Most cancer driver gene identification tools have been developed for whole-exome sequencing data. Targeted sequencing is a popular alternative to whole-exome sequencing for large cancer studies due to its greater depth at a lower cost per tumor. Unlike whole-exome sequencing, targeted sequencing only enables mutation calling for a selected subset of genes. Whether existing driver gene identification tools remain valid in that context has not previously been studied. Results: We evaluated the validity of seven popular driver gene identification tools when applied to targeted sequencing data. Based on whole-exome data of 14 different cancer types from TCGA, we constructed matching targeted datasets by keeping only the mutations overlapping with the pan-cancer MSK-IMPACT panel and, in the case of breast cancer, also the breast-cancer-specific B-CAST panel. We then compared the driver gene predictions obtained on whole-exome and targeted mutation data for each of the seven tools. Differences in how the tools model background mutation rates were the most important determinant of their validity on targeted sequencing data. Based on our results, we recommend OncodriveFML, OncodriveCLUSTL, 20/20+, dNdSCv, and ActiveDriver for driver gene identification in targeted sequencing data, whereas MutSigCV and DriverML are best avoided in that context. Availability and implementation: Code for the analyses is available at https://github.com/SchmidtGroupNKI/TGSdrivergene_validity.
      PubDate: Thu, 23 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae073
      Issue No: Vol. 4, No. 1 (2024)
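
      Illustration (not from the article): the abstract above describes building "targeted" datasets by keeping only whole-exome mutations that overlap a sequencing panel. A minimal Python sketch of that filtering idea follows. It assumes a simplified gene-level match, whereas the authors may have matched on genomic regions; the function name, the record layout, and the toy panel are hypothetical.

```python
from typing import Iterable


def restrict_to_panel(mutations: Iterable[dict], panel_genes: Iterable[str]) -> list[dict]:
    """Keep only mutation records whose gene is on the panel.

    Assumption: each record is a dict with a "gene" key. Gene-level matching
    is a simplification chosen for illustration only.
    """
    panel = set(panel_genes)  # set lookup keeps the filter linear in the input size
    return [m for m in mutations if m["gene"] in panel]


# Toy whole-exome calls and a toy two-gene panel (both hypothetical).
exome_calls = [
    {"sample": "S1", "gene": "TP53", "effect": "missense"},
    {"sample": "S1", "gene": "TTN", "effect": "silent"},
    {"sample": "S2", "gene": "PIK3CA", "effect": "missense"},
]
targeted_calls = restrict_to_panel(exome_calls, ["TP53", "PIK3CA"])
```

      Running a driver gene tool on both `exome_calls` and `targeted_calls` and comparing the predictions would mirror, in spirit, the comparison the abstract describes.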
       
  • Biology’s transformation: from observation through experiment to
           computation

    • First page: vbae069
      Abstract: Summary: We explore the nuanced temporal and epistemological distinctions among natural sciences, particularly the contrasting treatment of time and the interplay between theory and experimentation. Physics, an exemplar of mature science, relies on theoretical models for predictability and simulations. In contrast, biology, traditionally experimental, is witnessing a computational surge, with data analytics and simulations reshaping its research paradigms. Despite these strides, a unified theoretical framework in biology remains elusive. We propose that contemporary global challenges might usher in a renewed emphasis, presenting an opportunity for the establishment of a novel theoretical underpinning for the life sciences. Availability and implementation: https://github.com/ouzounis/CLS-emerges. Data in JSON format, images in PNG format.
      PubDate: Wed, 22 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae069
      Issue No: Vol. 4, No. 1 (2024)
       
  • DeGeCI 1.1: a web platform for gene annotation of mitochondrial genomes

    • First page: vbae072
      Abstract: Summary: DeGeCI is a command line tool that generates fully automated de novo gene predictions from mitochondrial nucleotide sequences by using a reference database of annotated mitogenomes which is represented as a de Bruijn graph. The input genome is mapped to this graph, creating a subgraph, which is then post-processed by a clustering routine. Version 1.1 of DeGeCI offers a web front-end for GUI-based input. It also introduces a new taxonomic filter pipeline that allows the species in the reference database to be restricted to a user-specified taxonomic classification and allows for gene boundary optimization when providing the translation table of the input genome. Availability and implementation: The web platform is accessible at https://degeci.informatik.uni-leipzig.de. Source code is freely available at https://git.informatik.uni-leipzig.de/lfiedler/degeci.
      PubDate: Mon, 13 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae072
      Issue No: Vol. 4, No. 1 (2024)
       
  • ASpedia-R: a package to retrieve junction-incorporating features and
           knowledge-based functions of human alternative splicing events

    • First page: vbae071
      Abstract: Motivation: Alternative splicing (AS) is a key regulatory mechanism that confers genetic diversity and phenotypic plasticity in humans. Exons and their flanking regions include comprehensive junction-incorporating sequence features such as splicing factor-binding sites and protein domains. These elements are involved in exon usage and ultimately contribute to isoform-specific biological functions. Splicing-associated sequence features are involved in multilayered regulation encompassing DNA and proteins. However, most analysis applications have investigated only a limited set of sequence features, such as protein domains, which is insufficient to explain the comprehensive cause and effect of exon-specific biological processes. Results: With the advent of RNA-seq technology, global AS event analysis has yielded more precise results. As analysis results accumulate, identifying multi-omics sequence features for AS events can be a challenge. An application for investigating multi-omics sequence features is therefore useful for scanning critical evidence. ASpedia-R is an R package to interrogate junction-incorporating sequence features for human genes. Our database collects heterogeneous profiles ranging from DNA to protein. Additionally, knowledge-based splicing genes were collected using text-mining to test the association with specific pathway terms. Our package retrieves AS events for high-throughput data analysis results via an AS event ID converter. Finally, the result profile can be visualized and saved in multiple formats: sequence feature result table, genome track figure, protein–protein interaction network, and gene set enrichment test result table. Our package is a convenient tool to understand global regulation mechanisms by splicing. Availability and implementation: The package source code is freely available to non-commercial users at https://github.com/ncc-bioinfo/ASpedia-R.
      PubDate: Sat, 11 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae071
      Issue No: Vol. 4, No. 1 (2024)
       
  • Microbial Abundances Retrieved from Sequencing data—automated NCBI
           Taxonomy (MARS): a pipeline to create relative microbial abundance data
           for the Microbiome Modelling Toolbox and utilizing homosynonyms for
           efficient mapping to resources

    • First page: vbae068
      Abstract: Motivation: Computational approaches to the functional characterization of the microbiome, such as the Microbiome Modelling Toolbox, require precise information on microbial composition and relative abundances. However, challenges arise from homosynonyms (different names referring to the same taxon), which can hinder the mapping process and lead to missed species mapping when using microbial metabolic reconstruction resources, such as AGORA and APOLLO. Results: We introduce the integrated MARS pipeline, a user-friendly Python-based solution that addresses these challenges. MARS automates the extraction of relative abundances from metagenomic reads, maps species and genera onto microbial metabolic reconstructions, and accounts for alternative taxonomic names. It normalizes microbial reads, provides an optional cut-off for low-abundance taxa, and produces relative abundance tables apt for integration with the Microbiome Modelling Toolbox. A sub-component of the pipeline automates the task of identifying homosynonyms, leveraging web scraping to find taxonomic IDs of given species, searching NCBI for alternative names, and cross-referencing them with microbial reconstruction resources. Taken together, MARS streamlines the entire process from processed metagenomic reads to relative abundance, thereby significantly reducing time and effort when working with microbiome data. Availability and implementation: MARS is implemented in Python. It can be found as an interactive application here: https://mars-pipeline.streamlit.app/ along with detailed documentation here: https://github.com/ThieleLab/mars-pipeline.
      PubDate: Fri, 10 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae068
      Issue No: Vol. 4, No. 1 (2024)
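
      Illustration (not from the article): the abstract above mentions normalizing microbial reads and applying an optional cut-off for low-abundance taxa. The Python sketch below shows one plausible way such a step could work. It is not the MARS implementation; the function name is hypothetical, and renormalizing after the cut-off is an assumption made here so that the output still sums to 1.

```python
def relative_abundance(read_counts: dict[str, float], cutoff: float = 0.0) -> dict[str, float]:
    """Convert per-taxon read counts to relative abundances.

    Taxa whose relative abundance falls below `cutoff` are dropped, and the
    remaining values are renormalized to sum to 1 (an assumption of this
    sketch, not a documented MARS behavior).
    """
    total = sum(read_counts.values())
    if total == 0:
        return {}
    relative = {taxon: count / total for taxon, count in read_counts.items()}
    kept = {taxon: value for taxon, value in relative.items() if value >= cutoff}
    kept_total = sum(kept.values())
    if kept_total == 0:
        return {}
    return {taxon: value / kept_total for taxon, value in kept.items()}


# Toy counts with hypothetical taxon labels.
abundances = relative_abundance({"taxon_a": 90, "taxon_b": 9, "taxon_c": 1}, cutoff=0.05)
```

      With these toy counts, `taxon_c` (1% of reads) falls below the 5% cut-off and is removed, and the two remaining taxa are rescaled.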
       
  • COSGAP: COntainerized Statistical Genetics Analysis Pipelines

    • First page: vbae067
      Abstract: Summary: The collection and analysis of sensitive data in large-scale consortia for statistical genetics is hampered by multiple challenges, due to their non-shareable nature. Time-consuming issues in installing software frequently arise due to different operating systems, software dependencies, and limited internet access. For federated analysis across sites, it can be challenging to resolve different problems, including format requirements, data wrangling, setting up analysis on high-performance computing (HPC) facilities, etc. Easier, more standardized, automated protocols and pipelines can be solutions to overcome these issues. We have developed one such solution for statistical genetic data analysis using software container technologies. This solution, named COSGAP: “COntainerized Statistical Genetics Analysis Pipelines,” consists of already established software tools placed into Singularity containers, alongside corresponding code and instructions on how to perform statistical genetic analyses, such as genome-wide association studies, polygenic scoring, LD score regression, Gaussian Mixture Models, and gene-set analysis. Using provided helper scripts written in Python, users can obtain auto-generated scripts to conduct the desired analysis either on HPC facilities or on a personal computer. COSGAP is actively being applied by users from different countries and projects to conduct genetic data analyses without spending much effort on software installation, converting data formats, and other technical requirements. Availability and implementation: COSGAP is freely available on GitHub (https://github.com/comorment/containers) under the GPLv3 license.
      PubDate: Thu, 09 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae067
      Issue No: Vol. 4, No. 1 (2024)
       
  • Network depth affects inference of gene sets from bacterial transcriptomes
           using denoising autoencoders

    • First page: vbae066
      Abstract: Summary: The increasing number of publicly available bacterial gene expression data sets provides an unprecedented resource for the study of gene regulation in diverse conditions, but emphasizes the need for self-supervised methods for the automated generation of new hypotheses. One approach for inferring coordinated regulation from bacterial expression data is through neural networks known as denoising autoencoders (DAEs), which encode large datasets in a reduced bottleneck layer. We have generalized this application of DAEs to include deep networks and explore the effects of network architecture on gene set inference using deep learning. We developed a DAE-based pipeline to extract gene sets from transcriptomic data in Escherichia coli, validated our method by comparing inferred gene sets with known pathways, and used this pipeline to explore how the choice of network architecture impacts gene set recovery. We find that increasing network depth leads the DAEs to explain gene expression in terms of fewer, more concisely defined gene sets, and that adjusting the width results in a tradeoff between generalizability and biological inference. Finally, leveraging our understanding of the impact of DAE architecture, we apply our pipeline to an independent uropathogenic E. coli dataset to identify genes uniquely induced during human colonization. Availability and implementation: https://github.com/BarquistLab/DAE_architecture_exploration.
      PubDate: Wed, 08 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae066
      Issue No: Vol. 4, No. 1 (2024)
       
  • rNMPID: a database for riboNucleoside MonoPhosphates in DNA

    • First page: vbae063
      Abstract: Motivation: Ribonucleoside monophosphates (rNMPs) are the most abundant non-standard nucleotides embedded in genomic DNA. If the presence of rNMP in DNA cannot be controlled, it can lead to genome instability. The actual regulatory functions of rNMPs in DNA remain mainly unknown. Considering the association between rNMP embedment and various diseases and cancer, the phenomenon of rNMP embedment in DNA has become a prominent area of research in recent years. Results: We introduce the rNMPID database, which is the first database revealing rNMP-embedment characteristics, strand bias, and preferred incorporation patterns in the genomic DNA of samples from bacterial to human cells of different genetic backgrounds. The rNMPID database uses datasets generated by different rNMP-mapping techniques. It provides researchers with a solid foundation to explore the features of rNMPs embedded in the genomic DNA of multiple sources, and their association with cellular functions and, in future, disease. It also significantly benefits researchers in the fields of genetics and genomics who aim to integrate their studies with the rNMP-embedment data. Availability and implementation: rNMPID is freely accessible on the web at https://www.rnmpid.org.
      PubDate: Wed, 08 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae063
      Issue No: Vol. 4, No. 1 (2024)
       
  • danRerLib: a Python package for zebrafish transcriptomics

    • First page: vbae065
      Abstract: Summary: Understanding the pathways and biological processes underlying differential gene expression is fundamental for characterizing gene expression changes in response to an experimental condition. Zebrafish, with a transcriptome closely mirroring that of humans, are frequently utilized as a model for human development and disease. However, a challenge arises due to the incomplete annotations of zebrafish pathways and biological processes, with more comprehensive annotations existing in humans. This incompleteness may result in biased functional enrichment findings and loss of knowledge. danRerLib, a versatile Python package for zebrafish transcriptomics researchers, overcomes this challenge and provides a suite of tools to be executed in Python including gene ID mapping, orthology mapping for the zebrafish and human taxonomy, and functional enrichment analysis utilizing the latest updated Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) databases. danRerLib enables functional enrichment analysis for GO and KEGG pathways, even when they lack direct zebrafish annotations through the orthology of human-annotated functional annotations. This approach enables researchers to extend their analysis to a wider range of pathways, elucidating additional mechanisms of interest and greater insight into experimental results. Availability and implementation: danRerLib, along with comprehensive documentation and tutorials, is freely available. The source code is available at https://github.com/sdsucomptox/danrerlib/ with associated documentation and tutorials at https://sdsucomptox.github.io/danrerlib/. The package has been developed with Python 3.9 and is available for installation on the package management systems PIP (https://pypi.org/project/danrerlib/) and Conda (https://anaconda.org/sdsu_comptox/danrerlib) with additional installation instructions on the documentation website.
      PubDate: Mon, 06 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae065
      Issue No: Vol. 4, No. 1 (2024)
       
  • ISMB/ECCB 2023 organization benefited from the strengths of the French
           bioinformatics community

    • First page: vbae040
      Abstract: The International Society of Computational Biology—ISCB (https://www.iscb.org) was founded in 1997 as a non-profit organization dedicated to all aspects of the development of our understanding of living organisms using computational and mathematical methods. ISMB is the annual International Conference on Intelligent Systems for Molecular Biology, which is the flagship meeting of ISCB and was established in 1993. ECCB is the European Conference on Computational Biology, which was first held in 2002. In 2023, the French bioinformatics community collaborated with ISCB to organize a joint international conference in France, with great success. Here, we first describe the strengths of the French bioinformatics community and then how they contributed to the success of ISMB/ECCB 2023.
      PubDate: Fri, 03 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae040
      Issue No: Vol. 4, No. 1 (2024)
       
  • scX: a user-friendly tool for scRNAseq exploration

    • First page: vbae062
      Abstract: Motivation: Single-cell RNA sequencing (scRNAseq) has transformed our ability to explore biological systems. Nevertheless, proficient expertise is essential for handling and interpreting the data. Results: In this article, we present scX, an R package built on the Shiny framework that streamlines the analysis, exploration, and visualization of single-cell experiments. With an interactive graphic interface, implemented as a web application, scX provides easy access to key scRNAseq analyses, including marker identification, gene expression profiling, and differential gene expression analysis. Additionally, scX seamlessly integrates with commonly used single-cell Seurat and SingleCellExperiment R objects, resulting in efficient processing and visualization of varied datasets. Overall, scX serves as a valuable and user-friendly tool for effortless exploration and sharing of single-cell data, simplifying some of the complexities inherent in scRNAseq analysis. Availability and implementation: Source code can be downloaded from https://github.com/chernolabs/scX. A docker image is available from dockerhub as chernolabs/scx.
      PubDate: Thu, 02 May 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae062
      Issue No: Vol. 4, No. 1 (2024)
       
  • Perspectives on tracking data reuse across biodata resources

    • First page: vbae057
      Abstract: Motivation: Data reuse is a common and vital practice in molecular biology and enables the knowledge gathered over recent decades to drive discovery and innovation in the life sciences. Much of this knowledge has been collated into molecular biology databases, such as UniProtKB, and these resources derive enormous value from sharing data among themselves. However, quantifying and documenting this kind of data reuse remains a challenge. Results: The article reports on a one-day virtual workshop hosted by the UniProt Consortium in March 2023, attended by representatives from biodata resources, experts in data management, and NIH program managers. Workshop discussions focused on strategies for tracking data reuse, best practices for reusing data, and the challenges associated with data reuse and tracking. Surveys and discussions showed that data reuse is widespread, but critical information for reproducibility is sometimes lacking. Challenges include costs of tracking data reuse, tensions between tracking data and open sharing, restrictive licenses, and difficulties in tracking commercial data use. Recommendations that emerged from the discussion include: development of standardized formats for documenting data reuse, education about the obstacles posed by restrictive licenses, and continued recognition by funding agencies that data management is a critical activity that requires dedicated resources. Availability and implementation: Summaries of survey results are available at: https://docs.google.com/forms/d/1j-VU2ifEKb9C-sW6l3ATB79dgHdRk5v_lESv2hawnso/viewanalytics (survey of data providers) and https://docs.google.com/forms/d/18WbJFutUd7qiZoEzbOytFYXSfWFT61hVce0vjvIwIjk/viewanalytics (survey of users).
      PubDate: Thu, 25 Apr 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae057
      Issue No: Vol. 4, No. 1 (2024)
       
  • MerCat2: a versatile k-mer counter and diversity estimator for
           database-independent property analysis obtained from omics data

    • First page: vbae061
      Abstract: Motivation: MerCat2 (“Mer—Catenate2”) is a versatile, parallel, scalable and modular property software package for robustly analyzing features in omics data. Using massively parallel sequencing raw reads, assembled contigs, and protein sequences from any platform as input, MerCat2 performs k-mer counting of any length k, resulting in feature abundance counts tables, quality control reports, protein feature metrics, and graphical representation (i.e. principal component analysis (PCA)). Results: MerCat2 allows for direct analysis of data properties in a database-independent manner that initializes all data, which other profilers and assembly-based methods cannot perform. MerCat2 represents an integrated tool to illuminate omics data within a sample for rapid cross-examination and comparisons. Availability and implementation: MerCat2 is written in Python and distributed under a BSD-3 license. The source code of MerCat2 is freely available at https://github.com/raw-lab/mercat2. MerCat2 is compatible with Python 3 on Mac OS X and Linux. MerCat2 can also be easily installed using bioconda: mamba create -n mercat2 -c conda-forge -c bioconda mercat2
      PubDate: Wed, 24 Apr 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae061
      Issue No: Vol. 4, No. 1 (2024)
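
      Illustration (not from the article): k-mer counting, the core operation named in the abstract above, tallies every overlapping substring of length k in a sequence. The Python sketch below shows the basic idea using only the standard library. It is not MerCat2's code or API; the function name is hypothetical, and uppercasing the input is an assumption of this sketch.

```python
from collections import Counter


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every overlapping substring of length k in `sequence`.

    Sequences shorter than k yield an empty Counter.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    sequence = sequence.upper()  # assumption: treat lower- and uppercase bases alike
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


# Toy nucleotide string; the same function works on protein sequences.
kmer_counts = count_kmers("ATATA", 2)
```

      A table of such counts per sample is the kind of feature abundance matrix that could then feed a downstream summary such as PCA.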
       
  • Deep IDA: a deep learning approach for integrative discriminant analysis
           of multi-omics data with feature ranking—an application to COVID-19

    • First page: vbae060
      Abstract: Motivation: Many diseases are complex heterogeneous conditions that affect multiple organs in the body and depend on the interplay between several factors that include molecular and environmental factors, requiring a holistic approach to better understand disease pathobiology. Most existing methods for integrating data from multiple sources and classifying individuals into one of multiple classes or disease groups have mainly focused on linear relationships despite the complexity of these relationships. On the other hand, methods for nonlinear association and classification studies are limited in their ability to identify variables to aid in our understanding of the complexity of the disease or can be applied to only two data types. Results: We propose Deep Integrative Discriminant Analysis (IDA), a deep learning method to learn complex nonlinear transformations of two or more views such that resulting projections have maximum association and maximum separation. Further, we propose a feature ranking approach based on ensemble learning for interpretable results. We test Deep IDA on both simulated data and two large real-world datasets, including RNA sequencing, metabolomics, and proteomics data pertaining to COVID-19 severity. We identified signatures that better discriminated COVID-19 patient groups, and related to neurological conditions, cancer, and metabolic diseases, corroborating current research findings and heightening the need to study the post sequelae effects of COVID-19 to devise effective treatments and to improve patient care. Availability and implementation: Our algorithms are implemented in PyTorch and available at: https://github.com/JiuzhouW/DeepIDA
      PubDate: Wed, 24 Apr 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae060
      Issue No: Vol. 4, No. 1 (2024)
       
  • Syntactic sugars: crafting a regular expression framework for glycan
           structures

    • First page: vbae059
      Abstract: Motivation: Structural analysis of glycans poses significant challenges in glycobiology due to their complex sequences. Research questions such as analyzing the sequence content of the α1-6 branch in N-glycans are biologically meaningful yet can be hard to automate. Results: Here, we introduce a regular expression system, designed for glycans, feature-complete, and closely aligned with regular expression formatting. We use this to annotate glycan motifs of arbitrary complexity, perform differential expression analysis on designated sequence stretches, or elucidate branch-specific binding specificities of lectins in an automated manner. We are confident that glycan regular expressions will empower computational analyses of these sequences. Availability and implementation: Our regular expression framework for glycans is implemented in Python and is incorporated into the open-source glycowork package (version 1.1+). Code and documentation are available at https://github.com/BojarLab/glycowork/blob/master/glycowork/motif/regex.py.
      PubDate: Fri, 19 Apr 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae059
      Issue No: Vol. 4, No. 1 (2024)
       
  • GLIMMERS: glioma molecular markers exploration using long-read sequencing

    • First page: vbae058
      Abstract: Summary: The revised WHO guidelines for classifying and grading brain tumors include several copy number variation (CNV) markers. The turnaround time for detecting CNVs and alterations throughout the entire genome is drastically reduced with the customized read incremental approach on the nanopore platform. However, this approach is challenging for non-bioinformaticians due to the need to use multiple software tools, extract CNV markers and interpret results, which creates barriers due to the time and specialized resources that are necessary. To address this problem and help clinicians classify and grade brain tumors, we developed GLIMMERS: glioma molecular markers exploration using long-read sequencing, an open-access tool that automatically analyzes nanopore-based CNV data and generates simplified reports. Availability and implementation: GLIMMERS is available at https://gitlab.com/silol_public/glimmers under the terms of the MIT license.
      PubDate: Mon, 15 Apr 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae058
      Issue No: Vol. 4, No. 1 (2024)
       
  • FlexStat: combinatory differentially expressed protein extraction

    • First page: vbae056
      Abstract: Motivation: Mass spectrometry-based system proteomics allows identification of dysregulated protein hubs and associated disease-related features. Obtaining differentially expressed proteins (DEPs) is the most important step of downstream bioinformatics analysis. However, the extraction of statistically significant DEPs from datasets with multiple experimental conditions or disease types through currently available tools remains a laborious task. More often, such an analysis requires considerable bioinformatics expertise, making it inaccessible to researchers with limited computational analytics experience. Results: To uncover the differences among the many conditions within the data in a user-friendly manner, here we introduce FlexStat, a web-based interface that extracts DEPs through combinatory analysis. This tool accepts a protein expression matrix as input and systematically generates DEP results for every conceivable combination of various experimental conditions or disease types. FlexStat includes a suite of robust statistical tools for data preprocessing, in addition to DEP extraction, and publication-ready visualization, which are built on established R scientific libraries in an automated manner. This analytics suite was validated in diverse public proteomic datasets to showcase its high performance of rapid and simultaneous pairwise comparisons of comprehensive datasets. Availability and implementation: FlexStat is implemented in R and is freely available at https://jglab.shinyapps.io/flexstatv1-pipeline-only/. The source code is accessible at https://github.com/kts-desilva/FlexStat/tree/main.
      PubDate: Thu, 11 Apr 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae056
      Issue No: Vol. 4, No. 1 (2024)
       
  • OpenAnnotateApi: Python and R packages to efficiently annotate and analyze
           chromatin accessibility of genomic regions

    • First page: vbae055
      Abstract: Summary: Chromatin accessibility serves as a critical measurement of physical contact between nuclear macromolecules and DNA sequence, providing valuable insights into the comprehensive landscape of regulatory mechanisms; we therefore previously developed the OpenAnnotate web server. However, as an increasing number of epigenomic analysis software tools emerged, web-based annotation often faced limitations and inconveniences when integrated into these software pipelines. To address these issues, we here develop two software packages named OpenAnnotatePy and OpenAnnotateR. In addition to web-based functionalities, these packages encompass supplementary features, including the capability for simultaneous annotation across multiple cell types, advanced searching of systems, tissues and cell types, and converting the result to the data structure of mainstream tools. Moreover, we applied the packages to various scenarios, including cell type revealing, regulatory element prediction, and integration into mainstream single-cell ATAC-seq analysis pipelines including EpiScanpy, Signac, and ArchR. We anticipate that OpenAnnotateApi will significantly facilitate the deciphering of gene regulatory mechanisms, and offer crucial assistance in the field of epigenomic studies. Availability and implementation: OpenAnnotateApi for R is available at https://github.com/ZjGaothu/OpenAnnotateR and for Python is available at https://github.com/ZjGaothu/OpenAnnotatePy.
      PubDate: Wed, 10 Apr 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae055
      Issue No: Vol. 4, No. 1 (2024)
       
  • Correction to: wTSA-CRAFT: an open-access web server for rapid analysis of
           thermal shift assay experiments

    • First page: vbae050
      Abstract: This is a correction to: Victor Reys, Julien Kowalewski, Muriel Gelin, Corinne Lionne, wTSA-CRAFT: an open-access web server for rapid analysis of thermal shift assay experiments, Bioinformatics Advances, Volume 3, Issue 1, 2023, vbad136, https://doi.org/10.1093/bioadv/vbad136
      PubDate: Tue, 09 Apr 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae050
      Issue No: Vol. 4, No. 1 (2024)
       
  • WAS IT A MATch I SAW? Approximate palindromes lead to overstated false
           match rates in benchmarks using reversed sequences

    • First page: vbae052
      Abstract: Background: Software for labeling biological sequences typically produces a theory-based statistic for each match (the E-value) that indicates the likelihood of seeing that match’s score by chance. E-values accurately predict false match rate for comparisons of random (shuffled) sequences, and thus provide a reasoned mechanism for setting score thresholds that enable high sensitivity with low expected false match rate. This threshold-setting strategy is challenged by real biological sequences, which contain regions of local repetition and low sequence complexity that cause excess matches between non-homologous sequences. Knowing this, tool developers often develop benchmarks that use realistic-seeming decoy sequences to explore empirical tradeoffs between sensitivity and false match rate. A recent trend has been to employ reversed biological sequences as realistic decoys, because these preserve the distribution of letters and the existence of local repeats, while disrupting the original sequence’s functional properties. However, we and others have observed that sequences appear to produce high scoring alignments to their reversals with surprising frequency, leading to overstatement of false match risk that may negatively affect downstream analysis. Results: We demonstrate that an alignment between a sequence S and its (possibly mutated) reversal tends to produce higher scores than alignment between truly unrelated sequences, even when S is a shuffled string with no notable repetitive or low-complexity regions. This phenomenon is due to the unintuitive fact that (even randomly shuffled) sequences contain palindromes that are on average longer than the longest common substrings (LCS) shared between permuted variants of the same sequence. Though the expected palindrome length is only slightly larger than the expected LCS, the distribution of alignment scores involving reversed sequences is strongly right-shifted, leading to greatly increased frequency of high-scoring alignments to reversed sequences. Impact: Overestimates of false match risk can motivate unnecessarily high score thresholds, leading to potentially reduced true match sensitivity. Also, when tool sensitivity is only reported up to the score of the first matched decoy sequence, a large decoy set consisting of reversed sequences can obscure sensitivity differences between tools. As a result of these observations, we advise that reversed biological sequences be used as decoys only when care is taken to remove positive matches in the original (un-reversed) sequences, or when overstatement of false labeling is not a concern. Though the primary focus of the analysis is on sequence annotation, we also demonstrate that the prevalence of internal palindromes may lead to an overstatement of the rate of false labels in protein identification with mass spectrometry.
      PubDate: Mon, 08 Apr 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae052
      Issue No: Vol. 4, No. 1 (2024)
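
The comparison described in the abstract above (longest palindrome within a shuffled sequence versus longest common substring between two shuffles of the same sequence) can be explored with a small Python sketch. This is not the authors' code: the function names, the trial count, and the toy input are assumptions made for illustration, and exact-match palindromes stand in for the approximate palindromes the paper studies.

```python
import random


def longest_palindrome_length(s: str) -> int:
    """Length of the longest palindromic substring (expand around each center)."""
    best = 0
    n = len(s)
    for center in range(2 * n - 1):
        left = center // 2
        right = left + center % 2  # even center: odd-length palindrome
        while left >= 0 and right < n and s[left] == s[right]:
            left -= 1
            right += 1
        best = max(best, right - left - 1)
    return best


def longest_common_substring_length(a: str, b: str) -> int:
    """Length of the longest contiguous substring shared by a and b (row-wise DP)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def shuffled(s: str, rng: random.Random) -> str:
    """Return a random permutation of the letters of s."""
    chars = list(s)
    rng.shuffle(chars)
    return "".join(chars)


def compare(seq: str, trials: int = 200, seed: int = 0) -> tuple[float, float]:
    """Mean longest-palindrome length vs. mean LCS length over shuffles of seq."""
    rng = random.Random(seed)
    pal_total = 0
    lcs_total = 0
    for _ in range(trials):
        a = shuffled(seq, rng)
        b = shuffled(seq, rng)
        pal_total += longest_palindrome_length(a)
        lcs_total += longest_common_substring_length(a, b)
    return pal_total / trials, lcs_total / trials
```

Calling `compare` on a sequence of interest returns the two means side by side; how they differ depends on alphabet size and sequence length, so the sketch is a starting point for experimentation rather than a reproduction of the reported result.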
       
  • Single-cell type annotation with deep learning in 265 cell types for
           humans

    • First page: vbae054
      Abstract: Motivation: Annotating cell types is a challenging yet essential task in analyzing single-cell RNA sequencing data. However, due to the lack of a gold standard, it is difficult to evaluate the algorithms fairly and an overfitting algorithm may be favored in benchmarks. To address this challenge, we developed a deep learning-based single-cell type prediction tool that assigns the cell type to 265 different cell types for humans, based on data from approximately five million cells. Results: We achieved a median area under the ROC curve (AUC) of 0.93 when evaluated across datasets. We found that inconsistent labeling in the existing database generated by different labs contributed to the mistakes of the model. Therefore, we used cell ontology to correct the annotations and retrained the model, which resulted in 0.971 median AUC. Our study reveals a limiting factor of the accuracy one may achieve with the current database annotation and points to the solutions towards an algorithm-based correction of the gold standard for future automated cell annotation approaches. Availability and implementation: The code is available at: https://github.com/SherrySDong/Hierarchical-Correction-Improves-Automated-Single-cell-Type-Annotation. Data used in this study are listed in Supplementary Table S1 and are retrievable at the CZI database.
      PubDate: Mon, 08 Apr 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae054
      Issue No: Vol. 4, No. 1 (2024)
       
  • Changes in total charge on spike protein of SARS-CoV-2 in emerging
           lineages

    • First page: vbae053
      Abstract: Motivation: Charged amino acid residues on the spike protein of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) have been shown to influence its binding to different cell surface receptors, its non-specific electrostatic interactions with the environment, and its structural stability and conformation. It is therefore important to obtain a good understanding of amino acid mutations that affect the total charge on the spike protein which have arisen across different SARS-CoV-2 lineages during the course of the virus’ evolution. Results: We analyse the change in the number of ionizable amino acids and the corresponding total charge on the spike proteins of almost 2200 SARS-CoV-2 lineages that have emerged over the span of the pandemic. Our results show that the previously observed trend toward an increase in the positive charge on the spike protein of SARS-CoV-2 variants of concern has essentially stopped with the emergence of the early omicron variants. Furthermore, recently emerged lineages show a greater diversity in terms of their composition of ionizable amino acids. We also demonstrate that the patterns of change in the number of ionizable amino acids on the spike protein are characteristic of related lineages within the broader clade division of the SARS-CoV-2 phylogenetic tree. Due to the ubiquity of electrostatic interactions in the biological environment, our findings are relevant for a broad range of studies dealing with the structural stability of SARS-CoV-2 and its interactions with the environment. Availability and implementation: The data underlying the article are available in the Supplementary material.
      PubDate: Mon, 08 Apr 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae053
      Issue No: Vol. 4, No. 1 (2024)
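
Counting ionizable residues and summing their charges, as the abstract above describes, can be sketched in a few lines of Python. The charge model here is an assumption chosen for simplicity (full unit charges for Asp, Glu, Lys, and Arg at neutral pH, with His, Tyr, and Cys counted but treated as neutral); the article may use a different, pH-dependent model, and the function names and toy sequences are hypothetical.

```python
from collections import Counter

# Assumed simplification: unit charges at neutral pH.
# His, Tyr, and Cys are tallied as ionizable but contribute no charge here.
CHARGE = {"D": -1, "E": -1, "K": +1, "R": +1}
IONIZABLE = "DEKRHYC"


def ionizable_counts(seq: str) -> dict[str, int]:
    """Number of each ionizable residue type in a one-letter protein sequence."""
    counts = Counter(seq.upper())
    return {aa: counts.get(aa, 0) for aa in IONIZABLE}


def total_charge(seq: str) -> int:
    """Net charge under the simplified unit-charge model above."""
    counts = ionizable_counts(seq)
    return sum(CHARGE.get(aa, 0) * n for aa, n in counts.items())


def charge_change(reference: str, variant: str) -> int:
    """Difference in net charge between a variant and a reference sequence."""
    return total_charge(variant) - total_charge(reference)


# Toy example: a Glu-to-Lys substitution shifts the net charge by +2.
print(charge_change("AEA", "AKA"))
```

A substitution that swaps an acidic residue for a basic one changes the total by two units under this model, which is why single mutations can move a lineage's total charge noticeably.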
       
  • Adjusting for covariates and assessing modeling fitness in machine
           learning using MUVR2

    • First page: vbae051
      Abstract: Motivation: Machine learning (ML) methods are frequently used in Omics research to examine associations between molecular data and, for example, exposures and health conditions. ML is also used for feature selection to facilitate biological interpretation. Our previous MUVR algorithm was shown to generate predictions and variable selections at state-of-the-art performance. However, a general framework for assessing modeling fitness is still lacking. In addition, the ability to adjust for covariates is a highly desired but largely lacking feature in ML. We aimed to address these issues in the new MUVR2 framework. Results: The MUVR2 algorithm was developed to include the regularized regression framework elastic net in addition to partial least squares and random forest modeling. Compared with other cross-validation strategies, MUVR2 consistently showed state-of-the-art performance, including variable selection, while minimizing overfitting. Testing on simulated and real-world data, we also showed that MUVR2 allows for the adjustment for covariates using elastic net modeling, but not using partial least squares or random forest. Availability and implementation: Algorithms, data, scripts, and a tutorial are open source under GPL-3 license and available in the MUVR2 R package at https://github.com/MetaboComp/MUVR2.
      PubDate: Thu, 04 Apr 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae051
      Issue No: Vol. 4, No. 1 (2024)
       
  • Extending BioMASS to construct mathematical models from external knowledge

    • First page: vbae042
      Abstract: Motivation: Mechanistic modeling based on ordinary differential equations has led to numerous findings in systems biology by integrating prior knowledge and experimental data. However, the manual curation of knowledge necessary when constructing models poses a bottleneck. As the speed of knowledge accumulation continues to grow, there is a demand for a scalable means of constructing executable models. Results: We previously introduced BioMASS, an open-source, Python-based framework, to construct, simulate, and analyze mechanistic models of signaling networks. With one of its features, Text2Model, BioMASS allows users to define models in a natural language-like format, thereby facilitating the construction of large-scale models. We demonstrate that Text2Model can serve as a tool for integrating external knowledge for mathematical modeling by generating Text2Model files from a pathway database or through the use of a large language model, and simulating its dynamics through BioMASS. Our findings reveal the tool's capabilities to encourage exploration from prior knowledge and pave the way for a fully data-driven approach to constructing mathematical models. Availability and implementation: The code and documentation for BioMASS are available at https://github.com/biomass-dev/biomass and https://biomass-core.readthedocs.io, respectively. The code used in this article is available at https://github.com/okadalabipr/text2model-from-knowledge.
      PubDate: Thu, 04 Apr 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae042
      Issue No: Vol. 4, No. 1 (2024)
       
  • Adapting beyond borders: Insights from the 19th Student Council Symposium
           (SCS2023), the first hybrid ISCB Student Council global event

    • First page: vbae028
      Abstract: Summary: The 19th ISCB Student Council Symposium (SCS2023) organized by ISCB-SC adopted a hybrid format for the first time, allowing participants to engage in-person in Lyon, France, and virtually via an interactive online platform. The symposium prioritized inclusivity, featuring on-site sessions, poster presentations, and social activities for in-person attendees, while virtual participants accessed live sessions, interactive Q&A, and a virtual exhibit hall. Attendee statistics revealed a global reach, with Europe as the major contributor. SCS2023’s success in bridging in-person and virtual experiences sets a precedent for future events in Computational Biology and Bioinformatics. Availability and implementation: The details of the symposium, speaker information, schedules, and accepted abstracts are available in the program booklet (https://doi.org/10.5281/zenodo.8173977). For organizers interested in adopting a similar hybrid model, it would be beneficial to have access to details regarding the online platform used, the types of sessions offered, and the challenges faced. Future iterations of SCS can address these aspects to further enhance accessibility and inclusivity.
      PubDate: Wed, 03 Apr 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae028
      Issue No: Vol. 4, No. 1 (2024)
       
  • Yves Moreau has received the 2023 Einstein Foundation Individual Award for
           Promoting Quality in Research

    • First page: vbae039
      Abstract: On 14 March 2024, Yves Moreau received the 2023 Einstein Foundation Individual Award for Promoting Quality in Research.
      PubDate: Fri, 29 Mar 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae039
      Issue No: Vol. 4, No. 1 (2024)
       
  • Text-mining-based feature selection for anticancer drug response
           prediction

    • First page: vbae047
      Abstract: Motivation: Predicting anticancer treatment response from baseline genomic data is a critical obstacle in personalized medicine. Machine learning methods are commonly used for predicting drug response from gene expression data. In the process of constructing these machine learning models, one of the most significant challenges is identifying appropriate features among a massive number of genes. Results: In this study, we utilize features (genes) extracted by text-mining the scientific literature. Using two independent cancer pharmacogenomic datasets, we demonstrate that text-mining-based features outperform traditional feature selection techniques in machine learning tasks. In addition, our analysis reveals that text-mining feature-based machine learning models trained on in vitro data also perform well when predicting the response of in vivo cancer models. Our results demonstrate that text-mining-based feature selection is an easy-to-implement approach that is suitable for building machine learning models for anticancer drug response prediction. Availability and implementation: https://github.com/merlab/text_features.
      PubDate: Tue, 26 Mar 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae047
      Issue No: Vol. 4, No. 1 (2024)
       
  • ExplaineR: an R package to explain machine learning models

    • First page: vbae049
      Abstract: Summary: SHapley Additive exPlanations (SHAP) is a widely used method for model interpretation. However, its full potential often remains untapped due to the absence of dedicated software tools. In response, ExplaineR, an R package to facilitate interpretation of binary classification and regression models based on clustering functionality for SHAP analysis, is introduced here. It additionally offers user-interactive elements in visualizations for evaluating model performance, fairness analysis, decision-curve analysis, and a diverse range of SHAP plots. It facilitates in-depth post-prediction analysis of models, enabling users to pinpoint potentially significant patterns in SHAP plots and subsequently trace them back to instances through SHAP clustering. This functionality is particularly valuable for identifying patient subgroups in clinical cohorts, thus enhancing its role as a robust profiling tool. ExplaineR empowers users to generate comprehensive reports on machine learning outcomes, ensuring consistent and thorough documentation of model performance and interpretations. Availability and implementation: ExplaineR 1.0.0 is available on GitHub (https://persimune.github.io/explainer/) and CRAN (https://cran.r-project.org/web/packages/explainer/index.html).
      PubDate: Tue, 26 Mar 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae049
      Issue No: Vol. 4, No. 1 (2024)
       
  • CATD: a reproducible pipeline for selecting cell-type deconvolution
           methods across tissues

    • First page: vbae048
      Abstract: Motivation: Cell-type deconvolution methods aim to infer cell composition from bulk transcriptomic data. The proliferation of developed methods, coupled with inconsistent results obtained in many cases, highlights the pressing need for guidance in the selection of appropriate methods. Additionally, the growing accessibility of single-cell RNA sequencing datasets, often accompanied by bulk expression from related samples, enables the benchmarking of existing methods. Results: In this study, we conduct a comprehensive assessment of 31 methods, utilizing single-cell RNA-sequencing data from diverse human and mouse tissues. Employing various simulation scenarios, we reveal the efficacy of regression-based deconvolution methods, highlighting their sensitivity to reference choices. We investigate the impact of bulk-reference differences, incorporating variables such as sample, study and technology. We provide validation using a gold standard dataset from mononuclear cells and suggest a consensus prediction of proportions when ground truth is not available. We validated the consensus method on data from the stomach and studied its spillover effect. Importantly, we propose the use of the critical assessment of transcriptomic deconvolution (CATD) pipeline, which encompasses functionalities for generating references and pseudo-bulks and running implemented deconvolution methods. CATD streamlines simultaneous deconvolution of numerous bulk samples, providing a practical solution for speeding up the evaluation of newly developed methods. Availability and implementation: https://github.com/Papatheodorou-Group/CATD_snakemake.
      PubDate: Sat, 23 Mar 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae048
      Issue No: Vol. 4, No. 1 (2024)
       
  • dingo: a Python package for metabolic flux sampling

    • First page: vbae037
      Abstract: We present dingo, a Python package that supports a variety of methods to sample from the flux space of metabolic models, based on state-of-the-art random walks and rounding methods. For uniform sampling, dingo’s sampling methods provide significant speed-ups and outperform existing software. Indicatively, dingo can sample from the flux space of the largest metabolic model to date (Recon3D) in less than a day using a personal computer, under several statistical guarantees; this computation is out of reach for other similar software. In addition, dingo supports common analysis methods, such as flux balance analysis and flux variability analysis, and visualization components. dingo contributes to the arsenal of tools in metabolic modelling by enabling flux sampling in high dimensions (in the order of thousands). Availability and implementation: The dingo Python library is available on GitHub at https://github.com/GeomScale/dingo and the data underlying this article are available at https://doi.org/10.5281/zenodo.10423335.
      PubDate: Fri, 22 Mar 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae037
      Issue No: Vol. 4, No. 1 (2024)
       
  • BioNexusSentinel: a visual tool for bioregulatory network and
           cytohistological RNA-seq genetic expression profiling within the context
           of multicellular simulation research using ChatGPT-augmented software
           engineering

    • First page: vbae046
      Abstract: Summary: Motivated by the need to parameterize ongoing multicellular simulation research, this paper documents the culmination of a ChatGPT-augmented software engineering cycle resulting in an integrated visual platform for efficient cytohistological RNA-seq and bioregulatory network exploration. As contrasted to other systems and synthetic biology tools, BioNexusSentinel was developed de novo to uniquely combine these features. Reactome served as the primary source of remotely accessible biological models, accessible using BioNexusSentinel’s novel search engine and REST API requests. The innovative, feature-rich gene expression profiler component was developed to enhance the exploratory experience for the researcher, culminating in the cytohistological RNA-seq explorer based on Human Protein Atlas data. A novel cytohistological classifier would be integrated via pre-processed analysis of the RNA-seq data via R statistical language, providing for useful analytical functionality and good performance for the end-user. Implications of the work span prospects for model orthogonality evaluations, gap identification in network modelling, prototyped automatic kinetics parameterization, and downstream simulation and cellular biological state analysis. This unique computational biology software engineering collaboration with generative natural language processing artificial intelligence was shown to enhance worker productivity, with evident benefits in terms of accelerating coding and machine-human intelligence transfer. Availability and implementation: BioNexusSentinel project releases, with corresponding data and installation instructions, are available at https://github.com/RichardMatzko/BioNexusSentinel.
      PubDate: Wed, 20 Mar 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae046
      Issue No: Vol. 4, No. 1 (2024)
       
  • Data-driven information extraction and enrichment of molecular profiling
           data for cancer cell lines

    • First page: vbae045
      Abstract: Motivation: With the proliferation of research means and computational methodologies, published biomedical literature is growing exponentially in numbers and volume. Cancer cell lines are frequently used models in biological and medical research that are currently applied for a wide range of purposes, from studies of cellular mechanisms to drug development, which has led to a wealth of related data and publications. Sifting through large quantities of text to gather relevant information on cell lines of interest is tedious and extremely slow when performed by humans. Hence, novel computational information extraction and correlation mechanisms are required to boost meaningful knowledge extraction. Results: In this work, we present the design, implementation, and application of a novel data extraction and exploration system. This system extracts deep semantic relations between textual entities from scientific literature to enrich existing structured clinical data concerning cancer cell lines. We introduce a new public data exploration portal, which enables automatic linking of genomic copy number variant plots with ranked, related entities such as affected genes. Each relation is accompanied by literature-derived evidence, allowing for deep, yet rapid, literature search, using existing structured data as a springboard. Availability and implementation: Our system is publicly available on the web at https://cancercelllines.org.
      PubDate: Sat, 16 Mar 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae045
      Issue No: Vol. 4, No. 1 (2024)
       
  • DeGenPrime provides robust primer design and optimization unlocking the
           biosphere

    • First page: vbae044
      Abstract: Motivation: Polymerase chain reaction (PCR) is the world’s most important molecular diagnostic with applications ranging from medicine to ecology. PCR can fail because of poor primer design. The nearest-neighbor thermodynamic properties, picking conserved regions, and filtration via penalty of oligonucleotides form the basis for good primer design. Results: DeGenPrime is a console-based high-quality PCR primer design tool that can utilize MSA formats and degenerate bases, expanding the target range for a single primer set. Our software utilizes thermodynamic properties, filtration metrics, penalty scoring, and conserved region finding of any proposed primer. It has degeneracy, repeated k-mers, relative GC content, and temperature range filters. Minimal penalty scoring is included according to secondary structure self-dimerization metrics, GC clamping, tri- and tetra-loop hairpins, and internal repetition. We compared PrimerDesign-M, DegePrime, ConsensusPrimer, and DeGenPrime on acceptable primer yield. PrimerDesign-M, DegePrime, and ConsensusPrimer provided 0%, 11%, and 17% yield, respectively, for the alternative iron nitrogenase (anfD) gene target. DeGenPrime successfully identified quality primers within the conserved regions of the T4-like phage major capsid protein (g23), conserved regions of molybdenum-based nitrogenase (nif), and its alternatives vanadium (vnf) and iron (anf) nitrogenase. DeGenPrime provides a universal and scalable primer design tool for the entire tree of life. Availability and implementation: DeGenPrime is written in C++ and distributed under a BSD-3-Clause license. The source code for DeGenPrime is freely available on www.github.com/raw-lab/degenprime.
      PubDate: Thu, 14 Mar 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae044
      Issue No: Vol. 4, No. 1 (2024)
       
  • CAFA-evaluator: a Python tool for benchmarking ontological classification
           methods

    • First page: vbae043
      Abstract: We present CAFA-evaluator, a powerful Python program designed to evaluate the performance of prediction methods on targets with hierarchical concept dependencies. It generalizes multi-label evaluation to modern ontologies where the prediction targets are drawn from a directed acyclic graph and achieves high efficiency by leveraging matrix computation and topological sorting. The program requirements include a small number of standard Python libraries, making CAFA-evaluator easy to maintain. The code replicates the Critical Assessment of protein Function Annotation (CAFA) benchmarking, which evaluates predictions of the consistent subgraphs in Gene Ontology. Owing to its reliability and accuracy, the organizers have selected CAFA-evaluator as the official CAFA evaluation software. Availability and implementation: https://pypi.org/project/cafaeval
      PubDate: Thu, 14 Mar 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae043
      Issue No: Vol. 4, No. 1 (2024)
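
The role of topological sorting when prediction targets form a directed acyclic graph, as mentioned in the abstract above, can be illustrated with a short Python sketch. It is not the CAFA-evaluator API: `propagate_scores`, the max-propagation rule, and the toy ontology are assumptions for illustration. The idea shown is that a prediction for a term implies its ancestors, so processing children before parents lets each ancestor inherit the highest score among its descendants in a single pass.

```python
from graphlib import TopologicalSorter


def propagate_scores(parents: dict[str, set[str]],
                     scores: dict[str, float]) -> dict[str, float]:
    """Propagate prediction scores from each term to all of its ancestors.

    parents maps a term to its direct parent terms; scores maps terms to
    predicted confidences. Each ancestor ends up with the maximum score
    found among itself and its descendants (an assumed propagation rule).
    """
    # Include scored terms that have no entry in the parent map.
    graph = {term: parents.get(term, set()) for term in set(parents) | set(scores)}
    # static_order() lists parents before children; reversing it visits
    # every term before any of its parents.
    order = list(TopologicalSorter(graph).static_order())
    propagated = {term: scores.get(term, 0.0) for term in order}
    for term in reversed(order):
        for parent in graph.get(term, ()):
            propagated[parent] = max(propagated[parent], propagated[term])
    return propagated


# Hypothetical toy ontology: c -> b -> a and d -> a.
toy_parents = {"c": {"b"}, "b": {"a"}, "d": {"a"}}
toy_scores = {"c": 0.9, "d": 0.4}
print(propagate_scores(toy_parents, toy_scores))
```

Because every term is finalized before its parents are visited, one pass over the reversed topological order suffices, which is the efficiency argument for sorting the graph first.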
       
  • Prediction of bitterness based on modular designed graph neural network

    • First page: vbae041
      Abstract: Motivation: Bitterness plays a pivotal role in our ability to identify and evade harmful substances in food. As one of the five tastes, it constitutes a critical component of our sensory experiences. However, the reliance on human tasting for discerning flavors presents cost challenges, rendering in silico prediction of bitterness a more practical alternative. Results: In this study, we introduce the use of Graph Neural Networks (GNNs) in bitterness prediction, superseding traditional machine learning techniques. We developed an advanced model, a Hybrid Graph Neural Network (HGNN), surpassing conventional GNNs according to tests on public datasets. Using HGNN and three other GNNs, we designed BitterGNNs, a bitterness predictor that achieved an AUC value of 0.87 in both external bitter/non-bitter and bitter/sweet evaluations, outperforming the acclaimed RDKFP-MLP predictor with AUC values of 0.86 and 0.85. We further created a bitterness prediction website and database, TastePD (https://www.tastepd.com/). The BitterGNNs predictor, built on GNNs, offers accurate bitterness predictions, enhancing the efficacy of bitterness prediction, aiding advanced food testing methodology development, and deepening our understanding of bitterness origins. Availability and implementation: TastePD is available at https://www.tastepd.com; all code is at https://github.com/heyigacu/BitterGNN.
      PubDate: Wed, 13 Mar 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae041
      Issue No: Vol. 4, No. 1 (2024)
       
  • Mapping drug biology to disease genetics to discover drug impacts on the
           human phenome

    • First page: vbae038
      Abstract: Motivation: Medications can have unexpected effects on disease, including not only harmful drug side effects but also beneficial drug repurposing. These effects on disease may result from hidden influences of drugs on disease gene networks. Thus, discovering how the biological effects of drugs relate to disease biology can both provide insight into the mechanism of latent drug effects and help predict new effects. Results: Here, we develop Draphnet, a model that integrates molecular data on 429 drugs and gene associations of nearly 200 common phenotypes to learn a network that explains drug effects on disease in terms of these molecular signals. We present evidence that our method can both predict drug effects and provide insight into the biology of unexpected drug effects on disease. Using Draphnet to map a drug’s known molecular effects to downstream effects on the disease genome, we put forward disease genes impacted by drugs, and we suggest a new grouping of drugs based on shared effects on the disease genome. Our approach has multiple applications, including predicting drug uses and learning drug biology, with implications for personalized medicine. Availability and implementation: Code to reproduce the analysis is available at https://github.com/RDMelamed/drug-phenome
      PubDate: Sat, 09 Mar 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae038
      Issue No: Vol. 4, No. 1 (2024)
       
  • nanoBERT: a deep learning model for gene agnostic navigation of the
           nanobody mutational space

    • First page: vbae033
      Abstract: Motivation: Nanobodies are a subclass of immunoglobulins whose binding site consists of only one peptide chain, bestowing favorable biophysical properties. Recently, the first nanobody therapy was approved, paving the way for further clinical applications of this antibody format. Further development of nanobody-based therapeutics could be streamlined by computational methods. One such method is infilling, the positional prediction of biologically feasible mutations in nanobodies. Being able to identify possible positional substitutions based on sequence context facilitates functional design of such molecules. Results: Here we present nanoBERT, a nanobody-specific transformer to predict amino acids in a given position in a query sequence. We demonstrate the need to develop such a machine-learning-based protocol as opposed to gene-specific positional statistics, since an appropriate genetic reference is not available. We benchmark nanoBERT with respect to human-based language models and ESM-2, demonstrating the benefit of domain-specific language models. We also demonstrate the benefit of employing nanobody-specific predictions for fine-tuning on an experimentally measured thermostability dataset. We hope that nanoBERT will help engineers in a range of predictive tasks for designing therapeutic nanobodies. Availability and implementation: https://huggingface.co/NaturalAntibody/.
      PubDate: Wed, 06 Mar 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae033
      Issue No: Vol. 4, No. 1 (2024)
       
  • Node-degree aware edge sampling mitigates inflated classification
           

    • First page: vbae036
      Abstract: Motivation: Graph representation learning is a family of related approaches that learn low-dimensional vector representations of nodes and other graph elements called embeddings. Embeddings approximate characteristics of the graph and can be used for a variety of machine-learning tasks such as novel edge prediction. For many biomedical applications, partial knowledge exists about positive edges that represent relationships between pairs of entities, but little to no knowledge is available about negative edges that represent the explicit lack of a relationship between two nodes. For this reason, classification procedures are forced to assume that the vast majority of unlabeled edges are negative. Existing approaches to sampling negative edges for training and evaluating classifiers do so by uniformly sampling pairs of nodes. Results: We show here that this sampling strategy typically leads to sets of positive and negative examples with imbalanced node degree distributions. Using a representative heterogeneous biomedical knowledge graph and random walk-based graph machine learning, we show that this strategy substantially impacts classification performance. If users of graph machine-learning models apply the models to prioritize examples that are drawn from approximately the same distribution as the positive examples are, then performance of models as estimated in the validation phase may be artificially inflated. We present a degree-aware node sampling approach that mitigates this effect and is simple to implement. Availability and implementation: Our code and data are publicly available at https://github.com/monarch-initiative/negativeExampleSelection.
      PubDate: Mon, 04 Mar 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae036
      Issue No: Vol. 4, No. 1 (2024)
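
The abstract above contrasts uniform sampling of node pairs with a degree-aware alternative. The following is a minimal sketch of one plausible reading of that idea, not the authors' code: endpoints of negative edges are drawn with probability proportional to their degree among the positive edges, so negatives and positives have similar node-degree profiles. The function name `sample_negative_edges`, the retry limit, and the toy graph are assumptions made for illustration.

```python
import random
from collections import Counter


def sample_negative_edges(positive_edges, n_samples, seed=0):
    """Sample undirected negative edges with degree-weighted endpoints.

    Hypothetical helper for illustration. Each endpoint is drawn with
    probability proportional to its degree in the positive edge set.
    Self-loops, known positive edges, and duplicates are rejected.
    """
    rng = random.Random(seed)
    degree = Counter()
    for u, v in positive_edges:
        degree[u] += 1
        degree[v] += 1
    nodes = sorted(degree)
    weights = [degree[node] for node in nodes]
    known = {frozenset(edge) for edge in positive_edges}

    negatives = set()
    max_tries = 100 * n_samples  # guard against graphs with too few candidates
    tries = 0
    while len(negatives) < n_samples and tries < max_tries:
        tries += 1
        u, v = rng.choices(nodes, weights=weights, k=2)
        pair = frozenset((u, v))
        if u == v or pair in known or pair in negatives:
            continue
        negatives.add(pair)
    return sorted(tuple(sorted(pair)) for pair in negatives)


# Toy graph: one hub connected to six leaves.
toy_positives = [("hub", leaf) for leaf in "abcdef"]
toy_negatives = sample_negative_edges(toy_positives, n_samples=5)
```

Uniform sampling would correspond to setting every weight to 1, which is the baseline the abstract argues against.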
       
  • Tackling reference bias in genotyping by using founder sequences with
           PanVC 3

    • First page: vbae027
      Abstract: Summary: Overcoming reference bias and calling insertions and deletions are major challenges in genotyping. We present PanVC 3, a set of software that can be utilized as part of various variant calling workflows. We show that, by incorporating known genetic variants into a set of founder sequences to which reads are aligned, reference bias is reduced and precision of calling insertions and deletions is improved. Availability and implementation: PanVC 3 and its source code are freely available at https://github.com/tsnorri/panvc3 and at https://anaconda.org/tsnorri/panvc3 under the MIT licence. The experiment scripts are available at https://github.com/algbio/panvc3-experiments.
      PubDate: Mon, 04 Mar 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae027
      Issue No: Vol. 4, No. 1 (2024)
       
  • MERITS: a web-based integrated Mycobacterial PE/PPE protein database

    • First page: vbae035
      Abstract: Motivation: PE/PPE proteins, highly abundant in the Mycobacterium genome, play a vital role in virulence and immune modulation. Understanding their functions is key to comprehending the internal mechanisms of Mycobacterium. However, a lack of dedicated resources has limited research into PE/PPE proteins. Results: Addressing this gap, we introduce MycobactERIal PE/PPE proTeinS (MERITS), a comprehensive 3D structure database specifically designed for PE/PPE proteins. MERITS hosts 22 353 non-redundant PE/PPE proteins, encompassing details like physicochemical properties, subcellular localization, post-translational modification sites, protein functions, and measures of antigenicity, toxicity, and allergenicity. MERITS also includes data on their secondary and tertiary structure, along with other relevant biological information. MERITS is designed to be user-friendly, offering interactive search and data browsing features to aid researchers in exploring the potential functions of PE/PPE proteins. MERITS is expected to become a crucial resource in the field, aiding in developing new diagnostics and vaccines by elucidating the sequence-structure-functional relationships of PE/PPE proteins. Availability and implementation: MERITS is freely accessible at http://merits.unimelb-biotools.cloud.edu.au/.
      PubDate: Sat, 02 Mar 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae035
      Issue No: Vol. 4, No. 1 (2024)
       
  • Inference of differential gene regulatory networks using boosted
           differential trees

    • First page: vbae034
      Abstract: Summary: Diseases can be caused by molecular perturbations that induce specific changes in regulatory interactions and their coordinated expression, also referred to as network rewiring. However, the detection of complex changes in regulatory connections remains a challenging task and would benefit from the development of novel nonparametric approaches. We develop a new ensemble method called BoostDiff (boosted differential regression trees) to infer a differential network discriminating between two conditions. BoostDiff builds an adaptively boosted (AdaBoost) ensemble of differential trees with respect to a target condition. To build the differential trees, we propose differential variance improvement as a novel splitting criterion. Variable importance measures derived from the resulting models are used to reflect changes in gene expression predictability and to build the output differential networks. BoostDiff outperforms existing differential network methods on simulated data evaluated in four different complexity settings. We then demonstrate the power of our approach when applied to real transcriptomics data in COVID-19, Crohn’s disease, breast cancer, prostate adenocarcinoma, and stress response in Bacillus subtilis. BoostDiff identifies context-specific networks that are enriched with genes of known disease-relevant pathways and complements standard differential expression analyses. Availability and implementation: BoostDiff is available at https://github.com/scibiome/boostdiff_inference.
      PubDate: Thu, 29 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae034
      Issue No: Vol. 4, No. 1 (2024)
       
  • The T2T-CHM13 reference assembly uncovers essential WASH1 and GPRIN2
           paralogues

    • First page: vbae029
      Abstract: Summary: The recently published T2T-CHM13 reference assembly completed the annotation of the final 8% of the human genome. It introduced 1956 genes, close to 100 of which are predicted to be coding because they have a protein coding parent gene. Here, we confirm the coding status and functional relevance of two of these genes, paralogues of WASHC1 and GPRIN2. We find that LOC124908094, one of four novel subtelomeric WASH1 genes uncovered in the new assembly, produces the WASH1 protein that forms part of the vital actin-regulatory WASH complex. Its coding status is supported by abundant proteomics, conservation, and cDNA evidence. It was previously assumed that gene WASHC1 produced the functional WASH1 protein, but new evidence shows that WASHC1 is a human-derived duplication and likely to be one of 12 WASH1 pseudogenes in the human gene set. We also find that the T2T-CHM13 assembly has added a functionally important copy of GPRIN2 to the human gene set. We demonstrate that uniquely mapping peptides from proteomics databases support the novel LOC124900631 rather than the GRCh38 assembly GPRIN2 gene. These new additions to the set of human coding genes underline the importance of the new T2T-CHM13 assembly. Availability and implementation: None.
      PubDate: Wed, 28 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae029
      Issue No: Vol. 4, No. 1 (2024)
       
  • Trackplot: a fast and lightweight R script for epigenomic enrichment plots

    • First page: vbae031
      Abstract: Motivation: BigWig files serve as essential inputs in epigenomic data visualization. However, current R packages for visualizing these files are limited, slow, and burdened by numerous dependencies. Results: We introduce trackplot, a minimal R script designed for the rapid generation of integrative genomics viewer (IGV) style track plots, profile plots, and heatmaps from bigWig files. This script offers speed, owing to its reliance on bwtool, resulting in performance gains of several orders of magnitude compared to equivalent packages. The script is lightweight, requiring only the data.table and bwtool packages as primary dependencies. Notably, the plots are generated in base R graphics, eliminating the need for additional packages. trackplot queries the University of California Santa Cruz (UCSC) genome browser for gene models, thereby enhancing the reproducibility of analyses. The script extends its support to general transfer format (GTF), further enhancing its versatility. This tool addresses the gaps in existing bigWig visualization approaches by offering speed, simplicity, and minimal dependencies, thereby presenting a valuable asset to researchers in the field of epigenomics. Availability and implementation: trackplot is implemented in R and is made available under the MIT license at https://github.com/PoisonAlien/trackplot.
      PubDate: Wed, 28 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae031
      Issue No: Vol. 4, No. 1 (2024)
       
  • Making mouse transcriptomics deconvolution accessible with immunedeconv

    • First page: vbae032
      Abstract: Summary: Transcriptome deconvolution has emerged as a reliable technique to estimate cell-type abundances from bulk RNA sequencing data. Unlike their human equivalents, methods to quantify the cellular composition of complex tissues from murine transcriptomics are sparse and sometimes not easy to use. We extended the immunedeconv R package to facilitate the deconvolution of mouse transcriptomics, enabling the quantification of murine immune-cell types using 13 different methods. Through immunedeconv, we further offer the possibility of tweaking cell signatures used by deconvolution methods, providing custom annotations tailored for specific cell types and tissues. These developments strongly facilitate the study of the immune-cell composition of mouse models and further open new avenues in the investigation of the cellular composition of other tissues and organisms. Availability and implementation: The R package and the documentation are available at https://github.com/omnideconv/immunedeconv.
      PubDate: Wed, 28 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae032
      Issue No: Vol. 4, No. 1 (2024)
       
  • MetaQuad: shared informative variants discovery in metagenomic samples

    • First page: vbae030
      Abstract: Motivation: Strain-level analysis of metagenomic data has garnered significant interest in recent years. Microbial single nucleotide polymorphisms (SNPs) are genomic variants that can reflect strain-level differences within a microbial species. The diversity and emergence of SNPs in microbial genomes may reveal evolutionary history and environmental adaptation in microbial populations. However, efficient discovery of shared polymorphic variants in a large collection of metagenomic samples remains a computational challenge. Results: MetaQuad utilizes a density-based clustering technique to effectively distinguish between shared variants and non-polymorphic sites using shotgun metagenomic data. Empirical comparisons with other state-of-the-art methods show that MetaQuad significantly reduces the number of false positive SNPs without greatly affecting the true positive rate. We used MetaQuad to identify antibiotic-associated variants in patients who underwent Helicobacter pylori eradication therapy. MetaQuad detected 7591 variants across 529 antibiotic resistance genes. The nucleotide diversity of some genes is increased 6 weeks after antibiotic treatment, potentially indicating the role of these genes in specific antibiotic treatments. Availability and implementation: MetaQuad is an open-source Python package available via https://github.com/holab-hku/MetaQuad.
      PubDate: Sat, 24 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae030
      Issue No: Vol. 4, No. 1 (2024)
       
  • TrajPy: empowering feature engineering for trajectory analysis across
           domains

    • First page: vbae026
      Abstract: Motivation: Trajectories, which are sequentially measured quantities that form a path, are an important presence in many different fields, from hadronic beams in physics to electrocardiograms in medicine. Trajectory analysis requires the quantification and classification of curves, either by using statistical descriptors or physics-based features. To date, no extensive and user-friendly package for trajectory analysis has been readily available, despite its importance and potential application across various domains. Results: We have developed TrajPy, a free, open-source Python package that serves as a complementary tool for empowering trajectory analysis. This package features a user-friendly graphical user interface and offers a set of physical descriptors that aid in characterizing these complex structures. TrajPy has already been successfully applied to studies of mitochondrial motility in neuroblastoma cell lines and the analysis of in silico models for cell migration, in combination with image analysis. Availability and implementation: The TrajPy package is developed in Python 3 and is released under the GNU GPL-3.0 license. It can easily be installed via PyPI, and the development source code is accessible at the repository: https://github.com/ocbe-uio/TrajPy/. The package release is also automatically archived with the DOI 10.5281/zenodo.3656044.
      PubDate: Fri, 23 Feb 2024 00:00:00 GMT
      Issue No: Vol. 4, No. 1 (2024)
       
  • CloudProteoAnalyzer: scalable processing of big data from proteomics using
           cloud computing

    • First page: vbae024
      Abstract: Summary: Shotgun proteomics is widely used in many system biology studies to determine the global protein expression profiles of tissues, cultures, and microbiomes. Many non-distributed computer algorithms have been developed for users to process proteomics data on their local computers. However, the amount of data acquired in a typical proteomics study has grown rapidly in recent years, owing to the increasing throughput of mass spectrometry and the expanding scale of study designs. This presents a big data challenge for researchers to process proteomics data in a timely manner. To overcome this challenge, we developed a cloud-based parallel computing application to offer end-to-end proteomics data analysis software as a service (SaaS). A web interface was provided to users to upload mass spectrometry-based proteomics data, configure parameters, submit jobs, and monitor job status. The data processing was distributed across multiple nodes in a supercomputer to achieve scalability for large datasets. Our study demonstrated SaaS for proteomics as a viable solution for the community to scale up the data processing using cloud computing. Availability and implementation: This application is available online at https://sipros.oscer.ou.edu/ or https://sipros.unt.edu for free use. The source code is available at https://github.com/Biocomputing-Research-Group/CloudProteoAnalyzer under the GPL version 3.0 license.
      PubDate: Fri, 23 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae024
      Issue No: Vol. 4, No. 1 (2024)
       
  • OMEinfo: global geographic metadata for -omics experiments

    • First page: vbae025
      Abstract: Summary: Microbiome studies increasingly associate geographical features like rurality and climate with microbiomes. It is essential to integrate rich geographical metadata correctly, yet inconsistent definitions of rurality can hinder cross-study comparisons. We address this with OMEinfo, a tool for automated retrieval of consistent geographical metadata from user-provided location data. OMEinfo leverages open data sources such as the Global Human Settlement Layer and the Open-Data Inventory for Anthropogenic Carbon dioxide. OMEinfo's web-app enables users to visualize and investigate the spatial distribution of metadata features. OMEinfo promotes reproducibility and consistency in microbiome metadata through a standardized metadata retrieval approach. To demonstrate utility, OMEinfo is used to replicate the results of a previous study linking population density to bacterial diversity. As the field explores relationships between microbiomes and geographical features, tools like OMEinfo will prove vital in developing a robust, accurate, and interconnected understanding of these interactions, whilst having applicability beyond this field to any studies utilizing location-based metadata. Finally, we release the OMEinfo annotation dataset of 5.3 million OMEinfo annotated samples from the ENA, for use in retrospective analyses of sequencing samples, and suggest several ways researchers and sequencing read repositories can improve the quality of underlying metadata submitted to these public stores. Availability and implementation: OMEinfo is freely available and released under an MIT licence. OMEinfo source code is available at https://github.com/m-crown/OMEinfo/ and https://doi.org/10.5281/zenodo.10518763
      PubDate: Wed, 21 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae025
      Issue No: Vol. 4, No. 1 (2024)
       
  • TOSCCA: a framework for interpretation and testing of sparse canonical
           correlations

    • First page: vbae021
      Abstract: Summary: In clinical and biomedical research, multiple high-dimensional datasets are nowadays routinely collected from omics and imaging devices. Multivariate methods, such as Canonical Correlation Analysis (CCA), integrate two (or more) datasets to discover and understand underlying biological mechanisms. For an explorative method like CCA, interpretation is key. We present a sparse CCA method based on soft-thresholding that produces near-orthogonal components, allows for browsing over various sparsity levels, and supports permutation-based hypothesis testing. Our soft-thresholding approach avoids tuning of a penalty parameter. Such tuning is computationally burdensome and may render unintelligible results. In addition, unlike alternative approaches, our method is less dependent on the initialization. We examined the performance of our approach with simulations and illustrated its use on real cancer genomics data from drug sensitivity screens. Moreover, we compared its performance to Penalized Matrix Analysis (PMA), which is a popular alternative for sparse CCA with a focus on yielding interpretable results. Compared to PMA, our method offers improved interpretability of the results, while not compromising, or even improving, signal discovery. Availability and implementation: The software and simulation framework are available at https://github.com/nuria-sv/toscca.
      PubDate: Wed, 21 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae021
      Issue No: Vol. 4, No. 1 (2024)
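
The abstract above describes soft-thresholding combined with browsing over sparsity levels rather than tuning a penalty. As a minimal sketch of the standard soft-thresholding operator, and not the TOSCCA code, the following shrinks a weight vector so that at most `k` entries remain non-zero. The helper names `soft_threshold` and `sparsify_to_k`, and the choice of taking the threshold from the (k+1)-th largest magnitude, are assumptions made for illustration.

```python
def soft_threshold(values, threshold):
    """Standard soft-thresholding: sign(v) * max(|v| - threshold, 0)."""
    result = []
    for v in values:
        magnitude = abs(v) - threshold
        if magnitude > 0:
            result.append(magnitude if v > 0 else -magnitude)
        else:
            result.append(0.0)
    return result


def sparsify_to_k(values, k):
    """Keep at most k non-zero entries (hypothetical helper).

    The threshold is set to the (k+1)-th largest absolute value, so the
    sparsity level k is chosen directly instead of a penalty parameter.
    Ties at the threshold can leave fewer than k non-zero entries.
    """
    if k >= len(values):
        return [float(v) for v in values]
    magnitudes = sorted((abs(v) for v in values), reverse=True)
    return soft_threshold(values, magnitudes[k])


toy_weights = [0.5, -2.0, 1.0, 3.0]
sparse_weights = sparsify_to_k(toy_weights, k=2)
```

Here the third-largest magnitude is 1.0, so the two largest weights are shrunk by 1.0 and the remaining ones are set to zero.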
       
  • BioModME for building and simulating dynamic computational models of
           complex biological systems

    • First page: vbae023
      Abstract: Summary: Molecular mechanisms of biological functions and disease processes are exceptionally complex, and our ability to interrogate and understand relationships is becoming increasingly dependent on the use of computational modeling. We have developed “BioModME,” a standalone R-based web application package, providing an intuitive and comprehensive graphical user interface to help investigators build, solve, visualize, and analyze computational models of complex biological systems. Some important features of the application package include multi-region system modeling, custom reaction rate laws and equations, unit conversion, model parameter estimation utilizing experimental data, and import and export of model information in the Systems Biology Markup Language format. Users can also export models to the MATLAB, R, and Python languages and the equations to LaTeX and Mathematical Markup Language formats. Other important features include an online model development platform, a multi-modality visualization tool, and efficient numerical solvers for differential-algebraic equations and optimization. Availability and implementation: All relevant software information including documentation and tutorials can be found at https://mcw.marquette.edu/biomedical-engineering/computational-systems-biology-lab/biomodme.php. Deployed software can be accessed at https://biomodme.ctsi.mcw.edu/. Source code is freely available for download at https://github.com/MCWComputationalBiologyLab/BioModME.
      PubDate: Tue, 20 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae023
      Issue No: Vol. 4, No. 1 (2024)
       
  • Novel symmetry-preserving neural network model for phylogenetic inference

    • First page: vbae022
      Abstract: Motivation: Scientists world-wide are putting together massive efforts to understand how the biodiversity that we see on Earth evolved from single-cell organisms at the origin of life. This diversification process is represented through the Tree of Life. Low sampling rates and high heterogeneity in the rate of evolution across sites and lineages produce a phenomenon denoted “long branch attraction” (LBA), in which long nonsister lineages are estimated to be sisters regardless of their true evolutionary relationship. LBA has been a pervasive problem in phylogenetic inference, affecting different types of methodologies from distance-based to likelihood-based. Results: Here, we present a novel neural network model that outperforms standard phylogenetic methods and other neural network implementations under LBA settings. Furthermore, unlike existing neural network models in phylogenetics, our model naturally accounts for the tree isomorphisms via permutation invariant functions, which ultimately result in lower memory use and allow the seamless extension to larger trees. Availability and implementation: We implement our novel theory in an open-source, publicly available GitHub repository: https://github.com/crsl4/nn-phylogenetics.
      PubDate: Mon, 19 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae022
      Issue No: Vol. 4, No. 1 (2024)
       
  • ExpoSeq: simplified analysis of high-throughput sequencing data from
           antibody discovery campaigns

    • First page: vbae020
      Abstract: Summary: High-throughput sequencing (HTS) offers a modern, fast, and explorative solution to unveil the full potential of display techniques, like antibody phage display, in molecular biology. However, a significant challenge lies in the processing and analysis of such data. Furthermore, there is a notable absence of open-access user-friendly software tools that can be utilized by scientists lacking programming expertise. Here, we present ExpoSeq as an easy-to-use tool to explore, process, and visualize HTS data from antibody discovery campaigns like an expert while only requiring a beginner’s knowledge. Availability and implementation: The pipeline is distributed via GitHub and PyPI, and it can either be installed as a package with pip or the user can choose to clone the repository.
      PubDate: Sat, 10 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae020
      Issue No: Vol. 4, No. 1 (2024)
       
  • Quantitative transcriptomic and epigenomic data analysis: a primer

    • First page: vbae019
      Abstract: Summary: The advent of microarray and second generation sequencing technology has revolutionized the field of molecular biology, allowing researchers to quantitatively assess transcriptomic and epigenomic features in a comprehensive and cost-efficient manner. Moreover, technical advancements have pushed the resolution of these sequencing techniques to the single cell level. As a result, the bottleneck of molecular biology research has shifted from the bench to the subsequent omics data analysis. Even though most methodologies share the same general strategy, state-of-the-art literature typically focuses on data type specific approaches and already assumes expert knowledge. Here, however, we aim at providing conceptual insight into the principles of genome-wide quantitative transcriptomic and epigenomic (including open chromatin assay) data analysis by describing a generic workflow. By starting from a general framework and its assumptions, the need for alternative or additional data-analytical solutions when working with specific data types becomes clear, and these solutions are introduced accordingly. Thus, we aim to enable readers with basic omics expertise to deepen their conceptual and statistical understanding of general strategies and pitfalls in omics data analysis and to facilitate subsequent progression to more specialized literature.
      PubDate: Sat, 10 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae019
      Issue No: Vol. 4, No. 1 (2024)
       
  • Ultrafast learning of four-node hybridization cycles in phylogenetic
           networks using algebraic invariants

    • First page: vbae014
      Abstract: Motivation: The abundance of gene flow in the Tree of Life challenges the notion that evolution can be represented with a fully bifurcating process, which cannot capture important biological realities like hybridization, introgression, or horizontal gene transfer. Coalescent-based network methods are increasingly popular, yet not scalable for big data, because they need to perform a heuristic search in the space of networks as well as numerical optimization that can be NP-hard. Here, we introduce a novel method to reconstruct phylogenetic networks based on algebraic invariants. While there is a long tradition of using algebraic invariants in phylogenetics, our work is the first to define phylogenetic invariants on concordance factors (frequencies of four-taxon splits in the input gene trees) to identify level-1 phylogenetic networks under the multispecies coalescent model. Results: Our novel hybrid detection methodology is optimization-free as it only requires the evaluation of polynomial equations, and as such, it bypasses the traversal of network space, yielding a computational speed at least 10 times faster than the fastest-to-date network methods. We illustrate our method’s performance on simulated and real data from the genus Canis. Availability and implementation: We present PhyloDiamond.jl, an open-source Julia package with broad applicability within the evolutionary community, publicly available at https://github.com/solislemuslab/PhyloDiamond.jl.
      PubDate: Thu, 08 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae014
      Issue No: Vol. 4, No. 1 (2024)
       
  • CMAT: ClinVar Mapping and Annotation Toolkit

    • First page: vbae018
      Abstract: Summary: Semantic ontology mapping of clinical descriptors with disease outcome is essential. ClinVar is a key resource for human variation with known clinical significance. We present CMAT, a software toolkit and curation protocol for accurately enriching ClinVar releases with disease ontology associations and complex functional consequences. Availability and implementation: The software and ontology mappings can be obtained from: https://github.com/EBIvariation/CMAT.
      PubDate: Wed, 07 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae018
      Issue No: Vol. 4, No. 1 (2024)
       
  • ZygosityPredictor

    • First page: vbae017
      Abstract: Summary: ZygosityPredictor provides functionality to evaluate how many copies of a gene are affected by mutations in next generation sequencing data. In cancer samples, the tool processes both somatic and germline mutations. In particular, ZygosityPredictor computes the number of affected copies for single nucleotide variants and small insertions and deletions (Indels). In addition, the tool integrates information at gene level via phasing of several variants and subsequent logic to derive how strongly a gene is affected by mutations and provides a measure of confidence. This information is of particular interest in precision oncology, e.g. when assessing whether unmutated copies of tumor-suppressor genes remain. Availability and implementation: ZygosityPredictor was implemented as an R-package and is available via Bioconductor at https://bioconductor.org/packages/ZygosityPredictor. Detailed documentation is provided in the vignette including application to an example genome.
      PubDate: Tue, 06 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae017
      Issue No: Vol. 4, No. 1 (2024)
       
  • HiTaxon: a hierarchical ensemble framework for taxonomic classification of
           short reads

    • First page: vbae016
      Abstract: Motivation: Whole microbiome DNA and RNA sequencing (metagenomics and metatranscriptomics) are pivotal to determining the functional roles of microbial communities. A key challenge in analyzing these complex datasets, typically composed of tens of millions of short reads, is accurately classifying reads to their taxa of origin. While still performing worse relative to reference-based short-read tools in species classification, ML algorithms have shown promising results in taxonomic classification at higher ranks. A recent approach exploited to enhance the performance of ML tools, which can be translated to reference-dependent classifiers, has been to integrate the hierarchical structure of taxonomy within the tool’s predictive algorithm. Results: Here, we introduce HiTaxon, an end-to-end hierarchical ensemble framework for taxonomic classification. HiTaxon facilitates data collection and processing, reference database construction and optional training of ML models to streamline ensemble creation. We show that databases created by HiTaxon improve the species-level performance of reference-dependent classifiers, while reducing their computational overhead. In addition, through exploring hierarchical methods for HiTaxon, we highlight that our custom approach to hierarchical ensembling improves species-level classification relative to traditional strategies. Finally, we demonstrate the improved performance of our hierarchical ensembles over current state-of-the-art classifiers in species classification using datasets comprised of either simulated or experimentally derived reads. Availability and implementation: HiTaxon is available at: https://github.com/ParkinsonLab/HiTaxon.
      PubDate: Thu, 01 Feb 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae016
      Issue No: Vol. 4, No. 1 (2024)
       
  • iCluF: an unsupervised iterative cluster-fusion method for patient
           stratification using multiomics data

    • First page: vbae015
      Abstract: Motivation: Patient stratification is crucial for the effective treatment or management of heterogeneous diseases, including cancers. Multiomic technologies facilitate molecular characterization of human diseases; however, the complexity of the data warrants the development of robust data integration tools for patient stratification using machine-learning approaches. Results: iCluF iteratively integrates three types of multiomic data (mRNA, miRNA, and DNA methylation) using pairwise patient similarity matrices built from each omic data type. The intermediate omic-specific neighborhood matrices implement iterative matrix fusion and message passing among the similarity matrices to derive a final integrated matrix representing all the omics profiles of a patient, which is used to further cluster patients into subtypes. iCluF outperforms other methods with significant differences in the survival profiles of 8581 patients belonging to 30 different cancers in TCGA. iCluF also predicted the four intrinsic subtypes of Breast Invasive Carcinomas with adjusted Rand index and Fowlkes–Mallows scores of 0.72 and 0.83, respectively. The Gini importance score showed that methylation features were the primary decisive players, followed by mRNA and miRNA, in identifying disease subtypes. iCluF can be applied to stratify patients with any disease for which multiomic datasets are available. Availability and implementation: Source code and datasets are available at https://github.com/GudaLab/iCluF_core.
      PubDate: Tue, 30 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae015
      Issue No: Vol. 4, No. 1 (2024)
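
The abstract above reports clustering agreement as an adjusted Rand index and a Fowlkes–Mallows score. As a reminder of what those two numbers measure, here is a small pure-Python sketch of the standard pair-counting definitions. It is not taken from the iCluF code base, and the function names are chosen here for illustration.

```python
from collections import Counter
from math import comb, sqrt


def pair_counts(labels_true, labels_pred):
    """Count sample pairs grouped together by both, by truth, and by prediction."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label lists must have the same length")
    contingency = Counter(zip(labels_true, labels_pred))
    same_both = sum(comb(v, 2) for v in contingency.values())
    same_true = sum(comb(v, 2) for v in Counter(labels_true).values())
    same_pred = sum(comb(v, 2) for v in Counter(labels_pred).values())
    return same_both, same_true, same_pred, comb(len(labels_true), 2)


def adjusted_rand_index(labels_true, labels_pred):
    """Rand index corrected for chance agreement (1.0 means identical partitions)."""
    same_both, same_true, same_pred, total = pair_counts(labels_true, labels_pred)
    if total == 0:
        return 1.0  # fewer than two samples: nothing to disagree on
    expected = same_true * same_pred / total
    maximum = (same_true + same_pred) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both partitions are trivial
    return (same_both - expected) / (maximum - expected)


def fowlkes_mallows(labels_true, labels_pred):
    """Geometric mean of pairwise precision and recall."""
    same_both, same_true, same_pred, _ = pair_counts(labels_true, labels_pred)
    if same_true == 0 or same_pred == 0:
        return 0.0
    return same_both / sqrt(same_true * same_pred)
```

Both scores depend only on which samples are grouped together, so renaming the cluster labels does not change them.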
       
  • PM-CNN: microbiome status recognition and disease detection model based on
           phylogeny and multi-path neural network

    • First page: vbae013
      Abstract: Motivation: The human microbiome, found throughout various body parts, plays a crucial role in health dynamics and disease development. Recent research has highlighted microbiome disparities between patients with different diseases and healthy individuals, suggesting the microbiome’s potential in recognizing health states. Traditionally, microbiome-based status classification relies on pre-trained machine learning (ML) models. However, most ML methods overlook microbial relationships, limiting model performance. Results: To address this gap, we propose PM-CNN (Phylogenetic Multi-path Convolutional Neural Network), a novel phylogeny-based neural network model for multi-status classification and disease detection using microbiome data. PM-CNN organizes microbes based on their phylogenetic relationships and extracts features using a multi-path convolutional neural network. An ensemble learning method then fuses these features to make accurate classification decisions. We applied PM-CNN to human microbiome data for status and disease detection, demonstrating its significant superiority over existing ML models. These results provide a robust foundation for microbiome-based state recognition and disease prediction in future research and applications. Availability and implementation: PM-CNN software is available at https://github.com/qdu-bioinfo/PM_CNN.
      PubDate: Sat, 27 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae013
      Issue No: Vol. 4, No. 1 (2024)
       
  • RAMZIS: a bioinformatic toolkit for rigorous assessment of the alterations
           to glycoprotein composition that occur during biological processes

    • First page: vbae012
      Abstract: Motivation: Glycosylation elaborates the structures and functions of glycoproteins; glycoproteins are common post-translationally modified proteins and are heterogeneous and non-deterministically synthesized as an evolutionarily driven mechanism that elaborates the functions of glycosylated gene products. Glycoproteins, accounting for approximately half of all proteins, require specialized proteomics data analysis methods due to micro- and macro-heterogeneities as a given glycosite can be divided into several glycosylated forms, each of which must be quantified. Sampling of heterogeneous glycopeptides is limited by mass spectrometer speed and sensitivity, resulting in missing values. In conjunction with the low sample size inherent to glycoproteomics, a specialized toolset is needed to determine if observed changes in glycopeptide abundances are biologically significant or due to data quality limitations. Results: We developed an R package, Relative Assessment of m/z Identifications by Similarity (RAMZIS), that uses similarity metrics to guide researchers to a more rigorous interpretation of glycoproteomics data. RAMZIS uses a permutation test to generate contextual similarity, which assesses the quality of mass spectral data and outputs a graphical demonstration of the likelihood of finding biologically significant differences in glycosylation abundance datasets. Investigators can assess dataset quality, holistically differentiate glycosites, and identify which glycopeptides are responsible for glycosylation pattern change. RAMZIS is validated by theoretical cases and a proof-of-concept application. RAMZIS enables comparison between datasets too stochastic, small, or sparse for interpolation while acknowledging these issues in its assessment. Using this tool, researchers will be able to rigorously define the role of glycosylation and the changes that occur during biological processes. Availability and implementation: https://github.com/WillHackett22/RAMZIS.
      PubDate: Thu, 25 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae012
      Issue No: Vol. 4, No. 1 (2024)
       
  • Optimizer’s dilemma: optimization strongly influences model selection in
           transcriptomic prediction

    • First page: vbae004
      Abstract: Motivation: Most models can be fit to data using various optimization approaches. While model choice is frequently reported in machine-learning-based research, optimizers are not often noted. We applied two implementations of LASSO logistic regression from Python’s scikit-learn package, using two different optimization approaches (coordinate descent, implemented in the liblinear library, and stochastic gradient descent, or SGD), to predict mutation status and gene essentiality from gene expression across a variety of pan-cancer driver genes. For varying levels of regularization, we compared performance and model sparsity between optimizers. Results: After model selection and tuning, we found that liblinear and SGD tended to perform comparably. liblinear models required more extensive tuning of regularization strength, performing best for high model sparsities (more nonzero coefficients), but did not require selection of a learning rate parameter. SGD models required tuning of the learning rate to perform well, but generally performed more robustly across different model sparsities as regularization strength decreased. Given these tradeoffs, we believe that the choice of optimizers should be clearly reported as a part of the model selection and validation process, to allow readers and reviewers to better understand the context in which results have been generated. Availability and implementation: The code used to carry out the analyses in this study is available at https://github.com/greenelab/pancancer-evaluation/tree/master/01_stratified_classification. Performance/regularization strength curves for all genes in the Vogelstein et al. (2013) dataset are available at https://doi.org/10.6084/m9.figshare.22728644.
      PubDate: Wed, 24 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae004
      Issue No: Vol. 4, No. 1 (2024)
       
  • SPREd: a simulation-supervised neural network tool for gene regulatory
           network reconstruction

    • First page: vbae011
      Abstract: Summary: Reconstruction of gene regulatory networks (GRNs) from expression data is a significant open problem. Common approaches train a machine learning (ML) model to predict a gene’s expression using transcription factors’ (TFs’) expression as features and designate important features/TFs as regulators of the gene. Here, we present an entirely different paradigm, where GRN edges are directly predicted by the ML model. The new approach, named “SPREd,” is a simulation-supervised neural network for GRN inference. Its inputs comprise expression relationships (e.g. correlation, mutual information) between the target gene and each TF and between pairs of TFs. The output includes binary labels indicating whether each TF regulates the target gene. We train the neural network model using synthetic expression data generated by a biophysics-inspired simulation model that incorporates linear as well as non-linear TF–gene relationships and diverse GRN configurations. We show SPREd to outperform state-of-the-art GRN reconstruction tools GENIE3, ENNET, PORTIA, and TIGRESS on synthetic datasets with high co-expression among TFs, similar to that seen in real data. A key advantage of the new approach is its robustness to relatively small numbers of conditions (columns) in the expression matrix, which is a common problem faced by existing methods. Finally, we evaluate SPREd on real data sets in yeast that represent gold-standard benchmarks of GRN reconstruction and show it to perform significantly better than or comparably to existing methods. In addition to its high accuracy and speed, SPREd marks a first step toward incorporating biophysics principles of gene regulation into ML-based approaches to GRN reconstruction. Availability and implementation: Data and code are available from https://github.com/iiiime/SPREd.
      PubDate: Tue, 23 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae011
      Issue No: Vol. 4, No. 1 (2024)
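
The abstract above describes the network's inputs as expression relationships between the target gene and each TF and between pairs of TFs. The sketch below shows one way such an input vector could be assembled, using Pearson correlation only. The function names, the feature ordering, and the choice of a single relationship measure are assumptions made for illustration and do not reflect the SPREd implementation.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / sqrt(var_x * var_y)


def relationship_features(target, tfs):
    """Build a feature vector: target-TF correlations, then TF-TF correlations.

    target: expression of the target gene across conditions.
    tfs: list of TF expression profiles over the same conditions.
    """
    target_tf = [pearson(target, tf) for tf in tfs]
    tf_tf = [pearson(a, b) for a, b in combinations(tfs, 2)]
    return target_tf + tf_tf
```

For k TFs this yields k target-TF values followed by k(k-1)/2 TF-TF values, so the vector length depends only on the number of TFs and not on the number of conditions.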
       
  • Non-Negative matrix factorization combined with kernel regression for the
           prediction of adverse drug reaction profiles

    • First page: vbae009
      Abstract: Motivation: Post-market unexpected Adverse Drug Reactions (ADRs) are associated with significant costs, in both financial burden and human health. Due to the high cost and time required to run clinical trials, there is significant interest in accurate computational methods that can aid in the prediction of ADRs for new drugs. As a machine learning task, ADR prediction is made more challenging due to a high degree of class imbalance, and existing methods do not successfully balance the requirement to detect the minority cases (true positives for ADR), as measured by the Area Under the Precision-Recall (AUPR) curve, with the ability to separate true positives from true negatives [as measured by the Area Under the Receiver Operating Characteristic (AUROC) curve]. Surprisingly, the performance of most existing methods is worse than a naïve method that attributes ADRs to drugs according to the frequency with which the ADR has been observed over all other drugs. The existing advanced methods applied do not lead to substantial gains in predictive performance. Results: We designed a rigorous evaluation to provide an unbiased estimate of the performance of ADR prediction methods: Nested Cross-Validation and a hold-out set were adopted. Among the existing methods, Kernel Regression (KR) performed best in AUPR but had a disadvantage in AUROC, relative to other methods, including the naïve method. We proposed a novel method that combines non-negative matrix factorization with kernel regression, called VKR. This novel approach matched or exceeded the performance of existing methods, overcoming the weakness of the existing methods. Availability: Code and data are available at https://github.com/YezhaoZhong/VKR.
      PubDate: Tue, 23 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae009
      Issue No: Vol. 4, No. 1 (2024)
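
The naïve baseline described in the abstract above scores each ADR for a drug by how often that ADR has been observed over all other drugs. A minimal Python sketch of that idea follows, assuming a binary drug-by-ADR matrix stored as a list of rows. The function name and data layout are hypothetical and are not taken from the VKR repository.

```python
def naive_adr_scores(adr_matrix, drug_index):
    """Score every ADR for one drug by its frequency among all other drugs.

    adr_matrix: list of rows, one per drug; each row holds 0/1 flags, one per ADR.
    drug_index: row of the drug being scored; it is excluded from the counts.
    """
    others = [row for i, row in enumerate(adr_matrix) if i != drug_index]
    if not others:
        raise ValueError("at least one other drug is needed")
    n_adrs = len(adr_matrix[0])
    return [sum(row[j] for row in others) / len(others) for j in range(n_adrs)]
```

Every drug receives nearly the same ranking of ADRs under this baseline, which is why it is a useful sanity check: a method that uses drug features should at least beat it.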
       
  • MMDRP: drug response prediction and biomarker discovery using multi-modal
           deep learning

    • First page: vbae010
      Abstract: Motivation: A major challenge in cancer care is that patients with similar demographics, tumor types, and medical histories can respond quite differently to the same drug regimens. This difference is largely explained by genetic and other molecular variabilities among the patients and their cancers. Efforts in the pharmacogenomics field are underway to understand better the relationship between the genome of the patient’s healthy and tumor cells and their response to therapy. To advance this goal, research groups and consortia have undertaken large-scale systematic screening of panels of drugs across multiple cancer cell lines that have been molecularly profiled by genomics, proteomics, and similar techniques. These large data drug screening sets have been applied to the problem of drug response prediction (DRP), the challenge of predicting the response of a previously untested drug/cell-line combination. Although deep learning algorithms outperform traditional methods, there are still many challenges in DRP that ultimately result in these models’ low generalizability and hamper their clinical application. Results: In this article, we describe a novel algorithm that addresses the major shortcomings of current DRP methods by combining multiple cell line characterization data, addressing drug response data skewness, and improving chemical compound representation. Availability and implementation: MMDRP is implemented as an open-source, Python-based, command-line program and is available at https://github.com/LincolnSteinLab/MMDRP.
      PubDate: Sat, 20 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae010
      Issue No: Vol. 4, No. 1 (2024)
       
  • Predicting evolutionary targets and parameters of gene deletion from
           expression data

    • First page: vbae002
      Abstract: Motivation: Gene deletion is traditionally thought of as a nonadaptive process that removes functional redundancy from genomes, such that it generally receives less attention than duplication in evolutionary turnover studies. Yet, mounting evidence suggests that deletion may promote adaptation via the “less-is-more” evolutionary hypothesis, as it often targets genes harboring unique sequences, expression profiles, and molecular functions. Hence, predicting the relative prevalence of redundant and unique functions among genes targeted by deletion, as well as the parameters underlying their evolution, can shed light on the role of gene deletion in adaptation. Results: Here, we present CLOUDe, a suite of machine learning methods for predicting evolutionary targets of gene deletion events from expression data. Specifically, CLOUDe models expression evolution as an Ornstein–Uhlenbeck process, and uses multi-layer neural network, extreme gradient boosting, random forest, and support vector machine architectures to predict whether deleted genes are “redundant” or “unique”, as well as several parameters underlying their evolution. We show that CLOUDe boasts high power and accuracy in differentiating between classes, and high accuracy and precision in estimating evolutionary parameters, with optimal performance achieved by its neural network architecture. Application of CLOUDe to empirical data from Drosophila suggests that deletion primarily targets genes with unique functions, with further analysis showing these functions to be enriched for protein deubiquitination. Thus, CLOUDe represents a key advance in learning about the role of gene deletion in functional evolution and adaptation. Availability and implementation: CLOUDe is freely available on GitHub (https://github.com/anddssan/CLOUDe).
      PubDate: Wed, 17 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae002
      Issue No: Vol. 4, No. 1 (2024)
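
The abstract above models expression evolution as an Ornstein–Uhlenbeck (OU) process. For readers unfamiliar with it, the sketch below simulates a generic one-dimensional OU path with an Euler–Maruyama step. The parameter names (theta for the pull strength, mu for the optimum, sigma for the noise scale) follow common textbook usage and are assumptions here; this is not CLOUDe's model specification or code.

```python
import random
from math import sqrt


def simulate_ou(x0, theta, mu, sigma, dt, n_steps, rng=None):
    """Simulate dX = theta * (mu - X) dt + sigma dW with Euler-Maruyama steps.

    Returns the list of n_steps + 1 values, starting at x0.
    """
    if rng is None:
        rng = random.Random(0)  # fixed seed so repeated runs match
    path = [x0]
    for _ in range(n_steps):
        x = path[-1]
        drift = theta * (mu - x) * dt  # pull toward the optimum mu
        noise = sigma * sqrt(dt) * rng.gauss(0.0, 1.0)  # random fluctuation
        path.append(x + drift + noise)
    return path
```

With sigma set to 0 the path relaxes deterministically toward mu, which makes the role of theta easy to see before noise is added.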
       
  • mvlearnR and Shiny App for multiview learning

    • First page: vbae005
      Abstract: Summary: The package mvlearnR and accompanying Shiny App are intended for integrating data from multiple sources or views or modalities (e.g. genomics, proteomics, clinical, and demographic data). Most existing software packages for multiview learning are decentralized and offer limited capabilities, making it difficult for users to perform comprehensive integrative analysis. The new package wraps statistical and machine learning methods and graphical tools, providing a convenient and easy data integration workflow. For users with limited programming experience, we provide a Shiny Application to facilitate data integration anywhere and on any device. The methods have potential to offer deeper insights into complex disease mechanisms. Availability and implementation: mvlearnR is available from the following GitHub repository: https://github.com/lasandrall/mvlearnR. The web application is hosted on shinyapps.io and available at: https://multi-viewlearn.shinyapps.io/MultiView_Modeling/.
      PubDate: Tue, 16 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae005
      Issue No: Vol. 4, No. 1 (2024)
       
  • HIVseqDB: a portable resource for NGS and sample metadata integration for
           HIV-1 drug resistance analysis

    • First page: vbae008
      Abstract: Summary: Human immunodeficiency virus (HIV) remains a public health threat, with drug resistance being a major concern in HIV treatment. Next-generation sequencing (NGS) is a powerful tool for identifying low-abundance drug resistance mutations (LA-DRMs) that conventional Sanger sequencing cannot reliably detect. To fully understand the significance of LA-DRMs, it is necessary to integrate NGS data with clinical and demographic data. However, freely available tools for NGS-based HIV-1 drug resistance analysis do not integrate these data. This poses a challenge in interpretation of the impact of LA-DRMs, mainly for resource-limited settings due to the shortage of bioinformatics expertise. To address this challenge, we present HIVseqDB, a portable, secure, and user-friendly resource for integrating NGS data with associated clinical and demographic data for analysis of HIV drug resistance. HIVseqDB currently supports uploading of NGS data and associated sample data, HIV-1 drug resistance data analysis, browsing of uploaded data, and browsing and visualizing of analysis results. Each function of HIVseqDB corresponds to an individual Django application. This ensures efficient incorporation of additional features with minimal effort. HIVseqDB can be deployed on various computing environments, such as on-premises high-performance computing facilities and cloud-based platforms. Availability and implementation: HIVseqDB is available at https://github.com/AlfredUg/HIVseqDB. A deployed instance of HIVseqDB is available at https://hivseqdb.org.
      PubDate: Sun, 14 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae008
      Issue No: Vol. 4, No. 1 (2024)
       
  • DeepRegFinder: deep learning-based regulatory elements finder

    • First page: vbae007
      Abstract: Summary: Enhancers and promoters are important classes of DNA regulatory elements (DREs) that govern gene expression. Identifying them at a genomic scale is a critical task in bioinformatics. The DREs often exhibit unique histone mark binding patterns, which can be captured by high-throughput ChIP-seq experiments. To account for the variation and noise among the binding sites, machine learning models are trained on known enhancer/promoter sites using histone mark ChIP-seq data and predict enhancers/promoters at other genomic regions. To this end, we have developed a highly customizable program named DeepRegFinder, which automates the entire process of data processing, model training, and prediction. We have employed convolutional and recurrent neural networks for model training and prediction. DeepRegFinder further categorizes enhancers and promoters into active and poised states, a unique and valuable feature for researchers. Our method demonstrates improved precision and recall in comparison to existing algorithms for enhancer prediction across multiple cell types. Moreover, our pipeline is modular and eliminates the tedious steps involved in preprocessing, making it easier for users to apply it to their data quickly. Availability and implementation: https://github.com/shenlab-sinai/DeepRegFinder
      PubDate: Sun, 14 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae007
      Issue No: Vol. 4, No. 1 (2024)
       
  • DIAgui: a Shiny application to process the output from DIA-NN

    • First page: vbae001
      Abstract: Summary: DIAgui is an R package to simplify the processing of the report file from the DIA-NN software thanks to a Shiny application. It returns the quantification of either the precursors, the peptides, the proteins, or the genes thanks to the MaxLFQ algorithm. In addition, the latest version provides the Top3 and iBAQ quantification and the number of peptides used for the quantification. In the end, DIAgui produces ready-to-interpret files from the results of DIA mass spectrometry analysis and provides visualization and statistical tools that can be used in a user-friendly way. Availability and implementation: Code and documentation are available on GitHub at https://github.com/marseille-proteomique/DIAgui. The package is written in R and also uses C++ code. A vignette shows its use in an R command line workflow.
      PubDate: Sat, 13 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae001
      Issue No: Vol. 4, No. 1 (2024)
       
  • ClusterV-Web: a user-friendly tool for profiling HIV quasispecies and
           generating drug resistance reports from nanopore long-read data

    • First page: vbae006
      Abstract: Summary: Third-generation long-read sequencing is an increasingly utilized technique for profiling human immunodeficiency virus (HIV) quasispecies and detecting drug resistance mutations due to its ability to cover the entire viral genome in individual reads. Recently, the ClusterV tool has demonstrated accurate detection of HIV quasispecies from Nanopore long-read sequencing data. However, the need for scripting skills and a computational environment may act as a barrier for many potential users. To address this issue, we have introduced ClusterV-Web, a user-friendly web-based application that enables easy configuration and execution of ClusterV, both remotely and locally. Our tool provides interactive tables and data visualizations to aid in the interpretation of results. This development is expected to democratize access to long-read sequencing data analysis, enabling a wider range of researchers and clinicians to efficiently profile HIV quasispecies and detect drug resistance mutations. Availability and implementation: ClusterV-Web is freely available and open source, with detailed documentation accessible at http://www.bio8.cs.hku.hk/ClusterVW/. The standalone Docker image and source code are also available at https://github.com/HKU-BAL/ClusterV-Web.
      PubDate: Sat, 13 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae006
      Issue No: Vol. 4, No. 1 (2024)
       
  • RespirAnalyzer: an R package for analyzing data from continuous monitoring
           of respiratory signals

    • First page: vbae003
      Abstract: Motivation: The analysis of data obtained from continuous monitoring of respiratory signals (CMRS) holds significant importance in improving patient care, optimizing sports performance, and advancing scientific understanding in the field of respiratory health. Results: The R package RespirAnalyzer provides an analytic tool specifically for feature extraction, fractal and complexity analysis for CMRS data. The package covers a wide and comprehensive range of data analysis methods including obtaining inter-breath intervals (IBI) series, plotting time series, obtaining summary statistics of IBI series, conducting power spectral density, multifractal detrended fluctuation analysis (MFDFA) and multiscale sample entropy analysis, fitting the MFDFA results with the extended binomial multifractal model, displaying results using various plots, etc. This package has been developed from our work in directly analyzing CMRS data and is anticipated to assist fellow researchers in computing the related features of their CMRS data, enabling them to delve into the clinical significance inherent in these features. Availability and implementation: The package for Windows is available from both Comprehensive R Archive Network (CRAN): https://cran.r-project.org/web/packages/RespirAnalyzer/index.html and GitHub: https://github.com/dongxinzheng/RespirAnalyzer.
      PubDate: Sat, 13 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbae003
      Issue No: Vol. 4, No. 1 (2024)
       
  • Advancements in bioinformatics and computational biology: 15th annual
           Polish Bioinformatic Society Symposium

    • First page: vbad187
      Abstract: The Polish Bioinformatic Society (PTBI) Symposium convenes annually at leading Polish Universities, and in 2023, the Silesian University of Technology hosted participants from all over the world. The 15th PTBI Symposium, spanning a 3-day duration and divided into four scientific sessions, gathered around 100 participants and centered on research related to machine learning in biomedicine, RNA structure algorithms, next-generation sequencing methods, and microbiome analysis but was not limited to only those topics. The meeting also recognized outstanding research conducted by young scientists by awarding the best poster and best talk. Finally, the awards for the best PhD, MSc, and BSc thesis in bioinformatics defended in Poland were given. This report summarizes the key highlights and outcomes of the meeting.
      PubDate: Sat, 13 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbad187
      Issue No: Vol. 4, No. 1 (2024)
       
  • DeepRank-GNN-esm: a graph neural network for scoring protein–protein
           models using protein language model

    • First page: vbad191
      Abstract: Motivation: Protein–protein interactions (PPIs) play critical roles in numerous cellular processes. By modelling the 3D structures of the corresponding protein complexes, valuable insights can be obtained, providing, e.g. starting points for drug and protein design. One challenge in the modelling process is, however, the identification of near-native models from the large pool of generated models. To this end, we have previously developed DeepRank-GNN, a graph neural network that integrates structural and sequence information to enable effective pattern learning at PPI interfaces. Its main features are related to the Position Specific Scoring Matrices (PSSMs), which are computationally expensive to generate, significantly limiting the algorithm's usability. Results: We introduce here DeepRank-GNN-esm, which includes as additional features protein language model embeddings from the ESM-2 model. We show that the ESM-2 embeddings can actually replace the PSSM features at no cost in performance, or even with better performance, on two PPI-related tasks: scoring docking poses and detecting crystal artifacts. This new DeepRank version thus bypasses the need to generate PSSMs, greatly improving the usability of the software and opening new application opportunities for systems for which PSSM profiles cannot be obtained or are irrelevant (e.g. antibody-antigen complexes). Availability and implementation: DeepRank-GNN-esm is freely available from https://github.com/DeepRank/DeepRank-GNN-esm.
      PubDate: Fri, 05 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbad191
      Issue No: Vol. 4, No. 1 (2024)
       
  • Joint regression analysis of multiple traits based on genetic
           relationships

    • First page: vbad192
      Abstract: Motivation: Polygenic scores (PGSs) are widely available and employed in genomic data analyses for predicting and understanding genetic architectures. Existing approaches either require information on SNP level, do not infer clusters of traits sharing genetic characteristics, or do not have any immediate predictive properties. Results: Here, we present geneJAM, a novel clustering and estimation method using PGSs for inferring a genetic relationship among multiple, simultaneously measured and potentially correlated traits in a multivariate GWAS. Using graphical lasso, we estimate a sparse covariance matrix of the PGSs and obtain clusters of traits sharing genetic characteristics. We use the clusters to specify the structure of the error covariance matrix of a generalized least squares (GLS) model and use the feasible GLS estimator for estimating a linear regression model with a certain unknown degree of correlation between the residuals. The method suits many biology studies well with traits embedded in some genetic functioning groups and facilitates development of the PGS research. We compare the method with fully parametric techniques on simulated data and illustrate the utility of the method by examining a heterogeneous stock mouse data set from the Wellcome Trust Centre for Human Genetics. We demonstrate that the method successfully identifies clusters of traits and increases precision, power, and computational efficiency. Availability and implementation: geneJAM is implemented in R and available at: https://github.com/abuchardt/geneJAM.
      PubDate: Thu, 04 Jan 2024 00:00:00 GMT
      DOI: 10.1093/bioadv/vbad192
      Issue No: Vol. 4, No. 1 (2024)
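
For reference, the feasible GLS estimator named in the abstract above has the standard textbook form shown below. The symbols are notation assumed here rather than taken from the geneJAM paper: X is the design matrix, y the response vector, and \hat{\Omega} an estimate of the error covariance matrix (whose structure, per the abstract, is specified from the trait clusters).

```latex
\hat{\beta}_{\mathrm{FGLS}}
  = \left( X^{\top} \hat{\Omega}^{-1} X \right)^{-1} X^{\top} \hat{\Omega}^{-1} y
```

When \hat{\Omega} is proportional to the identity matrix, this reduces to the ordinary least squares estimator (X^{\top} X)^{-1} X^{\top} y.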
       
 
JournalTOCs
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762
 


Your IP address: 44.192.94.177
 
Home (Search)
API
About JournalTOCs
News (blog, publications)
JournalTOCs on Twitter   JournalTOCs on Facebook

JournalTOCs © 2009-
 
 