Subjects -> MEDICAL SCIENCES (Total: 8212 journals)
    - ALLERGOLOGY AND IMMUNOLOGY (205 journals)
    - ANAESTHESIOLOGY (122 journals)
    - CARDIOVASCULAR DISEASES (334 journals)
    - CHIROPRACTIC, HOMEOPATHY, OSTEOPATHY (19 journals)
    - COMMUNICABLE DISEASES, EPIDEMIOLOGY (227 journals)
    - DENTISTRY (266 journals)
    - DERMATOLOGY AND VENEREOLOGY (162 journals)
    - EMERGENCY AND INTENSIVE CRITICAL CARE (121 journals)
    - ENDOCRINOLOGY (149 journals)
    - FORENSIC SCIENCES (43 journals)
    - GASTROENTEROLOGY AND HEPATOLOGY (178 journals)
    - GERONTOLOGY AND GERIATRICS (125 journals)
    - HEMATOLOGY (160 journals)
    - HYPNOSIS (4 journals)
    - INTERNAL MEDICINE (177 journals)
    - LABORATORY AND EXPERIMENTAL MEDICINE (90 journals)
    - MEDICAL GENETICS (58 journals)
    - MEDICAL SCIENCES (2241 journals)
    - NURSES AND NURSING (331 journals)
    - OBSTETRICS AND GYNECOLOGY (199 journals)
    - ONCOLOGY (355 journals)
    - OPHTHALMOLOGY AND OPTOMETRY (135 journals)
    - ORTHOPEDICS AND TRAUMATOLOGY (150 journals)
    - OTORHINOLARYNGOLOGY (76 journals)
    - PATHOLOGY (96 journals)
    - PEDIATRICS (254 journals)
    - PHYSICAL MEDICINE AND REHABILITATION (153 journals)
    - PSYCHIATRY AND NEUROLOGY (800 journals)
    - RADIOLOGY AND NUCLEAR MEDICINE (182 journals)
    - RESPIRATORY DISEASES (109 journals)
    - RHEUMATOLOGY (75 journals)
    - SPORTS MEDICINE (77 journals)
    - SURGERY (388 journals)
    - UROLOGY, NEPHROLOGY AND ANDROLOGY (151 journals)

MEDICAL SCIENCES (2241 journals)

Showing 601 - 800 of 3562 Journals sorted alphabetically
F&S Science : Official journal of the American Society for Reproductive Medicine     Open Access  
Facial Plastic Surgery & Aesthetic Medicine     Full-text available via subscription   (Followers: 2)
Facta Universitatis, Series : Medicine and Biology     Open Access  
Family Medicine and Community Health     Open Access   (Followers: 8)
Family Practice     Hybrid Journal   (Followers: 17)
Family Practice & Palliative Care     Open Access   (Followers: 5)
Family Practice Management     Full-text available via subscription   (Followers: 4)
Faridpur Medical College Journal     Open Access  
FEM : Revista de la Fundación Educación Médica     Open Access  
Finlay : Revista de Enfermedades no Transmisibles     Open Access  
Fisioterapia     Full-text available via subscription   (Followers: 2)
Fisioterapia & Saúde Funcional     Open Access  
Flugmedizin · Tropenmedizin · Reisemedizin - FTR     Hybrid Journal  
FMC - Formación Médica Continuada en Atención Primaria     Full-text available via subscription  
Folia Medica     Open Access  
Folia Medica Indonesiana     Open Access  
Folia Morphologica     Full-text available via subscription  
Folia Phoniatrica et Logopaedica     Full-text available via subscription   (Followers: 1)
Fontanus     Open Access  
Food Hydrocolloids for Health     Open Access  
Foodborne Pathogens and Disease     Hybrid Journal   (Followers: 11)
Foot & Ankle Specialist     Hybrid Journal   (Followers: 4)
Foot and Ankle Clinics     Full-text available via subscription   (Followers: 12)
Foot and Ankle Online Journal     Full-text available via subscription   (Followers: 6)
Forensic Science International : Mind and Law     Open Access   (Followers: 4)
Forum Medycyny Rodzinnej     Hybrid Journal  
Forum Zaburzeń Metabolicznych     Hybrid Journal  
Frontières     Full-text available via subscription   (Followers: 3)
Frontiers in Digital Health     Open Access   (Followers: 4)
Frontiers in Medical Technology     Open Access  
Frontiers in Medicine     Open Access   (Followers: 2)
Frontiers in Network Physiology     Open Access   (Followers: 2)
Frontiers in Neuroprosthetics     Open Access   (Followers: 6)
Frontiers in Synaptic Neuroscience     Open Access   (Followers: 2)
Frontiers in Tropical Diseases     Open Access  
Frontiers of Medical and Biological Engineering     Hybrid Journal  
Frontiers of Medicine     Hybrid Journal   (Followers: 2)
Fuss & Sprunggelenk     Hybrid Journal  
Future Medicinal Chemistry     Full-text available via subscription   (Followers: 5)
Future Prescriber     Hybrid Journal  
Future Science OA     Open Access  
Gaceta Médica Boliviana     Open Access  
Gaceta Médica Espirituana     Open Access  
Galen Medical Journal     Open Access  
Galician Medical Journal     Open Access   (Followers: 1)
Galle Medical Journal     Open Access  
Gefäßmedizin Scan     Hybrid Journal  
Gender and the Genome     Open Access   (Followers: 1)
Gene Expression     Full-text available via subscription   (Followers: 1)
General Reanimatology     Open Access  
Genes     Open Access   (Followers: 2)
Genome Instability & Disease     Hybrid Journal  
Geoforum     Hybrid Journal   (Followers: 25)
Gestão e Desenvolvimento     Open Access  
Ghana Medical Journal     Open Access   (Followers: 1)
GigaScience     Open Access   (Followers: 4)
Gimbernat : Revista d’Història de la Medicina i de les Ciències de la Salut     Open Access  
Glia     Hybrid Journal   (Followers: 5)
Global Advances in Health and Medicine     Open Access  
Global Bioethics     Open Access   (Followers: 5)
Global Health : Science and Practice     Open Access   (Followers: 7)
Global Health Journal     Open Access   (Followers: 2)
Global Journal of Integrated Chinese Medicine and Western Medicine     Open Access  
Global Journal of Cancer Therapy     Open Access  
Global Journal of Fertility and Research     Open Access  
Global Journal of Health Science     Open Access   (Followers: 5)
Global Journal of Infectious Diseases and Clinical Research     Open Access  
Global Journal of Medical and Clinical Case Reports     Open Access  
Global Journal of Obesity, Diabetes and Metabolic Syndrome     Open Access   (Followers: 1)
Global Journal of Perioperative Medicine     Open Access  
Global Journal of Rare Diseases     Open Access  
Global Medical & Health Communication     Open Access   (Followers: 1)
Global Reproductive Health     Open Access  
Grande Medical Journal     Open Access  
Growth Factors     Hybrid Journal   (Followers: 2)
GSTF Journal of Advances in Medical Research     Open Access  
Gümüşhane Üniversitesi Sağlık Bilimleri Dergisi     Open Access  
Hamdan Medical Journal     Open Access  
Hämostaseologie     Hybrid Journal   (Followers: 5)
Hämostaseologie     Open Access  
Hand     Hybrid Journal   (Followers: 4)
Hand Clinics     Full-text available via subscription   (Followers: 6)
Hand Therapy     Hybrid Journal   (Followers: 11)
Hard Tissue     Open Access  
Head & Face Medicine     Open Access   (Followers: 1)
Head and Neck Cancer Research     Open Access  
Head and Neck Tumors     Open Access  
Health Information : Jurnal Penelitian     Open Access  
Health Matrix : The Journal of Law-Medicine     Open Access  
Health Notions     Open Access  
Health Science Journal of Indonesia     Open Access  
Health Science Reports     Open Access   (Followers: 1)
Health Sciences and Disease     Open Access   (Followers: 1)
Health Sciences Review     Open Access  
Health Security     Hybrid Journal   (Followers: 1)
Healthcare Technology Letters     Open Access  
Hearing, Balance and Communication     Hybrid Journal   (Followers: 6)
Hearts     Open Access   (Followers: 1)
HEC Forum     Hybrid Journal   (Followers: 1)
Heighpubs Otolaryngology and Rhinology     Open Access  
Heilberufe     Hybrid Journal  
HeilberufeSCIENCE     Hybrid Journal  
Heilpflanzen     Hybrid Journal   (Followers: 3)
Helicobacter     Hybrid Journal  
HemaSphere     Open Access   (Followers: 2)
Hemoglobin     Hybrid Journal  
Hepatology, Medicine and Policy     Open Access  
HERALD of North-Western State Medical University named after I.I. Mechnikov     Open Access  
Herald of the Russian Academy of Sciences     Full-text available via subscription  
Herzschrittmachertherapie + Elektrophysiologie     Hybrid Journal  
Highland Medical Research Journal     Full-text available via subscription  
Hipertensión y Riesgo Vascular     Full-text available via subscription  
HIV Australia     Full-text available via subscription   (Followers: 3)
Homeopathy     Hybrid Journal   (Followers: 1)
Homoeopathic Links     Hybrid Journal  
Hong Kong Physiotherapy Journal     Open Access   (Followers: 14)
Horizonte Medico     Open Access  
Hormones : International Journal of Endocrinology and Metabolism     Hybrid Journal  
Hospital a Domicilio     Open Access  
Hospital Practices and Research     Open Access  
Hospital Topics     Hybrid Journal   (Followers: 1)
Hua Hin Sook Jai Klai Kangwon Journal     Open Access  
Huisarts en wetenschap     Hybrid Journal   (Followers: 4)
Human & Veterinary Medicine - International Journal of the Bioflux Society     Open Access   (Followers: 4)
Human Factors in Healthcare     Open Access  
Human Fertility     Hybrid Journal   (Followers: 4)
Humanidades Médicas     Open Access  
I.P. Pavlov Russian Medical Biological Herald     Open Access  
Iatreia     Open Access  
Ibnosina Journal of Medicine and Biomedical Sciences     Open Access  
IDCases     Open Access  
IEEE Journal of Biomedical and Health Informatics     Hybrid Journal   (Followers: 14)
IEEE Journal of Electromagnetics, RF and Microwaves in Medicine and Biology     Hybrid Journal  
IEEE Journal of Translational Engineering in Health and Medicine     Open Access   (Followers: 5)
IEEE Open Journal of Engineering in Medicine and Biology     Open Access   (Followers: 1)
IEEE Transactions on Medical Robotics and Bionics     Hybrid Journal   (Followers: 3)
IEEE/ACM Transactions on Computational Biology and Bioinformatics     Hybrid Journal   (Followers: 18)
IJID Regions     Open Access   (Followers: 1)
IJS Global Health     Open Access  
IJU Case Reports     Open Access  
iLiver     Open Access   (Followers: 2)
Im OP     Hybrid Journal  
Image Analysis & Stereology     Open Access   (Followers: 1)
IMAGING     Full-text available via subscription   (Followers: 1)
Imaging in Medicine     Open Access  
Imaging Journal of Clinical and Medical Sciences     Open Access   (Followers: 1)
Imam Journal of Applied Sciences     Open Access  
Indian Journal of Ayurveda and Integrative Medicine Klue     Open Access   (Followers: 3)
Indian Journal of Burns     Open Access   (Followers: 2)
Indian Journal of Clinical Medicine     Open Access  
Indian Journal of Community and Family Medicine     Open Access   (Followers: 2)
Indian Journal of Community Medicine     Open Access   (Followers: 1)
Indian Journal of Health Sciences and Biomedical Research KLEU     Open Access   (Followers: 2)
Indian Journal of Medical Microbiology     Open Access   (Followers: 1)
Indian Journal of Medical Research     Open Access   (Followers: 3)
Indian Journal of Medical Sciences     Open Access   (Followers: 2)
Indian Journal of Medical Specialities     Hybrid Journal  
Indian Journal of Otology     Open Access   (Followers: 1)
Indian Journal of Public Health     Open Access   (Followers: 1)
Indian Journal of Transplantation     Open Access  
Indian Spine Journal     Open Access  
Indo-Pacific Journal of Phenomenology     Open Access   (Followers: 1)
Indonesia Journal of Biomedical Science     Open Access   (Followers: 1)
Indonesian Biomedical Journal     Open Access  
Indonesian Journal for Health Sciences     Open Access   (Followers: 1)
Indonesian Journal of Medicine     Open Access  
Indonesian Journal of Tropical and Infectious Disease     Open Access  
Infant Observation: International Journal of Infant Observation and Its Applications     Hybrid Journal   (Followers: 1)
Inflammation     Hybrid Journal   (Followers: 3)
Inflammation Research     Hybrid Journal   (Followers: 4)
Info Diabetologie     Full-text available via subscription   (Followers: 1)
Infodir : Revista de Información científica para la Dirección en Salud     Open Access  
Informatics in Medicine Unlocked     Open Access  
Injury Prevention     Hybrid Journal   (Followers: 6)
InnovAiT     Hybrid Journal   (Followers: 1)
Innovare Journal of Health Science     Open Access  
Innovare Journal of Medical Science     Open Access  
Innovation in Aging     Open Access   (Followers: 1)
Inside Precision Medicine     Full-text available via subscription   (Followers: 3)
Insights in Biology and Medicine     Open Access  
Integrative and Complementary Therapies     Full-text available via subscription   (Followers: 3)
Integrative Medicine Insights     Open Access   (Followers: 1)
Integrative Medicine International     Open Access   (Followers: 1)
Integrative Medicine Research     Open Access   (Followers: 3)
Intellectual Disability Australasia     Full-text available via subscription   (Followers: 12)
Intelligence-Based Medicine     Open Access  
Intelligent Medicine     Open Access   (Followers: 1)
intensiv     Hybrid Journal   (Followers: 1)
interactive Journal of Medical Research     Open Access  
Interdisciplinary Perspectives on Infectious Diseases     Open Access  
Interdisciplinary Sciences : Computational Life Sciences     Hybrid Journal   (Followers: 2)
Internal Medicine     Open Access   (Followers: 1)
International Biomechanics     Open Access   (Followers: 1)
International Health     Hybrid Journal   (Followers: 5)
International Health Trends and Perspectives     Open Access  
International Journal for Numerical Methods in Biomedical Engineering     Hybrid Journal   (Followers: 2)
International Journal for Vitamin and Nutrition Research     Hybrid Journal   (Followers: 10)
International Journal of Academic Medicine     Open Access   (Followers: 1)
International Journal of Advance in Medical Science     Open Access  
International Journal of Advanced Medical and Health Research     Open Access  

IEEE/ACM Transactions on Computational Biology and Bioinformatics
Journal Prestige (SJR): 0.649
Citation Impact (citeScore): 2
Number of Followers: 18  
 
  Hybrid Journal (It can contain Open Access articles)
ISSN (Print) 1545-5963
Published by IEEE  [228 journals]
  • Editorial Deep Learning and Graph Embeddings for Network Biology

      Authors: Pietro Hiram Guzzi;Marinka Zitnik;
      Pages: 653 - 654
      Abstract: This special issue contains a multitude of high-quality manuscripts that cover a broad range of applications supporting the need to discuss and foster these advances in a systematic way, provide practical tools for practitioners, and describe new techniques that can facilitate biomedical discovery. We received 24 manuscripts, of which only 9 were accepted after peer review. Manuscripts came from all over the world (US, Italy, Australia, China) and covered both theoretical (e.g., development of novel methods) and practical (e.g., data integration or predicting metabolite associations) considerations of biological network analysis. The use of graph neural networks (GNNs) for prediction of biological associations has had a significant impact in this special issue.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • IMCHGAN: Inductive Matrix Completion With Heterogeneous Graph Attention
           Networks for Drug-Target Interactions Prediction

      Authors: Jin Li;Jingru Wang;Hao Lv;Zhuoxuan Zhang;Zaixia Wang;
      Pages: 655 - 665
      Abstract: Identification of targets among known drugs plays an important role in drug repurposing and discovery. Computational approaches for predicting drug–target interactions (DTIs) are highly desirable compared with traditional biological experiments because they are fast and inexpensive. Moreover, recent advances in systems biology have generated large-scale heterogeneous biological information network data, which offer opportunities for machine learning-based identification of DTIs. We present a novel Inductive Matrix Completion with Heterogeneous Graph Attention Network approach (IMCHGAN) for predicting DTIs. IMCHGAN first adopts a two-level neural attention mechanism to learn drug and target latent feature representations from the DTI heterogeneous network. The learned latent features are then fed into the Inductive Matrix Completion (IMC) prediction score model, which computes the best projection from drug space onto target space and outputs the DTI score via the inner product of the projected drug and target feature representations. IMCHGAN is an end-to-end neural network learning framework in which the parameters of both the prediction score model and the feature representation learning model are optimized simultaneously via backpropagation under the supervision of the observed known drug-target interaction data. We compare IMCHGAN with other state-of-the-art baselines on two real DTI experimental datasets. The results show that our method is superior to existing methods in terms of AUC and AUPR. Moreover, IMCHGAN also shows strong predictive power for novel (unknown) DTIs. All datasets and code can be obtained from https://github.com/ljatynu/IMCHGAN/.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
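
The scoring step mentioned in the IMCHGAN abstract (an inner product of projected drug and target features) can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `imc_scores`, the matrix names, and the random inputs are assumptions made for illustration, and in the paper the features and projections are learned jointly by backpropagation rather than drawn at random.

```python
import numpy as np

def imc_scores(drug_feats, target_feats, proj_drug, proj_target):
    """Score every drug-target pair as an inner product of projected features."""
    # Project both feature sets into a shared k-dimensional latent space.
    drug_latent = drug_feats @ proj_drug        # shape (n_drugs, k)
    target_latent = target_feats @ proj_target  # shape (n_targets, k)
    # Entry (i, j) is the inner product of drug i and target j in latent space.
    return drug_latent @ target_latent.T        # shape (n_drugs, n_targets)

# Hypothetical toy inputs: 4 drugs with 6 features, 3 targets with 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 6))   # drug feature matrix (assumed given)
Y = rng.normal(size=(3, 5))   # target feature matrix (assumed given)
W = rng.normal(size=(6, 2))   # drug-side projection (random stand-in)
H = rng.normal(size=(5, 2))   # target-side projection (random stand-in)
scores = imc_scores(X, Y, W, H)
```

Writing the score as `X @ (W @ H.T) @ Y.T` makes the low-rank structure explicit: the product `W @ H.T` plays the role of a single projection from drug space onto target space.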
       
  • Identifying Protein Subcellular Locations With Embeddings-Based node2loc

      Authors: Xiaoyong Pan;Lei Chen;Min Liu;Zhibin Niu;Tao Huang;Yu-Dong Cai;
      Pages: 666 - 675
      Abstract: Identifying protein subcellular locations is an important topic in protein function prediction. Interacting proteins may share similar locations. Thus, it is imperative to infer protein subcellular locations by taking protein-protein interactions (PPIs) into account. In this study, we present a network embedding-based method, node2loc, to identify protein subcellular locations. node2loc first learns distributed embeddings of proteins in a protein-protein interaction (PPI) network using node2vec. The learned embeddings are then fed into a recurrent neural network (RNN). To resolve the severe class imbalance among subcellular locations, the Synthetic Minority Over-sampling Technique (SMOTE) is applied to artificially synthesize proteins for minority classes. node2loc is evaluated on our constructed human benchmark dataset with 16 subcellular locations and yields a Matthews correlation coefficient (MCC) value of 0.800, which is superior to baseline methods. In addition, node2loc yields better performance on a yeast benchmark dataset with 17 locations. The results demonstrate that the representations learned from a PPI network have a certain discriminative ability for classifying protein subcellular locations. However, node2loc is a transductive method: it only works for proteins connected in a PPI network, and it needs to be retrained for new proteins. In addition, the PPI network needs to be annotated to some extent with location information. node2loc is freely available at https://github.com/xypan1232/node2loc.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
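
The SMOTE step named in the node2loc abstract can be sketched as follows. This is a simplified illustration of the general technique (interpolating between a minority sample and one of its nearest minority-class neighbors), not the node2loc code; the function name `smote_sample`, the parameter defaults, and the toy embeddings are assumptions.

```python
import numpy as np

def smote_sample(minority, n_new, k=3, seed=0):
    """Synthesize n_new points by interpolating between minority-class neighbors."""
    rng = np.random.default_rng(seed)
    minority = np.asarray(minority, dtype=float)
    # Pairwise Euclidean distances within the minority class.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)              # a point is not its own neighbor
    neighbors = np.argsort(dists, axis=1)[:, :k]  # k nearest neighbors per point
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(minority))           # pick a minority sample
        j = neighbors[i, rng.integers(k)]         # pick one of its k neighbors
        gap = rng.random()                        # interpolation factor in [0, 1)
        synthetic.append(minority[i] + gap * (minority[j] - minority[i]))
    return np.array(synthetic)

# Hypothetical 2-D embeddings of five minority-class proteins.
minority_embeddings = np.array(
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.5]]
)
new_points = smote_sample(minority_embeddings, n_new=10)
```

Each synthetic point lies on a segment between two real minority samples, so the new points stay inside the region the minority class already occupies. The sketch requires `k` to be smaller than the number of minority samples.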
       
  • Predicting Biomedical Interactions With Higher-Order Graph Convolutional
           Networks

      Authors: Kishan KC;Rui Li;Feng Cui;Anne R. Haake;
      Pages: 676 - 687
      Abstract: Biomedical interaction networks have incredible potential to be useful in the prediction of biologically meaningful interactions, identification of network biomarkers of disease, and the discovery of putative drug targets. Recently, graph neural networks have been proposed to effectively learn representations for biomedical entities and achieved state-of-the-art results in biomedical interaction prediction. These methods only consider information from immediate neighbors but cannot learn a general mixing of features from neighbors at various distances. In this paper, we present a higher-order graph convolutional network (HOGCN) to aggregate information from the higher-order neighborhood for biomedical interaction prediction. Specifically, HOGCN collects feature representations of neighbors at various distances and learns their linear mixing to obtain informative representations of biomedical entities. Experiments on four interaction networks, including protein-protein, drug-drug, drug-target, and gene-disease interactions, show that HOGCN achieves more accurate and calibrated predictions. HOGCN performs well on noisy, sparse interaction networks when feature representations of neighbors at various distances are considered. Moreover, a set of novel interaction predictions are validated by literature-based case studies.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
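
The idea of collecting neighbor features at several distances and mixing them linearly, as the HOGCN abstract describes, can be sketched as below. The abstract does not give the layer equations, so everything here is an assumption: the symmetric normalization with self-loops, the ReLU, the concatenation, and the names `normalized_adjacency` and `higher_order_layer`. Weights are random stand-ins for learned parameters.

```python
import numpy as np

def normalized_adjacency(adj):
    """Symmetrically normalize an adjacency matrix after adding self-loops."""
    a = adj + np.eye(adj.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a.sum(axis=1))
    return a * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def higher_order_layer(adj, feats, weights, mix):
    """Transform features propagated over 0..P hops, then mix them linearly."""
    a_hat = normalized_adjacency(adj)
    propagated = feats                      # hop 0: each node's own features
    blocks = []
    for hop, w in enumerate(weights):
        if hop > 0:
            propagated = a_hat @ propagated  # one more hop of propagation
        blocks.append(np.maximum(propagated @ w, 0.0))  # per-hop transform + ReLU
    # Concatenate the per-hop representations and mix them with one linear map.
    return np.concatenate(blocks, axis=1) @ mix

# Hypothetical toy graph: a path on three nodes, with 3-dimensional features.
rng = np.random.default_rng(0)
adj = np.array([[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]])
feats = rng.normal(size=(3, 3))
weights = [rng.normal(size=(3, 4)) for _ in range(3)]  # hops 0, 1, 2
mix = rng.normal(size=(12, 2))                         # mixes the 3 x 4 columns
out = higher_order_layer(adj, feats, weights, mix)
```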
       
  • Inferring Metabolite-Disease Association Using Graph Convolutional
           Networks

      Authors: Xiujuan Lei;Jiaojiao Tie;Yi Pan;
      Pages: 688 - 698
      Abstract: As is well known, biological experiments are time-consuming and laborious, so developing an effective computational model will help solve these problems. Most computational models rely on biological similarity and network-based methods that cannot consider the topological structures of metabolite-disease association graphs. We propose a novel method based on graph convolutional networks, named MDAGCN, to infer potential metabolite-disease associations. We first calculate three kinds of metabolite similarities and three kinds of disease similarities. The final disease and metabolite similarities are obtained by integrating the three kinds of similarities for each and filtering out noisy similarity values. The metabolite similarity network, the disease similarity network, and the known metabolite-disease association network are then used to construct a heterogeneous network. Finally, the heterogeneous network with rich information is fed into the graph convolutional networks to obtain new features of each node through aggregation of node information, so as to infer the potential associations between metabolites and diseases. Experimental results show that MDAGCN achieves more reliable results in cross validation and case studies than other existing methods.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Predicting the Survival of Cancer Patients With Multimodal Graph Neural
           Network

      Authors: Jianliang Gao;Tengfei Lyu;Fan Xiong;Jianxin Wang;Weimao Ke;Zhao Li;
      Pages: 699 - 709
      Abstract: In recent years, cancer patient survival prediction has become important for health problems worldwide and has attracted the attention of many researchers in medical information communities. Cancer patient survival prediction can be framed as a classification task, which is meaningful and challenging. Nevertheless, research in this field is still limited. In this work, we design a novel Multimodal Graph Neural Network (MGNN) framework for predicting cancer survival, which explores the features of real-world multimodal data such as gene expression, copy number alteration, and clinical data in a unified framework. Specifically, we first construct bipartite graphs between patients and multimodal data to explore the inherent relations. Subsequently, the embedding of each patient on the different bipartite graphs is obtained with a graph neural network. Finally, a multimodal fusion neural layer is proposed to fuse the medical features from the different data modalities. Comprehensive experiments have been conducted on real-world datasets, which demonstrate the superiority of our model, with significant improvements over state-of-the-art methods. Furthermore, the proposed MGNN is shown to be more robust on four other cancer datasets.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Integrating Molecular Graph Data of Drugs and Multiple -Omic Data of Cell
           Lines for Drug Response Prediction

      Authors: Giang T.T. Nguyen;Hoa D. Vu;Duc-Hau Le;
      Pages: 710 - 717
      Abstract: Previous studies have either learned drug features from string or numeric representations, which are not natural forms of drugs, or used only genomic data of cell lines for the drug response prediction problem. Here, we propose a deep learning model, GraOmicDRP, to learn drug features from their graph representation and integrate multiple -omic data of cell lines. In GraOmicDRP, drugs are represented as graphs of bindings among atoms; meanwhile, cell lines are depicted by not only genomic but also transcriptomic and epigenomic data. Graph convolutional and convolutional neural networks were used to learn the representations of drugs and cell lines, respectively. A combination of the two representations was then used to represent each drug-cell line pair. Finally, the response value of each pair was predicted by a fully connected network. Experimental results indicate that transcriptomic data perform best among single -omic data; meanwhile, the combinations of transcriptomic and other -omic data achieved the best performance overall in terms of both Root Mean Square Error and Pearson correlation coefficient. In addition, we show that GraOmicDRP outperforms some state-of-the-art methods, including ones integrating -omic data with drug information such as GraphDRP, and ones using -omic data without drug information such as DeepDR and MOLI.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
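
The two evaluation metrics named in the GraOmicDRP abstract, Root Mean Square Error and the Pearson correlation coefficient, can be computed as in the sketch below. The function names and the toy response values are assumptions for illustration; no result from the paper is reproduced here.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between observed and predicted responses."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def pearson(y_true, y_pred):
    """Pearson correlation: covariance divided by the product of standard deviations."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    dt = y_true - y_true.mean()
    dp = y_pred - y_pred.mean()
    return float((dt @ dp) / np.sqrt((dt @ dt) * (dp @ dp)))

# Hypothetical observed and predicted drug-response values.
observed = [0.2, 0.5, 0.9, 0.4]
predicted = [0.25, 0.45, 0.8, 0.5]
error = rmse(observed, predicted)
correlation = pearson(observed, predicted)
```

Lower RMSE and higher Pearson correlation both indicate better predictions, but they measure different things: RMSE penalizes absolute deviations, while Pearson correlation only checks whether predictions move linearly with the observed values.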
       
  • GEFA: Early Fusion Approach in Drug-Target Affinity Prediction

      Authors: Tri Minh Nguyen;Thin Nguyen;Thao Minh Le;Truyen Tran;
      Pages: 718 - 728
      Abstract: Predicting the interaction between a compound and a target is crucial for rapid drug repurposing. Deep learning has been successfully applied to the drug-target affinity (DTA) problem. However, previous deep learning-based methods do not model the direct interactions between drug and protein residues. This leads to inaccurate learning of the target representation, which may change due to drug binding effects. In addition, previous DTA methods learn protein representations solely from the small number of protein sequences in DTA datasets, neglecting proteins outside those datasets. We propose GEFA (Graph Early Fusion Affinity), a novel graph-in-graph neural network with an attention mechanism, to address the changes in target representation caused by binding effects. Specifically, a drug is modeled as a graph of atoms, which then serves as a node in a larger graph of the residues-drug complex. The resulting model is an expressive deep nested graph neural network. We also use pre-trained protein representations, building on recent efforts to learn contextualized protein representations. The experiments are conducted under different settings to evaluate scenarios such as novel drugs or targets. The results demonstrate the effectiveness of the pre-trained protein embedding and the advantages of GEFA in modeling the nested graph for drug-target interaction.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Netpro2vec: A Graph Embedding Framework for Biomedical Applications

      Authors: Ichcha Manipur;Mario Manzo;Ilaria Granata;Maurizio Giordano;Lucia Maddalena;Mario R. Guarracino;
      Pages: 729 - 740
      Abstract: The ever-increasing importance of structured data in different applications, especially in the biomedical field, has driven the need for reducing its complexity through projections into a more manageable space. The latest methods for learning features on graphs focus mainly on the neighborhood of nodes and edges. Graph kernels are methods capable of providing a representation that looks beyond the single-node neighborhood. However, they produce handcrafted features that are not suited to a generalized model. To reduce this gap, in this work we propose a neural embedding framework, based on probability distribution representations of graphs, named Netpro2vec. The goal is to look at basic node descriptions other than the degree, such as those induced by the Transition Matrix and Node Distance Distribution. Netpro2vec provides embeddings completely independent from the task and nature of the data. The framework is evaluated on synthetic and various real biomedical network datasets through a comprehensive experimental classification phase and is compared to well-known competitors.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Pattern Discovery in Multilayer Networks

      Authors: Yuanfang Ren;Aisharjya Sarkar;Pierangelo Veltri;Ahmet Ay;Alin Dobra;Tamer Kahveci;
      Pages: 741 - 752
      Abstract: Motivation: In bioinformatics, modeling complex cellular systems and simulating their behavior to identify significant molecular interactions is considered a relevant problem. Traditional methods model such complex systems using a single binary network. However, this model is inadequate to represent biological networks, as different sets of interactions can simultaneously take place for different interaction constraints (such as transcription regulation and protein interaction). Furthermore, biological systems may exhibit varying interaction topologies even for the same interaction type under different developmental stages or stress conditions. Therefore, models which consider biological systems as solitary interactions are inaccurate, as they fail to capture the complex behavior of cellular interactions within organisms. Identification and counting of recurrent motifs within a network is one of the fundamental problems in biological network analysis. Existing methods for motif counting on single network topologies are inadequate to capture patterns of molecular interactions that have significant changes in biological expression when identified across different organisms that are similar, or even time-varying networks within the same organism. That is, they fail to identify recurrent interactions because they consider a single snapshot of a network among a set of multiple networks. Therefore, we need methods geared towards studying multiple network topologies and the pattern conservation among them. Contributions: In this paper, we consider the problem of counting the number of instances of a user-supplied motif topology in a given multilayer network. We model interactions among a set of entities (e.g., genes) describing various conditions or temporal variation as multilayer networks. Thus, each layer is a separate network that shows the connectivity of the nodes under a unique network state. Existing motif counting and identification methods are limited to single network topologies, and thus cannot be directly applied to multilayer networks. We apply our model and algorithm to study frequent patterns in cellular networks that are common in varying cellular states under different stress conditions, where the cellular network topology under each stress condition describes a unique network layer. Results: We develop a methodology and corresponding algorithm based on the proposed model for motif counting in multilayer networks. We performed experiments on both real and synthetic datasets. We modeled the synthetic datasets under a wide spectrum of parameters, such as network size, density, and motif frequency. Results on synthetic datasets demonstrate that our algorithm finds motif embeddings with very high accuracy compared to existing state-of-the-art methods such as G-tries, ESU (FANMODE), and mfinder. Furthermore, we observe that our method runs from several times to several orders of magnitude faster than existing methods. For experiments on a real dataset, we consider the Escherichia coli (E. coli) transcription regulatory network under different experimental conditions. We observe that the genes selected by our method conserve functional characteristics under various stress conditions with very low false discovery rates. Moreover, the method is scalable to real networks in terms of both network size and number of layers.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • A Deep Learning Model for RNA-Protein Binding Preference Prediction Based
           on Hierarchical LSTM and Attention Network

      Authors: Zhen Shen;Qinhu Zhang;Kyungsook Han;De-Shuang Huang;
      Pages: 753 - 762
      Abstract: The attention mechanism can find important information in a sequence. The regions of an RNA sequence that can bind to proteins are more important than those that cannot. Neither conventional methods nor deep learning-based methods are good at learning this information. In this study, LSTM is used to extract the correlation features between different sites in an RNA sequence. We also use an attention mechanism to evaluate the importance of different sites in the RNA sequence. We obtain the optimal combination of k-mer length, k-mer stride window, k-mer sentence length, k-mer sentence stride window, and optimization function through hyperparameter experiments. The results show that our method performs better than other methods. We tested the effects of changes in k-mer vector length on model performance, and we show how model performance changes under various k-mer related parameter settings. Furthermore, we investigate the effect of the attention mechanism and RNA structure data on model performance.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
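
The k-mer parameters named in the abstract above (k-mer length, k-mer stride window, sentence length, sentence stride window) can be illustrated with a minimal sketch. The function names `kmers` and `sentences`, the example sequence, and the parameter values are hypothetical and are not taken from the paper; the sketch only shows what sliding-window tokenization of this kind usually looks like.

```python
def kmers(seq, k, stride):
    """Split a sequence into k-mers, advancing the window by `stride` each step."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, stride)]


def sentences(words, length, stride):
    """Group consecutive k-mers into fixed-length 'sentences' with their own stride."""
    return [words[i:i + length] for i in range(0, len(words) - length + 1, stride)]


# Hypothetical RNA fragment and parameter values, chosen only for illustration.
words = kmers("AUGGCUACGUAG", k=3, stride=1)
sents = sentences(words, length=4, stride=2)
```

Each sentence would then be embedded and passed to a sequence model; how the paper does that step is not described in the abstract.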
       
  • A New Method Based on Matrix Completion and Non-Negative Matrix
           Factorization for Predicting Disease-Associated miRNAs

      Authors: Zhen Gao;Yu-Tian Wang;Qing-Wen Wu;Lei Li;Jian-Cheng Ni;Chun-Hou Zheng;
      Pages: 763 - 772
      Abstract: Numerous studies have shown that microRNAs are associated with the occurrence and development of human diseases. Thus, studying disease-associated miRNAs is highly valuable for the prevention, diagnosis and treatment of diseases. In this paper, we propose a novel method based on matrix completion and non-negative matrix factorization (MCNMF) for predicting disease-associated miRNAs. Because information on miRNA similarities and disease similarities is inadequate, we calculated the latter via two models and introduced the Gaussian interaction profile kernel similarity. In addition, matrix completion (MC) was employed to further replenish the miRNA and disease similarities and improve the prediction performance. To reduce the sparsity of the miRNA-disease association matrix, the weighted K nearest known neighbors (WKNKN) method was used as a pre-processing step. We also utilized non-negative matrix factorization (NMF) with dual $L_{2,1}$-norm, graph Laplacian regularization, and Tikhonov regularization to effectively avoid overfitting during prediction. Finally, several experiments and a case study were carried out to evaluate the effectiveness and performance of the proposed MCNMF model. The results indicated that our method could reliably and effectively predict disease-associated miRNAs.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
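
The Gaussian interaction profile (GIP) kernel similarity mentioned in the abstract above is commonly defined as K(i, j) = exp(-gamma * ||IP(i) - IP(j)||^2), where IP(i) is row i of the binary association matrix and gamma is a bandwidth normalized by the mean squared norm of the profiles. The sketch below follows that common definition; the function name `gip_kernel`, the default `gamma_prime=1.0`, and the toy matrix are assumptions for illustration and may differ from the settings used in the paper.

```python
import math


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel over the rows of an association matrix.

    Assumes at least one profile is non-zero, so the bandwidth is well defined.
    """
    n = len(profiles)
    mean_sq_norm = sum(sum(x * x for x in p) for p in profiles) / n
    gamma = gamma_prime / mean_sq_norm
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            dist_sq = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            kernel[i][j] = math.exp(-gamma * dist_sq)
    return kernel


# Toy miRNA-by-disease association matrix (rows: miRNAs), invented for illustration.
assoc = [[1, 0, 1],
         [1, 0, 0],
         [0, 1, 0]]
K = gip_kernel(assoc)
```

Applying the same function to the transposed matrix would give the disease-side similarity.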
       
  • A Novel Graph Topology-Based GO-Similarity Measure for Signature Detection
           From Multi-Omics Data and its Application to Other Problems

      Authors: Koushik Mallick;Saurav Mallik;Sanghamitra Bandyopadhyay;Sikim Chakraborty;
      Pages: 773 - 785
      Abstract: Large-scale multi-omics data analysis and signature prediction have been a topic of interest in the last two decades. While various traditional clustering/correlation-based methods have been proposed, the overall prediction is not always satisfactory. To address these challenges, we propose a new approach that leverages Gene Ontology (GO) similarity combined with multi-omics data. A new GO similarity measure, ModSchlicker, is proposed, and its effectiveness is reviewed along with that of other standardized measures using various graph topology-based Information Content (IC) values of GO terms. The proposed measure is applied to PPI prediction. Furthermore, by involving GO similarity, we propose a new framework for stronger disease-based gene signature detection from multi-omics data. For the first objective, we predict interactions from various benchmark PPI datasets of yeast and human species. For the latter, gene expression and methylation profiles are used to identify Differentially Expressed and Methylated (DEM) genes. Thereafter, the GO similarity score along with a statistical method is used to determine the potential gene signature. Interestingly, the proposed method performs better (> 0.9 avg. accuracy and > 0.95 AUC) than other existing related methods in classifying the participating features (genes) of the signature. Moreover, the proposed method is highly useful in other prediction/classification problems for any kind of large-scale omics data.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • A Novel Method for Constructing Classification Models by Combining
           Different Biomarker Patterns

      Authors: Xin Huang;Zhenqian Liao;Bing Liu;Fengmei Tao;Benzhe Su;Xiaohui Lin;
      Pages: 786 - 794
      Abstract: Different biomarker patterns, such as those of molecular biomarkers and ratio biomarkers, have their own merits in clinical applications. In this study, a novel machine learning method for biomedical data analysis is proposed that constructs classification models by combining different biomarker patterns (CDBP). CDBP uses relative expression reversals to measure the discriminative ability of different biomarker patterns, and selects the pattern with the higher score for classifier construction. The decision boundary of CDBP can be characterized in a simple and biologically meaningful manner. The CDBP method was compared with eight state-of-the-art methods on eight gene expression datasets to test its performance. CDBP, with fewer features or ratio features, had the highest classification performance. Subsequently, CDBP was employed to extract crucial diagnostic information from a rat hepatocarcinogenesis metabolomics dataset. The potential biomarkers selected by CDBP provided better classification of hepatocellular carcinoma (HCC) and non-HCC stages than previous works in the animal model. Statistical analyses of these potential biomarkers in an independent human dataset confirmed their ability to discriminate between different liver diseases. These experimental results highlight the potential of CDBP for biomarker identification from high-dimensional biomedical datasets and demonstrate that it can be a useful tool for disease classification.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Ab-Initio Membrane Protein Amphipathic Helix Structure Prediction Using
           Deep Neural Networks

      Authors: Shi-Hao Feng;Chun-Qiu Xia;Pei-Dong Zhang;Hong-Bin Shen;
      Pages: 795 - 805
      Abstract: The amphipathic helix (AH) features the segregation of polar and nonpolar residues and plays important roles in many membrane-associated biological processes by interacting with both the lipid and the soluble phases. Although the AH structure has long been known, few ab initio machine learning-based prediction models have been reported, owing to the limited amount of training data. In this study, we report a new deep learning-based prediction model, which is composed of a residual neural network and the uneven-thresholds decision algorithm. It is constructed on 121 membrane proteins, 51640 residue samples in total, which are curated from an up-to-date membrane protein structure database. Through a rigorous 10-fold nested cross-validation experiment, we demonstrate that our model can achieve promising predictions and exceed current state-of-the-art approaches in this field. This presents a new avenue for accurately predicting AHs. Analysis of the contribution of the input residues and of some cases further reveals the high interpretability and the generalization of our model.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • An FVS-Based Approach to Attractor Detection in Asynchronous Random
           Boolean Networks

      Authors: Trinh Van Giang;Tatsuya Akutsu;Kunihiko Hiraishi;
      Pages: 806 - 818
      Abstract: Boolean networks (BNs) play a crucial role in modeling and analyzing biological systems. One of the central issues in the analysis of BNs is attractor detection, i.e., identification of all possible attractors. This problem becomes more challenging for large asynchronous random Boolean networks (ARBNs) because of the asynchronous and non-deterministic updating scheme. In this paper, we present and formally prove several relations between feedback vertex sets (FVSs) and the dynamics of BNs. From these relations, we propose an FVS-based method for detecting attractors in ARBNs. Our approach relies on the principle of removing arcs in the state transition graph to get a candidate set and on the reachability property to filter the candidate set. We formally prove the correctness of our method and show its efficiency by conducting experiments on real biological networks and randomly generated N-K networks. The obtained results are very promising since our method can handle large networks whose sizes are up to 101 without using any network reduction technique.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
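
To make the notion of an attractor under asynchronous updating concrete, the sketch below enumerates attractors of a tiny Boolean network by brute force: it builds the asynchronous state transition graph (one node updated per step) and reports the terminal strongly connected components. This is not the FVS-based method of the paper, which exists precisely to avoid this exhaustive enumeration; all function names and the two example networks are hypothetical.

```python
from itertools import product


def async_successors(state, update_fns):
    """States reachable in one step by updating exactly one node."""
    succ = set()
    for i, fn in enumerate(update_fns):
        new_val = fn(state)
        if new_val != state[i]:
            succ.add(state[:i] + (new_val,) + state[i + 1:])
    return succ


def reachable(start, update_fns):
    """All states reachable from `start`, including `start` itself."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in async_successors(stack.pop(), update_fns):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return frozenset(seen)


def attractors(update_fns):
    """Terminal strongly connected components of the state transition graph."""
    n = len(update_fns)
    reach = {s: reachable(s, update_fns) for s in product((0, 1), repeat=n)}
    # A state lies in an attractor iff every state it can reach can reach it back;
    # its reachable set is then exactly that attractor.
    return {r for s, r in reach.items() if all(s in reach[t] for t in r)}


# Positive feedback loop (x0 <- x1, x1 <- x0): two fixed points.
positive_loop = [lambda s: s[1], lambda s: s[0]]
# Negative feedback loop (x0 <- NOT x1, x1 <- x0): one cyclic attractor.
negative_loop = [lambda s: 1 - s[1], lambda s: s[0]]

fixed_points = attractors(positive_loop)
cycle = attractors(negative_loop)
```

The enumeration visits all 2^n states, which is why it only works for very small networks.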
       
  • CIPHER-SC: Disease-Gene Association Inference Using Graph Convolution on a
           Context-Aware Network With Single-Cell Data

      Authors: Yiding Zhang;Lyujie Chen;Shao Li;
      Pages: 819 - 829
      Abstract: Inference of disease-gene associations helps unravel the pathogenesis of diseases and contributes to treatment. Although many machine learning-based methods have been developed to predict causative genes, accurate association inference remains challenging. One major reason is the inaccurate feature selection and accumulation of error brought by the commonly used multi-stage training architecture. In addition, the existing methods do not incorporate cell-type-specific information and thus fail to study gene functions at a higher resolution. Therefore, we introduce single-cell transcriptome data and construct a context-aware network to integrate all data sources without bias. We then develop a graph convolution-based approach named CIPHER-SC to realize a complete end-to-end learning architecture. Our approach outperforms four state-of-the-art approaches in five-fold cross-validations on three distinct test sets with a best AUC of 0.9501, demonstrating its stable ability either to predict novel genes or to predict with a genetic basis. The ablation study shows that our complete end-to-end design and unbiased data integration boost the performance from 0.8727 to 0.9443 in AUC. The addition of single-cell data further improves the prediction accuracy and enriches our results for cell-type-specific genes. These results confirm the ability of CIPHER-SC to discover reliable disease genes. Our implementation is available at http://github.com/YidingZhang117/CIPHER-SC.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Data Perturbation and Recovery of Time Series Gene Expression Data

      Authors: Aisharjya Sarkar;Prabhat Mishra;Tamer Kahveci;
      Pages: 830 - 842
      Abstract: Cells, in order to regulate their activities, process transcripts by controlling which genes to transcribe and by what amount. The transcription level of a gene often changes over time. The rate of change of gene transcription varies between genes. It can even change for the same gene across different members of a population. Thus, for a given gene, it is important to study the transcription level not only at a single time point, but across multiple time points, to capture changes in patterns of gene expression that underlie several phenotypic or external factors. Such a dataset can be perturbed, leaving it with missing transcription values for different samples at different time points. In this paper, we define three data perturbation models that are significant with respect to random deletion. We also define a recovery method that recovers data loss in the perturbed dataset such that the error is minimized. Our experimental results show that the recovery method compensates for the loss caused by the perturbation models. We show by means of two measures, namely, normalized distance and Pearson’s correlation coefficient, that the distance between the original and perturbed dataset is greater than the distance between the original and recovered dataset.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
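
The evaluation idea in the abstract above (compare original vs. perturbed and original vs. recovered series with Pearson's correlation coefficient) can be sketched on a toy time series. The abstract does not specify the perturbation models or the recovery method, so everything below is an assumption made for illustration: missing values are marked with `None`, the perturbed series is zero-filled for comparison, and linear interpolation stands in for the recovery step.

```python
import math


def pearson(x, y):
    """Pearson's correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def interpolate(series):
    """Fill None entries by linear interpolation between the nearest known points.

    Stand-in recovery method (an assumption); needs at least one known value.
    """
    known = [i for i, v in enumerate(series) if v is not None]
    out = list(series)
    for i, v in enumerate(series):
        if v is not None:
            continue
        left = max((k for k in known if k < i), default=None)
        right = min((k for k in known if k > i), default=None)
        if left is None:
            out[i] = series[right]
        elif right is None:
            out[i] = series[left]
        else:
            w = (i - left) / (right - left)
            out[i] = series[left] + w * (series[right] - series[left])
    return out


# Toy expression levels over six time points; two values deleted by "perturbation".
original = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
perturbed = [1.0, 2.0, None, None, 5.0, 6.0]

zero_filled = [0.0 if v is None else v for v in perturbed]
recovered = interpolate(perturbed)

r_perturbed = pearson(original, zero_filled)
r_recovered = pearson(original, recovered)
```

On this toy series the recovered data correlate more strongly with the original than the zero-filled data do, which mirrors the direction of the comparison reported in the abstract.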
       
  • Deciphering Key Genes and miRNAs Associated With Hepatocellular Carcinoma
           via Network-Based Approach

      Authors: Sachin Bhatt;Prithvi Singh;Archana Sharma;Arpita Rai;Ravins Dohare;Shweta Sankhwar;Akash Sharma;Mansoor Ali Syed;
      Pages: 843 - 853
      Abstract: Hepatocellular carcinoma (HCC) is a common type of liver cancer and has a high mortality worldwide. Diagnosis, prognosis, and therapeutics are very poor because the molecular mechanism of progression of the disease is unclear. To unveil the molecular mechanism of progression of HCC, we extracted a large sample of mRNA expression levels from the GEO database; a total of 167 samples were used for the study, of which 115 were from HCC tumor tissue. This study aims to investigate the module of differentially expressed genes (DEGs) which are co-expressed only in HCC sample data but not in normal tissue samples. Thereafter, we identified the highly significant module of significantly co-expressed genes and formed a PPI network for these genes. Only six genes (namely, MSH3, DMC1, ALPP, IL10, ZNF223, and HSD17B7) were obtained after analysis of the PPI network. Of these six, only MSH3, DMC1, HSD17B7, and IL10 were found enriched in GO term and pathway enrichment analysis; these candidate genes were mainly involved in cellular process and in metabolic and catalytic activity, which promote the development and progression of HCC. Lastly, the composite 3-node FFL reveals the driver miRNAs and TFs associated with our key genes.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Deep Reinforcement Learning-Based Progressive Sequence Saliency Discovery
           Network for Mitosis Detection In Time-Lapse Phase-Contrast Microscopy
           Images

      Authors: Yu-Ting Su;Yao Lu;Mei Chen;An-An Liu;
      Pages: 854 - 865
      Abstract: Mitosis detection plays an important role in the analysis of cell status and behavior and is therefore widely utilized in biological research and medical applications. In this article, we propose a deep reinforcement learning-based progressive sequence saliency discovery network (PSSD) for mitosis detection in time-lapse phase contrast microscopy images. By discovering the salient frames where the cell state changes in the sequence, PSSD can more effectively model the mitosis process for mitosis detection. We formulate the discovery of salient frames as a Markov Decision Process (MDP) that progressively adjusts the selection positions of salient frames in the sequence, and further leverage deep reinforcement learning to learn the policy in the salient frame discovery process. The proposed method consists of two parts: 1) the saliency discovery module, which selects the salient frames from the input cell image sequence by progressively adjusting the selection positions of salient frames; 2) the mitosis identification module, which takes a sequence of salient frames and performs temporal information fusion for mitotic sequence classification. Since the policy network of the saliency discovery module is trained under the guidance of the mitosis identification module, PSSD can comprehensively explore the salient frames that are beneficial for mitosis detection. To our knowledge, this is the first work to apply deep reinforcement learning to the mitosis detection problem. In the experiments, we evaluate the proposed method on the largest mitosis detection dataset, C2C12-16. Experimental results show that, compared with state-of-the-art methods, the proposed method achieves significant improvement for both mitosis identification and temporal localization on C2C12-16.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Designing Uncorrelated Address Constrain for DNA Storage by DMVO Algorithm

      Authors: Ben Cao;Xue Ii;Xiaokang Zhang;Bin Wang;Qiang Zhang;Xiaopeng Wei;
      Pages: 866 - 877
      Abstract: At present, huge amounts of data are being produced every second, a situation that will gradually overwhelm current storage technology. DNA is a storage medium that features high storage density and long-term stability and is now considered to be a feasible storage solution. Errors are easily made during the sequencing and synthesis of DNA, however. In order to reduce the error rate, a novel uncorrelated address constraint is reported, and a Damping Multi-Verse Optimizer (DMVO) algorithm is proposed to construct a DNA coding set, which is used as the non-payload. The DMVO algorithm exchanges objects through black/white holes in order to achieve a stable state and adds damping factors as disturbances. Compared with previous work, the coding set obtained by the DMVO algorithm is larger in size and of higher quality. The results of this study reveal that the size of the DNA storage coding set obtained by the DMVO algorithm increased by 4–16 percent, and the variance of the melting temperature decreased by 3–18 percent.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Detecting Disease-Associated SNP-SNP Interactions Using Progressive
           Screening Memetic Algorithm

      Authors: Boxin Guan;Yuhai Zhao;Ying Yin;Yuan Li;
      Pages: 878 - 887
      Abstract: Hundreds of thousands of single nucleotide polymorphisms (SNPs) are currently available for genome-wide association study (GWAS). Detecting disease-associated SNP-SNP interactions is considered an important way to capture the underlying genetic causes of complex diseases. In the combinatorially explosive search space, evolutionary algorithms are promising for solving this difficult problem because of their controllable time complexity. However, in existing evolutionary algorithms, some possible SNP-SNP interactions are evaluated multiple times by the fitness function. Such reevaluations not only waste computing resources but also make these algorithms prone to falling into local optima. To tackle this drawback, a progressive screening memetic algorithm (PSMA) is proposed in this paper. PSMA first represents all possible SNP-SNP interactions in a constructed graph. Then, the proposed algorithm uses the progressive screening strategy to guarantee, by reducing the constructed graph, that every possible SNP-SNP interaction is evaluated at most once. Furthermore, two types of local search algorithms are introduced to enhance the detecting power of PSMA. For detecting disease-associated SNP-SNP interactions, experimental results show that our proposed method outperforms other existing state-of-the-art methods in terms of accuracy and time.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • DNA Privacy: Analyzing Malicious DNA Sequences Using Deep Neural Networks

      Authors: Ho Bae;Seonwoo Min;Hyun-Soo Choi;Sungroh Yoon;
      Pages: 888 - 898
      Abstract: Recent advances in next-generation sequencing technologies have led to the successful insertion of video information into DNA using synthesized oligonucleotides. Several attempts have been made to embed larger data into living organisms. This process of embedding messages is called steganography, and it is used for hiding and watermarking data to protect intellectual property. In contrast, steganalysis is a group of algorithms that serves to detect hidden information in covert media. Various methods have been developed to detect messages embedded in conventional covert channels. However, conventional steganalysis algorithms are mostly limited to common covert media. The most common detection approaches, such as frequency analysis-based methods, often overlook important signals when directly applied to DNA steganography and are easily bypassed by recently developed steganography techniques. To address the limitations of conventional approaches, a sequence-learning-based malicious DNA sequence analysis method based on neural networks has been proposed. The proposed method learns intrinsic distributions and identifies distribution variations using a classification score to predict whether a sequence is a coding or non-coding sequence. Based on our experiments and results, we have developed a framework to safeguard security against DNA steganography.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Dual Triggered Correspondence Topic (DTCT) model for MeSH annotation

      Authors: Seonho Kim;Juntae Yoon;
      Pages: 899 - 911
      Abstract: Accurate Medical Subject Headings (MeSH) annotation is an important issue for researchers in terms of effective information retrieval and knowledge discovery in the biomedical literature. We have developed a powerful dual triggered correspondence topic (DTCT) model for MeSH annotated articles. In our model, two types of data are assumed to be generated by the same latent topic factors, and words in abstracts and titles serve as descriptions of the other type, MeSH terms. Our model allows the generation of MeSHs in abstracts to be triggered either by general document topics or by document-specific “special” word distributions in a probabilistic manner, allowing for a trade-off between the benefits of topic-based abstraction and specific word matching. In order to relax the topic influences of non-topical words or domain-frequent words in the text description, we integrated the discriminative feature of Okapi BM25 into the word sampling probability. This allows the model to choose keywords that stand out from others in order to generate MeSH terms. We further incorporate prior knowledge about relations between words and MeSH terms in DTCT using the phi-coefficient to improve topic coherence. We demonstrated the model's usefulness in automatic MeSH annotation. Our model obtained a 0.62 F-score on the 150,00 MEDLINE test set and showed strength in recall rate. In particular, it yielded competitive performance in an integrated probabilistic environment without additional post-processing for filtering MeSHs.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
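
The Okapi BM25 weighting that the abstract above folds into its word sampling probability can be sketched as a standalone scoring function. The sketch uses the widely used non-negative IDF variant, log((N - df + 0.5) / (df + 0.5) + 1), with conventional defaults k1 = 1.5 and b = 0.75; these choices, the function name `bm25_scores`, and the toy documents are assumptions, and the abstract does not say how the weights enter the topic model.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query terms with Okapi BM25."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: in how many documents each term appears.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            length_norm = k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Toy tokenized abstracts, invented for illustration.
docs = [["gene", "expression", "liver"],
        ["protein", "binding"],
        ["gene", "gene", "mutation"]]
scores = bm25_scores(["gene"], docs)
```

Terms that are frequent in one document but rare across the collection receive the highest weights, which is the "stand out from others" behavior the abstract relies on.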
       
  • Evaluation of Existing Methods for High-Order Epistasis Detection

      Authors: Christian Ponte-Fernández;Jorge González-Domínguez;Antonio Carvajal-Rodríguez;María J. Martín;
      Pages: 912 - 926
      Abstract: Finding epistatic interactions among loci when expressing a phenotype is a widely employed strategy to understand the genetic architecture of complex traits in GWAS. The abundance of methods dedicated to the same purpose, however, makes it increasingly difficult for scientists to decide which method is most suitable for their studies. This work compares the different epistasis detection methods published during the last decade in terms of runtime, detection power and type I error rate, with a special emphasis on high-order interactions. Results show that in terms of detection power, the only methods that perform well across all experiments are the exhaustive methods, although their computational cost may be prohibitive in large-scale studies. Regarding non-exhaustive methods, not one could consistently find epistatic interactions when marginal effects are absent. If marginal effects are present, there are methods that perform well for high-order interactions, such as BADTrees, FDHE-IW, SingleMI or SNPHarvester. As for false-positive control, only SNPHarvester, FDHE-IW and DCHE show good results. The study concludes that there is no single epistasis detection method to recommend in all scenarios. Authors should prioritize exhaustive methods when sufficient computational resources are available considering the data set size, and resort to non-exhaustive methods when the analysis time is prohibitive.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • GPU Accelerated Drug Application on Signaling Pathways Containing Multiple
           Faults Using Boolean Networks

      Authors: Tapan Chowdhury;Susanta Chakraborty;Argha Nandan;
      Pages: 927 - 939
      Abstract: Cell growth is governed by the flow of information from growth factors to transcription factors. This flow involves protein-protein interactions known as a signaling pathway, which triggers cell division. In the presence of malfunctions, the biological network leads to rapid cell division without the necessary input conditions. The effect of these malfunctions, or faults, can be observed if they are simulated explicitly in the Boolean derivative of the biological network. The consequences thus produced can be nullified to a large extent by applying a reduced combination of drugs. This paper provides an insight into the behavior of the signaling pathway in the presence of multiple concurrent malfunctions. First, we simulate the behavior of malfunctions in the Boolean networks. Next, we apply the drug therapy to reduce the effects of the malfunctions. In our approach, we introduce a parameter called probabilistic_score, which identifies the reduced drug combinations without prior knowledge of the malfunctions, making it more beneficial in realistic cancerous conditions. Combinations of different custom drug inhibition points are chosen to produce more efficient results than known drugs. Our approach is significantly faster because GPU acceleration is used while modeling the multiple faults/malfunctions in the Boolean networks.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Functional Genomics Platform, A Cloud-Based Platform for Studying
           Microbial Life at Scale

      Authors: Edward E. Seabolt;Gowri Nayar;Harsha Krishnareddy;Akshay Agarwal;Kristen L. Beck;Ignacio Terrizzano;Eser Kandogan;Mark Kunitomi;Mary Roth;Vandana Mukherjee;James H. Kaufman;
      Pages: 940 - 952
      Abstract: The rapid growth in biological sequence data is revolutionizing our understanding of genotypic diversity and challenging conventional approaches to informatics. With the increasing availability of genomic data, traditional bioinformatic tools require substantial computational time and the creation of ever-larger indices each time a researcher seeks to gain insight from the data. To address these challenges, we pre-computed important relationships between biological entities spanning the Central Dogma of Molecular Biology and captured this information in a relational database. The database can be queried across hundreds of millions of entities and returns results in a fraction of the time required by traditional methods. In this paper, we describe Functional Genomics Platform (formerly known as OMXWare), a comprehensive database relating genotype to phenotype for bacterial life. Continually updated, the Functional Genomics Platform today contains data derived from 200,000 curated, self-consistently assembled genomes. The database stores functional data for over 68 million genes, 52 million proteins, and 239 million domains with associated biological activity annotations from Gene Ontology, KEGG, MetaCyc, and Reactome. The Functional Genomics Platform maps all of the many-to-many connections between each biological entity including the originating genome, gene, protein, and protein domain. Various microbial studies, from infectious disease to environmental health, can benefit from the rich data and connections. We describe the data selection, the pipeline to create and update the Functional Genomics Platform, and the developer tools (Python SDK and REST APIs) which allow researchers to efficiently study microbial life at scale.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Identifying Gene Signatures for Cancer Drug Repositioning Based on Sample
           Clustering

      Authors: Fei Wang;Yulian Ding;Xiujuan Lei;Bo Liao;Fang-Xiang Wu;
      Pages: 953 - 965
      Abstract: Drug repositioning is an important approach for drug discovery. Computational drug repositioning approaches typically use a gene signature to represent a particular disease and connect the gene signature with drug perturbation profiles. Although disease samples, especially from cancer, may be heterogeneous, most existing methods treat them as a homogeneous set when identifying differentially expressed genes (DEGs) to determine a gene signature. As a result, some genes that should be in a gene signature may be averaged out. In this study, we propose a new framework to identify gene signatures for cancer drug repositioning based on sample clustering (GS4CDRSC). GS4CDRSC first groups samples into several clusters based on their gene expression profiles. Second, an existing method is applied to the samples in each cluster to generate a list of DEGs. Then a weighting approach is used to identify an integrated gene signature from all the lists of DEGs. The integrated gene signature is connected with drug perturbation profiles in the Connectivity Map (CMap) database to generate a list of drug candidates. GS4CDRSC has been tested with several cancer datasets and existing methods. The computational results show that GS4CDRSC outperforms those methods without the sample clustering and weighting approaches in terms of both number and rate of predicted known drugs for specific cancers.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Immuno-Informatics Based Peptides: An Approach for Vaccine Development
           Against Outer Membrane Proteins of Pseudomonas Genus

      Authors: Meshari Alazmi;Olaa Motwalli;
      Pages: 966 - 973
      Abstract: The Pseudomonas genus is among the top nosocomial pathogens known to date. Being highly opportunistic, members of the Pseudomonas genus are most commonly connected with nosocomial infections of the urinary tract and ventilator-associated pneumonia. Nevertheless, vaccine development for this pathogenic genus is slow because information on the functional mechanisms correlated with immunity is lacking. In the present work, an immunoinformatics pipeline based on epitope-based peptide design is used for vaccine development, which can result in a crucial immune response against outer membrane proteins of the Pseudomonas genus. A total of 127 outer membrane proteins were analysed, and three sequences among them were found to produce non-allergic, highly antigenic T-cell and B-cell epitopes that show good binding affinity towards class II HLA molecules. After rigorous screening using docking, simulation, and modelling techniques, we obtained one nonameric peptide (WLLATGIFL) as a good vaccine candidate. The predicted epitopes need to be further validated for use as a vaccine. This work paves a new way, with extensive therapeutic application, against the Pseudomonas genus and its associated diseases.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Inference of a Dynamic Aging-related Biological Subnetwork via Network
           Propagation

      Authors: Khalique Newaz;Tijana Milenković;
      Pages: 974 - 988
      Abstract: Gene expression (GE) data capture valuable condition-specific information (“condition” can mean a biological process, disease stage, age, patient, etc.). However, GE analyses ignore physical interactions between gene products, i.e., proteins. Because proteins function by interacting with each other, and because biological networks (BNs) capture these interactions, BN analyses are promising. However, current BN data fail to capture condition-specific information. Recently, GE and BN data have been integrated using network propagation (NP) to infer condition-specific BNs. However, existing NP-based studies result in a static condition-specific subnetwork, even though cellular processes are dynamic. A dynamic process of our interest is human aging. We use prominent existing NP methods in a new task of inferring a dynamic rather than static condition-specific (aging-related) subnetwork. Then, we study evolution of network structure with age – we identify proteins whose network positions significantly change with age and predict them as new aging-related candidates. We validate the predictions via, e.g., functional enrichment analyses and literature search. Dynamic network inference via NP yields higher prediction quality than the only existing method for inferring a dynamic aging-related BN, which does not use NP. Our data and code are available at https://nd.edu/~cone/dynetinf.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
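      Illustration: the abstract above does not say which network propagation methods are used, so the following is only a minimal sketch of one common variant, random walk with restart, written in plain Python. The function name `propagate`, the toy graph, and the seed scores are hypothetical and not taken from the paper. Running it once per age group, with age-specific seed scores, is one way a sequence of condition-specific score vectors could be produced.

```python
def propagate(adjacency, seed_scores, restart=0.5, tol=1e-9, max_iter=1000):
    """Random walk with restart on an undirected graph (illustrative only).

    adjacency   -- dict: node -> set of neighbouring nodes (hypothetical toy input)
    seed_scores -- dict: node -> non-negative starting score, e.g. derived
                   from gene expression at one age (assumption)
    Returns a dict: node -> propagated score; the scores sum to 1.
    """
    nodes = sorted(adjacency)
    total = sum(seed_scores.get(n, 0.0) for n in nodes)
    if total <= 0:
        raise ValueError("at least one seed score must be positive")
    # Normalise the seed scores into a restart distribution.
    p0 = {n: seed_scores.get(n, 0.0) / total for n in nodes}
    p = dict(p0)
    for _ in range(max_iter):
        # Restart step: part of the mass returns to the seeds.
        nxt = {n: restart * p0[n] for n in nodes}
        # Walk step: the remaining mass is split evenly among neighbours.
        for n in nodes:
            nbrs = adjacency[n]
            if not nbrs:
                nxt[n] += (1 - restart) * p[n]  # isolated node keeps its mass
                continue
            share = (1 - restart) * p[n] / len(nbrs)
            for m in nbrs:
                nxt[m] += share
        delta = sum(abs(nxt[n] - p[n]) for n in nodes)
        p = nxt
        if delta < tol:
            break
    return p


if __name__ == "__main__":
    # Toy path graph A - B - C - D with a single seed at A.
    toy = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
    for node, score in propagate(toy, {"A": 1.0}).items():
        print(node, round(score, 4))
```

      Nodes closer to the seed receive higher scores; thresholding the scores would give a subnetwork, but how the paper selects its subnetwork is not stated in the abstract.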
       
  • LDA-LNSUBRW: lncRNA-Disease Association Prediction Based on Linear
           Neighborhood Similarity and Unbalanced bi-Random Walk

      Authors: Guobo Xie;Jiawei Jiang;Yuping Sun;
      Pages: 989 - 997
      Abstract: An increasing number of experiments show that lncRNAs are involved in many biological processes, and their mutations and disorders are associated with many diseases. However, verifying the relationships between lncRNAs and diseases is time consuming and laborious. Searching for effective computational methods will contribute to our understanding of the underlying mechanisms of disease and to identifying biomarkers of diseases. Therefore, we proposed a method called lncRNA-disease association prediction based on linear neighborhood similarity and unbalanced bi-random walk (LDA-LNSUBRW). Given that the known lncRNA-disease associations are rare, a pretreatment step should be performed to obtain the interaction possibility of unknown cases, so as to help us predict the potential associations. In the framework of leave-one-out cross-validation (LOOCV) and fivefold cross-validation (5-fold CV), LDA-LNSUBRW achieved effective performance with AUC of 0.8874 and 0.8632 ± 0.0051, respectively. The experimental results in this paper show that the proposed method is superior to five other state-of-the-art methods. In addition, case studies of three diseases (lung cancer, breast cancer, and osteosarcoma) were carried out to illustrate that LDA-LNSUBRW could predict the relevant lncRNAs.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
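      Illustration: the abstract above reports performance as AUC under cross-validation. As a reminder of what that number measures, here is a minimal sketch of the rank-based (Mann-Whitney) AUC in plain Python. The function name `auc` and the example scores are hypothetical; this is not the authors' evaluation code, and it does not reproduce their LOOCV protocol.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (illustrative only).

    scores -- predicted association scores, higher means more likely
    labels -- 1 for a known association, 0 otherwise
    The AUC is the fraction of (positive, negative) pairs in which the
    positive is scored higher; ties count as half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical scores for four held-out lncRNA-disease pairs.
    print(auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

      The pairwise form is quadratic in the number of examples; it is chosen here for readability, not speed.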
       
  • Learning Useful Representations of DNA Sequences From ChIP-Seq Datasets
           for Exploring Transcription Factor Binding Specificities

      Authors: Lijun Quan;Xiaoyu Sun;Jian Wu;Jie Mei;Liqun Huang;Ruji He;Liangpeng Nie;Yu Chen;Qiang Lyu;
      Pages: 998 - 1008
      Abstract: Deep learning has been successfully applied to surprisingly different domains. Researchers and practitioners are employing trained deep learning models to enrich our knowledge. Transcription factors (TFs) are essential for regulating gene expression in all organisms by binding to specific DNA sequences. Here, we designed a deep learning model named SemanticCS (Semantic ChIP-seq) to predict TF binding specificities. We trained our learning model on an ensemble of ChIP-seq datasets (Multi-TF-cell) to learn useful intermediate features across multiple TFs and cells. To interpret these feature vectors, visualization analysis was used. Our results indicate that these learned representations can be used to train shallow machines for other tasks. Using diverse experimental data and evaluation metrics, we show that SemanticCS outperforms other popular methods. In addition, from experimental data, SemanticCS can help to identify the substitutions that cause regulatory abnormalities and to evaluate the effect of substitutions on the binding affinity for the RXR transcription factor. The online server for SemanticCS is freely available at http://qianglab.scst.suda.edu.cn/semanticCS/.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Linear Time Reconciliation With Bounded Transfers of Genes

      Authors: Daniele Tavernelli;Tiziana Calamoneri;Paola Vocca;
      Pages: 1009 - 1017
      Abstract: Tree reconciliation is a general framework for investigating the mutual influence between gene and species trees according to the parsimony principle, that is, to each evolutionary event a cost is assigned and the goal is to find a reconciliation of minimum total cost. The resulting optimization problem is known as the reconciliation problem. Usually, the considered events are: co-divergence, gene Duplication, horizontal gene Transfer, and gene Loss (DTL model), while in a more conservative setting, gene transfers are not allowed (DL model). The reconciliation problem requires, in the DL model, time linear in the dimension of the two trees and at least quadratic time in the DTL model. Hence, it is reasonable to argue that the introduction of horizontal gene transfers increases the complexity of the problem. Instead, we introduce horizontal gene transfers with some constraints and prove that the problem is still linear in the dimension of the trees. Namely, we allow gene transfers of length bounded by $k=2$, on the basis of the observation that transfers are more likely to occur between closely related species than between distantly related ones. Then we extend the same reasoning to the case in which $k>2$ under additional constraints. In this paper we also study another problem related to reconciliation, namely optimally rooting one of the two trees when it is unrooted, and we prove similar results for it. The relevance of this contribution lies in showing that, in the transition from the DL to the DTL model, the computational time does not increase suddenly to quadratic but remains linear in the case when gene transfers are very short (i.e., happening between very close genes).
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Local R-Symmetry Co-Occurrence: Characterising Leaf Image Patterns for
           Identifying Cultivars

      Authors: Bin Wang;Yongsheng Gao;Xiaohui Yuan;Shengwu Xiong;
      Pages: 1018 - 1031
      Abstract: Leaf image recognition techniques have been actively researched for plant species identification. However, it remains unclear whether analysing leaf patterns can provide sufficient information for further differentiating cultivars. This paper reports our attempt at cultivar recognition from leaves as a general very fine-grained pattern recognition problem, which is not only a challenging research problem but also important for cultivar evaluation, selection and production in agriculture. We propose a novel local R-symmetry co-occurrence method for characterising discriminative local symmetry patterns to distinguish subtle differences among cultivars. Through scalable and moving R-relation radius pairs, we generate a set of radius symmetry co-occurrence matrices (RsCoM) and their measures for describing the local symmetry properties of interior regions. By varying the size of the radius pair, the RsCoM measures local R-symmetry co-occurrence from global/coarse to fine scales. A new two-phase strategy of analysing the distribution of local RsCoM measures is designed to match the multiple scale appearance symmetry pattern distributions of similar cultivar leaf images. We constructed three leaf image databases, SoyCultivar, CottCultivar, and PeanCultivar, for an extensive experimental evaluation on recognition across soybean, cotton and peanut cultivars. Encouraging experimental results of the proposed method in comparison with the state-of-the-art leaf species recognition methods demonstrate the effectiveness of the proposed method for cultivar identification, which may advance the research in leaf recognition from species to cultivar.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Multi-Modal Classification for Human Breast Cancer Prognosis Prediction:
           Proposal of Deep-Learning Based Stacked Ensemble Model

      Authors: Nikhilanand Arya;Sriparna Saha;
      Pages: 1032 - 1041
      Abstract: Breast cancer is a highly aggressive type of cancer generally formed in the cells of the breast. Despite significant advances in the treatment of primary breast cancer in the last decade, there is a dire need for an accurate predictive model for breast cancer prognosis prediction. Researchers from various disciplines are working together to develop methods to save people from this fatal disease. A good predictive model can help in correct prognosis prediction of breast cancer. Such accurate prediction can have several benefits, such as detecting cancer at an early stage and sparing patients unnecessary treatment and the related medical expenses. Previous works rely mostly on uni-modal data (selected gene expression) for predictive model design. In recent years, however, multi-modal cancer data sets have become available (gene expression, copy number alteration and clinical). Motivated by the enhancement of deep-learning based models, in the current study, we propose to use some deep-learning based predictive models in a stacked ensemble framework to improve the prognosis prediction of breast cancer from available multi-modal data sets. One of the unique advantages of the proposed approach lies in the architecture of the model. It is a two-stage model. Stage one uses a convolutional neural network for feature extraction, while stage two uses the extracted features as input to the stack-based ensemble model. The predictive performance evaluated using different performance measures shows that this model produces better results than existing approaches. This model results in an AUC value of 0.93 and accuracy of 90.2 percent at medium stringency level (Specificity = 95 percent and threshold = 0.45). Keras 2.2.1, along with Tensorflow 1.12, is used for implementing the source code of the model. The source code can be downloaded from Github: https://github.com/nikhilaryan92/BreastCancer.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • NIMCE: A Gene Regulatory Network Inference Approach Based on Multi Time
           Delays Causal Entropy

      Authors: Haonan Feng;Ruiqing Zheng;Jianxin Wang;Fang-Xiang Wu;Min Li;
      Pages: 1042 - 1049
      Abstract: Gene regulatory networks (GRNs) are involved in various biological processes, such as cell cycle, differentiation and apoptosis. The existing large amount of expression data, especially the time-series expression data, provide a chance to infer GRNs by computational methods. These data can reveal the dynamics of gene expression and imply the regulatory relationships among genes. However, identifying indirect regulatory links is still a big challenge, as most studies treat time points as independent observations while ignoring the influences of time delays. In this study, we propose a GRN inference method based on an information-theoretic measure, called NIMCE. NIMCE incorporates the transfer entropy to measure the regulatory links between each pair of genes, then applies the causation entropy to filter indirect relationships. In addition, NIMCE applies multiple time delays to identify indirect regulatory relationships from candidate genes. Experiments on simulated and colorectal cancer data show that NIMCE outperforms other competing methods. All data and code used in this study are publicly available at https://github.com/CSUBioGroup/NIMCE.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
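      Illustration: the abstract above names transfer entropy with time delays as the pairwise measure but gives no estimator. The following is a minimal sketch, assuming discretized expression levels and a simple plug-in (count-based) estimate of TE(X -> Y) = I(Y_t ; X_{t-delay} | Y_{t-1}). The function name `transfer_entropy` and the toy series are hypothetical; NIMCE's actual estimator and its causation-entropy filtering step are not shown and may differ.

```python
from collections import Counter
from math import log2


def transfer_entropy(x, y, delay=1):
    """Plug-in transfer entropy from x to y, in bits (illustrative only).

    x, y  -- equal-length sequences of discrete symbols, e.g. expression
             levels binned into 0/1 (the binning is an assumption)
    delay -- time lag applied to the source series x (must be >= 1)
    """
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if delay < 1:
        raise ValueError("delay must be at least 1")
    # One observation per time point: (y now, y one step back, x `delay` back).
    triples = [(y[t], y[t - 1], x[t - delay]) for t in range(delay, len(y))]
    n = len(triples)
    if n == 0:
        raise ValueError("series too short for this delay")
    c_full = Counter(triples)                          # (y_t, y_prev, x_lag)
    c_cond = Counter((b, c) for _, b, c in triples)    # (y_prev, x_lag)
    c_pair = Counter((a, b) for a, b, _ in triples)    # (y_t, y_prev)
    c_prev = Counter(b for _, b, _ in triples)         # y_prev
    te = 0.0
    for (a, b, c), k in c_full.items():
        p_with_source = k / c_cond[(b, c)]             # p(y_t | y_prev, x_lag)
        p_history_only = c_pair[(a, b)] / c_prev[b]    # p(y_t | y_prev)
        te += (k / n) * log2(p_with_source / p_history_only)
    return te


if __name__ == "__main__":
    # Toy example: y copies x with a lag of one time step.
    x = [0, 0, 1, 1, 0, 0, 1, 1, 0]
    y = [0] + x[:-1]
    for d in (1, 2, 3):
        print("delay", d, "TE(x->y) =", round(transfer_entropy(x, y, delay=d), 4))
```

      Scanning several delays and keeping the largest value is one plausible reading of "multiple time delays", but the abstract does not say how NIMCE combines them. Plug-in estimates are biased upward on short series, so real use would need a significance test or bias correction.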
       
  • Optimal Robust Search for Parameter Values of Qualitative Models of Gene
           Regulatory Networks

      Authors: Liliana Ironi;Ettore Lanzarone;
      Pages: 1050 - 1063
      Abstract: Computational and mathematical models are a must for the in silico analysis or design of Gene Regulatory Networks (GRN) as they offer a theoretical context to deeply address biological regulation. We have proposed a framework where models of network dynamics are expressed through a class of nonlinear and temporal multiscale Ordinary Differential Equations (ODE). To find out models that disclose network structures underlying an observed or desired network behavior, and parameter values that enable the candidate models to reproduce such behavior, we follow a reasoning cycle that alternates procedures for model selection and parameter refinement. Plausible network models are first selected via qualitative simulation, and next their parameters are given quantitative values such that the ODE model solution reproduces the specified behavior. This paper gives algorithms to tackle the parameter refinement problem formulated as an optimization problem. We search, within the parameter space symbolically expressed, for the largest hypersphere whose points correspond to parameter values such that the ODE solution gives an instance of the given qualitative trajectory. Our approach overcomes the limitation of a previously proposed stochastic approach, namely computational load and very reduced scalability. Its applicability and effectiveness are demonstrated through two benchmark synthetic networks with different complexity.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Optimized Elastic Network Models With Direct Characterization of
           Inter-Residue Cooperativity for Protein Dynamics

      Authors: Hua Zhang;Guogen Shan;Bailin Yang;
      Pages: 1064 - 1074
      Abstract: The elastic network models (ENMs) are known as representative coarse-grained models to capture essential dynamics of proteins. Due to simple designs of the force constants as a decay with spatial distances of residue pairs in many previous studies, there is still much room for the improvement of ENMs. In this article, we directly computed the force constants with the inverse covariance estimation using a ridge-type operator for the precision matrix estimation (ROPE) on a large-scale set of NMR ensembles. Distance-dependent statistical analyses on the force constants were further comprehensively performed in terms of several paired types of sequence and structural information, including secondary structure, relative solvent accessibility, sequence distance and terminal. Various distinguished distributions of the mean force constants highlight the structural and sequential characteristics coupled with the inter-residue cooperativity beyond the spatial distances. We finally integrated these structural and sequential characteristics to build novel ENM variations using the particle swarm optimization for the parameter estimation. Considerable improvements in the correlation coefficient of the mean-square fluctuation and the mode overlap were achieved by the proposed variations when compared with traditional ENMs. This study opens a novel way to develop more accurate elastic network models for protein dynamics.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Predicting Coding Potential of RNA Sequences by Solving Local Data
           Imbalance

      Authors: Xian-gan Chen;Shuai Liu;Wen Zhang;
      Pages: 1075 - 1083
      Abstract: Non-coding RNAs (ncRNAs) play an important role in various biological processes and are associated with diseases. Distinguishing between coding RNAs and ncRNAs, also known as predicting coding potential of RNA sequences, is critical for downstream biological function analysis. Many machine learning-based methods have been proposed for predicting coding potential of RNA sequences. Recent studies reveal that most existing methods have poor performance on RNA sequences with short Open Reading Frames (sORF, ORF length
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Prediction of Glioma Grade Using Intratumoral and Peritumoral Radiomic
           Features From Multiparametric MRI Images

      Authors: Jianhong Cheng;Jin Liu;Hailin Yue;Harrison Bai;Yi Pan;Jianxin Wang;
      Pages: 1084 - 1095
      Abstract: The accurate prediction of glioma grade before surgery is essential for treatment planning and prognosis. Since the gold standard (i.e., biopsy) for grading gliomas is both highly invasive and expensive, there is a need for a noninvasive and accurate method. In this study, we proposed a novel radiomics-based pipeline by incorporating the intratumoral and peritumoral features extracted from preoperative mpMRI scans to accurately and noninvasively predict glioma grade. To address the unclear peritumoral boundary, we designed an algorithm to capture the peritumoral region with a specified radius. The mpMRI scans of 285 patients derived from a multi-institutional study were adopted. A total of 2153 radiomic features were calculated separately from intratumoral volumes (ITVs) and peritumoral volumes (PTVs) on mpMRI scans, and then refined using LASSO and mRMR feature ranking methods. The top-ranking radiomic features were entered into the classifiers to build radiomic signatures for predicting glioma grade. The prediction performance was evaluated with five-fold cross-validation on a patient-level split. The radiomic signatures utilizing the features of ITV and PTV both show a high accuracy in predicting glioma grade, with AUCs reaching 0.968. By incorporating the features of ITV and PTV, the AUC of the IPTV radiomic signature can be increased to 0.975, which outperforms the state-of-the-art methods. Additionally, our proposed method was further demonstrated to have strong generalization performance in an external validation dataset with 65 patients. The source code of our implementation is made publicly available at https://github.com/chengjianhong/glioma_grading.git.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Prediction of Malignant Breast Cancer Cases Using Ensemble Machine
           Learning: A Case Study of Pesticides Prone Area

      Authors: Nishtha Hooda;Ruchika Gupta;Nidhi Rani Gupta;
      Pages: 1096 - 1104
      Abstract: Cancer of the female breast is one of the leading types of cancers worldwide. This paper presents a case study of the Malwa Belt in India, which has witnessed an increase in the overall mortality rate due to breast cancer. The paper researches the mortality aspect of the disease and its association with various risk parameters, including demographic characteristics, percentage of pesticide residue present in the water and soil, lifestyle of the women in the affected area, water intake, and the amount of pesticide exposure of the patient. The levels of organochlorine pesticides like DDT and its metabolites and isomers of HCH in blood, tumor and surrounding adipose are estimated. Additionally, the extent of exposure of the subjects to environmental pollutants like heavy metals (Lead, Copper, Iron, Zinc, Calcium, Selenium, Chromium, etc.) is also examined. For the obtained experimental data, an efficient ensemble machine learning based framework called Bagoost is proposed to predict the risk of breast cancer in Malwa women. The proposed machine learning model achieves an accuracy of 98.21 percent when empirically tested using K-fold cross validation over the real time data of malignant and benign cases, and is established to be more efficacious than the existing approaches.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Relation Extraction From Biomedical and Clinical Text: Unified Multitask
           Learning Framework

      Authors: Shweta Yadav;Srivatsa Ramesh;Sriparna Saha;Asif Ekbal;
      Pages: 1105 - 1116
      Abstract: Motivation: To minimize the accelerating amount of time invested on the biomedical literature search, numerous approaches for automated knowledge extraction have been proposed. Relation extraction is one such task where semantic relations between the entities are identified from the free text. In the biomedical domain, extraction of regulatory pathways, metabolic processes, adverse drug reaction or disease models necessitates knowledge from the individual relations, for example, physical or regulatory interactions between genes, proteins, drugs, chemical, disease or phenotype. Results: In this paper, we study the relation extraction task from three major biomedical and clinical tasks, namely drug-drug interaction, protein-protein interaction, and medical concept relation extraction. Towards this, we model the relation extraction problem in a multi-task learning (MTL) framework, and introduce for the first time the concept of structured self-attentive network complemented with the adversarial learning approach for the prediction of relationships from the biomedical and clinical text. The fundamental notion of MTL is to simultaneously learn multiple problems together by utilizing the concepts of the shared representation. Additionally, we also generate the highly efficient single task model which exploits the shortest dependency path embedding learned over the attentive gated recurrent unit to compare our proposed MTL models. The framework we propose significantly improves over all the baselines (deep learning techniques) and single-task models for predicting the relationships, without compromising on the performance of all the tasks.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Row and Column Structure-Based Biclustering for Gene Expression Data

      Authors: Subin Qian;Huiyi Liu;Xiaofeng Yuan;Wei Wei;Shuangshuang Chen;Hong Yan;
      Pages: 1117 - 1129
      Abstract: Due to the development of high-throughput technologies for gene analysis, the biclustering method has attracted much attention. However, existing methods have problems with high time and space complexity. This paper proposes a biclustering method, called Row and Column Structure-based Biclustering (RCSBC), with low time and space complexity to find checkerboard patterns within microarray data. First, the paper describes the structure of a bicluster by using the structure of rows and columns. Second, the paper chooses the representative rows and columns with two algorithms. Finally, the gene expression data are biclustered on the space spanned by representative rows and columns. To the best of our knowledge, this paper is the first to exploit the relationship between the row/column structure of a gene expression matrix and the structure of biclusters. Both synthetic datasets and real-life gene expression datasets are used to validate the effectiveness of our method. The experimental results show that RCSBC outperforms the state-of-the-art algorithms in both clustering accuracy and time/space complexity. This study offers new insights into biclustering large-scale gene expression data without loading the whole data into memory.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Scalable Non-Linear Graph Fusion for Prioritizing Cancer-Causing Genes

      Authors: Ekta Shah;Pradipta Maji;
      Pages: 1130 - 1143
      Abstract: In the past few decades, both gene expression data and protein-protein interaction (PPI) networks have been extensively studied, due to their ability to depict important characteristics of disease-associated genes. In this regard, the paper presents a new gene prioritization algorithm to identify and prioritize cancer-causing genes, integrating judiciously the complementary information obtained from two data sources. The proposed algorithm selects disease-causing genes by maximizing the importance of selected genes and functional similarity among them. A new quantitative index is introduced to evaluate the importance of a gene. It considers whether a gene exhibits a differential expression pattern across sick and healthy individuals, and has a strong connectivity in the PPI network, which are the important characteristics of a potential biomarker. As disease-associated genes are expected to have similar expression profiles and topological structures, a scalable non-linear graph fusion technique, termed as ScaNGraF, is proposed to learn a disease-dependent functional similarity network from the co-expression and common neighbor based similarity networks. The proposed ScaNGraF, which is based on message passing algorithm, efficiently combines the shared and complementary information provided by different data sources with significantly lower computational cost. A new measure, termed as DiCoIN, is introduced to evaluate the quality of a learned affinity network. The performance of the proposed graph fusion technique and gene selection algorithm is extensively compared with that of some existing methods, using several cancer data sets.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • scLRTD : A Novel Low Rank Tensor Decomposition Method for Imputing Missing
           Values in Single-Cell Multi-Omics Sequencing Data

      Authors: Zhijie Ni;Xiaoying Zheng;Xiao Zheng;Xiufen Zou;
      Pages: 1144 - 1153
      Abstract: With the successful application of single-cell sequencing technology, a large number of single-cell multi-omics sequencing (scMO-seq) data have been generated, which enables researchers to study heterogeneity between individual cells. One prominent problem in single-cell data analysis is the prevalence of dropouts, caused by failures in amplification during the experiments. It is necessary to develop effective approaches for imputing the missing values. Unlike general methods that impute a single type of single-cell data, we propose an imputation method called scLRTD, which uses low-rank tensor decomposition based on the nuclear norm to impute scMO-seq data and single-cell RNA-sequencing (scRNA-seq) data with different stages, tissues or conditions. Furthermore, four sets of simulated and two sets of real scRNA-seq data from mouse embryonic stem cells and hepatocellular carcinoma, respectively, are used to carry out numerical experiments and to compare scLRTD with six other published methods. Error accuracy and clustering results demonstrate the effectiveness of the proposed method. Moreover, we clearly identify two cell subpopulations after imputing the real scMO-seq data from hepatocellular carcinoma. Further, Gene Ontology identifies 7 genes in the Bile secretion pathway, which is related to metabolism in hepatocellular carcinoma. The survival analysis using the TCGA database also shows that the two cell subpopulations obtained after imputation have distinct survival rates.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Single-Cell RNA Sequencing Data Clustering by Low-Rank Subspace Ensemble
           Framework

      Authors: Chuan-Yuan Wang;Ying-Lian Gao;Jin-Xing Liu;Xiang-Zhen Kong;Chun-Hou Zheng;
      Pages: 1154 - 1164
      Abstract: The rapid development of single-cell RNA sequencing (scRNA-seq) technology reveals the gene expression status and gene structure of individual cells, reflecting the heterogeneity and diversity of cells. The traditional methods of scRNA-seq data analysis treat the data as lying in a single subspace, which hides structural information in other subspaces. In this paper, we propose a low-rank subspace ensemble clustering framework (LRSEC) to analyze scRNA-seq data. Assuming that the scRNA-seq data exist in multiple subspaces, the low-rank model is used to find the lowest rank representation of the data in the subspace. It is worth noting that the penalty factor of the low-rank kernel function is uncertain, and different penalty factors correspond to different low-rank structures. Moreover, it is difficult for a single clustering model to find the cellular structure of all datasets. To strengthen the correlation between model solutions, we construct a new ensemble clustering framework, LRSEC, by using the low-rank model as the basic learner. The LRSEC framework captures the global structure of data through low-rank subspaces and has better clustering performance than a single clustering model. We validate the performance of the LRSEC framework on seven small datasets and one large dataset and obtain satisfactory results.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Spatial Pyramid Pooling With 3D Convolution Improves Lung Cancer Detection

      Authors: Jason L. Causey;Keyu Li;Xianghao Chen;Wei Dong;Karl Walker;Jake A. Qualls;Jonathan Stubblefield;Jason H. Moore;Yuanfang Guan;Xiuzhen Huang;
      Pages: 1165 - 1172
      Abstract: Lung cancer is the leading cause of cancer deaths. Low-dose computed tomography (CT) screening has been shown to significantly reduce lung cancer mortality but suffers from a high false positive rate that leads to unnecessary diagnostic procedures. The development of deep learning techniques has the potential to help improve lung cancer screening technology. Here we present the algorithm, DeepScreener, which can predict a patient's cancer status from a volumetric lung CT scan. DeepScreener is based on our model of Spatial Pyramid Pooling, which ranked 16th of 1972 teams (top 1 percent) in the Data Science Bowl 2017 competition (DSB2017), evaluated with the challenge datasets. Here we test the algorithm with an independent set of 1449 low-dose CT scans of the National Lung Screening Trial (NLST) cohort, and we find that DeepScreener has consistent performance of high accuracy. Furthermore, by combining Spatial Pyramid Pooling and 3D Convolution, it achieves an AUC of 0.892, surpassing the previous state-of-the-art algorithms using only 3D convolution. The advancement of deep learning algorithms can potentially help improve lung cancer detection with low-dose CT scans.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Statistical Analysis of Microarray Data Clustering using NMF, Spectral
           Clustering, Kmeans, and GMM

      Authors: Andri Mirzal;
      Pages: 1173 - 1192
      Abstract: In the unsupervised learning literature, clustering of microarray gene expression datasets has been studied extensively; nonnegative matrix factorization (NMF), spectral clustering, kmeans, and the Gaussian mixture model (GMM) are some of the most used methods. However, there is still a limited number of works that utilize statistical analysis to measure the significance of performance differences between these methods. In this paper, a statistical analysis is presented of performance differences between ten NMF, six spectral clustering, four GMM, and the standard kmeans algorithms in clustering eleven publicly available microarray gene expression datasets, with the number of clusters ranging from two to ten. The experimental results show that, statistically, NMFs and kmeans have similar performances and outperform spectral clustering. As spectral clustering can be used to uncover hidden manifold structures, the underperformance of spectral methods leads us to question whether the datasets have manifold structures. Visual inspection using multidimensional scaling plots indicates that such structures do not exist. Moreover, as the plots indicate that clusters in some datasets have elliptical boundaries, GMM methods are also utilized. The experimental results show that GMM methods outperform the other methods to some degree, and thus imply that the datasets follow Gaussian distributions.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Supervised Graph Clustering for Cancer Subtyping Based on Survival
           Analysis and Integration of Multi-Omic Tumor Data

      Authors: Cheng Liu;Wenming Cao;Si Wu;Wenjun Shen;Dazhi Jiang;Zhiwen Yu;Hau-San Wong;
      Pages: 1193 - 1202
      Abstract: Identifying cancer subtypes by integration of multi-omic data is beneficial to improve the understanding of disease progression, and provides more precise treatment for patients. Cancer subtype identification is usually accomplished by clustering patients with unsupervised learning approaches. Thus, most existing integrative cancer subtyping methods are performed in an entirely unsupervised way. An integrative cancer subtyping approach can be improved to discover clinically more relevant cancer subtypes when considering the clinical survival response variables. In this study, we propose a Survival Supervised Graph Clustering (S2GC) for cancer subtyping by taking into consideration survival information. Specifically, we use a graph to represent similarity of patients, and develop a multi-omic survival analysis embedding with patient-to-patient similarity graph learning for cancer subtype identification. The multi-view (omic) survival analysis model and graph of patients are jointly learned in a unified way. The learned optimal graph can be utilized to cluster cancer subtypes directly. In the proposed model, the survival analysis model and adaptive graph learning could positively reinforce each other. Consequently, the survival time can be considered as supervised information to improve the quality of the similarity graph and explore clinically more relevant subgroups of patients. Experiments on several representative multi-omic cancer datasets demonstrate that the proposed method achieves better results than a number of state-of-the-art methods. The results also suggest that our method is able to identify biologically meaningful subgroups for different cancer types. (Our Matlab source code is available online at github: https://github.com/CLiu272/S2GC)
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Synergy Between Embedding and Protein Functional Association Networks for
           Drug Label Prediction Using Harmonic Function

      Authors: Mohan Timilsina;Declan Patrick Mc Kernan;Haixuan Yang;Mathieu d’Aquin;
      Pages: 1203 - 1213
      Abstract: Semi-Supervised Learning (SSL) is an approach to machine learning that makes use of unlabeled data for training with a small amount of labeled data. In the context of molecular biology and pharmacology, one can take advantage of unlabeled data, for instance to identify drugs and targets where a few genes are known to be associated with a specific target for drugs and are considered as labeled data. Labeling the genes requires laboratory verification and validation. This process is usually very time consuming and expensive. Thus, it is useful to estimate the functional role of drugs from unlabeled data using computational methods. To develop such a model, we used openly available data resources to create two bipartite graphs: (i) drugs and genes, and (ii) genes and disease. We constructed the genetic embedding graph from the two bipartite graphs using Tensor Factorization methods. We integrated the genetic embedding graph with the publicly available protein functional association network. Our results show the usefulness of the integration by effectively predicting drug labels.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • The γ-OMP Algorithm for Feature Selection With Application
           to Gene Expression Data

      Authors: Michail Tsagris;Zacharias Papadovasilakis;Kleanthi Lakiotaki;Ioannis Tsamardinos;
      Pages: 1214 - 1224
      Abstract: Feature selection for predictive analytics is the problem of identifying a minimal-size subset of features that is maximally predictive of an outcome of interest. To apply to molecular data, feature selection algorithms need to be scalable to tens of thousands of features. In this paper, we propose γ-OMP, a generalisation of the highly-scalable Orthogonal Matching Pursuit feature selection algorithm. γ-OMP can handle (a) various types of outcomes, such as continuous, binary, nominal, time-to-event, (b) discrete (categorical) features, (c) different statistical-based stopping criteria, (d) several predictive models (e.g., linear or logistic regression), (e) various types of residuals, and (f) different types of association. We compare γ-OMP against LASSO, a prototypical, widely used algorithm for high-dimensional data. On both simulated data and several real gene expression datasets, γ-OMP is on par with, or outperforms, LASSO in binary classification (case-control data), regression (quantified outcomes), and time-to-event data (censored survival times). γ-OMP is based on simple statistical ideas, it is easy to implement and to extend, and our extensive evaluation shows that it is also effective in bioinformatics analysis settings.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Unit-Vise: Deep Shallow Unit-Vise Residual Neural Networks With Transition
           Layer For Expert Level Skin Cancer Classification

      Authors: Imran Razzak;Saeeda Naz;
      Pages: 1225 - 1234
      Abstract: Many modern neural network architectures in the overparameterized regime have been used for identification of skin cancer. Recent work showed that networks whose hidden units are polynomially smaller in size perform better than overparameterized models. Hence, in this paper, we present a multistage unit-vise deep dense residual network with transition and additional supervision blocks that enforce shorter connections, resulting in better feature representation. Unlike ResNet, we divided the network into several stages, and each stage consists of several densely connected residual units that support residual learning with dense connectivity and limited skip connectivity. Thus, each stage can consider the features from its earlier layers locally, and the network is less complicated than its counterpart. Evaluation results on the ISIC-2018 challenge, consisting of 10,015 training images, show considerable improvement over other approaches, achieving 98.05 percent accuracy and improving on the best results achieved by state-of-the-art methods. The code of the Unit-vise network is publicly available.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Use Chou's 5-Steps Rule With Different Word Embedding Types to
           Boost Performance of Electron Transport Protein Prediction Model

      Authors: Trinh-Trung-Duong Nguyen;Quang-Thai Ho;Nguyen-Quoc-Khanh Le;Van-Dinh Phan;Yu-Yen Ou;
      Pages: 1235 - 1244
      Abstract: Living organisms receive necessary energy substances directly from cellular respiration. The completion of electron storage and transportation requires the process of cellular respiration with the aid of electron transport chains. Therefore, the work of deciphering electron transport proteins is inevitably needed. The identification of these proteins with high performance depends directly on the choice of feature extraction methods and machine learning algorithm. In this study, protein sequences were treated as natural language sentences comprising words. The nominated word embedding-based feature sets, hinged on the word embedding modulation and protein motif frequencies, were useful for feature selection. Five word embedding types and a variety of conjoint features were examined for such feature selection. The support vector machine algorithm was then employed to perform classification. The performance statistics within the 5-fold cross-validation, including average accuracy, specificity, sensitivity, as well as MCC rates, surpass 0.95. Such metrics in the independent test are 96.82, 97.16, 95.76 percent, and 0.9, respectively. Compared to state-of-the-art predictors, the proposed method achieves better performance on all metrics, indicating its effectiveness in determining electron transport proteins. Furthermore, this study reveals insights about the applicability of various word embeddings for understanding surveyed sequences.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
  • Using Symmetry to Enhance the Performance of Agent-Based Epidemic Models

      Authors: Gilberto M. Nakamura;Alinne C. C. Souza;Francisco C. M. Souza;Renato F. Bulcão-Neto;Alexandre S. Martinez;Alessandra A. Macedo;
      Pages: 1245 - 1254
      Abstract: Symmetries express the invariance of a system towards sets of mathematical transformations. In more practical terms, symmetries greatly reduce or simplify the computational efforts required to evaluate relevant properties of a system. In this paper, two methods are proposed to implement spin symmetries which simplify the analysis of the spreading of diseases in an agent-based epidemic model. We perform a set of simulations to measure the efficiency gains compared to traditional methods. Our findings show symmetry-based algorithms improve the performance of the Monte Carlo simulation and the exact Markov process.
      PubDate: March-April 1 2022
      Issue No: Vol. 19, No. 2 (2022)
       
 