  Subjects -> COMPUTER SCIENCE (Total: 2052 journals)
    - ANIMATION AND SIMULATION (30 journals)
    - AUTOMATION AND ROBOTICS (105 journals)
    - COMPUTER ARCHITECTURE (10 journals)
    - COMPUTER ENGINEERING (11 journals)
    - COMPUTER GAMES (15 journals)
    - COMPUTER PROGRAMMING (26 journals)
    - COMPUTER SCIENCE (1194 journals)
    - COMPUTER SECURITY (44 journals)
    - DATA BASE MANAGEMENT (14 journals)
    - DATA MINING (34 journals)
    - E-BUSINESS (22 journals)
    - E-LEARNING (29 journals)
    - IMAGE AND VIDEO PROCESSING (39 journals)
    - INFORMATION SYSTEMS (109 journals)
    - INTERNET (92 journals)
    - SOCIAL WEB (50 journals)
    - SOFTWARE (33 journals)
    - THEORY OF COMPUTING (8 journals)

COMPUTER SCIENCE (1194 journals)

Showing 1 - 200 of 872 Journals sorted alphabetically
3D Printing and Additive Manufacturing     Full-text available via subscription   (Followers: 20)
Abakós     Open Access   (Followers: 4)
ACM Computing Surveys     Hybrid Journal   (Followers: 27)
ACM Journal on Computing and Cultural Heritage     Hybrid Journal   (Followers: 8)
ACM Journal on Emerging Technologies in Computing Systems     Hybrid Journal   (Followers: 12)
ACM Transactions on Accessible Computing (TACCESS)     Hybrid Journal   (Followers: 3)
ACM Transactions on Algorithms (TALG)     Hybrid Journal   (Followers: 15)
ACM Transactions on Applied Perception (TAP)     Hybrid Journal   (Followers: 5)
ACM Transactions on Architecture and Code Optimization (TACO)     Hybrid Journal   (Followers: 9)
ACM Transactions on Autonomous and Adaptive Systems (TAAS)     Hybrid Journal   (Followers: 7)
ACM Transactions on Computation Theory (TOCT)     Hybrid Journal   (Followers: 12)
ACM Transactions on Computational Logic (TOCL)     Hybrid Journal   (Followers: 3)
ACM Transactions on Computer Systems (TOCS)     Hybrid Journal   (Followers: 17)
ACM Transactions on Computer-Human Interaction     Hybrid Journal   (Followers: 15)
ACM Transactions on Computing Education (TOCE)     Hybrid Journal   (Followers: 5)
ACM Transactions on Design Automation of Electronic Systems (TODAES)     Hybrid Journal   (Followers: 4)
ACM Transactions on Economics and Computation     Hybrid Journal  
ACM Transactions on Embedded Computing Systems (TECS)     Hybrid Journal   (Followers: 3)
ACM Transactions on Information Systems (TOIS)     Hybrid Journal   (Followers: 19)
ACM Transactions on Intelligent Systems and Technology (TIST)     Hybrid Journal   (Followers: 7)
ACM Transactions on Interactive Intelligent Systems (TiiS)     Hybrid Journal   (Followers: 3)
ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP)     Hybrid Journal   (Followers: 9)
ACM Transactions on Reconfigurable Technology and Systems (TRETS)     Hybrid Journal   (Followers: 6)
ACM Transactions on Sensor Networks (TOSN)     Hybrid Journal   (Followers: 8)
ACM Transactions on Speech and Language Processing (TSLP)     Hybrid Journal   (Followers: 9)
ACM Transactions on Storage     Hybrid Journal  
ACS Applied Materials & Interfaces     Full-text available via subscription   (Followers: 29)
Acta Automatica Sinica     Full-text available via subscription   (Followers: 2)
Acta Informatica Malaysia     Open Access  
Acta Universitatis Cibiniensis. Technical Series     Open Access  
Ad Hoc Networks     Hybrid Journal   (Followers: 11)
Adaptive Behavior     Hybrid Journal   (Followers: 11)
Advanced Engineering Materials     Hybrid Journal   (Followers: 28)
Advanced Science Letters     Full-text available via subscription   (Followers: 10)
Advances in Adaptive Data Analysis     Hybrid Journal   (Followers: 7)
Advances in Artificial Intelligence     Open Access   (Followers: 15)
Advances in Calculus of Variations     Hybrid Journal   (Followers: 2)
Advances in Catalysis     Full-text available via subscription   (Followers: 5)
Advances in Computational Mathematics     Hybrid Journal   (Followers: 19)
Advances in Computer Engineering     Open Access   (Followers: 4)
Advances in Computer Science : an International Journal     Open Access   (Followers: 15)
Advances in Computing     Open Access   (Followers: 2)
Advances in Data Analysis and Classification     Hybrid Journal   (Followers: 51)
Advances in Engineering Software     Hybrid Journal   (Followers: 27)
Advances in Geosciences (ADGEO)     Open Access   (Followers: 13)
Advances in Human Factors/Ergonomics     Full-text available via subscription   (Followers: 22)
Advances in Human-Computer Interaction     Open Access   (Followers: 20)
Advances in Materials Sciences     Open Access   (Followers: 14)
Advances in Operations Research     Open Access   (Followers: 12)
Advances in Parallel Computing     Full-text available via subscription   (Followers: 6)
Advances in Porous Media     Full-text available via subscription   (Followers: 5)
Advances in Remote Sensing     Open Access   (Followers: 44)
Advances in Science and Research (ASR)     Open Access   (Followers: 6)
Advances in Technology Innovation     Open Access   (Followers: 5)
AEU - International Journal of Electronics and Communications     Hybrid Journal   (Followers: 8)
African Journal of Information and Communication     Open Access   (Followers: 8)
African Journal of Mathematics and Computer Science Research     Open Access   (Followers: 4)
AI EDAM     Hybrid Journal  
Air, Soil & Water Research     Open Access   (Followers: 11)
AIS Transactions on Human-Computer Interaction     Open Access   (Followers: 6)
Algebras and Representation Theory     Hybrid Journal   (Followers: 1)
Algorithms     Open Access   (Followers: 11)
American Journal of Computational and Applied Mathematics     Open Access   (Followers: 5)
American Journal of Computational Mathematics     Open Access   (Followers: 4)
American Journal of Information Systems     Open Access   (Followers: 5)
American Journal of Sensor Technology     Open Access   (Followers: 4)
Anais da Academia Brasileira de Ciências     Open Access   (Followers: 2)
Analog Integrated Circuits and Signal Processing     Hybrid Journal   (Followers: 7)
Analysis in Theory and Applications     Hybrid Journal   (Followers: 1)
Animation Practice, Process & Production     Hybrid Journal   (Followers: 5)
Annals of Combinatorics     Hybrid Journal   (Followers: 4)
Annals of Data Science     Hybrid Journal   (Followers: 11)
Annals of Mathematics and Artificial Intelligence     Hybrid Journal   (Followers: 12)
Annals of Pure and Applied Logic     Open Access   (Followers: 2)
Annals of Software Engineering     Hybrid Journal   (Followers: 13)
Annual Reviews in Control     Hybrid Journal   (Followers: 6)
Anuario Americanista Europeo     Open Access  
Applicable Algebra in Engineering, Communication and Computing     Hybrid Journal   (Followers: 2)
Applied and Computational Harmonic Analysis     Full-text available via subscription   (Followers: 1)
Applied Artificial Intelligence: An International Journal     Hybrid Journal   (Followers: 12)
Applied Categorical Structures     Hybrid Journal   (Followers: 2)
Applied Computational Intelligence and Soft Computing     Open Access   (Followers: 11)
Applied Computer Systems     Open Access   (Followers: 2)
Applied Informatics     Open Access  
Applied Mathematics and Computation     Hybrid Journal   (Followers: 33)
Applied Medical Informatics     Open Access   (Followers: 10)
Applied Numerical Mathematics     Hybrid Journal   (Followers: 5)
Applied Soft Computing     Hybrid Journal   (Followers: 16)
Applied Spatial Analysis and Policy     Hybrid Journal   (Followers: 4)
Applied System Innovation     Open Access  
Architectural Theory Review     Hybrid Journal   (Followers: 3)
Archive of Applied Mechanics     Hybrid Journal   (Followers: 5)
Archive of Numerical Software     Open Access  
Archives and Museum Informatics     Hybrid Journal   (Followers: 146)
Archives of Computational Methods in Engineering     Hybrid Journal   (Followers: 5)
arq: Architectural Research Quarterly     Hybrid Journal   (Followers: 7)
Artifact     Hybrid Journal   (Followers: 2)
Artificial Life     Hybrid Journal   (Followers: 7)
Asia Pacific Journal on Computational Engineering     Open Access  
Asia-Pacific Journal of Information Technology and Multimedia     Open Access   (Followers: 1)
Asian Journal of Computer Science and Information Technology     Open Access  
Asian Journal of Control     Hybrid Journal  
Assembly Automation     Hybrid Journal   (Followers: 2)
at - Automatisierungstechnik     Hybrid Journal   (Followers: 1)
Australian Educational Computing     Open Access   (Followers: 1)
Automatic Control and Computer Sciences     Hybrid Journal   (Followers: 4)
Automatic Documentation and Mathematical Linguistics     Hybrid Journal   (Followers: 5)
Automatica     Hybrid Journal   (Followers: 11)
Automation in Construction     Hybrid Journal   (Followers: 6)
Autonomous Mental Development, IEEE Transactions on     Hybrid Journal   (Followers: 9)
Basin Research     Hybrid Journal   (Followers: 5)
Behaviour & Information Technology     Hybrid Journal   (Followers: 52)
Big Data and Cognitive Computing     Open Access   (Followers: 2)
Biodiversity Information Science and Standards     Open Access  
Bioinformatics     Hybrid Journal   (Followers: 294)
Biomedical Engineering     Hybrid Journal   (Followers: 15)
Biomedical Engineering and Computational Biology     Open Access   (Followers: 13)
Biomedical Engineering, IEEE Reviews in     Full-text available via subscription   (Followers: 21)
Biomedical Engineering, IEEE Transactions on     Hybrid Journal   (Followers: 37)
Briefings in Bioinformatics     Hybrid Journal   (Followers: 46)
British Journal of Educational Technology     Hybrid Journal   (Followers: 144)
Broadcasting, IEEE Transactions on     Hybrid Journal   (Followers: 12)
c't Magazin fuer Computertechnik     Full-text available via subscription   (Followers: 1)
CALCOLO     Hybrid Journal  
Calphad     Hybrid Journal   (Followers: 2)
Canadian Journal of Electrical and Computer Engineering     Full-text available via subscription   (Followers: 15)
Capturing Intelligence     Full-text available via subscription  
Catalysis in Industry     Hybrid Journal   (Followers: 1)
CEAS Space Journal     Hybrid Journal   (Followers: 2)
Cell Communication and Signaling     Open Access   (Followers: 2)
Central European Journal of Computer Science     Hybrid Journal   (Followers: 5)
CERN IdeaSquare Journal of Experimental Innovation     Open Access   (Followers: 3)
Chaos, Solitons & Fractals     Hybrid Journal   (Followers: 3)
Chemometrics and Intelligent Laboratory Systems     Hybrid Journal   (Followers: 14)
ChemSusChem     Hybrid Journal   (Followers: 7)
China Communications     Full-text available via subscription   (Followers: 7)
Chinese Journal of Catalysis     Full-text available via subscription   (Followers: 2)
CIN Computers Informatics Nursing     Full-text available via subscription   (Followers: 11)
Circuits and Systems     Open Access   (Followers: 15)
Clean Air Journal     Full-text available via subscription   (Followers: 1)
CLEI Electronic Journal     Open Access  
Clin-Alert     Hybrid Journal   (Followers: 1)
Cluster Computing     Hybrid Journal   (Followers: 1)
Cognitive Computation     Hybrid Journal   (Followers: 4)
COMBINATORICA     Hybrid Journal  
Combinatorics, Probability and Computing     Hybrid Journal   (Followers: 4)
Combustion Theory and Modelling     Hybrid Journal   (Followers: 14)
Communication Methods and Measures     Hybrid Journal   (Followers: 12)
Communication Theory     Hybrid Journal   (Followers: 21)
Communications Engineer     Hybrid Journal   (Followers: 1)
Communications in Algebra     Hybrid Journal   (Followers: 3)
Communications in Computational Physics     Full-text available via subscription   (Followers: 2)
Communications in Partial Differential Equations     Hybrid Journal   (Followers: 3)
Communications of the ACM     Full-text available via subscription   (Followers: 52)
Communications of the Association for Information Systems     Open Access   (Followers: 16)
COMPEL: The International Journal for Computation and Mathematics in Electrical and Electronic Engineering     Hybrid Journal   (Followers: 3)
Complex & Intelligent Systems     Open Access   (Followers: 1)
Complex Adaptive Systems Modeling     Open Access  
Complex Analysis and Operator Theory     Hybrid Journal   (Followers: 2)
Complexity     Hybrid Journal   (Followers: 6)
Complexus     Full-text available via subscription  
Composite Materials Series     Full-text available via subscription   (Followers: 8)
Computación y Sistemas     Open Access  
Computation     Open Access   (Followers: 1)
Computational and Applied Mathematics     Hybrid Journal   (Followers: 2)
Computational and Mathematical Methods in Medicine     Open Access   (Followers: 2)
Computational and Mathematical Organization Theory     Hybrid Journal   (Followers: 2)
Computational and Structural Biotechnology Journal     Open Access   (Followers: 2)
Computational and Theoretical Chemistry     Hybrid Journal   (Followers: 9)
Computational Astrophysics and Cosmology     Open Access   (Followers: 1)
Computational Biology and Chemistry     Hybrid Journal   (Followers: 11)
Computational Chemistry     Open Access   (Followers: 2)
Computational Cognitive Science     Open Access   (Followers: 2)
Computational Complexity     Hybrid Journal   (Followers: 4)
Computational Condensed Matter     Open Access  
Computational Ecology and Software     Open Access   (Followers: 9)
Computational Economics     Hybrid Journal   (Followers: 9)
Computational Geosciences     Hybrid Journal   (Followers: 16)
Computational Linguistics     Open Access   (Followers: 23)
Computational Management Science     Hybrid Journal  
Computational Mathematics and Modeling     Hybrid Journal   (Followers: 8)
Computational Mechanics     Hybrid Journal   (Followers: 5)
Computational Methods and Function Theory     Hybrid Journal  
Computational Molecular Bioscience     Open Access   (Followers: 2)
Computational Optimization and Applications     Hybrid Journal   (Followers: 7)
Computational Particle Mechanics     Hybrid Journal   (Followers: 1)
Computational Research     Open Access   (Followers: 1)
Computational Science and Discovery     Full-text available via subscription   (Followers: 2)
Computational Science and Techniques     Open Access  
Computational Statistics     Hybrid Journal   (Followers: 14)
Computational Statistics & Data Analysis     Hybrid Journal   (Followers: 30)
Computer     Full-text available via subscription   (Followers: 94)
Computer Aided Surgery     Open Access   (Followers: 6)
Computer Applications in Engineering Education     Hybrid Journal   (Followers: 8)
Computer Communications     Hybrid Journal   (Followers: 16)
Computer Engineering and Applications Journal     Open Access   (Followers: 5)
Computer Journal     Hybrid Journal   (Followers: 9)
Computer Methods in Applied Mechanics and Engineering     Hybrid Journal   (Followers: 23)
Computer Methods in Biomechanics and Biomedical Engineering     Hybrid Journal   (Followers: 12)
Computer Methods in the Geosciences     Full-text available via subscription   (Followers: 2)

Bioinformatics
  Journal Prestige (SJR): 4.643
  Citation Impact (citeScore): 271
  Number of Followers: 294  
   Hybrid Journal (it can contain Open Access articles)
   ISSN (Print) 1367-4803 - ISSN (Online) 1460-2059
   Published by Oxford University Press  [396 journals]
  • Evolutionary relationship between the cysteine and histidine rich domains
           (CHORDs) and Btk-type zinc fingers
    • Authors: Kaur G; Subramanian S, Valencia A.
      Pages: 1981 - 1985
      Abstract:
        Summary: Cysteine and histidine rich domains (CHORDs), implicated in immunity and disease resistance signaling in plants, and in development and signal transduction in muscles and tumorigenesis in animals, are seen to have a cylindrical three-dimensional structure stabilized by the tetrahedral chelation of two zinc ions. CHORDs are regarded as novel zinc-binding domains and classified independently in Pfam and ECOD. Our sequence and structure analysis reveals that both the zinc-binding sites in CHORD possess a zinc ribbon fold and are likely related to each other by duplication and circular permutation. Interestingly, we also detect an evolutionary relationship between each of the CHORD zinc fingers (ZFs) and the Bruton's tyrosine kinase (Btk)-type ZF of the zinc ribbon fold group. Btk_ZF is found in eukaryotic Tec kinase family proteins that are also implicated in signaling pathways in several lineages of hematopoietic cells involved in mammalian immunity. Our analysis suggests that the unique zinc-stabilized fold seen only in the CHORD and Btk_ZFs likely emerged specifically in eukaryotes to mediate diverse signaling pathways.
        Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Tue, 30 Jan 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty041
      Issue No: Vol. 34, No. 12 (2018)
  • HPViewer: sensitive and specific genotyping of human papillomavirus in
           metagenomic DNA
    • Authors: Hao Y; Yang L, Galvao Neto A, et al.
      Pages: 1986 - 1995
      Abstract:
        Motivation: Shotgun DNA sequencing provides sensitive detection of all 182 HPV types in tissue and body fluid. However, existing computational methods either produce false positives misidentifying HPV types due to shared sequences among HPV, human and prokaryotes, or produce false negative since they identify HPV by assembled contigs requiring large abundant of HPV reads.
        Results: We designed HPViewer with two custom HPV reference databases masking simple repeats and homology sequences respectively and one homology distance matrix to hybridize these two databases. It directly identified HPV from short DNA reads rather than assembled contigs. Using 100 100 simulated samples, we revealed that HPViewer was robust for samples containing either high or low number of HPV reads. Using 12 shotgun sequencing samples from respiratory papillomatosis, HPViewer was equal to VirusTAP, and Vipie and better than HPVDetector with the respect to specificity and was the most sensitive method in the detection of HPV types 6 and 11. We demonstrated that contigs-based approaches had disadvantages of detection of HPV. In 1573 sets of metagenomic data from 18 human body sites, HPViewer identified 104 types of HPV in a body-site associated pattern and 89 types of HPV co-occurring in one sample with other types of HPV. We demonstrated HPViewer was sensitive and specific for HPV detection in metagenomic data.
        Availability and implementation: HPViewer can be accessed at
        Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Thu, 25 Jan 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty037
      Issue No: Vol. 34, No. 12 (2018)
  • EPS-LASSO: test for high-dimensional regression under extreme phenotype
           sampling of continuous traits
    • Authors: Xu C; Fang J, Shen H, et al.
      Pages: 1996 - 2003
      Abstract:
        Motivation: Extreme phenotype sampling (EPS) is a broadly-used design to identify candidate genetic factors contributing to the variation of quantitative traits. By enriching the signals in extreme phenotypic samples, EPS can boost the association power compared to random sampling. Most existing statistical methods for EPS examine the genetic factors individually, despite many quantitative traits have multiple genetic factors underlying their variation. It is desirable to model the joint effects of genetic factors, which may increase the power and identify novel quantitative trait loci under EPS. The joint analysis of genetic data in high-dimensional situations requires specialized techniques, e.g. the least absolute shrinkage and selection operator (LASSO). Although there are extensive research and application related to LASSO, the statistical inference and testing for the sparse model under EPS remain unknown.
        Results: We propose a novel sparse model (EPS-LASSO) with hypothesis test for high-dimensional regression under EPS based on a decorrelated score function. The comprehensive simulation shows EPS-LASSO outperforms existing methods with stable type I error and FDR control. EPS-LASSO can provide a consistent power for both low- and high-dimensional situations compared with the other methods dealing with high-dimensional situations. The power of EPS-LASSO is close to other low-dimensional methods when the causal effect sizes are small and is superior when the effects are large. Applying EPS-LASSO to a transcriptome-wide gene expression study for obesity reveals 10 significant body mass index associated genes. Our results indicate that EPS-LASSO is an effective method for EPS data analysis, which can account for correlated predictors.
        Availability and implementation: The source code is available at
        Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Thu, 25 Jan 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty042
      Issue No: Vol. 34, No. 12 (2018)
  • Accurity: accurate tumor purity and ploidy inference from tumor-normal WGS
           data by jointly modelling somatic copy number alterations and heterozygous
           germline single-nucleotide-variants
    • Authors: Luo Z; Fan X, Su Y, et al.
      Pages: 2004 - 2011
      Abstract:
        Motivation: Tumor purity and ploidy have a substantial impact on next-gen sequence analyses of tumor samples and may alter the biological and clinical interpretation of results. Despite the existence of several computational methods that are dedicated to estimate tumor purity and/or ploidy from The Cancer Genome Atlas (TCGA) tumor-normal whole-genome-sequencing (WGS) data, an accurate, fast and fully-automated method that works in a wide range of sequencing coverage, level of tumor purity and level of intra-tumor heterogeneity, is still missing.
        Results: We describe a computational method called Accurity that infers tumor purity, tumor cell ploidy and absolute allelic copy numbers for somatic copy number alterations (SCNAs) from tumor-normal WGS data by jointly modelling SCNAs and heterozygous germline single-nucleotide-variants (HGSNVs). Results from both in silico and real sequencing data demonstrated that Accurity is highly accurate and robust, even in low-purity, high-ploidy and low-coverage settings in which several existing methods perform poorly. Accounting for tumor purity and ploidy, Accurity significantly increased signal/noise gaps between different copy numbers. We are hopeful that Accurity is of clinical use for identifying cancer diagnostic biomarkers.
        Availability and implementation: Accurity is implemented in C++/Rust, available at
        Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Sat, 27 Jan 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty043
      Issue No: Vol. 34, No. 12 (2018)
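
The Accurity abstract above says that tumor purity and ploidy affect how copy numbers show up in sequencing data. As a rough illustration only, the sketch below uses a commonly used two-component mixture formula for the expected tumor/normal read-depth ratio of a segment; it is not taken from the paper and is not Accurity's model. The function name `expected_depth_ratio` and its parameters are hypothetical.

```python
def expected_depth_ratio(purity, copy_number, tumor_ploidy, normal_ploidy=2.0):
    """Expected tumor/normal read-depth ratio for one genomic segment.

    Assumption (not from the source): the sample is a mixture of tumor cells
    (fraction `purity`, with `copy_number` copies of the segment and average
    ploidy `tumor_ploidy`) and normal cells with `normal_ploidy` copies.
    """
    if not 0.0 <= purity <= 1.0:
        raise ValueError("purity must lie in [0, 1]")
    # DNA contributed by this segment, averaged over tumor and normal cells.
    segment_signal = purity * copy_number + (1.0 - purity) * normal_ploidy
    # Genome-wide average DNA content of the mixed sample (the normalizer).
    baseline = purity * tumor_ploidy + (1.0 - purity) * normal_ploidy
    return segment_signal / baseline


if __name__ == "__main__":
    # The same 4-copy gain looks much weaker in a low-purity sample.
    for purity in (1.0, 0.5, 0.2):
        ratio = expected_depth_ratio(purity, copy_number=4, tumor_ploidy=2.0)
        print(f"purity={purity:.1f} ratio={ratio:.3f}")
```

Under this simplified model, lower purity pulls every ratio toward 1, which shrinks the gap between different copy numbers.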
  • Progressive approach for SNP calling and haplotype assembly using single
           molecular sequencing data
    • Authors: Guo F; Wang D, Wang L, et al.
      Pages: 2012 - 2018
      Abstract:
        Motivation: Haplotype information is essential to the complete description and interpretation of genomes, genetic diversity and genetic ancestry. The new technologies can provide Single Molecular Sequencing (SMS) data that cover about 90% of positions over chromosomes. However, the SMS data has a higher error rate comparing to 1% error rate for short reads. Thus, it becomes very difficult for SNP calling and haplotype assembly using SMS reads. Most existing technologies do not work properly for the SMS data.
        Results: In this paper, we develop a progressive approach for SNP calling and haplotype assembly that works very well for the SMS data. Our method can handle more than 200 million non-N bases on Chromosome 1 with millions of reads, more than 100 blocks, each of which contains more than 2 million bases and more than 3K SNP sites on average. Experiment results show that the false discovery rate and false negative rate for our method are 15.7 and 11.0% on NA12878, and 16.5 and 11.0% on NA24385. Moreover, the overall switch errors for our method are 7.26 and 5.21 with average 3378 and 5736 SNP sites per block on NA12878 and NA24385, respectively. Here, we demonstrate that SMS reads alone can generate a high quality solution for both SNP calling and haplotype assembly.
        Availability and implementation: Source codes and results are available at
      PubDate: Mon, 19 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty059
      Issue No: Vol. 34, No. 12 (2018)
  • BAUM: improving genome assembly by adaptive unique mapping and local
           overlap-layout-consensus approach
    • Authors: Wang A; Wang Z, Li Z, et al.
      Pages: 2019 - 2028
      Abstract:
        Motivation: It is highly desirable to assemble genomes of high continuity and consistency at low cost. The current bottleneck of draft genome continuity using the second generation sequencing (SGS) reads is primarily caused by uncertainty among repetitive sequences. Even though the single-molecule real-time sequencing technology is very promising to overcome the uncertainty issue, its relatively high cost and error rate add burden on budget or computation. Many long-read assemblers take the overlap-layout-consensus (OLC) paradigm, which is less sensitive to sequencing errors, heterozygosity and variability of coverage. However, current assemblers of SGS data do not sufficiently take advantage of the OLC approach.
        Results: Aiming at minimizing uncertainty, the proposed method BAUM, breaks the whole genome into regions by adaptive unique mapping; then the local OLC is used to assemble each region in parallel. BAUM can (i) perform reference-assisted assembly based on the genome of a close species (ii) or improve the results of existing assemblies that are obtained based on short or long sequencing reads. The tests on two eukaryote genomes, a wild rice Oryza longistaminata and a parrot Melopsittacus undulatus, show that BAUM achieved substantial improvement on genome size and continuity. Besides, BAUM reconstructed a considerable amount of repetitive regions that failed to be assembled by existing short read assemblers. We also propose statistical approaches to control the uncertainty in different steps of BAUM.
        Availability and implementation:
        Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Mon, 15 Jan 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty020
      Issue No: Vol. 34, No. 12 (2018)
  • O-GlcNAcPRED-II: an integrated classification algorithm for identifying
           O-GlcNAcylation sites based on fuzzy undersampling and a K-means PCA
           oversampling technique
    • Authors: Jia C; Zuo Y, Zou Q, et al.
      Pages: 2029 - 2036
      Abstract:
        Motivation: Protein O-GlcNAcylation (O-GlcNAc) is an important post-translational modification of serine (S)/threonine (T) residues that involves multiple molecular and cellular processes. Recent studies have suggested that abnormal O-GlcNAcylation causes many diseases, such as cancer and various neurodegenerative diseases. With the available protein O-GlcNAcylation sites experimentally verified, it is highly desired to develop automated methods to rapidly and effectively identify O-GlcNAcylation sites. Although some computational methods have been proposed, their performance has been unsatisfactory, particularly in terms of prediction sensitivity.
        Results: In this study, we developed an ensemble model O-GlcNAcPRED-II to identify potential O-GlcNAcylation sites. A K-means principal component analysis oversampling technique (KPCA) and fuzzy undersampling method (FUS) were first proposed and incorporated to reduce the proportion of the original positive and negative training samples. Then, rotation forest, a type of classifier-integrated system, was adopted to divide the eight types of feature space into several subsets using four sub-classifiers: random forest, k-nearest neighbour, naive Bayesian and support vector machine. We observed that O-GlcNAcPRED-II achieved a sensitivity of 81.05%, specificity of 95.91%, accuracy of 91.43% and Matthew’s correlation coefficient of 0.7928 for five-fold cross-validation run 10 times. Additionally, the results obtained by O-GlcNAcPRED-II on two independent datasets also indicated that the proposed predictor outperformed five published prediction tools.
        Availability and implementation: http://
        Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Tue, 06 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty039
      Issue No: Vol. 34, No. 12 (2018)
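
The O-GlcNAcPRED-II abstract above reports sensitivity, specificity, accuracy and a Matthews correlation coefficient. For readers unfamiliar with these measures, here is a minimal sketch of how they are conventionally computed from a binary confusion matrix. The function name `binary_metrics` and the example counts are illustrative assumptions, not values or code from the paper.

```python
import math


def binary_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: positive sites predicted correctly / missed.
    tn/fp: negative sites predicted correctly / wrongly flagged.
    """
    def ratio(num, den):
        # Return 0.0 instead of raising when a class is empty.
        return num / den if den else 0.0

    sensitivity = ratio(tp, tp + fn)          # true positive rate
    specificity = ratio(tn, tn + fp)          # true negative rate
    accuracy = ratio(tp + tn, tp + fn + tn + fp)
    # Matthews correlation coefficient; stays informative on imbalanced data.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = ratio(tp * tn - fp * fn, denom)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the formulas.
    for name, value in binary_metrics(tp=80, fn=20, tn=90, fp=10).items():
        print(f"{name}: {value:.4f}")
```

Reporting the Matthews coefficient alongside accuracy is common for site-prediction tasks, because negative sites usually far outnumber positive ones and accuracy alone can look deceptively high.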
  • A low-complexity add-on score for protein remote homology search with
    • Authors: Margelevičius M; Hancock J.
      Pages: 2037 - 2045
      Abstract:
        Motivation: Protein sequence alignment forms the basis for comparative modeling, the most reliable approach to protein structure prediction, among many other applications. Alignment between sequence families, or profile–profile alignment, represents one of the most, if not the most, sensitive means for homology detection but still necessitates improvement. We aim at improving the quality of profile–profile alignments and the sensitivity induced by them by refining profile–profile substitution scores.
        Results: We have developed a new score that represents an additional component of profile–profile substitution scores. A comprehensive evaluation shows that the new add-on score statistically significantly improves both the sensitivity and the alignment quality of the COMER method. We discuss why the score leads to the improvement and its almost optimal computational complexity that makes it easily implementable in any profile–profile alignment method.
        Availability and implementation: An implementation of the add-on score in the open-source COMER software and data are available at The COMER software is also available on Github at and as a Docker image (minmar/comer).
        Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Tue, 30 Jan 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty048
      Issue No: Vol. 34, No. 12 (2018)
  • HIITE: HIV-1 incidence and infection time estimator
    • Authors: Park S; Love T, Kapoor S, et al.
      Pages: 2046 - 2052
      Abstract:
        Motivation: Around 2.1 million new HIV-1 infections were reported in 2015, alerting that the HIV-1 epidemic remains a significant global health challenge. Precise incidence assessment strengthens epidemic monitoring efforts and guides strategy optimization for prevention programs. Estimating the onset time of HIV-1 infection can facilitate optimal clinical management and identify key populations largely responsible for epidemic spread and thereby infer HIV-1 transmission chains. Our goal is to develop a genomic assay estimating the incidence and infection time in a single cross-sectional survey setting.
        Results: We created a web-based platform, HIV-1 incidence and infection time estimator (HIITE), which processes envelope gene sequences using hierarchical clustering algorithms and informs the stage of infection, along with time since infection for incident cases. HIITE’s performance was evaluated using 585 incident and 305 chronic specimens’ envelope gene sequences collected from global cohorts including HIV-1 vaccine trial participants. HIITE precisely identified chronically infected individuals as being chronic with an error less than 1% and correctly classified 94% of recently infected individuals as being incident. Using a mixed-effect model, an incident specimen’s time since infection was estimated from its single lineage diversity, showing 14% prediction error for time since infection. HIITE is the first algorithm to inform two key metrics from a single time point sequence sample. HIITE has the capacity for assessing not only population-level epidemic spread but also individual-level transmission events from a single survey, advancing HIV prevention and intervention programs.
        Availability and implementation: Web-based HIITE and source code of HIITE are available at
        Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Fri, 09 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty073
      Issue No: Vol. 34, No. 12 (2018)
  • pepKalc: scalable and comprehensive calculation of electrostatic
           interactions in random coil polypeptides
    • Authors: Tamiola K; Scheek R, van der Meulen P, et al.
      Pages: 2053 - 2060
      Abstract: Motivation: Polypeptide sequence length is the single dominant factor hampering the effectiveness of currently available software tools for de novo calculation of amino acid-specific protonation constants in disordered polypeptides. Results: We have developed pepKalc, a robust simulation software for the comprehensive evaluation of protein electrostatics in unfolded states. Our software completely removes the limitations of the previously reported Monte-Carlo approaches in the computation of protein electrostatics by using a hybrid approach that effectively combines exact and mean-field calculations to rapidly obtain accurate results. Paired with a modern architecture GPU, pepKalc is capable of evaluating protonation behavior for an arbitrary-size polypeptide in a sub-second time regime. Availability and implementation: [URL missing] and [URL missing].
      PubDate: Mon, 22 Jan 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty033
      Issue No: Vol. 34, No. 12 (2018)
  • Novel overlapping subgraph clustering for the detection of antigen
    • Authors: Zhao L; Wu S, Jiang J, et al.
      Pages: 2061 - 2068
      Abstract: Motivation: Antigens that contain overlapping epitopes have been occasionally reported. As current algorithms mainly take a one-antigen-one-epitope approach to the prediction of epitopes, they are not capable of detecting these multiple and overlapping epitopes accurately, or even those multiple and separated epitopes existing in some other antigens. Results: We introduce a novel subgraph clustering algorithm for more accurate detection of epitopes. This algorithm takes graph partitions as seeds, and expands the seeds to merge overlapping subgraphs based on the term frequency-inverse document frequency (TF-IDF) featured similarity. Then, the merged subgraphs are each classified as an epitope or non-epitope. Tests of our algorithm were conducted on three newly collected datasets of antigens. In the first dataset, each antigen contains only a single epitope; in the second, each antigen contains only multiple and separated epitopes; and in the third, each antigen contains overlapping epitopes. The prediction performance of our algorithm is significantly better than the state-of-the-art methods. The lifts of the averaged f-scores on top of the best existing methods are 60, 75 and 22% for the single epitope detection, the multiple and separated epitopes detection, and the overlapping epitopes detection, respectively. Availability and implementation: The source code is available at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Fri, 02 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty051
      Issue No: Vol. 34, No. 12 (2018)
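The abstract above mentions merging subgraphs by a TF-IDF featured similarity. As a rough illustration only, the generic idea of TF-IDF weighting followed by cosine similarity can be sketched as below. This is not the authors' implementation: representing a subgraph as a list of residue labels, the smoothed IDF formula, and the names `tfidf_vectors`, `cosine` and `subgraphs` are all assumptions made for this sketch.

```python
import math
from collections import Counter


def tfidf_vectors(documents):
    """Return one {term: weight} dict per document, using a smoothed IDF."""
    n_docs = len(documents)
    doc_freq = Counter()
    for doc in documents:
        doc_freq.update(set(doc))  # count each term once per document
    vectors = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        vectors.append({
            term: (count / total)
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical subgraphs, each described by the residue types it contains.
subgraphs = [
    ["ALA", "GLY", "SER", "GLY"],
    ["GLY", "SER", "ALA", "GLY"],
    ["TRP", "PHE", "TYR"],
]
vecs = tfidf_vectors(subgraphs)
same_bag = cosine(vecs[0], vecs[1])   # same multiset of labels
disjoint = cosine(vecs[0], vecs[2])   # no labels in common
```

In a merging scheme of this kind, two candidate subgraphs would be merged when their similarity exceeds some threshold; the threshold and the feature definition would come from the paper itself.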
  • Spectral clustering based on learning similarity matrix
    • Authors: Park S; Zhao H, Birol I.
      Pages: 2069 - 2076
      Abstract: Motivation: Single-cell RNA-sequencing (scRNA-seq) technology can generate genome-wide expression data at the single-cell levels. One important objective in scRNA-seq analysis is to cluster cells where each cluster consists of cells belonging to the same cell type based on gene expression patterns. Results: We introduce a novel spectral clustering framework that imposes sparse structures on a target matrix. Specifically, we utilize multiple doubly stochastic similarity matrices to learn a similarity matrix, motivated by the observation that each similarity matrix can be a different informative representation of the data. We impose a sparse structure on the target matrix followed by shrinking pairwise differences of the rows in the target matrix, motivated by the fact that the target matrix should have these structures in the ideal case. We solve the proposed non-convex problem iteratively using the ADMM algorithm and show the convergence of the algorithm. We evaluate the performance of the proposed clustering method on various simulated as well as real scRNA-seq data, and show that it can identify clusters accurately and robustly. Availability and implementation: The algorithm is implemented in MATLAB. The source code can be downloaded at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Thu, 08 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty050
      Issue No: Vol. 34, No. 12 (2018)
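The abstract above relies on doubly stochastic similarity matrices, that is, matrices whose rows and columns each sum to 1. One common way to obtain such a matrix from a strictly positive similarity matrix is Sinkhorn-Knopp scaling, sketched below. Whether the authors use this particular normalization is not stated in the abstract, so treat it as an assumption; the function name `sinkhorn_normalize` and the toy matrix are hypothetical.

```python
def sinkhorn_normalize(matrix, iterations=200, tol=1e-9):
    """Alternately rescale rows and columns of a strictly positive square
    matrix until both row sums and column sums are (approximately) 1."""
    n = len(matrix)
    m = [row[:] for row in matrix]  # work on a copy
    for _ in range(iterations):
        # Row step: make every row sum to 1.
        for i in range(n):
            row_sum = sum(m[i])
            m[i] = [value / row_sum for value in m[i]]
        # Column step: make every column sum to 1.
        col_sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
        for i in range(n):
            m[i] = [m[i][j] / col_sums[j] for j in range(n)]
        # Columns are exact after the column step, so only rows need checking.
        if max(abs(sum(row) - 1.0) for row in m) < tol:
            break
    return m


# Toy symmetric similarity matrix between three cells (made-up values).
similarity = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.2],
    [0.1, 0.2, 1.0],
]
doubly_stochastic = sinkhorn_normalize(similarity)
row_sums = [sum(row) for row in doubly_stochastic]
col_sums = [sum(row[j] for row in doubly_stochastic) for j in range(3)]
```

The learning of a combined similarity matrix with sparsity constraints via ADMM, as described in the abstract, is a separate and more involved step that this sketch does not attempt.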
  • scEpath: energy landscape-based inference of transition probabilities and
           cellular trajectories from single-cell transcriptomic data
    • Authors: Jin S; MacLean A, Peng T, et al.
      Pages: 2077 - 2086
      Abstract: Motivation: Single-cell RNA-sequencing (scRNA-seq) offers unprecedented resolution for studying cellular decision-making processes. Robust inference of cell state transition paths and probabilities is an important yet challenging step in the analysis of these data. Results: Here we present scEpath, an algorithm that calculates energy landscapes and probabilistic directed graphs in order to reconstruct developmental trajectories. We quantify the energy landscape using ‘single-cell energy’ and distance-based measures, and find that the combination of these enables robust inference of the transition probabilities and lineage relationships between cell states. We also identify marker genes and gene expression patterns associated with cell state transitions. Our approach produces pseudotemporal orderings that are, in combination, more robust and accurate than current methods, and offers higher resolution dynamics of the cell state transitions, leading to new insight into key transition events during differentiation and development. Moreover, scEpath is robust to variation in the size of the input gene set, and is broadly unsupervised, requiring few parameters to be set by the user. Applications of scEpath led to the identification of a cell-cell communication network implicated in early human embryo development, and novel transcription factors important for myoblast differentiation. scEpath allows us to identify common and specific temporal dynamics and transcriptional factor programs along branched lineages, as well as the transition probabilities that control cell fates. Availability and implementation: A MATLAB package of scEpath is available at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Mon, 05 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty058
      Issue No: Vol. 34, No. 12 (2018)
  • PhenoRank: reducing study bias in gene prioritization through simulation
    • Authors: Cornish A; David A, Sternberg M, et al.
      Pages: 2087 - 2095
      Abstract: Motivation: Genome-wide association studies have identified thousands of loci associated with human disease, but identifying the causal genes at these loci is often difficult. Several methods prioritize genes most likely to be disease causing through the integration of biological data, including protein–protein interaction and phenotypic data. Data availability is not the same for all genes, however, potentially influencing the performance of these methods. Results: We demonstrate that whilst disease genes tend to be associated with greater numbers of data, this may be at least partially a result of them being better studied. With this observation we develop PhenoRank, which prioritizes disease genes whilst avoiding being biased towards genes with more available data. Bias is avoided by comparing gene scores generated for the query disease against gene scores generated using simulated sets of phenotype terms, which ensures that differences in data availability do not affect the ranking of genes. We demonstrate that whilst existing prioritization methods are biased by data availability, PhenoRank is not similarly biased. Avoiding this bias allows PhenoRank to effectively prioritize genes with fewer available data and improves its overall performance. PhenoRank outperforms three available prioritization methods in cross-validation (PhenoRank area under receiver operating characteristic curve [AUC] = 0.89, DADA AUC = 0.87, EXOMISER AUC = 0.71, PRINCE AUC = 0.83, P < 2.2 × 10^−16). Availability and implementation: PhenoRank is freely available for download at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Wed, 17 Jan 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty028
      Issue No: Vol. 34, No. 12 (2018)
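The abstract above describes comparing a gene's score for the query disease against scores obtained with simulated sets of phenotype terms. A generic way to express such a comparison is an empirical p-value, sketched below. This is only an illustration of the general idea and not PhenoRank's actual procedure; the function name `empirical_p_value` and the example scores are hypothetical.

```python
def empirical_p_value(observed, simulated):
    """Fraction of simulated scores at least as large as the observed score,
    with the usual +1 correction so the estimate is never exactly zero."""
    if not simulated:
        raise ValueError("at least one simulated score is required")
    at_least_as_large = sum(1 for score in simulated if score >= observed)
    return (at_least_as_large + 1) / (len(simulated) + 1)


# Made-up scores for one gene: one from the query disease, four from
# simulated phenotype term sets.
observed_score = 0.9
simulated_scores = [0.2, 0.5, 0.95, 0.1]
p_value = empirical_p_value(observed_score, simulated_scores)
```

Because the simulated scores are generated for the same gene, a gene with abundant annotation would tend to score highly in the simulations as well, which is how a comparison of this kind can offset differences in data availability.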
  • ChemDistiller: an engine for metabolite annotation in mass spectrometry
    • Authors: Laponogov I; Sadawi N, Galea D, et al.
      Pages: 2096 - 2102
      Abstract: Motivation: High-resolution mass spectrometry permits simultaneous detection of thousands of different metabolites in biological samples; however, their automated annotation still presents a challenge due to the limited number of tailored computational solutions freely available to the scientific community. Results: Here, we introduce ChemDistiller, a customizable engine that combines automated large-scale annotation of metabolites using tandem MS data with a compiled database containing tens of millions of compounds with pre-calculated ‘fingerprints’ and fragmentation patterns. Our tests using publicly and commercially available tandem MS spectra for reference compounds show retrieval rates comparable to or exceeding the ones obtainable by the current state-of-the-art solutions in the field while offering higher throughput, scalability and processing speed. Availability and implementation: Source code freely available for download at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Mon, 12 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty080
      Issue No: Vol. 34, No. 12 (2018)
  • TOXsIgN: a cross-species repository for toxicogenomic signatures
    • Authors: Darde T; Gaudriault P, Beranger R, et al.
      Pages: 2116 - 2122
      Abstract: Motivation: At the same time that toxicologists express increasing concern about reproducibility in this field, the development of dedicated databases has already smoothed the path toward improving the storage and exchange of raw toxicogenomic data. Nevertheless, none provides access to analyzed and interpreted data as originally reported in scientific publications. Given the increasing demand for access to this information, we developed TOXsIgN, a repository for TOXicogenomic sIgNatures. Results: The TOXsIgN repository provides a flexible environment that facilitates online submission, storage and retrieval of toxicogenomic signatures by the scientific community. It currently hosts 754 projects that describe more than 450 distinct chemicals and their 8491 associated signatures. It also provides users with a working environment containing a powerful search engine as well as bioinformatics/biostatistics modules that enable signature comparisons or enrichment analyses. Availability and implementation: The TOXsIgN repository is freely accessible at [URL missing]. Website implemented in Python, JavaScript and MongoDB, with all major browsers supported. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Sat, 27 Jan 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty040
      Issue No: Vol. 34, No. 12 (2018)
  • TSAPA: identification of tissue-specific alternative polyadenylation sites
           in plants
    • Authors: Ji G; Chen M, Ye W, et al.
      Pages: 2123 - 2125
      Abstract: Summary: Alternative polyadenylation (APA) is now emerging as a widespread mechanism modulated tissue-specifically, which highlights the need to define tissue-specific poly(A) sites for profiling APA dynamics across tissues. We have developed an R package called TSAPA, based on a machine learning model, for identifying tissue-specific poly(A) sites in plants. A feature space including more than 200 features was assembled to specifically characterize poly(A) sites in plants. The classification model in TSAPA can be customized by selecting desirable features or classifiers. TSAPA is also capable of predicting tissue-specific poly(A) sites in unannotated intergenic regions. TSAPA will be a valuable addition to the community for studying dynamics of APA in plants. Availability and implementation: [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Sat, 27 Jan 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty044
      Issue No: Vol. 34, No. 12 (2018)
  • Integrative pipeline for profiling DNA copy number and inferring tumor
    • Authors: Urrutia E; Chen H, Zhou Z, et al.
      Pages: 2126 - 2128
      Abstract: Summary: Copy number variation is an important and abundant source of variation in the human genome, which has been associated with a number of diseases, especially cancer. Massively parallel next-generation sequencing allows copy number profiling with fine resolution. Such efforts, however, have met with mixed success, with setbacks arising partly from the lack of reliable analytical methods to meet the diverse and unique challenges arising from the myriad experimental designs and study goals in genetic studies. In cancer genomics, detection of somatic copy number changes and profiling of allele-specific copy number (ASCN) are complicated by experimental biases and artifacts as well as normal cell contamination and cancer subclone admixture. Furthermore, careful statistical modeling is warranted to reconstruct tumor phylogeny by both somatic ASCN changes and single nucleotide variants. Here we describe a flexible computational pipeline, MARATHON, which integrates multiple related statistical software for copy number profiling and downstream analyses in disease genetic studies. Availability and implementation: MARATHON is publicly available at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Mon, 05 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty057
      Issue No: Vol. 34, No. 12 (2018)
  • GCevobase: an evolution-based database for GC content in eukaryotic
    • Authors: Wang D; Hancock J.
      Pages: 2129 - 2131
      Abstract: Summary: How to comprehend the underlying mechanism behind the origin and evolution of genome composition such as GC content has been regarded as a long-standing crucial question, highlighting its biological significance and functional relevance. To varying extents, several systematically identified patterns of GC content variations are shown to be linked to a set of genomic features in the events of replication, transcription, translation and recombination, with strong contrasts between diverse phylogenetic or taxonomical groups. In this situation, we develop a repository, GCevobase, which houses compositional and size related data presented in various forms from 1118 genomes including 5 major clades of eukaryotic species such as vertebrates, invertebrates, plants, fungi and protists. It analyzes the cautiously selected sequences with clearly-defined bases and structures them under the taxonomical classification system (kingdom, phylum, class, order and family) at the genome and gene scales. It uses diverse, intelligible graphs to show the statistical measurements of GC content in the sequence, at the three codon positions and at 4-fold degenerate sites, as well as CDS length and their genome-wide correlations, and displays the evolutionary pathways of GC content by taking into account between-species orthologs and within-species paralogs for each annotated gene. In addition, many internal and external links have been created to connect the data from individual genomes, and the raw data are downloadable. Availability and implementation: [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Tue, 06 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty068
      Issue No: Vol. 34, No. 12 (2018)
  • Kpax3: Bayesian bi-clustering of large sequence datasets
    • Authors: Pessia A; Corander J, Berger B.
      Pages: 2132 - 2133
      Abstract: Motivation: Estimation of the hidden population structure is an important step in many genetic studies. Often the aim is also to identify which sequence locations are the most discriminative between groups of samples for a given data partition. Automated discovery of interesting patterns that are present in the data can help to generate new biological hypotheses. Results: We introduce Kpax3, a Bayesian method for bi-clustering multiple sequence alignments. Influence of individual sites will be determined in a supervised manner by using informative prior distributions for the model parameters. Our inference method uses an implementation of both split-merge and Gibbs sampler type MCMC algorithms to traverse the joint posterior of partitions of samples and variables. We use a large Rotavirus sequence dataset to demonstrate the ability of Kpax3 to generate biologically important hypotheses about differential selective pressures across a virus protein. Availability and implementation: Kpax3 is implemented as a Julia package and released under the MIT license. Source code and documentation are available at: [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Wed, 07 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty056
      Issue No: Vol. 34, No. 12 (2018)
  • GCPred: a web tool for guanylyl cyclase functional centre prediction from
           amino acid sequence
    • Authors: Xu N; Fu D, Li S, et al.
      Pages: 2134 - 2135
      Abstract: Summary: GCPred is a webserver for the prediction of guanylyl cyclase (GC) functional centres from amino acid sequence. GCs are enzymes that generate the signalling molecule cyclic guanosine 3’, 5’-monophosphate from guanosine-5’-triphosphate. A novel class of GC centres (GCCs) has been identified in complex plant proteins. Using currently available experimental data, GCPred is created to automate and facilitate the identification of similar GCCs. The server features GCC values whose calculation takes into account the physicochemical properties of the amino acids constituting the GCC and the conserved amino acids within the centre. From a user-supplied amino acid sequence, the server returns a table of GCC values and graphs depicting deviations from mean values. The utility of this server is demonstrated using plant proteins and the human interleukin-1 receptor-associated kinase family of proteins as examples. Availability and implementation: The GCPred server is available at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Tue, 06 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty067
      Issue No: Vol. 34, No. 12 (2018)
  • INfORM: Inference of NetwOrk Response Modules
    • Authors: Marwah V; Kinaret P, Serra A, et al.
      Pages: 2136 - 2138
      Abstract: Summary: Detecting and interpreting responsive modules from gene expression data by using network-based approaches is a common but laborious task. It often requires the application of several computational methods implemented in different software packages, forcing biologists to compile complex analytical pipelines. Here we introduce INfORM (Inference of NetwOrk Response Modules), an R shiny application that enables non-expert users to detect, evaluate and select gene modules with high statistical and biological significance. INfORM is a comprehensive tool for the identification of biologically meaningful response modules from consensus gene networks inferred by using multiple algorithms. It is accessible through an intuitive graphical user interface allowing for a level of abstraction from the computational steps. Availability and implementation: INfORM is freely available for academic use at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Wed, 07 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty063
      Issue No: Vol. 34, No. 12 (2018)
  • Improving SNP prioritization and pleiotropic architecture estimation by
           incorporating prior knowledge using graph-GPA
    • Authors: Kim H; Yu Z, Lawson A, et al.
      Pages: 2139 - 2141
      Abstract: Summary: Integration of genetic studies for multiple phenotypes is a powerful approach to improving the identification of genetic variants associated with complex traits. Although it has been shown that leveraging shared genetic basis among phenotypes, namely pleiotropy, can increase statistical power to identify risk variants, it remains challenging to effectively integrate genome-wide association study (GWAS) datasets for a large number of phenotypes. We previously developed graph-GPA, a Bayesian hierarchical model that integrates multiple GWAS datasets to boost statistical power for the identification of risk variants and to estimate pleiotropic architecture within a unified framework. Here we propose a novel improvement of graph-GPA which incorporates external knowledge about phenotype–phenotype relationship to guide the estimation of genetic correlation and the association mapping. The application of graph-GPA to GWAS datasets for 12 complex diseases with a prior disease graph obtained from a text mining of biomedical literature illustrates its power to improve the identification of risk genetic variants and to facilitate understanding of genetic relationship among complex diseases. Availability and implementation: graph-GPA is implemented as an R package ‘GGPA’, which is publicly available at [URL missing]. DDNet, a web interface to query diseases of interest and download a prior disease graph obtained from a text mining of biomedical literature, is publicly available at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Thu, 08 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty061
      Issue No: Vol. 34, No. 12 (2018)
  • omicsPrint: detection of data linkage errors in multiple omics studies
    • Authors: van Iterson M; Cats D, Hop P, et al.
      Pages: 2142 - 2143
      Abstract: Summary: OmicsPrint is a versatile method for the detection of data linkage errors in multiple omics studies encompassing genetic, transcriptome and/or methylome data. OmicsPrint evaluates data linkage within and between omics data types using genotype calls from SNP arrays, DNA- or RNA-sequencing data and includes an algorithm to infer genotypes from Illumina DNA methylation array data. The method uses classification to verify assumed relationships and detect any data linkage errors, e.g. arising from sample mix-ups and mislabeling. Graphical and text output is provided to inspect and resolve putative data linkage errors. If sufficient genotype calls are available, first degree family relations are also revealed, which can be used to check parent–offspring relations or zygosity in twin studies. Availability and implementation: omicsPrint is available from BioConductor; [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Tue, 06 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty062
      Issue No: Vol. 34, No. 12 (2018)
  • KMgene: a unified R package for gene-based association analysis for
           complex traits
    • Authors: Yan Q; Fang Z, Chen W, et al.
      Pages: 2144 - 2146
      Abstract: Summary: In this report, we introduce an R package KMgene for performing gene-based association tests for familial, multivariate or longitudinal traits using kernel machine (KM) regression under a generalized linear mixed model framework. Extensive simulations were performed to evaluate the validity of the approaches implemented in KMgene. Availability and implementation: [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Fri, 09 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty066
      Issue No: Vol. 34, No. 12 (2018)
  • GEMMER: GEnome-wide tool for Multi-scale Modeling data Extraction and
           Representation for Saccharomyces cerevisiae
    • Authors: Mondeel T; Crémazy F, Barberis M, et al.
      Pages: 2147 - 2149
      Abstract: Motivation: Multi-scale modeling of biological systems requires integration of various information about genes and proteins that are connected together in networks. Spatial, temporal and functional information is available; however, it is still a challenge to retrieve and explore this knowledge in an integrated, quick and user-friendly manner. Results: We present GEMMER (GEnome-wide tool for Multi-scale Modeling data Extraction and Representation), a web-based data-integration tool that facilitates high quality visualization of physical, regulatory and genetic interactions between proteins/genes in Saccharomyces cerevisiae. GEMMER creates network visualizations that integrate information on function, temporal expression, localization and abundance from various existing databases. GEMMER supports modeling efforts by effortlessly gathering this information and providing convenient export options for images and their underlying data. Availability and implementation: GEMMER is freely available at [URL missing]. Source code, written in Python, JavaScript library D3js, PHP and JSON, is freely available at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Thu, 01 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty052
      Issue No: Vol. 34, No. 12 (2018)
  • L1000FWD: fireworks visualization of drug-induced transcriptomic
    • Authors: Wang Z; Lachmann A, Keenan A, et al.
      Pages: 2150 - 2152
      Abstract: Motivation: As part of the NIH Library of Integrated Network-based Cellular Signatures program, hundreds of thousands of transcriptomic signatures were generated with the L1000 technology, profiling the response of human cell lines to over 20 000 small molecule compounds. This effort is a promising approach toward revealing the mechanisms-of-action (MOA) for marketed drugs and other less studied potential therapeutic compounds. Results: L1000 fireworks display (L1000FWD) is a web application that provides interactive visualization of over 16 000 drug and small-molecule induced gene expression signatures. L1000FWD enables coloring of signatures by different attributes such as cell type, time point, concentration, as well as drug attributes such as MOA and clinical phase. Signature similarity search is implemented to enable the search for mimicking or opposing signatures given as input of up and down gene sets. Each point on the L1000FWD interactive map is linked to a signature landing page, which provides multifaceted knowledge from various sources about the signature and the drug. Notably such information includes most frequent diagnoses, co-prescribed drugs and age distribution of prescriptions as extracted from the Mount Sinai Health System electronic medical records. Overall, L1000FWD serves as a platform for identifying functions for novel small molecules using unsupervised clustering, as well as for exploring drug MOA. Availability and implementation: L1000FWD is freely accessible at: [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Tue, 06 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty060
      Issue No: Vol. 34, No. 12 (2018)
  • Selenzyme: enzyme selection tool for pathway design
    • Authors: Carbonell P; Wong J, Swainston N, et al.
      Pages: 2153 - 2154
      Abstract: Summary: Synthetic biology applies the principles of engineering to biology in order to create biological functionalities not seen before in nature. One of the most exciting applications of synthetic biology is the design of new organisms with the ability to produce valuable chemicals including pharmaceuticals and biomaterials in a greener, sustainable fashion. Selecting the right enzymes to catalyze each reaction step in order to produce a desired target compound is, however, not trivial. Here, we present Selenzyme, a free online enzyme selection tool for metabolic pathway design. The user is guided through several decision steps in order to shortlist the best candidates for a given pathway step. The tool graphically presents key information about enzymes based on existing databases and tools such as: similarity of sequences and of catalyzed reactions; phylogenetic distance between source organism and intended host species; multiple alignment highlighting conserved regions, predicted catalytic site, and active regions; and relevant properties such as predicted solubility and transmembrane regions. Selenzyme provides bespoke sequence selection for automated workflows in biofoundries. Availability and implementation: The tool is integrated as part of the pathway design stage into the design-build-test-learn SYNBIOCHEM pipeline. The Selenzyme web server is available at [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Wed, 07 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty065
      Issue No: Vol. 34, No. 12 (2018)
  • ssbio: a Python framework for structural systems biology
    • Authors: Mih N; Brunk E, Chen K, et al.
      Pages: 2155 - 2157
      Abstract: Summary: Working with protein structures at the genome-scale has been challenging in a variety of ways. Here, we present ssbio, a Python package that provides a framework to easily work with structural information in the context of genome-scale network reconstructions, which can contain thousands of individual proteins. The ssbio package provides an automated pipeline to construct high quality genome-scale models with protein structures (GEM-PROs), wrappers to popular third-party programs to compute associated protein properties, and methods to visualize and annotate structures directly in Jupyter notebooks, thus lowering the barrier of linking 3D structural data with established systems workflows. Availability and implementation: ssbio is implemented in Python and available to download under the MIT license at [URL missing]. Documentation and Jupyter notebook tutorials are available at [URL missing]. Interactive notebooks can be launched using Binder at [URL missing]'filepath=Binder.ipynb. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Mon, 12 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty077
      Issue No: Vol. 34, No. 12 (2018)
  • AbDesigner3D: a structure-guided tool for peptide-based antibody
    • Authors: Saethang T; Hodge K, Kimkong I, et al.
      Pages: 2158 - 2160
      Abstract: Summary: We present AbDesigner3D, a new tool for identification of optimal immunizing peptides for antibody production using a peptide-based strategy. AbDesigner3D integrates 3D structural data from the Protein Data Bank (PDB) with UniProt data, which includes basic sequence data, post-translational modification sites, SNP occurrences and more. Other features, such as uniqueness and conservation scores, are calculated based on sequences from UniProt. The 3D visualization capabilities allow an intuitive interface, while an abundance of quantitative output simplifies the process of comparing immunogen peptides. Important quantitative features added in this tool include calculation and display of accessible surface area (ASA) and protein-protein interacting residues (PPIR). The specialized data visualization features of AbDesigner3D will greatly assist users to optimize their choice of immunizing peptides. Availability and implementation: AbDesigner3D is freely available at [URL missing] or [URL missing]. Supplementary information: Supplementary data are available at Bioinformatics online.
      PubDate: Fri, 02 Feb 2018 00:00:00 GMT
      DOI: 10.1093/bioinformatics/bty055
      Issue No: Vol. 34, No. 12 (2018)
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Tel: +00 44 (0)131 4513762
Fax: +00 44 (0)131 4513327