ACM Transactions on Architecture and Code Optimization (TACO)
  Journal Prestige (SJR): 0.49
  Citation Impact (citeScore): 16
  Number of Followers: 9  
   Hybrid journal (it can contain Open Access articles)
   ISSN (Print) 1544-3566 - ISSN (Online) 1544-3973
   Published by ACM  [45 journals]
  • CAIRO: A Compiler-Assisted Technique for Enabling Instruction-Level
           Offloading of Processing-In-Memory
    • Abstract: Ramyad Hadidi, Lifeng Nai, Hyojong Kim, Hyesoon Kim

      Three-dimensional (3D)-stacking technology and the memory-wall problem have popularized processing-in-memory (PIM) concepts again, which offer the benefits of bandwidth and energy savings by offloading computations to functional units inside the memory. Several memory vendors have also started to integrate computation logic into the memory, such as Hybrid Memory Cube (HMC), the latest version of which supports up to 18 in-memory atomic instructions. Although industry prototypes have motivated studies for investigating efficient methods and architectures for PIM, researchers have not proposed a systematic way for identifying the benefits of instruction-level PIM offloading. As a result, compiler support for recognizing offloading candidates and utilizing instruction-level PIM offloading is unavailable.
      PubDate: Wed, 20 Dec 2017 00:00:00 GMT
  • ReDirect: Reconfigurable Directories for Multicore Architectures
    • Abstract: George Patsilaras, James Tuck

      As we enter the dark silicon era, architects should not envision designs in which every transistor remains turned on permanently but rather ones in which portions of the chip are judiciously turned on/off depending on the characteristics of a workload. At the same time, due to the increasing cost per transistor, architects should also consider new ways to re-purpose transistors to increase their architectural value. In this work, we consider the design of directory-based cache coherence in light of the dark silicon era and the need to re-purpose transistors. We point out that directories are not needed all of the time, and we argue that directories (and coherence) should be off unless they are actually needed for correctness.
      PubDate: Wed, 20 Dec 2017 00:00:00 GMT
  • Data-Driven Concurrency for High Performance Computing
    • Abstract: George Matheou, Paraskevas Evripidou

      In this work, we utilize dynamic dataflow/data-driven techniques to improve the performance of high performance computing (HPC) systems. The proposed techniques are implemented and evaluated through an efficient, portable, and robust programming framework that enables data-driven concurrency on HPC systems. The proposed framework is based on data-driven multithreading (DDM), a hybrid control-flow/dataflow model that schedules threads based on data availability on sequential processors. The proposed framework was evaluated using several benchmarks, with different characteristics, on two different systems: a 4-node AMD system with a total of 128 cores and a 64-node Intel HPC system with a total of 768 cores. The performance evaluation shows that the proposed framework scales well and tolerates scheduling overheads and memory latencies effectively.
      PubDate: Wed, 20 Dec 2017 00:00:00 GMT
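To make the idea of scheduling by data availability concrete, here is a toy sketch. It is not the DDM framework described in the abstract: `run_dataflow` and the task names used with it are hypothetical, and everything runs on a single thread. Each task carries a count of inputs that have not been produced yet and becomes runnable only when that count reaches zero.

```python
from collections import deque


def run_dataflow(tasks, deps):
    """Run tasks in an order dictated by data availability.

    tasks: dict mapping task name -> zero-argument callable
    deps:  dict mapping task name -> list of producer task names
    Returns the order in which the tasks were executed.
    """
    # Ready count: number of inputs not yet produced for each task.
    pending = {name: len(deps.get(name, [])) for name in tasks}

    # Reverse edges: which consumers to notify when a producer finishes.
    consumers = {name: [] for name in tasks}
    for name, producers in deps.items():
        for producer in producers:
            consumers[producer].append(name)

    # Tasks with no inputs can start immediately.
    ready = deque(name for name in tasks if pending[name] == 0)
    order = []
    while ready:
        name = ready.popleft()
        tasks[name]()
        order.append(name)
        # Completing a task makes its output available to its consumers.
        for consumer in consumers[name]:
            pending[consumer] -= 1
            if pending[consumer] == 0:
                ready.append(consumer)

    if len(order) != len(tasks):
        raise ValueError("dependency cycle: some tasks never became ready")
    return order
```

A real data-driven runtime would dispatch ready tasks to worker threads; the sketch only shows how the ready counts decide when a task may run.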
  • Triple Engine Processor (TEP): A Heterogeneous Near-Memory Processor for
           Diverse Kernel Operations
    • Abstract: Hongyeol Lim, Giho Park

      The advent of 3D memory stacking technology, which integrates a logic layer and stacked memories, is expected to be one of the most promising memory technologies to mitigate the memory wall problem by leveraging the concept of near-memory processing (NMP). With the ability to process data locally within the logic layer of stacked memory, a variety of emerging big data applications can achieve significant performance and energy-efficiency benefits. Various approaches to the NMP logic layer architecture have been studied to utilize the advantage of stacked memory. While significant acceleration of specific kernel operations has been derived from previous NMP studies, an NMP-based system using an NMP logic architecture capable of handling some specific kernel operations can suffer from performance and energy efficiency degradation caused by a significant communication overhead between the host processor and NMP stack.
      PubDate: Mon, 18 Dec 2017 00:00:00 GMT
  • HAShCache: Heterogeneity-Aware Shared DRAMCache for Integrated
           Heterogeneous Systems
    • Abstract: Adarsh Patil, Ramaswamy Govindarajan

      Integrated Heterogeneous System (IHS) processors pack throughput-oriented General-Purpose Graphics Processing Units (GPGPUs) alongside latency-oriented Central Processing Units (CPUs) on the same die sharing certain resources, e.g., shared last-level cache, Network-on-Chip (NoC), and the main memory. The demands for memory accesses and other shared resources from GPU cores can exceed those of CPU cores by two to three orders of magnitude. This disparity poses significant problems in exploiting the full potential of these architectures. In this article, we propose adding a large-capacity stacked DRAM, used as a shared last-level cache, for the IHS processors. However, adding the DRAMCache naively leaves significant performance on the table due to the disparate demands from CPU and GPU cores for DRAMCache and memory accesses.
      PubDate: Mon, 18 Dec 2017 00:00:00 GMT
  • SCALO: Scalability-Aware Parallelism Orchestration for Multi-Threaded
    • Abstract: Giorgis Georgakoudis, Hans Vandierendonck, Peter Thoman, Bronis R. De Supinski, Thomas Fahringer, Dimitrios S. Nikolopoulos

      Shared memory machines continue to increase in scale by adding more parallelism through additional cores and complex memory hierarchies. Often, executing multiple applications concurrently, dividing hardware threads among them, provides greater efficiency than executing a single application with large thread counts. However, contention for shared resources can limit the improvement of concurrent application execution: orchestrating the number of threads used by each application is essential. In this article, we contribute SCALO, a solution to orchestrate concurrent application execution to increase throughput. SCALO monitors co-executing applications at runtime to evaluate their scalability. Its optimizing thread allocator analyzes these scalability estimates to adapt the parallelism of each program.
      PubDate: Mon, 18 Dec 2017 00:00:00 GMT
  • Optimization of Triangular and Banded Matrix Operations Using 2d-Packed
    • Abstract: Toufik Baroudi, Rachid Seghir, Vincent Loechner

      Over the past few years, multicore systems have become increasingly powerful and thereby very useful in high-performance computing. However, many applications, such as some linear algebra algorithms, still cannot take full advantage of these systems. This is mainly due to the shortage of optimization techniques dealing with irregular control structures. In particular, the well-known polyhedral model fails to optimize loop nests whose bounds and/or array references are not affine functions. This is more likely to occur when handling sparse matrices in their packed formats. In this article, we propose using 2d-packed layouts and simple affine transformations to enable optimization of triangular and banded matrix operations.
      PubDate: Mon, 18 Dec 2017 00:00:00 GMT
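The remark about non-affine array references can be seen in the conventional row-major packed storage of a lower-triangular matrix. The sketch below uses that conventional one-dimensional packed format, not the 2d-packed layout the article proposes (the abstract does not specify it); the function names are hypothetical. The offset of element (i, j) is i(i+1)/2 + j, which is quadratic in i and therefore not an affine function of the loop indices.

```python
def packed_index(i, j):
    """Offset of element (i, j), with j <= i, in row-major packed lower-triangular storage."""
    if j > i or j < 0:
        raise IndexError("(i, j) lies outside the lower triangle")
    # i * (i + 1) // 2 is quadratic in i, so this array reference is
    # not an affine function of the loop indices i and j.
    return i * (i + 1) // 2 + j


def pack_lower(dense):
    """Pack the lower triangle of a square list-of-lists matrix into a flat list."""
    n = len(dense)
    packed = [0] * (n * (n + 1) // 2)
    for i in range(n):
        for j in range(i + 1):
            packed[packed_index(i, j)] = dense[i][j]
    return packed


def lower_matvec(packed, x):
    """Compute y = L @ x where L is stored in packed lower-triangular form."""
    n = len(x)
    y = [0] * n
    for i in range(n):
        for j in range(i + 1):
            y[i] += packed[packed_index(i, j)] * x[j]
    return y
```

The loop bounds in `lower_matvec` are affine, but the subscript passed to `packed` is not, which is the situation the abstract says the polyhedral model cannot handle.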
  • ECS: Error-Correcting Strings for Lifetime Improvements in Nonvolatile
    • Abstract: Shivam Swami, Poovaiah M. Palangappa, Kartik Mohanram

      Emerging nonvolatile memories (NVMs) suffer from low write endurance, resulting in early cell failures (hard errors), which reduce memory lifetime. It was recognized early on that conventional error-correcting codes (ECCs), which are designed for soft errors, are a poor choice for addressing hard errors in NVMs. This led to the evolution of hard error correction schemes like dynamically replicated memory (DRM), error-correcting pointers (ECPs), SAFER, FREE-p, PAYG, and Zombie memory to improve NVM lifetime. Whereas these approaches made significant inroads in addressing hard errors and low memory lifetime in NVMs, overcoming the challenges of underutilization of error-correcting resources and/or implementation overhead (e.g., codec latency, hardware support) remain areas of active research and development.
      PubDate: Wed, 13 Dec 2017 00:00:00 GMT
  • Generating Fine-Grain Multithreaded Applications Using a Multigrain
    • Abstract: Jaime Arteaga, Stéphane Zuckerman, Guang R. Gao

      The recent evolution in the hardware landscape, aimed at producing high-performance computing systems capable of reaching extreme-scale performance, has reignited the interest in fine-grain multithreading, particularly at the intranode level. Indeed, popular parallel programming environments, such as OpenMP, which features a simple interface for the parallelization of programs, are now incorporating fine-grain constructs. However, since coarse-grain directives are still heavily used, the OpenMP runtime is forced to support both coarse- and fine-grain models of execution, potentially reducing the advantages obtained when executing an application in a fully fine-grain environment. To evaluate the type of applications that benefit from executing in a unified fine-grain program execution model, this article presents a multigrain parallel programming environment for the generation of fine-grain multithreaded applications from programs featuring OpenMP’s API, allowing OpenMP programs to be run on top of a fine-grain event-driven program execution model.
      PubDate: Wed, 13 Dec 2017 00:00:00 GMT
  • Optimizing Affine Control With Semantic Factorizations
    • Abstract: Christophe Alias, Alexandru Plesco

      Hardware accelerators generated by polyhedral synthesis techniques make extensive use of affine expressions (affine functions and convex polyhedra) in control and steering logic. Since the control is pipelined, these affine objects must be evaluated at the same time for different values, which forbids aggressive reuse of operators. In this article, we propose a method to factorize a collection of affine expressions without preventing pipelining. Our key contributions are (i) to use semantic factorizations exploiting arithmetic properties of addition and multiplication and (ii) to rely on a cost function whose minimization ensures correct usage of FPGA resources. Our algorithm is totally parameterized by the cost function, which can be customized to fit a target FPGA.
      PubDate: Wed, 13 Dec 2017 00:00:00 GMT
  • Compiler-Assisted Loop Hardening Against Fault Attacks
    • Abstract: Julien Proy, Karine Heydemann, Alexandre Berzati, Albert Cohen

      Secure elements widely used in smartphones, digital consumer electronics, and payment systems are subject to fault attacks. To thwart such attacks, software protections are manually inserted requiring experts and time. The explosion of the Internet of Things (IoT) in home, business, and public spaces motivates the hardening of a wider class of applications and the need to offer security solutions to non-experts. This article addresses the automated protection of loops at compilation time, covering the widest range of control- and data-flow patterns, in both shape and complexity. The security property we consider is that a sensitive loop must always perform the expected number of iterations; otherwise, an attack must be reported.
      PubDate: Tue, 05 Dec 2017 00:00:00 GMT
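The security property stated above (a sensitive loop must run the expected number of iterations, otherwise an attack is reported) can be illustrated by hand. This sketch is not the compiler scheme from the article: it only shows a redundant counter checked at loop exit, and `hardened_compare`, `FaultDetected`, and the `fault_skip_to` test hook are hypothetical names introduced for the example.

```python
class FaultDetected(Exception):
    """Raised when a sensitive loop ran an unexpected number of iterations."""


def hardened_compare(a, b, fault_skip_to=None):
    """Compare two equal-length byte strings with a checked trip count.

    fault_skip_to is a hypothetical test hook that emulates a fault
    corrupting the loop index once, right after the first iteration.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have equal length")
    expected = len(a)
    diff = 0
    i = 0
    shadow = 0  # redundant counter, updated independently of i
    while i < expected:
        diff |= a[i] ^ b[i]
        i += 1
        shadow += 1
        if fault_skip_to is not None and i == 1:
            i = fault_skip_to  # emulated fault: loop index jumps ahead
    # Enforce the property: the exit index and the iteration count must
    # both match the expected trip count.
    if i != expected or shadow != expected:
        raise FaultDetected(
            f"expected {expected} iterations, observed {shadow}"
        )
    return diff == 0
```

Without the final check, the emulated fault would let `hardened_compare(b"1234", b"1299", fault_skip_to=4)` report a match after inspecting only the first byte; with the check, the call raises `FaultDetected`.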
  • CG-OoO: Energy-Efficient Coarse-Grain Out-of-Order Execution Near In-Order
           Energy with Near Out-of-Order Performance
    • Abstract: Milad Mohammadi, Tor M. Aamodt, William J. Dally

      We introduce the Coarse-Grain Out-of-Order (CG-OoO) general-purpose processor designed to achieve close to In-Order (InO) processor energy while maintaining Out-of-Order (OoO) performance. CG-OoO is an energy-performance-proportional architecture. Block-level code processing is at the heart of this architecture; CG-OoO speculates, fetches, schedules, and commits code at block-level granularity. It eliminates unnecessary accesses to energy-consuming tables and turns large tables into smaller, distributed tables that are cheaper to access. CG-OoO leverages compiler-level code optimizations to deliver efficient static code and exploits dynamic block-level and instruction-level parallelism. CG-OoO introduces Skipahead, a complexity-effective, limited out-of-order instruction scheduling model. Through the energy efficiency techniques applied to the compiler and processor pipeline stages, CG-OoO closes 62% of the average energy gap between the InO and OoO baseline processors at the same area and nearly the same performance as the OoO.
      PubDate: Tue, 05 Dec 2017 00:00:00 GMT
  • SLOOP: QoS-Supervised Loop Execution to Reduce Energy on Heterogeneous
    • Abstract: M. Waqar Azhar, Per Stenström, Vassilis Papaefstathiou

      Most systems allocate computational resources to each executing task without any actual knowledge of the application’s Quality-of-Service (QoS) requirements. Such best-effort policies lead to overprovisioning of the resources and increase energy loss. This work assumes applications with soft QoS requirements and exploits the inherent timing slack to minimize the allocated computational resources to reduce energy consumption. We propose a lightweight progress-tracking methodology based on the outer loops of application kernels. It builds on online history and uses it to estimate the total execution time. The prediction of the execution time and the QoS requirements are then used to schedule the application on a heterogeneous architecture with big out-of-order cores and small (LITTLE) in-order cores and select the minimum operating frequency, using DVFS, that meets the deadline.
      PubDate: Tue, 05 Dec 2017 00:00:00 GMT
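The progress-tracking idea can be sketched as follows: time the completed outer-loop iterations, extrapolate the remaining time, and pick the lowest frequency predicted to meet the deadline. This is an illustration under stated assumptions, not the article's method: it assumes execution time scales inversely with clock frequency, it ignores the choice between big and LITTLE cores, and the function names and frequency values are hypothetical.

```python
def estimate_remaining(iter_times, total_iters):
    """Predict the remaining time from the mean of completed outer-loop iterations."""
    done = len(iter_times)
    if done == 0:
        raise ValueError("need at least one completed iteration")
    mean = sum(iter_times) / done
    return mean * (total_iters - done)


def pick_frequency(iter_times, total_iters, deadline, current_freq, freqs):
    """Return the lowest frequency predicted to meet the deadline.

    Assumes, as a simplification, that execution time scales inversely
    with clock frequency. Falls back to the highest available frequency
    when no candidate is predicted to meet the deadline.
    """
    elapsed = sum(iter_times)
    remaining = estimate_remaining(iter_times, total_iters)
    for freq in sorted(freqs):
        # Rescale the remaining time from the current frequency to freq.
        predicted = elapsed + remaining * current_freq / freq
        if predicted <= deadline:
            return freq
    return max(freqs)
```

For example, with two iterations of 1.0 s each measured at 2.0 GHz, ten iterations in total, and a 20 s deadline, the sketch selects 1.0 GHz from the candidates 0.5, 1.0, 1.5, and 2.0 GHz, since 2 + 8 × 2 = 18 s fits while 0.5 GHz would need 34 s.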
  • MBZip: Multiblock Data Compression
    • Abstract: Raghavendra Kanakagiri, Biswabandan Panda, Madhu Mutyam

      Compression techniques at the last-level cache and the DRAM play an important role in improving system performance by increasing their effective capacities. A compressed block in DRAM also reduces the transfer time over the memory bus to the caches, reducing the latency of an LLC miss. Usually, compression is achieved by exploiting data patterns present within a block. But applications can exhibit data locality that spreads across multiple consecutive data blocks. We observe that there is significant opportunity available for compressing multiple consecutive data blocks into one single block, both at the LLC and DRAM. Our studies using 21 SPEC CPU applications show that, at the LLC, around 25% (on average) of the cache blocks can be compressed into one single cache block when grouped together in groups of 2 to 8 blocks.
      PubDate: Tue, 05 Dec 2017 00:00:00 GMT
  • Fuse: Accurate Multiplexing of Hardware Performance Counters Across
    • Abstract: Richard Neill, Andi Drebes, Antoniu Pop

      Collecting hardware event counts is essential to understanding program execution behavior. Contemporary systems offer few Performance Monitoring Counters (PMCs), thus only a small fraction of hardware events can be monitored simultaneously. We present new techniques to acquire counts for all available hardware events with high accuracy by multiplexing PMCs across multiple executions of the same program, then carefully reconciling and merging the multiple profiles into a single, coherent profile. We present a new metric for assessing the similarity of statistical distributions of event counts and show that our execution profiling approach performs significantly better than Hardware Event Multiplexing.
      PubDate: Tue, 05 Dec 2017 00:00:00 GMT
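The basic mechanics of multiplexing counters across executions can be sketched as: split the events into groups that fit the available counters, run the program once per group, and merge the results. The merge below is a plain dictionary union, which assumes the runs are directly comparable; the article's reconciliation step is described as more careful than this. `plan_runs`, `profile_all`, and the `measure` callback are hypothetical names.

```python
def plan_runs(events, num_counters):
    """Split the event list into groups that fit the available counters."""
    if num_counters < 1:
        raise ValueError("need at least one counter")
    return [
        events[i:i + num_counters]
        for i in range(0, len(events), num_counters)
    ]


def profile_all(events, num_counters, measure):
    """Run the program once per event group and merge the counts.

    measure(group) is a hypothetical callback that executes the program
    with the given events programmed into the counters and returns a
    dict mapping each event name to its count.
    """
    profile = {}
    for group in plan_runs(events, num_counters):
        profile.update(measure(group))
    return profile
```

With five events and two counters, `plan_runs` yields three groups, so the program is executed three times.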
  • Could Compression Be of General Use? Evaluating Memory Compression
           across Domains
    • Abstract: Somayeh Sardashti, David A. Wood

      Recent proposals present compression as a cost-effective technique to increase cache and memory capacity and bandwidth. While these proposals show the potential of compression, several open questions must be answered before adopting them in real systems, including the following: (1) Do these techniques work for real-world workloads running for a long time? (2) Which application domains would potentially benefit the most from compression? (3) At which level of memory hierarchy should we apply compression: caches, main memory, or both? In this article, our goal is to shed light on some of the main questions about the applicability of compression. We evaluate compression in the memory hierarchy for selected examples from different application classes.
      PubDate: Tue, 05 Dec 2017 00:00:00 GMT
  • Improving the Efficiency of GPGPU Work-Queue Through Data Awareness
    • Abstract: Libo Huang, Yashuai Lü, Li Shen, Zhiying Wang

      The architecture and programming model of current GPGPUs are best suited for applications that are dominated by structured control and data flows across large regular datasets. Parallel workloads with irregular control and data structures cannot easily harness the processing power of the GPGPU. One approach for mapping these irregular-parallel workloads to GPGPUs is using work-queues. The work-queue approach improves the utilization of SIMD units by processing only useful work that is dynamically generated during execution. As current GPGPUs lack the necessary support for work-queues, a software-based work-queue implementation often suffers from memory contention and load balancing issues. In this article, we present a novel hardware work-queue design named DaQueue, which incorporates three data-aware features to improve the efficiency of work-queues on GPGPUs.
      PubDate: Tue, 05 Dec 2017 00:00:00 GMT
  • A Framework for Automated and Controlled Floating-Point Accuracy Reduction
           in Graphics Applications on GPUs
    • Abstract: Alexandra Angerd, Erik Sintorn, Per Stenström

      Reducing the precision of floating-point values can improve performance and/or reduce energy expenditure in computer graphics, among other applications. However, reducing the precision level of floating-point values in a controlled fashion needs support both at the compiler and at the microarchitecture level. At the compiler level, a method is needed to automate the reduction of precision of each floating-point value. At the microarchitecture level, a lower precision of each floating-point register can allow more floating-point values to be packed into a register file. This, however, calls for new register file organizations. This article proposes an automated precision-selection method and a novel GPU register file organization that can densely store floating-point register values at arbitrary precisions.
      PubDate: Tue, 05 Dec 2017 00:00:00 GMT
  • Cooperative Multi-Agent Reinforcement Learning-Based Co-optimization of
           Cores, Caches, and On-chip Network
    • Abstract: Rahul Jain, Preeti Ranjan Panda, Sreenivas Subramoney

      Modern multi-core systems provide huge computational capabilities, which can be used to run multiple processes concurrently. To achieve the best possible performance within limited power budgets, the various system resources need to be allocated effectively. Any mismatch between runtime resource requirement and allocation leads to a sub-optimal energy-delay product (EDP). Different optimization techniques exist for addressing the problem of mismatch between the dynamic requirement and runtime allocation of the system resources. Choosing between multiple optimizations at runtime is complex due to the non-additive effects, making the scenario suitable for the application of machine learning techniques. We present a novel method, Machine Learned Machines (MLM), by using online reinforcement learning (RL) to perform dynamic partitioning of the last level cache (LLC), along with dynamic voltage and frequency scaling (DVFS) of the core and uncore (interconnection network and LLC).
      PubDate: Tue, 14 Nov 2017 00:00:00 GMT
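      The energy-delay product (EDP) mentioned in the abstract above is the product of the energy consumed and the execution time, so a lower value is better. The sketch below only illustrates the metric and a naive offline selection over measured configurations; it is not the reinforcement-learning method of the article, and the configuration names and numbers are made up.

```python
from typing import Dict, Tuple


def edp(energy_joules: float, delay_seconds: float) -> float:
    """Energy-delay product: energy multiplied by execution time."""
    return energy_joules * delay_seconds


def best_config(measurements: Dict[str, Tuple[float, float]]) -> str:
    """Return the configuration name with the lowest EDP.

    `measurements` maps a hypothetical configuration name to an
    (energy in joules, delay in seconds) pair.
    """
    return min(measurements, key=lambda name: edp(*measurements[name]))


# Made-up measurements for three hypothetical DVFS settings.
example = {
    "high_freq": (12.0, 1.0),  # fast but energy-hungry
    "mid_freq": (9.0, 1.2),
    "low_freq": (8.0, 1.4),    # frugal but slow
}
```

      With these made-up numbers, `best_config(example)` selects `mid_freq`: neither the fastest nor the most frugal setting minimizes the product.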
  • Cache Exclusivity and Sharing: Theory and Optimization
    • Abstract: Chencheng Ye, Chen Ding, Hao Luo, Jacob Brock, Dong Chen, Hai Jin

      A problem on multicore systems is cache sharing, where the cache occupancy of a program depends on the cache usage of peer programs. Exclusive cache hierarchy as used on AMD processors is an effective solution to allow processor cores to have a large private cache while still benefitting from shared cache. The shared cache stores the “victims” (i.e., data evicted from private caches). The performance depends on how victims of co-run programs interact in shared cache. This article presents a new metric called the victim footprint (VFP). It is measured once per program in its solo execution and can then be combined to compute the performance of any exclusive cache hierarchy, replacing parallel testing with theoretical analysis.
      PubDate: Tue, 14 Nov 2017 00:00:00 GMT
  • Energy-Efficient Compilation of Irregular Task-Parallel Loops
    • Abstract: Rahul Shrivastava, V. Krishna Nandivada

      Energy-efficient compilation is an important problem for multi-core systems. In this context, irregular programs with task-parallel loops present interesting challenges: the threads with smaller work-loads (non-critical-threads) wait at the join-points for the thread with maximum work-load (critical-thread); this leads to significant energy wastage. This problem becomes more interesting in the context of multi-socket-multi-core (MSMC) systems, where different sockets may run at different frequencies, but all the cores connected to a socket run at a single frequency. In such a configuration, even though the load-imbalance among the cores may be significant, an MSMC-oblivious technique may miss the opportunities to reduce energy consumption, if the load-imbalance across the sockets is minimal.
      PubDate: Tue, 14 Nov 2017 00:00:00 GMT
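      The MSMC constraint described in the abstract above can be made concrete with a small sketch. It assumes a deliberately simplified model that is not taken from the article: execution time is work divided by frequency, and a socket may be slowed down only as far as its most loaded core still finishes no later than the critical thread. The function name `min_socket_frequencies` is hypothetical.

```python
from typing import List


def min_socket_frequencies(
    work: List[List[float]], f_max: float = 1.0
) -> List[float]:
    """Lowest per-socket frequency that does not delay the join point.

    `work[s][c]` is the work assigned to core c of socket s.
    Assumed model: time = work / frequency, and all cores of a
    socket share a single frequency.
    """
    # The critical thread runs at f_max and fixes the finish time.
    critical = max(max(cores) for cores in work)
    # Each socket is limited by its own most loaded core.
    return [f_max * max(cores) / critical for cores in work]
```

      For `work = [[10, 2], [5, 1]]` the second socket can run at half frequency, but for `work = [[10, 1], [10, 1]]` neither socket can be slowed: the imbalance among cores is large while the imbalance across sockets is zero, which is the situation the abstract highlights.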
  • A Transactional Correctness Tool for Abstract Data Types
    • Abstract: Christina Peterson, Damian Dechev

      Transactional memory simplifies multiprocessor programming by providing the guarantee that a sequential block of code in the form of a transaction will exhibit atomicity and isolation. Transactional data structures offer the same guarantee to concurrent data structures by enabling the atomic execution of a composition of operations. The concurrency control of transactional memory systems preserves atomicity and isolation by detecting read/write conflicts among multiple concurrent transactions. State-of-the-art transactional data structures improve on this concurrency control protocol by providing explicit transaction-level synchronization for only non-commutative operations. Since read/write conflicts are handled by thread-level concurrency control, the correctness of transactional data structures cannot be evaluated according to the read/write histories.
      PubDate: Tue, 14 Nov 2017 00:00:00 GMT
  • Power Consumption Models for Multi-Tenant Server Infrastructures
    • Abstract: Matteo Ferroni, Andrea Corna, Andrea Damiani, Rolando Brondolin, Juan A. Colmenares, Steven Hofmeyr, John D. Kubiatowicz, Marco D. Santambrogio

      Multi-tenant virtualized infrastructures allow cloud providers to minimize costs through workload consolidation. One of the largest costs is power consumption, which is challenging to understand in heterogeneous environments. We propose a power modeling methodology that tackles this complexity using a divide-and-conquer approach. Our results outperform previous research work, achieving a relative error of 2% on average and under 4% in almost all cases. Models are portable across similar architectures, enabling predictions of power consumption before migrating a tenant to a different hardware platform. Moreover, we show the models allow us to evaluate colocations of tenants to reduce overall consumption.
      PubDate: Tue, 14 Nov 2017 00:00:00 GMT
  • ACM Transactions on Architecture and Code Optimization (TACO) Volume 14
           Issue 4, October 2017 (Issue-in-Progress)
    • PubDate: Tue, 24 Oct 2017 00:00:00 GMT
  • Bringing Parallel Patterns Out of the Corner: The P³ARSEC
           Benchmark Suite
    • Abstract: Daniele De Sensi, Tiziano De Matteis, Massimo Torquati, Gabriele Mencagli, Marco Danelutto

      High-level parallel programming is an active research topic aimed at promoting parallel programming methodologies that provide the programmer with high-level abstractions to develop complex parallel software with reduced time to solution. Pattern-based parallel programming is based on a set of composable and customizable parallel patterns used as basic building blocks in parallel applications. In recent years, considerable effort has been made to extend this programming model with features that overcome the shortcomings of early approaches concerning flexibility and performance. In this article, we demonstrate that the approach is flexible and efficient enough by applying it to 12 out of 13 PARSEC applications.
      PubDate: Tue, 24 Oct 2017 00:00:00 GMT