Journal of Low Power Electronics and Applications
Journal Prestige (SJR): 0.222
Citation Impact (citeScore): 1

  This is an Open Access journal
ISSN (Print) 2079-9268
Published by MDPI  [246 journals]
  • JLPEA, Vol. 12, Pages 49: Designing Energy-Efficient Approximate
           Multipliers

    • Authors: Stefania Perri, Fanny Spagnolo, Fabio Frustaci, Pasquale Corsonello
      First page: 49
      Abstract: This paper proposes a novel approach to designing energy-efficient approximate multipliers for both ASICs and FPGAs. The new strategy harnesses specific encoding logic based on bit significance and computes the approximate product by performing accurate sub-multiplications, an unconventional approach that avoids approximate computational modules implementing traditional static or dynamic bit truncation. The proposed platform-independent architecture exhibits an energy saving of up to 80% over its accurate counterparts and significantly lower accuracy loss than competing approximate architectures. When employed in 2D digital filters and edge detectors, the novel approximate multipliers lead to an energy consumption up to ~82% lower than that of the accurate counterparts, a saving up to ~2 times higher than that obtained by state-of-the-art competitors.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-09-27
      DOI: 10.3390/jlpea12040049
      Issue No: Vol. 12, No. 4 (2022)
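
      For context, the static bit-truncation baseline that the abstract contrasts itself with can be sketched in a few lines. This is only an illustration of that traditional approach, not the authors' encoding scheme; the function names and the choice of dropping the same number of bits from both operands are assumptions made here.

```python
def truncated_multiply(a: int, b: int, drop_bits: int) -> int:
    """Approximate a*b for non-negative integers by zeroing the
    `drop_bits` least-significant bits of each operand (static truncation)."""
    mask = ~((1 << drop_bits) - 1)  # e.g. drop_bits=4 clears the low nibble
    return (a & mask) * (b & mask)


def relative_error(a: int, b: int, drop_bits: int) -> float:
    """Relative error of the truncated product against the exact one."""
    exact = a * b
    if exact == 0:
        return 0.0
    return abs(exact - truncated_multiply(a, b, drop_bits)) / exact


# 255 -> 240 after clearing four bits, so the product is 240 * 240.
print(truncated_multiply(255, 255, 4))  # 57600
print(relative_error(255, 255, 4))
```

      Narrower operands mean a smaller multiplier array and lower energy, at the price of an error that grows with the number of dropped bits.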
       
  • JLPEA, Vol. 12, Pages 50: FAC-V: An FPGA-Based AES Coprocessor for RISC-V

    • Authors: Tiago Gomes, Pedro Sousa, Miguel Silva, Mongkol Ekpanyapong, Sandro Pinto
      First page: 50
      Abstract: In the new Internet of Things (IoT) era, embedded Field-Programmable Gate Array (FPGA) technology is enabling the deployment of custom-tailored embedded IoT solutions for handling different application requirements and workloads. Combined with the open RISC-V Instruction Set Architecture (ISA), the FPGA technology provides endless opportunities to create reconfigurable IoT devices with different accelerators and coprocessors tightly and loosely coupled with the processor. When connecting IoT devices to the Internet, secure communications and data exchange are major concerns. However, adding security features requires extra capabilities from the already resource-constrained IoT devices. This article presents the FAC-V coprocessor, which is an FPGA-based solution for an RISC-V processor that can be deployed following two different coupling styles. FAC-V implements in hardware the Advanced Encryption Standard (AES), one of the most widely used cryptographic algorithms in IoT low-end devices, at the cost of few FPGA resources. The conducted experiments demonstrate that FAC-V can achieve performance improvements of several orders of magnitude when compared to the software-only AES implementation; e.g., encrypting a message of 16 bytes with AES-256 can reach a performance gain of around 8000× with an energy consumption of 0.1 μJ.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-09-27
      DOI: 10.3390/jlpea12040050
      Issue No: Vol. 12, No. 4 (2022)
       
  • JLPEA, Vol. 12, Pages 51: BIoU: An Improved Bounding Box Regression for
           Object Detection

    • Authors: Niranjan Ravi, Sami Naqvi, Mohamed El-Sharkawy
      First page: 51
      Abstract: Object detection is a predominant challenge in computer vision and image processing to detect instances of objects of various classes within an image or video. Recently, a new domain of vehicular platforms, e-scooters, has been widely used across domestic and urban environments. The driving behavior of e-scooter users significantly differs from other vehicles on the road, and their interactions with pedestrians are also increasing. To ensure pedestrian safety and develop an efficient traffic monitoring system, a reliable object detection system for e-scooters is required. However, existing object detectors based on IoU loss functions suffer various drawbacks when dealing with densely packed objects or inaccurate predictions. To address this problem, a new loss function, balanced-IoU (BIoU), is proposed in this article. This loss function considers the parameterized distance between the centers and the minimum and maximum edges of the bounding boxes to address the localization problem. With the help of synthetic data, a simulation experiment was carried out to analyze the bounding box regression of various losses. Extensive experiments have been carried out on a two-stage object detector, MASK_RCNN, and single-stage object detectors such as YOLOv5n6, YOLOv5x on Microsoft Common Objects in Context, SKU110k, and our custom e-scooter dataset. The proposed loss function demonstrated an increment of 3.70% at APS on the COCO dataset, 6.20% at AP55 on SKU110k, and 9.03% at AP80 of the custom e-scooter dataset.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-09-28
      DOI: 10.3390/jlpea12040051
      Issue No: Vol. 12, No. 4 (2022)
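
      As background for the IoU-based losses mentioned in the abstract, the sketch below computes the plain intersection-over-union of two axis-aligned boxes and the corresponding 1 − IoU loss. It does not implement BIoU, whose exact formulation is not given here; the box layout `(x_min, y_min, x_max, y_max)` and the function names are assumptions.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """Plain IoU loss: 0 for a perfect match, 1 for disjoint boxes."""
    return 1.0 - iou(pred, target)


print(iou((0, 0, 2, 2), (1, 1, 3, 3)))        # 1/7, about 0.1429
print(iou_loss((0, 0, 1, 1), (5, 5, 6, 6)))   # 1.0
print(iou_loss((0, 0, 1, 1), (50, 50, 51, 51)))  # also 1.0
```

      The last two calls show one known weakness of the plain loss: disjoint boxes score the same however far apart they are, which is why variants add terms based on the distance between centers or edges.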
       
  • JLPEA, Vol. 12, Pages 52: Advanced Embedded System Modeling and Simulation
           in an Open Source RISC-V Virtual Prototype

    • Authors: Pascal Pieper, Vladimir Herdt, Rolf Drechsler
      First page: 52
      Abstract: RISC-V is a modern Instruction Set Architecture (ISA) that, by its open nature in combination with a clean and modular design, has enormous potential to become a game changer in the Internet of Things (IoT) era. Recently, SystemC-based Virtual Prototypes (VPs) have been introduced into the RISC-V ecosystem to lay the foundation for advanced industry-proven system-level use-cases. However, VP-driven environment modeling and interaction have mostly been neglected in the RISC-V context. In this paper, we propose such an extension to broaden the application domain for virtual prototyping in the RISC-V context. As a foundation, we built upon the open source RISC-V VP available on GitHub. To visualize the environment, we designed a Graphical User Interface (GUI) and appropriate libraries that offer hardware communication interfaces such as GPIO and SPI from the VP to an interactive environment model. Our approach is designed to be integrated with SystemC-based VPs that leverage a Transaction-Level Modeling (TLM) communication system to favor a speed-optimized simulation. To show the practicability of an environment model, we provide a set of building blocks such as buttons, LEDs and an OLED display, and configure them in two demonstration environments. Moreover, for rapid prototyping purposes, we provide a modeling layer that leverages the dynamic Lua scripting language to design components and integrate them with the VP-based simulation. Our evaluation with two different case-studies demonstrates the applicability of our approach in building virtual environments effectively and correctly when matching the real physical systems. To advance the RISC-V community and stimulate further research, we provide our extended VP platform with the environment configuration and visualization toolbox, as well as both case-studies, as open source on GitHub.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-09-29
      DOI: 10.3390/jlpea12040052
      Issue No: Vol. 12, No. 4 (2022)
       
  • JLPEA, Vol. 12, Pages 53: Multi-Objective Resource Scheduling for IoT
           Systems Using Reinforcement Learning

    • Authors: Shaswot Shresthamali, Masaaki Kondo, Hiroshi Nakamura
      First page: 53
      Abstract: IoT embedded systems have multiple objectives that need to be maximized simultaneously. These objectives conflict with each other due to limited resources and tradeoffs that need to be made. This requires multi-objective optimization (MOO), and multiple Pareto-optimal solutions are possible. In such a case, tradeoffs are made with respect to a user-defined preference. This work presents a general Multi-objective Reinforcement Learning (MORL) framework for MOO of IoT embedded systems. This framework comprises a general Multi-objective Markov Decision Process (MOMDP) formulation and two novel low-compute MORL algorithms. The algorithms learn policies to trade off between multiple objectives using a single preference parameter. We take the energy scheduling problem in general Energy Harvesting Wireless Sensor Nodes (EHWSNs) as a case example in which a sensor node is required to maximize its sensing rate and transmission performance while ensuring long-term uninterrupted operation within a very tight energy budget. We simulate single-task and dual-task EHWSN systems to evaluate our framework. The results demonstrate that our MORL algorithms can learn better policies at lower learning costs and successfully trade off between multiple objectives at runtime.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-10-08
      DOI: 10.3390/jlpea12040053
      Issue No: Vol. 12, No. 4 (2022)
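
      One common way to trade off two objectives with a single preference parameter is linear scalarization of the reward vector. The sketch below assumes that form purely for illustration; the abstract does not state how the authors' algorithms use the preference, and the action names and reward values are hypothetical.

```python
def scalarize(sensing_reward: float, transmission_reward: float, preference: float) -> float:
    """Blend two objective rewards with a preference w in [0, 1].

    w = 1 cares only about sensing, w = 0 only about transmission.
    Linear scalarization is an assumption made for this sketch.
    """
    if not 0.0 <= preference <= 1.0:
        raise ValueError("preference must lie in [0, 1]")
    return preference * sensing_reward + (1.0 - preference) * transmission_reward


def best_action(action_rewards: dict, preference: float) -> str:
    """Pick the action whose scalarized reward is largest."""
    return max(action_rewards, key=lambda a: scalarize(*action_rewards[a], preference))


# Hypothetical (sensing, transmission) rewards for two actions.
rewards = {"sense": (1.0, 0.0), "transmit": (0.2, 0.9)}
print(best_action(rewards, 0.9))  # sense
print(best_action(rewards, 0.1))  # transmit
```

      Changing the preference at runtime changes which action wins without retraining anything in this toy setting, which is the kind of behavior a single preference parameter is meant to enable.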
       
  • JLPEA, Vol. 12, Pages 54: Intelligent Control of Seizure-Like Activity in
           

    • Authors: Wallace Moreira Bessa, Gabriel da Silva Lima
      First page: 54
      Abstract: Memristive neuromorphic systems represent one of the most promising technologies to overcome the current challenges faced by conventional computer systems. They have recently been proposed for a wide variety of applications, such as nonvolatile computer memory, neuroprosthetics, and brain–machine interfaces. However, due to their intrinsically nonlinear characteristics, they present a very complex dynamic behavior, including self-sustained oscillations, seizure-like events, and chaos, which may compromise their use in closed-loop systems. In this work, a novel intelligent controller is proposed to suppress seizure-like events in a memristive circuit based on the Hodgkin–Huxley equations. For this purpose, an adaptive neural network is adopted within a Lyapunov-based nonlinear control scheme to attenuate bursting dynamics in the circuit, while compensating for modeling uncertainties and external disturbances. The boundedness and convergence properties of the proposed control scheme are rigorously proved by means of a Lyapunov-like stability analysis. The obtained results confirm the effectiveness of the proposed intelligent controller, presenting a much improved performance when compared with a conventional nonlinear control scheme.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-10-12
      DOI: 10.3390/jlpea12040054
      Issue No: Vol. 12, No. 4 (2022)
       
  • JLPEA, Vol. 12, Pages 55: Direct-Grown Helical-Shaped Tungsten-Oxide-Based
           Devices with Reconfigurable Selectivity for Memory Applications

    • Authors: Ying-Chen Chen, Yi-Fu Huang, Sumant Sarkar, John Gibbs, Jack Lee
      First page: 55
      Abstract: In this study, a direct-grown helical-shaped tungsten-oxide-based (h-WOx) selection device is presented for emerging memory applications. The selectivity in the selection devices is from 10 to 10³ with a low off-current of 0.1 to 0.01 nA. In addition, the selectivity of volatile switching in the h-WOx selection devices is reconfigurable with a pseudo RESET process on the one-time negative voltage operations. The helical-shaped selection devices with the glancing angle deposition (GLAD) method show good compatibility, low power consumption, good selectivity, and good reconfigurability for next-generation memory applications.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-10-15
      DOI: 10.3390/jlpea12040055
      Issue No: Vol. 12, No. 4 (2022)
       
  • JLPEA, Vol. 12, Pages 56: Templatized Fused Vector Floating-Point Dot
           Product for High-Level Synthesis

    • Authors: Dionysios Filippas, Chrysostomos Nicopoulos, Giorgos Dimitrakopoulos
      First page: 56
      Abstract: Machine-learning accelerators rely on floating-point matrix and vector multiplication kernels. To reduce their cost, customized many-term fused architectures are preferred, which improve the latency, power, and area of the designs. In this work, we design a parameterized fused many-term floating-point dot product architecture that is ready for high-level synthesis. In this way, we can exploit the efficiency offered by a well-structured fused dot-product architecture and the freedom offered by high-level synthesis in tuning the design’s pipeline to the selected floating-point format and architectural constraints. When compared with optimized dot-product units implemented directly in RTL, the proposed design offers lower-latency implementations under the same clock frequency with marginal area savings. This result holds for a variety of floating-point formats, including standard and reduced-precision representations.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-10-17
      DOI: 10.3390/jlpea12040056
      Issue No: Vol. 12, No. 4 (2022)
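
      The arithmetic idea behind a fused many-term dot product, accumulating all products exactly and rounding once at the end, can be modeled in software. The sketch below uses exact rational arithmetic as a stand-in for the wide internal datapath; it illustrates the numerical effect only and says nothing about the authors' architecture or HLS templates. Function names are assumptions.

```python
from fractions import Fraction


def fused_dot(xs, ys):
    """Dot product with a single rounding: products and sums are exact,
    and only the final result is rounded to a double."""
    exact = sum((Fraction(x) * Fraction(y) for x, y in zip(xs, ys)), Fraction(0))
    return float(exact)


def unfused_dot(xs, ys):
    """Conventional dot product: every multiply and every add rounds."""
    acc = 0.0
    for x, y in zip(xs, ys):
        acc += x * y
    return acc


xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]
print(unfused_dot(xs, ys))  # 0.0 (the 1.0 is lost when added to 1e16)
print(fused_dot(xs, ys))    # 1.0
```

      Besides accuracy, fusing removes the per-operation normalize and round steps, which is where the latency, power, and area savings mentioned in the abstract come from in hardware.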
       
  • JLPEA, Vol. 12, Pages 57: Ocelli: Efficient Processing-in-Pixel Array
           Enabling Edge Inference of Ternary Neural Networks

    • Authors: Sepehr Tabrizchi, Shaahin Angizi, Arman Roohi
      First page: 57
      Abstract: Convolutional Neural Networks (CNNs), due to their recent successes, have gained a lot of attention in various vision-based applications. They have proven to produce incredible results, especially on big data, which imposes high processing demands. However, these processing demands have limited the use of CNNs in embedded edge devices with constrained energy budgets and hardware. This paper proposes an efficient new architecture, Ocelli, which includes a ternary compute pixel (TCP) consisting of a CMOS-based pixel and a compute add-on. The proposed Ocelli architecture offers several features: (I) Because of the compute add-on, TCPs can produce ternary values (i.e., −1, 0, +1) based on the light intensity at the pixels' inputs; (II) Ocelli realizes analog convolutions enabling low-precision ternary weight neural networks. Since the first layer's convolution operations are the performance bottleneck of accelerators, Ocelli mitigates the overhead of analog buffers and analog-to-digital converters. Moreover, our design supports a zero-skipping scheme to further reduce power; (III) Ocelli exploits non-volatile magnetic RAMs to store the CNN's weights, which remarkably reduces the static power consumption; and finally, (IV) Ocelli has two modes, sensing and processing. Once the object is detected, the architecture switches to the typical sensing mode to capture the image. Compared with existing edge detection algorithms on conventional pixels, it achieves an average 10% improvement in lane detection power efficiency. Moreover, considering different CNN workloads, our design shows more than 23% power efficiency over conventional designs, while it can achieve better accuracy.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-10-30
      DOI: 10.3390/jlpea12040057
      Issue No: Vol. 12, No. 4 (2022)
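
      A software analogue of ternary values and zero-skipping may help make the abstract concrete. The sketch below quantizes values to −1, 0, +1 with a hypothetical symmetric threshold and computes a dot product in which zero weights cost nothing and the remaining terms need only an add or a subtract. It is not a model of the analog in-pixel circuit.

```python
def ternarize(value: float, threshold: float) -> int:
    """Map a real value to -1, 0 or +1 (the threshold is a hypothetical parameter)."""
    if value > threshold:
        return 1
    if value < -threshold:
        return -1
    return 0


def ternary_dot(activations, ternary_weights):
    """Dot product with ternary weights.

    Returns (result, operations): zero weights are skipped entirely,
    and +1 / -1 weights turn the multiply into an add or a subtract.
    """
    acc = 0
    operations = 0
    for a, w in zip(activations, ternary_weights):
        if w == 0:
            continue  # zero-skipping: no work for this term
        acc += a if w > 0 else -a
        operations += 1
    return acc, operations


weights = [ternarize(w, 0.3) for w in (0.8, 0.1, -0.5)]
print(weights)                          # [1, 0, -1]
print(ternary_dot([3, 5, 2], weights))  # (1, 2)
```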
       
  • JLPEA, Vol. 12, Pages 58: Tunnel Field-Effect Transistor: Impact of the
           Asymmetric and Symmetric Ambipolarity on Fault and Performance in Digital
           Circuits

    • Authors: Chiara Elfi Spano, Fabrizio Mo, Roberta Antonina Claudino, Yuri Ardesi, Massimo Ruo Roch, Gianluca Piccinini, Marco Vacca
      First page: 58
      Abstract: Tunnel Field-Effect Transistors (TFETs) have been considered one of the most promising technologies to complement or replace CMOS for ultra-low-power applications, thanks to their subthreshold slope below the well-known limit of 60 mV/dec at room temperature that holds for MOSFET technologies. Nevertheless, TFET technology still suffers from ambipolar conduction, limiting its applicability in digital systems. In this work, we analyze, through SPICE simulations, the impact of symmetric and asymmetric ambipolarity on failure and power consumption in TFET-based complementary logic circuits. Our results clarify the circuit-level effects induced by ambipolarity, demonstrating that it affects the correct functioning of logic gates and strongly impacts power consumption. We believe that our outcomes motivate further research towards technological solutions for ambipolarity suppression in TFET technology for near-future ultra-low-power applications.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-10-31
      DOI: 10.3390/jlpea12040058
      Issue No: Vol. 12, No. 4 (2022)
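
      The 60 mV/dec figure quoted in the abstract is the thermionic limit ln(10)·kT/q for a MOSFET. A short calculation, using the exact SI values of the Boltzmann constant and the elementary charge and assuming room temperature means 300 K, reproduces it:

```python
import math

BOLTZMANN = 1.380649e-23            # J/K (exact SI value)
ELEMENTARY_CHARGE = 1.602176634e-19  # C (exact SI value)


def thermionic_limit_mv_per_decade(temperature_k: float) -> float:
    """Minimum MOSFET subthreshold slope, ln(10) * kT/q, in mV per decade."""
    thermal_voltage = BOLTZMANN * temperature_k / ELEMENTARY_CHARGE  # volts
    return math.log(10) * thermal_voltage * 1e3


print(round(thermionic_limit_mv_per_decade(300.0), 1))  # 59.5
```

      TFETs rely on band-to-band tunneling rather than thermionic emission over a barrier, which is why they are not bound by this value.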
       
  • JLPEA, Vol. 12, Pages 59: Towards Low-Power Machine Learning Architectures
           Inspired by Brain Neuromodulatory Signalling

    • Authors: Taylor Barton, Hao Yu, Kyle Rogers, Nancy Fulda, Shiuh-hua Wood Chiang, Jordan Yorgason, Karl F. Warnick
      First page: 59
      Abstract: We present a transfer learning method inspired by modulatory neurotransmitter mechanisms in biological brains and explore applications for neuromorphic hardware. In this method, the pre-trained weights of an artificial neural network are held constant and a new, similar task is learned by manipulating the firing sensitivity of each neuron via a supplemental bias input. We refer to this as neuromodulatory tuning (NT). We demonstrate empirically that neuromodulatory tuning produces results comparable with traditional fine-tuning (TFT) methods in the domain of image recognition in both feed-forward deep learning and spiking neural network architectures. In our tests, NT reduced the number of parameters to be trained by four orders of magnitude as compared with traditional fine-tuning methods. We further demonstrate that neuromodulatory tuning can be implemented in analog hardware as a current source with a variable supply voltage. Our analog neuron design implements the leaky integrate-and-fire model with three bi-directional binary-scaled current sources comprising the synapse. Signals approximating modulatory neurotransmitter mechanisms are applied via adjustable power domains associated with each synapse. We validate the feasibility of the circuit design using high-fidelity simulation tools and propose an efficient implementation of neuromodulatory tuning using integrated analog circuits that consume significantly less power than digital hardware (GPU/CPU).
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-11-04
      DOI: 10.3390/jlpea12040059
      Issue No: Vol. 12, No. 4 (2022)
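
      The parameter saving described in the abstract can be pictured with a toy dense layer whose pre-trained weights stay frozen while one supplemental bias per neuron is the only trainable quantity. The sketch assumes an additive bias and a ReLU activation for illustration; the abstract does not fix either choice, and all names are hypothetical.

```python
def forward(inputs, frozen_weights, frozen_bias, modulatory_bias):
    """One dense layer with ReLU. Only `modulatory_bias` would be trained;
    `frozen_weights` and `frozen_bias` keep their pre-trained values."""
    outputs = []
    for row, b, m in zip(frozen_weights, frozen_bias, modulatory_bias):
        pre_activation = sum(w * x for w, x in zip(row, inputs)) + b + m
        outputs.append(max(0.0, pre_activation))
    return outputs


def trainable_parameters(n_inputs: int, n_neurons: int) -> dict:
    """Trainable parameter counts for full fine-tuning versus bias-only tuning."""
    return {
        "fine_tuning": n_inputs * n_neurons + n_neurons,
        "bias_only": n_neurons,
    }


weights = [[1.0, -1.0], [0.5, 0.5]]
print(forward([1.0, 2.0], weights, [0.0, 0.0], [2.0, -2.0]))  # [1.0, 0.0]
print(trainable_parameters(1000, 10))  # {'fine_tuning': 10010, 'bias_only': 10}
```

      Shifting a neuron's bias moves the input level at which it starts to fire, so the same frozen weights can produce a different response pattern for a new task.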
       
  • JLPEA, Vol. 12, Pages 60: Ultra-Low-Power Circuits for Intermittent
           Communication

    • Authors: Alessandro Torrisi, Kasım Sinan Yıldırım, Davide Brunelli
      First page: 60
      Abstract: Self-sustainable energy harvesting for Internet of Things devices is challenging since ambient energy may be sporadic and unpredictable. This situation causes frequent power failures and hence intermittent operation, which undermines the reliability of data communications. This article presents fundamental hardware circuitry that enables reliable intermittent communications over wireless batteryless node networks. We emphasize two main mechanisms that ensure energy awareness and reliability: energy status-sharing and synchronized operation. We introduce novel low-power and self-sustainable plug-and-play circuits to support these mechanisms.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-11-13
      DOI: 10.3390/jlpea12040060
      Issue No: Vol. 12, No. 4 (2022)
       
  • JLPEA, Vol. 12, Pages 61: Hardware Solutions for Low-Power Smart Edge
           Computing

    • Authors: Lucas Martin Wisniewski, Jean-Michel Bec, Guillaume Boguszewski, Abdoulaye Gamatié
      First page: 61
      Abstract: The edge computing paradigm for Internet-of-Things brings computing closer to data sources, such as environmental sensors and cameras, using connected smart devices. Over the last few years, research in this area has been both interesting and timely. Typical services like analysis, decision, and control, can be realized by edge computing nodes executing full-fledged algorithms. Traditionally, low-power smart edge devices have been realized using resource-constrained systems executing machine learning (ML) algorithms for identifying objects or features, making decisions, etc. Initially, this paper discusses recent advances in embedded systems that are devoted to energy efficient ML algorithm execution. A survey of the mainstream embedded computing devices for low power IoT and edge computing is then presented. Finally, CYSmart is introduced as an innovative smart edge computing system. Two operational use cases are presented to illustrate its power efficiency.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-11-25
      DOI: 10.3390/jlpea12040061
      Issue No: Vol. 12, No. 4 (2022)
       
  • JLPEA, Vol. 12, Pages 62: 0.6-V 1.65-μW Second-Order Gm-C Bandpass
           Filter for Multi-Frequency Bioimpedance Analysis Based on a Bootstrapped
           Bulk-Driven Voltage Buffer

    • Authors: Juan M. Carrillo, Carlos A. de la Cruz-Blas
      First page: 62
      Abstract: A bootstrapping technique used to increase the intrinsic voltage gain of a bulk-driven MOS transistor is described in this paper. The proposed circuit incorporates a capacitor and a cutoff transistor to be connected to the gate terminal of a bulk-driven MOS device, thus achieving a quasi-floating-gate structure. As a result, the contribution of the gate transconductance is cancelled out and the voltage gain of the device is correspondingly increased. The technique allows for implementing a voltage follower with a voltage gain much closer to unity as compared to the conventional bulk-driven case. This voltage buffer, along with a pseudo-resistor, is used to design a linearized transconductor. The proposed transconductance cell includes an economic continuous tuning mechanism that permits programming the effective transconductance in a range sufficiently wide to counteract the typical variations that process parameters suffer during fabrication. The transconductor has been used to implement a second-order Gm-C bandpass filter with a relatively high selectivity factor, suited for multi-frequency bioimpedance analysis in a very low-voltage environment. All the circuits have been designed in 180 nm CMOS technology to operate with a 0.6-V single-supply voltage. Simulated results show that the proposed technique allows for increasing the linearity and reducing the input-referred noise of the bootstrapped bulk-driven MOS transistor, which results in an improvement of the overall performance of the transconductor. The center frequency of the bandpass filter designed can be programmed in the frequency range from 6.5 kHz to 37.5 kHz with a power consumption ranging between 1.34 μW and 2.19 μW. The circuit presents an in-band integrated noise of 190.5 μVrms and is able to process signals of 110 mVpp with a THD below −40 dB, thus leading to a dynamic range of 47.4 dB.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-11-30
      DOI: 10.3390/jlpea12040062
      Issue No: Vol. 12, No. 4 (2022)
       
  • JLPEA, Vol. 12, Pages 35: ±0.3V Bulk-Driven Fully Differential
           Buffer with High Figures of Merit

    • Authors: Manaswini Gangineni, Jaime Ramirez-Angulo, Héctor Vázquez-Leal, Jesús Huerta-Chua, Antonio J. Lopez-Martin, Ramon Gonzalez Carvajal
      First page: 35
      Abstract: A high performance bulk-driven rail-to-rail fully differential buffer operating from ±0.3 V supplies in 180 nm CMOS technology is reported. It has a differential–difference input stage and common mode feedback circuits implemented with no-tail, high CMRR bulk-driven pseudo-differential cells. It operates in subthreshold, has infinite input impedance, low output impedance (1.4 kΩ), 86.77 dB DC open-loop gain, 172.91 kHz bandwidth and 0.684 μW static power dissipation with a 50-pF load capacitance. The buffer has power efficient class AB operation, a small signal figure of merit FOM_SS = 12.69 MHz·pF·μW⁻¹, a large signal figure of merit FOM_LS = 34.89 (V/μs)·pF·μW⁻¹, CMRR = 102 dB, PSRR+ = 109 dB, PSRR− = 100 dB, 1.1 μV/√Hz input noise spectral density, 0.3 mVrms input noise and 3.5 mV input DC offset voltage.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-06-22
      DOI: 10.3390/jlpea12030035
      Issue No: Vol. 12, No. 3 (2022)
       
  • JLPEA, Vol. 12, Pages 36: Performance Estimation of High-Level Dataflow
           Program on Heterogeneous Platforms by Dynamic Network Execution

    • Authors: Aurelien Bloch, Simone Casale-Brunet, Marco Mattavelli
      First page: 36
      Abstract: The performance of programs executed on heterogeneous parallel platforms largely depends on the design choices regarding how to partition the processing among the various processing units. In other words, it depends on the assumptions and parameters that define the partitioning, mapping, scheduling, and allocation of data exchanges among the various processing elements of the platform executing the program. The advantage of programs written in languages using the dataflow model of computation (MoC) is that executing the program with different configurations and parameter settings does not require rewriting the application software for each configuration setting, but only requires generating a new synthesis of the execution code corresponding to different parameters. The synthesis stage of dataflow programs is usually supported by automatic code generation tools. Another competitive advantage of dataflow software methodologies is that they are well-suited to support designs on heterogeneous parallel systems as they are inherently free of memory access contention issues and naturally expose the available intrinsic parallelism. So as to fully exploit these advantages and to be able to efficiently search the configuration space to find the design points that better satisfy the desired design constraints, it is necessary to develop tools and associated methodologies capable of evaluating the performance of different configurations and to drive the search for good design configurations, according to the desired performance criteria. The number of possible design assumptions and associated parameter settings (i.e., the dimensions and size of the design space) is usually so large that intuition as well as trial and error are clearly unfeasible, inefficient approaches. This paper describes a method for the clock-accurate profiling of software applications developed using the dataflow programming paradigm such as the formal RVC-CAL language. The profiling can be applied when the application program has been compiled and executed on GPU/CPU heterogeneous hardware platforms utilizing two main methodologies, denoted as static and dynamic. This paper also describes how a method for the qualitative evaluation of the performance of such programs as a function of the supplied configuration parameters can be successfully applied to heterogeneous platforms. The technique was illustrated using two different application software examples and several design points.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-06-23
      DOI: 10.3390/jlpea12030036
      Issue No: Vol. 12, No. 3 (2022)
       
  • JLPEA, Vol. 12, Pages 37: Deep Learning Approaches to Source Code Analysis
           for Optimization of Heterogeneous Systems: Recent Results, Challenges and
           Opportunities

    • Authors: Francesco Barchi, Emanuele Parisi, Andrea Bartolini, Andrea Acquaviva
      First page: 37
      Abstract: To cope with the increasing complexity of digital systems programming, deep learning techniques have recently been proposed to enhance software deployment by analysing source code for different purposes, ranging from performance and energy improvement to debugging and security assessment. As embedded platforms for cyber-physical systems are characterised by increasing heterogeneity and parallelism, one of the most challenging and specific problems is efficiently allocating computational kernels to available hardware resources. In this field, deep learning applied to source code can be a key enabler to face this complexity. However, due to the rapid development of such techniques, it is not easy to understand which of those are suitable and most promising for this class of systems. For this purpose, we discuss recent developments in deep learning for source code analysis, and focus on techniques for kernel mapping on heterogeneous platforms, highlighting recent results, challenges and opportunities for their applications to cyber-physical systems.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-07-05
      DOI: 10.3390/jlpea12030037
      Issue No: Vol. 12, No. 3 (2022)
       
  • JLPEA, Vol. 12, Pages 38: Analysis and Comparison of Different Approaches
           to Implementing a Network-Based Parallel Data Processing Algorithm

    • Authors: Iouliia Skliarova
      First page: 38
      Abstract: It is well known that network-based parallel data processing algorithms are well suited to implementation in reconfigurable hardware using either Field-Programmable Gate Arrays (FPGA) or Programmable Systems-on-Chip (PSoC). The intrinsic parallelism of these devices makes it possible to execute several data-independent network operations in parallel. However, the approaches to designing the respective systems vary significantly with the experience and background of the engineer in charge. In this paper, we analyze and compare the pros and cons of using an embedded processor, high-level synthesis methods, and low-level register-transfer design in terms of design effort, performance, and power consumption for implementing a parallel algorithm to find the two smallest values in a dataset. This problem is easy to formulate, has a number of practical applications (for instance, in low-density parity check decoders), and is very well suited to parallel implementation based on comparator networks.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-07-09
      DOI: 10.3390/jlpea12030038
      Issue No: Vol. 12, No. 3 (2022)
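
      The two-smallest-values problem lends itself to a tree of small merge nodes, each combining two (smallest, second smallest) pairs. The sketch below is one possible comparator-tree formulation written sequentially in software; it is not necessarily the network used in the paper, and the function names are assumptions.

```python
def merge_pair(a, b):
    """Merge two (min1, min2) pairs, each with min1 <= min2,
    into the two smallest of the four values."""
    if a[0] <= b[0]:
        return (a[0], min(a[1], b[0]))
    return (b[0], min(b[1], a[0]))


def two_smallest(values):
    """Return (smallest, second smallest) using a tree of merge nodes.

    All merges within one tree level are independent, so in hardware
    they could run in parallel, giving a depth of about log2(n) levels.
    """
    if len(values) < 2:
        raise ValueError("need at least two values")
    infinity = float("inf")
    level = [(v, infinity) for v in values]
    while len(level) > 1:
        merged = [merge_pair(level[i], level[i + 1]) for i in range(0, len(level) - 1, 2)]
        if len(level) % 2:
            merged.append(level[-1])  # odd element passes through unchanged
        level = merged
    return level[0]


print(two_smallest([7, 3, 9, 1, 4, 1]))  # (1, 1)
print(two_smallest([5, 2, 8, 6]))        # (2, 5)
```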
       
  • JLPEA, Vol. 12, Pages 39: Efficiency of Priority Queue Architectures in
           FPGA

    • Authors: Lukáš Kohútka
      First page: 39
      Abstract: This paper presents a novel SRAM-based architecture of a data structure that represents a set of multiple priority queues that can be implemented in FPGA or ASIC. The proposed architecture is based on shift registers, systolic arrays and SRAM memories. Such architecture, called MultiQueue, is optimized for minimum chip area costs, which leads to lower energy consumption too. The MultiQueue architecture has constant time complexity, constant critical path length and constant latency. Therefore, it is highly predictable and very suitable for real-time systems too. The proposed architecture was verified using a simplified version of UVM and applying millions of instructions with randomly generated input values. Achieved FPGA synthesis results are presented and discussed. These results show significant savings in FPGA Look-Up Tables consumption in comparison to existing solutions. More than 63% of Look-Up Tables can be saved using the MultiQueue architecture instead of the existing priority queues.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-07-14
      DOI: 10.3390/jlpea12030039
      Issue No: Vol. 12, No. 3 (2022)
       
  • JLPEA, Vol. 12, Pages 40: Dynamic SIMD Parallel Execution on GPU from
           High-Level Dataflow Synthesis

    • Authors: Aurelien Bloch, Simone Casale-Brunet, Marco Mattavelli
      First page: 40
      Abstract: Developing and fine-tuning software programs for heterogeneous hardware such as CPU/GPU processing platforms is a highly complex endeavor that demands considerable time and effort from software engineers and requires evaluating various fundamental components and features of both the design and the platform to maximize the overall performance. The dataflow programming approach has proven to be an appropriate methodology for reaching such a difficult and complex goal, owing to its intrinsic portability and the possibility of easily decomposing a network of actors on different processing units of the heterogeneous hardware. Nonetheless, such a design method might not be enough on its own to achieve the desired performance goals, and supporting tools are useful for efficiently exploring the design space so as to optimize the desired performance objectives. This article presents a methodology composed of several stages for enhancing the performance of dataflow software developed in RVC-CAL and generating low-level implementations to be executed on GPU/CPU heterogeneous hardware platforms. The stages are composed of a method for the efficient scheduling of parallel CUDA partitions, an optimization of the performance of the data transmission tasks across computing kernels, and the exploitation of dynamic programming for introducing SIMD-capable graphics processing unit systems. The methodology is validated both quantitatively and qualitatively by means of dataflow software application examples running on platforms according to various mapping configurations.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-07-17
      DOI: 10.3390/jlpea12030040
      Issue No: Vol. 12, No. 3 (2022)
       
  • JLPEA, Vol. 12, Pages 41: Electrical Impedance Tomography for Hand Gesture
           Recognition for HMI Interaction Applications

    • Authors: Noelia Vaquero-Gallardo, Herminio Martínez-García
      First page: 41
      Abstract: Electrical impedance tomography (EIT) is based on the physical principle of bioimpedance defined as the opposition that biological tissues exhibit to the flow of a rotating alternating electrical current. Consequently, here, we propose studying the characterization and classification of bioimpedance patterns based on EIT by measuring, on the forearm with eight electrodes in a non-invasive way, the potential drops resulting from the execution of six hand gestures. The starting point was the acquisition of bioimpedance patterns studied by means of principal component analysis (PCA), validated through the cross-validation technique, and classified using the k-nearest neighbor (kNN) classification algorithm. As a result, it is concluded that reduction and classification are feasible, with a sensitivity of 0.89 in the worst case, for each of the reduced bioimpedance patterns, leading to the following direct advantage: a reduction in the number of electrodes and electronics required. In this work, bioimpedance patterns were investigated for monitoring subjects’ mobility, where, generally, these solutions are based on a sensor system with moving parts that suffer from significant problems of wear, lack of adaptability to the patient, and lack of resolution. By contrast, the proposal implemented in this prototype, based on electrical impedance tomography, does not have these problems.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-07-18
      DOI: 10.3390/jlpea12030041
      Issue No: Vol. 12, No. 3 (2022)
       
  • JLPEA, Vol. 12, Pages 42: The Benefits and Costs of Netlist Randomization
           Based Side-Channel Countermeasures: An In-Depth Evaluation

    • Authors: Ali Asghar, Andreas Becher, Daniel Ziener
      First page: 42
      Abstract: Exchanging FPGA-based implementations of cryptographic algorithms during run-time using netlist randomized versions has been introduced recently as a unique countermeasure against side channel attacks. Using partial reconfiguration, it is possible to shuffle between structurally different but functionally similar versions of a cryptographic implementation. The resulting varying power profile enhances the resistance against power-based side channel attacks. While side channel leakage is reduced, costs in terms of additional resources and/or lowered throughput are often increased due to the overheads of the required online partial reconfiguration. In this work, we provide an in-depth evaluation of the leakage-area-throughput trade-off.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-07-23
      DOI: 10.3390/jlpea12030042
      Issue No: Vol. 12, No. 3 (2022)
       
  • JLPEA, Vol. 12, Pages 43: A Subthreshold Layout Strategy for Faster and
           Lower Energy Complex Digital Circuits

    • Authors: Jordan Morris, Pranay Prabhat, James Myers, Alex Yakovlev
      First page: 43
      Abstract: This work presents complex circuitry from subthreshold standard cell libraries created by geometric STI spacer patterning for bulk planar CMOS technology nodes. Performance/leakage granularity enhancement affords safer multi-Vt synthesis in aggressive voltage scaling schemes. Libraries are evaluated in silicon through implementation of 32-bit datapath 128-bit AES cores. Intra-die nominal temperature (20 °C) analysis reveals improvements of up to 8.65×/24% MEP-to-MEP in frequency and energy-per-cycle respectively, compared to a state-of-the-art subthreshold library. A negative temperature correlation with performance enhancement is demonstrated extending beyond the cell level and into more complex designs. MEP-to-MEP performance enhancement and energy-per-cycle reduction are demonstrated over a temperature range of 0 °C to 85 °C.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-08-02
      DOI: 10.3390/jlpea12030043
      Issue No: Vol. 12, No. 3 (2022)
       
  • JLPEA, Vol. 12, Pages 44: Review on the Basic Circuit Elements and
           Memristor Interpretation: Analysis, Technology and Applications

    • Authors: Aliyu Isah, Jean-Marie Bilbault
      First page: 44
      Abstract: Circuit or electronic components are useful elements allowing the realization of different circuit functionalities. The resistor, capacitor and inductor represent the three commonly known basic passive circuit elements owing to their fundamental nature relating them to the four circuit variables, namely voltage, magnetic flux, current and electric charge. The memory resistor (or memristor) was claimed to be the fourth basic passive circuit element, complementing the resistor, capacitor and inductor. This paper presents a review of the four basic passive circuit elements. After a brief recap of the first three known basic passive circuit elements, a thorough description of the memristor follows. The memristor sparks interest in the scientific community due to its interesting features, for example nano-scalability, memory capability, conductance modulation, connection flexibility and compatibility with CMOS technology. These features, among many others, are currently in high demand on an industrial scale. For this reason, thousands of memristor-based applications have been reported. Hence, the paper presents an in-depth overview of the philosophical arguments about the memristor, its technologies and its applications.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-08-03
      DOI: 10.3390/jlpea12030044
      Issue No: Vol. 12, No. 3 (2022)
       
  • JLPEA, Vol. 12, Pages 45: Computer Engineering Education Experiences with
           

    • Authors: Peter Jamieson, Huan Le, Nathan Martin, Tyler McGrew, Yicheng Qian, Eric Schonauer, Alan Ehret, Michel A. Kinsy
      First page: 45
      Abstract: With the growing popularity of RISC-V and various open-source released RISC-V processors, it is now possible for computer engineering students to explore this simple and relevant architecture. These students can also explore and design a microcontroller at a low level using real tool-flows, and implement and test their hardware. In this work, we describe our experiences with undergraduate engineers building RISC-V architectures on an FPGA and then extending their experiences to implement an Arduino-like RISC-V tool-flow and the respective hardware and software to handle input-output ports, interrupts, hardware timers, and communication protocols. The microcontroller is implemented on an FPGA as a Senior Design project to test the viability of such efforts. In this work, we explain how undergraduates can achieve these experiences, including preparation for these projects, the tool-flows they use, the challenges in understanding and extending a RISC-V processor with microcontroller functionality, and a suggestion of how to integrate this learning into an existing curriculum, including a discussion of whether we should include these deeper experiences in the Computer Engineering undergraduate curriculum.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-08-09
      DOI: 10.3390/jlpea12030045
      Issue No: Vol. 12, No. 3 (2022)
       
  • JLPEA, Vol. 12, Pages 46: High-Speed and Energy-Efficient Carry Look-Ahead
           Adder

    • Authors: Padmanabhan Balasubramanian, Nikos E. Mastorakis
      First page: 46
      Abstract: The carry look-ahead adder (CLA) is well known among the family of high-speed adders. However, a conventional CLA is not faster than other high-speed adders such as a conditional sum adder (CSA), a carry-select adder (CSLA), and the Kogge–Stone adder (KSA), which is the fastest parallel-prefix adder. Further, in terms of power-delay product (PDP) that characterizes the energy of digital circuits, the conventional CLA is not efficient compared to CSLA and KSA. In this context, this paper presents a high-speed and energy-efficient architecture for the CLA. Many adders ranging from ripple carry to parallel-prefix adders were implemented using a 32-28 nm CMOS standard digital cell library by considering a 32-bit addition. The adders were structurally described in Verilog and synthesized using Synopsys Design Compiler. From the results obtained, it is observed that the proposed CLA achieves a reduction in critical path delay by 55.3% and a reduction in PDP by 45% compared to the conventional CLA. Compared to the CSA, the proposed CLA achieves a reduction in critical path delay by 33.9%, a reduction in power by 26.1%, and a reduction in PDP by 51.1%. Compared to an optimized CSLA, the proposed CLA achieves a reduction in power by 35.4%, a reduction in area by 37.3%, and a reduction in PDP by 37.1% without sacrificing the speed. Although the KSA is faster, the proposed CLA achieves a reduction in power by 39.6%, a reduction in PDP by 6.5%, and a reduction in area by 55.6% in comparison.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-08-10
      DOI: 10.3390/jlpea12030046
      Issue No: Vol. 12, No. 3 (2022)
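As background to the abstract above (DOI 10.3390/jlpea12030046), the textbook carry look-ahead idea can be sketched in software. This models a conventional CLA built from 4-bit look-ahead blocks chained together, not the proposed high-speed architecture; the names `cla4` and `cla_add` are hypothetical, and the sketch assumes `width` is a multiple of 4.

```python
from typing import Tuple


def cla4(a: int, b: int, c0: int) -> Tuple[int, int]:
    """4-bit look-ahead block: returns (4-bit sum, carry out)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]    # generate: a_i AND b_i
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]  # propagate: a_i XOR b_i
    # Every carry is written directly in terms of g, p and c0, so in hardware
    # none of them has to wait for the previous carry to ripple through.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    carries = [c0, c1, c2, c3]
    s = 0
    for i in range(4):
        s |= (p[i] ^ carries[i]) << i  # sum bit: p_i XOR c_i
    return s, c4


def cla_add(a: int, b: int, width: int = 32) -> Tuple[int, int]:
    """Add two unsigned `width`-bit integers with chained 4-bit CLA blocks."""
    carry = 0
    result = 0
    for shift in range(0, width, 4):
        s, carry = cla4((a >> shift) & 0xF, (b >> shift) & 0xF, carry)
        result |= s << shift
    return result, carry
```

The power-delay product (PDP) quoted in the abstract is simply power multiplied by critical path delay, so a design can lower its PDP by reducing either factor.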
       
  • JLPEA, Vol. 12, Pages 47: LoRa-Based Wireless Sensors Network for Rockfall
           and Landslide Monitoring: A Case Study in Pantelleria Island with Portable
           LoRaWAN Access

    • Authors: Mattia Ragnoli, Alfiero Leoni, Gianluca Barile, Giuseppe Ferri, Vincenzo Stornelli
      First page: 47
      Abstract: Rockfalls and landslides are hazards triggered by geomorphological and climatic factors other than human interaction. Their economic and social impacts are not negligible; therefore, the topic has become an important field in the application of remote monitoring. Wireless sensor networks (WSNs) are particularly suited for the deployment of such systems, thanks to the different technologies and topologies that are evolving nowadays. Among these, the LoRa modulation technique represents a fitting technical solution for node communication in a WSN. In this paper, a smart autonomous LoRa-based rockfall and landslide monitoring system is presented. The system has been operating on Pantelleria Island, Sicily, Italy. The sensing elements are placed in sensor nodes arranged in a star topology. Network access to the LoRaWAN and the Internet is provided through gateways using a portable, solar-powered device assembly. A system overview covering both the hardware and the functionality of the node and gateway devices is given, then a power analysis is reported, and one month of recorded results is presented with related discussion.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-09-07
      DOI: 10.3390/jlpea12030047
      Issue No: Vol. 12, No. 3 (2022)
       
  • JLPEA, Vol. 12, Pages 48: Time- and Amplitude-Controlled Power Noise
           Generator against SPA Attacks for FPGA-Based IoT Devices

    • Authors: Luis Parrilla, Antonio García, Encarnación Castillo, Salvador Rodríguez-Bolívar, Juan Antonio López-Villanueva
      First page: 48
      Abstract: Power noise generation for masking power traces is a powerful countermeasure against Simple Power Analysis (SPA), and it has also been used against Differential Power Analysis (DPA) or Correlation Power Analysis (CPA) in the case of cryptographic circuits. This technique makes use of power consumption generators as basic modules, which are usually based on ring oscillators when implemented on FPGAs. These modules can be used to generate power noise and also to extract digital signatures through the power side channel for Intellectual Property (IP) protection purposes. In this paper, a new power consumption generator, named Xored High Consuming Module (XHCM), is proposed. Compared to other proposals in the literature, XHCM improves the amount of current consumption per LUT when implemented on FPGAs. Experimental results show that these modules can achieve current increments in the range from 2.4 mA (with only 16 LUTs on Artix-7 devices with a power consumption density of 0.75 mW/LUT when using a single HCM) to 11.1 mA (with 67 LUTs when using 8 XHCMs, with a power consumption density of 0.83 mW/LUT). Moreover, a version controlled by Pulse-Width Modulation (PWM) has been developed, named PWM-XHCM, which, like XHCM, is suitable for power watermarking. In order to build countermeasures against SPA attacks, a multi-level XHCM (ML-XHCM) is also presented, which is capable of generating different power consumption levels with minimal area overhead (27 six-input LUTs for generating 16 different amplitude levels on Artix-7 devices). Finally, a randomized version, named RML-XHCM, has also been developed using two True Random Number Generators (TRNGs) to generate current consumption peaks with random amplitudes at random times. RML-XHCM requires fewer than 150 LUTs on Artix-7 devices. Taking these characteristics into account, this article makes two main contributions: first, XHCM and PWM-XHCM provide an efficient power consumption generator for extracting digital signatures through the power side channel; second, ML-XHCM and RML-XHCM are powerful tools for the protection of processing units against SPA attacks in IoT devices implemented on FPGAs.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-09-10
      DOI: 10.3390/jlpea12030048
      Issue No: Vol. 12, No. 3 (2022)
       
  • JLPEA, Vol. 12, Pages 19: A Novel Inductorless Design Technique for Linear
           Equalization in Optical Receivers

    • Authors: Diaaeldin Abdelrahman, Christopher Williams, Odile Liboiron-Ladouceur, Glenn E. R. Cowan
      First page: 19
      Abstract: To mitigate the trade-off between gain and bandwidth of CMOS multistage amplifiers, a receiver front-end (FE) that employs a high-gain narrowband transimpedance amplifier (TIA) followed by an equalizing main amplifier (EMA) is proposed. The EMA provides a high-frequency peaking to extend the FE’s bandwidth from 25% to 60% of the targeted data rate (fbit). The peaking is realized by adding a pole in the feedback paths of an active feedback-based wideband amplifier. By embedding the peaking in the main amplifier (MA), the front-end meets the sensitivity and gain of conventional equalizer-based receivers with better energy efficiency by eliminating the equalizer stages. Simulated in TSMC 65 nm CMOS technology, the proposed front-end achieves 7.4 dB and 6 dB higher gain at 10 Gb/s and 20 Gb/s, respectively, compared to a conventional front-end that is designed for equal bandwidth and dissipates the same power. The higher gain demonstrates the capability of the proposed technique in breaking the gain-bandwidth trade-off. The higher gain also reduces the power penalty incurred by the decision circuit and improves the sensitivity by 1.5 dB and 2.24 dB at 10 Gb/s and 20 Gb/s, respectively. Simulations also confirm that the proposed FE exhibits a robust performance against process and temperature variations and can support large input currents.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-04-01
      DOI: 10.3390/jlpea12020019
      Issue No: Vol. 12, No. 2 (2022)
       
  • JLPEA, Vol. 12, Pages 20: An Experimental Study on Step-Up DC–DC
           Converters for Organic Photovoltaic Cells

    • Authors: P. Mendonça dos Santos, António J. Serralheiro, Beatriz Borges, João Paulo N. Torres, Ana Charas
      First page: 20
      Abstract: This work studies two circuit topologies to step up the voltage supplied by an organic photovoltaic (OPV) cell. Comparison and validation of the proposed topologies are accomplished through analytical, simulation, and experimental results. Two circuit solutions were found more suitable to boost the harvested OPV cell low voltage, depending on the load condition: the classical hard-switching boost converter and a multilevel boost converter. Both experimental circuits include the drive of the MOSFET switch based on an LC oscillator at 1.2 MHz, allowing the implementation of a conversion system, supplied by voltages as low as 500 mV, with output voltages from 1.2 V up to 7 V, under solar simulator conditions. The circuit area for each converter prototype is 2.35 cm², with a total area below 3.0 cm² for the overall energy harvesting system, including the OPV cell, which makes this proposal an extremely compact solution for ultra-low power harvesting applications.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-04-08
      DOI: 10.3390/jlpea12020020
      Issue No: Vol. 12, No. 2 (2022)
       
  • JLPEA, Vol. 12, Pages 21: Real-Time Embedded Implementation of Improved
           Object Detector for Resource-Constrained Devices

    • Authors: Niranjan Ravi, Mohamed El-Sharkawy
      First page: 21
      Abstract: Artificial intelligence (A.I.) has revolutionised a wide range of human activities, including the accelerated development of autonomous vehicles. Self-navigating delivery robots are recent trends in A.I. applications such as multitarget object detection, image classification, and segmentation to tackle sociotechnical challenges, including the development of autonomous driving vehicles, surveillance systems, intelligent transportation, and smart traffic monitoring systems. In recent years, object detection and its deployment on embedded edge devices have seen a rise in interest compared to other perception tasks. Embedded edge devices have limited computing power, which impedes the deployment of efficient detection algorithms in resource-constrained environments. To improve on-board computational latency, edge devices often sacrifice performance, creating the need for highly efficient A.I. models. This research examines existing loss metrics and their weaknesses, and proposes an improved loss metric that can address the bounding box regression problem. Enhanced metrics were implemented in an ultraefficient YOLOv5 network and tested on the targeted datasets. The latest version of the PyTorch framework was incorporated in model development. The model was further deployed using the ROS 2 framework running on NVIDIA Jetson Xavier NX, an embedded development platform, to conduct the experiment in real time.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-04-13
      DOI: 10.3390/jlpea12020021
      Issue No: Vol. 12, No. 2 (2022)
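The bounding box regression problem mentioned in the abstract above (DOI 10.3390/jlpea12020021) is usually framed around intersection over union (IoU). The sketch below shows plain IoU and the basic `1 - IoU` loss only; it is not the improved metric the paper proposes. The function names and the `(x1, y1, x2, y2)` corner format are assumptions made for illustration.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Overlap width and height, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred: Box, target: Box) -> float:
    """Basic IoU loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - iou(pred, target)
```

One well-known weakness of this basic loss is that every pair of non-overlapping boxes scores exactly 1, regardless of how far apart the boxes are, so the loss gives no signal about which direction would bring them closer.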
       
  • JLPEA, Vol. 12, Pages 22: Graph Coloring via Locally-Active Memristor
           Oscillatory Networks

    • Authors: Alon Ascoli, Martin Weiher, Melanie Herzig, Stefan Slesazeck, Thomas Mikolajick, Ronald Tetzlaff
      First page: 22
      Abstract: This manuscript provides a comprehensive tutorial on the operating principles of a bio-inspired Cellular Nonlinear Network, leveraging the local activity of NbOx memristors to apply a spike-based computing paradigm, which is expected to deliver such a separation between the steady-state phases of its capacitively-coupled oscillators, relative to a reference cell, as to unveil the classification of the nodes of the associated graphs into the least number of groups, according to the rules of a non-deterministic polynomial-hard combinatorial optimization problem, known as vertex coloring. Besides providing the theoretical foundations of the bio-inspired signal-processing paradigm, implemented by the proposed Memristor Oscillatory Network, and presenting pedagogical examples, illustrating how the phase dynamics of the memristive computing engine makes it possible to solve the graph coloring problem, the paper further presents strategies to compensate for an imbalance in the number of couplings per oscillator, to counteract the intrinsic variability observed in the electrical behaviours of memristor samples from the same batch, and to prevent the impasse appearing when the array attains a steady state corresponding to a local minimum of the optimization goal. The proposed Memristor Cellular Nonlinear Network, endowed with ad hoc circuitry for the implementation of these control strategies, is found to classify the vertices of a wide set of graphs in a number of color groups lower than the cardinality of the set of colors identified by traditional software or hardware competitor systems. Given that, under nominal operating conditions, a biological system, such as the brain, is naturally capable of optimising energy consumption in problem-solving activities, the capability of locally-active memristor nanotechnologies to enable the circuit implementation of bio-inspired signal processing paradigms is expected to pave the way toward electronics with higher time and energy efficiency than state-of-the-art purely-CMOS hardware.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-04-18
      DOI: 10.3390/jlpea12020022
      Issue No: Vol. 12, No. 2 (2022)
       
  • JLPEA, Vol. 12, Pages 23: A Network Simulator for the Estimation of
           Bandwidth Load and Latency Created by Heterogeneous Spiking Neural
           Networks on Neuromorphic Computing Communication Networks

    • Authors: Robert Kleijnen, Markus Robens, Michael Schiek, Stefan van Waasen
      First page: 23
      Abstract: Accelerated simulations of biological neural networks are in demand to discover the principles of biological learning. Novel many-core simulation platforms, e.g., SpiNNaker, BrainScaleS and Neurogrid, allow one to study neuron behavior in the brain at an accelerated rate, with a high level of detail. However, they do not come anywhere near simulating the human brain. The massive amount of spike communication has turned out to be a bottleneck. We specifically developed a network simulator to analyze in high detail the network loads and latencies caused by different network topologies and communication protocols in neuromorphic computing communication networks. This simulator allows simulating the impacts of heterogeneous neural networks and evaluating neuron mapping algorithms, which is a unique feature among state-of-the-art network models and simulators. The simulator was cross-checked by comparing the results of a homogeneous neural network-based run with corresponding bandwidth load results from comparable works. Additionally, the increased level of detail achieved by the new simulator is presented. Then, we show the impact heterogeneous connectivity can have on the network load, first for a small-scale test case, and later for a large-scale test case, and how different neuron mapping algorithms can influence this effect. Finally, we look at the latency estimations performed by the simulator for different mapping algorithms, and the impact of the node size.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-04-21
      DOI: 10.3390/jlpea12020023
      Issue No: Vol. 12, No. 2 (2022)
       
  • JLPEA, Vol. 12, Pages 24: Low-Power Deep Learning Model for Plant Disease
           Detection for Smart-Hydroponics Using Knowledge Distillation Techniques

    • Authors: Aminu Musa, Mohammed Hassan, Mohamed Hamada, Farouq Aliyu
      First page: 24
      Abstract: Recent advances in computing allow researchers to propose the automation of hydroponic systems to boost efficiency and reduce manpower demands, hence increasing agricultural produce and profit. A completely automated hydroponic system should be equipped with tools capable of detecting plant diseases in real time. Despite the availability of deep-learning-based plant disease detection models, the existing models are not designed for an embedded system environment, and the models cannot realistically be deployed on resource-constrained IoT devices such as a Raspberry Pi or a smartphone. Some of the drawbacks of the existing models are the following: high computational resource requirements, high power consumption, rapid energy dissipation, and large storage requirements due to their large, complex structure. Therefore, in this paper, we propose a low-power deep learning model for plant disease detection using knowledge distillation techniques. The proposed low-power model has a simple network structure of a shallow neural network. The parameters of the model were also reduced by more than 90%. This reduces its computational requirements as well as its power consumption. The proposed low-power model has a maximum power consumption of 6.22 W, which is significantly lower compared to the existing models, and achieved a detection accuracy of 99.4%.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-04-26
      DOI: 10.3390/jlpea12020024
      Issue No: Vol. 12, No. 2 (2022)
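Knowledge distillation, the technique named in the abstract above (DOI 10.3390/jlpea12020024), trains a small student network to match the softened outputs of a larger teacher. The sketch below shows the commonly used temperature-scaled formulation for a single example; whether the paper uses this exact loss is an assumption, and the function names and the default `temperature` and `alpha` values are hypothetical.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Softmax with temperature; higher temperatures give softer distributions."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits: List[float],
                      teacher_logits: List[float],
                      label: int,
                      temperature: float = 4.0,
                      alpha: float = 0.5) -> float:
    """Weighted sum of hard-label cross-entropy and soft-target KL divergence."""
    # Hard term: ordinary cross-entropy of the student against the true label.
    hard = -math.log(softmax(student_logits)[label])
    # Soft term: KL(teacher || student) on temperature-softened distributions.
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(p_teacher, p_student) if t > 0)
    # The temperature**2 factor keeps the soft term's scale comparable
    # to the hard term as the temperature changes.
    return alpha * hard + (1.0 - alpha) * temperature ** 2 * soft
```

When the student reproduces the teacher's logits exactly, the soft term vanishes and only the hard-label term remains.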
       
  • JLPEA, Vol. 12, Pages 25: Selective Noise Based Power-Efficient and
           Effective Countermeasure against Thermal Covert Channel Attacks in
           Multi-Core Systems

    • Authors: Parisa Rahimi, Amit Kumar Singh, Xiaohang Wang
      First page: 25
      Abstract: With increasing interest in multi-core systems, these systems, like any communication infrastructure, can become targets for information leakage via covert channel communication. Covert channel attacks lead to leaking secret information and data. To design countermeasures against these threats, we need to have good knowledge about classes of covert channel attacks along with their properties. A temperature-based covert communication channel, known as a Thermal Covert Channel (TCC), can pose a threat to the security of critical information and data. In this paper, we present a novel scheme against such TCC attacks. The scheme adds selective noise to the thermal signal so that any possible TCC attack can be wiped out. The noise addition only happens at instances when there are chances of correct information exchange, to increase the bit error rate (BER) and keep the power consumption low. Our experiments show that the BER of a TCC attack can increase to 94% while having similar power consumption to that of the state of the art.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-05-03
      DOI: 10.3390/jlpea12020025
      Issue No: Vol. 12, No. 2 (2022)
       
  • JLPEA, Vol. 12, Pages 26: A Generalistic Approach to
           Machine-Learning-Supported Task Migration on Real-Time Systems

    • Authors: Octavio Delgadillo, Bernhard Blieninger, Juri Kuhn, Uwe Baumgarten
      First page: 26
      Abstract: Consolidating tasks to a smaller number of electronic control units (ECUs) is an important strategy for optimizing costs and resources in the automotive industry. In our research, we aim to enable ECU consolidation by migrating tasks at runtime between different ECUs, which adds redundancy and fail-safety capabilities to the system. In this paper, we present a setup with a generalistic and modular architecture that allows for integrating and testing different ECU architectures and machine learning (ML) models. As part of a holistic testbed, we introduce a collection of reproducible tasks, as well as a toolchain that controls the dynamic migration of tasks depending on ECU status and load. The migration is aided by the machine learning predictions on the schedulability analysis of possible future task distributions. To demonstrate the capabilities of the setup, we show its integration with FreeRTOS-based ECUs and two ML models—a long short-term memory (LSTM) network and a spiking neural network—along with a collection of tasks to distribute among the ECUs. Our approach shows a promising potential for machine-learning-based schedulability analysis and enables a comparison between different ML models.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-05-03
      DOI: 10.3390/jlpea12020026
      Issue No: Vol. 12, No. 2 (2022)
       
  • JLPEA, Vol. 12, Pages 27: A Standard-Cell-Based CMFB for Fully
           Synthesizable OTAs

    • Authors: Francesco Centurelli, Riccardo Della Sala, Giuseppe Scotti
      First page: 27
      Abstract: In this paper, we propose a fully standard-cell-based common-mode feedback (CMFB) loop with an explicit voltage reference to improve the CMRR of pseudo-differential standard-cell-based amplifiers and to stabilize the dc output voltage. This latter feature allows robust biasing of operational transconductance amplifiers (OTAs) based on a cascade of such stages. A detailed analysis of the CMFB is reported both to provide insight into circuit behavior and to derive useful design guidelines. The proposed CMFB is then exploited to build a fully standard-cell OTA suitable for automatic place and route. Simulation results referring to the standard-cell library of a commercial 130 nm CMOS process show a differential gain of 28.3 dB with a gain-bandwidth product of 15.4 MHz when driving a 1.5 pF load capacitance. The OTA exhibits good robustness under PVT and mismatch variations and achieves state-of-the-art FOMs, also thanks to the limited area footprint.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-05-05
      DOI: 10.3390/jlpea12020027
      Issue No: Vol. 12, No. 2 (2022)
       
  • JLPEA, Vol. 12, Pages 28: Big–Little Adaptive Neural Networks on
           Low-Power Near-Subthreshold Processors

    • Authors: Zichao Shen, Neil Howard, Jose Nunez-Yanez
      First page: 28
      Abstract: This paper investigates the energy savings that near-subthreshold processors can obtain in edge AI applications and proposes strategies to improve them while maintaining the accuracy of the application. The selected processors deploy adaptive voltage scaling techniques in which the frequency and voltage levels of the processor core are determined at run time. In these systems, embedded RAM and flash memory size is typically limited to less than 1 megabyte to save power. This limited memory imposes restrictions on the complexity of the neural network models that can be mapped to these devices and the required trade-offs between accuracy and battery life. To address these issues, we propose and evaluate alternative ‘big–little’ neural network strategies to improve battery life while maintaining prediction accuracy. The strategies are applied to a human activity recognition application selected as a demonstrator, which shows that, compared to the original network, the best configurations obtain a measured energy reduction of 80% while maintaining the original level of inference accuracy.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-05-18
      DOI: 10.3390/jlpea12020028
      Issue No: Vol. 12, No. 2 (2022)
       
  • JLPEA, Vol. 12, Pages 29: Low-Overhead Reinforcement Learning-Based Power
           Management Using 2QoSM

    • Authors: Michael Giardino, Daniel Schwyn, Bonnie Ferri, Aldo Ferri
      First page: 29
      Abstract: With the computational systems of even embedded devices becoming ever more powerful, there is a need for more effective and proactive methods of dynamic power management. The work presented in this paper demonstrates the effectiveness of a reinforcement-learning-based dynamic power manager placed in a software framework. This combination of Q-learning for determining policy and software abstractions provides many of the benefits of co-design, namely, good performance, responsiveness and application guidance, with the flexibility of easily changing policies or platforms. The Q-learning-based Quality of Service Manager (2QoSM) is implemented on an autonomous robot built on a complex, powerful embedded single-board computer (SBC) and a high-resolution path-planning algorithm. We find that the 2QoSM reduces power consumption by up to 42% compared to the Linux on-demand governor and by 10.2% compared to a state-of-the-art situation-aware governor. Moreover, the performance as measured by path error is improved by up to 6.1%, all while saving power.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-05-19
      DOI: 10.3390/jlpea12020029
      Issue No: Vol. 12, No. 2 (2022)
       
  • JLPEA, Vol. 12, Pages 30: Embedded Object Detection with Custom LittleNet,
           FINN and Vitis AI DCNN Accelerators

    • Authors: Michal Machura, Michal Danilowicz, Tomasz Kryjak
      First page: 30
      Abstract: Object detection is an essential component of many systems used, for example, in advanced driver assistance systems (ADAS) or advanced video surveillance systems (AVSS). Currently, the highest detection accuracy is achieved by solutions using deep convolutional neural networks (DCNN). Unfortunately, these come at the cost of high computational complexity; hence, work on accelerating these algorithms, in the broad sense, is very important and timely. In this work, we compare three different DCNN hardware accelerator implementation methods: coarse-grained (a custom accelerator called LittleNet), fine-grained (FINN) and sequential (Vitis AI). We evaluate the approaches in terms of object detection accuracy, throughput and energy usage on the VOT and VTB datasets. We also present the limitations of each of the methods considered. We describe the whole process of DNN implementation, including architecture design, training, quantisation and hardware implementation. We used two custom DNN architectures to obtain a higher accuracy, higher throughput and lower energy consumption. The first was implemented in SystemVerilog and the second with the FINN tool from AMD Xilinx. Next, both approaches were compared with the Vitis AI tool from AMD Xilinx. The final implementations were tested on the Avnet Ultra96-V2 development board with the Zynq UltraScale+ MPSoC ZCU3EG device. For two different DNN architectures, we achieved a throughput of 196 fps for our custom accelerator and 111 fps for FINN. The same networks implemented with Vitis AI achieved 123.3 fps and 53.3 fps, respectively.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-05-20
      DOI: 10.3390/jlpea12020030
      Issue No: Vol. 12, No. 2 (2022)
       
  • JLPEA, Vol. 12, Pages 31: A Methodology to Design Static NCL Libraries

    • Authors: Toi Le Thanh, Lac Truong Tri, Trang Hoang
      First page: 31
      Abstract: The Null Convention Logic (NCL)-based asynchronous design technique has interested researchers because it overcomes disadvantages of the synchronous technique, such as noise, glitches, clock skew and power. However, using the NCL-based asynchronous design method is difficult for university students and researchers because of the lack of standard NCL cell libraries. Therefore, in this paper, a novel flow is proposed to design NCL cell libraries. These libraries are used to synthesize NCL-based asynchronous designs. We chose the static NCL cell library to illustrate the proposed design solution because this library is one of the most basic NCL libraries. Static NCL cells in this library are designed based on the Process Design Kit 45 nm technology and are implemented with the Virtuoso and Design Compiler (DC) tools. In addition, the Ocean script and Electronic Design Automation (EDA) environment are used to support designs and simulations. A complete library of 27 NCL cells was designed for study and research. We also synthesized NCL full adders using this library and compared our synthesis results with those of other authors. The comparison indicated that our results showed a 20% improvement in power consumption.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-06-06
      DOI: 10.3390/jlpea12020031
      Issue No: Vol. 12, No. 2 (2022)
       
  • JLPEA, Vol. 12, Pages 32: Implementing a Timing Error-Resilient and
           Energy-Efficient Near-Threshold Hardware Accelerator for Deep Neural
           Network Inference

    • Authors: Noel Daniel Gundi, Pramesh Pandey, Sanghamitra Roy, Koushik Chakraborty
      First page: 32
      Abstract: Increasing processing requirements in the Artificial Intelligence (AI) realm have led to the emergence of domain-specific architectures for Deep Neural Network (DNN) applications. Tensor Processing Unit (TPU), a DNN accelerator by Google, has emerged as a front runner outclassing its contemporaries, CPUs and GPUs, in performance by 15×–30×. TPUs have been deployed in Google data centers to cater to the performance demands. However, a TPU’s performance enhancement is accompanied by very high power consumption. In the pursuit of lowering the energy utilization, this paper proposes PREDITOR—a low-power TPU operating in the Near-Threshold Computing (NTC) realm. PREDITOR uses mathematical analysis to mitigate the undetectable timing errors by boosting the voltage of selected multiplier-and-accumulator units at specific intervals to enhance the performance of the NTC TPU, thereby ensuring a high inference accuracy at low voltage. PREDITOR offers up to 3×–5× improved performance in comparison to the leading-edge error mitigation schemes with a minor loss in accuracy.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-06-06
      DOI: 10.3390/jlpea12020032
      Issue No: Vol. 12, No. 2 (2022)
       
  • JLPEA, Vol. 12, Pages 33: The Potential of SoC FPAAs for Emerging
           Ultra-Low-Power Machine Learning

    • Authors: Jennifer Hasler
      First page: 33
      Abstract: Large-scale field-programmable analog arrays (FPAA) have the potential to handle machine inference and learning applications with significantly low energy requirements, potentially alleviating the high cost of these processes today, even in cloud-based systems. FPAA devices enable embedded machine learning, one form of physical mixed-signal computing, enabling machine learning and inference on low-power embedded platforms, particularly edge platforms. This discussion reviews the current capabilities of large-scale field-programmable analog arrays (FPAA), as well as considering the future potential of these SoC FPAA devices, including questions that enable ubiquitous use of FPAA devices similar to FPGA devices. Today’s FPAA devices include integrated analog and digital fabric, as well as specialized processors and infrastructure, becoming a platform of mixed-signal development and analog-enabled computing. We address and show that next-generation FPAAs can handle the required load of 10,000–10,000,000,000 PMAC, required for present and future large fielded applications, at orders of magnitude of lower energy levels than those expected by current technology, motivating the need to develop these new generations of FPAA devices.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-06-06
      DOI: 10.3390/jlpea12020033
      Issue No: Vol. 12, No. 2 (2022)
       
  • JLPEA, Vol. 12, Pages 34: Bridging the Gap between Design and Simulation
           of Low-Voltage CMOS Circuits

    • Authors: Cristina Missel Adornes, Deni Germano Alves Neto, Márcio Cherem Schneider, Carlos Galup-Montoro
      First page: 34
      Abstract: This work proposes a truly compact MOSFET model that contains only four parameters to assist an integrated circuit (IC) designer in a design by hand. The four-parameter model (4PM) is based on the advanced compact MOSFET (ACM) model and was implemented in Verilog-A to simulate different circuits designed with the ACM model in Verilog-compatible simulators. Being able to simulate MOS circuits through the same model used in a hand design helps designers understand how the main MOSFET parameters affect the design. Herein, the classic CMOS inverter, a ring oscillator, a self-biased current source and a common source amplifier were designed and simulated using either the 4PM or the BSIM model. The four-parameter model was simulated in a variety of circuits with very satisfactory results in the low-voltage cases. As the ultra-low-voltage (ULV) domain is expanding due to applications such as the internet of things and wearable circuits, so is the use of a simplified ULV MOSFET model.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-06-16
      DOI: 10.3390/jlpea12020034
      Issue No: Vol. 12, No. 2 (2022)
       
  • JLPEA, Vol. 12, Pages 3: A Time-Domain z⁻¹ Circuit with Digital
           Calibration

    • Authors: Orfeas Panetas-Felouris, Spyridon Vlassis
      First page: 3
      Abstract: This paper presents a novel circuit for the z⁻¹ operation which is suitable, as a basic building block, for time-domain topologies and signal processing. The proposed circuit employs a time register circuit which is based on the capacitor discharging method. The large variation of the capacitor discharging slope over process and chip temperature variations, which affects the z⁻¹ accuracy, is reduced using a novel digital calibration loop. The circuit is designed using a 28 nm Samsung FD-SOI process under 1 V supply voltage with 5 MHz sampling frequency. Simulation results validate the theoretical analysis, showing a variation of the capacitor voltage discharging slope of less than 5% over worst-case process corners for temperatures between 0 °C and 100 °C while consuming only 30 μA. Also, the worst-case accuracy of the z⁻¹ operation is better than 33 ps for input pulse widths between 5 ns and 45 ns, a large improvement over the uncalibrated operator.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-01-03
      DOI: 10.3390/jlpea12010003
      Issue No: Vol. 12, No. 1 (2022)
       
  • JLPEA, Vol. 12, Pages 4: CORDIC Hardware Acceleration Using DMA-Based ISA
           Extension

    • Authors: Erez Manor, Avrech Ben-David, Shlomo Greenberg
      First page: 4
      Abstract: The use of RISC-based embedded processors aimed at low cost and low power is becoming an increasingly popular ecosystem for both hardware and software development. High-performance yet low-power embedded processors may be attained via the use of hardware acceleration and Instruction Set Architecture (ISA) extension. Recent AI publications have demonstrated the use of Coordinate Rotation Digital Computer (CORDIC) as a dedicated low-power solution for solving nonlinear equations applied to Neural Networks (NN). This paper proposes an ISA extension to support floating-point CORDIC, providing efficient hardware acceleration for mathematical functions. A new DMA-based ISA extension approach integrated with a pipelined CORDIC accelerator is proposed. The CORDIC ISA extension is directly interfaced with a standard processor data path, allowing efficient implementation of new trigonometric ALU-based custom instructions. The proposed DMA-based CORDIC accelerator can also be used to perform repeated array calculations, offering a significant speedup over software implementations. The proposed accelerator is evaluated on an Intel Cyclone-IV FPGA as an extension to the Nios processor. Experimental results show a significant speedup of over three orders of magnitude compared with a software implementation when applied to trigonometric arrays, and the accelerator outperforms the existing commercial CORDIC hardware accelerator.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-01-15
      DOI: 10.3390/jlpea12010004
      Issue No: Vol. 12, No. 1 (2022)
       
  • JLPEA, Vol. 12, Pages 5: Design Aspects of a Single-Output Multi-String
           WLED Driver Using 40 nm CMOS Technology

    • Authors: Fadi R. Shahroury, Hani H. Ahmad, Ibrahim Abuishmais
      First page: 5
      Abstract: This work presents various essential features and design aspects of a single-inductor, common-output, and multi-string White Light Emitting Diode (WLED) driver for low-power portable devices. High efficiency is one of the main features of such a device. Here, the efficiency improvement is achieved by selecting the proper arrangement of WLEDs and a proper sensing-circuit technique to determine the minimum output voltage needed in real time. This minimum voltage necessary to activate all WLEDs depends on the number of strings and the forward voltage drops among the WLEDs. Advanced CMOS technology is advantageous in mixed-signal environments such as WLED drivers. However, this process suffers from low on-resistance, which degrades the accuracy of the current sinks. To accommodate the above features and mitigate the low node process issue, a single-output boost converter driving a three-string load, with 6 WLEDs per string, is presented. The designed driver has an input voltage range of 3.2–4.2 V. The proposed solution is realized with ultra-low power consumption circuits and verified using ADS tools utilizing 40 nm 1P9M TSMC CMOS technology. An inter-string current accuracy of 0.2% and peak efficiency of 91% are achieved with an output voltage up to 25 V. The integrated WLED driver circuitry enables a high switching frequency of 1 MHz and reduces the passive elements’ size in the power stage.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-01-18
      DOI: 10.3390/jlpea12010005
      Issue No: Vol. 12, No. 1 (2022)
       
  • JLPEA, Vol. 12, Pages 6: Hardware/Software Solution for Low Power
           Evaluation of Tsunami Danger

    • Authors: Mikhail Lavrentiev, Konstantin Lysakov, Andrey Marchuk, Konstantin Oblaukhov, Mikhail Shadrin
      First page: 6
      Abstract: Carbon footprint reduction has been drawing more and more attention these days, and reducing energy consumption is among the basic directions along this line. In this paper, a low-energy approach to tsunami danger evaluation is considered. After several disastrous tsunamis of the 21st century, the question arises whether it is possible to evaluate, in a couple of minutes, the tsunami wave parameters expected at a particular geographic location. The point is that it takes around 20 min for the wave to approach the nearest coast after a seismic event offshore of Japan. Currently, the main tool for studying tsunamis is computer modeling. In particular, the expected tsunami height near the coastline, when a major underwater earthquake is detected, can be estimated by a series of numerical experiments covering various scenarios of generation and the subsequent wave propagation. Reducing the calculation time of such scenarios and the energy consumption they require is the scope of this study. Moreover, in the case of a major earthquake, an electric power shutdown is possible (e.g., the accident at the Fukushima nuclear power station in Japan on 11 March 2011), so the solution should consume little energy and preferably be based on regular personal computers (PCs) or laptops. The way to achieve the requested performance of numerical modeling on the PC platform is a combination of efficient algorithms and their hardware acceleration. Following this strategy, a solution for the fast numerical simulation of tsunami wave propagation has been proposed. Most tsunami researchers use the shallow-water approximation to simulate tsunami wave propagation in deep water areas. For software implementation, the MacCormack finite-difference scheme has been chosen, as it is suitable for pipelining. For hardware code acceleration, a special processor, that is, the calculator, has been designed on a field-programmable gate array (FPGA) platform. This combination was tested in terms of precision by comparison with the reference code and with the exact solutions (known for some special cases of the bottom profile). The achieved performance made it possible to calculate the wave propagation over a 1000 × 500 km water area in 1 min (the mesh size was comparable to 250 m). It was nearly 300 times faster compared to that of a regular PC and 10 times faster compared to the use of a central processing unit (CPU). This result, once implemented in tsunami warning systems, will make it possible to reduce human casualties and economic losses for the so-called near-field tsunamis. The present paper discusses a new aspect of such an implementation, namely low energy consumption. The corresponding measurements for three platforms (PC and two types of FPGA) have been performed, and a comparison of the obtained energy consumption results is given. As the numerical simulation of numerous tsunami propagation scenarios from different sources is needed for the purpose of coastal tsunami zoning, the total amount of energy saved is expected to be really valuable. So far, tsunami researchers have not used FPGA-based acceleration of computer code execution. Perhaps the energy-saving aspect can promote the use of FPGAs in tsunami research. The approach to designing special FPGA-based processors for the fast solution of various engineering problems using a PC could be extended to other areas, such as bioinformatics (motif search in DNA sequences and other algorithms of genome analysis and molecular dynamics) and seismic data processing (three-dimensional (3D) wave package decomposition, data compression, noise suppression, etc.).
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-01-21
      DOI: 10.3390/jlpea12010006
      Issue No: Vol. 12, No. 1 (2022)
       
  • JLPEA, Vol. 12, Pages 7: Acknowledgment to Reviewers of Journal of Low
           Power Electronics and Applications in 2021

    • Authors: Journal of Low Power Electronics and Applications Editorial Office
      First page: 7
      Abstract: Rigorous peer-reviews are the basis of high-quality academic publishing [...]
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-01-25
      DOI: 10.3390/jlpea12010007
      Issue No: Vol. 12, No. 1 (2022)
       
  • JLPEA, Vol. 12, Pages 8: CondenseNeXtV2: Light-Weight Modern Image
           Classifier Utilizing Self-Querying Augmentation Policies

    • Authors: Priyank Kalgaonkar, Mohamed El-Sharkawy
      First page: 8
      Abstract: Artificial Intelligence (AI) combines computer science and robust datasets to mimic natural intelligence demonstrated by human beings to aid in problem-solving and decision-making involving consciousness up to a certain extent. From Apple’s virtual personal assistant, Siri, to Tesla’s self-driving cars, research and development in the field of AI is progressing rapidly, along with privacy concerns surrounding the usage and storage of user data on external servers, which have further fueled the need for modern ultra-efficient AI networks and algorithms. The work presented in this paper focuses on introducing a modern image classifier, a light-weight and ultra-efficient CNN intended to be deployed on local embedded systems, also known as edge devices, for general-purpose usage. This work is an extension of the award-winning paper entitled ‘CondenseNeXt: An Ultra-Efficient Deep Neural Network for Embedded Systems’ published for the 2021 IEEE 11th Annual Computing and Communication Workshop and Conference (CCWC). The proposed neural network, dubbed CondenseNeXtV2, utilizes a new self-querying augmentation policy technique on the target dataset along with adaptation to the latest version of the PyTorch framework and activation functions, resulting in improved efficiency in image classification computation and accuracy. Finally, we deploy the trained weights of CondenseNeXtV2 on NXP BlueBox, which is an edge device designed to serve as a development platform for self-driving cars, and conclusions will be extrapolated accordingly.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-02-03
      DOI: 10.3390/jlpea12010008
      Issue No: Vol. 12, No. 1 (2022)
       
  • JLPEA, Vol. 12, Pages 9: Fully Differential Miller Op-Amp with Enhanced
           Large- and Small-Signal Figures of Merit

    • Authors: Anindita Paul, Jaime Ramirez-Angulo, Héctor Vázquez-Leal, Jesús Huerta-Chua, Alejandro Diaz-Sanchez
      First page: 9
      Abstract: A highly power-efficient, fully differential Miller op-amp with accurately controlled output quiescent current is introduced. The op-amp can drive both capacitive and resistive load due to the presence of the auxiliary amplifier. This amplifier helps to achieve class AB operation of the proposed op-amp. The fully differential auxiliary amplifier is compact and uses a resistive local common-mode feedback network. It consumes only 6% of the total current of the op-amp. The proposed op-amp has several innovative features. Incorporating the auxiliary amplifier helps to improve the unity gain frequency, power efficiency, slew-rate, and common-mode rejection ratio of the proposed op-amp. It can drive a wide range of resistive (200 Ω–1 MΩ) and capacitive loads (5 pF–300 pF). The op-amp has a large signal dynamic current efficiency of 8.6 and a large signal static current efficiency of 7.9. The small-signal figure of merit is 8.7 for RL = 1 MΩ and 7.3 for RL = 200 Ω.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-02-08
      DOI: 10.3390/jlpea12010009
      Issue No: Vol. 12, No. 1 (2022)
       
  • JLPEA, Vol. 12, Pages 10: Mapping Transformation Enabled High-Performance
           and Low-Energy Memristor-Based DNNs

    • Authors: Md. Oli-Uz-Zaman, Saleh Ahmad Khan, Geng Yuan, Zhiheng Liao, Jingyan Fu, Caiwen Ding, Yanzhi Wang, Jinhui Wang
      First page: 10
      Abstract: When deep neural networks (DNNs) are extensively utilized for edge AI (Artificial Intelligence), for example, in the Internet of things (IoT) and autonomous vehicles, CMOS (Complementary Metal Oxide Semiconductor)-based conventional computers suffer from overly large computing loads. Memristor-based devices are emerging as an option to conduct computing in memory for DNNs to make them faster, much more energy efficient, and accurate. Despite having excellent properties, memristor-based DNNs are yet to be commercially available because of Stuck-At-Fault (SAF) defects. A Mapping Transformation (MT) method is proposed in this paper to mitigate SAF defects in memristor-based DNNs. First, the weight distribution for the VGG8 model with the CIFAR10 dataset is presented and analyzed. Then, the MT method is used for recovering inference accuracies at 0.1% to 50% SAFs with two typical cases, SA1 (Stuck-At-One): SA0 (Stuck-At-Zero) = 5:1 and 1:5, respectively. The experimental results show that the MT method can recover DNNs to their original inference accuracies (90%) when the ratio of SAFs is smaller than 2.5%. Moreover, even when the SAF ratio is in the extreme condition of 50%, the method still recovers the inference accuracy to 80% and 21%. What is more, the MT method acts as a regulator to avoid energy and latency overhead generated by SAFs. Finally, the immunity of the MT method against non-linearity is investigated, and we conclude that the MT method can benefit accuracy, energy, and latency even with high non-linearity LTP = 4 and LTD = −4.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-02-10
      DOI: 10.3390/jlpea12010010
      Issue No: Vol. 12, No. 1 (2022)
       
  • JLPEA, Vol. 12, Pages 11: DSCU: Accelerating CNN Inference in FPGAs with
           Dual Sizes of Compute Unit

    • Authors: Zhenshan Bao, Junnan Guo, Wenbo Zhang, Hongbo Dang
      First page: 11
      Abstract: FPGA-based accelerators have shown great potential in improving the performance of CNN inference. However, the existing FPGA-based approaches suffer from a low compute unit (CU) efficiency due to their large number of redundant computations, thus leading to high levels of performance degradation. In this paper, we show that no single CU can perform best across all the convolutional layers (CONV-layers). To this end, we propose the use of dual sizes of compute unit (DSCU), an approach that aims to accelerate CNN inference in FPGAs. The key idea of DSCU is to select the best combination of CUs via dynamic programming scheduling for each CONV-layer and then assemble each CONV-layer combination into a computing solution for the given CNN to deploy in FPGAs. The experimental results show that DSCU can achieve a performance density of 3.36 × 10⁻³ GOPs/slice on a Xilinx Zynq ZU3EG, which is 4.29 times higher than that achieved by other approaches.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-02-13
      DOI: 10.3390/jlpea12010011
      Issue No: Vol. 12, No. 1 (2022)
       
  • JLPEA, Vol. 12, Pages 12: A Tree-Based Architecture for High-Performance
           Ultra-Low-Voltage Amplifiers

    • Authors: Francesco Centurelli, Riccardo Della Sala, Pietro Monsurrò, Giuseppe Scotti, Alessandro Trifiletti
      First page: 12
      Abstract: In this paper, we introduce a novel tree-based architecture which allows the implementation of Ultra-Low-Voltage (ULV) amplifiers. The architecture exploits a body-driven input stage to guarantee a rail-to-rail input common mode range and body-diode loading to avoid Miller compensation, thanks to the absence of high-impedance internal nodes. The tree-based structure improves the CMRR of the proposed amplifier with respect to the conventional OTA architectures and allows achievement of a reasonable CMRR even at supply voltages as low as 0.3 V and without tail current generators which cannot be used in ULV circuits. The bias currents and the static output voltages of all the stages implementing the architecture are accurately set through the gate terminals of biasing transistors in order to guarantee good robustness against PVT variations. The proposed architecture and the implementing stages are investigated from an analytical point of view and design equations for the main performance metrics are presented to provide insight into circuit behavior. A 0.3 V supply voltage, subthreshold, ultra-low-power (ULP) OTA, based on the proposed tree-based architecture, was designed in a commercial 130 nm CMOS process. Simulation results show a dc gain higher than 52 dB with a gain-bandwidth product of about 35 kHz and reasonable values of CMRR and PSRR, even at such low supply voltages and considering mismatches. The power consumption is as low as 21.89 nW and state-of-the-art small-signal and large-signal FoMs are achieved. Extensive parametric and Monte Carlo simulations show the robustness of the proposed circuit to PVT variations and mismatch. These results confirm that the proposed OTA is a good candidate to implement ULV, ULP, high performance analog building blocks for directly harvested IoT nodes.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-02-17
      DOI: 10.3390/jlpea12010012
      Issue No: Vol. 12, No. 1 (2022)
       
  • JLPEA, Vol. 12, Pages 13: A Model for the Evaluation of Monostable
           Molecule Signal Energy in Molecular Field-Coupled Nanocomputing

    • Authors: Yuri Ardesi, Mariagrazia Graziano, Gianluca Piccinini
      First page: 13
      Abstract: Molecular Field-Coupled Nanocomputing (FCN) is a computational paradigm promising high-frequency information elaboration at ambient temperature. This work proposes a model to evaluate the signal energy involved in propagating and elaborating the information. It splits the evaluation into several energy contributions calculated with closed-form expressions without computationally expensive calculation. The essential features of the 1,4-diallylbutane cation are evaluated with Density Functional Theory (DFT) and used in the model to evaluate circuit energy. This model enables understanding the information propagation mechanism in the FCN paradigm based on monostable molecules. We use the model to verify the bistable factor theory, describing the information propagation in molecular FCN based on monostable molecules, analyzed so far only from an electrostatic standpoint. Finally, the model is integrated into the SCERPA tool and used to quantify the information encoding stability and possible memory effects. The obtained results are consistent with state-of-the-art considerations and comparable with DFT calculation.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-03-01
      DOI: 10.3390/jlpea12010013
      Issue No: Vol. 12, No. 1 (2022)
       
  • JLPEA, Vol. 12, Pages 14: Silicon-Compatible Memristive Devices Tailored
           by Laser and Thermal Treatments

    • Authors: Maria N. Koryazhkina, Dmitry O. Filatov, Stanislav V. Tikhov, Alexey I. Belov, Dmitry S. Korolev, Alexander V. Kruglov, Ruslan N. Kryukov, Sergey Yu. Zubkov, Vladislav A. Vorontsov, Dmitry A. Pavlov, David I. Tetelbaum, Alexey N. Mikhaylov, Sergey A. Shchanikov, Sungjun Kim, Bernardo Spagnolo
      First page: 14
      Abstract: Nowadays, memristors are of considerable interest to researchers and engineers due to the promise they hold for the creation of power-efficient memristor-based information or computing systems. In particular, this refers to memristive devices based on the resistive switching phenomenon, which in most cases are fabricated in the form of metal–insulator–metal structures. At the same time, the demand for compatibility with the standard fabrication process of complementary metal–oxide semiconductors makes it relevant from a practical point of view to fabricate memristive devices directly on a silicon or SOI (silicon on insulator) substrate. Here we have investigated the electrical characteristics and resistive switching of SiOx- and SiNx-based memristors fabricated on SOI substrates and subjected to additional laser treatment and thermal treatment. The investigated memristors do not require electroforming and demonstrate a synaptic type of resistive switching. It is found that the parameters of resistive switching of SiOx- and SiNx-based memristors on SOI substrates are remarkably improved. In particular, the laser treatment gives rise to a significant increase in the hysteresis loop in I–V curves of SiNx-based memristors. Moreover, for SiOx-based memristors, the thermal treatment used after the laser treatment produces a notable decrease in the resistive switching voltage.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-03-02
      DOI: 10.3390/jlpea12010014
      Issue No: Vol. 12, No. 1 (2022)
       
  • JLPEA, Vol. 12, Pages 15: Cooperative Design of Devices and Services to
           Balance Low Power and User Experience

    • Authors: Takayuki Hoshino, Rentaro Yoshioka, Yukihide Kohira, Shingo Tetsuka
      First page: 15
      Abstract: CPS (Cyber-Physical Systems) is an approach often adopted for improving real-world activities by utilizing data. It can also be used to improve customer experiences in service applications by analyzing customer behavior captured by sensing devices and by helping service providers use that data to improve the system. No method has yet been established for systematically evaluating the impact of individual component design on the user experience when developing such systems. Knowledge Experience Design is a method for distilling and validating information that affects the quality of the user experience by focusing on user activities and underlying knowledge. This methodology has been applied to a system for a museum, in which visitor activities are observed by sensing devices, to aid the Curator’s awareness for improving museum services. As a result, a cooperative process for designing devices and user experience as a service was derived, in which the competing interests of lower power consumption and user experience improvement have both been attained. The proposed design method can be used for the co-design of systems that are built on the close coordination of hardware devices and software applications to provide value-oriented services to users, which aids the realization of CPS oriented to evaluating and improving such environments.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-03-08
      DOI: 10.3390/jlpea12010015
      Issue No: Vol. 12, No. 1 (2022)
       
  • JLPEA, Vol. 12, Pages 16: A 0.5 V Sub-Threshold CMOS Current-Controlled
           Ring Oscillator for IoT and Implantable Devices

    • Authors: Andrea Ballo, Salvatore Pennisi, Giuseppe Scotti, Chiara Venezia
      First page: 16
      Abstract: A current-controlled CMOS ring oscillator topology, which exploits the bulk voltages of the inverter stages as control terminals to tune the oscillation frequency, is proposed and analyzed. The solution can be adopted in sub-1 V applications, as it exploits MOSFETs in the subthreshold regime. Oscillators made up of 3, 5, and 7 stages, designed in a standard 28-nm technology and supplied by 0.5 V, were simulated. A programmable capacitor array allows a very large range of oscillation frequencies to be set, from 1 MHz to about 1 GHz, with limited current consumption. Considering, for example, the five-stage topology, a nominal oscillation frequency of 516 MHz is obtained with an average power dissipation of about 29 µW. The solution provides a tuneable oscillation frequency, which can be adjusted from 360 to 640 MHz by controlling the bias current with a sensitivity of 0.43 MHz/nA.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-03-09
      DOI: 10.3390/jlpea12010016
      Issue No: Vol. 12, No. 1 (2022)
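
      The figures quoted in the abstract above can be combined into a rough back-of-the-envelope check. The sketch below assumes the tuning is linear across the whole 360 to 640 MHz range and that the 29 µW figure applies at the 516 MHz nominal frequency; neither assumption is stated in the abstract, and the function names are illustrative only.

```python
# Figures taken from the abstract (five-stage oscillator, 0.5 V supply).
F_NOMINAL_MHZ = 516.0
F_MIN_MHZ = 360.0
F_MAX_MHZ = 640.0
SENSITIVITY_MHZ_PER_NA = 0.43
POWER_UW = 29.0


def bias_current_span_na(f_min_mhz: float, f_max_mhz: float, sens: float) -> float:
    """Bias-current swing needed to cover the tuning range.

    Assumption: frequency is a linear function of bias current.
    """
    return (f_max_mhz - f_min_mhz) / sens


def tuned_frequency_mhz(delta_i_na: float) -> float:
    """Frequency after shifting the bias current by delta_i_na from nominal.

    Assumption: linear model, clamped to the tuning range in the abstract.
    """
    f = F_NOMINAL_MHZ + SENSITIVITY_MHZ_PER_NA * delta_i_na
    return max(F_MIN_MHZ, min(F_MAX_MHZ, f))


def energy_per_cycle_fj(power_uw: float, f_mhz: float) -> float:
    """Average energy per oscillation cycle: 1 uW / 1 MHz = 1 pJ = 1000 fJ."""
    return power_uw / f_mhz * 1000.0


if __name__ == "__main__":
    span = bias_current_span_na(F_MIN_MHZ, F_MAX_MHZ, SENSITIVITY_MHZ_PER_NA)
    print(f"bias-current span: {span:.0f} nA")
    print(f"energy per cycle:  {energy_per_cycle_fj(POWER_UW, F_NOMINAL_MHZ):.1f} fJ")
```

      Under these assumptions, the tuning range corresponds to a bias-current swing of roughly 650 nA, and the nominal operating point dissipates on the order of 56 fJ per cycle.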
       
  • JLPEA, Vol. 12, Pages 17: Implementation of a Fuel Estimation Algorithm
           Using Approximated Computing

    • Authors: Imed Ben Dhaou
      First page: 17
      Abstract: The rising concerns about global warming have motivated the international community to take remedial actions to lower greenhouse gas emissions. The transportation sector is believed to be one of the largest air polluters. The quantity of greenhouse gas emissions is directly linked to the fuel consumption of vehicles. Eco-driving is an emergent driving style that aims at improving gas mileage. Real-time fuel estimation is a critical feature of eco-driving and eco-routing. There are numerous approaches to fuel estimation. The first approach uses instantaneous values of speed and acceleration. This can be accomplished using either GPS data or direct reading through the OBDII interface. The second approach uses the average value of the speed and acceleration that can be measured using historical data or through web mapping. The former cannot be used for route planning. The latter can be used for eco-routing. This paper elaborates on a highly pipelined VLSI architecture for the fuel estimation algorithm. Several high-level transformation techniques have been exercised to reduce the complexity of the algorithm. Three competing architectures have been implemented on FPGA and compared. The first one uses a binary search algorithm, the second architecture employs a direct address table, and the last one uses approximation techniques. The complexity of the algorithm is further reduced by combining both approximated computing and precalculation. This approach helped reduce the floating-point operations by 30% compared with the state-of-the-art implementation.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-03-16
      DOI: 10.3390/jlpea12010017
      Issue No: Vol. 12, No. 1 (2022)
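
      The abstract above contrasts a binary-search architecture with a direct address table. A minimal software sketch of that trade-off follows. The abstract does not say what quantity is looked up, so the speed bins and table values here are hypothetical placeholders, not data from the paper, and the code is not the paper's hardware design.

```python
from bisect import bisect_right

# Hypothetical lookup data: lower edges of speed bins (km/h) and one
# placeholder coefficient per bin. These numbers are NOT from the paper.
BIN_EDGES = [0, 20, 40, 60, 80, 100]
BIN_VALUES = [1.8, 1.2, 0.9, 0.8, 0.9, 1.1]
MAX_SPEED = 120  # assumed upper bound on the integer key


def lookup_binary_search(speed: float) -> float:
    """Find the bin by binary search: small table, O(log n) comparisons."""
    idx = bisect_right(BIN_EDGES, speed) - 1
    idx = max(0, min(idx, len(BIN_VALUES) - 1))
    return BIN_VALUES[idx]


# Direct address table: one precomputed entry per possible integer key,
# trading memory for a single indexed read.
DIRECT_TABLE = [lookup_binary_search(s) for s in range(MAX_SPEED + 1)]


def lookup_direct(speed: float) -> float:
    """Index the precomputed table directly: O(1), no comparisons."""
    key = max(0, min(int(speed), MAX_SPEED))
    return DIRECT_TABLE[key]


if __name__ == "__main__":
    for s in (0, 19, 20, 45, 119):
        assert lookup_binary_search(s) == lookup_direct(s)
    print(len(BIN_VALUES), "entries vs", len(DIRECT_TABLE), "entries")
```

      In this sketch, the binary-search variant stores 6 entries and the direct address table stores 121, which illustrates the memory-versus-latency trade-off that such a comparison would weigh.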
       
  • JLPEA, Vol. 12, Pages 18: Towards Integration of a Dedicated Memory
           Controller and Its Instruction Set to Improve Performance of Systems
           Containing Computational SRAM

    • Authors: Kévin Mambu, Henri-Pierre Charles, Maha Kooli, Julie Dumas
      First page: 18
      Abstract: In-memory computing (IMC) aims to solve the performance gap between CPU and memories introduced by the memory wall. However, it does not address the energy wall problem caused by data transfer over memory hierarchies. This paper proposes the data-locality management unit (DMU) to efficiently transfer data from a DRAM memory to a computational SRAM (C-SRAM) memory allowing IMC operations. The DMU is tightly coupled within the C-SRAM and allows one to align the data structure in order to perform effective in-memory computation. We propose a dedicated instruction set within the DMU to issue data transfers. The performance evaluation of a system integrating C-SRAM within the DMU compared to a reference scalar system architecture shows an increase from ×5.73 to ×11.01 in speed-up and from ×29.49 to ×46.67 in energy reduction, versus a system integrating C-SRAM without any transfer mechanism compared to a reference scalar system architecture.
      Citation: Journal of Low Power Electronics and Applications
      PubDate: 2022-03-16
      DOI: 10.3390/jlpea12010018
      Issue No: Vol. 12, No. 1 (2022)
       
 
JournalTOCs
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762
 

