IEEE Transactions on Automatic Control
Journal Prestige (SJR): 3.433
Citation Impact (citeScore): 6
Number of Followers: 70  
 
  Hybrid journal (may contain Open Access articles)
ISSN (Print) 0018-9286
Published by IEEE
  • IEEE Control Systems Society Information

      Pages: C2 - C2
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • IEEE Control Systems Society Information

      Pages: C3 - C3
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Robust Stability Analysis of a Simple Data-Driven Model Predictive Control
           Approach

      Authors: Joscha Bongard;Julian Berberich;Johannes Köhler;Frank Allgöwer;
      Pages: 2625 - 2637
      Abstract: In this article, we provide a theoretical analysis of closed-loop properties of a simple data-driven model predictive control (MPC) scheme. The formulation does not involve any terminal ingredients, thus allowing for a simple implementation without (potential) feasibility issues. The proposed approach relies on an implicit description of linear time-invariant systems based on behavioral systems theory, which only requires one input–output trajectory of an unknown system. For the nominal case with noise-free data, we prove that the data-driven MPC scheme ensures exponential stability for the closed loop if the prediction horizon is sufficiently long. Moreover, we analyze the robust data-driven MPC scheme for noisy output measurements for which we prove closed-loop practical exponential stability. The advantages of the presented approach are illustrated with a numerical example.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
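
      The implicit behavioral description mentioned in this abstract hinges on arranging one measured trajectory into a block-Hankel matrix whose full row rank certifies persistency of excitation. A minimal sketch of that construction, with trajectory length and depth chosen purely for illustration (not taken from the paper):

      ```python
      import numpy as np

      # Build a depth-L Hankel matrix from a single recorded input sequence.
      # Sequence length (20) and depth (5) are illustrative assumptions.
      def hankel(u, L):
          """Hankel matrix with L rows; column j holds u[j:j+L]."""
          T = len(u)
          return np.array([u[i:i + T - L + 1] for i in range(L)])

      rng = np.random.default_rng(0)
      u = rng.standard_normal(20)        # one recorded input trajectory
      H = hankel(u, 5)
      # Persistency of excitation of order 5 corresponds to full row rank of H.
      print(H.shape, np.linalg.matrix_rank(H))
      ```

      When the rank test passes, the columns of such input and output Hankel matrices span all length-5 trajectories of the underlying LTI system, which is what lets an MPC scheme predict without an explicit model.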
       
  • Predictive Control Barrier Functions: Enhanced Safety Mechanisms for
           Learning-Based Control

      Authors: Kim P. Wabersich;Melanie N. Zeilinger;
      Pages: 2638 - 2651
      Abstract: While learning-based control techniques often outperform classical controller designs, safety requirements limit the acceptance of such methods in many applications. Recent developments address this issue through so-called predictive safety filters, which assess if a proposed learning-based control input can lead to constraint violations and modify it if necessary to ensure safety for all future time steps. The theoretical guarantees of such predictive safety filters rely on the model assumptions, and minor deviations can lead to failure of the filter, putting the system at risk. This article introduces an auxiliary soft-constrained predictive control problem that is always feasible at each time step and asymptotically stabilizes the feasible set of the original predictive safety filter problem, thereby providing a recovery mechanism in safety-critical situations. This is achieved by a simple constraint tightening in combination with a terminal control barrier function. By extending discrete-time control barrier function theory, we establish that the proposed auxiliary problem provides a “predictive” control barrier function. The resulting algorithm is demonstrated using numerical examples.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • An Explicit Dual Control Approach for Constrained Reference Tracking of
           Uncertain Linear Systems

      Authors: Anilkumar Parsi;Andrea Iannelli;Roy S. Smith;
      Pages: 2652 - 2666
      Abstract: A finite horizon optimal tracking problem is considered for linear dynamical systems subject to parametric uncertainties in the state-space matrices and exogenous disturbances. A suboptimal solution is proposed using a model predictive control (MPC) based explicit dual control approach, which enables active uncertainty learning. A novel algorithm for the design of robustly invariant online terminal sets and terminal controllers is presented. Set membership identification is used to update the parameter uncertainty online. A predicted worst-case cost is used in the MPC optimization problem to model the dual effect of the control input. The cost-to-go is estimated using contractivity of the proposed terminal set and the remaining time horizon, so that the optimizer can estimate future benefits of exploration. The proposed dual control algorithm ensures robust constraint satisfaction and recursive feasibility, and navigates the exploration-exploitation tradeoff using a robust performance metric.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Multiagent Low-Dimensional Linear Bandits

      Authors: Ronshee Chawla;Abishek Sankararaman;Sanjay Shakkottai;
      Pages: 2667 - 2682
      Abstract: We study a multiagent stochastic linear bandit with side information, parameterized by an unknown vector $\theta^* \in \mathbb{R}^{d}$. The side information consists of a finite collection of low-dimensional subspaces, one of which contains $\theta^*$. In our setting, agents can collaborate to reduce regret by sending recommendations across a communication graph connecting them. We present a novel decentralized algorithm, where agents communicate subspace indices with each other and each agent plays a projected variant of LinUCB on the corresponding (low dimensional) subspace. By distributing the search for the optimal subspace across users and learning of the unknown vector by each agent in the corresponding low-dimensional subspace, we show that the per-agent finite-time regret is much smaller than the case when agents do not communicate. We finally complement these results through simulations.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
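
      The projection step described here can be sketched in a few lines: keep LinUCB statistics in the k-dimensional candidate subspace instead of the ambient d-dimensional space. All dimensions, the arm set, the horizon, and the exploration weight below are illustrative assumptions, not the paper's setup:

      ```python
      import numpy as np

      rng = np.random.default_rng(3)
      d, k = 10, 2
      U = np.linalg.qr(rng.standard_normal((d, k)))[0]  # basis of the candidate subspace
      theta = U @ rng.standard_normal(k)                # theta^* lies in span(U)
      arms = rng.standard_normal((50, d))

      V = np.eye(k); b = np.zeros(k)                    # LinUCB statistics in k dims, not d
      for t in range(300):
          proj = arms @ U                               # project arm features: (50, k)
          theta_hat = np.linalg.solve(V, b)             # regularized least-squares estimate
          # Optimism: mean estimate plus a confidence width in the projected geometry.
          width = np.sqrt(np.einsum('ij,jk,ik->i', proj, np.linalg.inv(V), proj))
          a = int(np.argmax(proj @ theta_hat + 0.5 * width))
          r = arms[a] @ theta + 0.1 * rng.standard_normal()
          V += np.outer(proj[a], proj[a]); b += r * proj[a]

      best = np.max(arms @ theta)
      print(best - arms[a] @ theta)                     # suboptimality of the last pull
      ```

      Because the confidence ellipsoid lives in k dimensions, its width shrinks at the low-dimensional rate, which is the source of the regret improvement the abstract describes.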
       
  • Model-Based Policy Iterations for Nonlinear Systems via Controlled
           Hamiltonian Dynamics

      Authors: Mario Sassano;Thulasi Mylvaganam;Alessandro Astolfi;
      Pages: 2683 - 2698
      Abstract: The infinite-horizon optimal control problem for nonlinear systems is studied. In the context of model-based, iterative learning strategies, we propose an alternative definition and construction of the temporal difference error arising in policy iteration strategies. In such architectures, the error is computed via the evolution of the Hamiltonian function (or, possibly, of its integral) along the trajectories of the closed-loop system. Herein the temporal difference error is instead obtained via two subsequent steps: first, the dynamics of the underlying costate variable in the Hamiltonian system is steered by means of a (virtual) control input in such a way that the stable invariant manifold becomes externally attractive. Then, the distance-from-invariance of the manifold, induced by approximate solutions, yields a natural candidate measure for the policy evaluation step. The policy improvement phase is then performed by means of standard gradient descent methods that allow us to correctly update the weights of the underlying functional approximator. The above-mentioned architecture then yields an iterative (episodic) learning scheme based on a scalar, constant reward at each iteration, the value of which is insensitive to the length of the episode, as in the original spirit of reinforcement learning strategies for discrete-time systems. Finally, the theory is validated by means of a numerical simulation involving an automatic flight control problem.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Near-Optimal Design of Safe Output-Feedback Controllers From Noisy Data

      Authors: Luca Furieri;Baiwei Guo;Andrea Martin;Giancarlo Ferrari-Trecate;
      Pages: 2699 - 2714
      Abstract: As we transition toward the deployment of data-driven controllers for black-box cyberphysical systems, complying with hard safety constraints becomes a primary concern. Two key aspects should be addressed when input–output data are corrupted by noise: how much uncertainty can one tolerate without compromising safety, and to what extent is the control performance affected? By focusing on finite-horizon constrained linear–quadratic problems, we provide an answer to these questions in terms of the model mismatch incurred during a preliminary identification phase. We propose a control design procedure based on a quasiconvex relaxation of the original robust problem and we prove that, if the uncertainty is sufficiently small, the synthesized controller is safe and near-optimal, in the sense that the suboptimality gap increases linearly with the model mismatch level. Since the proposed method is independent of the specific identification procedure, our analysis holds in combination with state-of-the-art behavioral estimators beyond standard least squares. The main theoretical results are validated by numerical experiments.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Universal Approximation Power of Deep Residual Neural Networks Through the
           Lens of Control

      Authors: Paulo Tabuada;Bahman Gharesifard;
      Pages: 2715 - 2728
      Abstract: In this article, we show that deep residual neural networks have the power of universal approximation by using, in an essential manner, the observation that these networks can be modeled as nonlinear control systems. We first study the problem of using a deep residual neural network to exactly memorize training data by formulating it as a controllability problem for an ensemble control system. Using techniques from geometric control theory, we identify a class of activation functions that allow us to ensure controllability on an open and dense submanifold of sample points. Using this result, and resorting to the notion of monotonicity, we establish that any continuous function can be approximated on a compact set to arbitrary accuracy, with respect to the uniform norm, by this class of neural networks. Moreover, we provide optimal bounds on the number of required neurons.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • On-the-Fly Control of Unknown Nonlinear Systems With Sublinear Regret

      Authors: Abraham P. Vinod;Arie Israel;Ufuk Topcu;
      Pages: 2729 - 2742
      Abstract: We study the problem of data-driven, constrained control of unknown nonlinear dynamics from a single ongoing and finite-horizon trajectory. We consider a one-step optimal control problem with a smooth, black-box objective, typically a composition of a known cost function and the unknown dynamics. We investigate an on-the-fly control paradigm, i.e., at each time step, the evolution of the dynamics and the first-order information of the cost are provided only for the executed control action. We propose an optimization-based control algorithm that iteratively minimizes a data-driven surrogate function for the unknown objective. We prove that the proposed approach incurs sublinear cumulative regret (step-wise suboptimality with respect to an optimal one-step controller) and is worst-case optimal among a broad class of data-driven control algorithms. We also present tractable reformulations of the approach that can leverage off-the-shelf solvers for efficient implementations.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Safe Value Functions

      Authors: Pierre-François Massiani;Steve Heim;Friedrich Solowjow;Sebastian Trimpe;
      Pages: 2743 - 2757
      Abstract: Safety constraints and optimality are important but sometimes conflicting criteria for controllers. Although these criteria are often solved separately with different tools to maintain formal guarantees, it is also common practice in reinforcement learning (RL) to simply modify reward functions by penalizing failures, with the penalty treated as a mere heuristic. We rigorously examine the relationship of both safety and optimality to penalties, and formalize sufficient conditions for safe value functions (SVFs): value functions that are both optimal for a given task, and enforce safety constraints. We reveal this structure by examining when rewards preserve viability under optimal control, and show that there always exists a finite penalty that induces an SVF. This penalty is not unique, but upper-unbounded: larger penalties do not harm optimality. Although it is often not possible to compute the minimum required penalty, we reveal clear structure of how the penalty, rewards, discount factor, and dynamics interact. This insight suggests practical, theory-guided heuristics to design reward functions for control problems where safety is important.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Finite-Time Convergence Rates of Decentralized Stochastic Approximation
           With Applications in Multi-Agent and Multi-Task Learning

      Authors: Sihan Zeng;Thinh T. Doan;Justin Romberg;
      Pages: 2758 - 2773
      Abstract: In this article, we study a decentralized variant of stochastic approximation (SA), a data-driven approach for finding the root of an operator under noisy measurements. A network of agents, each with its own operator and data observations, cooperatively find the fixed point of the aggregate operator over a decentralized communication graph. Our main contribution is to provide a finite-time analysis of this decentralized SA method when the data observed at each agent are sampled from a Markov process; this lack of independence makes the iterates biased and (potentially) unbounded. Under fairly standard assumptions, we show that the convergence rate of the proposed method is essentially the same as if the samples were independent, differing only by a log factor that accounts for the mixing time of the Markov processes. The key idea in our analysis is to introduce a novel Lyapunov–Razumikhin function, motivated by the one used in analyzing the stability of delayed ordinary differential equations. We also discuss applications of the proposed method on a number of interesting learning problems in multiagent systems.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Online Learning of the Kalman Filter With Logarithmic Regret

      Authors: Anastasios Tsiamis;George J. Pappas;
      Pages: 2774 - 2789
      Abstract: In this article, we consider the problem of predicting observations generated online by an unknown, partially observable linear system, which is driven by Gaussian noise. In the linear Gaussian setting, the optimal predictor in the mean square error sense is the celebrated Kalman filter, which can be explicitly computed when the system model is known. When the system model is unknown, we have to learn how to predict observations online based on finite data, suffering possibly a nonzero regret with respect to the Kalman filter's prediction. We show that it is possible to achieve a regret of the order of $\text{poly}\log(N)$ with high probability, where $N$ is the number of observations collected. This is achieved using an online least-squares algorithm, which exploits the approximately linear relation between future observations and past observations. The regret analysis is based on the stability properties of the Kalman filter, recent statistical tools for finite sample analysis of system identification, and classical results for the analysis of least-squares algorithms for time series. Our regret analysis can also be applied to other predictors, e.g., multiple step-ahead prediction, or prediction under exogenous inputs including closed-loop prediction. A fundamental technical contribution is that our bounds hold even for the class of nonexplosive systems (including marginally stable systems), which was not addressed before in the case of online prediction.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
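
      The "approximately linear relation between future observations and past observations" that the algorithm exploits can be illustrated with a scalar recursive least-squares predictor that regresses the next observation on a window of past ones. The system, window length, and noise levels below are illustrative assumptions, not the paper's setup:

      ```python
      import numpy as np

      rng = np.random.default_rng(1)
      a, c, p = 0.9, 1.0, 4               # scalar system and regression window (assumptions)
      x, ys = 0.0, []
      for _ in range(500):                # simulate a partially observed linear system
          x = a * x + rng.standard_normal()
          ys.append(c * x + 0.1 * rng.standard_normal())

      theta = np.zeros(p)                 # coefficients of the linear predictor
      P = np.eye(p) * 100.0               # recursive least-squares covariance
      errs = []
      for t in range(p, len(ys)):
          phi = np.array(ys[t - p:t][::-1])   # most recent observation first
          errs.append((ys[t] - phi @ theta) ** 2)
          k = P @ phi / (1.0 + phi @ P @ phi)  # RLS gain
          theta = theta + k * (ys[t] - phi @ theta)
          P = P - np.outer(k, phi @ P)

      print(round(float(np.mean(errs[:50])), 2), round(float(np.mean(errs[-50:])), 2))
      ```

      The learned predictor approaches the Kalman filter's one-step prediction as the window grows, because the filter's stability makes the dependence on the distant past decay geometrically.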
       
  • Efficient Learning of a Linear Dynamical System With Stability Guarantees

      Authors: Wouter Jongeneel;Tobias Sutter;Daniel Kuhn;
      Pages: 2790 - 2804
      Abstract: We propose a principled method for projecting an arbitrary square matrix to the nonconvex set of asymptotically stable matrices. Leveraging ideas from large deviations theory, we show that this projection is optimal in an information-theoretic sense and that it simply amounts to shifting the initial matrix by an optimal linear quadratic feedback gain, which can be computed exactly and highly efficiently by solving a standard linear quadratic regulator problem. The proposed approach allows us to learn the system matrix of a stable linear dynamical system from a single trajectory of correlated state observations. The resulting estimator is guaranteed to be stable and offers statistical bounds on the estimation error.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
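
      The shift described in this abstract can be sketched directly: compute an LQR gain for the pair (A, I) and subtract it from the unstable estimate. The weights Q, R, the unstable matrix, and the fixed-point iteration below are illustrative; the paper derives the specific information-theoretic choice:

      ```python
      import numpy as np

      def dare(A, B, Q, R, iters=500):
          """Fixed-point iteration for the discrete algebraic Riccati equation."""
          P = Q.copy()
          for _ in range(iters):
              G = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
              P = Q + A.T @ P @ A - A.T @ P @ B @ G
          return P

      A = np.array([[1.2, 0.5], [0.0, 1.1]])     # unstable estimate (assumption)
      B, Q, R = np.eye(2), np.eye(2), np.eye(2)  # B = I, illustrative weights
      P = dare(A, B, Q, R)
      K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)  # LQR feedback gain
      A_stab = A - B @ K                         # shifted matrix is Schur-stable
      print(np.max(np.abs(np.linalg.eigvals(A_stab))))   # spectral radius
      ```

      Because A - K is the closed-loop matrix of an LQR problem, stability of the projected matrix follows from standard LQR theory rather than from an ad hoc projection.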
       
  • Finite-Time Identification of Linear Systems: Fundamental Limits and
           Optimal Algorithms

      Authors: Yassir Jedra;Alexandre Proutiere;
      Pages: 2805 - 2820
      Abstract: We investigate the linear system identification problem in the so-called fixed budget and fixed confidence settings. In the fixed budget setting, the learner aims at estimating the state transition matrix A from a random system trajectory of fixed length, whereas in the fixed confidence setting, the learner also controls the length of the observed trajectory – she can stop when she believes that enough information has been gathered. For both settings, we analyze the sample complexity in the probably approximately correct (PAC) framework defined as the length of the observed trajectory required to identify the system parameters with prescribed accuracy and confidence levels $(\varepsilon, \delta)$. In the fixed budget setting, we first establish problem-specific sample complexity lower bounds. We then present a finite-time analysis of the estimation error of the least-squares estimator (LSE) for stable systems, and show that in the high-accuracy regime, the sample complexity of the LSE matches our lower bounds. Our analysis of the LSE is sharper and easier to interpret than existing analyses, and relies on novel concentration results for the covariates matrix. In the fixed confidence setting, in addition to the estimation objective, the learner also has to decide when to stop the collection of observations. The sample complexity then corresponds to the expected stopping time. For this setting, we also provide problem-specific sample complexity lower bounds. We also propose a stopping rule which, combined with the LSE, enjoys a sample complexity that matches our lower bounds in the high-accuracy and high-confidence regime.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
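
      The least-squares estimator (LSE) analyzed here admits a one-line implementation from a single trajectory; the ground-truth system and noise level below are illustrative assumptions:

      ```python
      import numpy as np

      rng = np.random.default_rng(2)
      A = np.array([[0.8, 0.2], [0.0, 0.5]])   # stable ground truth (assumption)
      x = np.zeros(2)
      X0, X1 = [], []
      for _ in range(2000):                    # one trajectory of covariate pairs
          x_next = A @ x + 0.1 * rng.standard_normal(2)
          X0.append(x); X1.append(x_next)
          x = x_next
      X0, X1 = np.array(X0).T, np.array(X1).T  # stack states column-wise
      # LSE: \hat{A} = (sum x_{t+1} x_t^T)(sum x_t x_t^T)^{-1}
      A_hat = X1 @ X0.T @ np.linalg.inv(X0 @ X0.T)
      print(np.linalg.norm(A_hat - A))         # shrinks as the trajectory grows
      ```

      The finite-time analysis in the paper quantifies exactly how long this trajectory must be for the error above to fall below a prescribed accuracy with a prescribed confidence.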
       
  • PAC Reinforcement Learning Algorithm for General-Sum Markov Games

      Authors: Ashkan Zehfroosh;Herbert G. Tanner;
      Pages: 2821 - 2831
      Abstract: This article presents a theoretical framework for probably approximately correct (PAC) multi-agent reinforcement learning (MARL) algorithms for Markov games. Using the idea of delayed Q-learning, this article extends the well-known Nash Q-learning algorithm to build a new PAC MARL algorithm for general-sum Markov games. In addition to guiding the design of a provably PAC MARL algorithm, the framework enables checking whether an arbitrary MARL algorithm is PAC. Comparative numerical results demonstrate the algorithm's performance and robustness.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Data-Driven Inference on Optimal Input–Output Properties of Polynomial
           Systems With Focus on Nonlinearity Measures

      Authors: Tim Martin;Frank Allgöwer;
      Pages: 2832 - 2847
      Abstract: In the context of dynamical systems, nonlinearity measures quantify the strength of nonlinearity by means of the distance of their input–output behavior to a set of linear input–output mappings. In this article, we establish a framework to determine nonlinearity measures and other optimal input–output properties for nonlinear polynomial systems without explicitly identifying a model but from a finite number of input-state measurements, which are subject to noise. To this end, we deduce from data for the unidentified ground-truth system three possible set-membership representations, compare their accuracy, and prove that they are asymptotically consistent with respect to the amount of samples. Moreover, we leverage these representations to compute guaranteed upper bounds on nonlinearity measures and the corresponding optimal linear approximation model via semidefinite programming. Furthermore, we extend the established framework to determine optimal input–output properties described by time domain hard integral quadratic constraints.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Robust Uncertainty Bounds in Reproducing Kernel Hilbert Spaces: A Convex
           Optimization Approach

      Authors: Paul Scharnhorst;Emilio T. Maddalena;Yuning Jiang;Colin N. Jones;
      Pages: 2848 - 2861
      Abstract: The problem of establishing out-of-sample bounds for the values of an unknown ground-truth function is considered. Kernels and their associated Hilbert spaces are the main formalism employed herein, along with an observational model where outputs are corrupted by bounded measurement noise. The noise can originate from any compactly supported distribution, and no independence assumptions are made on the available data. In this setting, we show how computing tight, finite-sample uncertainty bounds amounts to solving parametric quadratically constrained linear programs. Next, the properties of our approach are established, and its relationship with another method is studied. Numerical experiments are presented to exemplify how the theory can be applied in various scenarios and to contrast it with other closed-form alternatives.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Annealing Optimization for Progressive Learning With Stochastic
           Approximation

      Authors: Christos N. Mavridis;John S. Baras;
      Pages: 2862 - 2874
      Abstract: In this work, we introduce a learning model designed to meet the needs of applications in which computational resources are limited, and robustness and interpretability are prioritized. Learning problems can be formulated as constrained stochastic optimization problems, with the constraints originating mainly from model assumptions that define a tradeoff between complexity and performance. This tradeoff is closely related to overfitting, generalization capacity, and robustness to noise and adversarial attacks, and depends on both the structure and complexity of the model, as well as the properties of the optimization methods used. We develop an online prototype-based learning algorithm based on annealing optimization that is formulated as an online gradient-free stochastic approximation algorithm. The learning model can be viewed as an interpretable and progressively growing competitive-learning neural network model to be used for supervised, unsupervised, and reinforcement learning. The annealing nature of the algorithm contributes to minimal hyperparameter tuning requirements, poor local minima prevention, and robustness with respect to the initial conditions. At the same time, it provides online control over the performance–complexity tradeoff by progressively increasing the complexity of the learning model as needed, through an intuitive bifurcation phenomenon. Finally, the use of stochastic approximation enables the study of the convergence of the learning algorithm through mathematical tools from dynamical systems and control, and allows for its integration with reinforcement learning algorithms, constructing an adaptive state–action aggregation scheme.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Regret and Cumulative Constraint Violation Analysis for Distributed Online
           Constrained Convex Optimization

      Authors: Xinlei Yi;Xiuxian Li;Tao Yang;Lihua Xie;Tianyou Chai;Karl Henrik Johansson;
      Pages: 2875 - 2890
      Abstract: This article considers the distributed online convex optimization problem with time-varying constraints over a network of agents. This is a sequential decision making problem with two sequences of arbitrarily varying convex loss and constraint functions. At each round, each agent selects a decision from the decision set, and then only a portion of the loss function and a coordinate block of the constraint function at this round are privately revealed to this agent. The goal of the network is to minimize the network-wide loss accumulated over time. Two distributed online algorithms with full-information and bandit feedback are proposed. Both dynamic and static network regret bounds are analyzed for the proposed algorithms, and network cumulative constraint violation is used to measure constraint violation, which excludes the situation that strictly feasible constraints can compensate for the effects of violated constraints. In particular, we show that the proposed algorithms achieve $\mathcal{O}(T^{\max\lbrace \kappa,1-\kappa \rbrace})$ static network regret and $\mathcal{O}(T^{1-\kappa/2})$ network cumulative constraint violation, where $T$ is the time horizon and $\kappa \in (0,1)$ is a user-defined tradeoff parameter. Moreover, if the loss functions are strongly convex, then the static network regret bound can be reduced to $\mathcal{O}(T^{\kappa})$. Finally, numerical simulations are provided to illustrate the effectiveness of the theoretical results.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Sample Complexity and Overparameterization Bounds for Temporal-Difference
           Learning With Neural Network Approximation

      Authors: Semih Cayci;Siddhartha Satpathi;Niao He;R. Srikant;
      Pages: 2891 - 2905
      Abstract: In this article, we study the dynamics of temporal-difference (TD) learning with neural network-based value function approximation over a general state space, namely, neural TD learning. We consider two practically used algorithms, projection-free and max-norm regularized neural TD learning, and establish the first convergence bounds for these algorithms. An interesting observation from our results is that max-norm regularization can dramatically improve the performance of TD learning algorithms in terms of sample complexity and overparameterization. The results in this work rely on a Lyapunov drift analysis of the network parameters as a stopped and controlled random process.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Adaptive Composite Online Optimization: Predictions in Static and Dynamic
           Environments

      Authors: Pedro Zattoni Scroccaro;Arman Sharifi Kolarijani;Peyman Mohajerin Esfahani;
      Pages: 2906 - 2921
      Abstract: In the past few years, online convex optimization (OCO) has received notable attention in the control literature thanks to its flexible real-time nature and powerful performance guarantees. In this article, we propose new step-size rules and OCO algorithms that simultaneously exploit gradient predictions, function predictions and dynamics, features particularly pertinent to control applications. The proposed algorithms enjoy static and dynamic regret bounds in terms of the dynamics of the reference action sequence, gradient prediction error, and function prediction error, which are generalizations of known regularity measures from the literature. We present results for both convex and strongly convex costs. We validate the performance of the proposed algorithms in a trajectory tracking case study, as well as portfolio optimization using real-world datasets.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Efficient Off-Policy Q-Learning for Data-Based Discrete-Time LQR Problems

      Authors: Victor G. Lopez;Mohammad Alsalti;Matthias A. Müller;
      Pages: 2922 - 2933
      Abstract: This article introduces and analyzes an improved Q-learning algorithm for discrete-time linear time-invariant systems. The proposed method does not require any knowledge of the system dynamics, and it enjoys significant efficiency advantages over other data-based optimal control methods in the literature. This algorithm can be fully executed offline, as it does not require applying the current estimate of the optimal input to the system as in on-policy algorithms. It is shown that a persistently exciting (PE) input, defined from an easily tested matrix rank condition, guarantees the convergence of the algorithm. A data-based method is proposed to design the initial stabilizing feedback gain that the algorithm requires. Robustness of the algorithm in the presence of noisy measurements is analyzed. We compare the proposed algorithm in simulation to different direct and indirect data-based control design methods.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
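
      The "easily tested matrix rank condition" can be illustrated as follows: stack the recorded states and inputs and check full row rank, which is what keeps the least-squares steps of an off-policy Q-learning iteration well posed. The system, input signal, and data length below are illustrative assumptions:

      ```python
      import numpy as np

      rng = np.random.default_rng(4)
      A = np.array([[0.9, 0.3], [0.0, 0.8]])   # illustrative system matrices
      B = np.array([[0.0], [1.0]])
      x = np.zeros(2)
      X, Uin = [], []
      for _ in range(30):
          u = rng.standard_normal(1)           # random input to excite the system
          X.append(x); Uin.append(u)
          x = A @ x + B @ u + 0.01 * rng.standard_normal(2)
      D = np.vstack([np.array(X).T, np.array(Uin).T])  # stacked [x; u] data, shape (3, 30)
      print(np.linalg.matrix_rank(D))          # full row rank => data sufficiently rich
      ```

      If the stacked data matrix loses rank, the regression defining the Q-function update becomes degenerate, so this check is a cheap precondition to run before the offline iteration.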
       
  • Global Convergence of Policy Gradient Primal–Dual Methods for
           Risk-Constrained LQRs

      Authors: Feiran Zhao;Keyou You;Tamer Başar;
      Pages: 2934 - 2949
      Abstract: While the techniques in optimal control theory are often model-based, the policy optimization (PO) approach directly optimizes the performance metric of interest. Even though it has been an essential approach for reinforcement learning problems, there is little theoretical understanding of its performance. In this article, we focus on the risk-constrained linear quadratic regulator problem via the PO approach, which requires addressing a challenging nonconvex constrained optimization problem. To solve it, we first build on our earlier result that an optimal policy has a time-invariant affine structure to show that the associated Lagrangian function is coercive, locally gradient dominated, and has a local Lipschitz continuous gradient, based on which we establish strong duality. Then, we design policy gradient primal–dual methods with global convergence guarantees in both model-based and sample-based settings. Finally, we use samples of system trajectories in simulations to validate our methods.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • A General Framework for Learning-Based Distributionally Robust MPC of
           Markov Jump Systems

      Authors: Mathijs Schuurmans;Panagiotis Patrinos;
      Pages: 2950 - 2965
      Abstract: In this article, we present a data-driven learning model predictive control (MPC) scheme for chance-constrained Markov jump systems with unknown switching probabilities. Using samples of the underlying Markov chain, ambiguity sets of transition probabilities are estimated, which include the true conditional probability distributions with high probability. These sets are updated online and used to formulate a time-varying, risk-averse optimal control problem. We prove recursive feasibility of the resulting MPC scheme and show that the original chance constraints remain satisfied at every time step. Furthermore, we show that under sufficient decrease of the confidence levels, the resulting MPC scheme renders the closed-loop system mean-square stable with respect to the true-but-unknown distributions, while remaining less conservative than a fully robust approach. Finally, we show that the data-driven value function of the learning MPC converges from above to its nominal counterpart as the sample size grows to infinity. We illustrate our approach on a numerical example.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
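The ambiguity-set construction can be illustrated with empirical transition frequencies plus an L1 radius from a Weissman-style concentration bound; the bound choice, API, and fallback for unvisited states below are assumptions, not the paper's exact sets:

```python
import math
from collections import Counter

def estimate_with_ambiguity(chain, n_states, beta=0.05):
    """Empirical transition probabilities of a Markov chain plus, per state,
    an L1-ball radius (Weissman et al. bound) so that the true row lies in
    the ambiguity set with probability at least 1 - beta."""
    trans = Counter(zip(chain, chain[1:]))
    visits = Counter(chain[:-1])
    P_hat, radius = {}, {}
    for s in range(n_states):
        n = visits[s]
        P_hat[s] = ([trans[(s, t)] / n for t in range(n_states)] if n
                    else [1.0 / n_states] * n_states)  # uniform if unvisited
        radius[s] = (math.sqrt(2.0 / n * math.log((2 ** n_states - 2) / beta))
                     if n else float("inf"))
    return P_hat, radius
```

The radius shrinks as O(1/sqrt(n)), which is what lets the online scheme become less conservative as more samples of the chain arrive.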
       
  • A Nonsmooth Dynamical Systems Perspective on Accelerated Extensions of
           ADMM


      Authors: Guilherme França;Daniel P. Robinson;René Vidal;
      Pages: 2966 - 2978
      Abstract: Recently, there has been great interest in connections between continuous-time dynamical systems and optimization methods, notably in the context of accelerated methods for smooth and unconstrained problems. In this article, we extend this perspective to nonsmooth and constrained problems by obtaining differential inclusions associated with novel accelerated variants of the alternating direction method of multipliers (ADMM). Through a Lyapunov analysis, we derive rates of convergence for these dynamical systems in different settings that illustrate an interesting tradeoff between decaying versus constant damping strategies. We also obtain modified equations capturing fine-grained details of these methods, which have improved stability and preserve the leading-order convergence rates. An extension to general nonlinear equality and inequality constraints in connection with singular perturbation theory is provided.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Learning to Act Safely With Limited Exposure and Almost Sure Certainty


      Authors: Agustin Castellano;Hancheng Min;Juan Andres Bazerque;Enrique Mallada;
      Pages: 2979 - 2994
      Abstract: This article puts forward the concept that learning to take safe actions in unknown environments, even with probability one guarantees, can be achieved without the need for an unbounded number of exploratory trials. This is indeed possible, provided that one is willing to navigate tradeoffs between optimality, level of exposure to unsafe events, and the maximum detection time of unsafe actions. We illustrate this concept in two complementary settings. We first focus on the canonical multiarmed bandit problem and study the intrinsic tradeoffs of learning safety in the presence of uncertainty. Under mild assumptions on sufficient exploration, we provide an algorithm that provably detects all unsafe machines in an (expected) finite number of rounds. The analysis also unveils a tradeoff between the number of rounds needed to secure the environment and the probability of discarding safe machines. We then consider the problem of finding optimal policies for a Markov decision process (MDP) with almost sure constraints. We show that the action-value function satisfies a barrier-based decomposition that allows for the identification of feasible policies independently of the reward process. Using this decomposition, we develop a barrier-learning algorithm that identifies unsafe state–action pairs in a finite expected number of steps. Our analysis further highlights a tradeoff between the time lag for the underlying MDP necessary to detect unsafe actions, and the level of exposure to unsafe events. Simulations corroborate our theoretical findings, further illustrating the aforementioned tradeoffs, and suggesting that safety constraints can speed up the learning process.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
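The bandit-side idea — declaring an arm unsafe once a confidence bound clears a threshold, after finitely many pulls — can be sketched as follows; the round-robin sampling rule and the Hoeffding bound are illustrative assumptions rather than the paper's algorithm:

```python
import math

def detect_unsafe(machines, threshold=0.5, delta=1e-3, max_rounds=20000):
    """Sample arms round-robin; declare an arm unsafe as soon as the
    Hoeffding lower confidence bound on its mean cost (costs in [0, 1])
    exceeds the safety threshold."""
    n = len(machines)
    counts, sums, unsafe = [0] * n, [0.0] * n, set()
    for t in range(max_rounds):
        i = t % n
        if i in unsafe:
            continue
        counts[i] += 1
        sums[i] += machines[i]()          # observe one safety cost
        mean = sums[i] / counts[i]
        lcb = mean - math.sqrt(math.log(1 / delta) / (2 * counts[i]))
        if lcb > threshold:               # unsafe with prob. >= 1 - delta
            unsafe.add(i)
    return unsafe
```

The tradeoff the abstract describes is visible here: a smaller `delta` lowers the chance of discarding a safe machine but widens the bound, so detection takes more rounds.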
       
  • Representer Theorem for Learning Koopman Operators


      Authors: Mohammad Khosravi;
      Pages: 2995 - 3010
      Abstract: In this work, we consider the problem of learning the Koopman operator for discrete-time autonomous systems. The learning problem is formulated as a generic constrained regularized empirical loss minimization in the infinite-dimensional space of linear operators. We show that a representer theorem holds for the introduced learning problem under certain general conditions, which allows convex reformulation of the problem in a specific finite-dimensional space without any approximation or loss of precision. We discuss the inclusion of various forms of regularization and constraints in the learning problem, such as the operator norm, the Frobenius norm, the operator rank, the nuclear norm, and stability. Subsequently, we derive the corresponding equivalent finite-dimensional problem. Furthermore, we demonstrate the connection between the proposed formulation and the extended dynamic mode decomposition. We present several numerical examples to illustrate the theoretical results and verify the performance of regularized learning of the Koopman operators.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
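The connection to extended dynamic mode decomposition can be seen in a minimal EDMD sketch on a two-function dictionary; the dictionary, the scalar dynamics, and the hand-rolled 2x2 inverse are assumptions made to keep the example self-contained:

```python
def edmd(xs, dictionary):
    """EDMD: least-squares Koopman matrix K = Y Phi^T (Phi Phi^T)^{-1}
    on a finite dictionary (2 functions here, so a 2x2 inverse suffices)."""
    Phi = [[f(x) for x in xs[:-1]] for f in dictionary]  # lifted states
    Y   = [[f(x) for x in xs[1:]]  for f in dictionary]  # lifted successors
    def abt(A, B):  # A @ B^T
        return [[sum(x * y for x, y in zip(ra, rb)) for rb in B] for ra in A]
    G, A = abt(Phi, Phi), abt(Y, Phi)
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    Ginv = [[ G[1][1] / det, -G[0][1] / det],
            [-G[1][0] / det,  G[0][0] / det]]
    return abt(A, Ginv)  # G is symmetric, so A @ Ginv^T == A @ Ginv
```

For data from x+ = 0.8 x with dictionary {x, x^2}, the dictionary span is Koopman-invariant, so the recovered K is exactly diag(0.8, 0.64).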
       
  • Formal Verification of Unknown Discrete- and Continuous-Time Systems: A
           Data-Driven Approach


      Authors: Ameneh Nejati;Abolfazl Lavaei;Pushpak Jagtap;Sadegh Soudjani;Majid Zamani;
      Pages: 3011 - 3024
      Abstract: This article is concerned with a formal verification scheme for both discrete- and continuous-time deterministic systems with unknown mathematical models. The main target is to verify the safety of unknown systems based on the construction of barrier certificates via a set of data collected from trajectories of systems while providing an a priori guaranteed confidence on the safety. In our proposed framework, we first cast the original safety problem as a robust convex program (RCP). Solving the proposed RCP is not tractable in general since the unknown model appears in one of the constraints. Instead, we collect a finite number of data points from trajectories of the system and provide a scenario convex program (SCP) corresponding to the original RCP. We then establish a probabilistic closeness between the optimal value of the SCP and that of the RCP, and as a result, we formally quantify the safety guarantee of unknown systems based on the number of data points and a required level of confidence. We propose our framework in both discrete-time and continuous-time settings. We illustrate the effectiveness of our proposed results by first applying them to an unknown continuous-time room temperature system. We verify that the room temperature remains within a comfort zone with a desired level of confidence by collecting data from trajectories of the system. To show the applicability of our techniques to higher dimensional systems with nonlinear dynamics, we then apply our results to a continuous-time nonlinear jet engine compressor and a discrete-time DC motor.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Distributed Multiarmed Bandits


      Authors: Jingxuan Zhu;Ji Liu;
      Pages: 3025 - 3040
      Abstract: This article studies a distributed multiarmed bandit problem with heterogeneous observations of rewards. The problem is cooperatively solved by $N$ agents, assuming each agent faces a common set of $M$ arms yet observes only local biased rewards of the arms. The goal of each agent is to minimize the cumulative expected regret with respect to the true rewards of the arms, where the mean of each arm's true reward equals the average of the means of all agents' observed biased rewards. Each agent recursively updates its decision by utilizing the information from its neighbors. Neighbor relationships are described by a time-dependent directed graph $\mathbb{G}(t)$ whose vertices correspond to agents and whose arcs depict neighbor relationships. A fully distributed bandit algorithm is proposed, which couples the classical distributed averaging algorithm and the celebrated upper confidence bound bandit algorithm. It is shown that for any uniformly strongly connected sequence of $\mathbb{G}(t)$, the algorithm achieves guaranteed regret for each agent at the order of $O(\log T)$.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
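The coupling of consensus averaging with UCB indices can be sketched on a small network; complete-graph averaging, Gaussian noise, and all constants below are illustrative assumptions (the paper handles general uniformly strongly connected time-varying graphs):

```python
import math
import random

def distributed_ucb(mu, T=3000, seed=0):
    """Toy distributed UCB: agent i observes biased rewards with mean
    mu[i][a]; the true mean of arm a is the network average of mu[:][a].
    Agents fuse local sample means by averaging, then pull the arm with
    the largest UCB index."""
    random.seed(seed)
    N, M = len(mu), len(mu[0])
    m = [[0.0] * M for _ in range(N)]    # local sample means
    cnt = [[0] * M for _ in range(N)]    # local pull counts
    pulls = [0] * M
    for t in range(1, T + 1):
        # one averaging (consensus) step: estimate of the true arm means
        est = [sum(m[i][a] for i in range(N)) / N for a in range(M)]
        for i in range(N):
            untried = [a for a in range(M) if cnt[i][a] == 0]
            if untried:
                a = untried[0]
            else:
                a = max(range(M), key=lambda j: est[j]
                        + math.sqrt(2 * math.log(t) / cnt[i][j]))
            r = random.gauss(mu[i][a], 0.1)   # local biased reward
            cnt[i][a] += 1
            m[i][a] += (r - m[i][a]) / cnt[i][a]
            pulls[a] += 1
    return pulls
```

With biases [[0.9, 0.1], [0.5, 0.3]] the true means are 0.7 and 0.2: no single agent's local view is reliable, but the averaged estimates concentrate the pulls on arm 0, with the suboptimal arm pulled only O(log T) times.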
       
  • Regret-Optimal Estimation and Control


      Authors: Gautam Goel;Babak Hassibi;
      Pages: 3041 - 3053
      Abstract: In this article, we consider estimation and control in linear dynamical systems from the perspective of regret minimization. Unlike most prior work in this area, we focus on the problem of designing causal state estimators and causal controllers, which compete against a clairvoyant noncausal policy, instead of the best policy selected in hindsight from some fixed parametric class. We show that regret-optimal filters and regret-optimal controllers can be derived in state space form using operator-theoretic techniques from robust control. Our results can be viewed as extending traditional robust estimation and control, which focuses on minimizing worst-case cost, to minimizing worst-case regret. We propose regret-optimal analogs of model-predictive control and the extended Kalman filter for systems with nonlinear dynamics and present numerical experiments which show that these algorithms can significantly outperform standard approaches to estimation and control.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Data-Driven Reachability Analysis From Noisy Data


      Authors: Amr Alanwar;Anne Koch;Frank Allgöwer;Karl Henrik Johansson;
      Pages: 3054 - 3069
      Abstract: We consider the problem of computing reachable sets directly from noisy data without a given system model. Several reachability algorithms are presented for different types of systems generating the data. First, an algorithm for computing over-approximated reachable sets based on matrix zonotopes is proposed for linear systems. Constrained matrix zonotopes are introduced to provide less conservative reachable sets at the cost of increased computational expense and are utilized to incorporate prior knowledge about the unknown system model. Then, we extend the approach to polynomial systems and, under the assumption of Lipschitz continuity, to nonlinear systems. Theoretical guarantees are given for these algorithms: each returns an over-approximation that provably contains the true reachable set. Multiple numerical examples and real experiments show the applicability of the introduced algorithms, and comparisons are made between algorithms.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
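A scalar stand-in conveys the idea: every model consistent with the noisy data is propagated, which yields a guaranteed over-approximation. Intervals here play the role of the paper's matrix zonotopes, and all numbers are assumptions:

```python
def reachable_intervals(xs_minus, xs_plus, w_bound, x0, steps):
    """Scalar data-driven reachability: every a consistent with
    x+ = a*x + w, |w| <= w_bound, and the data (xs_minus -> xs_plus)
    lies in an interval; propagating that interval over-approximates
    the reachable set of the unknown true system."""
    a_lo, a_hi = -float("inf"), float("inf")
    for x, xp in zip(xs_minus, xs_plus):
        if x == 0:
            continue
        lo, hi = (xp - w_bound) / x, (xp + w_bound) / x
        if x < 0:
            lo, hi = hi, lo               # dividing by a negative flips
        a_lo, a_hi = max(a_lo, lo), min(a_hi, hi)
    sets = [(x0, x0)]
    for _ in range(steps):
        lo, hi = sets[-1]
        cands = [a * x for a in (a_lo, a_hi) for x in (lo, hi)]
        sets.append((min(cands) - w_bound, max(cands) + w_bound))
    return sets
```

More data shrinks the consistent parameter interval, mirroring how constrained matrix zonotopes trade extra computation for tighter sets.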
       
  • Value Iteration for Continuous-Time Linear Time-Invariant Systems


      Authors: Corrado Possieri;Mario Sassano;
      Pages: 3070 - 3077
      Abstract: Two data-driven strategies for value iteration in linear quadratic optimal control problems over an infinite horizon are proposed. The two architectures share common features, since they both consist of a purely continuous-time control architecture and are based on the forward integration of the differential Riccati equation (DRE). They profoundly differ, instead, in the estimation mechanism of the vector field of the underlying DRE from collected data: The first relies on a characterization of properties of the advantage function associated to the problem, whereas the second is inspired by tools from adaptive control theory and ensures semi-global exponential convergence to the optimal solution. Advantages and drawbacks of the architectures are discussed, while the performance is validated via a benchmark numerical example.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
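The forward DRE integration underlying both architectures reduces, in the scalar case, to a one-line ODE whose equilibrium is the algebraic Riccati solution. This sketch is model-based for brevity; estimating the DRE vector field from data is the paper's contribution:

```python
def forward_dre(a, b, q, r, p0=0.0, dt=1e-3, T=20.0):
    """Forward-integrate the scalar differential Riccati equation
    pdot = 2*a*p - (b^2/r)*p^2 + q.  From p0 = 0 the solution climbs
    monotonically to the stabilizing ARE root; the optimal feedback
    gain is then k = b*p/r."""
    p = p0
    for _ in range(int(T / dt)):
        p += dt * (2 * a * p - (b * b / r) * p * p + q)
    return p
```

For a = b = q = r = 1 the ARE root is 1 + sqrt(2), and the forward Euler iteration settles on it because the discrete fixed point coincides with the zero of the Riccati right-hand side.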
       
  • Recursive Identification of Time-Varying Hammerstein Systems With Matrix
           Forgetting


      Authors: Jakub Dokoupil;Pavel Václavek;
      Pages: 3078 - 3085
      Abstract: The real-time estimation of the time-varying Hammerstein system by using a noniterative learning scheme is considered and extended to incorporate a matrix forgetting factor. The estimation is cast in a variational-Bayes framework to best emulate the original posterior distribution of the parameters within the set of distributions with feasible moments. The recursive concept we propose approximates the exact posterior comprising undistorted information about the estimated parameters. In many practical settings, the incomplete model of parameter variations is compensated by forgetting of obsolete information. As a rule, the forgetting operation is initiated by the inclusion of an appropriate prediction alternative into the time update. It is shown that the careful formulation of the prediction alternative, which relies on Bayesian conditioning, results in partial forgetting. This article inspects two options with respect to the order of the conditioning in the posterior, which proves vital in the successful localization of the source of inconsistency in the data-generating process. The geometric mean of the discussed alternatives then modifies recursive learning through the matrix forgetting factor. We adopt the decision-making approach to revisit the posterior uncertainty by dynamically allocating the probability to each of the prediction alternatives to be combined.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Data-Efficient Active Weighting Algorithm for Composite Adaptive Control
           Systems


      Authors: Seong-hun Kim;Hanna Lee;Namhoon Cho;Youdan Kim;
      Pages: 3086 - 3090
      Abstract: We propose an active weighting algorithm for composite adaptive control to reduce the state and estimation errors while maintaining the estimation quality. Unlike previous studies that construct the composite term by simply stacking, removing, and pausing observed data, the proposed method efficiently utilizes the data by providing a theoretical set of weights for observations that can actively manipulate the composite term to have desired characteristics. As an example, a convex optimization formulation is provided, which maximizes the minimum eigenvalue while satisfying the other constraints, and an illustrative numerical simulation is also presented.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Cooperative Control of Uncertain Multiagent Systems via Distributed
           Gaussian Processes


      Authors: Armin Lederer;Zewen Yang;Junjie Jiao;Sandra Hirche;
      Pages: 3091 - 3098
      Abstract: For single agent systems, probabilistic machine learning techniques such as Gaussian process regression have been shown to be suitable methods for inferring models of unknown nonlinearities, which can be employed to improve the performance of control laws. While this approach can be extended to the cooperative control of multiagent systems, it leads to a decentralized learning of the unknown nonlinearity, i.e., each agent independently infers a model. However, decentralized learning can potentially lead to poor control performance, since the models of individual agents are often accurate in merely a small region of the state space. In order to overcome this issue, we propose a novel method for the distributed aggregation of Gaussian process models, and extend probabilistic error bounds for Gaussian process regression to the proposed approach. Based on this distributed learning method, we develop a cooperative tracking control law for leader–follower consensus of multiagent systems with partially unknown, higher order, control-affine dynamics, and analyze its stability using Lyapunov theory. The effectiveness of the proposed methods is demonstrated in numerical evaluations.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • A Scalable Distributed Dynamical Systems Approach to Learn the Strongly
           Connected Components and Diameter of Networks


      Authors: Emily A. Reed;Guilherme Ramos;Paul Bogdan;Sérgio Pequito;
      Pages: 3099 - 3106
      Abstract: Finding strongly connected components (SCCs) and the diameter of a directed network play a key role in a variety of machine learning and control theory problems. In this article, we provide for the first time a scalable distributed solution for these two problems by leveraging dynamical consensus-like protocols to find the SCCs. The proposed solution has a time complexity of $\mathcal{O}(N D d_{\text{in-degree}}^{\max})$, where $N$ is the number of vertices in the network, $D$ is the (finite) diameter of the network, and $d_{\text{in-degree}}^{\max}$ is the maximum in-degree of the network. Additionally, we prove that our algorithm terminates in $D+2$ iterations, which allows us to retrieve the finite diameter of the network. We perform exhaustive simulations that support the outperformance of our algorithm against the state of the art on several random networks, including Erdős–Rényi, Barabási–Albert, and Watts–Strogatz networks.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
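For reference, a centralized sketch of the two quantities the distributed protocol computes — Kosaraju's algorithm for the SCCs and BFS eccentricities for the diameter. This is not the paper's consensus-based algorithm, only a baseline against which its distributed outputs can be checked:

```python
from collections import deque

def sccs(adj):
    """Kosaraju: DFS finish order on adj, then DFS on the transpose."""
    n = len(adj)
    seen, order = [False] * n, []
    def dfs(u, g, out):                       # iterative DFS, appends on finish
        stack = [(u, iter(g[u]))]
        seen[u] = True
        while stack:
            v, it = stack[-1]
            w = next(it, None)
            if w is None:
                stack.pop()
                out.append(v)
            elif not seen[w]:
                seen[w] = True
                stack.append((w, iter(g[w])))
    for u in range(n):
        if not seen[u]:
            dfs(u, adj, order)
    radj = [[] for _ in range(n)]
    for u in range(n):
        for v in adj[u]:
            radj[v].append(u)
    seen = [False] * n
    comps = []
    for u in reversed(order):
        if not seen[u]:
            comp = []
            dfs(u, radj, comp)
            comps.append(sorted(comp))
    return comps

def diameter(adj):
    """Max over BFS eccentricities; finite iff the graph is strongly connected."""
    best = 0
    for s in range(len(adj)):
        dist, dq = {s: 0}, deque([s])
        while dq:
            u = dq.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    dq.append(v)
        best = max(best, max(dist.values()))
    return best
```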
       
  • Distributed State Estimation With Deep Neural Networks for Uncertain
           Nonlinear Systems Under Event-Triggered Communication


      Authors: Federico M. Zegers;Runhan Sun;Girish Chowdhary;Warren E. Dixon;
      Pages: 3107 - 3114
      Abstract: This work explores the distributed state estimation problem for an uncertain, nonlinear, and continuous-time system. Given a sensor network, each agent is assigned a deep neural network (DNN) that is used to approximate the system's dynamics. Each agent updates the weights of their DNN through a multiple timescale approach, i.e., the outer layer weights are updated online with a Lyapunov-based gradient descent update law, and the inner layer weights are updated concurrently using a supervised learning strategy. To promote the efficient use of network resources, the distributed observer uses event-triggered communication. A nonsmooth Lyapunov analysis demonstrates that the distributed event-triggered observer achieves uniformly ultimately bounded state reconstruction. A simulation example of a five-agent sensor network estimating the state of a two-link robotic manipulator tracking a desired trajectory is provided to validate the result and showcase the performance improvements afforded by DNNs.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Compactly Restrictable Metric Policy Optimization Problems


      Authors: Victor D. Dorobantu;Kamyar Azizzadenesheli;Yisong Yue;
      Pages: 3115 - 3122
      Abstract: We study policy optimization problems for deterministic Markov decision processes (MDPs) with metric state and action spaces, which we refer to as metric policy optimization problems (MPOPs). Our goal is to establish theoretical results on the well-posedness of MPOPs that can characterize practically relevant continuous control systems. To do so, we define a special class of MPOPs called compactly restrictable MPOPs (CR-MPOPs), which are flexible enough to capture the complex behavior of robotic systems but specific enough to admit solutions using dynamic programming methods such as value iteration. We show how to arrive at CR-MPOPs using forward-invariance. We further show that our theoretical results on CR-MPOPs can be used to characterize feedback linearizable control affine systems.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Concurrent Active Learning in Autonomous Airborne Source Search: Dual
           Control for Exploration and Exploitation


      Authors: Zhongguo Li;Wen-Hua Chen;Jun Yang;
      Pages: 3123 - 3130
      Abstract: A concurrent learning framework is developed for source search in an unknown environment using autonomous platforms equipped with onboard sensors. Distinct from the existing solutions that require significant computational power for Bayesian estimation and path planning, the proposed solution is computationally affordable for onboard processors. A new concept of concurrent learning using multiple parallel estimators is proposed to learn the operational environment and quantify estimation uncertainty. The search agent is empowered with the dual capability of exploiting currently estimated parameters to track the source and probing the environment to reduce the impact of uncertainty, an approach named Concurrent Learning based Dual Control for Exploration and Exploitation (CL-DCEE). In this setting, the control action minimizes not only the tracking error between the agent's future position and the estimated source location but also the uncertainty of the predicted estimate. More importantly, rigorously proven properties, such as the convergence of the CL-DCEE algorithm, are established under mild assumptions on the noise, and the impact of noise on the search performance is examined. Simulation results are provided to validate the effectiveness of the proposed CL-DCEE algorithm. Compared with the information-theoretic approach, CL-DCEE not only guarantees convergence but also produces better search performance while consuming much less computational time.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Numerical Gaussian Process Kalman Filtering for Spatiotemporal Systems


      Authors: Armin Küper;Steffen Waldherr;
      Pages: 3131 - 3138
      Abstract: We present a novel Kalman filter (KF) for spatiotemporal systems called the numerical Gaussian process Kalman filter (NGPKF). Numerical Gaussian processes have recently been introduced as a physics-informed machine-learning method for simulating time-dependent partial differential equations without the need for spatial discretization while also providing uncertainty quantification of the simulation resulting from noisy initial data. We formulate numerical Gaussian processes as linear Gaussian state space models. This allows us to derive the recursive KF algorithm under the numerical Gaussian process state space model. Using two case studies, we show that the NGPKF is more accurate and robust, given enough measurements, than a spatial discretization-based KF.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
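The KF recursion at the core of the NGPKF, written for a scalar state; these are the standard predict/update equations, while the numerical-GP construction of the state-space model itself is beyond a short sketch:

```python
def kalman_1d(zs, a, c, q, r, x0=0.0, p0=1.0):
    """Textbook scalar Kalman filter: predict through x+ = a*x + w,
    update with measurement z = c*x + v; q and r are noise variances."""
    x, p, out = x0, p0, []
    for z in zs:
        x, p = a * x, a * a * p + q          # predict
        k = p * c / (c * c * p + r)          # Kalman gain
        x, p = x + k * (z - c * x), (1 - k * c) * p   # update
        out.append(x)
    return out
```

For a constant state (a = 1, q = 0) with unit measurement noise, the estimate is the running average of the measurements, shrinking toward the truth as more data arrive.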
       
  • On Centralized and Distributed Mirror Descent: Convergence Analysis Using
           Quadratic Constraints


      Authors: Youbang Sun;Mahyar Fazlyab;Shahin Shahrampour;
      Pages: 3139 - 3146
      Abstract: Mirror descent (MD) is a powerful first-order optimization technique that subsumes several optimization algorithms including gradient descent (GD). In this work, we leverage quadratic constraints and Lyapunov functions to analyze the stability and characterize the convergence rate of the MD algorithm as well as its distributed variant using semidefinite programming (SDP). For both algorithms, we consider both strongly convex and nonstrongly convex assumptions. For centralized MD and strongly convex problems, we construct an SDP that certifies exponential convergence rates and derive a closed-form feasible solution to the SDP that recovers the optimal rate of GD as a special case. We complement our analysis by providing an explicit $O(1/k)$ convergence rate for convex problems. Next, we analyze the convergence of distributed MD and characterize the rate numerically using an SDP whose dimensions are independent of the network size. To the best of our knowledge, the numerical rate of distributed MD has not been previously reported in the literature. We further prove an $O(1/k)$ convergence rate for distributed MD in the convex setting. Our numerical experiments on strongly convex problems indicate that our framework certifies superior convergence rates compared to the existing rates for distributed GD.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
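With the negative-entropy mirror map on the probability simplex, MD reduces to the exponentiated-gradient (multiplicative-weights) update; the objective and step size in this quick sketch are illustrative:

```python
import math

def mirror_descent_simplex(grad, x, steps=500, lr=0.1):
    """Mirror descent with the negative-entropy mirror map: multiply each
    coordinate by exp(-lr * gradient) and renormalize, so iterates stay
    on the simplex by construction (unlike projected gradient descent)."""
    for _ in range(steps):
        g = grad(x)
        w = [xi * math.exp(-lr * gi) for xi, gi in zip(x, g)]
        s = sum(w)
        x = [wi / s for wi in w]
    return x
```

Minimizing the linear cost c = [1.0, 0.5, 2.0] over the simplex drives nearly all mass onto the cheapest coordinate.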
       
  • Multiagent Online Source Seeking Using Bandit Algorithm


      Authors: Bin Du;Kun Qian;Christian Claudel;Dengfeng Sun;
      Pages: 3147 - 3154
      Abstract: This article presents a learning-based algorithm for solving the online source-seeking problem with a multiagent system under an unknown dynamical environment. Our algorithm, building on a notion termed as dummy confidence upper bound (D-UCB), integrates both estimation of the unknown environment and task planning for the multiple agents simultaneously, and as a result, enables the multiple agents to track the extremum spots of the dynamical environment in an online manner. Unlike the standard confidence upper bound algorithm in the context of multiarmed bandits, the notion of D-UCB helps significantly reduce the computational complexity in solving the subproblems of task planning, and thus renders our algorithm exceptionally computation-efficient in the distributed setting. The performance of our algorithm is theoretically guaranteed by showing a sublinear upper bound of the cumulative regret. Numerical results on a real-world pollution monitoring and tracking problem are also provided to demonstrate the effectiveness of the algorithm.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Hamiltonian Deep Neural Networks Guaranteeing Nonvanishing Gradients by
           Design


      Authors: Clara Lucía Galimberti;Luca Furieri;Liang Xu;Giancarlo Ferrari-Trecate;
      Pages: 3155 - 3162
      Abstract: Deep neural networks (DNNs) training can be difficult due to vanishing and exploding gradients during weight optimization through backpropagation. To address this problem, we propose a general class of Hamiltonian DNNs (H-DNNs) that stem from the discretization of continuous-time Hamiltonian systems and include several existing DNN architectures based on ordinary differential equations. Our main result is that a broad set of H-DNNs ensures nonvanishing gradients by design for an arbitrary network depth. This is obtained by proving that, using a semi-implicit Euler discretization scheme, the backward sensitivity matrices involved in gradient computations are symplectic. We also provide an upper bound to the magnitude of sensitivity matrices and show that exploding gradients can be controlled through regularization. The good performance of H-DNNs is demonstrated on benchmark classification problems, including image classification with the MNIST dataset.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
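The mechanism behind the nonvanishing-gradient guarantee — semi-implicit (symplectic) Euler respects the Hamiltonian structure where explicit Euler does not — can be seen on a harmonic oscillator; the oscillator stands in for the network's forward map, and the constants are illustrative:

```python
def final_energies(dt=0.1, steps=1000):
    """Integrate qdot = p, pdot = -q (H = (q^2 + p^2)/2) two ways.
    Explicit Euler inflates H by (1 + dt^2) per step; the semi-implicit
    scheme keeps H bounded, mirroring the norm-preserving (symplectic)
    sensitivity matrices in the backward pass of an H-DNN."""
    qe, pe = 1.0, 0.0   # explicit Euler state
    qs, ps = 1.0, 0.0   # semi-implicit (symplectic) Euler state
    for _ in range(steps):
        qe, pe = qe + dt * pe, pe - dt * qe   # both updates use the old state
        ps = ps - dt * qs                     # update p first...
        qs = qs + dt * ps                     # ...then q with the new p
    H = lambda q, p: 0.5 * (q * q + p * p)
    return H(qe, pe), H(qs, ps)
```

After 1000 steps the explicit-Euler energy has blown up by orders of magnitude while the symplectic energy stays within O(dt) of its initial value — the discrete analog of gradients neither exploding nor vanishing.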
       
  • Robust Data-Enabled Predictive Control: Tractable Formulations and
           Performance Guarantees


      Authors: Linbin Huang;Jianzhe Zhen;John Lygeros;Florian Dörfler;
      Pages: 3163 - 3170
      Abstract: We introduce a general framework for robust data-enabled predictive control (DeePC) for linear time-invariant systems, which enables us to obtain robust and optimal control in a receding-horizon fashion based on inexact input and output data. Robust DeePC solves a min–max optimization problem to compute the optimal control sequence that is resilient to all possible realizations of the uncertainties in data within a prescribed uncertainty set. We present computationally tractable reformulations of the min–max problem with various uncertainty sets. Moreover, we show that even though an accurate prediction of the future behavior is unattainable due to inaccessibility of exact data, the obtained control sequence provides performance guarantees for the actually realized input and output cost in open loop. Finally, we demonstrate the performance of robust DeePC using high-fidelity simulations of a power converter system.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
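The data-only prediction at the heart of DeePC rests on Willems' fundamental lemma: windows of one recorded trajectory span all trajectories of the system. A noise-free scalar sketch, in which the system, the data, and the window length are assumptions:

```python
def hankel_rows(w, L):
    """Rows of the depth-L Hankel matrix of a scalar signal w."""
    cols = len(w) - L + 1
    return [[w[j + i] for j in range(cols)] for i in range(L)]

def solve3(M, b):
    """Gauss-Jordan elimination with partial pivoting for a 3x3 system."""
    M = [row[:] + [bi] for row, bi in zip(M, b)]
    for i in range(3):
        piv = max(range(i, 3), key=lambda r: abs(M[r][i]))
        M[i], M[piv] = M[piv], M[i]
        for r in range(3):
            if r != i:
                f = M[r][i] / M[i][i]
                M[r] = [x - f * y for x, y in zip(M[r], M[i])]
    return [M[i][3] / M[i][i] for i in range(3)]

# One recorded trajectory of an (assumed) scalar LTI system y+ = 0.9*y + u.
# The model is used only to generate the data; the prediction is data-only.
u = [1.0, -1.0, 2.0, 0.0]
y = [1.0]
for uk in u[:-1]:
    y.append(0.9 * y[-1] + uk)

Hu, Hy = hankel_rows(u, 2), hankel_rows(y, 2)
# DeePC-style step: find g matching the window (u0, u1, y0), read off y1.
u0, u1, y0 = 0.5, 0.0, 2.0
g = solve3([Hu[0], Hu[1], Hy[0]], [u0, u1, y0])
y1_pred = sum(gj * hj for gj, hj in zip(g, Hy[1]))   # equals 0.9*y0 + u0
```

With noise-free data the prediction is exact; the paper's min–max formulation is precisely what keeps this step well behaved once `u` and `y` are inexact.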
       
  • Deep Neural Network-Based Approximate Optimal Tracking for Unknown
           Nonlinear Systems


      Authors: Max L. Greene;Zachary I. Bell;Scott Nivison;Warren E. Dixon;
      Pages: 3171 - 3177
      Abstract: The infinite horizon optimal tracking problem is solved for a deterministic, control-affine, unknown nonlinear dynamical system. A deep neural network (DNN) is updated in real time to approximate the unknown nonlinear system dynamics. The developed framework uses a multitimescale concurrent learning-based weight update policy, with which the output layer DNN weights are updated in real time, but the internal DNN features are updated discretely and at a slower timescale (i.e., with batch-like updates). The design of the output layer weight update policy is motivated by a Lyapunov-based analysis, and the inner features are updated according to existing DNN optimization algorithms. Simulation results demonstrate the efficacy of the developed technique and compare its performance to existing techniques.
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Expand Your Network, Get Rewarded


      Pages: 3178 - 3178
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • TechRxiv: Share Your Preprint Research with the World!


      Pages: 3179 - 3179
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
  • Introducing IEEE Collabratec


      Pages: 3180 - 3180
      PubDate: May 2023
      Issue No: Vol. 68, No. 5 (2023)
       
 
JournalTOCs
School of Mathematical and Computer Sciences
Heriot-Watt University
Edinburgh, EH14 4AS, UK
Email: journaltocs@hw.ac.uk
Tel: +00 44 (0)131 4513762
 



JournalTOCs © 2009-