- Extremes.jl: Extreme Value Analysis in Julia
Authors: Jonathan Jalbert; Marilou Farmer, Gabriel Gobeil, Philippe Roy Abstract: The Extremes.jl package provides exhaustive, high-performance functions by leveraging the multiple-dispatch capabilities in Julia for the analysis of extreme values. In particular, the package implements statistical models for both block maxima and peaks-over-threshold methods, along with several methods for the generalized extreme value and generalized Pareto distributions used in extreme value theory. Additionally, the package offers various parameter estimation methods, such as probability-weighted moments, maximum likelihood, and Bayesian estimation. It also includes tools for handling dependence in excesses over a threshold and methods for managing nonstationary models. Inference for extreme quantiles is available for both stationary and nonstationary models, along with diagnostic figures to assess the goodness of fit of the model to the data. PubDate: Tue, 04 Jun 2024 00:00:00 +000
- Generalized Plackett-Luce Likelihoods
Authors: Robin K. S. Hankin Abstract: The hyper2 package provides functionality to work with extensions of the Bradley-Terry probability model such as Plackett-Luce likelihood including team strengths and reified entities (monsters). The package allows one to use relatively natural R idiom to manipulate such likelihood functions. Here, I present a generalization of hyper2 in which multiple entities are constrained to have identical Bradley-Terry strengths. A new S3 class 'hyper3', along with associated methods, is motivated and introduced. Three datasets are analyzed, each analysis furnishing new insight, and each highlighting different capabilities of the package. PubDate: Mon, 03 Jun 2024 00:00:00 +000
- fHMM: Hidden Markov Models for Financial Time Series in R
Authors: Lennart Oelschläger; Timo Adam, Rouven Michels Abstract: Hidden Markov models constitute a versatile class of statistical models for time series that are driven by hidden states. In financial applications, the hidden states can often be linked to market regimes such as bearish and bullish markets or recessions and periods of economic growth. To give an example, when the market is in a nervous state, corresponding stock returns often follow some distribution with relatively high variance, whereas calm periods are often characterized by a different distribution with relatively smaller variance. Hidden Markov models can be used to explicitly model the distribution of the observations conditional on the hidden states and the transitions between states, and thus help us to draw a comprehensive picture of market behavior. While various implementations of hidden Markov models are available, a comprehensive R package that is tailored to financial applications is still lacking. In this paper, we introduce the R package fHMM, which provides various tools for applying hidden Markov models to financial time series. It contains functions for fitting hidden Markov models to data, conducting simulation experiments, and decoding the hidden state sequence. Furthermore, functions for model checking, model selection, and state prediction are provided. In addition to basic hidden Markov models, hierarchical hidden Markov models are implemented, which can be used to jointly model multiple data streams that were observed at different temporal resolutions. The aim of the fHMM package is to give R users with an interest in financial applications access to hidden Markov models and their extensions. PubDate: Mon, 03 Jun 2024 00:00:00 +000
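The decoding step mentioned in the fHMM abstract can be illustrated with a minimal sketch. It is not the fHMM API (fHMM is an R package); it is a plain-Python Viterbi decoder for a two-state Gaussian hidden Markov model, where the function names, the transition matrix, and the "calm"/"nervous" standard deviations are all hypothetical values chosen for illustration.

```python
import math


def gaussian_logpdf(x, mean, sd):
    """Log-density of a normal distribution with the given mean and sd."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi(returns, init, trans, means, sds):
    """Most likely hidden state sequence for a Gaussian HMM (illustrative)."""
    n_states = len(init)
    # score[s] = log-probability of the best path ending in state s
    score = [
        math.log(init[s]) + gaussian_logpdf(returns[0], means[s], sds[s])
        for s in range(n_states)
    ]
    backpointers = []
    for x in returns[1:]:
        new_score, pointers = [], []
        for s in range(n_states):
            # Best predecessor state for landing in state s at this step
            best_prev = max(
                range(n_states), key=lambda p: score[p] + math.log(trans[p][s])
            )
            pointers.append(best_prev)
            new_score.append(
                score[best_prev]
                + math.log(trans[best_prev][s])
                + gaussian_logpdf(x, means[s], sds[s])
            )
        score = new_score
        backpointers.append(pointers)
    # Backtrack from the best final state
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical parameters: state 0 is "calm" (low variance),
# state 1 is "nervous" (high variance); both have zero mean return.
returns = [0.001, -0.002, 0.001, 0.05, -0.06, 0.04, 0.002, -0.001]
path = viterbi(
    returns,
    init=[0.5, 0.5],
    trans=[[0.9, 0.1], [0.1, 0.9]],
    means=[0.0, 0.0],
    sds=[0.005, 0.05],
)
print(path)
```

With these assumed parameters, the three large-magnitude returns in the middle of the series are assigned to the high-variance state, mirroring the calm versus nervous regimes described in the abstract.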
- Emulation and History Matching Using the hmer Package
Authors: Andrew Iskauskas; Ian Vernon, Michael Goldstein, Danny Scarponi, Nicky McCreesh, Trevelyan J. McKinley, Richard G. White Abstract: Modeling complex real-world situations such as infectious diseases, geological phenomena, and biological processes can present a dilemma: the computer model (referred to as a simulator) needs to be complex enough to capture the dynamics of the system, but each increase in complexity increases the evaluation time of such a simulation, making it difficult to obtain an informative description of parameter choices that would be consistent with observed reality. While methods for identifying acceptable matches to real-world observations exist, for example optimization or Markov chain Monte Carlo methods, they may result in non-robust inferences or may be infeasible for computationally intensive simulators. The techniques of emulation and history matching can make such determinations feasible, efficiently identifying regions of parameter space that produce acceptable matches to data while also providing valuable information about the simulator's structure, but the mathematical considerations required to perform emulation can present a barrier for makers and users of such simulators compared to other methods. The hmer package provides an accessible framework for using history matching and emulation on simulator data, leveraging the computational efficiency of the approach while enabling users to easily match to, visualize, and robustly predict from their complex simulators. PubDate: Mon, 03 Jun 2024 00:00:00 +000
- cpop: Detecting Changes in Piecewise-Linear Signals
Authors: Paul Fearnhead; Daniel Grose Abstract: Changepoint detection is an important problem with a wide range of applications. There are many different types of changes that one may wish to detect, and a wide range of algorithms and software for detecting them. However, there are relatively few approaches for detecting changes-in-slope in the mean of a signal plus noise model. We describe the R package cpop, available on the Comprehensive R Archive Network (CRAN). This package implements CPOP, a dynamic programming algorithm, to find the optimal set of changes that minimizes an L0 penalized cost, with the cost being a weighted residual sum of squares. The package has extended the CPOP algorithm so that it can analyse data that is unevenly spaced, allow for heterogeneous noise variance, and allow for a grid of potential change locations that differs from the locations of the data points. There is also an implementation that uses the CROPS algorithm to detect all segmentations that are optimal as you vary the L0 penalty for adding a change across a continuous range of values. PubDate: Wed, 29 May 2024 00:00:00 +000
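The idea of minimizing an L0 penalized residual sum of squares by dynamic programming can be sketched in a deliberately simplified setting. The code below is not CPOP and not the cpop package interface: it is classic optimal partitioning for a piecewise-constant mean (CPOP itself fits continuous piecewise-linear signals, which requires a more involved recursion). The function names and the penalty value are hypothetical.

```python
def segment_cost(prefix, prefix_sq, i, j):
    """Residual sum of squares of y[i:j] around its own mean."""
    n = j - i
    total = prefix[j] - prefix[i]
    return (prefix_sq[j] - prefix_sq[i]) - total * total / n


def optimal_partitioning(y, penalty):
    """Changepoints minimizing RSS + penalty * (number of changes).

    Simplified illustration for changes in mean, not changes in slope.
    """
    n = len(y)
    # Prefix sums give each segment cost in constant time
    prefix, prefix_sq = [0.0], [0.0]
    for value in y:
        prefix.append(prefix[-1] + value)
        prefix_sq.append(prefix_sq[-1] + value * value)

    # best[t] = minimal penalized cost of y[:t]; starting at -penalty
    # means the first segment is not charged for a change.
    best = [-penalty] + [float("inf")] * n
    last = [0] * (n + 1)
    for t in range(1, n + 1):
        for s in range(t):
            cost = best[s] + segment_cost(prefix, prefix_sq, s, t) + penalty
            if cost < best[t]:
                best[t] = cost
                last[t] = s

    # Backtrack through the stored segment starts
    changepoints = []
    t = n
    while t > 0:
        s = last[t]
        if s > 0:
            changepoints.append(s)
        t = s
    return sorted(changepoints)


# Hypothetical data: a single jump in the mean after five observations
y = [0.0] * 5 + [5.0] * 5
changes = optimal_partitioning(y, penalty=1.0)
print(changes)
```

A larger penalty makes each additional change more expensive, so fewer changepoints are returned; this is the trade-off that a search over a continuous range of penalty values, as described in the abstract, explores.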
- funGp: An R Package for Gaussian Process Regression with Scalar and Functional Inputs
Authors: José Betancourt; François Bachoc, Thierry Klein, Déborah Idier, Jérémy Rohmer, Yves Deville Abstract: This article introduces funGp, an R package which handles regression problems involving multiple scalar and/or functional inputs, and a scalar output, through the Gaussian process model. This is particularly of interest for the design and analysis of computer experiments with expensive-to-evaluate numerical codes that take as inputs regularly sampled time series. Rather than imposing any particular parametric input-output relationship in advance (e.g., linear, polynomial), Gaussian process models extract this information directly from the data. The package offers built-in dimension reduction, which helps to simplify the representation of the functional inputs and obtain lighter models. It also implements an ant colony based optimization algorithm which supports the calibration of multiple structural characteristics of the model such as the state of each input (active or inactive) and the type of kernel function, while seeking greater prediction power. The implemented methods are tested and applied to a real case in the domain of marine flooding. PubDate: Sat, 11 May 2024 00:00:00 +000
- bizicount: Bivariate Zero-Inflated Count Copula Regression Using R
Authors: John M. Niehaus; Lin Zhu, Scott J. Cook, Mikyoung Jun Abstract: Two common issues arise in regression modelling of bivariate count data: (i) dependence across outcomes, and (ii) excess zero counts (i.e., zero inflation). However, there are currently few options to estimate bivariate zero-inflated count regression models in R. Therefore, we present an R package, bizicount, that enables researchers to easily estimate bivariate zero-inflated count copula regression models. By using copulas to model the dependence across outcomes, researchers do not have to make assumptions about the multivariate (and zero-inflated) structure relating their count variables to one another. Instead, they are only required to make familiar assumptions about the marginal distribution of each outcome variable, which should enable wider use of our approach. Below we present our proposed estimator, detail its advantages over existing alternatives, and demonstrate the use of the corresponding functions for bivariate modeling of terrorism data from Nigeria. PubDate: Wed, 08 May 2024 00:00:00 +000
- scikit-fda: A Python Package for Functional Data Analysis
Authors: Carlos Ramos-Carreño; José Luis Torrecilla, Miguel Carbajo-Berrocal, Pablo Marcos, Alberto Suárez Abstract: The library scikit-fda is a Python package for functional data analysis (FDA). It provides a comprehensive set of tools for representation, preprocessing, and exploratory analysis of functional data. The library is built upon and integrated in Python's scientific ecosystem. In particular, it conforms to the scikit-learn application programming interface so as to take advantage of the functionality for machine learning provided by this package: Pipelines, model selection, and hyperparameter tuning, among others. The scikit-fda package has been released as free and open-source software under a 3-clause BSD license and is open to contributions from the FDA community. The library's extensive documentation includes step-by-step tutorials and detailed examples of use. PubDate: Wed, 08 May 2024 00:00:00 +000
- openTSNE: A Modular Python Library for t-SNE Dimensionality Reduction and Embedding
Authors: Pavlin G. Poličar; Martin Stražar, Blaž Zupan Abstract: One of the most popular techniques for visualizing large, high-dimensional data sets is t-distributed stochastic neighbor embedding (t-SNE). Recently, several extensions have been proposed to address scalability issues and the quality of the resulting visualizations. We introduce openTSNE, a modular Python library that implements the core t-SNE algorithm and its many extensions. The library is faster than existing implementations and can compute projections of data sets containing millions of data points in minutes. PubDate: Wed, 08 May 2024 00:00:00 +000
- magi: A Package for Inference of Dynamic Systems from Noisy and Sparse Data via Manifold-Constrained Gaussian Processes
Authors: Samuel W. K. Wong; Shihao Yang, S. C. Kou Abstract: This article presents the magi software package for the inference of dynamic systems. The focus of magi is on dynamics modeled by nonlinear ordinary differential equations with unknown parameters. While such models are widely used in science and engineering, the available experimental data for parameter estimation may be noisy and sparse. Furthermore, some system components may be entirely unobserved. magi solves this inference problem with the help of manifold-constrained Gaussian processes within a Bayesian statistical framework, whereas unobserved components have posed a significant challenge for existing software. We use several realistic examples to illustrate the functionality of magi. The user may choose to use the package in any of the R, MATLAB, and Python environments. PubDate: Wed, 08 May 2024 00:00:00 +000