Authors: Huda Ibeid; Rio Yokota; Jennifer Pestana; David Keyes Abstract: Among optimal hierarchical algorithms for the computational solution of elliptic problems, the fast multipole method (FMM) stands out for its adaptability to emerging architectures: it has high arithmetic intensity, tunable accuracy, and relaxable global synchronization requirements. We demonstrate that, beyond its traditional use as a solver in problems for which explicit free-space kernel representations are available, the FMM is applicable as a preconditioner in finite domain elliptic boundary value problems, by equipping it with boundary integral capability for satisfying conditions at finite boundaries and by wrapping it in a Krylov method for extensibility to more general operators. Here, we do not discuss the well-developed applications of FMM to implement matrix-vector multiplications within Krylov solvers of boundary element methods. Instead, we propose using FMM for the volume-to-volume contribution of inhomogeneous Poisson-like problems, where the boundary integral is a small part of the overall computation. Our method may be used to precondition sparse matrices arising from finite difference/element discretizations, and can handle a broader range of scientific applications. It is capable of algebraic convergence rates down to the truncation error of the discretized PDE, comparable to those of multigrid methods, and it offers potentially superior multicore and distributed memory scalability on commodity architecture supercomputers. Compared with other methods exploiting the low-rank character of off-diagonal blocks of the dense resolvent operator, FMM-preconditioned Krylov iteration may reduce the amount of communication because it is matrix-free and exploits the tree structure of FMM. We describe our tests in reproducible detail with freely available codes and outline directions for further extensibility. PubDate: 2017-11-09 DOI: 10.1007/s00791-017-0287-5
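As a minimal illustration of the preconditioning idea, the sketch below plugs a matrix-free preconditioner callable into conjugate gradients for a 1D Poisson model problem. The preconditioner here is a deliberately crude explicit approximate inverse standing in for an FMM-applied free-space Green's function; the model problem, sizes, and the 0.1 perturbation are illustrative assumptions, not the authors' setup.

```python
import numpy as np

def pcg(A, b, M_inv, tol=1e-10, maxit=200):
    """Preconditioned conjugate gradients; M_inv applies the
    preconditioner matrix-free, as a callable (the slot an FMM
    evaluation would fill)."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = M_inv(r)
    p = z.copy()
    rz = r @ z
    for k in range(maxit):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) < tol * np.linalg.norm(b):
            return x, k + 1
        z = M_inv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, maxit

# 1D Poisson model problem: -u'' = f on a uniform interior grid.
n = 64
h = 1.0 / (n + 1)
A = (np.diag(2.0 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2

# Stand-in preconditioner: a cheap approximate inverse built once.
# A real FMM preconditioner would apply the free-space Green's
# function in near-linear time without forming this matrix.
M = np.linalg.inv(A + 0.1 * np.eye(n))
b = np.ones(n)
x, iters = pcg(A, b, lambda r: M @ r)
```

Because the stand-in preconditioner is close to the exact inverse, the iteration count stays far below the unpreconditioned CG count for the same tolerance.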

Authors: Sriparna Saha; Monalisa Pal; Amit Konar Abstract: A novel approach for distinguishing 25 body gestures indicative of physical disorders in young and elderly individuals is presented. A well-known human sensing device, the Kinect sensor, is used; it approximates the human body by 20 body joints and produces a data stream from which the skeleton of the human body is traced. The sampling rate of the data stream is 30 frames per second, where every frame represents a body gesture. The overall system is split into two parts. The offline part calculates 19 features from each frame representing a diseased gesture; these features are angle and distance information between the 20 body joints, and they correspond to a definite pattern for each specific body gesture. In the online part, a triangular-fuzzy-matching-based algorithm detects real-time gestures with 90.57% accuracy. To achieve better accuracy, a decision tree is employed to separate sitting and standing body gestures. The proposed approach is observed to outperform several contemporary approaches in terms of accuracy while remaining a simple system that is based on medical knowledge and is capable of distinguishing as many as 25 gestures. PubDate: 2017-10-07 DOI: 10.1007/s00791-017-0281-y
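The online matching step can be sketched with triangular fuzzy membership functions: each observed feature is scored against a stored template value, and per-feature memberships are averaged per gesture. The feature values, templates, and spread below are toy assumptions; the paper's 19 angle/distance features and its decision-tree stage are not reproduced.

```python
def tri_membership(x, peak, spread):
    """Triangular fuzzy membership: 1 at the template value,
    falling linearly to 0 at distance `spread`."""
    return max(0.0, 1.0 - abs(x - peak) / spread)

def match_gesture(frame, templates, spread=0.2):
    """Score a frame (feature vector) against each stored gesture
    template by averaging per-feature triangular memberships and
    return the best-matching gesture name with its score."""
    best, best_score = None, -1.0
    for name, peaks in templates.items():
        score = sum(tri_membership(x, p, spread)
                    for x, p in zip(frame, peaks)) / len(frame)
        if score > best_score:
            best, best_score = name, score
    return best, best_score

# Toy templates with 3 features instead of the paper's 19.
templates = {"sitting": [0.5, 1.2, 0.3], "standing": [1.6, 0.2, 0.9]}
gesture, score = match_gesture([0.52, 1.15, 0.33], templates)
```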

Authors: Sagar Adatrao; Mayank Mittal Abstract: Detecting the size and/or location of circular objects in images has applications in many areas, such as flow diagnostics, biomedical engineering, and computer vision. The detection accuracy for circular objects largely depends on the accuracy of the centroiding algorithm and the image preprocessing technique. In the present work, an error analysis is performed in determining the centroids of circular objects using synthetic images with eight different signal-to-noise ratios ranging from 2.7 to 17.8. In the first stage, four different centroiding algorithms, namely Center of Mass, Weighted Center of Mass, the Späth algorithm, and the Hough transform, are studied and compared. The error analysis shows that the Späth algorithm performs better than the other algorithms. In the second stage, various image preprocessing techniques, consisting of two filters, namely Median and Wiener, and five image segmentation methods, namely the Sobel, Prewitt, and Laplacian of Gaussian (LoG) edge detectors, basic global thresholding, and Otsu’s global thresholding, are compared for determining the centroids of circular objects using the Späth algorithm. It is found that the Wiener filter plus the LoG edge detector performs better than the other preprocessing techniques. Real images of a calibration target (typical in flow diagnostics) and of the secondary atomization of water droplets are then considered for centroid detection. These two images are preprocessed using the Wiener filter plus the LoG edge detector and then processed using the Späth algorithm to detect the centroids of circular objects. It is observed that the results for the real image of the calibration target and for the synthetic images are comparable. Also, based on visual inspection, the centroids detected in the real image of water droplets are found to be reasonably accurate. PubDate: 2017-10-06 DOI: 10.1007/s00791-017-0286-6
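The simplest of the four centroiding algorithms, Center of Mass, reduces to an intensity-weighted average of pixel coordinates. A minimal sketch on a toy grey-level patch (the image values are illustrative):

```python
def center_of_mass(img):
    """Intensity-weighted centroid (row, col) of a 2-D image,
    given as a list of rows of grey values."""
    total = sr = sc = 0.0
    for r, row in enumerate(img):
        for c, v in enumerate(row):
            total += v
            sr += r * v
            sc += c * v
    return sr / total, sc / total

# Symmetric bright spot centred at (1, 1) in a 3x3 patch.
img = [[0, 1, 0],
       [1, 4, 1],
       [0, 1, 0]]
centroid = center_of_mass(img)  # -> (1.0, 1.0)
```

The Weighted Center of Mass variant of this idea raises the intensities to a power before averaging, which suppresses background noise; the Späth and Hough approaches fit circles rather than averaging intensities.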

Authors: R. D. Falgout; S. Friedhoff; Tz. V. Kolev; S. P. MacLachlan; J. B. Schroder; S. Vandewalle Abstract: We consider the comparison of multigrid methods for parabolic partial differential equations that allow space–time concurrency. With current trends in computer architectures leading towards systems with more, but not faster, processors, space–time concurrency is crucial for speeding up time-integration simulations. In contrast, traditional time-integration techniques impose serious limitations on parallel performance due to the sequential nature of the time-stepping approach, allowing spatial concurrency only. This paper considers the three basic options of multigrid algorithms on space–time grids that allow parallelism in space and time: coarsening in space and time, semicoarsening in the spatial dimensions, and semicoarsening in the temporal dimension. We develop parallel software and performance models to study the three methods at scales of up to 16K cores and introduce an extension of one of them for handling multistep time integration. We then discuss advantages and disadvantages of the different approaches and their benefit compared to traditional space-parallel algorithms with sequential time stepping on modern architectures. PubDate: 2017-10-06 DOI: 10.1007/s00791-017-0283-9

Authors: Yangang Chen; Justin W. L. Wan Abstract: We propose multigrid methods for a convergent mixed finite difference discretization of the two-dimensional Monge–Ampère equation. We apply a mixture of the standard 7-point stencil and a semi-Lagrangian wide stencil discretization, such that the numerical solution is guaranteed to converge to the viscosity solution of the Monge–Ampère equation. We investigate multigrid methods for two scenarios. The first scenario applies the standard 7-point stencil discretization on the entire computational domain; here we use the full approximation scheme with four-directional alternating line smoothers. The second scenario considers the more general mixed stencil discretization and is used for the linearized problem. We propose a coarsening strategy in which wide stencil points are set as coarse grid points. Linear interpolation is applied on the entire computational domain, while at wide stencil points injection as the restriction yields a good coarse grid correction. Numerical experiments show that the convergence rates of the proposed multigrid methods are mesh-independent. PubDate: 2017-10-05 DOI: 10.1007/s00791-017-0284-8

Authors: Duncan Kioi Gathungu; Alfio Borzì Abstract: The fast multigrid solution of an optimal control problem governed by a convection–diffusion partial integro-differential equation is investigated. This optimization problem considers a cost functional of tracking type and a constrained distributed control. The optimal control sought is characterized by the solution to the corresponding optimality system, which is approximated by finite volume and quadrature discretization schemes and solved by multigrid techniques. The proposed multigrid approach combines a multigrid method for the governing model with a fast multigrid integration method. The convergence of this solution procedure is analyzed by local Fourier analysis and validated by results of numerical experiments. PubDate: 2017-10-03 DOI: 10.1007/s00791-017-0285-7

Authors: Rajib Sarkar; Soumya Kanti Naskar; Sanjoy Kumar Saha Abstract: Classification of music signals is a fundamental step for the organized archival of a music collection and fast retrieval thereafter. For Indian classical music, the raga is the basic melodic framework. Manual identification of a raga demands high expertise, which is not easily available; an automated system for raga identification is therefore of great importance. In this work, we study the basic properties of the ragas in North Indian (Hindusthani) classical music and design features to capture them. A pitch-based Swara (note) profile is formed, and the occurrence and energy distribution of notes generated from the profile are used as features. The note sequence plays an important role in raga composition; the proposed note co-occurrence matrix summarizes this aspect. An audio clip is represented by these features, which have strong correlation with the properties of the raga. A support vector machine is used for classification. Experiments are performed on a diversified dataset, and the performance of the proposed work is compared with two other systems. It is observed that the proposed methodology performs better. PubDate: 2017-09-23 DOI: 10.1007/s00791-017-0282-x
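The note co-occurrence matrix can be sketched as a count of transitions between consecutive notes in the swara sequence derived from pitch: entry [i][j] counts how often note j follows note i. The 12-note indexing and the toy sequence below are illustrative assumptions, not the paper's data.

```python
def note_cooccurrence(notes, n_swaras=12):
    """Build a note co-occurrence matrix from a swara index
    sequence: m[i][j] counts transitions from note i to note j."""
    m = [[0] * n_swaras for _ in range(n_swaras)]
    for a, b in zip(notes, notes[1:]):
        m[a][b] += 1
    return m

seq = [0, 2, 4, 2, 0, 2]      # toy swara indices
m = note_cooccurrence(seq)
```

Row-normalizing such a matrix would give note-transition probabilities, a compact summary of the melodic movement the abstract attributes to raga composition.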

Authors: Randolph E. Bank Abstract: Higher order finite elements present certain challenges for multilevel methods. Such matrices have more nonzero elements and a special block structure; in the case of h-p adaptive methods, the block structure is more complicated. In this work we present a simple two-level solver for such systems that exploits these special properties. The convergence rate is (empirically) multigrid-like, at least up to piecewise polynomials of degree nine. Numerical illustrations demonstrate its robustness on a wide variety of problems, including convection–diffusion and Helmholtz equations. PubDate: 2017-05-25 DOI: 10.1007/s00791-017-0280-z

Authors: Melania Carfagna; Alfio Grillo Abstract: Nowadays, the description of complex physical systems, such as biological tissues, calls for highly detailed and accurate mathematical models. These, in turn, necessitate increasingly elaborate numerical methods as well as dedicated algorithms capable of resolving each detail they account for. Especially when commercial software is used, the performance of the algorithms coded by the user must be tested and carefully assessed. In Computational Biomechanics, the Spherical Design Algorithm (SDA) is widely used to model biological tissues that, like articular cartilage, are described as composites reinforced by statistically oriented collagen fibres. The purpose of the present work is to analyse the performance of the SDA, which we implement in a commercial software package for several sets of integration points (referred to as “spherical designs”), and to compare the results with those determined by using an appropriate set of points proposed in this manuscript. As terms for comparison we take the results obtained by employing the integration scheme Integral, available in Matlab®. For the numerical simulations, we study a well-documented benchmark test on articular cartilage, known as the ‘unconfined compression test’. The reported numerical results highlight the influence of the fibres on the elasticity and permeability of this tissue. Moreover, some technical issues of the SDA (such as the choice of the quadrature points and their position in the integration domain) are discussed. PubDate: 2017-04-21 DOI: 10.1007/s00791-017-0278-6
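The quadrature idea behind spherical designs is that the average of a function over the unit sphere is approximated by the plain average of its values at the design points, and this is exact for polynomials up to the design's strength. A sketch using the octahedron vertices, which form a spherical 3-design (the integrand is illustrative; the paper integrates fibre-orientation densities):

```python
def sphere_average(f, points):
    """Spherical-design quadrature: the average of f over the unit
    sphere is the equal-weight average of f at the design points,
    exact for polynomials up to the design's strength."""
    return sum(f(p) for p in points) / len(points)

# Octahedron vertices: a spherical 3-design with 6 points.
octa = [(1, 0, 0), (-1, 0, 0), (0, 1, 0),
        (0, -1, 0), (0, 0, 1), (0, 0, -1)]

# Average of x^2 over the unit sphere; the exact value is 1/3.
avg = sphere_average(lambda p: p[0] ** 2, octa)
```

Higher-strength designs with more points extend the same equal-weight rule to the smoother, anisotropic integrands that arise from statistically oriented fibres.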

Authors: Klaus-Peter Kröhn Abstract: The underground Hard Rock Laboratory (HRL) at Äspö, Sweden, is located in granitic rock and dedicated to investigations concerning deep geological radioactive waste disposal. Several in-situ experiments have been performed in the HRL with respect to geotechnical barriers which are intended to protect the waste canisters against the prevailing groundwater. Among them are the recent Buffer-Rock Interaction Experiment (BRIE) and, on a much larger scale, the long-term Prototype Repository (PR) experiment. Modelling of the surrounding groundwater flow systems has been performed with the code d³f using an approach in which large fractures are represented discretely in the model while the remaining set of smaller fractures—also called “background fractures”—is assumed to act like an additional homogeneous continuum. It was first applied to the BRIE in a cube-like domain of 40 m side length. Calibration of the model resulted in a considerable increase of matrix permeability due to the influence of the background fractures. To check the validity of the approach, the calibrated data for the BRIE were applied to a model of the much larger PR, which is also located in the HRL but at quite some distance from the BRIE. Only moderate modifications of the initially used permeabilities sufficed to fit the numerous outflow data for the PR tunnel as well as for the six “deposition boreholes”. The chosen approach for the BRIE can thus be considered to be successfully transferred to the PR, building considerable confidence in the conceptual approach. PubDate: 2017-04-12 DOI: 10.1007/s00791-017-0279-5

Authors: L. John; P. Pustějovská; O. Steinbach Abstract: Hemodynamic indicators such as the averaged wall shear stress (AWSS) and the oscillatory shear index (OSI) are well established for characterizing areas of arterial walls with respect to the formation and progression of aneurysms. Here, we study two different forms of the wall shear stress vector from which AWSS and OSI are computed. The first is commonly used as a generalization from the two-dimensional setting; the second is derived from the full decomposition of the wall traction force given by the Cauchy stress tensor. We compare the influence of both approaches on the hemodynamic indicators by numerical simulations under different computational settings, namely different (real and artificial) vessel geometries and the influence of a physiological periodic inflow profile. The blood is modeled either as a Newtonian fluid or as a generalized Newtonian fluid with a shear-rate-dependent viscosity. Numerical results are obtained using a stabilized finite element method. We observe profound differences in the hemodynamic indicators computed by these two approaches, mainly at critical areas of the arterial wall. PubDate: 2017-04-12 DOI: 10.1007/s00791-017-0277-7
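Whichever form of the wall shear stress vector is used, the OSI itself has a standard formula, OSI = (1/2)(1 - |mean of tau| / mean of |tau|), which for a sampled series reduces to comparing the magnitude of the time-averaged vector with the time-averaged magnitude. A sketch on two illustrative toy series (the vectors are assumptions, not simulation output):

```python
import math

def osi(wss_series):
    """Oscillatory shear index from a sampled wall-shear-stress
    vector time series: 0 for steady unidirectional shear, up to
    0.5 for fully reversing shear.  Assumes a nonzero series."""
    n = len(wss_series[0])
    mean_vec = [sum(v[i] for v in wss_series) / len(wss_series)
                for i in range(n)]
    mag_of_mean = math.sqrt(sum(c * c for c in mean_vec))
    mean_of_mag = sum(math.sqrt(sum(c * c for c in v))
                      for v in wss_series) / len(wss_series)
    return 0.5 * (1.0 - mag_of_mean / mean_of_mag)

steady = [(1.0, 0.0)] * 4                                  # unidirectional
reversing = [(1.0, 0.0), (-1.0, 0.0), (1.0, 0.0), (-1.0, 0.0)]  # oscillatory
```

The two definitions studied in the paper differ in which vector enters this formula, which is why the resulting OSI fields can diverge at critical wall areas.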

Authors: Randolph E. Bank; Chris Deotte Abstract: This paper discusses the effects that partitioning has on the convergence rate of domain decomposition. When finite elements are employed to solve a second order elliptic partial differential equation with strong convection and/or anisotropic diffusion, the shape and alignment of a partition’s parts significantly affect the domain decomposition convergence rate. Given a PDE, let b be the direction of convection or the dominant direction of anisotropic diffusion; when traversing the domain in the direction of b, partitions with fewer parts to cross in this direction converge faster, while partitions with more parts converge more slowly. PubDate: 2017-01-20 DOI: 10.1007/s00791-016-0271-5

Authors: Rolf Krause; Alessandro Rigazzi; Johannes Steiner Pages: 1 - 15 Abstract: The parallel solution of constrained minimization problems requires special care to be taken with respect to the information transfer between the different subproblems. Here, we present a nonlinear decomposition approach which employs an additional nonlinear correction step along the processor interfaces. Our approach is generic in the sense that it can be applied to a wide class of minimization problems with strongly local nonlinearities, including even nonsmooth minimization problems. We also describe the implementation of our nonlinear decomposition method in the object-oriented library ObsLib++. The flexibility of our approach and its implementation is demonstrated on different problem classes such as obstacle problems, frictional contact problems, and biomechanical applications. For the same examples, the number of iterations, computation time, and parallelization speedup are measured, and the results demonstrate that the implementation scales reasonably well up to 4096 processors. PubDate: 2016-02-01 DOI: 10.1007/s00791-016-0267-1 Issue No: Vol. 18, No. 1 (2016)

Authors: Peter Frolkovič; Michael Lampe; Gabriel Wittum Pages: 17 - 29 Abstract: When a realistic modelling of radioactive contaminant transport in flowing groundwater is required, very large systems of coupled partial and ordinary differential equations can arise that have to be solved numerically. For that purpose, the software package r³t was developed, in which several advanced numerical methods are implemented to solve such models efficiently and accurately. Using the software tools of r³t, one can successfully treat nontrivial mathematical problems such as advection-dominated systems with different retardation of transport for each component and with nonlinear Freundlich sorption and/or precipitation. Additionally, long-time simulations on complex 3D geological domains using unstructured grids can be realized. In this paper we introduce and summarize the most important and novel features of numerical simulation of radioactive contaminant transport in porous media using r³t. PubDate: 2016-02-01 DOI: 10.1007/s00791-016-0268-0 Issue No: Vol. 18, No. 1 (2016)

Authors: Peter Frolkovič; Dmitriy Logashenko; Christian Wehner Pages: 31 - 52 Abstract: In this paper we deal with the application of the flux-based level set method to moving interface computations on unstructured grids. The focus lies on overcoming the known difficulties of level set methods, e.g. accurate computation of important geometric properties, reliable and precise reinitialization of the level set function, and the adaptation of standard discretization methods to the moving boundary case. The basic building block of our approach is the high-resolution flux-based level set method for the general advection equation (Frolkovič and Mikula in SIAM J Sci Comput 29(2):579–597, 2007; Frolkovič and Wehner in Comput Vis Sci 12(6):626–650, 2009). We extend this method to the problem of reinitialization of the level set function on unstructured grids by using quadratic interpolation to compute distances for nodes close to the interface. To realize numerical simulation for some applications with moving boundaries, we adapt the approach of the ghost fluid method (Gibou and Fedkiw in J Comput Phys 202:577–601, 2005) for unstructured grids. The idea is to describe the development of the moving boundary with a level set formulation while the computational grid remains fixed and the boundary conditions are enforced using some extrapolation. Our main motivation is the numerical solution of two-phase incompressible flow problems. In addition to the previously mentioned steps, we introduce further numerical schemes in the framework of finite volume discretization for the flow. Possible jumps of the pressure and of the directional derivative of the velocity at the interface are modeled directly within the method using the approach of extended approximation spaces. Besides that, an algorithm for the computation of curvature is considered that exhibits second order accuracy for some examples. Numerical experiments are provided for the presented methods. PubDate: 2016-02-01 DOI: 10.1007/s00791-016-0269-z Issue No: Vol. 18, No. 1 (2016)

Authors: Randolph E. Bank; Chris Deotte Abstract: In this work, we compare and contrast a few finite element h-adaptive and hp-adaptive algorithms. We test these schemes on three example PDE problems, and we utilize and evaluate an a posteriori error estimate. In the process, we introduce a new framework for studying adaptive algorithms and a posteriori error estimators. Our environment begins with a solution u and then uses interpolation to simulate solving a corresponding PDE. As a result, we always know the exact error and we avoid the noise associated with solving. Using an effort indicator, we evaluate the relationship between accuracy and computational work, and we report the order of convergence of the different approaches. Finally, we evaluate the accuracy and effectiveness of an a posteriori error estimator. PubDate: 2016-12-26 DOI: 10.1007/s00791-016-0272-4

Authors: Tao Cui; Jinchao Xu; Chen-Song Zhang Abstract: Due to the increasing complexity of supercomputers, hard and soft errors are causing more and more problems in high-performance scientific and engineering computation. In order to improve the reliability (increase the mean time to failure) of computing systems, much effort has been devoted to developing techniques to forecast, prevent, and recover from errors at different levels, including the architecture, application, and algorithm levels. In this paper, we focus on algorithmically error-resilient iterative solvers and introduce a redundant subspace correction method. Using a general framework of redundant subspace corrections, we construct iterative methods with the following properties: (1) they maintain convergence when an error occurs, assuming it is detectable; (2) they introduce low computational overhead when no error occurs; (3) they require only a small amount of point-to-point communication compared to traditional methods and maintain good load balance; (4) they improve the mean time to failure. Preliminary numerical experiments demonstrate the efficiency and effectiveness of the new subspace correction method. For simplicity, the main ideas of the proposed framework are demonstrated using Schwarz methods without a coarse space, which do not scale well in practice. PubDate: 2016-12-22 DOI: 10.1007/s00791-016-0270-6
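Stripped of the redundancy machinery, the underlying Schwarz-style subspace correction can be sketched as repeated local solves on overlapping index blocks; in the fault-tolerant setting each block solve would be replicated so a detected error on one copy can be masked by another. The 1D Poisson problem, block layout, and sweep count below are illustrative assumptions, not the paper's experiments.

```python
import numpy as np

# Small 1D Poisson system as a stand-in global problem.
n = 8
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
b = np.ones(n)

# Two overlapping index blocks (the local subspaces).  In the
# redundant variant, each of these solves would also run on a
# second process so a detected fault on one copy is masked.
blocks = [np.arange(0, 5), np.arange(3, 8)]

x = np.zeros(n)
for _ in range(50):
    for idx in blocks:
        # Local subspace correction: solve the restricted residual
        # equation and add the correction back (multiplicative form).
        r = b - A @ x
        x[idx] += np.linalg.solve(A[np.ix_(idx, idx)], r[idx])
res = np.linalg.norm(b - A @ x)
```

Note that, as the abstract itself points out for its demonstrations, this two-block method has no coarse space and would not scale to many subdomains.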

Authors: Zheng Li; Shuhong Wu; Chen-Song Zhang; Jinchao Xu; Chunsheng Feng; Xiaozhe Hu Abstract: Numerical simulation based on fine-scale reservoir models helps petroleum engineers understand fluid flow in porous media and achieve a higher recovery ratio. Fine-scale models give rise to large-scale linear systems and thus require effective solvers to finish a simulation in reasonable turn-around time. In this paper, we study the convergence, robustness, and efficiency of a class of multi-stage preconditioners, accelerated by Krylov subspace methods, for solving Jacobian systems from a fully implicit discretization. We compare components of these preconditioners, including decoupling and sub-problem solvers, for fine-scale reservoir simulation. Several benchmark and real-world problems, including a ten-million-cell reservoir problem, were simulated on a desktop computer. Numerical tests show that the combination of the alternating block factorization method and a multi-stage subspace correction preconditioner gives a robust and memory-efficient solver for fine-scale reservoir simulation. PubDate: 2016-12-21 DOI: 10.1007/s00791-016-0273-3

Authors: E. Carlini; R. Ferretti Abstract: We propose a semi-Lagrangian scheme coupled with radial basis function interpolation for approximating a curvature-related level set model, which was proposed by Zhao et al. (Comput Vis Image Underst 80:295–319, 2000) to reconstruct unknown surfaces from sparse data sets. The main advantages of the proposed scheme are the possibility of solving the level set method on unstructured grids, as well as of concentrating the reconstruction points in the neighbourhood of the data set, with a consequent reduction of the computational effort. Moreover, the scheme is explicit. Numerical tests show the accuracy and robustness of our approach in reconstructing curves and surfaces from relatively sparse data sets. PubDate: 2016-12-21 DOI: 10.1007/s00791-016-0274-2
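Radial basis function interpolation, the building block coupled to the semi-Lagrangian scheme, fits weights for translates of a kernel so that the interpolant matches the data exactly at the scattered nodes. A 1D sketch with a Gaussian kernel (the kernel choice, shape parameter, and data are illustrative assumptions; the paper works with level set functions on unstructured point sets):

```python
import numpy as np

def rbf_fit(x, y, eps=1.0):
    """Solve for Gaussian radial-basis-function weights so the
    interpolant passes through the scattered data (x, y)."""
    A = np.exp(-(eps * (x[:, None] - x[None, :])) ** 2)
    return np.linalg.solve(A, y)

def rbf_eval(xq, x, w, eps=1.0):
    """Evaluate the RBF interpolant at query points xq."""
    return np.exp(-(eps * (xq[:, None] - x[None, :])) ** 2) @ w

x = np.array([0.0, 0.5, 1.0, 1.5, 2.0])   # scattered nodes
y = np.sin(x)                              # sampled values
w = rbf_fit(x, y)
y_nodes = rbf_eval(x, x, w)                # reproduces the data at the nodes
```

Because the kernel is meshless, the same fit-and-evaluate pattern carries over to unstructured node sets in 2D and 3D, which is what makes it a natural partner for unstructured-grid level set computations.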

Authors: Sambasiva Rao Chinnamsetty; Mike Espig; Wolfgang Hackbusch Pages: 267 - 275 Abstract: The computation of a six-dimensional density matrix is the crucial step in the evaluation of the kinetic energy in electronic structure calculations. For molecules with heavy nuclei, one has to consider a very refined mesh in order to deal with the nuclear cusps, which leads to high computational cost and huge memory requirements for the computation of the density matrix. To reduce the computational complexity and avoid discretization errors in the approximation, we use mesh-free canonical tensor products in electronic structure calculations. In this paper, we approximate the six-dimensional density matrix in an efficient way and then compute the kinetic energy. Accuracy is examined by comparing our computed kinetic energy with the exact kinetic energy. PubDate: 2015-12-01 DOI: 10.1007/s00791-016-0263-5 Issue No: Vol. 17, No. 6 (2015)
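A canonical (CP) tensor-product representation writes a multivariate function as a short sum of separable terms, so pointwise evaluation never touches a full d-dimensional grid. A rank-1 sketch (the Gaussian example is illustrative, not the paper's density matrix):

```python
import math

def cp_eval(factors, point):
    """Evaluate a canonical (CP) tensor-product representation
    f(x1,...,xd) = sum_r prod_k g_{r,k}(x_k) at a point, without
    ever forming a full d-dimensional grid."""
    return sum(math.prod(g(x) for g, x in zip(term, point))
               for term in factors)

def gauss(t):
    return math.exp(-t * t)

# Rank-1 example: exp(-(x^2+y^2+z^2)) = exp(-x^2)*exp(-y^2)*exp(-z^2).
factors = [(gauss, gauss, gauss)]
val = cp_eval(factors, (0.5, 0.5, 0.5))   # equals exp(-0.75)
```

Higher-rank representations simply append more separable terms to `factors`, which is how cusp-dominated quantities can be captured without a refined six-dimensional mesh.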