Abstract: A wide range of important problems in machine learning, expert systems, social network analysis, bioinformatics and information theory can be formulated as a maximum a-posteriori (MAP) inference problem on statistical relational models. While off-the-shelf inference algorithms that are based on local search and message-passing may provide adequate solutions in some situations, they frequently give poor results when faced with models that possess high-density networks. Unfortunately, these situations always occur in models of real-world applications. As such, accurate and scalable MAP inference on such models often remains a key challenge. In this paper, we first introduce a novel family of extended factor graphs that are parameterized by a smoothing parameter χ ∈ [0,1]. Applying belief propagation (BP) message-passing to this family yields a new family of Weighted Survey Propagation algorithms (WSP-χ) applicable to relational domains. Unlike off-the-shelf inference algorithms, WSP-χ detects the “backbone” ground atoms in a solution cluster that involve potentially optimal MAP solutions: the cluster backbone atoms are not only portions of the optimal solutions, but they can also be exploited for scaling MAP inference by iteratively fixing them to reduce the complex parts until the network is simplified into one that can be solved accurately using any conventional MAP inference method. We also propose a lazy variant of this WSP-χ family of algorithms. Our experiments on several real-world problems show the efficiency of WSP-χ and its lazy variants over existing prominent MAP inference solvers such as MaxWalkSAT, RockIt, IPP, SP-Y and WCSP. PubDate: 2020-08-01

Abstract: We tackle some fundamental problems in probability theory on corrupted random processes on the integer line. We analyze when a biased random walk is expected to reach its bottommost point and when intervals of integer points can be detected under a natural model of noise. We apply these results to problems in learning thresholds and intervals under a new model for learning under adversarial design. PubDate: 2020-08-01
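
The question of when a biased random walk reaches its bottommost point can be explored empirically. The following is a minimal simulation sketch, not the analysis from the paper: it assumes a simple ±1 walk on the integers with an up-step probability `p_up`, ignores the corruption and noise models the abstract mentions, and uses hypothetical function names introduced only for illustration.

```python
import random


def argmin_time(p_up, n_steps, rng):
    """Return the first time step at which a +/-1 walk attains its minimum.

    Assumption: the walk starts at 0 and moves up with probability p_up,
    down otherwise. Time 0 counts as a candidate for the minimum.
    """
    position = 0
    lowest = 0
    lowest_time = 0
    for t in range(1, n_steps + 1):
        position += 1 if rng.random() < p_up else -1
        if position < lowest:
            lowest, lowest_time = position, t
    return lowest_time


def mean_argmin_time(p_up, n_steps, trials, seed=0):
    """Average the time of the minimum over independent simulated walks."""
    rng = random.Random(seed)
    total = sum(argmin_time(p_up, n_steps, rng) for _ in range(trials))
    return total / trials
```

Calling `mean_argmin_time(0.7, 200, 200)` and `mean_argmin_time(0.3, 200, 200)` gives a rough feel for how the bias moves the bottommost point toward the start or the end of the trajectory.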

Abstract: In this paper, we analyze the fundamental conditions for low-rank tensor completion given the separation or tensor-train (TT) rank, i.e., ranks of TT unfoldings. We exploit the algebraic structure of the TT decomposition to obtain the deterministic necessary and sufficient conditions on the locations of the samples to ensure finite completability. Specifically, we propose an algebraic geometric analysis on the TT manifold that can incorporate the whole rank vector simultaneously in contrast to the existing approach based on the Grassmannian manifold that can only incorporate one rank component. Our proposed technique characterizes the algebraic independence of a set of polynomials defined based on the sampling pattern and the TT decomposition, which is instrumental to obtaining the deterministic condition on the sampling pattern for finite completability. In addition, based on the proposed analysis, assuming that the entries of the tensor are sampled independently with probability p, we derive a lower bound on the sampling probability p, or equivalently, the number of sampled entries that ensures finite completability with high probability. Moreover, we also provide the deterministic and probabilistic conditions for unique completability. PubDate: 2020-08-01

Abstract: The introduction in 2015 of Residual Neural Networks (RNN) and ResNET allowed for outstanding improvements in the performance of learning algorithms for evolution problems containing a “large” number of layers. Continuous-depth RNN-like models called Neural Ordinary Differential Equations (NODE) were then introduced in 2019. The latter have a constant memory cost, and avoid the a priori specification of the number of hidden layers. In this paper, we derive and analyze a parallel (-in-parameter and time) version of the NODE, which potentially allows for a more efficient implementation than a standard/naive parallelization of NODEs with respect to the parameters only. We expect this approach to be relevant whenever we have access to a very large number of processors, or when we are dealing with high-dimensional ODE systems. Moreover, when using implicit ODE solvers, solutions to linear systems with up to cubic complexity are required for solving nonlinear systems using, for instance, Newton’s algorithm; because the proposed approach makes it possible to reduce the overall number of time-steps thanks to an iterative increase of the accuracy order of the ODE system solvers, it reduces the number of linear systems to solve, hence benefiting from a scaling effect. PubDate: 2020-07-25

Abstract: There are various geometric transformations, e.g., translations and rotations, which are always bijections in Euclidean space. Their digital counterparts, i.e., their digitized variants, are defined on discrete grids, since most of our pictures are digital nowadays. Usually, these digital versions of the transformations have different properties from the original continuous variants. Rotations are bijective on the Euclidean plane, but in many cases they are neither injective nor surjective on digital grids. Since these transformations play an important role in image processing and in image manipulation, it is important to discover their properties. Neighborhood motion maps are tools for analyzing digital transformations, e.g., rotations, from the point of view of local bijectivity. In this paper we show digitized rotations of a pixel and its 12-neighbors on the triangular grid. In particular, different rotation centers are considered with respect to the corresponding main pixel, e.g., edge midpoints and corner points. Angles of all locally bijective and non-bijective rotations are described in detail. It is also shown that the triangular grid shows better performance in some cases than the square grid regarding the number of lost pixels in the neighborhood motion map. PubDate: 2020-07-22
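
The loss of injectivity under digitization is easy to observe numerically. The sketch below is only an illustration under stated assumptions: it works on the square grid rather than the triangular grid studied in the paper, rotates around the origin, and digitizes by rounding each coordinate to the nearest integer. The names `digitized_rotation` and `lost_pixels` are hypothetical.

```python
import math


def digitized_rotation(point, angle):
    """Rotate an integer point around the origin, then round to the grid.

    Assumption: digitization is nearest-integer rounding per coordinate
    on the square grid.
    """
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (math.floor(x * c - y * s + 0.5), math.floor(x * s + y * c + 0.5))


def lost_pixels(points, angle):
    """Count how many points are lost because several share one image."""
    images = {digitized_rotation(p, angle) for p in points}
    return len(points) - len(images)


# A 5x5 window of grid points centred on the origin.
window = [(x, y) for x in range(-2, 3) for y in range(-2, 3)]
```

On this window, a rotation by `math.pi / 2` maps every point to a distinct grid point, while a rotation by `math.pi / 4` sends both `(1, 0)` and `(2, 0)` to `(1, 1)`, so `lost_pixels(window, math.pi / 4)` is positive.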

Abstract: The fundamental concepts underlying Markov networks are conditional independence and the set of rules called Markov properties that translate conditional independence constraints into graphs. We introduce the concept of mutual conditional independence in an independent set of a Markov network, and we prove its equivalence to the Markov properties under certain regularity conditions. This extends the notion of similarity between separation in a graph and conditional independence in probability to similarity between mutual separation in a graph and mutual conditional independence in probability. Model selection in graphical models remains a challenging task due to the large search space. We show that the mutual conditional independence property can be exploited to reduce the search space. We present a new forward model selection algorithm for graphical log-linear models using mutual conditional independence. We illustrate our algorithm with a real data set example. We show that for sparse models the size of the search space can be reduced from \(\mathcal{O}(n^{3})\) to \(\mathcal{O}(n^{2})\) using our proposed forward selection method rather than the classical forward selection method. We also envision that this property can be leveraged for model selection and inference in different types of graphical models. PubDate: 2020-07-21

Abstract: We propose a directed acyclic hypergraph framework for a probabilistic graphical model that we call Bayesian hypergraphs. The space of directed acyclic hypergraphs is much larger than the space of chain graphs. Hence Bayesian hypergraphs can model much finer factorizations than Bayesian networks or LWF chain graphs and provide simpler and more computationally efficient procedures for factorizations and interventions. Bayesian hypergraphs also allow a modeler to represent causal patterns of interaction such as Noisy-OR graphically (without additional annotations). We introduce global, local and pairwise Markov properties of Bayesian hypergraphs and prove under which conditions they are equivalent. We also extend the causal interpretation of LWF chain graphs to Bayesian hypergraphs and provide corresponding formulas and a graphical criterion for intervention. PubDate: 2020-07-10

Abstract: Recently, Mahloujifar and Mahmoody (Theory of Cryptography Conference’17) studied attacks against learning algorithms using a special case of Valiant’s malicious noise, called p-tampering, in which the adversary gets to change any training example with independent probability p but is limited to choosing only ‘adversarial’ examples with correct labels. They obtained p-tampering attacks that increase the error probability in the so-called ‘targeted’ poisoning model, in which the adversary’s goal is to increase the loss of the trained hypothesis over a particular test example. At the heart of their attack was an efficient algorithm to bias the expected value of any bounded real-output function through p-tampering. In this work, we present new biasing attacks for increasing the expected value of bounded real-valued functions. Our improved biasing attacks directly imply improved p-tampering attacks against learners in the targeted poisoning model. As a bonus, our attacks come with considerably simpler analysis. We also study the possibility of PAC learning under p-tampering attacks in the non-targeted (aka indiscriminate) setting, where the adversary’s goal is to increase the risk of the generated hypothesis (for a random test example). We show that PAC learning is possible under p-tampering poisoning attacks essentially whenever it is possible in the realizable setting without the attacks. We further show that PAC learning under ‘no-mistake’ adversarial noise is not possible if the adversary can choose the (still limited to only a p fraction of) tampered examples that she substitutes with adversarially chosen ones. Our formal model for such ‘bounded-budget’ tampering attackers is inspired by the notions of adaptive corruption in cryptography. PubDate: 2020-07-01

Abstract: We study the complexity of fair division of indivisible goods and consider settings where agents can have nonzero utility for the empty bundle. This is a deviation from a common normalization assumption in the literature, and we show that this inconspicuous change can lead to an increase in complexity: In particular, while an allocation maximizing social welfare by the Nash product is known to be easy to detect in the normalized setting whenever there are as many agents as there are resources, without normalization it can no longer be found in polynomial time, unless P = NP. The same statement also holds for egalitarian social welfare. Moreover, we show that it is NP-complete to decide whether there is an allocation whose Nash product social welfare is above a certain threshold if the number of resources is a multiple of the number of agents. Finally, we consider elitist social welfare and prove that the increase in expressive power by allowing negative coefficients again yields NP-completeness. PubDate: 2020-07-01
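
To make the two welfare notions concrete, the following brute-force sketch enumerates every allocation of a tiny instance and scores it by Nash product or by egalitarian welfare. It is an illustration only, not an algorithm from the paper: it assumes additive utilities plus a per-agent constant `offsets[i]` standing in for the utility of the empty bundle, and its running time is exponential in the number of goods. All names are hypothetical.

```python
from itertools import product
from math import prod


def bundle_utilities(assignment, values, offsets):
    """Utility of each agent when item j goes to agent assignment[j].

    Assumption: utilities are additive, and offsets[i] is agent i's
    utility for the empty bundle (zero in the normalized setting).
    """
    utilities = list(offsets)
    for item, agent in enumerate(assignment):
        utilities[agent] += values[agent][item]
    return utilities


def best_allocation(values, offsets, welfare):
    """Exhaustively search all allocations for the highest welfare.

    `welfare` maps a list of utilities to a number, e.g. math.prod for
    the Nash product or min for egalitarian social welfare.
    """
    n_agents, n_items = len(values), len(values[0])
    best = None
    for assignment in product(range(n_agents), repeat=n_items):
        score = welfare(bundle_utilities(assignment, values, offsets))
        if best is None or score > best[0]:
            best = (score, assignment)
    return best
```

With `values = [[3, 1], [1, 3]]`, the call `best_allocation(values, [0, 0], prod)` evaluates the normalized case, and passing `[0, 5]` instead gives the second agent a nonzero utility for the empty bundle; replacing `prod` with `min` switches to egalitarian welfare.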

Abstract: In statistical learning theory, numerous works established non-asymptotic bounds assessing the generalization capacity of empirical risk minimizers under a large variety of complexity assumptions for the class of decision rules over which optimization is performed, by means of sharp control of uniform deviation of i.i.d. averages from their expectation, while fully ignoring the possible dependence across training data in general. It is the purpose of this paper to show that similar results can be obtained when statistical learning is based on a data sequence drawn from a (Harris positive) Markov chain X, through the running example of estimation of minimum volume sets (MV-sets) related to X’s stationary distribution, an unsupervised statistical learning approach to anomaly/novelty detection. Based on novel maximal deviation inequalities we establish, using the regenerative method, learning rate bounds that depend not only on the complexity of the class of candidate sets but also on the ergodicity rate of the chain X, expressed in terms of tail conditions for the length of the regenerative cycles. In particular, this approach, fully tailored to Markovian data, makes it possible to interpret the rate bound results obtained in frequentist terms, in contrast to alternative coupling techniques based on mixing conditions: the larger the expected number of cycles over a trajectory of finite length, the more accurate the MV-set estimates. Beyond the theoretical analysis, this phenomenon is supported by illustrative numerical experiments. PubDate: 2020-07-01

Abstract: In this work, we approach the issue of privacy in distributed constraint reasoning by studying how agents compromise solution quality for preserving privacy, using utility and game theory. We propose a utilitarian definition of privacy in the context of distributed constraint reasoning, detail its different implications, and present a model and solvers, as well as their properties. We then show how important steps in a distributed constraint optimization with privacy requirements can be modeled as a planning problem, and more specifically as a stochastic game. We present experiments validating the interest of our approach, according to several criteria. PubDate: 2020-07-01

Abstract: Nowadays, energy represents the most important resource; however, we need to face several rising energy-related issues, one main concern being how energy is consumed and, in particular, how we can steer consumers toward a specific behaviour. In this work, we present a model addressing energy allocation and payment. We start by explaining the first step of our work, a mechanism design approach for energy allocation among consumers. In more detail, we give a formal description of the energy model and of users’ consumption profiles. We aim to select the optimal consumption profile for every user, avoiding consumption peaks when the total required energy could exceed the energy production. The mechanism is able to drive users to shift energy consumption to different hours of the day. The next step concerns a payment estimation problem which involves a community of users and an energy distributor (or producer). Our aim is to compute payments for every user in the community according to the single user’s consumption, the community’s consumption and the available energy. By computing community-dependent energy bills, our model stimulates virtuous behaviour among users, so that everyone approaches the production threshold as closely as possible. Our payment function distributes incentives if the consumption is lower than the produced energy and penalties when the consumption exceeds the resource threshold, satisfying efficiency and fairness properties from the points of view of both the community (efficiency as an economic equilibrium among sellers and buyers) and the single user (fairness as an economic measure of good energy behaviour). PubDate: 2020-07-01

Abstract: In this paper we combine the theory of probability aggregation with results of machine learning theory concerning the optimality of predictions under expert advice. In probability aggregation theory several characterization results for linear aggregation exist. However, in linear aggregation weights are not fixed, but free parameters. We show how fixing such weights by success-based scores, a generalization of Brier scoring, allows for transferring the mentioned optimality results to the case of probability aggregation. PubDate: 2020-07-01
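
The idea of fixing linear aggregation weights by past success can be sketched in a few lines. The weighting rule below is an assumption made for illustration, not the paper's definition: each expert's score is taken to be one minus its mean Brier loss on past binary events, and weights are the normalized scores. All function names are hypothetical.

```python
def brier_loss(forecasts, outcomes):
    """Mean squared difference between forecasts and 0/1 outcomes."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(outcomes)


def success_weights(history, outcomes):
    """Turn each expert's past forecasts into a normalized weight.

    Assumption: the success score is 1 - Brier loss; if every expert
    scores zero, fall back to uniform weights.
    """
    scores = [1.0 - brier_loss(forecasts, outcomes) for forecasts in history]
    total = sum(scores)
    if total == 0:
        return [1.0 / len(scores)] * len(scores)
    return [score / total for score in scores]


def aggregate(weights, current_forecasts):
    """Linear aggregation: weighted average of the experts' probabilities."""
    return sum(w * p for w, p in zip(weights, current_forecasts))
```

For example, with `history = [[1.0, 0.0], [0.5, 0.5]]` and `outcomes = [1, 0]`, the first expert was perfect and the second uninformative, so `success_weights` favours the first and `aggregate` pulls the pooled probability toward that expert's current forecast.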

Abstract: We present LogAG, a weighted algebraic non-monotonic logic for reasoning with graded beliefs. LogAG is algebraic in that it is a language of only terms, some of which denote propositions and may be associated with ordered grades. The grades could be taken to represent a wide variety of phenomena including preference degrees, priority levels, trust ranks, and uncertainty measures. Reasoning in LogAG is non-monotonic and may give rise to contradictions. Belief revision is, hence, an integral part of reasoning and is guided by the grades. This yields a quite expressive language providing an interesting alternative to the currently existing approaches to non-monotonicity. We show how LogAG can be utilised for modelling resource-bounded reasoning; simulating inconclusive reasoning with circular, liar-like sentences; and reasoning about information arriving over a chain of sources each with a different degree of trust. While there certainly are accounts in the literature for each of these issues, we are not aware of any single framework that accounts for them all like LogAG does. We also show how LogAG captures a wide variety of non-monotonic logical formalisms. As such, LogAG is a unifying framework for non-monotonicity which is flexible enough to admit a wide array of potential uses. PubDate: 2020-06-20

Abstract: Word embedding models excel in measuring word similarity and completing analogies. Word embeddings based on different notions of context trade off strengths in one area for weaknesses in another. Linear bag-of-words contexts, such as in word2vec, can capture topical similarity better, while dependency-based word embeddings better encode functional similarity. By combining these two word embeddings using different metrics, we show how the best aspects of both approaches can be captured. We show state-of-the-art performance on standard word and relational similarity benchmarks. PubDate: 2020-06-01

Abstract: By their very design, many robot control programs are non-terminating. This paper describes a situation calculus approach to expressing and proving properties of non-terminating programs expressed in Golog, a high-level logic programming language for modeling and implementing dynamical systems. Because in this approach actions and programs are represented in classical (second-order) logic, it is natural to express and prove properties of Golog programs, including non-terminating ones, in the very same logic. This approach to program proofs has the advantage of logical uniformity and the availability of classical proof theory. PubDate: 2020-06-01

Abstract: Eighteenth and nineteenth century philosophers took an interest in humour and, in particular, humorous incongruities. Humour was not necessarily their main interest; however, observations on humour could support their more general philosophical theories. Spontaneous and unintentional humour such as anecdotes, witty remarks and absurd events were the styles of humour that they analysed and made part of their theories. Prepared humour such as verbal jokes was rarely included in their observations, likely dismissed as too vulgar and not requiring intellectual effort. Humour, as analysed by several eighteenth and nineteenth century philosophers, was seen as part of daily life or life simulated on stage. In the twentieth century, Freud emphasized a possible ‘relief’ function of ‘prepared’ humour such as jokes. Additionally, linguists began developing theories to analyse jokes. A joke has a particular structure that is constructed with the aim of achieving a humorous effect. This structure makes jokes suitable for linguistic analysis. In present-day humour research, jokes have become a main topic of research. This linguistically oriented joke research neglects many other forms of humour: spontaneous humour, non-verbal humour, physical humour, and many forms of unintentional humour that appear in real life. We want to survey and re-evaluate the contributions to humour research of these eighteenth, nineteenth and early twentieth century philosophers and clarify that their more general contributions to humour research have been neglected in favour of the very restricted form of prepared humour and linguistically expressed and analysed humour as it appears in jokes. We hope that the views expressed in this paper will help to steer humour research away from joke research and help to integrate humour in the design of human-computer interfaces and smart environments. That is, rather than considering only verbal jokes, we should aim at generating smart environments that understand, facilitate or create humour that goes beyond jokes. PubDate: 2020-06-01