Abstract: Zbigniew Gołębiewski, Abram Magner, Wojciech Szpankowski

We continue developing the information theory of structured data. In this article, we study models generating d-ary trees (d ≥ 2) and trees with unrestricted degree. We first compute the entropy which gives us the fundamental lower bound on compression of such trees. Then we present efficient compression algorithms based on arithmetic encoding that achieve the entropy within a constant number of bits. A naïve implementation of these algorithms has a prohibitive time complexity of O(nd) elementary arithmetic operations (each corresponding to a number f(n, d) of bit operations), but our efficient algorithms run in O(n2) of these operations, where n is the number of nodes. PubDate: Mon, 01 Oct 2018 00:00:00 GMT

Unaggregated data, in a streamed or distributed form, are prevalent and come from diverse sources such as interactions of users with web services and IP traffic. Data elements have keys (cookies, users, queries), and elements with different keys interleave. Analytics on such data typically utilizes statistics expressed as a sum over keys in a specified segment of a function f applied to the frequency (the total number of occurrences) of the key. In particular, Distinct is the number of active keys in the segment, Sum is the sum of their frequencies, and both are special cases of frequency cap statistics, which cap the frequency by a parameter T. PubDate: Mon, 24 Sep 2018 00:00:00 GMT

Abstract: Zhengyang Liu, Xi Chen, Rocco A. Servedio, Ying Sheng, Jinyu Xie

We study the problem of testing whether an unknown n-variable Boolean function is a k-junta in the distribution-free property testing model, where the distance between functions is measured with respect to an arbitrary and unknown probability distribution over {0,1}n. Our first main result is that distribution-free k-junta testing can be performed, with one-sided error, by an adaptive algorithm that uses Õ(k2)/ε queries (independent of n). Complementing this, our second main result is a lower bound showing that any non-adaptive distribution-free k-junta testing algorithm must make Ω(2k/3) queries even to test to accuracy ε = 1/3. PubDate: Mon, 24 Sep 2018 00:00:00 GMT

Abstract: Marcin Pilipczuk, Michał Pilipczuk, Piotr Sankowski, Erik Jan Van Leeuwen

We propose polynomial-time algorithms that sparsify planar and bounded-genus graphs while preserving optimal or near-optimal solutions to Steiner problems. Our main contribution is a polynomial-time algorithm that, given an unweighted undirected graph G embedded on a surface of genus g and a designated face f bounded by a simple cycle of length k, uncovers a set F ⊆ E(G) of size polynomial in g and k that contains an optimal Steiner tree for any set of terminals that is a subset of the vertices of f. We apply this general theorem to prove that: — Given an unweighted graph G embedded on a surface of genus g and a terminal set S⊆ V(G), one can in polynomial time find a set F ⊆ E(G) that contains an optimal Steiner tree T for S and that has size polynomial in g and E(T) . PubDate: Mon, 17 Sep 2018 00:00:00 GMT

Abstract: Patrizio Angelini, Giordano Da Lozzo, Giuseppe Di Battista, Valentino Di Donato, Philipp Kindermann, Günter Rote, Ignaz Rutter

Given a planar graph G and a partition of the neighbors of each vertex v in four sets v↗, v↖, v↙, and v↘, the problem Windrose Planarity asks to decide whether G admits a windrose-planar drawing, that is, a planar drawing in which (i) each neighbor u ∈ v↗v is above and to the right of v, (ii) each neighbor u ∈ v↖ is above and to the left of v, (iii) each neighbor u ∈ v↙ is below and to the left of v, (iv) each neighbor u ∈ v↘ is below and to the right of v, and (v) edges are represented by curves that are monotone with respect to each axis. PubDate: Mon, 17 Sep 2018 00:00:00 GMT

Abstract: Aris Anagnostopoulos, Fabrizio Grandoni, Stefano Leonardi, Andreas Wiese

We study the problem of unsplittable flow on a path (UFP), which arises naturally in many applications such as bandwidth allocation, job scheduling, and caching. Here we are given a path with nonnegative edge capacities and a set of tasks, which are characterized by a subpath, a demand, and a profit. The goal is to find the most profitable subset of tasks whose total demand does not violate the edge capacities. Not surprisingly, this problem has received a lot of attention in the research community. If the demand of each task is at most a small-enough fraction Δ of the capacity along its subpath (Δ-small tasks), then it has been known for a long time [Chekuri et al., ICALP 2003] how to compute a solution of value arbitrarily close to the optimum via LP rounding. PubDate: Mon, 17 Sep 2018 00:00:00 GMT

We study the online submodular maximization problem with free disposal under a matroid constraint. Elements from some ground set arrive one by one in rounds, and the algorithm maintains a feasible set that is independent in the underlying matroid. In each round when a new element arrives, the algorithm may accept the new element into its feasible set and possibly remove elements from it, provided that the resulting set is still independent. The goal is to maximize the value of the final feasible set under some monotone submodular function, to which the algorithm has oracle access. For k-uniform matroids, we give a deterministic algorithm with competitive ratio at least 0.2959, and the ratio approaches 1/α∞≈ 0.3178 as k approaches infinity, improving the previous best ratio of 0.25 by Chakrabarti and Kale (IPCO 2014), Buchbinder et al. PubDate: Mon, 17 Sep 2018 00:00:00 GMT

Many randomized algorithms can be derandomized efficiently using either the method of conditional expectations or probability spaces with low (almost-) independence. A series of papers, beginning with Luby (1993) and continuing with Berger and Rompel (1991) and Chari et al. (2000), showed that these techniques can be combined to give deterministic parallel algorithms for combinatorial optimization problems involving sums of w-juntas. We improve these algorithms through derandomized variable partitioning, reducing the processor complexity to essentially independent of w and time complexity to linear in w. As a key subroutine, we give a new algorithm to generate a probability space which can fool a given set of neighborhoods. PubDate: Tue, 28 Aug 2018 00:00:00 GMT

Abstract: Hossein Esfandiari, Mohammadtaghi Hajiaghayi, Vahid Liaghat, Morteza Monemizadeh, Krzysztof Onak

We consider the problem of estimating the size of a maximum matching when the edges are revealed in a streaming fashion. When the input graph is planar, we present a simple and elegant streaming algorithm that, with high probability, estimates the size of a maximum matching within a constant factor using Õ(n2/3) space, where n is the number of vertices. The approach generalizes to the family of graphs that have bounded arboricity, which include graphs with an excluded constant-size minor. To the best of our knowledge, this is the first result for estimating the size of a maximum matching in the adversarial-order streaming model (as opposed to the random-order streaming model) in o(n) space. PubDate: Tue, 28 Aug 2018 00:00:00 GMT

We provide evidence that computing the maximum flow value between every pair of nodes in a directed graph on n nodes, m edges, and capacities in the range [1‥n], which we call the All-Pairs Max-Flow problem, cannot be solved in time that is significantly faster (i.e., by a polynomial factor) than O(n3) even for sparse graphs, namely m = O(n); thus for general m, it cannot be solved significantly faster than O(n2m). Since a single maximum st-flow can be solved in time Õ(m√n) [Lee and Sidford, FOCS 2014], we conclude that the all-pairs version might require time equivalent to Ω˜(n3/2) computations of maximum st-flow, which strongly separates the directed case from the undirected one. PubDate: Tue, 21 Aug 2018 00:00:00 GMT

We introduce the dependent doors problem as an abstraction for situations in which one must perform a sequence of dependent decisions, without receiving feedback information on the effectiveness of previously made actions. Informally, the problem considers a set of d doors that are initially closed, and the aim is to open all of them as fast as possible. To open a door, the algorithm knocks on it, and it might open or not according to some probability distribution. This distribution may depend on which other doors are currently open, as well as on which other doors were open during each of the previous knocks on that door. PubDate: Tue, 21 Aug 2018 00:00:00 GMT

A filtration over a simplicial complex K is an ordering of the simplices of K such that all prefixes in the ordering are subcomplexes of K. Filtrations are at the core of Persistent Homology, a major tool in Topological Data Analysis. To represent the filtration of a simplicial complex, the entire filtration can be appended to any data structure that explicitly stores all the simplices of the complex such as the Hasse diagram or the recently introduced Simplex Tree [Algorithmica’14]. However, with the popularity of various computational methods that need to handle simplicial complexes, and with the rapidly increasing size of the complexes, the task of finding a compact data structure that can still support efficient queries is of great interest. PubDate: Tue, 21 Aug 2018 00:00:00 GMT

Given an n-vertex undirected graph G = (V,E) and positive edge weights {we}e∈E, a linear arrangement is a permutation π : V → {1, 2, …, n}. The value of the arrangement is val(G, π) := 1/n∑ e ={u, v} ∈ E we|π(u) − π (v)|. In the minimum linear arrangement problem, the goal is to find a linear arrangement π * that achieves val(G, π*) = MLA(G) := min π val(G, π). In this article, we show that for any ε > 0 and positive integer r, there is an nO(r/&epsis;)-time randomized algorithm that, given a graph G, returns a linear arrangement π, such that val(G, π) ≤ (1 + 2/(1 − ϵ)λr(L)) MLA(G) + O(√log n/n ∑ e ∈ E we) with high probability, where L is the normalized Laplacian of G and λr(L) is the rth smallest eigenvalue of L. PubDate: Tue, 21 Aug 2018 00:00:00 GMT

Abstract: Pankaj K. Agarwal, Kyle Fox, Oren Salzman

We study a path-planning problem amid a set O of obstacles in R2, in which we wish to compute a short path between two points while also maintaining a high clearance from O; the clearance of a point is its distance from a nearest obstacle in O. Specifically, the problem asks for a path minimizing the reciprocal of the clearance integrated over the length of the path. We present the first polynomial-time approximation scheme for this problem. Let n be the total number of obstacle vertices and let ϵ ∈ (0, 1]. PubDate: Tue, 21 Aug 2018 00:00:00 GMT

Dynamic Time Warping (DTW) and Geometric Edit Distance (GED) are basic similarity measures between curves or general temporal sequences (e.g., time series) that are represented as sequences of points in some metric space (X, dist). The DTW and GED measures are massively used in various fields of computer science and computational biology. Consequently, the tasks of computing these measures are among the core problems in P. Despite extensive efforts to find more efficient algorithms, the best-known algorithms for computing the DTW or GED between two sequences of points in X = Rd are long-standing dynamic programming algorithms that require quadratic runtime, even for the one-dimensional case d = 1, which is perhaps one of the most used in practice. PubDate: Tue, 21 Aug 2018 00:00:00 GMT

We consider a natural generalization of the classical multiple knapsack problem in which instead of packing single items we are packing groups of items. In this problem, we have multiple knapsacks and a set of items partitioned into groups. Each item has an individual weight, while the profit is associated with groups rather than items. The profit of a group can be attained if and only if every item of this group is packed. Such a general model finds applications in various practical problems, e.g., delivering bundles of goods. The tractability of this problem relies heavily on how large a group could be. PubDate: Tue, 21 Aug 2018 00:00:00 GMT

Abstract: Sampath Kannan, Claire Mathieu, Hang Zhou

How efficiently can we find an unknown graph using distance or shortest path queries between its vertices' We assume that the unknown graph G is connected, unweighted, and has bounded degree. In the reconstruction problem, the goal is to find the graph G. In the verification problem, we are given a hypothetical graph Ĝ and want to check whether G is equal to Ĝ. We provide a randomized algorithm for reconstruction using Õ(n&frac;32) distance queries, based on Voronoi cell decomposition. Next, we analyze natural greedy algorithms for reconstruction using a shortest path oracle and also for verification using either oracle, and show that their query complexity is n1+o(1). PubDate: Thu, 09 Aug 2018 00:00:00 GMT

The SINR (Signal to Interference plus Noise Ratio) model for the quality of wireless connections has been the subject of extensive recent study. It attempts to predict whether a particular transmitter is heard at a specific location, in a setting consisting of n simultaneous transmitters and background noise. The SINR model gives rise to a natural geometric object, the SINR diagram, which partitions the space into n regions where each of the transmitters can be heard and the remaining space where no transmitter can be heard. Efficient point location in the SINR diagram, i.e., being able to build a data structure that facilitates determining, for a query point, whether any transmitter is heard there, and if so, which one, has been recently investigated in several articles. PubDate: Thu, 09 Aug 2018 00:00:00 GMT

The Swap-Insert Correction distance from a string S of length n to another string L of length m ≥ n on the alphabet [1‥σ] is the minimum number of insertions, and swaps of pairs of adjacent symbols, converting S into L. Contrarily to other correction distances, computing it is NP-Hard in the size σ of the alphabet. We describe an algorithm computing this distance in time within O(σ 2nmgσ−1), where for each α ∈ [1‥σ] there are nα occurrences of α in S, mα occurrences of α in L, and where g = max α∈[1‥σ] min{nα, mα − nα} is a new parameter of the analysis, measuring one aspect of the difficulty of the instance. PubDate: Thu, 09 Aug 2018 00:00:00 GMT