Keywords:Information Theory Abstract: This paper investigates the fundamental limits for compressing random graphs. It also discusses the compression of data on graphs. The graphs are assumed to have labelled vertices. The most basic example is the Erdős–Rényi model, which corresponds to the well-understood case of compressing i.i.d. bits. This paper investigates inhomogeneous random graphs that have clusters, or equivalently, block models. These correspond to mixtures of Erdős–Rényi models for which basic tools for i.i.d.-like sources may not apply, capturing a key feature of general network models. It is shown how the fundamental limit of lossless compression for such models takes different forms as the graph gets sparser. The paper also introduces connections between compression and clustering, how each field can impact the other, and how clustering can help with the compression of data on graphs.

Keywords:Computational Models and Representations Abstract: We consider the sparse stochastic block model in the case where the degrees are uninformative. The case where the two communities have approximately the same size has been extensively studied, and we concentrate here on the community detection problem in the case of unbalanced communities. In this setting, spectral algorithms based on the non-backtracking matrix are known to solve the community detection problem (i.e., do strictly better than a random guess) when the signal is sufficiently large, namely above the so-called Kesten–Stigum threshold. In this regime and when the average degree tends to infinity, we show that if the community of a vanishing fraction of the vertices is revealed, then a local algorithm (belief propagation) is optimal down to the Kesten–Stigum threshold, and we quantify its performance explicitly. Below the Kesten–Stigum threshold, we show that, in the large-degree limit, there is a second threshold, called the spinodal curve, below which the community detection problem is not solvable. The spinodal curve provides a lower bound on the threshold for solvability which is shown to be tight for not too unbalanced communities. Our proof relies on a careful analysis of the associated reconstruction problem on trees, which might be of independent interest. In particular, we show that the spinodal curve corresponds to the reconstruction threshold on the tree.

Keywords:Signal Acquisition, Coding, and Retrieval, Computational Models and Representations Abstract: The theoretical basis for conventional acquisition of bandlimited signals typically relies on uniform time sampling and assumes infinite-precision amplitude values. In this paper, we explore signal representation and recovery based on uniform amplitude sampling with assumed infinite precision timing information. The approach is based on applying a one-level level-crossing detector to the result of adding a sawtooth waveform to the source signal. The source signal is then represented by the level-crossing times. For analysis purposes, the output of the level-crossing detector is interpreted as the result of applying a multi-level level crossing detector to the monotonic function consisting of the sum of the source signal and an appropriate linear ramp. This monotonic function is then uniformly sampled in amplitude with the source signal again represented by the level crossing times of the monotonic function. We refer to this technique as amplitude sampling. The approach can equivalently be viewed as nonuniform time sampling of the original source signal or uniform amplitude sampling of the monotonic function to which it is transformed. In this paper we explore this approach and, in particular, present duality and frequency-domain properties for the functions involved in the transformation and develop and compare two iterative algorithms for recovery of the source signal.
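The ramp-plus-level-crossing mechanism described above can be sketched numerically. All parameters below (the test signal, ramp slope, and level spacing) are our own toy assumptions, not the paper's; the point is only that adding a sufficiently steep ramp makes the sum monotonic, so uniform amplitude levels can be inverted for their crossing times.

```python
import numpy as np

# Toy sketch of amplitude sampling: add a linear ramp to a bounded source
# signal so the sum is monotonic, uniformly sample the sum in *amplitude*,
# and keep only the level-crossing times as the representation.

t = np.linspace(0.0, 1.0, 4001)
s = 0.3 * np.sin(2 * np.pi * 3 * t)      # source signal, |s'(t)| <= 0.3 * 6*pi
c = 8.0                                  # ramp slope > max |s'(t)|, so f is monotonic
f = s + c * t                            # monotonic "signal + ramp" function

levels = np.arange(f[0], f[-1], 0.25)    # uniform amplitude grid (step 0.25)
t_cross = np.interp(levels, f, t)        # f is monotonic, so invert by interpolation

# The crossing times represent the signal: s(t_i) = L_i - c * t_i exactly.
s_rec = levels - c * t_cross
err = np.max(np.abs(s_rec - np.interp(t_cross, t, s)))
```

On this dense grid the recovered samples agree with the source signal up to interpolation error, illustrating the equivalence between uniform amplitude sampling of the monotonic function and nonuniform time sampling of the source.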

Keywords:Biological Information Systems Abstract: The P300 speller is a brain-computer interface that enables people with severe neuromuscular disorders to communicate. It is based on eliciting and detecting event-related potentials (ERPs) in electroencephalography (EEG) measurements, in response to rare target events. One of the challenges to fast and reliable communication is the fact that the P300 ERP has a refractory period that induces temporal dependence in the user's EEG responses. This negatively affects the performance of the speller. The contribution of this paper is to provide a model for the P300 speller as a communication process with memory to account for refractory effects. Using this model, we design codebooks that maximize the mutual information rate between the user's desired characters and the measured EEG responses. We show simulation results that compare our codebook with other codebooks described in the literature.

Keywords:Biological Information Systems, Data Storage, Data Analytics Abstract: This paper proposes a new prediction process to explain and predict the popularity evolution of YouTube videos. We exploit a prior study on the classification of YouTube videos in order to predict the evolution of videos' view counts. This classification allows us to identify important factors of the observed popularity dynamics; in particular, we use it as a filtering method to isolate the factors responsible for the popularity evolution. Results from extensive experiments show that the proposed prediction process reduces the average prediction error compared to a state-of-the-art baseline model. We also evaluate the impact of adding popularity criteria to the classification.

Keywords:Data Analytics, Signal Acquisition, Coding, and Retrieval Abstract: We investigate the problem of extracting a sparse data set via histogram queries. A data set is a collection of items, and each item carries a piece of data. A data set is called sparse if there are only a small number of items carrying data of interest. We show that the fundamental limit on the query complexity is Θ((k/log k) log(n/k)), n being the size of the data set and k < n being the sparsity level. A counting argument is used to establish the converse part, that is, the lower bound on query complexity. For the achievability part, we analyze a randomized querying method, where in each query, the items to be included in the queried subset are uniformly randomly selected. It is shown that with high probability, the randomly constructed querying method exactly recovers the desired data. Furthermore, we propose an adaptive deterministic algorithm to extract the sparse data set with query complexity Θ((k/log k) log(n/k) · log log k), achieving the fundamental limit to within a log log k factor.
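As a toy illustration of the query model (our own construction, not the paper's recovery algorithm): a histogram query on a random subset reveals the multiset of values carried by the queried items, but not which item carries which value, which is what makes recovery nontrivial.

```python
import math
import random
from collections import Counter

# A histogram query returns value counts over the queried subset only.
def histogram_query(data, subset):
    return Counter(data[i] for i in subset)

n, k = 1024, 16
data = [0] * n                                 # 0 = uninteresting background value
support = random.sample(range(n), k)
for j, i in enumerate(support):
    data[i] = j + 1                            # k items carry distinct data of interest

# One randomized query over a uniformly random half of the items.
q = histogram_query(data, random.sample(range(n), n // 2))

# Order of the query-complexity limit stated in the abstract.
limit = (k / math.log(k)) * math.log(n / k)
```

Each query localizes the interesting values only up to the queried subset; the abstract's result says roughly (k/log k) log(n/k) such queries suffice (and are necessary) for exact recovery.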

Keywords:Network Information Theory, Coding Theory, Information Theory Abstract: Given two identical linear codes C with rate R over F_q of length n, we independently pick one codeword from each codebook uniformly at random. A sumset is formed by adding these two codewords entry-wise as integer vectors, and a sumset is called typical if the sum falls inside this set with high probability. In this paper we show that the asymptotic size of such typical sumsets for most codes is min{2^{2nR}, 2^{n(R+D)}}, where D depends solely on the alphabet size q. More generally, we completely characterize the asymptotic size of typical sumsets of two nested linear codes C_1, C_2 with different rates. We also provide two applications of the results, one on a computation problem over the general two-user multiple-access channel, and one on a communication problem over an additive two-user multiple-access channel.

Keywords:Information Theory, Network Information Theory, Data Storage Abstract: Information inequalities can be used to bound the fundamental limits of communication systems and data storage systems. Information inequalities, particularly Shannon-type inequalities, and the problem-specific constraints are usually either linear equalities or inequalities of joint entropies, and thus outer bounding the fundamental limit can be viewed and solved as a linear program (LP). However, for many practical engineering problems, the resultant LP is very large. It was shown previously that symmetry in these problems can be used to reduce the scale of the LP; however, the precise amount of reduction was not well understood. In this work, we provide a generic method to pinpoint this reduction. In particular, three problems are studied: extremal pairwise cyclic entropy inequalities, the regenerating code problem, and the caching problem. By viewing the symmetry as a permutation group induced on a certain set, the Pólya counting theorem can be applied, which however requires identifying the cycle index of the induced permutation group.

Keywords:Information Theory, Network Information Theory Abstract: In this work, we study two instances of the Heegard-Berger problem. In both settings, an encoder observes a pair of memoryless, arbitrarily correlated, sources (S^n_1,S^n_2) and communicates with two receivers over an error-free rate-limited link of capacity R. Both receivers reproduce the source component S^n_2 losslessly, and Receiver 1 also reproduces the source component S^n_1 lossily, to within some prescribed fidelity level D_1. Also, Receiver 1 and Receiver 2 are equipped respectively with memoryless side information sequences Y^n_1 and Y^n_2. The side information sequences are arbitrarily correlated among themselves, and with the source pair (S^n_1,S^n_2). In addition, in the first setting, which is of successive refinement type, the encoder is also connected to Receiver 1 through an additional error-free rate-limited link of capacity R_1. In the second setting, which is of reconstruction-scalable coding type, the individual link, of capacity R_2, is provided to Receiver 2. We establish single-letter characterizations of the rate-distortion regions of both models in the discrete memoryless case.

Keywords:Information Theory, Network Information Theory Abstract: In this work we consider the successive refinement problem with side information, where decoders cooperate via a cribbing link. Rate-distortion regions are characterized for non-causal side information and for causal side information under appropriate constraints. An example is provided to illustrate the rate-distortion region for non-causal side information.

Keywords:Information Theory, Coding Theory, Coding Techniques and Applications Abstract: We study the problem of generating an approximately i.i.d. string at the output of a discrete memoryless channel using a limited amount of randomness at its input in the presence of causal noiseless feedback. Feedback does not decrease the channel resolution, the minimum entropy rate required to achieve an accurate approximation of an i.i.d. output string. However, we show that, at least over a binary symmetric channel, a significantly larger resolvability exponent (the exponential decay rate of the divergence between the output distribution and the product measure), compared to the best known achievable resolvability exponent in a system without feedback, is possible. We show that by employing a variable-length resolvability scheme and using an average of R coin-flips per channel use, the average divergence between the distribution of the output sequence and the product measure decays exponentially fast in the average length of the output sequence, with an exponent equal to [R - I(U;V)]^+, where I(U;V) is the mutual information developed across the channel.
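For a concrete instance of the exponent [R - I(U;V)]^+ (with parameters we choose for illustration), I(U;V) can be computed in closed form for a Bernoulli(p) input U driving a binary symmetric channel with crossover probability delta:

```python
import math

def h2(x):
    """Binary entropy in bits."""
    return 0.0 if x in (0.0, 1.0) else -x * math.log2(x) - (1 - x) * math.log2(1 - x)

def exponent(R, p, delta):
    """Resolvability exponent [R - I(U;V)]^+ for Bern(p) input over BSC(delta)."""
    q = p * (1 - delta) + (1 - p) * delta      # P(V = 1)
    I = h2(q) - h2(delta)                      # I(U;V) = H(V) - H(V|U)
    return max(R - I, 0.0)

# With a uniform input and delta = 0.11, I(U;V) is about 0.5 bit, so a coin-flip
# rate R = 0.8 leaves a strictly positive exponent while R = 0.3 gives zero.
```

The exponent is positive only when the randomness rate R exceeds the mutual information developed across the channel, matching the form stated in the abstract.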

Keywords:Wireless Communication Systems, Sensor Networks in Communications, Information Theory and Statistics Abstract: We investigate the secure connectivity of wireless sensor networks under a heterogeneous random key predistribution scheme. This scheme has recently been introduced as a generalization of the Eschenauer and Gligor scheme and accounts for the cases when the network comprises sensor nodes with varying levels of resources and/or connectivity requirements, e.g., regular nodes vs. cluster heads. Our paper is the first to consider the heterogeneous random key predistribution scheme under a heterogeneous on/off channel model. In particular, we study a random graph formed by the intersection of an inhomogeneous random key graph with an inhomogeneous Erdős–Rényi graph. The former graph is naturally induced by the heterogeneous random key predistribution scheme, while the latter constitutes a heterogeneous on/off channel model, wherein the wireless channel between a class-i node and a class-j node is on with probability α_{ij}. We present conditions (in the form of zero-one laws) on how to scale the parameters of the intersection model so that it has no isolated node with high probability as the number of nodes gets large. We also present numerical results to support these zero-one laws in the finite-node regime.

Keywords:Wireless Communication Systems, Performance Analysis Abstract: In this paper, we investigate the content delivery problem in the context of multi-antenna (MIMO) wireless networks. The single-antenna users, equipped with some cache memory, receive requested contents from the server through a multi-antenna base station. We propose a scheme that carefully combines the multicast and unicast capabilities offered by MIMO, as a function of the quality of channel state information at the transmitter side. Thereby we reveal the complementary roles of coded caching and MIMO transmission for content delivery.

Keywords:Wireless Communication Systems Abstract: This paper proposes a new mathematical framework for analyzing the outage performance of a periodic broadcast service in vehicular networks. Using this framework, we are able to calculate higher-order statistics of various random variables associated with the network, including message delay and the number of received packets from neighborhood nodes. Characterizing such quantities beyond their first-order statistics is important in resilient and safety-critical design of networked control systems. We also provide numerical results to compare the outage performance of IEEE 802.11 medium access control with a simple ALOHA-like protocol.

Keywords:Wireless Communication Systems, Multiuser Detection and Estimation Abstract: In this paper, we consider the problem of user scheduling and pilot assignment in TDD multicell multiuser Massive MIMO systems. While in TDD systems the channel is acquired using uplink pilots, we propose a scheme that utilizes additional downlink probing in order to improve the spectral efficiency. The idea is to dynamically assign mobile users to different clusters based on the statistics of their channels through the use of downlink reference beams. This forces the interference to be centered in semi-orthogonal subspaces without the need for covariance matrix estimation or significant feedback, thereby enabling a reduction of the pilot contamination effect. The scheduled users in each cluster employ orthogonal training sequences with a pilot reuse factor of 1 among the clusters. We then propose a graphical framework for pilot assignment. We show that this problem can be modeled as a max-cut problem, and we provide an approximation algorithm that optimizes the pilot allocation.
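To illustrate the combinatorial core of the max-cut formulation (the graph and weights below are made up, and this is the textbook local-search 1/2-approximation, not the paper's algorithm): partition the vertices into two groups and flip any vertex whose incident "same-side" weight exceeds its "cut" weight.

```python
# Edge weights of a small interference graph (hypothetical values).
w = {(0, 1): 3.0, (0, 2): 1.0, (1, 3): 2.0, (2, 3): 4.0, (1, 2): 1.5}
nodes = range(4)

def cut_value(side):
    return sum(wt for (a, b), wt in w.items() if side[a] != side[b])

side = {v: 0 for v in nodes}                   # start with everyone in one group
improved = True
while improved:                                # flip any vertex that increases the cut
    improved = False
    for v in nodes:
        same = sum(wt for (a, b), wt in w.items() if v in (a, b) and side[a] == side[b])
        diff = sum(wt for (a, b), wt in w.items() if v in (a, b) and side[a] != side[b])
        if same > diff:                        # flipping v moves 'same' weight into the cut
            side[v] ^= 1
            improved = True

cut = cut_value(side)
```

At a local optimum every vertex has at least half its incident weight across the cut, so the cut is guaranteed to be at least half the total edge weight.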

Keywords:Adaptive Control and Automation, Statistical Signal Processing, Detection and Estimation Abstract: This paper considers a multiple stopping problem on an infinite-horizon hidden Markov model sample path, where a reward, dependent on the underlying state, is associated with each stop. The decision maker stops L times to maximize the total expected revenue. The aim is to determine the structure of the optimal multiple stopping policy.

The formulation generalizes the classical (single) stopping-time partially observed Markov decision process (POMDP) problem. Even though the stopping set (in terms of the Bayesian beliefs) is not necessarily convex, we show that it is a connected set. The structural results are illustrated using a numerical example.

Keywords:Detection and Estimation, Learning and Inference, Statistical Signal Processing Abstract: Quickest sequential sampling for identifying a subset of nodes in a large network that form a known correlation structure is considered. The layout of the correlated nodes in the network is known, and it is assumed that the remaining nodes in the network generate independent and identically distributed random variables. Despite its widespread applications, the problem of identifying subnetworks with such desired correlation structures is often investigated under the fixed-sample-size setting, where the data acquisition process and the inferential mechanisms are decoupled. Motivated by the significant advantages of sequential detection for agile inference, this paper analyzes this problem under a fully sequential setting. Specifically, it formalizes a framework that incorporates the intertwined processes of data-adaptive information gathering and decision making. Additionally, through a constructive proof, it also offers an optimal sequential data-gathering process as well as the attendant decision rules for identifying a structured subnetwork of interest in a given network.

Keywords:Detection and Estimation, Statistical Signal Processing, Learning and Inference Abstract: Detecting the emergence of a low-rank signal from high-dimensional data is an important problem arising from many applications such as camera surveillance and swarm monitoring using sensors. We consider a procedure based on the largest eigenvalue of the sample covariance matrix over a sliding window to detect the change. To achieve dimensionality reduction, we present a sketching-based approach for rank change detection using low-dimensional linear sketches of the original high-dimensional observations. The premise is that when the sketching matrix is a random Gaussian matrix and the dimension of the sketching vector is sufficiently large, the rank of the sample covariance matrix for these sketches equals the rank of the original sample covariance matrix with high probability. Hence, we may be able to detect the low-rank change using sample covariance matrices of the sketches without having to recover the original covariance matrix. We characterize the performance of the largest-eigenvalue statistic in terms of the false-alarm rate and the expected detection delay, and present an efficient online implementation via subspace tracking.
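The premise above can be checked on synthetic data (dimensions, spike strength, and window size below are our own toy choices): after a rank-one spike appears, the largest eigenvalue of the sample covariance of the *sketches* jumps, without ever forming the full covariance matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
p, m, w = 200, 60, 80                          # ambient dim, sketch dim, window size

A = rng.standard_normal((m, p)) / np.sqrt(p)   # Gaussian sketching matrix, A @ A.T ~ I_m

def top_eig(X):
    """Largest eigenvalue of the sample covariance of the rows of X."""
    C = X.T @ X / X.shape[0]
    return np.linalg.eigvalsh(C)[-1]           # eigvalsh returns ascending order

# Pre-change window: isotropic noise. Post-change: a rank-one spike is added.
noise = rng.standard_normal((w, p))
u = rng.standard_normal(p)
u /= np.linalg.norm(u)
spiked = rng.standard_normal((w, p)) + 6.0 * rng.standard_normal((w, 1)) * u

stat_pre = top_eig(noise @ A.T)                # statistics computed on sketches only
stat_post = top_eig(spiked @ A.T)
```

With these sizes the post-change statistic is several times the pre-change one, so a simple threshold on the sketched eigenvalue detects the rank change.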

Keywords:Adaptive Control and Automation, Complex Networked Systems, Performance Analysis Abstract: Traffic intersection signal control usually employs fixed-time, or periodic, control policies. Yet, their optimality is unknown. Recent work has proposed queue-dependent max-weight scheduling policies that are optimal, but these are not regarded as practical and are unlikely to be adopted. This paper considers the question of whether there exist optimal fixed-time signal control policies. We consider a fluid model for arrivals and departures. For simplicity, only one queue is actuated at a time, though this can be generalized. Furthermore, switching the actuation from one queue to the next incurs an all-red delay of ∆. We show that there exists a fixed-time signal control policy that is near-optimal. The proof is non-constructive. Nevertheless, it answers an important question and suggests the existence of approximately optimal fixed-time policies that would be practical as well.

Keywords:Information Theory, Machine Learning and Learning Theory Abstract: The feature-selection problem is formulated from an information-theoretic perspective. We show that the problem can be efficiently solved by a recently proposed info-clustering paradigm. This reveals a fundamental duality between feature selection and data clustering, which is a consequence of a more general duality between the principal partition and the principal lattice of partitions in combinatorial optimization.

Keywords:Information Theory Abstract: Consider a memoryless relay channel, where the channel from the relay to the destination is an isolated bit pipe of capacity C_0. Let C(C_0) denote the capacity of this channel as a function of C_0. What is the critical value of C_0 such that C(C_0) first equals C(∞)? This is a long-standing open problem posed by Cover and named “The Capacity of the Relay Channel,” in Open Problems in Communication and Computation, Springer-Verlag, 1987. In this paper, we answer this question in the Gaussian case and show that C(C_0) cannot equal C(∞) unless C_0 = ∞, regardless of the SNR of the Gaussian channels, while the cutset bound would suggest that C(∞) can be achieved at finite C_0. Our approach is geometric and relies on a strengthening of the isoperimetric inequality on the sphere by using Riesz’s rearrangement inequality.

Keywords:Learning and Inference, Information Theory, Data Analytics Abstract: Consider a network comprised of many variables, which interact with each other following a causal stochastic dynamical system model. Given time-series measurements of these variables, an important task is the inference of the structure of the causal dynamical system. Recently, directed information has been proposed as a generalization of Granger causality to accurately infer the structure of the network. We observe a curious phenomenon: in the limit that the relationships become purely deterministic, this measure loses power completely. In this paper, we study this phenomenon, explore its connection to Takens' delay embedding theorem, propose a remedy called restricted directed information (RDI), and finally demonstrate its efficacy in simulations. In particular, we show that RDI recovers the graph correctly in all instances, deterministic or stochastic, where there is enough information to recover the graph uniquely.

Keywords:Information Theory, Coding Techniques and Applications, Data Analytics Abstract: In this paper, we first review the Coded Distributed Computing (CDC) framework, recently proposed to significantly slash the data shuffling load of distributed computing via coding, and then discuss the extension of the CDC techniques to cope with two major challenges in general distributed computing problems, namely the straggling servers and multistage computations. When faced with straggling servers in a distributed computing cluster, we describe a unified coding scheme that superimposes CDC with the Maximum-Distance-Separable (MDS) coding on computation tasks, which allows a flexible tradeoff between computation latency and communication load. On the other hand, for a general multistage computation task expressed as a directed acyclic graph (DAG), we propose a coded framework that given the load of computation on each vertex of the DAG, applies the generalized CDC scheme individually on each vertex to minimize the communication load.
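The straggler-mitigation idea can be sketched in a few lines (toy sizes and a Vandermonde generator of our own choosing, not the paper's unified CDC+MDS scheme): encode the matrix rows with an MDS code so that the results from any k of the n workers suffice to recover the product.

```python
import numpy as np

k, n, d = 3, 5, 4                              # recovery threshold, workers, data dim
rng = np.random.default_rng(2)
A = rng.standard_normal((k, d))                # data matrix to be multiplied by x
x = rng.standard_normal(d)

# Vandermonde generator with distinct nodes 1..n: any k rows are invertible,
# which is exactly the MDS property needed for straggler tolerance.
G = np.vander(np.arange(1, n + 1), k, increasing=True).astype(float)

tasks = G @ A                                  # encoded row i is sent to worker i
results = tasks @ x                            # each worker computes its own share

survivors = [0, 2, 4]                          # two stragglers never respond
y = np.linalg.solve(G[survivors], results[survivors])   # decode A @ x from any k results
```

Dropping any n − k workers leaves a square invertible system, so computation latency is bounded by the k-th fastest worker at the cost of redundant work, the latency/load tradeoff mentioned in the abstract.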

Keywords:Complex Networked Systems, Reliability, Security and Trust Abstract: We propose an interdependent random geometric graph (RGG) model for interdependent networks. Based on this model, we study the robustness of two interdependent spatially embedded networks where interdependence exists between geographically nearby nodes in the two networks. We study the emergence of the mutual giant component in two interdependent RGGs as node densities increase, and define the percolation threshold as a pair of node densities above which the mutual giant component first appears. In contrast to the case for a single RGG, where the percolation threshold is a unique scalar under a given connection distance, for two interdependent RGGs, multiple pairs of percolation thresholds may exist, given that a smaller node density in one RGG may increase the minimum node density in the other RGG in order for a mutual giant component to exist. We derive analytical upper bounds on the percolation thresholds of two interdependent RGGs by discretization, and obtain 99% confidence intervals for the percolation thresholds by simulation. Based on these results, we derive conditions for the interdependent RGGs to be robust under random failures and geographical attacks.
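The building block of the interdependent model is a single random geometric graph; a minimal simulation (our own, with an assumed mean degree well above the continuum percolation threshold) shows a giant component emerging at sufficient node density:

```python
import math
import random
from collections import Counter

rng = random.Random(0)
n = 1500
r = math.sqrt(10 / (math.pi * n))              # mean degree ~ n * pi * r^2 = 10
pts = [(rng.random(), rng.random()) for _ in range(n)]

# Union-find over nodes connected within distance r.
parent = list(range(n))
def find(i):
    while parent[i] != i:
        parent[i] = parent[parent[i]]          # path compression
        i = parent[i]
    return i

for i in range(n):
    for j in range(i + 1, n):
        dx, dy = pts[i][0] - pts[j][0], pts[i][1] - pts[j][1]
        if dx * dx + dy * dy <= r * r:
            parent[find(i)] = find(j)

giant = max(Counter(find(i) for i in range(n)).values())
frac = giant / n                               # fraction in the largest component
```

Well above the threshold, `frac` is close to 1; the abstract's interdependent setting couples two such layers, so the mutual giant component depends on a *pair* of densities rather than a single scalar.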

Keywords:Complex Networked Systems, Network Games and Algorithms, Performance Analysis Abstract: The advent of social media has generated tremendous interest in ways that organizations or companies use social networks to maximize information or product adoption by users. In this paper, we are interested in finding powerful initial spreaders of information. We utilize a ubiquitous feature of social networks, namely the existence of communities (i.e., tight-knit groups of nodes), and design a community-based distributed algorithm to find initial spreaders that maximize information spreading. In addition to the rigorous analysis, we also demonstrate our result with experiments over real network topologies.

Keywords:Complex Networked Systems Abstract: Networked control systems (NCS) have attracted many researchers' attention in the past few years due to their flexibility and wide range of industrial applications. In this work, we address the controller synthesis problem for NCS with sophisticated closed-loop objectives, e.g. temporal logic requirements. Specifically, we consider controller synthesis schemes based on symbolic models, also known as finite abstractions. We present a methodology to construct symbolic models for NCS using a recently introduced notion of feedback refinement relations. The symbolic models for the NCS, which incorporate several network non-idealities, can be directly obtained from the symbolic model of the plant. The constructed symbolic models can be then leveraged to algorithmically synthesize controllers while enforcing complex specifications, e.g. those given by linear temporal logic (LTL) formulae. We illustrate the effectiveness of the proposed results by constructing symbolic models of several NCS using that of the plants within them.

Keywords:Complex Networked Systems, Network Games and Algorithms, Pricing and Congestion Control Abstract: Social network subscription services have two key characteristics: i) a network externality, in which the perceived value of the service by users is an increasing function of the set of subscribers (the ``active'' set), and ii) a subscription cost paid by the users. An active set is stable at a given cost if (non) subscribers have (negative) positive net utility, and is stabilizable if a cost exists at which the set is stable. Transitions among stabilizable sets induced by destabilizing costs follow cascade dynamics. We introduce the stability transition graph (STG) to capture these induced transitions, and characterize its structure. We also characterize the transitive reduction of the STG, which shows that stability transitions achievable via large price destabilizations are always achievable by a series of smaller price destabilizations.

Keywords:Data Analytics, Distributed and Large-Scale Systems, Complex Networked Systems Abstract: We propose an iterative and distributed Markov chain Monte Carlo scheme for estimation of effective edge conductances in a graph. A sample complexity analysis is provided. The theoretical guarantees on the performance of the proposed algorithm are weak compared to those of existing algorithms, but numerical experiments suggest that the algorithm might still be effective while offering the advantages of low per-iterate computation and memory requirements.
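A Monte Carlo flavor of this estimation task can be illustrated with a classical identity (our own sketch, not the paper's scheme): for an unweighted graph, the effective resistance of an edge equals the expected commute time between its endpoints divided by 2|E|.

```python
import random

# 4-cycle: exact effective resistance between adjacent nodes 0 and 1 is
# 1 ohm in parallel with 3 ohms, i.e. 3/4.
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
adj = {i: [] for i in range(4)}
for a, b in edges:
    adj[a].append(b)
    adj[b].append(a)

def hitting_time(start, target, rng):
    """Steps of a simple random walk from start until it first reaches target."""
    steps, v = 0, start
    while v != target:
        v = rng.choice(adj[v])
        steps += 1
    return steps

rng = random.Random(1)
trials = 20000
commute = sum(hitting_time(0, 1, rng) + hitting_time(1, 0, rng)
              for _ in range(trials)) / trials
r_est = commute / (2 * len(edges))             # R_eff(0,1) = E[commute] / (2|E|)
```

Only local random-walk steps are simulated, which mirrors the low per-iterate computation and memory footprint the abstract emphasizes.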

Keywords:Distributed and Large-Scale Systems, Optimization, Decentralized Control Systems Abstract: This paper considers the distributed optimization problem over a network, where the objective is to optimize a global function formed by a sum of local functions, using only local computation and communication. We develop an Accelerated Distributed Nesterov Gradient Descent (Acc-DNGD) method for strongly-convex and smooth functions. We show that it achieves a linear convergence rate and analyze how the convergence rate depends on the condition number and the underlying graph structure.
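A minimal distributed-optimization sketch in the same spirit (plain gradient tracking on toy quadratics of our own choosing, a simpler, non-accelerated relative of the paper's Acc-DNGD): three agents on a path graph jointly minimize the sum of their local functions using only neighbor communication.

```python
import numpy as np

# Local objectives f_i(x) = 0.5 * a_i * (x - b_i)^2; the global minimizer of
# their sum is the a-weighted mean of b.
a = np.array([1.0, 2.0, 3.0])
b = np.array([1.0, 2.0, -1.0])
x_star = (a * b).sum() / a.sum()               # = 1/3

# Metropolis weights on the path 0-1-2: doubly stochastic mixing matrix.
W = np.array([[2/3, 1/3, 0.0],
              [1/3, 1/3, 1/3],
              [0.0, 1/3, 2/3]])

def grad(x):
    return a * (x - b)                         # each agent evaluates only its own term

eta = 0.01
x = np.zeros(3)
y = grad(x)                                    # tracker initialized at local gradients
for _ in range(5000):
    x_new = W @ x - eta * y                    # mix with neighbors, step along tracker
    y = W @ y + grad(x_new) - grad(x)          # y tracks the average gradient
    x = x_new
```

All agents converge to the same global minimizer at a linear rate, the qualitative behavior the abstract establishes (with acceleration) for general strongly convex, smooth objectives.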

Keywords:Decentralized Control Systems, Optimization, Dynamic Games Abstract: In this paper, we view Witsenhausen's problem as a leader-follower coordination game in which the action of the leader is corrupted by additive normal noise before reaching the follower. The leader, who observes the realization of the state, chooses an action that minimizes her distance to the state of the world and the ex-ante expected deviation from the follower's action. The follower then makes a noisy observation of the leader's action and chooses an action minimizing her expected deviation from the leader's action. To this end, we analyze the perfect Bayesian equilibria of this game and show that strong complementarity between the leader and the follower, combined with a prior of poor enough precision, can give rise to what we call "near piecewise-linear equilibria". As a major consequence, we prove local optimality of a class of slopey quantization strategies which had been suspected of being the optimal solution in the past, based on numerical evidence for Witsenhausen's counterexample.

Keywords:Decentralized Control Systems, Optimization, Complex Networked Systems Abstract: The primal-dual or saddle-point gradient algorithm has recently attracted interest as a systematic technique for solving distributed optimization problems. To examine the robustness of the algorithm, here we introduce exogenous disturbance inputs and quantify the performance of the algorithm in terms of the induced L2-gain from the disturbance to deviations around the optimizer. For convex problems without inequality constraints, we find that the L2-gain from a disturbance to the deviation of the primal state from the optimizer depends only on how strongly convex the agent objective functions are, and not on the equality constraints or on algorithm time constants. For primal-dual laws derived from an augmented Lagrangian, we show that the L2-gain is a non-increasing function of the augmentation parameters, and therefore that augmentation may be beneficial for improving input/output performance.
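The primal-dual gradient dynamics studied above can be sketched on a toy equality-constrained problem of our own (undisturbed, so the trajectory settles at the optimizer): minimize 0.5*||x - c||^2 subject to sum(x) = 0, with an Euler discretization of the saddle-point flow.

```python
import numpy as np

c = np.array([1.0, 2.0, 3.0])
A = np.ones((1, 3))                            # constraint: A @ x = 0 (i.e., sum(x) = 0)
x, lam = np.zeros(3), np.zeros(1)

h = 0.01                                       # Euler step size
for _ in range(5000):
    x = x - h * ((x - c) + A.T @ lam)          # primal: descend the Lagrangian in x
    lam = lam + h * (A @ x)                    # dual: ascend the Lagrangian in lambda

# x converges to the projection of c onto {sum(x) = 0}, namely [-1, 0, 1]
```

Because the objective is strongly convex, the saddle point is exponentially stable; the abstract's L2-gain results quantify how much a persistent disturbance injected into these same dynamics can displace the primal state from that optimizer.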

Keywords:Optimization, Robust and Nonlinear Control Abstract: Motivated by online optimization problems arising in nonlinear power system applications, this article concerns optimization over closed subsets of Riemannian manifolds. Compared to conventional optimization over manifolds, we explicitly consider inequality constraints that result in a feasible set that is itself not a smooth manifold. We propose a continuous-time projected gradient descent algorithm over the feasible set and show its well-behaved convergence. Under mild assumptions on the non-degeneracy of equilibria, we show that points are local minimizers if and only if they are asymptotically stable. The proposed algorithm can be implemented as a real-time feedback control law on a physical system. This approach is particularly appropriate for the online load flow optimization problem in power systems, in which the state of the grid is naturally constrained to the manifold that represents the solution space of the nonlinear AC power flow equations. We specialize our approach to the case of power distribution systems that need to respect operational constraints while being economically efficient, and we illustrate the resulting closed-loop behavior in simulations.

Keywords:Information Theory, Dynamic Games Abstract: In this paper, we investigate the coordination of autonomous devices with non-aligned utility functions. Both the encoder and the decoder are considered as players that choose the encoding and the decoding in order to maximize their long-run utility functions. The topology of the point-to-point network under investigation suggests that the decoder implements a strategy knowing in advance the strategy of the encoder. We characterize the encoding and decoding functions that form an equilibrium by using empirical coordination. The equilibrium solution is related to an auxiliary game in which both players choose some conditional distributions in order to maximize their expected utilities. This problem is closely related to the literature on "Information Design" in game theory. We also characterize the set of posterior distributions that are compatible with a rate-limited channel between the encoder and the decoder. Finally, we provide an example of non-aligned utility functions corresponding to parallel fading multiple access channels.

Keywords:Dynamic Games, Network Games and Algorithms, Distributed and Large-Scale Systems Abstract: We study a two-stage electricity market with renewables. Each energy producer in the market has a portfolio of both renewable and conventional energy generators. In the day-ahead (DA) market, each producer submits a parameterized supply function (i.e., the amounts of energy to produce at various prices) to the independent system operator (ISO), who determines the DA market clearing price and the amounts of DA committed energy by each producer. In the real-time market, each producer tries to fulfill its DA committed energy with (zero-cost) renewables. If the renewable energy is insufficient, the producer uses conventional energy generation and incurs a cost; otherwise, it sells the surplus of renewable energy to the ISO at a predetermined feed-in tariff.

We study the robust supply function equilibrium (SFE) in this market, where each producer has incomplete information about the other producers' marginal costs and the distribution of its random renewable energy, and performs worst-case optimization against these unknown variables. We fully characterize the unique robust SFE, and study the impact of the feed-in tariff on the equilibrium outcome.

Keywords:Optimization Abstract: The proliferation of distributed generation and storage units has enabled the development of local, small-scale distribution grids, known as microgrids (MGs). To instill grid resilience, the power company will incentivize the MGs to store a portion of their energy surplus, which will then be used to meet critical loads in case of emergency. In this paper, the problem of optimizing the energy trading decisions of MG operators (MGOs) is studied using game theory. In the formulated game, each MGO must decide on the amounts of energy that must be sold immediately or stored for future emergencies, given the prospective market prices that are influenced by the other MGOs' decisions. The problem is modeled using a Bayesian game to account for the incomplete information that MGOs have on each others' levels of surplus. The proposed game explicitly accounts for each MGO's subjective decision when faced with the uncertainty of its opponents' energy surplus. In particular, the so-called framing effect, from the framework of prospect theory (PT), is used to account for each MGO's valuation of its gains and losses with respect to an individual utility reference point. The closed-form expression of the Bayesian Nash equilibrium is derived for the standard game. Under PT, a best response algorithm is proposed to find the equilibrium. Simulation results show that, depending on their reference points, MGOs can tend to store more or less energy under PT compared to classical game theory.
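As a rough illustration of the reference-dependent valuation invoked above, a minimal sketch of the standard prospect-theoretic value function follows; the coefficients are the commonly cited Kahneman-Tversky estimates, not the paper's calibration, and the function name is ours:

```python
def pt_value(outcome, reference=0.0, alpha=0.88, beta=0.88, lam=2.25):
    """Prospect-theoretic valuation of an outcome against a reference point.

    Gains are valued concavely, losses convexly, and losses are amplified
    by the loss-aversion coefficient lam (standard Kahneman-Tversky form,
    with illustrative coefficient values).
    """
    x = outcome - reference
    if x >= 0:
        return x ** alpha
    return -lam * ((-x) ** beta)

# framing: a loss hurts more than an equal gain helps
assert abs(pt_value(-1.0)) > pt_value(1.0)
```

Shifting the reference point changes which outcomes are framed as gains versus losses, which is exactly the mechanism the abstract says drives MGOs to store more or less energy.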

Keywords:Dynamic Games, Decentralized Control Systems, Network Games and Algorithms Abstract: In dynamic games with asymmetric information structure, the widely used concept of equilibrium is perfect Bayesian equilibrium (PBE). This is expressed as a strategy and belief pair that simultaneously satisfy sequential rationality and belief consistency. Unlike symmetric information dynamic games, where subgame perfect equilibrium (SPE) is the natural equilibrium concept, to date there does not exist a universal algorithm that decouples the interdependence of strategies and beliefs over time in calculating PBE. In this paper we find a subset of PBE for an infinite-horizon discounted-reward asymmetric information dynamic game. We refer to it as Structured PBE or SPBE; in SPBE, each agent's strategy depends on the public history only through a common public belief and on private history only through the agent's latest private information (its private type). The public belief acts as a summary of all the relevant past information, and its dimension does not increase with time. The motivation for this comes from the common information approach proposed in Nayyar et al. (2013) for solving decentralized team (non-strategic) resource allocation problems with asymmetric information. We calculate SPBE by solving a single-shot fixed-point equation and a corresponding forward recursive algorithm. We demonstrate our methodology by means of a public goods example.

Keywords:Dynamic Games, Decentralized Control Systems, Learning and Inference Abstract: We study the problem of Bayesian learning in a dynamical system involving strategic agents with asymmetric information. In a series of seminal papers in the literature, this problem has been studied under a simplifying model where selfish players appear sequentially and act once in the game, based on private noisy observations of the system state and public observation of past players' actions. It is shown that there exist information cascades where users discard their private information and mimic the action of their predecessor. In this paper, we provide a framework for studying Bayesian learning dynamics in a more general setting than the one described above. In particular, our model incorporates cases where players participate for the whole duration of the game, and cases where an endogenous process selects which subset of players will act at each time instance. The proposed methodology hinges on a sequential decomposition for finding perfect Bayesian equilibria (PBE) of a general class of dynamic games with asymmetric information, where user-specific states evolve as conditionally independent Markov processes and users make independent noisy observations of their states. Using our methodology, we study a specific dynamic learning model where players make decisions about investing in the team, based on their estimates of everyone's types. We characterize a set of informational cascades for this problem where learning stops for the team as a whole.

Keywords:Reliability, Security and Trust, Decentralized Control Systems Abstract: We formulate several notions of decentralized opacity for cyberphysical systems in the presence of multiple adversarial observers. Broadly speaking, we study the following cases: i) the presence or lack of a centralized coordinator, and ii) the presence or absence of collusion among the adversaries. In the case of colluding adversaries, we derive a condition for non-opacity that depends on the structure of the directed graph representing the communication between adversaries. Finally, we define a notion of opacity where the condition that the outputs be indistinguishable is relaxed.

Keywords:Data Analytics, Learning and Inference, Information Theory Abstract: In this short paper, we extend some of the recent results on matrix completion under the assumption that the columns of the matrix can be grouped (clustered) into subspaces (not necessarily disjoint or independent). This model deviates from the typical assumption prevalent in the literature dealing with compression and recovery for big-data applications. The results have a direct bearing on the problem of subspace clustering under missing or incomplete information.

Keywords:Optimization, Statistical Signal Processing, Data Analytics Abstract: We consider an online version of the sparse PCA problem with missing data in which we seek a set of sparse orthogonal basis vectors. We are motivated by big data applications where we must sequentially process possibly incomplete vector observations to find an approximating subspace, and we desire the subspace representation to be sparse and have orthogonal columns for reasons of interpretability. We propose two different algorithms for solving this problem inspired by the work of [15], where the main idea is to find a rotation matrix such that the subspace basis is sparse after rotation. Our first algorithm is a batch algorithm with updates for the rotation matrix estimate made using gradient steps on the Stiefel manifold. The second algorithm is online, and for each observation it performs two updates, one of the rotation matrix estimate and one of the subspace estimate, the latter of which is updated using gradient steps on the Grassmannian. The batch algorithm is competitive with state-of-the-art on full data. The online algorithm allows for a trade-off between subspace fit and sparsity of the subspace, and its performance degrades gracefully with missing data. We evaluate the performance of these two algorithms.

Keywords:Statistical Signal Processing, Signal Acquisition, Coding, and Retrieval, Detection and Estimation Abstract: This paper considers a controlled approach to sparse reconstruction under sensing constraints that have been largely ignored in related work on compressive sensing and sparse recovery. The first constraint stems from the reduced number of degrees of freedom of actual information gathering systems, which imposes specific structures on the sensing matrix departing from the conventional random ensembles. The second limitation originates from the unknown statistical model of the corrupting noise. A controlled sensing approach is proposed to guide the collection of informative measurements given the constrained sensing structure. In the presence of additive noise with unknown statistics, the proposed approach is shown to yield stable recovery and dispenses with the usual de-noising requirements. In addition, a sequential implementation with a stopping rule is proposed, thereby reducing the sample complexity for a target performance in reconstruction.

Keywords:Statistical Signal Processing, Signal Acquisition, Coding, and Retrieval, Machine Learning and Learning Theory Abstract: The Compressive Sensing (CS) approach for recovering sparse signals with orthonormal measurements has been studied under various notions of coherence. However, existing notions of coherence either do not exploit the structure of the underlying signal, or are too complicated to provide an explicit sampling scheme for all orthonormal basis sets. Consequently, there is a lack of understanding of the key factors that guide the sampling of CS with orthonormal measurements and achieve as low a sample complexity as possible. In this paper, we introduce a new notion of pi-coherence that exploits both the sparsity structure of the signal and the local coherence. Based on pi-coherence, we propose a sampling scheme that is adapted to the underlying true signal and is applicable for CS under all orthonormal bases. Our scheme outperforms (up to a constant factor) existing sampling schemes for orthonormal measurements, and achieves a near-optimal sample complexity (up to certain logarithmic factors) for several popular choices of orthonormal bases. Furthermore, we characterize the necessary conditions on the sampling schemes for CS with orthonormal measurements. We then propose a practical multi-phase implementation of our sampling scheme, and verify its advantage over existing sampling schemes via application to magnetic resonance imaging (MRI) in medical science.

Keywords:Coding Techniques and Applications, Signal Acquisition, Coding, and Retrieval, Coding Theory Abstract: In this work, we analyze the failing sets of the interval-passing algorithm (IPA) for compressed sensing. The IPA is an efficient iterative algorithm for reconstructing a k-sparse nonnegative n-dimensional real signal x from a small number of linear measurements y. In particular, we show that the IPA fails to recover x from y if and only if it fails to recover a corresponding binary vector of the same support, and also that only positions of nonzero values in the measurement matrix are of importance for success of recovery. Based on this observation, we introduce termatiko sets and show that the IPA fails to fully recover x if and only if the support of x contains a nonempty termatiko set, thus giving a complete (graph-theoretic) description of the failing sets of the IPA. Finally, we present an extensive numerical study showing that in many cases there exist termatiko sets of size strictly smaller than the stopping distance of the binary measurement matrix; even as low as half the stopping distance in some cases.

Keywords:Distributed and Large-Scale Systems, Decentralized Control Systems, Complex Networked Systems Abstract: In the context of load balancing, Lu et al. introduced the distributed Join-Idle-Queue algorithm, where a group of dispatchers distribute jobs to a cluster of parallel servers. Each dispatcher maintains a queue of recently idle servers; when a job arrives at a dispatcher, the dispatcher sends it to a server from its queue, or to a random server if the queue is empty. In turn, when a server becomes idle, it requests to be placed on the queue of a randomly chosen dispatcher.

Although this algorithm was shown to be quite effective, the original asymptotic analysis makes simplifying assumptions that become increasingly inaccurate as the system load increases. Further, the analysis does not naturally generalize to interesting variations, such as having a server request to be placed on the queue of a dispatcher before it has completed all jobs, which can be beneficial under high loads.

We provide a new asymptotic analysis of Join-Idle-Queue systems based on mean field fluid limit methods, deriving families of differential equations that describe these systems. Our analysis avoids previous simplifying assumptions, is empirically more accurate, and generalizes naturally to the variation described above, as well as other simple variations. Our theoretical and empirical analyses shed further light on the performance of Join-Idle-Queue, including potential performance pitfalls under high load.
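The routing rule described above is simple enough to sketch directly. The toy below (our own illustration, with instantaneous service, so it captures only the routing logic and none of the queueing dynamics the analysis addresses) shows how dispatcher idle-queues drive placement:

```python
import random

def simulate_jiq(num_jobs, num_dispatchers, num_servers, seed=0):
    """Toy illustration of the Join-Idle-Queue routing rule.

    Service is instantaneous here, so this sketches only the routing
    decision from the abstract, not the queueing behavior analyzed
    in the paper.
    """
    rng = random.Random(seed)
    idle_queues = [[] for _ in range(num_dispatchers)]   # recently idle servers
    assignments = []
    for _ in range(num_jobs):
        d = rng.randrange(num_dispatchers)               # job arrives at dispatcher d
        if idle_queues[d]:
            server = idle_queues[d].pop()                # send to a known-idle server
        else:
            server = rng.randrange(num_servers)          # fall back to a random server
        assignments.append(server)
        # on completion, the server joins the idle queue of a random dispatcher
        idle_queues[rng.randrange(num_dispatchers)].append(server)
    return assignments
```

The variation mentioned above, where a server enqueues itself before finishing all jobs, would move the final append earlier in the loop.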

Keywords:Performance Analysis, Robust and Nonlinear Control, Complex Networked Systems Abstract: We consider an infinite-server system, where multiple customers can be placed into one server for service simultaneously, subject to packing constraints. Customers are placed for service immediately upon arrival, and leave the system after service completion.

A key system objective is to minimize the total number of occupied servers in steady state. We consider the Greedy-Random (GRAND) algorithm. This is an extremely simple customer placement algorithm, which is also easy to implement. In an earlier paper, we showed that a version of the GRAND algorithm – the so-called GRAND(aZ), where Z is the current total number of customers in the system, and a > 0 is an algorithm parameter – is weakly asymptotically optimal, in the following sense: the steady-state optimality gap grows as c(a)r, where r is the system scale, i.e., the expected total number of customers in steady state, and c(a) > 0 depends on a, and c(a) approaches 0 as a approaches 0.

In this paper, we consider the GRAND(Z^{p}) algorithm, where p is an algorithm parameter. We prove that for any fixed p sufficiently close to 1, GRAND(Z^{p}) is strongly asymptotically optimal, in the sense that the optimality gap is o(r) as r approaches infinity, i.e., sublinear in the system scale r. This is a stronger result than the weak asymptotic optimality of GRAND(aZ).
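A single GRAND(aZ) placement step can be sketched as follows. This is a loose toy of ours: a simple capacity-K constraint stands in for the general packing constraints, and the function name and representation are hypothetical:

```python
import math
import random

def grand_place(occupancies, capacity, a, rng):
    """Toy GRAND(aZ) step: the arriving customer is placed uniformly at
    random into either an occupied server with spare capacity or one of
    ceil(a*Z) fresh servers, where Z is the current number of customers.

    A uniform capacity K stands in for the paper's general packing
    constraints; occupancies[i] is the customer count of server i.
    """
    z = sum(occupancies)
    with_room = [i for i, c in enumerate(occupancies) if c < capacity]
    num_new = max(1, math.ceil(a * z))          # candidate empty servers
    pick = rng.randrange(len(with_room) + num_new)
    if pick < len(with_room):
        occupancies[with_room[pick]] += 1       # join an existing server
    else:
        occupancies.append(1)                   # open a new server
    return occupancies
```

GRAND(Z^p) would replace `a * z` above with `z ** p`; the paper's result is that this seemingly small change upgrades weak to strong asymptotic optimality.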

Keywords:Data Analytics, Learning and Inference, Machine Learning and Learning Theory Abstract: For safety reasons, the interpretability of models is important in consequential applications of supervised classification in which predictions are used to support human decision makers. In this paper, we extend cardinal shape composition, a new method developed in the image processing and computer vision literature for image segmentation, to general machine learning problems. Our transformation results in a computationally-tractable l1-regularized hinge loss optimization over a shape dictionary. This approach yields human-interpretable models with an appropriate choice of atomic shapes in the dictionary used to compose decision boundaries.

Keywords:Learning and Inference, Statistical Signal Processing, Signal Acquisition, Coding, and Retrieval Abstract: We consider recovering a signal $x \in \mathbb{R}^n$ from the magnitudes of Gaussian measurements by minimizing a second-order yet non-smooth loss function. By exploiting existing concentration results for the loss function, we show that the non-convex loss function satisfies several quadratic geometric properties. Based on these geometric properties, we characterize the linear convergence of the sequence of iterates generated by the gradient flow minimizing the loss function. Furthermore, we propose an accelerated version of the gradient flow, and establish inexact linear convergence of the generated sequence of iterates by exploiting the quadratic geometry of the loss function. Finally, we verify the numerical advantages of the proposed algorithms over other state-of-the-art algorithms.

Keywords:Data Analytics, Machine Learning and Learning Theory, Statistical Signal Processing Abstract: In this paper we consider the problem of unsupervised clustering under a union of tensor (multilinear) subspaces model. In particular, we take the popular union of subspaces model and endow the subspaces with an additional algebraic structure, namely that each subspace is a tensor product of subspaces. The resulting model is referred to as the union of tensor or multilinear subspaces. We show that several real-world data sets, such as image and action video data sets, can be effectively represented using this model. Under this model we investigate an algorithm referred to as Multilinear Subspace Clustering (MSC). In this context we first show by simulations on synthetic data that MSC is computationally more efficient compared to several existing methods, indicating its utility in dealing with high-dimensional data. Further, we compare the clustering performance of the algorithm on several real-world data sets and show that MSC is competitive with current state-of-the-art methods. Based on these results we point out future research directions in using the multilinear/tensor algebraic framework for big-data processing.

Keywords:Coding Techniques and Applications Abstract: A novel deep learning method for improving the belief propagation algorithm is proposed. The method generalizes the standard belief propagation algorithm by assigning weights to the edges of the Tanner graph. These weights are then trained using deep learning techniques. A well-known property of the belief propagation algorithm is that its performance is independent of the transmitted codeword. A crucial property of our new method is that our decoder preserves this property. Furthermore, this property allows us to train on only a single codeword instead of an exponential number of codewords. Improvements over the belief propagation algorithm are demonstrated for various high-density parity-check codes.

Keywords:Coding Theory, Coding Techniques and Applications, Information Theory Abstract: The polarization of the BSC($\gamma_1$) with a BSC($\gamma_2$) is characterized explicitly for $\gamma_1, \gamma_2 \in [0, \frac{1}{2}]$. The polarization yields a channel $W^{-}$ which is a BSC($\lambda$), and a channel $W^{+}$ which is composed of a BSC($\xi$) with probability $1-\lambda$ and a BSC($\varphi$) with probability $\lambda$. The parameters $\lambda$, $\xi$ and $\varphi$ are functions of $\gamma_1$ and $\gamma_2$. For a general binary-input, output-symmetric, discrete, memoryless (BMS) channel $W$, a simple method is identified for constructing polar codes based on the fact that each polarized channel is defined by a \emph{mutual information profile}, and is comprised of sub-channel \emph{components}, similar to results by [Pedarsani et al., 2011; Tal and Vardy, 2013]. Algebraic polar transforms may be applied recursively to each sub-channel component. As an example, polar codes are constructed for a hybrid BMS channel with an erasure probability $\epsilon$, a bit-flip probability $\gamma$, and capacity $C(\epsilon, \gamma) = (1 - \epsilon)(1 - h_b(\gamma))$ where $h_b(x) \triangleq -x\log_2(x) - (1-x)\log_2(1-x)$. Based on the structure of polarization via explicit parameters, relations regarding the information density and channel dispersion $V(W)$ are analyzed for polarized channels, including the super-martingale property of $V(W)$. The analysis depends on second-order terms involving the function $\psi(x) \triangleq x \log_2^{2} x + (1-x)\log_2^{2}(1-x)$.
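The abstract does not state the parameter formulas, but one standard derivation of the BSC polarization step (ours, offered as a sketch and checked here only via conservation of mutual information) goes as follows: the minus-channel's crossover is the probability that exactly one of the two channels flips, and the plus-channel splits, according to whether the two error indicators agree, into a mixture of two BSCs:

```python
import math

def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def polarize_bsc(g1, g2):
    """One polarization step of BSC(g1) with BSC(g2), for 0 < g1, g2 < 1/2.

    Returns (lam, xi, phi): W^- is BSC(lam); W^+ is BSC(xi) with
    probability 1 - lam and BSC(phi) with probability lam. These are
    one standard derivation, not formulas quoted from the paper.
    """
    lam = g1 * (1 - g2) + g2 * (1 - g1)   # exactly one channel flips
    xi = g1 * g2 / (1 - lam)              # both flipped, given agreement
    phi = g2 * (1 - g1) / lam             # second flipped, given disagreement
    return lam, xi, phi

# sanity check: the polar transform conserves mutual information
g1, g2 = 0.1, 0.2
lam, xi, phi = polarize_bsc(g1, g2)
i_minus = 1 - h2(lam)
i_plus = (1 - lam) * (1 - h2(xi)) + lam * (1 - h2(phi))
assert abs((i_minus + i_plus) - ((1 - h2(g1)) + (1 - h2(g2)))) < 1e-9
```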

Keywords:Coding Theory, Coding Techniques and Applications Abstract: The standard binary reflected Gray code gives a sequence of binary numbers in the range 0 to n-1, where n is a power of 2, such that each number in the sequence differs from the preceding number in only one bit. We present two methods to compute Gray codes containing exactly n numbers in the range 0 to n-1—that is, a permutation of 0,1,...,n-1 in which each number differs from the preceding number in only one bit—where n is unconstrained. The first method produces a Gray code that is not cyclic: the first and last numbers in the sequence differ in more than one bit. The second method produces a cyclic Gray code if n is even, so that the first and last numbers differ in only one bit, at the expense of a slightly more complicated procedure. Both methods are based on the standard binary reflected Gray code and, as in the binary reflected Gray code, each number in the output sequence can be computed in a constant number of word operations given just its index in the sequence.
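The paper's two methods for unconstrained n are not reproduced here, but both build on the standard binary reflected Gray code, whose i-th codeword is indeed computable in a constant number of word operations:

```python
def gray(i):
    """i-th codeword of the standard binary reflected Gray code (O(1) word ops)."""
    return i ^ (i >> 1)

# consecutive codewords differ in exactly one bit (shown here for n = 16)
for i in range(1, 16):
    d = gray(i) ^ gray(i - 1)
    assert d != 0 and d & (d - 1) == 0   # d is a single set bit
```

For a power of two the sequence is also cyclic: gray(0) and gray(n-1) differ in one bit, which is the property the paper's second method recovers for even n that are not powers of two.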

Keywords:Coding Theory, Information Theory Abstract: New bounds on the semantic secrecy capacity of the binary adversarial wiretap channel are established. Against an adversary which reads a $\rho_r$ fraction of the transmitted codeword and modifies a $\rho_w$ fraction of the codeword, we show an achievable rate of $1-h(\rho_w)-\rho_r$, where $h(\cdot)$ is the binary entropy function. We also give an upper bound which is nearly matching when $\rho_r$ is small.
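The achievable rate stated above is simple to evaluate; a small helper (names ours) makes the trade-off between the reading and writing fractions explicit:

```python
import math

def h(p):
    """Binary entropy function in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def achievable_rate(rho_r, rho_w):
    """The achievable semantic-secrecy rate 1 - h(rho_w) - rho_r from the abstract."""
    return 1.0 - h(rho_w) - rho_r

# with no adversarial reading or writing, the full rate 1 is achievable
assert achievable_rate(0.0, 0.0) == 1.0
```

Reading costs rate linearly while writing costs through the entropy of the modification fraction, so a small write budget is far more damaging than an equally small read budget.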

Keywords:Information Theory, Network Information Theory, Information Theory and Statistics Abstract: The minimum common randomness required for the approximate and separate generation of a pair of correlated discrete memoryless sources is quantified by Wyner’s notion of common information. Recently, Kumar, Li, and El Gamal introduced the notion of exact common information as the minimum common randomness required for the exact and separate generation of a pair of correlated discrete memoryless sources. This new notion of common information, which does not have a general single-letter characterization, was shown to match Wyner’s notion for the symmetric binary erasure source. In this work, we present two conditions on the joint statistics of the pair of sources under either of which the exact and Wyner’s notions of common information coincide. Though the conditions are implicit, we prove the equality of Wyner and exact common information for the generalized binary Z-source, generalized erasure source and the noisy typewriter source by establishing that these sources meet either of these conditions.

Keywords:Information Theory, Information Theory and Statistics, Learning and Inference Abstract: A new layered encoding scheme is proposed to effectively generate a random vector with a prescribed joint density that induces a latent Gaussian tree structure. The encoding algorithm relies on the learned structure of the tree to use a minimal number of common random variables to synthesize the desired density; we argue that this algorithm is also computationally efficient. We characterize the achievable rate region for the rate tuples of a multi-layer latent Gaussian tree, through which the number of bits needed to simulate such a Gaussian joint density is determined. The random sources used in our algorithm are the latent variables at the top layer of the tree along with Bernoulli sign inputs, which capture the correlation signs between the variables. In latent Gaussian trees the pairwise correlation signs between the variables are intrinsically unrecoverable. Such information is vital since it completely determines the direction in which two variables are associated. As a by-product of determining the achievable rate region, we quantify the amount of information lost due to the unrecoverable sign information. It is shown that maximizing the achievable rate region is equivalent to finding the worst-case density for the Bernoulli sign inputs, under which the maximum amount of sign information is lost.

Keywords:Information Theory, Information Theory and Statistics Abstract: We study a generalization of Wyner's common information to Watanabe's total correlation. The former minimizes the description size required for a variable that can make two other random variables conditionally independent. If independence is unattainable, Watanabe's total (conditional) correlation is a measure of just how independent they have become. Following up on earlier work for scalar Gaussians, we discuss the minimization of total correlation for Gaussian vector sources. Using Gaussian auxiliaries, we show that one should transform two vectors of length d into d independent pairs, after which a reverse water-filling procedure distributes the minimization over all these pairs. Lastly, we show how this minimization of total conditional correlation fits a lossy coding problem by using the Gray–Wyner network as a model for a caching problem.
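The reverse water-filling step mentioned above allocates a common "water level" of distortion across independent Gaussian components. As an illustration of that classical rule (not the paper's vector construction; the function name is ours), a bisection on the level looks like:

```python
def reverse_waterfill(variances, total_distortion):
    """Classical reverse water-filling over independent Gaussian components.

    Each component with variance sigma_i^2 receives distortion
    min(theta, sigma_i^2), with the level theta chosen so that the
    allocations sum to the target (requires 0 < target <= sum of variances).
    """
    lo, hi = 0.0, max(variances)
    for _ in range(100):                        # bisection on the water level
        theta = 0.5 * (lo + hi)
        if sum(min(theta, v) for v in variances) < total_distortion:
            lo = theta
        else:
            hi = theta
    return [min(theta, v) for v in variances]

alloc = reverse_waterfill([4.0, 1.0, 0.25], total_distortion=1.5)
```

Components with variance below the level are described at zero rate (they absorb all their variance as distortion), while the rest share the level equally, which is how the minimization "distributes over the pairs" above.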

Keywords:Information Theory, Network Information Theory Abstract: Several users observing random sequences take turns sending error-free messages to a central estimation officer (CEO) and all other users, one at a time. The CEO, which also observes side information correlated with the users' observations, aims to compute a function of the sources and the side information in a lossy manner. The users' observations are assumed to be conditionally independent given the side information. Inner and outer bounds to the rate distortion region for this lossy interactive distributed function computation problem are established and shown to coincide in some special cases. In addition to this newly closed case, two more examples are provided in which each user observes a subset of independent random variables, and the full rate distortion region is characterized. Additionally, the relationship between a zero-distortion limit and lossless distributed function computation is studied for a special class of extremum functions and related distortions.

Keywords:Information Theory, Coding Theory Abstract: Group testing is the process of pooling arbitrary subsets from a set of $n$ items so as to identify, with a minimal number of disjunctive tests, a ``small'' subset of $d$ defective items. In ``classical'' non-adaptive group testing, it is known that when $d = o(n^{1-\delta})$ for any $\delta>0$, $\Theta(d\log n)$ tests are both information-theoretically necessary and sufficient to guarantee recovery with high probability. Group testing schemes in the literature meeting this bound require most items to be tested $\Omega(\log n)$ times, and most tests to incorporate $\Omega(n/d)$ items.

Motivated by physical considerations, we study group testing models in which the testing procedure is constrained to be ``sparse''. Specifically, we consider (separately) scenarios in which (a) items are finitely divisible and hence may participate in at most $\gamma$ tests; and (b) tests are size-constrained to pool no more than $\rho$ items per test. For both scenarios we provide information-theoretic lower bounds on the number of tests required to guarantee high-probability recovery. In particular, one of our main results shows that $\gamma$-finite divisibility of items forces \emph{any} group testing algorithm with probability of recovery error at most $\epsilon$ to perform at least $\Omega(\gamma d(n/d)^{(1-2\epsilon)/((1+2\epsilon)\gamma)})$ tests. Analogously, for $\rho$-size-constrained tests, we show an information-theoretic lower bound of $\Omega(n\log(n/d)/(\rho\log(n/\rho d)))$ tests.
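For intuition about the unconstrained baseline above, the standard COMP decoder for non-adaptive disjunctive tests can be sketched in a few lines. This is a textbook decoder with an unconstrained Bernoulli design (our illustration, not the paper's sparse-constrained schemes; names are ours):

```python
import random

def comp_decode(tests, n):
    """COMP decoding: any item in a negative test is non-defective;
    every item never cleared this way is declared defective."""
    defective = set(range(n))
    for pool, outcome in tests:
        if not outcome:                          # negative test clears its pool
            defective -= set(pool)
    return defective

def run_group_testing(n=200, d=3, num_tests=120, seed=1):
    """Random Bernoulli test design with disjunctive (OR) outcomes."""
    rng = random.Random(seed)
    truth = set(rng.sample(range(n), d))
    tests = []
    for _ in range(num_tests):
        pool = [i for i in range(n) if rng.random() < 1.0 / d]
        tests.append((pool, bool(truth & set(pool))))
    return truth, comp_decode(tests, n)
```

Note that the unconstrained design tests each item about num_tests/d times with pools of about n/d items, which is exactly the $\Omega(\log n)$ participation and $\Omega(n/d)$ pool size that the sparsity constraints above rule out.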

Keywords:Learning and Inference, Statistical Signal Processing, Data Analytics Abstract: We present a new method for identifying the latent categorization of items based on their rankings. Complementing a recent work that uses a Dirichlet prior on preference vectors and variational inference, we show that this problem can be effectively dealt with using existing community detection algorithms, with the communities corresponding to item categories. In particular, we convert the bipartite ranking data to a unipartite graph of item affinities, and apply community detection algorithms. In this context we modify an existing algorithm, namely label propagation, to a variant that uses the distance between the nodes to weight the label propagation, in order to identify the categories. We propose and analyze a synthetic ordinal ranking model and show its relation to the recently much-studied stochastic block model. We test our algorithms on synthetic data and compare performance with several popular community detection algorithms. We also test the method on real data sets of movie categorization from the MovieLens database. In all of the cases our algorithm is able to identify the categories for a suitable choice of tuning parameter.
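The baseline algorithm being modified above is plain label propagation, which can be sketched as follows (the distance-weighted variant of the paper would replace the unit vote counts with distance-based weights; this sketch and its names are ours):

```python
import random

def label_propagation(adj, iters=20, seed=0):
    """Plain (unweighted) label propagation for community detection.

    Each node starts with its own label and repeatedly adopts the most
    frequent label among its neighbors, ties broken at random.
    adj maps each node to its list of neighbors.
    """
    rng = random.Random(seed)
    labels = {v: v for v in adj}
    nodes = list(adj)
    for _ in range(iters):
        rng.shuffle(nodes)                      # asynchronous sweep
        for v in nodes:
            if not adj[v]:
                continue
            counts = {}
            for u in adj[v]:
                lab = labels[u]
                counts[lab] = counts.get(lab, 0) + 1
            best = max(counts.values())
            labels[v] = rng.choice([l for l, c in counts.items() if c == best])
    return labels

# two disjoint edges settle on one label per connected component
labels = label_propagation({0: [1], 1: [0], 2: [3], 3: [2]})
assert labels[0] == labels[1] and labels[2] == labels[3]
```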

Keywords:Learning and Inference, Information Theory and Statistics, Machine Learning and Learning Theory Abstract: Consider a noisy linear observation model with an unknown permutation, based on observing $y = \Pi^* A x^* + w$, where $x^* \in \mathbb{R}^d$ is an unknown vector, $\Pi^*$ is an unknown $n \times n$ permutation matrix, and $w \in \mathbb{R}^n$ is additive Gaussian noise. We analyze the problem of permutation recovery in a random design setting in which the entries of the matrix $A$ are drawn i.i.d. from a standard Gaussian distribution, and establish sharp conditions on the SNR, sample size $n$, and dimension $d$ under which $\Pi^*$ is exactly and approximately recoverable. On the computational front, we show that the maximum likelihood estimate of $\Pi^*$ is NP-hard to compute, while also providing a polynomial-time algorithm when $d = 1$.
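To see why $d = 1$ is computationally easy, note that for a positive scalar signal the likelihood is maximized by matching the ranks of the observations with the ranks of the design entries, which reduces the assignment to sorting. A toy sketch of this idea (ours, assuming $x^* > 0$ and shown in the noiseless case; the paper's noisy analysis is not reproduced):

```python
def recover_permutation_d1(y, a):
    """Sketch of d = 1 permutation recovery: match ranks by sorting.

    Returns pi encoding y[i] ~ a[pi[i]] * x for an (unknown) scalar x > 0.
    Exact in the noiseless case; with noise this is the ML matching for
    fixed positive x, found in O(n log n) time.
    """
    order_y = sorted(range(len(y)), key=lambda i: y[i])
    order_a = sorted(range(len(a)), key=lambda i: a[i])
    pi = [0] * len(y)
    for ry, ra in zip(order_y, order_a):
        pi[ry] = ra                  # pair equal ranks of y and a
    return pi
```

For general $d$ no such ordering exists, which is consistent with the NP-hardness result above.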

Keywords:Learning and Inference, Network Games and Algorithms Abstract: We consider a group of Bayesian agents who try to estimate a state of the world $\theta$ through interaction on a social network. Each agent $v$ initially receives a private measurement of $\theta$: a number $S_v$ picked from a Gaussian distribution with mean $\theta$ and standard deviation $\sigma$. Then, in each discrete time iteration, each agent reveals its estimate of $\theta$ to its neighbors and, observing its neighbors' actions, updates its belief using Bayes' Law.

This process aggregates information efficiently, in the sense that all the agents converge to the belief that they would have, had they access to all the private measurements. We show that this process is computationally efficient, so that each agent's calculation can be easily carried out. We also show that on any graph the process converges after at most $2N \cdot D$ steps, where $N$ is the number of agents and $D$ is the diameter of the network. Finally, we show that on trees and on distance-transitive graphs the process converges after $D$ steps, and that it preserves privacy, so that agents learn very little about the private signal of most other agents, despite the efficient aggregation of information. Our results extend those in an unpublished manuscript of the first and last authors.
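The benchmark belief the agents converge to is easy to state: with equal-variance Gaussian signals and a flat prior, the full-information posterior mean is just the sample average. A toy sketch (ours) of this benchmark, and of the one-round special case on a complete graph where revealed estimates equal private signals:

```python
def pooled_estimate(signals):
    """Full-information posterior mean of theta: with equal-variance
    Gaussian signals and a flat prior, this is the sample average."""
    return sum(signals) / len(signals)

def one_round_complete_graph(signals):
    """On a complete graph, agents' first revealed estimates are their
    private signals, so one truthful round gives every agent all signals
    and hence the pooled estimate. General graphs need the paper's
    iterative dynamics and up to 2N*D rounds."""
    return [pooled_estimate(signals) for _ in signals]
```

The substance of the result above is that the same pooled belief is reached on arbitrary graphs, where no agent ever sees the raw signals of distant agents.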

Keywords:Learning and Inference, Detection and Estimation, Statistical Signal Processing Abstract: This paper presents a non-asymptotic upper bound for the estimation error of the constrained lasso, under the high-dimensional ($n \ll p$) setting. In contrast to existing results, the error bound in this paper is sharp, is valid when the parameter to be estimated is not exactly sparse (e.g., when it is weakly sparse), and shows explicitly the effect of over-estimating the $\ell_1$-norm of the parameter to be estimated on the estimation performance. The results of this paper show that the constrained lasso is minimax optimal for estimating a parameter with bounded $\ell_1$-norm, and also for estimating a weakly sparse parameter if its $\ell_1$-norm is accessible.
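The constrained lasso above is least squares subject to an $\ell_1$-ball constraint. The paper analyzes its statistical error rather than any particular solver, but as a sketch of how the estimator can be computed, projected gradient descent with the standard sort-based $\ell_1$-ball projection looks like this (an illustration under our own naming, not the paper's method):

```python
import numpy as np

def project_l1_ball(v, radius):
    """Euclidean projection of v onto {x : ||x||_1 <= radius} (sort-based)."""
    u = np.abs(v)
    if u.sum() <= radius:
        return v.copy()
    s = np.sort(u)[::-1]
    cssv = np.cumsum(s)
    rho = np.nonzero(s * np.arange(1, len(v) + 1) > (cssv - radius))[0][-1]
    theta = (cssv[rho] - radius) / (rho + 1.0)
    return np.sign(v) * np.maximum(u - theta, 0.0)

def constrained_lasso(X, y, radius, step=None, iters=500):
    """Projected gradient descent for
    min ||y - X b||^2 / (2n)  subject to  ||b||_1 <= radius."""
    n, p = X.shape
    if step is None:
        step = n / np.linalg.norm(X, 2) ** 2    # 1/L for the smooth part
    b = np.zeros(p)
    for _ in range(iters):
        grad = X.T @ (X @ b - y) / n
        b = project_l1_ball(b - step * grad, radius)
    return b
```

The radius plays the role of the (possibly over-estimated) $\ell_1$-norm bound whose effect on estimation error the paper quantifies.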

Keywords:Optimization, Machine Learning and Learning Theory, Learning and Inference Abstract: A rank-r matrix can be written as a product of two tall matrices, U and V. One can exploit this observation in optimization: e.g., consider the minimization of a convex function f over rank-r matrices, where the set of rank-r matrices is modeled via the factorization into U and V variables. Such a heuristic has been widely used for problem instances where the solution is (approximately) low-rank. Though this parameterization reduces the number of variables and is more efficient w.r.t. computational and memory requirements (of particular interest is the case where r is much smaller than the ambient dimensions), it comes at a cost: f(UV^T) becomes a non-convex function w.r.t. U and V.

In this paper, we study such parameterization for optimizing a generic smooth convex f with Lipschitz continuous gradients, and focus on first-order, gradient descent algorithmic solutions. We propose the Bi-Factored Gradient Descent (BFGD) algorithm, an efficient first-order method that operates on the U, V factors. We show that when f is smooth and BFGD is initialized properly, it converges locally at a sublinear rate to a global optimum. As a test case, we consider the 1-bit matrix completion problem: we compare BFGD with state-of-the-art approaches and show that it has at least competitive test error performance in real-dataset experiments, while being faster in execution.
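The factored update can be sketched as follows; this is the generic bi-factored gradient step on X = UV^T with an illustrative objective, step size, and initialization, not necessarily BFGD's exact rules:

```python
import numpy as np

def factored_gd(gradf, U, V, step, iters):
    """Generic bi-factored gradient descent on X = U V^T (sketch)."""
    for _ in range(iters):
        G = gradf(U @ V.T)          # gradient of f at the current X
        U, V = U - step * (G @ V), V - step * (G.T @ U)
    return U, V

# toy instance: f(X) = 0.5 * ||X - M||_F^2 with M of rank 2
rng = np.random.default_rng(0)
A = rng.standard_normal((15, 2))
M = A @ A.T
U0 = rng.standard_normal((15, 2)) * 0.1
V0 = rng.standard_normal((15, 2)) * 0.1
step = 0.1 / np.linalg.norm(M, 2)
U, V = factored_gd(lambda X: X - M, U0, V0, step, iters=3000)
print(np.linalg.norm(U @ V.T - M) / np.linalg.norm(M))  # small residual
```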

Keywords:Data Analytics, Learning and Inference, Distributed and Large-Scale Systems Abstract: This paper considers the subspace clustering problem in a decentralized setting. The core algorithm finds directions of novelty in the span of the data to identify the membership of a collection of distributed data points. The low-rank structure of the full M_1 × M_2 data matrix D is exploited to substantially reduce the processing and communication overhead. Two decentralized designs are presented. In the first design, each agent/sensor sends a compressed version of its data vector to a central processing unit, which applies the clustering algorithm to the compressed data vectors. In the second design, only a small random subset of the agents send their compressed data vectors to the central unit, and the clustering algorithm is applied to the sampled compressed data vectors. It is shown that the gain in communication overhead of the first and second decentralized designs relative to the centralized solution is of order O(M_1/r) and O(M_1 M_2/(r^2 + M_2)), respectively, where r is the rank of D.

Keywords:Distributed and Large-Scale Systems, Decentralized Control Systems, Optimization Abstract: This paper studies a class of network optimization problems where the objective function is the summation of individual agents' convex functions and their decision variables are coupled by linear equality constraints. These constraints are not sparse, meaning that they do not match the pattern of the network adjacency matrix. We propose two approaches to design efficient distributed algorithms to solve the network optimization problem. Our first approach consists of transforming the non-sparse equality constraints into sparse ones by increasing the number of the agents' decision variables, yielding an exact reformulation of the original optimization problem. We discuss two reformulations, based on the addition of consensus variables and of constraint-mismatch variables, and discuss the scalability of the strategies resulting from them. Our second approach instead consists of sparsifying the non-sparse constraints by zeroing some coefficients, yielding an approximate reformulation of the original problem. We formally bound the distance between the optimizers of the original and approximate problems as a function of the number of constraint entries set to zero. Various simulations illustrate our results.

Keywords:Decentralized Control Systems, Complex Networked Systems, Distributed and Large-Scale Systems Abstract: This paper presents a distributed control law that locally forces a group of agents to self-organize into a rigid formation specified by a subset of interagent distances. The control law is distributed in that its execution requires each agent to sense only its relative positions with respect to its neighbors, i.e., agents with whom it shares a specified desired distance, and its own velocity. The paper extends [Krick2008], [lin], and [MSC]. In [Krick2008] each agent is modeled as a single integrator. On the other hand, [lin] assumes double-integrator dynamics but requires that agents also sense the velocities of their neighbors, doubling the communication/sensing overhead. In [MSC] the agent velocities are generated from the actuation signals through linear time-invariant positive-real dynamics, of which double-integrator dynamics are a special case; unlike [lin], no agent needs its neighbors' velocities. This paper, for its part, relaxes the assumption of linear time invariance and instead assumes that the actuation-to-velocity dynamics of each agent are nonlinear but passive. We prove that the control law of [MSC] still guarantees local stability despite the presence of nonlinear actuation dynamics. We argue that, as in [Krick2008], [lin], and [MSC], no control law for this problem can be globally stable.

Keywords:Data Analytics, Machine Learning and Learning Theory Abstract: We study the problem of ranking test items, i.e., the ordering of items according to the amount of information they provide on the latent trait of the respondents. We focus on educational applications, where instructors are interested in ranking questions so as to select a small set of informative questions in order to efficiently assess the students' understanding of the course material. Using the Rasch model for modeling student responses, we prove that the simple algorithm of sorting the item-level parameters of the Rasch model is optimal in the setting where the goal is to maximize the entropy of the student responses. We demonstrate the optimality of the sorting algorithm both theoretically and empirically on several real-world datasets. Furthermore, we also demonstrate how the sorting algorithm can be used in a batch adaptive manner for predicting unobserved student responses.
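Under the Rasch model the response entropy is a decreasing function of the gap between student ability and item difficulty, so the entropy-optimal ordering is a sort. A small illustrative sketch (the ability value and difficulties are invented):

```python
import math

def rasch_prob(theta, b):
    """Rasch model: probability of a correct response, ability theta, difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def bernoulli_entropy(p):
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def rank_items(theta, difficulties):
    # sort items by the entropy of the predicted response, most informative first
    return sorted(range(len(difficulties)),
                  key=lambda i: -bernoulli_entropy(rasch_prob(theta, difficulties[i])))

b = [-2.0, -0.5, 0.1, 1.5]
print(rank_items(0.0, b))  # items with difficulty nearest theta come first
```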

Keywords:Machine Learning and Learning Theory, Data Analytics, Statistical Signal Processing Abstract: In this paper we present a simple and efficient method to compute the canonical polyadic decomposition (CPD) of generic low-rank tensors using elementary linear algebra. The key insight is that all the columns in a low-rank tensor lie in a low-dimensional subspace, and that the coefficients of the columns in each slice with respect to the right basis are scaled copies of one another. The basis, together with the coefficients of a few carefully selected columns, determines the CPD. The computational complexity of our method scales linearly in the order and the rank of the tensor, and at most quadratically in its largest dimension. Furthermore, our approach can be easily adapted to noisy settings. We complement our theoretical analysis with experiments that support our findings.
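The key insight can be illustrated with the classical two-slice (Jennrich-type) construction — a generic-rank sketch with invented dimensions, not necessarily the paper's exact algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
r = 2
A, B, C = (rng.standard_normal((2, r)), rng.standard_normal((2, r)),
           rng.standard_normal((3, r)))
T = np.einsum('ir,jr,kr->ijk', A, B, C)        # generic rank-2 tensor

# eigenvectors of S0 S1^{-1} recover the first factor (up to order/scale)
S0, S1 = T[:, :, 0], T[:, :, 1]
_, Ahat = np.linalg.eig(S0 @ np.linalg.inv(S1))
Ahat = np.real(Ahat)

# in each slice, the coefficients w.r.t. Ahat are scaled copies of B's rows
M = np.stack([np.linalg.solve(Ahat, T[:, :, k]) for k in range(3)])
Bhat = M[0].T                                  # absorb the k = 0 scales into Bhat
Chat = np.array([[m[i] @ Bhat[:, i] / (Bhat[:, i] @ Bhat[:, i])
                  for i in range(r)] for m in M])
That = np.einsum('ir,jr,kr->ijk', Ahat, Bhat, Chat)
print(np.linalg.norm(That - T))  # ~0: the CPD reconstructs the tensor
```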

Keywords:Optimization, Pricing and Congestion Control Abstract: We study the impact of perturbations on the convergence of the subgradient method for the dual problem in constrained convex optimisation. Perturbations are likely to be present in practical implementations of the subgradient method and can affect either the computation of a subgradient, the update of the dual variables, or both. In the context of networks, perturbations can be related to the exchange of network information, time varying channel conditions, discrete actions, etc. The analysis presented in this paper is general, and establishes the conditions under which the objective function will converge to the optimum asymptotically. With an example, we illustrate how the analysis can be applied to network flow problems where the intensity of the flows that arrive in the system changes over time.
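For intuition, a toy instance (the problem and noise model are invented for the example): dual subgradient ascent with a bounded perturbation and diminishing steps still drives the dual variable to the optimum:

```python
import random

random.seed(1)
# toy problem: minimize x^2 subject to x >= 1; dual g(lam) = lam - lam^2/4,
# maximized at lam* = 2 with primal optimum x* = 1
lam = 0.0
for t in range(1, 5001):
    x = lam / 2.0                          # primal minimizer of the Lagrangian
    subgrad = 1.0 - x                      # exact dual subgradient
    noise = random.uniform(-0.1, 0.1)      # perturbation (e.g. quantized exchange)
    lam = max(0.0, lam + (1.0 / t) * (subgrad + noise))
print(lam, lam / 2.0)  # approaches the optimum lam* = 2, x* = 1
```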

Keywords:Optimization, Machine Learning and Learning Theory, Data Analytics Abstract: The regularization and output consistency behavior of dropout and layer-wise pretraining for learning deep networks have been fairly well studied. However, our understanding of how the asymptotic convergence of backpropagation in deep architectures is related to the structural properties of the network and other design choices (like denoising and dropout rate) is less clear at this time. An interesting question one may ask is whether the network architecture and input data statistics may guide the choices of learning parameters and vice versa. In this work, we explore the association between such structural, distributional and learnability aspects vis-à-vis their interaction with parameter convergence rates. We present a framework to address these questions based on convergence of backpropagation for general nonconvex objectives using first-order information. This analysis suggests an interesting relationship between feature denoising and dropout. Building upon these results, we obtain a setup that provides systematic guidance regarding the choice of learning parameters and network sizes that achieve a certain level of convergence (in the optimization sense) often mediated by statistical attributes of the inputs. Our results are supported by a set of experimental evaluations as well as independent empirical observations reported by other groups.

Keywords:Statistical Signal Processing, Optimization, Learning and Inference Abstract: Principal Component Analysis (PCA) is a method for estimating a subspace given noisy samples. It is useful in a variety of problems ranging from dimensionality reduction to anomaly detection and the visualization of high dimensional data. PCA performs well in the presence of moderate noise and even with missing data, but is also sensitive to outliers. PCA is also known to have a phase transition when noise is independent and identically distributed; recovery of the subspace sharply declines at a threshold noise variance. Effective use of PCA requires a rigorous understanding of these behaviors. This paper provides a step towards an analysis of PCA for samples with heteroscedastic noise, that is, samples that have non-uniform noise variances and so are no longer identically distributed. In particular, we provide a simple asymptotic prediction of the recovery of a one-dimensional subspace from noisy heteroscedastic samples. The prediction enables: a) easy and efficient calculation of the asymptotic performance, and b) qualitative reasoning to understand how PCA is impacted by heteroscedasticity (such as outliers).
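The setting can be simulated directly; the dimensions, noise split, and SNR below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 50, 2000
u = np.zeros(d); u[0] = 1.0                         # true one-dimensional subspace
theta = rng.standard_normal(n)                      # signal coefficients
sigma = np.where(np.arange(n) < n // 2, 0.2, 1.0)   # heteroscedastic: two noise groups
Y = np.outer(theta, u) + sigma[:, None] * rng.standard_normal((n, d))

# PCA estimate: top eigenvector of the sample covariance
_, evecs = np.linalg.eigh(Y.T @ Y / n)
u_hat = evecs[:, -1]
print(abs(u_hat @ u))  # subspace recovery |<u_hat, u>|, near 1 at this SNR
```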

Keywords:Distributed and Large-Scale Systems, Optimization Abstract: This paper studies the projected saddle-point dynamics for a twice differentiable convex-concave function, which we term saddle function. The dynamics consists of gradient descent of the saddle function in variables corresponding to convexity and (projected) gradient ascent in variables corresponding to concavity. We provide a novel characterization of the ω-limit set of the trajectories of these dynamics in terms of the diagonal Hessian blocks of the saddle function. Using this characterization, we establish global asymptotic convergence of the dynamics under local strong convexity-concavity of the saddle function. If this property is global, and for the case when the saddle function takes the form of the Lagrangian of an equality constrained optimization problem, we establish the input-to-state stability of the saddle-point dynamics by providing an ISS Lyapunov function. Various examples illustrate our results.
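For the Lagrangian special case, a forward-Euler discretization of the dynamics on a toy equality-constrained problem (invented for illustration) shows the convergence to the saddle point:

```python
# Saddle-point dynamics for the Lagrangian of: min x^2 subject to x = 1.
# L(x, lam) = x^2 + lam*(x - 1), with saddle point (x*, lam*) = (1, -2).
x, lam, h = 0.0, 0.0, 0.01
for _ in range(20000):
    dx = -(2 * x + lam)   # gradient descent in the convex variable
    dlam = x - 1          # gradient ascent in the concave variable
    x, lam = x + h * dx, lam + h * dlam
print(x, lam)  # converges toward (1, -2)
```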

Keywords:Information Theory, Network Information Theory, Information Theory and Statistics Abstract: We prove the partial strong converse property for the discrete memoryless non-degraded wiretap channel, for which we require the leakage to the eavesdropper to vanish but allow a non-zero asymptotic error probability. We use a recently developed technique based on information spectrum and a recursive bounding method to evaluate the correct decoding probability. We show that when the transmission rate is above the secrecy capacity, the correct decoding probability decays to zero exponentially. Therefore, the maximum transmission rate is the same as the case of zero asymptotic error probability, and the partial strong converse property holds.

Keywords:Information Theory, Coding Theory Abstract: Consider a sequence S of length n emitted by a Discrete Memoryless Source (DMS) with unknown distribution p. We wish to construct a lossless source code that maps S to a sequence S' of minimal length m such that S' approximates in terms of Kullback-Leibler divergence a sequence emitted by another DMS with known distribution q. Our main result is the existence of a coding scheme that performs such a task with an asymptotically optimal compression rate, i.e., such that the limit of m/n is H(p)/H(q) as n goes to infinity. Our coding scheme overcomes the challenges created by the lack of knowledge about p by relying on a sufficiently fine estimation of H(p), followed by an appropriately designed type-based compression that jointly performs source resolvability and universal lossless source coding. Our result recovers several previous results that either assume p uniform, or q uniform, or p known. The price paid for these generalizations is the use of common randomness with vanishing rate. We further determine that the length of the latter roughly scales as the square root of n, by an analysis of second order asymptotics and error exponents.

Keywords:Information Theory, Information Theoretic Approaches in Wireless Communications, Wireless Communication Systems Abstract: We consider secure communication over the two-user symmetric interference channel, where each transmitter sends a confidential message to its legitimate receiver. For the deterministic version of this channel, we provide a new upper bound and complete the secure sum-capacity characterization for all channel parameters. For the Gaussian case, we derive the secure sum capacity to within a constant gap in the weak interference regime.

Keywords:Reliability, Security and Trust, Information Theory, Data Analytics Abstract: The tradeoff between privacy and utility is studied for small datasets using tools from fixed error asymptotics in information theory. The problem is formulated as determining the privacy mechanism (random mapping) which minimizes the mutual information (a metric for privacy leakage) between the private features of the original dataset and a released version, subject to a distortion constraint between the public features and the released version. An excess probability bound is used to constrain the distortion, thus limiting the random variation in distortion due to the finite length. Bounds are derived for the following variants of the problem: (1) whether the mechanism is memoryless (local privacy) or not (global privacy), (2) whether the privacy mechanism has direct access to the private data or not. It is shown that these settings yield different performance in the first order: for global privacy, the first-order leakage decreases with the excess probability, whereas for local privacy it remains constant. The derived bounds also provide tight performance results up to second order for local privacy, as well as bounds on the second order term for global privacy.

Keywords:Information Theory, Network Information Theory Abstract: Recently, a secrecy measure based on list reconstruction has been proposed in [Schieler], in which a wiretapper is allowed to produce a list of 2^{mR_L} reconstruction sequences and secrecy is measured by the minimum distortion over the entire list. In this paper, we show that this list secrecy problem is equivalent to one with secrecy measured by a new quantity, lossy equivocation, which is proven to be the minimum optimistic 1-achievable source coding rate of the source with the wiretapped signal as two-sided information, and which can be seen as an extension of conventional equivocation to the lossy case. Under this (list) secrecy measure, we study the source-channel secrecy problem in the discrete memoryless Shannon cipher system with a noisy wiretap channel. Two inner bounds and an outer bound on the achievable region of secret key rate, list rate, wiretapper distortion, and distortion of the legitimate user are given. The inner bounds are derived by using an uncoded scheme and an (operationally) separate scheme, respectively. Thanks to the equivalence between lossy-equivocation secrecy and list secrecy, the information spectrum method is leveraged to prove the outer bound. As special cases, the admissible region for a degraded wiretap channel or for lossless communication to the legitimate user is characterized completely. We also extend our results to characterize the achievable region in the Gaussian case.

Keywords:Information Theory, Reliability, Security and Trust, Detection and Estimation Abstract: In recent years, the information theory of covert communications, where the very presence of the communication is undetectable to a watchful adversary, has been of great interest. Our recent work introduced the information-theoretic limits for covert communications over packet channels whose packet timings are governed by a Poisson point process. Here we consider the extension to timing channels characterized by more general renewal processes of rate λ. We consider two scenarios. In the first scenario, the source of the packets on the channel cannot be authenticated by Willie, and therefore Alice can insert packets into the channel. We show that if the total number of packets transmitted by Jack is N, Alice can covertly insert O(√N) packets and, if she inserts more, she will be detected by Willie. In the second scenario, packets are authenticated by Willie but we assume that Alice and Bob share a secret key; hence, Alice alters the timings of the packets according to a codebook pre-shared with Bob to send information to him over a G/M/1 queue with service rate μ > λ. We show that Alice can covertly and reliably transmit O(N) bits to Bob when the total number of packets sent from Jack to Steve is N.

Keywords:Robotics, Adaptive Control and Automation Abstract: Steel bridges are the second most common bridge type in the U.S., and the number of deficient bridges is growing. Currently, most bridge inspections are performed manually by inspectors, which requires a significant amount of human resources along with expensive, specialized equipment. Moreover, it is difficult and dangerous for inspectors to inspect large bridges with tall structures. In this paper, we propose a new design and implementation of a robotic system for steel bridge inspection. The robot has four magnetic wheels that create adhesion to steel surfaces. It is able to carry multiple sensors for navigation and mapping. Collected data are sent to the ground station for live monitoring as well as further processing. In addition, magnetic field and range sensors are integrated to enable the robot to move safely on steel surfaces. Results from physical tests on real steel structures are shown to validate the feasibility of the robot design.

Keywords:Decentralized Control Systems, Robotics, Robust and Nonlinear Control Abstract: Flocking control of multiple agents with point-mass models has been extensively studied. However, flocking control of mobile robots with full dynamic models remains challenging due to their nonholonomic nature. This paper presents a novel approach to distributed flocking control of nonholonomic mobile robots by bounded feedback. The flocking control objectives include velocity consensus, collision avoidance, and cohesion maintenance among the mobile robots. A flocking protocol based on the neighborhood information of the mobile robots is constructed by means of control design. A Lyapunov-like function and graph theory are employed for the convergence analysis. Simulation results are presented to illustrate the effectiveness of the proposed distributed flocking control scheme.

Keywords:Adaptive Control and Automation, Reliability, Security and Trust, Complex Networked Systems Abstract: Motivated by the recent interest in abstraction-based correct-by-construction control synthesis, in this paper we propose a new algorithm to construct finite abstractions for "large", possibly infinite, transition systems. As opposed to the standard bisimulation algorithms that create a partition of the state space, the new algorithm uses overlapping subsets of the state space as the states of the abstraction. We show that the output of the new bisimulation-like algorithm preserves realizability of linear-time properties. Several interesting properties of the algorithm are analyzed. In particular, when a finite bisimulation of the original system exists, the new algorithm is shown to always terminate in a finite number of steps. Moreover, we show with an example that even when the original system does not have a finite bisimulation, the new algorithm can result in a finite transition system whose infinite traces are equivalent to those of the original system. In the second part of the paper, we focus on the application of this algorithm to construct finite abstractions for discrete-time linear control systems and discuss several of its advantages over the standard bisimulation algorithm. Finally, the new algorithm is compared to the existing algorithms with some numerical examples.
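For contrast, the standard bisimulation algorithm that the paper departs from can be sketched as plain partition refinement (a generic finite-state sketch with a single action label, not the paper's overlapping-subset algorithm):

```python
def bisimulation_partition(states, trans):
    """Standard bisimulation partition refinement (baseline sketch).

    trans maps each state to its set of successor states.
    """
    blocks = [set(states)]
    changed = True
    while changed:
        changed = False
        refined = []
        for block in blocks:
            # split the block by which current blocks its members can reach
            groups = {}
            for s in block:
                key = frozenset(i for i, other in enumerate(blocks) if trans[s] & other)
                groups.setdefault(key, set()).add(s)
            refined.extend(groups.values())
            changed |= len(groups) > 1
        blocks = refined
    return blocks

trans = {'a': {'c'}, 'b': {'d'}, 'c': set(), 'd': set()}
print(bisimulation_partition('abcd', trans))  # [{'a','b'}, {'c','d'}] up to order
```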

Keywords:Robust and Nonlinear Control, Complex Networked Systems Abstract: Incremental stability is a strong property of dynamical systems ensuring the uniform asymptotic stability of each trajectory rather than a fixed equilibrium point or fixed trajectory. Here, we introduce a notion of incremental stability for time-delayed stochastic control systems and provide a sufficient condition under which the time-delayed stochastic control systems are incrementally stable. We show the effectiveness of the proposed result on a numerical example.

Keywords:Decentralized Control Systems, Optimization, Distributed and Large-Scale Systems Abstract: This paper is concerned with the optimal distributed control problem for linear discrete-time deterministic and stochastic systems. The objective is to design a stabilizing static distributed controller whose performance is close to that of the optimal centralized controller (if such a controller exists). In our previous work, we developed a computational framework to transform centralized controllers into distributed ones for deterministic systems. By building on this result, we derive strong theoretical lower bounds on the optimality guarantee of the designed distributed controllers and show that the proposed mathematical framework indirectly maximizes the derived lower bound while striving to achieve closed-loop stability. Furthermore, we extend the proposed design method to stochastic systems that are subject to input disturbance and measurement noise. The developed optimization problem has a closed-form solution (explicit formula), requires low computational effort, and can be easily deployed for large-scale systems. The proposed approach is tested on a power network and random systems to demonstrate its efficacy.

Keywords:Decentralized Control Systems, Optimization, Distributed and Large-Scale Systems Abstract: This note considers a class of decentralized convex optimization problems subject to constraints. We propose a discrete-time algorithm with constant step-size that exploits the simultaneous perturbation method to obtain information about the cost function. Under some technical conditions, we prove practical convergence in probability of the algorithm to a ball that contains the optimizer and whose size depends on the step-size. The novelty of our approach is that the agents require neither a closed-form expression of the cost function, nor global knowledge of the total resources in the network, nor any specific procedure for algorithm initialization. Our proof methods employ nonsmooth Lyapunov theory, convex analysis, and stochastic difference inclusions. We illustrate the applicability of the algorithm in an electricity market scenario.
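The simultaneous-perturbation ingredient can be sketched generically — here centralized, unconstrained, and with an invented quadratic cost; the paper's algorithm adds the decentralized and constrained machinery on top of this kind of two-evaluation gradient estimate:

```python
import random

def spsa_gradient(f, x, c):
    """Simultaneous-perturbation gradient estimate: two function evaluations
    regardless of dimension (generic sketch, not the paper's exact update)."""
    delta = [random.choice((-1.0, 1.0)) for _ in x]
    fp = f([xi + c * di for xi, di in zip(x, delta)])
    fm = f([xi - c * di for xi, di in zip(x, delta)])
    return [(fp - fm) / (2 * c * di) for di in delta]

random.seed(0)
x = [2.0, -1.0]
for _ in range(2000):
    g = spsa_gradient(lambda z: sum(zi * zi for zi in z), x, c=0.01)
    x = [xi - 0.05 * gi for xi, gi in zip(x, g)]
print(x)  # approaches the minimizer [0, 0] of the quadratic cost
```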