Algorithmic and Coding-theoretic Methods for Group Testing and Private Information Retrieval
Abstract
In the first part of this dissertation, we consider the Group Testing (GT) problem and two of its variants, the Quantitative GT (QGT) problem and the Coin Weighing (CW) problem. An instance of the GT problem consists of a ground set of items that contains a small subset of defective items. The GT procedure consists of a number of tests, each of which indicates whether or not a given subset of items includes one or more defective items. The goal of the GT procedure is to identify the subset of defective items using the minimum number of tests.
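The test model described above can be made concrete with a small sketch. The pool design (binary-index pools plus their complements) and the decoder (the standard COMP baseline, which clears every item seen in a negative test) are assumptions chosen for illustration; they are not the schemes proposed in the dissertation, and all names below are hypothetical.

```python
from typing import List, Set


def run_tests(pools: List[Set[int]], defective: Set[int]) -> List[int]:
    """Noiseless GT: a test is 1 if its pool contains any defective item, else 0."""
    return [1 if pool & defective else 0 for pool in pools]


def comp_decode(n_items: int, pools: List[Set[int]], outcomes: List[int]) -> Set[int]:
    """COMP baseline (illustrative only): every item that appears in a
    negative pool is cleared; all remaining items are declared defective."""
    cleared: Set[int] = set()
    for pool, outcome in zip(pools, outcomes):
        if outcome == 0:
            cleared |= pool
    return set(range(n_items)) - cleared


# Hypothetical toy instance: 8 items, one defective item.
n_items = 8
defective = {5}

# Pool b holds the items whose index has bit b set; the complements are added too.
bit_pools = [{i for i in range(n_items) if (i >> b) & 1} for b in range(3)]
pools = bit_pools + [set(range(n_items)) - p for p in bit_pools]

outcomes = run_tests(pools, defective)
estimate = comp_decode(n_items, pools, outcomes)
print(outcomes, estimate)
```

In this toy instance six tests suffice to isolate the single defective item among eight, whereas testing items individually would take eight tests.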
Motivated by practical scenarios where the outcome of the tests can be affected by noise, we focus on the noisy GT setting, in which the outcome of a test can be flipped with some probability. In the noisy GT setting, the goal is to identify the set of defective items with high probability. We investigate the performance of two variants of the Belief Propagation (BP) algorithm for decoding noisy non-adaptive GT under the combinatorial model for defective items. Through extensive simulations, we show that the proposed algorithms achieve higher success probability and lower false-negative and false-positive rates than the traditional BP algorithm. We also consider a variation of the probabilistic GT model in which the prior probability of each item being defective is not uniform and in which a certain amount of side information on the distribution of the defective items is available to the GT algorithm. This dissertation focuses on leveraging the side information to improve the performance of decoding algorithms for noisy GT. First, we propose a probabilistic model, referred to as an interaction model, that captures the side information about the probability distribution of the defective items. Next, we present a decoding scheme, based on BP, that leverages the interaction model to improve the decoding accuracy. Our results indicate that the proposed algorithm achieves higher success probability and lower false-negative and false-positive rates than the traditional BP, especially in the high-noise regime.
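As a minimal sketch of the noise model described above, each noiseless outcome can be flipped independently with a fixed probability. The symmetric flip probability, the function name, and the toy pools are assumptions for illustration; the BP decoders studied in the dissertation are not reproduced here.

```python
import random
from typing import List, Set


def noisy_outcomes(
    pools: List[Set[int]],
    defective: Set[int],
    flip_prob: float,
    rng: random.Random,
) -> List[int]:
    """Compute each noiseless test outcome, then flip it with probability flip_prob."""
    results = []
    for pool in pools:
        outcome = 1 if pool & defective else 0
        if rng.random() < flip_prob:
            outcome ^= 1  # the test result is flipped by noise
        results.append(outcome)
    return results


# Hypothetical toy instance: 6 items, 3 pools, one defective item.
toy_pools = [{0, 1, 2}, {2, 3, 4}, {4, 5, 0}]
toy_defective = {2}
rng = random.Random(0)  # seeded so that repeated runs give the same draw

clean = noisy_outcomes(toy_pools, toy_defective, 0.0, rng)
noisy = noisy_outcomes(toy_pools, toy_defective, 0.1, rng)
print(clean, noisy)
```

With `flip_prob` set to 0 the function reduces to the noiseless model, so a decoder can be checked against the same pools at several noise levels.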
In the QGT problem, the result of a test reveals the number of defective items in the tested group. This is in contrast to standard GT, where the result of each test is either 1 or 0, depending on whether or not the tested group contains any defective items. In this dissertation, we study the QGT problem for the combinatorial and probabilistic models of defective items. We propose non-adaptive QGT algorithms using sparse graph codes over bi-regular and irregular bipartite graphs, and binary t-error-correcting BCH codes. The proposed schemes provide exact recovery with a probabilistic guarantee, i.e., they recover all the defective items with high probability. In the sub-linear regime, the proposed schemes outperform existing non-adaptive QGT schemes in terms of the number of tests required to identify all defective items with high probability.
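The difference between the two test models can be seen in a short sketch. The pools and the defective set are hypothetical, and the code illustrates only the measurement model, not the sparse-graph or BCH-based constructions proposed in the dissertation.

```python
from typing import List, Set


def binary_outcomes(pools: List[Set[int]], defective: Set[int]) -> List[int]:
    """Standard GT: 1 if the pool contains any defective item, else 0."""
    return [1 if pool & defective else 0 for pool in pools]


def quantitative_outcomes(pools: List[Set[int]], defective: Set[int]) -> List[int]:
    """QGT: each test returns the number of defective items in its pool."""
    return [len(pool & defective) for pool in pools]


# Hypothetical toy instance: 8 items, two defective items.
qgt_items = 8
qgt_defective = {5, 7}
qgt_pools = [{i for i in range(qgt_items) if (i >> b) & 1} for b in range(3)]

gt_results = binary_outcomes(qgt_pools, qgt_defective)
qgt_results = quantitative_outcomes(qgt_pools, qgt_defective)
print(gt_results, qgt_results)
```

Here every binary test is positive, so the binary outcomes alone say little, while the counts differ from pool to pool and carry more information per test.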
The CW problem lies at the intersection of the GT and compressed sensing problems. Given a collection of coins and the total weight of the coins, where the weight of each coin is an unknown integer, the problem is to determine the weight of each coin by weighing subsets of coins on a spring scale. The goal is to minimize the average number of weighings over all possible weight configurations. Toward this goal, we propose and analyze a simple and effective adaptive weighing strategy. This strategy gives the first non-trivial achievable upper bound on the minimum expected number of weighings required.
In the second part of this dissertation, we focus on the private information retrieval problem. In many practical settings, the user needs to retrieve information messages from a server periodically, over multiple rounds of communication. The messages are retrieved one at a time, and the identity of future requests is not known to the server. We study private information retrieval protocols that ensure that the identities of all the messages retrieved from the server are protected. This scenario can occur in practical settings such as periodic content download from text and multimedia repositories. We refer to this problem of minimizing the rate of data download as the online private information retrieval problem. Following the previous line of work by Kadhe et al., we assume that the user knows a subset of messages in the database as side information. The identities of these messages are initially unknown to the server. Focusing on scalar-linear settings, we characterize the per-round capacity, i.e., the maximum achievable download rate at each round. The key idea of our achievability scheme is to combine the data downloaded during the current round and the previous rounds with the original side information messages and use the resulting data as side information for the subsequent rounds.
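The reuse of downloaded data as side information can be illustrated with a toy sketch of the decoding arithmetic only. The XOR-based server answer, the message contents, and all names are assumptions for illustration. The sketch does not model how queries must be designed to keep the requested identities private, and it is not the dissertation's achievability scheme.

```python
from typing import Dict, Iterable


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length messages."""
    return bytes(x ^ y for x, y in zip(a, b))


# Hypothetical database of equal-length messages held by the server.
database: Dict[int, bytes] = {0: b"alpha", 1: b"bravo", 2: b"delta"}


def server_answer(indices: Iterable[int]) -> bytes:
    """Server returns the XOR (a scalar-linear combination) of the requested messages."""
    answer = bytes(5)  # all-zero block matching the toy message length
    for i in indices:
        answer = xor_bytes(answer, database[i])
    return answer


def retrieve(wanted: int, helper: int, side_info: Dict[int, bytes]) -> bytes:
    """Decode the wanted message from one coded download and a known helper,
    then keep the result as side information for later rounds."""
    coded = server_answer([wanted, helper])
    recovered = xor_bytes(coded, side_info[helper])
    side_info[wanted] = recovered
    return recovered


side_info = {1: database[1]}              # initial side information
first = retrieve(0, 1, side_info)         # round 1 uses the original side information
second = retrieve(2, 0, side_info)        # round 2 reuses the round-1 download
print(first, second, sorted(side_info))
```

After the first round the user holds two messages instead of one, so the second round has more side information to draw on than the first.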
Citation
Karimi, Esmaeil (2022). Algorithmic and Coding-theoretic Methods for Group Testing and Private Information Retrieval. Doctoral dissertation, Texas A&M University. Available electronically from https://hdl.handle.net/1969.1/197240.