½­ÄÏÌåÓý

Coded Computing via Binary Linear Codes: Designs and Performance Limits

Submitted by admin on Mon, 10/28/2024 - 01:24

We consider the problem of coded distributed computing where a large linear computational job, such as a matrix multiplication, is divided into $k$ smaller tasks, encoded using an $(n,k)$ linear code, and performed over $n$ distributed nodes. The goal is to reduce the average execution time of the computational job. We provide a connection between the problem of characterizing the average execution time of a coded distributed computing system and the problem of analyzing the error probability of codes of length $n$ used over erasure channels.

Slow and Stale Gradients Can Win the Race

Submitted by admin on Mon, 10/28/2024 - 01:24

Distributed Stochastic Gradient Descent (SGD) when run in a synchronous manner, suffers from delays in runtime as it waits for the slowest workers (stragglers). Asynchronous methods can alleviate stragglers, but cause gradient staleness that can adversely affect the convergence error. In this work, we present a novel theoretical characterization of the speedup offered by asynchronous methods by analyzing the trade-off between the error in the trained model and the actual training runtime (wallclock time).

ϵ-Approximate Coded Matrix Multiplication Is Nearly Twice as Efficient as Exact Multiplication

Submitted by admin on Mon, 10/28/2024 - 01:24

We study coded distributed matrix multiplication from an approximate recovery viewpoint. We consider a system of $P$ computation nodes where each node stores $1/m$ of each multiplicand via linear encoding. Our main result shows that the matrix product can be recovered with $\epsilon $ relative error from any $m$ of the $P$ nodes for any $\epsilon > 0$ . We obtain this result through a careful specialization of MatDot codes — a class of matrix multiplication codes previously developed in the context of exact recovery ( $\epsilon =0$ ).

List-Decodable Coded Computing: Breaking the Adversarial Toleration Barrier

Submitted by admin on Mon, 10/28/2024 - 01:24

We consider the problem of coded computing, where a computational task is performed in a distributed fashion in the presence of adversarial workers. We propose techniques to break the adversarial toleration threshold barrier previously known in coded computing. More specifically, we leverage list-decoding techniques for folded Reed-Solomon codes and propose novel algorithms to recover the correct codeword using side information.

The Capacity Region of Distributed Multi-User Secret Sharing

Submitted by admin on Mon, 10/28/2024 - 01:24

In this paper, we study the problem of distributed multi-user secret sharing, including a trusted master node, $N\in \mathbb {N}$ storage nodes, and $K$ users, where each user has access to the contents of a subset of storage nodes. Each user has an independent secret message with certain rate, defined as the size of the message normalized by the size of a storage node.

Degree Tables for Secure Distributed Matrix Multiplication

Submitted by admin on Mon, 10/28/2024 - 01:24

We consider the problem of secure distributed matrix multiplication (SDMM) in which a user wishes to compute the product of two matrices with the assistance of honest but curious servers. We construct polynomial codes for SDMM by studying a recently introduced combinatorial tool called the degree table. For a fixed partitioning, minimizing the total communication cost of a polynomial code for SDMM is equivalent to minimizing $N$ , the number of distinct elements in the corresponding degree table.

Sequential Gradient Coding for Packet-Loss Networks

Submitted by admin on Mon, 10/28/2024 - 01:24

We consider distributed computation of a sequence of $J$ gradients $\{\mathbf {g}(0), \ldots,\mathbf {g}(J-1)\}$ . Each worker node computes a fraction of $\mathbf {g}(t)$ in round- $t$ and attempts to communicate the result to a master. Master is required to obtain the full gradient $\mathbf {g}(t)$ by the end of round- $(t+T)$ . The goal here is to finish all the $J$ gradient computations, keeping the cumulative processing time as short as possible. Delayed availability of results from individual workers causes bottlenecks in this setting.

Multi-Party Proof Generation in QAP-Based zk-SNARKs

Submitted by admin on Mon, 10/28/2024 - 01:24

Zero-knowledge succinct non-interactive argument of knowledge (zkSNARK) allows a party, known as the prover, to convince another party, known as the verifier, that he knows a private value $v$ , without revealing it, such that $F(u,v)=y$ for some function $F$ and public values $u$ and $y$ . There are various versions of zk-SNARK, among them, Quadratic Arithmetic Program (QAP)-based zk-SNARK has been widely used in practice, especially in Blockchain technology.

Stream Distributed Coded Computing

Submitted by admin on Mon, 10/28/2024 - 01:24

The emerging large-scale and data-hungry algorithms require the computations to be delegated from a central server to several worker nodes. One major challenge in the distributed computations is to tackle delays and failures caused by the stragglers. To address this challenge, introducing efficient amount of redundant computations via distributed coded computation has received significant attention. Recent approaches in this area have mainly focused on introducing minimum computational redundancies to tolerate certain number of stragglers.

Function Load Balancing Over Networks

Submitted by admin on Mon, 10/28/2024 - 01:24

Using networks as a means of computing can reduce the communication flow over networks. We propose to distribute the computation load in stationary networks and formulate a flow-based delay minimization problem that jointly captures the costs of communications and computation. We exploit the distributed compression scheme of Slepian-Wolf that is applicable under any protocol information. We introduce the notion of entropic surjectivity as a measure of function’s sparsity and to understand the limits of functional compression for computation.