Abstract
Consider a linear [n, k, d]_q code C. We say that the i-th coordinate of C has locality r if the value at this coordinate can be recovered by accessing some other r coordinates of C. Data storage applications require codes with small redundancy, low locality for information coordinates, large distance, and low locality for parity coordinates. In this paper, we carry out an in-depth study of the relations between these parameters. We establish a tight bound for the redundancy n - k in terms of the message length, the distance, and the locality of information coordinates. We refer to codes attaining the bound as optimal. We prove some structure theorems about optimal codes, which are particularly strong for small distances. This gives a fairly complete picture of the tradeoffs between codeword length, worst-case distance, and locality of information symbols. We then consider the locality of parity check symbols and erasure correction beyond worst-case distance for optimal codes. Using our structure theorem, we obtain a tight bound for the locality of parity symbols possible in such codes for a broad class of parameter settings. We prove that there is a tradeoff between having good locality and the ability to correct erasures beyond the minimum distance.
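
To make the definition of locality concrete, the following minimal sketch builds a toy binary code that is not taken from the paper: the k message bits are split into groups of r, and one XOR parity bit is appended per group. The function names `encode` and `recover` and the grouping scheme are assumptions introduced purely for illustration. In this toy code every information coordinate has locality r, since it can be recomputed from the other r - 1 bits of its group together with the group parity.

```python
from functools import reduce
from operator import xor


def encode(message, r):
    """Toy encoder (illustrative assumption): append one XOR parity per group of r bits."""
    if len(message) % r != 0:
        raise ValueError("message length must be a multiple of r")
    groups = [message[i:i + r] for i in range(0, len(message), r)]
    parities = [reduce(xor, group) for group in groups]
    return list(message) + parities


def recover(codeword, i, k, r):
    """Recover information coordinate i (0 <= i < k) by reading r other coordinates.

    Returns the recovered bit and the list of coordinates that were read.
    """
    g = i // r                                   # index of the group containing i
    read = [j for j in range(g * r, (g + 1) * r) if j != i]  # r - 1 group mates
    read.append(k + g)                           # parity coordinate of that group
    value = reduce(xor, (codeword[j] for j in read))
    return value, read


if __name__ == "__main__":
    k, r = 4, 2
    cw = encode([1, 0, 1, 1], r)
    for i in range(k):
        value, read = recover(cw, i, k, r)
        print(f"coordinate {i}: recovered {value} from coordinates {read}")
```

Each parity coordinate in this sketch is likewise a function of r other coordinates, so it also has locality r. The example only tolerates a single erasure per group, which hints at the tension between locality, redundancy, and distance that the paper studies.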