## Algorithms for Optimal Replica Placement in Data Centers

##### Abstract

##### Abstract

Owners of data centers are contractually obligated to provide high-availability service to their customers in the presence of ubiquitous hardware failures. Studies have indicated that co-occurring component failure is a key contributing factor towards unavailability in modern data centers [12]. Much effort has been made to produce quality statistical models of correlation among failures. In this dissertation we depart from this approach and provide a model which explicitly captures dependencies among system components. Our model consists of a directed graph wherein nodes represent hardware components and directed edges are used to connect nodes whose associated hardware components have a causal failure dependency. That is to say, failure in the source component may result in the compromised operation of the destination component. Given such a model, we consider how best to place r replicas of data in the data center so as to ensure as many replicas survive as possible. To this end, we cast our goals as combinatorial optimization problems for which we then provide algorithms or establish hardness.
We consider several variations on the survivable replica placement problem. Motivated by their use in commercially-available systems for distributed storage, we ﬁrst address the case wherein the graph is given as a tree. For this problem, we propose lexico-minimizing the failure aggregate, a novel vector-valued objective which closely matches our intuition concerning “good” replica placements. We provide an O(n+r log r) time dynamic programming algorithm for lexico-minimizing the failure aggregate. Next, we consider the problem of placing m replicated blocks of data, so as to optimize the placement of replicas for each block simultaneously. The complexity of this problem appears to be closely related to the skew, which we deﬁne to be the difference between the largest and smallest number of replicas among all blocks. We provide an algorithm whose running time grows like O(m^{0(δ2)), where δ is the skew, which is polynomial-time when the skew is a constant.
We then consider models which are more complicated than trees. We prove that optimizing replica placement over natural extensions of our model to bipartite graphs is NP-hard, and further show that this implies NP-hardness for directed graphs in general. In light of these hardness results, we next consider classes of graphs which are computationally tractable. Speciﬁcally, we show that reliable replica placement is ﬁxed-parameter tractable in a special class of multitrees. A multitree is a directed acyclic graph in which the set of all nodes that can be reached from any ﬁxed node induces a subgraph which is a tree. We parameterize multitrees via their number of roots (i.e., nodes with in-degree zero), and prove that survivable replica placement is NP-hard even when restricted to multitrees with only three roots. We then design a polynomial time algorithm for a special class of multitrees with two roots, and show how to extend this algorithm to one which runs on multitrees with k roots in O(nr^{2k+2}) time, which is polynomial-time when both the number of roots and the number of replicas is ﬁxed.