Redundant allocation of data in a distributed relational database /

Bibliographic Details
Main Author: Sarathy, Rathindra, 1956-
Other Authors: Curry, Guy L. (degree committee member.), Tretter, Marietta J. (degree committee member.)
Format: Thesis Book
Language:English
Published: 1990.
Subjects:
Online Access:ProQuest, Abstract
Link to OAKTrust copy
Description
Abstract:The increasing availability of computer networks and the need for managing decentralized data at several geographically remote locations, have given rise to a number of distributed databases. This study considers an important design problem, that of allocating data from a relational distributed database over a network with a general topology. In practice, users pay for accessing data at remote sites, usually depending on the volume of data sent over the network. The objective of data allocation is to minimize the total cost of data transmitted over the network, due to user queries from all sites. In distributed databases, user queries containing multi-relation operations, such as the JOIN operation, are handled by distributed query processing (DQP) algorithms. DQP sets up a complicated schedule of transmissions (called a query processing schedule or QPS) between remote sites containing the data needed by the JOIN operation. Consideration of the costs of QPSs in data allocation leads to an important issue called "data dependency". Data allocation in the presence of data dependency is an open problem, with limited research on its solvability for problems with several sites and relations. This study models the data allocation problem as a binary, integer programming problem. Due to data dependency, the model is nonlinear. The Lagrangian relaxation of the nonlinear model decomposes into unconstrained 0-1 quadratic programming subproblems, which in certain cases are also solved as maximum flow problems in a network. The optimal Lagrangian multipliers are obtained through subgradient optimization. The Lagrangian relaxation provides lower bounds on the optimal solution, while upper bounds are obtained by heuristics. The difference between the bounds provides solutions whose proximity to the optimal solution is known. Computational results are provided, which indicate the feasibility of the proposed solution approach. Also, other important issues such as the trade-off between solution quality and computational time, the effect of aggregate measures of single and multi-relation query transmissions and the effect of optimal DQP strategies on data allocation are investigated.
Item Description:Typescript (photocopy).
Vita.
"Major subject: Business analysis and research."
Physical Description:ix, 110 leaves : illustrations ; 29 cm
Bibliography:Includes bibliographical references.