Efficient parallel join in the hypercube multicomputer system /

Bibliographic Details
Main Author: Kim, Taeyoung
Other Authors: Friesen, Donald K. (degree committee member.), Lively, William M. (degree committee member.), Wortman, M. A. (degree committee member.)
Format: Thesis Book
Language:English
Published: 1994.
Subjects:
Online Access:Link to OAKTrust copy
Description
Abstract:Join is the most expensive operation in the relational database management system, and hence the parallelization of the join operation is of great importance in case of joining very large relations. In this work, we develop an efficient sort-merge join algorithm in a shared-nothing, hypercube multicomputer system by addressing two important issues in parallel join: even distribution of tuples and processing. Previous works on parallel sort-merge join do not distribute tuples evenly among processors due to the static partitioning scheme by using the DYOP (DYnamic and Order Preserving) file; worse, a large number of tuples has to be moved around due to the centralized communication scheme. Hence, due to load imbalance and message congestion at specific nodes, the performance of parallel join may degrade seriously, as the degree of skew increases and the sizes of the relations grow. In our approach, we adopt a file structure called the grid file which enables us to dynamically partition a relation into a number of fragments, each of which has approximately the same number of tuples. In addition, communication between processors is done in a distributed manner fully exploiting the recursive structure of the hypercube interconnection network. Performance analysis shows that our algorithm is immune to data skew and attains lower join cost.
Item Description:Vita.
"Major subject: Computer Science."
Physical Description:xi, 105 leaves : illustrations ; 28 cm
Bibliography:Includes bibliographical references.