Adaptive fault diagnosis for multiprocessor architectures
| Main Author: | |
|---|---|
| Format: | Thesis Book |
| Language: | English |
| Published: | [Place of publication not identified] : [publisher not identified] ; 1995. |
| Subjects: | |
| Online Access: | http://proxy.library.tamu.edu/login?url=http://proquest.umi.com/pqdweb?did=741405811&sid=1&Fmt=2&clientId=2945&RQT=309&VName=PQD |
| Summary: | Techniques of fault diagnosis are important to maintain the reliability and availability of a multiprocessor system. On-line diagnosis is desired in many critical applications, such as space shuttle mission control and bank transactions, where the systems cannot afford to reboot for diagnosis. For on-line diagnosis, it is essential to minimize the diagnosis overhead, which must be taken from the system's computation power. On the other hand, a diagnosis scheme must be highly effective in the sense that it still functions correctly when a large number of faults exist. These two goals often conflict with each other in present diagnostic approaches. The major objective of this dissertation is to develop a technique that lets available multiprocessor systems achieve fault diagnosis that is both efficient and effective. First, we propose adaptive system-level diagnosis approaches for hypercubes and for meshes with wraparound. In addition to the theoretical analysis, experiments have been performed on NCUBE, a real hypercube machine, and the proposed model for diagnosis cost is verified. Building on the algorithms proposed for meshes and hypercubes, we pursue generalized diagnosis approaches for a whole class of processor arrays and propose a unified methodology, namely a divide-and-conquer methodology of diagnosis. Applying this unified methodology further enhances the previous diagnosis approaches for meshes and hypercubes. When a system runs in a degraded manner, its network topology is damaged and becomes arbitrary, retaining none of the properties it originally had. Our secondary objective is therefore to pursue low-cost system-level diagnosis even in an arbitrary network topology. A near-optimum loop searching algorithm is also proposed to consolidate our diagnosis approach. The widely adopted PMC model [1] is used in this dissertation. To evaluate the efficiency of the proposed diagnosis schemes, all three measures of cost for on-line system-level diagnosis are considered: (1) diagnosis time, (2) number of tests, and (3) number of test links. To evaluate the effectiveness of the proposed diagnosis schemes, the maximum number of faults allowed for a correct diagnosis (called the fault bound) is analyzed. |
| Item Description: | Vita. "Major Subject: Computer Science". |
| Physical Description: | xii, 151 leaves : illustrations ; 28 cm. Issued also on microfiche from University Microfilms Inc. |
| Bibliography: | Includes bibliographical references. |