NOTE: This item is not available outside the Texas A&M University network. Texas A&M affiliated users who are off campus can access the item through NetID and password authentication or by using TAMU VPN. Non-affiliated individuals should request a copy through their local library's interlibrary loan service.
Software implemented fault-tolerance on distributed-memory MIMD architectures
dc.creator | Holland, Gavin D | |
dc.date.accessioned | 2012-06-07T22:40:52Z | |
dc.date.available | 2012-06-07T22:40:52Z | |
dc.date.created | 1995 | |
dc.date.issued | 1995 | |
dc.identifier.uri | https://hdl.handle.net/1969.1/ETD-TAMU-1995-THESIS-H643 | |
dc.description | Due to the character of the original source materials and the nature of batch digitization, quality control issues may be present in this document. Please report any quality issues you encounter to digital@library.tamu.edu, referencing the URI of the item. | en |
dc.description | Includes bibliographical references. | en |
dc.description | Issued also on microfiche from Lange Micrographics. | en |
dc.description.abstract | Large multicomputer systems are inherently unreliable because of their enormous complexity. This has a direct impact on distributed computations performed on these systems. As the size and execution time of these distributed computations grow, so does the probability that a hardware failure will cause the computations to fail. This thesis presents a novel architecture for a software-implemented fault-tolerance layer designed to enhance the reliability of distributed computations performed on large multicomputer systems, such as massively parallel computers and distributed computing systems. The objective of this research is to develop the conceptual framework for a purely software-based, user-level solution for fault detection, reconfiguration, and recovery in a parallel environment. The symmetrically distributed, multi-tiered layer envelops user applications, enabling it to perform fault-tolerance-related actions apart from, and transparent to, the applications. Its modular design enables dynamic run-time selection of the most appropriate fault-tolerant algorithm, so the layer is not restricted to one particular fault-tolerant method. Performance and coverage measurements of a minimal implementation of the proposed layer are presented, and indicate that user-level software-implemented fault-tolerance can be reasonably efficient and effective. | en |
dc.format.medium | electronic | en |
dc.format.mimetype | application/pdf | |
dc.language.iso | en_US | |
dc.publisher | Texas A&M University | |
dc.rights | This thesis was part of a retrospective digitization project authorized by the Texas A&M University Libraries in 2008. Copyright remains vested with the author(s). It is the user's responsibility to secure permission from the copyright holder(s) for re-use of the work beyond the provision of Fair Use. | en |
dc.subject | computer science. | en |
dc.subject | Major computer science. | en |
dc.title | Software implemented fault-tolerance on distributed-memory MIMD architectures | en |
dc.type | Thesis | en |
thesis.degree.discipline | computer science | en |
thesis.degree.name | M.S. | en |
thesis.degree.level | Masters | en |
dc.type.genre | thesis | en |
dc.type.material | text | en |
dc.format.digitalOrigin | reformatted digital | en |
This item appears in the following Collection(s)
Digitized Theses and Dissertations (1922–2004)
Texas A&M University Theses and Dissertations (1922–2004)