dc.creator: Holland, Gavin D
dc.date.accessioned: 2012-06-07T22:40:52Z
dc.date.available: 2012-06-07T22:40:52Z
dc.date.created: 1995
dc.date.issued: 1995
dc.identifier.uri: https://hdl.handle.net/1969.1/ETD-TAMU-1995-THESIS-H643
dc.description: Due to the character of the original source materials and the nature of batch digitization, quality control issues may be present in this document. Please report any quality issues you encounter to digital@library.tamu.edu, referencing the URI of the item. [en]
dc.description: Includes bibliographical references. [en]
dc.description: Issued also on microfiche from Lange Micrographics. [en]
dc.description.abstract: Large multicomputer systems are inherently unreliable because of their enormous complexity. This has a direct impact on distributed computations performed on these systems. As the size and execution time of these distributed computations grow, so does the probability that a hardware failure will cause the computations to fail. This thesis presents a novel architecture for a software-implemented fault-tolerance layer, designed to enhance the reliability of distributed computations performed on large multicomputer systems, such as massively parallel computers and distributed computing systems. The objective of this research is to develop the conceptual framework for a purely software-based, user-level solution for fault detection, reconfiguration, and recovery in a parallel environment. The symmetrically distributed, multi-tiered layer envelops user applications, which enables the layer to perform fault-tolerance-related actions separately from, and transparently to, the application. Its modular design enables dynamic run-time selection of the most appropriate fault-tolerance algorithm, so the layer is not restricted to one particular fault-tolerance method. Performance and coverage measurements of a minimal implementation of the proposed layer are presented, and indicate that user-level software-implemented fault tolerance can be reasonably efficient and effective. [en]
dc.format.medium: electronic [en]
dc.format.mimetype: application/pdf
dc.language.iso: en_US
dc.publisher: Texas A&M University
dc.rights: This thesis was part of a retrospective digitization project authorized by the Texas A&M University Libraries in 2008. Copyright remains vested with the author(s). It is the user's responsibility to secure permission from the copyright holder(s) for re-use of the work beyond the provision of Fair Use. [en]
dc.subject: computer science. [en]
dc.subject: Major computer science. [en]
dc.title: Software implemented fault-tolerance on distributed-memory MIMD architectures [en]
dc.type: Thesis [en]
thesis.degree.discipline: computer science [en]
thesis.degree.name: M.S. [en]
thesis.degree.level: Masters [en]
dc.type.genre: thesis [en]
dc.type.material: text [en]
dc.format.digitalOrigin: reformatted digital [en]



This item and its contents are restricted. If this is your thesis or dissertation, you can make it open-access. This will allow all visitors to view the contents of the thesis.
