Show simple item record

dc.contributor.advisor: Furuta, Richard
dc.creator: Dalal, Zubin Jamshed
dc.date.accessioned: 2004-09-30T02:09:22Z
dc.date.available: 2004-09-30T02:09:22Z
dc.date.created: 2003-08
dc.date.issued: 2004-09-30
dc.identifier.uri: https://hdl.handle.net/1969.1/539
dc.description.abstract: With the extent of the Web expanding at an increasing rate, the problems caused by broken links are reaching epidemic proportions. Studies have indicated that a substantial number of links on the Internet are broken. User surveys indicate that broken links are considered the third biggest problem faced on the Internet. Currently, the Walden's Paths Path Manager tool is capable of detecting the degree and type of change within a page in a path. Although it can also highlight missing pages or broken links, it has no method of correcting them, thus leaving the broken link problem unsolved. This thesis proposes a solution to this problem in Walden's Paths. The solution centers on the idea that "significant" keyphrases extracted from the original page can be used to accurately locate the document using a search engine. This thesis proposes an algorithm to extract representative keyphrases to locate exact copies of the original page. In the absence of an exact copy, a similar but separate algorithm is used to extract keyphrases that help locate similar pages that can be substituted for the missing page. Both sets of keyphrases are stored as additions to the page signature in the Path Manager tool and can be used when the original page is removed from its current location on the Web. [en]
dc.format.extent: 494376 bytes [en]
dc.format.extent: 77021 bytes [en]
dc.format.medium: electronic [en]
dc.format.mimetype: application/pdf
dc.format.mimetype: text/plain
dc.language.iso: en_US
dc.publisher: Texas A&M University
dc.subject: broken link problem [en]
dc.subject: walden's Paths [en]
dc.subject: keyphrase extraction [en]
dc.title: Solving the broken link problem in Walden's Paths [en]
dc.type: Book [en]
dc.type: Thesis [en]
thesis.degree.department: Computer Science [en]
thesis.degree.discipline: Computer Science [en]
thesis.degree.grantor: Texas A&M University [en]
thesis.degree.name: Master of Science [en]
thesis.degree.level: Masters [en]
dc.contributor.committeeMember: Li, Du
dc.contributor.committeeMember: Urbina, Eduardo
dc.type.genre: Electronic Thesis [en]
dc.type.material: text [en]
dc.format.digitalOrigin: born digital [en]

