NOTE: This item is not available outside the Texas A&M University network. Texas A&M affiliated users who are off campus can access the item through NetID and password authentication or by using TAMU VPN. Non-affiliated individuals should request a copy through their local library's interlibrary loan service.
Automated, heuristic conversion of plain text bibliographic records to MARC format
dc.creator | Kalasapur, Siddarth Subash | |
dc.date.accessioned | 2012-06-07T23:05:34Z | |
dc.date.available | 2012-06-07T23:05:34Z | |
dc.date.created | 2001 | |
dc.date.issued | 2001 | |
dc.identifier.uri | https://hdl.handle.net/1969.1/ETD-TAMU-2001-THESIS-K32 | |
dc.description | Due to the character of the original source materials and the nature of batch digitization, quality control issues may be present in this document. Please report any quality issues you encounter to digital@library.tamu.edu, referencing the URI of the item. | en |
dc.description | Includes bibliographical references (leaves 59-60). | en |
dc.description | Issued also on microfiche from Lange Micrographics. | en |
dc.description.abstract | Bibliographies in journals and other publications are a valuable reference tool that helps researchers gain a better understanding of an article under study. Making bibliographies searchable across journals would be of great use to researchers. To be searchable, bibliographies must be stored in a format that can be searched easily and with accurate results. MARC has emerged as a versatile and popular standard, and records in MARC format are suitable for easy and quick searching. MARC, which stands for Machine-Readable Cataloging, provides an efficient means of storing bibliographic records and supports extensive searching. Since MARC is a standard followed by most libraries, various tools already exist that support extensive and efficient searching. However, bibliographies in journals or publications exist in plain text form and are not stored as MARC records, so they must be converted to MARC format before they can be searched. This thesis examines the structure of bibliographies and proposes a mechanism to convert plain text bibliographies to MARC format. The motivation is that a MARC-based bibliography is more suitable for extensive searching than a plain text bibliography. The conversion process examines the structure of a plain text bibliography to determine its contents, and the format of each record is then compared against standard formats to aid the conversion. However, many records do not adhere to the standards, so a conversion process that does not accommodate these variances will not yield accurate results. Consequently, the conversion process has to be based on heuristics, so that variances from the standard formats can be detected and these records are converted as well.
One result of this conversion is that bibliographies in MARC format can be exchanged between digital libraries. Another is that journals and other publications that contain bibliographies can be linked together. | en |
dc.format.medium | electronic | en |
dc.format.mimetype | application/pdf | |
dc.language.iso | en_US | |
dc.publisher | Texas A&M University | |
dc.rights | This thesis was part of a retrospective digitization project authorized by the Texas A&M University Libraries in 2008. Copyright remains vested with the author(s). It is the user's responsibility to secure permission from the copyright holder(s) for re-use of the work beyond the provision of Fair Use. | en |
dc.subject | computer science. | en |
dc.subject | Major computer science. | en |
dc.title | Automated, heuristic conversion of plain text bibliographic records to MARC format | en |
dc.type | Thesis | en |
thesis.degree.discipline | computer science | en |
thesis.degree.name | M.S. | en |
thesis.degree.level | Masters | en |
dc.type.genre | thesis | en |
dc.type.material | text | en |
dc.format.digitalOrigin | reformatted digital | en |
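The abstract describes comparing each plain text record against standard formats and falling back on heuristics when a record deviates from them. A minimal Python sketch of that idea follows. It is not the thesis's implementation, which this record does not show: the two citation layouts, the function name to_marc_fields, the tuple representation of a field, and the sample citations are all assumptions made for illustration. The MARC tags used are 100 (main entry, personal name), 245 (title statement), 260 $c (date of publication), 773 $t (host item title), and 500 (general note).

```python
import re

# Hypothetical citation layouts, tried in order. These are illustrative
# assumptions, not the rules used in the thesis.
PATTERNS = [
    # Layout A: Author (Year). Title. Host publication.
    re.compile(
        r"^(?P<author>[^()]+?)\s*\((?P<year>\d{4})\)\.\s*"
        r"(?P<title>[^.]+)\.\s*(?P<host>.+?)\.?$"
    ),
    # Layout B: Author, "Title," Host publication, Year.
    re.compile(
        r'^(?P<author>.+?),\s*"(?P<title>.+?),?"\s*,?\s*'
        r"(?P<host>.+?),\s*(?P<year>\d{4})\.?$"
    ),
]

# Loose year detector used only by the fallback heuristic.
YEAR = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")


def to_marc_fields(citation):
    """Return a list of (tag, subfield, value) tuples for one citation."""
    text = citation.strip()
    for pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return [
                ("100", "a", match.group("author").strip()),
                ("245", "a", match.group("title").strip()),
                ("260", "c", match.group("year")),
                ("773", "t", match.group("host").strip()),
            ]
    # Fallback for records that fit no known layout: keep whatever can be
    # recognized (here only a year) and preserve the raw text as a note,
    # so the record is still converted rather than dropped.
    fields = []
    year = YEAR.search(text)
    if year:
        fields.append(("260", "c", year.group(1)))
    fields.append(("500", "a", text))
    return fields


if __name__ == "__main__":
    # Made-up sample citations, one per layout plus one irregular record.
    samples = [
        "Doe, J. (1999). A sample title. Journal of Examples.",
        'J. Doe, "A sample title," Journal of Examples, 1999.',
        "Doe J, A sample title, in Journal of Examples 1999",
    ]
    for sample in samples:
        print(to_marc_fields(sample))
```

Trying strict layouts first and degrading to a partial record keeps well-formed citations precise while still producing output for irregular ones. A fuller converter would need many more layouts and a serializer for real MARC records.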
This item appears in the following Collection(s)
- Digitized Theses and Dissertations (1922–2004)
Texas A&M University Theses and Dissertations (1922–2004)