Abstract
Bibliographies in journals and other publications are a valuable reference tool that helps researchers gain a better understanding of the article under study. Making bibliographies searchable across journals would be of great use to researchers. To be searchable, bibliographies must be stored in a format that can be searched easily and with accurate results. MARC, which stands for MAchine-Readable Cataloging, has emerged as a versatile and popular standard for this purpose: it offers an efficient means of storing bibliographic records, and because most libraries follow it, many tools already exist for searching MARC records extensively and efficiently. Bibliographies in journals and publications, however, exist as plain text rather than as MARC records, and must be converted to MARC format before they can be searched in this way. This thesis examines the structure of bibliographies and proposes a mechanism for converting plain text bibliographies to MARC format. The conversion process examines the structure of a plain text bibliography to determine its contents, and compares the format of each record against standard formats to guide the conversion. Many records, however, do not adhere to these standards, so a conversion process that does not accommodate such variations will not yield accurate results. The conversion is therefore based on heuristics, so that deviations from the standard formats can be detected and the affected records converted as well. As a result of the conversion, bibliographies in MARC format can be exchanged between digital libraries. Another outcome is that journals and other publications that contain bibliographies can be linked together.
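The idea of matching each record against a standard format and falling back on heuristics when it deviates can be illustrated with a minimal Python sketch. This is not the thesis's implementation. The function name to_marc_fields, the two regular expressions, and the fallback rule (first sentence is the author, second is the title) are assumptions made for illustration only. The output is a simple mapping from MARC tag and subfield (100 $a for the personal name, 245 $a for the title, 260 $c for the publication date) to text, not an encoded MARC record.

```python
import re

# Assumed "standard" layout for this sketch: Author (Year). Title. Rest
STANDARD = re.compile(
    r"^(?P<author>[^()]+?)\s*\((?P<year>\d{4})\)\.\s*(?P<title>[^.]+)\."
)
# Any plausible four-digit publication year (1500-2099)
YEAR = re.compile(r"\b(1[5-9]\d{2}|20\d{2})\b")


def to_marc_fields(record: str) -> dict:
    """Map one plain text citation to a few MARC tag/subfield pairs."""
    record = " ".join(record.split())  # collapse line breaks and extra spaces

    # Step 1: try the standard format first.
    m = STANDARD.match(record)
    if m:
        return {
            "100$a": m["author"].strip(),
            "245$a": m["title"].strip(),
            "260$c": m["year"],
        }

    # Step 2: heuristic fallback for records that deviate from it.
    fields = {}
    year = YEAR.search(record)
    rest = record
    if year:
        fields["260$c"] = year.group(1)
        rest = YEAR.sub("", record, count=1)  # drop the year before splitting

    # Assumption: the first sentence is the author, the second is the title.
    parts = [p.strip(" ,.") for p in re.split(r"\.\s+", rest)]
    parts = [p for p in parts if p]
    if len(parts) >= 1:
        fields["100$a"] = parts[0]
    if len(parts) >= 2:
        fields["245$a"] = parts[1]
    return fields


if __name__ == "__main__":
    # The second record is a made-up example that lacks the "(Year)" layout.
    print(to_marc_fields(
        "Kalasapur, Siddarth Subash (2001). Automated, heuristic conversion "
        "of plain text bibliographic records to MARC format. Master's thesis, "
        "Texas A&M University."
    ))
    print(to_marc_fields("Smith, John. Searching library catalogs. 1999."))
```

The fallback here is deliberately crude. It would, for instance, mishandle author names written with initials, which is the kind of variation a fuller set of heuristics would need to cover.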
Kalasapur, Siddarth Subash (2001). Automated, heuristic conversion of plain text bibliographic records to MARC format. Master's thesis, Texas A&M University. Available electronically from
https://hdl.handle.net/1969.1/ETD-TAMU-2001-THESIS-K32.