Text markup using the TEI and collations using the MVED: a comparison of text encoding schemes in producing electronic text

Kochumman, Rajiv Daniel



Date

2002

Author

Kochumman, Rajiv Daniel


Abstract

The Cervantes Project aims to create an Electronic Variorum Edition (EVE) from existing copies of Cervantes's novel Don Quixote de la Mancha. This involves collating text versions of the many available copies of the book. During collation, the editor, a Spanish scholar, corrects, emends, and annotates portions of the texts. The tool within the Cervantes Project that supports this work is the Multi-Variant Editor for Documents (MVED). The goal is a set of interlinked texts and images with all corrections, emendations, and annotations in place; this set is the EVE. The text files are thus manipulated in multiple ways during collation, but the manipulations are stored in a database repository and the texts themselves are not modified. In a sense, we are encoding the text document that we are working with.
The Text Encoding Initiative (TEI) was launched in 1987 to provide a standard for electronically encoding all kinds of text documents, using a scheme that is simple yet powerful. The TEI DTD defines tags that record how text documents are encoded and how they are modified. This thesis looks for parallels between the encoding schemes of the MVED and the TEI, comparing how the two encode text documents and identifying the strengths and weaknesses of each. The MVED was developed well after the TEI was founded and uses more recent advances in computer technology to assist the end user in encoding the text document. We hope to encode part of the EVE in the TEI-specified format, with all the information encoded by the MVED, and to study the advantages and disadvantages of the data abstraction that the MVED provides.
As part of the comparison, we provide functionality within the MVED to encode the ongoing collation process using the TEI. In developing this tool, the Text2TEI Converter, we hope to cover all of the points of comparison between the TEI and the MVED as regards text encoding. This involves converting the base text of the collation, with its editorial modifications, into the TEI format.
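The conversion described in the abstract can be illustrated with a small sketch. It is not the Text2TEI Converter itself, whose design is not given here. It assumes that each editorial modification is stored separately from the base text as a character range plus a replacement or comment, and it uses the TEI P5 elements `choice`, `sic`, `corr`, and `note`, which postdate the 2002 thesis. The `Edit` record and the `to_tei` function are hypothetical names introduced for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass
from xml.sax.saxutils import escape


@dataclass(frozen=True)
class Edit:
    """Hypothetical stand-off record: one editorial action on a character range."""
    start: int  # inclusive offset into the base text
    end: int    # exclusive offset into the base text
    kind: str   # "emendation" or "annotation" (assumed categories)
    text: str   # corrected reading, or the editor's comment


def to_tei(base: str, edits: list[Edit]) -> str:
    """Merge stand-off edits into the base text as TEI-style inline markup.

    The base string is only read, never modified, mirroring the idea that
    the manipulations live in a repository apart from the texts.
    """
    out: list[str] = []
    pos = 0
    for e in sorted(edits, key=lambda e: e.start):
        if e.start < pos or e.start > e.end or e.end > len(base):
            raise ValueError(f"overlapping or out-of-range edit: {e}")
        out.append(escape(base[pos:e.start]))  # untouched text before the edit
        original = escape(base[e.start:e.end])
        if e.kind == "emendation":
            # Keep the original reading and the editor's correction side by side.
            out.append(
                f"<choice><sic>{original}</sic>"
                f"<corr>{escape(e.text)}</corr></choice>"
            )
        elif e.kind == "annotation":
            # Leave the text as it is and attach the comment after it.
            out.append(f"{original}<note>{escape(e.text)}</note>")
        else:
            raise ValueError(f"unknown edit kind: {e.kind}")
        pos = e.end
    out.append(escape(base[pos:]))  # remaining text after the last edit
    return "<p>" + "".join(out) + "</p>"


if __name__ == "__main__":
    base_text = "En un lugar de la Mancba"  # invented misprint for the example
    stored_edits = [
        Edit(18, 24, "emendation", "Mancha"),
        Edit(6, 11, "annotation", "village"),
    ]
    print(to_tei(base_text, stored_edits))
```

Keeping the edits as separate records and generating markup on demand is one way to read the contrast the abstract draws: the repository holds the abstraction, and the TEI output is a serialization of it.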

URI

https://hdl.handle.net/1969.1/ETD-TAMU-2002-THESIS-K59

Description

Includes bibliographical references (leaves 35-38).
Issued also on microfiche from Lange Micrographics.

Subject

computer science.
Major computer science.

Collections

Digitized Theses and Dissertations (1922–2004)

Citation

Kochumman, Rajiv Daniel (2002). Text markup using the TEI and collations using the MVED: a comparison of text encoding schemes in producing electronic text. Master's thesis, Texas A&M University. Available electronically from https://hdl.handle.net/1969.1/ETD-TAMU-2002-THESIS-K59.
