Publications & Presentations

Recent Submissions

  • Item
    Cultivating Shared Services - Four Winds Digital Signage
    (2015-10-27) Bolton, Michael; Richardson, Casey; Sweeney, David
    Through a partnership among a variety of units on the College Station campus, we worked with Four Winds Interactive, a leading provider of digital signage, to provide a shared service that leverages the buying power of multiple units, the technical power of a shared installation, and the collaborative power of an IT governance model. This presentation will detail the model used to create a successful shared service that benefits a variety of audiences, both internal and external.
  • Item
    The Promise and Peril of RDF for Formalizing the Humanities
    (2015-10-26) Creel, James; Potvin, Sarah
    The Resource Description Framework (RDF) defines structures for describing entities identifiable by Uniform Resource Identifiers (URIs). RDF exists at the top of the stack of technology standards proposed by the World Wide Web Consortium (W3C) that dominate the ecosystem of the Web today, and is purported to enable a Semantic Web on which machines can interpret webpages to perform information seeking and processing tasks on our behalf. RDF describes entities by means of triples, each consisting of a subject, predicate, and object. A triple expresses that its subject stands in a certain relation (the predicate) to its object. Subjects and objects are either URIs or literals (such as strings or numbers), but predicates are always URIs referring to abstract relations present in an RDF schema (sometimes informally referred to as an ontology). Since anyone can define schemata and assign new URIs to entities, RDF offers a flexibility of expression approaching natural language. Institutions have adopted RDF for description of humanities works and projects as well as scholars and humanists themselves. Using RDF, digital repositories like Fedora and DSpace expose works, and the VIVO semantic networking tool describes researchers, their affiliations, and works. To the degree that the same URIs occur in such different contexts, our data are linked, forming a graph. Yet to what degree can RDF graphs bear meaning like expressions of natural language? The answer does not depend simply upon how linked the data are. More telling is how the data are used and by whom. A related question is what it means for our RDF graphs to be machine readable. Strictly speaking, ontologies do not enable machines to understand each other's data. Rather, ontologies can help humans understand other humans' expressions in a digital medium. A predicate such as dc:author is significant of the authorship relation only insofar as humans using the data believe and act as though it is. Our concern, then, is a hermeneutic one: how do humans interpret RDF, and how can machines facilitate this interpretation? Our view is that representational systems take on meaning only by consensus. No predicate or URI can bear communicable meaning by fiat of a single agent. Naturally, as systems scale in scope and ambition, ever more stakeholders must reach consensus. When a large ontology is developed without continual feedback from a community, the complexity of the finished product will prove a barrier to its adoption. In this talk, we will briefly consider some of the historical lessons learned in formal knowledge representation in the computer science space, where decades of work have yielded mixed results. We will also look at two current projects that have successfully applied RDF in the humanities space: the Pelagios (http://pelagios-project.blogspot.co.uk) initiative to annotate historic documents with references to places in gazetteers, and the Pleiades (http://pleiades.stoa.org) project that provides such a gazetteer of the ancient world. These systems have enjoyed success by employing RDF predicates in common use and by cultivating community involvement from their inception.
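    To make the triple structure concrete, here is a minimal sketch using Python's rdflib; the handle URI and author name are invented for illustration, and rdflib's bundled Dublin Core namespace expresses authorship as dc:creator rather than dc:author.

        # One RDF triple: subject (a thesis), predicate (authorship), object (a name).
        from rdflib import Graph, Literal, URIRef
        from rdflib.namespace import DC

        g = Graph()
        g.bind("dc", DC)
        thesis = URIRef("http://hdl.handle.net/1969.1/12345")  # hypothetical handle
        g.add((thesis, DC.creator, Literal("Doe, Jane")))      # hypothetical author
        print(g.serialize(format="turtle"))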
  • Item
    Automation, Virtualization, and Integration of a Digital Repository Server Architecture or How to Deploy Three Production DSpaces in One Night and Be Home for Dinner
    (Texas Digital Library, 2015-03-17) Creel, James; Cooper, John Micah; Huff, Jeremy
    Texas A&M University Libraries have operated a DSpace repository since 2004. For most of this period, the public-facing production server ran on dedicated hardware and was installed by hand by a system administrator using numerous tweaks for the local architecture. DSpace requires several interdependent sub-applications, including a user interface (XMLUI or JSPUI), a Solr indexer, an OAI service, and a handle server. Even after careful pre-production releases and testing, redeployments can take all night and lead to days or weeks of bug fixing. Yet migrations and upgrades are inevitable and desirable. The open-source community is constantly implementing new features. In addition, with usage and content submission, a production DSpace instance will eventually outgrow its server hardware and need to be redeployed. With a diversity of special requirements from repository stakeholders, TAMU Libraries has a history of heavy customization of DSpace. An abridged list of customizations includes a search interface for ETDs, additional options for item licensing, request-a-copy for restricted items, an expanding/collapsing browser for the community/collection hierarchy, and improvements to administrative views, all presented with a branded theme. These customizations touch every level of the code, from the Java back end to the XML front ends. Although DSpace-related software development at TAMU produced important contributions early on, most notably the XMLUI front end (a.k.a. Manakin), over the years local demands for new features quickly outweighed the imperative to package features for submission back to the core DSpace code. Such demands proved shortsighted, as DSpace upgrades became increasingly difficult and fraught as customization increased. Lead times for redeployments grew intolerable, as developers were forced with each upgrade to rewrite customizations and examine thousands of lines of configuration. Recently, three factors have brought about a profound improvement in developers' efficiency at TAMU Libraries - but not without cost. The Libraries' system administration leadership saw a need to automate server deployment tasks and, in a related initiative, to move to a virtualized server infrastructure whereby servers are deployed to commodified, generic virtual machines using resources from a transparently allocated hardware pool. Finally, since DSpace 3.x, the official DSpace code has been refactored in such a way as to facilitate customization with independent sub-modules that need not disturb the build structure. The repayment of this technical debt has required about a year, during which time customers made do with very little new feature development. However, having returned to a standard build with modular customizations, we are now better equipped to submit our customizations back to the community. Our software for build automation is the Chef tool, a Ruby-based framework that encapsulates a multitude of common deployment functions like writing and templating files, managing users and permissions, and enabling services. For our virtualization infrastructure, we started on OpenStack and have recently migrated to VMware. In this talk, we will recount experiences with systems and customers during our lengthy transition to automation and virtualization, and conclude with some recent success stories about production DSpace deployments.
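    As one small illustration of the deployment functions mentioned above, here is a sketch, in Python rather than the Chef (Ruby) tooling the talk describes, of templating a per-server configuration file; the property names and paths are assumptions, not the production cookbook.

        # Render a dspace.cfg-style file from per-server parameters, the kind
        # of file templating a deployment recipe automates.
        from pathlib import Path
        from string import Template

        DSPACE_CFG = Template(
            "dspace.dir = $dspace_dir\n"
            "dspace.hostname = $hostname\n"
            "db.url = jdbc:postgresql://$db_host:5432/dspace\n"
        )

        def render_config(out_path, **params):
            Path(out_path).write_text(DSPACE_CFG.substitute(params))

        render_config("/tmp/dspace.cfg", dspace_dir="/dspace",
                      hostname="repo.example.edu", db_host="db.example.edu")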
  • Item
    Using the DuraCloud REST interface
    (2015-10-01) Bolton, Michael
    This presentation covered use of the DuraCloud REST API, both from the command line via curl and from Perl programs.
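    A minimal sketch of the same style of call from Python; the host, credentials, and the /durastore/spaces path follow the general pattern of DuraCloud's REST interface but should be treated as assumptions here.

        # List the spaces in a DuraCloud account over its REST interface.
        import requests

        resp = requests.get(
            "https://example.duracloud.org/durastore/spaces",  # hypothetical host
            auth=("user@example.edu", "secret"),               # hypothetical credentials
            timeout=30,
        )
        resp.raise_for_status()
        print(resp.text)  # XML listing of the account's spaces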
  • Item
    Mapping Text: Automated Geoparsing and Map Browser for Electronic Theses and Dissertations
    (2013-07-24) Weimer, Katherine H.; Creel, James; Modala, Naga Raghuveer; Gargate, Rohit
  • Item
    Creating and Evaluating Metadata for a Large Legacy Thesis Collection: From 'Vocational Agriculture' (1922) to 'Microemulsion-mediated syntheses' (2004)
    (2013-05-23) Potvin, Sarah; Creel, James Silas
    In the summer of 2012, Texas A&M University Libraries uploaded more than 16,000 retrospectively digitized master's-level theses, dating from 1922 to 2004, into our DSpace institutional repository. Item records for the Retrospective Theses collection were created by mapping existing MARC records, then transforming and enhancing the metadata. Records included fields encoded in our Qualified Dublin Core schema as well as in the custom Thesis schema developed by the TDL member consortium. MODS metadata records were also generated and stored as bitstreams.
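    As a hedged sketch of the mapping step, the fragment below pulls a few fields from MARC records with Python's pymarc and emits Dublin Core-style key/value pairs; the filename and the exact field choices are illustrative, not the production crosswalk.

        # Map a few MARC fields to Dublin Core-style elements.
        from pymarc import MARCReader

        def marc_to_qdc(record):
            qdc = {}
            for f in record.get_fields("245"):   # title statement
                qdc["dc.title"] = " ".join(f.get_subfields("a", "b"))
            for f in record.get_fields("100"):   # main entry, personal name
                qdc.setdefault("dc.creator", []).append(f.format_field())
            for f in record.get_fields("650"):   # topical subject headings
                qdc.setdefault("dc.subject", []).append(
                    " -- ".join(f.get_subfields("a", "x")))
            return qdc

        with open("theses.mrc", "rb") as fh:     # hypothetical batch of records
            for record in MARCReader(fh):
                print(marc_to_qdc(record))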
  • Item
    Adding OAI-ORE Support to Repository Platforms
    (Open Repositories Conference 2009, 2009-05-17) Maslov, Alexey; Mikeal, Adam; Phillips, Scott; Leggett, John; McFarland, Mark
    The Texas Digital Library is a cooperative initiative of Texas universities. One of TDL’s core services is a federated collection of ETDs from its member schools. As this collection grew, the need for tools to manage the content exchange from the local to the federated repository became evident. This paper presents our experiences adding harvesting support to the DSpace repository platform using the ORE and PMH protocols from the Open Archives Initiative. We describe our use case for a statewide ETD repository and the mapping of the OAI-ORE data model to the DSpace architecture. We discuss our implementation that adds both dissemination and harvesting functionality to the repository. We conclude by discussing the architectural flexibility added to the TDL repository through this project.
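    To sketch the harvesting side in Python: the request below is a standard OAI-PMH ListRecords call; the endpoint URL is hypothetical, and the "ore" metadataPrefix assumes the repository advertises an ORE serialization (a ListMetadataFormats request reports what is actually offered).

        # One OAI-PMH ListRecords request; print each harvested record's identifier.
        import requests
        import xml.etree.ElementTree as ET

        OAI = "http://repository.example.edu/oai/request"  # hypothetical endpoint
        resp = requests.get(OAI, params={"verb": "ListRecords",
                                         "metadataPrefix": "ore"}, timeout=60)
        resp.raise_for_status()

        ns = {"oai": "http://www.openarchives.org/OAI/2.0/"}
        root = ET.fromstring(resp.content)
        for header in root.iter("{http://www.openarchives.org/OAI/2.0/}header"):
            print(header.findtext("oai:identifier", namespaces=ns))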
  • Item
    ETD Management in the Texas Digital Library: Lessons Learned from a Demonstrator
    (2008-07-24) Mikeal, Adam
    As a consortium of libraries from public and private institutions across the state of Texas, the Texas Digital Library (TDL) exists to promote the scholarly activities of its members. One of its earliest initiatives was a federated collection of ETDs from across the state. There are currently 16 participating schools in TDL, four of which are contributing over 4,000 ETDs per year, and membership and contributions are growing. A diverse set of content contributors introduces the problems of inconsistent metadata and incompatible storage and access methods, making it difficult to offer effective tools and services. This influenced the decision to create a state-wide system for managing the entire life-cycle of ETDs, from the point of ingestion to final publication; pooling resources to address this common problem was appealing for both technical and economic reasons. In 2007, we reported on the status of the functional system prototype. This paper reports on the results of the demonstrator event that took place in spring 2008 at Texas A&M University and the University of Texas, and discusses the requirements for moving to a production environment. These include testing and scaling the system to handle large numbers of users dispersed over a significant geographic area (Texas is the third-largest producer of PhDs in the US). Our intention is to embrace international standards for ETD metadata and policies as they continue to evolve through community efforts, such as the NDLTD union catalog of ETDs. Finally, we will examine the status of the project's release as an add-on component to a DSpace repository through the Manakin interface framework under an open-source license. A primary design goal of this project is to create a product that satisfies TDL's requirements and provides a turnkey implementation for ETD management and publication that can be scaled for the broader academic community.
  • Item
    De-facing DSpace with Manakin
    (2007-07-16) Phillips, Scott; Green, Cody; Maslov, Alexey; Mikeal, Adam; Leggett, John
  • Item
    DSpace XML UI Project Technical Overview
    (2007-07-16) Phillips, Scott; Maslov, Alexey; Leggett, John; Mikeal, Adam
    This paper describes the modifications to DSpace by Texas A&M Libraries to support an XML-based user interface. DSpace supports digital repositories composed of communities and collections. Each community within DSpace typically represents an organizational unit within an institution. To increase the appeal of DSpace as a digital repository to these communities, this project enables the establishment of a unique look and feel that might extend outside of DSpace into an existing web presence. We believe this may increase the adoption of DSpace by these communities.
  • Item
    Manakin Themes: customizing the look-and-feel of DSpace
    (2007-07-16) Maslov, Alexey; Green, Cody; Mikeal, Adam; Phillips, Scott; Leggett, John
  • Item
    Introducing Manakin: Overview and Architecture
    (2007-07-16) Phillips, Scott; Green, Cody; Maslov, Alexey; Mikeal, Adam; Leggett, John
  • Item
    Manakin Case Study: visualizing geospatial metadata and complex items
    (2007-07-16) Mikeal, Adam; Green, Cody; Maslov, Alexey; Phillips, Scott; Weimer, Kathy; Leggett, John
  • Item
    Preserving the Scholarly Side of the Web
    (2007-07-12) Mikeal, Adam; Green, Cody; Maslov, Alexey; Phillips, Scott; Leggett, John
    This paper presents results of a case study that addresses many issues surrounding the difficult task of preservation in a digital library. We focus on a subset of these issues as they apply to the preservation of scholarly articles encoded in current web standards. We also describe the two common preservation mechanisms, emulation and migration, as well as our selection of the latter for our particular case. Finally, we compare two approaches to migration, automatic and manual, and discuss their strengths and weaknesses in our context. We show that consistent use of open standards leads to more efficient migration processes and issue a “call to arms” to the digital preservation community to ensure that scholarly material currently on the web can be preserved for future generations.
  • Item
    Developing a Common Submission System for ETDs in the Texas Digital Library
    (2007-07-12) Mikeal, Adam; Brace, Tim; Leggett, John; McFarland, Mark; Phillips, Scott
    The Texas Digital Library is a consortium of universities organized to provide a single digital infrastructure for the scholarly activities of Texas universities. The four current Association of Research Libraries (ARL) universities and their systems comprise more than 40 campuses, 375,000 students, 30,000 faculty, and 100,000 staff, while non-ARL institutions represent another sizable addition in both students and faculty. TDL's principal collection is currently its federated collection of ETDs from three of the major institutions: The University of Texas, Texas A&M University, and Texas Tech University. Since the ARL institutions in Texas alone produce over 4,000 ETDs per year, the growth potential for a single state-wide repository is significant. To facilitate the creation of this federated collection, the schools agreed upon a common metadata standard represented by a MODS XML schema. Although this creates a baseline for metadata consistency, ambiguity in the interpretation of the schema creates usability and interoperability challenges. Name resolution issues are not addressed by the schema, and certain descriptive metadata elements need consistency in format and level of significance so that common repository functionality will operate intuitively across the collection. It was determined that a common ingestion point for ETDs was needed to collect metadata in a consistent, authoritative manner. A working group was formed consisting of representatives from five universities, and a statewide survey of ETD practices was conducted, with varied levels of engagement reported. Many issues were identified, including policy questions such as open-access publishing, copyright considerations, and the collection of release authorizations; the role of infrastructure development, such as a Shibboleth federation for authentication; and interoperability with third-party publishers such as UMI. ETD workflows at six schools were analyzed, and a meta-workflow was identified with three stages: ingest, verification, and publication. It was decided that Shibboleth would be used for authentication and identity management within the application. This paper reports on the results of the survey and describes the system and submission workflow that was developed as a consequence. A functional prototype of the ingest stage has been built, and a full prototype with Shibboleth integration is slated for completion in May of 2007. Demonstrators of the application are expected to be deployed in fall of 2007 at three schools.
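    Since the agreed-upon standard is a MODS XML schema, here is a minimal sketch of emitting a MODS record with Python's standard library; the element subset and sample values are illustrative, not the TDL ETD-MODS profile itself.

        # Build a minimal MODS record: a title and one author name.
        import xml.etree.ElementTree as ET

        MODS = "http://www.loc.gov/mods/v3"
        ET.register_namespace("mods", MODS)

        def make_mods(title, author):
            mods = ET.Element(f"{{{MODS}}}mods")
            title_info = ET.SubElement(mods, f"{{{MODS}}}titleInfo")
            ET.SubElement(title_info, f"{{{MODS}}}title").text = title
            name = ET.SubElement(mods, f"{{{MODS}}}name")
            ET.SubElement(name, f"{{{MODS}}}namePart").text = author
            return mods

        rec = make_mods("A Study of Something", "Doe, Jane")  # hypothetical ETD
        print(ET.tostring(rec, encoding="unicode"))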