
    Schriften des Instituts für Dokumentologie und Editorik, Band 2

    Kodikologie und Paläographie im digitalen Zeitalter

    Codicology and Palaeography in the Digital Age

    herausgegeben von | edited by

    Malte Rehbein, Patrick Sahle, Torsten Schaßan

    unter Mitarbeit von | in collaboration with

    Bernhard Assmann, Franz Fischer, Christiane Fritze

    2009, BoD, Norderstedt

    Bibliographic information of the Deutsche Nationalbibliothek: The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available on the Internet at http://dnb.d-nb.de/.

    Leicht veränderte Fassung für die digitale Publikation (siehe Vorwort).

    Slightly modified version to be published digitally (see preface).

    Published with the support of Apices (Association Paléographique Internationale: Culture, Écriture, Société), Dotation J.M.M. Hermans. http://www.palaeographia.org/apices/

    2009

    Production and publisher: Books on Demand GmbH, Norderstedt
    ISBN: 978-3-8370-9842-6
    Cover design: Katharina Weber
    Typesetting: XeTeX and Bernhard Assmann


    Kodikologie und Paläographie im Digitalen Zeitalter / Codicology and Palaeography in the Digital Age. Hrsg. Malte Rehbein, Patrick Sahle und Torsten Schaßan, unter Mitarbeit von Bernhard Assmann, Franz Fischer und Christiane Fritze. Schriften des Instituts für Dokumentologie und Editorik 2. Norderstedt: Books on Demand, 2009. 79–95.

    Representation and Encoding of Heterogeneous Data in a Web Based Research Environment for Manuscript and Textual Studies

    Daniel Deckers, Lutz Koch, Cristina Vertan

    Abstract

    This paper describes the general architecture of a digital research environment for manuscript and textual studies (particularly those pertaining to ancient Greek and Byzantine texts), and it discusses some questions of data representation and encoding in the framework of such an online research platform. The platform is being developed by the project "Teuchos. Zentrum für Handschriften- und Textforschung", established in 2007 by the Institut für Griechische und Lateinische Philologie (Universität Hamburg) in cooperation with the Aristoteles-Archiv (Freie Universität Berlin). Teuchos is a long-term infrastructural project of the Universität Hamburg. It is currently in its three-year initial phase, which is being co-funded by the German Research Foundation (DFG) through the "Thematic Information Networks" scheme within the "Scientific Library Services and Information Systems" programme. We introduce the main object types to be handled by our system and describe the overall functionality of the online platform. The paper focuses on the representations of two main object types: manuscripts as textual witnesses and watermarks, with an emphasis on the former. Since the adequate encoding of the different layers of structure of a transmitted text largely determines how flexibly users can navigate both the digital images of the manuscripts and the transcriptions of the texts they contain, this topic is discussed in some detail. We introduce the formal data model and the corresponding encoding for the object types discussed. The project encodes textual data in XML, aiming for TEI conformance where possible. Since no accepted XML model exists for the encoding of metadata within a watermark collection, we briefly explain how we chose to model the objects to accommodate the collections the project is making accessible.

    Summary (Zusammenfassung)

    This paper describes the overall architecture of a digital working environment for manuscript and textual studies (in particular for ancient Greek and Byzantine texts) and approaches to some problems of data representation and encoding within such an online platform. The platform is being developed by the project "Teuchos. Zentrum für Handschriften- und Textforschung", founded in 2007 at the Institut für Griechische und Lateinische Philologie (Universität Hamburg), in cooperation with the Aristoteles-Archiv (Freie Universität Berlin). Teuchos is set up as a long-term infrastructural facility of the Universität Hamburg and is currently in its three-year start-up phase, which is supported by start-up funding from the Deutsche Forschungsgemeinschaft (DFG) in the "Thematic Information Networks" programme of the funding instrument "Scientific Library Services and Information Systems" (LIS). We first introduce the most important types of objects used by the system and then describe the overall functionality of the platform. The focus is on the representation of two central object types: manuscripts in their function as textual witnesses, and watermarks, with manuscripts treated in greater detail. In particular, we discuss a suitable encoding model for the different structural levels of texts transmitted in manuscripts, since such a model is central to offering users versatile ways of navigating both the digital images of the text-bearing manuscripts and the transcriptions of the texts they contain. Formal data models and the corresponding encoding for the object types discussed are presented briefly. Within the Teuchos project, textual data are encoded in XML, TEI-conformant as far as possible; since no established XML model exists for encoding metadata within a watermark collection, the object modelling for the collections made available online by the project is also outlined.

    1 Introduction

    This paper briefly describes the general architecture of a digital research environment for manuscript and textual studies and discusses some questions of data representation and encoding. The project "Teuchos. Zentrum für Handschriften- und Textforschung" was initiated in 2007 by the Institut für Griechische und Lateinische Philologie (Universität Hamburg) in cooperation with the Aristoteles-Archiv (Freie Universität Berlin). Teuchos is a long-term infrastructural project of the Universität Hamburg that is currently in its three-year initial phase, which is being co-funded by the German Research Foundation (DFG) through the "Thematic Information Networks" scheme within the "Scientific Library Services and Information Systems" programme.

    In its final form Teuchos is to provide a web based research environment suited for manuscript and textual studies, offering tools for the capture, exchange and collaborative editing of primary philological data. The data shall be made accessible to the scholarly community as primary or raw data in order to be reusable as source material for various individual or collaborative research projects. This objective entails an open access policy using Creative Commons licenses for the content generated and published by means of the platform (digital images of manuscripts in particular may have to be handled restrictively, depending on the holding institutions' policies). The software developed in the course of the project will be made available under free open source licensing as a contribution to the evolving diversity of digital humanities tools and applications.

    Distinctive features of the Teuchos platform are the integration of heterogeneous research data (cf. § 2) and the participation of different user groups in the generation and enhancement of the content. The system as a whole is geared to the needs and preferences of specialised research (rather than to the presentation of library treasures to a wider public). The following use cases are foreseen:

    - Provision of data facilitating the use of digitised manuscripts (created and shared by different user groups), ranging from structural information regarding the intellectual content of the manuscript to transcriptions containing indications of variant readings.

    - Provision of digitised manuscripts accompanied by (partial) transcriptions, both as a basis for further editorial work and to make core information on the content and the manuscript tradition available (and citable) to the scholarly community at the same early stage. In certain cases this first step may be the only one that will be taken; in the other cases, rendering the separate stages of the editorial process verifiable may also be considered a methodological improvement for textual studies in general.

    - Collaboration of networked researchers independent of time and space as a prerequisite for the analysis and utilisation of special materials, e.g. domain-specific texts and inaccessible or damaged sources such as palimpsests.

    - An evolving collection of manuscript descriptions gives access to detailed information on codicology, manuscript history and textual transmission. This material derives from autoptic library studies and is thus often inherently sporadic and disjointed; on the other hand, the collection is independent of library cataloguing projects and open to the collaboration of researchers worldwide who contribute according to their respective fields of expertise and/or their serendipitous findings.

    - A flexible model allows for the integration of manuscript descriptions of varying depth. A substantial amount of material taken from both published and unpublished materials of the Aristoteles Graecus offers a model for comprehensive and highly structured descriptions. The complex relationships between manuscripts belonging to a coherent textual tradition, e.g. common sources, scribes, owners, annotators etc., offer multiple possibilities of inter-linkage, which may be used for individual exploration and eventually for automated analysis of such a cor