
=**Annotation manual for the Parsed Corpus of Early New High German.**=
====Created and maintained by Caitlin Light, University of Pennsylvania.====

Welcome to the documentation for the Parsed Corpus of ENHG, which currently consists of over 100,000 words of fully parsed text from Martin Luther's first translation of the New Testament, the //Septembertestament// (published 1522). The text parsed so far includes the books of Matthew, Mark and John, and I am now in the process of parsing Acts. **Version 0.5 of the corpus is now available for download below.** **The corpus is released under a free and open source license (LGPL) and there is no registration wall.** **Please contact me about any errors you find so that I may continue to improve the corpus, and make sure to cite the corpus in any published work.** The corpus is designed in particular to be searchable with the CorpusSearch software developed by Beth Randall, although other software may serve the same purpose.


**If you use this corpus for any published research, please do not forget to cite it.**



The main purpose of this site is to provide annotation guidelines for the corpus. This annotation manual is not meant as a stand-alone document, but as an extension and revision of the annotation guidelines for the Penn Historical Corpora of English, adapted to suit a historical German corpus. It is intended exclusively to document the differences between the English and German parsing guidelines.

The Parsed Corpus of ENHG, and this annotation manual, were created in collaboration with the Icelandic Parsed Historical Corpus (IcePaHC) project, currently in development. A preview of IcePaHC is available for free download at the project website. Together with the Penn Parsed Corpora of Historical English, we hope to initiate a set of parallel New Testament corpora for comparative syntactic and information-structural research. IcePaHC contains a sample of the New Testament translation by Oddur Gottskálksson, printed 1540; the Penn-Helsinki Parsed Corpus of Early Modern English (PPCEME) contains a sample of the Tyndale New Testament, printed 1534. Both are strongly influenced by Luther's New Testament, and thus the Bible samples in the ENHG, PPCEME and IcePaHC corpora represent a natural locus of comparison.

The text used to create this corpus was acquired from Wikisource.