All of the material included here was prepared by Prof. Larry Benson as part of a larger Glossarial DataBase of Middle English. It was subsequently marked up according to the TEI (P3) Guidelines using the analytical markup section. Considerable work on this version of the DataBase is still necessary, including linking in Prof. Benson's "dictionary" and adding a variety of searches. If you have questions about the collection, please contact dlps-help@umich.edu. If you have concerns about the inclusion of an item in this collection, please contact LibraryIT-Info@umich.edu.
A sample line is included below to illustrate the markup of the text.
<DIV0 TYPE="frag" N="Frag1"> <HEAD>Fragment I, CT</HEAD> <DIV1 TYPE="CT" N="GP"> <HEAD>General Prologue</HEAD> <L N="GP:1"> <W LEMMA="whan" ANA="ADVC">Whan</W> <W TYPE="gram" LEMMA="that" ANA="CONJ">that</W> <W LEMMA="april" ANA="NOUN">Aprill</W> <W LEMMA="with" ANA="PREP">with</W> <W TYPE="infl" LEMMA="his" ANA="P1GN">his</W> <W TYPE="infl" LEMMA="shour" ANA="NPL0">shoures</W> <W LEMMA="sote" ANA="ADJ0">soote</W> </L>
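As a rough illustration of how the word-level markup can be read, the following Python sketch pulls the form, LEMMA, ANA, and TYPE values out of the sample line. It is a minimal sketch only: the function name parse_words and the regular expressions are assumptions made for this example, not part of the DataBase, and a real SGML parser would be more robust than pattern matching.

```python
import re

# The <L> element from the sample line shown above.
SAMPLE = (
    '<L N="GP:1"> <W LEMMA="whan" ANA="ADVC">Whan</W> '
    '<W TYPE="gram" LEMMA="that" ANA="CONJ">that</W> '
    '<W LEMMA="april" ANA="NOUN">Aprill</W> '
    '<W LEMMA="with" ANA="PREP">with</W> '
    '<W TYPE="infl" LEMMA="his" ANA="P1GN">his</W> '
    '<W TYPE="infl" LEMMA="shour" ANA="NPL0">shoures</W> '
    '<W LEMMA="sote" ANA="ADJ0">soote</W> </L>'
)

# Hypothetical helper patterns: one <W> element, and one NAME="value" attribute.
W_ELEMENT = re.compile(r'<W\b([^>]*)>([^<]*)</W>')
ATTRIBUTE = re.compile(r'(\w+)="([^"]*)"')


def parse_words(line):
    """Return one dict per <W> element found in a marked-up line."""
    words = []
    for attr_text, form in W_ELEMENT.findall(line):
        attrs = dict(ATTRIBUTE.findall(attr_text))
        words.append({
            "form": form,
            "lemma": attrs.get("LEMMA"),
            "ana": attrs.get("ANA"),
            "type": attrs.get("TYPE"),  # None when no TYPE attribute is present
        })
    return words


for word in parse_words(SAMPLE):
    print(word["form"], word["lemma"], word["ana"], word["type"])
```

Words without a TYPE attribute come back with a type of None, which matches the observation that most words carry no such marking.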
Some words are marked, using the TEI "lang" attribute, as LAT (Latin), FR (French), or GR (Greek). Remaining words are indexed as English and show in tables as EN.
Words are sometimes marked (using the TEI "type" attribute) as infl (inflected), gram (grammatical), or infl+gram (a combination of both). Most words are not marked for Form. Benson describes grammatical words and his practice of marking them up as follows:
Grammatical Words
A number of words are labelled "grammatical": e.g. al as an adjective is labelled: al@al_gram#adj. This departs from the usual practice of following the MED's head word (which in this case follows the # rather than the @); it was adopted to prevent syntactic searches for, say, adjective+noun from bringing up this+noun, swich+noun, etc. and to clarify syntactic patterns (@this_gram+@n rather than @adj+@n).
If this proves unsatisfactory it can easily be changed. Note that by no means are all the grammatical words in Chaucer so labelled. This practice was adopted only for those words that were numerous enough and used often enough as nouns and adjectives to interfere with searches for combinations of fully semantic words.
The following words are labelled "grammatical":
al ani another ech everi ilke mani no non other same som swich that these thilke this tho tother what which
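To make the labelling convention concrete, here is a small Python sketch that splits a label such as al@al_gram#adj at the "@" and "#" characters and checks it against the word list above. The label syntax is inferred from the single example in Benson's note, so the names split_label, is_grammatical_label, and the three part names are assumptions for illustration rather than documented terms.

```python
# The words listed above as labelled "grammatical".
GRAMMATICAL_WORDS = frozenset(
    "al ani another ech everi ilke mani no non other same som swich that "
    "these thilke this tho tother what which".split()
)

GRAM_SUFFIX = "_gram"


def split_label(label):
    """Split a label like 'al@al_gram#adj' at '@' and '#'.

    The part names are placeholders; the source does not name them.
    """
    before_at, _, rest = label.partition("@")
    middle, _, after_hash = rest.partition("#")
    return before_at, middle, after_hash


def is_grammatical_label(label):
    """True when the part between '@' and '#' ends in '_gram'."""
    _, middle, _ = split_label(label)
    return middle.endswith(GRAM_SUFFIX)


parts = split_label("al@al_gram#adj")
print(parts)
print(is_grammatical_label("al@al_gram#adj"))
# The word before the suffix should be one of the listed grammatical words.
print(parts[1][: -len(GRAM_SUFFIX)] in GRAMMATICAL_WORDS)
```

A check of this kind is one way a search could keep this+noun or swich+noun out of the results for adjective+noun.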
Codes used for the ana attribute on each word are listed below. In some cases, the prose description from Prof. Benson is provided. In other cases, his code (e.g., adv#interj) is provided instead. Parenthetical numbers indicate the number of occurrences of each code in the Canterbury Tales.
A1NS
ABBR
ADJ1
ADJ2
ADJ3
ADJ0
ADJC
ADJE
ADJI
ADJN
ADJP
ADJS
ADJV
ADV1
ADV2
ADV0
ADVC
ADVI
ADVJ
ADVN
ADVP
ADVS
AJ2C
AJ1N
AJ1S
AJNP
AJNS
AJPL
AV-J
CNJ1
CONJ
DEFA
G1PL
G2PL
GER1
GER2
GER3
GER0
GERA
GRPL
IDFA
IN12
INTJ
INTN
JVNC
N1AJ
N2AJ
N5AJ
N1GN
N2GN
N3GN
N5GN
N1IN
N1NG
N2NG
N2NO
N1PG
N1PL
N2PL
N3PL
N4PL
N6PL
NADJ
NADV
NAPL
NGAB
NGEN
NINT
NNON
NOU1
NOU2
NOU3
NOU4
NOU6
NOU8
NOUN
NPL0
NPLG
NPRG
NPRP
NSUP
NUAJ
NUAV
NUM0
NUMA
NUMC
NUMN
NUMR
P1FG
P2FI
P2FJ
P2FO
P1GN
P2GN
P1IN
P1NM
P2OJ
P2PL
PR1J
PART
PN1I
PN4J
PNAJ
PPAB
PPAJ
PPJN
PPL0
PRAB
PRAJ
PRAV
PRCO
PRCT
PREP
PRGA
PRGN
PRIN
PRN1
PRN2
PRN3
PRN4
PRN5
PRN6
PRN7
PRN8
PRNM
PRNO
PROJ
PRON
PRP2
PRPG
PRPJ
PRPL
PRPN
PRPO
V1IM
V2IM
V3IM
V4IM
V1IN
V2IN
V3IN
V4IN
V8IN
V1JP
V2JP
V1JR
V2JR
V1PA
V2PA
V3PA
V4PA
V1PE
V1PL
V2PL
V3PL
V4PL
V1PN
V1PP
V2PP
V3PP
V4PP
V1PR
V2PR
V3PR
V4PR
V1PT
V2PT
V3PT
V4PT
V1TN
V1TR
V12I
V12L
V12P
V12R
VAJP
VAJR
VAVR
VERB
VIM0
VINF
VPPA
VPPL
VPPN
VPPR
VPR0
VPRA
VPRL
VPRN
VPRP
VPT0
VPTL
VPTN
VTLN
VTPR
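Occurrence counts for the codes above could in principle be tallied directly from the marked-up text. The Python sketch below shows one possible approach, using three words from the sample line as input; the function name count_ana_codes and the regular expression are assumptions for this example, and the counts it prints apply only to that short excerpt, not to the Canterbury Tales as a whole.

```python
import re
from collections import Counter

# Hypothetical pattern: capture the ANA value of each <W> start tag.
ANA_ATTRIBUTE = re.compile(r'<W\b[^>]*\bANA="([^"]*)"')


def count_ana_codes(markup):
    """Count how often each ANA code appears on <W> elements."""
    return Counter(ANA_ATTRIBUTE.findall(markup))


# Three words taken from the sample line of the General Prologue.
EXCERPT = (
    '<W LEMMA="whan" ANA="ADVC">Whan</W> '
    '<W TYPE="gram" LEMMA="that" ANA="CONJ">that</W> '
    '<W TYPE="infl" LEMMA="shour" ANA="NPL0">shoures</W>'
)

counts = count_ana_codes(EXCERPT)
for code in sorted(counts):
    print(code, counts[code])
```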