This chapter is organized as follows. Following this introduction, section 16.2, Elementary Feature Structures: Features with Binary Values, introduces the binary feature values and shows how elementary feature structures using features with those values may be constructed. Section 16.3, Feature, Feature-Structure and Feature-Value Libraries, introduces the tags that represent libraries of features, feature structures and feature values, along with methods for pointing at features, feature structures and feature values in these libraries. Section 16.4, Symbolic, Numeric, Measurement, Rate and String Values, presents the tags for symbolic, numeric, measurement, rate, and string values. Section 16.5, Structured Values, shows how to use feature structures themselves as values, thus enabling feature structures to be recursively defined. Section 16.6, Singleton, Set, Bag and List Collections of Values, demonstrates the use of multiple values for features, for encoding set, bag, and list collections of values. Section 16.7, Alternative Features and Feature Values, presents various methods for representing alternations (disjunctions) of features and feature values. Section 16.8, Boolean, Default and Uncertain Values, presents tags for boolean, default, and uncertain values, along with methods for underspecifying feature values. Section 16.9, Indirect Specification of Values Using the REL Attribute, shows how to specify various logical relations, such as negation and subsumption, between the expressed values for a feature and its actual values. Finally, section 16.10, Two Illustrations, illustrates how feature structures may be linked to text elements.
This tag set is selected as described in 3.3, Invocation of the TEI DTD; in a document which uses the markup described in this chapter, the document type declaration should contain the following declaration of the entity TEI.fs, or an equivalent one:
<!ENTITY % TEI.fs 'INCLUDE'>
The entire document type declaration for a document using this additional tag set together with the base tag set for prose might look like this:
<!DOCTYPE TEI.2 system 'tei2.dtd' [
<!ENTITY % TEI.prose 'INCLUDE' >
<!ENTITY % TEI.fs 'INCLUDE' >
]>
The overall document type declaration for this additional tag set has the following structure:
<!-- 16.1: Feature Structures -->
<!-- Text Encoding Initiative: Guidelines for Electronic -->
<!-- Text Encoding and Interchange. Document TEI P3, 1994. -->
<!-- Copyright (c) 1994 ACH, ACL, ALLC. Permission to copy -->
<!-- in any form is granted, provided this notice is -->
<!-- included in all copies. -->
<!-- These materials may not be altered; modifications to -->
<!-- these DTDs should be performed as specified in the -->
<!-- Guidelines in chapter "Modifying the TEI DTD." -->
<!-- These materials subject to revision. Current versions -->
<!-- are available from the Text Encoding Initiative. -->
<!-- ... declarations from section 16.2 -->
<!-- (Feature structures, binary values) -->
<!-- go here ... -->
<!-- ... declarations from section 16.3 -->
<!-- (Feature libraries) -->
<!-- go here ... -->
<!-- ... declarations from section 16.4 -->
<!-- (Symbolic, etc. values) -->
<!-- go here ... -->
<!-- ... declarations from section 16.6 -->
<!-- (Null values) -->
<!-- go here ... -->
<!-- ... declarations from section 16.7 -->
<!-- (Alternative features and feature values) -->
<!-- go here ... -->
<!-- ... declarations from section 16.8 -->
<!-- (Boolean, default, uncertainty values) -->
<!-- go here ... -->
This section considers the special case of feature structures that contain features whose single value is one of the binary values represented by the empty elements <plus> and <minus>. The elements which are used for representing feature structures, features and the binary values, along with their descriptions and attributes, are the following.
<fs>
    type
    feats
    rel (eq | ne | sb | ns)
<f>
    name
    org (single | set | bag | list)
    fVal
    rel (eq | ne | sb | ns)
<plus>
<minus>
An <fs> element containing <f> elements with binary values can be straightforwardly used to encode the matrices of feature-value specifications for phonetic segments, such as the following for the English segment [s].
+---               ---+
|   + consonantal     |
|   - vocalic         |
|   - voiced          |
|   + anterior        |
|   + coronal         |
|   + continuant      |
|   + strident        |
+---               ---+
Using the additional tag set for feature structures, this might be encoded as follows. Note that <fs> elements may have a type attribute indicating the kind of feature structure in question.
<fs type='phonological segment'>
   <f name=consonantal><plus></f>
   <f name=vocalic>    <minus></f>
   <f name=voiced>     <minus></f>
   <f name=anterior>   <plus></f>
   <f name=coronal>    <plus></f>
   <f name=continuant> <plus></f>
   <f name=strident>   <plus></f>
</fs>
The restriction of specific features to specific types of values (e.g. the restriction of the feature strident to the values <plus> or <minus>) cannot be validated by an SGML parser. To enable an application program to check that only legal values for particular features appear, one may write a feature-system declaration; see chapter 26, Feature System Declaration.
Here are the formal declarations of the <fs>, <f>, <plus> and <minus> elements.
<!-- 16.2: Feature structures, binary values -->
<!ELEMENT fs - - ((f | fAlt | alt)*) >
<!ATTLIST fs %a.global;
type CDATA #IMPLIED
feats IDREFS #IMPLIED
rel (eq | ne | sb | ns) sb >
<!ELEMENT f - O (null | (plus | minus | any | none
| dft | uncertain | sym | nbr |
msr | rate | str | vAlt | alt |
fs)*) >
<!ATTLIST f %a.global;
name NMTOKEN #REQUIRED
org (single | set | bag | list)
#IMPLIED
rel (eq | ne | sb | ns) eq
fVal IDREFS #IMPLIED >
<!ELEMENT plus - O EMPTY >
<!ATTLIST plus %a.global; >
<!ELEMENT minus - O EMPTY >
<!ATTLIST minus %a.global; >
<!-- This fragment is used in sec. 16.1 -->
<fLib>
    type
<fsLib>
    type
<fvLib>
    type
For example, suppose a feature library for phonological feature specifications is set up as follows.
<fLib type='phonological features'>
   <f id=cns1 name=consonantal><plus> </f>
   <f id=cns0 name=consonantal><minus></f>
   <f id=voc1 name=vocalic>    <plus> </f>
   <f id=voc0 name=vocalic>    <minus></f>
   <f id=voi1 name=voiced>     <plus> </f>
   <f id=voi0 name=voiced>     <minus></f>
   <f id=ant1 name=anterior>   <plus> </f>
   <f id=ant0 name=anterior>   <minus></f>
   <f id=cor1 name=coronal>    <plus> </f>
   <f id=cor0 name=coronal>    <minus></f>
   <f id=cnt1 name=continuant> <plus> </f>
   <f id=cnt0 name=continuant> <minus></f>
   <f id=str1 name=strident>   <plus> </f>
   <f id=str0 name=strident>   <minus></f>
   <!-- ... -->
</fLib>
Then the feature structures that represent the analysis of the phonological segments (phonemes) /t/, /d/, /s/ and /z/ can be defined as follows.
<fs feats='cns1 voc0 voi0 ant1 cor1 cnt0 str0'></fs>
<fs feats='cns1 voc0 voi1 ant1 cor1 cnt0 str0'></fs>
<fs feats='cns1 voc0 voi0 ant1 cor1 cnt1 str1'></fs>
<fs feats='cns1 voc0 voi1 ant1 cor1 cnt1 str1'></fs>
The preceding are but four of the 128 logically possible fully specified phonological segments using the seven binary features listed in the feature library. Presumably not all combinations of features correspond to phonological segments (there are no strident vowels, for example). The legal combinations, however, can be collected together in a feature-structure library, with each element being given a unique id attribute, as in the following example.
<fsLib id=fsl1 type='phonological segment definitions'>
   <!-- ... -->
   <fs id=t.df feats='cns1 voc0 voi0 ant1 cor1 cnt0 str0'></fs>
   <fs id=d.df feats='cns1 voc0 voi1 ant1 cor1 cnt0 str0'></fs>
   <fs id=s.df feats='cns1 voc0 voi0 ant1 cor1 cnt1 str1'></fs>
   <fs id=z.df feats='cns1 voc0 voi1 ant1 cor1 cnt1 str1'></fs>
   <!-- ... -->
</fsLib>
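As a rough illustration of how an application might resolve the feats pointers used above, the following Python sketch models the feature library as a dictionary keyed by id. The dictionary layout, the function name resolve_feats, and the decision to treat a repeated feature name as an error are assumptions made for this illustration; none of them is part of the tag set.

```python
# Hypothetical in-memory model of the <fLib> above: id -> (feature name, value).
# True stands for <plus>, False for <minus>.
NAMES = {
    "cns": "consonantal", "voc": "vocalic", "voi": "voiced", "ant": "anterior",
    "cor": "coronal", "cnt": "continuant", "str": "strident",
}
F_LIB = {
    f"{prefix}{digit}": (name, digit == 1)
    for prefix, name in NAMES.items()
    for digit in (0, 1)
}


def resolve_feats(feats):
    """Expand a space-separated feats value into a name -> value mapping."""
    structure = {}
    for ident in feats.split():
        name, value = F_LIB[ident]  # a KeyError signals a dangling pointer
        if name in structure:
            # Assumption of this sketch: the same feature may not occur twice.
            raise ValueError(f"feature {name!r} specified twice")
        structure[name] = value
    return structure


s_df = resolve_feats("cns1 voc0 voi0 ant1 cor1 cnt1 str1")
z_df = resolve_feats("cns1 voc0 voi1 ant1 cor1 cnt1 str1")

# Seven binary features give 2 ** 7 fully specified combinations.
combinations = 2 ** len(NAMES)
```

Under this model the definitions for /s/ and /z/ differ only in the value of the voiced feature, and the count of logically possible segments follows directly from the number of binary features in the library.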
Text elements can be linked to these feature structures in any of the ways described in section 15.2, Global Attributes for Simple Analyses . In the following example, a <linkGrp> element is used to link selected characters in the text Caesar seized control to their phonological representations.
<text id=txt1> <!-- ... -->
<body> <!-- ... -->
<s id=s1>
<w id=s1w1>
<c id=s1w1c1>C</c>ae<c id=s1w1c2>s</c>ar</w>
<w id=s1w2>
<c id=s1w2c1>s</c>ei<c id=s1w2c2>z</c>e<c id=s1w2c3>d</c></w>
<w id=s1w3>
con<c id=s1w3c1>t</c>rol</w>.
</s>
<!-- ... -->
</body></text>
<fsLib id=fsl1 type='phonological segment definitions'>
<!-- as in previous example -->
</fsLib>
<linkGrp type='phonological identification of characters'
domains='fsl1 txt1'
targFunc='phonological.segment character'
extendArgs=repeat>
<!-- ... -->
<link id=lt targets='t.df s1w3c1'>
<link id=ld targets='d.df s1w2c3'>
<link id=ls targets='s.df s1w1c1 s1w2c1'>
<link id=lz targets='z.df s1w1c2 s1w2c2'>
<!-- ... -->
</linkGrp>
Because of the simplicity of the binary feature values, there is no particular gain in pointing at those values rather than specifying them directly. However, the mechanism of using the fVal attribute on <f> elements is useful for representing more complex feature values, and can be illustrated using binary values. Suppose the <plus> and <minus> elements are collected together in a <fvLib>, as follows.
<fvLib type='binary values'>
   <plus id=b1>
   <minus id=b0>
</fvLib>
Then the feature library presented at the beginning of this section can be represented as follows.
<fLib type='phonological features'>
   <f id=cns1 name=consonantal fVal=b1></f>
   <f id=cns0 name=consonantal fVal=b0></f>
   <f id=voc1 name=vocalic fVal=b1></f>
   <f id=voc0 name=vocalic fVal=b0></f>
   <f id=voi1 name=voiced fVal=b1></f>
   <f id=voi0 name=voiced fVal=b0></f>
   <f id=ant1 name=anterior fVal=b1></f>
   <f id=ant0 name=anterior fVal=b0></f>
   <f id=cor1 name=coronal fVal=b1></f>
   <f id=cor0 name=coronal fVal=b0></f>
   <f id=cnt1 name=continuant fVal=b1></f>
   <f id=cnt0 name=continuant fVal=b0></f>
   <f id=str1 name=strident fVal=b1></f>
   <f id=str0 name=strident fVal=b0></f>
   <!-- ... -->
</fLib>
Although <fs> elements are legitimate feature values (see section 16.5, Structured Values ), they are not allowed within <fvLib> elements. They should be placed in <fsLib> elements.
Here are the formal declarations of the <fLib>, <fsLib> and <fvLib> elements.
<!-- 16.3: Feature libraries -->
<!ELEMENT fLib - - ((f | fAlt)*) >
<!ATTLIST fLib %a.global;
type CDATA #IMPLIED >
<!ELEMENT fsLib - - ((fs | vAlt)*) >
<!ATTLIST fsLib %a.global;
type CDATA #IMPLIED >
<!ELEMENT fvLib - - ((plus | minus | any | none | dft
| uncertain | null | sym | nbr |
msr | rate | str | vAlt)*) >
<!ATTLIST fvLib %a.global;
type CDATA #IMPLIED >
<!-- This fragment is used in sec. 16.1 -->
<sym>
    value
    rel (eq | ne)
<nbr>
    value
    valueTo
    type (int | real)
    rel (eq | ne | lt | le | gt | ge)
<msr>
    unit
    value
    valueTo
    type (int | real)
    rel (eq | ne | lt | le | gt | ge)
<rate>
    unit
    per
    value
    valueTo
    type (int | real)
    rel (eq | ne | lt | le | gt | ge)
<str>
    rel (eq | ne | sb | ns | lt | le | gt | ge)
The <sym> element is to be used for the value of a feature when that feature can have any of a small, finite set of possible values, representable as character strings. For example, consider the problem of specifying the grammatical case, gender and number features of classical Greek noun forms. Assuming that the case feature can take on any of the five values nominative, genitive, dative, accusative and vocative; that the gender feature can take on any of the three values feminine, masculine, and neuter; and that the number feature can take on either of the values singular and plural, then the following may be used to represent the claim that the noun form theás ('goddesses') has accusative case, feminine gender and plural number.
<fs type='word structure'>
   <!-- ... -->
   <f name=case><sym value=accusative></f>
   <f name=gender><sym value=feminine></f>
   <f name=number><sym value=plural></f>
   <!-- ... -->
</fs>
Note that instead of using a symbolic value for grammatical number, one could have named the feature singular or plural and given it an appropriate binary value, as in the following example. Whether one uses a binary or symbolic value in situations like this is largely a matter of taste.
<fs type='word structure'>
   <!-- ... -->
   <f name=case><sym value=accusative></f>
   <f name=gender><sym value=feminine></f>
   <f name=singular><minus></f>
   <!-- ... -->
</fs>
An SGML validator by itself cannot determine that particular values do or do not go with particular features; in particular, it cannot distinguish between the presumably legal encodings in the preceding two examples and the presumably illegal encoding in the following example.
<!-- *PRESUMABLY ILLEGAL* ... -->
<fs type='word structure'>
   <!-- ... -->
   <f name=case><sym value=feminine></f>
   <f name=gender><sym value=accusative></f>
   <f name=number><minus></f>
   <!-- ... -->
</fs>
There are two ways of attempting to ensure that only legal combinations of feature names and values are used. First, if the total number of legal combinations is relatively small, one can simply list all of those combinations in <fLib> elements (together possibly with <fvLib> elements), and point to them using the feats attribute in the enclosing <fs> element. This method is suitable in the situation described above, since it requires specifying a total of only ten (5 + 3 + 2) combinations of features and values. Further, to ensure that the features are themselves combined legally into feature structures, one can put the legal feature structures inside <fsLib> elements. A total of 30 feature structures (5 x 3 x 2) is required to enumerate all the legal combinations of individual case, gender and number values in the preceding illustration. Of course, the legality of the markup requires that the feats attributes actually point at legally defined features, which an SGML validator, by itself, cannot guarantee.
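The two counts given above can be checked mechanically. The following Python sketch enumerates the feature-value pairs and the fully specified feature structures for the Greek noun illustration; the variable names and the use of dictionaries to stand in for <fs> elements are assumptions of the sketch.

```python
from itertools import product

CASES = ["nominative", "genitive", "dative", "accusative", "vocative"]
GENDERS = ["feminine", "masculine", "neuter"]
NUMBERS = ["singular", "plural"]

# One feature-library entry per feature-value pair: 5 + 3 + 2.
feature_entries = (
    [("case", value) for value in CASES]
    + [("gender", value) for value in GENDERS]
    + [("number", value) for value in NUMBERS]
)

# One feature structure per choice of case, gender and number: 5 x 3 x 2.
feature_structures = [
    {"case": case, "gender": gender, "number": number}
    for case, gender, number in product(CASES, GENDERS, NUMBERS)
]
```

The additive count applies to the feature library and the multiplicative count to the feature-structure library, which is why the second grows much faster as features are added.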
A more general method of attempting to ensure that only legal combinations of feature names and values are used is to provide a feature system declaration which includes a <valRange> element for each feature one uses. Here is a sample <valRange> element for the <f name=case> element described above; for further discussion of the <valRange> element, see chapter 26, Feature System Declaration ; the <vAlt> element is discussed in section 16.7, Alternative Features and Feature Values .
<!-- VALRANGE specification for CASE feature -->
<valRange>
<vAlt>
<sym value=nominative>
<sym value=genitive>
<sym value=dative>
<sym value=accusative>
<sym value=vocative>
</vAlt>
</valRange>
Similarly, to ensure that only legal combinations of features are used as the content of feature structures, one should provide <fsConstraint> elements for each of the types of feature structure one employs. For discussion of the <fDecl> and <fsConstraint> elements, see chapter 26, Feature System Declaration. Validation of the feature structures used in a document based on the feature-system declaration, however, requires that there be an application program that can use the information contained in the feature-system declaration.
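To suggest what such an application program might do, here is a minimal Python sketch of a value-range check. The table VAL_RANGES stands in for the information a feature-system declaration would supply, the function name check_structure is hypothetical, and the string "minus" is used as a stand-in for the <minus> element.

```python
# Hypothetical value ranges, standing in for <valRange> declarations.
VAL_RANGES = {
    "case": {"nominative", "genitive", "dative", "accusative", "vocative"},
    "gender": {"feminine", "masculine", "neuter"},
    "number": {"singular", "plural"},
}


def check_structure(structure):
    """Return a list of problems found in a feature name -> value mapping."""
    problems = []
    for name, value in structure.items():
        allowed = VAL_RANGES.get(name)
        if allowed is None:
            problems.append(f"undeclared feature: {name}")
        elif value not in allowed:
            problems.append(f"illegal value for {name}: {value}")
    return problems


legal = {"case": "accusative", "gender": "feminine", "number": "plural"}
# The presumably illegal encoding shown earlier in this section.
illegal = {"case": "feminine", "gender": "accusative", "number": "minus"}

legal_problems = check_structure(legal)
illegal_problems = check_structure(illegal)
```

A check of this kind accepts the first encoding and reports every feature of the second, which is exactly the distinction an SGML validator alone cannot draw.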
Features with <sym>, <plus>, and <minus> values may be used to encode highly structured information such as may be obtained from precoded survey instruments. We illustrate by means of a coding scheme based on the one that is used for classifying potential printed entries in the British National Corpus. The scheme uses the following features and associated values.
medium
domain
level
sampling range
date of origination
published (miscellaneous items only)
selection method (books and periodicals only)
A comprehensive feature library for this scheme is the following; the id specifications are those currently used in the BNC project.
<fLib type='BNC classification features'>
   <f id=CA002 name=medium><sym value=book.or.periodical></f>
   <f id=CA003 name=medium><sym value=miscellaneous></f>
   <f id=CA004 name=medium><sym value=written.to.be.spoken></f>
   <f id=CA005 name=domain><sym value=imaginative></f>
   <f id=CA006 name=domain><sym value=applied.science></f>
   <f id=CA007 name=domain><sym value=arts></f>
   <f id=CA008 name=domain><sym value=belief.and.thought></f>
   <f id=CA009 name=domain><sym value=commerce.and.finance></f>
   <f id=CA00A name=domain><sym value=leisure></f>
   <f id=CA00B name=domain><sym value=natural.and.pure.science></f>
   <f id=CA00C name=domain><sym value=social.science></f>
   <f id=CA00D name=domain><sym value=world.affairs></f>
   <f id=CA00E name=level><sym value=high></f>
   <f id=CA00F name=level><sym value=medium></f>
   <f id=CA00G name=level><sym value=low></f>
   <f id=CA00H name=sample.type><sym value=beginning></f>
   <f id=CA00J name=sample.type><sym value=middle></f>
   <f id=CA00K name=sample.type><sym value=end></f>
   <f id=CA00L name=sample.type><sym value=whole></f>
   <f id=CA00M name=sample.type><sym value=whole.less.ten.percent></f>
   <f id=CA00N name=published.between><sym value=1960.1975></f>
   <f id=CA00P name=published.between><sym value=1975.1993></f>
   <f id=CA00R name=published><plus></f>
   <f id=CA00S name=published><minus></f>
   <f id=CA00T name=selection.method><sym value=principled></f>
   <f id=CA00U name=selection.method><sym value=random></f>
</fLib>
An entry which is a book or periodical on world affairs, medium level, sampled from the middle, published between 1975 and 1993, and selected on a principled basis could then be assigned the following feature-structure code; this code could also be placed in a feature-structure library that contains all the possible fully-specified BNC entry classifications. This library would have a total of 1620 (3 x 9 x 3 x 5 x 2 x 2) entries.
<fs type='BNC classification for written documents'
id=CA2DFJPT
feats='CA002 CA00D CA00F CA00J CA00P CA00T'></fs>
The <nbr> element is to be used when the value of a feature is a number or a range of numbers. For example, suppose one wishes to encode information contained in classified advertisements for the sale or rental of real estate, such as the number of bedrooms and bathrooms in a listed property, and its advertised selling or rental price. One way of representing such information is as follows.
<fs type='real estate listing'>
   <!-- ... -->
   <f name=number.of.bathrooms><nbr value=2></f>
   <f name=number.of.bedrooms><nbr value=3></f>
   <f name=monthly.rent><nbr value=625.00></f>
   <!-- ... -->
</fs>
The information that the number of bedrooms is in the range from 3 to 5 and the monthly rent is in the range from 625.00 to 950.00 may be represented as follows, using the optional valueTo attribute.
<f name=number.of.bedrooms><nbr value=3 valueTo=5></f>
<f name=monthly.rent><nbr value=625.00 valueTo=950.00></f>
The <nbr> element, like the <msr> and <rate> elements defined below, may also have a type attribute to specify whether the values of the value and valueTo attributes are to be construed as integer or real numbers.
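The interplay of the value, valueTo and type attributes can be sketched in Python as follows. The function names parse_nbr and in_range are hypothetical; the sketch further assumes that a range includes both of its end points and that an unspecified type is read as real, neither of which is stated in this section.

```python
def parse_nbr(value, value_to=None, type_=None):
    """Return (low, high) for a <nbr>-like value; a single value gives low == high."""
    # Assumption: anything other than type=int is construed as a real number.
    convert = int if type_ == "int" else float
    low = convert(value)
    high = convert(value_to) if value_to is not None else low
    if high < low:
        raise ValueError("valueTo is smaller than value")
    return low, high


def in_range(candidate, bounds):
    """Test whether candidate lies within the closed interval bounds."""
    low, high = bounds
    return low <= candidate <= high


bedrooms = parse_nbr("3", "5", "int")
rent = parse_nbr("625.00", "950.00", "real")
```

Treating a single value as a degenerate range keeps the comparison logic the same whether or not valueTo is present.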
The <msr> element is to be used when the value of a feature is a scalar quantity, essentially a combination of a numeric value and a symbolic value for identifying the scale on which the numeric value occurs. For example, real estate listings often provide the area (in square feet or meters) of a house or apartment and the area (in acres or hectares) of land being sold or rented. One way of representing information about such areas is as follows.
<fs type='real estate listing'>
   <!-- ... -->
   <f name=interior.area><msr unit=sq.ft value=2000></f>
   <f name=property.area><msr unit=acre value=0.5></f>
   <!-- ... -->
</fs>
The value of the <f name=monthly.rent> feature in the two examples above might be more accurately analysed as a measurement rather than as a numeric value, since the amount of the rent in question is to be understood as payable in a particular currency, such as US or Canadian dollars or Italian lire. To make the currency scale explicit, the first example of this feature might be re-encoded as follows.
<f name=monthly.rent><msr unit=USD value=625.00></f>
The unit and value attributes of the <msr> element are both required. If the unit need not be stated (for example, if omitting it would cause no confusion), then the <nbr> element may be used to express the feature value instead.
The <rate> element is to be used when the value of a feature is a rate. This element has a required per attribute for expressing the interval over which the rate is measured (typically, but not necessarily, a temporal interval), and an optional unit attribute for expressing the scalar unit. For example, one can encode the wage rate of USD $8.25 per hour as follows.
<f name=wage.rate><rate value=8.25 unit=USD per=hour></f>
Note that the <f name=monthly.rent> element illustrated above can be re-encoded as having a rate value, with a per=month attribute, as follows.
<f name=rent><rate value=625.00 unit=USD per=month></f>
To encode interest, inflation or tax rates, the unit attribute can be used to indicate that the value attribute is to be understood as a percentage. For example, an interest rate of 8.25% per year can be encoded in either of the following two ways.
<f name=interest><rate unit=percent value=8.25 per=year></f>
<f name=interest><rate value=0.0825 per=year></f>
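That the two encodings express the same rate can be verified with exact arithmetic. In the Python sketch below, the function name rate_as_fraction is hypothetical, and the rule that only unit=percent triggers division by 100 is an assumption of the sketch; other units are left untouched.

```python
from fractions import Fraction


def rate_as_fraction(value, unit=None):
    """Normalise a <rate>-like value to an exact fraction per interval."""
    amount = Fraction(value)  # exact, so no binary floating-point rounding
    if unit == "percent":
        amount /= 100
    return amount


first = rate_as_fraction("8.25", unit="percent")
second = rate_as_fraction("0.0825")
```

Exact fractions are used here because comparing 8.25 / 100 with 0.0825 in binary floating point is not guaranteed to succeed.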
Finally, the <str> element is to be used for the value of a feature when that value is a string drawn from a very large or potentially unbounded set of possible strings of characters, so that it would be impractical or impossible to use the <sym> element. These values are expressed not as the values of the value attribute, as in the case of symbolic, numeric, measurement and rate values, but as the content of the <str> element. For example, one may encode the street address of a property in a real estate listing, as follows.
<fs type='real estate listing'>
   <!-- ... -->
   <f name=address><str>3418 East Third Street</str></f>
   <!-- ... -->
</fs>
Here are the formal declarations of the <sym>, <nbr>, <msr>, <rate> and <str> elements.
<!-- 16.4: Symbolic, etc. values -->
<!ELEMENT sym - O EMPTY >
<!ATTLIST sym %a.global;
value CDATA #REQUIRED
rel (eq | ne) eq >
<!ELEMENT nbr - O EMPTY >
<!ATTLIST nbr %a.global;
value CDATA #REQUIRED
valueTo CDATA #IMPLIED
rel (eq | ne | lt | le | gt | ge)
eq
type (int | real) #IMPLIED >
<!ELEMENT msr - O EMPTY >
<!ATTLIST msr %a.global;
value CDATA #REQUIRED
valueTo CDATA #IMPLIED
unit CDATA #REQUIRED
rel (eq | ne | lt | le | gt | ge)
eq
type (int | real) #IMPLIED >
<!ELEMENT rate - O EMPTY >
<!ATTLIST rate %a.global;
value CDATA #REQUIRED
valueTo CDATA #IMPLIED
unit CDATA #IMPLIED
per CDATA #REQUIRED
rel (eq | ne | gt | ge | lt | le)
eq
type (int | real) #IMPLIED >
<!ELEMENT str - - (#PCDATA) >
<!ATTLIST str %a.global;
rel (eq | ne | sb | ns | lt | le | gt
| ge) eq >
<!-- This fragment is used in sec. 16.1 -->
<fs type='personal record'>
<f name=full.name>
<fs type='name record'>
<f name=first.name><str>Kathleen</str></f>
<f name=middle.name><str>Anne</str></f>
<f name=surname><str>Barnett</str></f>
</fs>
</f>
<f name=date.of.birth>
<fs type='date record'>
<f name=year><nbr value=1968></f>
<f name=month><nbr value=4></f>
<f name=day><nbr value=17></f>
</fs>
</f>
<f name=place.of.birth>
<fs type='place record'>
<f name=city><str>Austin</str></f>
<f name=state><sym value=TX></f>
</fs>
</f>
<f name=sex><sym value=female>
</fs>
Now suppose that feature-structure libraries are maintained for name
records and place records, and that the name record in the previous
example is identified with the attribute id=Nkab027 and the
place record is identified with the attribute id=txaustin.
(Feature-structure libraries, rather than feature-value libraries,
should be used for housing collections of feature structures.)
Then the preceding example could also be encoded as follows. (An
identifier is also provided for the personal record.)
<fs id=Pkab027 type='personal record'>
<f name=full.name fVal=Nkab027></f>
<f name=date.of.birth>
<fs type='date record'>
<f name=year><nbr value=1968></f>
<f name=month><nbr value=4></f>
<f name=day><nbr value=17></f>
</fs>
</f>
<f name=place.of.birth fVal=txaustin></f>
<f name=sex><sym value=female>
</fs>
This representation could be simplified further if a feature library is
maintained for the year, month, day and sex features, so that the
feats attribute may be used as follows.
<fs id=Pkab027 type='personal record' feats='sxf'>
<f name=full.name fVal=Nkab027></f>
<f name=date.of.birth>
<fs type='date record' feats='y1968 m04 d17'></fs>
</f>
<f name=place.of.birth fVal=txaustin></f>
</fs>
Next, suppose that a feature-structure library is also maintained for personal records, and that the library also contains records for the parents of the individual identified in the previous example. Suppose that the father is identified as Pmfb009 and the mother as Parn002. Then the personal-record feature structure could be easily augmented to include pointers to the parents, as follows.
<fs id=Pkab027 type='personal record' feats='sxf'>
<f name=full.name fVal=Nkab027></f>
<f name=date.of.birth>
<fs type='date record' feats='y1968 m04 d17'></fs>
</f>
<f name=place.of.birth fVal=txaustin></f>
<f name=mother fVal=Parn002>
<f name=father fVal=Pmfb009>
</fs>
If the personal records identified as Parn002 and Pmfb009 also contain
information about the parents of those individuals, then from the present
record, one would have access to that individual's grandparents as well.
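The chain of pointers described here can be followed programmatically. The Python sketch below models a personal-records library as a dictionary and walks the mother and father pointers for a given number of generations. The function name ancestors is hypothetical, and the identifiers Pxxx001 to Pxxx004 are invented placeholders for the grandparents' records, which the text does not name.

```python
# Hypothetical personal-records library: record id -> feature name -> pointer.
# The ids Pxxx001 ... Pxxx004 are invented for this illustration only.
RECORDS = {
    "Pkab027": {"mother": "Parn002", "father": "Pmfb009"},
    "Parn002": {"mother": "Pxxx001", "father": "Pxxx002"},
    "Pmfb009": {"mother": "Pxxx003", "father": "Pxxx004"},
}


def ancestors(record_id, generations):
    """Follow mother/father pointers and return the ids reached at that depth."""
    current = [record_id]
    for _ in range(generations):
        following = []
        for ident in current:
            record = RECORDS.get(ident, {})  # unknown ids contribute nothing
            for role in ("mother", "father"):
                if role in record:
                    following.append(record[role])
        current = following
    return current


parents = ancestors("Pkab027", 1)
grandparents = ancestors("Pkab027", 2)
```

The walk stops naturally when it reaches records that are absent from the library or that carry no parent pointers.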
Assuming that personal records of the sort described in this section are being maintained in association with text files, the records can be linked to those texts in any of the ways described in chapter 14, Linking, Segmentation, and Alignment , provided that identifiers are added for appropriate features, as in the following illustration.
<text id=Bfile>
<!-- ... -->
<div id=Tkab027 type='birth certificate'>
<p><name type=person id=T1kab027>Kathleen Anne Barnett</name>
was born at <time id=T1t0659>6:59 a.m.</time> on
<date id=T1d680417>April 17, 1968</date> in
<name type=org id=T1setonhsp>Seton Hospital</name> in
<name type=place id=T1txaustin>Austin</name>
to <seg id=s1>Mr.</seg> and <seg id=s2>Mrs.</seg>
<name type=person id=T1mfb009>Michael F. Barnett</name> of
<name type=place id=T1sansabaTX>San Saba</name>.
<!-- ... -->
<join id=T1arn002 targets='s2 T1mfb009'>
<join id=T2mfb009 targets='s1 T1mfb009'>
<!-- ... -->
</text>
<fsLib id=Prec type='personal records'>
<!-- ... -->
<fs id=Pkab027 type='personal record' feats='sxf'>
<f name=full.name fVal=Nkab027></f>
<f id=Dkab027 name=date.of.birth>
<fs type='date record' feats='y1968 m04 d17'></fs>
</f>
<f id=Bkab027 name=place.of.birth fVal=txaustin></f>
<f id=Mkab027 name=mother fVal=Parn002></f>
<f id=Fkab027 name=father fVal=Pmfb009></f>
</fs>
<!-- ... -->
</fsLib>
<linkGrp type='record verification'
domains='Bfile Prec'
targFunc='source goal'
extendTarg=0>
<!-- ... -->
<link targets='T1kab027 Nkab027'>
<link targets='T1d680417 Dkab027'>
<link targets='T1txaustin Bkab027'>
<link targets='T1arn002 Mkab027'>
<link targets='T2mfb009 Fkab027'>
<!-- ... -->
</linkGrp>
No default value for the org attribute is declared in the
DTD; however, a default value for that attribute can be declared for
particular features in the feature-system declaration; see chapter 26, Feature System Declaration.
Note that if only one value is specified for a given
<f> element, the set, bag and list values of the org attribute
are all essentially equivalent to the singleton value, so the omission
of the org attribute for such a feature is not problematic
(unless the value is the <null> element; see below).
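The equivalence noted here can be made concrete with a small Python sketch that compares two collections of values under each organization. It assumes the usual reading of the four org values: a set ignores order and repetition, a bag ignores order only, and a list respects both. The function name same_value is hypothetical.

```python
from collections import Counter


def same_value(left, right, org="single"):
    """Compare two sequences of feature values under the given organization."""
    if org == "set":     # order and repetition are ignored
        return set(left) == set(right)
    if org == "bag":     # order is ignored, repetition counts
        return Counter(left) == Counter(right)
    if org == "list":    # order and repetition both count
        return list(left) == list(right)
    if org == "single":  # exactly one value is expected on each side
        return len(left) == 1 and len(right) == 1 and left[0] == right[0]
    raise ValueError(f"unknown org value: {org}")


# With a single value, every organization gives the same answer.
single_value_results = [
    same_value(["third"], ["third"], org)
    for org in ("single", "set", "bag", "list")
]
```

The organizations only come apart once a feature has two or more values, for instance when the same values appear in a different order or with repetitions.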
To illustrate the use of the org attribute, suppose that the illustration of personal records from the previous section is extended to include pointers to an individual's siblings. Suppose also that the individual identified as <fs id=Pkab027> has siblings identified as <fs id=Panb005>, <fs id=Pmfb010> and <fs id=Pzrb001> in the personal records library. Then we may extend the personal record for <fs id=Pkab027> as follows.
<fs id=Pkab027 type='personal record' feats='sxf'>
<f name=full.name fVal=Nkab027></f>
<f name=date.of.birth>
<fs type='date record' feats='y1968 m04 d17'></fs>
</f>
<f name=place.of.birth fVal=txaustin></f>
<f name=mother fVal=Parn002>
<f name=father fVal=Pmfb009>
<f name=siblings org=set fVal='Panb005 Pmfb010 Pzrb001'>
</fs>
A more elaborate illustration of the use of the org attribute is the following <f name=career org=list> element, which may be added to the personal records of an individual to record the job career of that individual. The feature structures which constitute the value of this feature document the jobs which the individual has held in the order in which they were held. Note that a list has been embedded within a list by means of intervening <fs type='employment record'> and <f name=promotion.history> elements.
<f name=career org=list>
<fs type='employment record'>
<f name=employer><str>Safeway Stores</str></f>
<f name=hiring.information>
<fs type='hire structure'>
<f name=hire.date>
<fs type='date structure' feats='y1988 m06'></fs>
</f>
<f name=job.title><sym value=stocker></f>
<f name=wage><rate value=6.00 per=hour></f>
<f name=hours.worked><rate value=40 per=week></f>
<f name=status.code fVal=sc4A></f>
</fs>
</f>
<f name=promotion.history org=list>
<fs type='promotion record'>
<f name=date>
<fs type='date structure' feats='y1988 m12'></fs>
</f>
<f name=job.title><sym value=cashier></f>
<f name=wage><rate value=7.00 per=hour></f>
<f name=hours.worked><rate value=40 per=week></f>
<f name=status.code fVal=sc4A></f>
</fs>
<fs type='promotion record'>
<f name=date>
<fs type='date structure' feats='y1990 m02'></fs>
</f>
<f name=job.title><sym value=supervisor></f>
<f name=salary><rate value=18000 per=year></f>
<f name=status.code fVal=sc3C></f>
</fs>
</f>
<f name=termination.information>
<fs type='termination structure'>
<f name=termination.date>
<fs type='date structure' feats='y1991 m04'></fs>
</f>
<f name=status.code fVal=sc3C></f>
<f name=reason.for.termination><sym value=laid.off></f>
</fs>
</f>
</fs>
<fs type='employment record'>
<!-- ... -->
</fs>
<!-- ... -->
</f>
The information contained in such features may be linked to textual references in the usual way. The <f name=status.code> feature has been included to show how evaluative or interpretive information can be included along with information gleaned from textual records. The example presumes that the status code values are maintained in a designated <fvLib>.
Features with values organized as sets, bags or lists can sometimes be used instead of features organized as singletons, whose values are individual feature structures. For example, consider the following encoding of the English verb form sinks, which contains a <f name=agreement> element whose value is a feature structure which contains <f name=person> and <f name=number> elements with symbolic values.
<fs type='word structure'>
<!-- ... -->
<f name=word.class><sym value=verb></f>
<f name=tense><sym value=present></f>
<f name=agreement>
<fs type='agreement structure'>
<f name=person><sym value=third></f>
<f name=number><sym value=singular></f>
</fs>
</f>
<!-- ... -->
</fs>
If one does not care about the names of the features contained within the <fs type='agreement structure'> element, the containing <f name=agreement> element can be given an org attribute with the value set, and the contained <fs> element, together with the person and number feature elements it contains, can be eliminated, as follows.
<fs type='word structure'>
   <!-- ... -->
   <f name=word.class><sym value=verb></f>
   <f name=tense><sym value=present></f>
   <f name=agreement org=set><sym value=third><sym value=singular></f>
   <!-- ... -->
</fs>
The encoding in the preceding example presumes that the <fDecl> element for the <f name=agreement> element would look something like the following; for further details, see chapter 26, Feature System Declaration.
<fDecl name=agreement org=set>
<!-- ... -->
<valRange>
<vAlt>
<sym value=first>
<sym value=second>
<sym value=third>
</vAlt>
<vAlt>
<sym value=singular>
<sym value=plural>
</vAlt>
</valRange>
<!-- ... -->
</fDecl>
The set, bag or list which has no members is known as the null (or empty) set, bag or list. To refer to it, the <null> element is provided; its description and attributes are as follows.
<null>
So, for example, to indicate that the individual identified above by the <fs id=Pkab027> element has no siblings, we may specify the <f name=siblings> element as follows.
<f name=siblings org=set><null></f>
The <null> element when used with a feature organized as a singleton is a semantic error; however, its appearance as a value for such a feature cannot be flagged by SGML. The <null> element, when it appears as a feature value, must be the only value.
Here is the formal declaration of the <null> element.
<!-- 16.6: Null values -->
<!ELEMENT null - O EMPTY >
<!ATTLIST null %a.global; >
<!-- This fragment is used in sec. 16.1 -->
<fAlt>
mutExcl: Y | N
<vAlt>
mutExcl: Y | N
To illustrate the use of the <fAlt> element to represent the alternation of features, suppose one is uncertain whether a particular real estate advertisement describes a house with two bedrooms or with two bathrooms. This uncertainty can be represented as follows.
<fs type='real estate listing'>
<!-- ... -->
<fAlt>
<f name=number.of.bathrooms><nbr value=2></f>
<f name=number.of.bedrooms><nbr value=2></f>
</fAlt>
<!-- ... -->
</fs>
This representation leaves unspecified whether or not the alternation is mutually exclusive (i.e. whether having two bathrooms excludes the possibility of having two bedrooms and vice versa). To make this aspect of the alternation explicit, one can specify a value for the mutExcl attribute, as follows.
<fs type='real estate listing'>
<!-- ... -->
<fAlt mutExcl=N>
<f name=number.of.bathrooms><nbr value=2></f>
<f name=number.of.bedrooms><nbr value=2></f>
</fAlt>
<!-- ... -->
</fs>
The <fAlt> element can also be used to represent uncertainty about whether the number of bathrooms is two or three, as follows; note that the attribute value mutExcl=Y can be inferred for the <fAlt> element in this example.
<fs type='real estate listing'>
<!-- ... -->
<fAlt>
<f name=number.of.bathrooms><nbr value=2></f>
<f name=number.of.bathrooms><nbr value=3></f>
</fAlt>
<!-- ... -->
</fs>
However, the <f name=number.of.bathrooms> element in this example can be factored out of the alternation, and a <vAlt> element used instead to represent the alternation of just the feature values, as follows.
<fs type='real estate listing'>
<!-- ... -->
<f name=number.of.bathrooms>
<vAlt><nbr value=2><nbr value=3></vAlt>
</f>
<!-- ... -->
</fs>
The <fAlt> and <vAlt> elements can also be used to indicate certain alternations among values of features organized as sets, bags or lists. For example, suppose one uses a <f name=extras org=set> element in feature structures for real estate listings to represent items that are mentioned to enhance a property's sales value, such as whether it has a pool or a good view. Now suppose for a particular listing, the extras include an alarm system and a fenced-in yard, and either a pool or a jacuzzi (but not both). This situation could be represented, using the <vAlt> element, as follows.
<fs type='real estate listing'>
<!-- ... -->
<f name=extras org=set>
<str>alarm system</str>
<str>fenced-in yard</str>
<vAlt mutExcl=Y>
<str>pool</str>
<str>jacuzzi</str>
</vAlt>
</f>
<!-- ... -->
</fs>
Now suppose the situation is like the preceding except that one is also uncertain whether the property has an alarm system or a fenced-in yard, or possibly both. This can be represented as follows.
<fs type='real estate listing'>
<!-- ... -->
<f name=extras org=set>
<vAlt mutExcl=N>
<str>alarm system</str>
<str>fenced-in yard</str>
</vAlt>
<vAlt mutExcl=Y>
<str>pool</str>
<str>jacuzzi</str>
</vAlt>
</f>
<!-- ... -->
</fs>
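The effect of the mutExcl attribute on a set-valued feature can be made concrete by enumerating the sets that the markup allows. The following Python sketch is illustrative only: the function names and the representation of each <vAlt> as a (values, mutually-exclusive) pair are assumptions, and an alternation with no mutExcl value is not modelled. It reads mutExcl=Y as "exactly one alternant" and mutExcl=N as "one or more alternants".

```python
from itertools import combinations, product

def alternation_choices(values, mut_excl):
    """Return the subsets of `values` permitted by one alternation.

    mut_excl=True  (mutExcl=Y): exactly one alternant holds.
    mut_excl=False (mutExcl=N): one or more alternants may hold.
    """
    if mut_excl:
        return [frozenset([v]) for v in values]
    return [frozenset(c)
            for r in range(1, len(values) + 1)
            for c in combinations(values, r)]

def possible_sets(fixed, alternations):
    """Enumerate every set the feature could have as its actual value."""
    per_alt = [alternation_choices(vals, excl) for vals, excl in alternations]
    return [frozenset(fixed).union(*combo) for combo in product(*per_alt)]

# The extras feature with two alternations, as in the example above.
extras = possible_sets(
    fixed=[],
    alternations=[
        (["alarm system", "fenced-in yard"], False),  # <vAlt mutExcl=N>
        (["pool", "jacuzzi"], True),                  # <vAlt mutExcl=Y>
    ],
)
for s in extras:
    print(sorted(s))
```

Under these assumptions the example admits six possible sets: three combinations of alarm system and fenced-in yard, each paired with either pool or jacuzzi.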
Finally, suppose that the listing specifies that the property has a finished basement, and that it also has either an alarm system and a pool or a fenced-in yard and a jacuzzi. This situation cannot be represented using the <vAlt> element, because the alternation holds between subsets of two values each. It can, however, be represented using the <fAlt> element, as follows; note that the <str> element with the value finished basement must be repeated.
<fs type='real estate listing'>
<!-- ... -->
<fAlt mutExcl=Y>
<f name=extras org=set>
<str>finished basement</str>
<str>alarm system</str>
<str>pool</str>
</f>
<f name=extras org=set>
<str>finished basement</str>
<str>fenced-in yard</str>
<str>jacuzzi</str>
</f>
</fAlt>
<!-- ... -->
</fs>
If a large number of ambiguities or uncertainties involving a relatively small number of features and values need to be represented, it is recommended that the general-purpose <alt> element discussed in section 14.8, Alternation be used, rather than the special-purpose <fAlt> and <vAlt> elements. The use of the <alt> element avoids the need to explicitly represent the alternating elements more than once.
For example, suppose one has set up a <fsLib> element containing feature structures representing the morphological structures of classical Greek inflected words, along with collections of individual features and feature values, encoded by <fLib> and <fvLib> elements as appropriate. The following example shows how one might then represent the morphological structure of a feminine gender, accusative case, plural number noun form in classical Greek, such as theás goddesses discussed in section 16.4, Symbolic, Numeric, Measurement, Rate and String Values :
<fsLib type='noun structures'>
<!-- ... -->
<!-- plural accusative feminine noun -->
<fs id=WnGfKaNp type='noun structure' feats='Wn Gf Ka Np'></fs>
<!-- ... -->
</fsLib>
<fLib type='morphological features'>
<f id=Wn name=word.class fVal=nn></f>
<!-- ... -->
<f id=Gf name=gender fVal=fe></f>
<!-- ... -->
<f id=Ka name=case fVal=ac></f>
<!-- ... -->
<f id=Np name=number fVal=pl></f>
<!-- ... -->
</fLib>
<fvLib type='morphological feature values'>
<!-- ... -->
<sym id=nn value=noun>
<!-- ... -->
<sym id=fe value=feminine>
<!-- ... -->
<sym id=ac value=accusative>
<!-- ... -->
<sym id=pl value=plural>
<!-- ... -->
</fvLib>
Now consider the noun form theaí goddesses, which is analyzable as a feminine plural noun form in either the nominative or the vocative case. We may represent this ambiguity by adding the following entries to the <fsLib>, <fLib>, and <fvLib> elements in the preceding example; assume that appropriate entries for unambiguous nominative and vocative case forms have already been entered.
<!-- Add the following to the feature-structure library (FSLIB) -->
<!-- plural nominative-or-vocative feminine noun -->
<fs id=WnGfKnvNp type='noun structure' feats='Wn Gf Knv Np'></fs>
<!-- Add the following to the feature library (FLIB) -->
<!-- CASE=nominative or vocative -->
<f id=Knv name=case fVal=novo></f>
<!-- Add the following to the feature value library (FVLIB) -->
<!-- nominative or vocative -->
<alt id=novo targets='no vo'>
If the <fvLib> element is not used, and specifications for particular feature values are entered as content of the <f name=...> elements in the <fLib> element, then the ambiguity can be represented as follows.
<fsLib type='noun structures'>
<!-- ... -->
<!-- plural nominative-or-vocative feminine noun -->
<fs id=WnGfKnvNp type='noun structure' feats='Wn Gf Knv Np'></fs>
<!-- ... -->
</fsLib>
<fLib type='morphological features'>
<!-- ... -->
<f id=Kn name=case><sym value=nominative></f>
<!-- ... -->
<f id=Kv name=case><sym value=vocative></f>
<!-- ... -->
<alt id=Knv targets='Kn Kv'>
<!-- ... -->
</fLib>
The <alt> element together with the <join> element can, unlike the <fAlt> and <vAlt> elements, be used to express alternations between sets of features. An example of such an alternation is found in certain feminine gender Greek noun forms ending in -as, such as peíras attempt(s), which may be analyzed as having either genitive case and singular number features or accusative case and plural number features, as follows (again, assuming the existence of other elements and identifier attributes for simple features and values).
<!-- Add the following to the FSLIB element -->
<!-- feminine noun, either genitive singular or accusative plural -->
<fs id=WnGfKg.NsKa.Np type='noun structure' feats='Wn Gf Kg.NsKa.Np'>
</fs>
<!-- Add the following to the FLIB element -->
<join id=Kg.Ns targets='Kg Ns'> <!-- genitive singular -->
<join id=Ka.Np targets='Ka Np'> <!-- accusative plural -->
<!-- alternation: gen. sg. or acc. plural -->
<alt id=Kg.NsKa.Np targets='Kg.Ns Ka.Np'>
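To see how a processor might unpack such identifiers, the following Python sketch expands a feats value into its alternative readings. It is a minimal illustration, not a description of any TEI software: the dictionaries JOINS and ALTS and the function names are hypothetical, and the sketch assumes that the alternation targets the two join identifiers Kg.Ns and Ka.Np.

```python
# Hypothetical in-memory picture of the <join> and <alt> entries shown above.
JOINS = {"Kg.Ns": ["Kg", "Ns"], "Ka.Np": ["Ka", "Np"]}
ALTS = {"Kg.NsKa.Np": ["Kg.Ns", "Ka.Np"]}

def readings(ident):
    """Return every reading of `ident` as a list of simple feature ids."""
    if ident in ALTS:
        # Alternation: each target contributes its own readings.
        return [r for target in ALTS[ident] for r in readings(target)]
    if ident in JOINS:
        # Aggregation: combine one reading from each target, in order.
        combined = [[]]
        for target in JOINS[ident]:
            combined = [c + r for c in combined for r in readings(target)]
        return combined
    # A simple feature identifier stands for itself.
    return [[ident]]

def expand_feats(feats):
    """Expand a feats attribute value into its alternative feature lists."""
    out = [[]]
    for ident in feats.split():
        out = [c + r for c in out for r in readings(ident)]
    return out

for reading in expand_feats("Wn Gf Kg.NsKa.Np"):
    print(reading)
```

The two readings produced are the genitive singular analysis (Wn Gf Kg Ns) and the accusative plural analysis (Wn Gf Ka Np).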
Here are the formal declarations of the <fAlt> and <vAlt> elements.
<!-- 16.7: Alternative features and feature values -->
<!ELEMENT fAlt - - ((f | fs | fAlt), (f | fs |
fAlt)+) >
<!ATTLIST fAlt %a.global;
mutExcl (Y | N) #IMPLIED >
<!ELEMENT vAlt - - ((plus | minus | any | none | dft
| uncertain | null | sym | nbr |
msr | rate | str | vAlt | fs),
(plus | minus | any | none | dft |
uncertain | null | sym | nbr | msr
| rate | str | vAlt | fs)+) >
<!ATTLIST vAlt %a.global;
mutExcl (Y | N) #IMPLIED >
<!-- This fragment is used in sec. 16.1 -->
The boolean value elements are used to indicate whether the features they are associated with have values. The element <any> corresponds to the boolean value true (i.e., that the feature it is associated with has a value, which is not the same as having the binary value plus), and the element <none> corresponds to the boolean value false (i.e., that the feature it is associated with has no value). The <dft> element is used to indicate that the feature it is associated with has its default value in the feature structure in which it appears. Finally, the <uncertain> element may be used to indicate uncertainty about what value, if any, its associated feature has; it is equivalent to the alternation of the <any> and <none> elements. To indicate uncertainty about which of the possible legal values a particular feature has, one should use the <any> element.
The descriptions and attributes of these elements are as follows.
<any>
<none>
<dft>
<uncertain>
The values <null> and <none> are distinct. The former is to be used with a feature organized as a set, bag, or list to indicate that its value is the null set, bag, or list in a particular feature structure. The latter is to be used with such a feature to indicate that it has no value in a particular feature structure.
The boolean values <any> and <none> are also distinct from the binary values <plus> and <minus>. The latter pair are specific possible values for features, whereas the former pair represent ranges of possible values, not specific possible values, for features. For example, suppose that the <valRange> element for the <f name=auxiliary> element is declared as follows in the feature system declaration, so that either binary value is legal.
<!-- VALRANGE tag for the AUXILIARY feature -->
<valRange><vAlt><plus><minus></vAlt></valRange>
Then the following two pairs of specifications are distinct.
<f name=auxiliary><plus></f> =/= <f name=auxiliary><any></f>
<f name=auxiliary><minus></f> =/= <f name=auxiliary><none></f>
In this situation, the <any> element is equivalent to the alternation of the <plus> and <minus> elements, and the <none> element is equivalent to the negation of that alternation.
However, if the auxiliary feature is declared to take only the <plus> value, then the members of the first pair of specifications below are equivalent, but those of the second pair are not; in fact, the first member of the second pair is invalid.
<f name=auxiliary><plus></f> == <f name=auxiliary><any></f>
<f name=auxiliary><minus></f> =/= <f name=auxiliary><none></f>
It is even possible to declare that a particular feature can never have values, as follows for the feature <f name=impossible>.
<!-- VALRANGE tag for the IMPOSSIBLE feature -->
<valRange></valRange>
In this case, the following specifications are equivalent.
<f name=impossible><any></f> == <f name=impossible><none></f>
The elements <any> and <dft> are also designed to be used in conjunction with the <fDecl> and <valDefault> elements in the feature system declaration discussed in section 26, Feature System Declaration . First, consider the <any> element, and suppose that the <valRange> element in the <fDecl> element for the <f name=gender> element is specified as follows.
<!-- VALRANGE for the GENDER feature -->
<valRange>
<vAlt>
<sym value=feminine>
<sym value=masculine>
<sym value=neuter>
</vAlt>
</valRange>
Then the following two representations are equivalent.
<f name=gender><any></f>
<f name=gender>
<vAlt>
<sym value=feminine>
<sym value=masculine>
<sym value=neuter>
</vAlt>
</f>
Second, consider the <dft> element, and suppose that the default value for the <f name=gender> element is specified in the <valDefault> element of its <fDecl> element as having the value <sym value=feminine>. Then the following three representations are equivalent; note that if a <f name=...> element appears without content and without a valid fVal attribute, then it is equivalent to the same element with the <dft> element as its content.
<f name=gender></f>
<f name=gender><dft></f>
<f name=gender><sym value=feminine></f>
Using the <any> and <dft> elements, together with an <fDecl> element for the corresponding feature in the feature system declaration, provides a method for underspecifying the value of that feature. The <any> element means that the associated feature has a legal value but what value it has is not specified. The <dft> element means that the associated feature has the value which the encoder has declared is the normal value of the feature.
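The two equivalences just described can be summarized in a short Python sketch. It is illustrative only: the dictionary GENDER_DECL stands in for the <valRange> and <valDefault> content of a hypothetical <fDecl> for the gender feature, and the function name resolve is an assumption.

```python
# Hypothetical stand-in for the <fDecl> of the gender feature.
GENDER_DECL = {
    "range": ["feminine", "masculine", "neuter"],  # content of <valRange>
    "default": "feminine",                         # content of <valDefault>
}

def resolve(value, decl):
    """Return the set of concrete values that a feature specification allows.

    `value` is "any" for <any>, "dft" for <dft>, None for an <f> element
    with no content and no fVal attribute, or a symbol name.
    """
    if value == "any":
        # <any>: some legal value, not further specified.
        return set(decl["range"])
    if value is None or value == "dft":
        # <dft>, or an empty <f> element: the declared default value.
        return {decl["default"]}
    if value not in decl["range"]:
        raise ValueError(f"{value!r} is not a legal value for this feature")
    return {value}

print(sorted(resolve("any", GENDER_DECL)))
print(resolve(None, GENDER_DECL) == resolve("dft", GENDER_DECL) == resolve("feminine", GENDER_DECL))
```

The second line printed is True, matching the three equivalent representations of the default value given above.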
The boolean elements <any> and <none> also have specific uses within <fsConstraints> and <fDecl> elements in feature system declarations, as described in chapter 26, Feature System Declaration . For example, the element <any> can appear as the value of a feature contained within an <fs> of a particular type which appears in the <cond> element of an <fsConstraints> element, to indicate that the feature must appear in feature structures of the designated type (i.e., that it is obligatory) and that when it does appear, it may appear with any of its legal values. Similarly, <none> can appear in this way to specify that the feature cannot be present in feature structures of the indicated type (i.e., that it is obligatorily absent from such feature structures). All other features that are declared to have values are understood to be optional in such feature structures.
For example, the following may appear as part of the <fsConstraints> of a feature system declaration to indicate that a <fs type='agreement structure'> must contain a legal instance of the <f name=number> element but must not contain a legal instance of the <f name=category> element.
<cond><fs type='agreement structure'></fs>
<then><fs>
<f name=number><any></f>
<f name=category><none></f>
</fs>
</cond>
Further constraints can be imposed on a feature structure of a particular type in the <valRange> elements of features which take feature structures of that type as values. For example, suppose that verb and adjective agreement in German are represented by feature structures of the following sorts, in which verb forms agree in person and number with their subjects and adjective forms agree in gender, case, and number with their subjects.
<fs type='verb structure'>
<!-- ... -->
<f name=verbAgreement>
<fs type='agreement structure'>
<f name=person><sym value=first></f>
<f name=number><sym value=plural></f>
</fs>
</f>
<!-- ... -->
</fs>
<fs type='adjective structure'>
<!-- ... -->
<f name=adjAgreement>
<fs type='agreement structure'>
<f name=gender><sym value=feminine></f>
<f name=case><sym value=accusative></f>
<f name=number><sym value=plural></f>
</fs>
</f>
<!-- ... -->
</fs>
In order to ensure that a <fs type='agreement structure'> tag which appears as the value of a <f name=verbAgreement> element may be specified for any person and number feature, but for no gender and case feature, we may provide a <valRange> element for the verbAgreement feature as follows.
<!--VALRANGE specification for the VERBAGREEMENT feature -->
<valRange>
<fs type='agreement structure'>
<f name=person><any></f>
<f name=case><none></f>
<f name=gender><none></f>
<f name=number><any></f>
</fs>
</valRange>
Similarly, to ensure that a <fs type='agreement structure'>
element which appears as the value of a <f name=adjAgreement>
element may be specified for any case, gender, and number feature, but
for no person feature, we may provide a <valRange> element for
the adjAgreement feature as follows.
<!--VALRANGE specification for the ADJAGREEMENT feature -->
<valRange>
<fs type='agreement structure'>
<f name=person><none></f>
<f name=case><any></f>
<f name=gender><any></f>
<f name=number><any></f>
</fs>
</valRange>
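A processor could use such <valRange> specifications to reject agreement structures that carry a prohibited feature. The Python sketch below is an assumption-laden illustration: the dictionaries mirror the two <valRange> specifications for verbAgreement and adjAgreement, feature structures are reduced to flat name-to-value dictionaries, and the function name forbidden_features is hypothetical. Only the <none> entries are treated as hard prohibitions; features marked <any> may appear with any legal value.

```python
# Hypothetical flat encodings of the two <valRange> specifications.
VERB_AGREEMENT_RANGE = {"person": "any", "case": "none", "gender": "none", "number": "any"}
ADJ_AGREEMENT_RANGE = {"person": "none", "case": "any", "gender": "any", "number": "any"}

def forbidden_features(fs, val_range):
    """Return the names of features present in `fs` that `val_range` marks <none>."""
    return sorted(name for name in fs if val_range.get(name) == "none")

# The verb and adjective agreement structures from the examples above.
verb_fs = {"person": "first", "number": "plural"}
adj_fs = {"gender": "feminine", "case": "accusative", "number": "plural"}

print(forbidden_features(verb_fs, VERB_AGREEMENT_RANGE))  # []
print(forbidden_features(adj_fs, VERB_AGREEMENT_RANGE))   # ['case', 'gender']
print(forbidden_features(verb_fs, ADJ_AGREEMENT_RANGE))   # ['person']
```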
The combination of declarations like these and the principle of subsumption discussed in 16.9, Indirect Specification of Values Using the REL Attribute , allows feature structures to be underspecified in text markup. For example, to indicate that a given adjective inflection feature (tagged <f name=adjInflection>) is a feature structure (tagged <fs type='inflection structure'>) specifying plural number and any gender and case, we can omit the elements for gender and case from the <fs> element, as follows.
<f name=adjInflection>
<fs type='inflection structure'>
<f name=number><sym value=plural></f>
</fs>
</f>
When supplied as the value of a verb inflection feature (tagged <f name=verbInflection>), the same feature structure would be interpreted as an inflection structure specifying plural number and any person.
If an optional feature is not specified in a feature-structure value, then it is assumed to occur with the <uncertain> value. For further discussion, see section 16.9, Indirect Specification of Values Using the REL Attribute .
Here are the formal declarations of the <any>, <none>, <dft>, and <uncertain> elements.
<!-- 16.8: Boolean, default, uncertainty values -->
<!ELEMENT any - O EMPTY >
<!ATTLIST any %a.global; >
<!ELEMENT none - O EMPTY >
<!ATTLIST none %a.global; >
<!ELEMENT dft - O EMPTY >
<!ATTLIST dft %a.global; >
<!ELEMENT uncertain - O EMPTY >
<!ATTLIST uncertain %a.global; >
<!-- This fragment is used in sec. 16.1 -->
If an <fDecl> element has been provided which defines the
legal values for the associated feature, then the value ne
can be given a positive interpretation. For example, suppose that the
<valRange> element is declared in the <fDecl> element for
the <f name=case> element as follows.
Suppose also that the <f name=case> element is declared as
obligatory in a particular feature structure. Then the following
specifications are equivalent in that structure.
On the other hand, if the <f name=case> feature is declared
as optional in a particular feature structure, then the following
specifications are equivalent in that structure.
If the rel attribute is specified with the value
ne for a <nbr>, <msr>, or <rate>
element for which the valueTo attribute is also specified,
then the actual range may be any range distinct from that given. For
example, the following means that the number of bathrooms is a range
distinct from 3 to 5 (e.g., 3 to 4, 3 to 6, 4 to 5, 4 to 6, 0 to 2,
etc.).
These attribute values may be used as shown in the following
examples. The first states that the number of bedrooms is less than 5;
the second that an illegal speed is any speed greater than 65 miles per
hour; the third that a lot size is in a range which is less than or
equal to the range of from 5 to 10 acres;
On the <str> element, these values are used to specify that
the string value given in the <str> element is or is not a
substring of the actual value of the feature. The first
example below specifies that the actual feature value may be any string
at all (since the empty string is a substring of every string), the
second that it might be any string in which the string
the occurs as a substring, and the third that it
might be any string in which the string the does
not occur as a substring.
On the <fs> element, the attribute values sb and
ns indicate that the given feature structure does or does
not legally subsume the actual feature structure. By
definition, one feature structure subsumes another if the second feature
structure is identical to the first or contains more information than
the first. The default value for the rel attribute of the
<fs> element is sb. The subsumption of feature
structures is illustrated by the following four examples; suppose that
the <f name=person> and <f name=number> elements are
either optional or obligatory in these <fs type='agreement
structure'> example elements.
The fourth example, ``pxnx'', subsumes all four of the examples,
since each contains at least as much information as does feature
structure ``pxnx''. Conversely, the first example, ``p3ns'',
subsumes only itself. Finally, the second and third examples,
identified as ``p3nx'' and ``pxns'', subsume themselves
and the first feature structure, but not each other.
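The subsumption relations among the four agreement structures can be checked mechanically. The following Python sketch is a simplification under stated assumptions: each feature structure is reduced to a flat dictionary from feature names to symbolic values, nested feature-structure values are not handled, and the function name subsumes is hypothetical.

```python
def subsumes(general, specific):
    """True if `specific` carries every feature-value pair found in `general`.

    This matches the definition above for flat structures: the second
    structure is identical to the first or contains more information.
    """
    return all(specific.get(name) == value for name, value in general.items())

# The four agreement structures, keyed by their identifiers.
structures = {
    "p3ns": {"person": "third", "number": "singular"},
    "p3nx": {"person": "third"},
    "pxns": {"number": "singular"},
    "pxnx": {},
}

for a_id, a in structures.items():
    subsumed = [b_id for b_id, b in structures.items() if subsumes(a, b)]
    print(a_id, "subsumes", subsumed)
```

The output agrees with the description above: pxnx subsumes all four structures, p3ns subsumes only itself, and p3nx and pxns each subsume themselves and p3ns but not each other.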
If both person and number are obligatory features of agreement
structure elements, then the last three elements in the preceding list
have the same interpretation as their counterparts in the following
list.
On the other hand, if both person and number are optional features of
agreement structures, then those three elements have the same
interpretation as their counterparts in the following list.
The value sb is chosen as the default value for the
rel attribute of the <fs> element, because it provides
the most economical means of underspecifying feature structures. One
situation in which it may be preferable to specify <fs rel=eq> is when
the feature structure has many optional features and it is known that
none of them occur.
The specification <fs rel=ns> is used to denote the feature
structures that the specified feature structure does not subsume. This
provides a handy way of saying that a certain combination of features is
not present, for example the combination of third person and singular
number, as in the agreement structure of the English verb form
sink, understood as a present tense verb form.
The following example expresses the claim that third-person and
singular-number features are not both present in the agreement feature,
but makes no further claim about what is present.
In most real situations, of course, one can infer, from the range of
possible values for person and number, what the remaining possibilities
are. Suppose, for example, that in the relevant feature system
declaration, the features person and
number are given the following <valRange>
elements:
Suppose, further, that the person and number features are obligatory
in feature structures of the type ``agreement structure''. Then the
element <fs id=Np3ns> above is equivalent to the following
alternation; the features whose value is <any> may be omitted,
since they are implied by the default value of sb for the
rel attribute in the enclosing <fs> elements.
If, on the other hand, the person and number features were optional
in feature structures of type ``agreement structure'', then the
interpretation of an underspecified feature structure will change. The
element <fs id=Np3ns> given above is then equivalent to the
following alternation; the features whose value is <uncertain>
may be omitted as they are implied by the default subsumption relation
holding between the structure given and the actual structure.
The first example states that the extras
feature has the null set as its value. The second example states that
the extras feature is a set which is not equal to
the null set. That is, its actual value might be any non-null set. The
third example states that the extras feature has
as its value a set of which the null set is a subset; that is to say,
any set at all, including the null set. Note that this is not
equivalent to the following, which states that the extras feature has as
its value a single element which is any legal value for the
extras feature, including for example a
<str> element containing the value pool.
Finally, the fourth example states that the
extras feature has as its value a set of which
the null set is not a subset. Since the null set is a subset of every
set, the fourth example in effect claims that the
extras feature has no legal value; it is thus
equivalent to the following, which states directly that the
extras feature has no value.
Consider next the use of the rel attribute with the
<f> element when the given value of the feature is a single
<str> element with the content pool:
The first example states that the value of the
extras feature is a set consisting of a single
member, namely a <str> element containing the value
pool. The second example states that the
extras feature has as its value a set which is
not equal to the set consisting of this particular member. It could,
however, be a two-membered set, one of whose members is some other
value. This example is thus not equivalent to the following, which
states that the extras feature has as its value a
set comprising a single member other than a
<str> element with the content pool:
The third example states that the extras
feature has as its value any set of which the set consisting of the
single member specified is a subset (i.e., any set which contains the
element <str> with the value pool, and possibly
others). Finally, the fourth example states that the
extras feature has as its value any set which
does not contain this element as a member.
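For set-valued features, the four relations discussed here correspond to familiar set operations. The Python sketch below illustrates that reading; the function name set_rel and the use of Python frozensets to stand for feature values organized as sets are assumptions made for the illustration.

```python
def set_rel(rel, given, actual):
    """True if `actual` is a possible value, given the set `given` and `rel`."""
    given, actual = frozenset(given), frozenset(actual)
    if rel == "eq":
        return actual == given
    if rel == "ne":
        return actual != given
    if rel == "sb":
        # The given set subsumes the actual set if it is a subset of it.
        return given <= actual
    if rel == "ns":
        return not given <= actual
    raise ValueError(f"unknown relation: {rel!r}")

pool = {"pool"}
print(set_rel("ne", pool, {"pool", "jacuzzi"}))  # True: a different, two-membered set
print(set_rel("sb", pool, {"pool", "jacuzzi"}))  # True: contains pool and possibly others
print(set_rel("ns", pool, {"jacuzzi"}))          # True: does not contain pool
print(set_rel("ns", set(), {"jacuzzi"}))         # False: the null set is a subset of every set
```

The last line reflects the observation above that rel=ns applied to the null set leaves the feature with no legal value.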
Because the value sb is not defined for the attribute
rel on the <nbr>, <msr> and <rate>
elements, the indication that a value may be any number, measure or rate
is sometimes not quite as simple. Here is one way of specifying any
positive or negative integer numeric value.
The value ns also is understood in similar ways in the
different elements in which it may occur. Above in this section, the
equivalence of the following representations under certain conditions
was shown (the id attributes and the redundant features with
<any> values have been omitted).
The value ns has an analogous meaning when the value in
question is a set rather than a feature structure. Recast in such
terms, the equivalence above still holds good:
The first illustration is based on the observation that from the
example text, it is possible to infer a fairly extensive medical history
for the dog described in it. Suppose that we have a definition of a
feature structure that represents a canine medical history. Then we can
fill in feature values in that history from the text, and prepare a
<linkGrp> element that specifies the links between the text
segments and the various features specified in the feature structure.
Here is a hypothetical example of such a filled-in feature structure and
associated link group.
From this illustration, we see that links can be made not only between
text and feature structure elements, but also between text and feature
elements. For that matter, links between text and feature value elements
can also be made.
The second illustration takes advantage of the fact that this text,
like others that appear in the BNC, has been provided with detailed
grammatical markup of most of its orthographic words and certain other
comparable structural units. For example, the second paragraph of the
above text has been marked up as follows.
The entities that appear in this fragment may be expanded into
pointers to feature structures that represent grammatical structure by
means of entity definitions as follows.
This method of associating feature structures with textual elements
has a number of drawbacks, most important of which is the fact that the
association is implicit, relying on the relative position of pointer
and associated text, rather than being explicit.
A better method would be to segment the text into the units under
analysis, and point to the feature structures from within the unit tags,
by means of the ana attribute (see sections 15.2, Global Attributes for Simple Analyses,
and 15.4, Linguistic Annotation).
To provide pointers in both directions between text and structural
analysis, one may supply both the text segments and the feature-structure
tags with identifiers, and associate the segments with their analysis by
means of a <linkGrp> (see section 14.1, Pointers), as follows.
First, we define a feature-structure library to
represent all of the
grammatical structures that are used in the BNC encoding scheme. (For
illustrative purposes, we cite here only the structures needed for the
first six words of the sample sentence):
Next, here is a markup of the start of our sample sentence being
analyzed, with identifiers for each segment; see section 15.1, Linguistic Segment Categories
for discussion of the <phr>, <w>,
<m> and <c> elements used here.
Finally, here is a <linkGrp>, which contains all of the
<link> elements that associate the text segments in the example
sentence with their respective grammatical structures.
This grammatical markup represents the text as completely unambiguous,
despite the fact that instances of the same textual unit are associated
with different structure elements (e.g. the word
to), and at least one sequence (namely
to exercise, identified by the attribute values
id=mds0905 and id=mds0906), is in fact
structurally ambiguous in English. That sequence may be analyzed as a
preposition followed by a singular noun (as this markup asserts) or as
the infinitive marker followed by an uninflected form of a main verb.
To represent the ambiguity of words like to and
exercise, and of phrases like to
exercise, we may use the <alt> and <join>
elements defined in sections 14.8, Alternation, and 14.7, Aggregation,
as follows.
First, we define <alt> elements for
the ambiguous word classes, and add these to the <fsLib>.
As the encoding now stands, the phrase to
exercise has four structural analyses associated with it:
preposition followed by noun, preposition followed by verb, infinitive
marker followed by noun and infinitive marker followed by verb. To
narrow the choices down to the desired two, namely preposition followed
by noun and infinitive marker followed by verb, we next form
<join> elements to represent the desired sequences.
Note that the technique of forming <join> elements for
sequences of structure elements and associating them with textual units
can also be used to provide a complete structural analysis for the
complex word he'd. First, we add an
id attribute for the word.
<f name=case><sym value=genitive></f>
<f name=case><sym rel=eq value=genitive></f>
The Not-Equals Relation
The rel attribute can also be specified as having the
value ne, which means that the associated feature has a
value which is not equal to the given value. For example, the value
<nbr rel=ne value=1> in the following example denotes any legal
numeric value for the element <f name=number.of.bathrooms> other
than 1.
<f name=number.of.bathrooms><nbr rel=ne value=1></f>
<!-- VALRANGE specification in FDECL for CASE feature -->
<valRange>
<vAlt>
<sym value=nominative>
<sym value=genitive>
<sym value=dative>
<sym value=accusative>
<sym value=vocative>
</vAlt>
</valRange>
<f name=case><sym rel=ne value=genitive></f>
<f name=case>
<vAlt>
<sym value=nominative>
<sym value=dative>
<sym value=accusative>
<sym value=vocative>
</vAlt>
</f>
That is, when the rel attribute occurs with the value
ne in the value of an obligatory feature in a feature
structure, the actual value of that feature may be any of its legal
values other than the specified value.
<f name=case><sym rel=ne value=genitive></f>
<f name=case>
<vAlt>
<sym value=nominative>
<sym value=dative>
<sym value=accusative>
<sym value=vocative>
<none>
</vAlt>
</f>
That is, when the rel attribute has the value
ne in the value of an optional feature in a feature
structure, the actual value of that feature may be any of its legal
values other than the specified value, or <none>.
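The two expansions of rel=ne can be stated compactly. The Python sketch below is illustrative only: the list CASE_RANGE copies the legal values from the <valRange> shown above, the function name not_equal is hypothetical, and the string "none" stands for the <none> element.

```python
# Legal values for the case feature, as in the VALRANGE specification above.
CASE_RANGE = ["nominative", "genitive", "dative", "accusative", "vocative"]

def not_equal(value, legal_values, obligatory):
    """Expand rel=ne on `value` into the explicit list of alternants."""
    alternants = [v for v in legal_values if v != value]
    if not obligatory:
        # An optional feature may also have no value at all.
        alternants.append("none")
    return alternants

print(not_equal("genitive", CASE_RANGE, obligatory=True))
print(not_equal("genitive", CASE_RANGE, obligatory=False))
```

The first call yields the four remaining cases; the second yields the same four followed by the marker for <none>.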
<f name=number.of.bathrooms><nbr rel=ne value=3 valueTo=5></f>
Other Inequality Relations
For the elements <nbr>, <msr>, <rate>, and
<str>, the rel attribute may also take on the
following values; the use of these values for the <str> element
presumes that a particular character and string ordering (or
sorting) convention is understood.
lt
le
gt
ge
We say that one range is less than or equal to another
if both the value and valueTo attributes of the
first are less than or equal to the corresponding attributes of the
second.
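In the following examples, the first specifies that the number of
bedrooms is less than 5; the second that an illegal speed is any rate
greater than 65 miles per hour; the third that the lot size is less than
or equal to the range of 5 to 10 acres;
the fourth that the last name is any string greater than the empty
string (i.e., any nonempty string, given normal string-ordering
conventions); and the fifth that for a feature whose value is a list of
two strings, the first precedes the string M and
the second is the string M, or any string
following it.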
<f name=number.of.bedrooms><nbr rel=lt value=5></f>
<f name=illegal.speed><rate rel=gt value=65 unit=miles per=hour></f>
<f name=lot.size><msr rel=le value=5 valueTo=10 unit=acre></f>
<f name=last.name><str rel=gt></str></f>
<f name=pairs org=list><str rel=lt>M</str><str rel=ge>M</str></f>
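As a rough illustration of how these relations might be evaluated, the sketch below compares single values with Python's built-in ordering and compares ranges using the definition of less than or equal given above. The function names are hypothetical, ranges are modelled as (value, valueTo) pairs, and Python's code-point string ordering merely stands in for whatever string-ordering convention is actually understood.

```python
import operator

# Map the rel values onto Python comparison functions (scalar values only).
SCALAR_RELATIONS = {
    "lt": operator.lt,
    "le": operator.le,
    "gt": operator.gt,
    "ge": operator.ge,
}

def satisfies(rel, actual, given):
    """True if a single actual value stands in relation `rel` to `given`."""
    return SCALAR_RELATIONS[rel](actual, given)

def range_le(first, second):
    """Range comparison as defined in the text: both endpoints must be <=."""
    first_value, first_to = first
    second_value, second_to = second
    return first_value <= second_value and first_to <= second_to

print(satisfies("lt", 4, 5))           # four bedrooms against rel=lt value=5
print(satisfies("gt", "Smith", ""))    # a nonempty last name against rel=gt ""
print(range_le((3, 8), (5, 10)))       # a 3-8 acre range against 5-10 acres
```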
Subsumption and Non-subsumption Relations
When the rel attribute is given the values sb
or ns, the markup expresses the claim that the value given
subsumes, or does not subsume, the actual value for the feature in
question.
<str rel=sb></str>
<str rel=sb>the</str>
<str rel=ns>the</str>
<fs id=p3ns type='agreement structure'>
<f name=person><sym value=third></f>
<f name=number><sym value=singular></f>
</fs> <!-- 3d person singular -->
<fs id=p3nx type='agreement structure'>
<f name=person><sym value=third></f>
</fs> <!-- 3d person -->
<fs id=pxns type='agreement structure'>
<f name=number><sym value=singular></f>
</fs> <!-- singular -->
<fs id=pxnx type='agreement structure'>
</fs> <!-- -->
<fs id=p3na type='agreement structure'>
<f name=person><sym value=third></f>
<f name=number><any></f>
</fs> <!-- 3d person -->
<fs id=pans type='agreement structure'>
<f name=person><any></f>
<f name=number><sym value=singular></f>
</fs> <!-- singular -->
<fs id=pana type='agreement structure'>
<f name=person><any></f>
<f name=number><any></f>
</fs> <!-- -->
<fs id=p3nu type='agreement structure'>
<f name=person><sym value=third></f>
<f name=number><uncertain></f>
</fs> <!-- 3d person -->
<fs id=puns type='agreement structure'>
<f name=person><uncertain></f>
<f name=number><sym value=singular></f>
</fs> <!-- singular -->
<fs id=punu type='agreement structure'>
<f name=person><uncertain></f>
<f name=number><uncertain></f>
</fs> <!-- -->
That is, if an optional feature is omitted from a feature-structure
representation, then that feature may have any of its legal values or
the value <uncertain>.
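The subsumption relation among the agreement structures above can be sketched as follows. This is an illustrative reading only: feature structures are modelled as Python dictionaries with atomic values, the names subsumes and ANY are assumptions of the sketch, and a feature given as <any> is treated like an omitted feature, that is, as placing no constraint on the actual value.

```python
ANY = object()  # stands in for the <any> element

def subsumes(general, specific):
    """True if every constraint stated in `general` also holds in `specific`."""
    for name, value in general.items():
        if value is ANY:
            continue  # <any> places no constraint on this feature
        if name not in specific or specific[name] != value:
            return False
    return True

# Dictionary counterparts of some of the agreement structures shown above.
p3ns = {"person": "third", "number": "singular"}
p3nx = {"person": "third"}
pxnx = {}
p3na = {"person": "third", "number": ANY}

print(subsumes(p3nx, p3ns))  # the less specific structure subsumes p3ns
print(subsumes(p3ns, p3nx))  # but not the other way round
```

Under this reading the empty structure pxnx subsumes every agreement structure, since it states no constraints at all.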
<f name=agreement>
<fs id=Np3ns rel=ns type='agreement structure'>
<f name=person><sym value=third></f>
<f name=number><sym value=singular></f>
</fs>
</f>
<!-- VALRANGE for the PERSON feature -->
<valRange>
<vAlt>
<sym value=first>
<sym value=second>
<sym value=third>
</vAlt>
</valRange>
<!-- VALRANGE for the NUMBER feature -->
<valRange>
<vAlt>
<sym value=singular>
<sym value=plural>
</vAlt>
</valRange>
<vAlt id=p12na-panp>
<fs id=p12na type='agreement structure'>
<f name=person>
<vAlt><sym value=first><sym value=second></vAlt></f>
<f name=number><any></f>
</fs>
<fs id=panp type='agreement structure'>
<f name=person><any></f>
<f name=number><sym value=plural></f>
</fs>
</vAlt>
<vAlt id=p120nu-punp0>
<fs id=p120nu type='agreement structure'>
<f name=person>
<vAlt><sym value=first><sym value=second><none></vAlt>
</f>
<f name=number><uncertain></f>
</fs>
<fs id=punp0 type='agreement structure'>
<f name=person><uncertain></f>
<f name=number>
<vAlt><sym value=plural><none></vAlt>
</f>
</fs>
</vAlt>
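The correspondence between the ns encoding and the first alternation above can be checked by enumeration. The sketch below is illustrative only: it considers just the fully specified structures permitted by the two value ranges, models them as dictionaries, and uses hypothetical helper names.

```python
from itertools import product

# Legal values copied from the value ranges for person and number.
PERSON = ["first", "second", "third"]
NUMBER = ["singular", "plural"]

def subsumes(general, specific):
    """True if every feature-value pair of `general` occurs in `specific`."""
    return all(specific.get(name) == value for name, value in general.items())

# The structure carrying rel=ns: third person singular.
given = {"person": "third", "number": "singular"}

all_structures = [{"person": p, "number": n} for p, n in product(PERSON, NUMBER)]
not_subsumed = [fs for fs in all_structures if not subsumes(given, fs)]

def in_alternation(fs):
    """Membership in the alternation: first or second person, or plural."""
    return fs["person"] in ("first", "second") or fs["number"] == "plural"

print(not_subsumed)
print([fs for fs in all_structures if in_alternation(fs)] == not_subsumed)
```

Every fully specified structure other than third person singular falls under one of the two alternants, which is what the alternation is meant to express.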
Relations Holding with Sets, Bags, and Lists
The rel attribute is also provided for the <f>
element, but is designed to be used with that element only when its
org attribute (see section 16.6, Singleton, Set, Bag and List
Collections of Values) is set to set, bag, or list. When
associated with the <f> element, the rel attribute may
take on any of the following four values: eq,
ne, sb, and ns. The default
value is eq. Consider first the use of the rel
attribute with the <f> element when the given value of the
feature is <null>.
<f name=extras org=set><null></f>
<f name=extras org=set rel=ne><null></f>
<f name=extras org=set rel=sb><null></f>
<f name=extras org=set rel=ns><null></f>
<f name=extras org=set><any></f>
<f name=extras org=set><none></f>
<f name=extras org=set><str>pool</str></f>
<f name=extras org=set rel=ne><str>pool</str></f>
<f name=extras org=set rel=sb><str>pool</str></f>
<f name=extras org=set rel=ns><str>pool</str></f>
<f name=extras org=set><str rel=ne>pool</str></f>
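One possible reading of the four relations for set-valued features is sketched below. It is an assumption of this sketch, not a statement of the tag set's definition, that sb holds when every given member occurs in the actual set (so that <null> with sb admits any set) and that ns is its negation; bags and lists are not covered. The function name is hypothetical.

```python
def set_relation_holds(rel, given, actual):
    """Evaluate eq, ne, sb, or ns between a given set and an actual set."""
    given, actual = frozenset(given), frozenset(actual)
    if rel == "eq":
        return actual == given
    if rel == "ne":
        return actual != given
    if rel == "sb":
        # Assumed reading: the given members are all present in the actual set.
        return given <= actual
    if rel == "ns":
        return not (given <= actual)
    raise ValueError(f"unsupported rel value: {rel}")

print(set_relation_holds("eq", [], []))                     # <null> with rel=eq
print(set_relation_holds("sb", ["pool"], ["pool", "spa"]))  # set containing pool
print(set_relation_holds("ns", ["pool"], ["spa"]))          # set lacking pool
```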
Varieties of Subsumption and Non-subsumption
The rel values sb and ns have
different meanings depending on whether they occur within a
<str>, <fs> or <f> element. However, the use of a
common name for the value reflects a fundamental similarity in those
meanings. For example, the value sb can be used in all
three elements to indicate that the actual value is any string, any
feature structure, or any set, bag or list, as follows. In the second
example below, the rel attribute has not been specified,
since it has the value sb by default on <fs>
elements.
<str rel=sb></str>
<fs></fs>
<f name=... org=set rel=sb><null></f>
<f name=... org=bag rel=sb><null></f>
<f name=... org=list rel=sb><null></f>
Typically, there will be no need to use an encoding
like this one as the value of a feature, since the <any> element
is available for that purpose. However, in setting up the feature
declaration for that feature, it may be necessary to use such an
encoding, precisely so as to provide an interpretation for the use of
the <any> element as the value of that feature.
<vAlt>
<nbr type=int rel=gt value=0>
<nbr type=int rel=le value=0>
</vAlt>
<f name=agreement>
<fs rel=ns type='agreement structure'>
<f name=person><sym value=third></f>
<f name=number><sym value=singular></f>
</fs>
</f>
<f name=agreement>
<vAlt>
<fs type='agreement structure'>
<f name=person>
<vAlt><sym value=first><sym value=second></vAlt>
</f>
</fs>
<fs type='agreement structure'>
<f name=number><sym value=plural></f>
</fs>
</vAlt>
</f>
<f name=agreement org=set rel=ns>
<sym value=third><sym value=singular>
</f>
<f name=agreement org=set rel=sb>
<vAlt>
<vAlt><sym value=first><sym value=second></vAlt>
<sym value=plural>
</vAlt>
</f>
16.10: Two Illustrations
In this section, we present two illustrations, both based on a single
text, of how to associate feature structures and their components with
textual elements. Our example text is the article Memoirs of a Dog
Shrink that appeared in the popular magazine Dogs
Today in August 1991. This text has been selected for
inclusion in the British National Corpus. The first illustration
associates the text with a structure that represents a significant
portion of the information contained in the text. The second marks up
the grammatical structure of the orthographic words and certain other
comparable units in the text. Here is the text, with markup provided
down to the level of <s> elements. The n attribute
values are taken from the BNC markup; the id attribute
values have been added for purposes of these illustrations.
<div1 id=DT91mds type='article'>
<head rend=italic>
<s id=mds01 n=00732>Memoirs of a Dog Shrink</s></head>
<head type=sub rend=italic>
<s id=mds04 n=00735>Cartoonist Russell Jones
takes a ramble through Peter Neville's files</s></head>
<list>
<item><s id=mds05 n=00736>Case number: 72</s></item>
<item><s id=mds06 n=00737>Name: Jessie</s></item>
<item><s id=mds07 n=00738>Breed: Collie</s></item>
<item><s id=mds08 n=00739>Problem: Light bulb phobia</s></item>
</list>
<p><s id=mds09 n=00740>Jess the collie was a laid-back
sort of hound who spent most of his life stretched out
on a fireside rug in his large Surrey home.</s></p>
<p><s id=mds10 n=00741>The closest he came to exercise
was to open one eye every so often, if someone entered
the room, or to open both eyes, smile, and wag his
tail as he'd done on one occasion when confronted by a
housebreaker!</s></p>
<p><s id=mds11 n=00742>This extremely lazy lifestyle
was one long yawn from dawn to dusk.</s>
<s id=mds12 n=00743>Only the odd bouts of involuntary
twitching in his sleep reassured his owner that Jess
was still safe and sound in the land of the
living!</s></p>
<p><s id=mds13 n=00744>One winter night, as the mutt
twitched away in front of the fire, his mind somewhere
between Basingstoke and the twilight zone, a 100-watt
light bulb in the standard lamp above his head
suddenly exploded without warning!</s></p>
<p><s id=mds14 n=00745>According to his owner, who
witnessed the spectacle, Jessie rose gracefully toward
the ceiling like a space shuttle and, after lingering
in mid-air for what seemed an eternity, crashed to the
floor and fled the house with a speed and agility the
owner found quite amazing.</s></p>
<p><s id=mds15 n=00746>Jessie did not return home for
several hours.</s>
<s id=mds16 n=00747>When he eventually did show up, it
was obvious to all that he was a changed dog!</s>
<s id=mds17 n=00748>What plodded through the front door
was not the lovable, lazy hound who had once lived
there but a grim-faced light bulb serial
killer!</s></p>
<p><s id=mds18 n=00749>Within seconds of his return,
Jessie launched a vicious attack on a table lamp,
popping the bulb and wrecking the shade before
charging into the lounge.</s>
<s id=mds19 n=00750>There, in a frenzy of violence, he
reduced the standard lamp to a table lamp in 10
seconds flat!</s></p>
<p><s id=mds20 n=00751>After a room-to-room chase
lasting several minutes, during which every lamp in
the house was turned to sawdust, the dog was finally
caught and wrestled to the ground.</s></p>
<p><s id=mds21 n=00752>With his house plunged into
darkness, Jessie's owner sought my help.</s></p>
<div2 type='part'><head>
<s id=mds22 n=00753>SIMPLE SOLUTION</s></head>
<p><s id=mds23 n=00754>When I first saw the dog, it
was quite obvious he'd been deeply affected by the
explosion and had developed a 100-watt phobia for
light bulbs!</s></p>
<p><s id=mds24 n=00755>By placing his feeding bowl
closer each day to a table lamp the dog gradually
learned to live with his enemy.</s>
<s id=mds25 n=00756>Within a couple of weeks, his
killer instincts had disappeared and he was back where
he belonged — twitching away peacefully on the
fireside rug.</s></p>
</div2>
</div1>
<fs type='canine medical history' id=j37>
<f name=name id=j37pn><str>Jessie</str></f>
<f name=called.by org=set id=j37pc>
<str>Jessie</str>
<str>Jess</str>
</f>
<f name=breed id=j37b><sym value=collie></f>
<f name=owner id=j37o>
<fs type='owner description'>
<f name=name><uncertain></f>
<f name=address id=j37or><str>Surrey</str></f>
</fs>
</f>
<f name=illness org=list id=j37i>
<fs type='case history' id=j37i1>
<f name=name.of.specialist id=j37i1sn>
<fs type='name structure'>
<f name=last.name><str>Neville</str></f>
<f name=first.name><str>Peter</str></f>
</fs>
</f>
<f name=title.of.specialist><uncertain></f>
<f name=case.number id=j37i1n><nbr value=72></f>
<f name=age.at.incidence><uncertain></f>
<f name=date.of.incidence><uncertain></f>
<f name=baseline.condition org=set id=j37i1b>
<sym value=lazy>
<sym value=friendly>
<sym value=indoor>
</f>
<f name=symptoms id=j37i1s>
<fs type='symptom structure'>
<f name=behaviors org=set id=j37i1sb>
<sym value=agitated>
<sym value=destructive>
<sym value=unfriendly>
</f>
<f name=particulars id=j37i1sp>
<str>ran off, then returned and
destroyed every lamp in the house</str></f>
</fs>
</f>
<f name=diagnosis id=j37i1d>
<fs type='diagnosis structure'>
<f name=date.of.diagnosis><uncertain></f>
<f name=disease id=j37i1dd>
<str>light bulb phobia</str>
</f>
<f name=presumed.cause id=j37i1dc>
<str>explosion of light bulb over patient's head</str>
</f>
</fs>
</f>
<f name=treatment id=j37i1t>
<fs type='treatment history'>
<f name=medicine><none></f>
<f name=regime id=j37i1tr><str>positive reinforcement</str></f>
<f name=particulars id=j37i1tp>
<str>systematically decreased distance between
feeding bowl and table lamp</str></f>
<f name=duration.of.treatment id=j37i1td>
<msr unit=week value=2>
</f>
</fs>
</f>
<f name=result id=j37i1r>
<str>return to baseline condition</str>
</f>
</fs>
</f>
</fs>
<linkGrp domains='j37 DT91mds'
targFunc='feature segment'
extendTarg=2>
<link targets='j37pn mds06'>
<link targets='j37pc mds09 mds11 mds14 mds18 mds21'>
<link targets='j37b mds07 mds09'>
<link targets='j37or mds09'>
<link targets='j37i1sn mds04'>
<link targets='j37i1n mds05'>
<link targets='j37i1b mds09 mds10 mds11'>
<link targets='j37i1sb mds14 mds16 mds17 mds18 mds19 mds20'>
<link targets='j37i1sp mds15 mds20'>
<link targets='j37i1dd mds08 mds23'>
<link targets='j37i1dc mds23'>
<link targets='j37i1tr mds24'>
<link targets='j37i1td mds25'>
<link targets='j37i1r mds25'>
</linkGrp>
<s n=00741>The&AT0; closest&AJS; he&PNP; came&VVD; to&PRP; exercise&NN1;
was&VBD; to&TO0; open&VVI; one&CRD; eye&NN1; every so often&AV0;,&PUN;
if&CJS; someone&PNI; entered&VVD; the&AT0; room&NN1;,&PUN; or&CJC;
to&TO0; open&VVI; both&DT0; eyes&NN2;,&PUN; smile&VVI;,&PUN; and&CJC;
wag&VVI; his&DPS; tail&NN1; as&CJS; he&PNP;'d&VHD; done&VDN; on&PRP;
one&CRD; occasion&NN1; when&AVQ; confronted&VVN; by&PRP; a&AT0;
housebreaker&NN1;!&PUN;</s>
<!-- ... -->
<!ENTITY AJS "<ptr target=AJS>" >
<!-- ... -->
<!ENTITY AT0 "<ptr target=AT0>" >
<!-- ... -->
<s n=00741>
<w ana=AT0>The</w>
<w ana=AJS>closest</w>
<w ana=PNP>he</w>
<w ana=VVD>came</w>
<w ana=PRP>to</w>
<w ana=NN1>exercise</w>
<w ana=VBD>was</w>
<w ana=TO0>to</w>
<w ana=VVI>open</w>
<w ana=CRD>one</w>
<w ana=NN1>eye</w>
<phr ana=AV0>
<w>every</w>
<w>so</w>
<w>often</w>
</phr>
<c ana=PUN>,</c>
<w ana=CJS>if</w>
<w ana=PNI>someone</w>
<w ana=VVD>entered</w>
<w ana=AT0>the</w>
<w ana=NN1>room</w>
<!-- ... -->
</s>
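The conversion from the entity-reference form to the element form can be sketched mechanically. The code below is an illustration under stated assumptions only: the function name to_elements and the regular expression are hypothetical, punctuation is recognized solely by the code PUN, and any tagged unit containing spaces is wrapped in a <phr> element.

```python
import re

# A tagged unit: some text immediately followed by an entity reference.
TOKEN = re.compile(r"\s*([^&]+?)&([A-Z0-9]+);")

def to_elements(tagged):
    """Turn text such as 'The&AT0; closest&AJS;' into a list of element lines."""
    lines = []
    for text, code in TOKEN.findall(tagged):
        words = text.split()
        if code == "PUN":
            lines.append(f"<c ana={code}>{text}</c>")
        elif len(words) > 1:
            # Multi-word unit: one <phr> carrying the code, one <w> per word.
            lines.append(f"<phr ana={code}>")
            lines.extend(f"<w>{word}</w>" for word in words)
            lines.append("</phr>")
        else:
            lines.append(f"<w ana={code}>{text}</w>")
    return lines

for line in to_elements("eye&NN1; every so often&AV0;,&PUN; if&CJS;"):
    print(line)
```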
<fsLib id=BNCgs type='BNC grammatical structures'>
<!-- ... -->
<fs type='grammatical structure' id=AJS feats='Wj Ds'></fs>
<fs type='grammatical structure' id=AT0 feats='Wl'></fs>
<fs type='grammatical structure' id=PNP feats='Wr Rp'></fs>
<fs type='grammatical structure' id=VVD feats='Wv Bv Fd'></fs>
<fs type='grammatical structure' id=PRP feats='Wp Bp'></fs>
<fs type='grammatical structure' id=NN1 feats='Wn Tc Ns'></fs>
<!-- ... -->
</fsLib>
It will be noted that each feature structure in this library bears an
identifier corresponding with the code supplied as the value for the
ana attribute in the sample sentence. The component
features of each feature structure are further specified by the
feats attribute. These identify one or more <f>
elements in the following feature library (again, only a few of the
available features are quoted here):
<fLib type='BNC grammatical features'>
<!-- ... -->
<f name=verbBase id=Bv><sym value=main></f>
<f name=prepBase id=Bp><sym value=lexical></f>
<f name=degree id=Ds><sym value=superlative></f>
<f name=verbForm id=Fd><sym value=ed></f>
<f name=number id=Ns><sym value=singular></f>
<f name=pronType id=Rp><sym value=personal></f>
<f name=nounType id=Tc><sym value=common></f>
<f name=class id=Wj><sym value=adjective></f>
<f name=class id=Wl><sym value=article></f>
<f name=class id=Wn><sym value=noun></f>
<f name=class id=Wp><sym value=preposition></f>
<f name=class id=Wr><sym value=pronoun></f>
<f name=class id=Wv><sym value=verb></f>
<!-- ... -->
</fLib>
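The way a code such as AJS resolves, through the feats attribute, to a set of named features can be sketched with two lookup tables. The dictionaries below simply transcribe the library entries quoted above; the function name resolve and the dictionary representation are assumptions made for illustration.

```python
# Feature library: identifier -> (feature name, symbolic value).
FEATURE_LIBRARY = {
    "Bv": ("verbBase", "main"),
    "Bp": ("prepBase", "lexical"),
    "Ds": ("degree", "superlative"),
    "Fd": ("verbForm", "ed"),
    "Ns": ("number", "singular"),
    "Rp": ("pronType", "personal"),
    "Tc": ("nounType", "common"),
    "Wj": ("class", "adjective"),
    "Wl": ("class", "article"),
    "Wn": ("class", "noun"),
    "Wp": ("class", "preposition"),
    "Wr": ("class", "pronoun"),
    "Wv": ("class", "verb"),
}

# Feature-structure library: identifier -> value of the feats attribute.
STRUCTURE_LIBRARY = {
    "AJS": "Wj Ds",
    "AT0": "Wl",
    "PNP": "Wr Rp",
    "VVD": "Wv Bv Fd",
    "PRP": "Wp Bp",
    "NN1": "Wn Tc Ns",
}

def resolve(code):
    """Expand a structure identifier into its feature name/value pairs."""
    feature_ids = STRUCTURE_LIBRARY[code].split()
    return dict(FEATURE_LIBRARY[feature_id] for feature_id in feature_ids)

print(resolve("AJS"))
print(resolve("NN1"))
```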
<s n=00741 id=mds09>
<w id=mds0901>The</w>
<w id=mds0902>closest</w>
<w id=mds0903>he</w>
<w id=mds0904>came</w>
<w id=mds0905>to</w>
<w id=mds0906>exercise</w>
<w id=mds0907>was</w>
<w id=mds0908>to</w>
<w id=mds0909>open</w>
<w id=mds0910>one</w>
<w id=mds0911>eye</w>
<phr id=mds0912>
<w>every</w>
<w>so</w>
<w>often</w>
</phr>
<c id=mds0913>,</c>
<w id=mds0914>if</w>
<w id=mds0915>someone</w>
<w id=mds0916>entered</w>
<w id=mds0917>the</w>
<w id=mds0918>room</w>
<c id=mds0919>,</c>
<w id=mds0920>or</w>
<w id=mds0921>to</w>
<w id=mds0922>open</w>
<w id=mds0923>both</w>
<w id=mds0924>eyes</w>
<c id=mds0925>,</c>
<w id=mds0926>smile</w>
<c id=mds0927>,</c>
<w id=mds0928>and</w>
<w id=mds0929>wag</w>
<w id=mds0930>his</w>
<w id=mds0931>tail</w>
<w id=mds0932>as</w>
<w>
<w id=mds0933>he</w>
<m id=mds0934>'d</m>
</w>
<w id=mds0935>done</w>
<w id=mds0936>on</w>
<w id=mds0937>one</w>
<w id=mds0938>occasion</w>
<w id=mds0939>when</w>
<w id=mds0940>confronted</w>
<w id=mds0941>by</w>
<w id=mds0942>a</w>
<w id=mds0943>housebreaker</w>
<c id=mds0944>!</c>
</s>
<linkGrp domains='mds09 BNCgs'
targFunc='segment analysis'
extendTarg=0>
<link targets='mds0901 AT0'>
<link targets='mds0902 AJS'>
<link targets='mds0903 PNP'>
<link targets='mds0904 VVD'>
<link targets='mds0905 PRP'>
<link targets='mds0906 NN1'>
<link targets='mds0907 VBD'>
<link targets='mds0908 TO0'>
<link targets='mds0909 VVI'>
<link targets='mds0910 CRD'>
<link targets='mds0911 NN1'>
<link targets='mds0912 AV0'>
<link targets='mds0913 PUN'>
<link targets='mds0914 CJS'>
<link targets='mds0915 PNI'>
<link targets='mds0916 VVD'>
<link targets='mds0917 AT0'>
<link targets='mds0918 NN1'>
<link targets='mds0919 PUN'>
<link targets='mds0920 CJC'>
<link targets='mds0921 TO0'>
<link targets='mds0922 VVI'>
<link targets='mds0923 DT0'>
<link targets='mds0924 NN2'>
<link targets='mds0925 PUN'>
<link targets='mds0926 VVI'>
<link targets='mds0927 PUN'>
<link targets='mds0928 CJC'>
<link targets='mds0929 VVI'>
<link targets='mds0930 DPS'>
<link targets='mds0931 NN1'>
<link targets='mds0932 CJS'>
<link targets='mds0933 PNP'>
<link targets='mds0934 VHD'>
<link targets='mds0935 VDN'>
<link targets='mds0936 PRP'>
<link targets='mds0937 CRD'>
<link targets='mds0938 NN1'>
<link targets='mds0939 AVQ'>
<link targets='mds0940 VVN'>
<link targets='mds0941 PRP'>
<link targets='mds0942 AT0'>
<link targets='mds0943 NN1'>
<link targets='mds0944 PUN'>
</linkGrp>
<alt id=PRP-TO0 targets='PRP TO0'>
<alt id=NN1-VVI targets='NN1 VVI'>
Next, we change the <link> elements for the text elements
identified by the id=mds0905 and id=mds0906
attribute values.
<link targets='mds0905 PRP-TO0'>
<link targets='mds0906 NN1-VVI'>
<join id=PRP.NN1 targets='PRP NN1'>
<join id=TO0.VVI targets='TO0 VVI'>
We then define an <alt> element to express the alternation
between the two <join> elements.
<alt id=PRP.NN1-TO0.VVI targets='PRP.NN1 TO0.VVI'>
Next, we add a <phr> element in the encoding of the text for the
phrase to exercise.
<phr id=mds090506>
<w id=mds0905>to</w>
<w id=mds0906>exercise</w>
</phr>
Finally, we add to the <linkGrp> element a <link> element
connecting that phrase to the <alt> that represents its two
analyses.
<link targets='mds090506 PRP.NN1-TO0.VVI'>
<w id=mds093334>
<w id=mds0933>he</w>
<m id=mds0934>'d</m>
</w>
Next, we form a join of the structures associated separately with the
subelements he and 'd.
<join id=PNP.VHD targets='PNP VHD'>
Finally, we define a link between the complex word and the new
<join> element.
<link targets='mds093334 PNP.VHD'>