Advanced users can enter SGML tags into the search box to locate
particular structures within texts.
Why would I want to search with SGML tags?
Using SGML tags allows you to customize your search in ways that
the drop-down menus on the interface do not. The search
screen drop-down menus are made to suit the queries that users
most commonly want to construct. These menu options
get their returns by reading SGML tagging, but there are plenty
of tags in the texts that are not picked up when these options
are selected. If you know how to enter these tags into the
search box, you can structure queries that meet more particular needs.
How to use SGML tags in searching
SGML tags can be entered in any search box to locate occurrences
of a particular label in the encoded text. This sort of
searching can be helpful for finding particular features of a
text, helping to single out structures like epigraphs, block quotations,
and lists. It can also allow you to explore, for example,
how many quotations are attributed to Horace in the database or
in a bookbagged group of texts.
You can use SGML tags as query terms on any TCP search screen.
SGML tags can be combined with other non-tag terms for Boolean searches.
Be sure to leave pointed brackets open and include an asterisk
when searching with SGML tags. Otherwise, the interface
will not know how to interpret your query. Because the database's
display software suppresses the tags to make TCP returns
easier to read, the tags will not be highlighted in red on the
results page. In fact, they will not appear at all, but
the structures will indeed be present, and you can either see
them in the text layout or rely on highlighted, non-SGML query
terms to locate the information you were looking for.
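The formatting rule described above (an open pointed bracket, the tag name, and a trailing asterisk, with no closing bracket) can be sketched as a small helper. This is only an illustrative Python sketch, not part of the TCP interface: the function name tag_term is hypothetical, and uppercasing the name is an assumption made to match the way tags are written on this page.

```python
import re


def tag_term(tag_name: str) -> str:
    """Format an SGML tag name as a search term: '<' + NAME + '*'.

    Hypothetical helper for illustration only. It builds the string
    you would type into the search box and does not contact the database.
    """
    # Accept input such as "Q", "<Q>", or "<Q*" and reduce it to the bare name.
    name = tag_name.strip().strip("<>*").upper()
    if not re.fullmatch(r"[A-Z][A-Z0-9]*", name):
        raise ValueError(f"not a plausible SGML tag name: {tag_name!r}")
    # Leave the bracket open and end with an asterisk.
    return f"<{name}*"


print(tag_term("Q"))       # <Q*
print(tag_term("<HI>"))    # <HI*
print(tag_term("figure"))  # <FIGURE*
```

The helper rejects input containing spaces or punctuation, since such a string could not be a single tag name.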
To learn more about the SGML tags used in TCP encoding,
please consult the Keyboarding Instructions and the project DTD.
Block Quotations: <Q*, <BIBL*
These have been marked with a <Q> tag, which is not associated
with ordinary, shorter quotations.
If the source of a quotation is given, the name will be labeled
with <BIBL>, which is used to denote information attached
to a quotation.
Typeface Changes: <HI*
Every change from the predominant typeface is marked with the
<HI> tag; the nature of the shift (e.g., from bold to italic)
is not noted.
Early modern printers often shifted fonts in order to emphasize
place names and the names of people, so searching with the <HI>
tag may serve as a useful shortcut for locating this sort of information.
Illustrations: <FIGURE*, <HEAD*
TCP texts note the presence of illustrations with a <FIGURE>
tag. These illustrations have not been categorized (as maps
or portraits, etc.), because the TCP interface Boolean search
option provides such categorization.
The captions for figures, however, have been recorded and can
be searched. They are labeled with the <HEAD> tag.
To search for captions mentioning Queen Elizabeth, then, you could
perform a Boolean search for "<FIGURE*" AND "<HEAD*" AND "Elizabeth".
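A Boolean query of this kind joins tag terms and ordinary words with AND. The following Python sketch shows one way to assemble such a query string. It assumes that each term is wrapped in double quotes and joined by AND, as in the example above; the function name boolean_and is hypothetical, the word term "Elizabeth" is supplied here as an assumption about how the caption's subject would be entered, and the interface itself may expect each term in its own search box.

```python
def boolean_and(*terms: str) -> str:
    """Join search terms with AND, wrapping each term in double quotes.

    Hypothetical helper for illustration only; it produces text and
    does not submit anything to the search interface.
    """
    if not terms:
        raise ValueError("at least one search term is required")
    return " AND ".join(f'"{term}"' for term in terms)


# Two tag terms plus one ordinary word term (the word is an assumption).
query = boolean_and("<FIGURE*", "<HEAD*", "Elizabeth")
print(query)  # "<FIGURE*" AND "<HEAD*" AND "Elizabeth"
```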
These summary paragraphs often appear directly under chapter
headings and make mention of the different subjects treated in
the section to follow. They may be worded like a Table of
Contents, but they are formatted as paragraphs, often with dashes
separating their components.
Generally serving as introductory comments on what is to come
rather than previews of a section's contents, these epigraphs
often take the form of quotations from famous authors.
If the material in an epigraph is indeed a quotation, it will
be labeled with the <Q> tag. If the material is in
verse, the lines will also be marked with the <L> tag.
There has been no attempt to expand abbreviations, but they have
been marked wherever they are recognized as such.
Where Latin abbreviations are noted, they are marked with their
own particular codes. Click here to view these codes along
with the characters early printers used to signal these abbreviations.
Tables and Lists:
Tables and lists have been noted in the encoding, and their parts
can be searched using the following tags:
Tables: <TABLE*, <ROW*, <CELL*
Lists: <LIST*, <ITEM*
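For quick reference, the structure and tag pairings named on this page up to this point can be gathered into a small lookup table. This Python sketch is illustrative only: the names TAG_TERMS and terms_for and the wording of the keys are hypothetical, and the table contains only tags that this page itself mentions.

```python
# Structure-to-search-term lookup, limited to tags named on this page.
TAG_TERMS = {
    "block quotation": ["<Q*"],
    "quotation source": ["<BIBL*"],
    "typeface change": ["<HI*"],
    "illustration": ["<FIGURE*"],
    "figure caption": ["<HEAD*"],
    "table": ["<TABLE*", "<ROW*", "<CELL*"],
    "list": ["<LIST*", "<ITEM*"],
}


def terms_for(structure: str) -> list[str]:
    """Return the search terms for a structure, ignoring case and spacing."""
    key = structure.strip().lower()
    if key not in TAG_TERMS:
        raise KeyError(f"no tag recorded for structure: {structure!r}")
    # Return a copy so callers cannot alter the lookup table.
    return list(TAG_TERMS[key])


print(terms_for("Table"))  # ['<TABLE*', '<ROW*', '<CELL*']
```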
Other Encoding for Printed Characters:
Ampersands, all symbols of the zodiac, and paragraph symbols
that appear in the source text have been noted in
the encoded version.
Tags Common to Particular Text Types:
All poems are marked as such. Each line of poetry is marked
with an <L> tag, and each stanza is marked as a line group with <LG>.
You can choose "Line" or "Stanza" from the
interface drop-down menu to focus your search, or you can enter
these tags if doing so better suits your needs.
Instances where lack of space led a printer to place the concluding
words of a line next to the previous line cannot be searched,
because keyboarders have been instructed to "patch" the two
parts of the line together.
The following features are marked and could be used to narrow a search:
<STAGE* Stage directions
<SPEAKER* Speakers' names (which are often abbreviated)
Speeches (which can appear without speaker names)
Letters are marked as such when they vary from the predominant form of the
text. Specific parts within them can be searched as well.
<CLOSER* Closing of the letter
Other Helpful SGML Tags to Know about:
The "section and work titles" search scans the divisions
generated by the Table of Contents command. These can also
be searched by entering the tags that mark the divisions.
For example, you can enter:
<TEXT* marks the start of every text in the database
<DIV* marks the different divisions of the texts and is followed by
a number. DIV1, for example, would be assigned to Book 1,
Book 2, and Book 3 of the Faerie Queene, and the
chapters within each book would be classified as DIV2
marks each page break
Markers denoting absences in the encoding:
<UNCLEAR* marks places where text cannot be read because of
blurred print or damage to the original
denotes a place in the text where material could not be encoded --
this can signal the presence of:
Non-Roman alphabets (the individual characters have not been recorded)