Early modern spelling and TCP searching

The original spellings and letter choices present in TCP texts have been retained in the database. Because spelling was not regularized in this early period of printing, the keyword you are searching for may appear in multiple forms. The more of these spellings you include in a search, the more results you will get. This takes some practice and some creativity, but knowing a few simple habits of early modern typesetters can help you increase the number of results.

Early modern spelling and typesetting habits to keep in mind:

  • The letter e often appears on the end of words where you might not expect it. For example, regard may well appear as regarde. Truncating your search with an asterisk, looking for regard*, will ensure that you get returns that include the e, as well as forms like regarding.

  • The letters u and v are often interchangeable. As a result, if you are searching for the word slave, you might also look for slaue. The Boolean search screen can be very helpful in constructing these searches. Unfortunately, the system does not support wildcards in the middle of a word, so you cannot use sla*e in a simple search to find more hits.

  • w often appears as vv. A search for wonder, then, will be more productive if done from the Boolean search screen, entering wonder in the first box, selecting "or" from the drop-down menu, and then entering vvonder in the second box. You can also search for wonder* and vvonder* if you would like to pick up instances of wonderful or vvonderous.

  • The letter i often replaces j. You may want to search for iealous as well as jealous for more complete results.
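
The habits listed above can be combined mechanically to draft a list of spellings to try. The following is a minimal sketch in Python, not part of the TCP system: the function name spelling_variants and the SWAPS table are hypothetical, and the rules cover only the four habits described in this list.

```python
from itertools import product

# Letter substitutions taken from the list above (illustrative, not exhaustive).
SWAPS = {
    "u": ("u", "v"),   # u and v are often interchangeable
    "v": ("v", "u"),
    "j": ("j", "i"),   # i often replaces j
    "w": ("w", "vv"),  # w often appears as vv
}

def spelling_variants(word):
    """Return candidate early modern spellings of a modern word, sorted."""
    options = [SWAPS.get(ch, (ch,)) for ch in word.lower()]
    variants = {"".join(combo) for combo in product(*options)}
    # A final e often appears where a modern reader would not expect it.
    variants |= {v + "e" for v in variants if not v.endswith("e")}
    return sorted(variants)

print(spelling_variants("slave"))   # ['slaue', 'slave']
print(spelling_variants("wonder"))
```

Many of the generated forms will never occur in the texts. Treat the output as a list of candidates to check in the Word Index or to combine on the Boolean search screen.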

In many cases, TCP texts simplify characters for easier searching. There are many early modern symbols that can be ignored in structuring your search. For example:

ſ (long s) = s
œ = oe, and other ligatured or "joined" letters are rendered separately
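
If you are pasting a term copied from a facsimile or another edition, the same simplifications can be applied to it before searching. This is a small Python sketch under stated assumptions: the table SIMPLIFICATIONS and the function simplify are hypothetical names, only lowercase characters are handled, and the æ entry is an assumption based on the statement that joined letters are rendered separately.

```python
# Characters that TCP texts render in simplified form.
SIMPLIFICATIONS = {
    "\u017f": "s",   # long s
    "\u0153": "oe",  # oe ligature
    "\u00e6": "ae",  # ae ligature (assumed to follow the same rule)
}

def simplify(term):
    """Replace long s and ligatures with their plain-letter equivalents."""
    for old, new in SIMPLIFICATIONS.items():
        term = term.replace(old, new)
    return term

print(simplify("\u017flaue"))  # slaue
```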

To see a list of spellings that might have been used for your search term during the early modern era, try entering the term into the Word Index. You will be taken to a screen that shows the spellings present in the database, listed in alphabetical order with your word highlighted. On that screen, you can look for other spellings that might be related and select them for viewing.

While the TCP encoding standards have worked to make these texts as accessible as possible, your search may well skip over some occurrences of your search term for the following reasons:

Macrons

Early modern typesetters often purposely omitted letters from words so that they would fit more easily on a line. These omissions are often signaled with a horizontal bar over a character in the word (such bars are called macrons). For example, convenient may appear as cōvenient. TCP texts transcribe such omissions using tildes, as in co~venient, rather than attempt to expand these abbreviations, which are often hard to make out. A search for convenient will not turn up this spelling.

Truncation can help in working around macrons, though not always when the macron appears near the beginning of a word. In cases where thoroughness really matters, you may want to query the database using the tilde in the spelling of your search term. A search for co~venient will find this variant spelling.
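
To draft tilde spellings for a term, one possible approach is sketched below in Python. It rests on an assumption not stated on this page: that the omitted letter is usually an m or n following a vowel, as in co~venient. The function name tilde_variants is hypothetical, and other kinds of omission are not covered.

```python
import re

def tilde_variants(word):
    """Replace one m or n that follows a vowel with a tilde, one variant per position."""
    variants = []
    for match in re.finditer(r"(?<=[aeiou])[mn]", word):
        variants.append(word[:match.start()] + "~" + word[match.end():])
    return variants

print(tilde_variants("convenient"))
# ['co~venient', 'conve~ient', 'convenie~t']
```

Each variant can then be entered as a search term in its own right.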

Abbreviations

Some TCP works use abbreviations borrowed from Latin texts. Words with common Latin roots are sometimes spelled using special symbols in place of ordinary letters. For example, perform can occasionally be spelled as ꝑform, where the symbol ꝑ stands for per. These abbreviations are labeled with SGML tags but not spelled out in the TCP encoded texts. As explained in using sgml tags, you may search for the presence of these tags if you wish.

Word Division

In many texts, space considerations drove early modern typesetters to break a word in two at the end of one line and continue it on the next. Sometimes these typesetters used hyphens, as we do today. When that happens, a pipe symbol (|) is inserted at the break, and you will see this mark on your screen. If the early modern typesetter omitted the hyphen, simply placing hap on one line and py on the next, the word is encoded as hap+py, and the plus sign will be visible.

The search engine will find words that are interrupted by the pipe symbol and the plus sign. For example, a search for happy will pick up hap|py, as well as hap+py.
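
If you are working with downloaded text outside the search engine, you would need to allow for these marks yourself. The Python sketch below shows one way to do so with a regular expression. It is not how the TCP search engine is implemented, and the function names break_tolerant_pattern and find_term are hypothetical.

```python
import re

def break_tolerant_pattern(term):
    """Build a regex matching term as a whole word, allowing | or + between letters."""
    body = r"[|+]?".join(re.escape(ch) for ch in term)
    return re.compile(r"\b" + body + r"\b")

def find_term(term, text):
    """Return every occurrence of term in text, including line-broken forms."""
    return [m.group(0) for m in break_tolerant_pattern(term).finditer(text)]

sample = "a happy man, a hap|py wife, a hap+py child"
print(find_term("happy", sample))  # ['happy', 'hap|py', 'hap+py']
```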

Related topics:

Search tips
Searching regions