Keywords online help

Keyword builder demonstrates how a natural language parser can be used to build a list of keywords.

Web user interface.

In the main field, enter the text for which you want to build a list of keywords.

The Keywords count field sets the maximum number of keywords to return.

The Use verbs check box controls whether verbs can appear in the keyword list. Verbs are essential because they characterize the topic of a document, but in search queries verbs seem to be used less frequently than nouns. The check box lets you choose which kind of keywords to build.

How it works.

To extract keywords, Keyword builder uses the syntax relations between words in an utterance, together with per-word syntax information such as part of speech and additional syntax tags (for example, proper).

The importance of a word in an utterance may be determined by its syntax role. The subject, verb, or object typically conveys the most important information, while adjective and adverb modifiers play a helper role by adding detail to the core meaning. This can be used when calculating the importance of a word in a text.

Keyword builder checks subject, verb, directObject, indirectObject, and subjectComplementNoun:

Enumeration             Example                                Word in that role
subject                 The king gave Anne Boleyn his love.    king
verb                    The king gave Anne Boleyn his love.    gave
directObject            The king gave Anne Boleyn his love.    love
indirectObject          The king gave Anne Boleyn his love.    Anne Boleyn
subjectComplementNoun   The king is Henry VIII.                Henry VIII
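The role check can be illustrated with a minimal Python sketch. It is not Keyword builder's code: the Token class, the weight values, and the modifier role names are hypothetical, and the roles are written by hand here where a real application would take them from the parser.

```python
from dataclasses import dataclass

# Hypothetical weights: core roles score high, modifiers low.
# The numbers and the modifier role names are illustrative only.
ROLE_WEIGHTS = {
    "subject": 1.0,
    "verb": 1.0,
    "directObject": 1.0,
    "indirectObject": 1.0,
    "subjectComplementNoun": 1.0,
    "adjectiveModifier": 0.25,
    "adverbModifier": 0.25,
}


@dataclass(frozen=True)
class Token:
    text: str
    role: str  # syntax role, assumed to come from a parser


def role_weight(token: Token) -> float:
    """Return the weight for a token's role; unknown roles score 0."""
    return ROLE_WEIGHTS.get(token.role, 0.0)


# "The king gave Anne Boleyn his love." with hand-assigned roles.
# "Anne Boleyn" is kept as one token for simplicity.
sentence = [
    Token("The", "determiner"),
    Token("king", "subject"),
    Token("gave", "verb"),
    Token("Anne Boleyn", "indirectObject"),
    Token("his", "determiner"),
    Token("love", "directObject"),
]

# Keep only the words that fill one of the core roles.
candidates = [t.text for t in sentence if role_weight(t) >= 1.0]
```

With these assumed roles, candidates holds king, gave, Anne Boleyn, and love, while the determiners are dropped.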

Additionally, the part of speech may be examined. For example, auxiliary verbs cannot describe the topic of a document. Proper nouns describe the idea of a document better than common nouns because they are more specific.

Pronouns are skipped because they substitute for nouns. (If a pronoun appears as a subject or complement, you can increase the weight of the nouns it refers to.)

Syntactically important words that appear in the text more frequently are suggested as keywords.
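One possible way to express these part-of-speech rules is a small weighting function. The sketch below is an assumption for illustration: the function name, the tag strings, and the boost factor of 2.0 are hypothetical and are not taken from Keyword builder.

```python
def pos_weight(part_of_speech: str,
               is_auxiliary: bool = False,
               is_proper: bool = False) -> float:
    """Return a hypothetical weight multiplier based on part of speech."""
    if part_of_speech == "pronoun":
        return 0.0  # pronouns only stand in for nouns, so skip them
    if part_of_speech == "verb" and is_auxiliary:
        return 0.0  # auxiliary verbs cannot describe a topic
    if part_of_speech == "noun" and is_proper:
        return 2.0  # illustrative boost: proper nouns are more specific
    return 1.0


# Multipliers for a proper noun, a common noun, an auxiliary verb, a pronoun.
weights = [
    pos_weight("noun", is_proper=True),
    pos_weight("noun"),
    pos_weight("verb", is_auxiliary=True),
    pos_weight("pronoun"),
]
```

A multiplier of zero removes the word from consideration; any other value would scale the score that the word earns from its syntax role.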
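The frequency step, together with the Keywords count and Use verbs options, can be sketched as follows. This is a minimal Python sketch under assumptions: build_keywords and its parameters are hypothetical names, and the input is assumed to be a list of (word, part of speech) pairs that has already been reduced to syntactically important words.

```python
from collections import Counter


def build_keywords(words, max_keywords=5, use_verbs=True):
    """Return the most frequent words, most frequent first.

    words: iterable of (text, part_of_speech) pairs.
    max_keywords: upper bound on the length of the result.
    use_verbs: when False, verbs are left out of the result.
    """
    counts = Counter(
        text for text, pos in words
        if use_verbs or pos != "verb"
    )
    # most_common sorts by count; ties keep first-seen order.
    return [text for text, _ in counts.most_common(max_keywords)]


important = [
    ("king", "noun"), ("gave", "verb"), ("love", "noun"),
    ("king", "noun"), ("gave", "verb"), ("king", "noun"),
]

with_verbs = build_keywords(important, max_keywords=2)
without_verbs = build_keywords(important, max_keywords=2, use_verbs=False)
```

Here with_verbs is ["king", "gave"] and without_verbs is ["king", "love"]. Words are compared as exact strings, so different forms of the same word are counted separately.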

Demo doesn't use synonyms or word inflections.

For simplicity, Keyword builder doesn't use word inflections. For example, gave and give will appear as different words.

The demo doesn't use synonyms. King and emperor will appear as different words.

This is not a limitation of the algorithm. In a real application, you need to account for inflections and synonyms when gathering keyword statistics.
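A real application could normalize each word before counting it. The sketch below assumes two small hand-written lookup tables, LEMMAS and SYNONYMS; both tables and the normalize function are hypothetical, and a production system would more likely take base forms from the parser and synonyms from a thesaurus or domain dictionary.

```python
from collections import Counter

# Hypothetical lookup tables, for illustration only.
LEMMAS = {"gave": "give", "gives": "give", "kings": "king"}
SYNONYMS = {"emperor": "king"}


def normalize(word: str) -> str:
    """Map a word to its base form, then to a canonical synonym."""
    lowered = word.lower()
    lemma = LEMMAS.get(lowered, lowered)
    return SYNONYMS.get(lemma, lemma)


# Four surface forms collapse into two counted entries.
counts = Counter(normalize(w) for w in ["King", "emperor", "gave", "give"])
```

After normalization, king and give are each counted twice, where exact string matching would have produced four separate entries.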

Advantages of keyword builder

Keyword builder relies on syntax information. This is more accurate than a purely statistical approach or one based on formatting information (such as headers).

It can be applied without a semantic database.

Limitations of keyword builder

Keyword builder doesn't use word meanings. If you have a database of meanings, you can replace the lexemes with their meanings, which could give much better results. Depending on your objectives, a small database for your knowledge domain may be sufficient.

Keyword builder works best on documents that cover a single topic, although this is a general limitation of characterizing a text with a set of keywords.

For developers

How to build a list of keywords