The natural language question-answer online demo shows how a syntax-lexical pattern can be applied to search for direct answers to a question.
If you have a database of word meanings, you can replace words with their meanings and use a syntax-semantic pattern in the same way.
You can use the online demo for a quick test to get an idea of how searching by
syntax-lexical pattern works.
Web user interface.
The upper text field in the web form is for the question. Please note that the question should be
syntactically correct; otherwise the natural language
parser cannot process it properly. You can verify which syntax diagram is
used as a pattern for your question, and which diagrams are built for the answers in your text, with Reed-Kellogg syntax.
The second text box contains the text to search. question-answer doesn't use any
internal database, so you have to provide the text containing a direct answer in the web form.
How does it work?
First, the question (from the upper field) is parsed to produce a Reed-Kellogg
syntax graph. Then the graph is transformed into its direct answer form. For
example, a QuestionClause syntax node
is replaced with a Clause syntax node.
The resulting graph is used as a syntax-lexical pattern.
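The transformation step can be sketched as follows. This is a minimal illustration in Python, not the tool's actual code: the `Node` class, the `ANSWER_FORM` table, and the `Subject`, `Predicate` and `Complement` node names are assumptions made for the example. Only the QuestionClause to Clause replacement comes from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical syntax node: a node type, an optional word, and children."""
    kind: str
    lexeme: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


# Assumed rewrite table. Only QuestionClause -> Clause is taken from the text.
ANSWER_FORM = {"QuestionClause": "Clause"}


def to_answer_form(node: Node) -> Node:
    """Return a copy of the tree with question-specific node types replaced."""
    return Node(
        kind=ANSWER_FORM.get(node.kind, node.kind),
        lexeme=node.lexeme,
        children=[to_answer_form(child) for child in node.children],
    )


# Hand-built, simplified tree for "What is NLParser?"
question = Node("QuestionClause", children=[
    Node("Subject", "NLParser"),
    Node("Predicate", "is", [Node("Complement", "What")]),
])
pattern = to_answer_form(question)
print(pattern.kind)  # Clause
```

The function builds a new tree and leaves the parsed question untouched, so the original graph can still be shown to the user.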
Then the algorithm scans the text in the second field and tries to find
the utterances most similar to the pattern. First, it compares a
syntax node from the pattern with a
syntax node from the target text. If the
syntax nodes match, it compares the meanings
of the words on the nodes. To compare word meanings, it
simply compares the Lexemes.
If both the syntax and the meanings match, the algorithm goes down the syntax trees and builds the syntax
fragment common to both utterances. The more
syntax nodes are matched, the higher the
matching score. The best answers are shown as the result.
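The comparison and scoring described above can be sketched like this. It is a simplified illustration under stated assumptions, not the tool's implementation: the `Node` class and node names are hypothetical, lexemes are compared as plain strings, and children are paired greedily, which is only one possible strategy.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical syntax node: a node type, an optional word, and children."""
    kind: str
    lexeme: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def match_score(pattern: Node, target: Node) -> int:
    """Count the nodes in the common fragment rooted at this pair of nodes."""
    if pattern.kind != target.kind:      # syntax nodes must match first
        return 0
    if pattern.lexeme != target.lexeme:  # then the words on the nodes
        return 0
    score = 1
    used = set()                         # target children already paired
    for p_child in pattern.children:
        best, best_i = 0, None
        for i, t_child in enumerate(target.children):
            if i in used:
                continue
            s = match_score(p_child, t_child)
            if s > best:
                best, best_i = s, i
        if best_i is not None:
            used.add(best_i)
            score += best
    return score


pattern = Node("Clause", children=[
    Node("Subject", "NLParser"),
    Node("Predicate", "is"),
])
candidate_a = Node("Clause", children=[
    Node("Subject", "NLParser"),
    Node("Predicate", "is", [Node("Complement", "parser")]),
])
candidate_b = Node("Clause", children=[
    Node("Subject", "NLParser"),
    Node("Predicate", "has", [Node("Object", "methods")]),
])

# Rank candidate utterances by how many pattern nodes they share.
ranked = sorted([candidate_b, candidate_a],
                key=lambda c: match_score(pattern, c), reverse=True)
print([match_score(pattern, c) for c in ranked])  # [3, 2]
```

Here `candidate_a` shares the clause, the subject and the predicate with the pattern, while `candidate_b` shares only the clause and the subject, so it ranks lower.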
If the question has a question word, the tool ensures
that the question word is always matched. The node in the answer graph that
matches the question word is the short answer
(possibly with all underlying words in the syntax
tree).
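Extracting the short answer can be sketched as below. This is a hedged example with hypothetical names: the `Node` class, the `is_question_word` flag, the `pos` field (word position in the sentence) and the hand-built trees are all assumptions, and the search stops at the first node bound to the question word without scoring the rest of the tree.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical syntax node with a word position and a question-word flag."""
    kind: str
    lexeme: Optional[str] = None
    pos: int = -1
    is_question_word: bool = False
    children: List["Node"] = field(default_factory=list)


def find_binding(pattern: Node, target: Node) -> Optional[Node]:
    """Return the target node bound to the question word, or None."""
    if pattern.kind != target.kind:
        return None
    if pattern.is_question_word:         # the question word matches any lexeme
        return target
    if pattern.lexeme != target.lexeme:
        return None
    for p_child in pattern.children:
        for t_child in target.children:
            bound = find_binding(p_child, t_child)
            if bound is not None:
                return bound
    return None


def subtree_words(node: Node) -> List[Tuple[int, str]]:
    """Collect (position, word) pairs for the node and everything below it."""
    words = [(node.pos, node.lexeme)] if node.lexeme is not None else []
    for child in node.children:
        words.extend(subtree_words(child))
    return words


# Simplified pattern for "What is NLParser?" after conversion to answer form.
pattern = Node("Clause", children=[
    Node("Subject", "NLParser"),
    Node("Predicate", "is", children=[
        Node("Complement", "What", is_question_word=True),
    ]),
])
# Simplified target: "NLParser is a lexical processor"
target = Node("Clause", children=[
    Node("Subject", "NLParser", pos=0),
    Node("Predicate", "is", pos=1, children=[
        Node("Complement", "processor", pos=4, children=[
            Node("Modifier", "a", pos=2),
            Node("Modifier", "lexical", pos=3),
        ]),
    ]),
])

bound = find_binding(pattern, target)
short_answer = " ".join(word for _, word in sorted(subtree_words(bound)))
print(short_answer)  # a lexical processor
```

The word positions are kept only so that the underlying words can be printed in sentence order.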
The picture demonstrates the matching process. It shows how the question
"What is NLParser?" is mapped to the direct answer "The NLParser is a lexical processor and
a syntax parser in a single class."
Please use exactly the same word forms in your text as in the question. The algorithm does not take synonyms
or word inflections into account. This is not a limitation of syntax-lexical search; it is done
for simplicity. Your application can take synonyms and word
forms into account. Even better, you can use word meanings from your knowledge
domain.
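One way an application might take word forms and synonyms into account is to reduce each word to a canonical key before comparing lexemes. The sketch below is an assumption for illustration only: the `LEMMAS` and `SYNONYMS` tables are tiny hand-made examples, and a real application would use a morphological dictionary or a domain vocabulary instead.

```python
# Tiny hand-made tables, for illustration only.
LEMMAS = {
    "allows": "allow",
    "allowed": "allow",
    "permits": "permit",
    "methods": "method",
}
SYNONYMS = {
    "permit": "allow",
    "function": "method",
}


def canonical(word: str) -> str:
    """Map a word to a canonical key: lowercase, then lemma, then synonym."""
    key = word.lower()
    key = LEMMAS.get(key, key)
    return SYNONYMS.get(key, key)


def same_lexeme(a: str, b: str) -> bool:
    """Compare two words after normalization instead of by exact form."""
    return canonical(a) == canonical(b)


print(same_lexeme("allows", "permits"))  # True
print(same_lexeme("allow", "have"))      # False
```

With such a comparison in place of exact string equality, a question using "allows" could match a text using "permits".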
The algorithm cannot do any reasoning. For example, it cannot understand negative meanings.
For the default text you can ask questions like:
- What is NLParser?
- What does NLParser allow?
- How many methods does NLParser have?
Advantages of search by syntax-lexical pattern.
It allows asking a computer a question in natural language, as if talking to a human.
Compared with keyword-based search approaches, the syntax-lexical pattern
algorithm can find an exact answer because it uses the syntax information in
the user's question.
The ability to give a short answer when possible is a distinctive feature that makes it
more user-friendly and may save time. For example, a user can read through a
list of short answers rather than their full versions.
Using a syntax-lexical pattern doesn't require a semantic database, so it can be
applied to large volumes of information written for humans.
Syntax-lexical or syntax-semantic comparison is not question-specific.
It can be applied to compare assertion clauses in the same way. For
assertion clauses it is even simpler because conversion from question form to
answer form is not required.
The algorithm can be applied to search for a topic or to compare documents. In
this case you can use multiple utterances in the search pattern.
Limitations of search by syntax-lexical pattern.
Even assuming that word inflections and synonyms are handled in the syntax-lexical pattern,
the main problem is that a syntax-lexical pattern doesn't utilize word meanings.
For example, it cannot understand the meanings of question words, negative
meanings, etc.
The searched texts should contain a direct answer because the algorithm doesn't do
any reasoning. However, building reasoning-capable software, even for a small
knowledge domain, could be orders of magnitude more complex, while search by syntax-lexical
and syntax-semantic patterns is relatively straightforward.
For developers
The C# code and more details about the implementation can be found here: How to search direct answers for a natural language question.