25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé...

26
25 juin 2002 TALN 2002-Michel Généreux 1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for Natural Language) Michel Généreux - Austrian Research Institute for Artificial Intelligence

Transcript of 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé...

Page 1: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 1

Un analyseur sémantique pour les langues naturelles basé sur des exemples(An Example-Based Semantic Parser for Natural Language)

Michel Généreux - Austrian Research Institute for Artificial Intelligence

Page 2: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 2

Introduction - MotivationsBuild a robust, portable (in practice, only a set of new annotated examples is needed), wide

covering parser that deals well with real data (hesitations, recognition error, idiomatic expressions,...)

In the current context of:

Large quantity of Information accessible from the Internet

Speech Recognition development

Natural Language Interfaces (NLI) have become very attractive to access easily information and/or speak to a computer (e.g. multimodality)

Improve upon other corpus-based semantic parsers

Provides an open and flexible model: the statistical model could be adapted for different types of training corpus

Gives context a crucial role in the parsing process

Abstracts over topics for changing domains

Not rule-based (burden of creating hand-crafted rules, portability)

Page 3: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 3

Architecture du système

Page 4: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 4

Introduction - Caractéristiques du système

• An empirical methods for semantic parsing of natural language: to learn to parse a new sentence by looking at previous examples

• A shift-reduce type parsing paradigm where the operators are based on domain-specific semantic concepts, obtained from a lexicon.

• A statistically trained model "specializes" the parser, by guiding the runtime beam-like search of possible parses.

• Decisions are made on the basis of three criteria for the parsing stage: the similarities between the contexts in which the action took place, the similarities between the final meaning representation and finally, the sheer number of occurrences of those actions and final representations.

• Finally, a module is provided so that training and parsing can also be done on a changing domain such as newspaper browsing.

Page 5: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 5

IntroductionWhat can a semantic parser learn from a training corpus of examples ?

• Training([Maria,suchst,einen,großen,Hund],suchen(Maria,groß(Hund))).

– It can learn that the predicate groß/1 combines best as the suchen/2 argument, and not the other way round.

• Training([Einen,großen,Hund,suchst,Maria],suchen(Maria,groß(Hund))).

– It can learn that word order may not really matter.

• Training([Na,ja,einen,großen,Hund,um,suchst,Maria],suchen(Maria,groß(Hund))).

– It can learn that some words or hesitations don’t matter.

• Training([Paul,kicked,the,bucket],die(Paul));

• Training([Paul,kicked,the,ball],kick(Paul,ball)).

– Using contextual information, it can learn how to disambiguate meanings.

• Training([Big, deal, that, Maria, is, looking, for, a, big, dog], search(Maria, big(dog))).– It can learn that the same word may or may not participate to the meaning.

• The corpus may give the parser direct examples on how to conduct its actions and contextual information found in those examples may give the parser additional clues to interpret word order and disambiguate meanings.

Page 6: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 6

IntroductionHow will the parser use that information to parse new sentences ?

• When parsing a new sentence, each step (action) is compared (including context) to those generated at the training phase;

• The highest the similarity and the frequency, the highest the action ranks;

• A full parse is ranked according to the individual ranking of actions and the ranking of the final state;

• A final choice is made among a set of full parses (limited by the search beam)

Page 7: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 7

Phase "Training"Données

Page 8: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 8

"Training" -Données

• Origin of the train and test data: mainly from Wizard of Oz experiments, completed by some artificial examples, in the domain of newspaper searching and browsing

• Example of annotation:

– Command:• training([Zurück,bitte],zurück)

– Search• training([Ich,suche,jetzt,etwas,über,topic(1)],suchen([topic(1)],zeitung(_),ze

it(_))

– Command+search• training([Zurück,zu,den,Begriffen,topic(1),und,Politik],section_zurück(Politik

,[topic(1)])).

Page 9: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 9

"Training" Exemples de Données

Written Korpus• " Artikel über das Wiener Neujahrskonzert suche ich. "

• " Artikel über das Neujahrskonzert als festen Bestandteil des kulturellen Lebens suche ich. "

Spoken Korpus• "ich möchte jetzt eine neue Suche beginnen"

• "die Kosovo-Krise un und Bill Clinton"

• "Bill Clinton hält eine Ansprache vor der Öffentlichkeit"

Artificial data• Bitte geben Sie mir einen Text zum Thema Kosovo-Krise.

• Aber bitte nur in der Zeitung Salzburger Nachrichten von vor einer Woche.

• Salzburger Nachrichten und Kosovo-Krise suche ich jetzt.

Page 10: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 10

"Training"

The training phase uses an overly general parser, which produces all possible paths of actions (only limited by the training beam) to be taken in order to get from each training example utterance to its semantic representation. In the process, it records successful actions (called op), as well as the different final states. For each of them uniquely defined, it assigns a frequency* measure, defined as follows:

Frequency**= Occurrence_of_an_action_in_a_specific_context / Total_number_of_occurrence_of_this_action

* = the measure is similar for final states

** = a_particular_action is an action WITH arguments

Page 11: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 11

"Training" Le fichier de statistiques

• The overlyGeneralParser parses the training file to generate the statistical file. Every step needed to go from the topicalized_phrase to the meaning is recorded, as well as final states themselves. Final states are simply the states of the parse stack themselves at the end of the parse. Each of them (actions and final states) are assigned a frequency measure as described previously. Each line has either one of the following format (recall that op is a container for any action):

op(ACTION#PARSE_STACK#INPUT_STRING#FREQUENCY).

final(FINAL_STATE#FREQUENCY).

with– The Input string

• [Ich,suche,einen,Artikel,über,Bush]

– The Parse Stack

• [concept1:[context1],concept2:[context2], ...]

• [suchen([],zeitung(_),zeit(_)):[suche,einen,Artikel],start:[ich]]

• Here are two examples:

op(sHIFT(ich)#[start:[]]#[ich,suche,einen,Text,for,topic(1),topic(2),bitte,bearbeiten,Sie,meinen,Suchauftrag]#0.3333).

final([bestätigen(neue_suche):[bearbeiten,Sie,meinen,Suchauftrag],start:[ich,möchte,jetzt,eine]] #0.2).

• These lines are used by the specializedParser to compute the best parse.

Page 12: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 12

La phase d’analyse (parsing)

Page 13: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 13

AnalyseExtraction de thèmes

Page 14: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 14

Analyse sémantique statistique Analyse syntaxique: PCFG

P = maxtP(t,s|G) we try to find the parse P with the highest probability, given a grammar G, where t is a parse tree, s a

sentence and where each grammar rule is assigned a probability according to its frequency in a corpus.

P(S) = 0.6*0.3*0.4*0.7*0.2*0.3*0.2*0.1 (= 0.0006048)

• Semantic parsing: – P = maxfP(f,s|L) we try to find the parse P with the highest probability, given a first-order logical language

L, where f is a formula, s a sentence and where each formula is assigned a probability according to its frequency in a corpus.

– Probability(kick(Paul,ball)) = P(Introduce(Paul)) * P(Introduce(kick(_,_))) * P(DROP(Paul,kick(_,_))) * P(SHIFT(the)) * P(INTRODUCE(ball)) * P(DROP(ball,kick(Paul,_))) * P(kick(Paul,ball))

• The actual parsing of the input phrase is done by a specializedParser. It is specialized in the sense that it uses a statistical model to process all the information available from the training phase in order to get the best possible parse (the one with the highest probability).

• The search space: the search beam parameter

Page 15: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 15

Analyse statistique: vue d’ensemble

Page 16: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 16

Analyse-Adapter le modèle pour différents types de corpus

• Small training set– threshold – training beam

• Small lexicon (narrow domain)– Pop

– Pfinal

Page 17: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 17

Éléments de l’analyse

• Elements of the parser:

– Actions

• sHIFT(word_to_be_shifted) puts the first word from the input string into the end of the context of the concept on the top of the parse stack

• iNTRODUCE(concept_to_be_introduced) takes a concept from the semantic lexicon and puts it on the top of the parse stack

• dROP(source_term, target_term) attempts to place a term from the parse stack as argument to another term of the parse stack

• The semantic lexicon• lexicon(CONCEPT, [TRIGGERING_PHRASE]).

– lexicon(topic(1),[topic(1)]).

– lexicon(suchen([],zeitung(_),zeit(_)),[suche]).

Page 18: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 18

Analyse-Version du SR Analyseur: Shift-Introduce-Drop analyseur

• We are now ready to present the variant of the shift-reduce parser we are using. The algorithm previously introduced must be modified as follows:

1. Try to introduce a new concept or shift a word. *

2. If possible, make one dROP action. *

3. If there are more words in the input string, go back to Step 1. Otherwise stop.

• * The backtracking mechanism will ensure that ALL possible actions will be executed.

Page 19: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 19

Analyse - Un exemple

• We now show a parse for Ich suche einen Artikel über Bush. We assume for simplicity that the parser always takes the best available action. The following trace presents the successive actions taken by the parser. The initial parse stack and input string states are:

– [start:[]] and [ich,suche,einen,Artikel,topic(1)] Note: PP topicalized

• Here is the complete parse, using the two previous lexical entries (lexicon(topic(1),[topic(1)]), lexicon(suchen([],zeitung(_),zeit(_)),[suche])). Each line represents a Parse State:

– sHIFT(ich)#[start:[]]#[ich,suche,einen,Artikel,topic(1)]

• The word ich is not in the semantic lexicon, so the only action possible is to shift it on the parse stack.

– iNTRODUCE(suchen([],zeitung(_),zeit(_)))#[start:[ich]]#[suche,einen,Artikel,topic(1)]

• The word suche is in the lexicon, so it can be introduced as a new predicate on the parse stack. Another possibility would be to shift it.

– sHIFT(einen)#[suchen([],zeitung(_),zeit(_)):[suche],start:[ich]]#[einen,Artikel,topic(1)]

• The word einen is not in the lexicon, it must be shifted.– sHIFT(Artikel)#[suchen([],zeitung(_),zeit(_)):[suche,einen],start:[ich]]#[Artikel,topic(1)]

Page 20: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 20

Analyse-Un exemple (suite)

• The word Artikel is not in the lexicon (it is actually an relevant_word for the newspaper domain) and is therefore shifted.

– iNTRODUCE(topic(1))#[suchen([],zeitung(_),zeit(_)):[suche,einen,Artikel],start:[ich]]#[topic(1)]

• topic(1) is in the lexicon, so it can be introduced.– dROP(topic(1),suchen([],zeitung(_),zeit(_)))#[topic(1):[topic(1)],suchen([],zeitung(_),zeit(_)):

[suche,einen,Artikel],start:[ich]]#[]

• We can drop the predicate topic(1) into the first argument of the suchen predicate. The final parse stack or final state is:

– [suchen([topic(1)],zeitung(_),zeit(_)):[suche,einen,Artikel],start:[ich]]

• In the final stage, the parser simply puts back the meaning for topic(1) collected during the topic_extraction phase and the final parse is, without contextual information:

– suchen([(Bush)],zeitung(_),zeit(_))

• which signals a search for the topic Bush with no specific newspaper or time frame.

• Note: a parse WITH statistics is a parse in which each of the previous actions, plus a final state, is given a probability according to similarity and frequence.

Page 21: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 21

Résultats• Testing: from a pool of 90 examples, 9 different subsets of 10 test

examples were used as test data (training beam of 1), the rest (80) being the training examples. The parser averages 62% correctness while parsing a new sentence. Although it is difficult to quantitatively compare the approach with others, accuracy is therefore slightly lower than the other approaches. I believe there are mainly three reasons to that:

– The very low number (80) of training examples, compare to 560, 225 and 4000 sentences of other approaches.

– The lack of extensive testing on what would be the best setting for default values of weighting parameters. Only a set of rather intuitive values were used.

– The assimilation of natural language utterances to sets instead of lists.

Page 22: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 22

Résultat-Courbe d’apprentissage: "training" avec un "beam" de 1, analyse avec un "beam" de 10

Page 23: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 23

Résultat-Démonstration d’analyse

Page 24: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 24

Recherche en cours• Testing with larger domains, bigger corpus to see if it scales

up well

• See if the model can be simplified (in terms of number of switches)

• Establish a clear relation between the statistical parameters (switches) and the type of corpus in terms of efficiency and accuracy

• Include a treatment of questions• Include an automated acquisition of a lexicon such as the one

proposed in C. Thompson and R. Mooney. Semantic lexicon acquisition for learning natural language interfaces. In 6th Workshop on Very Large Corpora, August 1998.

• Another useful tools to enlarge our semantic dictionary would be WordNet.

Page 25: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 25

Étape suivante: Discours

• Extend the approach to Discourse using Discourse Representation Theory (DRT)

– DRT structures can also be mapped to logical structures such as the ones used to represent isolate sentences

– By mapping DiscourseDRT structures, the parser could train to resolve typical discourse problems such as pronoun resolution.

Page 26: 25 juin 2002TALN 2002-Michel Généreux1 Un analyseur sémantique pour les langues naturelles basé sur des exemples (An Example-Based Semantic Parser for.

25 juin 2002 TALN 2002-Michel Généreux 26

Conclusion

• Corpus-based methods offer a more effective way to deal with real data, and statistics offers an efficient and robust way to model and implement methods on a computer. There is no need to rely upon hand crafted rules; only a set of training examples is needed.

• The parser learns efficient ways of parsing new sentences by collecting statistics on the context in which each parsing action takes place. Comparing to similar systems using some machine-learning technique, ours offers an approach in which linguistics, through context, can play a decisive role.

• The parser configuration can be change in many ways, to fit different types of corpus or domains.

• Some testing remains to be done to see how well the model scales up, but so far, promising results have paved the way for the use of the system to include even more sophisticated analysis, such as discourse information.