Parsing with Unification
Jan Daciuk, DIIS, ETI, GUT, Natural Language Processing, 11. Parsing with Unification

Methods for parsing natural language resemble those used for parsing programming languages, although some unconventional methods also exist. Most grammars are based on context-free grammars. An additional factor is the frequent use of unification. The principal structure used in unification is an attribute-value matrix (AVM):

[ feature1  value1
  feature2  value2
  ...
  featuren  valuen ]

AVMs reduce the number of syntactic rules: a rule can capture a more general relation, while details such as agreement or subcategorization are expressed in the AVMs. AVMs can be nested, e.g.:

[ category   NP
  agreement  [ number  sg
               person  3rd ] ]

A feature path is a sequence of feature names leading to a particular value, e.g. the value sg is reachable using the path agreement/number.

Some feature structures can be shared with other features. Note that two features having equal values does not imply that the value is shared, i.e. that it is one and the same object:

[ category  sentence
  head      [ agreement  [1] [ number  sg
                               person  3 ]
              subject    [ agreement  [1] ] ] ]

The value of head/agreement is the same value (not merely an equal value) as the value of head/subject/agreement.

Unification of two equal feature values gives the same value:

[ number sg ] ⊔ [ number sg ] = [ number sg ]

whereas unification of the same feature with values that do not match fails:

[ number sg ] ⊔ [ number pl ] fails

A feature with no value is treated during unification as if it admitted every possible value for that feature:

[ number sg ] ⊔ [ number [] ] = [ number sg ]
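The basic cases above can be tried out with a small Python sketch. It is only an illustration under stated assumptions: AVMs are modelled as nested dicts, an empty dict stands for the unspecified value [], and structure sharing (the numbered tags) is deliberately left out. The names get_path, unify and UnificationFailure are hypothetical helpers, not part of the lecture.

```python
class UnificationFailure(Exception):
    """Raised when two feature structures cannot be unified."""


def get_path(fs, path):
    """Follow a feature path such as 'agreement/number' through nested dicts."""
    value = fs
    for name in path.split("/"):
        value = value[name]
    return value


def unify(a, b):
    """Unify two feature structures given as nested dicts with atomic leaves.

    Assumption: an empty dict stands for the unspecified value [].
    Structure sharing is not modelled; inputs are never modified.
    """
    if a == {}:
        return b
    if b == {}:
        return a
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # shallow copy, so the arguments stay untouched
        for name, value in b.items():
            result[name] = unify(a[name], value) if name in a else value
        return result
    if a == b:
        return a
    raise UnificationFailure(f"{a!r} does not match {b!r}")


np_avm = {"category": "NP", "agreement": {"number": "sg", "person": "3rd"}}
print(get_path(np_avm, "agreement/number"))      # sg
print(unify({"number": "sg"}, {"number": "sg"}))  # {'number': 'sg'}
print(unify({"number": "sg"}, {"number": {}}))    # {'number': 'sg'}
try:
    unify({"number": "sg"}, {"number": "pl"})
except UnificationFailure as err:
    print("failure:", err)
```

Raising an exception for failure keeps the failure case distinct from the legitimate empty structure, which a plain None or {} return value would blur.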
A lack of value for a given feature is usually denoted by omitting that feature altogether:

[ number sg ] ⊔ [ person 3rd ] = [ number  sg
                                   person  3rd ]

If feature values are substructures containing other features, unification is applied recursively to those values. Unification also applies to shared values, and sharing is preserved after unification.

Features undergoing unification can have substructures, and their values can be shared:

[ agreement  [1] [ number  sg
                   person  3rd ]
  subject    [ agreement  [1] ] ]
⊔
[ subject  [ agreement  [ person  3rd
                          number  sg ] ] ]
=
[ agreement  [1] [ number  sg
                   person  3rd ]
  subject    [ agreement  [1] ] ]

The following attempt at unification fails:

[ agreement  [1] [ number  sg
                   person  3rd ]
  subject    [ agreement  [1] ] ]
⊔
[ agreement  [ number  sg
               person  3rd ]
  subject    [ agreement  [ number  pl
                            person  3rd ] ] ]
= failure

Applications of Unification

Unification models phenomena that would be difficult to model using context-free grammars alone. One of them is agreement, e.g. between a noun phrase acting as the subject and the verb phrase, or between a noun and an adjective that describes it:

S → NP VP
⟨NP agreement⟩ = ⟨VP agreement⟩

Subcategorization can also be modeled with unification. It determines, for example, the requirements a verb places on its complements (e.g. czekać, "to wait" in Polish, requires the accusative) and the requirements of some nouns, e.g. proszek do prania/pieczenia (washing powder, baking powder).
The relations between a verb and the other elements of a sentence are so complicated that they are written down as subcategorization frames. In English, there are about 50 to 100 such frames. Example:

[ ORTH  want
  CAT   VERB
  HEAD  [ SUBCAT  ⟨ [ CAT NP ], [ CAT   VP
                                  HEAD  [ VFORM INFINITIVE ] ] ⟩ ] ]

Unification makes it possible to pass the features of a head up the hierarchy as features of the whole phrase. For example, the features of a noun or a pronoun acting as the subject are the features of the whole subject noun phrase. The features of a verb are the features of the whole verb phrase.

Unification is the main mechanism for parsing in HPSG. It is used both to pass head features up to the phrase level and to model agreement and subcategorization. Example for daje książkę koledze (Polish: "gives a book to a friend"):

VP  [ phon ⟨daje książkę koledze⟩, head fin, subcat ⟨[1]⟩ ]
    V  [ phon ⟨daje książkę⟩, head fin, subcat ⟨[1], [2]⟩ ]
        V  [ phon ⟨daje⟩, head fin, subcat ⟨[1], [2], [3]⟩ ]
        [3] NP [ phon ⟨książkę⟩ ]
    [2] NP [ phon ⟨koledze⟩ ]

Unification can also model long-distance relations, i.e. those that cross a phrase boundary. For example, in the sentence

Którą książkę Marek pożyczył od Gosi? (Which book did Marek borrow from Gosia?)

the verb pożyczyć (to borrow) requires a direct object. The role of that object is played by którą książkę, but it does not occur in the usual position (after the verb); it occurs at the beginning of the sentence. „Marek pożyczył [] od Gosi” can be parsed as a sentence (S) with a gap, in which a trace is put. The full sentence is of type S′, and the trace must match „którą książkę”.
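The way a subcat list shrinks while head features are passed upward, as in the daje książkę koledze example earlier, can be sketched in Python. This is a strong simplification, not an HPSG implementation: complements are matched by plain string equality instead of unification, and the class Sign, the function combine and the category labels NP[nom], NP[dat], NP[acc] are assumptions introduced only for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Sign:
    """A much simplified sign: words, category, head value, and subcat list."""
    phon: list
    cat: str
    head: str
    subcat: list = field(default_factory=list)  # categories still required


def combine(head_sign, complement):
    """Attach a complement matching the last element of the head's subcat list.

    The mother copies the category and head value of the head daughter and
    keeps only the part of the subcat list that is still unsatisfied.
    """
    if not head_sign.subcat:
        raise ValueError("the head requires no further complements")
    required = head_sign.subcat[-1]
    if complement.cat != required:
        raise ValueError(f"expected {required}, got {complement.cat}")
    return Sign(
        phon=head_sign.phon + complement.phon,
        cat=head_sign.cat,
        head=head_sign.head,            # head feature passed up unchanged
        subcat=head_sign.subcat[:-1],   # one requirement has been satisfied
    )


# Hypothetical lexical entries; the case labels are assumptions of this sketch.
daje = Sign(["daje"], "V", "fin", ["NP[nom]", "NP[dat]", "NP[acc]"])
ksiazke = Sign(["książkę"], "NP[acc]", "noun")
koledze = Sign(["koledze"], "NP[dat]", "noun")

step1 = combine(daje, ksiazke)
vp = combine(step1, koledze)
print(vp.phon, vp.head, vp.subcat)
```

After both steps only the subject requirement remains on the subcat list, which mirrors the single remaining tag on the VP node in the example.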
Implementation of Unification

function Unify(f1-orig, f2-orig)
    f1 ← dereference(f1-orig); f2 ← dereference(f2-orig)
    if f1 ≡ f2 then
        f1.pointer ← f2
        return f2
    else if f1 = null then
        f1.pointer ← f2
        return f2
    else if f2 = null then
        f2.pointer ← f1
        return f1
    else if complex(f1) ∧ complex(f2) then
        f2.pointer ← f1
        for all f2feature ∈ f2 do
            f1feature ← find or create a corresponding feature in f1
            if Unify(f1feature.value, f2feature.value) = failure then
                return failure
            end if
        end for
        return f1
    else
        return failure
    end if
end function

Bibliography

1. Elżbieta Dobryjanowicz, Podstawy przetwarzania języka naturalnego. Wybrane metody analizy składniowej, Akademicka Oficyna Wydawnicza RM, Warszawa, 1992.
2. Christer Samuelsson, Fast Natural-Language Parsing Using Explanation-Based Learning, Swedish Institute of Computer Science, ISRN SICS/D–13–SE, 1994.
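The Unify procedure from the Implementation of Unification slide can be sketched in Python roughly as follows. The representation is an assumption of this sketch: each node holds a content field (None for null, a string for an atomic value, a dict of feature names to nodes for a complex structure) and a forwarding pointer. Failure is signalled by returning None. One deviation from the pseudocode is an added guard for the case where both arguments are already the same node, which avoids a node pointing at itself. The helpers Node, build and to_dict are hypothetical names.

```python
class Node:
    """A feature-structure node: null, atomic, or complex, plus a pointer."""

    def __init__(self, content=None):
        # None = null (unspecified), str = atomic value,
        # dict = complex structure mapping feature names to Node objects.
        self.content = content
        self.pointer = None


def dereference(node):
    """Follow forwarding pointers to the node that holds the real content."""
    while node.pointer is not None:
        node = node.pointer
    return node


def unify(f1_orig, f2_orig):
    """Destructively unify two nodes; return the result node or None on failure."""
    f1, f2 = dereference(f1_orig), dereference(f2_orig)
    if f1 is f2:                      # added guard: already one shared node
        return f1
    if f1.content is None:
        f1.pointer = f2
        return f2
    if f2.content is None:
        f2.pointer = f1
        return f1
    if isinstance(f1.content, dict) and isinstance(f2.content, dict):
        f2.pointer = f1
        for name, f2_value in f2.content.items():
            f1_value = f1.content.setdefault(name, Node())  # find or create
            if unify(f1_value, f2_value) is None:
                return None
        return f1
    if f1.content == f2.content:      # two equal atomic values
        f1.pointer = f2
        return f2
    return None


def build(value):
    """Build nodes from nested dicts and strings; an existing Node is reused (shared)."""
    if isinstance(value, Node):
        return value
    if isinstance(value, dict):
        return Node({name: build(child) for name, child in value.items()})
    return Node(value)


def to_dict(node):
    """Read a node back as nested dicts, following pointers."""
    node = dereference(node)
    if isinstance(node.content, dict):
        return {name: to_dict(child) for name, child in node.content.items()}
    return node.content


shared = build({"number": "sg"})  # plays the role of the tag [1]
left = build({"agreement": shared, "subject": {"agreement": shared}})
right = build({"subject": {"agreement": {"person": "3rd"}}})
print(to_dict(unify(left, right)))
print(unify(build({"number": "sg"}), build({"number": "pl"})))  # None
```

Because the agreement value is one shared node, the person feature contributed through subject/agreement also becomes visible under agreement, which is the effect of structure sharing described earlier. The procedure modifies its arguments, so a parser that needs the originals afterwards would have to copy them first.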