Parsing with Unification

Methods for parsing natural language resemble those used for parsing
programming languages, although some unconventional methods also exist.
Most parsers rely on context-free grammars, often augmented with
unification.


The principal structure used in unification is an attribute-value
matrix (AVM):

    [ feature1  value1
      feature2  value2
      ...       ...
      featuren  valuen ]

Jan Daciuk, DIIS, ETI, GUT
Natural Language Processing
11. Parsing with Unification
AVMs reduce the number of syntactic rules: a single rule can capture a
more general relation, while details such as agreement or
subcategorization are expressed in the AVMs. AVMs can be nested, e.g.:


    [ category   NP
      agreement  [ number  sg
                   person  3rd ] ]

A feature path is a sequence of names of features leading to a particular
value, e.g. the value sg is reachable using the path agreement/number.
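
Following a feature path is a mechanical lookup. The sketch below assumes, purely for illustration, that an AVM is stored as nested Python dictionaries with strings as atomic values; the helper name `follow_path` is hypothetical and not part of the lecture.

```python
# Assumption: an AVM is modelled as nested dicts; atomic values are strings.
np_avm = {
    "category": "NP",
    "agreement": {"number": "sg", "person": "3rd"},
}


def follow_path(avm, path):
    """Return the value reached by a slash-separated feature path, or None."""
    value = avm
    for feature in path.split("/"):
        if not isinstance(value, dict) or feature not in value:
            return None  # the path leaves the structure
        value = value[feature]
    return value


print(follow_path(np_avm, "agreement/number"))  # sg
```

A path that names a missing feature, such as agreement/case, simply yields no value in this sketch.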
Feature structures can be shared between features. Note that two features
having equal values does not imply that the value is shared, i.e. that
both features refer to one and the same value. Sharing is marked with a
numbered tag:


    [ category  sentence
      head      [ agreement  (1) [ number  sg
                                   person  3 ]
                  subject    [ agreement  (1) ] ] ]

The value of head/agreement is the same value (not merely an equal
value) as the value of head/subject/agreement.
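
The difference between an equal value and a shared value corresponds to the difference between equality and object identity in a programming language. The following sketch assumes nested Python dictionaries as a stand-in for AVMs; the helper `is_shared` is a hypothetical name used only here.

```python
# Assumption: nested dicts stand in for AVMs; sharing is object identity.
agreement = {"number": "sg", "person": "3"}

shared = {
    "category": "sentence",
    "head": {
        "agreement": agreement,               # tag (1)
        "subject": {"agreement": agreement},  # tag (1) again: the same object
    },
}

copied = {
    "category": "sentence",
    "head": {
        "agreement": {"number": "sg", "person": "3"},
        "subject": {"agreement": {"number": "sg", "person": "3"}},
    },
}


def is_shared(avm):
    """True if head/agreement and head/subject/agreement are one object."""
    head = avm["head"]
    return head["agreement"] is head["subject"]["agreement"]


print(shared == copied)   # True: all values are equal
print(is_shared(shared))  # True
print(is_shared(copied))  # False

# A change to the shared value is visible through both paths.
shared["head"]["agreement"]["number"] = "pl"
print(shared["head"]["subject"]["agreement"]["number"])  # pl
```

In the copied structure the same change would affect only one of the two paths, which is exactly what the tag notation rules out.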
Unification of two equal feature values gives the same value:
    [ number sg ] ⊔ [ number sg ] = [ number sg ]
whereas unification of the same feature with values that do not match fails:
    [ number sg ] ⊔ [ number pl ] = failure
During unification, a feature with no value behaves like the set of all
possible values for that feature, so it unifies with any value:

    [ number sg ] ⊔ [ number [] ] = [ number sg ]
A lack of value for a given feature is usually denoted as a lack of that
feature:
    [ number sg ] ⊔ [ person 3rd ] = [ number  sg
                                       person  3rd ]
If feature values are substructures containing other features, unification
is applied recursively to those values. Unification also applies to shared
values, and sharing is preserved after unification.
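
The rules above (equal atoms unify, different atoms clash, unspecified values unify with anything, substructures are unified recursively) can be sketched as follows. This is a minimal illustration under stated assumptions, not the lecture's algorithm: AVMs are nested dictionaries, an empty dictionary means "unspecified", sharing is ignored, and `FAIL` is a hypothetical failure marker.

```python
# Assumptions: AVMs are nested dicts, atomic values are strings, a missing
# feature (or an empty dict) means "unspecified", and sharing is ignored.
FAIL = None  # hypothetical marker for a failed unification


def unify(a, b):
    """Return the unification of a and b, or FAIL if they clash."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # shallow copy so that a itself is not modified
        for feature, b_value in b.items():
            if feature in result:
                merged = unify(result[feature], b_value)
                if merged is FAIL:
                    return FAIL
                result[feature] = merged
            else:
                result[feature] = b_value  # feature absent in a: take b's
        return result
    if a == {}:  # unspecified value on the left
        return b
    if b == {}:  # unspecified value on the right
        return a
    return a if a == b else FAIL


print(unify({"number": "sg"}, {"number": "sg"}))   # {'number': 'sg'}
print(unify({"number": "sg"}, {"number": "pl"}))   # None
print(unify({"number": "sg"}, {"person": "3rd"}))  # {'number': 'sg', 'person': '3rd'}
```

Because this version builds a new dictionary, it cannot preserve sharing; handling shared values requires nodes with identity, as in the pointer-based algorithm.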
Features undergoing unification can have substructures, and values can be
shared:

    [ agreement  (1) [ number  sg
                       person  3rd ]
      subject    [ agreement  (1) ] ]

    ⊔

    [ subject  [ agreement  [ person  3rd
                              number  sg ] ] ]

    =

    [ agreement  (1) [ number  sg
                       person  3rd ]
      subject    [ agreement  (1) ] ]
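The following unification attempt fails:

    [ agreement  (1) [ number  sg
                       person  3rd ]
      subject    [ agreement  (1) ] ]

    ⊔

    [ agreement  [ number  sg
                   person  3rd ]
      subject    [ agreement  [ number  pl
                                person  3rd ] ] ]

    = failure

It fails because the value tagged (1) is shared: the subject's agreement is
the same value as the top-level agreement, so number cannot be both sg
and pl.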

Applications of Unification
Unification models phenomena that would be difficult to model using
context-free grammars alone. One of them is agreement, e.g. between a
noun phrase acting as the subject and the verb phrase, or between a noun
and an adjective that modifies it:
S → NP VP
⟨NP agreement⟩ = ⟨VP agreement⟩
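
The rule and its constraint can be sketched in code. The example below is only an illustration under assumptions: agreement is a flat dictionary of atomic features, the English lexical entries are invented for the example, and the helper names `unify_flat` and `combine_s` are hypothetical.

```python
# Assumptions: agreement is a flat dict of atomic features; the lexical
# entries and the helper names are hypothetical.
def unify_flat(a, b):
    """Unify two flat feature dicts; return None on a clash."""
    result = dict(a)
    for feature, value in b.items():
        # setdefault returns the existing value, or stores and returns value
        if result.setdefault(feature, value) != value:
            return None
    return result


def combine_s(np, vp):
    """Apply S -> NP VP with the constraint <NP agreement> = <VP agreement>."""
    agreement = unify_flat(np["agreement"], vp["agreement"])
    if agreement is None:
        return None  # the constraint fails, so the rule does not apply
    # The unified agreement is kept on S only so that it can be inspected.
    return {"category": "S", "agreement": agreement}


she = {"category": "NP", "agreement": {"number": "sg", "person": "3rd"}}
they = {"category": "NP", "agreement": {"number": "pl", "person": "3rd"}}
sleeps = {"category": "VP", "agreement": {"number": "sg", "person": "3rd"}}
slept = {"category": "VP", "agreement": {}}  # agreement left unspecified

print(combine_s(she, sleeps))
print(combine_s(they, sleeps))  # None
print(combine_s(they, slept))
```

A single context-free rule with one constraint thus replaces separate rules for every combination of number and person.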
Subcategorization can also be modeled with unification. It specifies, for
example, the requirements that a verb places on its complements (e.g. the
Polish verb czekać, "to wait", requires the accusative), as well as the
requirements of some nouns, e.g. proszek do prania/pieczenia (washing
powder, baking powder).
The relations between a verb and its complements are so complicated that
they are recorded in subcategorization frames. English has about 50 to
100 such frames. Example:

    [ ORTH  want
      CAT   VERB
      HEAD  [ SUBCAT  ⟨ [ CAT NP ],
                        [ CAT   VP
                          HEAD  [ VFORM INFINITIVE ] ] ⟩ ] ]

Unification makes it possible to pass the features of a head up the
hierarchy as features of the whole phrase. For example, the features of a
noun or pronoun acting as the subject become features of the whole subject
noun phrase, and the features of a verb become features of the whole verb
phrase.
Unification is the main parsing mechanism in HPSG (Head-driven Phrase
Structure Grammar). It is used both to pass head features up to the phrase
level and to model agreement and subcategorization.
Example: combining the Polish verb daje (gives) with its complements
książkę (a book, accusative) and koledze (to a friend, dative). Each
complement cancels one element of the verb's subcat list, and the head
value fin is passed up:

    VP [ phon ⟨daje książkę koledze⟩, head fin, subcat ⟨(1)⟩ ]
        V [ phon ⟨daje książkę⟩, head fin, subcat ⟨(1), (2)⟩ ]
            V [ phon ⟨daje⟩, head fin, subcat ⟨(1), (2), (3)⟩ ]
            (3) NP [ phon ⟨książkę⟩ ]
        (2) NP [ phon ⟨koledze⟩ ]
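
The step-by-step shortening of the subcat list can be sketched as follows. This is a simplified illustration, not an HPSG implementation: a plain equality check stands in for unification, complements are assumed to be removed from the end of the subcat list, and the case labels such as `NP[acc]` as well as the function name are assumptions made for the example.

```python
# Assumptions: a sign is a dict with "phon" (list of words), "head" and
# "subcat" (list of categories still required); complements are removed from
# the end of the subcat list. Category labels and names are hypothetical.
def combine_head_complement(head_sign, complement):
    """Combine a head with its last required complement, or return None."""
    subcat = head_sign["subcat"]
    if not subcat or subcat[-1] != complement["category"]:
        return None  # equality check standing in for unification
    return {
        "phon": head_sign["phon"] + complement["phon"],
        "head": head_sign["head"],  # head feature passed up unchanged
        "subcat": subcat[:-1],      # one requirement satisfied
    }


daje = {"phon": ["daje"], "head": "fin",
        "subcat": ["NP[nom]", "NP[dat]", "NP[acc]"]}
ksiazke = {"phon": ["książkę"], "category": "NP[acc]"}
koledze = {"phon": ["koledze"], "category": "NP[dat]"}

v = combine_head_complement(daje, ksiazke)
vp = combine_head_complement(v, koledze)
print(vp["phon"], vp["head"], vp["subcat"])
```

After both complements are consumed, only the first subcat element (the one tagged 1 in the figure) remains to be satisfied.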
Unification can also model long-distance relations, i.e. those that cross
a phrase boundary. For example, in the sentence
Którą książkę Marek pożyczył od Gosi? (Which book did Marek borrow from Gosia?)
the verb pożyczyć (to borrow) requires a direct object. That role is played
by którą książkę (which book), but it does not occur in its usual position
after the verb; it occurs at the beginning of the sentence. "Marek pożyczył
[] od Gosi" can be parsed as a sentence (S) with a gap, in which a trace is
placed. The full sentence is of type S′, and the trace must match "którą
książkę".
Implementation of Unification
function Unify(f1-orig, f2-orig)
    f1 ← dereference(f1-orig); f2 ← dereference(f2-orig)
    if f1 ≡ f2 then
        f1.pointer ← f2
        return f2
    else if f1 = null then
        f1.pointer ← f2
        return f2
    else if f2 = null then
        f2.pointer ← f1
        return f1
    else if complex(f1) ∧ complex(f2) then
        f2.pointer ← f1
        for all f2feature ∈ f2 do
            f1feature ← find or create a corresponding feature in f1
            if Unify(f1feature.value, f2feature.value) = failure then
                return failure
            end if
        end for
        return f1
    else
        return failure
    end if
end function
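
The pseudocode can be turned into a small runnable sketch. The version below is one possible reading, under these assumptions: a node holds either nothing (null), a string (atom) or a dictionary of features; failure is represented by `None`; and the test f1 ≡ f2 is read as "the same node, or atoms with equal values". When both arguments are already the same node, the sketch returns at once instead of setting a pointer, to avoid a node pointing at itself. The names `Node`, `build` and `to_plain` are hypothetical helpers.

```python
class Node:
    """A feature-structure node: content plus a forwarding pointer."""

    def __init__(self, content=None):
        self.content = content  # None (null), str (atom) or dict name -> Node
        self.pointer = None


def dereference(node):
    """Follow forwarding pointers to the node that holds the content."""
    while node.pointer is not None:
        node = node.pointer
    return node


def unify(f1_orig, f2_orig):
    """Destructively unify two nodes; return the result node or None."""
    f1, f2 = dereference(f1_orig), dereference(f2_orig)
    if f1 is f2:
        return f1  # already shared; no pointer is set, to avoid a cycle
    if f1.content is None:
        f1.pointer = f2
        return f2
    if f2.content is None:
        f2.pointer = f1
        return f1
    if isinstance(f1.content, dict) and isinstance(f2.content, dict):
        f2.pointer = f1
        for name, f2_value in f2.content.items():
            f1_value = f1.content.setdefault(name, Node())  # find or create
            if unify(f1_value, f2_value) is None:
                return None
        return f1
    if f1.content == f2.content:  # identical atoms
        f1.pointer = f2
        return f2
    return None


def build(value):
    """Build nodes from nested dicts; an existing Node is reused (shared)."""
    if isinstance(value, Node):
        return value
    if isinstance(value, dict):
        return Node({k: build(v) for k, v in value.items()})
    return Node(value)


def to_plain(node):
    """Convert a node back to nested dicts and strings for inspection."""
    node = dereference(node)
    if isinstance(node.content, dict):
        return {k: to_plain(v) for k, v in node.content.items()}
    return node.content


def make_left():
    """The structure with the shared agreement value used in the examples."""
    agr = build({"number": "sg", "person": "3rd"})
    return build({"agreement": agr, "subject": {"agreement": agr}})


ok = unify(make_left(), build(
    {"subject": {"agreement": {"person": "3rd", "number": "sg"}}}))
print(to_plain(ok))

bad = unify(make_left(), build(
    {"agreement": {"number": "sg", "person": "3rd"},
     "subject": {"agreement": {"number": "pl", "person": "3rd"}}}))
print(bad)  # None
```

Because unification here modifies its arguments, a failed attempt can leave the inputs partly changed; a parser that needs the originals would have to copy them first.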