[Python] 9.2 language processing and processing characteristics of extended feat
9.2 Processing feature structures
This section describes how to construct and manipulate feature structures in NLTK.
NLTK provides the FeatStruct() constructor for building feature structures.
>>> import nltk
>>> fs1 = nltk.FeatStruct(TENSE='past', NUM='sg')
>>> print(fs1)
[ NUM   = 'sg'   ]
[ TENSE = 'past' ]
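Once built, a feature structure behaves much like a dictionary. The following is a minimal sketch, assuming Python 3 and NLTK 3, of reading, adding and nesting features. The feature names CASE, POS and AGR are chosen only for illustration.

```python
import nltk

fs1 = nltk.FeatStruct(TENSE='past', NUM='sg')

# Feature values are read with dictionary-style indexing.
tense = fs1['TENSE']
print(tense)

# A feature structure is mutable until it is frozen, so features can be added.
fs1['CASE'] = 'acc'

# A feature value may itself be a feature structure.
fs2 = nltk.FeatStruct(POS='N', AGR=fs1)

# Nested values can be reached step by step or with a tuple path.
num_stepwise = fs2['AGR']['NUM']
num_by_path = fs2['AGR', 'NUM']
print(num_stepwise, num_by_path)
```

The tuple form is convenient when the path is computed at run time.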
It is often useful to view a feature structure as a graph, more specifically a directed acyclic graph (DAG).
Feature structures can also exhibit structure sharing, also called reentrancy, as shown in the figure:
When two paths lead to the same value, they are said to be equivalent.
Structure sharing is written in code as shown below:
>>> print(nltk.FeatStruct("""[NAME='Lee', ADDRESS=(1)[NUMBER=74, STREET='rue Pascal'],
...                          SPOUSE=[NAME='Kim', ADDRESS->(1)]]"""))
[ ADDRESS = (1) [ NUMBER = 74           ] ]
[               [ STREET = 'rue Pascal' ] ]
[                                         ]
[ NAME    = 'Lee'                         ]
[                                         ]
[ SPOUSE  = [ ADDRESS -> (1)  ]           ]
[           [ NAME    = 'Kim' ]           ]
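To see what structure sharing means in practice, the sketch below (assuming Python 3 and NLTK 3) compares the shared structure with a hypothetical variant, called `copied` here, in which the address is simply written out twice. The variable names are illustrative only.

```python
import nltk

shared = nltk.FeatStruct(
    "[NAME='Lee', ADDRESS=(1)[NUMBER=74, STREET='rue Pascal'], "
    "SPOUSE=[NAME='Kim', ADDRESS->(1)]]"
)
copied = nltk.FeatStruct(
    "[NAME='Lee', ADDRESS=[NUMBER=74, STREET='rue Pascal'], "
    "SPOUSE=[NAME='Kim', ADDRESS=[NUMBER=74, STREET='rue Pascal']]]"
)

# With sharing, both paths lead to one and the same object.
is_shared = shared['ADDRESS'] is shared['SPOUSE', 'ADDRESS']

# Without sharing, the two addresses have equal values but are distinct objects.
is_copied_shared = copied['ADDRESS'] is copied['SPOUSE', 'ADDRESS']
same_values = copied['ADDRESS'] == copied['SPOUSE', 'ADDRESS']

# Updating the shared node is visible through both paths.
shared['ADDRESS']['NUMBER'] = 75
spouse_number = shared['SPOUSE', 'ADDRESS', 'NUMBER']

print(is_shared, is_copied_shared, same_values, spouse_number)
```

The last step is the practical consequence of reentrancy: there is only one address node, so a change made through one path is seen through the other.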
Subsumption and unification
Subsumption: a more general feature structure subsumes a less general one.
Unification: merging the information from two feature structures is called unification.
>>> fs1 = nltk.FeatStruct(NUMBER=74, STREET='rue Pascal')
>>> fs2 = nltk.FeatStruct(CITY='Paris')
>>> print(fs1.unify(fs2))
[ CITY   = 'Paris'      ]
[ NUMBER = 74           ]
[ STREET = 'rue Pascal' ]
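The two definitions can be checked directly. This is a small sketch, assuming Python 3 and NLTK 3, that uses the `subsumes()` and `unify()` methods of FeatStruct; the variable names are illustrative.

```python
import nltk

general = nltk.FeatStruct(NUMBER=74)
specific = nltk.FeatStruct(NUMBER=74, STREET='rue Pascal')

# The structure carrying less information subsumes the one carrying more,
# but not the other way round.
general_subsumes = general.subsumes(specific)
specific_subsumes = specific.subsumes(general)

# Unification is symmetric: the order of the operands does not matter.
city = nltk.FeatStruct(CITY='Paris')
symmetric = specific.unify(city) == city.unify(specific)

# Unification fails when the structures disagree on a value;
# unify() then returns None.
conflict = specific.unify(nltk.FeatStruct(NUMBER=75))

print(general_subsumes, specific_subsumes, symmetric, conflict)
```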
Structure sharing can also be expressed with variables, such as ?x:
>>> fs1 = nltk.FeatStruct("[ADDRESS1=[NUMBER=74, STREET='rue Pascal']]")
>>> fs2 = nltk.FeatStruct("[ADDRESS1=?x, ADDRESS2=?x]")
>>> print(fs2)
[ ADDRESS1 = ?x ]
[ ADDRESS2 = ?x ]
>>> print(fs2.unify(fs1))
[ ADDRESS1 = (1) [ NUMBER = 74           ] ]
[                [ STREET = 'rue Pascal' ] ]
[                                          ]
[ ADDRESS2 -> (1)                          ]
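Because both occurrences of ?x must receive the same value, a variable acts as a constraint as well as a placeholder. The sketch below, assuming Python 3 and NLTK 3, shows the shared value after unification and a case where the constraint cannot be met. The names `template` and `clash` are illustrative.

```python
import nltk

template = nltk.FeatStruct("[ADDRESS1=?x, ADDRESS2=?x]")
fs1 = nltk.FeatStruct("[ADDRESS1=[NUMBER=74, STREET='rue Pascal']]")

# After unification both features point at the same address node.
result = template.unify(fs1)
shared = result['ADDRESS1'] is result['ADDRESS2']

# If the two addresses disagree, ?x cannot be bound consistently
# and unification fails, returning None.
clash = nltk.FeatStruct("[ADDRESS1=[NUMBER=74], ADDRESS2=[NUMBER=75]]")
failed = template.unify(clash)

print(shared, failed)
```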
9.3 Extending a feature-based grammar
This section explores a variety of linguistic issues and shows the benefits of incorporating features into the grammar.
VP[TENSE=?t, NUM=?n] -> V[SUBCAT=intrans, TENSE=?t, NUM=?n]
VP[TENSE=?t, NUM=?n] -> V[SUBCAT=trans, TENSE=?t, NUM=?n] NP
VP[TENSE=?t, NUM=?n] -> V[SUBCAT=clause, TENSE=?t, NUM=?n] SBar
V[SUBCAT=intrans, TENSE=pres, NUM=sg] -> 'disappears' | 'walks'
V[SUBCAT=trans, TENSE=pres, NUM=sg] -> 'sees' | 'likes'
V[SUBCAT=clause, TENSE=pres, NUM=sg] -> 'says' | 'claims'
V[SUBCAT=intrans, TENSE=pres, NUM=pl] -> 'disappear' | 'walk'
V[SUBCAT=trans, TENSE=pres, NUM=pl] -> 'see' | 'like'
V[SUBCAT=clause, TENSE=pres, NUM=pl] -> 'say' | 'claim'
V[SUBCAT=intrans, TENSE=past] -> 'disappeared' | 'walked'
V[SUBCAT=trans, TENSE=past] -> 'saw' | 'liked'
V[SUBCAT=clause, TENSE=past] -> 'said' | 'claimed'
SBar is the category label for a subordinate clause.
SBar -> Comp S
Comp -> 'that'
These productions give the sentence 'You claim that you like children' the following structure:
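The structure can also be produced by a parser. The sketch below assumes Python 3 and NLTK 3 and uses a cut-down version of the productions above. The S and NP rules, and the choice to treat 'you' as plural, are assumptions added only so that the grammar is complete enough to run.

```python
from nltk.grammar import FeatureGrammar
from nltk.parse import FeatureChartParser

GRAMMAR = """
% start S
S -> NP[NUM=?n] VP[NUM=?n]
VP[TENSE=?t, NUM=?n] -> V[SUBCAT=intrans, TENSE=?t, NUM=?n]
VP[TENSE=?t, NUM=?n] -> V[SUBCAT=trans, TENSE=?t, NUM=?n] NP
VP[TENSE=?t, NUM=?n] -> V[SUBCAT=clause, TENSE=?t, NUM=?n] SBar
SBar -> Comp S
Comp -> 'that'
NP[NUM=pl] -> 'you' | 'children'
NP[NUM=sg] -> 'Kim'
V[SUBCAT=intrans, TENSE=pres, NUM=sg] -> 'walks'
V[SUBCAT=intrans, TENSE=pres, NUM=pl] -> 'walk'
V[SUBCAT=trans, TENSE=pres, NUM=sg] -> 'likes'
V[SUBCAT=trans, TENSE=pres, NUM=pl] -> 'like'
V[SUBCAT=clause, TENSE=pres, NUM=sg] -> 'claims'
V[SUBCAT=clause, TENSE=pres, NUM=pl] -> 'claim'
"""

parser = FeatureChartParser(FeatureGrammar.fromstring(GRAMMAR))


def parses(sentence):
    """Return all parse trees for a whitespace-tokenized sentence."""
    return list(parser.parse(sentence.split()))


# A clause-taking verb followed by SBar is accepted.
for tree in parses('you claim that you like children'):
    print(tree)

# A transitive verb without its object, and a subject-verb number
# mismatch, are both rejected.
print(len(parses('you like')), len(parses('Kim like children')))
```

Every word in the input must appear in the lexicon, otherwise the parser raises an error instead of returning an empty result.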
For example, consider the verb phrase: put the book on the table
It can be expressed as:
Here the first NP represents the subject, and the NP and PP that follow are the subcategorized complements.
So the sentence 'Kim put the book on the table' can be parsed as:
Heads revisited
X-bar syntax abstracts over the notion of phrasal level. Three such levels are usually recognized.
For example, as shown in the figure:
In structure (36a) the head is N, while N' and N" are known as projections of N. N" is the maximal projection, and N is sometimes called the zero projection.
The direct complements of a lexical head X are always placed as sisters of the head, whereas modifiers are placed as sisters of the intermediate category X'.
S -> N[BAR=2] V[BAR=2]
N[BAR=2] -> Det N[BAR=1]
N[BAR=1] -> N[BAR=1] P[BAR=2]
N[BAR=1] -> N[BAR=0] P[BAR=2]
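The difference between the two N[BAR=1] productions can be observed by parsing a noun phrase with two prepositional phrases. The sketch below assumes Python 3 and NLTK 3. The P[BAR=2] rule and the whole lexicon are assumptions for illustration, and proper nouns and the verb are entered directly at BAR=2 to keep the grammar short.

```python
from nltk.grammar import FeatureGrammar
from nltk.parse import FeatureChartParser

XBAR_GRAMMAR = """
% start S
S -> N[BAR=2] V[BAR=2]
N[BAR=2] -> Det N[BAR=1]
N[BAR=1] -> N[BAR=1] P[BAR=2]
N[BAR=1] -> N[BAR=0] P[BAR=2]
P[BAR=2] -> P[BAR=0] N[BAR=2]
Det -> 'the'
N[BAR=0] -> 'student'
N[BAR=2] -> 'French' | 'Paris'
P[BAR=0] -> 'of' | 'from'
V[BAR=2] -> 'walks'
"""

parser = FeatureChartParser(FeatureGrammar.fromstring(XBAR_GRAMMAR))
tokens = 'the student of French from Paris walks'.split()
trees = list(parser.parse(tokens))

# Collect the words spanned by every intermediate (BAR=1) node.
bar1_spans = [
    ' '.join(subtree.leaves())
    for subtree in trees[0].subtrees()
    if subtree.label().get('BAR') == 1
]
print(bar1_spans)
```

In this sketch the first PP combines with the head N[BAR=0] as a complement, and the second PP combines with the resulting N[BAR=1] as a modifier, which is why two BAR=1 spans are collected.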
Auxiliary verbs and inversion
(39) a. Do you like children?
     b. Can Jody walk?
(40) a. Rarely do you see Kim.
     b. Never have I seen this dog.
However, not every verb can be placed at the front of a clause in this way. Verbs that can appear at the beginning of an inverted clause are called auxiliaries, for example do, can and have, as well as be, will and shall.
This can be captured with the following production:
S[+INV] -> V[+AUX] NP VP
A clause marked [+INV] is inverted and begins with an auxiliary verb.
The AUX feature distinguishes auxiliary verbs from other verbs.
SUBCAT stands for subcategorization.
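The effect of the INV and AUX features can be tried out with a toy grammar. The sketch below assumes Python 3 and NLTK 3. Apart from the S[+INV] production, the rules and the lexicon are assumptions chosen to keep the example small.

```python
from nltk.grammar import FeatureGrammar
from nltk.parse import FeatureChartParser

INVERSION_GRAMMAR = """
% start S
S[-INV] -> NP VP
S[+INV] -> V[+AUX] NP VP
VP -> V[SUBCAT=intrans, -AUX]
VP -> V[SUBCAT=trans, -AUX] NP
NP -> 'you' | 'children' | 'Jody'
V[+AUX] -> 'do' | 'can'
V[SUBCAT=intrans, -AUX] -> 'walk'
V[SUBCAT=trans, -AUX] -> 'like'
"""

parser = FeatureChartParser(FeatureGrammar.fromstring(INVERSION_GRAMMAR))


def parses(sentence):
    """Return all parse trees for a whitespace-tokenized sentence."""
    return list(parser.parse(sentence.split()))


# An auxiliary can front the clause; the root is then marked +INV.
inverted = parses('do you like children')
print(inverted[0].label()['INV'])

# The uninverted clause is marked -INV.
plain = parses('you like children')
print(plain[0].label()['INV'])

# A non-auxiliary verb cannot be fronted.
print(len(parses('like you children')))
```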
Unbounded dependency constructions
There is no upper bound on the distance between a filler and its gap. This fact is easy to illustrate with constructions involving sentential complements.
a. Who do you like __?
b. Who do you claim that you like __?
c. Who do you claim that Jody says that you like __?
Unbounded dependencies can be handled in a formal grammar with slash categories, an approach taken from Generalized Phrase Structure Grammar:
A slash category has the form Y/XP; we interpret it as a phrase of category Y that is missing a sub-constituent of category XP. For example, S/NP is an S that is missing an NP.
>>> nltk.data.show_cfg('grammars/book_grammars/feat1.fcfg')
% start S
# ###################
# Grammar Productions
# ###################
S[-INV] -> NP VP
S[-INV]/?x -> NP VP/?x
S[-INV] -> NP S/NP
S[-INV] -> Adv[+NEG] S[+INV]
S[+INV] -> V[+AUX] NP VP
S[+INV]/?x -> V[+AUX] NP VP/?x
SBar -> Comp S[-INV]
SBar/?x -> Comp S[-INV]/?x
VP -> V[SUBCAT=intrans, -AUX]
VP -> V[SUBCAT=trans, -AUX] NP
VP/?x -> V[SUBCAT=trans, -AUX] NP/?x
VP -> V[SUBCAT=clause, -AUX] SBar
VP/?x -> V[SUBCAT=clause, -AUX] SBar/?x
VP -> V[+AUX] VP
VP/?x -> V[+AUX] VP/?x
# ###################
# Lexical Productions
# ###################
V[SUBCAT=intrans, -AUX] -> 'walk' | 'sing'
V[SUBCAT=trans, -AUX] -> 'see' | 'like'
V[SUBCAT=clause, -AUX] -> 'say' | 'claim'
V[+AUX] -> 'do' | 'can'
NP[-WH] -> 'you' | 'cats'
NP[+WH] -> 'who'
Adv[+NEG] -> 'rarely' | 'never'
NP/NP ->
Comp -> 'that'
Use this grammar to parse the sentence:
>>> tokens = 'who do you claim that you like'.split()
>>> from nltk import load_parser
>>> cp = load_parser('grammars/book_grammars/feat1.fcfg')
>>> for tree in cp.parse(tokens):  # NLTK 3; NLTK 2 used cp.nbest_parse(tokens)
...     print(tree)
(S[-INV]
  (NP[+WH] who)
  (S[+INV]/NP[]
    (V[+AUX] do)
    (NP[-WH] you)
    (VP/NP[]
      (V[-AUX, SUBCAT='clause'] claim)
      (SBar/NP[]
        (Comp[] that)
        (S[-INV]/NP[]
          (NP[-WH] you)
          (VP/NP[] (V[-AUX, SUBCAT='trans'] like) (NP[]/NP[] )))))))
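The session above needs the NLTK book grammars to be installed. As an alternative, the sketch below (assuming Python 3 and NLTK 3) inlines the same productions as a string and checks that the gap can sit at increasing depths. The test sentences are adapted to the small lexicon of this grammar, and the helper name `parses` is illustrative.

```python
from nltk.grammar import FeatureGrammar
from nltk.parse import FeatureChartParser

SLASH_GRAMMAR = """
% start S
S[-INV] -> NP VP
S[-INV]/?x -> NP VP/?x
S[-INV] -> NP S/NP
S[-INV] -> Adv[+NEG] S[+INV]
S[+INV] -> V[+AUX] NP VP
S[+INV]/?x -> V[+AUX] NP VP/?x
SBar -> Comp S[-INV]
SBar/?x -> Comp S[-INV]/?x
VP -> V[SUBCAT=intrans, -AUX]
VP -> V[SUBCAT=trans, -AUX] NP
VP/?x -> V[SUBCAT=trans, -AUX] NP/?x
VP -> V[SUBCAT=clause, -AUX] SBar
VP/?x -> V[SUBCAT=clause, -AUX] SBar/?x
VP -> V[+AUX] VP
VP/?x -> V[+AUX] VP/?x
V[SUBCAT=intrans, -AUX] -> 'walk' | 'sing'
V[SUBCAT=trans, -AUX] -> 'see' | 'like'
V[SUBCAT=clause, -AUX] -> 'say' | 'claim'
V[+AUX] -> 'do' | 'can'
NP[-WH] -> 'you' | 'cats'
NP[+WH] -> 'who'
Adv[+NEG] -> 'rarely' | 'never'
NP/NP ->
Comp -> 'that'
"""

parser = FeatureChartParser(FeatureGrammar.fromstring(SLASH_GRAMMAR))


def parses(sentence):
    """Return all parse trees for a whitespace-tokenized sentence."""
    return list(parser.parse(sentence.split()))


# The gap after 'like' may be embedded under any number of clauses.
for sentence in [
    'who do you like',
    'who do you claim that you like',
    'who do you claim that cats say that you like',
]:
    print(sentence, '->', len(parses(sentence)) > 0)

# With the object position filled there is no gap left for 'who'.
print(len(parses('who do you like cats')))
```

The slash value ?x is passed from each mother category to one daughter, which is how the information about the missing NP travels down to the empty production NP/NP.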
Posted by Simon at November 13, 2013 - 5:11 PM