[Natural Language Processing with Python] 10.5 Discourse Semantics

A discourse is a sequence of sentences.

Discourse Representation Theory

The standard approach to quantification in first-order logic is limited to single sentences, yet in some cases the scope of a quantifier seems to extend over two or more sentences.

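In the example below, the pronoun "it" in the second sentence is bound by the indefinite "a dog" in the first, so the existential quantifier in (54b) spans both sentences: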

(54)a. Angus owns a dog. It bit Irene.
b. ∃x.(dog(x) & own(Angus, x) & bite(x, Irene))

Discourse Representation Theory (DRT) aims to provide a means of handling this and other semantic phenomena that seem to be characteristic of discourse.

Discourse representation structure (DRS)

A DRS is conventionally drawn as a box, with the discourse referents at the top and the conditions below, as in the figure:


In order to process DRSs computationally, we need to convert them into a linear format.

In this format, a DRS is a pair consisting of a list of discourse referents and a list of DRS conditions:

([x, y], [angus(x), dog(y), own(x, y)])

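Create a DRS object in NLTK (this is the NLTK 2 API; in NLTK 3, nltk.sem.drt.DrtExpression.fromstring takes the place of DrtParser().parse):

>>> import nltk
>>> dp = nltk.DrtParser()
>>> drs1 = dp.parse('([x, y], [angus(x), dog(y), own(x, y)])')
>>> print(drs1)
([x,y],[angus(x), dog(y), own(x,y)])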

You can also display the DRS graphically as a box diagram by calling drs1.draw().



Each DRS can be translated into a formula of first-order logic:

>>> print(drs1.fol())
exists x y.((angus(x) & dog(y)) & own(x,y))
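To make the translation concrete, here is a minimal pure-Python sketch that stores a flat DRS as a pair of lists and renders it as an existentially quantified conjunction. It is not NLTK's implementation: the helper name drs_to_fol and the use of plain strings for conditions are assumptions made for illustration, and embedded DRSs are not handled.

```python
from functools import reduce


def drs_to_fol(referents, conditions):
    """Render a flat DRS as a first-order formula string.

    Hypothetical helper: referents and conditions are plain strings,
    and the list of conditions must not be empty.
    """
    # Conjoin the conditions left to right, bracketing at every step,
    # which yields ((c1 & c2) & c3) for three conditions.
    body = reduce(lambda left, right: "(%s & %s)" % (left, right), conditions)
    if not referents:
        return body
    # The discourse referents become existentially bound variables.
    return "exists %s.%s" % (" ".join(referents), body)


formula = drs_to_fol(["x", "y"], ["angus(x)", "dog(y)", "own(x,y)"])
print(formula)  # exists x y.((angus(x) & dog(y)) & own(x,y))
```

The point of the sketch is only that the referent list supplies the quantifier prefix and the condition list supplies the conjunction.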

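DRT expressions can be combined with the DRS-concatenation operator, +. Concatenating two DRSs gives a single DRS that contains the discourse referents and the conditions of both:

>>> drs2 = dp.parse('([x], [walk(x)]) + ([y], [run(y)])')
>>> print(drs2)
(([x],[walk(x)]) + ([y],[run(y)]))
>>> print(drs2.simplify())
([x,y],[walk(x), run(y)])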
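The effect of concatenation followed by simplification can be sketched in a few lines of plain Python. This is a toy model under stated assumptions, not the NLTK code: a DRS is a (referents, conditions) pair, merge_drs is a hypothetical name, and no variable renaming is attempted when the two DRSs share a referent.

```python
def merge_drs(left, right):
    """Concatenate two DRSs given as (referents, conditions) pairs.

    Hypothetical helper: shared referents are kept once and are
    assumed to denote the same entity.
    """
    left_refs, left_conds = left
    right_refs, right_conds = right
    # Keep referents in order of first appearance, without repeats.
    refs = list(left_refs)
    for ref in right_refs:
        if ref not in refs:
            refs.append(ref)
    # The conditions of both DRSs are simply appended.
    return (refs, list(left_conds) + list(right_conds))


merged = merge_drs((["x"], ["walk(x)"]), (["y"], ["run(y)"]))
print(merged)  # (['x', 'y'], ['walk(x)', 'run(y)'])
```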

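One DRS can be embedded within another, and this is how universal quantification is handled. Here the outer DRS has no referents of its own, and its single condition is an implication between two DRSs:


>>> drs3 = dp.parse('([], [(([x], [dog(x)]) -> ([y], [ankle(y), bite(x, y)]))])')
>>> print(drs3.fol())
all x.(dog(x) -> exists y.(ankle(y) & bite(x,y)))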
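The translation rule behind this output can be written out explicitly: referents of the antecedent DRS are universally quantified over the whole implication, and referents of the consequent DRS are existentially quantified inside it. The sketch below assumes flat DRSs given as (referents, conditions) pairs of strings; implication_to_fol and conj are hypothetical helper names, not NLTK functions.

```python
def conj(conditions):
    """Join conditions with '&', bracketing only when there are several."""
    joined = " & ".join(conditions)
    return "(%s)" % joined if len(conditions) > 1 else joined


def implication_to_fol(antecedent, consequent):
    """Translate the DRS condition (K1 -> K2) into a formula string."""
    ante_refs, ante_conds = antecedent
    cons_refs, cons_conds = consequent
    # Referents introduced in the consequent are existentially bound.
    cons = conj(cons_conds)
    if cons_refs:
        cons = "exists %s.%s" % (" ".join(cons_refs), cons)
    # Referents introduced in the antecedent are universally bound
    # and take scope over the whole implication.
    body = "(%s -> %s)" % (conj(ante_conds), cons)
    if ante_refs:
        body = "all %s.%s" % (" ".join(ante_refs), body)
    return body


formula = implication_to_fol(
    (["x"], ["dog(x)"]),
    (["y"], ["ankle(y)", "bite(x,y)"]),
)
print(formula)  # all x.(dog(x) -> exists y.(ankle(y) & bite(x,y)))
```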

DRT lets anaphoric pronouns be interpreted by linking them to existing discourse referents. If a DRS contains a condition of the form PRO(x), the method resolve_anaphora() replaces it with a condition of the form x = [...], where [...] is a list of possible antecedents.

>>> drs4 = dp.parse('([x, y], [angus(x), dog(y), own(x, y)])')
>>> drs5 = dp.parse('([u, z], [PRO(u), irene(z), bite(u, z)])')
>>> drs6 = drs4 + drs5
>>> print(drs6.simplify())
([x,y,u,z],[angus(x), dog(y), own(x,y), PRO(u), irene(z), bite(u,z)])
>>> print(drs6.simplify().resolve_anaphora())
([x,y,u,z],[angus(x), dog(y), own(x,y), (u = [x,y,z]), irene(z), bite(u,z)])
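The resolution step shown above does no linguistic filtering: every other referent in the DRS is offered as a candidate antecedent. A minimal sketch of that behaviour, assuming a flat DRS with string conditions (the function below is an illustrative stand-in written for this note, not NLTK's resolve_anaphora):

```python
import re


def resolve_anaphora(referents, conditions):
    """Replace each PRO(v) condition with 'v = [candidates]'.

    Illustrative stand-in: the candidates are all referents of the
    DRS other than the pronoun variable itself.
    """
    resolved = []
    for cond in conditions:
        match = re.match(r"^PRO\((\w+)\)$", cond)
        if match:
            pronoun = match.group(1)
            candidates = [ref for ref in referents if ref != pronoun]
            resolved.append("(%s = [%s])" % (pronoun, ",".join(candidates)))
        else:
            # Ordinary conditions are copied unchanged.
            resolved.append(cond)
    return (referents, resolved)


refs = ["x", "y", "u", "z"]
conds = ["angus(x)", "dog(y)", "own(x,y)", "PRO(u)", "irene(z)", "bite(u,z)"]
result = resolve_anaphora(refs, conds)
print(result[1][3])  # (u = [x,y,z])
```

A more selective resolver would prune the candidate list, for example by gender or by ruling out z, which is the object of the same verb.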

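The treatment of DRSs is compatible with the existing machinery for handling λ-abstraction, so compositional semantic representations can be built on DRT rather than on first-order logic. The first rule below is the DRT version for the indefinite determiner; the second is the corresponding first-order rule: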

Det[NUM=sg,SEM=<\P Q.(([x],[]) + P(x) + Q(x))>] -> 'a'
Det[NUM=sg,SEM=<\P Q.exists x.(P(x) & Q(x))>] -> 'a'

For example, here is the analysis of the NP "a dog":

(NP[NUM='sg', SEM=<\Q.(([x],[dog(x)]) + Q(x))>]
  (Det[NUM='sg', SEM=<\P Q.((([x],[]) + P(x)) + Q(x))>] a)
  (Nom[NUM='sg', SEM=<\x.([],[dog(x)])>]
    (N[NUM='sg', SEM=<\x.([],[dog(x)])>] dog)))
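The way the determiner's λ-term combines with the noun can be imitated with ordinary Python closures. This is only a sketch of the idea: DRSs are (referents, conditions) pairs of strings, the referent is always named x, and det_a, merge, dog and walk are hypothetical names chosen for the illustration.

```python
def merge(*drss):
    """Concatenate any number of (referents, conditions) pairs."""
    refs, conds = [], []
    for drs_refs, drs_conds in drss:
        for ref in drs_refs:
            if ref not in refs:
                refs.append(ref)
        conds.extend(drs_conds)
    return (refs, conds)


def det_a(noun):
    # Mirrors \P Q.(([x],[]) + P(x) + Q(x)), applied one argument at a time.
    def apply_to_predicate(predicate):
        return merge((["x"], []), noun("x"), predicate("x"))
    return apply_to_predicate


def dog(var):
    # Mirrors \x.([],[dog(x)])
    return ([], ["dog(%s)" % var])


def walk(var):
    # A one-place predicate meaning in the same shape as dog.
    return ([], ["walk(%s)" % var])


np_a_dog = det_a(dog)      # plays the role of \Q.(([x],[dog(x)]) + Q(x))
sentence = np_a_dog(walk)  # supply the verb phrase meaning
print(sentence)  # (['x'], ['dog(x)', 'walk(x)'])
```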

To parse with the DRT grammar, we tell load_parser() to read the SEM values in feature structures with a DrtParser:


>>> import nltk
>>> from nltk import load_parser
>>> parser = load_parser('grammars/book_grammars/drt.fcfg', logic_parser=nltk.DrtParser())
>>> trees = parser.nbest_parse('Angus owns a dog'.split())
>>> print(trees[0].node['sem'].simplify())
([x,z2],[Angus(x), dog(z2), own(x,z2)])

Discourse processing

A discourse is a sequence of sentences s1, ..., sn. A discourse thread is a sequence of readings s1-ri, ..., sn-rj, one for each sentence in the discourse.

>>> dt = nltk.DiscourseTester(['A student dances', 'Every student is a person'])
>>> dt.readings()
s0 readings: s0-r0: exists x.(student(x) & dance(x))
s1 readings: s1-r0: all x.(student(x) -> person(x))

Sentences can be added to and retracted from the discourse at any time. When a sentence is added, setting consistchk=True checks consistency by invoking the model checker on each thread, that is, on each sequence of admissible readings:

>>> dt.add_sentence('No person dances', consistchk=True)
Inconsistent discourse d0 ['s0-r0', 's1-r0', 's2-r0']:
s0-r0: exists x.(student(x) & dance(x))
s1-r0: all x.(student(x) -> person(x))
s2-r0: -exists x.(person(x) & dance(x))
>>> dt.retract_sentence('No person dances', verbose=True)
Current sentences are
s0: A student dances
s1: Every student is a person
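Why the three readings clash can be seen with a brute-force search for a model. The sketch below is not how DiscourseTester works internally; it is a bounded check over a two-element domain, with the readings hand-coded as Python functions, and models, READINGS and is_consistent are hypothetical names. Finding a model shows consistency, while finding none in such a small domain is only suggestive, not a proof.

```python
from itertools import product

DOMAIN = [0, 1]
PREDICATES = ["student", "person", "dance"]


def models():
    """Yield every assignment of a subset of DOMAIN to each predicate."""
    subsets = [
        frozenset(e for e, keep in zip(DOMAIN, bits) if keep)
        for bits in product([False, True], repeat=len(DOMAIN))
    ]
    for extensions in product(subsets, repeat=len(PREDICATES)):
        yield dict(zip(PREDICATES, extensions))


# Hand-coded truth conditions for the three readings in the session above.
READINGS = {
    "s0-r0": lambda m: any(x in m["student"] and x in m["dance"] for x in DOMAIN),
    "s1-r0": lambda m: all(x not in m["student"] or x in m["person"] for x in DOMAIN),
    "s2-r0": lambda m: not any(x in m["person"] and x in m["dance"] for x in DOMAIN),
}


def is_consistent(names):
    """True if some model in the bounded search satisfies every reading."""
    return any(all(READINGS[name](m) for name in names) for m in models())


print(is_consistent(["s0-r0", "s1-r0"]))           # True
print(is_consistent(["s0-r0", "s1-r0", "s2-r0"]))  # False
```

The second call fails for the reason one would expect: any dancing student must be a person by s1-r0, which contradicts s2-r0.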

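Similarly, we use informchk=True to check whether a new sentence is informative relative to the current discourse. The theorem prover treats the sentences already in the thread as assumptions and tries to prove the new sentence; the sentence is informative if no proof can be found.


>>> dt.add_sentence('A person dances', informchk=True)
Sentence 'A person dances' under reading 'exists x.(person(x) & dance(x))':
Not informative relative to thread 'd0'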
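Informativeness is the mirror image of entailment: a new reading adds nothing if it is already true in every model of the thread. The sketch below approximates this with a bounded search over a two-element domain instead of a theorem prover, so it is an illustration under that assumption only; models and is_informative are hypothetical names, and the readings are hand-coded.

```python
from itertools import product

DOMAIN = [0, 1]
PREDICATES = ["student", "person", "dance"]


def models():
    """Yield every assignment of a subset of DOMAIN to each predicate."""
    subsets = [
        frozenset(e for e, keep in zip(DOMAIN, bits) if keep)
        for bits in product([False, True], repeat=len(DOMAIN))
    ]
    for extensions in product(subsets, repeat=len(PREDICATES)):
        yield dict(zip(PREDICATES, extensions))


def a_student_dances(m):
    return any(x in m["student"] and x in m["dance"] for x in DOMAIN)


def every_student_is_a_person(m):
    return all(x not in m["student"] or x in m["person"] for x in DOMAIN)


def a_person_dances(m):
    return any(x in m["person"] and x in m["dance"] for x in DOMAIN)


def is_informative(new_reading, thread):
    """True if some model of the thread makes the new reading false."""
    return any(
        not new_reading(m)
        for m in models()
        if all(reading(m) for reading in thread)
    )


thread = [a_student_dances, every_student_is_a_person]
print(is_informative(a_person_dances, thread))                        # False
print(is_informative(a_student_dances, [every_student_is_a_person]))  # True
```

The first result matches the session above: once a student dances and every student is a person, "A person dances" follows and carries no new information.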

Posted by Adam at November 27, 2013 - 11:01 PM