User:Mitcho/ParserTNG: Difference between revisions

no edit summary
Line 19: Line 19:
==step 1: split words/arguments==
==step 1: split words/arguments==
Japanese: split on common particles... in the future get feedback from user for this
Japanese: split on common particles... in the future get feedback from user for this
Chinese: split on common functional verbs and prepositions
Chinese: split on common functional verbs and prepositions


Line 30: Line 31:
This step will return a set of (V,argString) pairs. (Note, this includes one pair where <code>V=null</code> and <code>argString</code> is the whole input.)
This step will return a set of (V,argString) pairs. (Note, this includes one pair where <code>V=null</code> and <code>argString</code> is the whole input.)


<b>EX</b>: <code>('add','lunch with Dan tomorrow to my calendar'), ('','add lunch with Dan tomorrow to my calendar')</code>
<b>EX</b>:
('add','lunch with Dan tomorrow to my calendar'),
('','add lunch with Dan tomorrow to my calendar')
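A minimal sketch of how the (V,argString) pairs could be produced. It assumes verbs are matched against a known word list at the start of the input; `split_verb` and `known_verbs` are hypothetical names, and the no-verb case is written as `None` rather than `''`.

```python
def split_verb(text, known_verbs):
    """Return candidate (V, argString) pairs for an input string.

    Assumption: a verb is recognised only as the first word of the input.
    The pair with V=None (whole input as argString) is always included.
    """
    pairs = []
    words = text.split()
    if words and words[0] in known_verbs:
        pairs.append((words[0], " ".join(words[1:])))
    pairs.append((None, text))
    return pairs


pairs = split_verb("add lunch with Dan tomorrow to my calendar", {"add", "email"})
for verb, arg_string in pairs:
    print(repr(verb), repr(arg_string))
```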


==step 3: pick possible clitics==
==step 3: pick possible clitics==
Line 39: Line 42:
Find delimiters (see above).
Find delimiters (see above).


<b>EX:</b> for ('','add lunch with Dan tomorrow to my calendar'),
<b>EX:</b> for <code>('','add lunch with Dan tomorrow to my calendar')</code>,
we get:
we get:
'add lunch *with* Dan tomorrow *to* my calendar'
add lunch *with* Dan tomorrow *to* my calendar
'add lunch with Dan tomorrow *to* my calendar'
add lunch with Dan tomorrow *to* my calendar
'add lunch *with* Dan tomorrow to my calendar'
add lunch *with* Dan tomorrow to my calendar


then take the words to the right of each delimiter (because English is head-initial... see parameter above) to get argument substrings:
then take the words to the right of each delimiter (because English is head-initial... see parameter above) to get argument substrings:


'add lunch *with* Dan tomorrow *to* my calendar':
for <code>add lunch *with* Dan tomorrow *to* my calendar</code>:
{V:    null,
{V:    null,
DO:  ['add lunch','tomorrow','calendar'],
  DO:  ['add lunch','tomorrow','calendar'],
with: 'Dan'
  with: 'Dan'
goal: 'my'},
  goal: 'my'},
{V:    null,
{V:    null,
DO:  ['add lunch','calendar'],
  DO:  ['add lunch','calendar'],
with: 'Dan tomorrow'
  with: 'Dan tomorrow'
goal: 'my'},
  goal: 'my'},
{V:    null,
{V:    null,
DO:  ['add lunch','tomorrow'],
  DO:  ['add lunch','tomorrow'],
with: 'Dan'
  with: 'Dan'
goal: 'my calendar'},
  goal: 'my calendar'},
{V:    null,
{V:    null,
DO:  ['add lunch'],
  DO:  ['add lunch'],
with: 'Dan tomorrow'
  with: 'Dan tomorrow'
goal: 'my calendar'}
  goal: 'my calendar'}


(Note: words which are not incorporated into an oblique argument (aka "modifier argument") are pushed onto the DO list.)
(Note: words which are not incorporated into an oblique argument (aka "modifier argument") are pushed onto the DO list.)
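The enumeration above can be sketched roughly as follows. This is an illustration under assumptions, not the actual parser: the `ROLES` table and the function names are hypothetical, each delimiter is assumed to take at least one word to its right, and an input that repeats the same delimiter would overwrite the earlier role.

```python
from itertools import combinations, product

# Hypothetical table mapping delimiter words to semantic roles.
ROLES = {"with": "with", "to": "goal"}


def delimiter_subsets(words):
    """Yield every non-empty choice of delimiter positions."""
    positions = [i for i, w in enumerate(words) if w in ROLES]
    for size in range(len(positions), 0, -1):
        yield from combinations(positions, size)


def parses_for(words, chosen):
    """Build parses for one sorted tuple of chosen delimiter positions.

    Head-initial: each argument starts right after its delimiter and may
    extend up to the next chosen delimiter (or the end of the input).
    """
    ends = list(chosen[1:]) + [len(words)]
    # Allowed argument lengths: 1 .. number of words before the next boundary.
    lengths = [range(1, end - pos) for pos, end in zip(chosen, ends)]
    parses = []
    for combo in product(*lengths):
        parse = {"V": None, "DO": []}
        used = set()
        for pos, n in zip(chosen, combo):
            parse[ROLES[words[pos]]] = " ".join(words[pos + 1:pos + 1 + n])
            used.update(range(pos, pos + 1 + n))
        # Words not absorbed by an oblique argument go to DO, chunk by chunk.
        chunk = []
        for i, w in enumerate(words):
            if i in used:
                if chunk:
                    parse["DO"].append(" ".join(chunk))
                    chunk = []
            else:
                chunk.append(w)
        if chunk:
            parse["DO"].append(" ".join(chunk))
        parses.append(parse)
    return parses


words = "add lunch with Dan tomorrow to my calendar".split()
for chosen in delimiter_subsets(words):
    for parse in parses_for(words, chosen):
        print(parse)
```

With both delimiters chosen, `parses_for` yields the four parses listed in the example; the order in which they come out is not meaningful.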
Line 70: Line 73:
For each parse, send each argument string to the noun type detector. The noun type detector will cache detection results, so it only checks each string once. This returns a list of possible noun types with their "scores".
For each parse, send each argument string to the noun type detector. The noun type detector will cache detection results, so it only checks each string once. This returns a list of possible noun types with their "scores".


EX:
<b>EX:</b>
'Dan' -> [{type: contact, score: 1},{type: arb, score: .7}]
'Dan' -> [{type: contact, score: 1},{type: arb, score: .7}]
'my calendar' -> [{type: service, score: 1},{type: arb, score: .7}]
'my calendar' -> [{type: service, score: 1},{type: arb, score: .7}]
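A small sketch of the caching behaviour, assuming the detector can be treated as a pure function of the argument string. The detection rules inside `detect_noun_types` are hypothetical stand-ins chosen only to reproduce the example scores; `CALLS` exists just to show that each string is checked once.

```python
from functools import lru_cache

CALLS = []  # records every real detection run, for illustration only


@lru_cache(maxsize=None)
def detect_noun_types(arg):
    """Return (type, score) pairs for an argument string, cached per string."""
    CALLS.append(arg)
    results = []
    if arg == "Dan":                 # stand-in for a contact lookup
        results.append(("contact", 1.0))
    if arg.endswith("calendar"):     # stand-in for a service lookup
        results.append(("service", 1.0))
    results.append(("arb", 0.7))     # anything can be arbitrary text
    return tuple(results)


print(detect_noun_types("Dan"))
print(detect_noun_types("Dan"))  # served from the cache; CALLS is unchanged
```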


==step 6: ranking==
==step 6: ranking==


<code>
foreach parse (w/o V)
foreach parse (w/o V)
  by semantic roles in the parse, find appropriate verbs
  by semantic roles in the parse, find appropriate verbs
  foreach possible verb
  foreach possible verb
    score = \prod_{each semantic role in the verb} score(the content of that argument being the appropriate nountype)
    score = \prod_{each semantic role in the verb} score(the content of that argument being the appropriate nountype)
</code>
    
    
<b>EX:</b>
<b>EX:</b>


{V:    null,
{V:    null,
DO:  ['add lunch','tomorrow'],
  DO:  ['add lunch','tomorrow'],
with: 'Dan'
  with: 'Dan'
goal: 'my calendar'}
  goal: 'my calendar'}


'Dan' -> [{type: contact, score: 1},{type: arb, score: .7}]
'Dan' -> [{type: contact, score: 1},{type: arb, score: .7}]
'my calendar' -> [{type: service, score: 1},{type: arb, score: .7}]
'my calendar' -> [{type: service, score: 1},{type: arb, score: .7}]


"add" lexical item:
"add" lexical item:
...args:{DO: arb, with: contact, goal: service}...
...args:{DO: arb, with: contact, goal: service}...


so...
so...


score = P(DO is a bunch of arb) * P(with is a contact) * P(goal is a service)
score = P(DO is a bunch of arb) * P(with is a contact) * P(goal is a service)
= 1 * 1 * 1
= 1 * 1 * 1


so score = 1
so <code>score = 1</code>


/EX
<b>/EX</b>
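The product in the example can be sketched as below. The names `type_score` and `score_parse` are hypothetical, a role missing from the parse is assumed to score 0, and the scores for the two DO strings are assumed to be 1 to match the worked example (the detector output for them is not given above).

```python
from math import prod


def type_score(detections, arg, wanted):
    """Score of `arg` being noun type `wanted`; 0.0 if that type was not detected."""
    for noun_type, score in detections.get(arg, []):
        if noun_type == wanted:
            return score
    return 0.0


def score_parse(parse, verb_args, detections):
    """Multiply, over the verb's semantic roles, the score of each argument
    being the noun type the verb expects for that role."""
    score = 1.0
    for role, wanted in verb_args.items():
        content = parse.get(role)
        if content is None:
            return 0.0  # assumption: a role the parse lacks rules the verb out
        items = content if isinstance(content, list) else [content]
        score *= prod(type_score(detections, item, wanted) for item in items)
    return score


parse = {"V": None, "DO": ["add lunch", "tomorrow"],
         "with": "Dan", "goal": "my calendar"}
add_args = {"DO": "arb", "with": "contact", "goal": "service"}
detections = {
    "Dan": [("contact", 1.0), ("arb", 0.7)],
    "my calendar": [("service", 1.0), ("arb", 0.7)],
    "add lunch": [("arb", 1.0)],   # assumed score
    "tomorrow": [("arb", 1.0)],    # assumed score
}
print(score_parse(parse, add_args, detections))
```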


Now lower the score when there is more than one direct object:
Now lower the score when there is more than one direct object:


score = score * 0.5**(#DO-1) (example algorithm)
score = score * 0.5**(#DO-1) (example algorithm)


<b>EX:</b> score = 1, with 2 direct objects, so
<b>EX:</b> <code>score = 1</code>, with 2 direct objects, so
score = 1 * 0.5**1 = 1 * 0.5 = 0.5
score = 1 * 0.5**1 = 1 * 0.5 = 0.5
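A small sketch of a penalty of this kind, assuming the score is halved for each direct object beyond the first, i.e. multiplied by `0.5**(#DO-1)`. Under that assumption a single direct object is left unpenalized and two direct objects give 0.5, as in the example. The function name is hypothetical.

```python
def penalize_direct_objects(score, num_direct_objects):
    """Halve the score once for every direct object beyond the first."""
    extra = max(num_direct_objects - 1, 0)
    return score * 0.5 ** extra


for n in (1, 2, 3):
    print(n, penalize_direct_objects(1.0, n))
```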