GATE track 1 session: Difference between revisions

(11 intermediate revisions by the same user not shown)
Line 20: Line 20:
* Learning Systems - statistical


Old Bailey IE project - 17th-century English (Online)


* POS - assigned in Token (noun, verb, etc)


* Gazetteer - gotcha: the initialization parameter listsURL has to be set before it's loaded. Must also "save and reinitialize."
* Gazetteer creates Lookups, then the transducer creates named entities
* Then the orthomatcher (spelling features in common) does coreference, associating those entities
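
The three steps above (gazetteer, transducer, orthomatcher) can be pictured with a toy sketch. This is plain Python, not GATE's API: the word list, the type table and the "same last word" matching rule are all assumptions made up for illustration.

```python
# Toy gazetteer: phrase -> majorType (hypothetical entries, not a GATE list file).
GAZETTEER = {"john smith": "person", "mr smith": "person", "london": "location"}
# Toy "transducer" table: Lookup majorType -> named-entity type (assumed names).
ENTITY_TYPES = {"person": "Person", "location": "Location"}


def gazetteer(text):
    """Step 1: emit Lookup spans (start, end, majorType) for listed phrases."""
    lowered = text.lower()
    lookups = []
    for phrase, major_type in GAZETTEER.items():
        start = lowered.find(phrase)
        while start != -1:
            lookups.append((start, start + len(phrase), major_type))
            start = lowered.find(phrase, start + 1)
    return sorted(lookups)


def transducer(text, lookups):
    """Step 2: turn Lookups into named entities (type, surface string)."""
    return [(ENTITY_TYPES[m], text[s:e]) for s, e, m in lookups if m in ENTITY_TYPES]


def orthomatcher(entities):
    """Step 3: group same-type entities whose final word is spelled the same."""
    chains = {}
    for entity_type, surface in entities:
        key = (entity_type, surface.split()[-1].lower())
        chains.setdefault(key, []).append(surface)
    return list(chains.values())


text = "John Smith moved to London. Mr Smith liked it."
entities = transducer(text, gazetteer(text))
chains = orthomatcher(entities)
```

Here "John Smith" and "Mr Smith" end up in one chain because they share a type and a final word; the real orthomatcher uses a richer set of spelling rules.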
Line 36: Line 35:
* Evaluation metric - computed mathematically against human-annotated text
* Scoring - performance measures for annotation types
* Precision = correct / (correct + spurious)
* Recall = correct / (correct + missing)
* F-measure is the harmonic mean of precision and recall
* F = 2⋅(precision⋅recall) / (precision + recall)
* GATE supports strict, lenient and average scoring (strict counts partially correct matches as wrong, lenient counts them as correct, average is the mean of the two)


* Result types - Correct, missing, spurious, partially correct (overlapped)
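
A minimal sketch of the scoring arithmetic, in plain Python rather than GATE's own code. The function names are hypothetical. It assumes partially correct matches count toward both the response and key totals, and that "average" is the mean of the strict and lenient figures; with no partial matches it reduces to the precision and recall formulas in the notes.

```python
def prf(matched, n_response, n_key):
    """Precision, recall and F1 from a matched count and the two totals."""
    precision = matched / n_response if n_response else 0.0
    recall = matched / n_key if n_key else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def score(correct, partial, spurious, missing, mode="strict"):
    """Score one annotation type; mode is 'strict', 'lenient' or 'average'."""
    n_response = correct + partial + spurious  # everything the system produced
    n_key = correct + partial + missing        # everything the human annotated
    strict = prf(correct, n_response, n_key)             # partial counts as wrong
    lenient = prf(correct + partial, n_response, n_key)  # partial counts as right
    if mode == "strict":
        return strict
    if mode == "lenient":
        return lenient
    if mode == "average":
        # Assumption: average is the mean of the strict and lenient values.
        return tuple((s + l) / 2 for s, l in zip(strict, lenient))
    raise ValueError(f"unknown mode: {mode}")
```

For example, `score(8, 0, 2, 8)` gives precision 0.8 and recall 0.5, so a system can look precise while still missing half of the human annotations.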
Line 47: Line 52:
** useful for e.g. HTML that has boilerplate




== To investigate ==
Line 67: Line 62:


* Rules based on tokens and lookups
 Phase: MatchingStyles
 Input: Lookup
 Options: control = appelt
 Rule: Test1
 (
   ({Lookup.majorType == location})?
   {Lookup.majorType == loc_key}
 ):match
 -->
 :match.Location = {rule=Test1}
* Copying features: <code>:match.Location = { type = :match.Lookup.minorType}</code>
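
What the Test1 pattern accepts can be sketched outside JAPE. The Python below is a simplified model, not GATE's matcher: it treats the input as a flat list of Lookup majorType values and mimics Appelt-style control by taking the longest match at each position and then continuing after it. The function name is hypothetical.

```python
def match_test1(major_types):
    """Return (start, end) index spans, end exclusive, matched by the pattern
    (location)? loc_key over a sequence of Lookup majorType values."""
    matches = []
    i = 0
    while i < len(major_types):
        has_next = i + 1 < len(major_types)
        if major_types[i] == "location" and has_next and major_types[i + 1] == "loc_key":
            # Longest option first: optional location plus the loc_key.
            matches.append((i, i + 2))
            i += 2
        elif major_types[i] == "loc_key":
            # The location part is optional, so a bare loc_key also matches.
            matches.append((i, i + 1))
            i += 1
        else:
            i += 1
    return matches


spans = match_test1(["location", "loc_key", "person", "loc_key", "location"])
```

A trailing `location` with no `loc_key` after it is not matched, since only the location part of the pattern is optional.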


== To review, gotchas ==
Line 76: Line 84:
* macros
* Annotations created by a rule can only be reused in a separate rule
=== Matching types ===
[[File:gate-matching.png|800px|Matching styles for JAPE]]


= To follow up =


* WebSphinx crawler CREOLE plugin
* [http://www.semanticsoftware.info/semantic-assistants-architecture GATE NLP web services]
= Other notes =
== Lucene data store and ANNIC ==
* Use <null> for default set
* Go to Datastore for queries
** e.g. <code>{Person}({Token})+{Money}</code>
* Useful for debugging JAPE and results
[[File:GATE-lucene-person-money.png|800px]]
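
The query shape <code>{Person}({Token})+{Money}</code> reads as "a Person, then one or more Tokens, then Money". A rough analogue, assuming the annotations are flattened into a simple left-to-right list of type names (real ANNIC queries run over overlapping annotations in the Lucene datastore, which this ignores), is a regular expression over that list:

```python
import re

# Regex analogue of {Person}({Token})+{Money} over space-joined type names.
PATTERN = re.compile(r"\bPerson(?: Token)+ Money\b")


def has_person_money(annotation_types):
    """True if the type sequence contains Person, one or more Token, then Money."""
    return PATTERN.search(" ".join(annotation_types)) is not None


hit = has_person_money(["Token", "Person", "Token", "Token", "Money"])
miss = has_person_money(["Person", "Money"])  # no Token in between
```

The helper name and the flattened input are illustrative assumptions; the point is only that the <code>+</code> requires at least one intervening Token.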


= Demos =
Line 93: Line 117:


While it can do a lot out of the box and benefits from its development time and breadth of connectivity, GATE needs usability testing to be useful to more than patient specialists. Many things are non-obvious or too domain-specific, yet with a bit of work they could be more broadly useful. Interaction could include far more immediate, useful and interesting-looking displays, and a web-based version could have these features. However, the team seems somewhat ambivalent about development. :)
Looking forward to learning about programming using GATE libraries.


{{Blikied|Aug 30, 2010}}


[[Category:SemWeb]]
