A full week of GATE training and talks covering text mining, information extraction, and language processing. [http://gate.ac.uk/wiki/TrainingCourseAug2010/ Session wiki]
[[File:GATE_screenshot.png|900px|GATE Developer screenshot]]
GATE is written in Java and is very Java-centric. This makes it portable and fast, but also heavyweight. A programming library is available. It is 14 years old and has many users and contributors.

== Using GATE Developer ==
* GATE Developer is used to process Language Resources (documents grouped into a Corpus) using Processing Resources. The results are typically saved to a serialized Datastore.
* ANNIE, VG (verb group) processors.
* Preserve formatting embeds annotation tags in the original HTML or XML.
** GATE's graph-based (node/offset) XML and preserved formatting (original XML/HTML) have different strengths.
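The difference between the two representations can be sketched in plain Java. This is only an illustration of the general idea, not GATE's API: the class <code>StandoffDemo</code>, its <code>Annotation</code> type, and the annotation names <code>Name</code> and <code>Language</code> are all hypothetical. Stand-off annotations point into the text by character offset, while the inline form writes tags into the text itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Illustrative sketch only; none of these names come from GATE. */
public class StandoffDemo {

    /** A stand-off annotation: a type plus start/end character offsets. */
    public static final class Annotation {
        public final String type;
        public final int start;
        public final int end;

        public Annotation(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    /** Returns the span of text that the annotation covers. */
    public static String covered(String text, Annotation a) {
        return text.substring(a.start, a.end);
    }

    /**
     * Renders annotations as inline tags inside the text.
     * Overlapping annotations are rejected in this sketch, and
     * escaping of '<' and '&' in the text is omitted for brevity.
     */
    public static String toInline(String text, List<Annotation> annotations) {
        List<Annotation> sorted = new ArrayList<>(annotations);
        sorted.sort(Comparator.comparingInt(a -> a.start));
        StringBuilder out = new StringBuilder();
        int cursor = 0;
        for (Annotation a : sorted) {
            if (a.start < cursor) {
                throw new IllegalArgumentException("overlapping annotation: " + a.type);
            }
            out.append(text, cursor, a.start);           // untagged text before the span
            out.append('<').append(a.type).append('>');
            out.append(text, a.start, a.end);            // the annotated span
            out.append("</").append(a.type).append('>');
            cursor = a.end;
        }
        out.append(text.substring(cursor));              // trailing untagged text
        return out.toString();
    }

    public static void main(String[] args) {
        String text = "GATE is written in Java.";
        List<Annotation> annotations = Arrays.asList(
                new Annotation("Name", 0, 4),
                new Annotation("Language", 19, 23));
        System.out.println(covered(text, annotations.get(1)));
        System.out.println(toInline(text, annotations));
    }
}
```

In this sketch the offset form leaves the text untouched and could hold overlapping annotations, while the inline form keeps the document readable in its original markup but cannot nest crossing spans without extra work.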
=== To investigate ===
* markupAware for HTML/XML (keeps tags in editor)
* AnnotationStack
* Advanced Options
{{Blikied|Aug 30, 2010}}