GATE track 1 session: Difference between revisions

A full week of learning [http://en.wikipedia.org/wiki/General_Architecture_for_Text_Engineering GATE], covering text mining, information extraction and language processing, plus talks. [http://gate.ac.uk/wiki/TrainingCourseAug2010/ Session wiki]
Rogers made a vow to open up their support to 'social media' and post any reasonable comment. This is a very progressive position, but some comments don't seem to be making it through the moderation queue.


[[File:GATE_screenshot.png|900px|GATE developer screenshot]]
'''Some of these have been eventually posted''', after the conversation had died down. A few were called off topic.


GATE is written in Java and is very Java-centric. This makes it portable, fast, and heavyweight. A programming library is available. It's 14 years old and has many users and contributors.
== http://redboard.rogers.com/2010/rogers-on-demand-online-rentals-to-launch-next-month/ ==


= Using GATE developer =
I realize (and it’s clear from the inclusion of statistics) that Rogers doesn’t need to focus on what any individual wants. But I don’t want just first run movies, trendy sports and series in an exclusive community, I want world scale choice and pricing with the knowledge and chaos of the Internet. I have yet to see any Rogers promotions that have been aimed at me, aside maybe from the initial “open,” “revolution” Android ads, which unfortunately proved rather false. While I am not included in “nearly 60 per cent of 18-34 year-old,” I am pretty sure the majority of people would simply like Rogers to provide reliable, fast, competitive Internet service so we can choose from a global palette of options. Rogers should certainly participate in extra offerings, but not to the detriment of the bigger idea.


* GATE Developer is used to process sets of Language Resources, grouped in a Corpus, using Processing Resources. The results are typically saved to a serialized Datastore.
"What are your thoughts on the new service? How have you changed the ways you enjoy video entertainment?"
* ANNIE, VG (verb group) processors.
* Preserve formatting embeds tags in HTML or XML.
** Different strengths using GATE's graph (node/offset) based XML vs. preserved formatting (original xml/html)


= Information Extraction =
Posted Sept 10 morning. One day later several other postings made ... not that one.


* IR - retrieve docs
== Samsung Galaxy S Captivate coming soon to Rogers ==
* IE - retrieve structured data


* Knowledge Engineering - rule based
Below not posted, but they posted a more controversial post with some back and forth:
* Learning Systems - statistical


Old Bailey IE project - 17th-century English (Online)
----


* POS - assigned in Token (noun, verb, etc)
I think you missed one point:


* Gazetteer - gotcha: the initialization parameter listsURL has to be set before it's loaded. Must also "save and reinitialize."
- Rogers wanted to head off any customers thinking of going to Bell for this device
* Gazetteer creates Lookups, then the transducer creates named entities
* Then the orthomatcher (coreference based on spelling features in common) associates those


* Annotation Key sets and annotation comparing
However, if Rogers doesn’t literally *release* this device in “weeks” as they stated, they’re being dishonest.
** Need setToKeep key in Document Reset for any pre annotated texts


== Evaluation / Metrics ==
By the way, I’d made similar points about the Samsung Galaxy S captivate coming soon to Rogers in another post that never made it here, perhaps something about it was considered “offtopic” *cough* unsubsidized plans with consumers buying and controlling their handset of choice on the open market *cough*


* Evaluation metric - computed mathematically against human-annotated text
Posted Aug 19, 2010 1435 in response to a similar post that didn't make it.
* Scoring - performance measures for annotation types


* Precision = correct / (correct + spurious)
== RedBoard’s FAAQ: Frequently Asked Android Questions ==
* Recall = correct / (correct + missing)
* F-measure combines precision and recall (harmonic mean)
* F = 2⋅(precision⋅recall) / (precision + recall)
* GATE supports average, strict, lenient


* Result types - Correct, missing, spurious, partially correct (overlapped)
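
The scoring above can be sketched in a few lines. This is a rough illustration in Python, not GATE's own Java API: the names <code>Counts</code> and <code>scores</code> are hypothetical, and the weight given to partially correct matches (0 for strict, 1 for lenient, 0.5 for average) is an assumption about how the three modes differ rather than something stated in these notes.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    """Result-type tallies from comparing machine output to a human key."""
    correct: int
    partial: int   # partially correct (overlapping) annotations
    missing: int
    spurious: int


def scores(c: Counts, mode: str = "strict") -> tuple[float, float, float]:
    """Return (precision, recall, F-measure) for the given mode."""
    # Assumed weight of a partially correct match in the numerator.
    weight = {"strict": 0.0, "lenient": 1.0, "average": 0.5}[mode]
    matched = c.correct + weight * c.partial
    response_total = c.correct + c.partial + c.spurious  # what the system produced
    key_total = c.correct + c.partial + c.missing        # what the human annotated
    precision = matched / response_total if response_total else 0.0
    recall = matched / key_total if key_total else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


if __name__ == "__main__":
    tallies = Counts(correct=6, partial=2, missing=2, spurious=2)
    for mode in ("strict", "lenient", "average"):
        print(mode, scores(tallies, mode))
```

With no partially correct matches, all three modes reduce to the plain precision and recall formulas listed above.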
Alan - let me assure you, you don't need to step in and apologize on behalf of corporations. There's nothing, anywhere that says "consumer phones are meant to be locked down." That's more tenuous than Rogers' statement that their offering would be a "revolution" and "open." A statement that some of us would like them to follow up on, rather than making excuses on their behalf that accepts a dysfunctional status quo.


* Tools > Annotations Diff - comparing human vs machine annotation
I have no desire to own the Nexus One as I want a keyboard device. Further, the formula doesn't make sense, since I'd be continuing to pay monthly subsidization fees to Rogers, effectively nearly paying double for my handset.


* Corpus > Corpus quality assurance - compare by type
Please, unless you're going to say something useful, next time keep your thoughts to yourself. ;)
* (B has to be the generated set)


* Annotation set transfer (in tools) - transfer between docs in pipeline
Post May 13
** useful for eg HTML that has boilerplate


== Rogers self-service on demand: How-to videos, cheat sheets and FAQs ==


== To investigate ==
I'm still waiting to see my post from several days ago asking how to self serve purchase a new handset less than three years after my current one. In an age of fast moving (especially the first generation — Dream/Magic — which most quickly became outdated), ever lower priced standard handsets, this is obviously something many people will want to do. Alternatively, what price discounts can we expect if we don't obtain a 'subsidized' handset from Rogers?


* markupAware for HTML/XML (keeps tags in editor)
Posted March 25 to http://redboard.rogers.com/2010/rogers-self-service-on-demand-how-to-videos-cheat-sheets-and-faqs - previous post 'disappeared.'
* AnnotationStack
* Advanced Options


= JAPE =
== How @RogersHelps helps: Improving the customer experience 140 characters at a time ==


* Rules based on tokens and lookups
Is @RogersHelp going to do anything if my smartphone is stolen except explain I have to break my contract (around $300), create a new three year contract, and buy a new device (around $600) ?


Phase: MatchingStyles
I’d like to see Rogers simply offer replacement devices at a pro rated cost. If you say the device is worth $600, and it’s $200 up front, and you’re claiming I’m subsidizing the remainder over 36 months, I should be able to get a replacement for $332 after one year. That’s about what I’d pay to break the contract alone. Does this not sound fair to anyone?
Input: Lookup
Options: control = appelt
Rule: Test1
(
({Lookup.majorType == location})?
{Lookup.majorType == loc_key}
):match
-->
:match.Location = {rule=Test1}


Copying features: :match.Location = { type = :match.Lookup.minorType}
And it does imply that those who get their own smartphone should be paying $11 less a month…


== To review, gotchas ==
Posted March 31, 2010 0730am to http://redboard.rogers.com/2010/how-rogershelps-helps-improving-the-customer-experience-140-characters-at-a-time/


* Rule types: first takes only the first match, excludes compound
----
** a? b for "a b" will match "a b"
* multiplexor transducers
* multi-constraint statements
* macros
* To reuse created annotations, it has to be a separate rule


=== Matching types ===
(in addendum to above)


[[File:gate-matching.png|800px|Matching styles for JAPE]]
In case my comment isn't clear, let me explain...


= To follow up =
A few months ago my wife's iPhone was stolen. It's from Fido, but please hold off on the Fido-isn't-Rogers macro, I think the story would be the same.


* WebSphinx crawler CREOLE plugin
I called customer service and explained the situation. The rep I spoke to sounded like she'd never heard of anyone losing their phone before, as if everyone simply had their phones for three years and this was a very special circumstance. Her first proposal was crazy. We'd have to break the phone and data contracts, and lose our existing phone number. We'd have to buy a new iPhone for $600, including a $35 signup fee and lose our FidoDollars. Pretty much $1000. After three phone calls, we finally got it so we'd only need to cancel one plan for a $125 fee, not pay a signup fee, and pay $600 for a new phone, and use our FidoDollars to pay for most of that (thank goodness for Fido Dollars).
* [http://www.semanticsoftware.info/semantic-assistants-architecture GATE NLP web services]


= Other notes =
When I got the bill, it seems we were charged $300 for cancelling, but by then I was so damned tired of arguing and confused by all the different options I gave up. I sincerely hope that wasn't part of the plan.


== Lucene data store and ANNIC ==
What I'd like to see Rogers do is go for a simple, Really Real Reality option where for any reason I can sign on to MyAccount, click a button on the phone I want, and it will tell me how much I'll pay and/or how much my contract will be extended, given the system knows the terms of my account. This will eliminate all the confusion, including the unfairness associated with special deals people sometimes seem to get, and avoid the confusion that often ensues when speaking with a rep.


* Use <null> for default set
Posted March 31 to http://redboard.rogers.com/2010/how-rogershelps-helps-improving-the-customer-experience-140-characters-at-a-time/
* Go to Datastore for queries
** eg {Person}({Token})+{Money}
* Useful for debugging JAPE and results


[[File:GATE-lucene-person-money.png|800px]]
== RedBoard @ one month: How are we doing? ==


= Demos =
Rogers is making a brave and smart move with this service, hopefully toward getting to the bottom of what we all want, healthy competitive service focused on true Internet access, with worthwhile Rogers offers layered on top. Many homes are, like mine, spending around $8000 over three years on Rogers wireless. For this price we should expect quite a bit. [Rogers is in a tough position because Internet is something that could and should be freely distributed]


* Mímir for querying large volumes of data (uses MG4J)
With social media comes the hope relationships can be formed. Even if we don't remember a name or avatar, history is just a click away and hopefully we can avoid seeing unhelpful canned responses, and this can be more of a conversation, and less disassociated blips.
* Translating parts of speech between languages using Compound editor and Alignment editor
* Predicate extractor (MultiPaX)
** Mixed results at best
* OwlExporter
** NLP ontology


= Conclusions =
I have been continually promoting a few themes —


While it can do a lot out of the box and benefits from development time and breadth of connectivity, to be useful to more than patient specialists, it needs usability testing. Many things are non-obvious and too domain-specific, yet with a bit of work could be more broadly useful. Interaction could include a lot more immediate, useful and interesting looking displays. A web based version could have these features. However the team seems somewhat ambivalent about development. :)
* Rogers [and HTC] made a mistake with the Dream's inability to upgrade and the Magic offer was not an adequate response [for many Dream purchasers]
* customers should be able to more easily replace or upgrade devices purchased through Rogers without having to go through bizarre 'contract cancellation' contortions. ideally, within a few clicks through their myRogers account
* It's good Rogers has clarified we can use third party devices, and even embraced the Nexus One. However, people who are not using a subsidized device should be paying a lower fee [rogers can use its position as a hardware purveyor more competitively and progressively]
Finally, I would like a "my profile" link on this site so I can find all postings I've made, and see others' profiles. Several postings I made haven't shown up after a week.


Looking forward to learning about programming using GATE libraries.
Posted April 1, 2010 to http://redboard.rogers.com/2010/redboard-one-month-how-are-we-doing/


{{Blikied|Aug 30, 2010}}
[[Category:Advocacy]]
 
[[Category:Android]]
[[Category:SemWeb]]
