Dave Morton - Oct 27, 2012:
What criteria should we use for determining “positive” or “negative”, Jan? I’ve run through a couple, and they were more or less in the realm of “gobbledy-gook”. For example:
RETWEET na yg sayang aku :p wkwk
Harap kawasan je besar tapi lampu jaa
@carolcorazza htt p://t.co/1bKK8TmV (text altered to prevent link creation)
Ok, now granted, some of the above examples are in languages I don’t recognize, let alone read, but still, should I just use the “don’t know” button liberally?
I would also suggest a “neutral” button. Things like just the number 100 are neither positive nor negative, and “don’t know” isn’t proper for those. Just a suggestion.
Yes, the foreign languages are a problem. I’m trying to filter on English only, but apparently many tweets in the data are falsely flagged as ‘english’.
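For what it’s worth, a second-pass check on the text itself could catch some of those false flags. This is only a rough sketch, not what the tagging tool actually does: the function name `looks_english`, the tiny stopword list and the 0.2 threshold are all made up for illustration. It also wrongly rejects very short tweets such as a lone “YESSSS”, so treat it as a starting point only.

```python
import re

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch);
# a usable filter would need a much larger one.
ENGLISH_STOPWORDS = {
    "the", "a", "an", "and", "or", "is", "are", "was", "to", "of", "in",
    "on", "for", "i", "you", "it", "we", "my", "that", "this", "not",
    "but", "so", "just", "have", "be", "with", "at", "me",
}


def looks_english(text, min_ratio=0.2):
    """Return True if enough of the words are common English stopwords."""
    # Drop URLs and @mentions so they do not count as words.
    cleaned = re.sub(r"https?://\S+|@\w+", " ", text.lower())
    words = re.findall(r"[a-z']+", cleaned)
    if not words:
        return False
    hits = sum(1 for word in words if word in ENGLISH_STOPWORDS)
    return hits / len(words) >= min_ratio


samples = [
    "RETWEET na yg sayang aku :p wkwk",
    "Harap kawasan je besar tapi lampu jaa",
    "I just have to say that we won the game",
]
for sample in samples:
    print(looks_english(sample), sample)
```

The first two samples are the tweets quoted above and contain no stopwords from the list, so they are rejected; the third is an invented English sentence.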
The idea is to use the ‘I don’t know’ button as liberally as possible. When you’re not certain, just press that one. At the moment, this is not a recorded value.
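To illustrate what “not a recorded value” means in practice, here is a minimal sketch, assuming labels live in a plain dictionary keyed by tweet id. The names `RECORDED_LABELS` and `record_label` and the label strings are hypothetical, not taken from the actual tool.

```python
# Only these two labels are stored; anything else (e.g. "dont_know") is skipped.
# The label strings are assumptions made for this sketch.
RECORDED_LABELS = {"positive", "negative"}


def record_label(store, tweet_id, label):
    """Store the label if it is a recorded one; return True if it was stored."""
    if label not in RECORDED_LABELS:
        return False  # the tagger simply moves on to the next tweet
    store[tweet_id] = label
    return True


labels = {}
record_label(labels, 1, "negative")
record_label(labels, 2, "dont_know")  # skipped, leaves no trace in the data
record_label(labels, 3, "positive")
print(labels)
```

Because skipped tweets leave no trace, pressing ‘I don’t know’ liberally cannot pollute the training data; it only costs a click.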
I know what you mean about the ‘neutral’ thing. I have been thinking about that one also. For the training algorithm that I currently have in mind, this isn’t really important, but perhaps for the future.
Don’t know yet. For the time being, just say ‘don’t know’ and you’ll get the next one (there are plenty of tweets to tag).
In the future, I might go through some statements a second time to tag them with things other than positive/negative.
What criteria should we use for determining “positive” or “negative”, Jan?
Anything that clearly leans one way or the other. For instance, insults, or people telling how miserable they feel = negative. People going like YESSSS, or WE WON = positive.
Take a peek at the ‘results’ tab. I’ve already labeled some.