Automatic Tagging

The article “Automatic Detection of Tags for Political Blogs” described a system created to automatically generate tags for political blogs.  Tags are generally used to quickly find blog posts related to specified keywords, and at present they are assigned by the user.  This can create problems, as it requires the user to identify all individual keywords within a post, assign each the same tag used the previous times that keyword was tagged, and take the time to apply the tags.  The system is flawed, and it sometimes goes unused entirely when bloggers are hurried or lazy.

In retrospect, I believe this was done on political blogs because it would be easiest to apply there.  Political blogs are very clear on their keywords, and generally focus on the same keywords, many of which relate either to perennial issues or to current events.  As such, their keywords are predictable and easy to identify.  An automatic tagging system in this case simply needs to look for names that are politically important and other keywords that are currently politically relevant.

As someone who has written for numerous blogs on numerous topics, I feel an automatic tagging system would be incredibly convenient.  Of course, with many (if not most) blogs, keyword identification may not be nearly as feasible as with political blogs.  As such, an automatic tagging system might not be ideal for most blogs.  On the other hand, if linguistic algorithms can be designed to identify the keywords, normalize the words, and apply tags with high accuracy, such a system could potentially be incredibly useful.  As the system would not be perfect, users should be able to edit, add, and remove tags, and should be able to toggle automatic tagging off if they choose.  However, I know that I, among others, would much prefer automatic tagging.  As noted in this blog, I often don’t even bother with tags.  I often feel it’s an unnecessary waste of time, with the exception of widely read blogs on specific topics.  I will admit, though, that tagging enhances any blog, even personal ones, and automatic tagging would not be an unnecessary feature.
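The three steps mentioned above (identify keywords, normalize them, apply tags) can be sketched in a few lines.  This is only a minimal illustration under my own assumptions, not the system from the article: the keyword list, the tag names, and the function names suggest_tags and apply_user_edits are all hypothetical, and a real system would need a far larger, learned or curated vocabulary.

```python
import re

# Hypothetical mapping from keyword variants to one canonical tag
# (an assumption for illustration, not taken from the article).
CANONICAL_TAGS = {
    "election": "Elections",
    "elections": "Elections",
    "health care": "Health Care",
    "healthcare": "Health Care",
    "tax": "Taxes",
    "taxes": "Taxes",
}

def suggest_tags(post_text):
    """Return a sorted list of canonical tags whose keywords appear in the post."""
    # Normalize: lowercase and collapse runs of whitespace.
    text = re.sub(r"\s+", " ", post_text.lower())
    found = set()
    for keyword, tag in CANONICAL_TAGS.items():
        # Match whole words or phrases only, so "tax" does not match "taxi".
        if re.search(r"\b" + re.escape(keyword) + r"\b", text):
            found.add(tag)
    return sorted(found)

def apply_user_edits(suggested, added=(), removed=()):
    """Let the blogger add or remove tags on top of the automatic suggestions."""
    return sorted((set(suggested) | set(added)) - set(removed))
```

Mapping every variant to one canonical tag is what keeps “healthcare” and “health care” from ending up as two separate tags, and apply_user_edits reflects the point that the blogger should keep the final say.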

The fact that automatic detection of tags was designed for political blogs with high accuracy makes me feel that a full automatic tagging system might be possible to design and implement.  It is simply a matter of someone investing the time and energy to create such a system.

Filed under Uncategorized

Dialogue in Videogames – Developer’s End

Now, I know I just wrote a paper that looks critically at dialogue in video games.  I looked specifically at player dialogue in online games.  And yet that is not the only kind of dialogue that exists in video games, and a couple of the other projects that focused on language in video games got me thinking about other areas of dialogue in video games.  Specifically, I started thinking of the developer’s dialogue.

Perhaps I should be more accurate in what I mean by the developer’s dialogue.  I mean dialogue of non-playable characters (NPCs), and even the scripted dialogue of playable characters (PCs).  After all, I was originally a game design major with a minor in creative writing, and even now my current major is a blend of game design and creative writing.  I had a large interest in writing in video games, specifically script writing (or dialogue).

Dialogue is quite important in most genres of video games.  Some, like first person shooters, use dialogue to communicate missions, while others, like role playing games, are incredibly reliant on dialogue for the sake of storytelling.  In the case of massively multiplayer online games (MMOs), PC dialogue is usually absent and NPC dialogue is rarely as important as player dialogue.  Game play is not as story driven in these cases, as rewards and community seem to overshadow the story.  I know this from experience playing MMOs; players usually complete tasks either to reap rewards or to help others reap rewards.

Offline games tend to rely on dialogue to tell story, though there are exceptions.  Braid, an award winning Xbox Live Arcade game, is a beautiful example of this. Much of the story is read in books placed at the start of each world.  However, near the end of the game, the hero helps the princess (his lover) escape from a villain and reunites with her.  Well, it seems that way, except it then reveals that the event was in reverse, and playing it back shows the princess was trying to escape from the “hero”, who was smothering her, and the “villain” was helping her get away from him. This entire scene is conveyed with no dialogue, showing that dialogue is only one way for video games to tell stories.

Does this mean dialogue is in danger of being obsolete in single player video games?  Could different story-telling techniques ever become the norm for video games?  I highly doubt that.  Though online games make dialogue between players more and more important, most single player games seem to rely heavily on dialogue.

One of the most popular video games of the past year, Portal 2, relies heavily on dialogue, despite the fact that the protagonist is mute.   The villain, an insane AI by the name of GLaDOS, constantly degrades the protagonist, and her condescending insults have become so popular that there are even GPSes to replicate her voice and mannerisms.  A supporting character, an unintelligent robot named Wheatley, both guides and amuses players throughout the adventure.  Even a minor character, a robot obsessed with outer space (aptly named the Space Personality Core), has become a well-known internet meme.

I could, of course, give countless examples of dialogue usage in video games.  It’s a full career in itself.  Many games have multiple writers, as larger games may have tens of thousands of lines of dialogue, if not more.  Branching stories have added layers of complexity much deeper than Choose Your Own Adventure books ever could reach.  It’s definitely interesting to see where dialogue may go in future video games.  Perhaps it will be possible to have actual conversations with NPCs, eliminating the need for the PC to talk at all.  Either way, I’m sure there is a lot more to research on this particular subject.

Hashtag Popularity

Now I personally am not a Twitter user.  I do have a Twitter account, but I only used it for a short while before growing bored with it.  As such, I am in no way well-informed on the subject of hashtags.  I do still know how they are used, though.


The study on how hashtags propagate is definitely interesting.  However, I was left with more questions than answers.  I felt the study was very limited.  From what I gathered, they only picked three hashtags to evaluate, all of which were popular hashtags.  These hashtags followed the idea that the rich get richer.  Yet, wouldn’t the fact that the hashtags were already growing in popularity affect the results?  Even if these hashtags were randomly picked without prior knowledge of the popularity, three hashtags hardly represent the behavior of all hashtags.  Is it possible hashtags could fluctuate more?  Could there be ones that reach a certain popularity and then stay constant or even decrease?  Could hashtags remain unpopular for a while and suddenly gain some popularity?  Could other hashtags repeatedly increase and decrease in popularity over a course of time?
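To make the “rich get richer” idea concrete, here is a toy simulation in which each new tweet picks a hashtag with probability proportional to how often that hashtag has already been used.  This is my own simplified assumption about what the phrase means, not the model from the study, and the function name, hashtags, and starting counts are made up for illustration.

```python
import random

def simulate_hashtag_use(initial_counts, new_tweets, seed=0):
    """Toy preferential-attachment model: popular hashtags attract more new uses."""
    rng = random.Random(seed)      # seeded so a run can be repeated exactly
    counts = dict(initial_counts)  # copy, so the caller's data is untouched
    tags = list(counts)
    for _ in range(new_tweets):
        # The chance of picking a hashtag is proportional to its current count.
        weights = [counts[tag] for tag in tags]
        chosen = rng.choices(tags, weights=weights, k=1)[0]
        counts[chosen] += 1
    return counts

if __name__ == "__main__":
    # Hypothetical starting point: one popular hashtag and one obscure one.
    print(simulate_hashtag_use({"#popular": 10, "#obscure": 1}, new_tweets=100))
```

A model like this can only ever produce growth, which is exactly why it cannot answer the questions above about hashtags that plateau, decline, or rise and fall repeatedly.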

I could go on and on with questions.  I felt the research was very limited and more research could be done.  It did pique my interest, and the conclusions of the study were logically sound in addition to being backed up by the research.  I just think there could be more research done with a broader range of hashtags.

At the same time, how relevant is this research?  Of course there are ways the results could be utilized, such as spreading information quickly in emergencies, but is the technology constant enough to be worth studying?  Will the results of these studies be obsolete in a decade?  Perhaps that can’t be answered, and the results might prove useful for future technologies.  Personally, sheer curiosity would be enough for me to carry out research in this direction.  The question is, would it be an efficient use of time and money?

Internet Multimedia as Linguistics

It’s quite interesting how linguistics on the internet is becoming more and more dependent on other media.  Previously, written language was fully based on text, with the exception of cases where pictures were relevant (picture books, signs, etc.). However, the internet has created a sort of network between all forms of media.

Of course, there are plenty of websites that are purely text.  Even websites that are text and images may use alt text, allowing the images to be replaced by text (primarily for use when the images cannot load and in cases of accessibility for the blind).  Yet, it is becoming more and more rare to find websites that only rely on text, or even text and images.

A perfect example of this level of multimedia is shown on Facebook.  Facebook users often rely on the ability to use many forms of media, many times seamlessly. Pictures are uploaded, sometimes moments after being taken, and friends can be tagged in them, whether to link them to images of themselves or simply to grab their attention. In the same way, videos can be uploaded, and, though Facebook has yet to support uploading audio, many users post YouTube videos to share music.  Apps allow people to not only participate in interactive media, but share them as well, potentially adding infinite other ways for media to be integrated into communication.

And then there’s the hyperlink.  This little key component of the world wide web provides a linguistic pathway heretofore unparalleled, and the applications of this pathway are endless.  This is the primary thread connecting the multimedia together, with text (and sometimes images) usually being the backbone of it all.  As an example, two friends on Facebook might have a conversation.  One might find an image to work better than text as a response, linking that.  Further into the conversation, the other friend might tag one of their friends to pull them into the conversation instead of communicating directly.  That friend might know of a related web page and choose to add another hyperlink.  In this way, potentially endless links are formed, most of which are intended as communication.  One can argue that, though it isn’t spoken or written language, these media are all being used as language.  I believe that, arguably, the internet has become one of the biggest changes to linguistics since the writing system was created.  It is giving the average person an incredible number of tools to creatively and functionally affect the very way they converse.

The number of media available to the internet user is still increasing.  Developments are being made in the areas of smell, taste, and touch.  We could grow to the point where we could download scents to fill our homes and share new recipes first by taste.  Hologram technology has even reached breakthroughs such as touchable holograms, meaning someday we might be able to download virtual items we could actually feel.  These technologies are advancing at an incredible pace.  How might they affect linguistics?

Twitter Propagating News

After reading about how Twitter was used to spread information in emergency situations, I started thinking about how applicable this could be.  It was already shown that there is a difference in how quickly a tweet is retweeted based on whether the tweet is factual.  Simply integrating some sort of algorithm to allow users to see how accurate the information is could help the propagation.

I actually believe this is the future of emergency systems.  Twitter is perfect for getting across important information very quickly.  In situations of emergencies, the truth algorithm could kick in.  No longer would the public have to rely on news networks to gather and post information on the television, radio, and internet.  The public themselves could contribute any useful information and control the spread of factual information.

Granted, a lot of this was covered in the article we read for class, but I believe this is a tool that should be applied by Twitter.  Perhaps more studies should at least be carried out to find out if this is truly an accurate method.  Either way, I think it would be foolish to let this research go to waste.

Forensic Internet Linguistics

Evaluating internet conversations to identify potential criminals is a task I wish could be invested in more.   The internet makes criminal behavior such as stalking and acts of violence much easier while leaving the criminals hard to track down.  We constantly hear stories of pedophiles using chat rooms and social networking websites to connect with innocent teenagers and abuse them.  It is a real threat and one that could potentially be prevented.

David Crystal used some realistic methods of identifying predators.  Unfortunately, his corpus was relatively limited, as it is hard to obtain data from these cases.  I think it would not be an infringement of rights for this information to be shared to prevent these crimes.  It’s not for publication or even for sheer curiosity.  It’s for safety.

I also do not think it would infringe on rights to survey children’s internet conversations.  Minors already do not have all the rights of an adult, and this is for protection.  Why shouldn’t their online activity be moderated, at least by computer programs looking for the suggestive words such as the ones that Crystal used in his study?  It’s not to limit privacy.  Molestation and rape are real dangers.  I know individuals who have gone through these awful experiences, and it can scar people for life.

It is interesting how language usage can reveal intentions.  Isolated words aren’t necessarily bad, but how often they’re used can be enough to signal crimes before they occur.  Unfortunately, I have my doubts such a system could be perfect.  In rare cases, there may be a high level of suggestive words without any malicious intentions.  I would hope this wouldn’t lead to anyone being falsely criminalized.  I do not know why there would be a high level of suggestive words in such a case, but it’s a factor to consider.  Also, criminals aren’t necessarily unintelligent.  Ones that get away with their crimes may be skilled at being discreet, and so I am certain there would be predators who would find ways around the system.

Despite risks, I think that this research should be taken further.  It’s a huge advancement in the safety of children, which is something I do not believe is worth ignoring.  I don’t know how to easily obtain information and permission to carry this out, but I think it is something that truly needs to be done.  The internet has brought about a new medium for criminals, and it is important that, as we do with all other mediums, there be laws for the safety of individuals, especially children.

Online Ad Efficiency

Reading about how online ads work and the mistakes they can make really made me reflect on situations where I’ve seen them fail.  Though I’ve seen countless examples of this, a lot of the ones I remember clearly are from Facebook.  Many times I not only feel the ads I see on Facebook are irrelevant, but I also have trouble finding out where they come from… but perhaps further examination can clear up where they went wrong.


It’s nothing new for me to see ads for dating websites when on Facebook.  The best I can gather is that, because my profile says I’m single, Facebook assumes I’m looking for a girlfriend.  The ads are at least correct in assuming I’m a straight male, though that much can easily be gathered from my profile.  However, for about a year I constantly would see ads for Muslim dating sites.  I am in no way Muslim.  My profile states I’m a Christian, which explains why I sometimes see ads for Christian dating sites, but contradicts the ads for Muslim dating sites.  Sometimes ads appear because multiple friends have clicked like for them, but none of them have clicked like for any of the Muslim dating sites.  In fact, only one of my friends is Muslim, and I hardly talk to her (in addition to the fact that she lives in California, quite far from Rochester, NY).  I haven’t posted anything written in Arabic, nor do I appear to even look Muslim (or even Middle Eastern for that matter).  I simply cannot understand where the assumption that I would be interested in Muslim dating sites came from.  Quite clearly this was a failure on the advertising end, and could easily have been rectified; an ad like this should appear only on profiles stating the individual is a single Muslim.

The worst part is that I clicked the “x” to get rid of the ads, stating the reason being they were against my views.  It would be great if that got rid of the ads permanently, but the next time I logged onto Facebook, the ads were back.  I tried multiple times to get rid of them, so clearly the system did not remember my preferences, which might not be a linguistic problem but is a problem nonetheless.


Another good example is  An ad for this website appeared repeatedly on my profile for quite a while. I am not a Mormon.  Once again, I am listed as a Christian.  Some consider the Mormon religion a branch of Christianity.  Ignoring my views on whether it is, and assuming the ad assumes it is, that does not mean all Christians are Mormons.  In other words, I gave the website no reason to believe I am a Mormon, nor that I am interested in becoming a Mormon.  I will admit I do have one friend who is a Mormon, and it is possible I was speaking to her often at the time this ad would appear, so perhaps that was triggering the ad.  However, it should not, as I had never expressed interest in becoming a Mormon, to her or anyone else.  Once again, this ad was falsely aimed at me.  Perhaps in this case, I had at some point used the word “Mormon” in talking to her, meaning it is very possible that it identified that word and assumed I had an interest in the religion.  I feel this ad would be better targeted towards individuals who state, in their profile, that they are Mormons.

Once again, the “x” refused to get rid of the ad, leading me to believe that the “x” does not actually do anything, or that whatever it does is fairly inconsequential.  I do believe that the “x” could have a linguistic function, by identifying the key words of the ad, determining which part or parts of it were insignificant to me, and using that for future ad placement.  As far as I can identify, it just removes the ad for the rest of the online session.
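What I have in mind for the “x” could be sketched roughly as follows.  This is purely my own illustration of the idea, not how Facebook works: the class name AdPreferences, its methods, and the keyword lists are all hypothetical.

```python
class AdPreferences:
    """Remembers which ad keywords a user has rejected, across sessions."""

    def __init__(self):
        self.blocked_keywords = set()

    def dismiss(self, ad_keywords, irrelevant):
        """Record the keywords of a dismissed ad that the user marked irrelevant."""
        irrelevant_lower = {word.lower() for word in irrelevant}
        for keyword in ad_keywords:
            if keyword.lower() in irrelevant_lower:
                self.blocked_keywords.add(keyword.lower())

    def allows(self, ad_keywords):
        """An ad is allowed only if it shares no keyword with the blocked set."""
        return self.blocked_keywords.isdisjoint(k.lower() for k in ad_keywords)

if __name__ == "__main__":
    prefs = AdPreferences()
    # The user closes a diabetes ad and says the "diabetes" part is irrelevant.
    prefs.dismiss(["diabetes", "health"], irrelevant=["diabetes"])
    print(prefs.allows(["diabetes", "recipes"]))  # another diabetes ad
    print(prefs.allows(["health", "fitness"]))    # a general health ad
```

The point of blocking only the keywords the user marked is that one click should rule out similar ads in the future without ruling out everything loosely related to them.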


Another example: ads for research needing gay men with HIV.  I only fill one part of that requirement, being that I am a man.  My profile clearly states I am interested in women, not men, meaning right off the bat it failed to identify very easy to obtain data, wasting the advertising on someone who is admittedly straight.  The HIV part may not have actually been used for placement, as many with HIV don’t publicly state it.  Ideally, the ad could be shown for men stating they are gay and have HIV, and thrown out there sometimes for gay men who don’t state they have HIV, but there is no reason I should have gotten it, unless it was assuming I was a closet gay male with HIV.  I gave it no reason to assume that, though, so I feel it was another wasted ad.


One of the most recent misplaced ads I’ve gotten on Facebook has been for getting help with my diabetes.  FYI, I do not have diabetes.  Once again, I hadn’t ever said anything about having diabetes.  Granted, this isn’t something that would be noted in one’s profile, but I’ve never posted anything on Facebook that should lead the system to believe I do have diabetes.  Another ad that seems to have failed.


I’ve seen Facebook ads fail more times than succeed.  I do not know how they decide their placement, but I do know not everyone receives the same ads.  Sometimes it guesses right.  Sometimes it notes things in my profile that I like, claiming another product is similar and that I should try it.  The products have ended up not being very similar, but the ads were placed correctly.  However, there are still glaring problems in the ad placement.  I think most of these could be fixed by having them look through key parts of the profile, rather than everything posted.  Others didn’t seem to look through anything, as far as I could tell, else they would not appear, as I indicated information that clearly contradicted the ads’ requirements.  For such a widely used website, Facebook’s ads are very inefficient.  There is a lot of marketing opportunity to tap into if someone would simply fix the system.
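The fix I am suggesting, checking an ad’s requirements against key profile fields only, is simple enough to sketch.  Again, this is a minimal illustration under my own assumptions and not Facebook’s actual placement logic; the field names, the example profile, and the function ad_matches_profile are all hypothetical.

```python
def ad_matches_profile(ad_requirements, profile):
    """Show an ad only if every required field is stated in the profile
    with exactly the required value; a missing field counts as no match."""
    return all(
        profile.get(field) == required
        for field, required in ad_requirements.items()
    )

if __name__ == "__main__":
    # Hypothetical key profile fields, loosely based on the examples above.
    profile = {
        "relationship_status": "single",
        "interested_in": "women",
        "religion": "Christian",
    }
    christian_dating = {"relationship_status": "single", "religion": "Christian"}
    diabetes_help = {"has_diabetes": True}
    print(ad_matches_profile(christian_dating, profile))
    print(ad_matches_profile(diabetes_help, profile))
```

Treating a missing field as a non-match is deliberately strict.  It would never show me the diabetes ad, but it would also miss people who simply don’t state something on their profile, so a real system would have to balance the two.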
