Archive | December, 2013

Misspelling(s) of the Year

19 Dec

So I just wrote a guest blog post over at Dictionary.com on The Misspelling of the Year, which is also featured in a BuzzFeed listicle. There are actually three prominent misspellings for furlough (my favorite is ferlow).

The word comes from the Dutch verlof. That ver part is something like ‘forward’. In English it turned into fur but that doesn’t make it related to furtive (which we get from French where the fūr comes from ‘thief’). Nor is it related to furious (which we also grabbed from French, tracing back to furia, a state of frenzied rage). The lof part is related to German laub, which connects it to leave and maybe believe–‘laub’ seems to be about pleasure and approval. A verlof/furlough was permission for a leave of absence. In ye olden years, furlough really was pronounced with a final /f/ sound, which uh makes the -ough ending make a little more sense? (Okay, no.)

To get misspellings of the year, I looked at search terms at Dictionary.com for the last two years and sorted out which ones had had the most significant increases. One thing that doesn’t get reported in the other posts is which words are getting spelled more standardly. The following misspellings went down last year while the standard spellings went up.

  • absolutly
  • arguements
  • greatfully
  • senarios
  • unanymous
  • daquiri
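For the curious, here is a minimal sketch of that kind of year-over-year comparison. It assumes you have a search count per term for each of two years and that "most significant increase" means relative change; both the measure and the counts below are invented for illustration and are not the actual Dictionary.com data or method.

```python
def relative_change(before, after):
    """Fractional change in search volume from one year to the next."""
    if before <= 0:
        raise ValueError("need a positive count for the earlier year")
    return (after - before) / before


def rank_by_change(counts):
    """Sort terms from biggest relative increase to biggest relative decrease.

    counts maps each term to a (year_one_count, year_two_count) tuple.
    """
    return sorted(counts, key=lambda term: relative_change(*counts[term]), reverse=True)


# Hypothetical counts, made up purely to show the arithmetic.
counts = {
    "ferlow": (40, 400),
    "absolutly": (900, 600),
    "absolutely": (5000, 6500),
}

for term in rank_by_change(counts):
    print(term, round(relative_change(*counts[term]), 2))
```

A misspelling whose change is negative while its standard spelling's change is positive is the pattern described for the list above.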

Meanwhile, if this is your first time to the Idibon blog, welcome! Here’s a quick selection of other posts on “popular linguistics”:

And since Dictionary.com’s Word of the Year was “privacy”, here’s a related post from us on using (our) natural language processing tools to protect privacy and security:

– Tyler Schnoebelen (@TSchnoebelen)

Petulant punctuation

3 Dec

Last week, Mark Liberman over at Language Log wrote about Aggressive periods and the popularity of linguistics, which was really a review of Ben Crair’s article on The Period Is Pissed over at TNR. One of Liberman’s main points was that the number of comments on Crair’s article (125 when he wrote) was way more than on any of the other articles in TNR’s Culture section (a distant second went to Why Didn’t an American Make ’12 Years a Slave’? with 14 comments). Language peeves and counter-peeves are clearly a big industry.

Crair suggests that periods can turn neutral phrases into negative ones, or as he writes, “people [can] use the period not simply to conclude a sentence, but to announce ‘I am not happy about the sentence I just concluded.’”

In this post, I want to come at the question of the affective slant that punctuation might take. (If you want a lot of reading, you can check out my work on affective linguistic resources, which is what a period would count as if it were communicating crankiness—the emoticons part is pages 190-259.)

One way to test Crair’s hypothesis is to have a bunch of people evaluate a bunch of text with and without final periods and score them for aggressiveness. As a proxy, we might use the Stanford Sentiment Analysis Corpus, since its creators had people judge sentences and parts of sentences. There are over 14,000 pairs that differ merely by whether the raters saw a final period or not. But when we compare the average ratings for the with-period and without-period conditions, the result is…

Nothing.
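If you want to see what that comparison amounts to, here is a minimal sketch. It assumes each pair is simply a (rating with period, rating without period) tuple on some numeric scale; the ratings below are invented for illustration and are not drawn from the corpus.

```python
from statistics import mean


def mean_paired_difference(pairs):
    """Average of (with-period rating minus without-period rating) over all pairs."""
    if not pairs:
        raise ValueError("need at least one pair")
    return mean(with_period - without_period for with_period, without_period in pairs)


# Hypothetical ratings, made up to show the shape of the calculation.
pairs = [(0.50, 0.52), (0.81, 0.79), (0.23, 0.23)]
diff = mean_paired_difference(pairs)
print(f"{diff:+.3f}")
```

A difference near zero means the period is not shifting ratings one way or the other, which is the "nothing" above.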

But you should have an objection: a movie review corpus does not really test Crair’s hypothesis. What Crair is considering is the kind of exchange where you text a friend or lover and they text back “It’s ok” versus “It’s ok.” (uh-oh). Movie reviews aren’t the kind of interpersonal genre that can really evaluate Crair’s idea. You have to match the right analysis with the right data—even within a single language, the way linguistic resources are put to use differs a lot by context.

So let’s move away from “sentiment” (positive/negative/neutral) and into something a bit more interesting: emotion. And let’s move to a corpus that involves more conversation: 9 million directed social media interactions. The emotional coloring of punctuation can be assessed quickly by the contexts it appears in. There are a variety of ways of closing off social media interactions. Let’s look at these in final position:

. period
! exclamation
? question mark
... ellipses
:) smiley face
:( frowning face

Our results show the period being used expressively instead of just being a neutral default ending. I won’t go into too much detail about what emotion prediction looks like in Idibon (stay tuned), but here’s a phenomenon that gives you the idea, namely the correlation between final punctuation and affective second-person phrases: love you, miss you, and (avert your eyes if you need to) fuck you.

In the following graph, you’ll find that the strongest relationship by far is between miss you and the frowny emoticon (the no-nose dialect). They appear together 584% more often than we’d expect if everything were randomly distributed. There’s also a strong negative relationship between love you and final question marks—that is, there are far fewer messages that have love you somewhere in them and also end in a question mark than we’d expect just by multiplying the percentage of overall love you messages times the percentage of overall messages that end in question marks. There’s also a strong constraint against smiley faces and fuck you.

Alright, so these relationships look like what we’d predict. What about the period?

[Graph: pissedperiod-1]

If the period were pissed-off, it probably wouldn’t want to appear with love you or miss you, and there are indeed constraints on them occurring together in this data. By contrast, if it’s getting pissy, it may well occur with fuck you. And it does: 113% more than we’d expect. In other words, that is more than double what we would expect if there were no correlation. For positive correlations, the fuck you/period combo is surpassed (in this data) only by the predictable correlation between miss you and a frowny face. This is a bit surprising to us, especially since exclamation marks appear with fuck you only 4% more than if there were no correlation.
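The "more than we'd expect" numbers come down to comparing an observed count with what independence would predict. Here is a minimal sketch of that arithmetic, assuming all you have are four message counts; the function name and the counts are made up for illustration and are not the real data or Idibon's code.

```python
def percent_over_expected(n_both, n_phrase, n_punct, n_total):
    """Percent by which observed co-occurrence exceeds the independence baseline.

    n_both:   messages containing the phrase AND ending in the punctuation
    n_phrase: messages containing the phrase
    n_punct:  messages ending in the punctuation
    n_total:  all messages
    """
    if min(n_phrase, n_punct, n_total) <= 0:
        raise ValueError("counts must be positive")
    # Under independence: P(phrase) * P(punct) * n_total, which simplifies to this.
    expected = n_phrase * n_punct / n_total
    return (n_both / expected - 1) * 100


# Hypothetical counts, chosen only so the arithmetic is easy to follow.
lift = percent_over_expected(n_both=213, n_phrase=1_000, n_punct=100_000, n_total=1_000_000)
print(round(lift))
```

With these invented counts the expected number of co-occurrences is 100, so 213 observed is 2.13 times the baseline. That is why "113% more" and "more than double" are the same claim. A negative result would mean the pair shows up less often than chance predicts.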

So the evidence here seems to support Ben Crair’s hypothesis and it is a novel discovery. But here’s how it makes sense: if the convention is to end sentences with a period, then a period is just a default. But once you start making the convention that you don’t have periods (like in texting and social media conversations), suddenly the period has room to take on expressive, emphatic connotations. You. Have. Seen. This. Before. What’s most unexpected is that there is enough of a signal for it to stand out. We expected all the other correlations, but to be honest, we didn’t expect to find the power of the period.

So drop your periods unless you’re peeved

– Tyler Schnoebelen (@TSchnoebelen)

ps: .

pps: I can’t help it. It needed that period. It’s not you, it’s me. You’re fine. You’re wonderful. Uh-oh