Archive | June, 2016

Which new emoji will be the most popular?

20 Jun

June 21st is the release of Unicode 9, which will feature 72 new emoji–folks at Emojipedia have helpfully put them all together. The question in this blog post is: which ones will turn out to be the most popular? (Note that most people aren’t going to be able to use them immediately–you have to get an update of your phone/browser for them to show up and so will anyone you want to send them to.)

unicode-9-emojis-emojipedia.jpg

Two emoji that won’t become popular are the rifle and the modern pentathlon, since they won’t be easy to access. In May, Apple led an effort against them, so you almost certainly won’t see them on any keyboard, even though I believe the code will be in place.

Using past data to predict the next round

In general, you should bet on hearts, faces, and hand gestures. Here are some screenshots from emojitracker.com and EmojiXpress, which show what’s been most popular on Twitter and SMS text messages, respectively.


Top emoji of all time on Twitter by emojitracker.com (the coloring doesn’t mean anything you need to worry about)


Top emoji of all time from emojiXpress, which represents use in SMS text messages

EmojiXpress also helpfully shows which of the newest emoji have been most popular:


The most popular of the newest emoji used on emojiXpress

Note that there was a pretty big campaign for the taco, but emojiXpress currently has it 21st among the emoji that were released last year. So I don’t think that augurs well for those of you who want to predict bacon. I took a quick look at the usage of taco over the last several days and there’s no upward trajectory; it’s plodding along at the rate it has been for the last several months. If anything in the Unicode 8 emoji is trending, it’s probably the scorpion, but it’s going to take a while to overtake even the chipmunk.

Prediction: Overall

My prediction for the number one overall spot is the ROFL face because there’s a strong tendency for people to use emoji to express happy states of affairs. My I-hope-it’s-not-the-runner-up pick is the black heart.

Screenshot 2016-06-20 13.39.29.png

Prediction: Not-faces

I think it’s likely that the shrug and the face palm are going to have aficionados. And while I like the John Travolta moves on the dancing man, I’d rather live in a world in which we all agree that EVERYONE is a woman-in-a-red-dress 💃 (note that not all platforms show a red dress).

Meanwhile, there’s a cartwheel, which really should be popular, but it’s going to appear in the athletic section so most people will miss it. And they’ll miss water polo, too, which is a shame not just because I used to play but have you seen water polo players?

Even before skin tones were easily available, people using the #blacklivesmatter and #icantbreathe hashtags on Twitter were using a lot of the fist emoji to indicate solidarity and Black Power. Now that skin tones are available people can use hand gestures and other people-emoji that more accurately describe them.

The new batch of hand gestures can be used playfully, positively and politely (handshakes and fist bumps being ways of making contact). I’m not quite sure how having left- and right-facing fist bumps will work. Most emoji face just one way, so you have to run and drive off to the right 🏃🚗🚓. It’ll be neat if people offer a fist bump facing right and then get a reply that has a fist bump facing left to connect.

Back to popularity. I want to vote for the shrug, but I’m afraid that the fact that it looks like it’s going to show a woman’s head by default means that a lot of people won’t use it. But like the dancing woman, we should all use it. The smart money is likely on the raised hand since it can mean so much (stop, high five, etc.). But I’m going to wager that folks using and making fun of selfies are going to cause the selfie emoji to take off:

Screenshot 2016-06-20 15.05.49.png

Prediction: Animals

Finally, as much as I like the gorilla, it’s probably not going to win the animal bracket. Animals are an interesting class because they are all nouns. They offer further evidence that emoji aren’t really about substituting for nouns. Instead emoji are usually about emotional stance, identity, and metaphor. The most popular animals include the see-no-evil monkey, which people don’t use to talk about the actual animals 🙈. The cat-faces with heart eyes or tears are also popular 😻. The unicorn is also very popular among the new emoji–and it doesn’t even really exist…but it can convey sparkle magic.

Despite my warning about treating emoji as if they are just noun-pictures, let’s look at how they are used as nouns. For example, the fox is talked and written about very often, at least in American English over the last five years. But that’s mainly because of Fox News. My plea: do not use the fox emoji to refer to this organization.

Search term                   Per million words
a fox                         2.84
a deer                        2.82
a shark                       1.81
a bat (not disambiguated)     1.69
a duck                        1.64
a butterfly                   1.37
an eagle                      1.24
an owl                        0.85
a gorilla                     0.28
a shrimp                      0.28
a lizard/gecko/newt           0.25
a rhino/rhinoceros            0.20
a squid                       0.17
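
For anyone who wants to compute this kind of figure themselves, a per-million rate is just the number of hits divided by the size of the corpus, scaled up to a million words. Here is a minimal Python sketch. The function name and the numbers in the example are made up for illustration; they are not the counts or the corpus size behind the table.

```python
def per_million(hits: int, corpus_words: int) -> float:
    """Return how often a search term occurs per million words of text."""
    if corpus_words <= 0:
        raise ValueError("corpus_words must be positive")
    return hits * 1_000_000 / corpus_words


# Hypothetical numbers: 300 hits for a term in a 200-million-word corpus.
rate = per_million(300, 200_000_000)
print(f"{rate:.2f} per million words")
```

Normalizing this way is what makes it fair to compare terms across corpora of different sizes, which matters once you start adding other languages.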

Lots of people who aren’t Americans or English-speakers use emoji, of course. The top new animal emoji for Spanish-speakers may be the butterfly, the duck and the fox. For Portuguese speakers, it may be the lizard, the butterfly, the shark, and the eagle. Properly, though, I should use bigger corpora for those languages and add at least a half dozen more languages. Even better would be to look at image search results to see which animals people are searching for.

If this post could put me in touch with Joan Embery, that would be swell. But until she weighs in, I’ll make the claim that butterflies are also the most likely of these animals for people to encounter worldwide, since they seem to be distributed everywhere in the world except Antarctica. But I am skeptical of using in-life frequency to predict emoji frequency. So it is because butterflies are widespread symbols of natural beauty that I’m going to pick them as my Most Likely to Succeed in the animal bracket.

Screenshot 2016-06-20 15.37.46

What do you predict?

I think the other two interesting brackets are probably sports and food. Which ones are you picking?


Artificial intelligence in the press and in history!

16 Jun

Over on CrowdFlower’s blog, I’ve got two posts on artificial intelligence.

  • How does the media cover AI?
    • In which I look at 2,000 articles over the last year and a half and talk about the major themes.
  • An AI Springtime
    • Where I take a look at whether we’re just in hyper-hype and how past hype/bust cycles have worked. Also I get to count volcano eruptions and tokens of “disappoint”.
Clusters.png

The major themes in recent press about artificial intelligence

Poetry v. not-poetry

5 Jun

I’ve been training an artificial intelligence system to write poetry, and this morning I got interested in which little pieces of syntax and semantics preoccupy poets compared to writers of other forms of written language. So I took a heap of poetry and a heap of not-poetry, pulled out the bigrams (two-word phrases) and did some statistics to see what distinguishes poetic writing from non-poetic writing.

Poets are preoccupied by these phrases:

  • Metaphor:
    • like a
    • like the
  • Nature:
    • the sky
    • the sun
    • the wind
    • the moon
    • the earth
    • the dark
    • the river
    • the sea
    • the snow
    • a stone
    • the water
    • the air
  • Self and others:
    • said geryon (this is because there’s a fair amount of Anne Carson in the data and she has a whole book about a character named Geryon)
    • the world 
    • my mother
    • the dead
    • my heart
    • of your
  • Space/time/prepositional phrases:
    • the present
    • in the
    • on my
    • in your
    • in its
    • under the
    • from the
    • in a

Meanwhile, they steer clear of the following phrases, which seem to be better for writing letters, fiction, essays or other kinds of non-fiction. The point of this comparison set was not so much to compare poetry to any particular other genre, but to collect a variety of “not-poetry” to see what poets tend not to use:

  • she had
  • he had
  • had been
  • she was
  • she said
  • it was
  • did not
  • going to
  • that he
  • that she
  • i had
  • there was
  • to her
  • seemed to
  • to do
  • he was
  • (okay, I’m going to stop there)

In other words, poets–or at least these poets–don’t tend to write much in the past tense. They do, however, orient things spatially, as evidenced by all those prepositional phrases. Not surprisingly, the poetry is a lot more personal, with many more I/my/we/you/each other. And of course “like a” and “like the” are prominent, since it’s hard to resist a metaphor.

Notice that there are a lot of definite articles used in poetry. I think that’s mostly in service of talking about nature and poetical things, though there are twice as many “the” phrases in the poetry camp as in the non-poetry camp. Those that are non-poetic and reasonably frequent in both lists are “the most,” “the hospital,” “the baby,” “the american,” “the time,” “the fact,” and “the country.”

If you’re curious about some major phrases where there is no difference, let me give you a sample of those: “and a,” “against the,” “as it,” and “you do” are all examples of phrases that are even in usage between poetic and non-poetic writing.

Methods and data

I took 75,678 lines of poetry (537,711 words) and compared them to 15,000 paragraphs of fiction and non-fiction (534,723 words). If you’re curious about methodology, you can read more about it here and here.
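
To give a flavor of what this kind of comparison can look like, here is a minimal Python sketch. It is not the exact method used for the lists above: the statistic here (a smoothed log-odds ratio), the function names, and the two toy snippets of text are all assumptions chosen for illustration. The idea is simply to count bigrams in each heap and score each one by how much more typical it is of one heap than the other.

```python
import math
import re
from collections import Counter


def bigrams(text: str) -> Counter:
    """Count adjacent word pairs in lowercased text."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(zip(words, words[1:]))


def log_odds(poetry: Counter, prose: Counter, smoothing: float = 0.5) -> dict:
    """Smoothed log-odds ratio per bigram.

    Positive scores lean toward poetry, negative toward not-poetry.
    The smoothing constant is an assumption; it keeps unseen bigrams
    from producing a division by zero or log(0).
    """
    total_poetry = sum(poetry.values())
    total_prose = sum(prose.values())
    scores = {}
    for gram in set(poetry) | set(prose):
        odds_poetry = (poetry[gram] + smoothing) / (total_poetry - poetry[gram] + smoothing)
        odds_prose = (prose[gram] + smoothing) / (total_prose - prose[gram] + smoothing)
        scores[gram] = math.log(odds_poetry) - math.log(odds_prose)
    return scores


# Toy stand-ins for the two heaps (not the real data).
poetry_text = "the sky like a stone under the moon like a river in the dark"
prose_text = "she had been there and he had said it was going to rain"

scores = log_odds(bigrams(poetry_text), bigrams(prose_text))
for gram, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:3]:
    print(" ".join(gram), round(score, 2))
```

With real data you would also want a frequency cutoff, since rare bigrams get extreme scores from very little evidence. Scores near zero correspond to phrases that are used about evenly in both kinds of writing.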

The poetry sample is 37 texts from 35 authors. By word count, the top authors in the data here are Lorine Niedecker (12.8%), Wisława Szymborska (8.7%), Jane Shore (6.1%), and Anne Carson (5.5%). Szymborska wrote in Polish but here I’ve included her in English–so you may want to say I’ve included her and/or the translators, Stanislaw Baranczak and Clare Cavanagh.

The non-poetry is randomly sampled “lines” (paragraphs) from 41 texts by 32 authors. The biggest amounts come from Joan Didion (12%), Virginia Woolf (7.7%), Penelope Fitzgerald (6.7%), Rainbow Rowell (5.3%), and Louisa May Alcott (5.2%).

What about women who aren’t white?

The authors in the data are all white women. You will be shocked–shocked!–to hear that it’s harder to get collections of poetry by non-white poets who are women. I currently only have eight poetry collections that fit. You’ll grant me that it would be strange to include Phillis Wheatley who was writing in the 1700s with a bunch of much-more modern writers. But if you think I should go ahead and add in people like Claudia Rankine, Nikki Giovanni, and Maya Angelou, I’m certainly open to that critique.

There’s a lot more data available for non-white novelists who are women, so that’s probably the next step. EXCEPT that one wants to be careful about what they’re lumping together. So I’m unlikely to compare novelists in terms of race unless I have A LOT more data.

That said, diving into the phrases that preoccupy, say, Toni Morrison or Octavia E. Butler compared to other people (or each other!) has some appeal. If you have particular interests or suggestions, I’d be very happy to get them.