Archive | August, 2016

Paul Ryan dislikes Trump almost as much as Cruz does: On (not) naming names at the conventions

3 Aug

Donald Trump isn’t “there yet” in supporting Republican Speaker of the House Paul Ryan. Ryan has repeatedly chastised Trump and famously said he wasn’t there yet on Trump earlier this year. How strong is his distaste? Did he (and his writers) overcome it at the RNC Convention?

Let’s examine this question by looking at how people refer to other people (I’ll be using the term ‘referents’ and building on my previous post about the most DNC-y and RNC-y words).


What children and spouses do and don’t do

If Paul Ryan showed distaste for Trump in his choice of referents, then we might expect those choices to look pretty different from the choices of someone who loves or likes the person they are talking about. Families have all sorts of interesting dynamics, but let’s begin by assuming that spouses and children speaking at a major convention will perform love. So let’s start with them. I’ve described all the referents in sections below, but they can be summarized this way:

  • Children don’t refer to their parents by their first name or their last name only
  • Spouses don’t refer to their partners by last name only (but obviously first name only is fine)
  • Full names are available for children (and spouses) to use for rhetorical effect
  • Basically all your pronouns refer to the nominee who is your relative
  • You don’t refer to the other candidate at all

What does the average DNC/RNC person do with first/last names?

Let’s look at the RNC/DNC speeches that aren’t given by family members (or the nominees).

Across 51 RNC speeches, there are 160 Donald Trump‘s, 4 Donald J. Trump‘s. There are 4 Mr. Trump‘s, 1 Fred Trump, 2 President Trump‘s; there are 22 other Trump uses and 30 other Donald’s. Across 42 non-Clinton DNC speeches, there are 145 Hillary Clinton‘s, 10 other Clinton‘s, and 130 other Hillary‘s. In other words…there’s a lot more naming of Hillary Clinton at the DNC.

There are six RNC speakers who don’t use Donald or Trump at all: Cotton, Ernst, Kirk, Mukasey, Perry, and Sullivan. Across 42 DNC speakers, only one of them doesn’t use Hillary or Clinton: Kareem Abdul-Jabbar. In a moment, we’ll tackle whether counts of names are a good measure. Before I do that, let me propose something slightly more complicated than simple counts.
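Here is a rough sketch of how counts like these could be tallied. It is not the code behind this post: the helper name `count_name_forms` is hypothetical, and it assumes each speech is available as plain text. Full-name mentions are matched first and removed, so that a "Donald Trump" is not also counted as a bare "Donald" or a bare "Trump".

```python
import re
from collections import Counter


def count_name_forms(text, first, last):
    """Count full-name, first-name-only, and last-name-only mentions.

    Hypothetical helper (not from the post). Full names are matched first
    and removed, so every mention is counted in exactly one bucket.
    """
    counts = Counter()
    # Full name, optionally with a middle initial (e.g. "Donald J. Trump").
    full = re.compile(
        rf"\b{re.escape(first)}(?:\s+[A-Z]\.)?\s+{re.escape(last)}\b"
    )
    counts["full"] = len(full.findall(text))
    # Blank out the full-name matches before counting the single-name forms.
    remainder = full.sub(" ", text)
    counts["first_only"] = len(re.findall(rf"\b{re.escape(first)}\b", remainder))
    counts["last_only"] = len(re.findall(rf"\b{re.escape(last)}\b", remainder))
    return counts
```

Note that the last-name bucket lumps together things like Mr. Trump, President Trump, and Trump Winery; splitting those out would need extra patterns or a manual pass.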

How could we measure naming preferences?

Different speakers have different areas of focus and they have different word counts, so at a minimum we should take those into consideration. We’re on pretty firm footing, it would seem, to say that the convention at a convention is to mention the nominee. So the longer a speech goes without saying their name, the odder it is.

I’m interested in whether tf-idf can be used to quantify this oddness. Normally, tf-idf (“term frequency-inverse document frequency”) is used for things like search retrieval–if you’re searching for a term across a bunch of documents, tf-idf is one way to figure out that you should show Document X but not Document Y. It’s also a way of quantifying “aboutness”.

Across the RNC speakers who aren’t Trumps, the average tf-idf of Donald is 0.055 (median 0.045). The average for Trump is 0.052 (median 0.051). The average tf-idf at the DNC for Hillary is 0.077 (median 0.071), the average tf-idf there for Clinton is 0.042 (median 0.037).
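For readers who want to see the mechanics, here is a minimal sketch of one common tf-idf variant: term frequency as a share of the speech’s words, multiplied by the log of (number of speeches / number of speeches containing the term). The exact weighting and normalization behind the numbers in this post aren’t spelled out here, so treat this as an illustration that won’t reproduce those scores; the function names are assumptions.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase and keep simple word tokens (apostrophes kept for contractions).
    return re.findall(r"[a-z']+", text.lower())


def tf_idf(speeches):
    """speeches: dict of speaker -> text. Returns speaker -> {term: score}.

    Assumed variant: tf = count / speech length, idf = log(N / doc frequency).
    """
    tokens = {speaker: tokenize(text) for speaker, text in speeches.items()}
    n_docs = len(tokens)
    # Document frequency: in how many speeches does each term appear?
    doc_freq = Counter()
    for words in tokens.values():
        doc_freq.update(set(words))
    scores = {}
    for speaker, words in tokens.items():
        counts = Counter(words)
        total = len(words)
        scores[speaker] = {
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return scores
```

With this variant, a speaker who never says a name simply has no entry for it (effectively a score of zero), and a word that appears in every speech scores zero for everyone.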

Sniff test #1: Do the biggest scores pick out big supporters?

Let’s look at who has a high tf-idf score for the nominee names:

  • Hillary
    • Alison Lundergan Grimes (0.20 tf-idf; 8 Hillary‘s, 2 Hillary Clinton‘s, 1 President Clinton (Bill, not counted here))
    • Gabby Giffords (0.16 tf-idf; 3 Hillary‘s)
    • Ryan Moore (0.15 tf-idf; 6 Hillary‘s, 2 Hillary Clinton‘s)
  • Donald
    • Dana White (0.20 tf-idf; 5 Donald Trump‘s, 4 Donald‘s)
    • Rick Scott (0.14 tf-idf; 5 Donald Trump‘s, 1 Donald)
    • Laura Ingraham (0.13 tf-idf; 6 Donald Trump‘s)
  • Clinton
    • Joe Sweeney (0.11 tf-idf; 2 Hillary Clinton‘s, 2 Secretary Clinton‘s, 1 Hillary)
    • Tom Harkin (0.11 tf-idf; 6 Hillary Clinton‘s)
    • O’Malley (0.11 tf-idf; 7 Hillary Clinton‘s)
  • Trump
    • Kerry Woolard (0.14 tf-idf; 5 Donald Trump‘s, 3 the Trumps (about the family, not counted here), 2 Trump Winery (counted))
    • Laura Ingraham (0.13 tf-idf for this term, too, see above)
    • Harold Hamm (0.13 tf-idf; 4 Donald Trump‘s, 2 President Trump‘s)

If tf-idf of the nominees’ names is important, then these people should be among the biggest proponents. Is that true?

Well, Alison Lundergan Grimes is described as a “close family friend“. Gabby Giffords had a very short speech since it’s hard for her to speak but she has endorsed Clinton since January 10th. Ryan Moore met Hillary Clinton when he was seven years old.

The people who have high tf-idf scores for Clinton are less clear to me. Retired Iowa Senator Tom Harkin endorsed Hillary Clinton in August of 2015, which is an early endorsement. Martin O’Malley ran against Clinton and hasn’t always had nice things to say, but in his speech he includes her passionately from start to end. Finally, I’m not quite as sure what to do with NYPD detective Joe Sweeney–is he an ardent, long-term Clinton supporter? That’s a real question. But in the meantime, every mention of the Democratic nominee in his speech is glowing.

Trump has been good to Dana White, president of the UFC, over the years. Rick Scott only endorsed Trump after he had already won the Florida primaries, though in the speech he goes out of his way to say Trump is a friend.

Laura Ingraham has supported Trump in various ways for a while and her speech seems pretty full bodied. Kerry Woolard is the General Manager of Trump Winery, so whatever she feels, she has a big incentive to perform support. Harold Hamm is a potential cabinet pick for Trump and very pro-Trump.

Not surprisingly, mentions of the first name alone suggest familiarity. There are various ways we might discount or weight the tf-idf scores for first name vs. last name vs. full name. But for now, the top tf-idf ones do seem to be performing strong support.

Sniff test #2: What about no-naming?

Kareem Abdul-Jabbar never mentions the Democratic nominee in his prepared remarks. Instead, he focuses on Donald Trump and Mike Pence as being deeply problematic because of their penchants for discrimination. I’m not sure what Abdul-Jabbar’s beliefs are, of course, but it does seem plain that in his speech he is more anti-Trump than particularly pro-Clinton. The words with the highest tf-idf scores for him are Jefferson, Khan, tyranny, and discrimination.

The RNC has a lot more no-namers. And they have longer (more problematic) histories with Trump.

  • Joni Ernst’s highest tf-idf words include Iowa, my, our, country, and failed
  • Tom Cotton’s highest tf-idf words include wishes, infantryman, Army, punishing, volunteered, and peace
  • Charlie Kirk’s highest tf-idf words include campuses, democrats, party, and youngest
  • Michael Mukasey’s highest tf-idf words include falsely, she, emails, law, and hacked
  • Rick Perry’s highest tf-idf words include battles, Texas, and veterans
  • Dan Sullivan’s highest tf-idf words include Senate, we, and United States

Do these people dislike Trump? I like this headline: “Rick Perry Gives Speech At Trump-Centered Convention, Pretends Donald Trump Doesn’t Exist“. Ernst declined to be considered for Vice President and has tended either to avoid talking about Trump or to say less-than-great things about him. Tom Cotton has spoken out against Trump’s Muslim ban. Sullivan, like most of the Alaska delegation, was more about supporting the nominee than Trump specifically.

Finally, in May of this year, Charlie Kirk posted an article called “I saw Trump coming, and I chose to ignore it”. I can’t tell you what’s in that article, though, because he’s removed it. But that’s the syntax of a critic not a supporter.

In other words, failure to mention the nominee’s name minimally means the speaker is concentrating on something other than the nominee. I think it is also reasonable to say that people who don’t really support the nominee much will avoid mentioning them.

What does Paul Ryan do?

Paul Ryan refers to Donald Trump as Donald Trump two times at the RNC convention:

  • In the opening of his speech: “But you’ll find me right there on the rostrum with Vice President Mike Pence and President Donald Trump.”

  • And in the middle: “Only with Donald Trump and Mike Pence do we have a chance at a better way.”

(Ponder-point: how important is it that Donald Trump’s mentions link him to Mike Pence, the “stable” one?)

This is basically even with the amount Ryan mentions the rival nominee. He says Hillary Clinton once, Hillary twice and Clinton twice (“another Clinton”, “The Clinton years”).

Ryan has zero pronouns referring to either Trump or Clinton.

Is this surprising? Ryan’s speech has 1,609 words, making it the tenth longest at the RNC this year (in order: Trump, Pence, Barrack, Ivanka, Gingrich, Geist and Tiegen, Cruz, Christie, Eric, then Ryan).

Only a few people have tf-idf scores for Donald that are lower than Ryan’s: the six people who never use his name at all as well as the duo of Mark Geist/John Tiegen and…Ted Cruz.

There are seven people who don’t use Trump at all (the six above plus Jason Beardsley). Again, only Geist/Tiegen and Ted Cruz have lower tf-idf scores. Geist and Tiegen just say Donald Trump once and Ted Cruz also just says it once:

  • Geist & Tiegen: Someone who will have our backs. Someone who will bring our guys home. Someone who will lead with strength and integrity. That someone is Donald Trump.

  • Ted Cruz: I want to congratulate Donald Trump on winning the nomination last night.

The Geist & Tiegen mention doesn’t really look like a snub so much as they are talking about other stuff (their highest tf-idf words include Glen, arm, and Tripoli…they also have 4 rhetorical uses of someone that are really referring to Trump).

The Ted Cruz mention is…well, if you’re reading this you probably already have read about his non-endorsement and know he’s a very Big Anti-fan of Trump. And Paul Ryan’s statistics are very close to his.


You’re going to tell me about Bernie, right?

Sure, okay. Bernie’s speech at the DNC has 14 uses of Hillary Clinton and one Secretary Clinton. That means that Bernie’s tf-idf scores for Hillary are in the top quartile in the DNC and the tf-idf scores for Clinton are just under the median for the DNC speakers.

He’s largely seen as giving a complete and total endorsement of Clinton in his speech, so these numbers check out.

Hey, I want the gory details about the family members

Chelsea, Ivanka, and Eric

If you’re Chelsea, you don’t call your mom “Hillary” or “Clinton” or “Hillary Clinton”. It’d be a bit weird, right? All of her uses of pronouns refer to her mom.

  • The pronoun she is fine (31 uses)
  • So is her (16 uses)
  • my mom
  • my own mother, 1 My wonderful, thoughtful, hilarious mother, 1 a mother (about Hillary) and 5 my mother

Chelsea’s two uses of he are about her son Aidan–there’s no mention of Trump in her introduction of her mom.

Ivanka does something different. Like Chelsea, she never refers to her parent by just his first name or just his last name. But for rhetorical effect, she does refer to Donald Trump seven times and Donald J. Trump once. (If you want to track everything: Ivanka refers to Trump Tower twice and Trump Organization once.) Like Chelsea, all of Ivanka’s pronoun uses are about her parent-the-nominee:

  • Ivanka has 37 uses of he
  • She has 27 uses of his
  • And she has 12 uses of him
  • 18 my father‘s

Ivanka also has 1 her (re: America) and 1 she (re: a woman becoming a mother).

Eric uses Trump twice–once in reference to The Eric Trump Foundation and once to say he has never “been more proud to be a Trump”. All but one of Eric’s pronouns refer to his dad:

  • Eric has 25 uses of he 
  • 18 his
  • him
  • my dad and 1 Dad
  • 19 my father’s

The only instance that doesn’t refer to his dad is also the only time he uses a female pronoun: “the veteran tuning into this speech from his or her hospital bed”.

Bill and Melania

Spouses have different rights and requirements. Bill Clinton does refer to his wife as Hillary (25 times).

  • She (134 times in that form, 9 she’s, 3 she’ll and 1 she’d)
  • Her (67 times and 2 hers)
  • (No uses of wife)

He also has one use of their shared last name: “after Hillary testified before the education committee and the chairman, a plainspoken farmer, said looks to me like we elected the wrong Clinton”.

Bill Clinton never refers to Trump by name or by pronoun. Bill Clinton’s he‘s are about Gingrich (1), Obama (2), Don Jones (3), an imaginary son for checking on a racist school (1), Franklin Garcia (1), a Law School classmate (1). His him‘s are about a segregationist (1), DeLay (1), Rangel (1). The three his in Bill Clinton’s speech are about Obama (2) and Don Jones (1).

Like Ivanka, Melania refers to Donald Trump by his full name several times. She uses Donald J. Trump twice and Donald Trump once. She refers to the Trump family and a Trump contest (that refers to the campaign to November). She refers to Donald 14 other times. All her pronouns are about him (Melania doesn’t use any female pronouns in her speech):

  • 19 he 
  • 12 his 
  • him
  • my husband‘s

I guess I should wrap up with a note here to reaffirm that speakers have lots of help writing their speeches. That makes it a bit more likely that they will fit certain conventions (e.g., to make a good First Lady speech, copy from a former First Lady speech). This post has been about getting a handle on those hidden rules and seeing how far people deviate from them. We don’t have access to people’s hearts–but we do have access to what they write and say and can compare it to others.


Failed vs. fighting: the linguistic differences between speeches at the RNC and the DNC conventions

1 Aug

We know that Republicans and Democrats talk differently, but what’s the best way to describe these differences? Commentators note the relative darkness of the Republican National Convention and the focus on optimism and higher production quality for the Democratic National Convention. Looking at the words speakers use helps–but you can’t just use simple frequency (for details, check out the methodology section at the bottom).

The major differences

I’ve listed the top 15 words most characteristic of each convention, but I believe they boil down to the difference between “failed” and “fighting”. Consider some of the words that are firmly on the Republican side: government, failed, and politicians.

Failure is, of course, negative. You could consider government and politicians to just be neutral nouns, but that’s not how they are used by Republicans. And it seems that Democratic speakers are sensitive to this since they tend to avoid these words–there were 61 uses of government at the RNC but only 10 at the DNC. There were 19 occurrences of politicians at the RNC but only one at the DNC (that’s Obama and he’s using it negatively, too). But there are two ways of looking at this: the Republicans have seized them for negative purposes…and the Democrats have ceded them to anti-government/anti-politics forces on the right.

Meanwhile, fighting emerges as an important Democratic word. It’s not that the count is huge here (40 DNC uses versus the Republicans’ 6) but that it is indicative of a dominant framing. The past tense of this word is also characteristic of the DNC–Hillary Clinton and others are described by the battles they have fought (34 uses at the DNC, 5 at the RNC). This is a contrastive analysis, so it’s worth pointing out that Hillary Clinton has a longer career in politics than Donald Trump and his past work isn’t usually described in terms of battles, with perhaps one major exception.

For the Democrats, joint action also seems to be highly relevant–voting for instance or together, which is used regularly without being in the campaign motto, stronger together (stronger is also more of a DNC word than an RNC word–it scores 0.64 in Democratic relevance). Together was used 95 times at the DNC, only 27 times at the RNC. That said, the phrase stronger together only occurred 13 times at the DNC, once at the RNC. The great again part of Trump’s campaign motto occurred 24 times at the RNC, 6 times at the DNC.

Other differences include:

  • Republicans often invoke Benghazi and terrorism (Benghazi, enemies, Islamic, terrorism, radical)
  • Republicans frame immigration/terror in terms of borders
  • Children of nominees don’t usually have as prominent a role as they did at the RNC: father occurs 56 times in 2016 RNC speeches. That said, it’s the Democrats who focus on kids (61 uses at the DNC, 14 at the RNC–child and children also skew strongly Democratic)
  • Among the issues that Democrats focus on are health care/insurance, issues of social justice and gun control, which barely misses being in the infographic
  • The Democrats tend to be more colloquial–the conventions are about equal in their use of we, but the Democrats are much more likely to use the contractions we’ve and we’re as well as she’s and that’s. The Democrats also use got (usually got to) while the Republicans use have (though the strongest phrase for them is have been).
  • Ben Zimmer points out that researchers have found that the word “the” correlates with “high psychological distance”. It may also correlate with more older male speakers. Check out the Language Log for some other thoughts.


Going beyond words

I could also give you the two-word phrases that pop out, but let me just summarize those.

For the Republicans:

  • Trump will
  • my father
  • our enemies
  • Donald Trump
  • he will
  • American Dream
  • no longer
  • who will
  • great again

For Democrats:

  • she knows
  • fighting for
  • each other
  • she was
  • that’s why
  • First Lady
  • health care
  • when she
  • with her
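Two-word phrases can be counted the same way as single words and then run through the same contrastive scoring. A minimal sketch, assuming plain-text speeches (the function name is hypothetical, and this naive version lets pairs cross sentence boundaries):

```python
import re
from collections import Counter


def bigram_counts(text):
    """Count adjacent word pairs in a lowercased text.

    Hypothetical helper: pairs are formed across sentence boundaries too,
    which a more careful version would avoid by splitting sentences first.
    """
    words = re.findall(r"[a-z’']+", text.lower())
    return Counter(zip(words, words[1:]))
```

For example, `bigram_counts(speech_text).most_common(10)` would list the ten most frequent pairs in a single speech, where `speech_text` is whatever transcript you have on hand.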

Methodology and data

This post uses techniques described in Monroe et al (2008), a paper that pursued this as a question of methodology. In their paper, one of the prime examples was how Democrats and Republicans in Congress talk about issues like abortion.

The main point of Monroe et al is that you need to figure out some way to contrast two categories against each other and some background information–for example, Republican speeches on abortion versus Democratic speeches on abortion, with a background of everything else that Republicans and Democrats talk about. Technically speaking, I’m using weighted log-odds-ratios with informative Dirichlet priors. I call these “relevancy scores” in the infographic.
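To make that less abstract, here is a minimal sketch of the z-scored log-odds-ratio with an informative Dirichlet prior, as I understand the Monroe et al formulation: add the background counts to each side as pseudo-counts, take the difference of the log-odds, and divide by an approximate standard deviation. This is an illustration rather than the exact code behind the infographic; `prior_scale` and the 0.01 floor are assumptions of the sketch, and how the result gets rescaled into the “relevancy scores” isn’t shown.

```python
import math
from collections import Counter


def weighted_log_odds(counts_a, counts_b, prior_counts, prior_scale=1.0):
    """z-scored log-odds-ratio of each word in corpus A versus corpus B.

    counts_a, counts_b: Counters of word counts for the two categories.
    prior_counts: Counter of background counts (the Dirichlet prior).
    prior_scale: assumed knob for making the prior stronger or weaker.
    Positive scores lean toward A, negative scores toward B.
    Needs a vocabulary of at least two words.
    """
    vocab = set(counts_a) | set(counts_b) | set(prior_counts)
    # Small floor (an assumption) so words missing from the background
    # still get a nonzero pseudo-count.
    alpha = {w: prior_scale * prior_counts.get(w, 0) + 0.01 for w in vocab}
    alpha_0 = sum(alpha.values())
    n_a = sum(counts_a.values())
    n_b = sum(counts_b.values())
    scores = {}
    for w in vocab:
        y_a = counts_a.get(w, 0) + alpha[w]
        y_b = counts_b.get(w, 0) + alpha[w]
        # Log-odds of the word within each smoothed corpus.
        log_odds_a = math.log(y_a / (n_a + alpha_0 - y_a))
        log_odds_b = math.log(y_b / (n_b + alpha_0 - y_b))
        # Approximate variance of the difference.
        variance = 1.0 / y_a + 1.0 / y_b
        scores[w] = (log_odds_a - log_odds_b) / math.sqrt(variance)
    return scores
```

The prior is what keeps rare words from dominating: a word seen twice on one side and never on the other has a large raw ratio but also a large variance, so its z-score stays modest.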

For the data, I took 50 speeches from this year’s RNC (50,191 words), 45 speeches from this year’s DNC (53,994 words), and 47 speeches from the past (180,719 words). DNC speakers use more words overall and their sentences are longer, too.

For the 2016 convention speeches, I used all of the speeches that the two parties made available via Medium, but for the prime-time speeches, I made an effort to get actual transcripts (removing annotations about applause, laughter and chants from the audience).
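As a sketch of that cleanup step: I’m assuming here that audience annotations show up in parentheses or square brackets, like (APPLAUSE) or [laughter]. Transcripts vary by source, so the pattern below is an assumption to adjust rather than a record of exactly what was done for this post.

```python
import re

# Assumed annotation style: bracketed or parenthesized audience reactions.
ANNOTATION = re.compile(
    r"[\(\[]\s*(?:applause|laughter|cheers|chant\w*|audience[^\)\]]*)\s*[\)\]]",
    re.IGNORECASE,
)


def strip_annotations(transcript):
    """Remove audience-reaction annotations and tidy up the whitespace."""
    cleaned = ANNOTATION.sub(" ", transcript)
    # Collapse runs of whitespace (including newlines) left behind.
    return re.sub(r"\s+", " ", cleaned).strip()
```

Ordinary parentheticals in the speech text are left alone, since only the listed reaction words trigger a removal.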

For setting “priors”, I used these word counts plus data from the recent and not-so-recent past. The past data is made up of: (a) all nominees’ speeches back to Carter and Reagan in 1980, (b) all the spouses’ convention speeches back to 2000–except for Tipper Gore’s, let me know if you can find a transcript for that, and (c) all the public speeches recorded for every Republican/Democratic presidential candidate between Jan 2016 and the conventions, as provided by The American Presidency Project.

“Donald” is not an important word in prior conventions, so it is characteristic of the current year and the RNC, in particular (218 uses of it in the RNC, 142 in the DNC speeches). The same goes for “Hillary” (300 times in the DNC speeches, 181 in the RNC). And that’s also one of the reasons that “she” appears as a major keyword for the DNC (419 occurrences in the DNC, 142 in the RNC). Clinton is the first nominee of a major party whose pronoun of choice is “she”. But there’s not much special about “he” (234 in DNC, 288 in the RNC). Yes, “he” often refers to Donald Trump in the conventions this year, but it could also refer to all the other nominees that both parties have fielded.