Tuesday, June 28, 2016

Presidential Words: A lexical analysis of Trump and Sanders

By Salvatore Callesano, Axel Bohmann, Erica Brozovsky, Noli Chew, Lars Hinrichs, Kirsten Meemann, and Patrick Schultz

1 Introduction
Donald Trump and Bernie Sanders, two of the 2016 presidential candidates, differ drastically when it comes to their political views. However, their dialects are quite similar and this is unsurprising as they both come from New York City. Our previous analysis of Trump's and Sanders' accents illustrated this similarity. Nevertheless, one important difference between the two candidates was found. Trump is a linguistic performer who adapts his speech to and away from New York City English depending on who he is speaking to and in which type of context (e.g. debates, speeches, and interviews). Sanders, however, is steadfast in his speech.

In the current analysis, we turn our attention away from the phonetic details of Trump’s and Sanders’ accents and towards their lexicon, that is, the words they use. We are interested in whether Donald Trump and Bernie Sanders use the same words to express their different standpoints during their political speeches, debates, and interviews. Interest in the rhetoric employed by the two candidates has been increasing throughout the current race for the presidency. For example, researchers at the University of Pennsylvania have been looking at Trump’s speech, specifically his lexicon, in some detail. Additionally, popular news outlets have begun to take an interest in Trump’s way of speaking. Below we organize our lexical analysis by semantic fields. That is, we group together words that belong to a similar semantic category, which can range from a topic or theme (e.g. immigration) to a specific pragmatic unit (e.g. negation). The categories analyzed below are pronouns, negation, adverbs, and groups of people.

2 Methods of Analysis
In order to analyze the frequency of the two candidates’ words within our semantic domains of interest, we collected video clips of debates, speeches, and interviews from YouTube. Fourteen video clips were gathered for each candidate. Each video was transcribed, then processed and analyzed using Python scripts. Words of interest, including first person pronouns, negation words, certain adverbs, and pairs of positive and negative terms for marginalized peoples, were averaged for each speaker. The averages were normalized so that the two speakers could be compared. Sentences containing these words of interest were compiled and qualitatively analyzed.
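The counting and normalization step described above can be sketched roughly as follows. This is a minimal illustration, not the scripts used in the study: the word list, the tokenizer, and the choice of "occurrences per 1,000 words" as the normalization are all assumptions made for the example.

```python
import re
from collections import Counter

# Hypothetical subset of the words of interest; the study's list is longer.
WORDS_OF_INTEREST = ["i", "me", "my", "we", "not", "never", "honestly"]


def tokenize(text):
    """Lowercase a transcript and split it into word tokens, keeping contractions whole."""
    text = text.lower().replace("’", "'")  # treat curly and straight apostrophes alike
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text)


def rates_per_thousand(transcripts, words=WORDS_OF_INTEREST):
    """Return each word's frequency per 1,000 tokens, pooled over one speaker's transcripts."""
    tokens = [tok for transcript in transcripts for tok in tokenize(transcript)]
    total = len(tokens)
    if total == 0:
        return {w: 0.0 for w in words}
    counts = Counter(tokens)
    return {w: 1000 * counts[w] / total for w in words}
```

Dividing by each speaker's total word count is what makes the two speakers comparable, since one candidate may simply have talked more in the collected clips.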

3 Results
In the following chart, usage of each selected word is presented as a percentage. Trump is represented in red and Sanders in blue. For example, the chart shows that across all 28 video clips Sanders accounts for 76% of the occurrences of the word ‘war’ and Trump for the remaining 24%. In other words, because our data show proportions normalized by each candidate’s total word count, Sanders uses the word ‘war’ more often than Trump in his speeches, debates, and interviews.
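One way to arrive at percentages of this kind is to split 100% between the two speakers in proportion to their normalized rates. The sketch below assumes that reading; the function name and the input rates are hypothetical, and the rates are chosen only so that the result matches the 76/24 split reported for ‘war’.

```python
def share_of_use(rate_a, rate_b):
    """Split 100% between two speakers in proportion to their normalized rates."""
    total = rate_a + rate_b
    if total == 0:
        return None  # neither speaker used the word
    share_a = 100 * rate_a / total
    return share_a, 100 - share_a


# Invented rates per 1,000 words, for illustration only.
sanders_share, trump_share = share_of_use(1.9, 0.6)
print(round(sanders_share), round(trump_share))
```

A share above 50% therefore means that the speaker uses the word at a higher rate, not necessarily a higher raw count.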

The following four sections present this type of analysis based on semantic fields. Each is accompanied by a qualitative analysis of the transcripts.

4 Pronouns
When it comes to the candidates’ uses of pronouns in our data, we see an interesting split. Considering the set of first person pronominal forms (myself, mine, me, my, and I), Trump uses all of these words more frequently than Sanders, except for the possessive pronoun ‘my’. Trump also accounts for 100% of the uses of the word ‘mine’.

As for words like ‘they’, ‘us’, and ‘you’, we see again that in Trump’s speeches, debates, and interviews, these words are more frequent than for Sanders. However, for ‘we’, Sanders’ data accounts for about 55% (N = 235) of the occurrences.

Our qualitative analysis of the transcripts helps to interpret these data. While Trump’s high percentage of ‘you’ may stem in part from his frequent use of the phrase ‘you know’, our data show that Trump’s use of pronouns reflects a strong ‘us’ versus ‘them’ narrative. ‘They’ in Trump’s rhetoric typically refers to other countries, their governments, and their citizens, while ‘us’ means Americans. This contrasts with how Sanders uses his ‘us’ versus ‘them’ narrative; for him ‘us’ refers to Americans, but not including those associated with Wall Street. The same goes for his use of ‘we’. ‘They’ for Sanders refers to Wall Street, big banks, and occasionally other countries. Overall, Trump’s speeches, debates, and interviews are more egocentric than those of Sanders. Trump’s frequent and repetitive use of ‘I’ (N > 2000) shows that he often speaks of his personal accomplishments, while Sanders seems to include himself in his discussions of middle America.
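To check how much of a speaker's ‘you’ count comes from the phrase ‘you know’, one could count the two separately. The following is a rough sketch under that assumption, not part of the original analysis; the function name is hypothetical.

```python
import re


def count_you(text):
    """Count 'you' overall and the subset that is immediately followed by 'know'."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower().replace("’", "'"))
    total = sum(1 for tok in tokens if tok == "you")
    in_phrase = sum(
        1 for a, b in zip(tokens, tokens[1:]) if a == "you" and b == "know"
    )
    return {"you": total, "you know": in_phrase, "other you": total - in_phrase}
```

This is a crude filter: it also catches literal uses such as "you know what you did", so the ‘you know’ count is an upper bound on the discourse-marker use.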

5 Negation
Next we consider words that mark negation (nothing, never, no, not) as well as auxiliary verbs that are contracted with a negative word (couldn’t, cannot, can’t, won’t, don’t). Trump and Sanders each provide 50% of the uses of the word ‘no’. Trump’s data, however, accounts for 100% (N = 15) of the occurrences of ‘won’t’. This computational analysis shows that, with the exception of the words ‘cannot’ and ‘not’, Trump’s overall use of negation in his discourses is much higher than Sanders’. So are Trump’s speeches, debates, and interviews more negative than Sanders’? We cannot say for certain yet. However, our data suggest that the topics Trump discusses might lean in a negative direction (e.g. who’s not going to cross the border, we [the U.S.] won’t win, etc.).

6 Adverbs
Considering the following three adverbs, honestly, frankly, and actually, we want to point out that Trump’s data reflects the majority of uses, showing at least 70% for each word. This is interesting to consider because these words are used to reaffirm the validity of what one is saying. It may be the case that Sanders does not feel as much of a need to consistently attempt to legitimize his discourse. Moreover, this finding fits the narrative we found in our previous analysis of Trump and Sanders, where Trump adapts his speech as a performance. Adverbs like ‘honestly’, ‘frankly’, and ‘actually’ might be helping Trump in his political performances because they give him time to pause and pull in his audience with a sense of validation.

7 Groups of People

The last semantic field that we will address here is, in a broad sense, references to groups of people that are commonly mentioned in the 2016 political race. These are: ‘immigrants/illegals’, ‘muslims’, ‘hispanic/latino’, and ‘friends’.

We separate the group ‘immigrants/illegals’ into four words, namely ‘immigrants’, ‘immigrant’, ‘illegals’, and ‘illegal’, because depending on the context they can have different referents. Interestingly, when it comes to the globally encompassing plural versions ‘immigrants’ and ‘illegals’, Trump’s data accounts for 100% (N = 3 for ‘illegals’ and N = 5 for ‘immigrants’) of the use. That is, he often refers to large groups of people that he calls ‘immigrants’ and ‘illegals’, especially in comparison to Sanders. As for the singular versions, Trump uses the word ‘immigrant’ more frequently, but Sanders uses the term ‘illegal’ more than Trump. We note that our analysis does not distinguish between the different uses of ‘illegal’, so we cannot say whether Sanders is referring to “an illegal” or to something that is “illegal”.

The next group is ‘muslims’. We only included the plural version here because in our data we found no instances of singular ‘muslim’. This time, which might be counterintuitive given Trump’s usual rhetoric, Bernie Sanders’ data accounts for 100% of the instances of ‘muslims’. Our study has yet to analyze the surrounding contexts of these words, so we cannot yet say empirically what Sanders is referring to with his high rates of the word ‘muslims’, although one can imagine given his general political views.
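Examining the surrounding contexts could start from a simple keyword-in-context listing. The sketch below is one hypothetical way to produce it and is not the study's code; the function name, the window size, and the sample sentence in the usage lines are all invented for illustration.

```python
import re


def concordance(text, keyword, width=5):
    """Return (left context, keyword, right context) for each occurrence of keyword."""
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower().replace("’", "'"))
    hits = []
    for i, tok in enumerate(tokens):
        if tok == keyword:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, tok, right))
    return hits


# Invented sample sentence, not taken from the transcripts.
for left, kw, right in concordance("We must stand with Muslims in this country.", "muslims", width=2):
    print(f"{left} [{kw}] {right}")
```

The same listing would also help separate “an illegal” from adjectival uses of ‘illegal’, since the word immediately to the left and right is visible for each hit.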

Related to the topic of ‘immigrants/illegals’, ‘hispanic/latino’ is our next group of interest in this study. Here we find a very clearly defined split between the two candidates. In our data, Trump only uses the terms ‘hispanics’ or ‘hispanic’, while Sanders only uses the terms ‘latinos’ or ‘latino’. This type of split has been noted in academic research; Linda Martín Alcoff studies the link between ethnic names and political movements and writes “would-be political leaders have long known that one’s choice between these terms can signal one’s political views about assimilation, cultural nationalism, and the relative importance of race” (2005, 397). She notes that use of the term ‘hispanic’ denotes a right-leaning politician, while ‘latino’ marks those towards the political left. This is clearly seen in our data, and Alcoff notes the same terminological split between George W. Bush and Al Gore in the 2000 presidential election. Despite such research, there is no clear answer as to what exactly these terms mean individually. Popular accounts often hold that ‘latino’ refers to North, Central, and South Americans whose languages derive from Latin (including Brazilians but excluding Spaniards) and ‘hispanic’ to people from places colonized by Spain (excluding Brazil). Additionally, these terms seem to be used differently in different geographic regions; most of the U.S. South, for example, prefers the term ‘hispanic’. We note that this seems to correlate with the notion that the terms are strongly tied to specific political ideologies, as the southern regions of the U.S. tend to have more conservative political views.

Lastly, our analysis provided an interesting result with regard to the terms ‘friends’ and ‘friend’. We note that Trump’s data accounts for 85% (N = 15) of the singular version ‘friend’, while Sanders’ accounts for 58% (N = 8) of the plural counterpart ‘friends’. Our qualitative analysis of the transcripts shows that Trump makes a point of telling his audience that most of the people he speaks about are his ‘friend’. Sanders, albeit likely sarcastically, uses the plural ‘friends’ to refer to his “conservative friends” on the other side of the presidential race.

8 Summary of Findings
  • Trump and Sanders use the relationship between the pronouns ‘us’ and ‘them’ differently. Trump creates a separatist division between ‘us’ (i.e. America) and ‘them’ (anything non-American). Trump’s data also accounts for a high percentage of the first person singular pronoun ‘I’, suggesting an overall egocentric narrative. Sanders, however, is more self-inclusive in his use of ‘us’, yet at the same time it seems that his notion of ‘us’ and ‘America’ does not include Wall Street.
  • Overall, Trump’s political debates, speeches, and interviews contain more occurrences of negation words.
  • Trump uses more truth-verifying adverbs, such as ‘actually’, ‘honestly’, and ‘frankly’ as compared to Sanders.
  • Trump and Sanders refer to Hispanics/Latinos with distinct terminology. Trump categorically uses ‘hispanic(s)’, while Sanders categorically uses ‘latino(s)’.
  • Trump’s discourses account for 100% of the occurrences of the plural terms ‘illegals’ and ‘immigrants’. While not always the majority, Sanders’ data does show uses of the singular terms ‘illegal’ and ‘immigrant’.

9 References

Alcoff, L. M. 2005. Latino vs. Hispanic: The politics of ethnic names. Philosophy & Social Criticism, 31(4), 395-407.

