My Midterm.

Me, You, Us and the Nation: Charting data from presidential inaugural speeches

The Graph.

Introduction.

This line graph shows how the use of first-person and second-person pronouns in presidential inaugural speeches has changed over the centuries. To put it more simply, do each century’s presidents speak more about “me,” “us,” or “you”? Because the data is grouped by century, there are only a few data points, but there appears to be a strong upward trend in the use of first-person plural pronouns such as “we.” This could represent a change in the presidential mindset over time. Is governing a collaborative process? Who is considered “part of the nation”?

Sources.

For this project, I used data from Project Gutenberg. In class, we were provided with a txt document containing all speeches from 1789 to 2005. Both of President Obama’s inaugural speeches, as well as President Trump’s, are available in separate Project Gutenberg files. For President Biden’s inaugural speech, I used the official White House website.

Processes.

This process took a lot more steps than it might seem. First, I had to separate each century’s worth of speeches into different txt files. I also had to “clean” the data, meaning I removed parts such as the dates, headings and transcription notes. I needed an accurate word count for each century so that the percentages of pronoun use would be correct. Since there are so few speeches in the 18th and 21st centuries, raw counts that weren’t adjusted for length would otherwise be quite meaningless.
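I did this cleaning by hand, but the same idea can be sketched in a few lines of Python. This is only an illustration, not what I actually ran. The function name `clean_speech_text` and the patterns it uses (bracketed notes, all-caps headings, and date lines written like “Thursday, April 30, 1789”) are assumptions about how the files might be formatted.

```python
import re

# Hypothetical patterns: the real files may format dates, headings,
# and transcription notes differently.
DATE_LINE = re.compile(r"^(?:[A-Z][a-z]+day,\s+)?[A-Z][a-z]+ \d{1,2}, \d{4}$")
BRACKET_NOTE = re.compile(r"\[[^\]]*\]")


def clean_speech_text(raw: str) -> str:
    """Drop date lines, all-caps heading lines, and bracketed notes."""
    kept = []
    for line in raw.splitlines():
        # Remove anything in square brackets, then trim whitespace.
        line = BRACKET_NOTE.sub("", line).strip()
        if not line:
            continue  # skip lines that are empty after removing notes
        if DATE_LINE.match(line):
            continue  # skip lines that are only a date
        if line.isupper():
            continue  # assume an all-caps line is a heading
        kept.append(line)
    return "\n".join(kept)


sample = (
    "FIRST INAUGURAL ADDRESS\n"
    "Thursday, April 30, 1789\n"
    "[Transcriber's note: sample]\n"
    "Fellow-Citizens of the Senate and of the House of Representatives:\n"
    "I thank you. [applause]"
)
cleaned = clean_speech_text(sample)
print(cleaned)
```

A rule like “all caps means heading” is a rough guess, so the output would still need to be checked by eye before counting words.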

I uploaded each “clean” txt file to Voyant Tools. This website helps analyze patterns within text. One part of the analysis separates all words and counts their frequencies. For each century’s file, I plugged in a list of words: “I’d, I’ll, I’m, I’ve, I, my, mine, me, let’s, we’d, we’re, we, we’ve, our, ours, us, you’d, you’ll, you’re, you’ve, thou, you, your, yours, thee, thy, y’all,” and I recorded their frequencies in a separate spreadsheet. Once I had all the relevant word frequencies, I combined the counts into groups of first-person singular pronouns, first-person plural pronouns and second-person pronouns. I then divided each group’s count by the full word count of that century’s document to show the relative use of each group.
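The counting and dividing steps above can be sketched in Python. This is a minimal illustration rather than the Voyant Tools and spreadsheet workflow I actually used. The names `PRONOUN_GROUPS`, `tokenize` and `pronoun_shares` are made up for this example, placing “let’s” in the first-person plural group is my assumption, and Voyant may split words differently than this simple pattern does.

```python
import re
from collections import Counter

# Word list from the report, sorted into three groups.
# Assumption: "let's" is counted as first-person plural.
PRONOUN_GROUPS = {
    "first_singular": {"i'd", "i'll", "i'm", "i've", "i", "my", "mine", "me"},
    "first_plural": {"let's", "we'd", "we're", "we", "we've", "our", "ours", "us"},
    "second_person": {
        "you'd", "you'll", "you're", "you've", "thou", "you",
        "your", "yours", "thee", "thy", "y'all",
    },
}

# A word is a run of letters, optionally with apostrophes inside it.
TOKEN = re.compile(r"[a-z]+(?:'[a-z]+)*")


def tokenize(text: str) -> list[str]:
    """Lowercase the text, straighten curly apostrophes, and split into words."""
    return TOKEN.findall(text.lower().replace("\u2019", "'"))


def pronoun_shares(text: str) -> dict[str, float]:
    """Return each group's count divided by the total word count."""
    tokens = tokenize(text)
    total = len(tokens)
    counts = Counter(tokens)
    shares = {}
    for group, words in PRONOUN_GROUPS.items():
        group_count = sum(counts[word] for word in words)
        shares[group] = group_count / total if total else 0.0
    return shares


example = "We the people thank you. I\u2019m glad we're here with our friends."
shares = pronoun_shares(example)
print(shares)
```

Dividing by the total word count is what makes a century with only a few speeches comparable to one with dozens.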


The final step was to add this data to Flourish, a website that helps users make graphs and other data visualizations. I was able to plug my spreadsheet data right into a line graph template, and the graph was already pretty clear to start out. I changed the colors of the lines to a theme with more contrast and added a title, axis labels, my sources, and clarifying notes. I also made sure to exclude the separate grid lines between the points for each decade, so that it was as clear as possible that each data point represents a full century.

Presentation.

I wanted to keep this website very simple so that the focus stays on the data and my report. I also wanted it to match my main domain, so I used the same theme. Flourish allows you to embed the graph, which I think is a great resource because viewers can hover their mouse over the graph and see information about specific data points. While I kept my graph relatively simple, this can be especially useful when there’s a lot of data condensed in one place.

Significance.

I think this data is representative of the Digital Arts and Humanities because it connects to multiple disciplines. A major one is political science. As a president, is it more strategic to set yourself apart from the rest of the nation? Should you focus on yourself, assuring the American people that you’re a hero who can fix their problems? Or is it better to remind them that you’re one of them? There’s also an obvious historical element to this. How did those views and strategies change over time? I also think there’s an interesting connection to English and changes in writing over time. While I chose to condense the data so that there wasn’t too much going on, I noticed during my process that all uses of the word “thou” were actually in the 1900s, not the 1700s or 1800s, like some might expect. I think we could definitely expand on this data to discover even more trends.

In the future, I also think it would be interesting to add more individual data points. Organizing the data by century makes it easy to visualize, but it also leaves out some nuances and doesn’t fairly account for differences in how much data each century includes. It’s easy for one president to heavily skew the data for the 1700s or 2000s, while the 1800s and 1900s each have a full century’s worth of speeches. Overall, though, I think the findings were quite interesting, and they add insight into changes in political mindsets over our country’s history.