This week's Digital History class asked us to look at text analysis tools and see how prominent certain words are in various texts. I chose to look at John F. Kennedy's inaugural address, delivered nearly fifty years ago, to see which words he used most often and which themes he covered.
The first tool I used is TAPoR (Text Analysis Portal for Research), which allows you to examine documents and see how often certain words are used within the text. This can be done either by inserting a website address into a toolbar, or by uploading a text file from your hard drive. The immediate benefit of the tool is that you can quickly analyse how frequently certain words appear in speeches. The downside is that, without filtering, the tool lists the frequency of all words, including those that you might not want. Kennedy's inaugural address, for example, mentioned "the" 85 times, "of" 66 times, and "of" 37 times, which isn't much use for anyone wanting to show the emphasis he gave to international affairs. I ran the document through the program a second time, this time excluding words that serve to construct sentences rather than to convey Kennedy's political beliefs. "We", "our" and "us" dominated, showing Kennedy's desire to portray America and his government as united, an important factor given that the 1960 election was one of the closest-run campaigns in American history, later rivalled by the 2000 Bush-Gore battle. These words are also common English words, so I ran the speech through the program again, this time checking the "apply inflectional stemmer" option, which removes commonly used words from the results. The frequency of all (non-common) words mentioned at least three times in the inaugural address can be found here.
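For anyone curious about what this kind of filtering involves, the idea can be sketched in a few lines of Python. This is only a rough illustration and not how TAPoR itself works: the `word_frequencies` function and the tiny `STOP_WORDS` list are made up for the example, and the sample text is a single sentence from the address rather than the whole speech.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration;
# real tools use much longer lists.
STOP_WORDS = {"the", "of", "and", "to", "a", "in", "that", "is", "for", "not"}


def word_frequencies(text, stop_words=frozenset()):
    """Count lower-cased words in text, skipping any found in stop_words."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(w for w in words if w not in stop_words)


# One sentence from the address, used here as a stand-in for the full text.
sample = (
    "Ask not what your country can do for you, "
    "ask what you can do for your country."
)

unfiltered = word_frequencies(sample)            # every word is counted
filtered = word_frequencies(sample, STOP_WORDS)  # common words are skipped

print(unfiltered.most_common(5))
print(filtered.most_common(5))
```

With the stop-word list applied, "for" and "not" drop out of the counts, while words such as "country" and "ask" remain, which is the same effect as the second run described above, just on a much smaller scale.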
The second tool I used was Wordle, a program which places text into a "word cloud": an image illustrating the frequency of words used in the text, with more frequent words shown in larger sizes than less frequent ones. Once the text of Kennedy's inaugural address is entered, the program creates an image, which can be custom formatted to suit one's taste. The program automatically discounts commonly used words, so the image created appears like this:
This is useful, because when commonly used words are included they dominate the picture and radically alter the image, as can be seen below.
In all, these are useful programs, allowing people to sort through documents to see what words, ideas and themes may appear within the text. Wordle is probably the more user-friendly tool, being easier to navigate as well as looking prettier. That the creator allows, in fact encourages, the use of the program to create T-shirt designs is a nice added bonus!