Reading was once something I genuinely enjoyed. As I got older and decided to pursue a degree in Literature, reading began to feel more like work and lost a lot of its magic for me. Analyzing corpora of texts, on the other hand, is incredibly interesting and unorthodox. The Horatio Alger novel I have chosen to contribute to the class folder is “The Young Circus Rider.”
Prior to using Voyant Tools, I did some research on Alger via Wikipedia to get an idea of his style of writing. A major takeaway was that his writings are characterized as “rags to riches” narratives. With that in mind, I entered the text into Voyant and omitted common words such as “said,” “know,” and “like,” as well as character names. After that, I noticed that “boy,” “man,” and “circus” were the three most common words in the novel. It seems that Alger’s Wikipedia page could be accurate. I can infer that this novel is about a young boy who makes his living working at a circus. With a theme like “rags to riches,” I was surprised that “money” didn’t crack the top five. I decided to look deeper into this and make use of the “Trends” tool in Voyant.
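For anyone curious what this step looks like outside of Voyant, here is a minimal sketch of counting words after removing a stopword list. It is not Voyant's actual implementation; the stopword list, the `top_words` function name, and the sample sentence are all my own illustrative assumptions.

```python
import re
from collections import Counter

# Hypothetical stopword list: a few function words plus the words I removed by hand.
STOPWORDS = {"the", "a", "and", "to", "of", "in", "at", "his", "he", "was",
             "said", "know", "like"}

def top_words(text, stopwords=STOPWORDS, n=3):
    """Return the n most frequent words in text, ignoring case and stopwords."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in stopwords)
    return counts.most_common(n)

# Invented sample text, not a quotation from the novel.
sample = ("The boy said the circus was grand. "
          "The boy and the man went to the circus. The boy smiled.")
print(top_words(sample, n=2))
```

In practice, character names would also need to be added to the stopword set, just as I removed them by hand in Voyant.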
The above graph shows the frequencies of the words “boy,” “money,” “man,” and “circus.” I wanted to see whether these words pointed to a boy making money at the circus. I included “man” because I figured that if this boy worked enough to get rich, he could be referred to as a “man.” There is a noticeable pattern between “boy” (olive green) and “money” (tan), and at Document Segment 6 the two lines intersect. I think this data supports the idea that Alger stayed true to his reputation and wrote another “rags to riches” novel, and we can make this inference without reading a single sentence.
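The idea behind a graph like this can be sketched in a few lines: split the text into equal segments and measure how often a word appears in each one. This is only my approximation of what a trends graph plots, under the assumption that segments are equal slices by word count; the `segment_frequencies` function and the sample string are hypothetical.

```python
import re

def segment_frequencies(text, term, segments=10):
    """Relative frequency of term in each of `segments` roughly equal slices."""
    words = re.findall(r"[a-z']+", text.lower())
    size = max(1, len(words) // segments)
    freqs = []
    for i in range(segments):
        start = i * size
        # The last segment absorbs any leftover words.
        end = (i + 1) * size if i < segments - 1 else len(words)
        chunk = words[start:end]
        freqs.append(chunk.count(term) / len(chunk) if chunk else 0.0)
    return freqs

# Invented sample text, used only to show the shape of the output.
sample = "boy money boy circus man money circus circus"
print(segment_frequencies(sample, "boy", segments=2))
print(segment_frequencies(sample, "money", segments=2))
```

Seen this way, two lines crossing in a segment simply means the two words have the same relative frequency there, so an intersection is a hint worth following up and not proof of a relationship.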
Next, it was time to compare my text with some other Alger texts from the class folder. After cleaning up the texts, the most common words were “Mr.,” “boy,” “man,” “think,” and “good.”
What I found interesting was that “boy” wasn’t really prevalent in two of the five works I chose (or at least not compared to the other three). “Boy” is marked in green, and its frequency stays low until we reach “Train Boy.” The peak is at “Cashboy,” and when it hits my original text, “The Young Circus Rider,” we see a dramatic plummet in frequency. This leads me to believe that the main characters of “Train Boy,” “Cashboy,” and “The Young Circus Rider” are boys, but what about “Bound to Rise” and “Nothing to Eat”? Since “Mr.” is common among the five texts, perhaps the boys in the novels have a father or mentor figure who helps them achieve success. This could be the cause of “boy” having a lower frequency.
Next, I wanted to reinvestigate the word “money” and how it related to a boy’s success in the novels. “Boy” is in pink, “good” is in tan, and “money” is in green. Initially, my hypothesis was that Alger’s idea of success was money: in order to have a good life, being rich was a major factor. Based on these results, I was surprised to see “money” fluctuate as much as it did, but it was even more surprising to see how steady “good” was. With this new data, we could make the inference that most of his novels have happy endings or consist of a lot of good things happening throughout the books.
Furthermore, I used nGram to look more in-depth at the terms “boy,” “money,” and “good” across all publications from the years 1800 to 2000. To no surprise, “good” is used more frequently, but what was interesting was the giant decrease in frequency during the 60s and 70s. My initial thought was that perhaps the Vietnam War influenced people not to write about “good” or happy things, but that seems like a stretch to me since there were no dramatic changes during WW1 and WW2. It’s possible that a slang word became more popular than “good” during that era and then phased out in the 80s. “Money” and “boy” have fluctuated closely for the past two centuries. This was not as surprising to me once I realized how closely they relate. It is considered “American” to be a blue collar worker. Most men began working and making money as boys, so this seemed like a common and plausible thing to write about.
It is astonishing how much information you can uncover about multiple texts with a single click of a button. Though we won’t get the full story by using Voyant Tools or nGrams, we can get an abundance of data that can help us make inferences. Reading the books will definitely help you understand the plot, but by studying the data, you are able to get a broader understanding of the themes and trends that other authors also take part in.